pax_global_header00006660000000000000000000000064150270751220014513gustar00rootroot0000000000000052 comment=72a3124d048dc5c89eb3f00c9f2866f4492b5383 papi-papi-7-2-0-t/000077500000000000000000000000001502707512200136405ustar00rootroot00000000000000papi-papi-7-2-0-t/.gitattributes000066400000000000000000000003561502707512200165370ustar00rootroot00000000000000PAPI_FAQ.html -diff release_procedure.txt -diff gitlog2changelog.py -diff doc/DataRange.html -diff doc/PAPI-C.html -diff doc/README -diff src/buildbot_configure_with_components.sh -diff delete_before_release.sh -diff .gitattributes -diff papi-papi-7-2-0-t/.github/000077500000000000000000000000001502707512200152005ustar00rootroot00000000000000papi-papi-7-2-0-t/.github/pull_request_template.md000066400000000000000000000007501502707512200221430ustar00rootroot00000000000000## Pull Request Description ## Author Checklist - [ ] **Description** _Why_ this PR exists. Reference all relevant information, including _background_, _issues_, _test failures_, etc. - [ ] **Commits** _Commits_ are self-contained and only do one thing _Commits_ have a header of the form: `module: short description` _Commits_ have a body (whenever relevant) containing a detailed description of the addressed problem and its solution - [ ] **Tests** The PR needs to pass all the tests papi-papi-7-2-0-t/.github/workflows/000077500000000000000000000000001502707512200172355ustar00rootroot00000000000000papi-papi-7-2-0-t/.github/workflows/README.md000066400000000000000000000145631502707512200205250ustar00rootroot00000000000000As of now, the GitHub CI is designed to run in three scenarios: 1. On an individual component basis, meaning that if a component's codebase is updated, we will only run CI tests for that component. As an example, if we update `cupti_profiler.c` in `src/components/cuda`, we will only run CI tests for the cuda component. Note that this includes updates to subdirectories located in a component's directory (e.g. `src/components/cuda/tests`).
See the **NOTE** in [Individual Component Basis](#individual-component-basis) for more info on the setup for the default components (`perf_event`, `perf_event_uncore`, and `sysdetect`). 2. A change to the Counter Analysis Toolkit, i.e. in the `src/counter_analysis_toolkit` directory and any subdirectories. 3. A change in the PAPI framework, i.e. in the `src/` directory or `src/` subdirectories (excluding individual components and the Counter Analysis Toolkit). If this occurs, we will run a full test suite. # Individual Component Basis All individual component tests have a `.yml` whose name follows the pattern `componentName_component_workflow.yml`. As an example, the `cuda` component has a `.yml` of `cuda_component_workflow.yml`. Therefore, if a new component is added to PAPI, you will need to create a `.yml` based on the aforementioned structure. Upon creating the `.yml` file, you will need to add a workflow. Below is a skeleton that can be used as a starting point. As a reminder, make sure to swap out the necessary fields for your component. ```yml name: cuda # replace cuda with your component name on: pull_request: paths: - 'src/components/cuda/**' # replace the cuda path with your component path jobs: component_tests: strategy: matrix: component: [cuda] # replace cuda with your component name debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, gpu_nvidia] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: cuda component tests # replace cuda with your component name run: .github/workflows/ci_individual_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}} ``` For each component `.yml`, there will be a single job with the options of: - `component`, the component we want to configure PAPI with (e.g.
cuda) - `debug`, with the options of either `yes` (builds a debug version) or `no` (does not build a debug version) - `shlib`, with `--with-shlib-tools` or without `--with-shlib-tools` These options will be used in the script `ci_individual_component.sh` to test configuring and building PAPI. Besides configuring and building PAPI, `ci_individual_component.sh` will: - Check that the components we expect to be active are active - Run a test suite using either `run_tests.sh` (without `--with-shlib-tools`) or `run_tests_shlib.sh` (with `--with-shlib-tools`) **NOTE**: The components `perf_event`, `perf_event_uncore`, and `sysdetect` do not follow the file structure outlined above. For these three components, the files used are `default_components_workflow.yml` and `ci_default_components.sh`. Even though the file structure is different, the workflow will still only run if a change is made to one of their directories or subdirectories. The reason for this is that these three components are compiled in by default, and trying to pass one of them to `--with-components=` will result in an error during the build process. Therefore, any PAPI CI updates for one of these three components need to be addressed in either of the two aforementioned files. # Counter Analysis Toolkit The Counter Analysis Toolkit (CAT) CI uses the `cat_workflow.yml` and `ci_cat.sh` files. Any updates to the CI for CAT need to be done in these two files. The `cat_workflow.yml` will have a single job with the options of: - `debug`, with the options of either `yes` (builds a debug version) or `no` (does not build a debug version) - `shlib`, with `--with-shlib-tools` or without `--with-shlib-tools` These options will be used in the script `ci_cat.sh` to test configuring and building PAPI.
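Concretely, the mapping from the matrix options to `./configure` arguments can be sketched as follows. This is a minimal sketch: the `configure_flags` helper is hypothetical, but the `--with-debug`, `--enable-warnings`, and `--with-shlib-tools` flags are the ones the CI scripts in this directory actually pass.

```shell
#!/bin/bash
# Hypothetical helper mirroring how the CI scripts translate the
# debug/shlib matrix values into ./configure arguments.
configure_flags() {
    local debug=$1   # "yes" or "no"
    local shlib=$2   # "with" or "without"
    local flags="--with-debug=$debug --enable-warnings"
    # --with-shlib-tools is appended only for the "with" matrix entry
    if [ "$shlib" = "with" ]; then
        flags="$flags --with-shlib-tools"
    fi
    echo "$flags"
}

configure_flags yes with     # --with-debug=yes --enable-warnings --with-shlib-tools
configure_flags no without   # --with-debug=no --enable-warnings
```

The same two-way branch appears verbatim in `ci_cat.sh`, `ci_default_components.sh`, `ci_individual_component.sh`, and `ci_papi_framework.sh`, with the component list added where applicable.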
Besides configuring and building PAPI, `ci_cat.sh` will: - Test building CAT - Check to see if CAT successfully detects the architecture we are on - Run CAT with a real event and a dummy event - For the real event, we expect the file to exist and values to be present - For the dummy event, we expect the file to exist, but to contain no values # PAPI Framework The PAPI framework CI uses the `papi_framework_workflow.yml` along with the scripts `clang_analysis.sh`, `ci_papi_framework.sh`, and `spack.sh`. Any updates to the CI for the PAPI framework need to be done in these files. `papi_framework_workflow.yml` will have a total of five jobs: 1. papi_components_comprehensive - With the options of: - `components`, the components we want to configure PAPI with, i.e. `cuda nvml rocm rocm_smi powercap powercap_ppc rapl sensors_ppc infiniband net appio io lustre stealtime` - `debug`, with the options of either `yes` (builds a debug version) or `no` (does not build a debug version) - `shlib`, with `--with-shlib-tools` or without `--with-shlib-tools` 2. papi_components_amd - With the options of: - `components`, the components we want to configure PAPI with, i.e. `rocm rocm_smi` - `debug`, with the options of either `yes` (builds a debug version) or `no` (does not build a debug version) - `shlib`, with `--with-shlib-tools` or without `--with-shlib-tools` 3. papi_component_infiniband - With the options of: - `component`, the component we want to configure PAPI with, i.e. `infiniband` - `debug`, with the options of either `yes` (builds a debug version) or `no` (does not build a debug version) - `shlib`, with `--with-shlib-tools` or without `--with-shlib-tools` 4. papi_spack - The script `spack.sh` will be run, which configures and builds PAPI via Spack 5.
papi_clang_analysis - The script `clang_analysis.sh` will be run, which configures and builds PAPI with clang. For jobs 1, 2, and 3, the options listed will be used in the script `ci_papi_framework.sh` to test configuring and building PAPI. Besides configuring and building PAPI, `ci_papi_framework.sh` will: - Check that the components we expect to be active are active - Run a test suite using either `run_tests.sh` (without `--with-shlib-tools`) or `run_tests_shlib.sh` (with `--with-shlib-tools`) papi-papi-7-2-0-t/.github/workflows/appio_component_workflow.yml000066400000000000000000000012401502707512200251010ustar00rootroot00000000000000name: appio on: pull_request: # run CI only if appio directory or appio sub-directories receive updates paths: - 'src/components/appio/**' # allows you to run this workflow manually from the Actions tab workflow_dispatch: jobs: component_tests: strategy: matrix: component: [appio] debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, cpu_intel] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: appio component tests run: .github/workflows/ci_individual_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}} papi-papi-7-2-0-t/.github/workflows/cat_workflow.yml000066400000000000000000000011321502707512200224560ustar00rootroot00000000000000name: counter analysis toolkit on: pull_request: # run CI for updates to counter analysis toolkit paths: - 'src/counter_analysis_toolkit/**' # allows you to run this workflow manually from the Actions tab workflow_dispatch: jobs: cat_tests: strategy: matrix: debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, cpu_intel] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: counter analysis toolkit tests run: .github/workflows/ci_cat.sh ${{matrix.debug}} ${{matrix.shlib}} papi-papi-7-2-0-t/.github/workflows/ci_cat.sh000077500000000000000000000024661502707512200210260ustar00rootroot00000000000000#!/bin/bash -e DEBUG=$1
SHLIB=$2 COMPILER=$3 [ -z "$COMPILER" ] && COMPILER=gcc@11 source /etc/profile set +x set -e trap 'echo "# $BASH_COMMAND"' DEBUG shopt -s expand_aliases module load $COMPILER cd src # configuring and installing PAPI if [ "$SHLIB" = "without" ]; then ./configure --prefix=$PWD/cat-ci --with-debug=$DEBUG --enable-warnings else ./configure --prefix=$PWD/cat-ci --with-debug=$DEBUG --enable-warnings --with-shlib-tools fi make -j4 && make install # set environment variables for CAT export PAPI_DIR=$PWD/cat-ci export LD_LIBRARY_PATH=${PAPI_DIR}/lib:$LD_LIBRARY_PATH cd counter_analysis_toolkit # check detected architecture was correct # note that the make here will finish DETECTED_ARCH=$(make | grep -o 'ARCH.*' | head -n 1) if [ "$DETECTED_ARCH" != "ARCH=X86" ]; then echo "Failed to detect appropriate architecture." exit 1 fi # create output directory mkdir OUT_DIR # create real and fake events to monitor echo "BR_INST_RETIRED 0" > event_list.txt echo "PAPI_CI_FAKE_EVENT 0" >> event_list.txt ./cat_collect -in event_list.txt -out OUT_DIR -branch cd OUT_DIR # we expect this file to exist and have values [ -f BR_INST_RETIRED.branch ] [ -s BR_INST_RETIRED.branch ] # we expect this file to exist but be empty [ -f PAPI_CI_FAKE_EVENT.branch ] [ ! 
-s PAPI_CI_FAKE_EVENT.branch ] papi-papi-7-2-0-t/.github/workflows/ci_default_components.sh000077500000000000000000000021541502707512200241420ustar00rootroot00000000000000#!/bin/bash -e DEBUG=$1 SHLIB=$2 COMPILER=$3 [ -z "$COMPILER" ] && COMPILER=gcc@11 source /etc/profile set +x set -e trap 'echo "# $BASH_COMMAND"' DEBUG shopt -s expand_aliases module load $COMPILER cd src # test linking with or without --with-shlib-tools if [ "$SHLIB" = "without" ]; then ./configure --with-debug=$DEBUG --enable-warnings else ./configure --with-debug=$DEBUG --enable-warnings --with-shlib-tools fi make -j4 # run PAPI utilities utils/papi_component_avail # active component check EXPECTED_ACTIVE_COMPONENTS="perf_event perf_event_uncore sysdetect" CURRENT_ACTIVE_COMPONENTS=$(utils/papi_component_avail | grep -A1000 'Active components' | grep "Name: " | awk '{printf "%s%s", sep, $2; sep=" "} END{print ""}') [ "$EXPECTED_ACTIVE_COMPONENTS" = "$CURRENT_ACTIVE_COMPONENTS" ] # without '--with-shlib-tools' in ./configure if [ "$SHLIB" = "without" ]; then echo "Running full test suite for active components" ./run_tests.sh TESTS_QUIET # with '--with-shlib-tools' in ./configure else echo "Running single component test for active components" ./run_tests_shlib.sh TESTS_QUIET fi papi-papi-7-2-0-t/.github/workflows/ci_individual_component.sh000077500000000000000000000040621502707512200244630ustar00rootroot00000000000000#!/bin/bash -e COMPONENT=$1 DEBUG=$2 SHLIB=$3 COMPILER=$4 [ -z "$COMPILER" ] && COMPILER=gcc@11 source /etc/profile set +x set -e trap 'echo "# $BASH_COMMAND"' DEBUG shopt -s expand_aliases module load $COMPILER cd src # lmsensors environment variables if [ "$COMPONENT" = "lmsensors" ]; then wget https://github.com/groeck/lm-sensors/archive/V3-4-0.tar.gz tar -zxf V3-4-0.tar.gz cd lm-sensors-3-4-0 make install PREFIX=../lm ETCDIR=../lm/etc cd .. 
export PAPI_LMSENSORS_ROOT=lm export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PAPI_LMSENSORS_ROOT/lib fi # rocm and rocm_smi environment variables if [ "$COMPONENT" = "rocm" ] || [ "$COMPONENT" = "rocm_smi" ]; then export PAPI_ROCM_ROOT=/apps/rocm/rocm-5.5.3 export PAPI_ROCMSMI_ROOT=$PAPI_ROCM_ROOT/rocm_smi fi # set necessary environment variables for cuda and nvml if [ "$COMPONENT" = "cuda" ] || [ "$COMPONENT" = "nvml" ]; then module load cuda export PAPI_CUDA_ROOT=$ICL_CUDA_ROOT export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PAPI_CUDA_ROOT/extras/CUPTI/lib64 fi # test linking with or without --with-shlib-tools if [ "$SHLIB" = "without" ]; then ./configure --with-debug=$DEBUG --enable-warnings --with-components="$COMPONENT" else ./configure --with-debug=$DEBUG --enable-warnings --with-components="$COMPONENT" --with-shlib-tools fi make -j4 # run PAPI utilities utils/papi_component_avail # active component check EXPECTED_ACTIVE_COMPONENTS=$(echo "perf_event perf_event_uncore sysdetect" | sed "s/perf_event_uncore/& $COMPONENT/") CURRENT_ACTIVE_COMPONENTS=$(utils/papi_component_avail | grep -A1000 'Active components' | grep "Name: " | awk '{printf "%s%s", sep, $2; sep=" "} END{print ""}') [ "$EXPECTED_ACTIVE_COMPONENTS" = "$CURRENT_ACTIVE_COMPONENTS" ] # without '--with-shlib-tools' in ./configure if [ "$SHLIB" = "without" ]; then echo "Running full test suite for active components" ./run_tests.sh TESTS_QUIET --disable-cuda-events=yes # with '--with-shlib-tools' in ./configure else echo "Running single component test for active components" ./run_tests_shlib.sh TESTS_QUIET fi papi-papi-7-2-0-t/.github/workflows/ci_papi_framework.sh000077500000000000000000000051071502707512200232600ustar00rootroot00000000000000#!/bin/bash -e COMPONENTS=$1 DEBUG=$2 SHLIB=$3 COMPILER=$4 [ -z "$COMPILER" ] && COMPILER=gcc@11 source /etc/profile set +x set -e trap 'echo "# $BASH_COMMAND"' DEBUG shopt -s expand_aliases module load $COMPILER cd src # set necessary environment variables for lmsensors case
"$COMPONENTS" in *"lmsensors"*) wget https://github.com/groeck/lm-sensors/archive/V3-4-0.tar.gz tar -zxf V3-4-0.tar.gz cd lm-sensors-3-4-0 make install PREFIX=../lm ETCDIR=../lm/etc cd .. export PAPI_LMSENSORS_ROOT=lm export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PAPI_LMSENSORS_ROOT/lib ;; esac # set necessary environment variables for rocm and rocm_smi case "$COMPONENTS" in *"rocm rocm_smi"*) export PAPI_ROCM_ROOT=/apps/rocm/rocm-5.5.3 export PAPI_ROCMSMI_ROOT=$PAPI_ROCM_ROOT/rocm_smi ;; esac # set necessary environment variables for cuda and nvml case "$COMPONENTS" in *"cuda nvml"*) module load cuda export PAPI_CUDA_ROOT=$ICL_CUDA_ROOT export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PAPI_CUDA_ROOT/extras/CUPTI/lib64 ;; esac # test linking with or without --with-shlib-tools if [ "$SHLIB" = "without" ]; then ./configure --with-debug=$DEBUG --enable-warnings --with-components="$COMPONENTS" else ./configure --with-debug=$DEBUG --enable-warnings --with-components="$COMPONENTS" --with-shlib-tools fi make -j4 # run PAPI utilities utils/papi_component_avail # active component check CURRENT_ACTIVE_COMPONENTS=$(utils/papi_component_avail | grep -A1000 'Active components' | grep "Name: " | awk '{printf "%s%s", sep, $2; sep=" "} END{print ""}') if [ "$COMPONENTS" = "cuda nvml rocm rocm_smi powercap powercap_ppc rapl sensors_ppc net appio io lustre stealtime coretemp lmsensors mx sde" ]; then [ "$CURRENT_ACTIVE_COMPONENTS" = "perf_event perf_event_uncore cuda nvml powercap net appio io stealtime coretemp lmsensors sde sysdetect" ] elif [ "$COMPONENTS" = "rocm rocm_smi" ]; then [ "$CURRENT_ACTIVE_COMPONENTS" = "perf_event perf_event_uncore rocm rocm_smi sysdetect" ] elif [ "$COMPONENTS" = "infiniband" ]; then [ "$CURRENT_ACTIVE_COMPONENTS" = "perf_event perf_event_uncore infiniband sysdetect" ] else # if the component from the .yml is not accounted for in the above # elif's exit 1 fi # without '--with-shlib-tools' in ./configure if [ "$SHLIB" = "without" ]; then echo "Running full test 
suite for active components" ./run_tests.sh TESTS_QUIET --disable-cuda-events=yes # with '--with-shlib-tools' in ./configure else echo "Running single component test for active components" ./run_tests_shlib.sh TESTS_QUIET fi papi-papi-7-2-0-t/.github/workflows/clang_analysis.sh000077500000000000000000000003661502707512200225700ustar00rootroot00000000000000#!/bin/bash -e OUT=$1 echo Analysis output to $OUT source /etc/profile set +x set -e trap 'echo "# $BASH_COMMAND"' DEBUG shopt -s expand_aliases module load llvm which scan-build cd src scan-build -o $OUT ./configure scan-build -o $OUT make papi-papi-7-2-0-t/.github/workflows/coretemp_component_workflow.yml000066400000000000000000000012511502707512200256110ustar00rootroot00000000000000name: coretemp on: pull_request: # run CI only if coretemp directory or coretemp sub-directories receive updates paths: - 'src/components/coretemp/**' # allows you to run this workflow manually from the Actions tab workflow_dispatch: jobs: component_tests: strategy: matrix: component: [coretemp] debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, cpu_intel] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: coretemp component tests run: .github/workflows/ci_individual_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}} papi-papi-7-2-0-t/.github/workflows/cuda_component_workflow.yml000066400000000000000000000012221502707512200247050ustar00rootroot00000000000000name: cuda on: pull_request: # run CI only if cuda directory or cuda sub-directories receive updates paths: - 'src/components/cuda/**' # allows you to run this workflow manually from the Actions tab workflow_dispatch: jobs: component_tests: strategy: matrix: component: [cuda] debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, gpu_nvidia] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: cuda component tests run: .github/workflows/ci_individual_component.sh ${{matrix.component}} 
${{matrix.debug}} ${{matrix.shlib}} papi-papi-7-2-0-t/.github/workflows/default_components_workflow.yml000066400000000000000000000013761502707512200256120ustar00rootroot00000000000000name: default components tests on: pull_request: # run CI only if perf_event, perf_event_uncore, or sysdetect directories or sub-directories receive updates paths: - 'src/components/perf_event/**' - 'src/components/perf_event_uncore/**' - 'src/components/sysdetect/**' # allows you to run this workflow manually from the Actions tab workflow_dispatch: jobs: component_tests: strategy: matrix: debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, cpu_intel] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: default components tests run: .github/workflows/ci_default_components.sh ${{matrix.debug}} ${{matrix.shlib}} papi-papi-7-2-0-t/.github/workflows/example_component_workflow.yml000066400000000000000000000012431502707512200254270ustar00rootroot00000000000000name: example on: pull_request: # run CI only if example directory or example sub-directories receive updates paths: - 'src/components/example/**' # allows you to run this workflow manually from the Actions tab workflow_dispatch: jobs: component_tests: strategy: matrix: component: [example] debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, cpu_intel] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: example component tests run: .github/workflows/ci_individual_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}} papi-papi-7-2-0-t/.github/workflows/intel_gpu_component_workflow.yml000066400000000000000000000012571502707512200257670ustar00rootroot00000000000000name: intel_gpu on: pull_request: # run CI only if intel_gpu directory or intel_gpu sub-directories receive updates paths: - 'src/components/intel_gpu/**' # allows you to run this workflow manually from the Actions tab workflow_dispatch: jobs: component_tests: strategy: matrix: 
component: [intel_gpu] debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, gpu_intel] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: intel_gpu component tests run: .github/workflows/ci_individual_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}} papi-papi-7-2-0-t/.github/workflows/io_component_workflow.yml000066400000000000000000000012051502707512200244010ustar00rootroot00000000000000name: io on: pull_request: # run CI only if io directory or io sub-directories receive updates paths: - 'src/components/io/**' # allows you to run this workflow manually from the Actions tab workflow_dispatch: jobs: component_tests: strategy: matrix: component: [io] debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, cpu_intel] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: io component tests run: .github/workflows/ci_individual_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}} papi-papi-7-2-0-t/.github/workflows/lmsensors_component_workflow.yml000066400000000000000000000012571502707512200260260ustar00rootroot00000000000000name: lmsensors on: pull_request: # run CI only if lmsensors directory or lmsensors sub-directories receive updates paths: - 'src/components/lmsensors/**' # allows you to run this workflow manually from the Actions tab workflow_dispatch: jobs: component_tests: strategy: matrix: component: [lmsensors] debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, cpu_intel] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: lmsensors component tests run: .github/workflows/ci_individual_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}} papi-papi-7-2-0-t/.github/workflows/net_component_workflow.yml000066400000000000000000000012131502707512200245570ustar00rootroot00000000000000name: net on: pull_request: # run CI only if net directory or net sub-directories receive updates paths: - 
'src/components/net/**' # allows you to run this workflow manually from the Actions tab workflow_dispatch: jobs: component_tests: strategy: matrix: component: [net] debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, cpu_intel] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: net component tests run: .github/workflows/ci_individual_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}} papi-papi-7-2-0-t/.github/workflows/nvml_component_workflow.yml000066400000000000000000000012221502707512200247450ustar00rootroot00000000000000name: nvml on: pull_request: # run CI only if nvml directory or nvml sub-directories receive updates paths: - 'src/components/nvml/**' # allows you to run this workflow manually from the Actions tab workflow_dispatch: jobs: component_tests: strategy: matrix: component: [nvml] debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, gpu_nvidia] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: nvml component tests run: .github/workflows/ci_individual_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}} papi-papi-7-2-0-t/.github/workflows/papi_framework_workflow.yml000066400000000000000000000052651502707512200247300ustar00rootroot00000000000000name: papi framework on: pull_request: # run CI if framework receives an update excluding individual # components and counter analysis toolkit paths-ignore: - 'src/components/*/**' - 'src/counter_analysis_toolkit/**' # allows you to run this workflow manually from the Actions tab workflow_dispatch: jobs: # build PAPI with multiple components to simulate a users' workflow # rocm, rocm_smi, powercap_ppc, rapl, sensors_ppc, infiniband, lustre, and mx should all # be disabled upon the build finishing papi_components_comprehensive: strategy: matrix: components: [cuda nvml rocm rocm_smi powercap powercap_ppc rapl sensors_ppc net appio io lustre stealtime coretemp lmsensors mx sde] debug: [yes, no] 
shlib: [with, without] fail-fast: false runs-on: [self-hosted, cpu_intel, gpu_nvidia] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: comprehensive component tests run: .github/workflows/ci_papi_framework.sh "${{matrix.components}}" ${{matrix.debug}} ${{matrix.shlib}} # build PAPI only with the amd components, as they will not be active in the above comprehensive job papi_components_amd: strategy: matrix: components: [rocm rocm_smi] debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, gpu_amd] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: rocm and rocm_smi component tests run: .github/workflows/ci_papi_framework.sh "${{matrix.components}}" ${{matrix.debug}} ${{matrix.shlib}} # build PAPI only with the infiniband component, as it will not be active in the above comprehensive job papi_component_infiniband: strategy: matrix: components: [infiniband] debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, infiniband] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: infiniband component tests run: .github/workflows/ci_papi_framework.sh ${{matrix.components}} ${{matrix.debug}} ${{matrix.shlib}} papi_spack: runs-on: cpu timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: Build/Test/Install via Spack run: .github/workflows/spack.sh papi_clang_analysis: runs-on: cpu steps: - uses: actions/checkout@v4 - name: Run static analysis run: .github/workflows/clang_analysis.sh clang-analysis-output - name: Archive analysis results uses: actions/upload-artifact@v4 if: always() with: name: clang-analysis-output path: src/clang-analysis-output papi-papi-7-2-0-t/.github/workflows/powercap_component_workflow.yml000066400000000000000000000012511502707512200256130ustar00rootroot00000000000000name: powercap on: pull_request: # run CI only if powercap directory or powercap sub-directories receive updates paths: - 'src/components/powercap/**' # allows you to run this workflow manually 
from the Actions tab workflow_dispatch: jobs: component_tests: strategy: matrix: component: [powercap] debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, cpu_intel] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: powercap component tests run: .github/workflows/ci_individual_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}} papi-papi-7-2-0-t/.github/workflows/rocm_component_workflow.yml000066400000000000000000000012171502707512200247350ustar00rootroot00000000000000name: rocm on: pull_request: # run CI only if rocm directory or rocm sub-directories receive updates paths: - 'src/components/rocm/**' # allows you to run this workflow manually from the Actions tab workflow_dispatch: jobs: component_tests: strategy: matrix: component: [rocm] debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, gpu_amd] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: rocm component tests run: .github/workflows/ci_individual_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}} papi-papi-7-2-0-t/.github/workflows/rocm_smi_component_workflow.yml000066400000000000000000000012471502707512200256100ustar00rootroot00000000000000name: rocm_smi on: pull_request: # run CI only if rocm_smi directory or rocm_smi sub-directories receive updates paths: - 'src/components/rocm_smi/**' # allows you to run this workflow manually from the Actions tab workflow_dispatch: jobs: component_tests: strategy: matrix: component: [rocm_smi] debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, gpu_amd] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: rocm_smi component tests run: .github/workflows/ci_individual_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}} papi-papi-7-2-0-t/.github/workflows/sde_component_workflow.yml000066400000000000000000000012131502707512200245440ustar00rootroot00000000000000name: sde on: pull_request: # run CI 
only if sde directory or sde sub-directories receive updates paths: - 'src/components/sde/**' # allows you to run this workflow manually from the Actions tab workflow_dispatch: jobs: component_tests: strategy: matrix: component: [sde] debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, cpu_intel] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: sde component tests run: .github/workflows/ci_individual_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}} papi-papi-7-2-0-t/.github/workflows/spack.sh000077500000000000000000000010341502707512200206730ustar00rootroot00000000000000#!/bin/bash -e COMPILER=$1 [ -z "$COMPILER" ] && COMPILER=gcc@11 source /etc/profile set +x set -e trap 'echo "# $BASH_COMMAND"' DEBUG shopt -s expand_aliases export HOME=`pwd` git clone https://github.com/spack/spack spack || true source spack/share/spack/setup-env.sh SPEC="papi@master +lmsensors +powercap +sde +infiniband +rapl +cuda %$COMPILER ^cuda@12.5.1" echo SPEC=$SPEC module load $COMPILER spack compiler find spack install --fresh --only=dependencies $SPEC spack dev-build -i --fresh --test=root $SPEC spack test run papi papi-papi-7-2-0-t/.github/workflows/stealtime_component_workflow.yml000066400000000000000000000012571502707512200257700ustar00rootroot00000000000000name: stealtime on: pull_request: # run CI only if stealtime directory or stealtime sub-directories receive updates paths: - 'src/components/stealtime/**' # allows you to run this workflow manually from the Actions tab workflow_dispatch: jobs: component_tests: strategy: matrix: component: [stealtime] debug: [yes, no] shlib: [with, without] fail-fast: false runs-on: [self-hosted, cpu_intel] timeout-minutes: 60 steps: - uses: actions/checkout@v4 - name: stealtime component tests run: .github/workflows/ci_individual_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}} 
papi-papi-7-2-0-t/ChangeLogP400.txt000066400000000000000000000320741502707512200166020ustar00rootroot000000000000002010-01-14 terpstra * src/perf_events.c 1.18: More tweaks from Corey for event rescheduling problem. Also a syntax fix for POWER platforms. 2010-01-13 sbk * src/configure.in 1.166: Enable the static perf events static table to be created and compiled in again for Cray XT CLE. 2010-01-13 terpstra * release_procedure.txt 1.13: Bump the date. * src/papi_internal.c 1.138: Fix for rescheduling events after a failed add. This addresses the NULL pointer issue found in overflow_allcounters on i7. Thanks to Corey Ashford of IBM for the fix. * papi.spec 1.4: Final changes from Will Cohen. * src/libpfm-3.y/config.mk 1.3: * src/libpfm-3.y/examples_v2.x/self_smpl_multi.c 1.3: * src/libpfm-3.y/examples_v2.x/syst.c 1.3: * src/libpfm-3.y/examples_v2.x/syst_multi_np.c 1.3: * src/libpfm-3.y/examples_v2.x/syst_np.c 1.3: * src/libpfm-3.y/lib/pfmlib_coreduo.c 1.3: * src/libpfm-3.y/lib/power7_events.h 1.3: *** empty log message *** * src/utils/event_info.c 1.11: Change test for version number to differentiate between PAPI-C and Classic PAPI. We were testing for versions >=3 && >= .9. This was missing versions >= 4. * src/libpfm-3.y/include/perfmon/pfmlib_gen_mips64.h 1.1.1.4: * src/libpfm-3.y/lib/intel_atom_events.h 1.1.1.5: * src/libpfm-3.y/lib/intel_corei7_events.h 1.1.1.5: * src/libpfm-3.y/lib/pfmlib_gen_mips64.c 1.1.1.6: * src/libpfm-3.y/lib/pfmlib_intel_nhm.c 1.1.1.9: Importing latest libpfm * src/Makefile.in 1.44: * src/papi.h 1.193: Update version numbers for impending release of PAPI 4.0.0. 2010-01-13 jagode * src/Makefile.inc 1.152: Avoid printing conditional 'if' statements while compiling (but they are performed). * src/perf_events.h 1.7: Seg fault on i7 with perf_events. This was fixed a while ago for perfmon and perfctr but perf-events was left behind. 
2010-01-13 bsheely
* src/configure 1.165: Generated configure to correspond to most recent change (Cray XT) to configure.in

2010-01-12 terpstra
* src/linux-bgp.c 1.3: Restore prior native naming convention: PNE_BGP_... Needed to avoid conflict with system level naming conventions.
* src/ctests/bgp/Makefile 1.3: Modifications to build test files for BGP.
* INSTALL.txt 1.42: Update description for BGP.

2010-01-08 terpstra
* src/Rules.perfctr-pfm 1.47: * src/Rules.pfm_pe 1.10: Eliminate duplicate definitions of environment variable for the compile line. These are now defined in configure.
* src/ctests/test_utils.c 1.77: Minor tweak to print native event codes in hex instead of decimal -- far more useful that way.
* src/perfctr-p4.c 1.106: Minor tweak to get this file to compile with DEBUG turned on.

2010-01-07 sbk
* src/Rules.pfm 1.50: The libpfm flag CONFIG_PFMLIB_OLD_PFMV2 was correctly set when compiling and building libpfm, but it also needs to be set when installing. The header file libpfm-3.y/include/perfmon/perfmon.h uses this flag to determine if a macro is prepended to perfmon.h when installing it.

2010-01-07 jagode
* src/linux-acpi.c 1.16: * src/linux-mx.c 1.15: * src/linux-net.c 1.4: Renamed identifier 'native_name' for net, mx, and acpi components because of conflicts on POWER machines. This variable has also been defined in powerX_events.h.

2010-01-07 bsheely
* src/Rules.perfctr 1.57: Added DEBUGFLAGS to OPTFLAGS since only OPTFLAGS gets used in Makefile.inc

2010-01-05 terpstra
* src/multiplex.c 1.76: Modified license language for John May's LLNL portion of this code to conform with BSD as provided by LLNL. Thanks, Bronis, for bird-dogging this.

2009-12-20 terpstra
* src/solaris-niagara2.c 1.4: Changes to fix overflow/profile issues in niagara2. Thanks to Fabian Gorsler.
2009-12-18 terpstra
* src/ctests/bgp/papi_1.c 1.2: * src/ctests/native.c 1.61: * src/ctests/papi_test.h 1.37: * src/extras.c 1.159: * src/linux-bgp-memory.c 1.2: * src/linux-bgp-native-events.h 1.2: * src/linux-bgp-preset-events.c 1.2: * src/linux-bgp.h 1.2: * src/papi.c 1.337: * src/papiStdEventDefs.h 1.38: * src/papi_data.c 1.35: * src/papi_internal.h 1.181: * src/papi_preset.h 1.17: * src/papi_protos.h 1.69: * src/papi_vector.c 1.22: Committing changes for BG/P. Utilities and basic counting work; not fully tested.

2009-12-16 terpstra
* LICENSE.txt 1.6: Minor tweaks on the header of the license text.
* src/solaris-niagara2-memory.c 1.3: * src/solaris-niagara2.h 1.3: Commit initial changes for Niagara2 support for PAPI-C. Thanks to Fabian Gorsler. Basic counting works; some unresolved issues remain for overflow and profile.

2009-12-11 terpstra
* src/papi_events.csv 1.3: Add a synonym for Pentium M.

2009-12-08 bsheely
* src/linux.c 1.69: Fixed memory issue seen in testing on certain platforms

2009-12-05 terpstra
* ChangeLogP372.txt 1.1: file ChangeLogP372.txt was initially added on branch papi-3-7-0.

2009-12-02 terpstra
* src/sys_perf_counter_open.c 1.10: * src/syscalls.h 1.4: Slightly cleaner syntax for redefinition of perf_event_attr in KERNEL31.

2009-12-01 terpstra
* src/ctests/sdsc4.c 1.14: Fix from Will Cohen to avoid round-off errors in computing small differences between large numbers, which occasionally resulted in sqrt of negative numbers. Originally applied to sdsc2; modified and applied to sdsc4.

2009-11-30 terpstra
* src/x86_cache_info.c 1.7: Strip the Windows version of cpuid out to make this version compatible with the 3.7.x branch.
* src/ctests/sdsc2.c 1.13: Fix from Will Cohen to avoid round-off errors in computing small differences between large numbers, which occasionally resulted in sqrt of negative numbers.
Thanks Will

2009-11-25 terpstra
* src/papi_hl.c 1.77: PAPI_stop_counters was returning PAPI_OK even if PAPI_stop returned something other than PAPI_OK. Uncovered as part of the BG/P merge.

2009-11-25 bsheely
* src/hwinfo_linux.c 1.2: added test for topology/thread_siblings and topology/core_siblings

2009-11-24 terpstra
* src/papi_vector.h 1.10: Fix a bug in assigning signals for overflow. Also expose a vector_find_dummy routine to allow testing for component functions. If the function pointer is a dummy, it isn't implemented in the component. This is used in extras to test for the implementation of a name_to_code routine.

2009-11-24 bsheely
* src/ctests/hwinfo.c 1.7: Removed invalid code (zero can be a valid value for nnodes)

2009-11-23 bsheely
* src/solaris-ultra.c 1.125: resolved compile error
* src/run_tests.sh 1.37: * src/run_valgrind_tests.sh 1.2: valgrind code merged into run_tests.sh and commented out by default

2009-11-20 bsheely
* src/genpapifdef.c 1.41: * src/papi_events.xml 1.3: * src/papi_fwrappers.c 1.81: Applied patch from Steve Kaufmann at Cray. Removes the remaining Unicos, Catamount, T3E, X1 and X2 references. Only explicit support for XT4+/CLE remains.

2009-11-18 mucci
* src/any-null.c 1.52: * src/linux-bgl.c 1.9: * src/perfmon.c 1.97: * src/windows.c 1.4: Renamed shutdown_global to shutdown_substrate to make it more obvious that this is per substrate. This callback will be important for freeing some memory up and making sure locks are reset. Looks like a big patch, but only a few lines.
* src/config.h.in 1.9: Add support for detecting gettid and syscall(gettid) which results in HAVE_GETTID and HAVE_SYSCALL_GETTID being defined in config.h. This will be useful for Linux where we can remove all the special casing for threads and locking and the errors with getpid. gettid all the time.
* src/papi_lock.h 1.1: Beginnings of a single function with all PAPI/Linux locking functions. Note to PAPI-C developers.
The multiple context concept of PAPI-C has failed to include the lock data structure. PAPI currently only has one scope of locks that span the high-level to the low-level. This will need to be revisited and the locks split into high-level and per-context locks. 2009-11-13 terpstra * ChangeLogP371.txt 1.1: file ChangeLogP371.txt was initially added on branch papi-3-7-0. 2009-11-12 bsheely * src/papi_events_table.sh 1.1: * src/papi_pfm_events.c 1.35: * src/papi_pfm_events.h 1.4: * src/perfmon_events.csv 1.57: * src/perfmon_events_table.sh 1.6: * src/pmapi-ppc64_events.c 1.8: * src/ppc64_events.h 1.10: renamed perfmon_events.csv perfmon_events_table.h perfmon_events_table.sh to papi_events.csv papi_events_table.h papi_events_table.sh and made code changes required by the renaming 2009-11-11 terpstra * src/ctests/first.c 1.49: Fix overly restrictive verification of results. In verifying that FP_INS/FP_OPS/TOT_INS was non-zero, we were requiring it to be near theoretical FP_OPS which caused false verification failures in some edge cases. Now we just require count >= iterations. 
2009-11-11 bsheely * src/ctests/inherit.c 1.13: * src/ctests/multiplex1_pthreads.c 1.49: * src/ctests/overflow.c 1.66: * src/ctests/overflow2.c 1.25: * src/ctests/overflow3_pthreads.c 1.21: * src/ctests/overflow_allcounters.c 1.5: * src/ctests/overflow_force_software.c 1.24: * src/ctests/overflow_index.c 1.9: * src/ctests/overflow_one_and_read.c 1.5: * src/ctests/overflow_single_event.c 1.45: * src/ctests/overflow_twoevents.c 1.26: * src/ctests/pthrtough2.c 1.7: * src/ctests/zero_shmem.c 1.6: * src/ftests/cost.F 1.18: * src/ftests/fmultiplex1.F 1.37: * src/ftests/ftests_util.F 1.49: * src/ftests/native.F 1.55: * src/perfmon.h 1.20: removed code for obsolete cray builds * src/ctests/do_loops.c 1.32: * src/ctests/zero_fork.c 1.9: * src/linux-memory.c 1.41: * src/linux.h 1.3: * src/perfctr-p3.c 1.91: * src/perfctr-p3.h 1.50: * src/run_cat_tests.sh 1.4: removed Catamount code 2009-11-09 bsheely * src/linux-ia64-memory.c 1.23: * src/linux-ia64.c 1.176: created hwinfo_linux.c to encapsulate code to set _papi_hw_info struct on Linux platforms * src/unicosmp-memory.c 1.4: removed obsolete file 2009-11-06 terpstra * src/libpfm-3.y/examples_v2.x/x86/smpl_nhm_lbr.c 1.1.1.2: libpfm nhm and atom fixes 2009-11-05 bsheely * src/alpha-memory.c 1.11: * src/ckcatamount.c 1.3: * src/dadd-alpha.c 1.43: * src/dadd-alpha.h 1.14: * src/irix-memory.c 1.20: * src/irix-mips.c 1.116: * src/irix-mips.h 1.34: * src/irix.c 1.2: * src/irix.h 1.3: * src/linux-alpha.c 1.24: * src/linux-alpha.h 1.9: * src/power3.c 1.41: * src/power3.h 1.19: * src/power3_events.c 1.9: * src/power3_events.h 1.8: * src/power4_events.h 1.9: * src/power4_events_map.c 1.6: * src/t3e_events.c 1.11: * src/tru64-alpha.c 1.66: * src/tru64-alpha.h 1.22: * src/unicos-ev5.c 1.69: * src/unicos-ev5.h 1.20: * src/unicos-memory.c 1.12: * src/unicosmp.h 1.5: * src/x1-native-presets.h 1.4: * src/x1-native.h 1.5: * src/x1-presets.h 1.7: * src/x1.c 1.38: * src/x1.h 1.11: removed files related to obsolete builds 2009-11-03 terpstra * 
src/libpfm-3.y/examples_v2.x/x86/Makefile 1.1.1.3: * src/libpfm-3.y/examples_v2.x/x86/smpl_core_pebs.c 1.1.1.3: * src/libpfm-3.y/examples_v2.x/x86/smpl_pebs.c 1.1.1.1: * src/libpfm-3.y/include/Makefile 1.1.1.9: * src/libpfm-3.y/include/perfmon/perfmon_pebs_smpl.h 1.1.1.1: * src/libpfm-3.y/include/perfmon/pfmlib_intel_nhm.h 1.1.1.2: * src/libpfm-3.y/lib/amd64_events_fam10h.h 1.1.1.5: * src/libpfm-3.y/lib/intel_corei7_unc_events.h 1.1.1.2: * src/libpfm-3.y/lib/pfmlib_amd64.c 1.1.1.10: * src/libpfm-3.y/lib/pfmlib_core.c 1.1.1.12: * src/libpfm-3.y/lib/pfmlib_intel_atom.c 1.1.1.6: * src/libpfm-3.y/lib/pfmlib_intel_nhm_priv.h 1.1.1.2: * src/libpfm-3.y/lib/power6_events.h 1.1.1.4: latest libpfm changes 2009-11-02 terpstra * src/utils/avail.c 1.49: * src/utils/native_avail.c 1.42: Fixes to eliminate strcpy on overlapping strings The offending calls were replaced with memmoves and encapsulated in a single function for better maintenance. 2009-10-29 bsheely * src/solaris-ultra.h 1.41: resolved compile errors on solaris 2009-10-23 bsheely * src/Rules.pfm_pcl 1.13: * src/pcl.c 1.12: * src/pcl.h 1.5: Naming convention change from PCL to Perf Events: renamed pcl.h and pcl.c to perf_events.h and perf_events.c, renamed Rules.pfm_pcl to Rules.pfm_pe, configure option --with-pcl changed to --with-perf-events 2009-10-20 bsheely * src/ctests/byte_profile.c 1.18: corrected possible logic error in setting end point of profile buffer 2009-10-15 bsheely * src/perfctr-ppc32.c 1.9: corrected possible init error 2009-10-14 terpstra * src/ctests/calibrate.c 1.39: Error checking was missing undercount conditions. 2009-10-13 terpstra * src/run_tests_exclude.txt 1.6: This file never existed on the PAPI-C branch. * src/aix-memory.c 1.15: * src/aix.c 1.84: * src/aix.h 1.29: * src/pmapi-ppc64.c 1.8: * src/pmapi-ppc64.h 1.4: * src/threads.c 1.33: Conversion of AIX to PAPI-C. Most tests pass, except for some overflow related stuff. 
Haven't examined things closely yet, but thought I should check this stuff in.

2009-10-12 bsheely
* src/ftests/fdmemtest.F 1.5: * src/ftests/flops.F 1.14: declare types explicitly
* src/ctests/multiattach.c 1.5: * src/ctests/zero_attach.c 1.5: corrected logic error with pid type

2009-10-09 terpstra
* src/power6_events.h 1.3: * src/power6_events_map.c 1.4: Somehow these got removed from the repository.

papi-papi-7-2-0-t/ChangeLogP410.txt

2010-06-21 terpstra
* src/Makefile.in 1.52: * src/configure 1.224: * src/configure.in 1.224: Change version numbers in anticipation of the impending 4.1 release.

2010-06-18 vweaver1
* src/components/example/example.c 1.4: Correct a comment.

2010-06-18 ralph
* doc/Doxyfile 1.5: * doc/Doxyfile-everything 1.2: Upped the version number in doxygen config files for upcoming release.
* INSTALL.txt 1.47: Friday afternoon typo... the command given for generating all documentation was wrong
* src/components/lustre/linux-lustre.c 1.6: * src/components/lustre/linux-lustre.h 1.5: Fixed some of the comments to get doxygen's attention /* -> /** I'm still working out how to best do the papi_components group but for now I just put the .h file for the component into the group. (@ingroup papi_components) So that one file per component shows up in the listing.
* src/papi.h 1.208: Added a small section about components on the main doxygen generated page.

2010-06-17 jagode
* src/components/lustre/Rules.lustre 1.3: * src/components/lustre/host_counter.c 1.2: * src/components/lustre/host_counter.h 1.2: Added new component for infiniband devices. Major changes for lustre component.
* src/components/README 1.4: Added documentation (Doxygen) for InfiniBand (and lustre) component.
2010-06-15 ralph
* src/components/acpi/linux-acpi.c 1.3: * src/components/acpi/linux-acpi.h 1.2: * src/components/lmsensors/linux-lmsensors.h 1.3: * src/components/mx/linux-mx.h 1.2: * src/components/net/linux-net.h 1.2: * src/papi.c 1.360: * src/papi_hl.c 1.85: * src/utils/avail.c 1.53: * src/utils/clockres.c 1.25: * src/utils/command_line.c 1.15: * src/utils/cost.c 1.40: * src/utils/decode.c 1.9: * src/utils/event_chooser.c 1.18: * src/utils/mem_info.c 1.17: * src/utils/native_avail.c 1.47: Added documentation for the several components. Doxygen will now search recursively under the components directory for documented *.[c|h] files ( /** @file */ somewhere in it). Several other files got brief descriptions of what is in the file.

2010-06-14 terpstra
* papi.spec 1.9: Minor tweak to make sure libpfm builds without warnings.

2010-06-11 jagode
* src/components/lmsensors/linux-lmsensors.c 1.2: removed compiler warnings for lm-sensors component; switched to stderr so that papi_xml_event_info creates a clean output.

2010-06-11 bsheely
* src/ctests/api.c 1.2: Added first few api test cases

2010-06-10 bsheely
* src/ctests/papi_test.h 1.39: * src/ctests/test_utils.c 1.82: Added test_fail_exit for use in single threaded tests

2010-06-09 vweaver1
* src/perfctr-2.6.x/patches/aliases 1.13: * src/perfctr-2.6.x/usr.lib/Makefile 1.31: Fix conflicts from import.
* src/perfctr-2.6.x/CHANGES 1.1.1.28: ...
* src/perfctr-2.6.x/usr.lib/x86.c 1.1.1.11: Import of perfctr 2.6.41 2010-06-07 bsheely * src/any-null.c 1.60: * src/freq.c 1.1: * src/papi_vector.c 1.31: Moved timer impl from any-null.c into papi_vector.c and added generic functionality to compute frequency if unable to determine based on platform * src/papi_data.c 1.40: * src/papi_data.h 1.6: Added new error code * src/Makefile.inc 1.163: Added freq.c to build * src/configure 1.223: * src/configure.in 1.223: ctests/api (not yet implemented) added to default ctests 2010-06-03 bsheely * src/ctests/Makefile 1.155: Initial commit for ctests/api which is not yet implemented 2010-06-02 bsheely * src/papi_lock.h 1.7: Fixed for BG/P 2010-06-01 vweaver1 * README 1.10: Fix typo in README 2010-06-01 bsheely * src/config.h.in 1.13: Added code to define _rtc when Cray is compiled with gcc * src/cycle.h 1.4: Rolled back previous changes 2010-05-27 bsheely * src/papi_internal.c 1.158: * src/threads.h 1.15: --with-no-cpu-component renamed --with-no-cpu-counters * src/components/mx/configure 1.3: * src/components/mx/configure.in 1.3: Rollback last change * src/ctests/multiattach.c 1.8: * src/ctests/zero_attach.c 1.8: Attempt to fix xlc compile errors 2010-05-21 bsheely * src/Rules.perfctr 1.66: * src/Rules.perfctr-pfm 1.57: * src/Rules.pfm 1.57: * src/Rules.pfm_pe 1.18: Use MISCHDRS from configure 2010-05-20 bsheely * src/components/mx/linux-mx.c 1.2: Fixed compile error and warnings. Added option to configure 2010-05-19 terpstra * src/ctests/all_native_events.c 1.24: Hard-code an exception for Nehalem OFFCORE_RESPONSE_0. This event can't be counted because it uses a shared chip-level register. 
2010-05-19 bsheely
* src/linux-ia64-memory.c 1.25: * src/linux-ia64.c 1.183: * src/pfmwrap.h 1.43: Fixed warning in ia64
* src/components/net/linux-net.c 1.2: Fixed compile warnings
* src/Makefile.in 1.51: Extra compiler warning flags are not added until after the libpfm build

2010-05-14 vweaver1
* src/linux-bgp.c 1.5: Temporary fix to emulate cycles HW counter on BlueGeneP using the get_cycles() call.

2010-05-13 bsheely
* src/x86_cache_info.c 1.13: added missing C library headers
* src/hwinfo_linux.c 1.7: fixed compile errors on torc0 by including missing C library headers
* src/ftests/Makefile 1.66: * src/utils/Makefile 1.16: Replaced missing MEMSUBSTR macro in configure. AC_ARG_ENABLE macros replaced with AC_ARG_WITH macros. Continued changes for --with-no-cpu-component

2010-05-07 ralph
* doc/Doxyfile-everything 1.1: * doc/Makefile 1.1: Added makefile in doc to generate user and developer documentation. From src, make doc builds the user documentation in doc/html (do we want this?)

2010-05-07 jagode
* src/utils/event_info.c 1.14: papi_xml_event_info generated some invalid xml output. This bug was introduced in Revision 1.10

2010-05-07 bsheely
* src/any-null-memory.c 1.11: * src/any-null.h 1.23: * src/extras.c 1.170: * src/multiplex.c 1.85: * src/papi_preset.c 1.29: * src/papi_vector.h 1.14: * src/threads.c 1.36: Added --with-no-cpu-component option which has only been tested on x86

2010-05-03 ralph
* src/freebsd-memory.c 1.1: * src/freebsd.c 1.9: * src/freebsd.h 1.6: * src/papi_fwrappers.c 1.86: Updated Harald Servat's freebsd work to Component Papi. Has had cursory testing, but should be considered alpha quality. (there is a really nasty bug when running the overflow_pthreads test)
* src/genpapifdef.c 1.43: Removed a holdout from catamount support, are there any platforms where we don't get malloc from stdlib?
2010-05-03 bsheely
* src/papi_table.c 1.5: Removed obsolete file

2010-04-30 terpstra
* release_procedure.txt 1.17: Add a few more steps on testing a patch.

2010-04-30 bsheely
* src/components/acpi/Rules.acpi 1.2: * src/components/lmsensors/Rules.lmsensors 1.2: * src/components/lustre/Rules.lustre 1.2: * src/components/mx/Rules.mx 1.2: * src/components/net/Rules.net 1.2: Adding new components no longer requires modification of Papi code

2010-04-29 bsheely
* src/components/Rules.components 1.1: * src/components/acpi/linux-acpi-memory.c 1.1: * src/components/lmsensors/Makefile.lmsensors.in 1.1: * src/components/lmsensors/configure 1.1: * src/components/lmsensors/configure.in 1.1: * src/components/lustre/host_counter.c 1.1: * src/components/lustre/host_counter.h 1.1: * src/components/mx/Makefile.mx.in 1.1: * src/components/net/Makefile.net.in 1.1: * src/components/net/configure 1.1: * src/components/net/configure.in 1.1: * src/host_counter.c 1.2: * src/host_counter.h 1.2: * src/linux-acpi-memory.c 1.4: * src/linux-acpi.c 1.18: * src/linux-acpi.h 1.10: * src/linux-lmsensors.c 1.4: * src/linux-lmsensors.h 1.4: * src/linux-lustre.c 1.4: * src/linux-lustre.h 1.2: * src/linux-mx.c 1.17: * src/linux-mx.h 1.10: * src/linux-net.c 1.6: * src/linux-net.h 1.4: Created new build environment for components

2010-04-21 bsheely
* src/perfmon.c 1.105: removed code that was commented out (accidentally uncommented on last commit)

2010-04-20 bsheely
* src/freebsd/map-i7.c 1.3: * src/freebsd/map-i7.h 1.3: Updated on 3.7 branch
* src/linux-bgl-events.c 1.4: * src/linux-bgl-memory.c 1.4: * src/linux-bgl.c 1.11: * src/linux-bgl.h 1.4: * src/linux-ia64.h 1.61: * src/linux.c 1.77: * src/papi_events.csv 1.9: * src/papi_pfm_events.c 1.40: * src/perf_events.c 1.26: * src/perf_events.h 1.11: * src/perfctr-ppc64.c 1.19: * src/perfctr-x86.c 1.4: * src/perfmon.h 1.24: * src/pmapi-ppc64.c 1.11: * src/solaris-ultra.c 1.128: Removed code for obsolete platforms

2010-04-16 jagode
* src/ctests/native.c
1.63: * src/papiStdEventDefs.h 1.41: * src/papi_internal.h 1.190: * src/papi_preset.h 1.22: * src/papi_protos.h 1.74: After further investigations of the stack corruption issue on BGP, the real problem has been nailed down. The size of the PAPI_event_info_t struct is different on BGP systems which is due to a bigger PAPI_MAX_INFO_TERMS value. A _BGP was defined at configure time to differentiate between BGP and other systems. However, the problem is that a user program does not know this macro. When PAPI_event_info_t is initialized to zero, the beginning of the user program's stack frame is zeroed out --> BAD. It was fun, though. * src/aix.c 1.87: Fixed compilation errors for AIX which were due to missing inclusion of new header file papi_defines.h. 2010-04-15 bsheely * src/freebsd/map-atom.c 1.5: ... * src/freebsd/memory.c 1.4: Added files 2010-04-09 bsheely * src/linux-ppc64-memory.c 1.9: * src/perfctr-ppc32.c 1.11: * src/perfctr-ppc32.h 1.4: * src/perfctr-ppc64.h 1.11: * src/ppc32_events.c 1.8: * src/ppc64_events.c 1.9: * src/ppc64_events.h 1.12: Removed support for ppc32 architectures. Removed support for perfmon versions older than 2.5 except for Itanium. Removed all code related to POWER3 and POWER4. 2010-04-08 bsheely * src/solaris-niagara2.h 1.5: Added new include file * src/solaris-niagara2.c 1.7: Removed recently added include file since that file is now included in the header which is included here 2010-04-06 jagode * src/linux-bgp.h 1.4: Missing declaration of PAPI_MAX_LOCK (fixed for linux-bgp only) 2010-04-05 bsheely * src/papi_memory.c 1.23: Resolved compile warning * src/ctests/profile.c 1.60: Modified code to exit properly on test failure 2010-04-01 bsheely * src/ctests/clockcore.c 1.21: Prevent output after test failure 2010-03-30 vweaver1 * src/libpfm-3.y/lib/pfmlib_intel_nhm.c 1.4: Fix conflict from merge. 
* src/libpfm-3.y/lib/intel_corei7_events.h 1.1.1.6: * src/libpfm-3.y/lib/pfmlib_itanium2.c 1.1.1.3: * src/libpfm-3.y/lib/pfmlib_montecito.c 1.1.1.4: import libpfm CVS adds additional i7 model 46 support, fixes ia64 builds 2010-03-29 bsheely * src/ctests/pthrtough.c 1.11: Fixed buffer overflow debug output related to threads.c. Rolled back change to pthrtough.c 2010-03-19 bsheely * src/solaris-ultra.h 1.43: Add new include for remaining substrates 2010-03-18 bsheely * src/ctests/p4_lst_ins.c 1.5: * src/ftests/native.F 1.56: * src/p3_pfm_events.c 1.14: * src/p4_events.c 1.56: * src/p4_events.h 1.10: * src/papi_defines.h 1.2: * src/papi_memory.h 1.12: * src/perfctr-p3.c 1.95: * src/perfctr-p3.h 1.52: * src/perfctr-p4.c 1.109: * src/perfctr-p4.h 1.47: * src/perfctr-x86.h 1.2: Merge bsheely-temp branch by hand 2010-03-12 vweaver1 * src/ctests/multiplex1.c 1.53: * src/ctests/multiplex1_pthreads.c 1.54: * src/solaris-memory.c 1.14: Fix PAPI support for solaris-ultra. This code had not worked for some time. * Derived events now work (although the events are still hard-coded and not read from the csv file) * Add cache size detection routines * Fix ntv_code_to_name() * Modify the multiplex* ctests to use proper events on UltraSPARC All of the regression tests pass except for profile_pthreads. This is because overflow handling is still partially broken. 2010-03-05 ralph * doc/doxygen_procedure.txt 1.1: doc/doxygen_procedure.txt provides a quick overview of how to use doxygen for commenting the PAPI code. The utilities are now commented, cloning the wiki man pages. The high level api is also documented, cloning the wiki again. In the low level api, PAPI_accum - PAPI_destroy_eventset are documented. 2010-03-05 bsheely * src/ctests/thrspecific.c 1.6: Test now passes while testing the same functionality without memory leaks 2010-03-04 vweaver1 * src/libpfm-3.y/lib/pfmlib_priv.h 1.7: Fix conflicts from the libpfm import. 
* src/libpfm-3.y/docs/man3/libpfm_westmere.3 1.1.1.1: * src/libpfm-3.y/examples_v2.x/showevtinfo.c 1.1.1.3: * src/libpfm-3.y/include/perfmon/pfmlib.h 1.1.1.13: * src/libpfm-3.y/lib/intel_wsm_events.h 1.1.1.1: * src/libpfm-3.y/lib/intel_wsm_unc_events.h 1.1.1.1: * src/libpfm-3.y/lib/pfmlib_common.c 1.1.1.14: * src/libpfm-3.y/lib/pfmlib_intel_nhm_priv.h 1.1.1.3: Import latest libpfm, which includes Westmere support 2010-03-04 bsheely * src/ctests/fork.c 1.7: * src/ctests/fork2.c 1.4: * src/ctests/krentel_pthreads.c 1.8: * src/ctests/kufrin.c 1.15: * src/ctests/overflow_pthreads.c 1.43: * src/ctests/profile_pthreads.c 1.37: Fixed memory leaks 2010-03-03 vweaver1 * src/p3_ath_event_tables.h 1.4: * src/p3_core_event_tables.h 1.5: * src/p3_events.c 1.65: * src/p3_opt_event_tables.h 1.4: * src/p3_p2_event_tables.h 1.4: * src/p3_p3_event_tables.h 1.4: * src/p3_pm_event_tables.h 1.4: Now that Athlon and Pentium II events use libpfm, remove the old hard coded event table files. * src/perfctr-2.6.x/README 1.1.1.6: * src/perfctr-2.6.x/patches/patch-kernel-2.6.18-164.el5-redhat 1.1.1.1: * src/perfctr-2.6.x/patches/patch-kernel-2.6.31 1.1.1.1: * src/perfctr-2.6.x/patches/patch-kernel-2.6.32 1.1.1.1: Import of perfctr 2.6.40 2010-03-03 bsheely * src/ctests/clockres_pthreads.c 1.11: * src/ctests/fork_exec_overflow.c 1.12: * src/ctests/zero_pthreads.c 1.29: Fixed memory leaks 2010-02-24 bsheely * src/linux-memory.c 1.44: Removed hack to compile without warnings using Wconversion 2010-02-23 bsheely * src/ctests/all_events.c 1.15: * src/ctests/multiplex2.c 1.36: * src/ctests/multiplex3_pthreads.c 1.45: Fixed (debug) compile warnings 2010-02-22 jagode * src/.indent.pro 1.1: ... * src/utils/version.c 1.4: Added and applied new PAPI-coding-style profile file * src/windows.c 1.6: Added missing comment closer */ This misindented the rest of the source code in windows.c 2010-02-16 terpstra * src/ctests/prof_utils.h 1.8: Cleaned up a bunch of implicit type conversions. 
2010-02-15 terpstra
* src/run_tests_exclude.txt 1.7: Remove the PAPI_set_event_info and PAPI_encode_event API calls, since they were never supported, and generally came to be thought of as a bad idea.
* src/ctests/encode.c 1.7: * src/ctests/encode2.c 1.5: Remove the encode and encode2 tests that exercise PAPI_set_event_info and PAPI_encode_event API calls, since they were never supported, and generally came to be thought of as a bad idea.

2010-01-25 bsheely
* src/examples/PAPI_flips.c 1.4: * src/examples/PAPI_flops.c 1.4: * src/examples/PAPI_get_opt.c 1.5: * src/examples/PAPI_ipc.c 1.4: * src/examples/PAPI_overflow.c 1.5: * src/examples/PAPI_profil.c 1.7: * src/examples/high_level.c 1.4: * src/examples/locks_pthreads.c 1.3: * src/examples/overflow_pthreads.c 1.5: Fixed remaining compile warnings
* src/examples/sprofile.c 1.5: Fixed compile warnings

papi-papi-7-2-0-t/ChangeLogP411.txt

2010-09-30
* src/: configure, configure.in: When --with-OS=CLE is enabled, check the kernel version and use perfmon2 for old kernels and perf_events for new kernels.
* src/: configure, configure.in: If no sources of perf counters are available, then use the generic_platform substrate instead. Currently the code would always fall back on perfctr even if no perfctr support was available.
* src/: configure, configure.in: If you specify --with-perf-events or --with-pe-include but the required perf_event.h header is not available, then have configure fail with an error.
* papi.spec: Bump version number to 4.1.1 in affected files. Also bump requirement for kernel from 2.6.31 to 2.6.32. This is in prep for the pending release.
* src/: configure, Makefile.in, configure.in, papi.h: Bump version number to 4.1.1 in affected files. This is in prep for the pending release.
* INSTALL.txt: Hope this late commit doesn't interfere with anything.
This updates the INSTALL.txt to reflect all of the improvements we've made to perf_event support since the last release.

2010-09-29
* src/Rules.pfm: The -Werror problem was still occurring on ia64/perfmon compiles, as I hadn't updated Rules.pfm
* src/: configure, configure.in, perf_events.c, perf_events.h, sys_perf_counter_open.c, sys_perf_event_open.c, syscalls.h: Remove support for the perf_counter interface in kernel 2.6.31. Now supports only the perf_event interface in kernel 2.6.32 and above.

2010-09-22
* src/perf_events.c: Attempt to add mmtimer support to perf_events substrate.
* src/: multiplex.c, papi.c, papi_protos.h: The multiplex code currently does not make a final adjustment at the time of MPX_read(). This is to avoid the case where counts could be decreasing if you have multiple reads returning estimated values before the next actual counter read. While this code works to keep the results non-decreasing, it can cause significant differences from expected results for final reads, especially if many counters are being multiplexed. This is seen in the sdsc-mpx test. It was failing occasionally on some machines by having an error of over 20% (the cutoff for a test error) when multiplexing 11 events. What this fix does is to special case the PAPI_stop() case when multiplexing is enabled, having the PAPI_stop() do a final adjustment. The intermediate PAPI_read() case is not changed. This fixes the sdsc-mpx case, while still passing the mendes-alt case (which checks for non-decreasing values). There is a #define that can be set in multiplex.c to restore the previous behavior.
* src/ctests/mendes-alt.c: This is our only test that checks to see if multiplexed values are non-decreasing or not. Unfortunately the test currently doesn't fail if values do go backward. This change causes the test to fail if it finds multiplexed counts that decrease.

2010-09-17
* src/libpfm-3.y/: config.mk, lib/intel_wsm_events.h: Fix conflicts from merge.
2010-09-15
* src/: Makefile.inc, Rules.perfctr-pfm, Rules.pfm_pe: Finally fix the -WExtra problem. The issue was -WExtra was being passed to libpfm, but only in the case where the user had a CFLAGS env variable. It turns out this is due to the following from section 5.7.2 of the gmake manual: "Except by explicit request, make exports a variable only if it is either defined in the environment initially or set on the command line." And the fix is also described: "If you want to prevent a variable from being exported, use the unexport directive." So I've added an "unexport CFLAGS" directive, which seems to be the right thing as our Makefile explicitly passes CFLAGS to the sub-Makefiles that need it. This seems to fix the build.

2010-09-13
* src/libpfm-3.y/: docs/man3/libpfm_westmere.3, lib/intel_wsm_events.h, lib/intel_wsm_unc_events.h, lib/pfmlib_intel_nhm.c, lib/pfmlib_priv.h: Fix the missing files from the import (CVS claims this as a "conflict")

2010-09-08
* src/Makefile.inc: Fixed the recipes for [c|f]tests and utils. $(LIBRARY) => $(papiLIBS) (this way we don't build libpapi.a if we don't want it)

2010-09-03
* src/ctests/sdsc.c: Had a "%d" instead of "%lld" in that last commit.
* src/ctests/sdsc.c: Give a more detailed error message on the sdsc-mpx test. We're seeing sporadic failures (probably due to results being close to the threshold value) but it's hard to tell on buildbot which counter is failing because the error message didn't print the value.

2010-09-02
* src/papi.c: Remove code that reported ENOSUPP if HW multiplexing is not available. PAPI can automatically perform SW multiplexing if HW is not available. With this part of my previous multiplexing patch reverted, multiplexing seems to work even on 2.6.32 perf_events (by reverting to SW mode on those machines)

2010-08-31
* src/perf_events.c: Explicitly set the disabled flag to zero in perf_events for new events.
It was possible with an event set that if you removed an event then added a new one, the disabled flag was obtaining the value from the previously removed event. This fix doesn't seem to break anything, but the code involved is a bit tricky to follow. This fixes the sdsc4-mpx test on sol.
* src/components/coretemp/: Rules.coretemp, linux-coretemp.c, linux-coretemp.h: Initial stab at a coretemp component. This component exposes everything that looks like a useful file under /sys/class/hwmon.

2010-08-30
* src/perf_events.c: F_SETOWN_EX is not available until 2.6.32, so don't use it unless we are running on a recent enough kernel.
* src/perf_events.c: Pentium 4 was not supported by perf_events until version 2.6.35. Print an error if we attempt to use it on an older kernel.

2010-08-27
* src/ctests/overflow_allcounters.c: The "overflow_allcounters" test failed on perfmon2 kernels because the behavior of a counter on overflow differs between the various substrates. Therefore detect if we're running on perfmon2 and print a warning, but still pass the test.
* src/libpfm-3.y/lib/: intel_wsm_events.h, intel_wsm_unc_events.h, pfmlib_intel_nhm.c, pfmlib_priv.h: updating
* src/libpfm-3.y/docs/man3/libpfm_westmere.3: removing westmere documentation
* src/perf_events.c: Fix warning in compile due to missing parameter in a debug statement.
* src/ctests/test_utils.c: In the ctests, test_skip() was attempting a PAPI_shutdown() before exiting. On multithreaded tests (that had already spawned threads before the decision to skip) this really causes the programs to end up confused and reports spurious memory errors. So remove the PAPI_shutdown() from test_skip(). There's a comment in test_fail() that indicates this was already done there for similar reasons.

2010-08-26
* src/ctests/byte_profile.c: byte_profile was failing on systems where fp_ops is a derived event. Modify the test so it gives a warning instead of failing and avoids using the derived event.
	* src/perf_events.c: At PAPI_stop() time a counter with overflow
	enabled was being adjusted by a value equal to the sampling period.
	It looks like this isn't needed (and is generating an overcount that
	breaks overflow_allcounters). I'm still checking up on this code; if
	it turns out to be necessary I may have to revert this later.

	* src/ctests/overflow_allcounters.c: Add validation check to
	overflow_allcounters. It turns out perf_event kernels overcount
	overflows for some reason, while perfctr doesn't. I'm investigating.

	* src/ctests/: overflow_allcounters.c, papi_test.h, test_utils.c:
	On Power5 and Power6, hardware counters 5 and 6 cannot generate
	interrupts. This means the overflow_allcounters test was failing
	because overflow could not be generated for events 5 and 6. Add code
	that special-cases Power5 and Power6 for this test (and generates a
	warning)

	* src/perf_events.c: Change some debug messages to be warnings
	instead of errors.

	* src/: papi.c, ctests/second.c: Fix ctests/second on bluegrass
	(POWER6). The test was testing domains by trying
	PAPI_DOM_ALL^PAPI_DOM_SUPERVISOR in an attempt to turn off the
	SUPERVISOR bit. This fails on Power6 as it leaves the PAPI_DOM_OTHER
	bit set, which isn't allowed. How did the test earlier measure
	PAPI_DOM_ALL then, which has all bits set? Well, it turns out papi.c
	silently corrects PAPI_DOM_ALL to be available_domains. But if you
	fiddle with any of the bits, this correction is lost. This is
	probably not the right thing to do, but the best way to fix it is
	not clear. For now this modifies the "second" test to also clear the
	DOM_OTHER bit if the domain setting fails with it set.

2010-08-25

	* src/: papi.c, papi.h, perf_events.c, ctests/kufrin.c,
	ctests/mendes-alt.c, ctests/multiplex1.c,
	ctests/multiplex1_pthreads.c, ctests/multiplex2.c,
	ctests/multiplex3_pthreads.c, ctests/sdsc.c, ctests/sdsc2.c,
	ctests/sdsc4.c, ftests/fmultiplex1.F, ftests/fmultiplex2.F: Add
	support for including the OS version in the component_info_t struct.
	Use this support under perf_events to disable multiplexing support
	if the kernel is < 2.6.33. Modify the various multiplexing tests to
	"skip" if they get a PAPI_ENOSUPP when attempting to set up
	multiplexing.

	* src/ctests/all_native_events.c: Update the all_native_events ctest
	to print a warning in the case where we skip events because they
	aren't implemented yet (offcore and uncore, mostly).

2010-08-24

	* src/ctests/: papi_test.h, profile.c, test_utils.c: Add a new
	"test_warn()" function for the ctests. This allows you to let tests
	pass with a warning. This is useful in cases where you don't want to
	forget that an option needs implementing, but the feature being
	missed isn't important enough to fail the test. The first user of
	this is the "profile" test. We warn that PAPI_PROFIL_RANDOM is not
	supported on perf_events.

	* src/perf_events.c: From what I can tell, on perf_events the
	PAPI_OVERFLOW_FORCE_SW overflow case was improperly falling through
	in _papi_pe_dispatch_timer() to also run the HARDWARE code. This
	meant that we were attempting to read non-existent hardware overflow
	data, causing a lot of errors to be printed to the screen. This
	shows up in the overflow_force_software test.

	* src/ctests/: ipc.c, multiplex2.c, multiplex3_pthreads.c,
	test_utils.c: Some minor changes to the ctests.
	+ ipc -- fail if the reported IPC value is zero
	+ multiplex2 -- fail if all 32 counter values report as zero
	+ multiplex3_pthread -- give up sooner if each counter returns
	zero; otherwise the test can take upwards of an hour to finish and
	makes the fan on my laptop sound like it's going to explode in the
	process

2010-08-20

	* src/Makefile.inc: Disable CFLAGS += $(EXTRA_CFLAGS) (-Wextra) for
	now. This will get buildbot running again, and if I can manage to
	figure out exactly what the Makefiles are doing I'll re-enable it
	again.

	* src/perf_events.c: Add support for Pentium 4 under perf_events.
	This requires a 2.6.35 kernel.
	On P4, perf_events requires a special format for the raw event, so
	we modify the results from libpfm3 to conform to what the kernel
	expects.

	* release_procedure.txt: release_procedure updated to reflect files
	to keep under /doc

2010-08-18

	* src/perf_events.c: Patch from Gary Mohr that allows PAPI on
	perf_events to catch permissions problems at configuration time,
	rather than only once papi_start() is called. Quick summary of
	changes:
	+ Adds a check_permissions() routine; PERF_COUNT_HW_INSTRUCTIONS
	is used as the test event.
	+ check_permissions() is called during PAPI_ATTACH, PAPI_CPU_ATTACH
	and PAPI_DOMAIN
	+ Various "ctl" structures renamed "pe_ctl"
	+ Some minor debug changes

2010-08-05

	* src/perf_events.c: Use F_SETOWN_EX instead of F_SETOWN in
	tune_up_fd(). This fixes a multi-thread overflow bug found with the
	Rice test-suite. F_SETOWN_EX doesn't exist until Linux 2.6.32. We
	really need some infrastructure that detects the running kernel at
	init time and warns that things like F_SETOWN_EX, multiplexing,
	etc., are unavailable if the kernel is too old.

2010-08-04

	* src/: Makefile.inc, cpus.c, cpus.h, genpapifdef.c, papi.c,
	papi.h, papi_defines.h, papi_internal.c, papi_internal.h,
	perf_events.c, perf_events.h, threads.h: This is the PAPI_CPU_ATTACH
	patch from Gary Mohr that also fixes a problem with multiple event
	sets on perf_events.
	Changes by file:
	papi.h
	+ Add PAPI_CPU_ATTACHED
	+ Add structures needed for CPU_ATTACH
	Makefile.in
	+ include the new cpus.c file
	papi_internal.c
	+ add call to _papi_hwi_shutdown_cpu() in _papi_hwi_free_EventSet()
	+ make remap_event_position() non-static
	+ add_native_events() and remove_native_events() use
	_papi_hwi_get_context()
	+ _papi_hw_read() has some whitespace and debug message changes,
	and removes an extraneous loop index
	papi_internal.h
	+ a new CPUS_LOCK is added
	+ cpuinfo struct added to various structures
	+ an inline function called _papi_hwi_get_context() added
	perf_events.h
	+ a cpu_num field added to control_state_t
	perf_events.c
	+ open_pe_events() allows per-cpu counting; additional debug was
	added
	+ set_cpu() function added
	+ new debug messages in set_granularity() and _papi_pe_read()
	+ _papi_pe_ctl() has PAPI_CPU_ATTACH code added
	+ _papi_pe_update_control_state() has the default domain set to
	PAPI_DOM_USER instead of pe_ctl->domain
	genpapifdef.c
	+ PAPI_CPU_ATTACHED added
	threads.h
	+ an ESI field added to ThreadInfo_t
	papi.c
	+ many new ABIDBG() debug messages added
	+ PAPI_start() updated to check for CPU_ATTACH conflicts, has
	whitespace fixes, gets context now, and if dirty calls
	update_control_state()
	+ PAPI_stop(), PAPI_reset(), PAPI_read(), PAPI_read_ts(),
	PAPI_accum(), PAPI_write(), PAPI_cleanup_eventset() all use
	_papi_hwi_get_context() to get context
	+ PAPI_read() has some braces added
	+ PAPI_get_opt() and PAPI_set_opt() have CPU_ATTACHED code added
	+ PAPI_overflow() and PAPI_sprofil() now report errors if
	CPU_ATTACH is enabled
	cpus.c, cpus.h
	+ New files based on threads.c and threads.h
	I made some additional changes, based on warnings given by gcc:
	+ Added a few missing function prototypes in cpus.h
	+ Updated PAPI_MAX_LOCK, as it wasn't increased to handle the new
	addition of CPUS_LOCK
	+ Removed various variables and functions reported as being unused.
2010-08-03

	* src/: papi_internal.h, papi_lock.h: The option
	--with-no-cpu-counters was not supported on AIX. This has been fixed
	and works now. Also the get_{real|virt}_{cycles|usec}
	implementations for AIX (checked in Jul 29) have now been tested
	and work correctly.

2010-07-29

	* src/: configure, configure.in, papi_lock.h, papi_vector.c: Added
	AIX support for the get_{real|virt}_{cycles|usec} functions. +++
	Fortran tests are now compiling on AIX. Wrong compiler flags were
	used for the AIX compilers.

2010-07-26

	* src/papi_events.csv: add PAPI_L1_DCM for atom

	* src/x86_cache_info.c: Update the x86 cache_info table. The data
	from this table now comes from figure 3-17 in the Intel
	Architectures Software Reference Manual 2A (cpuid instruction
	section). This fixes an issue on my Atom N270 machine where the L2
	cache was not reported.

2010-07-16

	* INSTALL.txt, src/perf_events.c, src/perf_events.h: Perf Events
	now supports attach and detach. The patch for supporting this was
	written by Gary Mohr

	* src/papi_events.csv: Add a few missing events to Nehalem, based
	on reading Intel Volume 3b.

	* src/papi_events.csv: Fix Westmere to not use L1D_ALL_REF:ANY. I
	tested this on a Nehalem, which has the proper behavior;
	unfortunately no Westmere here to test on.

	* src/: papi_events.csv, papi_pfm_events.c, perfctr-x86.c: Enable
	support for having more than one CPU block with the same name in the
	.csv file. This allows easier support for sharing events between
	similar architectures. I *think* this is needed and *think* it
	shouldn't break anything, but I might have to back it out. Also
	fixes event support for Pentium Pro / Pentium III / P6 on perfmon2
	and perf_events kernels. Also fixed some confusion where perfctr
	called chips "Intel Core" meaning Core Duo, whereas pfmon called
	"Intel Core" meaning Core2.
	This was tested on actual Pentium Pro and PIII hardware (as well as
	on a few Pentium 4 machines plus a Core2 machine)

2010-07-02

	* src/: papi_hl.c, ctests/api.c: Added remaining low-level api tests

ChangeLogP412.txt

2011-01-17

	* src/configure: Ran autoconf to generate updated configure file.

2011-01-16

	* src/components/README: Adding a component for the FreeBSD OS that
	reports the value of the thermal sensors available in the Intel Core
	processors. There are as many counters as cores, and the value
	reported by each counter is in kelvin.

	* src/freebsd.c: Implemented missing _papi_freebsd_ntv_name_to_code.

	* src/: Makefile.in, Makefile.inc, configure.in, ctests/Makefile:
	Fix dependency on -ldl. Now configure checks if dl* symbols are in
	the base system libraries (i.e., no -ldl needed). If so, avoid
	adding -ldl to the shlib example. If dl* symbols are not found in
	the base system libraries, then check for -ldl, and if it exists,
	pass it to ctests/Makefile through Makefile. If -ldl is not found,
	fail at configure time.

	* src/ctests/multiattach.c: Fix to compile in FreeBSD.

	* src/: freebsd-memory.c, freebsd.c: Code cleanup.

2011-01-14

	* src/: perf_events.c, perfmon.c: [PATCH 18/18] papi: make
	_perfmon2_pfm_pmu_type variable static. In perf_events.c and
	perfmon.c the variable _perfmon2_pfm_pmu_type is used locally only,
	so make it static.
	Signed-off-by: Robert Richter

	* src/: linux-bgp.c, linux-ia64.c, perf_events.c, perfctr.c,
	perfmon.c: [PATCH 17/18] papi: remove inline_static macro in
	Linux-only code. We better replace the macro with 'static inline'.
	Not sure if this works for all compilers, so doing it for Linux-only
	files.
	Signed-off-by: Robert Richter

	* src/x86_cache_info.c: [PATCH 16/18] papi: remove static inline
	function declaration. By moving the static inline function cpuid()
	to the beginning of the file we may remove its declaration.
	Signed-off-by: Robert Richter

	* src/linux.h: [PATCH 15/18] papi: remove unused linux.h header
	file. This file is included nowhere; removing it.
	Signed-off-by: Robert Richter

	* src/linux-ia64.c: [PATCH 14/18] papi: fix array out-of-bounds
	access. Fixing the following warning:
	linux-ia64.c: In function '_ia64_init_substrate':
	linux-ia64.c:1123:22: warning: array subscript is above array bounds
	Signed-off-by: Robert Richter

	* src/: configure, configure.in: [PATCH 13/18] papi: remove
	unnecessary checks in configure.in. The check is obsolete and
	covered by default.
	Signed-off-by: Robert Richter

	* src/: papi_pfm_events.c, perf_events.c, perfmon.c, perfmon.h:
	[PATCH 12/18] papi: include perfmon header files only where
	necessary. This patch includes perfmon header files only where
	necessary. Declarations in perfmon/perfmon.h are never used, so its
	inclusion is removed. Itanium header files are needed only in
	perfmon.c and perf_events.c.
	Signed-off-by: Robert Richter

	* src/: papi_pfm_events.c, perfctr-x86.c: [PATCH 11/18] papi: make
	some functions in papi_pfm_events.c static. The functions
	_pfm_decode_native_event() and _pfm_convert_umask() are internally
	used only. Remove the export declaration and make them static.
	Signed-off-by: Robert Richter

	* src/: Rules.pfm, linux-ia64-pfm.h, linux-ia64.c, pfmwrap.h:
	[PATCH 10/18] papi: rename pfmwrap.h -> linux-ia64-pfm.h. pfmwrap.h
	actually only contains IA64 code included by linux-ia64.c. Rename it
	to linux-ia64-pfm.h.
	Signed-off-by: Robert Richter

	* src/: linux-ia64.c, pfmwrap.h: [PATCH 09/18] papi, linux-ia64:
	make inline functions static. Inline functions should be static.
	Fixing it.
	Signed-off-by: Robert Richter

	* src/: linux-ia64.c, papi_pfm_events.c: [PATCH 08/18] papi: fix
	_papi_pfm_ntv_name_to_code() function interface. The function is
	supposed to return a PAPI error code, which is an integer. Make the
	function's return code an integer too.
	Signed-off-by: Robert Richter

	* src/perfctr-ppc64.c: [PATCH 07/18] papi: fix spelling modifer ->
	modifier. Fix spelling: modifer -> modifier.
	Signed-off-by: Robert Richter

	* src/: linux-ia64.c, papi_pfm_events.c, papi_pfm_events.h,
	perf_events.c, perfctr-x86.c, perfmon.c: [PATCH 06/18] papi: define
	function interface in papi_pfm_events.h. The header file should
	define the interface that papi_pfm_events.c provides. Declarations
	used internally only in papi_pfm_events.c are moved there. Now
	papi_pfm_events.h only contains the function prototypes. Remapping
	of definitions is removed too. This cleanup removes duplicate code
	and better defines the interface.
	Signed-off-by: Robert Richter

	* src/: Rules.perfctr, Rules.perfctr-pfm, linux.c, multiplex.c,
	papi_vector.c, perfctr-x86.c, perfctr.c, ctests/test_utils.c:
	[PATCH 05/18] papi: rename linux.c -> perfctr.c. The name of linux.c
	is misleading; it only implements perfctr functionality. Thus
	renaming it to perfctr.c.
	Signed-off-by: Robert Richter

	* src/: papi_pfm_events.c, perfctr-x86.c: [PATCH 04/18] papi: make
	_papi_pfm_init() static by moving it to perfctr-x86.c.
	_papi_pfm_init() is only used in perfctr-x86.c but implemented in
	papi_pfm_events.c. Move it to perfctr-x86.c and make it static.
	Signed-off-by: Robert Richter

	* src/perfmon.c: [PATCH 03/18] papi: make some functions static in
	perfmon.c. The functions are only used in perfmon.c, so make them
	static.
	Signed-off-by: Robert Richter

	* src/: Rules.pfm, Rules.pfm_pe: [PATCH 02/18] papi: do not compile
	libpfm examples, to support cross compilation.
	Signed-off-by: Robert Richter

	* src/Rules.pfm: To cross compile papi we need to pass the
	architecture to libpfm. Otherwise it will be confused and try to
	build the host's make targets with the cross compiler, ending up in
	the following error:
	pfmlib_amd64.c: In function 'cpuid':
	pfmlib_amd64.c:166:3: error: impossible register constraint in 'asm'
	pfmlib_amd64.c:172:1: error: impossible register constraint in 'asm'
	make[2]: *** [pfmlib_amd64.o] Error 1
	Signed-off-by: Robert Richter

	* src/ctests/Makefile: Temporarily back out the FreeBSD makefile
	change that breaks the build so that I can properly test some other
	changes.

	* src/papi_events.csv: Change the Core2 L1_TCM preset to be
	LLC_REFERENCES. The current event (L2_RQSTS:SELF:MESI) returns an
	event equivalent to LLC_REFERENCES on libpfm3, but in libpfm4
	L2_RQSTS:SELF:MESI maps instead to L2_RQSTS:SELF:MESI:ALL, which
	counts prefetches too. By moving to LLC_REFERENCES both libpfm3 and
	libpfm4 count the proper value. This also makes the "tenth"
	benchmark pass when using PAPI/libpfm4.

	* src/configure: Update to match current configure.in

	* src/ctests/Makefile: Fix the if / fi syntax of the last change.

2011-01-13

	* src/: Makefile.inc, configure.in, freebsd-memory.c, freebsd.c,
	ctests/Makefile, ctests/zero_attach.c: Changes from Harald Servat
	for FreeBSD support. Note that configure has not been regenerated
	from this version of configure.in.

	* papi.spec, doc/Doxyfile, doc/Doxyfile-everything,
	src/Makefile.in, src/configure.in, src/papi.h: Change version
	numbers to 4.1.2 in preparation for a release.

2011-01-12

	* src/ctests/code2name.c: The code2name test was assuming that the
	native events start right at PAPI_NATIVE_MASK. We specifically
	document elsewhere that this might not be the case, and indeed for
	the libpfm4 code this fails. This fix changes the code to properly
	enumerate the native events for the test.

2011-01-06

	* src/: papi.c, papi_internal.c: Fix a long-standing bug where we
	were walking off the end of the EventInfoArray in
	remap_event_position(). This was noticed by Richard Strong when
	instrumenting some of the PARSEC benchmarks. In papi_internal.c in
	the remap_event_position() function we have the loop
	for ( i = 0; i <= total_events; i++ ) {
	It seems weird that we are doing a <= compare, and in fact this is
	why we walk off the end of the array sometimes. But why only
	sometimes?
	If I change that <= to a < then many of the regression tests fail.
	It turns out that the two calls to remap_event_position() in
	papi_internal.c are called with ESI->NumberOfEvents being one less
	than it should be, as it is incremented after the
	remap_event_position() call (though the new events are added before
	the call). This is why <= is used. However, the call in PAPI_start()
	happens with ESI->NumberOfEvents at the right value. In this case <
	should be used. The fix I've come up with has a NumberOfEvents value
	passed in as a parameter to remap_event_position(). This way the
	value+1 can be passed in the former cases.

2010-12-20

	* src/aix.c: Problem on POWER6 with AIX: pm_initialize() cannot be
	called multiple times with PM_CURRENT. Instead, use the actual proc
	type - here PM_POWER6 - and multiple invocations are no longer a
	problem. ctests/multiplex1.c passes now.

2010-12-15

	* src/run_tests.sh: If we don't run any tests, get buildbot's
	attention.

2010-12-14

	* src/aix.c: The number_of_nodes var was set to zero in
	_aix_get_system_info. This caused the papi utilities to report that
	the number of total CPUs is zero. This also caused ctests/hwinfo to
	fail on POWER6 with AIX.

2010-12-13

	* src/papi_internal.h: Slight re-ordering of the no_vararg_macro
	debug statements. (I actually tested the changes with --with-debug
	and without on AIX)

2010-12-10

	* src/run_tests.sh: Change the syntax on our find command to be
	more POSIX compliant. GNU is Not UNIX; cute acronym or massive
	compatibility conspiracy? I fall back to POSIX, you decide!

	* src/: configure, configure.in: Update configure file to be aware
	of the existence of AIX-Power7. PAPI still won't build, but it gets
	further than before.

2010-12-09

	* src/run_tests.sh: Make our grep invocation POSIX compliant.
	(--invert-match == -v & --regex == -e)

	* src/ctests/overflow_allcounters.c: Separate 'indent' check-in so
	that the previous modifications are comprehensible :)

	* src/ctests/overflow_allcounters.c: The overflow_allcounters test
	failed on Power6 with AIX (pmapi) but passes on Power6 with Linux
	(perf_events | perfctr). Therefore detect if we're running on AIX,
	print a warning, but still pass the test.

	* src/run_tests.sh: Move away from echo -n to the shell builtin
	printf (echo -n is not portable); non-argumented instances of echo
	are fine.

	* src/run_tests_exclude.txt: Skip the non-test ctests/burn
	executable.

	* src/Matlab/: PAPI_Matlab.c, PAPI_Matlab.readme: Change
	documentation for Matlab integration to reflect the need to link to
	the libpapi.so library and not the static one. Also listed me and
	the ptools-perfapi list as points of contact for future questions
	*gulp*

2010-12-08

	* src/: configure, configure.in, run_tests.sh: Clean up (purge)
	references to libpfm-2.x in configure and run_tests.sh

	* src/Matlab/PAPI_Matlab.c: MATLAB fixups: Calls to PAPI('stop')
	now stop counting even if we ignore the return values.

	* src/Matlab/PAPI_Matlab.c: Fixup for PAPI Matlab integration.
	Calls to PAPI('stop') don't cause errors now. If you call
	PAPI('stop') without capturing its return value, it does nothing.

	* src/Matlab/PAPI_Matlab.c: mex does not like C++ style comments
	(double-slash)

2010-12-06

	* src/solaris-ultra.c: Resolved a couple of type cast warnings.
	Also initialized a variable and enabled GET_OVERFLOW_ADDRESS code
	in two places. The overflow test suite still has a number of
	failures and is disabled in configure.

2010-11-24

	* src/papi_internal.h: That last commit was lacking in
	creativity... By having the debug function names still be a macro,
	we get all the goodness of __FILE__ etc. being in the right place
	while still not using variadic macros.
	#define SUBDBG do { if (_papi_hwi_debug & DEBUG_SUBSTRATE) print_the_label; } while (0); _SUBDBG
	was the clever line that eluded me yesterday.

2010-11-23

	* src/papi_internal.h: Turns out that when DEBUG and
	NO_VARARG_MACRO are true, we didn't correctly implement
	component-level debug functions. This change uses variable argument
	lists (man stdarg) to correctly handle this case. (papi_internal.h
	defines these) Note that the debugging information is not completely
	useful; due to functions which use variable argument lists not being
	inlinable (the inline keyword is, after all, only a suggestion),
	all messages appear to come from papi_internal.h:PAPIDEBUG:525:22619
	and I am not clever enough to get around that in general right now.
	Thanks to Maynard Johnson for reporting.

	* src/papi_events.csv: Enable the PAPI_HW_INT event on Nehalem, as
	tests show the HW_INT:RCV event is the proper one to use here.

2010-11-22

	* src/papi_events.csv: Update the preset events for Nehalem, as
	contributed by Michel Brown.

2010-11-19

	* src/: perf_events.h, perf_events.c: Address problem with the
	overflow handler continuing to count events. Add an overflow status
	field to determine if an event set has any events enabled for
	overflow. Use IOC_REFRESH instead of IOC_ENABLE when overflowing.
	Implement IOC_REFRESH at the end of the overflow handler. None of
	this worked. Also implemented an IOC_DISABLE at the top of the
	overflow handler. That worked, even though it's suboptimal.

2010-11-17

	* src/utils/command_line.c: test_fail_exit() substituted for
	test_fail(). This became necessary because PAPI_event_name_to_code
	now returns a PAPI_EATTR error if the base name matches but the
	attribute names don't. This utility was producing an error message
	and then running the test. Perfctr implementations will happily add
	a base name with no umasks and then generate 0 counts. This fix
	prevents that behavior.

	* src/ctests/test_utils.c: Rewrite of test_fail_exit() to call
	test_fail().
	It should be noted that test_fail_exit() behaves the way
	test_fail() used to behave, i.e. it exits after printing the fail
	message. However, test_fail no longer exits, as that was causing
	problems with multi-threaded tests not freeing memory. In those
	cases where an exit is desired, calls to test_fail_exit() should be
	substituted for calls to test_fail().

	* src/: papi.h, papi_data.c, papi_pfm_events.c, perfmon.c: Added 3
	new error codes: PAPI_EATTR, PAPI_ECOUNT, and PAPI_ECOMBO. These
	map onto equivalent errors in libpfm and are provided to give more
	detail on failures in libpfm calls. A new error mapping function
	has been added to papi_pfm_events.c to map libpfm errors to PAPI
	errors, and this function is employed in the compute_kernel_args
	function in perfmon.c. It could also be deployed elsewhere, but so
	far is not.

2010-11-09

	* src/x86_cache_info.c: The cpuid change yesterday broke
	compilation on a 32-bit Pentium 3. Fix the inline assembly to
	compile properly there too.

2010-11-08

	* src/: configure, configure.in: Fix configure script to properly
	detect Pentium M machines.

	* src/x86_cache_info.c: Add cpuid leaf 4 cache detection support.
	This has been available on Intel processors since late-model P4s
	and all Core2 and newer. It returns cache info in a different way
	than the older leaf 2 method. Currently we only use leaf 4 data if
	the leaf 2 results tell us to (apparently Westmere does that).
	Otherwise we use the old method. It might be interesting to use
	more of the leaf 4 info. It can tell us things such as how many
	processors share a socket, how many processors share a cache, and
	info on the inclusivity of a cache.

	* src/: linux.c, perfctr-x86.c: Add perfctr Westmere support.

	* src/perfctr-2.6.x/: patches/aliases, usr.lib/Makefile: Fix
	conflicts from perfctr merge.

2010-11-06

	* src/perf_events.c: Replace KERNEL_CHECKS_SCHEDUABILITY_UPON_OPEN
	with the proper dynamic kernel version number checking.
	This should be the last place in our perf_events code that was
	using a hard-coded rather than dynamic check for a
	kernel-version-related bugfix.

	* src/perf_events.c: This patch allows PAPI to read multiple events
	at a time out of the kernel when the kernel is new enough (2.6.34
	or newer). The previous code required setting a #define by hand to
	get this behavior; this new code picks the proper way to do things
	based on the kernel version number. The patch was supplied by Gary
	Mohr

2010-11-04

	* src/: linux.c, perfctr-x86.c: Replace occurrences of
	PERFCTR_X86_INTEL_COREI7 with PERFCTR_X86_INTEL_NHLM, as the former
	has been documented as deprecated as of perfctr 2.6.41.

2010-11-03

	* src/cycle.h: Change "unicos" to "CLE" since "unicos" no longer
	exists.

2010-10-26

	* src/examples/locks_pthreads.c: Add a call to PAPI_thread_init().
	Thanks to Martin Schindewolf for pointing this out.

2010-10-21

	* src/: papi.c, components/lmsensors/linux-lmsensors.h: Fix up URLs
	that checkbot was finding in error.

2010-10-05

	* src/ctests/: multiattach.c, zero_attach.c: The zero_attach and
	multiattach tests were forking off children before testing that
	PAPI is in fact available. Then when PAPI_init() failed, the
	children weren't being cleaned up properly. This was confusing
	buildbot. This changeset moves the fork to after the check, plus
	does a fail_exit() on failure.

	* src/: configure, configure.in: Solaris build will fail if
	/usr/ccs/bin isn't in the path. Have it check there for "ar" on
	Solaris systems if it can't be found by normal methods.

	* src/: configure, configure.in: Only run the EAR tests on Itanium
	systems.

	* src/: configure, configure.in: Pentium4-perfctr was skipping most
	of the CTESTS. Make sure they are all run.
ChangeLogP4121.txt

2011-01-20

	* src/papi_events.csv: Remove HW_INT:RCV event that was mistakenly
	enabled for Westmere

ChangeLogP413.txt

2011-05-10

	* src/Rules.pfm_pe: The --with-bitmode parameter was not being
	passed along to libpfm3, so it was not possible to build perf_event
	PAPI in non-default bitmodes. This change passes along the
	$(BITFLAGS) value to the libpfm3 make invocation.

	* src/: papi_pfm_events.c, papi_pfm_events.h, perf_events.c: The
	perf_events code was using __u64 instead of uint64_t, and this was
	causing a warning when compiling for 64-bit Power.

	* src/libpfm-3.y/lib/amd64_events_fam15h.h: Added Robert Richter's
	patch with a few new events for AMD Family 15h.

2011-05-06

	* INSTALL.txt: Load the 'gcc' module, not the 'gnu' module, for
	Cray.

	* INSTALL.txt: Update the install instructions for Cray XT and XE
	systems.

	* src/ctests/: multiattach.c, multiattach2.c: Make the multiattach
	and multiattach2 failures into warnings. I have a proposed fix that
	makes the failures go away, but it has not been tested much and
	also causes some new fcntl() error messages under perfctr. So
	temporarily make the tests only warn for the release, and I'll work
	on a proper fix afterwards. The behavior in these tests has been
	broken for a long time, so it is not a recent regression.

	* src/papi_memory.c: Band-aid for the leak debugging statement in
	papi_memory.c on NO_VARARG_MACRO systems. (AIX currently)

2011-05-05

	* src/ctests/multiattach.c: Had the division backwards on the
	validation.

	* src/ctests/multiattach.c: Update the multiattach test to fail if
	the results aren't in the proper ratio. This was failing on
	perf_event kernels, but since the results weren't checked it was
	never reported as an error.
	* delete_before_release.sh: delete cvs2cl.pl before release

	* ChangeLogP413.txt: First cut change log for the 4.1.3 release.
	Nothing's frozen yet...

	* cvs2cl.pl: Perl script to generate change logs. Keeping it with
	the project makes life easier.

	* INSTALL.txt: Change INSTALL to reflect that we support POWER7.

	* src/Makefile.in, src/configure, src/configure.in, src/papi.h,
	doc/Doxyfile, doc/Doxyfile-everything, papi.spec: Modify version
	number for pending release: 4.1.3.0

2011-05-03

	* src/: papi_internal.c, papi_internal.h, sys_perf_event_open.c,
	ctests/attach2.c: Clean up the _papi_hwi_cleanup_eventset()
	function in papi_internal.c. This function was re-using existing
	functionality to remove one event at a time before cleaning out the
	event set. This is not strictly necessary and was breaking on
	perf_event event sets that were attached to finished processes, as
	a call to update_control_state() would close/reopen the perf_event
	fd, failing when the finished process went away after the close.
	The new code removes all events from the event set in one go before
	calling update_control_state(). The change here also updates code
	comments as necessary, as some of the code in papi_internal.c can
	be a bit obscure. It also updates some of the comments in
	ctests/attach2.c to give better debugging info.

2011-04-28

	* src/threads.c: Uncomment the actual signal passing functionality
	in _papi_hwi_broadcast_signal

	* src/papi_debug.h: Include files added to papi_debug.h

	* src/components/README: Added detailed instructions on how to
	build PAPI with the CUDA component

2011-04-27

	* src/threads.c: Move an escape test to the outer loop in
	_papi_hwi_broadcast_signal. This cleans up an infinite loop where
	before we would only break out of the component loop, not the
	thread-list-walking loop.

	* src/: papi.c, papi_internal.c, papi_internal.h, papi_protos.h:
	Clean up papi_internal.c so that functions not used outside are
	marked static.
	* src/: papi_pfm_events.c, papi_preset.c, pmapi-ppc64_events.c:
	papi: Fix some memory leaks
	Signed-off-by: Robert Richter

	* src/perf_events.c: papi: Make functions and variables static in
	perf_events.c. All these functions and variables are not used
	outside perf_events.c; making them static.
	Signed-off-by: Robert Richter

	* src/papi_pfm_events.c: papi: Fix crash in error handler for
	pfm_get_event_code_counter()
	Signed-off-by: Robert Richter

	* src/utils/native_avail.c: papi: Fix error check in native_avail.c
	Signed-off-by: Robert Richter

2011-04-26

	* src/libpfm-3.y/: include/perfmon/pfmlib_amd64.h,
	lib/pfmlib_amd64.c: The AMD architectural PMU could not be detected
	for family 15h as there was a strict check for AMD family 10h.
	Enabling it now for all families from 10h on.
	Signed-off-by: Robert Richter

	* src/libpfm-3.y/lib/amd64_events_fam15h.h: There is no kernel
	support for AMD family 15h northbridge events; disable them in
	libpfm3 so they are not reported as available native events. Patch
	from Robert Richter

	* src/: configure, configure.in, linux-common.c: Add some extra
	debug messages for better tracking of the --with-assumed-kernel
	configure option.

2011-04-25

	* src/: configure, configure.in, linux-common.c: Add a new
	configure option: --with-assumed-kernel= This allows you to specify
	a kernel revision to use (instead of it being autodetected with
	uname) for perf_event workaround purposes. With this you can force
	PAPI to not use workarounds on kernels with backported versions of
	perf_event features.

2011-04-19

	* src/: Makefile.inc, configure, configure.in, papi_debug.h,
	papi_internal.h, sys_perf_event_open.c: Add debugging to
	sys_perf_event_open.c to show exactly what values are being passed
	to the perf_event_open syscall.

2011-04-18

	* src/: run_tests.sh, ctests/attach2.c, ctests/attach3.c: Fix for
	finding attach_target with execlp to search the path.
2011-04-14

	* src/: Rules.pfm, configure, configure.in, linux-ia64-pfm.h,
	linux-ia64.c, linux-ia64.h, perfmon-ia64-pfm.h, perfmon-ia64.c,
	perfmon-ia64.h, perfmon.h: Rename the linux-ia64-* files to be
	called perfmon-ia64-*. This is a more descriptive name and makes it
	more obvious what the files are for.

	* src/libpfm-3.y/: include/perfmon/pfmlib_amd64.h,
	lib/pfmlib_amd64.c, lib/pfmlib_amd64_priv.h: Patch to have libpfm3
	use 6 counters on Interlagos. Patch provided by Robert Richter

	* src/linux-memory.c: Fix the POWER cache detection routines to
	work properly on POWER7. Patch provided by Corey Ashford

	* src/: configure, configure.in: Have configure check for ifort if
	gfortran, etc., are not found. Patch by Gary Mohr

	* src/ctests/johnmay2.c: Update the validation message on the
	ctests/johnmay2.c test to be less confusing. Also add some comments
	to the source code. Problem reported by Steve Kaufmann.

2011-04-13

	* src/ctests/: multiattach2, multiattach2.c: Remove the
	accidentally added ctests/multiattach2 and add instead the proper
	ctests/multiattach2.c

	* src/Makefile.inc: components_config.h is cleaned out with make
	clobber, not make clean. This should fix the buildbot issues.

	* src/ctests/: Makefile, attach3.c, multiattach.c, multiattach2,
	zero_attach.c: Minor typos in comments. Discovered another bug in
	attach code demonstrated by multiattach2. You cannot have an event
	set running that is self-counting as well as one that is attached.
	PAPI thinks that both are running and throws an error.

	* src/perf_events.c: We must update the control state after
	attaching for perf_events; zero_attach now passes

	* src/ctests/: Makefile, attach2.c, attach3.c, do_loops.c: This
	commit adds testing of attaching to fork/exec'd executables;
	zero_attach and multiattach just test forks. This also modifies
	do_loops.c to be able to generate a test driver when -DDUMMY_DRIVER
	is defined, so we can use it to generate flops as a subprocess.
	Attach2 and attach3 have one important difference.
Attach3 does a 'assign component' before attaching and then adding events. Attach2 does not assign a component and thus should inherit the default component. The current bug in PAPI is that: * The default component is not assigned until you add an event. * However, attaching an eventset without events is perfectly valid, but we get an error. Possible solution is that the default component should be assigned at create time. 2011-04-12 * src/ctests/multiattach.c: Make sure the two processes compute different numbers of flops to test attach 2011-04-05 * src/power7_events.h: Turns out Maynard Johnson answered my questions about the native_name enum back in December. ( this is a correct version of the events file ) As I found out, the AIX substrates do not use the native_name enum. But a hypothetical perfctr build would. 2011-04-04 * src/Makefile.inc: Clear out the components_config.h file on make clobber * src/: aix.c, power7_events.h: Initial support for power7 aix, the events file is a copy of power6_events.h with the number of groups changed. The native_name enum is unchanged, but unused? 2011-04-01 * src/configure.in: Commited wrong configure.in * src/: configure, configure.in: Clean up setting bitmode flags for non-gcc (xlc in this case) compilers. * src/papi_events.csv: Change the Nehalem PAPI_FP_OPS event from FP_COMP_OPS_EXE:SSE_SINGLE_PRECISION+FP_COMP_OPS_EXE:DOUBLE_PRECISION to FP_COMP_OPS_EXE:SSE+FP_COMP_OPS_EXE:X87 The new event gives the same results as the previous one, with the added benefit of also counting 32-bit compiled x87 fp ops properly. More detailed analysis can be found here: http://web.eecs.utk.edu/~vweaver1/projects/nehalem-fp_ops/ 2011-03-28 * src/utils/multiplex_cost.c: Turns out that getopt_long isn't as standard as I had hoped. Convert multiplex_cost to use only getopt. 
-s disables software multiplexing -k disables kernel multiplexing 2011-03-25 * src/: configure, utils/Makefile, utils/multiplex_cost.c, configure.in: Multiplex_cost utility. * src/utils/: Makefile, cost.c, cost_utils.c, cost_utils.h: Split off the statistics functions from cost. 2011-03-22 * src/: run_tests_exclude_cuda.txt, run_tests.sh: Exclude some fork/thread tests from fulltest that won't run with CUDA (reason: cannot invoke same GPU from different threads) 2011-03-21 * src/utils/cost.c: Add a test for DERIVED_[ADD | SUB ] events to papi_cost. 2011-03-18 * src/components/cuda/linux-cuda.c: all_native_events ctest failed when CUDA Component is used. Reason: removing cuda events from the eventset is currently not supported. According to the NVIDIA folks this is a bug in cuda 4.0rc and will be fixed in rc2. Note also, several fork and thread tests fail since it's illegal to invoke the same GPU device from different processes / threads. We need a mechanism that allows us to run tests for the CPU component only. 2011-03-15 * src/utils/cost.c: Add a test case to cost util, look for a derived-postfix event and if found, give timing information for read calls to it. This is just a first run at the test, Core2 and AMD have candidate events and the test runs, but that is the extent of my testing so far. 2011-03-11 * src/components/: README, cuda/Makefile.cuda.in, cuda/Rules.cuda, cuda/configure, cuda/configure.in, cuda/linux-cuda.c, cuda/linux-cuda.h: Added CUDA component, a hardware performance counter measurement technology for the NVIDIA CUDA platform which provides access to the hardware counters inside the GPU. PAPI CUDA is based on CUPTI support - shipped with CUDA 4.0rc - in the NVIDIA driver library. In any environment where the CUPTI-enabled driver is installed, the PAPI CUDA component can provide detailed performance counter information regarding the execution of GPU kernels. 
* src/components/: coretemp/linux-coretemp.c, lustre/linux-lustre.c: Add some missing includes to components. Thanks to Will Cohen for reminding us warnings matter. :)

* src/: configure, configure.in, perf_events.c: The SYNC_READ workaround in perf_events.c was being handled at compile time, rather than at run time like all of our other workarounds. Change it to be like our other kernel-version related workarounds.

2011-03-09

* src/ctests/multiplex1_pthreads.c: Between 4.0.0 and 4.1.0 a pthread_exit() call was added to ctest/multiplex1_pthreads.c that caused the test to exit partway through without reporting a proper PASS/FAIL result. This changeset backs out that change, though the original change was marked as a memory leak fix, so a different fix may be needed. Reported by Steve Kaufmann

* src/linux-timer.c: Add missing header needed by the --with-virtualtimer=times build. Reported by Steve Kaufmann

2011-03-01

* src/: papi_pfm_events.c, perf_events.c: Fix broken Linux/PPC build caused by my pfm_events code movement changes.

2011-02-25

* src/: papi_pfm_events.c, papi_pfm_events.h, perfctr-x86.h: My changes yesterday broke the perfctr build. This should fix it.

2011-02-24

* src/ctests/inherit.c: Make the inherit test respect TESTS_QUIET so that it does not print extra output during a run_tests.sh run.

* src/ctests/overflow.c: Fix missing newline in the overflow output. Reported by Gary Mohr

* src/: papi_pfm_events.c, papi_pfm_events.h, perf_events.c: Move the libpfm3-specific functions from perf_events.c into papi_pfm_events.c

* src/perf_events.c: Separate the libpfm3-specific code from _papi_pe_init_substrate() and _papi_pe_update_control_state() into their own functions. This will allow eventual code sharing and also make the libpfm4 merge easier.

* src/perf_events.c: Some minor cleanups I found after reviewing the inherit merge: add missing "static inline" to the new kernel-version code, remove a duplicated test for Pentium 4, and fix a warning only seen if --with-debug is enabled.

* src/: papi.c, papi.h, papi_internal.h, perf_events.c, perf_events.h, ctests/Makefile, ctests/inherit.c, ctests/test_utils.c: Merging Gary Mohr's re-implementation of inherit into the code base. Thanks, Gary!

2011-02-23

* src/: any-null.h, freebsd.h, linux-bgp.h, linux-common.c, linux-common.h, linux-context.h, linux-ia64.c, linux-ia64.h, linux-lock.h, linux-memory.c, linux-ppc64.h, linux-timer.c, papi_internal.h, papi_pfm_events.c, perf_events.c, perf_events.h, perfctr-x86.h, perfctr.c, perfmon.h, solaris-niagara2.h, solaris-ultra.h, solaris.h, x86_cache_info.c: Move some more duplicated OS common code (in this case the locking code and the context-accessing code) out of the various substrate include files and into a common location.

2011-02-22

* src/perf_events.c: Separate out the kernel-version dependent checks and group them together near the beginning of the code. This not only allows us to easily see which routines are kernel-version dependent, but it makes it easier to disable the checks one by one when debugging kernel-version related issues like those found with the inherit patches.

2011-02-21

* src/papi_internal.c: Extend _papi_hwi_cleanup_eventset to free memory and better clean up after us.

2011-02-18

* src/papi.c: PAPI_assign_eventset_component changed; refuses to reassign components.

2011-02-17

* src/: papi_events.csv, libpfm-3.y/include/perfmon/pfmlib.h, libpfm-3.y/lib/amd64_events.h, libpfm-3.y/lib/amd64_events_fam10h.h, libpfm-3.y/lib/amd64_events_fam15h.h, libpfm-3.y/lib/pfmlib_amd64.c, libpfm-3.y/lib/pfmlib_amd64_priv.h: Add support for AMD Family 15h processors. Also adds support for Family 10h RevE. Patches provided by Robert Richter

* src/utils/native_avail.c: Modify papi_native_avail to properly handle event names with libpfm4-style "::" separators in them.
2011-02-15

* src/Makefile.inc: make install-doxyman will build/install the doxygen version of the manpages. Note that these pages are very rough right now; much work is needed to get them to be a drop-in replacement for the current man pages (mostly formatting/usage related issues, e.g. man PAPI_start will not work yet; the content is there).

* doc/Makefile: Add install target for doxygen-generated man pages.

2011-02-11

* src/: perfctr-x86.c, perfctr.c: perfctr-2.6.42 introduced PERFCTR_X86_INTEL_WSTMR. PAPI added support for PERFCTR_X86_INTEL_WSMR - notice the missing T. Fix PAPI to use the proper define. This should fix Westmere support on perfctr kernels.

2011-02-09

* src/: papi_protos.h, papi_vector.c, papi_vector.h, papi_vector_redefine.h: Added function pointer destroy_eventset to the PAPI vector table. Needed for the CUDA component to disable CUDA eventGroups, to destroy the floating CUDA context, and to free perfmon hardware on the GPU. (Note: the CUDA component cannot be released yet since we are still under NDA with NVIDIA. Stay tuned.)

2011-02-07

* src/x86_cache_info.c: The cpuid leaf2 code was printing a message to stderr if leaf4 was needed (only happens on Westmere currently). Change this to be a MEMDBG() debug message instead.

2011-02-03

* src/: papi_events.csv, perfctr-x86.c: perfctr-x86 was reporting "Core i7" instead of "Nehalem". i7 can mean Westmere or Sandy Bridge too, so change the code to properly report Nehalem.

2011-01-27

* src/ctests/all_native_events.c: Fix this ctest. It failed when the package was built with several components, because the eventset was reused and failed to add events that were not from the first component. In order to fix it, recreate & destroy the eventset when the current event does not belong to the previous component.

2011-01-26

* src/: configure, configure.in, linux-timer.c, perfmon.c: Fix Cray CLE build.
* src/: configure, configure.in: Putting -Wall in cflags now requires CC = gcc.

* src/: aix.c, freebsd.c, linux-bgp.c, linux-common.c, linux-memory.c, linux-memory.h, papi.c, papi_protos.h, papi_vector.c, papi_vector.h, solaris-niagara2.c, solaris-ultra.c, windows-common.c, windows-memory.c: Change the parameters passed to update_shlib_info() to match better with those passed to get_system_info(). This only affects the substrates; outside users of PAPI will not notice this change.

2011-01-25

* src/: configure, configure.in: Make sure that aix gets -g.

* src/: configure, configure.in: Give everyone else -g when configuring with debug. To wit, we passed gcc -g3 but neglected platforms where CC != gcc.

* src/aix.c: First run at supporting POWER7. NOTE: this code is only good for getting event listings, e.g. papi_native_avail; passing PM_GET_GROUPS causes our code to segfault later on, a buffer overflow I'm still tracking down.

* src/perfctr-x86.c: Accidentally converted a function to _perfctr_ that should have stayed _linux_.

* src/: perfctr-x86.c, perfctr.c: Rename the various perfctr functions to be _perfctr_ rather than _linux_. This way _linux_ is reserved for the common functions used by all.

* src/: linux-common.c, linux-memory.c, linux-timer.c, perf_events.c, windows-common.c, windows-memory.c, windows-timer.c: Split the WIN32-specific code out from the new linux common code. In most cases very little code was shared (it tended to be a big #ifdef block), and it is confusing to have windows-specific code in files named linux-*.

2011-01-24

* src/linux-timer.c: Fix a compile error that only shows up on PPC.

* src/linux-timer.c: Fix compile warning if mmtimer is enabled.

* src/perfctr-x86.c: Missing comma in the perfctr code.
* src/: Makefile.inc, aix.c, configure, configure.in, hwinfo_linux.c, linux-bgp.c, linux-common.c, linux-common.h, linux-ia64.c, linux-timer.c, linux-timer.h, papi_vector.h, perf_events.c, perfctr-x86.c, perfctr.c, perfmon.c, solaris-niagara2.c, solaris-ultra.c: One last batch of consolidation changes. This one moves get_system_info and get_cpu_info into linux-common.c, plus moves some other routines from perf_events.c there that are shared by the future libpfm4 version. Some non-linux substrates are touched here; these are just short fixes to make sure the get_system_info() function pointed to by the papi_vector has the same format on all substrates.

* src/: Makefile.inc, configure, configure.in, linux-memory.c, linux-memory.h, perf_events.c, perfctr-x86.c, perfctr.c, perfmon.c: Move the various Linux update_shlib_info() functions into a common place.

* src/: Makefile.inc, linux-timer.c, linux-timer.h, perf_events.c, perfctr-x86.c, perfctr.c, perfmon.c: Move the various timer-related functions to linux-timer.c. This gets rid of the duplicated code spread throughout the substrates.

2011-01-21

* delete_before_release.sh, release_procedure.txt: Updated the release docs with what I learned when making the 4.1.2.1 release.

* src/: configure, configure.in, freebsd-memory.c, linux-ia64-memory.c, linux-memory.c, linux-memory.h, linux-mx-memory.c, linux-ppc64-memory.c, perf_events.c, perfctr-x86.c, perfmon-memory.c, perfmon.c: Currently there are at least 3 identical copies of the linux memory detection code spread throughout the PAPI source code. This change puts them all in linux-memory.c, and then has all the individual substrates use the common code.

ChangeLogP414.txt

2011-08-29

* src/configure: Rebuild from configure.in with version number bump to 4.1.4 in advance of pending internal vendor release for Cray.
2011-08-26

* release_procedure.txt: Update the release procedure to mention building the man pages before a release.

* man/: man1/avail.c.1, man1/clockres.c.1, man1/command_flags_t.1, man1/command_line.c.1, man1/component.c.1, man1/cost.c.1, man1/decode.c.1, man1/error_codes.c.1, man1/event_chooser.c.1, man1/mem_info.c.1, man1/native_avail.c.1, man1/options_t.1, man1/papi_avail.1, man1/papi_clockres.1, man1/papi_command_line.1, man1/papi_component_avail.1, man1/papi_cost.1, man1/papi_decode.1, man1/papi_error_codes.1, man1/papi_event_chooser.1, man1/papi_mem_info.1, man1/papi_multiplex_cost.1, man1/papi_native_avail.1, man3/CDI.3, man3/HighLevelInfo.3, man3/PAPIF.3, man3/PAPIF_accum.3, man3/PAPIF_add_event.3, man3/PAPIF_add_events.3, man3/PAPIF_assign_eventset_component.3, man3/PAPIF_cleanup_eventset.3, man3/PAPIF_create_eventset.3, man3/PAPIF_destroy_eventset.3, man3/PAPIF_get_dmem_info.3, man3/PAPIF_get_exe_info.3, man3/PAPIF_get_hardware_info.3, man3/PAPIF_num_hwctrs.3, man3/PAPI_accum.3, man3/PAPI_accum_counters.3, man3/PAPI_add_event.3, man3/PAPI_add_events.3, man3/PAPI_addr_range_option_t.3, man3/PAPI_address_map_t.3, man3/PAPI_all_thr_spec_t.3, man3/PAPI_assign_eventset_component.3, man3/PAPI_attach.3, man3/PAPI_attach_option_t.3, man3/PAPI_cleanup_eventset.3, man3/PAPI_component_info_t.3, man3/PAPI_cpu_option_t.3, man3/PAPI_create_eventset.3, man3/PAPI_debug_option_t.3, man3/PAPI_descr_error.3, man3/PAPI_destroy_eventset.3, man3/PAPI_detach.3, man3/PAPI_dmem_info_t.3, man3/PAPI_domain_option_t.3, man3/PAPI_enum_event.3, man3/PAPI_event_code_to_name.3, man3/PAPI_event_info_t.3, man3/PAPI_event_name_to_code.3, man3/PAPI_exe_info_t.3, man3/PAPI_flips.3, man3/PAPI_flops.3, man3/PAPI_get_cmp_opt.3, man3/PAPI_get_component_info.3, man3/PAPI_get_dmem_info.3, man3/PAPI_get_event_info.3, man3/PAPI_get_executable_info.3, man3/PAPI_get_hardware_info.3, man3/PAPI_get_multiplex.3, man3/PAPI_get_opt.3, man3/PAPI_get_overflow_event_index.3, man3/PAPI_get_real_cyc.3, man3/PAPI_get_real_nsec.3, man3/PAPI_get_real_usec.3, man3/PAPI_get_shared_lib_info.3, man3/PAPI_get_thr_specific.3, man3/PAPI_get_virt_cyc.3, man3/PAPI_get_virt_nsec.3, man3/PAPI_get_virt_usec.3, man3/PAPI_granularity_option_t.3, man3/PAPI_hw_info_t.3, man3/PAPI_inherit_option_t.3, man3/PAPI_ipc.3, man3/PAPI_is_initialized.3, man3/PAPI_itimer_option_t.3, man3/PAPI_library_init.3, man3/PAPI_list_events.3, man3/PAPI_list_threads.3, man3/PAPI_lock.3, man3/PAPI_mh_cache_info_t.3, man3/PAPI_mh_info_t.3, man3/PAPI_mh_level_t.3, man3/PAPI_mh_tlb_info_t.3, man3/PAPI_mpx_info_t.3, man3/PAPI_multiplex_init.3, man3/PAPI_multiplex_option_t.3, man3/PAPI_num_cmp_hwctrs.3, man3/PAPI_num_components.3, man3/PAPI_num_counters.3, man3/PAPI_num_events.3, man3/PAPI_num_hwctrs.3, man3/PAPI_option_t.3, man3/PAPI_overflow.3, man3/PAPI_perror.3, man3/PAPI_preload_info_t.3, man3/PAPI_profil.3, man3/PAPI_query_event.3, man3/PAPI_read.3, man3/PAPI_read_counters.3, man3/PAPI_read_ts.3, man3/PAPI_register_thread.3, man3/PAPI_remove_event.3, man3/PAPI_remove_events.3, man3/PAPI_reset.3, man3/PAPI_set_cmp_domain.3, man3/PAPI_set_cmp_granularity.3, man3/PAPI_set_debug.3, man3/PAPI_set_domain.3, man3/PAPI_set_granularity.3, man3/PAPI_set_multiplex.3, man3/PAPI_set_opt.3, man3/PAPI_set_thr_specific.3, man3/PAPI_shlib_info_t.3, man3/PAPI_shutdown.3, man3/PAPI_sprofil.3, man3/PAPI_sprofil_t.3, man3/PAPI_start.3, man3/PAPI_start_counters.3, man3/PAPI_state.3, man3/PAPI_stop.3, man3/PAPI_stop_counters.3, man3/PAPI_strerror.3, man3/PAPI_thread_id.3, man3/PAPI_thread_init.3, man3/PAPI_unlock.3, man3/PAPI_unregister_thread.3, man3/PAPI_write.3, man3/high_api.3, man3/low_api.3, man3/papi_data_structures.3, man3/papi_vector_t.3, man3/ret_codes.3: Switch over to doxygen-generated man pages.
* man/: man1/papi_avail.1, man1/papi_clockres.1, man1/papi_command_line.1, man1/papi_cost.1, man1/papi_decode.1, man1/papi_event_chooser.1, man1/papi_mem_info.1, man1/papi_native_avail.1, man3/PAPI.3, man3/PAPIF.3, man3/PAPIF_get_clockrate.3, man3/PAPIF_get_domain.3, man3/PAPIF_get_exe_info.3, man3/PAPIF_get_granularity.3, man3/PAPIF_get_preload.3, man3/PAPIF_set_event_domain.3, man3/PAPI_accum.3, man3/PAPI_accum_counters.3, man3/PAPI_add_event.3, man3/PAPI_add_events.3, man3/PAPI_assign_eventset_component.3, man3/PAPI_attach.3, man3/PAPI_cleanup_eventset.3, man3/PAPI_create_eventset.3, man3/PAPI_destroy_eventset.3, man3/PAPI_detach.3, man3/PAPI_encode_events.3, man3/PAPI_enum_event.3, man3/PAPI_event_code_to_name.3, man3/PAPI_event_name_to_code.3, man3/PAPI_flips.3, man3/PAPI_flops.3, man3/PAPI_get_cmp_opt.3, man3/PAPI_get_component_info.3, man3/PAPI_get_dmem_info.3, man3/PAPI_get_event_info.3, man3/PAPI_get_executable_info.3, man3/PAPI_get_hardware_info.3, man3/PAPI_get_multiplex.3, man3/PAPI_get_opt.3, man3/PAPI_get_overflow_event_index.3, man3/PAPI_get_real_cyc.3, man3/PAPI_get_real_usec.3, man3/PAPI_get_shared_lib_info.3, man3/PAPI_get_substrate_info.3, man3/PAPI_get_thr_specific.3, man3/PAPI_get_virt_cyc.3, man3/PAPI_get_virt_usec.3, man3/PAPI_help.3, man3/PAPI_ipc.3, man3/PAPI_is_initialized.3, man3/PAPI_library_init.3, man3/PAPI_list_events.3, man3/PAPI_list_threads.3, man3/PAPI_lock.3, man3/PAPI_multiplex_init.3, man3/PAPI_native.3, man3/PAPI_num_cmp_hwctrs.3, man3/PAPI_num_components.3, man3/PAPI_num_counters.3, man3/PAPI_num_events.3, man3/PAPI_num_hwctrs.3, man3/PAPI_overflow.3, man3/PAPI_perror.3, man3/PAPI_presets.3, man3/PAPI_profil.3, man3/PAPI_query_event.3, man3/PAPI_read.3, man3/PAPI_read_counters.3, man3/PAPI_register_thread.3, man3/PAPI_remove_event.3, man3/PAPI_remove_events.3, man3/PAPI_reset.3, man3/PAPI_set_cmp_domain.3, man3/PAPI_set_cmp_granularity.3, man3/PAPI_set_debug.3, man3/PAPI_set_domain.3, man3/PAPI_set_event_info.3, man3/PAPI_set_granularity.3, man3/PAPI_set_multiplex.3, man3/PAPI_set_opt.3, man3/PAPI_set_thr_specific.3, man3/PAPI_shutdown.3, man3/PAPI_sprofil.3, man3/PAPI_start.3, man3/PAPI_start_counters.3, man3/PAPI_state.3, man3/PAPI_stop.3, man3/PAPI_stop_counters.3, man3/PAPI_strerror.3, man3/PAPI_thread_id.3, man3/PAPI_thread_init.3, man3/PAPI_unlock.3, man3/PAPI_unregister_thread.3, man3/PAPI_write.3: Remove the old manpages in preparation for defaulting to doxygen-generated ones.

2011-08-25

* src/: perf_events.c, ctests/overflow_allcounters.c, ctests/papi_test.h, ctests/test_utils.c: Block all PERF_COUNT_SW events from the overflow_allcounters test, as overflow on a software counter can crash perf_event kernels pre-3.1.

* src/libpfm4/: Makefile, config.mk, lib/Makefile, lib/pfmlib_common.c, lib/pfmlib_perf_event.c, lib/pfmlib_priv.h, perf_examples/perf_util.c, perf_examples/task_smpl.c: Fix the "conflicts" from the import.

* papi.spec, doc/Doxyfile, doc/Doxyfile-everything, src/Makefile.in, src/configure.in, src/papi.h: Bump version number to 4.1.4 in advance of pending internal vendor release for Cray.

2011-08-23

* src/: papi.c, papi_hl.c: Removed all references to Fortran APIs. These are now all in papi_fwrappers.c. Also normalized syntax for many doxygen headers.

* src/papi_fwrappers.c: Added doxygen skeleton for all remaining Fortran functions in this file. Also added wrappers for four additional APIs: PAPI_get_real_nsec, PAPI_read_ts, PAPI_lock, PAPI_unlock.

2011-08-19

* src/: papi.c, papi_fwrappers.c: Stubbed out doxygen pages for Fortran functions. About half way done!

* src/papi_libpfm4_events.c: Finish up the documentation/cleanup pass through the libpfm4 code.
2011-08-18

* src/papi_libpfm3_events.c: Fix code so we no longer get warnings that 'setup_preset_term' and '_pfm_get_counter_info' are defined but not used.

* src/: papi_libpfm3_events.c, papi_libpfm4_events.c, papi_libpfm_events.h, perf_events.c, perfctr-x86.c: Consolidate use of _papi_libpfm_init() and pass in MY_VECTOR when necessary.

* src/papi_libpfm4_events.c: Dynamically allocate the libpfm4 native events, rather than having a fixed array allocated at init time.

* src/papi_libpfm4_events.c: Some more minor cleanups and documentation in the libpfm4 code.

* src/components/coretemp/linux-coretemp.c: Fixup for the linux coretemp component; it pays to check cvs status once in a while...

2011-08-16

* src/papi.c: Update the PAPI_enum_event() Doxygen comments to reflect modern values for the "modifier" parameter.

* src/papi_libpfm4_events.c: Clean up code and add documentation for all the functions involved in libpfm4's _papi_libpfm_ntv_enum_events() function.

2011-08-15

* src/mb.h: Update the rmb() barrier for ARM.

* src/papi_events.csv: Update SandyBridge EP support to match that of mainline libpfm4.

* src/papi_libpfm4_events.c: Clean up libpfm4 code, and add more comments to the code.

* src/perf_events.c: Fix bug where umask support was disabled.

* src/Rules.perfctr-pfm: Make the perfctr code use the merged preset event code.

* src/: Rules.pfm_pe, papi_libpfm3_events.c, papi_libpfm_presets.c: Have libpfm3 use the merged preset code.

* src/: Rules.pfm4_pe, papi_libpfm4_events.c, papi_libpfm_presets.c: Move the libpfm presets code to its own file, and modify the libpfm4 code to use it.

* src/papi_libpfm3_events.c: Make the libpfm3 predefined events parser identical to the libpfm4 one, in preparation for a merge.

* src/: papi_libpfm3_events.c, papi_libpfm4_events.c, papi_libpfm_events.h, perf_events.c: Move vendor fixups into the substrate and out of the naming library code.
* src/: Rules.perfctr-pfm, Rules.pfm4_pe, Rules.pfm_pe, papi_libpfm3_events.c, papi_libpfm4_events.c, papi_libpfm_events.h, papi_pfm4_events.c, papi_pfm_events.c, papi_pfm_events.h, perf_events.c, perfctr-x86.c, perfmon.c: Rename papi_pfm_events.c to papi_libpfm3_events.c to make it more clear what is in the file. Also rename papi_pfm4_events.c to papi_libpfm4_events.c, and papi_pfm_events.h to papi_libpfm_events.h.

* src/perfmon.c: Fix up the perfmon2 case for the libpfm renaming.

* src/perfctr-x86.c: Fix perfctr breakage from the libpfm rename.

* src/: papi_pfm4_events.c, papi_pfm_events.c, papi_pfm_events.h, perf_events.c, perfctr-x86.c, perfmon-ia64.c, perfmon.c: The PAPI code uses _pfm_ in function names to mean *both* perfmon2 code and libpfm3/4 code. This can cause a lot of confusion. Rename libpfm-specific function names to use _libpfm_ instead.

* src/: papi_pfm_events.c, papi_pfm_events.h, perf_events.c: Fix build error on perfmon2 due to movement of _papi_pfm_shutdown().

2011-08-05

* src/: Makefile.in, Makefile.inc, configure, configure.in, components/Makefile_comp_tests, components/cuda/tests/HelloWorld.cu, components/cuda/tests/Makefile, components/example/tests/HelloWorld.c, components/example/tests/Makefile, components/README: Added a generic implementation that makes it possible to add tests to components without modifying any PAPI-specific code (other than adding the tests and a makefile to the component directory). All component tests will be compiled together with PAPI when typing 'make' (as well as cleaned up when 'make clean' or 'make clobber' is typed). +++ Also added tests to 2 components, the example and cuda components.

* src/: papi_defines.h, papi_internal.h, papi_pfm4_events.c, perf_events.c: Add locking to papi_pfm4_events so that adding/looking up event names doesn't have a race condition when multiple threads are doing it at once. Also fix the recently-added pfm_shutdown() to be called at substrate_shutdown() rather than plain shutdown(), as the latter is called at thread_shutdown() time too.

* src/: papi_pfm4_events.c, papi_pfm_events.c, papi_pfm_events.h, perf_events.c: Add a _papi_pfm_shutdown() function and have it clear out the native events array at PAPI_shutdown(). This makes sample code that exhibits the libpfm4 event race much easier to write.

* src/ctests/multiplex2.c: Added some PAPI_set_domain's inside of #if 0's for testing.

2011-08-03

* src/papi_pfm4_events.c: Use the new ARM vendor code to force the proper default domain on ARM cpus.

* src/: linux-common.c, papi.h: Add an ARM vendor string and have it properly set. The hardware detection logic is a horrible mess of parsing /proc/cpuinfo; I took the easy way out and just tacked the ARM logic on the end rather than trying to clean it up at all.

* src/perf_events.c: Clean up some comments, add a few debug messages.

2011-08-02

* src/linux-memory.c: The ARM warning for the memory hierarchy not being implemented was in the wrong place.

* src/: papi_pfm4_events.c, sys_perf_event_open.c: Fix some misleading debug messages.

* src/papi_events.csv: Update ARM Cortex A9 preset events, and add ARM Cortex A8 events.

2011-07-28

* src/: cycle.h, linux-context.h, linux-lock.h, linux-memory.c, linux-timer.c, mb.h: Add the remaining changes needed for ARM compilation. This is enough for "papi_avail" and "papi_native_avail" to work. Lots of #warning statements scattered around. ARM is a complicated architecture, and things like memory barriers and mutexes are very dependent on which version of the architecture they are running on. It will take a while to figure out the proper way to handle this in PAPI. Also, on Cortex-A8 and Cortex-A9 there is no way to separate kernel events from the user ones, so all measurements contain both. This will probably confuse our ctests.

* src/papi_events.csv: Add ARM Cortex A9 preset events to the CSV file.
* src/sys_perf_event_open.c: Add the perf_event syscall number for ARM.

* src/papi_fwrappers.c: Create a PAPIF group in doxygen for the papi fortran interface.

2011-07-27

* src/x86_cache_info.c: My changes yesterday broke the --with-debug case, as noticed by buildbot.

2011-07-26

* src/: papi.c, papi_fwrappers.c: Implement doxygen comments for PAPI_get_opt; implement doxygen comments for PAPIF_accum in papi_fwrappers.c. This is a first step in providing separate, independent Fortran documentation.

* doc/Doxyfile: Have doxygen parse papi_fwrappers.c for comments.

* src/papi_pfm4_events.c: The last checkin broke papi_native_avail on libpfm4. Fix it.

* src/papi_pfm4_events.c: Clean up some code in papi_pfm4_events.c to avoid gcc-4.6 warnings.

* src/x86_cache_info.c: Fix some warnings in src/x86_cache_info.c reported by gcc-4.6.

2011-07-21

* src/ctests/all_native_events.c: Change the all_native_events test to create an eventset for each native event it finds. This also becomes a good test of the number of outstanding eventsets allowed.

2011-07-19

* src/papi.c: Doxygen rewrite for PAPI_set_opt.

2011-07-13

* src/: papi_events.csv, libpfm4/lib/events/intel_snb_events.h: A few more commits that get SandyBridge mostly working.

* src/papi.h: Include a comment in the prototype for PAPI_read_ts. This is apparently a requirement to get doxygen to link from the prototype to the doc block for the function (a link shows up in the low_api group now).

2011-07-12

* src/libpfm4/lib/events/intel_snb_events.h: Temporarily add missing SandyBridge FP events until support gets merged upstream.

* src/papi.c: Some minor Doxygen fixes. This was my run through the HTML output produced by my assigned functions.

2011-07-11

* src/libpfm4/lib/pfmlib_intel_snb.c: Temporarily add model 45 Sandy Bridge to our copy of libpfm4 until we can get this merged upstream.
* src/ctests/: multiattach.c, multiattach2.c, reset.c, val_omp.c, zero_attach.c, zero_fork.c, zero_omp.c, zero_pthreads.c, zero_smp.c: Fix all the remaining users of the ctests add_two_events() helper.

* src/ctests/first.c: Fix first test bug due to the add_two_events() change. Clean up validation of results.

* src/ctests/zero.c: Some cleanups I made to the testing routine add_two_events() a while ago broke the zero test (the cycles result was swapped with the other counter result). This fixes that, plus adds a validation check to try to avoid this happening in the future.

* src/: configure, configure.in: Patch from William Cohen that sets LD_LIBRARY_PATH and LIBPATH to include libpfm4/lib. A better fix would probably be to include only the libpfm library we are currently configured for. I need to do more testing of the --with-static-lib=no --with-shared-lib=yes --with-shlib options.

* src/papi_hl.c: High level interface Doxygen comments updated to include an interface overview.

2011-07-08

* doc/Doxyfile, src/papi.h, src/papi_hl.c, src/papi_vector.h: Add in the PAPI component development page. Currently not linked to by anything yet, but can be found at file://$(html_dir)/CDI or http://web.eecs.utk.edu/~ralph/html/CDI for an already built page.

2011-07-07

* src/: papi.c, papi.h: Add doxygen comments for PAPI_get_executable_info(), PAPI_exe_info_t, and PAPI_address_map_t.

* src/papi.c: Add doxygen comments for PAPI_event_code_to_name() and PAPI_event_name_to_code().

* src/papi.c: Add doxygen comments for PAPI_enum_event().

* src/papi.c: Add doxygen comments for PAPI_create_eventset().

* src/papi.c: Add doxygen comments for PAPI_cleanup_eventset() and PAPI_destroy_eventset().

* src/papi.c: Add doxygen comments for PAPI_attach() and PAPI_detach().

* src/papi.c: Add doxygen comments for PAPI_assign_eventset_component().

2011-07-05

* src/components/cuda/linux-cuda.c: Missing parentheses added in CUDA_Shutdown(), which caused a seg fault.
2011-07-01

* src/papi.c: Add doxygen comments for PAPI_add_event().

* src/papi.c: Add doxygen comments for PAPI_add_events(). +++ Updated PAPI_accum().

* src/papi.c: Add doxygen comments for PAPI_accum().

* src/ctests/: data_range.c, earprofile.c: Some more ia64 ctests fixes.

* src/papi.c: Add doxygen comments for PAPI_register_thread().

* src/papi.c: Add doxygen comments for PAPI_read() and PAPI_read_ts().

* src/ctests/earprofile.c: Another attempt at fixing earprofile on ia64.

* src/ctests/earprofile.c: PAPI for ia64 compiles now, and now it's some of the ia64-specific ctests that are broken. There was a missing #include "papi.h" in earprofile.

2011-06-30

* src/papi.c: Doxygen for PAPI_set_multiplex, PAPI_shutdown, PAPI_sprofil_t, PAPI_start(int EventSet), PAPI_state(int EventSet, int *status), PAPI_stop(int EventSet, long long *values), and PAPI_strerror(int).

* src/: linux-timer.c, perfmon-ia64-pfm.h, perfmon-ia64.c: More ia64 fixes.

* src/papi.c: Doxygen comments for PAPI_query_event().

* src/: linux-timer.c, linux-timer.h, papi_vector.c, papi_vector.h: Some more ia64 fixes.

* src/papi.c: Add doxygen comments for PAPI_profil().

* src/: linux-timer.c, linux-timer.h, perfmon-ia64.c: More ia64 fixes. Getting closer.

* src/: linux-context.h, perfmon-ia64.c, perfmon-ia64.h: One more try at fixing ia64. The trick to cross compiling is ./configure --with-CPU=itanium2 --with-arch=ia64 --with-perfmon=2.0 --with-tls=no followed by make __ia64__=1, and you still have to fiddle with some __ia64__ ifdefs scattered in the code.

2011-06-29

* src/papi.c: Add doxygen comments for PAPI_num_events(), PAPI_overflow(), and PAPI_perror().

* src/papi.c: Doxygen for PAPI_set_domain and PAPI_set_granularity. Unfortunately, this seems to have raised more issues about Fortran support...

* src/papi.c: Add doxygen comments for PAPI_list_threads(), PAPI_lock(), PAPI_multiplex_init(), PAPI_num_hwctrs(), and PAPI_num_cmp_hwctrs().

* src/papi.c: Doxygen for PAPI_set_debug and minor tweaks to other function documentation.
2011-06-28

* src/: linux-common.h, linux-timer.c, papi_pfm_events.c, perfmon-ia64-pfm.h: some more itanium fixes. This won't be enough to fix things but it is a start.

* src/papi.c: Check in Kiran's doxygen work. This time hopefully not clobbering anyone.

* src/: linux-context.h, linux-timer.c, perfmon-ia64.h: Attempt to fix the build for itanium systems.

* src/papi.c: Fix comments embedded in doxygen source to be C++ single line format.

2011-06-27

* src/papi.c: Commit documentation changes for PAPI_reset, PAPI_set_thr_specific, and PAPI_get_thr_specific. The last one wasn't on my list, but it mirrored _set_ so I did it anyway.

* src/papi.c: [no log message]

* src/papi.c: Commit Kiran's updates to the code documentation.

2011-06-24

* doc/Doxyfile: One got left behind... (see previous commit about redoing doxygen procedures)

* src/Makefile.inc, src/configure, src/configure.in, doc/Doxyfile.html, doc/Doxyfile.utils, doc/Doxyfile.utils-everything, doc/Makefile, doc/doxygen_procedure.txt: Update install process for man-pages, install from pre-built pages living in $(PAPI_DIR)/man and update $(PAPI_DIR)/doc to generate doxygen pages and copy them to $(PAPI_DIR)/man. This removes doxygen from the install process. It also removes the web of doxygen configuration files, going back to just two, lite and kitchen-sink.

* src/papi.c: Updates to doxygen stuff for PAPI_remove_event{s}

* src/: linux-bgp.c, perfmon-ia64.c, perfmon.c, solaris-niagara2.c, solaris-ultra.c: When I made the multiattach change I forgot to update _papi_hwi_lookup_thread calls on all architectures. This should get the ones I missed.

2011-06-23

* src/papi_pfm4_events.c: For libpfm4 we were setting available counters to the number of generic counters. This was less than libpfm3, so update the code to set the number of counters to be equal to generic+fixed. In theory whether an event can be added is determined at add time, so the extra check for number of counters is unnecessarily getting in the way.
This should be fixed but might require a re-write of some PAPI internals.

2011-06-22

* src/ctests/test_utils.c: One more fix to the byte_profile code

* src/ctests/byte_profile.c: Fix byte_profile ctest, as it was breaking on libpfm4.

* src/: extras.c, papi.c, perf_events.c, threads.c, threads.h, ctests/multiattach.c, ctests/multiattach2.c: Add support for handling multiattach properly. This adds a pid argument to the _papi_hwi_lookup_or_create_thread() call. A pid of "0" falls back to the old behavior of using the current tid/pid. If attaching to an outside pid/tid, a new thread object is created to handle this. This seems like the right thing to do, though there's enough complicated code in the threads code that I haven't fully audited that this can't fail somehow in complicated cases where lots of attaching/detaching is done in conjunction with having a large multi-threaded program.

2011-06-13

* src/papi_pfm4_events.c: Fix the libpfm4 enumerate code. It was possible for papi_native_avail to get stuck in an infinite loop if two events had the same name on different PMUs and the "default" PMU happened later in the enumeration. This was the case on SandyBridge at least. This should be fixed now.

* src/ctests/test_utils.c: Make "test_fail()" actually fail. In the comments we say we don't exit to avoid leaking memory in threads. That seems suspect. The threads should exit properly too. If they don't, then we should fix the threading code and not make our tests never exit on fail (which can make debugging a pain).

2011-06-10

* src/: papi.c, papi_hl.c: Add example code to the high level interface docs

* src/papi_events.csv: Add initial Sandy Bridge event support. This is in no way tested, so be cautious if using. Sandy Bridge support is libpfm4 only, so you'll have to configure with --with-libpfm4

* src/papi_hl.c: Added an example of how to embed example code in PAPI_stop_counters documentation.
2011-06-09

* src/Makefile.inc: Makefile fix for fortran wrapper files on case-insensitive filesystems. During build, it renames the preprocessed file PAPI_FWRAPPERS.c to upper_PAPI_FWRAPPERS.c

2011-06-08

* src/: configure, Makefile.inc, configure.in: Have configure check that doxygen is installed, and have make install only attempt to build the doxygen docs if we found doxygen.

2011-06-07

* src/: run_tests_exclude_cuda.txt, components/cuda/linux-cuda.c: ctests/thrspecific works now too with the CUDA component

* src/components/cuda/linux-cuda.c: clean up and indent

* src/components/cuda/: linux-cuda.c, linux-cuda.h: Added CudaRemoveEvent functionality (was broken in earlier CUDA RC versions). ctests/all_native_events works now (at least for the default CUDA device). +++ Minor exit/return mods in CUDA component

* doc/Doxyfile, doc/Doxyfile.html, doc/Doxyfile.utils, doc/Doxyfile.utils-everything, doc/Makefile, src/Makefile.inc, src/papi.c, src/papi.h, src/papi_hl.c: Rework doxygen to better generate manpages from code comments.

2011-06-03

* release_procedure.txt: Incorporate a note about using 2.59 autoconf to build configure.

2011-06-02

* src/utils/error_codes.c: Tweak the doxygen title text.

2011-06-01

* src/: configure, configure.in: Modified configure.in to look for a 2.59 autoconf prerequisite. Rebuilt configure with 2.59. We'll try this out on buildbot.

2011-05-31

* src/: run_tests_exclude_cuda.txt, components/cuda/linux-cuda.c, components/cuda/linux-cuda.h: 2 things: (1) Bug in CUDA v4.0 fixed. It caused a threaded application to hang when the parent called cuInit() before fork() and the child also called cuInit(). All fork ctests pass now if papi is configured with the cuda component. (2) If running a threaded application, we need to make sure that a thread doesn't free the same memory location(s) more than once. Now all pthread ctests pass, too (again, if papi is configured with the cuda component).
2011-05-27

* src/perf_events.c: It turns out our FORMAT_ID workaround detection code was identical to FORMAT_GROUP (and not really necessary) so merge the two.

2011-05-26

* src/papi_pfm_events.h: One last try at the cray compile fix, this time using a suggestion from Steve Kaufmann.

* src/perf_events.c: Update some comments on the workarounds. I've been writing some validation tests for our various workarounds. It turns out the "no multiplexing before 2.6.33" problem is actually an artifact of the check_schedulability bug on x86 (and its interaction with our event partitioning code) rather than a distinct kernel bug.

* src/Rules.pfm4_pe: Now fix libpfm4. I think they should all be fixed now. Too many permutations.

* src/: Rules.pfm_pe, papi_pfm_events.h: One last try at fixing the perfmon2 build.

* src/papi_pfm_events.h: Fix the perfmon2 build that broke with the libpfm4 merge. The previous fix only fixed perfctr, not perfmon2. This should fix the build for cray machines.

2011-05-24

* src/utils/component.c: Add doxygen comments to components.c

* src/papi_events.csv: Fix the PAPI_TOT_INS instruction for Atom, as well as update the floating point events.

* src/perf_events.c: We were using some of the perf_event functionality in an unsupported way and this broke recently when the perf_event interface was made more strict. You can't use the PERF_EVENT_IOC_REFRESH ioctl on a group leader to start all sampling siblings; use PERF_EVENT_IOC_ENABLE instead. Don't pass NULL or 0 as the argument to the PERF_EVENT_IOC_REFRESH ioctl. These fixes seem to work and fix the Nehalem regressions. The above changes were made to PAPI back in November to fix the "I/O possible" error, so we should check to be sure that this doesn't reintroduce the problem. We should also probably back-port this fix to 4.1.2 and 4.2 stable

2011-05-23

* src/: configure, configure.in, papi.c, papi.h, papi_data.h, utils/Makefile, utils/error_codes.c: New utility to display PAPI error codes and description strings.
There was no API to access error descriptions, so I created PAPI_descr_error( int error_code ) too. I also updated the error table to provide strings for all defined codes.

* src/aix.c: Define aix's .cmp_info.itimer_ns value to a default. The multiplexing tests are happy on power7 aix now.

* src/: sys_perf_event_open.c, ctests/overflow.c: cleanup some debug messages

* src/ctests/: overflow.c, test_utils.c: The overflow test depends on the exact ordering of the flags in the add_test_event() code. So my previous changes broke the test. This commit fixes the test case again.

* src/ctests/: byte_profile.c, prof_utils.c, prof_utils.h, profile.c, profile_twoevents.c, sprofile.c: ctests: remove the "hw_info" field from the profile setup functions, as the field isn't used.

* src/: configure, configure.in, utils/Makefile, utils/component.c: Introduce a component avail utility that lists the components we were built with, optionally with native/preset counts and version number.

* src/components/example/example.c: Add number of 'native' events to the component info structure in example component.

* src/ctests/: byte_profile.c, papi_test.h, prof_utils.c, prof_utils.h, profile.c, profile_twoevents.c, sprofile.c, test_utils.c, zero_smp.c: Clean up the ctest profile event section code some more. This fixes a build error on AIX that I introduced on Friday.

* src/papi_events.csv: Initial PAPI Fam14h Bobcat support. Only works with libpfm4 version of PAPI. Passes most of the tests, but still need to verify as there are a number of subtle differences in the native events.

2011-05-20

* src/ctests/: byte_profile.c, mendes-alt.c, papi_test.h, prof_utils.c, test_utils.c: Fix byte_profile to work on Nehalem. Still needs some more work to print the result properly.
* src/ctests/: attach2.c, attach3.c, branches.c, byte_profile.c, case1.c, case2.c, first.c, multiattach.c, multiattach2.c, overflow.c, overflow3_pthreads.c, overflow_index.c, overflow_one_and_read.c, overflow_pthreads.c, papi_test.h, prof_utils.c, profile_pthreads.c, reset.c, sdsc.c, sprofile.c, tenth.c, test_utils.c, zero.c, zero_attach.c, zero_fork.c, zero_pthreads.c: Some cleanups to the ctests/test_utils.c code + Remove the hw_info field from the add_two_events() and add_two_nonderived_events() functions, as it wasn't used. + Make the add_test_events() function loop through all the masks, instead of having a hardcoded test for each possible mask

* src/ctests/test_utils.c: buildbot didn't like the colored test messages (despite the code having fancy checks for "isatty()"). So change the color thing to require an environment variable to be set, TESTS_COLOR=y

2011-05-19

* src/ctests/test_utils.c: Add color to the testsuite results if we are running at a console. This makes it much easier to see FAILED results. I can back this out if people don't like it, but it's made my life a lot easier when running all the tests involved with the libpfm4 merge.

* src/: papi_pfm_events.c, papi_pfm_events.h: Fix the perfctr build that was broken by the libpfm4 changes.

* src/configure.in: Documentation for the AIX heap fix.

* src/: papi_pfm4_events.c, ctests/test_utils.c: power6 doesn't work with libpfm4, as it reports num_cntrs=0; have PAPI print a better error in this case until we get a fix upstream.

* src/: configure, configure.in: On aix one has to ask really nicely for a usable amount of heap space. The omp tests should run now.

* src/: configure, configure.in, perf_events.c, sys_perf_event_open.c: This is the last commit needed to get libpfm4 support going.
To build with libpfm4 support enabled, run configure like this: ./configure --with-libpfm4

* src/: papi_pfm_events.c, papi_pfm_events.h, perf_events.c: Pass the actual perf_attr structure around, rather than just a 64-bit event value. This allows support for generalized events and eventual offcore/uncore support.

* src/: papi_pfm_events.c, perf_events.c, perf_events.h: Clean up some debugging #ifdefs

* src/papi_events.csv: The papi_events.csv file requires some additions for libpfm4 to work + The CPU family names have changed from libpfm3 to libpfm4. It should be backward compatible to just add the libpfm4 ones in addition to the libpfm3 ones + libpfm4 does not provide a helper to get the instruction and cycle event names. So we have to add them for all supported CPUs

* src/: Rules.pfm4_pe, papi_pfm4_events.c: New files needed for libpfm4 support

2011-05-16

* release_procedure.txt: Add note to update from cvs before tagging. Thanks, Will Cohen :)

2011-10-25

* doc/: Makefile, doxygen_procedure.txt: Update doxygen_procedure to note that we need a recent version of doxygen.
* man/: man1/avail.c.1, man1/clockres.c.1, man1/command_flags_t.1, man1/command_line.c.1, man1/component.c.1, man1/cost.c.1, man1/decode.c.1, man1/error_codes.c.1, man1/event_chooser.c.1, man1/mem_info.c.1, man1/native_avail.c.1, man1/options_t.1, man1/papi_avail.1, man1/papi_clockres.1, man1/papi_command_line.1, man1/papi_component_avail.1, man1/papi_cost.1, man1/papi_decode.1, man1/papi_error_codes.1, man1/papi_event_chooser.1, man1/papi_mem_info.1, man1/papi_multiplex_cost.1, man1/papi_native_avail.1, man3/CDI.3, man3/HighLevelInfo.3, man3/PAPIF.3, man3/PAPIF_accum.3, man3/PAPIF_accum_counters.3, man3/PAPIF_add_event.3, man3/PAPIF_add_events.3, man3/PAPIF_assign_eventset_component.3, man3/PAPIF_cleanup_eventset.3, man3/PAPIF_create_eventset.3, man3/PAPIF_destroy_eventset.3, man3/PAPIF_enum_event.3, man3/PAPIF_event_code_to_name.3, man3/PAPIF_event_name_to_code.3, man3/PAPIF_flips.3, man3/PAPIF_flops.3, man3/PAPIF_get_clockrate.3, man3/PAPIF_get_dmem_info.3, man3/PAPIF_get_domain.3, man3/PAPIF_get_event_info.3, man3/PAPIF_get_exe_info.3, man3/PAPIF_get_granularity.3, man3/PAPIF_get_hardware_info.3, man3/PAPIF_get_multiplex.3, man3/PAPIF_get_preload.3, man3/PAPIF_get_real_cyc.3, man3/PAPIF_get_real_nsec.3, man3/PAPIF_get_real_usec.3, man3/PAPIF_get_virt_cyc.3, man3/PAPIF_get_virt_usec.3, man3/PAPIF_ipc.3, man3/PAPIF_is_initialized.3, man3/PAPIF_library_init.3, man3/PAPIF_lock.3, man3/PAPIF_multiplex_init.3, man3/PAPIF_num_cmp_hwctrs.3, man3/PAPIF_num_counters.3, man3/PAPIF_num_events.3, man3/PAPIF_num_hwctrs.3, man3/PAPIF_perror.3, man3/PAPIF_query_event.3, man3/PAPIF_read.3, man3/PAPIF_read_ts.3, man3/PAPIF_register_thread.3, man3/PAPIF_remove_event.3, man3/PAPIF_remove_events.3, man3/PAPIF_reset.3, man3/PAPIF_set_cmp_domain.3, man3/PAPIF_set_cmp_granularity.3, man3/PAPIF_set_debug.3, man3/PAPIF_set_domain.3, man3/PAPIF_set_event_domain.3, man3/PAPIF_set_granularity.3, man3/PAPIF_set_inherit.3, man3/PAPIF_set_multiplex.3, man3/PAPIF_shutdown.3, 
man3/PAPIF_start.3, man3/PAPIF_start_counters.3, man3/PAPIF_state.3, man3/PAPIF_stop.3, man3/PAPIF_stop_counters.3, man3/PAPIF_thread_id.3, man3/PAPIF_thread_init.3, man3/PAPIF_unlock.3, man3/PAPIF_unregister_thread.3, man3/PAPIF_write.3, man3/PAPI_accum.3, man3/PAPI_accum_counters.3, man3/PAPI_add_event.3, man3/PAPI_add_events.3, man3/PAPI_addr_range_option_t.3, man3/PAPI_address_map_t.3, man3/PAPI_all_thr_spec_t.3, man3/PAPI_assign_eventset_component.3, man3/PAPI_attach.3, man3/PAPI_attach_option_t.3, man3/PAPI_cleanup_eventset.3, man3/PAPI_component_info_t.3, man3/PAPI_cpu_option_t.3, man3/PAPI_create_eventset.3, man3/PAPI_debug_option_t.3, man3/PAPI_descr_error.3, man3/PAPI_destroy_eventset.3, man3/PAPI_detach.3, man3/PAPI_dmem_info_t.3, man3/PAPI_domain_option_t.3, man3/PAPI_enum_event.3, man3/PAPI_event_code_to_name.3, man3/PAPI_event_info_t.3, man3/PAPI_event_name_to_code.3, man3/PAPI_exe_info_t.3, man3/PAPI_flips.3, man3/PAPI_flops.3, man3/PAPI_get_cmp_opt.3, man3/PAPI_get_component_info.3, man3/PAPI_get_dmem_info.3, man3/PAPI_get_event_info.3, man3/PAPI_get_executable_info.3, man3/PAPI_get_hardware_info.3, man3/PAPI_get_multiplex.3, man3/PAPI_get_opt.3, man3/PAPI_get_overflow_event_index.3, man3/PAPI_get_real_cyc.3, man3/PAPI_get_real_nsec.3, man3/PAPI_get_real_usec.3, man3/PAPI_get_shared_lib_info.3, man3/PAPI_get_thr_specific.3, man3/PAPI_get_virt_cyc.3, man3/PAPI_get_virt_nsec.3, man3/PAPI_get_virt_usec.3, man3/PAPI_granularity_option_t.3, man3/PAPI_hw_info_t.3, man3/PAPI_inherit_option_t.3, man3/PAPI_ipc.3, man3/PAPI_is_initialized.3, man3/PAPI_itimer_option_t.3, man3/PAPI_library_init.3, man3/PAPI_list_events.3, man3/PAPI_list_threads.3, man3/PAPI_lock.3, man3/PAPI_mh_cache_info_t.3, man3/PAPI_mh_info_t.3, man3/PAPI_mh_level_t.3, man3/PAPI_mh_tlb_info_t.3, man3/PAPI_mpx_info_t.3, man3/PAPI_multiplex_init.3, man3/PAPI_multiplex_option_t.3, man3/PAPI_num_cmp_hwctrs.3, man3/PAPI_num_components.3, man3/PAPI_num_counters.3, man3/PAPI_num_events.3, 
man3/PAPI_num_hwctrs.3, man3/PAPI_option_t.3, man3/PAPI_overflow.3, man3/PAPI_perror.3, man3/PAPI_preload_info_t.3, man3/PAPI_profil.3, man3/PAPI_query_event.3, man3/PAPI_read.3, man3/PAPI_read_counters.3, man3/PAPI_read_ts.3, man3/PAPI_register_thread.3, man3/PAPI_remove_event.3, man3/PAPI_remove_events.3, man3/PAPI_reset.3, man3/PAPI_set_cmp_domain.3, man3/PAPI_set_cmp_granularity.3, man3/PAPI_set_debug.3, man3/PAPI_set_domain.3, man3/PAPI_set_granularity.3, man3/PAPI_set_multiplex.3, man3/PAPI_set_opt.3, man3/PAPI_set_thr_specific.3, man3/PAPI_shlib_info_t.3, man3/PAPI_shutdown.3, man3/PAPI_sprofil.3, man3/PAPI_sprofil_t.3, man3/PAPI_start.3, man3/PAPI_start_counters.3, man3/PAPI_state.3, man3/PAPI_stop.3, man3/PAPI_stop_counters.3, man3/PAPI_strerror.3, man3/PAPI_thread_id.3, man3/PAPI_thread_init.3, man3/PAPI_unlock.3, man3/PAPI_unregister_thread.3, man3/PAPI_write.3, man3/high_api.3, man3/low_api.3, man3/papi_data_structures.3, man3/papi_vector_t.3, man3/ret_codes.3: Update doxygen generated man-pages for the pending release. In the future, we need to use a newer version of doxygen to generate the pages (1.7 +) because locally installed versions appear to have a bug.

* src/ctests/nmi_watchdog.c: The nmi_watchdog test should report a Warning, not an error, if nmi_watchdog is enabled. (Since we do work around it, even if performance is likely impacted).

* src/ctests/: Makefile, nmi_watchdog.c: I think the nmi_watchdog stuff is going to cause us problems down the road. Thus add a test that will tell users about the issue.

* src/perf_events.c: The nmi_watchdog workaround is needed for multiplexing too. The kernel devs don't seem eager to fix this. Until they do, we'll have to fall back to software multiplexing on recent kernels that have nmi_watchdog enabled (most vendor kernels).

* src/multiplex.c: Yesterday's coverity fix to make sure the cleanup and destroy return values were checked ended up over-writing "retval" in a way that broke the sdsc4-mpx test.
Fix things so that doesn't happen. * src/: papi.c, perf_events.c, ctests/overflow_allcounters.c: Some changes for perf_event MIPS support + Add __mips__ cases to the format_group, schedulability, and broken multiplexing bug workarounds, as even new Linux mips kernels have these bugs + fix overflow_allcounters to work properly if the MHz value is zero. + Add some debugging to PAPI_overflow() so that errors are more obvious than just returning PAPI_EINVAL, which made the previous item a pain to track down. * man/: footer.htm, header.htm, manServer_papi.pl, papiman.bat, html/papi.html, html/papi_accum.html, html/papi_accum_counters.html, html/papi_add_event.html, html/papi_add_events.html, html/papi_assign_eventset_component.html, html/papi_attach.html, html/papi_avail.html, html/papi_cleanup_eventset.html, html/papi_clockres.html, html/papi_command_line.html, html/papi_cost.html, html/papi_create_eventset.html, html/papi_decode.html, html/papi_destroy_eventset.html, html/papi_detach.html, html/papi_encode_events.html, html/papi_enum_event.html, html/papi_event_chooser.html, html/papi_event_code_to_name.html, html/papi_event_name_to_code.html, html/papi_flips.html, html/papi_flops.html, html/papi_get_component_info.html, html/papi_get_dmem_info.html, html/papi_get_event_info.html, html/papi_get_executable_info.html, html/papi_get_hardware_info.html, html/papi_get_multiplex.html, html/papi_get_opt.html, html/papi_get_overflow_event_index.html, html/papi_get_real_cyc.html, html/papi_get_real_usec.html, html/papi_get_shared_lib_info.html, html/papi_get_substrate_info.html, html/papi_get_thr_specific.html, html/papi_get_virt_cyc.html, html/papi_get_virt_usec.html, html/papi_help.html, html/papi_ipc.html, html/papi_is_initialized.html, html/papi_library_init.html, html/papi_list_events.html, html/papi_list_threads.html, html/papi_lock.html, html/papi_mem_info.html, html/papi_multiplex_init.html, html/papi_native.html, html/papi_native_avail.html, 
html/papi_num_cmp_hwctrs.html, html/papi_num_components.html, html/papi_num_counters.html, html/papi_num_events.html, html/papi_num_hwctrs.html, html/papi_overflow.html, html/papi_perror.html, html/papi_presets.html, html/papi_profil.html, html/papi_query_event.html, html/papi_read.html, html/papi_read_counters.html, html/papi_register_thread.html, html/papi_remove_event.html, html/papi_remove_events.html, html/papi_reset.html, html/papi_set_cmp_domain.html, html/papi_set_cmp_granularity.html, html/papi_set_debug.html, html/papi_set_domain.html, html/papi_set_event_info.html, html/papi_set_granularity.html, html/papi_set_multiplex.html, html/papi_set_opt.html, html/papi_set_thr_specific.html, html/papi_shutdown.html, html/papi_sprofil.html, html/papi_start.html, html/papi_start_counters.html, html/papi_state.html, html/papi_stop.html, html/papi_stop_counters.html, html/papi_strerror.html, html/papi_thread_id.html, html/papi_thread_init.html, html/papi_unlock.html, html/papi_unregister_thread.html, html/papi_write.html, html/papif.html, html/papif_get_clockrate.html, html/papif_get_domain.html, html/papif_get_exe_info.html, html/papif_get_granularity.html, html/papif_get_preload.html, html/papif_set_event_domain.html, images/cssigoff.gif, images/cssigon.gif, images/headertop.jpg, images/line.gif, images/logobottom.jpg, images/logoleft.jpg, images/menubg.jpg, images/menubg95.jpg, images/rd.jpg, images/spinbg.jpg, images/spinlogo.gif, images/stable.gif, images/stripes2.jpg, images/trans.gif, images/utsigoff.gif, images/utsigon.gif, images/white.jpg: Remove the old html documentation and assorted helper files.

* src/components/coretemp/linux-coretemp.c: Fix a possible directory stream leak in the coretemp component. Reported by coverity checker.

* src/ctests/calibrate.c: Properly free the arrays in calibrate, introduced by yesterday's coverity fix.
Patch by Will Cohen

2011-10-24

* src/components/coretemp/linux-coretemp.c: Fix coretemp to not fail if /sys/class/hwmon doesn't exist.

* src/components/coretemp/linux-coretemp.c: Patch coretemp to only free the initialized data in shutdown_substrate (once per PAPI_init) rather than shutdown (once per thread). This was causing double free errors. Patch from Will Cohen

* src/utils/multiplex_cost.c: Fix various calls to PAPI_start() and PAPI_stop() in multiplex_cost that didn't check the return value. Took care to try to avoid changing timing measurements. Noticed by coverity checker.

* src/utils/cost.c: In one case, cost was not checking the return of PAPI_start()/PAPI_stop(). This change makes it do so, while being careful not to interfere with the timing that is going on.

* src/ctests/: pthrtough.c, pthrtough2.c: pthrtough and pthrtough2 were not checking the return value for pthread_attr_setscope(). Reported by coverity checker.

* src/ctests/multiplex1_pthreads.c: multiplex1_pthreads was not checking the return from PAPI_library_init() as flagged by coverity checker.

* src/ctests/inherit.c: inherit.c wasn't checking the result of the waitpid() call, as reported by coverity checker.

* src/ctests/clockres_pthreads.c: Check the return of pthread_create(). Reported by coverity checker.

* src/papi_libpfm4_events.c: Fix an actual bug (reported as deadcode by coverity) where _papi_hwd_ntv_code_to_descr was appending extraneous ", masks:" strings into an event description. None of our utils/ctests exercise this function, which is probably why the bug wasn't noticed.

* src/: multiplex.c, papi.c: Fix cases where PAPI_*() functions were called without checking the return for an error. Reported by coverity.

* doc/Doxyfile.utils: Update version to 4.2.0 for pending release.

* src/multiplex.c: Fix some code that could potentially dereference a null pointer. Found by the coverity checker.

* src/papi_vector.c: Remove a dead code case as reported by coverity.
Shouldn't break anything as I can't find anywhere that vector_print_table() is actually called.

* release_procedure.txt: Update release_procedure to reflect another file that needs a version number bump. (Doxyfile.utils)

* src/ctests/calibrate.c: Fix some weird code that was sharing a memory allocation for both doubles and floats. This was really ugly and made the coverity checker sad. Patch provided by Will Cohen.

* src/testlib/test_utils.c: Fix a signed/unsigned comparison bug I introduced.

* src/components/coretemp/tests/coretemp_basic.c: Fix the test so it correctly iterates all of the components.

* src/components/coretemp/: linux-coretemp.c, tests/Makefile, tests/coretemp_basic.c: Fix a potential memory leak in coretemp (flagged by coverity). Also added a test case for coretemp so I can actually test if these changes are breaking anything.

* src/solaris-ultra.c: Remove const declaration from get_virt_* in solaris substrate. Vince removed this from papi_vector.h back in June.

* src/testlib/test_utils.c: Improve the add_two_events() code in the test library. Before it was possible to overrun a buffer if none of the potential predefined events were available. Noticed by the coverity checker.

* papi.spec, doc/Doxyfile, doc/Doxyfile-everything, src/configure, src/papi.h, src/Makefile.in, src/configure.in: Update version to 4.2.0 for pending release.

2011-10-21

* src/: Makefile.inc, configure, configure.in, papi.c, papi.h, papi_internal.c, papi_user_events.c, papi_user_events.h: Merge in the user events code, protected by a configure option. (--with-user-events)

* src/testlib/test_utils.c: We now ensure that test_fail() always exits. There was some code around that tracked the number of times test_fail() was called. Remove that, as I think it was confusing the coverity checker and causing a huge number of false positives for NULL pointer dereferences.

* src/components/acpi/linux-acpi.c: Some minor cleanups to the acpi component.
It was choking a bit if ACPI didn't provide thermal information; also fix a few coverity bugs involving not checking the result of a dup() call.

* src/testlib/test_utils.c: Another problem with negative numbers, this time one could potentially be passed to a malloc call. Noticed by coverity.

* src/ctests/overflow_pthreads.c: We were indexing an array with a returned value that could be negative on failure. Add a check to avoid that. We're also indexing a per-thread array with an EventSet number, which sounds suspect; should probably investigate that further.

* src/perf_events.c: perf_events.c was setting variables to -1 and then potentially using them to index arrays or call close() on them. This adds checks to avoid that. Noticed by the coverity checker.

* src/components/lustre/linux-lustre.h: Include stdint.h and ctype.h; needed for uint64_t and isspace() respectively.

* src/components/coretemp/linux-coretemp.c: Fix problem where we try to manipulate a NULL directory entry. This fixes a segfault on a Nehalem machine we have here that has a /sys/class/hwmon/hwmon0 directory without a "device" subdirectory.

* src/components/coretemp/linux-coretemp.c: We were opening a file but not checking for failure before reading from it. Flagged by the coverity checker.

* src/components/coretemp/linux-coretemp.c: Both gcc and coverity were complaining about using an uninitialized pointer. This makes sure it's not dereferenced if not initialized.

* src/ctests/prof_utils.c: Stop doing unnecessary pointer math in a print statement. This was flagged as a problem by the coverity tool.

* src/components/coretemp/linux-coretemp.c: Fix some wrong buffer sizes in the coretemp component. Patch from Will Cohen

* src/ctests/sdsc.c: add some extra debug info for sdsc test failures.

* src/papi_hl.c: Add comment to PAPI_num_counters() documentation about use of PAPI_num_cmp_hwctrs() for component counters.

2011-10-19

* src/papi.c: Correct documentation errors for PAPI_strerror.
* src/: configure, configure.in: Under a no-cpu-counters build, still build all of the utils. We probably want to rethink some of the cost util details.

2011-10-11

* src/run_tests.sh: Remove an unneeded call to "cat". For some reason it was printing pointless warnings that needlessly cluttered the buildbot logs.

* src/ctests/: Makefile, multiplex1.c: -lpapi should never be a dependency. -I.. is missing in the makefile. You should be able to cd ctests and do: make or make multiplex. Also, added the read after start multiplex case for multiplex1. This triggers bugs in perf_events systems.

2011-10-10

* src/: papi.c, papi_internal.c, threads.c: The multiplex1_pthreads test was reporting a memory leak. This is because the test was calling PAPI_unregister_thread() without destroying its EventSets. This change adds code that, at unregister_thread time, destroys any EventSets belonging to that thread. This works on all the current ctests but I should check some of the various corner cases not currently tested.

2011-10-07

* src/libpfm4/: config.mk, lib/pfmlib_amd64.c, lib/pfmlib_common.c, lib/pfmlib_intel_x86.c, lib/events/intel_nhm_events.h, lib/events/intel_wsm_events.h: Merge the "conflicts" from the libpfm4 merge

* src/: threads.c, threads.h: Fix the MEMORY LEAK errors involving the attach ctests (as seen on buildbot). These came about when proper multiattach support was added. A "fake" thread structure is created for each attached process. These fake thread structures were not being cleaned up at shutdown, hence the leak. This fix adds support so that at thread shutdown, any "fake" threads that we created are shut down too. This was tricky, especially dealing with the circular linked list the thread info structs are in. This fix seems to work without negatively affecting the pthread cases. ctests/multiplex1_pthreads still reports MEMORY LEAK but that seems to be an eventset issue, not a thread issue, so will be investigated separately.
2011-10-06

* src/: papi.h, papi_fwrappers.c: Add Fortran reference to doxygen main page.

2011-10-05

* src/: papi.c, papi_internal.c, perf_events.c: There has been some ongoing speculation about what would happen if you enabled Multiplexing and Overflow at the same time. It turns out (at least on perf_events) that if you have kernel multiplexing, the results are what you expect. You get overflows, but fewer than in the non-multiplexing case because the overflow counter isn't being run all the time. The results for software multiplexing involved a segfault. This is because in the software multiplexing case the primary EventSet is a fiction; a set of shadow EventSets are created behind the scenes, and these are the ones used. Therefore when you enable overflow, the overflow event is attempted to be enabled on the fictitious main EventSet. There are no native events mapped for it, so overflow tries to access native event array index "-1" which causes bad things to happen. This change avoids the issue by catching the "-1" case and failing accordingly. We should probably decide if we want to catch the oflo/mpx combination earlier and outright ban it. I also went through a lot of the code involved adding comments, as it was really hard following what was going on. This involved the infamously dense "_papi_hwi_remap_event_position()" function too.

* src/papi.h: Moved cpu and inherit bits to end of structure for compat across all 4.x lines. Found by Will Cohen. As it turns out, I ended up reviewing the CPU_ATTACH changes; I had not done so before. This functionality actually belongs in PAPI_set_granularity. A CPU is a natural unit of granularity of counting, and that value was spec'd in papi.h a long time ago. Right thing to do here is leave the current attach stuff but make it work as part of set_granularity. Consider that a TODO for 4.3.

2011-10-04

* doc/: Doxyfile, Doxyfile-everything: Enable macro expansion in the doxygen preprocessor step.
Doxygen was not creating docs for the fortran functions and I believe it is because it was silently choking on our clever preprocessor abuse; this fixes? that. However, its worth taking a critical eye to the generated pages again. * src/: papi.c, papi_fwrappers.c, papi_hl.c: make "* #include" into "* \#include" so doxygen doesn't treat it as a command. * src/papi_fwrappers.c: Added all doxygen stubs to the PAPIF group. 2011-10-03 * src/ctests/ipc.c: My previous "fix" for the array bounds issue in ipc.c had multiple embarassing bugs. Thanks to Will Cohen for noticing. Things should be better now. * src/: Rules.perfctr-pfm, Rules.pfm_pe: Additionally remove the now extraneous papi_libpfm_preset definition from the other Rules files too. * src/: Makefile.inc, Rules.pfm4_pe: The change to make the preset code generic accidentally ended up defining the build rules for the file in duplicate places. This fixes that. 2011-09-30 * src/: linux-common.c, utils/decode.c: Fix two unused variable warnings. * src/ctests/second.c: We were allocating the "values" array but never freeing it. * src/ctests/: sdsc2.c, sdsc4.c: The SDSC tests could walk off the end of an array. * src/ctests/overflow_twoevents.c: We could potentially access outside an array boundary in overflow_twoevents. * src/ctests/ipc.c: ipc was also abusing array boundaries. * src/ctests/flops.c: The flops.c ctest was abusing the notion of C arrays, by writing INDEX*INDEX values to mresult[0][i], I suppose "knowing" that this would fill in the whole array. Fix things to use an additional iterator. * src/ctests/byte_profile.c: The coverity checker rightly points out that the last argument to strncat should be buffersize-1. * src/ctests/: exeinfo.c, shlib.c: Coverity flagged that there were some tests that had no effect. In particular the are tests that the pointers are non-null. However, they are arrays rather than pointers. This patch make it clear that arrays are being used in the code. 
Patch from Will Cohen at redhat * src/ctests/clockcore.c: This is a relatively minor patch that ensures that all the allocated memory is initialized to zero before it is used. Coverity might not be smart enough to determine whether the test actually wrote into all the locations because of the case statement. This is make it easier for coverity to determine that the memory has been initialized. Path from Will Cohen at redhat. * src/multiplex.c: Coverity scan showed that MPX_cleanup() function was blindly accessing a value through a pointer and then checking to see that the pointer was null. This patch makes sure that the pointer is checked before it is used. Patch from Will Cohen at redhat. * src/ctests/: pthrtough.c, pthrtough2.c: Coverity found that the sizeof argument for pthrtough2.c and pthrtough.c was using sizeof(pthread *) rather than sizeof(pthread). This patch fixes that problem. Patch from Will Cohen at redhat * src/papi_internal.c: This change moves the setting for default domain to be enforced at eventset add time, rather than eventset creation time. This fixes some problems seen when multiplexing. The patch was provided by Phil Mucci. * src/pmapi-ppc64.h: One more file that is no longer needed. * src/: configure, configure.in, perfctr.c, pmapi-ppc64_events.c, ppc64_events.c: Clean up the now not-needed pmapi-ppc64_events.c file. * src/: Makefile.inc, aix.c, aix.h, configure, configure.in, papi_libpfm_presets.c: Finalize the merge of the preset code. * src/aix.c: Fix a missing include. * src/: aix.c, configure, configure.in: Move more code to its proper place. * src/: aix.c, configure, configure.in, pmapi-ppc64.c, pmapi-ppc64_events.c, ppc64_events.c: Move the ppc64_setup_native_table() routines out of the preset code. This is complicated, as there are two very similar routines setup_ppc64_native_table() used by AIX/pmapi and ppc64_setup_native_table() used by perfctr These could probably be merged too, but this is definitely not the time. 
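The sizeof mistake fixed in the pthrtough tests above (2011-09-30) is a classic C allocation pitfall: using the size of a pointer instead of the size of the element. A minimal sketch of the corrected pattern, with a hypothetical helper name alloc_thread_handles (not code from the actual tests):

```c
#include <pthread.h>
#include <stdlib.h>

/* Allocate storage for n thread handles.  The pthrtough bug used
 * sizeof(pthread_t *) -- the size of a pointer -- instead of
 * sizeof(pthread_t), under-allocating on platforms where pthread_t
 * is larger than a pointer. */
pthread_t *alloc_thread_handles(size_t n)
{
	return malloc(n * sizeof(pthread_t));	/* correct: element size */
}
```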
	* src/: aix.c, papi_libpfm_presets.c, pmapi-ppc64_events.c: Move pmapi_find_full_event to be _aix_ntv_name_to_code(), as it probably always should have been.

	* src/: papi_libpfm_presets.c, papi_setup_presets.h, pmapi-ppc64_events.c: Make papi_libpfm_presets more generic by calling _papi_hwi_native_name_to_code() rather than a substrate-specific call.

	* src/: aix.c, papi_libpfm_presets.c, pmapi-ppc64_events.c: I was mainly doing this to aid debugging, but now the papi_libpfm_presets.c and pmapi-ppc64_events.c files are close enough to identical that I might try to merge them.

2011-09-29

	* src/: papi_libpfm_presets.c, pmapi-ppc64_events.c, ppc64_events.h: The files are almost the same now.

	* src/: papi_libpfm_presets.c, pmapi-ppc64_events.c: More making these files the same, including some memory leak fixes that made it into the former but not the latter.

	* src/: papi_libpfm_presets.c, pmapi-ppc64_events.c: Tracking down problems on AIX can be a bit of a pain because papi_libpfm_presets.c and pmapi-ppc64_events.c are almost (but not quite) the same. This change makes the files more similar, mostly by cleaning up whitespace and normalizing comments and debugging statements between the two.

	* src/pmapi-ppc64_events.c: Ugh, obvious typo in that last commit.

	* src/pmapi-ppc64_events.c: In ppc64_setup_gps() the current code sometimes walks off the end of the group array and trashes unrelated memory. Until we work out the proper fix, this prints an error message and stops the loop before memory is corrupted.

	* src/papi_data.h: No one seems to remember the last time this file was used, so let's remove it.

2011-09-28

	* src/Makefile.inc: Remove the "u" option from the "ar" command that links libpapi.a, as it was breaking the build on MIPS. This *shouldn't* break anything, but messing around with "ar" options can be potentially dangerous. I'll double-check the non-Linux builds.

	* src/libpfm4/lib/: Makefile, pfmlib_mips_priv.h, events/intel_nhm_events.h, events/intel_wsm_events.h: Fix up the "collisions" from the libpfm4 import.

2011-09-26

	* src/Makefile.inc: We would like to use parallel make on packages to speed things up. However, when this was tried with papi, "make -j4" failed (https://bugzilla.redhat.com/show_bug.cgi?id=740909). I took a look through the code and found that some of the dependencies were not quite right. It turns out that $(papiLIBS) is substituted during configure, but it isn't available for the actual make. Attached is the patch that ensures that $(LIBS) are built before utils and tests. Patch from Will Cohen.

	* src/run_tests.sh: Modify run_tests.sh so that you can set the VALGRIND command externally via an environment variable without having to edit run_tests.sh itself. Also adds date and cpuinfo information to the beginning of the run_tests.sh results. This can help when run_tests.sh output is passed around while debugging a problem. Patch from Phil Mucci.

	* src/: configure, configure.in: If we have no Fortran compiler available, then our current build system tries to build the Fortran examples with an empty compiler string, which just generates strange errors. This patch changes F77 to be "echo", which at least avoids the errors. The proper fix is probably just not to build the Fortran samples if no compiler is available. Patch from Phil Mucci.

	* src/papi_libpfm4_events.c: The build on power6 was warning in a DEBUG statement because sizeof() returns an int rather than a long. So use a cast to avoid this.

	* src/perf_events.c: The move to use pid_t for pid values caused warnings on a --with-debug build due to the lack of a way to print a pid_t value without a cast. This fix adds the proper casts.

2011-09-23

	* src/papi_libpfm4_events.c: Rename the "perfmon_idx" structure field to the more evocative "libpfm4_idx". Patch from Phil Mucci.

	* src/ctests/all_native_events.c: Fix a problem where we were passing a pointer to an EventSet rather than the actual EventSet number to PAPI_cleanup_eventset(). Also include some of the cleanups from Phil Mucci's MIPS tree.

	* src/: perf_events.c, perf_events.h: Make the perf_event ctl structure have more explicit data types. Patch from Philip Mucci.

	* src/: cycle.h, linux-common.c, linux-context.h, linux-lock.h, linux-timer.c, mb.h, papi.h: Add bare minimal MIPS 74k support, enough to compile. Patch from Philip Mucci.

	* src/papi_events.csv: Add MIPS 74k pre-defined events. Patch by Philip Mucci.

2011-09-22

	* src/ctests/all_native_events.c: Heike's cleanup_eventset work allows calling PAPI_cleanup_eventset with cuda, so uncomment the eventset cleanup code in all_native_events.

	* src/papi.h: Update papi.h to properly detect if it is being built with a C99 compiler.

	* src/papi_events.csv: Update the PAPI_FP_INS event name on amd_fam14h, as it was changed in the most recent libpfm4 merge.

	* src/libpfm4/: README, config.mk, docs/Makefile, docs/man3/pfm_get_event_info.3, examples/Makefile, examples/showevtinfo.c, include/Makefile, include/perfmon/perf_event.h, lib/Makefile, lib/pfmlib_common.c, lib/pfmlib_gen_mips64_priv.h, lib/pfmlib_mips.c, lib/pfmlib_mips_74k.c, lib/pfmlib_mips_perf_event.c, lib/pfmlib_mips_priv.h, lib/pfmlib_perf_event_pmu.c, lib/pfmlib_priv.h, lib/events/intel_atom_events.h, lib/events/intel_core_events.h, lib/events/intel_nhm_events.h, lib/events/intel_snb_events.h, lib/events/intel_wsm_events.h: Fix the "conflicts" from the libpfm4 git import.

	* src/libpfm4/: docs/man3/libpfm_mips_74k.3, tests/validate_arm.c, tests/validate_mips.c: Initial revision.

2011-09-21

	* src/multiplex.c: Fix a problem where we were freeing a singly-linked list in a for loop, possibly free()ing the allocation before dereferencing ->next. Problem reported by the Coverity tool, via Will Cohen.

	* src/utils/cost.c: Fixed an uninitialized data problem in papi_cost. Problem reported by the Coverity tool, via Will Cohen.

	* src/papi_internal.c: Fix a problem where we were copying around chunks of memory that were not initialized yet. Problem reported by the Coverity tool, via Will Cohen.

	* src/multiplex.c: Fix two cases where we were dereferencing a pointer without checking for NULL. Problem reported by the Coverity tool, via Will Cohen.

	* src/linux-memory.c: We were opening files but not properly closing them if we returned early with an error condition. Problem reported by the Coverity tool, via Will Cohen.

	* src/linux-common.c: The Coverity tool noticed that we allocate and populate a cpu node info structure, but we never pass any info about this structure outside of the cpu detection routine, in effect leaking the allocation. For now just comment out this code, as it is not used by anyone. Problem reported by the Coverity tool, via Will Cohen.

	* src/: papi.c, papi_libpfm3_events.c, perfctr-x86.c: The Coverity checker was reporting that we forgot to fclose() /proc/cpuinfo in papi.c. The bigger question is why we were unconditionally trying to open /proc/cpuinfo in generic code in papi.c anyway. It turns out it was to set the event masks properly for itanium and p4. The platform code sets CPU vendor and family for us though, so if we just make the event mask code use those values then we don't have to open cpuinfo. This also means that non-Linux users with the misfortune of running on a P4 might actually work too.

	* src/: papi_internal.c, papi_libpfm_presets.c: In various places we were using MAX_COUNTER_TERMS (defined by the substrate) rather than PAPI_MAX_COUNTER_TERMS (a papi predefined event define). This could cause buffer overruns. This fixes things, though really we shouldn't have such similar names for different defines. Problem reported by the Coverity tool, via Will Cohen.

	* src/multiplex.c: Avoid a case where we could have been dereferencing a NULL pointer in MPX_stop(). Reported by the Coverity tool, via Will Cohen.

	* src/papi.c: Fix a problem where thread and cpu could be dereferenced as NULL in PAPI_start(). Reported by the Coverity tool, via Will Cohen.

	* src/papi_events.csv: Update the AMD Family 14h (Bobcat) pre-defined events. It turns out they are different enough from 10h that they need their own category. In going through the Fam14h BKDG it turns out that Bobcat has a really nice set of events available, especially for floating-point/SSE but also memory bandwidth. With this change, all of the ctests pass on a Bobcat machine.

	* src/: configure, configure.in: Recent Ubuntu versions use the ld flag --as-needed by default. This breaks the PAPI configure step for the libdl check, as the --as-needed flag enforces the rule that libraries (in this case -ldl) must come after the object files on the command line, not before. The fix is easy: the libdl check was wrongly sticking -ldl in LDFLAGS rather than in LIBS. Putting it in LIBS makes things work as expected. See http://www.gentoo.org/proj/en/qa/asneeded.xml for more info on this issue than you probably ever want to know.

2011-09-19

	* src/: ctests/Makefile, ftests/Makefile, utils/Makefile: When building testlib dependencies from ctests/, ftests/, and utils/, call $(MAKE) and not make; this should fix aix.
2011-09-14

	* src/: aix.c, freebsd.c, linux-bgp.c, papi_vector.c, perf_events.c, perfctr-ppc64.c, perfctr-x86.c, perfmon-ia64.c, perfmon.c, solaris-niagara2.c, solaris-ultra.c, components/acpi/linux-acpi.c, components/coretemp/linux-coretemp.c, components/coretemp_freebsd/coretemp_freebsd.c, components/example/example.c, components/infiniband/linux-infiniband.c, components/lmsensors/linux-lmsensors.c, components/lustre/linux-lustre.c, components/mx/linux-mx.c, components/net/linux-net.c, win2k/substrate/win32.c, win2k/substrate/winpmc-p3.c: Change the initialization of the function pointer cleanup_eventset() from vec_int_dummy to vec_int_ok_dummy so that it returns PAPI_OK by default. Roll back the initialization for every substrate. Again, keep an eye on buildbot.

	* src/libpfm4/lib/: pfmlib_mips.c, pfmlib_mips_74k.c, pfmlib_mips_perf_event.c, pfmlib_mips_priv.h, events/mips_74k_events.h: Merged with HEAD, still passing all tests.

2011-09-13

	* src/papi_libpfm4_events.c: The libpfm4 code was doing a full call to pfm_get_os_event_encoding() during every call to update_control_state(). This is unnecessary, as we can call pfm_get_os_event_encoding() once at event creation time and cache the results. There's no need to call it on each update_control_state(), as that is called during PAPI_start() and is thus relatively time critical.

	* src/run_tests.sh: Missed a $.

	* src/: run_tests.sh, components/example/tests/HelloWorld.c: Update run_tests.sh to run component tests, and update the example test to act more like a ctest.

	* src/components/example/example.c: Fix warnings generated by the example component.

	* src/: Makefile.inc, components/Makefile_comp_tests, ctests/Makefile, ctests/do_loops.c, ctests/dummy.c, ctests/papi_test.h, ctests/test_utils.c, ctests/test_utils.h, ftests/Makefile, testlib/Makefile, testlib/do_loops.c, testlib/dummy.c, testlib/papi_test.h, testlib/test_utils.c, testlib/test_utils.h, utils/Makefile: ctests, ftests, utils, and the component tests were all using some files in ctests. These weren't being built when --with-no-cpu-counters was enabled, so the PAPI build was breaking when that was enabled along with a component. Move the shared files to their own directory, testlib, then update all the users to look in the right place. After this commit you might need to do a "cvs update -d" to make sure you get the new subdirectory.

	* src/: configure, configure.in: When compiling with --with-no-cpu-counters, configure would report the platform as linux-perfctr-x86. This changes it to report linux-no-counters.

2011-09-12

	* src/: aix.c, freebsd.c, linux-bgp.c, perf_events.c, perfctr-ppc64.c, perfctr-x86.c, perfmon-ia64.c, perfmon.c, solaris-niagara2.c, solaris-ultra.c, components/acpi/linux-acpi.c, components/coretemp/linux-coretemp.c, components/coretemp_freebsd/coretemp_freebsd.c, components/example/example.c, components/infiniband/linux-infiniband.c, components/lmsensors/linux-lmsensors.c, components/lustre/linux-lustre.c, components/mx/linux-mx.c, components/net/linux-net.c, win2k/substrate/win32.c, win2k/substrate/winpmc-p3.c: Initialize the new function pointer cleanup_eventset() for every substrate. Keep an eye on buildbot.

	* src/components/cuda/: linux-cuda.c, linux-cuda.h: Cannot override void* definitions from the PAPI framework layer (e.g. hwd_control_state_t) with typedefs to conform to PAPI component layer code if this technique has already been used in another substrate (e.g. perfctr-x86). In short: #undef and typedef can't be done twice.

	* src/perf_events.c: Fix a bug caused by forgetting to drop the stream name when converting an fprintf() into a SUBDBG().

	* src/papi_libpfm_presets.c: Patch from William Cohen fixing a potential problem found by a static analysis tool where we could possibly pass a NULL pointer to free_notes().

	* src/papi_libpfm_presets.c: Some memory leak fixes made to the libpfm3 papi_pfm_events.c by Robert Richter were lost when the libpfm4/libpfm4 presets merge was done. This re-applies those fixes.

2011-09-10

	* src/run_tests.sh: Cleaned up an old comment regarding CUDA pre-4.0, when it was not possible to access a GPU from multiple CPU threads.

	* src/: papi.c, papi_protos.h, papi_vector.c, papi_vector.h, components/README, components/cuda/linux-cuda.c, components/cuda/linux-cuda.h: Deleted the function pointer destroy_eventset from the PAPI vector table, and added cleanup_eventset instead. PAPI_destroy_eventset() requires an empty EventSet, so usually PAPI_cleanup_eventset() is called before PAPI_destroy_eventset(), which also sets the CompIdx to -1. This means PAPI_destroy_eventset() won't have any knowledge about components. However, in order to disable CUDA eventGroups and to free perfmon hardware on the GPU, knowledge of the CUDA component index is required. Hence, I replaced CUDA_destroy_eventset() with CUDA_cleanup_eventset() in the CUDA component. NOTE: Please make sure you call PAPI_cleanup_eventset() before calling PAPI_shutdown().

2011-09-09

	* src/: papi_protos.h, papi_vector.c, papi_vector.h, components/cuda/linux-cuda.c, components/cuda/linux-cuda.h: The CUDA component is now thread-safe. Starting in CUDA 4.0, multiple CPU threads can access the same CUDA context. This is a much easier programming model than pre-4.0, as threads using the same CUDA context can share memory, data, etc. Note that it is possible to create a different CUDA context for each thread, but then we are likely to run into the limitation that only one context can be profiled at a time.

2011-09-07

	* src/ctests/: do_loops.c, test_utils.c: Apply fixes to problems noticed by a static analysis tool. Provided by William Cohen at RedHat.

	* src/papi_events.csv: Update SandyBridge preset events. These were provided by Michel Brown at Bull.

	* src/libpfm4/lib/: pfmlib_gen_mips64.c, pfmlib_mips.c, pfmlib_mips_74k.c, pfmlib_mips_perf_event.c, pfmlib_mips_priv.h, events/gen_mips64_events.h, events/mips_74k_events.h: MIPS 74K little-endian perf event support; requires a 3.0.3+ kernel.

2011-09-06

	* src/perf_events.c: The warning I had print when nmi_watchdog was found was a bit much; make it a SUBDBG() call instead. I do wish there were a way to notify the user more visibly, because losing a counter (when you might only have 4 total to begin with) is a big deal, and most Linux vendors are starting to ship kernels with the nmi_watchdog enabled.

	* src/: linux-common.c, linux-common.h, perf_events.c: On newer Linux kernels (2.6.34+) the nmi_watchdog can steal one of the counters, reducing the total available by one. There's a bug in Linux where, if you try to use the full number of counters on such a system with a group leader, the sys_perf_open() call will succeed only to fail at read time (instead of returning the proper error code at open time). This patch attempts to work around the issue by detecting whether a watchdog timer is being used, and in that case re-using the existing KERNEL_CHECKS_SCHEDUABILITY_UPON_OPEN bugfix code.
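The watchdog detection mentioned in the 2011-09-06 entry above boils down to reading the nmi_watchdog sysctl. A minimal sketch (nmi_watchdog_active is a hypothetical helper name, not the actual linux-common.c code):

```c
#include <stdio.h>

/* Return 1 if the kernel NMI watchdog appears active (on 2.6.34+
 * kernels it consumes one hardware counter), 0 otherwise.  Reads the
 * standard Linux sysctl file for this setting. */
int nmi_watchdog_active(void)
{
	int value = 0;
	FILE *f = fopen("/proc/sys/kernel/nmi_watchdog", "r");
	if (f == NULL)
		return 0;		/* file absent: assume no watchdog */
	if (fscanf(f, "%d", &value) != 1)
		value = 0;
	fclose(f);
	return value != 0;
}
```

When the watchdog is active, a substrate can subtract one from the counters it advertises, rather than letting sys_perf_open() succeed and fail later at read time.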
	* src/papi_events.csv: We were missing a proper libpfm4 interlagos CPU name in the papi_events.csv file.

2011-09-02

	* src/libpfm4/: include/perfmon/perf_event.h, lib/Makefile, lib/pfmlib_intel_nhm_unc.c, lib/pfmlib_intel_x86.c, lib/pfmlib_intel_x86_priv.h, lib/pfmlib_priv.h, lib/events/amd64_events_fam10h.h, lib/events/amd64_events_k7.h, lib/events/amd64_events_k8.h, lib/events/intel_atom_events.h, lib/events/intel_core_events.h, lib/events/intel_coreduo_events.h, lib/events/intel_nhm_events.h, lib/events/intel_nhm_unc_events.h, lib/events/intel_p6_events.h, lib/events/intel_snb_events.h, lib/events/intel_wsm_events.h, lib/events/intel_wsm_unc_events.h, lib/events/intel_x86_arch_events.h: Fix "conflicts" from the libpfm4 import.

	* src/papi_libpfm4_events.c: Explicitly set num_native_events to zero at init time. Somehow the value was surviving fork/exec and making the fork/exec test cases fail on a recent Debian system.

	* src/perf_events.c: Set FD_CLOEXEC on the overflow signal handler fd. Otherwise, if we exec() with overflow enabled, the exec'd process will quickly die due to the lack of a signal handler. This patch is needed due to a change in behavior in Linux 3.0. Mark Krentel first noticed this problem.

	* src/: Rules.perfctr-pfm, Rules.pfm, Rules.pfm4_pe, Rules.pfm_pe: Remove the "unexport CFLAGS" lines from the Rules files.

	* src/: multiplex.c, papi_internal.c, utils/component.c: Fix a few warnings reported by gcc-4.6.

	* src/: configure, configure.in: Override auto-detection of the substrate if the user specifies what they want to build with. This allows building perfctr and perfmon2 PAPI on systems auto-detected as having perf_event support.

	* src/: configure, configure.in: Add a "--with-libpfm3" argument to configure that lets us specify libpfm3 for testing purposes.

	* src/solaris-niagara2.c: Fix solaris niagara2 build problems reported by tigrage on the PAPI forum.
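The FD_CLOEXEC fix described in the perf_events entry above uses the standard fcntl() pattern for marking a descriptor close-on-exec. A self-contained sketch (set_cloexec is a hypothetical helper, not the actual perf_events.c code):

```c
#include <fcntl.h>

/* Mark a descriptor close-on-exec so it does not leak into exec()'d
 * children -- the same idea applied to the perf_event overflow-signal
 * fd.  Read-modify-write preserves any other descriptor flags. */
int set_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);
	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}
```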
2011-08-30

	* src/configure: Regen.

2011-08-29

	* src/configure.in: Check for a requested interface to tweak build flags.

	* src/: configure, configure.in: Last bit for cross compiling...

	* src/: configure, configure.in: Better double quotes.

	* src/: configure, configure.in: There can be only 1. (choice of perfctr, perfmon, or perf events)

	* src/: configure, configure.in: Further refinement of the combinations of --with-perfctr, --with-perfmon, and --with-perf-events. True autotools cross compiling is not yet supported until we move to automake. I did trick it into doing a cross compile with:
	# ARCH=mips CC=scgcc ./configure --with-arch=mips --host=mips64el-gentoo-linux-gnu- --with-ffsll --with-libpfm4 --with-perf-events --with-virtualtimer=times --with-walltimer=gettimeofday --with-tls=__thread --with-CPU=mips
	# cross compiling should work differently... Wow, do I hate specifying mips in 3 places...

	* src/: config.h.in, configure, configure.in: Some fixes for cross compiling and for not including x86_cache_info.c when not ensured an x86.

	* src/Makefile.inc: Surround the component tests and cleanup recipes with a conditional; the version of sh that our aix machine has does not handle "for i in {Empty set};", treating it as a syntax error. NOTE: This requires GNU make; my shell-foo couldn't make sh happy, so for now GNU conditionals!

	* ChangeLogP414.txt, RELEASENOTES.txt: Update Release Notes and add ChangeLog for PAPI 4.1.4.

	* src/configure: Rebuild from configure.in with a version number bump to 4.1.4 in advance of a pending internal vendor release for Cray.

2012-02-13

	* src/components/net/linux-net.c: Repairing more coverity warnings.

2012-02-11

	* src/windows-common.c: Missed an instance of CPUs yesterday.

	* src/: papi_internal.c, threads.c: This change fixes two race conditions that are probably the cause of the pthrtough double-free error.
	When freeing a thread, we remove and free all eventsets belonging to that thread. This could race with the thread itself removing the eventset, causing some ESI fields to be freed twice. The problem was found using the Valgrind 3.8 Helgrind tool: valgrind --tool=helgrind --free-is-write=yes ctests/pthrtough. In order for Helgrind to work, I had to temporarily modify PAPI to use POSIX pthread mutexes for locking. Is there any reason we don't use these all the time?

2012-02-10

	* src/utils/: avail.c, component.c, event_chooser.c, native_avail.c: Fix one more case of "CPU's" in the print header code. Also remove the extraneous "The following correspond to fields in the PAPI_event_info_t structure." message.

	* src/: testlib/papi_test.h, testlib/test_utils.c, ctests/all_native_events.c, ctests/calibrate.c, ctests/code2name.c, ctests/hwinfo.c: Fix one more case of "CPU's" in the print header code. Also remove the extraneous "The following correspond to fields in the PAPI_event_info_t structure." message.

	* src/buildbot_configure_with_components.sh: Take infiniband out of the buildbot test.

	* src/: x86_cache_info.c, components/coretemp/linux-coretemp.c, components/lmsensors/linux-lmsensors.c, components/lustre/linux-lustre.c, components/net/linux-net.c, utils/event_chooser.c: Fix Coverity errors reported by Will Cohen.

	* src/: aix.c, any-proc-null.c, linux-common.c, papi.c, papi.h, papivi.h, solaris-niagara2.c, solaris-ultra.c, ctests/clockres_pthreads.c: Address Redhat bug 785975. The plural of CPU appears to be CPUs.

	* src/Makefile.inc: Patch to clean up dependencies, allowing for parallel makes. Patch due to Will Cohen from redhat.

2012-02-09

	* src/buildbot_configure_with_components.sh: Add the infiniband and mx components to the buildbot component tests.

	* src/components/net/tests/: net_values_by_code.c, net_values_by_name.c: Apply a patch suggested by Will Cohen to check for system return values.

	* src/components/lmsensors/linux-lmsensors.h: Added missing string header.

2012-02-08

	* man/... : update man pages one more time for the 4.2.1 release

	* release_procedure.txt: Make sure the generated html has the papi group id.

2012-02-07

	* src/multiplex.c: Fix the @file matching multiple files warning.

	* src/components/README: Clean up doxygen errors.

	* doc/Doxyfile-html: Typo introduced by the last commit.

	* doc/Doxyfile-html: Exclude linux-bgp.c from doxygen.

	* doc/Doxyfile-html: Make sure the component README file gets included in doxygen.

	* src/components/coretemp_freebsd/coretemp_freebsd.c: Clean up doxygen warnings in the freebsd coretemp component.

	* src/papi.h: Clean up some doxygen warnings related to the groupings.

	* src/components/example/example.c: Fix a doxygen warning in the example component.

	* doc/Doxyfile-html: Remove some cruft from the doxygen config file. This addresses the warning about dot not being found at /sw/bin/dot.

	* src/components/: infiniband/linux-infiniband.c, infiniband/linux-infiniband.h, cuda/linux-cuda.c, cuda/linux-cuda.h: Cleaned up some doxygen issues.

	* src/components/lmsensors/linux-lmsensors.c: Removed long-forgotten debug outputs.

	* src/papi_libpfm4_events.c: Fix minor doxygen typos.

	* src/components/vmware/vmware.c: Add params for doxygen.

	* man/... : update man pages

2012-02-06

	* doc/Doxyfile-man1: Fix a typo in a doxygen config file.

2012-02-03

	* release_procedure.txt, doc/Doxyfile, doc/Doxyfile-everything, doc/Doxyfile-html, doc/Doxyfile.utils, doc/Doxyfile-man1, doc/Doxyfile-man3, doc/Makefile, doc/doxygen_procedure.txt: Rework the doxygen configuration files.

	* RELEASENOTES.txt: Update for the impending release.

	* ChangeLogP421.txt, RELEASENOTES.txt: Updates for the impending release.

2012-02-02

	* src/: papi.c, papi.h: Minor tweaks for doxygen errors.

2012-02-01

	* src/components/lmsensors/: Rules.lmsensors, configure.in: Fixed the configure error message and a rules link error for shared object linking. Thanks, Will Cohen.

	* src/components/appio/Rules.appio: Correct pathing.

	* src/ctests/api.c: One minor fix to check for PAPI_ENOEVNT when testing PAPI_flops. If PAPI_FP_OPS does not exist on the processor (like on many of them), then this test fails.

2012-01-31

	* src/ctests/multiattach.c: Increase the acceptance criteria for cycles.

	* src/Makefile.in, src/configure, src/configure.in, src/papi.h, doc/Doxyfile, doc/Doxyfile-everything, doc/Doxyfile.utils, papi.spec: Update the version number to 4.2.1 in preparation for release.

	* src/ctests/prof_utils.c: Correct a warning on 32-bit builds about casting caddr_t to (long long). Specifically:
	prof_utils.c:234: warning: cast from pointer to integer of different size
	prof_utils.c:248: warning: cast from pointer to integer of different size
	prof_utils.c:262: warning: cast from pointer to integer of different size
	We first cast to unsigned long and then on to long long. (This may be overkill, but it's for a printf format string.)

2012-01-30

	* release_procedure.txt: Add the correct path for doxygen on ICL machines.

	* src/papi_events.csv: Modify the Intel Sandybridge PAPI_FP_OPS and PAPI_FP_INS events to not count x87 fp instructions. The problem is that the current predefines were made by adding 5 events. With the NMI watchdog stealing an event and/or hyperthreading reducing the number of available counters by half, we just couldn't fit. This raises the potential for people using x87-compiled floating point on Sandybridge to get 0 FP_OPS. This is only likely if running a 32-bit kernel and *not* compiling your code with -msse. A long-term solution might be trying to find a better set of FP predefines for Sandybridge.

	* src/components/: lustre/linux-lustre.c, mx/linux-mx.c: Some really minor cleanups to the lustre and mx components.

2012-01-28

	* src/components/example/: example.c, tests/example_basic.c: Update the example component. Cleans up code, adds some more documentation, and adds counter write support.

2012-01-27

	* src/papi_user_events.c: Minor cleanups for user events.

	* src/libpfm4/: README, include/perfmon/pfmlib.h, lib/Makefile, lib/pfmlib_amd64.c, lib/pfmlib_common.c, lib/pfmlib_priv.h: Fix "conflicts" in the git import of libpfm4.

	* src/libpfm4/lib/: pfmlib_amd64_fam11h.c, events/amd64_events_fam11h.h: Initial revision.

2012-01-26

	* src/papi_fwrappers.c: Escape the include directives in the documentation. (Cleans up doxygen.)

	* src/components/README: Adding vmware to the component README.

	* src/components/vmware/: Makefile.vmware.in, PAPI-VMwareComponentDocument.pdf, Rules.vmware, VMwareComponentDocument.txt, configure, configure.in, vmware.c, vmware.h: Merge the vmware branch to head.

	* src/perf_events.c: Set fast_counter_read back to 0 on x86/x86_64 perf_events, as rdpmc counter access is not currently supported. There are patches floating around that enable this (although performance is still a long way from perfctr), but they will not likely be merged for a while, and the perf_events substrate will require a lot of extra code to support it once it does make it into a shipping kernel.

	* src/buildbot_configure_with_components.sh: Remove acpi from the buildbot configure script.

2012-01-25

	* src/components/mx/: Makefile.mx.in, Rules.mx, configure, configure.in, linux-mx.c, linux-mx.h, tests/Makefile, tests/mx_basic.c, tests/mx_elapsed.c, utils/fake_mx_counters.c, utils/sample_output: Re-write of the MX component.
	+ Add tests
	+ Modernize code
	+ Remove the need to run ./configure in the mx directory
	+ Add a fake mx_counters program that lets you test the component on a machine without myrinet installed

	* src/components/: README, acpi/Rules.acpi, acpi/linux-acpi-memory.c, acpi/linux-acpi.c, acpi/linux-acpi.h: Remove the ACPI component. It was one of the oldest components and needed a lot of cleanup work, and it turns out that the main useful event it provided (temperature) isn't available on modern machines/kernels (coretemp should be used instead).
2012-01-23 * src/perf_events.c: Restored Phil's changes that I inadvertently clobbered with my last commit :( * src/perf_events.c: Remove a warning about an uninitialized variable. * src/utils/: component.c, event_info.c, native_avail.c: Update the Doxygen comments on these utilities to have the command line options listed in a list like the other utils. * src/perf_events.c: More improvements to the read path for multiplexed counters. Now the case for bad kernel behavior is built in, and is not required with a #define. Basically, there are situations when either enabled or running is zero but not both. This could result in a divide by 0 in the worst case, as was observed by Tushar Mohan in papiex. You could trigger it by doing a read immediately after doing a start with perf events and use a FORMAT_SCALE argument. Now the logic goes, assuming mpxing. 1) if (running=enabled) return raw counter 2) if (running && enabled) scale counter by ratio 3) else warn in debug mode return raw counter Apparently we need a test case that does a read immediately after a start. That's a hole. Tested on brutus, core2 2.6.36 Here's the original report. ------------------- Model string and code : Intel(R) Pentium(R) M processor 1600MHz (9) Linux thinkpad 2.6.38-02063808-generic #201106040910 SMP Sat Jun 4 10:51:30 UTC 2011 i686 GNU/Linux PAPI Version: 4.2.0.0 I think I ran into a bug similar to what we ran with MIPS. With the latest PAPI (from CVS), on an x86 (32-bit machine), when using papiex with multiplex with anything more than two events, I get a floating point exception in PAPI during the PAPI_read call. 
    On enabling debugging in the substrate, I think the problem is the same (namely a division by zero, because some event had a zero time of running):

    libpapiex debug: 24625,0x0,papiex_thread_init_routine Starting counters with PAPI_start
    SUBSTRATE:perf_events.c:pe_enable_counters:953:24625 ioctl(enable): ctx: 0x96a4bc8, fd: 3
    SUBSTRATE:perf_events.c:pe_enable_counters:953:24625 ioctl(enable): ctx: 0x96a4bc8, fd: 5
    libpapiex debug: 24625,0x0,papiex_thread_init_routine Calling PAPI_lock before critical section
    libpapiex debug: 24625,0x0,papiex_thread_init_routine Released PAPI lock
    libpapiex debug: 24625,0x0,papiex_start START POINT 0 LABEL
    libpapiex debug: 24625,0x0,papiex_start Reading counters (PAPI_read) to get initial counts
    SUBSTRATE:perf_events.c:_papi_pe_read:1147:24625 read: fd: 3, tid: 0, cpu: -1, ret: 56
    SUBSTRATE:perf_events.c:_papi_pe_read:1148:24625 read: 2 1341021 1341021
    SUBSTRATE:perf_events.c:_papi_pe_read:1181:24625 (papi_pe_buffer[3] 33405 * tot_time_enabled 1341021) / tot_time_running 1341021
    SUBSTRATE:perf_events.c:_papi_pe_read:1181:24625 (papi_pe_buffer[5] 44552 * tot_time_enabled 1341021) / tot_time_running 1341021
    SUBSTRATE:perf_events.c:_papi_pe_read:1147:24625 read: fd: 5, tid: 0, cpu: -1, ret: 40
    SUBSTRATE:perf_events.c:_papi_pe_read:1148:24625 read: 1 214777 0
    SUBSTRATE:perf_events.c:_papi_pe_read:1181:24625 (papi_pe_buffer[3] 0 * tot_time_enabled 214777) / tot_time_running 0

    The above debug log is for three events: PAPI_TOT_CYC, PAPI_TOT_INS and PAPI_L1_DCM. Multiplexing works with two events. Adding the third (any event), gives this error. Basically, the floating point exception kills the program, and PAPI_read never returns. I think I know why papiex always hits this bug: It's because right after starting the counters with PAPI_start, papiex does a PAPI_read to store the initial values of the counters in a tmp variable. These are then subtracted from the final counter values. Should we put a deliberate delay?
    Of course, the real bug should be fixed in PAPI.
    ----

  * src/utils/event_info.c: Major re-write of the papi_xml_event_info program.
    + Remove event code numbers, as they are not stable run-to-run
    + Add some Doxygen comments
    + Remove some wrong assumptions that could cause potential buffer overflows
    + Improve usage information

2012-01-20

  * src/components/lustre/: Rules.lustre, linux-lustre.c, linux-lustre.h, fake_proc/fs/lustre/llite/hpcdata-ffff81022a732800/read_ahead_stats, fake_proc/fs/lustre/llite/hpcdata-ffff81022a732800/stats, tests/Makefile, tests/lustre_basic.c: Finish the re-write of the lustre component. It would be nice if someone with access to a machine with a lustre filesystem could test this for us.

  * src/: papi_internal.c, components/lustre/linux-lustre.c: Update the component initialization code so that it can handle a PAPI ERROR return gracefully. Previously there was no way to indicate initialization failure besides just setting num_native_events to 0.

2012-01-19

  * src/components/lustre/: linux-lustre.c, linux-lustre.h: First pass at cleaning up the lustre component. It should now properly report no events when no lustre filesystems are available.

2012-01-11

  * src/papi_events.csv: Add AMD fam12h support to the events file. Right now it is just an alias to the similar fam10h event list; this can be split out if necessary once we find a tester with the hardware.

  * src/libpfm4/: README, docs/man3/pfm_get_event_next.3, docs/man3/pfm_get_pmu_info.3, include/perfmon/perf_event.h, include/perfmon/pfmlib.h, lib/Makefile, lib/pfmlib_amd64.c, lib/pfmlib_amd64_priv.h, lib/pfmlib_common.c, lib/pfmlib_perf_event.c, lib/pfmlib_priv.h, lib/events/intel_coreduo_events.h, lib/events/perf_events.h, perf_examples/Makefile, perf_examples/perf_util.c, perf_examples/perf_util.h, perf_examples/self.c, perf_examples/task_smpl.c, perf_examples/x86/bts_smpl.c: Fix "merge" conflicts with libpfm4 merge.
  * src/libpfm4/lib/: pfmlib_amd64_fam12h.c, events/amd64_events_fam12h.h: Initial revision

  * src/papi_libpfm4_events.c: Properly use the pfm_get_event_next() iterator to find next event. Without this, on AMD Fam10h some events are missed. Some events are still missed due to libpfm4 bug, this will be fixed once I update the libpfm4 tree included with PAPI. Note, enumeration fixes like this often break things, so please test if possible.

  * src/papi_events.csv: Update the coreduo (not core2) events. Most notably the FP events were wrong. This, along with a forthcoming libpfm4 update, make all the CTESTS pass on an old Yonah coreduo laptop I have.

2012-01-05

  * src/ctests/api.c: Make the api test actually test PAPI_flops() as it claims to do, rather than PAPI_flips(). Patch thanks to: Emilio De Camargo Francesquini

  * src/papi_hl.c: Fix some copy-and-paste documentation remnants in the papi_hl.c file, mostly where it said FLIPS where it meant FLOPS.

2012-01-04

  * src/utils/native_avail.c: Update papi_native_avail to *not* print the event codes, as these are not guaranteed to be stable from run to run. Also fix up the formatting and print some component info too. Please try and let me know if you don't like the new output.

  * src/: configure, configure.in: Respect a FORCED option in configure.

2011-12-22

  * src/Rules.pfm4_pe: Remove perfmon.h from MISCHDRS.

2011-12-20

  * src/: Rules.perfctr, Rules.perfctr-pfm, Rules.pfm, Rules.pfm4_pe, Rules.pfm_pe, linux-lock.h, mb.h: Merry Christmas ARM users. This patch fixes the SMP ARM issues reported by Harald Servat. Also, adds proper header dependency checking in the Rules files. People, when you add headers, please add them to the dependency lines so everything gets rebuilt properly. The new implementation of SMP locks is very pedantic; that is, the locks are not the fastest, but they do use atomics and avoid kernel intervention. Passed on our 2 core ARM v7.
    All pthreads tests now pass, except the ones that also fail in the single processor case, usually due to a missing event. Samples:

    mucci@panda:~/papi.head/src$ uname -a
    Linux panda 3.0.0 #2 SMP Fri Jul 29 16:23:54 EDT 2011 armv7l GNU/Linux
    mucci@panda:~/papi.head/src$ hostname
    panda
    mucci@panda:~/papi.head/src$ cat /proc/cpuinfo
    Processor: ARMv7 Processor rev 2 (v7l)
    processor: 0
    BogoMIPS: 2007.19
    processor: 1
    BogoMIPS: 1965.18
    Features: swp half thumb fastmult vfp edsp thumbee neon vfpv3
    CPU implementer: 0x41
    CPU architecture: 7
    CPU variant: 0x1
    CPU part: 0xc09
    CPU revision: 2
    Hardware: OMAP4 Panda board
    Revision: 0020
    Serial: 0000000000000000
    mucci@panda:~/papi.head/src$ ./ctests/locks_pthreads
    Creating 2 threads
    10000 iterations took 13489 us.
    Running 44480 iterations
    Expected: 88960 Received: 88960
    locks_pthreads.c PASSED
    mucci@panda:~/papi.head/src$ ./ctests/pthrtough
    Creating 2 threads for 1000 iterations each of:
    register create_eventset destroy_eventset unregister
    pthrtough.c PASSED
    mucci@panda:~/papi.head/src$ ./ctests/pthrtough2
    Creating 2000 threads for 1 iterations each of:
    register create_eventset destroy_eventset unregister
    Failed to create thread: 238
    Continuing test with 237 threads.
    pthrtough2.c PASSED
    mucci@panda:~/papi.head/src$ ./ctests/thrspecific
    Thread 0x40ae1470 started, specific data is at 0xbea9c6d4
    Thread 0x40021000 started, specific data is at 0xbea9c6c4
    Thread 0x4244d470 started, specific data is at 0xbea9c6c8
    Thread 0x4138d470 started, specific data is at 0xbea9c6d0
    Thread 0x41c4d470 started, specific data is at 0xbea9c6cc
    Entry 0, Thread 0x41c4d470, Data Pointer 0xbea9c6cc, Value 4000000
    Entry 1, Thread 0x40021000, Data Pointer 0xbea9c6c4, Value 500000
    Entry 2, Thread 0x40ae1470, Data Pointer 0xbea9c6d4, Value 1000000
    Entry 3, Thread 0x4244d470, Data Pointer 0xbea9c6c8, Value 8000000
    Entry 4, Thread 0x4138d470, Data Pointer 0xbea9c6d0, Value 2000000
    thrspecific.c PASSED
    mucci@panda:~/papi.head/src$ ./ctests/krentel_pthreads
    program_time = 6, threshold = 20000000, num_threads = 3
    launched timer in thread 0
    launched timer in thread 1
    launched timer in thread 3
    launched timer in thread 2
    [1] time = 1, count = 7, iter = 5, rate = 1400.0/Kiter
    [2] time = 1, count = 7, iter = 5, rate = 1400.0/Kiter
    [0] time = 1, count = 7, iter = 5, rate = 1400.0/Kiter
    [3] time = 1, count = 7, iter = 5, rate = 1400.0/Kiter
    [1] time = 2, count = 25, iter = 16, rate = 1562.5/Kiter
    [0] time = 2, count = 25, iter = 16, rate = 1562.5/Kiter
    [3] time = 2, count = 25, iter = 16, rate = 1562.5/Kiter
    [2] time = 2, count = 25, iter = 16, rate = 1562.5/Kiter
    [1] time = 3, count = 25, iter = 16, rate = 1562.5/Kiter
    [2] time = 3, count = 25, iter = 16, rate = 1562.5/Kiter
    [0] time = 3, count = 25, iter = 16, rate = 1562.5/Kiter
    [3] time = 3, count = 25, iter = 16, rate = 1562.5/Kiter
    [1] time = 4, count = 25, iter = 16, rate = 1562.5/Kiter
    [0] time = 4, count = 25, iter = 16, rate = 1562.5/Kiter
    [3] time = 4, count = 25, iter = 16, rate = 1562.5/Kiter
    [2] time = 4, count = 25, iter = 16, rate = 1562.5/Kiter
    [3] time = 5, count = 25, iter = 16, rate = 1562.5/Kiter
    [0] time = 5, count = 25, iter = 16, rate = 1562.5/Kiter
    [2] time = 5, count = 25, iter = 16, rate = 1562.5/Kiter
    [1] time = 5, count = 26, iter = 17, rate = 1529.4/Kiter
    [2] time = 6, count = 25, iter = 16, rate = 1562.5/Kiter
    [0] time = 6, count = 27, iter = 17, rate = 1588.2/Kiter
    done
    krentel_pthreads.c PASSED

2011-12-15

  * src/papi_libpfm_presets.c: Change PAPI_PERFMON_EVENT_FILE environment variable name to PAPI_CSV_EVENT_FILE since it's not just for perfmon anymore.

  * src/: configure, configure.in: Open mouth, insert foot; fix perfctr configure by not testing a library we have not built yet.

2011-12-14

  * src/: configure, configure.in: Missed one more place where we tested perfctr != "no"

  * src/: configure, configure.in: Fix a typo in the perfctr section; it was causing a machine to default to perfctr when it had no performance interface. (a centos vm image with a 2.6.18 kernel) Also checks that we actually have perfctr if we specify --with-perfctr.

2011-12-08

  * src/components/cuda/: Makefile.cuda.in, Rules.cuda, configure, configure.in, linux-cuda.c, linux-cuda.h: Added auto-detection of CUDA version to PAPI CUDA Component. Reason is, the interface has changed between CUDA/CUPTI 4.0 and 4.1. PAPI now supports both CUDA versions without any exposure to the users. Configure step is unchanged and no additional knowledge of which CUDA version is installed is required.

2011-12-03

  * src/components/appio/: CHANGES, README, Rules.appio, appio.c, appio.h, tests/Makefile, tests/appio_list_events.c, tests/appio_values_by_code.c, tests/appio_values_by_name.c: [no log message]

2011-11-25

  * src/linux-timer.c: Fix compilation warning if you specify --with-walltime=gettimeofday

  * src/linux-timer.c: Fix the build on Linux systems using mmtimer

  * src/linux-common.c: Update the linux MHz detection code to use bogoMIPS when there is no MHz field available in /proc/cpuinfo. This gives roughly correct MHz on ARM, and the MIPS workaround should also still work.

2011-11-23

  * src/components/net/linux-net.c: Fix compile errors in a debug message.
    (pathname didn't exist but we are working on NET_PROC_FILE)

2011-11-22

  * src/components/net/: linux-net.c, tests/net_values_by_code.c, tests/net_values_by_name.c: Change the ping command in the net tests to not use &> to redirect to NULL. This would work on a system with csh, but on systems with a bash shell this runs ping in the background instead, so the test finishes before ping can generate any packets.

  * src/components/net/linux-net.c: Fix slight bug in the net component, where a memset() had the wrong arguments. This made for weird results in the case where we start/stop quickly enough that we return the initial data.

  * src/components/net/: CHANGES, Makefile.net.in, README, Rules.net, configure, configure.in, linux-net.c, linux-net.h, tests/Makefile, tests/net_list_events.c, tests/net_values_by_code.c, tests/net_values_by_name.c: Replace net component with updated version written by Jose Pedro Oliveira
    * Dynamically detects the network interfaces (i.e. the ones listed in /proc/net/dev)
    * No longer needs to fork/exec the external ifconfig command and parse its output. It now reads the Linux kernel network statistics directly from /proc/net/dev.
    * Each network interface now has 16 events instead of 13 (all counters in /proc/net/dev).
    * Adds support for PAPI_event_name_to_code()
    * Adds a couple of small tests/examples

2011-11-16

  * doc/Doxyfile-everything: Fix the exclude libpfm/perfctr config.

2011-11-10

  * src/perf_events.c: Only scale when running != enabled. Now verified on ig, brutus and the malta

  * src/perf_events.c: Further tuneups for mpx'ing. Previous commit broke systems with valid return values from perf_events for running & enabled. My attempt at scaling in long long world caused an overflow which led to a negative number when passed up the chain. Also consolidated types... best way to avoid this stuff is to start as the type you are ending as.
    Now we use some better integer scaling... guaranteed within +-0.5% of the actual scaled value of enabled / running. New results on brutus:

    multiplex1
    case1: Does PAPI_multiplex_init() not break regular operation?
    Added PAPI_TOT_CYC
    Added PAPI_FP_INS
    case1: PAPI_TOT_CYC PAPI_FP_INS
    case1: 2739865106 600002876
    case2: Does setmpx/add work?
    Added PAPI_TOT_CYC
    Added PAPI_FP_INS
    case2: PAPI_TOT_CYC PAPI_FP_INS
    case2: 2739678237 600002258
    case3: Does add/setmpx work?
    Added PAPI_TOT_CYC
    Added PAPI_FP_INS
    case3: PAPI_TOT_CYC PAPI_FP_INS
    case3: 2739847832 600002298
    case4: Does add/setmpx/add work?
    Added PAPI_TOT_CYC
    Added PAPI_FP_INS
    case4: PAPI_TOT_CYC PAPI_FP_INS
    case4: 2737832980 600013404
    case5: Does setmpx/add/add/start/read work?
    Added PAPI_TOT_CYC
    Added PAPI_FP_INS
    read @start counter[0]: 7106
    read @stop counter[0]: 2740387017
    difference counter[0]: 2740379911
    read @start counter[1]: 0
    read @stop counter[1]: 600017169
    difference counter[1]: 600017169
    multiplex1.c PASSED

2011-11-09

  * src/components/cuda/linux-cuda.c: For the CUDA Component, PAPI_read() now accumulates event values. This has to be explicitly done in PAPI because CUPTI automatically resets all counter values to 0 after a read. (PAPI_start()/stop() continues to reset the values to 0)

  * src/perf_events.c: Last of the multiplex fixes to perf events. The root of all evil was this:

    counts[i] = ( uint64_t ) ( ( double ) buffer[count_idx] *
        ( double ) buffer[get_total_time_enabled_idx( )] /
        ( double ) buffer[get_total_time_running_idx( )] );

    In addition to improper casting to uints... (papi returns int64s), using floating point arith is a no-no. Plus this resulted in divide by zeros...
    Before:
    SUBSTRATE:perf_events.c:_papi_pe_read:1155:12218 read: fd: 3, tid: 0, cpu: -1, buffer[0-2]: 0x6cba, 0x0, 0x0, ret: 24
    SUBSTRATE:perf_events.c:_papi_pe_read:1155:12218 read: fd: 4, tid: 0, cpu: -1, buffer[0-2]: 0x23, 0x0, 0x0, ret: 24
    SUBSTRATE:perf_events.c:_papi_pe_read:1155:12218 read: fd: 3, tid: 0, cpu: -1, buffer[0-2]: 0x6de72b5d, 0x8ae0fa80, 0x8ae0fa80, ret: 24
    SUBSTRATE:perf_events.c:_papi_pe_read:1155:12218 read: fd: 4, tid: 0, cpu: -1, buffer[0-2]: 0x4c4b46b, 0x8ae0fa80, 0x8ae0fa80, ret: 24

    So kernel is good, but errors in multiplexed scaling.

    case5: Does setmpx/add/add/start/read work?
    Added PAPI_TOT_CYC
    Added PAPI_FP_INS
    read @start counter[0]: 9223372034707292159
    read @stop counter[0]: 1843791732
    difference counter[0]: -9223372032863500427
    multiplex1.c FAILED Line # 389

    With fix:
    SUBSTRATE:perf_events.c:_papi_pe_read:1151:12821 read: fd: 3, tid: 0, cpu: -1, buffer[0-2]: 0x6782, 0x0, 0x0, ret: 24
    SUBSTRATE:perf_events.c:_papi_pe_read:1151:12821 read: fd: 4, tid: 0, cpu: -1, buffer[0-2]: 0x0, 0x0, 0x0, ret: 24
    SUBSTRATE:perf_events.c:_papi_pe_read:1151:12821 read: fd: 3, tid: 0, cpu: -1, buffer[0-2]: 0x6de725dc, 0x8ae0fa80, 0x8ae0fa80, ret: 24
    SUBSTRATE:perf_events.c:_papi_pe_read:1151:12821 read: fd: 4, tid: 0, cpu: -1, buffer[0-2]: 0x4c4b400, 0x8ae0fa80, 0x8ae0fa80, ret: 24
    read @start counter[0]: 26498
    read @stop counter[0]: 1843865052
    difference counter[0]: 1843838554
    read @start counter[1]: 0
    read @stop counter[1]: 80000000
    difference counter[1]: 80000000
    SUBSTRATE:perf_events.c:_papi_pe_update_control_state:1288:12821 Called with count == 0
    SUBSTRATE:papi_libpfm4_events.c:_papi_libpfm_shutdown:1178:12821 shutdown
    multiplex1.c PASSED

    New code is vastly simpler and smaller and checks for bad kernel behavior:

    int64_t tot_time_running = papi_pe_buffer[get_total_time_running_idx( )];
    int64_t tot_time_enabled = papi_pe_buffer[get_total_time_enabled_idx( )];
    #ifdef BRAINDEAD_MULTIPLEXING
    if (tot_time_enabled == 0) tot_time_enabled = 1;
    if (tot_time_running == 0) tot_time_running = 1;
    #else
    /* If we are convinced this platform's kernel is fully operational, then
       this stuff will never happen. If it does, then BRAINDEAD_MULTIPLEXING
       needs to be enabled. */
    if ((tot_time_running == 0) && (papi_pe_buffer[count_idx])) {
        PAPIERROR("This platform has a kernel bug in multiplexing, count is %lld (not 0), but time running is 0.\n", papi_pe_buffer[count_idx]);
        return PAPI_EBUG;
    }
    if ((tot_time_enabled == 0) && (papi_pe_buffer[count_idx])) {
        PAPIERROR("This platform has a kernel bug in multiplexing, count is %lld (not 0), but time enabled is 0.\n", papi_pe_buffer[count_idx]);
        return PAPI_EBUG;
    }
    #endif
    pe_ctl->counts[i] = (papi_pe_buffer[count_idx] * tot_time_enabled) / tot_time_running;

    Also, renamed all instances of 'buffer' to papi_pe_buffer because buffer is a global variable on MIPS/Linux/libc. Yikes!
    (gdb) whatis buffer
    type = struct utmp *

  * src/ctests/multiplex1.c: Made sure that PAPI_TOT_CYC is the first event added to multiplexing event set. This will demonstrate the bug in perf_event multiplexing arithmetic in case5 on MIPS and other perf_event subsystems that likely have some breakage in the kernels handling of multiplexing. The common bug is that the perf_event subsystem does not fill in the second and third elements of the 24 byte read that gets returned from the kernel. These values are time_enabled and time_running. MIPS as of 3.0.3 just fills this in after a HZ tick has happened. Workarounds are pretty simple in the low level layer...

    A buggy output looks like this (3.0.3 MIPS/Linux Big Endian)

    -bash-4.1$ ./ctests/multiplex1
    case1: Does PAPI_multiplex_init() not break regular operation?
    Added PAPI_TOT_CYC
    Added PAPI_FP_INS
    case1: PAPI_TOT_CYC PAPI_FP_INS
    case1: 1843775252 80000000
    case2: Does setmpx/add work?
    Added PAPI_TOT_CYC
    Added PAPI_FP_INS
    case2: PAPI_TOT_CYC PAPI_FP_INS
    case2: 1843773254 80000037
    case3: Does add/setmpx work?
    Added PAPI_TOT_CYC
    Added PAPI_FP_INS
    case3: PAPI_TOT_CYC PAPI_FP_INS
    case3: 1843772919 80000037
    case4: Does add/setmpx/add work?
    Added PAPI_TOT_CYC
    Added PAPI_FP_INS
    case4: PAPI_TOT_CYC PAPI_FP_INS
    case4: 1843773959 80000037
    case5: Does setmpx/add/add/start/read work?
    Added PAPI_TOT_CYC
    Added PAPI_FP_INS
    read @start counter[0]: 9223372034707292159
    read @stop counter[0]: 1843784577
    difference counter[0]: -9223372032863507582
    multiplex1.c FAILED Line # 389
    Error: Difference in start and stop resulted in negative value!

2011-11-08

  * src/components/cuda/: linux-cuda.c, linux-cuda.h: Updated CUDA component for CUPTI 4.1 (RC1). Note, SetCudaDevice() should now work with the latest CUDA 4.1 version.

2011-11-07

  * src/components/coretemp/linux-coretemp.c: Update coretemp to better handle sparse numbering of the inputs.

  * doc/Doxyfile-everything: Exclude the libpfm* and perfctr-* directories from consideration when generating Doxygen docs.

  * src/: papi.h, components/acpi/linux-acpi.h, components/coretemp_freebsd/coretemp_freebsd.c, components/cuda/linux-cuda.h, components/infiniband/linux-infiniband.h, components/mx/linux-mx.h, components/net/linux-net.h: Place a space in < your name here > to cleanup doxygen warnings.

  * src/perf_events.c: Only perf event systems that have FAST counter reads and FAST hw timer access are x86...

  * src/linux-common.c: MIPS clock and Linux fixup code

  * src/components/example/example.c: A little more documentation on which of the component vector function pointers are relevant.

  * src/papi_vector.c: Tested the dummy get_{real,virt}_{cyc,usec} functions on zeus, they appear to work.

  * src/components/example/tests/example_multiple_components.c: Another fix to properly skip the multiple component case if CPU component not available.

  * src/components/example/tests/example_multiple_components.c: Skip the test if no CPU component enabled, rather than fail.
2011-11-04

  * src/components/example/example.c: Free example_native_table with papi_free, glibc didn't like it if we just called free. (we allocate it with papi_calloc)

  * man/...: Version number bump. (since the pages are quantifiably different from those released in 4.2.0)

  * doc/: Doxyfile, Doxyfile-everything, Doxyfile.utils: Bump version number in the doxygen config files.

  * src/components/example/example.c: _papi_example_shutdown_substrate does not have any arguments.

  * src/components/net/linux-net.c: Include ctype.h for isspace().

  * release_procedure.txt: release_procedure now reflects the correct version of doxygen to use.

  * src/buildbot_configure_with_components.sh: Do not always configure with no cpu counters; allow this to be passed in. Allows us to use one script for both types of builds we test.

  * delete_before_release.sh, src/buildbot_configure_with_components.sh: Create a script for buildbot to configure with several components. Buildbot runs all commandline arguments through a sanitization before passing them to sh. Thus --with-configure="a b c" => '--with-configure="a b c"' which is bad. delete_before_release.sh has been instructed to remove this file.

  * man/...: Rebuild the manpages with doxygen 1.7.4 to remove the 's at the end of sentences. The html output looks clean.

2011-11-03

  * src/: multiplex.c, papi.c: Fix some gcc-4.6 compile warnings complaining that retval was being set but not used.

  * src/papi.c: Add some extra comments to the PAPI_num_cmp_hwctrs() code that describe its limitations a bit better.

2011-11-02

  * src/: ctests/overflow_allcounters.c, testlib/test_utils.c: Add lots of debugging to make results of overflow_allcounters test a bit more clear.

  * src/components/coretemp/tests/coretemp_pretty.c: coretemp_pretty wasn't printing the description for fan inputs.
    The result on an apple MacBook Pro (running Linux) now looks like this:

    Trying all coretemp events
    Found coretemp component at cid 2
    hwmon0.temp1_input value: 33.50 degrees C, applesmc module, label TB0T
    hwmon0.temp2_input value: 33.50 degrees C, applesmc module, label TB1T
    hwmon0.temp3_input value: 32.00 degrees C, applesmc module, label TB2T
    hwmon0.temp4_input value: 0.00 degrees C, applesmc module, label TB3T
    hwmon0.temp5_input value: 62.25 degrees C, applesmc module, label TC0D
    hwmon0.temp6_input value: 54.25 degrees C, applesmc module, label TC0F
    hwmon0.temp7_input value: 57.25 degrees C, applesmc module, label TC0P
    hwmon0.temp8_input value: 69.00 degrees C, applesmc module, label TG0D
    hwmon0.temp9_input value: 58.00 degrees C, applesmc module, label TG0F
    hwmon0.temp10_input value: 51.25 degrees C, applesmc module, label TG0H
    hwmon0.temp11_input value: 58.25 degrees C, applesmc module, label TG0P
    hwmon0.temp12_input value: 60.75 degrees C, applesmc module, label TG0T
    hwmon0.temp13_input value: 62.25 degrees C, applesmc module, label TN0D
    hwmon0.temp14_input value: 59.25 degrees C, applesmc module, label TN0P
    hwmon0.temp15_input value: 49.00 degrees C, applesmc module, label TTF0
    hwmon0.temp16_input value: 54.00 degrees C, applesmc module, label Th2H
    hwmon0.temp17_input value: 58.75 degrees C, applesmc module, label Tm0P
    hwmon0.temp18_input value: 31.50 degrees C, applesmc module, label Ts0P
    hwmon0.temp19_input value: 44.25 degrees C, applesmc module, label Ts0S
    hwmon0.fan1_input value: 1999 RPM, applesmc module, label Left side
    hwmon0.fan2_input value: 2003 RPM, applesmc module, label Right side
    coretemp_pretty.c PASSED

  * src/components/coretemp/: linux-coretemp.c, linux-coretemp.h, tests/coretemp_pretty.c: Make the coretemp code a bit pickier about which events it supports. Add descriptions to the events. Also add support for Voltage (in*) events.
    On an amd14h machine I have access to, coretemp_pretty now prints:

    Trying all coretemp events
    Found coretemp component at cid 2
    hwmon0.in1_input value: 1.31 V, it8721 module, label ?
    hwmon0.in2_input value: 2.22 V, it8721 module, label ?
    hwmon0.in3_input value: 3.34 V, it8721 module, label +3.3V
    hwmon0.in4_input value: 1.02 V, it8721 module, label ?
    hwmon0.in5_input value: 1.52 V, it8721 module, label ?
    hwmon0.in6_input value: 1.13 V, it8721 module, label ?
    hwmon0.in7_input value: 3.26 V, it8721 module, label 3VSB
    hwmon0.in8_input value: 3.17 V, it8721 module, label Vbat
    hwmon0.temp1_input value: 28.00 degrees C, it8721 module, label ?
    hwmon0.temp2_input value: -128.00 degrees C, it8721 module, label ?
    hwmon0.temp3_input value: -128.00 degrees C, it8721 module, label ?
    hwmon0.fan1_input value: 0 RPM
    hwmon0.fan2_input value: 1320 RPM
    hwmon1.temp1_input value: 33.00 degrees C, jc42 module, label ?
    hwmon2.temp1_input value: 31.75 degrees C, jc42 module, label ?
    hwmon3.temp1_input value: 53.00 degrees C, radeon module, label ?
    hwmon4.temp1_input value: 53.12 degrees C, k10temp module, label ?
    coretemp_pretty.c PASSED

  * src/components/coretemp/: linux-coretemp.c, tests/coretemp_pretty.c: Cut and paste error slipped in to that last commit. Fixes a build issue.

  * src/components/coretemp/: linux-coretemp.c, tests/Makefile, tests/coretemp_pretty.c: Clean up coretemp with same cleanups done in example component. Add a new test, "coretemp_pretty" that prints coretemp results in a more user-friendly way.

  * man/: ... Rebuild the man pages with a newer version of doxygen. (older versions of doxygen had a nasty bug in man output.) Also reworked the utilities documentation to remove pages for the files. Thanks to Jose Pedro Oliveira for pointing this out.

  * src/components/example/tests/: Makefile, example_multiple_components.c: Add a test that makes sure you can have active EventSets on multiple components at the same time.
  * release_procedure.txt: Change PATH specification to include tcsh syntax; other minor syntax corrections.

  * src/components/example/example.c: More cleanups and documentation for the example component.

2011-11-01

  * src/components/example/example.c: Some more major overhaul of the example component. A lot more documentation, plus make it behave a lot more like a real component would.

  * doc/Doxyfile.utils: Turn off undocumented warnings for the utils doxygen run.

  * src/utils/: avail.c, command_line.c, cost.c, event_chooser.c, multiplex_cost.c: Add spaces to the comments so doxygen doesn't think is an xml tag.

2011-10-31

  * src/utils/: avail.c, clockres.c, command_line.c, component.c, cost.c, decode.c, error_codes.c, event_chooser.c, mem_info.c, multiplex_cost.c, native_avail.c: Remove the @file directive from the doxygen comment blocks for the utilities. This cleans up the generated man pages. (we no longer build *.c.1)

  * src/components/example/: example.c, tests/example_basic.c: Clarify in the example component that ->reset only gets called if an eventset is currently running. Extend the example_basic test to test PAPI_reset()

  * release_procedure.txt: Fix a maketarget typo.

  * release_procedure.txt: We now have a good version of doxygen installed on most icl run machines. (/mnt/scratch/sw/doxygen-1.7.5.1)

  * doc/doxygen_procedure.txt: [no log message]

  * release_procedure.txt: Update release_procedure to inform how to update the website documentation link.

2011-10-28

  * RELEASENOTES.txt: Correct the RELEASENOTES for some things I missed when reviewing it. It's Offcore events that we don't support on Nehalem/Westmere/Sandybridge. Also the power6 libpfm4 bug that was listed as an outstanding bug was fixed a long time ago.

  * src/components/coretemp/linux-coretemp.c: Have coretemp set the num_native_events field.
  * src/components/example/tests/example_basic.c: Update example test to print num_native_events, to help debug issues with other components not updating the value.

  * src/components/coretemp/: linux-coretemp.c, linux-coretemp.h: Fix typo enent -> event. Also remove residual LMSENSOR mentions from the coretemp header.

  * src/papi_libpfm4_events.c: Fix two memory leak locations. The attached patch reduces the number of lost memory blocks reported by valgrind from 234 to 39. It frees the memory allocated by the 4 strdups and the calloc functions in papi_libpfm4_events.c:allocate_native_event(). Patch by: José Pedro Oliveira

  * src/components/cuda/tests/Makefile: The change to pass the PAPI CC/CFLAGS to the component tests broke the nvidia test as it wants CC to be nvcc. So update that Makefile to use nvcc instead.

2011-10-27

  * src/components/example/tests/example_basic.c: Improve the example_basic component test to be much more comprehensive.

  * src/components/example/: example.c, tests/HelloWorld.c, tests/Makefile, tests/example_basic.c: Cleanup the example test. Fix various mistakes in the comments as well as add better error checking. Also rename the "HelloWorld" test to "example_basic"

  * src/components/coretemp/tests/Makefile: The coretemp_test target was example_test due to cut-and-paste error. Patch from Jose Pedro Oliveira

  * src/Makefile.inc: Add a component_tests dependency so that the component_tests are made during a make -j build

  * src/Makefile.inc: Make sure the component test makefiles get passed the CC and CFLAGS definitions.

  * src/components/coretemp/: linux-coretemp.c, tests/Makefile, tests/coretemp_basic.c: Fix up the coretemp component some more. Make sure the enumerate function returns PAPI_ENOEVNT if no events are available. Update the Makefile so it has proper dependencies. Update the test so it prints the first event available.
    (The latter based on a patch from Jose Pedro Oliveira)

  * src/: solaris-ultra.c, ctests/all_native_events.c: The solaris-ultra substrate was still broken. This is because recent changes to component bind time explicitly used the ->set_domain() call, and this vector was not set up in solaris_ultra. Also made the all_native_events test report the returned error value to aid in debugging problems like this in the future.

papi-papi-7-2-0-t/ChangeLogP440.txt

2012-04-17

  * 8782daed cvs2cl.pl delete_before_release.sh gitlog2changelog.py...: Update the release machinery for git. gitlog2changelog.py takes the output of git log and parses it to something like a changelog.

  * 80ff04a9 doc/Doxyfile-html: Cover up an instance of doxygen using full paths. Doxygen (up to 1.8.0, the most recent at this writing) would use full paths in directory dependencies, ignoring the use relative paths config option.

  * c556dad1 doc/Doxyfile-common papi.spec src/Makefile.in...: Bump the version for the PAPI 4.4.0 release.

2012-04-14

  * 27174c0b src/components/bgpm/CNKunit/CVS/Entries src/components/bgpm/CNKunit/CVS/Repository src/components/bgpm/CNKunit/CVS/Root...: Removed CVS stuff from Q code.

  * 970a2d50 src/configure src/configure.in src/linux-bgq.c...: Removed papi_events.csv parsing from Q code. (CVS stuff still needs to be taken care of.)

2012-04-13

  * 853d6c74 src/libpfm-3.y/lib/intel_corei7_events.h src/libpfm-3.y/lib/intel_wsm_events.h src/libpfm-3.y/lib/pfmlib_intel_nhm.c: Add missing update to libpfm3. Somehow during all of the troubles we had with importing libpfm3 into CVS, we lost some Nehalem/Westmere updates. Tested on a Nehalem machine to make sure this doesn't break anything.

2012-04-12

  * 07e4fcd6 INSTALL.txt: Updated INSTALL notes for Q

  * 2a0f919e src/Makefile.in src/Makefile.inc src/components/README...: Added missing files for Q merge.
  * 0b0f1863 src/Rules.bgpm src/components/bgpm/CNKunit/CVS/Entries src/components/bgpm/CNKunit/CVS/Repository...: Added PAPI support for Blue Gene/Q.

2012-02-17

  * 147a4969 src/perfctr-2.6.x/usr.lib/event_set_centaur.o src/perfctr-2.6.x/usr.lib/event_set_p5.o src/perfctr-2.6.x/usr.lib/event_set_p6.o: Remove a few binary files in perfctr-2.6.x

2012-02-23

  * 955bd899 src/perfctr-2.6.x/usr.lib/event_set_centaur.os src/perfctr-2.6.x/usr.lib/event_set_p5.os src/perfctr-2.6.x/usr.lib/event_set_p6.os: Removes the last of the binary files from perfctr2.6.x. Some binary files were left out in the cold after a mishap trying to configure perfctr for the build test.

2012-02-17

  * 5fe239c8 src/perfctr-2.6.x/CHANGES src/perfctr-2.6.x/INSTALL src/perfctr-2.6.x/Makefile...: More cleanups from the migration, latest version of libpfm-3.y perfctr-2.[6,7]. Version numbers got really confused in cvs and the git cvsimport didn't know that e.g. 1.1.1.28 > 1.1 (see perfctr-2.6.x/CHANGES revision 1.1.1.28.6.1 :~)

2012-03-13

  * e7173952 src/libpfm-3.y/examples_v2.x/multiplex.c src/libpfm-3.y/examples_v2.x/pfmsetup.c src/libpfm-3.y/examples_v2.x/rtop.c...: Fix some libpfm3 warnings. libpfm3 is not maintained anymore, so applied these changes locally. libpfm3 is compiled with -Werror so they broke the build with newer gcc even though they are just warnings in example programs.

2012-04-09

  * 10528517 src/libpfm-3.y/Makefile src/libpfm-3.y/README src/libpfm-3.y/docs/Makefile...: Copy over libpfm-3.y from cvs. libpfm3 was another one of our skeletons in CVS. Thanks to Steve Kaufmann for keeping us honest.

2012-02-17

  * ec8c879e src/aix.c src/components/coretemp/linux-coretemp.c src/components/coretemp_freebsd/coretemp_freebsd.c...: The git conversion reset all of the CVS $Id$ lines to just $Id$. Since we depend on the $Id$ lines for the component names, I had to go back and fix all of them to be the component names again.
2012-03-09

* 71a2ae4f src/components/lmsensors/linux-lmsensors.c: Fix buffer overrun in lmsensors component Conflicts: src/components/lmsensors/linux-lmsensors.c

* ec0e1e9a src/libpfm4/config.mk src/libpfm4/docs/man3/pfm_get_os_event_encoding.3 src/libpfm4/examples/showevtinfo.c...: Update to current git libpfm4 snapshot

2012-02-15

* 1312923e src/libpfm4/debian/changelog src/libpfm4/debian/control src/libpfm4/debian/rules...: The git cvsimport didn't get the latest version of the libpfm4 import. This should be the versions as were in cvs now.

2012-02-24

* 81847628 src/papi_events.csv: Fix broken Pentium 4 Prescott support. We were missing the netburst_p declaration in papi_events.csv.

2012-03-01

* 917afc7f src/papi_internal.c: Add some locking in _papi_hwi_shutdown_global_internal. This caused a glibc double-free warning, and was caught by the Valgrind helgrind tool in krentel_pthreads. There are some other potential locking issues in PAPI_shutdown, especially when debug is enabled.

* f85c092f src/papi.c: Fix possible race in _papi_hwi_gather_all_thrspec_data. The valgrind helgrind tool noticed this with the thrspecific test.

2012-03-09

* 912311ed src/multiplex.c src/papi_internal.c src/papi_libpfm4_events.c...: Fix issue when using more than 31 multiplexed events on perf_event. On perf_event we were setting num_mpx_cntrs to 64. This broke, as the MPX_EventSet struct only allocates room for PAPI_MPX_DEF_DEG events, which is 32. This patch makes perf_event use a value of 32 for num_mpx_cntrs, especially as 64 was arbitrarily chosen at some point (the actual value perf_event can support is static, but I'm pretty sure it is higher than 64). Conflicts: src/papi_libpfm4_events.c

ChangeLogP500.txt

2012-08-08

* 4b4f87ff ChangeLogP5000.txt: Changelog for PAPI5

* 6f208c06 doc/Doxyfile-common papi.spec src/Makefile.in...: Bump version numbers in prep for a 5.0 release.
* c6fdbd11 release_procedure.txt: Update release_procedure.txt. Change the order of when we branch git, so that the main dev branch gets some of the release related changes.

2012-04-17

* 97d4687f ChangeLogP440.txt: Pick up the changelog from papi 4.4. This was only included in the stable-4.4 branch.

2012-08-23

* 628c2b6e src/buildbot_configure_with_components.sh: Take debug out of the with-several-components build test config. When built with PAPI's memory wrapper routines, the threaded stress tests will sometimes get into poor performing situations. See trac ticket 148 for discussion. http://icl.utk.edu/trac/papi/ticket/148

2012-08-22

* 46faae8e src/ctests/overflow2.c src/ctests/overflow_single_event.c src/ctests/overflow_twoevents.c...: Move find_nonderived_event() from overflow_twoevents to test_utils and call it from overflow2 and overflow_single_event to ensure that we're not trying to overflow on a derived event.

* 3e7d8455 src/ctests/zero_smp.c: Fix a memory leak reported on the aix power7 machine. zero_smp.c did not unregister at the end of its thread function.

* 3ad5782f src/perf_events.c: perf_events: fix segfault if DEBUG is enabled. Was incorrectly using "i" as an index where it should be "0" in a debug statement.

2012-08-21

* a3cadbdb src/ftests/accum.F src/ftests/avail.F src/ftests/case1.F...: Take #2. Changing len_trim function in ftests to last_char. This time, I respect the 72 char line limit.

* c9db8fbf src/ctests/overflow_force_software.c: overflow_force_software was the only test that used a different hard_tolerance value (0.25) than the other overflow tests (0.75). This caused trouble on Power7/AIX. Now we are using the same hard_tolerance value in all overflow tests.

* 70515343 src/ftests/accum.F src/ftests/avail.F src/ftests/case1.F...: Changed name of function len_trim to last_char.

* 95168d79 src/components/cuda/linux-cuda.c: Cleanup cuda shutdown code. The shutdown_thread code cleaned out the whole component's state.
This has been split into shutdown_global for the whole component, and shutdown_thread is left to clean up some control state info.

* 56284f81 src/ctests/multiplex1_pthreads.c: Fix memory leaks in pthread multiplex tests.

* aeead8b6 src/threads.c: Remove an outdated comment about _papi_hwi_free_EventSet holding the INTERNAL lock.

* e598647b src/perf_events.c: perf_events: fix issue where we dereference a pointer before a NULL check. Fix suggested by Will Cohen, based on a coverity report.

* 4e0ed976 src/ctests/calibrate.c: Modify warning message to eliminate the word "error". Hopefully this will suppress it in buildbot outputs.

* 50fbba18 src/ctests/api.c src/ftests/case2.F: Cleanup a few more warnings from the PAPI_perror change.

* 1f06bf28 src/ftests/case2.F: Missed an instance of perror in the fortran code.

* 93e6ae2c src/ftests/ftests_util.F: Fix warning in ftest_util.F

2012-08-20

* 60c6029e src/perf_events.c: perf_events: Update multiplexing code. It turns out the PERF_EVENT_IOC_RESET ioctl resets the count but not the multiplexing info. This means that when we fiddle with the events then reset them in check_scheduability(), we are not really resetting things to zero. The effect might be small, but since the new multiplex code by definition is always schedulable, let's skip the test if multiplexing.

* 9079236c src/ctests/zero.c: Change error reporting so FLOPS more than 100% above theoretical FAIL and FLOPS more than 25% above theoretical WARN.

2012-08-18

* 980558af src/papi_internal.c: papi_internal: fix memory leak. When I made some changes a while back I forgot to free ESI->NativeBits properly. This was causing memory leak warnings on buildbot.

2012-08-17

* 83a14612 src/perf_events.c: perf_events: more cleanups and comments. We really need to go back and figure out in more detail what the profile/sampling/overflow code is doing.
* 7cafb941 src/perf_events.c: perf_events: more cleanups and comments

* e9e39a4b src/perf_events.c: perf_events: disable kernel multiplexing before 2.6.34. It turns out even our simple multiplexing won't work on kernels before 2.6.34, so fall back to sw multiplex in that case.

* 05801901 src/perf_events.c: perf_events: more cleanup and comments

* 268e31d7 src/perf_events.c: perf_events: more cleanup and commenting

* d62fc2bf src/perf_events.c: perf_events: more cleanup and comments

* fb0081bc src/perf_events.c: perf_events: more cleanups and comments

* a1142fc8 src/perf_events.c: perf_events: cleanup and comment the kernel bug workarounds

* b8560369 src/perf_events.c: perf_events: minor cleanups and new comments

* 6c320bb2 src/perf_events.c: perf_events: fix some debug messages. I forgot to test with --with-debug enabled.

* f7a3cccf src/perf_events.c: perf_events: enable new read_code. This makes the read code much simpler. It finishes the multiplexing changes. To avoid complication, we no longer enable PERF_FORMAT_ID, as reading that extra info is unnecessary with the current implementation. This passes all the tests on a recent kernel, but on 2.6.32 there are still a few issues.

* 15749cff src/ctests/all_events.c src/ctests/all_native_events.c: Fix warning in all_events and all_native_events. In the perror semantic change, several strings for use in the old interface were left behind.

2012-08-16

* afdd25fa src/perf_events.c: perf_events: always enable kernel multiplexing. The new code should work on any kernel version.

* 9f5e23ae src/perf_events.c: Rewrite multiplex support. Drop support for the former "partitioned" multiplexing, as we could never use it. Instead use the simple/braindead model. This still needs more work, as sometimes reads are failing.

* cdd29909 src/ftests/strtest.F: Fix strtest.F ftest. It was still making some assumptions about PAPI_perror() writing to a string rather than directly to standard error.
* 565f60b3 src/papi_internal.c: Missing code to set num_error_chunks to 0. The new _papi_hwi_cleanup_errors() function was not resetting num_error_chunks to 0, leading to a segfault in the fmultiplex1 test.

2012-08-02

* bb85bafd src/genpapifdef.c src/papi.c src/papi_common_strings.h...: Remove usage of _papi_hwi_err. Move PAPI over to storing errors in a runtime list: functions to add/lookup errors, and generation of the list of PAPI_E* errors at library_init time. genpapifdef pulled the values for the PAPI_* error return codes from the _papi_hwi_err structure at configure time. Since this is now built at run-time, I added the appropriate values to genpapifdef's built-in describe_t table. See _papi_hwi_publish_error and _papi_hwi_init_errors for usage hints.

2012-08-10

* e27af085 src/perf_events.c: perf_event: rename BRAINDEAD_MULTIPLEXING. It is now "simple_multiplexing" and is a variable, not an #ifdef. This is needed before perf_event multiplexing can be sorted out. It's unclear if it actually works anyway.

* 7f8e8c58 src/perf_events.c: perf_event: remove context "cookie" field. It was a bit of overkill; we just need an initialized field. Also revamp how context and control are initialized.

* 8cb8ac6d src/perf_events.c: perf_event: move all event specific info to the control state. Previously half was in the context state and half in the control state. perf_event has a strange architecture with each event being created having its own fd, which is context wide. In PAPI though we usually only have one eventset (control state) active at once, so there's no need to have the context be aware of this.

2012-08-09

* 8d7782cb src/perf_events.c: perf_event: rename evt_t to perf_event_info_t. This just makes the code easier to follow.
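The runtime error list described under bb85bafd publishes error descriptions at init time and hands back negative codes that can later be looked up. A minimal sketch of that idea follows; all names here (publish_error, lookup_error, MAX_ERRORS) are hypothetical illustrations, not PAPI's actual internals:

```c
/* Hypothetical runtime error registry: descriptions are registered once at
 * init time, and each gets a negative integer code (-1, -2, ...). */
#define MAX_ERRORS 64

static const char *error_text[MAX_ERRORS];
static int num_errors = 0;

/* Publish a new error string; returns the (negative) code assigned to it,
 * or 0 if the registry is full. */
int publish_error(const char *description)
{
    if (num_errors >= MAX_ERRORS)
        return 0;
    error_text[num_errors] = description;
    return -(++num_errors);      /* first error gets -1, the next -2, ... */
}

/* Look an error code back up; mirrors the add/lookup pair noted above. */
const char *lookup_error(int code)
{
    int idx = -code - 1;
    if (code >= 0 || idx >= num_errors)
        return "Unknown error";
    return error_text[idx];
}
```

Building the table at run time rather than configure time lets components register their own error strings when they initialize, which is the motivation the entries above give.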
* 349de05c src/perf_events.c: perf_event: remove the superfluous per_event_info_t structure

2012-08-08

* da8ad0a2 src/ctests/all_native_events.c src/ctests/get_event_component.c src/utils/native_avail.c: Fix warnings about PAPI_enum_cmp_event() return not being checked. Reported by coverity checker via Will Cohen. Harmless warnings, and now the checker will likely complain about the value being checked but ignored.

* b4719888 src/papi_user_events.c: Fix unused value in papi_user_events.c. Reported by Coverity checker by Will Cohen.

* 6a8f255c src/utils/event_chooser.c: remove unused PAPI_get_component_info() call in event_chooser. Reported by Will Cohen from coverity checker.

2012-08-06

* 62cda478 src/genpapifdef.c src/papi_common_strings.h src/papi_internal.c...: Remove usage of _papi_hwi_err. genpapifdef pulled the values for the PAPI_* error return codes from the _papi_hwi_err structure at configure time. Since this is now built at run-time, I added the appropriate values to genpapifdef's built-in describe_t table.

2012-08-02

* d11259f3 src/papi.c src/papi_internal.c src/papi_internal.h...: Move over to generating the list of PAPI errors at library_init time.

* 097ffc44 src/papi_internal.c: Functions to add/lookup errors.

2012-08-07

* 2530533f src/papi_events.csv: tests/zero fails on Power7 due to a PAPI_FP_INS error of 50%. The preset definition has been redefined and the test now passes.

* 8e17836f src/components/appio/Rules.appio src/components/appio/appio.c src/components/appio/appio.h...: We now intercept recv(). The support for recv() requires dynamic linkage. For static linkage, recv is not intercepted.
2012-08-06

* 8b1eb84c src/perf_events.c: perf_events: some whitespace cleanup and extra comments

* f10edba6 src/perf_events.c: perf_events: MAX_READ was no longer being used, remove it

* 08c06ed1 src/perf_events.c: perf_event event_id is actually 64-bit, so make our copy match

* a33e8d9c src/perf_events.c: Rename context_t to pe_context_t in perf_events.c. Makes the code a bit clearer and matches how other components name things.

* 96ce9dcd src/perf_events.c: Rename control_state_t to pe_control_state_t. This makes the code a bit easier to follow and matches how other components name things.

2012-08-03

* 4c5dce7f src/ctests/zero.c: Beef up error reporting.

* 83b5d28a src/ctests/cycle_ratio.c: Have the cycle_ratio test skip if the PAPI_REF_CYC event is not defined.

2012-08-02

* 25b1ba41 src/ctests/cycle_ratio.c: Removed all TESTS_QUIET blocks. They aren't needed because tests_quiet() overloads printf. We should probably remove TESTS_QUIET blocks in ALL tests at some point for code clarity…

* 8777d7d4 src/ctests/zero.c: Fixed error reporting. The error computation was inside a TESTS_QUIET block and wasn't getting executed when run quietly. Thus this test always passed on buildbot, even when it shouldn't have.

* 006fe8e9 src/ctests/Makefile: Fix typo in cycle_ratio make line.

* 88e6d6a4 src/aix.c src/aix.h: Setting number of multiplex counters back to 32 for AIX. Before, it was set equal to the number of max HW counters. This caused ctests/sdsc-mpx to fail.

* ab78deda src/papi_events.csv: ctests/calibrate on Power7/AIX failed with a 50% error all the way through. Updated the preset FP_OPS with a more appropriate definition. Now the calibrate errors range from 0.0002 to 0.0011% for double and single precision.

* fadce32f src/ctests/calibrate.c: Modify calibrate test in two ways: 1. add a -f flag to suppress failures and allow the test to run to completion; 2. change error detection to allow warnings above MAX_WARN and failures above MAX_FAIL. Currently set to 10% and 80% respectively.
This allows speculative overcounting to pass with a warning rather than fail completely.

* 8a39ac9d src/papi_events.csv: LST_INS for Power7 was defined from 3 native events that cannot be counted together at the same time. This caused ctests/all_events to fail. Updated the preset with a more appropriate definition.

* cdc16e5d src/papi_events.csv: L1_DCA for Power7 was defined from 3 native events that cannot be counted together at the same time. That caused ctests/tenth to fail. Updated the preset with a more appropriate definition.

2012-08-01

* 2bf44d13 src/papi_internal.c src/perf_events.c: icc does not like arithmetic on void pointers. Added casts to unsigned char* where arithmetic was being performed on void pointers in papi_internal and perf_events.

* 7825ec14 src/ctests/api.c src/ctests/attach2.c src/ctests/attach3.c...: Modify tests that FAIL if PAPI_FP_OPS or PAPI_FP_INS is not implemented. Now they will warn and continue. This is specifically to accommodate the brain-dead IvyBridge implementation.

* fd70a015 src/testlib/test_utils.c: Re-writing of test_utils introduced new bugs that caused ctests/tenth to fail. The test_events struct lists the same event twice (MASK_L1_DCW), hence PAPI_add_event() fails because it's forced to add the same preset twice.

* 74ece3a0 src/run_tests.sh: run_tests.sh was clobbering the $EXCLUDE variable if $CUDA was defined. Changed to add entries from run_tests_exclude_cuda.txt to $EXCLUDE, which should already contain entries from run_tests_exclude.txt, instead of replacing the entries already contained.

* 11ed2364 src/libpfm4/config.mk: Added check in libpfm4/config.mk to check if using icc. If so, the -Wno-unused-parameter flag will no longer be used, because icc does not provide it and provides no alternative.

* dedf73f6 src/papi_user_events.c: fgetc() returns an int; it should be treated as an int. The coverity scan flagged that the int returned by fgetc was stored in a char. The main concern with this is that the EOF that fgetc() could return is -1.
We do not want to mess up that value by typecasting to char and then back to int.

* c4fcbe7e src/ctests/kufrin.c: Check return values of PAPI_get_cmp_opt() and calloc. A coverity scan showed that PAPI_get_cmp_opt() could potentially return a negative number. Also, it is good form to check the return value of calloc to ensure it is a non-null pointer.

* e89d6ffa src/testlib/test_utils.c: Clean up test_print_event_header(). There were a couple of warnings flagged by coverity on test_print_event_header(). The function now checks for error conditions flagged by PAPI_get_cmp_opt() and also frees memory allocated by a calloc() call.

* c81d8b60 src/threads.h: Eliminate dead code from threads.h. If HAVE_THREAD_LOCAL_STORAGE is defined, a portion of _papi_hwi_lookup_thread() will never be executed. This patch makes either one section or the other section of code be compiled. This will eliminate a coverity scan warning about unreachable code.

* f70f3f56 src/ctests/all_native_events.c: Eliminate unused variable in ctests/all_native_events.c. Coverity identified a variable that was set but never used in all_native_events.c. This patch removes the unused variable to eliminate that warning.

* a9f29840 src/components/appio/appio.c: A couple of places in appio.c used FD_SET() without initializing the variable. A Coverity scan pointed out this issue.

* 9e535ae2 src/components/rapl/linux-rapl.c: A Coverity scan pointed out that read_msr() could potentially use an invalid value of fd for pread(). Need to check the value of fd before using it.

* 7b55c675 src/components/rapl/linux-rapl.c: The arrays used for initialization were hard coded to 1024 packages. Want to avoid hard coding that, so the day when machines with 1025 packages are available is a non-event. Also changed the initialization code to avoid having the initialization be O((number of packages)^2) in time complexity.

2012-07-27

* 3703995a src/papi_internal.c: Fix the component name prepending code.
When presented with a NULL component .short_name, the code did the wrong thing.

* 5258db8b src/components/cuda/linux-cuda.c: Fix a warning in cuda.

2012-07-26

* ddd6f193 src/ctests/Makefile src/ctests/cycle_ratio.c: Add a test to compute nominal CPU MHz from real cycles and use PAPI_TOT_CYC and PAPI_REF_CYC to compute effective MHz. Warns if PAPI_REF_CYC is zero, which can happen on kernels < ~3.3.

* fab5e9ef src/papiStdEventDefs.h src/papi_common_strings.h src/papi_events.csv: Add PAPI_REF_CYC preset event. Define it as UNHALTED_REFERENCE_CYCLES for all Intel platforms on which this native event is supported.

2012-07-25

* 8b9b6bef src/papi_events.csv: Modify SandyBridge and IvyBridge tables: SandyBridge FP_OPS only counts scalars; SP_OPS and DP_OPS now count correctly, including SSE and AVX. IvyBridge can't count FP at all; adjustments made to eliminate event differences with SandyBridge.

2012-07-26

* 5b11c982 src/components/cuda/linux-cuda.c: Fix the cuda component. The cuda component prepended CUDA. to all its event names; this is no longer the case.

2012-07-25

* db5b0857 src/papi_events.csv: Added 2 new preset definitions for BGQ. Note, these presets use the new feature where a generic event mask together with an ORed opcode string is used. This won't work until the new Q driver is released (currently scheduled for end of August).

2012-07-24

* af7cd721 src/components/coretemp/linux-coretemp.c src/components/coretemp/tests/coretemp_pretty.c src/components/cuda/linux-cuda.c...: Enforce all our components to use the same naming. We settled on :'s as inter-event separators. This also touches a few of the components' tests; we changed the name field, so their searches needed help.

2012-07-23

* 57aeb9d4 src/papi_internal.c: Prepend component .short_name to each event name. Use ::: as a sep.
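The cycle_ratio test added under ddd6f193 scales the nominal clock by the ratio of the two counters, since PAPI_REF_CYC ticks at the rated frequency while PAPI_TOT_CYC follows the actual core clock. A minimal sketch of that arithmetic (the helper name is hypothetical, not the test's actual code):

```c
/* Effective clock rate from two counter readings taken over the same
 * interval: PAPI_REF_CYC advances at the nominal (rated) frequency, while
 * PAPI_TOT_CYC follows the actual core clock, so their ratio rescales the
 * nominal MHz.  Returns 0 when ref_cyc is 0, which the entry above notes
 * can happen on kernels older than ~3.3. */
long long effective_mhz(long long nominal_mhz,
                        long long tot_cyc,
                        long long ref_cyc)
{
    if (ref_cyc == 0)
        return 0;
    return nominal_mhz * tot_cyc / ref_cyc;
}
```

For example, on a 2400 MHz part that retired twice as many core cycles as reference cycles (turbo), the effective rate works out to 4800 MHz.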
2012-07-24

* 762e9584 src/ctests/multiplex2.c src/sw_multiplex.c: Fix multiplex2 test. It complained if it tried to add a multiplex event and PAPI properly told it that it couldn't.

* 531870f1 src/papi_internal.c: Add sanity check at component init time. Looks for num_cntrs being larger than num_mpx_cntrs, which doesn't make much sense.

* 53ad0259 src/extras.c src/genpapifdef.c src/papi.c...: Rename PAPI_MAX_HWCTRS to PAPI_EVENTS_IN_DERIVED_EVENT. Hopefully this will make things a little less confusing.

* 700af24b src/papi_internal.c: Change EventInfoArrayLength to always return num_mpx_cntrs. Things should be consistently using num_mpx_cntrs rather than num_cntrs now. Issue reported by Steve Kaufmann.

* d1570bec src/sw_multiplex.c: Fix sw_multiplex bug when max SW counters is less than max HW counters. This was causing kufrin and max_multiplex to fail.

* f47f5d6a src/aix.c src/components/appio/appio.c src/components/bgpm/CNKunit/linux-CNKunit.c...: Remove PAPI_MPX_DEF_DEG. It was not well documented and was being used in confused ways all over the code. Now there is a different define, PAPI_MAX_SW_MPX_EVENTS, used by the software multiplex code. All other components have had the value replaced with just the maximum number of counters. If a component can handle its own (non-software) multiplexing, it is up to it to set .num_mpx_cntrs to a value that's different from .num_cntrs.

* 0d83f5db src/papi_internal.c src/papi_internal.h: Split NativeBits off of NativeInfoArray in EventSet. Previously we were doing some crazy thing where we allocated both at once and then split them afterward. The new code is easier to follow.

* 98f2ecbd src/papi_internal.c: Clean up EventSet creation. Sort out which sizes are used for allocating which structures.

* e1024579 src/Makefile.inc src/multiplex.c src/multiplex.h...: Rename the multiplex files to be sw_multiplex. That way it's more clear the stuff included only relates to software multiplexing, not generic multiplexing.
* a6adc7ff src/multiplex.h src/papi_internal.c src/papi_internal.h: Move some sw-multiplex specific terms out of papi_internal.h and into multiplex.h

2012-07-23

* 1ddbe117 src/components/README: Added note that lmsensors component requires lmsensors version >=3.0.0

* 94676869 src/components/appio/appio.c src/components/appio/tests/appio_test_pthreads.c: proper checking of return codes in response to tests using coverity

* ea958b18 src/components/appio/tests/appio_list_events.c src/components/appio/tests/appio_values_by_code.c: As the component name in the table has been changed from appio.c to appio, we now use appio in the tests.

2012-07-20

* f212cc34 src/components/appio/appio.c src/components/coretemp/linux-coretemp.c src/components/coretemp_freebsd/coretemp_freebsd.c...: Add .short_name entries to each component.

* 1e755836 src/papi_libpfm4_events.c src/perf_events.c: Fix use-after-free bug in perf_events.c. This turned up in the ctests/flops test, and Valgrind made it easy to track down.

* 4580ed1d src/perf_events.c: Update perf_event.c rdpmc support. Use the libpfm4 definition for mmap rather than our custom one, now that libpfm4 has been updated.

* 47558b2c src/libpfm4/examples/showevtinfo.c src/libpfm4/include/perfmon/perf_event.h src/libpfm4/lib/pfmlib_intel_nhm_unc.c...: Import current libpfm4 from libpfm4 git. It has some minor uncore fixes plus the header changes needed for rdpmc.

2012-07-17

* 65d4c06c src/linux-common.c: Reorder statements to ensure the fclose() calls are performed. Coverity pointed out that it was possible for resources to be leaked in linux-common.c if the fscanf() encountered an error. This reordering of the statements ensures that the fclose() calls are done regardless of the results of the fscanf() functions.
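The reorder under 65d4c06c amounts to the usual leak-proof pattern: close the file unconditionally and only inspect the fscanf() result afterwards, so no error path can skip the fclose(). A minimal sketch of that pattern (not the linux-common.c code itself; the function name is made up for illustration):

```c
#include <stdio.h>

/* Parse one integer from a file.  fclose() sits on the single exit path and
 * the fscanf() result is checked only after the file is closed, so an early
 * "return on parse error" can never leak the FILE handle. */
int read_int_from_file(const char *path, int *value)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int matched = fscanf(f, "%d", value);
    fclose(f);                 /* always reached, even if fscanf failed */

    return (matched == 1) ? 0 : -1;
}
```

Compared with checking the fscanf() return first and returning from inside the open/close window, this ordering makes the resource lifetime obvious to both readers and static analyzers.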
2012-07-18

* 7bf071ff src/papi_user_events.c: Ensure that load_user_event_table() frees files and memory on error. A Coverity scan showed that an error condition in the load_user_event_table() function would exit the function without closing the table file or freeing allocated memory. This patch addresses those issues.

2012-07-17

* 1ba52e35 src/components/stealtime/linux-stealtime.c: Ensure that read_stealtime() closes the file in case of an error condition. A Coverity scan showed that an error condition could cause read_stealtime() to exit without closing the file. This patch ensures that the file is closed regardless of success or failure.

2012-07-18

* f37f22e5 src/papi_libpfm4_events.c: Fix warning in papi_libpfm4_events.c. We were setting a value but never using it.

* 8e8401bc src/testlib/test_utils.c: Remove unused variable in test_utils.c. Most of the machines in buildbot were complaining about this.

* 133ce6a9 src/linux-timer.c: Add missing papi_vector.h include to linux-timer.c. This was breaking on PPC Linux.

2012-07-17

* 6fd3cedd src/perf_events.c: Fix perf_events.c warnings reported by ICC

* 21c8f932 src/perfctr-x86.c: Fix perfctr-x86 build with debug enabled

* 08f76743 src/configure src/configure.in src/linux-bgq.c: Attempt to fix linux-bgq compilation error. It turns out BGQ uses the standard linux-context.h header.

* 43457f4f src/linux-bgq.c: Made check for opcodes more robust.

* d58116b4 src/perf_events.c: More cleanups of perf_events.c file

* 409438b7 src/freebsd-context.h src/freebsd.c src/freebsd.h: Fix FreeBSD compile warnings. Similar to the perfctr issues.

* 1e6dfb02 src/aix.c src/aix.h: Fix AIX build warnings. They were similar in cause to the perfctr issues.

* 3d0b5785 src/Rules.perfmon2 src/components/appio/appio.c src/components/bgpm/CNKunit/linux-CNKunit.h...: Remove papi_vector.h include from papi_internal.h. There were some semi-circular dependencies that came up with the context split changes.
The easiest way to fix things for perfctr was just to move papi_vector.h out to be included explicitly. This touches a lot of files because a lot of files include papi_internal.h. This should also fix the perfctr and perfmon2 builds that were broken yesterday.

2012-07-16

* a7a14a5b src/ctests/zero.c src/testlib/test_utils.c: Modify zero test to warm up the processor before measuring events, and report timing errors as signed deviations. Modify test_utils add_two_events code to check for errors after adding nominally valid events. This is a more rigorous test than just counting available registers.

* de0860d3 src/perf_events.c: Remove perf_events.h module header. It's no longer needed; everything important is merged into the perf_events.c file.

* 22975f14 src/perf_events.c: Remove perf_event SYNCHRONIZED_RESET code. This was never defined and never used, just remove the code.

* 48750b8c src/perf_events.c: Remove papi_pe_allocate_registers. On perf_event this code wasn't really doing anything useful, as update_control_state would end up re-doing any possible tests we could want to do here.

* 1775566f src/Makefile.in src/Makefile.inc src/Rules.pfm4_pe...: Remove "include CPUCOMPONENT" from papi_internal.h. This was the last major dependency on the CPU component in common PAPI code. It was mostly necessary for the ucontext definitions when trying to get the instruction pointer when doing sampling. This change creates new OS-specific header files that handle the ucontext related code, and has papi_internal.h include that instead.

* 969ce035 src/Rules.pfm4_pe src/Rules.pfm_pe src/configure...: Make perf_event libpfm4 only. perf_event libpfm3 support is not really needed anymore, and supporting it was cluttering up the perf_event component.
2012-07-13

* adad1d2a src/perf_events.c: Add init time error messages to perf_event component

* 827ccc07 src/perf_events.c: Add perf_event rdpmc / fast_real_timer detection. Currently we need a custom copy of struct perf_event_mmap_page because the version included with libpfm4 doesn't define the fields we need yet.

* 4f82fe94 src/perf_events.c: Read in paranoid info on perf_events. This indicates whether a regular user can read CPU-specific or system-wide measurements.

* 03080450 src/perf_events.c: Add proper perf_event detection, using the official /proc/sys/kernel/perf_event_paranoid file.

* 6e71d3f7 src/linux-bgq.c: Updated BGQ opcode stuff; cleaned up code.

2012-07-11

* 3114d3dc src/multiplex.c src/papi_internal.c src/perf_events.c: Minor documentation improvements. Plus fixes some typos.

2012-07-09

* b60c0f0c src/perf_events.c: Minor cleanups to perf_events.c

* 075278a0 src/aix.c src/freebsd.c src/linux-bgp.c...: Change return value for .allocate_registers. For some reason it returned 1 on success and 0 on failure. Change it so you return PAPI_OK on success and a PAPI error on failure, to better match all of the other component vectors.

* 29d9e62b src/testlib/test_utils.c: Fixed the print_header routine to report an error message if counters are not found, instead of a negative counter number. Tested by forcing the return value negative; not by running on a Mac, where the error first appeared.

* 74257334 src/ctests/Makefile src/ctests/remove_events.c: Add remove_events test. This just makes sure EventSets still work after an event has been removed. This is probably covered by other more elaborate tests, but I needed a simple test to make sure I wasn't breaking anything.

* 1372714f src/papi.c src/papi_internal.c src/papi_internal.h: Clean up, rename, and comment _papi_hwi_remap_event_position. I've found this section of code to be confusing for a long time. I think I finally have it mostly figured out.
I've renamed it _papi_hwi_map_events_to_native() to better describe what it does. The issue is that the native event list in an EventSet can change at various times: at event add, at event remove, and somewhat unexpectedly, whenever ->update_control_state is called (a component can re-arrange native events if it wants, to handle conflicts, etc.). Once the native event list has been changed, _papi_hwi_map_events_to_native() has to be called to make sure the events all map to the proper native_event again. Currently we have the _papi_hwi_map_events_to_native() calls in odd places. It seems to cover all possible needed locations, but analyzing what we do takes a lot of work...

* f1b837d8 src/papi.c: Remove unused variable in papi.c

* 541bcf44 src/papi_internal.h: Update comments in papi_internal.h. Some of the EventSetInfo comments were out of date.

* e6587847 src/papi.c src/papi_internal.c src/papi_internal.h: Remove unused parameter from _papi_hwi_remap_event_position. The mysterious _papi_hwi_remap_event_position function had a "thisindex" field that was ignored. This will possibly speed up PAPI_start() time, as it was running a loop over num_native_events on _papi_hwi_remap_event_position even though each call did the same thing, since the value being passed was ignored.

* 3ad3d14b src/papi_internal.c: Clean up and comment add_native_events in papi_internal.c. I'm chasing some weird perf_events behavior with the papi_event_chooser. The add_native_events code is very hard to understand; working on commenting it more.

* 4e5e7664 src/utils/event_chooser.c: Fix coverity warning in papi_event_chooser

* 666249a8 src/jni/EventSet.java src/jni/FlipInfo.java src/jni/FlopInfo.java...: RIP Java. The Java PAPI wrappers have not been supported for years (2005?). They are being removed to declutter the source.

* e18561fc src/papi_preset.c: Update cmpinfo->num_preset_events properly. This value wasn't being set if we were reading the presets directly from the CSV file.
* 290ab7c3 src/utils/component.c: Have papi_component_avail report counter and event info

* 7c421b9c src/testlib/test_utils.c src/utils/native_avail.c: Remove counter number from the testlib header. The header was only reporting the number of counters for the CPU component, even though the header is printed for many utils and the CPU component might not even be involved. This could be a bit confusing, so remove it.

* 26432359 src/darwin-common.c src/darwin-memory.c: Improve OSX support. This properly detects CPU information now. You can get results like this:

  Available native events and hardware information.
  --------------------------------------------------------------------------------
  PAPI Version             : 4.9.0.0
  Vendor string and code   : GenuineIntel (1)
  Model string and code    : Intel(R) Core(TM) i5-2435M CPU @ 2.40GHz (42)
  CPU Revision             : 7.000000
  CPUID Info               : Family: 6  Model: 42  Stepping: 7
  CPU Max Megahertz        : 2400
  CPU Min Megahertz        : 2400
  CPUs per Node            : 0
  Total CPUs               : 4
  Running in a VM          : no
  Number Hardware Counters : -4
  Max Multiplex Counters   : -4
  --------------------------------------------------------------------------------

2012-07-08

* 845d9ecb src/Makefile.inc src/configure src/configure.in...: Add Mac OSX support. This is enough that things compile and simple utilities run. No CPU perf counter support.

2012-07-06

* ff6f9ab4 src/linux-bgq.c: Missed deleting a debug output.

2012-04-17

* 12e9a11a RELEASENOTES.txt: Release notes for the 4.4 release.

2012-07-06

* ac2eac56 src/papi.c src/papi.h: Add a PAPI_disable_component_by_name entry point.

* 8c490849 src/components/coretemp_freebsd/coretemp_freebsd.c src/freebsd.c: Fix FreeBSD to work. I'm not sure how it ever worked in the past. With these changes I can at least do a papi_component_avail and a papi_native_avail and get sane results.

* 108b5ce6 src/freebsd.c src/freebsd.h src/freebsd/map-atom.c...: Fix FreeBSD build. Some of the recent changes broke the FreeBSD build.

* 40a60f0a src/linux-bgq.c src/linux-bgq.h: Added BGQ's opcode and generic event functionality to PAPI. For BGQ there are multiple ways to define presets. The naive way is to derive from multiple events.
  This eats up multiple counters and we lose sample capability as well as overflow capability. On the other side, some events come with multiple InstrGroups defined in the field. If that's the case we can use a generic event and opcodes to filter multiple groups in a single counter. This is not working properly yet due to a known error in BGPM: Bgpm_AddEvent() does not work properly when multiple generic events are added to an EventSet. The BGPM folks have been made aware of this issue, they confirmed the error, and they are currently working on a fix.

* 6f72b70f src/papi_events_table.sh: Make this script robust enough to handle any line ending, including CR (Mac), CRLF (Windows), and LF (Unix).
  It appears that google mail now automagically converts attached files to CRLF format.

* 765ed0d2 src/papi_internal.c: Fix a type warning in the UE code.

* 94bc1b15 src/MACROS: Remove the MACROS file
  It held out-of-date info and hasn't been touched since 2004.

* d19e73ba src/ctests/Makefile src/ctests/clockcore.c src/testlib/Makefile...: Move the clockcore.c file from ctests to testlib
  It's common code used by multiple tests, including some in the utils directory. Also add a function definition to fix a build-time warning.

* 1101a6aa src/aix-lock.h src/aix.h src/configure...: Make papi_lock.h changes for non-Linux architectures

2012-07-05

* 3b82b03d src/Makefile.in src/Makefile.inc src/aix.c...: Make the PAPI locks be tied to OS, not to CPU
  There is now a papi_lock.h file that, when included, gets the proper lock include for the OS. This fixes a lot of previous build hacks where a CPU component was needed in order for locks to work.

* 0632ef42 src/threads.c: Fix spurious init_thread call in threads.c
  threads.c was calling init_thread() on all components, even ones that were disabled. Fix it to honor the disable bit, as well as for shutdown_thread(). This was causing perfctr disable code to not work.
* 19d9de7f src/Makefile.in src/Makefile.inc src/Rules.pfm4_pe...: Replace SUBSTRATE with CPUCOMPONENT in build
  This was mostly a configure/build change but it also cleaned up some cases where we were including SUBSTRATE where we didn't have to.

* 829780db src/solaris-common.c src/solaris-common.h src/solaris-niagara2.c...: Move some common solaris code to solaris-common

* 681ef027 src/configure src/configure.in src/solaris-memory.c...: Merge solaris-memory.c and solaris-niagara2-memory.c

* bbd41743 src/solaris-ultra/get_tick.S src/solaris.h: Remove solaris-ultra/get_tick.S
  Nothing was using it.

* dc3b6920 src/papi_sys_headers.h src/solaris.h: Remove papi_sys_headers.h
  Solaris was the only thing including it, and it wasn't really using it.

* 7ccfa9df src/Makefile.in src/Makefile.inc src/configure...: Move more OS-specific code into the new OSFILESSRC
  Linux in particular was using MISC for this.

* 6f16c0c5 src/configure: Re-run autoconf to pick up the substrate => component change.

* cfff1ede src/Makefile.in src/Makefile.inc src/configure...: Remove MEMSUBSTR
  In reality, what we want instead of a Memory Substrate is an idea of the OS-specific common code that includes the memory substrate. This change adds OSFILESSRC and OSFILESOBJ to handle this case in configure.

* ca4729e6 src/configure.in: Separate out MEMSUBSTR and make it per-OS

* 3148cba5 src/Matlab/PAPI_Matlab.dsp src/ctests/calibrate.c src/ctests/flops.c...: RIP Windows, remove the Windows support code.
  Windows has not been actively supported since the transition to Component PAPI (4.0). This cleans up the code-base.

2012-07-03

* a366adf7 src/papi.c src/utils/error_codes.c: Change PAPI_strerror and PAPI_perror to behave more like their POSIX namesakes.
  PAPI_error_descr is made redundant and removed as a result.

2012-07-05

* 7df46f81 src/Rules.pfm src/aix.c src/components/coretemp/linux-coretemp.c...: Move uses of PAPI_ESBSTR to PAPI_ECMP
  I left PAPI_ESBSTR defined too for backward compatibility.
  Also some of the changes update PAPI_ESBSTR to a more relevant error code, if one is available.

2012-07-03

* fdb348ad src/components/coretemp_freebsd/coretemp_freebsd.c src/components/example/example.c src/components/net/linux-net.c...: A few more substrate removals

* 791747c1 src/cpus.c src/papi.h src/perf_events.c...: Fix bugs introduced by substrate -> component change
  Fix some stupid compile bugs that I missed.

* 79b01a47 src/aix.c src/components/appio/appio.c src/components/bgpm/CNKunit/linux-CNKunit.c...: More substrate -> component changes
  This changes the vectors:
    .init_substrate     -> .init_component
    .shutdown_substrate -> .shutdown_component
    .init               -> .init_thread
    .shutdown           -> .shutdown_thread
  Hopefully this will make the code clearer.

* 02a10d71 src/Makefile.inc src/aix.c src/cpus.c...: Rename "substrate" to "component"
  This first pass only re-names things in comments.

2012-07-02

* c4bbff1c src/papi.c src/papi.h: Minor documentation fixes
  Found when writing up the PAPI 5.0 changes document.

2012-06-30

* f9cb7346 src/components/vmware/vmware.c: Fix vmware component
  Apparently I forgot to test the build with the vmguestlib support disabled.

2012-06-22

* 599040d1 src/components/coretemp/linux-coretemp.c src/components/rapl/linux-rapl.c src/components/stealtime/linux-stealtime.c...: Fix libpfm4 ntv_event_to_info event_info_t on other components
  This was actually a widespread problem due to cut-and-paste.

* 2b51b439 src/papi_libpfm4_events.c: Properly fix libpfm4 ntv_event_to_info event_info_t event value
  The previous fix was subtly wrong. This is the proper fix, which is to do nothing inside of papi_libpfm4_events.c, because papi_internal.c does the right thing for us and we were overwriting that with the wrong value.
* a4f576bf src/ctests/overflow_allcounters.c src/testlib/papi_test.h src/testlib/test_utils.c: Clean up overflow_allcounters code
  While tracking down a previous issue I also cleaned up the overflow_allcounters test code to use some of the new interfaces.

* 6903e053 src/papi_libpfm4_events.c: Fix libpfm4 ntv_event_to_info event_info_t event value
  The recently added libpfm4 ntv_event_to_info function was not properly OR-ing PAPI_NATIVE_MASK into the event value in the event_info_t struct. That means if you tried to use that event value to add an event, it would fail. The overflow_allcounters test broke because of this.

* 420c3d11 src/ctests/Makefile src/ctests/disable_component.c src/papi.c...: Add PAPI_get_component_index() and PAPI_disable_component()
  PAPI_get_component_index() will return a component index if you give it the name of a component to match. This saves you having to iterate over the entire component list looking. PAPI_disable_component() will manually mark a component as disabled. It has to be run before PAPI_library_init() is called.

* 11946525 src/aix.c src/components/cuda/linux-cuda.c src/components/example/example.c...: Standardize component names to not end in .c
  We were being inconsistent; the time to make them all the same is now, before 5.0 gets out.

2012-06-21

* 274e1ad8 src/components/vmware/tests/Makefile: Fix cut-and-paste error in the vmware component Makefile

* 85d6438d src/utils/event_chooser.c: Update papi_event_chooser to work with components
  Now you can specify events from components and it will tell you all the other events on that component that can run with it. Previously the utility was limited to the CPU component (0) only.
* 3c2fcc83 src/papi_libpfm3_events.c src/papi_libpfm4_events.c src/perf_events.c: Hook up .ntv_code_to_info on perf_event

* 36e864b3 src/papi_libpfm4_events.c src/papi_libpfm_events.h src/perf_events.c: Enable support for showing extended umasks on perf_event
  With this change, papi_native_avail now shows event umasks such as :u, :k, :c, :e, and :i (user, kernel, cmask, edge-trigger, invert). These are boolean or integer events. They were supported by previous PAPI but they were never enumerated.

* 8f3e305e src/components/coretemp/linux-coretemp.c: Fix cut-and-paste error in linux-coretemp.c that could lead to the wrong size being copied

* 0eedd562 src/libpfm4/lib/events/intel_atom_events.h src/libpfm4/lib/events/intel_core_events.h src/libpfm4/lib/events/intel_coreduo_events.h...: Import most recent libpfm4 git
  This fixes an issue where there can be confusion between :i and :i=1 type events. It also has initial support for Uncore, though you need a specially patched kernel and PAPI does not support it yet.

* 2f86ec78 src/components/appio/tests/Makefile src/components/appio/tests/appio_test_blocking.c .../appio/tests/appio_test_fread_fwrite.c...:
  - Fixed test verbosity by using the TESTS_QUIET macro
  - Fixed Makefile to only include necessary tests for automatic builds (skip blocking tests that read from stdin)

* 6936b955 src/components/appio/README src/components/appio/appio.c src/components/appio/appio.h...: Added polling of read/write descriptors to check which operations would block.

* 48cacccf src/papi.h: Add back PAPI_COMPONENT_INDEX() for backward compatibility
  It turns out some people were using this for cases other than enumeration. The proper way to do things now is to use PAPI_get_event_component(), which is what PAPI_COMPONENT_INDEX() now maps to.

* d1ed12b7 src/ctests/Makefile src/ctests/get_event_component.c src/papi.c...: Add PAPI_get_event_component()
  This function returns the component an event belongs to. A test for this functionality is also added.
2012-06-20

* ffccf633 src/papi.h: Add component_type field to .cmp_info
  The idea is we'll specify CPU, I/O, GPU, hardware, etc.

* 9998eecc src/components/lmsensors/Rules.lmsensors: Another lmsensors build fix

* caa94d64 src/components/lmsensors/linux-lmsensors.c: Update lmsensors component to actually compile.
  I finally found a machine with lmsensors installed.

* fbcde325 src/components/lmsensors/linux-lmsensors.c src/components/lmsensors/linux-lmsensors.h: Update lmsensors component
  Unlike the other components it hadn't been updated to PAPI 5 standards. Also, it was wrongly de-allocating all state in "_shutdown" rather than "_shutdown_substrate", which was causing double-frees during tests.

* 0d3c0ae2 src/papi_internal.c: Add some extra debugging to _papi_hwi_get_native_event_info

* 5961c03d src/aix.c src/components/nvml/linux-nvml.c src/ctests/subinfo.c...: Remove cntr_groups from .cmp_info
  This information is better exposed by enumeration.

* 2b4193fd src/utils/event_chooser.c: Clean up and comment event_chooser code

* 7f9fab2b src/ctests/all_native_events.c: Clean up and add comments to all_native_events.c

* a245b502 src/components/nvml/linux-nvml.c src/ctests/subinfo.c src/freebsd.c...: Remove profile_ear from .cmp_info
  The CPU components should handle this internally.

* bca07f3c src/papi.c: Add comments to the PAPI_sprofil code.

* b1e2090c src/papi.c: Minor papi.c cleanups
  Fix some minor cosmetic things, including a typo in a comment.

* 8f3aef4a src/ctests/subinfo.c src/papi.h: Remove opcode_match_width field from .cmp_info
  This should be exposed via enumeration and not by a field in the generic cmp_info structure.

* 047af629 src/components/nvml/linux-nvml.c src/ctests/subinfo.c src/papi.h...: Remove cntr_OPCM_events field from .cmp_info
  This should be exposed via enumeration and not by a field in the generic cmp_info structure.
* 3f1f9e10 src/components/nvml/linux-nvml.c src/ctests/subinfo.c src/papi.h...: Remove cntr_DEAR_events field from .cmp_info
  This should be exposed via enumeration and not by a field in the generic cmp_info structure.

* 962c642a src/components/nvml/linux-nvml.c src/ctests/subinfo.c src/papi.h...: Remove cntr_IEAR_events field from .cmp_info
  This should be exposed via enumeration and not by a field in the generic cmp_info structure.

* 5aa7eac1 src/components/nvml/linux-nvml.c src/ctests/subinfo.c src/papi.h...: Remove instr_address_range from .cmp_info
  This feature should be detected via enumeration, not via a flag in the generic .cmp_info structure.

* 1bf68d5d src/components/nvml/linux-nvml.c src/ctests/subinfo.c src/papi.h...: Remove data_address_range field from .cmp_info
  The proper way to detect this feature is via enumeration.

2012-06-19

* 90037307 src/linux-context.h: Change Linux from using "struct siginfo" to "siginfo_t"
  This conforms to POSIX, and fixes newer Fedora where struct siginfo is no longer supported. This can in theory break on older setups (possibly kernel 2.4). If that happens, we need to somehow detect this using autoconf.

2012-06-18

* ad48b4fa src/Rules.perfctr-pfm: Fix the perfctr-pfm build; for buildbot, mostly.
  Have the perfctr-pfm build only build libpfm, like the perfevents builds. The icc build was choking on warnings (-Werror => errors) in the example programs shipped with libpfm; this is not something we depend upon.

2012-06-17

* 358b14f9 src/papi_events.csv: Update BGQ presets

* cf26fc87 src/components/bgpm/CNKunit/linux-CNKunit.c src/components/bgpm/IOunit/linux-IOunit.c src/components/bgpm/L2unit/linux-L2unit.c...: Update bgpm components according to the PAPI 5 changes

* a7b08a91 src/configure src/configure.in src/linux-bgq.c: Merging the BG/Q stuff from stable_4.2 to PAPI 5 did break it.
  It's corrected now; also, predefined events are now working.
2012-06-15

* 2d5a4205 src/components/bgpm/IOunit/linux-IOunit.c src/components/bgpm/L2unit/linux-L2unit.c src/configure...: Merging the BG/Q stuff from stable_4.2 to PAPI 5 did break it.
  It's corrected now (almost); predefined events are not working yet.

* 1b034920 src/papi.c: Re-enable PAPI_event_name_to_code() optimization
  In PAPI_event_name_to_code() there was a commented-out optimization where we would check if an event name begins with "PAPI_" before searching the entire preset list for an event name. The comment says we had to disable this due to "user events", but a check shows that the comment was introduced in e7bd768850ecf90 and that the "user events" it means is not the current support, but the now-removed PAPI_set_event_info() function, where you could change the names of presets on the fly (even to something not starting with PAPI_). Since we don't support that anymore, we can re-enable the optimization.

2012-06-14

* 9a26b43d src/papi_internal.c src/papi_internal.h src/papi_preset.c: Remove the 16-component limit
  This turned out to be easier than I thought it would be. Now determining which component an event is in is a two-step process. Before, the code shifted and masked to find the component from bits 26-30. Now, _papi_hwi_component_index() is used. There's a new native event table which maps all native events (which are allocated incrementally at first use, starting with 0x4000000) to two values: a component number and an "internal" event number.
2012-06-13

* d5c50353 src/papi_internal.c: Fix for the PAPI_COMPONENT_MASK change
  I missed two cases in papi_internal.c. This was causing the overflow_allcounters test to fail.

* 46fd84ce src/components/bgpm/CNKunit/linux-CNKunit.c src/components/bgpm/CNKunit/linux-CNKunit.h src/components/bgpm/IOunit/linux-IOunit.c...: Updating the Q substrate according to the PAPI 5 changes

* 05a8dcbf src/components/appio/appio.c src/components/appio/tests/appio_list_events.c src/components/appio/tests/appio_values_by_code.c...: First steps of removing 16-component limit
  This change removes PAPI_COMPONENT_INDEX(), PAPI_COMPONENT_MASK and PAPI_COMPONENT_AND_MASK. It adds the new functions:
    _papi_hwi_component_index()
    _papi_hwi_native_to_eventcode()
    _papi_hwi_eventcode_to_native()
  By replacing all of the former macros with the equivalent of the latter functions, it allows all of the future 16-component limit changes to be made in the functions. Components now receive as events a plain 32-bit value as their internal native event; the high bits are not set. This may break some external components. This change should not break things, but a lot of testing is needed.

* af4cbb86 src/run_tests_exclude.txt: Exclude iozone helper scripts from run_tests.
  run_tests.sh looks for executable files under components/*/tests. Some of the plotting scripts in appio/iozone were getting picked up.

2012-06-12

* c10c7ccb src/configure src/configure.in: Configure does not work on BGQ due to missing subcomp feature.
  It worked for stable-4.2 but got lost in current git origin.

* d9a58148 src/aix.c src/ctests/hwinfo.c src/ctests/overflow.c...: Update hw_info_t CPU frequency reporting.
  Previously PAPI reported "float mhz" and "int clock_mhz". In theory the first was the current CPU speed, and the latter was the resolution of the cycle counter.
  In practice they were both set to the same value (on Linux, read from /proc/cpuinfo) and not very useful when DVFS was enabled, as the value reported was usually lower than the actual frequency once the CPU started being used. This change adds two new values, "cpu_max_mhz" and "cpu_min_mhz", which are read from /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq and /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq if they are available, and falls back to /proc/cpuinfo otherwise. All of the tests were updated to use cpu_max_mhz. The old mhz and clock_mhz values are left for compatibility reasons (and set to cpu_max_mhz) but are currently unused otherwise.

2012-06-11

* 0f124891 src/papi_events.csv: Initial PAPI Ivy Bridge support
  For now, try to re-use the Sandy Bridge event presets.

* a1f46077 src/libpfm4/docs/man3/libpfm_intel_ivb.3 src/libpfm4/include/perfmon/err.h src/libpfm4/lib/events/intel_ivb_events.h...: Import libpfm4 git snapshot
  This adds Ivy Bridge support.

* 3bb983cc src/libpfm-3.y/examples_v2.x/self_smpl_multi.c: Fix a libpfm3 example program for icc; local fix because libpfm3 is deprecated.
  icc does have more enjoyable warnings than gcc: "error 186: pointless comparison of unsigned integer with zero" on code like:
    unsigned int foo;
    ...
    if ( foo < 0 )

2012-06-06

* d28adccf src/papi_user_events.c: The user events code had a call to exit(); this was bad...

2012-06-04

* 6bf43022 src/testlib/test_utils.c: Further the hack for testing for perf SW events.
  Events like perf::CPU-CLOCK (PERF_COUNT_SW_CPU_CLOCK) were passing the check; now we also check the event_info_struct.long_descr field for PERF_COUNT_SW....

* fa4b1a28 src/components/nvml/linux-nvml.c: Clean up nvml code a little.
  A few print statements were left over from debugging. Also check errors from nvml and cuda pciinfo functions, disabling the component in a few more cases.
2012-06-01

* da144a94 src/components/nvml/Makefile.nvml.in src/components/nvml/README src/components/nvml/Rules.nvml...: Rewrite and merge of the nVidia Management Library component.
  This component attempts to expose all supported 'performance counters' on each card cuda knows about at runtime. Much like the cuda component, reads happen on the card you're executing on at PAPI_read time. The test included is a copy of the cuda helloworld test, but it attempts to start/stop the event on each gpgpu. If you select an event that is not supported on the card you're running on, we should fail gracefully, but this has not been tested.

2012-05-23

* b2d414dc src/components/stealtime/linux-stealtime.c: Add units to stealtime component
  Added the function but forgot to add a function vector for it.

* ce9d4500 src/components/stealtime/linux-stealtime.c: Add units to stealtime
  Properly report that the units are in microseconds.

* 149948c8 src/components/rapl/linux-rapl.c: Minor cleanup of RAPL code
  Missing "void" parameter in the init_substrate function.

* 6a7e22fa src/components/vmware/vmware.c: More vmware component fixes.
  This makes the component thread-safe. Also makes it fail more gracefully if the guestlib SDK is installed but does not support our hypervisor (for example, if we are running under VM Workstation). Still need to test on ESX.

* 072d6473 src/components/appio/tests/appio_test_select.c: Added code to intercept and time select() calls.

2012-05-22

* 12b6d0d7 src/components/vmware/vmware.c: Some more minor fixes to VMware component
  Try to handle things better if the VMguest SDK is not working.

* 6e015bc5 src/components/vmware/Rules.vmware src/components/vmware/vmware.c: More vmware component fixups
  Now works with the events from the VMguest SDK library.

* 5fc0f646 src/components/vmware/vmware.c: More cleanup of vmware component
  The pseudo-performance counters work again. Now they behave in accumulate mode, like all other PAPI counters.
* f72b0967 src/components/vmware/tests/vmware_basic.c: Make vmware test a bit more complete

* 070e5481 src/components/vmware/tests/Makefile src/components/vmware/tests/vmware_basic.c: Add a test for the vmware component

* 7cf62498 src/components/vmware/Makefile.vmware.in src/components/vmware/Rules.vmware src/components/vmware/configure...: Clean up the vmware component.
  Bring it up to date with other components. Make it possible to build it without the vmguest library being installed.

* b32ae1ae src/components/stealtime/Rules.stealtime src/components/stealtime/linux-stealtime.c src/components/stealtime/tests/Makefile...: Add a stealtime component
  When running in a VM, this provides information on how much time was "stolen" by the hypervisor due to the VM being disabled. This info is gathered from column 8 of /proc/stat. This currently only works on KVM.

* 9e95b480 src/components/appio/tests/appio_test_blocking.c: Use a non-blocking select to determine which reads and writes would block

2012-05-19

* f60d991f src/components/appio/README src/components/appio/appio.c src/components/appio/tests/appio_test_read_write.c...: Interception of close() implemented.
  This allows us to correctly determine the number of currently open descriptors.

2012-05-17

* 7cd8b5a3 src/libpfm4/.gitignore src/libpfm4/config.mk src/libpfm4/lib/Makefile...: Update libpfm4 to current git tree

* ebffdb7e src/components/rapl/tests/rapl_overflow.c: Skip rapl_overflow test if RAPL not available

* 98d21ef3 src/components/example/example.c src/components/rapl/linux-rapl.c: Fix some component warnings.

* 0447f373 src/configure src/configure.in src/linux-generic.h: Make build not stall if PAPI_EVENTS_CSV not set
  This is some fallout from the FreeBSD changes. PAPI_EVENTS_CSV could end up not being set, which would make the event creation script hang forever. Also catch various fallthroughs in the code where SUBSTR wasn't being set, which is how the above problem can happen.
* ef484c00 src/linux-timer.h: Fix typo in linux-timer.h

2012-04-14

* 7c3385f4 src/components/bgpm/CNKunit/CVS/Entries src/components/bgpm/CNKunit/CVS/Repository src/components/bgpm/CNKunit/CVS/Root...: Removed CVS stuff from Q code.

* 2cf4aeb2 src/configure src/configure.in src/linux-bgq.c...: Removed papi_events.csv parsing from Q code.
  (CVS stuff still needs to be taken care of.)

2012-04-12

* 153c2bb1 INSTALL.txt: Updated INSTALL notes for Q

2012-05-17

* ff6a43fb src/Makefile.in src/Makefile.inc src/components/README...: Added missing files for Q merge.
  Conflicts: src/configure src/configure.in src/freq.c

2012-04-12

* 0e142630 src/Rules.bgpm src/components/bgpm/CNKunit/CVS/Entries src/components/bgpm/CNKunit/CVS/Repository...: Added PAPI support for Blue Gene/Q.

2012-05-14

* ad7e3fa0 src/components/rapl/linux-rapl.c: Properly accumulate RAPL results
  Previously it was resetting the counts on read, instead of continuing to count as per other PAPI events.

* c79e3018 src/components/rapl/tests/rapl_overflow.c: Fix some warnings in rapl_overflow test

* 731afd1a src/components/rapl/tests/Makefile src/components/rapl/tests/rapl_overflow.c: Add rapl_overflow test
  This test sees if we can measure RAPL in conjunction with overflow CPU performance events.

* b0e201bb src/components/rapl/utils/Makefile src/components/rapl/utils/rapl_plot.c: Remove derived "uncore" values from rapl tool
  They weren't really measuring uncore, but were just TOTAL - PP0. It was causing some confusion.

2012-05-09

* 547e9379 doc/Doxyfile-common papi.spec src/Makefile.in...: Bump the version number to 4.9.0.0
  Read 4.9 as pre-5.0; master was at version number 4.2.1, which was archaic... Sorry for the confusion, Tushar; master is the correct branch for the latest development code.

* 133e3d67 src/configure src/configure.in: Fix perfctr build
  In the FreeBSD changes I removed the CPU determination by reading /proc/cpuinfo, as that was prone to failure and non-portable.
  This broke perfctr, as it was doing a huge CPU name lookup to determine if it was on an x86 system or not. This change fixes that.

2012-05-08

* 42b21d67 src/papi_libpfm4_events.c: Fix PAPI event enumeration inside of VMware
  VMware disables the availability of architectural events when the virtualized PMU is not available. libpfm4 was checking this when enumerating events, and we would end up in the situation where ix86arch was marked active but 0 events were available. We didn't check for this error condition and thus ended up thoroughly confused.

2012-05-07

* fd79a584 src/freebsd.c: Fix event enumeration on FreeBSD
  It was returning PAPI_OK in all cases, causing papi_native_avail to try to do things like report groups even when groups weren't available.

* 53732c2e src/freebsd.c: Add Virtual Machine detection support to FreeBSD
  Again, support for this on x86 is OS-neutral.

* 7b4d7c96 src/configure src/configure.in src/freebsd-memory.c...: Add x86_cacheinfo support to FreeBSD
  The x86 cache and memory info is OS-independent, so add support for it to FreeBSD.

* 91033df6 src/Makefile.in src/Makefile.inc src/configure...: Re-enable predefined events on FreeBSD

* 36f6dc1b src/freebsd.c src/freebsd/map.c src/freebsd/map.h: Modify FreeBSD to use _papi_load_preset_table

* 45651746 src/freebsd.c: Clean up the freebsd code a bit.

* e1554ed8 src/configure: Re-run autoconf for updated configure

* 1deb2f5d src/Makefile.inc: Make sure a proper dependency for papi_events_table.h exists
  Our Makefile code that builds a shared library is way broken; it will fail to rebuild in many cases where the static library properly detects things.

* 28e28006 src/configure.in: Make papi_events_table.h build normally, not by configure.
* 9a66dfa5 src/configure.in: Another place papi_events_table.sh is called

* 12e4a934 src/Makefile.inc src/papi_events_table.sh: Make papi_events_table.sh take a command line argument
  This way we can use it on any .csv file, not just papi_events.csv.

* 7018528f src/freebsd/memory.c: Remove unused freebsd/memory.c file

* 819e5826 src/freebsd_events.csv: Make freebsd_events.csv a valid PAPI event file

* 9cc4a468 src/freebsd.c src/freebsd/map-atom.c src/freebsd/map-core.c...: Fix FreeBSD build on head.
  This temporarily disables preset events. There are also a few other minor fixes.

2012-05-01

* ab36c0a2 src/Makefile.inc src/configure src/configure.in: Update build system for FreeBSD

* 2b61d8b7 src/freebsd.c src/freebsd.h: Fix various compiler warnings on FreeBSD

* 2c0bcc84 src/freebsd.c: Enable new Westmere events on FreeBSD

* b0499663 src/freebsd/map-i7.c src/freebsd/map-i7.h src/freebsd/map-westmere.c...: Add Westmere event support for FreeBSD

* e54cabc6 src/ctests/inherit.c: Fix the inherit ctest to compile on FreeBSD

* d9dbdd31 src/components/appio/appio.c: Change in appio component (appio.c): removed reference to .ntv_bits_to_info as it doesn't exist in the PAPI component interface.

2012-04-27

* 5d661b2d src/Rules.pfm src/Rules.pfm_pe: Add the libpfm -Wno-override-init bandaid to the other rules files.
  In b33331b66137668155c02e52c98a7e389fad402e we test if gcc -Wextra complains about some structure initialization that libpfm does. This was incorporated into Rules.pfm4_pe only. Jim Galarowicz noticed the other Rules files didn't have it.

* 4349b6fd src/Rules.pfm4_pe src/Rules.pfm_pe: Clean up the perf events Rules files.
  Steve Kaufmann reported that CONFIG_PFMLIB_OLD_PFMV2 is only used for libpfm3 builds targeting old versions of perfmon2.
2012-04-26

* 8a7fef68 src/mb.h: Add memory barriers for ia64

2012-04-24

* 9af4dd4a src/libpfm4/README src/libpfm4/config.mk src/libpfm4/include/perfmon/perf_event.h...: Import libpfm4 git snapshot
  This brings libpfm4 up to 9ffc45e048661a29c2a8ba4bfede78d3feb828f4. The important change is support for Intel Atom Cedarview.

2012-04-20

* fac6aec0 src/linux-bgp-memory.c src/linux-bgp.c: Some BG/P cleanups.
  Removed a lot of dead code, noticed when looking for any potential BG/P issues.

* 977709f6 src/linux-bgp-preset-events.c src/linux-bgp.c: Fix PAPI compile on BG/P
  Thanks to Harald Servat.

2012-04-19

* 5207799e release_procedure.txt: Modified release_procedure.txt to push tags.

2012-04-18

* b248ae80 doc/Makefile: Have clean remove the doxygen error file.

* 1d4f75a3 doc/Doxyfile-man1 doc/Doxyfile-man3: Fix an error in the Doxygen config files.
  Doxygen includes things with @INCLUDE, not @include. The html file had this; the man page files did not...

2012-04-17

* 979cda20 cvs2cl.pl delete_before_release.sh gitlog2changelog.py...: Update the release machinery for git.
  gitlog2changelog.py takes the output of git log and parses it into something like a changelog.

* 67bdd45f doc/Doxyfile-html: Cover up an instance of doxygen using full paths.
  Doxygen (up to 1.8.0, the most recent at this writing) would use full paths in directory dependencies, ignoring the use-relative-paths config option.

2012-04-13

* c38eb0b7 src/libpfm-3.y/lib/intel_corei7_events.h src/libpfm-3.y/lib/intel_wsm_events.h src/libpfm-3.y/lib/pfmlib_intel_nhm.c: Add missing update to libpfm3
  Somehow during all of the troubles we had with importing libpfm3 into CVS, we lost some Nehalem/Westmere updates. Tested on a Nehalem machine to make sure this doesn't break anything.

* 193d8d06 src/papi_libpfm3_events.c: Fix max_multiplex case on perf_event/libpfm3
  num_mpx_cntrs was being set to 512 even though the real maximum is 32, causing a buffer overflow and segfault.
2012-04-12

* f1f7fb5b src/threads.h: Fix minor typo in a comment

* 0373957d src/linux-timer.c: Fix potential fd leak
  Noticed by the Coverity checker.

* 71727e38 src/ctests/max_multiplex.c: Improve max_multiplex ctest
  On perfmon2, this test was failing because the maximum number of multiplexed counters was much more than the available counters we could test with. This change modifies the test to not fail in this case.

2012-04-11

* fdbdac9f src/perfmon.c: Fix the perfmon substrate.
  It was missing a _papi_libpfm_init() call, which meant the number of events was being left at 0.

2012-04-09

* 2a44df97 src/libpfm-3.y/examples_v2.x/multiplex.c src/libpfm-3.y/examples_v2.x/pfmsetup.c src/libpfm-3.y/examples_v2.x/rtop.c...: Catch a few libpfm-3.y files up to libpfm-3.10.
  More skeletons keep falling out of the CVS closet. This is just what diff -q -r catches.

2012-04-04

* 0e05da68 src/components/rapl/utils/Makefile src/components/rapl/utils/README src/components/rapl/utils/rapl_plot.c: Add the rapl_plot utility to the RAPL component.
  This utility uses PAPI to periodically poll the RAPL counters and generate average power results suitable for plotting. There's been a lot of interest in this utility, so it's probably useful to include it with the RAPL component.

* 2daa03ac src/papi_internal.c: Check if a component is disabled at init time.
  This change modifies the code so that at PAPI_library_init() time we check the component disable field, and we don't call the init routines for components the user has disabled. This allows code like the following to happen _before_ PAPI_library_init():
    numcmp = PAPI_num_components();
    for(cid=0; cid<numcmp; cid++) {
      cmpinfo = PAPI_get_component_info(cid);
      if (!strcmp(cmpinfo->name,"cuda")) {
        cmpinfo->disabled=1;
        strncpy(cmpinfo->disabled_reason,"Disabled by user",PAPI_MAX_STR_LEN);
      }
    }
  We might want to add a specific PAPI_disable_component(int cid) call or maybe even a PAPI_disable_component(char *name), as the above code causes compiler warnings since cmpinfo is returned as a const pointer.
  This all works because PAPI currently statically allocates all of the components at compile time, so we can view and modify the cmp_info structure before PAPI_library_init() is called.

* 3fd2b21e src/components/appio/README src/components/appio/appio.c src/components/appio/appio.h...: Added support to count reads that are interrupted or would block.

2012-04-03

* dd3a192f release_procedure.txt: Change chmod flags for doxygen stuff from 755 to 775 to allow group write permissions.

2012-03-30

* deac54cc src/components/coretemp/linux-coretemp.c src/components/coretemp/tests/coretemp_basic.c src/components/coretemp/tests/coretemp_pretty.c...: Add new PAPI_enum_cmp_event() function
  This will be needed when we remove the 16-component limit. Currently in PAPI_enum_event() the component number is gathered from bits 29-26 of the eventcode. This won't work anymore once we remove those bits. Also update the various components to not use PAPI_COMPONENT_MASK(), as this too will go away in the transition.

* 48331cc9 src/configure src/configure.in src/papi.c...: Place all compiled-in components in the _papi_hwd[] array.
  Previously we had separate compiled_in[] and _papi_hwd[] arrays. At init time a pointer to the compiled_in[] entry was copied to _papi_hwd[] if initialization passed. This kind of code setup makes enumerating components hard, and finding info from non-available components would require additional function entry points. This change moves all compiled-in components to _papi_hwd[]. Availability of the component can be checked with the new "disabled" field. This will make enumeration support a lot easier to add. It can possibly cause user confusion if they try to access component structures directly without checking the "disabled" field first. This change should also make any eventual support for run-time component enabling/disabling a lot easier.
  * 66a72f44 src/papi.c: Documentation was referring to nonexistent
    "PAPI_enum_events()"
    The actual function we have is PAPI_enum_event()

  * 0f2c2593 src/components/coretemp/linux-coretemp.c
    src/components/lustre/linux-lustre.c
    src/components/mx/linux-mx.c...: Add support for reporting the
    reason for failed component initialization.
    This change adds the fields "disabled" and "disabled_reason" to
    the component_info_t structure.  At initialization time, PAPI will
    set the "disabled" field to the value returned by component init
    (that is, PAPI_OK if OK, or an error otherwise).  This can be
    checked later to find why component init failed.  Also provided is
    the "disabled_reason" string.  The components can set this at
    failure time, and it can be printed later.  For example, this is
    sample output of the updated papi_component_avail routine:

       Compiled-in components:
       Name: perf_events.c    Linux perf_event CPU counters
       Name: linux-rapl       Linux SandyBridge RAPL energy measurements
          \-> Disabled: Not a SandyBridge processor
       Name: example.c        A simple example component
       Name: linux-coretemp   Linux hwmon temperature and other info
          \-> Disabled: No coretemp events found
       Name: linux-net.c      Linux network driver statistics
       Name: linux-mx.c       Myricom MX (Myrinet Express) statistics
          \-> Disabled: No MX utilities found
       Name: linux-lustre.c   Lustre filesystem statistics
          \-> Disabled: No lustre filesystems found

       Active components:
       Name: perf_events.c    Linux perf_event CPU counters
       Name: example.c        A simple example component
       Name: linux-net.c      Linux network driver statistics

2012-03-29
  * d84b144e src/components/rapl/Rules.rapl
    src/components/rapl/linux-rapl.c
    src/components/rapl/tests/Makefile...: Add a SandyBridge RAPL
    (Running Average Power Limit) Component
    This component allows energy measurement at the package level on
    SandyBridge machines.
    To run, you need the Linux x86 msr kernel module installed and
    read permissions to /dev/cpu/*/msr

    The output from the rapl_busy test looks like this on a
    SandyBridge-EP machine:

       Trying all RAPL events
       Found rapl component at cid 2
       Starting measurements...
       Doing a naive 1024x1024 MMM...
       Matrix multiply sum: s=1016404871450364.375000
       Stopping measurements, took 3.979s, gathering results...

       Energy measurements:
       PACKAGE_ENERGY:PACKAGE0   175.786011J  (Average Power 44.2W)
       PACKAGE_ENERGY:PACKAGE1    73.451096J  (Average Power 18.5W)
       DRAM_ENERGY:PACKAGE0       11.663467J  (Average Power 2.9W)
       DRAM_ENERGY:PACKAGE1        8.055389J  (Average Power 2.0W)
       PP0_ENERGY:PACKAGE0       119.215500J  (Average Power 30.0W)
       PP0_ENERGY:PACKAGE1        16.315216J  (Average Power 4.1W)

       Fixed values:
       THERMAL_SPEC:PACKAGE0           135.000W
       THERMAL_SPEC:PACKAGE1           135.000W
       MINIMUM_POWER:PACKAGE0           51.000W
       MINIMUM_POWER:PACKAGE1           51.000W
       MAXIMUM_POWER:PACKAGE0          215.000W
       MAXIMUM_POWER:PACKAGE1          215.000W
       MAXIMUM_TIME_WINDOW:PACKAGE0      0.046s
       MAXIMUM_TIME_WINDOW:PACKAGE1      0.046s

       rapl_basic.c PASSED

2012-03-26
  * b44d60ca src/components/appio/appio.c src/components/appio/appio.h
    src/components/appio/tests/appio_test_read_write.c: Added support
    for intercepting open calls.

2012-03-23
  * 9e9fac4b src/Makefile.in src/Rules.pfm4_pe src/configure...: Fix
    the test case in configure at 0cea1848
    Make use of the structure we're using for the override-init test
    case.

  * 0cea1848 src/configure src/configure.in: Doctor CFLAGS when
    testing for a gcc warning.
    -Wextra was not in CFLAGS when I attempted to check for the
    initialized-field-overwritten warning.  So we set
    -Wall -Wextra -Werror when running the test code.

2012-03-22
  * b33331b6 src/Makefile.in src/Rules.pfm4_pe src/configure...: Fix
    initialized-field-overwritten warning when building libpfm4 on
    some gcc versions.
    In gcc 4.2 or so, -Woverride-init was added to -Wextra, causing
    issues with code like:

       struct foo { int a; int b; };
       struct foo bar = { .a=0, .b=0, .b=5 };

    -Wno-override-init allows us to keep -Werror for libpfm4 compiles.

2012-03-21
  * ae149766 src/papi_internal.h: Delete an old comment.
    Yes, Dan in 2003, we should and do use MAX_COUNTER_TERMS as the
    size of the event position array.

2012-03-20
  * b937cdd8 src/papi_user_events.c: Move the user events code over to
    using the new preset event data structure.

2012-03-14
  * 6ca599e2 src/papi_internal.c: Fix a small memory leak.
    We weren't freeing _papi_hwd, causing a lot of MEM_LEAK warnings
    in buildbot.

2012-03-13
  * 473b8203 src/aix.h src/configure src/configure.in...: Remove last
    MY_VECTOR usage.
    Have configure explicitly set the name of the perf counter
    substrate vector in the components_config.h file.  This removes
    one more special case, and gets us slightly closer to being able
    to have multiple CPU substrates compiled in at once.

  * 360c3003 src/papi.c src/papi_libpfm3_events.c
    src/papi_libpfm_events.h...: Clean up the papi_libpfm3_events.c
    code.
    Move code that was perfctr-specific into perfctr-x86.c

  * 03de65e3 src/libpfm-3.y/examples_v2.x/multiplex.c
    src/libpfm-3.y/examples_v2.x/pfmsetup.c
    src/libpfm-3.y/examples_v2.x/rtop.c...: Fix some libpfm3 warnings.
    libpfm3 is not maintained anymore, so applied these changes
    locally.  libpfm3 is compiled with -Werror, so they broke the
    build with newer gcc even though they are just warnings in example
    programs.

  * ad490353 src/ctests/zero_named.c src/utils/multiplex_cost.c: Fix a
    few compiler warnings in the tests.

  * a0fec783 src/linux-timer.c: Fix another linux-timer.c compile
    problem.
    I hadn't tested with debug enabled, so all of buildbot failed last
    night.

2012-03-12
  * a3733ecd src/linux-timer.h: Fix typo in the linux-timer.h header
    _linux_get_virt_usec_timess should have been
    _linux_get_virt_usec_times
    Thanks to Steve Kaufmann for noticing this.
  * 785db5ae src/linux-common.c src/linux-timer.c: Fix timer compile
    on Power machines
    Power, ARM, and MIPS have no get_cycles() call, so provide a dummy
    function on these architectures.

  * 708090ee src/linux-common.c src/linux-timer.h: Another fix for
    non-POSIX timers
    The recent changes had the name of the fallback usec method wrong.

  * 88e8d355 src/papi_libpfm3_events.c: Fix a warning in the libpfm3
    code.

  * 8ca63705 src/configure src/configure.in src/linux-common.c...:
    Fix build when not using POSIX timers
    The PAPI build system was being overly clever with how it defined
    what kind of wall clock timers were to be used, so of course I
    broke things when breaking the timer code out to make it a bit
    more understandable.  This patch breaks the timer define into two
    pieces; one saying it's a POSIX timer and one saying whether to
    use HR timers or not.

2012-03-09
  * b69ad727 src/linux-common.c src/linux-timer.c src/linux-timer.h:
    Add Linux posix gettime() nanosecond functions

  * af2c9a49 src/papi.c src/papi_vector.c src/papi_vector.h: Add
    ->get_virt_nsec() and ->get_real_nsec() OS vectors
    Currently PAPI was just cheating and running the usec functions
    and multiplying by 1000.  Make this the default, but allow the OS
    code to override if they have timers capable of returning nsec
    precision.

  * 24c68dbe src/aix.c src/freebsd.c src/linux-bgp.c...: Clean up
    ->get_virt_usec()
    It no longer needs to be passed a context, so remove that from all
    callers.  Also, ->get_virt_cycles() was just
    get_virt_usec()*MHz on most platforms.  While this is a bit
    dubious (especially as MHz can't be relied on), make this a common
    routine that will be added at innoculate time if
    ->get_virt_cycles() is set to NULL.

  * a3ef7cef src/linux-common.c src/linux-timer.c src/linux-timer.h:
    Cleanup the Linux timer code.
    Split things up a bit to make the code more readable.

  * 50ce8ea0 src/papi_internal.c: Change a strcpy() to strncpy() just
    to be a bit safer.
  * 0526b125 src/components/lmsensors/linux-lmsensors.c: Fix buffer
    overrun in lmsensors component

  * b088db70 src/libpfm4/config.mk
    src/libpfm4/docs/man3/pfm_get_os_event_encoding.3
    src/libpfm4/examples/showevtinfo.c...: Update to current git
    libpfm4 snapshot

  * ccb45f61 src/aix.c src/extras.c: Fix segfault on AIX
    During some of the cleanups, the extras.h header was not added to
    aix.c.  This made some of the functions (silently) use default
    data types for the function parameters, leading to segfaults in
    some of the tests.

2012-03-08
  * 1cb22d0b src/components/coretemp/linux-coretemp.c
    src/utils/native_avail.c: Make "native_avail -d" report units if
    available
    Add units support to the coretemp component; have native_avail -d
    (detailed mode) print it to make sure it works.

  * 9c54840e src/extras.c src/extras.h src/papi_internal.c...: Add new
    ntv_code_to_info vector
    This will allow components to return the extended event_info data
    for native events.  If a component doesn't implement
    ntv_code_to_info, then get_event_info falls back to the old way of
    just reporting symbol name and long description.

  * c4579559 src/papi.h: Add new event_info fields
    New fields are added to event_info that allow passing on extended
    information.  This includes things such as measurement units, data
    type, location, timescope, etc.

  * 17533e4e src/ctests/all_events.c src/ctests/derived.c
    src/ctests/kufrin.c...: Restore fields to event_info structure
    The changes made were probably too ambitious, even for a 5.0
    release.  In the end it looks like we can remain API compatible
    while just using up a little more memory.  We can still save space
    by shrinking preset_t behind the scenes.

  * 6f13a5f6 src/aix.c src/components/coretemp/linux-coretemp.c
    src/components/coretemp_freebsd/coretemp_freebsd.c...: Remove
    ->ntv_bits_to_info vector from component interface
    We weren't using it anymore, and many of the components were just
    setting it to NULL unnecessarily.
    We'll be replacing the functionality soon with ntv_code_to_info

  * 401f37bc src/components/example/example.c src/ctests/subinfo.c
    src/papi.h: Remove invert and edge_detect fields from component
    info
    These fields were there to indicate whether a CPU component
    supported these attributes (for Intel processors), but in the end
    we never used them.  The proper way to export this info is during
    event enumeration.

  * f32fe481 src/papi_events.csv: We had the PAPI_VEC_INS preset wrong
    on amd fam12h llano

  * 38a8d8a7 src/ctests/multiplex2.c src/papi_preset.c: Fix preset
    adding code to be more robust.
    If an invalid event is in a preset definition, we'd currently add
    it with an eventcode of 0 to the preset, which would break if you
    tried to use the event.  This change properly prints a warning in
    this case, and sets the preset to be unavailable.

  * 2591a546 src/ctests/val_omp.c src/ctests/zero_omp.c: Remove the
    hw_info field from add_two_events calls.
    Two ctests missed the bus when Vince reworked the add_two_events
    call.

  * 358a2e32 src/papi_internal.c src/papi_preset.c: Fix segfault seen
    on an AMD Fusion machine
    With the recent preset and component changes, we were not properly
    resetting papi_num_components if PAPI_library_init() /
    PAPI_shutdown() was called multiple times.

2012-03-07
  * 7751f5d8 src/ftests/zeronamed.F: Fix a compile error on AIX.
    Dan ran over 72 characters on a single line.  xlf actually
    enforces that part of the Fortran spec.

2012-03-06
  * 1c87d89c src/ftests/Makefile src/ftests/zeronamed.F
    src/papi_fwrappers.c: Add support for {add, remove, query}_named
    to Fortran interface; add zeronamed.F test case; modify ftests
    Makefile to support "all" tag.

  * 71bd4fdd src/configure src/configure.in: Modify configure to
    define the default FTEST_TARGETS as "all"

  * 54e39855 src/components/vmware/vmware.c: Changed triggering
    environment variable to PAPI_VMWARE_PSEUDOPERFORMANCE per Vince's
    earlier email.  This should complete all the VMware component
    changes.
2012-03-05
  * 845503fb src/Makefile.inc: Add missing MISCSRCS line to
    Makefile.inc
    This was breaking the shared library build

2012-02-01
  * 11be8e4b .../appio/tests/appio_test_fread_fwrite.c
    src/components/appio/tests/appio_test_pthreads.c
    src/components/appio/tests/appio_test_read_write.c: Updated these
    tests to print timing information

  * 9ad62ab1 src/components/appio/README src/components/appio/appio.c
    src/components/appio/appio.h...: Added support for timing I/O
    calls.  Updated tests and README.

2012-01-31
  * beaa5ff0 src/components/appio/tests/iozone/Changes.txt
    src/components/appio/tests/iozone/Generate_Graphs
    src/components/appio/tests/iozone/Gnuplot.txt...: Added the latest
    stable iozone to the appio tests.

  * 4af58174 src/components/appio/README
    src/components/appio/tests/Makefile
    src/components/appio/tests/init_fini.c: Added a hook to run the
    appio test for iozone.

2012-01-21
  * 15c733cf src/components/appio/CHANGES src/components/appio/README
    src/components/appio/appio.c...: Removed stray 'net' references.
    All remaining references are only for the purpose of giving
    credit.  Updated change log.

2012-01-20
  * ca4b6785 src/components/appio/README src/components/appio/appio.c
    src/components/appio/tests/appio_list_events.c...:
    - general cleanup
    - improved tests to be quiet and conform to other PAPI tests
    - replaced hardwired constants in appio.c with symbolic ones
    - tests will now write to /dev/null to avoid filling the terminal
      screen with useless text
    - more comments added
    - @author added to files
    - updated README

2012-01-18
  * bb22ed9f src/components/appio/README
    src/components/appio/Rules.appio src/components/appio/appio.c...:
    - Added support to measure bytes/calls/eof/short calls for
      read/write calls.
    - Interception of read/write and fread/fwrite calls.
    - Works for static and dynamic linkage (without need for
      LD_PRELOAD)
    - Tested OK on 32-bit i686 Linux 2.6.38.
    Tushar

2011-12-03
  * d58b34b6 src/components/appio/tests/Makefile
    src/components/appio/tests/appio_list_events.c
    src/components/appio/tests/appio_values_by_code.c...: *** empty
    log message ***

  * cd7d7acc src/components/appio/tests/appio_values_by_name.c: file
    appio_values_by_name.c was added on branch appio on 2011-12-03
    05:22:06 +0000

  * 425e4d09 src/components/appio/tests/appio_values_by_code.c: file
    appio_values_by_code.c was added on branch appio on 2011-12-03
    05:22:06 +0000

  * 596ad9bb src/components/appio/tests/appio_list_events.c: file
    appio_list_events.c was added on branch appio on 2011-12-03
    05:22:06 +0000

  * 119543dc src/components/appio/tests/Makefile: file Makefile was
    added on branch appio on 2011-12-03 05:22:06 +0000

2012-03-05
  * ba748a41 src/components/vmware/configure: Remove old configuration
    parameters from vmware/configure

2012-03-02
  * 2b7e2abb src/ctests/Makefile src/ctests/max_multiplex.c: Add a new
    max_multiplex test
    This tries to use the maximum number of multiplexed events.  This
    was written in response to the 32/64 perf_event multiplexed event
    limit reported by Mohammad J. Ranji

  * a0985ff5 src/multiplex.c src/papi_internal.c
    src/papi_libpfm4_events.c...: Fix issue when using more than 32
    multiplexed events on perf_event
    On perf_event we were setting num_mpx_cntrs to 64.  This broke, as
    the MPX_EventSet struct only allocates room for PAPI_MPX_DEF_DEG
    events, which is 32.  This patch makes perf_event use a value of
    32 for num_mpx_cntrs, especially as 64 was arbitrarily chosen at
    some point (the actual value perf_event can support is static, but
    I'm pretty sure it is higher than 64).

  * 331c516c src/ctests/acpi.c: Remove the acpi.c file from ctests
    It wasn't being built, and we removed the ACPI component a while
    ago.
  * 73e7d191 src/components/vmware/vmware.c: Removed all old
    references to #define VMWARE_PSEUDO_PPERF and switched over to
    getenv

2012-03-01
  * 969b8aa9 src/ctests/Makefile src/ctests/zero_named.c src/papi.c:
    Three new APIs: PAPI_query_named_event, PAPI_add_named_event,
    PAPI_remove_named_event, and a new test: zero_named
    Still to do: maybe test named native events and support Fortran

  * 97bf9bf8 src/papi.c src/papi.h: First pass implementation of
    {add, remove, query}_named_event

  * 2416af88 src/components/vmware/vmware.c: Add functionality to
    getenv selectors

  * 297f9cd6 src/papi.c: Fix possible race in
    _papi_hwi_gather_all_thrspec_data
    The valgrind helgrind tool noticed this with the thrspecific test

  * be599976 src/papi_internal.c: Add some locking in
    _papi_hwi_shutdown_global_internal
    This caused a glibc double-free warning, and was caught by the
    Valgrind helgrind tool in krentel_pthreads.  There are some other
    potential locking issues in PAPI_shutdown, especially when debug
    is enabled.

  * 8444d577 src/utils/clockres.c src/utils/command_line.c: Clean up
    the doxygen markup for the utilities.

  * 7144394f doc/Doxyfile-html: Missed a recursive tag for the html
    config file.

  * 63b2efc4 src/papi_preset.c: Fix segfaults in tests on AMD machines
    The papi_preset code was wrongly calling papi_free() on memory
    that was allocated with strdup() (not with papi_malloc).  We were
    only noticing this on AMD machines because it was the code for
    freeing developer notes in presets, and currently only AMD events
    have developer notes.

  * 0b1350df src/linux-common.c: Touch 'virtual_vendor_name' to clean
    up a warning on bluegrass.

2012-02-29
  * 1f17b571 src/Makefile.inc src/Rules.perfctr-pfm
    src/Rules.pfm4_pe...: Merge the contents of papi_libpfm_presets.c
    into papi_preset.c
    The code isn't libpfm-specific at all anymore; it's the generic
    "read presets from a file" code.
    It makes more sense to find it in papi_preset.c

  * d087d49f src/papi_fwrappers.c: Fix Fortran breakage after the
    preset event changes

  * 156141ec src/papi_libpfm_presets.c src/papi_preset.c
    src/papi_preset.h: Simplify papi_libpfm_presets.c
    Previously, adding presets from papi_events.csv was a three-step
    process:
      1. Load the presets from the file into a temporary structure.
      2. Convert this temporary structure to a "findem" dense
         structure.
      3. Pass this dense structure to _papi_hwi_setup_all_presets for
         final assignment.
    This change creates the final assignment directly, without the two
    intermediate steps.

  * 8bc2bafd src/papi.c src/papi.h src/papi_common_strings.h...: Make
    the internal preset_info match the one exported by papi.h
    There were a lot of cases where the same structure fields were
    available, just with different names.  That was confusing.  Also,
    this allows using a pointer to the preset info instead of having
    to copy values out of the structure when gathering event info for
    presets.

  * 8fda68cb src/genpapifdef.c src/papi.c
    src/papi_common_strings.h...: Merge the 4 separate preset structs
    into one.
    _papi_hwi_presets was a structure containing pointers to 4 other
    arrays of structures which held the actual preset data.  This
    change merges all of these into one big structure.

2012-02-28
  * e69815d7 src/linux-bgp.c src/papi_internal.c
    src/papi_internal.h...: Removing remaining vestiges of references
    to bipartite routines.
    Now the only references are in papi_bipartite.h, perfctr-x86.c and
    winpmc-p3.c.

  * 5766b641 src/papi_bipartite.h src/perfctr-x86.c
    src/win2k/substrate/winpmc-p3.c: These changes implement the
    bipartite allocation routine as a header file to be included in
    whatever CPU component needs it.  Right now, that's just
    perfctr-x86 and windows.  Both components have been modified, and
    perfctr-x86 compiles cleanly.  Neither has been tested since I
    don't have access to a test bed.
  * 7f444b76 src/papi_libpfm_presets.c src/papi_preset.c
    src/papi_preset.h: Merge the hwi_dev_notes structure into
    hwi_preset_data

  * 21a1d197 src/components/vmware/vmware.c: Add getenv

  * 08c1b474 src/perfctr-x86.c: Merge bipartite routine into
    perfctr-x86 component, since this is effectively the only place it
    is used.

  * 9ed9b1f5 src/papi.c: Remove a reference to PAPI_set_event_info()
    which was removed for PAPI 4

  * c626f064 src/ctests/all_events.c src/ctests/derived.c
    src/ctests/kufrin.c...: Convert PAPI_event_info_t to separate
    preset event info
    This moves the preset event info into its own separate structure,
    which greatly reduces the large string overhead that is not used
    by the native events.

  * 787d6822 src/perfctr-x86.c: Move bipartite stuff to perfctr-x86
    since that's really the only place it's currently used.

  * 229c8b41 src/components/vmware/vmware.h: Add env_var definition to
    vmware.h

  * 46aaf6ca src/components/vmware/vmware.c: Remove all unneeded cases

  * 874a5718 src/freebsd.c src/perfctr-ppc64.c: Remove more unused
    references to .bpt_ routines in preparation for refactoring.

  * 74e5a5fd src/components/vmware/vmware.h: Remove unneeded defines
    from vmware.h header

  * 58b51367 src/components/coretemp_freebsd/coretemp_freebsd.c
    src/components/vmware/vmware.c src/solaris-niagara2.c...: Remove
    unused references to .bpt_ routines in preparation for
    refactoring.

2012-02-27
  * 6b184158 src/Makefile.inc
    src/components/coretemp/linux-coretemp.c src/configure...: Have
    separate concept of "compiled-in" versus "active" components
    With this change, the _papi_hwd[] component info array only
    contains a null-terminated list of _active_ components.  The
    _papi_compiled_components[] array has the original full list.  At
    init_substrate() time a pointer to a component is only put in the
    _papi_hwd[] list if it is successfully initialized.
    In addition, the PAPI_num_compiled_components() and
    PAPI_get_compiled_component_info() calls have been added, but this
    is probably a confusing interface, so they might only be temporary
    additions.

  * 042bfd5b src/Makefile.inc src/papi.c src/papi_data.c...: Split the
    contents of papi_data.c into various other files.
    The data declarations in papi_data.c were mostly used in other
    files.  Move these into more relevant locations.

  * 1877862c src/papiStdEventDefs.h src/papi_common_strings.h: Remove
    the BGL and BGP specific pre-defined events.
    They can be better replaced by user events, and we also had
    already removed BGL support completely a while back.  This removes
    some ifdefs from the pre-defined event list and keeps future
    pre-defined events from having different eventcodes on different
    platforms.

  * c3986b79 src/components/coretemp/linux-coretemp.c
    src/components/cuda/linux-cuda.c
    src/components/infiniband/linux-infiniband.c...: Add names and
    descriptions for components.
    Also fixes cuda and lmsensors build issues introduced by the
    vector.h cleanup

  * 2c84f920 src/aix.c src/freebsd.c src/perf_events.c...: Add names
    and descriptions to all of the CPU substrates.

  * 9f3e634a src/components/example/example.c src/papi.h
    src/utils/component.c: Add new "description" and "short_name"
    fields to the .cmp_info structure
    The description field allows components to provide extra
    information on what they do.  The short_name field will eventually
    be used to pre-pend event names.  The papi_component_avail utility
    has been updated to print the description.  The example component
    was updated to fill in these values.
  * ab61c9a7 src/Makefile.inc src/genpapifdef.c
    src/papi_common_strings.h...: Split papi_data.c into two parts
    papi_data.c was half data structure definitions for all of PAPI
    and half string definitions used by both PAPI *and* genpapifdef.
    This splits the common string definitions into
    papi_common_strings.h so that genpapifdef can still be built
    without linking libpapi.a, while making the code a lot easier to
    follow.

  * b8e6294c src/solaris-ultra.c: Remove unnecessary extern
    declarations from solaris-ultra.c.

  * 5ddaff91 src/sys_perf_event_open.c: Remove unnecessary extern
    declarations from sys_perf_event_open.c

  * a6c463b7 doc/Doxyfile-common.config: Create a common config file
    for doxygen.
    As part of streamlining the doxygen process, this is a new
    template doxygen config file.  This is a blank template file
    generated by doxygen 1.7.4 (the version currently mandated by the
    release procedure).

  * dc2c11fa src/aix.c src/aix.h src/perfmon.c...: The vector
    pre-definition should be in the .c file, not the .h file

  * 0b3c83c3 src/perf_events.c: Remove unnecessary extern declarations
    in perf_events.c

  * b93efca0 src/perfmon.c src/perfmon.h: Remove unnecessary extern
    declarations in perfmon.c

  * 7f7a2359 src/papi_preset.c: Remove unnecessary extern declarations
    from papi_preset.c

  * ecec03ad src/papi_libpfm_presets.c: Remove extraneous extern
    declarations from papi_libpfm_presets.c

  * 7b5f3991 src/extras.c: Remove extraneous extern declarations from
    extras.c

  * f6470e4d src/aix-memory.c src/aix.c src/aix.h: Remove unnecessary
    extern declarations from aix.c

  * f197d4ab src/papi_data.h src/papi_internal.c: Remove unnecessary
    extern declarations in papi_internal.c

  * e7b39d48 src/papi.c src/papi_data.c src/papi_data.h...: Remove
    unnecessary extern definitions from papi.c

2012-02-24
  * 92689f62 src/configure src/configure.in src/linux-common.c...: Add
    a --with-pthread-mutexes option to enable using pthread mutexes
    rather than PAPI custom locks
    This is useful when running on new architectures that don't
    have a custom PAPI lock coded yet, and also for running valgrind
    deadlock detection utilities that only work with pthread-based
    locking.

  * ca51ae67 src/papi_events.csv: Fix broken Pentium 4 Prescott
    support
    We were missing the netburst_p declaration in papi_events.csv

  * f6460736 src/linux-common.c: Fix build on POWER, broken by the
    virtualization change.

  * 91d32585 src/perfctr-x86.c src/perfmon.c: Fix some warnings that
    have appeared due to recent changes.

  * ae0cf00f src/linux-common.c src/papi_libpfm3_events.c
    src/papi_libpfm4_events.c...: Clean up the Linux lock files
    The locking primitives for some reason were spread among the
    libpfm code and the substrate code.  This change moves them into
    linux-common and makes them part of the OS code.  This way they
    will get properly initialized even if the perf counter or libpfm
    code isn't being used.

2012-02-23
  * 88847e52 src/papi.c src/papi_memory.h: Remove the
    _papi_cleanup_all_memory define from papi_memory.h
    The code in papi_memory.h said:

       /* define an alternate entry point for compatibility
          with papi.c for 3.1.x */
       /* this line should be deleted for the papi 4.0 head */

    Since we are post papi-4.0, I thought it was time to act on this.
    Of course, papi.c was still using the old name in one place.

  * 1d29dfc6 src/papi_libpfm_presets.c src/perfctr.c src/perfmon.c:
    Fix some missing includes found after the header cleanup.

  * b425a9f4 src/Makefile.inc src/extras.c src/extras.h...: Header
    file cleanup
    The papi_protos.h file contained a lot of no-longer-in-use
    exports.  I split up the ones that are still relevant into header
    files corresponding to the C files that the functions are defined
    in.

  * 07199b41 src/extras.c src/papi_vector.c src/papi_vector.h: Clean
    up the papi_vector code.
    Remove things no longer being used; mark static functions as
    static.

  * d7496311 src/linux-common.c src/x86_cpuid_info.c
    src/x86_cpuid_info.h: Fix a missing "return 1" which meant that
    the virtualization flag wasn't being set right.
    With this fix, on saturn-vm1 we now get:

       Running in a VM      : yes
       VM Vendor            : VMwareVMware

    in the papi_native_avail header

  * 8da36222 src/freebsd.c src/linux-bgp.c src/papi.c...: Remove the
    ->add_prog_event function vector
    As far as I can tell, this is a PAPI 2.0 remnant that was never
    properly removed.  This also removes PAPI_add_pevent(),
    PAPI_save(), and PAPI_restore(), none of which were exported in
    papi.h, so in theory no one could have been using them.  Also
    removes _papi_hwi_add_pevent()

  * a5f3c8b5 src/aix.c src/freebsd.c src/linux-timer.c...: Reduce the
    usage of MY_VECTOR whenever possible.
    This is an attempt to make the cpu-counter components as similar
    as possible to external components.

  * abbcbf29 src/any-null.h: Missed removing any-null.h during the
    any-null removal.

  * 665d4c5c src/linux-common.c: Somehow missed an include during the
    virtualization addition.

  * 0c06147b src/perfctr-2.6.x/usr.lib/event_set_centaur.os
    src/perfctr-2.6.x/usr.lib/event_set_p5.os
    src/perfctr-2.6.x/usr.lib/event_set_p6.os: Removes the last of the
    binary files from perfctr-2.6.x
    Some binary files were left out in the cold after a mishap trying
    to configure perfctr for the build test.

  * 3acb7d57 src/Makefile.inc src/configure src/configure.in...: Add
    support for reporting whether we are running in a virtualized
    environment to the PAPI_hw_info_t structure.
    This currently only works on x86.  It works by looking at bit 31
    of ecx after a cpuid (the "in a VM" bit) and then using leaf
    0x40000000 to get the name of the VM software (this works for
    VMware and Xen at least).
    x86_cache_info.c was renamed to x86_cpuid_info.c to better reflect
    what goes on in that file (it does various things based on the
    cpuid instruction).
    The testlib header was updated to report virtualization status in
    the papi header (printed for things like papi_native_avail).

2012-02-22
  * 9c7659b5 src/Makefile.inc src/freq.c: Remove the freq.c file as
    nothing seemed to be using it.
  * d205e2d3 src/perfctr-x86.c: Made a stupid typo when converting
    perfctr to call libpfm functions with the component id.

  * 25b41779 src/papi_libpfm3_events.c src/papi_libpfm4_events.c
    src/papi_libpfm_events.h...: When updating the preset code to take
    a component index, I missed a few callers.

  * a713ffb1 src/papi_internal.c src/papi_vector.c: Remove any-null
    component

  * 27e1c2c5 src/any-null-memory.c src/any-null.c
    src/any-proc-null.c...: Remove the any-null component.

  * 25779ae0 PAPI_FAQ.html: Saving another version of the FAQ after
    adding a git section and removing several obsolete sections.
    These questions still need detailed review for relevance and
    timeliness.

  * 449a1a61 src/ctests/overflow_allcounters.c: Fix
    overflow_allcounters, which was making assumptions about component
    0 existing.

  * f21be742 src/ctests/hwinfo.c: Make the hwinfo test not bail out if
    no counters are available.

  * ebc675e6 src/ctests/memory.c: Make sure the memory ctest runs even
    if no components are available.

  * 9b3de551 src/linux-common.c src/perf_events.c
    src/perfmon-ia64.c...: Make sure the system info init happens at
    OS init time.
    Otherwise the system info never gets set if a perf counter
    component isn't available.

  * 59e47e12 src/papi_internal.c: Make sure that
    _papi_hwi_assign_eventset() does the right thing if no components
    are available.

  * dd51e5d6 src/ctests/api.c: The api test would fail in the
    no-CPU-component case.
    Fix it to properly check for errors before attempting to run
    high-level PAPI tests.

  * 069e9d2f src/aix.c src/papi.c src/papi_internal.h...: Fix code
    that was depending on _papi_hwd[0] existing.
    Most of this was in the presets code.  The preset code had many
    assumptions such that you could only code presets with
    component[0].  This fixes some of them by passing the component
    index around.

  * 7259eaec src/papi_vector.c: Fix up papi_vector to get rid of some
    warnings introduced on AIX.
  * 16fe0a61 src/aix.c src/solaris-ultra.c: Fix two last substrates
    where I missed some fields in the OS structure conversion.

  * 625871ec src/perfmon.c: Missed a cmp_info field in perfmon.c

  * 680919d9 PAPI_FAQ.html: Saving the latest version of the FAQ
    before undertaking major revisions.

  * 3d4fa2e5 src/linux-timer.c src/perfctr-x86.h: Fix the perfctr code
    to compile if configured with --with-virtualtimer=perfctr

  * bbd7871f src/perfctr.c: Missed two OS vector calls in the perfctr
    code during the conversion.

  * bc6d1713 src/Makefile.inc: Removed one of the two instances of
    MISCOBJS listed in Makefile.inc.

2012-02-21
  * 40bc4c57 src/papi_vector.c src/papi_vector.h: Remove now-unused OS
    vectors from the main papi vector table.

  * 3c6a0f7b src/aix.c src/freebsd.c src/linux-bgp.c...: Convert PAPI
    to use the _papi_os_vector for the operating-system specific
    function vectors.

  * 568abad5 src/papi_vector.h: Add new _papi_os_vector structure to
    hold operating-system specific function vectors.

  * a39d2373 src/ctests/subinfo.c: Missed removing a field from the
    subinfo ctest.

  * 1d930868 src/papi.h: Remove fields now in PAPI_os_info_t from the
    component_info_t struct.

  * d397d74a src/components/example/example.c: Remove fields now in
    PAPI_os_info_t from the example component.

  * 8cd5c8e0 src/aix.c src/freebsd.c src/linux-bgp.c...: Modify all
    the substrates to use _papi_os_info instead of
    _papi_hwd[0]->cmp_info for the values moved to the OS struct

  * 58855d3a src/papi_internal.h: Add padding for future expansion to
    PAPI_os_info_t
    Add _papi_hwi_init_os(void) definition

  * ea1930e1 src/papi_internal.h: Add new PAPI_os_info_t structure to
    papi_internal.h

  * 0eac1b29 src/utils/multiplex_cost.c: Modify multiplex_cost to
    properly use the PAPI_get_opt() interface to get itimer data,
    rather than directly accessing the fields from the cmp_info
    structure.
    This would have broken after the OS split.

  * 87c2aa2f src/ctests/subinfo.c: subinfo was printing itimer data
    from the cmpinfo structure.
These values will not be in cmpinfo once the OS split happens. * f2c62d50 src/components/vmware/vmware.h: Clean up the VMware Header a bit 2012-02-17 * 6f0c1230 src/aix.c src/components/coretemp/linux-coretemp.c src/components/coretemp_freebsd/coretemp_freebsd.c...: The git conversion reset all of the CVS $Id$ lines to just $Id$ Since we depend on the $Id$ lines for the component names, I had to go back and fix all of them to be the component names again. * 2d208d0e src/perfctr-2.6.x/usr.lib/event_set_centaur.o src/perfctr-2.6.x/usr.lib/event_set_p5.o src/perfctr-2.6.x/usr.lib/event_set_p6.o: Remove a few binary files in perfctr-2.6.x * f78bf1af src/libpfm-3.y/Makefile src/libpfm-3.y/README src/libpfm-3.y/docs/Makefile...: More cleanups from the migration, latest version of libpfm-3.y perfctr-2.[6,7] Version numbers got really confused in cvs and the git cvsimport didn't know that eg 1.1.1.28 > 1.1 ( see perfctr-2.6.x/CHANGES revision 1.1.1.28.6.1 :~) * e8aa2e61 INSTALL.txt: Explicitly state that 3.7 was the last version of PAPI with good windows support. * 546901fa src/components/cuda/linux-cuda.c: Modified CUDA component so that a PAPI version - that was configured with CUDA - will successfully build on a machine that does not have GPUs. 2012-02-16 * 49d9f71c src/.gitignore: Add a .gitignore file with the files that PAPI autogenerates. This way they won't clutter up "git status" messages papi-papi-7-2-0-t/ChangeLogP501.txt000066400000000000000000000071051502707512200166010ustar00rootroot000000000000002012-09-20 * 708d173a man/man1/papi_avail.1 man/man1/papi_clockres.1 man/man1/papi_command_line.1...: Rebuild the manpages for a 5.0.1 release. 2012-09-19 * 29cdd839 doc/Doxyfile-common papi.spec src/Makefile.in...: Bump the version number for a 5.0.1 release. * bb7727f6 src/libpfm4/examples/fo src/libpfm4/examples/injectevt.c .../bin/usr/local/include/perfmon/perf_event.h...: Cleanup a botched libpfm4 update. As Steve Kaufmann noted, I botched an update of libpfm4. 
2012-09-18
  * dc117410 src/configure src/configure.in: Remove a trailing slash in libpfm4 pathing. Addresses an issue in rpmbuild when using the bundled libpfm4. Reported and patched by William Cohen.

2012-09-17
  * e196b89b src/components/cuda/configure src/components/cuda/configure.in: Minor changes to the CUDA configure necessary to get it running smoothly on the Kepler architecture.

2012-09-11
  * 866bd51c src/papi_internal.c src/papi_preset.c: Fix preset bug. The preset code was only initializing the first element of the preset code[] array. Thus any event with more than one subevent was not terminated at all, and the preset code would use random garbage as presets. This exposed another problem: half our code assumed a 0-terminated code[] array, the rest was looking for PAPI_NULL (-1). This standardizes on PAPI_NULL, with comments. Hopefully this might fix PAPI bug #150. This is a serious bug and should be included in the next stable release.

2012-08-29
  * b978a744 src/configure src/configure.in: configure: fix the autodetect perfmon case. The fixes I made yesterday to libpfm include finding broke perfmon2 PAPI builds if you were letting the library be autodetected. This change should fix things. Tested on an actual 2.6.30 perfmon2 system.
  * 4386e6e5 src/libpfm4/Makefile src/libpfm4/README src/libpfm4/config.mk...: Update the libpfm4 included with PAPI to 4.3.

2012-08-28
  * 729a8721 src/configure src/configure.in: configure: don't check for libpfm if incdir specified. When various --with-pfm values are passed, extra checks are done against the libpfm library. This was being done even if only the include path was specified, which shouldn't be necessary. This broke things because a recent change I made had the libpfm include path be always valid.
  * bc9ddffc src/configure src/configure.in: Fix compiling with a separate libpfm4. The problem was that if you used any of the --with-pfm-incdir type directives to configure, it would then assume you wanted a perfmon2 build. This removes that assumption. I did check this with perfmon2, perfctr, and perf_event builds, so hopefully I didn't break anything.

2012-08-27
  * 3b737198 src/papi.c src/papi_libpfm4_events.c src/papi_preset.c...: Hack around debugging macros. Under NO_VARARG_MACROS configs the debug printing macros become two expression statements. This is bad for code expecting e.g. SUBDBG(); to be one statement, i.e.: if ( foo ) SUBDBG("Danger Will Robinson"); In order to keep the useful file and line number expansions without variadic macro support, we split SUBDBG into two parts: a call to DEBUGLABEL() and friends, and then a call to a function to capture the actual informative message. So "if (foo) stmt();" becomes "if (foo) print_the_debug_label(); print_your_message(...);" and the message is always printed. See papi_debug.h for what actually happens. I'm not clever enough to work around this any other way, so I exhaustively put { }s around every case of the above I found. (I only searched on 'DBG', so it's possible I missed some.)

=== ChangeLogP510.txt ===

2013-01-15
  * 0917f567 src/threads.c: Cleaned up a compiler warning (gcc version 4.4.6).
  * 06ca3faa src/components/bgpm/CNKunit/linux-CNKunit.c src/components/bgpm/IOunit/linux-IOunit.c src/components/bgpm/L2unit/linux-L2unit.c...: Cleaned up compiler warnings on BG/Q (gcc version 4.4.6 (BGQ-V1R1M2-120920)).

2013-01-14
  * 56400627 .../build/lib.linux-x86_64-2.7/perfmon/__init__.py .../lib.linux-x86_64-2.7/perfmon/perfmon_int.py .../build/lib.linux-x86_64-2.7/perfmon/pmu.py...: libpfm4: remove extraneous build artifacts. Steve Kaufmann reported differences between the libpfm4 I imported into PAPI and the libpfm4 that can be obtained with a git clone of git://perfmon2.git.sourceforge.net/gitroot/perfmon2/libpfm4. Note to self: do libpfm4 imports from a fresh clone of libpfm4.
2013-01-11
  * 4ad994bc src/papi_events.csv: Clean up armv7 Cortex A15 presets and add presets for L1 and L2 cache.
  * d54dabf5 ChangeLogP510.txt RELEASENOTES.txt doc/Doxyfile-common...: Prepare the repo for a 5.1 release.
    * Bump the version number to 5.1
    * Update the man pages
    * Create a changelog for 5.1
    * Update RELEASENOTES
  * 8816a3b8 INSTALL.txt: Update INSTALL.txt. Add information about installing PAPI on Intel MIC, based upon information from Vince Weaver's PAPI MIC support page: http://www.eece.maine.edu/~vweaver/projects/mic/
  * 8dc1ca23 TEST.TXT: Remove TEST.TXT. This was a leftover from the switch over to git.
  * 292d6c9b src/papi_libpfm3_events.c: Fix build on ia64. When trying to build papi 5.0.1 for IA64, my colleague got compile errors due to perfmon.h not being included. We're not sure if this actually is a configure bug, but this patch fixed it.
  * 25424f41 src/extras.c: Fix kernel warning in _papi_hwi_stop_timer(). In _papi_hwi_stop_timer() we were calling setitimer( timer, NULL, NULL ) to disable the itimer. Recent Linux kernels print warnings if you do this; NULL is not a valid second argument to setitimer(), and possibly this wasn't really working before. According to the manpage the proper fix is to call setitimer() with a valid "new_value" argument whose fields are all 0. That is what this patch does.

2012-11-30
  * a7d70127 src/components/micpower/README src/components/micpower/Rules.micpower src/components/micpower/linux-micpower.c...: MIC power component. The Intel MIC (Xeon Phi) card reports power for several components of the card. These values are reported in a sysfs file, so this component is cloned from the coretemp component.

2013-01-08
  * 121cd0a6 src/Makefile.in src/Rules.pfm4_pe src/configure...: configure: Add a shortcut for MIC support.
    * Add a --with-mic flag to enable the several options needed to cross-compile for MIC. MIC builds are cross-compiled, and Matt and I were unable to figure out how to trigger cross compilation with just our flag. This is shorthand for setting --with-arch=k1om --without-ffsll --with-walltimer=clock_realtime_hr --with-perf-events --with-tls=__thread --with-virtualtimer=cputime_id
    * Automatically cause make to pass CONFIG_PFMLIB_ARCH_X86=y to libpfm4's make.
    So to build for the MIC card one has to do (with pathing set to find the x86_64-k1om-linux-gcc cross-compiler):
      $ ./configure --host=x86_64-k1om-linux --with-mic
      $ make
    Thanks to Matt Johnson for the legwork on configure shortcutting.

2013-01-07
  * f65c9d9e src/papi_events.csv: Add preset events for ARM Cortex A15.

2012-12-14
  * 61a9c7b1 man/man3/PAPI_get_eventset_component.3 src/papi.c: Doxygen: Add a new API entry. Add the manpage for the new PAPI_get_eventset_component API entry.

2013-01-02
  * 38d969ab doc/Doxyfile-man1 doc/Doxyfile-man3 doc/Makefile...: Doxygen: Cleanup generated man pages. Mark a few \page sections as \htmlonly so that man pages are not built for them. Modify the makefile to rm some data structures that are generated.
    Doxyfile-man3:
    * Take out papi_vector.h; this file only defines a few data structures from which we don't need manpages.
    papi.h:
    * PAPI_get_component_index's inline comment had the close /**> to delimit its description, but doxygen uses /**<.
    papi_fwrappers.c:
    * Mark the group PAPIF as internal so that a man page is not generated for it.
    utils/*:
    * Remove some useless htmlonly directives; doxygen will generate pages for any data structure, htmlonly doesn't stop that.
    Doxyfile-man1:
    * Change a flag in Doxyfile-man1 so that we don't document internal data structures in the utilities. We don't do this in -man3 because of the \class workaround we use to create manpages for each of the PAPI_* API entry points. Because we call them classes, they would be caught by the no-data-structures flag.
  * 7b790c09 doc/Doxyfile-html src/papi.h src/papi_fwrappers.c...: Doxygen: Cleanup some of the markup. We were not using htmlonly correctly. The idea was to use \htmlonly to not build manpages for a few things. To properly hide \page s you want things like: /** \htmlonly \page Foo I don't want this to generate a manpage. \endhtmlonly */

2012-12-07
  * 152bac19 src/papi.c: Doxygen: Cleanup papi.c. Cleanup some \ref s; \ref PAPI_function() isn't happy, use \ref PAPI_function and it'll put in the proper links. Remove the _papi_overflow_handler doc block; we had the block but no code.

2012-12-20
  * 7a40c769 src/components/rapl/tests/rapl_overflow.c: RAPL test code: Add flexibility to the test code. Per Will Cohen: "I was reviewing some test results for the papi test and found that the rapl_overflow.c test makes an assumption that there are exactly two packages. As a result the test will fail on machines with a single package. The following is a patch to make it a bit more flexible, allowing 1-n packages in the test. -Will"

2012-12-19
  * 96c9afb0 src/components/appio/README src/components/appio/appio.c src/components/appio/appio.h...: Added events for seek statistics and support for intercepting lseek() calls.

2012-12-14
  * 003abf6d src/Rules.perfctr-pfm: Rules.perfctr-pfm: pass CC in all cases. The perfctr user library was not being passed CC when built.

2012-12-05
  * e2c05b29 src/papi_internal.c: papi_internal.c: Refactor duplicated code in cleanup and free eventset. Currently the code to free runtime state is duplicated in cleanup and free. The perf_event_uncore test exposed an issue where free cleaned up cpu_attach state but cleanup did not, causing a leak. Have _papi_hwi_free_EventSet call _papi_hwi_cleanup_eventset to free most of the runtime state of the eventset, and then allow free_eventset to free the EventSet info struct.

2012-12-13
  * 7d020224 src/configure src/configure.in: configure: Change the Fortran compiler search order. Bandaid fix for buildbot errors. By default, configure would find icc before gcc, but gfortran would be used before ifort. The real fix is to test that object code from the C compiler can be linked by the Fortran compiler.

2012-12-12
  * 87b6e913 src/papi_events.csv: ivy_bridge: remove PAPI_HW_INT event. Apparently recent Intel Vol3B documentation removed this event, and the most recent libpfm4 merge followed suit. I asked at Intel about this and possibly they only removed it because they didn't think anyone was using it. Maybe they'll add it back.

2012-12-10
  * 293b26b9 src/Makefile.inc: Makefile.inc: Fix library link ordering. Per Will Cohen: "I ran across a problem when trying to build papi with the bundled libpfm when an earlier incompatible version of libpfm was already installed on the machine. The make would use the /usr/lib{64}/libpfm.so before trying to use the locally built version and this would cause problems. The attached patch changes the order of the linking and uses the locally built libpfm before it tries the installed version. -Will"

2012-12-12
  * 57e6aa0d src/Makefile.in: Makefile.in: export CC_COMMON_NAME. In 17cfcb4a I started using CC_COMMON_NAME in Rules.pfm4 but failed to have configure put it in the Makefile.

2012-12-11
  * 17cfcb4a src/Rules.pfm4_pe src/configure src/configure.in: Cleanup the icc build. Start using -diag-disable to quiet down some of the remarks icc carps about in libpfm4. Also have configure export CC_COMMON_NAME and check against that in Rules.pfm4_pe. afec8fc9a reverted us to passing -Wno-unused-parameter to icc, polluting buildbot.

2012-12-10
  * afec8fc9 src/configure src/configure.in: configure: Attempt to better detect which C compiler we are using. This attempts to address trac bug 162: http://icl.cs.utk.edu/trac/papi/ticket/162 Specifying full paths for CC caused issues in our configure logic. We set several flags specific to gcc or icc, and this was breaking down, e.g. "/usr/bin/gcc" != "gcc". Now we attempt to execute whatever CC we are going to use and grep its version string. We set a CC_COMMON_NAME in {"gcc", "icc", "xlc", "unknown"} based upon the above, and later check CC_COMMON_NAME in place of CC to set compiler-specific flags.
  * 14432aa0 src/linux-timer.c src/papi.c: Minor Coverity fixes. Thanks, Will Cohen.

2012-12-07
  * ba5e83d4 src/papi_user_events.c: papi_user_events.c: Fix memory leak. Reported by William Cohen as detected by the Coverity tool.
  * 166498a8 src/components/nvml/linux-nvml.c: nvml component: fix detectDevices(). The routine detectDevices() always returned the error PAPI_ESYS when there was a device available. This resulted in no nvml events being available. Fixed.
  * 11ad5894 src/components/nvml/linux-nvml.c: nvml component: add missing variable declaration. In the routine _papi_nvml_init_component(), the variable papi_errorcode was not declared, which prevented this component from building. Added a declaration of papi_errorcode as int.

2012-12-06
  * 9567dfef src/ftests/first.F src/ftests/second.F: Fix warning messages issued by gfortran 4.6.x regarding loss of precision when casting REAL to INT. Thanks to Heike for identifying the proper intrinsics.
  * 72588227 src/papi.c src/papi.h: Add PAPI_get_eventset_component() to get the component index from an event set. This is symmetric with PAPI_get_event_component, which extracts the information from an event. In response to a request from John Mellor-Crummey.
  * 2e055d40 src/components/rapl/linux-rapl.c: Fix a compiler warning about a possibly uninitialized return value.

2012-12-05
  * 1aae2246 src/utils/command_line.c: Reformat the floating point output string to recognize that you can't cast the *value* of a long long to a double and expect to get the right answer; you need to cast the *pointer* to a double, then everything works.
  * 0e834fc2 src/utils/command_line.c: Incorporated use of the new PAPI_add_named_event API. Restructured output to support formatted printing of built-in DATATYPEs: UINT64 prints as unsigned followed by (u); INT64 prints as signed; FP64 prints as float (but I don't like the default format); BIT64 prints as hex, prefixed by '0x'. Also, if info.units is not empty, units are appended to output values. These features can be demo'd with the RAPL component.
  * af6abec2 src/papi.h: Rearranged DATATYPE enums so INT64 is now the default (0) value. Also added a BIT64 type for unspecified bitfields.

2012-12-04
  * 862033e0 src/components/bgpm/IOunit/linux-IOunit.c src/components/bgpm/IOunit/linux-IOunit.h src/components/bgpm/L2unit/linux-L2unit.c...: Resolved a multiple-components conflict on BG/Q when overflow is enabled for multiple events from different components at the same time.
  * 44744002 src/utils/command_line.c: Add -x and -u options to papi_command_line to allow printing counter values in hexadecimal and unsigned formats.

2012-11-30
  * 25a914c5 src/papi_user_events.c: Cleanup unused variable warnings in the user_events code.

2012-11-28
  * 9a75f872 src/Rules.pfm4_pe src/configure src/configure.in: Cleanup the build under icc. libpfm4's build system uses a gcc-specific flag, -Wno-unused-parameter. It does this via a variable, DBG, in config.mk: DBG?=-g -Wall -Werror -Wextra -Wno-unused-parameter The Intel compiler doesn't understand -Wno-unused-parameter and complains about it. In Rules.pfm4_pe we set DBG for icc builds.

2012-11-27
  * 4def827b src/configure src/configure.in: Fix the perfctr build that was breaking due to missing CPU. Mark Gates was reporting PAPI 5 wasn't running properly on Keeneland. It looks like some CPU cleanups in the configure code broke things. Hopefully this helps the situation.
2012-11-21
  * 4316f172 src/perf_events.c: perf_events: get rid of the "PAPI Error: Didn't close all events" error. This was meant more as a warning; it could trigger when closing an EventSet that had an event partially added but failed for some reason.
  * 671e10bd src/utils/command_line.c: papi_command_line: fix error output. The error messages got a bit weird-looking due to the PAPI error printing changes a while back.
  * 959afa49 src/papi_internal.c: Fix _papi_hwi_add_event to report errors back to the user. Previously _papi_hwi_add_event would report all errors returned by add_native_events() as PAPI_ECNFLCT, even though add_native_events() returned a wider range of errors.
  * 8ecb70ba src/perf_events.c: Have perf_event return PAPI_EPERM rather than PAPI_ECNFLCT if the kernel itself returns EPERM.
  * 9053ca1c src/perf_events.c: Work around a kernel issue with PERF_EVENT_IOC_REFRESH. It's unclear exactly the best way to restart sampling. Refreshing with 1 is the "official" way as espoused by the kernel developers, but it doesn't work on Power. 0 works for Power and most other machines, but the kernel developers say not to use it. This makes Power use 0 until we can figure out exactly what is going on.
  * e85df04b src/components/appio/tests/appio_test_socket.c: Added support for distinguishing between network and file I/O; added events to measure statistics for sockets; updated the README.

2012-11-15
  * 248694ef src/x86_cpuid_info.c: Update the x86_cpuid_info code for KNC. On Knights Corner the leaf2 code returns 0 for the count value. We were printing a warning on this; better to just skip the cache detection code if we get this result.

2012-11-08
  * 82c93156 src/linux-bgp-memory.c src/linux-bgp.c src/linux-bgp.h: There was more cleaning up necessary in order to get PAPI compiled on BG/P. It should work now with the recommended configure steps described in INSTALL.

2012-11-07
  * 77da80b3 src/Makefile.inc src/configure src/configure.in...: Make BGP use papi_events.csv. This was easier than trying to clean up the linux-bgp-preset-events.c file to have the proper file layout.
  * fc8a4168 src/linux-bgp.c: Fix some linux-bgp build issues. No one had tried compiling after all the PAPI 5.0 changes, so many bugs slipped in.
  * c16ef312 src/ctests/perf_event_uncore.c: Fix type warnings in the perf_event_uncore test.
  * 3947e9c8 src/ctests/perf_event_uncore.c: Put a bandaid on the perf_event_uncore test. Check for an Intel family 6 model 45 processor (Sandy Bridge EP) before executing the test.

2012-09-27
  * a23d95f8 src/papi.c src/papi.h src/papi_fwrappers.c...: Mark some comments @htmlonly. This cleans up which man pages are generated.

2012-11-07
  * d239c350 src/Makefile.inc src/Rules.pfm4_pe: Factor out duplicate install code from Rules.pfm4_pe. Makefile.inc has a rule to install shared libraries. However, Rules.pfm4_pe also had a slightly different set of rules to install code for shared libraries. This led to the same shared library being installed under two different names. The duplicate code has been removed from Rules.pfm4_pe, and a symbolic link has been added to ensure that any code that might have linked with libpapi.so.$(PAPIVER).$(PAPIREV).$(PAPIAGE) still runs.

2012-10-30
  * fcc64ff9 src/papi_events.csv: Add PAPI_HW_INT event for IvyBridge.

2012-10-26
  * ef89fc56 src/papi_events.csv: MIC: update the PAPI_FP_INS / PAPI_VEC_INS instruction. We were using VPU_INSTRUCTIONS_EXECUTED for PAPI_FP_INS, but really it's more appropriate for PAPI_VEC_INS. This leaves PAPI_FP_INS undefined, which breaks a lot of the ctests. A long-term goal should probably be modifying the tests to use another counter if PAPI_FP_INS isn't available (this affects Ivy Bridge too).

2012-10-25
  * 975c03f1 src/perf_events.c: perf_event: fix granularity bug. A cut-and-paste error in the last set of changes would have meant that if you tried to explicitly set granularity to thread you'd get system instead.
  * 3cd3a62d src/configure src/configure.in src/ctests/Makefile...: Add a perf_event_uncore ctest. Also add a new type of ctest, perf_event specific. In theory we should have configure only enable this if perf_event support is being used.
  * 5ee97430 src/perf_events.c: perf_event: add PAPI_DOM_SUPERVISOR to the allowed perf_event domains. perf_event supports this domain, but since we didn't have it in the list PAPI wasn't letting us set/unset it. This is needed for uncore support, as for uncore the domain must be set to allow monitoring everything.
  * c9325560 src/perf_events.c: perf_event: enable granularity support. Add support for PAPI_GRAN_SYS to perf_event. This is needed for uncore support.

2012-10-18
  * 59d3d758 src/mb.h src/perf_events.c: Update the memory barriers. It turns out PAPI fails on older 32-bit x86 machines because it tries to use an SSE rmb() memory barrier. (Yes, I'm trying to run PAPI on a Pentium II. Don't ask.) It looks like our memory barriers were copied out of the kernel, which doesn't quite work because they expect some kernel infrastructure. This patch uses the definitions used by the "perf" tool instead. Also dropped the use of the mb() memory barrier on the mmap tail write, as the perf tool itself did a while ago, so I'm hoping it's safe for us to do so as well. It makes these definitions a lot simpler.

2012-10-08
  * bcdce5bc src/perf_events.c: perf_event: clarify an error message. The message was saying that detecting rdpmc support broke, but the real error is that perf_events itself is totally broken on this machine and rdpmc was just the first code that tried to access it.

2012-10-02
  * 3bb3558f src/mb.h: Update memory barriers for Knights Corner. Despite being x86_64 it doesn't support the SSE memory barrier instructions, so add a case in mb.h to handle this properly.

2012-10-01
  * 38a5d74c src/libpfm4/README src/libpfm4/docs/Makefile src/libpfm4/docs/man3/libpfm_intel_atom.3...: Merge libpfm4 with Knights Corner support.
  * bf959960 src/papi_events.csv: Change "phi" to "knc" to match libpfm4 for Xeon Phi / Knights Corner support.

2012-09-20
  * d9249635 ChangeLogP501.txt RELEASENOTES.txt: Update the release notes and add a changelog for 5.0.1.
  * a1e30348 man/man1/papi_avail.1 man/man1/papi_clockres.1 man/man1/papi_command_line.1...: Rebuild the manpages for a 5.0.1 release.

=== ChangeLogP511.txt ===

2013-05-21
  * 602d8dbc man/man1/papi_avail.1 man/man1/papi_clockres.1 man/man1/papi_command_line.1...: Rebuild man pages for a 5.1.1 release.
  * 93d9be34 doc/Doxyfile-common papi.spec src/Makefile.in...: Bump version number for a 5.1.1 release.

2013-04-15
  * 8e47838d src/components/cuda/linux-cuda.c: When creating two event sets (one for the CUDA and one for the CPU component), the order of event set creation appears crucial. When the CPU event set has been created before the CUDA event set, PAPI_start() for the CUDA event set works fine. However, if the CUDA event set has been created before the CPU event set, then PAPI_start(CUDA_event_set) forces the CUDA control state to be updated one more time, even if the CUDA event set has not been modified. The CUDA control state function did not properly handle this case and hence caused PAPI_start() to fail. This has been fixed.

2013-05-13
  * c93dfa68 src/perf_events.c: perf_event component: update error returns. This passes more error return values back to PAPI. Before this change a lot of places were hardcoded to PAPI_EPERM even if sys_perf_event_open() was reporting a different error.
2013-05-08
  * d1db58e8 src/configure src/configure.in: Force the use of pthread mutexes on ARM. This lets the system libraries worry about the best way to define mutexes, rather than trying to hand-code assembly around all of the various issues there are with atomic instructions in the ARM architecture. It might make sense to enable this for *all* Linux architectures, but for now just do it for ARM.
  * 29662e3e src/linux-lock.h: Commit 59d3d7584b2925bd05b4b5d0f4fe89666eb8494a removed the definition of mb(). mb() was defined as rmb(). This just corrects it back. (Note from VMW: this fixes some things, but ARM still won't build on a Cortex A9 Pandaboard due to the use of the "swp" instruction. The proper fix is probably to enforce posix-mutexes on ARM.)

2013-04-22
  * ff29fd12 src/run_tests.sh: The test for determining whether to run valgrind was backwards. Correcting that allows the run_tests.sh script to stay the same; one just needs to define "VALGRIND=yes" (or any non-null string) to make run_tests.sh use valgrind.
     src/run_tests.sh | 6 ++----
     1 file changed, 2 insertions(+), 4 deletions(-)

    diff --git a/src/run_tests.sh b/src/run_tests.sh
    index d1ce205..9337ff2 100755
    --- a/src/run_tests.sh
    +++ b/src/run_tests.sh
    @@ -19,10 +19,8 @@ else
       export TESTS_QUIET
     fi
    -if [ "x$VALGRIND" = "x" ]; then
    -# Uncomment the following line to run tests using Valgrind
    -# VALGRIND="valgrind --leak-check=full";
    -   VALGRIND="";
    +if [ "x$VALGRIND" != "x" ]; then
    +   VALGRIND="valgrind --leak-check=full";
     fi

     #CTESTS=`find ctests -maxdepth 1 -perm -u+x -type f`;

2013-03-28
  * 1e8101f6 src/run_tests.sh: run_tests.sh: further refine the component test find. Exclude *.cu when looking for component tests.

2013-03-25
  * 0b600bc5 src/run_tests.sh: run_tests.sh: File mode changes. run_tests.sh is now expected to run from the install location in addition to src. The script tried to remove execute from *.[c|h]; now it just excludes *.[c|h] from the find commands.

2013-03-18
  * 06f9c43b src/perfctr-x86.c: perfctr: don't read in the event table multiple times. papi_libpfm3_events.c now reads in the predefined events; we don't also need to do this in perfctr setup_x86_presets().
  * 48d7330c src/perfctr.c: Fix segfault in perfctr.c. The preset lookup uses the cidx index, but in perfctr.c we weren't passing a cidx value (it was being left off). The old perfctr code plays games with defining extern functions, so the compiler wasn't giving us a warning.

2013-03-14
  * eda94e50 src/components/bgpm/L2unit/linux-L2unit.c src/linux-bgq.c: If a counter is not set to overflow (threshold==0, which happens when PAPI_shutdown is called) then we do not want to rebuild the BGPM event set, even if the event set has been used previously and hence "applied or attached". Usually if an event set has been applied or attached prior to setting overflow, the BGPM event set needs to be deleted and recreated (which implies malloc() from within BGPM). Not so, though, if threshold is 0, which is the case when PAPI_shutdown is called. Note, this only applies to Punit and L2unit, not IOunit, since an IOunit event set is not applied or attached.

2013-03-13
  * 46f6123a src/components/bgpm/IOunit/linux-IOunit.c src/components/bgpm/IOunit/linux-IOunit.h src/components/bgpm/L2unit/linux-L2unit.c...: Overflow issue on BG/Q resolved. Overflow with multiple components worked; overflow with multiple components and multiple events did not work as it was supposed to.

2013-03-07
  * 6a0813f8 src/linux-common.c src/linux-memory.c: Fix the build on Linux-SPARC. I dug out an old SPARC machine and fixed the PAPI build on it.
  * 51fe7e53 src/perf_events.c: More comprehensive sys_perf_open to PAPI error mappings. This tries to cover more of the errors returned by sys_perf_open and map them to better results. EINVAL is a problem because it can mean Conflict as well as Event not found and many other things, so it's unclear what to do with it.
  * 1479a67f src/perf_events.c src/sys_perf_event_open.c: Return proper error codes from sys_perf_event_open. For some reason on x86 and x86_64 we were trying to set errno manually and thus overwriting the proper errno value, causing all errors to look like PAPI_EPERM. This removes that code, as well as adds code to report ENOENT as PAPI_ENOEVENT. With this change, on IVY the following happens, which looks more correct: ./utils/papi_command_line perf::L1-ICACHE-PREFETCHES Failed adding: perf::L1-ICACHE-PREFETCHES because: Event does not exist command_line.c PASSED

2013-03-06
  * 7a3e75e8 src/papi_libpfm4_events.c src/papi_user_events.c: Coverity fixes. Coverity pointed out a case where load_user_event_table() could leak memory; the change in the location of the papi_free(foo) ensures that the allocated memory is freed. Coverity also pointed out one path through the code in _papi_libpfm4_ntv_code_to_descr() that did not free memory allocated in the function; added a free on that path to free up that memory. Thanks, Will Cohen.

2013-03-04
  * b19bd1a2 src/components/rapl/linux-rapl.c: Remove a stray debug statement. Thanks to Harald Servat for catching this.

2013-03-01
  * 6e5be510 src/utils/command_line.c: Wrestled some horribly convoluted indexing into shape. The -u and -x options now print as expected (I think).

2013-01-31
  * 02bd70ad src/components/nvml/linux-nvml.c: linux-nvml.c: Fix type warning. CUDA and NVML have a signed vs. unsigned thing going on in their returned device counts; cast away the warning.

2013-01-23
  * a5bed384 src/linux-memory.c src/linux-timer.c: ia64 fixes. Thanks to Tony Jones for patches.

2013-01-16
  * 021db23a src/components/nvml/linux-nvml.c: nvml component: cleanup a memory leak. We did not free a buffer at shutdown time.

2013-05-17
  * b25fc417 src/perf_events.c: perf_event: allow running when perf_event_paranoid is 2. perf_event_paranoid set to 2 means allow user monitoring only (no kernel domain). The code before this mistakenly disabled all events in this case. Also set the allowed domains to exclude PAPI_DOM_KERNEL.

2013-05-16
  * 12768bec src/papi_events.csv: papi_events.csv: Revert a little mishap in adding ivbep support. Somehow the contents of papi_hl.c ended up in the events file.
  * 5e97ad7f src/papi_events.csv: Add identifier for ivb_ep.

2013-01-29
  * e201b8eb src/papi.c: General doxygen cleanup: remove all "No known bugs" messages; correct and clean up examples for PAPI_code_to_name and PAPI_name_to_code.

=== ChangeLogP520.txt ===

2013-08-02
  * 6b62d586 man/man1/papi_avail.1 man/man1/papi_clockres.1 man/man1/papi_command_line.1...: Update the manpages for a pending 5.2 release. New pages for PAPI[F]_epc and papi_version.
  * 1ae08835 src/linux-common.c: Try to properly detect the number of sockets. Use totalcpus rather than ncpu in the calculation. This change fixes things on a Sandybridge-EP machine. We should maybe find a more robust way to detect this.
  * 79c37fbf .../perf_event_uncore/tests/perf_event_uncore.c .../tests/perf_event_uncore_multiple.c: perf_event_uncore: have tests skip if the component is disabled rather than fail.
  * 638ccf6b .../perf_event_uncore/perf_event_uncore.c: perf_event_uncore: change the order of the uncore detection logic. This way it will report an error of "no uncore found" before it reports "not enough permissions". That way a user won't waste time getting permissions only to find out they didn't have an uncore anyway.
  * 30582773 src/components/perf_event/pe_libpfm4_events.c: perf_event: fix papi_native_avail output. A recent change of mine that added stricter error checking for libpfm4 event lookup broke event enumeration on perf_event, specifically papi_native_avail output. libpfm4 will return an error on some events if no umask or an improper umask is supplied, but papi_native_avail always wants to print the root event and umasks separately.
This temporary fix just ignores libpfm4 umask errors; we might in the future want to properly indicate which events are only valid when certain umasks are present. * c7612326 src/utils/native_avail.c: papi_native_avail: fix empty component case If a component had no events, papi_native_avail would ignore the error returned by PAPI_enum_cmp_event( PAPI_ENUM_FIRST ); and try to print a first event anyway. * e1b064eb .../perf_event_uncore/perf_event_uncore.c: perf_event_uncore: disable component if no events found This can happen on older (pre 3.6) kernels with the new libpfm4 that does proper uncore detection. 2013-08-01 * 9a54633a src/components/host_micpower/linux-host_micpower.c src/components/infiniband/linux-infiniband.c src/components/nvml/linux-nvml.c...: Components: Use the cuda dlopen fix in all cases. See 4cb76a9b for details; the short version is if you call dlopen when you have been statically linked to libc, it gets ugly. 2013-07-31 * dbc44ed1 src/components/perf_event/pe_libpfm4_events.c .../perf_event_uncore/perf_event_uncore.c .../perf_event_uncore/peu_libpfm4_events.c: perf_event libpfm4 events -- correctly handle invalid events It was possible for event names to be obtained from libpfm4 during enumeration that were not valid events. This usually happens with uncore events, where the uncore is listed as available based on cpuid but when libpfm4 tries to get the uncore type from the kernel it finds out it is unsupported. This change makes this properly fail, instead of just returning "0" for all the event parameters (which is a valid event on x86). Also make this change in the regular perf_event component, even though it is less likely to happen in practice. * 4720890a .../perf_event_uncore/perf_event_uncore.c: perf_event_uncore: remove check_permissions() test It was trying to see if an EventSet was runnable by using the current permissions and adding the PERF_HW_INSTRUCTIONS event. That doesn't really make sense on uncore.
The perf_event component uses this test to try to give errors early, at set_opt() time rather than at the first run time, although in practice now we can probably make intelligent guesses based on the current permission levels. * 113d35f7 .../perf_event_uncore/perf_event_uncore.c: perf_event_uncore: remove unused kernel workarounds uncore only works on Linux 3.6 or newer so all of the pre-2.6.35 workarounds aren't necessary. If someone has backported the uncore support to kernels that old, hopefully they've also backported all the other bugfixes too. 2013-07-25 * 4cb76a9b src/components/cuda/linux-cuda.c: Trial fix for the cuda component static libc linking issue. Weak link against _dl_non_dynamic_init, this appears in my limited testing to be in gnu libc.a and not in the so. For background, it was reported by Steve Kaufmann that statically linking tools with a PAPI library configured with the CUDA component segfaulted. It appears that calling any of the dynamic linker functions from a static executable is asking for pain. See Trac bug 182 https://icl.cs.utk.edu/trac/papi/ticket/182 2013-07-24 * ad47cfb9 src/configure src/configure.in: Add linux-pfm-ia64 to configure I'm not sure if this is enough to fix itanium support but it's a start. * 098294c5 src/components/example/tests/example_basic.c .../example/tests/example_multiple_components.c: Fixed tests for example component. Both tests failed due to incorrect check of the components PAPI has been configured with. 2013-07-23 * c0c4caf4 src/linux-memory.c src/papi_events.csv: Add initial support for IBM POWER8 processor Add initial support for IBM POWER8 processor The IBM POWER8 processor (to be publicly announced at some future date) has some preliminary support in libpfm with a subset of native events. These POWER8-related libpfm changes were pulled into PAPI on July 3, so further updates in PAPI were required to support this new processor. This patch adds that required support. 
NOTE: Due to the fact that only a subset of native events have been publicised at this point (and pushed into libpfm), not all of the usual PAPI preset events have corresponding native events. The rest of the POWER8 native events will be pushed upstream once they are verified, and then we can flesh out the PAPI preset events. With this initial POWER8 support patch, 5 of the ctests and ftests fail, compared to 3 when PAPI is run on a POWER7. At least one of the failing testcases is due to testing being done on an early POWER8 processor with some known hardware problems. We presume the number of failing tests will decrease once we have GA-level hardware to test on. 2013-07-22 * 6c231d1a src/configure: Rerun autoconf for f4ec143e Correct versioning of libpapi.so * f4ec143e src/configure.in: Correct versioning of libpapi.so The configure for linux always set the soname to libpapi.so. This causes problems when /sbin/ldconfig tries to update the library information on linux. The shared library is installed as /lib{64}/libpapi.so.$VERSION, but the shared library has the soname of libpapi.so. ldconfig makes a symbolic link from /lib/libpapi.so to the actual versioned shared library, /lib{64}/libpapi.so.$VERSION. The configure should get the soname correct to avoid creating this symbolic link. This patch only addresses the issues for some of the possible platforms and similar patches may be needed for other platforms. 2013-07-19 * 92356bbd src/papi.c src/threads.c src/threads.h: Attempt to fix a memory leak in fork2 test. Fork2 does the following: PAPI_library_init(); fork(); the parent then wait()s while the child calls PAPI_shutdown() (which runs _papi_hwi_shutdown_global_threads(), calling _papi_hwi_shutdown_thread() for each threadinfo we allocated) and then calls PAPI_library_init() again. _papi_hwi_shutdown_thread checks who allocated a ThreadInfo entry in the global list, and will only free it if our thread did the allocation.
When threading is not initialized, we fall back to getpid(). Now, in the child process, the one ThreadInfo item on the list was allocated by our parent, so at shutdown time we don't free it, and thus leak it. Solution is to add a parameter to _hwi_shutdown_thread to force shutdown even if we didn't allocate it. At _papi_hwi_shutdown_global_threads() time, who cares, it's closing time. * c04d908e src/cpus.c: Fix a deadlock in _papi_hwi_lookup_cpu(). If cpu_num is not found by _papi_hwi_lookup_cpu(), _papi_hwi_initialize_cpu() calls insert_cpu(), which locks CPUS_LOCK, which was already held by _papi_hwi_lookup_cpu(). * efac24c4 src/components/micpower/linux-micpower.c: micpower: fix return value check Also add a time check at stop time. 2013-07-16 * b9fd9dd1 src/configure src/configure.in: configure: Fix AIX build perfctr_ppc was not the only system that relied on ppc64_events.h, power*.h, and friends. First run at a fix is -Icomponents/perfctr_ppc for the C and F flags... * 46042e68 src/components/micpower/linux-micpower.c: micpower: update some indexing code 2013-07-15 * 5220e7d2 INSTALL.txt: INSTALL.txt: typo --with-arch=, not --arch=; Thanks to Karl Schulz for catching this. * 207e0ee0 src/papi_libpfm_events.h: papi_libpfm_events: needs include files for types. Include papi.h and papi_vector.h for papi_vector_t and PAPI_component_info_t * d96c01c7 src/components/perfctr/perfctr.c: perfctr: cleanup a warning Include papi_libpfm_events.h for _papi_libpfm_init() decl. * 367e1b38 src/components/perfctr/perfctr-x86.c src/components/perfctr/perfctr.c: perfctr: refactor out setup_x86_presets The setup_presets function served only to call _papi_libpfm_init, so we go the rest of the way and completely remove the function, calling _papi_libpfm_init directly from _perfctr_init_component. * 1ba38ce5 src/components/perfctr/perfctr-x86.c: perfctr: cleanup unused parameter warning. The perfctr code was refactored to only call into the table loading code one time.
This had the side effect of removing most of what setup_x86_presets does. * 02710ced src/configure src/configure.in: configure: remove debugging message The compiler detection code had a stray AC_MSG_RESULT. 2013-07-12 * 028ce29d src/components/lustre/linux-lustre.c: lustre: use whole directory name as event Gary Mohr reported that on a trial system he was seeing many events of the form fs3-* which were all chopped to fs3, not helpful. I've not actually been able to figure out exactly how lustre names things, I've seen it described as - But have no clue what uid promises. 2013-07-15 * 129d4587 src/papi.c: allow more than one EventSet to attach to a CPU at a time This is necessary for perf_event_uncore support, as multiple uncores will want to attach to a CPU. It looks like this change won't break anything, and the tests pass on my test machines. I am a bit concerned about cpu->running_eventset, though no one seems to use that value... * bcda5ddd src/components/perf_event_uncore/tests/Makefile .../tests/perf_event_uncore_nogran.c: perf_event_uncore: remove perf_event_uncore_nogran test It is unnecessary after recent changes to the uncore component. * b1b9f654 src/components/perf_event_uncore/tests/Makefile .../tests/perf_event_uncore_cbox.c: perf_event_uncore: add perf_event_uncore_cbox test This adds a non-trivial test of the CBOX uncores. It turned up various bugs in the PAPI uncore implementation. * df1b6453 src/linux-common.c: linux: properly set hwinfo->socket value It was being derived from hwinfo->ncpu but being calculated before hwinfo->ncpu was set.
2013-07-13 * ee537448 .../perf_event_uncore/perf_event_uncore.c .../perf_event_uncore/peu_libpfm4_events.c .../perf_event_uncore/peu_libpfm4_events.h: perf_event_uncore: properly report number of total counters available * 7eb93917 src/components/perf_event/Rules.perf_event src/components/perf_event/pe_libpfm4_events.c src/components/perf_event/pe_libpfm4_events.h...: perf_event/perf_event_uncore/libpfm4 -- rearrange files Give perf_event and perf_event_uncore copies of papi_libpfm4_events to work with, as they will have different needs for the code. Get rid of the perf_event_lib stuff. It was a hack to begin with and in the end not much code will be shared. Maybe we can re-share things once uncore support is complete. 2013-07-12 * 6810af2a src/components/perf_event/perf_event.c .../perf_event_uncore/perf_event_uncore.c src/papi_libpfm4_events.c...: papi_libpfm4: properly call pfm_terminate() in papi_libpfm4_shutdown * 010497f4 src/components/perf_event/perf_event.c .../perf_event_uncore/perf_event_uncore.c src/papi_libpfm4_events.c...: split papi_libpfm4_init() split this function because the perf_event_uncore() component is going to want to initialize things differently than plain perf_event * d9023411 src/components/perf_event/perf_event.c: perf_event: on old kernels if SW Multiplex enabled, then report proper number of MPX counters available it may be different than the amount HW supports * 7595a840 src/components/perf_event/perf_event_lib.c: perf_event: use PERF_IOC_FLAG_GROUP when resetting events This ioctl argument specifies to reset all events in a group, so we don't have to iterate. This argument dates back to the introduction of perf_event and it makes the code a bit cleaner. * f220fd19 src/ctests/Makefile src/ctests/reset_multiplex.c: Add reset_multiplex.c PAPI_reset() potentially exercises different paths when resetting normal and multiplexed eventsets, so make sure we test both. 
* f784a489 src/components/lustre/linux-lustre.c: lustre: botched a conflict resolution properly do error checking on addCounter() * c1350fc8 src/components/perf_event/perf_event.c src/components/perf_event/perf_event_lib.c src/components/perf_event/perf_event_lib.h: perf_event: move overflow and profile code out of common lib the perf_event_uncore component doesn't need it * 8dde03fc .../perf_event_uncore/perf_event_uncore.c: perf_event_uncore: remove profiling and overflow code perf_event doesn't support sampling or overflow on uncore * 30d23636 src/components/lustre/linux-lustre.c: lustre component: Several fixes 1. create a dynamic native events table; in pathological cases, lustre can have lots of events. 2. resolve some warnings: change signature of init_component, properly error check addCounter. 3. Add a preprocessor flag to fake the interface: Set LIBCFLAGS="-DFAKE_LUSTRE" * 7ef51566 .../perf_event_uncore/perf_event_uncore.c: perf_event_uncore: remove dispatch timer call perf_event doesn't support sampling on uncore events * 667661c6 src/components/perf_event/perf_event.c src/components/perf_event/perf_event_lib.c src/components/perf_event/perf_event_lib.h: perf_event: move rdpmc detection back into perf_event.c It was in the perf_event_lib but uncore won't use the feature. * d46f01e1 .../perf_event_uncore/perf_event_uncore.c: perf_event_uncore: check the paranoid file Disable the component if paranoid isn't 0 or lower, and we're not running as root. * e4ec67d1 src/components/perf_event/perf_event.c: perf_event and paranoid level 2 If paranoid level 2 (no kernel events) was set we were removing PAPI_DOM_KERNEL from the allowable domains. We were doing this even if the user was root. This code checks for uid 0 and overrides the restriction.
* c5501081 src/components/perf_event/perf_event_lib.c src/components/perf_event/perf_event_lib.h: rename sys_perf_event_open2() call back to sys_perf_event_open() This was changed when merging code to avoid a conflict but wasn't renamed back when the conflict was fixed. 2013-07-11 * e263ea60 src/configure src/configure.in: configure: libpfm selection logic rework If configure detected perfctr it would force libpfm3 to be used, even with --with-perf_events; now force libpfm4 if perf_events is requested. 2013-07-10 * 7a3ce030 .../host_micpower/Makefile.host_micpower.in src/components/host_micpower/Rules.host_micpower src/components/host_micpower/configure...: Component: host_micpower This is a component that exports power information for Intel Xeon Phi cards (MIC). The component makes use of the MicAccessAPI distributed with the Intel Manycore Platform Software Stack. k-mpss) * 9d9bd9c2 src/ctests/shlib.c: Fwd: Re: [Ptools-perfapi] ctests/shlib FAILED Should have sent this to the papi devel list. -Will -------- Original Message -------- Subject: Re: [Ptools-perfapi] ctests/shlib FAILED Date: Tue, 09 Jul 2013 23:20:10 -0400 From: William Cohen To: ptools-perfapi@eecs.utk.edu On 03/09/2012 03:40 PM, William Cohen wrote: > I was looking through the test results and found that ctests/shlib FAILED on all the machines I tested on because libm shared library is already linked in. There is no difference in the number of shared libraries before and after the dlopen. The test ctests/shlib fails as a result of this. > > -Will > _______________________________________________ > Ptools-perfapi mailing list > Ptools-perfapi@eecs.utk.edu > http://lists.eecs.utk.edu/mailman/listinfo/ptools-perfapi > I did some more investigation of this problem today. I found that the lmsensor component implicitly pulls in the libm. As an alternative, I wrote the attached patch that uses setkey() and encrypt() in libcrypt.so instead.
It works on various linux machines, but I do not know whether it is going to work on other OS. -Will From c53c97e1de2d1c7dc0bca64d1906287ff73343c6 Mon Sep 17 00:00:00 2001 From: William Cohen Date: Tue, 9 Jul 2013 22:37:27 -0400 Subject: [PATCH] Avoid using libm.so for ctests/shlib because of implicit use in some components The lmsensors component can implicitly pull in libm.so into the executable. Unfortunately, the ctests/shlib test expects that libm.so is not loaded and will fail because there is no change in the count of shared libraries. The patch uses libcrypt.so library setkey and encrypt functions to test PAPI_get_shared_lib_info( ) instead of libm.so library pow function. 2013-07-09 * bdc9b34b .../tests/perf_event_amd_northbridge.c: Perf_event_amd_northbridge_test: Use buffer event_name instead of uncore_event The variable uncore_event is initialized to NULL and is never changed during execution of the test. PAPI_add_named_event fails and the event set cannot be started. The correct event name is stored in event_name; replacing all occurrences of uncore_event with event_name therefore fixes the problem mentioned above. 2013-07-08 * a1678388 src/components/micpower/linux-micpower.c: micpower: Fix output in native_avail and component_avail. It uses cmp_info.name, not .short_name? Native Events in Component: mic-power Name: mic-power Component for reading power on Intel Xeon Phi (MIC) Should both match what is prepended to event names, so change .name from mic-power to micpower. * e0582f2d src/components/micpower/linux-micpower.c: Micpower: fix a typo subsystem, not sybsystem... * c7b357ec INSTALL.txt: INSTALL.txt: update instructions for MIC. * 34a1124e src/components/perf_event_uncore/tests/Makefile .../tests/perf_event_amd_northbridge.c: Add perf_event_amd_northbridge test The test should show how to write a program using AMD fam15h NB with a 3.9 kernel.
Once libpfm4 gets updated we can see if it's possible to also have the test properly run on 3.10 kernels (in that case the regular perf_event_uncore test should work w/o changes) * 41b6507c .../perf_event_uncore/tests/perf_event_uncore.c .../tests/perf_event_uncore_multiple.c: Make perf_event_uncore tests use PAPI_get_component_index() They were open-coding the component name search for no good reason. 2013-07-05 * abf38945 src/papi_libpfm4_events.c: avoid having a "default" PMU for the uncore component. On the main CPU component we have a "default" PMU where you can leave out the PMU part of the event name. This is unnecessary and sometimes confusing on uncore, so always print the full event name if it's an uncore PMU. * b9fe5c3e .../perf_event_uncore/tests/perf_event_uncore.c .../tests/perf_event_uncore_multiple.c: Update perf_event_uncore tests to properly fail if they don't have enough permissions * 32ae1686 .../perf_event_uncore/tests/perf_event_uncore.c: perf_event_uncore_test : properly use uncore component The sample code was still hardcoding to component "0" which shouldn't have worked. Thanks to Claris Castillo for pointing out this problem. * 59e73b51 src/papi_libpfm4_events.c: have _papi_libpfm4_ntv_name_to_code properly check pmu_type With the existing code, uncore events were being found by the perf_event component even when that component has uncore events disabled. 2013-07-03 * a01394eb .../tests/perf_event_uncore_lib.c: perf_event_uncore: fix ivb event in uncore test Now that libpfm4 officially supports plain ivb uncore, make sure the test event we were using matches what libpfm4 supports. 2013-07-01 * f10342a8 src/utils/cost.c: Clean up option handling in papi_cost papi_cost used strstr to search for the substring that matched the option. This is pretty inexact. Made sure that the options matched exactly and the option arguments for -b and -t were greater than 0.
Also make papi_cost print out the help if there was an option that it didn't understand. * b5adc561 src/utils/native_avail.c: Clean up option handling for papi_native_avail Corrected the help to reflect the name of the option "--noumasks". Print error message if the "-i", "-e", and "-x" option arguments are invalid. Avoid using strstr() for "-h", use strcmp instead. Also check for "--help" option. * 8933be9b src/utils/decode.c: Clean up option handling in papi_decode papi_decode used strstr() to match options; this can lead to inexact matches. The code should use strcmp instead. Make sure command name is not processed as an option. Also print help information if some argument is not understood. * d94ac43a src/utils/component.c: Improve option matching in papi_component and add "--help" option * bb63fe5c src/utils/command_line.c: Add options to papi_command_line man page and improve opt handling Add options mentioned in the -h output to the man page. Also improve the matching of the options. * 09059c82 doc/Makefile src/utils/version.c: Add information for papi_version to be complete * 4f2eee8c src/configure src/configure.in: add a --disable-perf-event-uncore option to configure 2013-06-29 * 901c5cc2 src/components/perf_event/perf_event.c src/components/perf_event/perf_event_lib.c .../perf_event_uncore/perf_event_uncore.c...: remove syscalls.h it's no longer needed * 4d7e3666 src/Rules.perfmon2 src/components/perfmon2/Rules.perfmon2 src/components/perfmon2/perfmon.c...: move perfmon modules to their own component directory * a7e9c5f1 src/Rules.perfctr src/Rules.perfctr-pfm src/components/perfctr/Rules.perfctr...: move perfctr files to components/perfctr directory verified that perfctr-x86 still builds and works. perfctr_ppc has all the files to build, but it doesn't work. It looks like no one has tried to build perfctr-ppc for a very very long time.
2013-06-27 * e9dec1fd src/ctests/hl_rates.c src/papi.h src/papi_fwrappers.c...: debugged versions of these files * e282034e src/utils/native_avail.c: native_avail: Fix parse_unit_mask code Reported by Steve Kaufmann -------------------------- I noticed while developing a new component that the output from papi_native_avail was incorrectly presented for the component. I believe this is because the ":::" prefix is not being taken into account, so the base event name is interpreted as a unit mask and is prepended with a : before each legitimate unit mask associated with the event. I think this is just now happening because mine is the first component that has unit masks. I have included a fix below. The output of the unit masks by papi_native_avail now appears correctly for my component. Thanks, Steve 2013-06-26 * ff096786 src/ctests/fork2.c: fork2: Return fork2 test to its old functionality Once upon a time fork2 did: PAPI_library_init() … if ( fork() == 0) PAPI_shutdown() PAPI_library_init() … 2013-06-25 * 978d0d3d src/examples/PAPI_add_remove_event.c src/papi.c: Modify PAPI_list_events functionality to match documentation. You can now pass in a NULL event array and a zero count to get back the valid number of events. This can then be used to allocate the array and retrieve the exact number of events. Thanks to Nils Smeds and Alain Miniussi for pointing this out. * 13c52402 src/examples/PAPI_add_remove_event.c src/papi.c: Modify PAPI_list_events functionality to match documentation. You can now pass in a NULL event array and a zero count to get back the valid number of events. This can then be used to allocate the array and retrieve the exact number of events. Thanks to Nils Smeds and Alain Miniussi for pointing this out.
* 656e703e src/ctests/zero_fork.c: zero_fork ctest : make documentation match code * 96aad0c7 src/ctests/forkexec.c: forkexec ctest : make comments match code * b7c70953 src/ctests/forkexec4.c: forkexec4 ctest : make comments match the code * 7ffb0245 src/ctests/forkexec3.c: forkexec3 ctest : make documentation match code * 55ea846c src/ctests/forkexec2.c: forkexec2 ctest: have comments match what source does * 7a601e2a src/ctests/Makefile src/ctests/fork2.c: fork2 ctest: remove; was an exact duplicate of fork * 9deff49b src/ctests/fork.c: fork ctest: make comments match what file actually does 2013-06-24 * 2770d2c5 src/components/perf_event/perf_event_lib.c: perf_event: fix failure on ARM due to domain settings forgot to git add the perf_event_lib.c file :( * bf7c4c50 src/components/perf_event/perf_event.c src/components/perf_event/perf_event_lib.h: perf_event: fix failure on ARM due to domain settings On Cortex A8 and A9 it's not possible to set exclude_kernel (hardware does not support it). Make sure the rdpmc detection code doesn't try to set exclude_kernel. 2013-06-18 * 2b1433d8 src/ctests/all_native_events.c src/ctests/get_event_component.c: ctests: Skip calling into disabled components. This patch fixes a problem that was causing two test cases to abort when they were run on a system which has disabled components. Code was added to check if the component is disabled and just go to the next component in the list when the check is true. This prevents calls to code in components which may abort because the component was unable to initialize itself correctly. Thanks to Gary Mohr and Chuck LaCasse from Bull for reporting. 2013-06-14 * 1872453c src/testlib/do_loops.c: testlib: don't change the iter count The first argument to do_misses is an iteration count, for some reason the code was dividing this in half before doing work. Most places that call do_misses call it as do_misses ( 1, ...) 
void do_misses( int n, int bytes ) { ... n = n / 2; for ( j = 0; j < n; j++ ) { ... } } Since 1/2 == 0 in integer division, our usual do_misses( 1, ... ) call did no work at all. Thanks Nils Smeds for reporting. 2013-06-12 * c113e5b6 src/components/infiniband/Makefile.infiniband.in src/components/infiniband/Rules.infiniband src/components/infiniband/configure...: Infiniband component: switch over to weak linking Thanks to Gary Mohr for the patch. ---------------------------------- The infiniband component needs include files and libraries from both the infiniband ibmad and ibumad packages. When these packages are installed on a system, both packages normally install their files in the same place (includes in /usr/include/infiniband and libraries in /usr/lib64). The current component configure script allows you to provide a single include path and a single library path which gets used to access files from both packages. If these two packages have different install prefixes (or you are trying to build from install images of each package which are not located under the same directory) then the configure script fails because it can not find all the files it needs. These changes modify the configure script to replace the include and library dirs with an ibmad_dir and ibumad_dir and then uses the correct package's directory when looking for includes and libraries from that package. This makes it work like the cuda and nvml components with respect to configuring how to find files from a package the component depends on. There are also changes in this patch file to remove an unneeded variable in the dlopen code to resolve some defects reported by coverity. 2013-06-11 * d5be5643 src/components/rapl/tests/rapl_basic.c: rapl tests: make the error messages a little more verbose * 0c9f1a8c src/run_tests_exclude.txt src/run_tests_exclude_cuda.txt: run_tests_exclude files: Exclude a template file ------------------------------------------- It also adds the cpi.pbs file to the list of files to be excluded when the tests are run.
This file is just a template and attempting to run it hangs the run_tests script on our systems. ------------------------------------------- * 0a063619 src/run_tests.sh: run_tests.sh: fix exclude check. The script failed to remove .cu files; this patch fixes the check. Thanks Gary Mohr for reporting/patching. 2013-06-10 * 87399477 src/components/cuda/linux-cuda.c: cuda component: Address a coverity issue The library linking code saved return values in a local var but never used them. Thanks to Gary Mohr for submitting this patch. * 99b5b685 src/components/coretemp/tests/coretemp_basic.c: coretemp_basic: update test to properly enumerate events The code was old and was searching the entire native event list for ones that started with "hwmon". This updates the test to first find the coretemp component, then enumerate all events contained within. * b5c0795b src/components/rapl/tests/rapl_overflow.c: rapl component: address potential looping issue in test. A rapl component test has a do/while which only exited when PAPI_add_named_event returned 0 ( and only 0; the PAPI_E* error codes would not terminate a while( retval ) loop); this felt fragile, so minimal checks are now in place. * 4e9484a5 src/components/rapl/tests/rapl_overflow.c: rapl components: coverity fixes Reported/patched by Gary Mohr ----------------------------- The rapl component also has 1 defect in a test case. The complaint is that there is code that can never be executed. But this one is not as clear, it says that you can not exit the do/while loop that precedes a test of retval until retval=0 which means the test can never be true. The patch I am providing is to again remove the if test and its contents. But I am concerned that the do/while loop preceding the test could result in a hard loop that would hang the test case forever. It seems to me like something should also be done to ensure the loop will exit at some point.
Here is a patch that provides at least part of the fix: ----------------------------- * 0a533810 src/components/net/tests/net_values_by_name.c: net components: coverity fixes Reported/patched by Gary Mohr ----------------------------- The net component has one defect in one of the test cases. The complaint is that there is code that can never be executed. There is a test to see if event_count == 0 which can never be true at that place in the code. So I removed the if statement and its contents. Here is the patch: ----------------------------- 2013-06-07 * b784b063 src/components/nvml/Rules.nvml src/components/nvml/configure src/components/nvml/configure.in...: nvml: Apply Gary Mohr's dlopen patch. Move the nvml component over to using the dlopen and weak linking infrastructure of the cuda component. Thanks, Gary. * d6505b76 src/components/rapl/utils/rapl_plot.c: rapl: update the rapl_plot utility Get the event names by enumerating the ones available with the RAPL component rather than having a hard-coded list. * 2094c5b1 src/components/rapl/linux-rapl.c: rapl: add better error messages on component init failure * d0e668fb src/ctests/Makefile src/ctests/high-level.c src/ctests/hl_rates.c...: First round of changes to implement a PAPI high level event per cycle call. Untested. 2013-06-05 * 63074f82 src/components/rapl/linux-rapl.c: rapl: Add Ivb-EP support The Intel docs are spotty on what is actually supported. They state: 14.7.2 RAPL Domains and Platform Specificity The specific RAPL domains available in a platform varies across product segments. Platforms targeting client segment support the following RAPL domain hierarchy: * Package * Two power planes: PP0 and PP1 (PP1 may reflect to uncore devices) Platforms targeting server segment support the following RAPL domain hierarchy: * Package * Power plane: PP0 * DRAM 2013-05-31 * 31b4702d src/cpus.c: cpus.c: Don't run init_thread/shutdown_thread for disabled components. 
2013-05-29 * c48087d2 ChangeLogP511.txt RELEASENOTES.txt: Grab the updated ChangeLog from 5.1.1 Create a ChangeLog and update RELEASENOTES for a 5.1.1 release. 2013-05-24 * d1c8769e src/components/perf_event/tests/Makefile src/components/perf_event/tests/event_name_lib.c .../perf_event/tests/perf_event_user_kernel.c: Add perf_event user/kernel domain test This will be useful if/when we start handling domains properly. * 89e1aeba src/components/perf_event/tests/Makefile src/components/perf_event/tests/event_name_lib.c src/components/perf_event/tests/event_name_lib.h...: Add perf_event offcore response test Does a quick check to see if offcore response events are working. * bda86616 .../perf_event_uncore/perf_event_uncore.c src/ctests/get_event_component.c src/papi_internal.c: Some more ctest fixes involving disabled components. We enforce disabled components sometimes in the PAPI routines and sometimes in the components themselves. A bit confusing. It is tough with perf_event and perf_event_uncore because libpfm4 is shared by both, so the naming library for perf_event_uncore will be active even if the component is disabled, which can cause some confusing results if your test code ignores PAPI_ENOCMP error messages and accesses a disabled component anyway. This at least fixes our test cases; we might have to revisit this later. * b596621e doc/Doxyfile-common papi.spec src/Makefile.in...: Bump version numbers Call this 5.2.0.0 simply because it's greater than (and some components are completely incompatible with) 5.1.1 * eb77a91e .../perf_event_uncore/perf_event_uncore.c src/papi.c: Disallow enumerating events on disabled components. This was causing segfaults on tests where enumeration was trying to enumerate uncore events on machines w/o uncores.
  * 4e991a8a .../perf_event/tests/perf_event_system_wide.c:
  perf_event_system_wide: SKIP instead of FAIL if we don't have proper
  permissions

  * 7654bb1f src/Makefile.inc src/components/perf_event/tests/Makefile .../perf_event/tests/perf_event_system_wide.c...:
  move the perf_event specific tests to be with their component
  This means the perf_event tests will only be run if perf_event is
  enabled

  * d82e343f src/ctests/perf_event_uncore_multiple.c:
  ctests/perf_event_uncore_multiple: Improve this test a bit

  * b1a594bf src/perf_events.c src/sys_perf_event_open.c:
  Remove the no-longer-needed perf_events files
  Now we use the versions in the components/perf_event directory

  * a9a277f3 src/Makefile.in src/Makefile.inc src/configure...:
  Split up CPUCOMPONENT configure variable
  Now it is CPUCOMPONENT_NAME, CPUCOMPONENT_C, and CPUCOMPONENT_OBJ.
  This allows having setups with no CPUCOMPONENT set (perf_event used
  as a component) while staying backward compatible with non-component
  CPU components.
  This has been tested on perf_event and perfctr. It might break other
  architectures, so test if you can.

  * 69e29526 src/configure src/configure.in:
  configure: have --with-components append components to existing value
  This allows configure to first set the components value to include
  "perf_event" if detected, and then later append the values passed in
  with --with-components

  * 9d28df4c src/components/perf_event/Rules.perf_event src/components/perf_event/perf_event.c src/components/perf_event/perf_event_lib.c...:
  add perf_event and perf_event_uncore components
  This adds perf_event as a standalone component. Currently it is not
  compiled or built; some changes need to be made to the build system
  before this will work.
2013-05-21
  * ea996661 src/components/cuda/linux-cuda.c:
  eliminate warnings of unused vars

  * 691bf114 src/components/cuda/linux-cuda.c:
  eliminate warnings of unused vars

  * 221bfdab src/components/cuda/linux-cuda.c src/components/cuda/tests/HelloWorld.cu:
  Problem with cleanup_eventset(): after destroying the CUDA eventset,
  update_control_state() is called again, which operates on the already
  destroyed eventset.

2013-05-17
  * 84925f50 src/components/cuda/linux-cuda.c:
  When adding multiple CUDA events to an event set, PAPI_add_event()
  error 14 (CUPTI_ERROR_NOT_COMPATIBLE) is being raised from the CUPTI
  library. It turns out that the CUDA update control state wasn't
  cleaning the event set up properly before adding new events. It's
  fixed now.

  * 2337aa3a src/perf_events.c:
  perf_event: allow running when perf_event_paranoid is 2
  perf_event_paranoid set to 2 means allow user monitoring only (no
  kernel domain). The code before this change mistakenly disabled all
  events in this case. Also set the allowed domains to exclude
  PAPI_DOM_KERNEL.

2013-05-16
  * 617d9fbb src/papi_events.csv:
  papi_events.csv: Revert a little mishap in adding ivbep support
  Somehow the contents of papi_hl.c ended up in the events file.

  * 2aff4596 src/papi_events.csv:
  Add identifier for ivb_ep

  * 1810ddf9 src/papi_libpfm4_events.c src/papi_libpfm4_events.h src/perf_events.c:
  papi_libpfm4_events: allow specifying core/uncore/os_generic PMUs
  This allows you to specify that your perf_event/libpfm4-based
  component should only export the PMU types you want. Now we can have
  an uncore-only component.

  * 6554f3f0 src/papi_libpfm4_events.c:
  papi_libpfm4_events.c: only enable presets for component 0
  If we have multiple components using libpfm4, we only want to load
  the presets if it is component 0.

  * 6a4a4594 src/papi.c:
  PAPI_get_component_index() was matching names improperly
  For example, it was matching perf_event and perf_event_uncore as the
  same component.
  * 1b94e157 src/papi_hl.c:
  papi_hl.c: fix IPC calculation
  I broke it a while back while trying to clear out use of MHz. The
  code was uncommented and very confusing. It is slightly better now.

  * 92d4552e src/papi_libpfm4_events.c src/papi_libpfm4_events.h src/perf_events.c:
  papi_libpfm4_events: code changes to allow multiple-component access
  The PAPI libpfm4 code has been modified to allow multiple users at
  once. This will allow multiple components to use libpfm4, for
  example a CPU component and an uncore component.

  * 7902b30e src/cpus.c:
  cpus: fix debug compile
  I always forget to compile with --with-debug and miss changes in the
  DEBUG statements.

2013-05-15
  * 7ddc05ff src/cpus.c src/cpus.h:
  cpus.c: Add reference count to cpu structure
  It is possible to have multiple eventsets all attached to the same
  CPU, as long as only one eventset is running at a time. At EventSet
  cleanup, PAPI would free the CpuInfo_t structure even if other
  EventSets were still using it. This patch adds a reference count to
  the structure and only frees it after the last user is cleaned up.
  I also fixed a few locking bugs; hopefully I didn't introduce any
  new ones.

  * 6a61f9a2 src/cpus.c:
  more cleanup of the cpus.c file
  mostly formatting and added comments.

  * 710d269f src/cpus.c src/cpus.h src/papi.c...:
  cleanup cpus.h
  It had a lot of extraneous stuff in it. Also make sure it only gets
  included in files that need it.

  * 422226c9 src/papi.c:
  papi.c: add some extra debug messages

  * b1297058 src/cpus.c:
  Clean up cpus.c a bit
  Tracking down a segfault in the cpu attach cleanup code.

  * 7b6023cf src/ctests/perf_event_system_wide.c:
  ctests/perf_event_system_wide: much improved output
  It segfaults at the end though; unclear if this is a bug in the test
  or a bug in PAPI. Will investigate.
  * 38397aa3 src/components/cuda/configure src/components/cuda/configure.in src/components/cuda/linux-cuda.c...:
  Cuda component: Update library search path
  From Gary Mohr:
  It turns out that with the changes I gave you the path to the
  libcuda.so library is still hard coded to /usr/lib64. This assumes
  that the NVIDIA-Linux package is installed on the system where the
  build is being done. In Bull's case (and probably other users also)
  this is not always the case.
  To add the flexibility we need, I have added a new configure argument
  to the cuda configure script. The new argument is "--with-cudrv_dir"
  and it allows the user to specify where the cuda driver package (ie:
  NVIDIA-Linux) to be used for the build can be found. This new
  argument is optional and if not provided a value of "/usr" will be
  used. This allows existing configure calls to continue to work like
  before.

  * f8873d1c src/ctests/perf_event_system_wide.c:
  ctests/perf_event_system_wide: clean up the output a lot
  Still working on understanding it.

  * ebf20589 src/ctests/perf_event_system_wide.c:
  perf_event_system_wide: testing various DOMAIN and GRANULARITY
  settings
  pushing the limits of PAPI/perf_event, trying to see why system-wide
  measurement doesn't work.

2013-05-14
  * 0c1ef3f5 src/components/cuda/linux-cuda.c:
  CUDA component: Update description field
  Also removes a strcpy in the init code, which overwrote the name
  field. Thanks to Gary Mohr

  * 474fc00e src/ctests/perf_event_uncore_lib.c:
  Add AMD fam15h northbridge event to ctests/perf_event_uncore_lib.c

2013-05-13
  * cf56cdac src/perf_events.c:
  perf_event component: update error returns
  This passes more error return values back to PAPI. Before this
  change a lot of places were hardcoded to PAPI_EPERM even if
  sys_perf_event_open() was reporting a different error.

  * c824471b src/ctests/Makefile src/ctests/perf_event_system_wide.c src/ctests/perf_event_uncore.c...:
  Update the perf_event specific tests.
  This adds a few more uncore tests, which are currently showing some
  bugs in the implementation. The tests all need root permissions to
  run, so they should default to "SKIPPED" for most users.

2013-05-08
  * e0204914 src/configure src/configure.in:
  Force the use of pthread_mutexes on ARM
  This lets the system libraries worry about the best way to define
  mutexes, rather than trying to hand-code assembly around all of the
  various issues there are with atomic instructions in the ARM
  architecture.
  It might make sense to enable this for *all* Linux architectures,
  but for now just do it for ARM.

  * f21b1b27 src/linux-lock.h:
  Commit 59d3d7584b2925bd05b4b5d0f4fe89666eb8494a removed the
  definition of mb(). mb() was defined as rmb(). This just corrects it
  back.
  (Note from VMW -- this fixes some things, but ARM still won't build
  on a Cortex A9 pandaboard due to the use of the "swp" instruction.
  The proper fix is probably to enforce posix-mutexes on ARM)

2013-05-06
  * 913f0795 src/components/nvml/configure src/components/nvml/configure.in:
  NVML: Update wording for configure options.
  Thanks for pointing out the ambiguous wording, Heike.

  * 81a86c2b src/components/infiniband/Rules.infiniband src/components/infiniband/linux-infiniband.c src/components/infiniband/tests/Makefile:
  Infiniband component: use dlopen/dlsym for symbols
  Apply Gary Mohr's patch to switch the infiniband component over to
  dl* with the same motivations as the cuda component.

2013-05-02
  * 2e6bcb2a src/utils/native_avail.c:
  Add two command line switches: -i EVENTSTR includes only events whose
  names contain EVENTSTR; -x EVENTSTR excludes all events whose names
  contain EVENTSTR. These two switches can be combined, but only one
  string per switch can be used.
  This allows you to, for example, filter events by component name, or
  eliminate all uncore events on Sandy Bridge…

2013-05-01
  * 3163cc83 src/ctests/perf_event_uncore.c:
  ctests/perf_event_uncore: add IvyBridge support
  this needs an updated libpfm4 to work

2013-04-30
  * 55c89673 src/examples/add_event/Papi_add_env_event.c src/examples/overflow_pthreads.c:
  Examples: Missed two instances of %x printf formatting.

2013-04-29
  * b3c5bd47 src/components/appio/tests/appio_list_events.c src/components/appio/tests/appio_values_by_code.c src/components/appio/tests/appio_values_by_name.c...:
  Address TRAC 174: Let printf do the formatting
  https://icl.cs.utk.edu/trac/papi/ticket/174
  174: PAPI's debugging/info output should use %# conversions for
  octal and hex
  ------------------------+--------------------
   Reporter: sbk@…        | Owner:
   Type: enhancement      | Status: new
   Priority: normal       | Component: All
   Version: HEAD          | Severity: normal
   Keywords:              |
  ------------------------+--------------------
  Email sent to James Ralph:
  Seeing your latest change reminded me: Anytime there is a value
  issued in hex or octal the "%#" conversion should be used so the
  value is always preceded with a "0" for octal or a "0x" for hex.
  Otherwise when a value is printed one cannot tell the base it is in
  (one shouldn't have to rely on internal knowledge of the code or the
  context to tell). For variables that are pointers the "%p" conversion
  can be used (this will always use a hex syntax).
  It would be nice to apply this to all PAPI print statements in their
  entirety.

2013-04-25
  * 87ec9286 src/components/vmware/Rules.vmware:
  Rules.vmware: Use $(LDL), not -ldl
  Minor cleanup, but configure sets it, so why not use it.

2013-04-26
  * 8dddd587 src/papi_hl.c:
  papi_hl: Use PAPI_get_virt_usec() for process time
  The code was using cycles / MHz, which is not guaranteed to work on
  modern machines. It also was sometimes using
  (instructions / estimated IPC) / MHz, which hopefully isn't
  necessary for any machine PAPI currently supports.
  Instead use PAPI_get_virt_usec(), which should give the right value.

2013-04-25
  * 9dd36088 src/ctests/perf_event_uncore.c:
  ctests/perf_event_uncore: make more modular
  Cleans up the code to make it easier to add tests for architectures
  other than SandyBridge-EP. I was doing this so I could add support
  for IvyBridge, but it turns out neither Linux nor libpfm4 supports
  uncore on IvyBridge yet. hmmm.

  * 52ff0293 src/components/cuda/Rules.cuda:
  Rules.cuda: The cuda component now depends on the dynamic linking
  loader, and on some systems one has to explicitly link to it. Add
  $(LDL) to LD_FLAGS; configure sets it if we need it.

  * 97a4a5ea src/components/cuda/Rules.cuda src/components/cuda/linux-cuda.c src/components/cuda/tests/Makefile:
  Cuda component enhancement.
  ---------------- From Gary's submission --------------------------------
  The current packaging of the cuda component in PAPI has a fairly
  unfriendly side effect. When PAPI is built with the cuda component,
  then that copy of PAPI can only be used on systems where the cuda
  libraries are installed. If it is installed on a system without
  these libraries then all PAPI services fail because they have
  references to libraries which can not be found. Even papi_avail,
  which you would think has nothing to do with cuda, reports the
  error.
  This issue significantly complicates the delivery and install of the
  PAPI package on large clusters where some of the nodes have NVIDIA
  GPU's (and the cuda libraries to talk to them) and other nodes do
  not have GPU's (and therefore no software to access them).
  I have been working with the help of Phil Mucci to eliminate this
  dependency so that a copy of PAPI built with a cuda component could
  be installed on all nodes in the cluster, and if the node had NVIDIA
  GPU's (and libraries available) then the cuda component would get
  enabled and could be used.
  If the node did not have the hardware, or the access libraries were
  not available, then the cuda component would just disable itself at
  component initialization so it could not be used (but all other PAPI
  services would still work).
  Phil has provided some gentle prodding and lots of valuable
  suggestions to assist this effort. I now think that I have a working
  version of this capability and am ready to share it with the
  community.
  -----------------------------------------------------------------------
  Many thanks to Gary Mohr and Phil Mucci for this much needed
  functionality.

2013-04-23
  * 99c8e352 src/papi_internal.c:
  papi_internal.c: Print an eventcode in hex vs decimal.
  Thanks, Gary Mohr.

2013-04-22
  * 1fc5dae2 src/run_tests.sh:
  The test for determining whether to run valgrind was backwards.
  Correcting that allows the run_tests.sh script to stay the same, and
  one just needs to define "VALGRIND=yes" (or any non-null string) to
  make run_tests.sh use valgrind.
  ---
   src/run_tests.sh | 6 ++----
   1 file changed, 2 insertions(+), 4 deletions(-)

  diff --git a/src/run_tests.sh b/src/run_tests.sh
  index d1ce205..9337ff2 100755
  --- a/src/run_tests.sh
  +++ b/src/run_tests.sh
  @@ -19,10 +19,8 @@
   else
     export TESTS_QUIET
   fi
  -if [ "x$VALGRIND" = "x" ]; then
  -# Uncomment the following line to run tests using Valgrind
  -# VALGRIND="valgrind --leak-check=full";
  -  VALGRIND="";
  +if [ "x$VALGRIND" != "x" ]; then
  +  VALGRIND="valgrind --leak-check=full";
   fi
   #CTESTS=`find ctests -maxdepth 1 -perm -u+x -type f`;
  --

2013-04-19
  * 4cf16234 src/components/README src/components/bgpm/README src/components/coretemp_freebsd/README...:
  Restructure README files for components so that the file in the
  components directory doesn't document individual component details.
  Add README files to each component directory that requires further
  installation detail. Update RAPL instructions to capture how to
  enable reading the MSRs.
  These files are supposedly configured with Doxygen markup, but I
  don't think the master README ever got built. It probably should be.

2013-04-17
  * bf75d226 src/components/cuda/tests/HelloWorld.cu:
  cuda/tests/HelloWorld.cu: work around a segfault.
  Report from Gary Mohr:
  I was running the Cuda test case on a system which did not actually
  have any NVIDIA GPU's installed on it (but the cuda software was
  installed and papi was built with the cuda component). I modified
  the test case to put a real cuda event in the source (as suggested
  in the source).
  When I run the test case the cuda component gets disabled in
  PAPI_library_init (because the detectDevice function can not find
  any GPU's), which is the correct behavior. The test case then calls
  PAPI_event_name_to_code, which failed because the cuda component was
  disabled. The test case then created an event set and called
  PAPI_add_events with an empty list of events to be added. This led
  to a segfault somewhere inside libpfm4.
  The attached patch makes some minor changes to protect against this
  problem. I noticed this test case does not use the PAPI test
  framework utilities (test_xxxx functions) so I did not modify the
  test to use them.

2013-04-15
  * 457bfd74 src/components/cuda/linux-cuda.c:
  When creating two event sets - one for the CUDA and one for the CPU
  component - the order of event set creation appears crucial. When
  the CPU event set has been created before the CUDA event set, then
  PAPI_start() for the CUDA event set works fine. However, if the CUDA
  event set has been created before the CPU event set, then
  PAPI_start(CUDA_event_set) forces the CUDA control state to be
  updated one more time, even if the CUDA event set has not been
  modified. The CUDA control state function did not properly handle
  this case and hence caused PAPI_start() to fail. This has been
  fixed.
  * 807120b6 src/components/cuda/linux-cuda.h:
  linux-cuda.c

2013-03-28
  * 7b0eec7a src/run_tests.sh:
  run_tests.sh: further refine component test find
  Exclude *.cu when looking for component tests.

2013-03-25
  * 6a40c8ba src/run_tests.sh:
  run_tests.sh: File mode changes.
  run_tests.sh is now expected to run from the install location in
  addition to src. The script tried to remove execute from *.[c|h];
  now it just excludes *.[c|h] from the find commands.

2013-03-18
  * 2ba9f473 src/perfctr-x86.c:
  perfctr: don't read in event table multiple times
  papi_libpfm3_events.c now reads in the predefined events; we don't
  also need to do this in perfctr setup_x86_presets()

  * 326401b1 src/perfctr.c:
  Fix segfault in perfctr.c
  The preset lookup uses the cidx index, but in perfctr.c we weren't
  passing a cidx value (it was being left off). The old perfctr code
  plays games with defining extern functions so the compiler wasn't
  giving us a warning.

2013-03-14
  * 50130c6f src/components/bgpm/L2unit/linux-L2unit.c src/linux-bgq.c:
  If a counter is not set to overflow (threshold==0; happens when
  PAPI_shutdown is called) then we do not want to rebuild the BGPM
  event set, even if the event set has been used previously and hence
  "applied or attached". Usually if an event set has been applied or
  attached prior to setting overflow, the BGPM event set needs to be
  deleted and recreated (which implies malloc() from within BGPM). Not
  so, though, if threshold is 0, which is the case when PAPI_shutdown
  is called.
  Note, this only applies to Punit and L2unit, not IOunit, since an
  IOunit event set is not applied or attached.

2013-03-13
  * 1a143003 src/components/bgpm/IOunit/linux-IOunit.c src/components/bgpm/IOunit/linux-IOunit.h src/components/bgpm/L2unit/linux-L2unit.c...:
  Overflow issue on BG/Q resolved. Overflow with multiple components
  worked; overflow with multiple components and multiple events did
  not work as it was supposed to.

  * 42741a40 src/components/cuda/Rules.cuda:
  Added one more library to linker command.
2013-03-12
  * 1431eb3f src/components/nvml/Makefile.nvml.in src/components/nvml/Rules.nvml src/components/nvml/configure...:
  NVML component: build system work
  Adopt the cuda component's method for specifying library location.

2013-03-11
  * ce66feac src/components/mx/linux-mx.c:
  mx component: Modernize init routine.
  Add component index to _mx_component_init()'s signature and set the
  bit in component info.

  * 1c1bc177 src/components/cuda/Makefile.cuda.in src/components/cuda/Rules.cuda src/components/cuda/configure...:
  Resolve configure issues for CUDA component.

2013-03-07
  * f3572537 src/linux-common.c src/linux-memory.c:
  Fix the build on Linux-SPARC
  I dug out an old SPARC machine and fixed the PAPI build on it.

  * 2c7f102c src/perf_events.c:
  More comprehensive sys_perf_open to PAPI error mappings
  This tries to cover more of the errors returned by sys_perf_open and
  map them to better results. EINVAL is a problem because it can mean
  Conflict as well as Event not found, and many other things, so it's
  unclear what to do with it.

  * 299070ef src/perf_events.c src/sys_perf_event_open.c:
  Return proper error codes for sys_perf_event_open
  For some reason on x86 and x86_64 we were trying to set errno
  manually and thus over-writing the proper errno value, causing all
  errors to look like PAPI_EPERM.
  This removes that code, as well as adds code to report ENOENT as
  PAPI_ENOEVENT. With this change, on IVY this happens, which looks
  more correct:
    ./utils/papi_command_line perf::L1-ICACHE-PREFETCHES
    Failed adding: perf::L1-ICACHE-PREFETCHES
    because: Event does not exist
    command_line.c PASSED

2013-03-06
  * baa557ca src/papi_libpfm4_events.c src/papi_user_events.c:
  Coverity fixes:
  Coverity pointed out that there was a case where
  load_user_event_table() could leak memory. The change in the
  location of the papi_free(foo) ensures that the allocated memory is
  freed.
  Coverity pointed out one path through the code in
  _papi_libpfm4_ntv_code_to_descr() that did not free up memory
  allocated in the function. Added a free on that path to free up the
  memory. Thanks, Will Cohen.

2013-02-14
  * 395b7bc7 src/Makefile.inc src/components/README src/components/appio/tests/Makefile...:
  Add component tests to the install-[all|tests] target. Thanks to
  Gary Mohr.
  -------------------
  This makes a fairly small change to src/Makefile.inc to add logic
  that adds a new install-comp_tests target which calls the install
  target for each component being built. This new target is listed as
  a dependency of the install-tests target so it will happen when the
  'install-all', 'install-tests', or 'install-comp_tests' targets are
  used.
  A note about this change: I am not real familiar with the automake
  and autoconf tools. This change was enough to make it work for me,
  but if there is another file that should also be changed for this
  modification, please help me out here.
  The patch also adds install targets to the Makefiles for all of the
  components which have 'tests' directories, and updates the README
  file which talks about how to create component tests.
  Another note: I only compile with a couple of components (ours,
  rapl, and example), so if I fat fingered something in one of the
  other components' Makefiles I would not have noticed. Please keep me
  honest and make sure you compile with them all enabled.
  Thanks for adding this capability for us.
  Gary
  ---------------------------
  Makefile.inc: Add run_tests and friends to install-tests target.
  Component test Makefiles get their install location to mirror what
  run_tests expects.

2013-03-04
  * 448d21ab src/components/rapl/linux-rapl.c:
  Remove a stray debug statement.
  Thanks to Harald Servat for catching this.

2013-03-01
  * df1a75cc src/utils/command_line.c:
  Wrestled some horribly convoluted indexing into shape. The -u and -x
  options now print as expected (I think).
2013-01-31
  * b0f5f4d6 src/components/nvml/linux-nvml.c:
  linux-nvml.c: Fix type warning.
  CUDA and NVML have a signed vs unsigned thing going on in their
  returned device counts; cast away the warning.

2013-01-29
  * 8490b4ee src/papi.c:
  General doxygen cleanup: remove all "No known bugs" messages;
  correct and clean up examples for PAPI_code_to_name and
  PAPI_name_to_code

2013-01-23
  * 89e45a9b src/linux-memory.c src/linux-timer.c:
  ia64 fixes. Thanks to Tony Jones for patches.

2013-01-16
  * 23e0ba2d src/components/nvml/linux-nvml.c:
  nvml component: clean up a memory leak
  We did not free a buffer at shutdown time.

2013-01-15
  * f3db85fc src/papi.h:
  papi.h: bump version number.

  * dfa80287 src/buildbot_configure_with_components.sh:
  Buildbot configure script. Add cuda and nvml components, if
  configured, to the buildbot coverage test.
  Note: The script now checks for the existence of Makefile.cuda and
  then Makefile.nvml to see if it can build the cuda component and
  then the nvml component.

  * cf416e27 src/threads.c:
  Cleaned up compiler warning (gcc version 4.4.6)

  * 59cbc8fc src/components/bgpm/CNKunit/linux-CNKunit.c src/components/bgpm/IOunit/linux-IOunit.c src/components/bgpm/L2unit/linux-L2unit.c...:
  Cleaned up compiler warnings on BG/Q (gcc version 4.4.6
  (BGQ-V1R1M2-120920))

2013-01-14
  * 3af71658 .../build/lib.linux-x86_64-2.7/perfmon/__init__.py .../lib.linux-x86_64-2.7/perfmon/perfmon_int.py .../build/lib.linux-x86_64-2.7/perfmon/pmu.py...:
  libpfm4: remove extraneous build artifacts.
  Steve Kaufmann reported differences between the libpfm4 imported
  into PAPI and the libpfm4 that can be obtained with a
  git clone git://perfmon2.git.sourceforge.net/gitroot/perfmon2/libpfm4
  Self: Do libpfm4 imports from a fresh clone of libpfm4.
papi-papi-7-2-0-t/ChangeLogP530.txt

2013-11-25
  * a40c96c5 src/components/nvml/linux-nvml.c:
  nvml component: Add missing }

  * 166971ba src/components/nvml/linux-nvml.c:
  nvml component: modify api checks
  To check whether nvmlDeviceGetEccMode and nvmlDeviceGetPowerUsage
  are supported, we just call the functions and see if nvml thinks
  they are supported by the card.

2013-11-21
  * 78192de9 delete_before_release.sh:
  Kill the .gitignore files in delete_before_release

  * 60fb1dd4 src/utils/command_line.c:
  command_line utility: Initialize a variable
  Initialize data_type to PAPI_DATATYPE_INT64.
  Addresses a coverity error:
  Error: COMPILER_WARNING: [#def19]
  papi-5.2.0/src/utils/command_line.c:133:4: warning: 'data_type' may
  be used uninitialized in this function [-Wmaybe-uninitialized]
    switch (data_type) {
    ^

2013-11-20
  * da2925f6 src/ctests/data_range.c:
  Make data_range test use prginfo
  Coverity complained about prginfo being an unused variable for
  data_range.c. The code is modified to be stylistically like the code
  for hw_info in the preceding lines, which also is not used elsewhere
  in the test. This is more to reduce the amount of output in the
  Coverity scan than to fix this minor issue.

  * 3386953d src/ctests/data_range.c:
  Check the return values of PAPI_start() and PAPI_stop() for the
  data_range test
  The ia64 data_range test did not check the return values of
  PAPI_start() or PAPI_stop(). There are probably few people running
  this test on ia64 machines, but this is more to eliminate a couple
  of errors noted by a Coverity scan and reduce the clutter in the
  Coverity scan.

2013-11-19
  * e704e8f1 src/configure src/configure.in:
  configure: Build fpapi.h and co for mic
  When building for mic, set the cross_compiling var in configure to
  use a native c compiler to build genpapif.
2013-11-18
  * d32b1dae man/man1/papi_avail.1 man/man1/papi_clockres.1 man/man1/papi_command_line.1...:
  Rebuild the man pages for a 5.3 release

  * 4e735d11 doc/Doxyfile-common papi.spec src/Makefile.in...:
  Bump version numbers for a pending 5.3

  * efe026cd src/Makefile.inc:
  Makefile.inc: Pass LINKLIB, not SHLIB, to the comp_tests

  * f0598acb src/ctests/Makefile.target.in:
  ctests/Makefile.target.in: Properly catch LINKLIB
  LINKLIB=$(SHLIB) or $(LIBRARY), so we have to have configure fill in
  those as well.

  * 1744c23e src/ctests/Makefile.target.in:
  ctests/Makefile.target.in: Respect static-tools
  The --with-static-tools configure flag sets STATIC, not LDFLAGS.
  This gets passed to the tests' make subprocesses via
  LDFLAGS="$(LDFLAGS) $(STATIC)"
  We mimic this in the installed Makefile.

  * e9347373 src/ctests/Makefile:
  ctests/Makefile: Don't clobber value of LIBRARY
  TODO: write a better message

  * 237219d1 src/Makefile.inc:
  Makefile.inc: Add environment vars to fulltest recipe
  The fulltest target didn't set LD_LIBRARY_PATH, and as a result
  several tests wouldn't find libpfm and would fail to run. The fix is
  to call our SETPATH command first (as all of the other testing
  targets do). See:
  ------------------------------------------------------------------
  icc -diag-disable 188,869,271 -g -g -DSTATIC_PAPI_EVENTS_TABLE
  -DPEINCLUDE="libpfm4/include/perfmon/perf_event.h" -D_REENTRANT
  -D_GNU_SOURCE -DUSE_COMPILER_TLS -Ilibpfm4/include
  -I../../../testlib -I../../.. -I.
  -o perf_event_offcore_response perf_event_offcore_response.o
  event_name_lib.o ../../../testlib/libtestlib.a
  ../../../libpapi.so.5.2.0.0
  ld: warning: libpfm.so.4, needed by ../../../libpapi.so.5.2.0.0,
  not found (try using -rpath or -rpath-link)
  ../../../libpapi.so.5.2.0.0: undefined reference to `pfm_get_event_attr_info'
  ../../../libpapi.so.5.2.0.0: undefined reference to `pfm_initialize'
  ../../../libpapi.so.5.2.0.0: undefined reference to `pfm_get_pmu_info'
  ../../../libpapi.so.5.2.0.0: undefined reference to `pfm_get_version'
  ../../../libpapi.so.5.2.0.0: undefined reference to `pfm_get_os_event_encoding'
  ../../../libpapi.so.5.2.0.0: undefined reference to `pfm_get_event_next'
  ../../../libpapi.so.5.2.0.0: undefined reference to `pfm_get_event_info'
  ../../../libpapi.so.5.2.0.0: undefined reference to `pfm_strerror'
  ../../../libpapi.so.5.2.0.0: undefined reference to `pfm_find_event'
  ../../../libpapi.so.5.2.0.0: undefined reference to `pfm_terminate'
  make[2]: *** [perf_event_offcore_response] Error 1
  ------------------------------------------------------------------

2013-11-17
  * a7f642d2 src/Makefile.inc src/configure src/configure.in:
  Switch LINKLIB to not have relative pathing

2013-11-15
  * 91a6fa54 src/components/lustre/tests/Makefile:
  Fix a typo in the lustre tests' Makefile

2013-11-13
  * 9a5f9ad4 src/papi_preset.c:
  papi_preset.c: Fix _papi_load_preset_table func
  Patch by Gleb Smirnoff
  ----------------------
  The _papi_load_preset_table() loses the last entry from a static
  table. The code in get_event_line() returns the value of the char
  next to the line we are returning. Obviously, for the last entry the
  char is '\0', so the function returns a false value and
  _papi_load_preset_table() ignores the last line. Patch attached.
  The most important part of my patch is only:
  -       ret = **tmp_perfmon_events_table;
  +       return i;
  This actually fixes the lost last line. However, I decided to make
  the entire get_event_line() more robust, protected from bad input,
  and easier to read.
  ----------------------

2013-11-12
  * 579139a6 src/utils/hybrid_native_avail.c:
  more doxygen xml tag cleanup

  * 952bb621 src/components/bgpm/CNKunit/linux-CNKunit.c src/components/bgpm/CNKunit/linux-CNKunit.h src/components/bgpm/IOunit/linux-IOunit.c...:
  Fix doxygen Unsupported xml/html tag warnings

  * 0c161015 src/components/micpower/linux-micpower.h:
  micpower: fix doxygen warning

  * b187f065 src/components/host_micpower/README:
  host_micpower: update docs

2013-11-11
  * 4d379c6f src/ctests/p4_lst_ins.c:
  ctests/p4_lst_ins: Narrow scope of test
  This test attempted to ensure that it was running on a P4; the test
  missed for all non-Intel systems.

2013-11-10
  * ee1c7967 .../host_micpower/utils/host_micpower_plot.c:
  Added energy consumption to host_micpower utility.

2013-11-08
  * eee49912 src/ctests/shlib.c:
  shlib.c: Check for NULL
  Thanks to Will Cohen for reporting. Coverity picked up an instance
  of a value that could be NULL and strlen would barf on it.
  Error: FORWARD_NULL (CWE-476):
  papi-5.2.0/src/ctests/shlib.c:70: var_compare_op: Comparing
  "shinfo->map" to null implies that "shinfo->map" might be null.
  papi-5.2.0/src/ctests/shlib.c:74: var_deref_model: Passing "shinfo"
  to function "print_shlib_info_map(PAPI_shlib_info_t const *)", which
  dereferences null "shinfo->map".
  papi-5.2.0/src/ctests/shlib.c:13:26: var_assign_parm: Assigning:
  "map" = "shinfo->map".
  papi-5.2.0/src/ctests/shlib.c:24:3: deref_var_in_call: Function
  "strlen(char const *)" dereferences an offset off "map" (which is a
  copy of "shinfo->map").

  * 83c31e25 src/components/perf_event/perf_event.c:
  perf_event.c: Check return value of ioctl
  Thanks to Will Cohen for reporting based upon output of coverity.

  * e5b33574 src/utils/multiplex_cost.c:
  multiplex_cost: check return value on PAPI_set_opt
  Thanks to Will Cohen for reporting based upon output of coverity.
  * 04f95b14 src/components/.gitignore:
  Ignore component target makefile

  * cbf7c1a8 src/components/rapl/linux-rapl.c src/components/rapl/tests/Makefile src/components/rapl/tests/rapl_basic.c:
  Modify linux-rapl to support one wrap-around of the 32-bit registers
  for reading energy. This ensures availability of the full 32-bit
  dynamic range. However, it does not protect against two
  wrap-arounds. Care must be taken not to exceed the expected dynamic
  range, or to check the reasonableness of results on completion.
  Modifications were also made to report rapl events as unscaled
  binary values in order to compute dynamic ranges.
  Modify rapl-basic to add a test (rapl_wraparound) to estimate the
  maximum measurement time for a naive gemm. With a -w option,
  measurement for this amount of time will be performed. The gemm can
  be replaced with a user kernel for more accurate time estimates.
  The Makefile was modified to support the new test case.

2013-11-07
  * 7784de21 src/ctests/data_range.c src/ctests/zero_shmem.c:
  Modernize some ctests
  Add tests_quiet check to data_range and zero_shmem

2013-11-06
  * 7c953490 src/configure src/configure.in:
  More MPICC checking
  Have configure check for mpcc on AIX, in addition to mpicc.

  * 5c8d2ce0 src/ctests/zero_shmem.c:
  zero_shmem.c: Fix compiler warning
  The worker threads in the test print an ID; the test was set up to
  call pthread_self(), which is problematic. Since each thread is
  started with a unique workload, use this to label threads.

  * 993a6e96 src/ctests/Makefile.recipies src/ctests/Makefile.target.in:
  ctests/Makefile.recipies: conditionally build the MPI test

  * b29d5f56 src/Makefile.inc src/configure src/configure.in:
  Check for mpicc at configure time
  configure[.in]: look for mpicc
  Makefile.inc: Pass MPICC to ctests' make

2013-11-05
  * b2d643df src/papi_events.csv:
  Add floating point events for IvyBridge
  Now that Intel has documented them and libpfm4 supports them, PAPI
  can use them. We just use the same events as on SandyBridge.
    Tested on an IvyBridge system.

2013-11-01
  * c5be5e26 src/components/micpower/linux-micpower.c: micpower: check return of fopen before use
    Issue reported by Will Cohen from the results of a Coverity run.
  * 5c1405ab src/components/host_micpower/utils/Makefile src/components/host_micpower/utils/README .../host_micpower/utils/host_micpower_plot.c: Add host_micpower utility to gather power (and voltage) measurements on Intel Xeon Phi chips using MicAccessAPI.
  * 46b9bdf5 src/components/host_micpower/linux-host_micpower.c: Added more detailed event descriptions and correct units to host mic power events.
  * b97c0126 src/components/host_micpower/linux-host_micpower.c: host_micpower: Better error reporting
    Grab the output of dlerror on library load failure.

2013-10-31
  * 84da7fd3 src/components/host_micpower/Rules.host_micpower src/components/host_micpower/tests/Makefile: host_micpower: Fix some makefile bits
    tests/Makefile needed to define a target to work with the Make_comp_tests install machinery. Rules.host_micpower had a typo.

2013-10-30
  * 14f3e4c4 src/components/host_micpower/linux-host_micpower.c: host_micpower: fix function signature
    shutdown_thread took the wrong arguments.

2013-10-28
  * a4cc1113 release_procedure.txt: Update release_procedure.txt
    A bug in the version of doxygen we were using to produce the documentation led to some of the Fortran functions being left out in the cold. We now prescribe 1.8.5.
  * a1d6ae34 src/components/host_micpower/README: host_micpower: Add a README file.

2013-10-25
  * 859dbc2c src/Makefile.inc src/components/Makefile_comp_tests src/components/Makefile_comp_tests.target.in...: Make the testsuite a stand-alone, copy-able directory of code
    These changes to the Makefiles allow the testsuite to be compiled separately from the papi sources. This is useful for people wanting to experiment with the tests and verify that the existing installation of papi works.
    We put absolute paths to the installed library and include files into the installed makefile for the tests.
  * c307ad18 src/ctests/Makefile src/ctests/attach_target.c src/testlib/do_loops.c: Refactor the driver in do_loops.c into its own file. (ctests/Makefile, ctests/attach_target.c, testlib/do_loops.c)

2013-10-23
  * ace71699 src/components/bgpm/CNKunit/linux-CNKunit.c src/components/bgpm/IOunit/linux-IOunit.c src/components/bgpm/L2unit/linux-L2unit.c...: Passing BGPM errors up to PAPI.

2013-10-22
  * 2ee090ec src/components/bgpm/NWunit/linux-NWunit.c src/components/bgpm/NWunit/linux-NWunit.h: Fixed the behavior in BGQ's NWunit component after attaching an event set to a specific thread that owns the target resource counting.
  * 8ab071ee src/components/cuda/linux-cuda.c: CUDA component: Set the number of native events
    Patch by Steve Kaufmann:
    When running papi_component_avail I noticed that the number of CUDA events was always zero when the component was available. The following change correctly sets the number of native events for the component.

2013-10-11
  * 071943b6 src/configure src/configure.in src/linux-context.h...: add preliminary aarch64 (arm64) support
    There has been some work to build Fedora 19 on 64-bit ARM armv8 machines (aarch64). I took a look at why the papi build was failing. Attached is a set of minimal patches to get papi to build. The patch is just a step toward getting aarch64 support for papi; things are not all there for papi to work in that environment. libpfm still needs to support aarch64, and papi_events.csv needs entries describing mappings to machine-specific events.

2013-10-01
  * 096eb7fc src/ctests/zero_shmem.c: zero_shmem: cleanup compiler warnings
    Remove unused variables.
  * d9669053 src/ctests/earprofile.c: ctests/earprofile.c: Fix compiler warning
    Both PAPI_get_hardware_info and PAPI_get_executable_info expect const pointers (get_executable_info is called by prof_init in profile_utils).
2013-09-30
  * 87e7e387 src/ctests/p4_lst_ins.c: ctests/p4_lst_ins.c: Fix the P4 load test.
    This test relied upon a removed symbol to decide if it should run. The symbol "unsigned char PENTIUM4" was removed in 2011; update the logic.
  * 737d91ff src/ctests/zero_shmem.c: ctests/zero_shmem: Update the test
    * add_test_events expects another argument; update the zero_shmem test's invocation
    * Protect [hide] OpenSHMEM calls with ifdefs

2013-09-27
  * 86c11829 src/ctests/zero_shmem.c: zero_shmem: Include pthread.h
  * 2d0e666c src/ctests/zero_smp.c: zero_smp: Change a compile time error to a test_skip
    In 8d1f2c1 we changed the default assumption to be that all ctests are built. This change allows the test to gracefully skip if it does not have 'native SMP' threads.

2013-09-26
  * 8d1f2c16 src/ctests/Makefile: ctests/Makefile: Default to building everything
    Set target all to depend upon ALL.
  * ffd051cf src/ftests/Makefile src/testlib/Makefile: testlib, ftests Makefiles: cleanup ifort generated files
    ifort produces mod and f90 intermediate files which clean did not clean up.
  * c720bb59 src/components/coretemp/tests/coretemp_basic.c src/components/coretemp/tests/coretemp_pretty.c: Coretemp tests: Fix skipping logic
    The coretemp_basic test was failing if coretemp was disabled; skipping seems more appropriate. Add this logic to the coretemp_pretty test as well.
  * af7f7508 src/configure src/configure.in: configure: refactor CTEST_TARGETS
    Problem: The set of ctests to build is determined at configure time, in CTEST_TARGETS. This is set in each OS detection section and suffers from neglect.
    Solution: Try to push the decisions about which tests to build out of configure; ask for them all. Ideally the tests will be written in such a way as to fail/skip gracefully if they lack functionality; teething problems are expected initially.
  * 14421695 src/testlib/Makefile: testlib: Fix the Makefile variable assignment
    Consider:
      src=a.c b.c c.F
      obj=$(src:.c=.o) c.o
    After this substitution, obj is {a.o b.o c.F c.o}, not quite the nut. Change the logic to correct that.

2013-09-17
  * 05a4e17b .../perf_event_uncore/perf_event_uncore.c: perf_event_uncore: cleanup a compiler warning
    _peu_read does not use the hwd_context argument.
  * f2056857 src/papi_events.csv: papi_events.csv: Add PAPI_L1_ICM for Haswell
    Thanks to Maurice Marks of Unisys for the contribution:
    -------------
    I've continued testing on Haswell. By comparison with Vtune and Emon on Haswell I found that we can use the counter L2_RQSTS:ALL_CODE_RD for PAPI_L1_ICM, which is a very useful measure. Attached is my current version of papi_events.csv with Haswell fixes.
    -------------

2013-08-28
  * efe3533d src/Makefile.inc src/components/Makefile_comp_tests src/ctests/Makefile...: testlib: library-ify testlib
    * Move ftests_util to testlib
    * Naively create libtestlib.a
    * utils link to the testlib library
    * [c|f]tests: switch the tests over to linking libtestlib.a
    * Component tests link libtestlib.a

2013-08-26
  * d2a76dde src/configure src/configure.in src/utils/hybrid_native_avail.c: Gabrial's mic-with-icc changes to configure.
    Specify --with-mic at configure time; upon finding icc as the C compiler, it adds -mmic.
  * 4c0349c0 src/papi_events.csv: papi_events.csv: First draft preset events on HSW
    Contributed by Nils Smeds:
    -------------------------
    Here is a suggestion for additions to the Hsw counters. These are not rigorously tested. It compiles and loads. I'm rather uncertain on many of the events, so I am hoping that adding events like this will get some useful feedback from the community so that we can improve.
    -------------------------

2013-08-20
  * 1b8ff589 src/utils/command_line.c: command_line util: Fix skipping event bug.
    The command line utility had an extraneous index increment which resulted in skipping the reporting of event counts.
    Remove the increment.
    Reported by Steve Kaufmann:
    --------------------------
    I am getting some funny results when I use papi_command_line with the RAPL events. If I request them all:
    $ papi_command_line THERMAL_SPEC:PACKAGE0 MINIMUM_POWER:PACKAGE0 MAXIMUM_POWER:PACKAGE0 MAXIMUM_TIME_WINDOW:PACKAGE0 PACKAGE_ENERGY:PACKAGE0 DRAM_ENERGY:PACKAGE0 PP0_ENERGY:PACKAGE0
    Successfully added: THERMAL_SPEC:PACKAGE0
    Successfully added: MINIMUM_POWER:PACKAGE0
    Successfully added: MAXIMUM_POWER:PACKAGE0
    Successfully added: MAXIMUM_TIME_WINDOW:PACKAGE0
    Successfully added: PACKAGE_ENERGY:PACKAGE0
    Successfully added: DRAM_ENERGY:PACKAGE0
    Successfully added: PP0_ENERGY:PACKAGE0
    THERMAL_SPEC:PACKAGE0 : 115.000 W    <<<<< MINIMUM_POWER:PACKAGE0 ??
    MAXIMUM_POWER:PACKAGE0 : 180.000 W
    PACKAGE_ENERGY:PACKAGE0 : 2003784180(u) nJ
    DRAM_ENERGY:PACKAGE0 : 438751220(u) nJ
    PP0_ENERGY:PACKAGE0 : 1248748779(u) nJ
    ----------------------------------
    Verification: Checks for valid event name. This utility lets you add events from the command line interface to see if they work.
    command_line.c PASSED
    Note that a value for MINIMUM_POWER:PACKAGE0 is not displayed even though it was successfully added to the event set. In fact, if combined with other events, the value for this event is never displayed. If you specify it on its own, it is displayed.
    ------------------------------------

2013-08-16
  * 0cb63d6e src/components/lustre/linux-lustre.c: lustre component: fix memory leak

2013-08-13
  * c810cd0d src/components/micpower/linux-micpower.c src/linux-memory.c src/papi_preset.c: Close resource leaks
    User dcb reported several resource leaks in trac bug #184:
    --------------------
    I just ran the static analysis checker "cppcheck" over the source code of papi-5.2.0. It said:
    1. [linux-memory.c:711]: (error) Resource leak: sys_cpu
    2. [papi_preset.c:735]: (error) Resource leak: fp
    3. [components/micpower/linux-micpower.c:166]: (error) Resource leak: fp
    I've checked them all and they all look like resource leaks to me.
    Suggest code rework.
    ----------------------------------

2013-08-07
  * 8d479895 doc/Makefile: Doxygen makefile: update dependencies
    The manpages are generated from comments in papi.h, papi.c, papi_hl.c and papi_fwrappers.c; update the make dependencies to reflect this.

ChangeLogP532.txt

2014-06-30
  * 511d05bc man/man1/papi_avail.1 man/man1/papi_clockres.1 man/man1/papi_command_line.1...: Regenerate man pages for a pending 5.3.2 release
  * a07adc91 doc/Doxyfile-common papi.spec src/Makefile.in...: Bump version number for a 5.3.2 release

2014-06-27
  * 43070347 src/components/coretemp/linux-coretemp.c src/components/micpower/linux-micpower.c src/components/net/linux-net.c: Fix a warning in component initialization
    A copy/paste perpetuated a multiple definition of available_domains.

2014-06-24
  * ea216f5a src/run_tests.sh src/run_tests_exclude.txt src/run_tests_exclude_cuda.txt: Fix excluded files for run_tests
    Gary Mohr pointed out that in a Makefile refactor we neglected to update the tests' exclude criteria:
    --------------------------------
    It turns out that you guys have made changes in the way the tests are built which have introduced errors when running the scripts. The script attempted to automatically remove the makefiles from the list of files it would execute, but your build changes broke the makefiles up into several files and renamed all of them. So the script was trying to execute them.
    I decided the most flexible way to handle this is to remove the code from the script that looks for makefiles and just add them to the exclude files used by the script. The script will not execute any file listed in the exclude file.
    The attached patch implements these changes so the script runs correctly with the current papi build files.
    Gary
    --------------------------------

2014-05-16
  * aacf9628 man/man3/PAPI_enum_cmp_event.3 man/man3/PAPI_enum_event.3 man/man3/PAPI_get_overflow_event_index.3...: Printf formatting change
    Based upon a patch by Steve Kaufmann. I took the liberty of removing all the leading "0x" as part of formatting output strings. Now the "%#" syntax is used to print out hexadecimal values with a leading "0x" (letting the printf function do the work).

2014-03-26
  * 0c93f0a1 src/components/nvml/linux-nvml.c src/components/nvml/linux-nvml.h: Add units to NVML component
    Thanks to Brian Lemke at Bull for the patch.

2014-02-26
  * 2c79fab8 src/configure src/configure.in: configure: respect --with-walltimer and virtualtimer
    For whatever reason, configure would not check for the --with-walltimer argument if we had already determined one (see the BG/P and CLE sections, also the --with-mic option). This is not desirable; kill this behaviour.

2014-01-30
  * 284f25c2 src/components/lustre/linux-lustre.c src/components/net/linux-net.c: Use correct specification for signed and unsigned int
    A run of cppcheck showed some mismatches between the specifications for sscanf and the variables being used to store the values. This corrects those minor issues.

2014-02-04
  * 291bad11 src/components/coretemp/linux-coretemp.c src/components/coretemp_freebsd/coretemp_freebsd.c src/components/host_micpower/linux-host_micpower.c...: Update the domain/granularity of many components
    Many PAPI components only report system-wide events; here we attempt to match up entries in the .cmp_info struct with reality by only allowing PAPI_GRAN_SYS and PAPI_DOM_ALL.

2013-12-30
  * eeefec5c src/ctests/attach2.c src/ctests/attach3.c: ctests/attach[2,3]: Fix ptrace call for BSD
    The SunOS man page describes ptrace() as "unique and arcane", which it is.
  * 27d416c8 src/configure src/configure.in: Configure.in: Remove Bash-isms from comp selection
    Part of a patch set by Gleb Smirnoff.
  * 52f8979f src/configure src/configure.in: Configure.in: Correctly detect FreeBSD OS version
    The script incorrectly parsed "FreeBSD 10" as "FreeBSD 1". Part of a series of patches by Gleb Smirnoff.

ChangeLogP540.txt

2014-11-13
  * 8f524875 RELEASENOTES.txt: Prepare release notes for a 5.4.0 release

2014-11-12
  * a8b4613b man/man1/papi_avail.1 man/man1/papi_clockres.1 man/man1/papi_command_line.1...: Rebuild the doxygen manpages
  * fbea4897 src/run_tests_exclude.txt: Remove omptough from standard run_tests.sh testing
    On some platforms (e.g. some AMD machines), if OMP_NUM_THREADS is not set, then this test does not complete in a reasonable time. That is because on these platforms too many threads are spawned inside a large loop.

2014-11-11
  * 23c2705b src/components/bgpm/CNKunit/linux-CNKunit.c src/components/bgpm/IOunit/linux-IOunit.c src/components/bgpm/IOunit/linux-IOunit.h...: Fix the number of counters and events for each of the 5 BGPM units, as well as emon, on BG/Q

2014-11-07
  * 93c69ded src/linux-timer.c: Patch linux-timer.c to provide cycle counter support on aarch64 (64-bit ARMv8 architecture)
    Thanks to William Cohen for this patch and the message below:
    ---
    The aarch64 has a cycle counter available to userspace and this resource should be made available in papi.
    ---
    This patch is not tested by the PAPI team (no easily available hardware).

2014-11-06
  * 038b2f31 src/components/rapl/linux-rapl.c: Extension of the RAPL energy measurements on Intel via msr-safe (https://github.com/scalability-llnl/msr-safe).
    msr-safe is a Linux kernel module that allows user access to a whitelisted set of MSRs. It is nearly identical in structure to the stock msr kernel module, with the important exception that the "capabilities" check has been removed.
    The LLNL sysadmins did a security review for the whitelist.
  * 67e0b3f6 src/components/rapl/tests/rapl_basic.c: Fixed string null termination.

2014-10-30
  * 2a1805ec src/components/perf_event/pe_libpfm4_events.c src/components/perf_event/pe_libpfm4_events.h src/components/perf_event/perf_event.c...: Patch to reduce the cost of using PAPI_name_to_code and add a list of supported pmu names to papi_component_avail output
    Thanks to Gary Mohr for this patch and its documentation:
    ---
    This patch file contains code to look for either pmu names or component names on the front of event strings passed to PAPI_name_to_code calls. If found, the name will be compared to values in each component info structure to see if the component supports this event. If the pmu name or component name does not match the values in the component info structure, then there is no need to call this component for this event. If the event string does not contain either a pmu name or a component name, then all components will be called.
    This reduces the overhead in PAPI when converting event names to event codes when either component names or pmu names are provided in the event name string.
    To support the above checks, there is also code in this patch to add an array of pmu names to the component info structure, and modifications to the core and uncore components to save the pmu names supported by each of these components in this new array.
    This patch also adds code to the papi_component_avail tool to display the pmu names supported by each active component.
    ---

2014-10-28
  * a91db97b src/components/net/linux-net.c src/components/nvml/linux-nvml.c src/components/perf_event/perf_event.c...: Additional changes to resolve defects reported by Coverity
    Thanks to Gary Mohr for this patch:
    ------
    This patch file contains additional changes to resolve defects reported by Coverity. Mostly these just make sure that character buffers get null terminated so they can be used as C strings.
    There is also a change in the RAPL component to improve the message that identifies why the component may have been disabled.
    ------

2014-10-22
  * 3f913658 src/ctests/tenth.c: Fix percent error calculation in ctests/tenth.c
    Thanks to Carl Love for this patch and the following documentation:
    Do the division first, then multiply by 100 when calculating the percent error. This keeps the magnitude of the numbers closer. If you multiply by 100 before dividing, you may exceed the representable number size. Additionally, by casting the values to floats before dividing, we get more accuracy in the calculation of the percent error; integer division will not give us a percent error less than 1.
  * ba5ef24a src/papi_events.csv: PPC64: fix L1 data cache read, write and all-access equations.
    Thanks to Carl Love for this patch and the following documentation:
    The current POWER7 equation for all accesses overcounts because it includes non-load accesses to the cache. The equation was changed to be the sum of the reads and the writes. The read accesses to the two units can be counted with the single event PM_LD_REF_L1 rather than counting the events for the two LSU units independently. The number of reads to the L1 must be adjusted by subtracting the misses, as these become writes. POWER8 has four LSU units; the same equations can be used since PM_LD_REF_L1 counts across all four LSU units.
  * 882f5765 src/utils/native_avail.c: Fix two problems and add a performance improvement in papi_native_avail.
    Thanks to Gary Mohr for this patch and the following information:
    First, it corrects a problem when using the -i or -x options. The code was putting out too many event divider lines (lines with all '-' characters). This has been corrected.
    Second, it improves the results from "papi_native_avail --validate" when being used on SNBEP systems. This system has some events which require multiple masks to be provided for them to be valid.
    The validate code was showing these events as not available because it did not try to use the event with the correct combination of masks. The fix checks to see if a valid form of the event has not yet been found, and if so, it tries the event with specific combinations of masks that have been found to make these events work. It also adds a check, before trying to validate the event with a new mask, to see if a valid form of the event has already been found. If it has, there is no need to try to validate the event again.
  * 94985c8a src/config.h.in src/configure src/configure.in...: Fix build error when no Fortran is installed
    Thanks to Maynard Johnson for this patch. Fix up the build mechanism to properly handle the case where no Fortran compiler is installed -- i.e., don't build or install testlib/ftest_util.o or the ftests.

2014-10-16
  * de05a9d8 src/linux-common.c: PPC64: add support for the Power non-virtualized platform
    Thanks to Carl Love for this patch and the following description:
    The POWER8 system can be run as a non-virtualized machine. In this case, the platform is "PowerNV". This patch adds the platform to the possible IBM platform types.
  * 547f4412 src/ctests/byte_profile.c: byte_profile.c: add support for PPC64 Little Endian
    Thanks to Carl Love for this patch and the following description:
    The POWER8 platform is Little Endian. It uses ELF version 2, which does not use function descriptors. This patch adds the needed #ifdef support to correctly compile the test case for Big Endian or Little Endian.
    This patch is untested by the PAPI developers (hardware not easily accessible).

2014-10-15
  * 14f70ebc src/ctests/sprofile.c: sprofile.c: add support for PPC64 Little Endian
    Thanks to Carl Love for this patch and the following description:
    The POWER8 platform is Little Endian. It uses ELF version 2, which does not use function descriptors.
    This patch adds the needed #ifdef support to correctly compile the test case for Big Endian or Little Endian.
  * 6d41e208 src/linux-memory.c: PPC64: sys_mem_info array size is wrong
    Thanks to Carl Love for this patch and the following description:
    The variable sys_mem_info is an array of type PAPI_mh_info_t. It is statically declared as size 4. The data for POWER8 is statically declared in entry 4 of the array, which is beyond the allocated array. The array should be declared without a size so the compiler will automatically determine the correct size based on the number of elements being initialized. This patch makes that change.
  * 061817e0 src/papi_events.csv: Remove stray Intel Haswell events from Intel Ivy Bridge presets
    Thanks to William Cohen for this patch and the following description:
    "Commit 4c87d753ab56688acad5bf0cb3b95eae8aa80458 added some events meant for Intel Haswell to the Intel Ivy Bridge presets. This patch removes those stray events. Without this patch, Intel Ivy Bridge machines would see messages like the following:
    PAPI Error: papi_preset: Error finding event L2_TRANS:DEMAND_DATA_RD.
    PAPI Error: papi_preset: Error finding event L2_RQSTS:ALL_DEMAND_REFERENCES."
    This patch was not tested by the PAPI team (no appropriate hardware).

2014-10-14
  * 8bc1ff85 src/papi_events.csv: Update papi_events.csv to match libpfm support for Intel family 6 model 63 (hsw_ep)
    Thanks to William Cohen for this patch and its information:
    "A recent September 11, 2014 patch (98c00b) to the upstream libpfm split out Intel family 6 model 63 into its own name of "hsw_ep". The papi_events.csv needs to be updated to support that new name. This should have no impact for older libpfms that still identify Intel family 6 model 63 as "hsw"; "hsw" and "hsw_ep" map to the same papi presets."
  * 32a8b758 src/papi_events.csv: Support for the ARM X-Gene processor.
    Thanks to William Cohen for this patch.
    The events for the Applied Micro X-Gene processor are slightly different from other ARM processors; thus, we need to define those presets for the X-Gene processor.
    Note: This patch is not tested by the PAPI team because we do not have the appropriate hardware.
  * 0a97f54e src/components/perf_event/pe_libpfm4_events.c .../perf_event_uncore/peu_libpfm4_events.c src/papi_internal.c...: Thanks to Gary Mohr for the patch:
    ---------------------
    Fix for bugs in PAPI_get_event_info when using the core and uncore components: PAPI_get_event_info returns incorrect / incomplete information. The errors were in how the code handled event masks and their descriptions, so the errors would not lead to program failures, just the possibility of incorrect labeling of output.
    ---------------------

2014-10-09
  * 77960f71 src/components/perf_event/pe_libpfm4_events.c: Record encode_failed errors by setting attr.config to 0xFFFFFFF.
    This causes any later validate attempts to fail, which is the desired behavior.
    Note: as of 2014/10/09 this has not been thoroughly tested, since a failure case is not known. This patch simply copies a fix that was applied to the perf_event_uncore component.

2014-09-25
  * 00ae8c1e src/components/perf_event_uncore/peu_libpfm4_events.c: Based on Gary Mohr's suggestion: if an event fails when we try to add it (encode_failed), then we note that error by setting attr.config = 0xFFFFFF for that event. Then, if there is a later check to validate this event, the check will correctly return an error.
  * 3801faaf src/utils/native_avail.c: Adding the NativeAvailValidate patch provided by Gary Mohr.
    The problem being addressed is that if there were any problems validating event masks, those problems would make the entire event invalid. The desired action is to test each event mask, and if any basic event mask can make the event succeed, then the event should be returned as valid and available.
    The solution is to create a large buffer and write events and masks into this buffer as they are processed, tracking their validity. At the end, go back and mark the validity of the entire event. This matches the standard output of PAPI.

2014-09-24
  * 6abc8196 src/components/emon/README src/components/emon/Rules.emon src/components/emon/linux-emon.c: Emon power component for BG/Q

2014-09-23
  * 62b9f2a9 .../perf_event_uncore/perf_event_uncore.c: perf_event_uncore.c: Check schedulability of events
    Patch by Gary Mohr:
    -------------------
    This patch file adds code to the uncore component to check and make sure that the events being opened from an event set can be scheduled by the kernel. This kind of code exists in the core component but was not moved into the uncore component because it was felt that it would not be an issue with uncore. It turns out the kernel has the same kind of issues when scheduling uncore events.
    The symptom of this problem is that the kernel reports that all events in the event set were opened successfully, but when trying to read the results, one (or more) of the events gets a read failure. This is seen in the traces and on stderr (if papi is configured with debug=yes) as a "short read" error.
    The logic is slightly different from what is in the core component because the events in the core component are grouped and the ones in uncore are not. When events are grouped, you only need to enable/disable and read results on the group leader. But when they are not grouped, you need to do these operations on each event in the event set.

2014-09-19
  * 8e6bf887 src/components/cuda/linux-cuda.c src/components/infiniband/linux-infiniband.c src/components/lustre/linux-lustre.c...: Address Coverity defects in src/components
    Thanks, Gary Mohr:
    ----------------
    This patch file contains fixes for defects reported by Coverity in the /src/components directory.
    Mostly these changes just make sure that char buffers get null terminated so that when they get used as a C string (they usually do) we will not end up with unpredictable results.
    A problem had been reported by one of our testers that the lustre component produced very long event names with lots of unprintable garbage in the names. It turns out this was caused by a buffer that filled up and never got null terminated; string functions were then used on the buffer, which picked up the whole buffer and lots more. These changes fixed the problem.
  * 266c61a4 src/linux-common.c src/papi_hl.c src/papi_internal.c...: Address Coverity-reported issues in src/
    Thanks to Gary Mohr:
    -------------------
    Changes in this patch file:
    linux-common.c: Add code to ensure that the cpu info vendor_string and model_string buffers are NULL-terminated strings. Also ensure that the value which gets read into mdi->exe_info.fullname gets NULL terminated. This makes it safe to use the 'strxxx' functions on the value (which is done immediately after it is read in).
    papi_hl.c: Fix a call to _hl_rate_calls() where the third argument was not the correct data type.
    papi_internal.c: Add code to ensure that event info name, short_desc, and long_desc buffers are NULL-terminated strings.
    papi_user_events.c: While processing define symbols, ensure that the 'local_line', 'name', and 'value' buffers get NULL terminated (so we can safely use 'strxxx' functions on them). Ensure that the 'symbol' field in the user defined event ends up NULL terminated. Rearrange code to avoid falling through from one case to the next in a switch statement. Coverity flagged falling out the bottom of a case statement as a potential defect, but it was doing what it should.
    sw_multiplex.c: Unnecessary test. The value of ESI cannot be NULL when this code is reached.
    x86_cpuid_info.c: The variable need_leaf4 is set but not used. The only place it gets set returns without checking its value.
    The place that checks its value never could have set its value non-zero.

2014-09-12
  * d72277fc release_procedure.txt: Update release procedure; check buildbot!

2014-09-08
  * a0e4f9a7 src/components/perf_event_uncore/perf_event_uncore.c: Uncore component fix
    By Gary Mohr:
    The line that sets exclude_guests in the uncore component is there because it is also there in the core component. But when I look at it closer, it is an error in both cases. I will submit a patch to remove them and get rid of some commented-out code that no longer belongs in the source.
    The uncore events do not support the concept of excluding the host or guest OS, so we should never set either bit. But the core events do support this concept, and libpfm4 provides event masks "mg" and "mh" to control counting in these domains. By default, if neither is set, then libpfm4 excludes counting in the guest OS; if either "mh" or "mg" is provided as an event mask, then it is counted but the other is excluded; and if both are provided, then both are counted. So when the code forces the exclude_guest bit to be set, it breaks the ability to fully control what will happen with the masks.
    I did not notice the uncore part of this problem when testing on my SNB system, probably because we use an older kernel which tolerated the bit being set (or maybe because HSW is handled differently).

2014-09-02
  * 4499fee7 src/papi_internal.c: Thanks to Gary Mohr for the patch:
    ---------------------
    Fix in papi_internal.c where it was trying to look up an event name. The RAPL component found the event and returned a code, but papi_internal.c exited the enum loop for that component and failed to exit the loop that checks all of the components. This caused it to keep looking at other components until it fell out of the outer loop and returned an error.
    In addition to the actual change, some formatting issues were fixed.
    ---------------------
  * f5835c26 src/components/perf_event/perf_event_lib.h: Bump NUM_MPX_COUNTERS for linux-perf
    Uncore on SNB and newer systems has enough counters to go beyond the 64 array spaces we allocate. This needs a better long-term solution.
    Reported by Gary Mohr:
    ---------------------
    When running on snbep systems with the uncore component enabled, if papi is configured with debug=yes, then the message "Warning! num_cntrs is more than num_mpx_cntrs" gets written to stderr. This happens because the snbep uncore pmu's have a total of 81 counters and PAPI is set to only accept a maximum of 64 counters. This change increases the amount PAPI will accept to 100 (and prevents the warning message from being printed).
    ---------------------
  * 07990f85 src/ctests/branches.c src/ctests/calibrate.c src/ctests/describe.c...: ctests/: Address Coverity-reported defects
    Thanks to Gary Mohr for the patch:
    ---------------------------------
    The contents of this patch file fix defects reported by Coverity in the directory 'papi/src/ctests'.
    The defect reported in branches.c was that a comparison between different kinds of data was being done. The defect reported in calibrate.c was that the variable 'papi_event_str' could end up without a null terminator. The defects reported in describe.c, get_event_component.c, and krentel_pthreads.c were that return values from function calls were being stored in a variable but never used.
    I also did a little clean-up in describe.c. This test had been failing for me on Intel NHM and SNBEP, but now it runs and reports that it PASSED.
    ---------------------------------

2014-08-29
  * 74cb07df src/testlib/test_utils.c: testlib/test_util.c: Check enum return value
    Addresses an issue found by Coverity. Thanks, Gary Mohr:
    ----------------
    The changes in this patch file fix the only defect in the src/testlib directory. The defect reported that the return value from a call to PAPI_enum_cmp_event was being ignored.
This call to enum events is to get the first event for the component index passed into this function. It turns out that the function that contains this code is only ever called by the overflow_allcounters ctest, and it only calls once and always passes a component index of 0 (perf_event). So I added code to check the return value and fail the test if an error was returned.
----------------

* 74041b3e src/utils/event_info.c: event_info utility: address coverity defect. From Gary Mohr
--------------
This patch corrects a defect reported by Coverity. The defect reported that the call to PAPI_enum_cmp_event was setting retval, which was never getting used before it got set again by a call to PAPI_get_event_info. After looking at the code, I decided that we should not be trying to get the next event inside a loop that is enumerating masks for the current event. It makes more sense to break out of the loop to get masks and let the outer loop that is walking the events get the next event.
--------------

2014-08-28
* 62dceb9b src/utils/native_avail.c: Extend 'papi_native_event --validate' to check for umasks.

2014-08-27
* a5c2beb2 src/components/perf_event/pe_libpfm4_events.c src/components/perf_event/pe_libpfm4_events.h src/components/perf_event/perf_event.c...: perf_event[_uncore]: switch to libpfm4 extended masks. Patch due to Gary Mohr, many thanks.
------------------------------------
This patch file contains the changes to make the perf_event and perf_event_uncore components in PAPI use the libpfm4 extended event masks. This adds a number of new masks that can be entered with events supported by these components. They include a mask 'u' which can be used to control whether counting in the user domain should be enabled, a mask 'k' which does the same for the kernel domain, and a mask 'cpu' which will cause counting to only occur on a specified cpu. There are also some other new masks which may work but have not been tested yet.
------------------------------------

2014-08-20
* e76bbe66 src/components/perf_event/pe_libpfm4_events.c src/components/perf_event/perf_event.c .../perf_event_uncore/perf_event_uncore.c...: General code cleanup and improved debugging. Thanks to Gary Mohr
-------------------
This patch file does general code cleanup. It modifies the code to eliminate compiler warnings, remove defects reported by coverity, and improve traces.

2014-08-11
* 8f2a1cee src/utils/error_codes.c: error_codes utility: remove internal bits. Remove the dependency on _papi_hwi_num_errors; just keep calling PAPI_strerror until it fails. We shouldn't be using internal symbols anyway.

2014-08-04
* a7136edd src/components/nvml/README: Update nvml README. We changed the options to simplify the configure line. Bad information is worse than no information...

2014-07-25
* a37160c1 src/components/perf_event/perf_event.c: perf_event.c: cleanup error messages. Thanks to Gary Mohr
-------------------
This patch contains general cleanup code. Calls to PAPIERROR pass a string which does not need to end with a newline because this function will always add one. Newlines at the end of strings passed to this function have been removed. These changes also add some additional debug messages.

2014-07-24
* bf55b6b7 src/papi_events.csv: Update HSW presets. Thanks to Gary Mohr
-------------------
Previously we sent updates to the PAPI preset event definitions to improve the preset cache events on Haswell processors. In checking the latest source, it looks like the L1 cache event changes did not get applied quite right. Here is a patch to the latest source that will make it the way we had intended.

* eeaef9fa src/papi.c: papi.c: Add information to API entry debugging. Thanks to Gary Mohr
-------------------
This patch contains the results of taking a second pass to clean up the debug prints in the file papi.c. It adds entry traces to more functions that can be called from an application.
It also adds lots of additional values to the trace entries so that we can see what is being passed to these functions from the application.

2014-07-23
* ee736151 src/run_tests.sh: run_tests.sh: more exclude cleanups. Thanks Gary Mohr
----------------
This patch removes an additional check for Makefiles in the script. The exclude files are now used to prevent Makefiles from getting run by this script. I missed this one when providing the previous patch to make this change.

* c37afa23 src/papi_internal.c: papi_internal.c: change SUBDBG to INTDBG. Thanks to Gary Mohr
-------------------
This patch contains changes to replace calls to SUBDBG with calls to INTDBG in this source file. This source file should be using the Internal debug macro rather than the Substrate debug macro so that the PAPI debug filters work correctly. These changes also add some new debug calls so that we will get a better picture of what is going on in the PAPI internal layer. There are a few calls to the SUBDBG macro that are in code that I have modified to add support for new event level masks which are not converted by this patch. They will be corrected when the event level mask patch is provided.

* e43b1138 src/utils/native_avail.c: native_avail.c: Bug fixes and updates. Thanks to Gary Mohr
--------------------------------------------------
This patch fixes a couple of problems found in the papi_native_avail program. The first change fixes a problem introduced when the -validate option was added. This option causes events to get added to an event set but never removes them. This change will remove them if the add works. This change also fixes a coverity detected error where the return value from PAPI_destroy_eventset was being ignored. The second change improves the delimiter check when separating the event description from the event mask description. The previous check only looked for a colon, but some of the event descriptions contain a colon, so descriptions would get displayed incorrectly.
The new check finds the "masks:" substring, which is what papi inserts to separate these two descriptions. The third change adds code to allow the user to enter events of the form pmu:::event or pmu::event when using the -e option in the program.

papi-papi-7-2-0-t/ChangeLogP541.txt

2015-03-02
* bcc508a9 src/components/perf_event/pe_libpfm4_events.c: Thanks much to Gary Mohr for the patch: This patch fixes a problem in the perf_events component that could cause get event info to produce incorrect results. The problem was reported by Harald Servat and occurs when the functions PAPI_event_name_to_code and PAPI_get_event_info are called for an event with a mask (name:mask) and then called again for the event without a mask (name). When this is done, the second call to PAPI_get_event_info will incorrectly return the event name and mask from the first call (name:mask). This patch also corrects a problem found with valgrind which was causing memory on the heap to get stranded. We were passing a char **event_string to the libpfm4 encode function, and it was allocating some memory and giving us back a pointer to the allocated space. The code in PAPI was responsible for freeing this space but failed to do so. After looking closer at the PAPI code, it does not need the information returned in this space, so the patch changes the code to not ask for the information so that libpfm4 no longer allocates heap space.

* 62e90303 src/Makefile.inc src/configure src/configure.in...: Generating pkg-config files for papi. Thanks to William Cohen for this patch (and to Phil Mucci for the patch review). Some software makes use of pkg-config (http://www.freedesktop.org/wiki/Software/pkg-config/) when using libraries. pkg-config selects compiler flags and libraries for compiling user code based on the installation location of the package.
It could make it a bit easier to build other software on papi by abstracting where the libraries are installed. Rather than having some complicated path to the installed library, users could use "pkg-config --libs --cflags papi" to get that information for the compile. If there are multiple versions of papi available on the machine, the user could get a particular one with something like "pkg-config --libs --cflags papi-5.4.0".

2015-02-28
* f6bc16c6 src/papi_events.csv: Add support for ARM 1176 cpus. This is the chip in the original Raspberry Pi. With the recently released Raspberry Pi 3.18.8 kernel, perf_event support is finally enabled by default.

2015-02-27
* 74801065 src/papi_events.csv: Add ARM Cortex A7 support. Tested on a Raspberry Pi 2 board.

2015-02-25
* 71e6e5e5 src/ctests/krentel_pthreads.c: Sync thread exit in krentel_pthreads.c. Thanks to William Cohen for this patch and to Phil Mucci for approving it. William Cohen and Michael Petlan noticed that this test can have threads dangling after the main thread is done. This patch tracks the created threads and ensures that they are joined before the code exits. Note: There is still some problem remaining. For example, the following test will sometimes (maybe 1 of 10 runs) generate an error message.
> ./ctests/krentel_pthreads 8 2000 10
....
[10] time = 8, count = 38110, iter = 20, rate = 1905500.0/Kiter
PAPI Error: thread->running_eventset == NULL in _papi_pe_dispatch_timer for fd 14!.
[0] time = 8, count = 38161, iter = 20, rate = 1908050.0/Kiter
krentel_pthreads.c PASSED

2015-02-20
* c0de16d8 INSTALL.txt: Added additional notes and examples for the MIC. Specify how to use qualifiers to set exclude_guest and exclude_host bits to 0. Use micnativeloadex to run the utilities.

2015-02-11
* 65825ef7 src/utils/native_avail.c: Change papi_native_avail to refer to event qualifiers (qual) rather than event masks. Thanks to Gary Mohr for this patch and the following notes.
This patch file fixes one bug and replaces the term "Unit Mask" and other names used to identify a unit mask with the term "event qualifier". This renaming was done because the term "Unit Mask" has a very specific meaning in the hardware. Many of the flags and other fields we can now provide with an event to control how it is counted have nothing to do with the unit masks defined in the manuals provided by the hardware vendors.

Summary of what changed:
Removed the -d command line argument. It controlled whether units should be displayed in output. Now we always display units if they are defined (the only place I have seen them defined is with rapl events).
Fixed a bug when displaying event units. It was displaying the units information in front of the event name and description. It now displays the units information after the description.
Renamed the -noumasks argument to -noqual. This prevents event qualifiers (previously known as unit masks) from being displayed.
Replaced the headings "Unit Mask" and "Mask Name" with "Qualifiers" and "Name" (when displaying a single event).

2015-02-10
* 91e36312 src/ctests/Makefile.recipies src/ctests/attach_cpu.c: Test case for attaching an eventset to a single CPU rather than a thread (attach_cpu). Thanks to Gary Mohr for this contribution. This patch adds a test case to demonstrate how to attach an event set to a cpu so that the event counts for events in that event set reflect how many of those events occurred on the attached cpu (instead of the number of events that occurred in a thread of execution). See the comments in attach_cpu.c to see how and why to probe with specific cpus (e.g. ./attach_cpu 3).

2015-02-02
* 1fc57875 src/components/cuda/Makefile.cuda.in src/components/cuda/README src/components/cuda/Rules.cuda...: Updated CUDA component supporting multiple GPUs and multiple CUDA contexts. This PAPI CUDA component uses the CUPTI library to get information about the event counters.
NOTE: To use this PAPI CUDA component, there is a difference from standard PAPI usage. When adding PAPI events to the CUDA component, each event needs to be added from the correct CUDA context. To repeat, for each CUDA device, switch to that device and add the events relevant to that device! If there is only one CUDA device, then the default context will be used and things should work as before.

* 40151180 src/ftests/Makefile: Reported by Mark Maurice: On linux systems without a fortran compiler installed we get an error when building the PAPI fortran tests. The reason for the error is that in the Makefile in the ftests directory the @echo lines start with spaces instead of tabs. 'make' is fussy about tabs and spaces and gives a 'missing separator' error if a command starts with spaces instead of a tab.

2015-01-20
* 1dec8a9d src/components/lustre/linux-lustre.c: Thanks to Gary Mohr for the patch: The patch solves the segmentation faults produced by the lustre component. The changes are in _lustre_shutdown_component(): a lustre_native_table=NULL statement was added, and later num_events=0 and table_size=32 were added in the same function to fully solve the segmentation faults.

2014-12-17
* aba85b18 man/man1/PAPI_derived_event_files.1 man/man1/papi_avail.1 src/Makefile.inc...: User defined events: Enhance PAPI preset events to allow user defined events via a user event definition file. Thanks to Gary Mohr for this patch and its documentation.
--------------------------------------------------------
This patch file enhances the code that processes PAPI preset event definition files (papi_events.csv) so that it can also now be used to process a user provided event definition file. PAPI still looks for an environment variable 'PAPI_USER_EVENTS_FILE' and, if found, uses its value as the pathname of the user event definition file to process (same behavior as before).
The change is that this is done right after processing the PAPI preset events rather than at the end of PAPI_library_init (after all components were initialized). An advantage of using this approach is that now user defined events, like preset events, can define multiple versions of the same event where each version is customized to a particular hardware platform (or pmu name).

The code which processes preset events was also enhanced in the following ways:

The papi_avail command was updated to also list user defined events in its output. The papi_avail help and man page have been updated to include user defined events in the descriptions. The man page was also updated to add a "see also" reference to a new 'PAPI_derived_event_files' man page. A new 'PAPI_derived_event_files' man page has been added to provide the user information about how to build an event definition file. This patch file contains both the source file changes (needed by doxygen) and updated copies of the man pages created by doxygen.

The code now allows both postfix (Reverse Polish Notation) and infix (algebraic) formulas to be entered. There is a new derived event type 'DERIVED_INFIX' to specify that the formula is provided in the algebraic format. The formulas will always be converted to postfix format as part of the event definition processing, so if the user does a 'papi_avail -e ' later it will always be displayed as a postfix formula.

When defining a new derived event (either preset or user defined), it is now possible to use any already known native event, preset event or user defined event. This means that new derived events can be created as a relationship between other already known (their definitions had to already be processed) derived events. When derived events are created, there is a list of native events needed by that defined event created and optionally a formula to compute the derived event's value.
If a new derived event is created that depends on another derived event, then the new event will inherit all the native events used by the event it depends on, and the new derived event's formula will be merged with the formula from the event it depends on (if there was one, or if it had an implied formula like derived add or sub). This means that after event definition processing completes, the event tables inside PAPI always contain the list of all native events needed to compute the derived event's results and a postfix formula that will be used to compute the event's result. So if a user does a 'papi_avail -e ', the output will show what events PAPI is going to count and how they will be used to generate the event's final value.

A new command 'EVENT' has been added to the code which is intended to be used for user defined events. It is identical to the existing command 'PRESET' used to define preset events. They are interchangeable and both can be used in both preset and user defined event definition files.

The code now allows the user to provide a short and long description for the derived event. The event definition commands 'PRESET' and 'EVENT' now support tags of "LDESC" and "SDESC" to identify what is found in the following string. This was done the same way as the already supported 'NOTE' tag.

These changes do not support the ability to create #define variables that can then be used in event definition formulas. This was supported by the old user event definition code. These changes delete the existing papi_user_event code (two files that are no longer needed).

2014-12-15
* f8b722a9 src/components/perf_event/tests/event_name_lib.c: perf_event tests: add sample haswell offcore event

2014-12-11
* adbae8cd src/papi_events.csv: Update presets for Intel Haswell and Haswell-EP (according to the updates of the libpfm4 event table for Intel Haswell and Haswell-EP). These mods have not been tested due to lacking access to an Intel Haswell system.
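The infix-to-postfix conversion described in the entries above can be illustrated with the classic shunting-yard algorithm. This is only a minimal Python sketch of the general technique, not PAPI's C implementation; the operand names N0, N1, N2 are hypothetical placeholders for native events.

```python
# Illustrative sketch (not PAPI's actual code): converting an infix
# derived-event formula such as N0+N1-N2 into the postfix (RPN) form
# that PAPI stores internally, via the shunting-yard algorithm.

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def infix_to_postfix(tokens):
    """Convert a list of infix tokens to postfix (RPN) order."""
    output, ops = [], []
    for tok in tokens:
        if tok in PRECEDENCE:
            # Pop operators of equal/higher precedence (left-associative).
            while ops and ops[-1] != "(" and PRECEDENCE[ops[-1]] >= PRECEDENCE[tok]:
                output.append(ops.pop())
            ops.append(tok)
        elif tok == "(":
            ops.append(tok)
        elif tok == ")":
            while ops and ops[-1] != "(":
                output.append(ops.pop())
            if not ops:
                raise ValueError("unbalanced parentheses")
            ops.pop()  # discard the "("
        else:
            output.append(tok)  # operand (native-event placeholder)
    while ops:
        if ops[-1] == "(":
            raise ValueError("unbalanced parentheses")
        output.append(ops.pop())
    return output

# e.g. N0+N1-N2 becomes N0 N1 + N2 -
rpn = infix_to_postfix(["N0", "+", "N1", "-", "N2"])
```

Converting at definition time, as the patch does, means 'papi_avail -e' can always display one canonical (postfix) form regardless of how the formula was entered.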
2014-11-14
* ca1ba786 doc/Doxyfile-common papi.spec src/Makefile.in...: Bump master to 5.4.1, we just released out of the stable-5.4 branch.

papi-papi-7-2-0-t/ChangeLogP543.txt

2016-01-25
* d779d1172a6e4c73b5ece9939c4d067c2b3d7b8d Update libpfm4 current with Jan 25 08:33:02 2016 version.

2016-01-07
* 0d9776b8 src/components/stealtime/linux-stealtime.c: Free allocated memory in the stealtime component when the component is shut down. Thanks to William Cohen for contributing this patch and the following explanation: Running examples with "valgrind --leak-check=full ..." showed a number of items allocated by the stealtime component were not freed when PAPI_shutdown() was called. This patch frees those unused memory allocations.

2016-01-06
* de40668c src/papi_preset.c: Fixed memory leak in papi_preset.c by updating the infix_to_postfix function. Thanks to William Cohen for discovering the leak. The infix_to_postfix function was re-written and tested using user defined events.

2015-12-30
* db37e115 src/utils/avail.c src/utils/native_avail.c: Added "-check" flag to papi_avail and papi_native_avail to test counter availability/validity. This patch updates the papi_avail and papi_native_avail utilities to use the "-check" flag to test the actual availability of counters. There were previously two different flags for this capability: papi_native_avail used "-validate" and papi_avail used "-avail_test". Based on a mailing list discussion these flags have been consolidated as "-check".

2015-12-29
* 72e0ffe8 src/components/lmsensors/linux-lmsensors.c: Fixed a minor error with multiple initializers for lmsensors_vector .default_granularity. Thanks to William Cohen for the bug report.

2015-12-07
* ec3582d8 src/utils/avail.c: papi_avail to test actual availability of counters using "papi_avail --avail-test". This problem and the associated patch were detected and contributed by Harald Servat. Thanks.
On an Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz system with PAPI 5.4.1 installed, the papi_avail command indicates that both PAPI_LD_INS and PAPI_SR_INS are available; however, the papi_event_chooser does not accept them (see below) and returns -1. This problem can occur in kernels from version 3.1 till 4.1. The kernel devs blocked all uses of the MEM_OPS events (including load and store). The patch modifies papi_avail to test the counters to see if they can be added.
papi_avail # gets all PAPI counters
papi_avail -a # gets all available PAPI counters
papi_avail -at # shows all available PAPI counters that can be added
[Ptools-perfapi: Oct 14 2015]

2015-11-30
* 1fab922e src/components/libmsr/README src/components/libmsr/configure src/components/libmsr/configure.in...: The libmsr component is updated to match major changes in the LLNL libmsr library and the LLNL msr-safe kernel module

2015-11-18
* 242b16d3 src/Makefile.inc src/components/cuda/configure src/components/cuda/configure.in...: Added the papi_cuda_sampling utility in /src/components/cuda/sampling; changed src/Makefile.inc and src/components/cuda/configure.in to build the utility during PAPI installation. Added the -ldl switch in /src/components/cuda/tests/Makefile because 3.10.0-229.14.1.el7.x86_64 had issues using libpapi.a during compilation of cuda component test programs

2015-10-21
* a10e8331 src/papi_events.csv: papi_events: add Intel Skylake presets. This just shares all of the broadwell events with skylake. Some quick tests show that this probably works. Someone with skylake hardware should validate this at some point.

2015-10-08
* 91736851 src/papi_internal.c: Thanks to David Eberius of ICL for reporting a bug in PAPI_get_event_info() in papi_internal.c; (info->component_index = (unsigned int) cidx) was missing at line 2554 of papi_internal.c

2015-08-27
* 502df070 src/Makefile.inc: Thanks to Steve Kaufmann for reporting the redundant () parameter in the OBJECTS expression of the src/Makefile.inc file.
Updated Makefile.inc by removing the redundant parameter.

2015-08-24
* 69fdc2e0 src/papi.c: Thanks to Harald Servat for reporting the PAPI_overflow issue for multiple eventsets. The problem was in the PAPI_start() function, in the branch at line 2166 of papi.c, if(is_dirty). After update_control_state(), it is required to re-initialize the overflow settings using set_overflow().

2015-07-29
* be81dc43 src/components/perf_event/perf_event.c: perf_event: update the ARM domain workaround. Older ARM processors could not separate out KERNEL vs USER events. ARMv7 starting with the Cortex A15 can, as can all ARMv8 (ARM64). This updates the code with a whitelist to properly allow setting the domains.

* 43be2588 src/linux-common.c: linux-common: clean up ARM cpu detection. Parsing cpuinfo is always a pain. Extra work because of Raspberry Pi (ARM1176) lying and saying it's ARMv7 rather than ARMv6.

* 5a101a50 src/linux-common.c: linux-common: split up x86, power and arm cpuinfo parsing

* 0d7772d9 src/linux-common.c: linux-common: clean up and comment the cpuinfo parsing code

2015-07-16
* 59489b1f src/components/libmsr/Makefile.libmsr.in src/components/libmsr/README src/components/libmsr/Rules.libmsr...: Create libmsr component for reading power information and writing power constraints using MSRs on some Intel processors. The PAPI libmsr component supports measuring and capping power usage on recent Intel architectures using the RAPL interface exposed through MSRs (model-specific registers). Lawrence Livermore National Laboratory has released a library (libmsr) designed to provide a simple, safe, consistent interface to several of the model-specific registers (MSRs) in Intel processors. The problem is that permitting open access to the MSRs on a machine can be a safety hazard, so access to MSRs is usually limited.
In order to encourage system administrators to give wider access to the MSRs on a machine, LLNL has released a Linux kernel module (msr_safe) which provides safer, white-listed access to the MSRs. PAPI has created a libmsr component that can provide read and write access to the information and controls exposed via the libmsr library. This PAPI component introduces a new ability for PAPI; it is the first case where PAPI is writing information to a counter as well as reading the data from the counter.

2015-07-13
* d326ecc9 src/components/perf_event/perf_event_lib.h src/papi_internal.c: Thanks to Steve Kaufman for providing a patch that increases PERF_EVENT_MAX_MPX_COUNTERS from 128 to 192 and enhances the corresponding warning message in papi_internal.c

2015-06-29
* e829baa5 src/components/cuda/tests/Makefile src/components/cuda/tests/cuda_ld_preload_example.README src/components/cuda/tests/cuda_ld_preload_example.c: Example of using LD_PRELOAD with the CUDA component. A short example of using LD_PRELOAD on a Linux system to intercept function calls and PAPI-enable an un-instrumented CUDA binary. Several CUDA events (e.g. SM PM counters) require a CUcontext handle to be provided since they are context switched. This means that we cannot use a PAPI_attach from an external process to measure those events in a preexisting executable. These events can only be measured from within the CUcontext, that is, within the CUDA enabled code we are trying to measure. If the user is unable to change the source code, they may be able to use LD_PRELOAD's ability to trap functions and measure the events from within the executable. See src/components/cuda/tests/cuda_ld_preload_example.README for details.

2015-06-26
* 0829a4f5 src/papi_events.csv: Add future broadwell-ep support. libpfm4 doesn't support it yet, but add it for when it appears.
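The LD_PRELOAD technique in the e829baa5 entry above substitutes an instrumented definition of a function ahead of the library's own at load time, so measurement code runs inside the target process without source changes. As a rough analogy only (in Python rather than the C used by the actual example, and with entirely hypothetical names), the same intercept-measure-forward pattern looks like this:

```python
# Hypothetical analogy: LD_PRELOAD interposes a symbol at load time;
# here we intercept a function at runtime by wrapping it, so the
# caller's code runs unmodified while we add measurement around it.
import functools
import time

call_log = []  # stands in for the collected counter readings

def intercept(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()          # "start counters"
        result = fn(*args, **kwargs)         # forward to the real function
        call_log.append((fn.__name__, time.perf_counter() - start))
        return result                        # "stop/read counters"
    return wrapper

def kernel(n):  # stands in for an un-instrumented CUDA launch
    return sum(i * i for i in range(n))

kernel = intercept(kernel)  # the "preload" step: rebind the name
result = kernel(1000)
```

In the real C example the rebinding happens in the dynamic loader (the preloaded shared object defines the symbol and forwards via dlsym(RTLD_NEXT, ...)), which is what lets the measurement run inside the existing CUcontext.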
2015-06-25
* 36c5b5b6 src/papi_events.csv: add broadwell predefined events. For now they are the same as Haswell, as that's what the Linux kernel does.

* f42eda64 src/papi_events.csv: Added definitions to Power8 for PAPI_SP_OPS, PAPI_DP_OPS.

2015-06-18
* f87542f7 src/components/perf_event/tests/event_name_lib.c: Added the [case 63: /*Haswell EP*/] line to the src/components/perf_event/tests/event_name_lib.c file to support offcore for haswell EP

* fbfc641f src/components/perf_event_uncore/tests/perf_event_uncore_cbox.c src/components/perf_event_uncore/tests/perf_event_uncore_lib.c: Added support for the Haswell-EP processor with model 63 in the src/components/perf_event_uncore/tests/perf_event_uncore_lib.c and src/components/perf_event_uncore/tests/perf_event_uncore_cbox.c files. As a result the perf_event_uncore, perf_event_uncore_multiple and perf_event_uncore_cbox tests now pass. Tested and verified on Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz with linux kernel 4.0.4-1.el6.elrepo.x86_64

2015-06-17
* 56698211 src/components/lustre/linux-lustre.c: Thanks to Gary Mohr for the patch that removes the error message (PAPI Error: Error Code -7, Event does not exist) on executing papi_native_avail in PAPI built with the lustre component

2015-06-16
* 1b9fd867 src/components/rapl/linux-rapl.c: rapl: allow DRAM to have a separate scaling factor from the CPU. On Haswell-EP the DRAM scaling value is different and cannot be detected. See https://lkml.org/lkml/2015/3/20/582

* 1aa74f85 src/components/rapl/linux-rapl.c: rapl: add support for Broadwell

2015-06-11
* a5ecda79 src/components/rapl/linux-rapl.c: Thanks to William Cohen for the patch which does the following: Checking the cpu family and model number is not sufficient to determine whether RAPL can be used. If papi is running inside a guest VM, the MSRs used by the PAPI RAPL component may not be available. There should be a simple read test to verify the RAPL MSR registers are available.
This allows the component to more clearly report that RAPL is unsupported rather than just exiting the program when the RAPL MSRs are not available.

2015-05-19
* 54c45107 src/components/rapl/utils/rapl_plot.c: Updated the rapl_plot utility so that the correct values/units are reported (e.g. scaled and fixed value counts should not be converted)

2015-05-04
* a34fbc62 src/papi_events.csv: papi_events.csv: typo in the ARM Cortex A53 definitions

2015-04-30
* caa3af72 src/papi_events.csv: papi_events.csv: add preset events for ARM Cortex A53. This is based purely on the names in the libpfm4 output; these were not validated in any way.

2015-04-20
* 66553715 INSTALL.txt: added compile incantation for compiling programs that offload code to MIC

2015-04-16
* 8914dcfc src/papi_events.csv: Bug reported by William Cohen in papi_events.csv for the event PAPI_L1_TCM

2015-03-31
* 023af5ec src/components/nvml/configure: Updated the NVML configure script, which requires autoconf and an updated configure script

* 2385c1b2 src/components/nvml/Makefile.nvml.in src/components/nvml/Rules.nvml src/components/nvml/configure.in: Updated the NVML configure script to allow separate include and library paths

2015-03-30
* 3d509095 src/components/infiniband_umad/linux-infiniband_umad.c: Bugfix linux-infiniband_umad.c to include linux-infiniband_umad.h rather than linux-infiniband.h. Thanks to Aurelien Bouteiller for pointing out this bug.

* b865f227 src/components/vmware/vmware.c: Corrected function name in _vmware_vector from _vmware_init to _vmware_init_thread.

2015-03-24
* 2f58a4d8 src/configure: Regenerated configure to match the PAPI_GRN_SYS patch

* 12e6ef31 src/components/perf_event/tests/perf_event_system_wide.c: Support PAPI_GRN_SYS granularity for the perf component, updating the system wide test (patch 2 of 2).
Thanks to William Cohen for this patch and the documentation. Make sure that a sane cpu number is selected with PAPI_GRN_SYS. Corrections to output and comments of the perf_event_system_wide.c test.

* 42879693 src/components/perf_event/perf_event.c src/components/perf_event/tests/perf_event_system_wide.c src/config.h.in...: Support PAPI_GRN_SYS granularity for the perf component, picking a sane CPU number (patch 1 of 2). Thanks to William Cohen for this patch and the documentation. The checks in the perf_event_open syscall cause a failure when both pid=-1 and cpu=-1. The perf_event component was passing in pid=-1 and cpu=-1 when PAPI_GRN_SYS was selected. If possible, the code should pick the current processor that the command is running on so that the permission check works properly when PAPI_GRN_SYS is used. The patch also makes the test fail if PAPI_GRN_SYS is unable to add PAPI_TOT_CYC.

* 0ab9b0c8 src/ctests/krentel_pthreads.c: Added a call to unregister the overflow handler, plus small code cleanup

2015-03-05
* d886c49c src/papi.c src/papi_libpfm4_events.c src/utils/avail.c: Clean output from the papi_avail tool when there are no user defined events. Thanks to Gary Mohr for this patch. The changes in this patch improve the output from the papi_avail tool. It was printing the user defined events header and a PAPI Error message when no user defined events existed. These changes add code in the enum call to return an error when trying to fetch the first user defined event if no user events are defined. This allows the application to detect that no user events are known and skip printing the user defined event heading. It also prevents the application from calling PAPI_get_event_info with a user defined event code that does not exist, which avoids the PAPI Error message. Also a one line change to modify a debug message type to make the debug output produced by papi_libpfm4_events.c consistent.
2015-03-03
* ee0c58d7 src/components/cuda/linux-cuda.c: Do not generate an error if the CUDA libraries cannot be loaded, just write a debug message

* 08bb9bf0 src/configure: Updating the version number to 5.4.1

2015-03-02
* 01f742c1 release_procedure.txt: Minor change to specify locations of some files

papi-papi-7-2-0-t/ChangeLogP550.txt

2016-09-08
* dfa52d3f man/man1/PAPI_derived_event_files.1 man/man1/papi_avail.1 man/man1/papi_clockres.1...: Generated man files for release

2016-08-18
* 43c1be67 src/ctests/all_native_events.c: ctests all_native: Make sure we count all native events for KNL.

* adc47828 src/components/perf_event_uncore/tests/perf_event_uncore_lib.c: perf_event_uncore tests: KNL has uncore support.

* 0a9e1a8d src/components/perf_event/tests/event_name_lib.c: perf_event tests: add KNL offcore event.

* e9144b9b src/papi_events.csv: Added preset definitions for KNL.

2016-08-12
* 03c766a6 src/components/rapl/linux-rapl.c: linux-rapl: update KNL support. Knight's Landing does not support pp0, and it also uses a different unit for DRAM RAPL (much like the hsw-ep does)

2016-08-04
* ce57b7a7 src/testlib/test_utils.c: testlib: give a better error message if a component failed to initialize.
Old message:
./zero
test_utils.c FAILED Line # 697 Error: Zero Counters Available! PAPI Won't like this!
New message:
./zero
Component perf_event disabled due to Error initializing libpfm4
test_utils.c FAILED Line # 702 Error: ERROR! Zero Counters Available!

2016-07-25
* ae00a502 src/papi_internal.c: add William Cohen's rewrite of the _papi_hwi_postfix_calc function, which corrects the parsing and makes the parser more robust by catching any errors in the parsing early with asserts in the code rather than silently corrupting memory.

2016-07-22
* a6359b9d src/papi_preset.c: This was another bug of smashing the stack.
This code declared the stack as:

    static char stack[PAPI_HUGE_STR_LEN];

But then did this later:

    memset(stack, 0, 2*PAPI_HUGE_STR_LEN);

How did our static analysis tools not catch this one?

2016-06-30

* f35e6e77 doc/Doxyfile-common papi.spec src/Makefile.in...: Updated PAPI
  version to 5.5 for upcoming release.

2016-06-29

* 48aee8e1 src/Makefile.inc src/components/cuda/Rules.cuda
  src/components/cuda/linux-cuda.c: cuda/sampling, cuda: Move sampling
  build rules to the cuda component. Minor bugfix in linux-cuda.c to
  check ok return status.

2016-06-28

* 78249608 src/components/cuda/sampling/libactivity.so
  src/components/cuda/sampling/path.h
  src/components/cuda/sampling/test/matmul...: cuda/sampling: Removing
  generated files that should not be in the repository

2016-06-27

* 10385c63 src/components/cuda/sampling/Makefile: Adding the missing
  Makefile for cuda/sampling.

2016-06-22

* 45c2935e src/papi_events.csv: Correct IBM Power7 and Power8 computation
  of PAPI_L1_DCA. When reviewing the test results for IBM Power7 and
  Power8, Michael Petlan found that the PAPI_L1_DCA preset was
  incorrectly computed. The L1 cache misses need to be subtracted rather
  than added to the result.

2016-06-23

* 0364d397 src/components/powercap/README
  src/components/powercap/linux-powercap.c
  src/components/powercap/tests/powercap_basic.c: Cleanup powercap
  component. Most changes are cosmetic and achieved by running through
  astyle and cleaning up manually. The README file should match the
  powercap component now rather than inheriting generic comments from
  other components.

* 1c64bfc0 src/papi_events.csv: Added FP (SP, DP) presets for Broadwell.
  NOT TESTED yet due to lack of access to bdw hardware

2016-06-22

* 0d006ea3 src/components/rapl/linux-rapl.c: add Intel Skylake and
  Knights Landing RAPL support

* bd921b74 src/ftests/fmatrixpapi.F src/testlib/ftests_util.F: Eliminate
  the sole use of the ftests_skip subroutine. There was only one test
  using the ftests_skip subroutine, fmatrixpapi.F.
Converted fmatrixpapi.F to use the ftest_skip subroutine like all the
other Fortran tests.

2016-06-21

* e9cde551 src/ctests/tenth.c: Correct the event string names for
  tenth.c. There are stray ": " at the end of the event names in
  ctests/tenth.c. These are unneeded because the ctests support routines
  already insert a ": " after the event name when the error is printed
  out.

* 97fb93c3 src/testlib/ftests_util.F: Have Fortran test support code
  report errors more clearly. When a Fortran test called ftest_skip or
  ftest_fail, the support code would attempt to print out error strings.
  However, this support code would print out gibberish because the string
  was not properly initialized. There doesn't seem to be an easy way in
  Fortran to get the error string; for the time being just print out the
  error number and people will need to manually map it back to the
  string.

2016-06-17

* db9c70f5 src/papi_events.csv: Added FP (SP, DP) presets for Skylake.
  Corrected L1_LDM|STM, L2_DCW|TCW, PRF_DM, STL_ICY presets for Skylake.

* 9de0c97f src/components/libmsr/linux-libmsr.c: Bugfix: libmsr component
  can now disable itself without printing an error message to the screen

* d09657bf src/components/cuda/linux-cuda.c: Bugfix: CUDA component can
  now disable itself without printing an error message to the screen

2016-05-19

* 4718b481 src/components/perf_event/perf_event.c
  src/components/perf_event_uncore/perf_event_uncore.c: Force all
  processors to check event schedulability by reading the counters. There
  are situations where the perf_event_open syscall will return a file
  descriptor for a set of events even when they cannot be scheduled
  together. This occurs on 32-bit and 64-bit ARM processors and MIPS
  processors. This problem also occurs on Linux kernels older than 2.6.33
  and when the watchdog timer steals a performance counter. To check that
  the performance counters are properly set up, PAPI needs to check that
  the counter values can be successfully read.
Rather than trying to avoid this test, PAPI will now always do it.

2016-03-30

* 35264ea6 src/papi.h: update the caddr_t compatibility hack in papi.h.
  Erik Schnetter reported that the workaround failed with a C11 compiler.
  Really, we should replace all instances of caddr_t with something
  better, but I'm not sure what that does for older compilers or breakage
  of ABI.

2016-03-16

* 504d05c3 src/Rules.pfm4_pe src/papi.h src/papi_fwrappers.c: Only expose
  the shared library symbols listed in *papi.h files. The shared library
  should avoid exposing internal symbols of the library. This change
  hides PAPI's internal symbols when it is built with libpfm 4. Only the
  functions in papi.h and the associated Fortran wrapper functions are
  visible to code using the library. This change also makes libpapi.so
  slightly smaller (29KB for a stripped x86_64 shared library, or about
  6%). Note that a similar patch has been proposed for upstream libpfm4
  and would be needed for the bundled libpfm if papi is being built with
  the bundled libpfm4.

2016-03-10

* 943fb056 INSTALL.txt: Fix leftover doxygen reference in INSTALL file.
  Noticed this while working through build/install steps on a local
  system. Looks like the doxygen command was switched from
  Doxyfile-everything to Doxyfile-html as part of revision bfee45 "Rework
  the doxygen configuration files". This fixes up the INSTALL reference
  to match.

* 947f6cb3 src/Makefile.inc: Fix a bashism found in Makefile.inc. While
  building on an ubuntu system, hit an error that took a bit to run down:
    /bin/sh: 1: [: perf_event: unexpected operator
    /bin/sh: 1: [: perf_event_uncore: unexpected operator
  This was on an ubuntu system where /bin/sh is actually /bin/dash, and
  is due to a bashism in Makefile.inc for the build and clean of
  cuda_samples. Swapping out the '==' for an '=' should be safe.
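The dash-vs-bash pitfall in the Makefile.inc entry above is easy to
reproduce: POSIX `test`/`[` only defines `=` for string comparison, while
`==` is a bash extension. A minimal illustration (the strings compared
here are arbitrary):

```shell
# POSIX-compliant string comparison: works in dash, bash, and other
# sh implementations, so it is safe inside Makefile recipes.
[ "perf_event" = "perf_event" ] && echo "match"

# The '==' form is a bashism; dash rejects it with "unexpected operator".
# Whether this line fails depends on what /bin/sh points to.
sh -c '[ "a" == "a" ]' 2>/dev/null || echo "'==' is not portable"
```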
2016-02-29

* 7996d480 src/components/coretemp/linux-coretemp.c: Make coretemp
  internal functions static where possible. As much of the internals of
  the papi shared library as possible should be hidden. A number of the
  internal functions for the perf_event and coretemp components should be
  static since they are only used within the individual component. Making
  the functions static allows the compiler to generate better code and
  reduces the number of entries in the PLT (Procedure Linkage Table).

2016-02-26

* a0240d5a src/components/perf_event/perf_event_lib.h: Removed the
  re-declaration of the static functions in perf_event_lib.h

* 5d6e8295 src/components/appio/appio.c src/components/example/example.c
  src/components/lmsensors/linux-lmsensors.c...: Thanks to William Cohen
  of RedHat for providing the patches with the following description:
  Make perf_event and perf_event_uncore internal functions static where possible
  Make appio component internal functions static where possible
  Make example component internal functions static where possible
  Make lmsensors component internal functions static where possible
  Make lustre component internal functions static where possible
  Make micpower component internal functions static where possible
  Make mx component internal functions static where possible
  Make net component internal functions static where possible
  Make rapl component internal functions static where possible
  Make stealtime component internal functions static where possible

2016-02-24

* 0eb308b4 src/components/cuda/README: Fixed cuda component README to use
  the correct configure flags. Thanks to Jianqiao Liu for pointing out
  errors in the README file.

2016-02-15

* 70bd7584 src/components/powercap/utils/README
  src/components/powercap/utils/powercap_write_test.c: Cleanup powercap
  utility.
Removed mention of libmsr and the no-longer-needed union type left over
from the libmsr example

2016-01-31

* 03afa3fe src/components/powercap/README
  src/components/powercap/utils/Makefile
  src/components/powercap/utils/README...: added initial powercap write
  test and readme

2016-01-26

* 8fd9e4e3 src/components/powercap/tests/Makefile
  src/components/powercap/tests/powercap_basic.c: added power cap read
  test

* edf8af95 src/components/powercap/Rules.powercap
  src/components/powercap/linux-powercap.c: added PAPI component

* 66df01be ChangeLogP542.txt ChangeLogP543.txt RELEASENOTES.txt...: PAPI
  5.4.3 release (releasenotes, changelog, man files, ...)

papi-papi-7-2-0-t/ChangeLogP551.txt

2016-11-17

* 4b7c2c8b src/components/coretemp/linux-coretemp.c
  src/components/cuda/configure src/components/cuda/configure.in...:
  Handling some of the problems exposed by Coverity. Mostly adding
  strncpy termination to some components (coretemp, lmsensors, micpower).
  Removed some unused component writing functions (lustre, mx). Fixed the
  CUDA component configure.in to get the correct version of nvcc. Fixed
  division so it works in double precision rather than integer in the
  rapl component. Fixed a minor complaint about a stack counter variable
  in papi_preset. Thanks to William Cohen for sending the Coverity
  results report.

2016-11-15

* 7384d4d1 src/components/rapl/linux-rapl.c: Enable RAPL for Broadwell-EP

2016-11-04

* 0e90ecd4 src/Makefile.inc: Minor change: Removed unneeded characters in
  src/Makefile.inc. (Thanks to Steve Kaufmann)

2016-10-24

* b72df977 src/components/perf_event/perf_event_lib.h: Increase
  PERF_EVENT_MAX_MPX_COUNTERS to 384 to support KNL uncore events

* Update libpfm4 to enable Intel Knights Landing untile PMU support.
2016-09-18

* b92abb7c src/components/powercap/utils/Makefile
  src/components/powercap/utils/powercap_plot.c
  src/components/powercap/utils/powercap_write_test.c: changed the tool
  in /powercap/utils to behave as the similar tool in /rapl/utils does.
  Removed the old code residing in /powercap/utils.

2016-09-16

* 51d76878 src/threads.c: threads: silence compiler warning; our_tid is
  only being used in debug statements

* 33aacc65 src/papi_preset.c: papi_preset: quiet a compiler warning; we
  were setting the papi_preset variable but only using it in debug
  statements. Tell the compiler to not warn in this case.

* 7ff9a01c src/ctests/zero_omp.c: tests/zero_omp: fix warning in
  zero_omp; we weren't using the maxthr variable

* 33deefbd src/components/rapl/tests/rapl_basic.c: components/rapl: fix
  compiler warning in rapl_basic test

papi-papi-7-2-0-t/ChangeLogP560.txt

Tue Dec 5 20:10:50 2017 -0800 William Cohen

* src/libpfm4/lib/events/power9_events.h,
  src/libpfm4/tests/validate_power.c: Update libpfm4. Current with commit
  206dea666e7c259c7ca53b16f934660344293475: Ensure unique names for IBM
  Power 9 events. Older versions of PAPI use the event name to look up
  the libpfm event number when doing the enumeration of the available
  events. If there were multiple events with the same name in libpfm, the
  earliest one would be selected. This selection would cause the
  enumeration of events in papi_native_avail to get stuck looping on the
  first duplicate-named event in a pmu. In the case of IBM Power 9 the
  enumeration would get stuck on PM_CO0_BUSY. Gave each event a unique
  name to avoid this unfortunate behavior.

2017-11-16 Will Schmidt

* src/papi_events.csv: revised papi_derived patch. [PATCH, papi] Updated
  derived entries for power9. This is a re-implementation of the patch
  that Will Cohen posted earlier, which uses the (newly defined)
  PM_LD_MISS_ALT entry instead of the PM_LD_MISS_FIN.
Thanks, -Will

2017-12-05 Heike Jagode (jagode@icl.utk.edu)

* release_procedure.txt: Updated notes for release procedure.

2017-12-05 Vince Weaver

* src/extras.c: extras.c: add string.h include to make the ffsll warning
  go away

2017-12-04 Heike Jagode (jagode@icl.utk.edu)

* src/configure, src/configure.in: Fixed configure bug: Once ffsll
  support is detected, set HAVE_FFSLL to 1 in config.h. Tested without
  the configure flag --with-ffsll, with --with-ffsll=yes, and with
  --with-ffsll=no.

2017-12-04 Vince Weaver

* src/ctests/Makefile.recipies, src/ctests/locks_pthreads.c: ctests:
  locks_pthreads: adjust run count again; linear slowdown makes things
  run really quickly. This patch scales it down by the square root of the
  number of cores, which is maybe a better compromise.

* src/ctests/locks_pthreads.c: ctests: locks_pthreads, minor cleanups

2017-11-20 William Cohen

* src/ctests/locks_pthreads.c: Keep the locks_pthreads test's amount of
  work reasonable on many-core machines. The runtime of the
  locks_pthreads test scaled with the number of processors on the machine
  because of the serialized increment operation in the test. As more
  machines are available with 100+ processors, the runtime of
  locks_pthreads is becoming excessive. Revised the test to specify the
  approximate total number of iterations and split the work among the
  threads.

Fri Dec 4 11:31:46 2015 -0500 sangamesh

* src/extras.c, src/papi.h: Revert change that added ffsll to papi.h.
  This reverts commit 2f1ec33a9e585df1b6343a0ea735f79974c080df.
commit 2f1ec33a9e585df1b6343a0ea735f79974c080df changed

    #if (!defined(HAVE_FFSLL) || defined(__bgp__))
    int ffsll( long long lli );
    #endif

to

    extern int ffsll( long long lli );

in extras.c to avoid a warning when --with-ffsll is used as a config
option.

Thu Apr 20 11:31:38 2017 -0400 Stephen Wood

* src/extras.c, src/papi.h: revert part of patch that added extra
  attributes to ffsll. This manually reverts part of commit
  9e199a8aee48f5a2c62d891f0b2c1701b496a9ca "cast pointers appropriately
  to avoid warnings and errors"

Sun Dec 3 09:42:44 2017 -0800 Will Schmidt

* src/libpfm4/lib/events/power9_events.h,
  src/libpfm4/tests/validate_power.c: Updated libpfm4. Current with
  commit ed3f51c4690685675cf2766edb90acbc0c1cdb67 (HEAD -> master,
  origin/master, origin/HEAD): Add alternate event numbers for power9. I
  had previously missed adding the _ALT entries, which allow some events
  to be specified on different counters. This patch fills those in. This
  patch also adds a few validation tests for the ALT events.

2017-11-28 Heike Jagode (jagode@icl.utk.edu)

* src/utils/papi_avail.c, src/utils/papi_native_avail.c: Fixed utility
  option inconsistencies between papi_avail and papi_native_avail. There
  are more inconsistencies with other PAPI utilities, which will be
  addressed eventually.

2017-11-28 Heike Jagode

* README.md: README.md edited online with Bitbucket
* README.md: README.md edited online with Bitbucket
* README.md: README.md edited online with Bitbucket
* README.md: README.md edited online with Bitbucket

2017-11-27 Heike Jagode

* src/components/powercap/linux-powercap.c: More clean-ups and checking
  of return values.
Mon Nov 13 23:15:53 2017 -0800 Thomas Richter

* src/libpfm4/lib/pfmlib_common.c: Update libpfm4. Current with commit
  f5331b7cbc96d9f9441df6a54a6f3b6e0fab3fb9: better fix for pfmlib_getl().
  The following commit:
    commit 9c69edf67f6899d9c6870e9cb54dcd0990974f81
    better param check in pfmlib_getl()
  fixed parameter checking of pfmlib_getl() but missed one condition on
  the buffer argument. It is char **buffer. Therefore we need to check if
  *buffer is not NULL before we can check *len.

2017-11-19 Asim YarKhan

* src/components/cuda/linux-cuda.c: CUDA component: Bug fix for releasing
  and resetting event list. When an event addition failed because the
  event (or metric) requires multiple runs, the eventlist and
  event-context structure was not being cleaned up properly. This fixes
  the event cleanup process.

2017-11-17 Asim YarKhan

* src/components/powercap/tests/powercap_basic.c,
  src/components/powercap/tests/powercap_limit.c: Powercap component:
  Updated tests to handle no-event-counters (num_cntrs==0) and skip some
  compiler warnings (argv, argc unused)

2017-11-16 William Cohen

* src/components/lmsensors/linux-lmsensors.c: Make more of the lmsensors
  component's internal state hidden. There are a number of function
  pointers stored in variables that are only used within the lmsensors
  component. Making those static ensures they are not visible outside the
  lmsensors component.

* src/components/lmsensors/linux-lmsensors.c: Make the internal
  cached_counts variable static. Want to make as little information about
  the internals of the PAPI lmsensors component visible to the outside as
  possible. Thus, making the cached_counts variable static.

2017-11-15 William Cohen

* src/components/lmsensors/linux-lmsensors.c: Avoid statically limiting
  the number of lmsensor events allowed. Some high-end server machines
  provide more events than the 512-entry limit imposed by the
  LM_SENSORS_MAX_COUNTERS define in the lmsensor component (observed 577
  entries on one machine).
When this limit was exceeded the lmsensor component would write beyond
the array bounds, causing ctests/all_native_events to crash. Modified the
lmsensor code to dynamically allocate the required space for all the
available lmsensor entries on the machine. This allows
ctests/all_native_events to run to completion.

* src/components/appio/appio.c,
  src/components/coretemp/linux-coretemp.c,
  src/components/example/example.c,
  src/components/infiniband/linux-infiniband.c,
  src/components/lustre/linux-lustre.c,
  src/components/rapl/linux-rapl.c: Use the correct argument order for
  calloc function calls. Some calls to calloc in PAPI have the order of
  the arguments reversed. According to the calloc man page, the number of
  elements is the first argument and the size of each element is the
  second argument. Due to alignment constraints the second argument might
  be rounded up. Thus, it is best not to swap the arguments to calloc.

2017-11-15 Philip Vaccaro

* src/components/powercap/linux-powercap.c,
  src/components/powercap/tests/powercap_basic.c: Updates and changes to
  the powercap component to address a few areas. Various things were
  changed, but mainly things were simplified and made more streamlined.
  The main focus was on simplifying the management of the system files.
Mon Nov 13 23:15:53 2017 -0800 Thomas Richter

* src/libpfm4/docs/man3/pfm_get_event_encoding.3,
  src/libpfm4/docs/man3/pfm_get_os_event_encoding.3,
  src/libpfm4/lib/events/amd64_events_fam11h.h,
  src/libpfm4/lib/events/amd64_events_fam12h.h,
  src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_priv.h,
  src/libpfm4/tests/validate_x86.c: Update libpfm4. Current with commit
  9c69edf67f6899d9c6870e9cb54dcd0990974f81: better param check in
  pfmlib_getl(). This patch ensures that len >= 2 because we do: m = l - 2;
  Reviewed-by: Hendrik Brueckner

2017-11-13 Vince Weaver

* src/components/perf_event/pe_libpfm4_events.c: pe_libpfm4_events:
  properly notice if trying to add an invalid umask. This passes the
  broken-event test case and all of the unit tests, but it would be good
  to test this on codes that do a lot of native event tests. The
  pe_libpfm4_events code *really* needs a once-over; it is currently a
  confusing mess.

* src/components/perf_event/tests/Makefile,
  src/components/perf_event/tests/broken_events.c,
  src/components/perf_event/tests/event_name_lib.c,
  src/components/perf_event/tests/event_name_lib.h: perf_event/tests: add
  broken event name test; we were wrongly accepting event names with
  invalid umasks

2017-11-13 Philip Mucci

* src/utils/print_header.c: Removed extraneous colon in VM vendor output

2017-11-10 Vince Weaver

* src/validation_tests/papi_l1_dcm.c,
  src/validation_tests/papi_l2_dcm.c,
  src/validation_tests/papi_l2_dcr.c,
  src/validation_tests/papi_l2_dcw.c: validation_tests: fix compiler
  warnings on arm32. On Raspberry Pi we were getting warnings where we
  were printing sizeof() values with %ld. Convert to %zu instead.

2017-11-09 Vince Weaver

* src/validation_tests/papi_l2_dca.c: validation_tests: papi_l2_dca fix
  crash on ARM32. On Raspberry Pi it's not possible to detect the L2
  cache size, so the test was dividing by zero.

* src/linux-common.c: linux-common: remove warning on not finding mhz in
  cpuinfo. This was added recently and is not needed.
Most ARM32 devices don't have MHz in the cpuinfo file and it's not really
a bug.

* src/components/perf_event/perf_event.c: perf_event: disable the old
  pre-Linux-2.6.34 workarounds by default. There were a number of bugs in
  perf_event that PAPI had to work around, but most of these were fixed
  by 2.6.34. In order to hit these bugs you would need to be running a
  kernel from before 2010, which wouldn't support any recent hardware.
  Unfortunately these bugs are hard to test for. We were enabling things
  based on kernel versions, but this caught vendors (such as Redhat)
  shipping 2.6.32 kernels that had backported fixes. This fix just
  #ifdefs things out; if no one complains then we can fully remove the
  code.

* src/components/perf_event/perf_event.c: perf_event: decrement the
  available counter count if NMI_WATCHDOG is stealing one

* src/components/perf_event/perf_event.c: perf_event: move the paranoid
  handling code to its own function

* src/components/perf_event/perf_event.c: perf_event: centralize the
  fast_counter_read flag; just use the component version of the flag,
  rather than having a shadow global version.

2017-11-09 William Cohen

* src/linux-memory.c: Make the fallback generic_get_memory_info function
  more robust. On the aarch64 processor with Linux 4.11.0 kernels,
  /sys/devices/system/cpu/cpu0/cache is available, but the index[0-9]
  subdirectories are not fully populated with information about cache and
  line size, associativity, or number of sets. These missing files would
  cause the generic_get_memory_info function to attempt to read data
  using a NULL file descriptor, causing the program to crash. Added
  checks that every fopen and fscanf was successful, and just say there
  is no cache if there is any failure.

2017-11-09 Asim YarKhan

* src/components/cuda/linux-cuda.c, src/components/cuda/tests/Makefile,
  src/components/nvml/tests/Makefile, src/configure, src/configure.in:
  Enable icc and nvcc to work together in cuda and nvml components.
For nvcc to work with Intel icc to compile the cuda and nvml components
and tests, it needs to use nvcc -ccbin=<$CC-compilerbin>. The compiler
name in CC also needs to be clean, so CC= and any other flags are pushed
to CFLAGS (changed in the src/configure.in script).

* src/ctests/mpifirst.c: Minor correction to mpifirst.c test

2017-11-09 Vince Weaver

* src/utils/print_header.c: utils: print fast_counter_read (rdpmc) status
  in the utils header

2017-11-08 William Cohen

* src/validation_tests/cache_helper.c: Ensure access to array within
  bounds. Coverity reported the following issues. Need the test to be
  "type>=MAX_CACHE" rather than "type>MAX_CACHE".

  Error: OVERRUN (CWE-119):
    papi-5.5.2/src/validation_tests/cache_helper.c:85: cond_at_most:
    Checking "type > 4" implies that "type" may be up to 4 on the false
    branch.
    papi-5.5.2/src/validation_tests/cache_helper.c:90: overrun-local:
    Overrunning array "cache_info" of 4 24-byte elements at element index
    4 (byte offset 96) using index "type" (which evaluates to 4).

  Error: OVERRUN (CWE-119):
    papi-5.5.2/src/validation_tests/cache_helper.c:101: cond_at_most:
    Checking "type > 4" implies that "type" may be up to 4 on the false
    branch.
    papi-5.5.2/src/validation_tests/cache_helper.c:106: overrun-local:
    Overrunning array "cache_info" of 4 24-byte elements at element index
    4 (byte offset 96) using index "type" (which evaluates to 4).

  Error: OVERRUN (CWE-119):
    papi-5.5.2/src/validation_tests/cache_helper.c:117: cond_at_most:
    Checking "type > 4" implies that "type" may be up to 4 on the false
    branch.
    papi-5.5.2/src/validation_tests/cache_helper.c:122: overrun-local:
    Overrunning array "cache_info" of 4 24-byte elements at element index
    4 (byte offset 96) using index "type" (which evaluates to 4).
* src/ctests/overflow_pthreads.c: Eliminate coverity overflow warning
  about expression

* src/components/perf_event_uncore/tests/perf_event_uncore_lib.c: Remove
  dead code from perf_event_uncore_lib.c

2017-11-09 Vince Weaver

* src/components/perf_event/perf_event.c: perf_event: don't initialize
  globals statically; from the mucci-5.5.2 tree

2017-11-08 phil@minimalmetrics.com

* src/linux-common.c: linux-common: clean up the /proc/cpuinfo parsing
  code. From the mucci-cleanup branch

* src/components/perf_event/perf_event.c,
  .../perf_event_uncore/perf_event_uncore.c, src/papi_libpfm4_events.c,
  src/papi_libpfm4_events.h: perf_event: clean up
  _papi_libpfm4_shutdown(). From the mucci-cleanup branch

* src/utils/print_header.c: utils: clean up the cpuinfo header. From the
  mucci-cleanup branch

* src/papi_internal.c, src/papi_internal.h: papi_internal: add
  PAPI_WARN() function. From the mucci-cleanup branch

* src/components/perf_event/pe_libpfm4_events.c: perf_event: clean up
  pe_libpfm4_events. From the mucci-cleanup branch

2017-11-08 Vince Weaver

* src/utils/papi_avail.c: utils/papi_avail: update the manpage info based
  on changes by Phil Mucci

* .../perf_event/tests/perf_event_system_wide.c: perf_event tests:
  perf_event_system_wide: don't fail if permissions restrict system-wide
  events. Right now we just skip if we get EPERM; we should also maybe
  check the perf_event_paranoid setting and print a more meaningful
  report.

* src/ctests/locks_pthreads.c: ctests/locks_pthreads: avoid printing
  values when in quiet mode

2017-08-31 phil@minimalmetrics.com

* src/Makefile.inc: Better symlink creation for shared library in make
  phase

2017-08-28 phil@minimalmetrics.com

* doc/Makefile, src/.gitignore, src/Makefile.inc,
  src/components/.gitignore, src/components/Makefile_comp_tests,
  src/ctests/.gitignore, src/ctests/Makefile.recipies,
  src/ftests/.gitignore, src/ftests/Makefile.recipies,
  src/testlib/.gitignore, src/utils/.gitignore, src/utils/Makefile,
  src/validation_tests/.gitignore,
src/validation_tests/Makefile.recipies: Full cleanup, including removal
  of .gitignore files that prevented us from realizing we were really
  cleaning/clobbering properly

* src/validation_tests/.gitignore: .gitignore Makefile.target

* src/papi.c: Remove PAPI_VERB_ECONT setting by default from the
  initialization path. This prints all kinds of needless errors on
  virtual platforms.

* src/x86_cpuid_info.c: Remove leftover printf

2017-08-21 phil@minimalmetrics.com

* src/ctests/locks_pthreads.c: Test now performs a fixed number of
  iterations, and reports lock/unlock timings per thread.

* src/components/perf_event/perf_event.c: Added a more descriptive error
  message to the exclude_guest check

* src/papi_internal.c: Removed leading newline and trailing '.' from
  error messages

* src/papi_preset.c: Updated message for derived event failures

2017-11-07 Vince Weaver

* src/Makefile.inc, src/ctests/Makefile, src/ctests/Makefile.target.in,
  src/ftests/Makefile, src/ftests/Makefile.target.in,
  src/testlib/Makefile.target.in, src/utils/Makefile.target.in,
  src/validation_tests/Makefile, src/validation_tests/Makefile.target.in:
  tests: make sure DESTDIR and DATADIR are passed in when doing an
  install

* src/ctests/Makefile, src/ctests/Makefile.target.in,
  src/ftests/Makefile, src/ftests/Makefile.target.in, src/utils/Makefile,
  src/utils/Makefile.target.in, src/validation_tests/Makefile,
  src/validation_tests/Makefile.target.in:
  ctests/ftests/utils/validation_tests: get shared library linking
  working again. This should let the various tests and utils be linked
  against the shared library again.

* src/validation_tests/Makefile: validation_tests: add an installation
  target. This makes the validation tests have an install target, like
  the ctests and ftests.

* src/ctests/Makefile, src/ftests/Makefile: ctests/ftests: fix "install"
  target. At some point DATADIR was renamed datadir and the install
  targets were not updated.
2017-11-07 Asim YarKhan

* bitbucket-pipelines.yml: Bitbucket pipeline testing: Inspired by Phil
  Mucci's branch; copied the functionality tests run in that branch.

* src/components/lmsensors/linux-lmsensors.c: lmsensors component:
  Changed event names to use lm_sensors (only once) instead of LM_SENSORS
  (twice) to be consistent with other events

2017-11-02 William Cohen

* src/components/appio/tests/iozone/gnu3d.dem: gnu3d.dem should not be
  executed by the test framework. This file is a gnuplot file and should
  not be executed as part of the tests. Removing the executable
  permissions will signal to the testing framework that it shouldn't be
  executed.

* src/components/appio/tests/iozone/Gnuplot.txt: Gnuplot.txt should not
  be executed by the test framework. This file is a readme file and
  should not be executed as part of the tests. Removing the executable
  permissions will signal to the testing framework that it shouldn't be
  executed.

* .../appio/tests/iozone/iozone_visualizer.pl,
  src/components/appio/tests/iozone/report.pl: Fix perl scripts so they
  run on Linux machines. The DOS-style newlines were preventing Linux
  from selecting the appropriate interpreter for these scripts and
  causing these tests to fail.

2017-11-07 Asim YarKhan

* src/components/lmsensors/configure: lmsensors component: Regenerate the
  configure file for the component

2017-11-02 William Cohen

* src/components/lmsensors/Makefile.lmsensors.in,
  src/components/lmsensors/configure.in,
  src/components/lmsensors/linux-lmsensors.c: Make the lmsensors
  component dynamically load the needed shared library. When attempting
  to build the current git repo of papi, the build of the files in the
  utils subdirectory failed because the lmsensors libraries were not
  being linked in. Rather than forcing papi to link in the lmsensor
  library during the build, the lmsensors component has been modified to
  dynamically load the needed libraries and enable the lmsensors events
  when available.
This allows machines without the lmsensor libraries installed to still
use papi.

2017-11-06 Asim YarKhan

* src/components/cuda/linux-cuda.c: CUDA component: On architectures
  without CUDA Metrics (e.g. Tesla C2050), skip metric registration
  rather than returning errors

2017-11-06 Vince Weaver

* src/validation_tests/papi_l2_dca.c,
  src/validation_tests/papi_l2_dcm.c,
  src/validation_tests/papi_l2_dcr.c,
  src/validation_tests/papi_l2_dcw.c: validation_tests: make the papi_l2
  tests fail with warnings. On Haswell/Broadwell and newer these tests
  fail for unknown reasons. This isn't new behavior; it's just that the
  tests are new. It's unlikely we will have time to completely sort this
  out before the upcoming release, so change the FAIL to WARN so testers
  won't be unnecessarily alarmed.

2017-11-05 Vince Weaver

* src/components/perf_event/perf_event.c, src/configure,
  src/configure.in: perf_event: enable rdpmc support by default. It can
  still be disabled at configure time with --enable-perfevent-rdpmc=no.
  This speeds up PAPI_read() by at least a factor of 5x (see the ESPT'17
  workshop presentation). It is only enabled on Linux 4.13 and newer due
  to bugs in previous versions.

2017-11-03 Vince Weaver

* src/ctests/sdsc-mpx.c: ctests: sdsc: fix issue where the error message
  is not printed correctly

2017-11-01 Heike Jagode

* src/components/powercap/linux-powercap.c: Intermediate check-in: Fixed
  a whole bunch of careless file handling (missing closing of open files,
  missing setting of open/close flag, etc). Still more rigorous checks
  needed.

Mon Oct 30 17:16:32 2017 -0700 Stephane Eranian

* src/libpfm4/lib/events/intel_skl_events.h: Update libpfm4. Current with
  commit 21405fb3c247a0d16861483daf0696cf4fa0cc43: update SW_PREFETCH
  event for Intel Skylake. Event was renamed SW_PREFETCH_ACCESS, but we
  keep SW_PREFETCH as an alias. Added PREFETCHW umask. Enabled support
  for both Skylake client and server as per the official event table from
  10/27/2017.
See download.01.org/perfmon/

2017-10-30 Vince Weaver

* src/validation_tests/Makefile.recipies, src/validation_tests/cycles.c,
  src/validation_tests/cycles_validation.c: validation_tests: add
  cycles_validation test. This is the old zero test, which does a number
  of cycles tests. It should be extended to add more.

2017-10-30 Vince Weaver

* src/ctests/attach2.c, src/ctests/attach3.c, src/ctests/calibrate.c,
  src/ctests/child_overflow.c, src/ctests/code2name.c,
  src/ctests/earprofile.c, src/ctests/exec_overflow.c,
  src/ctests/fork_overflow.c, src/ctests/hwinfo.c,
  src/ctests/mendes-alt.c, src/ctests/prof_utils.c,
  src/ctests/prof_utils.h, src/ctests/profile.c,
  src/ctests/remove_events.c, src/ctests/shlib.c,
  src/ctests/system_child_overflow.c, src/ctests/system_overflow.c,
  src/ctests/zero_named.c, src/testlib/papi_test.h,
  src/testlib/test_utils.c: papi: c++11 fixes: fix various ctests that
  c++ complains on; mostly just const warnings, some K&R function
  declarations, and possibly an actual char/char* bug.

* src/papi.c, src/papi.h: papi: c++11 conversion:
  PAPI_get_component_index()

* src/papi.c, src/papi.h: papi: c++11 conversion: convert PAPI_perror()

* src/aix.c, src/components/appio/appio.c,
  src/components/bgpm/CNKunit/linux-CNKunit.c,
  src/components/bgpm/IOunit/linux-IOunit.c,
  src/components/bgpm/L2unit/linux-L2unit.c,
  src/components/bgpm/NWunit/linux-NWunit.c,
  src/components/emon/linux-emon.c, src/components/net/linux-net.c,
  src/components/perf_event/pe_libpfm4_events.c,
  src/components/perf_event/pe_libpfm4_events.h,
  src/components/perf_event/perf_event.c,
  .../perf_event_uncore/perf_event_uncore.c,
  src/components/perfmon_ia64/perfmon-ia64.c, src/freebsd.c,
  src/linux-bgq.c, src/papi.c, src/papi.h, src/papi_internal.c,
  src/papi_internal.h, src/papi_libpfm3_events.c,
  src/papi_libpfm_events.h, src/papi_vector.c, src/papi_vector.h: papi:
  start converting papi.h to be C++11 clean. Most of the issues have to
  do with string to char * conversion.
This first patch converts PAPI_event_name_to_code() The issue was first reported by Brian Van Straalen * src/validation_tests/papi_l2_dca.c: validation_tests/papi_l2_dca: update some comments * src/ctests/zero.c, src/validation_tests/cycles.c: ctests/zero: make test pass on recent intel machines The test was failing due to the PAPI_get_real_cycles() validation on recent Intel chips. This is probably something that should be tested in a separate test and not in zero which is supposed to be a bare-bones are-things-working test. 2017-10-27 Philip Vaccaro * src/components/powercap/README: updated powercap README to be more concise. includes more details on interacting with energy counters and power limits. 2017-10-27 Asim YarKhan * src/components/cuda/linux-cuda.c, src/components/nvml/linux-nvml.c: CUDA/NVML components: Handled segfault which can occur when dlclosing libcudart from both components by adding an additional flag to dlopen 2017-10-24 Asim YarKhan * src/components/cuda/linux-cuda.c, src/components/cuda/tests/simpleMultiGPU.cu: CUDA component: Clean up fulltest by moving some output from stdout to SUBDBG, removed some commented out lines * src/components/nvml/linux-nvml.c: nvml component: To support V100 (Volta) updated to get nvmlDevice handle ordered by index rather than pci busid. 2017-10-23 Asim YarKhan * src/components/cuda/linux-cuda.c: CUDA component: Minor fix to remove some unneeded stdout which shows up during fulltest 2017-10-20 Asim YarKhan * src/components/cuda/linux-cuda.c, src/components/cuda/tests/Makefile, src/components/cuda/tests/simpleMultiGPU.cu: CUDA component test update: Remove some debug output. Do not build cupti_only test binary. Thu Oct 19 11:23:44 2017 -0700 Stephane Eranian * src/libpfm4/examples/showevtinfo.c, src/libpfm4/lib/events/intel_skl_events.h: Update libpfm4\n\nCurrent with\n commit 2e98642dd331b15382256caa380834d01b63bef8 Fix Intel Skylake EXE_ACTIVITY.1_PORTS_UTIL event Was missing a umask name. 
2017-10-17  Vince Weaver

	* src/ctests/version.c: ctests: version: add INCREMENT field at the request of Steve Kaufmann

	* src/ctests/Makefile.recipies, src/ctests/version.c: ctests: re-enable version test. Not sure why it was disabled.

	* src/ctests/Makefile.recipies: ctests: alphabetize SERIAL tests in Makefile.recipies

2017-10-13  Philip Vaccaro

	* src/components/powercap/tests/Makefile, src/components/powercap/tests/powercap_limit.c: added simple limit test for the powercap component.

2017-10-09  Asim YarKhan

	* src/components/nvml/linux-nvml.c: Bug fix in NVML component: Fix problem with names when there are multiple identical GPUs. If multiple identical GPUs were available, the names were not mapped correctly. Fixed event names to be "nvml:::Tesla_K40c:device_0:myevent" rather than "nvml:::Tesla_K40c_0:myevent".

Fri Sep 29 00:25:09 2017 -0700  Stephane Eranian

	* src/libpfm4/include/perfmon/perf_event.h, src/libpfm4/lib/events/intel_skl_events.h, src/libpfm4/lib/events/s390x_cpumf_events.h, src/libpfm4/lib/pfmlib_s390x_cpumf.c, src/libpfm4/perf_examples/Makefile, src/libpfm4/perf_examples/branch_smpl.c, src/libpfm4/perf_examples/perf_util.c: Update libpfm4. Current with commit d1e7c96df60a00a371fdaa3b635ad4a38cee4c2f: add new branch_smpl.c perf_events example. This patch adds a new example to demo how to sample and parse the PERF_SAMPLE_BRANCH_STACK record format of perf_events. It will dump branches taken from the sampled command.

2017-10-05  Asim YarKhan

	* src/components/nvml/README, src/components/nvml/linux-nvml.c, src/components/nvml/linux-nvml.h, src/components/nvml/tests/HelloWorld.cu, src/components/nvml/tests/Makefile, .../nvml/tests/nvml_power_limiting_test.cu: Update NVML component: Support for power limiting using NVML. PAPI has added support for power limiting using NVML (on supported devices from the Kepler family or later). The executable needs to have root permissions to change the power limits on the device. We have added new events to the NVML component to support power management limits. The nvml:::DEVICE:power_management_limit event can be written (as well as read), but requires higher permissions (root level). The limit is constrained between a min and a max value, which can be read. When the component is unloaded, the power_management_limit should be reset to the initial value. The new events are:
	  nvml:::DEVICE:power_management_limit
	  nvml:::DEVICE:power_management_limit_constraint_min
	  nvml:::DEVICE:power_management_limit_constraint_max
	A new test (nvml/tests/nvml_power_limiting_test.cu) was written to check if the writing functionality works (with the proper hardware and permissions).

2017-10-04  Asim YarKhan

	* src/components/nvml/linux-nvml.c, src/components/nvml/linux-nvml.h, src/components/nvml/tests/HelloWorld.cu: Style consistency and refactoring via the astyle command. No changes to the actual code were made here.

2017-10-04  Vince Weaver

	* src/components/rapl/linux-rapl.c: rapl: add support for some Intel Atom models: Goldmont / Gemini_Lake / Denverton

	* src/components/rapl/linux-rapl.c: rapl: fix Skylake SoC measurement support

	* src/components/rapl/linux-rapl.c: rapl: add support for Skylake SoC energy measurements

	* src/components/rapl/linux-rapl.c: rapl: add Skylake-X / Kabylake support

	* src/components/rapl/linux-rapl.c: rapl: centralize the "different DRAM units" code

	* src/components/rapl/linux-rapl.c: rapl: merge like processors

	* src/components/rapl/linux-rapl.c: rapl: convert chip detection to a switch statement

	* src/components/rapl/linux-rapl.c: rapl: update the whitespace a bit

2017-09-12  Heike Jagode (jagode@icl.utk.edu)

	* .../infiniband_umad/linux-infiniband_umad.c, .../infiniband_umad/linux-infiniband_umad.h: Fixed papi_vector for infiniband_umad component. The array of function pointers that the component defines must use the naming convention papi_vector_t _x_vector, where x is the name of the component directory. In this case, the name of the component directory is infiniband_umad and not infiniband. This change has not been tested yet due to OFED lib issues on our local machines. There may be more changes required in order to get the infiniband_umad component to work properly.

2017-09-11  Hanumanth

	* man/man1/papi_avail.1, man/man1/papi_native_avail.1, src/utils/papi_avail.c, src/utils/papi_native_avail.c: Updating man and help pages for papi_avail and papi_native_avail

2017-09-07  Asim YarKhan

	* src/components/cuda/tests/nvlink_bandwidth.cu, .../cuda/tests/nvlink_bandwidth_cupti_only.cu: Update to CUDA component to support NVLink. The CUDA component has been cleaned up and updated to support NVLink. NVLink metrics cannot be measured properly in KERNEL event collection mode, so the CUPTI EventCollectionMode is transparently set to CUPTI_EVENT_COLLECTION_MODE_CONTINUOUS when an NVLink metric is being measured in an eventset. For all other events and metrics, the CUDA component uses the KERNEL event collection mode. A bug in the earlier version was that repeated calls to add CUDA events were failing because some structures were not cleaned up. This should now be fixed. A new nvlink test was added to the CUDA component tests.
2017-08-31  Phil Mucci

	* man/man1/papi_avail.1, man/man1/papi_clockres.1, man/man1/papi_command_line.1, man/man1/papi_component_avail.1, man/man1/papi_cost.1, man/man1/papi_decode.1, man/man1/papi_error_codes.1, man/man1/papi_event_chooser.1, man/man1/papi_hybrid_native_avail.1, man/man1/papi_mem_info.1, man/man1/papi_multiplex_cost.1, man/man1/papi_native_avail.1, man/man1/papi_version.1, man/man1/papi_xml_event_info.1, man/man3/PAPI_cleanup_eventset.3, man/man3/PAPI_destroy_eventset.3: Updating options for papi_avail/native_avail as well as all references to the old mailing list

2017-08-31  Asim YarKhan

	* src/components/nvml/linux-nvml.c, src/components/nvml/tests/HelloWorld.cu, src/components/nvml/tests/Makefile: Minor updates to NVML component to enable it to compile and run without complaints

2017-08-30  Vince Weaver

	* src/validation_tests/papi_br_prc.c, src/validation_tests/papi_br_tkn.c: validation: update papi_br_prc and papi_br_tkn for AMD fam15h. AMD fam15h doesn't have a conditional branch event, so the measurements have to be against total branches. For now print a warning; maybe we should let it go without a warning.

	* src/papi_events.csv: papi_events: add PAPI_BR_PRC event to AMD fam15h

	* src/papi_events.csv: papi_events: update PAPI_BR_PRC and PAPI_BR_TKN on Sandybridge/Ivybridge. They were using TOTAL branches for the derived branch events rather than CONDITIONAL like the other modern x86 processors were using.

	* src/validation_tests/papi_br_tkn.c: validation_tests: papi_br_tkn: update to only count conditional branches

	* src/validation_tests/papi_br_prc.c: validation_tests: papi_br_prc: make sure it is comparing conditional branches. It was doing total branches, which made the test fail on Skylake.

Mon Aug 21 23:55:46 2017 -0700  Stephane Eranian

	* src/libpfm4/lib/pfmlib_intel_x86.c: Update libpfm4. Current with commit a290dead7c1f351f8269a265c0d4a5f38a60ba29: fix usage of is_model_event() for Intel X86. This patch fixes a couple of problems introduced by commit 77a5ac9d43b1 ("add model field to intel_x86_entry_t"). The code in pfm_intel_x86_get_event_first() was incorrect. It was calling is_model_event() before checking if the index was within bounds. It should have been the opposite. Same issue in pfm_intel_x86_get_next_event(). This could cause a SEGFAULT as reported by Phil Mucci. The patch also fixes the return value of pfm_intel_x86_get_event_first(). It was not calculated correctly. Reported-by: Phil Mucci

2017-08-20  Vince Weaver

	* src/ctests/Makefile.recipies, src/ctests/failed_events.c: ctests: add failed_events test. It tries to create invalid events to make sure the event parser properly handles invalid events.

2017-08-19  Vince Weaver

	* src/components/perf_event_uncore/tests/Makefile, .../perf_event_uncore/tests/perf_event_uncore.c, .../tests/perf_event_uncore_attach.c: perf_event_uncore: tests: update perf_event_uncore to use :cpu=0. This is the more common way of specifying uncore events. Rename the old test that uses PAPI_set_opt() to perf_event_uncore_attach

	* .../tests/perf_event_uncore_cbox.c, .../tests/perf_event_uncore_lib.c, .../tests/perf_event_uncore_lib.h: perf_event_uncore: tests: update uncore events for recent processors

	* src/ctests/zero_pthreads.c: ctests: zero_pthreads: remove extraneous printf when in quiet mode

	* .../tests/perf_event_uncore_lib.c: perf_event_uncore: event list: add recent processors. libpfm4 still doesn't support regular Haswell, Broadwell, or Skylake machines

	* .../perf_event_uncore/tests/perf_event_uncore.c, .../tests/perf_event_uncore_cbox.c, .../tests/perf_event_uncore_multiple.c: perf_event_uncore: tests: print a message indicating the problem on skip. Also some whitespace cleanups

	* src/components/perf_event/tests/event_name_lib.c: perf_event: tests: update event_name_lib for recent Intel processors

	* src/components/perf_event/tests/event_name_lib.c: perf_event: tests: event_name_lib: clean up whitespace

	* .../perf_event/tests/perf_event_offcore_response.c: perf_event: tests: update perf_event_offcore_response test. Print an indicator of why we are skipping the test. Also some gratuitous whitespace cleanups

	* src/ctests/zero_shmem.c: ctests: zero_shmem: document the code a little better

	* src/ctests/zero_smp.c: ctests: zero_smp: make it actually do something on Linux. Linux can use the pthread code just like AIX, although we don't validate the results, so this test could be another candidate for not being necessary anymore.

	* src/ctests/zero_shmem.c: ctests: zero_shmem: minor cleanups. We pretty much always skip this test. Is it needed anymore? What was it testing in the first place? The code it calls (start_pes()) doesn't seem to exist anymore

	* src/ctests/zero_omp.c, src/ctests/zero_pthreads.c: ctests: zero_omp and zero_pthreads were skipping due to a typo. When updating the code I had left a stray ! before PAPI_query_event()

2017-08-19  Vince Weaver

	* src/papi_events.csv: papi_events: the Skylake fixes broke hsw/bdw. This skylake-x change is way more trouble than it was worth.

2017-08-19  Vince Weaver

	* src/papi_events.csv: papi_events: on Skylake the SNP_FWD umask was renamed to SNP_HIT_WITH_FWD. This broke presets on Skylake, Skylake-X

	* src/components/perf_event/pe_libpfm4_events.c: perf_event: fix uninitialized descr issue reported by valgrind. I don't think this is the skylake-x bug though

2017-08-18  Vince Weaver

	* src/components/perf_event/pe_libpfm4_events.c: perf_event: clean up some whitespace in pe_libpfm4_events.c

	* src/linux-memory.c: linux-memory: fix various errors when compiling with debug enabled. The new proc memory code had some mistakes in the debug messages that only appeared when compiled with --with-debug. Reported-by: Steve Kaufmann

2017-08-17  Vince Weaver

	* src/papi_events.csv: papi_events: missed one of the skx event locations

2017-08-16  Vince Weaver

	* src/papi_events.csv: papi_events: enable Skylake X support

Sun Aug 6 00:22:52 2017 -0700  Stephane Eranian

	* src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/events/intel_skl_events.h, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_skl.c, src/libpfm4/lib/pfmlib_intel_snbep_unc.c, src/libpfm4/lib/pfmlib_intel_x86.c, src/libpfm4/lib/pfmlib_intel_x86_priv.h, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_x86.c: Update libpfm4. Current with commit efd16920194999fdf1146e9dab3f7435608a9479: add support for Intel Skylake X. This patch adds support for Intel Skylake X core PMU events. Based on download.01.org/perfmon/SKX/skylakex_core_v25.json. The new PMU is called skx.

2017-08-07  Vince Weaver

	* src/papi_events.csv: papi_events: add initial AMD fam17h support. Not tested on actual hardware yet

	* src/papi_events.csv: papi_events: fix the amd_fam16h PMU name. The way libpfm4 reports fam16h was modified a bit from my initial patches. fam16h seems to be working now.
Thu Jul 27 23:30:20 2017 -0700  Stephane Eranian

	* src/libpfm4/README, src/libpfm4/docs/Makefile, src/libpfm4/docs/man3/libpfm_amd64_fam16h.3, src/libpfm4/docs/man3/libpfm_amd64_fam17h.3, src/libpfm4/docs/man3/libpfm_intel_bdx_unc_cbo.3, src/libpfm4/docs/man3/libpfm_intel_bdx_unc_ha.3, src/libpfm4/docs/man3/libpfm_intel_bdx_unc_imc.3, src/libpfm4/docs/man3/libpfm_intel_bdx_unc_irp.3, src/libpfm4/docs/man3/libpfm_intel_bdx_unc_pcu.3, src/libpfm4/docs/man3/libpfm_intel_bdx_unc_qpi.3, .../docs/man3/libpfm_intel_bdx_unc_r2pcie.3, src/libpfm4/docs/man3/libpfm_intel_bdx_unc_r3qpi.3, src/libpfm4/docs/man3/libpfm_intel_bdx_unc_sbo.3, src/libpfm4/docs/man3/libpfm_intel_bdx_unc_ubo.3, src/libpfm4/examples/showevtinfo.c, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/events/amd64_events_fam16h.h, src/libpfm4/lib/events/amd64_events_fam17h.h, src/libpfm4/lib/events/intel_bdx_unc_cbo_events.h, src/libpfm4/lib/events/intel_bdx_unc_ha_events.h, src/libpfm4/lib/events/intel_bdx_unc_imc_events.h, src/libpfm4/lib/events/intel_bdx_unc_irp_events.h, src/libpfm4/lib/events/intel_bdx_unc_pcu_events.h, src/libpfm4/lib/events/intel_bdx_unc_qpi_events.h, .../lib/events/intel_bdx_unc_r2pcie_events.h, .../lib/events/intel_bdx_unc_r3qpi_events.h, src/libpfm4/lib/events/intel_bdx_unc_sbo_events.h, src/libpfm4/lib/events/intel_bdx_unc_ubo_events.h, src/libpfm4/lib/pfmlib_amd64.c, src/libpfm4/lib/pfmlib_amd64_fam16h.c, src/libpfm4/lib/pfmlib_amd64_fam17h.c, src/libpfm4/lib/pfmlib_amd64_priv.h, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_bdx_unc_cbo.c, src/libpfm4/lib/pfmlib_intel_bdx_unc_ha.c, src/libpfm4/lib/pfmlib_intel_bdx_unc_imc.c, src/libpfm4/lib/pfmlib_intel_bdx_unc_irp.c, src/libpfm4/lib/pfmlib_intel_bdx_unc_pcu.c, src/libpfm4/lib/pfmlib_intel_bdx_unc_qpi.c, src/libpfm4/lib/pfmlib_intel_bdx_unc_r2pcie.c, src/libpfm4/lib/pfmlib_intel_bdx_unc_r3qpi.c, src/libpfm4/lib/pfmlib_intel_bdx_unc_sbo.c, src/libpfm4/lib/pfmlib_intel_bdx_unc_ubo.c, src/libpfm4/lib/pfmlib_intel_snbep_unc.c, src/libpfm4/lib/pfmlib_intel_snbep_unc_priv.h, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/perf_examples/self_count.c, src/libpfm4/tests/validate_x86.c: Update libpfm4. Current with commit 72474c59d88512e49d9be7c4baa4355e8d8ad10a: fix typo in AMD Fam17h man page. PMU name was mistyped.

2017-08-04  Vince Weaver

	* src/validation_tests/papi_l1_dcm.c, src/validation_tests/papi_l2_dcm.c: validation_tests: for the DCM tests, up the allowed error to 5%. We don't want to fail too easily, and 5% seems reasonable. This lets the test pass on an ARM64 DragonBoard 410c.

	* src/linux-memory.c: linux-memory: add fallback generic Linux /sys cache size detection. This will allow getting cache sizes on architectures we don't have custom code for. Currently this mostly means ARM64.

	* src/validation_tests/papi_l1_dcm.c, src/validation_tests/papi_l2_dcm.c: validation_tests: don't crash if cache size is reported as zero

	* src/validation_tests/branches_testcode.c: branches_testcode: add arm64 support

2017-07-27  Vince Weaver

	* src/papi_events.csv, src/validation_tests/papi_l2_dca.c: validation_tests: trying to find out why PAPI_L2_DCA fails on Haswell. It's a mystery still. One alternative is to switch the event to be the same as PAPI_L1_DCM, but that seems like it would be cheating.

	* src/validation_tests/papi_l2_dcw.c: validation_tests: papi_l2_dcw: shorten a warning message

	* src/papi_events.csv: papi_events: note that libpfm4 Kaby Lake support is treated as part of Skylake

	* src/validation_tests/Makefile.recipies, src/validation_tests/papi_l2_dcw.c: validation_tests: add PAPI_L2_DCW test

	* src/validation_tests/Makefile.recipies, src/validation_tests/papi_l2_dcr.c: validation_tests: add PAPI_L2_DCR test

	* src/validation_tests/papi_l2_dcm.c: validation_tests: PAPI_L2_DCM: figured out a test that made sense

	* src/validation_tests/Makefile.recipies, src/validation_tests/papi_l1_dcm.c: validation_tests: add PAPI_L1_DCM test

	* src/validation_tests/Makefile.recipies, src/validation_tests/cache_testcode.c, src/validation_tests/papi_l2_dcm.c, src/validation_tests/testcode.h: validation_tests: first attempt at papi_l2_dcm test. Disabled for now, as it's really hard to make a workable cache miss test on modern hardware.

2017-07-26  Vince Weaver

	* src/ctests/Makefile, src/ctests/Makefile.recipies, src/ctests/child_overflow.c, src/ctests/exec_overflow.c, src/validation_tests/Makefile.recipies, src/validation_tests/busy_work.c, src/validation_tests/testcode.h: ctests: clean up the exec/child overflow tests. The exec_overflow test segfaults when using rdpmc. This is a bug in Linux. I'm working on getting it fixed.

2017-07-21  Vince Weaver

	* src/validation_tests/Makefile.recipies, src/validation_tests/cache_helper.c, src/validation_tests/cache_helper.h, src/validation_tests/cache_testcode.c, src/validation_tests/papi_l1_dca.c, src/validation_tests/papi_l2_dca.c, src/validation_tests/testcode.h: validation_tests: add PAPI_L2_DCA test. Also adds some generic cache testing infrastructure

	* src/validation_tests/papi_l1_dca.c: validation_tests: PAPI_L1_DCA fixes. Had to find a machine that actually supported the event. On AMD Fam15h the write count is 3x expected? Need to investigate further.

	* src/validation_tests/papi_br_prc.c: validation_tests: papi_br_prc: properly skip if event not found

	* src/validation_tests/Makefile.recipies, src/validation_tests/papi_l1_dca.c: validation_tests: add PAPI_L1_DCA test

2017-07-20  Vince Weaver

	* src/validation_tests/Makefile.recipies, src/validation_tests/papi_br_msp.c, src/validation_tests/papi_br_prc.c: validation_tests: add PAPI_BR_PRC test

	* src/validation_tests/Makefile.recipies, src/validation_tests/papi_br_tkn.c: validation_tests: add PAPI_BR_TKN test

	* src/validation_tests/Makefile.recipies, src/validation_tests/papi_br_ntk.c: validation_tests: add PAPI_BR_NTK test

2017-07-07  Vince Weaver

	* src/papi_events.csv: papi_events: move Haswell, Skylake, and Broadwell to traditional PAPI_REF_CYC. There's a slight chance this might break things for people; if so, we can revert it.

	* src/linux-timer.c: linux-timer: fix build warning on non-POWER build

	* src/ctests/flops.c, src/validation_tests/flops_testcode.c, src/validation_tests/papi_dp_ops.c, src/validation_tests/papi_fp_ops.c, src/validation_tests/papi_sp_ops.c: validation: make the flops tests handle that POWER has fused multiply-add. PAPI_DP_OPS and PAPI_SP_OPS still fail; need to audit what the event is doing

	* src/papi_events.csv: POWER8: add a few branch preset events. They pass the validation tests; not sure why they weren't enabled originally

	* src/validation_tests/branches_testcode.c: validation: add POWER branches testcode. Not sure I got the clobbers right

	* src/components/perf_event/perf_helpers.h, src/validation_tests/papi_tot_ins.c: POWER: fix some compiler warnings

2016-10-18  Phil Mucci

	* src/linux-timer.c: Ensure stdint gets included for all Linuxen.

	* src/linux-timer.c: Some Linuxen need stdint to get the uint64_t type.

2016-10-14  Phil Mucci

	* src/linux-lock.h: Restructured unlock code to avoid warnings. Tested against 80 threads on Power8

2016-10-12  Phil Mucci

	* src/linux-timer.c: PPC64/PPC fast timer fixup.
2017-07-07  Vince Weaver

	* src/linux-timer.c: linux-timer: allow using fast timer for get_real_cycles() on POWER

2016-07-12  Phil Mucci

	* src/linux-timer.c, src/linux-timer.h: First pass at good rdtsc for Power7/8

2017-07-03  Vince Weaver

	* src/ctests/flops.c, src/ctests/hl_rates.c, src/validation_tests/Makefile.recipies, src/validation_tests/flops.c, src/validation_tests/flops_testcode.c, src/validation_tests/flops_validation.c, src/validation_tests/papi_dp_ops.c, src/validation_tests/papi_fp_ops.c, src/validation_tests/papi_sp_ops.c, src/validation_tests/testcode.h: validation_tests: add tests for PAPI_SP_OPS and PAPI_DP_OPS. Extend the flops_testcode as well, to have both float and double versions.

	* src/validation_tests/papi_ref_cyc.c: validation_tests: papi_ref_cyc: update test to work on older systems. It's actually the newer systems (Haswell/Broadwell/Skylake) that are using a different event than the older systems. Make the test check for the old behavior.

2017-07-02  Vince Weaver

	* src/ctests/Makefile.recipies, src/ctests/cycle_ratio.c, src/validation_tests/Makefile.recipies, src/validation_tests/flops_testcode.c, src/validation_tests/papi_ref_cyc.c, src/validation_tests/testcode.h: validation_tests: move cycle_ratio test to be the papi_ref_cyc test

	* src/ctests/cycle_ratio.c: ctests: rewrite cycle_ratio test. On Intel platforms PAPI_REF_CYC is a fixed 100MHz cycle count. The test was making the assumption that PAPI_REF_CYC was equal to the max design frequency (not turboboost), and thus as far as I can tell it never would return the right answer. This test should probably be moved to validation_tests.
2017-07-01 Vince Weaver * src/ctests/Makefile.recipies, src/ctests/branches.c, src/ctests /sdsc-mpx.c, src/ctests/sdsc2.c: ctests: migrate all other users of dummy3() workload * src/ctests/Makefile.recipies, src/ctests/sdsc4-mpx.c, src/validation_tests/flops_testcode.c, src/validation_tests/testcode.h: ctests: move the "dummy3" workload to the common workload library * src/ctests/sdsc4-mpx.c: ctests: sdsc4-mpx: fix failing on recent Intel machines the multiplexing of an event with small results (PAPI_SR_INS in this case) has high variance, so don't use it for validation. There was code trying to do this but it wasn't working. 2017-06-30 Vince Weaver * src/ctests/first.c, src/ctests/matrix-hl.c, src/ctests/zero_omp.c, src/ctests/zero_pthreads.c: ctests: catch lack of CPU component earlier gets rid of extreaneous SKIPPED in the output of run_tests.sh * src/components/cuda/tests/HelloWorld.cu, src/components/cuda/tests/Makefile: tests:cuda: make the HelloWorld test more like a standard PAPI test * src/validation_tests/Makefile.recipies: validation_tests: fix linking against a CUDA enabled PAPI Fix suggested by Steve Kaufmann * src/testlib/papi_test.h, src/testlib/test_utils.c: testlib: make it so it can compile with c++ this lets us link against it from the CUDA tests * src/components/cuda/sampling/gpu_activity.c: tests: cuda: fix sampling/gpu_activity to compile without warnings * src/Makefile.inc: tests: make the component tests build command be the same as ctests/ftests * src/ctests/calibrate.c: ctests: calibrate: turn off printf if TEST_QUIET missed this one when testing because test machine skipped it due to lack of floating point events 2017-06-29 Vince Weaver * .../tests/perf_event_amd_northbridge.c, src/ctests/Makefile.recipies, src/ctests/cycle_ratio.c, src/ctests/derived.c, src/ctests/multiplex1_pthreads.c, src/ctests/multiplex3_pthreads.c, src/ctests/overflow.c, src/ctests/overflow_allcounters.c, src/ctests/overflow_index.c, src/ctests/overflow_pthreads.c, 
src/ctests/overflow_twoevents.c, src/ctests/prof_utils.c, src/ctests/prof_utils.h, src/ctests/profile.c, src/ctests/profile_twoevents.c, src/ctests/realtime.c, src/ctests/reset.c, src/ctests/reset_multiplex.c, src/ctests/sdsc-mpx.c, src/ctests/sdsc.c, src/ctests/sdsc4-mpx.c, src/ctests/sdsc4.c, src/ctests/shlib.c, src/ctests/tenth.c, src/ctests/thrspecific.c, src/testlib/papi_test.h: testlib: remove the hack where all printf's are #defined to something else Explicitly check everywhere for TESTS_QUIET or equivelent, rather than using c-pre- processor macros to redefine printf * src/papi.c, src/testlib/test_utils.c: tests: set the ctest debug mode to VERBOSE by default for tests the TESTS_QUIET mode was turning *off* verbose debugging, which meant that PAPIERROR() calls wouldn't show up during a ./run_tests.sh * src/components/perf_event/perf_event.c: perf_event: properly initialize the mmap_addr structure It wasn't always being set to NULL, and so on some tests the code would try to munmap() it even though it wasn't mapped. * src/testlib/test_utils.c: tests: enable color in test status messages this has been an optional feature for a long time, if you enabled the environment variable TESTS_COLOR=y this change makes it default to being on (you can disable with export TESTS_COLOR=n also it should automatically detect if you are piping to a file and disable colors in the case too * src/validation_tests/Makefile, src/validation_tests/Makefile.recipies: validation_tests: always include -lrt on the tests Should be harmless, and I don't always test on an old enough machine to trigger the problem. 
* src/ctests/forkexec.c, src/ctests/forkexec2.c, src/ctests/forkexec3.c, src/ctests/forkexec4.c, src/ctests/multiplex3_pthreads.c, src/ctests/system_child_overflow.c: ctests: make the fork/exec tests only print "PASSED" once this makes the run_test.sh input look a lot nicer * src/run_tests.sh, src/testlib/test_utils.c: tests: make the output from run_tests.sh more compact 2017-06-28 Vince Weaver * .../perf_event/tests/perf_event_system_wide.c: perf_event: tests, make perf_event_system_wide use INS rather than CYC cycles varied too much, making the validation fail * src/validation_tests/Makefile.recipies, src/validation_tests/papi_br_cn.c, src/validation_tests/papi_br_ucn.c: validation_tests: add tests for PAPI_BR_CN and PAPI_BR_UCN * src/validation_tests/flops.c: validation_tests: flops: wasn't falling back properly if no FLOPS event * src/utils/Makefile, src/validation_tests/Makefile.recipies: tests: clean up the Makefiles * src/utils/print_header.c: utils: print_header: print the operating system version in the header * .../tests/perf_event_amd_northbridge.c: perf_event_uncore: the perf_event_amd_northbridge test wasn't working it maybe never worked at all? It was hardcoded to thinking it was running on a 3.9 kernel always. * src/ctests/Makefile, src/ctests/Makefile.recipies, src/ctests/zero.c: ctests: zero: complete transition from FLOPS to INS as metric this will make it more likely to be runnable on modern machines. * src/ctests/vector.c, src/validation_tests/vector_testcode.c: validation_tests: move the unused vector.c code maybe we should remove it. It was never built as far as I can tell. * src/validation_tests/Makefile.recipies, src/validation_tests/flops.c: validation_tests: add a generic flops test based on hl_rates we do a lot of testing of the high-level interface but not as much of the regular PAPI interface. 
* src/ctests/Makefile.recipies, src/ctests/hl_rates.c, src/validation_tests/flops_testcode.c, src/validation_tests/testcode.h: ctests: hl_rates: clean up and fix extraneous error message the error message was due to the way TESTS_QUIET is passed as a command line argument. also made it use the same matrix-multiply code that the flops test uses. also added some validation to the results. * src/ctests/all_events.c: ctests: all_events: issue warning if preset cannot be created specifically this came up on an AMD fam15h system where the PAPI_L1_ICH event cannot be created due to Linux stealing a counter for the NMI watchdog * src/validation_tests/papi_hw_int.c: validation_tests: papi_hw_int explicitly mark large constant as ULL compiler was warning on 32-bit machine * src/validation_tests/papi_ld_ins.c, src/validation_tests/papi_sr_ins.c, src/validation_tests/papi_tot_cyc.c: validation_tests: a few tests had the !quiet check inverted * src/validation_tests/papi_hw_int.c: validation_tests: fix papi_hw_int looping forever somehow the loop exit line got lost * src/validation_tests/Makefile.recipies, src/validation_tests/matrix_multiply.c, src/validation_tests/matrix_multiply.h, src/validation_tests/papi_ld_ins.c, src/validation_tests/papi_sr_ins.c: validation_tests: add PAPI_SR_INS test * src/validation_tests/Makefile.recipies, src/validation_tests/matrix_multiply.c, src/validation_tests/matrix_multiply.h, src/validation_tests/papi_hw_int.c, src/validation_tests/papi_ld_ins.c: validation_tests: add PAPI_LD_INS test * src/run_tests.sh, src/validation_tests/Makefile.recipies, src/validation_tests/papi_hw_int.c: validation_tests: add PAPI_HW_INT test 2017-06-27 Vince Weaver * src/run_tests_exclude.txt: run_tests_exclude: add attach_target not really a test so we shouldn't run it * src/ctests/byte_profile.c, src/ctests/earprofile.c, src/ctests/prof_utils.c, src/ctests/prof_utils.h: ctests/prof_utils: remove prof_init() helper It didn't do much more than a papi_init, probably 
better to have each file do that in the open. * src/ctests/inherit.c, src/ctests/ipc.c, src/ctests/johnmay2.c, src/ctests/krentel_pthreads.c, src/ctests/kufrin.c, src/ctests/low- level.c, src/ctests/mendes-alt.c, src/ctests/multiplex1.c, src/ctests/multiplex1_pthreads.c, src/ctests/multiplex2.c, src/ctests/multiplex3_pthreads.c, src/ctests/overflow.c, src/ctests/overflow2.c, src/ctests/overflow3_pthreads.c, src/ctests/overflow_allcounters.c, src/ctests/overflow_index.c, src/ctests/overflow_one_and_read.c, src/ctests/overflow_single_event.c, src/ctests/overflow_twoevents.c, src/ctests/prof_utils.c, src/ctests/profile.c, src/ctests/profile_pthreads.c, src/ctests/profile_twoevents.c, src/ctests/remove_events.c, src/ctests/sprofile.c, src/ctests/zero.c, src/ctests/zero_flip.c, src/ctests/zero_named.c, src/testlib/test_utils.c: ctests: skip rather than fail if no events available 2017-06-26 Vince Weaver * src/ctests/first.c, src/ctests/mpifirst.c, src/ctests/multiattach.c, src/ctests/multiattach2.c, src/testlib/test_utils.c: testlib: fix add_two_events() was not setting some values, causing many tests to fail * src/ctests/attach2.c, src/ctests/system_overflow.c: ctests: compiler warning caught two lack-of-braces mistakes * src/ctests/byte_profile.c, src/ctests/code2name.c, src/ctests/describe.c, src/testlib/test_utils.c: tests: more changes to skip instead of fail if no events available * src/ctests/Makefile.recipies, src/ctests/child_overflow.c, src/ctests/exec_overflow.c, src/ctests/fork_exec_overflow.c, src/ctests/fork_overflow.c, src/ctests/system_child_overflow.c, src/ctests/system_overflow.c: ctests: break up the for_exec_overflow test it was really four benchmarks with some ifdefs the proper way to do that would be to have a common C file and link against it for the shared routines, rather than using the pre-processor * src/ctests/attach2.c, src/ctests/attach3.c, src/ctests/attach_cpu.c: ctests: have attach tests cleanly skip if no events available * 
src/testlib/test_utils.c: testlib: update add_two_events to skip() if no events found * src/ctests/mendes-alt.c, src/ctests/multiplex2.c, src/ctests/multiplex3_pthreads.c, src/ctests/sdsc.c, src/ctests/sdsc2.c, src/ctests/sdsc4.c, src/testlib/papi_test.h, src/testlib/test_utils.c: testutils: remove init_multiplex() test helper the only benefit it had over calling PAPI_multiplex_init() was a domain workaround for perfctr+power6 systems. Ideally not many of those systems are around anymore, and in any case a proper fix would have the perfctr component handle that, not the testing library. * .../perf_event/tests/perf_event_system_wide.c, .../perf_event/tests/perf_event_user_kernel.c, src/ctests/api.c, src/ctests/byte_profile.c, src/ctests/high-level.c, src/ctests/hl_rates.c, src/validation_tests/papi_br_ins.c, src/validation_tests/papi_br_msp.c, src/validation_tests/papi_tot_cyc.c, src/validation_tests/papi_tot_ins.c: tests: try to "skip" rather than "fail" if no events available * src/ctests/derived.c: ctests: derived: fix warning found on older gcc * src/ctests/high-level2.c: ctests: clean up high-level2 test skip on machine without flops/flips event * src/components/Makefile_comp_tests.target.in: components test: fix another build issue be sure to use local copy of papi.h * src/components/Makefile_comp_tests.target.in: component tests: fix build issue was trying to use the system version of libpapi.a instead of local version * src/components/appio/tests/Makefile, src/components/appio/tests/appio_list_events.c, src/components/appio/tests/appio_values_by_code.c, src/components/coretemp/tests/Makefile, src/components/example/tests/Makefile, src/components/host_micpower/tests/Makefile, src/components/infiniband/tests/Makefile, .../infiniband/tests/infiniband_values_by_code.c, src/components/infiniband_umad/tests/Makefile, .../tests/infiniband_umad_values_by_code.c, src/components/lustre/tests/Makefile, src/components/micpower/tests/Makefile, 
src/components/mx/tests/Makefile, src/components/net/tests/Makefile, src/components/perf_event/tests/Makefile, src/components/perf_event_uncore/tests/Makefile, src/components/powercap/tests/Makefile, src/components/rapl/tests/Makefile, src/components/stealtime/tests/Makefile: components: update component test Makefiles to include Makefile_comp_test.target * src/components/Makefile_comp_tests.target.in: components: update Makefile_comp_test.target.in should now be usable by the components without many Makefile changes * src/components/perf_event/tests/Makefile, src/components/perf_event/tests/nmi_watchdog.c, src/ctests/Makefile.recipies, src/ctests/nmi_watchdog.c: ctests: nmi_watchdog is a perf_event specific test, move it there * src/components/Makefile_comp_tests.target.in, src/components/README, src/components/perf_event/tests/Makefile: components: update the autoconfigure to generate more useful Makefile.target.in although I don't think most components are using it at all 2017-06-26 Asim YarKhan * src/components/cuda/Makefile.cuda.in, src/components/cuda/README, src/components/cuda/Rules.cuda, src/components/cuda/configure, src/components/cuda/configure.in, src/components/cuda/linux-cuda.c, src/components/cuda/sampling/Makefile, src/components/cuda/tests/HelloWorld.cu, src/components/cuda/tests/Makefile, src/components/cuda/tests/simpleMultiGPU.cu: CUDA component update: Support for CUPTI metrics (early release) This commit adds support for CUPTI metrics, which are higher level measures that may be decomposed into multiple lower level CUPTI events. Known problems and limitations in early release of metric support * Only sets of metrics and events that can be gathered in a single pass are supported. Transparent multi-pass support is expected * All metrics are returned as long long integers, which means that CUPTI double precision values will be truncated, possibly severely. * The NVLink metrics have been disabled for this alpha release. 
2017-06-23 Vince Weaver * src/validation_tests/papi_fp_ops.c: validation: papi_fp_ops, skip (not fail) if PAPI_FP_OPS unavailable * src/ctests/Makefile, src/ctests/Makefile.recipies, src/ctests/Makefile.target.in, src/ctests/flops.c: ctests: flops, update to use some of the validate_tests infrastructure * src/validation_tests/Makefile.recipies, src/validation_tests/flops_testcode.c, src/validation_tests/papi_fp_ops.c, src/validation_tests/testcode.h: validation_tests: add papi_fp_ops test tested on an AMD fam15h machine * src/components/powercap/tests/powercap_basic.c: powercap: fix compiler warnings in the powercap_basic test * src/ctests/flops.c: ctests: update flops test * src/ctests/api.c: ctests: update api test only seems to test the high-level API * src/ctests/all_native_events.c: ctests: update all_native_events removed some ancient warnings about uncore/offcore events. Should not be a problem on libpfm4/perf_event * src/ctests/all_events.c: ctests: clean up all_events test * src/components/appio/tests/appio_list_events.c, src/components/appio/tests/appio_test_blocking.c, .../appio/tests/appio_test_fread_fwrite.c, src/components/appio/tests/appio_test_pthreads.c, src/components/appio/tests/appio_test_read_write.c, src/components/appio/tests/appio_test_recv.c, src/components/appio/tests/appio_test_seek.c, src/components/appio/tests/appio_test_select.c, src/components/appio/tests/appio_test_socket.c, src/components/appio/tests/appio_values_by_code.c, src/components/appio/tests/appio_values_by_name.c, src/components/coretemp/tests/coretemp_basic.c, src/components/coretemp/tests/coretemp_pretty.c, src/components/example/tests/example_basic.c, .../example/tests/example_multiple_components.c, .../host_micpower/tests/host_micpower_basic.c, .../infiniband/tests/infiniband_list_events.c, .../infiniband/tests/infiniband_values_by_code.c, .../tests/infiniband_umad_list_events.c, src/components/libmsr/tests/libmsr_basic.c, src/components/lustre/tests/lustre_basic.c, 
src/components/micpower/tests/micpower_basic.c, src/components/mx/tests/mx_basic.c, src/components/mx/tests/mx_elapsed.c, src/components/net/tests/net_list_events.c, src/components/net/tests/net_values_by_code.c, src/components/net/tests/net_values_by_name.c, .../perf_event/tests/perf_event_offcore_response.c, .../perf_event/tests/perf_event_system_wide.c, .../perf_event/tests/perf_event_user_kernel.c, .../tests/perf_event_amd_northbridge.c, .../perf_event_uncore/tests/perf_event_uncore.c, .../tests/perf_event_uncore_cbox.c, .../tests/perf_event_uncore_multiple.c, src/components/powercap/tests/powercap_basic.c, src/components/rapl/tests/rapl_basic.c, src/components/rapl/tests/rapl_overflow.c, src/components/stealtime/tests/stealtime_basic.c, src/components/vmware/tests/vmware_basic.c, src/ctests/all_events.c, src/ctests/all_native_events.c, src/ctests/api.c, src/ctests/attach2.c, src/ctests/attach3.c, src/ctests/attach_cpu.c, src/ctests/branches.c, src/ctests/byte_profile.c, src/ctests/calibrate.c, src/ctests/case1.c, src/ctests/case2.c, src/ctests/clockres_pthreads.c, src/ctests/cmpinfo.c, src/ctests/code2name.c, src/ctests/cycle_ratio.c, src/ctests/data_range.c, src/ctests/derived.c, src/ctests/describe.c, src/ctests/disable_component.c, src/ctests/dmem_info.c, src/ctests/earprofile.c, src/ctests/eventname.c, src/ctests/exec.c, src/ctests/exec2.c, src/ctests/exeinfo.c, src/ctests/first.c, src/ctests/flops.c, src/ctests/fork.c, src/ctests/fork2.c, src/ctests/fork_exec_overflow.c, src/ctests/forkexec.c, src/ctests/forkexec2.c, src/ctests/forkexec3.c, src/ctests/forkexec4.c, src/ctests/get_event_component.c, src/ctests/high-level.c, src/ctests/high-level2.c, src/ctests/hl_rates.c, src/ctests/hwinfo.c, src/ctests/inherit.c, src/ctests/ipc.c, src/ctests/johnmay2.c, src/ctests/krentel_pthreads.c, src/ctests/kufrin.c, src/ctests/locks_pthreads.c, src/ctests/low-level.c, src/ctests/matrix-hl.c, src/ctests/max_multiplex.c, src/ctests/memory.c, src/ctests/mendes-alt.c, 
src/ctests/multiattach.c, src/ctests/multiattach2.c, src/ctests/multiplex1.c, src/ctests/multiplex1_pthreads.c, src/ctests/multiplex2.c, src/ctests/multiplex3_pthreads.c, src/ctests/nmi_watchdog.c, src/ctests/omptough.c, src/ctests/overflow.c, src/ctests/overflow2.c, src/ctests/overflow3_pthreads.c, src/ctests/overflow_allcounters.c, src/ctests/overflow_force_software.c, src/ctests/overflow_index.c, src/ctests/overflow_one_and_read.c, src/ctests/overflow_pthreads.c, src/ctests/overflow_single_event.c, src/ctests/overflow_twoevents.c, src/ctests/p4_lst_ins.c, src/ctests/profile.c, src/ctests/profile_pthreads.c, src/ctests/profile_twoevents.c, src/ctests/pthrtough.c, src/ctests/pthrtough2.c, src/ctests/realtime.c, src/ctests/remove_events.c, src/ctests/reset.c, src/ctests/reset_multiplex.c, src/ctests/sdsc.c, src/ctests/sdsc2.c, src/ctests/sdsc4.c, src/ctests/second.c, src/ctests/shlib.c, src/ctests/sprofile.c, src/ctests/tenth.c, src/ctests/thrspecific.c, src/ctests/timer_overflow.c, src/ctests/virttime.c, src/ctests/zero.c, src/ctests/zero_attach.c, src/ctests/zero_flip.c, src/ctests/zero_fork.c, src/ctests/zero_named.c, src/ctests/zero_omp.c, src/ctests/zero_pthreads.c, src/ctests/zero_smp.c, src/testlib/papi_test.h, src/testlib/test_utils.c, src/validation_tests/papi_br_ins.c, src/validation_tests/papi_br_msp.c, src/validation_tests/papi_tot_cyc.c, src/validation_tests/papi_tot_ins.c: testlib: remove the "free variables" option from test_pass() It was only used by a small handful of tests, and wasn't really strictly necessary anyway. test_pass() should pass the test and that's all. * src/ctests/zero.c: ctests: zero: start cleaning up this test * src/validation_tests/Makefile.recipies: validation_tests: clock_gettime() requires -lrt on older versions of glibc 2017-06-22 Will Schmidt * src/linux-memory.c, src/papi_events.csv: PAPI power9 event list presets Here is an initial set of events and changes to help support Power9. 
This is based on similar changes that were made for power8 when initial support was added there. I've updated the event names to match what we expect to have in power9, and have done compile/build/sniff tests. 2017-06-22 Vince Weaver * src/ftests/Makefile.target.in: ftests: fortran tests weren't getting the TOPTFLAGS var set * src/testlib/test_utils.c: testlib: fix colors not turning off in pass/fail indicator * src/ctests/api.c, src/ctests/attach2.c, src/ctests/attach3.c, src/ctests/attach_cpu.c, src/ctests/inherit.c, src/ctests/multiattach.c, src/ctests/multiattach2.c, src/ctests/zero_attach.c, src/testlib/papi_test.h, src/testlib/test_utils.c: testlib: update the way pass/fail is printed It's been bugging me for years that they don't line up * src/run_tests.sh: run_tests.sh: run the validation tests too * src/Makefile.inc: Makefile.inc: make it compile the validation_tests * src/validation_tests/Makefile.recipies, src/validation_tests/papi_br_msp.c: validation-tests: add papi_br_msp test * src/validation_tests/Makefile.recipies, src/validation_tests/branches_testcode.c, src/validation_tests/matrix_multiply.c, src/validation_tests/matrix_multiply.h, src/validation_tests/papi_br_ins.c, src/validation_tests/testcode.h: validation_tests: add papi_br_ins test * src/validation_tests/Makefile.recipies, src/validation_tests/papi_tot_cyc.c: validation_tests: add papi_tot_cyc test * src/Makefile.inc: fix "make install-all" had some extraneous ".." after some previous changes * src/configure, src/configure.in, src/validation_tests/Makefile.target.in, src/validation_tests/papi_tot_ins.c: validation_tests: update configure so it sets up the Makefile * src/testlib/papi_test.h, src/testlib/test_utils.c: testlib: papi_print_header() lives with the utils code now * src/testlib/papi_test.h, src/testlib/test_utils.c: testlib: make tests_quiet() return an integer This way we don't have to depend on the global var TESTS_QUIET if we don't want to. 
* src/validation_tests/Makefile, src/validation_tests/Makefile.recipies, src/validation_tests/Makefile.target.in, src/validation_tests/display_error.c, src/validation_tests/display_error.h, src/validation_tests/instructions_testcode.c, src/validation_tests/papi_tot_ins.c, src/validation_tests/testcode.h: validation_tests: add initial papi_tot_ins test it is not hooked up to the build system yet * src/ctests/multiplex1.c, src/ctests/multiplex2.c, src/ctests/second.c, src/ctests/sprofile.c, src/ctests/virttime.c, src/ctests/zero_attach.c, src/ctests/zero_flip.c, src/ctests/zero_fork.c, src/ctests/zero_omp.c, src/ctests/zero_pthreads.c: ctests: more printf/TESTS_QUIET conversions * src/testlib/fpapi_test.h: ftests: missing define was making second.F fail * src/ctests/johnmay2.c, src/ctests/krentel_pthreads.c, src/ctests/kufrin.c, src/ctests/locks_pthreads.c, src/ctests/memory.c, src/ctests/multiattach.c, src/ctests/multiattach2.c, src/ctests/multiplex1.c: ctests: more printf/TESTS_QUIET fixes 2017-06-21 Vince Weaver * src/ctests/all_events.c, src/ctests/all_native_events.c, src/ctests/attach2.c, src/ctests/attach3.c, src/ctests/attach_cpu.c, src/ctests/byte_profile.c, src/ctests/calibrate.c, src/ctests/cmpinfo.c, src/ctests/code2name.c, src/ctests/cycle_ratio.c, src/ctests/exeinfo.c, src/ctests/fork_exec_overflow.c, src/ctests/hl_rates.c, src/ctests/hwinfo.c: ctests: explicitly block printfs with TESTS_QUIET There was some hackery with the preprocessor to avoid this but that wasn't a good solution. 
* src/testlib/do_loops.h, src/testlib/papi_test.h, src/testlib/test_utils.c: testlib: minor papi_test.h cleanups * .../perf_event/tests/perf_event_offcore_response.c, .../perf_event/tests/perf_event_system_wide.c, .../perf_event/tests/perf_event_user_kernel.c, .../tests/perf_event_amd_northbridge.c, .../perf_event_uncore/tests/perf_event_uncore.c, .../perf_event_uncore/tests/perf_event_uncore_cbox.c, .../tests/perf_event_uncore_multiple.c, src/ctests/attach2.c, src/ctests/attach3.c, src/ctests/attach_cpu.c, src/ctests/attach_target.c, src/ctests/branches.c, src/ctests/burn.c, src/ctests/byte_profile.c, src/ctests/cycle_ratio.c, src/ctests/derived.c, src/ctests/dmem_info.c, src/ctests/earprofile.c, src/ctests/first.c, src/ctests/high-level.c, src/ctests/inherit.c, src/ctests/johnmay2.c, src/ctests/krentel_pthreads.c, src/ctests/kufrin.c, src/ctests/locks_pthreads.c, src/ctests/low-level.c, src/ctests/matrix-hl.c, src/ctests/memory.c, src/ctests/multiattach.c, src/ctests/multiattach2.c, src/ctests/multiplex1.c, src/ctests/multiplex1_pthreads.c, src/ctests/multiplex2.c, src/ctests/multiplex3_pthreads.c, src/ctests/overflow.c, src/ctests/overflow2.c, src/ctests/overflow3_pthreads.c, src/ctests/overflow_allcounters.c, src/ctests/overflow_force_software.c, src/ctests/overflow_index.c, src/ctests/overflow_one_and_read.c, src/ctests/overflow_single_event.c, src/ctests/overflow_twoevents.c, src/ctests/p4_lst_ins.c, src/ctests/prof_utils.c, src/ctests/profile.c, src/ctests/profile_twoevents.c, src/ctests/remove_events.c, src/ctests/reset.c, src/ctests/reset_multiplex.c, src/ctests/sdsc.c, src/ctests/sdsc2.c, src/ctests/sdsc4.c, src/ctests/second.c, src/ctests/sprofile.c, src/ctests/tenth.c, src/ctests/zero.c, src/ctests/zero_attach.c, src/ctests/zero_flip.c, src/ctests/zero_fork.c, src/ctests/zero_named.c, src/ctests/zero_omp.c, src/ctests/zero_pthreads.c, src/ctests/zero_shmem.c, src/ctests/zero_smp.c, src/testlib/Makefile, src/testlib/fpapi_test.h, 
src/testlib/papi_test.h, src/testlib/test_utils.h: testlib: more papi_test.h reduction * src/testlib/Makefile: testlib: turn off optimization on the validation loops it's making tests fail, need to go back and be sure we are properly tricking the compiler. * src/Makefile.inc, src/components/Makefile_comp_tests, src/components/perf_event/tests/Makefile, src/components/perf_event_uncore/tests/Makefile, src/components/rapl/tests/Makefile, src/components/rapl/tests/rapl_overflow.c, src/ctests/Makefile, src/ctests/Makefile.recipies, src/ctests/overflow_pthreads.c, src/ctests/profile_pthreads.c, src/ftests/Makefile, src/ftests/Makefile.recipies, src/ftests/Makefile.target.in, src/testlib/Makefile, src/testlib/do_loops.c, src/testlib/do_loops.h, src/testlib/papi_test.h: testlib: start splitting the validation code off from the pass/fail code * src/components/perf_event/tests/perf_event_offcore_response.c, src/components/perf_event/tests/perf_event_system_wide.c, src/components/perf_event/tests/perf_event_user_kernel.c, src/components/perf_event_uncore/tests/perf_event_amd_northbridge.c, src/components/perf_event_uncore/tests/perf_event_uncore.c, src/components/perf_event_uncore/tests/perf_event_uncore_cbox.c, src/components/perf_event_uncore/tests/perf_event_uncore_multiple.c, src/components/rapl/tests/rapl_basic.c, src/components/rapl/tests/rapl_overflow.c, src/ctests/all_native_events.c, src/ctests/attach2.c, src/ctests/attach3.c, src/ctests/attach_cpu.c, src/ctests/attach_target.c, src/ctests/branches.c, src/ctests/burn.c, src/ctests/byte_profile.c, src/ctests/calibrate.c, src/ctests/case1.c, src/ctests/case2.c, src/ctests/clockres_pthreads.c, src/ctests/cmpinfo.c, src/ctests/code2name.c, src/ctests/cycle_ratio.c, src/ctests/data_range.c, src/ctests/derived.c, src/ctests/describe.c, src/ctests/disable_component.c, src/ctests/dmem_info.c, src/ctests/earprofile.c, src/ctests/eventname.c, src/ctests/exec.c, src/ctests/exec2.c, src/ctests/exeinfo.c, src/ctests/first.c, 
src/ctests/flops.c, src/ctests/fork.c, src/ctests/fork2.c, src/ctests/forkexec.c, src/ctests/forkexec2.c, src/ctests/forkexec3.c, src/ctests/forkexec4.c, src/ctests/get_event_component.c, src/ctests/high-level.c, src/ctests/high-level2.c, src/ctests/hl_rates.c, src/ctests/hwinfo.c, src/ctests/inherit.c, src/ctests/ipc.c, src/ctests/johnmay2.c, src/ctests/krentel_pthreads.c, src/ctests/kufrin.c, src/ctests/locks_pthreads.c, src/ctests/low-level.c, src/ctests/matrix-hl.c, src/ctests/memory.c, src/ctests/mendes-alt.c, src/ctests/multiattach.c, src/ctests/multiattach2.c, src/ctests/multiplex1.c, src/ctests/multiplex1_pthreads.c, src/ctests/multiplex2.c, src/ctests/multiplex3_pthreads.c, src/ctests/nmi_watchdog.c, src/ctests/omptough.c, src/ctests/overflow.c, src/ctests/overflow2.c, src/ctests/overflow3_pthreads.c, src/ctests/overflow_allcounters.c, src/ctests/overflow_force_software.c, src/ctests/overflow_index.c, src/ctests/overflow_one_and_read.c, src/ctests/overflow_pthreads.c, src/ctests/overflow_single_event.c, src/ctests/overflow_twoevents.c, src/ctests/p4_lst_ins.c, src/ctests/prof_utils.c, src/ctests/profile.c, src/ctests/profile_pthreads.c, src/ctests/profile_twoevents.c, src/ctests/pthrtough.c, src/ctests/pthrtough2.c, src/ctests/realtime.c, src/ctests/remove_events.c, src/ctests/reset.c, src/ctests/reset_multiplex.c, src/ctests/sdsc.c, src/ctests/sdsc2.c, src/ctests/sdsc4.c, src/ctests/second.c, src/ctests/shlib.c, src/ctests/sprofile.c, src/ctests/tenth.c, src/ctests/thrspecific.c, src/ctests/timer_overflow.c, src/ctests/virttime.c, src/ctests/zero.c, src/ctests/zero_attach.c, src/ctests/zero_flip.c, src/ctests/zero_fork.c, src/ctests/zero_named.c, src/ctests/zero_omp.c, src/ctests/zero_pthreads.c, src/ctests/zero_shmem.c, src/ctests/zero_smp.c, src/testlib/do_loops.c, src/testlib/papi_test.h, src/testlib/test_utils.c: testlib: remove include of papi.h Need to explicitly include it in your test if you need it. 
* src/testlib/Makefile, src/testlib/do_loops.c, src/testlib/do_loops.h, src/testlib/dummy.c, src/utils/Makefile, src/utils/papi_command_line.c, src/utils/papi_cost.c: utils: remove last uses of testlib * src/utils/Makefile, src/utils/papi_hybrid_native_avail.c: utils: update papi_hybrid_native_avail to not depend on testlib * src/utils/papi_multiplex_cost.c: utils: clean up papi_multiplex_cost remove dependencies on papi_test.h; print message warning that it can take a long time to run * .../perf_event/tests/perf_event_offcore_response.c, .../perf_event/tests/perf_event_system_wide.c, .../perf_event/tests/perf_event_user_kernel.c, .../perf_event_uncore/perf_event_uncore.c, .../tests/perf_event_amd_northbridge.c, .../perf_event_uncore/tests/perf_event_uncore.c, .../tests/perf_event_uncore_cbox.c, .../tests/perf_event_uncore_multiple.c, src/components/rapl/tests/rapl_basic.c, src/components/rapl/tests/rapl_overflow.c, src/ctests/all_native_events.c, src/ctests/attach2.c, src/ctests/attach3.c, src/ctests/branches.c, src/ctests/byte_profile.c, src/ctests/calibrate.c, src/ctests/data_range.c, src/ctests/describe.c, src/ctests/disable_component.c, src/ctests/earprofile.c, src/ctests/exec.c, src/ctests/exec2.c, src/ctests/exeinfo.c, src/ctests/first.c, src/ctests/forkexec.c, src/ctests/forkexec2.c, src/ctests/forkexec3.c, src/ctests/forkexec4.c, src/ctests/get_event_component.c, src/ctests/inherit.c, src/ctests/krentel_pthreads.c, src/ctests/kufrin.c, src/ctests/matrix-hl.c, src/ctests/multiplex1.c, src/ctests/multiplex1_pthreads.c, src/ctests/multiplex2.c, src/ctests/nmi_watchdog.c, src/ctests/overflow_allcounters.c, src/ctests/overflow_force_software.c, src/ctests/overflow_pthreads.c, src/ctests/overflow_single_event.c, src/ctests/overflow_twoevents.c, src/ctests/prof_utils.c, src/ctests/profile_pthreads.c, src/ctests/remove_events.c, src/ctests/reset.c, src/ctests/reset_multiplex.c, src/ctests/sdsc.c, src/ctests/sdsc2.c, src/ctests/sdsc4.c, src/ctests/second.c, 
src/ctests/shlib.c, src/ctests/timer_overflow.c, src/ctests/zero_named.c, src/testlib/do_loops.c, src/testlib/papi_test.h, src/testlib/test_utils.c, src/utils/Makefile, src/utils/cost_utils.c, src/utils/papi_command_line.c, src/utils/papi_cost.c, src/utils/papi_event_chooser.c: testlib: more header removal from papi_test.h * src/components/perf_event/tests/perf_event_system_wide.c, src/ctests/attach2.c, src/ctests/attach3.c, src/ctests/multiattach.c, src/ctests/multiattach2.c, src/ctests/zero_attach.c, src/testlib/papi_test.h, src/utils/cost_utils.c: testlib: remove a few more includes from papi_test.h * src/components/rapl/tests/rapl_basic.c, src/ctests/all_events.c, src/ctests/all_native_events.c, src/ctests/api.c, src/ctests/attach2.c, src/ctests/attach3.c, src/ctests/attach_cpu.c, src/ctests/attach_target.c, src/ctests/branches.c, src/ctests/burn.c, src/ctests/calibrate.c, src/ctests/case1.c, src/ctests/case2.c, src/ctests/clockres_pthreads.c, src/ctests/code2name.c, src/ctests/cycle_ratio.c, src/ctests/data_range.c, src/ctests/derived.c, src/ctests/describe.c, src/ctests/dmem_info.c, src/ctests/earprofile.c, src/ctests/eventname.c, src/ctests/exec.c, src/ctests/exec2.c, src/ctests/exeinfo.c, src/ctests/flops.c, src/ctests/fork.c, src/ctests/fork2.c, src/ctests/forkexec.c, src/ctests/forkexec2.c, src/ctests/forkexec3.c, src/ctests/forkexec4.c, src/ctests/high-level.c, src/ctests/high-level2.c, src/ctests/hl_rates.c, src/ctests/hwinfo.c, src/ctests/inherit.c, src/ctests/ipc.c, src/ctests/johnmay2.c, src/ctests/kufrin.c, src/ctests/locks_pthreads.c, src/ctests/low-level.c, src/ctests/max_multiplex.c, src/ctests/memory.c, src/ctests/multiattach.c, src/ctests/multiattach2.c, src/ctests/multiplex1.c, src/ctests/multiplex1_pthreads.c, src/ctests/multiplex2.c, src/ctests/multiplex3_pthreads.c, src/ctests/overflow.c, src/ctests/overflow2.c, src/ctests/overflow3_pthreads.c, src/ctests/overflow_allcounters.c, src/ctests/overflow_force_software.c, 
src/ctests/overflow_index.c, src/ctests/overflow_one_and_read.c, src/ctests/overflow_pthreads.c, src/ctests/overflow_single_event.c, src/ctests/overflow_twoevents.c, src/ctests/p4_lst_ins.c, src/ctests/prof_utils.c, src/ctests/profile.c, src/ctests/profile_pthreads.c, src/ctests/profile_twoevents.c, src/ctests/pthrtough.c, src/ctests/pthrtough2.c, src/ctests/realtime.c, src/ctests/sdsc.c, src/ctests/sdsc2.c, src/ctests/sdsc4.c, src/ctests/second.c, src/ctests/shlib.c, src/ctests/sprofile.c, src/ctests/tenth.c, src/ctests/thrspecific.c, src/ctests/timer_overflow.c, src/ctests/virttime.c, src/ctests/zero.c, src/ctests/zero_attach.c, src/ctests/zero_flip.c, src/ctests/zero_fork.c, src/ctests/zero_omp.c, src/ctests/zero_pthreads.c, src/ctests/zero_shmem.c, src/ctests/zero_smp.c, src/testlib/do_loops.c, src/testlib/dummy.c, src/testlib/papi_test.h, src/testlib/test_utils.c, src/utils/papi_command_line.c, src/utils/papi_cost.c: testlib: split some headers out of papi_test.h Too much is going on in that header, no need to have every include in the world in it. Trying to make the testcode more standalone so it is easier to follow. 
* src/testlib/Makefile, src/testlib/Makefile.target.in: testlib: let testlib build properly from within the testlib directory * src/testlib/clockcore.c: testlib: clockcore wasn't protecting all the output with !quiet * src/ctests/Makefile: ctests: make sure tests link against the right papi.h file * src/Makefile.inc, src/ctests/Makefile, src/ctests/Makefile.target.in: ctests: allow running "make" in the ctests directory to work 2017-06-20 Vince Weaver * src/Matlab/PAPI_Matlab.readme, src/papi.c, src/utils/papi_avail.c, src/utils/papi_clockres.c, src/utils/papi_command_line.c, src/utils/papi_component_avail.c, src/utils/papi_cost.c, src/utils/papi_decode.c, src/utils/papi_error_codes.c, src/utils/papi_event_chooser.c, src/utils/papi_hybrid_native_avail.c, src/utils/papi_mem_info.c, src/utils/papi_multiplex_cost.c, src/utils/papi_native_avail.c, src/utils/papi_version.c, src/utils/papi_xml_event_info.c: update the ptools-perfapi e-mail address in the auto-generated manpages it was still using the old ptools.org address. * doc/Makefile: docs: fix the manpage build after renaming the utils Thanks to Steve Kaufmann for catching this. 
* src/utils/Makefile, src/utils/papi_native_avail.c: utils: papi_native_avail: remove extraneous testing code * src/utils/Makefile, src/utils/papi_mem_info.c: utils: papi_mem_info: remove extraneous test code * src/utils/Makefile, src/utils/papi_xml_event_info.c: utils: papi_xml_event_info: remove extraneous test code * src/utils/Makefile, src/utils/papi_decode.c: utils: papi_decode: remove extraneous test code * src/utils/Makefile, src/utils/papi_error_codes.c: utils: papi_error_codes: remove extraneous test code * src/utils/Makefile, src/utils/papi_component_avail.c: utils: papi_component_avail: remove extraneous test code * src/ctests/clockres_pthreads.c, src/testlib/clockcore.c, src/testlib/clockcore.h, src/testlib/papi_test.h, src/utils/Makefile, src/utils/papi_clockres.c: utils: papi_clockres, remove extraneous test code * src/utils/Makefile, src/utils/papi_avail.c, src/utils/print_header.c, src/utils/print_header.h: utils: update papi_avail to not depend on testlibs It's not a test. * src/utils/Makefile: utils: add target for papi_hybrid_native_avail do not build it by default though? Should only be built if compiling for MIC? 
* src/utils/Makefile, src/utils/avail.c, src/utils/clockres.c, src/utils/command_line.c, src/utils/component.c, src/utils/cost.c, src/utils/decode.c, src/utils/error_codes.c, src/utils/event_chooser.c, src/utils/event_info.c, src/utils/hybrid_native_avail.c, src/utils/mem_info.c, src/utils/multiplex_cost.c, src/utils/native_avail.c, src/utils/papi_avail.c, src/utils/papi_clockres.c, src/utils/papi_command_line.c, src/utils/papi_component_avail.c, src/utils/papi_cost.c, src/utils/papi_decode.c, src/utils/papi_error_codes.c, src/utils/papi_event_chooser.c, src/utils/papi_hybrid_native_avail.c, src/utils/papi_mem_info.c, src/utils/papi_multiplex_cost.c, src/utils/papi_native_avail.c, src/utils/papi_xml_event_info.c: utils: rename the utils so the executable matches the filename This has bothered me for years, you want to fix "papi_native_avail" but there is no file in the tree called "papi_native_avail.c" * src/utils/Makefile, src/utils/papi_version.c, src/utils/version.c: utils: rename version.c to papi_version.c Also minor cleanups to the utility. * src/Makefile.inc, src/configure, src/configure.in, src/utils/Makefile, src/utils/Makefile.target.in: utils: clean up Makefile and build process of utils Now should be able to run "make" in the utils subdir and have it build. Also move the list of util files to build out of configure as I don't think there's any reason for having them there. * src/components/perf_event/pe_libpfm4_events.c: perf: fall back to operating system default events if libpfm4 lacks support This will allow use of PAPI on machines that Linux has support for, but libpfm4 has not added events yet. Still some limitations, for example the PAPI preset events won't work. 
* src/components/perf_event/pe_libpfm4_events.c, src/components/perf_event/perf_event.c: perf: report better errors if libpfm4 initialization fails * src/components/perf_event/pe_libpfm4_events.c: perf: pe_libpfm4_events: minor whitespace fixup * src/components/perf_event/pe_libpfm4_events.c: perf: pe_libpfm4_events: whitespace changes to make code easier to follow 2017-06-19 Vince Weaver * src/ctests/code2name.c: ctests/code2name: fix uninitialized variable warning * src/ctests/calibrate.c: ctests/calibrate: fix uninitialized variable warning * src/ctests/thrspecific.c: ctests: thrspecific fix so it finishes It's actually really unclear what this code is trying to test, but with optimization enabled it hung forever. Marking the variable being spun on as volatile fixes things but I think there is more wrong with the test than just that. * src/ctests/branches.c, src/ctests/sdsc.c, src/ctests/sdsc4.c: ctests: fix tests using "dummy3()" as a workload Now that we enable optimization on the ctests this breaks some of the benchmarks. dummy3() was being optimized away which caused segfaults and other problems. The tests don't crash now, but they still fail. Still investigating. 2016-10-12 Phil Mucci * src/configure: Regenerated configure with recent autoconf * src/configure.in: By default, we want -O1 on tests (TOPTFLAGS). -O0 is too literal and causes a number of tests that depend on peephole optimization to fail. * src/utils/Makefile: Utils are installed, therefore they should be built with production flags, not test/debug flags * src/Makefile.inc: Make clean should not clean up libpfm. That's for make distclean. We're not developing libpfm! 2016-07-04 Phil Mucci * src/ctests/mendes-alt.c, src/ctests/zero.c: Moved function definitions to top of file to eliminate non-ANSI-C prototypes inside main. 
Modified message in zero to note that turbo boost will also cause errors (cycles > real-time-cycles). * src/Makefile.in, src/Makefile.inc, src/configure, src/configure.in: Remove EXTRA_CFLAGS, now CFLAGS. Added FTOPTS so Fortran tests compile with the same flags as ctests. Fix testing at configure time of libpfm for proper combinations of libpfm options * src/ftests/Makefile: Homogenize include flags * src/ctests/Makefile: Homogenize include flags * src/testlib/Makefile: Removed unnecessary defs and options * src/utils/Makefile: Removed unnecessary definitions and compiler options 2016-07-01 Phil Mucci * src/Makefile.in, src/Makefile.inc, src/Rules.perfctr-pfm, src/Rules.perfmon2, src/Rules.pfm4_pe, src/components/Makefile_comp_tests.target.in, src/components/perf_event/pe_libpfm4_events.c, src/configure, src/configure.in, src/ctests/Makefile, src/ctests/Makefile.target.in, src/ftests/Makefile, src/ftests/Makefile.target.in: Makefile.in: - Removed DEBUGFLAGS, NOTLS, PAPI_EVENTS_TABLE from being generated. These were not properly used. - Added LIBCFLAGS generated from configure for CFLAGS that ONLY apply to the library and the library code. NOT tests nor utilities. Previously we were propagating all kinds of bogus flags to the tests and utils. - CFLAGS is now properly set for compiler flags not defines etc. Makefile.inc: - Put papi_events_table.h in the right place. This is always the same name. Previous attempts at parameterizing this were broken and/or unnecessary. - Added dependency for the above in the right place and ALWAYS generate it, regardless of whether we actually include it in the library (vs load the CSV at runtime). Rules.perfctr-pfm - Removed conditional removal of events table during clean. Rules.perfmon2 - Removed conditional removal of events table during clean. Rules.pfm4_pe - Stopped mussing with CFLAGS which would pollute child builds but refer to LIBCFLAGS. CFLAGS is for everything! - Removed conditional removal of events table during clean. 
  - Removed duplicate reference to papi_events_table.h.

  components/perf_event/pe_libpfm4_events.c:
  - Removed HARDCODED include of a libpfm4 private header file.  Wrong
    path and unnecessary include.  This would break if you linked against
    another libpfm using any of the config options.

  components/perf_event/peu_libpfm4_events.c:
  - Removed HARDCODED include of a libpfm4 private header file.  Wrong
    path and unnecessary include.  This would break if you linked against
    another libpfm using any of the config options.

  components/Makefile_comp_tests.target.in:
  - Refer to datarootdir to make autoconf happy.

  configure/configure.in:
  Regenerated using autoconf 2.69, with many modifications to fix serious
  brokenness.  Lots of fixes:
  - Sanitize options for static inclusion of user and papi presets.
  - Fix options that do not print out a result.
  - Fix debug=yes to not include PAPI_MEMORY_MANAGEMENT.  That's only
    enabled with debug=memory.  This will reduce false positives when we
    debug.  We don't want our own malloc/free changing behavior when we
    are trying to debug!
  - Fix CFLAGS/LIBCFLAGS/DEBUGFLAGS.  configure now exports a variable
    called PAPICFLAGS which gets stuffed into LIBCFLAGS in Makefile.in.
    This variable IS ONLY for compiler flags relevant to the library.
    Previously we were exporting all sorts of stuff that would make our
    passes behave differently than user code.  _GNU_SOURCE and
    -D_REENTRANT: that stuff is for the library and components, not user
    code.
  - Update compile tests to use AC_LANG_SOURCE as required.
  - Fix clock timer checking output to now say what timer we picked
    instead of just skipping an answer.
  - Same for virtual clock timer.
  - Remove broken --with-papi-events option.
  - Fixed --with-static-tools option.
  - Fixed/added --with-static-papi-events option (default) and
    --with-static-user-events option.
  - Fixed modalities of configuring whether to build a static/shared
    library or both.
  - Fixed link of tests with shared libraries when the above options don't
    support it.  Modality again.
  - Remove SETPATH/LIBPATH define, which won't work for ANY combination of
    --with-pfm-prefix/root/libdir except our included library.  Woefully
    broken, and would result in many false positive failures.  If you are
    going to run the tests on the shared library, it is now the user's
    responsibility to set LD_LIBRARY_PATH/LIBPATH correctly.  I suspect
    this may irritate some, but broken 90% of the time is no excuse for
    correct 10% of the time, especially when it could generate bug
    reports falsely.
  - Fixed with-static-tools, with-shlib-tools options to correct
    modalities.
  - Fixed all modalities with --with-pfm-prefix/root/libdir/incdir.
    Previously the build, configure and source files were still referring
    to pieces of code INSIDE our libpfm4, resulting in version skew and
    breakage.  The way to test this stuff is to use --root or --prefix
    after removing the internal libpfm4 library.
  - Removed unnecessary and confusing force_pfm_incdir.
  - Fixed with-pe-incdir option which, like before, was most of the time
    referring to the libpfm4 included header file.  Not good if one has a
    custom kernel!  PECFLAGS now only appended to PAPICFLAGS (LIBCFLAGS).
  - Removal of DEBUGFLAGS.  aix.c needs testing.  Anyone have one?
  - Fixed CFLAGS for BSD.
  - Add message for papi_events.csv.

  ctests/Makefile, ftests/Makefile:
  - Don't redefine CC/CC_R/CFLAGS/FFLAGS.
  - Make these files consistent.

  ctests/Makefile.target.in, ftests/Makefile.target.in:
  - Refer to datarootdir as required.

2016-06-27  Phil Mucci

* src/testlib/Makefile, src/testlib/Makefile.target.in: Added explicit
  target for libtestlib.a.  The all target should have been marked as
  .PHONY so as to avoid constant rebuilding.  Also, we really should
  merge these two files into a master and an include.  Maintaining two
  makefiles stinks!

2017-06-16  Vince Weaver

* src/papi_fwrappers.c: fwrappers: papif_unregister_thread was misspelled
  as papif_unregster_thread.
  This was noticed by Vedran Novakovic.  For an extremely long time
  (10+ years?)
  the Fortran wrapper was misspelled as papif_unregster_thread().  It's
  probably too late to fix this without potentially breaking things, so
  just add a duplicate function with the proper spelling and leave the old
  one too.

* src/papi_preset.c: papi_preset: fix compiler warning.
  This really confusing warning has been around for a while.  gcc-6.3
  reports it in a really odd way:

    papi_preset.c: In function ‘check_derived_events’:
    papi_preset.c:513:19: warning: ‘__s’ may be used uninitialized in this function$
      int val = atoi(&subtoken[1]);
                ^~~~~~~~~~~~
    papi_preset.c:464:1: note: ‘__s’ was declared here
    ops_string_merge(char **original, char *insertion, int replaces, int start_ind$
    ^~~~~~~~~~~~~~~~

  But there is no __s variable, or anything to do with where the arrows
  are pointing.  gcc-5 gives a better warning:

    papi_preset.c: In function ‘check_derived_events’:
    papi_preset.c:513:14: warning: ‘tok_save_ptr’ may be used uninitialized in this$
      int val = atoi(&subtoken[1]);
                ^
    papi_preset.c:472:8: note: ‘tok_save_ptr’ was declared here
    char *tok_save_ptr;

  So the thing it seems to be complaining about is that the *saveptr
  parameter to strtok_r() is not set to NULL.  According to the manpage I
  don't think this should be needed?  But I think it should be safe to
  initialize it anyway.

Tue Jun 6 11:09:17 2017 -0500  Will Schmidt

* src/libpfm4/lib/events/power9_events.h,
  src/libpfm4/perf_examples/self_count.c,
  src/libpfm4/tests/validate_power.c: Update libpfm4.
  Current with commit ce5b320031f75f9a9881333c13902d5541f91cc8
  add power9 entries to validate_power.c
  Hi, Update the validate_power test to include power9 entries.
  sniff-test run output:

    $ ./validate
    Libpfm structure tests:
    libpfm ABI version : 0
    pfm_pmu_info_t : Passed
    pfm_event_info_t : Passed
    pfm_event_attr_info_t : Passed
    pfm_pmu_encode_arg_t : Passed
    pfm_perf_encode_arg_t : Passed
    Libpfm internal table tests:
    checking power9 (946 events): Passed
    Architecture specific tests:
    20 PowerPC events: 0 errors
    All tests passed

2017-06-15  Vince Weaver

* src/components/perf_event/pe_libpfm4_events.c,
  src/components/perf_event/pe_libpfm4_events.h,
  .../perf_event_uncore/Rules.perf_event_uncore,
  .../perf_event_uncore/perf_event_uncore.c,
  .../perf_event_uncore/peu_libpfm4_events.c,
  .../perf_event_uncore/peu_libpfm4_events.h:
  perf_event: merge the libpfm4 helper libraries.
  perf_event and perf_event_uncore each had their own, almost exactly the
  same, libpfm4 helper libraries.  Maintaining both was a chore, and it
  looks like it is possible to just share one copy.  This does mean that
  it is now not possible to configure the perf_event_uncore component
  without perf_event being enabled, but I am not sure if that was even
  possible to begin with.

* src/components/perf_event/pe_libpfm4_events.c,
  .../perf_event_uncore/perf_event_uncore.c,
  .../perf_event_uncore/peu_libpfm4_events.c,
  .../perf_event_uncore/peu_libpfm4_events.h:
  perf_event_uncore: make the libpfm4 routines match even more.

* src/components/perf_event/pe_libpfm4_events.c,
  .../perf_event_uncore/peu_libpfm4_events.c:
  perf_event: make perf_event and perf_event uncore libpfm4 more similar.
  It's a bad idea to have more or less two copies of the same code.

* src/components/perf_event/pe_libpfm4_events.c,
  .../perf_event_uncore/peu_libpfm4_events.c:
  perf_event: Avoid unintended libpfm build dependency due to the
  PFM_PMU_MAX enum.
  This patch is based on one sent by William Cohen.
  The libpfm pfmlib.h file enumerates each of the performance monitoring
  units (PMUs) it can program in the pfm_pmu_t type.  The last enum in
  this type is PFM_PMU_MAX.
  Depending on which specific version of libpfm is being used, this value
  can vary.  The problem is that PFM_PMU_MAX is statically defined in the
  pfmlib.h file, and this was being used as a loop bound when iterating to
  determine which PMUs are potentially available.  If PAPI was built with
  an older version of libpfm and then run with a newer libpfm shared
  library on a machine with a larger PFM_PMU_MAX value, none of the PMUs
  past the smaller PFM_PMU_MAX used for the build would be examined or
  enabled.

2017-06-15  Heike Jagode  (jagode@icl.utk.edu)

* src/components/infiniband/linux-infiniband.c: Updated the infiniband
  component so that it works for mofed driver version 4.0, where the
  directory counters_ext in the sysfs fs has changed to hw_counters.
  This update to the component makes it work for both directory names:
  - counters_ext for mofed driver version <4.0, and
  - hw_counters for mofed driver version >=4.0
  This change has not been fully tested yet due to missing access to a
  machine with an updated version of the mofed driver.  (CORAL machines
  will have an updated version of this driver.)

2017-05-04  Vince Weaver

* src/components/rapl/linux-rapl.c: rapl: broadwell-ep DRAM units are
  special (like Haswell-EP).
  The Linux kernel perf interface had this wrong too.  I noticed this in
  my cluster computing class: the Broadwell-EP DRAM results were
  unrealistically high values.

Fri Apr 21 17:33:15 2017 -0700  William Cohen

* src/libpfm4/README, src/libpfm4/include/perfmon/pfmlib.h,
  src/libpfm4/lib/Makefile, src/libpfm4/lib/events/power9_events.h,
  src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_power9.c,
  src/libpfm4/lib/pfmlib_power_priv.h, src/libpfm4/lib/pfmlib_priv.h,
  src/libpfm4/lib/pfmlib_s390x_cpumf.c: Update libpfm4.
  Current with commit 8385268c98553cb5dec9ca86bbad3e5c44a2ab16
  fix internal pfm_event_attr_info_t use for S390X
  Commit 321133e converted most of the architectures to use the internal
  perflib_event_attr_info_t type.
  However, the s390 was missed in that previous commit.  This patch
  corrects the issue so libpfm compiles on s390.

2017-04-20  Stephen Wood

* src/extras.c, src/papi.h, src/papi_fwrappers.c, src/papi_hl.c,
  src/papi_internal.c: Cast pointers appropriately to avoid warnings and
  errors.

2017-04-19  Sangamesh Ragate

* src/papi_events.csv: Mapped the PAPI_L2_ICM preset event to the
  PM_INST_FROM_L2MISS native event for Power8.

2017-04-06  Asim YarKhan

* src/ftests/fmatrixlowpapi.F: Fixed: this Fortran test exceeded 72
  columns and made the default Intel ifort compilation unhappy.

Wed Apr 5 23:35:44 2017 -0700  Andreas Beckmann

* src/libpfm4/docs/man3/libpfm_arm_ac53.3,
  src/libpfm4/docs/man3/libpfm_arm_ac57.3,
  src/libpfm4/docs/man3/libpfm_arm_xgene.3, src/libpfm4/lib/Makefile,
  src/libpfm4/lib/events/arm_cortex_a53_events.h,
  src/libpfm4/lib/events/intel_glm_events.h,
  src/libpfm4/lib/events/intel_hswep_unc_imc_events.h,
  src/libpfm4/lib/events/intel_ivbep_unc_imc_events.h,
  src/libpfm4/lib/events/intel_knl_events.h,
  src/libpfm4/lib/events/intel_knl_unc_cha_events.h,
  src/libpfm4/lib/events/power4_events.h,
  src/libpfm4/lib/events/ppc970_events.h,
  src/libpfm4/lib/events/ppc970mp_events.h,
  src/libpfm4/perf_examples/self_smpl_multi.c: Update libpfm4.
  Current with commit 71a960d9c17b663137a2023ce63edd2f3ca115f5
  fix various event description typos
  This patch fixes the typos in several event descriptions for the Intel,
  Arm, and Power event tables.

2017-03-30  William Cohen

* src/ftests/cost.F, src/ftests/first.F, src/ftests/fmatrixlowpapi.F,
  src/ftests/second.F: Eliminate warnings about implicit type conversions
  in Fortran tests.
  The gfortran compiler on Fedora 25 was giving warnings indicating that a
  few of the tests were doing implicit type conversion between reals and
  ints.  Those implicit conversions have been made explicit to eliminate
  the Fortran compiler warning messages.
Tue Apr 4 09:42:25 2017 -0700  Stephane Eranian

* src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/pfmlib_amd64.c,
  src/libpfm4/lib/pfmlib_amd64_priv.h, src/libpfm4/lib/pfmlib_arm.c,
  src/libpfm4/lib/pfmlib_arm_priv.h, src/libpfm4/lib/pfmlib_common.c,
  src/libpfm4/lib/pfmlib_intel_netburst.c,
  src/libpfm4/lib/pfmlib_intel_nhm_unc.c,
  src/libpfm4/lib/pfmlib_intel_snbep_unc.c,
  src/libpfm4/lib/pfmlib_intel_snbep_unc_priv.h,
  src/libpfm4/lib/pfmlib_intel_x86.c,
  src/libpfm4/lib/pfmlib_intel_x86_perf_event.c,
  src/libpfm4/lib/pfmlib_intel_x86_priv.h, src/libpfm4/lib/pfmlib_mips.c,
  src/libpfm4/lib/pfmlib_mips_priv.h,
  src/libpfm4/lib/pfmlib_perf_event.c,
  src/libpfm4/lib/pfmlib_perf_event_pmu.c,
  src/libpfm4/lib/pfmlib_perf_event_raw.c,
  src/libpfm4/lib/pfmlib_power_priv.h, src/libpfm4/lib/pfmlib_powerpc.c,
  src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/lib/pfmlib_sparc.c,
  src/libpfm4/lib/pfmlib_sparc_priv.h, src/libpfm4/lib/pfmlib_torrent.c,
  src/libpfm4/tests/validate.c, src/libpfm4/tests/validate_x86.c:
  Update libpfm4.
  Current with commit 5e311841e5d70efb93d11826109cb5acab6e051c
  enable 38-bit raw umasks for Intel offcore_response events
  This patch enables support for passing and encoding of the 38-bit
  offcore_response matrix umask.  Without the patch, the raw umask was
  limited to 32 bits, which is not enough to cover all the possible bits
  of the offcore_response event available since Intel SandyBridge.

    $ examples/check_events offcore_response_0:0xffffff
    Requested Event: offcore_response_0:0xffffff
    Actual Event: ivb::OFFCORE_RESPONSE_0:0xffffff:k=1:u=1:e=0:i=0:c=0:t=0
    PMU : Intel Ivy Bridge
    IDX : 155189325
    Codes : 0x5301b7 0xffffff

  The patch also adds tests to the validation code.

2017-03-29  Vince Weaver

* src/components/perfctr/perfctr-x86.c: perfctr: fix perfctr component to
  actually work.
  A simple one-line typo means perfctr was not working, probably for
  years.  I've tested on a 2.6.32-perfctr kernel and it works again.
2017-03-28  Vince Weaver

* src/papi_events.csv: papi_events: add AMD fam16h jaguar events.
  These will become useful if/when the contributed libpfm4 jaguar patches
  get applied.

2017-03-27  Vince Weaver

* src/papi_events.csv: events: p4: change the PAPI_TOT_CYC event.
  PAPI_TOT_CYC wasn't working on Pentium4 because the
  GLOBAL_POWER_EVENT:RUNNING event was being grabbed by the hardware
  watchdog.  perf cycles:u was still working; that's because the kernel
  transparently remaps the cycles event to an alias when
  global_power_event's slot is taken.  The aliased event is the unwieldy:
  execution_event:nbogus0:nbogus1:nbogus2:nbogus3:bogus0:bogus1:bogus2:bogus3:cmpl:thr=15
  which does seem to give the right results.  Use this event instead by
  default on Pentium 4.

* src/components/perf_event/perf_event.c: perf_event: fix warning when
  compiling with debug enabled.
  The flags field is an unsigned long, not an int.

2017-03-22  Vince Weaver

* src/components/perf_event/perf_event.c: perf_event: don't allocate a
  mmap page if not rdpmc or sampling.

* src/components/perf_event/perf_event.c: perf_event: only allocate 1
  mmap page (rather than 3) if not sampling.
  Next step is to allocate 0 mmap pages unless rdpmc is enabled.

* src/components/perf_event/perf_event.c,
  src/components/perf_event/perf_event_lib.h: perf_event: update the
  _pe_set_overflow() call.
  Working on making it more obvious which events are sampling (and thus
  need mmap buffers) or not.  Also there were some bugs in the handling of
  having multiple overflow sources per eventset, though I'm not sure if
  PAPI actually handles that.

* src/components/perf_event/perf_event.c: perf_event: turn off
  fast_counter_read if mmaps fail.
  By default on Linux perf_event can't use more than 516kB of mmap space,
  so perf_event-rdpmc would fail after you added a large number (>32) of
  events.  This shows up on the kufrin benchmark on some machines.  This
  fix makes PAPI fall back to non-rdpmc if an mmap error happens.
  I'm also going to try to tune the mmap usage a bit to make the limits a
  bit higher.

2017-03-21  Asim YarKhan

* src/configure: configure script updated using autoconf-2.59.

2017-03-20  Vince Weaver

* src/components/perf_event/perf_event.c, src/configure.in: configure:
  enable rdpmc with --enable-perfevent-rdpmc=yes.
  Make this an option to configure.  Defaults to no.  Need to find a
  machine with autoconf 2.59 on it and I'll regenerate configure as well.

2017-03-16  Vince Weaver

* src/components/perf_event/perf_event.c: perf_event: try to work around
  the exclude_guest issue.
  Run a test at startup to see if events with exclude_guest fail.
  libpfm4 sets this by default, but older kernels will fail because this
  was previously a reserved (must be zero) field.

2017-03-14  Vince Weaver

* src/ctests/multiattach.c: tests: multiattach:
  whitespace/comments/clarifications.
  Digging through the code trying to figure out why it fails with rdpmc
  enabled.  It turns out it is seeing wrong running/enabled multiplexing
  results even though we aren't multiplexing.  Tracking this down is a
  pain because we can't strace/ltrace due to the code using ptrace to
  start/stop processes.

2017-03-09  Vince Weaver

* src/components/perf_event/perf_event.c: perf_event: can't mmap() an
  inherited event.
  This is why the inherit test was failing.

* src/components/perf_event/perf_event.c,
  src/components/perf_event/perf_helpers.h: perf_event: add rdpmc support
  (but disabled).
  Finally add the rdpmc code, but it still fails on a few tests so it is
  disabled by default.

* src/components/perf_event/perf_event.c,
  src/components/perf_event/perf_event_lib.h: perf_event: make all events
  come with a mmap buffer.
  This wastes some address space, but having separate codepaths for
  rdpmc/regular/sampling/profiling would be hard to maintain.  Had to
  remove some assumptions from the profiling/sampling code that mmap_buf
  means sampling is happening.
* src/components/perf_event/perf_event.c: perf_event: add check for
  paranoid==3.
  Recent distributions are *completely* disabling perf_event by default
  with their vendor kernels (this is not upstream yet).  Have PAPI detect
  this and disable the perf_event component if it is detected.

* src/components/perf_event/perf_event.c: perf_event: split
  close_pe_events() into two functions.

* src/components/perf_event/perf_event.c,
  src/components/perf_event/perf_helpers.h: perf_event: more whitespace /
  rearrangement.
  There should not be any changes to actual code; this is just
  whitespace/comment/function movement.  I know changes like this make the
  git history harder to follow, but it really helps when trying to follow
  the code when working on major changes.

2017-03-08  Vince Weaver

* src/components/perf_event/perf_event.c: perf_event: more
  whitespace/comment cleanups.
  Digging through the code, still prepping for rdpmc.

2017-03-07  Vince Weaver

* src/components/perf_event/perf_helpers.h: perf_event: rdpmc: need to
  sign extend the offset too.
  Otherwise things stop working after a PAPI_reset().

* src/components/perf_event/perf_event.c: perf_event: split up
  _pe_read().
  Makes the code a bit easier to follow.  Also prep for rdpmc().

* src/components/perf_event/perf_event.c: perf_event: clean up whitespace
  in _pe_read.

2017-03-08  Vince Weaver

* src/ctests/first.c: ctests: first: white space cleanups.
  Minor things noticed when trying to figure out why it was failing with
  rdpmc (the answer was the rdpmc code not handling PAPI_reset()).

2017-03-07  Vince Weaver

* src/components/perf_event/perf_helpers.h: perf_event: recent changes
  broke the build on non-x86.
  An ifdef was in the wrong location.

* src/components/perf_event/perf_event.c,
  src/components/perf_event/perf_helpers.h: perf_event: update rdpmc
  detection.

* src/utils/component.c: utils: component_avail: clean up -d (detailed)
  results.
  Print rdpmc status, as well as line things up.  Also don't print
  redundant info, now that a lot more fields are printed by default.
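The paranoid==3 startup check mentioned above boils down to reading one integer from procfs. The sketch below is purely illustrative: the helper names (paranoid_level_disables_perf, read_paranoid_level) are invented here and are not the actual PAPI code.

```c
#include <stdio.h>

/* Hypothetical sketch, not the actual PAPI code.  Some vendor kernels
 * carry a non-upstream patch that adds level 3 to perf_event_paranoid,
 * which disables perf_event entirely for unprivileged users. */
static int paranoid_level_disables_perf(int level)
{
    return level >= 3;
}

/* Return the current paranoid level, or -1 if the file cannot be read
 * (which usually means the kernel has no perf_event support at all). */
static int read_paranoid_level(void)
{
    FILE *f = fopen("/proc/sys/kernel/perf_event_paranoid", "r");
    int level = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &level) != 1)
        level = -1;
    fclose(f);
    return level;
}
```

A component init routine could call read_paranoid_level() once at startup and mark the component disabled when paranoid_level_disables_perf() returns nonzero, rather than failing later on every perf_event_open().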
* src/utils/component.c: utils: component_avail: whitespace/grammar
  fixes.

* src/components/perf_event/Rules.perf_event,
  src/components/perf_event/perf_helpers.h: perf_event: add mmap/rdpmc
  routine.
  We don't use it yet.

2017-03-06  Vince Weaver

* src/components/perf_event/perf_helpers.h: perf_event: add rdtsc() and
  rdpmc() inline assembly.

* src/components/perf_event/perf_event.c,
  src/components/perf_event/perf_helpers.h: perf_event: move
  perf_event_open() code to a helper file.
  We'll be adding some other helpers to this file too.

2017-03-03  Vince Weaver

* src/components/perf_event/perf_event.c: perf_event: move
  bug_sync_read() check out of line.
  We should eventually just phase out a lot of these checks for older
  kernels, but it gets tricky as long as RHEL is shipping 2.6.32.  With
  this change, on my IVB machine the PAPI_read() cost went from

    mean cycles : 932.158549  std deviation: 358.752461
  to
    mean cycles : 896.642644  std deviation: 305.568268

* src/components/perf_event/pe_libpfm4_events.c,
  src/components/perf_event/pe_libpfm4_events.h,
  src/components/perf_event/perf_event.c: perf_event: remove the
  _pe_libpfm4_get_cidx() helper function.
  Easier to explicitly pass it to the libpfm4 event code.

* src/components/perf_event/perf_event_lib.h: perf_event: the wakeup_mode
  field is no longer used.

* src/components/perf_event/perf_event.c: perf_event: remove the
  WAKEUP_MODE_ defines.
  These date back to initial perf_event support, but were never used.
  Probably they were meant in case advanced sampling/profiling was ever
  implemented, but it wasn't.
* src/components/perf_event/perf_event.c: perf_event.c: split
  setup_mmap() into its own function.
  Non-sampling events will need to have mmap buffers when we move to
  rdpmc().

* src/components/perf_event/perf_event.c: perf_event: rename tune_up_fd
  to configure_fd_for_sampling.
  Makes it a bit more clear what is going on.

* src/components/perf_event/perf_event.c: perf_event: remove extraneous
  whitespace.

2017-02-24  Vince Weaver

* src/utils/cost.c: papi_cost: wasn't properly resetting the event search
  after POSTFIX.
  This means some architectures could have skipped the ADD/SUB test even
  though such events were available.

Wed Feb 22 01:16:42 2017 -0800  Stephane Eranian

* src/libpfm4/lib/events/intel_bdw_events.h,
  src/libpfm4/lib/events/intel_skl_events.h,
  src/libpfm4/lib/pfmlib_intel_rapl.c,
  src/libpfm4/tests/validate_x86.c: Update libpfm4.
  Current with commit 1bd352eef242f53e130c3b025bbf7881a5fb5d1e
  update Intel RAPL processor support
  Added Kabylake, Skylake X.  Added PSYS RAPL event for the Skylake
  client.

2017-02-17  Vince Weaver

* src/utils/cost.c: papi_cost: clear the eventset before the derived add
  test.
  We weren't clearing the eventset after the derived postfix test, so the
  add test was actually measuring two derived events.  This was noticed on
  broadwell-ep, where papi_cost would fail due to the lack of enough
  counters to have both the postfix and add events at the same time.

2017-01-23  Asim YarKhan

* RELEASENOTES.txt: Fixing the date in the RELEASENOTES file.

2019-02-17  Vince Weaver

* src/ctests/attach_cpu_sys_validate.c, src/ctests/attach_cpu_validate.c:
  ctests: attach_cpu_*validate: fix buffer overrun.
  An embarrassing bug: on systems with more than 16 cores it was running
  off the end of a buffer.  Oddly, this did not fail on my Debian system
  with 32 cores.
2019-02-15  Anthony Castaldo

* src/validation_tests/papi_tot_cyc.c: Added a 'priming' call of the
  naive matrix multiply, before counting the cycles on the second call.
  This is to overcome any first-time system overhead in the procedure,
  like loading the program, so the cycles will better match the 3rd and
  4th calls.  This corrects an error (and failure of the test) in which
  the first call to the routine takes over 10% more cycles to complete
  than the subsequent calls.

* src/testlib/test_utils.c: Added new message to test_fail, so users will
  not think a test failure means their PAPI install is unusable, or that
  the failure should be reported to the PAPI development team.

* src/run_tests_exclude.txt: Added new memleak_check.c in
  validation_tests; it is not a standalone test, but a utility to be run
  by valgrind when checking for memory leaks.

* release_procedure.txt: Release guidance improved with more details on
  testing.

2019-02-11  Frank Winkler

* src/ctests/zero.c, src/validation_tests/cycles_validation.c: Fixed
  warnings detected by clang.  Replaced abs with llabs.

* src/components/coretemp_freebsd/coretemp_freebsd.c: Fixed unused
  warnings.

2019-02-08  Anthony Castaldo

* src/components/perf_event/pe_libpfm4_events.c: Revert change that
  repaired a memory leak; it caused a problem with ARM systems, per Vince
  Weaver.

2019-02-08  Vince Weaver

* src/linux-common.c: arm64: update the ARM family configuration to work
  with newer Linux kernels.
  As of Linux-3.19 the "CPU architecture" field in /proc/cpuinfo changed
  from "AArch64" to "8".  Update the code so it properly falls back in
  this situation.
  Reported-by: Al Grant

2019-02-08  Frank Winkler

* src/components/nvml/tests/Makefile: Fixed linking error.

* src/components/nvml/linux-nvml.c: Fixed "this statement may fall
  through" warning.

* src/components/pcp/tests/testPCP.c: Fixed sprintf warnings by replacing
  them with the safer method snprintf.
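The sprintf-to-snprintf change mentioned above amounts to the following pattern. This is a minimal illustrative sketch; build_event_path() is an invented name, not a real PAPI function:

```c
#include <stdio.h>

/* Illustrative sketch only -- build_event_path() is an invented name.
 * Unlike sprintf(), snprintf() never writes past 'len' bytes, and it
 * returns the length the full string would have needed, so truncation
 * can be detected and reported instead of silently overflowing. */
static int build_event_path(char *buf, size_t len,
                            const char *dir, const char *counter)
{
    int n = snprintf(buf, len, "%s/%s", dir, counter);

    if (n < 0 || (size_t)n >= len)
        return -1;          /* output did not fit (or an encoding error) */
    return 0;               /* buf now holds the full "dir/counter" path */
}
```

Checking the return value this way is exactly what newer gcc's truncation warnings push for: the caller gets an explicit error path instead of a corrupted buffer.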
* src/components/coretemp/linux-coretemp.c: Fixed warning: '%s' directive
  output may be truncated.
  Newer versions of gcc are more strict with regards to return values of
  snprintf(), so check the values.

2019-02-07  Anthony Castaldo

* src/papi_internal.c: After PAPI_shutdown free(_papi_native_events);
  must reset variables and counts to ensure the next PAPI_library_init()
  from the same code will realloc() and rebuild the table.

* src/libpfm4/Makefile, src/libpfm4/config.mk,
  src/libpfm4/examples/Makefile, src/libpfm4/lib/pfmlib_perf_event.c,
  src/libpfm4/perf_examples/Makefile: Improve the top makefile by
  separating targets.
  This patch improves the Makefile structure by, at the top level,
  separating the various targets: all, install-lib, install-examples.
  The patch keeps install_examples as a backward compatible target.  The
  patch also makes it possible to override PREFIX from the cmdline.

2019-02-06  Anthony Castaldo

* src/validation_tests/Makefile.recipies,
  src/validation_tests/memleak_check.c: memleak_check is a new simple
  test file to help expose memory leaks in PAPI main and components.

* src/papi_internal.c, src/papi_preset.c: Patches provided by Jiali Li.
  Added cleanup code to prevent memory leaks.

* src/libpfm4/lib/pfmlib_perf_event.c: Patches provided by Jiali Li.
  Added cleanup code to prevent memory leaks.

* src/components/stealtime/linux-stealtime.c: Patches provided by Jiali
  Li.  Added cleanup code to prevent memory leaks.

* src/components/powercap/linux-powercap.c: Added cleanup code to prevent
  some of the memory leaks.  Dynamic-library related calls leak, but we
  cannot prevent all of those leaks.

* src/components/perf_event_uncore/perf_event_uncore.c: Patches provided
  by Jiali Li.  Added cleanup code to prevent memory leaks.

* src/components/perf_event/pe_libpfm4_events.c: Patches provided by
  Jiali Li.  Added cleanup code to prevent memory leaks.

* src/components/nvml/linux-nvml.c: Added cleanup code to prevent some of
  the memory leaks.
  Dynamic-library related calls leak, but we cannot prevent all of those
  leaks.

* src/components/lustre/linux-lustre.c: Added cleanup code to prevent
  memory leaks.

* src/components/lmsensors/linux-lmsensors.c: Added cleanup code to
  prevent some of the memory leaks.  Dynamic-library related calls leak,
  but we cannot prevent all of those leaks.

* src/components/infiniband_umad/linux-infiniband_umad.c: Added cleanup
  code to prevent some of the memory leaks.  Dynamic-library related
  calls leak, but we cannot prevent all of those leaks.

2019-02-06  Heike Jagode

* src/components/cuda/tests/cudaTest_cupti_only.cu,
  src/components/cuda/tests/likeComp_cupti_only.cu,
  src/components/cuda/tests/simpleMultiGPU.cu,
  src/components/cuda/tests/simpleMultiGPU.h,
  src/components/cuda/tests/timer.h: Added more details to the license
  statement for cuda tests.

2019-02-05  Anthony Castaldo

* src/components/nvml/README: Expanded description of NVML component and
  usage.

* src/components/README: Expanded notes to guide releases.

2019-02-01  Anthony Castaldo

* src/components/infiniband_umad/linux-infiniband_umad.c: Corrected name
  of component to distinguish this one from the 'infiniband' component.

* src/components/nvml/README, src/components/nvml/linux-nvml.c: Had a
  problem with undefined variables; added notes to README.

2019-02-01  Frank Winkler

* src/components/perf_event/perf_event.c: Suppress "unused" warnings.

* src/components/perf_event/perf_event.c: Get rid of "use of
  uninitialized variable" warnings.

* src/components/infiniband_umad/README, src/components/lmsensors/README:
  Added component instructions.

* src/components/cuda/Rules.cuda: Commented out target "native_clean"
  since it is used twice.

* src/components/lmsensors/README: Added build instructions for component
  lmsensors.

* src/components/lmsensors/linux-lmsensors.c: Suppress "unused variables"
  warnings.
2019-01-31 Anthony Castaldo * man/man1/PAPI_derived_event_files.1, man/man1/papi_avail.1, man/man1/papi_clockres.1, man/man1/papi_command_line.1, man/man1/papi_component_avail.1, man/man1/papi_cost.1, man/man1/papi_decode.1, man/man1/papi_error_codes.1, man/man1/papi_event_chooser.1, man/man1/papi_hybrid_native_avail.1, man/man1/papi_mem_info.1, man/man1/papi_multiplex_cost.1, man/man1/papi_native_avail.1, man/man1/papi_version.1, man/man1/papi_xml_event_info.1, man/man3/PAPIF_accum.3, man/man3/PAPIF_accum_counters.3, man/man3/PAPIF_add_event.3, man/man3/PAPIF_add_events.3, man/man3/PAPIF_add_named_event.3, man/man3/PAPIF_assign_eventset_component.3, man/man3/PAPIF_cleanup_eventset.3, man/man3/PAPIF_create_eventset.3, man/man3/PAPIF_destroy_eventset.3, man/man3/PAPIF_enum_event.3, man/man3/PAPIF_epc.3, man/man3/PAPIF_event_code_to_name.3, man/man3/PAPIF_event_name_to_code.3, man/man3/PAPIF_flips.3, man/man3/PAPIF_flops.3, man/man3/PAPIF_get_clockrate.3, man/man3/PAPIF_get_dmem_info.3, man/man3/PAPIF_get_domain.3, man/man3/PAPIF_get_event_info.3, man/man3/PAPIF_get_exe_info.3, man/man3/PAPIF_get_granularity.3, man/man3/PAPIF_get_hardware_info.3, man/man3/PAPIF_get_multiplex.3, man/man3/PAPIF_get_preload.3, man/man3/PAPIF_get_real_cyc.3, man/man3/PAPIF_get_real_nsec.3, man/man3/PAPIF_get_real_usec.3, man/man3/PAPIF_get_virt_cyc.3, man/man3/PAPIF_get_virt_usec.3, man/man3/PAPIF_ipc.3, man/man3/PAPIF_is_initialized.3, man/man3/PAPIF_library_init.3, man/man3/PAPIF_lock.3, man/man3/PAPIF_multiplex_init.3, man/man3/PAPIF_num_cmp_hwctrs.3, man/man3/PAPIF_num_counters.3, man/man3/PAPIF_num_events.3, man/man3/PAPIF_num_hwctrs.3, man/man3/PAPIF_perror.3, man/man3/PAPIF_query_event.3, man/man3/PAPIF_query_named_event.3, man/man3/PAPIF_read.3, man/man3/PAPIF_read_ts.3, man/man3/PAPIF_register_thread.3, man/man3/PAPIF_remove_event.3, man/man3/PAPIF_remove_events.3, man/man3/PAPIF_remove_named_event.3, man/man3/PAPIF_reset.3, man/man3/PAPIF_set_cmp_domain.3, 
man/man3/PAPIF_set_cmp_granularity.3, man/man3/PAPIF_set_debug.3, man/man3/PAPIF_set_domain.3, man/man3/PAPIF_set_event_domain.3, man/man3/PAPIF_set_granularity.3, man/man3/PAPIF_set_inherit.3, man/man3/PAPIF_set_multiplex.3, man/man3/PAPIF_shutdown.3, man/man3/PAPIF_start.3, man/man3/PAPIF_start_counters.3, man/man3/PAPIF_state.3, man/man3/PAPIF_stop.3, man/man3/PAPIF_stop_counters.3, man/man3/PAPIF_thread_id.3, man/man3/PAPIF_thread_init.3, man/man3/PAPIF_unlock.3, man/man3/PAPIF_unregister_thread.3, man/man3/PAPIF_write.3, man/man3/PAPI_accum.3, man/man3/PAPI_accum_counters.3, man/man3/PAPI_add_event.3, man/man3/PAPI_add_events.3, man/man3/PAPI_add_named_event.3, man/man3/PAPI_addr_range_option_t.3, man/man3/PAPI_address_map_t.3, man/man3/PAPI_all_thr_spec_t.3, man/man3/PAPI_assign_eventset_component.3, man/man3/PAPI_attach.3, man/man3/PAPI_attach_option_t.3, man/man3/PAPI_cleanup_eventset.3, man/man3/PAPI_component_info_t.3, man/man3/PAPI_cpu_option_t.3, man/man3/PAPI_create_eventset.3, man/man3/PAPI_debug_option_t.3, man/man3/PAPI_destroy_eventset.3, man/man3/PAPI_detach.3, man/man3/PAPI_disable_component.3, man/man3/PAPI_disable_component_by_name.3, man/man3/PAPI_dmem_info_t.3, man/man3/PAPI_domain_option_t.3, man/man3/PAPI_enum_cmp_event.3, man/man3/PAPI_enum_event.3, man/man3/PAPI_epc.3, man/man3/PAPI_event_code_to_name.3, man/man3/PAPI_event_info_t.3, man/man3/PAPI_event_name_to_code.3, man/man3/PAPI_exe_info_t.3, man/man3/PAPI_flips.3, man/man3/PAPI_flops.3, man/man3/PAPI_get_cmp_opt.3, man/man3/PAPI_get_component_index.3, man/man3/PAPI_get_component_info.3, man/man3/PAPI_get_dmem_info.3, man/man3/PAPI_get_event_component.3, man/man3/PAPI_get_event_info.3, man/man3/PAPI_get_eventset_component.3, man/man3/PAPI_get_executable_info.3, man/man3/PAPI_get_hardware_info.3, man/man3/PAPI_get_multiplex.3, man/man3/PAPI_get_opt.3, man/man3/PAPI_get_overflow_event_index.3, man/man3/PAPI_get_real_cyc.3, man/man3/PAPI_get_real_nsec.3, man/man3/PAPI_get_real_usec.3, 
man/man3/PAPI_get_shared_lib_info.3, man/man3/PAPI_get_thr_specific.3, man/man3/PAPI_get_virt_cyc.3, man/man3/PAPI_get_virt_nsec.3, man/man3/PAPI_get_virt_usec.3, man/man3/PAPI_granularity_option_t.3, man/man3/PAPI_hw_info_t.3, man/man3/PAPI_inherit_option_t.3, man/man3/PAPI_ipc.3, man/man3/PAPI_is_initialized.3, man/man3/PAPI_itimer_option_t.3, man/man3/PAPI_library_init.3, man/man3/PAPI_list_events.3, man/man3/PAPI_list_threads.3, man/man3/PAPI_lock.3, man/man3/PAPI_mh_cache_info_t.3, man/man3/PAPI_mh_info_t.3, man/man3/PAPI_mh_level_t.3, man/man3/PAPI_mh_tlb_info_t.3, man/man3/PAPI_mpx_info_t.3, man/man3/PAPI_multiplex_init.3, man/man3/PAPI_multiplex_option_t.3, man/man3/PAPI_num_cmp_hwctrs.3, man/man3/PAPI_num_components.3, man/man3/PAPI_num_counters.3, man/man3/PAPI_num_events.3, man/man3/PAPI_num_hwctrs.3, man/man3/PAPI_option_t.3, man/man3/PAPI_overflow.3, man/man3/PAPI_perror.3, man/man3/PAPI_preload_info_t.3, man/man3/PAPI_profil.3, man/man3/PAPI_query_event.3, man/man3/PAPI_query_named_event.3, man/man3/PAPI_read.3, man/man3/PAPI_read_counters.3, man/man3/PAPI_read_ts.3, man/man3/PAPI_register_thread.3, man/man3/PAPI_remove_event.3, man/man3/PAPI_remove_events.3, man/man3/PAPI_remove_named_event.3, man/man3/PAPI_reset.3, man/man3/PAPI_set_cmp_domain.3, man/man3/PAPI_set_cmp_granularity.3, man/man3/PAPI_set_debug.3, man/man3/PAPI_set_domain.3, man/man3/PAPI_set_granularity.3, man/man3/PAPI_set_multiplex.3, man/man3/PAPI_set_opt.3, man/man3/PAPI_set_thr_specific.3, man/man3/PAPI_shlib_info_t.3, man/man3/PAPI_shutdown.3, man/man3/PAPI_sprofil.3, man/man3/PAPI_sprofil_t.3, man/man3/PAPI_start.3, man/man3/PAPI_start_counters.3, man/man3/PAPI_state.3, man/man3/PAPI_stop.3, man/man3/PAPI_stop_counters.3, man/man3/PAPI_strerror.3, man/man3/PAPI_thread_id.3, man/man3/PAPI_thread_init.3, man/man3/PAPI_unlock.3, man/man3/PAPI_unregister_thread.3, man/man3/PAPI_write.3, release_procedure.txt: New Doc Files in preparation for release 5.7.0.0.
2019-01-30 Anthony Castaldo
    * src/configure: For new version 5.7.0.0
    * doc/Doxyfile-common, src/Makefile.in, src/configure.in, src/papi.h: Changing version number to 5.7.0.0.

2019-01-30 Anthony Castaldo
    * src/components/cuda/tests/LDLIB.src: Corrected a path name.

2019-01-30 William Cohen
    * src/run_tests.sh: Eliminating some of the SHELLCHECK_WARNINGS. Removing unused variables. Correcting printf arguments.

2019-01-29 Konstantin Stefanov
    * src/components/nvml/linux-nvml.c: Change method for detecting available NVML component events. Previously, the PAPI nvml component used the ROM version to detect the type of the GPU and find which events are supported. On some newer cards, e.g. Tesla Kepler and Tesla Pascal, this gives the wrong result. Those cards support GPU and memory utilization, for example, but this was not detected: a Kepler card may not have a powerROM, so PAPI nvml considers it an old card. So I changed the way event availability is detected: just try to obtain the info, and if it succeeds, the event is available.

2019-01-28 Anthony Castaldo
    * src/components/nvml/linux-nvml.c: Added (void)s to eliminate warnings about unused variables.
    * src/components/cuda/linux-cuda.c: Corrected field-name typo in a SUBDBG message that was not previously being compiled.
    * src/components/cuda/README, src/components/cuda/tests/LDLIB.src: Changes concerning access to cupti libs and includes.
    * src/components/cuda/tests/simpleMultiGPU.cu, src/components/nvml/tests/HelloWorld.cu: Corrected compile warnings for deprecated routines or compiler complaints.

2019-01-23 Vince Weaver
    * src/papi_events.csv: papi_events: the skylake events are actually split in two, make sure cascadelake gets both cases too

2019-01-23 Anthony Castaldo
    * src/components/nvml/tests/nvml_power_limiting_test.cu: Structure member name was misspelt; 'cmpinfo->disabled_resaon' instead of 'cmpinfo->disabled_reason'.
    * src/components/infiniband_umad/tests/infiniband_umad_list_events.c: Code was missing the 'string.h' include necessary for use of the 'strstr()' function.
    * src/components/infiniband_umad/linux-infiniband_umad.c: Header file changed; fixed prototypes for umad_get_ca() to use 'const char*' instead of 'char*'.

2019-01-22 Vince Weaver
    * src/papi_events.csv: papi_events: add cascade lake X support

2019-01-22 Anthony Castaldo
    * src/components/cuda/linux-cuda.c, src/components/cuda/tests/cudaTest_cupti_only.cu, src/libpfm4/lib/pfmlib_amd64.c, src/libpfm4/lib/pfmlib_intel_x86.c, src/libpfm4/lib/pfmlib_intel_x86_arch.c: linux-cuda.c and cudaTest_cupti_only.cu have cosmetic changes. The pfmlib changes were committed by Stephane to simplify cpuid; the push/pop were causing problems with some compiler optimizations.

2019-01-16 Anthony Castaldo
    * src/libpfm4/README, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/events/amd64_events_fam17h.h, src/libpfm4/lib/events/intel_skl_events.h, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_skl.c, src/libpfm4/lib/pfmlib_intel_x86.c, src/libpfm4/lib/pfmlib_intel_x86_priv.h, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_x86.c: Three patches to libpfm4. (1) Add Intel CascadeLake X core PMU support. (2) Add get_num_events() support for Intel X86. (3) Check PMU models when validating event codes.
    * src/components/cuda/tests/runSMG.sh: Example file to run simpleMultiGPU.
    * src/components/cuda/tests/runCTCO.sh: Example file for running cudaTest_cupti_only.
    * src/components/cuda/tests/runBW.sh: Example script to run nvlink_bandwidth on PEAK.
    * src/components/cuda/tests/runAll.sh: Example script to run nvlink_all on PEAK.
    * src/components/cuda/tests/simpleMultiGPU.cu: This is a PAPI version of an NVIDIA cupti-only sample program; it is a useful starting point to test a variety of metrics or events, which are specified in a simple internal table.
    * src/components/cuda/tests/likeComp_cupti_only.cu: This program (likeComp = likeComponent) tested if the events in a metric could be harvested and put into an event group, read and re-ordered to provide data to cuptiMetricGetValue. They can; we did this before rewriting the component to do all Metrics in this way.
    * src/components/cuda/tests/nvlink_bandwidth.cu: This is a tester for just 4 NVLINK bandwidth metrics. It moves data from CPU (host) to GPU, or GPU to GPU. Reporting of intermediate steps has been increased, and we retrieve the number of Async engines dynamically to optimize the number of streams used in the copies. It was previously hard-coded.
    * src/components/cuda/tests/nvlink_all.cu: This utility will iterate through all the available NVLINK metrics in the PAPI system, run a test program for each of them, and report the results to stdout. The test program is memory movement; a command line argument can test CPU (host) to GPU memory movement, or GPU to GPU movement amongst available GPU devices. The report consists of all single events, followed by a list of all possible pairs of events on one GPU, and on multiple GPUs. This report notes which nvlink events are incompatible pairs, and will also report if any metrics in pairs produce significantly different measurements than when they are read singly. All of this is done with PAPI.
    * src/components/cuda/tests/cudaTest_cupti_only.cu: This program will test a single performance metric or event that is provided on the command line, on one or more GPUs. Only cupti is used. The exercise will include both extensive memory moves NOT executed by a kernel, and a kernel. Options allow the user to skip the kernel execution if desired, and to optionally use cuInit() and reset the devices before beginning the test. Reports of the steps in the process are output to stdout.
      The purpose of this program is to show what a cupti-only result looks like, in order to see if issues with an event are in the PAPI implementation only, or also exist in a cupti-only implementation.
    * src/components/cuda/tests/Makefile: Several targets were added for new test and utility programs.
    * src/components/cuda/linux-cuda.c: Several changes were made to more efficiently (and correctly) read and compute metrics, including the newly added nvlink metrics. The previous method was not reading groups properly; and though this did not cause an error, it could result in zeros being read instead of actual values. The change is to break down all metrics and events for a device into a global event list (without duplicates) and build a single event group set for everything the user has added. We repeat this each time the user adds an event, on the assumption that this overhead is less likely to occur during a performance-critical time than when the user reads the event set. After reading all the resultant groups we then re-order the events and values to compute each metric, then store those values (and any other event values) back into the user-provided order. Outstanding Issues: We do not provide to the user the cuda metric 'branch_efficiency'; there is an issue with the library code sometimes segfaulting while reading the events for this particular metric. The bug has been reported to nvidia as bug ID 2485834.

2019-01-10 Vince Weaver
    * src/ctests/Makefile.recipies, src/ctests/attach_cpu_sys_validate.c: ctests: add an attach_cpu_sys test. This tests for the Linux bug where you attach to a process with SYS granularity.
    * src/components/perf_event/perf_event.c, src/components/perf_event/perf_event_lib.h, src/ctests/attach_cpu_validate.c: perf_event: fix granularity setting for attached processes. The old code was setting the granularity wrong when attaching to a CPU.
    * src/components/perf_event/perf_event.c: perf_event: properly fall back to read() if an rdpmc read attempt fails. The code wasn't properly handling this. We now fall back to read() if *any* rdpmc call in an eventset fails. In theory it is possible to only fall back in a per-event fashion, but that would make the code a lot more complex.
    * src/components/perf_event/perf_helpers.h: perf_event: internally indicate we need fallback when rdpmc not available

2018-12-07 Vince Weaver
    * src/ctests/attach_cpu_validate.c: ctests: attach_cpu_validate: fail test if all values are close to the same
    * src/ctests/Makefile.recipies, src/ctests/attach_cpu_validate.c: ctests: add attach_cpu_validate test

2018-12-03 Vince Weaver
    * src/ctests/branches.c: ctests/branches: remove code to set "sleep time" which is no longer used
    * src/ctests/branches.c: ctests/branches: make the failure message more verbose to see what was going wrong. The issue I was seeing on Haswell was because there was some perf-related system load happening on the same machine (the perf_fuzzer).
    * src/ctests/branches.c: ctests: branches, update code comments to explain what the test is doing, trying to figure out why it is sometimes failing on a Haswell system

2018-11-20 Anthony Castaldo
    * src/components/cuda/linux-cuda.c, src/components/cuda/tests/Makefile, src/components/cuda/tests/nvlink_bandwidth.cu, .../cuda/tests/nvlink_bandwidth_cupti_only.cu, src/components/cuda/tests/runBW.sh, src/components/cuda/tests/runCO.sh, src/components/cuda/tests/simpleMultiGPU.cu: Several files modified to properly utilize the NVLINK metrics added to the linux-cuda.c component. Commenting improved to aid my own understanding of the existing code. Tony C.

2018-11-20 Terry Cojean
    * src/components/cuda/README, src/components/nvml/README, .../nvml/tests/nvml_power_limiting_test.cu, src/components/powercap/README, src/components/powercap/tests/powercap_limit.c: Improved error handling. Fixed typos. Added details to component README files.
2018-11-05 Anara Kozhokanova
    * src/components/powercap/utils/powercap_plot.c: Revert "Temporary Fix: The powercap component does not properly". This reverts commit bde6c257e4af47e9267ebb194b0aa4697568e99f. The issue with incorrect values reported by the powercap component was fixed in ea8fa1f. Therefore, this temporary fix is no longer needed.
    * src/components/powercap/linux-powercap.c: Fix the bug in the powercap component introduced in 2231b36. The values reported by the powercap component were not correct (read values were not subtracted from start values, and wraparound was not handled).

2018-11-04 Frank Winkler
    * src/components/perf_event/perf_event.c: Fixed a bug that occurred when compiling with the debug flag: papi_pe_buffer was undeclared.

2018-10-26 Anthony Castaldo
    * src/components/cuda/linux-cuda.c, src/components/cuda/tests/Makefile, src/components/cuda/tests/nvlink_all.cu, src/components/cuda/tests/nvlink_bandwidth.cu, .../cuda/tests/nvlink_bandwidth_cupti_only.cu, src/components/cuda/tests/runAll.sh, src/components/cuda/tests/runBW.sh, src/components/cuda/tests/runCO.sh: Repairs, new features, run files, and a new utility in nvlink_all.

2018-10-10 Anthony Castaldo
    * src/components/cuda/linux-cuda.c, src/components/cuda/tests/LDLIB.src, src/components/cuda/tests/Makefile, src/components/cuda/tests/nvlink_all.cu, src/components/cuda/tests/nvlink_bandwidth.cu, .../cuda/tests/nvlink_bandwidth_cupti_only.cu, src/components/cuda/tests/runAll.sh, src/components/cuda/tests/runBW.sh, src/components/nvml/PeakConfigure.sh: Added several files, and rewrote the tests. I created a new test, nvlink_all.cu, with a new approach to test all nvlink events present in the component standalone, and I rewrote the original nvlink_bandwidth.cu to make it work properly with PAPI. I also added some testing scripts needed to function on the PEAK supercomputer, where this code was tested.
2018-10-03 Anara Kozhokanova
    * src/utils/papi_avail.c: Add a note to the output of "papi_avail -e " if a preset event is not available on the host architecture.

2018-09-28 Anthony Castaldo
    * src/components/cuda/tests/Makefile, src/components/cuda/tests/nvlink_bandwidth.cu, src/components/cuda/tests/simpleMultiGPU.cu, src/components/nvml/tests/Makefile, src/components/nvml/tests/nvmlcap_plot.cu: New and debugged files for NVML and CUDA testing.

2018-09-28 Vince Weaver
    * src/components/perf_event/perf_event.c: perf_event: remove debug printf from libpfm4 error handling code. Steve Kaufmann reported this triggered sometimes and was unnecessary.
    * src/utils/papi_avail.c: papi_avail: fix the -e option to not print a spurious message. The "no events available" message should not be printed if -e is being used.

2018-09-27 Vince Weaver
    * src/components/perf_event/perf_event.c: perf_event: avoid floating point exception if running is 0. The perf_event interface isn't supposed to return 0 for running, but it happens occasionally. So be sure not to divide by zero if this happens. This makes the rdpmc code match the generic perf code in this case. This is in response to bitbucket issue #52.

2018-09-25 Heike Jagode
    * src/components/powercap/utils/powercap_plot.c: Temporary Fix: The powercap component does not properly report energy values. At some point in Nov 2017, the read() function was rewritten, which resulted in numerous errors, such as: +++ the energy start values are not subtracted from the read values. +++ wraparound is no longer working properly. etc. This commit serves as an immediate workaround and adds a temporary fix to get the powercap_plot utility working again. However, all this should and will be fixed in the powercap component itself.

2018-09-21 Anthony Castaldo
    * src/components/cuda/tests/simpleMultiGPU.cu: Corrected a bug in the CUPTI_ONLY version of simpleMultiGPU.cu.
      This manifested specifically if the node has multiple GPUs and they are of different models or types, in which case they can have differently numbered PAPI events. We converted a scalar storing the eventID to a vector with one eventID per GPU.

2018-09-19 Anthony Castaldo
    * src/components/nvml/tests/Makefile, src/components/nvml/tests/benchSANVML.c, .../nvml/tests/nvml_power_limit_read_test.cu, .../nvml/tests/nvml_power_limiting_test.cu: New test files, more cleanup on failure reporting.
    * src/components/cuda/tests/Makefile, src/components/nvml/tests/Makefile, .../nvml/tests/nvml_power_limiting_test.cu: Additions to Makefiles, and several changes to power limiting testing to correct errors when multiple GPUs are present, remove extraneous code, and provide greater clarity in output and error messages.

2018-09-14 Heike Jagode
    * src/components/cuda/linux-cuda.c: Minor fix: return correct error message if libcupti.so not found.

2018-09-13 Heike Jagode
    * src/components/cuda/linux-cuda.c: minor fix
    * src/components/cuda/linux-cuda.c: Bug fix: Instead of normalizing all the event values to represent the total number of domain instances on the device, only the last event value was normalized.

Tue Mar 20 09:37:56 2018 -0700 Steve Walk
    * src/libpfm4/lib/events/arm_cavium_tx2_events.h: Update libpfm4. Current with:
      commit 6c9e44b95a55b8bf62cbd64009c4c9b30964a66c: update Cavium ThunderX2 with now public events. This patch adds new model specific events to the Cavium Thunder X2 core PMU. The updated list is based on publicly available documentation from Cavium which is available at: https://cavium.com/resources.html

2018-08-27 Vince Weaver
    * src/components/rapl/linux-rapl.c: rapl: add support for AMD Fam17h (Zen) CPUs. AMD Fam17h chips have a new RAPL-like interface that supports energy measurement using register layouts like Intel RAPL, but at a different MSR number.
      This has been tested on an EPYC system and the package value seems to be plausible, but as reported by the LIKWID people the cores value seems a bit too low.

2018-08-01 Anthony Castaldo
    * src/components/pcp/tests/Makefile2, src/components/pcp/tests/README_BenchTesting.txt, src/components/pcp/tests/benchPCP.c, src/components/pcp/tests/benchPCP_script.sh, src/components/pcp/tests/benchStats.c: Benchmarking files and README.

2018-07-23 Tony Castaldo
    * ChangeLogP500.txt, RELEASENOTES.txt, man/man1/papi_multiplex_cost.1, man/man3/PAPI_attach.3, man/man3/PAPI_detach.3, man/man3/PAPI_get_dmem_info.3, man/man3/PAPI_hw_info_t.3, man/man3/PAPI_overflow.3, man/man3/PAPI_profil.3, src/Makefile.inc, src/components/Makefile_comp_tests, src/components/Makefile_comp_tests.target.in, src/components/appio/tests/Makefile, src/components/appio/tests/iozone/libasync.c, src/components/appio/tests/iozone/makefile, src/components/cuda/tests/Makefile, src/components/nvml/linux-nvml.c, src/components/perfctr/perfctr-x86.c, src/configure.in, src/ctests/Makefile.recipies, src/ctests/Makefile.target.in, src/ctests/overflow_force_software.c, src/examples/Makefile, src/examples/PAPI_overflow.c, src/freebsd/map-atom.c, src/freebsd/map-core2-extreme.c, src/freebsd/map-core2.c, src/ftests/Makefile.recipies, src/ftests/Makefile.target.in, src/linux-context.h, src/linux-timer.c, src/papi.c, src/papi.h, src/papi_events.csv, src/sw_multiplex.c, src/testlib/Makefile, src/testlib/Makefile.target.in, src/utils/Makefile, src/utils/Makefile.target.in, src/utils/papi_multiplex_cost.c, src/validation_tests/Makefile.recipies, src/validation_tests/Makefile.target.in: 8 patches to the make system from Andreas Beckmann

2018-06-27 Anthony Castaldo
    * src/components/pcp/linux-pcp.c: Removed a duplicated IF statement. No difference in execution.

2018-06-25 Anthony Castaldo
    * src/components/pcp/linux-pcp.c: Fixed pcp_init_component to show any errors in the reason for a disabled PCP component.
2018-06-22 Anthony Castaldo
    * src/components/pcp/README, src/components/pcp/linux-pcp.c, src/components/pcp/tests/testPCP.c: Fixed a debug print, added 'timescope' to testPCP output, completed README.
    * src/components/pcp/README: Fixed up README file.

2018-06-21 Anthony Castaldo
    * src/components/pcp/linux-pcp.c, src/components/pcp/tests/testPCP.c: Removed debug code, added non-zeroing on instantaneous variables.

2018-06-19 Heike Jagode
    * src/components/cuda/sampling/Makefile, src/components/cuda/tests/Makefile: Add cuda/lib64/stubs to linker for cuda tests to link with libcuda.so.

2018-06-19 Anthony Castaldo
    * src/components/pcp/linux-pcp.c, src/components/pcp/tests/testPCP.c: Code changes necessary to work on Power9.

2018-06-18 Anthony Castaldo
    * src/components/pcp/Rules.pcp, src/components/pcp/linux-pcp.c: Corrections to allow compile and execution on Power9.

2018-06-15 Anthony Castaldo
    * src/components/pcp/README, src/components/pcp/Rules.pcp, src/components/pcp/linux-pcp.c, src/components/pcp/tests/Makefile, src/components/pcp/tests/testPCP.c: Initial coding of pcp component and tester completed.

Wed Jun 13 23:49:10 2018 -0700 Stephane Eranian
    * src/libpfm4/config.mk, src/libpfm4/debian/changelog: Update libpfm4. Current with:
      commit 37d4628e37ba76c1ab586ab35e85340e30f7c523: update to version 4.10.1. Fix build issues on Cavium Thunder X2. Update Skylake event table.

Tue Jun 12 23:31:13 2018 -0700 Stephane Eranian
    * src/libpfm4/lib/Makefile, src/libpfm4/lib/events/intel_skl_events.h, src/libpfm4/lib/pfmlib_common.c: Update libpfm4. Current with:
      commit fa65a75a8af5b4e2c360be41e66203e04735dfd2: update Skylake event table. Based on Intel's skylake_core_v40.json event table from download.01.org.
      Added PARTIAL_RAT_STALLS.SCOREBOARD. Added ROB_MISC_EVENT.PAUSE_INST. Fixed encodings of some umasks for L2_RQSTS.

2018-06-12 Vince Weaver
    * src/components/perf_event/perf_event.c, src/ctests/Makefile.recipies, src/ctests/attach_validate.c: ctests: add new attach_validate test. Actually tries to validate the counter values when attached. We might have an issue with rdpmc() and attach, and are trying to make a test to catch it.

Thu Jun 7 11:38:48 2018 -0700 Stephane Eranian
    * src/libpfm4/config.mk, src/libpfm4/debian/changelog: Update libpfm4. Current with:
      commit 924437778d3fe75de5f7a43374ed6f4b1c0533a7: update to version 4.10.0. Update version number to 4.10.

2018-06-08 Steve Walk
    * src/papi_events.csv: Enable Cavium ThunderX2 support.

2018-06-07 Anara Kozhokanova
    * src/components/cuda/README: Update README in CUDA component: added '-i' flag to grep. Add a note about verifying whether the component is active or not before using it.

Tue Jun 5 14:22:32 2018 -0700 William Cohen
    * src/libpfm4/python/self.py, src/libpfm4/python/src/pmu.py, src/libpfm4/python/sys.py: Update libpfm4. Current with:
      commit 3106615db87f81f220efc13df7a4e36e31f1ee64: Import python print function, so that code works in the same manner for python 2 and 3.

Mon Jun 4 20:15:08 2018 -0700 William Cohen
    * src/libpfm4/lib/pfmlib_perf_event_pmu.c, src/libpfm4/perf_examples/syst_count.c, src/libpfm4/perf_examples/syst_smpl.c: Update libpfm4. Current with:
      commit 29f626744df184913a200532408e205e2b0ec2ec: Fix error: '%s' directive output may be truncated. Newer versions of gcc are more strict with regards to return values of snprintf(), so check the values.

2018-06-01 Heike Jagode
    * src/Makefile.inc: Fixed 'make dist' step.
Mon May 28 13:50:44 2018 -0700 Stephane Eranian
    * src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/events/arm_cavium_tx2_events.h, src/libpfm4/lib/events/intel_skl_events.h, src/libpfm4/lib/pfmlib_arm_armv8.c, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_x86_arch.c, src/libpfm4/lib/pfmlib_intel_x86_perf_event.c, src/libpfm4/lib/pfmlib_intel_x86_priv.h, src/libpfm4/lib/pfmlib_perf_event.c, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_arm64.c, src/libpfm4/tests/validate_x86.c: Update libpfm4. Current with:
      commit 488697d43bc5601ca51a22f7072169781d5b45b2: fix typo in BUS_ACCESS event for Cavium ThunderX2. This patch fixes a typo in an event name for the Cavium ThunderX2 core PMU event list: BUS_ACCESS_LD -> BUS_ACCESS_WR. Event list based on ARM Architecture Reference Manual (ARM DDI 0487C.a).

2018-05-25 Vince Weaver
    * src/ftests/Makefile.recipies, src/ftests/openmp.F: ftests: add an openmp test

2018-04-30 Vince Weaver
    * src/ctests/Makefile.recipies, src/ctests/destroy.c: ctests: add destroy test. This checks to make sure that when we destroy eventsets we aren't leaking file descriptors.

Wed Apr 18 19:03:36 2018 +0200 André Wild
    * src/libpfm4/lib/events/mips_74k_events.h, src/libpfm4/lib/events/s390x_cpumf_events.h, src/libpfm4/lib/pfmlib_intel_nhm_unc.c, src/libpfm4/lib/pfmlib_intel_x86.c: Update libpfm4. Current with:
      commit 903d1c05ed72d45e5bebc1f2a1a1ae60b3ed1ee6 (HEAD -> master, origin/master, origin/HEAD): remove duplicate assignment in pfm_nhm_unc_get_encoding. pe was assigned twice for no reason.
      commit 37b7e406b77acf6115386cca43bab128e2a2d905: clarify intel_x86_check_pebs(). This routine is not used right now because we cannot determine in the x86 code whether or not PEBS has been requested for an event. This is usually requested at the OS interface level. But the patch keeps the code around in case we need it later on.
      commit 832e1a388d25ba39444505c2fa7ffb77f7537df5: fix typo in mip74k event name. OCP_WRITE_CACHEABLE REQUESTS -> OCP_WRITE_CACHEABLE_REQUESTS. Reported-by: Andreas Beckmann
      commit 56cea590df7e77a1c1f1044e95d836cc01cfdb56: s390/cpumf: rename IBM z13/z14 counter names. Change the IBM z13/z14 counter names to be in sync with all other models.

Wed Apr 4 18:45:18 2018 -0400 Heike Jagode
    * src/libpfm4/README, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_knl_unc_cha.c, src/libpfm4/lib/pfmlib_intel_knl_unc_edc.c, src/libpfm4/lib/pfmlib_intel_knl_unc_imc.c, src/libpfm4/lib/pfmlib_intel_knl_unc_m2pcie.c, src/libpfm4/lib/pfmlib_intel_snbep_unc.c, src/libpfm4/lib/pfmlib_intel_snbep_unc_priv.h, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_x86.c: Update libpfm4. Current with:
      commit c4de2ea3b50fa14e66129b06619775840aafab2a: Add support for Intel KNM uncore events. This patch adds Intel Knights Mill uncore event support for: CHA uncore PMU, Integrated EDRAM uncore PMU, Integrated Memory Controller (IMC) uncore PMU, M2PCIe uncore PMU. It is based on the Knights Landing event table, which is shared with Knights Mill.

2018-04-02 Heike Jagode
    * src/papi_events.csv: PAPI preset event support for Intel Knights Mill.
Mon Mar 19 23:53:23 2018 -0700 Stephane Eranian * src/libpfm4/README, src/libpfm4/docs/Makefile, src/libpfm4/docs/man3/libpfm_intel_knm.3, src/libpfm4/docs/man3/libpfm_intel_skx_unc_cha.3, src/libpfm4/docs/man3/libpfm_intel_skx_unc_iio.3, src/libpfm4/docs/man3/libpfm_intel_skx_unc_imc.3, src/libpfm4/docs/man3/libpfm_intel_skx_unc_irp.3, src/libpfm4/docs/man3/libpfm_intel_skx_unc_m2m.3, src/libpfm4/docs/man3/libpfm_intel_skx_unc_m3upi.3, src/libpfm4/docs/man3/libpfm_intel_skx_unc_pcu.3, src/libpfm4/docs/man3/libpfm_intel_skx_unc_ubo.3, src/libpfm4/docs/man3/libpfm_intel_skx_unc_upi.3, src/libpfm4/examples/check_events.c, src/libpfm4/examples/showevtinfo.c, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/events/intel_bdw_events.h, src/libpfm4/lib/events/intel_bdx_unc_cbo_events.h, src/libpfm4/lib/events/intel_bdx_unc_ha_events.h, src/libpfm4/lib/events/intel_bdx_unc_imc_events.h, src/libpfm4/lib/events/intel_bdx_unc_irp_events.h, .../lib/events/intel_bdx_unc_r3qpi_events.h, src/libpfm4/lib/events/intel_bdx_unc_sbo_events.h, .../lib/events/intel_ivbep_unc_pcu_events.h, src/libpfm4/lib/events/intel_skl_events.h, src/libpfm4/lib/events/intel_skx_unc_cha_events.h, src/libpfm4/lib/events/intel_skx_unc_iio_events.h, src/libpfm4/lib/events/intel_skx_unc_imc_events.h, src/libpfm4/lib/events/intel_skx_unc_irp_events.h, src/libpfm4/lib/events/intel_skx_unc_m2m_events.h, .../lib/events/intel_skx_unc_m3upi_events.h, src/libpfm4/lib/events/intel_skx_unc_pcu_events.h, src/libpfm4/lib/events/intel_skx_unc_ubo_events.h, src/libpfm4/lib/events/intel_skx_unc_upi_events.h, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_bdx_unc_pcu.c, src/libpfm4/lib/pfmlib_intel_hswep_unc_pcu.c, src/libpfm4/lib/pfmlib_intel_ivbep_unc_pcu.c, src/libpfm4/lib/pfmlib_intel_knl.c, src/libpfm4/lib/pfmlib_intel_skx_unc_cha.c, src/libpfm4/lib/pfmlib_intel_skx_unc_iio.c, src/libpfm4/lib/pfmlib_intel_skx_unc_imc.c, src/libpfm4/lib/pfmlib_intel_skx_unc_irp.c, 
src/libpfm4/lib/pfmlib_intel_skx_unc_m2m.c, src/libpfm4/lib/pfmlib_intel_skx_unc_m3upi.c, src/libpfm4/lib/pfmlib_intel_skx_unc_pcu.c, src/libpfm4/lib/pfmlib_intel_skx_unc_ubo.c, src/libpfm4/lib/pfmlib_intel_skx_unc_upi.c, src/libpfm4/lib/pfmlib_intel_snbep_unc.c, .../lib/pfmlib_intel_snbep_unc_perf_event.c, src/libpfm4/lib/pfmlib_intel_snbep_unc_priv.h, src/libpfm4/lib/pfmlib_intel_x86.c, src/libpfm4/lib/pfmlib_intel_x86_perf_event.c, src/libpfm4/lib/pfmlib_intel_x86_priv.h, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/lib/pfmlib_s390x_cpumf.c, src/libpfm4/perf_examples/perf_util.c, src/libpfm4/python/self.py, src/libpfm4/python/src/pmu.py, src/libpfm4/python/sys.py, src/libpfm4/tests/validate.c, src/libpfm4/tests/validate_x86.c: Update libpfm4. Current with:
      commit 7987ff8978d4ceef07a539e822c1b582f8924720 (HEAD -> master, origin/master, origin/HEAD): fix 32-bit compile on skx_cha_filt0. The bit field was too wide, so break it in two to keep gcc -m32 happy.
      commit d60d8955580e7f27c5a269c636d6dcb50eef287d: Add support for Intel KNM core events. This patch adds Intel Knights Mill core event support for libpfm4. It is based on the Knights Landing event table, which is shared with Knights Mill.
      commit 3fdae82b5e028a388510798f1f0c84d1139a1735: fix headers on Intel Skylake Uncore PMU files. Fix the header with proper copyright line.
      commit e26ca9492ae26a0150b81828306af3a6e132e488: Fix empty event descriptions for Intel Broadwell-EP uncore PMUs. Now that we have empty description detection, fix the ones detected in the Intel Broadwell-EP uncore PMUs.
      commit 05fc5910b78526d3cba3160713f467d9dcc0774b: detect empty event/umask descriptions on Intel processors. This patch adds a validation test to detect empty descriptions for events and umasks on Intel X86 processors.
2018-02-28 John Henry
    * INSTALL.txt: Fixed typo: --with_bitmode=32 changed to --with-bitmode=32.

2018-02-23 Vince Weaver
    * src/ctests/hl_rates.c, src/ctests/inherit.c, src/papi_hl.c: ctests: change a few more test results from FAIL to SKIP when paranoid=3. There are some more that fail, but their failure errors make no sense and are coming from deep within PAPI, so I am not sure I can easily fix things without making it worse.
    * src/ctests/attach2.c, src/ctests/attach3.c, src/ctests/attach_cpu.c: ctests: attach tests, skip instead of fail if not enough permissions
    * src/papi_internal.c: papi_internal: whitespace cleanup (no code changes)
    * src/utils/papi_avail.c, src/utils/papi_native_avail.c: utils: papi_avail/native_avail suggest papi_component_avail if no events detected. If no events are detected, let the user know they should use papi_component_avail to find out why.
    * src/papi_internal.c: papi_internal: comment the error generation code. I'm not sure why we generate things this way, but it makes it really confusing when adding a new error.
    * src/genpapifdef.c, src/papi.h, src/papi_common_strings.h, src/papi_internal.c: add new PAPI_ECMP_DISABLED error. We can return this error if an event is added but the component involved is disabled. If a user moves working code to a system where perf_event_paranoid is set to 3 (all perf events disabled) they will now get an error indicating the component is disabled, rather than an "event not found" error, which was confusing.
    * src/ctests/zero.c: ctests: zero: print full error message if cannot add

2018-02-21 Vince Weaver
    * src/utils/papi_component_avail.c: utils: papi_component_avail: fix the NAME field in the auto-generated manpage. Steve Kaufmann noticed that the papi_component_avail NAME field for the auto-generated manpage for some reason had info for papi_native_avail instead.
2018-02-16 Heike Jagode
    * src/configure, src/configure.in: Fixed compilation error that occurs with the deprecated option '-openmp' when using a more current icc compiler. Replaced with '-qopenmp'. Tested with: icc/2016.0, icc/2017.4, icc/2018, icc/2018.1. Reported by Preeti Suman from Intel.

Wed Feb 7 09:51:16 2018 -0800 Stephane Eranian
    * src/libpfm4/lib/events/intel_skl_events.h, src/libpfm4/lib/events/s390x_cpumf_events.h, src/libpfm4/lib/pfmlib_s390x_cpumf.c, src/libpfm4/lib/pfmlib_s390x_priv.h: Update libpfm4. Current with:
      commit 8f2653b8e2e18bad44ba1acc7f92c825f226ef71: s390/cpumf: add support for IBM z14 counters. Add counter definitions for the IBM z14 hardware model. With z14, the counters in the problem-state set are reduced and the counter first number version is increased accordingly. Now, the counters are processed depending on the counter facility versions.
      commit 96c0847f524b0b23e189478315587abf35cbf774: add CORE_SNOOP_RESPONSE event for Intel Skylake. This is a newly disclosed event of the Intel Skylake Core PMU. Based on the download.01.org skylakex_core_v1.06.json event table.

Thu Jan 25 19:23:45 2018 -0800 Stephane Eranian
    * src/libpfm4/config.mk, src/libpfm4/debian/changelog, src/libpfm4/lib/events/perf_events.h, src/libpfm4/lib/pfmlib_perf_event_pmu.c, src/libpfm4/tests/Makefile, src/libpfm4/tests/validate.c, src/libpfm4/tests/validate_perf.c: Update libpfm4. Current with:
      commit 18e3c1f0254ab9323ac848643b8e042e65cf5259: Add minimal perf_events generic events validation. This patch adds a small validation test suite for the generic PMU events provided by the perf_events interface. This is specific to Linux. This patch modifies the validate.c file to handle the new perf_events test suite.
2018-01-24  Vince Weaver

	* src/components/Makefile_comp_tests.target.in,
	src/components/perf_event_uncore/tests/Makefile,
	src/ctests/Makefile.recipies, src/ctests/Makefile.target.in,
	src/ftests/Makefile.target.in, src/utils/Makefile.target.in,
	src/validation_tests/Makefile.target.in: build: fix various
	LDFLAGS/CFLAGS issues.  Issues were reported by Andreas
	Beckmann.

2018-01-22  Vince Weaver

	* src/utils/papi_cost.c: utils: papi_cost: use getopt() to
	parse the command line rather than open-coding one.  The
	existing code was fragile, and as far as I can tell the -b
	option hadn't worked for a long time.

	* src/utils/papi_cost.c: utils: papi_cost: various minor
	cleanups to the code.

	* src/utils/cost_utils.c, src/utils/cost_utils.h,
	src/utils/papi_cost.c: utils: papi_cost: add -p option for
	printing boxplot percentages.  Makes generating boxplots from
	the results much easier.

2018-01-05  John Henry

	* release_procedure.txt: Fix typo in release_procedure.txt.
	Missing do

2020-02-27  Steven Kaufmann

	* src/components/infiniband/tests/Makefile: Making MPI tester
	optional.

2020-02-22  Frank Winkler

	* src/papi_fwrappers.c: Added Fortran wrappers for
	PAPI_rate_stop and PAPI_hl_stop.  Also fixed doxygen
	documentation for PAPI_flops_rate.

2020-02-21  Anthony Castaldo

	* src/components/rocm/tests/square.cpp,
	src/components/rocm/tests/square.cu,
	src/components/rocm/tests/square.hipref.cpp,
	src/components/rocm_smi/linux-rocm-smi.c: Deleted test files
	from the repository, and commented-out debug lines from
	rocm_smi.

	* src/components/rocm/linux-rocm.c,
	src/components/rocm/tests/Makefile,
	src/components/rocm/tests/rocm_all.cpp: Added patches provided
	by Evgeny Shcherbakov (AMD), and corrected bugs in
	rocm_all.cpp.  Tested and now functions as expected.
2020-02-20  Anthony

	* src/components/sde/tests/Makefile, src/configure,
	src/configure.in: Added -lrt to LIBS (if needed) so that it
	propagates into the pkg-config file papi.pc.  Also removed the
	explicit flag from the SDE tests Makefile.

2020-02-19  Anthony

	* src/components/sde/sde_internal.h, src/configure,
	src/configure.in: Enabled overflow by default in SDE and added
	-lrt detection in the configure script.

2020-02-19  Anthony Castaldo

	* src/components/rocm/tests/rocm_all.cpp: Reconciling this
	version of rocm_all.cpp with another pull request.

2020-02-18  Anthony Castaldo

	* src/components/cuda/linux-cuda.c: Correct cuda push/pop
	context consistency.  In _cuda_cleanup_eventset we attempt to
	push the current cuda context, set a new cuda context to do
	some cleanup, then restore the original context with a pop
	(cuCtxPushCurrent, cuCtxPopCurrent).  This was failing.  We
	corrected it by doing a Save+Restore instead of a Push+Pop,
	using cuCtxGetCurrent and cuCtxSetCurrent, different routines
	that do not require the cuda context stack and have fewer
	restrictions on their use.

2020-02-16  Daniel Barry

	* src/counter_analysis_toolkit/main.c: Added check for whether
	or not the user provided a benchmark category.  When using the
	Counter Analysis Toolkit, if the user did not supply a
	benchmark category, it will now run the 'branch' benchmark by
	default and inform the user of such.  The 'branch' benchmark
	executes the most quickly of all the categories, making it a
	suitable default.  These changes were tested on the Intel
	Haswell architecture.

2020-02-13  Frank Winkler

	* src/run_tests.sh: Small change in the test script based on
	commit 14cebbc.  We have changed the high-level environment
	variable PAPI_NO_WARNING to PAPI_HL_VERBOSE.  Also, verbose
	output is off by default, which is why this variable is no
	longer needed in the test script.
2020-02-13  Anthony Castaldo

	* src/components/cuda/linux-cuda.c: Modifications for more
	thorough error-checking in routines before using pointers
	(ensuring they are non-NULL).  Suggested by Steve Kaufmann.

2020-02-11  Anthony Castaldo

	* src/components/rocm_smi/linux-rocm-smi.c: Removed a debug
	message.

2020-02-10  Anthony Castaldo

	* src/components/rocm_smi/linux-rocm-smi.c: Corrects a problem
	producing a segfault.  The function MakeRoomAllEvents() can
	realloc() a table, but this can make the use of a pointer into
	the former area produce a segfault.

2020-01-31  Anthony Castaldo

	* src/components/rocm_smi/linux-rocm-smi.c,
	src/components/rocm_smi/tests/ROCM_SMI_Makefile,
	src/components/rocm_smi/tests/rocmcap_plot.cpp: A new utility
	added to tests, and debug lines (commented out) in component
	code until the SMI library problem with power events is sorted
	out.

	* src/components/io/linux-io.c: We have to fopen/fclose the
	system file for every read; otherwise Linux caches the file and
	reports the same values every time.

2020-01-30  Anthony

	* src/high-level/papi_hl.c: Turned verbosity of the HL API off
	by default.

2020-01-30  Anthony Castaldo

	* src/components/io/linux-io.c: Rewrite to use ctx and ctl
	structures for thread safety.

2020-01-29  Anthony Castaldo

	* src/components/rocm/Rules.rocm,
	src/components/rocm/tests/rocm_all.cpp: Corrected a typo in
	Rules.rocm, and cleaned up a test program, rocm_all.cpp.

	* src/components/io/linux-io.c: Provided some insurance that io
	component initialization occurs only once.

2020-01-29  Daniel Barry

	* src/counter_analysis_toolkit/branch.c,
	src/counter_analysis_toolkit/dcache.c,
	src/counter_analysis_toolkit/flops.c,
	src/counter_analysis_toolkit/gen_seq_dlopen.sh: Removed
	unnecessary error reporting.  Some error messages from the CAT
	benchmarks were removed so as not to cause extraneous output.
	These changes were tested on the Intel Broadwell architecture.

2020-01-28  Anthony

	* src/counter_analysis_toolkit/main.c: Avoid computing the
	latencies twice.
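The per-read open/read/close pattern adopted by the io component (2020-01-31 entry above) can be sketched like this.  It is a minimal illustration, not the component's actual code: `read_proc_counter` and its "key: value" format are hypothetical stand-ins for parsing /proc/self/io.

```c
#include <stdio.h>
#include <string.h>

/* Read one "key: value" counter from a /proc-style stats file,
 * opening and closing the file on every call.  Per the changelog
 * entry, holding the file open and re-reading it reported the same
 * stale values every time; a fresh fopen() gets current counters.
 * Returns 0 on success, -1 if the file or key is unavailable. */
int read_proc_counter(const char *path, const char *key, long long *out)
{
    FILE *fp = fopen(path, "r");        /* fresh open on every read */
    if (!fp)
        return -1;

    char name[64];
    long long val;
    int rc = -1;
    while (fscanf(fp, " %63[^:]: %lld", name, &val) == 2) {
        if (strcmp(name, key) == 0) {
            *out = val;
            rc = 0;
            break;
        }
    }
    fclose(fp);                         /* close so the next call re-reads */
    return rc;
}
```

On Linux this would be called with `"/proc/self/io"` and keys such as `rchar` or `wchar`; the sketch works with any file in that format.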
	* src/components/sde/sde.c: Updated the info that is reported
	by the component about itself.

2020-01-28  Daniel Barry

	* src/counter_analysis_toolkit/flops.c: Fixed bug in FLOPS
	benchmark.  The FLOPS benchmarks ensure that the compiler does
	not discard the results of the numerical kernels.  A
	double-precision benchmark was ensuring that the
	single-precision result was not discarded, instead of the
	double-precision result.  This has now been corrected.  This
	was tested on the Intel Broadwell architecture.

2020-01-28  Anthony

	* src/counter_analysis_toolkit/dcache.c,
	src/counter_analysis_toolkit/dcache.h,
	src/counter_analysis_toolkit/driver.h,
	src/counter_analysis_toolkit/gen_seq_dlopen.sh,
	src/counter_analysis_toolkit/icache.c,
	src/counter_analysis_toolkit/icache.h,
	src/counter_analysis_toolkit/main.c: Added code to show
	progress if the user asks for it (-verbose flag), and removed
	confusing error messages and dead code.

2020-01-28  Daniel Barry

	* src/counter_analysis_toolkit/main.c: Per the sscanf man page,
	it is unnecessary to call free() in this block since memory for
	the string would not be allocated.  This was tested on the AMD
	EPYC architecture.

2020-01-27  Daniel Barry

	* src/counter_analysis_toolkit/driver.h,
	src/counter_analysis_toolkit/main.c: Added checks for negative
	amounts of qualifiers provided by the user.  Previously, there
	was a bug caused by a user providing a negative number of
	qualifiers.  Now, if a user does provide a negative number of
	qualifiers, this number is set to zero.  This fix was tested on
	the AMD EPYC architecture.

2020-01-27  Anthony

	* src/components/perf_event/pe_libpfm4_events.c: Fixed problems
	with debug macro.
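The sscanf() detail behind the 2020-01-28 Daniel Barry entry above can be illustrated as follows: a plain `%s` conversion writes into a caller-supplied buffer and allocates nothing, so there is nothing to free(); only the POSIX/glibc `%ms` modifier makes sscanf allocate.  The function names and the event-line format here are invented for the example, not CAT's actual code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse "EVENT_NAME count": "%63s" writes into the caller-owned
 * array, so no allocation happens and no free() is needed. */
int parse_event_line(const char *line, char name[64], int *count)
{
    return sscanf(line, "%63s %d", name, count) == 2;
}

/* Contrast: POSIX "%ms" is the variant that DOES allocate; its
 * result must be freed by the caller. */
char *dup_first_token(const char *line)
{
    char *tok = NULL;
    if (sscanf(line, "%ms", &tok) != 1)
        return NULL;
    return tok;   /* caller must free() this one */
}
```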
2020-01-24  Damien Genet

	* src/components/infiniband/tests/Makefile: Adds missing rule
	for compilation of MPI test.

2020-01-24  Anthony Castaldo

	* src/components/perf_event/pe_libpfm4_events.c: New libpfm4
	contains "aliased" pmus for backward compatibility,
	amd64_fam17h == amd64_fam17h_zen1; this caused us to put BOTH
	pmus into the PMUs-supported string and double the events in
	native_avail.  This update recognizes when aliases exist (the
	names must be hard-coded) and uses only the most recent name.

2020-01-23  Heike Jagode

	* src/components/infiniband_umad/README.md,
	.../infiniband_umad/Rules.infiniband_umad,
	.../infiniband_umad/linux-infiniband_umad.c,
	.../infiniband_umad/linux-infiniband_umad.h,
	src/components/infiniband_umad/tests/Makefile,
	.../tests/infiniband_umad_list_events.c,
	.../tests/infiniband_umad_values_by_code.c: Retirement of the
	infiniband_umad component.  With the latest advancements of the
	infiniband component, infiniband_umad has become redundant.

2020-01-22  Damien Genet

	* src/components/Makefile_comp_tests.target.in: Propagating
	MPICC to component tests.

	* src/components/infiniband/linux-infiniband.c,
	.../infiniband/tests/MPI_test_infiniband_events.c: snprintf
	return value, a classic now.  And the 3-space indentation.
2019-09-04  Rizwan-ICL

	* src/components/infiniband/linux-infiniband.c,
	.../infiniband/tests/MPI_test_infiniband_events.c,
	src/components/infiniband/tests/Makefile: Added descriptions
	for events of the infiniband component using documentation
	provided by Mellanox; added test code to exercise the various
	events in the infiniband component and modified the Makefile to
	compile the test code.

2020-01-22  Damien Genet

	* src/components/powercap_ppc/README,
	src/components/powercap_ppc/Rules.powercap_ppc,
	src/components/powercap_ppc/linux-powercap-ppc.c,
	src/components/powercap_ppc/linux-powercap-ppc.h,
	src/components/powercap_ppc/tests/Makefile,
	src/components/powercap_ppc/tests/powercap_basic.c,
	src/components/powercap_ppc/tests/powercap_limit.c: Merged in
	feature/powercap_ppc (pull request #34).  Powercapping for the
	IBM PowerPC architecture, Power9 processors.  Adds 2 tests for
	the powercap component on the PPC architecture (Power9).
	Approved-by: adanalis.  Approved-by: Anthony Castaldo.

2020-01-22  Frank Winkler

	* src/high-level/scripts/papi_hl_output_writer.py: Fixed bug
	for python3.  dict.iteritems() was removed in python3; use
	dict.items() instead.  The output script works for both python2
	and python3.

	* src/papi.c: Bug fix for an issue that was caused by commit
	db01193.

	* src/examples/PAPI_flops.c: Improved some comments.

2020-01-21  Damien Genet

	* src/components/sensors_ppc/linux-sensors-ppc.c: Adds missing
	checks for snprintf.  A return value larger than the buffer is
	not really an error, just a poor design, but whatever.

2020-01-20  Frank Winkler

	* src/examples/PAPI_mix_hl_ll.c,
	src/examples/PAPI_mix_hl_rate.c,
	src/examples/PAPI_mix_ll_rate.c, src/papi.c, src/papi.h:
	Renamed papi_rate_stop to papi_stop_events.

	* src/high-level/papi_hl.c: Fixed bug.  Check for empty string
	in PAPI_EVENTS.

	* src/high-level/papi_hl.c, src/papi.c, src/papi_internal.c,
	src/papi_internal.h: Fixed typo.

	* src/high-level/papi_hl.c: Improved cleanup function.
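The snprintf return-value check referred to in the 2020-01-21 sensors_ppc entry above works because snprintf returns the length the full string would have had: a value greater than or equal to the buffer size means truncation, not a hard error.  A minimal sketch (the `build_event_name` helper and `:::` separator are illustrative, not the component's code):

```c
#include <stdio.h>

/* Build "component:::event"; returns 0, or -1 if it did not fit. */
int build_event_name(char *buf, size_t len, const char *comp, const char *ev)
{
    int n = snprintf(buf, len, "%s:::%s", comp, ev);
    if (n < 0)
        return -1;          /* genuine encoding error */
    if ((size_t)n >= len)
        return -1;          /* output was truncated: buffer too small */
    return 0;               /* fits, including the terminating NUL */
}
```

Distinguishing the two cases is the point: truncation can be handled by retrying with a larger buffer, while a negative return is a real failure.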
2020-01-18  Frank Winkler

	* src/examples/Makefile, src/examples/PAPI_mix_hl_ll.c,
	src/examples/PAPI_mix_hl_rate.c,
	src/examples/PAPI_mix_ll_rate.c, src/papi.c: Added examples
	that show how to mix hl, ll, and rate functions.

2020-01-17  Frank Winkler

	* src/high-level/papi_hl.c, src/papi.c, src/papi.h,
	src/papi_internal.c, src/papi_internal.h: Added feature that
	allows mixing of rate functions and hl functions.

2020-01-16  Anthony Castaldo

	* src/papi_events.csv: Added two machine types to
	papi_events.csv to be in line with the libpfm4 update that
	supports amd64_fam17h_zen1 and zen2.

2020-01-16  Anthony

	* src/components/sde/tests/Makefile: Fixed dependency in
	Makefile.

2020-01-16  Frank Winkler

	* src/papi.c, src/papi.h: Added PAPI_rate_stop() that stops any
	rate function.

2020-01-16  Damien Genet

	* src/components/sensors_ppc/README,
	src/components/sensors_ppc/Rules.sensors_ppc,
	src/components/sensors_ppc/linux-sensors-ppc.c,
	src/components/sensors_ppc/linux-sensors-ppc.h,
	src/components/sensors_ppc/tests/Makefile,
	.../sensors_ppc/tests/sensors_ppc_basic.c: Add new component
	for sensors reading on PowerPC 9.  Enable with
	./configure --with-components="sensors_ppc".

2020-01-16  Frank Winkler

	* src/run_tests.sh: Fixed small bug in the test script.  The
	output directory of the high-level API has been renamed from
	papi to papi_hl_output.

2020-01-16  Anthony Castaldo

	* src/components/rocm_smi/Rules.rocm_smi,
	src/components/rocm_smi/linux-rocm-smi.c: Changed the Rules
	file to look in multiple places for rocm_smi.h; it moved
	between rocm releases.  Rewrote a routine to be more efficient
	and eliminate a string-size warning.  Made some diagnostic
	outputs that were left active in a previous commit dependent on
	#ifdef macros.

2020-01-15  Frank Winkler

	* src/high-level/papi_hl.c: Fixed memory leak in the high-level
	API.  Based on commit ef20e24, which fixed a bug by deleting a
	"free" call, the "free" call is now done in the last function
	of the high-level API, which is called during the "atexit()"
	call.
2020-01-14  Anthony

	* .../sde/tests/Advanced_C+FORTRAN/Gamum.c,
	.../sde/tests/Advanced_C+FORTRAN/Xandria.F90,
	.../sde/tests/Advanced_C+FORTRAN/sde_test_f08.F90,
	src/components/sde/tests/Gamum.c,
	src/components/sde/tests/Makefile,
	src/components/sde/tests/Minimal/Minimal_Test.c,
	src/components/sde/tests/Minimal_Test.c,
	src/components/sde/tests/Recorder.c,
	.../sde/tests/Recorder/Lib_With_Recorder.c,
	.../sde/tests/Recorder/Recorder_Driver.c,
	src/components/sde/tests/Simple/Simple_Driver.c,
	src/components/sde/tests/Simple/Simple_Lib.c,
	src/components/sde/tests/Simple2/Simple2_Driver.c,
	src/components/sde/tests/Simple2/Simple2_Lib.c,
	src/components/sde/tests/Xandria.F90,
	src/components/sde/tests/sde_test_f08.F90: Added new
	tests/examples under the SDE component and organized them based
	on complexity.

	* src/components/sde/sde.c: Improved and corrected the checks
	that relate to counter groups and recorders.

2020-01-13  Anthony

	* src/utils/Makefile, src/utils/papi_sde_interface.c: Added the
	weak symbols for SDE to papi_native_avail, so the utility works
	when PAPI is not configured with the SDE component.

	* src/utils/papi_avail.c, src/utils/papi_native_avail.c:
	Improved the code that checks the command-line arguments.

2020-01-06  Anthony

	* src/components/sde/sde.c, src/components/sde/sde_internal.h,
	src/utils/papi_native_avail.c: Moved the responsibility of
	listing SDEs of a library/executable to papi_native_avail
	instead of the SDE component.

	* src/papi_internal.c: Updated the variables that are used in
	the debug messages in accordance with a previous commit that
	made these variables thread safe.

2020-01-03  Frank Winkler

	* src/high-level/scripts/papi_hl_output_writer.py: Changed the
	names of some derived metrics.

	* src/high-level/papi_hl.c,
	src/high-level/scripts/papi_hl_output_writer.py: Added new
	derived metrics.

	* src/high-level/papi_hl.c: Small format changes.

	* src/high-level/papi_hl.c: Fixed bug in the high-level API
	caused by commit ff8ff65.
	The creation of the measurement directory failed since Coverity
	freed memory of a string that was used later to create the
	measurement directory.

2020-01-02  Frank Winkler

	* src/high-level/papi_hl.c,
	src/validation_tests/Makefile.recipies,
	src/validation_tests/flops_validation_hl.c,
	src/validation_tests/fp_validation_hl.c: Revised default events
	for flops and flips.

2019-12-20  Frank Winkler

	* src/papi.c: papi.c edited online with Bitbucket.

	* src/examples/high_level.c: high_level.c edited online with
	Bitbucket.

	* src/examples/PAPI_ipc.c: PAPI_ipc.c edited online with
	Bitbucket.

	* src/examples/PAPI_flops.c: PAPI_flops.c edited online with
	Bitbucket.

	* src/examples/PAPI_flips.c: PAPI_flips.c edited online with
	Bitbucket.

	* src/examples/PAPI_epc.c: PAPI_epc.c edited online with
	Bitbucket.

2019-12-19  Anthony

	* src/components/sde/sde.c, src/components/sde/sde_internal.h:
	Fixed issues in the SDE component unveiled by Coverity.

2019-12-19  Daniel Barry

	* src/counter_analysis_toolkit/main.c: Fixed typo in comment
	for argument parsing.

2019-12-19  Frank Winkler

	* src/libpapi.exp: Fixed typo.

	* src/ctests/bgp/Makefile, src/ctests/bgp/papi_1.c,
	src/libpapi.exp: Further clean-up.

2019-12-19  Daniel Barry

	* src/counter_analysis_toolkit/Makefile,
	src/counter_analysis_toolkit/caches.h,
	src/counter_analysis_toolkit/dcache.c,
	src/counter_analysis_toolkit/dcache.h,
	src/counter_analysis_toolkit/driver.h,
	src/counter_analysis_toolkit/main.c,
	src/counter_analysis_toolkit/timing_kernels.c,
	src/counter_analysis_toolkit/timing_kernels.h: Removed
	unnecessary variables and checks.  Refactored code blocks.
	Added comments in the main driver file.

2019-12-19  Frank Winkler

	* src/ctests/bgp/papi_1.c, src/libpapi.exp: Clean-up of old
	high-level functions.

2019-12-18  Frank Winkler

	* man/man1/papi_component_avail.1: Fixed typo in
	papi_component_avail.1.  See pull request #2.

2019-12-16  Anthony

	* src/counter_analysis_toolkit/Makefile: Renamed cit_collect to
	cat_collect.
	* src/counter_analysis_toolkit/eventstock.c: Clarified comment.

	* src/counter_analysis_toolkit/branch.c,
	src/counter_analysis_toolkit/driver.h,
	src/counter_analysis_toolkit/eventstock.c,
	src/counter_analysis_toolkit/eventstock.h,
	src/counter_analysis_toolkit/flops.c,
	src/counter_analysis_toolkit/gen_seq_dlopen.sh,
	src/counter_analysis_toolkit/main.c: Removed unnecessary work
	when setting up the list of events, and minor cosmetic changes.

2019-12-16  Daniel Barry

	* src/counter_analysis_toolkit/flops.c: Cleaned up comments.

2019-12-16  Anthony Castaldo

	* src/components/rapl/tests/rapl_overflow.c: Corrected a
	working but convoluted line of code.

2019-12-13  Frank Winkler

	* src/examples/PAPI_flips.c, src/examples/PAPI_flops.c,
	src/papi.c: Minor documentation corrections.

	* src/papi.h: Fixed some thread definitions.

	* src/high-level/papi_hl.c, src/papi.h: Revised documentation
	of the high-level API.

	* src/high-level/papi_hl.c,
	src/high-level/scripts/papi_hl_output_writer.py: Renamed the
	output directory of the high-level API from 'papi' to
	'papi_hl_output'.

	* src/papi.c: Revised documentation.

	* src/examples/PAPI_epc.c, src/examples/PAPI_flips.c,
	src/examples/PAPI_flops.c, src/examples/PAPI_ipc.c,
	src/papi.c: Adjusted doxygen documentation.

2019-12-12  Frank Winkler

	* src/examples/Makefile, src/examples/PAPI_flips.c,
	src/examples/PAPI_flops.c, src/examples/PAPI_ipc.c,
	src/examples/high_level.c, src/papi.c, src/papi.h,
	src/papi_fwrappers.c: Reimplemented rate functions and adjusted
	examples.

2019-12-11  Daniel Barry

	* src/counter_analysis_toolkit/branch.c,
	src/counter_analysis_toolkit/dcache.c,
	src/counter_analysis_toolkit/flops.c,
	src/counter_analysis_toolkit/gen_seq_dlopen.sh: Added
	PAPI_cleanup_eventset() call to each of the benchmarks.  This
	removes events from the event set.  By including these calls,
	the benchmarks do not encounter the PAPI_ECOUNT error code,
	which occurs if there are too many events added to the same
	event set.
	These changes were tested on the Intel Skylake architecture.

2019-12-10  Anthony Castaldo

	* src/components/rocm_smi/README,
	src/components/rocm_smi/Rules.rocm_smi,
	src/components/rocm_smi/tests/rocm_smi_all.txt: Minor changes
	to text, and a setting that was for development only.

2019-12-10  Frank Winkler

	* src/papi.c: Made rate functions thread safe.

2019-12-09  Anthony

	* src/utils/Makefile: Changed the order of the linker flags so
	that -ldl is at the end, since libpapi.a needs libdl.so but not
	the other way around.

2019-12-06  Heike Jagode

	* README.md: README.md edited online with Bitbucket.

2019-12-06  Steve Kaufmann

	* src/components/rocm/linux-rocm.c, src/papi_events.csv: The
	changes here are based on a patch provided by Steve Kaufmann,
	to correct a misnamed event in papi_events.csv and prevent a
	segfault in rocm when a context pointer is null.  Additional
	changes by Tony Castaldo check whether the necessary
	rocprofiler environment variables have been set, and disable
	the component if they are not, with an informative reason to be
	reported by papi_component_avail.  (The component will not work
	without them.)

2019-12-05  Frank Winkler

	* src/papi.c: Replaced HighLevelInfo with RateInfo.

2019-12-03  Anthony Castaldo

	* src/extras.c: Removed extra '#' in "%#p" print formats, using
	just '%p'.

2019-12-03  William Cohen

	* src/testlib/papi_test.h, src/testlib/test_utils.c: Use the
	noreturn attribute only when the compiler supports GNU C
	extensions.

	* src/testlib/papi_test.h, src/testlib/test_utils.c: Properly
	mark some test_utils.c functions with noreturn attributes.
	Clang uses the information about whether a function returns in
	flow analysis to determine whether there are uses of null
	values and other possible problematic issues.  Marking the
	test_pass, test_hl_pass, test_fail, and test_skip functions
	properly with the noreturn attribute allows Clang to more
	accurately analyze the code and eliminates 87 false positive
	warnings in the PAPI testsuite code.
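The noreturn annotation described in the 2019-12-03 William Cohen entries above can be sketched as follows.  This is an illustration of the technique, not PAPI's testlib code: the `PAPI_NORETURN` macro name, `test_fail_demo`, and `checked_div` are invented for the example; the `__GNUC__` guard mirrors the "only when the compiler supports GNU C extensions" follow-up fix.

```c
#include <stdio.h>
#include <stdlib.h>

/* Expand to the GNU noreturn attribute only where it is supported. */
#if defined(__GNUC__)
#define PAPI_NORETURN __attribute__((noreturn))
#else
#define PAPI_NORETURN
#endif

/* A test-failure helper that never returns; the attribute lets the
 * compiler's flow analysis know code after a call is unreachable. */
static PAPI_NORETURN void test_fail_demo(const char *file, int line,
                                         const char *msg)
{
    fprintf(stderr, "%s:%d FAILED: %s\n", file, line, msg);
    exit(1);
}

int checked_div(int a, int b)
{
    if (b == 0)
        test_fail_demo(__FILE__, __LINE__, "division by zero");
    /* Without noreturn, Clang would consider the fall-through path
     * here live and could warn about the division below. */
    return a / b;
}
```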
2019-12-02  Anthony Castaldo

	* src/components/coretemp/linux-coretemp.c,
	src/components/infiniband/linux-infiniband.c,
	src/components/lmsensors/linux-lmsensors.c,
	src/components/lustre/linux-lustre.c,
	src/components/pcp/linux-pcp.c,
	src/components/pcp/tests/testPCP.c,
	src/components/perf_event/perf_event.c,
	.../perf_event_uncore/perf_event_uncore.c,
	src/components/rapl/linux-rapl.c, src/ctests/failed_events.c,
	src/ctests/kufrin.c, src/ctests/pthrtough.c,
	src/ctests/pthrtough2.c, src/extras.c,
	src/high-level/papi_hl.c, src/linux-common.c,
	src/linux-memory.c, src/testlib/clockcore.c,
	src/utils/cost_utils.c, src/utils/papi_command_line.c,
	src/utils/papi_multiplex_cost.c: The code in this commit all
	failed a Coverity scan (a code consistency tool) that correctly
	identified memory leaks, potential buffer overflows, and
	failures to close a file or directory that had been opened.

2019-12-02  Frank Winkler

	* src/papi.c, src/papi.h, src/papi_fwrappers.c: Reimplemented
	rate calls such as PAPI_flips, PAPI_flops, etc.  These calls
	are now part of the low-level API; PAPI_stop_rates() stops the
	counters.

2019-11-20  William Cohen

	* src/components/sde/Rules.sde: Limit Fortran 90 compiler
	options to SDE component Fortran 90 code.  Rules.sde added
	Fortran 90 options to FFLAGS that would end up being applied to
	other Fortran code being built in papi.  Unfortunately, the
	other code is F77 code and the options would cause the build to
	fail.

2019-11-21  Heike Jagode

	* README.md: README.md edited online with Bitbucket.

2019-11-14  Daniel Barry

	* src/counter_analysis_toolkit/main.c: Swapped lines 268 and
	269 of main.c so that the appropriate memory allocation is
	freed, and the pointer is then set to NULL.
2019-11-13  Anthony Castaldo

	* src/components/nvml/tests/Makefile,
	src/components/nvml/tests/nvmlcap_plot.cu,
	src/components/nvml/utils/Makefile,
	src/components/nvml/utils/README,
	src/components/nvml/utils/nvmlcap_plot.cu: For consistency with
	the powercap and rapl components, moved nvmlcap_plot.cu to a
	new nvml/utils/ directory.  New Makefile in nvml/utils/ and
	adjusted Makefile in nvml/tests/.  Created a new README for
	nvmlcap_plot.  No code changes, but tested configure and make
	of PAPI and nvmlcap_plot.

2019-11-08  Anthony Castaldo

	* src/components/rapl/linux-rapl.c: Fixed an inaccurate
	comment.

	* src/components/rapl/README,
	src/components/rapl/tests/rapl_overflow.c: Added a paragraph of
	usage info to the README; also reformatted existing comments to
	comply with the 80-char line limit, without changing their
	content.  rapl_overflow.c was confusing: it was not using the
	PACKAGE_ENERGY_CNT event to test for overflow, and the scaled
	value seemed to wrap in 85 ms.  This seemed to conflict with
	the results of rapl_wraparound, which computes a wraparound
	time of 85 minutes.  rapl_overflow.c is now in line with an
	80-90 minute wraparound value.

2019-11-07  Anthony Castaldo

	* src/components/rapl/linux-rapl.c: Changes to properly mask
	energy values to uint32, and accumulate them to return a 64-bit
	accumulator.  Verified wraparound time at approximately 85
	minutes (for a 32-bit read).  That is the maximum allowed time
	between reads; the 64-bit value returned should never wrap.
	(Some tabs converted to spaces in changed code.)

2019-11-01  Frank Winkler

	* src/high-level/papi_hl.c: Removed Doxygen documentation for
	internal functions and moved the code block for multiplex
	initialization.  PAPI_multiplex_init is only called after a
	successful PAPI_thread_init.

2019-10-31  Anthony Castaldo

	* src/components/perf_event/pe_libpfm4_events.c: Fixed a typo
	in the error message.
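The mask-and-accumulate scheme described in the 2019-11-07 rapl entry above can be sketched like this.  It is a minimal illustration of the general 32-bit-counter technique, not the component's actual code; the `energy_acc_t` type and function names are invented for the example.

```c
#include <stdint.h>

typedef struct {
    uint32_t last_raw;  /* previous reading, masked to 32 bits */
    uint64_t total;     /* monotonically growing 64-bit accumulator */
} energy_acc_t;

void energy_acc_init(energy_acc_t *a, uint64_t first_reading)
{
    a->last_raw = (uint32_t)(first_reading & 0xFFFFFFFFu);
    a->total = 0;
}

/* Fold a new raw reading into the accumulator and return the total.
 * Unsigned subtraction wraps modulo 2^32, which yields the correct
 * delta even when the hardware counter rolled over between reads --
 * as long as reads happen more often than the ~85-minute wraparound
 * period the entry mentions. */
uint64_t energy_acc_update(energy_acc_t *a, uint64_t raw_reading)
{
    uint32_t now = (uint32_t)(raw_reading & 0xFFFFFFFFu);
    a->total += (uint32_t)(now - a->last_raw);
    a->last_raw = now;
    return a->total;
}
```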
	* src/ctests/Makefile.recipies, src/ctests/filter_helgrind.c,
	src/papi.c, src/papi_internal.c, src/threads.c, src/threads.h:
	The changes to papi.c, papi_internal.c, threads.h and threads.c
	correct a race condition that was the result of all threads
	using the same two static variables (papi_event_code and
	papi_event_code_changed) to temporarily record a state of
	operation.  The solution was to make these variables unique per
	thread, using the ThreadInfo_t structure already provided in
	PAPI for such purposes.  The file krentel_pthread_race.c is a
	stress test to produce race conditions.  filter_helgrind.c
	reduces the volume of --tool=helgrind output to a more
	manageable summary.  Both are added to Makefile.recipies.

2019-10-31  William Cohen

	* src/ctests/krentel_pthreads_race.c: This code is a
	modification of krentel_pthreads.c, to better test some race
	conditions.  It is not included in the standard tests; it is a
	diagnostic that should be run with "valgrind --tool=helgrind".

2019-10-31  Anthony Castaldo

	* src/components/perf_event/pe_libpfm4_events.c: Changed SUBDBG
	error reporting in new code to a single message instead of two,
	before the unlock code (so no race condition on variables in
	the report).  Cosmetics.

2019-10-28  Daniel Barry

	* src/counter_analysis_toolkit/main.c: Added checks for
	improperly formatted lines in the user-provided event list.  If
	a line is missing a qualifier count, it is discarded.  If a
	provided event name is either not available on the architecture
	or contains qualifiers, the qualifier count is set to zero to
	prevent appending extraneous qualifiers, and the user is
	notified.  Also cleaned up string manipulation.  These changes
	were tested on the Intel Haswell architecture.
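The race fix described in the first 2019-10-31 entry above (shared statics made per-thread) can be illustrated with C11 thread-local storage.  This sketch uses `_Thread_local` as a stand-in for PAPI's actual mechanism, which stores the state in the per-thread ThreadInfo_t structure; the function names here are invented for the example.

```c
/* Each thread sees its own copy of these, so concurrent callers can
 * no longer clobber each other's temporarily recorded state -- the
 * race the changelog entry describes with file-scope statics. */
static _Thread_local int papi_event_code = 0;
static _Thread_local int papi_event_code_changed = 0;

void set_pending_event_code(int code)
{
    papi_event_code = code;
    papi_event_code_changed = 1;
}

/* Consume the pending code, if any; returns 1 if one was pending. */
int take_pending_event_code(int *code)
{
    if (!papi_event_code_changed)
        return 0;
    *code = papi_event_code;
    papi_event_code_changed = 0;
    return 1;
}
```

With plain `static` variables, two threads interleaving set/take calls could read each other's codes; with thread-local copies, each thread's set/take pair is private by construction.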
2019-10-25  Anthony Castaldo

	* src/components/perf_event/pe_libpfm4_events.c: In two places
	we exited the routine allocate_native_event() because we could
	not find a mask or attribute in an event name (because the
	event was supported but the given mask was not), and failed
	without unlocking the NAMELIB_LOCK or cleaning up allocated
	memory.  Added on those paths:
	+ free(msk_ptr);
	+ free(pmu_name);
	+ _papi_hwi_unlock( NAMELIB_LOCK );

2019-10-24  Anthony Castaldo

	* src/components/rocm_smi/linux-rocm-smi.c,
	src/components/rocm_smi/tests/ROCM_SMI_Makefile,
	src/components/rocm_smi/tests/rocm_smi_all.cpp,
	src/components/rocm_smi/tests/rocm_smi_all.txt: New events
	added, some bugs corrected.  ROCM_SMI_Makefile is modified to
	use the env variable $PAPI_ROCM_ROOT to make it easier to
	compile with a local version of the rocm_smi library.
	rocm_smi_all.txt is the output of a run of rocm_smi_all.cpp,
	which has been modified to handle strings, and to skip testing
	of events that bomb (unhandled exceptions in library code).
	NOTE: this code may still contain debug printing to stderr, to
	be removed in the final version after all issues are corrected.
	-Tony

2019-10-24  Frank Winkler

	* src/papi.h: Removed TLS definitions.

	* src/high-level/papi_hl.c, src/papi.h: Replaced
	PAPI_TLS_KEYWORD with THREAD_LOCAL_STORAGE_KEYWORD due to ABI
	conflicts.

2019-10-18  Anthony Castaldo

	* src/components/rocm_smi/Rules.rocm_smi,
	src/components/rocm_smi/linux-rocm-smi.c,
	src/components/rocm_smi/rocm_smi.h: This is a first installment
	of the rewrite of the rocm_smi component.  It currently
	requires a private install of the updated library (with
	iterators), a special Rules file, PAPI_ROCM_ROOT, and
	PAPI_ROCM_SMI_MAIN.  It works as far as executing
	utils/papi_native_avail, but none of the events have been
	tested yet by reading with PAPI code.
	-TC

2019-10-15  Damien Genet

	* src/components/nvml/linux-nvml.c: Merged in
	dgenet/papi/fix/nvml-rules (pull request #14).  Fixes error
	messages while detecting Rules.nvml.  Patch from Vince Weaver.
	Approved-by: Heike Jagode.  Approved-by: Damien Genet.
	Approved-by: Anthony Castaldo.

2019-10-09  Heike Jagode

	* README.md: Cleaning up README file.

	* README.md: README.md edited online with Bitbucket.

	* README: README edited online with Bitbucket.

	* README.md: README.md edited online with Bitbucket.

2019-10-08  Steve Kaufmann

	* src/components/cuda/linux-cuda.c: Corrected several cosmetic
	issues and typos, standardized naming, used PATH_MAX instead of
	a literal, and PAPI_MAX_STR_LEN instead of PAPI_MIN_STR_LEN.

2019-10-08  Frank Winkler

	* src/components/lmsensors/linux-lmsensors.c: Removed blank
	line.

	* src/components/lmsensors/linux-lmsensors.c: Replaced spaces
	with underscores in event names.

2019-10-06  Frank Winkler

	* src/papi_fwrappers.c: Corrected Doxygen documentation.

	* src/ctests/Makefile.recipies, src/ctests/mpi_hl.c,
	src/ctests/mpi_omp_hl.c, src/ctests/omp_hl.c,
	src/ctests/pthread_hl.c, src/ctests/serial_hl.c,
	src/ctests/serial_hl_advanced.c,
	src/ctests/serial_hl_ll_comb.c,
	src/ctests/serial_hl_ll_comb2.c, src/ftests/Makefile.recipies,
	src/ftests/serial_hl.F, src/ftests/serial_hl_advanced.F,
	src/high-level/papi_hl.c, src/papi.h, src/papi_fwrappers.c,
	src/testlib/ftests_util.F, src/testlib/papi_test.h,
	src/testlib/test_utils.c,
	src/validation_tests/flops_validation_hl.c: Removed advanced
	functions from the new high-level API.
	The new high-level API consists of three functions:
	- PAPI_hl_region_begin
	- PAPI_hl_read
	- PAPI_hl_region_end
	Validation test in C:
	- src/validation_tests/flops_validation_hl.c
	Test examples in C:
	- src/ctests/serial_hl.c
	- src/ctests/omp_hl.c
	- src/ctests/pthread_hl.c
	- src/ctests/mpi_hl.c
	- src/ctests/mpi_omp_hl.c
	- src/ctests/serial_hl_ll_comb.c
	Test example in Fortran:
	- src/ftests/serial_hl.F

2019-10-03  Damien Genet

	* src/counter_analysis_toolkit/branch.c,
	src/counter_analysis_toolkit/branch.h,
	src/counter_analysis_toolkit/dcache.c,
	src/counter_analysis_toolkit/dcache.h,
	src/counter_analysis_toolkit/eventstock.c,
	src/counter_analysis_toolkit/eventstock.h,
	src/counter_analysis_toolkit/flops.c,
	src/counter_analysis_toolkit/gen_seq_dlopen.sh,
	src/counter_analysis_toolkit/icache.c,
	src/counter_analysis_toolkit/icache.h,
	src/counter_analysis_toolkit/main.c,
	src/counter_analysis_toolkit/prepareArray.c,
	src/counter_analysis_toolkit/prepareArray.h,
	src/counter_analysis_toolkit/timing_kernels.c,
	src/counter_analysis_toolkit/timing_kernels.h: Adding checks.

2019-10-04  Anthony Danalis

	* src/components/lmsensors/linux-lmsensors.c: Fixed
	inconsistency in component name.

2019-09-30  Anthony Danalis

	* src/components/sde/README, src/components/sde/Rules.sde,
	src/components/sde/interface/papi_sde_interface.c,
	src/components/sde/interface/papi_sde_interface.h,
	src/components/sde/sde.c, src/components/sde/sde_F.F90,
	src/components/sde/sde_internal.h,
	src/components/sde/tests/Gamum.c,
	src/components/sde/tests/Makefile,
	src/components/sde/tests/Minimal_Test.c,
	src/components/sde/tests/Recorder.c,
	src/components/sde/tests/Xandria.F90,
	src/components/sde/tests/sde_test_f08.F90: Software Defined
	Events (SDE) component.
	* src/counter_analysis_toolkit/Makefile,
	src/counter_analysis_toolkit/README,
	src/counter_analysis_toolkit/branch.c,
	src/counter_analysis_toolkit/branch.h,
	src/counter_analysis_toolkit/caches.h,
	src/counter_analysis_toolkit/compar.c,
	src/counter_analysis_toolkit/dcache.c,
	src/counter_analysis_toolkit/dcache.h,
	src/counter_analysis_toolkit/driver.h,
	src/counter_analysis_toolkit/event_list.txt,
	src/counter_analysis_toolkit/eventstock.c,
	src/counter_analysis_toolkit/eventstock.h,
	src/counter_analysis_toolkit/flops.c,
	src/counter_analysis_toolkit/flops.h,
	src/counter_analysis_toolkit/flops_aux.c,
	src/counter_analysis_toolkit/flops_aux.h,
	src/counter_analysis_toolkit/gen_seq_dlopen.sh,
	src/counter_analysis_toolkit/icache.c,
	src/counter_analysis_toolkit/icache.h,
	src/counter_analysis_toolkit/main.c,
	src/counter_analysis_toolkit/prepareArray.c,
	src/counter_analysis_toolkit/prepareArray.h,
	src/counter_analysis_toolkit/replicate.sh,
	src/counter_analysis_toolkit/timing_kernels.c,
	src/counter_analysis_toolkit/timing_kernels.h: Counter Analysis
	Toolkit.

2019-09-30  Anthony Castaldo

	* src/components/cuda/Rules.cuda,
	src/components/nvml/Rules.nvml, src/components/pcp/Rules.pcp:
	Corrected typos, replacing "optimal" with "optional."

2019-09-18  Anthony Castaldo

	* src/components/cuda/linux-cuda.c: We no longer check the
	error on setting CUPTI_EVENT_COLLECTION_MODE_CONTINUOUS; it
	only works on Tesla devices (and is preferred there) but fails
	on other models, which don't support the feature.  We do not
	fail if they reject it.

2019-09-17  Kevin Huck

	* src/components/io/CHANGES, src/components/io/README,
	src/components/io/Rules.io, src/components/io/linux-io.c,
	src/components/io/linux-io.h,
	src/components/io/tests/Makefile,
	src/components/io/tests/io_basic.c,
	src/components/io/tests/io_multiple_components.c: Adding I/O
	component to read from /proc/self/io.
2019-09-13 Steve Kaufmann * src/components/rocm/linux-rocm.c, src/components/rocm_smi/linux-rocm-smi.c: Changes to make these components (ROCM, ROCM_SMI) have naming consistency with others; fixed numerous minor formatting issues and comments. Compiled and checked on ICL Caffeine. 2019-09-13 Anthony Castaldo * src/components/rocm/linux-rocm.c, src/components/rocm_smi/linux-rocm-smi.c: Revert changes, used wrong author (Should be Steve Kaufmann). This reverts commit 9a60e91d539b8eb079dd81adc1d91c17620cfaed. 2019-09-12 Anthony Castaldo * src/components/rocm/linux-rocm.c, src/components/rocm_smi/linux-rocm-smi.c: Changes suggested by Steve Kaufmann (Cray) to make these components have naming consistency with others; fixed numerous minor formatting issues. Reviewed, accepted, compiled, checked. 2019-09-09 Frank Winkler * src/components/infiniband_umad/README.md, src/components/lmsensors/README.md: Little format changes for markdown documentation files. * src/components/libmsr/Makefile.libmsr.in, src/components/libmsr/README, src/components/libmsr/README.md, src/components/libmsr/Rules.libmsr, src/components/libmsr/configure, src/components/libmsr/configure.in, src/components/libmsr/linux-libmsr.c, src/components/libmsr/utils/libmsr_write_test.c: Updated code and documentation for component libmsr to get compliance with the new component setup standard. * src/components/infiniband_umad/README, src/components/infiniband_umad/README.md, .../infiniband_umad/Rules.infiniband_umad, .../infiniband_umad/linux-infiniband_umad.c: Updated code and documentation for component infiniband_umad to get compliance with the new component setup standard. 2019-09-08 Frank Winkler * src/components/lmsensors/README, src/components/lmsensors/README.md, src/components/lmsensors/Rules.lmsensors, src/components/lmsensors/linux-lmsensors.c: Updated code and documentation for component lmsensors to get compliance with the new component setup standard. 
2019-09-05 Anthony Castaldo * src/components/pcp/Rules.pcp: Corrected an issue with Rules, changing the name of macro that conflicted with other potential macros. 2019-09-04 Anthony Castaldo * src/components/cuda/Rules.cuda, src/components/nvml/Rules.nvml: Corrected an incompatibility in multiple Rules files when multiple components are included. Rules files cannot all use the same "MACRODEF" variable for different purposes; each needs a unique ID, like CUDA_MACS, NVML_MACS, etc. * src/components/rocm/README, src/components/rocm/Rules.rocm, src/components/rocm/linux-rocm.c, src/components/rocm_smi/README, src/components/rocm_smi/Rules.rocm_smi, src/components/rocm_smi /linux-rocm-smi.c: Changes to make rocm_smi component compliant with new component setup standard; changes to rocm component to correct bugs in compatibility and comments. * src/components/rocm/README, src/components/rocm/Rules.rocm, src/components/rocm/linux-rocm.c: Modified documentation, Rules and code for ROCM component to comply with new setup standards. It now requires PAPI_ROCM_ROOT as an environment variable. * src/components/pcp/README, src/components/pcp/Rules.pcp, src/components/pcp/linux-pcp.c: Code and documentation to get component PCP into compliance with the new component setup standard; PAPI_PCP_ROOT is only environmental variable required. 2019-09-03 Anthony Castaldo * src/components/cuda/Rules.cuda, src/components/nvml/README, src/components/nvml/Rules.nvml, src/components/nvml/linux-nvml.c: NVML component README, Rules and code updated to reflect new setup policy, relies on PAPI_CUDA_ROOT only. Adds a new override, PAPI_NVML_MAIN. Instructions improved in Rules.cuda, Rules.nvml. 2019-08-29 Anthony Castaldo * src/components/cuda/linux-cuda.c: Corrected comments. 
* src/components/cuda/README, src/components/cuda/Rules.cuda, src/components/cuda/linux-cuda.c: The changes make the cuda component reliant on a single environment variable, PAPI_CUDA_ROOT, allowing overrides specified in Rules.cuda if the necessary libraries are not in their expected locations. Detailed instructions are in README, and for overrides in Rules.cuda. 2019-08-28 Anthony Castaldo * src/components/rocm/linux-rocm.c: Bug fixes, for missing eventName in debug mode; also for failure to clear internal 'usage' flags when destroying an event set. 2019-08-14 Carl Love * src/papi_events.csv: Per Carl Love, "The POWER9 event PM_BR_TAKEN_CMPL includes conditional and unconditional branches. The equation for event PAPI_BR_NTK should not include the event PM_BR_UNCOND as PM_BR_TAKEN_CMPL already counts unconditional branches. The POWER9 event PM_LD_REF_L1 includes hits and misses to the L1. Thus we should not be adding PM_LS_MISS_L1_ALT when calculating PAPI_LD_INS on POWER9." The definitions for these preset events were changed accordingly, and their patterns of behavior were measured during the execution of performance benchmarks on the IBM POWER9 processors on Summit. The patterns of behavior for the corresponding events on the Intel Skylake and Broadwell processors were measured during the execution of the same performance benchmarks. The respective events from each architecture behave similarly. In addition, the new definitions pass the PAPI validation tests. 2019-08-12 Anthony Castaldo * src/components/pcp/Rules.pcp, src/components/rocm/Rules.rocm: Adding $(LDL) to LDFLAGS in Rules.x files when it was missing, on PCP and ROCM components. 2019-08-09 Anthony Castaldo * src/components/pcp/README, src/components/pcp/Rules.pcp, src/components/pcp/linux-pcp.c: The PCP component changed to use the new standard for PAPI environment variables; there are now no necessary environment variables, and no need to change LD_LIBRARY_PATH. The rules file was streamlined. 
The code was tested on Peak and Summit. We do allow overrides for non-standard installations of PCP, the variables PAPI_PCP_ROOT, PAPI_PCP_LIBS, PAPI_PCP_INC and PAPI_PCP_LIBNAME can be set by users to specify non-standard locations or library names. The README file in components/pcp/ contains detailed instructions on their use. 2019-08-08 Anthony Castaldo * src/components/rocm/README, src/components/rocm/Rules.rocm, src/components/rocm/linux-rocm.c, src/components/rocm/tests/run_papi.sh, src/components/rocm_smi/README, src/components/rocm_smi/Rules.rocm_smi, src/components/rocm_smi /linux-rocm-smi.c: Components ROCM and ROCM_SMI have been changed to adhere to our recent standardization of using environment variables. README files are updated with detailed information, and we have both simplified and extended the capabilities with new env vars. It is simplified because for a standard install of the rocm or rocm_smi software puts it in the default directories, PAPI will find the libraries and include files without any configure step. But it is more powerful because we allow overrides to the defaults, including overrides to the necessary library names. We also no longer require the LD_LIBRARY_PATH environment variable be modified, or exist at all. If it is there and we don't find a library in a path given by the user, we will still search it, and the default search directories. The Rules.rocm and Rules.rocm_smi are changed to use defaults, and their linker commands changed to allow the specification of a non-standard library name; e.g. a versioned library that does not end in ".so". These changes were tested and verified on the ICL machine Caffeine. 2019-08-04 Frank Winkler * src/components/infiniband_umad/Rules.infiniband_umad, src/components/lmsensors/Rules.lmsensors: Added "$(LDL)" to LDFLAGS of components lmsensors and infiniband_umad. "libdl" was missing in a previous commit (0f0b74f). * src/high-level/papi_hl.c: Fixed bug in high-level API. 
Function PAPI_hl_print_output () caused a segmentation fault when no events were recorded. 2019-08-01 Frank Winkler * .../infiniband_umad/Makefile.infiniband_umad.in, src/components/infiniband_umad/README, .../infiniband_umad/Rules.infiniband_umad, src/components/infiniband_umad/configure, src/components/infiniband_umad/configure.in, .../infiniband_umad/linux-infiniband_umad.c, src/components/infiniband_umad/tests/Makefile, src/components/lmsensors/Makefile.lmsensors.in, src/components/lmsensors/README, src/components/lmsensors/Rules.lmsensors, src/components/lmsensors/configure, src/components/lmsensors/configure.in, src/components/lmsensors/linux-lmsensors.c: Changed configuration mechanism for components lmsensors and infiniband_umad. We do not use configure scripts anymore. Each component is configured via environment variables. For compilation: PAPI_[component]_ROOT PAPI_[component]_INCLUDE PAPI_[component]_LIB For runtime: PAPI_[component]_LIBNAME Detailed information can be found in the README file of each component. 2019-07-25 Anthony Castaldo * src/components/cuda/README, src/components/cuda/Rules.cuda, src/components/cuda/linux-cuda.c, src/components/cuda/sampling/Makefile, src/components/cuda/tests/Makefile, src/components/nvml/README, src/components/nvml/Rules.nvml, src/components/nvml/linux-nvml.c, src/components/nvml/tests/Makefile: A continuation of a previous commit prematurely pushed. Same commentary: The CUDA and NVML components have a revamped Environment Variable processing; we have simplified this for users, made it more flexible, and standardized on environment variables beginning with "PAPI_". The NVML component used to require a separate configure step; this has been eliminated. Simplification: The only required environment variable is now PAPI_CUDA_ROOT, set to the path corresponding to CUDA. Users no longer need to update LD_LIBRARY_PATH. 
There are several other environment variables that can be set to override the defaults we would automatically use if only PAPI_CUDA_ROOT is given. The general protocol we now use for naming environment variables is PAPI_[component]_[setting]. Examples are: PAPI_CUDA_STUBS (default = ${PAPI_CUDA_ROOT}/lib64/stubs) PAPI_CUPTI_LIBS (default = ${PAPI_CUDA_ROOT}/extras/CUPTI/lib64) PAPI_NVML_LIBNAME (default = "libnvidia-ml.so") Some possible overrides are processed at compile time; the Rules.[component] files now set defaults (for cuda and nvml based on PAPI_CUDA_ROOT) for the path to include files, or to library files. Other possible overrides are handled at runtime; using environment variables the user can specify specific paths to attempt first for each library. If the necessary libraries are not found on those paths, the system will still attempt to use the LD_LIBRARY_PATH and the default directories (/lib64, /usr/lib64). The "disabled_reason" field for components has been updated to provide more information when libraries are not found. The README files have been rewritten to reflect this protocol, to detail the new possible overrides, and to show the order in which they are searched when more than one environment variable applies. 2019-07-24 Anthony Castaldo * src/components/nvml/Makefile.nvml.in, src/components/nvml/configure, src/components/nvml/configure.in: The CUDA and NVML components have a revamped Environment Variable processing; we have simplified this for users, made it more flexible, and standardized on environment variables beginning with "PAPI_". The NVML component used to require a separate configure step; this has been eliminated. Simplification: The only required environment variable is now PAPI_CUDA_ROOT, set to the path corresponding to CUDA. Users no longer need to update LD_LIBRARY_PATH. 
The general protocol we now use for naming environment variables is PAPI_[component]_[setting]. Examples are: PAPI_CUDA_STUBS (default = ${PAPI_CUDA_ROOT}/lib64/stubs) PAPI_CUPTI_LIBS (default = ${PAPI_CUDA_ROOT}/extras/CUPTI/lib64) PAPI_NVML_LIBNAME (default = "libnvidia-ml.so") Some possible overrides are processed at compile time; the Rules.[component] files now set defaults (for cuda and nvml based on PAPI_CUDA_ROOT) for the path to include files, or to library files. Other possible overrides are handled at runtime; using environment variables the user can specify specific paths to attempt first for each library. If the necessary libraries are not found on those paths, the system will still attempt to use the LD_LIBRARY_PATH and the default directories (/lib64, /usr/lib64). The "disabled_reason" field for components has been updated to provide more information when libraries are not found. The README files have been rewritten to reflect this protocol, to detail the new possible overrides, and to show the order in which they are searched when more than one environment variable applies. 2019-07-19 Frank Winkler * src/high-level/papi_hl.c: Removed function "error_at_line" (declared in error.h), since it is not portable. Fixed warning "implicit declaration of function error_at_line". 2019-07-17 Anthony Castaldo * src/components/nvml/README: Changes explaining the issues with libnvidia-ml.so in detail; and the new facility for changing the default name using the environment variable PAPI_NVML_LIBNAME. * src/components/nvml/Rules.nvml, src/components/nvml/linux-nvml.c, src/utils/papi_component_avail.c: linux-nvml.c is changed to allow the nvml library name to be set by an environment variable, PAPI_NVML_LIBNAME. If this is not present, the default 'libnvidia-ml.so' is used. Also, misspellings in error messages were corrected. Rules.nvml: A previous method used a -D #define during the compile of linux-nvml.c to change the default name. This method was eliminated. 
utils/papi_component_avail.c: a typographic error in an error message was corrected. 2019-07-15 Anthony Castaldo * src/utils/papi_component_avail.c: To avoid confusion, we no longer print an empty "PMUs supported:" line for components for which Performance Monitoring Units do not apply (or are not exposed through its device interfaces). We also corrected minor bugs in computing the display length to limit output lines to 130 characters (when listing PMUs); this was most evident on the perf_event_uncore component. 2019-07-12 Anthony Castaldo * src/components/rocm_smi/README, src/components/rocm_smi/linux-rocm-smi.c, .../rocm_smi/tests/rocm_command_line.cpp, src/components/rocm_smi/tests/rocm_smi_all.cpp: linux-rocm-smi.c (for the rocm_smi component) was fixed to not expose globals other than the _rocm_smi_vector. The src/components/rocm_smi/README file was updated to provide more information on the LD_LIBRARY_PATH required, and the utilities rocm_command_line.cpp and rocm_smi_all.cpp in the tests/ directory were updated to report more information. 2019-06-27 Frank Winkler * src/high-level/papi_hl.c: Added multiplexing support for high-level API. Multiplexing of CPU core components can be enabled via the environment variable PAPI_MULTIPLEX. 2019-06-26 Daniel Barry * src/ctests/zero_omp.c, src/papi_vector.c: Changed the dummy function call in papi_vector.c and created a function to wrap the call to omp_get_thread_num() in ctests/zero_omp.c. These allow the function castings in the respective files to operate properly without warnings from GCC 8.3.0. These changes were tested on the Intel Haswell architecture. 2019-06-25 Anthony Castaldo * src/components/nvml/PeakConfigure.sh, src/components/nvml/README, src/components/nvml/Rules.nvml, src/components/nvml/linux-nvml.c: linux-nvml.c is modified to accept an alternate name for the nvml library, which will default to the standard 'libnvidia-ml.so'. 
This is necessary on a system (Summit in particular) that doesn't have the standard link file to the current versioned lib. It will also provide flexibility for testing previous versions or new versions of the library. The library file name can be specified in Rules.nvml, as a compile-line define of NVML_LIBNAME. Rules.nvml has comments added, one of which is an example of how to specify NVML_LIBNAME. Otherwise it is unchanged; and the library name used will default to 'libnvidia-ml.so'. README has been updated to describe this new capability, and for ICL staff contains examples of what works on Summit. PeakConfigure.sh had a typo that was corrected. 2019-06-18 Vince Weaver * src/components/rapl/tests/rapl_basic.c: rapl: quiet a strncpy() warning in the rapl_basic test * src/linux-common.c, src/linux-memory.c, src/papi_internal.c, src/papi_libpfm4_events.c: papi: fix some strncpy() related warnings reported by gcc 8.3 2019-06-10 Daniel Barry * src/components/perf_event_uncore/tests/perf_event_uncore_cbox.c: Changed the sprintf() call to snprintf() and added an if-statement to check whether the number of characters intended to be written to the destination buffer exceeds the size of the buffer. This prevents GCC 8.3.0 from warning that the destination buffer may not be large enough to store the contents of the source buffers. These changes were tested on the Intel Haswell architecture. 2019-06-05 Daniel Barry * src/components/perf_event_uncore/tests/perf_event_uncore.c: Added a second buffer in the perf_event_uncore test. This prevents GCC 8 from complaining about the source and destination buffers overlapping. Per the sprintf man-page (release 3.53 of the Linux man-pages project), "the standards explicitly note that the results are undefined if source and destination buffers overlap when calling sprintf()." Since the second buffer is only present in a test program, this change will not create memory overhead for user programs which use PAPI. 
These changes were tested on the Intel Haswell architecture. 2019-06-05 Anthony Castaldo * src/components/rocm_smi/linux-rocm-smi.c: Added a direct file system search for AMD GPU peripherals; vendor ID 0x1002. We search up to 64 /sys/class/drm/card?/device/vendor files; (card0, card1, ... card 63). Also corrected a typo in an event name. Tested and worked on ICL Caffeine system; correctly excluded card0 (display card) and found two AMD GPUs on card1, card2. 2019-05-25 Yunqiang Su * src/linux-lock.h: [mips] replace beqzl with beqzc for r6 2019-05-20 Daniel Barry * src/papi_events.csv: I have added PAPI POWER9 event definitions for PAPI_L2_DCR, PAPI_L2_DCW, PAPI_BR_CN, PAPI_BR_NTK, PAPI_BR_UCN, and PAPI_BR_TKN. These events have been tested. Their patterns of behavior were measured during the execution of performance benchmarks on Summit's POWER9 processors. The patterns of behavior for the corresponding events on Intel Haswell processors were measured during the execution of the same performance benchmarks. The respective events from each architecture behave similarly. 2019-05-17 Anthony Castaldo * src/components/rocm_smi/linux-rocm-smi.c: Added missing "rsmi_init(0)" call to component_init() function. * src/components/rocm/README, src/components/rocm/Rules.rocm, src/components/rocm/linux-rocm.c: Modifications to support indexed variables; requires different names be used for PAPI users and the request to the RocProfiler (it interprets the index within the name). Updated notes in README, and additional potential -I include paths in Rules.rocm. 2019-05-14 Anthony Castaldo * src/components/cuda/linux-cuda.c: Improved error reporting when libraries are not found, or the cuda initialization function fails. No changes to function. 
2019-05-07 Heike Jagode * src/components/appio/tests/iozone/Gnuplot.txt, src/components/appio/tests/iozone/gnu3d.dem, src/components/powercap/tests/powercap_limit.c, src/components/vmware/VMwareComponentDocument.txt: More clean up of carriage return character (^M) throughout the code base. Thanks to Steve Kaufmann! 2019-05-07 Anthony Castaldo * src/components/rocm_smi/linux-rocm-smi.c, src/components/rocm_smi/rocm_smi.h, src/components/rocm_smi/tests/Makefile, src/components/rocm_smi/tests/ROCM_SMI_Makefile: I fixed linux- rocm-smi.c to include an event per device called rocm_smi:::device=?:busy_percent; I overlooked this event in the first draft of the component. I added a note to rocm_smi.h; we cannot use the distributed version of this file; we have a compile error on one of the include files that is not necessary; so we comment it out. I created a Makefile for the rocm_smi/tests/ directory, it is just a placeholder until we develop some standardized tests of the rocm_smi component; but necessary to prevent an error during system 'make'. I added rocm_command_line.out to the ROCM_SMI_Makefile. This is to make non-standardized tests; and can be used as make -f ROCM_SMI_Makefile 2019-05-07 Heike Jagode * src/Makefile.in, src/Makefile.inc, src/Rules.perfmon2, src/configure.in: Clean up of carriage return character (^M) from previous patch (commit 5434010). Thanks to Steve Kaufmann from Cray! 2019-05-06 Andreas Beckmann * src/Makefile.in, src/Makefile.inc, src/Rules.perfmon2, src/configure, src/configure.in: [PATCH] set SONAME to libpapi.so.$(PAPIVER).$(PAPIREV) The version check in PAPI_library_init() requires matching PAPI_VER_CURRENT, therefore libpapi.so.5 from papi-5.6.x and papi-5.7.x are not interchangeable, but require applications to be recompiled. 
Change the SONAME to contain the two version components that define PAPI_VER_CURRENT, so that upgrading the shared library to a new version no longer breaks existing applications (which will pick up the new SONAME upon recompilation). Introduce a new variable PAPISOVER and use it in all places where the SONAME is being used. Drop unused symlinks with three version components: $(PAPIVER).$(PAPIREV).$(PAPIAGE) 2019-05-03 Daniel Barry * src/ctests/profile_twoevents.c: Prevented another warning about buffer size potentially being too small. 2019-04-25 Frank Winkler * src/high-level/papi_hl.c: Fixed "format-overflow" warning detected by gcc/8.1.0. 2019-04-24 Anthony Castaldo * src/components/rocm/Rules.rocm, src/components/rocm/linux-rocm.c, src/components/rocm_smi/README, src/components/rocm_smi/Rules.rocm_smi, src/components/rocm_smi/linux-rocm-smi.c, src/components/rocm_smi/rocm_smi.h, src/components/rocm_smi/tests/ROCM_SMI_Makefile, .../rocm_smi/tests/rocm_command_line.cpp, src/components/rocm_smi/tests/rocm_smi_all.cpp, .../rocm_smi/tests/rocm_smi_writeTests.cpp: Major addition: a component to access the rocm_smi library; this is the System Management Interface for AMD GPU devices. It allows monitoring of hardware elements, like power consumption, memory usage, PCIe throughput, fan speed, etc. It allows control for some hardware functions as well, via PAPI_write(), although these are untested (write requires root privileges to test). Included here are the component code, a tester for all readable events, and an incomplete tester for writing control values. The tests are cpp; this is required for the AMD 'HIPP' compiler to process an AMD Kernel that can exercise the GPU itself. The rules and exports are a bit complicated; for development the rocm_smi_lib was installed and built in my user directory; in production it would be in a system directory. 
2019-04-24 Frank Winkler * src/high-level/papi_hl.c: Fixed warnings detected by gcc/8.3.0 when using "-Wrestrict" or "-Wall". * src/high-level/papi_hl.c: Replaced "get_current_dir_name()" with "getcwd(NULL,0)". "get_current_dir_name()" is only GNU specific. * src/run_tests.sh: Replaced bash statements with shell statements. Some systems do not have a bash. 2019-04-22 Frank Winkler * src/high-level/papi_hl.c: Corrected data type declaration according to the return value of C library function fgetc. 2019-04-18 Daniel Barry * src/ctests/derived.c, src/ctests/multiattach.c, src/ctests/multiattach2.c, src/ctests/reset.c, src/ctests/reset_multiplex.c, src/ctests/zero_attach.c, src/ctests/zero_flip.c: Prevented warnings about buffer sizes of length PAPI_MAX_STR_LEN potentially being too small. 2019-04-18 Frank Winkler * doc/Doxyfile-man3, doc/Makefile, src/Makefile.inc, src/components/appio/tests/appio_test_blocking.c, .../appio/tests/appio_test_fread_fwrite.c, src/components/appio/tests/appio_test_pthreads.c, src/components/appio/tests/appio_test_read_write.c, src/components/appio/tests/appio_test_recv.c, src/components/appio/tests/appio_test_seek.c, src/components/appio/tests/appio_test_select.c, src/components/appio/tests/appio_test_socket.c, src/components/appio/tests/init_fini.c, src/ctests/Makefile.recipies, src/ctests/api.c, src/ctests/flops.c, src/ctests/high-level.c, src/ctests/high-level2.c, src/ctests/hl_rates.c, src/ctests/ipc.c, src/ctests/matrix-hl.c, src/ctests/mpi_hl.c, src/ctests/mpi_omp_hl.c, src/ctests/omp_hl.c, src/ctests/pthread_hl.c, src/ctests/serial_hl.c, src/ctests/serial_hl_advanced.c, src/ctests/serial_hl_ll_comb.c, src/ctests/serial_hl_ll_comb2.c, src/ftests/Makefile.recipies, src/ftests/flops.F, src/ftests/fmatrixpapi.F, src/ftests/fmatrixpapi2.F, src/ftests/highlevel.F, src/ftests/serial_hl.F, src/ftests/serial_hl_advanced.F, src/high- level/papi_hl.c, src/high-level/scripts/papi_hl_output_writer.py, src/papi.c, src/papi.h, src/papi_debug.h, 
src/papi_fwrappers.c, src/papi_hl.c, src/papi_hl.h, src/run_tests.sh, src/run_tests_exclude.txt, src/validation_tests/Makefile.recipies, src/validation_tests/flops_validation_hl.c: Replaced old high-level API with a new high-level API. The new high-level API provides the ability to record performance events within instrumented code sections, called regions, of serial, multi-processing (MPI, SHMEM) and thread (OpenMP, Pthreads) parallel applications. Events to be recorded are determined via an environment variable that lists both preset and native events separated by commas. This enables the programmer to perform different measurements without recompiling. In addition, the programmer does not need to take care of printing performance events since a JSON output is generated at the end of each measurement. Main changes: - Removed old high-level API including all test files. - Added new high-level API including a python script that merges results from several MPI ranks. - Added Doxygen documentation for new high-level API. - Added high-level tests for c and fortran. - Added high-level flops validation test. - Replaced old high-level tests with new high-level tests in appio component. 2019-04-02 Anthony Castaldo * src/components/rocm/linux-rocm.c, src/components/rocm/tests/rocm_all.cpp: NOTE: This component is still not functional! Added missing code to prevent hsa_shut_down() call from segfaulting. Changed skip table for testing code rocm_all.cpp. 2019-04-01 Al Grant * src/linux-memory.c: The logic in linux-memory.c generic_get_memory_info() isn't correct. It looks at the cpu0/cache node and iterates through the caches. The intention is to collect information about caches at each level. There may be multiple caches at a given level (typically at L1 there will be I and D). PAPI's data structure allows for this. There is a 'level count' that is incremented so that multiple caches can be collected per level. 
The bug is in the lines

	if (level != last_level) {
	    level_count = 0;
	    last_level = level;
	} else {
	    level_count++;
	}

This assumes that for a given level, you see all the caches at that level, then you go to the next level. But in fact sysfs may return the caches in random order. An actual example:

	index2: level 2, unified cache
	index0: level 1, data cache
	index3: level 3, unified cache
	index1: level 1, instruction cache

Because index1 is at a different level from index3, level_count will be reset to 0. So in PAPI's structures, the L1I information will overwrite the L1D information. The knowledge about L1D will be lost. 2019-03-28 Anthony Castaldo * src/components/rocm/README, src/components/rocm/linux-rocm.c, src/components/rocm/tests/Makefile, src/components/rocm/tests/ROCM_Makefile, src/components/rocm/tests/rocm_all.cpp, src/components/rocm/tests/rocm_command_line.c, src/components/rocm/tests/run_papi.sh, src/components/rocm/tests/square.cpp, src/components/rocm/tests/square.cu, src/components/rocm/tests/square.hipref.cpp: linux-rocm.c updated with PAPI standard component function names, beginning '_rocm', and events named '..:device=n:...' instead of 'device:n'. New files and utilities are added in the test/ directory. The ROCM_Makefile is used to compile cpp code using the AMD HIPCC compiler; e.g. 'make -f ROCM_Makefile rocm_all.out', in order to compile code that uses the AMD GPUs. 2019-03-18 Anthony Castaldo * src/components/rocm/linux-rocm.c: This is the ROCM component (linux_rocm.c) with the minimal changes needed to compile with the PAPI standard GCC flags and settings. This version is functional; it shows up on papi_components_avail, and papi_native_avail shows rocm::: events. However, the compile still produces warnings for unused variables (they are used in debug mode but the code using them is suppressed in production mode). These are corrected in the next commit; and a '/tests' directory will be added. 
2019-03-18 Evgeny Shcherbakov * src/components/rocm/README, src/components/rocm/Rules.rocm, src/components/rocm/linux-rocm.c: These are the original files produced by Evgeny Shcherbakov for the ROCM PAPI component; this component allows PAPI to access to AMD GPU events. Note that linux- rocm.c will not compile using the PAPI default settings for GCC; it has 3 lines of code that require a C99 flag (e.g. -std=gnu99). We do not wish to mix standards, so the next commit will revise these lines to standard C that will compile clean with our standard settings. 2019-03-07 Heike Jagode * doc/Doxyfile-common, papi.spec, src/Makefile.in, src/configure, src/configure.in, src/papi.h: Updated version to 5.7.1 after the release. 2019-03-04 Heike Jagode * RELEASENOTES.txt, release_procedure.txt: Minor updates to release procedure text. * RELEASENOTES.txt: Updated release notes for 5.7.0 release. 2019-02-22 Anthony Castaldo * doc/Doxyfile-common, man/man1/PAPI_derived_event_files.1, man/man1/papi_avail.1, man/man1/papi_clockres.1, man/man1/papi_command_line.1, man/man1/papi_component_avail.1, man/man1/papi_cost.1, man/man1/papi_decode.1, man/man1/papi_error_codes.1, man/man1/papi_event_chooser.1, man/man1/papi_hybrid_native_avail.1, man/man1/papi_mem_info.1, man/man1/papi_multiplex_cost.1, man/man1/papi_native_avail.1, man/man1/papi_version.1, man/man1/papi_xml_event_info.1, man/man3/PAPIF_accum.3, man/man3/PAPIF_accum_counters.3, man/man3/PAPIF_add_event.3, man/man3/PAPIF_add_events.3, man/man3/PAPIF_add_named_event.3, man/man3/PAPIF_assign_eventset_component.3, man/man3/PAPIF_cleanup_eventset.3, man/man3/PAPIF_create_eventset.3, man/man3/PAPIF_destroy_eventset.3, man/man3/PAPIF_enum_event.3, man/man3/PAPIF_epc.3, man/man3/PAPIF_event_code_to_name.3, man/man3/PAPIF_event_name_to_code.3, man/man3/PAPIF_flips.3, man/man3/PAPIF_flops.3, man/man3/PAPIF_get_clockrate.3, man/man3/PAPIF_get_dmem_info.3, man/man3/PAPIF_get_domain.3, man/man3/PAPIF_get_event_info.3, 
man/man3/PAPIF_get_exe_info.3, man/man3/PAPIF_get_granularity.3, man/man3/PAPIF_get_hardware_info.3, man/man3/PAPIF_get_multiplex.3, man/man3/PAPIF_get_preload.3, man/man3/PAPIF_get_real_cyc.3, man/man3/PAPIF_get_real_nsec.3, man/man3/PAPIF_get_real_usec.3, man/man3/PAPIF_get_virt_cyc.3, man/man3/PAPIF_get_virt_usec.3, man/man3/PAPIF_ipc.3, man/man3/PAPIF_is_initialized.3, man/man3/PAPIF_library_init.3, man/man3/PAPIF_lock.3, man/man3/PAPIF_multiplex_init.3, man/man3/PAPIF_num_cmp_hwctrs.3, man/man3/PAPIF_num_counters.3, man/man3/PAPIF_num_events.3, man/man3/PAPIF_num_hwctrs.3, man/man3/PAPIF_perror.3, man/man3/PAPIF_query_event.3, man/man3/PAPIF_query_named_event.3, man/man3/PAPIF_read.3, man/man3/PAPIF_read_ts.3, man/man3/PAPIF_register_thread.3, man/man3/PAPIF_remove_event.3, man/man3/PAPIF_remove_events.3, man/man3/PAPIF_remove_named_event.3, man/man3/PAPIF_reset.3, man/man3/PAPIF_set_cmp_domain.3, man/man3/PAPIF_set_cmp_granularity.3, man/man3/PAPIF_set_debug.3, man/man3/PAPIF_set_domain.3, man/man3/PAPIF_set_event_domain.3, man/man3/PAPIF_set_granularity.3, man/man3/PAPIF_set_inherit.3, man/man3/PAPIF_set_multiplex.3, man/man3/PAPIF_shutdown.3, man/man3/PAPIF_start.3, man/man3/PAPIF_start_counters.3, man/man3/PAPIF_state.3, man/man3/PAPIF_stop.3, man/man3/PAPIF_stop_counters.3, man/man3/PAPIF_thread_id.3, man/man3/PAPIF_thread_init.3, man/man3/PAPIF_unlock.3, man/man3/PAPIF_unregister_thread.3, man/man3/PAPIF_write.3, man/man3/PAPI_accum.3, man/man3/PAPI_accum_counters.3, man/man3/PAPI_add_event.3, man/man3/PAPI_add_events.3, man/man3/PAPI_add_named_event.3, man/man3/PAPI_addr_range_option_t.3, man/man3/PAPI_address_map_t.3, man/man3/PAPI_all_thr_spec_t.3, man/man3/PAPI_assign_eventset_component.3, man/man3/PAPI_attach.3, man/man3/PAPI_attach_option_t.3, man/man3/PAPI_cleanup_eventset.3, man/man3/PAPI_component_info_t.3, man/man3/PAPI_cpu_option_t.3, man/man3/PAPI_create_eventset.3, man/man3/PAPI_debug_option_t.3, man/man3/PAPI_destroy_eventset.3, 
man/man3/PAPI_detach.3, man/man3/PAPI_disable_component.3, man/man3/PAPI_disable_component_by_name.3, man/man3/PAPI_dmem_info_t.3, man/man3/PAPI_domain_option_t.3, man/man3/PAPI_enum_cmp_event.3, man/man3/PAPI_enum_event.3, man/man3/PAPI_epc.3, man/man3/PAPI_event_code_to_name.3, man/man3/PAPI_event_info_t.3, man/man3/PAPI_event_name_to_code.3, man/man3/PAPI_exe_info_t.3, man/man3/PAPI_flips.3, man/man3/PAPI_flops.3, man/man3/PAPI_get_cmp_opt.3, man/man3/PAPI_get_component_index.3, man/man3/PAPI_get_component_info.3, man/man3/PAPI_get_dmem_info.3, man/man3/PAPI_get_event_component.3, man/man3/PAPI_get_event_info.3, man/man3/PAPI_get_eventset_component.3, man/man3/PAPI_get_executable_info.3, man/man3/PAPI_get_hardware_info.3, man/man3/PAPI_get_multiplex.3, man/man3/PAPI_get_opt.3, man/man3/PAPI_get_overflow_event_index.3, man/man3/PAPI_get_real_cyc.3, man/man3/PAPI_get_real_nsec.3, man/man3/PAPI_get_real_usec.3, man/man3/PAPI_get_shared_lib_info.3, man/man3/PAPI_get_thr_specific.3, man/man3/PAPI_get_virt_cyc.3, man/man3/PAPI_get_virt_nsec.3, man/man3/PAPI_get_virt_usec.3, man/man3/PAPI_granularity_option_t.3, man/man3/PAPI_hw_info_t.3, man/man3/PAPI_inherit_option_t.3, man/man3/PAPI_ipc.3, man/man3/PAPI_is_initialized.3, man/man3/PAPI_itimer_option_t.3, man/man3/PAPI_library_init.3, man/man3/PAPI_list_events.3, man/man3/PAPI_list_threads.3, man/man3/PAPI_lock.3, man/man3/PAPI_mh_cache_info_t.3, man/man3/PAPI_mh_info_t.3, man/man3/PAPI_mh_level_t.3, man/man3/PAPI_mh_tlb_info_t.3, man/man3/PAPI_mpx_info_t.3, man/man3/PAPI_multiplex_init.3, man/man3/PAPI_multiplex_option_t.3, man/man3/PAPI_num_cmp_hwctrs.3, man/man3/PAPI_num_components.3, man/man3/PAPI_num_counters.3, man/man3/PAPI_num_events.3, man/man3/PAPI_num_hwctrs.3, man/man3/PAPI_option_t.3, man/man3/PAPI_overflow.3, man/man3/PAPI_perror.3, man/man3/PAPI_preload_info_t.3, man/man3/PAPI_profil.3, man/man3/PAPI_query_event.3, man/man3/PAPI_query_named_event.3, man/man3/PAPI_read.3, man/man3/PAPI_read_counters.3, 
man/man3/PAPI_read_ts.3, man/man3/PAPI_register_thread.3, man/man3/PAPI_remove_event.3, man/man3/PAPI_remove_events.3, man/man3/PAPI_remove_named_event.3, man/man3/PAPI_reset.3, man/man3/PAPI_set_cmp_domain.3, man/man3/PAPI_set_cmp_granularity.3, man/man3/PAPI_set_debug.3, man/man3/PAPI_set_domain.3, man/man3/PAPI_set_granularity.3, man/man3/PAPI_set_multiplex.3, man/man3/PAPI_set_opt.3, man/man3/PAPI_set_thr_specific.3, man/man3/PAPI_shlib_info_t.3, man/man3/PAPI_shutdown.3, man/man3/PAPI_sprofil.3, man/man3/PAPI_sprofil_t.3, man/man3/PAPI_start.3, man/man3/PAPI_start_counters.3, man/man3/PAPI_state.3, man/man3/PAPI_stop.3, man/man3/PAPI_stop_counters.3, man/man3/PAPI_strerror.3, man/man3/PAPI_thread_id.3, man/man3/PAPI_thread_init.3, man/man3/PAPI_unlock.3, man/man3/PAPI_unregister_thread.3, man/man3/PAPI_write.3, papi.spec, release_procedure.txt, src/Makefile.in, src/configure.in, src/papi.h: Fixing updates to manual; incorrectly done for release 5.7.0.0. 2019-02-21 Anthony Castaldo * release_procedure.txt: Updated release procedure with additional instructions on final steps. 2019-02-18 Anthony Castaldo * doc/Doxyfile-common, papi.spec, src/Makefile.in, src/configure.in, src/papi.h: Changed version to 5.7.1 after release. * release_procedure.txt: Corrected directory entry typo. 
* ChangeLogP570.txt, RELEASENOTES.txt: New ChangeLogP570.txt for new release, updated RELEASENOTES.txt

2022-11-10 Giuseppe Congiu * ChangeLogP700.txt, RELEASENOTES.txt, doc/Doxyfile-common, man/man1/PAPI_derived_event_files.1, man/man1/papi_avail.1, man/man1/papi_clockres.1, man/man1/papi_command_line.1, man/man1/papi_component_avail.1, man/man1/papi_cost.1, man/man1/papi_decode.1, man/man1/papi_error_codes.1, man/man1/papi_event_chooser.1, man/man1/papi_hardware_avail.1, man/man1/papi_hybrid_native_avail.1, man/man1/papi_mem_info.1, man/man1/papi_multiplex_cost.1, man/man1/papi_native_avail.1, man/man1/papi_version.1, man/man1/papi_xml_event_info.1, man/man3/PAPIF_accum.3, man/man3/PAPIF_add_event.3, man/man3/PAPIF_add_events.3, man/man3/PAPIF_add_named_event.3, man/man3/PAPIF_assign_eventset_component.3, man/man3/PAPIF_cleanup_eventset.3, man/man3/PAPIF_create_eventset.3, man/man3/PAPIF_destroy_eventset.3, man/man3/PAPIF_enum_dev_type.3, man/man3/PAPIF_enum_event.3, man/man3/PAPIF_epc.3, man/man3/PAPIF_event_code_to_name.3, man/man3/PAPIF_event_name_to_code.3, man/man3/PAPIF_flips_rate.3, man/man3/PAPIF_flops_rate.3, man/man3/PAPIF_get_clockrate.3, man/man3/PAPIF_get_dev_attr.3, man/man3/PAPIF_get_dev_type_attr.3, man/man3/PAPIF_get_dmem_info.3, man/man3/PAPIF_get_domain.3, man/man3/PAPIF_get_event_info.3, man/man3/PAPIF_get_exe_info.3, man/man3/PAPIF_get_granularity.3, man/man3/PAPIF_get_hardware_info.3, man/man3/PAPIF_get_multiplex.3, man/man3/PAPIF_get_preload.3, man/man3/PAPIF_get_real_cyc.3, man/man3/PAPIF_get_real_nsec.3, man/man3/PAPIF_get_real_usec.3, man/man3/PAPIF_get_virt_cyc.3, man/man3/PAPIF_get_virt_usec.3, man/man3/PAPIF_ipc.3, man/man3/PAPIF_is_initialized.3, man/man3/PAPIF_library_init.3, man/man3/PAPIF_lock.3, man/man3/PAPIF_multiplex_init.3, man/man3/PAPIF_num_cmp_hwctrs.3, man/man3/PAPIF_num_events.3, man/man3/PAPIF_num_hwctrs.3,
man/man3/PAPIF_perror.3, man/man3/PAPIF_query_event.3, man/man3/PAPIF_query_named_event.3, man/man3/PAPIF_rate_stop.3, man/man3/PAPIF_read.3, man/man3/PAPIF_read_ts.3, man/man3/PAPIF_register_thread.3, man/man3/PAPIF_remove_event.3, man/man3/PAPIF_remove_events.3, man/man3/PAPIF_remove_named_event.3, man/man3/PAPIF_reset.3, man/man3/PAPIF_set_cmp_domain.3, man/man3/PAPIF_set_cmp_granularity.3, man/man3/PAPIF_set_debug.3, man/man3/PAPIF_set_domain.3, man/man3/PAPIF_set_event_domain.3, man/man3/PAPIF_set_granularity.3, man/man3/PAPIF_set_inherit.3, man/man3/PAPIF_set_multiplex.3, man/man3/PAPIF_shutdown.3, man/man3/PAPIF_start.3, man/man3/PAPIF_state.3, man/man3/PAPIF_stop.3, man/man3/PAPIF_thread_id.3, man/man3/PAPIF_thread_init.3, man/man3/PAPIF_unlock.3, man/man3/PAPIF_unregister_thread.3, man/man3/PAPIF_write.3, man/man3/PAPI_accum.3, man/man3/PAPI_add_event.3, man/man3/PAPI_add_events.3, man/man3/PAPI_add_named_event.3, man/man3/PAPI_addr_range_option_t.3, man/man3/PAPI_address_map_t.3, man/man3/PAPI_all_thr_spec_t.3, man/man3/PAPI_assign_eventset_component.3, man/man3/PAPI_attach.3, man/man3/PAPI_attach_option_t.3, man/man3/PAPI_cleanup_eventset.3, man/man3/PAPI_component_info_t.3, man/man3/PAPI_cpu_option_t.3, man/man3/PAPI_create_eventset.3, man/man3/PAPI_debug_option_t.3, man/man3/PAPI_destroy_eventset.3, man/man3/PAPI_detach.3, man/man3/PAPI_disable_component.3, man/man3/PAPI_disable_component_by_name.3, man/man3/PAPI_dmem_info_t.3, man/man3/PAPI_domain_option_t.3, man/man3/PAPI_enum_cmp_event.3, man/man3/PAPI_enum_dev_type.3, man/man3/PAPI_enum_event.3, man/man3/PAPI_epc.3, man/man3/PAPI_event_code_to_name.3, man/man3/PAPI_event_info_t.3, man/man3/PAPI_event_name_to_code.3, man/man3/PAPI_exe_info_t.3, man/man3/PAPI_flips_rate.3, man/man3/PAPI_flops_rate.3, man/man3/PAPI_get_cmp_opt.3, man/man3/PAPI_get_component_index.3, man/man3/PAPI_get_component_info.3, man/man3/PAPI_get_dev_attr.3, man/man3/PAPI_get_dev_type_attr.3, man/man3/PAPI_get_dmem_info.3, 
man/man3/PAPI_get_event_component.3, man/man3/PAPI_get_event_info.3, man/man3/PAPI_get_eventset_component.3, man/man3/PAPI_get_executable_info.3, man/man3/PAPI_get_hardware_info.3, man/man3/PAPI_get_multiplex.3, man/man3/PAPI_get_opt.3, man/man3/PAPI_get_overflow_event_index.3, man/man3/PAPI_get_real_cyc.3, man/man3/PAPI_get_real_nsec.3, man/man3/PAPI_get_real_usec.3, man/man3/PAPI_get_shared_lib_info.3, man/man3/PAPI_get_thr_specific.3, man/man3/PAPI_get_virt_cyc.3, man/man3/PAPI_get_virt_nsec.3, man/man3/PAPI_get_virt_usec.3, man/man3/PAPI_granularity_option_t.3, man/man3/PAPI_hl_read.3, man/man3/PAPI_hl_region_begin.3, man/man3/PAPI_hl_region_end.3, man/man3/PAPI_hl_stop.3, man/man3/PAPI_hw_info_t.3, man/man3/PAPI_inherit_option_t.3, man/man3/PAPI_ipc.3, man/man3/PAPI_is_initialized.3, man/man3/PAPI_itimer_option_t.3, man/man3/PAPI_library_init.3, man/man3/PAPI_list_events.3, man/man3/PAPI_list_threads.3, man/man3/PAPI_lock.3, man/man3/PAPI_mh_cache_info_t.3, man/man3/PAPI_mh_info_t.3, man/man3/PAPI_mh_level_t.3, man/man3/PAPI_mh_tlb_info_t.3, man/man3/PAPI_mpx_info_t.3, man/man3/PAPI_multiplex_init.3, man/man3/PAPI_multiplex_option_t.3, man/man3/PAPI_num_cmp_hwctrs.3, man/man3/PAPI_num_components.3, man/man3/PAPI_num_events.3, man/man3/PAPI_num_hwctrs.3, man/man3/PAPI_option_t.3, man/man3/PAPI_overflow.3, man/man3/PAPI_perror.3, man/man3/PAPI_preload_info_t.3, man/man3/PAPI_profil.3, man/man3/PAPI_query_event.3, man/man3/PAPI_query_named_event.3, man/man3/PAPI_rate_stop.3, man/man3/PAPI_read.3, man/man3/PAPI_read_ts.3, man/man3/PAPI_register_thread.3, man/man3/PAPI_remove_event.3, man/man3/PAPI_remove_events.3, man/man3/PAPI_remove_named_event.3, man/man3/PAPI_reset.3, man/man3/PAPI_set_cmp_domain.3, man/man3/PAPI_set_cmp_granularity.3, man/man3/PAPI_set_debug.3, man/man3/PAPI_set_domain.3, man/man3/PAPI_set_granularity.3, man/man3/PAPI_set_multiplex.3, man/man3/PAPI_set_opt.3, man/man3/PAPI_set_thr_specific.3, man/man3/PAPI_shlib_info_t.3, 
man/man3/PAPI_shutdown.3, man/man3/PAPI_sprofil.3, man/man3/PAPI_sprofil_t.3, man/man3/PAPI_start.3, man/man3/PAPI_state.3, man/man3/PAPI_stop.3, man/man3/PAPI_strerror.3, man/man3/PAPI_thread_id.3, man/man3/PAPI_thread_init.3, man/man3/PAPI_unlock.3, man/man3/PAPI_unregister_thread.3, man/man3/PAPI_write.3, man/man3/PAPIf_hl_read.3, man/man3/PAPIf_hl_region_begin.3, man/man3/PAPIf_hl_region_end.3, man/man3/PAPIf_hl_stop.3, man/man3/RateInfo.3, man/man3/binary_tree_t.3, man/man3/components_t.3, man/man3/local_components_t.3, man/man3/reads_t.3, man/man3/regions_t.3, man/man3/threads_t.3, man/man3/value_t.3, papi.spec, src/Makefile.in, src/configure, src/configure.in, src/papi.h: release: preparation for release commit - Update documentation - Update version * src/validation_tests/papi_br_tkn.c: papi_br_tkn: add not taken branch event to the right eventset The branch not taken event was added to the eventset for branch taken. Add the not taken event to the right eventset. * man/man3/PAPIF_enum_dev_type.3, man/man3/PAPIF_get_dev_attr.3, man/man3/PAPIF_get_dev_type_attr.3: sysdetect: add missing fortran man pages Man pages for PAPIF_enum_dev_type, PAPIF_get_dev_type_attr and PAPIF_get_dev_attr were missing. 2022-11-08 Giuseppe Congiu * .../sysdetect/tests/query_device_simple_f.F: sysdetect: update test to reflect 'list' argument removal Commit 482e8c5f1 removed the 'list' argument from the papif_get_dev_attr fortran wrapper. However, the test still passed 'dummy_list' to every call of the function. This caused the length of the string to be read from the wrong argument and the subsequent 'strncpy' to segfault.
2022-11-02 Giuseppe Congiu * man/man1/PAPI_derived_event_files.1, man/man1/papi_avail.1, man/man1/papi_clockres.1, man/man1/papi_command_line.1, man/man1/papi_component_avail.1, man/man1/papi_cost.1, man/man1/papi_decode.1, man/man1/papi_error_codes.1, man/man1/papi_event_chooser.1, man/man1/papi_hardware_avail.1, man/man1/papi_hybrid_native_avail.1, man/man1/papi_mem_info.1, man/man1/papi_multiplex_cost.1, man/man1/papi_native_avail.1, man/man1/papi_version.1, man/man1/papi_xml_event_info.1, man/man3/PAPIF_accum.3, man/man3/PAPIF_add_event.3, man/man3/PAPIF_add_events.3, man/man3/PAPIF_add_named_event.3, man/man3/PAPIF_assign_eventset_component.3, man/man3/PAPIF_cleanup_eventset.3, man/man3/PAPIF_create_eventset.3, man/man3/PAPIF_destroy_eventset.3, man/man3/PAPIF_enum_event.3, man/man3/PAPIF_epc.3, man/man3/PAPIF_event_code_to_name.3, man/man3/PAPIF_event_name_to_code.3, man/man3/PAPIF_flips_rate.3, man/man3/PAPIF_flops_rate.3, man/man3/PAPIF_get_clockrate.3, man/man3/PAPIF_get_dmem_info.3, man/man3/PAPIF_get_domain.3, man/man3/PAPIF_get_event_info.3, man/man3/PAPIF_get_exe_info.3, man/man3/PAPIF_get_granularity.3, man/man3/PAPIF_get_hardware_info.3, man/man3/PAPIF_get_multiplex.3, man/man3/PAPIF_get_preload.3, man/man3/PAPIF_get_real_cyc.3, man/man3/PAPIF_get_real_nsec.3, man/man3/PAPIF_get_real_usec.3, man/man3/PAPIF_get_virt_cyc.3, man/man3/PAPIF_get_virt_usec.3, man/man3/PAPIF_ipc.3, man/man3/PAPIF_is_initialized.3, man/man3/PAPIF_library_init.3, man/man3/PAPIF_lock.3, man/man3/PAPIF_multiplex_init.3, man/man3/PAPIF_num_cmp_hwctrs.3, man/man3/PAPIF_num_events.3, man/man3/PAPIF_num_hwctrs.3, man/man3/PAPIF_perror.3, man/man3/PAPIF_query_event.3, man/man3/PAPIF_query_named_event.3, man/man3/PAPIF_rate_stop.3, man/man3/PAPIF_read.3, man/man3/PAPIF_read_ts.3, man/man3/PAPIF_register_thread.3, man/man3/PAPIF_remove_event.3, man/man3/PAPIF_remove_events.3, man/man3/PAPIF_remove_named_event.3, man/man3/PAPIF_reset.3, man/man3/PAPIF_set_cmp_domain.3, 
man/man3/PAPIF_set_cmp_granularity.3, man/man3/PAPIF_set_debug.3, man/man3/PAPIF_set_domain.3, man/man3/PAPIF_set_event_domain.3, man/man3/PAPIF_set_granularity.3, man/man3/PAPIF_set_inherit.3, man/man3/PAPIF_set_multiplex.3, man/man3/PAPIF_shutdown.3, man/man3/PAPIF_start.3, man/man3/PAPIF_state.3, man/man3/PAPIF_stop.3, man/man3/PAPIF_thread_id.3, man/man3/PAPIF_thread_init.3, man/man3/PAPIF_unlock.3, man/man3/PAPIF_unregister_thread.3, man/man3/PAPIF_write.3, man/man3/PAPI_accum.3, man/man3/PAPI_add_event.3, man/man3/PAPI_add_events.3, man/man3/PAPI_add_named_event.3, man/man3/PAPI_addr_range_option_t.3, man/man3/PAPI_address_map_t.3, man/man3/PAPI_all_thr_spec_t.3, man/man3/PAPI_assign_eventset_component.3, man/man3/PAPI_attach.3, man/man3/PAPI_attach_option_t.3, man/man3/PAPI_cleanup_eventset.3, man/man3/PAPI_component_info_t.3, man/man3/PAPI_cpu_option_t.3, man/man3/PAPI_create_eventset.3, man/man3/PAPI_debug_option_t.3, man/man3/PAPI_destroy_eventset.3, man/man3/PAPI_detach.3, man/man3/PAPI_disable_component.3, man/man3/PAPI_disable_component_by_name.3, man/man3/PAPI_dmem_info_t.3, man/man3/PAPI_domain_option_t.3, man/man3/PAPI_enum_cmp_event.3, man/man3/PAPI_enum_dev_type.3, man/man3/PAPI_enum_event.3, man/man3/PAPI_epc.3, man/man3/PAPI_event_code_to_name.3, man/man3/PAPI_event_info_t.3, man/man3/PAPI_event_name_to_code.3, man/man3/PAPI_exe_info_t.3, man/man3/PAPI_flips_rate.3, man/man3/PAPI_flops_rate.3, man/man3/PAPI_get_cmp_opt.3, man/man3/PAPI_get_component_index.3, man/man3/PAPI_get_component_info.3, man/man3/PAPI_get_dev_attr.3, man/man3/PAPI_get_dev_type_attr.3, man/man3/PAPI_get_dmem_info.3, man/man3/PAPI_get_event_component.3, man/man3/PAPI_get_event_info.3, man/man3/PAPI_get_eventset_component.3, man/man3/PAPI_get_executable_info.3, man/man3/PAPI_get_hardware_info.3, man/man3/PAPI_get_multiplex.3, man/man3/PAPI_get_opt.3, man/man3/PAPI_get_overflow_event_index.3, man/man3/PAPI_get_real_cyc.3, man/man3/PAPI_get_real_nsec.3, 
man/man3/PAPI_get_real_usec.3, man/man3/PAPI_get_shared_lib_info.3, man/man3/PAPI_get_thr_specific.3, man/man3/PAPI_get_virt_cyc.3, man/man3/PAPI_get_virt_nsec.3, man/man3/PAPI_get_virt_usec.3, man/man3/PAPI_granularity_option_t.3, man/man3/PAPI_hl_read.3, man/man3/PAPI_hl_region_begin.3, man/man3/PAPI_hl_region_end.3, man/man3/PAPI_hl_stop.3, man/man3/PAPI_hw_info_t.3, man/man3/PAPI_inherit_option_t.3, man/man3/PAPI_ipc.3, man/man3/PAPI_is_initialized.3, man/man3/PAPI_itimer_option_t.3, man/man3/PAPI_library_init.3, man/man3/PAPI_list_events.3, man/man3/PAPI_list_threads.3, man/man3/PAPI_lock.3, man/man3/PAPI_mh_cache_info_t.3, man/man3/PAPI_mh_info_t.3, man/man3/PAPI_mh_level_t.3, man/man3/PAPI_mh_tlb_info_t.3, man/man3/PAPI_mpx_info_t.3, man/man3/PAPI_multiplex_init.3, man/man3/PAPI_multiplex_option_t.3, man/man3/PAPI_num_cmp_hwctrs.3, man/man3/PAPI_num_components.3, man/man3/PAPI_num_events.3, man/man3/PAPI_num_hwctrs.3, man/man3/PAPI_option_t.3, man/man3/PAPI_overflow.3, man/man3/PAPI_perror.3, man/man3/PAPI_preload_info_t.3, man/man3/PAPI_profil.3, man/man3/PAPI_query_event.3, man/man3/PAPI_query_named_event.3, man/man3/PAPI_rate_stop.3, man/man3/PAPI_read.3, man/man3/PAPI_read_ts.3, man/man3/PAPI_register_thread.3, man/man3/PAPI_remove_event.3, man/man3/PAPI_remove_events.3, man/man3/PAPI_remove_named_event.3, man/man3/PAPI_reset.3, man/man3/PAPI_set_cmp_domain.3, man/man3/PAPI_set_cmp_granularity.3, man/man3/PAPI_set_debug.3, man/man3/PAPI_set_domain.3, man/man3/PAPI_set_granularity.3, man/man3/PAPI_set_multiplex.3, man/man3/PAPI_set_opt.3, man/man3/PAPI_set_thr_specific.3, man/man3/PAPI_shlib_info_t.3, man/man3/PAPI_shutdown.3, man/man3/PAPI_sprofil.3, man/man3/PAPI_sprofil_t.3, man/man3/PAPI_start.3, man/man3/PAPI_state.3, man/man3/PAPI_stop.3, man/man3/PAPI_strerror.3, man/man3/PAPI_thread_id.3, man/man3/PAPI_thread_init.3, man/man3/PAPI_unlock.3, man/man3/PAPI_unregister_thread.3, man/man3/PAPI_write.3, man/man3/PAPIf_hl_read.3, 
man/man3/PAPIf_hl_region_begin.3, man/man3/PAPIf_hl_region_end.3, man/man3/PAPIf_hl_stop.3, man/man3/RateInfo.3, man/man3/binary_tree_t.3, man/man3/components_t.3, man/man3/local_components_t.3, man/man3/reads_t.3, man/man3/regions_t.3, man/man3/threads_t.3, man/man3/value_t.3: sysdetect: regenerate man pages for updated attributes * src/papi.c: sysdetect: remove unused attributes from doc * src/components/sysdetect/tests/query_device_mpi.c: sysdetect: white space cleanup 2022-11-02 John Rodgers * src/components/cuda/linux-cuda.c: CUDA: Align memory zero with pad Update logic in `cuda11_makeRoomAllEvents` to ensure the memory zero'ing operation covers the amount expanded by the `realloc` operation. * src/components/cuda/linux-cuda.c: CUDA: CUPTI11 Sporadic Memory Failures The CUPTI11 portion of the cuda component has exhibited sporadic memory failures for applications compiled against MVAPICH's libmpi.so. Specifically, the realloc operation in `cuda11_makeRoomAllEvents`, called in `_cuda11_add_native_events`, would fail even when there was sufficient memory to complete the requested allocation. As a workaround, this patch prevents the failure by allocating the expected memory up front prior to the device loop in `_cuda11_add_native_events`. * src/components/cuda/linux-cuda.c: CUDA: Prevent memory leak Prevent memory leak by freeing `firstLast` buffer in `_cuda11_add_native_events`. * src/components/cuda/linux-cuda.c: CUDA: Remove unnecessary code Remove logic only necessary when trying to resolve counters without an active profiling session. Given that a profiling session is created and active (see: _cuda11_add_native_events -> _cuda11_init_profiler) creation and usage of `cuda11_CounterAvailabilityImage` is unnecessary. * src/components/cuda/linux-cuda.c: CUDA: Prevent component deadlock Add missing component unlock to `_cuda_update_control_state` to prevent deadlocks encountered when adding multiple events sequentially. 
Patch resolves issue #121 * src/components/cuda/linux-cuda.c, src/components/nvml/linux-nvml.c, src/components/rocm/rocm.c, src/components/rocm_smi/linux-rocm-smi.c: DELAY_INIT: Set disabled for delay init comps Ensure components that leverage the delayed initialization scheme, namely cuda, nvml, rocm, and rocm_smi, set their respective .cmp_info.disabled flag with `PAPI_EDELAY_INIT` when completing the standard component initialization. Update necessary to conform with PR: 328 2022-10-24 Daniel Barry * src/components/pcp/linux-pcp.c, src/papi.h: pcp: warning instead of error when 'reason' string truncated When the hostname is too long, there is not enough memory allocated for the error 'reason' string. This caused the component to prematurely exit initialization when the PM daemon is not active. Instead, a warning is now issued, and the initialization exits appropriately. Additionally, the size of the 'reason' string has been increased to accommodate longer host names. These changes have been tested on the IBM POWER9 architecture. 2022-10-28 Daniel Barry * src/counter_analysis_toolkit/main.c: cat: support to comment-out lines in input file These changes add support for users to comment-out lines in the input file. This allows users to more flexibly take measurements without having to remove lines or use multiple input files. These changes have been tested on the AMD Zen3 architecture. 2022-10-27 Anthony Danalis * src/validation_tests/branches_testcode.c, src/validation_tests/papi_br_msp.c: Improved the branch misprediction validation test. The previous version of the branch misprediction validation test relied on the libc function random() to generate entropy. However, this function introduced 15x more branches than the number of branches in the code of the validation test, polluting the results.
The new code uses an inline Xorshift pseudo-random number generator which is more than sufficient to confuse the branch predictor, and does not contain any branch instructions so it does not pollute the event measurement. Also, the logic of the test has been simplified. 2022-10-27 Anthony * src/sde_lib/sde_lib_datastructures.c: Removed unneeded NULL pointer checks in libsde. 2022-09-14 Giuseppe Congiu * man/man1/PAPI_derived_event_files.1, man/man1/papi_avail.1, man/man1/papi_clockres.1, man/man1/papi_command_line.1, man/man1/papi_component_avail.1, man/man1/papi_cost.1, man/man1/papi_decode.1, man/man1/papi_error_codes.1, man/man1/papi_event_chooser.1, man/man1/papi_hardware_avail.1, man/man1/papi_hybrid_native_avail.1, man/man1/papi_mem_info.1, man/man1/papi_multiplex_cost.1, man/man1/papi_native_avail.1, man/man1/papi_version.1, man/man1/papi_xml_event_info.1, man/man3/PAPIF_accum.3, man/man3/PAPIF_add_event.3, man/man3/PAPIF_add_events.3, man/man3/PAPIF_add_named_event.3, man/man3/PAPIF_assign_eventset_component.3, man/man3/PAPIF_cleanup_eventset.3, man/man3/PAPIF_create_eventset.3, man/man3/PAPIF_destroy_eventset.3, man/man3/PAPIF_enum_event.3, man/man3/PAPIF_epc.3, man/man3/PAPIF_event_code_to_name.3, man/man3/PAPIF_event_name_to_code.3, man/man3/PAPIF_flips_rate.3, man/man3/PAPIF_flops_rate.3, man/man3/PAPIF_get_clockrate.3, man/man3/PAPIF_get_dmem_info.3, man/man3/PAPIF_get_domain.3, man/man3/PAPIF_get_event_info.3, man/man3/PAPIF_get_exe_info.3, man/man3/PAPIF_get_granularity.3, man/man3/PAPIF_get_hardware_info.3, man/man3/PAPIF_get_multiplex.3, man/man3/PAPIF_get_preload.3, man/man3/PAPIF_get_real_cyc.3, man/man3/PAPIF_get_real_nsec.3, man/man3/PAPIF_get_real_usec.3, man/man3/PAPIF_get_virt_cyc.3, man/man3/PAPIF_get_virt_usec.3, man/man3/PAPIF_ipc.3, man/man3/PAPIF_is_initialized.3, man/man3/PAPIF_library_init.3, man/man3/PAPIF_lock.3, man/man3/PAPIF_multiplex_init.3, man/man3/PAPIF_num_cmp_hwctrs.3, man/man3/PAPIF_num_events.3, 
man/man3/PAPIF_num_hwctrs.3, man/man3/PAPIF_perror.3, man/man3/PAPIF_query_event.3, man/man3/PAPIF_query_named_event.3, man/man3/PAPIF_rate_stop.3, man/man3/PAPIF_read.3, man/man3/PAPIF_read_ts.3, man/man3/PAPIF_register_thread.3, man/man3/PAPIF_remove_event.3, man/man3/PAPIF_remove_events.3, man/man3/PAPIF_remove_named_event.3, man/man3/PAPIF_reset.3, man/man3/PAPIF_set_cmp_domain.3, man/man3/PAPIF_set_cmp_granularity.3, man/man3/PAPIF_set_debug.3, man/man3/PAPIF_set_domain.3, man/man3/PAPIF_set_event_domain.3, man/man3/PAPIF_set_granularity.3, man/man3/PAPIF_set_inherit.3, man/man3/PAPIF_set_multiplex.3, man/man3/PAPIF_shutdown.3, man/man3/PAPIF_start.3, man/man3/PAPIF_state.3, man/man3/PAPIF_stop.3, man/man3/PAPIF_thread_id.3, man/man3/PAPIF_thread_init.3, man/man3/PAPIF_unlock.3, man/man3/PAPIF_unregister_thread.3, man/man3/PAPIF_write.3, man/man3/PAPI_accum.3, man/man3/PAPI_add_event.3, man/man3/PAPI_add_events.3, man/man3/PAPI_add_named_event.3, man/man3/PAPI_addr_range_option_t.3, man/man3/PAPI_address_map_t.3, man/man3/PAPI_all_thr_spec_t.3, man/man3/PAPI_assign_eventset_component.3, man/man3/PAPI_attach.3, man/man3/PAPI_attach_option_t.3, man/man3/PAPI_cleanup_eventset.3, man/man3/PAPI_component_info_t.3, man/man3/PAPI_cpu_option_t.3, man/man3/PAPI_create_eventset.3, man/man3/PAPI_debug_option_t.3, man/man3/PAPI_destroy_eventset.3, man/man3/PAPI_detach.3, man/man3/PAPI_disable_component.3, man/man3/PAPI_disable_component_by_name.3, man/man3/PAPI_dmem_info_t.3, man/man3/PAPI_domain_option_t.3, man/man3/PAPI_enum_cmp_event.3, man/man3/PAPI_enum_dev_type.3, man/man3/PAPI_enum_event.3, man/man3/PAPI_epc.3, man/man3/PAPI_event_code_to_name.3, man/man3/PAPI_event_info_t.3, man/man3/PAPI_event_name_to_code.3, man/man3/PAPI_exe_info_t.3, man/man3/PAPI_flips_rate.3, man/man3/PAPI_flops_rate.3, man/man3/PAPI_get_cmp_opt.3, man/man3/PAPI_get_component_index.3, man/man3/PAPI_get_component_info.3, man/man3/PAPI_get_dev_attr.3, man/man3/PAPI_get_dev_type_attr.3, 
man/man3/PAPI_get_dmem_info.3, man/man3/PAPI_get_event_component.3, man/man3/PAPI_get_event_info.3, man/man3/PAPI_get_eventset_component.3, man/man3/PAPI_get_executable_info.3, man/man3/PAPI_get_hardware_info.3, man/man3/PAPI_get_multiplex.3, man/man3/PAPI_get_opt.3, man/man3/PAPI_get_overflow_event_index.3, man/man3/PAPI_get_real_cyc.3, man/man3/PAPI_get_real_nsec.3, man/man3/PAPI_get_real_usec.3, man/man3/PAPI_get_shared_lib_info.3, man/man3/PAPI_get_thr_specific.3, man/man3/PAPI_get_virt_cyc.3, man/man3/PAPI_get_virt_nsec.3, man/man3/PAPI_get_virt_usec.3, man/man3/PAPI_granularity_option_t.3, man/man3/PAPI_hl_read.3, man/man3/PAPI_hl_region_begin.3, man/man3/PAPI_hl_region_end.3, man/man3/PAPI_hl_stop.3, man/man3/PAPI_hw_info_t.3, man/man3/PAPI_inherit_option_t.3, man/man3/PAPI_ipc.3, man/man3/PAPI_is_initialized.3, man/man3/PAPI_itimer_option_t.3, man/man3/PAPI_library_init.3, man/man3/PAPI_list_events.3, man/man3/PAPI_list_threads.3, man/man3/PAPI_lock.3, man/man3/PAPI_mh_cache_info_t.3, man/man3/PAPI_mh_info_t.3, man/man3/PAPI_mh_level_t.3, man/man3/PAPI_mh_tlb_info_t.3, man/man3/PAPI_mpx_info_t.3, man/man3/PAPI_multiplex_init.3, man/man3/PAPI_multiplex_option_t.3, man/man3/PAPI_num_cmp_hwctrs.3, man/man3/PAPI_num_components.3, man/man3/PAPI_num_events.3, man/man3/PAPI_num_hwctrs.3, man/man3/PAPI_option_t.3, man/man3/PAPI_overflow.3, man/man3/PAPI_perror.3, man/man3/PAPI_preload_info_t.3, man/man3/PAPI_profil.3, man/man3/PAPI_query_event.3, man/man3/PAPI_query_named_event.3, man/man3/PAPI_rate_stop.3, man/man3/PAPI_read.3, man/man3/PAPI_read_ts.3, man/man3/PAPI_register_thread.3, man/man3/PAPI_remove_event.3, man/man3/PAPI_remove_events.3, man/man3/PAPI_remove_named_event.3, man/man3/PAPI_reset.3, man/man3/PAPI_set_cmp_domain.3, man/man3/PAPI_set_cmp_granularity.3, man/man3/PAPI_set_debug.3, man/man3/PAPI_set_domain.3, man/man3/PAPI_set_granularity.3, man/man3/PAPI_set_multiplex.3, man/man3/PAPI_set_opt.3, man/man3/PAPI_set_thr_specific.3, 
man/man3/PAPI_shlib_info_t.3, man/man3/PAPI_shutdown.3, man/man3/PAPI_sprofil.3, man/man3/PAPI_sprofil_t.3, man/man3/PAPI_start.3, man/man3/PAPI_state.3, man/man3/PAPI_stop.3, man/man3/PAPI_strerror.3, man/man3/PAPI_thread_id.3, man/man3/PAPI_thread_init.3, man/man3/PAPI_unlock.3, man/man3/PAPI_unregister_thread.3, man/man3/PAPI_write.3, man/man3/PAPIf_hl_read.3, man/man3/PAPIf_hl_region_begin.3, man/man3/PAPIf_hl_region_end.3, man/man3/PAPIf_hl_stop.3, man/man3/RateInfo.3, man/man3/binary_tree_t.3, man/man3/components_t.3, man/man3/local_components_t.3, man/man3/reads_t.3, man/man3/regions_t.3, man/man3/threads_t.3, man/man3/value_t.3: doc: regenerate man pages 2022-10-25 Giuseppe Congiu * src/utils/papi_hardware_avail.c: papi_hardware_avail: print thread affinity list for numas 2022-10-26 Giuseppe Congiu * src/components/sysdetect/tests/query_device_mpi.c: sysdetect: add GPU affinity example in tests The GPU affinity example utilizes MPI shared memory windows to work out the GPU affinity of every MPI rank. The first rank in every GPU rank list prints the list of ranks for the given GPU. * src/components/Makefile_comp_tests.target.in, src/components/sysdetect/tests/Makefile: sysdetect: hook mpi tests to NO_MPI_TESTS The configure step in PAPI checks whether MPI tests can be enabled or not. If not, it sets NO_MPI_TESTS to yes. This variable is then used in ctests/Makefile.recipies to enable or disable MPI tests. The sysdetect tests were not relying on this variable. Instead sysdetect relied on MPICC being set, which is not accurate. This patch makes the MPI checks more uniform across the code by adding NO_MPI_TESTS checks in sysdetect tests too. 2022-10-25 Giuseppe Congiu * src/components/sysdetect/sysdetect.c: sysdetect: add PAPI_DEV_ATTR__CPU_UINT_THR_NUMA_AFFINITY PAPI_DEV_ATTR__CPU_UINT_THR_NUMA_AFFINITY was missing in sysdetect. This attribute can be used to discover the numa affinity of every hardware thread in the system.
* src/components/sysdetect/Rules.sysdetect, src/components/sysdetect/amd_gpu.c, src/components/sysdetect/nvidia_gpu.c, src/components/sysdetect/shm.c, src/components/sysdetect/shm.h, src/components/sysdetect/sysdetect.c, src/components/sysdetect/sysdetect.h, src/components/sysdetect/tests/query_device_mpi.c, .../sysdetect/tests/query_device_simple.c, .../sysdetect/tests/query_device_simple_f.F, src/configure, src/configure.in, src/genpapifdef.c, src/papi.h, src/papi_fwrappers.c, src/utils/papi_hardware_avail.c: sysdetect: remove builtin support for numa and GPU affinity Numa and GPU affinity of threads and MPI ranks adds an MPI dependency to PAPI that may cause problems (link time unresolved MPI symbols) if the application using PAPI does not link against MPI. Most of the work that sysdetect currently does to provide affinity lists to the users can be easily done by the users themselves. Thus, sysdetect will no longer support them. 2022-10-12 Daniel Barry * src/counter_analysis_toolkit/Makefile, src/counter_analysis_toolkit/vec.c: cat: ifdefs for AVX availability Utilize ifdefs so that the build can be more flexible between systems with different AVX vector-width availability. * src/counter_analysis_toolkit/Makefile, src/counter_analysis_toolkit/vec_arch.h, src/counter_analysis_toolkit/vec_fma_dp.c, src/counter_analysis_toolkit/vec_fma_hp.c, src/counter_analysis_toolkit/vec_fma_sp.c, src/counter_analysis_toolkit/vec_nonfma_dp.c, src/counter_analysis_toolkit/vec_nonfma_hp.c, src/counter_analysis_toolkit/vec_nonfma_sp.c: cat: specify architecture in macros Rename VEC_WIDTH_[128|256|512] to X86_VEC_WIDTH_[128|256|512]B to be more specific. * src/counter_analysis_toolkit/vec_arch.h: cat: remove unused typedef; add used typedef Typedef 'half' since this type is actually used in the code, and remove HP_SCALAR_TYPE. 
2022-10-11 Daniel Barry * src/counter_analysis_toolkit/Makefile, src/counter_analysis_toolkit/vec.c, src/counter_analysis_toolkit/vec_arch.h, src/counter_analysis_toolkit/vec_fma_dp.c, src/counter_analysis_toolkit/vec_fma_hp.c, src/counter_analysis_toolkit/vec_fma_sp.c, src/counter_analysis_toolkit/vec_nonfma_dp.c, src/counter_analysis_toolkit/vec_nonfma_hp.c, src/counter_analysis_toolkit/vec_nonfma_sp.c: cat: rename macros for POWER architecture For the sake of consistency, use "POWER" instead of "IBM." * src/counter_analysis_toolkit/vec_fma_dp.c, src/counter_analysis_toolkit/vec_fma_sp.c, src/counter_analysis_toolkit/vec_scalar_verify.c: cat: remove unused code Remove unused AMD Bulldozer intrinsics. 2022-09-19 Daniel Barry * src/counter_analysis_toolkit/Makefile, src/counter_analysis_toolkit/vec.c, src/counter_analysis_toolkit/vec_arch.h: cat: consolidate 'INTEL' and 'AMD' flags for vector FLOPs benchmark Since the ifdefs which check whether "INTEL" is defined also check whether "AMD" is defined, use "X86" for both. These changes have been tested on the Intel Ice Lake architecture. * src/counter_analysis_toolkit/vec.c, src/counter_analysis_toolkit/vec_arch.h, src/counter_analysis_toolkit/vec_fma_dp.c, src/counter_analysis_toolkit/vec_fma_hp.c, src/counter_analysis_toolkit/vec_fma_sp.c, src/counter_analysis_toolkit/vec_nonfma_dp.c, src/counter_analysis_toolkit/vec_nonfma_hp.c, src/counter_analysis_toolkit/vec_nonfma_sp.c: cat: specify architecture vector FLOPs benchmark function names Include the architecture names in the function names for consistency. These changes have been tested on the IBM POWER9 architecture. 
2022-09-07  Daniel Barry

	* src/counter_analysis_toolkit/vec.c,
	src/counter_analysis_toolkit/vec_arch.h,
	src/counter_analysis_toolkit/vec_fma_dp.c,
	src/counter_analysis_toolkit/vec_fma_hp.c,
	src/counter_analysis_toolkit/vec_fma_sp.c,
	src/counter_analysis_toolkit/vec_nonfma_dp.c,
	src/counter_analysis_toolkit/vec_nonfma_hp.c,
	src/counter_analysis_toolkit/vec_nonfma_sp.c:
	cat: vector FLOPs benchmark for non-x86 architectures bug fix

	The driver code for the vector benchmark could not call the functions
	for the vector FLOPs kernels because they were declared 'static'. For
	builds which use either the NEON or ALTIVEC intrinsics, these static
	functions are now wrapped, so they can be called by the driver. These
	changes have been tested on the IBM POWER9 architecture.

2022-10-13  Daniel Barry

	* src/components/powercap/tests/Makefile,
	.../powercap/tests/powercap_basic_read.c,
	.../powercap/tests/powercap_basic_readwrite.c:
	powercap: add new component tests

	This adds a new component test for each of the following:
	(1) add one event to an event set at a time and read it
	(2) add one event at a time, read it, write it, read the new value,
	restore the original value
	These changes have been tested on the Intel Ice Lake architecture.
2022-10-26  AnustuvICL

	* src/components/perf_event/pe_libpfm4_events.c:
	perf_event: Free allocated string in function allocate_native_event

2022-10-18  Peinan Zhang

	* src/components/intel_gpu/README, src/components/intel_gpu/README.md,
	.../intel_gpu/internal/inc/GPUMetricHandler.h,
	.../intel_gpu/internal/inc/GPUMetricInterface.h,
	.../intel_gpu/internal/src/GPUMetricHandler.cpp,
	.../intel_gpu/internal/src/GPUMetricInterface.cpp,
	src/components/intel_gpu/internal/src/Makefile,
	src/components/intel_gpu/linux_intel_gpu_metrics.c,
	src/components/intel_gpu/linux_intel_gpu_metrics.h,
	src/components/intel_gpu/tests/Makefile,
	src/components/intel_gpu/tests/gemm.spv,
	src/components/intel_gpu/tests/gpu_common_utils.c,
	src/components/intel_gpu/tests/gpu_common_utils.h,
	src/components/intel_gpu/tests/gpu_metric_list.c,
	src/components/intel_gpu/tests/gpu_metric_read.c,
	src/components/intel_gpu/tests/gpu_query_gemm.cc,
	src/components/intel_gpu/tests/gpu_thread_read.c,
	src/components/intel_gpu/tests/readme.txt:
	Added support for multiple Intel GPU devices and multiple tiles per
	device.

	Allow querying performance metrics on multiple Intel GPUs and multiple
	tiles per GPU. Support Intel GPU Arctic Sound and Ponte Vecchio.
	Update test cases to take metrics from input, so that they work with
	different platforms. Update component README.md file.

2022-10-24  Giuseppe Congiu

	* src/utils/papi_native_avail.c:
	sde: make '-sde' option always visible in papi_native_avail

	The '-sde' option was not visible in papi_native_avail unless the SDE
	component was configured in PAPI. Now we always have the option
	visible but return an error if the SDE component is not configured.

2022-10-25  Anthony

	* src/configure, src/configure.in:
	Make papi_native_avail support the "-sde" flag only if *both* libsde
	and the SDE component are configured in.

	* src/components/sde/tests/Makefile,
	src/components/sde/tests/README.txt:
	Added path to libpfm4 in the SDE tests Makefile, and further
	instructions for users in the README.
2022-10-24  AnustuvICL

	* src/papi.h:
	papi.h: Update bit field post removal of members from struct
	_papi_component_option

2022-10-23  William Cohen

	* src/components/sysdetect/linux_cpu_utils.c, src/linux-memory.c:
	Use fgets in place of fscanf functions to avoid possible buffer
	overflows

	There were several locations in the PAPI code that used fscanf calls
	like the following statement to read in information:

		result=fscanf(fff,"%s",allocation_policy_string);

	The problem with this statement is that the fscanf could possibly
	write past the end of allocation_policy_string. To limit the write to
	the size of the allocation_policy_string, an fgets like the following
	is used in its place:

		str_result=fgets(allocation_policy_string, BUFSIZ, fff);

	One set of fscanf calls was for the generic memory information code
	reading the cache characteristics. Another fscanf was in the sysdetect
	component's reading of cache characteristics.

2022-10-10  AnustuvICL

	* src/genpapifdef.c, src/papi.h:
	Remove C++ style commented code

2022-08-31  Giuseppe Congiu

	* src/components/perfctr/perfctr.c: perfctr: set disabled flag in cmp

	* src/components/perfmon2/perfmon.c: perfmon2: set disabled flag in cmp

	* src/papi_internal.c: papi: do not set disabled flag in framework

	* src/components/vmware/vmware.c: vmware: set disabled flag in cmp

	* src/components/stealtime/linux-stealtime.c: stealtime: set disabled
	flag in cmp

	* src/components/sensors_ppc/linux-sensors-ppc.c: sensors_ppc: set
	disabled flag in cmp

	* src/components/rapl/linux-rapl.c: rapl: set disabled flag in cmp

	* src/components/powercap_ppc/linux-powercap-ppc.c: powercap_ppc: set
	disabled flag in cmp

	* src/components/powercap/linux-powercap.c: powercap: set disabled
	flag in cmp

	* src/components/perf_event_uncore/perf_event_uncore.c: perf_event_u:
	set disabled flag in cmp

	* src/components/perf_event/perf_event.c: perf_event: set disabled
	flag in cmp

	* src/components/pcp/linux-pcp.c: pcp: set disabled flag in cmp

	* src/components/net/linux-net.c: net: set disabled flag in cmp

	* src/components/mx/linux-mx.c: mx: set disabled flag in cmp

	* src/components/lustre/linux-lustre.c: lustre: set disabled flag in
	cmp

	* src/components/lmsensors/linux-lmsensors.c: lmsensors: set disabled
	flag in cmp

	* src/components/libmsr/linux-libmsr.c: libmsr: set disabled flag in
	cmp

	* src/components/io/linux-io.c: io: set disabled flag in cmp

	* src/components/intel_gpu/linux_intel_gpu_metrics.c: intel_gpu: set
	disabled flag in cmp

	* src/components/infiniband/linux-infiniband.c: infiniband: set
	disabled flag in cmp

	* src/components/micpower/linux-micpower.c: micpower: set disabled
	flag in cmp

	* src/components/host_micpower/linux-host_micpower.c: host_micpower:
	set disabled flag in cmp

	* src/components/example/example.c: example: set disabled flag in cmp

	* src/components/coretemp_freebsd/coretemp_freebsd.c:
	coretemp_freebsd: set disabled flag in cmp

	* src/components/coretemp/linux-coretemp.c: coretemp: set disabled
	flag in cmp

	* src/components/appio/appio.c: appio: set disabled flag in cmp

2022-10-18  Giuseppe Congiu

	* src/components/rocm/rocm.c: rocm: return PAPI_ENOEVNT if event not
	found

2022-10-17  Giuseppe Congiu

	* .../rocm/tests/intercept_single_kernel_monitoring.cpp,
	.../rocm/tests/intercept_single_thread_monitoring.cpp,
	src/components/rocm/tests/multi_kernel_monitoring.cpp,
	src/components/rocm/tests/multi_thread_monitoring.cpp,
	.../rocm/tests/sample_single_kernel_monitoring.cpp,
	src/components/rocm/tests/single_thread_monitoring.cpp:
	rocm: SQ_WAVES does not reflect logical waves

	SQ_WAVES counts the number of logical waves, plus the waves that are
	restored due to context switching. This patch computes the logical
	number of waves as SQ_WAVES - SQ_WAVES_RESTORED. For those
	architectures that do not support SQ_WAVES_RESTORED (preceding MI200)
	the tests return with a warning and the number of waves check is
	ignored.
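	The fscanf-to-fgets change described in the 2022-10-23 entry above can
	be sketched in plain C. This is a minimal illustration, not PAPI's
	actual code; the buffer name and fmemopen-backed stream are stand-ins
	for the real /proc file reads:

	```c
	#define _POSIX_C_SOURCE 200809L
	#include <stdio.h>
	#include <string.h>

	int main(void)
	{
	    /* A 31-character token but only a 16-byte buffer:
	     * fscanf(fff, "%s", buf) would write past the end of buf,
	     * while fgets stops after sizeof(buf)-1 characters. */
	    char buf[16];
	    FILE *fff = fmemopen((void *)"interleave_extremely_long_token\n",
	                         32, "r");
	    if (fff == NULL)
	        return 1;
	    char *str_result = fgets(buf, sizeof buf, fff); /* at most 15 chars + NUL */
	    fclose(fff);
	    if (str_result == NULL)
	        return 1;
	    buf[strcspn(buf, "\n")] = '\0';   /* strip trailing newline, if any */
	    printf("%zu\n", strlen(buf));     /* never exceeds 15 */
	    return 0;
	}
	```

	The bounded read silently truncates oversized tokens instead of
	corrupting adjacent memory, which is the trade-off the fix accepts.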
2022-10-20  Daniel Barry

	* src/counter_analysis_toolkit/main.c:
	cat: fix memory leak from hw_desc alloc

	Free the dynamically allocated memory used by the hardware description
	feature of CAT. These changes have been tested on the Intel Westmere
	EP architecture.

2022-10-19  Anthony

	* src/Makefile.in, src/Makefile.inc,
	src/components/Makefile_comp_tests.target.in,
	src/components/sde/tests/Makefile, src/configure, src/configure.in,
	src/sde_lib/Makefile:
	Make static libsde.a optional.

	We build the static sde library 'libsde.a' only if libpapi.a is also
	built, based on the configure flags provided by the user (i.e.,
	--with-static-lib). Also, the linking of the relevant tests and
	utilities depends on the existence or not of the static sde library.

2022-10-20  Daniel Barry

	* src/counter_analysis_toolkit/hw_desc.h,
	src/counter_analysis_toolkit/main.c:
	cat: define default number of OMP threads

	Using the PAPI_hw_info_t structure, define the default number of
	threads as the number of CPUs per socket. These changes have been
	tested on the Intel Westmere EP architecture.
2022-10-18  William Cohen

	* src/components/sysdetect/tests/query_device_simple_f.F:
	Removed unused label and variable from query_device_simple_f.F

	Clean up query_device_simple_f.F to eliminate the following warnings:

	query_device_simple_f.F:142:12:
	  142 |  10   format(9I5)
	      |            1
	Warning: Label 10 at (1) defined but not used [-Wunused-label]

	query_device_simple_f.F:7:41:
	    7 | integer :: i, j, ret_val, error, handle, modifier, id, vendor_id
	      |                                  1
	Warning: Unused variable 'error' declared at (1) [-Wunused-variable]

	* src/papi_preset.c:
	Correctly size papi_preset.c array to avoid possible overflow

	Upped the work array size to avoid the following warnings:

	papi_preset.c: In function 'update_ops_string':
	papi_preset.c:336:50: warning: '%d' directive writing between 1 and
	11 bytes into a region of size 9 [-Wformat-overflow=]
	  336 |     sprintf (work, "N%d", cur_index-1);
	papi_preset.c:336:48: note: directive argument in the range
	[-2147483648, 2147483646]
	  336 |     sprintf (work, "N%d", cur_index-1);
	In file included from /usr/include/stdio.h:906,
	                 from papi_debug.h:23,
	                 from papi_internal.h:24,
	                 from papi_preset.c:18:
	In function 'sprintf',
	    inlined from 'update_ops_string' at papi_preset.c:336:5:
	/usr/include/bits/stdio2.h:30:10: note: '__builtin___sprintf_chk'
	output between 3 and 13 bytes into a destination of size 10
	   30 |   return __builtin___sprintf_chk (__s, __USE_FORTIFY_LEVEL - 1,
	   31 |                                   __glibc_objsize (__s), __fmt,
	   32 |                                   __va_arg_pack ());

2022-10-13  Daniel Barry

	* src/components/powercap/tests/powercap_basic.c:
	powercap: fix memory leak in test

	The component test 'powercap_basic' now frees the dynamically
	allocated memory used to store counter readings. These changes have
	been tested on the Intel Cascade Lake architecture.
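	The papi_preset.c sizing issue above comes down to arithmetic the
	compiler is doing for you: "N" plus the longest possible int
	("-2147483648", 11 characters) plus the NUL terminator needs 13 bytes,
	so a 10-byte buffer can overflow. A minimal sketch (buffer size and
	format string mirror the warning, not PAPI's surrounding code):

	```c
	#include <stdio.h>
	#include <limits.h>

	int main(void)
	{
	    /* 16 >= 13, so even the worst-case int value fits; snprintf
	     * additionally guarantees no overflow if the sizing is wrong. */
	    char work[16];
	    int n = snprintf(work, sizeof work, "N%d", INT_MIN);
	    printf("%d %s\n", n, work);   /* n counts chars written, sans NUL */
	    return 0;
	}
	```

	Sizing the buffer for the worst case silences -Wformat-overflow at its
	root; snprintf is a belt-and-suspenders addition.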
Sat Oct 1 23:04:01 2022 -0700  Stephane Eranian

	* src/libpfm4/lib/events/intel_icl_events.h,
	src/libpfm4/lib/events/intel_skl_events.h,
	src/libpfm4/lib/events/intel_spr_events.h,
	src/libpfm4/lib/pfmlib_amd64.c:
	libpfm4: update to commit 8aaaf17

	Original commits:

	commit 8aaaf1747e96031a47ed6bd9337ff61a21f8cc64
	add missing break in amd64_get_revision()

	Fixed bug introduced by: commit 79031f76f8a1 ("fix amd_get_revision()
	to identify AMD Zen3 uniquely"). Must have a break statement for AMD
	Zen3 (model 1) to avoid errors later.
	Reported-by: Steve Kaufmann

	commit bc4233d35418788423e8442395c7920eb156589d
	update Intel Skylake event table

	Based on download.01.org version 1.28.

	commit 4c0bc1c8ae06abd5f876657888b88aaf9c9530e6
	Fix typos in Intel Icelake event table

	Based on download.01.org version 1.16.

	commit b6f86fb0d8eae38d65d4394e3ed82f528b10bebf
	Update Intel SapphireRapid event table

	Based on download.01.org release 1.06. Minor changes to ASSISTS and
	DECODE events. Untested.

2022-10-13  Daniel Barry

	* src/components/powercap/tests/powercap_basic.c:
	powercap: ensure proper string format in test

	Ensure that the proper string is null-terminated.

2022-10-11  Daniel Barry

	* src/components/powercap/linux-powercap.c,
	src/components/powercap/tests/Makefile,
	src/components/powercap/tests/powercap_basic.c:
	powercap: fix formatting

	Replace tabs with appropriate amounts of spaces. These changes have
	been tested on the Intel Cascade Lake architecture.

2022-10-10  Daniel Barry

	* src/components/powercap/tests/powercap_basic.c:
	powercap: fix compiler warnings for component test powercap_basic

	The warnings for the powercap component test can also be squelched by
	replacing sizeof() with the actual buffer sizes. These changes have
	been tested on the Intel Cascade Lake architecture.

	* src/components/powercap/linux-powercap.c:
	powercap: fix compiler warnings for component

	The warnings for the powercap component can be squelched by replacing
	sizeof() with the actual size of the destination buffer.
	These changes have been tested on the Intel Cascade Lake architecture.

2022-10-10  AnustuvICL

	* src/aix.c, src/components/bgpm/IOunit/linux-IOunit.c,
	src/components/bgpm/L2unit/linux-L2unit.c,
	src/components/perf_event/perf_event.c,
	src/components/perf_event/perf_helpers.h,
	src/components/perfctr/perfctr.c, src/components/perfmon2/perfmon.c,
	src/components/perfmon_ia64/perfmon-ia64.c,
	src/components/perfnec/perfmon.c, src/components/rocm/rocm.c,
	src/components/sde/sde.c, src/ctests/attach2.c, src/ctests/attach3.c,
	src/ctests/attach_validate.c, src/ctests/byte_profile.c,
	src/ctests/data_range.c, src/ctests/earprofile.c,
	src/ctests/prof_utils.c, src/ctests/prof_utils.h,
	src/ctests/profile.c, src/ctests/profile_pthreads.c,
	src/ctests/profile_twoevents.c, src/ctests/sprofile.c,
	src/examples/PAPI_profil.c, src/examples/sprofile.c, src/extras.c,
	src/extras.h, src/linux-bgp.c, src/linux-bgq.c, src/linux-context.h,
	src/linux-memory.c, src/papi.c, src/papi.h, src/papi_fwrappers.c,
	src/papi_internal.h, src/papivi.h, src/solaris-common.c,
	src/solaris-common.h, src/solaris-niagara2.c, src/solaris-ultra.c,
	src/solaris-ultra.h:
	Refactor caddr_t to void* vptr_t

2022-10-11  Anthony

	* src/counter_analysis_toolkit/params.h:
	Missing file that should have been included in PR 349 (commit 89c0f19).

2022-10-12  Giuseppe Congiu

	* src/papi_fwrappers.c:
	sysdetect: fix warning in papi_fwrappers.c

	papi_fwrappers.c is used to generate multiple wrapper versions for
	fortran. Because of a global variable not declared static, the
	different versions cause a redefinition of the symbols when used with
	recent versions of the gcc compiler (as the compiler does link time
	optimizations). Declaring the variable static should fix the problem.

2022-09-07  Daniel Barry

	* src/counter_analysis_toolkit/Makefile,
	src/counter_analysis_toolkit/driver.h,
	src/counter_analysis_toolkit/main.c:
	cat: add MPI support

	Add MPI support to accelerate the collection of event data.
	This works by splitting up the list of events to be monitored among
	the MPI ranks. These changes have been tested on the IBM POWER9
	architecture.

2022-09-28  Anthony

	* src/counter_analysis_toolkit/scripts/README.txt,
	src/counter_analysis_toolkit/scripts/default.gnp,
	.../scripts/multi_plot.gnp, .../scripts/process_dcache_output.sh,
	.../L2_RQSTS:ALL_DEMAND_REFERENCES.data.reads.stat,
	.../L2_RQSTS:DEMAND_DATA_RD_HIT.data.reads.stat,
	.../L2_RQSTS:DEMAND_DATA_RD_MISS.data.reads.stat,
	.../scripts/single_plot.gnp:
	Scripts and sample data for viewing CAT's dcache output.

2022-09-21  Anthony

	* src/counter_analysis_toolkit/driver.h,
	src/counter_analysis_toolkit/main.c:
	Removed redundant latency step.

	* src/counter_analysis_toolkit/main.c:
	Added support for "-quick" flag which skips the latency tests.

	* src/counter_analysis_toolkit/eventstock.c:
	Force the CPU component to initialize itself.

	* src/counter_analysis_toolkit/branch.c,
	src/counter_analysis_toolkit/branch.h,
	src/counter_analysis_toolkit/dcache.c,
	src/counter_analysis_toolkit/dcache.h,
	src/counter_analysis_toolkit/driver.h,
	src/counter_analysis_toolkit/icache.c,
	src/counter_analysis_toolkit/main.c:
	Cleaned up the way we handle the parameters specified via the command
	line arguments.

2022-09-04  Giuseppe Congiu

	* src/components/sysdetect/tests/Makefile,
	.../sysdetect/tests/query_device_simple_f.F, src/genpapifdef.c,
	src/papi_fwrappers.c:
	sysdetect: add fortran bindings and test

	Add fortran bindings for PAPI sysdetect interface and tests.

2022-09-13  Daniel Barry

	* src/components/powercap/linux-powercap.c:
	powercap: fix wrap-around arithmetic

	When the energy counters reach the maximum value (given by
	'/sys/class/powercap/intel-rapl*/max_energy_range_uj'), they wrap
	around to zero. There is arithmetic in the powercap component to
	account for this case, but it previously used the maximum value for an
	unsigned int, which is not necessarily the value given by
	'max_energy_range_uj'.
	Thus, the arithmetic has been modified to now use the values given in
	the appropriate 'max_energy_range_uj' files. These changes have been
	tested on the Intel Cascade Lake architecture.

2022-10-07  Giuseppe Congiu

	* src/components/infiniband/linux-infiniband.c:
	infiniband: fix warning in snprintf

	Instead of using FILENAME_MAX as the length of the string to be copied
	over to ev_file, use the sum of the substrings and account for the
	extra '/'.

2022-08-31  Giuseppe Congiu

	* src/components/perfmon2/perfmon.c:
	perfmon2: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

	* src/components/perfctr/perfctr.c:
	perfctr: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

2022-08-28  Giuseppe Congiu

	* src/components/host_micpower/linux-host_micpower.c:
	host_micpower: funnel PAPI_ENOMEM through fn_fail

	untested due to lack of hardware

	* src/components/host_micpower/linux-host_micpower.c:
	host_micpower: rework error handling in init_component

	* src/components/host_micpower/linux-host_micpower.c:
	host_micpower: delete empty line

	* src/components/host_micpower/linux-host_micpower.c:
	host_micpower: add fn_exit point

	* src/components/host_micpower/linux-host_micpower.c:
	host_micpower: rename disable_me to fn_fail

	* src/components/vmware/vmware.c:
	vmware: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.
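	The powercap wrap-around correction described in the 2022-09-13 entry
	can be sketched as follows. The helper name and the sample
	max_energy_range_uj value are illustrative, not taken from PAPI or
	real hardware; the point is that the wrap happens at the sysfs-reported
	range, not at UINT_MAX:

	```c
	#include <stdio.h>
	#include <inttypes.h>

	/* Energy delta between two RAPL-style readings, where the counter
	 * wraps at max_range (the value read from max_energy_range_uj)
	 * rather than at the maximum of the integer type. */
	static uint64_t energy_delta(uint64_t prev, uint64_t now,
	                             uint64_t max_range)
	{
	    if (now >= prev)
	        return now - prev;
	    return (max_range - prev) + now;   /* counter wrapped past max_range */
	}

	int main(void)
	{
	    uint64_t max_range = 262143328850ULL;  /* example sysfs value, uJ */
	    /* Reading was 100 uJ below the wrap point, then 400 uJ after it:
	     * the true consumption is 100 + 400 = 500 uJ. */
	    printf("%" PRIu64 "\n",
	           energy_delta(max_range - 100, 400, max_range));
	    return 0;
	}
	```

	Using UINT_MAX here would overstate the delta whenever
	max_energy_range_uj is smaller than UINT_MAX, which is exactly the bug
	the entry describes.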
	* src/components/stealtime/linux-stealtime.c:
	stealtime: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

	* src/components/sensors_ppc/linux-sensors-ppc.c:
	sensors_ppc: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

	* src/components/rapl/linux-rapl.c:
	rapl: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

	* src/components/powercap_ppc/linux-powercap-ppc.c:
	powercap_ppc: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

	* src/components/powercap/linux-powercap.c:
	powercap: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

	* src/components/pcp/linux-pcp.c:
	pcp: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.
	* src/components/pcp/linux-pcp.c:
	pcp: return PAPI_ECMP on error instead of ctxHandle

	* src/components/net/linux-net.c:
	net: return PAPI_ECMP on error instead of num_events

	* src/components/net/linux-net.c:
	net: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

	* src/components/mx/linux-mx.c:
	mx: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

	* src/components/micpower/linux-micpower.c:
	micpower: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

	untested due to lack of hardware

	* src/components/micpower/linux-micpower.c:
	micpower: replace PAPI_ENOCMP with PAPI_ECMP

	PAPI_ENOCMP should be used to indicate that the requested component is
	not available in the component index (e.g. because it wasn't
	initialized). PAPI_ECMP, on the other hand, should be used when the
	component is initialized but some requested feature is not supported
	by the component (e.g. the component is not compatible with the
	feature). By its own definition no component can return PAPI_ENOCMP.

	* src/components/lustre/linux-lustre.c:
	lustre: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

	* src/components/lmsensors/linux-lmsensors.c:
	lmsensors: funnel init_component failures

	init_component failures are handled locally to the failure.
	Instead, funnel all error handling code paths through a single exit
	point. This makes the code more robust to bugs and also makes it
	easier to read.

	* src/components/libmsr/linux-libmsr.c:
	libmsr: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

	* src/components/io/linux-io.c:
	io: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

	* src/components/intel_gpu/linux_intel_gpu_metrics.c:
	intel_gpu: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

	* src/components/example/example.c:
	example: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

	* src/components/coretemp/linux-coretemp.c:
	coretemp: replace PAPI_ENOCMP with PAPI_ECMP

	PAPI_ENOCMP should be used to indicate that the requested component is
	not available in the component index (e.g. because it wasn't
	initialized). PAPI_ECMP, on the other hand, should be used when the
	component is initialized but some requested feature is not supported
	by the component (e.g. the component is not compatible with the
	feature). By its own definition no component can return PAPI_ENOCMP.

	* src/components/coretemp_freebsd/coretemp_freebsd.c:
	coretemp_freebsd: funnel init_component failures

	init_component failures are handled locally to the failure.
	Instead, funnel all error handling code paths through a single exit
	point. This makes the code more robust to bugs and also makes it
	easier to read.

	untested

	* src/components/coretemp/linux-coretemp.c:
	coretemp: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

	* src/components/appio/appio.c:
	appio: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

2022-08-27  Giuseppe Congiu

	* src/components/perf_event_uncore/perf_event_uncore.c:
	perf_event_u: replace PAPI_ENOCMP with PAPI_ECMP

	PAPI_ENOCMP should be used to indicate that the requested component is
	not available in the component index (e.g. because it wasn't
	initialized). PAPI_ECMP, on the other hand, should be used when the
	component is initialized but some requested feature is not supported
	by the component (e.g. the component is not compatible with the
	feature). By its own definition no component can return PAPI_ENOCMP.

	* src/components/perf_event/perf_event.c:
	perf_event: replace PAPI_ENOCMP with PAPI_ECMP

	PAPI_ENOCMP should be used to indicate that the requested component is
	not available in the component index (e.g. because it wasn't
	initialized). PAPI_ECMP, on the other hand, should be used when the
	component is initialized but some requested feature is not supported
	by the component (e.g. the component is not compatible with the
	feature). By its own definition no component can return PAPI_ENOCMP.

	* src/components/perf_event_uncore/perf_event_uncore.c:
	perf_event_u: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point.
	This makes the code more robust to bugs and also makes it easier to
	read.

	* src/components/perf_event/perf_event.c:
	perf_event: funnel init_component failures

	init_component failures are handled locally to the failure. Instead,
	funnel all error handling code paths through a single exit point. This
	makes the code more robust to bugs and also makes it easier to read.

2022-09-22  Giuseppe Congiu

	* src/components/rocm_smi/tests/Makefile:
	rocm_smi: add default rocm path for tests

	When PAPI_ROCM_ROOT is not defined it expands to the empty string
	during compilation. Thus, many of the rocm flags used by the compiler
	are incomplete and might cause problems. This patch makes sure that
	PAPI_ROCM_ROOT always falls back to a default if not defined.

2022-09-29  Anthony

	* src/Makefile.inc, src/sde_lib/Makefile:
	libsde: Passing the CC Makefile variable to the sub-make.

2022-09-22  Giuseppe Congiu

	* src/components/rocm/tests/hl_intercept_multi_thread_monitoring.cpp,
	src/components/rocm/tests/hl_intercept_single_thread_monitoring.cpp,
	src/components/rocm/tests/hl_sample_single_thread_monitoring.cpp:
	rocm: skip multi-threaded high-level API tests

	PAPI high-level API tests in rocm require user intervention to set the
	LD_LIBRARY_PATH to the path of libpapi.so and librocprofiler64.so,
	required, respectively, to set ROCP_TOOL_LIB and HSA_TOOLS_LIB. Skip
	these tests as they would fail without LD_LIBRARY_PATH being properly
	set.

2022-09-21  Giuseppe Congiu

	* src/components/rocm/tests/Makefile:
	rocm: use static libpapi for tests

	Instead of linking to libpapi.so, which requires additional
	environment variables (i.e. LD_LIBRARY_PATH) to be set for the test to
	work, build tests with libpapi.a.

	* src/components/rocm/tests/Makefile:
	rocm: add default rocm path for tests

	When PAPI_ROCM_ROOT is not defined it expands to the empty string
	during compilation. Thus, many of the rocm flags used by the compiler
	are incomplete and might cause problems.
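	The "funnel init_component failures" entries above all apply the same
	single-exit-point style, which PAPI spells with fn_exit/fn_fail
	labels. A minimal sketch of the pattern; the function name, error
	codes, and allocations here are invented for illustration, not PAPI's
	actual init_component code:

	```c
	#include <stdio.h>
	#include <stdlib.h>

	#define DEMO_OK      0
	#define DEMO_ENOMEM -1

	/* Every failure jumps to fn_fail, so cleanup lives in exactly one
	 * place instead of being duplicated at each early return. */
	static int demo_init(size_t nevents)
	{
	    int retval = DEMO_OK;
	    char *names = NULL, *codes = NULL;

	    names = malloc(nevents);
	    if (names == NULL) { retval = DEMO_ENOMEM; goto fn_fail; }

	    codes = malloc(nevents);
	    if (codes == NULL) { retval = DEMO_ENOMEM; goto fn_fail; }

	  fn_exit:
	    return retval;
	  fn_fail:
	    free(names);      /* free(NULL) is a no-op, so this is always safe */
	    free(codes);
	    names = codes = NULL;
	    goto fn_exit;
	}

	int main(void)
	{
	    printf("%d\n", demo_init(8) == DEMO_OK);
	    return 0;
	}
	```

	The success path falls through fn_exit untouched; only failures take
	the cleanup detour, which is what makes the code "more robust to bugs"
	when new failure points are added later.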
	This patch makes sure that PAPI_ROCM_ROOT always falls back to a
	default if not defined.

2022-07-11  Giuseppe Congiu

	* src/components/rocm/tests/Makefile:
	rocm: account for PAPI defined compilation flags

	Makefile_comp_tests.target already contains all the variables needed
	to compile tests in various components. Instead of hard coding compile
	flags all over again, use the available variables.

2022-06-02  Giuseppe Congiu

	* src/components/rocm/tests/Makefile:
	rocm: add Makefile_comp_tests.target dependency in tests

	Currently, the install target is missing in rocm tests. Including
	Makefile_comp_tests.target fixes the problem.

2022-09-24  Giuseppe Congiu

	* src/linux-memory.c:
	meminfo: support POWER10 cache information

	Add support for IBM POWER10 information in meminfo.

	* src/components/sysdetect/powerpc_cpu_utils.c:
	sysdetect: add POWER10 cache info support

	* src/components/sysdetect/powerpc_cpu_utils.c:
	sysdetect: update power9 L1 cache info

	The number of lines in L1 cache is wrong. It was set to 64 for a 32KB
	cache with 128B line size, while it should be 32K/128 = 256.

2022-09-04  Giuseppe Congiu

	* src/linux-memory.c:
	meminfo: add power9 cache info

2022-09-21  Giuseppe Congiu

	* src/configure, src/configure.in:
	configure: fix tls check logic

2022-09-23  Giuseppe Congiu

	* src/papi.c:
	papi_get_opt: update documentation for PAPI_LIB_VERSION

	The PAPI_LIB_VERSION option in PAPI_get_opt() no longer requires PAPI
	to be successfully initialized first.

	* src/ctests/version.c:
	ctests/version: PAPI_library_init does not fail test

	PAPI_library_init can fail now and PAPI_get_opt will return the
	runtime version for the user to compare with the linked version.

2022-09-22  Giuseppe Congiu

	* src/papi.c:
	Return PAPI library version even if PAPI is not initialized

	Currently, there is no way for the user to compare their version of
	the PAPI library with the version of the library being loaded at
	runtime.
	PAPI_library_init() takes the version of the user library and compares
	it with the version of the library loaded. If the two don't match, it
	returns an error. PAPI_get_opt() also provides access to some library
	information, like the version, but this only works if PAPI has been
	correctly initialized. This patch extends PAPI_get_opt() to provide
	the library version regardless of whether PAPI has been initialized at
	all.

2021-12-02  Masahiko, Yamada

	* src/ctests/all_native_events.c:
	ctests: improve all_native_events.

	all_native_events does not test for event names only. Therefore, you
	cannot test that specifying only the event name in an uncore PMU event
	results in an error. Add a test for all_native_events with only the
	event name.

Tue Sep 20 00:46:19 2022 -0700  Stephane Eranian

	* src/libpfm4/config.mk, src/libpfm4/debian/changelog,
	src/libpfm4/debian/control, src/libpfm4/debian/rules,
	src/libpfm4/include/perfmon/perf_event.h,
	src/libpfm4/lib/pfmlib_amd64.c:
	libpfm4: update to commit 8c606bc

	Original commits:

	commit 8c606bc2f2d186c2797d9f013283c9150f594f93
	update perf_event.h to Linux 5.18

	The perf_events interface for directly accessing PMU registers from
	userspace for arm64 has been formally implemented in the kernel v5.18.
	Update perf_event.h header used to build perf_event based examples.

	commit 79031f76f8a1af7d3c83ae3c4363d32cfb5dadc6
	fix amd_get_revision() to identify AMD Zen3 uniquely

	Make sure we handle the model number properly for AMD Zen3. Right now,
	it would consider any family 19h as Zen3.
	commit 56f6a05d46b7592ddf81d77f4714dfc9b4c975e5
	Update to version 12.1

	To fix some debian control files issues.

	commit e1c16c829abc86a4e9547f4518d7834fcbd0a603
	fix debian rules to build again

	Can now build using:
		$ debuild -i -us -uc -b -d

	commit 19b784d3404fda20e27b30473804ff3a3a14f4d5
	fix debian changelog for 4.12 release

	changelog entry was added after the previous 11.1 release instead of
	at the top of the file

2022-09-22  Giuseppe Congiu

	* src/components/sysdetect/sysdetect.c:
	sysdetect: fix problem with missing shutdown_thread implementation

	Originally, init_thread and shutdown_thread were not implemented in
	the sysdetect component. However, this causes issues when
	PAPI_register_thread and PAPI_unregister_thread are used. In the case
	of unregister, the framework will go through all the enabled
	components and call shutdown_thread for each of them. Since sysdetect
	did not implement these functions, a default (PAPI_ECMP) error would
	be returned. This patch adds the missing functions to sysdetect.
	Solves issue #116.

2022-09-21  Giuseppe Congiu

	* src/configure, src/configure.in:
	configure: add support for automatic ARM cpu detection

	The configure script should be able to detect the cpu architecture and
	enable the building of the corresponding source code supporting it.
	However, the configure script only does this for x86_64 and power
	architectures. This patch adds support for ARM architectures as well.

2022-09-20  Giuseppe Congiu

	* src/components/sysdetect/arm_cpu_utils.c:
	sysdetect: fix cache info data structure name

	With commit number 9f8e6b0, the sysdetect data structures that were
	originally hosted in papi.h were moved to sysdetect.h instead. To
	account for this, the data structures prefix was changed from PAPI_ to
	_sysdetect_. This change, however, was erroneously skipped for the arm
	files. This patch fixes the problem by reflecting the name change in
	the arm files.
2022-09-02  Giuseppe Congiu

	* src/components/sysdetect/tests/Makefile: sysdetect: conditionally
	build mpi tests

2022-09-13  Anthony

	* src/components/sde/tests/Makefile: Add the tests in the "clean"
	target in the makefile.

	* src/configure, src/configure.in, src/utils/Makefile: More proper
	handling of special compilation flags for papi_native_avail.

2022-04-20  Anthony

	* src/Makefile.in, src/Makefile.inc,
	src/components/Makefile_comp_tests.target.in,
	src/components/sde/Rules.sde, src/components/sde/sde.c,
	src/components/sde/sde_F.F90, src/components/sde/sde_internal.h,
	src/components/sde/sde_lib/sde_lib.h,
	.../sde/tests/Advanced_C+FORTRAN/sde_test_f08.F90,
	.../sde/tests/Counting_Set/CountingSet_Lib++.cpp,
	.../sde/tests/Counting_Set/CountingSet_Lib.c,
	.../MemoryLeak_CountingSet_Driver++.cpp,
	.../Counting_Set/MemoryLeak_CountingSet_Driver.c,
	.../Counting_Set/Simple_CountingSet_Driver++.cpp,
	.../tests/Counting_Set/Simple_CountingSet_Driver.c,
	src/components/sde/tests/Counting_Set/cset_lib.hpp,
	.../Created_Counter/Lib_With_Created_Counter++.cpp,
	src/components/sde/tests/Makefile,
	.../sde/tests/Minimal/Minimal_Test++.cpp,
	.../sde/tests/Recorder/Lib_With_Recorder++.cpp,
	src/components/sde/tests/Simple2/Simple2_Lib++.cpp,
	src/components/sde/tests/Simple2/Simple2_Lib.c, src/configure,
	src/configure.in, src/sde_lib/Makefile, src/sde_lib/sde_lib.c,
	src/sde_lib/sde_lib.h, src/sde_lib/sde_lib.hpp,
	src/sde_lib/sde_lib_datastructures.c, src/sde_lib/sde_lib_internal.h,
	src/sde_lib/sde_lib_lock.h, src/sde_lib/sde_lib_misc.c,
	src/sde_lib/sde_lib_ti.c, src/sde_lib/sde_lib_ti.h,
	src/utils/Makefile, src/utils/Makefile.target.in,
	src/utils/papi_native_avail.c: libsde: Refactoring the sde code into
	a standalone library with a clean API.
	The libsde library is built and installed along with libpapi (unless
	the user specifies --with-libsde=no at configure). Clean separation
	between the PAPI SDE component and libsde. Now PAPI invokes the
	"tools interface" of libsde.
	Added missing functions for symmetry, such as papi_sde_shutdown() and
	papi_delete_counting_set(), and papi_sde_enabled()/papi_sde_disable().

2022-05-17  AnustuvICL

	* src/components/cuda/linux-cuda.c,
	src/components/cuda/tests/Makefile,
	.../cuda/tests/test_multipass_event_fail.c, src/genpapifdef.c,
	src/papi.c, src/papi.h: cuda: Raise error when adding metrics that
	need multiple passes
	- Add new error code `PAPI_EMULPASS`
	- Updated docs for `PAPI_add_event()` and `PAPI_add_named_event()`
	- Add test program in
	  `components/cuda/tests/test_multipass_event_fail.c`

Fri Sep 16 22:37:40 2022 -0700  Stephane Eranian

	* src/libpfm4/config.mk, src/libpfm4/debian/changelog,
	src/libpfm4/docs/man3/libpfm_intel_bdw.3,
	src/libpfm4/docs/man3/libpfm_intel_hsw.3,
	src/libpfm4/docs/man3/libpfm_intel_icl.3,
	src/libpfm4/docs/man3/libpfm_intel_icx.3,
	src/libpfm4/docs/man3/libpfm_intel_ivb.3,
	src/libpfm4/docs/man3/libpfm_intel_nhm.3,
	src/libpfm4/docs/man3/libpfm_intel_skl.3,
	src/libpfm4/docs/man3/libpfm_intel_snb.3,
	src/libpfm4/docs/man3/libpfm_intel_spr.3,
	src/libpfm4/docs/man3/libpfm_intel_wsm.3,
	src/libpfm4/lib/pfmlib_intel_x86.c, src/libpfm4/tests/validate_x86.c:
	libpfm4: update to commit 11f2d6c

	Original commits:

	commit 11f2d6c70a8b353e80eee55e9a2011c27c82398e
	update to version 4.12.0
	Update to 4.12.0 revision to prepare for release

	commit 471fe633ae01a636b78481b8030a1f922c9d24d2
	fix minimal ldlat latency for Intel Load Latency
	SDM lists 1 cycle as the lowest possible, adjust code to reflect
	spec. Adjust validation test suite accordingly. Adjust man pages
	accordingly.
2022-09-03  Giuseppe Congiu

	* src/utils/papi_native_avail.c: papi_native_avail: fix typo in
	doxygen comment

2021-07-13  Masahiko, Yamada

	* src/ctests/memory.c, src/linux-memory.c, src/papi.h: meminfo: Add
	alloc/write policy for generic_get_memory_info
	On arm64, if the firmware supports ACPI PPTT (Processor Properties
	Topology Table), the generic_get_memory_info function references the
	files located in the "/sys/devices/system/cpu/cpu*/cache/index*"
	directory and sets the cache information. In the arm64 environment,
	the following cache information files are available from the kernel:

	index0: allocation_policy number_of_sets size ways_of_associativity
	coherency_line_size shared_cpu_list type write_policy level
	shared_cpu_map uevent

	Currently, the papi library does not reference two of these files,
	"allocation_policy" and "write_policy". Add allocation_policy and
	write_policy support to the generic_get_memory_info function.

	/sys/devices/system/cpu/cpu*/cache/index*/allocation_policy
	- ReadAllocate: allocate a memory location to a cache line on a
	  cache miss because of a read
	- WriteAllocate: allocate a memory location to a cache line on a
	  cache miss because of a write
	- ReadWriteAllocate: both writeallocate and readallocate

	/sys/devices/system/cpu/cpu*/cache/index*/write_policy
	- WriteThrough: data is written to both the cache line and to the
	  block in the lower-level memory
	- WriteBack: data is written only to the cache line and the modified
	  cache line is written to main memory only when it is replaced

2022-09-02  Giuseppe Congiu

	* src/components/libmsr/linux-libmsr.c: libmsr: improve disabled
	reason string
	Instead of returning a generic error message if libmsr.so cannot be
	dlopen'ed, return the dlerror() string.
	* src/components/sysdetect/powerpc_cpu_utils.c: sysdetect: update
	power9 cache info

2022-08-24  Giuseppe Congiu

	* src/ctests/all_native_events.c, src/ctests/get_event_component.c:
	ctests: allow access to PAPI_EDELAY_INIT components
	Some of the tests, such as all_native_events and get_event_component,
	check the 'disabled' flag of the component before accessing it.
	Device components, such as cuda and rocm, set the 'disabled' flag to
	PAPI_EDELAY_INIT, which signifies the component is a delayed
	initialization one. Thus, the event table in the component is
	initialized only when events are accessed (e.g. PAPI_enum_cmp_event).

2022-09-01  Giuseppe Congiu

	* src/components/sysdetect/README.md: sysdetect: update README.md

2022-08-31  Giuseppe Congiu

	* src/components/sysdetect/amd_gpu.c,
	src/components/sysdetect/nvidia_gpu.c,
	src/components/sysdetect/shm.c: sysdetect: warning fix in snprintf

2022-09-01  Giuseppe Congiu

	* src/papi_internal.c: errcode: add PAPI_EMULPASS to error codes

	* src/papi.h: errcode: add comment for PAPI_EDELAY_INIT in papi.h

	* src/genpapifdef.c: errcode: add PAPI_EDELAY_INIT to genpapifdef

2022-08-31  Giuseppe Congiu

	* src/components/sysdetect/Rules.sysdetect: sysdetect: add support
	for Power8 through 10

	* src/configure, src/configure.in: configure: add support for POWER8
	through 10
	The configuration script only supported Power architectures up to
	version 7. This patch adds Power8, Power9, and Power10 as well.

2022-07-17  Giuseppe Congiu

	* src/components/rocm/rocm.c: rocm: refactor ntv_name_to_code and
	ntv_code_to_name

2022-07-16  Giuseppe Congiu

	* src/components/rocm/Rules.rocm, src/components/rocm/rocm.c: rocm:
	add name_to_code implementation

	* src/components/rocm/Rules.rocm, src/components/rocm/htable.c,
	src/components/rocm/htable.h: rocm: add hash table for name_to_code
	fast conversions
	Add hash table implementation that can be used by components to
	convert event names into their corresponding codes.
	The hash function used by the hash table is djb2 by Dan Bernstein.

2022-07-11  Giuseppe Congiu

	* src/components/sysdetect/tests/query_device_mpi.c,
	.../sysdetect/tests/query_device_simple.c: sysdetect: update tests

2022-06-03  Giuseppe Congiu

	* src/components/sysdetect/amd_gpu.c: sysdetect: fix warning in amd
	gpu probe

2022-04-20  Giuseppe Congiu

	* src/components/sysdetect/amd_gpu.c,
	src/components/sysdetect/amd_gpu.h, src/components/sysdetect/cpu.c,
	src/components/sysdetect/cpu.h, src/components/sysdetect/cpu_utils.c,
	src/components/sysdetect/cpu_utils.h,
	src/components/sysdetect/linux_cpu_utils.c,
	src/components/sysdetect/nvidia_gpu.c,
	src/components/sysdetect/nvidia_gpu.h,
	src/components/sysdetect/powerpc_cpu_utils.c,
	src/components/sysdetect/shm.c, src/components/sysdetect/sysdetect.c,
	src/components/sysdetect/sysdetect.h,
	src/components/sysdetect/x86_cpu_utils.c, src/configure,
	src/configure.in, src/papi.c, src/papi.h, src/papi_internal.c,
	src/papi_internal.h, src/utils/papi_hardware_avail.c: sysdetect:
	extend PAPI with system detection and querying APIs
	Queries performed by accessing PAPI internal data structures are not
	easily maintainable. Once an internal data structure is exposed to
	users, they rely on it not changing, which ties our hands from an
	implementation standpoint. This patch introduces a new set of APIs
	that can be used to query system attributes for different devices
	through the system detection component. The APIs are generic enough
	to allow extending the capabilities with new hardware devices, as
	they become available, and allow separating user interfaces from the
	implementation of the underlying functionality. Because the new APIs
	always have to be functional, we configure the sysdetect component by
	default and initialize it lazily, i.e., only when the corresponding
	APIs are called by the user.

2022-08-30  Daniel Barry

	* src/papi_events.csv: papi_avail: add presets for Intel Ice Lake SP
	Define preset events for the Intel Ice Lake SP processor.
	These presets have been verified using the Counter Analysis Toolkit
	benchmarks. These changes have been tested on the Intel Ice Lake
	architecture.

2022-08-25  Daniel Barry

	* src/counter_analysis_toolkit/vec_fma_dp.c,
	src/counter_analysis_toolkit/vec_fma_hp.c,
	src/counter_analysis_toolkit/vec_fma_sp.c,
	src/counter_analysis_toolkit/vec_nonfma_hp.c,
	src/counter_analysis_toolkit/vec_scalar_verify.c: cat: remove unused
	code from vector benchmark
	There were several outdated and unused lines of code in the CAT
	vector FLOPs benchmark. These changes have been tested on the Intel
	Ice Lake (ICX) architecture.

	* src/counter_analysis_toolkit/vec_fma_dp.c,
	src/counter_analysis_toolkit/vec_fma_hp.c,
	src/counter_analysis_toolkit/vec_fma_sp.c,
	src/counter_analysis_toolkit/vec_nonfma_dp.c,
	src/counter_analysis_toolkit/vec_nonfma_hp.c,
	src/counter_analysis_toolkit/vec_nonfma_sp.c,
	src/counter_analysis_toolkit/vec_scalar_verify.c: cat: format changes
	to vector benchmark comments
	Make comment style consistent across the various source files. These
	changes have been tested on the Intel Ice Lake (ICX) architecture.

2022-08-24  Daniel Barry

	* src/counter_analysis_toolkit/Makefile,
	src/counter_analysis_toolkit/vec.c,
	src/counter_analysis_toolkit/vec_arch.h,
	src/counter_analysis_toolkit/vec_fma.h,
	src/counter_analysis_toolkit/vec_fma_dp.c,
	src/counter_analysis_toolkit/vec_fma_hp.c,
	src/counter_analysis_toolkit/vec_fma_sp.c,
	src/counter_analysis_toolkit/vec_nonfma.h,
	src/counter_analysis_toolkit/vec_nonfma_dp.c,
	src/counter_analysis_toolkit/vec_nonfma_hp.c,
	src/counter_analysis_toolkit/vec_nonfma_sp.c,
	src/counter_analysis_toolkit/vec_scalar_verify.c,
	src/counter_analysis_toolkit/vec_scalar_verify.h: cat: extend vector
	FLOPs benchmark to 128-bit and 512-bit intrinsics
	Previously, kernels within the vector FLOPs benchmark of the Counter
	Analysis Toolkit used only 256-bit intrinsics.
	But to accurately identify native events for 128-bit and 512-bit
	vector widths, the corresponding intrinsics need to be included in
	the benchmark. These changes have been tested on the Intel Ice Lake
	(ICX) architecture.

2022-06-03  Daniel Barry

	* src/configure, src/configure.in: sysdetect: modify configure.in
	logic to parse '--with-CPU=x86'
	When the configuration is invoked with "--with-CPU=x86", it should
	add 'x86_cpuid_info.c' to MISCSRCS. When the configuration is invoked
	on an architecture of the x86_64 family, and "--with-CPU=x86" is not
	specified, the build will proceed normally. However, the build will
	fail on x86_64 when "--with-CPU=x86" is specified, because this flag
	bypasses the check for the x86_64 family but does not add
	'x86_cpuid_info.c' to MISCSRCS. Thus, if "--with-CPU=x86" is
	specified, "x86" must be included in the list of CPUs for which
	'x86_cpuid_info.c' is added as a source file. These changes have been
	tested on the Intel Cascade Lake and Skylake architectures.

2022-08-22  Giuseppe Congiu

	* src/libpfm4/lib/events/power10_events.h,
	src/libpfm4/lib/pfmlib_power10.c: libpfm4: fix broken update
	This patch fixes an error in the previous libpfm4 update (c340321).
	The update in question was missing the expected power10 files. This
	patch adds the missing libpfm4 commit including those files.

Wed Jul 27 03:58:13 2022 -0700  Stephane Eranian

	* src/libpfm4/README, src/libpfm4/include/perfmon/pfmlib.h,
	src/libpfm4/lib/Makefile, src/libpfm4/lib/events/intel_icl_events.h,
	src/libpfm4/lib/events/intel_spr_events.h,
	src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_power_priv.h,
	src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_power.c:
	libpfm4: update libpfm4 to commit 5140ce5

	Original commits:

	commit 5140ce5fe28a7d595eb0a3a906445d0deeb2c53c
	Add IBM Power10 core PMU support
	Adds support for the IBM Power10 core PMU. Documentation on the PMU
	events for Power10 can be found in Appendix E of the Power10 Users
	Manual.
	The Power10 manual is at:
	https://ibm.ent.box.com/v/power10usermanual
	This and other PowerPC related documents can be found at:
	https://www-50.ibm.com/systems/power/openpower/

	commit c88fd465519ae6e96105efe19a06f64b3daa16af
	More Intel SapphireRapids updates
	Based on download.01.org: sapphire_rapids_core_v1.04.json

	commit 77711b23c5c2124c45d35f61a4b7edce7824ba53
	Update Intel SapphireRapids event table
	Based on official event table at download.01.org:
	sapphirerapids_core_v1.04.json
	Event RS_EMPTY deprecated in favor of RS event. Updated OCR umasks.

	commit 391d20ec0a7d53bf5d7b39888734ba6fa716df3f
	Update Icelake and IcelakeX event tables
	Based on official event tables from download.01.org:
	icelakex_core_v1.15.json
	icelake_core_v1.14.json
	Mostly updating the OCR events.

	Tested:
	Power10        : No
	Sapphire Rapids: No
	Icelake        : No
	IcelakeX       : No

2022-05-26  John Rodgers

	* src/components/cuda/linux-cuda.c: CUDA: Add compile/runtime version
	debug msgs
	In `linux-cuda.c::_cuda_linkCudaLibraries`, added debug messages to
	report the compile/runtime versions for the driver, runtime API, and
	CUPTI API.

2022-07-12  Giuseppe Congiu

	* src/components/rocm/rocm.c: rocm: fix assign eventset to component
	PAPI_assign_eventset_component() assigns an eventset to a component
	of a certain index. The function relies on component information to
	allocate data structures for the component. One such parameter is
	num_mpx_cntrs. The component was setting this to -1 in
	rocm_init_component(), causing any malloc to fail. Additionally, when
	rocm_init_private() is finally called, it also resets num_mpx_cntrs
	to the number of native events detected for the device. This is wrong
	as the framework relies on this parameter when freeing allocated data
	structures, e.g., EventInfoArray.

2022-06-11  Giuseppe Congiu

	* src/components/rocm_smi/linux-rocm-smi.c: rsmi: ignore not yet
	implemented function
	Calls to rsmi_dev_pci_bandwidth_get() currently return a
	RSMI_STATUS_NOT_YET_IMPLEMENTED error.
	This results in the rocm_smi component being disabled. To avoid this
	we allow the component to still work even without a functioning
	rsmi_dev_pci_bandwidth_get() function.

2022-05-30  Giuseppe Congiu

	* src/components/rocm_smi/tests/force_init.h,
	src/components/rocm_smi/tests/power_monitor_rocm.cpp,
	src/components/rocm_smi/tests/rocm_smi_writeTests.cpp: rocm_smi:
	account for PAPI_EDELAY_INIT in tests

2022-05-28  Giuseppe Congiu

	* src/components/nvml/tests/force_init.h,
	src/components/nvml/tests/nvml_power_limit_read_test.cu,
	src/components/nvml/tests/nvml_power_limiting_test.cu: nvml: account
	for PAPI_EDELAY_INIT in tests
	Currently, nvml tests check for the disabled state of the component.
	Tests do not, however, allow for the PAPI_EDELAY_INIT error
	introduced in commit 1f44a36. Thus, tests fail spuriously. This patch
	adds a force_nvml_init function that accesses the nvml events,
	forcing the component to init.

2022-06-02  Daniel Barry

	* src/components/powercap/linux-powercap.c: powercap: add wrapper
	function to map event-set entry to counter
	Created a wrapper function to map the event-set index to the
	appropriate counter index. These changes have been tested on the
	Intel Cascade Lake architecture.

2022-05-17  Daniel Barry

	* src/components/powercap/linux-powercap.c: powercap: fix event
	lookup in _powercap_read()
	The function read_powercap_value() should be given the index of the
	powercap event, not the position of that event in the event set. The
	event-set index maps to the powercap-event index via the
	'which_counter' array, which is already used in the function
	_powercap_write(). These changes were tested on the Intel Skylake and
	Cascade Lake architectures.

2022-07-06  Daniel Barry

	* src/components/sysdetect/arm_cpu_utils.c: sysdetect: add support
	for Fujitsu A64FX
	This enables 'papi_hardware_avail' utility support for the A64FX
	processor. TLB and cache information were obtained from the A64FX
	Microarchitecture Manual.
	(https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Microarchitecture_Manual_en_1.6.pdf)
	These changes were tested on the A64FX and ThunderX2 processors.

2022-06-03  Masahiko, Yamada

	* src/linux-memory.c: papi_mem_info: modify aarch64_get_memory_info
	by hw_info
	Reference hw_info->vendor and hw_info->cpuid_model and modify the
	aarch64_get_memory_info function to determine the processor by the
	combination of Implementer and PartNum. These changes were tested on
	the ThunderX2 and Fujitsu A64FX architectures.

2022-05-17  Daniel Barry

	* src/linux-memory.c: papi_mem_info: add back support for ARM64
	processors
	Cache information for ARM64 processors other than the Fujitsu A64FX
	is available in the /sys/ directory. Therefore, these changes utilize
	the generic_get_memory_info() function for non-A64FX ARM64
	processors. These changes were tested on the ThunderX2 and Fujitsu
	A64FX architectures.

2022-04-22  Daniel Barry

	* src/linux-memory.c, src/papi.h: papi_mem_info: add support for
	Fujitsu A64FX
	This enables 'papi_mem_info' utility support for the A64FX processor.
	TLB and cache information were obtained from the A64FX
	Microarchitecture Manual.
	(https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Microarchitecture_Manual_en_1.6.pdf)
	These changes were tested on the Fujitsu A64FX, IBM POWER9, AMD Zen
	2, and Intel Haswell architectures.

2022-06-11  Giuseppe Congiu

	* src/components/rocm/rocp.c: rocm: rocp_pool_close double free
	comment

	* src/components/rocm/rocp.c: rocm: remove useless comments

	* src/components/rocm/rocp.c: rocm: adjust for 5.2.0 change of
	directory structure

	* src/components/rocm/rocp.c: rocm: check config files are regular

2022-06-10  Giuseppe Congiu

	* src/components/rocm/rocp.c: rocm: rename dispatch counter functions
	increment/decrement_dispatch_counter also return the value of the
	counter after the increment/decrement. To make clearer what the
	functions do, rename them.
	* src/components/rocm/rocp.c: rocm: fix bug in sampling read

	* src/components/rocm/rocp.c: rocm: cleanup function signature

	* src/components/rocm/rocp.c: rocm: wrap sampling/intercept_ctx_init

2022-06-11  Giuseppe Congiu

	* src/components/pcp/README.md: pcp: add how to run on Summit in
	readme

2022-06-10  John Rodgers

	* .../cuda/tests/cupti_multi_kernel_launch_monitoring.cu: CUDA:
	Update Multi-Kernel Launch Test
	PR 298 toggled back the CUDA profiling API support ranges to include
	CC 7.0 when built against CUDA11+. As a result, the
	`cupti_multi_kernel_launch_monitoring.cu` test needs to be updated to
	account for this new support range behavior.

2022-06-09  Giuseppe Congiu

	* src/components/sensors_ppc/tests/sensors_ppc_basic.c: sensors_ppc:
	fix test
	sensor_ppc_basic does not call PAPI_stop before cleaning up and
	destroying the eventset. This causes the test to return a PAPI_EISRUN
	error. Replace PAPI_read with PAPI_stop as the fix.

Fri Jun 3 06:15:02 2022 -0700  Stephane Eranian

	* src/libpfm4/lib/events/intel_spr_events.h,
	src/libpfm4/lib/pfmlib_perf_event_pmu.c,
	src/libpfm4/tests/validate_x86.c: libpfm4: update libpfm4 to commit
	322e66c

	Original commits:

	commit 322e66c6463d6ff4035a751843dbce2ee83b6663
	fix validate for CPU_CLK_UNHALTED:REF_DISTRIBUTED for SapphireRapids
	Was using the bogus event code following the change in:
	44a62a52e4e5 ("fix CPU_CLK_UNHALTED.REF_DISTRIBUTED encoding for
	Intel SapphireRapids")
	Correct event code is 0x3c.

	commit a7b26272d8327ad1c001456a18518a0ac65dc2bb
	avoid GCC-12 use-after-free warnings
	gcc-12 seems to complain about bogus use-after-free situations in the
	libpfm4 code:
	  p = realloc(q, ...)
	  if (!p) return NULL
	  s = p + (q - z)
	It complains because of the use of q after realloc in this case. Yet
	q - z is just pointer arithmetic and is not dereferencing any memory
	through the pointer q which may have been freed by realloc. Fix is to
	pre-compute the delta before realloc to avoid using the pointer after
	the call.
	Reported-by: Vitaly Chikunov

	commit 44a62a52e4e554cad7971b79770e03ae880336ce
	fix CPU_CLK_UNHALTED.REF_DISTRIBUTED encoding for Intel
	SapphireRapids
	Was using 0x8ec instead of 0x83c.

	commit b28625959098b3889f5ffe1d209b5da196b959e1
	update Intel SapphireRapids core PMU events
	Based on download.01.org/perfmon/SPR/sapphire_rapids_v1.02.json
	Mostly updates to the OCR event.

	Testing: SapphireRapids commits untested due to lack of hardware.

2022-06-02  Giuseppe Congiu

	* src/components/rocm_smi/tests/Makefile: rocm_smi: remove duplicate
	include of Makefile_comp_tests.target

	* src/components/rocm/rocp.c: rocm: fix rocprofiler load logic
	The load_rocp_sym function in rocp.c should not use PAPI_ROCM_ROOT to
	calculate the pathname of the rocprofiler library. init_rocp_env
	already sets up HSA_TOOLS_LIB to point to the right pathname based on
	PAPI_ROCM_ROOT (or the pathname explicitly defined by users). Thus,
	load_rocp_sym should simply use HSA_TOOLS_LIB instead.

2022-06-03  Giuseppe Congiu

	* doc/Makefile: sysdetect: add papi_hardware_avail to man pages

2022-06-02  Giuseppe Congiu

	* src/components/sysdetect/x86_cpu_utils.c: sysdetect: fix a 'may be
	used uninitialized' warning

	* src/components/cuda/linux-cuda.c: cuda: allow for CC 7.0 to use the
	profiler API
	Currently, the cuda component enforces the cupti event API to be used
	if the compute capability (CC) of the device is 7.0. This patch
	allows users to select the cupti profiler API instead by using a
	cuda11 installation and exposing this to PAPI through the
	PAPI_CUDA_ROOT environment variable.
Tue May 31 05:47:23 2022 -0700  Thomas Richter

	* src/libpfm4/lib/events/s390x_cpumf_events.h,
	src/libpfm4/lib/pfmlib_s390x_cpumf.c: libpfm4: update libpfm4 to
	commit b03a81e

	Original commit:
	s390: Update counter definition for IBM z16
	This patch updates the libpfm4 s390 counter definitions to the latest
	documentation:
	SA23-2261-07: The CPU-Measurement Facility Extended Counters
	Definition for z10, z196/z114, zEC12/zBC12, z13/z13s, z14, z15 and
	z16, April 29, 2022
	https://www.ibm.com/support/pages/cpu-measurement-facility-extended-counters-definition-z10-z196z114-zec12zbc12-z13z13s-z14-z15-and-z16
	This includes updated counter descriptions for existing counters and
	the complete counter definition for IBM z16.
	Acked-by: Sumanth Korikkar
	Testing: not tested

2022-05-26  Giuseppe Congiu

	* src/components/sysdetect/x86_cpu_utils.c: sysdetect: fix warning in
	cpu probe

2022-01-12  Giuseppe Congiu

	* src/components/rocm/README.md, src/components/rocm/Rules.rocm,
	src/components/rocm/common.h, src/components/rocm/linux-rocm.c,
	src/components/rocm/rocm.c, src/components/rocm/rocm_IncDirs.awk,
	src/components/rocm/rocp.c, src/components/rocm/rocp.h,
	src/components/rocm/tests/Makefile,
	src/components/rocm/tests/ROCM_SA_Makefile,
	src/components/rocm/tests/common.h,
	.../tests/hl_intercept_multi_thread_monitoring.cpp,
	.../hl_intercept_single_kernel_monitoring.cpp,
	.../hl_intercept_single_thread_monitoring.cpp,
	.../tests/hl_sample_single_kernel_monitoring.cpp,
	.../tests/hl_sample_single_thread_monitoring.cpp,
	.../tests/intercept_multi_kernel_monitoring.cpp,
	.../tests/intercept_multi_thread_monitoring.cpp,
	.../tests/intercept_single_kernel_monitoring.cpp,
	.../tests/intercept_single_thread_monitoring.cpp,
	src/components/rocm/tests/matmul.cpp,
	src/components/rocm/tests/matmul.h,
	.../rocm/tests/multi_kernel_monitoring.cpp,
	.../rocm/tests/multi_kernel_monitoring.h,
	.../rocm/tests/multi_thread_monitoring.cpp,
	.../rocm/tests/multi_thread_monitoring.h,
	src/components/rocm/tests/rocm_all.cpp,
	src/components/rocm/tests/rocm_command_line.c,
	src/components/rocm/tests/rocm_example.cpp,
	src/components/rocm/tests/rocm_failure_demo.cpp,
	src/components/rocm/tests/rocm_standalone.cpp,
	src/components/rocm/tests/run_papi.sh,
	.../rocm/tests/sample_multi_kernel_monitoring.cpp,
	.../rocm/tests/sample_multi_thread_monitoring.cpp,
	.../rocm/tests/sample_overflow_monitoring.cpp,
	.../rocm/tests/sample_single_kernel_monitoring.cpp,
	.../rocm/tests/sample_single_thread_monitoring.cpp,
	.../rocm/tests/single_thread_monitoring.cpp,
	.../rocm/tests/single_thread_monitoring.h: rocm: component rewrite
	The new rocm component implementation supports rocprofiler sampling
	as well as intercepting mode.
	In sampling mode the new rocm component assigns each eventset one or
	more GPU devices (depending on the events requested by the PAPI
	user). Two separate threads can thus create two eventsets and have
	them monitor a separate device (N to N), or one thread can create a
	single eventset and have it monitor all devices (1 to N). Sampling
	mode concerns itself with whatever happens at the device level, not
	the kernel level. Thus, a section of code instrumented with
	PAPI_start and PAPI_stop might measure the activity of whatever
	kernel the current thread has launched on a device, plus whatever
	kernels other threads may have launched on the same device at the
	same time.
	In intercepting mode the new rocm component assigns each eventset one
	kernel at a time (kernels are serialized by rocm). Intercepting mode
	concerns itself with whatever happens at the kernel level (inside the
	device). Thus, a section of code instrumented with PAPI_start and
	PAPI_stop might measure the activity of whatever kernel the current
	thread launched on a device. If the instrumented section contains
	multiple kernel launches the component will accumulate the counters
	of those into a single counter value.
	The component also supports software-emulated counter sampling
	through PAPI_overflow.
2022-04-18  Giuseppe Congiu

	* src/high-level/papi_hl.c: high-level: use _papi_getpid instead of
	getpid

	* src/threads.c, src/threads.h: thread: add _papi_getpid() function

2022-01-27  Anthony

	* src/atomic_ops.h, src/atomic_ops/ao_version.h,
	src/atomic_ops/generalize-arithm.h,
	src/atomic_ops/generalize-arithm.template,
	src/atomic_ops/generalize-small.h,
	src/atomic_ops/generalize-small.template,
	src/atomic_ops/generalize.h, src/atomic_ops/sysdeps/README,
	.../sysdeps/all_acquire_release_volatile.h,
	.../sysdeps/all_aligned_atomic_load_store.h,
	src/atomic_ops/sysdeps/all_atomic_load_store.h,
	src/atomic_ops/sysdeps/all_atomic_only_load.h,
	src/atomic_ops/sysdeps/ao_t_is_int.h,
	src/atomic_ops/sysdeps/ao_t_is_int.template,
	src/atomic_ops/sysdeps/armcc/arm_v6.h,
	src/atomic_ops/sysdeps/emul_cas.h,
	src/atomic_ops/sysdeps/gcc/aarch64.h,
	src/atomic_ops/sysdeps/gcc/alpha.h, src/atomic_ops/sysdeps/gcc/arm.h,
	src/atomic_ops/sysdeps/gcc/avr32.h, src/atomic_ops/sysdeps/gcc/cris.h,
	src/atomic_ops/sysdeps/gcc/e2k.h,
	src/atomic_ops/sysdeps/gcc/generic-arithm.h,
	src/atomic_ops/sysdeps/gcc/generic-arithm.template,
	src/atomic_ops/sysdeps/gcc/generic-small.h,
	src/atomic_ops/sysdeps/gcc/generic-small.template,
	src/atomic_ops/sysdeps/gcc/generic.h,
	src/atomic_ops/sysdeps/gcc/hexagon.h,
	src/atomic_ops/sysdeps/gcc/hppa.h, src/atomic_ops/sysdeps/gcc/ia64.h,
	src/atomic_ops/sysdeps/gcc/m68k.h, src/atomic_ops/sysdeps/gcc/mips.h,
	src/atomic_ops/sysdeps/gcc/powerpc.h,
	src/atomic_ops/sysdeps/gcc/riscv.h, src/atomic_ops/sysdeps/gcc/s390.h,
	src/atomic_ops/sysdeps/gcc/sh.h, src/atomic_ops/sysdeps/gcc/sparc.h,
	src/atomic_ops/sysdeps/gcc/tile.h, src/atomic_ops/sysdeps/gcc/x86.h,
	src/atomic_ops/sysdeps/generic_pthread.h,
	src/atomic_ops/sysdeps/hpc/hppa.h, src/atomic_ops/sysdeps/hpc/ia64.h,
	src/atomic_ops/sysdeps/ibmc/powerpc.h,
	src/atomic_ops/sysdeps/icc/ia64.h,
	.../sysdeps/loadstore/acquire_release_volatile.h,
	.../loadstore/acquire_release_volatile.template,
	src/atomic_ops/sysdeps/loadstore/atomic_load.h,
	.../sysdeps/loadstore/atomic_load.template,
	src/atomic_ops/sysdeps/loadstore/atomic_store.h,
	.../sysdeps/loadstore/atomic_store.template,
	.../loadstore/char_acquire_release_volatile.h,
	.../sysdeps/loadstore/char_atomic_load.h,
	.../sysdeps/loadstore/char_atomic_store.h,
	.../sysdeps/loadstore/double_atomic_load_store.h,
	.../loadstore/int_acquire_release_volatile.h,
	src/atomic_ops/sysdeps/loadstore/int_atomic_load.h,
	.../sysdeps/loadstore/int_atomic_store.h,
	.../sysdeps/loadstore/ordered_loads_only.h,
	.../sysdeps/loadstore/ordered_loads_only.template,
	.../sysdeps/loadstore/ordered_stores_only.h,
	.../sysdeps/loadstore/ordered_stores_only.template,
	.../loadstore/short_acquire_release_volatile.h,
	.../sysdeps/loadstore/short_atomic_load.h,
	.../sysdeps/loadstore/short_atomic_store.h,
	src/atomic_ops/sysdeps/msftc/arm.h,
	src/atomic_ops/sysdeps/msftc/arm64.h,
	src/atomic_ops/sysdeps/msftc/common32_defs.h,
	src/atomic_ops/sysdeps/msftc/x86.h,
	src/atomic_ops/sysdeps/msftc/x86_64.h,
	src/atomic_ops/sysdeps/ordered.h,
	src/atomic_ops/sysdeps/ordered_except_wr.h,
	src/atomic_ops/sysdeps/read_ordered.h,
	src/atomic_ops/sysdeps/standard_ao_double_t.h,
	src/atomic_ops/sysdeps/sunc/sparc.S,
	src/atomic_ops/sysdeps/sunc/sparc.h,
	src/atomic_ops/sysdeps/sunc/x86.h,
	src/atomic_ops/sysdeps/test_and_set_t_is_ao_t.h,
	src/atomic_ops/sysdeps/test_and_set_t_is_char.h, src/linux-common.c,
	src/linux-lock.h: Integrate the atomic operations of the
	libatomic_ops library into PAPI.

2022-05-23  John Rodgers

	* src/components/cuda/linux-cuda.c: CUDA11 Start Variable Update
	In `_cuda11_start`, change `userContext` to `userCtx` to be
	consistent with the rest of the component.

	* src/components/cuda/linux-cuda.c: CUDA11 Profiler Active Context
	Sensitivity
	The CUPTI11 portion of the `cuda` component has shown sensitivities
	to a calling thread's active context when using CUDA11 versions <
	11.2.
	Specifically, it was found that not pushing the session context onto
	the stack prior to calling `cuptiProfilerSetConfig` would result in a
	failure. This change set addresses the issue by leveraging the same
	mechanics that are used when calling other CUPTI routines, namely
	pushing the session context onto the stack and popping it off once we
	are done with it.

2022-05-05  John Rodgers

	* src/components/cuda/linux-cuda.c: Allow context creation for CUDA11
	For the CUPTI11 portion of the `cuda` component, adopt logic from the
	legacy version of the component to allow creation of contexts should
	one not exist. The update enables simple single GPU, as well as
	target offload codes, to be profiled.
	Note: The update also resolves an issue with `papi_command_line`
	(issue 92)

	* src/components/cuda/README.md, src/components/cuda/linux-cuda.c,
	src/components/cuda/tests/HelloWorld_CUPTI11.cu,
	src/components/cuda/tests/simpleMultiGPU.cu,
	.../cuda/tests/simpleMultiGPU_CUPTI11.cu: Issue102: Remove CUDA11
	callback subscriber
	The CUPTI callback subscriber introduced to monitor contexts created
	a problem for packages that use PAPI for CUDA performance counter
	collection. Specifically, it prevented registering of custom
	profiling/tracing callbacks, resulting in
	`CUPTI_ERROR_MULTIPLE_SUBSCRIBERS_NOT_SUPPORTED` when attempting to
	register one. This changeset is effectively a targeted reversion of
	the callback subscriber logic from the commit that introduced it.
	Commit that introduced the subscriber:
	9ff1d73dae9a7b297a54a77fac5fdb3957041452
	With this update in place, the `cuda` component context capturing
	behavior is consistent between the legacy and updated CUPTI11
	versions of the code, requiring that applications create and set the
	context used to run the kernels prior to calling `PAPI_add_events()`.
2022-05-16  John Rodgers

	* src/components/cuda/linux-cuda.c, src/components/cuda/tests/Makefile, .../tests/cupti_multi_kernel_launch_monitoring.cu: Issue 105: CUDA11 Multi Read Error. The `cuda` component generated erroneous values when multiple `PAPI_read` operations were called. Testing via direct usage of the CUPTI API revealed that the CUDA11 profiling image (`cuda11_CounterDataImage`) needed to be reset after each read operation to prevent this behavior. To enable resetting, the initialization parameters for the profiling image and scratch buffer are now stored along with the other profiling parameters in `cuda_device_desc_t`. These newly stored parameters are then used in each read operation to re-initialize the profiling images after the counter results have been resolved. A new test, `cupti_multi_kernel_launch_monitoring`, has been introduced in the `components/cuda/tests/` directory and was used to validate this changeset.

2022-04-29  John Rodgers

	* src/components/cuda/linux-cuda.c: Remove unnecessary calls to `cuptiProfiler{Enable,Disable}Profiling` in CUDA11 `PAPI_read` operations. Starting and stopping of the profiling session is now handled in the appropriate CUDA11 `PAPI_{start,stop}` operations.

2022-05-16  Giuseppe Congiu

	* src/components/cuda/linux-cuda.c: cuda: remove leftover cupti11 switching logic. For cupti 11 devices, i.e. devices that are compatible with the cupti 11 profiler interface, the component overrides the vector function pointers with cupti 11 variants. However, the _cuda_update_control_state function still contained a switch on the cupti 11 version. This should not happen and is therefore removed by this patch.

2022-05-10  Giuseppe Congiu

	* src/components/cuda/linux-cuda.c, src/configure, src/configure.in: cuda: replace cupti_profiler with cupti_api_version. The cupti.h header exports a CUPTI_API_VERSION macro that can be used to check whether the profiler API is supported.
	The macro can assume the following values (associated with the corresponding CUDA version and compute capability):

	CUPTI_API_VERSION | CUDA_VERSION   | COMPUTE CAPABILITY
	------------------+----------------+-------------------
	V1                | 4.0            |
	V2                | 4.1            |
	V3                | 5.0            |
	V4                | 5.5            |
	V5                | 6.0            |
	V6                | 6.5            |
	V7                | 6.5            |
	V8                | 7.0            |
	V9                | 8.0            |
	V10               | 9.0            |
	V11               | 9.1            |
	V12               | 10.0,10.1,10.2 | < 7.5
	V13               | 11.0           | >= 7.0
	V14               | 11.1           |

	This patch replaces the CUPTI_PROFILER preprocessor flag, previously set in configure if the cupti_profiler_target.h header was found in PAPI_CUDA_ROOT, with CUPTI_API_VERSION in linux-cuda.c.

2022-04-28  Giuseppe Congiu

	* src/components/cuda/linux-cuda.c: cuda: pre-cupti11 backward compatibility fix. Cuda devices with compute capability <= 7.0 should be able to use the event API provided by cuda toolkits with version >= 11. The cuda component selection logic, however, caused cuda devices with such compute capabilities to fail when cuda toolkits >= 11 were used. This patch fixes the selection logic.

2022-05-23  AnustuvICL

	* src/components/cuda/linux-cuda.c: cuda: Added debug messages to indicate the locations of loaded dynamic CUDA libraries.

2022-03-21  Anustuv Pal

	* src/components/cuda/linux-cuda.c: cuda: Add support for CUDA versions > 11.2. CUDA 11.0 deprecated the use of NVPA_RawMetricsConfig_Create and replaced it with NVPW_CUDA_RawMetricsConfig_Create. This patch replaces the deprecated function with the NVIDIA-recommended substitute.

2022-05-10  Giuseppe Congiu

	* src/components/sysdetect/Rules.sysdetect: sysdetect: fix include path of nvidia GPUs

2022-04-30  Giuseppe Congiu

	* src/components/sysdetect/amd_gpu.c: sysdetect: fix amd gpu product name. The HSA_AMD_AGENT_INFO_PRODUCT_NAME attribute in hsa_agent_get_info seems to be broken in the latest version of ROCm. Replace it with the HSA_AGENT_INFO_NAME attribute instead. This reports the device compute architecture rather than the product name.
2022-05-18  Giuseppe Congiu

	* src/components/perf_event/perf_event.c: perf_event: fix typo in error code handling. _pe_libpfm4_init() returns the PAPI_ECMP error, not PAPI_ENOCMP as currently handled by the caller (i.e. _pe_init_component). Change PAPI_ENOCMP into PAPI_ECMP.

	* src/components/perf_event/pe_libpfm4_events.c: perf_event: do not set disable string in _pe_libpfm4_init. _pe_libpfm4_init() returns an error code that is used by the caller (i.e. _pe_init_component()) to set the disabled_reason string to the appropriate error message, overwriting whatever was set by _pe_libpfm4_init().

Thu Apr 21 15:01:07 2022 -0700  Stephane Eranian

	* src/libpfm4/lib/events/amd64_events_fam19h_zen3.h: libpfm4: update libpfm4 to commit c779846. Original commit: c7798469063288ca5829ab96c7c174dad5a08e74 "Rename OP_QUEUE_EMPTY to UOPS_QUEUE_EMPTY on AMD Zen3", to be compatible with AMD Zen2.

2022-04-07  Giuseppe Congiu

	* src/components/sysdetect/amd_gpu.c: sysdetect: use PAPI_ROCM_ROOT for rocmsmi dlopen path

2022-04-19  Giuseppe Congiu

	* src/run_tests_exclude.txt: intel_gpu: remove test/readme.txt from test list. Currently run_tests_exclude.txt does not list the intel_gpu readme.txt file in the tests directory. This causes the run_test.sh script to try to execute that file. Blacklist the file to skip execution.

Wed Apr 20 19:56:03 2022 -0700  Stephane Eranian

	* src/libpfm4/docs/man3/libpfm_intel_spr.3, src/libpfm4/lib/events/amd64_events_fam19h_zen3.h, .../lib/events/amd64_events_fam19h_zen3_l3.h, src/libpfm4/lib/events/intel_spr_events.h, src/libpfm4/lib/pfmlib_amd64.c, src/libpfm4/tests/validate_x86.c: Update libpfm4. Current with commit 9580a003d83900569db3f2c7bc41e0e2ea7b88ef "Fix amd64 duplicate event detection logic": must check flags as well as code, otherwise false-positive duplicates are detected on AMD Fam10h Barcelona, where some events appear as duplicates when in fact they are for different revisions of the CPU.
2022-04-14  Anthony

	* src/components/sde/sde_lib/sde_lib.h: Refactored unlocking to the end of each function, and replaced tabs with spaces.

2022-04-13  Anthony

	* src/components/sde/sde_lib/sde_lib.h: Cleaning up error messages.

	* src/components/sde/sde.c, src/components/sde/sde_lib/sde_lib.h, src/components/sde/tests/Makefile: Counting Set introduced to sde_lib, with both C and C++ APIs along with tests.

2022-04-12  Giuseppe Congiu

	* src/high-level/papi_hl.c: high-level: replace flock with fcntl. flock is not POSIX compliant; replace it with fcntl instead.

2022-04-05  Giuseppe Congiu

	* src/high-level/papi_hl.c: high-level: use variable to select between single- and multi-thread mode. Currently, the HL API always assumes multi-thread mode, meaning multiple threads in the program can create and manage PAPI event sets. This is not always a valid assumption, as the behavior differs for single-thread monitoring programs. This patch introduces a new environment variable named PAPI_HL_THREAD_MULTIPLE that allows selecting single- or multi-thread mode explicitly in the HL API. To avoid affecting existing applications and tests, the default is multi-thread monitoring. If the variable is set to "0", single-thread mode is selected instead; for explicitly setting multi-thread monitoring, the variable has to be set to "1".

Mon Apr 11 15:19:40 2022 -0700  Stephane Eranian

	* src/libpfm4/README, src/libpfm4/docs/Makefile, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/events/intel_icl_events.h, src/libpfm4/lib/events/intel_spr_events.h, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_spr.c, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_x86.c: Update libpfm4. Current with commit eca0a1f2d274ba26e6c24231fdf61b1407e3ed03 "add Intel SapphireRapid core PMU support". This patch adds Intel SapphireRapid core PMU support to libpfm4.
	It is based on the public event list from: https://download.01.org/perfmon/SPR/sapphirerapids_core_v1.00.json

2022-04-13  Daniel Barry

	* src/counter_analysis_toolkit/vec_fma_dp.c, src/counter_analysis_toolkit/vec_fma_hp.c, src/counter_analysis_toolkit/vec_fma_sp.c, src/counter_analysis_toolkit/vec_nonfma_dp.c, src/counter_analysis_toolkit/vec_nonfma_hp.c, src/counter_analysis_toolkit/vec_nonfma_sp.c, src/counter_analysis_toolkit/vec_scalar_verify.c, src/counter_analysis_toolkit/vec_scalar_verify.h: CAT: Rename printing functions in vector benchmark. The previous names of the functions that print the results of the CAT vector benchmark to a file did not indicate their purpose. The new function names better describe what they do. These changes were tested on the Fujitsu A64FX architecture.

2022-04-08  Daniel Barry

	* src/counter_analysis_toolkit/vec_arch.h, src/counter_analysis_toolkit/vec_fma_dp.c, src/counter_analysis_toolkit/vec_fma_hp.c, src/counter_analysis_toolkit/vec_fma_sp.c, src/counter_analysis_toolkit/vec_nonfma_dp.c, src/counter_analysis_toolkit/vec_nonfma_hp.c, src/counter_analysis_toolkit/vec_nonfma_sp.c, src/counter_analysis_toolkit/vec_scalar_verify.c: CAT: Added scalar intrinsics to validate half-precision vector benchmark accuracy. Previously, the half-precision kernels' numerical results were not checked against results computed using only scalar quantities. The GCC documentation states that "The __fp16 type may only be used as an argument to intrinsics defined in , or as a storage format." (https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html) Thus, these intrinsics are now used to verify the accuracy of the vector benchmarks. These changes have been tested on the Fujitsu A64FX architecture.

2022-03-31  Anthony

	* src/components/sde/sde_lib/sde_lib.h: Fixed a potential deadlock in the SDE component.

2022-03-30  Anthony

	* src/components/sde/sde.c: Fixed a bug in terminating a string in the SDE component.
2022-03-30  Daniel Barry

	* src/counter_analysis_toolkit/Makefile, src/counter_analysis_toolkit/README, src/counter_analysis_toolkit/driver.h, src/counter_analysis_toolkit/main.c, src/counter_analysis_toolkit/vec.c, src/counter_analysis_toolkit/vec.h, src/counter_analysis_toolkit/vec_arch.h, src/counter_analysis_toolkit/vec_fma.h, src/counter_analysis_toolkit/vec_fma_dp.c, src/counter_analysis_toolkit/vec_fma_hp.c, src/counter_analysis_toolkit/vec_fma_sp.c, src/counter_analysis_toolkit/vec_nonfma.h, src/counter_analysis_toolkit/vec_nonfma_dp.c, src/counter_analysis_toolkit/vec_nonfma_hp.c, src/counter_analysis_toolkit/vec_nonfma_sp.c, src/counter_analysis_toolkit/vec_scalar_verify.c, src/counter_analysis_toolkit/vec_scalar_verify.h, src/counter_analysis_toolkit/weak_symbols.c: CAT: Added vector FLOPs benchmarks to identify related events. Hardware events in certain architectures account for floating-point operations incurred by vector instructions. This new benchmark category allows these events to be identified more easily by using the vector intrinsics available on a given architecture. The benchmark includes kernels for fused multiply-add (FMA) vector instructions. These changes have been tested on the IBM POWER9, Fujitsu A64FX (ARM), and AMD Zen2 architectures.

2022-03-22  Anthony Danalis

	* src/counter_analysis_toolkit/timing_kernels.c, src/counter_analysis_toolkit/timing_kernels.h: Updated the data cache write benchmark to make it cause one read and one write more reliably.

2022-01-23  Giuseppe Congiu

	* src/configure, src/configure.in, src/papi_lock.h: configure: add one lock per component. Currently components do not have dedicated locks. This patch adds support for one lock per component so that two components do not have to share the same lock.
2022-03-28  Masahiko, Yamada

	* src/components/sysdetect/arm_cpu_utils.c: sysdetect: improve processor name for ARM processors. On ARM processors, Raspbian OS can get the processor name from "model name" in /proc/cpuinfo. On non-Raspbian OS, the processor name cannot be retrieved from /proc/cpuinfo; it is therefore generated from the "CPU implementer" and "CPU part" fields in /proc/cpuinfo.

2022-03-15  Giuseppe Congiu

	* src/components/sysdetect/arm_cpu_utils.c, src/components/sysdetect/linux_cpu_utils.c: sysdetect: fix vendor codes for ARM

Fri Mar 18 12:25:26 2022 -0700  Stephane Eranian

	* src/libpfm4/lib/events/intel_icl_events.h, src/libpfm4/tests/validate_x86.c: Update libpfm4. Current with commit ad5c64e1ac2f177e2166bedfd7b679e49017cb55 "fix Intel Icelake TOPDOWN.SLOTS_P encoding": was using the 0x00 (fixed counter) encoding instead of 0xa4 for the generic counter. Add validation tests for SLOTS and SLOTS_P.

2022-03-14  Giuseppe Congiu

	* src/components/sysdetect/linux_cpu_utils.c: sysdetect: system information queries treat all ARM CPUs the same. When querying for system information, all ARM CPUs are treated the same regardless of vendor.

2022-03-17  Masahiko, Yamada

	* src/components/sysdetect/README.md: sysdetect: configure --with-CPU required. To enable sysdetect, use the following command: `./configure --with-CPU=$CPU --with-components="sysdetect"`. $CPU can have the following values: x86, POWER5, POWER5+, POWER6, POWER7, PPC970, arm.

2022-03-16  Masahiko, Yamada

	* src/components/sysdetect/cpu_utils.c: sysdetect: Adding Macro Definitions for arm64. On ARM processors, the compilation macro for arm32 is "defined (__arm__)" and for arm64 it is "defined (__aarch64__)". To enable both arm32 and arm64, the check needs to be "defined (__arm__) || defined (__aarch64__)".
2022-02-20  Giuseppe Congiu

	* src/components/sysdetect/arm_cpu_utils.c: sysdetect: patch arm to convert vendor id into vendor string

	* src/components/sysdetect/linux_cpu_utils.c: sysdetect: get rid of VENDOR_ARM

2022-02-16  Giuseppe Congiu

	* src/components/sysdetect/cpu.c, src/components/sysdetect/cpu_utils.h, src/components/sysdetect/linux_cpu_utils.c, src/components/sysdetect/x86_cpu_utils.c, src/papi.h, src/utils/papi_hardware_avail.c: sysdetect: add vendor id field for ARM processors

2022-02-17  Giuseppe Congiu

	* src/components/sysdetect/linux_cpu_utils.c: sysdetect: update vendor id codes in cpu probe

2022-02-20  Giuseppe Congiu

	* src/utils/papi_hardware_avail.c: papi_hardware_avail: only print numa node memory if greater than 0

	* src/components/sysdetect/linux_cpu_utils.c: sysdetect: replace assignment to atoi with sscanf. Assigning a model number using atoi and the assignment operator might be ineffective: if the string is expressed in hex, atoi will fail the conversion. Instead, replace the atoi assignment with sscanf.

	* src/components/sysdetect/cpu.c: sysdetect: fix typo in cpu probe

	* src/components/sysdetect/linux_cpu_utils.c: sysdetect: there is at least one numa node in SMPs

2022-03-04  Masahiko, Yamada

	* src/papi_events.csv: Add PAPI idle-related preset events for a64fx. For a64fx, add four PAPI idle-related preset events (PAPI_BRU_IDL/PAPI_FXU_IDL/PAPI_FPU_IDL/PAPI_LSU_IDL):
	PAPI_BRU_IDL = BR_COMP_WAIT
	PAPI_FXU_IDL = EU_COMP_WAIT - FL_COMP_WAIT
	PAPI_FPU_IDL = FL_COMP_WAIT
	PAPI_LSU_IDL = LD_COMP_WAIT
	The specifications of BR_COMP_WAIT, EU_COMP_WAIT, FL_COMP_WAIT, and LD_COMP_WAIT can be found in section "14.4. Cycle Accounting" of A64FX_Microarchitecture_Manual_en_1.5.pdf at the following URL:
	https://github.com/fujitsu/A64FX/blob/master/doc

	* src/components/perf_event/pe_libpfm4_events.c, src/components/perf_event/perf_event.c, src/linux-common.c, src/papi.h: PAPI_get_hardware_info: improve PAPI_hw_info_t for ARM processors. Currently, it is not possible to determine which company designed an ARM processor from the PAPI_hw_info_t obtained with PAPI_get_hardware_info(). For ARM processors, improve the vendor and vendor_string entries in PAPI_hw_info_t so that they include information indicating the designing company.

2022-02-20  Giuseppe Congiu

	* src/components/sysdetect/Rules.sysdetect: sysdetect/amd: add CPPFLAGS to Rules file

2022-02-19  Giuseppe Congiu

	* src/components/sysdetect/amd_gpu.c: sysdetect/amd: use snprintf instead of assignment operator

	* src/components/sysdetect/amd_gpu.c: sysdetect/amd: fix hsa_error_string definition and usage

	* src/utils/papi_component_avail.c: papi_component_avail: rename force_lazy_init to force_cmp_init

2022-02-18  Giuseppe Congiu

	* src/utils/papi_component_avail.c: papi_component_avail: force component init only when necessary

	* src/utils/papi_native_avail.c: papi_native_avail: allow for delayed init components

Mon Feb 28 11:55:13 2022 -0800  Stephane Eranian

	* src/libpfm4/README, src/libpfm4/docs/Makefile, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, .../lib/events/arm_hisilicon_kunpeng_events.h, .../lib/events/arm_hisilicon_kunpeng_unc_events.h, src/libpfm4/lib/events/arm_neoverse_n1_events.h, src/libpfm4/lib/events/arm_neoverse_n2_events.h, src/libpfm4/lib/events/intel_icl_events.h, src/libpfm4/lib/events/intel_skl_events.h, src/libpfm4/lib/events/perf_events.h, src/libpfm4/lib/pfmlib_arm.c, src/libpfm4/lib/pfmlib_arm_armv8.c, src/libpfm4/lib/pfmlib_arm_priv.h, src/libpfm4/lib/pfmlib_common.c,
	src/libpfm4/lib/pfmlib_intel_rapl.c, src/libpfm4/lib/pfmlib_kunpeng_unc_perf_event.c, src/libpfm4/lib/pfmlib_perf_event_pmu.c, src/libpfm4/lib/pfmlib_perf_event_raw.c, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_arm64.c, src/libpfm4/tests/validate_x86.c: Update libpfm4. Tested on orbitty.icl.utk.edu, ARMv8 Processor rev 1 (v8l); on methane.icl.utk.edu, Intel Skylake Xeon(R) Gold 6140 CPU; and on dopamine.icl.utk.edu, AMD Zen3 EPYC 7413 CPU. Current with commit 58efe1f26fe1ca82f8b25b83c1089c5f9eac0f1b "add Intel SapphireRapid RAPL support": add the CPU model number for SapphireRapid based on Linux kernel information.

2022-02-20  Giuseppe Congiu

	* src/components/sysdetect/linux_cpu_utils.c: sysdetect: fix infinite loop bug. When there is no node in /sys/devices/system/cpu/, sysdetect will loop indefinitely looking for the node affine to the specified thread. Instead, we should look for the existence of node0 for cpu0: if it is present, the other nodes are likely to be in the file system tree; otherwise there is no point looking further and we assume there is only one numa node.

2022-02-19  Giuseppe Congiu

	* src/components/cuda/linux-cuda.c: cuda: fix overcounting of cuda devices. Cuda device counting is done by scanning the file system for the related device number information. There are two places in the file system where this information can be found: /sys and /proc, the second being a fallback in case the first does not contain the desired information. The /proc based device counting went through all the directories in /proc/driver/nvidia/gpus, including '.' and '..', thus overcounting the number of devices. This patch fixes the problem by filtering out those directories.
2022-02-16  Giuseppe Congiu

	* src/components/rocm_smi/linux-rocm-smi.c: rocm_smi: change disabled_reason for delayed init

	* src/components/nvml/linux-nvml.c: nvml: change disabled_reason for delayed init

	* src/components/rocm/linux-rocm.c: rocm: change disabled_reason for delayed init

	* src/components/cuda/linux-cuda.c: cuda: change disabled_reason for delayed init

2022-02-15  Giuseppe Congiu

	* src/papi_internal.c: papi_errno: add delay init string error

	* src/cpus.c, src/papi.c, src/papi_internal.c, src/threads.c: papi: handle delay init for GPU components. The PAPI_EDELAY_INIT error code is handled by PAPI as if the component was enabled. This allows PAPI to kick off delayed init by calling any of the internal component functions that access its events.

	* src/papi.h: papi_errno: add PAPI_EDELAY_INIT error for delayed init components. Delayed initialization components need a way to distinguish delayed initialization from the disabled state.

2021-12-12  Giuseppe Congiu

	* src/papi.c: papi_get_component_info: remove init_private support

	* src/papi_vector.c, src/papi_vector.h: papi_vector: remove init_private support

	* src/utils/papi_component_avail.c: papi_component_avail: add lazy init code for components. Previously we had an init_private() function added to papi_vector and implemented by those components that needed delayed (lazy) initialization, such as rocm, cuda, rocm_smi, and nvml. This delayed init_private() initialization was mainly used by papi_component_avail to read the number of events and hardware counters for reporting purposes to the user of the utility. Applications are free to ignore init_private() and keep PAPI from calling the GPU runtime init functions. The same result can be achieved by forcing lazy init in papi_component_avail by accessing the events in the component.
	If no such access happens, init_component now disables the components mentioned above and sets the disabled status to the following message: "Not initialized, call PAPI_enum_cmp_event or any other component event access function to force lazy init".

	* src/components/nvml/linux-nvml.c: nvml: remove init_private for lazy init

	* src/components/rocm_smi/linux-rocm-smi.c: rocm_smi: remove init_private for lazy init

	* src/components/cuda/linux-cuda.c: cuda: remove init_private for lazy init

	* src/components/rocm/linux-rocm.c: rocm: remove init_private for lazy init. init_private is a hack that causes inconsistency in the component interface, and such inconsistency can cause bugs. This patch removes the init_private interface.

2022-02-03  Anthony Danalis

	* src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/dcache.h, src/counter_analysis_toolkit/timing_kernels.c: Improved error handling and reporting.

2022-01-31  Giuseppe Congiu

	* src/components/cuda/linux-cuda.c: cuda: rename device count routine looking into /sys. Rename _cuda_count_nvidia_devices to _cuda_count_dev_sys to reflect the source of the information and to distinguish this routine from the other counting routine, _cuda_count_dev_proc, which looks into the /proc file system instead.

2022-01-28  Giuseppe Congiu

	* src/components/cuda/linux-cuda.c: cuda: extend device counting with /proc filesystem. /sys based device counting for cuda relies on the linux display rendering manager populating the corresponding entries in the filesystem. This is not always the case and depends on the specific linux configuration; thus, this method might cause the component to wrongly detect no cuda devices on a system that has some. Another source of information for cuda devices is the /proc file system. This patch extends the current /sys functionality with /proc information.
	* src/high-level/scripts/papi_hl_output_writer.py: high-level: make output writer script python2/3 compatible. The papi_hl_output_writer.py script uses 'long', which is no longer supported in Python 3. Replace 'long' with 'int'.

2022-01-27  Giuseppe Congiu

	* src/components/sysdetect/shm.c: sysdetect: fix load_mpi_sym signature

2022-01-22  Giuseppe Congiu

	* src/components/cuda/linux-cuda.c: cuda: fix bug in shutdown sequence. When the cuda component is configured but not used, there is no dlopen() of the nvidia libraries. Still, when the component is shut down, dlclose() of these libraries was unconditionally called, causing a segmentation fault. This patch adds guards around dlclose() so that every dlopen() is always paired with a dlclose().

2022-01-17  Giuseppe Congiu

	* src/components/sysdetect/Rules.sysdetect, src/components/sysdetect/amd_gpu.c: sysdetect/rocm: explicitly look for rocm in PAPI_ROCM_ROOT. Similarly to the ROCm component, sysdetect also requires PAPI_ROCM_ROOT to be defined, so that the user is explicitly forced to define where in the file system tree the ROCm installation to be used is located. This prevents situations in which users believe they are using a certain ROCm version while in reality the component is picking up the one defined by the system environment.

	* src/components/sysdetect/amd_gpu.c: sysdetect/rocm: fix hsa_status_string arguments. hsa_status_string takes an hsa_status_t code and returns a pointer to const char with the error message associated with the status code. The code was passing a pointer to a status array of chars rather than a pointer to a const char; this is fixed by this patch.

2021-12-08  Masahiko, Yamada

	* src/utils/papi_xml_event_info.c: Improve the papi_xml_event_info command. Modify the papi_xml_event_info command as follows:
	- Test only the event name even if the event has a unit mask.
	- Test the other unit masks in the event even if there is an error in one unit mask of the event.
2021-12-03  Giuseppe Congiu

	* src/components/rocm_smi/linux-rocm-smi.c: rocm_smi: fix bug in event reporting while running papi_component_avail. This fix is similar to fixes e646d570 for the cuda component and 8e2f725 for the rocm component.

2021-11-08  Giuseppe Congiu

	* src/papi_vector.c: init_private: fix indentation in order to silence compiler warning. The most recent versions of gcc complain when if statements not followed by curly braces are not indented using tabs. Fix the warning by replacing white spaces with tabs.

2021-07-29  Giuseppe Congiu

	* src/components/sysdetect/README.md, src/components/sysdetect/Rules.sysdetect, src/components/sysdetect/amd_gpu.c, src/components/sysdetect/amd_gpu.h, src/components/sysdetect/arm_cpu_utils.c, src/components/sysdetect/arm_cpu_utils.h, src/components/sysdetect/cpu.c, src/components/sysdetect/cpu.h, src/components/sysdetect/cpu_utils.c, src/components/sysdetect/cpu_utils.h, src/components/sysdetect/linux_cpu_utils.c, src/components/sysdetect/linux_cpu_utils.h, src/components/sysdetect/nvidia_gpu.c, src/components/sysdetect/nvidia_gpu.h, src/components/sysdetect/os_cpu_utils.c, src/components/sysdetect/os_cpu_utils.h, src/components/sysdetect/powerpc_cpu_utils.c, src/components/sysdetect/powerpc_cpu_utils.h, src/components/sysdetect/shm.c, src/components/sysdetect/shm.h, src/components/sysdetect/sysdetect.c, src/components/sysdetect/sysdetect.h, src/components/sysdetect/tests/Makefile, src/components/sysdetect/tests/query_device_mpi.c, .../sysdetect/tests/query_device_simple.c, src/components/sysdetect/x86_cpu_utils.c, src/components/sysdetect/x86_cpu_utils.h, src/configure, src/configure.in, src/papi.h, src/utils/Makefile, src/utils/papi_hardware_avail.c: Sysdetect: system information detection component. The SYSDETECT component allows PAPI users to query comprehensive system information. The information is gathered at PAPI_library_init() time and presented to the user through appropriate APIs.
	The component works similarly to other components, which means that hardware information for a specific device might not be available at runtime if, e.g., the device runtime software is not installed. At the moment the infrastructure defines the following device types:
	- PAPI_DEV_TYPE_ID__CPU : for all CPU devices from any vendor
	- PAPI_DEV_TYPE_ID__NVIDIA_GPU : for all GPU devices from NVIDIA
	- PAPI_DEV_TYPE_ID__AMD_GPU : for all GPU devices from AMD
	Every device is scanned to gather information when the component is initialized. If no installed hardware is found for the considered device type, the corresponding information is filled with zeros. This patch also adds a new utility program, called papi_hardware_avail, that prints to the command line what hardware is installed for each type and the specifications of each device.

2021-10-28  Giuseppe Congiu

	* src/papi.c, src/papi_vector.c: init: make init_private exposed by every component. Lack of uniformity across components burdens front-end code with additional checks. One example is init_private(). This function is implemented only by those components that need delayed initialization due to the high cost of parsing a large number of events from the hardware (e.g. the rocm and cuda components). However, this also means that front-end code has to check whether the init_private() function is implemented by other components in order to avoid dereferencing NULL function pointers. A better solution is to implement init_private() in every component and simply make the function return PAPI_OK if the component does not need delayed initialization.

2021-10-18  Giuseppe Congiu

	* src/papi.c: papi_lock: fix bug in PAPI_lock and PAPI_unlock

2021-10-16  Giuseppe Congiu

	* src/components/rocm/linux-rocm.c: RocmCmp: fix bug in event reporting while running papi_component_avail. In the papi_component_avail() utility program the component private init function is called multiple times.
	The first time, the component events are initialized as expected and the disabled flag is set to the error code returned while performing the process, along with the reason the component might be disabled. The second time the private init function is called is when trying to list the supported events. In this case the init does not remember the previously returned error code and sets the disabled flag to PAPI_OK instead. This same bug was already fixed by Tony Castaldo in patch e646d570 for the cuda component.

2021-10-13  Anthony Danalis

	* src/components/rocm/linux-rocm.c: Added code to set the environment variable "ROCP_HSA_INTERCEPT", which is needed since rocm-4.1, and removed spurious whitespace.

2021-10-03  Giuseppe Congiu

	* src/papi.c: PAPI_accum: documentation bug fix

2021-09-01  Daniel Barry

	* src/components/pcp/linux-pcp.c: Modified the PCP component to use the local host as the PMAPI context. These changes are compatible with both RHEL 7 and 8 and were tested on the IBM POWER9 architecture.

2021-08-30  Vince Weaver

	* src/linux-memory.c: linux-memory: change cache parsing so it works on ARM servers. On Linux we parse files under /sys/devices/system/cpu/ to determine the various cache settings. The old code assumed certain files, such as associativity and linesize, are always there (because they are on x86). Instead of exiting with an error if the files don't exist, the updated code sets the values to 0. This allows the cache values to be returned on ARM systems such as Ampere servers. This could potentially break user code that divides by the cache values (such as taking the cache size and dividing by the linesize: if linesize is zero they could get a divide-by-zero error). I'm not sure if there's a way around this without redesigning how the meminfo structure works.

2021-08-24  Anthony Castaldo

	* src/components/cuda/linux-cuda.c: Added CUPTI_PROFILER=-1 for no PAPI_CUDA_ROOT set at ./configure time.
	Disables the CUDA component with the message "Environment variable PAPI_CUDA_ROOT must be specified before ./configure is executed."

	* src/components/cuda/linux-cuda.c, src/configure, src/configure.in: Changes to ensure PAPI_CUDA_ROOT was set BEFORE ./configure was run, to ensure we distinguish between CUPTI11, Legacy, and misconfigured.

2021-08-24  Masahiko, Yamada

	* src/papi_events.csv: Fix the PAPI_FUL_CCY setting for a64fx. In a64fx, the maximum number of instruction commits is 4, so the following setting was incorrect:
	PAPI_FUL_CCY=CPU_CYCLES-0INST_COMMIT-1INST_COMMIT-2INST_COMMIT-3INST_COMMIT-4INST_COMMIT
	The correct setting is:
	PAPI_FUL_CCY=CPU_CYCLES-0INST_COMMIT-1INST_COMMIT-2INST_COMMIT-3INST_COMMIT

2021-08-23  Anthony Castaldo

	* src/components/cuda/linux-cuda.c: Removed extraneous comments.

	* src/components/cuda/linux-cuda.c: Changed the error messages about the Legacy/Cupti11 failures to better distinguish the exact cause of failure, and updated several possible exits of the initialization that might cause an empty "Disabled" message in papi_component_avail.

2021-08-20  Daniel Barry

	* src/counter_analysis_toolkit/dcache.c: Fixed a bug regarding race conditions in a parallel construct in the CAT data cache benchmarks. These changes were tested on the Fujitsu A64FX architecture.

2021-08-19  Anthony

	* src/components/sde/sde_lib/sde_lib.h, .../Created_Counter/Lib_With_Created_Counter++.cpp, src/components/sde/tests/Simple2/Simple2_Lib++.cpp, src/components/sde/tests/Simple2/Simple2_Lib.c: Updates to the C++ API based on early-adopter feedback.

2021-08-12  Anthony Castaldo

	* src/components/rocm/linux-rocm.c, src/components/rocm_smi/linux-rocm-smi.c: Corrected shutdown code to work correctly if delayed init never executes (due to shutdown without using the component).
2021-08-11 Anthony Castaldo * src/components/cuda/linux-cuda.c: In scanning system devices to find GPUs, added Giuseppe's recommendation to also check device class to filter out all but Display Controllers, which are GPUs. 2021-08-10 Anthony Castaldo * src/components/cuda/linux-cuda.c: Added checks for Nvidia devices up front by scanning /sys/class/drm/card files. This is necessary to avoid cuInit() which is needed to run cuDeviceGetCount(). Also corrected a bug in delayed init, in case it was called more than once after already being disabled. 2021-08-09 Anthony Castaldo * src/components/cuda/linux-cuda.c: Second change, needed proper printf format code. * src/components/cuda/linux-cuda.c: Fixed a compile bug in CUDA that only shows with later modules. 2021-08-07 Anthony Castaldo * src/components/cuda/linux-cuda.c: Changes necessary to sort out at runtime what to do if we were compiled with one cuda module loaded, but run with a different cuda module loaded. Also had a compile error to fix running with an old cuda module. 2021-08-06 Anthony Castaldo * src/components/cuda/linux-cuda.c: Change to reject if compiled in headers have structures of different sizes than the version of the cuda_runtime library we found. Also rejecting libraries <11.0, they don't contain CounterAvailability functions that we currently must use in setting up events; i.e. the 10.x API differs slightly from the 11.x API. Mon Jul 26 16:22:25 2021 +0200 Thomas Richter * src/libpfm4/README, src/libpfm4/docs/Makefile, src/libpfm4/docs/man3/libpfm_arm_neoverse_n2.3, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/events/arm_neoverse_n2_events.h, src/libpfm4/lib/pfmlib_arm_armv8.c, src/libpfm4/lib/pfmlib_arm_perf_event.c, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/lib/pfmlib_s390x_cpumf.c, src/libpfm4/tests/validate_arm64.c: Update libpfm4, to be current with the following commit.
Tested on orbitty.icl.utk.edu, ARMv8 Processor rev 1 (v8l). commit 790451411d481492b6a3b94077b543c3e68c6d2b do not set certain config bits in pfm_arm_get_perf_encoding() By default (raw encoding) on ARM, the library was setting the PL1, USR, HYP control bits in the config in the encoded value. With Linux perf_events, these bits are under the control of the kernel. Any of these bits set by the user is overridden by the kernel based on the settings of the perf_event_attr.exclude_* fields. Recent versions of the perf tool started checking that the config field is not setting bits which are ignored by the kernel. To avoid the perf tool warning, this patch removes the setting of these bits when encoding for Linux perf_events. commit 0c3efc889fadc8cd9a632f5a10462d37c508c56a add support for ARM Neoverse N2 core PMU This patch adds support for ARM Neoverse N2 core PMU based on the ARM TRM version 0. The new PMU is called arm_n2. commit e166a8869f64cd3a47b2b42a3022e4cceecea799 Support cycles:u modifier for s390 The function invocation of pfm_get_perf_event_encoding("cycles:u", ...) fails on s390. However the modifier :u is supported on s390, whereas modifiers :h and :k are not supported. Fix this by adding the supported_plm field and setting it properly. This setting causes function pfm_perf_perf_validate_pattrs() to accept modifier :u as valid. Test code: ....
memset(&attr, 0, sizeof(attr)); attr.size = sizeof(attr); ret = pfm_get_perf_event_encoding(evname, PFM_PLM0|PFM_PLM3, &attr, NULL, NULL); txt = pfm_strerror(ret); printf("TEST %s ret:%d(%s) config:%#lx type:%d\n", evname, ret, txt, attr.config, attr.type); Output before: TEST cycles:u ret:-8(invalid event attribute) config:0 type:0 Output after: TEST cycles:u ret:0(success) config:0 type:0 Acked-by: Sumanth Korikkar 2021-08-05 Daniel Barry * src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/dcache.h, src/counter_analysis_toolkit/main.c: Added feature to CAT data cache benchmark to print a header line in the output files. This header shows the ID of the CPU core to which each thread was pinned. This provides more detail of the hardware context for reproducibility. These changes were tested on the Fujitsu A64FX architecture. 2021-07-30 Anthony * src/components/sde/sde_lib/sde_lib.h: Cleanup. * src/components/sde/sde_lib/sde_lib.h, .../Created_Counter/Created_Counter_Driver++.cpp, .../Created_Counter/Lib_With_Created_Counter++.cpp, src/components/sde/tests/Makefile, .../sde/tests/Minimal/Minimal_Test++.cpp, src/components/sde/tests/README.txt, .../sde/tests/Recorder/Lib_With_Recorder++.cpp, .../sde/tests/Recorder/Recorder_Driver++.cpp, src/components/sde/tests/Simple/Simple_Driver.c, .../sde/tests/Simple2/Simple2_Driver++.cpp, src/components/sde/tests/Simple2/Simple2_Driver.c, src/components/sde/tests/Simple2/Simple2_Lib++.cpp, src/components/sde/tests/Simple2/Simple2_Lib.c: C++ interface for the library-side API of papi-sde and examples that demonstrate its usage. 2021-07-29 Heike Jagode * src/papi.c: Added missing changes for 'delayed init' feature to ensure that our PAPI utilities still report the correct number of native events and counters. 2021-07-28 Anthony Castaldo * src/components/cuda/linux-cuda.c: Minor changes to Macros so merge is not confused.
2021-07-29 Daniel Barry * src/counter_analysis_toolkit/caches.h, src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/dcache.h, src/counter_analysis_toolkit/timing_kernels.c: Added feature to CAT data cache benchmark to measure data read latencies for each worker thread. This allows us to observe additional data-access nuance for each core in the socket. These changes were tested on the Fujitsu A64FX architecture. 2021-07-28 Anthony Castaldo * src/components/cuda/Rules.cuda, src/components/cuda/tests/HelloWorld.cu, src/components/cuda/tests/Makefile, src/components/cuda/tests/simpleMultiGPU.cu, .../cuda/tests/simpleMultiGPU_CUPTI11.cu, src/configure: Moved test for CUpti 11 to configure, out of Rules.cuda. Modified HelloWorld and simpleMultiGPU.cu to work properly with Legacy CUpti. Added simpleMultiGPU_CUPTI11.cu, because CUcontext monitoring requires a different protocol for managing CUcontexts. Adjusted Makefile accordingly. 2021-07-28 Daniel Barry * src/counter_analysis_toolkit/caches.h, src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/dcache.h, src/counter_analysis_toolkit/driver.h, src/counter_analysis_toolkit/main.c, src/counter_analysis_toolkit/timing_kernels.c, src/counter_analysis_toolkit/timing_kernels.h: Added feature to CAT data cache benchmark to measure event occurrences for each worker thread. This allows us to accurately measure a chip's region-specific events. These changes were tested on the Fujitsu A64FX architecture. 2021-07-27 Anthony Castaldo * src/components/cuda/linux-cuda.c: Changes to compile clean on Legacy Cupti.
2021-07-27 Anthony * src/components/sde/Rules.sde, src/components/sde/sde_internal.h, src/components/sde/sde_lib/papi_sde_interface.h, src/components/sde/sde_lib/sde_common.c, src/components/sde/sde_lib/sde_common.h, src/components/sde/sde_lib/sde_lib.c, src/components/sde/sde_lib/sde_lib.h, src/components/sde/sde_lib/weak_symbols.c, .../sde/tests/Advanced_C+FORTRAN/Gamum.c, .../sde/tests/Advanced_C+FORTRAN/sde_symbols.c, .../Created_Counter/Lib_With_Created_Counter.c, src/components/sde/tests/Makefile, src/components/sde/tests/Minimal/Minimal_Test.c, .../sde/tests/Recorder/Lib_With_Recorder.c, src/components/sde/tests/Simple/Simple_Driver.c, src/components/sde/tests/Simple/Simple_Lib.c, src/components/sde/tests/Simple2/Simple2_Driver.c, src/components/sde/tests/Simple2/Simple2_Lib.c, src/configure, src/configure.in, src/utils/Makefile, src/utils/papi_native_avail.c: Converted libsde into a header-only library to ease integration into third-party software. Now the only thing a third-party code needs in order to export SDEs is to #include "sde_lib.h". This change also simplified the integration into the PAPI utility papi_native_avail so "linking tricks" and weak symbols are not needed anymore. 2021-07-26 Anthony Castaldo * src/components/cuda/README.md, src/components/cuda/Rules.cuda, src/components/cuda/linux-cuda.c, src/components/cuda/tests/HelloWorld.cu, src/components/cuda/tests/HelloWorld_CUPTI11.cu, src/components/cuda/tests/simpleMultiGPU.cu: Changes to make the test code work as expected in CUpti 11, with the CUpti callback monitoring of CUcontext activity. 2021-07-15 Anthony Castaldo * src/components/cuda/linux-cuda.c: On A100 CC 8.0, details on some events fail; this caused debug errors to print that should have been suppressed. Corrected this. * src/components/cuda/linux-cuda.c: disabled some debug messages. * src/components/cuda/linux-cuda.c: Legacy CUPTI was failing if PAPI user already had a context set.
* src/components/cuda/linux-cuda.c: Clean up some debug code, and extraneous code in cuda_shutdown that belonged in cuda11_shutdown. 2021-07-12 Anthony Castaldo * src/components/cuda/linux-cuda.c, src/components/cuda/tests/HelloWorld_CUPTI11.cu: Retested with valgrind for memory leaks; removed redundant code in the cuda11_read(), and corrected HelloWorld_CUPTI11.cu to use a single pass event, instead of my test 2-pass event. 2021-07-09 Anthony Castaldo * src/components/cuda/linux-cuda.c, src/components/cuda/tests/HelloWorld_CUPTI11.cu: Corrected a bug in description formation; had HelloWorld_CUPTI11.cu report in both decimal and hexadecimal. * src/components/cuda/linux-cuda.c: Corrected problems with correctly choosing Legacy or CUpti-11, and issues about what to include/exclude to be compatible with previous Nvidia distributions without profile headers and libraries. * src/components/cuda/README.md, src/components/cuda/Rules.cuda, src/components/cuda/linux-cuda.c: Updates in commentary and documentation. 2021-07-07 Anthony Castaldo * src/components/cuda/README.md, src/components/cuda/linux-cuda.c: All CUPTI11 code works, legacy cupti still works, on xsdk. However, configure and build still needs work, it will fail without a PerfWorks directory. 2021-07-08 Daniel Barry * src/counter_analysis_toolkit/timing_kernels.c: Increased the minimum number of pointer chain accesses in the CAT data cache benchmark. This yields more stable measurements when using smaller buffer sizes. These changes were tested on the Fujitsu A64FX architecture. 2021-06-23 Anthony Castaldo * src/components/cuda/linux-cuda.c, src/components/cuda/tests/HelloWorld_CUPTI11.cu, src/components/cuda/tests/Makefile: Extensive changes linux-cuda.c to use CUpti-11 when CC >= 7.0; this has been tested and seems to work on ICL xsdk, but is not optimized. Adding a changed Makefile and HelloWorld_CUPTI11.cu to better test various scenarios of cuda_context arrangements. 
Sat Jun 5 15:07:53 2021 -0700 Thomas Richter * src/libpfm4/lib/events/intel_skl_events.h, src/libpfm4/lib/pfmlib_amd64.c, src/libpfm4/lib/pfmlib_amd64_fam11h.c, src/libpfm4/lib/pfmlib_amd64_fam12h.c, src/libpfm4/lib/pfmlib_amd64_fam15h.c, src/libpfm4/lib/pfmlib_amd64_fam16h.c, src/libpfm4/lib/pfmlib_amd64_fam17h.c, src/libpfm4/lib/pfmlib_amd64_fam19h.c, src/libpfm4/lib/pfmlib_amd64_perf_event.c, src/libpfm4/lib/pfmlib_amd64_priv.h, src/libpfm4/lib/pfmlib_intel_x86_perf_event.c, src/libpfm4/lib/pfmlib_perf_event.c, src/libpfm4/lib/pfmlib_s390x_perf_event.c: Update libpfm4, to be current with the following commit. Tested on icl.utk.edu machines xsdk (Intel Xeon Gold 6254), morphine (AMD EPYC 7301), histamine and dopamine (AMD EPYC 7402), guyot (AMD EPYC 7742). commit d0b85fb5813dbd73e408fa21dceaf204623609cc AMD64 encoding and debug cleanup This patch fixes and updates the way the guest vs. host encoding is handled. The guest vs. host hardware filtering is available since Fam10h onward except for Fam11h. This is now handled with proper pmu_rev encoding for each PMU and a new helper function pfm_amd64_supports_virt(). Also fixes the verbose output to handle guest vs. host correctly. commit e3ae4bd86b9f37cbdc31625dd23b80ef66da5df7 fix typo in OFFCORE_RESPONSE umask on Intel SkylakeX L3_MISS_MISS_REMOTE_HOP1_DRAM -> L3_MISS_REMOTE_HOP1_DRAM Reported-by: Ian Rogers commit 0106e839a8bade2abda66512b8b4be2338fc3729 make verbose print more explicit in pfmlib_perf_event_encode() Spell out the field names better to make them easier to understand. commit e0fcc38251cf680fcdd0c18b4c13327737f3ebb8 do not set certain config bits in pfm_amd64_get_perf_encoding() By default on Intel X86, the library was setting the EN and INT bits for each core PMU events. But when encoding for perf_events, these bits are ignored by the interface and reprogrammed by the kernel. Similarly, the USR/OS/GUEST/HOST bits are controlled by the perf_event_attr.exclude_* field not the config field.
Recent versions of the Linux perf tool warn when bits which are ignored are set in the config field which is useful. This patch clears all the config bits under the control of the perf_events interface. The encoding for raw PMU mode is unchanged. commit 5be1e849a25c7d02bdeb04678bfe204783b8b5ff do not set certain config bits in pfm_intel_x86_get_perf_encoding() By default on Intel X86, the library was setting the EN and INT bits for each core PMU events. But when encoding for perf_events, these bits are ignored by the interface and reprogrammed by the kernel. Similarly, the USR/OS bits are controlled by the perf_event_attr.exclude_* field not the config field. Recent versions of the Linux perf tool warn when bits which are ignored are set in the config field which is useful. This patch clears all the config bits under the control of the perf_events interface. The encoding for raw PMU mode is unchanged. commit 3833ff527012a33131f9af2530fe1447f6984ebf search perf attr.type event number for s390 Commit 30adc677603b ("lib/pfmlib_s390x_perf_event.c: Fix perf attr.type event number for s390") fixes the dynamic PMU type assignment by the kernel when s390x PMU device drivers are loaded at boot time. However s390x has several PMU device drivers. Therefore find the correct one first and then return the type number read out from a sysfs file. Once the PMU type number is determined, it does not change until the next reboot. It is ok to cache it. Also add a check if the PMU really exists and return an error if not. Fixes: 30adc677603b ("lib/pfmlib_s390x_perf_event.c: Fix perf attr.type event number for s390") 2021-06-15 William Cohen * src/validation_tests/instructions_testcode.c: Use numeric local labels to allow compilation with LTO enabled Some assembly snippets in instructions_testcode.c used regular label names. 
Unfortunately, when multiple copies of the snippets are inlined in different places with LTO enabled the multiple copies of a label by the same name cause the build to fail because of the redefinition of the label. To avoid this problem all those labels have been converted to numeric local labels to allow multiple copies to peacefully coexist in the LTO enabled code. 2021-06-10 Heike Jagode * src/Rules.pfm4_pe: Rebase to remove commit 1f48bb7 since there appear to be issues with this. Sun May 9 15:45:18 2021 -0700 Stephane Eranian * src/libpfm4/docs/Makefile, src/libpfm4/docs/man3/libpfm_intel_icl.3, src/libpfm4/docs/man3/libpfm_intel_icx.3, src/libpfm4/docs/man3/pfm_get_os_event_encoding.3, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/events/arm_neoverse_n1_events.h, src/libpfm4/lib/events/intel_icl_events.h, src/libpfm4/lib/pfmlib_amd64.c, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_icl.c, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_x86.c: Update libpfm4, to be current with the following commit: Tested on Histamine (Zen2) Dopamine (Zen2) Morphine (Zen1) XSDK (Intel). commit 74b79969f2f752df3be404d9c23f9709d738062f fix buffer overrun in Intel IcelakeX model table The following commit introduced a bug: 12aeb9f69438 enable Intel IcelakeX core PMU support By forgetting a NULL termination to the icx_models[] table. commit e2bd6b5b573b124d5c07670cfc9f0923b6223288 fix Intel Icelake man page date No Icelake in 2015! commit 12aeb9f694382bbf82061ac0b28abb5d2178fe8d enable Intel IcelakeX core PMU support This patch adds Intel IcelakeX (Icelake for servers) core PMU support. This is the same core PMU as for the client Icelake with the addition of events to cover remote and PMM accesses. Based on Intel's icelakex_core_v1.04.json from 01.org. commit 9c3e9c025efc06f4ac4422d5e87a05d9776cbb94 fix detection of AMD64 Zen1 vs. Zen2 This patch fixes the test checking the model number for AMD64 Fam17h processors. 
There was a bug where it would detect some Zen1 processors as Zen2. Zen2 processors start at model number 48 and up. commit dee24f6323023573f22dc68882cea44859c0b7ac add ARM SPE events for Neoverse N1 core PMU This patch adds the four Statistical Profiling Extension (SPE) related core PMU events: - SAMPLE_POP - SAMPLE_FEED - SAMPLE_FILTRATE - SAMPLE_COLLISON commit 21787c7cca3b8b4d02e5608bfef9bdfa7acd7d8e fix pfm_get_os_event_encoding man page typos There is no PERF_OS_EVENT enum, should be PFM_OS_PERF_EVENT. 2021-05-22 Anthony Castaldo * src/components/cuda/linux-cuda.c, src/components/nvml/linux-nvml.c, src/components/rocm/linux-rocm.c, src/components/rocm_smi/linux-rocm-smi.c, src/papi.h, src/papi_vector.h: Reposting changes made by Damien Genet, with bug corrections, to delay component initialization until necessary. For CUDA, NVML, ROCM and ROCM_SMI components. CUDA and NVML components tested on XSDK, ROCM and ROCM_SMI components tested on Caffeine. 2021-05-17 Daniel Barry * src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/main.c, src/counter_analysis_toolkit/timing_kernels.c: Added feature to CAT to collect latency data for the entire parameter sweep used in the data cache reading benchmark. Also fixed an overflow error in the number of pointer-chain accesses by storing this value as a 'long' instead of an 'int'. 2021-05-18 Swarup Sahoo * src/papi_events.csv: Added AMD Zen3 preset events. Refer to section 2.1.17.2 of PPR for AMD family 19h model 01h, https://www.amd.com/system/files/TechDocs/55898_pub.zip 2021-05-04 Anthony Castaldo * src/components/cuda/linux-cuda.c: Corrected an error discovered by Tristan Konolige; pushing the retained context when it is identical to the current context causes an error. Also updated all error exits to properly restore user context.
Sun May 2 23:43:17 2021 -0700 Stephane Eranian * src/libpfm4/docs/man3/pfm_get_os_event_encoding.3, src/libpfm4/include/perfmon/perf_event.h, src/libpfm4/lib/events/perf_events.h, src/libpfm4/lib/pfmlib_amd64_rapl.c: Update libpfm4, to be current with the following commit: The ZEN3 modification cannot be tested; we have no ZEN3 machine. The other changes are not machine specific; we did a smoke test (compile and execute papi_component_avail, papi_native_avail) on ICL's xsdk machine. commit 06197c0543476d40fad1c94d240e46a5d114f887 enable RAPL for AMD64 Fam19h Zen3 processor As per AMD64 PPR for Fam19h model 01h, RAPL Package is supported, so enable it. commit be0dd1e0f63cb3d0915bc368baebe778792b6955 Add cgroup-switches software event Linux v5.13 added the 'cgroup-switches' event so it should be supported by libpfm4 as well. commit d624a97b8e2143e1b890ac1a892b4620acb736f5 fix arg type in pfm_get_os_event_encoding() man page This patch replaces references to pfm_raw_pmu_encode_t with pfm_pmu_encode_t to reflect the actual data type used in the code. Thanks to Claudio Parra for reporting the issue. 2021-05-03 Anthony Castaldo * src/components/cuda/linux-cuda.c: Correcting a typo that can cause a segfault. 2021-04-29 Anthony Castaldo * src/components/rocm/linux-rocm.c: Using macros (like papi_debug.h) instead of if (0). 2021-04-28 Anthony Castaldo * src/components/cuda/linux-cuda.c: Additional context cleanup in _cuda_update_control_state() to accommodate issues with non-primary contexts. 2021-04-23 Anthony Castaldo * src/components/rocm/linux-rocm.c: Deleted an extraneous paranoid line of code. 2021-04-22 Anthony Michael Castaldo * src/components/rocm/README.md, src/components/rocm/Rules.rocm, src/components/rocm/linux-rocm.c, src/components/rocm/rocm_IncDirs.awk: Improved automatic detection of ROCM root directory, so exporting PAPI_ROCM_ROOT is not always necessary on systems that load modules. We recognize environment variables ROCM_PATH, ROCM_DIR, and ROCMDIR.
At compile time, we have code in Rules.rocm that can examine the LD_LIBRARY_PATH variable and extract possible -Iinclude_paths for the compile. This uses 'awk', but if 'awk' is not present on the system it won't cause an error message. We will also still use PAPI_ROCM_ROOT at compile time, preferentially, when specified. README.md has been updated to reflect these changes. 2021-04-22 William Cohen * src/Makefile.inc: Correct warning message to 'make dist-targz'. 2021-04-20 William Cohen * src/utils/papi_multiplex_cost.c: Check to ensure that mallocs allocated memory in papi_multiplex_cost.c The malloc function can return NULL if the function is unable to allocate memory. papi_multiplex_cost.c needs checks like papi_command_line.c has and exit the program with an error if any of the malloc operations fail. 2021-04-13 Anthony Castaldo * src/components/cuda/linux-cuda.c, src/components/cuda/tests/HelloWorld_NP_Ctx.cu, src/components/cuda/tests/Makefile: This code corrects an oversight and works if the application has already created a non-primary context before calling PAPI_library_init(). A modification of HelloWorld.cu, HelloWorld_NP_Ctx.cu, will test if the code works with a non-primary context created; HelloWorld.cu tests without creating a non-primary context. This was tested on XSDK with two Titan V GPUs. 2021-04-12 Anthony * src/configure, src/configure.in: Changes to the configure script to accommodate the (upcoming) intel_gpu component. Fri Apr 2 12:38:56 2021 -0700 Stephane Eranian * src/libpfm4/lib/pfmlib_amd64_fam19h_l3.c, src/libpfm4/lib/pfmlib_intel_snbep_unc.c, src/libpfm4/tests/validate_x86.c: The following fixes are for AMD Zen3 CPUs, untested by ICL; we have no access to Zen3 processors at this time.
Update libpfm4, to be current with the following commit: commit 6864dad7cf85fac9fff04bd814026e2fbc160175 Fix AMD64 Fam19h L3 PMU support The PMU perf_events type was not correctly encoded because the .perf_name field was not initialized and therefore it defaulted to using the core PMU. The correct perf_name is "amd_l3". With that in place, the library now picks up the correct PMU type and associated programming restrictions, e.g., per-cpu mode only and code such as perf_examples/self should not be allowed to succeed at perf_event_open(). Reported-by: Steve Kaufmann commit 99975b4738cf7f2550922f0761f2776159842c00 fix grpid handling for Intel X86 uncore On SkylakeX the umask grpid field is overloaded to contain two subfields: the actual grpid and the required grpid (at offset 8). The encoding code has a bug where it would not use the accessor function get_grpid() to extract the group id from the field. Given that the grpid is used in statements such as: u = 1 << pe[e->event].umasks[a->idx].grpid; The code could run the risk of exceeding the max shift for a 16-bit value. The fix is to use the accessor function to extract the grpid. The patch also adds a validation test to ensure events which would cause a large grpid are properly encoded. 2021-04-06 Anthony * src/counter_analysis_toolkit/.cat_cfg, src/counter_analysis_toolkit/main.c: Adjust cache levels based on information in config file, and make the default config file empty.
2021-04-05 Anthony * src/counter_analysis_toolkit/.cat_cfg, src/counter_analysis_toolkit/Makefile, src/counter_analysis_toolkit/branch.c, src/counter_analysis_toolkit/branch.h, src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/dcache.h, src/counter_analysis_toolkit/driver.h, src/counter_analysis_toolkit/event_list.txt, src/counter_analysis_toolkit/flops.c, src/counter_analysis_toolkit/flops.h, src/counter_analysis_toolkit/hw_desc.h, src/counter_analysis_toolkit/icache.c, src/counter_analysis_toolkit/icache.h, src/counter_analysis_toolkit/main.c, src/counter_analysis_toolkit/prepareArray.c, src/counter_analysis_toolkit/prepareArray.h, src/counter_analysis_toolkit/timing_kernels.c, src/counter_analysis_toolkit/timing_kernels.h: Changed CAT code to enable dynamic discovery of cache sizes, and also user provided values (through .cat_cfg file). Wed Jan 27 20:12:59 2021 +0900 Masahiko, Yamada * src/libpfm4/README, src/libpfm4/docs/Makefile, src/libpfm4/docs/man3/libpfm_amd64_fam19h_zen3.3, .../docs/man3/libpfm_amd64_fam19h_zen3_l3.3, src/libpfm4/docs/man3/libpfm_arm_a64fx.3, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/events/amd64_events_fam19h_zen3.h, .../lib/events/amd64_events_fam19h_zen3_l3.h, src/libpfm4/lib/events/perf_events.h, src/libpfm4/lib/pfmlib_amd64.c, src/libpfm4/lib/pfmlib_amd64_fam19h.c, src/libpfm4/lib/pfmlib_amd64_fam19h_l3.c, src/libpfm4/lib/pfmlib_amd64_priv.h, src/libpfm4/lib/pfmlib_amd64_rapl.c, src/libpfm4/lib/pfmlib_arm_armv8.c, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_icl.c, src/libpfm4/lib/pfmlib_intel_nhm_unc.c, src/libpfm4/lib/pfmlib_perf_event_pmu.c, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/perf_examples/notify_group.c, src/libpfm4/perf_examples/perf_util.c, src/libpfm4/tests/validate_x86.c: This affects the processor AMD Zen2, we tested on it. 
It affects the following processors we do not have to test on: A64FX (Fujitsu ARM), AMD Zen3, Intel TigerLake and RocketLake. Update libpfm4, to be current with the following commit: commit c132ab4948a828334a8fef00303a4b47f59bb4d9 Add prefix to AMD Fam19h Zen3 L3 events To avoid potential conflict with other core PMU events and make it more explicit these are uncore L3 events following the model of Intel uncore PMUs. commit a97908e8e6b6a28ae369dfbc9af97b52fe932273 Enable Intel Tigerlake and Rocketlake core PMU support They are equivalent to Intel Icelake, so reuse the same event table. commit 315941fc05f5a487e4eb5efd36ea10438336944b add AMD64 Fam19h Zen3 L3 PMU support This patch adds the AMD Fam19h (Zen3) L3 PMU support consisting of 3 published events. new PMU model: amd64_fam19h_zen3_l3 Based on the public specifications PPR (#55898) Rev 0.35 - Feb 5, 2021. Available at: https://www.amd.com/system/files/TechDocs/55898_pub.zip commit e2afb6186dab2419a4b6f79a6adf7cd9bb0f2340 Add AMD64 Fam17h Zen2 RAPL support This patch adds RAPL support for AMD64 Fam17h Zen2 processors. On Zen2, only the RAPL_ENERGY_PKGS event is supported. commit cc4ba27e55440f87359bee5176380db1ba4ef8af Add AMD64 Fam19h Zen3 core PMU support The patch adds a core PMU support for AMD Fam19h Zen3. new PMU model: amd64_fam19h_zen3 Based on the public specifications PPR (#55898) Rev 0.35 - Feb 5, 2021. Available at: https://www.amd.com/system/files/TechDocs/55898_pub.zip commit 5333f3245954b038100530a17675bbbafdae3061 Fix casting issues reported by PGI compiler The PGI compiler does not like: struct { unsigned long field; }; struct.field = -1, So clean this up and various other casting issues reported by Carl Ponder on the bugs. commit f6500e77563e606c8510ff26f57d321328bd8157 Changing the number of PMU counters and deleting the ARM(32-bit) mode for A64FX The current libpfm4 implementation treats PMCR_EL0.N = 0x6 like other ARM Reference processors.
On an A64FX, PMCR_EL0.N = 0x8 (The number of PMU counters is 8.). Therefore, only 6 counters are available in the current implementation. The A64FX core also supports the AArch64 state and the A64 Instruction set. The AArch32 state and the A32, T32 Instruction set are not supported and cannot be transitioned to this Execution state. Currently, the libpfm manual (docs/man3/libpfm_arm_a64fx.3) states that A32/A64 can be used, but A32 cannot be used. I have created a patch with the above fixes, so please review and merge it. Originally, the specification of the A64FX which Fujitsu published should have described the above two points, but the description was omitted. A64FX Specification HPC Extension v1.1 will add: - On an A64FX, PMCR_EL0.N = 0x8 (The number of PMU counters is 8.). - A64FX does not support the AArch32 state and the A32, T32 Instruction set and cannot transition to this Execution state. 2021-03-11 Frank Winkler * src/high-level/papi_hl.c: Improved randomization of rank id. 2021-03-10 Frank Winkler * src/high-level/papi_hl.c: Added more hardware information in hl performance output. 2021-03-09 Frank Winkler * src/high-level/papi_hl.c: Improved hl performance output for parallel programs. If the system does not provide the rank id, a unique file is created per rank. This implementation avoids race conditions. 2021-02-24 Anthony Castaldo * src/components/cuda/linux-cuda.c: Corrects a sequence error in the use of cuda context that was causing an issue on Summit. * src/components/cuda/linux-cuda.c: interim commit for merge 2021-02-22 Frank Winkler * src/high-level/papi_hl.c, src/high-level/scripts/papi_hl_output_writer.py: Improved hl output. 2021-02-22 Anthony Castaldo * src/components/rocm/tests/rocm_example.cpp, src/components/rocm_smi/tests/rocmsmi_example.cpp: Modifications to commentary in instructional example code, for accuracy and clarity. 2021-02-19 Anthony Castaldo * src/components/rocm/tests/ROCM_Makefile: Deleting obsolete ROCM_Makefile.
* src/components/rocm/tests/ROCM_Makefile: Re-adding components/rocm/tests/ROCM_Makefile to resolve merge conflict. It is obsolete, and will be deleted in a future update. * src/components/rocm/tests/ROCM_Makefile: ROCM_Makefile is obsolete; incorporated into Makefile. * src/components/rocm/tests/ROCM_Makefile, src/components/rocm_smi/linux-rocm-smi.c: Restoring ROCM_Makefile to deal with merge conflict. Adding sensor 0-relative, 1-relative fix. 2021-02-19 Frank Winkler * src/high-level/papi_hl.c, src/high-level/scripts/papi_hl_output_writer.py: Revised hl output. 2021-02-19 Anthony Castaldo * src/components/rocm/linux-rocm.c, src/components/rocm_smi/linux-rocm-smi.c: Clean up code and library search for both components. For ROCM, automatically set rocprofiler environment variables if missing. 2021-02-18 Frank Winkler * src/high-level/papi_hl.c: Fixed raw output. * src/high-level/papi_hl.c: Added component name to event definitions. 2021-02-16 Masahiko, Yamada * src/papi_events.csv: remove PAPI_L1_TCA and PAPI_L1_TCH for a64fx PAPI_L1_TCA and PAPI_L1_TCH for a64fx measure L1D_CACHE just like PAPI_L1_DCA and PAPI_L1_DCH, so I delete (comment out) PAPI_L1_TCA and PAPI_L1_TCH for a64fx from the papi_events.csv file. 2021-02-15 Daniel Barry * src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/timing_kernels.c: Modified the multi-threaded CAT data cache benchmark so that each thread's memory buffer is allocated in a separate thread. Allocating all buffers in a single thread means they exist in the same NUMA region. This change prevents an imbalance of memory accesses to just a single NUMA region. This change was tested on the IBM POWER9 architecture. 2021-02-14 Frank Winkler * src/high-level/papi_hl.c: Modified recording of regions.
- All regions have a unique region ID - Added hierarchy for nested regions - List regions that have the same name separately in the JSON output 2021-02-12 Masahiko, Yamada * src/papi_events.csv: remove PAPI_L1_DCA and PAPI_L1_DCH for a64fx There seems to be a problem with PAPI_L1_DCA and PAPI_L1_DCH for a64fx that prefetch overcounts. I delete (comment out) PAPI_L1_DCA and PAPI_L1_DCH for a64fx from the papi_events.csv file. I will issue the pull request again once I have identified how to handle the overcount. 2021-02-11 Daniel Barry * src/counter_analysis_toolkit/Makefile, src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/dcache.h, src/counter_analysis_toolkit/timing_kernels.c, src/counter_analysis_toolkit/timing_kernels.h: Implemented a multi-threaded version of the CAT data cache benchmarks. This is necessary for full utilization of the hardware in the memory hierarchy, which provides more stable benchmark results. These changes were tested on the IBM POWER9 architecture. 2021-02-10 Daniel Barry * src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/dcache.h: Removed the CAT data cache benchmarks from running in a separate, stand-alone thread. This is a necessary step to implement truly multi-threaded versions of the benchmarks. These changes were tested on the IBM POWER9 architecture. 2021-02-04 Anthony Castaldo * src/components/rocm_smi/tests/rocmsmi_example.cpp: Minor modifications to comments and report code.
2021-02-03  Anthony Castaldo

	* src/components/rocm/tests/Makefile, src/components/rocm/tests/ROCM_Makefile, src/components/rocm/tests/rocm_all.cpp, src/components/rocm/tests/rocm_example.cpp, src/components/rocm_smi/tests/Makefile, src/components/rocm_smi/tests/ROCM_SMI_Makefile, .../rocm_smi/tests/power_monitor_rocm.cpp, .../rocm_smi/tests/rocm_command_line.cpp, src/components/rocm_smi/tests/rocm_smi_all.cpp, .../rocm_smi/tests/rocm_smi_writeTests.cpp, src/components/rocm_smi/tests/rocmsmi_example.cpp: In ROCM and ROCM_SMI, deleted specialty Makefiles and incorporated all makes into .../tests/Makefile. This required minor mods to existing files to get a clean compile without warnings. Added two files, rocm_example.cpp and rocmsmi_example.cpp, that are coding tutorials with heavy commenting for programmers new to PAPI; these will also be used for video tutorials by AMD.

2021-01-29  Anthony

	* src/components/sde/tests/lib/.gitignore: Added the directory lib under sde/tests.

	* src/components/sde/Rules.sde, src/components/sde/interface/papi_sde_interface.c, src/components/sde/interface/papi_sde_interface.h, src/components/sde/sde_lib/Makefile, src/components/sde/sde_lib/sde_common.c, src/components/sde/sde_lib/sde_common.h, src/components/sde/sde_lib/sde_lib.c, src/components/sde/sde_lib/weak_symbols.c, src/components/sde/tests/Makefile, src/configure, src/configure.in, src/utils/Makefile, src/utils/Makefile.target.in: Cleaned up the stand-alone SDE code. Now it does not need to be built into a separate library; the sources/objects can be integrated into third-party libraries directly.
2021-01-20 Anthony * src/components/sde/interface/papi_sde_interface.c, src/components/sde/sde.c, src/components/sde/sde_internal.h, src/components/sde/sde_lib/Makefile, src/components/sde/sde_lib/sde_common.c, src/components/sde/sde_lib/sde_common.h, src/components/sde/sde_lib/sde_lib.c, src/components/sde/sde_lib/weak_symbols.c, .../tests/Created_Counter/Created_Counter_Driver.c, .../Created_Counter/Lib_With_Created_Counter.c, .../sde/tests/Created_Counter/Overflow_Driver.c, src/components/sde/tests/Minimal/Minimal_Test.c, .../sde/tests/Recorder/Lib_With_Recorder.c, .../sde/tests/Recorder/Recorder_Driver.c, src/components/sde/tests/Simple/Simple_Driver.c, src/components/sde/tests/Simple/Simple_Lib.c, src/components/sde/tests/Simple2/Simple2_Driver.c, src/components/sde/tests/Simple2/Simple2_Lib.c, .../sde/tests/Simple2/Simple2_NoPAPI_Driver.c: Removed trailing white spaces. 2020-09-03 Anthony * src/components/sde/Rules.sde, src/components/sde/sde.c, src/components/sde/sde_common.c, src/components/sde/sde_common.h, src/components/sde/sde_internal.h, src/components/sde/sde_lib.c, src/components/sde/sde_lib/Makefile, src/components/sde/sde_lib/papi_sde_interface.h, src/components/sde/sde_lib/sde_common.c, src/components/sde/sde_lib/sde_common.h, src/components/sde/sde_lib/sde_lib.c, src/components/sde/sde_lib/weak_symbols.c, .../sde/tests/Advanced_C+FORTRAN/sde_test_f08.F90, .../tests/Created_Counter/Created_Counter_Driver.c, .../Created_Counter/Lib_With_Created_Counter.c, .../sde/tests/Created_Counter/Overflow_Driver.c, src/components/sde/tests/Makefile, src/components/sde/tests/Minimal/Minimal_Test.c, src/components/sde/tests/README.txt, .../sde/tests/Recorder/Lib_With_Recorder.c, .../sde/tests/Recorder/Recorder_Driver.c, src/components/sde/tests/Simple/Simple_Driver.c, src/components/sde/tests/Simple2/Simple2_Driver.c, src/components/sde/tests/Simple2/Simple2_Lib.c, .../sde/tests/Simple2/Simple2_NoPAPI_Driver.c, src/run_tests.sh, src/run_tests_exclude.txt, 
	src/utils/Makefile, src/utils/papi_native_avail.c, src/utils/papi_sde_interface.c:
	+ Major restructuring of the libsde code, so that it can be used more easily by external projects.
	+ Changes to the tests so they conform to the rest of PAPI's testing infrastructure.

2020-05-30  Anthony

	* src/components/sde/README.md, src/components/sde/Rules.sde, src/components/sde/sde.c, src/components/sde/sde_common.h: Fixed a problem occurring in non-debug builds and updated the README file.

2020-05-29  Anthony

	* src/components/sde/sde.c, src/components/sde/sde_common.c, src/components/sde/sde_common.h, src/components/sde/sde_internal.h, src/components/sde/sde_lib.c: Better header organization.

	* src/components/sde/tests/Makefile, .../sde/tests/Simple2/Simple2_NoPAPI_Driver.c: New test with no libpapi.so linkage was added.

	* src/components/sde/sde.c, src/components/sde/sde_internal.h, src/components/sde/sde_lib.c: More complete support for overflowing. Now, case r5 is supported.

	* src/components/sde/sde.c, src/components/sde/sde_common.h, src/components/sde/sde_internal.h, src/components/sde/sde_lib.c: Support for overflow for the case of created counters as well as r[1-4].

2020-05-28  Anthony

	* src/components/sde/Rules.sde, src/components/sde/sde.c, src/components/sde/sde_common.c, src/components/sde/sde_common.h, src/components/sde/sde_internal.h, src/components/sde/sde_lib.c, src/components/sde/tests/Makefile, src/components/sde/tests/Simple2/Simple2_Driver.c: Pushing the library interface of SDEs into a stand-alone library (libsde.so).

2021-01-26  Masahiko, Yamada

	* src/components/mx/linux-mx.c: Add string length check before strncpy() and strcat() calls in _mx_init_component(). Myrinet Express-related component MX modules are initialized with the _mx_init_component() function, which is called from the PAPI_library_init() function.
	The popen(3) call runs a loadable module called "mx_counters", and if the loadable module does not exist, it attempts to run a loadable module called "./components/mx/utils/fake_mx_counters". In an environment where there are no "mx_counters" and "./components/mx/utils/fake_mx_counters" loadable modules, popen(3) will be called twice uselessly. popen(3) internally calls pipe(2) once, fork(2) twice and exec(2) once. The size of the user space of the application calling the PAPI_library_init() function affects the performance of fork(2), which is called as an extension of popen(3). As a result, the performance of the PAPI_library_init() function depends on the amount of user space in the application that called it. In the _mx_init_component() function, the MX module only needs to verify that a load module named "mx_counters" exists. We improved the _mx_init_component() function to call fopen(3) instead of popen(3), and we add a string length check before the strncpy() and strcat() calls in _mx_init_component().

2021-01-21  William Cohen

	* src/Rules.pfm4_pe: Only check for libpfm.a if static libraries are being used. Even when static libraries were not being used, PAPI was checking for libpfm.a, which would cause a failure if libpfm.a was not installed. Exclude checking for libpfm.a if no static libpfm library is needed.

2021-01-22  Frank Winkler

	* src/high-level/scripts/papi_hl_output_writer.py: Improved performance report script.

2021-01-18  Frank Winkler

	* src/high-level/papi_hl.c, src/high-level/scripts/papi_hl_output_writer.py: Fixed real time measurement.
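The popen-versus-fopen trade-off described in the MX entry above can be sketched as follows. This is an illustration, not the actual PAPI code; the helper name `module_exists` is ours.

```c
#include <stdbool.h>
#include <stdio.h>

/* Confirm that a loadable module exists with a cheap fopen(3) instead of
 * spawning a shell through popen(3), which costs one pipe(2), two
 * fork(2)s and an exec(2) per call. */
static bool module_exists(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return false;   /* no such module */
    fclose(fp);
    return true;
}
```

Because fopen(3) never forks, the cost of the check no longer scales with the caller's address-space size.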
2021-01-07 Damien Genet * src/Rules.perfnec, src/components/perfnec/Rules.perfnec, src/components/perfnec/perfmon.c, src/components/perfnec/perfnec.h, src/configure, src/configure.in, src/libperfnec/COPYRIGHT, src/libperfnec/ChangeLog, src/libperfnec/Makefile, src/libperfnec/README, src/libperfnec/TODO, src/libperfnec/config.mk, src/libperfnec/docs/Makefile, src/libperfnec/docs/man3/libpfm.3, src/libperfnec/docs/man3/libpfm_amd64.3, src/libperfnec/docs/man3/libpfm_atom.3, src/libperfnec/docs/man3/libpfm_core.3, src/libperfnec/docs/man3/libpfm_itanium.3, src/libperfnec/docs/man3/libpfm_itanium2.3, src/libperfnec/docs/man3/libpfm_montecito.3, src/libperfnec/docs/man3/libpfm_nehalem.3, src/libperfnec/docs/man3/libpfm_p6.3, src/libperfnec/docs/man3/libpfm_powerpc.3, src/libperfnec/docs/man3/libpfm_westmere.3, src/libperfnec/docs/man3/pfm_dispatch_events.3, src/libperfnec/docs/man3/pfm_find_event.3, src/libperfnec/docs/man3/pfm_find_event_bycode.3, .../docs/man3/pfm_find_event_bycode_next.3, src/libperfnec/docs/man3/pfm_find_event_mask.3, src/libperfnec/docs/man3/pfm_find_full_event.3, src/libperfnec/docs/man3/pfm_force_pmu.3, src/libperfnec/docs/man3/pfm_get_cycle_event.3, src/libperfnec/docs/man3/pfm_get_event_code.3, .../docs/man3/pfm_get_event_code_counter.3, src/libperfnec/docs/man3/pfm_get_event_counters.3, .../docs/man3/pfm_get_event_description.3, src/libperfnec/docs/man3/pfm_get_event_mask_code.3, .../docs/man3/pfm_get_event_mask_description.3, src/libperfnec/docs/man3/pfm_get_event_mask_name.3, src/libperfnec/docs/man3/pfm_get_event_name.3, src/libperfnec/docs/man3/pfm_get_full_event_name.3, .../docs/man3/pfm_get_hw_counter_width.3, src/libperfnec/docs/man3/pfm_get_impl_counters.3, src/libperfnec/docs/man3/pfm_get_impl_pmcs.3, src/libperfnec/docs/man3/pfm_get_impl_pmds.3, src/libperfnec/docs/man3/pfm_get_inst_retired.3, .../docs/man3/pfm_get_max_event_name_len.3, src/libperfnec/docs/man3/pfm_get_num_counters.3, src/libperfnec/docs/man3/pfm_get_num_events.3, 
src/libperfnec/docs/man3/pfm_get_num_pmcs.3, src/libperfnec/docs/man3/pfm_get_num_pmds.3, src/libperfnec/docs/man3/pfm_get_pmu_name.3, src/libperfnec/docs/man3/pfm_get_pmu_name_bytype.3, src/libperfnec/docs/man3/pfm_get_pmu_type.3, src/libperfnec/docs/man3/pfm_get_version.3, src/libperfnec/docs/man3/pfm_initialize.3, src/libperfnec/docs/man3/pfm_list_supported_pmus.3, src/libperfnec/docs/man3/pfm_pmu_is_supported.3, src/libperfnec/docs/man3/pfm_regmask_and.3, src/libperfnec/docs/man3/pfm_regmask_clr.3, src/libperfnec/docs/man3/pfm_regmask_copy.3, src/libperfnec/docs/man3/pfm_regmask_eq.3, src/libperfnec/docs/man3/pfm_regmask_isset.3, src/libperfnec/docs/man3/pfm_regmask_or.3, src/libperfnec/docs/man3/pfm_regmask_set.3, src/libperfnec/docs/man3/pfm_regmask_weight.3, src/libperfnec/docs/man3/pfm_set_options.3, src/libperfnec/docs/man3/pfm_strerror.3, src/libperfnec/include/Makefile, src/libperfnec/include/perfmon/perfmon.h, src/libperfnec/include/perfmon/perfmon_compat.h, src/libperfnec/include/perfmon/perfmon_crayx2.h, .../include/perfmon/perfmon_default_smpl.h, src/libperfnec/include/perfmon/perfmon_dfl_smpl.h, src/libperfnec/include/perfmon/perfmon_i386.h, src/libperfnec/include/perfmon/perfmon_ia64.h, src/libperfnec/include/perfmon/perfmon_mips64.h, src/libperfnec/include/perfmon/perfmon_nec.h, .../include/perfmon/perfmon_pebs_core_smpl.h, .../include/perfmon/perfmon_pebs_p4_smpl.h, src/libperfnec/include/perfmon/perfmon_pebs_smpl.h, src/libperfnec/include/perfmon/perfmon_powerpc.h, src/libperfnec/include/perfmon/perfmon_sparc.h, src/libperfnec/include/perfmon/perfmon_v2.h, src/libperfnec/include/perfmon/perfmon_x86_64.h, src/libperfnec/include/perfmon/pfmlib.h, src/libperfnec/include/perfmon/pfmlib_amd64.h, src/libperfnec/include/perfmon/pfmlib_cell.h, src/libperfnec/include/perfmon/pfmlib_comp.h, .../include/perfmon/pfmlib_comp_crayx2.h, src/libperfnec/include/perfmon/pfmlib_comp_i386.h, src/libperfnec/include/perfmon/pfmlib_comp_ia64.h, 
.../include/perfmon/pfmlib_comp_mips64.h, .../include/perfmon/pfmlib_comp_powerpc.h, src/libperfnec/include/perfmon/pfmlib_comp_sparc.h, .../include/perfmon/pfmlib_comp_x86_64.h, src/libperfnec/include/perfmon/pfmlib_core.h, src/libperfnec/include/perfmon/pfmlib_coreduo.h, src/libperfnec/include/perfmon/pfmlib_crayx2.h, src/libperfnec/include/perfmon/pfmlib_gen_ia32.h, src/libperfnec/include/perfmon/pfmlib_gen_ia64.h, src/libperfnec/include/perfmon/pfmlib_gen_mips64.h, src/libperfnec/include/perfmon/pfmlib_i386_p6.h, src/libperfnec/include/perfmon/pfmlib_intel_atom.h, src/libperfnec/include/perfmon/pfmlib_intel_nhm.h, src/libperfnec/include/perfmon/pfmlib_itanium.h, src/libperfnec/include/perfmon/pfmlib_itanium2.h, src/libperfnec/include/perfmon/pfmlib_montecito.h, src/libperfnec/include/perfmon/pfmlib_os.h, src/libperfnec/include/perfmon/pfmlib_os_crayx2.h, src/libperfnec/include/perfmon/pfmlib_os_i386.h, src/libperfnec/include/perfmon/pfmlib_os_ia64.h, src/libperfnec/include/perfmon/pfmlib_os_mips64.h, src/libperfnec/include/perfmon/pfmlib_os_powerpc.h, src/libperfnec/include/perfmon/pfmlib_os_sparc.h, src/libperfnec/include/perfmon/pfmlib_os_x86_64.h, src/libperfnec/include/perfmon/pfmlib_pentium4.h, src/libperfnec/include/perfmon/pfmlib_powerpc.h, src/libperfnec/include/perfmon/pfmlib_sicortex.h, src/libperfnec/include/perfmon/pfmlib_sparc.h, src/libperfnec/lib/Makefile, src/libperfnec/lib/amd64_events.h, src/libperfnec/lib/amd64_events_fam10h.h, src/libperfnec/lib/amd64_events_fam15h.h, src/libperfnec/lib/amd64_events_k7.h, src/libperfnec/lib/amd64_events_k8.h, src/libperfnec/lib/cell_events.h, src/libperfnec/lib/core_events.h, src/libperfnec/lib/coreduo_events.h, src/libperfnec/lib/crayx2_events.h, src/libperfnec/lib/gen_ia32_events.h, src/libperfnec/lib/gen_mips64_events.h, src/libperfnec/lib/i386_p6_events.h, src/libperfnec/lib/intel_atom_events.h, src/libperfnec/lib/intel_corei7_events.h, src/libperfnec/lib/intel_corei7_unc_events.h, 
src/libperfnec/lib/intel_wsm_events.h, src/libperfnec/lib/intel_wsm_unc_events.h, src/libperfnec/lib/itanium2_events.h, src/libperfnec/lib/itanium_events.h, src/libperfnec/lib/libpfm.a, src/libperfnec/lib/montecito_events.h, src/libperfnec/lib/niagara1_events.h, src/libperfnec/lib/niagara2_events.h, src/libperfnec/lib/pentium4_events.h, src/libperfnec/lib/pfmlib_amd64.c, src/libperfnec/lib/pfmlib_amd64_priv.h, src/libperfnec/lib/pfmlib_cell.c, src/libperfnec/lib/pfmlib_cell_priv.h, src/libperfnec/lib/pfmlib_common.c, src/libperfnec/lib/pfmlib_core.c, src/libperfnec/lib/pfmlib_core_priv.h, src/libperfnec/lib/pfmlib_coreduo.c, src/libperfnec/lib/pfmlib_coreduo_priv.h, src/libperfnec/lib/pfmlib_crayx2.c, src/libperfnec/lib/pfmlib_crayx2_priv.h, src/libperfnec/lib/pfmlib_gen_ia32.c, src/libperfnec/lib/pfmlib_gen_ia32_priv.h, src/libperfnec/lib/pfmlib_gen_ia64.c, src/libperfnec/lib/pfmlib_gen_mips64.c, src/libperfnec/lib/pfmlib_gen_mips64_priv.h, src/libperfnec/lib/pfmlib_gen_powerpc.c, src/libperfnec/lib/pfmlib_i386_p6.c, src/libperfnec/lib/pfmlib_i386_p6_priv.h, src/libperfnec/lib/pfmlib_intel_atom.c, src/libperfnec/lib/pfmlib_intel_atom_priv.h, src/libperfnec/lib/pfmlib_intel_nhm.c, src/libperfnec/lib/pfmlib_intel_nhm_priv.h, src/libperfnec/lib/pfmlib_itanium.c, src/libperfnec/lib/pfmlib_itanium2.c, src/libperfnec/lib/pfmlib_itanium2_priv.h, src/libperfnec/lib/pfmlib_itanium_priv.h, src/libperfnec/lib/pfmlib_montecito.c, src/libperfnec/lib/pfmlib_montecito_priv.h, src/libperfnec/lib/pfmlib_os_linux.c, src/libperfnec/lib/pfmlib_os_linux_v2.c, src/libperfnec/lib/pfmlib_os_linux_v3.c, src/libperfnec/lib/pfmlib_os_macos.c, src/libperfnec/lib/pfmlib_pentium4.c, src/libperfnec/lib/pfmlib_pentium4_priv.h, src/libperfnec/lib/pfmlib_power4_priv.h, src/libperfnec/lib/pfmlib_power5+_priv.h, src/libperfnec/lib/pfmlib_power5_priv.h, src/libperfnec/lib/pfmlib_power6_priv.h, src/libperfnec/lib/pfmlib_power7_priv.h, src/libperfnec/lib/pfmlib_power_priv.h, 
src/libperfnec/lib/pfmlib_powerpc_priv.h, src/libperfnec/lib/pfmlib_ppc970_priv.h, src/libperfnec/lib/pfmlib_ppc970mp_priv.h, src/libperfnec/lib/pfmlib_priv.c, src/libperfnec/lib/pfmlib_priv.h, src/libperfnec/lib/pfmlib_priv_comp.h, src/libperfnec/lib/pfmlib_priv_comp_ia64.h, src/libperfnec/lib/pfmlib_priv_ia64.h, src/libperfnec/lib/pfmlib_sicortex.c, src/libperfnec/lib/pfmlib_sicortex_priv.h, src/libperfnec/lib/pfmlib_sparc.c, src/libperfnec/lib/pfmlib_sparc_priv.h, src/libperfnec/lib/power4_events.h, src/libperfnec/lib/power5+_events.h, src/libperfnec/lib/power5_events.h, src/libperfnec/lib/power6_events.h, src/libperfnec/lib/power7_events.h, src/libperfnec/lib/powerpc_events.h, src/libperfnec/lib/powerpc_reg.h, src/libperfnec/lib/ppc970_events.h, src/libperfnec/lib/ppc970mp_events.h, src/libperfnec/lib/ultra12_events.h, src/libperfnec/lib/ultra3_events.h, src/libperfnec/lib/ultra3i_events.h, src/libperfnec/lib/ultra3plus_events.h, src/libperfnec/lib/ultra4plus_events.h, src/libperfnec/libpfms/Makefile, src/libperfnec/libpfms/include/libpfms.h, src/libperfnec/libpfms/lib/Makefile, src/libperfnec/libpfms/lib/libpfms.c, src/libperfnec/libpfms/syst_smp.c, src/libperfnec/python/Makefile, src/libperfnec/python/README, src/libperfnec/python/self.py, src/libperfnec/python/setup.py, src/libperfnec/python/src/__init__.py, src/libperfnec/python/src/perfmon_int.i, src/libperfnec/python/src/pmu.py, src/libperfnec/python/src/session.py, src/libperfnec/python/sys.py, src/libperfnec/rules.mk, src/linux-common.h, src/linux-context.h, src/linux-lock.h, src/linux-timer.c, src/mb.h, src/utils/papi_native_avail.c: Merged in feature/pr_nec (pull request #157) * Adding lib and component 2021-01-05 Masahiko, Yamada * src/components/perf_event/pe_libpfm4_events.c: Get model_string for ARM processor from pfm_get_pmu_info() function On ARM processors, the model_string does not appear in /proc/cpuinfo. 
	Instead of looking at the /proc/cpuinfo information, you can look at the lscpu command information at the following URLs:
	https://github.com/google/cpu_features/issues/26
	http://suihkulokki.blogspot.com/2018/02/making-sense-of-proccpuinfo-on-arm.html
	The libpfm4 library identifies the ARM processor type from the "CPU implementer" and "CPU part" fields in the /proc/cpuinfo information. The papi library can use the pfm_get_pmu_info() function from the libpfm4 library to obtain a string identifying the ARM processor type.

2020-12-23  Peinan Zhang

	* .../intel_gpu/internal/inc/GPUMetricHandler.h, .../intel_gpu/internal/src/GPUMetricHandler.cpp, .../intel_gpu/internal/src/GPUMetricInterface.cpp: Changed query-based data read to use a timeout rather than blocking until data is available.

2020-12-18  Peinan Zhang

	* src/components/intel_gpu/README, src/components/intel_gpu/Rules.intel_gpu, .../intel_gpu/internal/inc/GPUMetricHandler.h, .../intel_gpu/internal/inc/GPUMetricInterface.h, .../intel_gpu/internal/src/GPUMetricHandler.cpp, .../intel_gpu/internal/src/GPUMetricInterface.cpp, src/components/intel_gpu/internal/src/Makefile, src/components/intel_gpu/linux_intel_gpu_metrics.c, src/components/intel_gpu/linux_intel_gpu_metrics.h, src/components/intel_gpu/tests/Makefile, src/components/intel_gpu/tests/gemm.spv, src/components/intel_gpu/tests/gpu_metric_list.c, src/components/intel_gpu/tests/gpu_metric_read.c, src/components/intel_gpu/tests/gpu_query_gemm.cc, src/components/intel_gpu/tests/gpu_thread_read.c, src/components/intel_gpu/tests/readme.txt: Add intel_gpu component to collect Intel GPU performance metrics.

2020-12-17  Heike Jagode

	* src/components/perf_event/perf_event.c: Deleting Tony's hard failure for check_exclude_guest() if perf_event_open fails. There shouldn't be a failure since exclude_guest_unsupported is already set before perf_event_open is called. At this point, if perf_event_open fails it should just return but not result in a hard failure.
	Hence, going back to the previous version of check_exclude_guest(). And since the return value of this function is not checked, we change it to void (instead of int).

2020-12-15  Masahiko, Yamada

	* src/papi_events.csv: Modify PAPI_FP_INS and PAPI_VEC_INS for A64FX support.

2020-12-14  Masahiko, Yamada

	* src/papi_events.csv: Add or modify various A64FX support events, including floating point events (PAPI_FP_OPS, PAPI_SP_OPS, PAPI_DP_OPS).

	* src/papi_events.csv: Corrected typo for A64FX support (PAPI_L2_DCH is a typo of PAPI_L2_DCA).

Wed Dec 9 19:48:23 2020 -0800  Stephane Eranian

	* src/libpfm4/lib/pfmlib_intel_snbep_unc_perf_event.c, src/libpfm4/lib/pfmlib_perf_event_pmu.c: Update libpfm4, to be current with the following commit:
	--------------------------------------------------------------
	commit c96ebc0d19c6167b45e1694ea38719f230da254e
	fix typos in comments related to PERF_ATTR_HWS
	AMD was mentioned in non-AMD related files.
	Reported-by: Steve Kaufmann

Thu Nov 12 17:46:47 2020 -0800  Stephane Eranian

	* src/libpfm4/lib/events/intel_icl_events.h, src/libpfm4/lib/pfmlib_amd64.c, src/libpfm4/lib/pfmlib_amd64_perf_event.c, src/libpfm4/lib/pfmlib_arm.c, src/libpfm4/lib/pfmlib_arm_perf_event.c, src/libpfm4/lib/pfmlib_intel_netburst.c, src/libpfm4/lib/pfmlib_intel_netburst_perf_event.c, src/libpfm4/lib/pfmlib_intel_snbep_unc.c, .../lib/pfmlib_intel_snbep_unc_perf_event.c, src/libpfm4/lib/pfmlib_mips.c, src/libpfm4/lib/pfmlib_mips_perf_event.c, src/libpfm4/lib/pfmlib_perf_event_pmu.c, src/libpfm4/lib/pfmlib_perf_event_raw.c, src/libpfm4/lib/pfmlib_powerpc.c, src/libpfm4/lib/pfmlib_powerpc_perf_event.c, src/libpfm4/lib/pfmlib_s390x_cpumf.c, src/libpfm4/lib/pfmlib_s390x_perf_event.c, src/libpfm4/lib/pfmlib_s390x_priv.h, src/libpfm4/lib/pfmlib_sparc.c, src/libpfm4/lib/pfmlib_sparc_perf_event.c, src/libpfm4/tests/validate_x86.c: Update libpfm4, to be current with the following commit:
	--------------------------------------------------------------
	commit
	fb6ddf78949eb1bc6921df5cfd0cf3e5ef2e752e
	fix Intel Icelake encodings for CPU_CLK_UNHALTED.*_DISTRIBUTED
	The event code for CPU_CLK_UNHALTED was wrong, and the umasks DISTRIBUTED and REF_DISTRIBUTED were wrong. For these, the event code is actually 0xec, so add the code override tag.

	commit 6f687e42c62bc71766c5369d218cea9ca2e246cf
	fix support of PERF_ATTR_HWS
	Remove the attribute for all PMUs which do not support it, which is the majority. Without the patch, you would see [hw_smpl] on Intel uncore PMUs, AMD64 Fam17h PMU, and much more. The patch also fixes a few places where info->is_precise was not cleared.

	commit 02ab45abc160d1be754917524c40e268c490937d
	Fix MEM_TRANS_RETIRED for Intel Icelake
	The umasks generated for Intel Icelake MEM_TRANS_RETIRED were not setting ldlat properly. To use the Load Latency feature with libpfm4, the ldlat= modifier must be used either implicitly or explicitly. It cannot be encoded in the umask code for now.

2020-11-30  William Cohen

	* src/components/appio/README.md: Remove mention of the removed iozone test in the appio README.md.
* src/components/appio/tests/Makefile, src/components/appio/tests/iozone/Changes.txt, src/components/appio/tests/iozone/Generate_Graphs, src/components/appio/tests/iozone/Gnuplot.txt, src/components/appio/tests/iozone/client_list, src/components/appio/tests/iozone/fileop.c, src/components/appio/tests/iozone/gengnuplot.sh, src/components/appio/tests/iozone/gnu3d.dem, src/components/appio/tests/iozone/gnuplot.dem, src/components/appio/tests/iozone/gnuplotps.dem, src/components/appio/tests/iozone/iozone.c, .../appio/tests/iozone/iozone_visualizer.pl, src/components/appio/tests/iozone/libasync.c, src/components/appio/tests/iozone/libbif.c, src/components/appio/tests/iozone/makefile, src/components/appio/tests/iozone/pit_server.c, src/components/appio/tests/iozone/read_telemetry, src/components/appio/tests/iozone/report.pl, src/components/appio/tests/iozone/spec.in, src/components/appio/tests/iozone/write_telemetry: Remove bundled iozone due to incompatible license. A review of the PAPI sources found some iozone code bundled in papi (rhbz1901077 - papi bundles non-free iozone code ). The upstream license for iozone does not give permission to modify the source. There are some minor changes in the PAPI version of the iozone files. 2020-11-25 Masahiko, Yamada * src/components/mx/linux-mx.c: fix for performance improvement of _mx_init_component() function 2020-11-18 Gerald Ragghianti * src/components/rocm/README.md: Typo in library file name 2020-11-17 Damien Genet * src/components/infiniband/linux-infiniband.c: Fix: location is not stored, so mark that location is broken 2020-11-17 Anthony Castaldo * src/components/cuda/tests/nvlink_all.cu: Removed debug messages. * src/components/cuda/tests/nvlink_all.cu: Removing debug messages. * src/components/cuda/tests/nvlink_all.cu, src/components/cuda/tests/nvlink_bandwidth.cu: Improved argument handling in nvlink_all.cu and nvlink_bandwidth.cu. 
2020-11-06  Anthony Michael Castaldo

	* src/components/rocm/linux-rocm.c, src/components/rocm/tests/ROCM_Makefile: Changed ROCM_Makefile to require PAPI_ROCM_ROOT, and to cross-compile for all "Instinct" GPUs. Code in the component sets the needed environment variables if not defined, or ensures the definition meets expectations.

2020-11-05  Anthony Michael Castaldo

	* src/components/rocm/README.md, src/components/rocm/linux-rocm.c, src/components/rocm/tests/ROCM_Makefile: First draft of changes to automatically find and set environment variables.

2020-11-03  Daniel Barry

	* src/counter_analysis_toolkit/main.c: Added checks for the return values of calls to malloc(), calloc(), and realloc(). This way, the user will know if there are issues with allocating memory. These changes were tested on the IBM POWER9 architecture.

2020-11-02  Daniel Barry

	* src/counter_analysis_toolkit/main.c: Increase buffer size for larger input files to CAT. Since the number of qualifier counts is equal to the number of event names in the input file, the size of the buffer containing the qualifier counts should be equal to the size of the buffer containing the event names. This change is necessary to accommodate large input files. This change was tested on the IBM POWER9 architecture.

2020-10-27  Anthony Castaldo

	* src/components/cuda/linux-cuda.c: Added explicit Compute Capability retrieval and checking to disable the component if CC >= 7.5 and it cannot work with Legacy CUPTI. Added a filter to exclude multi-pass metrics. Provided timing for that, conditioned on #define TIME_MULTIPASS_ELIM, to measure how long it takes. On the Saturn A04 V100 device, 95-98 ms additional time in init_component(). On Summit, 1-6 GPUs, about 73 ms extra time per GPU in init_component(); from 73.5 ms (1 GPU) to 437.3 ms (6 GPUs).
2020-10-26  Anthony Castaldo

	* src/components/cuda/linux-cuda.c: Added explicit Compute Capability retrieval and checking; also added a filter to exclude multi-pass metrics, as well as timing (if a #define is made) of how long that takes. On 1 V100 device, 95-98 ms additional time in init_component().

2020-10-23  Björn Dick

	* src/components/sde/tests/Makefile: Adapted setting FFLAGS in src/components/sde/tests/Makefile in order to make it work with flang (otherwise it crashes due to the unknown flag '-free').

2020-10-22  Anthony Castaldo

	* src/components/cuda/linux-cuda.c: Just the compute capability check.

2020-10-20  Anthony Castaldo

	* src/components/cuda/linux-cuda.c, .../cuda/tests/BlackScholes/BlackScholes.cu, .../cuda/tests/BlackScholes/BlackScholes_gold.cpp, .../tests/BlackScholes/BlackScholes_kernel.cuh, src/components/cuda/tests/BlackScholes/Makefile, .../cuda/tests/BlackScholes/NsightEclipse.xml, .../cuda/tests/BlackScholes/README_SETUP.txt, src/components/cuda/tests/BlackScholes/readme.txt, .../cuda/tests/BlackScholes/testAllEvents.sh, .../cuda/tests/BlackScholes/testSomeEvents.sh, .../cuda/tests/BlackScholes/thr_BlackScholes.cu: Thread safety is added to the cuda component, protecting functions with PAPI locks: _papi_hwi_lock(COMPONENT_LOCK) and _papi_hwi_unlock(COMPONENT_LOCK). The BlackScholes directory is added: a slightly modified version of an Nvidia sample program, used to exercise a great deal of computation. This includes new code; in particular, thr_BlackScholes.cu uses pthreads and executes the kernel from several threads, using PAPI_read() in each, to test the above thread safety. Also tested with helgrind (a tool within valgrind that tests for threading synchronization issues).
2020-10-19 Anthony Castaldo * src/components/net/linux-net.c, src/components/nvml/linux-nvml.c, src/components/pcp/linux-pcp.c, src/components/perf_event/perf_event.c, .../perf_event_uncore/perf_event_uncore.c, src/components/powercap /linux-powercap.c, src/components/powercap_ppc/linux-powercap- ppc.c, src/components/rapl/linux-rapl.c, src/components/rocm/linux- rocm.c, src/components/rocm_smi/linux-rocm-smi.c, src/components/sensors_ppc/linux-sensors-ppc.c: Changes to init_component() to properly set the component vector element disabled_reason() if init fails. Also changes to eliminate compiler warnings for failing to process return codes from string functions (strcpy and snprintf and variants), and to check for alloc() failures; but only in the init_component() functions and any functions it invokes. Testing was on Saturn A04 by default; ICL Caffeine for (rocm, rocm_smi, powercap), Summit for PCP, Tellico (IBM processor) for sensors_ppc and powercap_ppc. Wed Sep 2 11:40:42 2020 -0700 Stephane Eranian * src/libpfm4/config.mk, src/libpfm4/debian/changelog, src/libpfm4/docs/Makefile, src/libpfm4/lib/events/amd64_events_fam16h.h, src/libpfm4/lib/events/amd64_events_fam17h_zen2.h, src/libpfm4/lib/events/intel_bdx_unc_cbo_events.h, src/libpfm4/lib/events/intel_bdx_unc_ha_events.h, src/libpfm4/lib/events/intel_bdx_unc_imc_events.h, src/libpfm4/lib/events/intel_bdx_unc_pcu_events.h, src/libpfm4/lib/events/intel_bdx_unc_qpi_events.h, .../lib/events/intel_bdx_unc_r3qpi_events.h, src/libpfm4/lib/events/intel_icl_events.h, src/libpfm4/lib/events/intel_knl_unc_cha_events.h, src/libpfm4/lib/events/intel_skx_unc_cha_events.h, src/libpfm4/lib/events/intel_skx_unc_imc_events.h, .../lib/events/intel_skx_unc_m3upi_events.h, src/libpfm4/lib/events/intel_skx_unc_pcu_events.h, src/libpfm4/lib/events/intel_skx_unc_upi_events.h, src/libpfm4/lib/events/mips_74k_events.h, src/libpfm4/lib/events/power4_events.h, src/libpfm4/lib/events/power5+_events.h, 
	src/libpfm4/lib/events/power5_events.h, src/libpfm4/lib/events/power6_events.h, src/libpfm4/lib/events/power7_events.h, src/libpfm4/lib/events/power8_events.h, src/libpfm4/lib/events/power9_events.h, src/libpfm4/lib/events/ppc970_events.h, src/libpfm4/lib/events/ppc970mp_events.h, src/libpfm4/lib/events/s390x_cpumf_events.h, src/libpfm4/lib/pfmlib_itanium2.c, src/libpfm4/lib/pfmlib_mips.c, src/libpfm4/lib/pfmlib_montecito.c: Update libpfm4, to be current with the following commits:
	--------------------------------------------------------------
	commit fa84c27b60572621a8e48e364de9f55bdff5237e
	fix incorrect strncpy() usage
	gcc 9 failed on mips* with:
	/usr/include/mips64el-linux-gnuabi64/bits/string_fortified.h:106:10: error: ‘__builtin___strncpy_chk’ output truncated before terminating nul copying as many bytes from a string as its length [-Werror=stringop-truncation]
	pfmlib_mips.c: In function ‘pfm_mips_detect’:
	pfmlib_mips.c:147:2: note: length computed here
	147 | strncpy(pfm_mips_cfg.model,buffer,strlen(buffer));
	strncpy(dest, src, strlen(src)) does *not* copy the terminating '\0'. strncpy(dest, src, strlen(src)+1) is identical to strcpy(dest, src), but the third argument to strncpy() should rather be based on the size of 'dest', not 'src'.

	commit c3e97e0c9510f047623f6548cdef188eed0038cd
	fix typos and normalize spacing
	most typos were found by Lintian

	commit de4beb0da7530bc1dcd2f19582dfeca2ecb1d185
	update AMD Fam17h Zen2 event table
	Based on PPR version 0.91, Sep 1, 2020. Thanks to Emmanuel for tracking the diffs.

	commit 53797b096497dd278fa844c302ce93495b469754
	update Intel Icelake event table to 1.09
	This patch updates the Icelake event table based on the official JSON event file up to version 1.09.
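The strncpy() pattern fixed in commit fa84c27b above can be sketched like this. `copy_model` is a hypothetical stand-in for the pfm_mips_detect() code, not the actual libpfm4 function: the copy is bounded by the size of the destination rather than strlen() of the source, and termination is explicit.

```c
#include <string.h>

/* Bound the copy by the destination size, not strlen(src), and always
 * NUL-terminate; strncpy() alone does not guarantee termination. */
static void copy_model(char *dest, size_t dest_size, const char *src)
{
    strncpy(dest, src, dest_size - 1);  /* may stop before the '\0' */
    dest[dest_size - 1] = '\0';         /* always terminate */
}
```

With strlen(src) as the bound (as in the original code), a source longer than the destination overflows it, and an exact-length source is left unterminated; sizing by the destination avoids both.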
	commit 414e482ace00d334015341e032a8b325d80e92eb
	update to version 4.11.1
	Update to 4.11.1 revision to fix some minor issues with the 11.0 release.

	commit dfe30a72c18dc64ea8e55c469a9adcfec9c09340
	install Fujitsu A64FX man page in ARM64 mode
	This patch corrects the documentation Makefile to install the libpfm_a64fx.3 man page when building for ARM64. Otherwise the man page would only be installed in ARM (32-bit) mode.
	Reported-by: William Cohen

	commit 3a7dbd35cfde80923dca3d7a02386fde6d859f93
	update to version 4.11.0
	Update to 4.11.0 revision to prepare for release

2020-10-12  Frank Winkler

	* src/high-level/scripts/papi_hl_output_writer.py: Fixed bug in summary mode 2. Each region can have a different number of ranks and threads.

	* src/high-level/scripts/papi_hl_output_writer.py: Fixed bug in summary mode. Starting from the second region, all events were ignored by a wrongly indented break statement.

2020-10-09  Frank Winkler

	* src/high-level/scripts/papi_hl_output_writer.py: Fixed bug for IPC metric.

	* src/high-level/scripts/papi_hl_output_writer.py: Revised performance output.

2020-10-08  Anthony Castaldo

	* src/components/appio/appio.c, src/components/coretemp/linux-coretemp.c, src/components/cuda/linux-cuda.c, src/components/example/example.c, src/components/io/linux-io.c, src/components/libmsr/linux-libmsr.c: Got rid of unnecessary PAPI_MAX_STR_LEN-2, replaced with PAPI_MAX_STR_LEN.

2020-10-08  Sebastian Mobo

	* src/papi_events.csv: Added instruction-cache preset events for the Zen2.

2020-10-08  Heike Jagode

	* src/papi_events.csv: For zen2, since FP_OPS counts both single- and double-precision operations correctly, we don't need to confuse the user with additional DP_OPS and SP_OPS events. So, I'm taking them out. Same applies for events counting FP instructions.

2020-10-08  Anthony Castaldo

	* src/components/appio/appio.c, src/components/coretemp/linux-coretemp.c: Brought appio and coretemp in line with other components for standardization.
* src/components/appio/appio.c, src/components/coretemp/linux-coretemp.c, src/components/cuda/linux-cuda.c, src/components/example/example.c, src/components/io/linux-io.c, src/components/libmsr/linux-libmsr.c: In addition to init_component() changes to ensure all possible failing return paths set a disable_reason, added checks of the return values of the string functions strncpy and snprintf, to avoid compiler warnings about unchecked return values.

2020-10-08 Frank Winkler

* src/high-level/scripts/papi_hl_output_writer.py: Added derived events for summary report.

2020-10-06 Anthony Castaldo

* src/components/appio/appio.c, src/components/coretemp/linux-coretemp.c, src/components/cuda/linux-cuda.c, src/components/example/example.c, src/components/io/linux-io.c, src/components/libmsr/linux-libmsr.c, src/components/libmsr/tests/ICL_TESTING_NOTES.txt, src/components/libmsr/tests/Makefile, src/components/libmsr/tests/libmsr_basic.c: Most changes here ensure that any exit from the init_component() function that disables the component will give a sensible reason for the disable in the component vector string, which is reported by papi_component_avail. Additional changes were made to prevent compiler warnings on various issues, such as unused values or incompatible formatting.

2020-10-04 Frank Winkler

* src/high-level/scripts/papi_hl_output_writer.py: Several improvements.

2020-10-02 Frank Winkler

* src/high-level/scripts/papi_hl_output_writer.py: Started with summary output format.

2020-10-01 Damien Genet

* src/components/infiniband/linux-infiniband.c: Fixing the infiniband component for the 3 counters that are misplaced in the filesystem and thus wrongly listed as 32-bit.

2020-09-24 Heike Jagode

* src/papi_events.csv: Added missing 'PRESET' to csv file.
* src/papi_events.csv: Added presets for floating-point instructions (FP_INS, VEC_DP, VEC_SP) for AMD zen2. For unoptimized code (like native MMM), these events may include non-numeric floating-point instructions, e.g.
MOVSD: move or merge scalar double-precision floating-point value instructions.
Tested with:
1) SSE double: _mm_mul_pd / _mm_add_pd
2) SSE single: _mm_mul_ps / _mm_add_ps
3) AVX double: _mm256_mul_pd / _mm256_add_pd
4) AVX single: _mm256_mul_ps / _mm256_add_ps
5) FMA double: _mm256_macc_pd
6) FMA single: _mm256_macc_ps

* src/papi_events.csv: Added presets for floating-point operations (FP_OPS, DP_OPS, SP_OPS) for AMD zen2. PPR (under section 2.1.15.3. -- https://www.amd.com/system/files/TechDocs/54945_3.03_ppr_ZP_B2_pub.zip) explains that FLOP events require MergeEvent support, which was included in the 5.6 kernel. ===>>> Hence, a kernel version 5.6 or greater is required. NOTE: without the MergeEvent support in the kernel, there is no guarantee that the SSE/AVX FLOP events produce any useful data whatsoever. These events have been tested and verified for scalar flops, SSE, AVX, and FMA:
(1) for one AVX instruction (e.g. _mm256_add_pd()), the RETIRED_SSE_AVX_FLOPS:ADD_SUB_FLOPS event returns a count of 4 (in the case of double precision), and a count of 8 (in the case of single precision).
(2) for one AVX FMA instruction (e.g. _mm256_macc_pd()), the RETIRED_SSE_AVX_FLOPS:MAC_FLOPS event returns a count of 8 (in the case of double precision), and a count of 16 (in the case of single precision).
(3) for one SSE instruction (e.g. _mm_mul_pd()), the RETIRED_SSE_AVX_FLOPS:MULT_FLOPS event returns a count of 2 (in the case of double precision), and a count of 4 (in the case of single precision).

2020-09-18 Frank Winkler

* src/high-level/papi_hl.c: Added min, avg, and max for instantaneous events.

2020-09-17 Frank Winkler

* src/high-level/papi_hl.c: Fixed bug for empty strings in PAPI_EVENTS.

2020-09-11 Frank Winkler

* src/high-level/papi_hl.c: Improved coding style.
* src/high-level/papi_hl.c: Simplified event definitions.
2020-09-09 Daniel Barry

* src/components/pcp/linux-pcp.c: Added __FILE__ and __LINE__ to the error messages which are shown depending on the return value of snprintf(). This way, the user will know from where the error message originated. These changes were tested on the IBM POWER9 architecture.

2020-09-09 Anthony Castaldo

* src/components/cuda/linux-cuda.c: Uninitialized retcode variables caused problems on Power 9.

2020-09-09 Frank Winkler

* src/high-level/papi_hl.c: Added event definitions to performance report.

2020-09-08 Daniel Barry

* src/components/pcp/linux-pcp.c: Added if-statements to check whether the number of characters intended to be written to the destination buffer exceeds the size of the buffer. This prevents GCC 9.1.0 from warning that the destination buffer may not be large enough to store the contents of the source buffers. These changes were tested on the IBM POWER9 architecture.

2020-09-08 Anthony Castaldo

* src/components/rocm/tests/ROCM_Makefile, src/components/rocm/tests/rocm_standalone.cpp: Corrected a too-specific reference in Makefile; and changed a #define from CamelCase to all caps.

2020-08-28 Anthony Castaldo

* src/components/cuda/linux-cuda.c: Capitalized #defines.

2020-08-27 Anthony Castaldo

* src/components/cuda/linux-cuda.c: Moved structure typedef definitions ahead of their use in the source file; and changed references to structures within structures to use the typedef type. Removed '#if 0' test code for the binary search function.

2020-08-26 Anthony Castaldo

* src/papi.c, src/papi_internal.c, src/papi_internal.h: This modifies PAPI_library_init() to initialize components in two classes, separated by the initialization of the papi thread structure. The first class is those that need no thread structure, currently everything but perf_event and perf_event_uncore.
Following the init of the threading structure, we init the second class (perf_event and perf_event_uncore) that DOES need the thread structure to successfully init_component(). This required a change to _papi_hwi_init_global(), to add an argument to distinguish which class it should initialize.

Thu Aug 13 01:51:28 2020 -0700 Ondrej Sykora

* src/libpfm4/lib/events/s390x_cpumf_events.h, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_s390x_perf_event.c, src/libpfm4/perf_examples/Makefile: Update libpfm4, to be current with the following commit:
--------------------------------------------------------------
commit 437628ebe58edd6cff3e493a7925f66e3a016b76
make rtop build conditional on ncurses.h present
This avoids build problems on systems where the ncurses development package is not installed.

commit 2293ceb3ad9d2ed0c63f85fc07cc30b278ee4eda
lib/events/s390x_cpumf_events.h: Change counter name DFLT_CCERROR on s390
Change the counter name DFLT_CCERROR to DFLT_CCFINISH on IBM z15. This counter counts completed DEFLATE instructions with exit code 0, 1 or 2. Since exit code 0 means success and exit code 1 or 2 indicate errors, change the counter name to avoid confusion. This counter is incremented each time the DEFLATE instruction completed, regardless of whether an error was detected or not. This change is in sync with kernel commit 3d3af181d370 ("s390/cpum_cf,perf: change DFLT_CCERROR counter name")

commit 30adc677603b28c6d9eb311de7298fa4fea26eed
lib/pfmlib_s390x_perf_event.c: Fix perf attr.type event number for s390
The s390 Performance Measurement counter facility does not have a fixed type number anymore. This was caused by Linux kernel commit 66d258c5b048 ("perf/core: Optimize perf_init_event()") and its necessary follow-on commit 6a82e23f45fe ("s390/cpumf: Adjust registration of s390 PMU device drivers"). Now read out the current type number from a sysfs file named /sys/devices/cpum_cf/type.
If it does not exist, there is no CPU-MF counter facility installed or activated, which has been checked before.

commit b1651ff3c5eed6289db9545d080d8d28bccfdbe4
Add a custom implementation of strsep().
This is required to build the library on platforms where strsep() is not available, e.g. on Windows via MinGW.

2020-08-25 Anthony Castaldo

* src/components/cuda/linux-cuda.c: Implement cumulative counters (as per PAPI specification) for cuda events and metrics. It includes two #define controlled diagnostics that may prove necessary on other models of GPUs. 'Produce_Event_Report' will print (to stderr) all the events discovered, and 'Expose_Unenumerated_Events' will add as PAPI events (beginning 'cuda:::unenum_event:0x...') those events used by cuda metrics that are not reported by nvidia's event enumeration routines. These are explained further in code comments.

2020-08-21 William Cohen

* src/Makefile.inc: Makefile to generate papi-x.y.z.tar.gz directly from git repo. SystemTap has a make rule to generate a tarball directly from the git repository. This make rule has proved useful for quickly producing Fedora rawhide rpms with a snapshot of what is currently in the git repository. This patch adds a similar make rule to PAPI.

Wed Aug 12 15:23:27 2020 -0700 Stephane Eranian

* src/libpfm4/lib/events/amd64_events_fam17h_zen1.h: Update libpfm4, to be current with the following commit:
--------------------------------------------------------------
commit e162519d26d313860a9e69889bcc67406f92edc9
fix duplicate event code on AMD Fam17h Zen1
Removed DISPATCH_RESOURCE_STALL_CYCLES_0, which is not a Zen1 event but rather a Zen2 event with the same event code.
Reported-by: Kaufmann, Steve
Tested on Zen1, Castaldo.
Fri Jun 19 15:07:01 2020 -0700 Stephane Eranian

* src/libpfm4/README, src/libpfm4/docs/Makefile, src/libpfm4/docs/man3/libpfm_arm_neoverse_n1.3, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/events/amd64_events_fam17h_zen1.h, src/libpfm4/lib/events/arm_neoverse_n1_events.h, src/libpfm4/lib/pfmlib_arm_armv8.c, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_arm64.c: Update libpfm4, to be current with the following commits:
--------------------------------------------------------------
commit 2c3d94eb306e52a48fe881c8c5d68fd8849bccc0
clean INC_ARM in lib Makefile
Had duplicated INC_ARM= definitions. Some includes were missing from INC_ARM64.

commit 286bf87042469524098a3aa65485f2eef395c3d5
enable priv level filtering on ARMv8
The ARMv8 core PMU supports privilege level filtering, but this was missing from the definitions of all ARMv8 PMUs; therefore it was ignored during perf_events encoding. This patch fixes the problem by initializing the .supported_plm field properly.

commit 7fa9131274d450581aa98e6ee662a19f20ff3381
Enable ARM Neoverse N1 core PMU
This patch enables ARM Neoverse N1 core PMU support. Event table based on: https://static.docs.arm.com/100616/0301/neoverse_n1_trm_100616_0301_01_en.pdf

commit ea9752f3fee76798010093c2f35cbf719980997d
more updates to AMD Fam17h Zen1 event table
Added:
- DYNAMIC_INDIRECT_PREDICTIONS
- DECODER_OVERRIDES_PREDICTION
Reported-by: Emmanuel Oseret

commit 5a623727cf7111afd09df2cdb0ff4b294d31efa7
update AMD Fam17h Zen2 event table
Added:
- FP_DISPATCH_FAULT
- DATA_CACHE_REFILLS_FROM_SYSTEM
Fixed typos in umask for SOFTWARE_PREFETCH_DATA_CACHE_FILLS which are shared with DATA_CACHE_REFILLS_FROM_SYSTEM.
Reported-by: Steve Kaufmann

2020-07-24 Anthony Castaldo

* src/components/rocm/README.md: Added an extra HSA_TOOLS_LIB export that is required to read counters.

2020-07-23 Frank Winkler

* src/high-level/papi_hl.c: Revised previous push.
Only nvml events are automatically saved as instantaneous values. * src/high-level/papi_hl.c: Some events like power, temperature or all nvml events are always considered instantaneous. 2020-07-17 Anthony * src/papi_events.csv: Separated the cache preset events of AMD Zen1 and Zen2 and added some more. 2020-07-17 Frank Winkler * src/configure, src/configure.in: Revised configure script. 1) Changed "--with-tests" option. The user can now disable all tests using "--with-tests=no" or "--without-tests". MPI tests are included in "--with-tests". 2) Aligned help text for a better output format. 2020-07-14 Anthony Castaldo * src/components/rocm/tests/ROCM_SA_Makefile, src/components/rocm/tests/rocm_failure_demo.cpp, src/components/rocm/tests/rocm_standalone.cpp: Added two utilities that perform event reading for AMD GPUs without any use of the PAPI interface. To prove that PAPI is not the problem when events are not working correctly. 2020-07-03 Frank Winkler * src/components/cuda/README.md, src/components/nvml/README.md: Added instructions how to find the correct paths of all required shared libraries at runtime. 2020-06-24 Steve Kaufmann * src/papi_events.csv: Added PAPI preset support for Fujitsu A64FX. Sat Jun 13 00:39:58 2020 -0700 Stephane Eranian * src/libpfm4/lib/events/amd64_events_fam17h_zen2.h: commit 5a623727cf7111afd09df2cdb0ff4b294d31efa7 update AMD Fam17h Zen2 event table Added: - FP_DISPATCH_FAULT - DATA_CACHE_REFILLS_FROM_SYSTEM Fixed typos in umask for SOFTWARE_PREFETCH_DATA_CACHE_FILLS which are shared with DATA_CACHE_REFILLS_FROM_SYSTEM. Reported-by: Steve Kaufmann commit 17e622e9539e1f8faf3c0c27889963a537e95537 add L2_PREFETCH_MISS_L3 for AMD Fam17h Zen2 Add missing L2_PREFETCH_MISS_L3 event for AMD Fam17h Zen2. 
Reported-by: Emmanuel Oseret

2020-06-23 Frank Winkler

* src/components/sde/README.md: README.md edited online with Bitbucket

2020-06-18 Heike Jagode

* src/components/perf_event_uncore/README.md: README.md edited online with Bitbucket

2020-06-18 Frank Winkler

* src/components/perf_event_uncore/README.md: README.md edited online with Bitbucket
* src/components/perf_event_uncore/README.md: README.md edited online with Bitbucket
* src/components/perf_event_uncore/README.md: README.md edited online with Bitbucket

2020-06-12 Anthony Castaldo

* src/components/rocm_smi/Rules.rocm_smi: Added an include directory to the Rules file, to fix a coding error on including a new file, kfd_ioctl.h. (The #include statement includes the directory rocm_smi when it should not.)

2020-06-12 Frank Winkler

* src/configure: Generated new configure file with autoconf 2.69 on saturn.icl.utk.edu.
* src/configure.in: Added rpath and runpath to find libpfm.so and libpapi.so if not specified via LD_LIBRARY_PATH. The search path at runtime can be overridden by LD_LIBRARY_PATH.

2020-06-12 Frank Winkler

* src/components/perf_event_uncore/README.md: README.md edited online with Bitbucket
* src/components/perf_event_uncore/README.md: README.md edited online with Bitbucket
* src/components/perf_event_uncore/README.md: README.md edited online with Bitbucket

Sat May 30 18:08:52 2020 -0700 Stephane Eranian

* src/libpfm4/lib/events/amd64_events_fam17h_zen1.h: Update libpfm4, to be current with the following commit:
--------------------------------------------------------------
commit c99ed181402b21e74744d5f602aceb6a320c7ded
update AMD64 Fam17h Zen1 event table
Add a few missing events. Thanks to Emmanuel for tracking them down. Based on AMD Fam17h model 01,08h B2 PPR version 3.03, Jun 14, 2019.
Reported-by: Emmanuel Oseret
Tested functional on ICL Morphine; AMD64 Fam17h Zen1 machine.

2020-06-03 Heike Jagode

* src/papi.h: Bug fix for architectures with more than 40 PMUs (e.g. KNL has > 40 uncore PMUs).
PAPI_PMU_MAX and its static value were introduced in 2014 (https://bitbucket.org/icl/papi/commits/2a1805ec ebba1b1789853e0a36af9bd921ef1b9a). The problem was not only that papi_component_avail didn't list all PMUs, but even worse, that papi_native_avail did, in fact, list all events, however, if a user tried to monitor listed events from omitted PMUs, an error was returned. 2020-05-29 Frank Winkler * src/components/perf_event_uncore/README.md: README.md edited online with Bitbucket 2020-05-29 Frank Winkler * src/components/perf_event_uncore/README.md: Added FAQ entry for component perf_event_uncore. * src/components/cuda/README.md, src/components/perf_event/README.md, src/components/perf_event_uncore/README.md, src/components/powercap/README, src/components/powercap/README.md, src/components/powercap_ppc/README, src/components/powercap_ppc/README.md, src/components/rapl/README, src/components/rapl/README.md, src/components/sde/README, src/components/sde/README.md, src/components/sensors_ppc/README, src/components/sensors_ppc/README.md, src/components/stealtime/README.md, src/components/vmware/README, src/components/vmware/README.md: New readme files for the components in markdown format (2/2). 2020-05-29 Frank Winkler * src/components/pcp/README.md: README.md edited online with Bitbucket 2020-05-28 Anthony Castaldo * src/components/rocm/linux-rocm.c, src/components/rocm_smi/linux- rocm-smi.c: Improved _init_component() code in rocm, rocm_smi to populate the disabled reason if library failures are encountered during initialization. Per Steve Kaufmann request 05/28/2020. 
2020-05-28 Frank Winkler

* src/components/appio/README, src/components/appio/README.md, src/components/bgpm/README, src/components/bgpm/README.md, src/components/coretemp/README.md, src/components/coretemp_freebsd/README, src/components/coretemp_freebsd/README.md, src/components/emon/README, src/components/emon/README.md, src/components/example/README.md, src/components/host_micpower/README, src/components/host_micpower/README.md, src/components/infiniband/README, src/components/infiniband/README.md, src/components/io/README, src/components/io/README.md, src/components/libmsr/README.md, src/components/lmsensors/README.md, src/components/lustre/README.md, src/components/lustre/linux-lustre.c, src/components/micpower/README, src/components/micpower/README.md, src/components/mx/README.md, src/components/net/README, src/components/net/README.md: New readme files for 16 components in markdown format.

2020-05-22 Anthony Castaldo

* src/components/nvml/tests/Makefile: Removed some leftover development lines from nvml/tests/Makefile

2020-05-21 Anthony Castaldo

* src/components/rocm_smi/Rules.rocm_smi, src/components/rocm_smi/linux-rocm-smi.c, .../rocm_smi/tests/rocm_command_line.cpp: Changes to make rocm_smi have its own PAPI_ROCMSMI_ROOT variable, and given PAPI_ROCM_SMI_LIB as an override environment variable.
* src/components/cuda/README, src/components/cuda/README.md, src/components/nvml/README, src/components/nvml/README.md, src/components/pcp/README, src/components/pcp/README.md, src/components/rocm/README, src/components/rocm/README.md, src/components/rocm_smi/README, src/components/rocm_smi/README.md: Changed files to Markdown versions, matched templates, for five components: cuda, nvml, pcp, rocm, rocm_smi.
Mon May 18 09:33:57 2020 -0700 Steve Kaufmann

* src/libpfm4/README, src/libpfm4/docs/Makefile, src/libpfm4/docs/man3/libpfm_arm_a64fx.3, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/events/arm_fujitsu_a64fx_events.h, src/libpfm4/lib/events/intel_icl_events.h, src/libpfm4/lib/pfmlib_arm_armv8.c, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_arm64.c, src/libpfm4/tests/validate_x86.c: Update libpfm4, to be current with the following commit:
--------------------------------------------------------------
commit 0cfc35f73e0e39d54ba48c24e663bec93d164211
Enable support for Fujitsu A64FX core PMU
This patch adds support for the Fujitsu A64FX core PMU. This includes ARMv8 generic core events and Fujitsu model specific events.

2020-05-15 Frank Winkler

* src/configure: Generated new configure file with autoconf (2.69) on saturn.

2020-05-07 Anthony

* src/components/sde/sde.c: Avoid creating a variable for something that is only used for a debug message. Otherwise we create compiler warnings when debug is not enabled.

2020-05-07 Frank Winkler

* src/configure.in, src/utils/papi_native_avail.c: Added CFLAG -DSDE.

2020-04-30 Frank Winkler

* src/configure.in, src/ctests/Makefile.recipies, src/ctests/Makefile.target.in, src/utils/Makefile, src/utils/Makefile.target.in, src/utils/papi_native_avail.c: Fixed static build.
- SDE component is disabled
- "ctest" shlib is disabled

Thu Apr 16 15:12:05 2020 +0200 Thomas Richter

* src/libpfm4/lib/events/s390x_cpumf_events.h, src/libpfm4/lib/pfmlib_s390x_cpumf.c, src/libpfm4/tests/validate.c: Update libpfm4, to be current with the following commit:
--------------------------------------------------------------
commit 47f0845d81f851e8bee8745b8c4c7ad6f8e03122
s390: Update counter definition
This patch updates the libpfm4 s390 counter definitions to the latest documentation:
SA23-2260-06: The Load-Program-Parameter and the CPU-Measurement Facilities, September 2019, https://www.ibm.com/support/pages/sites/default/files/inline-files/117183_SA23-2260-06.pdf
SA23-2261-06: The CPU-Measurement Facility Extended Counters Definition for z10, z196/z114, zEC12/zBC12, z13/z13s, z14 and z15, January 2020, https://www.ibm.com/support/pages/sites/default/files/inline-files/119190_SA23-2261-06.pdf
This includes updated counter descriptions for existing counters and the complete counter definition for IBM z15.

commit f1aedd4f189814b980763f9db2465a4a9c34bd6e
validate: Add flag p to the getopt list of commandline files
The validate program supports flag -p to test perf events. This option is not listed in the getopt list.

2020-04-24 Frank Winkler

* src/configure.in: Another test for "--with-static-tools".
* src/configure.in: Fixed configure options for shared and static builds. 1) --with-static-lib=no (force PAPI to build shared libraries and tools) 2) --with-shlib-tools (use internal libpfm via rpath-link)
* release_procedure.txt: Modified instructions for release procedure.
* release_procedure.txt, src/configure: Generated new configure file with autoconf 2.69 on saturn.

2020-04-19 Frank Winkler

* src/configure.in, src/ctests/Makefile.recipies, src/ctests/Makefile.target.in: Fixed bug for MPI tests.
MPI tests are disabled if:
- the user specifies "--with-shared-lib=no"
- mpicc is not using the current $CC compiler

2020-04-17 Anthony Castaldo

* src/components/cuda/linux-cuda.c: Repaired error code and error reporting on the check for compute capability >= 7.5. An uninitialized variable.

2020-04-14 Anthony Castaldo

* src/components/rocm_smi/linux-rocm-smi.c, .../rocm_smi/tests/power_monitor_rocm.cpp: Changed component to handle mis-numbered sensors coming from the driver. Cleaned up commenting in power_monitor_rocm.cpp.

Wed Apr 8 01:02:22 2020 -0700 Stephane Eranian

* src/libpfm4/lib/events/intel_skl_events.h: Update libpfm4, to be current with the following commit:
--------------------------------------------------------------
commit 34164d84bba9794c75b4ce643ad74aad1362e97a
fix encoding typos for OFFCORE_RESPONSE on SKL/SKX/CLX
Some of the alias encodings were wrong. No impact because only the encoding of the actual umask is used except when listing umasks.

2020-04-03 Frank Winkler

* release_procedure.txt: Added text for bug fix release procedure.
* src/papi.h: Fixed typo.

2020-04-02 Anthony

* .../tests/Created_Counter/Created_Counter_Driver.c, .../Created_Counter/Lib_With_Created_Counter.c, .../sde/tests/Created_Counter/Overflow_Driver.c, src/components/sde/tests/Makefile: Added example that shows how to implement Created Counters in a library and how to use PAPI_overflow() to monitor an SDE.

2020-04-01 Frank Winkler

* src/Makefile.inc: Fixed bug in install process. Create BINDIR before copying the hl python script to BINDIR.

2020-03-30 Anthony Castaldo

* src/components/cuda/linux-cuda.c: Changed to report a useful disabled reason when devices with compute capability >= 7.5 are present; these no longer support the CUPTI interface.

2020-03-24 Anthony Castaldo

* src/components/rocm_smi/tests/ROCM_SMI_Makefile, .../rocm_smi/tests/power_monitor_rocm.cpp, src/components/rocm_smi/tests/rocmcap_plot.cpp: New code for power monitoring, replaces rocmcap_plot.cpp.
Extensive new command line options. Fri Mar 6 17:32:45 2020 -0800 Stephane Eranian * src/libpfm4/README, src/libpfm4/docs/Makefile, src/libpfm4/docs/man3/libpfm_intel_icl.3, src/libpfm4/docs/man3/libpfm_intel_tmt.3, src/libpfm4/docs/man3/libpfm_perf_event_raw.3, src/libpfm4/examples/showevtinfo.c, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/events/amd64_events_fam17h_zen2.h, src/libpfm4/lib/events/intel_icl_events.h, src/libpfm4/lib/events/intel_tmt_events.h, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_icl.c, src/libpfm4/lib/pfmlib_intel_rapl.c, src/libpfm4/lib/pfmlib_intel_skl.c, src/libpfm4/lib/pfmlib_intel_tmt.c, src/libpfm4/lib/pfmlib_intel_x86.c, src/libpfm4/lib/pfmlib_intel_x86_perf_event.c, src/libpfm4/lib/pfmlib_intel_x86_priv.h, src/libpfm4/lib/pfmlib_perf_event.c, src/libpfm4/lib/pfmlib_perf_event_priv.h, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_x86.c: Update libpfm4, to be current with the following commit: -------------------------------------------------------------- NOTE: Intel Tremont and IceLake changes have not been tested; due to lack of hardware at this time. commit 647d1160b6fdd902b2bfe3138522cc09e2d57387 add Intel Icelake core PMU support This patch adds Intel Icelake core PMU support for all published SKUs. It is based on the official event table published at download.01.org version 1.04. commit 5847026aa516dd4c220a5d04ab9e6128eefc19fd add hw_smpl support to x86 perf_events code Enables support for new hw_smpl attribute to perf_events x86 code. commit 67e238ef03bcdccd017c1bfc2a0c4d8fe545c442 add perf_events hw_smpl attribute This patch adds a new attribute to perf_events OS support. The attribute is called hw_smpl. It enables hardware sampling on an event. Hardware sampling is CPU specific and therefore requires CPU specific code. hw_smpl is a variation of precise sampling. 
It provides hardware assistance to sample but does not guarantee precise attribution of samples to code. With perf_events this is equivalent to setting attr.precise_ip = 1, which is what this attribute does. This patch only modifies the generic perf_event code.

commit 2ba296e3b1254f2bbaa0c7a3505721f395b53bf8
enable ExtendedPEBS attribute support for Intel X86
This patch introduces a new Intel X86 specific PMU flag, INTEL_X86_PMU_FL_EXTPEBS, to indicate that the PMU supports Extended PEBS. ExtendedPEBS provides the flexibility of the PEBS hardware assist sampling without guaranteeing the precision of the sample instruction address.

commit 2a6c6b60c4f65f63a300be52382af283a6a537c8
add support_hw_smpl attribute
This patch adds a new event and umask attribute visible at the API level called support_hw_smpl. This is a boolean attribute. If set, it means the event or the umask supports hardware buffer sampling. In other words, the event can be sampled using a hardware-assist buffer instead of basic interrupt-based sampling. This usually brings a lower cost of sampling by amortizing the PMU interrupt over multiple samples. Hardware-assist sampling does not mean there is no sampling skid. There may be some skid. Only events supporting precise sampling can be sampled without skid. Note that oftentimes, precise sampling is achieved by having a precise event sampled using a hardware-assist buffer. In other words, events/umasks marked as precise usually also have support_hw_smpl set to true, but this is not a requirement. This patch adds the new attributes in the generic code, the man page, and the showeventinfo program. Arch-specific enablement is provided by separate patches.

commit 70f9c2d13ee7088be788a399e23f69a5f0524cb4
fix handling of FETHR on Intel X86
This patch fixes several issues with the handling of the Precise FRONTEND_RETIRED event on Intel X86 processors which support it (Skylake and later). First, the FE latency field is not 3 bits but 12.
Second, the code was missing lock-down capability for the fe_thres modifier. Some events have the fe_thres hardcoded, and therefore attempts to force a value should be rejected. Third, when a umask does not hardcode a fe_thres, then the user can pass one. Note that not all umasks of the event use the fe_thres.

commit 42c1857c7694cec1a4750a340381d49dd84ca8ff
add RETIRED_SSE_AVX_FLOPS event for AMD64 Fam17h Zen2
Was missing from initial commit. Added as per PPR rev 0.54. Note that this event by itself does not count correctly. It needs large increment support, which means merging of two consecutive counters. This is handled by the Linux kernel starting with 5.6-rc4. The library simply encodes the event as if it was like any other normal event.

commit 210b2ef95f33eccb671f2a88a979de5364c94465
fix Intel Tremont OCR event code
In Tremont, the second OCR event has encoding 0x02b7 and not 0x01bb.

commit a2909cdfbea45524931ca13035293555a645d2e5
add Intel Tremont core PMU support
This patch adds support for the Intel Tremont core PMU events. Based on Intel snowridgex_v1.06.json event information released on download.01.org/perfmon/SNR.

commit 0f6a3c3308f29699b4f698b5e0983af322d44bdb
update RAPL processor support
Added Goldmont, Cannonlake, CometLake, Icelake support. Based on Linux kernel support.

commit a291613f3cd2d3e3355627674af264210c3fcbe1
enable Intel CometLake support
Identical to Intel Skylake client support.

2020-03-18 Frank Winkler

* src/configure, src/configure.in: Some modifications.

2020-03-17 Frank Winkler

* src/configure: Generated configure via autoconf 2.69.
* src/configure.in: Put paranoid check message at the end of configure.

2020-03-16 Frank Winkler

* src/configure, src/configure.in: Replaced paranoid check error with warning.

2020-03-14 Frank Winkler

* src/configure, src/configure.in: Added paranoid check at configuration.
* src/threads.c: Removed linux condition.
* src/papi_internal.c, src/papi_internal.h, src/threads.c, src/threads.h: Added several thread identification functions for PAPI_thread_init (2).

2020-03-13 Frank Winkler

* src/high-level/papi_hl.c, src/papi.c, src/papi_internal.c, src/papi_internal.h: Added several thread identification functions for PAPI_thread_init.

2020-03-05 Anthony Castaldo

* release_procedure.txt: Further clarifications in release_procedure.txt

papi-papi-7-2-0-t/ChangeLogP701.txt

2023-03-09 Giuseppe Congiu

* src/components/Makefile_comp_tests.target.in: tests: fix order of headers in makefile. The local includedir now has precedence over the install includedir.

2023-02-14 Giuseppe Congiu

* src/components/rocm_smi/rocs.c: rocm_smi: add support for XGMI events. Added events for XGMI on MI50 and MI100. Also support P2P internode min and max bandwidth monitoring.

2023-03-06 Giuseppe Congiu

* src/components/rocm_smi/tests/Makefile: rocm_smi: fix test warning

2023-02-14 Giuseppe Congiu

* src/components/rocm_smi/rocs.c: rocm_smi: refactor component to support XGMI events (3/3). Add infrastructure for supporting XGMI events.

2023-02-07 Giuseppe Congiu

* src/components/rocm_smi/Rules.rocm_smi, src/components/rocm_smi/linux-rocm-smi.c: rocm_smi: refactor rocm_smi frontend to use rocs API (2/3). Replace old linux-rocm-smi.c logic with calls to the rocs layer interface.

2023-01-18 Giuseppe Congiu

* src/components/rocm_smi/htable.h, src/components/rocm_smi/rocs.c, src/components/rocm_smi/rocs.h: rocm_smi: refactor rocm_smi logic into rocs backend (1/3). Refactors most of the code originally in linux-rocm-smi.c by moving it to a new layer named rocs (for ROCmSmi) and simplifying the original event detection logic.
2023-03-02 Daniel Barry

* src/ftests/Makefile: build system: accommodate Fortran compiler absence. These changes introduce clean/clobber targets in the ftests/Makefile to remove ftests/Makefile.target in the case that the Fortran tests were not built. These changes were tested on a platform containing the AMD Zen4 architecture.

Sat Feb 25 18:01:45 2023 -0800 John Linford

* src/libpfm4/README, src/libpfm4/docs/Makefile, src/libpfm4/docs/man3/libpfm_arm_neoverse_v1.3, src/libpfm4/docs/man3/libpfm_arm_neoverse_v2.3, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/events/arm_neoverse_n2_events.h, src/libpfm4/lib/events/arm_neoverse_v1_events.h, src/libpfm4/lib/events/arm_neoverse_v2_events.h, src/libpfm4/lib/events/intel_skl_events.h, src/libpfm4/lib/pfmlib_arm_armv8.c, src/libpfm4/lib/pfmlib_arm_armv9.c, src/libpfm4/lib/pfmlib_arm_priv.h, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_arm64.c: libpfm4: update to commit c676419
Original commits:
commit c676419047f240468efd63407cf5e3fefa71752a
update Intel SKL/SKX/CLX event table
Based on github.com/Intel/perfmon/SKX version 1.29

commit 098a39459fa0d0ed1d81f4c269a3b0ece46f9f27
add ARM Neoverse V2 core PMU support
Based on information from: github.com/ARM-software/data/blob/master/pmu/neoverse-v2.json

commit 1307e234db0f3922d6854e9b84283c5f6c72d2d6
move ARM Neoverse N2 to ARMv9 support
Neoverse N2 is an ARMv9 implementation; therefore it needs to be moved to the pfmlib_arm_armv9.c support file. Attributes are also updated to point to the V9 specific version.

commit 61b49e0bbcc0906c54c17007faca91d0c62e6b38
add ARM v9 support basic infrastructure
Adds the pfmlib_arm_armv9.c support file and a few macro definitions to enable ARMv9 PMU support.

commit 21895bae4e59936079b908c08787aa63fe485141
add Arm Neoverse V1 core PMU support
This patch adds support for Arm Neoverse V1 core PMU.
Based on Arm information posted on github.com/ARM-software/data/blob/master/pmu/neoverse-v1.json

2023-03-07 Anthony

* src/components/sde/tests/Makefile, src/components/sde/tests/Simple2/Simple2_Lib++.cpp, src/components/sde/tests/Simple2/Simple2_Lib.c: Modified non-compliant type aliasing code that gcc-12.2 treats as undefined behavior to conform to the C standard and be more portable.

2023-02-27 Giuseppe Congiu

* src/configure, src/configure.in: rocm: define PAPI_ROCM_PROF if rocm component enabled

* src/components/sysdetect/tests/Makefile: sysdetect: enable fortran tests rules only if F77 is set

Wed Feb 22 23:00:00 2023 -0800 Stephane Eranian

* src/libpfm4/lib/events/intel_icl_events.h, src/libpfm4/lib/events/intel_spr_events.h: libpfm4: update to commit b4361ca Original commits:

commit b4361ca023198b9a96f1d824cfcd276f020bcac3 Update Intel SapphireRapid event table Based on github.com/intel/perfmon version 1.11

commit f31c0f5ff0792d547eff436c577eac82d99b4e8b update Intel Icelake event table Based on github.com:
- v1.19 for IcelakeX
- v1.17 for Icelake

2023-02-08 Giuseppe Congiu

* src/components/rocm/rocp.c: rocm: fix intercept mode shutdown bug (2/2) Shutdown hash table in intercept mode path.

* src/components/rocm/rocm.c: rocm: fix init_private return code bug (1/2) Check init_private return state in delayed initialization functions (e.g. rocm_update_control_state).

2023-02-20 Giuseppe Congiu

* src/components/rocm/Rules.rocm, src/components/rocm/htable.c, src/components/rocm/htable.h: rocm: refactor htable (28/28) The htable data structure is used across multiple components and having it in a C file causes multiple definition errors. Instead, move the implementation to the header and declare all functions static inline.
2023-01-26 Giuseppe Congiu

* src/components/rocm/Rules.rocm, src/components/rocm/common.h, src/components/rocm/rocd.c, src/components/rocm/rocd.h, src/components/rocm/rocm.c, src/components/rocm/rocp.c, src/components/rocm/rocp.h, src/configure, src/configure.in: rocm: add dispatch layer for future extensions (27/28) Add dispatch layer for accommodating the integration of additional backend profiling tools (e.g. rocmtools).

2023-01-23 Giuseppe Congiu

* src/components/rocm/common.h, src/components/rocm/rocp.h: rocm: refactor rocp backend header (26/28) Move context state defines to rocp.h

2023-01-10 Giuseppe Congiu

* src/components/rocm/rocp.c: rocm: add note to source code (25/28) Add fixme note to intercept code

2023-01-09 Giuseppe Congiu

* src/components/rocm/rocp.c: rocm: remove same-event-number-per-device limitation (24/28) The rocm component used to make some assumptions when working in sampling mode. More specifically, in the case of a single eventset containing events from different devices, the component assumed every device had the same number of events. This is a reasonable assumption to make because in the typical case the user has a SIMD workload split across available devices. It makes sense therefore to monitor the same events on all devices, in order to make apples-to-apples comparisons. This assumption, however, does not allow for the case in which the user wants to monitor different events on different devices. This might be the case for a MIMD workload. This patch removes the limitation by allowing different numbers of events to be monitored on different devices using a single eventset.

2023-01-05 Giuseppe Congiu

* src/components/rocm/rocm.c, src/components/rocm/rocp.c: rocm: sort events in component frontend (23/28) The rocp layer needs events sorted by device. The sorting used to happen in the rocp layer and counters would be eventually assigned to the correct position in the user counter array.
The same mechanism is already available in the frontend through the ntv_info ni_position attribute. Thus, do the sorting in the rocm layer and set the ni_position to the remapped event/counter. This simplifies the rocp layer logic.

2023-01-04 Giuseppe Congiu

* src/components/rocm/rocp.c, src/components/rocm/rocp.h: rocm: add comments for backend functions (22/28) Explicitly separate interfaces by functionality in rocp.h and add descriptions in rocp.c

2022-12-23 Giuseppe Congiu

* src/components/rocm/rocp.c: rocm: refactor static string lengths (21/28) Use PATH_MAX instead of PAPI_MAX_STR_LEN for paths

2022-12-19 Giuseppe Congiu

* src/components/rocm/rocp.c: rocm: refactor shutdown function names in backend (20/28) Rename shutdown funcs in rocp layer

2022-12-16 Giuseppe Congiu

* src/components/rocm/rocp.c: rocm: refactor event verification logic in intercept mode (19/28) The verify_events function makes sure that, in intercept mode, all eventsets have the same events. This is dictated by rocprofiler, which currently does not allow resetting intercept mode callbacks. The way this check is carried out is by going through a list of intercept events (kept internally by the rocp layer) and a list of user requested events. If the two differ, then there is a conflict and the new eventset cannot be monitored. Use the htable to log the name of the intercept events and check the presence/absence for conflict in verify_events.

2022-12-15 Giuseppe Congiu

* src/components/rocm/rocm.c: rocm: refactor event duplication verification logic (18/28) Currently, update_native_events removes event duplication whenever that happens. This is the case for events that differ by instance number. The rocp layer reports event instances as separate native events. The logic to remove duplicate events is messy and probably not even needed, as the user will normally add events without indicating the instance of the event. This patch removes such logic.
2022-12-14 Giuseppe Congiu * src/components/rocm/rocp.c: rocm: refactor dispatch_counter logic (17/28) dispatch_counter is used to keep track of what kernel has been dispatched by what thread. The current implementation relies on an array to keep the tid and the counter value. Using the hash table, currently also used for keeping events, simplifies the code significantly. 2022-12-13 Giuseppe Congiu * src/components/rocm/rocm.c, src/components/rocm/rocp.c, src/components/rocm/rocp.h: rocm: refactor rocp_ctx_open function name (16/28) Rename rocp_ctx_open_v2 to rocp_ctx_open 2022-12-09 Giuseppe Congiu * src/components/rocm/common.h, src/components/rocm/rocm.c, src/components/rocm/rocp.c, src/components/rocm/rocp.h: rocm: handle event table in component backend (15/28) No longer initialize and use the event table in the rocm layer. No longer pass the ntv_table around in the rocp layer. * src/components/rocm/rocm.c: rocm: add event counting function in component frontend (14/28) Add evt_get_count function to count events * src/components/rocm/rocm.c: rocm: use rocp API to enumerate events in component (13/28) Do not use ntv_table for enum events in rocm.c Instead of using ntv_table for enum events use the new rocp exposed interfaces such as rocp_evt_code_to_name, etc. * src/components/rocm/rocm.c: rocm: add event name tokenizer (12/28) Add tokenize_event_string function to extract device information from the event name. * src/components/rocm/rocm.c, src/components/rocm/rocp.c, src/components/rocm/rocp.h: rocm: get errors through rocp_err_get_last (11/28) Instead of returning error string explicitly during rocp_init/_environment, use rocp_err_get_last. * src/components/rocm/rocm.c, src/components/rocm/rocp.c, src/components/rocm/rocp.h: rocm: add error code to string function (10/28) Add error string return function: rocp_err_get_last(). 
2022-12-08 Giuseppe Congiu

* src/components/rocm/rocm.c, src/components/rocm/rocp.c, src/components/rocm/rocp.h: rocm: refactor rocp_ctx_open function (9/28) The goal of this patch is to remove the need for passing the event table reference to the rocp_ctx_open function. To avoid too many changes in the code, here we introduce a new rocp_ctx_open_v2 function and replace rocp_ctx_open with it in a later commit.

* src/components/rocm/rocp.c, src/components/rocm/rocp.h: rocm: refactor component backend interface (8/28) This patch does some preparatory work to make the rocp layer completely opaque to the component layer (rocm.c). This includes storing a reference to the native table built at rocp_init time, and adding four new interfaces for enumerating events, getting event descriptors, and converting event names to codes and vice versa.

2022-12-06 Giuseppe Congiu

* src/components/rocm/rocp.c: rocm: refactor user to component event mapping (7/28) The rocp_ctx already contains a reference to the events_id provided by the user. Remove the explicit reference in the get_user_counter_id arg list.

* src/components/rocm/rocp.c: rocm: refactor event sorting function name (6/28) Rename sort_events_by_device to sort_events. Events are numbered starting from the first device to the last. Thus, events are always ordered by device, and the name of the function is a redundant statement of how events are sorted.

* src/components/rocm/rocp.c: rocm: refactor event collision detection (5/28) Rename compare_events to verify_events to indicate the change in functionality for the function. It used to compare events, returning 0 if they all matched or a nonzero integer if they did not. Now verify_events returns PAPI_OK if the events match and PAPI_ECNFLT if they don't. verify_events also checks whether the rocprofiler callbacks are set or not. If not, the function exits immediately, as there cannot be any event conflict.
* src/components/rocm/rocp.c: rocm: refactor intercept_ctx_open (4/28) Move rocprofiler init_callbacks from intercept_ctx_open into ctx_init. * src/components/rocm/rocp.c: rocm: quick sort component events by id (3/28) Events from multiple GPUs can appear in any order in the eventset. However, rocprofiler needs events to be ordered by device. Previously, we were sorting events by device using a brute force approach. This is unnecessary because events are numbered according to device order anyway. Doing a quick sort of the events identifiers is sufficient. 2022-12-05 Giuseppe Congiu * src/components/rocm/rocm.c, src/components/rocm/rocp.c, src/components/rocm/rocp.h: rocm: refactor rocp_ctx_read (2/28) The events_id array, containing the id of the events requested by the user, is passed to rocp_ctx_open and can be saved in the rocp_ctx returned by this function. It is not necessary to pass the array again as argument to rocp_ctx_read. Thus, remove it from the argument list of rocp_ctx_read. 2022-12-13 Giuseppe Congiu * src/components/rocm/common.h, src/components/rocm/rocm.c, src/components/rocm/rocp.c, src/components/rocm/rocp.h: rocm: refactor variable types (1/28) Variable type refactoring. Use unsigned int for ids (e.g. events_id and devs_id) and int for counts (e.g. num_events, num_devs). 2023-02-21 Giuseppe Congiu * src/components/perf_event/perf_helpers.h: perf_event: used unused attribute in mmap_read_self * src/components/perf_event/perf_helpers.h: perf_event: add missing mmap_read_reset_count for non default cpus Power cpus do not have a version of mmap_read_reset_count. Implement the missing function. * src/components/perf_event/perf_helpers.h: perf_event: bug fix in mmap_read_self Commit 9a1f2d897 broke the perf_event component for power cpus. The mmap_read_self function is missing one argument. This patch restores the missing argument in the function. 
2023-01-26 Daniel Barry * src/components/nvml/README.md, src/components/nvml/linux-nvml.c: nvml: fix support for multiple devices Replace each cudaGetDevicePtr() call with a table lookup. This tracks the correct device ID when counting events; whereas, cudaGetDevicePtr() will only ever return a single ID with the way it was used. The dependency on CUDA unnecessarily restricts multi-process jobs on Summit. Removing this dependency was necessary to properly support multiple devices. These changes were tested on the Summit supercomputer, which contains the IBM POWER9 and NVIDIA Tesla V100 architectures. 2023-02-21 Masahiko, Yamada * src/components/perf_event/perf_event.c, src/components/perf_event/perf_event_lib.h, src/components/perf_event/perf_helpers.h: PAPI_read performance improvement for the arm64 processor We developed PAPI_read performance improvements for the arm64 processor with a plan to port direct user space PMU register access processing from libperf to the papi library without using libperf. The workaround has been implemented that stores the counter value at the time of reset and subtracts the counter value at the time of reset from the read counter value at the next read. When reset processing is called, the value of pc->offset is cleared to 0, and only the counter value read from the PMU counter is referenced. There was no problem with the counters FAILED with negative values during the multiplex+reset test, except for sdsc2-mpx and sdsc4-mpx. To apply the workaround only during reset, the _pe_reset function call sets the reset_flag and the next _pe_start function call clears the reset_flag. The workaround works if the mmap_read_self function is called between calls to the _pe_reset function and the next call to the _pe_start function. Switching PMU register direct access from user space from OFF to ON is done by changing the setting of the kernel variable "/proc/sys/kernel/perf_user_access". 
Setting PMU Register Direct Access from User Space Off:
  $ echo 0 > /proc/sys/kernel/perf_user_access
  $ cat /proc/sys/kernel/perf_user_access
  0

Setting PMU Register Direct Access from User Space ON:
  $ echo 1 > /proc/sys/kernel/perf_user_access
  $ cat /proc/sys/kernel/perf_user_access
  1

Performance of PAPI_read has been improved as expected from the execution result of the papi_cost command. Improvement effect of switching PMU register direct access from user space from OFF to ON:

Total cost for PAPI_read (2 counters) over 1000000 iterations:
  min cycles: 689 -> 28
  max cycles: 3876 -> 1323
  mean cycles: 724.471979 -> 28.888076

Total cost for PAPI_read_ts (2 counters) over 1000000 iterations:
  min cycles: 693 -> 29
  max cycles: 4066 -> 3718
  mean cycles: 726.753003 -> 29.977226

Total cost for PAPI_read (1 derived_[add|sub] counter) over 1000000 iterations:
  min cycles: 698 -> 28
  max cycles: 7406 -> 2346
  mean cycles: 728.527079 -> 28.880691

Sun Feb 5 22:56:09 2023 -0800 Stephane Eranian

* src/libpfm4/lib/events/amd64_events_fam19h_zen4.h: libpfm4: update to commit 678bca9 Original commits:

commit 678bca9bf803b089c089629661d457533a7705b0 Update AMD Zen4 event table
- Fix wrong encodings in for event RETIRED_FP_OPS_BY_TYPE
- Fix INT256_OTHER bogus name
- Add missing RETIRED_UCODE_INSTRUCTIONS
- Fix Name and descripiton for event RETIRED_UNCONDITIONAL_INDIRECT_BRANCH_INSTRUCTIONS_MISPREDICTED

commit dcb2f5e73d0343c87995919495c3c10252a7b0ca remove useless combination in AMD Zen4 packed_int_ops_retired event This combination is useless and does not match the rest of the logic for this event. libpfm4 allows one umask at a time.

2023-01-26 Giuseppe Congiu

* src/ctests/all_native_events.c: ctests/all_native_events: bug workaround Sampling mode fails in the presence of more than one rocm GPU. This is due to a bug in the rocprofiler library. To avoid the failure in the all_native_events test we skip rocm tests.
2023-02-07 Daniel Barry * src/components/Makefile_comp_tests.target.in, src/components/cuda/sampling/Makefile, src/components/cuda/tests/BlackScholes/Makefile, src/components/cuda/tests/Makefile, src/components/nvml/tests/Makefile, src/components/nvml/utils/Makefile, src/configure, src/configure.in: build system: workaround for GCC8+CUDA10 bug on POWER9 When using CUDA 10 with GCC 8 on IBM POWER9, the following compile-time error occurs with 'nvcc': > error: identifier "__ieee128" is undefined We work around this issue by passing the flags "-Xcompiler -mno-float128" to 'nvcc' invocations. These changes have been tested on the ppc64le architecture and NVIDIA Tesla V100 GPUs. * src/components/sysdetect/nvidia_gpu.c, src/components/sysdetect/nvidia_gpu.h: sysdetect: account for older CUDA versions In CUDA 11.0 or greater, the macro "NVML_DEVICE_UUID_V2_BUFFER_SIZE" is defined. Older versions of CUDA define "NVML_DEVICE_UUID_BUFFER_SIZE." In order to support older versions of CUDA, these changes apply the appropriate macro. These changes have been tested on the NVIDIA Tesla V100 architecture. 2023-01-26 Daniel Barry * src/components/nvml/README.md: nvml: fix small typo in README Remove extra underscore in the README.md file. 2023-01-13 Giuseppe Congiu * src/components/rocm/tests/sample_multi_thread_monitoring.cpp: rocm: skip sampling multi-thread tests Sampling mode tests fail because of a still unresolved bug in rocm-5.3. Skip them until the bug is resolved. 2023-01-09 Giuseppe Congiu * src/components/rocm/rocp.c: rocm: fix bug in sampling_ctx_open The function creates one profiling context per device. The way the agent corresponding to the device is selected was erroneous. This caused different threads monitoring different devices with different eventsets to all access the counters from the first device. The fix is not to select the agent using a for loop index but instead to use that index to get the device id from the devs_id array. 
Thu Jan 5 12:29:51 2023 -0800 Giuseppe Congiu * src/libpfm4/lib/pfmlib_amd64_fam19h.c: libpfm4: update to commit dd42292 Original commits: commit dd422923f79a6c160e499f484212020ca2398f90 Fix AMD Zen4 cpu_family used in detection code AMD Zen4 was expecting Zen3 CPU family number. 2022-12-22 Giuseppe Congiu * src/components/net/linux-net.c: net: fix warning in strncpy The source and target string have the same length. If the source is null terminated (as expected in the absence of bugs) there is no need to null terminate the target manually. * src/components/net/linux-net.c: net: fix warning in snprintf Compute the length of the source string instead of copying PAPI_MAX_STR_LEN characters regardless. * src/components/coretemp/linux-coretemp.c: coretemp: fix warning in strncpy Source and target are the same length. No need to null terminate if the source is already null terminated. * src/components/powercap_ppc/tests/powercap_basic.c: powercap_ppc: fix warning in powercap_basic test Copy the whole source string to the target. * src/components/powercap_ppc/tests/powercap_basic.c: powercap_ppc: fix bug in powercap_basic test Make target and source strings the same size. 2022-12-20 Giuseppe Congiu * src/libpfm4/docs/man3/libpfm_amd64_fam19h_zen4.3, src/libpfm4/lib/events/amd64_events_fam19h_zen4.h: libpfm4: add missing zen 4 files Commit 2fe62da left out additional zen 4 files. This patch adds them. 2022-12-15 Giuseppe Congiu * src/components/sensors_ppc/linux-sensors-ppc.c: sensors_ppc: fix typo in fall through comment The compiler throws a warning because it does not recognize fallthrough as valid indication that the code can fall through in the case statement. 
* src/components/powercap_ppc/linux-powercap-ppc.c: powercap-ppc: fix warning in strncpy strncpy causes the following warning in linux-powercap-ppc.c:

  warning: '__builtin_strncpy' specified bound 1024 equals destination size
  62 | char *retval = strncpy( dst, src, size );
     |                ^~~~~~~

The problem is that size is the same size as dst. If src is also the same length, the null termination character will not be copied over. Instead copy only size - 1 and terminate the string manually.

Fri Dec 2 00:01:47 2022 -0800 Stephane Eranian

* src/libpfm4/README, src/libpfm4/docs/Makefile, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/events/intel_icl_events.h, src/libpfm4/lib/pfmlib_amd64.c, src/libpfm4/lib/pfmlib_amd64_fam19h.c, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_x86.c: libpfm4: update libpfm4 to commit c0116f9 Original commits:

commit c0116f9433f34e5953407036243af998c00fcc1f Add AMD Zen4 core PMU support Based on AMD PPR for Fam19h model 11 B1 rec 0.25

commit 11f94169598b84a71fb9da4357baf3673f83038b Correctly detect all AMD Zen3 processors Fixes commit 79031f76f8a1 ("fix amd_get_revision() to identify AMD Zen3 uniquely") The commit above broke the detection of certain AMD Zen3, such as:

  Vendor ID: AuthenticAMD
  BIOS Vendor ID: Advanced Micro Devices, Inc.
  Model name: AMD Ryzen 9 5950X 16-Core Processor
  BIOS Model name: AMD Ryzen 9 5950X 16-Core Processor
  BIOS CPU family: 107
  CPU family: 25
  Model: 33

commit c5100b69add67172366e897cef5b854c5348dc91 fix CPU_CLK_UNHALTED.REF_DISTRIBUTED on Intel Icelake Had the wrong encoding of 0x8ec instead of 0x083c.
2022-12-19 Giuseppe Congiu

* src/utils/papi_hardware_avail.c: sysdetect: fix typo in papi_hardware_avail

2022-12-01 Giuseppe Congiu

* src/components/rocm/rocp.c: rocm: give precedence to new dir structure for metrics file

2022-11-28 Giuseppe Congiu

* src/components/rocm/rocp.c: rocm: account for new directory tree structure in rocm librocprofiler64.so is being moved from /rocprofiler/lib to /lib. This patch allows the rocm component to search in the new location if the old is empty/non-existent.

2022-11-30 Giuseppe Congiu

* src/ftests/clockres.F, src/ftests/fmatrixlowpapi.F: ftest: fix warning in fortran tests Arrays that are statically allocated cannot be placed in the stack if they exceed a certain size (-fmax-stack-var-size). The compiler moves them into static storage, which might cause problems if the function is called recursively. Solve the warning by allocating the arrays dynamically.

2022-12-08 Daniel Barry

* src/components/infiniband/linux-infiniband.c: infiniband: fix compiler warnings Recent versions of GCC (9.3.0 in this case) threw the following warnings:

components/infiniband/linux-infiniband.c: In function '_infiniband_ntv_code_to_info':
components/infiniband/linux-infiniband.c:937:9: warning: 'strncpy' specified bound depends on the length of the source argument [-Wstringop-overflow=]
  937 | strncpy(info->symbol, infiniband_native_events[index].name, len);
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
components/infiniband/linux-infiniband.c:935:28: note: length computed here
  935 | unsigned int len = strlen(infiniband_native_events[index].name);
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
components/infiniband/linux-infiniband.c:944:9: warning: 'strncpy' specified bound depends on the length of the source argument [-Wstringop-overflow=]
  944 | strncpy(info->long_descr, infiniband_native_events[index].description, len);
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
components/infiniband/linux-infiniband.c:942:28: note: length computed here
  942 | unsigned int len = strlen(infiniband_native_events[index].description);
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The changes in this commit fix these warnings by using the maximum possible length of the source arguments. These changes were tested on Summit, which has the following IB device listing from 'lspci': Infiniband controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex].

2022-11-29 Giuseppe Congiu

* src/components/sysdetect/Rules.sysdetect: sysdetect: fix order in include dirs

* src/components/sysdetect/Rules.sysdetect: sysdetect: fix typo in makefile Rules

* src/components/sysdetect/amd_gpu.c: sysdetect: explicit cast AMD hsa attributes to hsa_agent_info_t

2022-11-28 Florian Weimer

* src/configure, src/configure.in: configure: Avoid implicit ints and implicit function declarations Implicit ints and implicit function declarations were removed from the C language in 1999. Relying on them can cause spurious autoconf check failures with compilers that do not support them in the default language mode.

2022-12-02 Daniel Barry

* src/components/infiniband/linux-infiniband.c: infiniband: increase max number of events The maximum number of events ('INFINIBAND_MAX_COUNTERS') was hard-coded to be 128. However, some Infiniband devices provide more than 128 events, causing the component test 'infiniband_values_by_code' to seg fault. The IB devices on Summit nodes provide 188 events, so the macro needs to be greater than or equal to this number. These changes were tested on Summit, which has the following IB device listing from 'lspci': Infiniband controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex].

2022-05-26 Giuseppe Congiu

* src/components/cuda/Rules.cuda: cuda: remove untested dependency linux-cuda.o depends on cuda_sampling (directory). This contains untested code and does not seem to be an indispensable dependency for the cuda component.
This patch removes the cuda_sampling dependency for now. 2022-11-29 Giuseppe Congiu * src/components/sysdetect/sysdetect.c: sysdetect: add missing numa memsize attr * src/components/cuda/linux-cuda.c: cuda: use appropriate macro for perfworks API calls Two perfworks API calls were made using CUPTI_CALL macro instead of NVPW_CALL. This patch uses the appropriate call macro. * src/components/rocm_smi/Rules.rocm_smi: rsmi: fix warning in Makefile Rules This warning shows up for rocm version 5.2 and later, which changed the directory structure and deprecated headers for the old structure. This patch prioritizes the new structure when looking for rocm_smi headers at build time. * src/components/rocm_smi/linux-rocm-smi.c: rsmi: fix warning in strncpy strncpy warning was caused by the len of the copy being equal to the target string len. Increasing the target string by one character leaves space for a termination character and fixes the warning. * src/components/rocm/Rules.rocm: rocm: fix dependency priority in Makefile Recent versions of ROCM have deprecated the old directory tree in favour of a different organization of headers and libraries. This patch gives priority to the new directory structure when searching for headers. * src/components/rocm/tests/Makefile: rocm: fix deprecated warning in tests 2022-11-28 Giuseppe Congiu * src/components/rocm/rocp.c: rocm: fix miscellaneous warnings in rocp.c Fix following warnings: - implicit cast from enum to rocprofiler_feature_kind_t: this is caused by rocprofiler and is solved by explicit cast of ROCPROFILER_FEATURE_KIND_METRIC to rocprofiler_feature_kind_t; - casting 'getpid' to (unsigned long (*)(void)) incompatible type: this is solved by using '_papi_getpid' instead of 'getpid'. 2022-12-01 Giuseppe Congiu * src/components/nvml/linux-nvml.c: nvml: also copy null termination character in strncpy String literals are null terminated. 
When using strncpy the number of characters to be copied has to be the string length plus 1 in order to include the null termination character. Since the null termination is already included in the string literal, manually terminating the string is superfluous.

2022-11-29 Giuseppe Congiu

* src/components/nvml/linux-nvml.c: nvml: fix warning in strncpy The code defines the source string twice as long as the target string in number of characters. This causes the warning. The warning is removed by making the target string PAPI_MAX_STR_LEN long and the source string PAPI_MIN_STR_LEN long.

* src/components/nvml/linux-nvml.c: nvml: trim excessively long description string An excessively long description string for the power management upper bound limit does not fit into the 128 characters of PAPI_MAX_STR_LEN. This patch trims the description string to make it fit in the description string limit.

* src/components/nvml/linux-nvml.c: nvml: fix cuInit returned variable type The cuInit return result is of type CUresult and not cudaError_t. Fix the error to resolve the warning: implicit conversion from CUresult to cudaError_t.

2022-11-22 Giuseppe Congiu

* src/components/cuda/linux-cuda.c: cuda: fix compile error with gcc 10 Some of the symbols used by the cuda component clash with the nvml component as they are not defined static. This problem only affects newer versions of the gcc compiler (>= 10). This is due to how gcc places global variables that do not have an initializer (tentative definition variables in the C standard). These variables are placed in the .BSS section of the object file. This avoids the merging of tentative definition variables by the linker, which causes multiple definition errors. (This happens by default in gcc and corresponds to the -fno-common option).

* src/components/nvml/linux-nvml.c: nvml: fix compile error with gcc 10 Some of the symbols used by the nvml component clash with the cuda component as they are not defined static.
This problem only affects latest versions of the gcc compiler (>= 10). This is due to how gcc places global variables that do not have an initializer (tentative definition variables in the C standard). These variables are placed in the .BSS section of the object file. This avoids the merging of tentative definition variables by the linker, which causes multiple definition errors. (This happens by default in gcc and corresponds to the -fno-common option).

2023-12-19 Daniel Barry

* src/counter_analysis_toolkit/flops.c: cat: fix compile-time error On some older versions of GCC (10.3.0), not having a statement after 'default' in a switch-case statement can yield the compiler warning: "label at end of compound statement" These changes fix this error and have been tested on the AMD Zen3 architecture.

2023-12-19 Giuseppe Congiu

* .../rocm/tests/hl_intercept_multi_thread_monitoring.cpp, .../rocm/tests/hl_intercept_single_kernel_monitoring.cpp, .../rocm/tests/hl_intercept_single_thread_monitoring.cpp, .../rocm/tests/hl_sample_single_kernel_monitoring.cpp, .../rocm/tests/hl_sample_single_thread_monitoring.cpp, src/components/rocm/tests/matmul.cpp: rocm: fix warnings in the rocm tests

* src/components/rocm/tests/Makefile: rocm: search for hipcc in PAPI_ROCM_ROOT instead of using fixed path The path of hipcc in the ROCm installation directory has changed. In order to be location independent the rocm/tests Makefile should locate the hipcc compiler in the installation directory rather than relying on a fixed pathname.

* src/configure, src/configure.in: configure: search for rocm_smi headers in PAPI_ROCMSMI_ROOT The configure script used to search for rocm_smi headers in PAPI_ROCM_ROOT instead of PAPI_ROCMSMI_ROOT. This was because the rocm headers are typically installed under the same root.
However, with rocm-6.0.0 the rocm_smi.h header causes a failure while building the sysdetect component in PAPI (a component that is enabled by default). Thus, we now look explicitly for the rocm_smi header in PAPI_ROCMSMI_ROOT instead, in order to isolate the sysdetect & rocm components from rocm_smi.

* src/components/cuda/cupti_common.c: cuda: add cudaGetErrorString to generate error messages cudaGetErrorString is used to provide the proper disabled_message to the users whenever there is a cuda related problem during initialization.

* src/components/cuda/cupti_common.c: cuda: refactor get_gpu_compute_capability With the exception of trivial functions (i.e. functions that cannot fail), every function should return an error code for proper error handling. The get_gpu_compute_capability function does not account for error handling in case a cuda call fails.

* src/components/cuda/cupti_common.c: cuda: refactor util_gpu_collection_kind With the exception of trivial functions (i.e. functions that cannot fail), every function should return an error code for proper error handling. The util_gpu_collection_kind function does not account for error handling in case a cuda call fails.

* src/components/cuda/cupti_common.c, src/components/cuda/cupti_common.h, src/components/cuda/cupti_profiler.c: cuda: refactor cuptic_device_get_count With the exception of trivial functions (i.e. functions that cannot fail), every function should return an error code for proper error handling. The cuptic_device_get_count function does not account for error handling in case a cuda call fails.
2023-12-14 Giuseppe Congiu

* src/components/rocm/roc_profiler.c: rocm: print all masks descriptors for events that contain them

* src/components/rocm/roc_profiler.c: rocm: add comma separator between event descriptor and masks

2023-12-18 Florian Weimer

* src/configure, src/configure.in: configure: Fix return values in start thread routines Thread start routines must return a void * value, and future compilers refuse to convert integers to pointers with just a warning (the virtualtimer probe). Without this change, the probe always fails to compile with future compilers (such as GCC 14). For the tls probe, return a null pointer for future-proofing, although current and upcoming C compilers do not treat this omission as an error. Updates commit dd11311aadbd06ab6c76d ("configure: fix tls detection").

2023-12-14 Daniel Barry

* src/papi_events.csv: presets: various cache presets for SPR CPUs Defines the presets for data cache and total cache activity in the Intel Sapphire Rapids architecture. These changes have been tested on the Intel Sapphire Rapids architecture using the Counter Analysis Toolkit.

2023-12-08 Daniel Barry

* src/papi_events.csv: presets: add total cache presets for Zen4 CPUs Add preset definitions for L2 total cache hits and misses. These changes have been tested on the AMD Zen4 architecture using the Counter Analysis Toolkit.

* src/papi_events.csv: presets: correction to instr cache preset Fix a mistake introduced in commit ef1cc48846b58156995db58f53314bd4c9ec9bc0, in which the definition for PAPI_L2_ICM can yield negative values. These changes have been tested on the AMD Zen4 architecture using the Counter Analysis Toolkit.

2023-12-14 Daniel Barry

* src/counter_analysis_toolkit/gen_seq_dlopen.sh: cat: reduce exec time of instr cache benchmark Skip the most time-consuming kernels in the CAT instruction cache benchmark. These changes have been tested on the Intel Sapphire Rapids architecture.
* src/counter_analysis_toolkit/timing_kernels.c: cat: remove unused variable
  Remove the declaration of an unused variable. These changes have been tested on the Intel Sapphire Rapids architecture.

* src/counter_analysis_toolkit/dcache.c: cat: account for proper number of buffers
  Adjust the logic to properly account for how many buffer sizes shall exceed the size of the last-level cache. These changes have been tested on the Intel Sapphire Rapids architecture.

2023-12-13 Daniel Barry

* src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/driver.h, src/counter_analysis_toolkit/hw_desc.h, src/counter_analysis_toolkit/main.c: cat: read values from config file as 'long long'
  Since some of the buffer sizes are very large, the values for the cache sizes provided in the config file should be interpreted as type 'long long'. These changes have been tested on the Intel Sapphire Rapids architecture.

* src/counter_analysis_toolkit/dcache.c: cat: remove unnecessary typecast
  Remove a typecast to 'long long', which is unnecessary because the variable is already of type 'long long'. These changes have been tested on the Intel Sapphire Rapids architecture.

* src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/dcache.h: cat: use macro for LLC factor
  Create a macro to more easily define the factor by which the LLC size is multiplied to attain the largest buffer size used in the pointer chase. These changes have been tested on the Intel Sapphire Rapids architecture.

* src/counter_analysis_toolkit/dcache.c: cat: ensure proper integer arithmetic
  Append 'LL' to constant values that are added to or multiplied with 'long long' variables. These changes have been tested on the Intel Sapphire Rapids architecture.

2023-12-12 Daniel Barry

* src/counter_analysis_toolkit/dcache.c: cat: allocate the proper max buffer size
  Allocate enough space for the largest buffer size used in the pointer chase.
When values in the config file are provided, this needs to account for them. These changes have been tested on the Intel Sapphire Rapids architecture.

* src/counter_analysis_toolkit/dcache.c: cat: fix erroneous malloc
  Fix an erroneous malloc() call by changing the size of each element to that of 'long long'. These changes have been tested on the Intel Sapphire Rapids architecture.

* src/counter_analysis_toolkit/main.c: cat: fix memory leak
  Fix a memory leak by freeing dynamically allocated memory in case it was not previously freed. These changes have been tested on the Intel Sapphire Rapids architecture.

* src/counter_analysis_toolkit/prepareArray.c: cat: clean-up comments
  Remove in-line comments and fix typos in comments for readability.

2023-12-06 Daniel Barry

* src/counter_analysis_toolkit/main.c: cat: place MPI_Barrier before MPI_Finalize
  When MPI is used, no rank should reach MPI_Finalize until all ranks' work has completed. These changes have been tested on the Intel Sapphire Rapids architecture.

* src/counter_analysis_toolkit/main.c: cat: only measure latencies once
  When MPI is used, only one rank needs to run the latency tests. These changes have been tested on the Intel Sapphire Rapids architecture.

2023-12-08 Daniel Barry

* src/papi_events.csv: presets: add instr cache presets for Intel SPR
  Defines the instruction cache presets for the Intel Sapphire Rapids architecture. These changes have been tested on the Intel Sapphire Rapids architecture using the Counter Analysis Toolkit.

* src/components/intel_gpu/README.md: intel_gpu: fix small typo
  Fix a small typo in the README for the Intel GPU component.

2023-12-06 Daniel Barry

* src/counter_analysis_toolkit/prepareArray.c: cat: fix memory leak
  Free the dynamically allocated memory at the end of the function that sets up the pointer chain. These changes have been tested on the AMD Zen4 architecture.
* src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/dcache.h, src/counter_analysis_toolkit/prepareArray.c, src/counter_analysis_toolkit/prepareArray.h, src/counter_analysis_toolkit/timing_kernels.c, src/counter_analysis_toolkit/timing_kernels.h: cat: store buffer sizes as 'long long'
  Use 'long long' instead of 'int' for buffer sizes to prevent overflow from occurring for large buffer sizes. These changes have been tested on the AMD Zen4 architecture.

2023-12-04 Daniel Barry

* src/counter_analysis_toolkit/timing_kernels.c: cat: properly normalize counter values
  Ensure that the number of pointer chain accesses is evenly divisible by the work macros to prevent incorrectly normalizing event counts. These changes have been tested on the AMD Zen4 architecture.

* src/counter_analysis_toolkit/.cat_cfg, src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/hw_desc.h, src/counter_analysis_toolkit/main.c: cat: fix logic for memory hierarchy parameters
  Make a distinction between the "L4" and "MM" levels of the memory hierarchy. These changes have been tested on the AMD Zen4 architecture.

* src/counter_analysis_toolkit/main.c: cat: larger default PPB value
  Make the default pages-per-block (PPB) value larger to accommodate more recent architectures. These changes have been tested on the AMD Zen4 architecture.

* src/counter_analysis_toolkit/.cat_cfg, src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/hw_desc.h, src/counter_analysis_toolkit/main.c: cat: create parameter for max PPB in config file
  Allow the user to change the pages-per-block (PPB) value via the configuration file. These changes have been tested on the AMD Zen4 architecture.

* src/counter_analysis_toolkit/main.c: cat: probe fewer buffers per cache level
  Make the default number of buffer sizes three (per cache level) to decrease the benchmark execution time while still sufficiently sampling each level in the memory hierarchy.
These changes have been tested on the AMD Zen4 architecture.

2023-12-01 Daniel Barry

* src/counter_analysis_toolkit/dcache.c: cat: exclude cache sizes from tests
  Do not use the exact cache sizes in the sweep of buffer sizes in the data-cache kernels, because there tends to be transient behavior at these boundaries. These changes have been tested on the AMD Zen4 architecture.

2023-12-01 Giuseppe Congiu

* .github/workflows/ci.sh, .github/workflows/main.yml: ci: run tests with and without PAPI debug enabled
  Tests should make sure real use cases work as expected. Some tests might not work correctly if -O0 is used as the optimization level in the compiler. For example, the ROCm runtime submits a kernel of 4 waves if the tests are built using -O0, which makes the tests fail. Update the github test configuration matrix to include testing without PAPI debug.

2023-11-15 Aurelian MELINTE

* src/components/sysdetect/arm_cpu_utils.c: PAPI: ARM Cortex A76 support (Raspberry Pi 5)

2023-11-29 Giuseppe Congiu

* src/components/rocm/tests/Makefile: rocm: change opt level to user choice for tests

2023-11-16 Giuseppe Congiu

* src/components/rocm/roc_dispatch.c, src/components/rocm/roc_dispatch.h, src/components/rocm/roc_profiler.c, src/components/rocm/roc_profiler.h, src/components/rocm/rocm.c: rocm: add rocp_evt_code_to_info support
  This function is needed to allow papi_native_avail to extract qualifier descriptions for the event identifier.

2023-11-14 Giuseppe Congiu

* src/components/rocm/roc_profiler.c, src/components/rocm/rocm.c: rocm: add qualifier support
  This commit contains the core changes of this feature set. It introduces the logic necessary to handle event identifiers in such a way that device and instance attributes are presented to PAPI users as qualifiers.
This means that papi_native_avail will return:

  Native Events in Component: rocm
  ===============================================================================
  | rocm:::SQ_WAIT_INST_LDS                                                     |
  |     Number of wave-cycles spent waiting for LDS instruction issue. In      |
  |     units of 4 cycles. (per-simd, nondeterministic)                        |
  |     :device=0                                                              |
  |         mandatory device qualifier [devices: 0,1]                          |
  -------------------------------------------------------------------------------
  | rocm:::TCP_TCP_TA_DATA_STALL_CYCLES                                        |
  |     TCP stalls TA data interface. Now Windowed.                            |
  |     :device=0                                                              |
  |         mandatory device qualifier [devices: 0,1]                          |
  |     :instance=0                                                            |
  |         mandatory instance qualifier in range [0 - 15]                     |
  -------------------------------------------------------------------------------

The PAPI user will be able to use event names in the same form as before (all previous tests will still work), with a relaxation on the order of device and instance numbers.

* src/components/rocm/roc_profiler.c: rocm: add finalize_features function
  This function is needed as features will be generated on the fly for rocprofiler rather than saved in the event table. Therefore, the feature names have to be freed when the rocprofiler context is closed.

* src/components/rocm/roc_profiler.c: rocm: add unique metric utility functions for intercept mode
  In intercept mode we are only interested in unique events, i.e., events that have the same name and instance (they can be from different devices). This is because in intercept mode all unique events are monitored on all devices. However, only the counters for the actual requested events will be presented to the user. This is a design choice that accounts for the fact that once set, callbacks for dispatch queues cannot be updated (this includes the monitored events).

* src/components/rocm/roc_profiler.c: rocm: add event name to info utility functions
  Add functions to extract event info from the name (e.g., device number and instance number).
* src/components/rocm/roc_profiler.c: rocm: remove useless comments

2023-11-03 Giuseppe Congiu

* src/components/rocm/roc_profiler.c: rocm: remove useless check for intercept code path

* src/components/rocm/roc_profiler.c: rocm: move init_callbacks call
  init_callbacks should be called only once, i.e., when the intercept_global_state is initialized. After that happens, the callbacks for the dispatch queues in all devices are already set and can no longer be changed.

* src/components/rocm/roc_profiler.c: rocm: remove intercept global state macros
  Intercept mode macros were simple aliases to entries in the global intercept mode state. Using explicit references to the said data structure entries improves readability.

2023-11-21 Giuseppe Congiu

* src/components/rocm/roc_common.c, src/components/rocm/roc_common.h, src/components/rocm/roc_profiler.c: rocm: change type of device id from unsigned int to int

2023-11-01 Giuseppe Congiu

* src/components/rocm/roc_profiler.c: rocm: add event identifier utility functions
  Add functions to create and query event id attributes like device and instance.

2023-11-13 Giuseppe Congiu

* src/components/rocm/roc_common.c, src/components/rocm/roc_common.h: rocm: add bitmap utility functions
  Add rocc_dev_set and rocc_dev_check. The first registers the presence of a device in the passed-in bitmap, while the second checks that the bit corresponding to the passed-in device number is set in the bitmap.

2023-11-01 Giuseppe Congiu

* src/components/rocm/roc_common.c, src/components/rocm/roc_common.h, src/components/rocm/roc_dispatch.c, src/components/rocm/roc_dispatch.h, src/components/rocm/roc_profiler.c, src/components/rocm/roc_profiler.h, src/components/rocm/roc_profiler_config.h, src/components/rocm/rocm.c: rocm: change the event id type to uint64_t in backend
  Preparatory commit to increase the size of the event id datatype in the component backend layer, so as to make it ready for hosting event id encoded information, such as device and instance numbers.
2023-07-21 Giuseppe Congiu

* src/components/template/README.md, src/components/template/Rules.template, src/components/template/template.c, src/components/template/tests/Makefile, src/components/template/tests/simple.c, src/components/template/vendor_common.c, src/components/template/vendor_common.h, src/components/template/vendor_config.h, src/components/template/vendor_dispatch.c, src/components/template/vendor_dispatch.h, src/components/template/vendor_profiler_v1.c, src/components/template/vendor_profiler_v1.h: template: add template for new components

2023-12-01 Anthony

* src/counter_analysis_toolkit/timing_kernels.c: CAT: Initialize variables to suppress warnings, and move them to the correct scope.

2023-11-29 Daniel Barry

* src/papi_events.csv: presets: add inst cache presets for Zen4 CPUs
  Defines various instruction-cache related presets for Zen4. These changes have been tested on the Zen4 architecture using the Counter Analysis Toolkit.

2023-08-30 Giuseppe Congiu

* src/components/rocm/README.md: rocm: extend README with device partitioning information

2023-11-01 Giuseppe Congiu

* src/components/sysdetect/tests/Makefile: sysdetect: add -ffree-form to silence error in ARM comp

2023-11-09 Giuseppe Congiu

* src/components/rocm/README.md: rocm: add known problems with some events to README

2023-11-17 Giuseppe Congiu

* src/components/rocm/roc_profiler.c: rocm: fix bug in intercept mode reset function

* src/components/rocm/roc_profiler.c: rocm: fix bug introduced by commit 4991e1614

2023-11-14 Giuseppe Congiu

* src/libpfm4/.gitignore: libpfm4: remove leftover .gitignore file

Thu Sep 28 08:01:09 2023 +0000 Clément Foyer

* src/libpfm4/lib/pfmlib_intel_x86_arch.c: libpfm4: update to commit 535c204
  Original commit: Add Intel IceLake and Intel SapphireRapid performance counters to the event table

2023-11-10 Anthony

* src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/dcache.h: CAT: Add information about the cache sizes in the header of the output file.
2023-11-22 Daniel Barry

* src/papi_events.csv: presets: add data cache presets for Zen4 CPUs
  Includes various data-cache related presets for Zen4. These changes have been tested on the Zen4 architecture using the Counter Analysis Toolkit.

2023-11-12 Anthony

* src/counter_analysis_toolkit/main.c: CAT: Add missing option in the usage output.

2023-11-09 Anthony

* src/utils/Makefile: utils: Fix bogus "Disabled" message in papi_component_avail for the sde component.
  When the sde component is initialized in the context of an application that uses PAPI, it looks for the availability of libsde symbols. The rationale is that if the application is not linked against libsde, there are no SDEs to read, so the component disables itself. Therefore, papi_component_avail, which does not export any SDEs itself, always reported the sde component as "Disabled". Adding the symbols to the utility resolves this problem.

2023-10-27 Giuseppe Congiu

* src/components/rocm/roc_profiler.c: rocm: fix bug in intercept mode path
  The intercept mode path keeps track of intercepted events using the same hash table used to map event names to entries in the native event table. The event names don't collide because intercept mode keeps track of the base name of the event (discarding device id and instance number), while native event table entries are referenced as "name:device=N:instance=M". The reason is that events are intercepted on all devices' dispatch queues regardless of the device id specified by the user (this approach follows the rocprof strategy). However, using only the event name without the instance number will cause problems. Instances represent separate events and should not be treated as a single event. The proposed patch uses a separate hash table for intercept mode and inserts the feature name rather than the event base name. This means that events with more than one instance will have a hash table key of the form "name[M]", where M represents the instance.
If the event only has one instance, the key will be "name".

2023-11-06 Giuseppe Congiu

* .github/workflows/ci.sh: ci: add --enable-warnings to github actions

* src/configure, src/configure.in: configure: add -Wall to the --enable-warnings configure flag

2023-10-24 Giuseppe Congiu

* src/configure, src/configure.in: configure: add --enable-warnings flag
  The --enable-warnings configure flag allows for a maintainer build mode where the compiler (gcc) enables extra warnings (-Wextra).

2023-11-07 Giuseppe Congiu

* src/components/rocm/roc_profiler.c: rocm: refactor get_context_counters
  The function already takes rocp_ctx as an input argument, thus there is no need to pass events_id as an input argument as well.

* src/components/rocm/roc_profiler.c: rocm: get rid of asserts

* src/components/rocm/roc_profiler.c: rocm: set return code outside fn_fail in init_event_table

2023-11-08 Giuseppe Congiu

* src/configure, src/configure.in: configure: fix for issue #112

2023-09-13 Josh Minor

* src/components/perf_event/pe_libpfm4_events.c: Set size of perf_attr_struct prior to getting pfm encoding

2023-11-07 William Cohen

* src/ctests/thrspecific.c: ctests/thrspecific: Have the threads clean up after themselves
  Each thread is doing memory allocations via malloc.
They should also free the memory once they are done, to eliminate the following coverity issues:

  Error: CPPCHECK_WARNING (CWE-401): [#def10]
  papi-7.0.1/src/ctests/thrspecific.c:77: error[memleak]: Memory leak: data.data
  #  75|          }
  #  76|          processing = 0;
  #  77|->     }
  #  78|   }
  #  79|

  Error: CPPCHECK_WARNING (CWE-401): [#def11]
  papi-7.0.1/src/ctests/thrspecific.c:77: error[memleak]: Memory leak: data.id
  #  75|          }
  #  76|          processing = 0;
  #  77|->     }
  #  78|   }
  #  79|

* src/components/sysdetect/linux_cpu_utils.c: sysdetect: Eliminate file resource leak in get_vendor_id() function
  This fix eliminates the following issue reported by coverity:

  Error: RESOURCE_LEAK (CWE-772): [#def9]
  papi-7.0.1/src/components/sysdetect/linux_cpu_utils.c:900: alloc_fn: Storage is returned from allocation function "fopen".
  papi-7.0.1/src/components/sysdetect/linux_cpu_utils.c:900: var_assign: Assigning: "fp" = storage returned from "fopen("/proc/cpuinfo", "r")".
  papi-7.0.1/src/components/sysdetect/linux_cpu_utils.c:906: noescape: Resource "fp" is not freed or pointed-to in "search_cpu_info".
  papi-7.0.1/src/components/sysdetect/linux_cpu_utils.c:968: leaked_storage: Variable "fp" going out of scope leaks the storage it points to.

* src/components/net/linux-net.c: net: Ensure that copied strings are NULL terminated
  The strncpy function may not put a NULL at the end of the destination buffer if the source string is longer than the specified copy size. To ensure that the copied strings are null terminated, use snprintf instead and check its return value to verify that the copied string was not truncated. The snprintf function will always include a NULL at the end of the copy.
This particular fix addresses the following two coverity issues:

  Error: BUFFER_SIZE (CWE-170): [#def6]
  papi-7.0.1/src/components/net/linux-net.c:346: buffer_size_warning: Calling "strncpy" with a maximum size argument of 128 bytes on destination array "_net_native_events[i].name" of size 128 bytes might leave the destination string unterminated.

  Error: BUFFER_SIZE (CWE-170): [#def7]
  papi-7.0.1/src/components/net/linux-net.c:347: buffer_size_warning: Calling "strncpy" with a maximum size argument of 128 bytes on destination array "_net_native_events[i].description" of size 128 bytes might leave the destination string unterminated.

* src/components/coretemp/linux-coretemp.c: coretemp: Ensure strings copied during initialization are NULL terminated
  The strncpy function will not place a NULL character at the end of the string if the string being copied is the same length as or longer than the destination of the strncpy function. Switch the code in the _coretemp_init_component function to use snprintf, and check the return value of snprintf to verify that the copied string fits in the destination.

* src/components/coretemp/linux-coretemp.c: coretemp: add closedir operation to function exit
  Coverity flagged a resource leak on one of the possible exit paths of the generateEventList function. This patch adds the missing closedir.

2023-10-25 Daniel Barry

* src/components/rocm/README.md: rocm: update README
  For versions of ROCM >= 5.2.0, the ROCM library path structure is different. The README has been updated to reflect this difference. This was verified on the Frontier supercomputer.

2023-09-29 Giuseppe Congiu

* src/sde_lib/Makefile: sde_lib: do not build with debug symbols by default

* src/configure, src/configure.in: configure: do not build with debug symbols by default
  Remove -g being added by default in configure.
2023-10-19 Anustuv Pal

* src/components/cuda/cupti_profiler.c: cuda: Fix papi_command_line segfault when passed a non-existent event name

2023-10-06 Anustuv Pal

* src/components/cuda/cupti_profiler.c, src/components/cuda/linux-cuda.c: cuda: Improve CUDA component PAPI_read() overhead, issue 85

2023-10-06 Giuseppe Congiu

* src/components/rocm/roc_profiler.c, .../rocm/tests/sample_multi_thread_monitoring.cpp, .../rocm/tests/sample_single_thread_monitoring.cpp: rocm: fix sampling mode multithread issue
  Issue #80 was causing sampling mode multithreading not to work. This was caused by a bug in the rocm component that tried to monitor multiple GPU devices using the same rocprofiler queue. Assigning one independent queue per device solves the issue.

2023-10-09 Giuseppe Congiu

* src/components/rocm/roc_profiler.c: rocm: fix typo in ctx_open

2023-09-08 Giuseppe Congiu

* src/components/rocm/roc_profiler.c: rocm: add logging to component backend

* src/components/rocm/rocm.c: rocm: add logging to component frontend

* src/components/rocm/rocm.c: rocm: funnel exits through same point in component frontend

2023-07-20 Giuseppe Congiu

* src/components/rocm/roc_common.c: rocm: refactor rocc_dev_get_{count,id} functions

* src/components/rocm/roc_common.c, src/components/rocm/roc_common.h, src/components/rocm/roc_profiler.c: rocm: fix warning in callback function

2023-07-18 Giuseppe Congiu

* src/components/rocm/roc_common.c, src/components/rocm/roc_common.h, src/components/rocm/roc_profiler.c: rocm: move thread id get function to roc_common

2023-07-17 Giuseppe Congiu

* src/components/rocm/roc_common.c: rocm: fix warning in roc_common.c

* src/components/rocm/roc_profiler.h: rocm: remove roc_common.h from roc_profiler.h

* src/components/rocm/roc_common.c, src/components/rocm/roc_common.h, src/components/rocm/roc_profiler.c: rocm: move agent to id function to roc_common

2023-07-14 Giuseppe Congiu

* src/components/rocm/roc_profiler.h: rocm: remove leftover err_get_last function
header

* src/components/rocm/roc_dispatch.c, src/components/rocm/roc_dispatch.h, src/components/rocm/roc_profiler.c, src/components/rocm/roc_profiler.h, src/components/rocm/rocm.c: rocm: rename evt_get_descr to evt_code_to_descr

2023-07-13 Giuseppe Congiu

* src/components/rocm/roc_common.h, src/components/rocm/roc_dispatch.h, src/components/rocm/{rocp_config.h => roc_profiler_config.h}: rocm: rename rocp_config.h to roc_profiler_config.h

* src/components/rocm/roc_profiler.c: rocm: reformat roc_profiler.c code

* src/components/rocm/roc_profiler.c: rocm: remove FIXME comment

* src/components/rocm/roc_profiler.c: rocm: use snprintf instead of strncpy

* src/components/rocm/roc_common.c, src/components/rocm/roc_common.h, src/components/rocm/roc_profiler.c: rocm: extract all device booking and checking functions

2023-07-12 Giuseppe Congiu

* src/components/rocm/rocm.c, src/components/rocm/rocp_config.h: rocm: move extern declarations to config header
  The rocm lock and the profiling mode variables need to be shared between the front-end and the back-end. The reason for the lock is that it has to be initialized by the front-end, which is the only one with access to the required information. This lock design in PAPI is flawed, as it is hard to extend.
* src/components/rocm/roc_profiler.c: rocm: remove unneeded comments

* src/components/rocm/Rules.rocm, src/components/rocm/{rocc.c => roc_common.c}, src/components/rocm/{rocc.h => roc_common.h}, src/components/rocm/{rocd.c => roc_dispatch.c}, src/components/rocm/{rocd.h => roc_dispatch.h}, src/components/rocm/{rocp.c => roc_profiler.c}, src/components/rocm/{rocp.h => roc_profiler.h}, src/components/rocm/rocm.c: rocm: rename source files for better readability

2023-05-17 Giuseppe Congiu

* src/components/rocm/Rules.rocm, src/components/rocm/common.h, src/components/rocm/rocc.c, src/components/rocm/rocc.h, src/components/rocm/rocd.c, src/components/rocm/rocd.h, src/components/rocm/rocm.c, src/components/rocm/rocp.c, src/components/rocm/rocp.h, src/components/rocm/rocp_config.h: rocm: extract shared functionality
  Some functionality can be shared with other profiler versions, if and when these become available. Thus, it makes sense to extract such functionality from the specific profiler implementation and make it available to future profiler versions.

2023-07-12 Giuseppe Congiu

* src/components/rocm/rocd.c, src/components/rocm/rocp.c, src/components/rocm/rocp.h, src/configure, src/configure.in: rocm: remove ROCM_PROF_ROCPROFILER guard
  This guard was introduced when rocmtools was planned instead of rocprofiler V2.

2023-05-16 Giuseppe Congiu

* src/components/rocm/rocp.c: rocm: update returned error codes
  Errors associated with rocprofiler calls are assigned PAPI_EMISC, while errors caused by unexpected user actions (e.g. starting an eventset that is already running) are assigned PAPI_EINVAL. Everything else that is not a memory allocation failure (PAPI_ENOMEM) is assigned the PAPI_ECMP error.
* src/components/rocm/rocp.c: rocm: remove macros handling error management

* src/components/rocm/rocp.c: rocm: rename hsa_agent_arr_t to device_table_t

* src/components/rocm/rocp.c: rocm: replace trailing Ptr in rocm functions with _p

2023-09-08 G-Ragghianti

* .github/workflows/ci.sh, .github/workflows/spack.sh: changing gcc version for rocm compatibility

2023-09-29 Giuseppe Congiu

* src/components/sysdetect/tests/Makefile: sysdetect: fix compiler flag selection in tests

* src/configure, src/configure.in: configure: fix tls detection
  Configure TLS detection tests were failing because of wrong usage of pthread_create(). The problem was caused by a wrong definition of thread functions, which require void *f(void *) instead of int f(void *) or void f(void *).

2023-09-26 Giuseppe Congiu

* src/smoke_tests/Makefile: smoke_tests: fix Makefile
  The Makefile was missing a PAPI_ROOT path and also an additional -pthread in the linker flags.

2023-09-15 Anustuv Pal

* src/components/cuda/linux-cuda.c, src/papi.h, src/utils/papi_component_avail.c: cuda: Revert "utils: papi_component_avail does not support cuda component counters"
  This reverts commit 4f15f3d15463df5acfda26fbc6367756e1f62f03.

* src/components/lmsensors/linux-lmsensors.c: lmsensors: Replace numerical literal 1024 with PATH_MAX macro

2023-09-05 Anustuv Pal

* src/components/lmsensors/README.md, src/components/lmsensors/linux-lmsensors.c: lmsensors: Add lib/ to explicit search path to .so loader

2023-09-15 Anustuv Pal

* src/components/coretemp/linux-coretemp.c: coretemp: Fix snprintf warnings for gcc 10

2023-07-12 Caleb Han

* src/sde_lib/sde_lib.hpp: sde_lib: fixed make bug

2023-09-18 Anustuv Pal

* src/components/sde/tests/Minimal/Minimal_Test.c: sde: Fix Minimal_Test.c handle pointer

2023-07-06 Giuseppe Congiu

* src/components/rocm/rocp.c: rocm: fix snprintf handling
  The expected return value from snprintf is < PAPI_MAX_STR_LEN.
If it is >= PAPI_MAX_STR_LEN, the input string was longer than the output string, and this is an unexpected condition that needs to be handled properly.

* src/components/sysdetect/nvidia_gpu.c: sysdetect: fix snprintf n argument in CUDA backend
  The n argument in snprintf specifies the length of the output string, not that of the input string.

* src/components/sysdetect/amd_gpu.c: sysdetect: fix snprintf n argument in ROCm backend
  The n argument in snprintf specifies the length of the output string, not that of the input string.

* src/components/sysdetect/nvidia_gpu.c: sysdetect: do not null terminate manually in CUDA backend
  snprintf will always null terminate the output string regardless of characters from the input string being dropped (i.e. if the output string is shorter than the input string).

* src/components/sysdetect/amd_gpu.c: sysdetect: do not null terminate manually in ROCm backend
  snprintf will always null terminate the output string regardless of characters from the input string being dropped (i.e. if the output string is shorter than the input string).

2023-07-21 Lukas Alt

* src/components/rapl/linux-rapl.c: rapl: support for icelake-sp

2023-07-25 Daniel Barry

* src/counter_analysis_toolkit/main.c: cat: add missing entry in usage message
  Add a command-line flag for the instructions benchmark to the usage message. These changes have been tested on the Intel Sapphire Rapids architecture.

* src/counter_analysis_toolkit/main.c, src/counter_analysis_toolkit/params.h: cat: add option for conf file path
  Add an optional command-line flag for the path to the configuration file. This is useful on systems which do not assume the work directory is where the .cat_cfg file is located. These changes have been tested on the Intel Sapphire Rapids architecture.
2023-09-06 Giuseppe Congiu

* src/components/rocm_smi/rocs.c: rocm_smi: fix warning "variable might be used uninitialized"

2023-09-01 Giuseppe Congiu

* src/components/rocm/tests/Makefile, .../tests/hl_intercept_multi_thread_monitoring.cpp, .../hl_intercept_single_thread_monitoring.cpp, .../tests/hl_sample_single_thread_monitoring.cpp, .../rocm/tests/multi_thread_monitoring.cpp, .../rocm/tests/single_thread_monitoring.cpp: rocm: remove openmp dependency
  Spack installation of PAPI with the rocm component has dependency issues with openmp caused by the AMD llvm compiler. Because component tests are always built in PAPI, this prevents spack from installing PAPI in the system. Removing the openmp dependency and replacing it with pthreads solves the issue.

2023-09-06 Anustuv Pal

* src/components/cuda/cupti_profiler.c: cuda: fix event enumeration

2023-08-30 Anustuv Pal

* src/components/cuda/cupti_common.c: cuda: fix dangerous dl_iterate_phdr operation

2023-08-15 Anustuv Pal

* src/components/cuda/linux-cuda.c, src/papi.h, src/utils/papi_component_avail.c: utils: papi_component_avail does not support cuda component counters

2023-08-24 Anustuv Pal

* src/components/cuda/tests/runtest.sh: cuda: Remove x flag from cuda/tests/runtest.sh

2023-08-18 Giuseppe Congiu

* src/components/rocm/rocp.c: rocm: fix instanced events
  Some events have multiple instances. The way the component was handling those events was wrong, causing such events to not work. This patch fixes the problem.

2023-08-23 Bert Wesarg

* src/components/rocm/rocp.c: rocm: prefer librocprofiler64.so.1
  `librocprofiler64.so` was a linker script in 5.6 which could not be `dlopen`ed. In 5.7 it has vanished completely, thus try `.so.1` first.
Fri Jun 30 15:06:22 2023 -0400 William Cohen

* src/libpfm4/lib/pfmlib_amd64_perf_event.c, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_skx_unc_cha.c, src/libpfm4/lib/pfmlib_intel_x86.c, src/libpfm4/lib/pfmlib_intel_x86_perf_event.c: libpfm4: update to commit efd10fb
  Original commit: Correct the arguments in a number of printf statements
  Adjusted the printf statements to fix the following issues flagged by static analysis:

  Error: PRINTF_ARGS (CWE-685): [#def66]
  libpfm-4.13.0/lib/pfmlib_intel_x86.c:87: extra_argument: This argument was not used by the format string: "e->fstr".
  #   85|         __pfm_vbprintf(" any=%d", reg.sel_anythr);
  #   86|
  #   87|->      __pfm_vbprintf("]", e->fstr);
  #   88|
  #   89|         for (i = 1 ; i < e->count; i++)

  Error: PRINTF_ARGS (CWE-685): [#def11]
  libpfm-4.13.0/lib/pfmlib_amd64_perf_event.c:78: missing_argument: No argument for format specifier "%d".
  #   76|
  #   77|         if (e->count > 1) {
  #   78|->              DPRINT("%s: unsupported count=%d\n", e->count);
  #   79|                 return PFM_ERR_NOTSUPP;
  #   80|         }

  Error: PRINTF_ARGS (CWE-685): [#def14]
  libpfm-4.13.0/lib/pfmlib_common.c:1151: missing_argument: No argument for format specifier "%d".
  # 1149|
  # 1150|         if (pfmlib_is_blacklisted_pmu(p)) {
  # 1151|->              DPRINT("%d PMU blacklisted, skipping initialization\n");
  # 1152|                 continue;
  # 1153|         }

  Error: PRINTF_ARGS (CWE-685): [#def15]
  libpfm-4.13.0/lib/pfmlib_common.c:1367: missing_argument: No argument for format specifier "%s".
  # 1365|         ainfo->equiv= NULL;
  # 1366|         if (*endptr) {
  # 1367|->              DPRINT("raw umask (%s) is not a number\n");
  # 1368|                 return PFM_ERR_ATTR;
  # 1369|

  Error: PRINTF_ARGS (CWE-685): [#def34]
  libpfm-4.13.0/lib/pfmlib_intel_skx_unc_cha.c:60: missing_argument: No argument for format specifier "%x".
# 58| f.val = e->codes[1];
# 59|
# 60|-> __pfm_vbprintf("[UNC_CHA_FILTER0=0x%"PRIx64" thread_id=%d source=0x%x state=0x%x"
# 61| " state=0x%x]\n",
# 62| f.val,

Error: PRINTF_ARGS (CWE-685): [#def83] libpfm-4.13.0/lib/pfmlib_intel_x86_perf_event.c:100: missing_argument: No argument for format specifier "%d".
# 98|
# 99| if (e->count > 2) {
# 100|-> DPRINT("%s: unsupported count=%d\n", e->count);
# 101| return PFM_ERR_NOTSUPP;
# 102| }

2023-08-22 Anustuv Pal * src/components/cuda/cupti_common.c, src/components/cuda/cupti_common.h: cuda: fix link error with gcc 10.0 when getting linked shared libraries

* src/components/cuda/cupti_common.c, src/components/cuda/cupti_common.h, src/components/cuda/cupti_profiler.c: cuda: Load cuda shared libraries from linked/rpath/LD_LIBRARY_PATH

2023-08-13 Anustuv Pal * src/papi.h: papi.h: Fix warnings for -Wstrict-prototypes

2023-07-25 Daniel Barry * src/papi_events.csv: add more Ice Lake FLOPs presets Since there are enough counters available to monitor both single- and double-precision floating-point events, PAPI_FP_OPS, PAPI_FP_INS, and PAPI_VEC_INS are all defined. These presets have been validated using the Counter Analysis Toolkit. These changes have been tested on the Intel Ice Lake architecture.

2023-07-31 Giuseppe Congiu * src/components/rocm/tests/Makefile: rocm: temporarily remove all tests from being built Spack has issues building rocm tests because of a broken dependency in hip (openmp). To avoid spack failing to build PAPI altogether, this commit temporarily removes the rocm component tests from being built. A better, and permanent, solution will follow soon.
2023-07-26 Anustuv Pal * src/components/cuda/README.md, src/components/cuda/Rules.cuda, src/components/cuda/cupti_common.c, src/components/cuda/cupti_common.h, src/components/cuda/cupti_config.h, src/components/cuda/cupti_dispatch.c, src/components/cuda/cupti_dispatch.h, src/components/cuda/cupti_events.c, src/components/cuda/cupti_events.h, src/components/cuda/cupti_profiler.c, src/components/cuda/cupti_profiler.h, src/components/cuda/cupti_utils.c, src/components/cuda/cupti_utils.h, src/components/cuda/htable.h, src/components/cuda/lcuda_debug.h, src/components/cuda/linux-cuda.c, src/components/cuda/sampling/Makefile, src/components/cuda/sampling/README, src/components/cuda/sampling/activity.c, src/components/cuda/sampling/gpu_activity.c, src/components/cuda/sampling/path.h.in, src/components/cuda/sampling/test/matmul.cu, .../cuda/sampling/test/sass_source_map.cubin, .../cuda/tests/BlackScholes/BlackScholes.cu, .../cuda/tests/BlackScholes/BlackScholes_gold.cpp, .../tests/BlackScholes/BlackScholes_kernel.cuh, src/components/cuda/tests/BlackScholes/Makefile, .../cuda/tests/BlackScholes/NsightEclipse.xml, .../cuda/tests/BlackScholes/README_SETUP.txt, src/components/cuda/tests/BlackScholes/readme.txt, .../cuda/tests/BlackScholes/testAllEvents.sh, .../cuda/tests/BlackScholes/testSomeEvents.sh, .../cuda/tests/BlackScholes/thr_BlackScholes.cu, src/components/cuda/tests/HelloWorld.cu, src/components/cuda/tests/HelloWorld_CUPTI11.cu, src/components/cuda/tests/HelloWorld_NP_Ctx.cu, src/components/cuda/tests/HelloWorld_noCuCtx.cu, src/components/cuda/tests/LDLIB.src, src/components/cuda/tests/Makefile, src/components/cuda/tests/concurrent_profiling.cu, .../cuda/tests/concurrent_profiling_noCuCtx.cu, src/components/cuda/tests/cudaOpenMP.cu, src/components/cuda/tests/cudaOpenMP_noCuCtx.cu, src/components/cuda/tests/cudaTest_cupti_only.cu, .../cuda/tests/cuda_ld_preload_example.README, .../cuda/tests/cuda_ld_preload_example.c,
.../tests/cupti_multi_kernel_launch_monitoring.cu, src/components/cuda/tests/gpu_work.h, src/components/cuda/tests/likeComp_cupti_only.cu, src/components/cuda/tests/nvlink_all.cu, src/components/cuda/tests/nvlink_bandwidth.cu, .../cuda/tests/nvlink_bandwidth_cupti_only.cu, src/components/cuda/tests/pthreads.cu, src/components/cuda/tests/pthreads_noCuCtx.cu, src/components/cuda/tests/runAll.sh, src/components/cuda/tests/runBW.sh, src/components/cuda/tests/runCO.sh, src/components/cuda/tests/runCTCO.sh, src/components/cuda/tests/runSMG.sh, src/components/cuda/tests/runtest.sh, src/components/cuda/tests/simpleMultiGPU.cu, .../cuda/tests/simpleMultiGPU_CUPTI11.cu, .../cuda/tests/simpleMultiGPU_noCuCtx.cu, .../cuda/tests/test_2thr_1gpu_not_allowed.cu, .../cuda/tests/test_multi_read_and_reset.cu, .../cuda/tests/test_multipass_event_fail.c, .../cuda/tests/test_multipass_event_fail.cu: cuda: New cuda component based on NVIDIA PerfWorks API.

2023-07-26 Kamil Iskra * src/components/powercap/linux-powercap.c: powercap: test counter read permissions Check that the files inside /sys/class/powercap/intel-rapl: directories not only exist, but are readable. On recent Linux kernels, "energy_uj" is by default readable by root only, which is something that PAPI fails to detect, resulting in 0 being returned for that counter without any indication of a problem.

* src/components/powercap/linux-powercap.c: powercap: ignore the psys entry This is a bit of a workaround for newer Intel CPUs that, in addition to the traditional "package-" entries in /sys/class/powercap/, also contain a "psys" entry that controls the platform domain (see, e.g., https://lkml.kernel.org/lkml/1458516392-2130-3-git-send-email-srinivas.pandruvada@linux.intel.com/). PAPI currently assumes that entries starting with "intel-rapl:0" correspond to socket 0 and "intel-rapl:1" to socket 1.
With "psys" around that unfortunately need not be the case; on at least one system relevant to DOE (I can't post the details as it's not public yet) intel-rapl:0 corresponds to socket 0, intel-rapl:1 corresponds to *psys*, and intel-rapl:2 corresponds to socket 1 (what a mess!). What currently happens is that PAPI entirely misses the counters for socket 1. This PR works around the problem by exhaustively searching for the right "intel-rapl:" directory. It preserves the current PAPI assumption that ZONE0 events correspond to socket 0 and ZONE1 to socket 1. On the other hand, it completely ignores the "psys" entry, while one could argue that the data it contains should ideally be made available as well... 2023-07-23 Daniel Barry * src/papi_events.csv: add various Sapphire Rapids presets These changes include cycles, instructions, branching, and FLOPs presets for Intel Sapphire Rapids, validated using the Counter Analysis Toolkit. These changes have been tested on the Intel Sapphire Rapids architecture. 2023-07-11 G-Ragghianti * .github/workflows/clang_analysis.sh, .github/workflows/main.yml: added support for clang static code analysis 2023-06-12 Daniel Barry * src/counter_analysis_toolkit/Makefile, .../{vec_arch.h => cat_arch.h}, src/counter_analysis_toolkit/flops.c, src/counter_analysis_toolkit/flops.h, src/counter_analysis_toolkit/vec.c, src/counter_analysis_toolkit/vec_scalar_verify.c, src/counter_analysis_toolkit/vec_scalar_verify.h: cat: put GEMM kernels back in Re-introduce the GEMM operation in each precision to provide a kernel that executes exclusively fused multiply-add floating-point operations. We use intrinsics to ensure that the FMA instructions are included. These changes have been tested on the AMD Zen4, Fujitsu A64FX, and IBM POWER9 architectures. 
2023-06-28 Giuseppe Congiu * src/components/rocm/tests/Makefile: rocm: use HIP_PATH for hipcc compiler in tests Makefile The rocm tests assume hipcc is located under the same root directory as the rest of the rocm toolkit software. Spack installs rocm dependencies in separate directories however, which breaks this assumption. This patch introduces a HIP_PATH variable that, if unset, is set automatically to PAPI_ROCM_ROOT. Spack can use this variable to let the tests Makefile in the PAPI rocm component know where the hipcc compiler is located.

2023-06-21 Daniel Barry * src/papi_events.csv: add cycles and instructions presets for Zen4 These changes include the 'total cycles' and 'instructions completed' presets for Zen4, validated using the Counter Analysis Toolkit. These changes have been tested on the AMD Zen4 architecture.

2023-06-30 Anthony Danalis * src/sde_lib/sde_lib_datastructures.c: sde_lib: Fixed bug in hash-table deletion. If the item being deleted from the hash-table happened to be on the head of a list and there was no other item in the list, then the head was not being cleaned properly.

2023-06-29 Anthony Danalis * src/sde_lib/sde_lib.c: sde_lib: Allow group placeholders. If reading a group has been requested from the application/tool layer (through PAPI_event_name_to_code()) before the group is actually registered by the library, we will create a placeholder for it. This change allows the group registration to overwrite the placeholder.

* src/sde_lib/sde_lib.c, src/sde_lib/sde_lib.h, src/sde_lib/sde_lib_internal.h, src/sde_lib/sde_lib_misc.c: sde_lib: Added reference counts for proper unregistering of groups. Counters can belong in groups, even multiple groups, and groups can recursively belong in larger groups. This means that a counter (or group) cannot be unregistered and freed without keeping track of which groups it belongs to.
Now each counter has a reference counter "ref_count" which is incremented when it's added to a group and decremented when the counter is unregistered, or when a parent group is unregistered.

* src/sde_lib/sde_lib_ti.c: sde_lib: Added locking to sde_ti_read_counter() function. Protected the reading function with locks so it can't race against papi_sde_shutdown().

* src/sde_lib/sde_lib_datastructures.c, src/sde_lib/sde_lib_internal.h: sde_lib: Added function for hash-table serialization. This function helps abstract the hash-table from other parts of the code, instead of directly accessing the internal structure of the hash table from all over the place.

2023-06-28 Vince Weaver * src/components/perf_event/perf_event.c: don't use fast rdpmc counter reads in attach or syswide scenarios With perf_event we can use fast rdpmc reads for low-overhead counter access. This only works in self-monitoring situations where the thread being measured is in the same process context and same CPU as PAPI. This means it cannot generally be used in the attach case, or if trying to do system-wide measurements (granularity anything other than PAPI_GRN_THR). Ideally the Linux kernel would notice the request to use rdpmc in inappropriate circumstances and cause the mmap() read to fail and fall back to using the read() syscall. However for various reasons the kernel devs did not want to support this, so it's up to PAPI to avoid using rdpmc in cases where Linux will silently fail and allow rdpmc to return invalid counter values. This should fix the "attach_cpu_sys_validate" test failure.
2023-06-20 Giuseppe Congiu * .github/pull_request_template.md: PR author's checklist Add PR author's checklist for github

2023-07-03 G-Ragghianti * .github/workflows/main.yml, .github/workflows/spack.sh: Implemented CI check of spack install

* src/smoke_tests/Makefile, src/smoke_tests/simple.c, src/smoke_tests/threads.c: Adding smoke-test code for spack install validation

2023-06-23 Giuseppe Congiu * src/components/intel_gpu/tests/Makefile: intel_gpu: remove libsupc++ dependency from tests makefile

* src/components/intel_gpu/Rules.intel_gpu: intel_gpu: remove libsupc++ dependency from component makefile

2023-06-27 Giuseppe Congiu * src/components/sysdetect/x86_cpu_utils.c: sysdetect: replace logical AND with bitwise AND operator There was a typo in the sysdetect code for the x86 CPU architecture that computed a bitwise AND of two variables using the logical AND (&&) instead of the bitwise AND (&). This patch fixes the problem.

2023-06-20 Wileam Y. Phan * src/components/sde/tests/Makefile: sde: fix cray and intel fortran test flag

* src/components/sysdetect/tests/Makefile: sysdetect: fix cray and intel fortran test flag

2023-06-20 Giuseppe Congiu * src/components/sysdetect/Rules.sysdetect: sysdetect: add include path for cuda headers to makefile

2023-06-18 Daniel Barry * src/components/pcp/tests/testPCP.c: pcp: skip test if component is disabled Previously, the test would fail if the PCP component was disabled. These changes check to see if it is disabled, and if so, skip the test. These changes have been tested on the IBM POWER9 architecture.

2023-06-12 Daniel Barry * src/papi_events.csv: add flops presets for Zen4 These changes include FLOPs presets for Zen4, validated using the Counter Analysis Toolkit. These changes have been tested on the AMD Zen4 architecture.
2023-06-13 Daniel Barry * src/counter_analysis_toolkit/main.c: cat: fix bug in data cache benchmarks Previously, there were no default values given for the PTS_PER_LX and LX_SPLIT parameters in the ".cat_cfg" file. This caused a floating-point exception in the data cache benchmarks. These parameters now have valid default values, even if they are not specified by the user in the ".cat_cfg" file. These changes have been tested on the AMD Zen4 architecture. 2023-06-12 Daniel Barry * PAPI_FAQ.html, README.md, src/components/sde/README.md: remove references to Bitbucket Removed some remaining links and references to the former Bitbucket repository and replaced them with the GitHub repository. Wed Jun 7 00:34:30 2023 -0700 Stephane Eranian * src/libpfm4/lib/events/intel_icl_events.h, src/libpfm4/lib/events/intel_spr_events.h: libpfm4: update to commit 70b5b4c Original commit: commit 70b5b4c82912471b43c7ddf0d1e450c4e0ef477e add default umask for ICL/SPR br_inst_retired/br_misp_retired Were missing a default umask unlike SKL. That was causing errors when passing these events with no umask. Default is umask ALL_BRANCHES 2023-06-07 Daniel Barry * src/papi_events.csv: add branch presets for Zen3 and Zen4 These changes include all branching preset events for Zen3 and Zen4, validated using the Counter Analysis Toolkit. For Zen3, PAPI_BR_TKN was modified to exclude unconditional branches taken, in order to adhere to the preset's meaning. These changes have been tested on the AMD Zen3 and Zen4 architectures. 2023-04-07 Giuseppe Congiu * src/genpapifdef.c: genpapifdef.c: delete file 2023-04-05 Giuseppe Congiu * src/configure, src/configure.in, src/maint/genpapifdef.pl: maint/genpapifdef.pl: replacement perl script for genpapifdef.c Add genpapifdef.pl script in maint directory and hook it to configure. 
2023-06-05 G-Ragghianti * .github/workflows/ci.sh: changed cuda requirement

2023-03-27 G-Ragghianti * .github/workflows/ci.sh, .github/workflows/main.yml: CI: creating CI files

2023-04-03 John Linford * src/papi_events.csv: Update Neoverse V2 events Add/remove PAPI events to match available hardware counters All tests pass on NVIDIA Grace Disclaimer: The PAPI team was not able to verify the functionality included in this commit.

Wed May 17 00:34:35 2023 -0700 Stephane Eranian * src/libpfm4/lib/pfmlib_common.c: libpfm4: update to commit 533633a Original commit: commit 533633adf7d00bbfcb7f2759567869d585bf97e1 remove unused variable in pfmlib_pmu_validate_encoding() The n variable was set and incremented but the result was never used, so remove it.

2023-05-16 Giuseppe Congiu * src/Makefile.in, src/Makefile.inc, src/configure, src/configure.in: buildsystem: fix install target in Makefile PR #464 introduced a --disable-fortran flag that allows users to disable fortran header and wrapper generation in case the user does not need them. Commit 40b7afc also introduced a bug, as the fortran headers no longer generated by configure are still part of the install target. This patch fixes the problem.

2023-05-16 Anthony Danalis * src/components/intel_gpu/tests/gpu_query_gemm.cc: intel_gpu: fix test

2023-04-03 Giuseppe Congiu * src/components/Makefile_comp_tests.target.in, src/components/sde/tests/Advanced_C+FORTRAN/sde_symbols.c, src/components/sde/tests/Makefile, src/components/sysdetect/tests/Makefile, src/configure, src/configure.in, src/ftests/Makefile, src/ftests/Makefile.target.in, src/testlib/Makefile, src/testlib/Makefile.target.in: fort: do not compile fortran code if disabled

* src/Makefile.in, src/Makefile.inc, src/configure, src/configure.in: fort: add --disable-fortran switch

2023-05-11 Giuseppe Congiu * src/components/rocm_smi/rocs.c: rocm_smi: fix bug in get_ntv_events_count Unchecked rsmi return error codes could lead to errors in the component.
Make sure all rsmi error codes are checked and handled appropriately.

* src/linux-memory.c: memory: fix bug in generic_get_memory_info The check should reject levels greater than PAPI_MAX_MEM_HIERARCHY_LEVELS, not levels greater than or equal to it.

2023-03-31 Giuseppe Congiu * src/components/sysdetect/Rules.sysdetect, src/components/sysdetect/amd_gpu.c: sysdetect: fix rocm and rocm_smi dlopen logic

Tue Mar 28 16:48:58 2023 -0700 Stephane Eranian * src/libpfm4/README, src/libpfm4/config.mk, src/libpfm4/debian/changelog, src/libpfm4/docs/Makefile, src/libpfm4/docs/man3/libpfm_intel_emr.3, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/pfmlib_amd64.c, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_spr.c, src/libpfm4/lib/pfmlib_perf_event_pmu.c, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_x86.c: libpfm4: update to commit 52632c7 Original commits:

commit 52632c7ffe3b088846e86ced207e38dfe5bc4731 add Intel EmeraldRapid core PMU support Intel EmeraldRapid shares the same PMU as Intel SapphireRapid. Add an emr:: PMU sharing the same event table.

commit 1befa3d200cc17d5a278fcb2f597c4876c58f949 fix AMD Zen3/Zen4 detection To cover more models of Zen4.

commit 8ea5575b6b10a91f3d7a079ca35d6e4eb33f379d Fix uninitialized variable in gen_tracepoint_table() Need to ensure that p was initialized at the start of function gen_tracepoint_table, otherwise on some architectures such as s390x you will get the following error when compiling with -Werror:
make[1]: Entering directory '/root/rpmbuild/BUILD/libpfm-4.13.0/lib'
cc -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=z14 -mtune=z15 -fasynchronous-unwind-tables -fstack-clash-protection -g -Wall -Werror -Wextra -Wno-unused-parameter -I.
-I/root/rpmbuild/BUILD/libpfm-4.13.0/lib/../include -DCONFIG_PFMLIB_DEBUG -DCONFIG_PFMLIB_OS_LINUX -D_REENTRANT -I. -fvisibility=hidden -DCONFIG_PFMLIB_ARCH_S390X -I. -c pfmlib_perf_event_pmu.c
pfmlib_perf_event_pmu.c: In function 'gen_tracepoint_table':
pfmlib_perf_event_pmu.c:434:35: error: 'p' may be used uninitialized in this function [-Werror=maybe-uninitialized]
434 | p->modmsk = 0;
| ~~~~~~~~~~^~~
cc1: all warnings being treated as errors

commit 72709ed4237e9259348080e05ffd7750ee202506 fix active list ordering issue In commit 363825f72afd ("maintain list of active PMUs") we introduced an active list of PMUs to speed up lookups of events. However, there was a bug introduced by this commit which caused wrong encodings of certain events. For instance, on Intel x86, the event unhalted_reference_cycles was encoded using an architected event code instead of the model specific event code which used a fixed counter. This was due to the fact that the ordering of the PMU models in pfmlib_pmus[] was inverted by the way we built the active list. The order in the table matters for lookups: the list must maintain the same order. This patch fixes the problem by rewriting the linked list code to support appending to the tail of the list (instead of the head). That way the order in the table is maintained. The patch introduces the notion of a linked list node supporting a doubly linked list data structure with basic accessor functions.

commit aa31ca87eb00d0f74d5566fed9a7cf62c48e236a Revert "optimize active PMU list further" This reverts commit b009b1263098eec925bc2dba1760c70d8a46d4b8, because it makes it necessary to have a lock, as the head of the list may be changing at each encoding and that would cause issues with multiple parallel calls of the encoding entry points. This optimization will be redone with a thread local variable once we modify libpfm4 to depend on libpthread.
commit 80260a02ab805acfb702ee3eab9af82729f20c79 clear pfmlib_active_pmus_list on init and terminate Must clear on pfm_terminate() to avoid creating cycles in the list in case pfm_initialize() is called multiple times. Also clear in pfm_initialize() to make sure we start from a known situation.

commit 9c3d167fa6017836cb6e33004471cebd4d1bf0f6 fix active PMU list handling for LIBPFM_ENCODE_INACTIVE=1 There was an issue introduced by: 363825f72afd ("maintain list of active PMUs") where if a PMU is not detected because it is not exported by the OS, it would not be put on the active list when LIBPFM_ENCODE_INACTIVE=1, causing tests/validate failures on some uncore PMU events. Fix this by correctly handling the case where a PMU is not exported by the OS. Also check that a PMU is actually active in pfmlib_terminate() and pfm_get_pmu_by_type() again to handle LIBPFM_ENCODE_INACTIVE=1.

commit b009b1263098eec925bc2dba1760c70d8a46d4b8 optimize active PMU list further By moving the last PMU matched to the head of the list each time an event is found in pfmlib_parse_event().

commit 363825f72afde0e8cae2ecfd261a95d2bd0b3868 maintain list of active PMUs Given that the list of PMUs supported per architecture keeps growing, it is becoming expensive to iterate over each PMU looking for a match. The macro pfmlib_for_each_pmu() is iterating over active and inactive PMUs looking for active ones. Given that the number of inactive PMUs is always larger than the number of active, this was expensive. Fix this by creating a list of active PMUs and adding a new macro pfmlib_for_each_active_pmu(). We use a doubly linked list to allow further optimizations. As an example, on an X86 build the new iterator allowed a 10x reduction in iterations inside pfmlib_parse_event() for a core PMU event. When LIBPFM_ENCODE_INACTIVE is set to 1, then all PMUs supported by the architecture are put on the active list even when they are not detected. This simplifies the parsing loop.
commit b3e956879bc9499d5c3012f3f82ce31d2f169e5b Fix parsing of LIBPFM_ENCODE_INACTIVE Was not taking into account the value of the variable, unlike for LIBPFM_DEBUG and LIBPFM_VERBOSE. Passing LIBPFM_ENCODE_INACTIVE=0 would still activate the feature.

commit 158c879b9408b84ecfc78c1385c81ce25a8f2cd1 Use relative path using openat(2) in gen_tracepoint_table() It doesn't need to traverse the filesystem hierarchy from the root. Instead it can use a relative pathname with openat() and pass it to fdopendir(). Actually it can introduce some kernel lock contention when it's invoked from multiple CPUs at the same time.

commit f200f50751557a1b9aef6120140bfb13d7cafe9f Define HAS_OPENAT for Linux Now I think all major Linux distros provide openat() functions in libc as it's specified in POSIX.1-2008. Maybe we could add a config check to detect them later if somebody's doesn't. Also remove the old code to undefine the macro unconditionally.

commit 3d77461cb966259c51f3b3e322564187f4bef7fb Update to version 4.13.0 Various updates and AMD Zen4 support.

2023-04-25 Daniel Barry * src/counter_analysis_toolkit/.cat_cfg, src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/hw_desc.h, src/counter_analysis_toolkit/main.c: cat: allow user to specify DCR/DCW sampling To provide more flexible benchmarks, we allow the user to specify the number of measurements for each level of the memory hierarchy. In addition, we include user-definable parameters to accommodate the different cache levels shared by different numbers of cores. These changes have been tested on the AMD Zen3 and Zen4 architectures.

2023-04-17 Daniel Barry * src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/timing_kernels.c: cat: non-core events in multithreaded benchmarks Each thread in the data-cache benchmarks previously added events to its local event set. This caused an error for events that are not from the perf_event (core) component.
To accommodate such events, we check for the component to which an event belongs. If the event does not belong to the core component, then only the thread with ID 0 adds the event to its event set. These changes have been tested on the AMD Zen3 architecture. 2023-03-16 Daniel Barry * src/counter_analysis_toolkit/Makefile, src/counter_analysis_toolkit/flops.c, src/counter_analysis_toolkit/flops.h, src/counter_analysis_toolkit/flops_aux.c, src/counter_analysis_toolkit/flops_aux.h: cat: refactor flops benchmark Vector-normalization and Cholesky decomposition kernels are sufficient for characterizing addition, subtraction, multiplication, division, and square root events. The Makefile has been updated to use -O1 for the FLOPs benchmark to enable the inclusion of scalar square root instructions. The output now includes problem size and expected number of each aforementioned operation in addition to counter readings. This refactored benchmark has a greatly reduced execution time. These changes have been tested on the AMD Zen3, Zen4, and Fujitsu A64FX architectures. 2023-04-06 Anthony Danalis * src/counter_analysis_toolkit/Makefile, src/counter_analysis_toolkit/README, src/counter_analysis_toolkit/driver.h, src/counter_analysis_toolkit/icache.c, src/counter_analysis_toolkit/instr.h, src/counter_analysis_toolkit/instructions.c, src/counter_analysis_toolkit/main.c: cat: addition of instructions benchmarks This new benchmark includes microkernels to detect integer, floating- point, and memory read and write instructions. These changes have been tested on the AMD Zen4 architecture. 
2023-04-06 Terry Cojean * src/sde_lib/sde_lib.c: SDE: Fix shutdown for a consistent global control struct

2023-03-29 AnustuvICL * src/components/cuda/linux-cuda.c: Fix wrong dlsym for cuptiDisableKernelReplayMode

2023-03-20 John Linford * src/components/sysdetect/arm_cpu_utils.c, src/papi_events.csv: Add minimal events for Arm Neoverse V2

* src/papi_events.csv: Add minimal events for Arm Neoverse N2

* src/components/sysdetect/arm_cpu_utils.c, src/papi_events.csv: Add minimal events for Arm Neoverse V1

2025-06-25 Treece Burgess * doc/Doxyfile-common, papi.spec, src/Makefile.in, src/configure.in, src/papi.h: The version numbers for doc/Doxyfile-common, papi.spec, src/Makefile.in, src/configure.in, and src/papi.h have been updated.

2025-06-13 Heike Jagode * RELEASENOTES.txt: Prepared Release Notes for PAPI 7.2.0 release.

2025-06-17 Treece Burgess * src/components/rocm/tests/sample_overflow_monitoring.cpp: rocm: Skip the test sample_overflow_monitoring.cpp.

2025-06-20 Anthony Danalis * src/components/rocp_sdk/sdk_class.cpp: ROCP_SDK: Ensure env variables are always respected.

* src/components/rocp_sdk/rocp_sdk.c: ROCP_SDK: Improve the file/dir check to skip "." and ".."

* src/components/rocp_sdk/rocp_sdk.c: ROCP_SDK: use path instead of hsa to test for devices.

2025-06-16 Daniel Barry * .../rocm/tests/multi_thread_monitoring.cpp: rocm: fix segmentation fault in component test On Frontier, the invocation of the exit() call before pthread_merge() causes a segmentation fault. I remedy this issue by only calling test_warn(), test_fail(), and hip_test_fail() after the threads have been merged. These changes were tested using ROCm versions 6.1.3, 6.2.0, 6.2.4, 6.3.1, and 6.4.0 with the AMD MI250X architecture on the Frontier supercomputer.
2025-06-12 Gerald Ragghianti * src/components/rocm/tests/Makefile, src/components/rocm_smi/tests/Makefile: rocm/rocm_smi: Allow users to optionally set HIPCC. 2025-06-10 Treece Burgess * src/components/cuda/linux-cuda.c, src/components/rocm/rocm.c, src/components/template/template.c: cuda/rocm components: Restructure update_native_events to not call realloc on a size of 0. 2025-06-11 Treece Burgess * src/configure, src/configure.in: configure: Add a warning message if rocm and rocp_sdk are configured together. 2025-06-04 Treece Burgess * src/components/rapl/linux-rapl.c: RAPL Component: Add support in RAPL for Intel Emerald Rapids. Note at this time the PAPI team does not have access to a machine with an Intel Emerald Rapids CPU to verify this addition. 2025-06-12 Treece Burgess * src/components/rocp_sdk/tests/Makefile: rocp_sdk: In the tests Makefile account for CPU agents on amd64. 2025-06-12 Daniel Barry * src/components/intel_gpu/README.md: intel_gpu: update environment variable name On a system containing the Intel Arc A770 device, I am met with the following warning: ZET_ENABLE_API_TRACING_EXP is deprecated. Use ZE_ENABLE_TRACING_LAYER instead. The current README states to set ZET_ENABLE_API_TRACING_EXP; however, ZE_ENABLE_TRACING_LAYER is the correct variable to set. Setting ZE_ENABLE_TRACING_LAYER prevents the above warning. 2025-06-10 Daniel Barry * src/components/cuda/linux-cuda.c, src/papi.c, src/papi_internal.c, src/papi_preset.c, src/papi_vector.c, src/papi_vector.h: framework: force init per existing policy PR #284 introduced code that always forced the initialization of all components. However, this defeats the purpose of having PAPI_EDELAY_INIT. The changes in this pull request only force initialization of components when necessary. These changes have been tested on systems containing: - NVIDIA Hopper architecture - AMD Zen3 CPU and AMD MI250X GPU architectures (Frontier). 
2025-06-07 Treece Burgess * src/components/intel_gpu/Rules.intel_gpu: intel_gpu: Remove -DDEBUG from Rules.intel_gpu.

2025-06-06 Treece Burgess * src/components/rocm/README.md: rocm: Update the component README.md to account for new limitations.

2025-06-09 Treece Burgess * src/components/sysdetect/sysdetect.c: sysdetect: Add newline characters to the SUBDBG messages.

2025-06-06 Anthony * src/components/rocm/rocm.c: ROCM: PAPI_strerror() cannot be used at shutdown.

2025-06-05 Treece Burgess * src/papi.c: PAPI_list_events: Update the function's documentation to match the function prototype.

2025-06-04 Anthony Danalis * src/components/rocp_sdk/rocp_sdk.c: ROCP_SDK: Handle case where all events are removed.

2025-06-03 Treece Burgess * src/components/rocp_sdk/rocp_sdk.c: rocp_sdk: Remove assignment of info->event_code and info->component_index in rocp_sdk as it is already done in papi_internal.c.

2025-06-02 Treece Burgess * src/papi_events.csv: PAPI Presets: Update AMD Family 17h to account for PMCx080 and PMCx081 reporting incorrect IC accesses and misses respectively. PMCx060 unit mask 0x10 replaces PMCx081, but there is no suitable replacement for PMCx080, therefore those instances are removed.

2025-05-29 Treece Burgess * src/components/coretemp/linux-coretemp.c, src/components/cuda/cupti_profiler.c, src/components/cuda/cupti_utils.c, src/components/cuda/htable.h, src/components/cuda/linux-cuda.c, src/components/cuda/papi_cupti_common.c, src/components/infiniband/linux-infiniband.c, src/components/net/linux-net.c, src/components/rocm/htable.h, src/components/rocm_smi/htable.h, src/components/rocm_smi/rocs.c: Various Components: Use only PAPI memory allocation or C memory allocation to avoid possible segmentation faults.
2025-06-02 Treece Burgess * src/components/rocm_smi/linux-rocm-smi.c, src/components/rocp_sdk/rocp_sdk.c, src/components/template/template.c: rocm_smi/rocp_sdk: Restructure init_private functions to avoid setting initialized equal to 1 even when initialization fails.

2025-05-29 Treece Burgess * src/components/infiniband/linux-infiniband.c, src/components/nvml/linux-nvml.c, src/components/sysdetect/sysdetect.c, src/components/topdown/topdown.c, src/components/topdown/topdown.h: Sysdetect/Topdown/Infiniband/NVML Components: Properly set .size in a component's vector to avoid a possible "Error! PAPI_library_init".

2025-05-27 Treece Burgess * src/components/lmsensors/tests/lmsensors_read.c: lmsensors component: Remove restriction on the events chosen to be added to an eventset for the test lmsensors_read.c.

Thu Sep 19 23:41:22 2024 -0700 Stephane Eranian * src/libpfm4/docs/man3/libpfm_intel_knl.3, src/libpfm4/docs/man3/libpfm_intel_knm.3, src/libpfm4/lib/events/amd64_events_fam1ah_zen5.h, src/libpfm4/lib/pfmlib_arm.c, src/libpfm4/lib/pfmlib_arm_armv6.c, src/libpfm4/lib/pfmlib_arm_armv7_pmuv1.c, src/libpfm4/lib/pfmlib_arm_armv8.c, src/libpfm4/lib/pfmlib_arm_armv8_kunpeng_unc.c, src/libpfm4/lib/pfmlib_arm_armv8_thunderx2_unc.c, src/libpfm4/lib/pfmlib_arm_armv9.c, src/libpfm4/lib/pfmlib_arm_perf_event.c, src/libpfm4/lib/pfmlib_arm_priv.h, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_x86_perf_event.c, src/libpfm4/lib/pfmlib_perf_event.c, src/libpfm4/lib/pfmlib_perf_event_pmu.c, src/libpfm4/lib/pfmlib_perf_event_priv.h, src/libpfm4/lib/pfmlib_priv.h: Update libpfm4, current with commit 0727e5f5561101d8c635a36e139dd7512616d49e

add another perf_name for ARM Cortex-A57 PAPI developers with an NVIDIA Jetson board and ARM Cortex-A57 reported that the Linux PMU type is "armv8_pmuv3". Add that name as a possible name to the list of perf_name for Cortex A57.
commit 75d2e605f763f3220793c3bb52a6b6effffe4d9c fix AMD Zen5 umasks for L2_PREFETCH_MISS_L3 and L2_FILL_RESPONSE_SRC The umasks tables were swapped between the two events. Simplify umasks names for L2_FILL_RESPONSE_SRC commit c5587f9931123be6fcb6f8133497d93cab36bdcd Hotfix ARM CPU detection due to arch mismatch This is a hotfix to avoid failure of ARM CPU detection with the new detection code introduce by commit 15c4cd9f1f4a ("Add ARM hybrid detection"). For some processors, the architecture revision expected by libpfm4 does not match the revision exported by the Linux kernel via cpuinfo. For instance, the Neoverse V2 is a V9 processor, yet cpuinfo reports arch: 8. A few other ARM processors may exhibit the same error. The hotfix simply skips checking the arch revision for now. commit b2888ea7995d781d1c59d9c8714487b863774912 Cope with empty /proc/cpuinfo file When running inside e.g. lxc containers, /proc/cpuinfo may be empty, in which case pfmlib_getl() never allocates a buffer, and the trailing b[i] = '\0' thus becomes bogus. commit f09c366b45fba75f1143cb14ec8f22ad96c4c1b1 Merge: e887d24 8ca3087 Merge /u/mousezhang/perfmon2/ branch master into master https://sourceforge.net/p/perfmon2/libpfm4/merge- requests/32/ commit e887d24a6c4b97b8087e5a284c79f63adaab4fc0 Add sysfs PMU caching on initialization In order to accommodate the growing number of PMUs active and to handle hybrid processors better, this patch adds sysfs PMU perf_events information caching to avoid going back to sysfs for each encoded event. The caching stores the name of PMU, e.g., armv8_pmu3, and the perf_events type which is then use to build the perf_events encoding. commit ff3291fe3f6d2c280ed2e33c42842e5dc08f38df Remove references to /sys/devices to remain compatible with upstream The PMUs will not appear in /sys/devices for much longer. The proper way to access PMU directories is via: /sys/bus/event_source/devices/ Where each PMU has a symlink. 
It should be noted that this alternate directory is not new. It has been there all along. Therefore it is okay to remove all references to /sys/devices. commit a41f8eeedf2c81232e5fa9129928edf9215bf3fc Add ARM hybrid encoding support for perf_events This patch adds the new logic to handle encoding of the PMU type for the Linux perf_events interface. Hybrids are a challenge in that it is not possible to simply use PERF_TYPE_RAW because that does not disambiguate which of the core PMU models to attach the events to. Instead, the PMU type must be collected from the Linux sysfs interface. But for that to happen the library needs to know the PMU instance name assigned by perf_events for each PMU model detected. On ARM, this is not straightforward. The patch extends the meaning the the pmu->perf_name string to include a comma separated list of names instead of just one. The library then tries each name until there is a match in /sys/bus/event_source/devices/. This accommodates situations where the same PMU model is used in a homogeneous vs. hybrid config. commit 15c4cd9f1f4a382ef6753a05a5d4d6c27bd449c5 Add ARM hybrid detection This patch rewrites the ARM core PMU detection logic to handle the case of hybrid processors. On ARM, there can be many different cores in the same SoC. Each potentially shows up with a different implementer, part, variant. That means just looking at the first entry in cpuinfo on Linux is not enough to activate all supported event tables. The new code parses the entire cpuinfo once and detects each unique core identifiers. Then, for each core PMU table, the detection code checks against that pre-built list of detected core models. That way up to N (currently 8) different core models can be detected. This new detection code is provided for Linux. For other operating systems, new code must be added to get the implementer, part, variant codes for all cores in the system. Thanks to Vince Weaver for providing the test cases to exercise this new code. 
Testing: AMD Zen5 Update (Tested on a AMD Ryzen 9 9950X 16-Core Processor): - papi_avail - runs successfully and matches master branch - papi_component_avail - runs successfully and matches master branch - papi_native_avail - runs successfully and matches master branch - papi_command_line - runs successfully I verified that with papi_native_avail we see the swapped umasks for L2_PREFETCH_MISS_L3 and L2_FILL_RESPONSE_SRC. Using the swapped umasks with papi_command_line work as expected. ARM Updates (Tested on ARM Cortex A57, ARM Cortex A72, and ARM Neoverse V2): - papi_avail - runs successfully on all three models and matches master branch - papi_component_avail - runs successfully on all three models and matches master branch - papi_native_avail - runs successfully on all three models and matches master branch - papi_command_line - runs successfully on all three models Note that for the ARM updates, this includes Vince's patch to resolve Issue #364. 2025-05-23 Treece Burgess * src/components/lmsensors/linux-lmsensors.c: lmsensors component: Replace fprintf with SUBDBG. 2025-05-19 Treece Burgess * src/components/cuda/linux-cuda.c: Cuda component: Initialize count variable in function cuda_init_private. 2025-05-23 Anthony Danalis * src/components/rocp_sdk/sdk_class.cpp: ROCP_SDK: More verbose debug messages. * src/components/rocp_sdk/sdk_class.cpp: ROCP_SDK: Do not overwrite library in PAPI_ROCP_SDK_LIB. * src/components/rocp_sdk/sdk_class.cpp: ROCP_SDK: Cleanup dlopen() error handling. 2025-05-21 Daniel Barry * src/papi_preset.c: framework: proper memory management functions This makes the usage of memory allocation and freeing functions consistent to prevent segmentation faults when using preset events. These changes were tested on the ARM Neoverse-V2 and NVIDIA Hopper architectures. 2025-05-20 Anthony * src/components/rocm_smi/tests/Makefile: ROCM_SMI: Added -pthread flag in tests/Makefile. 
2025-05-20  Treece Burgess

  * src/utils/print_header.c: utils/print_header.c: Move the for-loop
    counter declaration out of the for-loop header.

2025-05-18  G-Ragghianti

  * src/components/rocp_sdk/rocp_sdk.c: Add multiple-search-path
    functionality for libhsa.

2025-05-16  Treece Burgess

  * src/components/README: Remove perfctr and perfctr_ppc documentation
    from the src/components README.

2025-05-13  Daniel Barry

  * src/utils/papi_avail.c: utils: fix compiler warnings for
    papi_avail.c
    Revert the structure of the printf() statements to those prior to
    commit c214d8ca879ba5195d7cae1d8808e807ea2f812c, which
    inappropriately modified certain fields. This resulted in the
    following compiler warnings from GCC 13.3.0 (architecture: AMD
    Ryzen 9 9950X 16-Core CPU and NVIDIA GeForce RTX 5080 GPU):

      papi_avail.c: In function ‘main’:
      papi_avail.c:573:17: warning: too many arguments for format
      [-Wformat-extra-args]
        573 | printf( "%-*s%-11s%-8s%-16s\n |Long Description|\n", maxSymLen,
            |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      papi_avail.c:687:21: warning: too many arguments for format
      [-Wformat-extra-args]
        687 | printf( "%-*s%-11s%-8s%-16s\n |Long Description|\n", maxCompSymLen,
            |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    These changes have been tested on systems containing the NVIDIA
    Hopper and Blackwell architectures.
  * src/utils/papi_avail.c: utils: convert tabs in papi_avail.c to
    spaces
    These changes have been tested on systems containing the NVIDIA
    Hopper and Blackwell architectures.
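As background for the -Wformat-extra-args warnings quoted above: the "%-*s" conversion consumes two arguments (a field width, then a string), so a format string that lost a conversion while the argument list kept its arguments triggers exactly this warning. A minimal, self-contained illustration of the two-argument behavior (the helper name is invented for this sketch, it is not PAPI code):

```c
#include <stdio.h>

/* "%-*s" consumes TWO arguments: the field width (an int) and the
 * string, left-justified and padded to that width. If a conversion is
 * removed from the format while its arguments remain in the call, GCC
 * reports "too many arguments for format" [-Wformat-extra-args]. */
static int fmt_symbol(char *out, size_t n, int width, const char *sym)
{
    return snprintf(out, n, "%-*s|", width, sym);
}
```

For example, fmt_symbol(buf, sizeof(buf), 14, "PAPI_TOT_CYC") produces "PAPI_TOT_CYC  |" (the 12-character name padded to 14 columns).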
2025-05-14  Anthony Danalis

  * src/papi_memory.c, src/papi_memory.h: HEADERS: __FILE__ is
    "const char *", not "char *".
  * src/components/rocp_sdk/sdk_class.hpp: ROCP_SDK: Protect the
    included PAPI headers from C++.

2025-05-13  Treece Burgess

  * src/components/cuda/tests/HelloWorld.cu,
    src/components/cuda/tests/HelloWorld_noCuCtx.cu,
    src/components/cuda/tests/concurrent_profiling.cu,
    .../cuda/tests/concurrent_profiling_noCuCtx.cu,
    src/components/cuda/tests/cudaOpenMP.cu,
    src/components/cuda/tests/cudaOpenMP_noCuCtx.cu,
    src/components/cuda/tests/pthreads.cu,
    src/components/cuda/tests/pthreads_noCuCtx.cu,
    src/components/cuda/tests/runtest.sh,
    src/components/cuda/tests/simpleMultiGPU.cu,
    .../cuda/tests/simpleMultiGPU_noCuCtx.cu,
    .../cuda/tests/test_2thr_1gpu_not_allowed.cu,
    .../cuda/tests/test_multi_read_and_reset.cu: Cuda component: Update
    tests to more gracefully handle multiple-pass events.

2025-05-13  Anthony Danalis

  * src/components/rocp_sdk/README.md: ROCP_SDK: Update the README with
    linking limitations.

2025-05-12  Treece Burgess

  * src/components/appio/appio.c: Appio Component: Add a component
    description, as it is missing from papi_component_avail.
  * src/components/rocm/roc_profiler.c: ROCm component: Bug fix for a
    typo in rocm_verify_no_repeated_qualifiers.

2025-05-10  Treece Burgess

  * src/configure, src/configure.in: Update configure.in to have a
    default value for --with-debug if not provided by the user.

2025-05-06  Treece Burgess

  * src/configure, src/configure.in: Configure: Correctly output the
    tests chosen by the user with --with-tests.

2025-05-08  Treece Burgess

  * src/components/cuda/cupti_dispatch.c: Cuda component: Properly set
    the return value in cuptid_init.

2024-11-07  Daniel Barry

  * src/utils/papi_avail.c: utils: papi_avail extension for component
    presets
    Enumerate presets for components as well as the CPU. These changes
    have been tested on the NVIDIA Grace-Hopper architecture.
  * src/utils/papi_avail.c: utils: new modifiers for strictly CPU
    presets
    Replace modifiers with only those that enumerate the CPU preset
    events. These changes have been tested on the NVIDIA Grace-Hopper
    architecture.

2024-10-31  Daniel Barry

  * src/utils/papi_avail.c: utils: convert tabs to spaces in
    papi_avail.c
    This is an aesthetic change to improve the development process.
    These changes have been tested on the NVIDIA Grace-Hopper
    architecture.
  * src/papi.c, src/papi.h, src/papi_common_strings.h,
    src/papi_internal.c, src/papi_internal.h, src/papi_preset.c,
    src/papi_preset.h: framework: support for component presets
    Updates to the framework to facilitate preset events defined by
    native events of non-perf_event components. These changes have been
    tested on the NVIDIA Hopper architecture.
  * src/Makefile.inc, src/configure, src/configure.in,
    src/papiStdEventDefs.h: config: updates for component presets
    Update configure to track both the number of presets per component
    and the arrays of presets belonging to each component. These
    changes have been tested on the NVIDIA Hopper architecture.

2024-10-28  Daniel Barry

  * src/papi_events.csv: presets: support for NVIDIA Hopper and Ampere
  * src/components/cuda/cupti_dispatch.c,
    src/components/cuda/cupti_dispatch.h,
    src/components/cuda/cupti_profiler.c,
    src/components/cuda/linux-cuda.c,
    src/components/cuda/papi_cuda_presets.h,
    src/components/cuda/papi_cuda_std_event_defs.h,
    src/components/cuda/papi_cupti_common.c,
    src/components/cuda/papi_cupti_common.h: cuda: updates for presets
    Add functions to facilitate CUDA presets. These changes have been
    tested on the NVIDIA Hopper architecture.

2024-10-31  Daniel Barry

  * src/papi_vector.c, src/papi_vector.h: framework: fields for
    component presets
    Create fields in the vector struct for components to define
    presets. These changes have been tested on the NVIDIA Hopper
    architecture.
2024-12-20  Dandan Zhang

  * src/linux-context.h, src/linux-timer.c, src/mb.h: Add loongarch64
    support.

2025-05-06  Treece Burgess

  * src/components/cuda/cupti_profiler.c,
    src/components/rocm/roc_profiler.c: ROCm component: Add stricter
    qualifier checks.
  * src/components/cuda/cupti_profiler.c: Cuda component: Add stricter
    qualifier checks.

2025-05-05  Treece Burgess

  * src/components/coretemp/linux-coretemp.c: Coretemp: Enable support
    for multiplexing.

2025-05-06  Anthony Danalis

  * src/components/rocp_sdk/sdk_class.cpp: ROCP_SDK: Improved handling
    of pathological paths.

2025-05-03  Treece Burgess

  * src/components/cuda/cupti_profiler.c: Cuda component: Replace int
    typing with long long to avoid overflow with measured values.

2025-05-06  Anthony Danalis

  * src/components/rocp_sdk/sdk_class.cpp,
    src/components/rocp_sdk/sdk_class.hpp: ROCP_SDK: Force a failure if
    PAPI_ROCP_SDK_LIB is bogus.

2025-05-05  Anthony Danalis

  * src/components/rocp_sdk/rocp_sdk.c: ROCP_SDK: Suppress
    ROCprofiler-SDK warnings.
  * src/components/rocp_sdk/sdk_class.cpp: ROCP_SDK: Enable default
    dlopen() paths, and cleaner error handling.
  * src/components/rocp_sdk/rocp_sdk.c: ROCP_SDK: Move dlclose() to
    component finalization. This avoids a conflict between the two
    components rocp_sdk and rocm, if both components are configured in.

2025-04-29  Anthony Danalis

  * src/components/rocp_sdk/rocp_sdk.c,
    src/components/rocp_sdk/sdk_class.cpp: ROCP_SDK: Call
    configure_device_counting_service as early as possible. When
    applications are linked against libpapi.a, rocprofiler_configure()
    is not called on load, so we have to explicitly initialize
    everything. This PR moves some of the necessary steps earlier, so
    that everything is initialized after PAPI_library_init().

2025-04-24  Treece Burgess

  * .../cuda/tests/test_multipass_event_fail.cu: Cuda component: Update
    the error checks in the test test_multipass_event_fail to PASS even
    when events that do not require multiple passes are provided.

2025-04-30  Treece Burgess

  * src/components/cuda/cupti_config.h,
    src/components/cuda/cupti_dispatch.c,
    src/components/cuda/cupti_dispatch.h,
    src/components/cuda/cupti_events.c,
    src/components/cuda/cupti_profiler.c,
    src/components/cuda/linux-cuda.c,
    src/components/cuda/papi_cupti_common.c,
    src/components/cuda/papi_cupti_common.h, src/papi.h,
    src/papi_internal.c, src/utils/papi_component_avail.c: Cuda
    component: Add functionality for a partially disabled Cuda
    component for CCs >= 7.0 (Perfworks API).

2025-04-28  Dong Jun Woun

  * src/components/rocm_smi/rocs.c: rocm_smi: Add proper fan_speed
    access, control, and return.

2025-04-29  Treece Burgess

  * src/components/cuda/cupti_profiler.c,
    src/components/cuda/papi_cupti_common.c,
    src/components/cuda/papi_cupti_common.h,
    src/components/nvml/Rules.nvml, src/components/nvml/linux-nvml.c:
    Cuda/NVML Components: Check for variations of shared objects, e.g.
    libcudart.so, libcudart.so.1, or libcudart (catch-all).

2025-04-28  Dong Jun Woun

  * .../rocm_smi/tests/rocm_smi_writeTests.cpp: rocm_smi: Update the
    read/write test.

2025-04-29  Treece-Burgess

  * src/components/perf_event/perf_event.c: perf_event: Disable the
    component if perf_event_paranoid is set to 4 in
    /proc/sys/kernel/perf_event_paranoid.
  * src/components/cuda/Rules.cuda,
    src/components/cuda/cupti_profiler.c,
    src/components/cuda/cupti_utils.h,
    src/components/cuda/lcuda_debug.h,
    src/components/cuda/linux-cuda.c,
    src/components/cuda/papi_cupti_common.c,
    src/components/cuda/papi_cupti_common.h,
    src/components/cuda/tests/concurrent_profiling.cu,
    .../cuda/tests/concurrent_profiling_noCuCtx.cu: Cuda component:
    Refactor to support the MetricsEvaluator API (Cuda versions 11.3
    and greater).
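The perf_event_paranoid check mentioned above amounts to reading one sysctl value before initialization. A minimal sketch of the idea (the function name is illustrative, not the component's actual code; level 4 is a hardened-kernel extension that fully disallows unprivileged perf_event_open()):

```c
#include <stdio.h>

/* Illustrative sketch: return 1 when /proc/sys/kernel/perf_event_paranoid
 * reports level 4 or higher, i.e. unprivileged perf_event_open() is
 * completely disallowed and the component should disable itself. */
static int paranoid_blocks_perf(void)
{
    FILE *f = fopen("/proc/sys/kernel/perf_event_paranoid", "r");
    int level = 0;

    if (!f)
        return 0;       /* sysctl absent: assume the kernel is permissive */
    if (fscanf(f, "%d", &level) != 1)
        level = 0;      /* unreadable value: do not disable */
    fclose(f);
    return level >= 4;
}
```

When this returns 1, the component can set its disabled_reason string instead of failing later inside perf_event_open().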
2025-04-23  Anthony Danalis

  * src/components/rocp_sdk/Rules.rocp_sdk,
    src/components/rocp_sdk/rocp_sdk.c,
    src/components/rocp_sdk/tests/Makefile,
    src/components/rocp_sdk/tests/advanced.c,
    src/components/rocp_sdk/tests/kernel.cpp,
    src/components/rocp_sdk/tests/simple.c,
    src/components/rocp_sdk/tests/simple_sampling.c,
    src/components/rocp_sdk/tests/two_eventsets.c: ROCP_SDK:
    Accommodate machines with fewer AMD GPUs.

2025-02-26  Yoshihiro Furudera

  * src/papi_events.csv: Remove some preset events for FUJITSU-MONAKA
    The following preset events of FUJITSU-MONAKA are not counted
    properly: PAPI_L3_DCM, PAPI_L3_TCM, PAPI_PRF_DM, PAPI_L3_DCH,
    PAPI_L3_TCH. Specifically, the native events that are the sources
    of the above preset events are counted inaccurately, so I removed
    these events from papi_events.csv.

2024-09-19  Akio Kakuno

  * src/components/sysdetect/arm_cpu_utils.c, src/papi_events.csv:
    papi_events.csv: Add preset events support for FUJITSU-MONAKA
    This commit adds preset events support for FUJITSU-MONAKA. It also
    updates arm_cpu_utils.c to show the processor name in the
    papi_hardware_avail command.

2025-04-20  Willow Cunningham

  * src/components/topdown/README.md,
    src/components/topdown/Rules.topdown,
    src/components/topdown/topdown.c, src/components/topdown/topdown.h:
    topdown: Use librseq to protect rdpmc on heterogeneous CPUs
    On Intel's heterogeneous multicore processors such as Raptor Lake,
    the PERF_METRICS MSR is only available on the performance cores
    (p-cores). If the rdpmc instruction is executed attempting to
    access the MSR while the process is on an efficient core (e-core),
    a segmentation fault occurs. Previously, the topdown component used
    a simple check before every execution of the rdpmc instruction to
    ensure the core the program is bound to is a p-core. However, this
    can fail if the program is moved to another core between the check
    and the execution of rdpmc. While rare, a worst-case-scenario test
    that repeatedly moves a program using the topdown component from
    p-core to e-core at a random time saw 338 segmentation faults out
    of 1 million affinity switches (a 0.0338% error rate). This is a
    non-zero number of segmentation faults, and we can do better. Use
    librseq to protect the rdpmc instruction with a restartable
    sequence (rseq). When the process is preempted by an affinity
    change, the sequence immediately aborts and can be restarted. By
    keeping the check that the process is on a p-core and the rdpmc
    instruction itself within the critical section of the rseq, it is
    guaranteed that the rdpmc instruction will never be executed on an
    invalid core. The same test described above sees 0 segmentation
    faults.

2025-04-22  Dong Jun Woun

  * src/components/cuda/cupti_profiler.c: cuda: Add the stat|device
    case to code_to_info.

2025-04-22  Anthony

  * papi.spec: .SPEC: Logic for setting rocm_smi env. variables.

2025-04-18  Daniel Barry

  * src/components/cuda/cupti_profiler.c,
    src/components/rocm/roc_profiler.c,
    src/components/rocm_smi/rocs.c,
    src/components/template/vendor_profiler_v1.c: components: improper
    usage of the PAPI_END macro
    PAPI_END is a macro defined in papiStdEventDefs.h to denote the end
    of the list of preset macros. However, it was being used as an
    error code in various components, in cases unrelated to the number
    of presets. This commit changes this to a more appropriate error
    code: PAPI_ENOEVNT. These changes have been tested with ROCm 6.3.1
    on Frontier.

2024-08-21  Daniel Barry

  * src/components/rocm/roc_common.c, src/components/rocm/rocm.c: rocm:
    add reason for disabled component
    Previously, in the absence of a ROCm device, the rocm component did
    not set the string containing the reason that the component was
    disabled. These changes have been tested with ROCm 6.3.1 on
    Frontier and with ROCm 6.3.2 on a system with no ROCm devices.
2025-04-18  Daniel Barry

  * src/components/rocm_smi/tests/Makefile: rocm_smi: updates to
    Makefile
    The rocm_smi component tests were not getting compiled during the
    build process. These updates point to the proper location of
    'hipcc' and automatically build the component tests. The 'square'
    test was removed because its source file is missing. These changes
    have been tested with ROCm 6.3.1 on Frontier.

2025-04-17  Treece Burgess

  * src/components/cuda/cupti_profiler.c: For the stats qualifier,
    check for excess characters.

2025-01-08  Treece Burgess

  * src/components/cuda/cupti_profiler.c,
    src/components/rocm/roc_profiler.c: Add a check when parsing event
    qualifiers to make sure no excess characters are appended.

2024-12-13  Treece Burgess

  * src/components/rapl/linux-rapl.c: Add support for Intel Comet Lake
    S/H in the RAPL component.

2025-04-16  voidbert

  * src/components/perf_event_uncore/perf_event_uncore.c:
    perf_event_uncore: fix compilation when CAP_PERFMON is missing

2024-11-04  Daniel Barry

  * src/counter_analysis_toolkit/Makefile,
    src/counter_analysis_toolkit/cat_arch.h,
    src/counter_analysis_toolkit/vec.c,
    src/counter_analysis_toolkit/vec_fma_dp.c,
    src/counter_analysis_toolkit/vec_fma_hp.c,
    src/counter_analysis_toolkit/vec_fma_sp.c,
    src/counter_analysis_toolkit/vec_nonfma_dp.c,
    src/counter_analysis_toolkit/vec_nonfma_hp.c,
    src/counter_analysis_toolkit/vec_nonfma_sp.c,
    src/counter_analysis_toolkit/vec_scalar_verify.c,
    src/counter_analysis_toolkit/vec_scalar_verify.h: cat: updates in
    vector-FLOPs benchmarks
    Include kernels that perform scalar floating-point operations.
    These changes have been tested on the Intel Sapphire Rapids and IBM
    POWER10 architectures.

2025-01-22  William Cohen

  * src/high-level/papi_hl.c, src/papi_vector.c: Eliminate conflicting
    type errors generated by GCC 15
    Recent PAPI compiles on Fedora Rawhide (F42) fail because of
    "conflicting types" errors produced by GCC 15. Proper argument
    types have been added to the _internal_hl_read_user_events function
    declaration in papi_hl.c and to the typecasting in papi_vector.c.

2024-11-09  Dong Jun Woun

  * src/components/cuda/README_internal.md,
    src/components/cuda/cupti_dispatch.c,
    src/components/cuda/cupti_dispatch.h,
    src/components/cuda/cupti_events.c,
    src/components/cuda/cupti_events.h,
    src/components/cuda/cupti_profiler.c,
    src/components/cuda/cupti_profiler.h,
    src/components/cuda/cupti_utils.c,
    src/components/cuda/cupti_utils.h,
    src/components/cuda/linux-cuda.c,
    src/components/cuda/tests/runtest.sh: Cuda: Statistic Qualifier

2024-01-18  Evans, Richard Todd

  * src/components/rapl/linux-rapl.c: Added Sapphire Rapids (Model 143)
    support to the RAPL component.

2024-09-25  Willow Cunningham

  * src/papi_events.csv: papi_events.csv: Added preset events for the
    Arm Cortex-A72 processor. Because the A72 has the same events as
    the A57, this addition is a one-liner. This work is based on a
    patch by Stack Exchange user Bambo Wu, published in May 2021:
    https://raspberrypi.stackexchange.com/a/112396

2025-02-21  Willow Cunningham

  * src/papi_events.csv: papi_events.csv: Second pass at Arm Cortex-A76
    events
    The previous commit adding preset events for the Arm Cortex-A76
    lacked important preset events such as L3 cache misses. Add the
    missing events based on Arm documentation and validation tests. All
    tests pass or warn on a Raspberry Pi 5.

2024-10-11  Willow Cunningham

  * src/validation_tests/Makefile.recipies,
    src/validation_tests/load_store_testcode.c,
    src/validation_tests/papi_ld_ins.c,
    src/validation_tests/papi_sr_ins.c,
    src/validation_tests/testcode.h: validation_tests: Add load/store
    ARM assembly testcode
    The previous load/store validation tests were being optimized by
    the compiler in a way that caused the tests to mispredict the
    number of memory instructions generated. This made it appear as if
    the counters were incorrect, when it was really the test being
    inaccurate. To fix this, add assembly testcode for ARM to eliminate
    the problem of compiler optimizations. When load/store testcode is
    unavailable for the current platform, default back to the original
    matrix multiplication test.
  * src/papi_events.csv: papi_events: Add preset events for the Arm
    Cortex-A76.

2025-01-17  Willow Cunningham

  * src/components/topdown/topdown.c: topdown: simplified metrics
    calculation
    Previously, the topdown component calculated metrics by taking the
    difference of the metrics before and the metrics after the
    calipered code block, using Equation 1:

      M% = (Mb*Sb/255 - Ma*Sa/255) / (Sb - Sa) * 100    (1)

    where Mx are the raw bytes of the metric before and after the
    calipered code block and Sx are the slots. However, if Sa = 0 this
    simplifies to

      M% = Mb/255 * 100                                 (2)

    Therefore it is sufficient to simply reset the PERF_METRICS MSR and
    SLOTS during PAPI_start() and then use Equation 2 in PAPI_stop(),
    reducing the number of dangerous rdpmc calls, reducing overhead,
    and simplifying the code.

2025-01-07  Willow Cunningham

  * src/components/topdown/topdown.c: topdown: relocated core type
    checks
    To prevent programs using the topdown component on heterogeneous
    processors that only supply the PERF_METRICS MSR on some of their
    cores from segfaulting, due to trying to read the MSR after being
    moved to an unsupported core type, the topdown component
    periodically checks that it is on a supported core and exits if
    not. Previously, this check occurred at the start of PAPI_start and
    PAPI_stop. After writing a script that starts a program being
    calipered with the topdown component and moves it to an unsupported
    core after a random amount of time, for N=100,000 tests the
    heterogeneous checks failed to prevent a segmentation fault 0.08%
    of the time. This patch moves the heterogeneous checks to occur
    only directly before the rdpmc calls, resulting in cleaner code and
    a reduced segfault-prevention failure rate of 0.064%. While it is
    frustrating that the failure rate is non-zero, since there appears
    to be no way to tell a process to ignore changes to its affinity, I
    believe there is no perfect solution at this time.

2024-12-17  Willow Cunningham

  * src/components/topdown/topdown.c: topdown: stop including
    x86intrin.h
    Previously, the x86intrin.h header file had been included in order
    to provide a definition for _rdpmc(). However, this caused the
    GitHub Actions testing compilation of the component on ARM systems
    to fail. Therefore, remove the include and add a manual definition
    of _rdpmc() taken from the perf_event component.

2024-12-11  Willow Cunningham

  * src/components/topdown/README.md,
    src/components/topdown/topdown.c: topdown: Prevent segfault on
    heterogeneous CPUs
    All of Intel's heterogeneous CPUs that support the PERF_METRICS MSR
    only support it on their performance cores (p-cores). This means
    that if a program that is being measured using the topdown
    component in PAPI happens to be rescheduled to an e-core during its
    runtime, PAPI will segfault. To fix this, add a check in
    _topdown_start() and _topdown_stop() to exit gracefully if the core
    affinity of the process has changed to an unsupported core type.

2024-12-04  Willow Cunningham

  * src/components/topdown/topdown.c: topdown: add arch support based
    on perfmon-intel
    While the official Software Developer Manual only lists the
    availability of the PERF_METRICS MSR for three architectures, we
    can use the 'perfmon' repository maintained by Intel to discover
    which architectures support the MSR (repo here:
    https://github.com/intel/perfmon). Architectures for which the
    repository demonstrates support for the events
    'PERF_METRICS.BACKEND_BOUND', 'PERF_METRICS.FRONTEND_BOUND', etc.
    must support the topdown level 1 metrics of the PERF_METRICS MSR.
    Similarly, the presence of the events 'PERF_METRICS.FETCH_LATENCY',
    'PERF_METRICS.MEMORY_BOUND', etc. demonstrates support for topdown
    L2 metrics in the PERF_METRICS MSR. By cross-referencing the
    architecture names in the perfmon repository with their
    DisplayFamily/DisplayModel values in Table 2-1 of volume 4 of the
    IA32 SDM, we can add support for the following architectures:
      - Rocket Lake
      - Ice Lake (icl & icx)
      - Tiger Lake
      - Sapphire Rapids
      - Meteor Lake (Redwood Cove p-core only)
      - Alder Lake (Golden Cove p-core only)
      - Granite Rapids
      - Emerald Rapids
    None of these additional architectures have been tested with the
    topdown component yet. While Arrow Lake is shown to support L1 & L2
    metrics in the perfmon repository, its FamilyModel is not yet
    available in the IA32 SDM, so it has not been added.

2024-11-11  Willow Cunningham

  * src/components/topdown/README.md,
    src/components/topdown/Rules.topdown,
    src/components/topdown/tests/Makefile,
    src/components/topdown/tests/topdown_L1.c,
    src/components/topdown/tests/topdown_L2.c,
    src/components/topdown/tests/topdown_basic.c,
    src/components/topdown/topdown.c, src/components/topdown/topdown.h:
    topdown: Created a component for interfacing with Intel's
    PERF_METRICS MSR
    Add a component that collects Intel's topdown metrics from the
    PERF_METRICS MSR and automatically converts the raw metric values
    to user-consumable percentages. The intent of this component is to
    provide an intuitive interface for accessing topdown metrics on the
    supported processors. Tested on a Raptor Lake-S/HX machine
    (family/model/stepping 0x6/0xb7/0x1). To add other supported
    architectures, the switch statement in _topdown_init_component()
    should be populated with the architecture's model number, whether
    it supports level 2 topdown metrics, and, in the case of a
    heterogeneous processor, what core type it must be run on.
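The simplified metric calculation described in the topdown entries above (Equation 2) reduces to extracting one byte per metric from the 64-bit PERF_METRICS value and scaling it, since 255 represents 100% of the pipeline slots. A minimal sketch, assuming the one-metric-per-byte layout of PERF_METRICS; the helper name is illustrative, not the component's actual code:

```c
#include <stdint.h>

/* Illustrative sketch of Equation (2): after PAPI_start() resets the
 * PERF_METRICS MSR and SLOTS, each metric is just its raw byte scaled
 * to a percentage. metric_idx selects the byte (e.g. 0 = retiring,
 * 3 = backend bound for the level-1 metrics). */
static double metric_percent(uint64_t perf_metrics, int metric_idx)
{
    uint8_t raw = (uint8_t)(perf_metrics >> (metric_idx * 8));
    return (double)raw / 255.0 * 100.0;
}
```

A raw byte of 0xff thus maps to 100%, and 0x00 to 0%, with no need to read the before-values at all.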
2024-06-28  voidbert <50591320+voidbert@users.noreply.github.com>

  * .../perf_event_uncore/perf_event_uncore.c: perf_event_uncore:
    consider capabilities for permissions

2025-02-27  Dong Jun Woun

  * src/components/rocm_smi/README.md: rocm_smi: Update the README to
    note two cases of the root path.

2025-03-27  G-Ragghianti

  * src/configure, src/configure.in: Include the comp_tests in the list
    of tests that are enabled by the '--with-tests' configure option.

2025-03-20  Treece Burgess

  * .github/workflows/papi_framework_workflow.yml: Use paths-ignore
    instead of paths for the framework workflow.

2025-03-18  Treece Burgess

  * .github/workflows/ci_papi_framework.sh,
    .github/workflows/papi_framework_workflow.yml: Remove infiniband
    from the papi_components_comprehensive CI test.

2025-03-20  Anthony Danalis

  * src/components/rocp_sdk/tests/Makefile: ROCP_SDK: Change
    tests/Makefile for Spack builds.

2025-03-05  G-Ragghianti

  * src/components/rocm_smi/Rules.rocm_smi: Add the location of the
    rocm_smi header files for newer versions of ROCm.

==== ChangeLogP720b1.txt ====

2024-08-30  Anthony Danalis

  * src/components/rocp_sdk/Rules.rocp_sdk,
    src/components/rocp_sdk/rocp_sdk.c,
    src/components/rocp_sdk/sdk_class.cpp,
    src/components/rocp_sdk/sdk_class.h,
    src/components/rocp_sdk/sdk_class.hpp,
    src/components/rocp_sdk/tests/Makefile,
    src/components/rocp_sdk/tests/advanced.c,
    src/components/rocp_sdk/tests/kernel.cpp,
    src/components/rocp_sdk/tests/simple.c, src/configure,
    src/configure.in: Beta support for AMD ROCprofiler-SDK events.
Mon Jul 22 16:59:06 2024 +0900  jdeokkim

  * src/libpfm4/docs/man3/pfm_get_os_event_encoding.3,
    .../docs/man3/pfm_get_perf_event_encoding.3,
    src/libpfm4/docs/man3/pfm_initialize.3,
    src/libpfm4/lib/events/arm_neoverse_n1_events.h,
    src/libpfm4/lib/events/arm_neoverse_n2_events.h,
    src/libpfm4/lib/events/arm_neoverse_v1_events.h,
    src/libpfm4/lib/events/intel_spr_events.h,
    src/libpfm4/lib/pfmlib_arm.c: Update libpfm4 Current with

    commit 3abda5bc6c1af7f1b620dc594a806b3b5a4134cb
      Optimize pfm_detect() for ARM processors
      Avoid calling pfmlib_getcpuinfo_attr() 3 times for each ARM PMU
      to detect. For a given processor, the function will always return
      the same information. Use the pfm_arm_cfg structure as a cache on
      subsequent calls. Note: this overall logic does not handle ARM
      hybrids right now.

    commit b8b7d69e774c38618aa440f49d69814d109629f5
      add L2D_CACHE and make L2D_CACHE_ACCESS an alias for ARM Neoverse
      N1, N2, V2
      To match the kernel and documentation. The patch provides an
      alias to avoid breaking existing scripts.

    commit 0d216ee4082aef2d8cabfa9816cdb6d6560d1d3f
      update Intel SapphireRapids core PMU to 1.24
      Updates the Intel SapphireRapids core PMU event table to the
      latest Intel-released version (Date: 07/18/2024, Version: 1.24),
      from github.com/Intel/perfmon.

    commit 874ed7cff57271c5d4e530650eadce76e3dcaa14
      Fix typos in docs/man3/pfm_get_perf_event_encoding.3

    commit ffbfc5970897de87471e7cba64737dc13e2369cf
      Fix typos in docs/man3/pfm_get_os_event_encoding.3

    Note: The PAPI team does not have access to a machine with ARM and
    ARM Neoverse to test building PAPI with the updated libpfm4
    changes. PAPI built successfully on a machine with an Intel
    Sapphire Rapids (Intel(R) Xeon(R) Gold 6430) CPU, and the output
    from papi_native_avail shows the updated names and descriptions.
2024-08-02  Treece Burgess

  * src/components/rocm/roc_dispatch.c,
    src/components/rocm/roc_dispatch.h,
    src/components/rocm/roc_profiler.c,
    src/components/rocm/roc_profiler.h, src/components/rocm/rocm.c:
    Revert changes to the ROCm component to only count the basename for
    papi_component_avail. This will be the default choice for
    components with qualifiers.

2024-08-01  Treece Burgess

  * src/components/cuda/README.md: Update the format of
    "export LD_LIBRARY_PATH=" in the Cuda component README.

2024-07-29  Heike Jagode

  * README.md: Updated license.

2024-07-25  William Cohen

  * src/components/cuda/tests/Makefile: cuda: When making the cuda
    tests, do not assume the nvcc on $PATH is the one being used
    A check in the Makefile for the cuda tests assumed the nvcc found
    on $PATH and the one referred to by $(NVCC) were the same. It is
    possible to have multiple versions of cuda installed on the system,
    and the one being used for $(NVCC) may not be the same as the one
    found on the default path. $(NVCC) should always be used in the
    Makefile to ensure getting the same version of nvcc.
  * src/components/cuda/linux-cuda.c: cuda: Eliminate the
    -Werror=format-security error in cuda_init_private()
    Use a "%s" format in the sprintf to avoid the following error when
    compiling the cuda component with -Werror=format-security:

      components/cuda/linux-cuda.c: In function ‘cuda_init_private’:
      components/cuda/linux-cuda.c:158:9: error: format not a string
      literal and no format arguments [-Werror=format-security]
        158 | sprintf(_cuda_vector.cmp_info.disabled_reason, disabled_reason);
            | ^~~~~~~
      cc1: some warnings being treated as errors

2024-07-24  Daniel Barry

  * src/papi_events.csv: add presets for Zen5
    These changes include all available preset events for the Zen4
    architecture, and they were validated using the Counter Analysis
    Toolkit. These changes have been tested on the AMD Zen5
    architecture.

2024-07-16  Treece Burgess

  * src/components/cuda/papi_cupti_common.c,
    src/components/cuda/papi_cupti_common.h: Add papi_cupti_common.c
    and papi_cupti_common.h.
  * src/components/cuda/Rules.cuda, src/components/cuda/cupti_common.c,
    src/components/cuda/cupti_common.h,
    src/components/cuda/cupti_dispatch.c,
    src/components/cuda/cupti_events.c,
    src/components/cuda/cupti_profiler.c: Rename cupti_common.* to
    papi_cupti_common.* to avoid a file-name collision with NVIDIA.
    Fixes build issues in Cuda >= 12.4.

2024-07-23  Heike Jagode

  * src/components/cuda/tests/Makefile: Fixed an issue with building
    CUDA tests when linked with the shared PAPI library
    If the CUDA tests are linked with the PAPI shared library (via the
    configure option --with-shlib-tools), the tests don't build,
    because nvcc doesn't accept the -Wl,-rpath linker option. To fix
    this issue, instead of linking with nvcc, we can link with the
    PAPI-chosen C compiler via the CC macro (or the CXX macro for tests
    with C++ code). Additionally, this PR cleans up other issues with
    the tests (e.g., cudaOpenMP.cu and cudaOpenMP_noCuCtx.cu) by
    removing redundant explicit compilations, as the Makefile already
    includes a compilation rule.

2024-07-09  Daniel Barry

  * src/components/rocm/roc_dispatch.c,
    src/components/rocm/roc_dispatch.h,
    src/components/rocm/roc_profiler.c,
    src/components/rocm/roc_profiler.h, src/components/rocm/rocm.c:
    rocm: fix the event count to include qualifiers
    The event count previously only counted basenames. This has been
    changed to include all qualified event names. These changes have
    been tested on the Frontier supercomputer (AMD MI250X GPU
    architecture) with ROCm version 5.3.0.
Wed Jul 3 23:50:15 2024 -0700 Swarup Sahoo * src/libpfm4/README, src/libpfm4/docs/Makefile, src/libpfm4/docs/man3/libpfm_amd64_fam1ah_zen5.3, .../docs/man3/libpfm_amd64_fam1ah_zen5_l3.3, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/events/amd64_events_fam1ah_zen5.h, .../lib/events/amd64_events_fam1ah_zen5_l3.h, src/libpfm4/lib/pfmlib_amd64.c, src/libpfm4/lib/pfmlib_amd64_fam19h_l3.c, src/libpfm4/lib/pfmlib_amd64_fam1ah.c, src/libpfm4/lib/pfmlib_amd64_fam1ah_l3.c, src/libpfm4/lib/pfmlib_amd64_priv.h, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_x86.c: Update libpfm4 Current with commit 18f2a3e0541cc438094bbf65ebbed2b6742bf0d4 Add AMD Zen5 L3 PMU support This patch implements support for the AMD Zen5 processor L3 cache PMU. The implementation is based on "Performance Monitor Counters for AMD Family A0h Model 00h-0Fh Processors, rev 0.02", available at https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/programmer-references/58550-0.01.pdf commit e50b6a47c9f9f1386805bbce3d2a634782f8c30e Add AMD Zen5 core PMU support This patch implements support for the AMD Zen5 core PMU. The implementation is based on "Performance Monitor Counters for AMD Family A0h Model 00h-0Fh Processors, rev 0.02", available at https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/programmer-references/58550-0.01.pdf Note: The PAPI team does not have access to the AMD Zen5 architecture to test these additions as of now. 
2024-06-07 Vince Weaver * src/papi_internal.c: papi_internal: don't segfault if event missing from derived event definition This shouldn't be possible with a correctly configured papi_events.csv, but it is still good to be robust here so that we don't segfault, and also to print useful debug messages. 2024-06-03 Vince Weaver * src/components/perf_event/pe_libpfm4_events.c: perf_event/libpfm4_events: add initial heterogeneous CPU support Alderlake and Raptorlake should now be able to properly enumerate events for both Performance and Efficiency cores on heterogeneous systems. We're still testing whether PAPI properly handles measurements where applications end up running on both types of cores. The detection of E-cores is currently a bit of a hack; we're going to discuss with upstream libpfm4 to see if we can get a cleaner interface for this. * src/components/perf_event/pe_libpfm4_events.c: perf_event/libpfm4_events: add some code comments to _pe_libpfm4_init() This should not change the behavior of the code; it just adds some comments and whitespace to clear up what the code is actually doing, to prepare for improved heterogeneous CPU support. 2024-06-16 William Cohen * src/Rules.perfctr, src/Rules.perfctr-pfm, src/components/perfctr/Rules.perfctr, src/components/perfctr/perfctr-x86.c, src/components/perfctr/perfctr-x86.h, src/components/perfctr/perfctr.c, src/components/perfctr_ppc/Rules.perfctr_ppc, src/components/perfctr_ppc/linux-ppc64.h, src/components/perfctr_ppc/perfctr-ppc64.c, src/components/perfctr_ppc/perfctr-ppc64.h, src/components/perfctr_ppc/power5+_events.h, src/components/perfctr_ppc/power5+_events_map.c, src/components/perfctr_ppc/power5_events.h, src/components/perfctr_ppc/power5_events_map.c, src/components/perfctr_ppc/power6_events.h, src/components/perfctr_ppc/power6_events_map.c, src/components/perfctr_ppc/power7_events.h, src/components/perfctr_ppc/ppc64_events.c, src/components/perfctr_ppc/ppc64_events.h, src/components/perfctr_ppc/ppc970_events.h, 
src/components/perfctr_ppc/ppc970_events_map.c, src/configure, src/configure.in, src/libpfm-3.y/COPYRIGHT, src/libpfm-3.y/ChangeLog, src/libpfm-3.y/Makefile, src/libpfm-3.y/README, src/libpfm-3.y/TODO, src/libpfm-3.y/config.mk, src/libpfm-3.y/docs/Makefile, src/libpfm-3.y/docs/man3/libpfm.3, src/libpfm-3.y/docs/man3/libpfm_amd64.3, src/libpfm-3.y/docs/man3/libpfm_atom.3, src/libpfm-3.y/docs/man3/libpfm_core.3, src/libpfm-3.y/docs/man3/libpfm_itanium.3, src/libpfm-3.y/docs/man3/libpfm_itanium2.3, src/libpfm-3.y/docs/man3/libpfm_montecito.3, src/libpfm-3.y/docs/man3/libpfm_nehalem.3, src/libpfm-3.y/docs/man3/libpfm_p6.3, src/libpfm-3.y/docs/man3/libpfm_powerpc.3, src/libpfm-3.y/docs/man3/libpfm_westmere.3, src/libpfm-3.y/docs/man3/pfm_dispatch_events.3, src/libpfm-3.y/docs/man3/pfm_find_event.3, src/libpfm-3.y/docs/man3/pfm_find_event_bycode.3, .../docs/man3/pfm_find_event_bycode_next.3, src/libpfm-3.y/docs/man3/pfm_find_event_mask.3, src/libpfm-3.y/docs/man3/pfm_find_full_event.3, src/libpfm-3.y/docs/man3/pfm_force_pmu.3, src/libpfm-3.y/docs/man3/pfm_get_cycle_event.3, src/libpfm-3.y/docs/man3/pfm_get_event_code.3, .../docs/man3/pfm_get_event_code_counter.3, src/libpfm-3.y/docs/man3/pfm_get_event_counters.3, .../docs/man3/pfm_get_event_description.3, src/libpfm-3.y/docs/man3/pfm_get_event_mask_code.3, .../docs/man3/pfm_get_event_mask_description.3, src/libpfm-3.y/docs/man3/pfm_get_event_mask_name.3, src/libpfm-3.y/docs/man3/pfm_get_event_name.3, src/libpfm-3.y/docs/man3/pfm_get_full_event_name.3, .../docs/man3/pfm_get_hw_counter_width.3, src/libpfm-3.y/docs/man3/pfm_get_impl_counters.3, src/libpfm-3.y/docs/man3/pfm_get_impl_pmcs.3, src/libpfm-3.y/docs/man3/pfm_get_impl_pmds.3, src/libpfm-3.y/docs/man3/pfm_get_inst_retired.3, .../docs/man3/pfm_get_max_event_name_len.3, src/libpfm-3.y/docs/man3/pfm_get_num_counters.3, src/libpfm-3.y/docs/man3/pfm_get_num_events.3, src/libpfm-3.y/docs/man3/pfm_get_num_pmcs.3, src/libpfm-3.y/docs/man3/pfm_get_num_pmds.3, 
src/libpfm-3.y/docs/man3/pfm_get_pmu_name.3, src/libpfm-3.y/docs/man3/pfm_get_pmu_name_bytype.3, src/libpfm-3.y/docs/man3/pfm_get_pmu_type.3, src/libpfm-3.y/docs/man3/pfm_get_version.3, src/libpfm-3.y/docs/man3/pfm_initialize.3, src/libpfm-3.y/docs/man3/pfm_list_supported_pmus.3, src/libpfm-3.y/docs/man3/pfm_pmu_is_supported.3, src/libpfm-3.y/docs/man3/pfm_regmask_and.3, src/libpfm-3.y/docs/man3/pfm_regmask_clr.3, src/libpfm-3.y/docs/man3/pfm_regmask_copy.3, src/libpfm-3.y/docs/man3/pfm_regmask_eq.3, src/libpfm-3.y/docs/man3/pfm_regmask_isset.3, src/libpfm-3.y/docs/man3/pfm_regmask_or.3, src/libpfm-3.y/docs/man3/pfm_regmask_set.3, src/libpfm-3.y/docs/man3/pfm_regmask_weight.3, src/libpfm-3.y/docs/man3/pfm_set_options.3, src/libpfm-3.y/docs/man3/pfm_strerror.3, src/libpfm-3.y/examples_ia64_v2.0/Makefile, src/libpfm-3.y/examples_ia64_v2.0/ita2_btb.c, src/libpfm-3.y/examples_ia64_v2.0/ita2_dear.c, src/libpfm-3.y/examples_ia64_v2.0/ita2_irr.c, src/libpfm-3.y/examples_ia64_v2.0/ita2_opcode.c, src/libpfm-3.y/examples_ia64_v2.0/ita2_rr.c, src/libpfm-3.y/examples_ia64_v2.0/ita_btb.c, src/libpfm-3.y/examples_ia64_v2.0/ita_dear.c, src/libpfm-3.y/examples_ia64_v2.0/ita_irr.c, src/libpfm-3.y/examples_ia64_v2.0/ita_opcode.c, src/libpfm-3.y/examples_ia64_v2.0/ita_rr.c, src/libpfm-3.y/examples_ia64_v2.0/mont_dear.c, src/libpfm-3.y/examples_ia64_v2.0/mont_etb.c, src/libpfm-3.y/examples_ia64_v2.0/mont_irr.c, src/libpfm-3.y/examples_ia64_v2.0/mont_opcode.c, src/libpfm-3.y/examples_ia64_v2.0/mont_rr.c, src/libpfm-3.y/examples_ia64_v2.0/multiplex.c, src/libpfm-3.y/examples_ia64_v2.0/notify_self.c, src/libpfm-3.y/examples_ia64_v2.0/notify_self2.c, src/libpfm-3.y/examples_ia64_v2.0/notify_self3.c, .../examples_ia64_v2.0/notify_self_fork.c, src/libpfm-3.y/examples_ia64_v2.0/self.c, src/libpfm-3.y/examples_ia64_v2.0/showreset.c, src/libpfm-3.y/examples_ia64_v2.0/syst.c, src/libpfm-3.y/examples_ia64_v2.0/task.c, src/libpfm-3.y/examples_ia64_v2.0/task_attach.c, 
.../examples_ia64_v2.0/task_attach_timeout.c, src/libpfm-3.y/examples_ia64_v2.0/task_smpl.c, src/libpfm-3.y/examples_ia64_v2.0/whichpmu.c, src/libpfm-3.y/examples_v2.x/Makefile, src/libpfm-3.y/examples_v2.x/check_events.c, src/libpfm-3.y/examples_v2.x/detect_pmcs.c, src/libpfm-3.y/examples_v2.x/detect_pmcs.h, src/libpfm-3.y/examples_v2.x/ia64/Makefile, src/libpfm-3.y/examples_v2.x/ia64/ita2_btb.c, src/libpfm-3.y/examples_v2.x/ia64/ita2_dear.c, src/libpfm-3.y/examples_v2.x/ia64/ita2_irr.c, src/libpfm-3.y/examples_v2.x/ia64/ita2_opcode.c, src/libpfm-3.y/examples_v2.x/ia64/ita2_rr.c, src/libpfm-3.y/examples_v2.x/ia64/ita_btb.c, src/libpfm-3.y/examples_v2.x/ia64/ita_dear.c, src/libpfm-3.y/examples_v2.x/ia64/ita_irr.c, src/libpfm-3.y/examples_v2.x/ia64/ita_opcode.c, src/libpfm-3.y/examples_v2.x/ia64/ita_rr.c, src/libpfm-3.y/examples_v2.x/ia64/mont_dear.c, src/libpfm-3.y/examples_v2.x/ia64/mont_etb.c, src/libpfm-3.y/examples_v2.x/ia64/mont_irr.c, src/libpfm-3.y/examples_v2.x/ia64/mont_opcode.c, src/libpfm-3.y/examples_v2.x/ia64/mont_rr.c, src/libpfm-3.y/examples_v2.x/multiplex.c, src/libpfm-3.y/examples_v2.x/multiplex2.c, src/libpfm-3.y/examples_v2.x/notify_self.c, src/libpfm-3.y/examples_v2.x/notify_self2.c, src/libpfm-3.y/examples_v2.x/notify_self3.c, src/libpfm-3.y/examples_v2.x/notify_self_fork.c, src/libpfm-3.y/examples_v2.x/pfmsetup.c, src/libpfm-3.y/examples_v2.x/rtop.c, src/libpfm-3.y/examples_v2.x/self.c, src/libpfm-3.y/examples_v2.x/self_pipe.c, src/libpfm-3.y/examples_v2.x/self_smpl.c, src/libpfm-3.y/examples_v2.x/self_smpl_multi.c, src/libpfm-3.y/examples_v2.x/self_view.c, src/libpfm-3.y/examples_v2.x/set_notify.c, src/libpfm-3.y/examples_v2.x/showevtinfo.c, src/libpfm-3.y/examples_v2.x/showreginfo.c, src/libpfm-3.y/examples_v2.x/syst.c, src/libpfm-3.y/examples_v2.x/syst_multi_np.c, src/libpfm-3.y/examples_v2.x/syst_np.c, src/libpfm-3.y/examples_v2.x/task.c, src/libpfm-3.y/examples_v2.x/task_attach.c, src/libpfm-3.y/examples_v2.x/task_attach_timeout.c, 
.../examples_v2.x/task_attach_timeout_np.c, src/libpfm-3.y/examples_v2.x/task_smpl.c, src/libpfm-3.y/examples_v2.x/task_smpl_user.c, src/libpfm-3.y/examples_v2.x/whichpmu.c, src/libpfm-3.y/examples_v2.x/x86/Makefile, src/libpfm-3.y/examples_v2.x/x86/smpl_amd64_ibs.c, src/libpfm-3.y/examples_v2.x/x86/smpl_core_pebs.c, src/libpfm-3.y/examples_v2.x/x86/smpl_nhm_lbr.c, src/libpfm-3.y/examples_v2.x/x86/smpl_p4_pebs.c, src/libpfm-3.y/examples_v2.x/x86/smpl_pebs.c, src/libpfm-3.y/examples_v3.x/Makefile, src/libpfm-3.y/examples_v3.x/check_events.c, src/libpfm-3.y/examples_v3.x/detect_pmcs.c, src/libpfm-3.y/examples_v3.x/detect_pmcs.h, src/libpfm-3.y/examples_v3.x/ia64/Makefile, src/libpfm-3.y/examples_v3.x/ia64/ita2_btb.c, src/libpfm-3.y/examples_v3.x/ia64/ita2_dear.c, src/libpfm-3.y/examples_v3.x/ia64/ita2_irr.c, src/libpfm-3.y/examples_v3.x/ia64/ita2_opcode.c, src/libpfm-3.y/examples_v3.x/ia64/ita2_rr.c, src/libpfm-3.y/examples_v3.x/ia64/ita_btb.c, src/libpfm-3.y/examples_v3.x/ia64/ita_dear.c, src/libpfm-3.y/examples_v3.x/ia64/ita_irr.c, src/libpfm-3.y/examples_v3.x/ia64/ita_opcode.c, src/libpfm-3.y/examples_v3.x/ia64/ita_rr.c, src/libpfm-3.y/examples_v3.x/ia64/mont_dear.c, src/libpfm-3.y/examples_v3.x/ia64/mont_etb.c, src/libpfm-3.y/examples_v3.x/ia64/mont_irr.c, src/libpfm-3.y/examples_v3.x/ia64/mont_opcode.c, src/libpfm-3.y/examples_v3.x/ia64/mont_rr.c, src/libpfm-3.y/examples_v3.x/multiplex.c, src/libpfm-3.y/examples_v3.x/multiplex2.c, src/libpfm-3.y/examples_v3.x/notify_self.c, src/libpfm-3.y/examples_v3.x/notify_self2.c, src/libpfm-3.y/examples_v3.x/notify_self3.c, src/libpfm-3.y/examples_v3.x/notify_self_fork.c, src/libpfm-3.y/examples_v3.x/pfmsetup.c, src/libpfm-3.y/examples_v3.x/rtop.c, src/libpfm-3.y/examples_v3.x/self.c, src/libpfm-3.y/examples_v3.x/self_pipe.c, src/libpfm-3.y/examples_v3.x/self_smpl_multi.c, src/libpfm-3.y/examples_v3.x/set_notify.c, src/libpfm-3.y/examples_v3.x/showevtinfo.c, src/libpfm-3.y/examples_v3.x/showreginfo.c, 
src/libpfm-3.y/examples_v3.x/syst.c, src/libpfm-3.y/examples_v3.x/task.c, src/libpfm-3.y/examples_v3.x/task_attach.c, src/libpfm-3.y/examples_v3.x/task_attach_timeout.c, src/libpfm-3.y/examples_v3.x/task_smpl.c, src/libpfm-3.y/examples_v3.x/task_smpl_user.c, src/libpfm-3.y/examples_v3.x/whichpmu.c, src/libpfm-3.y/examples_v3.x/x86/Makefile, src/libpfm-3.y/examples_v3.x/x86/smpl_amd64_ibs.c, src/libpfm-3.y/examples_v3.x/x86/smpl_core_pebs.c, .../examples_v3.x/x86/smpl_core_pebs_sys.c, src/libpfm-3.y/examples_v3.x/x86/smpl_p4_pebs.c, src/libpfm-3.y/include/Makefile, src/libpfm-3.y/include/perfmon/perfmon.h, src/libpfm-3.y/include/perfmon/perfmon_compat.h, src/libpfm-3.y/include/perfmon/perfmon_crayx2.h, .../include/perfmon/perfmon_default_smpl.h, src/libpfm-3.y/include/perfmon/perfmon_dfl_smpl.h, src/libpfm-3.y/include/perfmon/perfmon_i386.h, src/libpfm-3.y/include/perfmon/perfmon_ia64.h, src/libpfm-3.y/include/perfmon/perfmon_mips64.h, .../include/perfmon/perfmon_pebs_core_smpl.h, .../include/perfmon/perfmon_pebs_p4_smpl.h, src/libpfm-3.y/include/perfmon/perfmon_pebs_smpl.h, src/libpfm-3.y/include/perfmon/perfmon_powerpc.h, src/libpfm-3.y/include/perfmon/perfmon_sparc.h, src/libpfm-3.y/include/perfmon/perfmon_v2.h, src/libpfm-3.y/include/perfmon/perfmon_x86_64.h, src/libpfm-3.y/include/perfmon/pfmlib.h, src/libpfm-3.y/include/perfmon/pfmlib_amd64.h, src/libpfm-3.y/include/perfmon/pfmlib_cell.h, src/libpfm-3.y/include/perfmon/pfmlib_comp.h, .../include/perfmon/pfmlib_comp_crayx2.h, src/libpfm-3.y/include/perfmon/pfmlib_comp_i386.h, src/libpfm-3.y/include/perfmon/pfmlib_comp_ia64.h, .../include/perfmon/pfmlib_comp_mips64.h, .../include/perfmon/pfmlib_comp_powerpc.h, src/libpfm-3.y/include/perfmon/pfmlib_comp_sparc.h, .../include/perfmon/pfmlib_comp_x86_64.h, src/libpfm-3.y/include/perfmon/pfmlib_core.h, src/libpfm-3.y/include/perfmon/pfmlib_coreduo.h, src/libpfm-3.y/include/perfmon/pfmlib_crayx2.h, src/libpfm-3.y/include/perfmon/pfmlib_gen_ia32.h, 
src/libpfm-3.y/include/perfmon/pfmlib_gen_ia64.h, src/libpfm-3.y/include/perfmon/pfmlib_gen_mips64.h, src/libpfm-3.y/include/perfmon/pfmlib_i386_p6.h, src/libpfm-3.y/include/perfmon/pfmlib_intel_atom.h, src/libpfm-3.y/include/perfmon/pfmlib_intel_nhm.h, src/libpfm-3.y/include/perfmon/pfmlib_itanium.h, src/libpfm-3.y/include/perfmon/pfmlib_itanium2.h, src/libpfm-3.y/include/perfmon/pfmlib_montecito.h, src/libpfm-3.y/include/perfmon/pfmlib_os.h, src/libpfm-3.y/include/perfmon/pfmlib_os_crayx2.h, src/libpfm-3.y/include/perfmon/pfmlib_os_i386.h, src/libpfm-3.y/include/perfmon/pfmlib_os_ia64.h, src/libpfm-3.y/include/perfmon/pfmlib_os_mips64.h, src/libpfm-3.y/include/perfmon/pfmlib_os_powerpc.h, src/libpfm-3.y/include/perfmon/pfmlib_os_sparc.h, src/libpfm-3.y/include/perfmon/pfmlib_os_x86_64.h, src/libpfm-3.y/include/perfmon/pfmlib_pentium4.h, src/libpfm-3.y/include/perfmon/pfmlib_powerpc.h, src/libpfm-3.y/include/perfmon/pfmlib_sicortex.h, src/libpfm-3.y/include/perfmon/pfmlib_sparc.h, src/libpfm-3.y/lib/Makefile, src/libpfm-3.y/lib/amd64_events.h, src/libpfm-3.y/lib/amd64_events_fam10h.h, src/libpfm-3.y/lib/amd64_events_fam15h.h, src/libpfm-3.y/lib/amd64_events_k7.h, src/libpfm-3.y/lib/amd64_events_k8.h, src/libpfm-3.y/lib/cell_events.h, src/libpfm-3.y/lib/core_events.h, src/libpfm-3.y/lib/coreduo_events.h, src/libpfm-3.y/lib/crayx2_events.h, src/libpfm-3.y/lib/gen_ia32_events.h, src/libpfm-3.y/lib/gen_mips64_events.h, src/libpfm-3.y/lib/i386_p6_events.h, src/libpfm-3.y/lib/intel_atom_events.h, src/libpfm-3.y/lib/intel_corei7_events.h, src/libpfm-3.y/lib/intel_corei7_unc_events.h, src/libpfm-3.y/lib/intel_wsm_events.h, src/libpfm-3.y/lib/intel_wsm_unc_events.h, src/libpfm-3.y/lib/itanium2_events.h, src/libpfm-3.y/lib/itanium_events.h, src/libpfm-3.y/lib/montecito_events.h, src/libpfm-3.y/lib/niagara1_events.h, src/libpfm-3.y/lib/niagara2_events.h, src/libpfm-3.y/lib/pentium4_events.h, src/libpfm-3.y/lib/pfmlib_amd64.c, src/libpfm-3.y/lib/pfmlib_amd64_priv.h, 
src/libpfm-3.y/lib/pfmlib_cell.c, src/libpfm-3.y/lib/pfmlib_cell_priv.h, src/libpfm-3.y/lib/pfmlib_common.c, src/libpfm-3.y/lib/pfmlib_core.c, src/libpfm-3.y/lib/pfmlib_core_priv.h, src/libpfm-3.y/lib/pfmlib_coreduo.c, src/libpfm-3.y/lib/pfmlib_coreduo_priv.h, src/libpfm-3.y/lib/pfmlib_crayx2.c, src/libpfm-3.y/lib/pfmlib_crayx2_priv.h, src/libpfm-3.y/lib/pfmlib_gen_ia32.c, src/libpfm-3.y/lib/pfmlib_gen_ia32_priv.h, src/libpfm-3.y/lib/pfmlib_gen_ia64.c, src/libpfm-3.y/lib/pfmlib_gen_mips64.c, src/libpfm-3.y/lib/pfmlib_gen_mips64_priv.h, src/libpfm-3.y/lib/pfmlib_gen_powerpc.c, src/libpfm-3.y/lib/pfmlib_i386_p6.c, src/libpfm-3.y/lib/pfmlib_i386_p6_priv.h, src/libpfm-3.y/lib/pfmlib_intel_atom.c, src/libpfm-3.y/lib/pfmlib_intel_atom_priv.h, src/libpfm-3.y/lib/pfmlib_intel_nhm.c, src/libpfm-3.y/lib/pfmlib_intel_nhm_priv.h, src/libpfm-3.y/lib/pfmlib_itanium.c, src/libpfm-3.y/lib/pfmlib_itanium2.c, src/libpfm-3.y/lib/pfmlib_itanium2_priv.h, src/libpfm-3.y/lib/pfmlib_itanium_priv.h, src/libpfm-3.y/lib/pfmlib_montecito.c, src/libpfm-3.y/lib/pfmlib_montecito_priv.h, src/libpfm-3.y/lib/pfmlib_os_linux.c, src/libpfm-3.y/lib/pfmlib_os_linux_v2.c, src/libpfm-3.y/lib/pfmlib_os_linux_v3.c, src/libpfm-3.y/lib/pfmlib_os_macos.c, src/libpfm-3.y/lib/pfmlib_pentium4.c, src/libpfm-3.y/lib/pfmlib_pentium4_priv.h, src/libpfm-3.y/lib/pfmlib_power4_priv.h, src/libpfm-3.y/lib/pfmlib_power5+_priv.h, src/libpfm-3.y/lib/pfmlib_power5_priv.h, src/libpfm-3.y/lib/pfmlib_power6_priv.h, src/libpfm-3.y/lib/pfmlib_power7_priv.h, src/libpfm-3.y/lib/pfmlib_power_priv.h, src/libpfm-3.y/lib/pfmlib_powerpc_priv.h, src/libpfm-3.y/lib/pfmlib_ppc970_priv.h, src/libpfm-3.y/lib/pfmlib_ppc970mp_priv.h, src/libpfm-3.y/lib/pfmlib_priv.c, src/libpfm-3.y/lib/pfmlib_priv.h, src/libpfm-3.y/lib/pfmlib_priv_comp.h, src/libpfm-3.y/lib/pfmlib_priv_comp_ia64.h, src/libpfm-3.y/lib/pfmlib_priv_ia64.h, src/libpfm-3.y/lib/pfmlib_sicortex.c, src/libpfm-3.y/lib/pfmlib_sicortex_priv.h, src/libpfm-3.y/lib/pfmlib_sparc.c, 
src/libpfm-3.y/lib/pfmlib_sparc_priv.h, src/libpfm-3.y/lib/power4_events.h, src/libpfm-3.y/lib/power5+_events.h, src/libpfm-3.y/lib/power5_events.h, src/libpfm-3.y/lib/power6_events.h, src/libpfm-3.y/lib/power7_events.h, src/libpfm-3.y/lib/powerpc_events.h, src/libpfm-3.y/lib/powerpc_reg.h, src/libpfm-3.y/lib/ppc970_events.h, src/libpfm-3.y/lib/ppc970mp_events.h, src/libpfm-3.y/lib/ultra12_events.h, src/libpfm-3.y/lib/ultra3_events.h, src/libpfm-3.y/lib/ultra3i_events.h, src/libpfm-3.y/lib/ultra3plus_events.h, src/libpfm-3.y/lib/ultra4plus_events.h, src/libpfm-3.y/libpfms/Makefile, src/libpfm-3.y/libpfms/include/libpfms.h, src/libpfm-3.y/libpfms/lib/Makefile, src/libpfm-3.y/libpfms/lib/libpfms.c, src/libpfm-3.y/libpfms/syst_smp.c, src/libpfm-3.y/python/Makefile, src/libpfm-3.y/python/README, src/libpfm-3.y/python/self.py, src/libpfm-3.y/python/setup.py, src/libpfm-3.y/python/src/__init__.py, src/libpfm-3.y/python/src/perfmon_int.i, src/libpfm-3.y/python/src/pmu.py, src/libpfm-3.y/python/src/session.py, src/libpfm-3.y/python/sys.py, src/libpfm-3.y/rules.mk, src/perfctr-2.6.x/CHANGES, src/perfctr-2.6.x/COPYING, src/perfctr-2.6.x/INSTALL, src/perfctr-2.6.x/Makefile, src/perfctr-2.6.x/OTHER, src/perfctr-2.6.x/README, src/perfctr-2.6.x/TODO, src/perfctr-2.6.x/etc/Makefile, src/perfctr-2.6.x/etc/costs/Athlon-1.2, src/perfctr-2.6.x/etc/costs/Athlon-1.46, src/perfctr-2.6.x/etc/costs/Athlon-1.66, src/perfctr-2.6.x/etc/costs/Athlon-1000, src/perfctr-2.6.x/etc/costs/Athlon-500, src/perfctr-2.6.x/etc/costs/Athlon-700, src/perfctr-2.6.x/etc/costs/Athlon-850, src/perfctr-2.6.x/etc/costs/Athlon64-2.0, src/perfctr-2.6.x/etc/costs/Core-i7-920-2.66, src/perfctr-2.6.x/etc/costs/Core2-2.4, src/perfctr-2.6.x/etc/costs/Core2-E8400-3.0, src/perfctr-2.6.x/etc/costs/Duron-750, src/perfctr-2.6.x/etc/costs/K6-III-400, src/perfctr-2.6.x/etc/costs/MPC7400-400, src/perfctr-2.6.x/etc/costs/MPC7447A-1.25, src/perfctr-2.6.x/etc/costs/MPC7455-1.0, src/perfctr-2.6.x/etc/costs/Opteron-1.4, 
src/perfctr-2.6.x/etc/costs/Opteron-2.4, src/perfctr-2.6.x/etc/costs/Opteron-2352-2.1, src/perfctr-2.6.x/etc/costs/Opteron-8354-2.2, src/perfctr-2.6.x/etc/costs/Opteron-8384-2.7, src/perfctr-2.6.x/etc/costs/PPC750-300, src/perfctr-2.6.x/etc/costs/Pentium-133, src/perfctr-2.6.x/etc/costs/Pentium4-1.5, src/perfctr-2.6.x/etc/costs/Pentium4-1.6, src/perfctr-2.6.x/etc/costs/Pentium4-1.7, src/perfctr-2.6.x/etc/costs/Pentium4-2.0, src/perfctr-2.6.x/etc/costs/Pentium4-2.26, src/perfctr-2.6.x/etc/costs/Pentium4-3.0, src/perfctr-2.6.x/etc/costs/Pentium4Xeon-2.2, src/perfctr-2.6.x/etc/costs/Pentium4Xeon-2.4, src/perfctr-2.6.x/etc/costs/Pentium4Xeon-2.8, src/perfctr-2.6.x/etc/costs/Pentium4Xeon-3.0, src/perfctr-2.6.x/etc/costs/PentiumII-266a, src/perfctr-2.6.x/etc/costs/PentiumII-266b, src/perfctr-2.6.x/etc/costs/PentiumII-300, src/perfctr-2.6.x/etc/costs/PentiumII-350, src/perfctr-2.6.x/etc/costs/PentiumIII-1.0, src/perfctr-2.6.x/etc/costs/PentiumIII-1.4, src/perfctr-2.6.x/etc/costs/PentiumIII-450, src/perfctr-2.6.x/etc/costs/PentiumIII-800, src/perfctr-2.6.x/etc/costs/PentiumIII-933, src/perfctr-2.6.x/etc/costs/PentiumIIIXeon-700, src/perfctr-2.6.x/etc/costs/PentiumM-2.0, src/perfctr-2.6.x/etc/costs/PentiumMMX-166, src/perfctr-2.6.x/etc/costs/PentiumMMX-233, src/perfctr-2.6.x/etc/costs/PentiumPro-200, src/perfctr-2.6.x/etc/install.sh, src/perfctr-2.6.x/etc/p4.c, src/perfctr-2.6.x/etc/perfctr.rc, src/perfctr-2.6.x/etc/perfctr.rules, src/perfctr-2.6.x/examples/Makefile, src/perfctr-2.6.x/examples/README, src/perfctr-2.6.x/examples/global/Makefile, src/perfctr-2.6.x/examples/global/arch.h, src/perfctr-2.6.x/examples/global/arm.c, src/perfctr-2.6.x/examples/global/global.c, src/perfctr-2.6.x/examples/global/ppc.c, src/perfctr-2.6.x/examples/global/x86.c, src/perfctr-2.6.x/examples/perfex/Makefile, src/perfctr-2.6.x/examples/perfex/arch.h, src/perfctr-2.6.x/examples/perfex/arm.c, src/perfctr-2.6.x/examples/perfex/arm.h, src/perfctr-2.6.x/examples/perfex/perfex.c, 
src/perfctr-2.6.x/examples/perfex/ppc.c, src/perfctr-2.6.x/examples/perfex/ppc.h, src/perfctr-2.6.x/examples/perfex/x86.c, src/perfctr-2.6.x/examples/perfex/x86.h, src/perfctr-2.6.x/examples/self/Makefile, src/perfctr-2.6.x/examples/self/arch.h, src/perfctr-2.6.x/examples/self/arm.c, src/perfctr-2.6.x/examples/self/ppc.c, src/perfctr-2.6.x/examples/self/self.c, src/perfctr-2.6.x/examples/self/x86.c, src/perfctr-2.6.x/examples/signal/Makefile, src/perfctr-2.6.x/examples/signal/arch.h, src/perfctr-2.6.x/examples/signal/ppc.c, src/perfctr-2.6.x/examples/signal/signal.c, src/perfctr-2.6.x/examples/signal/x86.c, src/perfctr-2.6.x/linux/drivers/perfctr/Kconfig, src/perfctr-2.6.x/linux/drivers/perfctr/Makefile, .../linux/drivers/perfctr/RELEASE-NOTES, src/perfctr-2.6.x/linux/drivers/perfctr/arm.c, .../linux/drivers/perfctr/arm_setup.c, src/perfctr-2.6.x/linux/drivers/perfctr/compat.h, src/perfctr-2.6.x/linux/drivers/perfctr/cpumask.h, src/perfctr-2.6.x/linux/drivers/perfctr/global.c, src/perfctr-2.6.x/linux/drivers/perfctr/global.h, src/perfctr-2.6.x/linux/drivers/perfctr/init.c, src/perfctr-2.6.x/linux/drivers/perfctr/marshal.c, src/perfctr-2.6.x/linux/drivers/perfctr/marshal.h, src/perfctr-2.6.x/linux/drivers/perfctr/ppc.c, .../linux/drivers/perfctr/ppc_compat.h, .../linux/drivers/perfctr/ppc_setup.c, .../linux/drivers/perfctr/ppc_tests.c, .../linux/drivers/perfctr/ppc_tests.h, src/perfctr-2.6.x/linux/drivers/perfctr/version.h, src/perfctr-2.6.x/linux/drivers/perfctr/virtual.c, src/perfctr-2.6.x/linux/drivers/perfctr/virtual.h, .../linux/drivers/perfctr/virtual_stub.c, src/perfctr-2.6.x/linux/drivers/perfctr/x86.c, .../linux/drivers/perfctr/x86_compat.h, .../linux/drivers/perfctr/x86_setup.c, .../linux/drivers/perfctr/x86_tests.c, .../linux/drivers/perfctr/x86_tests.h, src/perfctr-2.6.x/linux/include/asm-arm/perfctr.h, src/perfctr-2.6.x/linux/include/asm-i386/perfctr.h, .../linux/include/asm-powerpc/perfctr.h, src/perfctr-2.6.x/linux/include/asm-ppc/perfctr.h, 
src/perfctr-2.6.x/linux/include/asm-x86/perfctr.h, .../linux/include/asm-x86_64/perfctr.h, src/perfctr-2.6.x/linux/include/linux/perfctr.h, src/perfctr-2.6.x/patches/aliases, src/perfctr-2.6.x/patches/patch- kernel-2.6.10, src/perfctr-2.6.x/patches/patch-kernel-2.6.11, src/perfctr-2.6.x/patches/patch-kernel-2.6.12, src/perfctr-2.6.x/patches/patch-kernel-2.6.13, src/perfctr-2.6.x/patches/patch-kernel-2.6.14, src/perfctr-2.6.x/patches/patch-kernel-2.6.15, src/perfctr-2.6.x/patches/patch-kernel-2.6.16, .../patches/patch- kernel-2.6.16.46-0.12-suse, src/perfctr-2.6.x/patches/patch- kernel-2.6.17, src/perfctr-2.6.x/patches/patch-kernel-2.6.18, .../patches/patch-kernel-2.6.18-128.el5-redhat, .../patches/patch- kernel-2.6.18-164.el5-redhat, .../patches/patch- kernel-2.6.18-194.el5-redhat, .../patches/patch- kernel-2.6.18-53.el5-redhat, .../patches/patch- kernel-2.6.18-8.1.1.el5-redhat, .../patches/patch- kernel-2.6.18-92.el5-redhat, src/perfctr-2.6.x/patches/patch- kernel-2.6.19, src/perfctr-2.6.x/patches/patch-kernel-2.6.20, src/perfctr-2.6.x/patches/patch-kernel-2.6.21, src/perfctr-2.6.x/patches/patch-kernel-2.6.22, src/perfctr-2.6.x/patches/patch-kernel-2.6.23, src/perfctr-2.6.x/patches/patch-kernel-2.6.24, src/perfctr-2.6.x/patches/patch-kernel-2.6.25, src/perfctr-2.6.x/patches/patch-kernel-2.6.26, src/perfctr-2.6.x/patches/patch-kernel-2.6.27, src/perfctr-2.6.x/patches/patch-kernel-2.6.28, src/perfctr-2.6.x/patches/patch-kernel-2.6.29, src/perfctr-2.6.x/patches/patch-kernel-2.6.30, src/perfctr-2.6.x/patches/patch-kernel-2.6.31, src/perfctr-2.6.x/patches/patch-kernel-2.6.32, src/perfctr-2.6.x/patches/patch-kernel-2.6.5, .../patches/patch- kernel-2.6.5-7.276-suse, src/perfctr-2.6.x/patches/patch- kernel-2.6.6, src/perfctr-2.6.x/patches/patch-kernel-2.6.7, src/perfctr-2.6.x/patches/patch-kernel-2.6.8.1, src/perfctr-2.6.x/patches/patch-kernel-2.6.9, .../patches/patch- kernel-2.6.9-55.EL-redhat, .../patches/patch-kernel-2.6.9-67.EL- redhat, 
.../patches/patch-kernel-2.6.9-78.EL-redhat, .../patches/patch-kernel-2.6.9-89.EL-redhat, src/perfctr-2.6.x/perfctr.spec, src/perfctr-2.6.x/update-kernel, src/perfctr-2.6.x/usr.lib/Makefile, src/perfctr-2.6.x/usr.lib/arch.h, src/perfctr-2.6.x/usr.lib/arm.c, src/perfctr-2.6.x/usr.lib/arm.h, src/perfctr-2.6.x/usr.lib/event_set.h, src/perfctr-2.6.x/usr.lib/event_set_amd.c, src/perfctr-2.6.x/usr.lib/event_set_arm.c, src/perfctr-2.6.x/usr.lib/event_set_centaur.c, src/perfctr-2.6.x/usr.lib/event_set_p4.c, src/perfctr-2.6.x/usr.lib/event_set_p5.c, src/perfctr-2.6.x/usr.lib/event_set_p6.c, src/perfctr-2.6.x/usr.lib/event_set_ppc.c, src/perfctr-2.6.x/usr.lib/event_set_x86.c, src/perfctr-2.6.x/usr.lib/gen-event-codes.c, src/perfctr-2.6.x/usr.lib/global.c, src/perfctr-2.6.x/usr.lib/libperfctr.h, src/perfctr-2.6.x/usr.lib/misc.c, src/perfctr-2.6.x/usr.lib/ppc.c, src/perfctr-2.6.x/usr.lib/ppc.h, src/perfctr-2.6.x/usr.lib/virtual.c, src/perfctr-2.6.x/usr.lib/x86.c, src/perfctr-2.6.x/usr.lib/x86.h, src/perfctr-2.7.x/CHANGES, src/perfctr-2.7.x/COPYING, src/perfctr-2.7.x/INSTALL, src/perfctr-2.7.x/Makefile, src/perfctr-2.7.x/OTHER, src/perfctr-2.7.x/README, src/perfctr-2.7.x/TODO, src/perfctr-2.7.x/etc/costs/Athlon-1.1, src/perfctr-2.7.x/etc/costs/Athlon-1.133, src/perfctr-2.7.x/etc/costs/Athlon-1.2, src/perfctr-2.7.x/etc/costs/Athlon-1.3, src/perfctr-2.7.x/etc/costs/Athlon-1.33, src/perfctr-2.7.x/etc/costs/Athlon-1.46, src/perfctr-2.7.x/etc/costs/Athlon-1.53, src/perfctr-2.7.x/etc/costs/Athlon-1.64, src/perfctr-2.7.x/etc/costs/Athlon-1.66, src/perfctr-2.7.x/etc/costs/Athlon-1.75, src/perfctr-2.7.x/etc/costs/Athlon-1.8, src/perfctr-2.7.x/etc/costs/Athlon-1000, src/perfctr-2.7.x/etc/costs/Athlon-2.0, src/perfctr-2.7.x/etc/costs/Athlon-500, src/perfctr-2.7.x/etc/costs/Athlon-700, src/perfctr-2.7.x/etc/costs/Athlon-800, src/perfctr-2.7.x/etc/costs/Athlon-850, src/perfctr-2.7.x/etc/costs/Athlon64-2.0, src/perfctr-2.7.x/etc/costs/Athlon64-2.2, src/perfctr-2.7.x/etc/costs/Athlon64FX-2.2, 
src/perfctr-2.7.x/etc/costs/AthlonXP-1800, src/perfctr-2.7.x/etc/costs/AthlonXPM-2500, src/perfctr-2.7.x/etc/costs/C3-1.2, src/perfctr-2.7.x/etc/costs/Celeron-466, src/perfctr-2.7.x/etc/costs/Celeron-500, src/perfctr-2.7.x/etc/costs/CyrixMII-233, src/perfctr-2.7.x/etc/costs/Duron-1.0, src/perfctr-2.7.x/etc/costs/Duron-600, src/perfctr-2.7.x/etc/costs/Duron-700, src/perfctr-2.7.x/etc/costs/Duron-750, src/perfctr-2.7.x/etc/costs/K6-III-400, src/perfctr-2.7.x/etc/costs/MPC7400-400, src/perfctr-2.7.x/etc/costs/MPC7447A-1.35, src/perfctr-2.7.x/etc/costs/Opteron-1.4, src/perfctr-2.7.x/etc/costs/Opteron-1.6, src/perfctr-2.7.x/etc/costs/Opteron-2.0, src/perfctr-2.7.x/etc/costs/PPC750-300, src/perfctr-2.7.x/etc/costs/Pentium-133, src/perfctr-2.7.x/etc/costs/Pentium4-1.5, src/perfctr-2.7.x/etc/costs/Pentium4-1.6, src/perfctr-2.7.x/etc/costs/Pentium4-1.7, src/perfctr-2.7.x/etc/costs/Pentium4-1.8, src/perfctr-2.7.x/etc/costs/Pentium4-2.2, src/perfctr-2.7.x/etc/costs/Pentium4-2.26, src/perfctr-2.7.x/etc/costs/Pentium4-2.66, src/perfctr-2.7.x/etc/costs/Pentium4-2.8, src/perfctr-2.7.x/etc/costs/Pentium4-3.0, src/perfctr-2.7.x/etc/costs/Pentium4-3.4, src/perfctr-2.7.x/etc/costs/Pentium4M-1.8, src/perfctr-2.7.x/etc/costs/Pentium4Xeon-2.2, src/perfctr-2.7.x/etc/costs/Pentium4Xeon-2.4, src/perfctr-2.7.x/etc/costs/Pentium4Xeon-2.8, src/perfctr-2.7.x/etc/costs/Pentium4Xeon-3.4, src/perfctr-2.7.x/etc/costs/PentiumII-266a, src/perfctr-2.7.x/etc/costs/PentiumII-266b, src/perfctr-2.7.x/etc/costs/PentiumII-300, src/perfctr-2.7.x/etc/costs/PentiumII-350, src/perfctr-2.7.x/etc/costs/PentiumIII-1.0, src/perfctr-2.7.x/etc/costs/PentiumIII-1.4, src/perfctr-2.7.x/etc/costs/PentiumIII-450, src/perfctr-2.7.x/etc/costs/PentiumIII-500, src/perfctr-2.7.x/etc/costs/PentiumIII-700, src/perfctr-2.7.x/etc/costs/PentiumIII-733, src/perfctr-2.7.x/etc/costs/PentiumIII-800, src/perfctr-2.7.x/etc/costs/PentiumIII-866, src/perfctr-2.7.x/etc/costs/PentiumIII-900, src/perfctr-2.7.x/etc/costs/PentiumIII-933, 
src/perfctr-2.7.x/etc/costs/PentiumM-1.3, src/perfctr-2.7.x/etc/costs/PentiumM-1.4, src/perfctr-2.7.x/etc/costs/PentiumM-1.5, src/perfctr-2.7.x/etc/costs/PentiumM-1.6, src/perfctr-2.7.x/etc/costs/PentiumM-1.7, src/perfctr-2.7.x/etc/costs/PentiumMMX-150, src/perfctr-2.7.x/etc/costs/PentiumMMX-166, src/perfctr-2.7.x/etc/costs/PentiumMMX-233, src/perfctr-2.7.x/etc/costs/PentiumPro-200, src/perfctr-2.7.x/etc/costs/Sempron-3100+, src/perfctr-2.7.x/etc/install.sh, src/perfctr-2.7.x/etc/p4.c, src/perfctr-2.7.x/examples/Makefile, src/perfctr-2.7.x/examples/README, src/perfctr-2.7.x/examples/global/Makefile, src/perfctr-2.7.x/examples/global/arch.h, src/perfctr-2.7.x/examples/global/global.c, src/perfctr-2.7.x/examples/global/ppc.c, src/perfctr-2.7.x/examples/global/x86.c, src/perfctr-2.7.x/examples/perfex/Makefile, src/perfctr-2.7.x/examples/perfex/arch.h, src/perfctr-2.7.x/examples/perfex/perfex.c, src/perfctr-2.7.x/examples/perfex/ppc.c, src/perfctr-2.7.x/examples/perfex/ppc.h, src/perfctr-2.7.x/examples/perfex/ppc64.c, src/perfctr-2.7.x/examples/perfex/ppc64.h, src/perfctr-2.7.x/examples/perfex/x86.c, src/perfctr-2.7.x/examples/perfex/x86.h, src/perfctr-2.7.x/examples/self/Makefile, src/perfctr-2.7.x/examples/self/arch.h, src/perfctr-2.7.x/examples/self/ppc.c, src/perfctr-2.7.x/examples/self/ppc64.c, src/perfctr-2.7.x/examples/self/self.c, src/perfctr-2.7.x/examples/self/x86.c, src/perfctr-2.7.x/examples/signal/Makefile, src/perfctr-2.7.x/examples/signal/arch.h, src/perfctr-2.7.x/examples/signal/ppc.c, src/perfctr-2.7.x/examples/signal/ppc64.c, src/perfctr-2.7.x/examples/signal/signal.c, src/perfctr-2.7.x/examples/signal/x86.c, .../linux/Documentation/perfctr/low-level-api.txt, .../Documentation/perfctr/low-level-ppc32.txt, .../linux/Documentation/perfctr/low-level-x86.txt, .../linux/Documentation/perfctr/overview.txt, .../linux/Documentation/perfctr/virtual.txt, src/perfctr-2.7.x/linux/drivers/perfctr/Kconfig, src/perfctr-2.7.x/linux/drivers/perfctr/Makefile, 
.../linux/drivers/perfctr/RELEASE-NOTES, src/perfctr-2.7.x/linux/drivers/perfctr/cpumask.h, src/perfctr-2.7.x/linux/drivers/perfctr/init.c, src/perfctr-2.7.x/linux/drivers/perfctr/ppc.c, src/perfctr-2.7.x/linux/drivers/perfctr/ppc64.c, .../linux/drivers/perfctr/ppc64_tests.c, .../linux/drivers/perfctr/ppc64_tests.h, .../linux/drivers/perfctr/ppc_tests.c, .../linux/drivers/perfctr/ppc_tests.h, src/perfctr-2.7.x/linux/drivers/perfctr/version.h, src/perfctr-2.7.x/linux/drivers/perfctr/virtual.c, src/perfctr-2.7.x/linux/drivers/perfctr/virtual.h, src/perfctr-2.7.x/linux/drivers/perfctr/x86.c, .../linux/drivers/perfctr/x86_tests.c, .../linux/drivers/perfctr/x86_tests.h, src/perfctr-2.7.x/linux/include/asm-i386/perfctr.h, src/perfctr-2.7.x/linux/include/asm-ppc/perfctr.h, .../linux/include/asm-ppc64/perfctr.h, .../linux/include/asm-x86_64/perfctr.h, src/perfctr-2.7.x/linux/include/linux/perfctr.h, src/perfctr-2.7.x/patches/patch-kernel-2.6.11, src/perfctr-2.7.x/patches/patch-kernel-2.6.12-rc1, .../patches/patch-kernel-2.6.12-rc1-mm1, .../patches/patch-kernel-2.6.12-rc1-mm3, .../patches/patch-kernel-2.6.12-rc1-mm4, src/perfctr-2.7.x/patches/patch-kernel-2.6.12-rc2, src/perfctr-2.7.x/patches/patch-kernel-2.6.12-rc5, src/perfctr-2.7.x/patches/patch-kernel-2.6.14, src/perfctr-2.7.x/patches/patch-kernel-2.6.14-mm1, .../patches/patch-kernel-2.6.14-rc5-mm1, src/perfctr-2.7.x/patches/patch-kernel-2.6.16, .../patches/patch-kernel-2.6.16.21-SLES10, src/perfctr-2.7.x/patches/patch-kernel-2.6.17, src/perfctr-2.7.x/patches/patch-kernel-2.6.18, src/perfctr-2.7.x/patches/patch-kernel-2.6.18-rc4, src/perfctr-2.7.x/patches/patch-kernel-2.6.19, src/perfctr-2.7.x/patches/patch-kernel-2.6.20, src/perfctr-2.7.x/patches/patch-kernel-2.6.21, src/perfctr-2.7.x/patches/patch-kernel-2.6.22, src/perfctr-2.7.x/perfctr.spec, src/perfctr-2.7.x/update-kernel, src/perfctr-2.7.x/usr.lib/Makefile, src/perfctr-2.7.x/usr.lib/arch.h, src/perfctr-2.7.x/usr.lib/event_set.h, 
src/perfctr-2.7.x/usr.lib/event_set_amd.c, src/perfctr-2.7.x/usr.lib/event_set_centaur.c, src/perfctr-2.7.x/usr.lib/event_set_p4.c, src/perfctr-2.7.x/usr.lib/event_set_p5.c, src/perfctr-2.7.x/usr.lib/event_set_p6.c, src/perfctr-2.7.x/usr.lib/event_set_ppc.c, src/perfctr-2.7.x/usr.lib/event_set_ppc64.c, src/perfctr-2.7.x/usr.lib/event_set_x86.c, src/perfctr-2.7.x/usr.lib/gen-event-codes.c, src/perfctr-2.7.x/usr.lib/global.c, src/perfctr-2.7.x/usr.lib/libperfctr.h, src/perfctr-2.7.x/usr.lib/misc.c, src/perfctr-2.7.x/usr.lib/ppc.c, src/perfctr-2.7.x/usr.lib/ppc.h, src/perfctr-2.7.x/usr.lib/ppc64.c, src/perfctr-2.7.x/usr.lib/ppc64.h, src/perfctr-2.7.x/usr.lib/virtual.c, src/perfctr-2.7.x/usr.lib/x86.c, src/perfctr-2.7.x/usr.lib/x86.h, src/perfctr-2.7.x/usr.lib/x86_cpuid.S, src/perfctr-2.7.x/usr.lib/x86_cpuinfo.c, src/perfctr-2.7.x/usr.lib/x86_cpuinfo.h: perfctr: Remove the obsolete bundled perfctr and libpfm-3.y code perfctr was an earlier project that provided access to a processor's performance monitoring hardware on Linux. Over time Linux has developed its own infrastructure to access the performance monitoring hardware via the perf_event_open syscall. The perfctr kernel patches and user-space code have not been updated to work with current kernels. PAPI has included bundled perfctr and libpfm-3.y code to make use of the perfctr interface, but it is no longer useful. In the interest of making the PAPI code more compact, the obsolete bundled perfctr and libpfm-3.y code have been removed. 2024-06-27 Treece Burgess * src/papi.c: Updating documentation for PAPI_read and PAPI_accum based on feedback. * src/papi.c: Update doxygen documentation to make note of the differences between PAPI_read and PAPI_accum. Specifically, the second parameter in PAPI_accum must be initialized. 
2024-02-29 Daniel Barry * src/components/sysdetect/arm_cpu_utils.c: sysdetect: add support for ARM Neoverse V2 Add cache information for ARM Neoverse V2, per the Reference Manual: https://developer.arm.com/documentation/102375/0002?lang=en These changes have been tested on the ARM Neoverse V2 architecture. Wed Jun 19 23:32:25 2024 -0700 Stephane Eranian * src/libpfm4/README, src/libpfm4/docs/Makefile, src/libpfm4/docs/man3/libpfm_intel_gnr.3, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/events/intel_gnr_events.h, src/libpfm4/lib/events/intel_spr_events.h, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_gnr.c, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/lib/pfmlib_s390x_cpumf.c, src/libpfm4/tests/validate_x86.c: Update libpfm4 Current with commit 92c52017d7395c4040ec22949ee8c7f17bc5b4f7 add missing pfmlib_intel_gnr.c file To complete the Intel GraniteRapids support. Was missing from commit 23d0b7c47c2e ("add Intel GraniteRapids core PMU support") Sorry about that. commit 44a55dd97c929087b4ca93540f6bbb2d2efffd15 s390: Fix calloc compiler error for gcc14 The definition of calloc is as follows: void *calloc(size_t nmemb, size_t size); number of members is in the first parameter and the size is in the second parameter. Fix error message on the gcc 14 20240102: error: 'calloc' sizes specified with 'sizeof' in the earlier argument and not in the later argument ... by adhering to the calloc() calling convention. Output before: # make Entering directory '/root/perfmon2-libpfm4/lib' cc -g -Wall -Werror -Wextra -Wno-unused-parameter -I. \ -I/root/perfmon2-libpfm4/lib/../include -DCONFIG_PFMLIB_DEBUG \ -DCONFIG_PFMLIB_OS_LINUX -DCONFIG_PFMLIB_NOTRACEPOINT \ -DHAS_OPENAT -D_REENTRANT -I. -fvisibility=hidden \ -DCONFIG_PFMLIB_ARCH_S390X -I. 
-c pfmlib_s390x_cpumf.c pfmlib_s390x_cpumf.c: In function ‘pfm_cpumcf_init’: pfmlib_s390x_cpumf.c:221:26: error: ‘calloc’ sizes specified with \ ‘sizeof’ in the earlier argument and not in the later argument \ [-Werror=calloc-transposed-args] 221 | cpumcf_pe = calloc(sizeof(*cpumcf_pe), cfvn_set_count); | ^ pfmlib_s390x_cpumf.c:221:26: note: earlier argument should specify number of elements, later size of each element Acked-by: Sumanth Korikkar commit 23d0b7c47c2ec06334b3eb378bfc3568b08e0042 add Intel GraniteRapids core PMU support Based on JSON event files published on github.com/Intel/perfmon version 1.02, dated 05/10/2024 commit 489a940be48980956b27dda89de1eb91b01f185d fix Intel SPR inst_retired umask flags Were missing PEBS and fixed event code umask (ANY) was not encoded properly. inst_retired.any -> fixed counter 0 encoding 0x100 inst_retired.any_p -> generic counter encoding 0x00c0 commit ace21560113100f4ab5032e99753459ed9da7049 fix validate_x86.c Intel SPR/EMR inst_retired tests They were not using the proper event codes for the different versions of inst_retired. Note: Built PAPI with Intel Xeon Gold 6430 (SPR) with GCC 14. No errors occurred during build. Unable to test Granite Rapids support due to the PAPI team not having access to a machine with Granite Rapids. Unable to test the inst_retired fixes due to no access to Emerald Rapids and the events not being available on the Sapphire Rapids machine we have access to. 
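The calling convention that the s390 calloc fix above adheres to can be sketched as follows (a minimal example; `make_counters` is an illustrative name, not taken from the libpfm4 source):

```c
#include <assert.h>
#include <stdlib.h>

/* gcc 14 (-Werror=calloc-transposed-args) flags calloc(sizeof(T), n)
 * because the standard convention is calloc(nmemb, size): the element
 * count goes in the first argument, the element size in the second. */
static long *make_counters(size_t nmemb)
{
    /* correct argument order: count first, then sizeof each element */
    long *p = calloc(nmemb, sizeof(*p));
    return p;   /* zero-initialized on success, NULL on failure */
}
```

Swapping the two arguments is not a correctness bug at run time, which is why the warning is purely about matching the documented convention.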
Fri Jun 14 22:14:05 2024 -0700 Stephane Eranian * src/libpfm4/lib/events/intel_adl_grt_events.h: Update libpfm4 Current with commit 4bdeb7e067363013257460bdb6c3dbae778b5634 Fix encoding of inst_retired.* on Alderlake E-core Was returning event code 0x0 for both .any and .any_p umasks. any_p refers to the generic counter encoding which uses event code 0xc0. Note: Unable to test due to the PAPI team not having access to a machine with Alderlake as of now. 2024-05-13 Treece Burgess * src/papi.c: Restructuring conditional check to check for allowed modifiers. Wed Apr 24 18:01:23 2024 -0700 Stephane Eranian * src/libpfm4/lib/events/intel_icx_unc_cha_events.h: Update libpfm4 Current with commit 7c486cf96f9eab7019023d40f9c568486f696c44 remove Intel IcelakeX UNC_CHA_PIPE_REJECT event Encodings of umasks are invalid and fail to pass tests with perf, as they set bits the kernel does not know about. Note: Tested on ICL system Hexane, with Linux 6.1.81-1.el9.elrepo.x86_64 and Intel(R) Xeon(R) Silver 4309Y CPU @ 2.80GHz as CPU architecture. 2024-04-22 Treece Burgess * src/utils/papi_avail.c: Adding extra check within conditional statement to avoid entering with the options of --cache or --cnd. Mon Apr 15 17:27:33 2024 -0700 Stephane Eranian * src/libpfm4/perf_examples/task.c: Update libpfm4 Current with commit 1c0cdb91bc79eb1bc827e022f0a3738f124796a2 fix uninitialized variable in perf_examples/task.c Commit 9410619f922f ("update task.c example to handle hybrid") Introduced a bug by not initializing group_fd which could generate a compiler warning and a bug. Fix this by initializing group_fd to -1. Reported-by: William Cohen Thu Apr 11 08:36:50 2024 -0700 Stephane Eranian * src/libpfm4/lib/events/intel_spr_events.h: Update libpfm4 Current with commit 72866cbc2666820d87ebc0af3b1a16d1d5db6965 fix duplicate event code for Intel SPR TOPDOWN.BAD_SPEC_SLOTS Was using same encoding as TOPDOWN.SLOTS. Fix by adding the proper event code (0xa4) and code override flag. 
Reported-by: 2024-04-25 William Cohen * src/sde_lib/Makefile: SDE_LIB: Build libsde.so.1.0 with the CFLAGS and LDFLAGS passed in A recent annocheck of the papi RPMS showed that libsde.so.1.0 was not built with the expected flags passed into the RPM build. Minor changes were made to src/sde_lib/Makefile to use the CFLAGS and LDFLAGS passed in. 2024-04-17 Treece Burgess * src/utils/papi_avail.c: Adding PAPI_PRESET_BIT_MEM for correct else if conditional check. 2024-03-28 Anthony * src/sde_lib/sde_lib_misc.c, src/sde_lib/sde_lib_ti.c: SDE_LIB: Allow PAPI_reset() to reset a CountingSet. 2024-04-02 Anthony * src/components/sde/sde.c: SDE: Adding ntv_code_to_info functionality. 2024-03-26 Anthony * src/components/sde/tests/Counting_Set/CountingSet_Lib.c: SDE tests: Add more randomness to the test. By varying the pressure on different buckets we increase the chance of catching hash table bugs. 2024-03-22 Anthony * src/sde_lib/sde_lib_datastructures.c: sde_lib: Improved bucket utilization of hash table. The previous check would put elements in the overflow list, even in cases where there was room in the bucket that the hash function points to. The updated code gives priority to the bucket. This was a potential performance issue, not a correctness bug. 
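The bucket-first policy described in the sde_lib hash-table entry above can be illustrated with a minimal sketch (the structure and names here are illustrative, not taken from sde_lib):

```c
#include <assert.h>
#include <string.h>

#define BUCKET_SLOTS 4

struct bucket {
    int  used;                  /* occupied slots in the home bucket */
    long keys[BUCKET_SLOTS];
    int  overflow_count;        /* stand-in for the overflow list    */
};

/* Prefer a free slot in the home bucket that the hash function points
 * to; only spill to the overflow list when the bucket is truly full. */
static void ht_insert(struct bucket *b, long key)
{
    if (b->used < BUCKET_SLOTS)
        b->keys[b->used++] = key;
    else
        b->overflow_count++;
}
```

As the entry notes, a check that spills to the overflow list too eagerly is still correct, but it trades away the fast in-bucket lookup path, which is why this was a performance fix rather than a bug fix.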
Fri Jan 12 21:55:04 2024 -0800 Stephane Eranian * src/libpfm4/docs/Makefile, src/libpfm4/docs/man3/libpfm_intel_spr_unc_cha.3, src/libpfm4/docs/man3/libpfm_intel_spr_unc_imc.3, src/libpfm4/docs/man3/libpfm_intel_spr_unc_upi.3, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/events/intel_spr_unc_cha_events.h, src/libpfm4/lib/events/intel_spr_unc_imc_events.h, src/libpfm4/lib/events/intel_spr_unc_upi_events.h, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_snbep_unc.c, src/libpfm4/lib/pfmlib_intel_snbep_unc_priv.h, src/libpfm4/lib/pfmlib_intel_spr_unc_cha.c, src/libpfm4/lib/pfmlib_intel_spr_unc_imc.c, src/libpfm4/lib/pfmlib_intel_spr_unc_upi.c, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_x86.c: Update libpfm4 Current with commit 33513ef78f0d81edb277e0d0fd16411abb161297 Add Intel SapphireRapids uncore PMU support for CHA Adds the Coherence and Home Agent (CHA) for Intel SapphireRapids. Based on Intel JSON events v1.17 published from github.com/intel/perfmon/SPR commit e943f891e9f1d63c4b55bac051ca7b2b3979b25f Add Intel SapphireRapids uncore PMU support for UPI Adds the Ultra Path Interconnect PMU (UPI) for Intel SapphireRapids. Based on Intel JSON events v1.17 published from github.com/intel/perfmon/SPR commit 10b8044a90ba512be2b10e9425330e989cc22d01 Add Intel SapphireRapids uncore PMU support for IMC Adds the memory controller PMU (IMC) for SapphireRapids. Based on Intel JSON events v1.17 published from github.com/intel/perfmon/SPR 2024-03-12 Masahiko, Yamada * src/components/sysdetect/linux_cpu_utils.c, src/linux-memory.c: Update: Use fgets in place of fscanf functions to avoid possible buffer overflows There was a bug in the buffer overflow fix below. 
Use fgets in place of fscanf functions to avoid possible buffer overflows https://github.com/icl-utk-edu/papi/commit/ec2aa022fee2a1d0decf1d5b2e7e28a4ca2cf794 The buffer overflow fix rewrites fscanf() to fgets(), but there is a difference in the specifications between fscanf() and fgets(): in fscanf(), "\n" is not read, but in fgets(), "\n" is also read, so it is necessary to remove the unneeded "\n" to behave the same way. For this reason, generic_get_memory_info() could not obtain cache information strings from allocation_policy, type, and write_policy under the /sys/devices/system/cpu/cpu0/cache/index[012] directory as expected. The updated buffer overflow fix allows us to do the same thing as fscanf() by rewriting fscanf() to fgets()+sscanf(). Thu Feb 29 23:07:21 2024 -0800 Stephane Eranian * src/libpfm4/config.mk, src/libpfm4/lib/pfmlib_perf_event_pmu.c: Update libpfm4 Current with commit f517f1ec8038de00ce8f5fefeeef704e24aa08ae add CONFIG_PFMLIB_NOTRACEPOINT to speedup libpfm4 initialization When pfm_initialize() is run as root and if debugfs is mounted, then the library parses all the tracepoints to add them to the perf PMU. Depending on the number of tracepoints, this can take a significant amount of time even though it may not be needed if no tracepoints are passed. In order to speed up pfm_initialize(), the patch adds a compile-time option to disable support for tracepoints in the perf PMU. To deactivate tracepoint support: make CONFIG_PFMLIB_NOTRACEPOINT=y The default build is unchanged with tracepoints enabled. This patch adds an opt-out option. 
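The fgets()+sscanf() pattern described in the fscanf() rewrite above can be sketched as follows (a simplified stand-in for the sysfs readers, not the actual PAPI code):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read one whitespace-delimited token the way fscanf(fp, "%s") would,
 * but with a bounded read: fgets() caps the line length, and sscanf()
 * then extracts the token, stopping before the trailing '\n' that
 * fgets() retains but fscanf() would never have stored. */
static int read_token(FILE *fp, char token[64])
{
    char line[256];
    if (fgets(line, sizeof(line), fp) == NULL)
        return 0;                        /* EOF or read error */
    return sscanf(line, "%63s", token) == 1;
}
```

Using fgets() alone would leave the newline in the result, which is exactly the mismatch that broke the cache-information strings; letting sscanf() do the tokenizing restores fscanf() semantics while keeping the bounded read.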
Tue Feb 13 21:58:11 2024 -0800 Stephane Eranian * src/libpfm4/README, src/libpfm4/docs/Makefile, src/libpfm4/docs/man3/libpfm_intel_adl_glc.3, src/libpfm4/docs/man3/libpfm_intel_adl_grt.3, src/libpfm4/docs/man3/libpfm_intel_icx_unc_pcu.3, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/events/intel_adl_glc_events.h, src/libpfm4/lib/events/intel_adl_grt_events.h, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_adl.c, src/libpfm4/lib/pfmlib_intel_x86.c, src/libpfm4/lib/pfmlib_intel_x86_priv.h, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/perf_examples/task.c, src/libpfm4/tests/validate_x86.c: Update libpfm4 Current with commit 816bb547f8997b84f7ef70bb99420f40dc8a984d add Intel Raptorlake PMU support Enables support for Raptorlake, Raptorlake P, Raptorlake S. commit 8fa4467ffa7014ed8f1525783b5919b80117beca Add Intel AlderLake Gracemont (E-Core) core PMU support Adds core PMU support for Alderlake E-core (gracemont). Based on Intel JSON events v1.24 published from github.com/intel/perfmon/ADL commit e84a9563f4c93dc6e530dfa55d61b150fbf51510 Add Intel AlderLake Goldencove (P-Core) core PMU support Adds core PMU support for Alderlake P-core (goldencove). Based on Intel JSON events v1.24 published from github.com/intel/perfmon/ADL commit 9410619f922facca7dab2406c58fe41a8dd61529 update task.c example to handle hybrid Cannot group events if they do not belong to the same hardware PMU. commit 2441b263f6f28c0fe80f8cee62cd2e64d75cd433 add INTEL_X86_CODE_DUP event flag for Intel PMUs To handle the case where two events share the same code and neither is due to deprecation. In order to pass validation both events with the same code must have that flag set. commit 9669e0d696a98b8b5655186dda8b457113cb0ba2 Add support for deprecated events for Intel X86 PMUs Adds INTEL_X86_DEPRECATED flag to Intel X86 events. Deprecated in this context means there is a newer event monitoring the same condition. 
This is used to mark events as deprecated and avoid detecting duplicate event codes. commit b7307408ddb1548271983d1fd7c4f17287d2dc0e fix Intel IcelakeX uncore PCU PMU man page typo Was referring to eRP clockticks instead of PCU clockticks. 2024-02-19 Daniel Barry * src/papi_events.csv: presets: add support for Ice Lake ICL Add various preset definitions for architectures containing the ICL PMU. These changes have been tested on the Intel Ice Lake architecture. 2024-02-16 Treece Burgess * src/counter_analysis_toolkit/Makefile: Updating CAT Makefile to use PAPI_DIR, to be more in line with the PAPI WIKI documentation. 2024-02-08 Treece Burgess * src/high-level/scripts/papi_hl_output_writer.py: Changing --source flag to --source_dir and updating the error output messages. Removed Python doc string and will address docs in another PR. 2024-02-06 Treece Burgess * src/high-level/scripts/papi_hl_output_writer.py: Fixing small typo in parse_source_file docstring. * src/high-level/scripts/papi_hl_output_writer.py: Updating error handling, with updated output messages. * src/high-level/scripts/papi_hl_output_writer.py: Adding support for optional individual file flag to papi_hl_output_writer.py script. 2024-02-01 Anthony * .../sde/tests/Minimal/Minimal_Test++.cpp, .../sde/tests/Simple2/Simple2_Driver++.cpp, src/components/sde/tests/Simple2/Simple2_Lib++.cpp, src/components/sde/tests/Simple2/simple2.hpp: SDE: Updated the C++ version of test Simple2. 2024-01-30 Anthony * .../sde/tests/Minimal/Minimal_Test++.cpp: SDE: Updated the C++ version of the minimal test. The updated version of the example uses classes and looks like proper C++ much more than the previous version. However, the functionality has remained the same. 
Fri Feb 2 15:32:27 2024 -0800 Stephane Eranian * src/libpfm4/lib/events/intel_spr_events.h, src/libpfm4/lib/pfmlib_intel_icx_unc_irp.c, src/libpfm4/tests/validate_x86.c: Update libpfm4 Current with commit 0d4ed0e7b09338e1bb1ab9153beab030c52570fe fix missing PEBS flag on Intel SPR MEM_LOAD_L3_MISS_RETIRED commit 769b239ee314 ("add PEBS support to Intel SPR MEM_LOAD_L3_MISS_RETIRED") added the PEBS flag to the event but to none of the umasks causing a validation issue. Add missing PEBS flags to the umasks commit 769b239ee31465f030f63d8dd16c6be006bfcb55 add PEBS support to Intel SPR MEM_LOAD_L3_MISS_RETIRED Was missing PEBS flag. 2024-01-29 Treece Burgess * src/papi_fwrappers.c: Updating first argument for PAPIF_flops_rate, PAPIF_flips_rate, and PAPIF_epc to use an int * instead of just int. 2024-01-22 Treece Burgess * src/components/net/README.md, src/components/rocm/README.md: Updating net README to remove unwanted links and rocm README to add correct link for GPU isolation. 2024-01-19 Treece Burgess * src/components/appio/README.md, src/components/bgpm/README.md, src/components/coretemp/README.md, src/components/coretemp_freebsd/README.md, src/components/cuda/README.md, src/components/emon/README.md, src/components/example/README.md, src/components/host_micpower/README.md, src/components/infiniband/README.md, src/components/io/README.md, src/components/libmsr/README.md, src/components/lmsensors/README.md, src/components/lustre/README.md, src/components/micpower/README.md, src/components/mx/README.md, src/components/net/README.md, src/components/nvml/README.md, src/components/pcp/README.md, src/components/perf_event/README.md, src/components/perf_event_uncore/README.md, src/components/powercap/README.md, src/components/powercap_ppc/README.md, src/components/rapl/README.md, src/components/rocm/README.md, src/components/rocm_smi/README.md, src/components/sde/README.md, src/components/sensors_ppc/README.md, src/components/stealtime/README.md, 
src/components/sysdetect/README.md, src/components/vmware/README.md: Updates to individual component READMEs to fix markdown links. 2024-01-12 Daniel Barry * src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/dcache.h, src/counter_analysis_toolkit/main.c: cat: improve progress indicator in d-cache tests Make the progress indicator finer-grained to visualize when each successive buffer size is probed. These changes have been tested on the Intel Sapphire Rapids architecture. * src/counter_analysis_toolkit/dcache.c, src/counter_analysis_toolkit/timing_kernels.c, src/counter_analysis_toolkit/timing_kernels.h: cat: optimize data cache benchmarks Traverse the pointer chain fewer times for larger buffers and do a warm-up traversal closer to PAPI_start() to prevent cache pollution and compulsory misses. These changes have been tested on the Intel Sapphire Rapids architecture. 2024-01-09 Vince Weaver * src/linux-context.h, src/linux-timer.c, src/mb.h: add initial riscv support This adds basic support for the RISC-V architecture. After this PAPI will compile and the tools will run, however no events will work because of missing libpfm4 support. 
Tested on a BeagleV-Ahead board Fri Dec 8 22:25:25 2023 -0800 Stephane Eranian * src/libpfm4/docs/Makefile, src/libpfm4/docs/man3/libpfm_intel_icx_unc_cha.3, src/libpfm4/docs/man3/libpfm_intel_icx_unc_iio.3, src/libpfm4/docs/man3/libpfm_intel_icx_unc_imc.3, src/libpfm4/docs/man3/libpfm_intel_icx_unc_irp.3, src/libpfm4/docs/man3/libpfm_intel_icx_unc_m2m.3, .../docs/man3/libpfm_intel_icx_unc_m2pcie.3, src/libpfm4/docs/man3/libpfm_intel_icx_unc_m3upi.3, src/libpfm4/docs/man3/libpfm_intel_icx_unc_pcu.3, src/libpfm4/docs/man3/libpfm_intel_icx_unc_ubox.3, src/libpfm4/docs/man3/libpfm_intel_icx_unc_upi.3, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/events/intel_icx_unc_cha_events.h, src/libpfm4/lib/events/intel_icx_unc_iio_events.h, src/libpfm4/lib/events/intel_icx_unc_imc_events.h, src/libpfm4/lib/events/intel_icx_unc_irp_events.h, src/libpfm4/lib/events/intel_icx_unc_m2m_events.h, .../lib/events/intel_icx_unc_m2pcie_events.h, .../lib/events/intel_icx_unc_m3upi_events.h, src/libpfm4/lib/events/intel_icx_unc_pcu_events.h, src/libpfm4/lib/events/intel_icx_unc_ubox_events.h, src/libpfm4/lib/events/intel_icx_unc_upi_events.h, src/libpfm4/lib/pfmlib_amd64_rapl.c, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_icx_unc_cha.c, src/libpfm4/lib/pfmlib_intel_icx_unc_iio.c, src/libpfm4/lib/pfmlib_intel_icx_unc_imc.c, src/libpfm4/lib/pfmlib_intel_icx_unc_irp.c, src/libpfm4/lib/pfmlib_intel_icx_unc_m2m.c, src/libpfm4/lib/pfmlib_intel_icx_unc_m2pcie.c, src/libpfm4/lib/pfmlib_intel_icx_unc_m3upi.c, src/libpfm4/lib/pfmlib_intel_icx_unc_pcu.c, src/libpfm4/lib/pfmlib_intel_icx_unc_ubox.c, src/libpfm4/lib/pfmlib_intel_icx_unc_upi.c, src/libpfm4/lib/pfmlib_intel_snbep_unc.c, src/libpfm4/lib/pfmlib_intel_snbep_unc_priv.h, src/libpfm4/lib/pfmlib_intel_x86_priv.h, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_x86.c: libpfm4: update to commit 90f61a0 Original commits: commit 90f61a008cdee50d085b5041414df55e16e045fe Add Intel 
IcelakeX uncore PMU support for M2PCIE Adds the Mesh to IIO PMU (M2PCIE) for Intel IcelakeX. Based on Intel JSON events v1.21 published from github.com/intel/perfmon/ICX commit 22afed4c1020b579205ac8e8f9d6e8599307b9ee Add Intel IcelakeX uncore PMU support for UBOX Adds the UBOX PMU (UBOX) for Intel IcelakeX. Based on Intel JSON events v1.21 published from github.com/intel/perfmon/ICX commit cdbe2eed7bdcf5d45086d6730033defc1939a722 Add Intel IcelakeX uncore PMU support for M3UPI Adds the Mesh to UPI PMU (M3UPI) for Intel IcelakeX. Based on Intel JSON events v1.21 published from github.com/intel/perfmon/ICX commit 17dddc2f4cde87f37f041f40586654190da5a8c2 Add Intel IcelakeX uncore PMU support for UPI Adds the Ultra Path Interconnect PMU support (UPI) for Intel IcelakeX. Based on Intel JSON events v1.21 published from github.com/intel/perfmon/ICX commit 32fcf6fe2eaf2f2bf105f7543dcc0b07c097baaf Add Intel IcelakeX uncore PMU support for PCU Adds the Power Control unit PMU support (PCU) for Intel IcelakeX. Based on Intel JSON events v1.21 published from github.com/intel/perfmon/ICX commit 20ff7523ffaae04f6762d51e32fe35e04fa70cad Add Intel IcelakeX uncore PMU support for M2M Adds the Mesh to Memory PMU support (M2M) for Intel IcelakeX. Based on Intel JSON events v1.21 published from github.com/intel/perfmon/ICX commit d5ed0e03686c051f5311fc1993824245eb10e1d2 Add Intel IcelakeX uncore PMU support for IRP Adds the PCIe IIO Ring Port PMU support (IRP) for Intel IcelakeX. Based on Intel JSON events v1.21 published from github.com/intel/perfmon/ICX commit 30afdce909bcc7313af7599cfbf6484ae4b1fc3e Add Intel IcelakeX uncore PMU support for IIO Adds the PCIe I/O controller PMU support (IIO) for Intel IcelakeX. Based on Intel JSON events v1.21 published from github.com/intel/perfmon/ICX commit 6237022aa77bc9c845b1c48d741e54bdc22ac077 Add Intel IcelakeX uncore PMU support for IMC Adds the memory controller PMU support (IMC) for Intel IcelakeX. 
Based on Intel JSON events v1.21 published from github.com/intel/perfmon/ICX commit 05f04adec932cd2cd28e83f718e4e0ae6ba2eab4 Add Intel IcelakeX uncore PMU support for CHA Adds Intel IcelakeX CHA (Coherency and Home Agent) uncore PMU support. Based on Intel published uncore JSON events v1.21 from github.com/intel/perfmon/ICX. commit 94e82e27c02ef01f288a1b40904d72b2954d3f31 check umasks[] bounds in intel_x86_uflag() Otherwise may run into SEGFAULT for some events. commit d058479bd048d2742df298097da86bc86dd1a5ce Enable RAPL support on AMD Zen4 Just like other AMD EPYC processors, only ENERGY_PKG is supported. 2025-02-24 Anthony Danalis * doc/Doxyfile-common, papi.spec, src/Makefile.in, src/configure.in: Version updated for releasing 7.2.0b2 2025-02-24 Treece Burgess * src/components/cuda/cupti_dispatch.c, src/components/cuda/cupti_profiler.c, src/components/cuda/cupti_utils.h, src/components/cuda/linux-cuda.c, src/components/cuda/papi_cupti_common.c, src/components/cuda/papi_cupti_common.h: Add support for heterogeneous systems in the Cuda component. 2025-02-21 Anthony * src/counter_analysis_toolkit/scripts/README.md, src/counter_analysis_toolkit/scripts/README.txt, .../scripts/{default.gnp => default_gnp.inc}, .../scripts/multi_plot.gnp, .../L2_RQSTS:ALL_DEMAND_REFERENCES.data.reads.stat, .../L2_RQSTS:DEMAND_DATA_RD_HIT.data.reads.stat, .../L2_RQSTS:DEMAND_DATA_RD_MISS.data.reads.stat, .../sample_data/PM_DATA_FROM_L2.data.reads.stat, .../sample_data/PM_DATA_FROM_L3.data.reads.stat, .../PM_DATA_FROM_L3MISS.data.reads.stat, .../scripts/single_plot.gnp: Changing example files to accommodate Windows. 
2025-02-12 Dong Jun Woun * src/components/rocm_smi/rocs.c: rocm_smi: Initial event count and event table initialization: event count upper bound mismatch & handling unsupported events 2025-02-19 Anthony Danalis * src/components/rocp_sdk/sdk_class.cpp: Ensure context is valid and active when stopping 2025-02-13 Treece Burgess * src/components/cuda/cupti_profiler.c: Remove maxPassCount from calculate_num_passes. Add further documentation for maxPassCount behavior. 2025-02-12 Anthony Danalis * src/components/rocp_sdk/sdk_class.cpp: Propagate error when obtaining function pointers. If we fail to dlopen() the library because the path was wrong, or if we can't dlsym() some of the functions, then make sure the code does not proceed. Sun Feb 9 16:07:40 2025 -0800 Stephane Eranian * src/libpfm4/docs/Makefile, src/libpfm4/docs/man3/libpfm_intel_gnr_unc_imc.3, src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile, src/libpfm4/lib/events/intel_adl_glc_events.h, src/libpfm4/lib/events/intel_adl_grt_events.h, src/libpfm4/lib/events/intel_gnr_events.h, src/libpfm4/lib/events/intel_gnr_unc_imc_events.h, src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_intel_gnr_unc_imc.c, src/libpfm4/lib/pfmlib_intel_snbep_unc.c, src/libpfm4/lib/pfmlib_intel_snbep_unc_priv.h, src/libpfm4/lib/pfmlib_perf_event_pmu.c, src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/perf_examples/self_smpl_multi.c, src/libpfm4/tests/validate_x86.c: Update libpfm4 Current with commit 762ca94010d9a8f21f0440c0b5807e9a2e849420 Cleanup Alderlake event descriptions Remove . at end of descriptions. commit 66627c778115b7d5a1cd6200250b7c4b07bccc67 Add Intel GraniteRapids uncore IMC PMU support Adds Intel GraniteRapids uncore IMC (memory controller) PMU support. 
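The error-propagation pattern in the rocp_sdk entry above ("make sure the code does not proceed" when dlopen()/dlsym() fail) follows the usual shape sketched below (an illustrative example, not the component's actual code):

```c
#include <dlfcn.h>
#include <stdio.h>
#include <assert.h>

/* Resolve one symbol from a shared library, returning NULL on any
 * failure so the caller can abort initialization instead of running
 * with unresolved function pointers. */
static void *load_symbol(const char *libpath, const char *symname)
{
    void *handle = dlopen(libpath, RTLD_NOW | RTLD_GLOBAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen(%s): %s\n", libpath, dlerror());
        return NULL;            /* e.g. the library path was wrong */
    }
    void *sym = dlsym(handle, symname);
    if (sym == NULL) {
        fprintf(stderr, "dlsym(%s): %s\n", symname, dlerror());
        dlclose(handle);
        return NULL;            /* missing symbol: do not proceed */
    }
    return sym;
}
```

Checking both steps and surfacing the failure to the caller is what prevents a later call through a NULL or garbage function pointer.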
- Based on Intel JSON event table version : 1.06 - Based on Intel JSON event table published : 01/17/2025 Available at github.com/Intel/perfmon commit 59aae0f5ce1dda4013063a4d192ab793179916d6 Fix clang unused function/variable errors With clang and -Werror, the library did not compile. Fix both issues: - unused perf_get_ovfl_umask_idx() with CONFIG_PFMLIB_NOTRACEPOINT set - unused variable sum in self_smpl_multi.c commit 876528e6213b478986ba2fef768ee7e06df0e5fd Update Intel GraniteRapids core events to 1.06 Updates the Intel GraniteRapids core PMU event table to latest Intel released version: Date : 01/17/2025 Version: 1.06 From github.com/Intel/perfmon Note: At this time the PAPI team does not have access to a machine with Intel GraniteRapids or Intel Alderlake to test Commit IDs: - 762ca94 - 66627c7 - 876528e As a sanity check, PAPI was built on an Intel SkyLake (Intel(R) Xeon(R) Gold 6140) with the utilities behaving as expected. Commit ID 59aae0f does resolve the issue of being unable to compile libpfm4 with llvm compilers. Tested with llvm 11.1.0. 2025-02-10 Anthony Danalis * src/components/rocp_sdk/README.md, src/components/rocp_sdk/sdk_class.cpp, src/components/rocp_sdk/sdk_class.hpp: Updated variable names for uniformity. 2025-02-05 Anthony Danalis * src/components/rocp_sdk/README.md: Version information. * src/components/rocp_sdk/README.md: README file. 2025-02-04 Anthony Danalis * src/components/rocp_sdk/sdk_class.cpp, src/components/rocp_sdk/tests/kernel.cpp, src/components/rocp_sdk/tests/two_eventsets.c: rocprofiler_configure() can be called before PAPI_library_init() * src/components/rocp_sdk/rocp_sdk.c, src/components/rocp_sdk/sdk_class.cpp: Added author information to files. 2025-01-31 Anthony Danalis * src/components/rocp_sdk/sdk_class.cpp, src/components/rocp_sdk/tests/Makefile, src/components/rocp_sdk/tests/advanced.c: Default value for qualifier "device". 
2024-09-23 Anthony Danalis * src/components/rocp_sdk/rocp_sdk.c, src/components/rocp_sdk/sdk_class.cpp, src/components/rocp_sdk/sdk_class.hpp, src/components/rocp_sdk/tests/Makefile, src/components/rocp_sdk/tests/advanced.c, src/components/rocp_sdk/tests/kernel.cpp, src/components/rocp_sdk/tests/simple.c, src/components/rocp_sdk/tests/simple_sampling.c, src/components/rocp_sdk/tests/two_eventsets.c: ROCP_SDK: Enabling device profiling mode. - Sampling is the default mode. - Sampling reads the values directly from the call. - Qualifier "kernel" is a mandatory qualifier. For all other qualifiers the default value is "sum". - Code needs to adapt to different C++ standards w.r.t. mutexes. - Test for Sampling mode. - dispatch mode accumulates values across kernel invocations. 2025-02-10 Bill Williams * src/components/rocp_sdk/Rules.rocp_sdk: rocp_sdk: add missing srcfile Fixes #313. Mon Jan 27 16:35:05 2025 -0800 Stephane Eranian * src/libpfm4/lib/events/amd64_events_fam1ah_zen5.h: Update libpfm4 Current with commit 7750d00833a607eeb53c9a6832ffa8a6b827cdb9 fix AMD Zen5 encodings for L2_FILL_RESPONSE_SRC and L2_PREFETCH_MISS_L3 Event codes were swapped. Note: The PAPI team at this time does not have access to a machine with an AMD Zen5 to test this update. An AMD Zen4 was used as a sanity check with the utilities: papi_component_avail, papi_native_avail, and papi_command_line behaving as expected. 2025-01-27 Treece Burgess * src/components/perf_event/perf_event.c: Updating error message for function check_exclude_guest() in perf_event.c. 2025-01-17 Treece Burgess * src/configure, src/configure.in: Restructure configure and configure.in to avoid errors: cuda.h no such file and integer expression expected on Power 9 and Power 10. 2025-01-31 Treece Burgess * .github/workflows/papi_framework_workflow.yml: Replace actions/upload-artifact@v3 with actions/upload-artifact@v4. 
2024-12-18  Treece Burgess

	* src/papi_events.csv: Updating papi_events.csv to support the
	addition of L1I_CACHE and deprecation of L1I_CACHE_ACCESS in
	libpfm4. Testing took place on a Neoverse V2 by comparing output
	from the commit prior to the addition of L1I_CACHE and the commit
	that adds L1I_CACHE.
	- Output for papi_avail was identical
	- Output for papi_command_line for the events PAPI_L1_ICA,
	  PAPI_L1_ICH, PAPI_L1_TCA, and PAPI_L1_TCH was roughly identical.

Fri Dec 13 00:23:03 2024 -0800  Stephane Eranian

	* src/libpfm4/include/perfmon/perf_event.h,
	src/libpfm4/lib/events/arm_neoverse_n1_events.h,
	src/libpfm4/lib/events/arm_neoverse_n2_events.h,
	src/libpfm4/lib/events/arm_neoverse_v1_events.h,
	src/libpfm4/lib/events/arm_neoverse_v2_events.h: Update libpfm4

	Current with commit d22403ec9bddaf62c59d847904918b30db69550d
	make L1I_CACHE_ACCESS an alias to the official L1I_CACHE event
	Covers Neoverse N1, N2, N3, V1, V2. L1I_CACHE_ACCESS is marked as
	deprecated.

	commit 0003418f8b698cbb2709e7f6931c6fd94e634f98
	update perf_events interface header to 6.12
	Update perf_event.h to reflect state in 6.12.

	Testing was only done on an ARM Neoverse V2 as the PAPI team does
	not have access to an N1, N2, N3, or V1. Testing:
	- papi_avail successfully runs
	- papi_component_avail successfully runs
	- papi_native_avail shows the new event L1I_CACHE and that
	  L1I_CACHE_ACCESS is deprecated
	- papi_command_line works with L1I_CACHE (tested with qualifiers
	  cpu and u)

2025-01-28  Treece Burgess

	* src/run_tests.sh: Update run_tests.sh to remove bash-specific
	syntax.

2024-10-10  Willow Cunningham

	* src/components/perf_event/perf_event.c: perf_event: Eliminate
	permission error for check_exclude_guest() with paranoid kernel

	On systems with perf_event_paranoid>=2, userspace calls to
	perf_event_open() must set attr.exclude_kernel=1.
	Set exclude_kernel=1 in check_exclude_guest() to allow the
	function to execute successfully on systems with
	perf_event_paranoid>=2.

2024-11-12  Daniel Barry

	* src/counter_analysis_toolkit/eventstock.c: cat: remove error
	return value

	Since the perf_event component can be disabled, this error message
	should be removed. These changes have been tested on the NVIDIA
	Grace-Hopper architecture.

	* src/counter_analysis_toolkit/eventstock.c: cat: add newlines to
	error messages

	Include newlines for consistency. These changes have been tested on
	the NVIDIA Grace-Hopper architecture.

	* src/counter_analysis_toolkit/main.c: cat: remove unnecessary
	cleanup call

	Remove call that is already invoked. These changes have been tested
	on the NVIDIA Grace-Hopper architecture.

2025-01-08  Treece Burgess

	* .github/workflows/cat_workflow.yml: Fixing bug in counter
	analysis toolkit workflow

2025-01-06  Treece Burgess

	* .github/workflows/README.md,
	.github/workflows/appio_component_workflow.yml,
	.github/workflows/cat_workflow.yml, .github/workflows/ci.sh,
	.github/workflows/ci_cat.sh,
	.github/workflows/ci_default_components.sh,
	.github/workflows/ci_individual_component.sh,
	.github/workflows/ci_papi_framework.sh,
	.github/workflows/coretemp_component_workflow.yml,
	.github/workflows/cuda_component_workflow.yml,
	.github/workflows/default_components_workflow.yml,
	.github/workflows/example_component_workflow.yml,
	.github/workflows/intel_gpu_component_workflow.yml,
	.github/workflows/io_component_workflow.yml,
	.github/workflows/lmsensors_component_workflow.yml,
	.github/workflows/main.yml,
	.github/workflows/net_component_workflow.yml,
	.github/workflows/nvml_component_workflow.yml,
	.github/workflows/papi_framework_workflow.yml,
	.github/workflows/powercap_component_workflow.yml,
	.github/workflows/rocm_component_workflow.yml,
	.github/workflows/rocm_smi_component_workflow.yml,
	.github/workflows/sde_component_workflow.yml,
	.github/workflows/stealtime_component_workflow.yml,
	src/run_tests.sh,
	src/run_tests_shlib.sh: Updating the PAPI GitHub CI, see README
	for more details on structure.

2024-12-11  Daniel Barry

	* src/components/intel_gpu/Rules.intel_gpu,
	src/components/intel_gpu/tests/Makefile: intel_gpu: remove
	unnecessary linker flags

	Since the flag -lstdc++ is already included from the configure
	script, it does not need to be included in the component's Rules
	and Makefiles. Also, -g is already included in the recipes, so it
	is not needed in the linker flags variable. These changes have been
	tested on the Intel Max 1550 GPU.

	* src/configure, src/configure.in: configure: -pthread with
	intel_gpu

	Since the intel_gpu component uses mutexes, which use POSIX threads
	on some systems, -pthread needs to be used when this component is
	configured-in. These changes have been tested on the Intel Max
	1550 GPU.

2024-12-13  Treece Burgess

	* src/components/lmsensors/tests/Makefile,
	.../lmsensors/tests/lmsensors_list_events.c,
	src/components/lmsensors/tests/lmsensors_read.c: Adding tests for
	the lmsensors component.

Thu Oct 3 16:25:55 2024 +0900  Yoshihiro Furudera

	* src/libpfm4/README, src/libpfm4/docs/Makefile,
	src/libpfm4/docs/man3/libpfm_arm_monaka.3,
	src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile,
	src/libpfm4/lib/events/arm_fujitsu_monaka_events.h,
	src/libpfm4/lib/pfmlib_arm_armv9.c,
	src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_priv.h,
	src/libpfm4/tests/validate_arm64.c: Update libpfm4

	Current with commit 5e26b48b6d9b9d5f8c368c81cfe23a54a129bd24
	Enable support for FUJITSU-MONAKA core PMU
	This patch adds support for the FUJITSU-MONAKA core PMU. This
	includes ARMv9 generic core events and FUJITSU-MONAKA specific
	events.
	FUJITSU-MONAKA Specification URL:
	https://github.com/fujitsu/FUJITSU-MONAKA

	Note: The PAPI team at this time does not have access to a machine
	with the ARM processor FUJITSU-MONAKA for testing.

2024-11-06  Treece Burgess

	* src/components/rapl/linux-rapl.c: Add support for AMD family 25
	(19h) processors in the RAPL component.
	Tested on Family/Model/Stepping:
	- 25/1/1
	- 25/48/1
	- 25/17/1
	- 25/144/1
	- 25/97/2

2024-11-21  Treece Burgess

	* src/ctests/all_native_events.c,
	src/ctests/get_event_component.c: Update all_native_events.c and
	get_event_component.c to take an optional flag
	--disable-cuda-events= to disable processing of Cuda native events.

2024-12-02  Dong Jun Woun

	* src/configure, src/configure.in, src/darwin-common.c,
	src/darwin-common.h, src/darwin-memory.c, src/run_tests.sh,
	src/threads.c, src/utils/papi_multiplex_cost.c,
	src/utils/print_header.c: Disable perf_event, perf_event_uncore,
	and cpu

2024-11-27  Willow Cunningham

	* src/components/rapl/linux-rapl.c: rapl: fixed indentation
	(spaces->tabs)

2024-11-14  Willow Cunningham

	* src/components/rapl/linux-rapl.c: rapl: Add support for
	RaptorLake

	Add RAPL component support for the RaptorLake architecture. This
	change was tested for Family/Model/Stepping 0x6/0xb7/0x1
	(RaptorLake-S/HX) only. The other two models of RaptorLake (Models
	0xba and 0xbf) are untested.

2024-11-12  Willow Cunningham

	* src/components/rapl/linux-rapl.c: fixed indentation

2024-09-19  Willow Cunningham

	* src/components/rapl/linux-rapl.c: rapl: Added RAPL component
	support for Intel RaptorLake.
Sat Sep 28 23:07:55 2024 -0700  Stephane Eranian

	* src/libpfm4/README, src/libpfm4/docs/Makefile,
	src/libpfm4/docs/man3/libpfm_arm_neoverse_n3.3,
	src/libpfm4/docs/man3/libpfm_intel_gnr.3,
	src/libpfm4/docs/man3/libpfm_intel_icl.3,
	src/libpfm4/docs/man3/libpfm_intel_spr.3,
	src/libpfm4/docs/man3/pfm_get_event_attr_info.3,
	src/libpfm4/examples/showevtinfo.c,
	src/libpfm4/include/perfmon/pfmlib.h,
	src/libpfm4/lib/events/arm_neoverse_n3_events.h,
	src/libpfm4/lib/events/intel_gnr_events.h,
	src/libpfm4/lib/events/intel_icl_events.h,
	src/libpfm4/lib/events/intel_spr_events.h,
	src/libpfm4/lib/pfmlib_arm_armv9.c,
	src/libpfm4/lib/pfmlib_common.c,
	src/libpfm4/lib/pfmlib_intel_x86.c,
	src/libpfm4/lib/pfmlib_intel_x86_priv.h,
	src/libpfm4/lib/pfmlib_perf_event.c, src/libpfm4/lib/pfmlib_priv.h,
	src/libpfm4/perf_examples/task.c,
	src/libpfm4/tests/validate_arm64.c,
	src/libpfm4/tests/validate_x86.c: Update libpfm4

	Current with commit 91970fe6eb4e80b63f77fb54a9592e28a207050c
	Add ARM Neoverse N3 core PMU support
	Adds ARM Neoverse N3 core PMU support. Based on:
	https://github.com/ARM-software/data/blob/master/pmu/neoverse-n3.json

	commit 1c2c67b38cd28823b3e34208b86e4656b55d310f
	fix group_pmu initialization in perf_examples/task
	Variable was reinitialized at each iteration, preventing events
	from being grouped by fd via perf_event_open()

	commit 298127beb46c43ed02f8d2f8efc8ce52a9d601db
	Add support for Topdown via PERF_METRICS for Intel Icelake/IcelakeX
	Add the TOPDOWN_M dedicated event to provide the pseudo encodings
	necessary to program Topdown L1 events onto the PERF_METRICS MSR of
	Intel Icelake on Linux. In order to successfully use PERF_METRICS,
	the kernel imposes some restrictions (which are not known to
	libpfm4) which the user must follow: TOPDOWN_M events must be
	passed to the kernel in a single perf_events group (chained fds)
	AND the TOPDOWN_M.SLOTS event must be the first event in that
	group.
	Note that only the SLOTS events (programmed in fixed counter3)
	support modifiers such as user vs. kernel, hw_smpl or precise
	sampling on perf_events. All other umasks do not support any
	modifier. The SLOTS event controls the filtering for all
	PERF_METRICS pseudo events. The encodings provided by libpfm4 for
	fixed counters are specific to Linux. When used on non-Linux
	systems, encodings are not guaranteed to be valid.

	commit d01108b9b470131545389be4f2f479d0fa4b9444
	Add support for Topdown via PERF_METRICS for Intel GraniteRapids
	Cut & paste of the Intel SapphireRapids support. Add the TOPDOWN_M
	dedicated event to provide the pseudo encodings necessary to
	program Topdown L1 and L2 events onto the PERF_METRICS MSR of Intel
	GraniteRapids on Linux. In order to successfully use PERF_METRICS,
	the kernel imposes some restrictions (which are not known to
	libpfm4) which the user must follow: TOPDOWN_M events must be
	passed to the kernel in a single perf_events group (chained fds)
	AND the TOPDOWN_M.SLOTS event must be the first event in that
	group.
	Note that only the SLOTS events (programmed in fixed counter3)
	support modifiers such as user vs. kernel, hw_smpl or precise
	sampling on perf_events. All other umasks do not support any
	modifier. The SLOTS event controls the filtering for all
	PERF_METRICS pseudo events. The encodings provided by libpfm4 for
	fixed counters are specific to Linux. When used on non-Linux
	systems, encodings are not guaranteed to be valid.

	commit d9de389d9cf168116b4753ac7d94edbffcd2a161
	Add support for Topdown via PERF_METRICS for Intel SapphireRapids
	Add the TOPDOWN_M dedicated event to provide the pseudo encodings
	necessary to program Topdown L1 and L2 events onto the PERF_METRICS
	MSR of Intel SapphireRapids on Linux.
	In order to successfully use PERF_METRICS, the kernel imposes some
	restrictions (which are not known to libpfm4) which the user must
	follow: TOPDOWN_M events must be passed to the kernel in a single
	perf_events group (chained fds) AND the TOPDOWN_M.SLOTS event must
	be the first event in that group.
	Note that only the SLOTS events (programmed in fixed counter3)
	support modifiers such as user vs. kernel, hw_smpl or precise
	sampling on perf_events. All other umasks do not support any
	modifier. The SLOTS event controls the filtering for all
	PERF_METRICS pseudo events. The encodings provided by libpfm4 for
	fixed counters are specific to Linux. When used on non-Linux
	systems, encodings are not guaranteed to be valid.

	commit 1530567c76d290233b68e71b064cb1c3abfde8c3
	add support_no_mods support to Intel X86 encoding
	This patch adds support for attribute info support_no_mods to Intel
	X86 encoding routines. There is a new INTEL_X86_NO_MODS flag that
	can be set on the umasks flags field.

	commit 7e131752c0e486555cbb57342cdac2087e129dd4
	Add support_no_mods support to perf_events encoding
	This patch adds handling of the support_no_mods attribute info to
	the perf_events encoding routine. If the flag is set for any umasks
	then it is applied to all. It is expected that events with such
	conditions prohibit umask combinations. Only the perf_events
	pinned attribute is maintained because it is a modifier of the
	perf_events subsystem with no specific implication in the hardware.

	commit bf495f91ef5c455af0de679703090566a776b65a
	Add support_hw_smpl description to pfm_get_event_attr_info() man
	page
	Was missing from the manual since the field was added.

	commit 8717588413b931818eddc6c9caaa9995c7d2418d
	Add new attribute info field support_no_mods
	This new attribute is added to the pfm_event_attr_info_t structure
	to handle the case where some of the umasks of an event do not
	support all the modifiers that the event as a whole supports. That
	applies to privilege level filtering, hardware sampling and such.
	commit 704fb4aa4b722c110256ec0488e8f895542f64ab
	fix assignment in pfmlib_build_event_pattrs
	Was doing os_nattrs += ..... when os_nattrs was just initialized
	at declaration with no prior usage.

	Testing:
	- Arm Neoverse V3: As it stands the PAPI team does not have access
	  to a machine with ARM Neoverse V3 for testing.
	- Icelake (Intel Xeon Silver 4309Y): Successful PAPI build.
	  papi_component_avail, papi_native_avail, and papi_command_line
	  are successful. TOPDOWN_M metrics are shown in papi_native_avail,
	  but we are only able to monitor TOPDOWN_M:SLOTS at this time.
	- Granite Rapids: As it stands the PAPI team does not have access
	  to a machine with Granite Rapids for testing.
	- Sapphire Rapids (Intel Xeon Gold 6430): Successful PAPI build.
	  papi_component_avail, papi_native_avail, and papi_command_line
	  are successful. TOPDOWN_M metrics are shown in papi_native_avail,
	  but we are only able to monitor TOPDOWN_M:SLOTS at this time.

2024-11-21  Dong Jun Woun

	* src/components/cuda/tests/Makefile: remove unnecessary linking

2024-10-04  William Cohen

	* src/components/lmsensors/linux-lmsensors.c: lmsensors: Avoid
	possible overruns on local variable

	The local path_name variable in link_lmsensors_libraries() needs
	to be PATH_MAX in size to avoid possible overruns and scribbling
	over parts of the stack.

	Error: OVERRUN (CWE-119): [#def22] [important]
	papi-7.2.0b1-build/papi-7.2.0b1/src/components/lmsensors/linux-lmsensors.c:398:7:
	overrun-buffer-arg: Overrunning array "path_name" of 1024 bytes by
	passing it to a function which accesses it at byte offset 4095
	using argument "4096UL". [Note: The source code implementation of
	the function has been overridden by a builtin model.]
	#  396|   // Step 3: Try the explicit install default.
	#  397|   if (dl1 == NULL && lmsensors_root != NULL) { // if root given, try it.
	#  398|->     snprintf(path_name, PATH_MAX, "%s/lib64/libsensors.so", lmsensors_root); // PAPI Root check.
	#  399|       dl1 = dlopen(path_name, RTLD_NOW | RTLD_GLOBAL); // Try to open that path.
	#  400|   }

	Error: OVERRUN (CWE-119): [#def23] [important]
	papi-7.2.0b1-build/papi-7.2.0b1/src/components/lmsensors/linux-lmsensors.c:404:7:
	overrun-buffer-arg: Overrunning array "path_name" of 1024 bytes by
	passing it to a function which accesses it at byte offset 4095
	using argument "4096UL". [Note: The source code implementation of
	the function has been overridden by a builtin model.]
	#  402|   // Step 4: Try another explicit install default.
	#  403|   if (dl1 == NULL && lmsensors_root != NULL) { // if root given, try it.
	#  404|->     snprintf(path_name, PATH_MAX, "%s/lib/libsensors.so", lmsensors_root); // PAPI Root check.
	#  405|       dl1 = dlopen(path_name, RTLD_NOW | RTLD_GLOBAL); // Try to open that path.
	#  406|   }

	* src/threads.c: thread: Properly free memory in case of malloc
	failure in allocate_thread

	Added code to properly free memory for the following situation that
	Coverity pointed out:

	Error: RESOURCE_LEAK (CWE-772): [#def49] [important]
	papi-7.2.0b1-build/papi-7.2.0b1/src/threads.c:117:2: alloc_fn:
	Storage is returned from allocation function "malloc".
	papi-7.2.0b1-build/papi-7.2.0b1/src/threads.c:117:2: var_assign:
	Assigning: "thread->running_eventset" = storage returned from
	"malloc(8UL * (size_t)papi_num_components)".
	papi-7.2.0b1-build/papi-7.2.0b1/src/threads.c:134:4:
	leaked_storage: Freeing "thread" without freeing its pointer field
	"running_eventset" leaks the storage that "running_eventset"
	points to.
	#  132|   papi_free( thread->context[i] );
	#  133|   papi_free( thread->context );
	#  134|-> papi_free( thread );
	#  135|   return ( NULL );
	#  136|

	* src/components/sysdetect/linux_cpu_utils.c: sysdetect: Ensure
	that a variable in get_cache_type is always initialized

	Initializing variable type to a sane value (PAPI_MH_TYPE_EMPTY) to
	make sure that the Coverity static analysis does not generate the
	following message:

	Error: UNINIT (CWE-457):
	papi-7.2.0b1-build/papi-7.2.0b1/src/components/sysdetect/linux_cpu_utils.c:561:5:
	var_decl: Declaring variable "type" without initializer.
	papi-7.2.0b1-build/papi-7.2.0b1/src/components/sysdetect/linux_cpu_utils.c:593:5:
	uninit_use: Using uninitialized value "type".
	#  591|   }
	#  592|
	#  593|-> *value = type;
	#  594|
	#  595|   return CPU_SUCCESS;

	* src/components/infiniband/linux-infiniband.c: infiniband: Ensure
	that memory allocated by strdup() is freed

	It is possible in add_ib_counter() that one of the strdup()
	operations fails and the other succeeds. In cases where the struct
	referencing them is being freed because one of the strdup() calls
	failed, any allocated memory should be freed also. This was noted
	in a Coverity scan of PAPI:

	Error: RESOURCE_LEAK (CWE-772):
	papi-7.2.0b1-build/papi-7.2.0b1/src/components/infiniband/linux-infiniband.c:332:5:
	alloc_fn: Storage is returned from allocation function "strdup".
	papi-7.2.0b1-build/papi-7.2.0b1/src/components/infiniband/linux-infiniband.c:332:5:
	var_assign: Assigning: "new_cnt->ev_file_name" = storage returned
	from "strdup(file_name)".
	papi-7.2.0b1-build/papi-7.2.0b1/src/components/infiniband/linux-infiniband.c:338:9:
	leaked_storage: Freeing "new_cnt" without freeing its pointer
	field "ev_file_name" leaks the storage that "ev_file_name" points
	to.
	#  336|   {
	#  337|       PAPIERROR("cannot allocate memory for counter internal fields");
	#  338|->     papi_free(new_cnt);
	#  339|       return (0);
	#  340|   }

	Error: RESOURCE_LEAK (CWE-772):
	papi-7.2.0b1-build/papi-7.2.0b1/src/components/infiniband/linux-infiniband.c:331:5:
	alloc_fn: Storage is returned from allocation function "strdup".
	papi-7.2.0b1-build/papi-7.2.0b1/src/components/infiniband/linux-infiniband.c:331:5:
	var_assign: Assigning: "new_cnt->ev_name" = storage returned from
	"strdup(name)".
	papi-7.2.0b1-build/papi-7.2.0b1/src/components/infiniband/linux-infiniband.c:338:9:
	leaked_storage: Freeing "new_cnt" without freeing its pointer
	field "ev_name" leaks the storage that "ev_name" points to.
	#  336|   {
	#  337|       PAPIERROR("cannot allocate memory for counter internal fields");
	#  338|->     papi_free(new_cnt);
	#  339|       return (0);
	#  340|   }

2024-11-12  Daniel Barry

	* src/papi_events.csv: presets: remove PAPI_FP_OPS from POWER9 &
	POWER10

	The previous definition counted floating-point instructions
	instead of operations. Added a note to papi_events.csv that
	explains how to scale and combine native events to measure all
	FLOPs with multiplexing. These changes have been tested on the IBM
	POWER9 and POWER10 architectures.

2024-10-16  Treece Burgess

	* src/components/cuda/tests/Makefile: Removing the -lpthread flag
	in place of -pthread and adding -pthread to concurrent_profiling
	to build correctly on glibc < 2.34.

2024-11-11  Treece Burgess

	* gitlog2changelog.py, release_procedure.txt: Updating
	gitlog2changelog.py to add command line arguments and updating the
	text in release_procedure.txt.

2024-10-30  Anthony

	* src/counter_analysis_toolkit/Makefile,
	src/counter_analysis_toolkit/instructions.c: CAT: More instruction
	tests and cleaner output.

2024-10-29  Anthony

	* src/counter_analysis_toolkit/Makefile,
	src/counter_analysis_toolkit/instructions.c: CAT: New instruction
	benchmarks for FMA and Int.
2024-10-29  Jeevitha Palanisamy

	* src/papi_events.csv: PAPI power10 event list presets

	Added presets for IBM Power 10

2024-10-16  Treece Burgess

	* src/components/cuda/cupti_profiler.c: Allow for a user to supply
	just the base cuda native event name. Defaults to
	cuda_ntv_name:device=0.

2024-10-24  Treece Burgess

	* src/components/cuda/cupti_profiler.c,
	src/components/cuda/papi_cupti_common.c: Updating the ordering of
	conditional checks in functions to load library files.

2024-10-17  Willow Cunningham

	* src/counter_analysis_toolkit/Makefile: counter_analysis_toolkit:
	use 'ifndef' for more idiomatic code

2024-10-01  Willow Cunningham

	* src/counter_analysis_toolkit/Makefile: counter_analysis_toolkit:
	Added automatic architecture detection to the Makefile

	The CAT Makefile will now use 'uname -m' to detect if the
	architecture is arm or x86, and otherwise assume powerpc. If ARCH
	is already set, it is left alone.

2024-10-23  Anthony

	* src/components/sysdetect/cpu.c,
	src/components/sysdetect/linux_cpu_utils.c,
	src/components/sysdetect/sysdetect.c,
	src/utils/papi_hardware_avail.c: sysdetect: accounting for missing
	core ids.

2024-10-16  Willow Cunningham

	* src/validation_tests/Makefile.recipies: validation_tests: Add
	-O1 optimization flag to matrix_multiply.o

	When compiled with optimization level -O2, the papi_sr_ins test
	has a roughly -33% error. Compiling with -O1 resolves this issue.
	Tested on Intel Xeon CPU E5-2640 v3 (Haswell-EP) and ARM
	Cortex-A76.

2024-10-17  Anthony Danalis

	* src/configure, src/configure.in: Sysdetect: allow users to
	disable component.

	Added a flag (--with-sysdetect=) in configure that allows users to
	disable the Sysdetect component.

2024-10-17  Treece Burgess

	* src/Makefile.in: Update Makefile.in to have PAPI version
	7.2.0.0b1.
2024-09-20  Heike Jagode

	* src/components/cuda/tests/Makefile: Link with CXX instead of CC
	macro

	To avoid issues with older GCC versions (e.g., GCC 9.5.0), where
	libstdc++ needs to be manually added during the linking step (or a
	soft link to the library is required; on our machine, with the GCC
	9.5.0 module loaded (see gcc -v), this would be
	'ln -s /usr/lib64/libstdc++.so.6 /usr/lib64/libstdc++.so'), we
	link using the CXX macro instead of the CC macro.

2024-09-24  Treece Burgess

	* src/components/cuda/README_internal.md,
	src/components/cuda/Rules.cuda, src/components/cuda/cupti_config.h,
	src/components/cuda/cupti_dispatch.c,
	src/components/cuda/cupti_dispatch.h,
	src/components/cuda/cupti_events.c,
	src/components/cuda/cupti_events.h,
	src/components/cuda/cupti_profiler.c,
	src/components/cuda/cupti_profiler.h,
	src/components/cuda/cupti_utils.c,
	src/components/cuda/cupti_utils.h, src/components/cuda/htable.h,
	src/components/cuda/linux-cuda.c,
	src/components/cuda/papi_cupti_common.c,
	src/components/cuda/papi_cupti_common.h: Update Cuda component to
	add a device qualifier.

	As of Cuda Version 12.6, the MetricsContext API has been removed
	and replaced with the MetricsEvaluator API. Due to this change,
	the Cuda component will only work with Cuda versions < 12.6. The
	PAPI team will be working to support both the MetricsContext API
	and MetricsEvaluator API.

2024-10-11  Daniel Barry

	* src/run_tests.sh: run_tests.sh: change libpfm-3.y to libpfm4

	Change paths from libpfm-3.y to libpfm4 so that the tests link to
	libpfm.so.4. These changes have been tested on the NVIDIA
	Grace-Hopper architecture.

2024-02-16  Daniel Barry

	* src/linux-memory.c: papi_mem_info: add support for ARM Neoverse
	V2

	Add TLB information for ARM Neoverse V2, per the Reference Manual:
	https://developer.arm.com/documentation/102375/0002?lang=en
	These changes have been tested on the ARM Neoverse V2, ARM
	Cortex-A72, and AMD Zen3 architectures.
2024-02-15  Daniel Barry

	* src/linux-memory.c: papi_mem_info: check for newlines in sysfs
	files

	Some systems have a newline character at the end of the cache type
	file. The newline character needs to be taken into account for
	proper string comparisons. These changes have been tested on the
	ARM Neoverse V2, ARM Cortex-A72, and AMD Zen3 architectures.

	* src/linux-memory.c: papi_mem_info: convert tabs to spaces

	This helps with readability. These changes have been tested on the
	ARM Neoverse V2, ARM Cortex-A72, and AMD Zen3 architectures.

2024-09-19  Daniel Barry

	* .../intel_gpu/internal/src/GPUMetricHandler.cpp,
	src/components/intel_gpu/linux_intel_gpu_metrics.c: intel_gpu:
	properly reset metrics

	Reset the appropriate fields in the context struct when
	PAPI_cleanup() is called. This corresponds to
	intel_gpu_update_control_state() being called with the value
	'count' equal to zero. Also reset the internal metric counts when
	PAPI_start() is called, so that the previous values do not persist
	after the next PAPI_start(). These changes have been tested on the
	Intel Ponte Vecchio architecture.

	* src/components/intel_gpu/linux_intel_gpu_metrics.c: intel_gpu:
	remove extraneous check for group

	The metric group should not be checked when PAPI_add_event() is
	called. It should be checked upon PAPI_start() only. These changes
	have been tested on the Intel Ponte Vecchio architecture.

	* src/components/intel_gpu/internal/inc/GPUMetricInterface.h:
	intel_gpu: allow larger number of groups

	There were previously only 8 bits allotted to enumerate the metric
	groups, limiting the number of groups to 256. However, there are
	at least 1433 groups available on the Intel Ponte Vecchio
	architecture. These changes have been tested on the Intel Ponte
	Vecchio architecture.

2024-10-07  William Cohen

	* src/high-level/scripts/papi_hl_output_writer.py: high-level:
	Explicitly use python3 and place shebang on the first line

	Python3 has been released for over 15 years.
	Fedora, Debian, and other distributions state that python scripts
	should avoid using an unversioned python interpreter:
	https://fedoraproject.org/wiki/FinalizingFedoraSwitchtoPython3
	https://www.debian.org/doc/packaging-manuals/python-policy/
	The RPM build process will flag unversioned python interpreter use
	as an error and the build will fail. Making
	papi_hl_output_writer.py explicitly use /usr/bin/python3. The
	shebang line that describes which interpreter to use must be the
	first line of the file (https://www.shellcheck.net/wiki/SC1128).

2024-10-02  Treece Burgess

	* doc/Doxyfile-common, doc/Doxyfile-man3,
	man/man3/papi_hl_output_writer_Sum_Counter.3,
	man/man3/papi_hl_output_writer_Sum_Counters.3,
	src/high-level/scripts/papi_hl_output_writer.py: Update
	papi_hl_output_writer.py to use 4 spaces instead of 2 for code
	formatting and add Doxygen documentation to classes and functions.

2024-10-04  Treece Burgess

	* src/papi.c: Correcting documentation in PAPI_add_named_event.

2024-06-26  William Cohen

	* src/utils/Makefile: utils: Include the compiler flags when
	linking the PAPI utilities

	Newer Linux distributions are using LTO to generate the binaries
	rather than standard linking. We should include the flags used to
	generate the .o files to make it clearer what optimization should
	be done in LTO when generating the final binaries.

2023-03-27  Giuseppe Congiu

	* src/components/rocm_smi/rocs.c: rocm_smi: rename special cases
	to derived events

	Derived events are handled in the `handle_special_events`
	function. This name is not intuitive. Use `handle_derived_events`
	instead.

2024-09-23  Treece Burgess

	* src/components/rocm/rocm.c: Update disabled reason if rocm init
	is successful.

2024-09-26  Anthony Danalis

	* src/components/Makefile_comp_tests.target.in,
	src/components/cuda/tests/Makefile, src/configure,
	src/configure.in: CUDA Tests: Conditionally adding -fpic to nvcc.
	If the PAPI shared library is built (either because the user
	explicitly configured with --with-shared-lib=yes, or by default
	because the user did not specify anything) then nvcc will be
	instructed to create position-independent code through the flag
	"-fpic". If the user has disabled the shared library (by
	configuring with --with-shared-lib=no) then this flag is not
	passed to nvcc.

Thu Sep 19 00:38:14 2024 -0700  Stephane Eranian

	* src/libpfm4/README, src/libpfm4/docs/Makefile,
	src/libpfm4/docs/man3/libpfm_arm_ac55.3,
	src/libpfm4/docs/man3/libpfm_arm_ac76.3,
	src/libpfm4/docs/man3/libpfm_arm_neoverse_v3.3,
	src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile,
	src/libpfm4/lib/events/arm_cortex_a55_events.h,
	src/libpfm4/lib/events/arm_cortex_a76_events.h,
	.../lib/events/arm_hisilicon_kunpeng_unc_events.h,
	.../lib/events/arm_marvell_tx2_unc_events.h,
	src/libpfm4/lib/events/arm_neoverse_v3_events.h,
	src/libpfm4/lib/events/power10_events.h,
	src/libpfm4/lib/pfmlib_arm_armv8.c,
	src/libpfm4/lib/pfmlib_arm_armv8_kunpeng_unc.c,
	...c => pfmlib_arm_armv8_kunpeng_unc_perf_event.c},
	src/libpfm4/lib/pfmlib_arm_armv8_thunderx2_unc.c,
	...=> pfmlib_arm_armv8_thunderx2_unc_perf_event.c},
	src/libpfm4/lib/pfmlib_arm_armv8_unc.c,
	src/libpfm4/lib/pfmlib_arm_armv8_unc_priv.h,
	src/libpfm4/lib/pfmlib_arm_armv9.c,
	src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_priv.h,
	src/libpfm4/tests/validate_arm.c,
	src/libpfm4/tests/validate_arm64.c: Update libpfm4

	Current with commit c89a379175c00a20bbc660ad9b444e8ecc16cd28
	add ARM Cortex A76 core PMU support
	Adds ARM Cortex A76 core PMU support. Based on:
	https://github.com/ARM-software/data/blob/master/pmu/cortex-a76.json

	commit 6195cbb4686dbeeee7a237ab8a133ef6c2209476
	fix detection of ARM Cortex A55
	Was using code not yet released. Bug introduced by: c40b6eb0640a
	("add ARM Cortex A55 core PMU support")

	commit 1e8734203f74f0ec6974a860c0b18cb95cce1371
	Update IBM Power10 core PMU support
	Added additional events for the IBM Power 10 core PMU.
	commit c40b6eb0640a649b2c3fdf472c1d6499a8e819c0
	add ARM Cortex A55 core PMU support
	Add support for ARM Cortex A55 core PMU events. Based on:
	https://github.com/ARM-software/data/blob/master/pmu/cortex-a55.json

	commit f91ea4f1a76fdd5886fd9b6fe8eaa6f585a5bac4
	fix ARM thunderX2 and HiSilicon support to compile on non Linux
	The perf_events encoding routines were mixed with generic
	encodings and event tables. This patch cleans all of this up to
	separate generic from Linux-specific code.

	commit 892c5fc89ed5fc0e4f0b4a4a290fac57613f23da
	Add ARM Neoverse V3 core PMU support
	Based on:
	https://github.com/ARM-software/data/blob/master/pmu/neoverse-v3.json

	Note: Below the commits are grouped from top to bottom, with a
	discussion of what could be tested and what could not.
	- Commit ID ending in 16cd28: unable to test due to no access to a
	  machine with ARM Cortex A76.
	- Commit ID ending in 209476: unable to test the update due to no
	  access to a machine with ARM Cortex A55.
	- Commit ID ending in ce1371: tested the Power10 updates on a
	  machine with IBM Power S1022, with papi_component_avail showing
	  the added events in the event count, papi_native_avail showing
	  the newly added events, and the ability to add the new events
	  with papi_command_line.
	- Commit ID ending in e819c0: unable to test the update due to no
	  access to a machine with ARM Cortex A55.
	- Commit ID's ending in a5bac4 and 3f23da: unable to test the
	  updates due to no access to a machine with ARM Neoverse V3, ARM
	  thunderX2, or HiSilicon.

Tue Sep 17 16:36:24 2024 -0700  Stephane Eranian

	* src/libpfm4/lib/pfmlib_common.c: Update libpfm4

	Current with commit 0118612a28d270e78d1f17c24e9db0935e332285
	remove extraneous printf in pfmlib_init_env()
	Was introduced by mistake by commit: 32b7c3d6ab6b ("add
	LIBPFM_PROC_CPUINFO variable for Linux")

	Note: Built PAPI on ARM Neoverse V2 with utilities such as
	papi_component_avail, papi_native_avail, and papi_command_line
	working as expected.
Sun Sep 15 22:29:37 2024 -0700  Stephane Eranian

	* src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_priv.h:
	Update libpfm4

	Current with commit 3724e7ef87e71dd1de46ef4eb4ec2b1be4ea63e5
	add LIBPFM_PROC_CPUINFO variable for Linux
	Allows overriding the filename used to parse the /proc/cpuinfo
	file. This can be used to detect certain CPU models, such as on
	ARM. Providing an override allows testing without the actual
	hardware.

	Note: Built PAPI on machines with Arm Neoverse V2 and Intel(R)
	Xeon(R) CPU E5-2698 v4, with both being successful. Ran
	papi_component_avail, papi_native_avail, and papi_command_line,
	with all three utilities running successfully on both of the
	aforementioned CPU's.

2024-09-05  Anthony Danalis

	* src/components/rocp_sdk/sdk_class.cpp,
	src/components/rocp_sdk/sdk_class.hpp,
	src/components/rocp_sdk/tests/Makefile,
	src/components/rocp_sdk/tests/kernel.cpp,
	src/components/rocp_sdk/tests/two_eventsets.c: Better support for
	multiple eventsets and multiple devices.

	Now we create the dispatch profile when the user calls
	PAPI_start(), not when the dispatch callback is invoked. Also, we
	create as many profiles as there are devices requested by the user
	events, and we associate each profile with the correct agent.

Thu Sep 5 23:38:53 2024 -0700  Stephane Eranian

	* src/libpfm4/README, src/libpfm4/docs/Makefile,
	src/libpfm4/docs/man3/libpfm_arm_ac72.3,
	src/libpfm4/include/perfmon/pfmlib.h,
	src/libpfm4/lib/events/intel_gnr_events.h,
	src/libpfm4/lib/pfmlib_arm.c, src/libpfm4/lib/pfmlib_arm_armv8.c,
	src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_priv.h,
	src/libpfm4/tests/validate_x86.c: Update libpfm4

	Current with commit 9c5c88e734e866f0801b80c527330ad6dbe21e89
	add ARM Cortex A72 Core PMU support
	As a clone of Cortex A57.

	commit 6d276b48eba5ead4e3fd4b6eca359504f2b69b6c
	fix pfm_arm_detect() buffer initialization problem
	Commit 3abda5bc6c1a ("Optimize pfm_detect() for ARM processors")
	added an optimization to avoid parsing /proc/cpuinfo too many
	times.
But it had a bug whereby it was reinitializing the pfm_arm_cfg.* fields multiple times and potentially from an uninitialized buffer. commit fd3191c34ad87e22d3f3d31d2cf5c1050a9136ba Fix FRONTEND_RETIRED.LATE_SWPF encoding on Intel GNR Fix was missing from commit d799b554647 ("update Intel GraniteRapids core PMU to 1.03") Note: As it stands, the PAPI team does not have access to a machine with either Intel Granite Rapids or ARM Cortex A72 to test the updated changes mentioned above. 2024-08-26 Dong Jun Woun * src/run_tests.sh: run_tests.sh: Test only active components In the run_tests.sh script, when checking for component tests, filter for active/inactive components. Tested with the rocm and cuda components on Methane 2024-08-01 Treece Burgess * src/components/cuda/tests/Makefile: Removing unneeded + from NVCFLAGS assignment. 2024-07-29 Treece Burgess * src/components/cuda/tests/Makefile: Removing conditional check for -arch=native flag. 2024-08-26 Daniel Barry * src/components/rocm/tests/common.h: rocm: fix typo in component tests Fix so that when a component test fails, it says "FAILED". Mon Sep 2 21:51:03 2024 -0700 Stephane Eranian * src/libpfm4/lib/events/intel_gnr_events.h: Update libpfm4 Current with commit 0d799b5546477a46b3a52310bbf1884d56e9e37f update Intel GraniteRapids core PMU to 1.03 Updates the Intel GraniteRapids core PMU event table to latest Intel released version: Date : 08/19/2024 Version: 1.03 From github.com/Intel/perfmon update Intel GraniteRapids core PMU event table Update to upstream version 1.03 Note: Unable to test Intel Granite Rapids updates due to the PAPI team not having access to a machine with Granite Rapids. 2024-08-19 Daniel Barry * src/components/cuda/README.md: cuda: fix typo in README Corrected the name of the 'papi_native_avail' utility.
2024-09-06 djwoun <65102751+djwoun@users.noreply.github.com> * README.md: Update README.md 2024-09-05 G-Ragghianti * .github/workflows/spack.sh: Fixing cuda version to highest currently supported papi-papi-7-2-0-t/INSTALL.txt000066400000000000000000000552131502707512200155150ustar00rootroot00000000000000/* * File: INSTALL.txt * CVS: $Id$ * Author: Kevin London * london@cs.utk.edu * Mods: Dan Terpstra * terpstra@cs.utk.edu * Mods: Philip Mucci * mucci@cs.utk.edu * Mods: * */ ***************************************************************************** HOW TO INSTALL PAPI ONTO YOUR SYSTEM ***************************************************************************** On some of the systems that PAPI supports, you can install PAPI right out of the box without any additional setup. Others require drivers or patches to be installed first. The general installation steps are below, but first find your particular Operating System's section for any additional steps that may be necessary. NOTE: the configure and make files are located in the papi/src directory. General Installation 1. % ./configure % make 2. Check for errors. a) Run a simple test case: (This will run ctests/zero) % make test If you get good counts, you can optionally run all the test programs with the included test harness. This will run the tests in quiet mode, which will print PASSED, FAILED, or SKIPPED. Tests are SKIPPED if the functionality being tested is not supported by that platform. % make fulltest (This will run ./run_tests.sh) To run the tests in verbose mode: % ./run_tests.sh -v 3. Create a PAPI binary distribution or install PAPI directly. 
a) To install PAPI libraries and header files from the build tree: % make install b) To install PAPI manual pages from the build tree: % make install-man c) To install PAPI test programs from the build tree: % make install-tests d) To install all of the above in one step from the build tree: % make install-all e) To create a binary kit, papi-.tgz: % make dist ***************************************************************************** MORE ABOUT CONFIGURE OPTIONS ***************************************************************************** There is an extensive array of options available from the configure command-line. These can differ significantly from version to version of PAPI. For complete details on the command-line options, use: % ./configure --help ***************************************************************************** DOCUMENTATION BY DOXYGEN ***************************************************************************** PAPI now ships with documentation generated by doxygen. Documentation for the public APIs can be created by running doxygen from the doc directory. More complete documentation of all internal APIs and structures can be generated with: % doxygen Doxyfile-html Doxygen documentation for the currently released version of PAPI is also available on the website. ***************************************************************************** Operating System Specific Installation Steps (In Alphabetical Order by OS) ***************************************************************************** AIX - IBM POWER5 and POWER6 and POWER7 ***************************************************************************** PAPI is supported on AIX 5.x for POWER5 and POWER6. PAPI is also tested on AIX 6.1 for POWER7. Use ./configure to select the desired make options for your system, specifying the --with-bitmode=32 or --with-bitmode=64 to select wordlength. 32 bits is the default. 1. On AIX 5.x, the bos.pmapi is a product-level fileset (part of the OS).
However, it is not installed by default. Consult your sysadmin to make sure it is installed. 2. Follow the general instructions for installing PAPI. WARNING: PAPI requires XLC version 6 or greater. Your version can be determined by running 'lslpp -a -l | grep -i xlc'. BG/P ***************************************************************************** BG/P is a cross-compiled environment. The machine on which PAPI is compiled is not the machine on which PAPI runs. To compile PAPI on BG/P, specify the BG/P environment as shown below: % ./configure --with-OS=bgp % make NOTE: ./configure might fail if the cross compiler is not in your path. If that is the case, just add it to your path and everything should work: % export PATH=$PATH:/bgsys/drivers/ppcfloor/gnu-linux/bin By default this will make a subset of tests in the ctests directory and all tests in the ftests directory. There is an additional C test program provided for the BG/P environment that exercises the specific BG/P events and demonstrates how to intermix the PAPI and BG/P UPC native calls. This test program is built with the normal make sequence and can be found in the ctests/bgp directory. The testing targets in the make file will not work in the BG/P environment. Since BG/P supports multiple queuing systems, you must manually execute individual programs in the ctests and ftests directories to check for successful library creation. You can also manually edit the run_tests.sh script to automate testing for your installation. Most papi utilities work for BGP, including papi_avail, papi_native_avail, and papi_command_line. Many ctests pass for BGP, but many others produce errors due to the non-traditional architecture of BGP. In particular, PAPI_TOT_CYC always seems to produce 0 counts, although papi_get_virt_usec and papi_get_real_usec appear to work. The IBM RedPaper: http://www.redbooks.ibm.com/abstracts/redp4256.html provides further discussion about PAPI on BGP along with other performance issues. 
BG/Q ***************************************************************************** Five new components have been added to PAPI to support hardware performance monitoring for the BG/Q platform; in particular the BG/Q network, the I/O system, the Compute Node Kernel in addition to the processing core. There are no specific component configure scripts for L2unit, IOunit, NWunit, CNKunit. In order to configure PAPI for BG/Q, use the following configure options at the papi/src level: % ./configure --prefix=< your_choice > \ --with-OS=bgq \ --with-bgpm_installdir=/bgsys/drivers/ppcfloor \ CC=/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc64-bgq-linux-gcc \ F77=/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc64-bgq-linux-gfortran \ --with-components="bgpm/L2unit bgpm/CNKunit bgpm/IOunit bgpm/NWunit" CLE - Cray XT and XE Opteron ***************************************************************************** The Cray XT/XE is a cross-compiled environment. You must specify the perfmon version to configure as shown below. Before running configure to create the makefile that supports a Cray XT/XE CLE build of PAPI, execute the following module commands: % module purge % module load gcc Note: do not load the programming environment module (e.g. PrgEnv-gnu) but the compiler module (e.g. gcc) as shown above. 
Check CLE compute nodes for the version of perfmon2 that it supports: % aprun -b -a xt cat /sys/kernel/perfmon/version and use this version when configuring PAPI for a perfmon2 substrate: % configure CFLAGS="-D__crayxt" \ --with-perfmon=2.82 --prefix= \ --with-virtualtimer=times --with-tls=__thread \ --with-walltimer=cycle --with-ffsll --with-shared-lib=no \ --with-static-tools Configure PAPI for a perf events substrate: % configure CFLAGS="-D__crayxt" \ --with-perf-events --with-pe-incdir= \ --with-assumed-kernel=2.6.34 --prefix= \ --with-virtualtimer=times --with-tls=__thread \ --with-walltimer=cycle --with-ffsll --with-shared-lib=no \ --with-static-tools Invoke the make accordingly: % make CONFIG_PFMLIB_ARCH_CRAYXT=y CONFIG_PFMLIB_SHARED=n % make CONFIG_PFMLIB_ARCH_CRAYXT=y CONFIG_PFMLIB_SHARED=n install The testing targets in the makefile will not work in the XT/XE CLE environment. It is necessary to log into an interactive session and run the tests manually through the job submission system. For example, instead of: % make test use: % aprun -n1 ctests/zero and instead of: % make fulltest use: % ./run_cat_tests.sh after substituting "aprun -n1" for "yod -sz 1" in run_cat_tests.sh. FreeBSD - i386 & amd64 ***************************************************************************** PAPI requires FreeBSD 6 or higher to work. The kernel needs some modifications to provide PAPI access to the performance monitoring counters. Simply add "options HWPMC_HOOKS" and "device hwpmc" in the kernel configuration file. For i386 systems, also add "device apic". (You can obtain more information in hwpmc(4); see NOTE 1 to check the supported HW.) After this step, just recompile the kernel and boot it. FreeBSD 7 (or greater) does not ship with a fortran compiler. To compile fortran tests you will need to install a fortran compiler first (e.g. installing it from /usr/ports/lang/gcc42), and set up the F77 environment variable with the compiler you want to use (e.g. gfortran42).
Fortran compilers may issue errors due to "Integer too big for its kind *". Add to the FFLAGS environment variable a compiler option to use int*8 by default (in gfortran42 it is -fdefault-integer-8). Follow the "General Installation" steps. NOTE 1: -- HWPMC driver supports the following processors: Intel Pentium 2, Intel Pentium Pro, Intel Pentium 3, Intel Pentium M, Intel Celeron, Intel Pentium 4, AMD K7 (AMD Athlon) and AMD K8 (AMD Athlon64 / Opteron). FreeBSD 8 also adds support for Core/Core2/Core-i[357]/Atom processors. There is also a patch for FreeBSD 7/7.1 in http://wiki.freebsd.org/PmcTools Linux - Xeon Phi [MIC, KNC, Knight's Corner] ***************************************************************************** Full PAPI support of the MIC card requires MPSS Gold Update 2 or above and a cross-compilation toolchain from Intel; the Intel C compiler is also supported. The compiler ----------------------------------------------------------------------------- * Download one of the MPSS full source bundles at [http://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-mpss] * Untar the download. * Extract gpl/package-cross-k1om.tar.bz2 Building PAPI - gcc cross compiler ----------------------------------------------------------------------------- * Add usr/linux-k1om-4.7/bin or equivalent to your PATH so PAPI can find the cross-build utils. (see above for instructions on acquiring the cross compilation toolchain) * You will need to invoke configure with options: > ./configure --with-mic --host=x86_64-k1om-linux --with-arch=k1om This sets up cross-compilation and sets options needed by PAPI. * Run make to build the library. Building PAPI - icc ----------------------------------------------------------------------------- If icc is in your path, > ./configure --with-mic You may have to provide additional configuration options...
try > ./configure --with-mic --with-ffsll --with-walltimer=cycle --with-tls=__thread --with-virtualtimer=clock_thread_cputime_id This builds a MIC-native version of the library. Offload Code ------------ To use PAPI in MIC offload code, build a MIC-native version of PAPI as detailed above. The PAPI utility programs can be run on the MIC using the micnativeloadex tool provided by Intel. The MIC events may require additional qualifiers to set the exclude_guest and exclude_host bits to 0 (eventname:mg=1:mh=1). For example, get a list of events available on the MIC by calling: micnativeloadex ./utils/papi_native_avail Then get an event count while setting the appropriate qualifiers: micnativeloadex ./utils/papi_command_line -a "CPU_CLK_UNHALTED:mg=1:mh=1" To add offload code into your program, wrap the papi.h header as follows: #pragma offload_attribute (push,target(mic)) #include "papi.h" #pragma offload_attribute (pop) Make PAPI calls from offload code as normal. Finally, add -offload-option,mic,ld,$(path_to_papi)/libpapi.a to your compile incantation, or if that does not recognize the papi library, try -offload-option,mic,compiler,"-lpapi -L" instead. Linux - Itanium II & Montecito ***************************************************************************** PAPI on Itanium Linux links to the perfmon library. The library version and the Itanium version are automatically determined by configure. If you wish to override the defaults, a number of pfm options are available to configure. Use: % ./configure --help to learn more about these options. Follow the general installation instructions to complete your installation. PLATFORM NOTES: The earprofile test fails under perfmon for Itanium II. It has been reconfigured to work on the upcoming perfmon2 interface.
Linux - PPC64 (POWER5, POWER5+, POWER6 and PowerPC970) **************************************************************************** Linux/PPC64 requires that the kernel be patched and recompiled with the PerfCtr patch if the kernel is version 2.6.30 or older. The required patches and complete installation instructions are provided in the papi/src/perfctr-2.7.x directory. PPC64 is the ONLY platform that REQUIRES use of PerfCtr 2.7.x. *- IF YOU HAVE ALREADY PATCHED YOUR KERNEL AND/OR INSTALLED PERFCTR -* WARNING: You should always use a PerfCtr distribution that has been distributed with a version of PAPI or your build will fail. The reason for this is that PAPI builds a shared library of the Perfctr runtime, on which libpapi.so depends. PAPI also depends on the .a file, which it decomposes into component object files and includes in the libpapi.a file for convenience. If you install a new perfctr, even a shared library, YOU MUST REBUILD PAPI to get a proper, working libpapi.a. There are several options in configure to allow you to specify your perfctr version and location. Use: % ./configure --help to learn more about these options. Follow the general installation instructions to complete your installation. Linux Perf Events ( with kernel 2.6.32 and newer ) ***************************************************************************** Performance counter support has been merged as the "Perf Events" subsystem as of Linux 2.6.32. This means that PAPI can be built without patching the kernel on new enough systems. Perf Events support is new, and certain functionality does not work. If you need any of the functionality listed below, we recommend you install the PerfCtr patchset and use that in conjunction with PAPI. + PAPI requires at least Linux kernel 2.6.32, as the earlier 2.6.31 version had some significant API changes. + Kernels before 2.6.33 have extra overhead when determining whether events conflict or not.
+ Counter multiplexing is handled by PAPI (rather than perf_events) on kernels before 2.6.33 due to a bug in the kernel perf_events code. + Nehalem EX support requires kernel 2.6.34 or newer. + Pentium 4 support requires kernel 2.6.35 or newer. The PAPI configure script should auto-detect the availability of Perf Events on new enough distributions (this mainly requires that perf_event.h be available in /usr/include/linux). On older distributions (even ones that include the 2.6.32 kernel) the perf_event.h file might not be there. One fix is to install your distribution's Linux kernel headers package, which is often an optional package not installed by default. If you cannot install the kernel headers, you can obtain the perf_event.h file from your kernel and run configure as such: ./configure --with-pe-incdir=INCDIR replacing INCDIR with the directory that perf_event.h is in. Linux PerfCtr (requires patching the kernel) ***************************************************************************** When using Linux kernels before 2.6.32 the kernel must be patched with the PerfCtr patch set. (This patchset can also be used on more recent kernels if the support provided by Perf Events is not enough for your workload). The required patches and complete installation instructions are provided in the papi/src/perfctr-x.y directory. Please see the INSTALL file in that directory. Do not forget, you also need to build your kernel with APIC support in order for hardware overflow to work. This is very important for accurate statistical profiling a la gprof via the hardware counters. So, when you configure your kernel to build with PERFCTR as above, make sure you turn on APIC support in the "Processor type and features" section. This should be enabled by default if you are on an SMP, but it is disabled by default on a UP.
In our 2.4.x kernels: > grep PIC /usr/src/linux/.config /usr/src/linux/.config:CONFIG_X86_GOOD_APIC=y /usr/src/linux/.config:CONFIG_X86_UP_APIC=y /usr/src/linux/.config:CONFIG_X86_UP_IOAPIC=y /usr/src/linux/.config:CONFIG_X86_LOCAL_APIC=y /usr/src/linux/.config:CONFIG_X86_IO_APIC=y You can verify the APIC is working after rebooting with the new kernel by running the 'perfex -i' command found in the perfctr/examples/perfex directory. PAPI on x86 assumes PerfCtr 2.6.x. NOTE: THE VERSIONS OF PERFCTR DO NOT CORRESPOND TO LINUX KERNEL VERSIONS. *- IF YOU HAVE ALREADY PATCHED YOUR KERNEL AND/OR INSTALLED PERFCTR -* WARNING: You should always use a PerfCtr distribution that has been distributed with a version of PAPI or your build may fail. Newer versions with backward compatibility may also work. PAPI builds a shared library of the Perfctr runtime, on which libpapi.so depends. PAPI also depends on the .a file, which it decomposes into component object files and includes in the libpapi.a file for convenience. If you install a new PerfCtr, even a shared library, YOU MUST REBUILD PAPI to get a proper, working libpapi.a. There are several options in configure to allow you to specify your perfctr version and location. Use: % ./configure --help to learn more about these options. Follow the general installation instructions to complete your installation. *- IF PERFCTR IS INSTALLED BUT PAPI FAILS TO INITIALIZE -* You may be running udev, which is not smart enough to know the permissions of dynamically created devices.
To fix this, find your udev/devices directory, often /lib/udev/devices or /etc/udev/devices, and perform the following actions: mknod perfctr c 10 182 chmod 644 perfctr On Ubuntu 6.06 (and probably other Debian distros), add a line to /etc/udev/rules.d/40-permissions.rules like this: KERNEL=="perfctr", MODE="0666" On SuSE, you may need to add something like the following to /etc/udev/rules.d/50-udev-default.rules: (SuSE does not have the 40-permissions.rules file in it.) # cpu devices KERNEL=="cpu[0-9]*", NAME="cpu/%n/cpuid" KERNEL=="msr[0-9]*", NAME="cpu/%n/msr" KERNEL=="microcode", NAME="cpu/microcode", MODE="0600" KERNEL=="perfctr", NAME="perfctr", MODE="0644" These lines tell udev to always create the device file with the appropriate permissions. Use 'perfex -i' from the perfctr distribution to test this fix. PLATFORM NOTES: Opteron fails the matrix-hl test because the default definition of PAPI_FP_OPS overcounts speculative floating point operations. Solaris 8 - Ultrasparc ***************************************************************************** The only requirement for Solaris is that you must be running version 2.8 or newer. As long as that requirement is met, no additional steps are required to install PAPI and you can follow the general installation guide. Solaris 10 - UltraSPARC T2/Niagara 2 ***************************************************************************** PAPI supports the Niagara 2 on Solaris 10. The substrate offers support for common basic operations like adding/reading/etc. and the advanced features of multiplexing (see below), overflow handling, and profiling. The implementation for Solaris 10 is based on libcpc 2, which offers access to the underlying performance counters. Performance counters for the UltraSPARC architecture are described in the UltraSPARC architecture manual in general with detailed descriptions in the actual processor manual.
In the case of this substrate, the documentation for performance counters can be found at: - http://www.opensparc.net/publications/specifications/ In order to install PAPI on this platform, make sure the packages SUNWcpc and SUNWcpcu are installed. Sun Studio 12 was used for compilation while the substrate was being developed. GNU GCC has not been tested and would require modifying the makefiles Makefile.solaris-niagara2 (32 bit) and Makefile.solaris-niagara2-64bit (64 bit). The steps required for installation are as follows: ./configure --with-bitmode=[32|64] --prefix=/is/optional If no --with-bitmode parameter is present, a default of 32 bit is assumed. If no --prefix is used, a default of /usr/local is assumed. make make install If you want to link your application against your installation, you should make sure to include at least the following linker options: -lpapi -lcpc PLEASE NOTE: This is the first revision of Niagara 2/libcpc 2/Solaris 10 support and needs further testing! Contributions, especially for the preset definitions, would be much appreciated. MULTIPLEXING: As the Niagara 2 offers no native event to count the cycles elapsed, a "synthetic event" was created offering access to the cycle count. This event is neither as accurate as the native events, nor should it be used for anything other than the multiplexing mode, which needs the cycle count in order to work. Therefore multiplexing and the preset PAPI_TOT_CYC should only be used with caution. BEWARE OF WRONG COUNTER RESULTS! ***************************************************************************** CREATING AND RUNNING COMPONENTS ***************************************************************************** Basic instructions on how to create a new component can be found in src/components/README. The components directory contains several components developed by the PAPI team along with a simple yet functional "example" component which can be used as a guide to aid third-party developers.
Assuming components are developed according to the specified guidelines, they will function within the PAPI framework without requiring any changes to PAPI source code. A separate directory for each component is in the papi/src/components/ directory; e.g. the NVIDIA cuda component is in papi/src/components/cuda. Within each component directory is a README file which should be consulted. Typically a component needs environment variables to be exported; e.g. the cuda component requires the PAPI_CUDA_ROOT environment variable be set to the directory where cuda libraries can be found. Some components require multiple environment variables. Additional instructions and how to address special circumstances can be found in the README files. The components to be added to PAPI are specified during the configuration of PAPI by adding the --with-components= command line option to configure. For example, to add the acpi, lustre, and net components, the option would be: % ./configure --with-components="acpi lustre net" papi-papi-7-2-0-t/LICENSE.txt000066400000000000000000000035421502707512200154670ustar00rootroot00000000000000 Copyright (c) 2005 - 2010 Innovative Computing Laboratory Dept of Electrical Engineering & Computer Science University of Tennessee, Knoxville, TN. All Rights Reserved.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. This open source software license conforms to the BSD License template. papi-papi-7-2-0-t/PAPI_FAQ.html000066400000000000000000001137231502707512200157550ustar00rootroot00000000000000 PAPI

PAPI FAQ

General Questions (FAQ)

I have a question that I think should be added here. Where should I send it?

ptools-perfapi@icl.utk.edu

Please note, this is a moderated group and if you are not subscribed to it, there may be a delay till the moderators get to your message and approve it.

How do I install the PAPI library?

Please see INSTALL.txt in the papi root directory.

Where do I go for help?

First, read this document and the PAPI Documentation thoroughly. Then consult the PAPI Home Page at http://icl.utk.edu/papi. If that doesn't help, then search the archives as mentioned below. If that fails, then send mail to one of the two mailing lists, the users group at ptools-perfapi@icl.utk.edu or the PAPI-developers group perfapi-devel@icl.utk.edu. The former is a group for general announcements, questions and miscellaneous topics. The latter is a discussion group for the developers of PAPI and receives all CVS update messages (which can be a significant amount of mail!).

What are the mailing lists and how do I subscribe?

There are currently two mailing lists: ptools-perfapi, which is a group for general announcements, questions and miscellaneous topics, and perfapi-devel, which is a discussion group for the developers of PAPI and receives all CVS update messages (which can be a significant amount of mail!).

To subscribe to or maintain your subscription to either of the above groups, go to:
https://groups.google.com/a/icl.utk.edu/forum/#!forum/ptools-perfapi  or https://groups.google.com/a/icl.utk.edu/forum/#!forum/perfapi-devel

Where are the archives for the mailing lists?

The archives for the general PAPI mailing list are located at https://groups.google.com/a/icl.utk.edu/forum/#!forum/ptools-perfapi. The archives for the developers list are located at https://groups.google.com/a/icl.utk.edu/forum/#!forum/perfapi-devel.

What is needed to use PAPI?

See the Supported Architectures at https://github.com/icl-utk-edu/papi/wiki/Supported-Architectures.

What tools are available for PAPI?

Some of the more popular tools using PAPI are PaRSEC, Caliper, Kokkos, TAU, HPCToolkit, Score-P, Vampir, and Scalasca. If you have a tool to be posted, send it to the mailing list.

Is PAPI compatible with MacOS?

No. There is no PAPI version with MacOS support for CPU hardware counter monitoring. In addition to not supporting timer_create() and other POSIX realtime timers, MacOS does not export hardware counters that PAPI can read.


The PAPI Library

When I make PAPI, I always get a warning message when compiling fmultiplex2. Why?

The warning message here is benign, but since it occurs on the last file to be compiled, it often looks like the build has been aborted. The reason the message occurs is that the compiler thinks it is trying to stuff too many bits into an integer value. You can fix it by rearranging the code a little bit. Or just download the latest copy of fmultiplex2.F from the cvs tree.

How do I convert my code from PAPI 2 to PAPI 3?

PAPI 3 represents a major upgrade to the PAPI library. Because of this, there have been a number of interface changes. The process to upgrade from PAPI 2 to PAPI 3 is straightforward, and documented in the PAPI Conversion Cookbook.

How do I compile PAPI with debugging support?

To compile with debugging, define CFLAGS to include -DDEBUG in the corresponding Makefile or Rules. file.

How do I use the debugging features of the PAPI library?

To enable debugging messages at run time, set the PAPI_DEBUG environment variable to one or more of the following with any character as a separator.

SUBSTRATE
API
INTERNAL
THREADS
MULTIPLEX
OVERFLOW
PROFILE
ALL

Also, see the man page for PAPI_set_debug().

Why do PAPI_overflow, PAPI_profil, and PAPI_sprofil work strangely with a small threshold?

On most systems, overflow must be emulated in software by PAPI. Only on the UltraSparc III, Itanium and IRIX does the operating system support true interrupt on overflow. Therefore the user is advised on most platforms to make sure the overflow value is no more than 1/1000th the clock rate. The emulation handler in PAPI runs every millisecond; therefore, the goal of the tool designer should be to pick a value that will overflow frequently but not too frequently. Not following these guidelines could result in either the overflows never occurring or overflows occurring on every interrupt and thus resulting in a flat profile.

How do I stop PAPI_overflow, PAPI_profil, or PAPI_sprofil?

Call PAPI_stop, and then call PAPI_overflow, PAPI_profil, or PAPI_sprofil with a threshold value of 0. Since PAPI 3 can overflow and profile on multiple events, you must call the above routines for EACH event that had been previously enabled for overflow or profile.

What events does PAPI track?

PAPI only tracks 'hardware events', the occurrence of signals onboard the microprocessor. It does not count system calls, software interrupts or other software events. The user should remember that by default, PAPI only measures events that occur in User Space.

How does PAPI handle threads?

Currently, PAPI only supports thread level measurements with kernel or bound threads. Each thread must create, manipulate and read its own counters. When a thread is created, it inherits no PAPI events or information from the calling thread.

How does PAPI handle fork/exec?

When a process is created, it inherits no PAPI information from the calling thread.

Does PAPI support unbound or non-kernel threads?

Yes, but the counts will reflect the total events for the process. Measurements done in other threads will all get the same values, namely the counts for the total process. For non-bound threads, it is not necessary to call PAPI_thread_init. But in most scenarios, such as with SMP or OpenMP compiler directives, bound threads will be the default. For those using Pthreads, the user should take care to set the scope of each thread to the PTHREAD_SCOPE_SYSTEM attribute, unless the system is known to have a non-hybrid thread library implementation, like Linux.

How do I encode a native event?

In PAPI 2.0: Unless otherwise stated in the FAQ section for your platform, the encoding is as follows:

event = ((reg_code & 0xffffff) << 8) | (reg_num & 0xff)

In PAPI 3.0: Just find the native event name and then call PAPI_event_name_to_code. The code returned can be added directly to an event set. The native events can be listed with the test case 'native_avail' in the ctests directory.

Why is there more than one patch for Linux?

There are numerous patches designed to provide access to the Intel CPU performance counters. When PAPI began, we used the original Beowulf patch (perf) by David Hendriks. However, as PAPI progressed, we needed some additional features, which he graciously added. This patch used a system call approach and has proven to be exceedingly stable: no crashes reported. I knew that there was a better way to design a performance counter kernel patch, one that used mmap() to provide direct access to the virtual counts. Mikael Pettersson provided me with exactly that in the form of the perfctr patch. It is also very, very stable. It can be found at http://user.it.uu.se/~mikpe/linux/perfctr. If you're starting with PAPI for the first time, we recommend the perfctr patch as included in the papi source distribution.

The numbers are funky for event 0xabc on platform XYZ, help me!

This is not a question, but I'll help you anyway. We, the PAPI developers, cannot be experts on the thousands of events found across all supported platforms. However, if you are using a PAPI preset, the first thing to do is to look up the corresponding native event code using the test case 'avail'. Then the best bet is always to go to the vendor's technical documentation site and check the processor reference manual. If you're convinced everything is kosher, then please feel free to send a message to the mailing list and one of the members may be able to help you.

My program runs fine when measuring 1 or 2 events, but when I add more I get a -8, PAPI_ECNFLCT error code. The error text says, "Event exists. but cannot be counted due to hardware resource limitations". What does this mean?

You have either exceeded the number of available hardware counters or two or more of the events you want to count need the same resources. This can be particularly annoying on machines like the Pentium 4. Although the P4 has 18 nominal counter registers, many events require resources that are restricted to 2 or 3 of these counters. In practice it is often difficult to count more than 4 or 5 simultaneous events on this platform. One way around limited counter resources is to use multiplexing.

What's multiplexing?

Many systems have only a few hardware performance counter registers; thus you can only measure a few metrics at once. Some platforms may support counter multiplexing, which gives the user the illusion of a larger number of registers by time sharing the performance registers. On the MIPS R10K series, the IRIX kernel supports multiplexing, allowing up to 32 events to be counted at once. On other platforms PAPI does the multiplexing itself, swapping events in and out of the counters based on a timer interrupt. Don't take fine grained measurements when multiplexing, unless you know what you're doing.

Why am I still getting PAPI_ECNFLCT when using multiplexing?

PAPI multiplexing currently always uses one hardware counter for Total Cycles. If you are trying to multiplex a derived event on hardware with only two physical counters then you will get a PAPI_ECNFLCT error. This happens on the Intel Pentium IIIs for example.

Also, enabling multiplexing is a two-step process. You must call PAPI_multiplex_init() to initialize multiplexing system-wide. You must also call PAPI_set_multiplex() for *each* event set that you want to count in multiplexed mode. If you try to add too many events to an event set where multiplexing has not been set, a PAPI_ECNFLCT error will result.
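The two steps can be sketched like this (error checking omitted; the PAPI_assign_eventset_component call is required on PAPI-C/4.0+ before PAPI_set_multiplex, and component 0 here is an assumption for the CPU component):

```c
#include <papi.h>

void setup_multiplexed(int *EventSet)
{
    PAPI_library_init(PAPI_VER_CURRENT);

    /* Step 1: initialize multiplexing once, library-wide */
    PAPI_multiplex_init();

    *EventSet = PAPI_NULL;
    PAPI_create_eventset(EventSet);

    /* On PAPI-C (4.0+) the event set must be bound to a component
       before it can be converted to multiplexed mode. */
    PAPI_assign_eventset_component(*EventSet, 0);

    /* Step 2: convert THIS event set to multiplexed mode */
    PAPI_set_multiplex(*EventSet);

    /* Now more events than physical counters can be added. */
}
```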

What's a derived event?

Hardware counters count low level events that can be directly measured in the hardware. Often these low level events must be combined to form meaningful PAPI preset events. This linear combination of low level events is called a derived PAPI event. Derived events are usually formed by adding or subtracting 2 'native' events, but occasionally derived events can contain 4 or more terms.

When I compile and run the example program (PAPI_flops.c) on X platform I get the following error message: Error in PAPI_flops: Event exists, but cannot be counted due to hardware resource limits, what is the problem?

Hardware counters are a limited resource. Some PAPI preset events are derived, and require the use of more than one hardware counter. For example, Solaris has 2 counters, both of which are needed to count Floating point instructions. Flops also uses total cycles to measure time. On Solaris this would mean using 3 counters, and those resources aren't available.
If you get this error on any platform, run the avail program in the ctests directory and see how many native events have to be monitored. PAPI_num_counters() can be used to determine how many counters exist on your platform. If there are more native events than counters, then this is the reason you are getting the error.

Why can't I get my Fortran programs to compile with PAPI on a Cray T3E?

The Fortran header file you include has to be preprocessed before the Fortran file can use it. To have the cpp process the file before sending the file to the compiler, add the -F flag. For example:

f90 -F test.F -o test

What's wrong with PAPI_LST_INS (hex code 0x43) on my Pentium?

According to the Intel documentation, the counts from this event do not intuitively match its description. Older releases of PAPI had this preset available in the Intel ports, but no longer. It does appear to work on the AMD Athlon.

I downloaded the PAPI 3 tarball last week and keep getting a segmentation fault in gcc. What's up?

Some versions of GCC have a bug that is triggered by a statement in PAPI 3.0. This (one character) bug is fixed in the current tarball, but may not be in the one you downloaded.

If you see an INTERNAL ERROR from GCC when compiling multiplex.c, do 2 things.

1) edit multiplex.c, line 1021 to have 2 equal signs instead of 1.

2) (optional) send a message to your local gcc maintainer and complain.

The actual culprit is:

assert(retval = PAPI_OK) and it should be assert(retval == PAPI_OK)

Of course, both are legal C and nothing should trigger an internal compiler error, but hey...

P.S. If your current release compiled with GCC, you're still OK, as the statement above never gets triggered. It is an artifact of the original multiplex.c implementation, so you don't need to change or upgrade your PAPI or gcc.

PAPI_create_eventset always returns an error now.

The EventSet MUST be set to PAPI_NULL before it is passed into PAPI_create_eventset.
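A minimal sketch of the correct call sequence (error handling reduced for brevity; the key line is initializing the variable to PAPI_NULL before the call):

```c
#include <papi.h>

int make_eventset(void)
{
    int EventSet = PAPI_NULL;   /* MUST be PAPI_NULL, not 0 or garbage */

    if (PAPI_create_eventset(&EventSet) != PAPI_OK)
        return -1;              /* handle error */

    return EventSet;
}
```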

What's this GCC error about "thread local storage not support for this target"?

TLS is thread-local storage, a high-performance mechanism in later GCC/GLIBC/pthread versions that provides constant-time access to thread-local storage. PAPI uses this if available.

However, many systems (especially IA64 running Debian or SuSE) provide very poor/buggy/non-existent support for this. If you're getting an error during compile (or seg faults on every program during the run), then please rebuild using ./configure.

Other systems don't bother to ship a gcc with this turned on, so you'll get the above error.

./configure has a test to make sure that the thread support is working on your platform.

If you find a case where configure did not detect a broken __thread implementation, please report it to us.


The PAPI GIT Source Repository

Can I browse the source repository on the web?

Yes. The latest copy of the PAPI source tree is viewable through a web based source browser accessible from the GitHub repository at:
https://github.com/icl-utk-edu/papi

 

How do I download a copy of the current PAPI source tree?

Make sure git is installed on your machine. You can download a copy of git here.

Download the PAPI repository the first time with the following command:

> git clone https://github.com/icl-utk-edu/papi.git

This creates a complete copy of the papi git repository on your computer in a folder called 'papi'.

To make sure your copy is up to date with the repository:

> cd papi
> git pull 

Can I commit changes to the PAPI repository?

You can always commit changes to your local copy of the PAPI respository using the "git commit" command and its variations. You cannot push those changes to the master copy of the repository without obtaining credentials from the PAPI team.

Where can I learn more about GIT?

The web has a variety of resources targeted at teaching you how to use GIT. A good place to start is the official GIT site.
History and background of GIT can be found on Wikipedia.
This user-friendly introduction might help "git" you started.


PAPI on AIX POWER Processors

General Comments

If you are running PAPI 3.0 on the AIX 5.2 / POWER4 combination and seeing failures, the most likely cause is a bug in the kernel. Look for the efix for APAR IY57280, or contact the PAPI team at papi@cs.utk.edu for the fix. More precise info from IBM:
the problem was introduced in 5.2 ML3, and fixed in 5.2 ML4 and 5.3.
 
To use PAPI in 64-bit mode on power4:
    make -f Makefile.aix-power4-64bit
        link your program with libpapi64.a or libpapi64.so
         
See: /usr/lpp/pmtoolkit/lib/<arch>.evs for POWER3;
     /usr/pmapi/lib/POWER4.evs and /POWER4.gps for POWER4
 
For threaded programs, you had better:
 
setenv AIXTHREAD_SCOPE S

Installation notes

AIX 4.3.x:
The current source and Makefile is for pmtoolkit 1.3.
If you have pmtoolkit 1.2 the test cases will fail. For example:
 
      ./tests/avail
      IOT trap
 
This can be remedied by recompiling the PAPI library with the option
-DPMTOOLKIT_1_2 set.
 
AIX 5.x:
The current source is for pmapi 1.4
 
The aix-power substrate is contained in a single source file, but targets
three different configurations. Conditional compilation, directed by three
different makefiles, determines which configuration is targeted. Make sure
you select the Makefile that matches your configuration:
- Makefile.aix-power    for AIX 4.3.x on POWER3
- Makefile.aix5-power3  for AIX 5.x   on POWER3
- Makefile.aix-power4   for AIX 5.x   on POWER4

Test case notes

The POWER3 and POWER4 have a FMADD instruction. Although this instruction
performs two Floating Point operations, it is counted as one Floating Point
instruction. Because of this, there are situations where PAPI_FP_INS may
produce fewer Floating Point counts than expected.
Further, the Floating Point Instruction event on POWER3 and POWER4 also
counts Floating Point Stores, leading to higher Floating Point counts than
expected. There are occasions where these two effects can cancel each other
out, to produce the right result for the wrong reason!
Note that POWER3 and POWER4 also support an FMA counter (PAPI_FMA_INS).
Thus, a more accurate count of Floating Point Operations can be obtained
by PAPI_FP_INS + PAPI_FMA_INS.
Correcting for the overcount by Floating Point Stores is more difficult,
requiring the use of the native events: PM_FPU_LD_ST_ISSUES and PM_FPU_LD.
The complete expression for Floating Point Operations then becomes:
PAPI_FP_INS + PAPI_FMA_INS - (PM_FPU_LD_ST_ISSUES - PM_FPU_LD)

Counter notes

The POWER architecture supports up to 8 counters. However, in many cases
events are mutually exclusive and can't be counted simultaneously.
 
On POWER4, events are available only as members of predefined groups.
For more on these groups, see /usr/pmapi/lib/POWER4.gps.
 
The following table, submitted by Joel Malard, indicates
events that cannot be counted simultaneously on POWER3:

Things go haywire on my Power/AIX box with threaded programs?

It is very important that you set the environment variable AIXTHREAD_SCOPE to "S", which disables user level threads.


Linux-IA64

Floating Point

This version of the substrate always scales PME_FP_OPS_RETIRED_HI, hex code 0xa, even if you are using it as a NATIVE event. Previous versions of PAPI did not scale this event and could produce erroneously low counts for PAPI_FP_OPS or PAPI_FP_INS.

Notes on PAPI->Native event mappings

PAPI_CA_SNP
PAPI_CA_INV
 Only counts snoops and invalidations from the local processor.
PAPI_TLB_TL
 Counts "real" TLB misses, i.e. misses that cause a VHPT walk or a TLB
 miss trap to the OS. Misses in the L1 TLBs are not counted.
PAPI_FP_STAL
 Counts stalls due to register dependencies and load latencies.
 If the FP pipeline can stall for some other reason (I don't know)
 then those stall cycles won't be counted.

Why am I getting errors from perfmon and PAPI on my Redhat kernels?

Redhat broke the perfmon kernel interface in their kernels, enabling it only for root. In some kernels it is disabled entirely. You can test this by running your PAPI program as root; if it then works, guess what, you have a broken kernel.

The fix is supposed to be in the latest update to RHEL3 and RHEL4. The best thing to do would be to download a kernel.org kernel, rebuild and go.

Counter interrupts seem to have stopped on my threaded programs?

You are probably on an Altix or a system with a Redhat kernel. The solution for the latter is to replace the kernel you have with a patched kernel.org kernel, discussed in this section.

Please send us the kernel version if this happens to you. You'll notice it by running the profile_pthreads test case.

If you're an Altix user, then it's best to complain to SGI. But please let us know also.

Why can't I build PAPI with the Intel icc compiler?

The problem is not in PAPI, but in libpfm 3.x. When this library is built using icc, the file pfmlib_gen_ia64.c generates a series of errors. One workaround for this may be to make the libpfm library separately using gcc and then build PAPI with icc. Or just use gcc.


Linux-Perfctr

PAPI and the Linux Kernel

For Linux kernels more recent than 2.6.32, the perf_events interface is built into the kernel and can be used directly.

For Linux kernels before 2.6.32, PAPI requires your Linux kernel to be patched with either the PerfCtr patch or the Perfmon patch. For compatibility reasons, we have included both of these patches in the tarball. You should patch your kernel with PerfCtr using the distribution found in the papi/src/perfctr-2.6.x directory for x86 hardware, and the papi/src/perfctr-2.7.x directory for IBM POWER hardware. If you prefer Perfmon for kernels older than 2.6.30, you should use the distribution found in papi/src/libpfm-3.y. Perfmon is no longer supported as a Linux patch. The most recent Perfctr distribution can be obtained from Mikael Pettersson's web site, although it is no longer actively supported and not guaranteed to work: http://user.it.uu.se/~mikpe/linux/perfctr/
If you're not sure how to patch, recompile and reinstall your linux kernel, there are a variety of resources on the web. Here's one that should help: http://answers.oreilly.com/topic/36-how-to-patch-a-linux-kernel/.

 

Before you compile

cd perfctr
more INSTALL
If you're getting compilation errors regarding not being able to find include files, then you're probably running a broken redhat installation.

Edit the path to your kernel include files at the top of the appropriate Makefile.linux-perfctr makefile.

If you have already patched your kernel

If you have a properly functioning Perfctr patch from a previous release of PAPI, you will obviously not want to repatch your kernel. PAPI is compatible with PerfCtr 2.4.x and Perfctr 2.6.x.

The x86 Makefiles:
Makefile.linux-perfctr-p3
Makefile.linux-perfctr-p4
Makefile.linux-athlon
Makefile.linux-opteron

To recompile PAPI *not* using the included PerfCtr distribution, simply pass the PERFCTR variable to the appropriate Makefile:

make -f Makefile.linux-perfctr-p3 PERFCTR=/usr/src/perfctr-2.4.x

To use Perfctr 2.6.x, simply type:
make -f Makefile.linux-perfctr-p3

To use the older version:
make -f Makefile.linux-perfctr-p3 VERSION=2.4.x

Easy huh?

How do I patch my Linux/Pentium I, II, III, IV, AMD K7, K8 box to work with PAPI?

See the INSTALL file in papi/src/perfctr-2.6.x. The instructions are very, very simple. Do not use perfctr-2.4.x unless you have to. There is no link between the perfctr version and the Linux kernel version!

After reboot, the /dev/perfctr file always seems to have the wrong permissions and PAPI fails to initialize. What's going on?

You are probably running udev, which is not smart enough to know the permissions of dynamically created devices. To fix this, find your udev/devices directory, often /lib/udev/devices or /etc/udev/devices and perform the following actions.

mknod perfctr c 10 182
chmod 644 perfctr

On Ubuntu 6.06 (and probably other debian distros), add a line to /etc/udev/rules.d/40-permissions.rules like this:

KERNEL=="perfctr", MODE="0666"

On SuSE, you may need to add something like the following to /etc/udev/rules.d/50-udev-default.rules: (SuSE does not have the 40-permissions.rules file.)

# cpu devices
KERNEL=="cpu[0-9]*", NAME="cpu/%n/cpuid"
KERNEL=="msr[0-9]*", NAME="cpu/%n/msr"
KERNEL=="microcode", NAME="cpu/microcode", MODE="0600"
KERNEL=="perfctr", NAME="perfctr", MODE="0644"

These lines tell udev to always create the device file with the appropriate permissions. Use 'perfex -i' from the perfctr distribution to test this fix.

Hardware interrupt driven counters

YOU MUST COMPILE YOUR KERNEL WITH APIC SUPPORT IF YOU WANT INTERRUPT SUPPORT!
With Perfctr 2.3.3 or later it is possible to make the performance counters generate an interrupt when the counter reaches a certain count. This requires support in the Linux kernel, Perfctr, PAPI and the CPU to work properly.
The necessary kernel support is available if your kernel is compiled with SMP APIC support or uni-processor APIC support compiled in. This is true for 2.4-ac kernels and kernels 2.4.10 or later. This topic is discussed in more detail in Mikael Pettersson's installation instructions for PerfCtr.
Your CPU must be a Pentium 686/AMD K7 or similar which can generate APIC interrupts for performance counter events. This is _not_ true for some mobile Pentiums and early revisions of the AMD K7 or Athlon.
You can verify that all is working by running the perfctr/examples/perfex program with the -i flag. If you do not see "pcint" as one of the flags, you need to recompile your kernel or buy a real CPU. ;-)

Why do PAPI_LD_INS and PAPI_SR_INS give identical results on Pentium 4?

Counting memory load and store instructions on the Pentium 4 is a two step process. First the desired events are tagged at the front of the pipeline. Then tagged events are counted as they graduate from the end of the pipeline. Unfortunately, the tags are all the same 'color' and can't be differentiated as they exit the pipe. Thus, you can correctly measure LD instructions, or correctly measure SR instructions, but if you try to measure them both at once, you will always get the sum of both operations in both counters. The same applies to PAPI_LST_INS.

This behavior is demonstrated in the test program ctests/p4_lst_ins.c.

The moral of the story is to always use these three events one-at-a-time on Pentium 4 machines.

Floating point counts on the Pentium 4 series

The Pentium 4 can generate floating point instructions either through the x87 floating point unit or with SSE instructions.
Furthermore, SSE can generate either packed (multiple operands in one 128-bit register) or unpacked (single operand in one 128-bit register) instructions.
Depending on your compiler and settings you will get different instruction mixes.
 
PAPI provides 2 preset events to count floating point operations:
- PAPI_FP_INS counts instructions passing through the floating point unit;
- PAPI_FP_OPS counts something closer to theoretical floating point operations.
 
To minimize the overlap and maximize the usefulness of these two events on Pentium 4, we have made the following choices:
- PAPI_FP_INS always counts only x87 floating point operations.
- PAPI_FP_OPS counts can be customized as discussed below.
 
Further complicating things is that the Pentium 4 hardware is too restrictive to count all these modes at once, so a decision must be made about what to count.
In order to enable PAPI to count these various mixes, we support 2 methods.
 
1) The PAPI_PENTIUM4_FP_xxx defines.
 
   Set these in the EVENTFLAGS of either the Makefile.linux-perfctr-p4 or
   Makefile.linux-perfctr-em64t.
 
   -DPAPI_PENTIUM4_FP_X87
   -DPAPI_PENTIUM4_FP_X87_SSE_SP
   -DPAPI_PENTIUM4_FP_X87_SSE_DP
   -DPAPI_PENTIUM4_FP_SSE_SP_DP
 
   The predefined value for Nocona/EM64T/Pentium 4 Model 3 is:
 
         -DPAPI_PENTIUM4_FP_X87_SSE_DP.
 
   The predefined value for anything else is:
 
         -DPAPI_PENTIUM4_FP_X87.
 
   If nothing is defined, the substrate defaults to:
 
         -DPAPI_PENTIUM4_FP_X87_SSE_DP.
 
2) The PAPI_PENTIUM4_FP environment variable.
 
   Set this to one or two of the following, and it will change the
   behavior of PAPI_FP_OPS.
 
   X87: count all x87 instructions
   SSE_SP: count all unpacked SSE single precision instructions
   SSE_DP: count all unpacked SSE double precision instructions
 
   Due to the design of the register set, only 2 of the three are countable
   at one time. Sorry folks.

Vector instruction counts on the Pentium 4 series

PAPI can count 2 different types of vector instructions on the Pentium 4.
Either MMX instructions or packed SSE floating point instructions. These are supported with 2 methods, in a similar fashion to floating point events described above.
 
1) The PAPI_PENTIUM4_VEC_xxx defines.
 
   Set these in the EVENTFLAGS of either the Makefile.linux-perfctr-p4 or
   Makefile.linux-perfctr-em64t.
 
   -DPAPI_PENTIUM4_VEC_MMX
   -DPAPI_PENTIUM4_VEC_SSE
 
   The current default for all platforms is:
 
         -DPAPI_PENTIUM4_VEC_SSE.
 
   If nothing is defined, the substrate defaults to:
 
         -DPAPI_PENTIUM4_VEC_SSE.
 
2) The PAPI_PENTIUM4_VEC environment variable.
 
   Set this to either of the following, and it will change the
   behavior of PAPI_VEC_INS.
 
   SSE: count all packed SSE SP and DP instructions
   MMX: count all 64 and 128 bit MMX instructions

The memory test sometimes fails on Athlon Processors.

This is a known issue and we are looking into the cause. Currently, we have no fix or workaround.

Floating Point counts on AMD Opteron

(The following discussion does not apply to newer quad-core and higher Opteron processors)

The AMD Opteron is the first chip series from AMD that can measure and report floating point operations. Two native events measure floating point activity. One measures speculative operations that enter the FP units; the other measures operations that retire from the FP units.

The retired event generates precise event counts that scale with the amount of work done. However, it measures data movement as well as floating point operations, resulting in counts that are consistently significantly higher than the expected theoretical counts, often by factors of 2 or more.

The speculative event can be configured to generate counts of only the operations typically of interest. Since these counts are speculative, they tend to be higher by often widely variable amounts than expected theoretical counts, especially on complex production codes.

PAPI provides 2 preset events to count floating point operations:

- PAPI_FP_INS counts instructions passing through the floating point unit;
- PAPI_FP_OPS is intended to count something closer to theoretical floating point operations.

To minimize the overlap and maximize the usefulness of these two events on AMD Opteron, we have made the following choices:

- PAPI_FP_INS always counts retired floating point operations. This value will be precise and accurate, but will include FP loads and stores as well as computations.

- PAPI_FP_OPS counts speculative computation operations by default, but can be customized as discussed below.

As an alternative to counting speculative computations, PAPI_FP_OPS can be configured to count retired operations corrected for data movement. Unfortunately, the correction factors themselves are speculative, and can lead to undercounting errors similar in magnitude to those seen in the pure speculative counts.

Two methods are provided to allow customization of PAPI_FP_OPS:

1) The PAPI_OPTERON_FP_xxx defines.

Set these in the CFLAGS variable of Makefile.linux-perfctr-opteron.

-DPAPI_OPTERON_FP_RETIRED
-DPAPI_OPTERON_FP_SSE_SP
-DPAPI_OPTERON_FP_SSE_DP
-DPAPI_OPTERON_FP_SPECULATIVE

The default value is equivalent to:

-DPAPI_OPTERON_FP_SPECULATIVE.

2) The PAPI_OPTERON_FP environment variable.

Set this to one of the following, and it will change the behavior of PAPI_FP_OPS.

RETIRED: count all retired FP instructions
SSE_SP: correct retired counts optimized for single precision
SSE_DP: correct retired counts optimized for double precision
SPECULATIVE: count speculative computations (default)


Solaris-Ultra

General Comments

Assembler stubs for get_tick() and cpu_sync() as well as the following defines have been blatantly stolen from the perfmon code. The author of the package "perfmon" is Richard J. Enbody and the home page for "perfmon" is http://www.cse.msu.edu/~enbody/perfmon.html. For *all* the native event names, run native_avail in the ctests subdirectory. For how to use the native event names, see native.c

Bugs

1) Ultra I/II/III/III+ are currently supported;

2) Some of the cache events have documented bugs, see the Sun UltraSparc hardware reference manual.

3) WARNING FOR PEOPLE USING MULTITHREADED LIBRARIES ON SOLARIS 2.8: There is a bug that prevents setitimer() from being called after the process has called pthread_create() at any point in time. Therefore, if you suspect your communication library is multithreaded, you had better start the instrumentation before initializing it. See multiplex3_pthreads for details.

My Sun box doesn't have libcpc.h. What should I do?

You didn't check the PAPI Supported Architectures. The hardware counters on SunOS with UltraSparc are only available on SunOS 5.8 and above. That's Solaris 2.8 for you SVR4 people.


papi-papi-7-2-0-t/README.md000066400000000000000000000117661502707512200151320ustar00rootroot00000000000000**[PAPI: The Performance Application Programming Interface](https://icl.utk.edu/exa-papi/)** **[Innovative Computing Laboratory (ICL)](http://www.icl.utk.edu/)** **[PAPI Wiki - Documentation](https://github.com/icl-utk-edu/papi/wiki/)** **University of Tennessee, Knoxville (UTK)** *** [TOC] *** # About The Performance Application Programming Interface (PAPI) provides tool designers and application engineers with a consistent interface and methodology for the use of low-level performance counter hardware found across the entire compute system (i.e. CPUs, GPUs, on/off-chip memory, interconnects, I/O system, energy/power, etc.). PAPI enables users to see, in near real time, the relations between software performance and hardware events across the entire computer system. [The ECP Exa-PAPI project](https://icl.utk.edu/exa-papi/) builds on the latest PAPI project and extends it with: * Performance counter monitoring capabilities for new and advanced ECP hardware, and software technologies. * Fine-grained power management support. * Functionality for performance counter analysis at "task granularity" for task-based runtime systems. * "Software-defined Events" that originate from the ECP software stack and are currently treated as black boxes (i.e., communication libraries, math libraries, task-based runtime systems, etc.) The objective is to enable monitoring of both types of performance events---hardware- and software-related events---in a uniform way, through one consistent PAPI interface. Third-party tools and application developers will have to handle only a single hook to PAPI in order to access all hardware performance counters in a system, including the new software-defined events. *** # Getting Help * Visit our FAQ at: or read a snapshot of the FAQ in papi/PAPI_FAQ.html * For assistance with PAPI, email ptools-perfapi@icl.utk.edu. 
* You can also join the PAPI User Google group by going to to read historical postings to the list. *** # Contributing The PAPI project welcomes contributions from new developers. Contributions can be offered through the standard GitHub pull request model. We strongly encourage you to coordinate large contributions with the PAPI development team early in the process. **For timely pull request reviews and feedback, it is important to submit one (1) pull request per feature / bug fix.** In order to create a pull request on a public read-only repo, you will need to do the following: 1. Fork the PAPI repo (click "+" on the left and "Fork this repository"). 2. Clone it. 3. Make your changes and push them. 4. Click "create pull request" from your repo (not the PAPI repo). *** # Resources * Visit the [Exa-PAPI website](https://icl.utk.edu/exa-papi/) to find out more about ongoing PAPI and [PAPI++](https://www.exascaleproject.org/papi-as-de-facto-standard-interface-for-performance-event-monitoring-at-the-exascale/) developments and research. * Visit the [PAPI website (retired)](https://icl.utk.edu/papi/) for basic information about PAPI. * Visit the [ECP website](https://www.exascaleproject.org/) to find out more about the DOE Exascale Computing Initiative. * Visit the [PAPI Papers and Presentations](https://www.icl.utk.edu/view/biblio/project/papi?items_per_page=All) to find out more about PAPI papers and presentations. *** # License Copyright (c) 2024, University of Tennessee All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 
* Neither the name of the University of Tennessee nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL UNIVERSITY OF TENNESSEE BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. papi-papi-7-2-0-t/RELEASENOTES.txt000066400000000000000000002271311502707512200163000ustar00rootroot00000000000000This file documents changes in recent PAPI releases in inverse chronological order. For details on installing PAPI on your machine, consult the INSTALL.txt file in this directory. =============================================================================== PAPI 7.2.0 RELEASE NOTES Jun 2025 =============================================================================== PAPI 7.2.0 is now available as the next major release. This release officially introduces two new components: +++ rocp_sdk: Supports AMD GPUs and APUs via the ROCprofiler-SDK interface. +++ topdown: Provides proper support for Intel topdown metrics. PAPI 7.2.0 also introduces preset events for non-CPU devices, starting with CUDA events. In addition, component code has been extended to include a statistics qualifier (e.g., for CUDA events), offering more concise and functional output in the papi_native_avail utility. 
Additional Major Changes are: ----------------------------- Component Updates: ------------------ * RAPL: Support for Intel Emerald Rapids and Intel Comet Lake S/H CPUs. * ROCM/ROCP_SDK: +++ Numerous improvements to error handling, shutdown behavior, and initialization in `rocm` and `rocp_sdk` components. +++ Added multiple libhsa search paths. +++ Correct handling when all events are removed. +++ Improved interoperability between `rocm` and `rocp_sdk` components. * CUDA: +++ Added statistics qualifier to CUDA events, which offers a more concise and functional output for the papi_native_avail utility. +++ Added `partially enabled` support for systems with multiple compute capabilities: <7.0, =7.0, >7.0. +++ Support for MetricsEvaluator API (CUDA ≥ 11.3). +++ Fixed `cuptid_init` return value and potential overflows. * Sysdetect, Coretemp, Infiniband, NVML, Net, ROCM_SMI: +++ Improved robustness and memory safety across components. +++ coretemp: Enabled support for event multiplexing. * Topdown: +++ New component to interface with Intel's PERF_METRICS MSR. +++ Converts raw metrics into user-friendly percentages. +++ Provides access to topdown metrics on supported CPUs: heterogeneous Intel CPUs (e.g., Raptor Lake), Sapphire Rapids, Alder Lake, Granite Rapids. +++ Integrated `librseq` to protect `rdpmc` instruction execution. Preset Events & CAT updates: ---------------------------- * AMD Family 17h: Corrected presets for IC accesses/misses. * ARM Cortex A57/A72/A76: Added/updated preset support. * CAT: Added scalar operations to vector-FLOPs benchmarks. Acknowledgements: This release is the result of contributions from many people. The PAPI team would like to extend a special thank you to Vince Weaver, Willow Cunningham, Stephane Eranian (for libpfm4), William Cohen, Steve Kaufmann, Dandan Zhang, Yoshihiro Furudera, Akio Kakuno, Richard Evans, Humberto Gomes and Phil Mucci.
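With components being added and retired across these releases, a quick way to see which components a given PAPI build actually enabled is the component-info API. The sketch below is illustrative only (it assumes a PAPI installation with headers available; disabled components report a reason string):

```c
#include <stdio.h>
#include <papi.h>

/* List every component compiled into this PAPI build and whether it
   initialized successfully. */
int main(void)
{
    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) {
        fprintf(stderr, "PAPI_library_init failed\n");
        return 1;
    }

    int ncomp = PAPI_num_components();
    for (int cidx = 0; cidx < ncomp; cidx++) {
        const PAPI_component_info_t *cmp = PAPI_get_component_info(cidx);
        if (cmp == NULL)
            continue;
        /* disabled is non-zero when the component could not start,
           e.g. a missing runtime library for cuda or rocm. */
        printf("%-16s %s\n", cmp->name,
               cmp->disabled ? cmp->disabled_reason : "active");
    }
    return 0;
}
```

Compile against the installed library (e.g. `cc list_components.c -lpapi`); the papi_component_avail utility shipped with PAPI prints a more detailed version of the same information.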
=============================================================================== PAPI 7.2.0b2 RELEASE NOTES 24 Feb 2025 =============================================================================== PAPI 7.2.0b2 is now available as a beta release. This release introduces improvements to the rocp_sdk component, which supports AMD GPUs/APUs through the ROCprofiler-SDK interface, currently still under development and testing. The release also includes general improvements to the PAPI code, enhancing both design and functionality, as well as various bug fixes. Additional Major Changes are: * AMD ROCprofiler-SDK component (rocp_sdk): Support for sampling (device profiling) mode, and multiple devices +++ Tested on Instinct MI50, MI210, MI250x, and MI300a +++ Sampling functionality has been tested successfully with ROCm-6.3.2. Earlier versions might lead to unexpected behavior. * CUDA component: +++ Added support for heterogeneous systems +++ Added support for "device" qualifier to reduce papi_native_avail output length * Updated libpfm4 to latest commit 762ca94010d9a8f21f0440c0b5807e9a2e849420 * AMD power: Added support for family 25 (19h) processors in the RAPL component * Intel power: Add support for RaptorLake in RAPL component * IBM POWER10: Added preset events * Updated papi_events.csv to remove deprecated preset events * Improvements in the Counter Analysis Toolkit (CAT) * Added tests for the lmsensors component * Allow user to optionally disable perf_event, perf_events_uncore, and cpu * Sysdetect: allow users to disable component * papi_mem_info: added support for ARM Neoverse V2 * Testing: run_tests.sh tests only active components Acknowledgements: This release is the result of efforts from many people. The PAPI team would like to express special Thanks to Vince Weaver, Stephane Eranian (for libpfm4), William Cohen, Steve Kaufmann, Peinan Zhang, Rashawn Knapp and Phil Mucci. 
=============================================================================== PAPI 7.2.0b1 RELEASE NOTES 30 Aug 2024 =============================================================================== PAPI 7.2.0b1 is now available as a beta release. This release introduces a new component, rocp_sdk, which supports AMD GPUs/APUs through the ROCprofiler-SDK interface, currently still under development and testing. The release also includes general improvements to the PAPI code, enhancing both design and functionality, as well as various bug fixes. Additional Major Changes are: * Preliminary support for AMD ROCprofiler-SDK events * AMD Zen5 L3 PMU support * AMD Zen5 core PMU support * Preset support for Zen5 * Preset support for Ice Lake ICL * Basic support for the RISC-V architecture (no events yet) * Initial heterogeneous CPU support: Alderlake and Raptorlake can now enumerate events for both Power and Efficiency cores on heterogeneous systems * Intel AlderLake Gracemont (E-Core) core PMU support * Intel AlderLake Goldencove (P-Core) core PMU support * Intel Raptorlake PMU support: Enables support for Raptorlake, Raptorlake P, Raptorlake S * Intel GraniteRapids core PMU support * Intel SapphireRapids uncore PMU support for: +++ Coherence and Home Agent (CHA) +++ Ultra Path Interconnect PMU (UPI) +++ memory controller PMU (IMC) * Intel IcelakeX uncore PMU support for: +++ Mesh to IIO PMU (M2PCIE) +++ UBOX PMU (UBOX) +++ Mesh to UPI PMU (M3UPI) +++ Ultra Path Interconnect PMU (UPI) +++ Power Control unit PMU (PCU) +++ Mesh to Memory PMU (M2M) +++ PCIe IIO Ring Port PMU (IRP) +++ PCIe I/O controller PMU (IIO) +++ memory controller PMU (IMC) +++ Coherency and Home Agent (CHA) * Sysdetect: support for ARM Neoverse V2 * SDE: support for ntv_code_to_info functionality * Removed the obsolete bundled perfctr and libpfm-3.y code Acknowledgements: This release is the result of efforts from many people.
The PAPI team would like to express special Thanks to Vince Weaver, Stephane Eranian (for libpfm4), William Cohen, Steve Kaufmann, Peinan Zhang, Rashawn Knapp and Phil Mucci. =============================================================================== PAPI 7.1.0 RELEASE NOTES 18 Dec 2023 =============================================================================== PAPI 7.1.0 is now available. This release includes support for Intel Sapphire Rapids and AMD Zen4 preset events. The release also includes general improvements to the PAPI code in terms of design and functionality. Furthermore, the Counter Analysis Toolkit (CAT) and the Software-Defined Events (SDE) library have also been updated. Major Changes: * Support for Intel Sapphire Rapids native and preset events * Support for AMD Zen4 native and preset events * Support for event qualifiers in the ROCm component * New 'template' component * Integration into Spack package manager * Integration into the Extreme-Scale Scientific Software Stack (E4S) * Refactored cuda component with multi-thread and multi-gpu support * Support for ARM Neoverse V1 and V2 =============================================================================== PAPI 7.0.1 RELEASE NOTES 22 Feb 2023 =============================================================================== This is a minor release of PAPI.
It introduces the following changes: * Support for AMD Zen4 CPUs in libpfm4 * Support for ARM Neoverse V1 and V2 in libpfm4 * Fix a build error encountered when building the library with gcc 10 and later * Resolve build warnings across different components * Fix bug in the ROCm component when monitoring multiple GPUs in sampling mode * Refactor ROCm component to simplify code and prepare it for rocmtools support * Refactor ROCm SMI component and support XGMI events =============================================================================== PAPI 7.0.0 RELEASE NOTES 14 Nov 2022 =============================================================================== This is a major release of PAPI, which offers several new components, including "intel_gpu" with monitoring capabilities on Intel GPUs; "sysdetect" (along with a new user API) for detecting details of the available hardware on a given compute system; a significant revision of the "rocm" component for AMD GPUs; the extension of the "cuda" component to enable performance monitoring on NVIDIA's compute capabilities 7.0 and beyond. Furthermore, PAPI 7.0.0 ships with a standalone "libsde" library and a new C++ API for software developers to define software-defined events from within their applications. For specific and detailed information on changes made for this release, see ChangeLogP700.txt for filenames or keywords of interest and change summaries, or go directly to the PAPI git repository. Some Major Changes for PAPI 7.0.0 include: * A new "intel_gpu" component with monitoring capabilities support for Intel GPUs, including GPU hardware events and memory performance metrics (e.g., bytes read/written/transferred from/to L3). The PAPI "intel_gpu" component offers two collection modes: (1) "Time-based Collection Mode," where metrics can be read at any given time during the execution of kernels. (2) "Kernel-based Collection Mode," where performance counter data is available once the kernel execution is finished. 
* A new "sysdetect" component for detecting a machine's architectural details, including the hardware's topology, specific aspects about the memory hierarchy, number and type of GPUs and CPUs on a node, thread affinity to NUMA nodes and GPU devices, etc. Additionally, PAPI offers a new API that enables users to get "sysdetect" details from within their application. * A major redesign of the "rocm" component for advanced monitoring features for the latest AMD GPUs. The PAPI "rocm" component is now thread-safe and offers two collection modes: "sampling" and "kernel intercept" mode. * Support for NVIDIA compute capability 7.0 and greater. This implies support for CUPTI's new Profiling and Perfworks APIs. The PAPI CUDA component has been refactored to work equally for NVIDIA compute capabilities <7.0 and >= 7.0. * A significant redesign of the "sde" component into two separate entities: (1) a standalone library "libsde" with a new API for software developers to define software-based metrics from within their applications, and (2) the PAPI "sde" component that enables monitoring of these new software-based events. * A new C++ interface for "libsde," which enables software developers to define software-defined events from within their C++ applications. * New Counter Analysis Toolkit (CAT) benchmarks and refinements of PAPI's CAT data analysis, specifically, the extension of PAPI's CAT with MPI and "distributed memory"-aware benchmarks and analysis to stress all cores per node. * Support for FUGAKU's A64FX Arm architecture, including monitoring capabilities for memory bandwidth and other node-wide metrics. =============================================================================== PAPI 6.0.0 RELEASE NOTES 29 Jan 2020 =============================================================================== PAPI 6.0 is now available. 
This release includes a new API for SDEs (Software Defined Events), a major revision of the 'high-level API', and several new components, including ROCM and ROCM_SMI (for AMD GPUs), powercap_ppc and sensors_ppc (for IBM Power9 and later), SDE, and the IO component (exposes I/O statistics exported by the Linux kernel). Furthermore, PAPI 6.0 ships CAT, a new Counter Analysis Toolkit that assists with native performance counter disambiguation through micro-benchmarks. For specific and detailed information on changes made for this release, see ChangeLogP600.txt for filenames or keywords of interest and change summaries, or go directly to the PAPI git repository. Major Changes * Added the rocm component to support performance counters on AMD GPUs. * Added the rocm_smi component; SMI is the System Management Interface for monitoring power usage on AMD GPUs, which is also writeable by the user, e.g. to reduce power consumption on non-critical operations. * Added 'io' component to expose I/O statistics exported by the Linux kernel (/proc/self/io). * Added 'SDE' component, Software Defined Events, which allows HPC software layers to expose internal performance-critical behavior via Software Defined Events (SDEs) through the PAPI interface. * Added 'SDE API' to register performance-critical events that originate from HPC software layers, and which are recognized as 'PAPI counters' and, thus, can be monitored with the standard PAPI interface. * Added powercap_ppc component to support monitoring and capping of power usage on IBM PowerPC architectures (Power9 and later) using the powercap interface exposed through the Linux kernel. * Added 'sensors_ppc' component to support monitoring of system metrics on IBM PowerPC architectures (Power9 and later) using the opal/exports sysfs interface. * Retired the infiniband_umad component; it is superseded by infiniband. * Revived PAPI's 'high-level API' to make it more intuitive and effective for novice users and quick event reporting.
* Added 'counter_analysis_toolkit' sub-directory (CAT): A tool to assist with native performance counter disambiguation through micro-benchmarks, which are used to probe different important aspects of modern CPUs, to aid the classification of native performance events. Other Changes * Standardized our environment variables and implemented a simplified, unified approach for specifying libraries necessary for components, with overrides possible for special circumstances. Eliminated component level 'configure' requirements. * Corrected TLS issues (Thread Local Storage) and race conditions. * Several bug fixes, documentation fixes and enhancements, improvements to README files for user instruction and code comments. Acknowledgements: This release is the result of efforts from many people. The PAPI team would like to express special Thanks to Vince Weaver, Stephane Eranian (for libpfm4), William Cohen, Steve Kaufmann, Phil Mucci, Kevin Huck, Yunqiang Su, Carl Love, Andreas Beckmann, Al Grant and Evgeny Shcherbakov. The PAPI release can be downloaded from http://icl.cs.utk.edu/papi/software. =============================================================================== PAPI 5.7.0 RELEASE NOTES 4 Mar 2019 =============================================================================== PAPI 5.7 is now available. This release includes a new component, called "pcp", which interfaces to the Performance Co-Pilot (PCP). It enables PAPI users to monitor IBM POWER9 hardware performance events, particularly shared "NEST" events, without root access. This release also upgrades the (to date read-only) PAPI "nvml" component with write access to the information and controls exposed via the NVIDIA Management Library. The PAPI "nvml" component now supports both measuring and capping power usage on recent NVIDIA GPU architectures (e.g. V100).
We have added power monitoring as well as PMU support for recent Intel architectures such as Cascade Lake, Kaby Lake, Skylake, and Knights Mill (KNM). Furthermore, measuring power usage for AMD Fam17h chips is now available via the "rapl" component. For specific and detailed information on changes made for this release, see ChangeLogP570.txt for filenames or keywords of interest and change summaries, or go directly to the PAPI git repository. Major Changes * Added the component PCP (Performance Co-Pilot, IBM) which allows access to PCP events via the PAPI interface. * Added support for IBM POWER9 processors. * Added power monitoring support for AMD Fam17h architectures via RAPL. * Added power capping support for NVIDIA GPUs. * Added benchmarks and testing for the "nvml" component, which allows power-management (reporting and setting) for NVIDIA GPUs. * Re-implementation of the "cuda" component to better handle GPU events, metrics (values computed from multiple events), and NVLink events, each of which has different handling requirements and may require separate read groupings. * Enhanced NVLink support, and added additional tests and example code for NVLink (high-speed GPU interconnect). * Extension of test suite with more advanced testing: attach_cpu_sys_validate, attach_cpu_validate, event_destroy test, openmp.F test, attach_validate test (rdpmc issue). Other Changes * ARM64 configuration now works with newer Linux kernels (>=3.19). * As part of the "cuda" component, expanded CUPTI-only tests to distinguish between PAPI or non-PAPI issues with NVIDIA events and metrics. * Many memory leaks have been corrected. Not all: some third-party library codes still exhibit memory leaks. * Better reporting and error handling of bugs. Changes to "infiniband_umad" name reporting to distinguish it from the "infiniband" component. * Cleaning up of the source code, added documentation and test/utility files. Acknowledgements: This release is the result of efforts from many people.
The PAPI team would like to express special Thanks to Vince Weaver, Stephane Eranian (for libpfm4), William Cohen, Steve Kaufmann, Phil Mucci, and Konstantin Stefanov. The PAPI release can be downloaded from http://icl.cs.utk.edu/papi/software. =============================================================================== PAPI 5.6.0 RELEASE NOTES 19 Dec 2017 =============================================================================== PAPI 5.6.0 contains a major cleanup of the source code and the build system to have consistent code structure, eliminate errors, and reduce redundancies. A number of validation tests have been added to PAPI to verify the PAPI preset events. Improvements and changes to multiple PAPI components have been made, varying from supporting new events to fixes in the component testing. For specific and detailed information on changes made in this release, see ChangeLogP560.txt for keywords of interest or go directly to the PAPI git repository. Major changes * Validation tests: A substantial effort to add validation tests to PAPI to check and detect problems in the definition of PAPI preset events. * Event testing: Thorough cleanup of code in the C and Fortran testing to add processor support, cleanup output and make the testing behavior consistent. * CUDA component: Updated and rewritten to support CUPTI Metric API (combinations of basic events). This component now supports NVLink information through the Metric API. Updated testing for the component. * NVML component: Updated to support power management limits and improved event names. Minor other bug fixes. * RAPL component: Added support for: Intel Atom models Goldmont / Gemini_Lake / Denverton, Skylake-X / Kabylake * PAPI preset events: Many updates to the PAPI preset event mappings; Skylake X support, initial AMD fam17h, fix AMD fam16h, added more Power8 events, initial Power9 events. Other changes * Updating man and help pages for papi_avail and papi_native_avail. 
* Powercap component: Added test for setting power caps via PAPI powercap component. * Infiniband component: Bugfix for infiniband_umad component. * Uncore component: Updated to support recent processors. * Lmsensors component: Updated to support correct runtime linking, better event names, and a number of bug fixes. * Updated and fixed timer support for multiple architectures. * All components: Cleanup and standardize testing behavior in the components. * Build system: Much needed cleanup of configure and make scripts. * Support for C++ was enhanced. * Enabling optional support for reading events using perfevent-rdpmc on recent Linux kernels can speed up PAPI_read() by a factor of 5. * Pthread testing limited to avoid excessive CPU consumption on highly parallel machines. Acknowledgements: This release is the result of efforts from many people, with special Thanks to Vince Weaver, Phil Mucci, Steve Kauffman, William Cohen, Will Schmidt, and Stephane Eranian (for libpfm4) from the internal PAPI team. =============================================================================== PAPI 5.5.1 RELEASE NOTES 18 Nov 2016 =============================================================================== PAPI 5.5.1 is now available. This is a point release intended primarily to add support for uncore performance monitoring events on Intel Xeon Phi Knights Landing (KNL). Other minor bugfixes have also been made. For specific and detailed information on changes made in this release, see ChangeLogP551.txt for keywords of interest or go directly to the PAPI git repository. New Platforms: * Added Knights Landing (KNL) uncore event support via libpfm4. Bug Fixes: * Fix some possible string termination problems. * Cleanup lustre and mx components. * Enable RAPL for Broadwell-EP.
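The PAPI_read() call that the rdpmc fast path accelerates sits in the classic low-level counting loop: start an eventset, read it mid-flight without stopping, then stop for the final value. A minimal sketch, assuming a PAPI installation (the preset PAPI_TOT_CYC and the busy loop are chosen purely for illustration):

```c
#include <stdio.h>
#include <papi.h>

int main(void)
{
    int eventset = PAPI_NULL;
    long long count;

    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
        return 1;
    if (PAPI_create_eventset(&eventset) != PAPI_OK ||
        PAPI_add_event(eventset, PAPI_TOT_CYC) != PAPI_OK)
        return 1;

    PAPI_start(eventset);

    volatile double x = 0.0;
    for (int i = 0; i < 1000000; i++)   /* some work to count */
        x += i * 0.5;

    PAPI_read(eventset, &count);        /* counters keep running */
    printf("cycles so far: %lld\n", count);

    PAPI_stop(eventset, &count);        /* final value; counters stop */
    printf("total cycles: %lld\n", count);
    return 0;
}
```

Because PAPI_read() leaves the counters running, it is the call that benefits most from the user-space rdpmc path: no system call is needed per read.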
=============================================================================== PAPI 5.5.0 RELEASE NOTES 14 Sep 2016 =============================================================================== PAPI 5.5 is now available. This release introduces a new component that provides read and write access to the information and controls exposed via the Linux powercap interface. The PAPI powercap component supports measuring and capping power usage on recent Intel architectures. We have added core support for Knights Landing (uncore support will be released later) as well as power monitoring via the RAPL and powercap components. For specific and detailed information on changes made in this release, see ChangeLogP550.txt for keywords of interest or go directly to the PAPI git repo. New Platforms: * Added Knights Landing (KNL) core events and preset events. * Added Intel Broadwell/Skylake/Knights Landing RAPL support * Updated PAPI preset event support for Intel Broadwell/Skylake New Component: * Powercap component: PAPI now supports the Linux Power Capping Framework which exposes power capping devices and power measurement to user space via a sysfs virtual file system interface. Enhancements: * Add support for multiple flavors of POWER8 processors. * Force all processors to check event schedulability by checking that PAPI can successfully read the counters. * Support for Intel Broadwell-EP, Skylake, Goldmont, Haswell-EP inherited from libpfm4. * Shared memory object (.so) naming is made more limited so that minor updates do not break ABI compatibility. Bug Fixes: * Improve testlib error messages if a component fails to initialize. * Fix _papi_hwi_postfix_calc parsing and robustness. * Clean build rules for CUDA sampling subcomponent. * Correct IBM Power7 and Power8 computation of PAPI_L1_DCA. * Eliminate the sole use of ftests_skip subroutine. * Correct the event string names for tenth.c. * Have Fortran test support code report errors more clearly.
* Cleanup output from libmsr component. * PAPI internal functions were marked as static to avoid exposing them externally. * Multiple components were fixed to make internal functions static where possible, to avoid exposing the functions as externally accessible entry points. * CUDA component configuration bug fixed. =============================================================================== PAPI 5.4.3 RELEASE NOTES 26 Jan 2016 =============================================================================== For specific and detailed information on changes made in this release, grep ChangeLogP543.txt for keywords of interest or go directly to the PAPI git repo. GENERAL NOTES =============================================================================== New Implementations: ------------------- * libmsr component: Using LLNL's libmsr library to access the Intel RAPL (Running Average Power Limit) interface adds power capping abilities to PAPI. * CUDA PC sampling: A new standalone CUDA sampling tool (papi_cuda_sampling) has been added to the CUDA component (components/cuda/sampling/) and can be used as a preloader to perform PC sampling on Nvidia GPUs which support the CUPTI sampling interface (e.g. Maxwell). * ARM Cortex A53 support: Event definitions added. Enhancements: ------------ * Added Haswell-EP uncore support * Initial Broadwell, Skylake support * Added a general CUDA example (components/cuda/test) that uses LD_PRELOAD to attach to a running CUcontext. * Added "-check" flag to papi_avail and papi_native_avail to test counter availability/validity. Bug Fixes: ---------- * Clean output from papi_avail tool when there are no user defined events. * Support PAPI_GRN_SYS granularity for perf component. * Bug fix for infiniband_umad component. * Bug fix for vmware component. * Bug fix for NVML component. * Fixed RAPL component so it reports unsupported inside a guest VM. * Cleanup ARM CPU detection. * Bug fix for PAPI_overflow issue for multiple eventsets.
* Increased PERF_EVENT_MAX_MPX_COUNTERS to 192 from 128. * Fixed memory leak in papi_preset.c. * Free allocated memory in the stealtime component. =============================================================================== PAPI 5.4.1 RELEASE NOTES 02 Mar 2015 =============================================================================== For specific and detailed information on changes made in this release, grep ChangeLogP541.txt for keywords of interest or go directly to the PAPI git repo. GENERAL NOTES =============================================================================== The PAPI CUDA component is updated to support CUDA 6.5 with multiple GPUs. New Platforms: ------------- * Updated support for Intel Haswell and Haswell-EP * Added ARM Cortex A7 * Added ARM 1176 CPU (original Raspberry Pi) Enhancements: ------------ * Enhance PAPI preset events to allow user defined events. * User defined events are set up via a user event definition file. * CUDA component is updated to support multiple devices and contexts. * Tested under and supports CUDA 6.5. * Note: Events for different CUDA contexts MUST be added from within the context. * New test demonstrating attaching an eventset to a single CPU rather than a thread. * Use the term "event qualifiers" instead of "event masks" to clarify understanding. * Added pkg-config support to PAPI. Bug Fixes: ---------- * Fixed lustre segfault bug. * Fixed compilation in the absence of a Fortran compiler. * Fixed bug in krentel_pthreads ctest to join threads properly on exit. * Fixed bug in perf_events where event masks were not getting cleared properly. * Fixed memory leak bug in perf_events.
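The CPU-attach test mentioned above exercises binding an eventset to a single CPU rather than to the calling thread. The sketch below is a hedged illustration, not the shipped test: it assumes a PAPI installation, and the PAPI_option_t member names (`opt.cpu.eventset`, `opt.cpu.cpu_num`) follow papi.h as best recalled and should be checked against your headers.

```c
#include <stdio.h>
#include <string.h>
#include <papi.h>

int main(void)
{
    int es = PAPI_NULL;
    PAPI_option_t opt;
    long long value;

    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
        return 1;
    if (PAPI_create_eventset(&es) != PAPI_OK)
        return 1;
    /* The eventset must belong to a component before options can be set;
       component 0 is the CPU (perf_event) component. */
    if (PAPI_assign_eventset_component(es, 0) != PAPI_OK)
        return 1;

    /* Bind the eventset to CPU 0 instead of the calling thread. */
    memset(&opt, 0, sizeof(opt));
    opt.cpu.eventset = es;
    opt.cpu.cpu_num = 0;
    if (PAPI_set_opt(PAPI_CPU_ATTACH, &opt) != PAPI_OK)
        return 1;

    if (PAPI_add_event(es, PAPI_TOT_CYC) != PAPI_OK)
        return 1;

    PAPI_start(es);
    /* ... counts whatever runs on CPU 0, not just this thread ... */
    PAPI_stop(es, &value);
    printf("CPU 0 cycles: %lld\n", value);
    return 0;
}
```

Counting another CPU system-wide typically also requires suitable permissions (e.g. a permissive perf_event_paranoid setting) and may need system granularity; see the attach_cpu tests under src/ctests for the authoritative version.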
=============================================================================== PAPI 5.4.0 RELEASE NOTES 13 Nov 2014 =============================================================================== For specific and detailed information on changes made in this release, grep ChangeLogP540.txt for keywords of interest or go directly to the PAPI git repo. GENERAL NOTES =============================================================================== Full support for CUDA 6.5 has been delayed and will be included in the next release. New Platforms: ------------- * EMON power component for IBM Blue Gene/Q * Support for the Applied Micro X-Gene processor * Support for IBM POWER non-virtualized platform * RAPL support for Intel Haswell models (60,69,71) Enhancements: ------------ * Added list of supported PMU names (core/uncore components) * Support for extended event masks (core/uncore components) * Extension of the RAPL energy measurements on Intel via msr-safe * Updated IBM POWER7, POWER8 presets * 'papi_native_avail --validate' supports events that require multiple masks to be valid Bug Fixes: ---------- * HW counter and event count added/fixed for BGPM components * Reduce cost of using PAPI_name_to_code * Non-null terminated strings fixed * Growing list of native events in core/uncore components fixed * Cleaned up Intel IvyBridge presets * Addressed Coverity reported issues =============================================================================== PAPI 5.3.2 RELEASE NOTES 30 Jun 2014 =============================================================================== For specific and detailed information on changes made in this release, grep ChangeLogP532.txt for keywords of interest or go directly to the PAPI git repo. GENERAL NOTES =============================================================================== An internal 5.3.1 release was skipped; changes since 5.3 are detailed below.
New Platforms: ------------- * Intel Silvermont * ARM Qualcomm Krait Enhancements: ------------ * Rapl component support for Intel Haswell-EP * Add units to NVML component * Refine the definition of a Flop on the *-Bridge Intel chips. * Updated Intel Haswell presets Bug Fixes: ---------- * FreeBSD build and component fixes * Uncore enumeration * Printf format specifiers standardized (use # for hex) =============================================================================== PAPI 5.3.0 RELEASE NOTES 18 Nov 2013 =============================================================================== For specific and detailed information on changes made in this release, grep ChangeLogP530.txt for keywords of interest or go directly to the PAPI git repo. GENERAL NOTES =============================================================================== New Platforms: ------------- * Intel Xeon Phi ( for offload code ) Enhancements: ------------ * RAPL component better deals with counter wrap * Floating point support added for Intel IvyBridge * PAPI_L1_ICM event added for Intel Haswell * AMD Fam15h gets Core select umasks * CUDA component now sets the number of native events supported * Installed tests' code can now be built. * host-micpower utility Bug Fixes: ---------- * command_line utility event skipping bug * Remove extraneous -openmp flag from icc builds * Default to building all ctests, clean up much bit rot =============================================================================== PAPI 5.2.0 RELEASE NOTES 06 Aug 2013 =============================================================================== For specific and detailed information on changes made in this release, grep ChangeLogP520.txt for keywords of interest or go directly to the PAPI git repo. GENERAL NOTES =============================================================================== This release represents a major overhaul of several components. Support for Intel Haswell and Power 8 has been added.
Processor support code has been moved to the components directory. New Platform: ------------- * Intel Haswell (initial support) * Power 8 (initial support) New Components: --------------- * Host-side MIC power component Enhancements: ------------ * Component tests are now included with install-tests make target. * Components with external library dependencies load them at runtime allowing better distribution (infiniband, cuda, vmware, nvml and host-side micpower) * Perf_events, perfctr[_ppc] and perfmon2[_ia64] have been moved under the components directory * (Intel) Uncore support has been split into its own component * Lustre component better handles large numbers of filesystems =============================================================================== PAPI 5.1.1 RELEASE NOTES 21 May 2013 =============================================================================== For specific and detailed information on changes made in this release, grep ChangeLogP511.txt for keywords of interest or go directly to the PAPI git repo. GENERAL NOTES =============================================================================== This is a bug fix release. New Platform: ------------- * Intel IvyBridge-EP Bug Fixes: ---------- * Many perf_event fixes * Cuda component fixes * IA64 and SPARC build fixes Enhancements: ------------ * Better logic in run_tests.sh script * ARM builds now use pthread_mutexes * BG/Q overflow enhancements =============================================================================== PAPI 5.1.0 RELEASE NOTES 11 Jan 2013 =============================================================================== For specific and detailed information on changes made in this release, grep ChangeLogP510.txt for keywords of interest or go directly to the PAPI git repo. New Platform: ------------- * Intel Xeon Phi ( Knight's Corner or KNC or MIC ) Bug Fixes: ---------- * Various build system fixes. * NVML component fix. 
* Work around a sampling bug on Power Linux

Enhancements:
------------
* ARM Cortex A15 support.
* New API entry, PAPI_get_eventset_component
* Add options to papi_command_line to print in hex and unsigned formats

New Components:
---------------
* MIC Power component.

===============================================================================
PAPI 5.0.1 RELEASE NOTES 20 Sep 2012
===============================================================================

For specific and detailed information on changes made in this release, grep
ChangeLogP501.txt for keywords of interest or go directly to the PAPI git repo.

GENERAL NOTES
===============================================================================
This is a bug fix release of PAPI. Because it includes a major fix in the
preset code, we recommend that all users of PAPI 5.0 upgrade; see commit
866bd51c for a detailed discussion.

Bug Fixes:
----------
* Debugging macros without variadic macro support.
* Building PAPI with an external libpfm4 installation.
* Fix a major bug in the preset code.

Enhancements:
-------------
* CUDA configure script better supports Kepler architecture.
* RAPL support for IvyBridge.
* Libpfm4 updates for SandyBridge-EP counters.

===============================================================================
PAPI 5.0.0 RELEASE NOTES 23 Aug 2012
===============================================================================

For specific and detailed information on changes made in this release, grep
ChangeLogP500.txt for keywords of interest or go directly to the PAPI git repo.

GENERAL NOTES
===============================================================================
This is a major release of PAPI. Parts of both the internal component and
external low-level interfaces have changed; this will break your 4.4
compliant components. Numerous bug fixes are also included in this release.
New Platforms:
-------------
* Intel IvyBridge
* Intel Atom Cedarview

New / Improved Components:
---------------
* nVidia Management library component - support for various system health
  and power measurements on supported nVidia gpus.
* stealtime - When running in a VM, this provides information on how much
  time was "stolen" by the hypervisor due to the VM being disabled. This is
  currently only supported on KVM.
* RAPL - a SandyBridge RAPL (Running Average Power Limit) component
  providing for energy measurement at the package level.
* VMware component for VMware pseudo-counters
* appio - This application I/O component enables PAPI-C to determine I/O
  used by the application.

Bug Fixes:
----------
* Numerous memory leaks, thread races, and compiler warnings corrected.

Enhancements:
-------------
* Major overhaul of the component interface.
* Update perf_event.c rdpmc support
* Minor uncore fixes plus changes for rdpmc.
* Add a PAPI_REF_CYC preset event, defined as UNHALTED_REFERENCE_CYCLES for
  all Intel platforms on which this native event is supported.
* Component names are now standardized in a meaningful way.
* Multiplexing under perf_events has been improved.
* FreeBSD cleanup/updates
* appio component now intercepts recv()
* Power7 definition of L1_DCA and LST_INS updated to a countable definition
* Added BGPM's opcode and generic event functionality to PAPI for BG/Q
  (requires Q32 driver V1R1M2).

Open Issues:
-------------
* SandyBridge PAPI_FP_* events only produce reasonable results when counted
  by themselves.
* Ivy Bridge does not support floating point events.

Experimental:
-------------

Known Bugs:
-----------
* Software multiplexing is known to have a memory leak.
* The byte-profile test is known to fail on Power7/AIX

Deprecated:
---------------------
* Java PAPI wrappers
* Windows

===============================================================================
PAPI 4.4.0 RELEASE NOTES 17 Apr 2012
===============================================================================

For specific and detailed information on changes made in this release, grep
ChangeLogP440.txt for keywords of interest or go directly to the PAPI git repo.

GENERAL NOTES
===============================================================================
This is a major release of PAPI-C. Support for IBM Blue Gene/Q has been
added. Multiple bug fixes are also included in this release.

This is also the first release of papi made from the git repository;
git clone http://icl.cs.utk.edu/git/papi.git

Visit the PAPI Reference pages for more information at:
http://icl.cs.utk.edu/projects/papi/wiki/Main_Page

And visit the PAPI website for the latest updates:
http://icl.cs.utk.edu/papi/

RECENT CHANGES IN PAPI 4.4.0
===============================================================================

New Platforms:
-------------
* src/Rules.bgpm... Added PAPI support for Blue Gene/Q.

Bug Fixes:
----------
* Fix buffer overrun in lmsensors component
* libpfm4: Update to current git libpfm4 snapshot
* Fix broken Pentium 4 Prescott support; we were missing the netburst_p
  declaration in papi_events.csv
* Fix various locking issues in the threaded code.
* Fix multiplexing of large eventsets on perf_events systems. This
  presented itself when using more than 31 multiplexed events on perf_event

Enhancements:
-------------
* Update the release machinery for git.
===============================================================================
PAPI 4.2.1 RELEASE NOTES 13 Feb 2012
===============================================================================

For specific and detailed information on changes made in this release, grep
the ChangeLogP421.txt file for keywords of interest or go directly to the
PAPI cvs tree.

GENERAL NOTES
===============================================================================
This is a minor release of PAPI-C. It does not break binary or semantic
compatibility with previous versions.

Visit the PAPI Reference pages for more information at:
http://icl.cs.utk.edu/projects/papi/wiki/Main_Page

And visit the PAPI website for the latest updates:
http://icl.cs.utk.edu/papi/

RECENT CHANGES IN PAPI 4.2.1
===============================================================================

Bug Fixes:
----------
* solaris substrate set_domain call was added.
* multiplexing math errors were fixed in perf_events.c
* more multiplexing read path errors were identified and fixed
* src/linux-timer.c: Fix compilation warning if you specify
  --with-walltime=gettimeofday
* src/linux-timer.c: Fix the build on Linux systems using mmtimer
* src/linux-common.c: Update the linux MHz detection code to use bogoMIPS
  when there is no MHz field available in /proc/cpuinfo.
* src/: configure, configure.in: Fix a typo in the perfctr section; it was
  causing a machine to default to perfctr when it had no performance
  interface (a CentOS VM image with a 2.6.18 kernel). Also checks that we
  actually have perfctr if we specify --with-perfctr.
* Fix SMP ARM issues reported by Harald Servat. Also, adds proper header
  dependency checking in the Rules files.
* src/ctests/api.c: Make the api test actually test PAPI_flops() as it
  claims to do, rather than PAPI_flips().
* src/papi_events.csv: Update the coreduo (not core2) events. Most notably
  the FP events were wrong.
* src/papi_events.csv: Modify Intel Sandybridge PAPI_FP_OPS and PAPI_FP_INS
  events to not count x87 fp instructions. The problem is that the current
  predefines were made by adding 5 events. With the NMI watchdog stealing an
  event and/or hyperthreading reducing the number of available counters by
  half, we just couldn't fit. This now raises the potential for people using
  x87-compiled floating point on Sandybridge and getting 0 FP_OPS. This is
  only likely if running a 32-bit kernel and *not* compiling your code with
  -msse. A long-term solution might be trying to find a better set of FP
  predefines for sandybridge.
* src/components/lmsensors/: Rules.lmsensors, configure.in: Fixed configure
  error message and rules link error for shared object linking. Thanks Will
  Cohen.
* src/components/lmsensors/linux-lmsensors.h: Added missing string header
* src/components/net/tests/: net_values_by_code.c, net_values_by_name.c:
  Apply patch suggested by Will Cohen to check for system return values.
* src/Makefile.inc: Patch to cleanup dependencies, allowing for parallel
  makes. Patch due to Will Cohen from redhat
* src/: papi_internal.c, threads.c: Fix two race conditions that are
  probably the cause of the pthrtough double-free error. When freeing a
  thread, we remove and free all eventsets belonging to that thread. This
  could race with the thread itself removing the eventset, causing some ESI
  fields to be freed twice. The problem was found by using the Valgrind 3.8
  Helgrind tool
    valgrind --tool=helgrind --free-is-write=yes ctests/pthrtough
  In order for Helgrind to work, I had to temporarily modify PAPI to use
  POSIX pthread mutexes for locking.
Enhancements:
-------------
* general doxygen cleanups
* cleanup output of overflow_allcounters for clarity in debugging
* updates to most recent (as of Feb 1) libpfm4
* remove now-opaque event codes from papi_native_avail and
  papi_xml_event_info
* src/: papi_internal.c Update the component initialization code so that it
  can handle a PAPI ERROR return gracefully. Previously there was no way to
  indicate initialization failure besides just setting num_native_events
  to 0.

New Platforms:
-------------
* src/libpfm4/lib/: pfmlib_amd64_fam11h.c, events/amd64_events_fam11h.h
  Support for AMD Family 11.
* src/libpfm4/lib/: pfmlib_amd64_fam12h.c, events/amd64_events_fam12h.h
  Support for AMD Family 12.

Deprecated Platforms:
---------------------
* remove obsolete ACPI component

New / Improved Components:
---------------
* PAPI CUDA component updated for CUDA / CUPTI 4.1.
* SetCudaDevice() now works with the latest CUDA 4.1 version.
* Auto-detection of CUDA version for backward compatibility.
* PAPI_read() now accumulates event values. This fixes a bug in earlier
  versions.
* extensive updates and cleanups to the example and coretemp components.
* significant updates of lustre and mx components
* The linux net component underwent extensive updates and cleanups. In
  particular, it now dynamically detects the network interface names and
  exports 16 counters for each interface (see also
  src/components/net/{CHANGES,README}).

Open Issues:
-------------
* multiplex1.c was rewritten to expose a multiplexing bug in the
  perf_events kernel (3.0.3) for MIPS
* src/components/lmsensors/: Latest versions of lmsensors are incompatible
  with current lmsensors component. Interface needs to be updated for
  forward compatibility.
* There's a problem with broken overflow on POWER6 linux systems. We
  suspect a kernel problem, but don't know exactly which version(s). We're
  running a 2.6.36 kernel where the problem has been identified. It may be
  fixed in newer versions.
Experimental:
-------------
* a new vmware component has been added to report a variety of soft events
  when running as a guest in a VMware environment

===============================================================================
PAPI 4.2.0 RELEASE NOTES 26 Oct 2011
===============================================================================

For specific and detailed information on changes made in this release, grep
the ChangeLogP420.txt file for keywords of interest or go directly to the
PAPI cvs tree.

GENERAL NOTES
===============================================================================
This is a major release of PAPI-C. It adds a significant new feature in
user-defined events. It also marks a shift from external (and outdated) man
pages to doxygen generated man pages. These pages can be found online at:
http://icl.cs.utk.edu/papi/docs/. They are also installable with
"make install", and you can build your own versions using doxygen.

Visit the PAPI Reference pages for more information at:
http://icl.cs.utk.edu/projects/papi/wiki/Main_Page

And visit the PAPI website for the latest updates:
http://icl.cs.utk.edu/papi/

RECENT CHANGES IN PAPI 4.2.0
===============================================================================

Bug Fixes:
----------
* Bug in CUDA v4.0 fixed. It caused a threaded application to hang when the
  parent called cuInit() before fork() and the child also called cuInit().
  All fork ctests pass now if papi is configured with the cuda component.
* If papi is configured with the cuda component and running a threaded
  application, we need to make sure that a thread doesn't free the same
  memory location(s) more than once. Now all pthread ctests pass with cuda.
* ctests/thrspecific works now with the CUDA component
* Added CudaRemoveEvent functionality (broken in earlier CUDA RC versions).
* ctests/all_native_events works now for the default CUDA device.
* Add locking to papi_pfm4_events so that adding/looking up event names
  doesn't have a race condition when multiple threads are doing it at once.
* Fixed a series of problems with Itanium builds.
* Set FD_CLOEXEC on the overflow signal handler fd. Otherwise if we exec()
  with overflow enabled, the exec'd process will quickly die due to lack of
  a signal handler. This patch is needed due to a change in behavior in
  Linux 3.0. Mark Krentel first noticed this problem.
* Recent Ubuntu versions use the ld flag --as-needed by default, which
  breaks the PAPI configure step for the libdl check, as the --as-needed
  flag enforces the rule that libraries must come after the object files on
  the command line, not before. The fix for this is to put the libdl check
  in LIBS instead of in LDFLAGS.
* Removed an fopen() without an fclose() on /proc/cpuinfo in papi.c. This
  was being done to set the event masks properly for itanium and p4. Since
  the platform code sets CPU vendor and family for us we don't really have
  to open cpuinfo. This fix may also work on non-Linux systems.
* Update papi.h to properly detect if being built with a C99 compiler.

Enhancements:
-------------
* Default support for libpfm4
* ./configure --with-libpfm3 to support legacy libpfm3 builds
* PERF_COUNT_SW software events are available under perf_events with
  libpfm4
* Nehalem/Westmere/SandyBridge Offcore event support is ready, but support
  is not yet available in the Linux kernel.
* Add new utility to display PAPI error codes and description strings.
* Add API to access error descriptions: PAPI_descr_error(int error_code).
* Add support for handling multiattach properly.
* Cleanups to avoid gcc-4.6 warnings.
* Added ability to add tests to components. All component tests are
  compiled with PAPI when typing 'make' and cleaned up with 'make clean' or
  'make clobber'. Also added tests to the example and cuda components.
* CUDA component is now thread-safe. Multiple CPU threads can access the
  same CUDA context.
  Note, it's possible to create a different CUDA context for each thread,
  but then we are likely running into a limitation that only one context
  can be profiled at a time.
* LOTS of code cleanup thanks to Will Cohen of RedHat.
* Refactored test code so no-cpu-counters can build with components
* Build all utilities with no-cpu-counters
* Modify run_tests.sh so that you can set the VALGRIND command externally
  via environment variable without having to edit run_tests.sh itself. Also
  adds Date and cpuinfo information to the beginning of run_tests.sh
  results. This can help when run_tests.sh output is passed around when
  debugging a problem.
* Parallel make now works.

New Platforms:
-------------
* AMD Family 14h Bobcat (libpfm4 only)
* Intel SandyBridge (libpfm4 only)
* ARM Cortex-A8 and Cortex-A9 (libpfm4 only)

Deprecated Platforms:
---------------------
* although still technically supported, we are no longer actively testing
  platforms based on the perfmon and perfctr patches. All linux kernels
  > 2.6.32 provide internal support for perf_events.

New / Improved Components:
---------------
* Add a number of 'native' events to the component info structure in the
  example component.
* Introduce a papi_component_avail utility; lists the components we were
  built with, optionally with native/preset counts and version number.

Open Issues:
-------------
* On newer Linux kernels (2.6.34+) the nmi_watchdog counter can steal one
  of the counters, reducing by one the total available. There's a bug in
  Linux where if you try to use the full number of counters on such a
  system with a group leader, the sys_perf_open() call will succeed only to
  fail at read time (instead of returning the proper error code at open
  time). I do wish there were a way to notify the user more visibly,
  because losing a counter (when you might only have 4 total to begin with)
  is a big deal, and most Linux vendors are starting to ship kernels with
  the nmi_watchdog enabled.
Experimental:
-------------
* Preliminary support for MIPS 74K.

===============================================================================
PAPI 4.1.4 RELEASE NOTES 29 Aug 2011
===============================================================================

For specific and detailed information on changes made in this release, grep
the ChangeLogP414.txt file for keywords of interest or go directly to the
PAPI cvs tree.

GENERAL NOTES
===============================================================================
This is an internal release of PAPI-C targeted specifically for a Cray tools
release. It precedes a more general 4.2.0 release and incorporates changes
and updates since PAPI 4.1.3. Detailed changes will be documented in the
4.2.0 release. Meanwhile the list below highlights the most significant
changes since 4.1.3.

* Intel SandyBridge is now supported
* libpfm4 support has been updated
* internal doxygen documentation has been added for the entire API
* the man pages have been replaced with doxygen generated man pages
* CUDA component support has been improved
* an infrastructure for testing components only has been implemented
* various bugs have been addressed

If you find issues with the 4.1.4 release, please bring them to our
attention ASAP, so they can be addressed prior to the general 4.2.0 release.

Visit the PAPI Reference pages for more information at:
http://icl.cs.utk.edu/projects/papi/wiki/Main_Page

And visit the PAPI website for the latest updates:
http://icl.cs.utk.edu/papi/

===============================================================================
PAPI 4.1.3 RELEASE NOTES 06 May 2011
===============================================================================

For specific and detailed information on changes made in this release, grep
the ChangeLogP413.txt file for keywords of interest or go directly to the
PAPI cvs tree.
GENERAL NOTES
===============================================================================
This is a minor release of PAPI-C. It addresses a number of bugs and other
issues that have surfaced since the 4.1.2 release.

Visit the PAPI Reference pages for more information at:
http://icl.cs.utk.edu/projects/papi/wiki/Main_Page

And visit the PAPI website for the latest updates:
http://icl.cs.utk.edu/papi/

CHANGES IN PAPI 4.1.3 SINCE PAPI 4.1.2
===============================================================================

Bug Fixes:
----------
* Fixed a linux-timer.c compile error that only shows up on PPC.
* Fixed ctests/all_native_events.c: It failed when PAPI was built with
  several components because the eventset failed to add events that were
  not from the first component.
* Redefined PAPI_FP_OPS for Nehalem; now counts properly for 32-bit code.
* Uncovered and resolved bugs in attaching to fork/exec'd code.
* Reworked eventset cleanup code to avoid an error situation in perf_events
  where events were being removed from a terminated attached process.
* Fixed a configure bug preventing non-default bitmode builds of perf_event
  versions of PAPI.

Enhancements:
-------------
* consolidated a bunch of duplicated linux code into "linux-xxx.c" files.
* Split WIN32 specific code out from linux common code.
* Renamed various perfctr functions to be _perfctr_ rather than _linux_.
* Added function pointer destroy_eventset to the PAPI vector table. Needed
  for the CUDA component.
* PAPI_assign_eventset_component now refuses to reassign components.
* Implemented inherit feature for perf_events. Thanks to Gary Mohr.
* Added a case to utils/cost.c to test for processing derived events.
* Added utils/multiplex_cost.c.
* Added --with-assumed-kernel to configure

New Platforms:
-------------
* POWER7 / AIX support is now available (see Known Bugs below)
* Intel Westmere for perfctr.
* AMD Family 15h (Interlagos) and 10h RevE processors.
Deprecated Platforms:
---------------------

New Components:
---------------
* NVidia CUDA: still in pre-release until NVidia releases official CUDA4.

Open Issues:
-------------
* Currently using PAPI_attach() to attach to multiple processes at the same
  time will not work. On the perf_events substrate this may fail with a
  PAPI_EISRUN error for the subsequent attaches. On other substrates the
  additional attaches may work but results read back will be invalid. This
  behavior will be fixed in a subsequent PAPI release.

Experimental:
-------------
* libpfm4 support is experimentally available but subject to change

Known Bugs:
-----------
* POWER7 / AIX has some known bugs in this version:
  * PAPI_FP_OPS overcounts by 50% in many cases
  * multiplexing does not work correctly
  * memory limits for threaded tests are causing problems

===============================================================================
PAPI 4.1.2 RELEASE NOTES 20 Jan 2011
===============================================================================

For specific and detailed information on changes made in this release, grep
the ChangeLogP412.txt file for keywords of interest or go directly to the
PAPI cvs tree.

GENERAL NOTES
===============================================================================
This is a minor release of PAPI-C. It addresses a number of bugs and other
issues that have surfaced since the 4.1.1 release.
Visit the PAPI Reference pages for more information at:
http://icl.cs.utk.edu/projects/papi/wiki/Main_Page

And visit the PAPI website for the latest updates:
http://icl.cs.utk.edu/papi/

CHANGES IN PAPI 4.1.2 SINCE PAPI 4.1.1
===============================================================================

Bug Fixes:
----------
* fixed a long-standing subtle bug identified by Richard Strong that caused
  segfaults when multiplexing
* fixed several bugs that were causing test failures on POWER6/AIX
* properly detect Pentium M in configure
* fixed a problem with perf_events not properly handling overflows; first
  identified by Mark Krentel
* fixed a problem where perfctr was silently adding uncountable events
* fixed a lock bug identified by Martin Schindewolf
* fixed forking order for {multi|zero}_attach.c

Enhancements:
-------------
* updated support for FreeBSD submitted by Harald Servat
* a plethora of code cleanups submitted by Robert Richter
* addressed compatibility issues in run_tests.sh to make it POSIX compliant
* refreshed PAPI_Matlab support
* reimplemented SUBDBG print capabilities to address an issue first
  identified by Maynard Johnson
* refreshed preset event definitions for Nehalem, including implementations
  for PAPI_HW_INT; submitted by Michel Brown
* added 3 new error codes: PAPI_EATTR, PAPI_ECOUNT, and PAPI_ECOMBO.
  These provide more detail on why an event add fails
* implement cpuid leaf4 metrics required by Intel Westmere

New Platforms:
-------------
* Intel Westmere on perfctr and perf_events

Deprecated Platforms:
---------------------

New Components:
---------------

Open Issues:
-------------
* PowerPC970 / linux is currently not supported by configure
* POWER7 / AIX support is in development

Experimental:
-------------
* libpfm4 support is experimentally available and subject to change

Known Bugs:
-----------

===============================================================================
PAPI 4.1.1 RELEASE NOTES 01 Oct 2010
===============================================================================

For specific and detailed information on changes made in this release, grep
the ChangeLogP411.txt file for keywords of interest or go directly to the
PAPI cvs tree.

GENERAL NOTES
===============================================================================
This is a minor release of PAPI-C. It addresses a number of bugs and other
issues that have surfaced since the 4.1.0 release.

Visit the PAPI Reference pages for more information at:
http://icl.cs.utk.edu/projects/papi/wiki/Main_Page

And visit the PAPI website for the latest updates:
http://icl.cs.utk.edu/papi/

CHANGES IN PAPI 4.1.1 SINCE PAPI 4.1.0
===============================================================================

Bug Fixes:
----------
* resolved confusion in event table naming for Intel Core, Core2 and Core
  Duo processors; cleaned up Nehalem and Westmere event definitions.
* the --with-no-cpu-counters function and timing functions for AIX were
  fixed.
* compiler flags for AIX Fortran were fixed.
* doc directory is now preserved to prevent 'make clean' from entering an
  infinite loop.
* prevent passing -Wextra to libpfm build, which was throwing errors in
  that build under certain circumstances.
* fix a subtle problem in multiplexing in which final counter values could
  be under-reported.
  Changes the behavior of PAPI_stop when multiplexing. See the ChangeLog
  for further details.

Enhancements:
-------------
* now supports attach/detach for perf_events, thanks to Gary Mohr.
* update cache information for recent Intel x86 processors.
* F_SETOWN_EX was implemented in perf_events to guarantee that each process
  receives its own interrupts. This fixes a bug in high interrupt rates
  reported by Rice.
* perf_events checks permissions at configuration rather than at start.
  Thanks to Gary Mohr.
* Pentium IV now supported under perf_events in kernel 2.6.35
* add a WARNING for test cases that don't fail but have issues that may
  need to be addressed.
* add OS kernel version to component info struct; useful for enabling /
  disabling features in PAPI based on kernel version
* updated to the terminal release (3.10) of libpfm.
* mmtimer support added for Altix / perf_events.

New Platforms:
-------------

Deprecated Platforms:
---------------------
* support for perf_counters in the 2.6.31 Linux kernel has been deprecated

New Components:
---------------
* CoreTemp: exposes sensor readings in the /sys/class/hwmon directory

Open Issues:
-------------
* support for cross-compiling perf-events on new Cray architectures is
  still in development.

Experimental:
-------------

Known Bugs:
-----------

===============================================================================
PAPI 4.1.0 RELEASE NOTES 22 Jun 2010
===============================================================================

For specific and detailed information on changes made in this release, grep
the ChangeLogP410.txt file for keywords of interest or go directly to the
PAPI cvs tree.

GENERAL NOTES
===============================================================================
This is the second release of Component PAPI, or PAPI-C. See other
references to PAPI-C, including the description in this file under PAPI
4.0.0, for details on the differences between Classic PAPI and PAPI-C.
This release includes significant code cleanup to eliminate compiler
warnings, type inconsistencies, and memory leaks. We also now support
embedded doxygen comments for documentation. See the PAPI website for more
details.

The component build environment has been restructured to make it easier to
add and build components without modifying baseline PAPI code. See
/src/components/README for details.

Visit the PAPI Reference pages for more information at:
http://icl.cs.utk.edu/projects/papi/wiki/Main_Page

And visit the PAPI website for the latest updates:
http://icl.cs.utk.edu/papi/

CHANGES IN PAPI 4.1.0 SINCE PAPI 4.0.0
===============================================================================

Bug Fixes:
----------
* configure was mis-identifying some Pentium 4 processors
* the ctests/shlib test now tests against the shared math library, libm.so,
  instead of libpapi.so, which works more predictably with library
  renaming.
* multiplexing was silently returning without setting multiplex TRUE in
  cases where no event had been assigned to an eventset. An event must be
  added to an eventset or PAPI_assign_eventset_component() must be called
  before multiplexing can be enabled. This silent error has been removed.
* the perfmon and perf_events counter interfaces were not properly handling
  event unit masks. This has been fixed.
* PAPI_name_to_code() was not exiting properly in certain circumstances,
  failing on events where there should have been a match. This is
  corrected.
* a serious but insidious bug in the overflow logic was corrected. This bug
  would only show up when PAPI_overflow was called between calls to
  PAPI_add_event. Overflow would only be set for the last call of
  PAPI_overflow. This has been corrected.
* IBM Blue Gene P systems were corrupting stack frames and crashing when
  the papi_get_event_info call was executed. This has been fixed.
* The PAPI cycles event was not working for IBM Blue Gene P. This is fixed.
* papi_native_avail was exiting improperly when using the -e option. This
  caused problems with batch execution systems (like Blue Gene P). This has
  been fixed.
* a significant number of memory leaks have been purged.
* compiler warning flags have been tightened and a range of warnings have
  been eliminated.
* removed implicit type conversions in prototypes.

Enhancements:
-------------
* the utils/papi_version utility now reports four digits where the last
  digit matches the patch number.
* Pentium II and Athlon now use libpfm for event decoding like all other
  x86 platforms.
* Doxygen documentation has been added to the API and components.
* Component compilation has been completely restructured. See
  /papi/src/components/README for details.
* PAPI can now be compiled with a no-cpu-counters option.

New Platforms:
-------------
* the ultrasparc architecture has been resurrected
* FreeBSD support was migrated from PAPI 3.7
* Intel Nehalem EX and Westmere support has been added

Deprecated Platforms:
---------------------
* IBM BG/L has been deprecated.
* POWER3 and POWER4 have been deprecated

New Components:
---------------
* Infiniband: Experimental
* Lustre: Experimental
* example: provides simple test case and template code.

Open Issues:
-------------

Experimental:
-------------

Known Bugs:
-----------

===============================================================================
PAPI 4.0.0 RELEASE NOTES 19 Jan 2010
===============================================================================

For specific and detailed information on changes made in this release, grep
the ChangeLogP400.txt file for keywords of interest or go directly to the
PAPI cvs tree.

GENERAL NOTES
===============================================================================
This is the inaugural release of Component PAPI, or PAPI-C. It represents a
significant architectural change from PAPI 3.7.x and earlier.
As such, your application must be recompiled and relinked against libpapi,
the PAPI library, for this version to work.

PAPI-C is backward compatible with earlier versions of PAPI. All new
library features are supported through new APIs and all old APIs still work
as expected. Applications instrumented for PAPI should continue to work as
expected with no changes.

The major change in PAPI-C is the support of multiple components, or
counting domains, in addition to the traditional hardware counters found in
the cpu.

The goal of this first release of PAPI-C is to provide a stable technology
platform within which to explore the development and implementation of
additional components. Although a small number of components are provided
with this release, the major objective has been to guarantee that PAPI-C
works at least as well as earlier PAPI releases and on the same range of
hardware platforms. We think we have achieved that goal.

Visit the PAPI Reference pages for more information at:
http://icl.cs.utk.edu/projects/papi/wiki/Main_Page

And visit the PAPI website for the latest updates:
http://icl.cs.utk.edu/papi/

CHANGES IN PAPI 4.0.0 SINCE PAPI 3.7.2
===============================================================================

Bug Fixes:
----------

Enhancements:
-------------
- The perf_events linux kernel interface is supported for POWER and x86 in
  linux kernels 2.6.31 and above.
- PAPI info now includes information on multicore hierarchy. This is
  reported in the header of many tests.

New Platforms:
-------------
- IBM Blue Gene P has been fully integrated into the code base. It still
  suffers the same quirks and limitations as the earlier pre-release.

Open Issues:
-------------
- Components are invoked from the configure line; requires PAPI source code
  modifications to add new components.
Experimental:
-------------

Known Bugs:
-----------
- some tests involving overflow and profiling fail with linux perf_events
- multiple event overflow only works for the last event enabled on (at
  least) Intel Core2 and Itanium architectures.
- clock speeds on variable speed Intel systems can be misreported,
  leading to incorrect calculations of mflops
- memory leaks may lead to (rare) seg faults on Pentium4 systems

===============================================================================
PAPI 3.7.2 RELEASE NOTES
02 Dec 2009
===============================================================================

For specific and detailed information on changes made in this release,
grep the ChangeLogP372.txt file for keywords of interest or go directly
to the PAPI cvs tree.

GENERAL NOTES
===============================================================================

This release is an incremental upgrade to PAPI 3.7.1. It fixes a mistake
in the 3.7.1 release by updating configure to better detect the proper
counter interface in linux kernels. Along the way, it also cleans up a
few issues found in the 3.7.1 release.

As always, if you identify strange behavior or reproducible bugs, please
contact the PAPI team or visit the PAPI User Forum.
And visit the PAPI website for the latest updates:
http://icl.cs.utk.edu/papi/

CHANGES IN PAPI 3.7.2 SINCE PAPI 3.7.1
===============================================================================

Bug Fixes:
----------
- fixed L3 cache size reporting for AMD Family 10h processors
- fixed std deviation underflow in sdsc2 and sdsc4 tests
- fixed bug in counter assignment for FreeBSD Atom implementation

Enhancements:
-------------
- updated cache tables for Intel Nehalem i7 processors
- configure provides better autodetection of 2.6.31 or 2.6.32 kernels
  and perf_counter interface (in most cases)
- configure provides better detection and autoselection of perfctr or
  perfmon drivers for linux
- configure and sources have been modified to support perf_counter on
  kernel 2.6.31 and perf_event on kernel 2.6.32
- a papi.spec file has been added to simplify creation of rpms

===============================================================================
PAPI 3.7.1 RELEASE NOTES
13 Nov 2009
===============================================================================

This file documents changes in recent PAPI releases in inverse
chronological order. For details on installing PAPI on your machine,
consult the INSTALL.txt file in this directory.

For specific and detailed information on changes made in this release,
grep the ChangeLogP371.txt file for keywords of interest or go directly
to the PAPI cvs tree.

GENERAL NOTES
===============================================================================

This release is an incremental upgrade to PAPI 3.7.0. It cleans up
several issues found in the 3.7.0 release and provides better support
for the perf_counter interface introduced in Linux kernel 2.6.31.

As always, if you identify strange behavior or reproducible bugs, please
contact the PAPI team or visit the PAPI User Forum.
And visit the PAPI website for the latest updates:
http://icl.cs.utk.edu/papi/

NOTE: If you are looking for the man pages and other user documentation,
look online. We decided we could provide better and more timely support
by maintaining just the online documentation. Let us know if you think
this is a bad decision.

CHANGES IN PAPI 3.7.1 SINCE PAPI 3.7.0
===============================================================================

Bug Fixes:
----------
- fixed long standing subtle multiplexing bug in which TIDs and PIDs
  would get confused. TIDs would then get lost, leading to long term
  instability.
- fixed unit mask handling in perf_counters
- fixed uninitialized string issue in /proc/cpuinfo parsing
- fixed event reporting errors for various Opteron Family 10h models

Enhancements:
-------------
- FreeBSD support for Intel i7
- cleaned up libpapi.so naming for RedHat rpms
- cleaned up various other issues for rpms per RedHat
- autodetection of 2.6.31 perf_counter interface (in most cases)
- enhanced packaging options in configure to support building either
  static or shared libraries independently

New Platforms:
-------------
- Support for the perf_counters (PCL: Performance Counters for Linux)
  interface for Linux kernel 2.6.31 and later has been more completely
  tested on a broader range of platforms, including Opteron, Core2, i7,
  and POWER. It successfully performs basic counting operations and
  handles many multiplex, overflow and profiling situations. It is still
  not as extensively tested as the perfmon or perfctr interfaces, but is
  ready for work. Caveat Emptor.
Major Issues:
-------------
- see 3.7.0

Experimental:
-------------

Known Bugs:
-----------
- see 3.7.0

===============================================================================
PAPI 3.7.0 RELEASE NOTES
08 Sep 2009
===============================================================================

For specific and detailed information on changes made in this release,
grep the ChangeLogP370.txt file for keywords of interest or go directly
to the PAPI cvs tree.

GENERAL NOTES
===============================================================================

This release is a recommended upgrade to PAPI 3.6.x. It addresses a
number of open issues and introduces support for several new platforms,
including Intel Nehalem (Core i7), Atom, POWER7 and Niagara2. If you are
currently using PAPI 3.6.x or earlier, it is recommended that you
upgrade to this version.

As always, if you identify strange behavior or reproducible bugs, please
contact the PAPI team or the PAPI User Forum.

And visit the PAPI website for the latest updates:
http://icl.cs.utk.edu/papi/

NOTE: If you are looking for the man pages and other user documentation,
look online. We decided we could provide better and more timely support
by maintaining just the online documentation. Let us know if you think
this is a bad decision.
CHANGES IN PAPI 3.7.0 SINCE PAPI 3.6.2
===============================================================================

Bug Fixes:
----------
- many minor bugs fixed in tests and in specific cpu components
- fixed support for Intel CoreDuo (not Core2) broken in PAPI 3.6.x
- fixed library init failure on AIX Power6 when executable names
  > 32 char long
- fixed avail.F construct that was crashing some versions of gfortran

Enhancements:
-------------
- A new utility has been added: papi_version
- Added 4 new PRESET events to better handle SIMD instructions on Intel
  cpus:
  PAPI_DP_OPS - counts double precision scalar and vector FP operations
  PAPI_SP_OPS - counts single precision scalar and vector FP operations
  PAPI_VEC_DP - counts double precision vector instructions
  PAPI_VEC_SP - counts single precision vector instructions
- FreeBSD support upgrade and new support for Atom and Intel Core2

New Platforms:
-------------
- Intel Core i7 (Nehalem) support for 7 core counters; no support for
  Uncore counters
- Intel Atom
- AMD Opteron Barcelona, Shanghai, Istanbul event table support
- POWER7 support for Linux thanks to IBM
- Sun Niagara2 support thanks to Aachen University, Germany
- Resurrected support for PAPI on Windows; now supports Intel Core2 and
  Core i7

Major Issues:
-------------
- PAPI for Windows does not support 64-bit versions due to compiler
  issues.

Experimental:
-------------
- Support for the perf_counters (PCL: Performance Counters for Linux)
  interface is available as a technology pre-release for Linux kernel
  2.6.31 and later. This has been tested on IBM POWER and Intel Core2
  and successfully performs basic counting operations. It has not been
  stress tested. Caveat Emptor.

Known Bugs:
-----------
- clock speeds are occasionally not reported correctly for systems with
  SpeedStep technology.
- Intel Atom crashes on a small number of standard tests.
===============================================================================
PAPI 3.6.2 RELEASE NOTES
03 Oct 2008

NOTE: For releases prior to PAPI 3.7.0, please reference the tarball for
an earlier release, or use the on-line cvs viewer at:
http://icl.cs.utk.edu/viewcvs/viewcvs.cgi/PAPI/papi/
to see earlier versions of this file.
===============================================================================

papi-papi-7-2-0-t/bitbucket-pipelines.yml

# This is a sample build configuration for C++.
# Check our guides at https://confluence.atlassian.com/x/VYk8Lw for more examples.
# Only use spaces to indent your .yml configuration.
# -----
# You can specify a custom docker image from Docker Hub as your build environment.
image: gcc:6.1

pipelines:
  default:
    - step:
        script: # Modify the commands below to build your repository.
          - cd src
          - ./configure
          - make
          # - make fulltest
          - ctests/zero
          - ctests/cmpinfo
          - ctests/hwinfo
          - utils/papi_avail
          - utils/papi_native_avail -c --noqual -i PERF

papi-papi-7-2-0-t/delete_before_release.sh

#!/bin/sh
rm PAPI_FAQ.html
rm release_procedure.txt
rm gitlog2changelog.py
rm bitbucket-pipelines.yml
rm doc/DataRange.html
rm doc/PAPI-C.html
rm doc/README
rm src/buildbot_configure_with_components.sh
rm -rf .git
rm delete_before_release.sh
rm src/ctests/.gitignore
rm src/libpfm4/.gitignore
rm src/utils/.gitignore
rm src/.gitignore
rm src/ftests/.gitignore
rm src/testlib/.gitignore
rm src/components/.gitignore

papi-papi-7-2-0-t/doc/
papi-papi-7-2-0-t/doc/DataRange.html

Data and Instruction Range Restrictions in PAPI

Introduction

Performance instrumentation of data structures, as opposed to code segments, is a feature not widely supported across a range of platforms. One platform on which this feature is supported is the Itanium2. In fact, event counting on Itanium2 can be qualified by a number of conditioners, including instruction address, opcode matching, and data address. We have implemented a generalized PAPI interface for data structure and instruction range performance instrumentation, also referred to as data and instruction range specification, and applied that interface to the specific instance of the Itanium2 platform to demonstrate its viability. This feature is being introduced for the first time in the PAPI 3.5 release.

The PAPI Interface

Since PAPI is a platform-independent library, care must be taken when extending its feature set so as not to disrupt the existing interface or to clutter the API with calls to functionality that is not available on a large subset of the supported platforms. To that end, we elected to extend an existing PAPI call, PAPI_set_opt(), with the capability of specifying starting and ending addresses of data structures or instructions to be instrumented. The PAPI_set_opt() call previously supported functionality to set a variety of optional capabilities in the PAPI interface, including debug levels, multiplexing of eventsets, and the scope of counting domains. This call was extended with two new cases to support instruction and data address range specification: PAPI_INSTR_ADDRESS and PAPI_DATA_ADDRESS. To access these options, a user initializes a simple option-specific data structure and calls PAPI_set_opt() as illustrated in the code fragment below:

   ...
   option.addr.eventset = EventSet;
   option.addr.start = (caddr_t)array;
   option.addr.end = (caddr_t)(array + size_array);
   retval = PAPI_set_opt(PAPI_DATA_ADDRESS, &option);
   ...

The user creates a PAPI eventset and determines the starting and ending addresses of the data to be monitored. The call to PAPI_set_opt then prepares the interface to count events that occur on accesses to data in that range. The specific events to be monitored can be added to the eventset either before or after the data range is specified. In a similar fashion, an instruction range can be set using the PAPI_INSTR_ADDRESS option. If this option is supported on the platform in use, the data is transferred to the platform-specific implementation and handled appropriately. If not supported, the call returns an error message.

It may not always be possible to exactly specify the address range of interest. If this is the case, it is important that the user have some way to know what approximations have been made, so that appropriate corrective action can be taken. For instance, to isolate a specific data structure completely, it may be necessary to pad memory before and after the structure with dummy structures that are never accessed. To facilitate this, PAPI_set_opt() returns the offsets from the requested starting and ending addresses as they were actually programmed into the hardware. If the addresses were mapped exactly, these values are zero. An example of this is shown below:

   ...
   retval = PAPI_set_opt(PAPI_DATA_ADDRESS, &option);
   actual.start = (caddr_t)array - option.addr.start_off;
   actual.end = (caddr_t)(array + size_array) + option.addr.end_off;
   ...

Itanium Idiosyncrasies

There are roughly 475 native events available on Itanium 2. 160 of them are memory related and can be counted with data address specification in place; 283 can be counted using instruction address specification. All events in an eventset with data or instruction range specification in place must be one of these supported events. Further restrictions also apply to the use of data and instruction range specification, as described below. Data addresses can only be specified in coarse mode. Although four independent pairs of data address registers exist in the hardware and would suggest that four disjoint address regions can be monitored simultaneously, the Intel documentation strongly suggests that this is not a good idea. Further, the underlying software takes advantage of these register pairs to tune the range of addresses that is actually monitored. See the discussion under Data Address Ranges for further detail.

Instruction Address Ranges

Instruction ranges can be specified in one of two ways: coarse and fine. In fine mode, addresses can be specified exactly, but both the start and end addresses must exist on the same 4K byte page. In other words, the address range must be less than 4K bytes, and the addresses can only differ in the bottom 12 bits. If fine mode cannot be used, the underlying perfmon library automatically switches to coarse address specification. Four pairs of registers are available to specify coarse instruction address ranges. The restrictions to coarse address specification are discussed below.

Data Address Ranges

Data addresses can only be specified in coarse mode. As with instruction ranges, four pairs of registers are available to specify the data address ranges. Use of coarse mode addressing for either instruction or data address specification can cause some anomalous results. The Intel documentation points out that starting and ending addresses cannot be specified exactly, since the hardware representation relies on powers-of-two bitmasks. The perfmon library tries to optimize the alignment of these power-of-two regions to cover the addresses requested as effectively as possible with the four sets of registers available. Perfmon first finds the largest power-of-two address region completely contained within the requested addresses. Then it finds successively smaller power-of-two regions to cover the errors on the high and low end of the requested address range. The effective result is that the actual range specified is always equal to or larger than the requested range, completely contains it, and can occupy from one to four pairs of address registers. In some cases this can result in significant overcounts of the events of interest, especially if two active data structures are located in close proximity to each other. This may require that the developer insert some padding structures before and/or after a particular structure of interest to guarantee accurate counts.

Supporting Software

To make this new PAPI feature more accessible and easier to use, a test case was developed to both provide a coding example and to exercise and test the functionality of the data ranging features of the Itanium 2. In addition, the papi_native_avail utility was modified to make it easier to identify events that support these features.

The data_range.c Test Case

A test case, called data_range, was developed that measures memory load and store events on three different types of data structures. Three static arrays of 16,384 ints were declared in the program, and three dynamic arrays of 16,384 ints were malloc'd. The data range was specified sequentially to be the starting and ending addresses of each of:

  • the pointers to the malloc'd arrays;
  • the malloc'd arrays themselves;
  • the statically declared arrays.

The work done in each case consisted of storing an initialization value into each element of each array, and then summing the values of the elements. This should produce 16,384 loads and 16,384 stores on each array.

For the pointers, the size was 8 bytes and the starting and ending addresses could be specified exactly. Output is shown below:

Measure loads and stores on the pointers to the allocated arrays
Expected loads: 32768; Expected stores: 0
These loads result from accessing the pointers to compute array addresses.
They will likely disappear with higher levels of optimization.
Requested Start Address: 0x6000000000011640; Start Offset: 0x 0; Actual Start Address: 0x6000000000011640
Requested End Address: 0x6000000000011648; End Offset: 0x 0; Actual End Address: 0x6000000000011648
loads_retired: 32768
stores_retired: 0
Requested Start Address: 0x6000000000011628; Start Offset: 0x 0; Actual Start Address: 0x6000000000011628
Requested End Address: 0x6000000000011630; End Offset: 0x 0; Actual End Address: 0x6000000000011630
loads_retired: 32768
stores_retired: 0
Requested Start Address: 0x6000000000011638; Start Offset: 0x 0; Actual Start Address: 0x6000000000011638
Requested End Address: 0x6000000000011640; End Offset: 0x 0; Actual End Address: 0x6000000000011640
loads_retired: 32768
stores_retired: 0

For the allocated arrays, small offsets were introduced in each case, and the resulting error in the loads and stores is exactly what would be predicted by the activity in the adjacent memory locations:

Measure loads and stores on the allocated arrays themselves
Expected loads: 16384; Expected stores: 16384
Requested Start Address: 0x6000000004044010; Start Offset: 0x 10; Actual Start Address: 0x6000000004044000
Requested End Address: 0x6000000004054010; End Offset: 0x 0; Actual End Address: 0x6000000004054010
loads_retired: 16384
stores_retired: 16384
Requested Start Address: 0x6000000004054020; Start Offset: 0x 20; Actual Start Address: 0x6000000004054000
Requested End Address: 0x6000000004064020; End Offset: 0x 0; Actual End Address: 0x6000000004064020
loads_retired: 16388
stores_retired: 16388
Requested Start Address: 0x6000000004064030; Start Offset: 0x 30; Actual Start Address: 0x6000000004064000
Requested End Address: 0x6000000004074030; End Offset: 0x 10; Actual End Address: 0x6000000004074040
loads_retired: 16392
stores_retired: 16392

For the static arrays, the locations of the arrays resulted in significant offsets, and hence significant errors. The most interesting case is the second one, in which the starting offset can be seen to force the inclusion of all three pointers to the malloc'd arrays. Because of this the loads retired count is too high by 98310, almost exactly 3 * 32768 = 98304:

Measure loads and stores on the static arrays
These values will differ from the expected values by the size of the offsets.
Expected loads: 16384; Expected stores: 16384
Requested Start Address: 0x60000000000218cc; Start Offset: 0x 18cc; Actual Start Address: 0x6000000000020000
Requested End Address: 0x60000000000318cc; End Offset: 0x 734; Actual End Address: 0x6000000000032000
loads_retired: 18432
stores_retired: 18432
Requested Start Address: 0x60000000000118cc; Start Offset: 0x 18cc; Actual Start Address: 0x6000000000010000
Requested End Address: 0x60000000000218cc; End Offset: 0x 734; Actual End Address: 0x6000000000022000
loads_retired: 115155
stores_retired: 16845
Requested Start Address: 0x60000000000318cc; Start Offset: 0x 18cc; Actual Start Address: 0x6000000000030000
Requested End Address: 0x60000000000418cc; End Offset: 0x 734; Actual End Address: 0x6000000000042000
loads_retired: 17971
stores_retired: 17971

The papi_native_avail Utility

To effectively use the instruction and address range specification feature for Itanium 2, one must know which of the roughly 475 available native events support these features. In addition, there are other qualifiers to Itanium 2 native events that are valuable to inspect. For these reasons, the papi_native_avail utility was enhanced to make it possible to filter the list of native events by these qualifiers. A help feature was added to this utility to make it easier to remember the Itanium specific options:

> papi_native_avail --help
This is the PAPI native avail program.
It provides availability and detail information
for PAPI native events. Usage:
    papi_native_avail [options]
Options:

  -h, --help  print this help message
  --darr      display Itanium events that support Data Address Range Restriction
  --dear      display Itanium Data Event Address Register events only
  --iarr      display Itanium events that support Instruction Address Range Restriction
  --iear      display Itanium Instruction Event Address Register events only
  --opcm      display Itanium events that support OpCode Matching
  NOTE:       The last five options are mutually exclusive.

If any of these options are specified on the command line, only those events that support that option are displayed. Even so, the list can be extensive, with roughly 160 events supporting data address ranging, and even more supporting instruction address ranging.

papi-papi-7-2-0-t/doc/Doxyfile-common

# Doxyfile 1.7.4

# This file describes the settings to be used by the documentation system
# doxygen (www.doxygen.org) for a project.
#
# All text after a hash (#) is considered a comment and will be ignored.
# The format is:
#       TAG = value [value, ...]
# For lists items can also be appended using:
#       TAG += value [value, ...]
# Values that contain spaces should be placed between quotes (" ").

#---------------------------------------------------------------------------
# Project related configuration options
#---------------------------------------------------------------------------

# This tag specifies the encoding used for all characters in the config file
# that follow. The default is UTF-8 which is also the encoding used for all
# text before the first occurrence of this tag. Doxygen uses libiconv (or the
# iconv built into libc) for the transcoding. See
# http://www.gnu.org/software/libiconv for the list of possible encodings.

DOXYFILE_ENCODING = UTF-8

# The PROJECT_NAME tag is a single word (or a sequence of words surrounded
# by quotes) that should identify the project.

PROJECT_NAME = PAPI

# The PROJECT_NUMBER tag can be used to enter a project or revision number.
# This could be handy for archiving the generated documentation or
# if some version control system is used.

PROJECT_NUMBER = 7.2.0.0

# Using the PROJECT_BRIEF tag one can provide an optional one line description
# for a project that appears at the top of each page and should give viewer
# a quick idea about the purpose of the project. Keep the description short.

PROJECT_BRIEF =

# With the PROJECT_LOGO tag one can specify an logo or icon that is
# included in the documentation. The maximum height of the logo should not
# exceed 55 pixels and the maximum width should not exceed 200 pixels.
# Doxygen will copy the logo to the output directory.
PROJECT_LOGO =

# The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute)
# base path where the generated documentation will be put.
# If a relative path is entered, it will be relative to the location
# where doxygen was started. If left blank the current directory will be used.

OUTPUT_DIRECTORY = ./

# If the CREATE_SUBDIRS tag is set to YES, then doxygen will create
# 4096 sub-directories (in 2 levels) under the output directory of each output
# format and will distribute the generated files over these directories.
# Enabling this option can be useful when feeding doxygen a huge amount of
# source files, where putting all generated files in the same directory would
# otherwise cause performance problems for the file system.

CREATE_SUBDIRS = NO

# The OUTPUT_LANGUAGE tag is used to specify the language in which all
# documentation generated by doxygen is written. Doxygen will use this
# information to generate all constant output in the proper language.
# The default language is English, other supported languages are:
# Afrikaans, Arabic, Brazilian, Catalan, Chinese, Chinese-Traditional,
# Croatian, Czech, Danish, Dutch, Esperanto, Farsi, Finnish, French, German,
# Greek, Hungarian, Italian, Japanese, Japanese-en (Japanese with English
# messages), Korean, Korean-en, Lithuanian, Norwegian, Macedonian, Persian,
# Polish, Portuguese, Romanian, Russian, Serbian, Serbian-Cyrillic, Slovak,
# Slovene, Spanish, Swedish, Ukrainian, and Vietnamese.

OUTPUT_LANGUAGE = English

# If the BRIEF_MEMBER_DESC tag is set to YES (the default) Doxygen will
# include brief member descriptions after the members that are listed in
# the file and class documentation (similar to JavaDoc).
# Set to NO to disable this.

BRIEF_MEMBER_DESC = YES

# If the REPEAT_BRIEF tag is set to YES (the default) Doxygen will prepend
# the brief description of a member or function before the detailed description.
# Note: if both HIDE_UNDOC_MEMBERS and BRIEF_MEMBER_DESC are set to NO, the
# brief descriptions will be completely suppressed.

REPEAT_BRIEF = NO

# This tag implements a quasi-intelligent brief description abbreviator
# that is used to form the text in various listings. Each string
# in this list, if found as the leading text of the brief description, will be
# stripped from the text and the result after processing the whole list, is
# used as the annotated text. Otherwise, the brief description is used as-is.
# If left blank, the following values are used ("$name" is automatically
# replaced with the name of the entity): "The $name class" "The $name widget"
# "The $name file" "is" "provides" "specifies" "contains"
# "represents" "a" "an" "the"

ABBREVIATE_BRIEF =

# If the ALWAYS_DETAILED_SEC and REPEAT_BRIEF tags are both set to YES then
# Doxygen will generate a detailed section even if there is only a brief
# description.

ALWAYS_DETAILED_SEC = NO

# If the INLINE_INHERITED_MEMB tag is set to YES, doxygen will show all
# inherited members of a class in the documentation of that class as if those
# members were ordinary class members. Constructors, destructors and assignment
# operators of the base classes will not be shown.

INLINE_INHERITED_MEMB = NO

# If the FULL_PATH_NAMES tag is set to YES then Doxygen will prepend the full
# path before files name in the file list and in the header files. If set
# to NO the shortest path that makes the file name unique will be used.

FULL_PATH_NAMES = NO

# If the FULL_PATH_NAMES tag is set to YES then the STRIP_FROM_PATH tag
# can be used to strip a user-defined part of the path. Stripping is
# only done if one of the specified strings matches the left-hand part of
# the path. The tag can be used to show relative paths in the file list.
# If left blank the directory from which doxygen is run is used as the
# path to strip.
STRIP_FROM_PATH =

# The STRIP_FROM_INC_PATH tag can be used to strip a user-defined part of
# the path mentioned in the documentation of a class, which tells
# the reader which header file to include in order to use a class.
# If left blank only the name of the header file containing the class
# definition is used. Otherwise one should specify the include paths that
# are normally passed to the compiler using the -I flag.

STRIP_FROM_INC_PATH =

# If the SHORT_NAMES tag is set to YES, doxygen will generate much shorter
# (but less readable) file names. This can be useful if your file system
# doesn't support long names like on DOS, Mac, or CD-ROM.

SHORT_NAMES = NO

# If the JAVADOC_AUTOBRIEF tag is set to YES then Doxygen
# will interpret the first line (until the first dot) of a JavaDoc-style
# comment as the brief description. If set to NO, the JavaDoc
# comments will behave just like regular Qt-style comments
# (thus requiring an explicit @brief command for a brief description.)

JAVADOC_AUTOBRIEF = NO

# If the QT_AUTOBRIEF tag is set to YES then Doxygen will
# interpret the first line (until the first dot) of a Qt-style
# comment as the brief description. If set to NO, the comments
# will behave just like regular Qt-style comments (thus requiring
# an explicit \brief command for a brief description.)

QT_AUTOBRIEF = NO

# The MULTILINE_CPP_IS_BRIEF tag can be set to YES to make Doxygen
# treat a multi-line C++ special comment block (i.e. a block of //! or ///
# comments) as a brief description. This used to be the default behaviour.
# The new default is to treat a multi-line C++ comment block as a detailed
# description. Set this tag to YES if you prefer the old behaviour instead.

MULTILINE_CPP_IS_BRIEF = NO

# If the INHERIT_DOCS tag is set to YES (the default) then an undocumented
# member inherits the documentation from any documented member that it
# re-implements.
INHERIT_DOCS = YES

# If the SEPARATE_MEMBER_PAGES tag is set to YES, then doxygen will produce
# a new page for each member. If set to NO, the documentation of a member will
# be part of the file/class/namespace that contains it.

SEPARATE_MEMBER_PAGES = NO

# The TAB_SIZE tag can be used to set the number of spaces in a tab.
# Doxygen uses this value to replace tabs by spaces in code fragments.

TAB_SIZE = 4

# This tag can be used to specify a number of aliases that acts
# as commands in the documentation. An alias has the form "name=value".
# For example adding "sideeffect=\par Side Effects:\n" will allow you to
# put the command \sideeffect (or @sideeffect) in the documentation, which
# will result in a user-defined paragraph with heading "Side Effects:".
# You can put \n's in the value part of an alias to insert newlines.

ALIASES =

# Set the OPTIMIZE_OUTPUT_FOR_C tag to YES if your project consists of C
# sources only. Doxygen will then generate output that is more tailored for C.
# For instance, some of the names that are used will be different. The list
# of all members will be omitted, etc.

OPTIMIZE_OUTPUT_FOR_C = YES

# Set the OPTIMIZE_OUTPUT_JAVA tag to YES if your project consists of Java
# sources only. Doxygen will then generate output that is more tailored for
# Java. For instance, namespaces will be presented as packages, qualified
# scopes will look different, etc.

OPTIMIZE_OUTPUT_JAVA = NO

# Set the OPTIMIZE_FOR_FORTRAN tag to YES if your project consists of Fortran
# sources only. Doxygen will then generate output that is more tailored for
# Fortran.

OPTIMIZE_FOR_FORTRAN = NO

# Set the OPTIMIZE_OUTPUT_VHDL tag to YES if your project consists of VHDL
# sources. Doxygen will then generate output that is tailored for
# VHDL.

OPTIMIZE_OUTPUT_VHDL = NO

# Doxygen selects the parser to use depending on the extension of the files it
# parses. With this tag you can assign which parser to use for a given extension.
# Doxygen has a built-in mapping, but you can override or extend it using this
# tag. The format is ext=language, where ext is a file extension, and language
# is one of the parsers supported by doxygen: IDL, Java, Javascript, CSharp, C,
# C++, D, PHP, Objective-C, Python, Fortran, VHDL, C, C++. For instance to make
# doxygen treat .inc files as Fortran files (default is PHP), and .f files as C
# (default is Fortran), use: inc=Fortran f=C. Note that for custom extensions
# you also need to set FILE_PATTERNS otherwise the files are not read by doxygen.

EXTENSION_MAPPING =

# If you use STL classes (i.e. std::string, std::vector, etc.) but do not want
# to include (a tag file for) the STL sources as input, then you should
# set this tag to YES in order to let doxygen match functions declarations and
# definitions whose arguments contain STL classes (e.g. func(std::string); v.s.
# func(std::string) {}). This also makes the inheritance and collaboration
# diagrams that involve STL classes more complete and accurate.

BUILTIN_STL_SUPPORT = NO

# If you use Microsoft's C++/CLI language, you should set this option to YES to
# enable parsing support.

CPP_CLI_SUPPORT = NO

# Set the SIP_SUPPORT tag to YES if your project consists of sip sources only.
# Doxygen will parse them like normal C++ but will assume all classes use public
# instead of private inheritance when no explicit protection keyword is present.

SIP_SUPPORT = NO

# For Microsoft's IDL there are propget and propput attributes to indicate getter
# and setter methods for a property. Setting this option to YES (the default)
# will make doxygen replace the get and set methods by a property in the
# documentation. This will only work if the methods are indeed getting or
# setting a simple type. If this is not the case, or you want to show the
# methods anyway, you should set this option to NO.
IDL_PROPERTY_SUPPORT   = YES

# If member grouping is used in the documentation and the DISTRIBUTE_GROUP_DOC
# tag is set to YES, then doxygen will reuse the documentation of the first
# member in the group (if any) for the other members of the group. By default
# all members of a group must be documented explicitly.

DISTRIBUTE_GROUP_DOC   = NO

# Set the SUBGROUPING tag to YES (the default) to allow class member groups of
# the same type (for instance a group of public functions) to be put as a
# subgroup of that type (e.g. under the Public Functions section). Set it to
# NO to prevent subgrouping. Alternatively, this can be done per class using
# the \nosubgrouping command.

SUBGROUPING            = YES

# When the INLINE_GROUPED_CLASSES tag is set to YES, classes, structs and
# unions are shown inside the group in which they are included (e.g. using
# @ingroup) instead of on a separate page (for HTML and Man pages) or
# section (for LaTeX and RTF).

INLINE_GROUPED_CLASSES = NO

# When TYPEDEF_HIDES_STRUCT is enabled, a typedef of a struct, union, or enum
# is documented as struct, union, or enum with the name of the typedef. So
# typedef struct TypeS {} TypeT, will appear in the documentation as a struct
# with name TypeT. When disabled the typedef will appear as a member of a file,
# namespace, or class. And the struct will be named TypeS. This can typically
# be useful for C code in case the coding convention dictates that all compound
# types are typedef'ed and only the typedef is referenced, never the tag name.

TYPEDEF_HIDES_STRUCT   = YES

# The SYMBOL_CACHE_SIZE determines the size of the internal cache used to
# determine which symbols to keep in memory and which to flush to disk.
# When the cache is full, less often used symbols will be written to disk.
# For small to medium size projects (<1000 input files) the default value is
# probably good enough. For larger projects a too small cache size can cause
# doxygen to be busy swapping symbols to and from disk most of the time,
# causing a significant performance penalty.
# If the system has enough physical memory, increasing the cache will improve
# performance by keeping more symbols in memory. Note that the value works on
# a logarithmic scale, so increasing the size by one will roughly double the
# memory usage. The cache size is given by this formula:
# 2^(16+SYMBOL_CACHE_SIZE). The valid range is 0..9, the default is 0,
# corresponding to a cache size of 2^16 = 65536 symbols.

SYMBOL_CACHE_SIZE      = 0

#---------------------------------------------------------------------------
# Build related configuration options
#---------------------------------------------------------------------------

# If the EXTRACT_ALL tag is set to YES doxygen will assume all entities in
# documentation are documented, even if no documentation was available.
# Private class members and static file members will be hidden unless
# the EXTRACT_PRIVATE and EXTRACT_STATIC tags are set to YES.

EXTRACT_ALL            = NO

# If the EXTRACT_PRIVATE tag is set to YES all private members of a class
# will be included in the documentation.

EXTRACT_PRIVATE        = NO

# If the EXTRACT_STATIC tag is set to YES all static members of a file
# will be included in the documentation.

EXTRACT_STATIC         = YES

# If the EXTRACT_LOCAL_CLASSES tag is set to YES classes (and structs)
# defined locally in source files will be included in the documentation.
# If set to NO only classes defined in header files are included.

EXTRACT_LOCAL_CLASSES  = YES

# This flag is only useful for Objective-C code. When set to YES local
# methods, which are defined in the implementation section but not in
# the interface, are included in the documentation.
# If set to NO (the default) only methods in the interface are included.
EXTRACT_LOCAL_METHODS  = NO

# If this flag is set to YES, the members of anonymous namespaces will be
# extracted and appear in the documentation as a namespace called
# 'anonymous_namespace{file}', where file will be replaced with the base
# name of the file that contains the anonymous namespace. By default
# anonymous namespaces are hidden.

EXTRACT_ANON_NSPACES   = NO

# If the HIDE_UNDOC_MEMBERS tag is set to YES, Doxygen will hide all
# undocumented members of documented classes, files or namespaces.
# If set to NO (the default) these members will be included in the
# various overviews, but no documentation section is generated.
# This option has no effect if EXTRACT_ALL is enabled.

HIDE_UNDOC_MEMBERS     = NO

# If the HIDE_UNDOC_CLASSES tag is set to YES, Doxygen will hide all
# undocumented classes that are normally visible in the class hierarchy.
# If set to NO (the default) these classes will be included in the various
# overviews. This option has no effect if EXTRACT_ALL is enabled.

HIDE_UNDOC_CLASSES     = NO

# If the HIDE_FRIEND_COMPOUNDS tag is set to YES, Doxygen will hide all
# friend (class|struct|union) declarations.
# If set to NO (the default) these declarations will be included in the
# documentation.

HIDE_FRIEND_COMPOUNDS  = NO

# If the HIDE_IN_BODY_DOCS tag is set to YES, Doxygen will hide any
# documentation blocks found inside the body of a function.
# If set to NO (the default) these blocks will be appended to the
# function's detailed documentation block.

HIDE_IN_BODY_DOCS      = NO

# The INTERNAL_DOCS tag determines if documentation
# that is typed after a \internal command is included. If the tag is set
# to NO (the default) then the documentation will be excluded.
# Set it to YES to include the internal documentation.

INTERNAL_DOCS          = NO

# If the CASE_SENSE_NAMES tag is set to NO then Doxygen will only generate
# file names in lower-case letters. If set to YES upper-case letters are also
# allowed.
# This is useful if you have classes or files whose names only differ
# in case and if your file system supports case sensitive file names. Windows
# and Mac users are advised to set this option to NO.

CASE_SENSE_NAMES       = YES

# If the HIDE_SCOPE_NAMES tag is set to NO (the default) then Doxygen
# will show members with their full class and namespace scopes in the
# documentation. If set to YES the scope will be hidden.

HIDE_SCOPE_NAMES       = NO

# If the SHOW_INCLUDE_FILES tag is set to YES (the default) then Doxygen
# will put a list of the files that are included by a file in the documentation
# of that file.

SHOW_INCLUDE_FILES     = NO

# If the FORCE_LOCAL_INCLUDES tag is set to YES then Doxygen
# will list include files with double quotes in the documentation
# rather than with sharp brackets.

FORCE_LOCAL_INCLUDES   = NO

# If the INLINE_INFO tag is set to YES (the default) then a tag [inline]
# is inserted in the documentation for inline members.

INLINE_INFO            = YES

# If the SORT_MEMBER_DOCS tag is set to YES (the default) then doxygen
# will sort the (detailed) documentation of file and class members
# alphabetically by member name. If set to NO the members will appear in
# declaration order.

SORT_MEMBER_DOCS       = YES

# If the SORT_BRIEF_DOCS tag is set to YES then doxygen will sort the
# brief documentation of file, namespace and class members alphabetically
# by member name. If set to NO (the default) the members will appear in
# declaration order.

SORT_BRIEF_DOCS        = NO

# If the SORT_MEMBERS_CTORS_1ST tag is set to YES then doxygen
# will sort the (brief and detailed) documentation of class members so that
# constructors and destructors are listed first. If set to NO (the default)
# the constructors will appear in the respective orders defined by
# SORT_MEMBER_DOCS and SORT_BRIEF_DOCS.
# This tag will be ignored for brief docs if SORT_BRIEF_DOCS is set to NO
# and ignored for detailed docs if SORT_MEMBER_DOCS is set to NO.
SORT_MEMBERS_CTORS_1ST = NO

# If the SORT_GROUP_NAMES tag is set to YES then doxygen will sort the
# hierarchy of group names into alphabetical order. If set to NO (the default)
# the group names will appear in their defined order.

SORT_GROUP_NAMES       = NO

# If the SORT_BY_SCOPE_NAME tag is set to YES, the class list will be
# sorted by fully-qualified names, including namespaces. If set to
# NO (the default), the class list will be sorted only by class name,
# not including the namespace part.
# Note: This option is not very useful if HIDE_SCOPE_NAMES is set to YES.
# Note: This option applies only to the class list, not to the
# alphabetical list.

SORT_BY_SCOPE_NAME     = NO

# If the STRICT_PROTO_MATCHING option is enabled and doxygen fails to
# do proper type resolution of all parameters of a function it will reject a
# match between the prototype and the implementation of a member function even
# if there is only one candidate or it is obvious which candidate to choose
# by doing a simple string match. By disabling STRICT_PROTO_MATCHING doxygen
# will still accept a match between prototype and implementation in such cases.

STRICT_PROTO_MATCHING  = NO

# The GENERATE_TODOLIST tag can be used to enable (YES) or
# disable (NO) the todo list. This list is created by putting \todo
# commands in the documentation.

GENERATE_TODOLIST      = NO

# The GENERATE_TESTLIST tag can be used to enable (YES) or
# disable (NO) the test list. This list is created by putting \test
# commands in the documentation.

GENERATE_TESTLIST      = NO

# The GENERATE_BUGLIST tag can be used to enable (YES) or
# disable (NO) the bug list. This list is created by putting \bug
# commands in the documentation.

GENERATE_BUGLIST       = NO

# The GENERATE_DEPRECATEDLIST tag can be used to enable (YES) or
# disable (NO) the deprecated list. This list is created by putting
# \deprecated commands in the documentation.
GENERATE_DEPRECATEDLIST= NO

# The ENABLED_SECTIONS tag can be used to enable conditional
# documentation sections, marked by \if sectionname ... \endif.

ENABLED_SECTIONS       =

# The MAX_INITIALIZER_LINES tag determines the maximum number of lines
# the initial value of a variable or macro consists of for it to appear in
# the documentation. If the initializer consists of more lines than specified
# here it will be hidden. Use a value of 0 to hide initializers completely.
# The appearance of the initializer of individual variables and macros in the
# documentation can be controlled using \showinitializer or \hideinitializer
# command in the documentation regardless of this setting.

MAX_INITIALIZER_LINES  = 30

# Set the SHOW_USED_FILES tag to NO to disable the list of files generated
# at the bottom of the documentation of classes and structs. If set to YES the
# list will mention the files that were used to generate the documentation.

SHOW_USED_FILES        = NO

# If the sources in your project are distributed over multiple directories
# then setting the SHOW_DIRECTORIES tag to YES will show the directory hierarchy
# in the documentation. The default is NO.

SHOW_DIRECTORIES       = NO

# Set the SHOW_FILES tag to NO to disable the generation of the Files page.
# This will remove the Files entry from the Quick Index and from the
# Folder Tree View (if specified). The default is YES.

SHOW_FILES             = NO

# Set the SHOW_NAMESPACES tag to NO to disable the generation of the
# Namespaces page.
# This will remove the Namespaces entry from the Quick Index
# and from the Folder Tree View (if specified). The default is YES.

SHOW_NAMESPACES        = NO

# The FILE_VERSION_FILTER tag can be used to specify a program or script that
# doxygen should invoke to get the current version for each file (typically from
# the version control system).
# Doxygen will invoke the program by executing (via
# popen()) the command <command> <input-file>, where <command> is the value of
# the FILE_VERSION_FILTER tag, and <input-file> is the name of an input file
# provided by doxygen. Whatever the program writes to standard output
# is used as the file version. See the manual for examples.

FILE_VERSION_FILTER    =

# The LAYOUT_FILE tag can be used to specify a layout file which will be parsed
# by doxygen. The layout file controls the global structure of the generated
# output files in an output format independent way. To create the layout file
# that represents doxygen's defaults, run doxygen with the -l option.
# You can optionally specify a file name after the option, if omitted
# DoxygenLayout.xml will be used as the name of the layout file.

LAYOUT_FILE            =

#---------------------------------------------------------------------------
# configuration options related to warning and progress messages
#---------------------------------------------------------------------------

# The QUIET tag can be used to turn on/off the messages that are generated
# by doxygen. Possible values are YES and NO. If left blank NO is used.

QUIET                  = NO

# The WARNINGS tag can be used to turn on/off the warning messages that are
# generated by doxygen. Possible values are YES and NO. If left blank
# NO is used.

WARNINGS               = YES

# If WARN_IF_UNDOCUMENTED is set to YES, then doxygen will generate warnings
# for undocumented members. If EXTRACT_ALL is set to YES then this flag will
# automatically be disabled.

WARN_IF_UNDOCUMENTED   = NO

# If WARN_IF_DOC_ERROR is set to YES, doxygen will generate warnings for
# potential errors in the documentation, such as not documenting some
# parameters in a documented function, or documenting parameters that
# don't exist or using markup commands wrongly.

WARN_IF_DOC_ERROR      = YES

# The WARN_NO_PARAMDOC option can be enabled to get warnings for
# functions that are documented, but have no documentation for their parameters
# or return value.
# If set to NO (the default) doxygen will only warn about
# wrong or incomplete parameter documentation, but not about the absence of
# documentation.

WARN_NO_PARAMDOC       = NO

# The WARN_FORMAT tag determines the format of the warning messages that
# doxygen can produce. The string should contain the $file, $line, and $text
# tags, which will be replaced by the file and line number from which the
# warning originated and the warning text. Optionally the format may contain
# $version, which will be replaced by the version of the file (if it could
# be obtained via FILE_VERSION_FILTER).

WARN_FORMAT            = "$file:$line: $text"

# The WARN_LOGFILE tag can be used to specify a file to which warning
# and error messages should be written. If left blank the output is written
# to stderr.

WARN_LOGFILE           = doxyerror

#---------------------------------------------------------------------------
# configuration options related to the input files
#---------------------------------------------------------------------------

# The INPUT tag can be used to specify the files and/or directories that contain
# documented source files. You may enter file names like "myfile.cpp" or
# directories like "/usr/src/myproject". Separate the files or directories
# with spaces.

INPUT                  =

# This tag can be used to specify the character encoding of the source files
# that doxygen parses. Internally doxygen uses the UTF-8 encoding, which is
# also the default input encoding. Doxygen uses libiconv (or the iconv built
# into libc) for the transcoding. See http://www.gnu.org/software/libiconv for
# the list of possible encodings.

INPUT_ENCODING         = UTF-8

# If the value of the INPUT tag contains directories, you can use the
# FILE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp
# and *.h) to filter out the source-files in the directories.
# If left blank the following patterns are tested:
# *.c *.cc *.cxx *.cpp *.c++ *.d *.java *.ii *.ixx *.ipp *.i++ *.inl *.h *.hh
# *.hxx *.hpp *.h++ *.idl *.odl *.cs *.php *.php3 *.inc *.m *.mm *.dox *.py
# *.f90 *.f *.for *.vhd *.vhdl

FILE_PATTERNS          = *.c *.h *.py

# The RECURSIVE tag can be used to specify whether or not subdirectories
# should be searched for input files as well. Possible values are YES and NO.
# If left blank NO is used.

RECURSIVE              = NO

# The EXCLUDE tag can be used to specify files and/or directories that should be
# excluded from the INPUT source files. This way you can easily exclude a
# subdirectory from a directory tree whose root is specified with the INPUT tag.

EXCLUDE                =

# The EXCLUDE_SYMLINKS tag can be used to select whether or not files or
# directories that are symbolic links (a Unix file system feature) are excluded
# from the input.

EXCLUDE_SYMLINKS       = NO

# If the value of the INPUT tag contains directories, you can use the
# EXCLUDE_PATTERNS tag to specify one or more wildcard patterns to exclude
# certain files from those directories. Note that the wildcards are matched
# against the file with absolute path, so to exclude all test directories
# for example use the pattern */test/*

EXCLUDE_PATTERNS       = */Matlab/* */CVS/* */libpfm-2.x/* */libpfm-3.x/* \
                         */libpfm-3.y/* */libpfm4/* */perfctr-1.6.1/* */perfctr-2.3.12/* \
                         */perfctr-2.4.1/* */perfctr-2.4.5/* */perfctr-2.4.x/* \
                         */perfctr-2.6.x/* */perfctr-2.6.x.old/* */perfctr-2.7.x/* \
                         */linux-bgp.c

# The EXCLUDE_SYMBOLS tag can be used to specify one or more symbol names
# (namespaces, classes, functions, etc.) that should be excluded from the
# output. The symbol name can be a fully qualified name, a word, or if the
# wildcard * is used, a substring.
# Examples: ANamespace, AClass,
# AClass::ANamespace, ANamespace::*Test

EXCLUDE_SYMBOLS        =

# The EXAMPLE_PATH tag can be used to specify one or more files or
# directories that contain example code fragments that are included (see
# the \include command).

EXAMPLE_PATH           =

# If the value of the EXAMPLE_PATH tag contains directories, you can use the
# EXAMPLE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp
# and *.h) to filter out the source-files in the directories. If left
# blank all files are included.

EXAMPLE_PATTERNS       =

# If the EXAMPLE_RECURSIVE tag is set to YES then subdirectories will be
# searched for input files to be used with the \include or \dontinclude
# commands irrespective of the value of the RECURSIVE tag.
# Possible values are YES and NO. If left blank NO is used.

EXAMPLE_RECURSIVE      = NO

# The IMAGE_PATH tag can be used to specify one or more files or
# directories that contain images that are included in the documentation (see
# the \image command).

IMAGE_PATH             =

# The INPUT_FILTER tag can be used to specify a program that doxygen should
# invoke to filter for each input file. Doxygen will invoke the filter program
# by executing (via popen()) the command <filter> <input-file>, where <filter>
# is the value of the INPUT_FILTER tag, and <input-file> is the name of an
# input file. Doxygen will then use the output that the filter program writes
# to standard output.
# If FILTER_PATTERNS is specified, this tag will be ignored.

INPUT_FILTER           =

# The FILTER_PATTERNS tag can be used to specify filters on a per file pattern
# basis. Doxygen will compare the file name with each pattern and apply the
# filter if there is a match.
# The filters are a list of the form:
# pattern=filter (like *.cpp=my_cpp_filter). See INPUT_FILTER for further
# info on how filters are used. If FILTER_PATTERNS is empty or if
# none of the patterns match the file name, INPUT_FILTER is applied.
FILTER_PATTERNS        =

# If the FILTER_SOURCE_FILES tag is set to YES, the input filter (if set using
# INPUT_FILTER) will be used to filter the input files when producing source
# files to browse (i.e. when SOURCE_BROWSER is set to YES).

FILTER_SOURCE_FILES    = NO

# The FILTER_SOURCE_PATTERNS tag can be used to specify source filters per file
# pattern. A pattern will override the setting for FILTER_PATTERN (if any)
# and it is also possible to disable source filtering for a specific pattern
# using *.ext= (so without naming a filter). This option only has effect when
# FILTER_SOURCE_FILES is enabled.

FILTER_SOURCE_PATTERNS =

#---------------------------------------------------------------------------
# configuration options related to source browsing
#---------------------------------------------------------------------------

# If the SOURCE_BROWSER tag is set to YES then a list of source files will
# be generated. Documented entities will be cross-referenced with these sources.
# Note: To get rid of all source code in the generated output, make sure also
# VERBATIM_HEADERS is set to NO.

SOURCE_BROWSER         = NO

# Setting the INLINE_SOURCES tag to YES will include the body
# of functions and classes directly in the documentation.

INLINE_SOURCES         = NO

# Setting the STRIP_CODE_COMMENTS tag to YES (the default) will instruct
# doxygen to hide any special comment blocks from generated source code
# fragments. Normal C and C++ comments will always remain visible.

STRIP_CODE_COMMENTS    = YES

# If the REFERENCED_BY_RELATION tag is set to YES
# then for each documented function all documented
# functions referencing it will be listed.

REFERENCED_BY_RELATION = NO

# If the REFERENCES_RELATION tag is set to YES
# then for each documented function all documented entities
# called/used by that function will be listed.
REFERENCES_RELATION    = NO

# If the REFERENCES_LINK_SOURCE tag is set to YES (the default)
# and SOURCE_BROWSER tag is set to YES, then the hyperlinks from
# functions in REFERENCES_RELATION and REFERENCED_BY_RELATION lists will
# link to the source code.
# Otherwise they will link to the documentation.

REFERENCES_LINK_SOURCE = YES

# If the USE_HTAGS tag is set to YES then the references to source code
# will point to the HTML generated by the htags(1) tool instead of doxygen's
# built-in source browser. The htags tool is part of GNU's global source
# tagging system (see http://www.gnu.org/software/global/global.html). You
# will need version 4.8.6 or higher.

USE_HTAGS              = NO

# If the VERBATIM_HEADERS tag is set to YES (the default) then Doxygen
# will generate a verbatim copy of the header file for each class for
# which an include is specified. Set to NO to disable this.

VERBATIM_HEADERS       = YES

#---------------------------------------------------------------------------
# configuration options related to the alphabetical class index
#---------------------------------------------------------------------------

# If the ALPHABETICAL_INDEX tag is set to YES, an alphabetical index
# of all compounds will be generated. Enable this if the project
# contains a lot of classes, structs, unions or interfaces.

ALPHABETICAL_INDEX     = YES

# If the alphabetical index is enabled (see ALPHABETICAL_INDEX) then
# the COLS_IN_ALPHA_INDEX tag can be used to specify the number of columns
# in which this list will be split (can be a number in the range [1..20]).

COLS_IN_ALPHA_INDEX    = 5

# In case all classes in a project start with a common prefix, all
# classes will be put under the same header in the alphabetical index.
# The IGNORE_PREFIX tag can be used to specify one or more prefixes that
# should be ignored while generating the index headers.
IGNORE_PREFIX          =

#---------------------------------------------------------------------------
# configuration options related to the HTML output
#---------------------------------------------------------------------------

# If the GENERATE_HTML tag is set to YES (the default) Doxygen will
# generate HTML output.

GENERATE_HTML          = NO

# The HTML_OUTPUT tag is used to specify where the HTML docs will be put.
# If a relative path is entered the value of OUTPUT_DIRECTORY will be
# put in front of it. If left blank `html' will be used as the default path.

HTML_OUTPUT            = html

# The HTML_FILE_EXTENSION tag can be used to specify the file extension for
# each generated HTML page (for example: .htm,.php,.asp). If it is left blank
# doxygen will generate files with .html extension.

HTML_FILE_EXTENSION    = .html

# The HTML_HEADER tag can be used to specify a personal HTML header for
# each generated HTML page. If it is left blank doxygen will generate a
# standard header. Note that when using a custom header you are responsible
# for the proper inclusion of any scripts and style sheets that doxygen
# needs, which is dependent on the configuration options used.
# It is advised to generate a default header using "doxygen -w html
# header.html footer.html stylesheet.css YourConfigFile" and then modify
# that header. Note that the header is subject to change so you typically
# have to redo this when upgrading to a newer version of doxygen or when
# changing the value of configuration settings such as GENERATE_TREEVIEW!

HTML_HEADER            =

# The HTML_FOOTER tag can be used to specify a personal HTML footer for
# each generated HTML page. If it is left blank doxygen will generate a
# standard footer.

HTML_FOOTER            =

# The HTML_STYLESHEET tag can be used to specify a user-defined cascading
# style sheet that is used by each HTML page. It can be used to
# fine-tune the look of the HTML output. If the tag is left blank doxygen
# will generate a default style sheet.
# Note that doxygen will try to copy
# the style sheet file to the HTML output directory, so don't put your own
# stylesheet in the HTML output directory as well, or it will be erased!

HTML_STYLESHEET        =

# The HTML_EXTRA_FILES tag can be used to specify one or more extra images or
# other source files which should be copied to the HTML output directory. Note
# that these files will be copied to the base HTML output directory. Use the
# $relpath$ marker in the HTML_HEADER and/or HTML_FOOTER files to load these
# files. In the HTML_STYLESHEET file, use the file name only. Also note that
# the files will be copied as-is; there are no commands or markers available.

HTML_EXTRA_FILES       =

# The HTML_COLORSTYLE_HUE tag controls the color of the HTML output.
# Doxygen will adjust the colors in the stylesheet and background images
# according to this color. Hue is specified as an angle on a colorwheel,
# see http://en.wikipedia.org/wiki/Hue for more information.
# For instance the value 0 represents red, 60 is yellow, 120 is green,
# 180 is cyan, 240 is blue, 300 purple, and 360 is red again.
# The allowed range is 0 to 359.

HTML_COLORSTYLE_HUE    = 220

# The HTML_COLORSTYLE_SAT tag controls the purity (or saturation) of
# the colors in the HTML output. For a value of 0 the output will use
# grayscales only. A value of 255 will produce the most vivid colors.

HTML_COLORSTYLE_SAT    = 100

# The HTML_COLORSTYLE_GAMMA tag controls the gamma correction applied to
# the luminance component of the colors in the HTML output. Values below
# 100 gradually make the output lighter, whereas values above 100 make
# the output darker. The value divided by 100 is the actual gamma applied,
# so 80 represents a gamma of 0.8, the value 220 represents a gamma of 2.2,
# and 100 does not change the gamma.

HTML_COLORSTYLE_GAMMA  = 80

# If the HTML_TIMESTAMP tag is set to YES then the footer of each generated HTML
# page will contain the date and time when the page was generated.
# Setting this to NO can help when comparing the output of multiple runs.

HTML_TIMESTAMP         = YES

# If the HTML_ALIGN_MEMBERS tag is set to YES, the members of classes,
# files or namespaces will be aligned in HTML using tables. If set to
# NO a bullet list will be used.

HTML_ALIGN_MEMBERS     = YES

# If the HTML_DYNAMIC_SECTIONS tag is set to YES then the generated HTML
# documentation will contain sections that can be hidden and shown after the
# page has loaded. For this to work a browser that supports
# JavaScript and DHTML is required (for instance Mozilla 1.0+, Firefox,
# Netscape 6.0+, Internet Explorer 5.0+, Konqueror, or Safari).

HTML_DYNAMIC_SECTIONS  = NO

# If the GENERATE_DOCSET tag is set to YES, additional index files
# will be generated that can be used as input for Apple's Xcode 3
# integrated development environment, introduced with OSX 10.5 (Leopard).
# To create a documentation set, doxygen will generate a Makefile in the
# HTML output directory. Running make will produce the docset in that
# directory and running "make install" will install the docset in
# ~/Library/Developer/Shared/Documentation/DocSets so that Xcode will find
# it at startup.
# See http://developer.apple.com/tools/creatingdocsetswithdoxygen.html
# for more information.

GENERATE_DOCSET        = NO

# When the GENERATE_DOCSET tag is set to YES, this tag determines the name of
# the feed. A documentation feed provides an umbrella under which multiple
# documentation sets from a single provider (such as a company or product suite)
# can be grouped.

DOCSET_FEEDNAME        = "Doxygen generated docs"

# When the GENERATE_DOCSET tag is set to YES, this tag specifies a string that
# should uniquely identify the documentation set bundle. This should be a
# reverse domain-name style string, e.g. com.mycompany.MyDocSet. Doxygen
# will append .docset to the name.

DOCSET_BUNDLE_ID       = org.doxygen.Project

# The DOCSET_PUBLISHER_ID tag specifies a string that should uniquely
# identify the documentation publisher.
# This should be a reverse domain-name style
# string, e.g. com.mycompany.MyDocSet.documentation.

DOCSET_PUBLISHER_ID    = org.doxygen.Publisher

# The DOCSET_PUBLISHER_NAME tag identifies the documentation publisher.

DOCSET_PUBLISHER_NAME  = Publisher

# If the GENERATE_HTMLHELP tag is set to YES, additional index files
# will be generated that can be used as input for tools like the
# Microsoft HTML help workshop to generate a compiled HTML help file (.chm)
# of the generated HTML documentation.

GENERATE_HTMLHELP      = NO

# If the GENERATE_HTMLHELP tag is set to YES, the CHM_FILE tag can
# be used to specify the file name of the resulting .chm file. You
# can add a path in front of the file if the result should not be
# written to the html output directory.

CHM_FILE               =

# If the GENERATE_HTMLHELP tag is set to YES, the HHC_LOCATION tag can
# be used to specify the location (absolute path including file name) of
# the HTML help compiler (hhc.exe). If non-empty doxygen will try to run
# the HTML help compiler on the generated index.hhp.

HHC_LOCATION           =

# If the GENERATE_HTMLHELP tag is set to YES, the GENERATE_CHI flag
# controls if a separate .chi index file is generated (YES) or that
# it should be included in the master .chm file (NO).

GENERATE_CHI           = NO

# If the GENERATE_HTMLHELP tag is set to YES, the CHM_INDEX_ENCODING
# is used to encode HtmlHelp index (hhk), content (hhc) and project file
# content.

CHM_INDEX_ENCODING     =

# If the GENERATE_HTMLHELP tag is set to YES, the BINARY_TOC flag
# controls whether a binary table of contents is generated (YES) or a
# normal table of contents (NO) in the .chm file.

BINARY_TOC             = NO

# The TOC_EXPAND flag can be set to YES to add extra items for group members
# to the contents of the HTML help documentation and to the tree view.
TOC_EXPAND             = NO

# If the GENERATE_QHP tag is set to YES and both QHP_NAMESPACE and
# QHP_VIRTUAL_FOLDER are set, an additional index file will be generated
# that can be used as input for Qt's qhelpgenerator to generate a
# Qt Compressed Help (.qch) of the generated HTML documentation.

GENERATE_QHP           = NO

# If the QHG_LOCATION tag is specified, the QCH_FILE tag can
# be used to specify the file name of the resulting .qch file.
# The path specified is relative to the HTML output folder.

QCH_FILE               =

# The QHP_NAMESPACE tag specifies the namespace to use when generating
# Qt Help Project output. For more information please see
# http://doc.trolltech.com/qthelpproject.html#namespace

QHP_NAMESPACE          =

# The QHP_VIRTUAL_FOLDER tag specifies the virtual folder to use when
# generating Qt Help Project output. For more information please see
# http://doc.trolltech.com/qthelpproject.html#virtual-folders

QHP_VIRTUAL_FOLDER     = doc

# If QHP_CUST_FILTER_NAME is set, it specifies the name of a custom filter to
# add. For more information please see
# http://doc.trolltech.com/qthelpproject.html#custom-filters

QHP_CUST_FILTER_NAME   =

# The QHP_CUST_FILTER_ATTRS tag specifies the list of the attributes of the
# custom filter to add. For more information please see
# Qt Help Project / Custom Filters
# (http://doc.trolltech.com/qthelpproject.html#custom-filters).

QHP_CUST_FILTER_ATTRS  =

# The QHP_SECT_FILTER_ATTRS tag specifies the list of the attributes this
# project's filter section matches. See
# Qt Help Project / Filter Attributes
# (http://doc.trolltech.com/qthelpproject.html#filter-attributes).

QHP_SECT_FILTER_ATTRS  =

# If the GENERATE_QHP tag is set to YES, the QHG_LOCATION tag can
# be used to specify the location of Qt's qhelpgenerator.
# If non-empty doxygen will try to run qhelpgenerator on the generated
# .qhp file.

QHG_LOCATION           =

# If the GENERATE_ECLIPSEHELP tag is set to YES, additional index files
# will be generated, which together with the HTML files, form an Eclipse help
# plugin.
To install this plugin and make it available under the help contents # menu in Eclipse, the contents of the directory containing the HTML and XML # files needs to be copied into the plugins directory of eclipse. The name of # the directory within the plugins directory should be the same as # the ECLIPSE_DOC_ID value. After copying Eclipse needs to be restarted before # the help appears. GENERATE_ECLIPSEHELP = NO # A unique identifier for the eclipse help plugin. When installing the plugin # the directory name containing the HTML and XML files should also have # this name. ECLIPSE_DOC_ID = org.doxygen.Project # The DISABLE_INDEX tag can be used to turn on/off the condensed index at # top of each HTML page. The value NO (the default) enables the index and # the value YES disables it. DISABLE_INDEX = NO # The ENUM_VALUES_PER_LINE tag can be used to set the number of enum values # (range [0,1..20]) that doxygen will group on one line in the generated HTML # documentation. Note that a value of 0 will completely suppress the enum # values from appearing in the overview section. ENUM_VALUES_PER_LINE = 4 # The GENERATE_TREEVIEW tag is used to specify whether a tree-like index # structure should be generated to display hierarchical information. # If the tag value is set to YES, a side panel will be generated # containing a tree-like index structure (just like the one that # is generated for HTML Help). For this to work a browser that supports # JavaScript, DHTML, CSS and frames is required (i.e. any modern browser). # Windows users are probably better off using the HTML help feature. GENERATE_TREEVIEW = NO # By enabling USE_INLINE_TREES, doxygen will generate the Groups, Directories, # and Class Hierarchy pages using a tree view instead of an ordered list. USE_INLINE_TREES = NO # If the treeview is enabled (see GENERATE_TREEVIEW) then this tag can be # used to set the initial width (in pixels) of the frame in which the tree # is shown. 
TREEVIEW_WIDTH         = 250

# When the EXT_LINKS_IN_WINDOW option is set to YES doxygen will open
# links to external symbols imported via tag files in a separate window.

EXT_LINKS_IN_WINDOW    = NO

# Use this tag to change the font size of Latex formulas included
# as images in the HTML documentation. The default is 10. Note that
# when you change the font size after a successful doxygen run you need
# to manually remove any form_*.png images from the HTML output directory
# to force them to be regenerated.

FORMULA_FONTSIZE       = 10

# Use the FORMULA_TRANSPARENT tag to determine whether or not the images
# generated for formulas are transparent PNGs. Transparent PNGs are
# not supported properly for IE 6.0, but are supported on all modern browsers.
# Note that when changing this option you need to delete any form_*.png files
# in the HTML output before the changes have effect.

FORMULA_TRANSPARENT    = YES

# Enable the USE_MATHJAX option to render LaTeX formulas using MathJax
# (see http://www.mathjax.org) which uses client side Javascript for the
# rendering instead of using prerendered bitmaps. Use this if you do not
# have LaTeX installed or if you want the formulas to look prettier in the
# HTML output. When enabled you also need to install MathJax separately and
# configure the path to it using the MATHJAX_RELPATH option.

USE_MATHJAX            = NO

# When MathJax is enabled you need to specify the location relative to the
# HTML output directory using the MATHJAX_RELPATH option. The destination
# directory should contain the MathJax.js script. For instance, if the mathjax
# directory is located at the same level as the HTML output directory, then
# MATHJAX_RELPATH should be ../mathjax. The default value points to the
# mathjax.org site, so you can quickly see the result without installing
# MathJax, but it is strongly recommended to install a local copy of MathJax
# before deployment.
MATHJAX_RELPATH        = http://www.mathjax.org/mathjax

# When the SEARCHENGINE tag is enabled doxygen will generate a search box
# for the HTML output. The underlying search engine uses javascript
# and DHTML and should work on any modern browser. Note that when using
# HTML help (GENERATE_HTMLHELP), Qt help (GENERATE_QHP), or docsets
# (GENERATE_DOCSET) there is already a search function so this one should
# typically be disabled. For large projects the javascript based search engine
# can be slow, then enabling SERVER_BASED_SEARCH may provide a better solution.

SEARCHENGINE           = YES

# When the SERVER_BASED_SEARCH tag is enabled the search engine will be
# implemented using a PHP enabled web server instead of at the web client
# using Javascript. Doxygen will generate the search PHP script and index
# file to put on the web server. The advantage of the server
# based approach is that it scales better to large projects and allows
# full text search. The disadvantages are that it is more difficult to setup
# and does not have live searching capabilities.

SERVER_BASED_SEARCH    = NO

#---------------------------------------------------------------------------
# configuration options related to the LaTeX output
#---------------------------------------------------------------------------

# If the GENERATE_LATEX tag is set to YES (the default) Doxygen will
# generate Latex output.

GENERATE_LATEX         = NO

# The LATEX_OUTPUT tag is used to specify where the LaTeX docs will be put.
# If a relative path is entered the value of OUTPUT_DIRECTORY will be
# put in front of it. If left blank `latex' will be used as the default path.

LATEX_OUTPUT           = latex

# The LATEX_CMD_NAME tag can be used to specify the LaTeX command name to be
# invoked. If left blank `latex' will be used as the default command name.
# Note that when enabling USE_PDFLATEX this option is only used for
# generating bitmaps for formulas in the HTML output, but not in the
# Makefile that is written to the output directory.

LATEX_CMD_NAME         = latex

# The MAKEINDEX_CMD_NAME tag can be used to specify the command name to
# generate index for LaTeX. If left blank `makeindex' will be used as the
# default command name.

MAKEINDEX_CMD_NAME     = makeindex

# If the COMPACT_LATEX tag is set to YES Doxygen generates more compact
# LaTeX documents. This may be useful for small projects and may help to
# save some trees in general.

COMPACT_LATEX          = NO

# The PAPER_TYPE tag can be used to set the paper type that is used
# by the printer. Possible values are: a4, letter, legal and
# executive. If left blank a4wide will be used.

PAPER_TYPE             = a4

# The EXTRA_PACKAGES tag can be used to specify one or more names of LaTeX
# packages that should be included in the LaTeX output.

EXTRA_PACKAGES         =

# The LATEX_HEADER tag can be used to specify a personal LaTeX header for
# the generated latex document. The header should contain everything until
# the first chapter. If it is left blank doxygen will generate a
# standard header. Notice: only use this tag if you know what you are doing!

LATEX_HEADER           =

# The LATEX_FOOTER tag can be used to specify a personal LaTeX footer for
# the generated latex document. The footer should contain everything after
# the last chapter. If it is left blank doxygen will generate a
# standard footer. Notice: only use this tag if you know what you are doing!

LATEX_FOOTER           =

# If the PDF_HYPERLINKS tag is set to YES, the LaTeX that is generated
# is prepared for conversion to pdf (using ps2pdf). The pdf file will
# contain links (just like the HTML output) instead of page references
# This makes the output suitable for online browsing using a pdf viewer.

PDF_HYPERLINKS         = YES

# If the USE_PDFLATEX tag is set to YES, pdflatex will be used instead of
# plain latex in the generated Makefile. Set this option to YES to get a
# higher quality PDF documentation.

USE_PDFLATEX           = YES

# If the LATEX_BATCHMODE tag is set to YES, doxygen will add the \\batchmode
# command to the generated LaTeX files.
This will instruct LaTeX to keep
# running if errors occur, instead of asking the user for help.
# This option is also used when generating formulas in HTML.

LATEX_BATCHMODE        = NO

# If LATEX_HIDE_INDICES is set to YES then doxygen will not
# include the index chapters (such as File Index, Compound Index, etc.)
# in the output.

LATEX_HIDE_INDICES     = NO

# If LATEX_SOURCE_CODE is set to YES then doxygen will include
# source code with syntax highlighting in the LaTeX output.
# Note that which sources are shown also depends on other settings
# such as SOURCE_BROWSER.

LATEX_SOURCE_CODE      = NO

#---------------------------------------------------------------------------
# configuration options related to the RTF output
#---------------------------------------------------------------------------

# If the GENERATE_RTF tag is set to YES Doxygen will generate RTF output
# The RTF output is optimized for Word 97 and may not look very pretty with
# other RTF readers or editors.

GENERATE_RTF           = NO

# The RTF_OUTPUT tag is used to specify where the RTF docs will be put.
# If a relative path is entered the value of OUTPUT_DIRECTORY will be
# put in front of it. If left blank `rtf' will be used as the default path.

RTF_OUTPUT             = rtf

# If the COMPACT_RTF tag is set to YES Doxygen generates more compact
# RTF documents. This may be useful for small projects and may help to
# save some trees in general.

COMPACT_RTF            = NO

# If the RTF_HYPERLINKS tag is set to YES, the RTF that is generated
# will contain hyperlink fields. The RTF file will
# contain links (just like the HTML output) instead of page references.
# This makes the output suitable for online browsing using WORD or other
# programs which support those fields.
# Note: wordpad (write) and others do not support links.

RTF_HYPERLINKS         = NO

# Load stylesheet definitions from file. Syntax is similar to doxygen's
# config file, i.e. a series of assignments. You only have to provide
# replacements, missing definitions are set to their default value.

RTF_STYLESHEET_FILE    =

# Set optional variables used in the generation of an rtf document.
# Syntax is similar to doxygen's config file.

RTF_EXTENSIONS_FILE    =

#---------------------------------------------------------------------------
# configuration options related to the man page output
#---------------------------------------------------------------------------

# If the GENERATE_MAN tag is set to YES (the default) Doxygen will
# generate man pages

GENERATE_MAN           = NO

# The MAN_OUTPUT tag is used to specify where the man pages will be put.
# If a relative path is entered the value of OUTPUT_DIRECTORY will be
# put in front of it. If left blank `man' will be used as the default path.

MAN_OUTPUT             = man

# The MAN_EXTENSION tag determines the extension that is added to
# the generated man pages (default is the subroutine's section .3)

MAN_EXTENSION          = .3

# If the MAN_LINKS tag is set to YES and Doxygen generates man output,
# then it will generate one additional man file for each entity
# documented in the real man page(s). These additional files
# only source the real man page, but without them the man command
# would be unable to find the correct page. The default is NO.

MAN_LINKS              = NO

#---------------------------------------------------------------------------
# configuration options related to the XML output
#---------------------------------------------------------------------------

# If the GENERATE_XML tag is set to YES Doxygen will
# generate an XML file that captures the structure of
# the code including all documentation.

GENERATE_XML           = NO

# The XML_OUTPUT tag is used to specify where the XML pages will be put.
# If a relative path is entered the value of OUTPUT_DIRECTORY will be
# put in front of it. If left blank `xml' will be used as the default path.

XML_OUTPUT             = xml

# The XML_SCHEMA tag can be used to specify an XML schema,
# which can be used by a validating XML parser to check the
# syntax of the XML files.

XML_SCHEMA             =

# The XML_DTD tag can be used to specify an XML DTD,
# which can be used by a validating XML parser to check the
# syntax of the XML files.

XML_DTD                =

# If the XML_PROGRAMLISTING tag is set to YES Doxygen will
# dump the program listings (including syntax highlighting
# and cross-referencing information) to the XML output. Note that
# enabling this will significantly increase the size of the XML output.

XML_PROGRAMLISTING     = YES

#---------------------------------------------------------------------------
# configuration options for the AutoGen Definitions output
#---------------------------------------------------------------------------

# If the GENERATE_AUTOGEN_DEF tag is set to YES Doxygen will
# generate an AutoGen Definitions (see autogen.sf.net) file
# that captures the structure of the code including all
# documentation. Note that this feature is still experimental
# and incomplete at the moment.

GENERATE_AUTOGEN_DEF   = NO

#---------------------------------------------------------------------------
# configuration options related to the Perl module output
#---------------------------------------------------------------------------

# If the GENERATE_PERLMOD tag is set to YES Doxygen will
# generate a Perl module file that captures the structure of
# the code including all documentation. Note that this
# feature is still experimental and incomplete at the
# moment.

GENERATE_PERLMOD       = NO

# If the PERLMOD_LATEX tag is set to YES Doxygen will generate
# the necessary Makefile rules, Perl scripts and LaTeX code to be able
# to generate PDF and DVI output from the Perl module output.

PERLMOD_LATEX          = NO

# If the PERLMOD_PRETTY tag is set to YES the Perl module output will be
# nicely formatted so it can be parsed by a human reader.  This is useful
# if you want to understand what is going on.
# On the other hand, if this
# tag is set to NO the size of the Perl module output will be much smaller
# and Perl will parse it just the same.

PERLMOD_PRETTY         = YES

# The names of the make variables in the generated doxyrules.make file
# are prefixed with the string contained in PERLMOD_MAKEVAR_PREFIX.
# This is useful so different doxyrules.make files included by the same
# Makefile don't overwrite each other's variables.

PERLMOD_MAKEVAR_PREFIX =

#---------------------------------------------------------------------------
# Configuration options related to the preprocessor
#---------------------------------------------------------------------------

# If the ENABLE_PREPROCESSING tag is set to YES (the default) Doxygen will
# evaluate all C-preprocessor directives found in the sources and include
# files.

ENABLE_PREPROCESSING   = YES

# If the MACRO_EXPANSION tag is set to YES Doxygen will expand all macro
# names in the source code. If set to NO (the default) only conditional
# compilation will be performed. Macro expansion can be done in a controlled
# way by setting EXPAND_ONLY_PREDEF to YES.

MACRO_EXPANSION        = YES

# If the EXPAND_ONLY_PREDEF and MACRO_EXPANSION tags are both set to YES
# then the macro expansion is limited to the macros specified with the
# PREDEFINED and EXPAND_AS_DEFINED tags.

EXPAND_ONLY_PREDEF     = NO

# If the SEARCH_INCLUDES tag is set to YES (the default) the includes files
# pointed to by INCLUDE_PATH will be searched when a #include is found.

SEARCH_INCLUDES        = YES

# The INCLUDE_PATH tag can be used to specify one or more directories that
# contain include files that are not input files but should be processed by
# the preprocessor.

INCLUDE_PATH           =

# You can use the INCLUDE_FILE_PATTERNS tag to specify one or more wildcard
# patterns (like *.h and *.hpp) to filter out the header-files in the
# directories. If left blank, the patterns specified with FILE_PATTERNS will
# be used.

INCLUDE_FILE_PATTERNS  =

# The PREDEFINED tag can be used to specify one or more macro names that
# are defined before the preprocessor is started (similar to the -D option of
# gcc). The argument of the tag is a list of macros of the form: name
# or name=definition (no spaces). If the definition and the = are
# omitted =1 is assumed. To prevent a macro definition from being
# undefined via #undef or recursively expanded use the := operator
# instead of the = operator.

PREDEFINED             =

# If the MACRO_EXPANSION and EXPAND_ONLY_PREDEF tags are set to YES then
# this tag can be used to specify a list of macro names that should be expanded.
# The macro definition that is found in the sources will be used.
# Use the PREDEFINED tag if you want to use a different macro definition that
# overrules the definition found in the source code.

EXPAND_AS_DEFINED      =

# If the SKIP_FUNCTION_MACROS tag is set to YES (the default) then
# doxygen's preprocessor will remove all references to function-like macros
# that are alone on a line, have an all uppercase name, and do not end with a
# semicolon, because these will confuse the parser if not removed.

SKIP_FUNCTION_MACROS   = YES

#---------------------------------------------------------------------------
# Configuration::additions related to external references
#---------------------------------------------------------------------------

# The TAGFILES option can be used to specify one or more tagfiles.
# Optionally an initial location of the external documentation
# can be added for each tagfile. The format of a tag file without
# this location is as follows:
#
#   TAGFILES = file1 file2 ...
# Adding location for the tag files is done as follows:
#
#   TAGFILES = file1=loc1 "file2 = loc2" ...
# where "loc1" and "loc2" can be relative or absolute paths or
# URLs. If a location is present for each tag, the installdox tool
# does not have to be run to correct the links.
# Note that each tag file must have a unique name
# (where the name does NOT include the path)
# If a tag file is not located in the directory in which doxygen
# is run, you must also specify the path to the tagfile here.

TAGFILES               =

# When a file name is specified after GENERATE_TAGFILE, doxygen will create
# a tag file that is based on the input files it reads.

GENERATE_TAGFILE       =

# If the ALLEXTERNALS tag is set to YES all external classes will be listed
# in the class index. If set to NO only the inherited external classes
# will be listed.

ALLEXTERNALS           = NO

# If the EXTERNAL_GROUPS tag is set to YES all external groups will be listed
# in the modules index. If set to NO, only the current project's groups will
# be listed.

EXTERNAL_GROUPS        = YES

# The PERL_PATH should be the absolute path and name of the perl script
# interpreter (i.e. the result of `which perl').

PERL_PATH              = /usr/bin/perl

#---------------------------------------------------------------------------
# Configuration options related to the dot tool
#---------------------------------------------------------------------------

# If the CLASS_DIAGRAMS tag is set to YES (the default) Doxygen will
# generate an inheritance diagram (in HTML, RTF and LaTeX) for classes with base
# or super classes. Setting the tag to NO turns the diagrams off. Note that
# this option also works with HAVE_DOT disabled, but it is recommended to
# install and use dot, since it yields more powerful graphs.

CLASS_DIAGRAMS         = YES

# You can define message sequence charts within doxygen comments using the \msc
# command. Doxygen will then run the mscgen tool (see
# http://www.mcternan.me.uk/mscgen/) to produce the chart and insert it in the
# documentation. The MSCGEN_PATH tag allows you to specify the directory where
# the mscgen tool resides. If left empty the tool is assumed to be found in the
# default search path.
MSCGEN_PATH            =

# If set to YES, the inheritance and collaboration graphs will hide
# inheritance and usage relations if the target is undocumented
# or is not a class.

HIDE_UNDOC_RELATIONS   = YES

# If you set the HAVE_DOT tag to YES then doxygen will assume the dot tool is
# available from the path. This tool is part of Graphviz, a graph visualization
# toolkit from AT&T and Lucent Bell Labs. The other options in this section
# have no effect if this option is set to NO (the default)

HAVE_DOT               = YES

# The DOT_NUM_THREADS specifies the number of dot invocations doxygen is
# allowed to run in parallel. When set to 0 (the default) doxygen will
# base this on the number of processors available in the system. You can set it
# explicitly to a value larger than 0 to get control over the balance
# between CPU load and processing speed.

DOT_NUM_THREADS        = 0

# By default doxygen will write a font called Helvetica to the output
# directory and reference it in all dot files that doxygen generates.
# When you want a differently looking font you can specify the font name
# using DOT_FONTNAME. You need to make sure dot is able to find the font,
# which can be done by putting it in a standard location or by setting the
# DOTFONTPATH environment variable or by setting DOT_FONTPATH to the directory
# containing the font.

DOT_FONTNAME           = Helvetica

# The DOT_FONTSIZE tag can be used to set the size of the font of dot graphs.
# The default size is 10pt.

DOT_FONTSIZE           = 10

# By default doxygen will tell dot to use the output directory to look for the
# FreeSans.ttf font (which doxygen will put there itself). If you specify a
# different font using DOT_FONTNAME you can set the path where dot
# can find it using this tag.

DOT_FONTPATH           =

# If the CLASS_GRAPH and HAVE_DOT tags are set to YES then doxygen
# will generate a graph for each documented class showing the direct and
# indirect inheritance relations. Setting this tag to YES will force
# the CLASS_DIAGRAMS tag to NO.

CLASS_GRAPH            = YES

# If the COLLABORATION_GRAPH and HAVE_DOT tags are set to YES then doxygen
# will generate a graph for each documented class showing the direct and
# indirect implementation dependencies (inheritance, containment, and
# class references variables) of the class with other documented classes.

COLLABORATION_GRAPH    = YES

# If the GROUP_GRAPHS and HAVE_DOT tags are set to YES then doxygen
# will generate a graph for groups, showing the direct groups dependencies

GROUP_GRAPHS           = YES

# If the UML_LOOK tag is set to YES doxygen will generate inheritance and
# collaboration diagrams in a style similar to the OMG's Unified Modeling
# Language.

UML_LOOK               = NO

# If set to YES, the inheritance and collaboration graphs will show the
# relations between templates and their instances.

TEMPLATE_RELATIONS     = NO

# If the ENABLE_PREPROCESSING, SEARCH_INCLUDES, INCLUDE_GRAPH, and HAVE_DOT
# tags are set to YES then doxygen will generate a graph for each documented
# file showing the direct and indirect include dependencies of the file with
# other documented files.

INCLUDE_GRAPH          = YES

# If the ENABLE_PREPROCESSING, SEARCH_INCLUDES, INCLUDED_BY_GRAPH, and
# HAVE_DOT tags are set to YES then doxygen will generate a graph for each
# documented header file showing the documented files that directly or
# indirectly include this file.

INCLUDED_BY_GRAPH      = YES

# If the CALL_GRAPH and HAVE_DOT options are set to YES then
# doxygen will generate a call dependency graph for every global function
# or class method. Note that enabling this option will significantly increase
# the time of a run. So in most cases it will be better to enable call graphs
# for selected functions only using the \callgraph command.

CALL_GRAPH             = NO

# If the CALLER_GRAPH and HAVE_DOT tags are set to YES then
# doxygen will generate a caller dependency graph for every global function
# or class method. Note that enabling this option will significantly increase
# the time of a run. So in most cases it will be better to enable caller
# graphs for selected functions only using the \callergraph command.

CALLER_GRAPH           = NO

# If the GRAPHICAL_HIERARCHY and HAVE_DOT tags are set to YES then doxygen
# will generate a graphical hierarchy of all classes instead of a textual one.

GRAPHICAL_HIERARCHY    = YES

# If the DIRECTORY_GRAPH, SHOW_DIRECTORIES and HAVE_DOT tags are set to YES
# then doxygen will show the dependencies a directory has on other directories
# in a graphical way. The dependency relations are determined by the #include
# relations between the files in the directories.

DIRECTORY_GRAPH        = YES

# The DOT_IMAGE_FORMAT tag can be used to set the image format of the images
# generated by dot. Possible values are svg, png, jpg, or gif.
# If left blank png will be used.

DOT_IMAGE_FORMAT       = png

# The tag DOT_PATH can be used to specify the path where the dot tool can be
# found. If left blank, it is assumed the dot tool can be found in the path.

DOT_PATH               =

# The DOTFILE_DIRS tag can be used to specify one or more directories that
# contain dot files that are included in the documentation (see the
# \dotfile command).

DOTFILE_DIRS           =

# The MSCFILE_DIRS tag can be used to specify one or more directories that
# contain msc files that are included in the documentation (see the
# \mscfile command).

MSCFILE_DIRS           =

# The DOT_GRAPH_MAX_NODES tag can be used to set the maximum number of
# nodes that will be shown in the graph. If the number of nodes in a graph
# becomes larger than this value, doxygen will truncate the graph, which is
# visualized by representing a node as a red box. Note that if the
# number of direct children of the root node in a graph is already larger than
# DOT_GRAPH_MAX_NODES then the graph will not be shown at all. Also note
# that the size of a graph can be further restricted by MAX_DOT_GRAPH_DEPTH.

DOT_GRAPH_MAX_NODES    = 50

# The MAX_DOT_GRAPH_DEPTH tag can be used to set the maximum depth of the
# graphs generated by dot. A depth value of 3 means that only nodes reachable
# from the root by following a path via at most 3 edges will be shown. Nodes
# that lay further from the root node will be omitted. Note that setting this
# option to 1 or 2 may greatly reduce the computation time needed for large
# code bases. Also note that the size of a graph can be further restricted by
# DOT_GRAPH_MAX_NODES. Using a depth of 0 means no depth restriction.

MAX_DOT_GRAPH_DEPTH    = 0

# Set the DOT_TRANSPARENT tag to YES to generate images with a transparent
# background. This is disabled by default, because dot on Windows does not
# seem to support this out of the box. Warning: Depending on the platform used,
# enabling this option may lead to badly anti-aliased labels on the edges of
# a graph (i.e. they become hard to read).

DOT_TRANSPARENT        = NO

# Set the DOT_MULTI_TARGETS tag to YES allow dot to generate multiple output
# files in one run (i.e. multiple -o and -T options on the command line). This
# makes dot run faster, but since only newer versions of dot (>1.8.10)
# support this, this feature is disabled by default.

DOT_MULTI_TARGETS      = NO

# If the GENERATE_LEGEND tag is set to YES (the default) Doxygen will
# generate a legend page explaining the meaning of the various boxes and
# arrows in the dot generated graphs.

GENERATE_LEGEND        = NO

# If the DOT_CLEANUP tag is set to YES (the default) Doxygen will
# remove the intermediate dot files that are used to generate
# the various graphs.

DOT_CLEANUP            = YES

# By default Python docstrings are displayed as preformatted text
# and Doxygen's special commands cannot be used. By setting PYTHON_DOCSTRING
# to NO the Doxygen's special commands can be used and the contents of the
# docstring documentation blocks is shown as Doxygen documentation.
PYTHON_DOCSTRING       = NO

papi-papi-7-2-0-t/doc/Doxyfile-html

# Doxyfile 1.6.2

# This file describes the settings to be used by the documentation system
# doxygen (www.doxygen.org) for a project
#
# All text after a hash (#) is considered a comment and will be ignored
# The format is:
#       TAG = value [value, ...]
# For lists items can also be appended using:
#       TAG += value [value, ...]
# Values that contain spaces should be placed between quotes (" ")

@INCLUDE               = Doxyfile-common

#---------------------------------------------------------------------------
# Configuration options related to the preprocessor
#---------------------------------------------------------------------------

# If the ENABLE_PREPROCESSING tag is set to YES (the default) Doxygen will
# evaluate all C-preprocessor directives found in the sources and include
# files.

ENABLE_PREPROCESSING   = YES

# If the MACRO_EXPANSION tag is set to YES Doxygen will expand all macro
# names in the source code. If set to NO (the default) only conditional
# compilation will be performed. Macro expansion can be done in a controlled
# way by setting EXPAND_ONLY_PREDEF to YES.

MACRO_EXPANSION        = YES

# If the EXPAND_ONLY_PREDEF and MACRO_EXPANSION tags are both set to YES
# then the macro expansion is limited to the macros specified with the
# PREDEFINED and EXPAND_AS_DEFINED tags.

EXPAND_ONLY_PREDEF     = YES

# The PREDEFINED tag can be used to specify one or more macro names that
# are defined before the preprocessor is started (similar to the -D option of
# gcc). The argument of the tag is a list of macros of the form: name
# or name=definition (no spaces). If the definition and the = are
# omitted =1 is assumed. To prevent a macro definition from being
# undefined via #undef or recursively expanded use the := operator
# instead of the = operator.
PREDEFINED = DEBUG # If the MACRO_EXPANSION and EXPAND_ONLY_PREDEF tags are set to YES then # this tag can be used to specify a list of macro names that should be expanded. # The macro definition that is found in the sources will be used. # Use the PREDEFINED tag if you want to use a different macro definition that # overrules the definition found in the source code. EXPAND_AS_DEFINED = PAPIERROR LEAKDBG MEMDBG MPXDBG OVFDBG PAPIDEBUG SUBDBG PRFDBG INTDBG THRDBG APIDBG #--------------------------------------------------------------------------- # Build related configuration options #--------------------------------------------------------------------------- # If the CREATE_SUBDIRS tag is set to YES, then doxygen will create # 4096 sub-directories (in 2 levels) under the output directory of each output # format and will distribute the generated files over these directories. # Enabling this option can be useful when feeding doxygen a huge amount of # source files, where putting all generated files in the same directory would # otherwise cause performance problems for the file system. CREATE_SUBDIRS = YES # If the EXTRACT_ALL tag is set to YES doxygen will assume all entities in # documentation are documented, even if no documentation was available. # Private class members and static file members will be hidden unless # the EXTRACT_PRIVATE and EXTRACT_STATIC tags are set to YES EXTRACT_ALL = YES # If the EXTRACT_STATIC tag is set to YES all static members of a file # will be included in the documentation. EXTRACT_STATIC = YES # The INTERNAL_DOCS tag determines if documentation # that is typed after a \internal command is included. If the tag is set # to NO (the default) then the documentation will be excluded. # Set it to YES to include the internal documentation. INTERNAL_DOCS = YES # If the CASE_SENSE_NAMES tag is set to NO then Doxygen will only generate # file names in lower-case letters. If set to YES upper-case letters are also # allowed. 
This is useful if you have classes or files whose names only differ # in case and if your file system supports case sensitive file names. Windows # and Mac users are advised to set this option to NO. CASE_SENSE_NAMES = YES # The GENERATE_TODOLIST tag can be used to enable (YES) or # disable (NO) the todo list. This list is created by putting \todo # commands in the documentation. GENERATE_TODOLIST = YES # The GENERATE_TESTLIST tag can be used to enable (YES) or # disable (NO) the test list. This list is created by putting \test # commands in the documentation. GENERATE_TESTLIST = YES # The GENERATE_BUGLIST tag can be used to enable (YES) or # disable (NO) the bug list. This list is created by putting \bug # commands in the documentation. GENERATE_BUGLIST = YES # The GENERATE_DEPRECATEDLIST tag can be used to enable (YES) or # disable (NO) the deprecated list. This list is created by putting # \deprecated commands in the documentation. GENERATE_DEPRECATEDLIST= YES # Set the SHOW_USED_FILES tag to NO to disable the list of files generated # at the bottom of the documentation of classes and structs. If set to YES the # list will mention the files that were used to generate the documentation. SHOW_USED_FILES = YES # If the sources in your project are distributed over multiple directories # then setting the SHOW_DIRECTORIES tag to YES will show the directory hierarchy # in the documentation. The default is NO. SHOW_DIRECTORIES = NO # Set the SHOW_FILES tag to NO to disable the generation of the Files page. # This will remove the Files entry from the Quick Index and from the # Folder Tree View (if specified). The default is YES. SHOW_FILES = YES # Set the SHOW_NAMESPACES tag to NO to disable the generation of the # Namespaces page. # This will remove the Namespaces entry from the Quick Index # and from the Folder Tree View (if specified). The default is YES. 
SHOW_NAMESPACES        = YES

#---------------------------------------------------------------------------
# configuration options related to the input files
#---------------------------------------------------------------------------

# The INPUT tag can be used to specify the files and/or directories that contain
# documented source files. You may enter file names like "myfile.cpp" or
# directories like "/usr/src/myproject". Separate the files or directories
# with spaces.

INPUT                  = ../src ../src/components/README

# The RECURSIVE tag can be used to specify whether or not subdirectories
# should be searched for input files as well. Possible values are YES and NO.
# If left blank NO is used.

RECURSIVE              = YES

#---------------------------------------------------------------------------
# configuration options related to source browsing
#---------------------------------------------------------------------------

# If the SOURCE_BROWSER tag is set to YES then a list of source files will
# be generated. Documented entities will be cross-referenced with these sources.
# Note: To get rid of all source code in the generated output, make sure also
# VERBATIM_HEADERS is set to NO.

SOURCE_BROWSER         = YES

# Setting the INLINE_SOURCES tag to YES will include the body
# of functions and classes directly in the documentation.

INLINE_SOURCES         = YES

# Setting the STRIP_CODE_COMMENTS tag to YES (the default) will instruct
# doxygen to hide any special comment blocks from generated source code
# fragments. Normal C and C++ comments will always remain visible.

STRIP_CODE_COMMENTS    = YES

# If the REFERENCED_BY_RELATION tag is set to YES
# then for each documented function all documented
# functions referencing it will be listed.

REFERENCED_BY_RELATION = NO

# If the REFERENCES_RELATION tag is set to YES
# then for each documented function all documented entities
# called/used by that function will be listed.
REFERENCES_RELATION    = NO

# If the REFERENCES_LINK_SOURCE tag is set to YES (the default)
# and SOURCE_BROWSER tag is set to YES, then the hyperlinks from
# functions in REFERENCES_RELATION and REFERENCED_BY_RELATION lists will
# link to the source code. Otherwise they will link to the documentation.

REFERENCES_LINK_SOURCE = YES

#---------------------------------------------------------------------------
# configuration options related to the HTML output
#---------------------------------------------------------------------------

# If the GENERATE_HTML tag is set to YES (the default) Doxygen will
# generate HTML output.

GENERATE_HTML          = YES

# The HTML_OUTPUT tag is used to specify where the HTML docs will be put.
# If a relative path is entered the value of OUTPUT_DIRECTORY will be
# put in front of it. If left blank `html' will be used as the default path.

HTML_OUTPUT            = html

# The HTML_FILE_EXTENSION tag can be used to specify the file extension for
# each generated HTML page (for example: .htm,.php,.asp). If it is left blank
# doxygen will generate files with .html extension.

HTML_FILE_EXTENSION    = .html

# The HTML_HEADER tag can be used to specify a personal HTML header for
# each generated HTML page. If it is left blank doxygen will generate a
# standard header.

HTML_HEADER            =

# The HTML_FOOTER tag can be used to specify a personal HTML footer for
# each generated HTML page. If it is left blank doxygen will generate a
# standard footer.

HTML_FOOTER            =

# The HTML_STYLESHEET tag can be used to specify a user-defined cascading
# style sheet that is used by each HTML page. It can be used to
# fine-tune the look of the HTML output. If the tag is left blank doxygen
# will generate a default style sheet. Note that doxygen will try to copy
# the style sheet file to the HTML output directory, so don't put your own
# stylesheet in the HTML output directory as well, or it will be erased!
HTML_STYLESHEET        =

# If the HTML_TIMESTAMP tag is set to YES then the footer of each generated HTML
# page will contain the date and time when the page was generated. Setting
# this to NO can help when comparing the output of multiple runs.

HTML_TIMESTAMP         = YES

# If the HTML_ALIGN_MEMBERS tag is set to YES, the members of classes,
# files or namespaces will be aligned in HTML using tables. If set to
# NO a bullet list will be used.

HTML_ALIGN_MEMBERS     = YES

# If the HTML_DYNAMIC_SECTIONS tag is set to YES then the generated HTML
# documentation will contain sections that can be hidden and shown after the
# page has loaded. For this to work a browser that supports
# JavaScript and DHTML is required (for instance Mozilla 1.0+, Firefox
# Netscape 6.0+, Internet explorer 5.0+, Konqueror, or Safari).

HTML_DYNAMIC_SECTIONS  = NO

# This tag can be used to set the number of enum values (range [1..20])
# that doxygen will group on one line in the generated HTML documentation.

ENUM_VALUES_PER_LINE   = 4

# The GENERATE_TREEVIEW tag is used to specify whether a tree-like index
# structure should be generated to display hierarchical information.
# If the tag value is set to YES, a side panel will be generated
# containing a tree-like index structure (just like the one that
# is generated for HTML Help). For this to work a browser that supports
# JavaScript, DHTML, CSS and frames is required (i.e. any modern browser).
# Windows users are probably better off using the HTML help feature.

GENERATE_TREEVIEW      = YES

# By enabling USE_INLINE_TREES, doxygen will generate the Groups, Directories,
# and Class Hierarchy pages using a tree view instead of an ordered list.

USE_INLINE_TREES       = NO

# If the treeview is enabled (see GENERATE_TREEVIEW) then this tag can be
# used to set the initial width (in pixels) of the frame in which the tree
# is shown.

TREEVIEW_WIDTH         = 250

# Use this tag to change the font size of Latex formulas included
# as images in the HTML documentation. The default is 10.
# Note that when you change the font size after a successful doxygen run you
# need to manually remove any form_*.png images from the HTML output directory
# to force them to be regenerated.

FORMULA_FONTSIZE       = 10

# When the SEARCHENGINE tag is enabled doxygen will generate a search box
# for the HTML output. The underlying search engine uses javascript
# and DHTML and should work on any modern browser. Note that when using
# HTML help (GENERATE_HTMLHELP), Qt help (GENERATE_QHP), or docsets
# (GENERATE_DOCSET) there is already a search function so this one should
# typically be disabled. For large projects the javascript based search engine
# can be slow, then enabling SERVER_BASED_SEARCH may provide a better solution.

SEARCHENGINE           = YES

# When the SERVER_BASED_SEARCH tag is enabled the search engine will be
# implemented using a PHP enabled web server instead of at the web client
# using Javascript. Doxygen will generate the search PHP script and index
# file to put on the web server. The advantage of the server based approach
# is that it scales better to large projects and allows full text search.
# The disadvantage is that it is more difficult to set up
# and does not have live searching capabilities.

SERVER_BASED_SEARCH    = NO

#---------------------------------------------------------------------------
# Configuration options related to the dot tool
#---------------------------------------------------------------------------

# If the CLASS_DIAGRAMS tag is set to YES (the default) Doxygen will
# generate an inheritance diagram (in HTML, RTF and LaTeX) for classes with base
# or super classes. Setting the tag to NO turns the diagrams off. Note that
# this option is superseded by the HAVE_DOT option below. This is only a
# fallback. It is recommended to install and use dot, since it yields more
# powerful graphs.
CLASS_DIAGRAMS         = YES

# If the CLASS_GRAPH and HAVE_DOT tags are set to YES then doxygen
# will generate a graph for each documented class showing the direct and
# indirect inheritance relations. Setting this tag to YES will force
# the CLASS_DIAGRAMS tag to NO.

CLASS_GRAPH            = YES

# If the COLLABORATION_GRAPH and HAVE_DOT tags are set to YES then doxygen
# will generate a graph for each documented class showing the direct and
# indirect implementation dependencies (inheritance, containment, and
# class references variables) of the class with other documented classes.

COLLABORATION_GRAPH    = YES

# If the GROUP_GRAPHS and HAVE_DOT tags are set to YES then doxygen
# will generate a graph for groups, showing the direct groups dependencies

GROUP_GRAPHS           = YES

# If the UML_LOOK tag is set to YES doxygen will generate inheritance and
# collaboration diagrams in a style similar to the OMG's Unified Modeling
# Language.

UML_LOOK               = NO

# If set to YES, the inheritance and collaboration graphs will show the
# relations between templates and their instances.

TEMPLATE_RELATIONS     = NO

# If the ENABLE_PREPROCESSING, SEARCH_INCLUDES, INCLUDE_GRAPH, and HAVE_DOT
# tags are set to YES then doxygen will generate a graph for each documented
# file showing the direct and indirect include dependencies of the file with
# other documented files.

INCLUDE_GRAPH          = YES

# If the ENABLE_PREPROCESSING, SEARCH_INCLUDES, INCLUDED_BY_GRAPH, and
# HAVE_DOT tags are set to YES then doxygen will generate a graph for each
# documented header file showing the documented files that directly or
# indirectly include this file.

INCLUDED_BY_GRAPH      = YES

# If the CALL_GRAPH and HAVE_DOT options are set to YES then
# doxygen will generate a call dependency graph for every global function
# or class method. Note that enabling this option will significantly increase
# the time of a run. So in most cases it will be better to enable call graphs
# for selected functions only using the \callgraph command.
CALL_GRAPH             = YES

# If the CALLER_GRAPH and HAVE_DOT tags are set to YES then
# doxygen will generate a caller dependency graph for every global function
# or class method. Note that enabling this option will significantly increase
# the time of a run. So in most cases it will be better to enable caller
# graphs for selected functions only using the \callergraph command.

CALLER_GRAPH           = YES

# If the GRAPHICAL_HIERARCHY and HAVE_DOT tags are set to YES then doxygen
# will show a graphical hierarchy of all classes instead of a textual one.

GRAPHICAL_HIERARCHY    = YES

# If the DIRECTORY_GRAPH, SHOW_DIRECTORIES and HAVE_DOT tags are set to YES
# then doxygen will show the dependencies a directory has on other directories
# in a graphical way. The dependency relations are determined by the #include
# relations between the files in the directories.

DIRECTORY_GRAPH        = NO

# The DOT_IMAGE_FORMAT tag can be used to set the image format of the images
# generated by dot. Possible values are png, jpg, or gif.
# If left blank png will be used.

DOT_IMAGE_FORMAT       = png

# The tag DOT_PATH can be used to specify the path where the dot tool can be
# found. If left blank, it is assumed the dot tool can be found in the path.

DOT_PATH               =

# The DOTFILE_DIRS tag can be used to specify one or more directories that
# contain dot files that are included in the documentation (see the
# \dotfile command).

DOTFILE_DIRS           =

# The DOT_GRAPH_MAX_NODES tag can be used to set the maximum number of
# nodes that will be shown in the graph. If the number of nodes in a graph
# becomes larger than this value, doxygen will truncate the graph, which is
# visualized by representing a node as a red box. Note that if the
# number of direct children of the root node in a graph is already larger than
# DOT_GRAPH_MAX_NODES then the graph will not be shown at all. Also note
# that the size of a graph can be further restricted by MAX_DOT_GRAPH_DEPTH.
DOT_GRAPH_MAX_NODES    = 50

# The MAX_DOT_GRAPH_DEPTH tag can be used to set the maximum depth of the
# graphs generated by dot. A depth value of 3 means that only nodes reachable
# from the root by following a path via at most 3 edges will be shown. Nodes
# that lay further from the root node will be omitted. Note that setting this
# option to 1 or 2 may greatly reduce the computation time needed for large
# code bases. Also note that the size of a graph can be further restricted by
# DOT_GRAPH_MAX_NODES. Using a depth of 0 means no depth restriction.

MAX_DOT_GRAPH_DEPTH    = 0

# Set the DOT_TRANSPARENT tag to YES to generate images with a transparent
# background. This is disabled by default, because dot on Windows does not
# seem to support this out of the box. Warning: Depending on the platform used,
# enabling this option may lead to badly anti-aliased labels on the edges of
# a graph (i.e. they become hard to read).

DOT_TRANSPARENT        = NO

# Set the DOT_MULTI_TARGETS tag to YES allow dot to generate multiple output
# files in one run (i.e. multiple -o and -T options on the command line). This
# makes dot run faster, but since only newer versions of dot (>1.8.10)
# support this, this feature is disabled by default.

DOT_MULTI_TARGETS      = NO

# If the GENERATE_LEGEND tag is set to YES (the default) Doxygen will
# generate a legend page explaining the meaning of the various boxes and
# arrows in the dot generated graphs.

GENERATE_LEGEND        = YES

# If the DOT_CLEANUP tag is set to YES (the default) Doxygen will
# remove the intermediate dot files that are used to generate
# the various graphs.
DOT_CLEANUP            = YES

papi-papi-7-2-0-t/doc/Doxyfile-man1

# This file describes the settings to be used by the documentation system
# doxygen (www.doxygen.org) for PAPI utilities man-pages
# The following overrides default values in Doxyfile-common
#
# All text after a hash (#) is considered a comment and will be ignored
# The format is:
#       TAG = value [value, ...]
# For lists items can also be appended using:
#       TAG += value [value, ...]
# Values that contain spaces should be placed between quotes (" ")

@INCLUDE               = Doxyfile-common

#---------------------------------------------------------------------------
# configuration options related to the input files
#---------------------------------------------------------------------------

# The INPUT tag can be used to specify the files and/or directories that contain
# documented source files. You may enter file names like "myfile.cpp" or
# directories like "/usr/src/myproject". Separate the files or directories
# with spaces.

INPUT                  = ../src/utils/

FILE_PATTERNS          = *.c

# The RECURSIVE tag can be used to specify whether or not subdirectories
# should be searched for input files as well. Possible values are YES and NO.
# If left blank NO is used.

RECURSIVE              = YES

# If the EXTRACT_LOCAL_CLASSES tag is set to YES classes (and structs)
# defined locally in source files will be included in the documentation.
# If set to NO only classes defined in header files are included.

EXTRACT_LOCAL_CLASSES  = NO

#---------------------------------------------------------------------------
# configuration options related to the man page output
#---------------------------------------------------------------------------

# If the GENERATE_MAN tag is set to YES (the default) Doxygen will
# generate man pages

GENERATE_MAN           = YES

# The MAN_OUTPUT tag is used to specify where the man pages will be put.
# If a relative path is entered the value of OUTPUT_DIRECTORY will be
# put in front of it. If left blank `man' will be used as the default path.

MAN_OUTPUT             = man

# The MAN_EXTENSION tag determines the extension that is added to
# the generated man pages (default is the subroutine's section .3)

MAN_EXTENSION          = .1

# If the MAN_LINKS tag is set to YES and Doxygen generates man output,
# then it will generate one additional man file for each entity
# documented in the real man page(s). These additional files
# only source the real man page, but without them the man command
# would be unable to find the correct page. The default is NO.

MAN_LINKS              = NO

papi-papi-7-2-0-t/doc/Doxyfile-man3

# This file describes the settings to be used by the documentation system
# doxygen (www.doxygen.org) for PAPI utilities man-pages
# The following overrides default values in Doxyfile-common
#
# All text after a hash (#) is considered a comment and will be ignored
# The format is:
#       TAG = value [value, ...]
# For lists items can also be appended using:
#       TAG += value [value, ...]
# Values that contain spaces should be placed between quotes (" ")

@INCLUDE               = Doxyfile-common

#---------------------------------------------------------------------------
# configuration options related to the input files
#---------------------------------------------------------------------------

# The INPUT tag can be used to specify the files and/or directories that contain
# documented source files. You may enter file names like "myfile.cpp" or
# directories like "/usr/src/myproject". Separate the files or directories
# with spaces.

INPUT                  = ../src/papi.h ../src/papi.c ../src/high-level/papi_hl.c \
                         ../src/papi_fwrappers.c ../src/high-level/scripts/papi_hl_output_writer.py

FILE_PATTERNS          = *.c *.h *.py

# The RECURSIVE tag can be used to specify whether or not subdirectories
# should be searched for input files as well.
# Possible values are YES and NO. If left blank NO is used.

RECURSIVE              = NO

#---------------------------------------------------------------------------
# configuration options related to the man page output
#---------------------------------------------------------------------------

# If the GENERATE_MAN tag is set to YES (the default) Doxygen will
# generate man pages

GENERATE_MAN           = YES

# The MAN_OUTPUT tag is used to specify where the man pages will be put.
# If a relative path is entered the value of OUTPUT_DIRECTORY will be
# put in front of it. If left blank `man' will be used as the default path.

MAN_OUTPUT             = man

# The MAN_EXTENSION tag determines the extension that is added to
# the generated man pages (default is the subroutine's section .3)

MAN_EXTENSION          = .3

# If the MAN_LINKS tag is set to YES and Doxygen generates man output,
# then it will generate one additional man file for each entity
# documented in the real man page(s). These additional files
# only source the real man page, but without them the man command
# would be unable to find the correct page. The default is NO.
MAN_LINKS              = NO

papi-papi-7-2-0-t/doc/Makefile

.PHONY: clean clobber distclean install force_me all

all: man
	@echo "Built PAPI user documentation"

html: force_me
	doxygen Doxyfile-html

man: man/man1 man/man3

man/man3: ../src/papi.h ../src/papi.c ../src/high-level/papi_hl.c ../src/papi_fwrappers.c
	doxygen Doxyfile-man3

man/man1: ../src/utils/papi_avail.c ../src/utils/papi_clockres.c ../src/utils/papi_command_line.c ../src/utils/papi_component_avail.c ../src/utils/papi_cost.c ../src/utils/papi_decode.c ../src/utils/papi_error_codes.c ../src/utils/papi_event_chooser.c ../src/utils/papi_xml_event_info.c ../src/utils/papi_mem_info.c ../src/utils/papi_multiplex_cost.c ../src/utils/papi_native_avail.c ../src/utils/papi_version.c ../src/utils/papi_hardware_avail.c
	doxygen Doxyfile-man1

clean:
	rm -rf man html doxyerror

distclean clobber: clean

install: man
	-rm -f man/man3/HighLevelInfo.3
	-rm -f man/man3/papi_data_structures.3
	-rm -r ../man/man1/*.1 ../man/man3/*.3
	-cp -R man/man1/*.1 ../man/man1
	-cp -R man/man3/*.3 ../man/man3

papi-papi-7-2-0-t/doc/PAPI-C.html

Component PAPI Technology Pre-Release

Component PAPI Technology Pre-Release (PAPI 3.9.0)

Introduction

Component PAPI, or PAPI-C is an attempt to make the PAPI performance monitoring programming interface available for more than just the hardware performance counters found on the cpu. Performance counters are finding their way into a number of other components of High Performance computing systems, such as network or memory controllers, power or temperature monitors or even specialized processing units that may find their way into future multicore processor implementations.

The primary technical challenge for PAPI is to sever the very tight coupling between the hardware independent layer of PAPI code and the hardware specific code necessary to interface with the counters, and to do this without sacrificing performance. Secondarily, once these two code layers have been functionally separated, the hardware independent, or Framework layer must be modified to simultaneously support multiple hardware dependent substrate layers, or Components.

These changes cannot be accomplished without some modification of the PAPI user interface. We have tried to keep these modifications minimal and transparent, and have been successful at preserving most backward compatibility for applications and tools that just want access to the cpu counters. We have introduced a small number of new APIs and functionality to support the new abstractions of multiple components. We have also modified the function of some APIs and data structures to support a multi-Component landscape. These changes have been tabulated at the bottom of this document, and are discussed below.

This release of PAPI-C is a technology pre-release, implemented and tested on a small number of platforms and components. The platforms include Intel Pentium III, Pentium 4, Core2Duo, Itanium (I and II) and AMD Opteron. Work is underway to port the other platforms currently supported by PAPI. Components (beside the cpu component) currently available include an ACPI component for monitoring temperature where available; a Myrinet MX component; and a 'toy' component that monitors network traffic as reported by the linux/unix /sbin/ifconfig utility.

API and Abstraction Changes

EventSets

One of the key organizing data structures in PAPI is the EventSet. This serves as a repository for all the events and settings necessary to define a counting regime. EventSets are created, modified, added to, deleted from, and disposed of over the life of a PAPI counting session. In traditional PAPI, multiple EventSets can exist simultaneously, but only one can be active at any time. PAPI-C extends the concept of an EventSet by binding it to a specific numbered Component. This component index then signals which component the EventSet is paired with. Multiple EventSets can be defined and active simultaneously, but only one EventSet per Component can be enabled. We have adopted a late-binding model for associating an EventSet with a Component. No changes are needed in the API call for creating an EventSet, and the Set is bound to a Component when the first event is added. Any additional events must then belong to the same Component. Occasionally it is desirable to modify settings in an EventSet before an event is added. In this case, a new API, PAPI_assign_eventset_component(), has been introduced to make this binding explicit.
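The two binding styles described in this paragraph can be sketched in C. This is illustrative only: it assumes PAPI is installed, that a second component exists at index 1, and that it exposes a native event named "ACPI_TEMP1" — both the index and the event name are assumptions, not guaranteed values, and error checking is elided for brevity.

```c
#include <stdio.h>
#include <stdlib.h>
#include "papi.h"

int main(void)
{
    int cpu_set = PAPI_NULL, acpi_set = PAPI_NULL, code;
    long long cpu_val[1], acpi_val[1];

    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
        exit(1);

    /* Late binding: adding PAPI_TOT_CYC binds this set to component 0 (cpu). */
    PAPI_create_eventset(&cpu_set);
    PAPI_add_event(cpu_set, PAPI_TOT_CYC);

    /* Explicit binding: bind to component 1 before any event is added,
       which is required if options must be set on an empty EventSet. */
    PAPI_create_eventset(&acpi_set);
    PAPI_assign_eventset_component(acpi_set, 1);
    PAPI_event_name_to_code("ACPI_TEMP1", &code);   /* illustrative name */
    PAPI_add_event(acpi_set, code);

    /* One EventSet per component can be active simultaneously. */
    PAPI_start(cpu_set);
    PAPI_start(acpi_set);
    /* ... region of interest ... */
    PAPI_stop(acpi_set, acpi_val);
    PAPI_stop(cpu_set, cpu_val);

    printf("cycles: %lld  temp reading: %lld\n", cpu_val[0], acpi_val[0]);
    return 0;
}
```

In real code each PAPI call returns PAPI_OK on success and should be checked.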

Events

For now, PAPI Preset events are only defined for the cpu component, which by convention is always component 0. Since these event names and codes are available directly in papi.h, they will continue to work with no modifications. Event codes for other components are always mapped to native events available on that component and are bound to the component with a 4-bit component ID field embedded in the event code itself. These codes cannot be determined a priori, since they are an opaque id used only by PAPI. They must be obtained by a call to PAPI_event_name_to_code(), which will search all available native event tables and return a properly encoded value if the event exists. As described above, the first event added binds an EventSet to a Component; all following added events must belong to the same Component.

Component Housekeeping

A number of changes were made to support various housekeeping chores associated with multiple Components. A new API, PAPI_num_components(), was added to provide the number of active components in the current library. Also, PAPI_get_component_info() replaces PAPI_get_substrate_info() and provides detailed information for a given component. As mentioned above, since the cpu component is always assumed to exist, it is always assigned as component 0. In addition, component 0 is always relied on to provide the high resolution timer functionality behind the following APIs: PAPI_get_real_cyc(), PAPI_get_virt_cyc(), PAPI_get_real_usec(), and PAPI_get_virt_usec(). One call, PAPI_num_hwctrs(), still functions as it did in traditional PAPI to provide the number of physical cpu counters. It has been augmented by the new PAPI_num_cmp_hwctrs(), to provide the number of counters for a specified component.
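The housekeeping calls above can be combined into a short enumeration loop. A minimal sketch, assuming PAPI is installed (output depends entirely on the components configured into the library):

```c
#include <stdio.h>
#include "papi.h"

int main(void)
{
    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
        return 1;

    int ncmp = PAPI_num_components();
    printf("%d components; cpu component has %d counters\n",
           ncmp, PAPI_num_hwctrs());

    /* Component 0 is always the cpu component; walk the rest too. */
    for (int cidx = 0; cidx < ncmp; cidx++) {
        const PAPI_component_info_t *info = PAPI_get_component_info(cidx);
        if (info != NULL)
            printf("component %d: %s (%d counters)\n",
                   cidx, info->name, PAPI_num_cmp_hwctrs(cidx));
    }
    return 0;
}
```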

PAPI Options

The bulk of the visible changes in PAPI-C have occurred in the general area of setting and getting option values. Options can be either system-wide or component-specific. This didn't matter in traditional PAPI with only one component. Now it does. In order to preserve backward compatibility with code that only accesses the cpu component, the PAPI_get_opt() and PAPI_set_opt() calls behave as before, with an implicit component index of 0 for those options that are bound to a component. For those options that are component specific, PAPI_get_cmp_opt() and PAPI_set_cmp_opt() take an additional component index argument. Further, two new convenience functions, PAPI_set_cmp_domain() and PAPI_set_cmp_granularity(), have been added for component specific setting of these options. More subtly, two of the cases handled by PAPI_set_opt() now have additional information included in the passed data structures. Both PAPI_DEFDOM and PAPI_DEFGRN cases now require a component index to be provided in the passed data structure, since available domains are component dependent and may differ widely between cpu domains and, for example, network domains.
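The new PAPI_DEFDOM requirement can be sketched as follows. The union member name `defdomain` and the `def_cidx` field follow the description in this document; treat the exact struct layout as an assumption to be checked against the papi.h shipped with this release.

```c
#include <string.h>
#include "papi.h"

/* Set the default counting domain for the cpu component (component 0). */
int set_cpu_default_domain(void)
{
    PAPI_option_t opt;
    memset(&opt, 0, sizeof(opt));
    opt.defdomain.domain   = PAPI_DOM_ALL;
    opt.defdomain.def_cidx = 0;   /* now mandatory for PAPI_DEFDOM */
    return PAPI_set_opt(PAPI_DEFDOM, &opt);
}
```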

Building and Linking

There are very few visible changes in the build environment. As before, cpu components are automatically detected by configure and included in the build. As new components are added, each is supported by a
--with-<cmp> = yes
option on the configure command line. Currently supported component options include:
--with-acpi = yes
--with-mx = yes
--with-net = yes

It is intended that in the future, where possible, component support will be autodetected by configure in a fashion similar to cpu architectures and automatically included in the make.

The make process currently compiles and links the sources for all requested components into a single binary. This process is automatic and transparent once the components are specified in the configure step. It is intended that future releases will build each component independently and allow for dynamic component loading at runtime.

Application Changes

Very few changes are needed to run existing PAPI-enabled applications under PAPI-C. The discussion below highlights the changes we found necessary in porting our test applications to the modified API:

  • Any calls to PAPI_get_substrate_info() must be converted to calls to PAPI_get_component_info() with a corresponding change in the type of the returned data structure. Correspondingly, calls to PAPI_get_opt(SUBSTRATE_INFO) should be changed to PAPI_get_opt(COMPONENT_INFO) or PAPI_get_cmp_opt(COMPONENT_INFO,0).
  • If an application creates an EventSet and then tries to set the domain or the multiplex options before adding events, the code will fail. The fix is to call PAPI_assign_eventset_component() with the desired component prior to setting options.
  • Calls to PAPI_set_opt() with either PAPI_DEFDOM or PAPI_DEFGRN options must set the def_cidx field in the passed data structure.

Summary of Changes

New APIs:

  • const PAPI_component_info_t *PAPI_get_component_info(int cidx)
    given a valid index, returns a component info structure as defined in papi.h
    returns NULL if out of range.
    Replaces  PAPI_get_substrate_info()
  • int PAPI_num_components(void)
  • int PAPI_assign_eventset_component(int EventSet, int cidx)
    Explicitly bind an eventset to a component before events are added.
    Occasionally needed prior to manipulating eventset parameters like domain or multiplexing.
  • int PAPI_set_cmp_domain(int domain, int cidx)
  • int PAPI_set_cmp_granularity(int granularity, int cidx)
  • int PAPI_num_cmp_hwctrs(int cidx)
  • int PAPI_get_cmp_opt(int option, PAPI_option_t * ptr, int cidx)
    Handles options that explicitly require a component index:
        PAPI_DEF_MPX_USEC (shouldn't this one be system level?)
        PAPI_MAX_HWCTRS
        PAPI_MAX_MPX_CTRS
        PAPI_DEFDOM
        PAPI_DEFGRN
        PAPI_SHLIBINFO (shouldn't this one be system level?)
        PAPI_COMPONENTINFO 

Modified API Functionality:

  • int PAPI_enum_event(int *EventCode, int modifier)
    Parses EventCode for component index.
    Enumerates only across component specified in EventCode
  • int PAPI_create_eventset(int *EventSet)
    Eventsets are bound to components. This is ordinarily a late-binding process that occurs when an event is added.
  • int PAPI_set_domain(int domain)
    Implicitly sets domain of component 0; deprecated - maintained for backward compatibility.
  • int PAPI_set_granularity(int granularity)
    Implicitly sets granularity of component 0; deprecated - maintained for backward compatibility.
  • int PAPI_set_opt(int option, PAPI_option_t * ptr)
    The PAPI_DEFDOM and PAPI_DEFGRN options now include a mandatory component index field in the data structure
  • int PAPI_num_hwctrs(void)
    Implicitly returns number of counters for component 0; deprecated - maintained for backward compatibility.
  • int PAPI_get_opt(int option, PAPI_option_t * ptr)
    Behaves as before for options that don't require a component index;
    Implicitly returns values for component 0 for the following options:
        PAPI_DEF_MPX_USEC
        PAPI_MAX_HWCTRS
        PAPI_MAX_MPX_CTRS
        PAPI_DEFDOM
        PAPI_DEFGRN
        PAPI_SHLIBINFO (shouldn't this one be system level?)
        PAPI_COMPONENTINFO
  • The following 4 APIs always call the timer functions found in the cpu component (component 0):
    PAPI_get_real_cyc()
    PAPI_get_virt_cyc()
    PAPI_get_real_usec()
    PAPI_get_virt_usec()

Structural Changes:

  • Component 0 is always assumed to be the traditional cpu counter component.
  • Event codes now contain an embedded 4 bit COMPONENT_INDEX field to id one of 16 components
    No error checking is done yet to guarantee less than 16 components.
  • The PAPI_SUBSTRATEINFO case for PAPI_get_opt has been changed to PAPI_COMPONENTINFO
    multiplex info was moved from the component level to a separate PAPI_mpx_info_t structure included at the hardware info level.
  • The PAPI_domain_option_t and PAPI_granularity_option_t structures now have a component index field, def_cidx, required when setting default domains or granularities.

New Error Messages:

  • PAPI_ENOINIT - PAPI hasn't been initialized yet
  • PAPI_ENOCMP - Component Index isn't set or is out of range
papi-papi-7-2-0-t/doc/README

/*
* File:    papi/doc/README
* CVS:     $Id$
* Author:  Dan Terpstra
*          terpstra@cs.utk.edu
* Mods:
*/

This directory contains: A collection of files to aid in the preparation of PAPI doxygen documentation.

Beginning with PAPI 4.2.0, doxygen and an on-line wiki are the two primary sources of documentation. Man pages generated from doxygen can be found in the man directories and installed on your system. A PAPI Overview and html versions of the man pages can also be found online at:
http://icl.cs.utk.edu/projects/papi/wiki/Main_Page

papi-papi-7-2-0-t/doc/doxygen_procedure.txt

********************************************************************************
Check the version of doxygen you're using, there is a bug with older versions ( < 1.7.4 )
********************************************************************************

USAGE
=======================
To invoke doxygen,
    cd $(papi_dir)/doc
    make
(alternatively: doxygen Doxyfile-{html,man1,man3})

This command produces documentation for the PAPI user-exposed api and data-structures.

There are several different configuration files present:

Doxyfile-html - generates documentation for everything under src. This will take a long time to run, and generates north of 600 megs of files. Requires the program dot, for dependency graphs.
Doxyfile-man1 - generates man-pages for the utilities.
Doxyfile-man3 - generates man-pages for the API, see papi.h

Commenting the Code
=======================
To get doxygen's attention, in general, use a special comment block

    /** */ thing_to_be_commented

Doxygen responds to several special commands, denoted by @command (if you're feeling texy, \command)

As an artifact of how doxygen started life, we call our api functions 'classes' to get doxygen to generate man-pages for the function.
/** @class MY_FUNCTION @brief gives a brief overview of what the function does, limited to 1 line or 1 sentence if you need the space. @param arg1 describes a parameter to the function @return describes the function's return value @retval allows you to enumerate return values Down here we have more detailed information about the function Which can span many lines And paragraphs (feeling texy now?) @par Examples: @code This is the way to get examples to format nicely code goes here.... @endcode @bug Here you get a section of freeform text to describe bugs you encounter. */ @internal keeps comment blocks marked as such out of the documentation (unless the INTERNAL_DOCS flag is set in the config file) In several places /**< */ appears; this means that the comment pertains to the previous element. int foo; /**< This comment is about foo */ TODO ======================= Doxygen provides options for [ab]using the preprocessor. Do we need to look into this? Probably not more than we already do -J Document the ctests? See http://www.stack.nl/~dimitri/doxygen/docblocks.html for more detail on doxygen. papi-papi-7-2-0-t/gitlog2changelog.py000077500000000000000000000131701502707512200174360ustar00rootroot00000000000000#!/usr/bin/env python3 # Copyright 2008 Marcus D. Hanwell # Minor changes for NUT by Charles Lepple # Distributed under the terms of the GNU General Public License v2 or later # # Updates by Treece Burgess in October of 2024: # Add --starting_commit and --fout command line arguments # Update the re.search conditional check to actually check against a valid return value # Correctly account for the final entry of the git log summary import string, re, os from textwrap import TextWrapper import sys, argparse def cmd_line_interface(): """Setup for the command line interface. :return: The argparse.Namespace object.
""" parser = argparse.ArgumentParser() parser.add_argument("--starting_commit", required = True, help = "Commit hash for the starting point of the desired range (non-inclusive).") parser.add_argument("--fout", required = True, help = "Name to give output file. E.g. ChangeLogP800.txt") return parser.parse_args() if __name__ == "__main__": # Collect the command line arguments args = cmd_line_interface() # Range of specific commits that we want to create a change log for rev_range = '%s..HEAD' % args.starting_commit # Execute git log with the desired command line options. # This is implemented using subprocess.Popen fin = os.popen('git log --summary --stat --no-merges --date=short %s' % rev_range, 'r', buffering = -1) # Needed to properly parse final entry lines = fin.readlines() last_line = lines[-1] # Create a ChangeLog file in the current directory. fout = open(args.fout, 'w') # Set up the loop variables in order to locate the blocks we want authorFound = False dateFound = False messageFound = False filesFound = False message = "" messageNL = False files = "" prevAuthorLine = "" wrapper = TextWrapper(initial_indent="\t", subsequent_indent="\t ") # The main part of the loop for line in lines: # The commit line marks the start of a new commit object. if line.startswith('commit'): # Start all over again... 
authorFound = False dateFound = False messageFound = False messageNL = False message = "" filesFound = False files = "" continue # Match the author line and extract the part we want elif 'Author:' in line: authorList = re.split(': ', line, 1) author = authorList[1] author = author[0:len(author)-1] authorFound = True # Match the date line elif 'Date:' in line: dateList = re.split(': ', line, 1) date = dateList[1] date = date[0:len(date)-1] dateFound = True # The Fossil-IDs are ignored: elif line.startswith(' Fossil-ID:') or line.startswith(' [[SVN:'): continue # The svn-id lines are ignored elif ' git-svn-id:' in line: continue # The sign off line is ignored too elif 'Signed-off-by' in line: continue # Extract the actual commit message for this commit elif authorFound & dateFound & messageFound == False: # Find the commit message if we can if len(line) == 1: if messageNL: messageFound = True else: messageNL = True elif len(line) == 4: messageFound = True else: if len(message) == 0: message = message + line.strip() else: message = message + " " + line.strip() # If this line is hit all of the files have been stored for this commit elif re.search('files? changed', line) != None: filesFound = True # We only want to continue if it is not the last line; # continuing on the last line would skip the final entry if line is not last_line: continue # Collect the files for this commit. 
FIXME: Still need to add +/- to files elif authorFound & dateFound & messageFound: fileList = re.split(' \| ', line, 2) if len(fileList) > 1: if len(files) > 0: files = files + ", " + fileList[0].strip() else: files = fileList[0].strip() # All of the parts of the commit have been found - write out the entry if authorFound & dateFound & messageFound & filesFound: # First the author line, only outputted if it is the first for that # author on this day authorLine = date + " " + author if len(prevAuthorLine) == 0: fout.write(authorLine + "\n\n") elif authorLine == prevAuthorLine: pass else: fout.write("\n" + authorLine + "\n\n") # Assemble the actual commit message line(s) and limit the line length # to 80 characters. commitLine = "* " + files + ": " + message # Write out the commit line fout.write(wrapper.fill(commitLine) + "\n") #Now reset all the variables ready for a new commit block. authorFound = False dateFound = False messageFound = False messageNL = False message = "" filesFound = False files = "" prevAuthorLine = authorLine # Close the input and output lines now that we are finished. fin.close() fout.close() papi-papi-7-2-0-t/man/000077500000000000000000000000001502707512200144135ustar00rootroot00000000000000papi-papi-7-2-0-t/man/Makefile000066400000000000000000000005301502707512200160510ustar00rootroot00000000000000clean: rm -f *~ core man3/*~ install: @echo "Man pages (MANDIR) being installed in: \"$(MANDIR)\""; -mkdir -p $(MANDIR)/man3 -chmod go+rx $(MANDIR)/man3 -cp man3/PAPI*.3 $(MANDIR)/man3 -chmod go+r $(MANDIR)/man3/PAPI*.3 -mkdir -p $(MANDIR)/man1 -chmod go+rx $(MANDIR)/man1 -cp man1/*.1 $(MANDIR)/man1 -chmod go+r $(MANDIR)/man1/*.1 papi-papi-7-2-0-t/man/README000066400000000000000000000010261502707512200152720ustar00rootroot00000000000000/* * File: README * CVS: $Id$ * Author: Philip Mucci * mucci@cs.utk.edu * Mods: * */ This directory contains: Makefile Installs man pages. man1/ Man pages for the PAPI utility applications. 
man3/ Man pages for the PAPI API functions. Makefile Usage: make make install DESTDIR= Beginning with PAPI 4.2.0, man pages are generated from the PAPI sources using doxygen scripts found in the papi/doc directory. They are updated prior to each release.papi-papi-7-2-0-t/man/man1/000077500000000000000000000000001502707512200152475ustar00rootroot00000000000000papi-papi-7-2-0-t/man/man1/PAPI_derived_event_files.1000066400000000000000000000201101502707512200221410ustar00rootroot00000000000000.TH "PAPI_derived_event_files" 1 "Wed Jun 25 2025 19:17:03" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_derived_event_files \- Describes derived event definition file syntax\&. .SH "Derived Events" .PP PAPI provides the ability to define events whose value will be derived from multiple native events\&. The list of native events to be used in a derived event and a formula which describes how to use them is provided in an event definition file\&. The PAPI team provides an event definition file which describes all of the supported PAPI preset events\&. PAPI also allows a user to provide an event definition file that describes a set of user defined events which can extend the events PAPI normally supports\&. .PP This page documents the syntax of the commands which can appear in an event definition file\&. .PP .br .SS "General Rules:" .PD 0 .IP "\(bu" 2 Blank lines are ignored\&. .IP "\(bu" 2 Lines that begin with '#' are comments (they are also ignored)\&. .IP "\(bu" 2 Names shown inside < > below represent values that must be provided by the user\&. .IP "\(bu" 2 If a user provided value contains white space, it must be protected with quotes\&. .PP .PP .br .SS "Commands:" \fBCPU,<pmuName>\fP .RS 4 Specifies a PMU name which controls if the PRESET and EVENT commands that follow this line should be processed\&.
Multiple CPU commands can be entered without PRESET or EVENT commands between them to provide a list of PMU names to which the derived events that follow will apply\&. When a PMU name provided in the list matches a PMU name known to the running system, the events which follow will be created\&. If none of the PMU names provided in the list match a PMU name on the running system, the events which follow will be ignored\&. When a new CPU command follows either a PRESET or EVENT command, the PMU list is rebuilt\&. .br .br .RE .PP \fBPRESET,<eventName>,<derivedType>,<eventAttr>,LDESC,"<longDesc>",SDESC,"<shortDesc>",NOTE,"<note>"\fP .RS 4 Declare a PAPI preset derived event\&. .br .br .RE .PP \fBEVENT,<eventName>,<derivedType>,<eventAttr>,LDESC,"<longDesc>",SDESC,"<shortDesc>",NOTE,"<note>"\fP .RS 4 Declare a user defined derived event\&. .br .br .RE .PP \fBWhere:\fP .RS 4 .RE .PP \fBpmuName:\fP .RS 4 The PMU which the following events should apply to\&. A list of PMU names supported by your system can be obtained by running papi_component_avail on your system\&. .br .RE .PP \fBeventName:\fP .RS 4 Specifies the name used to identify this derived event\&. This name should be unique within the events on your system\&. .br .RE .PP \fBderivedType:\fP .RS 4 Specifies the kind of derived event being defined (see 'Derived Types' below)\&. .br .RE .PP \fBeventAttr:\fP .RS 4 Specifies a formula and a list of base events that are used to compute the derived event's value\&. The syntax of this field depends on the 'derivedType' specified above (see 'Derived Types' below)\&. .br .RE .PP \fBlongDesc:\fP .RS 4 Provides the long description of the event\&. .br .RE .PP \fBshortDesc:\fP .RS 4 Provides the short description of the event\&. .br .RE .PP \fBnote:\fP .RS 4 Provides an event note\&. .br .RE .PP \fBbaseEvent (used below):\fP .RS 4 Identifies an event on which this derived event is based\&. This may be a native event (possibly with event masks), an already known preset event, or an already known user event\&.
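The comma-separated layout described above can be sketched in a few lines of Python. This is an illustration only, not the parser PAPI itself uses; the function name and the sample EVENT line are made up here, and the field names follow the 'Where:' list.

```python
# Sketch: split one PRESET/EVENT definition line into its named fields.
# Illustrative only -- this is NOT the parser used inside PAPI.
import csv
import io

def parse_event_line(line):
    # csv handles the quoted descriptions, which may contain commas
    fields = next(csv.reader(io.StringIO(line)))
    keywords = {"LDESC", "SDESC", "NOTE"}
    entry = {
        "command": fields[0],       # PRESET or EVENT
        "eventName": fields[1],
        "derivedType": fields[2],
    }
    # eventAttr is everything up to the first keyword: a formula and/or
    # base events, depending on derivedType
    i = 3
    attrs = []
    while i < len(fields) and fields[i] not in keywords:
        attrs.append(fields[i])
        i += 1
    entry["eventAttr"] = attrs
    # Optional trailing keyword/value pairs: LDESC, SDESC, NOTE
    while i + 1 < len(fields):
        entry[fields[i]] = fields[i + 1]
        i += 2
    return entry

# Hypothetical DERIVED_INFIX definition, modeled on the example below
info = parse_event_line(
    "EVENT,USER_SP_OPS,DERIVED_INFIX,N0+(N1*3),"
    "FP_COMP_OPS_EXE:SSE_SINGLE_PRECISION,FP_COMP_OPS_EXE:SSE_FP_PACKED,"
    "NOTE,Using the same formula in infix format"
)
```

For a DERIVED_INFIX line like the one above, the first eventAttr field is the formula and the remaining fields are the base events it references.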
.br .RE .PP .br .SS "Notes:" The PRESET command has traditionally been used in the PAPI provided preset definition file\&. The EVENT command is intended to be used in user defined event definition files\&. The code treats them the same so they are interchangeable and they can both be used in either event definition file\&. .br .PP .br .SS "Derived Types:" This describes values allowed in the 'derivedType' field of the PRESET and EVENT commands\&. It also shows the syntax of the 'eventAttr' field for each derived type supported by these commands\&. All of the derived events provide a list of one or more events which the derived event is based on (baseEvent)\&. Some derived events provide a formula that specifies how to compute the derived event's value using the baseEvents in the list\&. The following derived types are supported; the syntax of the 'eventAttr' parameter for each derived event type is shown in parentheses\&. .br .br .PP \fBNOT_DERIVED (<baseEvent>):\fP .RS 4 This derived type defines an alias for the existing event 'baseEvent'\&. .br .RE .PP \fBDERIVED_ADD (<baseEvent1>,<baseEvent2>):\fP .RS 4 This derived type defines a new event that will be the sum of two other events\&. It has a value of 'baseEvent1' plus 'baseEvent2'\&. .br .RE .PP \fBDERIVED_PS (PAPI_TOT_CYC,<baseEvent1>):\fP .RS 4 This derived type defines a new event that will report the number of 'baseEvent1' events which occurred per second\&. It has a value of ((('baseEvent1' * cpu_max_mhz) * 1000000 ) / PAPI_TOT_CYC)\&. The user must provide PAPI_TOT_CYC as the first event of two events in the event list for this to work correctly\&. .br .RE .PP \fBDERIVED_ADD_PS (PAPI_TOT_CYC,<baseEvent1>,<baseEvent2>):\fP .RS 4 This derived type defines a new event that will add together two event counters and then report the number which occurred per second\&. It has a value of (((('baseEvent1' + 'baseEvent2') * cpu_max_mhz) * 1000000 ) / PAPI_TOT_CYC)\&. The user must provide PAPI_TOT_CYC as the first event of three events in the event list for this to work correctly\&.
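The arithmetic behind the DERIVED_PS and DERIVED_ADD_PS formulas above can be sketched in plain Python. The function names and all numeric inputs below are hypothetical; PAPI performs this computation internally from real counter readings.

```python
# Sketch of the per-second formulas quoted above. Integer division is used
# here for illustration; inputs are made-up counter readings.
def derived_ps(base_event1, papi_tot_cyc, cpu_max_mhz):
    # ((baseEvent1 * cpu_max_mhz) * 1000000) / PAPI_TOT_CYC
    return (base_event1 * cpu_max_mhz * 1000000) // papi_tot_cyc

def derived_add_ps(base_event1, base_event2, papi_tot_cyc, cpu_max_mhz):
    # (((baseEvent1 + baseEvent2) * cpu_max_mhz) * 1000000) / PAPI_TOT_CYC
    return ((base_event1 + base_event2) * cpu_max_mhz * 1000000) // papi_tot_cyc

# Hypothetical readings: a 2400 MHz core counted for one second
# (2.4e9 cycles), during which baseEvent1 fired 5000 times, so the
# per-second rate comes out to 5000.
rate = derived_ps(5000, 2400000000, cpu_max_mhz=2400)
```

Note that cpu_max_mhz * 1000000 simply converts the machine's rated MHz into cycles per second, so the quotient is "events per second of wall clock" rather than "events per cycle".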
.br .RE .PP \fBDERIVED_CMPD (<baseEvent1>,<baseEvent2>):\fP .RS 4 This derived type defines a new event that will be the difference between two other events\&. It has a value of 'baseEvent1' minus 'baseEvent2'\&. .br .RE .PP \fBDERIVED_POSTFIX (<pfFormula>,<baseEvent0>,<baseEvent1>, \&.\&.\&. ,<baseEventN>):\fP .RS 4 This derived type defines a new event whose value is computed from several native events using a postfix (reverse polish notation) formula\&. Its value is the result of processing the postfix formula\&. The 'pfFormula' is of the form 'N0|N1|N2|5|*|+|-|' where the '|' acts as a token separator and the tokens N0, N1, and N2 are place holders that represent baseEvent0, baseEvent1, and baseEvent2 respectively\&. .br .RE .PP \fBDERIVED_INFIX (<ifFormula>,<baseEvent0>,<baseEvent1>, \&.\&.\&. ,<baseEventN>):\fP .RS 4 This derived type defines a new event whose value is computed from several native events using an infix (algebraic notation) formula\&. Its value is the result of processing the infix formula\&. The 'ifFormula' is of the form 'N0-(N1+(N2*5))' where the tokens N0, N1, and N2 are place holders that represent baseEvent0, baseEvent1, and baseEvent2 respectively\&. .br .RE .PP .br .SS "Example:" In the following example, the events PAPI_SP_OPS, USER_SP_OPS, and ALIAS_SP_OPS will all measure the same events and return the same value\&. They just demonstrate different ways to use the PRESET and EVENT event definition commands\&.
.br .br .PP .PD 0 .IP "\(bu" 2 # The following lines define pmu names that all share the following events .IP "\(bu" 2 CPU nhm .IP "\(bu" 2 CPU nhm-ex .IP "\(bu" 2 # Events which should be defined for either of the above pmu types .IP "\(bu" 2 PRESET,PAPI_TOT_CYC,NOT_DERIVED,UNHALTED_CORE_CYCLES .IP "\(bu" 2 PRESET,PAPI_REF_CYC,NOT_DERIVED,UNHALTED_REFERENCE_CYCLES .IP "\(bu" 2 PRESET,PAPI_SP_OPS,DERIVED_POSTFIX,N0|N1|3|*|+|,FP_COMP_OPS_EXE:SSE_SINGLE_PRECISION,FP_COMP_OPS_EXE:SSE_FP_PACKED,NOTE,'Using a postfix formula' .IP "\(bu" 2 EVENT,USER_SP_OPS,DERIVED_INFIX,N0+(N1*3),FP_COMP_OPS_EXE:SSE_SINGLE_PRECISION,FP_COMP_OPS_EXE:SSE_FP_PACKED,NOTE,'Using the same formula in infix format' .IP "\(bu" 2 EVENT,ALIAS_SP_OPS,NOT_DERIVED,PAPI_SP_OPS,LDESC,'Alias for preset event PAPI_SP_OPS' .IP "\(bu" 2 # End of event definitions for above pmu names and start of a section for a new pmu name\&. .IP "\(bu" 2 CPU snb .PP papi-papi-7-2-0-t/man/man1/papi_avail.1000066400000000000000000000042301502707512200174350ustar00rootroot00000000000000.TH "papi_avail" 1 "Wed Jun 25 2025 19:17:03" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_avail \- papi_avail utility\&. .PP file papi_avail\&.c .SH "Name" .PP papi_avail - provides availability and detailed information for PAPI preset and user defined events\&. .SH "Synopsis" .PP papi_avail [-adht] [-e event] .SH "Description" .PP papi_avail is a PAPI utility program that reports information about the current PAPI installation and supported preset and user defined events\&. .SH "Options" .PP .PD 0 .IP "\(bu" 2 -h Display help information about this utility\&. .IP "\(bu" 2 -a Display only the available PAPI events\&. .IP "\(bu" 2 -c Display only the available PAPI events after a check\&. .IP "\(bu" 2 -d Display PAPI event information in a more detailed format\&. .IP "\(bu" 2 -e < event > Display detailed event information for the named event\&. This event can be a preset event, a user defined event, or a native event\&. 
If the event is a preset or a user defined event, the output shows a list of native events the event is based on and the formula that is used to compute the event's final value\&. .br .PP .PP Event filtering options .PD 0 .IP "\(bu" 2 --br Display branch related PAPI preset events .IP "\(bu" 2 --cache Display cache related PAPI preset events .IP "\(bu" 2 --cnd Display conditional PAPI preset events .IP "\(bu" 2 --fp Display Floating Point related PAPI preset events .IP "\(bu" 2 --ins Display instruction related PAPI preset events .IP "\(bu" 2 --idl Display Stalled or Idle PAPI preset events .IP "\(bu" 2 --l1 Display level 1 cache related PAPI preset events .IP "\(bu" 2 --l2 Display level 2 cache related PAPI preset events .IP "\(bu" 2 --l3 Display level 3 cache related PAPI preset events .IP "\(bu" 2 --mem Display memory related PAPI preset events .IP "\(bu" 2 --msc Display miscellaneous PAPI preset events .IP "\(bu" 2 --tlb Display Translation Lookaside Buffer PAPI preset events .PP .SH "Bugs" .PP There are no known bugs in this utility\&. If you find a bug, it should be reported to the PAPI Mailing List at ptools-perfapi@icl.utk.edu\&. .br .PP \fBSee also\fP .RS 4 PAPI_derived_event_files .RE .PP papi-papi-7-2-0-t/man/man1/papi_clockres.1000066400000000000000000000013651502707512200201540ustar00rootroot00000000000000.TH "papi_clockres" 1 "Wed Jun 25 2025 19:17:03" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_clockres \- The papi_clockres utility\&. .PP file clockres\&.c .SH "Name" .PP papi_clockres - measures and reports clock latency and resolution for PAPI timers\&. .SH "Synopsis" .PP .SH "Description" .PP papi_clockres is a PAPI utility program that measures and reports the latency and resolution of the four PAPI timer functions: PAPI_get_real_cyc(), PAPI_get_virt_cyc(), PAPI_get_real_usec() and PAPI_get_virt_usec()\&. .SH "Options" .PP This utility has no command line options\&. .SH "Bugs" .PP There are no known bugs in this utility\&.
If you find a bug, it should be reported to the PAPI Mailing List at ptools-perfapi@icl.utk.edu\&. papi-papi-7-2-0-t/man/man1/papi_command_line.1000066400000000000000000000016331502707512200207720ustar00rootroot00000000000000.TH "papi_command_line" 1 "Wed Jun 25 2025 19:17:03" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_command_line \- executes PAPI preset or native events from the command line\&. .SH "Synopsis" .PP papi_command_line < event > < event > \&.\&.\&. .SH "Description" .PP papi_command_line is a PAPI utility program that adds named events from the command line to a PAPI EventSet and does some work with that EventSet\&. This serves as a handy way to see if events can be counted together, and if they give reasonable results for known work\&. .SH "Options" .PP .PD 0 .IP "\(bu" 2 -u Display output values as unsigned integers .IP "\(bu" 2 -x Display output values as hexadecimal .IP "\(bu" 2 -h Display help information about this utility\&. .PP .SH "Bugs" .PP There are no known bugs in this utility\&. If you find a bug, it should be reported to the PAPI Mailing List at ptools-perfapi@icl.utk.edu\&. papi-papi-7-2-0-t/man/man1/papi_component_avail.1000066400000000000000000000014031502707512200215160ustar00rootroot00000000000000.TH "papi_component_avail" 1 "Wed Jun 25 2025 19:17:03" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_component_avail \- papi_component_avail utility\&. .PP file papi_component_avail\&.c .SH "NAME" .PP papi_component_avail - provides detailed information on the PAPI components available on the system\&. .SH "Synopsis" .PP .SH "Description" .PP papi_component_avail is a PAPI utility program that reports information about the components papi was built with\&. .SH "Options" .PP .PD 0 .IP "\(bu" 2 -h help message .IP "\(bu" 2 -d provide detailed information about each component\&. .PP .SH "Bugs" .PP There are no known bugs in this utility\&. 
If you find a bug, it should be reported to the PAPI Mailing List at ptools-perfapi@icl.utk.edu\&. papi-papi-7-2-0-t/man/man1/papi_cost.1000066400000000000000000000026321502707512200173150ustar00rootroot00000000000000.TH "papi_cost" 1 "Wed Jun 25 2025 19:17:03" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_cost \- papi_cost utility\&. .PP file papi_cost\&.c .SH "NAME" .PP papi_cost - computes execution time costs for basic PAPI operations\&. .SH "Synopsis" .PP papi_cost [-dhps] [-b bins] [-t threshold] .SH "Description" .PP papi_cost is a PAPI utility program that computes the min / max / mean / std\&. deviation of execution times for PAPI start/stop pairs and for PAPI reads\&. This information provides the basic operating cost to a user's program for collecting hardware counter data\&. Command line options control display capabilities\&. .SH "Options" .PP .PD 0 .IP "\(bu" 2 -b < bins > Define the number of bins into which the results are partitioned for display\&. The default is 100\&. .IP "\(bu" 2 -d Display a graphical distribution of costs in a vertical histogram\&. .IP "\(bu" 2 -h Display help information about this utility\&. .IP "\(bu" 2 -p Display 25/50/75 percentile results for making boxplots\&. .IP "\(bu" 2 -s Show the number of iterations in each of the first 10 standard deviations above the mean\&. .IP "\(bu" 2 -t < threshold > Set the threshold for the number of iterations to measure costs\&. The default is 1,000,000\&. .PP .SH "Bugs" .PP There are no known bugs in this utility\&. If you find a bug, it should be reported to the PAPI Mailing List at ptools-perfapi@icl.utk.edu\&. papi-papi-7-2-0-t/man/man1/papi_decode.1000066400000000000000000000031551502707512200175710ustar00rootroot00000000000000.TH "papi_decode" 1 "Wed Jun 25 2025 19:17:03" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_decode \- papi_decode utility\&.
.PP file papi_decode\&.c .SH "NAME" .PP papi_decode - provides availability and detail information for PAPI preset events\&. .SH "Synopsis" .PP papi_decode [-ah] .SH "Description" .PP papi_decode is a PAPI utility program that converts the PAPI presets for the existing library into a comma separated value format that can then be viewed or modified in spreadsheet applications or text editors, and can be supplied to PAPI_encode_events (3) as a way of adding or modifying event definitions for specialized applications\&. The format for the csv output consists of a line of field names, followed by a blank line, followed by one line of comma separated values for each event contained in the preset table\&. A portion of this output (for Pentium 4) is shown below: .PP .nf name,derived,postfix,short_descr,long_descr,note,[native,\&.\&.\&.] PAPI_L1_ICM,NOT_DERIVED,,"L1I cache misses","Level 1 instruction cache misses",,BPU_fetch_request_TCMISS PAPI_L2_TCM,NOT_DERIVED,,"L2 cache misses","Level 2 cache misses",,BSQ_cache_reference_RD_2ndL_MISS_WR_2ndL_MISS PAPI_TLB_DM,NOT_DERIVED,,"Data TLB misses","Data translation lookaside buffer misses",,page_walk_type_DTMISS .fi .PP .SH "Options" .PP .PD 0 .IP "\(bu" 2 -a Convert only the available PAPI preset events\&. .IP "\(bu" 2 -h Display help information about this utility\&. .PP .SH "Bugs" .PP There are no known bugs in this utility\&. If you find a bug, it should be reported to the PAPI Mailing List at ptools-perfapi@icl.utk.edu\&. papi-papi-7-2-0-t/man/man1/papi_error_codes.1000066400000000000000000000015141502707512200206510ustar00rootroot00000000000000.TH "papi_error_codes" 1 "Wed Jun 25 2025 19:17:03" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_error_codes \- papi_error_codes utility\&. .PP file error_codes\&.c .SH "NAME" .PP papi_error_codes - lists all currently defined PAPI error codes\&. 
.SH "Synopsis" .PP papi_error_codes .SH "Description" .PP papi_error_codes is a PAPI utility program that displays all defined error codes from papi\&.h and their error strings from papi_data\&.h\&. If an error string is not defined, a warning is generated\&. This can help trap newly defined error codes for which error strings are not yet defined\&. .SH "Options" .PP This utility has no command line options\&. .SH "Bugs" .PP There are no known bugs in this utility\&. If you find a bug, it should be reported to the PAPI Mailing List at ptools-perfapi@icl.utk.edu\&. papi-papi-7-2-0-t/man/man1/papi_event_chooser.1000066400000000000000000000014131502707512200212040ustar00rootroot00000000000000.TH "papi_event_chooser" 1 "Wed Jun 25 2025 19:17:03" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_event_chooser \- papi_event_chooser utility\&. .PP file event_chooser\&.c .SH "NAME" .PP papi_event_chooser - given a list of named events, lists other events that can be counted with them\&. .SH "Synopsis" .PP papi_event_chooser NATIVE | PRESET < event > < event > \&.\&.\&. .SH "Description" .PP papi_event_chooser is a PAPI utility program that reports information about the current PAPI installation and supported preset events\&. .SH "Options" .PP This utility has no command line options\&. .SH "Bugs" .PP There are no known bugs in this utility\&. If you find a bug, it should be reported to the PAPI Mailing List at ptools-perfapi@icl.utk.edu\&. papi-papi-7-2-0-t/man/man1/papi_hardware_avail.1000066400000000000000000000012711502707512200213140ustar00rootroot00000000000000.TH "papi_hardware_avail" 1 "Wed Jun 25 2025 19:17:03" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_hardware_avail \- papi_hardware_avail utility\&. .PP file papi_hardware_avail\&.c .SH "NAME" .PP papi_hardware_avail - provides detailed information on the hardware available in the system\&. 
.SH "Synopsis" .PP .SH "Description" .PP papi_hardware_avail is a PAPI utility program that reports information about the hardware devices equipped in the system\&. .SH "Options" .PP .PD 0 .IP "\(bu" 2 -h help message .PP .SH "Bugs" .PP There are no known bugs in this utility\&. If you find a bug, it should be reported to the PAPI Mailing List at ptools-perfapi@icl.utk.edu\&. papi-papi-7-2-0-t/man/man1/papi_hybrid_native_avail.1000066400000000000000000000041161502707512200223470ustar00rootroot00000000000000.TH "papi_hybrid_native_avail" 1 "Wed Jun 25 2025 19:17:03" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_hybrid_native_avail \- papi_hybrid_native_avail utility\&. .PP file hybrid_native_avail\&.c .SH "NAME" .PP papi_hybrid_native_avail - provides detailed information for PAPI native events\&. .SH "Synopsis" .PP .SH "Description" .PP papi_hybrid_native_avail is a PAPI utility program that reports information about the native events available on the current platform or on an attached MIC card\&. A native event is an event specific to a specific hardware platform\&. On many platforms, a specific native event may have a number of optional settings\&. In such cases, the native event and the valid settings are presented, rather than every possible combination of those settings\&. For each native event, a name, a description, and specific bit patterns are provided\&. 
.SH "Options" .PP .PD 0 .IP "\(bu" 2 --help, -h print this help message .IP "\(bu" 2 -d display detailed information about native events .IP "\(bu" 2 -e EVENTNAME display detailed information about named native event .IP "\(bu" 2 -i EVENTSTR include only event names that contain EVENTSTR .IP "\(bu" 2 -x EVENTSTR exclude any event names that contain EVENTSTR .IP "\(bu" 2 --noumasks suppress display of Unit Mask information .IP "\(bu" 2 --mic < index > report events on the specified target MIC device .PP .PP Processor-specific options .PD 0 .IP "\(bu" 2 --darr display events supporting Data Address Range Restriction .IP "\(bu" 2 --dear display Data Event Address Register events only .IP "\(bu" 2 --iarr display events supporting Instruction Address Range Restriction .IP "\(bu" 2 --iear display Instruction Event Address Register events only .IP "\(bu" 2 --opcm display events supporting OpCode Matching .IP "\(bu" 2 --nogroups suppress display of Event grouping information .PP .SH "Bugs" .PP There are no known bugs in this utility\&. If you find a bug, it should be reported to the PAPI Mailing List at ptools-perfapi@icl.utk.edu\&. .PP Modified by Gabriel Marin gmarin@icl.utk.edu to use offloading\&. papi-papi-7-2-0-t/man/man1/papi_mem_info.1000066400000000000000000000014351502707512200201360ustar00rootroot00000000000000.TH "papi_mem_info" 1 "Wed Jun 25 2025 19:17:03" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_mem_info \- papi_mem_info utility\&. .PP file papi_mem_info\&.c .SH "NAME" .PP papi_mem_info - provides information on the memory architecture of the current processor\&. .SH "Synopsis" .PP .SH "Description" .PP papi_mem_info is a PAPI utility program that reports information about the cache memory architecture of the current processor, including number, types, sizes and associativities of instruction and data caches and Translation Lookaside Buffers\&. .SH "Options" .PP This utility has no command line options\&. 
.SH "Bugs" .PP There are no known bugs in this utility\&. If you find a bug, it should be reported to the PAPI Mailing List at ptools-perfapi@icl.utk.edu\&. papi-papi-7-2-0-t/man/man1/papi_multiplex_cost.1000066400000000000000000000023521502707512200214170ustar00rootroot00000000000000.TH "papi_multiplex_cost" 1 "Wed Jun 25 2025 19:17:03" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_multiplex_cost \- papi_multiplex_cost utility\&. .PP file papi_multiplex_cost\&.c .SH "NAME" .PP papi_multiplex_cost - computes execution time costs for basic PAPI operations on multiplexed EventSets\&. .SH "Synopsis" .PP papi_multiplex_cost [-m, --min < min >] [-x, --max < max >] [-k,-s] .SH "Description" .PP papi_multiplex_cost is a PAPI utility program that computes the min / max / mean / std\&. deviation of execution times for PAPI start/stop pairs and for PAPI reads on multiplexed eventsets\&. This information provides the basic operating cost to a user's program for collecting hardware counter data\&. Command line options control display capabilities\&. .SH "Options" .PP .PD 0 .IP "\(bu" 2 -m < Min number of events to test > .IP "\(bu" 2 -x < Max number of events to test > .IP "\(bu" 2 -k, Do not time kernel multiplexing .IP "\(bu" 2 -s, Do not time software multiplexed EventSets .IP "\(bu" 2 -t THRESHOLD, Test with THRESHOLD iterations of counting loop\&. .PP .SH "Bugs" .PP There are no known bugs in this utility\&. If you find a bug, it should be reported to the PAPI Mailing List at ptools-perfapi@icl.utk.edu\&. papi-papi-7-2-0-t/man/man1/papi_native_avail.1000066400000000000000000000037411502707512200210110ustar00rootroot00000000000000.TH "papi_native_avail" 1 "Wed Jun 25 2025 19:17:03" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_native_avail \- papi_native_avail utility\&. .PP file papi_native_avail\&.c .SH "NAME" .PP papi_native_avail - provides detailed information for PAPI native events\&.
.SH "Synopsis" .PP .SH "Description" .PP papi_native_avail is a PAPI utility program that reports information about the native events available on the current platform\&. A native event is an event specific to a specific hardware platform\&. On many platforms, a specific native event may have a number of optional settings\&. In such cases, the native event and the valid settings are presented, rather than every possible combination of those settings\&. For each native event, a name, a description, and specific bit patterns are provided\&. .SH "Options" .PP .PD 0 .IP "\(bu" 2 --help, -h print this help message .IP "\(bu" 2 --check, -c attempts to add each event .IP "\(bu" 2 -sde FILE lists SDEs that are registered by the library or executable in FILE .IP "\(bu" 2 -e EVENTNAME display detailed information about named native event .IP "\(bu" 2 -i EVENTSTR include only event names that contain EVENTSTR .IP "\(bu" 2 -x EVENTSTR exclude any event names that contain EVENTSTR .IP "\(bu" 2 --noqual suppress display of event qualifiers (mask and flag) information .br .PP .PP Processor-specific options .PD 0 .IP "\(bu" 2 --darr display events supporting Data Address Range Restriction .IP "\(bu" 2 --dear display Data Event Address Register events only .IP "\(bu" 2 --iarr display events supporting Instruction Address Range Restriction .IP "\(bu" 2 --iear display Instruction Event Address Register events only .IP "\(bu" 2 --opcm display events supporting OpCode Matching .IP "\(bu" 2 --nogroups suppress display of Event grouping information .PP .SH "Bugs" .PP There are no known bugs in this utility\&. If you find a bug, it should be reported to the PAPI Mailing List at ptools-perfapi@icl.utk.edu\&. papi-papi-7-2-0-t/man/man1/papi_version.1000066400000000000000000000010751502707512200200320ustar00rootroot00000000000000.TH "papi_version" 1 "Wed Jun 25 2025 19:17:03" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_version \- papi_version utility\&. 
.PP file papi_version\&.c .SH "Name" .PP papi_version - provides version information for PAPI\&. .SH "Synopsis" .PP papi_version .SH "Description" .PP papi_version is a PAPI utility program that reports version information about the current PAPI installation\&. .SH "Bugs" .PP There are no known bugs in this utility\&. If you find a bug, it should be reported to the PAPI Mailing List at ptools-perfapi@icl.utk.edu\&. papi-papi-7-2-0-t/man/man1/papi_xml_event_info.1000066400000000000000000000020661502707512200213620ustar00rootroot00000000000000.TH "papi_xml_event_info" 1 "Wed Jun 25 2025 19:17:03" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_xml_event_info \- papi_xml_event_info utility\&. .PP file papi_xml_event_info\&.c .SH "NAME" .PP papi_xml_event_info - provides detailed information for PAPI events in XML format .SH "Synopsis" .PP .SH "Description" .PP papi_xml_event_info is a PAPI utility program that reports information about the events available on the current platform in an XML format\&. .PP It will attempt to create an EventSet with each event in it, which can be slow\&. .SH "Options" .PP .PD 0 .IP "\(bu" 2 -h print help message .IP "\(bu" 2 -p print only preset events .IP "\(bu" 2 -n print only native events .IP "\(bu" 2 -c COMPONENT print only events from component number COMPONENT .IP "\(bu" 2 event1, event2, \&.\&.\&. Print only events that can be created in the same event set with the events event1, event2, etc\&. .PP .SH "Bugs" .PP There are no known bugs in this utility\&. If you find a bug, it should be reported to the PAPI Mailing List at ptools-perfapi@icl.utk.edu\&. 
papi-papi-7-2-0-t/man/man3/000077500000000000000000000000001502707512200152515ustar00rootroot00000000000000papi-papi-7-2-0-t/man/man3/PAPIF_accum.3000066400000000000000000000007671502707512200173560ustar00rootroot00000000000000.TH "PAPIF_accum" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_accum \- accumulate and reset counters in an event set .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_accum( C_INT EventSet, C_LONG_LONG(*) values, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_accum\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_add_event.3000066400000000000000000000010131502707512200202000ustar00rootroot00000000000000.TH "PAPIF_add_event" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_add_event \- add PAPI preset or native hardware event to an event set .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_add_event( C_INT EventSet, C_INT EventCode, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_add_event\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPIF_add_events.3000066400000000000000000000010551502707512200203710ustar00rootroot00000000000000.TH "PAPIF_add_events" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_add_events \- add multiple PAPI presets or native hardware events to an event set .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_add_events( C_INT EventSet, C_INT(*) EventCodes, C_INT number, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_add_events\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_add_named_event.3000066400000000000000000000010561502707512200213530ustar00rootroot00000000000000.TH "PAPIF_add_named_event" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_add_named_event \- add PAPI preset or native hardware event to an event set by name .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_add_named_event( C_INT EventSet, C_STRING EventName, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_add_named_event\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPIF_assign_eventset_component.3000066400000000000000000000011141502707512200235440ustar00rootroot00000000000000.TH "PAPIF_assign_eventset_component" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_assign_eventset_component \- assign a component index to an existing but empty EventSet .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_assign_eventset_component( C_INT EventSet, C_INT cidx, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_assign_eventset_component\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_cleanup_eventset.3000066400000000000000000000007731502707512200216270ustar00rootroot00000000000000.TH "PAPIF_cleanup_eventset" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_cleanup_eventset \- empty and destroy an EventSet .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_cleanup_eventset( C_INT EventSet, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_cleanup_eventset\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_create_eventset.3000066400000000000000000000007721502707512200214420ustar00rootroot00000000000000.TH "PAPIF_create_eventset" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_create_eventset \- create a new empty PAPI EventSet .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_create_eventset( C_INT EventSet, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_create_eventset\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPIF_destroy_eventset.3000066400000000000000000000007731502707512200216710ustar00rootroot00000000000000.TH "PAPIF_destroy_eventset" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_destroy_eventset \- empty and destroy an EventSet .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_destroy_eventset( C_INT EventSet, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_destroy_eventset\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_enum_dev_type.3000066400000000000000000000010101502707512200211130ustar00rootroot00000000000000.TH "PAPIF_enum_dev_type" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_enum_dev_type \- returns handle of next device type .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_enum_dev_type( C_INT modifier, C_INT handle_index, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_enum_dev_type\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_enum_event.3000066400000000000000000000010131502707512200204140ustar00rootroot00000000000000.TH "PAPIF_enum_event" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_enum_event \- Enumerate PAPI preset or native events\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_enum_event( C_INT EventCode, C_INT modifier, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_enum_event\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPIF_epc.3000066400000000000000000000011441502707512200170230ustar00rootroot00000000000000.TH "PAPIF_epc" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_epc \- Get named events per cycle, real and processor time, reference and core cycles\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_epc( C_INT EventCode, C_FLOAT real_time, C_FLOAT proc_time, C_LONG_LONG ref, C_LONG_LONG core, C_LONG_LONG evt, C_FLOAT epc, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_epc\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_event_code_to_name.3000066400000000000000000000010561502707512200220730ustar00rootroot00000000000000.TH "PAPIF_event_code_to_name" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_event_code_to_name \- Convert a numeric hardware event code to a name\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_event_code_to_name( C_INT EventCode, C_STRING EventName, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_event_code_to_name\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_event_name_to_code.3000066400000000000000000000010561502707512200220730ustar00rootroot00000000000000.TH "PAPIF_event_name_to_code" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_event_name_to_code \- Convert a name to a numeric hardware event code\&. 
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_event_name_to_code( C_STRING EventName, C_INT EventCode, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_event_name_to_code\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_flips_rate.3000066400000000000000000000011601502707512200204020ustar00rootroot00000000000000.TH "PAPIF_flips_rate" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_flips_rate \- Simplified call to get Mflips/s (floating point instruction rate), real and processor time\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_flips_rate\fP ( C_INT EventCode, C_FLOAT real_time, C_FLOAT proc_time, C_LONG_LONG flpins, C_FLOAT mflips, C_INT check ) .RE .PP \fBSee also\fP .RS 4 \fBPAPI_flips_rate\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_flops_rate.3000066400000000000000000000011571502707512200204160ustar00rootroot00000000000000.TH "PAPIF_flops_rate" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_flops_rate \- Simplified call to get Mflops/s (floating point instruction rate), real and processor time\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_flops_rate( C_INT EventCode, C_FLOAT real_time, C_FLOAT proc_time, C_LONG_LONG flpops, C_FLOAT mflops, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_flops_rate\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPIF_get_clockrate.3000066400000000000000000000011311502707512200210660ustar00rootroot00000000000000.TH "PAPIF_get_clockrate" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_get_clockrate \- Get the clockrate in MHz for the current cpu\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Prototype:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_get_clockrate( C_INT cr )\fP .RE .PP \fBNote\fP .RS 4 This is a Fortran only interface that returns a value from the \fBPAPI_get_opt\fP call\&. .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_opt\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_get_dev_attr.3000066400000000000000000000010451502707512200207230ustar00rootroot00000000000000.TH "PAPIF_get_dev_attr" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_get_dev_attr \- returns device attributes .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_get_dev_attr\fP( C_INT handle, C_INT id, C_INT attribute, .br C_INT value, C_STRING string, C_INT check ) .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_dev_attr\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPIF_get_dev_type_attr.3000066400000000000000000000010721502707512200217640ustar00rootroot00000000000000.TH "PAPIF_get_dev_type_attr" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_get_dev_type_attr \- returns device type attributes .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_get_dev_type_attr\fP( C_INT handle_index, C_INT attribute, .br C_INT value, C_STRING string, C_INT check ) .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_dev_type_attr\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_get_dmem_info.3000066400000000000000000000010271502707512200210500ustar00rootroot00000000000000.TH "PAPIF_get_dmem_info" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_get_dmem_info \- get information about the dynamic memory usage of the current program .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_get_dmem_info( C_INT EventSet, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_dmem_info\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_get_domain.3000066400000000000000000000010221502707512200203570ustar00rootroot00000000000000.TH "PAPIF_get_domain" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_get_domain \- Get the domain setting for the specified EventSet\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Prototype:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_get_domain( C_INT eventset, C_INT domain, C_INT mode, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_opt\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPIF_get_event_info.3000066400000000000000000000012251502707512200212470ustar00rootroot00000000000000.TH "PAPIF_get_event_info" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_get_event_info \- Get the event's name and description info\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_get_event_info(C_INT EventCode, C_STRING symbol, C_STRING long_descr, C_STRING short_descr, C_INT count, C_STRING event_note, C_INT flags, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_event_info\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_get_exe_info.3000066400000000000000000000013061502707512200207070ustar00rootroot00000000000000.TH "PAPIF_get_exe_info" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_get_exe_info \- get information about the executable's address space .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_get_exe_info\fP( C_STRING fullname, C_STRING name, .br C_LONG_LONG text_start, C_LONG_LONG text_end, .br C_LONG_LONG data_start, C_LONG_LONG data_end, .br C_LONG_LONG bss_start, C_LONG_LONG bss_end, C_INT check ) .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_executable_info\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_get_granularity.3000066400000000000000000000010531502707512200214530ustar00rootroot00000000000000.TH "PAPIF_get_granularity" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_get_granularity \- Get the granularity setting for the specified EventSet\&. 
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Prototype:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_get_granularity( C_INT eventset, C_INT granularity, C_INT mode, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_opt\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_get_hardware_info.3000066400000000000000000000012071502707512200217230ustar00rootroot00000000000000.TH "PAPIF_get_hardware_info" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_get_hardware_info \- get information about the system hardware .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_get_hardware_info\fP( C_INT ncpu, C_INT nnodes, C_INT totalcpus, .br C_INT vendor, C_STRING vendor_str, C_INT model, C_STRING model_str, .br C_FLOAT revision, C_FLOAT mhz ) .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_hardware_info\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_get_multiplex.3000066400000000000000000000010121502707512200211300ustar00rootroot00000000000000.TH "PAPIF_get_multiplex" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_get_multiplex \- Get the multiplexing status of specified event set\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_get_multiplex( C_INT EventSet, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_multiplex\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPIF_get_preload.3000066400000000000000000000011561502707512200205440ustar00rootroot00000000000000.TH "PAPIF_get_preload" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_get_preload \- Get the LD_PRELOAD environment variable\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Prototype:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_get_preload( C_STRING lib_preload_env, C_INT check )\fP .RE .PP \fBNote\fP .RS 4 This is a Fortran only interface that returns a value from the \fBPAPI_get_opt\fP call\&. .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_opt\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_get_real_cyc.3000066400000000000000000000007661502707512200207050ustar00rootroot00000000000000.TH "PAPIF_get_real_cyc" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_get_real_cyc \- Get real time counter value in clock cycles\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_get_real_cyc( C_LONG_LONG real_cyc )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_real_cyc\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_get_real_nsec.3000066400000000000000000000007651502707512200210560ustar00rootroot00000000000000.TH "PAPIF_get_real_nsec" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_get_real_nsec \- Get real time counter value in nanoseconds\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_get_real_nsec( C_LONG_LONG time )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_real_nsec\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPIF_get_real_usec.3000066400000000000000000000007661502707512200210660ustar00rootroot00000000000000.TH "PAPIF_get_real_usec" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_get_real_usec \- Get real time counter value in microseconds\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_get_real_usec( C_LONG_LONG time )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_real_usec\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_get_virt_cyc.3000066400000000000000000000007711502707512200207420ustar00rootroot00000000000000.TH "PAPIF_get_virt_cyc" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_get_virt_cyc \- Get virtual time counter value in clock cycles\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_get_virt_cyc( C_LONG_LONG virt_cyc )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_virt_cyc\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_get_virt_usec.3000066400000000000000000000007711502707512200211230ustar00rootroot00000000000000.TH "PAPIF_get_virt_usec" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_get_virt_usec \- Get virtual time counter value in microseconds\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_get_virt_usec( C_LONG_LONG time )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_virt_usec\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPIF_ipc.3000066400000000000000000000010251502707512200170250ustar00rootroot00000000000000.TH "PAPIF_ipc" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_ipc \- Get instructions per cycle, real and processor time\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_ipc( C_FLOAT real_time, C_FLOAT proc_time, C_LONG_LONG ins, C_FLOAT ipc, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_ipc\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_is_initialized.3000066400000000000000000000007421502707512200212570ustar00rootroot00000000000000.TH "PAPIF_is_initialized" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_is_initialized \- Check for initialization\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_is_initialized( C_INT level )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_is_initialized\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_library_init.3000066400000000000000000000007351502707512200207500ustar00rootroot00000000000000.TH "PAPIF_library_init" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_library_init \- Initialize the PAPI library\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_library_init( C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_library_init\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPIF_lock.3000066400000000000000000000007321502707512200172060ustar00rootroot00000000000000.TH "PAPIF_lock" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_lock \- Lock one of two mutex variables defined in \fBpapi\&.h\fP\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_lock( C_INT lock )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_lock\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_multiplex_init.3000066400000000000000000000007711502707512200213270ustar00rootroot00000000000000.TH "PAPIF_multiplex_init" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_multiplex_init \- Initialize multiplex support in the PAPI library\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_multiplex_init( C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_multiplex_init\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_num_cmp_hwctrs.3000066400000000000000000000010601502707512200213010ustar00rootroot00000000000000.TH "PAPIF_num_cmp_hwctrs" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_num_cmp_hwctrs \- Return the number of hardware counters on the specified component\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_num_cmp_hwctrs( C_INT cidx, C_INT num )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_num_hwctrs\fP .PP \fBPAPI_num_cmp_hwctrs\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPIF_num_events.3000066400000000000000000000007601502707512200204420ustar00rootroot00000000000000.TH "PAPIF_num_events" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_num_events \- Return the number of events in an event set\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_num_events(C_INT EventSet, C_INT count)\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_num_events\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_num_hwctrs.3000066400000000000000000000010101502707512200204350ustar00rootroot00000000000000.TH "PAPIF_num_hwctrs" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_num_hwctrs \- Return the number of hardware counters on the cpu\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_num_hwctrs( C_INT num )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_num_hwctrs\fP .PP \fBPAPI_num_cmp_hwctrs\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_perror.3000066400000000000000000000007651502707512200175730ustar00rootroot00000000000000.TH "PAPIF_perror" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_perror \- Convert PAPI error codes to strings, and print error message to stderr\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_perror( C_STRING message )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_perror\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPIF_query_event.3000066400000000000000000000007471502707512200206320ustar00rootroot00000000000000.TH "PAPIF_query_event" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_query_event \- Query if PAPI event exists\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_query_event(C_INT EventCode, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_query_event\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_query_named_event.3000066400000000000000000000010101502707512200217560ustar00rootroot00000000000000.TH "PAPIF_query_named_event" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_query_named_event \- Query if named PAPI event exists\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_query_named_event(C_STRING EventName, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_query_named_event\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_rate_stop.3000066400000000000000000000007411502707512200202560ustar00rootroot00000000000000.TH "PAPIF_rate_stop" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_rate_stop \- Stop a running event set of a rate function\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_rate_stop( C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_rate_stop\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPIF_read.3000066400000000000000000000007601502707512200171720ustar00rootroot00000000000000.TH "PAPIF_read" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_read \- Read hardware counters from an event set\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_read(C_INT EventSet, C_LONG_LONG(*) values, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_read\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_read_ts.3000066400000000000000000000010211502707512200176670ustar00rootroot00000000000000.TH "PAPIF_read_ts" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_read_ts \- Read hardware counters with a timestamp\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_read_ts\fP(C_INT EventSet, C_LONG_LONG(*) values, C_LONG_LONG(*) cycles, C_INT check) .RE .PP \fBSee also\fP .RS 4 \fBPAPI_read_ts\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_register_thread.3000066400000000000000000000007661502707512200214400ustar00rootroot00000000000000.TH "PAPIF_register_thread" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_register_thread \- Notify PAPI that a thread has 'appeared'\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_register_thread( C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_register_thread\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPIF_remove_event.3000066400000000000000000000010201502707512200207430ustar00rootroot00000000000000.TH "PAPIF_remove_event" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_remove_event \- Remove a hardware event from a PAPI event set\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_remove_event( C_INT EventSet, C_INT EventCode, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_remove_event\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_remove_events.3000066400000000000000000000010651502707512200211370ustar00rootroot00000000000000.TH "PAPIF_remove_events" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_remove_events \- Remove an array of hardware event codes from a PAPI event set\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Prototype:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIF_remove_events( C_INT EventSet, C_INT(*) EventCode, C_INT number, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_remove_events\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIF_remove_named_event.3000066400000000000000000000010611502707512200221140ustar00rootroot00000000000000.TH "PAPIF_remove_named_event" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIF_remove_named_event \- Remove a named hardware event from a PAPI event set\&. 
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Interface:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_remove_named_event( C_INT EventSet, C_STRING EventName, C_INT check )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_remove_named_event\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_reset.3
.TH "PAPIF_reset" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_reset \- Reset the hardware event counts in an event set\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Prototype:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_reset( C_INT EventSet, C_INT check )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_reset\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_set_cmp_domain.3
.TH "PAPIF_set_cmp_domain" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_set_cmp_domain \- Set the default counting domain for new event sets bound to the specified component\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Prototype:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_set_cmp_domain( C_INT domain, C_INT cidx, C_INT check )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_set_cmp_domain\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_set_cmp_granularity.3
.TH "PAPIF_set_cmp_granularity" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_set_cmp_granularity \- Set the default counting granularity for eventsets bound to the specified component\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Prototype:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_set_cmp_granularity( C_INT granularity, C_INT cidx, C_INT check )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_set_cmp_granularity\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_set_debug.3
.TH "PAPIF_set_debug" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_set_debug \- Set the current debug level for error output from PAPI\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Prototype:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_set_debug( C_INT level, C_INT check )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_set_debug\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_set_domain.3
.TH "PAPIF_set_domain" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_set_domain \- Set the default counting domain for new event sets bound to the cpu component\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Prototype:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_set_domain( C_INT domain, C_INT check )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_set_domain\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_set_event_domain.3
.TH "PAPIF_set_event_domain" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_set_event_domain \- Set the default counting domain for specified EventSet\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Prototype:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_set_event_domain( C_INT EventSet, C_INT domain, C_INT check )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_set_domain\fP
.PP
\fBPAPI_set_opt\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_set_granularity.3
.TH "PAPIF_set_granularity" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_set_granularity \- Set the default counting granularity for eventsets bound to the cpu component\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Prototype:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_set_granularity( C_INT granularity, C_INT check )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_set_granularity\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_set_inherit.3
.TH "PAPIF_set_inherit" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_set_inherit \- Turn on inheriting of counts from daughter to parent process\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Prototype:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_set_inherit( C_INT inherit, C_INT check )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_set_opt\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_set_multiplex.3
.TH "PAPIF_set_multiplex" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_set_multiplex \- Convert a standard event set to a multiplexed event set\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Interface:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_set_multiplex( C_INT EventSet, C_INT check )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_set_multiplex\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_shutdown.3
.TH "PAPIF_shutdown" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_shutdown \- Finish using PAPI and free all related resources\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Prototype:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_shutdown( )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_shutdown\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_start.3
.TH "PAPIF_start" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_start \- Start counting hardware events in an event set\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Interface:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_start( C_INT EventSet, C_INT check )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_start\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_state.3
.TH "PAPIF_state" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_state \- Return the counting state of an EventSet\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Interface:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_state(C_INT EventSet, C_INT status, C_INT check )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_state\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_stop.3
.TH "PAPIF_stop" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_stop \- Stop counting hardware events in an EventSet\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Interface:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_stop( C_INT EventSet, C_LONG_LONG(*) values, C_INT check )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_stop\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_thread_id.3
.TH "PAPIF_thread_id" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_thread_id \- Get the thread identifier of the current thread\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Interface:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_thread_id( C_INT id )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_thread_id\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_thread_init.3
.TH "PAPIF_thread_init" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_thread_init \- Initialize thread support in the PAPI library\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Interface:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_thread_init( C_INT FUNCTION handle, C_INT check )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_thread_init\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_unlock.3
.TH "PAPIF_unlock" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_unlock \- Unlock one of the mutex variables defined in \fBpapi\&.h\fP\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Interface:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_unlock( C_INT lock )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_unlock\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_unregister_thread.3
.TH "PAPIF_unregister_thread" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_unregister_thread \- Notify PAPI that a thread has 'disappeared'\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Interface:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_unregister_thread( C_INT check )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_unregister_thread\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPIF_write.3
.TH "PAPIF_write" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPIF_write \- Write counter values into counters\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBFortran Interface:\fP
.RS 4
#include 'fpapi\&.h'
.br
\fBPAPIF_write( C_INT EventSet, C_LONG_LONG(*) values, C_INT check )\fP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_write\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPI_accum.3
.TH "PAPI_accum" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPI_accum \- Accumulate and reset counters in an EventSet\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBC Interface:\fP
.RS 4
#include <\fBpapi\&.h\fP>
.br
int \fBPAPI_accum( int EventSet, long_long * values )\fP;
.RE
.PP
These calls assume an initialized PAPI library and a properly added event set\&. \fBPAPI_accum\fP adds the counters of the indicated event set into the array values\&. The counters are zeroed and continue counting after the operation\&. Note the difference between \fBPAPI_read\fP and \fBPAPI_accum\fP: \fBPAPI_accum\fP resets the hardware counters to zero after adding them into values\&.
.PP
\fBParameters\fP
.RS 4
\fIEventSet\fP an integer handle for a PAPI Event Set as created by \fBPAPI_create_eventset\fP
.br
\fI*values\fP an array to hold the counter values of the counting events
.RE
.PP
\fBReturn values\fP
.RS 4
\fIPAPI_EINVAL\fP One or more of the arguments is invalid\&.
.br
\fIPAPI_ESYS\fP A system or C library call failed inside PAPI, see the errno variable\&.
.br
\fIPAPI_ENOEVST\fP The event set specified does not exist\&.
.RE
.PP
\fBExamples:\fP
.RS 4
.PP
.nf
do_100events( );
if ( PAPI_read( EventSet, values ) != PAPI_OK )
    handle_error( 1 );
// values[0] now equals 100
do_100events( );
if ( PAPI_accum( EventSet, values ) != PAPI_OK )
    handle_error( 1 );
// values[0] now equals 200
values[0] = \-100;
do_100events( );
if ( PAPI_accum( EventSet, values ) != PAPI_OK )
    handle_error( 1 );
// values[0] now equals 0
.fi
.PP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPIF_accum\fP
.PP
\fBPAPI_start\fP
.PP
\fBPAPI_set_opt\fP
.PP
\fBPAPI_reset\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPI_add_event.3
.TH "PAPI_add_event" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPI_add_event \- add PAPI preset or native hardware event to an event set
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBC Interface:\fP
.RS 4
#include <\fBpapi\&.h\fP>
.br
int PAPI_add_event( int EventSet, int EventCode );
.RE
.PP
\fBPAPI_add_event\fP adds one event to a PAPI Event Set\&.
.br
A hardware event can be either a PAPI preset or a native hardware event code\&. For a list of PAPI preset events, see PAPI_presets or run the avail test case in the PAPI distribution\&. PAPI presets can be passed to \fBPAPI_query_event\fP to see if they exist on the underlying architecture\&. For a list of native events available on the current platform, run the papi_native_avail utility in the PAPI distribution\&. For the encoding of native events, see \fBPAPI_event_name_to_code\fP to learn how to generate the native event code for a supported native event on the underlying architecture\&.
.PP
\fBParameters\fP
.RS 4
\fIEventSet\fP An integer handle for a PAPI Event Set as created by \fBPAPI_create_eventset\fP\&.
.br
\fIEventCode\fP A defined event such as PAPI_TOT_INS\&.
.RE
.PP
\fBReturn values\fP
.RS 4
\fIPositive-Integer\fP The number of consecutive elements that succeeded before the error\&.
.br
\fIPAPI_EINVAL\fP One or more of the arguments is invalid\&.
.br
\fIPAPI_ENOMEM\fP Insufficient memory to complete the operation\&.
.br
\fIPAPI_ENOEVST\fP The event set specified does not exist\&.
.br
\fIPAPI_EISRUN\fP The event set is currently counting events\&.
.br
\fIPAPI_ECNFLCT\fP The underlying counter hardware can not count this event and other events in the event set simultaneously\&.
.br
\fIPAPI_ENOEVNT\fP The PAPI preset is not available on the underlying hardware\&.
.br
\fIPAPI_EBUG\fP Internal error, please send mail to the developers\&.
.br
\fIPAPI_EMULPASS\fP Event exists, but cannot be counted due to multiple passes required by hardware\&.
.RE
.PP
\fBExamples:\fP
.RS 4
.PP
.nf
int EventSet = PAPI_NULL;
unsigned int native = 0x0;
if ( PAPI_create_eventset( &EventSet ) != PAPI_OK )
    handle_error( 1 );
// Add Total Instructions Executed to our EventSet
if ( PAPI_add_event( EventSet, PAPI_TOT_INS ) != PAPI_OK )
    handle_error( 1 );
// Add native event PM_CYC to EventSet
if ( PAPI_event_name_to_code( "PM_CYC", &native ) != PAPI_OK )
    handle_error( 1 );
if ( PAPI_add_event( EventSet, native ) != PAPI_OK )
    handle_error( 1 );
.fi
.PP
.RE
.PP
.PP
\fBSee also\fP
.RS 4
\fBPAPI_cleanup_eventset\fP
.br
\fBPAPI_destroy_eventset\fP
.br
\fBPAPI_event_code_to_name\fP
.br
\fBPAPI_remove_events\fP
.br
\fBPAPI_query_event\fP
.br
PAPI_presets
.br
PAPI_native
.br
\fBPAPI_remove_event\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPI_add_events.3
.TH "PAPI_add_events" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPI_add_events \- add multiple PAPI presets or native hardware events to an event set
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBC Interface:\fP
.RS 4
#include <\fBpapi\&.h\fP>
.br
int PAPI_add_events( int EventSet, int * EventCodes, int number );
.RE
.PP
\fBPAPI_add_event\fP adds one event to a PAPI Event Set\&. \fBPAPI_add_events\fP does the same, but for an array of events\&.
.br
A hardware event can be either a PAPI preset or a native hardware event code\&. For a list of PAPI preset events, see PAPI_presets or run the avail test case in the PAPI distribution\&. PAPI presets can be passed to \fBPAPI_query_event\fP to see if they exist on the underlying architecture\&. For a list of native events available on the current platform, run the papi_native_avail utility in the PAPI distribution\&. For the encoding of native events, see \fBPAPI_event_name_to_code\fP to learn how to generate the native event code for a supported native event on the underlying architecture\&.
.PP
\fBParameters\fP
.RS 4
\fIEventSet\fP An integer handle for a PAPI Event Set as created by \fBPAPI_create_eventset\fP\&.
.br
\fI*EventCodes\fP An array of defined events\&.
.br
\fInumber\fP An integer indicating the number of events in the array *EventCodes\&. It should be noted that \fBPAPI_add_events\fP can partially succeed, exactly like \fBPAPI_remove_events\fP\&.
.RE
.PP
\fBReturn values\fP
.RS 4
\fIPositive-Integer\fP The number of consecutive elements that succeeded before the error\&.
.br
\fIPAPI_EINVAL\fP One or more of the arguments is invalid\&.
.br
\fIPAPI_ENOMEM\fP Insufficient memory to complete the operation\&.
.br
\fIPAPI_ENOEVST\fP The event set specified does not exist\&.
.br
\fIPAPI_EISRUN\fP The event set is currently counting events\&.
.br
\fIPAPI_ECNFLCT\fP The underlying counter hardware can not count this event and other events in the event set simultaneously\&.
.br
\fIPAPI_ENOEVNT\fP The PAPI preset is not available on the underlying hardware\&.
.br
\fIPAPI_EBUG\fP Internal error, please send mail to the developers\&.
.RE
.PP
\fBExamples:\fP
.RS 4
.PP
.nf
int EventSet = PAPI_NULL;
int events[2];
unsigned int native = 0x0;
if ( PAPI_create_eventset( &EventSet ) != PAPI_OK )
    handle_error( 1 );
events[0] = PAPI_TOT_INS;
if ( PAPI_event_name_to_code( "PM_CYC", &native ) != PAPI_OK )
    handle_error( 1 );
events[1] = (int) native;
// Add Total Instructions Executed and native event PM_CYC to our EventSet
if ( PAPI_add_events( EventSet, events, 2 ) != PAPI_OK )
    handle_error( 1 );
.fi
.PP
.RE
.PP
.PP
\fBSee also\fP
.RS 4
\fBPAPI_cleanup_eventset\fP
.br
\fBPAPI_destroy_eventset\fP
.br
\fBPAPI_event_code_to_name\fP
.br
\fBPAPI_remove_events\fP
.br
\fBPAPI_query_event\fP
.br
PAPI_presets
.br
PAPI_native
.br
\fBPAPI_remove_event\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPI_add_named_event.3
.TH "PAPI_add_named_event" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPI_add_named_event \- add PAPI preset or native hardware event by name to an EventSet
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBC Interface:\fP
.RS 4
#include <\fBpapi\&.h\fP>
.br
int PAPI_add_named_event( int EventSet, const char *EventName );
.RE
.PP
\fBPAPI_add_named_event\fP adds one event to a PAPI EventSet\&.
.br
A hardware event can be either a PAPI preset or a native hardware event code\&. For a list of PAPI preset events, see PAPI_presets or run the avail test case in the PAPI distribution\&.
PAPI presets can be passed to \fBPAPI_query_event\fP to see if they exist on the underlying architecture\&. For a list of native events available on the current platform, run the papi_native_avail utility in the PAPI distribution\&.
.PP
\fBParameters\fP
.RS 4
\fIEventSet\fP An integer handle for a PAPI Event Set as created by \fBPAPI_create_eventset\fP\&.
.br
\fIEventName\fP The name of a defined event, such as 'PAPI_TOT_INS'\&.
.RE
.PP
\fBReturn values\fP
.RS 4
\fIPositive-Integer\fP The number of consecutive elements that succeeded before the error\&.
.br
\fIPAPI_EINVAL\fP One or more of the arguments is invalid\&.
.br
\fIPAPI_ENOINIT\fP The PAPI library has not been initialized\&.
.br
\fIPAPI_ENOMEM\fP Insufficient memory to complete the operation\&.
.br
\fIPAPI_ENOEVST\fP The event set specified does not exist\&.
.br
\fIPAPI_EISRUN\fP The event set is currently counting events\&.
.br
\fIPAPI_ECNFLCT\fP The underlying counter hardware can not count this event and other events in the event set simultaneously\&.
.br
\fIPAPI_ENOEVNT\fP The PAPI preset is not available on the underlying hardware\&.
.br
\fIPAPI_EBUG\fP Internal error, please send mail to the developers\&.
.br
\fIPAPI_EMULPASS\fP Event exists, but cannot be counted due to multiple passes required by hardware\&.
.RE
.PP
\fBExamples:\fP
.RS 4
.PP
.nf
const char *EventName = "PAPI_TOT_INS";
int EventSet = PAPI_NULL;
if ( PAPI_create_eventset( &EventSet ) != PAPI_OK )
    handle_error( 1 );
// Add Total Instructions Executed to our EventSet
if ( PAPI_add_named_event( EventSet, EventName ) != PAPI_OK )
    handle_error( 1 );
// Add native event PM_CYC to EventSet
if ( PAPI_add_named_event( EventSet, "PM_CYC" ) != PAPI_OK )
    handle_error( 1 );
.fi
.PP
.RE
.PP
.PP
\fBSee also\fP
.RS 4
\fBPAPI_add_event\fP
.br
\fBPAPI_query_named_event\fP
.br
\fBPAPI_remove_named_event\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPI_addr_range_option_t.3
.TH "PAPI_addr_range_option_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPI_addr_range_option_t \- address range specification for range-restricted counting\&. If both start and end are zero, the range is disabled\&.
.br
.SH SYNOPSIS
.br
.PP
.PP
\fR#include <papi\&.h>\fP
.SS "Data Fields"
.in +1c
.ti -1c
.RI "int \fBeventset\fP"
.br
.ti -1c
.RI "vptr_t \fBstart\fP"
.br
.ti -1c
.RI "vptr_t \fBend\fP"
.br
.ti -1c
.RI "int \fBstart_off\fP"
.br
.ti -1c
.RI "int \fBend_off\fP"
.br
.in -1c
.SH "Field Documentation"
.PP
.SS "vptr_t PAPI_addr_range_option_t::end"
user requested end address of an address range
.SS "int PAPI_addr_range_option_t::end_off"
hardware specified offset from end address
.SS "int PAPI_addr_range_option_t::eventset"
eventset to restrict
.SS "vptr_t PAPI_addr_range_option_t::start"
user requested start address of an address range
.SS "int PAPI_addr_range_option_t::start_off"
hardware specified offset from start address
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPI_address_map_t.3
.TH "PAPI_address_map_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPI_address_map_t \- get the executable's address space info
.SH SYNOPSIS
.br
.PP
.PP
\fR#include <papi\&.h>\fP
.SS "Data Fields"
.in +1c
.ti -1c
.RI "char \fBname\fP [1024]"
.br
.ti -1c
.RI "vptr_t \fBtext_start\fP"
.br
.ti -1c
.RI "vptr_t \fBtext_end\fP"
.br
.ti -1c
.RI "vptr_t \fBdata_start\fP"
.br
.ti -1c
.RI "vptr_t \fBdata_end\fP"
.br
.ti -1c
.RI "vptr_t \fBbss_start\fP"
.br
.ti -1c
.RI "vptr_t \fBbss_end\fP"
.br
.in -1c
.SH "Field Documentation"
.PP
.SS "vptr_t PAPI_address_map_t::bss_end"
End address of program bss segment
.SS "vptr_t PAPI_address_map_t::bss_start"
Start address of program bss segment
.SS "vptr_t PAPI_address_map_t::data_end"
End address of program data segment
.SS "vptr_t PAPI_address_map_t::data_start"
Start address of program data segment
.SS "vptr_t PAPI_address_map_t::text_end"
End address of program text segment
.SS "vptr_t PAPI_address_map_t::text_start"
Start address of program text segment
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPI_all_thr_spec_t.3
.TH "PAPI_all_thr_spec_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPI_all_thr_spec_t
.SH SYNOPSIS
.br
.PP
.SS "Data Fields"
.in +1c
.ti -1c
.RI "int \fBnum\fP"
.br
.ti -1c
.RI "PAPI_thread_id_t * \fBid\fP"
.br
.ti -1c
.RI "void ** \fBdata\fP"
.br
.in -1c
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPI_assign_eventset_component.3
.TH "PAPI_assign_eventset_component" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPI_assign_eventset_component \- Assign a component index to an existing but empty EventSet\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBC Interface:\fP
.RS 4
#include <\fBpapi\&.h\fP>
.br
int PAPI_assign_eventset_component( int EventSet, int cidx );
.RE
.PP
\fBParameters\fP
.RS 4
\fIEventSet\fP An integer identifier for an existing EventSet\&.
.br
\fIcidx\fP An integer identifier for a component\&. By convention, component 0 is always the cpu component\&.
.RE
.PP
\fBReturn values\fP
.RS 4
\fIPAPI_ENOCMP\fP The argument cidx is not a valid component\&.
.br
\fIPAPI_ENOEVST\fP The EventSet doesn't exist\&.
.br
\fIPAPI_ENOMEM\fP Insufficient memory to complete the operation\&.
.RE
.PP
\fBPAPI_assign_eventset_component\fP assigns a specific component index, as specified by cidx, to a new EventSet identified by EventSet, as obtained from \fBPAPI_create_eventset\fP\&. EventSets are ordinarily automatically bound to components when the first event is added\&. This routine is useful to explicitly bind an EventSet to a component before setting component related options\&.
.PP
\fBExamples:\fP
.RS 4
.PP
.nf
int EventSet = PAPI_NULL;
if ( PAPI_create_eventset( &EventSet ) != PAPI_OK )
    handle_error( 1 );
// Bind our EventSet to the cpu component
if ( PAPI_assign_eventset_component( EventSet, 0 ) != PAPI_OK )
    handle_error( 1 );
// Convert our EventSet to multiplexing
if ( PAPI_set_multiplex( EventSet ) != PAPI_OK )
    handle_error( 1 );
.fi
.PP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_set_opt\fP
.br
\fBPAPI_create_eventset\fP
.br
\fBPAPI_add_events\fP
.br
\fBPAPI_set_multiplex\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPI_attach.3
.TH "PAPI_attach" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPI_attach \- Attach PAPI event set to the specified thread id\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBC Interface:\fP
.RS 4
#include <\fBpapi\&.h\fP>
.br
int PAPI_attach( int EventSet, unsigned long tid );
.RE
.PP
\fBPAPI_attach\fP is a wrapper function that calls \fBPAPI_set_opt\fP to allow PAPI to monitor performance counts on a thread other than the one currently executing\&. This is sometimes referred to as third party monitoring\&. \fBPAPI_attach\fP connects the specified EventSet to the specified thread; \fBPAPI_detach\fP breaks that connection and restores the EventSet to the original executing thread\&.
.PP
\fBParameters\fP
.RS 4
\fIEventSet\fP An integer handle for a PAPI EventSet as created by \fBPAPI_create_eventset\fP\&.
.br
\fItid\fP A thread id as obtained from, for example, \fBPAPI_list_threads\fP or \fBPAPI_thread_id\fP\&.
.RE
.PP
\fBReturn values\fP
.RS 4
\fIPAPI_ECMP\fP This feature is unsupported on this component\&.
.br
\fIPAPI_EINVAL\fP One or more of the arguments is invalid\&.
.br
\fIPAPI_ENOEVST\fP The event set specified does not exist\&.
.br
\fIPAPI_EISRUN\fP The event set is currently counting events\&.
.RE
.PP
\fBExamples:\fP
.RS 4
.PP
.nf
int EventSet = PAPI_NULL;
unsigned long pid;
pid = fork( );
if ( pid <= 0 )
    exit( 1 );
if ( PAPI_create_eventset( &EventSet ) != PAPI_OK )
    exit( 1 );
// Add Total Instructions Executed to our EventSet
if ( PAPI_add_event( EventSet, PAPI_TOT_INS ) != PAPI_OK )
    exit( 1 );
// Attach this EventSet to the forked process
if ( PAPI_attach( EventSet, pid ) != PAPI_OK )
    exit( 1 );
.fi
.PP
.RE
.PP
\fBSee also\fP
.RS 4
\fBPAPI_set_opt\fP
.PP
\fBPAPI_list_threads\fP
.PP
\fBPAPI_thread_id\fP
.PP
\fBPAPI_thread_init\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPI_attach_option_t.3
.TH "PAPI_attach_option_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPI_attach_option_t
.SH SYNOPSIS
.br
.PP
.SS "Data Fields"
.in +1c
.ti -1c
.RI "int \fBeventset\fP"
.br
.ti -1c
.RI "unsigned long \fBtid\fP"
.br
.in -1c
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPI_cleanup_eventset.3
.TH "PAPI_cleanup_eventset" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPI_cleanup_eventset \- Empty and destroy an EventSet\&.
.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP
.PP
\fBC Interface:\fP
.RS 4
#include <\fBpapi\&.h\fP>
.br
int PAPI_cleanup_eventset( int EventSet );
.RE
.PP
\fBPAPI_cleanup_eventset\fP removes all events from a PAPI event set and turns off profiling and overflow for all events in the EventSet\&. This cannot be called unless the EventSet has been stopped\&.
.PP
\fBParameters\fP
.RS 4
\fIEventSet\fP An integer handle for a PAPI event set as created by \fBPAPI_create_eventset\fP\&.
.RE
.PP
\fBReturn values\fP
.RS 4
\fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. Attempting to destroy a non-empty event set or passing in a null pointer to be destroyed\&.
.br
\fIPAPI_ENOEVST\fP The EventSet specified does not exist\&.
.br
\fIPAPI_EISRUN\fP The EventSet is currently counting events\&.
.br
\fIPAPI_EBUG\fP Internal error, send mail to ptools-perfapi@icl.utk.edu and complain\&.
.RE
.PP
\fBExamples:\fP
.RS 4
.PP
.nf
// Remove all events in the eventset
if ( PAPI_cleanup_eventset( EventSet ) != PAPI_OK )
    handle_error( 1 );
.fi
.PP
.RE
.PP
.PP
\fBSee also\fP
.RS 4
\fBPAPI_profil\fP
.br
\fBPAPI_create_eventset\fP
.br
\fBPAPI_add_event\fP
.br
\fBPAPI_stop\fP
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen for PAPI from the source code\&.
.\" papi-papi-7-2-0-t/man/man3/PAPI_component_info_t.3
.TH "PAPI_component_info_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPI_component_info_t
.SH SYNOPSIS
.br
.PP
.SS "Data Fields"
.in +1c
.ti -1c
.RI "char \fBname\fP [128]"
.br
.ti -1c
.RI "char \fBshort_name\fP [64]"
.br
.ti -1c
.RI "char \fBdescription\fP [128]"
.br
.ti -1c
.RI "char \fBversion\fP [64]"
.br
.ti -1c
.RI "char \fBsupport_version\fP [64]"
.br
.ti -1c
.RI "char \fBkernel_version\fP [64]"
.br
.ti -1c
.RI "char \fBdisabled_reason\fP [1024]"
.br
.ti -1c
.RI "int \fBdisabled\fP"
.br
.ti -1c
.RI "char \fBpartially_disabled_reason\fP [1024]"
.br
.ti -1c
.RI "int \fBpartially_disabled\fP"
.br
.ti -1c
.RI "int \fBinitialized\fP"
.br
.ti -1c
.RI "int \fBCmpIdx\fP"
.br
.ti -1c
.RI "int \fBnum_cntrs\fP"
.br
.ti -1c
.RI "int \fBnum_mpx_cntrs\fP"
.br
.ti -1c
.RI "int \fBnum_preset_events\fP"
.br
.ti -1c
.RI "int \fBnum_native_events\fP"
.br
.ti -1c
.RI "int \fBdefault_domain\fP"
.br
.ti -1c
.RI "int \fBavailable_domains\fP"
.br
.ti -1c
.RI "int \fBdefault_granularity\fP"
.br
.ti -1c
.RI "int \fBavailable_granularities\fP"
.br
.ti -1c
.RI "int \fBhardware_intr_sig\fP"
.br
.ti -1c
.RI "int \fBcomponent_type\fP"
.br
.ti -1c
.RI "char * \fBpmu_names\fP [80]"
.br
.ti -1c
.RI "int \fBreserved\fP [8]"
.br
.ti -1c
.RI "unsigned int \fBhardware_intr\fP:1"
.br
.ti -1c
.RI "unsigned int \fBprecise_intr\fP:1"
.br
.ti -1c
.RI "unsigned int \fBposix1b_timers\fP:1"
.br
.ti -1c
.RI "unsigned int \fBkernel_profile\fP:1"
.br
.ti -1c
.RI "unsigned int \fBkernel_multiplex\fP:1"
.br
.ti -1c
.RI "unsigned int \fBfast_counter_read\fP:1"
.br
.ti -1c
.RI "unsigned int \fBfast_real_timer\fP:1"
.br
.ti -1c
.RI "unsigned int \fBfast_virtual_timer\fP:1"
.br
.ti -1c
.RI "unsigned int \fBattach\fP:1"
.br
.ti -1c
.RI "unsigned int \fBattach_must_ptrace\fP:1"
.br
.ti -1c
.RI "unsigned int \fBcntr_umasks\fP:1"
.br
.ti -1c
.RI "unsigned int \fBcpu\fP:1"
.br
.ti -1c
.RI "unsigned int \fBinherit\fP:1"
.br
.ti -1c
.RI "unsigned int \fBreserved_bits\fP:19"
.br
.in -1c
.SH "Field Documentation"
.PP
.SS "unsigned int PAPI_component_info_t::attach"
Supports attach
.SS "unsigned int PAPI_component_info_t::attach_must_ptrace"
Attach must first ptrace and stop the thread/process
.SS "int PAPI_component_info_t::available_domains"
Available domains
.SS "int PAPI_component_info_t::available_granularities"
Available granularities
.SS "int PAPI_component_info_t::CmpIdx"
Index into the vector array for this component; set at init time
.SS "unsigned int PAPI_component_info_t::cntr_umasks"
counters have unit masks
.SS "int PAPI_component_info_t::component_type"
Type of component
.SS "unsigned int PAPI_component_info_t::cpu"
Supports specifying cpu number to use with event set
.SS "int PAPI_component_info_t::default_domain"
The default domain when this component is used
.SS "int PAPI_component_info_t::default_granularity"
The default granularity when this component is used
.SS "char PAPI_component_info_t::description[128]"
Description of the component
.SS "int PAPI_component_info_t::disabled"
0 if enabled, otherwise error code from initialization
.SS "char PAPI_component_info_t::disabled_reason[1024]"
Reason for failure of initialization
.SS "unsigned int PAPI_component_info_t::fast_counter_read"
Supports a user level PMC read instruction
.SS "unsigned int PAPI_component_info_t::fast_real_timer"
Supports a fast real timer
.SS "unsigned int PAPI_component_info_t::fast_virtual_timer"
Supports a fast virtual timer
.SS "unsigned int PAPI_component_info_t::hardware_intr"
hw overflow intr, does not need to be emulated in software
.SS "int PAPI_component_info_t::hardware_intr_sig"
Signal used by hardware to deliver PMC events
.SS "unsigned int PAPI_component_info_t::inherit"
Supports child processes inheriting parents counters
.SS "int PAPI_component_info_t::initialized"
Component is ready to use
.SS "unsigned int PAPI_component_info_t::kernel_multiplex"
In kernel multiplexing
.SS "unsigned int PAPI_component_info_t::kernel_profile"
Has kernel profiling support (buffered interrupts or sprofil-like)
.SS "char PAPI_component_info_t::kernel_version[64]"
Version of the kernel PMC support driver
.SS "char PAPI_component_info_t::name[128]"
Name of the component we're using
.SS "int PAPI_component_info_t::num_cntrs"
Number of hardware counters the component supports
.SS "int PAPI_component_info_t::num_mpx_cntrs"
Number of hardware counters the component or PAPI can multiplex
.SS "int PAPI_component_info_t::num_native_events"
Number of native events the component supports
.SS "int PAPI_component_info_t::num_preset_events"
Number of preset events the component supports
.SS "int PAPI_component_info_t::partially_disabled"
1 if component is partially disabled, 0 otherwise
.SS "char PAPI_component_info_t::partially_disabled_reason[1024]"
Reason for partial initialization
.SS "char* PAPI_component_info_t::pmu_names[80]"
list of pmu names supported by this component
.SS "unsigned int PAPI_component_info_t::posix1b_timers"
Using POSIX 1b interval timers (timer_create)
instead of setitimer .SS "unsigned int PAPI_component_info_t::precise_intr" Performance interrupts happen precisely .SS "char PAPI_component_info_t::short_name[64]" Short name of component, to be prepended to event names .SS "char PAPI_component_info_t::support_version[64]" Version of the support library .SS "char PAPI_component_info_t::version[64]" Version of this component .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_cpu_option_t.3000066400000000000000000000005451502707512200206540ustar00rootroot00000000000000.TH "PAPI_cpu_option_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_cpu_option_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "int \fBeventset\fP" .br .ti -1c .RI "unsigned int \fBcpu_num\fP" .br .in -1c .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_create_eventset.3000066400000000000000000000034731502707512200213350ustar00rootroot00000000000000.TH "PAPI_create_eventset" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_create_eventset \- Create a new empty PAPI EventSet\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br PAPI_create_eventset( int * EventSet ); .RE .PP \fBPAPI_create_eventset\fP creates a new EventSet pointed to by EventSet, which must be initialized to PAPI_NULL before calling this routine\&. The user may then add hardware events to the event set by calling \fBPAPI_add_event\fP or similar routines\&. .PP \fBNote\fP .RS 4 PAPI-C uses a late binding model to bind EventSets to components\&. When an EventSet is first created it is not bound to a component\&. This will cause some API calls that modify EventSet options to fail\&. 
An EventSet can be bound to a component explicitly by calling \fBPAPI_assign_eventset_component\fP or implicitly by calling \fBPAPI_add_event\fP or similar routines\&. .RE .PP \fBParameters\fP .RS 4 \fI*EventSet\fP Address of an integer location to store the new EventSet handle\&. .RE .PP \fBExceptions\fP .RS 4 \fIPAPI_EINVAL\fP The argument handle has not been initialized to PAPI_NULL or the argument is a NULL pointer\&. .br \fIPAPI_ENOMEM\fP Insufficient memory to complete the operation\&. .RE .PP \fBExamples:\fP .RS 4 .PP .nf int EventSet = PAPI_NULL; if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) handle_error( 1 ); // Add Total Instructions Executed to our EventSet if ( PAPI_add_event( EventSet, PAPI_TOT_INS) != PAPI_OK ) handle_error( 1 ); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_add_event\fP .br \fBPAPI_assign_eventset_component\fP .br \fBPAPI_destroy_eventset\fP .br \fBPAPI_cleanup_eventset\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_debug_option_t.3000066400000000000000000000005561502707512200211550ustar00rootroot00000000000000.TH "PAPI_debug_option_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_debug_option_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "int \fBlevel\fP" .br .ti -1c .RI "PAPI_debug_handler_t \fBhandler\fP" .br .in -1c .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_destroy_eventset.3000066400000000000000000000027111502707512200215550ustar00rootroot00000000000000.TH "PAPI_destroy_eventset" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_destroy_eventset \- Empty and destroy an EventSet\&. 
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_destroy_eventset( int * EventSet ); .RE .PP \fBPAPI_destroy_eventset\fP deallocates the memory associated with an empty PAPI EventSet\&. .PP \fBParameters\fP .RS 4 \fI*EventSet\fP A pointer to the integer handle for a PAPI event set as created by \fBPAPI_create_eventset\fP\&. The value pointed to by EventSet is then set to PAPI_NULL on success\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. Attempting to destroy a non-empty event set or passing in a null pointer to be destroyed\&. .br \fIPAPI_ENOEVST\fP The EventSet specified does not exist\&. .br \fIPAPI_EISRUN\fP The EventSet is currently counting events\&. .br \fIPAPI_EBUG\fP Internal error, send mail to ptools-perfapi@icl.utk.edu and complain\&. .RE .PP \fBExamples:\fP .RS 4 .PP .nf // Free all memory and data structures, EventSet must be empty\&. if ( PAPI_destroy_eventset( &EventSet ) != PAPI_OK ) handle_error( 1 ); .fi .PP .RE .PP .PP \fBSee also\fP .RS 4 \fBPAPI_profil\fP .br \fBPAPI_create_eventset\fP .br \fBPAPI_add_event\fP .br \fBPAPI_stop\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_detach.3000066400000000000000000000037061502707512200174040ustar00rootroot00000000000000.TH "PAPI_detach" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_detach \- Detach PAPI event set from previously specified thread id and restore to executing thread\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int \fBPAPI_detach( int EventSet, unsigned long tid )\fP; .RE .PP \fBPAPI_detach\fP is a wrapper function that calls \fBPAPI_set_opt\fP to allow PAPI to monitor performance counts on a thread other than the one currently executing\&. 
This is sometimes referred to as third party monitoring\&. \fBPAPI_attach\fP connects the specified EventSet to the specified thread; \fBPAPI_detach\fP breaks that connection and restores the EventSet to the original executing thread\&. .PP \fBParameters\fP .RS 4 \fIEventSet\fP An integer handle for a PAPI EventSet as created by \fBPAPI_create_eventset\fP\&. .br \fItid\fP A thread id as obtained from, for example, \fBPAPI_list_threads\fP or \fBPAPI_thread_id\fP\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_ECMP\fP This feature is unsupported on this component\&. .br \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .br \fIPAPI_ENOEVST\fP The event set specified does not exist\&. .br \fIPAPI_EISRUN\fP The event set is currently counting events\&. .RE .PP \fBExamples:\fP .RS 4 .PP .nf int EventSet = PAPI_NULL; unsigned long pid; pid = fork( ); if ( pid <= 0 ) exit( 1 ); if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) exit( 1 ); // Add Total Instructions Executed to our EventSet if ( PAPI_add_event( EventSet, PAPI_TOT_INS ) != PAPI_OK ) exit( 1 ); // Attach this EventSet to the forked process if ( PAPI_attach( EventSet, pid ) != PAPI_OK ) exit( 1 ); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_set_opt\fP .br \fBPAPI_list_threads\fP .br \fBPAPI_thread_id\fP .br \fBPAPI_thread_init\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
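The PAPI_detach page's example stops after the PAPI_attach call and never detaches. A minimal sketch of the full attach/count/detach cycle is below; it collapses all error paths to exit(1) in the style of the page's own examples, and the child's workload (a sleep) is a stand-in for real work:

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <papi.h>

int main(void)
{
    int EventSet = PAPI_NULL;
    long long count;
    unsigned long pid;

    /* Initialize the PAPI library */
    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
        exit(1);

    pid = fork();
    if (pid == 0) {
        /* child: placeholder workload */
        sleep(1);
        exit(0);
    }

    if (PAPI_create_eventset(&EventSet) != PAPI_OK)
        exit(1);
    /* Add Total Instructions Executed to our EventSet */
    if (PAPI_add_event(EventSet, PAPI_TOT_INS) != PAPI_OK)
        exit(1);

    /* Attach the EventSet to the child, count, then detach */
    if (PAPI_attach(EventSet, pid) != PAPI_OK)
        exit(1);
    if (PAPI_start(EventSet) != PAPI_OK)
        exit(1);
    sleep(1);
    if (PAPI_stop(EventSet, &count) != PAPI_OK)
        exit(1);
    if (PAPI_detach(EventSet, pid) != PAPI_OK)
        exit(1);

    printf("child executed %lld instructions\n", count);
    return 0;
}
```

Whether third-party attach works at all depends on the component (see the attach and attach_must_ptrace bits in PAPI_component_info_t), so treat this as a sketch rather than a portable program.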
papi-papi-7-2-0-t/man/man3/PAPI_disable_component.3000066400000000000000000000023241502707512200216340ustar00rootroot00000000000000.TH "PAPI_disable_component" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_disable_component \- disables the specified component .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBReturn values\fP .RS 4 \fIENOCMP\fP component does not exist .br \fIENOINIT\fP cannot disable as PAPI has already been initialized .RE .PP \fBParameters\fP .RS 4 \fIcidx\fP component index of component to be disabled .RE .PP \fBExamples:\fP .RS 4 .PP .nf int cidx, result; cidx = PAPI_get_component_index("example"); if (cidx>=0) { result = PAPI_disable_component(cidx); if (result==PAPI_OK) printf("The example component is disabled\\n"); } // \&.\&.\&. PAPI_library_init(); .fi .PP PAPI_disable_component() allows the user to disable components before PAPI_library_init() time\&. This is useful if the user knows they do not wish to use events from that component and want to reduce the PAPI library overhead\&. .RE .PP PAPI_disable_component() must be called before PAPI_library_init()\&. .PP \fBSee also\fP .RS 4 \fBPAPI_get_event_component\fP .PP \fBPAPI_library_init\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_disable_component_by_name.3000066400000000000000000000023351502707512200233300ustar00rootroot00000000000000.TH "PAPI_disable_component_by_name" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_disable_component_by_name \- disables the named component .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBReturn values\fP .RS 4 \fIENOCMP\fP component does not exist .br \fIENOINIT\fP unable to disable the component, the library has already been initialized .RE .PP \fBParameters\fP .RS 4 \fIcomponent_name\fP name of the component to disable\&. 
.RE .PP \fBExample:\fP .RS 4 .PP .nf int result; result = PAPI_disable_component_by_name("example"); if (result==PAPI_OK) printf("component \\"example\\" has been disabled\\n"); //\&.\&.\&. PAPI_library_init(PAPI_VER_CURRENT); .fi .PP PAPI_disable_component_by_name() allows the user to disable a component before PAPI_library_init() time\&. This is useful if the user knows they do not wish to use events from that component and want to reduce the PAPI library overhead\&. .RE .PP PAPI_disable_component_by_name() must be called before PAPI_library_init()\&. .PP \fBSee also\fP .RS 4 \fBPAPI_library_init\fP .PP \fBPAPI_disable_component\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_dmem_info_t.3000066400000000000000000000015401502707512200204260ustar00rootroot00000000000000.TH "PAPI_dmem_info_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_dmem_info_t \- A pointer to the following is passed to PAPI_get_dmem_info() .SH SYNOPSIS .br .PP .PP \fR#include <papi\&.h>\fP .SS "Data Fields" .in +1c .ti -1c .RI "long long \fBpeak\fP" .br .ti -1c .RI "long long \fBsize\fP" .br .ti -1c .RI "long long \fBresident\fP" .br .ti -1c .RI "long long \fBhigh_water_mark\fP" .br .ti -1c .RI "long long \fBshared\fP" .br .ti -1c .RI "long long \fBtext\fP" .br .ti -1c .RI "long long \fBlibrary\fP" .br .ti -1c .RI "long long \fBheap\fP" .br .ti -1c .RI "long long \fBlocked\fP" .br .ti -1c .RI "long long \fBstack\fP" .br .ti -1c .RI "long long \fBpagesize\fP" .br .ti -1c .RI "long long \fBpte\fP" .br .in -1c .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&.
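The PAPI_dmem_info_t page lists the fields but gives no example. A minimal sketch of filling the structure with PAPI_get_dmem_info() follows; the units of the reported values depend on the platform, so consult papi.h for your build:

```c
#include <stdio.h>
#include <stdlib.h>
#include <papi.h>

int main(void)
{
    PAPI_dmem_info_t dmem;

    /* Initialize the PAPI library */
    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
        exit(1);

    /* Query the calling process's dynamic memory usage */
    if (PAPI_get_dmem_info(&dmem) != PAPI_OK)
        exit(1);

    printf("size:     %lld\n", dmem.size);
    printf("resident: %lld\n", dmem.resident);
    printf("heap:     %lld\n", dmem.heap);
    printf("pagesize: %lld\n", dmem.pagesize);
    return 0;
}
```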
papi-papi-7-2-0-t/man/man3/PAPI_domain_option_t.3000066400000000000000000000010201502707512200213210ustar00rootroot00000000000000.TH "PAPI_domain_option_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_domain_option_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "int \fBdef_cidx\fP" .br .ti -1c .RI "int \fBeventset\fP" .br .ti -1c .RI "int \fBdomain\fP" .br .in -1c .SH "Field Documentation" .PP .SS "int PAPI_domain_option_t::def_cidx" this structure requires a component index to set default domains .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_enum_cmp_event.3000066400000000000000000000104021502707512200211570ustar00rootroot00000000000000.TH "PAPI_enum_cmp_event" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_enum_cmp_event \- Enumerate PAPI preset or native events for a given component\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_enum_cmp_event( int *EventCode, int modifier, int cidx ); .RE .PP Given an event code, \fBPAPI_enum_cmp_event\fP replaces the event code with the next available event\&. .PP The modifier argument affects which events are returned\&. For all platforms and event types, a value of PAPI_ENUM_ALL (zero) directs the function to return all possible events\&. .br For native events, the effect of the modifier argument may be different on each platform\&. See the discussion below for platform-specific definitions\&. .PP \fBParameters\fP .RS 4 \fI*EventCode\fP A defined preset or native event such as PAPI_TOT_INS\&. .br \fImodifier\fP Modifies the search logic\&. See below for full list\&. For native events, each platform behaves differently\&. See platform-specific documentation for details\&.
.br \fIcidx\fP Specifies the component to search in .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_ENOEVNT\fP The next requested PAPI preset or native event is not available on the underlying hardware\&. .RE .PP \fBExamples:\fP .RS 4 .PP .nf // Scan for all supported native events on the first component printf( "Name\\t\\t\\t Code\\t Description\\n" ); do { retval = PAPI_get_event_info( i, &info ); if ( retval == PAPI_OK ) { printf( "%\-30s %#\-10x\\n%s\\n", info\&.symbol, info\&.event_code, info\&.long_descr ); } } while ( PAPI_enum_cmp_event( &i, PAPI_ENUM_ALL, 0 ) == PAPI_OK ); .fi .PP \fBGeneric Modifiers\fP .PP The following values are implemented for preset events .PD 0 .IP "\(bu" 2 PAPI_ENUM_EVENTS -- Enumerate all (default) .IP "\(bu" 2 PAPI_ENUM_FIRST -- Enumerate first event (preset or native) preset/native chosen based on type of EventCode .PP \fBNative Modifiers\fP .PP The following values are implemented for native events .PD 0 .IP "\(bu" 2 PAPI_NTV_ENUM_UMASKS -- Given an event, iterate through possible umasks one at a time .IP "\(bu" 2 PAPI_NTV_ENUM_UMASK_COMBOS -- Given an event, iterate through all possible combinations of umasks\&. This is not implemented on libpfm4\&.
.PP .RE .PP \fBPreset Modifiers\fP .RS 4 The following values are implemented for preset events .PD 0 .IP "\(bu" 2 PAPI_PRESET_ENUM_AVAIL -- enumerate only available presets .IP "\(bu" 2 PAPI_PRESET_ENUM_MSC -- Miscellaneous preset events .IP "\(bu" 2 PAPI_PRESET_ENUM_INS -- Instruction related preset events .IP "\(bu" 2 PAPI_PRESET_ENUM_IDL -- Stalled or Idle preset events .IP "\(bu" 2 PAPI_PRESET_ENUM_BR -- Branch related preset events .IP "\(bu" 2 PAPI_PRESET_ENUM_CND -- Conditional preset events .IP "\(bu" 2 PAPI_PRESET_ENUM_MEM -- Memory related preset events .IP "\(bu" 2 PAPI_PRESET_ENUM_CACH -- Cache related preset events .IP "\(bu" 2 PAPI_PRESET_ENUM_L1 -- L1 cache related preset events .IP "\(bu" 2 PAPI_PRESET_ENUM_L2 -- L2 cache related preset events .IP "\(bu" 2 PAPI_PRESET_ENUM_L3 -- L3 cache related preset events .IP "\(bu" 2 PAPI_PRESET_ENUM_TLB -- Translation Lookaside Buffer events .IP "\(bu" 2 PAPI_PRESET_ENUM_FP -- Floating Point related preset events .PP .RE .PP \fBITANIUM Modifiers\fP .RS 4 The following values are implemented for modifier on Itanium: .PD 0 .IP "\(bu" 2 PAPI_NTV_ENUM_IARR - Enumerate IAR (instruction address ranging) events .IP "\(bu" 2 PAPI_NTV_ENUM_DARR - Enumerate DAR (data address ranging) events .IP "\(bu" 2 PAPI_NTV_ENUM_OPCM - Enumerate OPC (opcode matching) events .IP "\(bu" 2 PAPI_NTV_ENUM_IEAR - Enumerate IEAR (instr event address register) events .IP "\(bu" 2 PAPI_NTV_ENUM_DEAR - Enumerate DEAR (data event address register) events .PP .RE .PP \fBPOWER Modifiers\fP .RS 4 The following values are implemented for POWER .PD 0 .IP "\(bu" 2 PAPI_NTV_ENUM_GROUPS - Enumerate groups to which an event belongs .PP .RE .PP \fBSee also\fP .RS 4 PAPI .br PAPIF .br \fBPAPI_enum_event\fP .br \fBPAPI_get_event_info\fP .br \fBPAPI_event_name_to_code\fP .br PAPI_preset .br PAPI_native .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPI_enum_dev_type.3000066400000000000000000000026061502707512200210150ustar00rootroot00000000000000.TH "PAPI_enum_dev_type" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_enum_dev_type \- returns handle of next device type .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBReturn values\fP .RS 4 \fIENOCMP\fP component does not exist .br \fIEINVAL\fP end of device type list .RE .PP \fBParameters\fP .RS 4 \fIenum_modifier\fP device type modifier, used to filter out enumerated device types .RE .PP \fBExample:\fP .RS 4 .PP .nf enum { PAPI_DEV_TYPE_ENUM__FIRST, PAPI_DEV_TYPE_ENUM__CPU, PAPI_DEV_TYPE_ENUM__CUDA, PAPI_DEV_TYPE_ENUM__ROCM, PAPI_DEV_TYPE_ENUM__ALL }; void *handle; const char *vendor_name; int enum_modifier = PAPI_DEV_TYPE_ENUM__CPU | PAPI_DEV_TYPE_ENUM__CUDA; while (PAPI_OK == PAPI_enum_dev_type(enum_modifier, &handle)) { PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__CHAR_NAME, &vendor_name); \&.\&.\&. } .fi .PP PAPI_enum_dev_type() allows the user to access all device types in the system\&. It takes an enumerator modifier that allows users to enumerate only devices of a predefined type and it returns an opaque handler that users can pass to other functions in order to query device type attributes\&. .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_dev_type_attr\fP .PP \fBPAPI_get_dev_attr\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_enum_event.3000066400000000000000000000061711502707512200203200ustar00rootroot00000000000000.TH "PAPI_enum_event" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_enum_event \- Enumerate PAPI preset or native events\&. 
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_enum_event( int * EventCode, int modifier ); .RE .PP Given a preset or native event code, \fBPAPI_enum_event\fP replaces the event code with the next available event in either the preset or native table\&. The modifier argument affects which events are returned\&. For all platforms and event types, a value of PAPI_ENUM_ALL (zero) directs the function to return all possible events\&. .br For preset events, a TRUE (non-zero) value currently directs the function to return event codes only for PAPI preset events available on this platform\&. This may change in the future\&. For native events, the effect of the modifier argument is different on each platform\&. See the discussion below for platform-specific definitions\&. .PP \fBParameters\fP .RS 4 \fI*EventCode\fP A defined preset or native event such as PAPI_TOT_INS\&. .br \fImodifier\fP Modifies the search logic\&. See below for full list\&. For native events, each platform behaves differently\&. See platform-specific documentation for details\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_ENOEVNT\fP The next requested PAPI preset or native event is not available on the underlying hardware\&.
.RE .PP \fBExamples:\fP .RS 4 .PP .nf // Scan for all supported native events on this platform printf( "Name\\t\\t\\t Code\\t Description\\n" ); do { retval = PAPI_get_event_info( i, &info ); if ( retval == PAPI_OK ) { printf( "%\-30s %#\-10x\\n%s\\n", info\&.symbol, info\&.event_code, info\&.long_descr ); } } while ( PAPI_enum_event( &i, PAPI_ENUM_ALL ) == PAPI_OK ); .fi .PP \fBGeneric Modifiers\fP .PP The following values are implemented for preset events .PD 0 .IP "\(bu" 2 PAPI_ENUM_EVENTS -- Enumerate all (default) .IP "\(bu" 2 PAPI_ENUM_FIRST -- Enumerate first event (preset or native) preset/native chosen based on type of EventCode .PP \fBNative Modifiers\fP .PP The following values are implemented for native events .PD 0 .IP "\(bu" 2 PAPI_NTV_ENUM_UMASKS -- Given an event, iterate through possible umasks one at a time .IP "\(bu" 2 PAPI_NTV_ENUM_UMASK_COMBOS -- Given an event, iterate through all possible combinations of umasks\&. This is not implemented on libpfm4\&. .PP .RE .PP \fBPreset Modifiers\fP .RS 4 The following values are implemented for preset events .PD 0 .IP "\(bu" 2 PAPI_PRESET_ENUM_AVAIL -- enumerate only available presets .IP "\(bu" 2 PAPI_PRESET_ENUM_CPU -- enumerate CPU preset events .IP "\(bu" 2 PAPI_PRESET_ENUM_CPU_AVAIL -- enumerate available CPU preset events .IP "\(bu" 2 PAPI_PRESET_ENUM_FIRST_COMP -- enumerate first component preset event .PP .RE .PP \fBSee also\fP .RS 4 PAPI .br PAPIF .br \fBPAPI_enum_cmp_event\fP .br \fBPAPI_get_event_info\fP .br \fBPAPI_event_name_to_code\fP .br PAPI_preset .br PAPI_native .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_epc.3000066400000000000000000000043661502707512200167260ustar00rootroot00000000000000.TH "PAPI_epc" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_epc \- Simplified call to get arbitrary events per cycle, real and processor time\&.
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface: \fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_epc( int event, float *rtime, float *ptime, long long *ref, long long *core, long long *evt, float *epc ); .RE .PP \fBParameters\fP .RS 4 \fIevent\fP event code to be measured (0 defaults to PAPI_TOT_INS) .br \fI*rtime\fP realtime since the latest call .br \fI*ptime\fP process time since the latest call .br \fI*ref\fP incremental reference clock cycles since the latest call .br \fI*core\fP incremental core clock cycles since the latest call .br \fI*evt\fP events since the latest call .br \fI*epc\fP incremental events per cycle since the latest call .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP The counters were already started by something other than PAPI_epc()\&. .br \fIPAPI_ENOEVNT\fP One of the requested events does not exist\&. .br \fIPAPI_ENOMEM\fP Insufficient memory to complete the operation\&. .RE .PP The first call to PAPI_epc() will initialize the PAPI interface, set up the counters to monitor the user specified event, PAPI_TOT_CYC, and PAPI_REF_CYC (if it exists) and start the counters\&. .PP Subsequent calls will read the counters and return real time, process time, event counts, the core and reference cycle count and EPC rate since the latest call to PAPI_epc()\&. .PP PAPI_epc() can provide a more detailed look at algorithm efficiency in light of clock variability in modern cpus\&. MFLOPS is no longer an adequate description of peak performance if clock rates can arbitrarily speed up or slow down\&. By allowing a user specified event and reporting reference cycles, core cycles and real time, \fBPAPI_epc\fP provides the information to compute an accurate effective clock rate, and an accurate measure of computational throughput\&. Note that PAPI_epc() is thread-safe and can therefore be called by multiple threads\&. 
.PP \fBSee also\fP .RS 4 PAPI_flips_rate() .PP PAPI_flops_rate() .PP PAPI_ipc() .PP PAPI_rate_stop() .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_event_code_to_name.3000066400000000000000000000037721502707512200217740ustar00rootroot00000000000000.TH "PAPI_event_code_to_name" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_event_code_to_name \- Convert a numeric hardware event code to a name\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_event_code_to_name( int EventCode, char * EventName ); .RE .PP \fBPAPI_event_code_to_name\fP is used to translate a 32-bit integer PAPI event code into an ASCII PAPI event name\&. Either Preset event codes or Native event codes can be passed to this routine\&. Native event codes and names differ from platform to platform\&. .PP \fBParameters\fP .RS 4 \fIEventCode\fP The numeric code for the event\&. .br \fI*EventName\fP A string containing the event name as listed in PAPI_presets or discussed in PAPI_native\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .br \fIPAPI_ENOTPRESET\fP The hardware event specified is not a valid PAPI preset\&. .br \fIPAPI_ENOEVNT\fP The hardware event is not available on the underlying hardware\&. 
.RE .PP \fBExamples:\fP .RS 4 .PP .nf int EventCode, EventSet = PAPI_NULL; int Event, number; char EventCodeStr[PAPI_MAX_STR_LEN]; // Create the EventSet if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) handle_error( 1 ); // Add Total Instructions Executed to our EventSet if ( PAPI_add_event( EventSet, PAPI_TOT_INS ) != PAPI_OK ) handle_error( 1 ); number = 1; if ( PAPI_list_events( EventSet, &Event, &number ) != PAPI_OK ) handle_error(1); // Convert integer code to name string if ( PAPI_event_code_to_name( Event, EventCodeStr ) != PAPI_OK ) handle_error( 1 ); printf( "Event Name: %s\\n", EventCodeStr ); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_event_name_to_code\fP .PP \fBPAPI_remove_event\fP .PP \fBPAPI_get_event_info\fP .PP \fBPAPI_enum_event\fP .PP \fBPAPI_add_event\fP .PP PAPI_presets .PP PAPI_native .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_event_info_t.3000066400000000000000000000066661502707512200206430ustar00rootroot00000000000000.TH "PAPI_event_info_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_event_info_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "unsigned int \fBevent_code\fP" .br .ti -1c .RI "char \fBsymbol\fP [1024]" .br .ti -1c .RI "char \fBshort_descr\fP [64]" .br .ti -1c .RI "char \fBlong_descr\fP [1024]" .br .ti -1c .RI "int \fBcomponent_index\fP" .br .ti -1c .RI "char \fBunits\fP [64]" .br .ti -1c .RI "int \fBlocation\fP" .br .ti -1c .RI "int \fBdata_type\fP" .br .ti -1c .RI "int \fBvalue_type\fP" .br .ti -1c .RI "int \fBtimescope\fP" .br .ti -1c .RI "int \fBupdate_type\fP" .br .ti -1c .RI "int \fBupdate_freq\fP" .br .ti -1c .RI "unsigned int \fBcount\fP" .br .ti -1c .RI "unsigned int \fBevent_type\fP" .br .ti -1c .RI "char \fBderived\fP [64]" .br .ti -1c .RI "char \fBpostfix\fP [256]" .br .ti -1c .RI "unsigned int \fBcode\fP [12]" .br .ti -1c .RI "char \fBname\fP [12][256]" .br .ti -1c 
.RI "char \fBnote\fP [1024]" .br .ti -1c .RI "int \fBnum_quals\fP" .br .ti -1c .RI "char \fBquals\fP [8][1024]" .br .ti -1c .RI "char \fBquals_descrs\fP [8][1024]" .br .in -1c .SH "Field Documentation" .PP .SS "unsigned int PAPI_event_info_t::code[12]" array of values that further describe the event: .IP "\(bu" 2 presets: native event_code values .IP "\(bu" 2 native: register values(?) .PP .SS "int PAPI_event_info_t::component_index" component this event belongs to .SS "unsigned int PAPI_event_info_t::count" number of terms (usually 1) in the code and name fields .IP "\(bu" 2 presets: these are native events .IP "\(bu" 2 native: these are unused .PP .SS "int PAPI_event_info_t::data_type" data type returned by PAPI .SS "char PAPI_event_info_t::derived[64]" name of the derived type .IP "\(bu" 2 presets: usually NOT_DERIVED .IP "\(bu" 2 native: empty string .PP .SS "unsigned int PAPI_event_info_t::event_code" preset (0x8xxxxxxx) or native (0x4xxxxxxx) event code .SS "unsigned int PAPI_event_info_t::event_type" event type or category for preset events only .SS "int PAPI_event_info_t::location" location event applies to .SS "char PAPI_event_info_t::long_descr[1024]" a longer description: typically a sentence for presets, possibly a paragraph from vendor docs for native events .SS "char PAPI_event_info_t::name[12][256]" names of code terms: .IP "\(bu" 2 presets: native event names, .IP " \(bu" 4 native: descriptive strings for each register value(?)
.PP .PP .SS "char PAPI_event_info_t::note[1024]" an optional developer note supplied with a preset event to delineate platform specific anomalies or restrictions .SS "int PAPI_event_info_t::num_quals" number of qualifiers .SS "char PAPI_event_info_t::postfix[256]" string containing postfix operations; only defined for preset events of derived type DERIVED_POSTFIX .SS "char PAPI_event_info_t::quals[8][1024]" qualifiers .SS "char PAPI_event_info_t::quals_descrs[8][1024]" qualifier descriptions .SS "char PAPI_event_info_t::short_descr[64]" a short description suitable for use as a label .SS "char PAPI_event_info_t::symbol[1024]" name of the event .SS "int PAPI_event_info_t::timescope" from start, etc\&. .SS "char PAPI_event_info_t::units[64]" units event is measured in .SS "int PAPI_event_info_t::update_freq" how frequently event is updated .SS "int PAPI_event_info_t::update_type" how event is updated .SS "int PAPI_event_info_t::value_type" sum or absolute .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_event_name_to_code.3000066400000000000000000000034001502707512200217600ustar00rootroot00000000000000.TH "PAPI_event_name_to_code" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_event_name_to_code \- Convert a name to a numeric hardware event code\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_event_name_to_code( const char * EventName, int * EventCode ); .RE .PP \fBPAPI_event_name_to_code\fP is used to translate an ASCII PAPI event name into an integer PAPI event code\&. .PP \fBParameters\fP .RS 4 \fI*EventCode\fP The numeric code for the event\&. .br \fI*EventName\fP A string containing the event name as listed in PAPI_presets or discussed in PAPI_native\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. 
.br \fIPAPI_ENOTPRESET\fP The hardware event specified is not a valid PAPI preset\&. .br \fIPAPI_ENOINIT\fP The PAPI library has not been initialized\&. .br \fIPAPI_ENOEVNT\fP The hardware event is not available on the underlying hardware\&. .RE .PP \fBExamples:\fP .RS 4 .PP .nf int EventCode, EventSet = PAPI_NULL; // Convert to integer if ( PAPI_event_name_to_code( "PAPI_TOT_INS", &EventCode ) != PAPI_OK ) handle_error( 1 ); // Create the EventSet if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) handle_error( 1 ); // Add Total Instructions Executed to our EventSet if ( PAPI_add_event( EventSet, EventCode ) != PAPI_OK ) handle_error( 1 ); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_event_code_to_name\fP .PP \fBPAPI_remove_event\fP .PP \fBPAPI_get_event_info\fP .PP \fBPAPI_enum_event\fP .PP \fBPAPI_add_event\fP .PP \fBPAPI_add_named_event\fP .PP PAPI_presets .PP PAPI_native .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_exe_info_t.3000066400000000000000000000011551502707512200202670ustar00rootroot00000000000000.TH "PAPI_exe_info_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_exe_info_t \- get the executable's info .SH SYNOPSIS .br .PP .PP \fR#include \fP .SS "Data Fields" .in +1c .ti -1c .RI "char \fBfullname\fP [1024]" .br .ti -1c .RI "\fBPAPI_address_map_t\fP \fBaddress_info\fP" .br .in -1c .SH "Field Documentation" .PP .SS "\fBPAPI_address_map_t\fP PAPI_exe_info_t::address_info" executable's address space info .SS "char PAPI_exe_info_t::fullname[1024]" path + name .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPI_flips_rate.3000066400000000000000000000036271502707512200203060ustar00rootroot00000000000000.TH "PAPI_flips_rate" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_flips_rate \- Simplified call to get Mflips/s (floating point instruction rate), real and processor time\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface: \fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_flips_rate( int event, float *rtime, float *ptime, long long *flpins, float *mflips ); .RE .PP \fBParameters\fP .RS 4 \fIevent\fP one of the three presets PAPI_FP_INS, PAPI_VEC_SP or PAPI_VEC_DP .br \fI*rtime\fP realtime since the latest call .br \fI*ptime\fP process time since the latest call .br \fI*flpins\fP floating point instructions since the latest call .br \fI*mflips\fP incremental (Mega) floating point instructions per seconds since the latest call .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP The counters were already started by something other than PAPI_flips_rate()\&. .br \fIPAPI_ENOEVNT\fP The floating point instructions event does not exist\&. .br \fIPAPI_ENOMEM\fP Insufficient memory to complete the operation\&. .RE .PP The first call to PAPI_flips_rate() will initialize the PAPI interface, set up the counters to monitor the floating point instructions event and start the counters\&. .PP Subsequent calls will read the counters and return real time, process time, floating point instructions and the Mflip/s rate since the latest call to PAPI_flips_rate()\&. .PP PAPI_flips_rate() returns information related to floating point instructions using the floating point instructions event\&. This is intended to measure instruction rate through the floating point pipe with no massaging\&. Note that PAPI_flips_rate() is thread-safe and can therefore be called by multiple threads\&. 
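The call sequence described above (first call initializes PAPI and starts the counters; later calls read and report) can be sketched as follows. This is an illustrative sketch, not part of the generated page; it assumes a working PAPI installation, and `compute_kernel()` is a hypothetical workload.

```c
#include <stdio.h>
#include <stdlib.h>
#include <papi.h>

extern void compute_kernel(void);   /* hypothetical workload to measure */

int main(void)
{
    float rtime, ptime, mflips;
    long long flpins;
    int ret;

    /* First call: initializes the PAPI interface, sets up and starts
       the floating point instructions counter */
    ret = PAPI_flips_rate(PAPI_FP_INS, &rtime, &ptime, &flpins, &mflips);
    if (ret != PAPI_OK) {
        fprintf(stderr, "PAPI_flips_rate: %s\n", PAPI_strerror(ret));
        exit(1);
    }

    compute_kernel();

    /* Subsequent call: reports values accumulated since the previous call */
    ret = PAPI_flips_rate(PAPI_FP_INS, &rtime, &ptime, &flpins, &mflips);
    if (ret != PAPI_OK) {
        fprintf(stderr, "PAPI_flips_rate: %s\n", PAPI_strerror(ret));
        exit(1);
    }
    printf("real %.3f s, proc %.3f s, flpins %lld, rate %.2f Mflip/s\n",
           rtime, ptime, flpins, mflips);
    return 0;
}
```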
.PP \fBSee also\fP .RS 4 PAPI_flops_rate() .PP PAPI_ipc() .PP PAPI_epc() .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_flops_rate.3000066400000000000000000000037201502707512200203060ustar00rootroot00000000000000.TH "PAPI_flops_rate" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_flops_rate \- Simplified call to get Mflops/s (floating point operation rate), real and processor time\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface: \fP .RS 4 #include <\fBpapi\&.h\fP> .br int \fBPAPI_flops_rate\fP ( int event, float *rtime, float *ptime, long long *flpops, float *mflops ); .RE .PP \fBParameters\fP .RS 4 \fIevent\fP one of the three presets PAPI_FP_OPS, PAPI_SP_OPS or PAPI_DP_OPS .br \fI*rtime\fP realtime since the latest call .br \fI*ptime\fP process time since the latest call .br \fI*flpops\fP floating point operations since the latest call .br \fI*mflops\fP incremental (Mega) floating point operations per seconds since the latest call .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP The counters were already started by something other than PAPI_flops_rate()\&. .br \fIPAPI_ENOEVNT\fP The floating point operations event does not exist\&. .br \fIPAPI_ENOMEM\fP Insufficient memory to complete the operation\&. .RE .PP The first call to PAPI_flops_rate() will initialize the PAPI interface, set up the counters to monitor the floating point operations event and start the counters\&. .PP Subsequent calls will read the counters and return real time, process time, floating point operations and the Mflop/s rate since the latest call to PAPI_flops_rate()\&. .PP PAPI_flops_rate() returns information related to theoretical floating point operations rather than simple instructions\&. It uses the floating point operations event which attempts to 'correctly' account for, e\&.g\&., FMA undercounts and FP Store overcounts\&. 
Note that PAPI_flops_rate() is thread-safe and can therefore be called by multiple threads\&. .PP \fBSee also\fP .RS 4 PAPI_flips_rate() .PP PAPI_ipc() .PP PAPI_epc() .PP PAPI_rate_stop() .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_get_cmp_opt.3000066400000000000000000000043601502707512200204510ustar00rootroot00000000000000.TH "PAPI_get_cmp_opt" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_cmp_opt \- Get component specific PAPI options\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBParameters\fP .RS 4 \fIoption\fP is an input parameter describing the course of action\&. Possible values are defined in \fBpapi\&.h\fP and briefly described in the table below\&. The Fortran calls are implementations of specific options\&. .br \fIptr\fP is a pointer to a structure that acts as both an input and output parameter\&. .br \fIcidx\fP An integer identifier for a component\&. By convention, component 0 is always the cpu component\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .RE .PP PAPI_get_opt() and PAPI_set_opt() query or change the options of the PAPI library or a specific event set created by \fBPAPI_create_eventset\fP \&. Some options may require that the eventset be bound to a component before they can execute successfully\&. This can be done either by adding an event or by explicitly calling \fBPAPI_assign_eventset_component\fP \&. .PP The C interface for these functions passes a pointer to the \fBPAPI_option_t\fP structure\&. Not all options require or return information in this structure, and not all options are implemented for both get and set\&. Some options require a component index to be provided\&. These options are handled explicitly by the PAPI_get_cmp_opt() call for 'get' and implicitly through the option structure for 'set'\&. 
The Fortran interface is a series of calls implementing various subsets of the C interface\&. Not all options in C are available in Fortran\&. .PP \fBNote\fP .RS 4 Some options, such as PAPI_DOMAIN and PAPI_MULTIPLEX, are also available as separate entry points in both C and Fortran\&. .RE .PP The reader is urged to see the example code in the PAPI distribution for usage of \fBPAPI_get_opt\fP\&. The file \fBpapi\&.h\fP contains definitions for the structures unioned in the \fBPAPI_option_t\fP structure\&. .PP \fBSee also\fP .RS 4 \fBPAPI_set_debug\fP \fBPAPI_set_multiplex\fP \fBPAPI_set_domain\fP \fBPAPI_option_t\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_get_component_index.3000066400000000000000000000016051502707512200222000ustar00rootroot00000000000000.TH "PAPI_get_component_index" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_component_index \- returns the component index for the named component .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBReturn values\fP .RS 4 \fIPAPI_ENOCMP\fP component does not exist .RE .PP \fBParameters\fP .RS 4 \fIname\fP name of component to find index for .RE .PP \fBExamples:\fP .RS 4 .PP .nf int cidx; cidx = PAPI_get_component_index("cuda"); if (cidx >= 0) { printf("The CUDA component is cidx %d\\n",cidx); } .fi .PP PAPI_get_component_index() returns the component index of the named component\&. This is useful for finding out if a specified component exists\&. .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_event_component\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPI_get_component_info.3000066400000000000000000000023171502707512200220250ustar00rootroot00000000000000.TH "PAPI_get_component_info" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_component_info \- get information about a specific software component .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBParameters\fP .RS 4 \fIcidx\fP Component index .RE .PP This function returns a pointer to a structure containing detailed information about a specific software component in the PAPI library\&. This includes versioning information, preset and native event information, and more\&. For full details, see \fBPAPI_component_info_t\fP\&. .PP \fBExamples:\fP .RS 4 .PP .nf const PAPI_component_info_t *cmpinfo = NULL; if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) exit(1); if ((cmpinfo = PAPI_get_component_info(0)) == NULL) exit(1); printf("This component supports %d Preset Events and %d Native events\&.\\n", cmpinfo\->num_preset_events, cmpinfo\->num_native_events); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_executable_info\fP .PP \fBPAPI_get_hardware_info\fP .PP \fBPAPI_get_dmem_info\fP .PP \fBPAPI_get_opt\fP .PP \fBPAPI_component_info_t\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPI_get_dev_attr.3000066400000000000000000000107541502707512200206240ustar00rootroot00000000000000.TH "PAPI_get_dev_attr" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_dev_attr \- returns device attributes .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBReturn values\fP .RS 4 \fIENOSUPP\fP invalid/unsupported attribute .RE .PP \fBParameters\fP .RS 4 \fIhandle\fP opaque handle for device, obtained through \fBPAPI_enum_dev_type\fP .br \fIid\fP integer identifier of queried device .br \fIattr\fP device attribute to query .br \fIval\fP value of the requested device attribute .RE .PP \fBExample:\fP .RS 4 .PP .nf typedef enum { PAPI_DEV_ATTR__CPU_CHAR_NAME, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_LINE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_LINE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_LINE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_LINE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_LINE_COUNT, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_LINE_COUNT, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_LINE_COUNT, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_LINE_COUNT, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_ASSOC, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_ASSOC, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_ASSOC, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_ASSOC, PAPI_DEV_ATTR__CPU_UINT_SOCKET_COUNT, PAPI_DEV_ATTR__CPU_UINT_NUMA_COUNT, PAPI_DEV_ATTR__CPU_UINT_CORE_COUNT, PAPI_DEV_ATTR__CPU_UINT_THREAD_COUNT, PAPI_DEV_ATTR__CPU_UINT_FAMILY, PAPI_DEV_ATTR__CPU_UINT_MODEL, PAPI_DEV_ATTR__CPU_UINT_STEPPING, PAPI_DEV_ATTR__CPU_UINT_NUMA_MEM_SIZE, PAPI_DEV_ATTR__CPU_UINT_THR_NUMA_AFFINITY, PAPI_DEV_ATTR__CPU_UINT_THR_PER_NUMA, PAPI_DEV_ATTR__CUDA_ULONG_UID, PAPI_DEV_ATTR__CUDA_CHAR_DEVICE_NAME, PAPI_DEV_ATTR__CUDA_UINT_WARP_SIZE, PAPI_DEV_ATTR__CUDA_UINT_SHM_PER_BLK, PAPI_DEV_ATTR__CUDA_UINT_SHM_PER_SM, 
PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_X, PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_Y, PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_Z, PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_X, PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_Y, PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_Z, PAPI_DEV_ATTR__CUDA_UINT_THR_PER_BLK, PAPI_DEV_ATTR__CUDA_UINT_SM_COUNT, PAPI_DEV_ATTR__CUDA_UINT_MULTI_KERNEL, PAPI_DEV_ATTR__CUDA_UINT_MAP_HOST_MEM, PAPI_DEV_ATTR__CUDA_UINT_MEMCPY_OVERLAP, PAPI_DEV_ATTR__CUDA_UINT_UNIFIED_ADDR, PAPI_DEV_ATTR__CUDA_UINT_MANAGED_MEM, PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MAJOR, PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MINOR, PAPI_DEV_ATTR__CUDA_UINT_BLK_PER_SM, PAPI_DEV_ATTR__ROCM_ULONG_UID, PAPI_DEV_ATTR__ROCM_CHAR_DEVICE_NAME, PAPI_DEV_ATTR__ROCM_UINT_WAVEFRONT_SIZE, PAPI_DEV_ATTR__ROCM_UINT_WORKGROUP_SIZE, PAPI_DEV_ATTR__ROCM_UINT_WAVE_PER_CU, PAPI_DEV_ATTR__ROCM_UINT_SHM_PER_WG, PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_X, PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_Y, PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_Z, PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_X, PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_Y, PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_Z, PAPI_DEV_ATTR__ROCM_UINT_CU_COUNT, PAPI_DEV_ATTR__ROCM_UINT_SIMD_PER_CU, PAPI_DEV_ATTR__ROCM_UINT_COMP_CAP_MAJOR, PAPI_DEV_ATTR__ROCM_UINT_COMP_CAP_MINOR, } PAPI_dev_attr_e; void *handle; int id; int count; int enum_modifier = PAPI_DEV_TYPE_ENUM__CPU | PAPI_DEV_TYPE_ENUM__CUDA; while (PAPI_OK == PAPI_enum_dev_type(enum_modifier, &handle)) { PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_PAPI_ID, &id); PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_COUNT, &count); if (PAPI_DEV_TYPE_ID__CUDA == id) { for (int i = 0; i < count; ++i) { unsigned int warp_size; unsigned int cc_major, cc_minor; PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_WARP_SIZE, &warp_size); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MAJOR, &cc_major); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MINOR, &cc_minor); \&.\&.\&. } } } .fi .PP PAPI_get_dev_type_attr() allows the user to query all device type attributes\&. 
It takes a device type handle, returned by \fBPAPI_enum_dev_type\fP, the device sequential id and an attribute to be queried for the device and returns the attribute value\&. .RE .PP \fBSee also\fP .RS 4 \fBPAPI_enum_dev_type\fP .PP \fBPAPI_get_dev_attr\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_get_dev_type_attr.3000066400000000000000000000040701502707512200216570ustar00rootroot00000000000000.TH "PAPI_get_dev_type_attr" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_dev_type_attr \- returns device type attributes .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBReturn values\fP .RS 4 \fIENOSUPP\fP invalid attribute .RE .PP \fBParameters\fP .RS 4 \fIhandle\fP opaque handle for device, obtained through \fBPAPI_enum_dev_type\fP .br \fIattr\fP device type attribute to query .br \fIval\fP value of the requested device type attribute .RE .PP \fBExample:\fP .RS 4 .PP .nf typedef enum { PAPI_DEV_TYPE_ATTR__INT_PAPI_ID, // PAPI defined device type id PAPI_DEV_TYPE_ATTR__INT_VENDOR_ID, // Vendor defined id PAPI_DEV_TYPE_ATTR__CHAR_NAME, // Vendor name PAPI_DEV_TYPE_ATTR__INT_COUNT, // Devices of that type and vendor PAPI_DEV_TYPE_ATTR__CHAR_STATUS, // Status string for the device type } PAPI_dev_type_attr_e; typedef enum { PAPI_DEV_TYPE_ID__CPU, // Device id for CPUs PAPI_DEV_TYPE_ID__CUDA, // Device id for Nvidia GPUs PAPI_DEV_TYPE_ID__ROCM, // Device id for AMD GPUs } PAPI_dev_type_id_e; void *handle; int id; int enum_modifier = PAPI_DEV_TYPE_ENUM__ALL; while (PAPI_OK == PAPI_enum_dev_type(enum_modifier, &handle)) { PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_PAPI_ID, &id); switch (id) { case PAPI_DEV_TYPE_ID__CPU: // query cpu attributes break; case PAPI_DEV_TYPE_ID__CUDA: // query nvidia gpu attributes break; case PAPI_DEV_TYPE_ID__ROCM: // query amd gpu attributes break; default: \&.\&.\&. 
} } .fi .PP PAPI_get_dev_type_attr() allows the user to query all device type attributes\&. It takes a device type handle, returned by \fBPAPI_enum_dev_type\fP, and an attribute to be queried for the device type and returns the attribute value\&. .RE .PP \fBSee also\fP .RS 4 \fBPAPI_enum_dev_type\fP .PP \fBPAPI_get_dev_attr\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_get_dmem_info.3000066400000000000000000000023441502707512200207450ustar00rootroot00000000000000.TH "PAPI_get_dmem_info" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_dmem_info \- Get information about the dynamic memory usage of the current program\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Prototype:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_get_dmem_info( PAPI_dmem_info_t *dest ); .RE .PP \fBParameters\fP .RS 4 \fIdest\fP structure to be filled in \fBPAPI_dmem_info_t\fP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_ECMP\fP The function is not implemented for the current component\&. .br \fIPAPI_EINVAL\fP Any value in the structure or array may be undefined as indicated by this error value\&. .br \fIPAPI_SYS\fP A system error occurred\&. .RE .PP \fBNote\fP .RS 4 This function is only implemented for the Linux operating system\&. This function takes a pointer to a \fBPAPI_dmem_info_t\fP structure and returns with the structure fields filled in\&. A value of PAPI_EINVAL in any field indicates an undefined parameter\&. .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_executable_info\fP \fBPAPI_get_hardware_info\fP \fBPAPI_get_opt\fP \fBPAPI_library_init\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
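As a usage sketch for the PAPI_get_dmem_info page above (not part of the generated page): it assumes Linux with PAPI installed, and the `size`, `resident`, and `pagesize` field names are taken from the PAPI_dmem_info_t definition in papi\&.h, so verify them against your installed headers.

```c
#include <stdio.h>
#include <stdlib.h>
#include <papi.h>

int main(void)
{
    PAPI_dmem_info_t dmem;
    int ret;

    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
        exit(1);

    ret = PAPI_get_dmem_info(&dmem);
    if (ret != PAPI_OK) {
        fprintf(stderr, "PAPI_get_dmem_info: %s\n", PAPI_strerror(ret));
        exit(1);
    }

    /* Fields are long long; a field equal to PAPI_EINVAL is undefined
       on this platform. See papi.h for units of each field. */
    printf("size:     %lld\n", dmem.size);
    printf("resident: %lld\n", dmem.resident);
    printf("pagesize: %lld\n", dmem.pagesize);
    return 0;
}
```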
papi-papi-7-2-0-t/man/man3/PAPI_get_event_component.3000066400000000000000000000013611502707512200222110ustar00rootroot00000000000000.TH "PAPI_get_event_component" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_event_component \- return component an event belongs to .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBReturn values\fP .RS 4 \fIENOCMP\fP component does not exist .RE .PP \fBParameters\fP .RS 4 \fIEventCode\fP EventCode for which we want to know the component index .RE .PP \fBExamples:\fP .RS 4 .PP .nf int cidx,eventcode; cidx = PAPI_get_event_component(eventcode); .fi .PP PAPI_get_event_component() returns the component an event belongs to\&. .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_event_info\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_get_event_info.3000066400000000000000000000020371502707512200211430ustar00rootroot00000000000000.TH "PAPI_get_event_info" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_event_info \- Get the event's name and description info\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBParameters\fP .RS 4 \fIEventCode\fP event code (preset or native) .br \fIinfo\fP structure with the event information \fBPAPI_event_info_t\fP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .br \fIPAPI_ENOTPRESET\fP The PAPI preset mask was set, but the hardware event specified is not a valid PAPI preset\&. .br \fIPAPI_ENOEVNT\fP The PAPI preset is not available on the underlying hardware\&. .RE .PP This function fills the event information into a structure\&. In Fortran, some fields of the structure are returned explicitly\&. This function works with existing PAPI preset and native event codes\&. 
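A minimal sketch of the lookup described above, pairing \fBPAPI_event_name_to_code\fP with PAPI_get_event_info(); this example is not part of the generated page and assumes a PAPI installation that provides the PAPI_TOT_INS preset.

```c
#include <stdio.h>
#include <stdlib.h>
#include <papi.h>

int main(void)
{
    PAPI_event_info_t info;
    int code, ret;

    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
        exit(1);

    /* Translate the preset name to an event code, then look up its info */
    ret = PAPI_event_name_to_code("PAPI_TOT_INS", &code);
    if (ret != PAPI_OK)
        exit(1);

    ret = PAPI_get_event_info(code, &info);
    if (ret != PAPI_OK)
        exit(1);

    printf("symbol:      %s\n", info.symbol);
    printf("short descr: %s\n", info.short_descr);
    printf("long descr:  %s\n", info.long_descr);
    return 0;
}
```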
.PP \fBSee also\fP .RS 4 \fBPAPI_event_name_to_code\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_get_eventset_component.3000066400000000000000000000016031502707512200227240ustar00rootroot00000000000000.TH "PAPI_get_eventset_component" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_eventset_component \- return index for component an eventset is assigned to .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBReturn values\fP .RS 4 \fIPAPI_ENOEVST\fP eventset does not exist .br \fIPAPI_ENOCMP\fP component is invalid or does not exist .br \fIpositive\fP value valid component index .RE .PP \fBParameters\fP .RS 4 \fIEventSet\fP EventSet for which we want to know the component index .RE .PP \fBExamples:\fP .RS 4 .PP .nf int cidx, eventset; cidx = PAPI_get_eventset_component(eventset); .fi .PP PAPI_get_eventset_component() returns the index of the component an EventSet is assigned to\&. .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_event_component\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_get_executable_info.3000066400000000000000000000034131502707512200221420ustar00rootroot00000000000000.TH "PAPI_get_executable_info" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_executable_info \- Get the executable's address space info\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br const \fBPAPI_exe_info_t\fP *PAPI_get_executable_info( void ); .RE .PP This function returns a pointer to a structure containing information about the current program\&. .PP \fBParameters\fP .RS 4 \fIfullname\fP Fully qualified path + filename of the executable\&. .br \fIname\fP Filename of the executable with no path information\&. 
.br \fItext_start,text_end\fP Start and End addresses of program text segment\&. .br \fIdata_start,data_end\fP Start and End addresses of program data segment\&. .br \fIbss_start,bss_end\fP Start and End addresses of program bss segment\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .RE .PP \fBExamples:\fP .RS 4 .PP .nf const PAPI_exe_info_t *prginfo = NULL; if ( ( prginfo = PAPI_get_executable_info( ) ) == NULL ) exit( 1 ); printf( "Path+Program: %s\\n", prginfo\->fullname ); printf( "Program: %s\\n", prginfo\->address_info\&.name ); printf( "Text start: %p, Text end: %p\\n", prginfo\->address_info\&.text_start, prginfo\->address_info\&.text_end ); printf( "Data start: %p, Data end: %p\\n", prginfo\->address_info\&.data_start, prginfo\->address_info\&.data_end ); printf( "Bss start: %p, Bss end: %p\\n", prginfo\->address_info\&.bss_start, prginfo\->address_info\&.bss_end ); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_opt\fP .PP \fBPAPI_get_hardware_info\fP .PP \fBPAPI_exe_info_t\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_get_hardware_info.3000066400000000000000000000023021502707512200216120ustar00rootroot00000000000000.TH "PAPI_get_hardware_info" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_hardware_info \- get information about the system hardware .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP In C, this function returns a pointer to a structure containing information about the hardware on which the program runs\&. In Fortran, the values of the structure are returned explicitly\&. .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .RE .PP .PP \fBNote\fP .RS 4 The C structure contains detailed information about cache and TLB sizes\&. This information is not available from Fortran\&. 
.RE .PP \fBExamples:\fP .RS 4 .PP .nf const PAPI_hw_info_t *hwinfo = NULL; if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) exit(1); if ((hwinfo = PAPI_get_hardware_info()) == NULL) exit(1); printf("%d CPUs at %f Mhz\&.\\en",hwinfo\->totalcpus,hwinfo\->mhz); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_hw_info_t\fP .PP \fBPAPI_get_executable_info\fP, \fBPAPI_get_opt\fP, \fBPAPI_get_dmem_info\fP, \fBPAPI_library_init\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_get_multiplex.3000066400000000000000000000041641502707512200210350ustar00rootroot00000000000000.TH "PAPI_get_multiplex" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_multiplex \- Get the multiplexing status of specified event set\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_get_multiplex( int EventSet ); .RE .PP \fBFortran Interface:\fP .RS 4 #include fpapi\&.h .br \fBPAPIF_get_multiplex( C_INT EventSet, C_INT check )\fP .RE .PP \fBParameters\fP .RS 4 \fIEventSet\fP an integer handle for a PAPI event set as created by \fBPAPI_create_eventset\fP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_EINVAL\fP One or more of the arguments is invalid, or the EventSet is already multiplexed\&. .br \fIPAPI_ENOEVST\fP The EventSet specified does not exist\&. .br \fIPAPI_EISRUN\fP The EventSet is currently counting events\&. .br \fIPAPI_ENOMEM\fP Insufficient memory to complete the operation\&. .RE .PP \fBPAPI_get_multiplex\fP tests the state of the PAPI_MULTIPLEXING flag in the specified event set, returning \fITRUE\fP if a PAPI event set is multiplexed, or FALSE if not\&. 
.br .PP \fBExample:\fP .RS 4 .PP .nf int EventSet = PAPI_NULL; int ret; // Create an empty EventSet ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); // Bind it to the CPU component ret = PAPI_assign_eventset_component(EventSet, 0); if (ret != PAPI_OK) handle_error(ret); // Check current multiplex status ret = PAPI_get_multiplex(EventSet); if (ret == TRUE) printf("This event set is ready for multiplexing\\n\&.") if (ret == FALSE) printf("This event set is not enabled for multiplexing\\n\&.") if (ret < 0) handle_error(ret); // Turn on multiplexing ret = PAPI_set_multiplex(EventSet); if ((ret == PAPI_EINVAL) && (PAPI_get_multiplex(EventSet) == TRUE)) printf("This event set already has multiplexing enabled\\n"); else if (ret != PAPI_OK) handle_error(ret); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_multiplex_init\fP .PP \fBPAPI_set_opt\fP .PP \fBPAPI_create_eventset\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_get_opt.3000066400000000000000000000104611502707512200176110ustar00rootroot00000000000000.TH "PAPI_get_opt" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_opt \- Get PAPI library or event set options\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_get_opt( int option, PAPI_option_t * ptr ); .RE .PP \fBParameters\fP .RS 4 \fIoption\fP Defines the option to get\&. Possible values are briefly described in the table below\&. .br \fIptr\fP Pointer to a structure determined by the selected option\&. See \fBPAPI_option_t\fP for a description of possible structures\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_EINVAL\fP The specified option or parameter is invalid\&. .br \fIPAPI_ENOEVST\fP The EventSet specified does not exist\&. .br \fIPAPI_ECMP\fP The option is not implemented for the current component\&. 
.br \fIPAPI_ENOINIT\fP specified option requires PAPI to be initialized first\&. .RE .PP PAPI_get_opt() queries the options of the PAPI library or a specific event set created by \fBPAPI_create_eventset\fP\&. Some options may require that the eventset be bound to a component before they can execute successfully\&. This can be done either by adding an event or by explicitly calling \fBPAPI_assign_eventset_component\fP\&. .PP Ptr is a pointer to the \fBPAPI_option_t\fP structure, which is actually a union of different structures for different options\&. Not all options require or return information in these structures\&. Each returns different values in the structure\&. Some options require a component index to be provided\&. These options are handled explicitly by the PAPI_get_cmp_opt() call\&. .PP \fBNote\fP .RS 4 Some options, such as PAPI_DOMAIN and PAPI_MULTIPLEX are also available as separate entry points in both C and Fortran\&. .RE .PP The reader is encouraged to peruse the ctests code in the PAPI distribution for examples of usage of \fBPAPI_set_opt\fP\&. .PP \fBPossible values for the PAPI_get_opt option parameter\fP .RS 4 OPTION DEFINITION PAPI_DEFDOM Get default counting domain for newly created event sets. Requires a component index. PAPI_DEFGRN Get default counting granularity. Requires a component index. PAPI_DEBUG Get the PAPI debug state and the debug handler. The debug state is specified in ptr->debug.level. The debug handler is specified in ptr->debug.handler. For further information regarding debug states and the behavior of the handler, see PAPI_set_debug. PAPI_MULTIPLEX Get current multiplexing state for specified EventSet. PAPI_DEF_ITIMER Get the type of itimer used in software multiplexing, overflowing and profiling. PAPI_DEF_MPX_NS Get the sampling time slice in nanoseconds for multiplexing and overflow. PAPI_DEF_ITIMER_NS See PAPI_DEF_MPX_NS. PAPI_ATTACH Get thread or process id to which event set is attached. 
Returns TRUE if currently attached. PAPI_CPU_ATTACH Get ptr->cpu.cpu_num and Attach state for EventSet specified in ptr->cpu.eventset. PAPI_DETACH Get thread or process id to which event set is attached. Returns TRUE if currently attached. PAPI_DOMAIN Get domain for EventSet specified in ptr->domain.eventset. Will error if eventset is not bound to a component. PAPI_GRANUL Get granularity for EventSet specified in ptr->granularity.eventset. Will error if eventset is not bound to a component. PAPI_INHERIT Get current inheritance state for specified EventSet. PAPI_PRELOAD Get LD_PRELOAD environment equivalent. PAPI_CLOCKRATE Get clockrate in MHz. PAPI_MAX_CPUS Get number of CPUs. PAPI_EXEINFO Get Executable addresses for text/data/bss. PAPI_HWINFO Get information about the hardware. PAPI_LIB_VERSION Get the full PAPI version of the library. This does not require PAPI to be initialized first. PAPI_MAX_HWCTRS Get number of counters. Requires a component index. PAPI_MAX_MPX_CTRS Get maximum number of multiplexing counters. Requires a component index. PAPI_SHLIBINFO Get shared library information used by the program. PAPI_COMPONENTINFO Get the PAPI features the specified component supports. Requires a component index. .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_multiplex\fP .PP \fBPAPI_get_cmp_opt\fP .PP \fBPAPI_set_opt\fP .PP \fBPAPI_option_t\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
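As a sketch of the PAPI_get_opt usage described in the table above (not part of the generated page; assumes an installed PAPI library), the PAPI_DEBUG option fills ptr->debug\&.level and ptr->debug\&.handler:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <papi.h>

int main(void)
{
    PAPI_option_t opt;
    int ret;

    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
        exit(1);

    /* PAPI_DEBUG returns the debug state in opt.debug.level and the
       handler in opt.debug.handler (see the option table) */
    memset(&opt, 0, sizeof(opt));
    ret = PAPI_get_opt(PAPI_DEBUG, &opt);
    if (ret != PAPI_OK) {
        fprintf(stderr, "PAPI_get_opt: %s\n", PAPI_strerror(ret));
        exit(1);
    }
    printf("debug level: %d\n", opt.debug.level);
    return 0;
}
```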
papi-papi-7-2-0-t/man/man3/PAPI_get_overflow_event_index.3000066400000000000000000000034341502707512200232440ustar00rootroot00000000000000.TH "PAPI_get_overflow_event_index" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_overflow_event_index \- converts an overflow vector into an array of indexes to overflowing events .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBParameters\fP .RS 4 \fIEventSet\fP an integer handle to a PAPI event set as created by \fBPAPI_create_eventset\fP .br \fIoverflow_vector\fP a vector with bits set for each counter that overflowed\&. This vector is passed by the system to the overflow handler routine\&. .br \fI*array\fP an array of indexes for events in EventSet\&. No more than *number indexes will be stored into the array\&. .br \fI*number\fP On input the variable determines the size of the array\&. On output the variable contains the number of indexes in the array\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. This could occur if the overflow_vector is empty (zero), if the array or number pointers are NULL, if the value of number is less than one, or if the EventSet is empty\&. .br \fIPAPI_ENOEVST\fP The EventSet specified does not exist\&. .RE .PP \fBExamples\fP .RS 4 .PP .nf void handler(int EventSet, void *address, long_long overflow_vector, void *context){ int Events[4], number, i; int total = 0, retval; printf("Overflow #%d\\n Handler(%d) Overflow at %p! vector=%#llx\\n", total, EventSet, address, overflow_vector); total++; number = 4; retval = PAPI_get_overflow_event_index(EventSet, overflow_vector, Events, &number); if(retval == PAPI_OK) for(i=0; i .br int PAPI_get_thr_specific( int tag, void **ptr ); .RE .PP \fBParameters\fP .RS 4 \fItag\fP An identifier, the value of which is either PAPI_USR1_TLS or PAPI_USR2_TLS\&. 
This identifier indicates which of several data structures associated with this thread is to be accessed\&. .br \fIptr\fP A pointer to the memory containing the data structure\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_EINVAL\fP The \fItag\fP argument is out of range\&. .RE .PP In C, \fBPAPI_get_thr_specific\fP will retrieve the pointer from the array with index \fItag\fP\&. There are 2 user available locations and \fItag\fP can be either PAPI_USR1_TLS or PAPI_USR2_TLS\&. The array mentioned above is managed by PAPI and allocated for each thread that has called \fBPAPI_thread_init\fP\&. There is no Fortran equivalent function\&. .PP \fBExample:\fP .RS 4 .PP .nf int ret; RateInfo *state = NULL; ret = PAPI_thread_init(pthread_self); if (ret != PAPI_OK) handle_error(ret); // Do we have the thread specific data setup yet? ret = PAPI_get_thr_specific(PAPI_USR1_TLS, (void *) &state); if (ret != PAPI_OK || state == NULL) { state = (RateInfo *) malloc(sizeof(RateInfo)); if (state == NULL) return (PAPI_ESYS); memset(state, 0, sizeof(RateInfo)); state\->EventSet = PAPI_NULL; ret = PAPI_create_eventset(&state\->EventSet); if (ret != PAPI_OK) return (PAPI_ESYS); ret = PAPI_set_thr_specific(PAPI_USR1_TLS, state); if (ret != PAPI_OK) return (ret); } .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_register_thread\fP \fBPAPI_thread_init\fP \fBPAPI_thread_id\fP \fBPAPI_set_thr_specific\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_get_virt_cyc.3000066400000000000000000000026021502707512200206270ustar00rootroot00000000000000.TH "PAPI_get_virt_cyc" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_virt_cyc \- get virtual time counter value in clock cycles .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBReturn values\fP .RS 4 \fIPAPI_ECNFLCT\fP If there is no master event set\&.
This will happen if the library has not been initialized, or .br for threaded applications, if there has been no thread id function defined by the \fBPAPI_thread_init\fP function\&. .br \fIPAPI_ENOMEM\fP For threaded applications, if there has not yet been any thread specific master event created for the current thread, and if the allocation of such an event set fails, the call will return PAPI_ENOMEM or PAPI_ESYS \&. .RE .PP This function returns the total number of virtual units from some arbitrary starting point\&. Virtual units accrue every time the process is running in user-mode on behalf of the process\&. Like the real time counters, this count is guaranteed to exist on every platform PAPI supports\&. However on some platforms, the resolution can be as bad as 1/Hz as defined by the operating system\&. .PP \fBExamples:\fP .RS 4 .PP .nf s = PAPI_get_virt_cyc(); your_slow_code(); e = PAPI_get_virt_cyc(); printf("Process has run for cycles: %lld\\en",e\-s); .fi .PP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_get_virt_nsec.3000066400000000000000000000023211502707512200207770ustar00rootroot00000000000000.TH "PAPI_get_virt_nsec" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_virt_nsec \- Get virtual time counter values in nanoseconds\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBReturn values\fP .RS 4 \fIPAPI_ECNFLCT\fP If there is no master event set\&. This will happen if the library has not been initialized, or for threaded applications, if there has been no thread id function defined by the \fBPAPI_thread_init\fP function\&. .br \fIPAPI_ENOMEM\fP For threaded applications, if there has not yet been any thread specific master event created for the current thread, and if the allocation of such an event set fails, the call will return PAPI_ENOMEM or PAPI_ESYS \&. 
.RE .PP This function returns the total number of virtual units from some arbitrary starting point\&. Virtual units accrue every time the process is running in user-mode on behalf of the process\&. Like the real time counters, this count is guaranteed to exist on every platform PAPI supports\&. However on some platforms, the resolution can be as bad as 1/Hz as defined by the operating system\&. .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_get_virt_usec.3000066400000000000000000000027611502707512200210160ustar00rootroot00000000000000.TH "PAPI_get_virt_usec" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_get_virt_usec \- get virtual time counter values in microseconds .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBReturn values\fP .RS 4 \fIPAPI_ECNFLCT\fP If there is no master event set\&. This will happen if the library has not been initialized, or for threaded applications, if there has been no thread id function defined by the \fBPAPI_thread_init\fP function\&. .br \fIPAPI_ENOMEM\fP For threaded applications, if there has not yet been any thread specific master event created for the current thread, and if the allocation of such an event set fails, the call will return PAPI_ENOMEM or PAPI_ESYS \&. .RE .PP This function returns the total number of virtual units from some arbitrary starting point\&. Virtual units accrue every time the process is running in user-mode on behalf of the process\&. Like the real time counters, this count is guaranteed to exist on every platform PAPI supports\&. However on some platforms, the resolution can be as bad as 1/Hz as defined by the operating system\&. 
.PP \fBExamples:\fP .RS 4 .PP .nf s = PAPI_get_virt_usec(); your_slow_code(); e = PAPI_get_virt_usec(); printf("Process has run for microseconds: %lld\\en",e\-s); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_real_cyc\fP .PP \fBPAPI_get_virt_cyc\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_granularity_option_t.3000066400000000000000000000010501502707512200224160ustar00rootroot00000000000000.TH "PAPI_granularity_option_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_granularity_option_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "int \fBdef_cidx\fP" .br .ti -1c .RI "int \fBeventset\fP" .br .ti -1c .RI "int \fBgranularity\fP" .br .in -1c .SH "Field Documentation" .PP .SS "int PAPI_granularity_option_t::def_cidx" this structure requires a component index to set default granularity .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_hl_read.3000066400000000000000000000032071502707512200175460ustar00rootroot00000000000000.TH "PAPI_hl_read" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_hl_read \- Read performance events inside of a region and store the difference to the corresponding beginning of the region\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_hl_read( const char* region ); .RE .PP \fBParameters\fP .RS 4 \fIregion\fP -- a unique region name corresponding to \fBPAPI_hl_region_begin\fP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_ENOTRUN\fP -- EventSet is currently not running or could not be determined\&. .br \fIPAPI_ESYS\fP -- A system or C library call failed inside PAPI, see the errno variable\&. .br \fIPAPI_EMISC\fP -- PAPI has been deactivated due to previous errors\&.
.br \fIPAPI_ENOMEM\fP -- Insufficient memory\&. .RE .PP \fBPAPI_hl_read\fP reads performance events inside of a region and stores the difference to the corresponding beginning of the region\&. .PP Assumes that \fBPAPI_hl_region_begin\fP was called before\&. .PP \fBExample:\fP .RS 4 .RE .PP .PP .nf int retval; retval = PAPI_hl_region_begin("computation"); if ( retval != PAPI_OK ) handle_error(1); //Do some computation here retval = PAPI_hl_read("computation"); if ( retval != PAPI_OK ) handle_error(1); //Do some computation here retval = PAPI_hl_region_end("computation"); if ( retval != PAPI_OK ) handle_error(1); .fi .PP .PP \fBSee also\fP .RS 4 \fBPAPI_hl_region_begin\fP .PP \fBPAPI_hl_region_end\fP .PP \fBPAPI_hl_stop\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_hl_region_begin.3000066400000000000000000000033021502707512200212600ustar00rootroot00000000000000.TH "PAPI_hl_region_begin" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_hl_region_begin \- Read performance events at the beginning of a region\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_hl_region_begin( const char* region ); .RE .PP \fBParameters\fP .RS 4 \fIregion\fP -- a unique region name .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_ENOTRUN\fP -- EventSet is currently not running or could not be determined\&. .br \fIPAPI_ESYS\fP -- A system or C library call failed inside PAPI, see the errno variable\&. .br \fIPAPI_EMISC\fP -- PAPI has been deactivated due to previous errors\&. .br \fIPAPI_ENOMEM\fP -- Insufficient memory\&. .RE .PP \fBPAPI_hl_region_begin\fP reads performance events and stores them internally at the beginning of an instrumented code region\&. If not specified via the environment variable PAPI_EVENTS, default events are used\&.
The first call sets all counters implicitly to zero and starts counting\&. Note that if PAPI_EVENTS is not set or cannot be interpreted, default performance events are recorded\&. .PP \fBExample:\fP .RS 4 .RE .PP .PP .nf export PAPI_EVENTS="PAPI_TOT_INS,PAPI_TOT_CYC" .fi .PP .PP .PP .nf int retval; retval = PAPI_hl_region_begin("computation"); if ( retval != PAPI_OK ) handle_error(1); //Do some computation here retval = PAPI_hl_region_end("computation"); if ( retval != PAPI_OK ) handle_error(1); .fi .PP .PP \fBSee also\fP .RS 4 \fBPAPI_hl_read\fP .PP \fBPAPI_hl_region_end\fP .PP \fBPAPI_hl_stop\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_hl_region_end.3000066400000000000000000000072161502707512200207500ustar00rootroot00000000000000.TH "PAPI_hl_region_end" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_hl_region_end \- Read performance events at the end of a region and store the difference to the corresponding beginning of the region\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_hl_region_end( const char* region ); .RE .PP \fBParameters\fP .RS 4 \fIregion\fP -- a unique region name corresponding to \fBPAPI_hl_region_begin\fP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_ENOTRUN\fP -- EventSet is currently not running or could not be determined\&. .br \fIPAPI_ESYS\fP -- A system or C library call failed inside PAPI, see the errno variable\&. .br \fIPAPI_EMISC\fP -- PAPI has been deactivated due to previous errors\&. .br \fIPAPI_ENOMEM\fP -- Insufficient memory\&. .RE .PP \fBPAPI_hl_region_end\fP reads performance events at the end of a region and stores the difference to the corresponding beginning of the region\&. .PP Assumes that \fBPAPI_hl_region_begin\fP was called before\&.
.PP Note that \fBPAPI_hl_region_end\fP does not stop counting the performance events\&. Counting continues until the application terminates\&. Therefore, the programmer can also create nested regions if required\&. To stop a running high-level event set, the programmer must call PAPI_hl_stop()\&. Note also that a marked region is thread-local: it must begin and end in the same thread\&. .PP An output of the measured events is created automatically after the application exits\&. For a serial or thread-parallel application there is only one output file\&. For MPI applications, the output is saved in multiple files, one per MPI rank\&. The output is generated in the current directory by default\&. However, for larger measurements, especially MPI applications, it is recommended to specify an output directory via the environment variable PAPI_OUTPUT_DIRECTORY\&. If new measurements are performed while old measurements are present in the same directory, PAPI will not overwrite or delete the old measurement directories\&. Instead, timestamps are added to the old directories\&. .PP For convenience, the output can also be printed to stdout by setting PAPI_REPORT=1\&. This is not recommended for MPI applications, as each MPI rank tries to print the output concurrently\&. .PP The generated measurement output can also be converted into a more readable form\&. The Python script papi_hl_output_writer\&.py enhances the output by creating derived metrics, like IPC, MFlops/s, and MFlips/s, as well as real and processor time, in case the corresponding PAPI events have been recorded\&. The Python script can also summarize performance events over all threads and MPI ranks when using the option 'accumulate' as seen below\&.
.PP \fBExample:\fP .RS 4 .RE .PP .PP .nf int retval; retval = PAPI_hl_region_begin("computation"); if ( retval != PAPI_OK ) handle_error(1); //Do some computation here retval = PAPI_hl_region_end("computation"); if ( retval != PAPI_OK ) handle_error(1); .fi .PP .PP .PP .nf python papi_hl_output_writer\&.py \-\-type=accumulate { "computation": { "Region count": 1, "Real time in s": 0\&.97 , "CPU time in s": 0\&.98 , "IPC": 1\&.41 , "MFLIPS /s": 386\&.28 , "MFLOPS /s": 386\&.28 , "Number of ranks ": 1, "Number of threads ": 1, "Number of processes ": 1 } } .fi .PP .PP \fBSee also\fP .RS 4 \fBPAPI_hl_region_begin\fP .PP \fBPAPI_hl_read\fP .PP \fBPAPI_hl_stop\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_hl_stop.3000066400000000000000000000021321502707512200176140ustar00rootroot00000000000000.TH "PAPI_hl_stop" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_hl_stop \- Stop a running high-level event set\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface: \fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_hl_stop(); .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_ENOEVNT\fP -- The EventSet is not started yet\&. .br \fIPAPI_ENOMEM\fP -- Insufficient memory to complete the operation\&. .RE .PP \fBPAPI_hl_stop\fP stops a running high-level event set\&. .PP This call is optional and only necessary if the programmer wants to use the low-level API in addition to the high-level API\&. It should be noted that \fBPAPI_hl_stop\fP and low-level calls are not allowed inside of a marked region\&. Furthermore, \fBPAPI_hl_stop\fP is thread-local and therefore has to be called in the same thread as the corresponding marked region\&. .PP \fBSee also\fP .RS 4 \fBPAPI_hl_region_begin\fP .PP \fBPAPI_hl_read\fP .PP \fBPAPI_hl_region_end\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
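The PAPI_hl_stop page above carries no example of its own. The following sketch (not taken from the man page; it assumes libpapi is installed and the program is linked with -lpapi) shows the usage the page describes: counting continues after PAPI_hl_region_end, so PAPI_hl_stop must be called explicitly before mixing in low-level API calls from the same thread.

```c
#include <stdio.h>
#include <stdlib.h>
#include <papi.h>

int main(void)
{
    /* Measure a marked region with the high-level API. */
    if (PAPI_hl_region_begin("computation") != PAPI_OK)
        exit(1);

    volatile double x = 0.0;                 /* stand-in workload */
    for (int i = 0; i < 1000000; i++)
        x += i;

    if (PAPI_hl_region_end("computation") != PAPI_OK)
        exit(1);

    /* The event set keeps running after the region ends; stop it
     * before using the low-level API in this thread. */
    if (PAPI_hl_stop() != PAPI_OK)
        exit(1);

    printf("high-level event set stopped\n");
    return 0;
}
```

Building with `gcc example.c -lpapi` is a typical invocation; results are written out automatically when the application exits, as described above.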
papi-papi-7-2-0-t/man/man3/PAPI_hw_info_t.3000066400000000000000000000052111502707512200201210ustar00rootroot00000000000000.TH "PAPI_hw_info_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_hw_info_t \- Hardware info structure\&. .SH SYNOPSIS .br .PP .PP \fR#include <papi\&.h>\fP .SS "Data Fields" .in +1c .ti -1c .RI "int \fBncpu\fP" .br .ti -1c .RI "int \fBthreads\fP" .br .ti -1c .RI "int \fBcores\fP" .br .ti -1c .RI "int \fBsockets\fP" .br .ti -1c .RI "int \fBnnodes\fP" .br .ti -1c .RI "int \fBtotalcpus\fP" .br .ti -1c .RI "int \fBvendor\fP" .br .ti -1c .RI "char \fBvendor_string\fP [128]" .br .ti -1c .RI "int \fBmodel\fP" .br .ti -1c .RI "char \fBmodel_string\fP [128]" .br .ti -1c .RI "float \fBrevision\fP" .br .ti -1c .RI "int \fBcpuid_family\fP" .br .ti -1c .RI "int \fBcpuid_model\fP" .br .ti -1c .RI "int \fBcpuid_stepping\fP" .br .ti -1c .RI "int \fBcpu_max_mhz\fP" .br .ti -1c .RI "int \fBcpu_min_mhz\fP" .br .ti -1c .RI "\fBPAPI_mh_info_t\fP \fBmem_hierarchy\fP" .br .ti -1c .RI "int \fBvirtualized\fP" .br .ti -1c .RI "char \fBvirtual_vendor_string\fP [128]" .br .ti -1c .RI "char \fBvirtual_vendor_version\fP [128]" .br .ti -1c .RI "float \fBmhz\fP" .br .ti -1c .RI "int \fBclock_mhz\fP" .br .ti -1c .RI "int \fBreserved\fP [8]" .br .in -1c .SH "Field Documentation" .PP .SS "int PAPI_hw_info_t::clock_mhz" Deprecated .SS "int PAPI_hw_info_t::cores" Number of cores per socket .SS "int PAPI_hw_info_t::cpu_max_mhz" Maximum supported CPU speed .SS "int PAPI_hw_info_t::cpu_min_mhz" Minimum supported CPU speed .SS "int PAPI_hw_info_t::cpuid_family" cpuid family .SS "int PAPI_hw_info_t::cpuid_model" cpuid model .SS "int PAPI_hw_info_t::cpuid_stepping" cpuid stepping .SS "\fBPAPI_mh_info_t\fP PAPI_hw_info_t::mem_hierarchy" PAPI memory hierarchy description .SS "float PAPI_hw_info_t::mhz" Deprecated .SS "int PAPI_hw_info_t::model" Model number of CPU .SS "char PAPI_hw_info_t::model_string[128]" Model string of CPU .SS "int
PAPI_hw_info_t::ncpu" Number of CPUs per NUMA Node .SS "int PAPI_hw_info_t::nnodes" Total Number of NUMA Nodes .SS "float PAPI_hw_info_t::revision" Revision of CPU .SS "int PAPI_hw_info_t::sockets" Number of sockets .SS "int PAPI_hw_info_t::threads" Number of hdw threads per core .SS "int PAPI_hw_info_t::totalcpus" Total number of CPUs in the entire system .SS "int PAPI_hw_info_t::vendor" Vendor number of CPU .SS "char PAPI_hw_info_t::vendor_string[128]" Vendor string of CPU .SS "char PAPI_hw_info_t::virtual_vendor_string[128]" Vendor for virtual machine .SS "char PAPI_hw_info_t::virtual_vendor_version[128]" Version of virtual machine .SS "int PAPI_hw_info_t::virtualized" Running in virtual machine .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_inherit_option_t.3000066400000000000000000000005441502707512200215260ustar00rootroot00000000000000.TH "PAPI_inherit_option_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_inherit_option_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "int \fBeventset\fP" .br .ti -1c .RI "int \fBinherit\fP" .br .in -1c .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_ipc.3000066400000000000000000000032511502707512200167220ustar00rootroot00000000000000.TH "PAPI_ipc" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_ipc \- Simplified call to get instructions per cycle, real and processor time\&. 
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface: \fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_ipc( float *rtime, float *ptime, long long *ins, float *ipc ); .RE .PP \fBParameters\fP .RS 4 \fI*rtime\fP realtime since the latest call .br \fI*ptime\fP process time since the latest call .br \fI*ins\fP instructions since the latest call .br \fI*ipc\fP incremental instructions per cycle since the latest call .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP The counters were already started by something other than PAPI_ipc()\&. .br \fIPAPI_ENOEVNT\fP The events PAPI_TOT_INS and PAPI_TOT_CYC are not supported\&. .br \fIPAPI_ENOMEM\fP Insufficient memory to complete the operation\&. .RE .PP The first call to PAPI_ipc() will initialize the PAPI interface, set up the counters to monitor PAPI_TOT_INS and PAPI_TOT_CYC events and start the counters\&. .PP Subsequent calls will read the counters and return real time, process time, instructions and the IPC rate since the latest call to PAPI_ipc()\&. .PP PAPI_ipc() should return a ratio greater than 1\&.0, indicating instruction level parallelism within the chip\&. The larger this ratio, the more efficiently the program is running\&. Note that PAPI_ipc() is thread-safe and can therefore be called by multiple threads\&. .PP \fBSee also\fP .RS 4 PAPI_flips_rate() .PP PAPI_flops_rate() .PP PAPI_epc() .PP PAPI_rate_stop() .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&.
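The PAPI_ipc page describes the two-call pattern but gives no example. Here is a sketch (not from the man page; assumes libpapi is installed, linked with -lpapi, and that the hardware supports PAPI_TOT_INS and PAPI_TOT_CYC): the first call starts the counters, the second reports time, instructions, and IPC accumulated since the first.

```c
#include <stdio.h>
#include <stdlib.h>
#include <papi.h>

int main(void)
{
    float rtime, ptime, ipc;
    long long ins;

    /* First call: initializes PAPI and starts counting. */
    if (PAPI_ipc(&rtime, &ptime, &ins, &ipc) != PAPI_OK)
        exit(1);

    volatile double x = 0.0;                 /* stand-in workload */
    for (long i = 0; i < 10000000; i++)
        x += i * 0.5;

    /* Second call: reads counters since the previous call. */
    if (PAPI_ipc(&rtime, &ptime, &ins, &ipc) != PAPI_OK)
        exit(1);

    printf("real %.3fs  proc %.3fs  ins %lld  IPC %.2f\n",
           rtime, ptime, ins, ipc);
    return 0;
}
```

As the page notes, an IPC below 1.0 on this kind of loop would suggest the workload is not exploiting instruction-level parallelism well.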
papi-papi-7-2-0-t/man/man3/PAPI_is_initialized.3000066400000000000000000000026341502707512200211530ustar00rootroot00000000000000.TH "PAPI_is_initialized" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_is_initialized \- check for initialization .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBReturn values\fP .RS 4 \fIPAPI_NOT_INITED\fP Library has not been initialized .br \fIPAPI_LOW_LEVEL_INITED\fP Low level has called library init .br \fIPAPI_HIGH_LEVEL_INITED\fP High level has called library init .br \fIPAPI_THREAD_LEVEL_INITED\fP .br Threads have been inited .RE .PP \fBParameters\fP .RS 4 \fIversion\fP upon initialization, PAPI checks the argument against the internal value of PAPI_VER_CURRENT when the library was compiled\&. This guards against portability problems when updating the PAPI shared libraries on your system\&. .RE .PP \fBExamples:\fP .RS 4 .PP .nf int retval; retval = PAPI_library_init(PAPI_VER_CURRENT); if (retval != PAPI_VER_CURRENT && retval > 0) { fprintf(stderr,"PAPI library version mismatch!\\en"); exit(1); } if (retval < 0) handle_error(retval); retval = PAPI_is_initialized(); if (retval != PAPI_LOW_LEVEL_INITED) handle_error(retval); .fi .PP PAPI_is_initialized() returns the status of the PAPI library\&. The PAPI library can be in one of four states, as described under RETURN VALUES\&. .RE .PP \fBSee also\fP .RS 4 PAPI .PP \fBPAPI_thread_init\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPI_itimer_option_t.3000066400000000000000000000006501502707512200213530ustar00rootroot00000000000000.TH "PAPI_itimer_option_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_itimer_option_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "int \fBitimer_num\fP" .br .ti -1c .RI "int \fBitimer_sig\fP" .br .ti -1c .RI "int \fBns\fP" .br .ti -1c .RI "int \fBflags\fP" .br .in -1c .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_library_init.3000066400000000000000000000033031502707512200206340ustar00rootroot00000000000000.TH "PAPI_library_init" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_library_init \- initialize the PAPI library\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBParameters\fP .RS 4 \fIversion\fP upon initialization, PAPI checks the argument against the internal value of PAPI_VER_CURRENT when the library was compiled\&. This guards against portability problems when updating the PAPI shared libraries on your system\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP \fBpapi\&.h\fP is different from the version used to compile the PAPI library\&. .br \fIPAPI_ENOMEM\fP Insufficient memory to complete the operation\&. .br \fIPAPI_ECMP\fP This component does not support the underlying hardware\&. .br \fIPAPI_ESYS\fP A system or C library call failed inside PAPI, see the errno variable\&. .RE .PP PAPI_library_init() initializes the PAPI library\&. It must be called before any low level PAPI functions can be used; PAPI_is_initialized() can be used to check whether initialization has occurred\&. If your application makes use of threads, \fBPAPI_thread_init\fP must also be called prior to making any calls to the library other than PAPI_library_init()\&.
.PP \fBExamples:\fP .RS 4 .PP .nf int retval; retval = PAPI_library_init(PAPI_VER_CURRENT); if (retval != PAPI_VER_CURRENT && retval > 0) { fprintf(stderr,"PAPI library version mismatch!\\en"); exit(1); } if (retval < 0) handle_error(retval); retval = PAPI_is_initialized(); if (retval != PAPI_LOW_LEVEL_INITED) handle_error(retval); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_thread_init\fP PAPI .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_list_events.3000066400000000000000000000042321502707512200205060ustar00rootroot00000000000000.TH "PAPI_list_events" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_list_events \- list the events in an event set .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP List the events in an event set\&. .PP PAPI_list_events() returns an array of events and a count of the total number of events in an event set\&. This call assumes an initialized PAPI library and a successfully created event set\&. .PP \fBC Interface\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_list_events(int EventSet, int *Events, int *number); .RE .PP \fBParameters\fP .RS 4 \fIEventSet\fP An integer handle for a PAPI event set as created by \fBPAPI_create_eventset\fP .br \fI*Events\fP A pointer to a preallocated array of codes for events, such as PAPI_INT_INS\&. No more than *number codes will be stored into the array\&. .br \fI*number\fP On input, the size of the Events array, or maximum number of event codes to be returned\&. A value of 0 can be used to probe an event set\&. On output, the number of events actually in the event set\&. This value may be greater than the actually stored number of event codes\&.
.RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP .br \fIPAPI_ENOEVST\fP .RE .PP \fBExamples:\fP .RS 4 .PP .nf if (PAPI_event_name_to_code("PAPI_TOT_INS",&EventCode) != PAPI_OK) exit(1); if (PAPI_add_event(EventSet, EventCode) != PAPI_OK) exit(1); // Convert a second event name to an event code if (PAPI_event_name_to_code("PAPI_L1_LDM",&EventCode) != PAPI_OK) exit(1); if (PAPI_add_event(EventSet, EventCode) != PAPI_OK) exit(1); number = 0; if(PAPI_list_events(EventSet, NULL, &number)) exit(1); if(number != 2) exit(1); if(PAPI_list_events(EventSet, Events, &number)) exit(1); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_event_code_to_name\fP .PP \fBPAPI_event_name_to_code\fP .PP \fBPAPI_add_event\fP .PP \fBPAPI_create_eventset\fP .RE .PP \fBFortran Interface:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPI_list_events( C_INT EventSet, C_INT(*) Events, C_INT number, C_INT check )\fP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_list_events\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_list_threads.3000066400000000000000000000024161502707512200206360ustar00rootroot00000000000000.TH "PAPI_list_threads" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_list_threads \- List the registered thread ids\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP PAPI_list_threads() returns to the caller a list of all thread IDs known to PAPI\&. .PP This call assumes an initialized PAPI library\&. .PP \fBC Interface\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_list_threads(PAPI_thread_id_t *tids, int * number ); .RE .PP \fBParameters\fP .RS 4 \fI*tids\fP -- A pointer to a preallocated array\&. This may be NULL to only return a count of threads\&. No more than *number codes will be stored in the array\&. .br \fI*number\fP -- An input and output parameter\&.
.br Input specifies the number of allocated elements in *tids (if non-NULL) and output specifies the number of threads\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP The call returned successfully\&. .br \fIPAPI_EINVAL\fP *number has an improper value .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_thr_specific\fP .PP \fBPAPI_set_thr_specific\fP .PP \fBPAPI_register_thread\fP .PP \fBPAPI_unregister_thread\fP .PP \fBPAPI_thread_init\fP \fBPAPI_thread_id\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_lock.3000066400000000000000000000017511502707512200171020ustar00rootroot00000000000000.TH "PAPI_lock" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_lock \- Lock one of two mutex variables defined in \fBpapi\&.h\fP\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP PAPI_lock() grabs access to one of the two PAPI mutex variables\&. This function is provided to the user to have a platform independent call to a (hopefully) efficiently implemented mutex\&. .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br void PAPI_lock(int lock); .RE .PP \fBParameters\fP .RS 4 \fIlock\fP -- an integer value specifying one of the two user locks: PAPI_USR1_LOCK or PAPI_USR2_LOCK .RE .PP \fBReturns\fP .RS 4 There is no return value for this call\&. Upon return from \fBPAPI_lock\fP the current thread has acquired exclusive access to the specified PAPI mutex\&. .RE .PP \fBSee also\fP .RS 4 \fBPAPI_unlock\fP .PP \fBPAPI_thread_init\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
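The PAPI_lock page above explains the two user mutexes but shows no usage. A sketch (not from the man page; assumes libpapi and pthreads are available and the program is linked with -lpapi -lpthread) of serializing access to shared state with PAPI_USR1_LOCK:

```c
#include <stdio.h>
#include <pthread.h>
#include <papi.h>

static long shared_total = 0;

static void *worker(void *arg)
{
    (void)arg;
    /* Grab the PAPI user mutex, update shared state, release it. */
    PAPI_lock(PAPI_USR1_LOCK);
    shared_total++;
    PAPI_unlock(PAPI_USR1_LOCK);
    return NULL;
}

int main(void)
{
    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
        return 1;
    /* Threaded use of PAPI requires PAPI_thread_init first. */
    if (PAPI_thread_init((unsigned long (*)(void))pthread_self) != PAPI_OK)
        return 1;

    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);

    printf("shared_total = %ld\n", shared_total);
    return 0;
}
```

Since the two user locks are shared library-wide, any other code using PAPI_USR1_LOCK would contend on the same mutex; PAPI_USR2_LOCK is the second independent lock.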
papi-papi-7-2-0-t/man/man3/PAPI_mh_cache_info_t.3000066400000000000000000000011011502707512200212260ustar00rootroot00000000000000.TH "PAPI_mh_cache_info_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_mh_cache_info_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "int \fBtype\fP" .br .ti -1c .RI "int \fBsize\fP" .br .ti -1c .RI "int \fBline_size\fP" .br .ti -1c .RI "int \fBnum_lines\fP" .br .ti -1c .RI "int \fBassociativity\fP" .br .in -1c .SH "Field Documentation" .PP .SS "int PAPI_mh_cache_info_t::type" Empty, instr, data, vector, trace, unified .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_mh_info_t.3000066400000000000000000000006501502707512200201110ustar00rootroot00000000000000.TH "PAPI_mh_info_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_mh_info_t \- memory hierarchy information .SH SYNOPSIS .br .PP .PP \fR#include <papi\&.h>\fP .SS "Data Fields" .in +1c .ti -1c .RI "int \fBlevels\fP" .br .ti -1c .RI "\fBPAPI_mh_level_t\fP \fBlevel\fP [4]" .br .in -1c .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_mh_level_t.3000066400000000000000000000006051502707512200202650ustar00rootroot00000000000000.TH "PAPI_mh_level_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_mh_level_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "\fBPAPI_mh_tlb_info_t\fP \fBtlb\fP [6]" .br .ti -1c .RI "\fBPAPI_mh_cache_info_t\fP \fBcache\fP [6]" .br .in -1c .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&.
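The PAPI_mh_* structure pages above describe data, not behavior; in practice these structures are reached through the mem_hierarchy field that PAPI_get_hardware_info() returns inside PAPI_hw_info_t. A sketch (not from the man pages; assumes libpapi is installed and linked with -lpapi) that walks the hierarchy and prints each populated cache description:

```c
#include <stdio.h>
#include <papi.h>

int main(void)
{
    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
        return 1;

    const PAPI_hw_info_t *hw = PAPI_get_hardware_info();
    if (hw == NULL)
        return 1;

    const PAPI_mh_info_t *mh = &hw->mem_hierarchy;
    /* PAPI_mh_info_t holds level[4]; each PAPI_mh_level_t holds cache[6]. */
    for (int l = 0; l < mh->levels && l < 4; l++) {
        for (int c = 0; c < 6; c++) {
            const PAPI_mh_cache_info_t *ci = &mh->level[l].cache[c];
            if (ci->size > 0)   /* skip empty slots */
                printf("L%d cache: %d bytes, %d-byte lines, "
                       "%d lines, associativity %d\n",
                       l + 1, ci->size, ci->line_size,
                       ci->num_lines, ci->associativity);
        }
    }
    return 0;
}
```

The same loop structure applies to the tlb[6] array of PAPI_mh_tlb_info_t entries documented below.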
papi-papi-7-2-0-t/man/man3/PAPI_mh_tlb_info_t.3000066400000000000000000000010251502707512200207470ustar00rootroot00000000000000.TH "PAPI_mh_tlb_info_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_mh_tlb_info_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "int \fBtype\fP" .br .ti -1c .RI "int \fBnum_entries\fP" .br .ti -1c .RI "int \fBpage_size\fP" .br .ti -1c .RI "int \fBassociativity\fP" .br .in -1c .SH "Field Documentation" .PP .SS "int PAPI_mh_tlb_info_t::type" Empty, instr, data, vector, unified .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_mpx_info_t.3000066400000000000000000000013161502707512200203110ustar00rootroot00000000000000.TH "PAPI_mpx_info_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_mpx_info_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "int \fBtimer_sig\fP" .br .ti -1c .RI "int \fBtimer_num\fP" .br .ti -1c .RI "int \fBtimer_us\fP" .br .in -1c .SH "Field Documentation" .PP .SS "int PAPI_mpx_info_t::timer_num" Number of the itimer or POSIX 1 timer used by the multiplex timer: PAPI_ITIMER .SS "int PAPI_mpx_info_t::timer_sig" Signal number used by the multiplex timer, 0 if not: PAPI_SIGNAL .SS "int PAPI_mpx_info_t::timer_us" uS between switching of sets: PAPI_MPX_DEF_US .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_multiplex_init.3000066400000000000000000000017471502707512200212250ustar00rootroot00000000000000.TH "PAPI_multiplex_init" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_multiplex_init \- Initialize multiplex support in the PAPI library\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP PAPI_multiplex_init() enables and initializes multiplex support in the PAPI library\&. 
Multiplexing allows a user to count more events than total physical counters by time sharing the existing counters at some loss in precision\&. Applications that make no use of multiplexing do not need to call this routine\&. .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int \fBPAPI_multiplex_init\fP (void); .RE .PP \fBExamples\fP .RS 4 .PP .nf retval = PAPI_multiplex_init(); .fi .PP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP This call always returns PAPI_OK .RE .PP \fBSee also\fP .RS 4 \fBPAPI_set_multiplex\fP .PP \fBPAPI_get_multiplex\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_multiplex_option_t.3000066400000000000000000000006051502707512200221050ustar00rootroot00000000000000.TH "PAPI_multiplex_option_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_multiplex_option_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "int \fBeventset\fP" .br .ti -1c .RI "int \fBns\fP" .br .ti -1c .RI "int \fBflags\fP" .br .in -1c .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_num_cmp_hwctrs.3000066400000000000000000000040661502707512200212040ustar00rootroot00000000000000.TH "PAPI_num_cmp_hwctrs" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_num_cmp_hwctrs \- Return the number of hardware counters for the specified component\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP PAPI_num_cmp_hwctrs() returns the number of counters present in the specified component\&. By convention, component 0 is always the cpu\&. .PP On some components, especially for CPUs, the value returned is a theoretical maximum for estimation purposes only\&. It might not be possible to easily create an EventSet that contains the full number of events\&. This can be due to a variety of reasons: 1)\&. 
Some CPUs (especially Intel and POWER) have the notion of fixed counters that can only measure one thing, usually cycles\&. 2)\&. Some CPUs have very explicit rules about which event can run in which counter\&. In this case it might not be possible to add a wanted event even if counters are free\&. 3)\&. Some CPUs halve the number of counters available when running with SMT (multiple CPU threads) enabled\&. 4)\&. Some operating systems 'steal' a counter to use for things such as NMI Watchdog timers\&. The only sure way to see if events will fit is to attempt adding events to an EventSet, and doing something sensible if an error is generated\&. .PP PAPI_library_init() must be called in order for this function to return anything greater than 0\&. .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_num_cmp_hwctrs(int cidx ); .RE .PP \fBParameters\fP .RS 4 \fIcidx\fP -- An integer identifier for a component\&. By convention, component 0 is always the cpu component\&. .RE .PP \fBExample\fP .RS 4 .PP .nf // Query the cpu component for the number of counters\&. printf(\\"%d hardware counters found\&.\\\\n\\", PAPI_num_cmp_hwctrs(0)); .fi .PP .RE .PP \fBReturns\fP .RS 4 On success, this function returns a value greater than zero\&. .br A zero result usually means the library has not been initialized\&. .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_num_components.3000066400000000000000000000010361502707512200212120ustar00rootroot00000000000000.TH "PAPI_num_components" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_num_components \- Get the number of components available on the system\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBReturns\fP .RS 4 Number of components available on the system .RE .PP .PP .nf // Query the library for a component count\&. 
printf("%d components installed\&.", PAPI_num_components()); .fi .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_num_events.3000066400000000000000000000025111502707512200203300ustar00rootroot00000000000000.TH "PAPI_num_events" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_num_events \- Return the number of events in an event set\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP PAPI_num_events() returns the number of preset and/or native events contained in an event set\&. The event set should be created by \fBPAPI_create_eventset\fP \&. .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_num_events(int EventSet ); .RE .PP \fBParameters\fP .RS 4 \fIEventSet\fP -- an integer handle for a PAPI event set created by \fBPAPI_create_eventset\fP\&. .br \fI*count\fP -- (Fortran only) On output the variable contains the number of events in the event set .RE .PP \fBReturn values\fP .RS 4 \fIOn\fP success, this function returns the positive number of events in the event set\&. .br \fIPAPI_EINVAL\fP The event count is zero; only if code is compiled with debug enabled\&. .br \fIPAPI_ENOEVST\fP The EventSet specified does not exist\&. .RE .PP \fBExample\fP .RS 4 .PP .nf // Count the events in our EventSet printf(\\"%d events found in EventSet\&.\\\\n\\", PAPI_num_events(EventSet)); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_add_event\fP .PP \fBPAPI_create_eventset\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_num_hwctrs.3000066400000000000000000000006051502707512200203400ustar00rootroot00000000000000.TH "PAPI_num_hwctrs" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_num_hwctrs \- Return the number of hardware counters on the cpu\&.
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBSee also\fP .RS 4 \fBPAPI_num_cmp_hwctrs\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_option_t.3000066400000000000000000000025561502707512200200110ustar00rootroot00000000000000.TH "PAPI_option_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_option_t \- A pointer to the following is passed to PAPI_set/get_opt() .SH SYNOPSIS .br .PP .PP \fR#include \fP .SS "Data Fields" .in +1c .ti -1c .RI "\fBPAPI_preload_info_t\fP \fBpreload\fP" .br .ti -1c .RI "\fBPAPI_debug_option_t\fP \fBdebug\fP" .br .ti -1c .RI "\fBPAPI_inherit_option_t\fP \fBinherit\fP" .br .ti -1c .RI "\fBPAPI_granularity_option_t\fP \fBgranularity\fP" .br .ti -1c .RI "\fBPAPI_granularity_option_t\fP \fBdefgranularity\fP" .br .ti -1c .RI "\fBPAPI_domain_option_t\fP \fBdomain\fP" .br .ti -1c .RI "\fBPAPI_domain_option_t\fP \fBdefdomain\fP" .br .ti -1c .RI "\fBPAPI_attach_option_t\fP \fBattach\fP" .br .ti -1c .RI "\fBPAPI_cpu_option_t\fP \fBcpu\fP" .br .ti -1c .RI "\fBPAPI_multiplex_option_t\fP \fBmultiplex\fP" .br .ti -1c .RI "\fBPAPI_itimer_option_t\fP \fBitimer\fP" .br .ti -1c .RI "\fBPAPI_hw_info_t\fP * \fBhw_info\fP" .br .ti -1c .RI "\fBPAPI_shlib_info_t\fP * \fBshlib_info\fP" .br .ti -1c .RI "\fBPAPI_exe_info_t\fP * \fBexe_info\fP" .br .ti -1c .RI "\fBPAPI_component_info_t\fP * \fBcmp_info\fP" .br .ti -1c .RI "\fBPAPI_addr_range_option_t\fP \fBaddr\fP" .br .ti -1c .RI "PAPI_user_defined_events_file_t \fBevents_file\fP" .br .in -1c .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_overflow.3000066400000000000000000000126511502707512200200160ustar00rootroot00000000000000.TH "PAPI_overflow" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_overflow \- Set up an event set to begin registering overflows\&. 
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP PAPI_overflow() marks a specific EventCode in an EventSet to generate an overflow signal after every threshold events are counted\&. More than one event in an event set can be used to trigger overflows\&. In such cases, the user must call this function once for each overflowing event\&. To turn off overflow on a specified event, call this function with a threshold value of 0\&. .PP Overflows can be implemented in either software or hardware, but the scope is the entire event set\&. PAPI defaults to hardware overflow if it is available\&. In the case of software overflow, a periodic timer interrupt causes PAPI to compare the event counts against the threshold values and call the overflow handler if one or more events have exceeded their threshold\&. In the case of hardware overflow, the counters are typically set to the negative of the threshold value and count up to 0\&. This zero-crossing triggers a hardware interrupt that calls the overflow handler\&. Because of this counter interrupt, the counter values for overflowing counters may be very small or even negative numbers, and cannot be relied upon as accurate\&. In such cases the overflow handler can approximate the counts by supplying the threshold value whenever an overflow occurs\&. .PP _papi_overflow_handler() is a placeholder for a user-defined function to process overflow events\&. A pointer to this function is passed to the \fBPAPI_overflow\fP routine, where it is invoked whenever a software or hardware overflow occurs\&. This handler receives the EventSet of the overflowing event, the Program Counter address when the interrupt occurred, an overflow_vector that can be processed to determine which event(s) caused the overflow, and a pointer to the machine context, which can be used in a platform-specific manner to extract register information about what was happening when the overflow occurred\&.
.PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int \fBPAPI_overflow\fP (int EventSet, int EventCode, int threshold, int flags, PAPI_overflow_handler_t handler ); .br .br (*PAPI_overflow_handler_t) _papi_overflow_handler (int EventSet, void *address, long_long overflow_vector, void *context ); .RE .PP \fBFortran Interface:\fP .RS 4 Not implemented .RE .PP \fBParameters\fP .RS 4 \fIEventSet\fP -- an integer handle to a PAPI event set as created by \fBPAPI_create_eventset\fP .br \fIEventCode\fP -- the preset or native event code to be set for overflow detection\&. This event must have already been added to the EventSet\&. .br \fIthreshold\fP -- the overflow threshold value for this EventCode\&. .br \fIflags\fP -- bitmap that controls the overflow mode of operation\&. Set to PAPI_OVERFLOW_FORCE_SW to force software overflowing, even if hardware overflow support is available\&. If hardware overflow support is available on a given system, it will be the default mode of operation\&. There are situations where it is advantageous to use software overflow instead\&. Although software overflow is inherently less accurate, with more latency and processing overhead, it does allow for overflowing on derived events, and for the accurate recording of overflowing event counts\&. These two features are typically not available with hardware overflow\&. Only one type of overflow is allowed per event set, so setting one event to hardware overflow and another to forced software overflow will result in an error being returned\&. .br \fIhandler\fP -- pointer to the user supplied handler function to call upon overflow .br \fIaddress\fP -- the Program Counter address at the time of the overflow .br \fIoverflow_vector\fP .br -- a long long word containing flag bits to indicate which hardware counter(s) caused the overflow .br \fI*context\fP -- pointer to a machine specific structure that defines the register context at the time of overflow\&. 
This parameter is often unused and can be ignored in the user function\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP On success, \fBPAPI_overflow\fP returns PAPI_OK\&. .br .br \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .br Most likely a bad threshold value\&. .br \fIPAPI_ENOMEM\fP Insufficient memory to complete the operation\&. .br \fIPAPI_ENOEVST\fP The EventSet specified does not exist\&. .br \fIPAPI_EISRUN\fP The EventSet is currently counting events\&. .br \fIPAPI_ECNFLCT\fP The underlying counter hardware cannot count this event and other events in the EventSet simultaneously\&. Also can happen if you are trying to overflow both by hardware and by forced software at the same time\&. .br \fIPAPI_ENOEVNT\fP The PAPI event is not available on the underlying hardware\&. .RE .PP \fBExample\fP .RS 4 .PP .nf // Define a simple overflow handler: void handler(int EventSet, void *address, long_long overflow_vector, void *context) { fprintf(stderr,\\"Overflow at %p! bit=%#llx \\\\n\\", address,overflow_vector); } // Call PAPI_overflow for an EventSet containing PAPI_TOT_INS, // setting the threshold to 100000\&. Use the handler defined above\&. retval = PAPI_overflow(EventSet, PAPI_TOT_INS, 100000, 0, handler); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_get_overflow_event_index\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_perror.3000066400000000000000000000025551502707512200174660ustar00rootroot00000000000000.TH "PAPI_perror" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_perror \- Produces a string on standard error, describing the last library error\&. 
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br void PAPI_perror( const char *s ); .RE .PP \fBParameters\fP .RS 4 \fIs\fP -- Optional message to print before the string describing the last error message\&. .RE .PP The routine PAPI_perror() produces a message on the standard error output, describing the last error encountered during a call to PAPI\&. If s is not NULL, s is printed, followed by a colon and a space\&. Then the error message and a new-line are printed\&. .PP \fBExample:\fP .RS 4 .PP .nf int ret; int EventSet = PAPI_NULL; int native = 0x0; ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) { fprintf(stderr, \\"PAPI error %d: %s\\\\n\\", ret, PAPI_strerror(ret)); exit(1); } // Add Total Instructions Executed to our EventSet ret = PAPI_add_event(EventSet, PAPI_TOT_INS); if (ret != PAPI_OK) { PAPI_perror( "PAPI_add_event" ); exit(1); } // Start counting ret = PAPI_start(EventSet); if (ret != PAPI_OK) handle_error(ret); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_strerror\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_preload_info_t.3000066400000000000000000000007171502707512200211370ustar00rootroot00000000000000.TH "PAPI_preload_info_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_preload_info_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "char \fBlib_preload_env\fP [128]" .br .ti -1c .RI "char \fBlib_preload_sep\fP" .br .ti -1c .RI "char \fBlib_dir_env\fP [128]" .br .ti -1c .RI "char \fBlib_dir_sep\fP" .br .in -1c .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&.
papi-papi-7-2-0-t/man/man3/PAPI_profil.3000066400000000000000000000161441502707512200174470ustar00rootroot00000000000000.TH "PAPI_profil" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_profil \- Generate a histogram of hardware counter overflows vs\&. PC addresses\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int \fBPAPI_profil(void *buf, unsigned bufsiz, unsigned long offset, unsigned scale, int EventSet, int EventCode, int threshold, int flags )\fP; .RE .PP \fBFortran Interface\fP .RS 4 The profiling routines have no Fortran interface\&. .RE .PP \fBParameters\fP .RS 4 \fI*buf\fP -- pointer to a buffer of bufsiz bytes in which the histogram counts are stored in an array of unsigned short, unsigned int, or unsigned long long values, or 'buckets'\&. The size of the buckets is determined by values in the flags argument\&. .br \fIbufsiz\fP -- the size of the histogram buffer in bytes\&. It is computed from the length of the code region to be profiled, the size of the buckets, and the scale factor as discussed above\&. .br \fIoffset\fP -- the start address of the region to be profiled\&. .br \fIscale\fP -- broadly and historically speaking, a contraction factor that indicates how much smaller the histogram buffer is than the region to be profiled\&. More precisely, scale is interpreted as an unsigned 16-bit fixed-point fraction with the decimal point implied on the left\&. Its value is the reciprocal of the number of addresses in a subdivision, per counter of histogram buffer\&. Below is a table of representative values for scale\&. .br \fIEventSet\fP -- The PAPI EventSet to profile\&. This EventSet is marked as profiling-ready, but profiling doesn't actually start until a PAPI_start() call is issued\&. .br \fIEventCode\fP -- Code of the Event in the EventSet to profile\&. This event must already be a member of the EventSet\&. 
.br \fIthreshold\fP -- minimum number of events that must occur before the PC is sampled\&. If hardware overflow is supported for your component, this threshold will trigger an interrupt when reached\&. Otherwise, the counters will be sampled periodically and the PC will be recorded for the first sample that exceeds the threshold\&. If the value of threshold is 0, profiling will be disabled for this event\&. .br \fIflags\fP -- bit pattern to control profiling behavior\&. Defined values are shown in the table above\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .br \fIPAPI_ENOMEM\fP Insufficient memory to complete the operation\&. .br \fIPAPI_ENOEVST\fP The EventSet specified does not exist\&. .br \fIPAPI_EISRUN\fP The EventSet is currently counting events\&. .br \fIPAPI_ECNFLCT\fP The underlying counter hardware can not count this event and other events in the EventSet simultaneously\&. .br \fIPAPI_ENOEVNT\fP The PAPI preset is not available on the underlying hardware\&. .RE .PP PAPI_profil() provides hardware event statistics by profiling the occurrence of specified hardware counter events\&. It is designed to mimic the UNIX SVR4 profil call\&. .PP The statistics are generated by creating a histogram of hardware counter event overflows vs\&. program counter addresses for the current process\&. The histogram is defined for a specific region of program code to be profiled, and the identified region is logically broken up into a set of equal size subdivisions, each of which corresponds to a count in the histogram\&. .PP With each hardware event overflow, the current subdivision is identified and its corresponding histogram count is incremented\&. These counts establish a relative measure of how many hardware counter events are occurring in each code subdivision\&. 
.PP The resulting histogram counts for a profiled region can be used to identify those program addresses that generate a disproportionately high percentage of the event of interest\&. .PP Events to be profiled are specified with the EventSet and EventCode parameters\&. More than one event can be simultaneously profiled by calling PAPI_profil() several times with different EventCode values\&. Profiling can be turned off for a given event by calling PAPI_profil() with a threshold value of 0\&. .PP \fBRepresentative values for the scale variable\fP .RS 4 HEX DECIMAL DEFINITION 0x20000 131072 Maps precisely one instruction address to a unique bucket in buf. 0x10000 65536 Maps precisely two instruction addresses to a unique bucket in buf. 0x0FFFF 65535 Maps approximately two instruction addresses to a unique bucket in buf. 0x08000 32768 Maps every four instruction addresses to a bucket in buf. 0x04000 16384 Maps every eight instruction addresses to a bucket in buf. 0x00002 2 Maps all instruction addresses to the same bucket in buf. 0x00001 1 Undefined. 0x00000 0 Undefined. .RE .PP Historically, the scale factor was introduced to allow the allocation of buffers smaller than the code size to be profiled\&. Data and instruction sizes were assumed to be multiples of 16-bits\&. These assumptions are no longer necessarily true\&. PAPI_profil() has preserved the traditional definition of scale where appropriate, but deprecated the definitions for 0 and 1 (disable scaling) and extended the range of scale to include 65536 and 131072 to allow for exactly two addresses and exactly one address per profiling bucket\&.
.PP The value of bufsiz is computed as follows: .PP bufsiz = (end - start)*(bucket_size/2)*(scale/65536) where .PD 0 .IP "\(bu" 1 bufsiz - the size of the buffer in bytes .IP "\(bu" 1 end, start - the ending and starting addresses of the profiled region .IP "\(bu" 1 bucket_size - the size of each bucket in bytes; 2, 4, or 8 as defined in flags .PP \fBDefined bits for the flags variable:\fP .RS 4 .PD 0 .IP "\(bu" 1 PAPI_PROFIL_POSIX Default type of profiling, similar to profil (3)\&. .br .IP "\(bu" 1 PAPI_PROFIL_RANDOM Drop a random 25% of the samples\&. .br .IP "\(bu" 1 PAPI_PROFIL_WEIGHTED Weight the samples by their value\&. .br .IP "\(bu" 1 PAPI_PROFIL_COMPRESS Ignore samples as values in the hash buckets get big\&. .br .IP "\(bu" 1 PAPI_PROFIL_BUCKET_16 Use unsigned short (16 bit) buckets, This is the default bucket\&. .br .IP "\(bu" 1 PAPI_PROFIL_BUCKET_32 Use unsigned int (32 bit) buckets\&. .br .IP "\(bu" 1 PAPI_PROFIL_BUCKET_64 Use unsigned long long (64 bit) buckets\&. .br .IP "\(bu" 1 PAPI_PROFIL_FORCE_SW Force software overflow in profiling\&. .br .PP .RE .PP \fBExample\fP .RS 4 .PP .nf int retval; unsigned long length; PAPI_exe_info_t *prginfo; unsigned short *profbuf; if ((prginfo = PAPI_get_executable_info()) == NULL) handle_error(1); length = (unsigned long)(prginfo\->text_end \- prginfo\->text_start); profbuf = (unsigned short *)malloc(length); if (profbuf == NULL) handle_error(1); memset(profbuf,0x00,length); if ((retval = PAPI_profil(profbuf, length, start, 65536, EventSet, PAPI_FP_INS, 1000000, PAPI_PROFIL_POSIX | PAPI_PROFIL_BUCKET_16)) != PAPI_OK) handle_error(retval); .fi .PP .RE .PP .PP \fBSee also\fP .RS 4 \fBPAPI_overflow\fP .PP \fBPAPI_sprofil\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPI_query_event.3000066400000000000000000000027161502707512200205220ustar00rootroot00000000000000.TH "PAPI_query_event" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_query_event \- Query if PAPI event exists\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_query_event(int EventCode); .RE .PP PAPI_query_event() asks the PAPI library if the PAPI Preset event can be counted on this architecture\&. If the event CAN be counted, the function returns PAPI_OK\&. If the event CANNOT be counted, the function returns an error code\&. This function also can be used to check the syntax of native and user events\&. .PP \fBParameters\fP .RS 4 \fIEventCode\fP -- a defined event such as PAPI_TOT_INS\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .br \fIPAPI_ENOEVNT\fP The PAPI preset is not available on the underlying hardware\&. .RE .PP \fBExamples\fP .RS 4 .PP .nf int retval; // Initialize the library retval = PAPI_library_init(PAPI_VER_CURRENT); if (retval != PAPI_VER_CURRENT) { fprintf(stderr,\\"PAPI library init error!\\\\n\\"); exit(1); } if (PAPI_query_event(PAPI_TOT_INS) != PAPI_OK) { fprintf(stderr,\\"No instruction counter? How lame\&.\\\\n\\"); exit(1); } .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_remove_event\fP .PP \fBPAPI_remove_events\fP .PP PAPI_presets .PP PAPI_native .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_query_named_event.3000066400000000000000000000026731502707512200216700ustar00rootroot00000000000000.TH "PAPI_query_named_event" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_query_named_event \- Query if a named PAPI event exists\&. 
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_query_named_event(const char *EventName); .RE .PP PAPI_query_named_event() asks the PAPI library if the PAPI named event can be counted on this architecture\&. If the event CAN be counted, the function returns PAPI_OK\&. If the event CANNOT be counted, the function returns an error code\&. This function also can be used to check the syntax of native and user events\&. .PP \fBParameters\fP .RS 4 \fIEventName\fP -- a defined event such as PAPI_TOT_INS\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .br \fIPAPI_ENOEVNT\fP The PAPI preset is not available on the underlying hardware\&. .RE .PP \fBExamples\fP .RS 4 .PP .nf int retval; // Initialize the library retval = PAPI_library_init(PAPI_VER_CURRENT); if (retval != PAPI_VER_CURRENT) { fprintf(stderr,\\"PAPI library init error!\\\\n\\"); exit(1); } if (PAPI_query_named_event("PAPI_TOT_INS") != PAPI_OK) { fprintf(stderr,\\"No instruction counter? How lame\&.\\\\n\\"); exit(1); } .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_query_event\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_rate_stop.3000066400000000000000000000013621502707512200201500ustar00rootroot00000000000000.TH "PAPI_rate_stop" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_rate_stop \- Stop a running event set of a rate function\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface: \fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_rate_stop(); .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_ENOEVNT\fP -- The EventSet is not started yet\&. .br \fIPAPI_ENOMEM\fP -- Insufficient memory to complete the operation\&. .RE .PP \fBPAPI_rate_stop\fP stops a running event set of a rate function\&. 
.PP \fBSee also\fP .RS 4 PAPI_flips_rate() .PP PAPI_flops_rate() .PP PAPI_ipc() .PP PAPI_epc() .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_read.3000066400000000000000000000033351502707512200170650ustar00rootroot00000000000000.TH "PAPI_read" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_read \- Read hardware counters from an event set\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int \fBPAPI_read(int EventSet, long_long * values )\fP; .RE .PP PAPI_read() copies the counters of the indicated event set into the provided array\&. .PP The counters continue counting after the read\&. .PP Note the differences between PAPI_read() and PAPI_accum(), specifically that PAPI_accum() adds the counters into the values array and resets the hardware counters to zero\&. .PP PAPI_read() assumes an initialized PAPI library and a properly added event set\&. .PP \fBParameters\fP .RS 4 \fIEventSet\fP -- an integer handle for a PAPI Event Set as created by PAPI_create_eventset() .br \fI*values\fP -- an array to hold the counter values of the counting events .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .br \fIPAPI_ESYS\fP A system or C library call failed inside PAPI, see the errno variable\&. .br \fIPAPI_ENOEVST\fP The event set specified does not exist\&.
.RE .PP \fBExamples\fP .RS 4 .PP .nf do_100events(); if (PAPI_read(EventSet, values) != PAPI_OK) handle_error(1); // values[0] now equals 100 do_100events(); if (PAPI_accum(EventSet, values) != PAPI_OK) handle_error(1); // values[0] now equals 300 values[0] = \-100; do_100events(); if (PAPI_accum(EventSet, values) != PAPI_OK) handle_error(1); // values[0] now equals 0 .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_accum\fP .PP \fBPAPI_start\fP .PP \fBPAPI_stop\fP .PP \fBPAPI_reset\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_read_ts.3000066400000000000000000000026611502707512200175740ustar00rootroot00000000000000.TH "PAPI_read_ts" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_read_ts \- Read hardware counters with a timestamp\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_read_ts(int EventSet, long long *values, long long *cycles ); .RE .PP PAPI_read_ts() copies the counters of the indicated event set into the provided array\&. It also places a real-time cycle timestamp into the cycles array\&. .PP The counters continue counting after the read\&. .PP PAPI_read_ts() assumes an initialized PAPI library and a properly added event set\&. .PP \fBParameters\fP .RS 4 \fIEventSet\fP -- an integer handle for a PAPI Event Set as created by PAPI_create_eventset() .br \fI*values\fP -- an array to hold the counter values of the counting events .br \fI*cycles\fP -- an array to hold the timestamp values .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .br \fIPAPI_ESYS\fP A system or C library call failed inside PAPI, see the errno variable\&. .br \fIPAPI_ENOEVST\fP The event set specified does not exist\&. 
.RE .PP \fBExamples\fP .RS 4 .PP .nf // Read the counters and a real-time cycle timestamp long long values[1], cycles; if (PAPI_read_ts(EventSet, values, &cycles) != PAPI_OK) handle_error(1); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_read\fP .PP \fBPAPI_accum\fP .PP \fBPAPI_start\fP .PP \fBPAPI_stop\fP .PP \fBPAPI_reset\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_register_thread.3000066400000000000000000000023411502707512200213210ustar00rootroot00000000000000.TH "PAPI_register_thread" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_register_thread \- Notify PAPI that a thread has 'appeared'\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int \fBPAPI_register_thread\fP (void); .RE .PP PAPI_register_thread() should be called when the user wants to force PAPI to initialize a thread that PAPI has not seen before\&. .PP Usually this is not necessary as PAPI implicitly detects the thread when an eventset is created or other thread local PAPI functions are called\&. However, it can be useful for debugging and performance enhancements in the run-time systems of performance tools\&. .PP \fBReturn values\fP .RS 4 \fIPAPI_ENOMEM\fP Space could not be allocated to store the new thread information\&. .br \fIPAPI_ESYS\fP A system or C library call failed inside PAPI, see the errno variable\&. .br \fIPAPI_ECMP\fP Hardware counters for this thread could not be initialized\&. .RE .PP \fBSee also\fP .RS 4 \fBPAPI_unregister_thread\fP .PP \fBPAPI_thread_id\fP .PP \fBPAPI_thread_init\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_remove_event.3000066400000000000000000000045141502707512200206500ustar00rootroot00000000000000.TH "PAPI_remove_event" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_remove_event \- removes a hardware event from a PAPI event set\&.
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP A hardware event can be either a PAPI Preset or a native hardware event code\&. For a list of PAPI preset events, see PAPI_presets or run the papi_avail utility in the PAPI distribution\&. PAPI Presets can be passed to \fBPAPI_query_event\fP to see if they exist on the underlying architecture\&. For a list of native events available on the current platform, run papi_native_avail in the PAPI distribution\&. .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_remove_event( int EventSet, int EventCode ); .RE .PP \fBParameters\fP .RS 4 \fIEventSet\fP -- an integer handle for a PAPI event set as created by \fBPAPI_create_eventset\fP .br \fIEventCode\fP -- a defined event such as PAPI_TOT_INS or a native event\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP Everything worked\&. .br \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .br \fIPAPI_ENOEVST\fP The EventSet specified does not exist\&. .br \fIPAPI_EISRUN\fP The EventSet is currently counting events\&. .br \fIPAPI_ECNFLCT\fP The underlying counter hardware can not count this event and other events in the EventSet simultaneously\&. .br \fIPAPI_ENOEVNT\fP The PAPI preset is not available on the underlying hardware\&. 
.RE .PP \fBExample:\fP .RS 4 .PP .nf int EventSet = PAPI_NULL; int ret; // Create an empty EventSet ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); // Add Total Instructions Executed to our EventSet ret = PAPI_add_event(EventSet, PAPI_TOT_INS); if (ret != PAPI_OK) handle_error(ret); // Start counting ret = PAPI_start(EventSet); if (ret != PAPI_OK) handle_error(ret); // Stop counting, ignore values ret = PAPI_stop(EventSet, NULL); if (ret != PAPI_OK) handle_error(ret); // Remove event ret = PAPI_remove_event(EventSet, PAPI_TOT_INS); if (ret != PAPI_OK) handle_error(ret); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_cleanup_eventset\fP .PP \fBPAPI_destroy_eventset\fP .PP \fBPAPI_event_name_to_code\fP .PP PAPI_presets .PP \fBPAPI_add_event\fP .PP \fBPAPI_add_events\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_remove_events.3000066400000000000000000000051031502707512200210270ustar00rootroot00000000000000.TH "PAPI_remove_events" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_remove_events \- Remove an array of hardware event codes from a PAPI event set\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP A hardware event can be either a PAPI Preset or a native hardware event code\&. For a list of PAPI preset events, see PAPI_presets or run the papi_avail utility in the PAPI distribution\&. PAPI Presets can be passed to \fBPAPI_query_event\fP to see if they exist on the underlying architecture\&. For a list of native events available on the current platform, run papi_native_avail in the PAPI distribution\&. It should be noted that \fBPAPI_remove_events\fP can partially succeed, exactly like \fBPAPI_add_events\fP\&.
.PP \fBC Prototype:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_remove_events( int EventSet, int * Events, int number ); .RE .PP \fBParameters\fP .RS 4 \fIEventSet\fP an integer handle for a PAPI event set as created by \fBPAPI_create_eventset\fP .br \fI*Events\fP an array of defined events .br \fInumber\fP an integer indicating the number of events in the array *Events .RE .PP \fBReturn values\fP .RS 4 \fIPositive\fP integer The number of consecutive elements that succeeded before the error\&. .br \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .br \fIPAPI_ENOEVST\fP The EventSet specified does not exist\&. .br \fIPAPI_EISRUN\fP The EventSet is currently counting events\&. .br \fIPAPI_ECNFLCT\fP The underlying counter hardware can not count this event and other events in the EventSet simultaneously\&. .br \fIPAPI_ENOEVNT\fP The PAPI preset is not available on the underlying hardware\&. .RE .PP \fBExample:\fP .RS 4 .PP .nf int EventSet = PAPI_NULL; int Events[] = {PAPI_TOT_INS, PAPI_FP_OPS}; int ret; // Create an empty EventSet ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); // Add two events to our EventSet ret = PAPI_add_events(EventSet, Events, 2); if (ret != PAPI_OK) handle_error(ret); // Start counting ret = PAPI_start(EventSet); if (ret != PAPI_OK) handle_error(ret); // Stop counting, ignore values ret = PAPI_stop(EventSet, NULL); if (ret != PAPI_OK) handle_error(ret); // Remove events ret = PAPI_remove_events(EventSet, Events, 2); if (ret != PAPI_OK) handle_error(ret); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_cleanup_eventset\fP \fBPAPI_destroy_eventset\fP \fBPAPI_event_name_to_code\fP PAPI_presets \fBPAPI_add_event\fP \fBPAPI_add_events\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&.
papi-papi-7-2-0-t/man/man3/PAPI_remove_named_event.3000066400000000000000000000046131502707512200220140ustar00rootroot00000000000000.TH "PAPI_remove_named_event" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_remove_named_event \- removes a named hardware event from a PAPI event set\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP A hardware event can be either a PAPI Preset or a native hardware event code\&. For a list of PAPI preset events, see PAPI_presets or run the papi_avail utility in the PAPI distribution\&. PAPI Presets can be passed to \fBPAPI_query_event\fP to see if they exist on the underlying architecture\&. For a list of native events available on the current platform, run papi_native_avail in the PAPI distribution\&. .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_remove_named_event( int EventSet, const char *EventName ); .RE .PP \fBParameters\fP .RS 4 \fIEventSet\fP -- an integer handle for a PAPI event set as created by \fBPAPI_create_eventset\fP .br \fIEventName\fP -- a defined event such as PAPI_TOT_INS or a native event\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP Everything worked\&. .br \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .br \fIPAPI_ENOINIT\fP The PAPI library has not been initialized\&. .br \fIPAPI_ENOEVST\fP The EventSet specified does not exist\&. .br \fIPAPI_EISRUN\fP The EventSet is currently counting events\&. .br \fIPAPI_ECNFLCT\fP The underlying counter hardware can not count this event and other events in the EventSet simultaneously\&. .br \fIPAPI_ENOEVNT\fP The PAPI preset is not available on the underlying hardware\&. 
.RE .PP \fBExample:\fP .RS 4 .PP .nf const char *EventName = "PAPI_TOT_INS"; int EventSet = PAPI_NULL; int ret; // Create an empty EventSet ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); // Add Total Instructions Executed to our EventSet ret = PAPI_add_named_event(EventSet, EventName); if (ret != PAPI_OK) handle_error(ret); // Start counting ret = PAPI_start(EventSet); if (ret != PAPI_OK) handle_error(ret); // Stop counting, ignore values ret = PAPI_stop(EventSet, NULL); if (ret != PAPI_OK) handle_error(ret); // Remove event ret = PAPI_remove_named_event(EventSet, EventName); if (ret != PAPI_OK) handle_error(ret); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_remove_event\fP .br \fBPAPI_query_named_event\fP .br \fBPAPI_add_named_event\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_reset.3000066400000000000000000000031131502707512200172660ustar00rootroot00000000000000.TH "PAPI_reset" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_reset \- Reset the hardware event counts in an event set\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Prototype:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_reset( int EventSet ); .RE .PP \fBParameters\fP .RS 4 \fIEventSet\fP an integer handle for a PAPI event set as created by \fBPAPI_create_eventset\fP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_ESYS\fP A system or C library call failed inside PAPI, see the errno variable\&. .br \fIPAPI_ENOEVST\fP The EventSet specified does not exist\&. .RE .PP PAPI_reset() zeroes the values of the counters contained in EventSet\&.
This call assumes an initialized PAPI library and a properly added event set\&. .PP \fBExample:\fP .RS 4 .PP .nf int EventSet = PAPI_NULL; int Events[] = {PAPI_TOT_INS, PAPI_FP_OPS}; int ret; // Create an empty EventSet ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); // Add two events to our EventSet ret = PAPI_add_events(EventSet, Events, 2); if (ret != PAPI_OK) handle_error(ret); // Start counting ret = PAPI_start(EventSet); if (ret != PAPI_OK) handle_error(ret); // Stop counting, ignore values ret = PAPI_stop(EventSet, NULL); if (ret != PAPI_OK) handle_error(ret); // reset the counters in this EventSet ret = PAPI_reset(EventSet); if (ret != PAPI_OK) handle_error(ret); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_create_eventset\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_set_cmp_domain.3000066400000000000000000000052241502707512200211320ustar00rootroot00000000000000.TH "PAPI_set_cmp_domain" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_set_cmp_domain \- Set the default counting domain for new event sets bound to the specified component\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Prototype:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_set_cmp_domain( int domain, int cidx ); .RE .PP \fBParameters\fP .RS 4 \fIdomain\fP one of the following constants as defined in the \fBpapi\&.h\fP header file .PD 0 .IP "\(bu" 1 PAPI_DOM_USER User context counted .IP "\(bu" 1 PAPI_DOM_KERNEL Kernel/OS context counted .IP "\(bu" 1 PAPI_DOM_OTHER Exception/transient mode counted .IP "\(bu" 1 PAPI_DOM_SUPERVISOR Supervisor/hypervisor context counted .IP "\(bu" 1 PAPI_DOM_ALL All above contexts counted .IP "\(bu" 1 PAPI_DOM_MIN The smallest available context .IP "\(bu" 1 PAPI_DOM_MAX The largest available context .IP "\(bu" 1 PAPI_DOM_HWSPEC Something other than CPU like stuff\&.
Individual components can decode low order bits for more meaning .PP .br \fIcidx\fP An integer identifier for a component\&. By convention, component 0 is always the cpu component\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .br \fIPAPI_ENOCMP\fP The argument cidx is not a valid component\&. .RE .PP \fBPAPI_set_cmp_domain\fP sets the default counting domain for all new event sets in all threads, and requires an explicit component argument\&. Event sets that are already in existence are not affected\&. To change the domain of an existing event set, please see \fBPAPI_set_opt\fP\&. The reader should note that the domain of an event set affects only the mode in which the counter continues to run\&. Counts are still aggregated for the current process, and not for any other processes in the system\&. Thus when requesting PAPI_DOM_KERNEL , the user is asking for events that occur on behalf of the process, inside the kernel\&. .PP \fBExample:\fP .RS 4 .PP .nf int ret; // Initialize the library ret = PAPI_library_init(PAPI_VER_CURRENT); if (ret > 0 && ret != PAPI_VER_CURRENT) { fprintf(stderr,"PAPI library version mismatch!\\n"); exit(1); } if (ret < 0) handle_error(ret); // Set the default domain for the cpu component ret = PAPI_set_cmp_domain(PAPI_DOM_KERNEL,0); if (ret != PAPI_OK) handle_error(ret); ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_set_domain\fP \fBPAPI_set_granularity\fP \fBPAPI_set_opt\fP \fBPAPI_get_opt\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPI_set_cmp_granularity.3000066400000000000000000000045241502707512200222260ustar00rootroot00000000000000.TH "PAPI_set_cmp_granularity" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_set_cmp_granularity \- Set the default counting granularity for eventsets bound to the specified component\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Prototype:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_set_cmp_granularity( int granularity, int cidx ); .RE .PP \fBParameters\fP .RS 4 \fIgranularity\fP one of the following constants as defined in the \fBpapi\&.h\fP header file .PD 0 .IP "\(bu" 1 PAPI_GRN_THR Count each individual thread .IP "\(bu" 1 PAPI_GRN_PROC Count each individual process .IP "\(bu" 1 PAPI_GRN_PROCG Count each individual process group .IP "\(bu" 1 PAPI_GRN_SYS Count the current CPU .IP "\(bu" 1 PAPI_GRN_SYS_CPU Count all CPUs individually .IP "\(bu" 1 PAPI_GRN_MIN The finest available granularity .IP "\(bu" 1 PAPI_GRN_MAX The coarsest available granularity .PP .br \fIcidx\fP An integer identifier for a component\&. By convention, component 0 is always the cpu component\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .br \fIPAPI_ENOCMP\fP The argument cidx is not a valid component\&. .RE .PP \fBPAPI_set_cmp_granularity\fP sets the default counting granularity for all new event sets, and requires an explicit component argument\&. Event sets that are already in existence are not affected\&. .PP To change the granularity of an existing event set, please see \fBPAPI_set_opt\fP\&. The reader should note that the granularity of an event set affects only the mode in which the counter continues to run\&. 
.PP \fBExample:\fP .RS 4 .PP .nf int ret; // Initialize the library ret = PAPI_library_init(PAPI_VER_CURRENT); if (ret > 0 && ret != PAPI_VER_CURRENT) { fprintf(stderr,"PAPI library version mismatch!\\n"); exit(1); } if (ret < 0) handle_error(ret); // Set the default granularity for the cpu component ret = PAPI_set_cmp_granularity(PAPI_GRN_PROC, 0); if (ret != PAPI_OK) handle_error(ret); ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_set_granularity\fP \fBPAPI_set_domain\fP \fBPAPI_set_opt\fP \fBPAPI_get_opt\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_set_debug.3000066400000000000000000000041131502707512200201060ustar00rootroot00000000000000.TH "PAPI_set_debug" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_set_debug \- Set the current debug level for error output from PAPI\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Prototype:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_set_debug( int level ); .RE .PP \fBParameters\fP .RS 4 \fIlevel\fP one of the constants shown in the table below and defined in the \fBpapi\&.h\fP header file\&. .br The possible debug levels for debugging are shown below\&. .PD 0 .IP "\(bu" 1 PAPI_QUIET Do not print anything, just return the error code .IP "\(bu" 1 PAPI_VERB_ECONT Print error message and continue .IP "\(bu" 1 PAPI_VERB_ESTOP Print error message and exit .br .PP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_EINVAL\fP The debug level is invalid\&. .br .br The current debug level is used by both the internal error and debug message handler subroutines\&. .br The debug handler is only used if the library was compiled with -DDEBUG\&. .br The debug handler is called when there is an error upon a call to the PAPI API\&. 
.br The error handler is always active and its behavior cannot be modified except for whether or not it prints anything\&. .RE .PP The default PAPI debug handler prints out messages in the following form: .br PAPI Error: Error Code code, symbol, description .PP If the error was caused by a system call and the return code is PAPI_ESYS, the message will have a colon space and the error string as reported by strerror() appended to the end\&. .PP The PAPI error handler prints out messages in the following form: .br PAPI Error: message\&. .br .PP \fBNote\fP .RS 4 This is the ONLY function that may be called BEFORE PAPI_library_init()\&. .br .RE .PP \fBExample:\fP .RS 4 .PP .nf int ret; ret = PAPI_set_debug(PAPI_VERB_ECONT); if ( ret != PAPI_OK ) handle_error(ret); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_library_init\fP .PP \fBPAPI_get_opt\fP .PP \fBPAPI_set_opt\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_set_domain.3000066400000000000000000000037041502707512200202740ustar00rootroot00000000000000.TH "PAPI_set_domain" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_set_domain \- Set the default counting domain for new event sets bound to the cpu component\&.
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Prototype:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_set_domain( int domain ); .RE .PP \fBParameters\fP .RS 4 \fIdomain\fP one of the following constants as defined in the \fBpapi\&.h\fP header file .PD 0 .IP "\(bu" 1 PAPI_DOM_USER User context counted .IP "\(bu" 1 PAPI_DOM_KERNEL Kernel/OS context counted .IP "\(bu" 1 PAPI_DOM_OTHER Exception/transient mode counted .IP "\(bu" 1 PAPI_DOM_SUPERVISOR Supervisor/hypervisor context counted .IP "\(bu" 1 PAPI_DOM_ALL All above contexts counted .IP "\(bu" 1 PAPI_DOM_MIN The smallest available context .IP "\(bu" 1 PAPI_DOM_MAX The largest available context .PP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .RE .PP \fBPAPI_set_domain\fP sets the default counting domain for all new event sets created by \fBPAPI_create_eventset\fP in all threads\&. This call implicitly sets the domain for the cpu component (component 0) and is included to preserve backward compatibility\&. .PP \fBExample:\fP .RS 4 .PP .nf int ret; // Initialize the library ret = PAPI_library_init(PAPI_VER_CURRENT); if (ret > 0 && ret != PAPI_VER_CURRENT) { fprintf(stderr,"PAPI library version mismatch!\\n"); exit(1); } if (ret < 0) handle_error(ret); // Set the default domain for the cpu component ret = PAPI_set_domain(PAPI_DOM_KERNEL); if (ret != PAPI_OK) handle_error(ret); ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_set_cmp_domain\fP \fBPAPI_set_granularity\fP \fBPAPI_set_opt\fP \fBPAPI_get_opt\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPI_set_granularity.3000066400000000000000000000037761502707512200213730ustar00rootroot00000000000000.TH "PAPI_set_granularity" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_set_granularity \- Set the default counting granularity for eventsets bound to the cpu component\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Prototype:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_set_granularity( int granularity ); .RE .PP \fBParameters\fP .RS 4 \fIgranularity\fP one of the following constants as defined in the \fBpapi\&.h\fP header file .PD 0 .IP "\(bu" 1 PAPI_GRN_THR -- Count each individual thread .IP "\(bu" 1 PAPI_GRN_PROC -- Count each individual process .IP "\(bu" 1 PAPI_GRN_PROCG -- Count each individual process group .IP "\(bu" 1 PAPI_GRN_SYS -- Count the current CPU .IP "\(bu" 1 PAPI_GRN_SYS_CPU -- Count all CPUs individually .IP "\(bu" 1 PAPI_GRN_MIN -- The finest available granularity .IP "\(bu" 1 PAPI_GRN_MAX -- The coarsest available granularity .PP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .RE .PP \fBPAPI_set_granularity\fP sets the default counting granularity for all new event sets created by \fBPAPI_create_eventset\fP\&. This call implicitly sets the granularity for the cpu component (component 0) and is included to preserve backward compatibility\&.
.PP \fBExample:\fP .RS 4 .PP .nf int ret; // Initialize the library ret = PAPI_library_init(PAPI_VER_CURRENT); if (ret > 0 && ret != PAPI_VER_CURRENT) { fprintf(stderr,"PAPI library version mismatch!\\n"); exit(1); } if (ret < 0) handle_error(ret); // Set the default granularity for the cpu component ret = PAPI_set_granularity(PAPI_GRN_PROC); if (ret != PAPI_OK) handle_error(ret); ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_set_cmp_granularity\fP \fBPAPI_set_domain\fP \fBPAPI_set_opt\fP \fBPAPI_get_opt\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_set_multiplex.3000066400000000000000000000053601502707512200210500ustar00rootroot00000000000000.TH "PAPI_set_multiplex" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_set_multiplex \- Convert a standard event set to a multiplexed event set\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_set_multiplex( int EventSet ); .RE .PP \fBParameters\fP .RS 4 \fIEventSet\fP an integer handle for a PAPI event set as created by \fBPAPI_create_eventset\fP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_EINVAL\fP -- One or more of the arguments is invalid, or the EventSet is already multiplexed\&. .br \fIPAPI_ENOCMP\fP -- The EventSet specified is not yet bound to a component\&. .br \fIPAPI_ENOEVST\fP -- The EventSet specified does not exist\&. .br \fIPAPI_EISRUN\fP -- The EventSet is currently counting events\&. .br \fIPAPI_ENOMEM\fP -- Insufficient memory to complete the operation\&. .RE .PP \fBPAPI_set_multiplex\fP converts a standard PAPI event set created by a call to \fBPAPI_create_eventset\fP into an event set capable of handling multiplexed events\&. 
This must be done after calling \fBPAPI_multiplex_init\fP, and either \fBPAPI_add_event\fP or \fBPAPI_assign_eventset_component\fP, but prior to calling PAPI_start()\&. .PP Events can be added to an event set either before or after converting it into a multiplexed set, but the conversion must be done prior to using it as a multiplexed set\&. .PP \fBNote\fP .RS 4 Multiplexing can't be enabled until PAPI knows which component is targeted\&. Due to the late binding nature of PAPI event sets, this only happens after adding an event to an event set or explicitly binding the component with a call to \fBPAPI_assign_eventset_component\fP\&. .RE .PP \fBExample:\fP .RS 4 .PP .nf int EventSet = PAPI_NULL; int ret; // Create an empty EventSet ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); // Bind it to the CPU component ret = PAPI_assign_eventset_component(EventSet, 0); if (ret != PAPI_OK) handle_error(ret); // Check current multiplex status ret = PAPI_get_multiplex(EventSet); if (ret == TRUE) printf("This event set is ready for multiplexing\&.\\n"); if (ret == FALSE) printf("This event set is not enabled for multiplexing\&.\\n"); if (ret < 0) handle_error(ret); // Turn on multiplexing ret = PAPI_set_multiplex(EventSet); if ((ret == PAPI_EINVAL) && (PAPI_get_multiplex(EventSet) == TRUE)) printf("This event set already has multiplexing enabled\\n"); else if (ret != PAPI_OK) handle_error(ret); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_multiplex_init\fP .PP \fBPAPI_get_multiplex\fP .PP \fBPAPI_set_opt\fP .PP \fBPAPI_create_eventset\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_set_opt.3000066400000000000000000000105551502707512200176310ustar00rootroot00000000000000.TH "PAPI_set_opt" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_set_opt \- Set PAPI library or event set options\&.
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_set_opt( int option, PAPI_option_t * ptr ); .RE .PP \fBParameters\fP .RS 4 \fIoption\fP Defines the option to be set\&. Possible values are briefly described in the table below\&. .br \fIptr\fP Pointer to a structure determined by the selected option\&. See \fBPAPI_option_t\fP for a description of possible structures\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_EINVAL\fP The specified option or parameter is invalid\&. .br \fIPAPI_ENOEVST\fP The EventSet specified does not exist\&. .br \fIPAPI_EISRUN\fP The EventSet is currently counting events\&. .br \fIPAPI_ECMP\fP The option is not implemented for the current component\&. .br \fIPAPI_ENOINIT\fP PAPI has not been initialized\&. .br \fIPAPI_EINVAL_DOM\fP Invalid domain has been requested\&. .RE .PP PAPI_set_opt() changes the options of the PAPI library or a specific EventSet created by \fBPAPI_create_eventset\fP\&. Some options may require that the EventSet be bound to a component before they can execute successfully\&. This can be done either by adding an event or by explicitly calling \fBPAPI_assign_eventset_component\fP\&. .PP Ptr is a pointer to the \fBPAPI_option_t\fP structure, which is actually a union of different structures for different options\&. Not all options require or return information in these structures\&. Each requires different values to be set\&. Some options require a component index to be provided\&. These options are handled implicitly through the option structures\&. .PP \fBNote\fP .RS 4 Some options, such as PAPI_DOMAIN and PAPI_MULTIPLEX are also available as separate entry points in both C and Fortran\&. .RE .PP The reader is encouraged to peruse the ctests code in the PAPI distribution for examples of usage of \fBPAPI_set_opt\fP\&. 
.PP \fBPossible values for the PAPI_set_opt option parameter\fP .RS 4 .PD 0 .IP "\(bu" 1 PAPI_DEFDOM -- Set default counting domain for newly created event sets\&. Requires a component index\&. .IP "\(bu" 1 PAPI_DEFGRN -- Set default counting granularity\&. Requires a component index\&. .IP "\(bu" 1 PAPI_DEBUG -- Set the PAPI debug state and the debug handler\&. The debug state is specified in ptr->debug\&.level\&. The debug handler is specified in ptr->debug\&.handler\&. For further information regarding debug states and the behavior of the handler, see \fBPAPI_set_debug\fP\&. .IP "\(bu" 1 PAPI_MULTIPLEX -- Enable specified EventSet for multiplexing\&. .IP "\(bu" 1 PAPI_DEF_ITIMER -- Set the type of itimer used in software multiplexing, overflowing and profiling\&. .IP "\(bu" 1 PAPI_DEF_MPX_NS -- Set the sampling time slice in nanoseconds for multiplexing and overflow\&. .IP "\(bu" 1 PAPI_DEF_ITIMER_NS -- See PAPI_DEF_MPX_NS\&. .IP "\(bu" 1 PAPI_ATTACH -- Attach EventSet specified in ptr->attach\&.eventset to thread or process id specified in ptr->attach\&.tid\&. .IP "\(bu" 1 PAPI_CPU_ATTACH -- Attach EventSet specified in ptr->cpu\&.eventset to cpu specified in ptr->cpu\&.cpu_num\&. .IP "\(bu" 1 PAPI_DETACH -- Detach EventSet specified in ptr->attach\&.eventset from any thread or process id\&. .IP "\(bu" 1 PAPI_DOMAIN -- Set domain for EventSet specified in ptr->domain\&.eventset\&. Will error if eventset is not bound to a component\&. .IP "\(bu" 1 PAPI_GRANUL -- Set granularity for EventSet specified in ptr->granularity\&.eventset\&. Will error if eventset is not bound to a component\&. .IP "\(bu" 1 PAPI_INHERIT -- Enable or disable inheritance for specified EventSet\&. .IP "\(bu" 1 PAPI_DATA_ADDRESS -- Set data address range to restrict event counting for EventSet specified in ptr->addr\&.eventset\&. Starting and ending addresses are specified in ptr->addr\&.start and ptr->addr\&.end, respectively\&. If exact addresses cannot be instantiated, offsets are returned in ptr->addr\&.start_off and ptr->addr\&.end_off\&. Currently implemented on Itanium only\&. .IP "\(bu" 1 PAPI_INSTR_ADDRESS -- Set instruction address range as described above\&. Itanium only\&. .PP
.RE .PP \fBSee also\fP .RS 4 \fBPAPI_set_debug\fP .PP \fBPAPI_set_multiplex\fP .PP \fBPAPI_set_domain\fP .PP \fBPAPI_option_t\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_set_thr_specific.3000066400000000000000000000037511502707512200214710ustar00rootroot00000000000000.TH "PAPI_set_thr_specific" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_set_thr_specific \- Store a pointer to a thread specific data structure\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBPrototype:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_set_thr_specific( int tag, void *ptr ); .RE .PP \fBParameters\fP .RS 4 \fItag\fP An identifier, the value of which is either PAPI_USR1_TLS or PAPI_USR2_TLS\&. This identifier indicates which of several data structures associated with this thread is to be accessed\&. .br \fIptr\fP A pointer to the memory containing the data structure\&. .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_EINVAL\fP The \fItag\fP argument is out of range\&. .RE .PP In C, \fBPAPI_set_thr_specific\fP will save \fIptr\fP into an array indexed by \fItag\fP\&. There are 2 user available locations and \fItag\fP can be either PAPI_USR1_TLS or PAPI_USR2_TLS\&. The array mentioned above is managed by PAPI and allocated to each thread which has called \fBPAPI_thread_init\fP\&. There is no Fortran equivalent function\&. .PP \fBExample:\fP .RS 4 .PP .nf int ret; RateInfo *state = NULL; ret = PAPI_thread_init(pthread_self); if (ret != PAPI_OK) handle_error(ret); // Do we have the thread specific data setup yet? 
ret = PAPI_get_thr_specific(PAPI_USR1_TLS, (void *) &state); if (ret != PAPI_OK || state == NULL) { state = (RateInfo *) malloc(sizeof(RateInfo)); if (state == NULL) return (PAPI_ESYS); memset(state, 0, sizeof(RateInfo)); state\->EventSet = PAPI_NULL; ret = PAPI_create_eventset(&state\->EventSet); if (ret != PAPI_OK) return (PAPI_ESYS); ret = PAPI_set_thr_specific(PAPI_USR1_TLS, state); if (ret != PAPI_OK) return (ret); } .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_register_thread\fP \fBPAPI_thread_init\fP \fBPAPI_thread_id\fP \fBPAPI_get_thr_specific\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_shlib_info_t.3000066400000000000000000000005541502707512200206110ustar00rootroot00000000000000.TH "PAPI_shlib_info_t" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_shlib_info_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "\fBPAPI_address_map_t\fP * \fBmap\fP" .br .ti -1c .RI "int \fBcount\fP" .br .in -1c .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_shutdown.3000066400000000000000000000014021502707512200200160ustar00rootroot00000000000000.TH "PAPI_shutdown" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_shutdown \- Finish using PAPI and free all related resources\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Prototype:\fP .RS 4 #include <\fBpapi\&.h\fP> .br void PAPI_shutdown( void ); .RE .PP PAPI_shutdown() is an exit function used by the PAPI Library to free resources and shut down when certain error conditions arise\&. It is not necessary for the user to call this function, but doing so allows the user to free memory and resources used by the PAPI Library\&.
.PP \fBSee also\fP .RS 4 \fBPAPI_library_init\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_sprofil.3000066400000000000000000000105611502707512200176270ustar00rootroot00000000000000.TH "PAPI_sprofil" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_sprofil \- Generate PC histogram data from multiple code regions where hardware counter overflow occurs\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_sprofil( PAPI_sprofil_t * prof, int profcnt, int EventSet, int EventCode, int threshold, int flags ); .RE .PP \fBParameters\fP .RS 4 \fI*prof\fP pointer to an array of \fBPAPI_sprofil_t\fP structures\&. Each copy of the structure contains the following: .PD 0 .IP "\(bu" 1 buf -- pointer to a buffer of bufsiz bytes in which the histogram counts are stored in an array of unsigned short, unsigned int, or unsigned long long values, or 'buckets'\&. The size of the buckets is determined by values in the flags argument\&. .IP "\(bu" 1 bufsiz -- the size of the histogram buffer in bytes\&. It is computed from the length of the code region to be profiled, the size of the buckets, and the scale factor as discussed below\&. .IP "\(bu" 1 offset -- the start address of the region to be profiled\&. .IP "\(bu" 1 scale -- broadly and historically speaking, a contraction factor that indicates how much smaller the histogram buffer is than the region to be profiled\&. More precisely, scale is interpreted as an unsigned 16-bit fixed-point fraction with the decimal point implied on the left\&. Its value is the reciprocal of the number of addresses in a subdivision, per counter of histogram buffer\&. .PP .br \fIprofcnt\fP number of structures in the prof array for hardware profiling\&. .br \fIEventSet\fP The PAPI EventSet to profile\&.
This EventSet is marked as profiling-ready, but profiling doesn't actually start until a PAPI_start() call is issued\&. .br \fIEventCode\fP Code of the Event in the EventSet to profile\&. This event must already be a member of the EventSet\&. .br \fIthreshold\fP minimum number of events that must occur before the PC is sampled\&. If hardware overflow is supported for your component, this threshold will trigger an interrupt when reached\&. Otherwise, the counters will be sampled periodically and the PC will be recorded for the first sample that exceeds the threshold\&. If the value of threshold is 0, profiling will be disabled for this event\&. .br \fIflags\fP bit pattern to control profiling behavior\&. Defined values are given in a table in the documentation for \fBPAPI_profil\fP .RE .PP \fBReturn values\fP .RS 4 \fIReturn\fP values for PAPI_sprofil() are identical to those for \fBPAPI_profil\fP\&. Please refer to that page for further details\&. .RE .PP PAPI_sprofil() is a structure-driven profiler that profiles one or more disjoint regions of code in a single call\&. It accepts a pointer to a preinitialized array of sprofil structures, and initiates profiling based on the values contained in the array\&. Each structure in the array defines the profiling parameters that are normally passed to PAPI_profil()\&. 
For more information on profiling, see \fBPAPI_profil\fP\&. .PP \fBExample:\fP .RS 4 .PP .nf int retval; unsigned long length; PAPI_exe_info_t *prginfo; unsigned short *profbuf1, *profbuf2, profbucket; PAPI_sprofil_t sprof[3]; prginfo = PAPI_get_executable_info(); if (prginfo == NULL) handle_error( NULL ); length = (unsigned long)(prginfo\->text_end \- prginfo\->text_start); // Allocate 2 buffers of equal length profbuf1 = (unsigned short *)malloc(length); profbuf2 = (unsigned short *)malloc(length); if ((profbuf1 == NULL) || (profbuf2 == NULL)) handle_error( NULL ); memset(profbuf1,0x00,length); memset(profbuf2,0x00,length); // First buffer sprof[0]\&.pr_base = profbuf1; sprof[0]\&.pr_size = length; sprof[0]\&.pr_off = (vptr_t) DO_FLOPS; sprof[0]\&.pr_scale = 0x10000; // Second buffer sprof[1]\&.pr_base = profbuf2; sprof[1]\&.pr_size = length; sprof[1]\&.pr_off = (vptr_t) DO_READS; sprof[1]\&.pr_scale = 0x10000; // Overflow bucket sprof[2]\&.pr_base = &profbucket; sprof[2]\&.pr_size = 1; sprof[2]\&.pr_off = 0; sprof[2]\&.pr_scale = 0x0002; retval = PAPI_sprofil(sprof, 3, EventSet, PAPI_FP_INS, 1000000, PAPI_PROFIL_POSIX | PAPI_PROFIL_BUCKET_16); if ( retval != PAPI_OK ) handle_error( retval ); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_overflow\fP .PP \fBPAPI_get_executable_info\fP .PP \fBPAPI_profil\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPI_sprofil_t.3000066400000000000000000000014441502707512200201520ustar00rootroot00000000000000.TH "PAPI_sprofil_t" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_sprofil_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "void * \fBpr_base\fP" .br .ti -1c .RI "unsigned \fBpr_size\fP" .br .ti -1c .RI "vptr_t \fBpr_off\fP" .br .ti -1c .RI "unsigned \fBpr_scale\fP" .br .in -1c .SH "Field Documentation" .PP .SS "void* PAPI_sprofil_t::pr_base" buffer base .SS "vptr_t PAPI_sprofil_t::pr_off" pc start address (offset) .SS "unsigned PAPI_sprofil_t::pr_scale" pc scaling factor: fixed point fraction 0xffff ~= 1, 0x8000 == \&.5, 0x4000 == \&.25, etc\&. also, two extensions 0x1000 == 1, 0x2000 == 2 .SS "unsigned PAPI_sprofil_t::pr_size" buffer size .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_start.3000066400000000000000000000037131502707512200173070ustar00rootroot00000000000000.TH "PAPI_start" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_start \- Start counting hardware events in an event set\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_start( int EventSet ); .RE .PP \fBParameters\fP .RS 4 \fIEventSet\fP -- an integer handle for a PAPI event set as created by \fBPAPI_create_eventset\fP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_EINVAL\fP -- One or more of the arguments is invalid\&. .br \fIPAPI_ESYS\fP -- A system or C library call failed inside PAPI, see the errno variable\&. .br \fIPAPI_ENOEVST\fP -- The EventSet specified does not exist\&. .br \fIPAPI_EISRUN\fP -- The EventSet is currently counting events\&. .br \fIPAPI_ECNFLCT\fP -- The underlying counter hardware can not count this event and other events in the EventSet simultaneously\&. 
.br \fIPAPI_ENOEVNT\fP -- The PAPI preset is not available on the underlying hardware\&. .RE .PP \fBPAPI_start\fP starts counting all of the hardware events contained in the previously defined EventSet\&. All counters are implicitly set to zero before counting\&. Assumes an initialized PAPI library and a properly added event set\&. .PP \fBExample:\fP .RS 4 .PP .nf int EventSet = PAPI_NULL; long long values[2]; int ret; ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); // Add Total Instructions Executed to our EventSet ret = PAPI_add_event(EventSet, PAPI_TOT_INS); if (ret != PAPI_OK) handle_error(ret); // Start counting ret = PAPI_start(EventSet); if (ret != PAPI_OK) handle_error(ret); poorly_tuned_function(); ret = PAPI_stop(EventSet, values); if (ret != PAPI_OK) handle_error(ret); printf("%lld\\\\n",values[0]); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_create_eventset\fP \fBPAPI_add_event\fP \fBPAPI_stop\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_state.3000066400000000000000000000043041502707512200172670ustar00rootroot00000000000000.TH "PAPI_state" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_state \- Return the counting state of an EventSet\&. 
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_state( int EventSet, int * status ); .RE .PP \fBParameters\fP .RS 4 \fIEventSet\fP -- an integer handle for a PAPI event set as created by \fBPAPI_create_eventset\fP .br \fIstatus\fP -- an integer containing a boolean combination of one or more of the following nonzero constants as defined in the PAPI header file \fBpapi\&.h\fP: .PD 0 .IP "\(bu" 1 PAPI_STOPPED -- EventSet is stopped .IP "\(bu" 1 PAPI_RUNNING -- EventSet is running .IP "\(bu" 1 PAPI_PAUSED -- EventSet temporarily disabled by the library .IP "\(bu" 1 PAPI_NOT_INIT -- EventSet defined, but not initialized .IP "\(bu" 1 PAPI_OVERFLOWING -- EventSet has overflowing enabled .IP "\(bu" 1 PAPI_PROFILING -- EventSet has profiling enabled .IP "\(bu" 1 PAPI_MULTIPLEXING -- EventSet has multiplexing enabled .IP "\(bu" 1 PAPI_ACCUMULATING -- reserved for future use .IP "\(bu" 1 PAPI_HWPROFILING -- reserved for future use .PP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .br \fIPAPI_ENOEVST\fP The EventSet specified does not exist\&. .RE .PP PAPI_state() returns the counting state of the specified event set\&. 
.PP \fBExample:\fP .RS 4 .PP .nf int EventSet = PAPI_NULL; int status = 0; int ret; ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); // Add Total Instructions Executed to our EventSet ret = PAPI_add_event(EventSet, PAPI_TOT_INS); if (ret != PAPI_OK) handle_error(ret); // Check the state of our EventSet ret = PAPI_state(EventSet, &status); if (ret != PAPI_OK) handle_error(ret); printf("State is now %d\\n",status); // Start counting ret = PAPI_start(EventSet); if (ret != PAPI_OK) handle_error(ret); ret = PAPI_state(EventSet, &status); if (ret != PAPI_OK) handle_error(ret); printf("State is now %d\\n",status); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_stop\fP \fBPAPI_start\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_stop.3000066400000000000000000000034301502707512200171330ustar00rootroot00000000000000.TH "PAPI_stop" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_stop \- Stop counting hardware events in an event set\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br int PAPI_stop( int EventSet, long long * values ); .RE .PP \fBParameters\fP .RS 4 \fIEventSet\fP -- an integer handle for a PAPI event set as created by \fBPAPI_create_eventset\fP .br \fIvalues\fP -- an array to hold the counter values of the counting events .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. .br \fIPAPI_ESYS\fP A system or C library call failed inside PAPI, see the errno variable\&. .br \fIPAPI_ENOEVST\fP The EventSet specified does not exist\&. .br \fIPAPI_ENOTRUN\fP The EventSet is currently not running\&. .RE .PP \fBPAPI_stop\fP halts the counting of a previously defined event set, and the counter values contained in that EventSet are copied into the values array\&. Assumes an initialized PAPI library and a properly added event set\&. 
.PP \fBExample:\fP .RS 4 .PP .nf int EventSet = PAPI_NULL; long long values[2]; int ret; ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); // Add Total Instructions Executed to our EventSet ret = PAPI_add_event(EventSet, PAPI_TOT_INS); if (ret != PAPI_OK) handle_error(ret); // Start counting ret = PAPI_start(EventSet); if (ret != PAPI_OK) handle_error(ret); poorly_tuned_function(); ret = PAPI_stop(EventSet, values); if (ret != PAPI_OK) handle_error(ret); printf("%lld\\\\n",values[0]); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_create_eventset\fP \fBPAPI_start\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_strerror.3000066400000000000000000000030701502707512200200300ustar00rootroot00000000000000.TH "PAPI_strerror" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_strerror \- Returns a string describing the PAPI error code\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBC Interface:\fP .RS 4 #include <\fBpapi\&.h\fP> .br char * PAPI_strerror( int errorCode ); .RE .PP \fBParameters\fP .RS 4 \fIcode\fP .br -- the error code to interpret .RE .PP \fBReturn values\fP .RS 4 \fI*error\fP -- a pointer to the error string\&. .br \fINULL\fP -- the input error code to PAPI_strerror() is invalid\&. .RE .PP PAPI_strerror() returns a pointer to the error message corresponding to the error code code\&. If the call fails the function returns the NULL pointer\&. This function is not implemented in Fortran\&. 
.PP \fBExample:\fP .RS 4 .PP .nf int ret; int EventSet = PAPI_NULL; ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) { fprintf(stderr, "PAPI error %d: %s\\n", ret, PAPI_strerror(ret)); exit(1); } // Add Total Instructions Executed to our EventSet ret = PAPI_add_event(EventSet, PAPI_TOT_INS); if (ret != PAPI_OK) { PAPI_perror( "PAPI_add_event"); fprintf(stderr, "PAPI error %d: %s\\n", ret, PAPI_strerror(ret)); exit(1); } // Start counting ret = PAPI_start(EventSet); if (ret != PAPI_OK) handle_error(ret); .fi .PP .RE .PP \fBSee also\fP .RS 4 \fBPAPI_perror\fP \fBPAPI_set_opt\fP \fBPAPI_get_opt\fP \fBPAPI_shutdown\fP \fBPAPI_set_debug\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_thread_id.3000066400000000000000000000015241502707512200200730ustar00rootroot00000000000000.TH "PAPI_thread_id" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_thread_id \- Get the thread identifier of the current thread\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBReturn values\fP .RS 4 \fIPAPI_EMISC\fP is returned if there are no threads registered\&. .br \fI-1\fP is returned if the thread id function returns an error\&. .RE .PP This function returns a valid thread identifier\&. It calls the function registered with PAPI through a call to PAPI_thread_init()\&. .PP .PP .nf unsigned long tid; if ((tid = PAPI_thread_id()) == (unsigned long int)\-1 ) exit(1); printf("Initial thread id is: %lu\\n", tid ); .fi .PP .PP \fBSee also\fP .RS 4 \fBPAPI_thread_init\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPI_thread_init.3000066400000000000000000000025341502707512200204440ustar00rootroot00000000000000.TH "PAPI_thread_init" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_thread_init \- Initialize thread support in the PAPI library\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBParameters\fP .RS 4 \fI*id_fn\fP Pointer to a function that returns the current thread ID\&. .RE .PP \fBPAPI_thread_init\fP initializes thread support in the PAPI library\&. Applications that make no use of threads do not need to call this routine\&. The id_fn function MUST return a UNIQUE thread ID for every new thread/LWP created\&. The OpenMP call omp_get_thread_num() violates this rule, as the underlying LWPs may have been killed off by the run-time system or by a call to omp_set_num_threads()\&. In that case, it may still be possible to use omp_get_thread_num() in conjunction with PAPI_unregister_thread() when the OpenMP thread has finished\&. However, it is much better to use the underlying thread subsystem's call, which is pthread_self() on Linux platforms\&. .PP .PP .nf if ( PAPI_thread_init(pthread_self) != PAPI_OK ) exit(1); .fi .PP .PP \fBSee also\fP .RS 4 \fBPAPI_register_thread\fP \fBPAPI_unregister_thread\fP \fBPAPI_get_thr_specific\fP \fBPAPI_set_thr_specific\fP \fBPAPI_thread_id\fP \fBPAPI_list_threads\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_unlock.3000066400000000000000000000011231502707512200174360ustar00rootroot00000000000000.TH "PAPI_unlock" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_unlock \- Unlock one of the mutex variables defined in \fBpapi\&.h\fP\&. 
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBParameters\fP .RS 4 \fIlck\fP an integer value specifying one of the two user locks: PAPI_USR1_LOCK or PAPI_USR2_LOCK .RE .PP PAPI_unlock() unlocks the mutex acquired by a call to \fBPAPI_lock\fP \&. .PP \fBSee also\fP .RS 4 \fBPAPI_thread_init\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPI_unregister_thread.3000066400000000000000000000024631502707512200216710ustar00rootroot00000000000000.TH "PAPI_unregister_thread" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_unregister_thread \- Notify PAPI that a thread has 'disappeared'\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBReturn values\fP .RS 4 \fIPAPI_ENOMEM\fP Space could not be allocated to store the new thread information\&. .br \fIPAPI_ESYS\fP A system or C library call failed inside PAPI, see the errno variable\&. .br \fIPAPI_ECMP\fP Hardware counters for this thread could not be initialized\&. .RE .PP \fBPAPI_unregister_thread\fP should be called when the user wants to shutdown a particular thread and free the associated thread ID\&. THIS IS IMPORTANT IF YOUR THREAD LIBRARY REUSES THE SAME THREAD ID FOR A NEW KERNEL LWP\&. OpenMP does this\&. OpenMP parallel regions, if separated by a call to omp_set_num_threads() will often kill off the underlying kernel LWPs and then start new ones for the next region\&. However, omp_get_thread_id() does not reflect this, as the thread IDs for the new LWPs will be the same as the old LWPs\&. PAPI needs to know that the underlying LWP has changed so it can set up the counters for that new thread\&. This is accomplished by calling this function\&. .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/PAPI_write.3000066400000000000000000000023341502707512200173020ustar00rootroot00000000000000.TH "PAPI_write" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPI_write \- Write counter values into counters\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBParameters\fP .RS 4 \fIEventSet\fP an integer handle for a PAPI event set as created by \fBPAPI_create_eventset\fP .br \fI*values\fP an array to hold the counter values of the counting events .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_ENOEVST\fP The EventSet specified does not exist\&. .br \fIPAPI_ECMP\fP PAPI_write() is not implemented for this architecture\&. .br \fIPAPI_ESYS\fP The EventSet is currently counting events and the component could not change the values of the running counters\&. .RE .PP PAPI_write() writes the counter values provided in the array values into the event set EventSet\&. The virtual counters managed by the PAPI library will be set to the values provided\&. If the event set is running, an attempt will be made to write the values to the running counters\&. This operation is not permitted by all components and may result in a run-time error\&. .PP \fBSee also\fP .RS 4 \fBPAPI_read\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIf_hl_read.3000066400000000000000000000032341502707512200177140ustar00rootroot00000000000000.TH "PAPIf_hl_read" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIf_hl_read \- Reads and stores hardware events inside of an instrumented code region\&. 
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Prototype:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIf_hl_read( C_STRING region, C_INT check )\fP .RE .PP \fBParameters\fP .RS 4 \fIregion\fP -- a unique region name corresponding to \fBPAPIf_hl_region_begin\fP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_ENOTRUN\fP -- EventSet is currently not running or could not be determined\&. .br \fIPAPI_ESYS\fP -- A system or C library call failed inside PAPI, see the errno variable\&. .br \fIPAPI_EMISC\fP -- PAPI has been deactivated due to previous errors\&. .br \fIPAPI_ENOMEM\fP -- Insufficient memory\&. .RE .PP \fBPAPIf_hl_read\fP reads hardware events and stores them internally inside of an instrumented code region\&. Assumes that \fBPAPIf_hl_region_begin\fP was called before\&. .PP \fBExample:\fP .RS 4 .RE .PP .PP .nf integer retval call PAPIf_hl_region_begin("computation", retval) if ( retval \&.NE\&. PAPI_OK ) then write (*,*) "PAPIf_hl_region_begin failed!" end if !do some computation here call PAPIf_hl_read("computation", retval) if ( retval \&.NE\&. PAPI_OK ) then write (*,*) "PAPIf_hl_read failed!" end if !do some computation here call PAPIf_hl_region_end("computation", retval) if ( retval \&.NE\&. PAPI_OK ) then write (*,*) "PAPIf_hl_region_end failed!" end if .fi .PP .PP \fBSee also\fP .RS 4 \fBPAPI_hl_read\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIf_hl_region_begin.3000066400000000000000000000033231502707512200214270ustar00rootroot00000000000000.TH "PAPIf_hl_region_begin" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIf_hl_region_begin \- Reads and stores hardware events at the beginning of an instrumented code region\&. 
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Prototype:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIf_hl_region_begin( C_STRING region, C_INT check )\fP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_ENOTRUN\fP -- EventSet is currently not running or could not be determined\&. .br \fIPAPI_ESYS\fP -- A system or C library call failed inside PAPI, see the errno variable\&. .br \fIPAPI_EMISC\fP -- PAPI has been deactivated due to previous errors\&. .br \fIPAPI_ENOMEM\fP -- Insufficient memory\&. .RE .PP \fBPAPIf_hl_region_begin\fP reads hardware events and stores them internally at the beginning of an instrumented code region\&. If not specified via environment variable PAPI_EVENTS, default events are used\&. The first call sets all counters implicitly to zero and starts counting\&. Note that if PAPI_EVENTS is not set or cannot be interpreted, default hardware events are recorded\&. .PP \fBExample:\fP .RS 4 .RE .PP .PP .nf export PAPI_EVENTS="PAPI_TOT_INS,PAPI_TOT_CYC" .fi .PP .PP .PP .nf integer retval call PAPIf_hl_region_begin("computation", retval) if ( retval \&.NE\&. PAPI_OK ) then write (*,*) "PAPIf_hl_region_begin failed!" end if !do some computation here call PAPIf_hl_region_end("computation", retval) if ( retval \&.NE\&. PAPI_OK ) then write (*,*) "PAPIf_hl_region_end failed!" end if .fi .PP .PP \fBSee also\fP .RS 4 \fBPAPI_hl_region_begin\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIf_hl_region_end.3000066400000000000000000000032461502707512200211150ustar00rootroot00000000000000.TH "PAPIf_hl_region_end" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIf_hl_region_end \- Reads and stores hardware events at the end of an instrumented code region\&. 
.SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Prototype:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIf_hl_region_end( C_STRING region, C_INT check )\fP .RE .PP \fBParameters\fP .RS 4 \fIregion\fP -- a unique region name corresponding to \fBPAPIf_hl_region_begin\fP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_OK\fP .br \fIPAPI_ENOTRUN\fP -- EventSet is currently not running or could not be determined\&. .br \fIPAPI_ESYS\fP -- A system or C library call failed inside PAPI, see the errno variable\&. .br \fIPAPI_EMISC\fP -- PAPI has been deactivated due to previous errors\&. .br \fIPAPI_ENOMEM\fP -- Insufficient memory\&. .RE .PP \fBPAPIf_hl_region_end\fP reads hardware events and stores the difference relative to the values from \fBPAPIf_hl_region_begin\fP at the end of an instrumented code region\&. Assumes that \fBPAPIf_hl_region_begin\fP was called before\&. Note that an output is automatically generated when your application terminates\&. .PP \fBExample:\fP .RS 4 .RE .PP .PP .nf integer retval call PAPIf_hl_region_begin("computation", retval) if ( retval \&.NE\&. PAPI_OK ) then write (*,*) "PAPIf_hl_region_begin failed!" end if !do some computation here call PAPIf_hl_region_end("computation", retval) if ( retval \&.NE\&. PAPI_OK ) then write (*,*) "PAPIf_hl_region_end failed!" end if .fi .PP .PP \fBSee also\fP .RS 4 \fBPAPI_hl_region_end\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/PAPIf_hl_stop.3000066400000000000000000000030221502707512200177610ustar00rootroot00000000000000.TH "PAPIf_hl_stop" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME PAPIf_hl_stop \- Stop a running high-level event set\&. .SH SYNOPSIS .br .PP .SH "Detailed Description" .PP .PP \fBFortran Prototype:\fP .RS 4 #include 'fpapi\&.h' .br \fBPAPIf_hl_stop( C_INT check )\fP .RE .PP \fBReturn values\fP .RS 4 \fIPAPI_ENOEVNT\fP -- The EventSet is not started yet\&. 
.br \fIPAPI_ENOMEM\fP -- Insufficient memory to complete the operation\&. .RE .PP \fBPAPIf_hl_stop\fP stops a running high-level event set\&. .PP This call is optional and only necessary if the programmer wants to use the low-level API in addition to the high-level API\&. It should be noted that \fBPAPIf_hl_stop\fP and low-level calls are not allowed inside of a marked region\&. Furthermore, \fBPAPIf_hl_stop\fP is thread-local and therefore has to be called in the same thread as the corresponding marked region\&. .PP \fBExample:\fP .RS 4 .RE .PP .PP .nf integer retval call PAPIf_hl_region_begin("computation", retval) if ( retval \&.NE\&. PAPI_OK ) then write (*,*) "PAPIf_hl_region_begin failed!" end if !do some computation here call PAPIf_hl_region_end("computation", retval) if ( retval \&.NE\&. PAPI_OK ) then write (*,*) "PAPIf_hl_region_end failed!" end if call PAPIf_hl_stop(retval) if ( retval \&.NE\&. PAPI_OK ) then write (*,*) "PAPIf_hl_stop failed!" end if .fi .PP .PP \fBSee also\fP .RS 4 \fBPAPI_hl_stop\fP .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/RateInfo.3000066400000000000000000000014641502707512200170510ustar00rootroot00000000000000.TH "RateInfo" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME RateInfo .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "int \fBEventSet\fP" .br .ti -1c .RI "int \fBevent_0\fP" .br .ti -1c .RI "short int \fBrunning\fP" .br .ti -1c .RI "long long \fBlast_real_time\fP" .br .ti -1c .RI "long long \fBlast_proc_time\fP" .br .in -1c .SH "Field Documentation" .PP .SS "int RateInfo::event_0" first event of the eventset .SS "int RateInfo::EventSet" EventSet of the thread .SS "long long RateInfo::last_proc_time" Previous value of processor time .SS "long long RateInfo::last_real_time" Previous value of real time .SS "short int RateInfo::running" STOP, FLIP, FLOP, IPC or EPC .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/binary_tree_t.3000066400000000000000000000010211502707512200201550ustar00rootroot00000000000000.TH "binary_tree_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME binary_tree_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "void * \fBroot\fP" .br .ti -1c .RI "\fBthreads_t\fP * \fBfind_p\fP" .br .in -1c .SH "Field Documentation" .PP .SS "\fBthreads_t\fP* binary_tree_t::find_p" Pointer that is used for finding a thread node .SS "void* binary_tree_t::root" Root of binary tree .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/components_t.3000066400000000000000000000010611502707512200200430ustar00rootroot00000000000000.TH "components_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME components_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "int \fBcomponent_id\fP" .br .ti -1c .RI "int \fBnum_of_events\fP" .br .ti -1c .RI "int \fBmax_num_of_events\fP" .br .ti -1c .RI "char ** \fBevent_names\fP" .br .ti -1c .RI "int * \fBevent_codes\fP" .br .ti -1c .RI "short * \fBevent_types\fP" .br .ti -1c .RI "int \fBEventSet\fP" .br .in -1c .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/local_components_t.3000066400000000000000000000007211502707512200212170ustar00rootroot00000000000000.TH "local_components_t" 3 "Wed Jun 25 2025 19:30:48" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME local_components_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "int \fBEventSet\fP" .br .ti -1c .RI "long_long * \fBvalues\fP" .br .in -1c .SH "Field Documentation" .PP .SS "long_long* local_components_t::values" Return values for the eventsets .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/papi_hl_output_writer_Sum_Counter.3000066400000000000000000000045741502707512200243220ustar00rootroot00000000000000.TH "papi_hl_output_writer.Sum_Counter" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_hl_output_writer.Sum_Counter \- \fBSum_Counter\fP class defintion\&. .SH SYNOPSIS .br .PP .PP Inherits object\&. .SS "Public Member Functions" .in +1c .ti -1c .RI "\fB__init__\fP (self)" .br .RI "\fBSum_Counter\fP class initializer\&. " .ti -1c .RI "\fBadd_event\fP (self, value)" .br .RI "Method definition for add_event\&. " .ti -1c .RI "\fBget_min\fP (self)" .br .RI "Method definition for get_min\&. 
" .ti -1c .RI "\fBget_median\fP (self)" .br .RI "Method definition for get_median\&. " .ti -1c .RI "\fBget_sum\fP (self)" .br .RI "Method definition for get_sum\&. " .ti -1c .RI "\fBget_max\fP (self)" .br .RI "Method definition for get_max\&. " .in -1c .SS "Data Fields" .in +1c .ti -1c .RI "\fBmin\fP" .br .ti -1c .RI "\fBall_values\fP" .br .ti -1c .RI "\fBmax\fP" .br .in -1c .SH "Detailed Description" .PP Calculates the min, max, median or sum for the measurements of a recorded events\&. .SH "Member Function Documentation" .PP .SS "papi_hl_output_writer\&.Sum_Counter\&.add_event ( self, value)" Add a recorded event and measurement to summary output\&. .PP \fBParameters\fP .RS 4 \fIvalue\fP Measurement from a recorded event\&. E\&.g\&. PAPI_TOT_INS\&. .RE .PP .SS "papi_hl_output_writer\&.Sum_Counter\&.get_max ( self)" Calculate the maximum for a set of measurements for a recorded event\&. .PP \fBReturns\fP .RS 4 The maximum for a set of measurement values for a recorded event\&. E\&.g\&. PAPI_TOT_INS\&. .RE .PP .SS "papi_hl_output_writer\&.Sum_Counter\&.get_median ( self)" Calculates the median for a set of measurements for a recorded event\&. .PP \fBReturns\fP .RS 4 The median for a set of measurement values for a recorded event\&. E\&.g\&. PAPI_TOT_INS\&. .RE .PP .SS "papi_hl_output_writer\&.Sum_Counter\&.get_min ( self)" Calculates the minimum for a set of measurements for a recorded event\&. .PP \fBReturns\fP .RS 4 The minimum for a set of measurement values for a recorded event\&. E\&.g\&. PAPI_TOT_INS\&. .RE .PP .SS "papi_hl_output_writer\&.Sum_Counter\&.get_sum ( self)" Calculates the sum for a set of measurements for a recorded event\&. .PP \fBReturns\fP .RS 4 The sum of measurement values for a recorded event\&. E\&.g\&. PAPI_TOT_INS\&. .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/papi_hl_output_writer_Sum_Counters.3000066400000000000000000000041221502707512200244720ustar00rootroot00000000000000.TH "papi_hl_output_writer.Sum_Counters" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME papi_hl_output_writer.Sum_Counters \- \fBSum_Counters\fP class definition\&. .SH SYNOPSIS .br .PP .PP Inherits object\&. .SS "Public Member Functions" .in +1c .ti -1c .RI "\fB__init__\fP (self)" .br .RI "\fBSum_Counters\fP class initializer\&. " .ti -1c .RI "\fBadd_region\fP (self, rank_id, thread_id, events=OrderedDict())" .br .RI "Method definition for add_region\&. " .ti -1c .RI "\fBget_json\fP (self)" .br .RI "Method definition for get_json\&. " .in -1c .SS "Data Fields" .in +1c .ti -1c .RI "\fBregions\fP" .br .ti -1c .RI "\fBregions_last_rank_id\fP" .br .ti -1c .RI "\fBregions_rank_num\fP" .br .ti -1c .RI "\fBregions_last_thread_id\fP" .br .ti -1c .RI "\fBregions_thread_num\fP" .br .ti -1c .RI "\fBclean_regions\fP" .br .ti -1c .RI "\fBsum_counters\fP" .br .in -1c .SH "Detailed Description" .PP Gathers summary output for a region (e\&.g\&. computation) and accompanying measurements for a recorded event (e\&.g\&. PAPI_TOT_INS)\&. .SH "Member Function Documentation" .PP .SS "papi_hl_output_writer\&.Sum_Counters\&.add_region ( self, rank_id, thread_id, events = \fROrderedDict()\fP)" Adds the region (e\&.g\&. computation) and accompanying measurements for a recorded event (e\&.g\&. PAPI_TOT_INS) to summary output\&. .PP \fBParameters\fP .RS 4 \fIrank_id\fP MPI rank, if no MPI rank is present this value will be random\&. .br \fIthread_id\fP Thread identifier containing performance events\&. E\&.g\&. 0\&. .br \fIevents\fP An ordered dictionary containing measurements for recorded events obtained through PAPI HL function calls\&. E\&.g\&. PAPI_TOT_INS\&. 
.RE .PP .SS "papi_hl_output_writer\&.Sum_Counters\&.get_json ( self)" Calculates the min, max, median, or sum for a set of measurements for a recorded event\&. E\&.g\&. PAPI_TOT_INS\&. .PP \fBReturns\fP .RS 4 An ordered dictionary containing summary measurements for recorded events\&. E\&.g\&. PAPI_TOT_INS\&. .RE .PP .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/reads_t.3000066400000000000000000000007121502707512200167560ustar00rootroot00000000000000.TH "reads_t" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME reads_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "struct reads * \fBnext\fP" .br .ti -1c .RI "struct reads * \fBprev\fP" .br .ti -1c .RI "long_long \fBvalue\fP" .br .in -1c .SH "Field Documentation" .PP .SS "long_long reads_t::value" Event value .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/regions_t.3000066400000000000000000000014601502707512200173270ustar00rootroot00000000000000.TH "regions_t" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME regions_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "unsigned int \fBregion_id\fP" .br .ti -1c .RI "int \fBparent_region_id\fP" .br .ti -1c .RI "char * \fBregion\fP" .br .ti -1c .RI "struct regions * \fBnext\fP" .br .ti -1c .RI "struct regions * \fBprev\fP" .br .ti -1c .RI "\fBvalue_t\fP \fBvalues\fP []" .br .in -1c .SH "Field Documentation" .PP .SS "int regions_t::parent_region_id" Region ID of parent region .SS "char* regions_t::region" Region name .SS "unsigned int regions_t::region_id" Unique region ID .SS "\fBvalue_t\fP regions_t::values[]" Array of event values based on current eventset .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
papi-papi-7-2-0-t/man/man3/threads_t.3000066400000000000000000000007431502707512200173160ustar00rootroot00000000000000.TH "threads_t" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME threads_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "unsigned long \fBkey\fP" .br .ti -1c .RI "\fBregions_t\fP * \fBvalue\fP" .br .in -1c .SH "Field Documentation" .PP .SS "unsigned long threads_t::key" Thread ID .SS "\fBregions_t\fP* threads_t::value" List of regions .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. papi-papi-7-2-0-t/man/man3/value_t.3000066400000000000000000000012141502707512200167720ustar00rootroot00000000000000.TH "value_t" 3 "Wed Jun 25 2025 19:30:49" "Version 7.2.0.0" "PAPI" \" -*- nroff -*- .ad l .nh .SH NAME value_t .SH SYNOPSIS .br .PP .SS "Data Fields" .in +1c .ti -1c .RI "long_long \fBbegin\fP" .br .ti -1c .RI "long_long \fBregion_value\fP" .br .ti -1c .RI "\fBreads_t\fP * \fBread_values\fP" .br .in -1c .SH "Field Documentation" .PP .SS "long_long value_t::begin" Event value for region_begin .SS "\fBreads_t\fP* value_t::read_values" List of read event values inside a region .SS "long_long value_t::region_value" Delta value for region_end - region_begin .SH "Author" .PP Generated automatically by Doxygen for PAPI from the source code\&. 
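The threads_t, regions_t, reads_t, and value_t structures documented above form a linked model: each thread owns a list of regions, and each region holds per-event values. A minimal Python sketch of the documented value_t semantics (`begin` is the count at region_begin, `region_value` is the delta region_end - region_begin, and `read_values` collects values observed inside the region); the class and method names here are illustrative, not part of the PAPI API:

```python
class Value:
    """Illustrative model of value_t from the high-level output."""
    def __init__(self, begin):
        self.begin = begin        # event value at region_begin
        self.region_value = 0     # delta: region_end - region_begin
        self.read_values = []     # event values observed inside the region

    def read(self, current):
        # Record a value sampled between region_begin and region_end.
        self.read_values.append(current)

    def end(self, current):
        # Documented semantics: delta value for region_end - region_begin.
        self.region_value = current - self.begin

v = Value(begin=1000)
v.read(1500)
v.end(2300)
```

With a begin count of 1000 and an end count of 2300, the region delta comes out to 1300, matching the "region_end - region_begin" description in the value_t man page.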
papi-papi-7-2-0-t/papi.spec000066400000000000000000000066121502707512200154520ustar00rootroot00000000000000Summary: Performance Application Programming Interface Name: papi Version: 7.2.0.0 Release: 1%{?dist} License: BSD Group: Development/System URL: http://icl.utk.edu/papi/ Source0: http://icl.utk.edu/projects/papi/downloads/%{name}-%{version}.tar.gz BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root BuildRequires: ncurses-devel BuildRequires: gcc-gfortran BuildRequires: kernel-headers >= 2.6.32 BuildRequires: chrpath #Right now libpfm does not know anything about s390 and will fail ExcludeArch: s390 s390x # Conditional for rocm_smi support %bcond_with rocm_smi # rocm_smi path detection %if %{with rocm_smi} # First try user-defined path, then default locations %define rocm_smi_path %{?_rocm_smi_path:%{_rocm_smi_path}}%{!?_rocm_smi_path:/opt/rocm} # Verify rocm_smi exists at the expected path %{!?__rocm_smi_exists:%global __rocm_smi_exists %(test -e %{rocm_smi_path} && echo 1 || echo 0)} %if !%{__rocm_smi_exists} %{error: rocm_smi not found at %{rocm_smi_path}, install rocm_smi or specify alternate path with --define="_rocm_smi_path /path/to/rocm_smi"} %endif %endif %description PAPI provides a programmer interface to monitor the performance of running programs. %package devel Summary: Header files for compiling programs with PAPI Group: Development/System Requires: papi = %{version}-%{release} %description devel PAPI-devel includes the C header files that specify the PAPI userspace libraries and interfaces. This is required for rebuilding any program that uses PAPI.
%prep %setup -q %build cd src %configure --with-static-lib=no --with-shared-lib=yes --with-shlib \ %if %{with rocm_smi} PAPI_ROCMSMI_ROOT=%{rocm_smi_path} \ --with-rocm-smi \ %endif #DBG workaround to make sure libpfm just uses the normal CFLAGS DBG="" make #%check #cd src #make fulltest %install rm -rf $RPM_BUILD_ROOT cd src make DESTDIR=$RPM_BUILD_ROOT install chrpath --delete $RPM_BUILD_ROOT%{_libdir}/*.so* # Remove the static libraries. Static libraries are undesirable: # https://fedoraproject.org/wiki/Packaging/Guidelines#Packaging_Static_Libraries rm -rf $RPM_BUILD_ROOT%{_libdir}/*.a %post -p /sbin/ldconfig %postun -p /sbin/ldconfig %clean rm -rf $RPM_BUILD_ROOT %files %defattr(-,root,root,-) %{_bindir}/* %{_libdir}/*.so.* /usr/share/papi %doc INSTALL.txt README LICENSE.txt RELEASENOTES.txt %files devel %defattr(-,root,root,-) %{_includedir}/*.h %{_includedir}/perfmon %{_libdir}/*.so %doc %{_mandir}/man3/* %doc %{_mandir}/man1/* %changelog * Tue Jan 31 2012 Dan Terpstra - 4.2.1 - Rebase to papi-4.2.1 * Wed Dec 8 2010 Dan Terpstra - 4.1.2-1 - Rebase to papi-4.1.2 * Tue Jun 8 2010 William Cohen - 4.1.0-1 - Rebase to papi-4.1.0 * Mon May 17 2010 William Cohen - 4.0.0-5 - Test run with upstream cvs version. * Wed Feb 10 2010 William Cohen - 4.0.0-4 - Resolves: rhbz562935 Rebase to papi-4.0.0 (correct ExcludeArch). * Wed Feb 10 2010 William Cohen - 4.0.0-3 - Resolves: rhbz562935 Rebase to papi-4.0.0 (bump nvr). * Wed Feb 10 2010 William Cohen - 4.0.0-2 - correct the ctests/shlib test - have PAPI_set_multiplex() return proper value - properly handle event unit masks - correct PAPI_name_to_code() to match events - Resolves: rhbz562935 Rebase to papi-4.0.0 * Wed Jan 13 2010 William Cohen - 4.0.0-1 - Generate papi.spec file for papi-4.0.0. papi-papi-7-2-0-t/release_procedure.txt000066400000000000000000000456171502707512200201040ustar00rootroot00000000000000Release Procedure for PAPI ========================== Below is a step-wise procedure for making a PAPI release.
This is a living document and may not be totally current or accurate. It is an attempt to capture the current practices in making a PAPI release. Please update it as appropriate. One way to use this procedure is to print a copy and check off the lines as they are completed to avoid confusion. ================================================================================ __ 0a. Notify developers that a release is imminent and the repository should be considered frozen. Check the GitHub Actions / make fulltest results to make sure the codebase is bug-free. Before running 'make fulltest', ensure you do NOT configure PAPI using '--with-debug=yes'. This forces -O0 which will cause some of the papi_tot_cyc.c tests to fail. We run our 'make fulltest' with no components installed (other than the defaults). Several components require specialized hardware, operating systems, or sudo privileges to operate. You should use '--with-debug=yes' if you wish to do any valgrind testing for memory leaks. You can conduct such tests running the code './validation_tests/memleak_check.c' using valgrind. You may need to load a module to use valgrind. BEFORE YOU BEGIN: Step 2d will require access to the papi website directory, so all these steps should be done on a machine that has it. Most recently we updated using methane.icl.utk.edu, where '/nfs/www/icl/projectsdev/papi/docs' was accessible. The sysadmin must give you write permission. This directory should be available on all machines. You must also be on a machine that will let you rebuild the manual pages. This requires doxygen version 1.8.5 (See step 2a). You can check the version with 'doxygen -v' or 'doxygen --version'. The website directory may be 'automount'. That means you cannot see it with 'ls' and it will not be searched by 'find'. You must change to the directory for it to automount. For example, 'pushd /nfs/www/icl/projectsdev/papi/docs' to get it mounted (if this fails, you don't have access, discuss with your sysadmin). 
'popd' to return to your previous directory, THEN you can 'ls /nfs/www/icl/projectsdev/papi/docs' and see the files. --- Modified with 6.0.0 to use a Pull Request, to allow review of changes before committing. __ 0b. Fork the repository. It is located at: https://github.com/icl-utk-edu/papi. GitHub only allows you to create a single fork. If you have not already created a fork of the PAPI repository at this point then do so. For creating a fork on GitHub, click the Fork dropdown in the top right corner of the PAPI repository and then select "+ Create a new fork". Fill out the boxes that follow. __ 0c. Clone your fork and then execute 'cd papi' on the command line. Once inside the 'papi' directory, create a branch titled papi-release-X-Y-Z-t. Replace X, Y, Z with the appropriate numerals. As a note, the branch title is not necessarily restricted to papi-release-X-Y-Z-t. Example: > git clone https://github.com/Treece-Burgess/papi.git > cd papi > git checkout -b papi-release-7-2-0b2 __ 1. Update any documentation that may have changed. Pay particular attention to INSTALL.txt. __ 2. Check/Change the version number in: - papi.spec (i.e. Version:) - src/papi.h (i.e. PAPI_VERSION) - src/configure.in (i.e. AC_INIT) - src/Makefile.in (i.e. PAPIVER, PAPIREV, PAPIAGE, PAPIINC) - doc/Doxyfile-common (i.e. PROJECT_NUMBER) Commit these version changes to the repo, along with any other changed files. Do not "git push" until autoconf is run (Step 3). You will have to run it; you just changed configure.in, which is the reason to run autoconf. The version number may already have been updated after the last release; if so you can skip these edits, but do commit any other files changed. -- 2a. Ensure you have doxygen 1.8.5. Execute 'doxygen -v' or 'doxygen --version' on the command line to see what doxygen version you are currently using. If it isn't found, find the doxygen directories and add it to $PATH.
Example: > find /usr -name "doxygen" 2>/dev/null > export PATH=/[PathFound]:$PATH (use actual path for [PathFound].) > doxygen -v (To Test) -- 2b. Rebuild the doxygen manpages: > cd doc && make && make install You will want to check if anything needs to be committed to git. (Usually the $(papi_dir)/man/man1 and man3 directories). Doxygen may generate some extraneous files in man1 and man3; e.g. 'man/man1/_home_youruserid_*.*' and 'man/man3/_home_youruserid_*.*' Remove these. Be careful of the directory! These ALSO exist in 'doc/man/man1' and 'doc/man/man3' (where they were built). Those are NOT the ones to remove, and not the ones to ADD in the next step. Then you can go to each directory and add all files: > cd papi/man/man1 > git add * > cd papi/man/man3 > git add * ---- Step 2d will require access to the webdir. The website files are not ---- saved as part of the repository, they are updated directly. -- 2c. Rebuild the website docs. We use Doxyfile-html for this, that is a configuration file in 'papi/doc' that will be used by the Makefile. Run the commands below: > newgrp papi (newgrp is a linux command, it changes your defacto group to papi, starting a new shell.) > cd doc && make clean html -- 2d. Update the web dir (currently /nfs/www/icl/projectsdev/papi/docs as of 10/28/24). For the example below, 'webdir' = '/nfs/www/icl/projectsdev'. You may wish to do a diff between this directory and the papi directory to ensure you are replacing the correct files. > diff webdir/papi/docs/* papi/doc/html/* Cleanup web dir and add new files: > ( /bin/rm -rf /webdir/papi/docs/* ) > ( cp -R doc/html/* /webdir/papi/docs ) > ( chmod -R 775 /webdir/papi/docs ) __ 3. If 'configure.in' is changed, we must generate a new 'configure' file. Run autoconf (2.69), it reads configure.in and creates a new configure. As of 10/28/24, you will need to module load autoconf version 2.69 from autoconf-archive. 
> module load autoconf-archive/2023.02.20/gcc-11.4.1-izt5hd > autoconf --version > diff -u configure <(autoconf configure.in) (see first NOTE below on why to do this) > autoconf configure.in > configure NOTE: We have an issue with autoconf; on different machines the same version (2.69) can produce different files. Due to this, when generating a new configure, diff the master configure file vs your newly generated configure file. Delete any excess lines that may appear. All new changes must now be done on login.icl.utk.edu. NOTE: Using an autoconf version > 2.69 will work, but will produce an inordinate number of extraneous differences between versions. You can check the version # with 'autoconf -V' or 'autoconf --version'. Release 6.0.0.1 was done on saturn.icl.utk.edu with autoconf 2.69. __ 4. Create a ChangeLog for the current release. We use a python script titled gitlog2changelog.py to generate this for us. There are two command line arguments to pass to the python script: 1. --starting_commit, commit hash for the starting point of the desired range (non-inclusive). 2. --fout, the filename to save the script output to. Example: > ./gitlog2changelog.py --starting_commit=0f93aac0ad426a2a9dc2e5fb43c87e6f0cf656ef --fout=ChangeLogP800.txt __ 5. Scan the ChangeLog to remove extraneous fluff, like perfctr imports. __ 6. Modify RELEASENOTES.txt to summarize the major changes listed in the log. __ 7. Add ChangeLogPXYZ.txt to git and commit both ChangeLogPXYZ.txt and RELEASENOTES.txt. __ 8. Push the branch back to the remote fork. 'git push'. __ 9. Go to https://github.com/icl-utk-edu/papi and create a pull request. Having just made a push, a yellow banner should appear right above the green "Code" dropdown button. If this is not the case then follow the steps outlined below. 1. Click on the "Pull requests" button located in the top left corner. 2. Click on the green "New pull request" button located in the middle right corner. 3.
Click on the blue "compare across forks" button located below the "Compare changes" heading. 4. For the head repository select your PAPI fork and then for compare select the branch that you would like to merge in. 5. Review the changes and if everything looks correct click "Create pull request". Then wait for that to be reviewed, approved, and merged. Once it is merged, no other pull requests should be approved, until the following steps are completed! ------BRANCHING AND TAGGING __ 10. After the pull request is merged: Clone the new PAPI. First, remove or rename any existing papi/ directory. Then 'git clone https://github.com/icl-utk-edu/papi.git' __ 11. Change to the papi directory, 'cd papi'. If this is not an incremental release, branch git. 'git checkout -b stable-6.0' (change version as needed). __ 12. Tag git: papi-X-Y-Z-t. Example: > git tag -a papi-6-0-0-t __ 13. Push the tag to the central repo. Examples: > git push --tags origin stable-6.0 (if you created a new branch, should match step 11) OTHERWISE > git push --tags You will be prompted for a comment on the tags. A tags comment should be able to be seen by clicking "Tags" and then clicking the desired tag you would like to see. For a comment, 'Release PAPI-6-0-0-t' is sufficient. (your own version of course). __ 13a. Double check. Go to the repository. Look at Branches; ensure you select 'all branches', if you branched, the branch should be shown. Ensure your tag appears on the list of Tags. ------BUILD A TARBALL. __ 14. Create a fresh clone of the papi repository, under a directory name including the release number. This is important, we will delete the '.git' file for the tarball, so this new directory will no longer be under git. Example: > git clone https://github.com/icl-utk-edu/papi.git papi-6.0.0 (your own version) > cd papi-6.0.0 (your own version) Note: If you created a new branch, ensure the clone contains your last commit and ChangeLog. If not, something went wrong with steps 4-8. __ 15. 
Delete any unnecessary files or directories, particularly .doc and .pdf files in the /doc directory. We use a script for this; it deletes itself: 'sh delete_before_release.sh'. NOTE: Running the above command deletes the .git directory! You will not be able to commit or push any changes in this directory once this script is run. __ 16. tar the directory. Example: > tar -cvf papi-X.Y.Z.tar papi-X.Y.Z (using release number for X-Y-Z) __ 17. zip the tarball. Example: > gzip papi-X.Y.Z.tar __ 18. Copy the tarball to the website. Example: > cp papi-X.Y.Z.tar.gz /nfs/www/icl/projects/papi/downloads/. current directory: /nfs/www/icl/projects/papi/downloads previously: gonzo:/mnt/papi/downloads previously: /silk/homes/icl/projects/papi/downloads __ 19. Check permissions on the tarball. 664 is good. (-rw-rw-r--). __ 19a. Check that the proper link where folks can download the tarball is functional. It should look like: http://icl.utk.edu/projects/papi/downloads/papi-X.Y.Z.tar.gz Note: The landing page may look broken, but as long as the tarball is downloaded from your created link, then everything is working as expected. ---- The following steps are typically done by the person responsible ---- for the website, NOT the programmer completing the release. But ---- final steps are done by the programmer, after these are completed. __ 20. Create a link with supporting text on the PAPI software web page. Create an entry on the PAPI wiki: https://github.com/icl-utk-edu/papi/wiki/PAPI-Releases announcing the new release with details about the changes. __ 21. Create a News item on the PAPI Web page. ---- The following steps are typically done by the PAPI PIs or management. ---- At this writing, Heike Jagode and Anthony Danalis. __ 22. Email the papi developer and discussion lists with an announcement (see emails below). 1. perfapi-devel@icl.utk.edu 2. ptools-perfapi@icl.utk.edu ---- FINAL STEPS: To be done by the Programmer making the release.
---- Bump version number, update release procedures (after you know they ---- worked!) This is to ensure more development commits (that may not ---- be fully tested) will not be in the release version of the repository. *WE WILL UPDATE THIS SECTION ONCE WE DO THESE STEPS DURING A RELEASE* __ 23. Using steps 0b and 0c make ANOTHER fork of the repository, clone it, and change to that directory. Example, papi-6-0-1. __ 24. Repeat Step 2 (ONLY Step 2, not 2a, etc) to bump the version number in the repository AFTER the release, changing Z to Z+1. Example: papi-6-0-0 to papi-6-0-1. The following files must be edited: papi.spec, src/papi.h, src/configure.in, src/Makefile.in, doc/Doxyfile-common (PROJECT_NUMBER) __ 25. Repeat Step 3, to create a new configure, preserving runstatedir lines. __ 25a. If these 'release_procedure.txt' instructions need to be updated, fix them now. __ 26. Add changed files, Commit and Push. (use 'git status -uno' to see files that need to be added with 'git add filepath'). 'git commit -m "Updated version to 6.0.1 after the release"' 'git push'. __ 27. Repeat step 9, to create a pull request for review. As of PAPI 7.2.0b1, we began to add PAPI releases to the releases section on GitHub, see the following link: https://github.com/icl-utk-edu/papi/releases. Steps 28 through 35 will outline how to create a release on GitHub. __ 28. Go to the following link: https://github.com/icl-utk-edu/papi/releases. __ 29. Select "Draft a new release" in the top right corner. __ 30. From the "Choose a tag" dropdown, select the appropriate tag for this release. Applying a tag was done at step 12. __ 31. Select a Target; this should be the branch that was made in step 13 (e.g. stable-6.0). __ 32. Add a release title; this should be PAPI followed by the release number (e.g. PAPI 7.2.0b1). __ 33. Copy and paste the supporting text obtained at Steps 20 and 21. Along with the supporting text make sure to attach the release tarball.
Note: The text should be obtained from Heike and Anthony. __ 34. Select the checkbox "Set as the latest release". __ 35. Select the green "Publish Release" button. ================================================================================ Patch Procedure for PAPI January 29, 2010 Below is a step-wise procedure for making a PAPI patch. One way to use this procedure is to print a copy and check off the lines as they are completed to avoid confusion. ================================================================================ __ 0. Make sure you're on the branch you want to patch against. > git clone https://github.com/icl-utk-edu/papi.git > cd papi > git checkout stable-5.0 (stable-5.0 would be replaced with the branch you want to patch against) __ 1. Generate the patch. > git diff -p papi-5-0-0-t stable-5.0 > papi-5-0-1.patch __ 2. Apply the patch. > wget http://icl.cs.utk.edu/projects/papi/downloads/papi-5.0.0.tar.gz > tar xzf papi-5.0.0.tar.gz && cd papi-5.0.0 > patch -p1 < ../papi-5-0-1.patch __ 3. If the patch applied cleanly, build and test on affected platforms; otherwise clean up the patch and try again. __ 4. Copy the diff file to methane:/nfs/www/icl/projects/papi/downloads/patches/ __ 5. Check permissions on the diff. 644 is good. __ 6. Create a link with supporting text on the PAPI software web page. __ 7. Create a News item on the PAPI Web page. __ 8. Email the papi discussion list with an announcement. __ 9. Post the announcement on the PAPI User Forum. ================================================================================ Bug Fix Release Procedure for PAPI Below is a step-wise procedure for creating a bug fix release tarball. One way to use this procedure is to print a copy and check off the lines as they are completed to avoid confusion. ================================================================================ __ 0. Clone the PAPI repo and checkout the release branch.
> git clone https://github.com/icl-utk-edu/papi.git For the first bug fix release do: > git checkout stable-6.0 For further bug fix releases create a branch from the last bug fix release: > git checkout tags/papi-6-0-0-1-t -b papi-6-0-0-2 __ 1. Make sure you're on the branch you want to apply bug fixes to. > git branch * stable-6.0 __ 2. Apply bug fixes. If those fixes are already applied in the master branch, you can do 'git cherry-pick <commit>'. If the commit was a merge use "-m 1". Example: > git cherry-pick -m 1 95c2b15 > git cherry-pick -m 1 7acc7a5 __ 3. Build and test your changes on different platforms. __ 4. Create the tag papi-X-Y-Z-N-t (N is an incremental bug fix identifier) Example: > git tag -a papi-6-0-0-1-t __ 5. Push your changes: > git push --tags __ 6. Create a fresh clone of the papi repository, under a directory name including the release number. > git clone https://github.com/icl-utk-edu/papi.git papi-6.0.0.1 __ 7. tar and zip the directory. Example: > tar -czf papi-6.0.0.1.tar.gz papi-6.0.0.1 __ 8. Copy the tarball to the webserver. > scp papi-6.0.0.1.tar.gz methane:/nfs/www/icl/projects/papi/downloads __ 9. ssh to methane and adjust permissions on the tarball. 664 is good. (-rw-rw-r--). > ssh methane > cd /nfs/www/icl/projects/papi/downloads > chmod g+w papi-6.0.0.1.tar.gz __ 10. Check that the proper link where folks can download the tarball is functional. Example: > wget http://icl.utk.edu/projects/papi/downloads/papi-6.0.0.1.tar.gz __ 11. Create a link with supporting text on the PAPI software web page. Create an entry on the PAPI wiki: https://github.com/icl-utk-edu/papi/wiki/PAPI-Releases announcing the new release with details about the changes. papi-papi-7-2-0-t/src/000077500000000000000000000000001502707512200144275ustar00rootroot00000000000000papi-papi-7-2-0-t/src/.indent.pro000066400000000000000000000075551502707512200165240ustar00rootroot00000000000000/** * PAPI - Indent profile.

* * The purpose of this file is to standardize PAPI's source code style. * Every new/modified source should be formatted with indent using this * profile before it is checked in again. * * @name .indent.pro * * @version $Revision$
* $Date$
* $Author$ * * @author Heike Jagode */ /* use tabs */ --use-tabs /* set tab size to 4 spaces */ --tab-size4 /* set indentation level to 4 spaces, and these will be turned into * tabs by default */ --indent-level4 /* don't put variables in column 16 */ //--declaration-indentation16 /* maximum length of a line is 80 */ --line-length80 /* breakup the procedure type */ --procnames-start-lines // --dont-break-procedure-type /* break long lines after the boolean operators && and || */ --break-after-boolean-operator /* if long lines are already broken up, GNU indent won't touch them */ --honour-newlines /* If a line has a left parenthesis which is not closed on that line, * then continuation lines will be lined up to start at the character * position just after the left parenthesis */ --continue-at-parentheses /* NO! (see --continue-at-parentheses) */ --continuation-indentation0 /* put braces on line with if, etc.*/ --braces-on-if-line //--braces-after-if-line /* put braces on the line after struct declaration lines */ --braces-after-struct-decl-line /* put braces on the line after function definition lines */ --braces-after-func-def-line /* indent braces 0 spaces */ --brace-indent0 /* NO extra struct/union brace indentation */ --struct-brace-indentation0 /* NO extra case brace indentation! 
*/ --case-brace-indentation0 /* put a space after and before every parenthesis */ --space-after-parentheses /* NO extra parentheses indentation in broken lines */ --paren-indentation0 /* blank line causes problems with multi parameter function prototypes */ --no-blank-lines-after-declarations /* forces blank line after every procedure body */ --blank-lines-after-procedures /* NO newline is forced after each comma in a declaration */ --no-blank-lines-after-commas /* allow optional blank lines */ --leave-optional-blank-lines // --swallow-optional-blank-lines /* do not put comment delimiters on blank lines */ --no-comment-delimiters-on-blank-lines /* the maximum comment column is 79 */ --comment-line-length79 /* do not touch comments starting at column 0 */ --dont-format-first-column-comments /* no extra line comment indentation */ --line-comments-indentation0 /* dont star comments */ --dont-star-comments // --start-left-side-of-comments /* comments to the right of the code start at column 30 */ --comment-indentation30 /* comments after declarations start at column 40 */ --declaration-comment-column40 /* comments after #else #endif start at column 8 */ --else-endif-column8 /* Do not cuddle } and the while of a do {} while; */ --dont-cuddle-do-while /* Do cuddle } and else */ --cuddle-else //--dont-cuddle-else /* a case label indentation of 0 */ --case-indentation0 /* put no space after a cast operator */ //--no-space-after-casts /* no space after function call names; * but space after keywords for, it, while */ --no-space-after-function-call-names //--no-space-after-for //--no-space-after-if //--no-space-after-while /* Do not force space between special statements and semicolon */ --dont-space-special-semicolon // --space-special-semicolon /* put a space between sizeof and its argument :TODO: check */ --blank-before-sizeof /* enable verbose mode */ --verbose // --no-verbosity /* NO space between # and preprocessor directives */ // --leave-preprocessor-space /* format 
some comments but not all */ // --dont-format-comments /* NO gnu style as default */ // --gun_style /* K&R default style */ --k-and-r-style /* NO Berkeley default style */ // --original /* read this profile :-) */ // --ignore-profile papi-papi-7-2-0-t/src/CreatePresetTbl.sh000077500000000000000000000004511502707512200200160ustar00rootroot00000000000000#!/bin/sh # This is a shell script to help create the man page for PAPI_presets cat << EOF .TS box, tab(&); lt | lw(50). = EOF ./tests/avail | grep 'PAPI_' | sed 's/(.*)//g' | sort | \ awk '{ printf("%s&T{\n", $1); for(i=5;i<=NF;i++) { printf("%s ",$i) } ; printf("\nT}\n_\n") }' echo ".TE" papi-papi-7-2-0-t/src/INSTALL000066400000000000000000000002311502707512200154540ustar00rootroot00000000000000/* * File: papi/src/README * CVS: $Id$ * Author: Philip Mucci * mucci@cs.utk.edu */ Please see the INSTALL.txt in the root directory. papi-papi-7-2-0-t/src/Makefile.in000066400000000000000000000045701502707512200165020ustar00rootroot00000000000000PAPIVER=7 PAPIREV=2 PAPIAGE=0 PAPIINC=0 PREFIX = @prefix@ prefix = $(PREFIX) exec_prefix = $(EPREFIX) PACKAGE_TARNAME = @PACKAGE_TARNAME@ ALTIX = @altix@ AR = @AR@ ARCH = @arch@ ARCH_EVENTS = @ARCH_EVENTS@ ARG64 = @ARG64@ BGP_SYSDIR = @BGP_SYSDIR@ BINDIR = @bindir@ BITFLAGS = @BITFLAGS@ CC = @CC@ CC_R = @CC_R@ CC_SHR = @CC_SHR@ CFLAGS = @CFLAGS@ CPP = @CPP@ CPPFLAGS = @CPPFLAGS@ COMPONENT_RULES = @COMPONENT_RULES@ COMPONENTS = @COMPONENTS@ CPU = @CPU@ CPU_MODEL = @CPU_MODEL@ cpu_option = @cpu_option@ CTEST_TARGETS = @CTEST_TARGETS@ datarootdir = @datarootdir@ DATADIR = @datadir@/${PACKAGE_TARNAME} DESCR = @DESCR@ DOCDIR = @docdir@ EPREFIX = @exec_prefix@ F77 = @F77@ FFLAGS = @FFLAGS@ FLAGS = @FLAGS@ FILENAME = @FILENAME@ FTEST_TARGETS = @FTEST_TARGETS@ INCDIR = @includedir@ LDFLAGS = @LDFLAGS@ LIBCFLAGS = @PAPICFLAGS@ LIBDIR = @libdir@ LIBRARY = @LIBRARY@ LIBS = @papiLIBS@ LINKLIB = @LINKLIB@ MAKEVER = @MAKEVER@ MANDIR = @mandir@ MISCHDRS = @MISCHDRS@ MISCOBJS = @MISCOBJS@ MISCSRCS = 
@MISCSRCS@ MPICC = @MPICC@ NOOPT = @NOOPT@ OMPCFLGS = @OMPCFLGS@ OPTFLAGS = @OPTFLAGS@ OSFILESSRC = @OSFILESSRC@ OSFILESOBJ = @OSFILESOBJ@ OSFILESHDR = @OSFILESHDR@ OSLOCK = @OSLOCK@ OSCONTEXT = @OSCONTEXT@ PAPI_EVENTS = @PAPI_EVENTS@ PAPI_EVENTS_CSV = @PAPI_EVENTS_CSV@ PEPATH = @PEPATH@ PERFCTR_INC_PATH = @perfctr_incdir@ PERFCTR_LIB_PATH = @perfctr_libdir@ PERFCTR_PREFIX = @perfctr_prefix@ PERFCTR_ROOT = @perfctr_root@ PFM_INC_PATH = @pfm_incdir@ PFM_LIB_PATH = @pfm_libdir@ PFM_OLD_PFMV2 = @old_pfmv2@ BGPM_INSTALL_DIR = @BGPM_INSTALL_DIR@ PFM_PREFIX = @pfm_prefix@ PFM_ROOT = @pfm_root@ POST_BUILD = @POST_BUILD@ PMAPI = @PMAPI@ PMINIT = @PMINIT@ SETPATH = @SETPATH@ SHLIB = @SHLIB@ PAPISOVER = @PAPISOVER@ VLIB = @VLIB@ SHLIBDEPS = @SHLIBDEPS@ SHOW_CONF = @SHOW_CONF@ SMPCFLGS = @SMPCFLGS@ STATIC = @STATIC@ CPUCOMPONENT_NAME = @CPUCOMPONENT_NAME@ CPUCOMPONENT_C = @CPUCOMPONENT_C@ CPUCOMPONENT_OBJ = @CPUCOMPONENT_OBJ@ TESTS = @tests@ TOPTFLAGS = @TOPTFLAGS@ FTOPTFLAGS = @TOPTFLAGS@ UTIL_TARGETS = @UTIL_TARGETS@ VERSION = @VERSION@ LDL = @LDL@ HAVE_NO_OVERRIDE_INIT = @HAVE_NO_OVERRIDE_INIT@ CC_COMMON_NAME = @CC_COMMON_NAME@ MIC = @MIC@ BUILD_LIBSDE_SHARED = @BUILD_LIBSDE_SHARED@ BUILD_LIBSDE_STATIC = @BUILD_LIBSDE_STATIC@ FORT_WRAPPERS_SRC=@FORT_WRAPPERS_SRC@ FORT_WRAPPERS_OBJ=@FORT_WRAPPERS_OBJ@ FORT_HEADERS=@FORT_HEADERS@ include $(FILENAME) papi-papi-7-2-0-t/src/Makefile.inc000066400000000000000000000334611502707512200166460ustar00rootroot00000000000000PAPI_SRCDIR = $(PWD) SOURCES = $(MISCSRCS) papi.c papi_internal.c \ high-level/papi_hl.c \ extras.c sw_multiplex.c \ $(FORT_WRAPPERS_SRC) \ threads.c cpus.c $(OSFILESSRC) $(CPUCOMPONENT_C) papi_preset.c \ papi_vector.c papi_memory.c $(COMPSRCS) OBJECTS = $(MISCOBJS) papi.o papi_internal.o \ papi_hl.o \ extras.o sw_multiplex.o \ $(FORT_WRAPPERS_OBJ) \ threads.o cpus.o $(OSFILESOBJ) $(CPUCOMPONENT_OBJ) papi_preset.o \ papi_vector.o papi_memory.o $(COMPOBJS) PAPI_EVENTS_TABLE = papi_events_table.h HEADERS = $(MISCHDRS) 
$(OSFILESHDR) $(PAPI_EVENTS_TABLE) \ papi.h papi_internal.h papiStdEventDefs.h \ papi_preset.h threads.h cpus.h papi_vector.h \ papi_memory.h config.h \ extras.h sw_multiplex.h \ papi_common_strings.h components_config.h \ papi_components_config_event_defs.h LIBCFLAGS += -I. $(CFLAGS) -DOSLOCK=\"$(OSLOCK)\" -DOSCONTEXT=\"$(OSCONTEXT)\" FHEADERS = $(FORT_HEADERS) # pkgconfig directory LIBPC = $(LIBDIR)/pkgconfig all: $(SHOW_CONF) $(LIBS) libsde utils tests .PHONY : all test fulltest tests testlib utils ctests ftests comp_tests validation_tests null include $(COMPONENT_RULES) showconf: @echo "Host architecture : $(DESCR)"; @echo "Host CPU component : $(CPUCOMPONENT_NAME)"; @echo "Installation DESTDIR: $(DESTDIR)"; @echo "Installation PREFIX : $(PREFIX)"; @echo "Installation EPREFIX: $(EPREFIX)"; @echo "Installation INCDIR : $(INCDIR)"; @echo "Installation LIBDIR : $(LIBDIR)"; @echo "Installation BINDIR : $(BINDIR)"; @echo "Installation MANDIR : $(MANDIR)"; @echo "Installation DOCDIR : $(DOCDIR)"; @echo "Installation DATADIR: $(DATADIR)"; @echo show_bgp_conf: @echo; @echo "BG/P System Path : $(BGP_SYSDIR)"; @echo "BG/P Install Path : $(BGP_INSTALLDIR)"; @echo "BG/P GNU/Linux Path: $(BGP_GNU_LINUX_PATH)"; @echo "BG/P ARCH Path : $(BGP_ARCH_PATH)"; @echo "BG/P Runtime Path : $(BGP_RUNTIME_PATH)"; @echo static: $(LIBRARY) $(LIBRARY): $(OBJECTS) rm -f $(LIBRARY) $(AR) $(ARG64) rv $(LIBRARY) $(OBJECTS) shared: libpapi.so libpapi.so.$(PAPISOVER) libpapi.so libpapi.so.$(PAPISOVER): $(SHLIB) ln -sf $(SHLIB) $@ $(SHLIB): $(HEADERS) $(SOURCES) $(SHLIBOBJS) rm -f $(SHLIB) libpapi.so libpapi.so.$(PAPISOVER) $(CC_SHR) $(LIBCFLAGS) $(OPTFLAGS) $(SOURCES) $(SHLIBOBJS) -o $@ $(SHLIBDEPS) $(LDFLAGS) @set -ex; if test "$(POST_BUILD)" != "" ; then \ -$(POST_BUILD) ; \ fi libsde: ifeq ($(BUILD_LIBSDE_SHARED),yes) $(MAKE) CC=$(CC) -C sde_lib dynamic ln -sf sde_lib/libsde.so.1.0 libsde.so endif ifeq ($(BUILD_LIBSDE_STATIC),yes) $(MAKE) CC=$(CC) -C sde_lib static ln -sf sde_lib/libsde.a 
libsde.a endif papi_fwrappers_.c: papi_fwrappers.c $(HEADERS) $(CPP) $(CPPFLAGS) -DFORTRANUNDERSCORE papi_fwrappers.c > papi_fwrappers_.c papi_fwrappers__.c: papi_fwrappers.c $(HEADERS) $(CPP) $(CPPFLAGS) -DFORTRANDOUBLEUNDERSCORE papi_fwrappers.c > papi_fwrappers__.c upper_PAPI_FWRAPPERS.c: papi_fwrappers.c $(HEADERS) $(CPP) $(CPPFLAGS) -DFORTRANCAPS papi_fwrappers.c > upper_PAPI_FWRAPPERS.c papi_fwrappers.o: papi_fwrappers.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c papi_fwrappers.c -o papi_fwrappers.o papi_fwrappers_.o: papi_fwrappers_.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c papi_fwrappers_.c -o papi_fwrappers_.o papi_fwrappers__.o: papi_fwrappers__.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c papi_fwrappers__.c -o papi_fwrappers__.o upper_PAPI_FWRAPPERS.o: upper_PAPI_FWRAPPERS.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c upper_PAPI_FWRAPPERS.c -o upper_PAPI_FWRAPPERS.o papi.o: papi.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c papi.c -o papi.o papi_internal.o: papi_internal.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c papi_internal.c -o papi_internal.o threads.o: threads.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c threads.c -o threads.o cpus.o: cpus.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c cpus.c -o cpus.o papi_hl.o: high-level/papi_hl.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c high-level/papi_hl.c -o papi_hl.o aix-memory.o: aix-memory.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c aix-memory.c -o aix-memory.o solaris-memory.o: solaris-memory.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c solaris-memory.c -o solaris-memory.o solaris-common.o: solaris-common.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c solaris-common.c -o solaris-common.o linux-bgp-memory.o: linux-bgp-memory.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c linux-bgp-memory.c -o linux-bgp-memory.o linux-bgq-memory.o: linux-bgq-memory.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c linux-bgq-memory.c -o linux-bgq-memory.o darwin-memory.o: darwin-memory.c $(HEADERS) 
$(CC) $(LIBCFLAGS) $(OPTFLAGS) -c darwin-memory.c -o darwin-memory.o darwin-common.o: darwin-common.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c darwin-common.c -o darwin-common.o linux-memory.o: linux-memory.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c linux-memory.c -o linux-memory.o linux-timer.o: linux-timer.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c linux-timer.c -o linux-timer.o linux-common.o: linux-common.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c linux-common.c -o linux-common.o extras.o: extras.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c extras.c -o extras.o papi_memory.o: papi_memory.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c papi_memory.c -o papi_memory.o papi_vector.o: papi_vector.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c papi_vector.c -o papi_vector.o papi_preset.o: papi_preset.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c papi_preset.c -o papi_preset.o sw_multiplex.o: sw_multiplex.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c sw_multiplex.c -o sw_multiplex.o $(CPUCOMPONENT_OBJ): $(CPUCOMPONENT_C) $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c $(CPUCOMPONENT_C) -o $(CPUCOMPONENT_OBJ) x86_cpuid_info.o: x86_cpuid_info.c x86_cpuid_info.h $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c x86_cpuid_info.c -o x86_cpuid_info.o $(PAPI_EVENTS_TABLE): $(PAPI_EVENTS_CSV) papi_events_table.sh sh papi_events_table.sh $(PAPI_EVENTS_CSV) > $@ $(ARCH_EVENTS)_map.o: $(ARCH_EVENTS)_map.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c $(ARCH_EVENTS)_map.c -o $(ARCH_EVENTS)_map.o # Required for BGP .SUFFIXES: .rts.o .c.rts.o: $(CC) $(CFLAGS) -c $< -o $@ bgp_tests:$(LIBRARY) null $(SETPATH) $(MAKE) -C ctests/bgp CC="$(CC)" CC_R="$(CC_R)" MPICC="$(MPICC)" CFLAGS="-I.. -I../.. 
$(CFLAGS)" TOPTFLAGS="$(TOPTFLAGS)" SMPCFLGS="$(SMPCFLGS)" OMPCFLGS="$(OMPCFLGS)" NOOPT="$(NOOPT)" LDFLAGS="$(LDFLAGS) $(STATIC)" LIBRARY="../../$(LINKLIB)" bgp_tests #Required for freebsd freebsd-memory.o: freebsd-memory.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ freebsd/map.o: freebsd/map.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ freebsd/map-unknown.o: freebsd/map-unknown.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ freebsd/map-p6.o: freebsd/map-p6.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ freebsd/map-p6-m.o: freebsd/map-p6-m.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ freebsd/map-p6-3.o: freebsd/map-p6-3.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ freebsd/map-p6-2.o: freebsd/map-p6-2.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ freebsd/map-p6-c.o: freebsd/map-p6-c.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ freebsd/map-k7.o: freebsd/map-k7.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ freebsd/map-k8.o: freebsd/map-k8.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ freebsd/map-p4.o: freebsd/map-p4.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ freebsd/map-atom.o: freebsd/map-atom.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ freebsd/map-core.o: freebsd/map-core.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ freebsd/map-core2.o: freebsd/map-core2.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ freebsd/map-core2-extreme.o: freebsd/map-core2-extreme.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ freebsd/map-i7.o: freebsd/map-i7.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ freebsd/map-westmere.o: freebsd/map-westmere.c $(HEADERS) $(CC) $(LIBCFLAGS) -c $< -o $@ test: ctests $(SETPATH) ctests/zero fulltest: tests $(SETPATH) sh run_tests.sh tests: $(TESTS) testlib: $(SETPATH) $(MAKE) -C testlib utils: $(LIBS) testlib $(MAKE) -C utils validation_tests: $(LIBS) testlib $(SETPATH) $(MAKE) -C validation_tests ctests: $(LIBS) testlib validation_tests $(SETPATH) $(MAKE) -C ctests ftests: $(LIBS) testlib $(SETPATH) $(MAKE) -C ftests # compile tests added to components comp_tests: $(LIBS) testlib 
ifneq (${COMPONENTS},) @set -ex; for comp in ${COMPONENTS} ; do \ $(SETPATH) $(MAKE) -C components/$$comp/tests ; \ done endif clean: comp_tests_clean native_clean rm -rf $(LIBRARY) $(SHLIB) libpapi.so libpapi.so.$(PAPISOVER) $(OBJECTS) core rii_files genpapifdef *~ so_locations papi_fwrappers_.c papi_fwrappers__.c upper_PAPI_FWRAPPERS.c $(MAKE) -C ../doc clean $(MAKE) -C ctests clean $(MAKE) -C ftests clean $(MAKE) -C testlib clean $(MAKE) -C utils clean $(MAKE) -C validation_tests clean # Component tests cleaning comp_tests_clean: ifneq (${COMPONENTS},) @set -ex; for comp in ${COMPONENTS} ; do \ $(MAKE) -C components/$$comp/tests clean ; \ done endif clobber distclean: clean native_clobber $(MAKE) -C ../doc distclean $(MAKE) -C ctests distclean $(MAKE) -C ftests distclean $(MAKE) -C testlib distclean $(MAKE) -C utils distclean $(MAKE) -C validation_tests distclean $(MAKE) -C components -f Makefile_comp_tests distclean rm -f $(LIBRARY) $(SHLIB) $(EXTRALIBS) Makefile config.h libpapi.so sde_lib/libsde.so* sde_lib/libsde.a libsde.so libsde.a papi.pc components_config.h papi_components_config_event_defs.h $(PAPI_EVENTS_TABLE) $(if ${COMPONENTS}, \ set -ex; for comp in ${COMPONENTS}; do \ rm -f papi_$${comp}_std_event_defs.h; \ done) rm -f config.log config.status f77papi.h f90papi.h fpapi.h null: #need to go to the git repo top level directory to do this dist-targz: cd $(PAPI_SRCDIR)/..; git status | grep working.directory.clean || (echo "You should commit your changes before 'make dist-targz'.") (cd $(PAPI_SRCDIR)/..; git archive --prefix=papi-$(PAPIVER).$(PAPIREV).$(PAPIAGE)/ --format=tar HEAD) | gzip > papi-$(PAPIVER).$(PAPIREV).$(PAPIAGE).tar.gz dist: $(MAKE) install-all PREFIX=`pwd`/papi-$(CPUCOMPONENT_NAME) tar cfv ./papi-$(CPUCOMPONENT_NAME).tar ./papi-$(CPUCOMPONENT_NAME) gzip ./papi-$(CPUCOMPONENT_NAME).tar rm -rf ./papi-$(CPUCOMPONENT_NAME) install-all: install install-tests install: install-lib install-man install-utils install-hl-scripts install-pkgconf 
install-hl-scripts: @echo "Copy papi_hl_output_writer.py to: \"$(DESTDIR)$(BINDIR)\""; -mkdir -p $(DESTDIR)$(BINDIR) cp high-level/scripts/papi_hl_output_writer.py $(DESTDIR)$(BINDIR) install-lib: native_install @echo "Headers (INCDIR) being installed in: \"$(DESTDIR)$(INCDIR)\""; -mkdir -p $(DESTDIR)$(INCDIR) -chmod go+rx $(DESTDIR)$(INCDIR) cp $(FHEADERS) papi.h papiStdEventDefs.h papi_components_config_event_defs.h $(DESTDIR)$(INCDIR) $(if ${COMPONENTS}, \ set -ex; for comp in ${COMPONENTS}; do \ if [ -e papi_$${comp}_std_event_defs.h ]; then \ cp papi_$${comp}_std_event_defs.h $(DESTDIR)$(INCDIR); \ fi; \ done) cp sde_lib/sde_lib.h sde_lib/sde_lib.hpp $(DESTDIR)$(INCDIR) cd $(DESTDIR)$(INCDIR) && chmod go+r $(FHEADERS) papi.h papiStdEventDefs.h papi_components_config_event_defs.h sde_lib.h sde_lib.hpp @echo "Libraries (LIBDIR) being installed in: \"$(DESTDIR)$(LIBDIR)\""; -mkdir -p $(DESTDIR)$(LIBDIR) -chmod go+rx $(DESTDIR)$(LIBDIR) @set -ex; if test -r $(LIBRARY) ; then \ cp $(LIBRARY) $(DESTDIR)$(LIBDIR); \ chmod go+r $(DESTDIR)$(LIBDIR)/$(LIBRARY); \ fi @set -ex; if test -r $(SHLIB) ; then \ cp -p $(SHLIB) $(DESTDIR)$(LIBDIR)/libpapi.so.$(PAPIVER).$(PAPIREV).$(PAPIAGE).$(PAPIINC); \ chmod go+r $(DESTDIR)$(LIBDIR)/libpapi.so.$(PAPIVER).$(PAPIREV).$(PAPIAGE).$(PAPIINC) ; \ cd $(DESTDIR)$(LIBDIR); \ ln -sf libpapi.so.$(PAPIVER).$(PAPIREV).$(PAPIAGE).$(PAPIINC) libpapi.so.$(PAPISOVER); \ ln -sf libpapi.so.$(PAPIVER).$(PAPIREV).$(PAPIAGE).$(PAPIINC) libpapi.so; \ fi @set -ex; if test -r libsde.so ; then \ cp sde_lib/libsde.so.1.0 $(DESTDIR)$(LIBDIR); \ chmod go+r $(DESTDIR)$(LIBDIR)/libsde.so.1.0; \ cd $(DESTDIR)$(LIBDIR); \ ln -sf libsde.so.1.0 libsde.so.1; \ ln -sf libsde.so.1.0 libsde.so; \ fi @set -ex; if test -r libsde.a ; then \ cp sde_lib/libsde.a $(DESTDIR)$(LIBDIR); \ chmod go+r $(DESTDIR)$(LIBDIR)/libsde.a; \ fi install-man: $(MAKE) -C ../man DOCDIR=$(DESTDIR)$(DOCDIR) MANDIR=$(DESTDIR)$(MANDIR) install install-utils: $(SETPATH) $(MAKE) -C utils 
BINDIR="$(DESTDIR)$(BINDIR)" CC="$(CC)" CC_R="$(CC_R)" CFLAGS="-I.. $(CFLAGS)" TOPTFLAGS="$(TOPTFLAGS)" SMPCFLGS="$(SMPCFLGS)" OMPCFLGS="$(OMPCFLGS)" NOOPT="$(NOOPT)" LDFLAGS="$(LDFLAGS) $(STATIC)" LIBRARY="../$(LINKLIB)" install install-tests: install-comp_tests $(SETPATH) $(MAKE) -C testlib install $(SETPATH) $(MAKE) -C ctests install $(SETPATH) $(MAKE) -C ftests install $(SETPATH) $(MAKE) -C validation_tests install -cp run_tests.sh $(DESTDIR)$(DATADIR) -cp run_tests_exclude_cuda.txt $(DESTDIR)$(DATADIR) -cp run_tests_exclude.txt $(DESTDIR)$(DATADIR) -chmod go+rx $(DESTDIR)$(DATADIR)/run_tests.sh -chmod go+r $(DESTDIR)$(DATADIR)/run_tests_exclude_cuda.txt $(DESTDIR)$(DATADIR)/run_tests_exclude.txt # Component tests installing install-comp_tests: ifneq (${COMPONENTS},) @set -ex; for comp in ${COMPONENTS} ; do \ $(MAKE) -C components/$$comp/tests DATADIR="$(DESTDIR)$(DATADIR)/components" install ; \ done endif install-pkgconf: @echo "pkgconfig being installed in: \"$(DESTDIR)$(LIBPC)\""; -mkdir -p $(DESTDIR)$(LIBPC) -chmod go+rx $(DESTDIR)$(LIBPC) cp papi.pc $(DESTDIR)$(LIBPC)/papi-$(PAPIVER).$(PAPIREV).$(PAPIAGE).$(PAPIINC).pc ln -sf papi-$(PAPIVER).$(PAPIREV).$(PAPIAGE).$(PAPIINC).pc $(DESTDIR)$(LIBPC)/papi-$(PAPISOVER).pc ln -sf papi-$(PAPIVER).$(PAPIREV).$(PAPIAGE).$(PAPIINC).pc $(DESTDIR)$(LIBPC)/papi.pc # # Dummy targets for configurations that do not also include a Rules file with targets # native_clean: native_install: native_clobber: papi-papi-7-2-0-t/src/Matlab/000077500000000000000000000000001502707512200156275ustar00rootroot00000000000000papi-papi-7-2-0-t/src/Matlab/PAPI.m000066400000000000000000000122001502707512200165310ustar00rootroot00000000000000% PAPI Performance API. % PAPI provides access to one of 8 Hardware Performance Monitoring functions. % % ctrs = PAPI('num') - Return the number of hardware counters. % PAPI('start', 'event', ...) - % Begin counting the specified events. % [val, ...]
= PAPI('stop') - Stop counting and return the current values. % [val, ...] = PAPI('read') - Read the current values of the active counters. % [val, ...] = PAPI('accum') - Add the current values of the active counters % to the input values. % PAPI('ipc') - Begin counting instructions. % ins = PAPI('ipc') - Return the number of instructions executed % since the last call. % [ins, ipc] = PAPI('ipc') - Return both the total number of instructions % executed since the last call, and the % incremental rate of instruction execution % since the last call. % PAPI('flips') % PAPI('flops') - Begin counting floating point % instructions or operations. % ins = PAPI('flips') % ops = PAPI('flops') - Return the number of floating point instruc- % tions or operations since the last call. % [ins, mflips] = PAPI('flips') % [ops, mflops] = PAPI('flops') - % Return both the number of floating point % instructions or operations since the last % call, and the incremental rate of floating % point execution since the last call. % % DESCRIPTION % The PAPI function provides access to the PAPI Performance API. % PAPI takes advantage of the fact that most modern microprocessors % have built-in hardware support for counting a variety of basic operations % or events. PAPI uses these counters to track things like instructions % executed, cycles elapsed, floating point instructions performed and % a variety of other events. % % There are 8 subfunctions within the PAPI call, as described below: % 'num' - provides information on the number of hardware counters built % into this platform. The result of this call specifies how many % events can be counted at once. % 'start' - programs the counters with the named events and begins % counting. The names of the events can be found in the PAPI % documentation. If a named event cannot be found, or cannot % be mapped, an error message is displayed.
% 'stop' - stops counting and returns the values of the counters in the % same order as events were specified in the start command. % 'stop' also can be used to reset the counters for the ipc, % flips and flops subfunctions described below. % 'read' - return the values of the counters without stopping them. % 'accum' - adds the values of the counters to the input parameters and % returns them in the output parameters. Counting is not stopped. % 'ipc' - returns the total instructions executed since the first call % to this subfunction, and the rate of execution of instructions % (as instructions per cycle) since the last call. % 'flips' - returns the total floating point instructions executed since % the last call to this subfunction, and the rate of execution % of floating point instructions (as mega-floating point % instructions per second, or mflips) since the last call. % A floating point instruction is defined as whatever this cpu % naturally counts as floating point instructions. % 'flops' - identical to 'flips', except it measures floating point % operations rather than instructions. In many cases these two % counts may be identical. In some cases 'flops' will be a % derived value that attempts to reproduce that which is % traditionally considered a floating point operation. For % example, a fused multiply-add would be counted as two % operations, even if it was only a single instruction. % % In typical usage, the first five subfunctions: 'num', 'start', 'stop', % 'read', and 'accum' are used together. 'num' establishes the maximum number % of events that can be supplied to 'start'. After a 'start' is issued, % 'read' and 'accum' can be intermixed until a 'stop' is issued. % % The three rate calls, 'ipc', 'flips', and 'flops' are intended to be used % independently. They cannot be mixed, because they use the same counter % resources. They can be used serially if they are separated by a 'stop' % call, which can also be used to reset the counters.
% % Copyright 2001 - 2004 The Innovative Computing Laboratory, % University of Tennessee. % $Revision$ $Date$ papi-papi-7-2-0-t/src/Matlab/PAPIInnerProduct.m000066400000000000000000000024741502707512200211020ustar00rootroot00000000000000function PAPIInnerProduct % Compute an Inner Product (c = a * x) % on elements sized from 50 to 500, % in steps of 50. % % Use the PAPI mex function with two different methods: % - The PAPI flops call % - PAPI start/stop calls % % For each size, display: % - number of floating point operations % - theoretical number of operations % - difference % - per cent error % - mflops/s fprintf(1,'\n\nPAPI Inner Product Test'); fprintf(1,'\nUsing the PAPI("flops") call'); fprintf(1,'\n%12s %12s %12s %12s %12s %12s\n', 'n', 'ops', '2n', 'difference', '% error', 'mflops') for n=50:50:500, a=rand(1,n);x=rand(n,1); PAPI('stop'); % reset the counters to zero PAPI('flops'); % start counting flops c=a*x; [ops, mflops] = PAPI('flops'); % read the flops data fprintf(1,'%12d %12d %12d %12d %12.2f %12.2f\n',n,ops,2*n,ops - 2*n, (1.0 - ((2*n) / ops)) * 100,mflops) end PAPI('stop'); fprintf(1,'\n\nPAPI Inner Product Test'); fprintf(1,'\nUsing PAPI start and stop'); fprintf(1,'\n%12s %12s %12s %12s %12s %12s\n', 'n', 'ops', '2n', 'difference', '% error', 'flops/cycle') for n=50:50:500, a=rand(1,n);x=rand(n,1); PAPI('start', 'PAPI_TOT_CYC', 'PAPI_FP_OPS'); c=a*x; [cyc, ops] = PAPI('stop'); fprintf(1,'%12d %12d %12d %12d %12.2f %12.6f\n',n,ops,2*n,ops - 2*n, (1.0 - ((2*n) / ops)) * 100,ops/cyc) endpapi-papi-7-2-0-t/src/Matlab/PAPIMatrixMatrix.m000066400000000000000000000025641502707512200211170ustar00rootroot00000000000000function PAPIMatrixMatrix % Compute a Matrix Matrix multiply % on square arrays sized from 50 to 500, % in steps of 50. 
% % Use the PAPI mex function with two different methods: % - The PAPI flops call % - PAPI start/stop calls % % For each size, display: % - number of floating point operations % - theoretical number of operations % - difference % - per cent error % - mflops/s fprintf(1,'\nPAPI Matrix Matrix Multiply Test'); fprintf(1,'\nUsing the PAPI("flops") call'); fprintf(1,'\n%12s %12s %12s %12s %12s %12s\n', 'n', 'ops', '2n^3', 'difference', '% error', 'mflops') for n=50:50:500, a=rand(n);b=rand(n);c=rand(n); PAPI('stop'); % reset the counters to zero PAPI('flops'); % start counting flops c=c+a*b; [count, mflops] = PAPI('flops'); % read the flops data fprintf(1,'%12d %12d %12d %12d %12.2f %12.2f\n',n,count,2*n^3,count - 2*n^3, (1.0 - ((2*n^3) / count)) * 100,mflops) end PAPI('stop'); fprintf(1,'\nPAPI Matrix Matrix Multiply Test'); fprintf(1,'\nUsing PAPI start and stop'); fprintf(1,'\n%12s %12s %12s %12s %12s %12s\n', 'n', 'ops', '2n^3', 'difference', '% error', 'flops/cycle') for n=50:50:500, a=rand(n);b=rand(n);c=rand(n); PAPI('start', 'PAPI_TOT_CYC', 'PAPI_FP_OPS'); c=c+a*b; [cyc, ops] = PAPI('stop'); fprintf(1,'%12d %12d %12d %12d %12.2f %12.6f\n',n,ops,2*n^3,ops - 2*n^3, (1.0 - ((2*n^3) / ops)) * 100,ops/cyc) endpapi-papi-7-2-0-t/src/Matlab/PAPIMatrixVector.m000066400000000000000000000025451502707512200211140ustar00rootroot00000000000000function PAPIMatrixVector % Compute a Matrix Vector multiply % on arrays and vectors sized from 50 to 500, % in steps of 50. 
% % Use the PAPI mex function with two different methods: % - The PAPI flops call % - PAPI start/stop calls % % For each size, display: % - number of floating point operations % - theoretical number of operations % - difference % - per cent error % - mflops/s fprintf(1,'\nPAPI Matrix Vector Multiply Test'); fprintf(1,'\nUsing the PAPI("flops") call'); fprintf(1,'\n%12s %12s %12s %12s %12s %12s\n', 'n', 'ops', '2n^2', 'difference', '% error', 'mflops') for n=50:50:500, a=rand(n);x=rand(n,1); PAPI('stop'); % reset the counters to zero PAPI('flops'); % start counting flops b=a*x; [count, mflops] = PAPI('flops'); % read the flops data fprintf(1,'%12d %12d %12d %12d %12.2f %12.2f\n',n,count,2*n^2,count - 2*n^2, (1.0 - ((2*n^2) / count)) * 100,mflops) end PAPI('stop'); fprintf(1,'\nPAPI Matrix Vector Multiply Test'); fprintf(1,'\nUsing PAPI start and stop'); fprintf(1,'\n%12s %12s %12s %12s %12s %12s\n', 'n', 'ops', '2n^2', 'difference', '% error', 'flops/cycle') for n=50:50:500, a=rand(n);x=rand(n,1); PAPI('start', 'PAPI_TOT_CYC', 'PAPI_FP_OPS'); c=a*x; [cyc, ops] = PAPI('stop'); fprintf(1,'%12d %12d %12d %12d %12.2f %12.6f\n',n,ops,2*n^2,ops - 2*n^2, (1.0 - ((2*n^2) / ops)) * 100,ops/cyc) endpapi-papi-7-2-0-t/src/Matlab/PAPI_Matlab.c000077500000000000000000000225371502707512200200200ustar00rootroot00000000000000 /****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file: PAPI_Matlab.c * @author Frank Winkler * * @brief PAPI Matlab integration. * See PAPI_Matlab.readme for more information. 
*/ #define FLIPS_EVENT PAPI_FP_INS #define FLOPS_EVENT PAPI_FP_OPS #include "mex.h" #include "matrix.h" #include "papi.h" static long long accum_error = 0; static long long start_time = 0; int EventSet = PAPI_NULL; int papi_init = 0; int papi_start = 0; void initialize_papi() { int result; /* initialize PAPI */ result = PAPI_library_init(PAPI_VER_CURRENT); if(result < PAPI_OK) { mexPrintf("Error code: %d\n", result); mexErrMsgTxt("Error PAPI_library_init."); } /* create EventSet */ result = PAPI_create_eventset(&EventSet); if(result < PAPI_OK) { mexPrintf("Error code: %d\n", result); mexErrMsgTxt("Error PAPI_create_eventset."); } } void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) { float real_time, proc_time, rate; int i; int number_of_counters; unsigned int mrows, nchars; unsigned int *events; unsigned int flop_events[2]; long long ins = 0, *values, flop_values[2]; long long elapsed_time; int result; char *input, *temp; char one_output[] = "This function produces one output per running counter."; char no_input[] = "This function expects no input."; char error_reading[] = "Error reading the running counters."; if ( papi_init == 0 ) { initialize_papi(); papi_init = 1; } /* Check for proper number of arguments.
*/ if(nrhs < 1) { mexErrMsgTxt("This function expects input."); } nchars = mxGetNumberOfElements(prhs[0]); input = (char *)mxCalloc(nchars, sizeof(char) + 1); input = mxArrayToString(prhs[0]); if(!strncmp(input, "num", 3)) { if(nrhs != 1) { mexErrMsgTxt(no_input); } else if(nlhs != 1) { mexErrMsgTxt("This function produces one and only one output: counters."); } result = PAPI_num_cmp_hwctrs(0); if(result < PAPI_OK) { mexPrintf("Error code: %d\n", result); mexErrMsgTxt("Error reading counters."); } plhs[0] = mxCreateDoubleScalar((double)result); } else if((!strncmp(input, "flip", 4)) || (!strncmp(input, "flop", 4))) { if(nrhs != 1) { mexErrMsgTxt(no_input); } else if(nlhs > 2) { if (input[2] == 'i') mexErrMsgTxt("This function produces 1 or 2 outputs: [ops, mflips]."); else mexErrMsgTxt("This function produces 1 or 2 outputs: [ops, mflops]."); } if (input[2] == 'i') { result = PAPI_flips_rate( FLIPS_EVENT, &real_time, &proc_time, &ins, &rate); } else { result = PAPI_flops_rate( FLOPS_EVENT, &real_time, &proc_time, &ins, &rate); } if(result >= PAPI_OK) { plhs[0] = mxCreateDoubleScalar((double)(ins - accum_error)); /* this call adds 7 fp instructions to the total */ /* but apparently not on Pentium M with Matlab 7.0.4 */ /* accum_error += 7; */ if(nlhs == 2) { plhs[1] = mxCreateDoubleScalar((double)rate); /* the second call adds 4 fp instructions to the total */ /* but apparently not on Pentium M with Matlab 7.0.4 */ /* accum_error += 4; */ } } } else if(!strncmp(input, "start", 5)) { if(nlhs != 0) { mexErrMsgTxt("This function produces no output."); } if(nrhs > (PAPI_num_cmp_hwctrs(0) + 1)) { mexErrMsgTxt(one_output); } mrows = mxGetM(prhs[1]); events = (unsigned int *)mxCalloc(nrhs - 1, sizeof(int) + 1); for(i = 1; i < nrhs; i++) { if(mxIsComplex(prhs[i]) || !(mrows == 1) ) { mexErrMsgTxt("Input must be a list of strings."); } if(mxIsChar(prhs[i])) { nchars = mxGetNumberOfElements(prhs[i]); temp = (char *)mxCalloc(nchars, sizeof(char) + 1); temp = mxArrayToString(prhs[i]); if((result = PAPI_event_name_to_code(temp, &(events[i - 1]))) < PAPI_OK) { mxFree(temp);
mexPrintf("Error code: %d\n", result); mexErrMsgTxt("Incorrect PAPI code given."); } mxFree(temp); } else { events[i - 1] = (unsigned int)mxGetScalar(prhs[i]); } } if((result = PAPI_cleanup_eventset(EventSet)) < PAPI_OK) mexErrMsgTxt("Error PAPI_cleanup_eventset"); for (i = 0; i < nrhs - 1; i++) { result = PAPI_add_event(EventSet, events[i]); if(result < PAPI_OK) { mexPrintf("Error code: %d\n", result); mexErrMsgTxt("Error PAPI_add_event."); } } if((result = PAPI_start(EventSet)) < PAPI_OK) { mxFree(events); mexPrintf("Error code: %d\n", result); mexErrMsgTxt("Error initializing counters."); } papi_start = 1; mxFree(events); } else if(!strncmp(input, "stop", 4)) { if(nrhs != 1) { mexErrMsgTxt(no_input); } number_of_counters = PAPI_num_cmp_hwctrs(0); if(nlhs > number_of_counters ) { mexErrMsgTxt(one_output); } if (nlhs == 0) values = (long long*)mxCalloc(number_of_counters, sizeof(long long)); else values = (long long *)mxCalloc(nlhs, sizeof(long long) + 1); result = PAPI_OK; if (start_time == 0) { if ( papi_start == 1 ) result = PAPI_stop(EventSet, values); } else { start_time = 0; if ( papi_start == 1 ) result = PAPI_stop(EventSet, flop_values); } PAPI_rate_stop(); papi_start = 0; if(result < PAPI_OK) { if(result != PAPI_ENOTRUN) { mexPrintf("Error code: %d\n", result); mexErrMsgTxt("Error stopping the running counters."); } } accum_error = 0; for(i = 0; i < nlhs; i++) { plhs[i] = mxCreateDoubleScalar((double)values[i]); } mxFree(values); } else if(!strncmp(input, "read", 4)) { if(nrhs != 1) { mexErrMsgTxt(no_input); } if(nlhs > PAPI_num_cmp_hwctrs(0)) { mexErrMsgTxt(one_output); } values = (long long *)mxCalloc(nlhs, sizeof(long long) + 1); if((result = PAPI_read(EventSet, values)) < PAPI_OK) { mexPrintf("%d\n", result); mexErrMsgTxt(error_reading); } for(i = 0; i < nlhs; i++) { plhs[i] = mxCreateDoubleScalar((double)values[i]); } mxFree(values); } else if(!strncmp(input, "accum", 5)) { if(nrhs > PAPI_num_cmp_hwctrs(0) + 1) { mexErrMsgTxt(no_input); } if(nlhs > 
PAPI_num_cmp_hwctrs(0)) { mexErrMsgTxt(one_output); } values = (long long *)mxCalloc(nlhs, sizeof(long long) + 1); for(i = 0; i < nrhs - 1; i++) { values[i] = (long long)(*(mxGetPr(prhs[i + 1]))); } if((result = PAPI_accum(EventSet, values)) < PAPI_OK) { mexPrintf("Error code: %d\n", result); mexErrMsgTxt(error_reading); } for(i = 0; i < nlhs; i++) { plhs[i] = mxCreateDoubleScalar((double)values[i]); } mxFree(values); } else if(!strncmp(input, "ipc", 3)) { if(nrhs != 1) { mexErrMsgTxt(no_input); } else if(nlhs > 2) { mexErrMsgTxt("This function produces 1 or 2 outputs: [ops, ipc]."); } if(PAPI_ipc(&real_time, &proc_time, &ins, &rate) >= PAPI_OK) { plhs[0] = mxCreateDoubleScalar((double)ins); if(nlhs == 2) { plhs[1] = mxCreateDoubleScalar((double)rate); } } } else { mexPrintf("Cannot find the command you specified.\n"); mexErrMsgTxt("See the included readme file."); } } papi-papi-7-2-0-t/src/Matlab/PAPI_Matlab.readme000077500000000000000000000143241502707512200210260ustar00rootroot00000000000000Running PAPI in the MATLAB Environment If you have the desire to do this, you most likely already know why you want to make calls to PAPI inside of a MATLAB environment. This section of the PAPI user guide covers C and FORTRAN calls, but at the moment, you can only make C calls from the MATLAB environment. There is one overall function to call from Matlab; from there, you specify which of the 8 specific functions you want to call, and then the arguments to each. Here are some examples: PAPI_num_cmp_hwctrs(0) - Returns the number of available preset hardware counters on the system.
Ex: num_counters = PAPI('num') PAPI_flips_rate - Has 3 possibilities: Initialize FLIP counting with: PAPI('flips') Record the number of floating point instructions since latest call: ops = PAPI('flips') Record the number of floating point instructions and the incremental rate of floating point execution since latest call: [ops, mflips] = PAPI('flips') Use PAPI('stop') to stop counting flips and reset the counters. PAPI_flops - Identical to PAPI_flips, but counts floating point *operations* rather than instructions. In most cases, these two are identical, but some instructions (e.g. FMA) might contain multiple operations or vice versa. PAPI_ipc - Has 3 possibilities: Initialize instruction per cycle counting with: PAPI('ipc') Record the number of instructions since initialization: ins = PAPI('ipc') Record the number of instructions and the incremental rate of instructions per cycle since initialization: [ins, ipc] = PAPI('ipc') PAPI_start - Specify the events to count (in text form or the actual numeric code; NOTE: make sure to not confuse normal decimal and hexadecimal.) You cannot specify more events than there are hardware counters. To begin counting cycles and instructions: PAPI('start', 'PAPI_TOT_CYC', 'PAPI_TOT_INS'); PAPI_read - Simply specify the variables to read the values into. You cannot specify more variables than there are hardware counters. To read the above events you just started: [cycles, instructions] = PAPI('read'); PAPI_accum - This function adds the value you pass to the readings in the hardware counter. You cannot specify more variables than there are hardware counters. This function will reset the counters. To add the values currently in the counters to the previously read values: [cycles, instructions] = PAPI('accum', cycles, instructions); PAPI_stop - This function reads the value of the running hardware counters into the variables you specify. You cannot specify more variables than there are hardware counters.
To stop the running counters you previously started and record their values: [cycles, instructions] = PAPI('stop'); PAPI_Matlab.c, when compiled, functions simply as a wrapper. In order to use the calls, you need to know a little about mex. mex is simply the compiler you use to make your code run in the MATLAB environment. If you don't know how to use mex, you might want to acquaint yourself a bit. "mex -setup" might be needed if you encounter problems, but the simplest explanation might be to substitute "mex" for "gcc" and you are on your way. All the other rules for compiling PAPI are the same. mex compilations can be done inside or outside of the Matlab environment, but in this case, it is recommended that you compile outside of Matlab. For some reason, compiling inside does not work on some systems. So far, the Linux environment and the Windows environment have been tested, but _in theory_ this code should work anywhere PAPI and Matlab both work. The following instructions are for a Linux/Unix environment: Assuming papi.h is present in /usr/local/include and libpapi.so is present in /usr/local/lib, the below should work. If not, you may need to alter the compile strings and/or the #include statement in PAPI_Matlab.c. Also, the compile string will be different for different platforms. For instance, if I want to compile and run on a linux machine assuming PAPI_Matlab.c is in your current working directory (you'll have a different compile string on a different architecture): 0. Define the events for flops and flips in PAPI_Matlab.c: ONE of the three presets for flips: - PAPI_FP_INS - PAPI_VEC_SP - PAPI_VEC_DP ONE of the three presets for flops: - PAPI_FP_OPS - PAPI_SP_OPS - PAPI_DP_OPS Example: #define FLIPS_EVENT PAPI_FP_INS #define FLOPS_EVENT PAPI_FP_OPS 1. Compile the wrapper: mex -I/usr/local/include PAPI_Matlab.c /usr/local/lib/libpapi.so -output PAPI 2. Start Matlab: matlab 3. Run the code: a.
Find the number of hardware counters on your system: num_counters = PAPI('num') b. Play with flips - the first makes sure the counters are stopped and clear; the second initializes the counting; the third returns the number of floating point instructions since the first call, and the fourth line does the same as the second AND reports the incremental rate of floating point execution since the last call: PAPI('stop') PAPI('flips') ins = PAPI('flips') [ins, mflips] = PAPI('flips') c. Play with instructions per cycle - the first makes sure the counters are stopped and clear; the second initializes counting; the third returns the number of instructions since the first call, and the fourth line does the same as the second AND reports the incremental rate of instructions per cycle since the last call: PAPI('stop') PAPI('ipc') ins = PAPI('ipc') [ins, ipc] = PAPI('ipc') d. Try the example m files included with the distribution: PAPIInnerProduct.m PAPIMatrixVector.m PAPIMatrixMatrix.m e. Start counting: PAPI('start', 'PAPI_TOT_CYC', 'PAPI_TOT_INS') f. Read the counters: [cycles, instr] = PAPI('read') g. Add the current value of the counters to a previous read and reset: [cycles, instr] = PAPI('accum', cycles, instr) h. Read the counters and stop them: [cycles, instr] = PAPI('stop') You can pass as many events as you like to be counted or recorded, as long as that number does not exceed the number of available hardware counters. Contact ptools-perfapi@icl.utk.edu with any questions regarding PAPI calls in Matlab - either errors or questions. Also, this has just been implemented, so changes could be coming.......... papi-papi-7-2-0-t/src/README000066400000000000000000000002241502707512200153050ustar00rootroot00000000000000/* * File: papi/src/README * CVS: $Id$ * Author: Philip Mucci * mucci@cs.utk.edu */ Please see the README in the root directory. 
papi-papi-7-2-0-t/src/Rules.bgpm000066400000000000000000000011101502707512200163610ustar00rootroot00000000000000# $Id: Rules.bgpm,v 1.1 2011/03/11 23:06:54 jagode Exp $ ifneq ($(USE_DEBUG),) BGPM_LIBNAME = bgpm_debug DEBUG_BGPM = "-DDEBUG_BGPM" else BGPM_LIBNAME = bgpm endif BGPM_OBJS=$(shell $(AR) t $(BGPM_INSTALL_DIR)/bgpm/lib/lib$(BGPM_LIBNAME).a && $(AR) 2>/dev/null) MISCOBJS = $(BGPM_OBJS) $(MISCSRCS:.c=.o) include Makefile.inc CFLAGS += -I$(BGPM_INSTALL_DIR) -I$(BGPM_INSTALL_DIR)/spi/include/kernel/cnk $(DEBUG_BGPM) LDFLAGS += $(BGPM_INSTALL_DIR)/bgpm/lib/lib$(BGPM_LIBNAME).a -lrt -lstdc++ $(BGPM_OBJS): $(AR) xv $(BGPM_INSTALL_DIR)/bgpm/lib/lib$(BGPM_LIBNAME).a papi-papi-7-2-0-t/src/Rules.perfmon2000066400000000000000000000041251502707512200171750ustar00rootroot00000000000000DESCR = "Linux with perfmon2 kernel support and library" ifneq (/usr,$(PFM_PREFIX)) PWD = $(shell pwd) ifeq (,$(PFM_LIB_PATH)) ifeq (,$(PFM_ROOT)) PFM_ROOT := $(PWD)/libpfm-3.y endif PFM_LIB_PATH := $(PFM_ROOT)/lib CC_SHR += -Wl,-rpath-link -Wl,$(PFM_LIB_PATH) endif ifeq (,$(PFM_INC_PATH)) ifeq (,$(PFM_ROOT)) PFM_ROOT := $(PWD)/libpfm-3.y endif PFM_INC_PATH := $(PFM_ROOT)/include endif ifneq (/usr/include,$(PFM_INC_PATH)) CFLAGS += -I$(PFM_INC_PATH) endif endif MISCHDRS += linux-lock.h mb.h papi_libpfm_events.h MISCSRCS += papi_libpfm3_events.c SHLIBDEPS = -Bdynamic -L$(PFM_LIB_PATH) -lpfm PFM_OBJS=$(shell $(AR) t $(PFM_LIB_PATH)/libpfm.a 2>/dev/null) MISCOBJS = $(PFM_OBJS) $(MISCSRCS:.c=.o) ifeq (,$(PFM_OBJS)) $(PFM_LIB_PATH)/libpfm.a: ifneq (,${PFM_ROOT}) $(MAKE) -C $(PFM_ROOT) ARCH="$(ARCH)" CC="$(CC) $(BITFLAGS)" CONFIG_PFMLIB_OLD_PFMV2="$(PFM_OLD_PFMV2)" lib else @echo '$@ not installed!'; exit 1 endif $(MAKE) endif include Makefile.inc config.h: @echo 'Please clobber your build and run ./configure." 
$(PFM_OBJS): $(PFM_LIB_PATH)/libpfm.a $(AR) xv $< papi_libpfm3_events.o: papi_libpfm3_events.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c papi_libpfm3_events.c -o $@ native_clean: -rm -f $(MISCOBJS) ifneq (,${PFM_ROOT}) $(MAKE) -C $(PFM_ROOT) ARCH="$(ARCH)" clean endif native_install: ifneq (,${PFM_ROOT}) -$(MAKE) -C $(PFM_ROOT) ARCH="$(ARCH)" CONFIG_PFMLIB_OLD_PFMV2="$(PFM_OLD_PFMV2)" DESTDIR=$(DESTDIR) PREFIX=$(PREFIX) install_prefix=$(PREFIX) LIBDIR=$(LIBDIR) INCDIR=$(INCDIR) MANDIR=$(MANDIR) install endif -install -d $(DESTDIR)$(LIBDIR) ifneq (,$(findstring shared,$(LIBS))) cp -p $(SHLIB) $(DESTDIR)$(LIBDIR)/libpapi.so.$(PAPIVER).$(PAPIREV).$(PAPIAGE).$(PAPIINC) ln -sf libpapi.so.$(PAPIVER).$(PAPIREV).$(PAPIAGE).$(PAPIINC) $(DESTDIR)$(LIBDIR)/libpapi.so.$(PAPISOVER) ln -sf libpapi.so.$(PAPIVER).$(PAPIREV).$(PAPIAGE).$(PAPIINC) $(DESTDIR)$(LIBDIR)/libpapi.so endif -install -d $(DESTDIR)$(DATADIR) cp -f ./papi_events.csv $(DESTDIR)$(DATADIR) native_clobber: ifneq (,${PFM_ROOT}) $(MAKE) -C $(PFM_ROOT) ARCH="$(ARCH)" distclean endif papi-papi-7-2-0-t/src/Rules.perfnec000066400000000000000000000042531502707512200170710ustar00rootroot00000000000000DESCR = "Linux without perf kernel support and library on NEC architecture" ifneq (/usr,$(PFM_PREFIX)) PWD = $(shell pwd) ifeq (,$(PFM_LIB_PATH)) ifeq (,$(PFM_ROOT)) PFM_ROOT := $(PWD)/libperfnec endif PFM_LIB_PATH := $(PFM_ROOT)/lib CC_SHR += -Wl,-rpath-link -Wl,$(PFM_LIB_PATH) endif ifeq (,$(PFM_INC_PATH)) ifeq (,$(PFM_ROOT)) PFM_ROOT := $(PWD)/libperfnec endif PFM_INC_PATH := $(PFM_ROOT)/include endif ifneq (/usr/include,$(PFM_INC_PATH)) CFLAGS += -I$(PFM_INC_PATH) endif endif MISCHDRS += linux-lock.h mb.h papi_libpfm_events.h #TODO: add that one #MISCSRCS += papi_libpfm3_events.c SHLIBDEPS = -Bdynamic -L$(PFM_LIB_PATH) -lpfm PFM_OBJS=$(shell $(AR) t $(PFM_LIB_PATH)/libpfm.a 2>/dev/null) MISCOBJS = $(PFM_OBJS) $(MISCSRCS:.c=.o) ifeq (,$(PFM_OBJS)) $(PFM_LIB_PATH)/libpfm.a: ifneq (,${PFM_ROOT}) $(MAKE) -C $(PFM_ROOT) 
ARCH="$(ARCH)" CC="$(CC) $(BITFLAGS)" CONFIG_PFMLIB_OLD_PFMV2="$(PFM_OLD_PFMV2)" lib else @echo '$@ not installed!'; exit 1 endif $(MAKE) endif include Makefile.inc config.h: @echo 'Please clobber your build and run ./configure." $(PFM_OBJS): $(PFM_LIB_PATH)/libpfm.a $(AR) xv $< papi_libpfm3_events.o: papi_libpfm3_events.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c papi_libpfm3_events.c -o $@ native_clean: -rm -f $(MISCOBJS) ifneq (,${PFM_ROOT}) $(MAKE) -C $(PFM_ROOT) ARCH="$(ARCH)" clean endif native_install: ifneq (,${PFM_ROOT}) @echo 'XXXXXXXXX You are compiling perfnec!' -$(MAKE) -C $(PFM_ROOT) ARCH="$(ARCH)" CONFIG_PFMLIB_OLD_PFMV2="$(PFM_OLD_PFMV2)" DESTDIR=$(DESTDIR) PREFIX=$(PREFIX) install_prefix=$(PREFIX) LIBDIR=$(LIBDIR) INCDIR=$(INCDIR) MANDIR=$(MANDIR) install endif -install -d $(DESTDIR)$(LIBDIR) ifneq (,$(findstring shared,$(LIBS))) cp -p $(SHLIB) $(DESTDIR)$(LIBDIR)/libpapi.so.$(PAPIVER).$(PAPIREV).$(PAPIAGE).$(PAPIINC) ln -sf libpapi.so.$(PAPIVER).$(PAPIREV).$(PAPIAGE).$(PAPIINC) $(DESTDIR)$(LIBDIR)/libpapi.so.$(PAPISOVER) ln -sf libpapi.so.$(PAPIVER).$(PAPIREV).$(PAPIAGE).$(PAPIINC) $(DESTDIR)$(LIBDIR)/libpapi.so endif -install -d $(DESTDIR)$(DATADIR) cp -f ./papi_events.csv $(DESTDIR)$(DATADIR) native_clobber: ifneq (,${PFM_ROOT}) $(MAKE) -C $(PFM_ROOT) ARCH="$(ARCH)" distclean endif papi-papi-7-2-0-t/src/Rules.pfm000066400000000000000000000031731502707512200162310ustar00rootroot00000000000000# $Id$ DESCR = "Linux with PFM $(VERSION) kernel support and library" ifneq (,$(wildcard /etc/sgi-release)) PFM_PREFIX ?= /usr ALTIX ?= -DALTIX endif ifeq (,$(PFM_LIB_PATH)) ifeq (,$(PFM_ROOT)) PFM_ROOT := ./libpfm-$(VERSION) endif PFM_LIB_PATH := $(PFM_ROOT)/lib endif ifeq (,$(PFM_INC_PATH)) ifeq (,$(PFM_ROOT)) PFM_ROOT := ./libpfm-$(VERSION) endif PFM_INC_PATH := $(PFM_ROOT)/include endif OPTIM := $(CFLAGS) CFLAGS-3.y := -DPFM30 CFLAGS += -I$(PFM_INC_PATH) $(ALTIX) $(CFLAGS-$(VERSION)) MISCHDRS += linux-lock.h mb.h SHLIBDEPS = -Bdynamic -L$(PFM_LIB_PATH) 
-lpfm PFM_OBJS = $(shell $(AR) t $(PFM_LIB_PATH)/libpfm.a 2>/dev/null) MISCOBJS = $(PFM_OBJS) $(MISCSRCS:.c=.o) ifeq (,$(PFM_OBJS)) $(PFM_LIB_PATH)/libpfm.a: ifneq (,${PFM_ROOT}) ifeq (1, $(HAVE_NO_OVERRIDE_INIT)) $(MAKE) -C $(PFM_ROOT) ARCH="$(ARCH)" CC="$(CC)" OPTIM="$(OPTIM)" CONFIG_PFMLIB_OLD_PFMV2="$(PFM_OLD_PFMV2)" -Wno-override-init lib else $(MAKE) -C $(PFM_ROOT) ARCH="$(ARCH)" CC="$(CC)" OPTIM="$(OPTIM)" CONFIG_PFMLIB_OLD_PFMV2="$(PFM_OLD_PFMV2)" lib endif else @echo '$@ not installed!'; exit 1 endif $(MAKE) endif include Makefile.inc config.h: @echo 'Please clobber your build and run ./configure." $(PFM_OBJS): $(AR) xv $(PFM_LIB_PATH)/libpfm.a native_clean: -rm -f $(MISCOBJS) ifneq (,${PFM_ROOT}) $(MAKE) -C $(PFM_ROOT) clean endif native_install: ifneq (,${PFM_ROOT}) $(MAKE) -C $(PFM_ROOT) CONFIG_PFMLIB_OLD_PFMV2="$(PFM_OLD_PFMV2)" DESTDIR=$(DESTDIR) PREFIX=$(PREFIX) install_prefix=$(PREFIX) LIBDIR=$(LIBDIR) INCDIR=$(INCDIR) MANDIR=$(MANDIR) install endif native_clobber: ifneq (,${PFM_ROOT}) $(MAKE) -C $(PFM_ROOT) distclean endif papi-papi-7-2-0-t/src/Rules.pfm4_pe000066400000000000000000000043501502707512200167770ustar00rootroot00000000000000 DESCR = "Linux with perf_event kernel support and libpfm4" ifneq (/usr,$(PFM_PREFIX)) PWD = $(shell pwd) ifeq (,$(PFM_LIB_PATH)) ifeq (,$(PFM_ROOT)) PFM_ROOT := $(PWD)/libpfm4 endif PFM_LIB_PATH := $(PFM_ROOT)/lib CC_SHR += -Wl,-rpath-link -Wl,$(PFM_LIB_PATH) endif ifeq (,$(PFM_INC_PATH)) ifeq (,$(PFM_ROOT)) PFM_ROOT := $(PWD)/libpfm4 endif PFM_INC_PATH := $(PFM_ROOT)/include endif ifneq (/usr/include,$(PFM_INC_PATH)) LIBCFLAGS += -I$(PFM_INC_PATH) endif endif LIBCFLAGS += -fvisibility=hidden MISCHDRS += linux-lock.h mb.h papi_libpfm4_events.h MISCSRCS += papi_libpfm4_events.c SHLIBDEPS = -Bdynamic -L$(PFM_LIB_PATH) -lpfm PFM_OBJS=$(shell $(AR) t $(PFM_LIB_PATH)/libpfm.a 2>/dev/null) MISCOBJS = $(PFM_OBJS) $(MISCSRCS:.c=.o) ifeq (yes,$(MIC)) FORCE_PFM_ARCH="CONFIG_PFMLIB_ARCH_X86=y" endif ifeq (,$(PFM_OBJS)) 
$(PFM_LIB_PATH)/libpfm.a: ifneq (,${PFM_ROOT}) ifeq ("$(CC_COMMON_NAME)","icc") $(MAKE) -C $(PFM_ROOT) ARCH="$(ARCH)" CC="$(CC) $(BITFLAGS)" DBG="-g -Wall -Werror" $(FORCE_PFM_ARCH) lib else ifeq (1,$(HAVE_NO_OVERRIDE_INIT)) $(MAKE) -C $(PFM_ROOT) ARCH="$(ARCH)" CC="$(CC) $(BITFLAGS) -Wno-override-init" $(FORCE_PFM_ARCH) lib else $(MAKE) -C $(PFM_ROOT) ARCH="$(ARCH)" CC="$(CC) $(BITFLAGS)" $(FORCE_PFM_ARCH) lib endif endif else @echo '$@ not installed!'; exit 1 endif $(MAKE) endif include Makefile.inc config.h: @echo 'Please clobber your build and run ./configure." $(PFM_OBJS): $(PFM_LIB_PATH)/libpfm.a $(AR) xv $< papi_libpfm4_events.o: papi_libpfm4_events.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c papi_libpfm4_events.c -o $@ native_clean: -rm -f $(MISCOBJS) ifneq (,${PFM_ROOT}) $(MAKE) -C $(PFM_ROOT) ARCH="$(ARCH)" clean endif native_install: ifneq (,${PFM_ROOT}) -$(MAKE) -C $(PFM_ROOT) ARCH="$(ARCH)" DESTDIR=$(DESTDIR) PREFIX=$(PREFIX) install_prefix=$(PREFIX) LIBDIR=$(LIBDIR) INCDIR=$(INCDIR) MANDIR=$(MANDIR) install endif -install -d $(DESTDIR)$(LIBDIR) # Makefile.inc already has installation of shared libraries so # there is no need to do it here -install -d $(DESTDIR)$(DATADIR) cp -f ./papi_events.csv $(DESTDIR)$(DATADIR) native_clobber: ifneq (,${PFM_ROOT}) $(MAKE) -C $(PFM_ROOT) ARCH="$(ARCH)" distclean endif papi-papi-7-2-0-t/src/aix-context.h000066400000000000000000000007041502707512200170440ustar00rootroot00000000000000#ifndef _PAPI_AIX_CONTEXT_H #define _PAPI_AIX_CONTEXT_H /* overflow */ /* Override void* definitions from PAPI framework layer */ /* with typedefs to conform to PAPI component layer code. 
*/ #undef hwd_siginfo_t #undef hwd_ucontext_t typedef siginfo_t hwd_siginfo_t; typedef struct sigcontext hwd_ucontext_t; #define GET_OVERFLOW_ADDRESS(ctx) (void *)(((hwd_ucontext_t *)(ctx->ucontext))->sc_jmpbuf.jmp_context.iar) #endif /* _PAPI_AIX_CONTEXT */ papi-papi-7-2-0-t/src/aix-lock.h000066400000000000000000000005711502707512200163120ustar00rootroot00000000000000#include /* Locks */ extern atomic_p lock[]; #define _papi_hwd_lock(lck) \ { \ while(_check_lock(lock[lck],0,1) == TRUE) { ; } \ } #define _papi_hwd_unlock(lck) \ { \ _clear_lock(lock[lck], 0); \ } papi-papi-7-2-0-t/src/aix-memory.c000066400000000000000000000057011502707512200166650ustar00rootroot00000000000000/* * File: aix-memory.c * Author: Kevin London * london@cs.utk.edu * * Mods: * */ #include "papi.h" #include "papi_internal.h" #include "aix.h" int _aix_get_memory_info( PAPI_hw_info_t * mem_info, int type ) { PAPI_mh_level_t *L = mem_info->mem_hierarchy.level; /* Not quite sure what bit 30 indicates. I'm assuming it flags a unified tlb */ if ( _system_configuration.tlb_attrib & ( 1 << 30 ) ) { L[0].tlb[0].type = PAPI_MH_TYPE_UNIFIED; L[0].tlb[0].num_entries = _system_configuration.itlb_size; L[0].tlb[0].type = PAPI_MH_TYPE_UNIFIED; } else { L[0].tlb[0].type = PAPI_MH_TYPE_INST; L[0].tlb[0].num_entries = _system_configuration.itlb_size; L[0].tlb[0].associativity = _system_configuration.itlb_asc; L[0].tlb[1].type = PAPI_MH_TYPE_DATA; L[0].tlb[1].num_entries = _system_configuration.dtlb_size; L[0].tlb[1].associativity = _system_configuration.dtlb_asc; } /* Not quite sure what bit 30 indicates. 
I'm assuming it flags a unified cache */ if ( _system_configuration.cache_attrib & ( 1 << 30 ) ) { L[0].cache[0].type = PAPI_MH_TYPE_UNIFIED; L[0].cache[0].size = _system_configuration.icache_size; L[0].cache[0].associativity = _system_configuration.icache_asc; L[0].cache[0].line_size = _system_configuration.icache_line; } else { L[0].cache[0].type = PAPI_MH_TYPE_INST; L[0].cache[0].size = _system_configuration.icache_size; L[0].cache[0].associativity = _system_configuration.icache_asc; L[0].cache[0].line_size = _system_configuration.icache_line; L[0].cache[1].type = PAPI_MH_TYPE_DATA; L[0].cache[1].size = _system_configuration.dcache_size; L[0].cache[1].associativity = _system_configuration.dcache_asc; L[0].cache[1].line_size = _system_configuration.dcache_line; } L[1].cache[0].type = PAPI_MH_TYPE_UNIFIED; L[1].cache[0].size = _system_configuration.L2_cache_size; L[1].cache[0].associativity = _system_configuration.L2_cache_asc; /* is there a line size for Level 2 cache? */ /* it looks like we've always got at least 2 levels of info */ /* what about level 3 cache? */ mem_info->mem_hierarchy.levels = 2; return PAPI_OK; } int _aix_get_dmem_info( PAPI_dmem_info_t * d ) { /* This function has been reimplemented to conform to current interface. It has not been tested. Nor has it been confirmed for completeness. 
dkt 05-10-06 */ struct procsinfo pi; pid_t mypid = getpid( ); pid_t pid; int found = 0; pid = 0; while ( 1 ) { if ( getprocs( &pi, sizeof ( pi ), 0, 0, &pid, 1 ) != 1 ) break; if ( mypid == pi.pi_pid ) { found = 1; break; } } if ( !found ) return ( PAPI_ESYS ); d->size = pi.pi_size; d->resident = pi.pi_drss + pi.pi_trss; d->high_water_mark = PAPI_EINVAL; d->shared = PAPI_EINVAL; d->text = pi.pi_trss; /* this is a guess */ d->library = PAPI_EINVAL; d->heap = PAPI_EINVAL; d->locked = PAPI_EINVAL; d->stack = PAPI_EINVAL; d->pagesize = getpagesize( ); return ( PAPI_OK ); } papi-papi-7-2-0-t/src/aix.c000066400000000000000000001037601502707512200153630ustar00rootroot00000000000000/* This file handles the OS dependent part of the POWER5 and POWER6 architectures. It supports both AIX 4 and AIX 5. The switch between AIX 4 and 5 is driven by the system defined value _AIX_VERSION_510. Other routines also include minor conditionally compiled differences. */ #include #include "papi.h" #include "papi_internal.h" #include "papi_lock.h" #include "papi_memory.h" #include "extras.h" #include "aix.h" #include "papi_vector.h" /* Advance declarations */ papi_vector_t _aix_vector; /* Locking variables */ volatile int lock_var[PAPI_MAX_LOCK] = { 0 }; atomic_p lock[PAPI_MAX_LOCK]; /* some heap information, start_of_text, start_of_data ..... 
ref: http://publibn.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixprggd/genprogc/sys_mem_alloc.htm#HDRA9E4A4C9921SYLV */ #define START_OF_TEXT &_text #define END_OF_TEXT &_etext #define START_OF_DATA &_data #define END_OF_DATA &_edata #define START_OF_BSS &_edata #define END_OF_BSS &_end static int maxgroups = 0; struct utsname AixVer; native_event_entry_t native_table[PAPI_MAX_NATIVE_EVENTS]; hwd_pminfo_t pminfo; pm_groups_info_t pmgroups; native_event_entry_t native_table[PAPI_MAX_NATIVE_EVENTS]; PPC64_native_map_t native_name_map[PAPI_MAX_NATIVE_EVENTS]; hwd_groups_t group_map[MAX_GROUPS] = { 0 }; /* to initialize the native_table */ void aix_initialize_native_table( ) { int i, j; memset( native_table, 0, PAPI_MAX_NATIVE_EVENTS * sizeof ( native_event_entry_t ) ); memset( native_name_map, 0, PAPI_MAX_NATIVE_EVENTS * sizeof ( PPC64_native_map_t ) ); for ( i = 0; i < PAPI_MAX_NATIVE_EVENTS; i++ ) { native_name_map[i].index = -1; for ( j = 0; j < MAX_COUNTERS; j++ ) native_table[i].resources.counter_cmd[j] = -1; } } /* to setup native_table group value */ static void aix_ppc64_setup_gps( int total ) { int i, j, gnum; for ( i = 0; i < total; i++ ) { for ( j = 0; j < MAX_COUNTERS; j++ ) { /* native_table[i].resources.rgg[j]=-1; */ if ( native_table[i].resources.selector & ( 1 << j ) ) { for ( gnum = 0; gnum < pmgroups.maxgroups; gnum++ ) { if ( native_table[i].resources.counter_cmd[j] == pmgroups.event_groups[gnum].events[j] ) { /* could use gnum instead of pmgroups.event_groups[gnum].group_id */ native_table[i].resources.group[pmgroups. event_groups[gnum]. 
group_id / 32] |= 1 << ( pmgroups.event_groups[gnum].group_id % 32 ); } } } } } for ( gnum = 0; gnum < pmgroups.maxgroups; gnum++ ) { for ( i = 0; i < MAX_COUNTERS; i++ ) { /*group_map[gnum].counter_cmd[i] = pmgroups.event_groups[gnum].events[i]; */ if (pmgroups.event_groups[gnum].group_id >=MAX_GROUPS) { fprintf(stderr,"ERROR, group number trying to go past MAX GROUPS\n"); continue; } group_map[pmgroups.event_groups[gnum].group_id].counter_cmd[i] = pmgroups.event_groups[gnum].events[i]; } } } /* to setup native_table values, and return number of entries */ int aix_ppc64_setup_native_table( ) { hwd_pmevents_t *wevp; hwd_pminfo_t *info; int pmc, ev, i, j, index; info = &pminfo; index = 0; aix_initialize_native_table( ); for ( pmc = 0; pmc < info->maxpmcs; pmc++ ) { wevp = info->list_events[pmc]; for ( ev = 0; ev < info->maxevents[pmc]; ev++, wevp++ ) { for ( i = 0; i < index; i++ ) { if ( strcmp( wevp->short_name, native_table[i].name ) == 0 ) { native_table[i].resources.selector |= 1 << pmc; native_table[i].resources.counter_cmd[pmc] = wevp->event_id; break; } } if ( i == index ) { /*native_table[i].index=i; */ native_table[i].resources.selector |= 1 << pmc; native_table[i].resources.counter_cmd[pmc] = wevp->event_id; native_table[i].name = wevp->short_name; native_table[i].description = wevp->description; native_name_map[i].name = native_table[i].name; native_name_map[i].index = i; index++; } } } aix_ppc64_setup_gps( index ); return index; } /* Reports the elements of the hwd_register_t struct as an array of names and a matching array of values. Maximum string length is name_len; Maximum number of values is count. 
*/ static void copy_value( unsigned int val, char *nam, char *names, unsigned int *values, int len ) { *values = val; strncpy( names, nam, len ); names[len - 1] = '\0'; } /* this function recusively does Modified Bipartite Graph counter allocation success return 1 fail return 0 */ static int do_counter_allocation( ppc64_reg_alloc_t * event_list, int size ) { int i, j, group = -1; unsigned int map[GROUP_INTS]; for ( i = 0; i < GROUP_INTS; i++ ) map[i] = event_list[0].ra_group[i]; for ( i = 1; i < size; i++ ) { for ( j = 0; j < GROUP_INTS; j++ ) map[j] &= event_list[i].ra_group[j]; } for ( i = 0; i < GROUP_INTS; i++ ) { if ( map[i] ) { group = ffs( map[i] ) - 1 + i * 32; break; } } if ( group < 0 ) return group; /* allocation fail */ else { for ( i = 0; i < size; i++ ) { for ( j = 0; j < MAX_COUNTERS; j++ ) { if ( event_list[i].ra_counter_cmd[j] >= 0 && event_list[i].ra_counter_cmd[j] == group_map[group].counter_cmd[j] ) event_list[i].ra_position = j; } } return group; } } /* this function will be called when there are counters available success return 1 fail return 0 */ int _aix_allocate_registers( EventSetInfo_t * ESI ) { hwd_control_state_t *this_state = ESI->ctl_state; unsigned char selector; int i, j, natNum, index; ppc64_reg_alloc_t event_list[MAX_COUNTERS]; int position, group; /* not yet successfully mapped, but have enough slots for events */ /* Initialize the local structure needed for counter allocation and optimization. */ natNum = ESI->NativeCount; for ( i = 0; i < natNum; i++ ) { /* CAUTION: Since this is in the hardware layer, it's ok to access the native table directly, but in general this is a bad idea */ event_list[i].ra_position = -1; /* calculate native event rank, which is number of counters it can live on, this is power3 specific */ for ( j = 0; j < MAX_COUNTERS; j++ ) { if ( ( index = native_name_map[ESI->NativeInfoArray[i]. 
ni_event & PAPI_NATIVE_AND_MASK].index ) < 0 ) return PAPI_ECNFLCT; event_list[i].ra_counter_cmd[j] = native_table[index].resources.counter_cmd[j]; } for ( j = 0; j < GROUP_INTS; j++ ) { if ( ( index = native_name_map[ESI->NativeInfoArray[i]. ni_event & PAPI_NATIVE_AND_MASK].index ) < 0 ) return PAPI_ECNFLCT; event_list[i].ra_group[j] = native_table[index].resources.group[j]; } /*event_list[i].ra_mod = -1; */ } if ( ( group = do_counter_allocation( event_list, natNum ) ) >= 0 ) { /* successfully mapped */ /* copy counter allocations info back into NativeInfoArray */ this_state->group_id = group; for ( i = 0; i < natNum; i++ ) ESI->NativeInfoArray[i].ni_position = event_list[i].ra_position; /* update the control structure based on the NativeInfoArray */ /*_papi_hwd_update_control_state(this_state, ESI->NativeInfoArray, natNum);*/ return PAPI_OK; } else { return PAPI_ECNFLCT; } } int _aix_init_control_state( hwd_control_state_t * ptr ) { int i; for ( i = 0; i < _aix_vector.cmp_info.num_cntrs; i++ ) { ptr->counter_cmd.events[i] = COUNT_NOTHING; } ptr->counter_cmd.mode.b.is_group = 1; _aix_vector.set_domain( ptr, _aix_vector.cmp_info.default_domain ); _aix_set_granularity( ptr, _aix_vector.cmp_info.default_granularity ); /*setup_native_table(); */ return ( PAPI_OK ); } /* This function updates the control structure with whatever resources are allocated for all the native events in the native info structure array. */ int _aix_update_control_state( hwd_control_state_t * this_state, NativeInfo_t * native, int count, hwd_context_t * context ) { this_state->counter_cmd.events[0] = this_state->group_id; return PAPI_OK; } /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/ /* The following is for any POWER hardware */ /* Trims trailing blank space and line endings from a string (in place). 
Returns pointer to start address */ static char * trim_string( char *in ) { int len, i = 0; char *start = in; if ( in == NULL ) return ( in ); len = strlen( in ); if ( len == 0 ) return ( in ); /* Trim right */ i = strlen( start ) - 1; while ( i >= 0 ) { if ( isblank( start[i] ) || ( start[i] == '\r' ) || ( start[i] == '\n' ) ) start[i] = '\0'; else break; i--; } return ( start ); } /* Routines to support an opaque native event table */ int _aix_ntv_code_to_name( unsigned int EventCode, char *ntv_name, int len ) { if ( ( EventCode & PAPI_NATIVE_AND_MASK ) >= _aix_vector.cmp_info.num_native_events ) return ( PAPI_ENOEVNT ); strncpy( ntv_name, native_name_map[EventCode & PAPI_NATIVE_AND_MASK].name, len ); trim_string( ntv_name ); if ( strlen( native_name_map[EventCode & PAPI_NATIVE_AND_MASK].name ) > len - 1 ) return ( PAPI_EBUF ); return ( PAPI_OK ); } int _aix_ntv_code_to_descr( unsigned int EventCode, char *ntv_descr, int len ) { if ( ( EventCode & PAPI_NATIVE_AND_MASK ) >= _aix_vector.cmp_info.num_native_events ) return ( PAPI_ENOEVNT ); strncpy( ntv_descr, native_table[native_name_map[EventCode & PAPI_NATIVE_AND_MASK]. index].description, len ); trim_string( ntv_descr ); if ( strlen ( native_table [native_name_map[EventCode & PAPI_NATIVE_AND_MASK].index]. description ) > len - 1 ) return ( PAPI_EBUF ); return ( PAPI_OK ); } int _aix_ntv_code_to_bits( unsigned int EventCode, hwd_register_t * bits ) { bits = &native_table[EventCode & PAPI_NATIVE_AND_MASK].resources; /* it is not right, different type */ return ( PAPI_OK ); } /* this function return the next native event code. modifier = PAPI_ENUM_FIRST returns first native event code modifier = PAPI_ENUM_EVENTS returns next native event code modifier = PAPI_NTV_ENUM_GROUPS return groups in which this native event lives, in bits 16 - 23 of event code terminating with PAPI_ENOEVNT at the end of the list. 
function return value: PAPI_OK successful, event code is valid PAPI_EINVAL bad modifier PAPI_ENOEVNT end of list or fail, event code is invalid */ int _aix_ntv_enum_events( unsigned int *EventCode, int modifier ) { if ( modifier == PAPI_ENUM_FIRST ) { *EventCode = PAPI_NATIVE_MASK; return ( PAPI_OK ); } if ( modifier == PAPI_ENUM_EVENTS ) { int index = *EventCode & PAPI_NATIVE_AND_MASK; if ( native_table[index + 1].resources.selector ) { *EventCode = *EventCode + 1; return ( PAPI_OK ); } else return ( PAPI_ENOEVNT ); } else if ( modifier == PAPI_NTV_ENUM_GROUPS ) { #if defined(_POWER5) || defined(_POWER6) unsigned int group = ( *EventCode & PAPI_NTV_GROUP_AND_MASK ) >> PAPI_NTV_GROUP_SHIFT; int index = *EventCode & 0x000000FF; int i; unsigned int tmpg; *EventCode = *EventCode & ( ~PAPI_NTV_GROUP_SHIFT ); for ( i = 0; i < GROUP_INTS; i++ ) { tmpg = native_table[index].resources.group[i]; if ( group != 0 ) { while ( ( ffs( tmpg ) + i * 32 ) <= group && tmpg != 0 ) tmpg = tmpg ^ ( 1 << ( ffs( tmpg ) - 1 ) ); } if ( tmpg != 0 ) { group = ffs( tmpg ) + i * 32; *EventCode = *EventCode | ( group << PAPI_NTV_GROUP_SHIFT ); return ( PAPI_OK ); } } #endif return ( PAPI_ENOEVNT ); } else return ( PAPI_EINVAL ); } static void set_config( hwd_control_state_t * ptr, int arg1, int arg2 ) { ptr->counter_cmd.events[arg1] = arg2; } static void unset_config( hwd_control_state_t * ptr, int arg1 ) { ptr->counter_cmd.events[arg1] = 0; } int init_domain( ) { int domain = 0; domain = PAPI_DOM_USER | PAPI_DOM_KERNEL | PAPI_DOM_OTHER; #ifdef PM_INITIALIZE #ifdef _AIXVERSION_510 if ( pminfo.proc_feature.b.hypervisor ) { domain |= PAPI_DOM_SUPERVISOR; } #endif #endif return ( domain ); } static int _aix_set_domain( hwd_control_state_t * this_state, int domain ) { pm_mode_t *mode = &( this_state->counter_cmd.mode ); int did = 0; mode->b.user = 0; mode->b.kernel = 0; if ( domain & PAPI_DOM_USER ) { did++; mode->b.user = 1; } if ( domain & PAPI_DOM_KERNEL ) { did++; mode->b.kernel = 1; } #ifdef 
PM_INITIALIZE #ifdef _AIXVERSION_510 if ( ( domain & PAPI_DOM_SUPERVISOR ) && pminfo.proc_feature.b.hypervisor ) { did++; mode->b.hypervisor = 1; } #endif #endif if ( did ) return ( PAPI_OK ); else return ( PAPI_EINVAL ); /* switch (domain) { case PAPI_DOM_USER: mode->b.user = 1; mode->b.kernel = 0; break; case PAPI_DOM_KERNEL: mode->b.user = 0; mode->b.kernel = 1; break; case PAPI_DOM_ALL: mode->b.user = 1; mode->b.kernel = 1; break; default: return(PAPI_EINVAL); } return(PAPI_OK); */ } int _aix_set_granularity( hwd_control_state_t * this_state, int domain ) { pm_mode_t *mode = &( this_state->counter_cmd.mode ); switch ( domain ) { case PAPI_GRN_THR: mode->b.process = 0; mode->b.proctree = 0; break; /* case PAPI_GRN_PROC: mode->b.process = 1; mode->b.proctree = 0; break; case PAPI_GRN_PROCG: mode->b.process = 0; mode->b.proctree = 1; break; */ default: return ( PAPI_EINVAL ); } return ( PAPI_OK ); } static int set_default_domain( EventSetInfo_t * zero, int domain ) { hwd_control_state_t *current_state = zero->ctl_state; return ( _aix_set_domain( current_state, domain ) ); } static int set_default_granularity( EventSetInfo_t * zero, int granularity ) { hwd_control_state_t *current_state = zero->ctl_state; return ( _aix_set_granularity( current_state, granularity ) ); } /* Initialize the system-specific settings */ /* Machine info structure. -1 is unused. 
*/ int _aix_mdi_init( ) { int retval; if ( ( retval = uname( &AixVer ) ) < 0 ) return ( PAPI_ESYS ); if ( AixVer.version[0] == '4' ) { _papi_hwi_system_info.exe_info.address_info.text_start = ( vptr_t ) START_OF_TEXT; _papi_hwi_system_info.exe_info.address_info.text_end = ( vptr_t ) END_OF_TEXT; _papi_hwi_system_info.exe_info.address_info.data_start = ( vptr_t ) START_OF_DATA; _papi_hwi_system_info.exe_info.address_info.data_end = ( vptr_t ) END_OF_DATA; _papi_hwi_system_info.exe_info.address_info.bss_start = ( vptr_t ) START_OF_BSS; _papi_hwi_system_info.exe_info.address_info.bss_end = ( vptr_t ) END_OF_BSS; } else { _aix_update_shlib_info( &_papi_hwi_system_info ); } /* _papi_hwi_system_info.supports_64bit_counters = 1; _papi_hwi_system_info.supports_real_usec = 1; _papi_hwi_system_info.sub_info.fast_real_timer = 1; _papi_hwi_system_info.sub_info->available_domains = init_domain();*/ return ( PAPI_OK ); } static int _aix_get_system_info( papi_mdi_t *mdi ) { int retval; /* pm_info_t pminfo; */ struct procsinfo psi = { 0 }; pid_t pid; char maxargs[PAPI_HUGE_STR_LEN]; char pname[PAPI_HUGE_STR_LEN]; pid = getpid( ); if ( pid == -1 ) return ( PAPI_ESYS ); _papi_hwi_system_info.pid = pid; psi.pi_pid = pid; retval = getargs( &psi, sizeof ( psi ), maxargs, PAPI_HUGE_STR_LEN ); if ( retval == -1 ) return ( PAPI_ESYS ); if ( realpath( maxargs, pname ) ) strncpy( _papi_hwi_system_info.exe_info.fullname, pname, PAPI_HUGE_STR_LEN ); else strncpy( _papi_hwi_system_info.exe_info.fullname, maxargs, PAPI_HUGE_STR_LEN ); strcpy( _papi_hwi_system_info.exe_info.address_info.name, basename( maxargs ) ); #ifdef _POWER7 /* we pass PM_POWER7 for the same reasons as below (power6 case) */ retval = pm_initialize( PM_INIT_FLAGS , &pminfo, &pmgroups, PM_POWER7); #elif defined(_POWER6) /* problem with pm_initialize(): it cannot be called multiple times with PM_CURRENT; use instead the actual proc type - here PM_POWER6 - and multiple invocations are no longer a problem */ retval = 
pm_initialize( PM_INIT_FLAGS, &pminfo, &pmgroups, PM_POWER6 ); #else #ifdef _AIXVERSION_510 #ifdef PM_INITIALIZE SUBDBG( "Calling AIX 5 version of pm_initialize...\n" ); /*#if defined(_POWER5) retval = pm_initialize(PM_INIT_FLAGS, &pminfo, &pmgroups, PM_POWER5); #endif*/ retval = pm_initialize( PM_INIT_FLAGS, &pminfo, &pmgroups, PM_CURRENT ); #else SUBDBG( "Calling AIX 5 version of pm_init...\n" ); retval = pm_init( PM_INIT_FLAGS, &pminfo, &pmgroups ); #endif #else SUBDBG( "Calling AIX 4 version of pm_init...\n" ); retval = pm_init( PM_INIT_FLAGS, &pminfo ); #endif #endif SUBDBG( "...Back from pm_init\n" ); if ( retval > 0 ) return ( retval ); _aix_mdi_init( ); _papi_hwi_system_info.hw_info.nnodes = 1; _papi_hwi_system_info.hw_info.ncpu = _system_configuration.ncpus; _papi_hwi_system_info.hw_info.totalcpus = _papi_hwi_system_info.hw_info.ncpu * _papi_hwi_system_info.hw_info.nnodes; _papi_hwi_system_info.hw_info.vendor = -1; strcpy( _papi_hwi_system_info.hw_info.vendor_string, "IBM" ); _papi_hwi_system_info.hw_info.model = _system_configuration.implementation; strcpy( _papi_hwi_system_info.hw_info.model_string, pminfo.proc_name ); _papi_hwi_system_info.hw_info.revision = ( float ) _system_configuration.version; _papi_hwi_system_info.hw_info.mhz = ( float ) ( pm_cycles( ) / 1000000.0 ); _papi_hwi_system_info.hw_info.cpu_max_mhz=_papi_hwi_system_info.hw_info.mhz; _papi_hwi_system_info.hw_info.cpu_min_mhz=_papi_hwi_system_info.hw_info.mhz; /* _papi_hwi_system_info.num_gp_cntrs = pminfo.maxpmcs;*/ _aix_vector.cmp_info.num_cntrs = pminfo.maxpmcs; _aix_vector.cmp_info.num_mpx_cntrs = MAX_MPX_COUNTERS; // pminfo.maxpmcs, _aix_vector.cmp_info.available_granularities = PAPI_GRN_THR; /* This field doesn't appear to exist in the PAPI 3.0 structure _papi_hwi_system_info.cpunum = mycpu(); */ _aix_vector.cmp_info.available_domains = init_domain( ); return PAPI_OK; } /* Low level functions, should not handle errors, just return codes. 
*/ /* At init time, the higher level library should always allocate and reserve EventSet zero. */ long long _aix_get_real_usec( void ) { timebasestruct_t t; long long retval; read_real_time( &t, TIMEBASE_SZ ); time_base_to_time( &t, TIMEBASE_SZ ); retval = ( t.tb_high * 1000000 ) + t.tb_low / 1000; return ( retval ); } long long _aix_get_real_cycles( void ) { return ( _aix_get_real_usec( ) * ( long long ) _papi_hwi_system_info.hw_info.cpu_max_mhz ); } long long _aix_get_virt_usec( void ) { long long retval; struct tms buffer; times( &buffer ); SUBDBG( "user %d system %d\n", ( int ) buffer.tms_utime, ( int ) buffer.tms_stime ); retval = ( long long ) ( ( buffer.tms_utime + buffer.tms_stime ) * ( 1000000 / CLK_TCK ) ); return ( retval ); } static void _aix_lock_init( void ) { int i; for ( i = 0; i < PAPI_MAX_LOCK; i++ ) lock[i] = ( int * ) ( lock_var + i ); } int _aix_shutdown_thread( hwd_context_t * ctx ) { return ( PAPI_OK ); } int _aix_init_component( int cidx ) { int retval = PAPI_OK, procidx; /* Fill in what we can of the papi_system_info. 
*/ retval = _papi_os_vector.get_system_info( &_papi_hwi_system_info ); if ( retval ) return ( retval ); /* Setup memory info */ retval = _papi_os_vector.get_memory_info( &_papi_hwi_system_info.hw_info, 0 ); if ( retval ) return ( retval ); SUBDBG( "Found %d %s %s CPUs at %d Mhz.\n", _papi_hwi_system_info.hw_info.totalcpus, _papi_hwi_system_info.hw_info.vendor_string, _papi_hwi_system_info.hw_info.model_string, _papi_hwi_system_info.hw_info.cpu_max_mhz ); _aix_vector.cmp_info.CmpIdx = cidx; _aix_vector.cmp_info.num_native_events = aix_ppc64_setup_native_table( ); procidx = pm_get_procindex( ); switch ( procidx ) { case PM_POWER5: _papi_load_preset_table( "POWER5", 0, cidx ); break; case PM_POWER5_II: _papi_load_preset_table( "POWER5+", 0, cidx ); break; case PM_POWER6: _papi_load_preset_table( "POWER6", 0, cidx ); break; case PM_PowerPC970: _papi_load_preset_table( "PPC970", 0, cidx ); break; case PM_POWER7: _papi_load_preset_table( "POWER7", 0, cidx ); break; default: fprintf( stderr, "%s is not supported!\n", pminfo.proc_name ); return PAPI_ENOIMPL; } _aix_lock_init( ); return ( retval ); } int _aix_init_thread( hwd_context_t * context ) { int retval; /* Initialize our global control state. */ _aix_init_control_state( &context->cntrl ); } /* Go from highest counter to lowest counter. Why? Because there are usually more counters on #1, so we try the least probable first. 
*/ static int get_avail_hwcntr_bits( int cntr_avail_bits ) { int tmp = 0, i = 1 << ( POWER_MAX_COUNTERS - 1 ); while ( i ) { tmp = i & cntr_avail_bits; if ( tmp ) return ( tmp ); i = i >> 1; } return ( 0 ); } static void set_hwcntr_codes( int selector, unsigned char *from, int *to ) { int useme, i; for ( i = 0; i < _aix_vector.cmp_info.num_cntrs; i++ ) { useme = ( 1 << i ) & selector; if ( useme ) { to[i] = from[i]; } } } #ifdef DEBUG void dump_cmd( pm_prog_t * t ) { SUBDBG( "mode.b.threshold %d\n", t->mode.b.threshold ); SUBDBG( "mode.b.spare %d\n", t->mode.b.spare ); SUBDBG( "mode.b.process %d\n", t->mode.b.process ); SUBDBG( "mode.b.kernel %d\n", t->mode.b.kernel ); SUBDBG( "mode.b.user %d\n", t->mode.b.user ); SUBDBG( "mode.b.count %d\n", t->mode.b.count ); SUBDBG( "mode.b.proctree %d\n", t->mode.b.proctree ); SUBDBG( "events[0] %d\n", t->events[0] ); SUBDBG( "events[1] %d\n", t->events[1] ); SUBDBG( "events[2] %d\n", t->events[2] ); SUBDBG( "events[3] %d\n", t->events[3] ); SUBDBG( "events[4] %d\n", t->events[4] ); SUBDBG( "events[5] %d\n", t->events[5] ); SUBDBG( "events[6] %d\n", t->events[6] ); SUBDBG( "events[7] %d\n", t->events[7] ); SUBDBG( "reserved %d\n", t->reserved ); } void dump_data( long long *vals ) { int i; for ( i = 0; i < MAX_COUNTERS; i++ ) { SUBDBG( "counter[%d] = %lld\n", i, vals[i] ); } } #endif int _aix_reset( hwd_context_t * ESI, hwd_control_state_t * zero ) { int retval = pm_reset_data_mythread( ); if ( retval > 0 ) { if ( _papi_hwi_error_level != PAPI_QUIET ) pm_error( "PAPI Error: pm_reset_data_mythread", retval ); return ( retval ); } return ( PAPI_OK ); } int _aix_read( hwd_context_t * ctx, hwd_control_state_t * spc, long long **vals, int flags ) { int retval; retval = pm_get_data_mythread( &spc->state ); if ( retval > 0 ) { if ( _papi_hwi_error_level != PAPI_QUIET ) pm_error( "PAPI Error: pm_get_data_mythread", retval ); return ( retval ); } *vals = spc->state.accu; #ifdef DEBUG if ( ISLEVEL( DEBUG_SUBSTRATE ) ) dump_data( *vals ); 
#endif
	return ( PAPI_OK );
}

static int
round_requested_ns( int ns )
{
	if ( ns <= _papi_os_info.itimer_res_ns ) {
		return _papi_os_info.itimer_res_ns;
	} else {
		int leftover_ns = ns % _papi_os_info.itimer_res_ns;
		return ( ns - leftover_ns + _papi_os_info.itimer_res_ns );
	}
}

int
_aix_ctl( hwd_context_t * ctx, int code, _papi_int_option_t * option )
{
	switch ( code ) {
/* I don't understand what it means to set the default domain
   case PAPI_DEFDOM:
      return(set_default_domain(zero, option->domain.domain));
*/
	case PAPI_DOMAIN:
		return ( _aix_set_domain( option->domain.ESI->ctl_state,
								  option->domain.domain ) );
/* I don't understand what it means to set the default granularity
   case PAPI_DEFGRN:
      return(set_default_granularity(zero, option->granularity.granularity));
*/
	case PAPI_GRANUL:
		return ( _aix_set_granularity( option->domain.ESI->ctl_state,
									   option->granularity.granularity ) );
#if 0
	case PAPI_INHERIT:
		return ( set_inherit( option->inherit.inherit ) );
#endif
	case PAPI_DEF_ITIMER:
	{
		/* flags are currently ignored; eventually the flags will be able to
		   specify whether or not we use POSIX itimers (clock_gettimer) */
		if ( ( option->itimer.itimer_num == ITIMER_REAL ) &&
			 ( option->itimer.itimer_sig != SIGALRM ) )
			return PAPI_EINVAL;
		if ( ( option->itimer.itimer_num == ITIMER_VIRTUAL ) &&
			 ( option->itimer.itimer_sig != SIGVTALRM ) )
			return PAPI_EINVAL;
		if ( ( option->itimer.itimer_num == ITIMER_PROF ) &&
			 ( option->itimer.itimer_sig != SIGPROF ) )
			return PAPI_EINVAL;
		if ( option->itimer.ns > 0 )
			option->itimer.ns = round_requested_ns( option->itimer.ns );
		/* At this point we assume the user knows what he or she is doing;
		   they may be doing something arch specific */
		return PAPI_OK;
	}
	case PAPI_DEF_MPX_NS:
	{
		option->multiplex.ns = round_requested_ns( option->multiplex.ns );
		return ( PAPI_OK );
	}
	case PAPI_DEF_ITIMER_NS:
	{
		option->itimer.ns = round_requested_ns( option->itimer.ns );
		return ( PAPI_OK );
	}
	default:
		return ( PAPI_ENOSUPP );
	}
}

void
_aix_dispatch_timer( int
					 signal, siginfo_t * si, void *i )
{
	_papi_hwi_context_t ctx;
	ThreadInfo_t *t = NULL;
	vptr_t address;

	ctx.si = si;
	ctx.ucontext = ( hwd_ucontext_t * ) i;

	address = ( vptr_t ) GET_OVERFLOW_ADDRESS( ( &ctx ) );
	_papi_hwi_dispatch_overflow_signal( ( void * ) &ctx, address, NULL,
										0, 0, &t,
										_aix_vector.cmp_info.CmpIdx );
}

int
_aix_set_overflow( EventSetInfo_t * ESI, int EventIndex, int threshold )
{
	hwd_control_state_t *this_state = ESI->ctl_state;

	return ( PAPI_OK );
}

void *
_aix_get_overflow_address( void *context )
{
	void *location;
	struct sigcontext *info = ( struct sigcontext * ) context;
	location = ( void * ) info->sc_jmpbuf.jmp_context.iar;

	return ( location );
}

/* Copy the current control_state into the new thread context */
/*int _papi_hwd_start(EventSetInfo_t *ESI, EventSetInfo_t *zero)*/
int
_aix_start( hwd_context_t * ctx, hwd_control_state_t * cntrl )
{
	int i, retval;
	hwd_control_state_t *current_state = &ctx->cntrl;

	/* If we are nested, merge the global counter structure
	   with the current eventset */
	SUBDBG( "Start\n" );

	/* Copy the global counter structure to the current eventset */
	SUBDBG( "Copying states\n" );
	memcpy( current_state, cntrl, sizeof ( hwd_control_state_t ) );

	retval = pm_set_program_mythread( &current_state->counter_cmd );
	if ( retval > 0 ) {
		if ( retval == 13 ) {
			retval = pm_delete_program_mythread( );
			if ( retval > 0 ) {
				if ( _papi_hwi_error_level != PAPI_QUIET )
					pm_error( "PAPI Error: pm_delete_program_mythread",
							  retval );
				return ( retval );
			}
			retval = pm_set_program_mythread( &current_state->counter_cmd );
			if ( retval > 0 ) {
				if ( _papi_hwi_error_level != PAPI_QUIET )
					pm_error( "PAPI Error: pm_set_program_mythread", retval );
				return ( retval );
			}
		} else {
			if ( _papi_hwi_error_level != PAPI_QUIET )
				pm_error( "PAPI Error: pm_set_program_mythread", retval );
			return ( retval );
		}
	}

	/* Set up the new merged control structure */
#if 0
	dump_cmd( &current_state->counter_cmd );
#endif

	/* Start the counters */
	retval = pm_start_mythread( );
	if ( retval > 0 ) {
if ( _papi_hwi_error_level != PAPI_QUIET ) pm_error( "pm_start_mythread()", retval ); return ( retval ); } return ( PAPI_OK ); } int _aix_stop( hwd_context_t * ctx, hwd_control_state_t * cntrl ) { int retval; retval = pm_stop_mythread( ); if ( retval > 0 ) { if ( _papi_hwi_error_level != PAPI_QUIET ) pm_error( "pm_stop_mythread()", retval ); return ( retval ); } retval = pm_delete_program_mythread( ); if ( retval > 0 ) { if ( _papi_hwi_error_level != PAPI_QUIET ) pm_error( "pm_delete_program_mythread()", retval ); return ( retval ); } return ( PAPI_OK ); } int _aix_update_shlib_info( papi_mdi_t *mdi ) { #if ( ( defined( _AIXVERSION_510) || defined(_AIXVERSION_520))) struct ma_msg_s { long flag; char *name; } ma_msgs[] = { { MA_MAINEXEC, "MAINEXEC"}, { MA_KERNTEXT, "KERNTEXT"}, { MA_READ, "READ"}, { MA_WRITE, "WRITE"}, { MA_EXEC, "EXEC"}, { MA_SHARED, "SHARED"}, { MA_BREAK, "BREAK"}, { MA_STACK, "STACK"},}; char fname[80], name[PAPI_HUGE_STR_LEN]; prmap_t newp; int count, t_index, retval, i, j, not_first_flag_bit; FILE *map_f; void *vaddr; prmap_t *tmp1 = NULL; PAPI_address_map_t *tmp2 = NULL; sprintf( fname, "/proc/%d/map", getpid( ) ); map_f = fopen( fname, "r" ); if ( !map_f ) { PAPIERROR( "fopen(%s) returned < 0", fname ); return ( PAPI_OK ); } /* count the entries we need */ count = 0; t_index = 0; while ( ( retval = fread( &newp, sizeof ( prmap_t ), 1, map_f ) ) > 0 ) { if ( newp.pr_pathoff > 0 && newp.pr_mapname[0] != '\0' ) { if ( newp.pr_mflags & MA_STACK ) continue; count++; SUBDBG( "count=%d offset=%ld map=%s\n", count, newp.pr_pathoff, newp.pr_mapname ); if ( ( newp.pr_mflags & MA_READ ) && ( newp.pr_mflags & MA_EXEC ) ) t_index++; } } rewind( map_f ); tmp1 = ( prmap_t * ) papi_calloc( ( count + 1 ), sizeof ( prmap_t ) ); if ( tmp1 == NULL ) return ( PAPI_ENOMEM ); tmp2 = ( PAPI_address_map_t * ) papi_calloc( t_index, sizeof ( PAPI_address_map_t ) ); if ( tmp2 == NULL ) return ( PAPI_ENOMEM ); i = 0; t_index = -1; while ( ( retval = fread( &tmp1[i], 
sizeof ( prmap_t ), 1, map_f ) ) > 0 ) { if ( tmp1[i].pr_pathoff > 0 && tmp1[i].pr_mapname[0] != '\0' ) if ( !( tmp1[i].pr_mflags & MA_STACK ) ) i++; } for ( i = 0; i < count; i++ ) { char c; int cc = 0; retval = fseek( map_f, tmp1[i].pr_pathoff, SEEK_SET ); if ( retval != 0 ) return ( PAPI_ESYS ); while ( fscanf( map_f, "%c", &c ) != EOF ) { name[cc] = c; /* how many char are hold in /proc/xxxx/map */ cc++; if ( c == '\0' ) break; } /* currently /proc/xxxx/map file holds only 33 char per line (incl NULL char); * if executable name > 32 char, compare first 32 char only */ if ( strncmp( _papi_hwi_system_info.exe_info.address_info.name, basename( name ), cc - 1 ) == 0 ) { if ( strlen( _papi_hwi_system_info.exe_info.address_info.name ) != cc - 1 ) PAPIERROR ( "executable name too long (%d char). Match of first %d char only", strlen( _papi_hwi_system_info.exe_info.address_info. name ), cc - 1 ); if ( tmp1[i].pr_mflags & MA_READ ) { if ( tmp1[i].pr_mflags & MA_EXEC ) { _papi_hwi_system_info.exe_info.address_info. text_start = ( vptr_t ) tmp1[i].pr_vaddr; _papi_hwi_system_info.exe_info.address_info. text_end = ( vptr_t ) ( tmp1[i].pr_vaddr + tmp1[i].pr_size ); } else if ( tmp1[i].pr_mflags & MA_WRITE ) { _papi_hwi_system_info.exe_info.address_info. data_start = ( vptr_t ) tmp1[i].pr_vaddr; _papi_hwi_system_info.exe_info.address_info. 
data_end = ( vptr_t ) ( tmp1[i].pr_vaddr + tmp1[i].pr_size ); } } } else { if ( ( _papi_hwi_system_info.exe_info.address_info.text_start == 0 ) && ( _papi_hwi_system_info.exe_info.address_info.text_end == 0 ) && ( _papi_hwi_system_info.exe_info.address_info.data_start == 0 ) && ( _papi_hwi_system_info.exe_info.address_info.data_end == 0 ) ) PAPIERROR( "executable name not recognized" ); if ( tmp1[i].pr_mflags & MA_READ ) { if ( tmp1[i].pr_mflags & MA_EXEC ) { t_index++; tmp2[t_index].text_start = ( vptr_t ) tmp1[i].pr_vaddr; tmp2[t_index].text_end = ( vptr_t ) ( tmp1[i].pr_vaddr + tmp1[i].pr_size ); strncpy( tmp2[t_index].name, name, PAPI_MAX_STR_LEN ); } else if ( tmp1[i].pr_mflags & MA_WRITE ) { tmp2[t_index].data_start = ( vptr_t ) tmp1[i].pr_vaddr; tmp2[t_index].data_end = ( vptr_t ) ( tmp1[i].pr_vaddr + tmp1[i].pr_size ); } } } } fclose( map_f ); if ( _papi_hwi_system_info.shlib_info.map ) papi_free( _papi_hwi_system_info.shlib_info.map ); _papi_hwi_system_info.shlib_info.map = tmp2; _papi_hwi_system_info.shlib_info.count = t_index + 1; papi_free( tmp1 ); return PAPI_OK; #else return PAPI_ENOIMPL; #endif } int _aix_ntv_name_to_code( const char *name, unsigned int *evtcode ) { int i; for ( i = 0; i < PAPI_MAX_NATIVE_EVENTS; i++ ) if ( strcmp( name, native_name_map[i].name ) == 0 ) { *evtcode = native_name_map[i].index | PAPI_NATIVE_MASK; return PAPI_OK; } return PAPI_ENOEVNT; } PAPI_os_info_t _papi_os_info; int _papi_hwi_init_os(void) { struct utsname uname_buffer; uname(&uname_buffer); strncpy(_papi_os_info.name,uname_buffer.sysname,PAPI_MAX_STR_LEN); strncpy(_papi_os_info.version,uname_buffer.release,PAPI_MAX_STR_LEN); _papi_os_info.itimer_sig = PAPI_INT_MPX_SIGNAL; _papi_os_info.itimer_num = PAPI_INT_ITIMER; _papi_os_info.itimer_res_ns = 1; _papi_os_info.itimer_ns = 1000 * PAPI_INT_MPX_DEF_US; return PAPI_OK; } papi_vector_t _aix_vector = { .cmp_info = { /* default component information (unspecified values are initialized to 0) */ .name = "aix", .description 
= "AIX pmapi CPU counters", .default_domain = PAPI_DOM_USER, .available_domains = PAPI_DOM_USER | PAPI_DOM_KERNEL, .default_granularity = PAPI_GRN_THR, .available_granularities = PAPI_GRN_THR, .hardware_intr_sig = PAPI_INT_SIGNAL, /* component specific cmp_info initializations */ .fast_real_timer = 1, .fast_virtual_timer = 1, .attach = 1, .attach_must_ptrace = 1, .cntr_umasks = 1, } , /* sizes of framework-opaque component-private structures these are remapped in pmapi_ppc64.h, ppc64_events.h */ .size = { .context = sizeof ( hwd_context_t ), .control_state = sizeof ( hwd_control_state_t ), .reg_value = sizeof ( hwd_register_t ), .reg_alloc = sizeof ( hwd_reg_alloc_t ), } , /* function pointers in this component */ .init_control_state = _aix_init_control_state, .start = _aix_start, .stop = _aix_stop, .read = _aix_read, .allocate_registers = _aix_allocate_registers, .update_control_state = _aix_update_control_state, .set_domain = _aix_set_domain, .reset = _aix_reset, .set_overflow = _aix_set_overflow, /* .stop_profiling = _aix_stop_profiling, */ .ntv_enum_events = _aix_ntv_enum_events, .ntv_name_to_code = _aix_ntv_name_to_code, .ntv_code_to_name = _aix_ntv_code_to_name, .ntv_code_to_descr = _aix_ntv_code_to_descr, .ntv_code_to_bits = _aix_ntv_code_to_bits, .init_component = _aix_init_component, .ctl = _aix_ctl, .dispatch_timer = _aix_dispatch_timer, .init_thread = _aix_init_thread, .shutdown_thread = _aix_shutdown_thread, }; papi_os_vector_t _papi_os_vector = { .get_memory_info = _aix_get_memory_info, .get_dmem_info = _aix_get_dmem_info, .get_real_usec = _aix_get_real_usec, .get_real_cycles = _aix_get_real_cycles, .get_virt_usec = _aix_get_virt_usec, .update_shlib_info = _aix_update_shlib_info, .get_system_info = _aix_get_system_info, }; papi-papi-7-2-0-t/src/aix.h000066400000000000000000000061351502707512200153660ustar00rootroot00000000000000#ifndef _PAPI_AIX_H /* _PAPI_AIX */ #define _PAPI_AIX_H /****************************/ /* THIS IS OPEN SOURCE CODE */ 
/****************************/ /* * File: pmapi-ppc64.h * Author: Maynard Johnson * maynardj@us.ibm.com * Mods: * */ #include #include #include #include #include #include #include #include #if defined( _AIXVERSION_510) || defined(_AIXVERSION_520) #include #include #endif #include #include #include #include #include #include #include #include #include "pmapi.h" #define ANY_THREAD_GETS_SIGNAL #define POWER_MAX_COUNTERS MAX_COUNTERS #define MAX_COUNTER_TERMS MAX_COUNTERS #define MAX_MPX_COUNTERS 32 #define INVALID_EVENT -2 #define POWER_MAX_COUNTERS_MAPPING 8 extern _text; extern _etext; extern _edata; extern _end; extern _data; /* globals */ #ifdef PM_INITIALIZE #ifdef _AIXVERSION_510 #define PMINFO_T pm_info2_t #define PMEVENTS_T pm_events2_t #else #define PMINFO_T pm_info_t #define PMEVENTS_T pm_events_t #endif PMINFO_T pminfo; #else #define PMINFO_T pm_info_t #define PMEVENTS_T pm_events_t /*pm_info_t pminfo;*/ #endif #include "aix-context.h" /* define the vector structure at the bottom of this file */ #define PM_INIT_FLAGS PM_VERIFIED|PM_UNVERIFIED|PM_CAVEAT|PM_GET_GROUPS #ifdef PM_INITIALIZE typedef pm_info2_t hwd_pminfo_t; typedef pm_events2_t hwd_pmevents_t; #else typedef pm_info_t hwd_pminfo_t; typedef pm_events_t hwd_pmevents_t; #endif #include "ppc64_events.h" typedef struct ppc64_pmapi_control { /* Buffer to pass to the kernel to control the counters */ pm_prog_t counter_cmd; int group_id; /* Space to read the counters */ pm_data_t state; } ppc64_pmapi_control_t; typedef struct ppc64_reg_alloc { int ra_position; unsigned int ra_group[GROUP_INTS]; int ra_counter_cmd[MAX_COUNTERS]; } ppc64_reg_alloc_t; typedef struct ppc64_pmapi_context { /* this structure is a work in progress */ ppc64_pmapi_control_t cntrl; } ppc64_pmapi_context_t; /* Override void* definitions from PAPI framework layer */ /* typedefs to conform to hardware independent PAPI code. 
 */
#undef hwd_control_state_t
#undef hwd_reg_alloc_t
#undef hwd_context_t
typedef ppc64_pmapi_control_t hwd_control_state_t;
typedef ppc64_reg_alloc_t hwd_reg_alloc_t;
typedef ppc64_pmapi_context_t hwd_context_t;

/*
typedef struct hwd_groups {
   // group number from the pmapi pm_groups_t struct
   //int group_id;
   // Buffer containing counter cmds for this group
   unsigned char counter_cmd[POWER_MAX_COUNTERS];
} hwd_groups_t;
*/

/* prototypes */
extern int _aix_set_granularity( hwd_control_state_t * this_state,
								 int domain );
extern int _papi_hwd_init_preset_search_map( hwd_pminfo_t * info );
extern int _aix_get_memory_info( PAPI_hw_info_t * mem_info, int type );
extern int _aix_get_dmem_info( PAPI_dmem_info_t * d );

/* Machine dependent info structure */
extern pm_groups_info_t pmgroups;

#endif /* _PAPI_AIX */

/* src/atomic_ops.h */

/*
 * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P.
 * Copyright (c) 2008-2022 Ivan Maidanski
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef AO_ATOMIC_OPS_H #define AO_ATOMIC_OPS_H #include "atomic_ops/ao_version.h" /* Define version numbers here to allow */ /* test on build machines for cross-builds. */ #include #include /* We define various atomic operations on memory in a */ /* machine-specific way. Unfortunately, this is complicated */ /* by the fact that these may or may not be combined with */ /* various memory barriers. Thus the actual operations we */ /* define have the form AO__, for all */ /* plausible combinations of and . */ /* This of course results in a mild combinatorial explosion. */ /* To deal with it, we try to generate derived */ /* definitions for as many of the combinations as we can, as */ /* automatically as possible. */ /* */ /* Our assumption throughout is that the programmer will */ /* specify the least demanding operation and memory barrier */ /* that will guarantee correctness for the implementation. */ /* Our job is to find the least expensive way to implement it */ /* on the applicable hardware. In many cases that will */ /* involve, for example, a stronger memory barrier, or a */ /* combination of hardware primitives. */ /* */ /* Conventions: */ /* "plain" atomic operations are not guaranteed to include */ /* a barrier. The suffix in the name specifies the barrier */ /* type. Suffixes are: */ /* _release: Earlier operations may not be delayed past it. */ /* _acquire: Later operations may not move ahead of it. */ /* _read: Subsequent reads must follow this operation and */ /* preceding reads. */ /* _write: Earlier writes precede both this operation and */ /* later writes. */ /* _full: Ordered with respect to both earlier and later memory */ /* operations. 
*/ /* _release_write: Ordered with respect to earlier writes. */ /* _acquire_read: Ordered with respect to later reads. */ /* */ /* Currently we try to define the following atomic memory */ /* operations, in combination with the above barriers: */ /* AO_nop */ /* AO_load */ /* AO_store */ /* AO_test_and_set (binary) */ /* AO_fetch_and_add */ /* AO_fetch_and_add1 */ /* AO_fetch_and_sub1 */ /* AO_and */ /* AO_or */ /* AO_xor */ /* AO_compare_and_swap */ /* AO_fetch_compare_and_swap */ /* */ /* Note that atomicity guarantees are valid only if both */ /* readers and writers use AO_ operations to access the */ /* shared value, while ordering constraints are intended to */ /* apply all memory operations. If a location can potentially */ /* be accessed simultaneously from multiple threads, and one of */ /* those accesses may be a write access, then all such */ /* accesses to that location should be through AO_ primitives. */ /* However if AO_ operations enforce sufficient ordering to */ /* ensure that a location x cannot be accessed concurrently, */ /* or can only be read concurrently, then x can be accessed */ /* via ordinary references and assignments. */ /* */ /* AO_compare_and_swap takes an address and an expected old */ /* value and a new value, and returns an int. Non-zero result */ /* indicates that it succeeded. */ /* AO_fetch_compare_and_swap takes an address and an expected */ /* old value and a new value, and returns the real old value. */ /* The operation succeeded if and only if the expected old */ /* value matches the old value returned. */ /* */ /* Test_and_set takes an address, atomically replaces it by */ /* AO_TS_SET, and returns the prior value. */ /* An AO_TS_t location can be reset with the */ /* AO_CLEAR macro, which normally uses AO_store_release. */ /* AO_fetch_and_add takes an address and an AO_t increment */ /* value. 
The AO_fetch_and_add1 and AO_fetch_and_sub1 variants */ /* are provided, since they allow faster implementations on */ /* some hardware. AO_and, AO_or, AO_xor do atomically and, or, */ /* xor (respectively) an AO_t value into a memory location, */ /* but do not provide access to the original. */ /* */ /* We expect this list to grow slowly over time. */ /* */ /* Note that AO_nop_full is a full memory barrier. */ /* */ /* Note that if some data is initialized with */ /* data.x = ...; data.y = ...; ... */ /* AO_store_release_write(&data_is_initialized, 1) */ /* then data is guaranteed to be initialized after the test */ /* if (AO_load_acquire_read(&data_is_initialized)) ... */ /* succeeds. Furthermore, this should generate near-optimal */ /* code on all common platforms. */ /* */ /* All operations operate on unsigned AO_t, which */ /* is the natural word size, and usually unsigned long. */ /* It is possible to check whether a particular operation op */ /* is available on a particular platform by checking whether */ /* AO_HAVE_op is defined. We make heavy use of these macros */ /* internally. */ /* The rest of this file basically has three sections: */ /* */ /* Some utility and default definitions. */ /* */ /* The architecture dependent section: */ /* This defines atomic operations that have direct hardware */ /* support on a particular platform, mostly by including the */ /* appropriate compiler- and hardware-dependent file. */ /* */ /* The synthesis section: */ /* This tries to define other atomic operations in terms of */ /* those that are explicitly available on the platform. */ /* This section is hardware independent. */ /* We make no attempt to synthesize operations in ways that */ /* effectively introduce locks, except for the debugging/demo */ /* pthread-based implementation at the beginning. A more */ /* realistic implementation that falls back to locks could be */ /* added as a higher layer. But that would sacrifice */ /* usability from signal handlers. 
*/ /* The synthesis section is implemented almost entirely in */ /* atomic_ops/generalize.h. */ /* Some common defaults. Overridden for some architectures. */ #define AO_t size_t /* The test_and_set primitive returns an AO_TS_VAL_t value. */ /* AO_TS_t is the type of an in-memory test-and-set location. */ #define AO_TS_INITIALIZER ((AO_TS_t)AO_TS_CLEAR) /* Convenient internal macro to test version of GCC. */ #if defined(__GNUC__) && defined(__GNUC_MINOR__) # define AO_GNUC_PREREQ(major, minor) \ ((__GNUC__ << 16) + __GNUC_MINOR__ >= ((major) << 16) + (minor)) #else # define AO_GNUC_PREREQ(major, minor) 0 /* false */ #endif /* Convenient internal macro to test version of Clang. */ #if defined(__clang__) && defined(__clang_major__) # define AO_CLANG_PREREQ(major, minor) \ ((__clang_major__ << 16) + __clang_minor__ >= ((major) << 16) + (minor)) #else # define AO_CLANG_PREREQ(major, minor) 0 /* false */ #endif /* Platform-dependent stuff: */ #if (defined(__GNUC__) || defined(_MSC_VER) || defined(__INTEL_COMPILER) \ || defined(__DMC__) || defined(__WATCOMC__)) && !defined(AO_NO_INLINE) # define AO_INLINE static __inline #elif defined(__sun) && !defined(AO_NO_INLINE) # define AO_INLINE static inline #else # define AO_INLINE static #endif #if AO_GNUC_PREREQ(3, 0) && !defined(LINT2) # define AO_EXPECT_FALSE(expr) __builtin_expect(expr, 0) /* Equivalent to (expr) but predict that usually (expr) == 0. */ #else # define AO_EXPECT_FALSE(expr) (expr) #endif /* !__GNUC__ */ #if defined(__has_feature) /* __has_feature() is supported. 
*/ # if __has_feature(address_sanitizer) # define AO_ADDRESS_SANITIZER # endif # if __has_feature(memory_sanitizer) # define AO_MEMORY_SANITIZER # endif # if __has_feature(thread_sanitizer) # define AO_THREAD_SANITIZER # endif #else # ifdef __SANITIZE_ADDRESS__ /* GCC v4.8+ */ # define AO_ADDRESS_SANITIZER # endif #endif /* !__has_feature */ #ifndef AO_ATTR_NO_SANITIZE_MEMORY # ifndef AO_MEMORY_SANITIZER # define AO_ATTR_NO_SANITIZE_MEMORY /* empty */ # elif AO_CLANG_PREREQ(3, 8) # define AO_ATTR_NO_SANITIZE_MEMORY __attribute__((no_sanitize("memory"))) # else # define AO_ATTR_NO_SANITIZE_MEMORY __attribute__((no_sanitize_memory)) # endif #endif /* !AO_ATTR_NO_SANITIZE_MEMORY */ #ifndef AO_ATTR_NO_SANITIZE_THREAD # ifndef AO_THREAD_SANITIZER # define AO_ATTR_NO_SANITIZE_THREAD /* empty */ # elif AO_CLANG_PREREQ(3, 8) # define AO_ATTR_NO_SANITIZE_THREAD __attribute__((no_sanitize("thread"))) # else # define AO_ATTR_NO_SANITIZE_THREAD __attribute__((no_sanitize_thread)) # endif #endif /* !AO_ATTR_NO_SANITIZE_THREAD */ #if (AO_GNUC_PREREQ(7, 5) || __STDC_VERSION__ >= 201112L) && !defined(LINT2) # define AO_ALIGNOF_SUPPORTED 1 #endif #if defined(AO_DLL) && !defined(AO_API) # ifdef AO_BUILD # if defined(__CEGCC__) || (defined(__MINGW32__) && !defined(__cplusplus)) # define AO_API __declspec(dllexport) # elif defined(_MSC_VER) || defined(__BORLANDC__) || defined(__CYGWIN__) \ || defined(__DMC__) || defined(__MINGW32__) || defined(__WATCOMC__) # define AO_API extern __declspec(dllexport) # endif # else # if defined(_MSC_VER) || defined(__BORLANDC__) || defined(__CEGCC__) \ || defined(__CYGWIN__) || defined(__DMC__) # define AO_API __declspec(dllimport) # elif defined(__MINGW32_DELAY_LOAD__) # define AO_API __declspec(dllexport) # elif defined(__MINGW32__) || defined(__WATCOMC__) # define AO_API extern __declspec(dllimport) # endif # endif #endif /* AO_DLL */ #ifndef AO_API # define AO_API extern #endif #ifdef AO_ALIGNOF_SUPPORTED # define AO_ASSERT_ADDR_ALIGNED(addr) \ 
assert(((size_t)(addr) & (__alignof__(*(addr)) - 1)) == 0) #else # define AO_ASSERT_ADDR_ALIGNED(addr) \ assert(((size_t)(addr) & (sizeof(*(addr)) - 1)) == 0) #endif /* !AO_ALIGNOF_SUPPORTED */ #if defined(__GNUC__) && !defined(__INTEL_COMPILER) # define AO_compiler_barrier() __asm__ __volatile__("" : : : "memory") #elif defined(_MSC_VER) || defined(__DMC__) || defined(__BORLANDC__) \ || defined(__WATCOMC__) # if defined(_AMD64_) || defined(_M_X64) || _MSC_VER >= 1400 # if defined(_WIN32_WCE) /* # include */ # elif defined(_MSC_VER) # include # endif # pragma intrinsic(_ReadWriteBarrier) # define AO_compiler_barrier() _ReadWriteBarrier() /* We assume this does not generate a fence instruction. */ /* The documentation is a bit unclear. */ # else # define AO_compiler_barrier() __asm { } /* The preceding implementation may be preferable here too. */ /* But the documentation warns about VC++ 2003 and earlier. */ # endif #elif defined(__INTEL_COMPILER) # define AO_compiler_barrier() __memory_barrier() /* FIXME: Too strong? IA64-only? */ #elif defined(_HPUX_SOURCE) # if defined(__ia64) # include # define AO_compiler_barrier() _Asm_sched_fence() # else /* FIXME - We do not know how to do this. This is a guess. */ /* And probably a bad one. */ static volatile int AO_barrier_dummy; # define AO_compiler_barrier() (void)(AO_barrier_dummy = AO_barrier_dummy) # endif #else /* We conjecture that the following usually gives us the right */ /* semantics or an error. 
*/ # define AO_compiler_barrier() asm("") #endif #if defined(AO_USE_PTHREAD_DEFS) # include "atomic_ops/sysdeps/generic_pthread.h" #endif /* AO_USE_PTHREAD_DEFS */ #if (defined(__CC_ARM) || defined(__ARMCC__)) && !defined(__GNUC__) \ && !defined(AO_USE_PTHREAD_DEFS) # include "atomic_ops/sysdeps/armcc/arm_v6.h" # define AO_GENERALIZE_TWICE #endif #if defined(__GNUC__) && !defined(AO_USE_PTHREAD_DEFS) \ && !defined(__INTEL_COMPILER) # if defined(__i386__) /* We don't define AO_USE_SYNC_CAS_BUILTIN for x86 here because */ /* it might require specifying additional options (like -march) */ /* or additional link libraries (if -march is not specified). */ # include "atomic_ops/sysdeps/gcc/x86.h" # elif defined(__x86_64__) # if AO_GNUC_PREREQ(4, 2) && !defined(AO_USE_SYNC_CAS_BUILTIN) /* It is safe to use __sync CAS built-in on this architecture. */ # define AO_USE_SYNC_CAS_BUILTIN # endif # include "atomic_ops/sysdeps/gcc/x86.h" # elif defined(__ia64__) # include "atomic_ops/sysdeps/gcc/ia64.h" # define AO_GENERALIZE_TWICE # elif defined(__hppa__) # include "atomic_ops/sysdeps/gcc/hppa.h" # define AO_CAN_EMUL_CAS # elif defined(__alpha__) # include "atomic_ops/sysdeps/gcc/alpha.h" # define AO_GENERALIZE_TWICE # elif defined(__s390__) # include "atomic_ops/sysdeps/gcc/s390.h" # elif defined(__sparc__) # include "atomic_ops/sysdeps/gcc/sparc.h" # define AO_CAN_EMUL_CAS # elif defined(__m68k__) # include "atomic_ops/sysdeps/gcc/m68k.h" # elif defined(__powerpc__) || defined(__ppc__) || defined(__PPC__) \ || defined(__powerpc64__) || defined(__ppc64__) || defined(_ARCH_PPC) # include "atomic_ops/sysdeps/gcc/powerpc.h" # elif defined(__aarch64__) # include "atomic_ops/sysdeps/gcc/aarch64.h" # define AO_CAN_EMUL_CAS # elif defined(__arm__) # include "atomic_ops/sysdeps/gcc/arm.h" # define AO_CAN_EMUL_CAS # elif defined(__cris__) || defined(CRIS) # include "atomic_ops/sysdeps/gcc/cris.h" # define AO_CAN_EMUL_CAS # define AO_GENERALIZE_TWICE # elif defined(__mips__) # include 
"atomic_ops/sysdeps/gcc/mips.h" # elif defined(__sh__) || defined(SH4) # include "atomic_ops/sysdeps/gcc/sh.h" # define AO_CAN_EMUL_CAS # elif defined(__avr32__) # include "atomic_ops/sysdeps/gcc/avr32.h" # elif defined(__e2k__) # include "atomic_ops/sysdeps/gcc/e2k.h" # elif defined(__hexagon__) # include "atomic_ops/sysdeps/gcc/hexagon.h" # elif defined(__nios2__) # include "atomic_ops/sysdeps/gcc/generic.h" # define AO_CAN_EMUL_CAS # elif defined(__riscv) # include "atomic_ops/sysdeps/gcc/riscv.h" # elif defined(__tile__) # include "atomic_ops/sysdeps/gcc/tile.h" # else /* etc. */ # include "atomic_ops/sysdeps/gcc/generic.h" # endif #endif /* __GNUC__ && !AO_USE_PTHREAD_DEFS */ #if (defined(__IBMC__) || defined(__IBMCPP__)) && !defined(__GNUC__) \ && !defined(AO_USE_PTHREAD_DEFS) # if defined(__powerpc__) || defined(__powerpc) || defined(__ppc__) \ || defined(__PPC__) || defined(_M_PPC) || defined(_ARCH_PPC) \ || defined(_ARCH_PWR) # include "atomic_ops/sysdeps/ibmc/powerpc.h" # define AO_GENERALIZE_TWICE # endif #endif #if defined(__INTEL_COMPILER) && !defined(AO_USE_PTHREAD_DEFS) # if defined(__ia64__) # include "atomic_ops/sysdeps/icc/ia64.h" # define AO_GENERALIZE_TWICE # endif # if defined(__GNUC__) /* Intel Compiler in GCC compatible mode */ # if defined(__i386__) # include "atomic_ops/sysdeps/gcc/x86.h" # endif /* __i386__ */ # if defined(__x86_64__) # if (__INTEL_COMPILER > 1110) && !defined(AO_USE_SYNC_CAS_BUILTIN) # define AO_USE_SYNC_CAS_BUILTIN # endif # include "atomic_ops/sysdeps/gcc/x86.h" # endif /* __x86_64__ */ # endif #endif #if defined(_HPUX_SOURCE) && !defined(__GNUC__) && !defined(AO_USE_PTHREAD_DEFS) # if defined(__ia64) # include "atomic_ops/sysdeps/hpc/ia64.h" # define AO_GENERALIZE_TWICE # else # include "atomic_ops/sysdeps/hpc/hppa.h" # define AO_CAN_EMUL_CAS # endif #endif #if defined(_MSC_VER) || defined(__DMC__) || defined(__BORLANDC__) \ || (defined(__WATCOMC__) && defined(__NT__)) # if defined(_AMD64_) || defined(_M_X64) # include 
"atomic_ops/sysdeps/msftc/x86_64.h" # elif defined(_M_ARM64) # include "atomic_ops/sysdeps/msftc/arm64.h" # elif defined(_M_IX86) || defined(x86) # include "atomic_ops/sysdeps/msftc/x86.h" # elif defined(_M_ARM) || defined(ARM) || defined(_ARM_) # include "atomic_ops/sysdeps/msftc/arm.h" # define AO_GENERALIZE_TWICE # endif #endif #if defined(__sun) && !defined(__GNUC__) && !defined(AO_USE_PTHREAD_DEFS) /* Note: use -DAO_USE_PTHREAD_DEFS if Sun CC does not handle inline asm. */ # if defined(__i386) || defined(__x86_64) || defined(__amd64) # include "atomic_ops/sysdeps/sunc/x86.h" # endif #endif #if !defined(__GNUC__) && (defined(sparc) || defined(__sparc)) \ && !defined(AO_USE_PTHREAD_DEFS) # include "atomic_ops/sysdeps/sunc/sparc.h" # define AO_CAN_EMUL_CAS #endif #if (defined(AO_REQUIRE_CAS) && !defined(AO_HAVE_compare_and_swap) \ && !defined(AO_HAVE_fetch_compare_and_swap) \ && !defined(AO_HAVE_compare_and_swap_full) \ && !defined(AO_HAVE_fetch_compare_and_swap_full) \ && !defined(AO_HAVE_compare_and_swap_acquire) \ && !defined(AO_HAVE_fetch_compare_and_swap_acquire)) || defined(CPPCHECK) # if defined(AO_CAN_EMUL_CAS) # include "atomic_ops/sysdeps/emul_cas.h" # elif !defined(CPPCHECK) # error Cannot implement AO_compare_and_swap_full on this architecture. # endif #endif /* AO_REQUIRE_CAS && !AO_HAVE_compare_and_swap ... */ /* The most common way to clear a test-and-set location */ /* at the end of a critical section. */ #if defined(AO_AO_TS_T) && !defined(AO_HAVE_CLEAR) # define AO_CLEAR(addr) AO_store_release((AO_TS_t *)(addr), AO_TS_CLEAR) # define AO_HAVE_CLEAR #endif #if defined(AO_CHAR_TS_T) && !defined(AO_HAVE_CLEAR) # define AO_CLEAR(addr) AO_char_store_release((AO_TS_t *)(addr), AO_TS_CLEAR) # define AO_HAVE_CLEAR #endif /* The generalization section. 
*/
#if !defined(AO_GENERALIZE_TWICE) && defined(AO_CAN_EMUL_CAS) \
    && !defined(AO_HAVE_compare_and_swap_full) \
    && !defined(AO_HAVE_fetch_compare_and_swap_full)
# define AO_GENERALIZE_TWICE
#endif

/* Theoretically we should repeatedly include atomic_ops/generalize.h. */
/* In fact, we observe that this converges after a small fixed number  */
/* of iterations, usually one.                                         */
#include "atomic_ops/generalize.h"

#if !defined(AO_GENERALIZE_TWICE) \
    && defined(AO_HAVE_compare_double_and_swap_double) \
    && (!defined(AO_HAVE_double_load) || !defined(AO_HAVE_double_store))
# define AO_GENERALIZE_TWICE
#endif

#ifdef AO_T_IS_INT
  /* Included after the first generalization pass. */
# include "atomic_ops/sysdeps/ao_t_is_int.h"
# ifndef AO_GENERALIZE_TWICE
    /* Always generalize again. */
#   define AO_GENERALIZE_TWICE
# endif
#endif /* AO_T_IS_INT */

#ifdef AO_GENERALIZE_TWICE
# include "atomic_ops/generalize.h"
#endif

/* For compatibility with version 0.4 and earlier */
#define AO_TS_T AO_TS_t
#define AO_T AO_t
#define AO_TS_VAL AO_TS_VAL_t

#endif /* !AO_ATOMIC_OPS_H */

papi-papi-7-2-0-t/src/atomic_ops/ao_version.h

/*
 * Copyright (c) 2003-2004 Hewlett-Packard Development Company, L.P.
 * Copyright (c) 2011-2018 Ivan Maidanski
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

#ifndef AO_ATOMIC_OPS_H
# error This file should not be included directly.
#endif

/* The policy regarding version numbers: development code has odd       */
/* "minor" number (and "micro" part is 0); when development is finished */
/* and a release is prepared, "minor" number is incremented (keeping    */
/* "micro" number still zero), whenever a defect is fixed a new release */
/* is prepared incrementing "micro" part to odd value (the most stable  */
/* release has the biggest "micro" number).                             */

/* The version here should match that in configure.ac and README. */
#define AO_VERSION_MAJOR 7
#define AO_VERSION_MINOR 7
#define AO_VERSION_MICRO 0 /* 7.7.0 */

papi-papi-7-2-0-t/src/atomic_ops/generalize-arithm.h

/*
 * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* char_compare_and_swap (based on fetch_compare_and_swap) */ #if defined(AO_HAVE_char_fetch_compare_and_swap_full) \ && !defined(AO_HAVE_char_compare_and_swap_full) AO_INLINE int AO_char_compare_and_swap_full(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { return AO_char_fetch_compare_and_swap_full(addr, old_val, new_val) == old_val; } # define AO_HAVE_char_compare_and_swap_full #endif #if defined(AO_HAVE_char_fetch_compare_and_swap_acquire) \ && !defined(AO_HAVE_char_compare_and_swap_acquire) AO_INLINE int AO_char_compare_and_swap_acquire(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { return AO_char_fetch_compare_and_swap_acquire(addr, old_val, new_val) == old_val; } # define AO_HAVE_char_compare_and_swap_acquire #endif #if defined(AO_HAVE_char_fetch_compare_and_swap_release) \ && !defined(AO_HAVE_char_compare_and_swap_release) AO_INLINE int AO_char_compare_and_swap_release(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { return AO_char_fetch_compare_and_swap_release(addr, old_val, new_val) == old_val; } # define AO_HAVE_char_compare_and_swap_release #endif #if defined(AO_HAVE_char_fetch_compare_and_swap_write) \ && !defined(AO_HAVE_char_compare_and_swap_write) AO_INLINE int AO_char_compare_and_swap_write(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { return AO_char_fetch_compare_and_swap_write(addr, old_val, new_val) == old_val; } # 
define AO_HAVE_char_compare_and_swap_write #endif #if defined(AO_HAVE_char_fetch_compare_and_swap_read) \ && !defined(AO_HAVE_char_compare_and_swap_read) AO_INLINE int AO_char_compare_and_swap_read(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { return AO_char_fetch_compare_and_swap_read(addr, old_val, new_val) == old_val; } # define AO_HAVE_char_compare_and_swap_read #endif #if defined(AO_HAVE_char_fetch_compare_and_swap) \ && !defined(AO_HAVE_char_compare_and_swap) AO_INLINE int AO_char_compare_and_swap(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { return AO_char_fetch_compare_and_swap(addr, old_val, new_val) == old_val; } # define AO_HAVE_char_compare_and_swap #endif #if defined(AO_HAVE_char_fetch_compare_and_swap_release_write) \ && !defined(AO_HAVE_char_compare_and_swap_release_write) AO_INLINE int AO_char_compare_and_swap_release_write(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { return AO_char_fetch_compare_and_swap_release_write(addr, old_val, new_val) == old_val; } # define AO_HAVE_char_compare_and_swap_release_write #endif #if defined(AO_HAVE_char_fetch_compare_and_swap_acquire_read) \ && !defined(AO_HAVE_char_compare_and_swap_acquire_read) AO_INLINE int AO_char_compare_and_swap_acquire_read(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { return AO_char_fetch_compare_and_swap_acquire_read(addr, old_val, new_val) == old_val; } # define AO_HAVE_char_compare_and_swap_acquire_read #endif #if defined(AO_HAVE_char_fetch_compare_and_swap_dd_acquire_read) \ && !defined(AO_HAVE_char_compare_and_swap_dd_acquire_read) AO_INLINE int AO_char_compare_and_swap_dd_acquire_read(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { return AO_char_fetch_compare_and_swap_dd_acquire_read(addr, old_val, new_val) == old_val; } # define AO_HAVE_char_compare_and_swap_dd_acquire_read #endif 
/* char_fetch_and_add */ /* We first try to implement fetch_and_add variants in terms of the */ /* corresponding compare_and_swap variants to minimize adding barriers. */ #if defined(AO_HAVE_char_compare_and_swap_full) \ && !defined(AO_HAVE_char_fetch_and_add_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE unsigned/**/char AO_char_fetch_and_add_full(volatile unsigned/**/char *addr, unsigned/**/char incr) { unsigned/**/char old; do { old = *(unsigned/**/char *)addr; } while (AO_EXPECT_FALSE(!AO_char_compare_and_swap_full(addr, old, old + incr))); return old; } # define AO_HAVE_char_fetch_and_add_full #endif #if defined(AO_HAVE_char_compare_and_swap_acquire) \ && !defined(AO_HAVE_char_fetch_and_add_acquire) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE unsigned/**/char AO_char_fetch_and_add_acquire(volatile unsigned/**/char *addr, unsigned/**/char incr) { unsigned/**/char old; do { old = *(unsigned/**/char *)addr; } while (AO_EXPECT_FALSE(!AO_char_compare_and_swap_acquire(addr, old, old + incr))); return old; } # define AO_HAVE_char_fetch_and_add_acquire #endif #if defined(AO_HAVE_char_compare_and_swap_release) \ && !defined(AO_HAVE_char_fetch_and_add_release) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE unsigned/**/char AO_char_fetch_and_add_release(volatile unsigned/**/char *addr, unsigned/**/char incr) { unsigned/**/char old; do { old = *(unsigned/**/char *)addr; } while (AO_EXPECT_FALSE(!AO_char_compare_and_swap_release(addr, old, old + incr))); return old; } # define AO_HAVE_char_fetch_and_add_release #endif #if defined(AO_HAVE_char_compare_and_swap) \ && !defined(AO_HAVE_char_fetch_and_add) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE unsigned/**/char AO_char_fetch_and_add(volatile unsigned/**/char *addr, unsigned/**/char incr) { unsigned/**/char old; do { old = *(unsigned/**/char *)addr; } while (AO_EXPECT_FALSE(!AO_char_compare_and_swap(addr, old, old + incr))); return old; } # define AO_HAVE_char_fetch_and_add #endif #if defined(AO_HAVE_char_fetch_and_add_full) # if 
!defined(AO_HAVE_char_fetch_and_add_release) # define AO_char_fetch_and_add_release(addr, val) \ AO_char_fetch_and_add_full(addr, val) # define AO_HAVE_char_fetch_and_add_release # endif # if !defined(AO_HAVE_char_fetch_and_add_acquire) # define AO_char_fetch_and_add_acquire(addr, val) \ AO_char_fetch_and_add_full(addr, val) # define AO_HAVE_char_fetch_and_add_acquire # endif # if !defined(AO_HAVE_char_fetch_and_add_write) # define AO_char_fetch_and_add_write(addr, val) \ AO_char_fetch_and_add_full(addr, val) # define AO_HAVE_char_fetch_and_add_write # endif # if !defined(AO_HAVE_char_fetch_and_add_read) # define AO_char_fetch_and_add_read(addr, val) \ AO_char_fetch_and_add_full(addr, val) # define AO_HAVE_char_fetch_and_add_read # endif #endif /* AO_HAVE_char_fetch_and_add_full */ #if defined(AO_HAVE_char_fetch_and_add) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_char_fetch_and_add_acquire) AO_INLINE unsigned/**/char AO_char_fetch_and_add_acquire(volatile unsigned/**/char *addr, unsigned/**/char incr) { unsigned/**/char result = AO_char_fetch_and_add(addr, incr); AO_nop_full(); return result; } # define AO_HAVE_char_fetch_and_add_acquire #endif #if defined(AO_HAVE_char_fetch_and_add) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_char_fetch_and_add_release) # define AO_char_fetch_and_add_release(addr, incr) \ (AO_nop_full(), AO_char_fetch_and_add(addr, incr)) # define AO_HAVE_char_fetch_and_add_release #endif #if !defined(AO_HAVE_char_fetch_and_add) \ && defined(AO_HAVE_char_fetch_and_add_release) # define AO_char_fetch_and_add(addr, val) \ AO_char_fetch_and_add_release(addr, val) # define AO_HAVE_char_fetch_and_add #endif #if !defined(AO_HAVE_char_fetch_and_add) \ && defined(AO_HAVE_char_fetch_and_add_acquire) # define AO_char_fetch_and_add(addr, val) \ AO_char_fetch_and_add_acquire(addr, val) # define AO_HAVE_char_fetch_and_add #endif #if !defined(AO_HAVE_char_fetch_and_add) \ && defined(AO_HAVE_char_fetch_and_add_write) # define 
AO_char_fetch_and_add(addr, val) \ AO_char_fetch_and_add_write(addr, val) # define AO_HAVE_char_fetch_and_add #endif #if !defined(AO_HAVE_char_fetch_and_add) \ && defined(AO_HAVE_char_fetch_and_add_read) # define AO_char_fetch_and_add(addr, val) \ AO_char_fetch_and_add_read(addr, val) # define AO_HAVE_char_fetch_and_add #endif #if defined(AO_HAVE_char_fetch_and_add_acquire) \ && defined(AO_HAVE_nop_full) && !defined(AO_HAVE_char_fetch_and_add_full) # define AO_char_fetch_and_add_full(addr, val) \ (AO_nop_full(), AO_char_fetch_and_add_acquire(addr, val)) # define AO_HAVE_char_fetch_and_add_full #endif #if !defined(AO_HAVE_char_fetch_and_add_release_write) \ && defined(AO_HAVE_char_fetch_and_add_write) # define AO_char_fetch_and_add_release_write(addr, val) \ AO_char_fetch_and_add_write(addr, val) # define AO_HAVE_char_fetch_and_add_release_write #endif #if !defined(AO_HAVE_char_fetch_and_add_release_write) \ && defined(AO_HAVE_char_fetch_and_add_release) # define AO_char_fetch_and_add_release_write(addr, val) \ AO_char_fetch_and_add_release(addr, val) # define AO_HAVE_char_fetch_and_add_release_write #endif #if !defined(AO_HAVE_char_fetch_and_add_acquire_read) \ && defined(AO_HAVE_char_fetch_and_add_read) # define AO_char_fetch_and_add_acquire_read(addr, val) \ AO_char_fetch_and_add_read(addr, val) # define AO_HAVE_char_fetch_and_add_acquire_read #endif #if !defined(AO_HAVE_char_fetch_and_add_acquire_read) \ && defined(AO_HAVE_char_fetch_and_add_acquire) # define AO_char_fetch_and_add_acquire_read(addr, val) \ AO_char_fetch_and_add_acquire(addr, val) # define AO_HAVE_char_fetch_and_add_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_char_fetch_and_add_acquire_read) # define AO_char_fetch_and_add_dd_acquire_read(addr, val) \ AO_char_fetch_and_add_acquire_read(addr, val) # define AO_HAVE_char_fetch_and_add_dd_acquire_read # endif #else # if defined(AO_HAVE_char_fetch_and_add) # define AO_char_fetch_and_add_dd_acquire_read(addr, val) \ 
AO_char_fetch_and_add(addr, val) # define AO_HAVE_char_fetch_and_add_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* char_fetch_and_add1 */ #if defined(AO_HAVE_char_fetch_and_add_full) \ && !defined(AO_HAVE_char_fetch_and_add1_full) # define AO_char_fetch_and_add1_full(addr) \ AO_char_fetch_and_add_full(addr, 1) # define AO_HAVE_char_fetch_and_add1_full #endif #if defined(AO_HAVE_char_fetch_and_add_release) \ && !defined(AO_HAVE_char_fetch_and_add1_release) # define AO_char_fetch_and_add1_release(addr) \ AO_char_fetch_and_add_release(addr, 1) # define AO_HAVE_char_fetch_and_add1_release #endif #if defined(AO_HAVE_char_fetch_and_add_acquire) \ && !defined(AO_HAVE_char_fetch_and_add1_acquire) # define AO_char_fetch_and_add1_acquire(addr) \ AO_char_fetch_and_add_acquire(addr, 1) # define AO_HAVE_char_fetch_and_add1_acquire #endif #if defined(AO_HAVE_char_fetch_and_add_write) \ && !defined(AO_HAVE_char_fetch_and_add1_write) # define AO_char_fetch_and_add1_write(addr) \ AO_char_fetch_and_add_write(addr, 1) # define AO_HAVE_char_fetch_and_add1_write #endif #if defined(AO_HAVE_char_fetch_and_add_read) \ && !defined(AO_HAVE_char_fetch_and_add1_read) # define AO_char_fetch_and_add1_read(addr) \ AO_char_fetch_and_add_read(addr, 1) # define AO_HAVE_char_fetch_and_add1_read #endif #if defined(AO_HAVE_char_fetch_and_add_release_write) \ && !defined(AO_HAVE_char_fetch_and_add1_release_write) # define AO_char_fetch_and_add1_release_write(addr) \ AO_char_fetch_and_add_release_write(addr, 1) # define AO_HAVE_char_fetch_and_add1_release_write #endif #if defined(AO_HAVE_char_fetch_and_add_acquire_read) \ && !defined(AO_HAVE_char_fetch_and_add1_acquire_read) # define AO_char_fetch_and_add1_acquire_read(addr) \ AO_char_fetch_and_add_acquire_read(addr, 1) # define AO_HAVE_char_fetch_and_add1_acquire_read #endif #if defined(AO_HAVE_char_fetch_and_add) \ && !defined(AO_HAVE_char_fetch_and_add1) # define AO_char_fetch_and_add1(addr) AO_char_fetch_and_add(addr, 1) # define 
AO_HAVE_char_fetch_and_add1 #endif #if defined(AO_HAVE_char_fetch_and_add1_full) # if !defined(AO_HAVE_char_fetch_and_add1_release) # define AO_char_fetch_and_add1_release(addr) \ AO_char_fetch_and_add1_full(addr) # define AO_HAVE_char_fetch_and_add1_release # endif # if !defined(AO_HAVE_char_fetch_and_add1_acquire) # define AO_char_fetch_and_add1_acquire(addr) \ AO_char_fetch_and_add1_full(addr) # define AO_HAVE_char_fetch_and_add1_acquire # endif # if !defined(AO_HAVE_char_fetch_and_add1_write) # define AO_char_fetch_and_add1_write(addr) \ AO_char_fetch_and_add1_full(addr) # define AO_HAVE_char_fetch_and_add1_write # endif # if !defined(AO_HAVE_char_fetch_and_add1_read) # define AO_char_fetch_and_add1_read(addr) \ AO_char_fetch_and_add1_full(addr) # define AO_HAVE_char_fetch_and_add1_read # endif #endif /* AO_HAVE_char_fetch_and_add1_full */ #if !defined(AO_HAVE_char_fetch_and_add1) \ && defined(AO_HAVE_char_fetch_and_add1_release) # define AO_char_fetch_and_add1(addr) AO_char_fetch_and_add1_release(addr) # define AO_HAVE_char_fetch_and_add1 #endif #if !defined(AO_HAVE_char_fetch_and_add1) \ && defined(AO_HAVE_char_fetch_and_add1_acquire) # define AO_char_fetch_and_add1(addr) AO_char_fetch_and_add1_acquire(addr) # define AO_HAVE_char_fetch_and_add1 #endif #if !defined(AO_HAVE_char_fetch_and_add1) \ && defined(AO_HAVE_char_fetch_and_add1_write) # define AO_char_fetch_and_add1(addr) AO_char_fetch_and_add1_write(addr) # define AO_HAVE_char_fetch_and_add1 #endif #if !defined(AO_HAVE_char_fetch_and_add1) \ && defined(AO_HAVE_char_fetch_and_add1_read) # define AO_char_fetch_and_add1(addr) AO_char_fetch_and_add1_read(addr) # define AO_HAVE_char_fetch_and_add1 #endif #if defined(AO_HAVE_char_fetch_and_add1_acquire) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_char_fetch_and_add1_full) # define AO_char_fetch_and_add1_full(addr) \ (AO_nop_full(), AO_char_fetch_and_add1_acquire(addr)) # define AO_HAVE_char_fetch_and_add1_full #endif #if 
!defined(AO_HAVE_char_fetch_and_add1_release_write) \ && defined(AO_HAVE_char_fetch_and_add1_write) # define AO_char_fetch_and_add1_release_write(addr) \ AO_char_fetch_and_add1_write(addr) # define AO_HAVE_char_fetch_and_add1_release_write #endif #if !defined(AO_HAVE_char_fetch_and_add1_release_write) \ && defined(AO_HAVE_char_fetch_and_add1_release) # define AO_char_fetch_and_add1_release_write(addr) \ AO_char_fetch_and_add1_release(addr) # define AO_HAVE_char_fetch_and_add1_release_write #endif #if !defined(AO_HAVE_char_fetch_and_add1_acquire_read) \ && defined(AO_HAVE_char_fetch_and_add1_read) # define AO_char_fetch_and_add1_acquire_read(addr) \ AO_char_fetch_and_add1_read(addr) # define AO_HAVE_char_fetch_and_add1_acquire_read #endif #if !defined(AO_HAVE_char_fetch_and_add1_acquire_read) \ && defined(AO_HAVE_char_fetch_and_add1_acquire) # define AO_char_fetch_and_add1_acquire_read(addr) \ AO_char_fetch_and_add1_acquire(addr) # define AO_HAVE_char_fetch_and_add1_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_char_fetch_and_add1_acquire_read) # define AO_char_fetch_and_add1_dd_acquire_read(addr) \ AO_char_fetch_and_add1_acquire_read(addr) # define AO_HAVE_char_fetch_and_add1_dd_acquire_read # endif #else # if defined(AO_HAVE_char_fetch_and_add1) # define AO_char_fetch_and_add1_dd_acquire_read(addr) \ AO_char_fetch_and_add1(addr) # define AO_HAVE_char_fetch_and_add1_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* char_fetch_and_sub1 */ #if defined(AO_HAVE_char_fetch_and_add_full) \ && !defined(AO_HAVE_char_fetch_and_sub1_full) # define AO_char_fetch_and_sub1_full(addr) \ AO_char_fetch_and_add_full(addr, (unsigned/**/char)(-1)) # define AO_HAVE_char_fetch_and_sub1_full #endif #if defined(AO_HAVE_char_fetch_and_add_release) \ && !defined(AO_HAVE_char_fetch_and_sub1_release) # define AO_char_fetch_and_sub1_release(addr) \ AO_char_fetch_and_add_release(addr, (unsigned/**/char)(-1)) # define AO_HAVE_char_fetch_and_sub1_release #endif #if 
defined(AO_HAVE_char_fetch_and_add_acquire) \ && !defined(AO_HAVE_char_fetch_and_sub1_acquire) # define AO_char_fetch_and_sub1_acquire(addr) \ AO_char_fetch_and_add_acquire(addr, (unsigned/**/char)(-1)) # define AO_HAVE_char_fetch_and_sub1_acquire #endif #if defined(AO_HAVE_char_fetch_and_add_write) \ && !defined(AO_HAVE_char_fetch_and_sub1_write) # define AO_char_fetch_and_sub1_write(addr) \ AO_char_fetch_and_add_write(addr, (unsigned/**/char)(-1)) # define AO_HAVE_char_fetch_and_sub1_write #endif #if defined(AO_HAVE_char_fetch_and_add_read) \ && !defined(AO_HAVE_char_fetch_and_sub1_read) # define AO_char_fetch_and_sub1_read(addr) \ AO_char_fetch_and_add_read(addr, (unsigned/**/char)(-1)) # define AO_HAVE_char_fetch_and_sub1_read #endif #if defined(AO_HAVE_char_fetch_and_add_release_write) \ && !defined(AO_HAVE_char_fetch_and_sub1_release_write) # define AO_char_fetch_and_sub1_release_write(addr) \ AO_char_fetch_and_add_release_write(addr, (unsigned/**/char)(-1)) # define AO_HAVE_char_fetch_and_sub1_release_write #endif #if defined(AO_HAVE_char_fetch_and_add_acquire_read) \ && !defined(AO_HAVE_char_fetch_and_sub1_acquire_read) # define AO_char_fetch_and_sub1_acquire_read(addr) \ AO_char_fetch_and_add_acquire_read(addr, (unsigned/**/char)(-1)) # define AO_HAVE_char_fetch_and_sub1_acquire_read #endif #if defined(AO_HAVE_char_fetch_and_add) \ && !defined(AO_HAVE_char_fetch_and_sub1) # define AO_char_fetch_and_sub1(addr) \ AO_char_fetch_and_add(addr, (unsigned/**/char)(-1)) # define AO_HAVE_char_fetch_and_sub1 #endif #if defined(AO_HAVE_char_fetch_and_sub1_full) # if !defined(AO_HAVE_char_fetch_and_sub1_release) # define AO_char_fetch_and_sub1_release(addr) \ AO_char_fetch_and_sub1_full(addr) # define AO_HAVE_char_fetch_and_sub1_release # endif # if !defined(AO_HAVE_char_fetch_and_sub1_acquire) # define AO_char_fetch_and_sub1_acquire(addr) \ AO_char_fetch_and_sub1_full(addr) # define AO_HAVE_char_fetch_and_sub1_acquire # endif # if 
!defined(AO_HAVE_char_fetch_and_sub1_write) # define AO_char_fetch_and_sub1_write(addr) \ AO_char_fetch_and_sub1_full(addr) # define AO_HAVE_char_fetch_and_sub1_write # endif # if !defined(AO_HAVE_char_fetch_and_sub1_read) # define AO_char_fetch_and_sub1_read(addr) \ AO_char_fetch_and_sub1_full(addr) # define AO_HAVE_char_fetch_and_sub1_read # endif #endif /* AO_HAVE_char_fetch_and_sub1_full */ #if !defined(AO_HAVE_char_fetch_and_sub1) \ && defined(AO_HAVE_char_fetch_and_sub1_release) # define AO_char_fetch_and_sub1(addr) AO_char_fetch_and_sub1_release(addr) # define AO_HAVE_char_fetch_and_sub1 #endif #if !defined(AO_HAVE_char_fetch_and_sub1) \ && defined(AO_HAVE_char_fetch_and_sub1_acquire) # define AO_char_fetch_and_sub1(addr) AO_char_fetch_and_sub1_acquire(addr) # define AO_HAVE_char_fetch_and_sub1 #endif #if !defined(AO_HAVE_char_fetch_and_sub1) \ && defined(AO_HAVE_char_fetch_and_sub1_write) # define AO_char_fetch_and_sub1(addr) AO_char_fetch_and_sub1_write(addr) # define AO_HAVE_char_fetch_and_sub1 #endif #if !defined(AO_HAVE_char_fetch_and_sub1) \ && defined(AO_HAVE_char_fetch_and_sub1_read) # define AO_char_fetch_and_sub1(addr) AO_char_fetch_and_sub1_read(addr) # define AO_HAVE_char_fetch_and_sub1 #endif #if defined(AO_HAVE_char_fetch_and_sub1_acquire) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_char_fetch_and_sub1_full) # define AO_char_fetch_and_sub1_full(addr) \ (AO_nop_full(), AO_char_fetch_and_sub1_acquire(addr)) # define AO_HAVE_char_fetch_and_sub1_full #endif #if !defined(AO_HAVE_char_fetch_and_sub1_release_write) \ && defined(AO_HAVE_char_fetch_and_sub1_write) # define AO_char_fetch_and_sub1_release_write(addr) \ AO_char_fetch_and_sub1_write(addr) # define AO_HAVE_char_fetch_and_sub1_release_write #endif #if !defined(AO_HAVE_char_fetch_and_sub1_release_write) \ && defined(AO_HAVE_char_fetch_and_sub1_release) # define AO_char_fetch_and_sub1_release_write(addr) \ AO_char_fetch_and_sub1_release(addr) # define 
AO_HAVE_char_fetch_and_sub1_release_write #endif #if !defined(AO_HAVE_char_fetch_and_sub1_acquire_read) \ && defined(AO_HAVE_char_fetch_and_sub1_read) # define AO_char_fetch_and_sub1_acquire_read(addr) \ AO_char_fetch_and_sub1_read(addr) # define AO_HAVE_char_fetch_and_sub1_acquire_read #endif #if !defined(AO_HAVE_char_fetch_and_sub1_acquire_read) \ && defined(AO_HAVE_char_fetch_and_sub1_acquire) # define AO_char_fetch_and_sub1_acquire_read(addr) \ AO_char_fetch_and_sub1_acquire(addr) # define AO_HAVE_char_fetch_and_sub1_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_char_fetch_and_sub1_acquire_read) # define AO_char_fetch_and_sub1_dd_acquire_read(addr) \ AO_char_fetch_and_sub1_acquire_read(addr) # define AO_HAVE_char_fetch_and_sub1_dd_acquire_read # endif #else # if defined(AO_HAVE_char_fetch_and_sub1) # define AO_char_fetch_and_sub1_dd_acquire_read(addr) \ AO_char_fetch_and_sub1(addr) # define AO_HAVE_char_fetch_and_sub1_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* char_and */ #if defined(AO_HAVE_char_compare_and_swap_full) \ && !defined(AO_HAVE_char_and_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_char_and_full(volatile unsigned/**/char *addr, unsigned/**/char value) { unsigned/**/char old; do { old = *(unsigned/**/char *)addr; } while (AO_EXPECT_FALSE(!AO_char_compare_and_swap_full(addr, old, old & value))); } # define AO_HAVE_char_and_full #endif #if defined(AO_HAVE_char_and_full) # if !defined(AO_HAVE_char_and_release) # define AO_char_and_release(addr, val) AO_char_and_full(addr, val) # define AO_HAVE_char_and_release # endif # if !defined(AO_HAVE_char_and_acquire) # define AO_char_and_acquire(addr, val) AO_char_and_full(addr, val) # define AO_HAVE_char_and_acquire # endif # if !defined(AO_HAVE_char_and_write) # define AO_char_and_write(addr, val) AO_char_and_full(addr, val) # define AO_HAVE_char_and_write # endif # if !defined(AO_HAVE_char_and_read) # define AO_char_and_read(addr, val) AO_char_and_full(addr, val) # 
define AO_HAVE_char_and_read # endif #endif /* AO_HAVE_char_and_full */ #if !defined(AO_HAVE_char_and) && defined(AO_HAVE_char_and_release) # define AO_char_and(addr, val) AO_char_and_release(addr, val) # define AO_HAVE_char_and #endif #if !defined(AO_HAVE_char_and) && defined(AO_HAVE_char_and_acquire) # define AO_char_and(addr, val) AO_char_and_acquire(addr, val) # define AO_HAVE_char_and #endif #if !defined(AO_HAVE_char_and) && defined(AO_HAVE_char_and_write) # define AO_char_and(addr, val) AO_char_and_write(addr, val) # define AO_HAVE_char_and #endif #if !defined(AO_HAVE_char_and) && defined(AO_HAVE_char_and_read) # define AO_char_and(addr, val) AO_char_and_read(addr, val) # define AO_HAVE_char_and #endif #if defined(AO_HAVE_char_and_acquire) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_char_and_full) # define AO_char_and_full(addr, val) \ (AO_nop_full(), AO_char_and_acquire(addr, val)) # define AO_HAVE_char_and_full #endif #if !defined(AO_HAVE_char_and_release_write) \ && defined(AO_HAVE_char_and_write) # define AO_char_and_release_write(addr, val) AO_char_and_write(addr, val) # define AO_HAVE_char_and_release_write #endif #if !defined(AO_HAVE_char_and_release_write) \ && defined(AO_HAVE_char_and_release) # define AO_char_and_release_write(addr, val) AO_char_and_release(addr, val) # define AO_HAVE_char_and_release_write #endif #if !defined(AO_HAVE_char_and_acquire_read) \ && defined(AO_HAVE_char_and_read) # define AO_char_and_acquire_read(addr, val) AO_char_and_read(addr, val) # define AO_HAVE_char_and_acquire_read #endif #if !defined(AO_HAVE_char_and_acquire_read) \ && defined(AO_HAVE_char_and_acquire) # define AO_char_and_acquire_read(addr, val) AO_char_and_acquire(addr, val) # define AO_HAVE_char_and_acquire_read #endif /* char_or */ #if defined(AO_HAVE_char_compare_and_swap_full) \ && !defined(AO_HAVE_char_or_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_char_or_full(volatile unsigned/**/char *addr, unsigned/**/char value) { unsigned/**/char old; 
do { old = *(unsigned/**/char *)addr; } while (AO_EXPECT_FALSE(!AO_char_compare_and_swap_full(addr, old, old | value))); } # define AO_HAVE_char_or_full #endif #if defined(AO_HAVE_char_or_full) # if !defined(AO_HAVE_char_or_release) # define AO_char_or_release(addr, val) AO_char_or_full(addr, val) # define AO_HAVE_char_or_release # endif # if !defined(AO_HAVE_char_or_acquire) # define AO_char_or_acquire(addr, val) AO_char_or_full(addr, val) # define AO_HAVE_char_or_acquire # endif # if !defined(AO_HAVE_char_or_write) # define AO_char_or_write(addr, val) AO_char_or_full(addr, val) # define AO_HAVE_char_or_write # endif # if !defined(AO_HAVE_char_or_read) # define AO_char_or_read(addr, val) AO_char_or_full(addr, val) # define AO_HAVE_char_or_read # endif #endif /* AO_HAVE_char_or_full */ #if !defined(AO_HAVE_char_or) && defined(AO_HAVE_char_or_release) # define AO_char_or(addr, val) AO_char_or_release(addr, val) # define AO_HAVE_char_or #endif #if !defined(AO_HAVE_char_or) && defined(AO_HAVE_char_or_acquire) # define AO_char_or(addr, val) AO_char_or_acquire(addr, val) # define AO_HAVE_char_or #endif #if !defined(AO_HAVE_char_or) && defined(AO_HAVE_char_or_write) # define AO_char_or(addr, val) AO_char_or_write(addr, val) # define AO_HAVE_char_or #endif #if !defined(AO_HAVE_char_or) && defined(AO_HAVE_char_or_read) # define AO_char_or(addr, val) AO_char_or_read(addr, val) # define AO_HAVE_char_or #endif #if defined(AO_HAVE_char_or_acquire) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_char_or_full) # define AO_char_or_full(addr, val) \ (AO_nop_full(), AO_char_or_acquire(addr, val)) # define AO_HAVE_char_or_full #endif #if !defined(AO_HAVE_char_or_release_write) \ && defined(AO_HAVE_char_or_write) # define AO_char_or_release_write(addr, val) AO_char_or_write(addr, val) # define AO_HAVE_char_or_release_write #endif #if !defined(AO_HAVE_char_or_release_write) \ && defined(AO_HAVE_char_or_release) # define AO_char_or_release_write(addr, val) AO_char_or_release(addr, 
val) # define AO_HAVE_char_or_release_write #endif #if !defined(AO_HAVE_char_or_acquire_read) && defined(AO_HAVE_char_or_read) # define AO_char_or_acquire_read(addr, val) AO_char_or_read(addr, val) # define AO_HAVE_char_or_acquire_read #endif #if !defined(AO_HAVE_char_or_acquire_read) \ && defined(AO_HAVE_char_or_acquire) # define AO_char_or_acquire_read(addr, val) AO_char_or_acquire(addr, val) # define AO_HAVE_char_or_acquire_read #endif /* char_xor */ #if defined(AO_HAVE_char_compare_and_swap_full) \ && !defined(AO_HAVE_char_xor_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_char_xor_full(volatile unsigned/**/char *addr, unsigned/**/char value) { unsigned/**/char old; do { old = *(unsigned/**/char *)addr; } while (AO_EXPECT_FALSE(!AO_char_compare_and_swap_full(addr, old, old ^ value))); } # define AO_HAVE_char_xor_full #endif #if defined(AO_HAVE_char_xor_full) # if !defined(AO_HAVE_char_xor_release) # define AO_char_xor_release(addr, val) AO_char_xor_full(addr, val) # define AO_HAVE_char_xor_release # endif # if !defined(AO_HAVE_char_xor_acquire) # define AO_char_xor_acquire(addr, val) AO_char_xor_full(addr, val) # define AO_HAVE_char_xor_acquire # endif # if !defined(AO_HAVE_char_xor_write) # define AO_char_xor_write(addr, val) AO_char_xor_full(addr, val) # define AO_HAVE_char_xor_write # endif # if !defined(AO_HAVE_char_xor_read) # define AO_char_xor_read(addr, val) AO_char_xor_full(addr, val) # define AO_HAVE_char_xor_read # endif #endif /* AO_HAVE_char_xor_full */ #if !defined(AO_HAVE_char_xor) && defined(AO_HAVE_char_xor_release) # define AO_char_xor(addr, val) AO_char_xor_release(addr, val) # define AO_HAVE_char_xor #endif #if !defined(AO_HAVE_char_xor) && defined(AO_HAVE_char_xor_acquire) # define AO_char_xor(addr, val) AO_char_xor_acquire(addr, val) # define AO_HAVE_char_xor #endif #if !defined(AO_HAVE_char_xor) && defined(AO_HAVE_char_xor_write) # define AO_char_xor(addr, val) AO_char_xor_write(addr, val) # define AO_HAVE_char_xor #endif #if 
!defined(AO_HAVE_char_xor) && defined(AO_HAVE_char_xor_read) # define AO_char_xor(addr, val) AO_char_xor_read(addr, val) # define AO_HAVE_char_xor #endif #if defined(AO_HAVE_char_xor_acquire) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_char_xor_full) # define AO_char_xor_full(addr, val) \ (AO_nop_full(), AO_char_xor_acquire(addr, val)) # define AO_HAVE_char_xor_full #endif #if !defined(AO_HAVE_char_xor_release_write) \ && defined(AO_HAVE_char_xor_write) # define AO_char_xor_release_write(addr, val) AO_char_xor_write(addr, val) # define AO_HAVE_char_xor_release_write #endif #if !defined(AO_HAVE_char_xor_release_write) \ && defined(AO_HAVE_char_xor_release) # define AO_char_xor_release_write(addr, val) AO_char_xor_release(addr, val) # define AO_HAVE_char_xor_release_write #endif #if !defined(AO_HAVE_char_xor_acquire_read) \ && defined(AO_HAVE_char_xor_read) # define AO_char_xor_acquire_read(addr, val) AO_char_xor_read(addr, val) # define AO_HAVE_char_xor_acquire_read #endif #if !defined(AO_HAVE_char_xor_acquire_read) \ && defined(AO_HAVE_char_xor_acquire) # define AO_char_xor_acquire_read(addr, val) AO_char_xor_acquire(addr, val) # define AO_HAVE_char_xor_acquire_read #endif /* char_and/or/xor_dd_acquire_read are meaningless. */ /* * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* short_compare_and_swap (based on fetch_compare_and_swap) */ #if defined(AO_HAVE_short_fetch_compare_and_swap_full) \ && !defined(AO_HAVE_short_compare_and_swap_full) AO_INLINE int AO_short_compare_and_swap_full(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { return AO_short_fetch_compare_and_swap_full(addr, old_val, new_val) == old_val; } # define AO_HAVE_short_compare_and_swap_full #endif #if defined(AO_HAVE_short_fetch_compare_and_swap_acquire) \ && !defined(AO_HAVE_short_compare_and_swap_acquire) AO_INLINE int AO_short_compare_and_swap_acquire(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { return AO_short_fetch_compare_and_swap_acquire(addr, old_val, new_val) == old_val; } # define AO_HAVE_short_compare_and_swap_acquire #endif #if defined(AO_HAVE_short_fetch_compare_and_swap_release) \ && !defined(AO_HAVE_short_compare_and_swap_release) AO_INLINE int AO_short_compare_and_swap_release(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { return AO_short_fetch_compare_and_swap_release(addr, old_val, new_val) == old_val; } # define AO_HAVE_short_compare_and_swap_release #endif #if defined(AO_HAVE_short_fetch_compare_and_swap_write) \ && !defined(AO_HAVE_short_compare_and_swap_write) AO_INLINE int AO_short_compare_and_swap_write(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { return AO_short_fetch_compare_and_swap_write(addr, 
old_val, new_val) == old_val; } # define AO_HAVE_short_compare_and_swap_write #endif #if defined(AO_HAVE_short_fetch_compare_and_swap_read) \ && !defined(AO_HAVE_short_compare_and_swap_read) AO_INLINE int AO_short_compare_and_swap_read(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { return AO_short_fetch_compare_and_swap_read(addr, old_val, new_val) == old_val; } # define AO_HAVE_short_compare_and_swap_read #endif #if defined(AO_HAVE_short_fetch_compare_and_swap) \ && !defined(AO_HAVE_short_compare_and_swap) AO_INLINE int AO_short_compare_and_swap(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { return AO_short_fetch_compare_and_swap(addr, old_val, new_val) == old_val; } # define AO_HAVE_short_compare_and_swap #endif #if defined(AO_HAVE_short_fetch_compare_and_swap_release_write) \ && !defined(AO_HAVE_short_compare_and_swap_release_write) AO_INLINE int AO_short_compare_and_swap_release_write(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { return AO_short_fetch_compare_and_swap_release_write(addr, old_val, new_val) == old_val; } # define AO_HAVE_short_compare_and_swap_release_write #endif #if defined(AO_HAVE_short_fetch_compare_and_swap_acquire_read) \ && !defined(AO_HAVE_short_compare_and_swap_acquire_read) AO_INLINE int AO_short_compare_and_swap_acquire_read(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { return AO_short_fetch_compare_and_swap_acquire_read(addr, old_val, new_val) == old_val; } # define AO_HAVE_short_compare_and_swap_acquire_read #endif #if defined(AO_HAVE_short_fetch_compare_and_swap_dd_acquire_read) \ && !defined(AO_HAVE_short_compare_and_swap_dd_acquire_read) AO_INLINE int AO_short_compare_and_swap_dd_acquire_read(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { return AO_short_fetch_compare_and_swap_dd_acquire_read(addr, old_val, new_val) 
== old_val; } # define AO_HAVE_short_compare_and_swap_dd_acquire_read #endif /* short_fetch_and_add */ /* We first try to implement fetch_and_add variants in terms of the */ /* corresponding compare_and_swap variants to minimize adding barriers. */ #if defined(AO_HAVE_short_compare_and_swap_full) \ && !defined(AO_HAVE_short_fetch_and_add_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE unsigned/**/short AO_short_fetch_and_add_full(volatile unsigned/**/short *addr, unsigned/**/short incr) { unsigned/**/short old; do { old = *(unsigned/**/short *)addr; } while (AO_EXPECT_FALSE(!AO_short_compare_and_swap_full(addr, old, old + incr))); return old; } # define AO_HAVE_short_fetch_and_add_full #endif #if defined(AO_HAVE_short_compare_and_swap_acquire) \ && !defined(AO_HAVE_short_fetch_and_add_acquire) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE unsigned/**/short AO_short_fetch_and_add_acquire(volatile unsigned/**/short *addr, unsigned/**/short incr) { unsigned/**/short old; do { old = *(unsigned/**/short *)addr; } while (AO_EXPECT_FALSE(!AO_short_compare_and_swap_acquire(addr, old, old + incr))); return old; } # define AO_HAVE_short_fetch_and_add_acquire #endif #if defined(AO_HAVE_short_compare_and_swap_release) \ && !defined(AO_HAVE_short_fetch_and_add_release) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE unsigned/**/short AO_short_fetch_and_add_release(volatile unsigned/**/short *addr, unsigned/**/short incr) { unsigned/**/short old; do { old = *(unsigned/**/short *)addr; } while (AO_EXPECT_FALSE(!AO_short_compare_and_swap_release(addr, old, old + incr))); return old; } # define AO_HAVE_short_fetch_and_add_release #endif #if defined(AO_HAVE_short_compare_and_swap) \ && !defined(AO_HAVE_short_fetch_and_add) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE unsigned/**/short AO_short_fetch_and_add(volatile unsigned/**/short *addr, unsigned/**/short incr) { unsigned/**/short old; do { old = *(unsigned/**/short *)addr; } while (AO_EXPECT_FALSE(!AO_short_compare_and_swap(addr, old, old + incr))); return old; 
} # define AO_HAVE_short_fetch_and_add #endif #if defined(AO_HAVE_short_fetch_and_add_full) # if !defined(AO_HAVE_short_fetch_and_add_release) # define AO_short_fetch_and_add_release(addr, val) \ AO_short_fetch_and_add_full(addr, val) # define AO_HAVE_short_fetch_and_add_release # endif # if !defined(AO_HAVE_short_fetch_and_add_acquire) # define AO_short_fetch_and_add_acquire(addr, val) \ AO_short_fetch_and_add_full(addr, val) # define AO_HAVE_short_fetch_and_add_acquire # endif # if !defined(AO_HAVE_short_fetch_and_add_write) # define AO_short_fetch_and_add_write(addr, val) \ AO_short_fetch_and_add_full(addr, val) # define AO_HAVE_short_fetch_and_add_write # endif # if !defined(AO_HAVE_short_fetch_and_add_read) # define AO_short_fetch_and_add_read(addr, val) \ AO_short_fetch_and_add_full(addr, val) # define AO_HAVE_short_fetch_and_add_read # endif #endif /* AO_HAVE_short_fetch_and_add_full */ #if defined(AO_HAVE_short_fetch_and_add) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_short_fetch_and_add_acquire) AO_INLINE unsigned/**/short AO_short_fetch_and_add_acquire(volatile unsigned/**/short *addr, unsigned/**/short incr) { unsigned/**/short result = AO_short_fetch_and_add(addr, incr); AO_nop_full(); return result; } # define AO_HAVE_short_fetch_and_add_acquire #endif #if defined(AO_HAVE_short_fetch_and_add) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_short_fetch_and_add_release) # define AO_short_fetch_and_add_release(addr, incr) \ (AO_nop_full(), AO_short_fetch_and_add(addr, incr)) # define AO_HAVE_short_fetch_and_add_release #endif #if !defined(AO_HAVE_short_fetch_and_add) \ && defined(AO_HAVE_short_fetch_and_add_release) # define AO_short_fetch_and_add(addr, val) \ AO_short_fetch_and_add_release(addr, val) # define AO_HAVE_short_fetch_and_add #endif #if !defined(AO_HAVE_short_fetch_and_add) \ && defined(AO_HAVE_short_fetch_and_add_acquire) # define AO_short_fetch_and_add(addr, val) \ AO_short_fetch_and_add_acquire(addr, val) # define 
AO_HAVE_short_fetch_and_add #endif #if !defined(AO_HAVE_short_fetch_and_add) \ && defined(AO_HAVE_short_fetch_and_add_write) # define AO_short_fetch_and_add(addr, val) \ AO_short_fetch_and_add_write(addr, val) # define AO_HAVE_short_fetch_and_add #endif #if !defined(AO_HAVE_short_fetch_and_add) \ && defined(AO_HAVE_short_fetch_and_add_read) # define AO_short_fetch_and_add(addr, val) \ AO_short_fetch_and_add_read(addr, val) # define AO_HAVE_short_fetch_and_add #endif #if defined(AO_HAVE_short_fetch_and_add_acquire) \ && defined(AO_HAVE_nop_full) && !defined(AO_HAVE_short_fetch_and_add_full) # define AO_short_fetch_and_add_full(addr, val) \ (AO_nop_full(), AO_short_fetch_and_add_acquire(addr, val)) # define AO_HAVE_short_fetch_and_add_full #endif #if !defined(AO_HAVE_short_fetch_and_add_release_write) \ && defined(AO_HAVE_short_fetch_and_add_write) # define AO_short_fetch_and_add_release_write(addr, val) \ AO_short_fetch_and_add_write(addr, val) # define AO_HAVE_short_fetch_and_add_release_write #endif #if !defined(AO_HAVE_short_fetch_and_add_release_write) \ && defined(AO_HAVE_short_fetch_and_add_release) # define AO_short_fetch_and_add_release_write(addr, val) \ AO_short_fetch_and_add_release(addr, val) # define AO_HAVE_short_fetch_and_add_release_write #endif #if !defined(AO_HAVE_short_fetch_and_add_acquire_read) \ && defined(AO_HAVE_short_fetch_and_add_read) # define AO_short_fetch_and_add_acquire_read(addr, val) \ AO_short_fetch_and_add_read(addr, val) # define AO_HAVE_short_fetch_and_add_acquire_read #endif #if !defined(AO_HAVE_short_fetch_and_add_acquire_read) \ && defined(AO_HAVE_short_fetch_and_add_acquire) # define AO_short_fetch_and_add_acquire_read(addr, val) \ AO_short_fetch_and_add_acquire(addr, val) # define AO_HAVE_short_fetch_and_add_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_short_fetch_and_add_acquire_read) # define AO_short_fetch_and_add_dd_acquire_read(addr, val) \ AO_short_fetch_and_add_acquire_read(addr, val) # define 
AO_HAVE_short_fetch_and_add_dd_acquire_read # endif #else # if defined(AO_HAVE_short_fetch_and_add) # define AO_short_fetch_and_add_dd_acquire_read(addr, val) \ AO_short_fetch_and_add(addr, val) # define AO_HAVE_short_fetch_and_add_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* short_fetch_and_add1 */ #if defined(AO_HAVE_short_fetch_and_add_full) \ && !defined(AO_HAVE_short_fetch_and_add1_full) # define AO_short_fetch_and_add1_full(addr) \ AO_short_fetch_and_add_full(addr, 1) # define AO_HAVE_short_fetch_and_add1_full #endif #if defined(AO_HAVE_short_fetch_and_add_release) \ && !defined(AO_HAVE_short_fetch_and_add1_release) # define AO_short_fetch_and_add1_release(addr) \ AO_short_fetch_and_add_release(addr, 1) # define AO_HAVE_short_fetch_and_add1_release #endif #if defined(AO_HAVE_short_fetch_and_add_acquire) \ && !defined(AO_HAVE_short_fetch_and_add1_acquire) # define AO_short_fetch_and_add1_acquire(addr) \ AO_short_fetch_and_add_acquire(addr, 1) # define AO_HAVE_short_fetch_and_add1_acquire #endif #if defined(AO_HAVE_short_fetch_and_add_write) \ && !defined(AO_HAVE_short_fetch_and_add1_write) # define AO_short_fetch_and_add1_write(addr) \ AO_short_fetch_and_add_write(addr, 1) # define AO_HAVE_short_fetch_and_add1_write #endif #if defined(AO_HAVE_short_fetch_and_add_read) \ && !defined(AO_HAVE_short_fetch_and_add1_read) # define AO_short_fetch_and_add1_read(addr) \ AO_short_fetch_and_add_read(addr, 1) # define AO_HAVE_short_fetch_and_add1_read #endif #if defined(AO_HAVE_short_fetch_and_add_release_write) \ && !defined(AO_HAVE_short_fetch_and_add1_release_write) # define AO_short_fetch_and_add1_release_write(addr) \ AO_short_fetch_and_add_release_write(addr, 1) # define AO_HAVE_short_fetch_and_add1_release_write #endif #if defined(AO_HAVE_short_fetch_and_add_acquire_read) \ && !defined(AO_HAVE_short_fetch_and_add1_acquire_read) # define AO_short_fetch_and_add1_acquire_read(addr) \ AO_short_fetch_and_add_acquire_read(addr, 1) # define 
AO_HAVE_short_fetch_and_add1_acquire_read #endif #if defined(AO_HAVE_short_fetch_and_add) \ && !defined(AO_HAVE_short_fetch_and_add1) # define AO_short_fetch_and_add1(addr) AO_short_fetch_and_add(addr, 1) # define AO_HAVE_short_fetch_and_add1 #endif #if defined(AO_HAVE_short_fetch_and_add1_full) # if !defined(AO_HAVE_short_fetch_and_add1_release) # define AO_short_fetch_and_add1_release(addr) \ AO_short_fetch_and_add1_full(addr) # define AO_HAVE_short_fetch_and_add1_release # endif # if !defined(AO_HAVE_short_fetch_and_add1_acquire) # define AO_short_fetch_and_add1_acquire(addr) \ AO_short_fetch_and_add1_full(addr) # define AO_HAVE_short_fetch_and_add1_acquire # endif # if !defined(AO_HAVE_short_fetch_and_add1_write) # define AO_short_fetch_and_add1_write(addr) \ AO_short_fetch_and_add1_full(addr) # define AO_HAVE_short_fetch_and_add1_write # endif # if !defined(AO_HAVE_short_fetch_and_add1_read) # define AO_short_fetch_and_add1_read(addr) \ AO_short_fetch_and_add1_full(addr) # define AO_HAVE_short_fetch_and_add1_read # endif #endif /* AO_HAVE_short_fetch_and_add1_full */ #if !defined(AO_HAVE_short_fetch_and_add1) \ && defined(AO_HAVE_short_fetch_and_add1_release) # define AO_short_fetch_and_add1(addr) AO_short_fetch_and_add1_release(addr) # define AO_HAVE_short_fetch_and_add1 #endif #if !defined(AO_HAVE_short_fetch_and_add1) \ && defined(AO_HAVE_short_fetch_and_add1_acquire) # define AO_short_fetch_and_add1(addr) AO_short_fetch_and_add1_acquire(addr) # define AO_HAVE_short_fetch_and_add1 #endif #if !defined(AO_HAVE_short_fetch_and_add1) \ && defined(AO_HAVE_short_fetch_and_add1_write) # define AO_short_fetch_and_add1(addr) AO_short_fetch_and_add1_write(addr) # define AO_HAVE_short_fetch_and_add1 #endif #if !defined(AO_HAVE_short_fetch_and_add1) \ && defined(AO_HAVE_short_fetch_and_add1_read) # define AO_short_fetch_and_add1(addr) AO_short_fetch_and_add1_read(addr) # define AO_HAVE_short_fetch_and_add1 #endif #if defined(AO_HAVE_short_fetch_and_add1_acquire) \ && 
defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_short_fetch_and_add1_full) # define AO_short_fetch_and_add1_full(addr) \ (AO_nop_full(), AO_short_fetch_and_add1_acquire(addr)) # define AO_HAVE_short_fetch_and_add1_full #endif #if !defined(AO_HAVE_short_fetch_and_add1_release_write) \ && defined(AO_HAVE_short_fetch_and_add1_write) # define AO_short_fetch_and_add1_release_write(addr) \ AO_short_fetch_and_add1_write(addr) # define AO_HAVE_short_fetch_and_add1_release_write #endif #if !defined(AO_HAVE_short_fetch_and_add1_release_write) \ && defined(AO_HAVE_short_fetch_and_add1_release) # define AO_short_fetch_and_add1_release_write(addr) \ AO_short_fetch_and_add1_release(addr) # define AO_HAVE_short_fetch_and_add1_release_write #endif #if !defined(AO_HAVE_short_fetch_and_add1_acquire_read) \ && defined(AO_HAVE_short_fetch_and_add1_read) # define AO_short_fetch_and_add1_acquire_read(addr) \ AO_short_fetch_and_add1_read(addr) # define AO_HAVE_short_fetch_and_add1_acquire_read #endif #if !defined(AO_HAVE_short_fetch_and_add1_acquire_read) \ && defined(AO_HAVE_short_fetch_and_add1_acquire) # define AO_short_fetch_and_add1_acquire_read(addr) \ AO_short_fetch_and_add1_acquire(addr) # define AO_HAVE_short_fetch_and_add1_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_short_fetch_and_add1_acquire_read) # define AO_short_fetch_and_add1_dd_acquire_read(addr) \ AO_short_fetch_and_add1_acquire_read(addr) # define AO_HAVE_short_fetch_and_add1_dd_acquire_read # endif #else # if defined(AO_HAVE_short_fetch_and_add1) # define AO_short_fetch_and_add1_dd_acquire_read(addr) \ AO_short_fetch_and_add1(addr) # define AO_HAVE_short_fetch_and_add1_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* short_fetch_and_sub1 */ #if defined(AO_HAVE_short_fetch_and_add_full) \ && !defined(AO_HAVE_short_fetch_and_sub1_full) # define AO_short_fetch_and_sub1_full(addr) \ AO_short_fetch_and_add_full(addr, (unsigned/**/short)(-1)) # define AO_HAVE_short_fetch_and_sub1_full #endif 
#if defined(AO_HAVE_short_fetch_and_add_release) \ && !defined(AO_HAVE_short_fetch_and_sub1_release) # define AO_short_fetch_and_sub1_release(addr) \ AO_short_fetch_and_add_release(addr, (unsigned/**/short)(-1)) # define AO_HAVE_short_fetch_and_sub1_release #endif #if defined(AO_HAVE_short_fetch_and_add_acquire) \ && !defined(AO_HAVE_short_fetch_and_sub1_acquire) # define AO_short_fetch_and_sub1_acquire(addr) \ AO_short_fetch_and_add_acquire(addr, (unsigned/**/short)(-1)) # define AO_HAVE_short_fetch_and_sub1_acquire #endif #if defined(AO_HAVE_short_fetch_and_add_write) \ && !defined(AO_HAVE_short_fetch_and_sub1_write) # define AO_short_fetch_and_sub1_write(addr) \ AO_short_fetch_and_add_write(addr, (unsigned/**/short)(-1)) # define AO_HAVE_short_fetch_and_sub1_write #endif #if defined(AO_HAVE_short_fetch_and_add_read) \ && !defined(AO_HAVE_short_fetch_and_sub1_read) # define AO_short_fetch_and_sub1_read(addr) \ AO_short_fetch_and_add_read(addr, (unsigned/**/short)(-1)) # define AO_HAVE_short_fetch_and_sub1_read #endif #if defined(AO_HAVE_short_fetch_and_add_release_write) \ && !defined(AO_HAVE_short_fetch_and_sub1_release_write) # define AO_short_fetch_and_sub1_release_write(addr) \ AO_short_fetch_and_add_release_write(addr, (unsigned/**/short)(-1)) # define AO_HAVE_short_fetch_and_sub1_release_write #endif #if defined(AO_HAVE_short_fetch_and_add_acquire_read) \ && !defined(AO_HAVE_short_fetch_and_sub1_acquire_read) # define AO_short_fetch_and_sub1_acquire_read(addr) \ AO_short_fetch_and_add_acquire_read(addr, (unsigned/**/short)(-1)) # define AO_HAVE_short_fetch_and_sub1_acquire_read #endif #if defined(AO_HAVE_short_fetch_and_add) \ && !defined(AO_HAVE_short_fetch_and_sub1) # define AO_short_fetch_and_sub1(addr) \ AO_short_fetch_and_add(addr, (unsigned/**/short)(-1)) # define AO_HAVE_short_fetch_and_sub1 #endif #if defined(AO_HAVE_short_fetch_and_sub1_full) # if !defined(AO_HAVE_short_fetch_and_sub1_release) # define AO_short_fetch_and_sub1_release(addr) \ 
AO_short_fetch_and_sub1_full(addr) # define AO_HAVE_short_fetch_and_sub1_release # endif # if !defined(AO_HAVE_short_fetch_and_sub1_acquire) # define AO_short_fetch_and_sub1_acquire(addr) \ AO_short_fetch_and_sub1_full(addr) # define AO_HAVE_short_fetch_and_sub1_acquire # endif # if !defined(AO_HAVE_short_fetch_and_sub1_write) # define AO_short_fetch_and_sub1_write(addr) \ AO_short_fetch_and_sub1_full(addr) # define AO_HAVE_short_fetch_and_sub1_write # endif # if !defined(AO_HAVE_short_fetch_and_sub1_read) # define AO_short_fetch_and_sub1_read(addr) \ AO_short_fetch_and_sub1_full(addr) # define AO_HAVE_short_fetch_and_sub1_read # endif #endif /* AO_HAVE_short_fetch_and_sub1_full */ #if !defined(AO_HAVE_short_fetch_and_sub1) \ && defined(AO_HAVE_short_fetch_and_sub1_release) # define AO_short_fetch_and_sub1(addr) AO_short_fetch_and_sub1_release(addr) # define AO_HAVE_short_fetch_and_sub1 #endif #if !defined(AO_HAVE_short_fetch_and_sub1) \ && defined(AO_HAVE_short_fetch_and_sub1_acquire) # define AO_short_fetch_and_sub1(addr) AO_short_fetch_and_sub1_acquire(addr) # define AO_HAVE_short_fetch_and_sub1 #endif #if !defined(AO_HAVE_short_fetch_and_sub1) \ && defined(AO_HAVE_short_fetch_and_sub1_write) # define AO_short_fetch_and_sub1(addr) AO_short_fetch_and_sub1_write(addr) # define AO_HAVE_short_fetch_and_sub1 #endif #if !defined(AO_HAVE_short_fetch_and_sub1) \ && defined(AO_HAVE_short_fetch_and_sub1_read) # define AO_short_fetch_and_sub1(addr) AO_short_fetch_and_sub1_read(addr) # define AO_HAVE_short_fetch_and_sub1 #endif #if defined(AO_HAVE_short_fetch_and_sub1_acquire) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_short_fetch_and_sub1_full) # define AO_short_fetch_and_sub1_full(addr) \ (AO_nop_full(), AO_short_fetch_and_sub1_acquire(addr)) # define AO_HAVE_short_fetch_and_sub1_full #endif #if !defined(AO_HAVE_short_fetch_and_sub1_release_write) \ && defined(AO_HAVE_short_fetch_and_sub1_write) # define AO_short_fetch_and_sub1_release_write(addr) \ 
AO_short_fetch_and_sub1_write(addr) # define AO_HAVE_short_fetch_and_sub1_release_write #endif #if !defined(AO_HAVE_short_fetch_and_sub1_release_write) \ && defined(AO_HAVE_short_fetch_and_sub1_release) # define AO_short_fetch_and_sub1_release_write(addr) \ AO_short_fetch_and_sub1_release(addr) # define AO_HAVE_short_fetch_and_sub1_release_write #endif #if !defined(AO_HAVE_short_fetch_and_sub1_acquire_read) \ && defined(AO_HAVE_short_fetch_and_sub1_read) # define AO_short_fetch_and_sub1_acquire_read(addr) \ AO_short_fetch_and_sub1_read(addr) # define AO_HAVE_short_fetch_and_sub1_acquire_read #endif #if !defined(AO_HAVE_short_fetch_and_sub1_acquire_read) \ && defined(AO_HAVE_short_fetch_and_sub1_acquire) # define AO_short_fetch_and_sub1_acquire_read(addr) \ AO_short_fetch_and_sub1_acquire(addr) # define AO_HAVE_short_fetch_and_sub1_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_short_fetch_and_sub1_acquire_read) # define AO_short_fetch_and_sub1_dd_acquire_read(addr) \ AO_short_fetch_and_sub1_acquire_read(addr) # define AO_HAVE_short_fetch_and_sub1_dd_acquire_read # endif #else # if defined(AO_HAVE_short_fetch_and_sub1) # define AO_short_fetch_and_sub1_dd_acquire_read(addr) \ AO_short_fetch_and_sub1(addr) # define AO_HAVE_short_fetch_and_sub1_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* short_and */ #if defined(AO_HAVE_short_compare_and_swap_full) \ && !defined(AO_HAVE_short_and_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_short_and_full(volatile unsigned/**/short *addr, unsigned/**/short value) { unsigned/**/short old; do { old = *(unsigned/**/short *)addr; } while (AO_EXPECT_FALSE(!AO_short_compare_and_swap_full(addr, old, old & value))); } # define AO_HAVE_short_and_full #endif #if defined(AO_HAVE_short_and_full) # if !defined(AO_HAVE_short_and_release) # define AO_short_and_release(addr, val) AO_short_and_full(addr, val) # define AO_HAVE_short_and_release # endif # if !defined(AO_HAVE_short_and_acquire) # define 
AO_short_and_acquire(addr, val) AO_short_and_full(addr, val) # define AO_HAVE_short_and_acquire # endif # if !defined(AO_HAVE_short_and_write) # define AO_short_and_write(addr, val) AO_short_and_full(addr, val) # define AO_HAVE_short_and_write # endif # if !defined(AO_HAVE_short_and_read) # define AO_short_and_read(addr, val) AO_short_and_full(addr, val) # define AO_HAVE_short_and_read # endif #endif /* AO_HAVE_short_and_full */ #if !defined(AO_HAVE_short_and) && defined(AO_HAVE_short_and_release) # define AO_short_and(addr, val) AO_short_and_release(addr, val) # define AO_HAVE_short_and #endif #if !defined(AO_HAVE_short_and) && defined(AO_HAVE_short_and_acquire) # define AO_short_and(addr, val) AO_short_and_acquire(addr, val) # define AO_HAVE_short_and #endif #if !defined(AO_HAVE_short_and) && defined(AO_HAVE_short_and_write) # define AO_short_and(addr, val) AO_short_and_write(addr, val) # define AO_HAVE_short_and #endif #if !defined(AO_HAVE_short_and) && defined(AO_HAVE_short_and_read) # define AO_short_and(addr, val) AO_short_and_read(addr, val) # define AO_HAVE_short_and #endif #if defined(AO_HAVE_short_and_acquire) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_short_and_full) # define AO_short_and_full(addr, val) \ (AO_nop_full(), AO_short_and_acquire(addr, val)) # define AO_HAVE_short_and_full #endif #if !defined(AO_HAVE_short_and_release_write) \ && defined(AO_HAVE_short_and_write) # define AO_short_and_release_write(addr, val) AO_short_and_write(addr, val) # define AO_HAVE_short_and_release_write #endif #if !defined(AO_HAVE_short_and_release_write) \ && defined(AO_HAVE_short_and_release) # define AO_short_and_release_write(addr, val) AO_short_and_release(addr, val) # define AO_HAVE_short_and_release_write #endif #if !defined(AO_HAVE_short_and_acquire_read) \ && defined(AO_HAVE_short_and_read) # define AO_short_and_acquire_read(addr, val) AO_short_and_read(addr, val) # define AO_HAVE_short_and_acquire_read #endif #if 
!defined(AO_HAVE_short_and_acquire_read) \ && defined(AO_HAVE_short_and_acquire) # define AO_short_and_acquire_read(addr, val) AO_short_and_acquire(addr, val) # define AO_HAVE_short_and_acquire_read #endif /* short_or */ #if defined(AO_HAVE_short_compare_and_swap_full) \ && !defined(AO_HAVE_short_or_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_short_or_full(volatile unsigned/**/short *addr, unsigned/**/short value) { unsigned/**/short old; do { old = *(unsigned/**/short *)addr; } while (AO_EXPECT_FALSE(!AO_short_compare_and_swap_full(addr, old, old | value))); } # define AO_HAVE_short_or_full #endif #if defined(AO_HAVE_short_or_full) # if !defined(AO_HAVE_short_or_release) # define AO_short_or_release(addr, val) AO_short_or_full(addr, val) # define AO_HAVE_short_or_release # endif # if !defined(AO_HAVE_short_or_acquire) # define AO_short_or_acquire(addr, val) AO_short_or_full(addr, val) # define AO_HAVE_short_or_acquire # endif # if !defined(AO_HAVE_short_or_write) # define AO_short_or_write(addr, val) AO_short_or_full(addr, val) # define AO_HAVE_short_or_write # endif # if !defined(AO_HAVE_short_or_read) # define AO_short_or_read(addr, val) AO_short_or_full(addr, val) # define AO_HAVE_short_or_read # endif #endif /* AO_HAVE_short_or_full */ #if !defined(AO_HAVE_short_or) && defined(AO_HAVE_short_or_release) # define AO_short_or(addr, val) AO_short_or_release(addr, val) # define AO_HAVE_short_or #endif #if !defined(AO_HAVE_short_or) && defined(AO_HAVE_short_or_acquire) # define AO_short_or(addr, val) AO_short_or_acquire(addr, val) # define AO_HAVE_short_or #endif #if !defined(AO_HAVE_short_or) && defined(AO_HAVE_short_or_write) # define AO_short_or(addr, val) AO_short_or_write(addr, val) # define AO_HAVE_short_or #endif #if !defined(AO_HAVE_short_or) && defined(AO_HAVE_short_or_read) # define AO_short_or(addr, val) AO_short_or_read(addr, val) # define AO_HAVE_short_or #endif #if defined(AO_HAVE_short_or_acquire) && defined(AO_HAVE_nop_full) \ && 
!defined(AO_HAVE_short_or_full) # define AO_short_or_full(addr, val) \ (AO_nop_full(), AO_short_or_acquire(addr, val)) # define AO_HAVE_short_or_full #endif #if !defined(AO_HAVE_short_or_release_write) \ && defined(AO_HAVE_short_or_write) # define AO_short_or_release_write(addr, val) AO_short_or_write(addr, val) # define AO_HAVE_short_or_release_write #endif #if !defined(AO_HAVE_short_or_release_write) \ && defined(AO_HAVE_short_or_release) # define AO_short_or_release_write(addr, val) AO_short_or_release(addr, val) # define AO_HAVE_short_or_release_write #endif #if !defined(AO_HAVE_short_or_acquire_read) && defined(AO_HAVE_short_or_read) # define AO_short_or_acquire_read(addr, val) AO_short_or_read(addr, val) # define AO_HAVE_short_or_acquire_read #endif #if !defined(AO_HAVE_short_or_acquire_read) \ && defined(AO_HAVE_short_or_acquire) # define AO_short_or_acquire_read(addr, val) AO_short_or_acquire(addr, val) # define AO_HAVE_short_or_acquire_read #endif /* short_xor */ #if defined(AO_HAVE_short_compare_and_swap_full) \ && !defined(AO_HAVE_short_xor_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_short_xor_full(volatile unsigned/**/short *addr, unsigned/**/short value) { unsigned/**/short old; do { old = *(unsigned/**/short *)addr; } while (AO_EXPECT_FALSE(!AO_short_compare_and_swap_full(addr, old, old ^ value))); } # define AO_HAVE_short_xor_full #endif #if defined(AO_HAVE_short_xor_full) # if !defined(AO_HAVE_short_xor_release) # define AO_short_xor_release(addr, val) AO_short_xor_full(addr, val) # define AO_HAVE_short_xor_release # endif # if !defined(AO_HAVE_short_xor_acquire) # define AO_short_xor_acquire(addr, val) AO_short_xor_full(addr, val) # define AO_HAVE_short_xor_acquire # endif # if !defined(AO_HAVE_short_xor_write) # define AO_short_xor_write(addr, val) AO_short_xor_full(addr, val) # define AO_HAVE_short_xor_write # endif # if !defined(AO_HAVE_short_xor_read) # define AO_short_xor_read(addr, val) AO_short_xor_full(addr, val) # define 
AO_HAVE_short_xor_read # endif #endif /* AO_HAVE_short_xor_full */ #if !defined(AO_HAVE_short_xor) && defined(AO_HAVE_short_xor_release) # define AO_short_xor(addr, val) AO_short_xor_release(addr, val) # define AO_HAVE_short_xor #endif #if !defined(AO_HAVE_short_xor) && defined(AO_HAVE_short_xor_acquire) # define AO_short_xor(addr, val) AO_short_xor_acquire(addr, val) # define AO_HAVE_short_xor #endif #if !defined(AO_HAVE_short_xor) && defined(AO_HAVE_short_xor_write) # define AO_short_xor(addr, val) AO_short_xor_write(addr, val) # define AO_HAVE_short_xor #endif #if !defined(AO_HAVE_short_xor) && defined(AO_HAVE_short_xor_read) # define AO_short_xor(addr, val) AO_short_xor_read(addr, val) # define AO_HAVE_short_xor #endif #if defined(AO_HAVE_short_xor_acquire) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_short_xor_full) # define AO_short_xor_full(addr, val) \ (AO_nop_full(), AO_short_xor_acquire(addr, val)) # define AO_HAVE_short_xor_full #endif #if !defined(AO_HAVE_short_xor_release_write) \ && defined(AO_HAVE_short_xor_write) # define AO_short_xor_release_write(addr, val) AO_short_xor_write(addr, val) # define AO_HAVE_short_xor_release_write #endif #if !defined(AO_HAVE_short_xor_release_write) \ && defined(AO_HAVE_short_xor_release) # define AO_short_xor_release_write(addr, val) AO_short_xor_release(addr, val) # define AO_HAVE_short_xor_release_write #endif #if !defined(AO_HAVE_short_xor_acquire_read) \ && defined(AO_HAVE_short_xor_read) # define AO_short_xor_acquire_read(addr, val) AO_short_xor_read(addr, val) # define AO_HAVE_short_xor_acquire_read #endif #if !defined(AO_HAVE_short_xor_acquire_read) \ && defined(AO_HAVE_short_xor_acquire) # define AO_short_xor_acquire_read(addr, val) AO_short_xor_acquire(addr, val) # define AO_HAVE_short_xor_acquire_read #endif /* short_and/or/xor_dd_acquire_read are meaningless. */ /* * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. 
*/ /* int_compare_and_swap (based on fetch_compare_and_swap) */ #if defined(AO_HAVE_int_fetch_compare_and_swap_full) \ && !defined(AO_HAVE_int_compare_and_swap_full) AO_INLINE int AO_int_compare_and_swap_full(volatile unsigned *addr, unsigned old_val, unsigned new_val) { return AO_int_fetch_compare_and_swap_full(addr, old_val, new_val) == old_val; } # define AO_HAVE_int_compare_and_swap_full #endif #if defined(AO_HAVE_int_fetch_compare_and_swap_acquire) \ && !defined(AO_HAVE_int_compare_and_swap_acquire) AO_INLINE int AO_int_compare_and_swap_acquire(volatile unsigned *addr, unsigned old_val, unsigned new_val) { return AO_int_fetch_compare_and_swap_acquire(addr, old_val, new_val) == old_val; } # define AO_HAVE_int_compare_and_swap_acquire #endif #if defined(AO_HAVE_int_fetch_compare_and_swap_release) \ && !defined(AO_HAVE_int_compare_and_swap_release) AO_INLINE int AO_int_compare_and_swap_release(volatile unsigned *addr, unsigned old_val, unsigned new_val) { return AO_int_fetch_compare_and_swap_release(addr, old_val, new_val) == old_val; } # define AO_HAVE_int_compare_and_swap_release #endif #if defined(AO_HAVE_int_fetch_compare_and_swap_write) \ && !defined(AO_HAVE_int_compare_and_swap_write) AO_INLINE int AO_int_compare_and_swap_write(volatile unsigned *addr, unsigned old_val, unsigned new_val) { return AO_int_fetch_compare_and_swap_write(addr, old_val, new_val) == old_val; } # define AO_HAVE_int_compare_and_swap_write #endif #if defined(AO_HAVE_int_fetch_compare_and_swap_read) \ && !defined(AO_HAVE_int_compare_and_swap_read) AO_INLINE int AO_int_compare_and_swap_read(volatile unsigned *addr, unsigned old_val, unsigned new_val) { return AO_int_fetch_compare_and_swap_read(addr, old_val, new_val) == old_val; } # define AO_HAVE_int_compare_and_swap_read #endif #if defined(AO_HAVE_int_fetch_compare_and_swap) \ && !defined(AO_HAVE_int_compare_and_swap) AO_INLINE int AO_int_compare_and_swap(volatile unsigned *addr, unsigned old_val, unsigned new_val) { return 
AO_int_fetch_compare_and_swap(addr, old_val, new_val) == old_val; } # define AO_HAVE_int_compare_and_swap #endif #if defined(AO_HAVE_int_fetch_compare_and_swap_release_write) \ && !defined(AO_HAVE_int_compare_and_swap_release_write) AO_INLINE int AO_int_compare_and_swap_release_write(volatile unsigned *addr, unsigned old_val, unsigned new_val) { return AO_int_fetch_compare_and_swap_release_write(addr, old_val, new_val) == old_val; } # define AO_HAVE_int_compare_and_swap_release_write #endif #if defined(AO_HAVE_int_fetch_compare_and_swap_acquire_read) \ && !defined(AO_HAVE_int_compare_and_swap_acquire_read) AO_INLINE int AO_int_compare_and_swap_acquire_read(volatile unsigned *addr, unsigned old_val, unsigned new_val) { return AO_int_fetch_compare_and_swap_acquire_read(addr, old_val, new_val) == old_val; } # define AO_HAVE_int_compare_and_swap_acquire_read #endif #if defined(AO_HAVE_int_fetch_compare_and_swap_dd_acquire_read) \ && !defined(AO_HAVE_int_compare_and_swap_dd_acquire_read) AO_INLINE int AO_int_compare_and_swap_dd_acquire_read(volatile unsigned *addr, unsigned old_val, unsigned new_val) { return AO_int_fetch_compare_and_swap_dd_acquire_read(addr, old_val, new_val) == old_val; } # define AO_HAVE_int_compare_and_swap_dd_acquire_read #endif /* int_fetch_and_add */ /* We first try to implement fetch_and_add variants in terms of the */ /* corresponding compare_and_swap variants to minimize adding barriers. 
*/ #if defined(AO_HAVE_int_compare_and_swap_full) \ && !defined(AO_HAVE_int_fetch_and_add_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE unsigned AO_int_fetch_and_add_full(volatile unsigned *addr, unsigned incr) { unsigned old; do { old = *(unsigned *)addr; } while (AO_EXPECT_FALSE(!AO_int_compare_and_swap_full(addr, old, old + incr))); return old; } # define AO_HAVE_int_fetch_and_add_full #endif #if defined(AO_HAVE_int_compare_and_swap_acquire) \ && !defined(AO_HAVE_int_fetch_and_add_acquire) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE unsigned AO_int_fetch_and_add_acquire(volatile unsigned *addr, unsigned incr) { unsigned old; do { old = *(unsigned *)addr; } while (AO_EXPECT_FALSE(!AO_int_compare_and_swap_acquire(addr, old, old + incr))); return old; } # define AO_HAVE_int_fetch_and_add_acquire #endif #if defined(AO_HAVE_int_compare_and_swap_release) \ && !defined(AO_HAVE_int_fetch_and_add_release) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE unsigned AO_int_fetch_and_add_release(volatile unsigned *addr, unsigned incr) { unsigned old; do { old = *(unsigned *)addr; } while (AO_EXPECT_FALSE(!AO_int_compare_and_swap_release(addr, old, old + incr))); return old; } # define AO_HAVE_int_fetch_and_add_release #endif #if defined(AO_HAVE_int_compare_and_swap) \ && !defined(AO_HAVE_int_fetch_and_add) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE unsigned AO_int_fetch_and_add(volatile unsigned *addr, unsigned incr) { unsigned old; do { old = *(unsigned *)addr; } while (AO_EXPECT_FALSE(!AO_int_compare_and_swap(addr, old, old + incr))); return old; } # define AO_HAVE_int_fetch_and_add #endif #if defined(AO_HAVE_int_fetch_and_add_full) # if !defined(AO_HAVE_int_fetch_and_add_release) # define AO_int_fetch_and_add_release(addr, val) \ AO_int_fetch_and_add_full(addr, val) # define AO_HAVE_int_fetch_and_add_release # endif # if !defined(AO_HAVE_int_fetch_and_add_acquire) # define AO_int_fetch_and_add_acquire(addr, val) \ AO_int_fetch_and_add_full(addr, val) # define AO_HAVE_int_fetch_and_add_acquire # endif 
# if !defined(AO_HAVE_int_fetch_and_add_write) # define AO_int_fetch_and_add_write(addr, val) \ AO_int_fetch_and_add_full(addr, val) # define AO_HAVE_int_fetch_and_add_write # endif # if !defined(AO_HAVE_int_fetch_and_add_read) # define AO_int_fetch_and_add_read(addr, val) \ AO_int_fetch_and_add_full(addr, val) # define AO_HAVE_int_fetch_and_add_read # endif #endif /* AO_HAVE_int_fetch_and_add_full */ #if defined(AO_HAVE_int_fetch_and_add) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_int_fetch_and_add_acquire) AO_INLINE unsigned AO_int_fetch_and_add_acquire(volatile unsigned *addr, unsigned incr) { unsigned result = AO_int_fetch_and_add(addr, incr); AO_nop_full(); return result; } # define AO_HAVE_int_fetch_and_add_acquire #endif #if defined(AO_HAVE_int_fetch_and_add) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_int_fetch_and_add_release) # define AO_int_fetch_and_add_release(addr, incr) \ (AO_nop_full(), AO_int_fetch_and_add(addr, incr)) # define AO_HAVE_int_fetch_and_add_release #endif #if !defined(AO_HAVE_int_fetch_and_add) \ && defined(AO_HAVE_int_fetch_and_add_release) # define AO_int_fetch_and_add(addr, val) \ AO_int_fetch_and_add_release(addr, val) # define AO_HAVE_int_fetch_and_add #endif #if !defined(AO_HAVE_int_fetch_and_add) \ && defined(AO_HAVE_int_fetch_and_add_acquire) # define AO_int_fetch_and_add(addr, val) \ AO_int_fetch_and_add_acquire(addr, val) # define AO_HAVE_int_fetch_and_add #endif #if !defined(AO_HAVE_int_fetch_and_add) \ && defined(AO_HAVE_int_fetch_and_add_write) # define AO_int_fetch_and_add(addr, val) \ AO_int_fetch_and_add_write(addr, val) # define AO_HAVE_int_fetch_and_add #endif #if !defined(AO_HAVE_int_fetch_and_add) \ && defined(AO_HAVE_int_fetch_and_add_read) # define AO_int_fetch_and_add(addr, val) \ AO_int_fetch_and_add_read(addr, val) # define AO_HAVE_int_fetch_and_add #endif #if defined(AO_HAVE_int_fetch_and_add_acquire) \ && defined(AO_HAVE_nop_full) && !defined(AO_HAVE_int_fetch_and_add_full) # define 
AO_int_fetch_and_add_full(addr, val) \ (AO_nop_full(), AO_int_fetch_and_add_acquire(addr, val)) # define AO_HAVE_int_fetch_and_add_full #endif #if !defined(AO_HAVE_int_fetch_and_add_release_write) \ && defined(AO_HAVE_int_fetch_and_add_write) # define AO_int_fetch_and_add_release_write(addr, val) \ AO_int_fetch_and_add_write(addr, val) # define AO_HAVE_int_fetch_and_add_release_write #endif #if !defined(AO_HAVE_int_fetch_and_add_release_write) \ && defined(AO_HAVE_int_fetch_and_add_release) # define AO_int_fetch_and_add_release_write(addr, val) \ AO_int_fetch_and_add_release(addr, val) # define AO_HAVE_int_fetch_and_add_release_write #endif #if !defined(AO_HAVE_int_fetch_and_add_acquire_read) \ && defined(AO_HAVE_int_fetch_and_add_read) # define AO_int_fetch_and_add_acquire_read(addr, val) \ AO_int_fetch_and_add_read(addr, val) # define AO_HAVE_int_fetch_and_add_acquire_read #endif #if !defined(AO_HAVE_int_fetch_and_add_acquire_read) \ && defined(AO_HAVE_int_fetch_and_add_acquire) # define AO_int_fetch_and_add_acquire_read(addr, val) \ AO_int_fetch_and_add_acquire(addr, val) # define AO_HAVE_int_fetch_and_add_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_int_fetch_and_add_acquire_read) # define AO_int_fetch_and_add_dd_acquire_read(addr, val) \ AO_int_fetch_and_add_acquire_read(addr, val) # define AO_HAVE_int_fetch_and_add_dd_acquire_read # endif #else # if defined(AO_HAVE_int_fetch_and_add) # define AO_int_fetch_and_add_dd_acquire_read(addr, val) \ AO_int_fetch_and_add(addr, val) # define AO_HAVE_int_fetch_and_add_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* int_fetch_and_add1 */ #if defined(AO_HAVE_int_fetch_and_add_full) \ && !defined(AO_HAVE_int_fetch_and_add1_full) # define AO_int_fetch_and_add1_full(addr) \ AO_int_fetch_and_add_full(addr, 1) # define AO_HAVE_int_fetch_and_add1_full #endif #if defined(AO_HAVE_int_fetch_and_add_release) \ && !defined(AO_HAVE_int_fetch_and_add1_release) # define 
AO_int_fetch_and_add1_release(addr) \ AO_int_fetch_and_add_release(addr, 1) # define AO_HAVE_int_fetch_and_add1_release #endif #if defined(AO_HAVE_int_fetch_and_add_acquire) \ && !defined(AO_HAVE_int_fetch_and_add1_acquire) # define AO_int_fetch_and_add1_acquire(addr) \ AO_int_fetch_and_add_acquire(addr, 1) # define AO_HAVE_int_fetch_and_add1_acquire #endif #if defined(AO_HAVE_int_fetch_and_add_write) \ && !defined(AO_HAVE_int_fetch_and_add1_write) # define AO_int_fetch_and_add1_write(addr) \ AO_int_fetch_and_add_write(addr, 1) # define AO_HAVE_int_fetch_and_add1_write #endif #if defined(AO_HAVE_int_fetch_and_add_read) \ && !defined(AO_HAVE_int_fetch_and_add1_read) # define AO_int_fetch_and_add1_read(addr) \ AO_int_fetch_and_add_read(addr, 1) # define AO_HAVE_int_fetch_and_add1_read #endif #if defined(AO_HAVE_int_fetch_and_add_release_write) \ && !defined(AO_HAVE_int_fetch_and_add1_release_write) # define AO_int_fetch_and_add1_release_write(addr) \ AO_int_fetch_and_add_release_write(addr, 1) # define AO_HAVE_int_fetch_and_add1_release_write #endif #if defined(AO_HAVE_int_fetch_and_add_acquire_read) \ && !defined(AO_HAVE_int_fetch_and_add1_acquire_read) # define AO_int_fetch_and_add1_acquire_read(addr) \ AO_int_fetch_and_add_acquire_read(addr, 1) # define AO_HAVE_int_fetch_and_add1_acquire_read #endif #if defined(AO_HAVE_int_fetch_and_add) \ && !defined(AO_HAVE_int_fetch_and_add1) # define AO_int_fetch_and_add1(addr) AO_int_fetch_and_add(addr, 1) # define AO_HAVE_int_fetch_and_add1 #endif #if defined(AO_HAVE_int_fetch_and_add1_full) # if !defined(AO_HAVE_int_fetch_and_add1_release) # define AO_int_fetch_and_add1_release(addr) \ AO_int_fetch_and_add1_full(addr) # define AO_HAVE_int_fetch_and_add1_release # endif # if !defined(AO_HAVE_int_fetch_and_add1_acquire) # define AO_int_fetch_and_add1_acquire(addr) \ AO_int_fetch_and_add1_full(addr) # define AO_HAVE_int_fetch_and_add1_acquire # endif # if !defined(AO_HAVE_int_fetch_and_add1_write) # define 
AO_int_fetch_and_add1_write(addr) \ AO_int_fetch_and_add1_full(addr) # define AO_HAVE_int_fetch_and_add1_write # endif # if !defined(AO_HAVE_int_fetch_and_add1_read) # define AO_int_fetch_and_add1_read(addr) \ AO_int_fetch_and_add1_full(addr) # define AO_HAVE_int_fetch_and_add1_read # endif #endif /* AO_HAVE_int_fetch_and_add1_full */ #if !defined(AO_HAVE_int_fetch_and_add1) \ && defined(AO_HAVE_int_fetch_and_add1_release) # define AO_int_fetch_and_add1(addr) AO_int_fetch_and_add1_release(addr) # define AO_HAVE_int_fetch_and_add1 #endif #if !defined(AO_HAVE_int_fetch_and_add1) \ && defined(AO_HAVE_int_fetch_and_add1_acquire) # define AO_int_fetch_and_add1(addr) AO_int_fetch_and_add1_acquire(addr) # define AO_HAVE_int_fetch_and_add1 #endif #if !defined(AO_HAVE_int_fetch_and_add1) \ && defined(AO_HAVE_int_fetch_and_add1_write) # define AO_int_fetch_and_add1(addr) AO_int_fetch_and_add1_write(addr) # define AO_HAVE_int_fetch_and_add1 #endif #if !defined(AO_HAVE_int_fetch_and_add1) \ && defined(AO_HAVE_int_fetch_and_add1_read) # define AO_int_fetch_and_add1(addr) AO_int_fetch_and_add1_read(addr) # define AO_HAVE_int_fetch_and_add1 #endif #if defined(AO_HAVE_int_fetch_and_add1_acquire) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_int_fetch_and_add1_full) # define AO_int_fetch_and_add1_full(addr) \ (AO_nop_full(), AO_int_fetch_and_add1_acquire(addr)) # define AO_HAVE_int_fetch_and_add1_full #endif #if !defined(AO_HAVE_int_fetch_and_add1_release_write) \ && defined(AO_HAVE_int_fetch_and_add1_write) # define AO_int_fetch_and_add1_release_write(addr) \ AO_int_fetch_and_add1_write(addr) # define AO_HAVE_int_fetch_and_add1_release_write #endif #if !defined(AO_HAVE_int_fetch_and_add1_release_write) \ && defined(AO_HAVE_int_fetch_and_add1_release) # define AO_int_fetch_and_add1_release_write(addr) \ AO_int_fetch_and_add1_release(addr) # define AO_HAVE_int_fetch_and_add1_release_write #endif #if !defined(AO_HAVE_int_fetch_and_add1_acquire_read) \ && 
defined(AO_HAVE_int_fetch_and_add1_read) # define AO_int_fetch_and_add1_acquire_read(addr) \ AO_int_fetch_and_add1_read(addr) # define AO_HAVE_int_fetch_and_add1_acquire_read #endif #if !defined(AO_HAVE_int_fetch_and_add1_acquire_read) \ && defined(AO_HAVE_int_fetch_and_add1_acquire) # define AO_int_fetch_and_add1_acquire_read(addr) \ AO_int_fetch_and_add1_acquire(addr) # define AO_HAVE_int_fetch_and_add1_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_int_fetch_and_add1_acquire_read) # define AO_int_fetch_and_add1_dd_acquire_read(addr) \ AO_int_fetch_and_add1_acquire_read(addr) # define AO_HAVE_int_fetch_and_add1_dd_acquire_read # endif #else # if defined(AO_HAVE_int_fetch_and_add1) # define AO_int_fetch_and_add1_dd_acquire_read(addr) \ AO_int_fetch_and_add1(addr) # define AO_HAVE_int_fetch_and_add1_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* int_fetch_and_sub1 */ #if defined(AO_HAVE_int_fetch_and_add_full) \ && !defined(AO_HAVE_int_fetch_and_sub1_full) # define AO_int_fetch_and_sub1_full(addr) \ AO_int_fetch_and_add_full(addr, (unsigned)(-1)) # define AO_HAVE_int_fetch_and_sub1_full #endif #if defined(AO_HAVE_int_fetch_and_add_release) \ && !defined(AO_HAVE_int_fetch_and_sub1_release) # define AO_int_fetch_and_sub1_release(addr) \ AO_int_fetch_and_add_release(addr, (unsigned)(-1)) # define AO_HAVE_int_fetch_and_sub1_release #endif #if defined(AO_HAVE_int_fetch_and_add_acquire) \ && !defined(AO_HAVE_int_fetch_and_sub1_acquire) # define AO_int_fetch_and_sub1_acquire(addr) \ AO_int_fetch_and_add_acquire(addr, (unsigned)(-1)) # define AO_HAVE_int_fetch_and_sub1_acquire #endif #if defined(AO_HAVE_int_fetch_and_add_write) \ && !defined(AO_HAVE_int_fetch_and_sub1_write) # define AO_int_fetch_and_sub1_write(addr) \ AO_int_fetch_and_add_write(addr, (unsigned)(-1)) # define AO_HAVE_int_fetch_and_sub1_write #endif #if defined(AO_HAVE_int_fetch_and_add_read) \ && !defined(AO_HAVE_int_fetch_and_sub1_read) # define 
AO_int_fetch_and_sub1_read(addr) \ AO_int_fetch_and_add_read(addr, (unsigned)(-1)) # define AO_HAVE_int_fetch_and_sub1_read #endif #if defined(AO_HAVE_int_fetch_and_add_release_write) \ && !defined(AO_HAVE_int_fetch_and_sub1_release_write) # define AO_int_fetch_and_sub1_release_write(addr) \ AO_int_fetch_and_add_release_write(addr, (unsigned)(-1)) # define AO_HAVE_int_fetch_and_sub1_release_write #endif #if defined(AO_HAVE_int_fetch_and_add_acquire_read) \ && !defined(AO_HAVE_int_fetch_and_sub1_acquire_read) # define AO_int_fetch_and_sub1_acquire_read(addr) \ AO_int_fetch_and_add_acquire_read(addr, (unsigned)(-1)) # define AO_HAVE_int_fetch_and_sub1_acquire_read #endif #if defined(AO_HAVE_int_fetch_and_add) \ && !defined(AO_HAVE_int_fetch_and_sub1) # define AO_int_fetch_and_sub1(addr) \ AO_int_fetch_and_add(addr, (unsigned)(-1)) # define AO_HAVE_int_fetch_and_sub1 #endif #if defined(AO_HAVE_int_fetch_and_sub1_full) # if !defined(AO_HAVE_int_fetch_and_sub1_release) # define AO_int_fetch_and_sub1_release(addr) \ AO_int_fetch_and_sub1_full(addr) # define AO_HAVE_int_fetch_and_sub1_release # endif # if !defined(AO_HAVE_int_fetch_and_sub1_acquire) # define AO_int_fetch_and_sub1_acquire(addr) \ AO_int_fetch_and_sub1_full(addr) # define AO_HAVE_int_fetch_and_sub1_acquire # endif # if !defined(AO_HAVE_int_fetch_and_sub1_write) # define AO_int_fetch_and_sub1_write(addr) \ AO_int_fetch_and_sub1_full(addr) # define AO_HAVE_int_fetch_and_sub1_write # endif # if !defined(AO_HAVE_int_fetch_and_sub1_read) # define AO_int_fetch_and_sub1_read(addr) \ AO_int_fetch_and_sub1_full(addr) # define AO_HAVE_int_fetch_and_sub1_read # endif #endif /* AO_HAVE_int_fetch_and_sub1_full */ #if !defined(AO_HAVE_int_fetch_and_sub1) \ && defined(AO_HAVE_int_fetch_and_sub1_release) # define AO_int_fetch_and_sub1(addr) AO_int_fetch_and_sub1_release(addr) # define AO_HAVE_int_fetch_and_sub1 #endif #if !defined(AO_HAVE_int_fetch_and_sub1) \ && defined(AO_HAVE_int_fetch_and_sub1_acquire) # define 
AO_int_fetch_and_sub1(addr) AO_int_fetch_and_sub1_acquire(addr) # define AO_HAVE_int_fetch_and_sub1 #endif #if !defined(AO_HAVE_int_fetch_and_sub1) \ && defined(AO_HAVE_int_fetch_and_sub1_write) # define AO_int_fetch_and_sub1(addr) AO_int_fetch_and_sub1_write(addr) # define AO_HAVE_int_fetch_and_sub1 #endif #if !defined(AO_HAVE_int_fetch_and_sub1) \ && defined(AO_HAVE_int_fetch_and_sub1_read) # define AO_int_fetch_and_sub1(addr) AO_int_fetch_and_sub1_read(addr) # define AO_HAVE_int_fetch_and_sub1 #endif #if defined(AO_HAVE_int_fetch_and_sub1_acquire) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_int_fetch_and_sub1_full) # define AO_int_fetch_and_sub1_full(addr) \ (AO_nop_full(), AO_int_fetch_and_sub1_acquire(addr)) # define AO_HAVE_int_fetch_and_sub1_full #endif #if !defined(AO_HAVE_int_fetch_and_sub1_release_write) \ && defined(AO_HAVE_int_fetch_and_sub1_write) # define AO_int_fetch_and_sub1_release_write(addr) \ AO_int_fetch_and_sub1_write(addr) # define AO_HAVE_int_fetch_and_sub1_release_write #endif #if !defined(AO_HAVE_int_fetch_and_sub1_release_write) \ && defined(AO_HAVE_int_fetch_and_sub1_release) # define AO_int_fetch_and_sub1_release_write(addr) \ AO_int_fetch_and_sub1_release(addr) # define AO_HAVE_int_fetch_and_sub1_release_write #endif #if !defined(AO_HAVE_int_fetch_and_sub1_acquire_read) \ && defined(AO_HAVE_int_fetch_and_sub1_read) # define AO_int_fetch_and_sub1_acquire_read(addr) \ AO_int_fetch_and_sub1_read(addr) # define AO_HAVE_int_fetch_and_sub1_acquire_read #endif #if !defined(AO_HAVE_int_fetch_and_sub1_acquire_read) \ && defined(AO_HAVE_int_fetch_and_sub1_acquire) # define AO_int_fetch_and_sub1_acquire_read(addr) \ AO_int_fetch_and_sub1_acquire(addr) # define AO_HAVE_int_fetch_and_sub1_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_int_fetch_and_sub1_acquire_read) # define AO_int_fetch_and_sub1_dd_acquire_read(addr) \ AO_int_fetch_and_sub1_acquire_read(addr) # define AO_HAVE_int_fetch_and_sub1_dd_acquire_read # endif 
#else # if defined(AO_HAVE_int_fetch_and_sub1) # define AO_int_fetch_and_sub1_dd_acquire_read(addr) \ AO_int_fetch_and_sub1(addr) # define AO_HAVE_int_fetch_and_sub1_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* int_and */ #if defined(AO_HAVE_int_compare_and_swap_full) \ && !defined(AO_HAVE_int_and_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_int_and_full(volatile unsigned *addr, unsigned value) { unsigned old; do { old = *(unsigned *)addr; } while (AO_EXPECT_FALSE(!AO_int_compare_and_swap_full(addr, old, old & value))); } # define AO_HAVE_int_and_full #endif #if defined(AO_HAVE_int_and_full) # if !defined(AO_HAVE_int_and_release) # define AO_int_and_release(addr, val) AO_int_and_full(addr, val) # define AO_HAVE_int_and_release # endif # if !defined(AO_HAVE_int_and_acquire) # define AO_int_and_acquire(addr, val) AO_int_and_full(addr, val) # define AO_HAVE_int_and_acquire # endif # if !defined(AO_HAVE_int_and_write) # define AO_int_and_write(addr, val) AO_int_and_full(addr, val) # define AO_HAVE_int_and_write # endif # if !defined(AO_HAVE_int_and_read) # define AO_int_and_read(addr, val) AO_int_and_full(addr, val) # define AO_HAVE_int_and_read # endif #endif /* AO_HAVE_int_and_full */ #if !defined(AO_HAVE_int_and) && defined(AO_HAVE_int_and_release) # define AO_int_and(addr, val) AO_int_and_release(addr, val) # define AO_HAVE_int_and #endif #if !defined(AO_HAVE_int_and) && defined(AO_HAVE_int_and_acquire) # define AO_int_and(addr, val) AO_int_and_acquire(addr, val) # define AO_HAVE_int_and #endif #if !defined(AO_HAVE_int_and) && defined(AO_HAVE_int_and_write) # define AO_int_and(addr, val) AO_int_and_write(addr, val) # define AO_HAVE_int_and #endif #if !defined(AO_HAVE_int_and) && defined(AO_HAVE_int_and_read) # define AO_int_and(addr, val) AO_int_and_read(addr, val) # define AO_HAVE_int_and #endif #if defined(AO_HAVE_int_and_acquire) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_int_and_full) # define AO_int_and_full(addr, val) \ 
(AO_nop_full(), AO_int_and_acquire(addr, val)) # define AO_HAVE_int_and_full #endif #if !defined(AO_HAVE_int_and_release_write) \ && defined(AO_HAVE_int_and_write) # define AO_int_and_release_write(addr, val) AO_int_and_write(addr, val) # define AO_HAVE_int_and_release_write #endif #if !defined(AO_HAVE_int_and_release_write) \ && defined(AO_HAVE_int_and_release) # define AO_int_and_release_write(addr, val) AO_int_and_release(addr, val) # define AO_HAVE_int_and_release_write #endif #if !defined(AO_HAVE_int_and_acquire_read) \ && defined(AO_HAVE_int_and_read) # define AO_int_and_acquire_read(addr, val) AO_int_and_read(addr, val) # define AO_HAVE_int_and_acquire_read #endif #if !defined(AO_HAVE_int_and_acquire_read) \ && defined(AO_HAVE_int_and_acquire) # define AO_int_and_acquire_read(addr, val) AO_int_and_acquire(addr, val) # define AO_HAVE_int_and_acquire_read #endif /* int_or */ #if defined(AO_HAVE_int_compare_and_swap_full) \ && !defined(AO_HAVE_int_or_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_int_or_full(volatile unsigned *addr, unsigned value) { unsigned old; do { old = *(unsigned *)addr; } while (AO_EXPECT_FALSE(!AO_int_compare_and_swap_full(addr, old, old | value))); } # define AO_HAVE_int_or_full #endif #if defined(AO_HAVE_int_or_full) # if !defined(AO_HAVE_int_or_release) # define AO_int_or_release(addr, val) AO_int_or_full(addr, val) # define AO_HAVE_int_or_release # endif # if !defined(AO_HAVE_int_or_acquire) # define AO_int_or_acquire(addr, val) AO_int_or_full(addr, val) # define AO_HAVE_int_or_acquire # endif # if !defined(AO_HAVE_int_or_write) # define AO_int_or_write(addr, val) AO_int_or_full(addr, val) # define AO_HAVE_int_or_write # endif # if !defined(AO_HAVE_int_or_read) # define AO_int_or_read(addr, val) AO_int_or_full(addr, val) # define AO_HAVE_int_or_read # endif #endif /* AO_HAVE_int_or_full */ #if !defined(AO_HAVE_int_or) && defined(AO_HAVE_int_or_release) # define AO_int_or(addr, val) AO_int_or_release(addr, val) # define 
AO_HAVE_int_or #endif #if !defined(AO_HAVE_int_or) && defined(AO_HAVE_int_or_acquire) # define AO_int_or(addr, val) AO_int_or_acquire(addr, val) # define AO_HAVE_int_or #endif #if !defined(AO_HAVE_int_or) && defined(AO_HAVE_int_or_write) # define AO_int_or(addr, val) AO_int_or_write(addr, val) # define AO_HAVE_int_or #endif #if !defined(AO_HAVE_int_or) && defined(AO_HAVE_int_or_read) # define AO_int_or(addr, val) AO_int_or_read(addr, val) # define AO_HAVE_int_or #endif #if defined(AO_HAVE_int_or_acquire) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_int_or_full) # define AO_int_or_full(addr, val) \ (AO_nop_full(), AO_int_or_acquire(addr, val)) # define AO_HAVE_int_or_full #endif #if !defined(AO_HAVE_int_or_release_write) \ && defined(AO_HAVE_int_or_write) # define AO_int_or_release_write(addr, val) AO_int_or_write(addr, val) # define AO_HAVE_int_or_release_write #endif #if !defined(AO_HAVE_int_or_release_write) \ && defined(AO_HAVE_int_or_release) # define AO_int_or_release_write(addr, val) AO_int_or_release(addr, val) # define AO_HAVE_int_or_release_write #endif #if !defined(AO_HAVE_int_or_acquire_read) && defined(AO_HAVE_int_or_read) # define AO_int_or_acquire_read(addr, val) AO_int_or_read(addr, val) # define AO_HAVE_int_or_acquire_read #endif #if !defined(AO_HAVE_int_or_acquire_read) \ && defined(AO_HAVE_int_or_acquire) # define AO_int_or_acquire_read(addr, val) AO_int_or_acquire(addr, val) # define AO_HAVE_int_or_acquire_read #endif /* int_xor */ #if defined(AO_HAVE_int_compare_and_swap_full) \ && !defined(AO_HAVE_int_xor_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_int_xor_full(volatile unsigned *addr, unsigned value) { unsigned old; do { old = *(unsigned *)addr; } while (AO_EXPECT_FALSE(!AO_int_compare_and_swap_full(addr, old, old ^ value))); } # define AO_HAVE_int_xor_full #endif #if defined(AO_HAVE_int_xor_full) # if !defined(AO_HAVE_int_xor_release) # define AO_int_xor_release(addr, val) AO_int_xor_full(addr, val) # define 
AO_HAVE_int_xor_release # endif # if !defined(AO_HAVE_int_xor_acquire) # define AO_int_xor_acquire(addr, val) AO_int_xor_full(addr, val) # define AO_HAVE_int_xor_acquire # endif # if !defined(AO_HAVE_int_xor_write) # define AO_int_xor_write(addr, val) AO_int_xor_full(addr, val) # define AO_HAVE_int_xor_write # endif # if !defined(AO_HAVE_int_xor_read) # define AO_int_xor_read(addr, val) AO_int_xor_full(addr, val) # define AO_HAVE_int_xor_read # endif #endif /* AO_HAVE_int_xor_full */ #if !defined(AO_HAVE_int_xor) && defined(AO_HAVE_int_xor_release) # define AO_int_xor(addr, val) AO_int_xor_release(addr, val) # define AO_HAVE_int_xor #endif #if !defined(AO_HAVE_int_xor) && defined(AO_HAVE_int_xor_acquire) # define AO_int_xor(addr, val) AO_int_xor_acquire(addr, val) # define AO_HAVE_int_xor #endif #if !defined(AO_HAVE_int_xor) && defined(AO_HAVE_int_xor_write) # define AO_int_xor(addr, val) AO_int_xor_write(addr, val) # define AO_HAVE_int_xor #endif #if !defined(AO_HAVE_int_xor) && defined(AO_HAVE_int_xor_read) # define AO_int_xor(addr, val) AO_int_xor_read(addr, val) # define AO_HAVE_int_xor #endif #if defined(AO_HAVE_int_xor_acquire) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_int_xor_full) # define AO_int_xor_full(addr, val) \ (AO_nop_full(), AO_int_xor_acquire(addr, val)) # define AO_HAVE_int_xor_full #endif #if !defined(AO_HAVE_int_xor_release_write) \ && defined(AO_HAVE_int_xor_write) # define AO_int_xor_release_write(addr, val) AO_int_xor_write(addr, val) # define AO_HAVE_int_xor_release_write #endif #if !defined(AO_HAVE_int_xor_release_write) \ && defined(AO_HAVE_int_xor_release) # define AO_int_xor_release_write(addr, val) AO_int_xor_release(addr, val) # define AO_HAVE_int_xor_release_write #endif #if !defined(AO_HAVE_int_xor_acquire_read) \ && defined(AO_HAVE_int_xor_read) # define AO_int_xor_acquire_read(addr, val) AO_int_xor_read(addr, val) # define AO_HAVE_int_xor_acquire_read #endif #if !defined(AO_HAVE_int_xor_acquire_read) \ && 
defined(AO_HAVE_int_xor_acquire) # define AO_int_xor_acquire_read(addr, val) AO_int_xor_acquire(addr, val) # define AO_HAVE_int_xor_acquire_read #endif /* int_and/or/xor_dd_acquire_read are meaningless. */ /* * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ /* compare_and_swap (based on fetch_compare_and_swap) */ #if defined(AO_HAVE_fetch_compare_and_swap_full) \ && !defined(AO_HAVE_compare_and_swap_full) AO_INLINE int AO_compare_and_swap_full(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return AO_fetch_compare_and_swap_full(addr, old_val, new_val) == old_val; } # define AO_HAVE_compare_and_swap_full #endif #if defined(AO_HAVE_fetch_compare_and_swap_acquire) \ && !defined(AO_HAVE_compare_and_swap_acquire) AO_INLINE int AO_compare_and_swap_acquire(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return AO_fetch_compare_and_swap_acquire(addr, old_val, new_val) == old_val; } # define AO_HAVE_compare_and_swap_acquire #endif #if defined(AO_HAVE_fetch_compare_and_swap_release) \ && !defined(AO_HAVE_compare_and_swap_release) AO_INLINE int AO_compare_and_swap_release(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return AO_fetch_compare_and_swap_release(addr, old_val, new_val) == old_val; } # define AO_HAVE_compare_and_swap_release #endif #if defined(AO_HAVE_fetch_compare_and_swap_write) \ && !defined(AO_HAVE_compare_and_swap_write) AO_INLINE int AO_compare_and_swap_write(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return AO_fetch_compare_and_swap_write(addr, old_val, new_val) == old_val; } # define AO_HAVE_compare_and_swap_write #endif #if defined(AO_HAVE_fetch_compare_and_swap_read) \ && !defined(AO_HAVE_compare_and_swap_read) AO_INLINE int AO_compare_and_swap_read(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return AO_fetch_compare_and_swap_read(addr, old_val, new_val) == old_val; } # define AO_HAVE_compare_and_swap_read #endif #if defined(AO_HAVE_fetch_compare_and_swap) \ && !defined(AO_HAVE_compare_and_swap) AO_INLINE int AO_compare_and_swap(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return AO_fetch_compare_and_swap(addr, old_val, new_val) == old_val; } # define AO_HAVE_compare_and_swap #endif #if defined(AO_HAVE_fetch_compare_and_swap_release_write) \ && 
!defined(AO_HAVE_compare_and_swap_release_write) AO_INLINE int AO_compare_and_swap_release_write(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return AO_fetch_compare_and_swap_release_write(addr, old_val, new_val) == old_val; } # define AO_HAVE_compare_and_swap_release_write #endif #if defined(AO_HAVE_fetch_compare_and_swap_acquire_read) \ && !defined(AO_HAVE_compare_and_swap_acquire_read) AO_INLINE int AO_compare_and_swap_acquire_read(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return AO_fetch_compare_and_swap_acquire_read(addr, old_val, new_val) == old_val; } # define AO_HAVE_compare_and_swap_acquire_read #endif #if defined(AO_HAVE_fetch_compare_and_swap_dd_acquire_read) \ && !defined(AO_HAVE_compare_and_swap_dd_acquire_read) AO_INLINE int AO_compare_and_swap_dd_acquire_read(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return AO_fetch_compare_and_swap_dd_acquire_read(addr, old_val, new_val) == old_val; } # define AO_HAVE_compare_and_swap_dd_acquire_read #endif /* fetch_and_add */ /* We first try to implement fetch_and_add variants in terms of the */ /* corresponding compare_and_swap variants to minimize adding barriers. 
*/ #if defined(AO_HAVE_compare_and_swap_full) \ && !defined(AO_HAVE_fetch_and_add_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE AO_t AO_fetch_and_add_full(volatile AO_t *addr, AO_t incr) { AO_t old; do { old = *(AO_t *)addr; } while (AO_EXPECT_FALSE(!AO_compare_and_swap_full(addr, old, old + incr))); return old; } # define AO_HAVE_fetch_and_add_full #endif #if defined(AO_HAVE_compare_and_swap_acquire) \ && !defined(AO_HAVE_fetch_and_add_acquire) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE AO_t AO_fetch_and_add_acquire(volatile AO_t *addr, AO_t incr) { AO_t old; do { old = *(AO_t *)addr; } while (AO_EXPECT_FALSE(!AO_compare_and_swap_acquire(addr, old, old + incr))); return old; } # define AO_HAVE_fetch_and_add_acquire #endif #if defined(AO_HAVE_compare_and_swap_release) \ && !defined(AO_HAVE_fetch_and_add_release) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE AO_t AO_fetch_and_add_release(volatile AO_t *addr, AO_t incr) { AO_t old; do { old = *(AO_t *)addr; } while (AO_EXPECT_FALSE(!AO_compare_and_swap_release(addr, old, old + incr))); return old; } # define AO_HAVE_fetch_and_add_release #endif #if defined(AO_HAVE_compare_and_swap) \ && !defined(AO_HAVE_fetch_and_add) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE AO_t AO_fetch_and_add(volatile AO_t *addr, AO_t incr) { AO_t old; do { old = *(AO_t *)addr; } while (AO_EXPECT_FALSE(!AO_compare_and_swap(addr, old, old + incr))); return old; } # define AO_HAVE_fetch_and_add #endif #if defined(AO_HAVE_fetch_and_add_full) # if !defined(AO_HAVE_fetch_and_add_release) # define AO_fetch_and_add_release(addr, val) \ AO_fetch_and_add_full(addr, val) # define AO_HAVE_fetch_and_add_release # endif # if !defined(AO_HAVE_fetch_and_add_acquire) # define AO_fetch_and_add_acquire(addr, val) \ AO_fetch_and_add_full(addr, val) # define AO_HAVE_fetch_and_add_acquire # endif # if !defined(AO_HAVE_fetch_and_add_write) # define AO_fetch_and_add_write(addr, val) \ AO_fetch_and_add_full(addr, val) # define AO_HAVE_fetch_and_add_write # endif # if 
!defined(AO_HAVE_fetch_and_add_read) # define AO_fetch_and_add_read(addr, val) \ AO_fetch_and_add_full(addr, val) # define AO_HAVE_fetch_and_add_read # endif #endif /* AO_HAVE_fetch_and_add_full */ #if defined(AO_HAVE_fetch_and_add) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_fetch_and_add_acquire) AO_INLINE AO_t AO_fetch_and_add_acquire(volatile AO_t *addr, AO_t incr) { AO_t result = AO_fetch_and_add(addr, incr); AO_nop_full(); return result; } # define AO_HAVE_fetch_and_add_acquire #endif #if defined(AO_HAVE_fetch_and_add) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_fetch_and_add_release) # define AO_fetch_and_add_release(addr, incr) \ (AO_nop_full(), AO_fetch_and_add(addr, incr)) # define AO_HAVE_fetch_and_add_release #endif #if !defined(AO_HAVE_fetch_and_add) \ && defined(AO_HAVE_fetch_and_add_release) # define AO_fetch_and_add(addr, val) \ AO_fetch_and_add_release(addr, val) # define AO_HAVE_fetch_and_add #endif #if !defined(AO_HAVE_fetch_and_add) \ && defined(AO_HAVE_fetch_and_add_acquire) # define AO_fetch_and_add(addr, val) \ AO_fetch_and_add_acquire(addr, val) # define AO_HAVE_fetch_and_add #endif #if !defined(AO_HAVE_fetch_and_add) \ && defined(AO_HAVE_fetch_and_add_write) # define AO_fetch_and_add(addr, val) \ AO_fetch_and_add_write(addr, val) # define AO_HAVE_fetch_and_add #endif #if !defined(AO_HAVE_fetch_and_add) \ && defined(AO_HAVE_fetch_and_add_read) # define AO_fetch_and_add(addr, val) \ AO_fetch_and_add_read(addr, val) # define AO_HAVE_fetch_and_add #endif #if defined(AO_HAVE_fetch_and_add_acquire) \ && defined(AO_HAVE_nop_full) && !defined(AO_HAVE_fetch_and_add_full) # define AO_fetch_and_add_full(addr, val) \ (AO_nop_full(), AO_fetch_and_add_acquire(addr, val)) # define AO_HAVE_fetch_and_add_full #endif #if !defined(AO_HAVE_fetch_and_add_release_write) \ && defined(AO_HAVE_fetch_and_add_write) # define AO_fetch_and_add_release_write(addr, val) \ AO_fetch_and_add_write(addr, val) # define AO_HAVE_fetch_and_add_release_write #endif 
#if !defined(AO_HAVE_fetch_and_add_release_write) \ && defined(AO_HAVE_fetch_and_add_release) # define AO_fetch_and_add_release_write(addr, val) \ AO_fetch_and_add_release(addr, val) # define AO_HAVE_fetch_and_add_release_write #endif #if !defined(AO_HAVE_fetch_and_add_acquire_read) \ && defined(AO_HAVE_fetch_and_add_read) # define AO_fetch_and_add_acquire_read(addr, val) \ AO_fetch_and_add_read(addr, val) # define AO_HAVE_fetch_and_add_acquire_read #endif #if !defined(AO_HAVE_fetch_and_add_acquire_read) \ && defined(AO_HAVE_fetch_and_add_acquire) # define AO_fetch_and_add_acquire_read(addr, val) \ AO_fetch_and_add_acquire(addr, val) # define AO_HAVE_fetch_and_add_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_fetch_and_add_acquire_read) # define AO_fetch_and_add_dd_acquire_read(addr, val) \ AO_fetch_and_add_acquire_read(addr, val) # define AO_HAVE_fetch_and_add_dd_acquire_read # endif #else # if defined(AO_HAVE_fetch_and_add) # define AO_fetch_and_add_dd_acquire_read(addr, val) \ AO_fetch_and_add(addr, val) # define AO_HAVE_fetch_and_add_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* fetch_and_add1 */ #if defined(AO_HAVE_fetch_and_add_full) \ && !defined(AO_HAVE_fetch_and_add1_full) # define AO_fetch_and_add1_full(addr) \ AO_fetch_and_add_full(addr, 1) # define AO_HAVE_fetch_and_add1_full #endif #if defined(AO_HAVE_fetch_and_add_release) \ && !defined(AO_HAVE_fetch_and_add1_release) # define AO_fetch_and_add1_release(addr) \ AO_fetch_and_add_release(addr, 1) # define AO_HAVE_fetch_and_add1_release #endif #if defined(AO_HAVE_fetch_and_add_acquire) \ && !defined(AO_HAVE_fetch_and_add1_acquire) # define AO_fetch_and_add1_acquire(addr) \ AO_fetch_and_add_acquire(addr, 1) # define AO_HAVE_fetch_and_add1_acquire #endif #if defined(AO_HAVE_fetch_and_add_write) \ && !defined(AO_HAVE_fetch_and_add1_write) # define AO_fetch_and_add1_write(addr) \ AO_fetch_and_add_write(addr, 1) # define AO_HAVE_fetch_and_add1_write #endif #if 
defined(AO_HAVE_fetch_and_add_read) \ && !defined(AO_HAVE_fetch_and_add1_read) # define AO_fetch_and_add1_read(addr) \ AO_fetch_and_add_read(addr, 1) # define AO_HAVE_fetch_and_add1_read #endif #if defined(AO_HAVE_fetch_and_add_release_write) \ && !defined(AO_HAVE_fetch_and_add1_release_write) # define AO_fetch_and_add1_release_write(addr) \ AO_fetch_and_add_release_write(addr, 1) # define AO_HAVE_fetch_and_add1_release_write #endif #if defined(AO_HAVE_fetch_and_add_acquire_read) \ && !defined(AO_HAVE_fetch_and_add1_acquire_read) # define AO_fetch_and_add1_acquire_read(addr) \ AO_fetch_and_add_acquire_read(addr, 1) # define AO_HAVE_fetch_and_add1_acquire_read #endif #if defined(AO_HAVE_fetch_and_add) \ && !defined(AO_HAVE_fetch_and_add1) # define AO_fetch_and_add1(addr) AO_fetch_and_add(addr, 1) # define AO_HAVE_fetch_and_add1 #endif #if defined(AO_HAVE_fetch_and_add1_full) # if !defined(AO_HAVE_fetch_and_add1_release) # define AO_fetch_and_add1_release(addr) \ AO_fetch_and_add1_full(addr) # define AO_HAVE_fetch_and_add1_release # endif # if !defined(AO_HAVE_fetch_and_add1_acquire) # define AO_fetch_and_add1_acquire(addr) \ AO_fetch_and_add1_full(addr) # define AO_HAVE_fetch_and_add1_acquire # endif # if !defined(AO_HAVE_fetch_and_add1_write) # define AO_fetch_and_add1_write(addr) \ AO_fetch_and_add1_full(addr) # define AO_HAVE_fetch_and_add1_write # endif # if !defined(AO_HAVE_fetch_and_add1_read) # define AO_fetch_and_add1_read(addr) \ AO_fetch_and_add1_full(addr) # define AO_HAVE_fetch_and_add1_read # endif #endif /* AO_HAVE_fetch_and_add1_full */ #if !defined(AO_HAVE_fetch_and_add1) \ && defined(AO_HAVE_fetch_and_add1_release) # define AO_fetch_and_add1(addr) AO_fetch_and_add1_release(addr) # define AO_HAVE_fetch_and_add1 #endif #if !defined(AO_HAVE_fetch_and_add1) \ && defined(AO_HAVE_fetch_and_add1_acquire) # define AO_fetch_and_add1(addr) AO_fetch_and_add1_acquire(addr) # define AO_HAVE_fetch_and_add1 #endif #if !defined(AO_HAVE_fetch_and_add1) \ && 
defined(AO_HAVE_fetch_and_add1_write) # define AO_fetch_and_add1(addr) AO_fetch_and_add1_write(addr) # define AO_HAVE_fetch_and_add1 #endif #if !defined(AO_HAVE_fetch_and_add1) \ && defined(AO_HAVE_fetch_and_add1_read) # define AO_fetch_and_add1(addr) AO_fetch_and_add1_read(addr) # define AO_HAVE_fetch_and_add1 #endif #if defined(AO_HAVE_fetch_and_add1_acquire) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_fetch_and_add1_full) # define AO_fetch_and_add1_full(addr) \ (AO_nop_full(), AO_fetch_and_add1_acquire(addr)) # define AO_HAVE_fetch_and_add1_full #endif #if !defined(AO_HAVE_fetch_and_add1_release_write) \ && defined(AO_HAVE_fetch_and_add1_write) # define AO_fetch_and_add1_release_write(addr) \ AO_fetch_and_add1_write(addr) # define AO_HAVE_fetch_and_add1_release_write #endif #if !defined(AO_HAVE_fetch_and_add1_release_write) \ && defined(AO_HAVE_fetch_and_add1_release) # define AO_fetch_and_add1_release_write(addr) \ AO_fetch_and_add1_release(addr) # define AO_HAVE_fetch_and_add1_release_write #endif #if !defined(AO_HAVE_fetch_and_add1_acquire_read) \ && defined(AO_HAVE_fetch_and_add1_read) # define AO_fetch_and_add1_acquire_read(addr) \ AO_fetch_and_add1_read(addr) # define AO_HAVE_fetch_and_add1_acquire_read #endif #if !defined(AO_HAVE_fetch_and_add1_acquire_read) \ && defined(AO_HAVE_fetch_and_add1_acquire) # define AO_fetch_and_add1_acquire_read(addr) \ AO_fetch_and_add1_acquire(addr) # define AO_HAVE_fetch_and_add1_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_fetch_and_add1_acquire_read) # define AO_fetch_and_add1_dd_acquire_read(addr) \ AO_fetch_and_add1_acquire_read(addr) # define AO_HAVE_fetch_and_add1_dd_acquire_read # endif #else # if defined(AO_HAVE_fetch_and_add1) # define AO_fetch_and_add1_dd_acquire_read(addr) \ AO_fetch_and_add1(addr) # define AO_HAVE_fetch_and_add1_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* fetch_and_sub1 */ #if defined(AO_HAVE_fetch_and_add_full) \ && 
!defined(AO_HAVE_fetch_and_sub1_full) # define AO_fetch_and_sub1_full(addr) \ AO_fetch_and_add_full(addr, (AO_t)(-1)) # define AO_HAVE_fetch_and_sub1_full #endif #if defined(AO_HAVE_fetch_and_add_release) \ && !defined(AO_HAVE_fetch_and_sub1_release) # define AO_fetch_and_sub1_release(addr) \ AO_fetch_and_add_release(addr, (AO_t)(-1)) # define AO_HAVE_fetch_and_sub1_release #endif #if defined(AO_HAVE_fetch_and_add_acquire) \ && !defined(AO_HAVE_fetch_and_sub1_acquire) # define AO_fetch_and_sub1_acquire(addr) \ AO_fetch_and_add_acquire(addr, (AO_t)(-1)) # define AO_HAVE_fetch_and_sub1_acquire #endif #if defined(AO_HAVE_fetch_and_add_write) \ && !defined(AO_HAVE_fetch_and_sub1_write) # define AO_fetch_and_sub1_write(addr) \ AO_fetch_and_add_write(addr, (AO_t)(-1)) # define AO_HAVE_fetch_and_sub1_write #endif #if defined(AO_HAVE_fetch_and_add_read) \ && !defined(AO_HAVE_fetch_and_sub1_read) # define AO_fetch_and_sub1_read(addr) \ AO_fetch_and_add_read(addr, (AO_t)(-1)) # define AO_HAVE_fetch_and_sub1_read #endif #if defined(AO_HAVE_fetch_and_add_release_write) \ && !defined(AO_HAVE_fetch_and_sub1_release_write) # define AO_fetch_and_sub1_release_write(addr) \ AO_fetch_and_add_release_write(addr, (AO_t)(-1)) # define AO_HAVE_fetch_and_sub1_release_write #endif #if defined(AO_HAVE_fetch_and_add_acquire_read) \ && !defined(AO_HAVE_fetch_and_sub1_acquire_read) # define AO_fetch_and_sub1_acquire_read(addr) \ AO_fetch_and_add_acquire_read(addr, (AO_t)(-1)) # define AO_HAVE_fetch_and_sub1_acquire_read #endif #if defined(AO_HAVE_fetch_and_add) \ && !defined(AO_HAVE_fetch_and_sub1) # define AO_fetch_and_sub1(addr) \ AO_fetch_and_add(addr, (AO_t)(-1)) # define AO_HAVE_fetch_and_sub1 #endif #if defined(AO_HAVE_fetch_and_sub1_full) # if !defined(AO_HAVE_fetch_and_sub1_release) # define AO_fetch_and_sub1_release(addr) \ AO_fetch_and_sub1_full(addr) # define AO_HAVE_fetch_and_sub1_release # endif # if !defined(AO_HAVE_fetch_and_sub1_acquire) # define AO_fetch_and_sub1_acquire(addr) 
\ AO_fetch_and_sub1_full(addr) # define AO_HAVE_fetch_and_sub1_acquire # endif # if !defined(AO_HAVE_fetch_and_sub1_write) # define AO_fetch_and_sub1_write(addr) \ AO_fetch_and_sub1_full(addr) # define AO_HAVE_fetch_and_sub1_write # endif # if !defined(AO_HAVE_fetch_and_sub1_read) # define AO_fetch_and_sub1_read(addr) \ AO_fetch_and_sub1_full(addr) # define AO_HAVE_fetch_and_sub1_read # endif #endif /* AO_HAVE_fetch_and_sub1_full */ #if !defined(AO_HAVE_fetch_and_sub1) \ && defined(AO_HAVE_fetch_and_sub1_release) # define AO_fetch_and_sub1(addr) AO_fetch_and_sub1_release(addr) # define AO_HAVE_fetch_and_sub1 #endif #if !defined(AO_HAVE_fetch_and_sub1) \ && defined(AO_HAVE_fetch_and_sub1_acquire) # define AO_fetch_and_sub1(addr) AO_fetch_and_sub1_acquire(addr) # define AO_HAVE_fetch_and_sub1 #endif #if !defined(AO_HAVE_fetch_and_sub1) \ && defined(AO_HAVE_fetch_and_sub1_write) # define AO_fetch_and_sub1(addr) AO_fetch_and_sub1_write(addr) # define AO_HAVE_fetch_and_sub1 #endif #if !defined(AO_HAVE_fetch_and_sub1) \ && defined(AO_HAVE_fetch_and_sub1_read) # define AO_fetch_and_sub1(addr) AO_fetch_and_sub1_read(addr) # define AO_HAVE_fetch_and_sub1 #endif #if defined(AO_HAVE_fetch_and_sub1_acquire) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_fetch_and_sub1_full) # define AO_fetch_and_sub1_full(addr) \ (AO_nop_full(), AO_fetch_and_sub1_acquire(addr)) # define AO_HAVE_fetch_and_sub1_full #endif #if !defined(AO_HAVE_fetch_and_sub1_release_write) \ && defined(AO_HAVE_fetch_and_sub1_write) # define AO_fetch_and_sub1_release_write(addr) \ AO_fetch_and_sub1_write(addr) # define AO_HAVE_fetch_and_sub1_release_write #endif #if !defined(AO_HAVE_fetch_and_sub1_release_write) \ && defined(AO_HAVE_fetch_and_sub1_release) # define AO_fetch_and_sub1_release_write(addr) \ AO_fetch_and_sub1_release(addr) # define AO_HAVE_fetch_and_sub1_release_write #endif #if !defined(AO_HAVE_fetch_and_sub1_acquire_read) \ && defined(AO_HAVE_fetch_and_sub1_read) # define 
AO_fetch_and_sub1_acquire_read(addr) \ AO_fetch_and_sub1_read(addr) # define AO_HAVE_fetch_and_sub1_acquire_read #endif #if !defined(AO_HAVE_fetch_and_sub1_acquire_read) \ && defined(AO_HAVE_fetch_and_sub1_acquire) # define AO_fetch_and_sub1_acquire_read(addr) \ AO_fetch_and_sub1_acquire(addr) # define AO_HAVE_fetch_and_sub1_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_fetch_and_sub1_acquire_read) # define AO_fetch_and_sub1_dd_acquire_read(addr) \ AO_fetch_and_sub1_acquire_read(addr) # define AO_HAVE_fetch_and_sub1_dd_acquire_read # endif #else # if defined(AO_HAVE_fetch_and_sub1) # define AO_fetch_and_sub1_dd_acquire_read(addr) \ AO_fetch_and_sub1(addr) # define AO_HAVE_fetch_and_sub1_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* and */ #if defined(AO_HAVE_compare_and_swap_full) \ && !defined(AO_HAVE_and_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_and_full(volatile AO_t *addr, AO_t value) { AO_t old; do { old = *(AO_t *)addr; } while (AO_EXPECT_FALSE(!AO_compare_and_swap_full(addr, old, old & value))); } # define AO_HAVE_and_full #endif #if defined(AO_HAVE_and_full) # if !defined(AO_HAVE_and_release) # define AO_and_release(addr, val) AO_and_full(addr, val) # define AO_HAVE_and_release # endif # if !defined(AO_HAVE_and_acquire) # define AO_and_acquire(addr, val) AO_and_full(addr, val) # define AO_HAVE_and_acquire # endif # if !defined(AO_HAVE_and_write) # define AO_and_write(addr, val) AO_and_full(addr, val) # define AO_HAVE_and_write # endif # if !defined(AO_HAVE_and_read) # define AO_and_read(addr, val) AO_and_full(addr, val) # define AO_HAVE_and_read # endif #endif /* AO_HAVE_and_full */ #if !defined(AO_HAVE_and) && defined(AO_HAVE_and_release) # define AO_and(addr, val) AO_and_release(addr, val) # define AO_HAVE_and #endif #if !defined(AO_HAVE_and) && defined(AO_HAVE_and_acquire) # define AO_and(addr, val) AO_and_acquire(addr, val) # define AO_HAVE_and #endif #if !defined(AO_HAVE_and) && defined(AO_HAVE_and_write) # 
define AO_and(addr, val) AO_and_write(addr, val) # define AO_HAVE_and #endif #if !defined(AO_HAVE_and) && defined(AO_HAVE_and_read) # define AO_and(addr, val) AO_and_read(addr, val) # define AO_HAVE_and #endif #if defined(AO_HAVE_and_acquire) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_and_full) # define AO_and_full(addr, val) \ (AO_nop_full(), AO_and_acquire(addr, val)) # define AO_HAVE_and_full #endif #if !defined(AO_HAVE_and_release_write) \ && defined(AO_HAVE_and_write) # define AO_and_release_write(addr, val) AO_and_write(addr, val) # define AO_HAVE_and_release_write #endif #if !defined(AO_HAVE_and_release_write) \ && defined(AO_HAVE_and_release) # define AO_and_release_write(addr, val) AO_and_release(addr, val) # define AO_HAVE_and_release_write #endif #if !defined(AO_HAVE_and_acquire_read) \ && defined(AO_HAVE_and_read) # define AO_and_acquire_read(addr, val) AO_and_read(addr, val) # define AO_HAVE_and_acquire_read #endif #if !defined(AO_HAVE_and_acquire_read) \ && defined(AO_HAVE_and_acquire) # define AO_and_acquire_read(addr, val) AO_and_acquire(addr, val) # define AO_HAVE_and_acquire_read #endif /* or */ #if defined(AO_HAVE_compare_and_swap_full) \ && !defined(AO_HAVE_or_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_or_full(volatile AO_t *addr, AO_t value) { AO_t old; do { old = *(AO_t *)addr; } while (AO_EXPECT_FALSE(!AO_compare_and_swap_full(addr, old, old | value))); } # define AO_HAVE_or_full #endif #if defined(AO_HAVE_or_full) # if !defined(AO_HAVE_or_release) # define AO_or_release(addr, val) AO_or_full(addr, val) # define AO_HAVE_or_release # endif # if !defined(AO_HAVE_or_acquire) # define AO_or_acquire(addr, val) AO_or_full(addr, val) # define AO_HAVE_or_acquire # endif # if !defined(AO_HAVE_or_write) # define AO_or_write(addr, val) AO_or_full(addr, val) # define AO_HAVE_or_write # endif # if !defined(AO_HAVE_or_read) # define AO_or_read(addr, val) AO_or_full(addr, val) # define AO_HAVE_or_read # endif #endif /* AO_HAVE_or_full */ #if 
!defined(AO_HAVE_or) && defined(AO_HAVE_or_release) # define AO_or(addr, val) AO_or_release(addr, val) # define AO_HAVE_or #endif #if !defined(AO_HAVE_or) && defined(AO_HAVE_or_acquire) # define AO_or(addr, val) AO_or_acquire(addr, val) # define AO_HAVE_or #endif #if !defined(AO_HAVE_or) && defined(AO_HAVE_or_write) # define AO_or(addr, val) AO_or_write(addr, val) # define AO_HAVE_or #endif #if !defined(AO_HAVE_or) && defined(AO_HAVE_or_read) # define AO_or(addr, val) AO_or_read(addr, val) # define AO_HAVE_or #endif #if defined(AO_HAVE_or_acquire) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_or_full) # define AO_or_full(addr, val) \ (AO_nop_full(), AO_or_acquire(addr, val)) # define AO_HAVE_or_full #endif #if !defined(AO_HAVE_or_release_write) \ && defined(AO_HAVE_or_write) # define AO_or_release_write(addr, val) AO_or_write(addr, val) # define AO_HAVE_or_release_write #endif #if !defined(AO_HAVE_or_release_write) \ && defined(AO_HAVE_or_release) # define AO_or_release_write(addr, val) AO_or_release(addr, val) # define AO_HAVE_or_release_write #endif #if !defined(AO_HAVE_or_acquire_read) && defined(AO_HAVE_or_read) # define AO_or_acquire_read(addr, val) AO_or_read(addr, val) # define AO_HAVE_or_acquire_read #endif #if !defined(AO_HAVE_or_acquire_read) \ && defined(AO_HAVE_or_acquire) # define AO_or_acquire_read(addr, val) AO_or_acquire(addr, val) # define AO_HAVE_or_acquire_read #endif /* xor */ #if defined(AO_HAVE_compare_and_swap_full) \ && !defined(AO_HAVE_xor_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_xor_full(volatile AO_t *addr, AO_t value) { AO_t old; do { old = *(AO_t *)addr; } while (AO_EXPECT_FALSE(!AO_compare_and_swap_full(addr, old, old ^ value))); } # define AO_HAVE_xor_full #endif #if defined(AO_HAVE_xor_full) # if !defined(AO_HAVE_xor_release) # define AO_xor_release(addr, val) AO_xor_full(addr, val) # define AO_HAVE_xor_release # endif # if !defined(AO_HAVE_xor_acquire) # define AO_xor_acquire(addr, val) AO_xor_full(addr, val) # define 
AO_HAVE_xor_acquire # endif # if !defined(AO_HAVE_xor_write) # define AO_xor_write(addr, val) AO_xor_full(addr, val) # define AO_HAVE_xor_write # endif # if !defined(AO_HAVE_xor_read) # define AO_xor_read(addr, val) AO_xor_full(addr, val) # define AO_HAVE_xor_read # endif #endif /* AO_HAVE_xor_full */ #if !defined(AO_HAVE_xor) && defined(AO_HAVE_xor_release) # define AO_xor(addr, val) AO_xor_release(addr, val) # define AO_HAVE_xor #endif #if !defined(AO_HAVE_xor) && defined(AO_HAVE_xor_acquire) # define AO_xor(addr, val) AO_xor_acquire(addr, val) # define AO_HAVE_xor #endif #if !defined(AO_HAVE_xor) && defined(AO_HAVE_xor_write) # define AO_xor(addr, val) AO_xor_write(addr, val) # define AO_HAVE_xor #endif #if !defined(AO_HAVE_xor) && defined(AO_HAVE_xor_read) # define AO_xor(addr, val) AO_xor_read(addr, val) # define AO_HAVE_xor #endif #if defined(AO_HAVE_xor_acquire) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_xor_full) # define AO_xor_full(addr, val) \ (AO_nop_full(), AO_xor_acquire(addr, val)) # define AO_HAVE_xor_full #endif #if !defined(AO_HAVE_xor_release_write) \ && defined(AO_HAVE_xor_write) # define AO_xor_release_write(addr, val) AO_xor_write(addr, val) # define AO_HAVE_xor_release_write #endif #if !defined(AO_HAVE_xor_release_write) \ && defined(AO_HAVE_xor_release) # define AO_xor_release_write(addr, val) AO_xor_release(addr, val) # define AO_HAVE_xor_release_write #endif #if !defined(AO_HAVE_xor_acquire_read) \ && defined(AO_HAVE_xor_read) # define AO_xor_acquire_read(addr, val) AO_xor_read(addr, val) # define AO_HAVE_xor_acquire_read #endif #if !defined(AO_HAVE_xor_acquire_read) \ && defined(AO_HAVE_xor_acquire) # define AO_xor_acquire_read(addr, val) AO_xor_acquire(addr, val) # define AO_HAVE_xor_acquire_read #endif /* and/or/xor_dd_acquire_read are meaningless. 
*/

/* src/atomic_ops/generalize-arithm.template */

/*
 * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
*/ /* XSIZE_compare_and_swap (based on fetch_compare_and_swap) */ #if defined(AO_HAVE_XSIZE_fetch_compare_and_swap_full) \ && !defined(AO_HAVE_XSIZE_compare_and_swap_full) AO_INLINE int AO_XSIZE_compare_and_swap_full(volatile XCTYPE *addr, XCTYPE old_val, XCTYPE new_val) { return AO_XSIZE_fetch_compare_and_swap_full(addr, old_val, new_val) == old_val; } # define AO_HAVE_XSIZE_compare_and_swap_full #endif #if defined(AO_HAVE_XSIZE_fetch_compare_and_swap_acquire) \ && !defined(AO_HAVE_XSIZE_compare_and_swap_acquire) AO_INLINE int AO_XSIZE_compare_and_swap_acquire(volatile XCTYPE *addr, XCTYPE old_val, XCTYPE new_val) { return AO_XSIZE_fetch_compare_and_swap_acquire(addr, old_val, new_val) == old_val; } # define AO_HAVE_XSIZE_compare_and_swap_acquire #endif #if defined(AO_HAVE_XSIZE_fetch_compare_and_swap_release) \ && !defined(AO_HAVE_XSIZE_compare_and_swap_release) AO_INLINE int AO_XSIZE_compare_and_swap_release(volatile XCTYPE *addr, XCTYPE old_val, XCTYPE new_val) { return AO_XSIZE_fetch_compare_and_swap_release(addr, old_val, new_val) == old_val; } # define AO_HAVE_XSIZE_compare_and_swap_release #endif #if defined(AO_HAVE_XSIZE_fetch_compare_and_swap_write) \ && !defined(AO_HAVE_XSIZE_compare_and_swap_write) AO_INLINE int AO_XSIZE_compare_and_swap_write(volatile XCTYPE *addr, XCTYPE old_val, XCTYPE new_val) { return AO_XSIZE_fetch_compare_and_swap_write(addr, old_val, new_val) == old_val; } # define AO_HAVE_XSIZE_compare_and_swap_write #endif #if defined(AO_HAVE_XSIZE_fetch_compare_and_swap_read) \ && !defined(AO_HAVE_XSIZE_compare_and_swap_read) AO_INLINE int AO_XSIZE_compare_and_swap_read(volatile XCTYPE *addr, XCTYPE old_val, XCTYPE new_val) { return AO_XSIZE_fetch_compare_and_swap_read(addr, old_val, new_val) == old_val; } # define AO_HAVE_XSIZE_compare_and_swap_read #endif #if defined(AO_HAVE_XSIZE_fetch_compare_and_swap) \ && !defined(AO_HAVE_XSIZE_compare_and_swap) AO_INLINE int AO_XSIZE_compare_and_swap(volatile XCTYPE *addr, XCTYPE old_val, XCTYPE 
new_val) { return AO_XSIZE_fetch_compare_and_swap(addr, old_val, new_val) == old_val; } # define AO_HAVE_XSIZE_compare_and_swap #endif #if defined(AO_HAVE_XSIZE_fetch_compare_and_swap_release_write) \ && !defined(AO_HAVE_XSIZE_compare_and_swap_release_write) AO_INLINE int AO_XSIZE_compare_and_swap_release_write(volatile XCTYPE *addr, XCTYPE old_val, XCTYPE new_val) { return AO_XSIZE_fetch_compare_and_swap_release_write(addr, old_val, new_val) == old_val; } # define AO_HAVE_XSIZE_compare_and_swap_release_write #endif #if defined(AO_HAVE_XSIZE_fetch_compare_and_swap_acquire_read) \ && !defined(AO_HAVE_XSIZE_compare_and_swap_acquire_read) AO_INLINE int AO_XSIZE_compare_and_swap_acquire_read(volatile XCTYPE *addr, XCTYPE old_val, XCTYPE new_val) { return AO_XSIZE_fetch_compare_and_swap_acquire_read(addr, old_val, new_val) == old_val; } # define AO_HAVE_XSIZE_compare_and_swap_acquire_read #endif #if defined(AO_HAVE_XSIZE_fetch_compare_and_swap_dd_acquire_read) \ && !defined(AO_HAVE_XSIZE_compare_and_swap_dd_acquire_read) AO_INLINE int AO_XSIZE_compare_and_swap_dd_acquire_read(volatile XCTYPE *addr, XCTYPE old_val, XCTYPE new_val) { return AO_XSIZE_fetch_compare_and_swap_dd_acquire_read(addr, old_val, new_val) == old_val; } # define AO_HAVE_XSIZE_compare_and_swap_dd_acquire_read #endif /* XSIZE_fetch_and_add */ /* We first try to implement fetch_and_add variants in terms of the */ /* corresponding compare_and_swap variants to minimize adding barriers. 
*/ #if defined(AO_HAVE_XSIZE_compare_and_swap_full) \ && !defined(AO_HAVE_XSIZE_fetch_and_add_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE XCTYPE AO_XSIZE_fetch_and_add_full(volatile XCTYPE *addr, XCTYPE incr) { XCTYPE old; do { old = *(XCTYPE *)addr; } while (AO_EXPECT_FALSE(!AO_XSIZE_compare_and_swap_full(addr, old, old + incr))); return old; } # define AO_HAVE_XSIZE_fetch_and_add_full #endif #if defined(AO_HAVE_XSIZE_compare_and_swap_acquire) \ && !defined(AO_HAVE_XSIZE_fetch_and_add_acquire) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE XCTYPE AO_XSIZE_fetch_and_add_acquire(volatile XCTYPE *addr, XCTYPE incr) { XCTYPE old; do { old = *(XCTYPE *)addr; } while (AO_EXPECT_FALSE(!AO_XSIZE_compare_and_swap_acquire(addr, old, old + incr))); return old; } # define AO_HAVE_XSIZE_fetch_and_add_acquire #endif #if defined(AO_HAVE_XSIZE_compare_and_swap_release) \ && !defined(AO_HAVE_XSIZE_fetch_and_add_release) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE XCTYPE AO_XSIZE_fetch_and_add_release(volatile XCTYPE *addr, XCTYPE incr) { XCTYPE old; do { old = *(XCTYPE *)addr; } while (AO_EXPECT_FALSE(!AO_XSIZE_compare_and_swap_release(addr, old, old + incr))); return old; } # define AO_HAVE_XSIZE_fetch_and_add_release #endif #if defined(AO_HAVE_XSIZE_compare_and_swap) \ && !defined(AO_HAVE_XSIZE_fetch_and_add) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE XCTYPE AO_XSIZE_fetch_and_add(volatile XCTYPE *addr, XCTYPE incr) { XCTYPE old; do { old = *(XCTYPE *)addr; } while (AO_EXPECT_FALSE(!AO_XSIZE_compare_and_swap(addr, old, old + incr))); return old; } # define AO_HAVE_XSIZE_fetch_and_add #endif #if defined(AO_HAVE_XSIZE_fetch_and_add_full) # if !defined(AO_HAVE_XSIZE_fetch_and_add_release) # define AO_XSIZE_fetch_and_add_release(addr, val) \ AO_XSIZE_fetch_and_add_full(addr, val) # define AO_HAVE_XSIZE_fetch_and_add_release # endif # if !defined(AO_HAVE_XSIZE_fetch_and_add_acquire) # define AO_XSIZE_fetch_and_add_acquire(addr, val) \ AO_XSIZE_fetch_and_add_full(addr, val) # define 
AO_HAVE_XSIZE_fetch_and_add_acquire # endif # if !defined(AO_HAVE_XSIZE_fetch_and_add_write) # define AO_XSIZE_fetch_and_add_write(addr, val) \ AO_XSIZE_fetch_and_add_full(addr, val) # define AO_HAVE_XSIZE_fetch_and_add_write # endif # if !defined(AO_HAVE_XSIZE_fetch_and_add_read) # define AO_XSIZE_fetch_and_add_read(addr, val) \ AO_XSIZE_fetch_and_add_full(addr, val) # define AO_HAVE_XSIZE_fetch_and_add_read # endif #endif /* AO_HAVE_XSIZE_fetch_and_add_full */ #if defined(AO_HAVE_XSIZE_fetch_and_add) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_XSIZE_fetch_and_add_acquire) AO_INLINE XCTYPE AO_XSIZE_fetch_and_add_acquire(volatile XCTYPE *addr, XCTYPE incr) { XCTYPE result = AO_XSIZE_fetch_and_add(addr, incr); AO_nop_full(); return result; } # define AO_HAVE_XSIZE_fetch_and_add_acquire #endif #if defined(AO_HAVE_XSIZE_fetch_and_add) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_XSIZE_fetch_and_add_release) # define AO_XSIZE_fetch_and_add_release(addr, incr) \ (AO_nop_full(), AO_XSIZE_fetch_and_add(addr, incr)) # define AO_HAVE_XSIZE_fetch_and_add_release #endif #if !defined(AO_HAVE_XSIZE_fetch_and_add) \ && defined(AO_HAVE_XSIZE_fetch_and_add_release) # define AO_XSIZE_fetch_and_add(addr, val) \ AO_XSIZE_fetch_and_add_release(addr, val) # define AO_HAVE_XSIZE_fetch_and_add #endif #if !defined(AO_HAVE_XSIZE_fetch_and_add) \ && defined(AO_HAVE_XSIZE_fetch_and_add_acquire) # define AO_XSIZE_fetch_and_add(addr, val) \ AO_XSIZE_fetch_and_add_acquire(addr, val) # define AO_HAVE_XSIZE_fetch_and_add #endif #if !defined(AO_HAVE_XSIZE_fetch_and_add) \ && defined(AO_HAVE_XSIZE_fetch_and_add_write) # define AO_XSIZE_fetch_and_add(addr, val) \ AO_XSIZE_fetch_and_add_write(addr, val) # define AO_HAVE_XSIZE_fetch_and_add #endif #if !defined(AO_HAVE_XSIZE_fetch_and_add) \ && defined(AO_HAVE_XSIZE_fetch_and_add_read) # define AO_XSIZE_fetch_and_add(addr, val) \ AO_XSIZE_fetch_and_add_read(addr, val) # define AO_HAVE_XSIZE_fetch_and_add #endif #if 
defined(AO_HAVE_XSIZE_fetch_and_add_acquire) \ && defined(AO_HAVE_nop_full) && !defined(AO_HAVE_XSIZE_fetch_and_add_full) # define AO_XSIZE_fetch_and_add_full(addr, val) \ (AO_nop_full(), AO_XSIZE_fetch_and_add_acquire(addr, val)) # define AO_HAVE_XSIZE_fetch_and_add_full #endif #if !defined(AO_HAVE_XSIZE_fetch_and_add_release_write) \ && defined(AO_HAVE_XSIZE_fetch_and_add_write) # define AO_XSIZE_fetch_and_add_release_write(addr, val) \ AO_XSIZE_fetch_and_add_write(addr, val) # define AO_HAVE_XSIZE_fetch_and_add_release_write #endif #if !defined(AO_HAVE_XSIZE_fetch_and_add_release_write) \ && defined(AO_HAVE_XSIZE_fetch_and_add_release) # define AO_XSIZE_fetch_and_add_release_write(addr, val) \ AO_XSIZE_fetch_and_add_release(addr, val) # define AO_HAVE_XSIZE_fetch_and_add_release_write #endif #if !defined(AO_HAVE_XSIZE_fetch_and_add_acquire_read) \ && defined(AO_HAVE_XSIZE_fetch_and_add_read) # define AO_XSIZE_fetch_and_add_acquire_read(addr, val) \ AO_XSIZE_fetch_and_add_read(addr, val) # define AO_HAVE_XSIZE_fetch_and_add_acquire_read #endif #if !defined(AO_HAVE_XSIZE_fetch_and_add_acquire_read) \ && defined(AO_HAVE_XSIZE_fetch_and_add_acquire) # define AO_XSIZE_fetch_and_add_acquire_read(addr, val) \ AO_XSIZE_fetch_and_add_acquire(addr, val) # define AO_HAVE_XSIZE_fetch_and_add_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_XSIZE_fetch_and_add_acquire_read) # define AO_XSIZE_fetch_and_add_dd_acquire_read(addr, val) \ AO_XSIZE_fetch_and_add_acquire_read(addr, val) # define AO_HAVE_XSIZE_fetch_and_add_dd_acquire_read # endif #else # if defined(AO_HAVE_XSIZE_fetch_and_add) # define AO_XSIZE_fetch_and_add_dd_acquire_read(addr, val) \ AO_XSIZE_fetch_and_add(addr, val) # define AO_HAVE_XSIZE_fetch_and_add_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* XSIZE_fetch_and_add1 */ #if defined(AO_HAVE_XSIZE_fetch_and_add_full) \ && !defined(AO_HAVE_XSIZE_fetch_and_add1_full) # define AO_XSIZE_fetch_and_add1_full(addr) \ 
                AO_XSIZE_fetch_and_add_full(addr, 1)
# define AO_HAVE_XSIZE_fetch_and_add1_full
#endif
#if defined(AO_HAVE_XSIZE_fetch_and_add_release) \
    && !defined(AO_HAVE_XSIZE_fetch_and_add1_release)
# define AO_XSIZE_fetch_and_add1_release(addr) \
                AO_XSIZE_fetch_and_add_release(addr, 1)
# define AO_HAVE_XSIZE_fetch_and_add1_release
#endif
#if defined(AO_HAVE_XSIZE_fetch_and_add_acquire) \
    && !defined(AO_HAVE_XSIZE_fetch_and_add1_acquire)
# define AO_XSIZE_fetch_and_add1_acquire(addr) \
                AO_XSIZE_fetch_and_add_acquire(addr, 1)
# define AO_HAVE_XSIZE_fetch_and_add1_acquire
#endif
#if defined(AO_HAVE_XSIZE_fetch_and_add_write) \
    && !defined(AO_HAVE_XSIZE_fetch_and_add1_write)
# define AO_XSIZE_fetch_and_add1_write(addr) \
                AO_XSIZE_fetch_and_add_write(addr, 1)
# define AO_HAVE_XSIZE_fetch_and_add1_write
#endif
#if defined(AO_HAVE_XSIZE_fetch_and_add_read) \
    && !defined(AO_HAVE_XSIZE_fetch_and_add1_read)
# define AO_XSIZE_fetch_and_add1_read(addr) \
                AO_XSIZE_fetch_and_add_read(addr, 1)
# define AO_HAVE_XSIZE_fetch_and_add1_read
#endif
#if defined(AO_HAVE_XSIZE_fetch_and_add_release_write) \
    && !defined(AO_HAVE_XSIZE_fetch_and_add1_release_write)
# define AO_XSIZE_fetch_and_add1_release_write(addr) \
                AO_XSIZE_fetch_and_add_release_write(addr, 1)
# define AO_HAVE_XSIZE_fetch_and_add1_release_write
#endif
#if defined(AO_HAVE_XSIZE_fetch_and_add_acquire_read) \
    && !defined(AO_HAVE_XSIZE_fetch_and_add1_acquire_read)
# define AO_XSIZE_fetch_and_add1_acquire_read(addr) \
                AO_XSIZE_fetch_and_add_acquire_read(addr, 1)
# define AO_HAVE_XSIZE_fetch_and_add1_acquire_read
#endif
#if defined(AO_HAVE_XSIZE_fetch_and_add) \
    && !defined(AO_HAVE_XSIZE_fetch_and_add1)
# define AO_XSIZE_fetch_and_add1(addr) AO_XSIZE_fetch_and_add(addr, 1)
# define AO_HAVE_XSIZE_fetch_and_add1
#endif

#if defined(AO_HAVE_XSIZE_fetch_and_add1_full)
# if !defined(AO_HAVE_XSIZE_fetch_and_add1_release)
#   define AO_XSIZE_fetch_and_add1_release(addr) \
                AO_XSIZE_fetch_and_add1_full(addr)
#   define AO_HAVE_XSIZE_fetch_and_add1_release
# endif
# if !defined(AO_HAVE_XSIZE_fetch_and_add1_acquire)
#   define AO_XSIZE_fetch_and_add1_acquire(addr) \
                AO_XSIZE_fetch_and_add1_full(addr)
#   define AO_HAVE_XSIZE_fetch_and_add1_acquire
# endif
# if !defined(AO_HAVE_XSIZE_fetch_and_add1_write)
#   define AO_XSIZE_fetch_and_add1_write(addr) \
                AO_XSIZE_fetch_and_add1_full(addr)
#   define AO_HAVE_XSIZE_fetch_and_add1_write
# endif
# if !defined(AO_HAVE_XSIZE_fetch_and_add1_read)
#   define AO_XSIZE_fetch_and_add1_read(addr) \
                AO_XSIZE_fetch_and_add1_full(addr)
#   define AO_HAVE_XSIZE_fetch_and_add1_read
# endif
#endif /* AO_HAVE_XSIZE_fetch_and_add1_full */

#if !defined(AO_HAVE_XSIZE_fetch_and_add1) \
    && defined(AO_HAVE_XSIZE_fetch_and_add1_release)
# define AO_XSIZE_fetch_and_add1(addr) AO_XSIZE_fetch_and_add1_release(addr)
# define AO_HAVE_XSIZE_fetch_and_add1
#endif
#if !defined(AO_HAVE_XSIZE_fetch_and_add1) \
    && defined(AO_HAVE_XSIZE_fetch_and_add1_acquire)
# define AO_XSIZE_fetch_and_add1(addr) AO_XSIZE_fetch_and_add1_acquire(addr)
# define AO_HAVE_XSIZE_fetch_and_add1
#endif
#if !defined(AO_HAVE_XSIZE_fetch_and_add1) \
    && defined(AO_HAVE_XSIZE_fetch_and_add1_write)
# define AO_XSIZE_fetch_and_add1(addr) AO_XSIZE_fetch_and_add1_write(addr)
# define AO_HAVE_XSIZE_fetch_and_add1
#endif
#if !defined(AO_HAVE_XSIZE_fetch_and_add1) \
    && defined(AO_HAVE_XSIZE_fetch_and_add1_read)
# define AO_XSIZE_fetch_and_add1(addr) AO_XSIZE_fetch_and_add1_read(addr)
# define AO_HAVE_XSIZE_fetch_and_add1
#endif

#if defined(AO_HAVE_XSIZE_fetch_and_add1_acquire) \
    && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_XSIZE_fetch_and_add1_full)
# define AO_XSIZE_fetch_and_add1_full(addr) \
                (AO_nop_full(), AO_XSIZE_fetch_and_add1_acquire(addr))
# define AO_HAVE_XSIZE_fetch_and_add1_full
#endif

#if !defined(AO_HAVE_XSIZE_fetch_and_add1_release_write) \
    && defined(AO_HAVE_XSIZE_fetch_and_add1_write)
# define AO_XSIZE_fetch_and_add1_release_write(addr) \
                AO_XSIZE_fetch_and_add1_write(addr)
# define AO_HAVE_XSIZE_fetch_and_add1_release_write
#endif
#if !defined(AO_HAVE_XSIZE_fetch_and_add1_release_write) \
    && defined(AO_HAVE_XSIZE_fetch_and_add1_release)
# define AO_XSIZE_fetch_and_add1_release_write(addr) \
                AO_XSIZE_fetch_and_add1_release(addr)
# define AO_HAVE_XSIZE_fetch_and_add1_release_write
#endif
#if !defined(AO_HAVE_XSIZE_fetch_and_add1_acquire_read) \
    && defined(AO_HAVE_XSIZE_fetch_and_add1_read)
# define AO_XSIZE_fetch_and_add1_acquire_read(addr) \
                AO_XSIZE_fetch_and_add1_read(addr)
# define AO_HAVE_XSIZE_fetch_and_add1_acquire_read
#endif
#if !defined(AO_HAVE_XSIZE_fetch_and_add1_acquire_read) \
    && defined(AO_HAVE_XSIZE_fetch_and_add1_acquire)
# define AO_XSIZE_fetch_and_add1_acquire_read(addr) \
                AO_XSIZE_fetch_and_add1_acquire(addr)
# define AO_HAVE_XSIZE_fetch_and_add1_acquire_read
#endif

#ifdef AO_NO_DD_ORDERING
# if defined(AO_HAVE_XSIZE_fetch_and_add1_acquire_read)
#   define AO_XSIZE_fetch_and_add1_dd_acquire_read(addr) \
                AO_XSIZE_fetch_and_add1_acquire_read(addr)
#   define AO_HAVE_XSIZE_fetch_and_add1_dd_acquire_read
# endif
#else
# if defined(AO_HAVE_XSIZE_fetch_and_add1)
#   define AO_XSIZE_fetch_and_add1_dd_acquire_read(addr) \
                AO_XSIZE_fetch_and_add1(addr)
#   define AO_HAVE_XSIZE_fetch_and_add1_dd_acquire_read
# endif
#endif /* !AO_NO_DD_ORDERING */

/* XSIZE_fetch_and_sub1 */
#if defined(AO_HAVE_XSIZE_fetch_and_add_full) \
    && !defined(AO_HAVE_XSIZE_fetch_and_sub1_full)
# define AO_XSIZE_fetch_and_sub1_full(addr) \
                AO_XSIZE_fetch_and_add_full(addr, (XCTYPE)(-1))
# define AO_HAVE_XSIZE_fetch_and_sub1_full
#endif
#if defined(AO_HAVE_XSIZE_fetch_and_add_release) \
    && !defined(AO_HAVE_XSIZE_fetch_and_sub1_release)
# define AO_XSIZE_fetch_and_sub1_release(addr) \
                AO_XSIZE_fetch_and_add_release(addr, (XCTYPE)(-1))
# define AO_HAVE_XSIZE_fetch_and_sub1_release
#endif
#if defined(AO_HAVE_XSIZE_fetch_and_add_acquire) \
    && !defined(AO_HAVE_XSIZE_fetch_and_sub1_acquire)
# define AO_XSIZE_fetch_and_sub1_acquire(addr) \
                AO_XSIZE_fetch_and_add_acquire(addr, (XCTYPE)(-1))
# define AO_HAVE_XSIZE_fetch_and_sub1_acquire
#endif
#if defined(AO_HAVE_XSIZE_fetch_and_add_write) \
    && !defined(AO_HAVE_XSIZE_fetch_and_sub1_write)
# define AO_XSIZE_fetch_and_sub1_write(addr) \
                AO_XSIZE_fetch_and_add_write(addr, (XCTYPE)(-1))
# define AO_HAVE_XSIZE_fetch_and_sub1_write
#endif
#if defined(AO_HAVE_XSIZE_fetch_and_add_read) \
    && !defined(AO_HAVE_XSIZE_fetch_and_sub1_read)
# define AO_XSIZE_fetch_and_sub1_read(addr) \
                AO_XSIZE_fetch_and_add_read(addr, (XCTYPE)(-1))
# define AO_HAVE_XSIZE_fetch_and_sub1_read
#endif
#if defined(AO_HAVE_XSIZE_fetch_and_add_release_write) \
    && !defined(AO_HAVE_XSIZE_fetch_and_sub1_release_write)
# define AO_XSIZE_fetch_and_sub1_release_write(addr) \
                AO_XSIZE_fetch_and_add_release_write(addr, (XCTYPE)(-1))
# define AO_HAVE_XSIZE_fetch_and_sub1_release_write
#endif
#if defined(AO_HAVE_XSIZE_fetch_and_add_acquire_read) \
    && !defined(AO_HAVE_XSIZE_fetch_and_sub1_acquire_read)
# define AO_XSIZE_fetch_and_sub1_acquire_read(addr) \
                AO_XSIZE_fetch_and_add_acquire_read(addr, (XCTYPE)(-1))
# define AO_HAVE_XSIZE_fetch_and_sub1_acquire_read
#endif
#if defined(AO_HAVE_XSIZE_fetch_and_add) \
    && !defined(AO_HAVE_XSIZE_fetch_and_sub1)
# define AO_XSIZE_fetch_and_sub1(addr) \
                AO_XSIZE_fetch_and_add(addr, (XCTYPE)(-1))
# define AO_HAVE_XSIZE_fetch_and_sub1
#endif

#if defined(AO_HAVE_XSIZE_fetch_and_sub1_full)
# if !defined(AO_HAVE_XSIZE_fetch_and_sub1_release)
#   define AO_XSIZE_fetch_and_sub1_release(addr) \
                AO_XSIZE_fetch_and_sub1_full(addr)
#   define AO_HAVE_XSIZE_fetch_and_sub1_release
# endif
# if !defined(AO_HAVE_XSIZE_fetch_and_sub1_acquire)
#   define AO_XSIZE_fetch_and_sub1_acquire(addr) \
                AO_XSIZE_fetch_and_sub1_full(addr)
#   define AO_HAVE_XSIZE_fetch_and_sub1_acquire
# endif
# if !defined(AO_HAVE_XSIZE_fetch_and_sub1_write)
#   define AO_XSIZE_fetch_and_sub1_write(addr) \
                AO_XSIZE_fetch_and_sub1_full(addr)
#   define AO_HAVE_XSIZE_fetch_and_sub1_write
# endif
# if !defined(AO_HAVE_XSIZE_fetch_and_sub1_read)
#   define AO_XSIZE_fetch_and_sub1_read(addr) \
                AO_XSIZE_fetch_and_sub1_full(addr)
#   define AO_HAVE_XSIZE_fetch_and_sub1_read
# endif
#endif /* AO_HAVE_XSIZE_fetch_and_sub1_full */

#if !defined(AO_HAVE_XSIZE_fetch_and_sub1) \
    && defined(AO_HAVE_XSIZE_fetch_and_sub1_release)
# define AO_XSIZE_fetch_and_sub1(addr) AO_XSIZE_fetch_and_sub1_release(addr)
# define AO_HAVE_XSIZE_fetch_and_sub1
#endif
#if !defined(AO_HAVE_XSIZE_fetch_and_sub1) \
    && defined(AO_HAVE_XSIZE_fetch_and_sub1_acquire)
# define AO_XSIZE_fetch_and_sub1(addr) AO_XSIZE_fetch_and_sub1_acquire(addr)
# define AO_HAVE_XSIZE_fetch_and_sub1
#endif
#if !defined(AO_HAVE_XSIZE_fetch_and_sub1) \
    && defined(AO_HAVE_XSIZE_fetch_and_sub1_write)
# define AO_XSIZE_fetch_and_sub1(addr) AO_XSIZE_fetch_and_sub1_write(addr)
# define AO_HAVE_XSIZE_fetch_and_sub1
#endif
#if !defined(AO_HAVE_XSIZE_fetch_and_sub1) \
    && defined(AO_HAVE_XSIZE_fetch_and_sub1_read)
# define AO_XSIZE_fetch_and_sub1(addr) AO_XSIZE_fetch_and_sub1_read(addr)
# define AO_HAVE_XSIZE_fetch_and_sub1
#endif

#if defined(AO_HAVE_XSIZE_fetch_and_sub1_acquire) \
    && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_XSIZE_fetch_and_sub1_full)
# define AO_XSIZE_fetch_and_sub1_full(addr) \
                (AO_nop_full(), AO_XSIZE_fetch_and_sub1_acquire(addr))
# define AO_HAVE_XSIZE_fetch_and_sub1_full
#endif

#if !defined(AO_HAVE_XSIZE_fetch_and_sub1_release_write) \
    && defined(AO_HAVE_XSIZE_fetch_and_sub1_write)
# define AO_XSIZE_fetch_and_sub1_release_write(addr) \
                AO_XSIZE_fetch_and_sub1_write(addr)
# define AO_HAVE_XSIZE_fetch_and_sub1_release_write
#endif
#if !defined(AO_HAVE_XSIZE_fetch_and_sub1_release_write) \
    && defined(AO_HAVE_XSIZE_fetch_and_sub1_release)
# define AO_XSIZE_fetch_and_sub1_release_write(addr) \
                AO_XSIZE_fetch_and_sub1_release(addr)
# define AO_HAVE_XSIZE_fetch_and_sub1_release_write
#endif
#if !defined(AO_HAVE_XSIZE_fetch_and_sub1_acquire_read) \
    && defined(AO_HAVE_XSIZE_fetch_and_sub1_read)
# define AO_XSIZE_fetch_and_sub1_acquire_read(addr) \
                AO_XSIZE_fetch_and_sub1_read(addr)
# define AO_HAVE_XSIZE_fetch_and_sub1_acquire_read
#endif
#if !defined(AO_HAVE_XSIZE_fetch_and_sub1_acquire_read) \
    && defined(AO_HAVE_XSIZE_fetch_and_sub1_acquire)
# define AO_XSIZE_fetch_and_sub1_acquire_read(addr) \
                AO_XSIZE_fetch_and_sub1_acquire(addr)
# define AO_HAVE_XSIZE_fetch_and_sub1_acquire_read
#endif

#ifdef AO_NO_DD_ORDERING
# if defined(AO_HAVE_XSIZE_fetch_and_sub1_acquire_read)
#   define AO_XSIZE_fetch_and_sub1_dd_acquire_read(addr) \
                AO_XSIZE_fetch_and_sub1_acquire_read(addr)
#   define AO_HAVE_XSIZE_fetch_and_sub1_dd_acquire_read
# endif
#else
# if defined(AO_HAVE_XSIZE_fetch_and_sub1)
#   define AO_XSIZE_fetch_and_sub1_dd_acquire_read(addr) \
                AO_XSIZE_fetch_and_sub1(addr)
#   define AO_HAVE_XSIZE_fetch_and_sub1_dd_acquire_read
# endif
#endif /* !AO_NO_DD_ORDERING */

/* XSIZE_and */
#if defined(AO_HAVE_XSIZE_compare_and_swap_full) \
    && !defined(AO_HAVE_XSIZE_and_full)
  AO_ATTR_NO_SANITIZE_THREAD
  AO_INLINE void
  AO_XSIZE_and_full(volatile XCTYPE *addr, XCTYPE value)
  {
    XCTYPE old;

    do {
      old = *(XCTYPE *)addr;
    } while (AO_EXPECT_FALSE(!AO_XSIZE_compare_and_swap_full(addr, old,
                                                             old & value)));
  }
# define AO_HAVE_XSIZE_and_full
#endif

#if defined(AO_HAVE_XSIZE_and_full)
# if !defined(AO_HAVE_XSIZE_and_release)
#   define AO_XSIZE_and_release(addr, val) AO_XSIZE_and_full(addr, val)
#   define AO_HAVE_XSIZE_and_release
# endif
# if !defined(AO_HAVE_XSIZE_and_acquire)
#   define AO_XSIZE_and_acquire(addr, val) AO_XSIZE_and_full(addr, val)
#   define AO_HAVE_XSIZE_and_acquire
# endif
# if !defined(AO_HAVE_XSIZE_and_write)
#   define AO_XSIZE_and_write(addr, val) AO_XSIZE_and_full(addr, val)
#   define AO_HAVE_XSIZE_and_write
# endif
# if !defined(AO_HAVE_XSIZE_and_read)
#   define AO_XSIZE_and_read(addr, val) AO_XSIZE_and_full(addr, val)
#   define AO_HAVE_XSIZE_and_read
# endif
#endif /* AO_HAVE_XSIZE_and_full */

#if !defined(AO_HAVE_XSIZE_and) && defined(AO_HAVE_XSIZE_and_release)
# define AO_XSIZE_and(addr, val) AO_XSIZE_and_release(addr, val)
# define AO_HAVE_XSIZE_and
#endif
#if !defined(AO_HAVE_XSIZE_and) && defined(AO_HAVE_XSIZE_and_acquire)
# define AO_XSIZE_and(addr, val) AO_XSIZE_and_acquire(addr, val)
# define AO_HAVE_XSIZE_and
#endif
#if !defined(AO_HAVE_XSIZE_and) && defined(AO_HAVE_XSIZE_and_write)
# define AO_XSIZE_and(addr, val) AO_XSIZE_and_write(addr, val)
# define AO_HAVE_XSIZE_and
#endif
#if !defined(AO_HAVE_XSIZE_and) && defined(AO_HAVE_XSIZE_and_read)
# define AO_XSIZE_and(addr, val) AO_XSIZE_and_read(addr, val)
# define AO_HAVE_XSIZE_and
#endif

#if defined(AO_HAVE_XSIZE_and_acquire) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_XSIZE_and_full)
# define AO_XSIZE_and_full(addr, val) \
                (AO_nop_full(), AO_XSIZE_and_acquire(addr, val))
# define AO_HAVE_XSIZE_and_full
#endif

#if !defined(AO_HAVE_XSIZE_and_release_write) \
    && defined(AO_HAVE_XSIZE_and_write)
# define AO_XSIZE_and_release_write(addr, val) AO_XSIZE_and_write(addr, val)
# define AO_HAVE_XSIZE_and_release_write
#endif
#if !defined(AO_HAVE_XSIZE_and_release_write) \
    && defined(AO_HAVE_XSIZE_and_release)
# define AO_XSIZE_and_release_write(addr, val) AO_XSIZE_and_release(addr, val)
# define AO_HAVE_XSIZE_and_release_write
#endif
#if !defined(AO_HAVE_XSIZE_and_acquire_read) \
    && defined(AO_HAVE_XSIZE_and_read)
# define AO_XSIZE_and_acquire_read(addr, val) AO_XSIZE_and_read(addr, val)
# define AO_HAVE_XSIZE_and_acquire_read
#endif
#if !defined(AO_HAVE_XSIZE_and_acquire_read) \
    && defined(AO_HAVE_XSIZE_and_acquire)
# define AO_XSIZE_and_acquire_read(addr, val) AO_XSIZE_and_acquire(addr, val)
# define AO_HAVE_XSIZE_and_acquire_read
#endif

/* XSIZE_or */
#if defined(AO_HAVE_XSIZE_compare_and_swap_full) \
    && !defined(AO_HAVE_XSIZE_or_full)
  AO_ATTR_NO_SANITIZE_THREAD
  AO_INLINE void
  AO_XSIZE_or_full(volatile XCTYPE *addr, XCTYPE value)
  {
    XCTYPE old;

    do {
      old = *(XCTYPE *)addr;
    } while (AO_EXPECT_FALSE(!AO_XSIZE_compare_and_swap_full(addr, old,
                                                             old | value)));
  }
# define AO_HAVE_XSIZE_or_full
#endif

#if defined(AO_HAVE_XSIZE_or_full)
# if !defined(AO_HAVE_XSIZE_or_release)
#   define AO_XSIZE_or_release(addr, val) AO_XSIZE_or_full(addr, val)
#   define AO_HAVE_XSIZE_or_release
# endif
# if !defined(AO_HAVE_XSIZE_or_acquire)
#   define AO_XSIZE_or_acquire(addr, val) AO_XSIZE_or_full(addr, val)
#   define AO_HAVE_XSIZE_or_acquire
# endif
# if !defined(AO_HAVE_XSIZE_or_write)
#   define AO_XSIZE_or_write(addr, val) AO_XSIZE_or_full(addr, val)
#   define AO_HAVE_XSIZE_or_write
# endif
# if !defined(AO_HAVE_XSIZE_or_read)
#   define AO_XSIZE_or_read(addr, val) AO_XSIZE_or_full(addr, val)
#   define AO_HAVE_XSIZE_or_read
# endif
#endif /* AO_HAVE_XSIZE_or_full */

#if !defined(AO_HAVE_XSIZE_or) && defined(AO_HAVE_XSIZE_or_release)
# define AO_XSIZE_or(addr, val) AO_XSIZE_or_release(addr, val)
# define AO_HAVE_XSIZE_or
#endif
#if !defined(AO_HAVE_XSIZE_or) && defined(AO_HAVE_XSIZE_or_acquire)
# define AO_XSIZE_or(addr, val) AO_XSIZE_or_acquire(addr, val)
# define AO_HAVE_XSIZE_or
#endif
#if !defined(AO_HAVE_XSIZE_or) && defined(AO_HAVE_XSIZE_or_write)
# define AO_XSIZE_or(addr, val) AO_XSIZE_or_write(addr, val)
# define AO_HAVE_XSIZE_or
#endif
#if !defined(AO_HAVE_XSIZE_or) && defined(AO_HAVE_XSIZE_or_read)
# define AO_XSIZE_or(addr, val) AO_XSIZE_or_read(addr, val)
# define AO_HAVE_XSIZE_or
#endif

#if defined(AO_HAVE_XSIZE_or_acquire) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_XSIZE_or_full)
# define AO_XSIZE_or_full(addr, val) \
                (AO_nop_full(), AO_XSIZE_or_acquire(addr, val))
# define AO_HAVE_XSIZE_or_full
#endif

#if !defined(AO_HAVE_XSIZE_or_release_write) \
    && defined(AO_HAVE_XSIZE_or_write)
# define AO_XSIZE_or_release_write(addr, val) AO_XSIZE_or_write(addr, val)
# define AO_HAVE_XSIZE_or_release_write
#endif
#if !defined(AO_HAVE_XSIZE_or_release_write) \
    && defined(AO_HAVE_XSIZE_or_release)
# define AO_XSIZE_or_release_write(addr, val) AO_XSIZE_or_release(addr, val)
# define AO_HAVE_XSIZE_or_release_write
#endif
#if !defined(AO_HAVE_XSIZE_or_acquire_read) && defined(AO_HAVE_XSIZE_or_read)
# define AO_XSIZE_or_acquire_read(addr, val) AO_XSIZE_or_read(addr, val)
# define AO_HAVE_XSIZE_or_acquire_read
#endif
#if !defined(AO_HAVE_XSIZE_or_acquire_read) \
    && defined(AO_HAVE_XSIZE_or_acquire)
# define AO_XSIZE_or_acquire_read(addr, val) AO_XSIZE_or_acquire(addr, val)
# define AO_HAVE_XSIZE_or_acquire_read
#endif

/* XSIZE_xor */
#if defined(AO_HAVE_XSIZE_compare_and_swap_full) \
    && !defined(AO_HAVE_XSIZE_xor_full)
  AO_ATTR_NO_SANITIZE_THREAD
  AO_INLINE void
  AO_XSIZE_xor_full(volatile XCTYPE *addr, XCTYPE value)
  {
    XCTYPE old;

    do {
      old = *(XCTYPE *)addr;
    } while (AO_EXPECT_FALSE(!AO_XSIZE_compare_and_swap_full(addr, old,
                                                             old ^ value)));
  }
# define AO_HAVE_XSIZE_xor_full
#endif

#if defined(AO_HAVE_XSIZE_xor_full)
# if !defined(AO_HAVE_XSIZE_xor_release)
#   define AO_XSIZE_xor_release(addr, val) AO_XSIZE_xor_full(addr, val)
#   define AO_HAVE_XSIZE_xor_release
# endif
# if !defined(AO_HAVE_XSIZE_xor_acquire)
#   define AO_XSIZE_xor_acquire(addr, val) AO_XSIZE_xor_full(addr, val)
#   define AO_HAVE_XSIZE_xor_acquire
# endif
# if !defined(AO_HAVE_XSIZE_xor_write)
#   define AO_XSIZE_xor_write(addr, val) AO_XSIZE_xor_full(addr, val)
#   define AO_HAVE_XSIZE_xor_write
# endif
# if !defined(AO_HAVE_XSIZE_xor_read)
#   define AO_XSIZE_xor_read(addr, val) AO_XSIZE_xor_full(addr, val)
#   define AO_HAVE_XSIZE_xor_read
# endif
#endif /* AO_HAVE_XSIZE_xor_full */

#if !defined(AO_HAVE_XSIZE_xor) && defined(AO_HAVE_XSIZE_xor_release)
# define AO_XSIZE_xor(addr, val) AO_XSIZE_xor_release(addr, val)
# define AO_HAVE_XSIZE_xor
#endif
#if !defined(AO_HAVE_XSIZE_xor) && defined(AO_HAVE_XSIZE_xor_acquire)
# define AO_XSIZE_xor(addr, val) AO_XSIZE_xor_acquire(addr, val)
# define AO_HAVE_XSIZE_xor
#endif
#if !defined(AO_HAVE_XSIZE_xor) && defined(AO_HAVE_XSIZE_xor_write)
# define AO_XSIZE_xor(addr, val) AO_XSIZE_xor_write(addr, val)
# define AO_HAVE_XSIZE_xor
#endif
#if !defined(AO_HAVE_XSIZE_xor) && defined(AO_HAVE_XSIZE_xor_read)
# define AO_XSIZE_xor(addr, val) AO_XSIZE_xor_read(addr, val)
# define AO_HAVE_XSIZE_xor
#endif

#if defined(AO_HAVE_XSIZE_xor_acquire) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_XSIZE_xor_full)
# define AO_XSIZE_xor_full(addr, val) \
                (AO_nop_full(), AO_XSIZE_xor_acquire(addr, val))
# define AO_HAVE_XSIZE_xor_full
#endif

#if !defined(AO_HAVE_XSIZE_xor_release_write) \
    && defined(AO_HAVE_XSIZE_xor_write)
# define AO_XSIZE_xor_release_write(addr, val) AO_XSIZE_xor_write(addr, val)
# define AO_HAVE_XSIZE_xor_release_write
#endif
#if !defined(AO_HAVE_XSIZE_xor_release_write) \
    && defined(AO_HAVE_XSIZE_xor_release)
# define AO_XSIZE_xor_release_write(addr, val) AO_XSIZE_xor_release(addr, val)
# define AO_HAVE_XSIZE_xor_release_write
#endif
#if !defined(AO_HAVE_XSIZE_xor_acquire_read) \
    && defined(AO_HAVE_XSIZE_xor_read)
# define AO_XSIZE_xor_acquire_read(addr, val) AO_XSIZE_xor_read(addr, val)
# define AO_HAVE_XSIZE_xor_acquire_read
#endif
#if !defined(AO_HAVE_XSIZE_xor_acquire_read) \
    && defined(AO_HAVE_XSIZE_xor_acquire)
# define AO_XSIZE_xor_acquire_read(addr, val) AO_XSIZE_xor_acquire(addr, val)
# define AO_HAVE_XSIZE_xor_acquire_read
#endif

/* XSIZE_and/or/xor_dd_acquire_read are meaningless. */

/*
 * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

/* char_fetch_compare_and_swap */
#if defined(AO_HAVE_char_fetch_compare_and_swap) \
    && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_char_fetch_compare_and_swap_acquire)
  AO_INLINE unsigned/**/char
  AO_char_fetch_compare_and_swap_acquire(volatile unsigned/**/char *addr,
                                         unsigned/**/char old_val,
                                         unsigned/**/char new_val)
  {
    unsigned/**/char result = AO_char_fetch_compare_and_swap(addr, old_val,
                                                             new_val);
    AO_nop_full();
    return result;
  }
# define AO_HAVE_char_fetch_compare_and_swap_acquire
#endif
#if defined(AO_HAVE_char_fetch_compare_and_swap) \
    && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_char_fetch_compare_and_swap_release)
# define AO_char_fetch_compare_and_swap_release(addr, old_val, new_val) \
                (AO_nop_full(), \
                 AO_char_fetch_compare_and_swap(addr, old_val, new_val))
# define AO_HAVE_char_fetch_compare_and_swap_release
#endif

#if defined(AO_HAVE_char_fetch_compare_and_swap_full)
# if !defined(AO_HAVE_char_fetch_compare_and_swap_release)
#   define AO_char_fetch_compare_and_swap_release(addr, old_val, new_val) \
                AO_char_fetch_compare_and_swap_full(addr, old_val, new_val)
#   define AO_HAVE_char_fetch_compare_and_swap_release
# endif
# if !defined(AO_HAVE_char_fetch_compare_and_swap_acquire)
#   define AO_char_fetch_compare_and_swap_acquire(addr, old_val, new_val) \
                AO_char_fetch_compare_and_swap_full(addr, old_val, new_val)
#   define AO_HAVE_char_fetch_compare_and_swap_acquire
# endif
# if !defined(AO_HAVE_char_fetch_compare_and_swap_write)
#   define AO_char_fetch_compare_and_swap_write(addr, old_val, new_val) \
                AO_char_fetch_compare_and_swap_full(addr, old_val, new_val)
#   define AO_HAVE_char_fetch_compare_and_swap_write
# endif
# if !defined(AO_HAVE_char_fetch_compare_and_swap_read)
#   define AO_char_fetch_compare_and_swap_read(addr, old_val, new_val) \
                AO_char_fetch_compare_and_swap_full(addr, old_val, new_val)
#   define AO_HAVE_char_fetch_compare_and_swap_read
# endif
#endif /* AO_HAVE_char_fetch_compare_and_swap_full */

#if !defined(AO_HAVE_char_fetch_compare_and_swap) \
    && defined(AO_HAVE_char_fetch_compare_and_swap_release)
# define AO_char_fetch_compare_and_swap(addr, old_val, new_val) \
                AO_char_fetch_compare_and_swap_release(addr, old_val, new_val)
# define AO_HAVE_char_fetch_compare_and_swap
#endif
#if !defined(AO_HAVE_char_fetch_compare_and_swap) \
    && defined(AO_HAVE_char_fetch_compare_and_swap_acquire)
# define AO_char_fetch_compare_and_swap(addr, old_val, new_val) \
                AO_char_fetch_compare_and_swap_acquire(addr, old_val, new_val)
# define AO_HAVE_char_fetch_compare_and_swap
#endif
#if !defined(AO_HAVE_char_fetch_compare_and_swap) \
    && defined(AO_HAVE_char_fetch_compare_and_swap_write)
# define AO_char_fetch_compare_and_swap(addr, old_val, new_val) \
                AO_char_fetch_compare_and_swap_write(addr, old_val, new_val)
# define AO_HAVE_char_fetch_compare_and_swap
#endif
#if !defined(AO_HAVE_char_fetch_compare_and_swap) \
    && defined(AO_HAVE_char_fetch_compare_and_swap_read)
# define AO_char_fetch_compare_and_swap(addr, old_val, new_val) \
                AO_char_fetch_compare_and_swap_read(addr, old_val, new_val)
# define AO_HAVE_char_fetch_compare_and_swap
#endif

#if defined(AO_HAVE_char_fetch_compare_and_swap_acquire) \
    && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_char_fetch_compare_and_swap_full)
# define AO_char_fetch_compare_and_swap_full(addr, old_val, new_val) \
                (AO_nop_full(), \
                 AO_char_fetch_compare_and_swap_acquire(addr, old_val, new_val))
# define AO_HAVE_char_fetch_compare_and_swap_full
#endif

#if !defined(AO_HAVE_char_fetch_compare_and_swap_release_write) \
    && defined(AO_HAVE_char_fetch_compare_and_swap_write)
# define AO_char_fetch_compare_and_swap_release_write(addr,old_val,new_val) \
                AO_char_fetch_compare_and_swap_write(addr, old_val, new_val)
# define AO_HAVE_char_fetch_compare_and_swap_release_write
#endif
#if !defined(AO_HAVE_char_fetch_compare_and_swap_release_write) \
    && defined(AO_HAVE_char_fetch_compare_and_swap_release)
# define AO_char_fetch_compare_and_swap_release_write(addr,old_val,new_val) \
                AO_char_fetch_compare_and_swap_release(addr, old_val, new_val)
# define AO_HAVE_char_fetch_compare_and_swap_release_write
#endif
#if !defined(AO_HAVE_char_fetch_compare_and_swap_acquire_read) \
    && defined(AO_HAVE_char_fetch_compare_and_swap_read)
# define AO_char_fetch_compare_and_swap_acquire_read(addr,old_val,new_val) \
                AO_char_fetch_compare_and_swap_read(addr, old_val, new_val)
# define AO_HAVE_char_fetch_compare_and_swap_acquire_read
#endif
#if !defined(AO_HAVE_char_fetch_compare_and_swap_acquire_read) \
    && defined(AO_HAVE_char_fetch_compare_and_swap_acquire)
# define AO_char_fetch_compare_and_swap_acquire_read(addr,old_val,new_val) \
                AO_char_fetch_compare_and_swap_acquire(addr, old_val, new_val)
# define AO_HAVE_char_fetch_compare_and_swap_acquire_read
#endif

#ifdef AO_NO_DD_ORDERING
# if defined(AO_HAVE_char_fetch_compare_and_swap_acquire_read)
#   define AO_char_fetch_compare_and_swap_dd_acquire_read(addr,old_val,new_val) \
                AO_char_fetch_compare_and_swap_acquire_read(addr, old_val, new_val)
#   define AO_HAVE_char_fetch_compare_and_swap_dd_acquire_read
# endif
#else
# if defined(AO_HAVE_char_fetch_compare_and_swap)
#   define AO_char_fetch_compare_and_swap_dd_acquire_read(addr,old_val,new_val) \
                AO_char_fetch_compare_and_swap(addr, old_val, new_val)
#   define AO_HAVE_char_fetch_compare_and_swap_dd_acquire_read
# endif
#endif /* !AO_NO_DD_ORDERING */

/* char_compare_and_swap */
#if defined(AO_HAVE_char_compare_and_swap) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_char_compare_and_swap_acquire)
  AO_INLINE int
  AO_char_compare_and_swap_acquire(volatile unsigned/**/char *addr,
                                   unsigned/**/char old,
                                   unsigned/**/char new_val)
  {
    int result = AO_char_compare_and_swap(addr, old, new_val);

    AO_nop_full();
    return result;
  }
# define AO_HAVE_char_compare_and_swap_acquire
#endif
#if defined(AO_HAVE_char_compare_and_swap) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_char_compare_and_swap_release)
# define AO_char_compare_and_swap_release(addr, old, new_val) \
                (AO_nop_full(), AO_char_compare_and_swap(addr, old, new_val))
# define AO_HAVE_char_compare_and_swap_release
#endif

#if defined(AO_HAVE_char_compare_and_swap_full)
# if !defined(AO_HAVE_char_compare_and_swap_release)
#   define AO_char_compare_and_swap_release(addr, old, new_val) \
                AO_char_compare_and_swap_full(addr, old, new_val)
#   define AO_HAVE_char_compare_and_swap_release
# endif
# if !defined(AO_HAVE_char_compare_and_swap_acquire)
#   define AO_char_compare_and_swap_acquire(addr, old, new_val) \
                AO_char_compare_and_swap_full(addr, old, new_val)
#   define AO_HAVE_char_compare_and_swap_acquire
# endif
# if !defined(AO_HAVE_char_compare_and_swap_write)
#   define AO_char_compare_and_swap_write(addr, old, new_val) \
                AO_char_compare_and_swap_full(addr, old, new_val)
#   define AO_HAVE_char_compare_and_swap_write
# endif
# if !defined(AO_HAVE_char_compare_and_swap_read)
#   define AO_char_compare_and_swap_read(addr, old, new_val) \
                AO_char_compare_and_swap_full(addr, old, new_val)
#   define AO_HAVE_char_compare_and_swap_read
# endif
#endif /* AO_HAVE_char_compare_and_swap_full */

#if !defined(AO_HAVE_char_compare_and_swap) \
    && defined(AO_HAVE_char_compare_and_swap_release)
# define AO_char_compare_and_swap(addr, old, new_val) \
                AO_char_compare_and_swap_release(addr, old, new_val)
# define AO_HAVE_char_compare_and_swap
#endif
#if !defined(AO_HAVE_char_compare_and_swap) \
    && defined(AO_HAVE_char_compare_and_swap_acquire)
# define AO_char_compare_and_swap(addr, old, new_val) \
                AO_char_compare_and_swap_acquire(addr, old, new_val)
# define AO_HAVE_char_compare_and_swap
#endif
#if !defined(AO_HAVE_char_compare_and_swap) \
    && defined(AO_HAVE_char_compare_and_swap_write)
# define AO_char_compare_and_swap(addr, old, new_val) \
                AO_char_compare_and_swap_write(addr, old, new_val)
# define AO_HAVE_char_compare_and_swap
#endif
#if !defined(AO_HAVE_char_compare_and_swap) \
    && defined(AO_HAVE_char_compare_and_swap_read)
# define AO_char_compare_and_swap(addr, old, new_val) \
                AO_char_compare_and_swap_read(addr, old, new_val)
# define AO_HAVE_char_compare_and_swap
#endif

#if defined(AO_HAVE_char_compare_and_swap_acquire) \
    && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_char_compare_and_swap_full)
# define AO_char_compare_and_swap_full(addr, old, new_val) \
                (AO_nop_full(), \
                 AO_char_compare_and_swap_acquire(addr, old, new_val))
# define AO_HAVE_char_compare_and_swap_full
#endif

#if !defined(AO_HAVE_char_compare_and_swap_release_write) \
    && defined(AO_HAVE_char_compare_and_swap_write)
# define AO_char_compare_and_swap_release_write(addr, old, new_val) \
                AO_char_compare_and_swap_write(addr, old, new_val)
# define AO_HAVE_char_compare_and_swap_release_write
#endif
#if !defined(AO_HAVE_char_compare_and_swap_release_write) \
    && defined(AO_HAVE_char_compare_and_swap_release)
# define AO_char_compare_and_swap_release_write(addr, old, new_val) \
                AO_char_compare_and_swap_release(addr, old, new_val)
# define AO_HAVE_char_compare_and_swap_release_write
#endif
#if !defined(AO_HAVE_char_compare_and_swap_acquire_read) \
    && defined(AO_HAVE_char_compare_and_swap_read)
# define AO_char_compare_and_swap_acquire_read(addr, old, new_val) \
                AO_char_compare_and_swap_read(addr, old, new_val)
# define AO_HAVE_char_compare_and_swap_acquire_read
#endif
#if !defined(AO_HAVE_char_compare_and_swap_acquire_read) \
    && defined(AO_HAVE_char_compare_and_swap_acquire)
# define AO_char_compare_and_swap_acquire_read(addr, old, new_val) \
                AO_char_compare_and_swap_acquire(addr, old, new_val)
# define AO_HAVE_char_compare_and_swap_acquire_read
#endif

#ifdef AO_NO_DD_ORDERING
# if defined(AO_HAVE_char_compare_and_swap_acquire_read)
#   define AO_char_compare_and_swap_dd_acquire_read(addr, old, new_val) \
                AO_char_compare_and_swap_acquire_read(addr, old, new_val)
#   define AO_HAVE_char_compare_and_swap_dd_acquire_read
# endif
#else
# if defined(AO_HAVE_char_compare_and_swap)
#   define AO_char_compare_and_swap_dd_acquire_read(addr, old, new_val) \
                AO_char_compare_and_swap(addr, old, new_val)
#   define AO_HAVE_char_compare_and_swap_dd_acquire_read
# endif
#endif /* !AO_NO_DD_ORDERING */

/* char_load */
#if defined(AO_HAVE_char_load_full) && !defined(AO_HAVE_char_load_acquire)
# define AO_char_load_acquire(addr) AO_char_load_full(addr)
# define AO_HAVE_char_load_acquire
#endif
#if defined(AO_HAVE_char_load_acquire) && !defined(AO_HAVE_char_load)
# define AO_char_load(addr) AO_char_load_acquire(addr)
# define AO_HAVE_char_load
#endif
#if defined(AO_HAVE_char_load_full) && !defined(AO_HAVE_char_load_read)
# define AO_char_load_read(addr) AO_char_load_full(addr)
# define AO_HAVE_char_load_read
#endif
#if !defined(AO_HAVE_char_load_acquire_read) \
    && defined(AO_HAVE_char_load_acquire)
# define AO_char_load_acquire_read(addr) AO_char_load_acquire(addr)
# define AO_HAVE_char_load_acquire_read
#endif

#if defined(AO_HAVE_char_load) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_char_load_acquire)
  AO_INLINE unsigned/**/char
  AO_char_load_acquire(const volatile unsigned/**/char *addr)
  {
    unsigned/**/char result = AO_char_load(addr);

    /* Acquire barrier would be useless, since the load could be delayed */
    /* beyond it. */
    AO_nop_full();
    return result;
  }
# define AO_HAVE_char_load_acquire
#endif

#if defined(AO_HAVE_char_load) && defined(AO_HAVE_nop_read) \
    && !defined(AO_HAVE_char_load_read)
  AO_INLINE unsigned/**/char
  AO_char_load_read(const volatile unsigned/**/char *addr)
  {
    unsigned/**/char result = AO_char_load(addr);

    AO_nop_read();
    return result;
  }
# define AO_HAVE_char_load_read
#endif

#if defined(AO_HAVE_char_load_acquire) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_char_load_full)
# define AO_char_load_full(addr) (AO_nop_full(), AO_char_load_acquire(addr))
# define AO_HAVE_char_load_full
#endif

#if defined(AO_HAVE_char_compare_and_swap_read) \
    && !defined(AO_HAVE_char_load_read)
# define AO_char_CAS_BASED_LOAD_READ
  AO_ATTR_NO_SANITIZE_THREAD
  AO_INLINE unsigned/**/char
  AO_char_load_read(const volatile unsigned/**/char *addr)
  {
    unsigned/**/char result;

    do {
      result = *(const unsigned/**/char *)addr;
    } while (AO_EXPECT_FALSE(!AO_char_compare_and_swap_read(
                                (volatile unsigned/**/char *)addr,
                                result, result)));
    return result;
  }
# define AO_HAVE_char_load_read
#endif

#if !defined(AO_HAVE_char_load_acquire_read) \
    && defined(AO_HAVE_char_load_read)
# define AO_char_load_acquire_read(addr) AO_char_load_read(addr)
# define AO_HAVE_char_load_acquire_read
#endif
#if defined(AO_HAVE_char_load_acquire_read) && !defined(AO_HAVE_char_load) \
    && (!defined(AO_char_CAS_BASED_LOAD_READ) \
        || !defined(AO_HAVE_char_compare_and_swap))
# define AO_char_load(addr) AO_char_load_acquire_read(addr)
# define AO_HAVE_char_load
#endif

#if defined(AO_HAVE_char_compare_and_swap_full) \
    && !defined(AO_HAVE_char_load_full)
  AO_ATTR_NO_SANITIZE_THREAD
  AO_INLINE unsigned/**/char
  AO_char_load_full(const volatile unsigned/**/char *addr)
  {
    unsigned/**/char result;

    do {
      result = *(const unsigned/**/char *)addr;
    } while (AO_EXPECT_FALSE(!AO_char_compare_and_swap_full(
                                (volatile unsigned/**/char *)addr,
                                result, result)));
    return result;
  }
# define AO_HAVE_char_load_full
#endif

#if defined(AO_HAVE_char_compare_and_swap_acquire) \
    && !defined(AO_HAVE_char_load_acquire)
  AO_ATTR_NO_SANITIZE_THREAD
  AO_INLINE unsigned/**/char
  AO_char_load_acquire(const volatile unsigned/**/char *addr)
  {
    unsigned/**/char result;

    do {
      result = *(const unsigned/**/char *)addr;
    } while (AO_EXPECT_FALSE(!AO_char_compare_and_swap_acquire(
                                (volatile unsigned/**/char *)addr,
                                result, result)));
    return result;
  }
# define AO_HAVE_char_load_acquire
#endif

#if defined(AO_HAVE_char_compare_and_swap) && !defined(AO_HAVE_char_load)
  AO_ATTR_NO_SANITIZE_THREAD
  AO_INLINE unsigned/**/char
  AO_char_load(const volatile unsigned/**/char *addr)
  {
    unsigned/**/char result;

    do {
      result = *(const unsigned/**/char *)addr;
    } while (AO_EXPECT_FALSE(!AO_char_compare_and_swap(
                                (volatile unsigned/**/char *)addr,
                                result, result)));
    return result;
  }
# define AO_HAVE_char_load
#endif

#ifdef AO_NO_DD_ORDERING
# if defined(AO_HAVE_char_load_acquire_read)
#   define AO_char_load_dd_acquire_read(addr) \
                AO_char_load_acquire_read(addr)
#   define AO_HAVE_char_load_dd_acquire_read
# endif
#else
# if defined(AO_HAVE_char_load)
#   define AO_char_load_dd_acquire_read(addr) AO_char_load(addr)
#   define AO_HAVE_char_load_dd_acquire_read
# endif
#endif /* !AO_NO_DD_ORDERING */

/* char_store */
#if defined(AO_HAVE_char_store_full) && !defined(AO_HAVE_char_store_release)
# define AO_char_store_release(addr, val) AO_char_store_full(addr, val)
# define AO_HAVE_char_store_release
#endif
#if defined(AO_HAVE_char_store_release) && !defined(AO_HAVE_char_store)
# define AO_char_store(addr, val) AO_char_store_release(addr, val)
# define AO_HAVE_char_store
#endif
#if defined(AO_HAVE_char_store_full) && !defined(AO_HAVE_char_store_write)
# define AO_char_store_write(addr, val) AO_char_store_full(addr, val)
# define AO_HAVE_char_store_write
#endif
#if defined(AO_HAVE_char_store_release) \
    && !defined(AO_HAVE_char_store_release_write)
# define AO_char_store_release_write(addr, val) \
                AO_char_store_release(addr, val)
# define AO_HAVE_char_store_release_write
#endif
#if defined(AO_HAVE_char_store_write) && !defined(AO_HAVE_char_store)
# define AO_char_store(addr, val) AO_char_store_write(addr, val)
# define AO_HAVE_char_store
#endif

#if defined(AO_HAVE_char_store) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_char_store_release)
# define AO_char_store_release(addr, val) \
                (AO_nop_full(), AO_char_store(addr, val))
# define AO_HAVE_char_store_release
#endif
#if defined(AO_HAVE_char_store) && defined(AO_HAVE_nop_write) \
    && !defined(AO_HAVE_char_store_write)
# define AO_char_store_write(addr, val) \
                (AO_nop_write(), AO_char_store(addr, val))
# define AO_HAVE_char_store_write
#endif

#if defined(AO_HAVE_char_compare_and_swap_write) \
    && !defined(AO_HAVE_char_store_write)
  AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD
  AO_INLINE void
  AO_char_store_write(volatile unsigned/**/char *addr,
                      unsigned/**/char new_val)
  {
    unsigned/**/char old_val;

    do {
      old_val = *(unsigned/**/char *)addr;
    } while (AO_EXPECT_FALSE(!AO_char_compare_and_swap_write(addr, old_val,
                                                             new_val)));
  }
# define AO_HAVE_char_store_write
#endif

#if defined(AO_HAVE_char_store_write) \
    && !defined(AO_HAVE_char_store_release_write)
# define AO_char_store_release_write(addr, val) \
                AO_char_store_write(addr, val)
# define AO_HAVE_char_store_release_write
#endif

#if defined(AO_HAVE_char_store_release) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_char_store_full)
# define AO_char_store_full(addr, val) \
                (AO_char_store_release(addr, val), \
                 AO_nop_full())
# define AO_HAVE_char_store_full
#endif

#if defined(AO_HAVE_char_compare_and_swap) && !defined(AO_HAVE_char_store)
  AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD
  AO_INLINE void
  AO_char_store(volatile unsigned/**/char *addr, unsigned/**/char new_val)
  {
    unsigned/**/char old_val;

    do {
      old_val = *(unsigned/**/char *)addr;
    } while (AO_EXPECT_FALSE(!AO_char_compare_and_swap(addr, old_val,
                                                       new_val)));
  }
# define AO_HAVE_char_store
#endif

#if defined(AO_HAVE_char_compare_and_swap_release) \
    && !defined(AO_HAVE_char_store_release)
  AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD
  AO_INLINE void
  AO_char_store_release(volatile unsigned/**/char *addr,
                        unsigned/**/char new_val)
  {
    unsigned/**/char old_val;

    do {
      old_val = *(unsigned/**/char *)addr;
    } while (AO_EXPECT_FALSE(!AO_char_compare_and_swap_release(addr, old_val,
                                                               new_val)));
  }
# define AO_HAVE_char_store_release
#endif

#if defined(AO_HAVE_char_compare_and_swap_full) \
    && !defined(AO_HAVE_char_store_full)
  AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD
  AO_INLINE void
  AO_char_store_full(volatile unsigned/**/char *addr,
                     unsigned/**/char new_val)
  {
    unsigned/**/char old_val;

    do {
      old_val = *(unsigned/**/char *)addr;
    } while (AO_EXPECT_FALSE(!AO_char_compare_and_swap_full(addr, old_val,
                                                            new_val)));
  }
# define AO_HAVE_char_store_full
#endif

/*
 * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

/* short_fetch_compare_and_swap */
#if defined(AO_HAVE_short_fetch_compare_and_swap) \
    && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_short_fetch_compare_and_swap_acquire)
  AO_INLINE unsigned/**/short
  AO_short_fetch_compare_and_swap_acquire(volatile unsigned/**/short *addr,
                                          unsigned/**/short old_val,
                                          unsigned/**/short new_val)
  {
    unsigned/**/short result = AO_short_fetch_compare_and_swap(addr, old_val,
                                                               new_val);
    AO_nop_full();
    return result;
  }
# define AO_HAVE_short_fetch_compare_and_swap_acquire
#endif
#if defined(AO_HAVE_short_fetch_compare_and_swap) \
    && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_short_fetch_compare_and_swap_release)
# define AO_short_fetch_compare_and_swap_release(addr, old_val, new_val) \
                (AO_nop_full(), \
                 AO_short_fetch_compare_and_swap(addr, old_val, new_val))
# define AO_HAVE_short_fetch_compare_and_swap_release
#endif

#if defined(AO_HAVE_short_fetch_compare_and_swap_full)
# if !defined(AO_HAVE_short_fetch_compare_and_swap_release)
#   define AO_short_fetch_compare_and_swap_release(addr, old_val, new_val) \
                AO_short_fetch_compare_and_swap_full(addr, old_val, new_val)
#   define AO_HAVE_short_fetch_compare_and_swap_release
# endif
# if !defined(AO_HAVE_short_fetch_compare_and_swap_acquire)
#   define AO_short_fetch_compare_and_swap_acquire(addr, old_val, new_val) \
                AO_short_fetch_compare_and_swap_full(addr, old_val, new_val)
#   define AO_HAVE_short_fetch_compare_and_swap_acquire
# endif
# if !defined(AO_HAVE_short_fetch_compare_and_swap_write)
#   define AO_short_fetch_compare_and_swap_write(addr, old_val, new_val) \
                AO_short_fetch_compare_and_swap_full(addr, old_val, new_val)
#   define AO_HAVE_short_fetch_compare_and_swap_write
# endif
# if !defined(AO_HAVE_short_fetch_compare_and_swap_read)
#   define AO_short_fetch_compare_and_swap_read(addr, old_val, new_val) \
                AO_short_fetch_compare_and_swap_full(addr, old_val, new_val)
#   define AO_HAVE_short_fetch_compare_and_swap_read
# endif
#endif /* AO_HAVE_short_fetch_compare_and_swap_full */

#if
!defined(AO_HAVE_short_fetch_compare_and_swap) \ && defined(AO_HAVE_short_fetch_compare_and_swap_release) # define AO_short_fetch_compare_and_swap(addr, old_val, new_val) \ AO_short_fetch_compare_and_swap_release(addr, old_val, new_val) # define AO_HAVE_short_fetch_compare_and_swap #endif #if !defined(AO_HAVE_short_fetch_compare_and_swap) \ && defined(AO_HAVE_short_fetch_compare_and_swap_acquire) # define AO_short_fetch_compare_and_swap(addr, old_val, new_val) \ AO_short_fetch_compare_and_swap_acquire(addr, old_val, new_val) # define AO_HAVE_short_fetch_compare_and_swap #endif #if !defined(AO_HAVE_short_fetch_compare_and_swap) \ && defined(AO_HAVE_short_fetch_compare_and_swap_write) # define AO_short_fetch_compare_and_swap(addr, old_val, new_val) \ AO_short_fetch_compare_and_swap_write(addr, old_val, new_val) # define AO_HAVE_short_fetch_compare_and_swap #endif #if !defined(AO_HAVE_short_fetch_compare_and_swap) \ && defined(AO_HAVE_short_fetch_compare_and_swap_read) # define AO_short_fetch_compare_and_swap(addr, old_val, new_val) \ AO_short_fetch_compare_and_swap_read(addr, old_val, new_val) # define AO_HAVE_short_fetch_compare_and_swap #endif #if defined(AO_HAVE_short_fetch_compare_and_swap_acquire) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_short_fetch_compare_and_swap_full) # define AO_short_fetch_compare_and_swap_full(addr, old_val, new_val) \ (AO_nop_full(), \ AO_short_fetch_compare_and_swap_acquire(addr, old_val, new_val)) # define AO_HAVE_short_fetch_compare_and_swap_full #endif #if !defined(AO_HAVE_short_fetch_compare_and_swap_release_write) \ && defined(AO_HAVE_short_fetch_compare_and_swap_write) # define AO_short_fetch_compare_and_swap_release_write(addr,old_val,new_val) \ AO_short_fetch_compare_and_swap_write(addr, old_val, new_val) # define AO_HAVE_short_fetch_compare_and_swap_release_write #endif #if !defined(AO_HAVE_short_fetch_compare_and_swap_release_write) \ && defined(AO_HAVE_short_fetch_compare_and_swap_release) # define 
AO_short_fetch_compare_and_swap_release_write(addr,old_val,new_val) \ AO_short_fetch_compare_and_swap_release(addr, old_val, new_val) # define AO_HAVE_short_fetch_compare_and_swap_release_write #endif #if !defined(AO_HAVE_short_fetch_compare_and_swap_acquire_read) \ && defined(AO_HAVE_short_fetch_compare_and_swap_read) # define AO_short_fetch_compare_and_swap_acquire_read(addr,old_val,new_val) \ AO_short_fetch_compare_and_swap_read(addr, old_val, new_val) # define AO_HAVE_short_fetch_compare_and_swap_acquire_read #endif #if !defined(AO_HAVE_short_fetch_compare_and_swap_acquire_read) \ && defined(AO_HAVE_short_fetch_compare_and_swap_acquire) # define AO_short_fetch_compare_and_swap_acquire_read(addr,old_val,new_val) \ AO_short_fetch_compare_and_swap_acquire(addr, old_val, new_val) # define AO_HAVE_short_fetch_compare_and_swap_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_short_fetch_compare_and_swap_acquire_read) # define AO_short_fetch_compare_and_swap_dd_acquire_read(addr,old_val,new_val) \ AO_short_fetch_compare_and_swap_acquire_read(addr, old_val, new_val) # define AO_HAVE_short_fetch_compare_and_swap_dd_acquire_read # endif #else # if defined(AO_HAVE_short_fetch_compare_and_swap) # define AO_short_fetch_compare_and_swap_dd_acquire_read(addr,old_val,new_val) \ AO_short_fetch_compare_and_swap(addr, old_val, new_val) # define AO_HAVE_short_fetch_compare_and_swap_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* short_compare_and_swap */ #if defined(AO_HAVE_short_compare_and_swap) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_short_compare_and_swap_acquire) AO_INLINE int AO_short_compare_and_swap_acquire(volatile unsigned/**/short *addr, unsigned/**/short old, unsigned/**/short new_val) { int result = AO_short_compare_and_swap(addr, old, new_val); AO_nop_full(); return result; } # define AO_HAVE_short_compare_and_swap_acquire #endif #if defined(AO_HAVE_short_compare_and_swap) && defined(AO_HAVE_nop_full) \ && 
!defined(AO_HAVE_short_compare_and_swap_release) # define AO_short_compare_and_swap_release(addr, old, new_val) \ (AO_nop_full(), AO_short_compare_and_swap(addr, old, new_val)) # define AO_HAVE_short_compare_and_swap_release #endif #if defined(AO_HAVE_short_compare_and_swap_full) # if !defined(AO_HAVE_short_compare_and_swap_release) # define AO_short_compare_and_swap_release(addr, old, new_val) \ AO_short_compare_and_swap_full(addr, old, new_val) # define AO_HAVE_short_compare_and_swap_release # endif # if !defined(AO_HAVE_short_compare_and_swap_acquire) # define AO_short_compare_and_swap_acquire(addr, old, new_val) \ AO_short_compare_and_swap_full(addr, old, new_val) # define AO_HAVE_short_compare_and_swap_acquire # endif # if !defined(AO_HAVE_short_compare_and_swap_write) # define AO_short_compare_and_swap_write(addr, old, new_val) \ AO_short_compare_and_swap_full(addr, old, new_val) # define AO_HAVE_short_compare_and_swap_write # endif # if !defined(AO_HAVE_short_compare_and_swap_read) # define AO_short_compare_and_swap_read(addr, old, new_val) \ AO_short_compare_and_swap_full(addr, old, new_val) # define AO_HAVE_short_compare_and_swap_read # endif #endif /* AO_HAVE_short_compare_and_swap_full */ #if !defined(AO_HAVE_short_compare_and_swap) \ && defined(AO_HAVE_short_compare_and_swap_release) # define AO_short_compare_and_swap(addr, old, new_val) \ AO_short_compare_and_swap_release(addr, old, new_val) # define AO_HAVE_short_compare_and_swap #endif #if !defined(AO_HAVE_short_compare_and_swap) \ && defined(AO_HAVE_short_compare_and_swap_acquire) # define AO_short_compare_and_swap(addr, old, new_val) \ AO_short_compare_and_swap_acquire(addr, old, new_val) # define AO_HAVE_short_compare_and_swap #endif #if !defined(AO_HAVE_short_compare_and_swap) \ && defined(AO_HAVE_short_compare_and_swap_write) # define AO_short_compare_and_swap(addr, old, new_val) \ AO_short_compare_and_swap_write(addr, old, new_val) # define AO_HAVE_short_compare_and_swap #endif #if 
!defined(AO_HAVE_short_compare_and_swap) \ && defined(AO_HAVE_short_compare_and_swap_read) # define AO_short_compare_and_swap(addr, old, new_val) \ AO_short_compare_and_swap_read(addr, old, new_val) # define AO_HAVE_short_compare_and_swap #endif #if defined(AO_HAVE_short_compare_and_swap_acquire) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_short_compare_and_swap_full) # define AO_short_compare_and_swap_full(addr, old, new_val) \ (AO_nop_full(), \ AO_short_compare_and_swap_acquire(addr, old, new_val)) # define AO_HAVE_short_compare_and_swap_full #endif #if !defined(AO_HAVE_short_compare_and_swap_release_write) \ && defined(AO_HAVE_short_compare_and_swap_write) # define AO_short_compare_and_swap_release_write(addr, old, new_val) \ AO_short_compare_and_swap_write(addr, old, new_val) # define AO_HAVE_short_compare_and_swap_release_write #endif #if !defined(AO_HAVE_short_compare_and_swap_release_write) \ && defined(AO_HAVE_short_compare_and_swap_release) # define AO_short_compare_and_swap_release_write(addr, old, new_val) \ AO_short_compare_and_swap_release(addr, old, new_val) # define AO_HAVE_short_compare_and_swap_release_write #endif #if !defined(AO_HAVE_short_compare_and_swap_acquire_read) \ && defined(AO_HAVE_short_compare_and_swap_read) # define AO_short_compare_and_swap_acquire_read(addr, old, new_val) \ AO_short_compare_and_swap_read(addr, old, new_val) # define AO_HAVE_short_compare_and_swap_acquire_read #endif #if !defined(AO_HAVE_short_compare_and_swap_acquire_read) \ && defined(AO_HAVE_short_compare_and_swap_acquire) # define AO_short_compare_and_swap_acquire_read(addr, old, new_val) \ AO_short_compare_and_swap_acquire(addr, old, new_val) # define AO_HAVE_short_compare_and_swap_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_short_compare_and_swap_acquire_read) # define AO_short_compare_and_swap_dd_acquire_read(addr, old, new_val) \ AO_short_compare_and_swap_acquire_read(addr, old, new_val) # define 
AO_HAVE_short_compare_and_swap_dd_acquire_read # endif #else # if defined(AO_HAVE_short_compare_and_swap) # define AO_short_compare_and_swap_dd_acquire_read(addr, old, new_val) \ AO_short_compare_and_swap(addr, old, new_val) # define AO_HAVE_short_compare_and_swap_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* short_load */ #if defined(AO_HAVE_short_load_full) && !defined(AO_HAVE_short_load_acquire) # define AO_short_load_acquire(addr) AO_short_load_full(addr) # define AO_HAVE_short_load_acquire #endif #if defined(AO_HAVE_short_load_acquire) && !defined(AO_HAVE_short_load) # define AO_short_load(addr) AO_short_load_acquire(addr) # define AO_HAVE_short_load #endif #if defined(AO_HAVE_short_load_full) && !defined(AO_HAVE_short_load_read) # define AO_short_load_read(addr) AO_short_load_full(addr) # define AO_HAVE_short_load_read #endif #if !defined(AO_HAVE_short_load_acquire_read) \ && defined(AO_HAVE_short_load_acquire) # define AO_short_load_acquire_read(addr) AO_short_load_acquire(addr) # define AO_HAVE_short_load_acquire_read #endif #if defined(AO_HAVE_short_load) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_short_load_acquire) AO_INLINE unsigned/**/short AO_short_load_acquire(const volatile unsigned/**/short *addr) { unsigned/**/short result = AO_short_load(addr); /* Acquire barrier would be useless, since the load could be delayed */ /* beyond it. 
*/ AO_nop_full(); return result; } # define AO_HAVE_short_load_acquire #endif #if defined(AO_HAVE_short_load) && defined(AO_HAVE_nop_read) \ && !defined(AO_HAVE_short_load_read) AO_INLINE unsigned/**/short AO_short_load_read(const volatile unsigned/**/short *addr) { unsigned/**/short result = AO_short_load(addr); AO_nop_read(); return result; } # define AO_HAVE_short_load_read #endif #if defined(AO_HAVE_short_load_acquire) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_short_load_full) # define AO_short_load_full(addr) (AO_nop_full(), AO_short_load_acquire(addr)) # define AO_HAVE_short_load_full #endif #if defined(AO_HAVE_short_compare_and_swap_read) \ && !defined(AO_HAVE_short_load_read) # define AO_short_CAS_BASED_LOAD_READ AO_ATTR_NO_SANITIZE_THREAD AO_INLINE unsigned/**/short AO_short_load_read(const volatile unsigned/**/short *addr) { unsigned/**/short result; do { result = *(const unsigned/**/short *)addr; } while (AO_EXPECT_FALSE(!AO_short_compare_and_swap_read( (volatile unsigned/**/short *)addr, result, result))); return result; } # define AO_HAVE_short_load_read #endif #if !defined(AO_HAVE_short_load_acquire_read) \ && defined(AO_HAVE_short_load_read) # define AO_short_load_acquire_read(addr) AO_short_load_read(addr) # define AO_HAVE_short_load_acquire_read #endif #if defined(AO_HAVE_short_load_acquire_read) && !defined(AO_HAVE_short_load) \ && (!defined(AO_short_CAS_BASED_LOAD_READ) \ || !defined(AO_HAVE_short_compare_and_swap)) # define AO_short_load(addr) AO_short_load_acquire_read(addr) # define AO_HAVE_short_load #endif #if defined(AO_HAVE_short_compare_and_swap_full) \ && !defined(AO_HAVE_short_load_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE unsigned/**/short AO_short_load_full(const volatile unsigned/**/short *addr) { unsigned/**/short result; do { result = *(const unsigned/**/short *)addr; } while (AO_EXPECT_FALSE(!AO_short_compare_and_swap_full( (volatile unsigned/**/short *)addr, result, result))); return result; } # define 
AO_HAVE_short_load_full #endif #if defined(AO_HAVE_short_compare_and_swap_acquire) \ && !defined(AO_HAVE_short_load_acquire) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE unsigned/**/short AO_short_load_acquire(const volatile unsigned/**/short *addr) { unsigned/**/short result; do { result = *(const unsigned/**/short *)addr; } while (AO_EXPECT_FALSE(!AO_short_compare_and_swap_acquire( (volatile unsigned/**/short *)addr, result, result))); return result; } # define AO_HAVE_short_load_acquire #endif #if defined(AO_HAVE_short_compare_and_swap) && !defined(AO_HAVE_short_load) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE unsigned/**/short AO_short_load(const volatile unsigned/**/short *addr) { unsigned/**/short result; do { result = *(const unsigned/**/short *)addr; } while (AO_EXPECT_FALSE(!AO_short_compare_and_swap( (volatile unsigned/**/short *)addr, result, result))); return result; } # define AO_HAVE_short_load #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_short_load_acquire_read) # define AO_short_load_dd_acquire_read(addr) \ AO_short_load_acquire_read(addr) # define AO_HAVE_short_load_dd_acquire_read # endif #else # if defined(AO_HAVE_short_load) # define AO_short_load_dd_acquire_read(addr) AO_short_load(addr) # define AO_HAVE_short_load_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* short_store */ #if defined(AO_HAVE_short_store_full) && !defined(AO_HAVE_short_store_release) # define AO_short_store_release(addr, val) AO_short_store_full(addr, val) # define AO_HAVE_short_store_release #endif #if defined(AO_HAVE_short_store_release) && !defined(AO_HAVE_short_store) # define AO_short_store(addr, val) AO_short_store_release(addr, val) # define AO_HAVE_short_store #endif #if defined(AO_HAVE_short_store_full) && !defined(AO_HAVE_short_store_write) # define AO_short_store_write(addr, val) AO_short_store_full(addr, val) # define AO_HAVE_short_store_write #endif #if defined(AO_HAVE_short_store_release) \ && !defined(AO_HAVE_short_store_release_write) # define 
AO_short_store_release_write(addr, val) \ AO_short_store_release(addr, val) # define AO_HAVE_short_store_release_write #endif #if defined(AO_HAVE_short_store_write) && !defined(AO_HAVE_short_store) # define AO_short_store(addr, val) AO_short_store_write(addr, val) # define AO_HAVE_short_store #endif #if defined(AO_HAVE_short_store) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_short_store_release) # define AO_short_store_release(addr, val) \ (AO_nop_full(), AO_short_store(addr, val)) # define AO_HAVE_short_store_release #endif #if defined(AO_HAVE_short_store) && defined(AO_HAVE_nop_write) \ && !defined(AO_HAVE_short_store_write) # define AO_short_store_write(addr, val) \ (AO_nop_write(), AO_short_store(addr, val)) # define AO_HAVE_short_store_write #endif #if defined(AO_HAVE_short_compare_and_swap_write) \ && !defined(AO_HAVE_short_store_write) AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_short_store_write(volatile unsigned/**/short *addr, unsigned/**/short new_val) { unsigned/**/short old_val; do { old_val = *(unsigned/**/short *)addr; } while (AO_EXPECT_FALSE(!AO_short_compare_and_swap_write(addr, old_val, new_val))); } # define AO_HAVE_short_store_write #endif #if defined(AO_HAVE_short_store_write) \ && !defined(AO_HAVE_short_store_release_write) # define AO_short_store_release_write(addr, val) \ AO_short_store_write(addr, val) # define AO_HAVE_short_store_release_write #endif #if defined(AO_HAVE_short_store_release) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_short_store_full) # define AO_short_store_full(addr, val) \ (AO_short_store_release(addr, val), \ AO_nop_full()) # define AO_HAVE_short_store_full #endif #if defined(AO_HAVE_short_compare_and_swap) && !defined(AO_HAVE_short_store) AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_short_store(volatile unsigned/**/short *addr, unsigned/**/short new_val) { unsigned/**/short old_val; do { old_val = *(unsigned/**/short *)addr; } while 
(AO_EXPECT_FALSE(!AO_short_compare_and_swap(addr, old_val, new_val))); } # define AO_HAVE_short_store #endif #if defined(AO_HAVE_short_compare_and_swap_release) \ && !defined(AO_HAVE_short_store_release) AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_short_store_release(volatile unsigned/**/short *addr, unsigned/**/short new_val) { unsigned/**/short old_val; do { old_val = *(unsigned/**/short *)addr; } while (AO_EXPECT_FALSE(!AO_short_compare_and_swap_release(addr, old_val, new_val))); } # define AO_HAVE_short_store_release #endif #if defined(AO_HAVE_short_compare_and_swap_full) \ && !defined(AO_HAVE_short_store_full) AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_short_store_full(volatile unsigned/**/short *addr, unsigned/**/short new_val) { unsigned/**/short old_val; do { old_val = *(unsigned/**/short *)addr; } while (AO_EXPECT_FALSE(!AO_short_compare_and_swap_full(addr, old_val, new_val))); } # define AO_HAVE_short_store_full #endif /* * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* int_fetch_compare_and_swap */ #if defined(AO_HAVE_int_fetch_compare_and_swap) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_int_fetch_compare_and_swap_acquire) AO_INLINE unsigned AO_int_fetch_compare_and_swap_acquire(volatile unsigned *addr, unsigned old_val, unsigned new_val) { unsigned result = AO_int_fetch_compare_and_swap(addr, old_val, new_val); AO_nop_full(); return result; } # define AO_HAVE_int_fetch_compare_and_swap_acquire #endif #if defined(AO_HAVE_int_fetch_compare_and_swap) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_int_fetch_compare_and_swap_release) # define AO_int_fetch_compare_and_swap_release(addr, old_val, new_val) \ (AO_nop_full(), \ AO_int_fetch_compare_and_swap(addr, old_val, new_val)) # define AO_HAVE_int_fetch_compare_and_swap_release #endif #if defined(AO_HAVE_int_fetch_compare_and_swap_full) # if !defined(AO_HAVE_int_fetch_compare_and_swap_release) # define AO_int_fetch_compare_and_swap_release(addr, old_val, new_val) \ AO_int_fetch_compare_and_swap_full(addr, old_val, new_val) # define AO_HAVE_int_fetch_compare_and_swap_release # endif # if !defined(AO_HAVE_int_fetch_compare_and_swap_acquire) # define AO_int_fetch_compare_and_swap_acquire(addr, old_val, new_val) \ AO_int_fetch_compare_and_swap_full(addr, old_val, new_val) # define AO_HAVE_int_fetch_compare_and_swap_acquire # endif # if !defined(AO_HAVE_int_fetch_compare_and_swap_write) # define AO_int_fetch_compare_and_swap_write(addr, old_val, new_val) \ AO_int_fetch_compare_and_swap_full(addr, old_val, new_val) # define AO_HAVE_int_fetch_compare_and_swap_write # endif # if !defined(AO_HAVE_int_fetch_compare_and_swap_read) # define AO_int_fetch_compare_and_swap_read(addr, old_val, new_val) \ 
AO_int_fetch_compare_and_swap_full(addr, old_val, new_val) # define AO_HAVE_int_fetch_compare_and_swap_read # endif #endif /* AO_HAVE_int_fetch_compare_and_swap_full */ #if !defined(AO_HAVE_int_fetch_compare_and_swap) \ && defined(AO_HAVE_int_fetch_compare_and_swap_release) # define AO_int_fetch_compare_and_swap(addr, old_val, new_val) \ AO_int_fetch_compare_and_swap_release(addr, old_val, new_val) # define AO_HAVE_int_fetch_compare_and_swap #endif #if !defined(AO_HAVE_int_fetch_compare_and_swap) \ && defined(AO_HAVE_int_fetch_compare_and_swap_acquire) # define AO_int_fetch_compare_and_swap(addr, old_val, new_val) \ AO_int_fetch_compare_and_swap_acquire(addr, old_val, new_val) # define AO_HAVE_int_fetch_compare_and_swap #endif #if !defined(AO_HAVE_int_fetch_compare_and_swap) \ && defined(AO_HAVE_int_fetch_compare_and_swap_write) # define AO_int_fetch_compare_and_swap(addr, old_val, new_val) \ AO_int_fetch_compare_and_swap_write(addr, old_val, new_val) # define AO_HAVE_int_fetch_compare_and_swap #endif #if !defined(AO_HAVE_int_fetch_compare_and_swap) \ && defined(AO_HAVE_int_fetch_compare_and_swap_read) # define AO_int_fetch_compare_and_swap(addr, old_val, new_val) \ AO_int_fetch_compare_and_swap_read(addr, old_val, new_val) # define AO_HAVE_int_fetch_compare_and_swap #endif #if defined(AO_HAVE_int_fetch_compare_and_swap_acquire) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_int_fetch_compare_and_swap_full) # define AO_int_fetch_compare_and_swap_full(addr, old_val, new_val) \ (AO_nop_full(), \ AO_int_fetch_compare_and_swap_acquire(addr, old_val, new_val)) # define AO_HAVE_int_fetch_compare_and_swap_full #endif #if !defined(AO_HAVE_int_fetch_compare_and_swap_release_write) \ && defined(AO_HAVE_int_fetch_compare_and_swap_write) # define AO_int_fetch_compare_and_swap_release_write(addr,old_val,new_val) \ AO_int_fetch_compare_and_swap_write(addr, old_val, new_val) # define AO_HAVE_int_fetch_compare_and_swap_release_write #endif #if 
!defined(AO_HAVE_int_fetch_compare_and_swap_release_write) \ && defined(AO_HAVE_int_fetch_compare_and_swap_release) # define AO_int_fetch_compare_and_swap_release_write(addr,old_val,new_val) \ AO_int_fetch_compare_and_swap_release(addr, old_val, new_val) # define AO_HAVE_int_fetch_compare_and_swap_release_write #endif #if !defined(AO_HAVE_int_fetch_compare_and_swap_acquire_read) \ && defined(AO_HAVE_int_fetch_compare_and_swap_read) # define AO_int_fetch_compare_and_swap_acquire_read(addr,old_val,new_val) \ AO_int_fetch_compare_and_swap_read(addr, old_val, new_val) # define AO_HAVE_int_fetch_compare_and_swap_acquire_read #endif #if !defined(AO_HAVE_int_fetch_compare_and_swap_acquire_read) \ && defined(AO_HAVE_int_fetch_compare_and_swap_acquire) # define AO_int_fetch_compare_and_swap_acquire_read(addr,old_val,new_val) \ AO_int_fetch_compare_and_swap_acquire(addr, old_val, new_val) # define AO_HAVE_int_fetch_compare_and_swap_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_int_fetch_compare_and_swap_acquire_read) # define AO_int_fetch_compare_and_swap_dd_acquire_read(addr,old_val,new_val) \ AO_int_fetch_compare_and_swap_acquire_read(addr, old_val, new_val) # define AO_HAVE_int_fetch_compare_and_swap_dd_acquire_read # endif #else # if defined(AO_HAVE_int_fetch_compare_and_swap) # define AO_int_fetch_compare_and_swap_dd_acquire_read(addr,old_val,new_val) \ AO_int_fetch_compare_and_swap(addr, old_val, new_val) # define AO_HAVE_int_fetch_compare_and_swap_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* int_compare_and_swap */ #if defined(AO_HAVE_int_compare_and_swap) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_int_compare_and_swap_acquire) AO_INLINE int AO_int_compare_and_swap_acquire(volatile unsigned *addr, unsigned old, unsigned new_val) { int result = AO_int_compare_and_swap(addr, old, new_val); AO_nop_full(); return result; } # define AO_HAVE_int_compare_and_swap_acquire #endif #if defined(AO_HAVE_int_compare_and_swap) && 
defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_int_compare_and_swap_release) # define AO_int_compare_and_swap_release(addr, old, new_val) \ (AO_nop_full(), AO_int_compare_and_swap(addr, old, new_val)) # define AO_HAVE_int_compare_and_swap_release #endif #if defined(AO_HAVE_int_compare_and_swap_full) # if !defined(AO_HAVE_int_compare_and_swap_release) # define AO_int_compare_and_swap_release(addr, old, new_val) \ AO_int_compare_and_swap_full(addr, old, new_val) # define AO_HAVE_int_compare_and_swap_release # endif # if !defined(AO_HAVE_int_compare_and_swap_acquire) # define AO_int_compare_and_swap_acquire(addr, old, new_val) \ AO_int_compare_and_swap_full(addr, old, new_val) # define AO_HAVE_int_compare_and_swap_acquire # endif # if !defined(AO_HAVE_int_compare_and_swap_write) # define AO_int_compare_and_swap_write(addr, old, new_val) \ AO_int_compare_and_swap_full(addr, old, new_val) # define AO_HAVE_int_compare_and_swap_write # endif # if !defined(AO_HAVE_int_compare_and_swap_read) # define AO_int_compare_and_swap_read(addr, old, new_val) \ AO_int_compare_and_swap_full(addr, old, new_val) # define AO_HAVE_int_compare_and_swap_read # endif #endif /* AO_HAVE_int_compare_and_swap_full */ #if !defined(AO_HAVE_int_compare_and_swap) \ && defined(AO_HAVE_int_compare_and_swap_release) # define AO_int_compare_and_swap(addr, old, new_val) \ AO_int_compare_and_swap_release(addr, old, new_val) # define AO_HAVE_int_compare_and_swap #endif #if !defined(AO_HAVE_int_compare_and_swap) \ && defined(AO_HAVE_int_compare_and_swap_acquire) # define AO_int_compare_and_swap(addr, old, new_val) \ AO_int_compare_and_swap_acquire(addr, old, new_val) # define AO_HAVE_int_compare_and_swap #endif #if !defined(AO_HAVE_int_compare_and_swap) \ && defined(AO_HAVE_int_compare_and_swap_write) # define AO_int_compare_and_swap(addr, old, new_val) \ AO_int_compare_and_swap_write(addr, old, new_val) # define AO_HAVE_int_compare_and_swap #endif #if !defined(AO_HAVE_int_compare_and_swap) \ && 
defined(AO_HAVE_int_compare_and_swap_read)
# define AO_int_compare_and_swap(addr, old, new_val) AO_int_compare_and_swap_read(addr, old, new_val)
# define AO_HAVE_int_compare_and_swap
#endif

#if defined(AO_HAVE_int_compare_and_swap_acquire) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_int_compare_and_swap_full)
# define AO_int_compare_and_swap_full(addr, old, new_val) (AO_nop_full(), AO_int_compare_and_swap_acquire(addr, old, new_val))
# define AO_HAVE_int_compare_and_swap_full
#endif

#if !defined(AO_HAVE_int_compare_and_swap_release_write) && defined(AO_HAVE_int_compare_and_swap_write)
# define AO_int_compare_and_swap_release_write(addr, old, new_val) AO_int_compare_and_swap_write(addr, old, new_val)
# define AO_HAVE_int_compare_and_swap_release_write
#endif

#if !defined(AO_HAVE_int_compare_and_swap_release_write) && defined(AO_HAVE_int_compare_and_swap_release)
# define AO_int_compare_and_swap_release_write(addr, old, new_val) AO_int_compare_and_swap_release(addr, old, new_val)
# define AO_HAVE_int_compare_and_swap_release_write
#endif

#if !defined(AO_HAVE_int_compare_and_swap_acquire_read) && defined(AO_HAVE_int_compare_and_swap_read)
# define AO_int_compare_and_swap_acquire_read(addr, old, new_val) AO_int_compare_and_swap_read(addr, old, new_val)
# define AO_HAVE_int_compare_and_swap_acquire_read
#endif

#if !defined(AO_HAVE_int_compare_and_swap_acquire_read) && defined(AO_HAVE_int_compare_and_swap_acquire)
# define AO_int_compare_and_swap_acquire_read(addr, old, new_val) AO_int_compare_and_swap_acquire(addr, old, new_val)
# define AO_HAVE_int_compare_and_swap_acquire_read
#endif

#ifdef AO_NO_DD_ORDERING
# if defined(AO_HAVE_int_compare_and_swap_acquire_read)
# define AO_int_compare_and_swap_dd_acquire_read(addr, old, new_val) AO_int_compare_and_swap_acquire_read(addr, old, new_val)
# define AO_HAVE_int_compare_and_swap_dd_acquire_read
# endif
#else
# if defined(AO_HAVE_int_compare_and_swap)
# define AO_int_compare_and_swap_dd_acquire_read(addr, old, new_val) AO_int_compare_and_swap(addr, old, new_val)
# define AO_HAVE_int_compare_and_swap_dd_acquire_read
# endif
#endif /* !AO_NO_DD_ORDERING */

/* int_load */
#if defined(AO_HAVE_int_load_full) && !defined(AO_HAVE_int_load_acquire)
# define AO_int_load_acquire(addr) AO_int_load_full(addr)
# define AO_HAVE_int_load_acquire
#endif

#if defined(AO_HAVE_int_load_acquire) && !defined(AO_HAVE_int_load)
# define AO_int_load(addr) AO_int_load_acquire(addr)
# define AO_HAVE_int_load
#endif

#if defined(AO_HAVE_int_load_full) && !defined(AO_HAVE_int_load_read)
# define AO_int_load_read(addr) AO_int_load_full(addr)
# define AO_HAVE_int_load_read
#endif

#if !defined(AO_HAVE_int_load_acquire_read) && defined(AO_HAVE_int_load_acquire)
# define AO_int_load_acquire_read(addr) AO_int_load_acquire(addr)
# define AO_HAVE_int_load_acquire_read
#endif

#if defined(AO_HAVE_int_load) && defined(AO_HAVE_nop_full) && !defined(AO_HAVE_int_load_acquire)
AO_INLINE unsigned
AO_int_load_acquire(const volatile unsigned *addr)
{
  unsigned result = AO_int_load(addr);

  /* Acquire barrier would be useless, since the load could be delayed */
  /* beyond it. */
  AO_nop_full();
  return result;
}
# define AO_HAVE_int_load_acquire
#endif

#if defined(AO_HAVE_int_load) && defined(AO_HAVE_nop_read) && !defined(AO_HAVE_int_load_read)
AO_INLINE unsigned
AO_int_load_read(const volatile unsigned *addr)
{
  unsigned result = AO_int_load(addr);

  AO_nop_read();
  return result;
}
# define AO_HAVE_int_load_read
#endif

#if defined(AO_HAVE_int_load_acquire) && defined(AO_HAVE_nop_full) && !defined(AO_HAVE_int_load_full)
# define AO_int_load_full(addr) (AO_nop_full(), AO_int_load_acquire(addr))
# define AO_HAVE_int_load_full
#endif

#if defined(AO_HAVE_int_compare_and_swap_read) && !defined(AO_HAVE_int_load_read)
# define AO_int_CAS_BASED_LOAD_READ
AO_ATTR_NO_SANITIZE_THREAD
AO_INLINE unsigned
AO_int_load_read(const volatile unsigned *addr)
{
  unsigned result;

  do {
    result = *(const unsigned *)addr;
  } while (AO_EXPECT_FALSE(!AO_int_compare_and_swap_read((volatile unsigned *)addr, result, result)));
  return result;
}
# define AO_HAVE_int_load_read
#endif

#if !defined(AO_HAVE_int_load_acquire_read) && defined(AO_HAVE_int_load_read)
# define AO_int_load_acquire_read(addr) AO_int_load_read(addr)
# define AO_HAVE_int_load_acquire_read
#endif

#if defined(AO_HAVE_int_load_acquire_read) && !defined(AO_HAVE_int_load) \
    && (!defined(AO_int_CAS_BASED_LOAD_READ) || !defined(AO_HAVE_int_compare_and_swap))
# define AO_int_load(addr) AO_int_load_acquire_read(addr)
# define AO_HAVE_int_load
#endif

#if defined(AO_HAVE_int_compare_and_swap_full) && !defined(AO_HAVE_int_load_full)
AO_ATTR_NO_SANITIZE_THREAD
AO_INLINE unsigned
AO_int_load_full(const volatile unsigned *addr)
{
  unsigned result;

  do {
    result = *(const unsigned *)addr;
  } while (AO_EXPECT_FALSE(!AO_int_compare_and_swap_full((volatile unsigned *)addr, result, result)));
  return result;
}
# define AO_HAVE_int_load_full
#endif

#if defined(AO_HAVE_int_compare_and_swap_acquire) && !defined(AO_HAVE_int_load_acquire)
AO_ATTR_NO_SANITIZE_THREAD
AO_INLINE unsigned
AO_int_load_acquire(const volatile
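/* Editorial note (illustrative, not part of the original libatomic_ops
 * header): the surrounding blocks emulate a plain atomic load on targets
 * that only provide compare-and-swap.  The value is read non-atomically
 * and then confirmed by a CAS of the value against itself; the loop
 * retries if another thread modified *addr in between, e.g.
 *   do { result = *addr; }
 *   while (!AO_int_compare_and_swap_full(addr, result, result));
 */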
unsigned *addr)
{
  unsigned result;

  do {
    result = *(const unsigned *)addr;
  } while (AO_EXPECT_FALSE(!AO_int_compare_and_swap_acquire((volatile unsigned *)addr, result, result)));
  return result;
}
# define AO_HAVE_int_load_acquire
#endif

#if defined(AO_HAVE_int_compare_and_swap) && !defined(AO_HAVE_int_load)
AO_ATTR_NO_SANITIZE_THREAD
AO_INLINE unsigned
AO_int_load(const volatile unsigned *addr)
{
  unsigned result;

  do {
    result = *(const unsigned *)addr;
  } while (AO_EXPECT_FALSE(!AO_int_compare_and_swap((volatile unsigned *)addr, result, result)));
  return result;
}
# define AO_HAVE_int_load
#endif

#ifdef AO_NO_DD_ORDERING
# if defined(AO_HAVE_int_load_acquire_read)
# define AO_int_load_dd_acquire_read(addr) AO_int_load_acquire_read(addr)
# define AO_HAVE_int_load_dd_acquire_read
# endif
#else
# if defined(AO_HAVE_int_load)
# define AO_int_load_dd_acquire_read(addr) AO_int_load(addr)
# define AO_HAVE_int_load_dd_acquire_read
# endif
#endif /* !AO_NO_DD_ORDERING */

/* int_store */
#if defined(AO_HAVE_int_store_full) && !defined(AO_HAVE_int_store_release)
# define AO_int_store_release(addr, val) AO_int_store_full(addr, val)
# define AO_HAVE_int_store_release
#endif

#if defined(AO_HAVE_int_store_release) && !defined(AO_HAVE_int_store)
# define AO_int_store(addr, val) AO_int_store_release(addr, val)
# define AO_HAVE_int_store
#endif

#if defined(AO_HAVE_int_store_full) && !defined(AO_HAVE_int_store_write)
# define AO_int_store_write(addr, val) AO_int_store_full(addr, val)
# define AO_HAVE_int_store_write
#endif

#if defined(AO_HAVE_int_store_release) && !defined(AO_HAVE_int_store_release_write)
# define AO_int_store_release_write(addr, val) AO_int_store_release(addr, val)
# define AO_HAVE_int_store_release_write
#endif

#if defined(AO_HAVE_int_store_write) && !defined(AO_HAVE_int_store)
# define AO_int_store(addr, val) AO_int_store_write(addr, val)
# define AO_HAVE_int_store
#endif

#if defined(AO_HAVE_int_store) && defined(AO_HAVE_nop_full) && !defined(AO_HAVE_int_store_release)
# define AO_int_store_release(addr, val) (AO_nop_full(), AO_int_store(addr, val))
# define AO_HAVE_int_store_release
#endif

#if defined(AO_HAVE_int_store) && defined(AO_HAVE_nop_write) && !defined(AO_HAVE_int_store_write)
# define AO_int_store_write(addr, val) (AO_nop_write(), AO_int_store(addr, val))
# define AO_HAVE_int_store_write
#endif

#if defined(AO_HAVE_int_compare_and_swap_write) && !defined(AO_HAVE_int_store_write)
AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD
AO_INLINE void
AO_int_store_write(volatile unsigned *addr, unsigned new_val)
{
  unsigned old_val;

  do {
    old_val = *(unsigned *)addr;
  } while (AO_EXPECT_FALSE(!AO_int_compare_and_swap_write(addr, old_val, new_val)));
}
# define AO_HAVE_int_store_write
#endif

#if defined(AO_HAVE_int_store_write) && !defined(AO_HAVE_int_store_release_write)
# define AO_int_store_release_write(addr, val) AO_int_store_write(addr, val)
# define AO_HAVE_int_store_release_write
#endif

#if defined(AO_HAVE_int_store_release) && defined(AO_HAVE_nop_full) && !defined(AO_HAVE_int_store_full)
# define AO_int_store_full(addr, val) (AO_int_store_release(addr, val), AO_nop_full())
# define AO_HAVE_int_store_full
#endif

#if defined(AO_HAVE_int_compare_and_swap) && !defined(AO_HAVE_int_store)
AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD
AO_INLINE void
AO_int_store(volatile unsigned *addr, unsigned new_val)
{
  unsigned old_val;

  do {
    old_val = *(unsigned *)addr;
  } while (AO_EXPECT_FALSE(!AO_int_compare_and_swap(addr, old_val, new_val)));
}
# define AO_HAVE_int_store
#endif

#if defined(AO_HAVE_int_compare_and_swap_release) && !defined(AO_HAVE_int_store_release)
AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD
AO_INLINE void
AO_int_store_release(volatile unsigned *addr, unsigned new_val)
{
  unsigned old_val;

  do {
    old_val = *(unsigned *)addr;
  } while (AO_EXPECT_FALSE(!AO_int_compare_and_swap_release(addr, old_val, new_val)));
}
# define AO_HAVE_int_store_release
#endif

#if defined(AO_HAVE_int_compare_and_swap_full) && !defined(AO_HAVE_int_store_full)
AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD
AO_INLINE void
AO_int_store_full(volatile unsigned *addr, unsigned new_val)
{
  unsigned old_val;

  do {
    old_val = *(unsigned *)addr;
  } while (AO_EXPECT_FALSE(!AO_int_compare_and_swap_full(addr, old_val, new_val)));
}
# define AO_HAVE_int_store_full
#endif

/*
 * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

/* fetch_compare_and_swap */
#if defined(AO_HAVE_fetch_compare_and_swap) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_fetch_compare_and_swap_acquire)
AO_INLINE AO_t
AO_fetch_compare_and_swap_acquire(volatile AO_t *addr, AO_t old_val, AO_t new_val)
{
  AO_t result = AO_fetch_compare_and_swap(addr, old_val, new_val);
  AO_nop_full();
  return result;
}
# define AO_HAVE_fetch_compare_and_swap_acquire
#endif

#if defined(AO_HAVE_fetch_compare_and_swap) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_fetch_compare_and_swap_release)
# define AO_fetch_compare_and_swap_release(addr, old_val, new_val) (AO_nop_full(), AO_fetch_compare_and_swap(addr, old_val, new_val))
# define AO_HAVE_fetch_compare_and_swap_release
#endif

#if defined(AO_HAVE_fetch_compare_and_swap_full)
# if !defined(AO_HAVE_fetch_compare_and_swap_release)
# define AO_fetch_compare_and_swap_release(addr, old_val, new_val) AO_fetch_compare_and_swap_full(addr, old_val, new_val)
# define AO_HAVE_fetch_compare_and_swap_release
# endif
# if !defined(AO_HAVE_fetch_compare_and_swap_acquire)
# define AO_fetch_compare_and_swap_acquire(addr, old_val, new_val) AO_fetch_compare_and_swap_full(addr, old_val, new_val)
# define AO_HAVE_fetch_compare_and_swap_acquire
# endif
# if !defined(AO_HAVE_fetch_compare_and_swap_write)
# define AO_fetch_compare_and_swap_write(addr, old_val, new_val) AO_fetch_compare_and_swap_full(addr, old_val, new_val)
# define AO_HAVE_fetch_compare_and_swap_write
# endif
# if !defined(AO_HAVE_fetch_compare_and_swap_read)
# define AO_fetch_compare_and_swap_read(addr, old_val, new_val) AO_fetch_compare_and_swap_full(addr, old_val, new_val)
# define AO_HAVE_fetch_compare_and_swap_read
# endif
#endif /* AO_HAVE_fetch_compare_and_swap_full */

#if !defined(AO_HAVE_fetch_compare_and_swap) && defined(AO_HAVE_fetch_compare_and_swap_release)
# define AO_fetch_compare_and_swap(addr, old_val, new_val) AO_fetch_compare_and_swap_release(addr, old_val, new_val)
# define AO_HAVE_fetch_compare_and_swap
#endif

#if !defined(AO_HAVE_fetch_compare_and_swap) && defined(AO_HAVE_fetch_compare_and_swap_acquire)
# define AO_fetch_compare_and_swap(addr, old_val, new_val) AO_fetch_compare_and_swap_acquire(addr, old_val, new_val)
# define AO_HAVE_fetch_compare_and_swap
#endif

#if !defined(AO_HAVE_fetch_compare_and_swap) && defined(AO_HAVE_fetch_compare_and_swap_write)
# define AO_fetch_compare_and_swap(addr, old_val, new_val) AO_fetch_compare_and_swap_write(addr, old_val, new_val)
# define AO_HAVE_fetch_compare_and_swap
#endif

#if !defined(AO_HAVE_fetch_compare_and_swap) && defined(AO_HAVE_fetch_compare_and_swap_read)
# define AO_fetch_compare_and_swap(addr, old_val, new_val) AO_fetch_compare_and_swap_read(addr, old_val, new_val)
# define AO_HAVE_fetch_compare_and_swap
#endif

#if defined(AO_HAVE_fetch_compare_and_swap_acquire) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_fetch_compare_and_swap_full)
# define AO_fetch_compare_and_swap_full(addr, old_val, new_val) (AO_nop_full(), AO_fetch_compare_and_swap_acquire(addr, old_val, new_val))
# define AO_HAVE_fetch_compare_and_swap_full
#endif

#if !defined(AO_HAVE_fetch_compare_and_swap_release_write) \
    && defined(AO_HAVE_fetch_compare_and_swap_write)
# define AO_fetch_compare_and_swap_release_write(addr,old_val,new_val) AO_fetch_compare_and_swap_write(addr, old_val, new_val)
# define AO_HAVE_fetch_compare_and_swap_release_write
#endif

#if !defined(AO_HAVE_fetch_compare_and_swap_release_write) \
    && defined(AO_HAVE_fetch_compare_and_swap_release)
# define AO_fetch_compare_and_swap_release_write(addr,old_val,new_val) AO_fetch_compare_and_swap_release(addr, old_val, new_val)
# define AO_HAVE_fetch_compare_and_swap_release_write
#endif

#if !defined(AO_HAVE_fetch_compare_and_swap_acquire_read) \
    && defined(AO_HAVE_fetch_compare_and_swap_read)
# define AO_fetch_compare_and_swap_acquire_read(addr,old_val,new_val) AO_fetch_compare_and_swap_read(addr, old_val, new_val)
# define AO_HAVE_fetch_compare_and_swap_acquire_read
#endif

#if !defined(AO_HAVE_fetch_compare_and_swap_acquire_read) \
    && defined(AO_HAVE_fetch_compare_and_swap_acquire)
# define AO_fetch_compare_and_swap_acquire_read(addr,old_val,new_val) AO_fetch_compare_and_swap_acquire(addr, old_val, new_val)
# define AO_HAVE_fetch_compare_and_swap_acquire_read
#endif

#ifdef AO_NO_DD_ORDERING
# if defined(AO_HAVE_fetch_compare_and_swap_acquire_read)
# define AO_fetch_compare_and_swap_dd_acquire_read(addr,old_val,new_val) AO_fetch_compare_and_swap_acquire_read(addr, old_val, new_val)
# define AO_HAVE_fetch_compare_and_swap_dd_acquire_read
# endif
#else
# if defined(AO_HAVE_fetch_compare_and_swap)
# define AO_fetch_compare_and_swap_dd_acquire_read(addr,old_val,new_val) AO_fetch_compare_and_swap(addr, old_val, new_val)
# define AO_HAVE_fetch_compare_and_swap_dd_acquire_read
# endif
#endif /* !AO_NO_DD_ORDERING */

/* compare_and_swap */
#if defined(AO_HAVE_compare_and_swap) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_compare_and_swap_acquire)
AO_INLINE int
AO_compare_and_swap_acquire(volatile AO_t *addr, AO_t old, AO_t new_val)
{
  int result = AO_compare_and_swap(addr, old, new_val);
  AO_nop_full();
  return result;
}
# define AO_HAVE_compare_and_swap_acquire
#endif

#if defined(AO_HAVE_compare_and_swap) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_compare_and_swap_release)
# define AO_compare_and_swap_release(addr, old, new_val) (AO_nop_full(), AO_compare_and_swap(addr, old, new_val))
# define AO_HAVE_compare_and_swap_release
#endif

#if defined(AO_HAVE_compare_and_swap_full)
# if !defined(AO_HAVE_compare_and_swap_release)
# define AO_compare_and_swap_release(addr, old, new_val) AO_compare_and_swap_full(addr, old, new_val)
# define AO_HAVE_compare_and_swap_release
# endif
# if !defined(AO_HAVE_compare_and_swap_acquire)
# define AO_compare_and_swap_acquire(addr, old, new_val) AO_compare_and_swap_full(addr, old, new_val)
# define AO_HAVE_compare_and_swap_acquire
# endif
# \
if !defined(AO_HAVE_compare_and_swap_write)
# define AO_compare_and_swap_write(addr, old, new_val) AO_compare_and_swap_full(addr, old, new_val)
# define AO_HAVE_compare_and_swap_write
# endif
# if !defined(AO_HAVE_compare_and_swap_read)
# define AO_compare_and_swap_read(addr, old, new_val) AO_compare_and_swap_full(addr, old, new_val)
# define AO_HAVE_compare_and_swap_read
# endif
#endif /* AO_HAVE_compare_and_swap_full */

#if !defined(AO_HAVE_compare_and_swap) && defined(AO_HAVE_compare_and_swap_release)
# define AO_compare_and_swap(addr, old, new_val) AO_compare_and_swap_release(addr, old, new_val)
# define AO_HAVE_compare_and_swap
#endif

#if !defined(AO_HAVE_compare_and_swap) && defined(AO_HAVE_compare_and_swap_acquire)
# define AO_compare_and_swap(addr, old, new_val) AO_compare_and_swap_acquire(addr, old, new_val)
# define AO_HAVE_compare_and_swap
#endif

#if !defined(AO_HAVE_compare_and_swap) && defined(AO_HAVE_compare_and_swap_write)
# define AO_compare_and_swap(addr, old, new_val) AO_compare_and_swap_write(addr, old, new_val)
# define AO_HAVE_compare_and_swap
#endif

#if !defined(AO_HAVE_compare_and_swap) && defined(AO_HAVE_compare_and_swap_read)
# define AO_compare_and_swap(addr, old, new_val) AO_compare_and_swap_read(addr, old, new_val)
# define AO_HAVE_compare_and_swap
#endif

#if defined(AO_HAVE_compare_and_swap_acquire) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_compare_and_swap_full)
# define AO_compare_and_swap_full(addr, old, new_val) (AO_nop_full(), AO_compare_and_swap_acquire(addr, old, new_val))
# define AO_HAVE_compare_and_swap_full
#endif

#if !defined(AO_HAVE_compare_and_swap_release_write) && defined(AO_HAVE_compare_and_swap_write)
# define AO_compare_and_swap_release_write(addr, old, new_val) AO_compare_and_swap_write(addr, old, new_val)
# define AO_HAVE_compare_and_swap_release_write
#endif

#if !defined(AO_HAVE_compare_and_swap_release_write) && defined(AO_HAVE_compare_and_swap_release)
# define AO_compare_and_swap_release_write(addr, old, new_val) AO_compare_and_swap_release(addr, old, new_val)
# define AO_HAVE_compare_and_swap_release_write
#endif

#if !defined(AO_HAVE_compare_and_swap_acquire_read) && defined(AO_HAVE_compare_and_swap_read)
# define AO_compare_and_swap_acquire_read(addr, old, new_val) AO_compare_and_swap_read(addr, old, new_val)
# define AO_HAVE_compare_and_swap_acquire_read
#endif

#if !defined(AO_HAVE_compare_and_swap_acquire_read) && defined(AO_HAVE_compare_and_swap_acquire)
# define AO_compare_and_swap_acquire_read(addr, old, new_val) AO_compare_and_swap_acquire(addr, old, new_val)
# define AO_HAVE_compare_and_swap_acquire_read
#endif

#ifdef AO_NO_DD_ORDERING
# if defined(AO_HAVE_compare_and_swap_acquire_read)
# define AO_compare_and_swap_dd_acquire_read(addr, old, new_val) AO_compare_and_swap_acquire_read(addr, old, new_val)
# define AO_HAVE_compare_and_swap_dd_acquire_read
# endif
#else
# if defined(AO_HAVE_compare_and_swap)
# define AO_compare_and_swap_dd_acquire_read(addr, old, new_val) AO_compare_and_swap(addr, old, new_val)
# define AO_HAVE_compare_and_swap_dd_acquire_read
# endif
#endif /* !AO_NO_DD_ORDERING */

/* load */
#if defined(AO_HAVE_load_full) && !defined(AO_HAVE_load_acquire)
# define AO_load_acquire(addr) AO_load_full(addr)
# define AO_HAVE_load_acquire
#endif

#if defined(AO_HAVE_load_acquire) && !defined(AO_HAVE_load)
# define AO_load(addr) AO_load_acquire(addr)
# define AO_HAVE_load
#endif

#if defined(AO_HAVE_load_full) && !defined(AO_HAVE_load_read)
# define AO_load_read(addr) AO_load_full(addr)
# define AO_HAVE_load_read
#endif

#if !defined(AO_HAVE_load_acquire_read) && defined(AO_HAVE_load_acquire)
# define AO_load_acquire_read(addr) AO_load_acquire(addr)
# define AO_HAVE_load_acquire_read
#endif

#if defined(AO_HAVE_load) && defined(AO_HAVE_nop_full) && !defined(AO_HAVE_load_acquire)
AO_INLINE AO_t
AO_load_acquire(const volatile AO_t *addr)
{
  AO_t result = AO_load(addr);

  /* Acquire barrier would be useless, since the load could be delayed */
  /* beyond it. */
  AO_nop_full();
  return result;
}
# define AO_HAVE_load_acquire
#endif

#if defined(AO_HAVE_load) && defined(AO_HAVE_nop_read) && !defined(AO_HAVE_load_read)
AO_INLINE AO_t
AO_load_read(const volatile AO_t *addr)
{
  AO_t result = AO_load(addr);

  AO_nop_read();
  return result;
}
# define AO_HAVE_load_read
#endif

#if defined(AO_HAVE_load_acquire) && defined(AO_HAVE_nop_full) && !defined(AO_HAVE_load_full)
# define AO_load_full(addr) (AO_nop_full(), AO_load_acquire(addr))
# define AO_HAVE_load_full
#endif

#if defined(AO_HAVE_compare_and_swap_read) && !defined(AO_HAVE_load_read)
# define AO_CAS_BASED_LOAD_READ
AO_ATTR_NO_SANITIZE_THREAD
AO_INLINE AO_t
AO_load_read(const volatile AO_t *addr)
{
  AO_t result;

  do {
    result = *(const AO_t *)addr;
  } while (AO_EXPECT_FALSE(!AO_compare_and_swap_read((volatile AO_t *)addr, result, result)));
  return result;
}
# define AO_HAVE_load_read
#endif

#if !defined(AO_HAVE_load_acquire_read) && defined(AO_HAVE_load_read)
# define AO_load_acquire_read(addr) AO_load_read(addr)
# define AO_HAVE_load_acquire_read
#endif

#if defined(AO_HAVE_load_acquire_read) && !defined(AO_HAVE_load) \
    && (!defined(AO_CAS_BASED_LOAD_READ) || !defined(AO_HAVE_compare_and_swap))
# define AO_load(addr) AO_load_acquire_read(addr)
# define AO_HAVE_load
#endif

#if defined(AO_HAVE_compare_and_swap_full) && !defined(AO_HAVE_load_full)
AO_ATTR_NO_SANITIZE_THREAD
AO_INLINE AO_t
AO_load_full(const volatile AO_t *addr)
{
  AO_t result;

  do {
    result = *(const AO_t *)addr;
  } while (AO_EXPECT_FALSE(!AO_compare_and_swap_full((volatile AO_t *)addr, result, result)));
  return result;
}
# define AO_HAVE_load_full
#endif

#if defined(AO_HAVE_compare_and_swap_acquire) && !defined(AO_HAVE_load_acquire)
AO_ATTR_NO_SANITIZE_THREAD
AO_INLINE AO_t
AO_load_acquire(const volatile AO_t *addr)
{
  AO_t result;

  do {
    result = *(const AO_t *)addr;
  } while (AO_EXPECT_FALSE(!AO_compare_and_swap_acquire((volatile AO_t *)addr,
                                                        result, result)));
  return result;
}
# define AO_HAVE_load_acquire
#endif

#if defined(AO_HAVE_compare_and_swap) && !defined(AO_HAVE_load)
AO_ATTR_NO_SANITIZE_THREAD
AO_INLINE AO_t
AO_load(const volatile AO_t *addr)
{
  AO_t result;

  do {
    result = *(const AO_t *)addr;
  } while (AO_EXPECT_FALSE(!AO_compare_and_swap((volatile AO_t *)addr, result, result)));
  return result;
}
# define AO_HAVE_load
#endif

#ifdef AO_NO_DD_ORDERING
# if defined(AO_HAVE_load_acquire_read)
# define AO_load_dd_acquire_read(addr) AO_load_acquire_read(addr)
# define AO_HAVE_load_dd_acquire_read
# endif
#else
# if defined(AO_HAVE_load)
# define AO_load_dd_acquire_read(addr) AO_load(addr)
# define AO_HAVE_load_dd_acquire_read
# endif
#endif /* !AO_NO_DD_ORDERING */

/* store */
#if defined(AO_HAVE_store_full) && !defined(AO_HAVE_store_release)
# define AO_store_release(addr, val) AO_store_full(addr, val)
# define AO_HAVE_store_release
#endif

#if defined(AO_HAVE_store_release) && !defined(AO_HAVE_store)
# define AO_store(addr, val) AO_store_release(addr, val)
# define AO_HAVE_store
#endif

#if defined(AO_HAVE_store_full) && !defined(AO_HAVE_store_write)
# define AO_store_write(addr, val) AO_store_full(addr, val)
# define AO_HAVE_store_write
#endif

#if defined(AO_HAVE_store_release) && !defined(AO_HAVE_store_release_write)
# define AO_store_release_write(addr, val) AO_store_release(addr, val)
# define AO_HAVE_store_release_write
#endif

#if defined(AO_HAVE_store_write) && !defined(AO_HAVE_store)
# define AO_store(addr, val) AO_store_write(addr, val)
# define AO_HAVE_store
#endif

#if defined(AO_HAVE_store) && defined(AO_HAVE_nop_full) && !defined(AO_HAVE_store_release)
# define AO_store_release(addr, val) (AO_nop_full(), AO_store(addr, val))
# define AO_HAVE_store_release
#endif

#if defined(AO_HAVE_store) && defined(AO_HAVE_nop_write) && !defined(AO_HAVE_store_write)
# define AO_store_write(addr, val) (AO_nop_write(), AO_store(addr, val))
# define AO_HAVE_store_write
#endif

#if defined(AO_HAVE_compare_and_swap_write) && !defined(AO_HAVE_store_write)
AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD
AO_INLINE void
AO_store_write(volatile AO_t *addr, AO_t new_val)
{
  AO_t old_val;

  do {
    old_val = *(AO_t *)addr;
  } while (AO_EXPECT_FALSE(!AO_compare_and_swap_write(addr, old_val, new_val)));
}
# define AO_HAVE_store_write
#endif

#if defined(AO_HAVE_store_write) && !defined(AO_HAVE_store_release_write)
# define AO_store_release_write(addr, val) AO_store_write(addr, val)
# define AO_HAVE_store_release_write
#endif

#if defined(AO_HAVE_store_release) && defined(AO_HAVE_nop_full) && !defined(AO_HAVE_store_full)
# define AO_store_full(addr, val) (AO_store_release(addr, val), AO_nop_full())
# define AO_HAVE_store_full
#endif

#if defined(AO_HAVE_compare_and_swap) && !defined(AO_HAVE_store)
AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD
AO_INLINE void
AO_store(volatile AO_t *addr, AO_t new_val)
{
  AO_t old_val;

  do {
    old_val = *(AO_t *)addr;
  } while (AO_EXPECT_FALSE(!AO_compare_and_swap(addr, old_val, new_val)));
}
# define AO_HAVE_store
#endif

#if defined(AO_HAVE_compare_and_swap_release) && !defined(AO_HAVE_store_release)
AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD
AO_INLINE void
AO_store_release(volatile AO_t *addr, AO_t new_val)
{
  AO_t old_val;

  do {
    old_val = *(AO_t *)addr;
  } while (AO_EXPECT_FALSE(!AO_compare_and_swap_release(addr, old_val, new_val)));
}
# define AO_HAVE_store_release
#endif

#if defined(AO_HAVE_compare_and_swap_full) && !defined(AO_HAVE_store_full)
AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD
AO_INLINE void
AO_store_full(volatile AO_t *addr, AO_t new_val)
{
  AO_t old_val;

  do {
    old_val = *(AO_t *)addr;
  } while (AO_EXPECT_FALSE(!AO_compare_and_swap_full(addr, old_val, new_val)));
}
# define AO_HAVE_store_full
#endif

/*
 * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

/* double_fetch_compare_and_swap */
#if defined(AO_HAVE_double_fetch_compare_and_swap) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_double_fetch_compare_and_swap_acquire)
AO_INLINE AO_double_t
AO_double_fetch_compare_and_swap_acquire(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val)
{
  AO_double_t result = AO_double_fetch_compare_and_swap(addr, old_val, new_val);
  AO_nop_full();
  return result;
}
# define AO_HAVE_double_fetch_compare_and_swap_acquire
#endif

#if defined(AO_HAVE_double_fetch_compare_and_swap) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_double_fetch_compare_and_swap_release)
# define AO_double_fetch_compare_and_swap_release(addr, old_val, new_val) (AO_nop_full(), AO_double_fetch_compare_and_swap(addr, old_val, new_val))
# define AO_HAVE_double_fetch_compare_and_swap_release
#endif

#if defined(AO_HAVE_double_fetch_compare_and_swap_full)
# if !defined(AO_HAVE_double_fetch_compare_and_swap_release)
# define AO_double_fetch_compare_and_swap_release(addr, old_val, new_val) AO_double_fetch_compare_and_swap_full(addr, old_val, new_val)
# define AO_HAVE_double_fetch_compare_and_swap_release
# endif
# if !defined(AO_HAVE_double_fetch_compare_and_swap_acquire)
# define AO_double_fetch_compare_and_swap_acquire(addr, old_val, new_val) AO_double_fetch_compare_and_swap_full(addr, old_val, new_val)
# define AO_HAVE_double_fetch_compare_and_swap_acquire
# endif
# if !defined(AO_HAVE_double_fetch_compare_and_swap_write)
# define AO_double_fetch_compare_and_swap_write(addr, old_val, new_val) AO_double_fetch_compare_and_swap_full(addr, old_val, new_val)
# define AO_HAVE_double_fetch_compare_and_swap_write
# endif
# if !defined(AO_HAVE_double_fetch_compare_and_swap_read)
# define AO_double_fetch_compare_and_swap_read(addr, old_val, new_val) AO_double_fetch_compare_and_swap_full(addr, old_val, new_val)
# define AO_HAVE_double_fetch_compare_and_swap_read
# endif
#endif /* AO_HAVE_double_fetch_compare_and_swap_full */

#if !defined(AO_HAVE_double_fetch_compare_and_swap) \
    && defined(AO_HAVE_double_fetch_compare_and_swap_release)
# define AO_double_fetch_compare_and_swap(addr, old_val, new_val) AO_double_fetch_compare_and_swap_release(addr, old_val, new_val)
# define AO_HAVE_double_fetch_compare_and_swap
#endif

#if !defined(AO_HAVE_double_fetch_compare_and_swap) \
    && defined(AO_HAVE_double_fetch_compare_and_swap_acquire)
# define AO_double_fetch_compare_and_swap(addr, old_val, new_val) AO_double_fetch_compare_and_swap_acquire(addr, old_val, new_val)
# define AO_HAVE_double_fetch_compare_and_swap
#endif

#if !defined(AO_HAVE_double_fetch_compare_and_swap) \
    && defined(AO_HAVE_double_fetch_compare_and_swap_write)
# define AO_double_fetch_compare_and_swap(addr, old_val, new_val) AO_double_fetch_compare_and_swap_write(addr, old_val, new_val)
# define AO_HAVE_double_fetch_compare_and_swap
#endif

#if !defined(AO_HAVE_double_fetch_compare_and_swap) \
    && defined(AO_HAVE_double_fetch_compare_and_swap_read)
# define AO_double_fetch_compare_and_swap(addr, old_val, new_val) AO_double_fetch_compare_and_swap_read(addr, old_val, new_val)
# define AO_HAVE_double_fetch_compare_and_swap
#endif

#if defined(AO_HAVE_double_fetch_compare_and_swap_acquire) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_double_fetch_compare_and_swap_full)
# define AO_double_fetch_compare_and_swap_full(addr, old_val, new_val) (AO_nop_full(), AO_double_fetch_compare_and_swap_acquire(addr, old_val, new_val))
# define AO_HAVE_double_fetch_compare_and_swap_full
#endif

#if !defined(AO_HAVE_double_fetch_compare_and_swap_release_write) \
    && defined(AO_HAVE_double_fetch_compare_and_swap_write)
# define AO_double_fetch_compare_and_swap_release_write(addr,old_val,new_val) AO_double_fetch_compare_and_swap_write(addr, old_val, new_val)
# define AO_HAVE_double_fetch_compare_and_swap_release_write
#endif

#if !defined(AO_HAVE_double_fetch_compare_and_swap_release_write) \
    && defined(AO_HAVE_double_fetch_compare_and_swap_release)
# define AO_double_fetch_compare_and_swap_release_write(addr,old_val,new_val) AO_double_fetch_compare_and_swap_release(addr, old_val, new_val)
# define AO_HAVE_double_fetch_compare_and_swap_release_write
#endif

#if !defined(AO_HAVE_double_fetch_compare_and_swap_acquire_read) \
    && defined(AO_HAVE_double_fetch_compare_and_swap_read)
# define AO_double_fetch_compare_and_swap_acquire_read(addr,old_val,new_val) AO_double_fetch_compare_and_swap_read(addr, old_val, new_val)
# define AO_HAVE_double_fetch_compare_and_swap_acquire_read
#endif

#if !defined(AO_HAVE_double_fetch_compare_and_swap_acquire_read) \
    && defined(AO_HAVE_double_fetch_compare_and_swap_acquire)
# define AO_double_fetch_compare_and_swap_acquire_read(addr,old_val,new_val) AO_double_fetch_compare_and_swap_acquire(addr, old_val, new_val)
# define AO_HAVE_double_fetch_compare_and_swap_acquire_read
#endif

#ifdef AO_NO_DD_ORDERING
# if defined(AO_HAVE_double_fetch_compare_and_swap_acquire_read)
# define AO_double_fetch_compare_and_swap_dd_acquire_read(addr,old_val,new_val) AO_double_fetch_compare_and_swap_acquire_read(addr, old_val, new_val)
# define AO_HAVE_double_fetch_compare_and_swap_dd_acquire_read
# endif
#else
# if defined(AO_HAVE_double_fetch_compare_and_swap)
# define AO_double_fetch_compare_and_swap_dd_acquire_read(addr,old_val,new_val) AO_double_fetch_compare_and_swap(addr, old_val, new_val)
# define AO_HAVE_double_fetch_compare_and_swap_dd_acquire_read
# endif
#endif /* !AO_NO_DD_ORDERING */

/* double_compare_and_swap */
#if defined(AO_HAVE_double_compare_and_swap) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_double_compare_and_swap_acquire)
AO_INLINE int
AO_double_compare_and_swap_acquire(volatile AO_double_t *addr, AO_double_t old, AO_double_t new_val)
{
  int result = AO_double_compare_and_swap(addr, old, new_val);
  AO_nop_full();
  return result;
}
# define AO_HAVE_double_compare_and_swap_acquire
#endif

#if \
defined(AO_HAVE_double_compare_and_swap) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_double_compare_and_swap_release)
# define AO_double_compare_and_swap_release(addr, old, new_val) (AO_nop_full(), AO_double_compare_and_swap(addr, old, new_val))
# define AO_HAVE_double_compare_and_swap_release
#endif

#if defined(AO_HAVE_double_compare_and_swap_full)
# if !defined(AO_HAVE_double_compare_and_swap_release)
# define AO_double_compare_and_swap_release(addr, old, new_val) AO_double_compare_and_swap_full(addr, old, new_val)
# define AO_HAVE_double_compare_and_swap_release
# endif
# if !defined(AO_HAVE_double_compare_and_swap_acquire)
# define AO_double_compare_and_swap_acquire(addr, old, new_val) AO_double_compare_and_swap_full(addr, old, new_val)
# define AO_HAVE_double_compare_and_swap_acquire
# endif
# if !defined(AO_HAVE_double_compare_and_swap_write)
# define AO_double_compare_and_swap_write(addr, old, new_val) AO_double_compare_and_swap_full(addr, old, new_val)
# define AO_HAVE_double_compare_and_swap_write
# endif
# if !defined(AO_HAVE_double_compare_and_swap_read)
# define AO_double_compare_and_swap_read(addr, old, new_val) AO_double_compare_and_swap_full(addr, old, new_val)
# define AO_HAVE_double_compare_and_swap_read
# endif
#endif /* AO_HAVE_double_compare_and_swap_full */

#if !defined(AO_HAVE_double_compare_and_swap) \
    && defined(AO_HAVE_double_compare_and_swap_release)
# define AO_double_compare_and_swap(addr, old, new_val) AO_double_compare_and_swap_release(addr, old, new_val)
# define AO_HAVE_double_compare_and_swap
#endif

#if !defined(AO_HAVE_double_compare_and_swap) \
    && defined(AO_HAVE_double_compare_and_swap_acquire)
# define AO_double_compare_and_swap(addr, old, new_val) AO_double_compare_and_swap_acquire(addr, old, new_val)
# define AO_HAVE_double_compare_and_swap
#endif

#if !defined(AO_HAVE_double_compare_and_swap) \
    && defined(AO_HAVE_double_compare_and_swap_write)
# define AO_double_compare_and_swap(addr, old, new_val) AO_double_compare_and_swap_write(addr, old, new_val)
# define AO_HAVE_double_compare_and_swap
#endif

#if !defined(AO_HAVE_double_compare_and_swap) \
    && defined(AO_HAVE_double_compare_and_swap_read)
# define AO_double_compare_and_swap(addr, old, new_val) AO_double_compare_and_swap_read(addr, old, new_val)
# define AO_HAVE_double_compare_and_swap
#endif

#if defined(AO_HAVE_double_compare_and_swap_acquire) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_double_compare_and_swap_full)
# define AO_double_compare_and_swap_full(addr, old, new_val) (AO_nop_full(), AO_double_compare_and_swap_acquire(addr, old, new_val))
# define AO_HAVE_double_compare_and_swap_full
#endif

#if !defined(AO_HAVE_double_compare_and_swap_release_write) \
    && defined(AO_HAVE_double_compare_and_swap_write)
# define AO_double_compare_and_swap_release_write(addr, old, new_val) AO_double_compare_and_swap_write(addr, old, new_val)
# define AO_HAVE_double_compare_and_swap_release_write
#endif

#if !defined(AO_HAVE_double_compare_and_swap_release_write) \
    && defined(AO_HAVE_double_compare_and_swap_release)
# define AO_double_compare_and_swap_release_write(addr, old, new_val) AO_double_compare_and_swap_release(addr, old, new_val)
# define AO_HAVE_double_compare_and_swap_release_write
#endif

#if !defined(AO_HAVE_double_compare_and_swap_acquire_read) \
    && defined(AO_HAVE_double_compare_and_swap_read)
# define AO_double_compare_and_swap_acquire_read(addr, old, new_val) AO_double_compare_and_swap_read(addr, old, new_val)
# define AO_HAVE_double_compare_and_swap_acquire_read
#endif

#if !defined(AO_HAVE_double_compare_and_swap_acquire_read) \
    && defined(AO_HAVE_double_compare_and_swap_acquire)
# define AO_double_compare_and_swap_acquire_read(addr, old, new_val) AO_double_compare_and_swap_acquire(addr, old, new_val)
# define AO_HAVE_double_compare_and_swap_acquire_read
#endif

#ifdef AO_NO_DD_ORDERING
# if defined(AO_HAVE_double_compare_and_swap_acquire_read)
# define AO_double_compare_and_swap_dd_acquire_read(addr, old, new_val) AO_double_compare_and_swap_acquire_read(addr, old, new_val)
# define AO_HAVE_double_compare_and_swap_dd_acquire_read
# endif
#else
# if defined(AO_HAVE_double_compare_and_swap)
# define AO_double_compare_and_swap_dd_acquire_read(addr, old, new_val) AO_double_compare_and_swap(addr, old, new_val)
# define AO_HAVE_double_compare_and_swap_dd_acquire_read
# endif
#endif /* !AO_NO_DD_ORDERING */

/* double_load */
#if defined(AO_HAVE_double_load_full) && !defined(AO_HAVE_double_load_acquire)
# define AO_double_load_acquire(addr) AO_double_load_full(addr)
# define AO_HAVE_double_load_acquire
#endif

#if defined(AO_HAVE_double_load_acquire) && !defined(AO_HAVE_double_load)
# define AO_double_load(addr) AO_double_load_acquire(addr)
# define AO_HAVE_double_load
#endif

#if defined(AO_HAVE_double_load_full) && !defined(AO_HAVE_double_load_read)
# define AO_double_load_read(addr) AO_double_load_full(addr)
# define AO_HAVE_double_load_read
#endif

#if !defined(AO_HAVE_double_load_acquire_read) \
    && defined(AO_HAVE_double_load_acquire)
# define AO_double_load_acquire_read(addr) AO_double_load_acquire(addr)
# define AO_HAVE_double_load_acquire_read
#endif

#if defined(AO_HAVE_double_load) && defined(AO_HAVE_nop_full) \
    && !defined(AO_HAVE_double_load_acquire)
AO_INLINE AO_double_t
AO_double_load_acquire(const volatile AO_double_t *addr)
{
  AO_double_t result = AO_double_load(addr);

  /* Acquire barrier would be useless, since the load could be delayed */
  /* beyond it.
*/ AO_nop_full(); return result; } # define AO_HAVE_double_load_acquire #endif #if defined(AO_HAVE_double_load) && defined(AO_HAVE_nop_read) \ && !defined(AO_HAVE_double_load_read) AO_INLINE AO_double_t AO_double_load_read(const volatile AO_double_t *addr) { AO_double_t result = AO_double_load(addr); AO_nop_read(); return result; } # define AO_HAVE_double_load_read #endif #if defined(AO_HAVE_double_load_acquire) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_double_load_full) # define AO_double_load_full(addr) (AO_nop_full(), AO_double_load_acquire(addr)) # define AO_HAVE_double_load_full #endif #if defined(AO_HAVE_double_compare_and_swap_read) \ && !defined(AO_HAVE_double_load_read) # define AO_double_CAS_BASED_LOAD_READ AO_ATTR_NO_SANITIZE_THREAD AO_INLINE AO_double_t AO_double_load_read(const volatile AO_double_t *addr) { AO_double_t result; do { result = *(const AO_double_t *)addr; } while (AO_EXPECT_FALSE(!AO_double_compare_and_swap_read( (volatile AO_double_t *)addr, result, result))); return result; } # define AO_HAVE_double_load_read #endif #if !defined(AO_HAVE_double_load_acquire_read) \ && defined(AO_HAVE_double_load_read) # define AO_double_load_acquire_read(addr) AO_double_load_read(addr) # define AO_HAVE_double_load_acquire_read #endif #if defined(AO_HAVE_double_load_acquire_read) && !defined(AO_HAVE_double_load) \ && (!defined(AO_double_CAS_BASED_LOAD_READ) \ || !defined(AO_HAVE_double_compare_and_swap)) # define AO_double_load(addr) AO_double_load_acquire_read(addr) # define AO_HAVE_double_load #endif #if defined(AO_HAVE_double_compare_and_swap_full) \ && !defined(AO_HAVE_double_load_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE AO_double_t AO_double_load_full(const volatile AO_double_t *addr) { AO_double_t result; do { result = *(const AO_double_t *)addr; } while (AO_EXPECT_FALSE(!AO_double_compare_and_swap_full( (volatile AO_double_t *)addr, result, result))); return result; } # define AO_HAVE_double_load_full #endif #if 
defined(AO_HAVE_double_compare_and_swap_acquire) \ && !defined(AO_HAVE_double_load_acquire) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE AO_double_t AO_double_load_acquire(const volatile AO_double_t *addr) { AO_double_t result; do { result = *(const AO_double_t *)addr; } while (AO_EXPECT_FALSE(!AO_double_compare_and_swap_acquire( (volatile AO_double_t *)addr, result, result))); return result; } # define AO_HAVE_double_load_acquire #endif #if defined(AO_HAVE_double_compare_and_swap) && !defined(AO_HAVE_double_load) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE AO_double_t AO_double_load(const volatile AO_double_t *addr) { AO_double_t result; do { result = *(const AO_double_t *)addr; } while (AO_EXPECT_FALSE(!AO_double_compare_and_swap( (volatile AO_double_t *)addr, result, result))); return result; } # define AO_HAVE_double_load #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_double_load_acquire_read) # define AO_double_load_dd_acquire_read(addr) \ AO_double_load_acquire_read(addr) # define AO_HAVE_double_load_dd_acquire_read # endif #else # if defined(AO_HAVE_double_load) # define AO_double_load_dd_acquire_read(addr) AO_double_load(addr) # define AO_HAVE_double_load_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* double_store */ #if defined(AO_HAVE_double_store_full) && !defined(AO_HAVE_double_store_release) # define AO_double_store_release(addr, val) AO_double_store_full(addr, val) # define AO_HAVE_double_store_release #endif #if defined(AO_HAVE_double_store_release) && !defined(AO_HAVE_double_store) # define AO_double_store(addr, val) AO_double_store_release(addr, val) # define AO_HAVE_double_store #endif #if defined(AO_HAVE_double_store_full) && !defined(AO_HAVE_double_store_write) # define AO_double_store_write(addr, val) AO_double_store_full(addr, val) # define AO_HAVE_double_store_write #endif #if defined(AO_HAVE_double_store_release) \ && !defined(AO_HAVE_double_store_release_write) # define AO_double_store_release_write(addr, val) \ 
AO_double_store_release(addr, val) # define AO_HAVE_double_store_release_write #endif #if defined(AO_HAVE_double_store_write) && !defined(AO_HAVE_double_store) # define AO_double_store(addr, val) AO_double_store_write(addr, val) # define AO_HAVE_double_store #endif #if defined(AO_HAVE_double_store) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_double_store_release) # define AO_double_store_release(addr, val) \ (AO_nop_full(), AO_double_store(addr, val)) # define AO_HAVE_double_store_release #endif #if defined(AO_HAVE_double_store) && defined(AO_HAVE_nop_write) \ && !defined(AO_HAVE_double_store_write) # define AO_double_store_write(addr, val) \ (AO_nop_write(), AO_double_store(addr, val)) # define AO_HAVE_double_store_write #endif #if defined(AO_HAVE_double_compare_and_swap_write) \ && !defined(AO_HAVE_double_store_write) AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_double_store_write(volatile AO_double_t *addr, AO_double_t new_val) { AO_double_t old_val; do { old_val = *(AO_double_t *)addr; } while (AO_EXPECT_FALSE(!AO_double_compare_and_swap_write(addr, old_val, new_val))); } # define AO_HAVE_double_store_write #endif #if defined(AO_HAVE_double_store_write) \ && !defined(AO_HAVE_double_store_release_write) # define AO_double_store_release_write(addr, val) \ AO_double_store_write(addr, val) # define AO_HAVE_double_store_release_write #endif #if defined(AO_HAVE_double_store_release) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_double_store_full) # define AO_double_store_full(addr, val) \ (AO_double_store_release(addr, val), \ AO_nop_full()) # define AO_HAVE_double_store_full #endif #if defined(AO_HAVE_double_compare_and_swap) && !defined(AO_HAVE_double_store) AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_double_store(volatile AO_double_t *addr, AO_double_t new_val) { AO_double_t old_val; do { old_val = *(AO_double_t *)addr; } while (AO_EXPECT_FALSE(!AO_double_compare_and_swap(addr, old_val, new_val))); 
} # define AO_HAVE_double_store #endif #if defined(AO_HAVE_double_compare_and_swap_release) \ && !defined(AO_HAVE_double_store_release) AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_double_store_release(volatile AO_double_t *addr, AO_double_t new_val) { AO_double_t old_val; do { old_val = *(AO_double_t *)addr; } while (AO_EXPECT_FALSE(!AO_double_compare_and_swap_release(addr, old_val, new_val))); } # define AO_HAVE_double_store_release #endif #if defined(AO_HAVE_double_compare_and_swap_full) \ && !defined(AO_HAVE_double_store_full) AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_double_store_full(volatile AO_double_t *addr, AO_double_t new_val) { AO_double_t old_val; do { old_val = *(AO_double_t *)addr; } while (AO_EXPECT_FALSE(!AO_double_compare_and_swap_full(addr, old_val, new_val))); } # define AO_HAVE_double_store_full #endif papi-papi-7-2-0-t/src/atomic_ops/generalize-small.template000066400000000000000000000507751502707512200235720ustar00rootroot00000000000000/* * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* XSIZE_fetch_compare_and_swap */ #if defined(AO_HAVE_XSIZE_fetch_compare_and_swap) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_XSIZE_fetch_compare_and_swap_acquire) AO_INLINE XCTYPE AO_XSIZE_fetch_compare_and_swap_acquire(volatile XCTYPE *addr, XCTYPE old_val, XCTYPE new_val) { XCTYPE result = AO_XSIZE_fetch_compare_and_swap(addr, old_val, new_val); AO_nop_full(); return result; } # define AO_HAVE_XSIZE_fetch_compare_and_swap_acquire #endif #if defined(AO_HAVE_XSIZE_fetch_compare_and_swap) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_XSIZE_fetch_compare_and_swap_release) # define AO_XSIZE_fetch_compare_and_swap_release(addr, old_val, new_val) \ (AO_nop_full(), \ AO_XSIZE_fetch_compare_and_swap(addr, old_val, new_val)) # define AO_HAVE_XSIZE_fetch_compare_and_swap_release #endif #if defined(AO_HAVE_XSIZE_fetch_compare_and_swap_full) # if !defined(AO_HAVE_XSIZE_fetch_compare_and_swap_release) # define AO_XSIZE_fetch_compare_and_swap_release(addr, old_val, new_val) \ AO_XSIZE_fetch_compare_and_swap_full(addr, old_val, new_val) # define AO_HAVE_XSIZE_fetch_compare_and_swap_release # endif # if !defined(AO_HAVE_XSIZE_fetch_compare_and_swap_acquire) # define AO_XSIZE_fetch_compare_and_swap_acquire(addr, old_val, new_val) \ AO_XSIZE_fetch_compare_and_swap_full(addr, old_val, new_val) # define AO_HAVE_XSIZE_fetch_compare_and_swap_acquire # endif # if !defined(AO_HAVE_XSIZE_fetch_compare_and_swap_write) # define AO_XSIZE_fetch_compare_and_swap_write(addr, old_val, new_val) \ AO_XSIZE_fetch_compare_and_swap_full(addr, old_val, new_val) # define AO_HAVE_XSIZE_fetch_compare_and_swap_write # endif # if !defined(AO_HAVE_XSIZE_fetch_compare_and_swap_read) # define 
AO_XSIZE_fetch_compare_and_swap_read(addr, old_val, new_val) \ AO_XSIZE_fetch_compare_and_swap_full(addr, old_val, new_val) # define AO_HAVE_XSIZE_fetch_compare_and_swap_read # endif #endif /* AO_HAVE_XSIZE_fetch_compare_and_swap_full */ #if !defined(AO_HAVE_XSIZE_fetch_compare_and_swap) \ && defined(AO_HAVE_XSIZE_fetch_compare_and_swap_release) # define AO_XSIZE_fetch_compare_and_swap(addr, old_val, new_val) \ AO_XSIZE_fetch_compare_and_swap_release(addr, old_val, new_val) # define AO_HAVE_XSIZE_fetch_compare_and_swap #endif #if !defined(AO_HAVE_XSIZE_fetch_compare_and_swap) \ && defined(AO_HAVE_XSIZE_fetch_compare_and_swap_acquire) # define AO_XSIZE_fetch_compare_and_swap(addr, old_val, new_val) \ AO_XSIZE_fetch_compare_and_swap_acquire(addr, old_val, new_val) # define AO_HAVE_XSIZE_fetch_compare_and_swap #endif #if !defined(AO_HAVE_XSIZE_fetch_compare_and_swap) \ && defined(AO_HAVE_XSIZE_fetch_compare_and_swap_write) # define AO_XSIZE_fetch_compare_and_swap(addr, old_val, new_val) \ AO_XSIZE_fetch_compare_and_swap_write(addr, old_val, new_val) # define AO_HAVE_XSIZE_fetch_compare_and_swap #endif #if !defined(AO_HAVE_XSIZE_fetch_compare_and_swap) \ && defined(AO_HAVE_XSIZE_fetch_compare_and_swap_read) # define AO_XSIZE_fetch_compare_and_swap(addr, old_val, new_val) \ AO_XSIZE_fetch_compare_and_swap_read(addr, old_val, new_val) # define AO_HAVE_XSIZE_fetch_compare_and_swap #endif #if defined(AO_HAVE_XSIZE_fetch_compare_and_swap_acquire) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_XSIZE_fetch_compare_and_swap_full) # define AO_XSIZE_fetch_compare_and_swap_full(addr, old_val, new_val) \ (AO_nop_full(), \ AO_XSIZE_fetch_compare_and_swap_acquire(addr, old_val, new_val)) # define AO_HAVE_XSIZE_fetch_compare_and_swap_full #endif #if !defined(AO_HAVE_XSIZE_fetch_compare_and_swap_release_write) \ && defined(AO_HAVE_XSIZE_fetch_compare_and_swap_write) # define AO_XSIZE_fetch_compare_and_swap_release_write(addr,old_val,new_val) \ 
AO_XSIZE_fetch_compare_and_swap_write(addr, old_val, new_val) # define AO_HAVE_XSIZE_fetch_compare_and_swap_release_write #endif #if !defined(AO_HAVE_XSIZE_fetch_compare_and_swap_release_write) \ && defined(AO_HAVE_XSIZE_fetch_compare_and_swap_release) # define AO_XSIZE_fetch_compare_and_swap_release_write(addr,old_val,new_val) \ AO_XSIZE_fetch_compare_and_swap_release(addr, old_val, new_val) # define AO_HAVE_XSIZE_fetch_compare_and_swap_release_write #endif #if !defined(AO_HAVE_XSIZE_fetch_compare_and_swap_acquire_read) \ && defined(AO_HAVE_XSIZE_fetch_compare_and_swap_read) # define AO_XSIZE_fetch_compare_and_swap_acquire_read(addr,old_val,new_val) \ AO_XSIZE_fetch_compare_and_swap_read(addr, old_val, new_val) # define AO_HAVE_XSIZE_fetch_compare_and_swap_acquire_read #endif #if !defined(AO_HAVE_XSIZE_fetch_compare_and_swap_acquire_read) \ && defined(AO_HAVE_XSIZE_fetch_compare_and_swap_acquire) # define AO_XSIZE_fetch_compare_and_swap_acquire_read(addr,old_val,new_val) \ AO_XSIZE_fetch_compare_and_swap_acquire(addr, old_val, new_val) # define AO_HAVE_XSIZE_fetch_compare_and_swap_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_XSIZE_fetch_compare_and_swap_acquire_read) # define AO_XSIZE_fetch_compare_and_swap_dd_acquire_read(addr,old_val,new_val) \ AO_XSIZE_fetch_compare_and_swap_acquire_read(addr, old_val, new_val) # define AO_HAVE_XSIZE_fetch_compare_and_swap_dd_acquire_read # endif #else # if defined(AO_HAVE_XSIZE_fetch_compare_and_swap) # define AO_XSIZE_fetch_compare_and_swap_dd_acquire_read(addr,old_val,new_val) \ AO_XSIZE_fetch_compare_and_swap(addr, old_val, new_val) # define AO_HAVE_XSIZE_fetch_compare_and_swap_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* XSIZE_compare_and_swap */ #if defined(AO_HAVE_XSIZE_compare_and_swap) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_XSIZE_compare_and_swap_acquire) AO_INLINE int AO_XSIZE_compare_and_swap_acquire(volatile XCTYPE *addr, XCTYPE old, XCTYPE new_val) { int result = 
AO_XSIZE_compare_and_swap(addr, old, new_val); AO_nop_full(); return result; } # define AO_HAVE_XSIZE_compare_and_swap_acquire #endif #if defined(AO_HAVE_XSIZE_compare_and_swap) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_XSIZE_compare_and_swap_release) # define AO_XSIZE_compare_and_swap_release(addr, old, new_val) \ (AO_nop_full(), AO_XSIZE_compare_and_swap(addr, old, new_val)) # define AO_HAVE_XSIZE_compare_and_swap_release #endif #if defined(AO_HAVE_XSIZE_compare_and_swap_full) # if !defined(AO_HAVE_XSIZE_compare_and_swap_release) # define AO_XSIZE_compare_and_swap_release(addr, old, new_val) \ AO_XSIZE_compare_and_swap_full(addr, old, new_val) # define AO_HAVE_XSIZE_compare_and_swap_release # endif # if !defined(AO_HAVE_XSIZE_compare_and_swap_acquire) # define AO_XSIZE_compare_and_swap_acquire(addr, old, new_val) \ AO_XSIZE_compare_and_swap_full(addr, old, new_val) # define AO_HAVE_XSIZE_compare_and_swap_acquire # endif # if !defined(AO_HAVE_XSIZE_compare_and_swap_write) # define AO_XSIZE_compare_and_swap_write(addr, old, new_val) \ AO_XSIZE_compare_and_swap_full(addr, old, new_val) # define AO_HAVE_XSIZE_compare_and_swap_write # endif # if !defined(AO_HAVE_XSIZE_compare_and_swap_read) # define AO_XSIZE_compare_and_swap_read(addr, old, new_val) \ AO_XSIZE_compare_and_swap_full(addr, old, new_val) # define AO_HAVE_XSIZE_compare_and_swap_read # endif #endif /* AO_HAVE_XSIZE_compare_and_swap_full */ #if !defined(AO_HAVE_XSIZE_compare_and_swap) \ && defined(AO_HAVE_XSIZE_compare_and_swap_release) # define AO_XSIZE_compare_and_swap(addr, old, new_val) \ AO_XSIZE_compare_and_swap_release(addr, old, new_val) # define AO_HAVE_XSIZE_compare_and_swap #endif #if !defined(AO_HAVE_XSIZE_compare_and_swap) \ && defined(AO_HAVE_XSIZE_compare_and_swap_acquire) # define AO_XSIZE_compare_and_swap(addr, old, new_val) \ AO_XSIZE_compare_and_swap_acquire(addr, old, new_val) # define AO_HAVE_XSIZE_compare_and_swap #endif #if !defined(AO_HAVE_XSIZE_compare_and_swap) \ && 
defined(AO_HAVE_XSIZE_compare_and_swap_write) # define AO_XSIZE_compare_and_swap(addr, old, new_val) \ AO_XSIZE_compare_and_swap_write(addr, old, new_val) # define AO_HAVE_XSIZE_compare_and_swap #endif #if !defined(AO_HAVE_XSIZE_compare_and_swap) \ && defined(AO_HAVE_XSIZE_compare_and_swap_read) # define AO_XSIZE_compare_and_swap(addr, old, new_val) \ AO_XSIZE_compare_and_swap_read(addr, old, new_val) # define AO_HAVE_XSIZE_compare_and_swap #endif #if defined(AO_HAVE_XSIZE_compare_and_swap_acquire) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_XSIZE_compare_and_swap_full) # define AO_XSIZE_compare_and_swap_full(addr, old, new_val) \ (AO_nop_full(), \ AO_XSIZE_compare_and_swap_acquire(addr, old, new_val)) # define AO_HAVE_XSIZE_compare_and_swap_full #endif #if !defined(AO_HAVE_XSIZE_compare_and_swap_release_write) \ && defined(AO_HAVE_XSIZE_compare_and_swap_write) # define AO_XSIZE_compare_and_swap_release_write(addr, old, new_val) \ AO_XSIZE_compare_and_swap_write(addr, old, new_val) # define AO_HAVE_XSIZE_compare_and_swap_release_write #endif #if !defined(AO_HAVE_XSIZE_compare_and_swap_release_write) \ && defined(AO_HAVE_XSIZE_compare_and_swap_release) # define AO_XSIZE_compare_and_swap_release_write(addr, old, new_val) \ AO_XSIZE_compare_and_swap_release(addr, old, new_val) # define AO_HAVE_XSIZE_compare_and_swap_release_write #endif #if !defined(AO_HAVE_XSIZE_compare_and_swap_acquire_read) \ && defined(AO_HAVE_XSIZE_compare_and_swap_read) # define AO_XSIZE_compare_and_swap_acquire_read(addr, old, new_val) \ AO_XSIZE_compare_and_swap_read(addr, old, new_val) # define AO_HAVE_XSIZE_compare_and_swap_acquire_read #endif #if !defined(AO_HAVE_XSIZE_compare_and_swap_acquire_read) \ && defined(AO_HAVE_XSIZE_compare_and_swap_acquire) # define AO_XSIZE_compare_and_swap_acquire_read(addr, old, new_val) \ AO_XSIZE_compare_and_swap_acquire(addr, old, new_val) # define AO_HAVE_XSIZE_compare_and_swap_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if 
defined(AO_HAVE_XSIZE_compare_and_swap_acquire_read) # define AO_XSIZE_compare_and_swap_dd_acquire_read(addr, old, new_val) \ AO_XSIZE_compare_and_swap_acquire_read(addr, old, new_val) # define AO_HAVE_XSIZE_compare_and_swap_dd_acquire_read # endif #else # if defined(AO_HAVE_XSIZE_compare_and_swap) # define AO_XSIZE_compare_and_swap_dd_acquire_read(addr, old, new_val) \ AO_XSIZE_compare_and_swap(addr, old, new_val) # define AO_HAVE_XSIZE_compare_and_swap_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* XSIZE_load */ #if defined(AO_HAVE_XSIZE_load_full) && !defined(AO_HAVE_XSIZE_load_acquire) # define AO_XSIZE_load_acquire(addr) AO_XSIZE_load_full(addr) # define AO_HAVE_XSIZE_load_acquire #endif #if defined(AO_HAVE_XSIZE_load_acquire) && !defined(AO_HAVE_XSIZE_load) # define AO_XSIZE_load(addr) AO_XSIZE_load_acquire(addr) # define AO_HAVE_XSIZE_load #endif #if defined(AO_HAVE_XSIZE_load_full) && !defined(AO_HAVE_XSIZE_load_read) # define AO_XSIZE_load_read(addr) AO_XSIZE_load_full(addr) # define AO_HAVE_XSIZE_load_read #endif #if !defined(AO_HAVE_XSIZE_load_acquire_read) \ && defined(AO_HAVE_XSIZE_load_acquire) # define AO_XSIZE_load_acquire_read(addr) AO_XSIZE_load_acquire(addr) # define AO_HAVE_XSIZE_load_acquire_read #endif #if defined(AO_HAVE_XSIZE_load) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_XSIZE_load_acquire) AO_INLINE XCTYPE AO_XSIZE_load_acquire(const volatile XCTYPE *addr) { XCTYPE result = AO_XSIZE_load(addr); /* Acquire barrier would be useless, since the load could be delayed */ /* beyond it. 
*/ AO_nop_full(); return result; } # define AO_HAVE_XSIZE_load_acquire #endif #if defined(AO_HAVE_XSIZE_load) && defined(AO_HAVE_nop_read) \ && !defined(AO_HAVE_XSIZE_load_read) AO_INLINE XCTYPE AO_XSIZE_load_read(const volatile XCTYPE *addr) { XCTYPE result = AO_XSIZE_load(addr); AO_nop_read(); return result; } # define AO_HAVE_XSIZE_load_read #endif #if defined(AO_HAVE_XSIZE_load_acquire) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_XSIZE_load_full) # define AO_XSIZE_load_full(addr) (AO_nop_full(), AO_XSIZE_load_acquire(addr)) # define AO_HAVE_XSIZE_load_full #endif #if defined(AO_HAVE_XSIZE_compare_and_swap_read) \ && !defined(AO_HAVE_XSIZE_load_read) # define AO_XSIZE_CAS_BASED_LOAD_READ AO_ATTR_NO_SANITIZE_THREAD AO_INLINE XCTYPE AO_XSIZE_load_read(const volatile XCTYPE *addr) { XCTYPE result; do { result = *(const XCTYPE *)addr; } while (AO_EXPECT_FALSE(!AO_XSIZE_compare_and_swap_read( (volatile XCTYPE *)addr, result, result))); return result; } # define AO_HAVE_XSIZE_load_read #endif #if !defined(AO_HAVE_XSIZE_load_acquire_read) \ && defined(AO_HAVE_XSIZE_load_read) # define AO_XSIZE_load_acquire_read(addr) AO_XSIZE_load_read(addr) # define AO_HAVE_XSIZE_load_acquire_read #endif #if defined(AO_HAVE_XSIZE_load_acquire_read) && !defined(AO_HAVE_XSIZE_load) \ && (!defined(AO_XSIZE_CAS_BASED_LOAD_READ) \ || !defined(AO_HAVE_XSIZE_compare_and_swap)) # define AO_XSIZE_load(addr) AO_XSIZE_load_acquire_read(addr) # define AO_HAVE_XSIZE_load #endif #if defined(AO_HAVE_XSIZE_compare_and_swap_full) \ && !defined(AO_HAVE_XSIZE_load_full) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE XCTYPE AO_XSIZE_load_full(const volatile XCTYPE *addr) { XCTYPE result; do { result = *(const XCTYPE *)addr; } while (AO_EXPECT_FALSE(!AO_XSIZE_compare_and_swap_full( (volatile XCTYPE *)addr, result, result))); return result; } # define AO_HAVE_XSIZE_load_full #endif #if defined(AO_HAVE_XSIZE_compare_and_swap_acquire) \ && !defined(AO_HAVE_XSIZE_load_acquire) AO_ATTR_NO_SANITIZE_THREAD 
AO_INLINE XCTYPE AO_XSIZE_load_acquire(const volatile XCTYPE *addr) { XCTYPE result; do { result = *(const XCTYPE *)addr; } while (AO_EXPECT_FALSE(!AO_XSIZE_compare_and_swap_acquire( (volatile XCTYPE *)addr, result, result))); return result; } # define AO_HAVE_XSIZE_load_acquire #endif #if defined(AO_HAVE_XSIZE_compare_and_swap) && !defined(AO_HAVE_XSIZE_load) AO_ATTR_NO_SANITIZE_THREAD AO_INLINE XCTYPE AO_XSIZE_load(const volatile XCTYPE *addr) { XCTYPE result; do { result = *(const XCTYPE *)addr; } while (AO_EXPECT_FALSE(!AO_XSIZE_compare_and_swap( (volatile XCTYPE *)addr, result, result))); return result; } # define AO_HAVE_XSIZE_load #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_XSIZE_load_acquire_read) # define AO_XSIZE_load_dd_acquire_read(addr) \ AO_XSIZE_load_acquire_read(addr) # define AO_HAVE_XSIZE_load_dd_acquire_read # endif #else # if defined(AO_HAVE_XSIZE_load) # define AO_XSIZE_load_dd_acquire_read(addr) AO_XSIZE_load(addr) # define AO_HAVE_XSIZE_load_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* XSIZE_store */ #if defined(AO_HAVE_XSIZE_store_full) && !defined(AO_HAVE_XSIZE_store_release) # define AO_XSIZE_store_release(addr, val) AO_XSIZE_store_full(addr, val) # define AO_HAVE_XSIZE_store_release #endif #if defined(AO_HAVE_XSIZE_store_release) && !defined(AO_HAVE_XSIZE_store) # define AO_XSIZE_store(addr, val) AO_XSIZE_store_release(addr, val) # define AO_HAVE_XSIZE_store #endif #if defined(AO_HAVE_XSIZE_store_full) && !defined(AO_HAVE_XSIZE_store_write) # define AO_XSIZE_store_write(addr, val) AO_XSIZE_store_full(addr, val) # define AO_HAVE_XSIZE_store_write #endif #if defined(AO_HAVE_XSIZE_store_release) \ && !defined(AO_HAVE_XSIZE_store_release_write) # define AO_XSIZE_store_release_write(addr, val) \ AO_XSIZE_store_release(addr, val) # define AO_HAVE_XSIZE_store_release_write #endif #if defined(AO_HAVE_XSIZE_store_write) && !defined(AO_HAVE_XSIZE_store) # define AO_XSIZE_store(addr, val) AO_XSIZE_store_write(addr, val) # 
define AO_HAVE_XSIZE_store #endif #if defined(AO_HAVE_XSIZE_store) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_XSIZE_store_release) # define AO_XSIZE_store_release(addr, val) \ (AO_nop_full(), AO_XSIZE_store(addr, val)) # define AO_HAVE_XSIZE_store_release #endif #if defined(AO_HAVE_XSIZE_store) && defined(AO_HAVE_nop_write) \ && !defined(AO_HAVE_XSIZE_store_write) # define AO_XSIZE_store_write(addr, val) \ (AO_nop_write(), AO_XSIZE_store(addr, val)) # define AO_HAVE_XSIZE_store_write #endif #if defined(AO_HAVE_XSIZE_compare_and_swap_write) \ && !defined(AO_HAVE_XSIZE_store_write) AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_XSIZE_store_write(volatile XCTYPE *addr, XCTYPE new_val) { XCTYPE old_val; do { old_val = *(XCTYPE *)addr; } while (AO_EXPECT_FALSE(!AO_XSIZE_compare_and_swap_write(addr, old_val, new_val))); } # define AO_HAVE_XSIZE_store_write #endif #if defined(AO_HAVE_XSIZE_store_write) \ && !defined(AO_HAVE_XSIZE_store_release_write) # define AO_XSIZE_store_release_write(addr, val) \ AO_XSIZE_store_write(addr, val) # define AO_HAVE_XSIZE_store_release_write #endif #if defined(AO_HAVE_XSIZE_store_release) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_XSIZE_store_full) # define AO_XSIZE_store_full(addr, val) \ (AO_XSIZE_store_release(addr, val), \ AO_nop_full()) # define AO_HAVE_XSIZE_store_full #endif #if defined(AO_HAVE_XSIZE_compare_and_swap) && !defined(AO_HAVE_XSIZE_store) AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_XSIZE_store(volatile XCTYPE *addr, XCTYPE new_val) { XCTYPE old_val; do { old_val = *(XCTYPE *)addr; } while (AO_EXPECT_FALSE(!AO_XSIZE_compare_and_swap(addr, old_val, new_val))); } # define AO_HAVE_XSIZE_store #endif #if defined(AO_HAVE_XSIZE_compare_and_swap_release) \ && !defined(AO_HAVE_XSIZE_store_release) AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_XSIZE_store_release(volatile XCTYPE *addr, XCTYPE new_val) { XCTYPE old_val; do { old_val = 
*(XCTYPE *)addr; } while (AO_EXPECT_FALSE(!AO_XSIZE_compare_and_swap_release(addr, old_val, new_val))); } # define AO_HAVE_XSIZE_store_release #endif #if defined(AO_HAVE_XSIZE_compare_and_swap_full) \ && !defined(AO_HAVE_XSIZE_store_full) AO_ATTR_NO_SANITIZE_MEMORY AO_ATTR_NO_SANITIZE_THREAD AO_INLINE void AO_XSIZE_store_full(volatile XCTYPE *addr, XCTYPE new_val) { XCTYPE old_val; do { old_val = *(XCTYPE *)addr; } while (AO_EXPECT_FALSE(!AO_XSIZE_compare_and_swap_full(addr, old_val, new_val))); } # define AO_HAVE_XSIZE_store_full #endif papi-papi-7-2-0-t/src/atomic_ops/generalize.h000066400000000000000000000732061502707512200210720ustar00rootroot00000000000000/* * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* * Generalize atomic operations for atomic_ops.h. * Should not be included directly. 
* * We make no attempt to define useless operations, such as * AO_nop_acquire * AO_nop_release * * We have also so far neglected to define some others, which * do not appear likely to be useful, e.g. stores with acquire * or read barriers. * * This file is sometimes included twice by atomic_ops.h. * All definitions include explicit checks that we are not replacing * an earlier definition. In general, more desirable expansions * appear earlier so that we are more likely to use them. * * We only make safe generalizations, except that by default we define * the ...dd_acquire_read operations to be equivalent to those without * a barrier. On platforms for which this is unsafe, the platform-specific * file must define AO_NO_DD_ORDERING. */ #ifndef AO_ATOMIC_OPS_H # error This file should not be included directly. #endif /* Generate test_and_set_full, if necessary and possible. */ #if !defined(AO_HAVE_test_and_set) && !defined(AO_HAVE_test_and_set_release) \ && !defined(AO_HAVE_test_and_set_acquire) \ && !defined(AO_HAVE_test_and_set_read) \ && !defined(AO_HAVE_test_and_set_full) /* Emulate AO_compare_and_swap() via AO_fetch_compare_and_swap(). 
*/ # if defined(AO_HAVE_fetch_compare_and_swap) \ && !defined(AO_HAVE_compare_and_swap) AO_INLINE int AO_compare_and_swap(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return AO_fetch_compare_and_swap(addr, old_val, new_val) == old_val; } # define AO_HAVE_compare_and_swap # endif # if defined(AO_HAVE_fetch_compare_and_swap_full) \ && !defined(AO_HAVE_compare_and_swap_full) AO_INLINE int AO_compare_and_swap_full(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return AO_fetch_compare_and_swap_full(addr, old_val, new_val) == old_val; } # define AO_HAVE_compare_and_swap_full # endif # if defined(AO_HAVE_fetch_compare_and_swap_acquire) \ && !defined(AO_HAVE_compare_and_swap_acquire) AO_INLINE int AO_compare_and_swap_acquire(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return AO_fetch_compare_and_swap_acquire(addr, old_val, new_val) == old_val; } # define AO_HAVE_compare_and_swap_acquire # endif # if defined(AO_HAVE_fetch_compare_and_swap_release) \ && !defined(AO_HAVE_compare_and_swap_release) AO_INLINE int AO_compare_and_swap_release(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return AO_fetch_compare_and_swap_release(addr, old_val, new_val) == old_val; } # define AO_HAVE_compare_and_swap_release # endif # if defined(AO_CHAR_TS_T) # define AO_TS_COMPARE_AND_SWAP_FULL(a,o,n) \ AO_char_compare_and_swap_full(a,o,n) # define AO_TS_COMPARE_AND_SWAP_ACQUIRE(a,o,n) \ AO_char_compare_and_swap_acquire(a,o,n) # define AO_TS_COMPARE_AND_SWAP_RELEASE(a,o,n) \ AO_char_compare_and_swap_release(a,o,n) # define AO_TS_COMPARE_AND_SWAP(a,o,n) AO_char_compare_and_swap(a,o,n) # endif # if defined(AO_AO_TS_T) # define AO_TS_COMPARE_AND_SWAP_FULL(a,o,n) AO_compare_and_swap_full(a,o,n) # define AO_TS_COMPARE_AND_SWAP_ACQUIRE(a,o,n) \ AO_compare_and_swap_acquire(a,o,n) # define AO_TS_COMPARE_AND_SWAP_RELEASE(a,o,n) \ AO_compare_and_swap_release(a,o,n) # define AO_TS_COMPARE_AND_SWAP(a,o,n) AO_compare_and_swap(a,o,n) # endif # if (defined(AO_AO_TS_T) && 
defined(AO_HAVE_compare_and_swap_full)) \ || (defined(AO_CHAR_TS_T) && defined(AO_HAVE_char_compare_and_swap_full)) AO_INLINE AO_TS_VAL_t AO_test_and_set_full(volatile AO_TS_t *addr) { if (AO_TS_COMPARE_AND_SWAP_FULL(addr, AO_TS_CLEAR, AO_TS_SET)) return AO_TS_CLEAR; else return AO_TS_SET; } # define AO_HAVE_test_and_set_full # endif /* AO_HAVE_compare_and_swap_full */ # if (defined(AO_AO_TS_T) && defined(AO_HAVE_compare_and_swap_acquire)) \ || (defined(AO_CHAR_TS_T) \ && defined(AO_HAVE_char_compare_and_swap_acquire)) AO_INLINE AO_TS_VAL_t AO_test_and_set_acquire(volatile AO_TS_t *addr) { if (AO_TS_COMPARE_AND_SWAP_ACQUIRE(addr, AO_TS_CLEAR, AO_TS_SET)) return AO_TS_CLEAR; else return AO_TS_SET; } # define AO_HAVE_test_and_set_acquire # endif /* AO_HAVE_compare_and_swap_acquire */ # if (defined(AO_AO_TS_T) && defined(AO_HAVE_compare_and_swap_release)) \ || (defined(AO_CHAR_TS_T) \ && defined(AO_HAVE_char_compare_and_swap_release)) AO_INLINE AO_TS_VAL_t AO_test_and_set_release(volatile AO_TS_t *addr) { if (AO_TS_COMPARE_AND_SWAP_RELEASE(addr, AO_TS_CLEAR, AO_TS_SET)) return AO_TS_CLEAR; else return AO_TS_SET; } # define AO_HAVE_test_and_set_release # endif /* AO_HAVE_compare_and_swap_release */ # if (defined(AO_AO_TS_T) && defined(AO_HAVE_compare_and_swap)) \ || (defined(AO_CHAR_TS_T) && defined(AO_HAVE_char_compare_and_swap)) AO_INLINE AO_TS_VAL_t AO_test_and_set(volatile AO_TS_t *addr) { if (AO_TS_COMPARE_AND_SWAP(addr, AO_TS_CLEAR, AO_TS_SET)) return AO_TS_CLEAR; else return AO_TS_SET; } # define AO_HAVE_test_and_set # endif /* AO_HAVE_compare_and_swap */ #endif /* No prior test and set */ /* Nop */ #if !defined(AO_HAVE_nop) AO_INLINE void AO_nop(void) {} # define AO_HAVE_nop #endif #if defined(AO_HAVE_test_and_set_full) && !defined(AO_HAVE_nop_full) AO_INLINE void AO_nop_full(void) { AO_TS_t dummy = AO_TS_INITIALIZER; AO_test_and_set_full(&dummy); } # define AO_HAVE_nop_full #endif #if defined(AO_HAVE_nop_acquire) && !defined(CPPCHECK) # error AO_nop_acquire is 
useless: do not define. #endif #if defined(AO_HAVE_nop_release) && !defined(CPPCHECK) # error AO_nop_release is useless: do not define. #endif #if defined(AO_HAVE_nop_full) && !defined(AO_HAVE_nop_read) # define AO_nop_read() AO_nop_full() # define AO_HAVE_nop_read #endif #if defined(AO_HAVE_nop_full) && !defined(AO_HAVE_nop_write) # define AO_nop_write() AO_nop_full() # define AO_HAVE_nop_write #endif /* Test_and_set */ #if defined(AO_HAVE_test_and_set) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_test_and_set_release) # define AO_test_and_set_release(addr) (AO_nop_full(), AO_test_and_set(addr)) # define AO_HAVE_test_and_set_release #endif #if defined(AO_HAVE_test_and_set) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_test_and_set_acquire) AO_INLINE AO_TS_VAL_t AO_test_and_set_acquire(volatile AO_TS_t *addr) { AO_TS_VAL_t result = AO_test_and_set(addr); AO_nop_full(); return result; } # define AO_HAVE_test_and_set_acquire #endif #if defined(AO_HAVE_test_and_set_full) # if !defined(AO_HAVE_test_and_set_release) # define AO_test_and_set_release(addr) AO_test_and_set_full(addr) # define AO_HAVE_test_and_set_release # endif # if !defined(AO_HAVE_test_and_set_acquire) # define AO_test_and_set_acquire(addr) AO_test_and_set_full(addr) # define AO_HAVE_test_and_set_acquire # endif # if !defined(AO_HAVE_test_and_set_write) # define AO_test_and_set_write(addr) AO_test_and_set_full(addr) # define AO_HAVE_test_and_set_write # endif # if !defined(AO_HAVE_test_and_set_read) # define AO_test_and_set_read(addr) AO_test_and_set_full(addr) # define AO_HAVE_test_and_set_read # endif #endif /* AO_HAVE_test_and_set_full */ #if !defined(AO_HAVE_test_and_set) && defined(AO_HAVE_test_and_set_release) # define AO_test_and_set(addr) AO_test_and_set_release(addr) # define AO_HAVE_test_and_set #endif #if !defined(AO_HAVE_test_and_set) && defined(AO_HAVE_test_and_set_acquire) # define AO_test_and_set(addr) AO_test_and_set_acquire(addr) # define AO_HAVE_test_and_set #endif #if 
!defined(AO_HAVE_test_and_set) && defined(AO_HAVE_test_and_set_write) # define AO_test_and_set(addr) AO_test_and_set_write(addr) # define AO_HAVE_test_and_set #endif #if !defined(AO_HAVE_test_and_set) && defined(AO_HAVE_test_and_set_read) # define AO_test_and_set(addr) AO_test_and_set_read(addr) # define AO_HAVE_test_and_set #endif #if defined(AO_HAVE_test_and_set_acquire) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_test_and_set_full) # define AO_test_and_set_full(addr) \ (AO_nop_full(), AO_test_and_set_acquire(addr)) # define AO_HAVE_test_and_set_full #endif #if !defined(AO_HAVE_test_and_set_release_write) \ && defined(AO_HAVE_test_and_set_write) # define AO_test_and_set_release_write(addr) AO_test_and_set_write(addr) # define AO_HAVE_test_and_set_release_write #endif #if !defined(AO_HAVE_test_and_set_release_write) \ && defined(AO_HAVE_test_and_set_release) # define AO_test_and_set_release_write(addr) AO_test_and_set_release(addr) # define AO_HAVE_test_and_set_release_write #endif #if !defined(AO_HAVE_test_and_set_acquire_read) \ && defined(AO_HAVE_test_and_set_read) # define AO_test_and_set_acquire_read(addr) AO_test_and_set_read(addr) # define AO_HAVE_test_and_set_acquire_read #endif #if !defined(AO_HAVE_test_and_set_acquire_read) \ && defined(AO_HAVE_test_and_set_acquire) # define AO_test_and_set_acquire_read(addr) AO_test_and_set_acquire(addr) # define AO_HAVE_test_and_set_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_test_and_set_acquire_read) # define AO_test_and_set_dd_acquire_read(addr) \ AO_test_and_set_acquire_read(addr) # define AO_HAVE_test_and_set_dd_acquire_read # endif #else # if defined(AO_HAVE_test_and_set) # define AO_test_and_set_dd_acquire_read(addr) AO_test_and_set(addr) # define AO_HAVE_test_and_set_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ #include "generalize-small.h" #include "generalize-arithm.h" /* Compare_double_and_swap_double based on double_compare_and_swap. 
*/ #ifdef AO_HAVE_DOUBLE_PTR_STORAGE # if defined(AO_HAVE_double_compare_and_swap) \ && !defined(AO_HAVE_compare_double_and_swap_double) AO_INLINE int AO_compare_double_and_swap_double(volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2) { AO_double_t old_w; AO_double_t new_w; old_w.AO_val1 = old_val1; old_w.AO_val2 = old_val2; new_w.AO_val1 = new_val1; new_w.AO_val2 = new_val2; return AO_double_compare_and_swap(addr, old_w, new_w); } # define AO_HAVE_compare_double_and_swap_double # endif # if defined(AO_HAVE_double_compare_and_swap_acquire) \ && !defined(AO_HAVE_compare_double_and_swap_double_acquire) AO_INLINE int AO_compare_double_and_swap_double_acquire(volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2) { AO_double_t old_w; AO_double_t new_w; old_w.AO_val1 = old_val1; old_w.AO_val2 = old_val2; new_w.AO_val1 = new_val1; new_w.AO_val2 = new_val2; return AO_double_compare_and_swap_acquire(addr, old_w, new_w); } # define AO_HAVE_compare_double_and_swap_double_acquire # endif # if defined(AO_HAVE_double_compare_and_swap_release) \ && !defined(AO_HAVE_compare_double_and_swap_double_release) AO_INLINE int AO_compare_double_and_swap_double_release(volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2) { AO_double_t old_w; AO_double_t new_w; old_w.AO_val1 = old_val1; old_w.AO_val2 = old_val2; new_w.AO_val1 = new_val1; new_w.AO_val2 = new_val2; return AO_double_compare_and_swap_release(addr, old_w, new_w); } # define AO_HAVE_compare_double_and_swap_double_release # endif # if defined(AO_HAVE_double_compare_and_swap_full) \ && !defined(AO_HAVE_compare_double_and_swap_double_full) AO_INLINE int AO_compare_double_and_swap_double_full(volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2) { AO_double_t old_w; AO_double_t new_w; old_w.AO_val1 = old_val1; old_w.AO_val2 = old_val2; new_w.AO_val1 = new_val1; new_w.AO_val2 = new_val2; return 
AO_double_compare_and_swap_full(addr, old_w, new_w); } # define AO_HAVE_compare_double_and_swap_double_full # endif #endif /* AO_HAVE_DOUBLE_PTR_STORAGE */ /* Compare_double_and_swap_double */ #if defined(AO_HAVE_compare_double_and_swap_double) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_compare_double_and_swap_double_acquire) AO_INLINE int AO_compare_double_and_swap_double_acquire(volatile AO_double_t *addr, AO_t o1, AO_t o2, AO_t n1, AO_t n2) { int result = AO_compare_double_and_swap_double(addr, o1, o2, n1, n2); AO_nop_full(); return result; } # define AO_HAVE_compare_double_and_swap_double_acquire #endif #if defined(AO_HAVE_compare_double_and_swap_double) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_compare_double_and_swap_double_release) # define AO_compare_double_and_swap_double_release(addr,o1,o2,n1,n2) \ (AO_nop_full(), AO_compare_double_and_swap_double(addr,o1,o2,n1,n2)) # define AO_HAVE_compare_double_and_swap_double_release #endif #if defined(AO_HAVE_compare_double_and_swap_double_full) # if !defined(AO_HAVE_compare_double_and_swap_double_release) # define AO_compare_double_and_swap_double_release(addr,o1,o2,n1,n2) \ AO_compare_double_and_swap_double_full(addr,o1,o2,n1,n2) # define AO_HAVE_compare_double_and_swap_double_release # endif # if !defined(AO_HAVE_compare_double_and_swap_double_acquire) # define AO_compare_double_and_swap_double_acquire(addr,o1,o2,n1,n2) \ AO_compare_double_and_swap_double_full(addr,o1,o2,n1,n2) # define AO_HAVE_compare_double_and_swap_double_acquire # endif # if !defined(AO_HAVE_compare_double_and_swap_double_write) # define AO_compare_double_and_swap_double_write(addr,o1,o2,n1,n2) \ AO_compare_double_and_swap_double_full(addr,o1,o2,n1,n2) # define AO_HAVE_compare_double_and_swap_double_write # endif # if !defined(AO_HAVE_compare_double_and_swap_double_read) # define AO_compare_double_and_swap_double_read(addr,o1,o2,n1,n2) \ AO_compare_double_and_swap_double_full(addr,o1,o2,n1,n2) # define 
AO_HAVE_compare_double_and_swap_double_read # endif #endif /* AO_HAVE_compare_double_and_swap_double_full */ #if !defined(AO_HAVE_compare_double_and_swap_double) \ && defined(AO_HAVE_compare_double_and_swap_double_release) # define AO_compare_double_and_swap_double(addr,o1,o2,n1,n2) \ AO_compare_double_and_swap_double_release(addr,o1,o2,n1,n2) # define AO_HAVE_compare_double_and_swap_double #endif #if !defined(AO_HAVE_compare_double_and_swap_double) \ && defined(AO_HAVE_compare_double_and_swap_double_acquire) # define AO_compare_double_and_swap_double(addr,o1,o2,n1,n2) \ AO_compare_double_and_swap_double_acquire(addr,o1,o2,n1,n2) # define AO_HAVE_compare_double_and_swap_double #endif #if !defined(AO_HAVE_compare_double_and_swap_double) \ && defined(AO_HAVE_compare_double_and_swap_double_write) # define AO_compare_double_and_swap_double(addr,o1,o2,n1,n2) \ AO_compare_double_and_swap_double_write(addr,o1,o2,n1,n2) # define AO_HAVE_compare_double_and_swap_double #endif #if !defined(AO_HAVE_compare_double_and_swap_double) \ && defined(AO_HAVE_compare_double_and_swap_double_read) # define AO_compare_double_and_swap_double(addr,o1,o2,n1,n2) \ AO_compare_double_and_swap_double_read(addr,o1,o2,n1,n2) # define AO_HAVE_compare_double_and_swap_double #endif #if defined(AO_HAVE_compare_double_and_swap_double_acquire) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_compare_double_and_swap_double_full) # define AO_compare_double_and_swap_double_full(addr,o1,o2,n1,n2) \ (AO_nop_full(), \ AO_compare_double_and_swap_double_acquire(addr,o1,o2,n1,n2)) # define AO_HAVE_compare_double_and_swap_double_full #endif #if !defined(AO_HAVE_compare_double_and_swap_double_release_write) \ && defined(AO_HAVE_compare_double_and_swap_double_write) # define AO_compare_double_and_swap_double_release_write(addr,o1,o2,n1,n2) \ AO_compare_double_and_swap_double_write(addr,o1,o2,n1,n2) # define AO_HAVE_compare_double_and_swap_double_release_write #endif #if 
!defined(AO_HAVE_compare_double_and_swap_double_release_write) \ && defined(AO_HAVE_compare_double_and_swap_double_release) # define AO_compare_double_and_swap_double_release_write(addr,o1,o2,n1,n2) \ AO_compare_double_and_swap_double_release(addr,o1,o2,n1,n2) # define AO_HAVE_compare_double_and_swap_double_release_write #endif #if !defined(AO_HAVE_compare_double_and_swap_double_acquire_read) \ && defined(AO_HAVE_compare_double_and_swap_double_read) # define AO_compare_double_and_swap_double_acquire_read(addr,o1,o2,n1,n2) \ AO_compare_double_and_swap_double_read(addr,o1,o2,n1,n2) # define AO_HAVE_compare_double_and_swap_double_acquire_read #endif #if !defined(AO_HAVE_compare_double_and_swap_double_acquire_read) \ && defined(AO_HAVE_compare_double_and_swap_double_acquire) # define AO_compare_double_and_swap_double_acquire_read(addr,o1,o2,n1,n2) \ AO_compare_double_and_swap_double_acquire(addr,o1,o2,n1,n2) # define AO_HAVE_compare_double_and_swap_double_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_compare_double_and_swap_double_acquire_read) # define AO_compare_double_and_swap_double_dd_acquire_read(addr,o1,o2,n1,n2) \ AO_compare_double_and_swap_double_acquire_read(addr,o1,o2,n1,n2) # define AO_HAVE_compare_double_and_swap_double_dd_acquire_read # endif #else # if defined(AO_HAVE_compare_double_and_swap_double) # define AO_compare_double_and_swap_double_dd_acquire_read(addr,o1,o2,n1,n2) \ AO_compare_double_and_swap_double(addr,o1,o2,n1,n2) # define AO_HAVE_compare_double_and_swap_double_dd_acquire_read # endif #endif /* !AO_NO_DD_ORDERING */ /* Compare_and_swap_double */ #if defined(AO_HAVE_compare_and_swap_double) && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_compare_and_swap_double_acquire) AO_INLINE int AO_compare_and_swap_double_acquire(volatile AO_double_t *addr, AO_t o1, AO_t n1, AO_t n2) { int result = AO_compare_and_swap_double(addr, o1, n1, n2); AO_nop_full(); return result; } # define AO_HAVE_compare_and_swap_double_acquire #endif 
#if defined(AO_HAVE_compare_and_swap_double) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_compare_and_swap_double_release) # define AO_compare_and_swap_double_release(addr,o1,n1,n2) \ (AO_nop_full(), AO_compare_and_swap_double(addr,o1,n1,n2)) # define AO_HAVE_compare_and_swap_double_release #endif #if defined(AO_HAVE_compare_and_swap_double_full) # if !defined(AO_HAVE_compare_and_swap_double_release) # define AO_compare_and_swap_double_release(addr,o1,n1,n2) \ AO_compare_and_swap_double_full(addr,o1,n1,n2) # define AO_HAVE_compare_and_swap_double_release # endif # if !defined(AO_HAVE_compare_and_swap_double_acquire) # define AO_compare_and_swap_double_acquire(addr,o1,n1,n2) \ AO_compare_and_swap_double_full(addr,o1,n1,n2) # define AO_HAVE_compare_and_swap_double_acquire # endif # if !defined(AO_HAVE_compare_and_swap_double_write) # define AO_compare_and_swap_double_write(addr,o1,n1,n2) \ AO_compare_and_swap_double_full(addr,o1,n1,n2) # define AO_HAVE_compare_and_swap_double_write # endif # if !defined(AO_HAVE_compare_and_swap_double_read) # define AO_compare_and_swap_double_read(addr,o1,n1,n2) \ AO_compare_and_swap_double_full(addr,o1,n1,n2) # define AO_HAVE_compare_and_swap_double_read # endif #endif /* AO_HAVE_compare_and_swap_double_full */ #if !defined(AO_HAVE_compare_and_swap_double) \ && defined(AO_HAVE_compare_and_swap_double_release) # define AO_compare_and_swap_double(addr,o1,n1,n2) \ AO_compare_and_swap_double_release(addr,o1,n1,n2) # define AO_HAVE_compare_and_swap_double #endif #if !defined(AO_HAVE_compare_and_swap_double) \ && defined(AO_HAVE_compare_and_swap_double_acquire) # define AO_compare_and_swap_double(addr,o1,n1,n2) \ AO_compare_and_swap_double_acquire(addr,o1,n1,n2) # define AO_HAVE_compare_and_swap_double #endif #if !defined(AO_HAVE_compare_and_swap_double) \ && defined(AO_HAVE_compare_and_swap_double_write) # define AO_compare_and_swap_double(addr,o1,n1,n2) \ AO_compare_and_swap_double_write(addr,o1,n1,n2) # define 
AO_HAVE_compare_and_swap_double #endif #if !defined(AO_HAVE_compare_and_swap_double) \ && defined(AO_HAVE_compare_and_swap_double_read) # define AO_compare_and_swap_double(addr,o1,n1,n2) \ AO_compare_and_swap_double_read(addr,o1,n1,n2) # define AO_HAVE_compare_and_swap_double #endif #if defined(AO_HAVE_compare_and_swap_double_acquire) \ && defined(AO_HAVE_nop_full) \ && !defined(AO_HAVE_compare_and_swap_double_full) # define AO_compare_and_swap_double_full(addr,o1,n1,n2) \ (AO_nop_full(), AO_compare_and_swap_double_acquire(addr,o1,n1,n2)) # define AO_HAVE_compare_and_swap_double_full #endif #if !defined(AO_HAVE_compare_and_swap_double_release_write) \ && defined(AO_HAVE_compare_and_swap_double_write) # define AO_compare_and_swap_double_release_write(addr,o1,n1,n2) \ AO_compare_and_swap_double_write(addr,o1,n1,n2) # define AO_HAVE_compare_and_swap_double_release_write #endif #if !defined(AO_HAVE_compare_and_swap_double_release_write) \ && defined(AO_HAVE_compare_and_swap_double_release) # define AO_compare_and_swap_double_release_write(addr,o1,n1,n2) \ AO_compare_and_swap_double_release(addr,o1,n1,n2) # define AO_HAVE_compare_and_swap_double_release_write #endif #if !defined(AO_HAVE_compare_and_swap_double_acquire_read) \ && defined(AO_HAVE_compare_and_swap_double_read) # define AO_compare_and_swap_double_acquire_read(addr,o1,n1,n2) \ AO_compare_and_swap_double_read(addr,o1,n1,n2) # define AO_HAVE_compare_and_swap_double_acquire_read #endif #if !defined(AO_HAVE_compare_and_swap_double_acquire_read) \ && defined(AO_HAVE_compare_and_swap_double_acquire) # define AO_compare_and_swap_double_acquire_read(addr,o1,n1,n2) \ AO_compare_and_swap_double_acquire(addr,o1,n1,n2) # define AO_HAVE_compare_and_swap_double_acquire_read #endif #ifdef AO_NO_DD_ORDERING # if defined(AO_HAVE_compare_and_swap_double_acquire_read) # define AO_compare_and_swap_double_dd_acquire_read(addr,o1,n1,n2) \ AO_compare_and_swap_double_acquire_read(addr,o1,n1,n2) # define 
AO_HAVE_compare_and_swap_double_dd_acquire_read # endif #else # if defined(AO_HAVE_compare_and_swap_double) # define AO_compare_and_swap_double_dd_acquire_read(addr,o1,n1,n2) \ AO_compare_and_swap_double(addr,o1,n1,n2) # define AO_HAVE_compare_and_swap_double_dd_acquire_read # endif #endif /* Convenience functions for AO_double compare-and-swap which types and */ /* reads easier in code. */ #if defined(AO_HAVE_compare_double_and_swap_double) \ && !defined(AO_HAVE_double_compare_and_swap) AO_INLINE int AO_double_compare_and_swap(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { return AO_compare_double_and_swap_double(addr, old_val.AO_val1, old_val.AO_val2, new_val.AO_val1, new_val.AO_val2); } # define AO_HAVE_double_compare_and_swap #endif #if defined(AO_HAVE_compare_double_and_swap_double_release) \ && !defined(AO_HAVE_double_compare_and_swap_release) AO_INLINE int AO_double_compare_and_swap_release(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { return AO_compare_double_and_swap_double_release(addr, old_val.AO_val1, old_val.AO_val2, new_val.AO_val1, new_val.AO_val2); } # define AO_HAVE_double_compare_and_swap_release #endif #if defined(AO_HAVE_compare_double_and_swap_double_acquire) \ && !defined(AO_HAVE_double_compare_and_swap_acquire) AO_INLINE int AO_double_compare_and_swap_acquire(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { return AO_compare_double_and_swap_double_acquire(addr, old_val.AO_val1, old_val.AO_val2, new_val.AO_val1, new_val.AO_val2); } # define AO_HAVE_double_compare_and_swap_acquire #endif #if defined(AO_HAVE_compare_double_and_swap_double_read) \ && !defined(AO_HAVE_double_compare_and_swap_read) AO_INLINE int AO_double_compare_and_swap_read(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { return AO_compare_double_and_swap_double_read(addr, old_val.AO_val1, old_val.AO_val2, new_val.AO_val1, new_val.AO_val2); } # define 
AO_HAVE_double_compare_and_swap_read #endif #if defined(AO_HAVE_compare_double_and_swap_double_write) \ && !defined(AO_HAVE_double_compare_and_swap_write) AO_INLINE int AO_double_compare_and_swap_write(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { return AO_compare_double_and_swap_double_write(addr, old_val.AO_val1, old_val.AO_val2, new_val.AO_val1, new_val.AO_val2); } # define AO_HAVE_double_compare_and_swap_write #endif #if defined(AO_HAVE_compare_double_and_swap_double_release_write) \ && !defined(AO_HAVE_double_compare_and_swap_release_write) AO_INLINE int AO_double_compare_and_swap_release_write(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { return AO_compare_double_and_swap_double_release_write(addr, old_val.AO_val1, old_val.AO_val2, new_val.AO_val1, new_val.AO_val2); } # define AO_HAVE_double_compare_and_swap_release_write #endif #if defined(AO_HAVE_compare_double_and_swap_double_acquire_read) \ && !defined(AO_HAVE_double_compare_and_swap_acquire_read) AO_INLINE int AO_double_compare_and_swap_acquire_read(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { return AO_compare_double_and_swap_double_acquire_read(addr, old_val.AO_val1, old_val.AO_val2, new_val.AO_val1, new_val.AO_val2); } # define AO_HAVE_double_compare_and_swap_acquire_read #endif #if defined(AO_HAVE_compare_double_and_swap_double_full) \ && !defined(AO_HAVE_double_compare_and_swap_full) AO_INLINE int AO_double_compare_and_swap_full(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { return AO_compare_double_and_swap_double_full(addr, old_val.AO_val1, old_val.AO_val2, new_val.AO_val1, new_val.AO_val2); } # define AO_HAVE_double_compare_and_swap_full #endif #ifndef AO_HAVE_double_compare_and_swap_dd_acquire_read /* Duplicated from generalize-small because double CAS might be */ /* defined after the include. 
*/
# ifdef AO_NO_DD_ORDERING
#   if defined(AO_HAVE_double_compare_and_swap_acquire_read)
#     define AO_double_compare_and_swap_dd_acquire_read(addr, old, new_val) \
                AO_double_compare_and_swap_acquire_read(addr, old, new_val)
#     define AO_HAVE_double_compare_and_swap_dd_acquire_read
#   endif
# elif defined(AO_HAVE_double_compare_and_swap)
#   define AO_double_compare_and_swap_dd_acquire_read(addr, old, new_val) \
                AO_double_compare_and_swap(addr, old, new_val)
#   define AO_HAVE_double_compare_and_swap_dd_acquire_read
# endif /* !AO_NO_DD_ORDERING */
#endif

==== src/atomic_ops/sysdeps/README ====

There are two kinds of entities in this directory:

- Subdirectories corresponding to specific compilers (or compiler/OS
  combinations).  Each of these includes one or more architecture-specific
  headers.

- More generic header files corresponding to a particular ordering and/or
  atomicity property that might be shared by multiple hardware platforms.

==== src/atomic_ops/sysdeps/all_acquire_release_volatile.h ====

/*
 * Copyright (c) 2004 Hewlett-Packard Development Company, L.P.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

/* Describes architectures on which volatile AO_t, unsigned char,      */
/* unsigned short, and unsigned int loads and stores have              */
/* acquire/release semantics for all normally legal alignments.        */
#include "loadstore/acquire_release_volatile.h"
#include "loadstore/char_acquire_release_volatile.h"
#include "loadstore/short_acquire_release_volatile.h"
#include "loadstore/int_acquire_release_volatile.h"

==== src/atomic_ops/sysdeps/all_aligned_atomic_load_store.h ====

/* Copyright (c) 2004 Hewlett-Packard Development Company, L.P.        */
/* Distributed under the MIT license reproduced above.                 */

/* Describes architectures on which AO_t, unsigned char, unsigned       */
/* short, and unsigned int loads and stores are atomic but only if data */
/* is suitably aligned.                                                 */
#if defined(__m68k__) && !defined(AO_ALIGNOF_SUPPORTED)
  /* Even though AO_t is redefined in m68k.h, some clients use AO        */
  /* pointer size primitives to access variables not declared as AO_t.  */
  /* Such variables may have 2-byte alignment, while their sizeof is 4. */
#else
# define AO_ACCESS_CHECK_ALIGNED
#endif
/* Check for char type is a misnomer. */
#define AO_ACCESS_short_CHECK_ALIGNED
#define AO_ACCESS_int_CHECK_ALIGNED
#include "all_atomic_load_store.h"

==== src/atomic_ops/sysdeps/all_atomic_load_store.h ====

/* Copyright (c) 2004 Hewlett-Packard Development Company, L.P.        */
/* Distributed under the MIT license reproduced above.                 */

/* Describes architectures on which AO_t, unsigned char, unsigned       */
/* short, and unsigned int loads and stores are atomic for all normally */
/* legal alignments.                                                    */
#include "all_atomic_only_load.h"
#include "loadstore/atomic_store.h"
#include "loadstore/char_atomic_store.h"
#include "loadstore/short_atomic_store.h"
#include "loadstore/int_atomic_store.h"

==== src/atomic_ops/sysdeps/all_atomic_only_load.h ====

/* Copyright (c) 2004 Hewlett-Packard Development Company, L.P.        */
/* Distributed under the MIT license reproduced above.                 */

/* Describes architectures on which AO_t, unsigned char, unsigned      */
/* short, and unsigned int loads are atomic for all normally legal     */
/* alignments.                                                         */
#include "loadstore/atomic_load.h"
#include "loadstore/char_atomic_load.h"
#include "loadstore/short_atomic_load.h"
#include "loadstore/int_atomic_load.h"

==== src/atomic_ops/sysdeps/ao_t_is_int.h ====

/* Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P.   */
/* Distributed under the MIT license reproduced above.                 */

/* Inclusion of this file signifies that AO_t is in fact int.          */
/* Hence any AO_... operation can also serve as AO_int_... operation.
*/ #if defined(AO_HAVE_load) && !defined(AO_HAVE_int_load) # define AO_int_load(addr) \ (unsigned)AO_load((const volatile AO_t *)(addr)) # define AO_HAVE_int_load #endif #if defined(AO_HAVE_store) && !defined(AO_HAVE_int_store) # define AO_int_store(addr, val) \ AO_store((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_store #endif #if defined(AO_HAVE_fetch_and_add) \ && !defined(AO_HAVE_int_fetch_and_add) # define AO_int_fetch_and_add(addr, incr) \ (unsigned)AO_fetch_and_add((volatile AO_t *)(addr), \ (AO_t)(incr)) # define AO_HAVE_int_fetch_and_add #endif #if defined(AO_HAVE_fetch_and_add1) \ && !defined(AO_HAVE_int_fetch_and_add1) # define AO_int_fetch_and_add1(addr) \ (unsigned)AO_fetch_and_add1((volatile AO_t *)(addr)) # define AO_HAVE_int_fetch_and_add1 #endif #if defined(AO_HAVE_fetch_and_sub1) \ && !defined(AO_HAVE_int_fetch_and_sub1) # define AO_int_fetch_and_sub1(addr) \ (unsigned)AO_fetch_and_sub1((volatile AO_t *)(addr)) # define AO_HAVE_int_fetch_and_sub1 #endif #if defined(AO_HAVE_and) && !defined(AO_HAVE_int_and) # define AO_int_and(addr, val) \ AO_and((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_and #endif #if defined(AO_HAVE_or) && !defined(AO_HAVE_int_or) # define AO_int_or(addr, val) \ AO_or((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_or #endif #if defined(AO_HAVE_xor) && !defined(AO_HAVE_int_xor) # define AO_int_xor(addr, val) \ AO_xor((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_xor #endif #if defined(AO_HAVE_fetch_compare_and_swap) \ && !defined(AO_HAVE_int_fetch_compare_and_swap) # define AO_int_fetch_compare_and_swap(addr, old, new_val) \ (unsigned)AO_fetch_compare_and_swap((volatile AO_t *)(addr), \ (AO_t)(old), (AO_t)(new_val)) # define AO_HAVE_int_fetch_compare_and_swap #endif #if defined(AO_HAVE_compare_and_swap) \ && !defined(AO_HAVE_int_compare_and_swap) # define AO_int_compare_and_swap(addr, old, new_val) \ AO_compare_and_swap((volatile AO_t *)(addr), \ (AO_t)(old), (AO_t)(new_val)) 
# define AO_HAVE_int_compare_and_swap #endif /* * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Inclusion of this file signifies that AO_t is in fact int. */ /* Hence any AO_... operation can also serve as AO_int_... operation. 
*/ #if defined(AO_HAVE_load_full) && !defined(AO_HAVE_int_load_full) # define AO_int_load_full(addr) \ (unsigned)AO_load_full((const volatile AO_t *)(addr)) # define AO_HAVE_int_load_full #endif #if defined(AO_HAVE_store_full) && !defined(AO_HAVE_int_store_full) # define AO_int_store_full(addr, val) \ AO_store_full((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_store_full #endif #if defined(AO_HAVE_fetch_and_add_full) \ && !defined(AO_HAVE_int_fetch_and_add_full) # define AO_int_fetch_and_add_full(addr, incr) \ (unsigned)AO_fetch_and_add_full((volatile AO_t *)(addr), \ (AO_t)(incr)) # define AO_HAVE_int_fetch_and_add_full #endif #if defined(AO_HAVE_fetch_and_add1_full) \ && !defined(AO_HAVE_int_fetch_and_add1_full) # define AO_int_fetch_and_add1_full(addr) \ (unsigned)AO_fetch_and_add1_full((volatile AO_t *)(addr)) # define AO_HAVE_int_fetch_and_add1_full #endif #if defined(AO_HAVE_fetch_and_sub1_full) \ && !defined(AO_HAVE_int_fetch_and_sub1_full) # define AO_int_fetch_and_sub1_full(addr) \ (unsigned)AO_fetch_and_sub1_full((volatile AO_t *)(addr)) # define AO_HAVE_int_fetch_and_sub1_full #endif #if defined(AO_HAVE_and_full) && !defined(AO_HAVE_int_and_full) # define AO_int_and_full(addr, val) \ AO_and_full((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_and_full #endif #if defined(AO_HAVE_or_full) && !defined(AO_HAVE_int_or_full) # define AO_int_or_full(addr, val) \ AO_or_full((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_or_full #endif #if defined(AO_HAVE_xor_full) && !defined(AO_HAVE_int_xor_full) # define AO_int_xor_full(addr, val) \ AO_xor_full((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_xor_full #endif #if defined(AO_HAVE_fetch_compare_and_swap_full) \ && !defined(AO_HAVE_int_fetch_compare_and_swap_full) # define AO_int_fetch_compare_and_swap_full(addr, old, new_val) \ (unsigned)AO_fetch_compare_and_swap_full((volatile AO_t *)(addr), \ (AO_t)(old), (AO_t)(new_val)) # define 
AO_HAVE_int_fetch_compare_and_swap_full #endif #if defined(AO_HAVE_compare_and_swap_full) \ && !defined(AO_HAVE_int_compare_and_swap_full) # define AO_int_compare_and_swap_full(addr, old, new_val) \ AO_compare_and_swap_full((volatile AO_t *)(addr), \ (AO_t)(old), (AO_t)(new_val)) # define AO_HAVE_int_compare_and_swap_full #endif /* * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Inclusion of this file signifies that AO_t is in fact int. */ /* Hence any AO_... operation can also serve as AO_int_... operation. 
*/ #if defined(AO_HAVE_load_acquire) && !defined(AO_HAVE_int_load_acquire) # define AO_int_load_acquire(addr) \ (unsigned)AO_load_acquire((const volatile AO_t *)(addr)) # define AO_HAVE_int_load_acquire #endif #if defined(AO_HAVE_store_acquire) && !defined(AO_HAVE_int_store_acquire) # define AO_int_store_acquire(addr, val) \ AO_store_acquire((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_store_acquire #endif #if defined(AO_HAVE_fetch_and_add_acquire) \ && !defined(AO_HAVE_int_fetch_and_add_acquire) # define AO_int_fetch_and_add_acquire(addr, incr) \ (unsigned)AO_fetch_and_add_acquire((volatile AO_t *)(addr), \ (AO_t)(incr)) # define AO_HAVE_int_fetch_and_add_acquire #endif #if defined(AO_HAVE_fetch_and_add1_acquire) \ && !defined(AO_HAVE_int_fetch_and_add1_acquire) # define AO_int_fetch_and_add1_acquire(addr) \ (unsigned)AO_fetch_and_add1_acquire((volatile AO_t *)(addr)) # define AO_HAVE_int_fetch_and_add1_acquire #endif #if defined(AO_HAVE_fetch_and_sub1_acquire) \ && !defined(AO_HAVE_int_fetch_and_sub1_acquire) # define AO_int_fetch_and_sub1_acquire(addr) \ (unsigned)AO_fetch_and_sub1_acquire((volatile AO_t *)(addr)) # define AO_HAVE_int_fetch_and_sub1_acquire #endif #if defined(AO_HAVE_and_acquire) && !defined(AO_HAVE_int_and_acquire) # define AO_int_and_acquire(addr, val) \ AO_and_acquire((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_and_acquire #endif #if defined(AO_HAVE_or_acquire) && !defined(AO_HAVE_int_or_acquire) # define AO_int_or_acquire(addr, val) \ AO_or_acquire((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_or_acquire #endif #if defined(AO_HAVE_xor_acquire) && !defined(AO_HAVE_int_xor_acquire) # define AO_int_xor_acquire(addr, val) \ AO_xor_acquire((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_xor_acquire #endif #if defined(AO_HAVE_fetch_compare_and_swap_acquire) \ && !defined(AO_HAVE_int_fetch_compare_and_swap_acquire) # define AO_int_fetch_compare_and_swap_acquire(addr, old, new_val) \ 
(unsigned)AO_fetch_compare_and_swap_acquire((volatile AO_t *)(addr), \ (AO_t)(old), (AO_t)(new_val)) # define AO_HAVE_int_fetch_compare_and_swap_acquire #endif #if defined(AO_HAVE_compare_and_swap_acquire) \ && !defined(AO_HAVE_int_compare_and_swap_acquire) # define AO_int_compare_and_swap_acquire(addr, old, new_val) \ AO_compare_and_swap_acquire((volatile AO_t *)(addr), \ (AO_t)(old), (AO_t)(new_val)) # define AO_HAVE_int_compare_and_swap_acquire #endif /* * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Inclusion of this file signifies that AO_t is in fact int. */ /* Hence any AO_... operation can also serve as AO_int_... operation. 
*/ #if defined(AO_HAVE_load_release) && !defined(AO_HAVE_int_load_release) # define AO_int_load_release(addr) \ (unsigned)AO_load_release((const volatile AO_t *)(addr)) # define AO_HAVE_int_load_release #endif #if defined(AO_HAVE_store_release) && !defined(AO_HAVE_int_store_release) # define AO_int_store_release(addr, val) \ AO_store_release((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_store_release #endif #if defined(AO_HAVE_fetch_and_add_release) \ && !defined(AO_HAVE_int_fetch_and_add_release) # define AO_int_fetch_and_add_release(addr, incr) \ (unsigned)AO_fetch_and_add_release((volatile AO_t *)(addr), \ (AO_t)(incr)) # define AO_HAVE_int_fetch_and_add_release #endif #if defined(AO_HAVE_fetch_and_add1_release) \ && !defined(AO_HAVE_int_fetch_and_add1_release) # define AO_int_fetch_and_add1_release(addr) \ (unsigned)AO_fetch_and_add1_release((volatile AO_t *)(addr)) # define AO_HAVE_int_fetch_and_add1_release #endif #if defined(AO_HAVE_fetch_and_sub1_release) \ && !defined(AO_HAVE_int_fetch_and_sub1_release) # define AO_int_fetch_and_sub1_release(addr) \ (unsigned)AO_fetch_and_sub1_release((volatile AO_t *)(addr)) # define AO_HAVE_int_fetch_and_sub1_release #endif #if defined(AO_HAVE_and_release) && !defined(AO_HAVE_int_and_release) # define AO_int_and_release(addr, val) \ AO_and_release((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_and_release #endif #if defined(AO_HAVE_or_release) && !defined(AO_HAVE_int_or_release) # define AO_int_or_release(addr, val) \ AO_or_release((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_or_release #endif #if defined(AO_HAVE_xor_release) && !defined(AO_HAVE_int_xor_release) # define AO_int_xor_release(addr, val) \ AO_xor_release((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_xor_release #endif #if defined(AO_HAVE_fetch_compare_and_swap_release) \ && !defined(AO_HAVE_int_fetch_compare_and_swap_release) # define AO_int_fetch_compare_and_swap_release(addr, old, new_val) \ 
(unsigned)AO_fetch_compare_and_swap_release((volatile AO_t *)(addr), \ (AO_t)(old), (AO_t)(new_val)) # define AO_HAVE_int_fetch_compare_and_swap_release #endif #if defined(AO_HAVE_compare_and_swap_release) \ && !defined(AO_HAVE_int_compare_and_swap_release) # define AO_int_compare_and_swap_release(addr, old, new_val) \ AO_compare_and_swap_release((volatile AO_t *)(addr), \ (AO_t)(old), (AO_t)(new_val)) # define AO_HAVE_int_compare_and_swap_release #endif /* * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Inclusion of this file signifies that AO_t is in fact int. */ /* Hence any AO_... operation can also serve as AO_int_... operation. 
*/ #if defined(AO_HAVE_load_write) && !defined(AO_HAVE_int_load_write) # define AO_int_load_write(addr) \ (unsigned)AO_load_write((const volatile AO_t *)(addr)) # define AO_HAVE_int_load_write #endif #if defined(AO_HAVE_store_write) && !defined(AO_HAVE_int_store_write) # define AO_int_store_write(addr, val) \ AO_store_write((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_store_write #endif #if defined(AO_HAVE_fetch_and_add_write) \ && !defined(AO_HAVE_int_fetch_and_add_write) # define AO_int_fetch_and_add_write(addr, incr) \ (unsigned)AO_fetch_and_add_write((volatile AO_t *)(addr), \ (AO_t)(incr)) # define AO_HAVE_int_fetch_and_add_write #endif #if defined(AO_HAVE_fetch_and_add1_write) \ && !defined(AO_HAVE_int_fetch_and_add1_write) # define AO_int_fetch_and_add1_write(addr) \ (unsigned)AO_fetch_and_add1_write((volatile AO_t *)(addr)) # define AO_HAVE_int_fetch_and_add1_write #endif #if defined(AO_HAVE_fetch_and_sub1_write) \ && !defined(AO_HAVE_int_fetch_and_sub1_write) # define AO_int_fetch_and_sub1_write(addr) \ (unsigned)AO_fetch_and_sub1_write((volatile AO_t *)(addr)) # define AO_HAVE_int_fetch_and_sub1_write #endif #if defined(AO_HAVE_and_write) && !defined(AO_HAVE_int_and_write) # define AO_int_and_write(addr, val) \ AO_and_write((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_and_write #endif #if defined(AO_HAVE_or_write) && !defined(AO_HAVE_int_or_write) # define AO_int_or_write(addr, val) \ AO_or_write((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_or_write #endif #if defined(AO_HAVE_xor_write) && !defined(AO_HAVE_int_xor_write) # define AO_int_xor_write(addr, val) \ AO_xor_write((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_xor_write #endif #if defined(AO_HAVE_fetch_compare_and_swap_write) \ && !defined(AO_HAVE_int_fetch_compare_and_swap_write) # define AO_int_fetch_compare_and_swap_write(addr, old, new_val) \ (unsigned)AO_fetch_compare_and_swap_write((volatile AO_t *)(addr), \ (AO_t)(old), (AO_t)(new_val)) 
# define AO_HAVE_int_fetch_compare_and_swap_write #endif #if defined(AO_HAVE_compare_and_swap_write) \ && !defined(AO_HAVE_int_compare_and_swap_write) # define AO_int_compare_and_swap_write(addr, old, new_val) \ AO_compare_and_swap_write((volatile AO_t *)(addr), \ (AO_t)(old), (AO_t)(new_val)) # define AO_HAVE_int_compare_and_swap_write #endif /* * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Inclusion of this file signifies that AO_t is in fact int. */ /* Hence any AO_... operation can also serve as AO_int_... operation. 
*/ #if defined(AO_HAVE_load_read) && !defined(AO_HAVE_int_load_read) # define AO_int_load_read(addr) \ (unsigned)AO_load_read((const volatile AO_t *)(addr)) # define AO_HAVE_int_load_read #endif #if defined(AO_HAVE_store_read) && !defined(AO_HAVE_int_store_read) # define AO_int_store_read(addr, val) \ AO_store_read((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_store_read #endif #if defined(AO_HAVE_fetch_and_add_read) \ && !defined(AO_HAVE_int_fetch_and_add_read) # define AO_int_fetch_and_add_read(addr, incr) \ (unsigned)AO_fetch_and_add_read((volatile AO_t *)(addr), \ (AO_t)(incr)) # define AO_HAVE_int_fetch_and_add_read #endif #if defined(AO_HAVE_fetch_and_add1_read) \ && !defined(AO_HAVE_int_fetch_and_add1_read) # define AO_int_fetch_and_add1_read(addr) \ (unsigned)AO_fetch_and_add1_read((volatile AO_t *)(addr)) # define AO_HAVE_int_fetch_and_add1_read #endif #if defined(AO_HAVE_fetch_and_sub1_read) \ && !defined(AO_HAVE_int_fetch_and_sub1_read) # define AO_int_fetch_and_sub1_read(addr) \ (unsigned)AO_fetch_and_sub1_read((volatile AO_t *)(addr)) # define AO_HAVE_int_fetch_and_sub1_read #endif #if defined(AO_HAVE_and_read) && !defined(AO_HAVE_int_and_read) # define AO_int_and_read(addr, val) \ AO_and_read((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_and_read #endif #if defined(AO_HAVE_or_read) && !defined(AO_HAVE_int_or_read) # define AO_int_or_read(addr, val) \ AO_or_read((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_or_read #endif #if defined(AO_HAVE_xor_read) && !defined(AO_HAVE_int_xor_read) # define AO_int_xor_read(addr, val) \ AO_xor_read((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_xor_read #endif #if defined(AO_HAVE_fetch_compare_and_swap_read) \ && !defined(AO_HAVE_int_fetch_compare_and_swap_read) # define AO_int_fetch_compare_and_swap_read(addr, old, new_val) \ (unsigned)AO_fetch_compare_and_swap_read((volatile AO_t *)(addr), \ (AO_t)(old), (AO_t)(new_val)) # define 
AO_HAVE_int_fetch_compare_and_swap_read #endif #if defined(AO_HAVE_compare_and_swap_read) \ && !defined(AO_HAVE_int_compare_and_swap_read) # define AO_int_compare_and_swap_read(addr, old, new_val) \ AO_compare_and_swap_read((volatile AO_t *)(addr), \ (AO_t)(old), (AO_t)(new_val)) # define AO_HAVE_int_compare_and_swap_read #endif papi-papi-7-2-0-t/src/atomic_ops/sysdeps/ao_t_is_int.template000066400000000000000000000075121502707512200243070ustar00rootroot00000000000000/* * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Inclusion of this file signifies that AO_t is in fact int. */ /* Hence any AO_... operation can also serve as AO_int_... operation. 
*/ #if defined(AO_HAVE_load_XBAR) && !defined(AO_HAVE_int_load_XBAR) # define AO_int_load_XBAR(addr) \ (unsigned)AO_load_XBAR((const volatile AO_t *)(addr)) # define AO_HAVE_int_load_XBAR #endif #if defined(AO_HAVE_store_XBAR) && !defined(AO_HAVE_int_store_XBAR) # define AO_int_store_XBAR(addr, val) \ AO_store_XBAR((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_store_XBAR #endif #if defined(AO_HAVE_fetch_and_add_XBAR) \ && !defined(AO_HAVE_int_fetch_and_add_XBAR) # define AO_int_fetch_and_add_XBAR(addr, incr) \ (unsigned)AO_fetch_and_add_XBAR((volatile AO_t *)(addr), \ (AO_t)(incr)) # define AO_HAVE_int_fetch_and_add_XBAR #endif #if defined(AO_HAVE_fetch_and_add1_XBAR) \ && !defined(AO_HAVE_int_fetch_and_add1_XBAR) # define AO_int_fetch_and_add1_XBAR(addr) \ (unsigned)AO_fetch_and_add1_XBAR((volatile AO_t *)(addr)) # define AO_HAVE_int_fetch_and_add1_XBAR #endif #if defined(AO_HAVE_fetch_and_sub1_XBAR) \ && !defined(AO_HAVE_int_fetch_and_sub1_XBAR) # define AO_int_fetch_and_sub1_XBAR(addr) \ (unsigned)AO_fetch_and_sub1_XBAR((volatile AO_t *)(addr)) # define AO_HAVE_int_fetch_and_sub1_XBAR #endif #if defined(AO_HAVE_and_XBAR) && !defined(AO_HAVE_int_and_XBAR) # define AO_int_and_XBAR(addr, val) \ AO_and_XBAR((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_and_XBAR #endif #if defined(AO_HAVE_or_XBAR) && !defined(AO_HAVE_int_or_XBAR) # define AO_int_or_XBAR(addr, val) \ AO_or_XBAR((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_or_XBAR #endif #if defined(AO_HAVE_xor_XBAR) && !defined(AO_HAVE_int_xor_XBAR) # define AO_int_xor_XBAR(addr, val) \ AO_xor_XBAR((volatile AO_t *)(addr), (AO_t)(val)) # define AO_HAVE_int_xor_XBAR #endif #if defined(AO_HAVE_fetch_compare_and_swap_XBAR) \ && !defined(AO_HAVE_int_fetch_compare_and_swap_XBAR) # define AO_int_fetch_compare_and_swap_XBAR(addr, old, new_val) \ (unsigned)AO_fetch_compare_and_swap_XBAR((volatile AO_t *)(addr), \ (AO_t)(old), (AO_t)(new_val)) # define 
AO_HAVE_int_fetch_compare_and_swap_XBAR #endif #if defined(AO_HAVE_compare_and_swap_XBAR) \ && !defined(AO_HAVE_int_compare_and_swap_XBAR) # define AO_int_compare_and_swap_XBAR(addr, old, new_val) \ AO_compare_and_swap_XBAR((volatile AO_t *)(addr), \ (AO_t)(old), (AO_t)(new_val)) # define AO_HAVE_int_compare_and_swap_XBAR #endif papi-papi-7-2-0-t/src/atomic_ops/sysdeps/armcc/000077500000000000000000000000001502707512200213435ustar00rootroot00000000000000papi-papi-7-2-0-t/src/atomic_ops/sysdeps/armcc/arm_v6.h000066400000000000000000000162501502707512200227120ustar00rootroot00000000000000/* * Copyright (c) 2007 by NEC LE-IT: All rights reserved. * A transcription of ARMv6 atomic operations for the ARM Realview Toolchain. * This code works with armcc from RVDS 3.1 * This is based on work in gcc/arm.h by * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 1999-2003 by Hewlett-Packard Company. All rights reserved. * * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #include "../test_and_set_t_is_ao_t.h" /* Probably suboptimal */ #if __TARGET_ARCH_ARM < 6 # if !defined(CPPCHECK) # error Do not use with ARM instruction sets lower than v6 # endif #else #define AO_ACCESS_CHECK_ALIGNED #define AO_ACCESS_short_CHECK_ALIGNED #define AO_ACCESS_int_CHECK_ALIGNED #include "../all_atomic_only_load.h" #include "../standard_ao_double_t.h" /* NEC LE-IT: ARMv6 is the first architecture providing support for simple LL/SC. * A data memory barrier must be raised via a CP15 command (see documentation). * * ARMv7 is compatible with ARMv6 but has a simpler command for issuing a * memory barrier (DMB). Raising it via CP15 should still work, as I was told by the * support engineers. If it turns out to be much quicker, then we should implement * custom code for ARMv7 using the asm { dmb } command. * * If only a single processor is used, we can define AO_UNIPROCESSOR * and do not need to access CP15 for ensuring a DMB at all. */ AO_INLINE void AO_nop_full(void) { # ifndef AO_UNIPROCESSOR unsigned int dest=0; /* Issue a data memory barrier (keeps ordering of memory transactions */ /* before and after this operation). */ __asm { mcr p15,0,dest,c7,c10,5 }; # else AO_compiler_barrier(); # endif } #define AO_HAVE_nop_full /* NEC LE-IT: atomic "store" - according to ARM documentation this is * the only safe way to set variables also used in an LL/SC environment. * A direct write won't be recognized by the LL/SC construct in other CPUs. * * HB: Based on subsequent discussion, I think it would be OK to use an * ordinary store here if we knew that interrupt handlers always cleared * the reservation. They should, but there is some doubt that this is * currently always the case for e.g. Linux.
*/ AO_INLINE void AO_store(volatile AO_t *addr, AO_t value) { unsigned long tmp; retry: __asm { ldrex tmp, [addr] strex tmp, value, [addr] teq tmp, #0 bne retry }; } #define AO_HAVE_store /* NEC LE-IT: replace the SWAP as recommended by ARM: "Applies to: ARM11 Cores Though the SWP instruction will still work with ARM V6 cores, it is recommended to use the new V6 synchronization instructions. The SWP instruction produces locked read and write accesses which are atomic, i.e. another operation cannot be done between these locked accesses which ties up external bus (AHB,AXI) bandwidth and can increase worst case interrupt latencies. LDREX,STREX are more flexible, other instructions can be done between the LDREX and STREX accesses. " */ #ifndef AO_PREFER_GENERALIZED AO_INLINE AO_TS_VAL_t AO_test_and_set(volatile AO_TS_t *addr) { AO_TS_VAL_t oldval; unsigned long tmp; unsigned long one = 1; retry: __asm { ldrex oldval, [addr] strex tmp, one, [addr] teq tmp, #0 bne retry } return oldval; } #define AO_HAVE_test_and_set AO_INLINE AO_t AO_fetch_and_add(volatile AO_t *p, AO_t incr) { unsigned long tmp,tmp2; AO_t result; retry: __asm { ldrex result, [p] add tmp, incr, result strex tmp2, tmp, [p] teq tmp2, #0 bne retry } return result; } #define AO_HAVE_fetch_and_add AO_INLINE AO_t AO_fetch_and_add1(volatile AO_t *p) { unsigned long tmp,tmp2; AO_t result; retry: __asm { ldrex result, [p] add tmp, result, #1 strex tmp2, tmp, [p] teq tmp2, #0 bne retry } return result; } #define AO_HAVE_fetch_and_add1 AO_INLINE AO_t AO_fetch_and_sub1(volatile AO_t *p) { unsigned long tmp,tmp2; AO_t result; retry: __asm { ldrex result, [p] sub tmp, result, #1 strex tmp2, tmp, [p] teq tmp2, #0 bne retry } return result; } #define AO_HAVE_fetch_and_sub1 #endif /* !AO_PREFER_GENERALIZED */ #ifndef AO_GENERALIZE_ASM_BOOL_CAS /* Returns nonzero if the comparison succeeded. 
*/ AO_INLINE int AO_compare_and_swap(volatile AO_t *addr, AO_t old_val, AO_t new_val) { AO_t result, tmp; retry: __asm__ { mov result, #2 ldrex tmp, [addr] teq tmp, old_val # ifdef __thumb__ it eq # endif strexeq result, new_val, [addr] teq result, #1 beq retry } return !(result&2); } # define AO_HAVE_compare_and_swap #endif /* !AO_GENERALIZE_ASM_BOOL_CAS */ AO_INLINE AO_t AO_fetch_compare_and_swap(volatile AO_t *addr, AO_t old_val, AO_t new_val) { AO_t fetched_val, tmp; retry: __asm__ { mov tmp, #2 ldrex fetched_val, [addr] teq fetched_val, old_val # ifdef __thumb__ it eq # endif strexeq tmp, new_val, [addr] teq tmp, #1 beq retry } return fetched_val; } #define AO_HAVE_fetch_compare_and_swap /* helper functions for the Realview compiler: LDREXD is not usable * with inline assembler, so use the "embedded" assembler as * suggested by ARM Dev. support (June 2008). */ __asm inline double_ptr_storage AO_load_ex(const volatile AO_double_t *addr) { LDREXD r0,r1,[r0] } __asm inline int AO_store_ex(AO_t val1, AO_t val2, volatile AO_double_t *addr) { STREXD r3,r0,r1,[r2] MOV r0,r3 } AO_INLINE AO_double_t AO_double_load(const volatile AO_double_t *addr) { AO_double_t result; result.AO_whole = AO_load_ex(addr); return result; } #define AO_HAVE_double_load AO_INLINE int AO_compare_double_and_swap_double(volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2) { double_ptr_storage old_val = ((double_ptr_storage)old_val2 << 32) | old_val1; double_ptr_storage tmp; int result; while(1) { tmp = AO_load_ex(addr); if(tmp != old_val) return 0; result = AO_store_ex(new_val1, new_val2, addr); if(!result) return 1; } } #define AO_HAVE_compare_double_and_swap_double #endif /* __TARGET_ARCH_ARM >= 6 */ #define AO_T_IS_INT papi-papi-7-2-0-t/src/atomic_ops/sysdeps/emul_cas.h000066400000000000000000000065771502707512200222360ustar00rootroot00000000000000/* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. 
* * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* * Ensure, if at all possible, that AO_compare_and_swap_full() is * available. The emulation should be brute-force signal-safe, even * though it actually blocks. * Including this file will generate an error if AO_compare_and_swap_full() * cannot be made available. * This will be included from platform-specific atomic_ops files * if appropriate, and if AO_REQUIRE_CAS is defined. It should not be * included directly, especially since it affects the implementation * of other atomic update primitives. * The implementation assumes that only AO_store_XXX and AO_test_and_set_XXX * variants are defined, and that AO_test_and_set_XXX is not used to * operate on compare_and_swap locations. */ #ifndef AO_ATOMIC_OPS_H # error This file should not be included directly. 
#endif #ifndef AO_HAVE_double_t # include "standard_ao_double_t.h" #endif #ifdef __cplusplus extern "C" { #endif AO_API AO_t AO_fetch_compare_and_swap_emulation(volatile AO_t *addr, AO_t old_val, AO_t new_val); AO_API int AO_compare_double_and_swap_double_emulation(volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2); AO_API void AO_store_full_emulation(volatile AO_t *addr, AO_t val); #ifndef AO_HAVE_fetch_compare_and_swap_full # define AO_fetch_compare_and_swap_full(addr, old, newval) \ AO_fetch_compare_and_swap_emulation(addr, old, newval) # define AO_HAVE_fetch_compare_and_swap_full #endif #ifndef AO_HAVE_compare_double_and_swap_double_full # define AO_compare_double_and_swap_double_full(addr, old1, old2, \ newval1, newval2) \ AO_compare_double_and_swap_double_emulation(addr, old1, old2, \ newval1, newval2) # define AO_HAVE_compare_double_and_swap_double_full #endif #undef AO_store #undef AO_HAVE_store #undef AO_store_write #undef AO_HAVE_store_write #undef AO_store_release #undef AO_HAVE_store_release #undef AO_store_full #undef AO_HAVE_store_full #define AO_store_full(addr, val) AO_store_full_emulation(addr, val) #define AO_HAVE_store_full #ifdef __cplusplus } /* extern "C" */ #endif papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/000077500000000000000000000000001502707512200210125ustar00rootroot00000000000000papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/aarch64.h000066400000000000000000000220161502707512200224140ustar00rootroot00000000000000/* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 1999-2003 by Hewlett-Packard Company. All rights reserved. * Copyright (c) 2013-2017 Ivan Maidanski * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. 
* * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. * */ /* As of clang-5.0 (and gcc-5.4), __atomic_thread_fence is always */ /* translated to DMB (which is inefficient for AO_nop_write). */ /* TODO: Update it for newer Clang and GCC releases. */ #if !defined(AO_PREFER_BUILTIN_ATOMICS) && !defined(AO_THREAD_SANITIZER) \ && !defined(AO_UNIPROCESSOR) AO_INLINE void AO_nop_write(void) { __asm__ __volatile__("dmb ishst" : : : "memory"); } # define AO_HAVE_nop_write #endif /* There were some bugs in the older clang releases (related to */ /* optimization of functions dealing with __int128 values, supposedly), */ /* so even asm-based implementation did not work correctly. */ #if !defined(__clang__) || AO_CLANG_PREREQ(3, 9) # include "../standard_ao_double_t.h" /* As of gcc-5.4, all built-in load/store and CAS atomics for double */ /* word require -latomic, are not lock-free and cause test_stack */ /* failure, so the asm-based implementation is used for now. */ /* TODO: Update it for newer GCC releases. */ #if (!defined(__ILP32__) && !defined(__clang__)) \ || defined(AO_AARCH64_ASM_LOAD_STORE_CAS) # ifndef AO_PREFER_GENERALIZED AO_INLINE AO_double_t AO_double_load(const volatile AO_double_t *addr) { AO_double_t result; int status; /* Note that STXP cannot be discarded because LD[A]XP is not */ /* single-copy atomic (unlike LDREXD for 32-bit ARM). 
*/ do { __asm__ __volatile__("//AO_double_load\n" # ifdef __ILP32__ " ldxp %w0, %w1, %3\n" " stxp %w2, %w0, %w1, %3" # else " ldxp %0, %1, %3\n" " stxp %w2, %0, %1, %3" # endif : "=&r" (result.AO_val1), "=&r" (result.AO_val2), "=&r" (status) : "Q" (*addr)); } while (AO_EXPECT_FALSE(status)); return result; } # define AO_HAVE_double_load AO_INLINE AO_double_t AO_double_load_acquire(const volatile AO_double_t *addr) { AO_double_t result; int status; do { __asm__ __volatile__("//AO_double_load_acquire\n" # ifdef __ILP32__ " ldaxp %w0, %w1, %3\n" " stxp %w2, %w0, %w1, %3" # else " ldaxp %0, %1, %3\n" " stxp %w2, %0, %1, %3" # endif : "=&r" (result.AO_val1), "=&r" (result.AO_val2), "=&r" (status) : "Q" (*addr)); } while (AO_EXPECT_FALSE(status)); return result; } # define AO_HAVE_double_load_acquire AO_INLINE void AO_double_store(volatile AO_double_t *addr, AO_double_t value) { AO_double_t old_val; int status; do { __asm__ __volatile__("//AO_double_store\n" # ifdef __ILP32__ " ldxp %w0, %w1, %3\n" " stxp %w2, %w4, %w5, %3" # else " ldxp %0, %1, %3\n" " stxp %w2, %4, %5, %3" # endif : "=&r" (old_val.AO_val1), "=&r" (old_val.AO_val2), "=&r" (status), "=Q" (*addr) : "r" (value.AO_val1), "r" (value.AO_val2)); /* Compared to the arm.h implementation, the 'cc' (flags) are */ /* not clobbered because A64 has no concept of conditional */ /* execution. 
*/ } while (AO_EXPECT_FALSE(status)); } # define AO_HAVE_double_store AO_INLINE void AO_double_store_release(volatile AO_double_t *addr, AO_double_t value) { AO_double_t old_val; int status; do { __asm__ __volatile__("//AO_double_store_release\n" # ifdef __ILP32__ " ldxp %w0, %w1, %3\n" " stlxp %w2, %w4, %w5, %3" # else " ldxp %0, %1, %3\n" " stlxp %w2, %4, %5, %3" # endif : "=&r" (old_val.AO_val1), "=&r" (old_val.AO_val2), "=&r" (status), "=Q" (*addr) : "r" (value.AO_val1), "r" (value.AO_val2)); } while (AO_EXPECT_FALSE(status)); } # define AO_HAVE_double_store_release # endif /* !AO_PREFER_GENERALIZED */ AO_INLINE int AO_double_compare_and_swap(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { AO_double_t tmp; int result = 1; do { __asm__ __volatile__("//AO_double_compare_and_swap\n" # ifdef __ILP32__ " ldxp %w0, %w1, %2\n" # else " ldxp %0, %1, %2\n" # endif : "=&r" (tmp.AO_val1), "=&r" (tmp.AO_val2) : "Q" (*addr)); if (tmp.AO_val1 != old_val.AO_val1 || tmp.AO_val2 != old_val.AO_val2) break; __asm__ __volatile__( # ifdef __ILP32__ " stxp %w0, %w2, %w3, %1\n" # else " stxp %w0, %2, %3, %1\n" # endif : "=&r" (result), "=Q" (*addr) : "r" (new_val.AO_val1), "r" (new_val.AO_val2)); } while (AO_EXPECT_FALSE(result)); return !result; } # define AO_HAVE_double_compare_and_swap AO_INLINE int AO_double_compare_and_swap_acquire(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { AO_double_t tmp; int result = 1; do { __asm__ __volatile__("//AO_double_compare_and_swap_acquire\n" # ifdef __ILP32__ " ldaxp %w0, %w1, %2\n" # else " ldaxp %0, %1, %2\n" # endif : "=&r" (tmp.AO_val1), "=&r" (tmp.AO_val2) : "Q" (*addr)); if (tmp.AO_val1 != old_val.AO_val1 || tmp.AO_val2 != old_val.AO_val2) break; __asm__ __volatile__( # ifdef __ILP32__ " stxp %w0, %w2, %w3, %1\n" # else " stxp %w0, %2, %3, %1\n" # endif : "=&r" (result), "=Q" (*addr) : "r" (new_val.AO_val1), "r" (new_val.AO_val2)); } while (AO_EXPECT_FALSE(result)); return !result; } # 
define AO_HAVE_double_compare_and_swap_acquire AO_INLINE int AO_double_compare_and_swap_release(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { AO_double_t tmp; int result = 1; do { __asm__ __volatile__("//AO_double_compare_and_swap_release\n" # ifdef __ILP32__ " ldxp %w0, %w1, %2\n" # else " ldxp %0, %1, %2\n" # endif : "=&r" (tmp.AO_val1), "=&r" (tmp.AO_val2) : "Q" (*addr)); if (tmp.AO_val1 != old_val.AO_val1 || tmp.AO_val2 != old_val.AO_val2) break; __asm__ __volatile__( # ifdef __ILP32__ " stlxp %w0, %w2, %w3, %1\n" # else " stlxp %w0, %2, %3, %1\n" # endif : "=&r" (result), "=Q" (*addr) : "r" (new_val.AO_val1), "r" (new_val.AO_val2)); } while (AO_EXPECT_FALSE(result)); return !result; } # define AO_HAVE_double_compare_and_swap_release AO_INLINE int AO_double_compare_and_swap_full(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { AO_double_t tmp; int result = 1; do { __asm__ __volatile__("//AO_double_compare_and_swap_full\n" # ifdef __ILP32__ " ldaxp %w0, %w1, %2\n" # else " ldaxp %0, %1, %2\n" # endif : "=&r" (tmp.AO_val1), "=&r" (tmp.AO_val2) : "Q" (*addr)); if (tmp.AO_val1 != old_val.AO_val1 || tmp.AO_val2 != old_val.AO_val2) break; __asm__ __volatile__( # ifdef __ILP32__ " stlxp %w0, %w2, %w3, %1\n" # else " stlxp %w0, %2, %3, %1\n" # endif : "=&r" (result), "=Q" (*addr) : "r" (new_val.AO_val1), "r" (new_val.AO_val2)); } while (AO_EXPECT_FALSE(result)); return !result; } # define AO_HAVE_double_compare_and_swap_full #endif /* !__ILP32__ && !__clang__ || AO_AARCH64_ASM_LOAD_STORE_CAS */ /* As of clang-5.0 and gcc-8.1, __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 */ /* macro is still missing (while the double-word CAS is available). */ # ifndef __ILP32__ # define AO_GCC_HAVE_double_SYNC_CAS # endif #endif /* !__clang__ || AO_CLANG_PREREQ(3, 9) */ #if (defined(__clang__) && !AO_CLANG_PREREQ(3, 8)) || defined(__APPLE_CC__) /* __GCC_HAVE_SYNC_COMPARE_AND_SWAP_n macros are missing. 
*/ # define AO_GCC_FORCE_HAVE_CAS #endif #include "generic.h" #undef AO_GCC_FORCE_HAVE_CAS #undef AO_GCC_HAVE_double_SYNC_CAS papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/alpha.h000066400000000000000000000040031502707512200222450ustar00rootroot00000000000000/* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 1999-2003 by Hewlett-Packard Company. All rights reserved. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. * */ #include "../loadstore/atomic_load.h" #include "../loadstore/atomic_store.h" #include "../test_and_set_t_is_ao_t.h" #define AO_NO_DD_ORDERING /* Data dependence does not imply read ordering. */ AO_INLINE void AO_nop_full(void) { __asm__ __volatile__("mb" : : : "memory"); } #define AO_HAVE_nop_full AO_INLINE void AO_nop_write(void) { __asm__ __volatile__("wmb" : : : "memory"); } #define AO_HAVE_nop_write /* mb should be used for AO_nop_read(). That's the default. */ /* TODO: implement AO_fetch_and_add explicitly. */ /* We believe that ldq_l ... stq_c does not imply any memory barrier. 
*/ AO_INLINE int AO_compare_and_swap(volatile AO_t *addr, AO_t old, AO_t new_val) { unsigned long was_equal; unsigned long temp; __asm__ __volatile__( "1: ldq_l %0,%1\n" " cmpeq %0,%4,%2\n" " mov %3,%0\n" " beq %2,2f\n" " stq_c %0,%1\n" " beq %0,1b\n" "2:\n" : "=&r" (temp), "+m" (*addr), "=&r" (was_equal) : "r" (new_val), "Ir" (old) :"memory"); return (int)was_equal; } #define AO_HAVE_compare_and_swap /* TODO: implement AO_fetch_compare_and_swap */ papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/arm.h000066400000000000000000000616211502707512200217500ustar00rootroot00000000000000/* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 1999-2003 by Hewlett-Packard Company. All rights reserved. * Copyright (c) 2008-2017 Ivan Maidanski * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. * */ #if (AO_GNUC_PREREQ(4, 8) || AO_CLANG_PREREQ(3, 5)) \ && !defined(AO_DISABLE_GCC_ATOMICS) /* Probably, it could be enabled even for earlier gcc/clang versions. */ # define AO_GCC_ATOMIC_TEST_AND_SET #endif #ifdef __native_client__ /* Mask instruction should immediately precede access instruction. 
*/ # define AO_MASK_PTR(reg) " bical " reg ", " reg ", #0xc0000000\n" # define AO_BR_ALIGN " .align 4\n" #else # define AO_MASK_PTR(reg) /* empty */ # define AO_BR_ALIGN /* empty */ #endif #if defined(__thumb__) && !defined(__thumb2__) /* Thumb One mode does not have ARM "mcr", "swp" and some load/store */ /* instructions, so we temporarily switch to ARM mode and go back */ /* afterwards (clobbering "r3" register). */ # define AO_THUMB_GO_ARM \ " adr r3, 4f\n" \ " bx r3\n" \ " .align\n" \ " .arm\n" \ AO_BR_ALIGN \ "4:\n" # define AO_THUMB_RESTORE_MODE \ " adr r3, 5f + 1\n" \ " bx r3\n" \ " .thumb\n" \ AO_BR_ALIGN \ "5:\n" # define AO_THUMB_SWITCH_CLOBBERS "r3", #else # define AO_THUMB_GO_ARM /* empty */ # define AO_THUMB_RESTORE_MODE /* empty */ # define AO_THUMB_SWITCH_CLOBBERS /* empty */ #endif /* !__thumb__ */ /* NEC LE-IT: gcc has no way to easily check the arm architecture */ /* but it defines only one (or several) of __ARM_ARCH_x__ to be true. */ #if !defined(__ARM_ARCH_2__) && !defined(__ARM_ARCH_3__) \ && !defined(__ARM_ARCH_3M__) && !defined(__ARM_ARCH_4__) \ && !defined(__ARM_ARCH_4T__) \ && ((!defined(__ARM_ARCH_5__) && !defined(__ARM_ARCH_5E__) \ && !defined(__ARM_ARCH_5T__) && !defined(__ARM_ARCH_5TE__) \ && !defined(__ARM_ARCH_5TEJ__) && !defined(__ARM_ARCH_6M__)) \ || defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) \ || defined(__ARM_ARCH_8A__)) # define AO_ARM_HAVE_LDREX # if !defined(__ARM_ARCH_6__) && !defined(__ARM_ARCH_6J__) \ && !defined(__ARM_ARCH_6T2__) /* LDREXB/STREXB and LDREXH/STREXH are present in ARMv6K/Z+. */ # define AO_ARM_HAVE_LDREXBH # endif # if !defined(__ARM_ARCH_6__) && !defined(__ARM_ARCH_6J__) \ && !defined(__ARM_ARCH_6T2__) && !defined(__ARM_ARCH_6Z__) \ && !defined(__ARM_ARCH_6ZT2__) # if !defined(__ARM_ARCH_6K__) && !defined(__ARM_ARCH_6KZ__) \ && !defined(__ARM_ARCH_6ZK__) /* DMB is present in ARMv6M and ARMv7+. 
*/ # define AO_ARM_HAVE_DMB # endif # if (!defined(__thumb__) \ || (defined(__thumb2__) && !defined(__ARM_ARCH_7__) \ && !defined(__ARM_ARCH_7M__) && !defined(__ARM_ARCH_7EM__))) \ && (!defined(__clang__) || AO_CLANG_PREREQ(3, 3)) /* LDREXD/STREXD present in ARMv6K/M+ (see gas/config/tc-arm.c). */ /* In the Thumb mode, this works only starting from ARMv7 (except */ /* for the base and 'M' models). Clang3.2 (and earlier) does not */ /* allocate register pairs for LDREXD/STREXD properly (besides, */ /* Clang3.1 does not support "%H" operand specification). */ # define AO_ARM_HAVE_LDREXD # endif /* !thumb || ARMv7A || ARMv7R+ */ # endif /* ARMv7+ */ #endif /* ARMv6+ */ #if !defined(__ARM_ARCH_2__) && !defined(__ARM_ARCH_6M__) \ && !defined(__ARM_ARCH_8A__) && !defined(__thumb2__) # define AO_ARM_HAVE_SWP /* Note: ARMv6M is excluded due to no ARM mode support. */ /* Also, SWP is obsoleted for ARMv8+. */ #endif /* !__thumb2__ */ #if !defined(AO_UNIPROCESSOR) && defined(AO_ARM_HAVE_DMB) \ && !defined(AO_PREFER_BUILTIN_ATOMICS) AO_INLINE void AO_nop_write(void) { /* AO_THUMB_GO_ARM is empty. */ /* This will target the system domain and thus be overly */ /* conservative as the CPUs (even in case of big.LITTLE SoC) will */ /* occupy the inner shareable domain. */ /* The plain variant (dmb st) is theoretically slower, and should */ /* not be needed. That said, with limited experimentation, a CPU */ /* implementation for which it actually matters has not been found */ /* yet, though they should already exist. */ /* Anyway, note that the "st" and "ishst" barriers are actually */ /* quite weak and, as the libatomic_ops documentation states, */ /* usually not what you really want. 
*/ __asm__ __volatile__("dmb ishst" : : : "memory"); } # define AO_HAVE_nop_write #endif /* AO_ARM_HAVE_DMB */ #ifndef AO_GCC_ATOMIC_TEST_AND_SET #ifdef AO_UNIPROCESSOR /* If only a single processor (core) is used, AO_UNIPROCESSOR could */ /* be defined by the client to avoid unnecessary memory barrier. */ AO_INLINE void AO_nop_full(void) { AO_compiler_barrier(); } # define AO_HAVE_nop_full #elif defined(AO_ARM_HAVE_DMB) /* ARMv7 is compatible to ARMv6 but has a simpler command for issuing */ /* a memory barrier (DMB). Raising it via CP15 should still work */ /* (but slightly less efficient because it requires the use of */ /* a general-purpose register). */ AO_INLINE void AO_nop_full(void) { /* AO_THUMB_GO_ARM is empty. */ __asm__ __volatile__("dmb" : : : "memory"); } # define AO_HAVE_nop_full #elif defined(AO_ARM_HAVE_LDREX) /* ARMv6 is the first architecture providing support for a simple */ /* LL/SC. A data memory barrier must be raised via CP15 command. */ AO_INLINE void AO_nop_full(void) { unsigned dest = 0; /* Issue a data memory barrier (keeps ordering of memory */ /* transactions before and after this operation). */ __asm__ __volatile__("@AO_nop_full\n" AO_THUMB_GO_ARM " mcr p15,0,%0,c7,c10,5\n" AO_THUMB_RESTORE_MODE : "=&r"(dest) : /* empty */ : AO_THUMB_SWITCH_CLOBBERS "memory"); } # define AO_HAVE_nop_full #else /* AO_nop_full() is emulated using AO_test_and_set_full(). */ #endif /* !AO_UNIPROCESSOR && !AO_ARM_HAVE_LDREX */ #endif /* !AO_GCC_ATOMIC_TEST_AND_SET */ #ifdef AO_ARM_HAVE_LDREX /* "ARM Architecture Reference Manual" (chapter A3.5.3) says that the */ /* single-copy atomic processor accesses are all byte accesses, all */ /* halfword accesses to halfword-aligned locations, all word accesses */ /* to word-aligned locations. 
*/ /* There is only a single concern related to AO store operations: */ /* a direct write (by STR[B/H] instruction) will not be recognized */ /* by the LL/SC construct on the same CPU (i.e., according to ARM */ /* documentation, e.g., see CortexA8 TRM reference, point 8.5, */ /* atomic "store" (using LDREX/STREX[B/H]) is the only safe way to */ /* set variables also used in LL/SC environment). */ /* This is only a problem if interrupt handlers do not clear the */ /* reservation (by CLREX instruction or a dummy STREX one), as they */ /* almost certainly should (e.g., see restore_user_regs defined in */ /* arch/arm/kernel/entry-header.S of Linux. Nonetheless, there is */ /* a doubt this was properly implemented in some ancient OS releases. */ # ifdef AO_BROKEN_TASKSWITCH_CLREX # define AO_SKIPATOMIC_store # define AO_SKIPATOMIC_store_release # define AO_SKIPATOMIC_char_store # define AO_SKIPATOMIC_char_store_release # define AO_SKIPATOMIC_short_store # define AO_SKIPATOMIC_short_store_release # define AO_SKIPATOMIC_int_store # define AO_SKIPATOMIC_int_store_release # ifndef AO_PREFER_BUILTIN_ATOMICS AO_INLINE void AO_store(volatile AO_t *addr, AO_t value) { int flag; __asm__ __volatile__("@AO_store\n" AO_THUMB_GO_ARM AO_BR_ALIGN "1: " AO_MASK_PTR("%2") " ldrex %0, [%2]\n" AO_MASK_PTR("%2") " strex %0, %3, [%2]\n" " teq %0, #0\n" " bne 1b\n" AO_THUMB_RESTORE_MODE : "=&r" (flag), "+m" (*addr) : "r" (addr), "r" (value) : AO_THUMB_SWITCH_CLOBBERS "cc"); } # define AO_HAVE_store # ifdef AO_ARM_HAVE_LDREXBH AO_INLINE void AO_char_store(volatile unsigned char *addr, unsigned char value) { int flag; __asm__ __volatile__("@AO_char_store\n" AO_THUMB_GO_ARM AO_BR_ALIGN "1: " AO_MASK_PTR("%2") " ldrexb %0, [%2]\n" AO_MASK_PTR("%2") " strexb %0, %3, [%2]\n" " teq %0, #0\n" " bne 1b\n" AO_THUMB_RESTORE_MODE : "=&r" (flag), "+m" (*addr) : "r" (addr), "r" (value) : AO_THUMB_SWITCH_CLOBBERS "cc"); } # define AO_HAVE_char_store AO_INLINE void AO_short_store(volatile unsigned short 
*addr, unsigned short value) { int flag; __asm__ __volatile__("@AO_short_store\n" AO_THUMB_GO_ARM AO_BR_ALIGN "1: " AO_MASK_PTR("%2") " ldrexh %0, [%2]\n" AO_MASK_PTR("%2") " strexh %0, %3, [%2]\n" " teq %0, #0\n" " bne 1b\n" AO_THUMB_RESTORE_MODE : "=&r" (flag), "+m" (*addr) : "r" (addr), "r" (value) : AO_THUMB_SWITCH_CLOBBERS "cc"); } # define AO_HAVE_short_store # endif /* AO_ARM_HAVE_LDREXBH */ # endif /* !AO_PREFER_BUILTIN_ATOMICS */ # elif !defined(AO_GCC_ATOMIC_TEST_AND_SET) # include "../loadstore/atomic_store.h" /* AO_int_store is defined in ao_t_is_int.h. */ # endif /* !AO_BROKEN_TASKSWITCH_CLREX */ #endif /* AO_ARM_HAVE_LDREX */ #ifndef AO_GCC_ATOMIC_TEST_AND_SET # include "../test_and_set_t_is_ao_t.h" /* Probably suboptimal */ #ifdef AO_ARM_HAVE_LDREX /* AO_t/char/short/int load is simple reading. */ /* Unaligned accesses are not guaranteed to be atomic. */ # define AO_ACCESS_CHECK_ALIGNED # define AO_ACCESS_short_CHECK_ALIGNED # define AO_ACCESS_int_CHECK_ALIGNED # include "../all_atomic_only_load.h" # ifndef AO_HAVE_char_store # include "../loadstore/char_atomic_store.h" # include "../loadstore/short_atomic_store.h" # endif /* NEC LE-IT: replace the SWAP as recommended by ARM: "Applies to: ARM11 Cores Though the SWP instruction will still work with ARM V6 cores, it is recommended to use the new V6 synchronization instructions. The SWP instruction produces 'locked' read and write accesses which are atomic, i.e. another operation cannot be done between these locked accesses which ties up external bus (AHB, AXI) bandwidth and can increase worst case interrupt latencies. LDREX, STREX are more flexible, other instructions can be done between the LDREX and STREX accesses." */ #ifndef AO_PREFER_GENERALIZED #if !defined(AO_FORCE_USE_SWP) || !defined(AO_ARM_HAVE_SWP) /* But, on the other hand, there could be a considerable performance */ /* degradation in case of a race. 
Eg., test_atomic.c executing */ /* test_and_set test on a dual-core ARMv7 processor using LDREX/STREX */ /* showed around 35 times lower performance than that using SWP. */ /* To force use of SWP instruction, use -D AO_FORCE_USE_SWP option */ /* (the latter is ignored if SWP instruction is unsupported). */ AO_INLINE AO_TS_VAL_t AO_test_and_set(volatile AO_TS_t *addr) { AO_TS_VAL_t oldval; int flag; __asm__ __volatile__("@AO_test_and_set\n" AO_THUMB_GO_ARM AO_BR_ALIGN "1: " AO_MASK_PTR("%3") " ldrex %0, [%3]\n" AO_MASK_PTR("%3") " strex %1, %4, [%3]\n" " teq %1, #0\n" " bne 1b\n" AO_THUMB_RESTORE_MODE : "=&r"(oldval), "=&r"(flag), "+m"(*addr) : "r"(addr), "r"(1) : AO_THUMB_SWITCH_CLOBBERS "cc"); return oldval; } # define AO_HAVE_test_and_set #endif /* !AO_FORCE_USE_SWP */ AO_INLINE AO_t AO_fetch_and_add(volatile AO_t *p, AO_t incr) { AO_t result, tmp; int flag; __asm__ __volatile__("@AO_fetch_and_add\n" AO_THUMB_GO_ARM AO_BR_ALIGN "1: " AO_MASK_PTR("%5") " ldrex %0, [%5]\n" /* get original */ " add %2, %0, %4\n" /* sum up in incr */ AO_MASK_PTR("%5") " strex %1, %2, [%5]\n" /* store them */ " teq %1, #0\n" " bne 1b\n" AO_THUMB_RESTORE_MODE : "=&r"(result), "=&r"(flag), "=&r"(tmp), "+m"(*p) /* 0..3 */ : "r"(incr), "r"(p) /* 4..5 */ : AO_THUMB_SWITCH_CLOBBERS "cc"); return result; } #define AO_HAVE_fetch_and_add AO_INLINE AO_t AO_fetch_and_add1(volatile AO_t *p) { AO_t result, tmp; int flag; __asm__ __volatile__("@AO_fetch_and_add1\n" AO_THUMB_GO_ARM AO_BR_ALIGN "1: " AO_MASK_PTR("%4") " ldrex %0, [%4]\n" /* get original */ " add %1, %0, #1\n" /* increment */ AO_MASK_PTR("%4") " strex %2, %1, [%4]\n" /* store them */ " teq %2, #0\n" " bne 1b\n" AO_THUMB_RESTORE_MODE : "=&r"(result), "=&r"(tmp), "=&r"(flag), "+m"(*p) : "r"(p) : AO_THUMB_SWITCH_CLOBBERS "cc"); return result; } #define AO_HAVE_fetch_and_add1 AO_INLINE AO_t AO_fetch_and_sub1(volatile AO_t *p) { AO_t result, tmp; int flag; __asm__ __volatile__("@AO_fetch_and_sub1\n" AO_THUMB_GO_ARM AO_BR_ALIGN "1: " 
AO_MASK_PTR("%4") " ldrex %0, [%4]\n" /* get original */ " sub %1, %0, #1\n" /* decrement */ AO_MASK_PTR("%4") " strex %2, %1, [%4]\n" /* store them */ " teq %2, #0\n" " bne 1b\n" AO_THUMB_RESTORE_MODE : "=&r"(result), "=&r"(tmp), "=&r"(flag), "+m"(*p) : "r"(p) : AO_THUMB_SWITCH_CLOBBERS "cc"); return result; } #define AO_HAVE_fetch_and_sub1 AO_INLINE void AO_and(volatile AO_t *p, AO_t value) { AO_t tmp, result; __asm__ __volatile__("@AO_and\n" AO_THUMB_GO_ARM AO_BR_ALIGN "1: " AO_MASK_PTR("%4") " ldrex %0, [%4]\n" " and %1, %0, %3\n" AO_MASK_PTR("%4") " strex %0, %1, [%4]\n" " teq %0, #0\n" " bne 1b\n" AO_THUMB_RESTORE_MODE : "=&r" (tmp), "=&r" (result), "+m" (*p) : "r" (value), "r" (p) : AO_THUMB_SWITCH_CLOBBERS "cc"); } #define AO_HAVE_and AO_INLINE void AO_or(volatile AO_t *p, AO_t value) { AO_t tmp, result; __asm__ __volatile__("@AO_or\n" AO_THUMB_GO_ARM AO_BR_ALIGN "1: " AO_MASK_PTR("%4") " ldrex %0, [%4]\n" " orr %1, %0, %3\n" AO_MASK_PTR("%4") " strex %0, %1, [%4]\n" " teq %0, #0\n" " bne 1b\n" AO_THUMB_RESTORE_MODE : "=&r" (tmp), "=&r" (result), "+m" (*p) : "r" (value), "r" (p) : AO_THUMB_SWITCH_CLOBBERS "cc"); } #define AO_HAVE_or AO_INLINE void AO_xor(volatile AO_t *p, AO_t value) { AO_t tmp, result; __asm__ __volatile__("@AO_xor\n" AO_THUMB_GO_ARM AO_BR_ALIGN "1: " AO_MASK_PTR("%4") " ldrex %0, [%4]\n" " eor %1, %0, %3\n" AO_MASK_PTR("%4") " strex %0, %1, [%4]\n" " teq %0, #0\n" " bne 1b\n" AO_THUMB_RESTORE_MODE : "=&r" (tmp), "=&r" (result), "+m" (*p) : "r" (value), "r" (p) : AO_THUMB_SWITCH_CLOBBERS "cc"); } #define AO_HAVE_xor #endif /* !AO_PREFER_GENERALIZED */ #ifdef AO_ARM_HAVE_LDREXBH AO_INLINE unsigned char AO_char_fetch_and_add(volatile unsigned char *p, unsigned char incr) { unsigned result, tmp; int flag; __asm__ __volatile__("@AO_char_fetch_and_add\n" AO_THUMB_GO_ARM AO_BR_ALIGN "1: " AO_MASK_PTR("%5") " ldrexb %0, [%5]\n" " add %2, %0, %4\n" AO_MASK_PTR("%5") " strexb %1, %2, [%5]\n" " teq %1, #0\n" " bne 1b\n" AO_THUMB_RESTORE_MODE : "=&r" 
(result), "=&r" (flag), "=&r" (tmp), "+m" (*p) : "r" ((unsigned)incr), "r" (p) : AO_THUMB_SWITCH_CLOBBERS "cc"); return (unsigned char)result; } # define AO_HAVE_char_fetch_and_add AO_INLINE unsigned short AO_short_fetch_and_add(volatile unsigned short *p, unsigned short incr) { unsigned result, tmp; int flag; __asm__ __volatile__("@AO_short_fetch_and_add\n" AO_THUMB_GO_ARM AO_BR_ALIGN "1: " AO_MASK_PTR("%5") " ldrexh %0, [%5]\n" " add %2, %0, %4\n" AO_MASK_PTR("%5") " strexh %1, %2, [%5]\n" " teq %1, #0\n" " bne 1b\n" AO_THUMB_RESTORE_MODE : "=&r" (result), "=&r" (flag), "=&r" (tmp), "+m" (*p) : "r" ((unsigned)incr), "r" (p) : AO_THUMB_SWITCH_CLOBBERS "cc"); return (unsigned short)result; } # define AO_HAVE_short_fetch_and_add #endif /* AO_ARM_HAVE_LDREXBH */ #ifndef AO_GENERALIZE_ASM_BOOL_CAS /* Returns nonzero if the comparison succeeded. */ AO_INLINE int AO_compare_and_swap(volatile AO_t *addr, AO_t old_val, AO_t new_val) { AO_t result, tmp; __asm__ __volatile__("@AO_compare_and_swap\n" AO_THUMB_GO_ARM AO_BR_ALIGN "1: mov %0, #2\n" /* store a flag */ AO_MASK_PTR("%3") " ldrex %1, [%3]\n" /* get original */ " teq %1, %4\n" /* see if match */ AO_MASK_PTR("%3") # ifdef __thumb2__ /* TODO: Eliminate warning: it blocks containing wide Thumb */ /* instructions are deprecated in ARMv8. 
*/ " it eq\n" # endif " strexeq %0, %5, [%3]\n" /* store new one if matched */ " teq %0, #1\n" " beq 1b\n" /* if update failed, repeat */ AO_THUMB_RESTORE_MODE : "=&r"(result), "=&r"(tmp), "+m"(*addr) : "r"(addr), "r"(old_val), "r"(new_val) : AO_THUMB_SWITCH_CLOBBERS "cc"); return !(result&2); /* if succeeded then return 1 else 0 */ } # define AO_HAVE_compare_and_swap #endif /* !AO_GENERALIZE_ASM_BOOL_CAS */ AO_INLINE AO_t AO_fetch_compare_and_swap(volatile AO_t *addr, AO_t old_val, AO_t new_val) { AO_t fetched_val; int flag; __asm__ __volatile__("@AO_fetch_compare_and_swap\n" AO_THUMB_GO_ARM AO_BR_ALIGN "1: mov %0, #2\n" /* store a flag */ AO_MASK_PTR("%3") " ldrex %1, [%3]\n" /* get original */ " teq %1, %4\n" /* see if match */ AO_MASK_PTR("%3") # ifdef __thumb2__ " it eq\n" # endif " strexeq %0, %5, [%3]\n" /* store new one if matched */ " teq %0, #1\n" " beq 1b\n" /* if update failed, repeat */ AO_THUMB_RESTORE_MODE : "=&r"(flag), "=&r"(fetched_val), "+m"(*addr) : "r"(addr), "r"(old_val), "r"(new_val) : AO_THUMB_SWITCH_CLOBBERS "cc"); return fetched_val; } #define AO_HAVE_fetch_compare_and_swap #ifdef AO_ARM_HAVE_LDREXD # include "../standard_ao_double_t.h" /* "ARM Architecture Reference Manual ARMv7-A/R edition" (chapter */ /* A3.5.3) says that memory accesses caused by LDREXD and STREXD */ /* instructions to doubleword-aligned locations are single-copy */ /* atomic; accesses to 64-bit elements by other instructions might */ /* not be single-copy atomic as they are executed as a sequence of */ /* 32-bit accesses. */ AO_INLINE AO_double_t AO_double_load(const volatile AO_double_t *addr) { AO_double_t result; /* AO_THUMB_GO_ARM is empty. 
*/ __asm__ __volatile__("@AO_double_load\n" AO_MASK_PTR("%1") " ldrexd %0, %H0, [%1]" : "=&r" (result.AO_whole) : "r" (addr) /* : no clobber */); return result; } # define AO_HAVE_double_load AO_INLINE void AO_double_store(volatile AO_double_t *addr, AO_double_t new_val) { AO_double_t old_val; int status; do { /* AO_THUMB_GO_ARM is empty. */ __asm__ __volatile__("@AO_double_store\n" AO_MASK_PTR("%3") " ldrexd %0, %H0, [%3]\n" AO_MASK_PTR("%3") " strexd %1, %4, %H4, [%3]" : "=&r" (old_val.AO_whole), "=&r" (status), "+m" (*addr) : "r" (addr), "r" (new_val.AO_whole) : "cc"); } while (AO_EXPECT_FALSE(status)); } # define AO_HAVE_double_store AO_INLINE int AO_double_compare_and_swap(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { double_ptr_storage tmp; int result = 1; do { /* AO_THUMB_GO_ARM is empty. */ __asm__ __volatile__("@AO_double_compare_and_swap\n" AO_MASK_PTR("%1") " ldrexd %0, %H0, [%1]\n" /* get original to r1 & r2 */ : "=&r"(tmp) : "r"(addr) /* : no clobber */); if (tmp != old_val.AO_whole) break; __asm__ __volatile__( AO_MASK_PTR("%2") " strexd %0, %3, %H3, [%2]\n" /* store new one if matched */ : "=&r"(result), "+m"(*addr) : "r" (addr), "r" (new_val.AO_whole) : "cc"); } while (AO_EXPECT_FALSE(result)); return !result; /* if succeeded then return 1 else 0 */ } # define AO_HAVE_double_compare_and_swap #endif /* AO_ARM_HAVE_LDREXD */ #else /* pre ARMv6 architectures ... */ /* I found a slide set that, if I read it correctly, claims that */ /* Loads followed by either a Load or Store are ordered, but nothing */ /* else is. */ /* It appears that SWP is the only simple memory barrier. */ #include "../all_aligned_atomic_load_store.h" /* The code should run correctly on a multi-core ARMv6+ as well. 
*/ #endif /* !AO_ARM_HAVE_LDREX */ #if !defined(AO_HAVE_test_and_set_full) && !defined(AO_HAVE_test_and_set) \ && defined (AO_ARM_HAVE_SWP) && (!defined(AO_PREFER_GENERALIZED) \ || !defined(AO_HAVE_fetch_compare_and_swap)) AO_INLINE AO_TS_VAL_t AO_test_and_set_full(volatile AO_TS_t *addr) { AO_TS_VAL_t oldval; /* SWP on ARM is very similar to XCHG on x86. */ /* The first operand is the result, the second the value */ /* to be stored. Both registers must be different from addr. */ /* Make the address operand an early clobber output so it */ /* doesn't overlap with the other operands. The early clobber */ /* on oldval is necessary to prevent the compiler allocating */ /* them to the same register if they are both unused. */ __asm__ __volatile__("@AO_test_and_set_full\n" AO_THUMB_GO_ARM AO_MASK_PTR("%3") " swp %0, %2, [%3]\n" /* Ignore GCC "SWP is deprecated for this architecture" */ /* warning here (for ARMv6+). */ AO_THUMB_RESTORE_MODE : "=&r"(oldval), "=&r"(addr) : "r"(1), "1"(addr) : AO_THUMB_SWITCH_CLOBBERS "memory"); return oldval; } # define AO_HAVE_test_and_set_full #endif /* !AO_HAVE_test_and_set[_full] && AO_ARM_HAVE_SWP */ #define AO_T_IS_INT #else /* AO_GCC_ATOMIC_TEST_AND_SET */ # if defined(__clang__) && !defined(AO_ARM_HAVE_LDREX) /* As of clang-3.8, it cannot compile __atomic_and/or/xor_fetch */ /* library calls yet for pre ARMv6. 
*/ # define AO_SKIPATOMIC_ANY_and_ANY # define AO_SKIPATOMIC_ANY_or_ANY # define AO_SKIPATOMIC_ANY_xor_ANY # endif # ifdef AO_ARM_HAVE_LDREXD # include "../standard_ao_double_t.h" # endif # include "generic.h" #endif /* AO_GCC_ATOMIC_TEST_AND_SET */ #undef AO_ARM_HAVE_DMB #undef AO_ARM_HAVE_LDREX #undef AO_ARM_HAVE_LDREXBH #undef AO_ARM_HAVE_LDREXD #undef AO_ARM_HAVE_SWP #undef AO_BR_ALIGN #undef AO_MASK_PTR #undef AO_SKIPATOMIC_ANY_and_ANY #undef AO_SKIPATOMIC_ANY_or_ANY #undef AO_SKIPATOMIC_ANY_xor_ANY #undef AO_SKIPATOMIC_char_store #undef AO_SKIPATOMIC_char_store_release #undef AO_SKIPATOMIC_int_store #undef AO_SKIPATOMIC_int_store_release #undef AO_SKIPATOMIC_short_store #undef AO_SKIPATOMIC_short_store_release #undef AO_SKIPATOMIC_store #undef AO_SKIPATOMIC_store_release #undef AO_THUMB_GO_ARM #undef AO_THUMB_RESTORE_MODE #undef AO_THUMB_SWITCH_CLOBBERS papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/avr32.h000066400000000000000000000046151502707512200221260ustar00rootroot00000000000000/* * Copyright (C) 2009 Bradley Smith * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the * "Software"), to deal in the Software without restriction, including * without limitation the rights to use, copy, modify, merge, publish, * distribute, sublicense, and/or sell copies of the Software, and to * permit persons to whom the Software is furnished to do so, subject to * the following conditions: * * The above copyright notice and this permission notice shall be included * in all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
* IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * */ #include "../all_atomic_load_store.h" #include "../ordered.h" /* There are no multiprocessor implementations. */ #include "../test_and_set_t_is_ao_t.h" #ifndef AO_PREFER_GENERALIZED AO_INLINE AO_TS_VAL_t AO_test_and_set_full(volatile AO_TS_t *addr) { register long ret; __asm__ __volatile__( "xchg %[oldval], %[mem], %[newval]" : [oldval] "=&r"(ret) : [mem] "r"(addr), [newval] "r"(1) : "memory"); return (AO_TS_VAL_t)ret; } # define AO_HAVE_test_and_set_full #endif /* !AO_PREFER_GENERALIZED */ AO_INLINE int AO_compare_and_swap_full(volatile AO_t *addr, AO_t old, AO_t new_val) { register long ret; __asm__ __volatile__( "1: ssrf 5\n" " ld.w %[res], %[mem]\n" " eor %[res], %[oldval]\n" " brne 2f\n" " stcond %[mem], %[newval]\n" " brne 1b\n" "2:\n" : [res] "=&r"(ret), [mem] "=m"(*addr) : "m"(*addr), [newval] "r"(new_val), [oldval] "r"(old) : "cc", "memory"); return (int)ret; } #define AO_HAVE_compare_and_swap_full /* TODO: implement AO_fetch_compare_and_swap. */ #define AO_T_IS_INT papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/cris.h000066400000000000000000000052271502707512200221310ustar00rootroot00000000000000/* * Copyright (c) 2004 Hewlett-Packard Development Company, L.P. 
* * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* FIXME: seems to be untested. */ #include "../all_atomic_load_store.h" #include "../ordered.h" /* There are no multiprocessor implementations. */ #include "../test_and_set_t_is_ao_t.h" /* * The architecture apparently supports an "f" flag which is * set on preemption. This essentially gives us load-locked, * store-conditional primitives, though I'm not quite sure how * this would work on a hypothetical multiprocessor. -HB * * For details, see * http://developer.axis.com/doc/hardware/etrax100lx/prog_man/ * 1_architectural_description.pdf * * TODO: Presumably many other primitives (notably CAS, including the double- * width versions) could be implemented in this manner, if someone got * around to it. */ AO_INLINE AO_TS_VAL_t AO_test_and_set_full(volatile AO_TS_t *addr) { /* Ripped from linuxthreads/sysdeps/cris/pt-machine.h */ register unsigned long int ret; /* Note the use of a dummy output of *addr to expose the write. 
The memory barrier is to stop *other* writes being moved past this code. */ __asm__ __volatile__("clearf\n" "0:\n\t" "movu.b [%2],%0\n\t" "ax\n\t" "move.b %3,[%2]\n\t" "bwf 0b\n\t" "clearf" : "=&r" (ret), "=m" (*addr) : "r" (addr), "r" ((int) 1), "m" (*addr) : "memory"); return ret; } #define AO_HAVE_test_and_set_full papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/e2k.h000066400000000000000000000023751502707512200216530ustar00rootroot00000000000000/* * Copyright (c) 2022 Ivan Maidanski * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* As of clang-9, all __GCC_HAVE_SYNC_COMPARE_AND_SWAP_n are missing. */ #define AO_GCC_FORCE_HAVE_CAS #include "generic.h" #undef AO_GCC_FORCE_HAVE_CAS papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/generic-arithm.h000066400000000000000000000623341502707512200240710ustar00rootroot00000000000000/* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. 
* Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. * */ #ifndef AO_NO_char_ARITHM AO_INLINE unsigned/**/char AO_char_fetch_and_add(volatile unsigned/**/char *addr, unsigned/**/char incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_RELAXED); } #define AO_HAVE_char_fetch_and_add #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_char_and(volatile unsigned/**/char *addr, unsigned/**/char value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_char_and #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_char_or(volatile unsigned/**/char *addr, unsigned/**/char value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_char_or #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_char_xor(volatile unsigned/**/char *addr, unsigned/**/char value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_char_xor #endif #endif /* !AO_NO_char_ARITHM */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. 
* Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. * */ #ifndef AO_NO_short_ARITHM AO_INLINE unsigned/**/short AO_short_fetch_and_add(volatile unsigned/**/short *addr, unsigned/**/short incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_RELAXED); } #define AO_HAVE_short_fetch_and_add #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_short_and(volatile unsigned/**/short *addr, unsigned/**/short value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_short_and #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_short_or(volatile unsigned/**/short *addr, unsigned/**/short value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_short_or #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_short_xor(volatile unsigned/**/short *addr, unsigned/**/short value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_short_xor #endif #endif /* !AO_NO_short_ARITHM */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #ifndef AO_NO_int_ARITHM AO_INLINE unsigned AO_int_fetch_and_add(volatile unsigned *addr, unsigned incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_RELAXED); } #define AO_HAVE_int_fetch_and_add #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_int_and(volatile unsigned *addr, unsigned value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_int_and #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_int_or(volatile unsigned *addr, unsigned value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_int_or #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_int_xor(volatile unsigned *addr, unsigned value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_int_xor #endif #endif /* !AO_NO_int_ARITHM */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #ifndef AO_NO_ARITHM AO_INLINE AO_t AO_fetch_and_add(volatile AO_t *addr, AO_t incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_RELAXED); } #define AO_HAVE_fetch_and_add #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_and(volatile AO_t *addr, AO_t value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_and #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_or(volatile AO_t *addr, AO_t value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_or #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_xor(volatile AO_t *addr, AO_t value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_xor #endif #endif /* !AO_NO_ARITHM */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #ifndef AO_NO_char_ARITHM AO_INLINE unsigned/**/char AO_char_fetch_and_add_acquire(volatile unsigned/**/char *addr, unsigned/**/char incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_ACQUIRE); } #define AO_HAVE_char_fetch_and_add_acquire #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_char_and_acquire(volatile unsigned/**/char *addr, unsigned/**/char value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_ACQUIRE); } # define AO_HAVE_char_and_acquire #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_char_or_acquire(volatile unsigned/**/char *addr, unsigned/**/char value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_ACQUIRE); } # define AO_HAVE_char_or_acquire #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_char_xor_acquire(volatile unsigned/**/char *addr, unsigned/**/char value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_ACQUIRE); } # define AO_HAVE_char_xor_acquire #endif #endif /* !AO_NO_char_ARITHM */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #ifndef AO_NO_short_ARITHM AO_INLINE unsigned/**/short AO_short_fetch_and_add_acquire(volatile unsigned/**/short *addr, unsigned/**/short incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_ACQUIRE); } #define AO_HAVE_short_fetch_and_add_acquire #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_short_and_acquire(volatile unsigned/**/short *addr, unsigned/**/short value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_ACQUIRE); } # define AO_HAVE_short_and_acquire #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_short_or_acquire(volatile unsigned/**/short *addr, unsigned/**/short value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_ACQUIRE); } # define AO_HAVE_short_or_acquire #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_short_xor_acquire(volatile unsigned/**/short *addr, unsigned/**/short value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_ACQUIRE); } # define AO_HAVE_short_xor_acquire #endif #endif /* !AO_NO_short_ARITHM */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #ifndef AO_NO_int_ARITHM AO_INLINE unsigned AO_int_fetch_and_add_acquire(volatile unsigned *addr, unsigned incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_ACQUIRE); } #define AO_HAVE_int_fetch_and_add_acquire #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_int_and_acquire(volatile unsigned *addr, unsigned value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_ACQUIRE); } # define AO_HAVE_int_and_acquire #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_int_or_acquire(volatile unsigned *addr, unsigned value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_ACQUIRE); } # define AO_HAVE_int_or_acquire #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_int_xor_acquire(volatile unsigned *addr, unsigned value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_ACQUIRE); } # define AO_HAVE_int_xor_acquire #endif #endif /* !AO_NO_int_ARITHM */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #ifndef AO_NO_ARITHM AO_INLINE AO_t AO_fetch_and_add_acquire(volatile AO_t *addr, AO_t incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_ACQUIRE); } #define AO_HAVE_fetch_and_add_acquire #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_and_acquire(volatile AO_t *addr, AO_t value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_ACQUIRE); } # define AO_HAVE_and_acquire #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_or_acquire(volatile AO_t *addr, AO_t value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_ACQUIRE); } # define AO_HAVE_or_acquire #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_xor_acquire(volatile AO_t *addr, AO_t value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_ACQUIRE); } # define AO_HAVE_xor_acquire #endif #endif /* !AO_NO_ARITHM */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #ifndef AO_NO_char_ARITHM AO_INLINE unsigned/**/char AO_char_fetch_and_add_release(volatile unsigned/**/char *addr, unsigned/**/char incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_RELEASE); } #define AO_HAVE_char_fetch_and_add_release #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_char_and_release(volatile unsigned/**/char *addr, unsigned/**/char value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_char_and_release #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_char_or_release(volatile unsigned/**/char *addr, unsigned/**/char value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_char_or_release #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_char_xor_release(volatile unsigned/**/char *addr, unsigned/**/char value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_char_xor_release #endif #endif /* !AO_NO_char_ARITHM */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #ifndef AO_NO_short_ARITHM AO_INLINE unsigned/**/short AO_short_fetch_and_add_release(volatile unsigned/**/short *addr, unsigned/**/short incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_RELEASE); } #define AO_HAVE_short_fetch_and_add_release #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_short_and_release(volatile unsigned/**/short *addr, unsigned/**/short value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_short_and_release #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_short_or_release(volatile unsigned/**/short *addr, unsigned/**/short value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_short_or_release #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_short_xor_release(volatile unsigned/**/short *addr, unsigned/**/short value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_short_xor_release #endif #endif /* !AO_NO_short_ARITHM */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #ifndef AO_NO_int_ARITHM AO_INLINE unsigned AO_int_fetch_and_add_release(volatile unsigned *addr, unsigned incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_RELEASE); } #define AO_HAVE_int_fetch_and_add_release #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_int_and_release(volatile unsigned *addr, unsigned value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_int_and_release #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_int_or_release(volatile unsigned *addr, unsigned value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_int_or_release #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_int_xor_release(volatile unsigned *addr, unsigned value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_int_xor_release #endif #endif /* !AO_NO_int_ARITHM */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #ifndef AO_NO_ARITHM AO_INLINE AO_t AO_fetch_and_add_release(volatile AO_t *addr, AO_t incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_RELEASE); } #define AO_HAVE_fetch_and_add_release #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_and_release(volatile AO_t *addr, AO_t value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_and_release #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_or_release(volatile AO_t *addr, AO_t value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_or_release #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_xor_release(volatile AO_t *addr, AO_t value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_xor_release #endif #endif /* !AO_NO_ARITHM */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #ifndef AO_NO_char_ARITHM AO_INLINE unsigned/**/char AO_char_fetch_and_add_full(volatile unsigned/**/char *addr, unsigned/**/char incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_SEQ_CST); } #define AO_HAVE_char_fetch_and_add_full #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_char_and_full(volatile unsigned/**/char *addr, unsigned/**/char value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_SEQ_CST); } # define AO_HAVE_char_and_full #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_char_or_full(volatile unsigned/**/char *addr, unsigned/**/char value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_SEQ_CST); } # define AO_HAVE_char_or_full #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_char_xor_full(volatile unsigned/**/char *addr, unsigned/**/char value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_SEQ_CST); } # define AO_HAVE_char_xor_full #endif #endif /* !AO_NO_char_ARITHM */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #ifndef AO_NO_short_ARITHM AO_INLINE unsigned/**/short AO_short_fetch_and_add_full(volatile unsigned/**/short *addr, unsigned/**/short incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_SEQ_CST); } #define AO_HAVE_short_fetch_and_add_full #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_short_and_full(volatile unsigned/**/short *addr, unsigned/**/short value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_SEQ_CST); } # define AO_HAVE_short_and_full #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_short_or_full(volatile unsigned/**/short *addr, unsigned/**/short value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_SEQ_CST); } # define AO_HAVE_short_or_full #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_short_xor_full(volatile unsigned/**/short *addr, unsigned/**/short value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_SEQ_CST); } # define AO_HAVE_short_xor_full #endif #endif /* !AO_NO_short_ARITHM */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #ifndef AO_NO_int_ARITHM AO_INLINE unsigned AO_int_fetch_and_add_full(volatile unsigned *addr, unsigned incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_SEQ_CST); } #define AO_HAVE_int_fetch_and_add_full #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_int_and_full(volatile unsigned *addr, unsigned value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_SEQ_CST); } # define AO_HAVE_int_and_full #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_int_or_full(volatile unsigned *addr, unsigned value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_SEQ_CST); } # define AO_HAVE_int_or_full #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_int_xor_full(volatile unsigned *addr, unsigned value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_SEQ_CST); } # define AO_HAVE_int_xor_full #endif #endif /* !AO_NO_int_ARITHM */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #ifndef AO_NO_ARITHM AO_INLINE AO_t AO_fetch_and_add_full(volatile AO_t *addr, AO_t incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_SEQ_CST); } #define AO_HAVE_fetch_and_add_full #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_and_full(volatile AO_t *addr, AO_t value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_SEQ_CST); } # define AO_HAVE_and_full #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_or_full(volatile AO_t *addr, AO_t value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_SEQ_CST); } # define AO_HAVE_or_full #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_xor_full(volatile AO_t *addr, AO_t value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_SEQ_CST); } # define AO_HAVE_xor_full #endif #endif /* !AO_NO_ARITHM */ papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/generic-arithm.template000066400000000000000000000030631502707512200254470ustar00rootroot00000000000000/* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #ifndef AO_NO_XSIZE_ARITHM AO_INLINE XCTYPE AO_XSIZE_fetch_and_add_XBAR(volatile XCTYPE *addr, XCTYPE incr) { return __atomic_fetch_add(addr, incr, __ATOMIC_XGCCBAR); } #define AO_HAVE_XSIZE_fetch_and_add_XBAR #ifndef AO_SKIPATOMIC_ANY_and_ANY AO_INLINE void AO_XSIZE_and_XBAR(volatile XCTYPE *addr, XCTYPE value) { (void)__atomic_and_fetch(addr, value, __ATOMIC_XGCCBAR); } # define AO_HAVE_XSIZE_and_XBAR #endif #ifndef AO_SKIPATOMIC_ANY_or_ANY AO_INLINE void AO_XSIZE_or_XBAR(volatile XCTYPE *addr, XCTYPE value) { (void)__atomic_or_fetch(addr, value, __ATOMIC_XGCCBAR); } # define AO_HAVE_XSIZE_or_XBAR #endif #ifndef AO_SKIPATOMIC_ANY_xor_ANY AO_INLINE void AO_XSIZE_xor_XBAR(volatile XCTYPE *addr, XCTYPE value) { (void)__atomic_xor_fetch(addr, value, __ATOMIC_XGCCBAR); } # define AO_HAVE_XSIZE_xor_XBAR #endif #endif /* !AO_NO_XSIZE_ARITHM */ papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/generic-small.h000066400000000000000000000563271502707512200237220ustar00rootroot00000000000000/* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #if !defined(AO_GCC_HAVE_char_SYNC_CAS) || !defined(AO_PREFER_GENERALIZED) AO_INLINE unsigned/**/char AO_char_load(const volatile unsigned/**/char *addr) { return __atomic_load_n(addr, __ATOMIC_RELAXED); } #define AO_HAVE_char_load AO_INLINE unsigned/**/char AO_char_load_acquire(const volatile unsigned/**/char *addr) { return __atomic_load_n(addr, __ATOMIC_ACQUIRE); } #define AO_HAVE_char_load_acquire /* char_load_read is defined using load and nop_read. */ /* TODO: Map it to ACQUIRE. We should be strengthening the read and */ /* write stuff to the more general acquire/release versions. It almost */ /* never makes a difference and is much less error-prone. */ /* char_load_full is generalized using load and nop_full. */ /* TODO: Map it to SEQ_CST and clarify the documentation. */ /* TODO: Map load_dd_acquire_read to ACQUIRE. Ideally it should be */ /* mapped to CONSUME, but the latter is currently broken. */ /* char_store_full definition is omitted similar to load_full reason. */ /* TODO: Map store_write to RELEASE. 
*/ #ifndef AO_SKIPATOMIC_char_store AO_INLINE void AO_char_store(volatile unsigned/**/char *addr, unsigned/**/char value) { __atomic_store_n(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_char_store #endif #ifndef AO_SKIPATOMIC_char_store_release AO_INLINE void AO_char_store_release(volatile unsigned/**/char *addr, unsigned/**/char value) { __atomic_store_n(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_char_store_release #endif #endif /* !AO_GCC_HAVE_char_SYNC_CAS || !AO_PREFER_GENERALIZED */ #ifdef AO_GCC_HAVE_char_SYNC_CAS AO_INLINE unsigned/**/char AO_char_fetch_compare_and_swap(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { (void)__atomic_compare_exchange_n(addr, &old_val /* p_expected */, new_val /* desired */, 0 /* is_weak: false */, __ATOMIC_RELAXED /* success */, __ATOMIC_RELAXED /* failure */); return old_val; } # define AO_HAVE_char_fetch_compare_and_swap AO_INLINE unsigned/**/char AO_char_fetch_compare_and_swap_acquire(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { (void)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE); return old_val; } # define AO_HAVE_char_fetch_compare_and_swap_acquire AO_INLINE unsigned/**/char AO_char_fetch_compare_and_swap_release(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { (void)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_RELEASE, __ATOMIC_RELAXED /* failure */); return old_val; } # define AO_HAVE_char_fetch_compare_and_swap_release AO_INLINE unsigned/**/char AO_char_fetch_compare_and_swap_full(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { (void)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE /* failure */); return old_val; } # define AO_HAVE_char_fetch_compare_and_swap_full # ifndef AO_GENERALIZE_ASM_BOOL_CAS AO_INLINE int 
AO_char_compare_and_swap(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED); } # define AO_HAVE_char_compare_and_swap AO_INLINE int AO_char_compare_and_swap_acquire(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE); } # define AO_HAVE_char_compare_and_swap_acquire AO_INLINE int AO_char_compare_and_swap_release(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_RELEASE, __ATOMIC_RELAXED /* failure */); } # define AO_HAVE_char_compare_and_swap_release AO_INLINE int AO_char_compare_and_swap_full(volatile unsigned/**/char *addr, unsigned/**/char old_val, unsigned/**/char new_val) { return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE /* failure */); } # define AO_HAVE_char_compare_and_swap_full # endif /* !AO_GENERALIZE_ASM_BOOL_CAS */ #endif /* AO_GCC_HAVE_char_SYNC_CAS */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #if !defined(AO_GCC_HAVE_short_SYNC_CAS) || !defined(AO_PREFER_GENERALIZED) AO_INLINE unsigned/**/short AO_short_load(const volatile unsigned/**/short *addr) { return __atomic_load_n(addr, __ATOMIC_RELAXED); } #define AO_HAVE_short_load AO_INLINE unsigned/**/short AO_short_load_acquire(const volatile unsigned/**/short *addr) { return __atomic_load_n(addr, __ATOMIC_ACQUIRE); } #define AO_HAVE_short_load_acquire /* short_load_read is defined using load and nop_read. */ /* TODO: Map it to ACQUIRE. We should be strengthening the read and */ /* write stuff to the more general acquire/release versions. It almost */ /* never makes a difference and is much less error-prone. */ /* short_load_full is generalized using load and nop_full. */ /* TODO: Map it to SEQ_CST and clarify the documentation. */ /* TODO: Map load_dd_acquire_read to ACQUIRE. Ideally it should be */ /* mapped to CONSUME, but the latter is currently broken. */ /* short_store_full definition is omitted similar to load_full reason. */ /* TODO: Map store_write to RELEASE. 
*/ #ifndef AO_SKIPATOMIC_short_store AO_INLINE void AO_short_store(volatile unsigned/**/short *addr, unsigned/**/short value) { __atomic_store_n(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_short_store #endif #ifndef AO_SKIPATOMIC_short_store_release AO_INLINE void AO_short_store_release(volatile unsigned/**/short *addr, unsigned/**/short value) { __atomic_store_n(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_short_store_release #endif #endif /* !AO_GCC_HAVE_short_SYNC_CAS || !AO_PREFER_GENERALIZED */ #ifdef AO_GCC_HAVE_short_SYNC_CAS AO_INLINE unsigned/**/short AO_short_fetch_compare_and_swap(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { (void)__atomic_compare_exchange_n(addr, &old_val /* p_expected */, new_val /* desired */, 0 /* is_weak: false */, __ATOMIC_RELAXED /* success */, __ATOMIC_RELAXED /* failure */); return old_val; } # define AO_HAVE_short_fetch_compare_and_swap AO_INLINE unsigned/**/short AO_short_fetch_compare_and_swap_acquire(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { (void)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE); return old_val; } # define AO_HAVE_short_fetch_compare_and_swap_acquire AO_INLINE unsigned/**/short AO_short_fetch_compare_and_swap_release(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { (void)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_RELEASE, __ATOMIC_RELAXED /* failure */); return old_val; } # define AO_HAVE_short_fetch_compare_and_swap_release AO_INLINE unsigned/**/short AO_short_fetch_compare_and_swap_full(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { (void)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE /* failure */); return old_val; } # define AO_HAVE_short_fetch_compare_and_swap_full # ifndef AO_GENERALIZE_ASM_BOOL_CAS AO_INLINE 
int AO_short_compare_and_swap(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED); } # define AO_HAVE_short_compare_and_swap AO_INLINE int AO_short_compare_and_swap_acquire(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE); } # define AO_HAVE_short_compare_and_swap_acquire AO_INLINE int AO_short_compare_and_swap_release(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_RELEASE, __ATOMIC_RELAXED /* failure */); } # define AO_HAVE_short_compare_and_swap_release AO_INLINE int AO_short_compare_and_swap_full(volatile unsigned/**/short *addr, unsigned/**/short old_val, unsigned/**/short new_val) { return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE /* failure */); } # define AO_HAVE_short_compare_and_swap_full # endif /* !AO_GENERALIZE_ASM_BOOL_CAS */ #endif /* AO_GCC_HAVE_short_SYNC_CAS */ /* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. 
* */ #if !defined(AO_GCC_HAVE_int_SYNC_CAS) || !defined(AO_PREFER_GENERALIZED) AO_INLINE unsigned AO_int_load(const volatile unsigned *addr) { return __atomic_load_n(addr, __ATOMIC_RELAXED); } #define AO_HAVE_int_load AO_INLINE unsigned AO_int_load_acquire(const volatile unsigned *addr) { return __atomic_load_n(addr, __ATOMIC_ACQUIRE); } #define AO_HAVE_int_load_acquire /* int_load_read is defined using load and nop_read. */ /* TODO: Map it to ACQUIRE. We should be strengthening the read and */ /* write stuff to the more general acquire/release versions. It almost */ /* never makes a difference and is much less error-prone. */ /* int_load_full is generalized using load and nop_full. */ /* TODO: Map it to SEQ_CST and clarify the documentation. */ /* TODO: Map load_dd_acquire_read to ACQUIRE. Ideally it should be */ /* mapped to CONSUME, but the latter is currently broken. */ /* int_store_full definition is omitted similar to load_full reason. */ /* TODO: Map store_write to RELEASE. */ #ifndef AO_SKIPATOMIC_int_store AO_INLINE void AO_int_store(volatile unsigned *addr, unsigned value) { __atomic_store_n(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_int_store #endif #ifndef AO_SKIPATOMIC_int_store_release AO_INLINE void AO_int_store_release(volatile unsigned *addr, unsigned value) { __atomic_store_n(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_int_store_release #endif #endif /* !AO_GCC_HAVE_int_SYNC_CAS || !AO_PREFER_GENERALIZED */ #ifdef AO_GCC_HAVE_int_SYNC_CAS AO_INLINE unsigned AO_int_fetch_compare_and_swap(volatile unsigned *addr, unsigned old_val, unsigned new_val) { (void)__atomic_compare_exchange_n(addr, &old_val /* p_expected */, new_val /* desired */, 0 /* is_weak: false */, __ATOMIC_RELAXED /* success */, __ATOMIC_RELAXED /* failure */); return old_val; } # define AO_HAVE_int_fetch_compare_and_swap AO_INLINE unsigned AO_int_fetch_compare_and_swap_acquire(volatile unsigned *addr, unsigned old_val, unsigned new_val) { 
(void)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE); return old_val; } # define AO_HAVE_int_fetch_compare_and_swap_acquire AO_INLINE unsigned AO_int_fetch_compare_and_swap_release(volatile unsigned *addr, unsigned old_val, unsigned new_val) { (void)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_RELEASE, __ATOMIC_RELAXED /* failure */); return old_val; } # define AO_HAVE_int_fetch_compare_and_swap_release AO_INLINE unsigned AO_int_fetch_compare_and_swap_full(volatile unsigned *addr, unsigned old_val, unsigned new_val) { (void)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE /* failure */); return old_val; } # define AO_HAVE_int_fetch_compare_and_swap_full # ifndef AO_GENERALIZE_ASM_BOOL_CAS AO_INLINE int AO_int_compare_and_swap(volatile unsigned *addr, unsigned old_val, unsigned new_val) { return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED); } # define AO_HAVE_int_compare_and_swap AO_INLINE int AO_int_compare_and_swap_acquire(volatile unsigned *addr, unsigned old_val, unsigned new_val) { return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE); } # define AO_HAVE_int_compare_and_swap_acquire AO_INLINE int AO_int_compare_and_swap_release(volatile unsigned *addr, unsigned old_val, unsigned new_val) { return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_RELEASE, __ATOMIC_RELAXED /* failure */); } # define AO_HAVE_int_compare_and_swap_release AO_INLINE int AO_int_compare_and_swap_full(volatile unsigned *addr, unsigned old_val, unsigned new_val) { return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE /* failure */); } # define AO_HAVE_int_compare_and_swap_full # endif /* !AO_GENERALIZE_ASM_BOOL_CAS */ #endif /* AO_GCC_HAVE_int_SYNC_CAS */ /* * Copyright (c) 1991-1994 by Xerox Corporation. 
All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. * */ #if !defined(AO_GCC_HAVE_SYNC_CAS) || !defined(AO_PREFER_GENERALIZED) AO_INLINE AO_t AO_load(const volatile AO_t *addr) { return __atomic_load_n(addr, __ATOMIC_RELAXED); } #define AO_HAVE_load AO_INLINE AO_t AO_load_acquire(const volatile AO_t *addr) { return __atomic_load_n(addr, __ATOMIC_ACQUIRE); } #define AO_HAVE_load_acquire /* load_read is defined using load and nop_read. */ /* TODO: Map it to ACQUIRE. We should be strengthening the read and */ /* write stuff to the more general acquire/release versions. It almost */ /* never makes a difference and is much less error-prone. */ /* load_full is generalized using load and nop_full. */ /* TODO: Map it to SEQ_CST and clarify the documentation. */ /* TODO: Map load_dd_acquire_read to ACQUIRE. Ideally it should be */ /* mapped to CONSUME, but the latter is currently broken. */ /* store_full definition is omitted similar to load_full reason. */ /* TODO: Map store_write to RELEASE. 
*/ #ifndef AO_SKIPATOMIC_store AO_INLINE void AO_store(volatile AO_t *addr, AO_t value) { __atomic_store_n(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_store #endif #ifndef AO_SKIPATOMIC_store_release AO_INLINE void AO_store_release(volatile AO_t *addr, AO_t value) { __atomic_store_n(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_store_release #endif #endif /* !AO_GCC_HAVE_SYNC_CAS || !AO_PREFER_GENERALIZED */ #ifdef AO_GCC_HAVE_SYNC_CAS AO_INLINE AO_t AO_fetch_compare_and_swap(volatile AO_t *addr, AO_t old_val, AO_t new_val) { (void)__atomic_compare_exchange_n(addr, &old_val /* p_expected */, new_val /* desired */, 0 /* is_weak: false */, __ATOMIC_RELAXED /* success */, __ATOMIC_RELAXED /* failure */); return old_val; } # define AO_HAVE_fetch_compare_and_swap AO_INLINE AO_t AO_fetch_compare_and_swap_acquire(volatile AO_t *addr, AO_t old_val, AO_t new_val) { (void)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE); return old_val; } # define AO_HAVE_fetch_compare_and_swap_acquire AO_INLINE AO_t AO_fetch_compare_and_swap_release(volatile AO_t *addr, AO_t old_val, AO_t new_val) { (void)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_RELEASE, __ATOMIC_RELAXED /* failure */); return old_val; } # define AO_HAVE_fetch_compare_and_swap_release AO_INLINE AO_t AO_fetch_compare_and_swap_full(volatile AO_t *addr, AO_t old_val, AO_t new_val) { (void)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE /* failure */); return old_val; } # define AO_HAVE_fetch_compare_and_swap_full # ifndef AO_GENERALIZE_ASM_BOOL_CAS AO_INLINE int AO_compare_and_swap(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED); } # define AO_HAVE_compare_and_swap AO_INLINE int AO_compare_and_swap_acquire(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return (int)__atomic_compare_exchange_n(addr, 
                                            &old_val, new_val, 0,
                                            __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE);
  }
# define AO_HAVE_compare_and_swap_acquire

  AO_INLINE int
  AO_compare_and_swap_release(volatile AO_t *addr, AO_t old_val, AO_t new_val)
  {
    return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0,
                          __ATOMIC_RELEASE, __ATOMIC_RELAXED /* failure */);
  }
# define AO_HAVE_compare_and_swap_release

  AO_INLINE int
  AO_compare_and_swap_full(volatile AO_t *addr, AO_t old_val, AO_t new_val)
  {
    return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0,
                          __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE /* failure */);
  }
# define AO_HAVE_compare_and_swap_full

# endif /* !AO_GENERALIZE_ASM_BOOL_CAS */

#endif /* AO_GCC_HAVE_SYNC_CAS */

/* ==== papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/generic-small.template ==== */
/*
 * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved.
 * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved.
 * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P.
 *
 * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED
 * OR IMPLIED. ANY USE IS AT YOUR OWN RISK.
 *
 * Permission is hereby granted to use or copy this program
 * for any purpose, provided the above notices are retained on all copies.
 * Permission to modify the code and to distribute modified code is granted,
 * provided the above notices are retained, and a notice that the code was
 * modified is included with the above copyright notice.
 */

#if !defined(AO_GCC_HAVE_XSIZE_SYNC_CAS) || !defined(AO_PREFER_GENERALIZED)

AO_INLINE XCTYPE
AO_XSIZE_load(const volatile XCTYPE *addr)
{
  return __atomic_load_n(addr, __ATOMIC_RELAXED);
}
#define AO_HAVE_XSIZE_load

AO_INLINE XCTYPE
AO_XSIZE_load_acquire(const volatile XCTYPE *addr)
{
  return __atomic_load_n(addr, __ATOMIC_ACQUIRE);
}
#define AO_HAVE_XSIZE_load_acquire

/* XSIZE_load_read is defined using load and nop_read. */
/* TODO: Map it to ACQUIRE.
We should be strengthening the read and */ /* write stuff to the more general acquire/release versions. It almost */ /* never makes a difference and is much less error-prone. */ /* XSIZE_load_full is generalized using load and nop_full. */ /* TODO: Map it to SEQ_CST and clarify the documentation. */ /* TODO: Map load_dd_acquire_read to ACQUIRE. Ideally it should be */ /* mapped to CONSUME, but the latter is currently broken. */ /* XSIZE_store_full definition is omitted similar to load_full reason. */ /* TODO: Map store_write to RELEASE. */ #ifndef AO_SKIPATOMIC_XSIZE_store AO_INLINE void AO_XSIZE_store(volatile XCTYPE *addr, XCTYPE value) { __atomic_store_n(addr, value, __ATOMIC_RELAXED); } # define AO_HAVE_XSIZE_store #endif #ifndef AO_SKIPATOMIC_XSIZE_store_release AO_INLINE void AO_XSIZE_store_release(volatile XCTYPE *addr, XCTYPE value) { __atomic_store_n(addr, value, __ATOMIC_RELEASE); } # define AO_HAVE_XSIZE_store_release #endif #endif /* !AO_GCC_HAVE_XSIZE_SYNC_CAS || !AO_PREFER_GENERALIZED */ #ifdef AO_GCC_HAVE_XSIZE_SYNC_CAS AO_INLINE XCTYPE AO_XSIZE_fetch_compare_and_swap(volatile XCTYPE *addr, XCTYPE old_val, XCTYPE new_val) { (void)__atomic_compare_exchange_n(addr, &old_val /* p_expected */, new_val /* desired */, 0 /* is_weak: false */, __ATOMIC_RELAXED /* success */, __ATOMIC_RELAXED /* failure */); return old_val; } # define AO_HAVE_XSIZE_fetch_compare_and_swap AO_INLINE XCTYPE AO_XSIZE_fetch_compare_and_swap_acquire(volatile XCTYPE *addr, XCTYPE old_val, XCTYPE new_val) { (void)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE); return old_val; } # define AO_HAVE_XSIZE_fetch_compare_and_swap_acquire AO_INLINE XCTYPE AO_XSIZE_fetch_compare_and_swap_release(volatile XCTYPE *addr, XCTYPE old_val, XCTYPE new_val) { (void)__atomic_compare_exchange_n(addr, &old_val, new_val, 0, __ATOMIC_RELEASE, __ATOMIC_RELAXED /* failure */); return old_val; } # define AO_HAVE_XSIZE_fetch_compare_and_swap_release AO_INLINE 
XCTYPE
AO_XSIZE_fetch_compare_and_swap_full(volatile XCTYPE *addr,
                                     XCTYPE old_val, XCTYPE new_val)
{
  (void)__atomic_compare_exchange_n(addr, &old_val, new_val, 0,
                        __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE /* failure */);
  return old_val;
}
# define AO_HAVE_XSIZE_fetch_compare_and_swap_full

# ifndef AO_GENERALIZE_ASM_BOOL_CAS
  AO_INLINE int
  AO_XSIZE_compare_and_swap(volatile XCTYPE *addr, XCTYPE old_val,
                            XCTYPE new_val)
  {
    return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0,
                                            __ATOMIC_RELAXED, __ATOMIC_RELAXED);
  }
# define AO_HAVE_XSIZE_compare_and_swap

  AO_INLINE int
  AO_XSIZE_compare_and_swap_acquire(volatile XCTYPE *addr, XCTYPE old_val,
                                    XCTYPE new_val)
  {
    return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0,
                                            __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE);
  }
# define AO_HAVE_XSIZE_compare_and_swap_acquire

  AO_INLINE int
  AO_XSIZE_compare_and_swap_release(volatile XCTYPE *addr, XCTYPE old_val,
                                    XCTYPE new_val)
  {
    return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0,
                          __ATOMIC_RELEASE, __ATOMIC_RELAXED /* failure */);
  }
# define AO_HAVE_XSIZE_compare_and_swap_release

  AO_INLINE int
  AO_XSIZE_compare_and_swap_full(volatile XCTYPE *addr, XCTYPE old_val,
                                 XCTYPE new_val)
  {
    return (int)__atomic_compare_exchange_n(addr, &old_val, new_val, 0,
                          __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE /* failure */);
  }
# define AO_HAVE_XSIZE_compare_and_swap_full

# endif /* !AO_GENERALIZE_ASM_BOOL_CAS */

#endif /* AO_GCC_HAVE_XSIZE_SYNC_CAS */

/* ==== papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/generic.h ==== */
/*
 * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved.
 * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved.
 * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P.
 * Copyright (c) 2013-2017 Ivan Maidanski
 *
 * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED
 * OR IMPLIED. ANY USE IS AT YOUR OWN RISK.
* * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. * */ /* The following implementation assumes GCC 4.7 or later. */ /* For the details, see GNU Manual, chapter 6.52 (Built-in functions */ /* for memory model aware atomic operations). */ #define AO_GCC_ATOMIC_TEST_AND_SET #include "../test_and_set_t_is_char.h" #if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_1) \ || defined(AO_GCC_FORCE_HAVE_CAS) # define AO_GCC_HAVE_char_SYNC_CAS #endif #if (__SIZEOF_SHORT__ == 2 && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_2)) \ || defined(AO_GCC_FORCE_HAVE_CAS) # define AO_GCC_HAVE_short_SYNC_CAS #endif #if (__SIZEOF_INT__ == 4 && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4)) \ || (__SIZEOF_INT__ == 8 && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_8)) \ || defined(AO_GCC_FORCE_HAVE_CAS) # define AO_GCC_HAVE_int_SYNC_CAS #endif #if (__SIZEOF_SIZE_T__ == 4 && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4)) \ || (__SIZEOF_SIZE_T__ == 8 \ && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_8)) \ || defined(AO_GCC_FORCE_HAVE_CAS) # define AO_GCC_HAVE_SYNC_CAS #endif #undef AO_compiler_barrier #define AO_compiler_barrier() __atomic_signal_fence(__ATOMIC_SEQ_CST) #ifdef AO_UNIPROCESSOR /* If only a single processor (core) is used, AO_UNIPROCESSOR could */ /* be defined by the client to avoid unnecessary memory barrier. 
*/ AO_INLINE void AO_nop_full(void) { AO_compiler_barrier(); } # define AO_HAVE_nop_full #else AO_INLINE void AO_nop_read(void) { __atomic_thread_fence(__ATOMIC_ACQUIRE); } # define AO_HAVE_nop_read # ifndef AO_HAVE_nop_write AO_INLINE void AO_nop_write(void) { __atomic_thread_fence(__ATOMIC_RELEASE); } # define AO_HAVE_nop_write # endif AO_INLINE void AO_nop_full(void) { /* __sync_synchronize() could be used instead. */ __atomic_thread_fence(__ATOMIC_SEQ_CST); } # define AO_HAVE_nop_full #endif /* !AO_UNIPROCESSOR */ #include "generic-small.h" #ifndef AO_PREFER_GENERALIZED # include "generic-arithm.h" # define AO_CLEAR(addr) __atomic_clear(addr, __ATOMIC_RELEASE) # define AO_HAVE_CLEAR AO_INLINE AO_TS_VAL_t AO_test_and_set(volatile AO_TS_t *addr) { return (AO_TS_VAL_t)__atomic_test_and_set(addr, __ATOMIC_RELAXED); } # define AO_HAVE_test_and_set AO_INLINE AO_TS_VAL_t AO_test_and_set_acquire(volatile AO_TS_t *addr) { return (AO_TS_VAL_t)__atomic_test_and_set(addr, __ATOMIC_ACQUIRE); } # define AO_HAVE_test_and_set_acquire AO_INLINE AO_TS_VAL_t AO_test_and_set_release(volatile AO_TS_t *addr) { return (AO_TS_VAL_t)__atomic_test_and_set(addr, __ATOMIC_RELEASE); } # define AO_HAVE_test_and_set_release AO_INLINE AO_TS_VAL_t AO_test_and_set_full(volatile AO_TS_t *addr) { return (AO_TS_VAL_t)__atomic_test_and_set(addr, __ATOMIC_SEQ_CST); } # define AO_HAVE_test_and_set_full #endif /* !AO_PREFER_GENERALIZED */ #ifdef AO_HAVE_DOUBLE_PTR_STORAGE # if ((__SIZEOF_SIZE_T__ == 4 \ && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_8)) \ || (__SIZEOF_SIZE_T__ == 8 /* half of AO_double_t */ \ && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16))) \ && !defined(AO_SKIPATOMIC_double_compare_and_swap_ANY) # define AO_GCC_HAVE_double_SYNC_CAS # endif # if !defined(AO_GCC_HAVE_double_SYNC_CAS) || !defined(AO_PREFER_GENERALIZED) # if !defined(AO_HAVE_double_load) && !defined(AO_SKIPATOMIC_double_load) AO_INLINE AO_double_t AO_double_load(const volatile AO_double_t *addr) { AO_double_t result; 
result.AO_whole = __atomic_load_n(&addr->AO_whole, __ATOMIC_RELAXED); return result; } # define AO_HAVE_double_load # endif # if !defined(AO_HAVE_double_load_acquire) \ && !defined(AO_SKIPATOMIC_double_load_acquire) AO_INLINE AO_double_t AO_double_load_acquire(const volatile AO_double_t *addr) { AO_double_t result; result.AO_whole = __atomic_load_n(&addr->AO_whole, __ATOMIC_ACQUIRE); return result; } # define AO_HAVE_double_load_acquire # endif # if !defined(AO_HAVE_double_store) && !defined(AO_SKIPATOMIC_double_store) AO_INLINE void AO_double_store(volatile AO_double_t *addr, AO_double_t value) { __atomic_store_n(&addr->AO_whole, value.AO_whole, __ATOMIC_RELAXED); } # define AO_HAVE_double_store # endif # if !defined(AO_HAVE_double_store_release) \ && !defined(AO_SKIPATOMIC_double_store_release) AO_INLINE void AO_double_store_release(volatile AO_double_t *addr, AO_double_t value) { __atomic_store_n(&addr->AO_whole, value.AO_whole, __ATOMIC_RELEASE); } # define AO_HAVE_double_store_release # endif #endif /* !AO_GCC_HAVE_double_SYNC_CAS || !AO_PREFER_GENERALIZED */ #endif /* AO_HAVE_DOUBLE_PTR_STORAGE */ #ifdef AO_GCC_HAVE_double_SYNC_CAS # ifndef AO_HAVE_double_compare_and_swap AO_INLINE int AO_double_compare_and_swap(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { return (int)__atomic_compare_exchange_n(&addr->AO_whole, &old_val.AO_whole /* p_expected */, new_val.AO_whole /* desired */, 0 /* is_weak: false */, __ATOMIC_RELAXED /* success */, __ATOMIC_RELAXED /* failure */); } # define AO_HAVE_double_compare_and_swap # endif # ifndef AO_HAVE_double_compare_and_swap_acquire AO_INLINE int AO_double_compare_and_swap_acquire(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { return (int)__atomic_compare_exchange_n(&addr->AO_whole, &old_val.AO_whole, new_val.AO_whole, 0, __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE); } # define AO_HAVE_double_compare_and_swap_acquire # endif # ifndef AO_HAVE_double_compare_and_swap_release AO_INLINE 
int
AO_double_compare_and_swap_release(volatile AO_double_t *addr,
                                   AO_double_t old_val, AO_double_t new_val)
{
  return (int)__atomic_compare_exchange_n(&addr->AO_whole,
                          &old_val.AO_whole, new_val.AO_whole, 0,
                          __ATOMIC_RELEASE, __ATOMIC_RELAXED /* failure */);
}
# define AO_HAVE_double_compare_and_swap_release
# endif

# ifndef AO_HAVE_double_compare_and_swap_full
AO_INLINE int
AO_double_compare_and_swap_full(volatile AO_double_t *addr,
                                AO_double_t old_val, AO_double_t new_val)
{
  return (int)__atomic_compare_exchange_n(&addr->AO_whole,
                          &old_val.AO_whole, new_val.AO_whole, 0,
                          __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE /* failure */);
}
# define AO_HAVE_double_compare_and_swap_full
# endif

#endif /* AO_GCC_HAVE_double_SYNC_CAS */

/* ==== papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/hexagon.h ==== */
/*
 * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED
 * OR IMPLIED. ANY USE IS AT YOUR OWN RISK.
 *
 * Permission is hereby granted to use or copy this program
 * for any purpose, provided the above notices are retained on all copies.
 * Permission to modify the code and to distribute modified code is granted,
 * provided the above notices are retained, and a notice that the code was
 * modified is included with the above copyright notice.
 */

#if AO_CLANG_PREREQ(3, 9) && !defined(AO_DISABLE_GCC_ATOMICS)
  /* Probably, it could be enabled for earlier clang versions as well. */

  /* As of clang-3.9, __GCC_HAVE_SYNC_COMPARE_AND_SWAP_n are missing. */
# define AO_GCC_FORCE_HAVE_CAS

# define AO_GCC_HAVE_double_SYNC_CAS
# include "../standard_ao_double_t.h"

# include "generic.h"

#else /* AO_DISABLE_GCC_ATOMICS */

#include "../all_aligned_atomic_load_store.h"

#include "../test_and_set_t_is_ao_t.h"

/* There's also "isync" and "barrier"; however, for all current CPU */
/* versions, "syncht" should suffice.
Likewise, it seems that the */ /* auto-defined versions of *_acquire, *_release or *_full suffice for */ /* all current ISA implementations. */ AO_INLINE void AO_nop_full(void) { __asm__ __volatile__("syncht" : : : "memory"); } #define AO_HAVE_nop_full /* The Hexagon has load-locked, store-conditional primitives, and so */ /* resulting code is very nearly identical to that of PowerPC. */ #ifndef AO_PREFER_GENERALIZED AO_INLINE AO_t AO_fetch_and_add(volatile AO_t *addr, AO_t incr) { AO_t oldval; AO_t newval; __asm__ __volatile__( "1:\n" " %0 = memw_locked(%3);\n" /* load and reserve */ " %1 = add (%0,%4);\n" /* increment */ " memw_locked(%3,p1) = %1;\n" /* store conditional */ " if (!p1) jump 1b;\n" /* retry if lost reservation */ : "=&r"(oldval), "=&r"(newval), "+m"(*addr) : "r"(addr), "r"(incr) : "memory", "p1"); return oldval; } #define AO_HAVE_fetch_and_add AO_INLINE AO_TS_VAL_t AO_test_and_set(volatile AO_TS_t *addr) { int oldval; int locked_value = 1; __asm__ __volatile__( "1:\n" " %0 = memw_locked(%2);\n" /* load and reserve */ " {\n" " p2 = cmp.eq(%0,#0);\n" /* if load is not zero, */ " if (!p2.new) jump:nt 2f;\n" /* we are done */ " }\n" " memw_locked(%2,p1) = %3;\n" /* else store conditional */ " if (!p1) jump 1b;\n" /* retry if lost reservation */ "2:\n" /* oldval is zero if we set */ : "=&r"(oldval), "+m"(*addr) : "r"(addr), "r"(locked_value) : "memory", "p1", "p2"); return (AO_TS_VAL_t)oldval; } #define AO_HAVE_test_and_set #endif /* !AO_PREFER_GENERALIZED */ #ifndef AO_GENERALIZE_ASM_BOOL_CAS AO_INLINE int AO_compare_and_swap(volatile AO_t *addr, AO_t old, AO_t new_val) { AO_t __oldval; int result = 0; __asm__ __volatile__( "1:\n" " %0 = memw_locked(%3);\n" /* load and reserve */ " {\n" " p2 = cmp.eq(%0,%4);\n" /* if load is not equal to */ " if (!p2.new) jump:nt 2f;\n" /* old, fail */ " }\n" " memw_locked(%3,p1) = %5;\n" /* else store conditional */ " if (!p1) jump 1b;\n" /* retry if lost reservation */ " %1 = #1\n" /* success, result = 1 */ "2:\n" : 
"=&r" (__oldval), "+r" (result), "+m"(*addr) : "r" (addr), "r" (old), "r" (new_val) : "p1", "p2", "memory" ); return result; } # define AO_HAVE_compare_and_swap #endif /* !AO_GENERALIZE_ASM_BOOL_CAS */ AO_INLINE AO_t AO_fetch_compare_and_swap(volatile AO_t *addr, AO_t old_val, AO_t new_val) { AO_t __oldval; __asm__ __volatile__( "1:\n" " %0 = memw_locked(%2);\n" /* load and reserve */ " {\n" " p2 = cmp.eq(%0,%3);\n" /* if load is not equal to */ " if (!p2.new) jump:nt 2f;\n" /* old_val, fail */ " }\n" " memw_locked(%2,p1) = %4;\n" /* else store conditional */ " if (!p1) jump 1b;\n" /* retry if lost reservation */ "2:\n" : "=&r" (__oldval), "+m"(*addr) : "r" (addr), "r" (old_val), "r" (new_val) : "p1", "p2", "memory" ); return __oldval; } #define AO_HAVE_fetch_compare_and_swap #define AO_T_IS_INT #endif /* AO_DISABLE_GCC_ATOMICS */ #undef AO_GCC_FORCE_HAVE_CAS #undef AO_GCC_HAVE_double_SYNC_CAS papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/hppa.h000066400000000000000000000075751502707512200221310ustar00rootroot00000000000000/* * Copyright (c) 2003 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include "../all_atomic_load_store.h" /* Some architecture set descriptions include special "ordered" memory */ /* operations. As far as we can tell, no existing processors actually */ /* require those. Nor does it appear likely that future processors */ /* will. */ #include "../ordered.h" /* GCC will not guarantee the alignment we need, use four lock words */ /* and select the correctly aligned datum. See the glibc 2.3.2 */ /* linuxthread port for the original implementation. */ struct AO_pa_clearable_loc { int data[4]; }; #undef AO_TS_INITIALIZER #define AO_TS_t struct AO_pa_clearable_loc #define AO_TS_INITIALIZER { { 1, 1, 1, 1 } } /* Switch meaning of set and clear, since we only have an atomic clear */ /* instruction. */ typedef enum {AO_PA_TS_set = 0, AO_PA_TS_clear = 1} AO_PA_TS_val; #define AO_TS_VAL_t AO_PA_TS_val #define AO_TS_CLEAR AO_PA_TS_clear #define AO_TS_SET AO_PA_TS_set /* The hppa only has one atomic read and modify memory operation, */ /* load and clear, so hppa spinlocks must use zero to signify that */ /* someone is holding the lock. The address used for the ldcw */ /* semaphore must be 16-byte aligned. */ #define AO_ldcw(a, ret) \ __asm__ __volatile__("ldcw 0(%2), %0" \ : "=r" (ret), "=m" (*(a)) : "r" (a)) /* Because malloc only guarantees 8-byte alignment for malloc'd data, */ /* and GCC only guarantees 8-byte alignment for stack locals, we can't */ /* be assured of 16-byte alignment for atomic lock data even if we */ /* specify "__attribute ((aligned(16)))" in the type declaration. So, */ /* we use a struct containing an array of four ints for the atomic lock */ /* type and dynamically select the 16-byte aligned int from the array */ /* for the semaphore. 
*/ #define AO_PA_LDCW_ALIGNMENT 16 #define AO_ldcw_align(addr) \ ((volatile unsigned *)(((unsigned long)(addr) \ + (AO_PA_LDCW_ALIGNMENT - 1)) \ & ~(AO_PA_LDCW_ALIGNMENT - 1))) /* Works on PA 1.1 and PA 2.0 systems */ AO_INLINE AO_TS_VAL_t AO_test_and_set_full(volatile AO_TS_t * addr) { volatile unsigned int ret; volatile unsigned *a = AO_ldcw_align(addr); AO_ldcw(a, ret); return (AO_TS_VAL_t)ret; } #define AO_HAVE_test_and_set_full AO_INLINE void AO_pa_clear(volatile AO_TS_t * addr) { volatile unsigned *a = AO_ldcw_align(addr); AO_compiler_barrier(); *a = 1; } #define AO_CLEAR(addr) AO_pa_clear(addr) #define AO_HAVE_CLEAR #undef AO_PA_LDCW_ALIGNMENT #undef AO_ldcw #undef AO_ldcw_align papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/ia64.h000066400000000000000000000234661502707512200217410ustar00rootroot00000000000000/* * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include "../all_atomic_load_store.h" #include "../all_acquire_release_volatile.h" #include "../test_and_set_t_is_char.h" #ifdef _ILP32 /* 32-bit HP/UX code. */ /* This requires pointer "swizzling". Pointers need to be expanded */ /* to 64 bits using the addp4 instruction before use. This makes it */ /* hard to share code, but we try anyway. */ # define AO_LEN "4" /* We assume that addr always appears in argument position 1 in asm */ /* code. If it is clobbered due to swizzling, we also need it in */ /* second position. Any later arguments are referenced symbolically, */ /* so that we don't have to worry about their position. This requires*/ /* gcc 3.1, but you shouldn't be using anything older than that on */ /* IA64 anyway. */ /* The AO_MASK macro is a workaround for the fact that HP/UX gcc */ /* appears to otherwise store 64-bit pointers in ar.ccv, i.e. it */ /* doesn't appear to clear high bits in a pointer value we pass into */ /* assembly code, even if it is supposedly of type AO_t. 
*/ # define AO_IN_ADDR "1"(addr) # define AO_OUT_ADDR , "=r"(addr) # define AO_SWIZZLE "addp4 %1=0,%1;;\n" # define AO_MASK(ptr) __asm__ __volatile__("zxt4 %1=%1": "=r"(ptr) : "0"(ptr)) #else # define AO_LEN "8" # define AO_IN_ADDR "r"(addr) # define AO_OUT_ADDR # define AO_SWIZZLE # define AO_MASK(ptr) /* empty */ #endif /* !_ILP32 */ AO_INLINE void AO_nop_full(void) { __asm__ __volatile__("mf" : : : "memory"); } #define AO_HAVE_nop_full #ifndef AO_PREFER_GENERALIZED AO_INLINE AO_t AO_fetch_and_add1_acquire (volatile AO_t *addr) { AO_t result; __asm__ __volatile__ (AO_SWIZZLE "fetchadd" AO_LEN ".acq %0=[%1],1": "=r" (result) AO_OUT_ADDR: AO_IN_ADDR :"memory"); return result; } #define AO_HAVE_fetch_and_add1_acquire AO_INLINE AO_t AO_fetch_and_add1_release (volatile AO_t *addr) { AO_t result; __asm__ __volatile__ (AO_SWIZZLE "fetchadd" AO_LEN ".rel %0=[%1],1": "=r" (result) AO_OUT_ADDR: AO_IN_ADDR :"memory"); return result; } #define AO_HAVE_fetch_and_add1_release AO_INLINE AO_t AO_fetch_and_sub1_acquire (volatile AO_t *addr) { AO_t result; __asm__ __volatile__ (AO_SWIZZLE "fetchadd" AO_LEN ".acq %0=[%1],-1": "=r" (result) AO_OUT_ADDR: AO_IN_ADDR :"memory"); return result; } #define AO_HAVE_fetch_and_sub1_acquire AO_INLINE AO_t AO_fetch_and_sub1_release (volatile AO_t *addr) { AO_t result; __asm__ __volatile__ (AO_SWIZZLE "fetchadd" AO_LEN ".rel %0=[%1],-1": "=r" (result) AO_OUT_ADDR: AO_IN_ADDR :"memory"); return result; } #define AO_HAVE_fetch_and_sub1_release #endif /* !AO_PREFER_GENERALIZED */ AO_INLINE AO_t AO_fetch_compare_and_swap_acquire(volatile AO_t *addr, AO_t old, AO_t new_val) { AO_t fetched_val; AO_MASK(old); __asm__ __volatile__(AO_SWIZZLE "mov ar.ccv=%[old] ;; cmpxchg" AO_LEN ".acq %0=[%1],%[new_val],ar.ccv" : "=r"(fetched_val) AO_OUT_ADDR : AO_IN_ADDR, [new_val]"r"(new_val), [old]"r"(old) : "memory"); return fetched_val; } #define AO_HAVE_fetch_compare_and_swap_acquire AO_INLINE AO_t AO_fetch_compare_and_swap_release(volatile AO_t *addr, AO_t old, 
AO_t new_val) { AO_t fetched_val; AO_MASK(old); __asm__ __volatile__(AO_SWIZZLE "mov ar.ccv=%[old] ;; cmpxchg" AO_LEN ".rel %0=[%1],%[new_val],ar.ccv" : "=r"(fetched_val) AO_OUT_ADDR : AO_IN_ADDR, [new_val]"r"(new_val), [old]"r"(old) : "memory"); return fetched_val; } #define AO_HAVE_fetch_compare_and_swap_release AO_INLINE unsigned char AO_char_fetch_compare_and_swap_acquire(volatile unsigned char *addr, unsigned char old, unsigned char new_val) { unsigned char fetched_val; __asm__ __volatile__(AO_SWIZZLE "mov ar.ccv=%[old] ;; cmpxchg1.acq %0=[%1],%[new_val],ar.ccv" : "=r"(fetched_val) AO_OUT_ADDR : AO_IN_ADDR, [new_val]"r"(new_val), [old]"r"((AO_t)old) : "memory"); return fetched_val; } #define AO_HAVE_char_fetch_compare_and_swap_acquire AO_INLINE unsigned char AO_char_fetch_compare_and_swap_release(volatile unsigned char *addr, unsigned char old, unsigned char new_val) { unsigned char fetched_val; __asm__ __volatile__(AO_SWIZZLE "mov ar.ccv=%[old] ;; cmpxchg1.rel %0=[%1],%[new_val],ar.ccv" : "=r"(fetched_val) AO_OUT_ADDR : AO_IN_ADDR, [new_val]"r"(new_val), [old]"r"((AO_t)old) : "memory"); return fetched_val; } #define AO_HAVE_char_fetch_compare_and_swap_release AO_INLINE unsigned short AO_short_fetch_compare_and_swap_acquire(volatile unsigned short *addr, unsigned short old, unsigned short new_val) { unsigned short fetched_val; __asm__ __volatile__(AO_SWIZZLE "mov ar.ccv=%[old] ;; cmpxchg2.acq %0=[%1],%[new_val],ar.ccv" : "=r"(fetched_val) AO_OUT_ADDR : AO_IN_ADDR, [new_val]"r"(new_val), [old]"r"((AO_t)old) : "memory"); return fetched_val; } #define AO_HAVE_short_fetch_compare_and_swap_acquire AO_INLINE unsigned short AO_short_fetch_compare_and_swap_release(volatile unsigned short *addr, unsigned short old, unsigned short new_val) { unsigned short fetched_val; __asm__ __volatile__(AO_SWIZZLE "mov ar.ccv=%[old] ;; cmpxchg2.rel %0=[%1],%[new_val],ar.ccv" : "=r"(fetched_val) AO_OUT_ADDR : AO_IN_ADDR, [new_val]"r"(new_val), [old]"r"((AO_t)old) : "memory"); return 
fetched_val; } #define AO_HAVE_short_fetch_compare_and_swap_release #ifdef _ILP32 # define AO_T_IS_INT /* TODO: Add compare_double_and_swap_double for the _ILP32 case. */ #else # ifndef AO_PREFER_GENERALIZED AO_INLINE unsigned int AO_int_fetch_and_add1_acquire(volatile unsigned int *addr) { unsigned int result; __asm__ __volatile__("fetchadd4.acq %0=[%1],1" : "=r" (result) : AO_IN_ADDR : "memory"); return result; } # define AO_HAVE_int_fetch_and_add1_acquire AO_INLINE unsigned int AO_int_fetch_and_add1_release(volatile unsigned int *addr) { unsigned int result; __asm__ __volatile__("fetchadd4.rel %0=[%1],1" : "=r" (result) : AO_IN_ADDR : "memory"); return result; } # define AO_HAVE_int_fetch_and_add1_release AO_INLINE unsigned int AO_int_fetch_and_sub1_acquire(volatile unsigned int *addr) { unsigned int result; __asm__ __volatile__("fetchadd4.acq %0=[%1],-1" : "=r" (result) : AO_IN_ADDR : "memory"); return result; } # define AO_HAVE_int_fetch_and_sub1_acquire AO_INLINE unsigned int AO_int_fetch_and_sub1_release(volatile unsigned int *addr) { unsigned int result; __asm__ __volatile__("fetchadd4.rel %0=[%1],-1" : "=r" (result) : AO_IN_ADDR : "memory"); return result; } # define AO_HAVE_int_fetch_and_sub1_release # endif /* !AO_PREFER_GENERALIZED */ AO_INLINE unsigned int AO_int_fetch_compare_and_swap_acquire(volatile unsigned int *addr, unsigned int old, unsigned int new_val) { unsigned int fetched_val; __asm__ __volatile__("mov ar.ccv=%3 ;; cmpxchg4.acq %0=[%1],%2,ar.ccv" : "=r"(fetched_val) : AO_IN_ADDR, "r"(new_val), "r"((AO_t)old) : "memory"); return fetched_val; } # define AO_HAVE_int_fetch_compare_and_swap_acquire AO_INLINE unsigned int AO_int_fetch_compare_and_swap_release(volatile unsigned int *addr, unsigned int old, unsigned int new_val) { unsigned int fetched_val; __asm__ __volatile__("mov ar.ccv=%3 ;; cmpxchg4.rel %0=[%1],%2,ar.ccv" : "=r"(fetched_val) : AO_IN_ADDR, "r"(new_val), "r"((AO_t)old) : "memory"); return fetched_val; } # define 
AO_HAVE_int_fetch_compare_and_swap_release #endif /* !_ILP32 */ /* TODO: Add compare_and_swap_double as soon as there is widely */ /* available hardware that implements it. */ #undef AO_IN_ADDR #undef AO_LEN #undef AO_MASK #undef AO_OUT_ADDR #undef AO_SWIZZLE papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/m68k.h000066400000000000000000000042171502707512200217540ustar00rootroot00000000000000/* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 1999-2003 by Hewlett-Packard Company. All rights reserved. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. * */ /* The cas instruction causes an emulation trap for the */ /* 060 with a misaligned pointer, so let's avoid this. */ #undef AO_t typedef unsigned long AO_t __attribute__((__aligned__(4))); /* FIXME. Very incomplete. */ #include "../all_aligned_atomic_load_store.h" /* Are there any m68k multiprocessors still around? */ /* AFAIK, Alliants were sequentially consistent. */ #include "../ordered.h" #include "../test_and_set_t_is_char.h" AO_INLINE AO_TS_VAL_t AO_test_and_set_full(volatile AO_TS_t *addr) { AO_TS_t oldval; /* The value at addr is semi-phony. */ /* 'tas' sets bit 7 while the return */ /* value pretends all bits were set, */ /* which at least matches AO_TS_SET. */ __asm__ __volatile__( "tas %1; sne %0" : "=d" (oldval), "=m" (*addr) : "m" (*addr) : "memory"); /* This cast works due to the above. */ return (AO_TS_VAL_t)oldval; } #define AO_HAVE_test_and_set_full /* Returns nonzero if the comparison succeeded. 
*/ AO_INLINE int AO_compare_and_swap_full(volatile AO_t *addr, AO_t old, AO_t new_val) { char result; __asm__ __volatile__( "cas.l %3,%4,%1; seq %0" : "=d" (result), "=m" (*addr) : "m" (*addr), "d" (old), "d" (new_val) : "memory"); return -result; } #define AO_HAVE_compare_and_swap_full /* TODO: implement AO_fetch_compare_and_swap. */ #define AO_T_IS_INT papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/mips.h000066400000000000000000000131421502707512200221340ustar00rootroot00000000000000/* * Copyright (c) 2005,2007 Thiemo Seufer * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. */ /* * FIXME: This should probably make finer distinctions. SGI MIPS is * much more strongly ordered, and in fact closer to sequentially * consistent. This is really aimed at modern embedded implementations. */ /* Data dependence does not imply read ordering. */ #define AO_NO_DD_ORDERING /* #include "../standard_ao_double_t.h" */ /* TODO: Implement double-wide operations if available. */ #if (AO_GNUC_PREREQ(4, 9) || AO_CLANG_PREREQ(3, 5)) \ && !defined(AO_DISABLE_GCC_ATOMICS) /* Probably, it could be enabled even for earlier gcc/clang versions. */ /* As of clang-3.6/mips[64], __GCC_HAVE_SYNC_COMPARE_AND_SWAP_n missing. */ # if defined(__clang__) # define AO_GCC_FORCE_HAVE_CAS # endif # include "generic.h" #else /* AO_DISABLE_GCC_ATOMICS */ # include "../test_and_set_t_is_ao_t.h" # include "../all_aligned_atomic_load_store.h" # if !defined(_ABI64) || _MIPS_SIM != _ABI64 # define AO_T_IS_INT # if __mips_isa_rev >= 6 /* Encoding of ll/sc in mips rel6 differs from that of mips2/3. 
*/ # define AO_MIPS_SET_ISA "" # else # define AO_MIPS_SET_ISA " .set mips2\n" # endif # define AO_MIPS_LL_1(args) " ll " args "\n" # define AO_MIPS_SC(args) " sc " args "\n" # else # if __mips_isa_rev >= 6 # define AO_MIPS_SET_ISA "" # else # define AO_MIPS_SET_ISA " .set mips3\n" # endif # define AO_MIPS_LL_1(args) " lld " args "\n" # define AO_MIPS_SC(args) " scd " args "\n" # endif /* _MIPS_SIM == _ABI64 */ #ifdef AO_ICE9A1_LLSC_WAR /* ICE9 rev A1 chip (used in very few systems) is reported to */ /* have a low-frequency bug that causes LL to fail. */ /* To workaround, just issue the second 'LL'. */ # define AO_MIPS_LL(args) AO_MIPS_LL_1(args) AO_MIPS_LL_1(args) #else # define AO_MIPS_LL(args) AO_MIPS_LL_1(args) #endif AO_INLINE void AO_nop_full(void) { __asm__ __volatile__( " .set push\n" AO_MIPS_SET_ISA " .set noreorder\n" " .set nomacro\n" " sync\n" " .set pop" : : : "memory"); } #define AO_HAVE_nop_full #ifndef AO_PREFER_GENERALIZED AO_INLINE AO_t AO_fetch_and_add(volatile AO_t *addr, AO_t incr) { register int result; register int temp; __asm__ __volatile__( " .set push\n" AO_MIPS_SET_ISA " .set noreorder\n" " .set nomacro\n" "1: " AO_MIPS_LL("%0, %2") " addu %1, %0, %3\n" AO_MIPS_SC("%1, %2") " beqz %1, 1b\n" " nop\n" " .set pop" : "=&r" (result), "=&r" (temp), "+m" (*addr) : "Ir" (incr) : "memory"); return (AO_t)result; } #define AO_HAVE_fetch_and_add AO_INLINE AO_TS_VAL_t AO_test_and_set(volatile AO_TS_t *addr) { register int oldval; register int temp; __asm__ __volatile__( " .set push\n" AO_MIPS_SET_ISA " .set noreorder\n" " .set nomacro\n" "1: " AO_MIPS_LL("%0, %2") " move %1, %3\n" AO_MIPS_SC("%1, %2") " beqz %1, 1b\n" " nop\n" " .set pop" : "=&r" (oldval), "=&r" (temp), "+m" (*addr) : "r" (1) : "memory"); return (AO_TS_VAL_t)oldval; } #define AO_HAVE_test_and_set /* TODO: Implement AO_and/or/xor primitives directly. 
*/ #endif /* !AO_PREFER_GENERALIZED */ #ifndef AO_GENERALIZE_ASM_BOOL_CAS AO_INLINE int AO_compare_and_swap(volatile AO_t *addr, AO_t old, AO_t new_val) { register int was_equal = 0; register int temp; __asm__ __volatile__( " .set push\n" AO_MIPS_SET_ISA " .set noreorder\n" " .set nomacro\n" "1: " AO_MIPS_LL("%0, %1") " bne %0, %4, 2f\n" " move %0, %3\n" AO_MIPS_SC("%0, %1") " .set pop\n" " beqz %0, 1b\n" " li %2, 1\n" "2:" : "=&r" (temp), "+m" (*addr), "+r" (was_equal) : "r" (new_val), "r" (old) : "memory"); return was_equal; } # define AO_HAVE_compare_and_swap #endif /* !AO_GENERALIZE_ASM_BOOL_CAS */ AO_INLINE AO_t AO_fetch_compare_and_swap(volatile AO_t *addr, AO_t old, AO_t new_val) { register int fetched_val; register int temp; __asm__ __volatile__( " .set push\n" AO_MIPS_SET_ISA " .set noreorder\n" " .set nomacro\n" "1: " AO_MIPS_LL("%0, %2") " bne %0, %4, 2f\n" " move %1, %3\n" AO_MIPS_SC("%1, %2") " beqz %1, 1b\n" " nop\n" " .set pop\n" "2:" : "=&r" (fetched_val), "=&r" (temp), "+m" (*addr) : "r" (new_val), "Jr" (old) : "memory"); return (AO_t)fetched_val; } #define AO_HAVE_fetch_compare_and_swap #endif /* AO_DISABLE_GCC_ATOMICS */ /* CAS primitives with acquire, release and full semantics are */ /* generated automatically (and AO_int_... primitives are */ /* defined properly after the first generalization pass). */ #undef AO_GCC_FORCE_HAVE_CAS #undef AO_MIPS_LL #undef AO_MIPS_LL_1 #undef AO_MIPS_SC #undef AO_MIPS_SET_ISA papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/powerpc.h000066400000000000000000000256701502707512200226540ustar00rootroot00000000000000/* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. 
* * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. * */ /* Memory model documented at http://www-106.ibm.com/developerworks/ */ /* eserver/articles/archguide.html and (clearer) */ /* http://www-106.ibm.com/developerworks/eserver/articles/powerpc.html. */ /* There appears to be no implicit ordering between any kind of */ /* independent memory references. */ /* TODO: Implement double-wide operations if available. */ #if (AO_GNUC_PREREQ(4, 8) || AO_CLANG_PREREQ(3, 8)) \ && !defined(AO_DISABLE_GCC_ATOMICS) /* Probably, it could be enabled even for earlier gcc/clang versions. */ /* TODO: As of clang-3.8.1, it emits lwsync in AO_load_acquire */ /* (i.e., the code is less efficient than the one given below). */ # include "generic.h" #else /* AO_DISABLE_GCC_ATOMICS */ /* Architecture enforces some ordering based on control dependence. */ /* I don't know if that could help. */ /* Data-dependent loads are always ordered. */ /* Based on the above references, eieio is intended for use on */ /* uncached memory, which we don't support. It does not order loads */ /* from cached memory. */ #include "../all_aligned_atomic_load_store.h" #include "../test_and_set_t_is_ao_t.h" /* There seems to be no byte equivalent of lwarx, so this */ /* may really be what we want, at least in the 32-bit case. */ AO_INLINE void AO_nop_full(void) { __asm__ __volatile__("sync" : : : "memory"); } #define AO_HAVE_nop_full /* lwsync apparently works for everything but a StoreLoad barrier. 
*/ AO_INLINE void AO_lwsync(void) { #ifdef __NO_LWSYNC__ __asm__ __volatile__("sync" : : : "memory"); #else __asm__ __volatile__("lwsync" : : : "memory"); #endif } #define AO_nop_write() AO_lwsync() #define AO_HAVE_nop_write #define AO_nop_read() AO_lwsync() #define AO_HAVE_nop_read #if defined(__powerpc64__) || defined(__ppc64__) || defined(__64BIT__) /* ppc64 uses ld not lwz */ # define AO_PPC_LD "ld" # define AO_PPC_LxARX "ldarx" # define AO_PPC_CMPx "cmpd" # define AO_PPC_STxCXd "stdcx." # define AO_PPC_LOAD_CLOBBER "cr0" #else # define AO_PPC_LD "lwz" # define AO_PPC_LxARX "lwarx" # define AO_PPC_CMPx "cmpw" # define AO_PPC_STxCXd "stwcx." # define AO_PPC_LOAD_CLOBBER "cc" /* FIXME: We should get gcc to allocate one of the condition */ /* registers. I always got "impossible constraint" when I */ /* tried the "y" constraint. */ # define AO_T_IS_INT #endif #ifdef _AIX /* Labels are not supported on AIX. */ /* ppc64 has same size of instructions as 32-bit one. */ # define AO_PPC_L(label) /* empty */ # define AO_PPC_BR_A(labelBF, addr) addr #else # define AO_PPC_L(label) label ": " # define AO_PPC_BR_A(labelBF, addr) labelBF #endif /* We explicitly specify load_acquire, since it is important, and can */ /* be implemented relatively cheaply. It could be implemented */ /* with an ordinary load followed by a lwsync. But the general wisdom */ /* seems to be that a data dependent branch followed by an isync is */ /* cheaper. And the documentation is fairly explicit that this also */ /* has acquire semantics. */ AO_INLINE AO_t AO_load_acquire(const volatile AO_t *addr) { AO_t result; __asm__ __volatile__ ( AO_PPC_LD "%U1%X1 %0,%1\n" "cmpw %0,%0\n" "bne- " AO_PPC_BR_A("1f", "$+4") "\n" AO_PPC_L("1") "isync\n" : "=r" (result) : "m"(*addr) : "memory", AO_PPC_LOAD_CLOBBER); return result; } #define AO_HAVE_load_acquire /* We explicitly specify store_release, since it relies */ /* on the fact that lwsync is also a LoadStore barrier. 
*/ AO_INLINE void AO_store_release(volatile AO_t *addr, AO_t value) { AO_lwsync(); *addr = value; } #define AO_HAVE_store_release #ifndef AO_PREFER_GENERALIZED /* This is similar to the code in the garbage collector. Deleting */ /* this and having it synthesized from compare_and_swap would probably */ /* only cost us a load immediate instruction. */ AO_INLINE AO_TS_VAL_t AO_test_and_set(volatile AO_TS_t *addr) { /* TODO: And we should be using smaller objects anyway. */ AO_t oldval; AO_t temp = 1; /* locked value */ __asm__ __volatile__( AO_PPC_L("1") AO_PPC_LxARX " %0,0,%1\n" /* load and reserve */ AO_PPC_CMPx "i %0, 0\n" /* if load is */ "bne " AO_PPC_BR_A("2f", "$+12") "\n" /* non-zero, return already set */ AO_PPC_STxCXd " %2,0,%1\n" /* else store conditional */ "bne- " AO_PPC_BR_A("1b", "$-16") "\n" /* retry if lost reservation */ AO_PPC_L("2") "\n" /* oldval is zero if we set */ : "=&r"(oldval) : "r"(addr), "r"(temp) : "memory", "cr0"); return (AO_TS_VAL_t)oldval; } #define AO_HAVE_test_and_set AO_INLINE AO_TS_VAL_t AO_test_and_set_acquire(volatile AO_TS_t *addr) { AO_TS_VAL_t result = AO_test_and_set(addr); AO_lwsync(); return result; } #define AO_HAVE_test_and_set_acquire AO_INLINE AO_TS_VAL_t AO_test_and_set_release(volatile AO_TS_t *addr) { AO_lwsync(); return AO_test_and_set(addr); } #define AO_HAVE_test_and_set_release AO_INLINE AO_TS_VAL_t AO_test_and_set_full(volatile AO_TS_t *addr) { AO_TS_VAL_t result; AO_lwsync(); result = AO_test_and_set(addr); AO_lwsync(); return result; } #define AO_HAVE_test_and_set_full #endif /* !AO_PREFER_GENERALIZED */ #ifndef AO_GENERALIZE_ASM_BOOL_CAS AO_INLINE int AO_compare_and_swap(volatile AO_t *addr, AO_t old, AO_t new_val) { AO_t oldval; int result = 0; __asm__ __volatile__( AO_PPC_L("1") AO_PPC_LxARX " %0,0,%2\n" /* load and reserve */ AO_PPC_CMPx " %0, %4\n" /* if load is not equal to */ "bne " AO_PPC_BR_A("2f", "$+16") "\n" /* old, fail */ AO_PPC_STxCXd " %3,0,%2\n" /* else store conditional */ "bne- " 
AO_PPC_BR_A("1b", "$-16") "\n" /* retry if lost reservation */ "li %1,1\n" /* result = 1; */ AO_PPC_L("2") "\n" : "=&r"(oldval), "=&r"(result) : "r"(addr), "r"(new_val), "r"(old), "1"(result) : "memory", "cr0"); return result; } # define AO_HAVE_compare_and_swap AO_INLINE int AO_compare_and_swap_acquire(volatile AO_t *addr, AO_t old, AO_t new_val) { int result = AO_compare_and_swap(addr, old, new_val); AO_lwsync(); return result; } # define AO_HAVE_compare_and_swap_acquire AO_INLINE int AO_compare_and_swap_release(volatile AO_t *addr, AO_t old, AO_t new_val) { AO_lwsync(); return AO_compare_and_swap(addr, old, new_val); } # define AO_HAVE_compare_and_swap_release AO_INLINE int AO_compare_and_swap_full(volatile AO_t *addr, AO_t old, AO_t new_val) { int result; AO_lwsync(); result = AO_compare_and_swap(addr, old, new_val); if (result) AO_lwsync(); return result; } # define AO_HAVE_compare_and_swap_full #endif /* !AO_GENERALIZE_ASM_BOOL_CAS */ AO_INLINE AO_t AO_fetch_compare_and_swap(volatile AO_t *addr, AO_t old_val, AO_t new_val) { AO_t fetched_val; __asm__ __volatile__( AO_PPC_L("1") AO_PPC_LxARX " %0,0,%1\n" /* load and reserve */ AO_PPC_CMPx " %0, %3\n" /* if load is not equal to */ "bne " AO_PPC_BR_A("2f", "$+12") "\n" /* old_val, fail */ AO_PPC_STxCXd " %2,0,%1\n" /* else store conditional */ "bne- " AO_PPC_BR_A("1b", "$-16") "\n" /* retry if lost reservation */ AO_PPC_L("2") "\n" : "=&r"(fetched_val) : "r"(addr), "r"(new_val), "r"(old_val) : "memory", "cr0"); return fetched_val; } #define AO_HAVE_fetch_compare_and_swap AO_INLINE AO_t AO_fetch_compare_and_swap_acquire(volatile AO_t *addr, AO_t old_val, AO_t new_val) { AO_t result = AO_fetch_compare_and_swap(addr, old_val, new_val); AO_lwsync(); return result; } #define AO_HAVE_fetch_compare_and_swap_acquire AO_INLINE AO_t AO_fetch_compare_and_swap_release(volatile AO_t *addr, AO_t old_val, AO_t new_val) { AO_lwsync(); return AO_fetch_compare_and_swap(addr, old_val, new_val); } #define 
AO_HAVE_fetch_compare_and_swap_release AO_INLINE AO_t AO_fetch_compare_and_swap_full(volatile AO_t *addr, AO_t old_val, AO_t new_val) { AO_t result; AO_lwsync(); result = AO_fetch_compare_and_swap(addr, old_val, new_val); if (result == old_val) AO_lwsync(); return result; } #define AO_HAVE_fetch_compare_and_swap_full #ifndef AO_PREFER_GENERALIZED AO_INLINE AO_t AO_fetch_and_add(volatile AO_t *addr, AO_t incr) { AO_t oldval; AO_t newval; __asm__ __volatile__( AO_PPC_L("1") AO_PPC_LxARX " %0,0,%2\n" /* load and reserve */ "add %1,%0,%3\n" /* increment */ AO_PPC_STxCXd " %1,0,%2\n" /* store conditional */ "bne- " AO_PPC_BR_A("1b", "$-12") "\n" /* retry if lost reservation */ : "=&r"(oldval), "=&r"(newval) : "r"(addr), "r"(incr) : "memory", "cr0"); return oldval; } #define AO_HAVE_fetch_and_add AO_INLINE AO_t AO_fetch_and_add_acquire(volatile AO_t *addr, AO_t incr) { AO_t result = AO_fetch_and_add(addr, incr); AO_lwsync(); return result; } #define AO_HAVE_fetch_and_add_acquire AO_INLINE AO_t AO_fetch_and_add_release(volatile AO_t *addr, AO_t incr) { AO_lwsync(); return AO_fetch_and_add(addr, incr); } #define AO_HAVE_fetch_and_add_release AO_INLINE AO_t AO_fetch_and_add_full(volatile AO_t *addr, AO_t incr) { AO_t result; AO_lwsync(); result = AO_fetch_and_add(addr, incr); AO_lwsync(); return result; } #define AO_HAVE_fetch_and_add_full #endif /* !AO_PREFER_GENERALIZED */ #undef AO_PPC_BR_A #undef AO_PPC_CMPx #undef AO_PPC_L #undef AO_PPC_LD #undef AO_PPC_LOAD_CLOBBER #undef AO_PPC_LxARX #undef AO_PPC_STxCXd #endif /* AO_DISABLE_GCC_ATOMICS */ papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/riscv.h000066400000000000000000000022301502707512200223060ustar00rootroot00000000000000/* * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. 
 * Permission to modify the code and to distribute modified code is granted,
 * provided the above notices are retained, and a notice that the code was
 * modified is included with the above copyright notice.
 */

#if defined(__clang__) || defined(AO_PREFER_BUILTIN_ATOMICS)
  /* All __GCC_HAVE_SYNC_COMPARE_AND_SWAP_n macros are still missing. */
  /* The operations are lock-free even for the types smaller than word. */
# define AO_GCC_FORCE_HAVE_CAS
#else
  /* As of gcc-7.5, CAS and arithmetic atomic operations for char and */
  /* short are supported by the compiler but require -latomic flag. */
# if !defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_1)
#   define AO_NO_char_ARITHM
# endif
# if !defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_2)
#   define AO_NO_short_ARITHM
# endif
#endif /* !__clang__ */

#include "generic.h"

#undef AO_GCC_FORCE_HAVE_CAS
#undef AO_NO_char_ARITHM
#undef AO_NO_short_ARITHM
papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/s390.h000066400000000000000000000063371502707512200216700ustar00rootroot00000000000000/*
 * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved.
 * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved.
 * Copyright (c) 1999-2003 by Hewlett-Packard Company. All rights reserved.
 *
 *
 * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED
 * OR IMPLIED. ANY USE IS AT YOUR OWN RISK.
 *
 * Permission is hereby granted to use or copy this program
 * for any purpose, provided the above notices are retained on all copies.
 * Permission to modify the code and to distribute modified code is granted,
 * provided the above notices are retained, and a notice that the code was
 * modified is included with the above copyright notice.
 *
 */

#if (AO_GNUC_PREREQ(5, 4) || AO_CLANG_PREREQ(8, 0)) && defined(__s390x__) \
    && !defined(AO_DISABLE_GCC_ATOMICS)
  /* Probably, it could be enabled for earlier clang/gcc versions. */
  /* But, e.g., clang-3.8.0 produces a backend error for AtomicFence. */
# include "generic.h"

#else /* AO_DISABLE_GCC_ATOMICS */

/* The relevant documentation appears to be at */
/* http://publibz.boulder.ibm.com/epubs/pdf/dz9zr003.pdf */
/* around page 5-96. Apparently: */
/* - Memory references in general are atomic only for a single */
/*   byte. But it appears that the most common load/store */
/*   instructions also guarantee atomicity for aligned */
/*   operands of standard types. WE FOOLISHLY ASSUME that */
/*   compilers only generate those. If that turns out to be */
/*   wrong, we need inline assembly code for AO_load and */
/*   AO_store. */
/* - A store followed by a load is unordered since the store */
/*   may be delayed. Otherwise everything is ordered. */
/* - There is a hardware compare-and-swap (CS) instruction. */

#include "../all_aligned_atomic_load_store.h"
#include "../ordered_except_wr.h"
#include "../test_and_set_t_is_ao_t.h"

/* TODO: Is there a way to do byte-sized test-and-set? */

/* TODO: AO_nop_full should probably be implemented directly. */
/* It appears that certain BCR instructions have that effect. */
/* Presumably they're cheaper than CS? */

#ifndef AO_GENERALIZE_ASM_BOOL_CAS
AO_INLINE int
AO_compare_and_swap_full(volatile AO_t *addr, AO_t old, AO_t new_val)
{
  int retval;
  __asm__ __volatile__ (
# ifndef __s390x__
    " cs %1,%2,0(%3)\n"
# else
    " csg %1,%2,0(%3)\n"
# endif
    " ipm %0\n"
    " srl %0,28\n"
    : "=&d" (retval), "+d" (old)
    : "d" (new_val), "a" (addr)
    : "cc", "memory");
  return retval == 0;
}
#define AO_HAVE_compare_and_swap_full
#endif /* !AO_GENERALIZE_ASM_BOOL_CAS */

AO_INLINE AO_t
AO_fetch_compare_and_swap_full(volatile AO_t *addr, AO_t old, AO_t new_val)
{
  __asm__ __volatile__ (
# ifndef __s390x__
    " cs %0,%2,%1\n"
# else
    " csg %0,%2,%1\n"
# endif
    : "+d" (old), "=Q" (*addr)
    : "d" (new_val), "m" (*addr)
    : "cc", "memory");
  return old;
}
#define AO_HAVE_fetch_compare_and_swap_full

#endif /* AO_DISABLE_GCC_ATOMICS */

/* TODO: Add double-wide operations for 32-bit executables. */
papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/sh.h000066400000000000000000000017451502707512200216040ustar00rootroot00000000000000/*
 * Copyright (c) 2009 by Takashi YOSHII. All rights reserved.
 *
 *
 * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED
 * OR IMPLIED. ANY USE IS AT YOUR OWN RISK.
 *
 * Permission is hereby granted to use or copy this program
 * for any purpose, provided the above notices are retained on all copies.
 * Permission to modify the code and to distribute modified code is granted,
 * provided the above notices are retained, and a notice that the code was
 * modified is included with the above copyright notice.
 */

#include "../all_atomic_load_store.h"
#include "../ordered.h"

/* sh has tas.b(byte) only */
#include "../test_and_set_t_is_char.h"

AO_INLINE AO_TS_VAL_t
AO_test_and_set_full(volatile AO_TS_t *addr)
{
  int oldval;
  __asm__ __volatile__(
    "tas.b @%1; movt %0"
    : "=r" (oldval)
    : "r" (addr)
    : "t", "memory");
  return oldval? AO_TS_CLEAR : AO_TS_SET;
}
#define AO_HAVE_test_and_set_full

/* TODO: Very incomplete. */
papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/sparc.h000066400000000000000000000062701502707512200223000ustar00rootroot00000000000000/*
 * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved.
 * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved.
 * Copyright (c) 1999-2003 by Hewlett-Packard Company. All rights reserved.
 *
 *
 * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED
 * OR IMPLIED. ANY USE IS AT YOUR OWN RISK.
 *
 * Permission is hereby granted to use or copy this program
 * for any purpose, provided the above notices are retained on all copies.
 * Permission to modify the code and to distribute modified code is granted,
 * provided the above notices are retained, and a notice that the code was
 * modified is included with the above copyright notice.
 *
 */

/* TODO: Very incomplete; Add support for sparc64. */

/* Non-ancient SPARCs provide compare-and-swap (casa). */

#include "../all_atomic_load_store.h"

/* Real SPARC code uses TSO: */
#include "../ordered_except_wr.h"

/* Test_and_set location is just a byte. */
#include "../test_and_set_t_is_char.h"

AO_INLINE AO_TS_VAL_t
AO_test_and_set_full(volatile AO_TS_t *addr)
{
  AO_TS_VAL_t oldval;
  __asm__ __volatile__("ldstub %1,%0"
                       : "=r"(oldval), "=m"(*addr)
                       : "m"(*addr)
                       : "memory");
  return oldval;
}
#define AO_HAVE_test_and_set_full

#ifndef AO_NO_SPARC_V9
# ifndef AO_GENERALIZE_ASM_BOOL_CAS
  /* Returns nonzero if the comparison succeeded. */
  AO_INLINE int
  AO_compare_and_swap_full(volatile AO_t *addr, AO_t old, AO_t new_val)
  {
    AO_t ret;
    __asm__ __volatile__ ("membar #StoreLoad | #LoadLoad\n\t"
# if defined(__arch64__)
                          "casx [%2],%0,%1\n\t"
# else
                          "cas [%2],%0,%1\n\t" /* 32-bit version */
# endif
                          "membar #StoreLoad | #StoreStore\n\t"
                          "cmp %0,%1\n\t"
                          "be,a 0f\n\t"
                          "mov 1,%0\n\t"/* one insn after branch always executed */
                          "clr %0\n\t"
                          "0:\n\t"
                          : "=r" (ret), "+r" (new_val)
                          : "r" (addr), "0" (old)
                          : "memory", "cc");
    return (int)ret;
  }
# define AO_HAVE_compare_and_swap_full
# endif /* !AO_GENERALIZE_ASM_BOOL_CAS */

  AO_INLINE AO_t
  AO_fetch_compare_and_swap_full(volatile AO_t *addr, AO_t old, AO_t new_val)
  {
    __asm__ __volatile__ ("membar #StoreLoad | #LoadLoad\n\t"
# if defined(__arch64__)
                          "casx [%1],%2,%0\n\t"
# else
                          "cas [%1],%2,%0\n\t" /* 32-bit version */
# endif
                          "membar #StoreLoad | #StoreStore\n\t"
                          : "+r" (new_val)
                          : "r" (addr), "r" (old)
                          : "memory");
    return new_val;
  }
#define AO_HAVE_fetch_compare_and_swap_full
#endif /* !AO_NO_SPARC_V9 */

/* TODO: Extend this for SPARC v8 and v9 (V8 also has swap, V9 has CAS, */
/* there are barriers like membar #LoadStore, CASA (32-bit) and */
/* CASXA (64-bit) instructions added in V9). */
papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/tile.h000066400000000000000000000025101502707512200221260ustar00rootroot00000000000000/*
 * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED
 * OR IMPLIED. ANY USE IS AT YOUR OWN RISK.
 *
 * Permission is hereby granted to use or copy this program
 * for any purpose, provided the above notices are retained on all copies.
 * Permission to modify the code and to distribute modified code is granted,
 * provided the above notices are retained, and a notice that the code was
 * modified is included with the above copyright notice.
 */

/* Minimal support for tile. */

#if (AO_GNUC_PREREQ(4, 8) || AO_CLANG_PREREQ(3, 4)) \
    && !defined(AO_DISABLE_GCC_ATOMICS)

# include "generic.h"

#else /* AO_DISABLE_GCC_ATOMICS */

# include "../all_atomic_load_store.h"
# include "../test_and_set_t_is_ao_t.h"

  AO_INLINE void
  AO_nop_full(void)
  {
    __sync_synchronize();
  }
# define AO_HAVE_nop_full

  AO_INLINE AO_t
  AO_fetch_and_add_full(volatile AO_t *p, AO_t incr)
  {
    return __sync_fetch_and_add(p, incr);
  }
# define AO_HAVE_fetch_and_add_full

  AO_INLINE AO_t
  AO_fetch_compare_and_swap_full(volatile AO_t *addr, AO_t old_val,
                                 AO_t new_val)
  {
    return __sync_val_compare_and_swap(addr, old_val,
                                       new_val /* empty protection list */);
  }
# define AO_HAVE_fetch_compare_and_swap_full

#endif /* AO_DISABLE_GCC_ATOMICS */
papi-papi-7-2-0-t/src/atomic_ops/sysdeps/gcc/x86.h000066400000000000000000000617111502707512200216160ustar00rootroot00000000000000/*
 * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved.
 * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved.
 * Copyright (c) 1999-2003 by Hewlett-Packard Company. All rights reserved.
 * Copyright (c) 2008-2018 Ivan Maidanski
 *
 * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED
 * OR IMPLIED. ANY USE IS AT YOUR OWN RISK.
 *
 * Permission is hereby granted to use or copy this program
 * for any purpose, provided the above notices are retained on all copies.
 * Permission to modify the code and to distribute modified code is granted,
 * provided the above notices are retained, and a notice that the code was
 * modified is included with the above copyright notice.
* * Some of the machine specific code was borrowed from our GC distribution. */ #if (AO_GNUC_PREREQ(4, 8) || AO_CLANG_PREREQ(3, 4)) \ && !defined(__INTEL_COMPILER) /* TODO: test and enable icc */ \ && !defined(AO_DISABLE_GCC_ATOMICS) # define AO_GCC_ATOMIC_TEST_AND_SET # if defined(__APPLE_CC__) /* OS X 10.7 clang-425 lacks __GCC_HAVE_SYNC_COMPARE_AND_SWAP_n */ /* predefined macro (unlike e.g. OS X 10.11 clang-703). */ # define AO_GCC_FORCE_HAVE_CAS # ifdef __x86_64__ # if !AO_CLANG_PREREQ(9, 0) /* < Apple clang-900 */ /* Older Apple clang (e.g., clang-600 based on LLVM 3.5svn) had */ /* some bug in the double word CAS implementation for x64. */ # define AO_SKIPATOMIC_double_compare_and_swap_ANY # endif # elif defined(__MACH__) /* OS X 10.8 lacks __atomic_load/store symbols for arch i386 */ /* (even with a non-Apple clang). */ # ifndef MAC_OS_X_VERSION_MIN_REQUIRED /* Include this header just to import the version macro. */ # include # endif # if MAC_OS_X_VERSION_MIN_REQUIRED < 1090 /* MAC_OS_X_VERSION_10_9 */ # define AO_SKIPATOMIC_DOUBLE_LOAD_STORE_ANY # endif # endif /* __i386__ */ # elif defined(__clang__) # if !defined(__x86_64__) # if !defined(AO_PREFER_BUILTIN_ATOMICS) && !defined(__CYGWIN__) \ && !AO_CLANG_PREREQ(5, 0) /* At least clang-3.8/i686 (from NDK r11c) required to specify */ /* -latomic in case of a double-word atomic operation use. */ # define AO_SKIPATOMIC_double_compare_and_swap_ANY # define AO_SKIPATOMIC_DOUBLE_LOAD_STORE_ANY # endif /* !AO_PREFER_BUILTIN_ATOMICS */ # elif !defined(__ILP32__) # if (!AO_CLANG_PREREQ(3, 5) && !defined(AO_PREFER_BUILTIN_ATOMICS)) \ || (!AO_CLANG_PREREQ(4, 0) && defined(AO_ADDRESS_SANITIZER)) \ || defined(AO_THREAD_SANITIZER) /* clang-3.4/x64 required -latomic. clang-3.9/x64 seems to */ /* pass double-wide arguments to atomic operations incorrectly */ /* in case of ASan/TSan. */ /* TODO: As of clang-4.0, lock-free test_stack fails if TSan. 
*/ # define AO_SKIPATOMIC_double_compare_and_swap_ANY # define AO_SKIPATOMIC_DOUBLE_LOAD_STORE_ANY # endif # endif /* __x86_64__ */ # elif AO_GNUC_PREREQ(7, 0) && !defined(AO_PREFER_BUILTIN_ATOMICS) \ && !defined(AO_THREAD_SANITIZER) && !defined(__MINGW32__) /* gcc-7.x/x64 (gcc-7.2, at least) requires -latomic flag in case */ /* of double-word atomic operations use (but not in case of TSan). */ /* TODO: Revise it for the future gcc-7 releases. */ # define AO_SKIPATOMIC_double_compare_and_swap_ANY # define AO_SKIPATOMIC_DOUBLE_LOAD_STORE_ANY # endif /* __GNUC__ && !__clang__ */ # ifdef AO_SKIPATOMIC_DOUBLE_LOAD_STORE_ANY # define AO_SKIPATOMIC_double_load # define AO_SKIPATOMIC_double_load_acquire # define AO_SKIPATOMIC_double_store # define AO_SKIPATOMIC_double_store_release # undef AO_SKIPATOMIC_DOUBLE_LOAD_STORE_ANY # endif #else /* AO_DISABLE_GCC_ATOMICS */ /* The following really assume we have a 486 or better. Unfortunately */ /* gcc doesn't define a suitable feature test macro based on command */ /* line options. */ /* We should perhaps test dynamically. */ #include "../all_aligned_atomic_load_store.h" #include "../test_and_set_t_is_char.h" #if defined(__SSE2__) && !defined(AO_USE_PENTIUM4_INSTRS) /* "mfence" is a part of SSE2 set (introduced on Intel Pentium 4). */ # define AO_USE_PENTIUM4_INSTRS #endif #if defined(AO_USE_PENTIUM4_INSTRS) AO_INLINE void AO_nop_full(void) { __asm__ __volatile__("mfence" : : : "memory"); } # define AO_HAVE_nop_full #else /* We could use the cpuid instruction. But that seems to be slower */ /* than the default implementation based on test_and_set_full. Thus */ /* we omit that bit of misinformation here. */ #endif /* !AO_USE_PENTIUM4_INSTRS */ /* As far as we can tell, the lfence and sfence instructions are not */ /* currently needed or useful for cached memory accesses. 
*/ /* Really only works for 486 and later */ #ifndef AO_PREFER_GENERALIZED AO_INLINE AO_t AO_fetch_and_add_full (volatile AO_t *p, AO_t incr) { AO_t result; __asm__ __volatile__ ("lock; xadd %0, %1" : "=r" (result), "+m" (*p) : "0" (incr) : "memory"); return result; } # define AO_HAVE_fetch_and_add_full #endif /* !AO_PREFER_GENERALIZED */ AO_INLINE unsigned char AO_char_fetch_and_add_full (volatile unsigned char *p, unsigned char incr) { unsigned char result; __asm__ __volatile__ ("lock; xaddb %0, %1" : "=q" (result), "+m" (*p) : "0" (incr) : "memory"); return result; } #define AO_HAVE_char_fetch_and_add_full AO_INLINE unsigned short AO_short_fetch_and_add_full (volatile unsigned short *p, unsigned short incr) { unsigned short result; __asm__ __volatile__ ("lock; xaddw %0, %1" : "=r" (result), "+m" (*p) : "0" (incr) : "memory"); return result; } #define AO_HAVE_short_fetch_and_add_full #ifndef AO_PREFER_GENERALIZED AO_INLINE void AO_and_full (volatile AO_t *p, AO_t value) { __asm__ __volatile__ ("lock; and %1, %0" : "+m" (*p) : "r" (value) : "memory"); } # define AO_HAVE_and_full AO_INLINE void AO_or_full (volatile AO_t *p, AO_t value) { __asm__ __volatile__ ("lock; or %1, %0" : "+m" (*p) : "r" (value) : "memory"); } # define AO_HAVE_or_full AO_INLINE void AO_xor_full (volatile AO_t *p, AO_t value) { __asm__ __volatile__ ("lock; xor %1, %0" : "+m" (*p) : "r" (value) : "memory"); } # define AO_HAVE_xor_full /* AO_store_full could be implemented directly using "xchg" but it */ /* could be generalized efficiently as an ordinary store accomplished */ /* with AO_nop_full ("mfence" instruction). 
*/ AO_INLINE void AO_char_and_full (volatile unsigned char *p, unsigned char value) { __asm__ __volatile__ ("lock; andb %1, %0" : "+m" (*p) : "r" (value) : "memory"); } #define AO_HAVE_char_and_full AO_INLINE void AO_char_or_full (volatile unsigned char *p, unsigned char value) { __asm__ __volatile__ ("lock; orb %1, %0" : "+m" (*p) : "r" (value) : "memory"); } #define AO_HAVE_char_or_full AO_INLINE void AO_char_xor_full (volatile unsigned char *p, unsigned char value) { __asm__ __volatile__ ("lock; xorb %1, %0" : "+m" (*p) : "r" (value) : "memory"); } #define AO_HAVE_char_xor_full AO_INLINE void AO_short_and_full (volatile unsigned short *p, unsigned short value) { __asm__ __volatile__ ("lock; andw %1, %0" : "+m" (*p) : "r" (value) : "memory"); } #define AO_HAVE_short_and_full AO_INLINE void AO_short_or_full (volatile unsigned short *p, unsigned short value) { __asm__ __volatile__ ("lock; orw %1, %0" : "+m" (*p) : "r" (value) : "memory"); } #define AO_HAVE_short_or_full AO_INLINE void AO_short_xor_full (volatile unsigned short *p, unsigned short value) { __asm__ __volatile__ ("lock; xorw %1, %0" : "+m" (*p) : "r" (value) : "memory"); } #define AO_HAVE_short_xor_full #endif /* !AO_PREFER_GENERALIZED */ AO_INLINE AO_TS_VAL_t AO_test_and_set_full(volatile AO_TS_t *addr) { unsigned char oldval; /* Note: the "xchg" instruction does not need a "lock" prefix */ __asm__ __volatile__ ("xchgb %0, %1" : "=q" (oldval), "+m" (*addr) : "0" ((unsigned char)0xff) : "memory"); return (AO_TS_VAL_t)oldval; } #define AO_HAVE_test_and_set_full #ifndef AO_GENERALIZE_ASM_BOOL_CAS /* Returns nonzero if the comparison succeeded. */ AO_INLINE int AO_compare_and_swap_full(volatile AO_t *addr, AO_t old, AO_t new_val) { # ifdef AO_USE_SYNC_CAS_BUILTIN return (int)__sync_bool_compare_and_swap(addr, old, new_val /* empty protection list */); /* Note: an empty list of variables protected by the */ /* memory barrier should mean all globally accessible */ /* variables are protected. 
*/ # else char result; # if defined(__GCC_ASM_FLAG_OUTPUTS__) AO_t dummy; __asm__ __volatile__ ("lock; cmpxchg %3, %0" : "+m" (*addr), "=@ccz" (result), "=a" (dummy) : "r" (new_val), "a" (old) : "memory"); # else __asm__ __volatile__ ("lock; cmpxchg %2, %0; setz %1" : "+m" (*addr), "=a" (result) : "r" (new_val), "a" (old) : "memory"); # endif return (int)result; # endif } # define AO_HAVE_compare_and_swap_full #endif /* !AO_GENERALIZE_ASM_BOOL_CAS */ AO_INLINE AO_t AO_fetch_compare_and_swap_full(volatile AO_t *addr, AO_t old_val, AO_t new_val) { # ifdef AO_USE_SYNC_CAS_BUILTIN return __sync_val_compare_and_swap(addr, old_val, new_val /* empty protection list */); # else AO_t fetched_val; __asm__ __volatile__ ("lock; cmpxchg %3, %1" : "=a" (fetched_val), "+m" (*addr) : "a" (old_val), "r" (new_val) : "memory"); return fetched_val; # endif } #define AO_HAVE_fetch_compare_and_swap_full AO_INLINE unsigned char AO_char_fetch_compare_and_swap_full(volatile unsigned char *addr, unsigned char old_val, unsigned char new_val) { # ifdef AO_USE_SYNC_CAS_BUILTIN return __sync_val_compare_and_swap(addr, old_val, new_val /* empty protection list */); # else unsigned char fetched_val; __asm__ __volatile__ ("lock; cmpxchgb %3, %1" : "=a" (fetched_val), "+m" (*addr) : "a" (old_val), "q" (new_val) : "memory"); return fetched_val; # endif } # define AO_HAVE_char_fetch_compare_and_swap_full AO_INLINE unsigned short AO_short_fetch_compare_and_swap_full(volatile unsigned short *addr, unsigned short old_val, unsigned short new_val) { # ifdef AO_USE_SYNC_CAS_BUILTIN return __sync_val_compare_and_swap(addr, old_val, new_val /* empty protection list */); # else unsigned short fetched_val; __asm__ __volatile__ ("lock; cmpxchgw %3, %1" : "=a" (fetched_val), "+m" (*addr) : "a" (old_val), "r" (new_val) : "memory"); return fetched_val; # endif } # define AO_HAVE_short_fetch_compare_and_swap_full # if defined(__x86_64__) && !defined(__ILP32__) AO_INLINE unsigned int 
AO_int_fetch_compare_and_swap_full(volatile unsigned int *addr, unsigned int old_val, unsigned int new_val) { # ifdef AO_USE_SYNC_CAS_BUILTIN return __sync_val_compare_and_swap(addr, old_val, new_val /* empty protection list */); # else unsigned int fetched_val; __asm__ __volatile__ ("lock; cmpxchgl %3, %1" : "=a" (fetched_val), "+m" (*addr) : "a" (old_val), "r" (new_val) : "memory"); return fetched_val; # endif } # define AO_HAVE_int_fetch_compare_and_swap_full # ifndef AO_PREFER_GENERALIZED AO_INLINE unsigned int AO_int_fetch_and_add_full (volatile unsigned int *p, unsigned int incr) { unsigned int result; __asm__ __volatile__ ("lock; xaddl %0, %1" : "=r" (result), "+m" (*p) : "0" (incr) : "memory"); return result; } # define AO_HAVE_int_fetch_and_add_full AO_INLINE void AO_int_and_full (volatile unsigned int *p, unsigned int value) { __asm__ __volatile__ ("lock; andl %1, %0" : "+m" (*p) : "r" (value) : "memory"); } # define AO_HAVE_int_and_full AO_INLINE void AO_int_or_full (volatile unsigned int *p, unsigned int value) { __asm__ __volatile__ ("lock; orl %1, %0" : "+m" (*p) : "r" (value) : "memory"); } # define AO_HAVE_int_or_full AO_INLINE void AO_int_xor_full (volatile unsigned int *p, unsigned int value) { __asm__ __volatile__ ("lock; xorl %1, %0" : "+m" (*p) : "r" (value) : "memory"); } # define AO_HAVE_int_xor_full # endif /* !AO_PREFER_GENERALIZED */ # else # define AO_T_IS_INT # endif /* !x86_64 || ILP32 */ /* Real X86 implementations, except for some old 32-bit WinChips, */ /* appear to enforce ordering between memory operations, EXCEPT that */ /* a later read can pass earlier writes, presumably due to the */ /* visible presence of store buffers. */ /* We ignore both the WinChips and the fact that the official specs */ /* seem to be much weaker (and arguably too weak to be usable). 
*/ # include "../ordered_except_wr.h" #endif /* AO_DISABLE_GCC_ATOMICS */ #if defined(AO_GCC_ATOMIC_TEST_AND_SET) \ && !defined(AO_SKIPATOMIC_double_compare_and_swap_ANY) # if defined(__ILP32__) || !defined(__x86_64__) /* 32-bit AO_t */ \ || defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16) /* 64-bit AO_t */ # include "../standard_ao_double_t.h" # endif #elif !defined(__x86_64__) && (!defined(AO_USE_SYNC_CAS_BUILTIN) \ || defined(AO_GCC_ATOMIC_TEST_AND_SET)) # include "../standard_ao_double_t.h" /* Reading or writing a quadword aligned on a 64-bit boundary is */ /* always carried out atomically on at least a Pentium according to */ /* Chapter 8.1.1 of Volume 3A Part 1 of Intel processor manuals. */ # ifndef AO_PREFER_GENERALIZED # define AO_ACCESS_double_CHECK_ALIGNED # include "../loadstore/double_atomic_load_store.h" # endif /* Returns nonzero if the comparison succeeded. */ /* Really requires at least a Pentium. */ AO_INLINE int AO_compare_double_and_swap_double_full(volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2) { char result; # if defined(__PIC__) && !(AO_GNUC_PREREQ(5, 1) || AO_CLANG_PREREQ(4, 0)) AO_t saved_ebx; AO_t dummy; /* The following applies to an ancient GCC (and, probably, it was */ /* never needed for Clang): */ /* If PIC is turned on, we cannot use ebx as it is reserved for the */ /* GOT pointer. We should save and restore ebx. The proposed */ /* solution is not so efficient as the older alternatives using */ /* push ebx or edi as new_val1 (w/o clobbering edi and temporary */ /* local variable usage) but it is more portable (it works even if */ /* ebx is not used as GOT pointer, and it works for the buggy GCC */ /* releases that incorrectly evaluate memory operands offset in the */ /* inline assembly after push). 
*/ # ifdef __OPTIMIZE__ __asm__ __volatile__("mov %%ebx, %2\n\t" /* save ebx */ "lea %0, %%edi\n\t" /* in case addr is in ebx */ "mov %7, %%ebx\n\t" /* load new_val1 */ "lock; cmpxchg8b (%%edi)\n\t" "mov %2, %%ebx\n\t" /* restore ebx */ "setz %1" : "+m" (*addr), "=a" (result), "=m" (saved_ebx), "=d" (dummy) : "d" (old_val2), "a" (old_val1), "c" (new_val2), "m" (new_val1) : "%edi", "memory"); # else /* A less-efficient code manually preserving edi if GCC invoked */ /* with -O0 option (otherwise it fails while finding a register */ /* in class 'GENERAL_REGS'). */ AO_t saved_edi; __asm__ __volatile__("mov %%edi, %3\n\t" /* save edi */ "mov %%ebx, %2\n\t" /* save ebx */ "lea %0, %%edi\n\t" /* in case addr is in ebx */ "mov %8, %%ebx\n\t" /* load new_val1 */ "lock; cmpxchg8b (%%edi)\n\t" "mov %2, %%ebx\n\t" /* restore ebx */ "mov %3, %%edi\n\t" /* restore edi */ "setz %1" : "+m" (*addr), "=a" (result), "=m" (saved_ebx), "=m" (saved_edi), "=d" (dummy) : "d" (old_val2), "a" (old_val1), "c" (new_val2), "m" (new_val1) : "memory"); # endif # else /* For non-PIC mode, this operation could be simplified (and be */ /* faster) by using ebx as new_val1. Reuse of the PIC hard */ /* register, instead of using a fixed register, is implemented */ /* in Clang and GCC 5.1+, at least. (Older GCC refused to compile */ /* such code for PIC mode). 
*/ # if defined(__GCC_ASM_FLAG_OUTPUTS__) __asm__ __volatile__ ("lock; cmpxchg8b %0" : "+m" (*addr), "=@ccz" (result), "+d" (old_val2), "+a" (old_val1) : "c" (new_val2), "b" (new_val1) : "memory"); # else AO_t dummy; /* an output for clobbered edx */ __asm__ __volatile__ ("lock; cmpxchg8b %0; setz %1" : "+m" (*addr), "=a" (result), "=d" (dummy) : "d" (old_val2), "a" (old_val1), "c" (new_val2), "b" (new_val1) : "memory"); # endif # endif return (int) result; } # define AO_HAVE_compare_double_and_swap_double_full #elif defined(__ILP32__) || !defined(__x86_64__) # include "../standard_ao_double_t.h" /* Reading or writing a quadword aligned on a 64-bit boundary is */ /* always carried out atomically (requires at least a Pentium). */ # ifndef AO_PREFER_GENERALIZED # define AO_ACCESS_double_CHECK_ALIGNED # include "../loadstore/double_atomic_load_store.h" # endif /* X32 has native support for 64-bit integer operations (AO_double_t */ /* is a 64-bit integer and we could use 64-bit cmpxchg). */ /* This primitive is used by compare_double_and_swap_double_full. */ AO_INLINE int AO_double_compare_and_swap_full(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { /* It is safe to use __sync CAS built-in here. 
*/ return __sync_bool_compare_and_swap(&addr->AO_whole, old_val.AO_whole, new_val.AO_whole /* empty protection list */); } # define AO_HAVE_double_compare_and_swap_full #elif defined(AO_CMPXCHG16B_AVAILABLE) \ || (defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16) \ && !defined(AO_THREAD_SANITIZER)) # include "../standard_ao_double_t.h" /* The Intel and AMD Architecture Programmer Manuals state roughly */ /* the following: */ /* - CMPXCHG16B (with a LOCK prefix) can be used to perform 16-byte */ /* atomic accesses in 64-bit mode (with certain alignment */ /* restrictions); */ /* - SSE instructions that access data larger than a quadword (like */ /* MOVDQA) may be implemented using multiple memory accesses; */ /* - LOCK prefix causes an invalid-opcode exception when used with */ /* 128-bit media (SSE) instructions. */ /* Thus, currently, the only way to implement lock-free double_load */ /* and double_store on x86_64 is to use CMPXCHG16B (if available). */ /* NEC LE-IT: older AMD Opterons are missing this instruction. */ /* On these machines SIGILL will be thrown. */ /* Define AO_WEAK_DOUBLE_CAS_EMULATION to have an emulated (lock */ /* based) version available. */ /* HB: Changed this to not define either by default. There are */ /* enough machines and tool chains around on which cmpxchg16b */ /* doesn't work. And the emulation is unsafe by our usual rules. */ /* However both are clearly useful in certain cases. 
*/ AO_INLINE int AO_compare_double_and_swap_double_full(volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2) { char result; # if defined(__GCC_ASM_FLAG_OUTPUTS__) __asm__ __volatile__("lock; cmpxchg16b %0" : "+m" (*addr), "=@ccz" (result), "+d" (old_val2), "+a" (old_val1) : "c" (new_val2), "b" (new_val1) : "memory"); # else AO_t dummy; /* an output for clobbered rdx */ __asm__ __volatile__("lock; cmpxchg16b %0; setz %1" : "+m" (*addr), "=a" (result), "=d" (dummy) : "d" (old_val2), "a" (old_val1), "c" (new_val2), "b" (new_val1) : "memory"); # endif return (int) result; } # define AO_HAVE_compare_double_and_swap_double_full #elif defined(AO_WEAK_DOUBLE_CAS_EMULATION) # include "../standard_ao_double_t.h" # ifdef __cplusplus extern "C" { # endif /* This one provides spinlock based emulation of CAS implemented in */ /* atomic_ops.c. We probably do not want to do this here, since it */ /* is not atomic with respect to other kinds of updates of *addr. */ /* On the other hand, this may be a useful facility on occasion. 
*/ int AO_compare_double_and_swap_double_emulation( volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2); # ifdef __cplusplus } /* extern "C" */ # endif AO_INLINE int AO_compare_double_and_swap_double_full(volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2) { return AO_compare_double_and_swap_double_emulation(addr, old_val1, old_val2, new_val1, new_val2); } # define AO_HAVE_compare_double_and_swap_double_full #endif /* x86_64 && !ILP32 && CAS_EMULATION && !AO_CMPXCHG16B_AVAILABLE */ #ifdef AO_GCC_ATOMIC_TEST_AND_SET # include "generic.h" #endif #undef AO_GCC_FORCE_HAVE_CAS #undef AO_SKIPATOMIC_double_compare_and_swap_ANY #undef AO_SKIPATOMIC_double_load #undef AO_SKIPATOMIC_double_load_acquire #undef AO_SKIPATOMIC_double_store #undef AO_SKIPATOMIC_double_store_release papi-papi-7-2-0-t/src/atomic_ops/sysdeps/generic_pthread.h000066400000000000000000000261151502707512200235570ustar00rootroot00000000000000/* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* The following is useful primarily for debugging and documentation. */ /* We define various atomic operations by acquiring a global pthread */ /* lock. The resulting implementation will perform poorly, but should */ /* be correct unless it is used from signal handlers. */ /* We assume that all pthread operations act like full memory barriers. */ /* (We believe that is the intent of the specification.) */ #include #include "test_and_set_t_is_ao_t.h" /* This is not necessarily compatible with the native */ /* implementation. But those can't be safely mixed anyway. */ #ifdef __cplusplus extern "C" { #endif /* We define only the full barrier variants, and count on the */ /* generalization section below to fill in the rest. */ AO_API pthread_mutex_t AO_pt_lock; #ifdef __cplusplus } /* extern "C" */ #endif AO_INLINE void AO_nop_full(void) { pthread_mutex_lock(&AO_pt_lock); pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_nop_full AO_INLINE AO_t AO_load_full(const volatile AO_t *addr) { AO_t result; pthread_mutex_lock(&AO_pt_lock); result = *addr; pthread_mutex_unlock(&AO_pt_lock); return result; } #define AO_HAVE_load_full AO_INLINE void AO_store_full(volatile AO_t *addr, AO_t val) { pthread_mutex_lock(&AO_pt_lock); *addr = val; pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_store_full AO_INLINE unsigned char AO_char_load_full(const volatile unsigned char *addr) { unsigned char result; pthread_mutex_lock(&AO_pt_lock); result = *addr; pthread_mutex_unlock(&AO_pt_lock); return result; } #define AO_HAVE_char_load_full AO_INLINE void AO_char_store_full(volatile unsigned char *addr, unsigned char val) { pthread_mutex_lock(&AO_pt_lock); *addr = val; pthread_mutex_unlock(&AO_pt_lock); } #define 
AO_HAVE_char_store_full AO_INLINE unsigned short AO_short_load_full(const volatile unsigned short *addr) { unsigned short result; pthread_mutex_lock(&AO_pt_lock); result = *addr; pthread_mutex_unlock(&AO_pt_lock); return result; } #define AO_HAVE_short_load_full AO_INLINE void AO_short_store_full(volatile unsigned short *addr, unsigned short val) { pthread_mutex_lock(&AO_pt_lock); *addr = val; pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_short_store_full AO_INLINE unsigned int AO_int_load_full(const volatile unsigned int *addr) { unsigned int result; pthread_mutex_lock(&AO_pt_lock); result = *addr; pthread_mutex_unlock(&AO_pt_lock); return result; } #define AO_HAVE_int_load_full AO_INLINE void AO_int_store_full(volatile unsigned int *addr, unsigned int val) { pthread_mutex_lock(&AO_pt_lock); *addr = val; pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_int_store_full AO_INLINE AO_TS_VAL_t AO_test_and_set_full(volatile AO_TS_t *addr) { AO_TS_VAL_t result; pthread_mutex_lock(&AO_pt_lock); result = (AO_TS_VAL_t)(*addr); *addr = AO_TS_SET; pthread_mutex_unlock(&AO_pt_lock); assert(result == AO_TS_SET || result == AO_TS_CLEAR); return result; } #define AO_HAVE_test_and_set_full AO_INLINE AO_t AO_fetch_and_add_full(volatile AO_t *p, AO_t incr) { AO_t old_val; pthread_mutex_lock(&AO_pt_lock); old_val = *p; *p = old_val + incr; pthread_mutex_unlock(&AO_pt_lock); return old_val; } #define AO_HAVE_fetch_and_add_full AO_INLINE unsigned char AO_char_fetch_and_add_full(volatile unsigned char *p, unsigned char incr) { unsigned char old_val; pthread_mutex_lock(&AO_pt_lock); old_val = *p; *p = old_val + incr; pthread_mutex_unlock(&AO_pt_lock); return old_val; } #define AO_HAVE_char_fetch_and_add_full AO_INLINE unsigned short AO_short_fetch_and_add_full(volatile unsigned short *p, unsigned short incr) { unsigned short old_val; pthread_mutex_lock(&AO_pt_lock); old_val = *p; *p = old_val + incr; pthread_mutex_unlock(&AO_pt_lock); return old_val; } #define 
AO_HAVE_short_fetch_and_add_full AO_INLINE unsigned int AO_int_fetch_and_add_full(volatile unsigned int *p, unsigned int incr) { unsigned int old_val; pthread_mutex_lock(&AO_pt_lock); old_val = *p; *p = old_val + incr; pthread_mutex_unlock(&AO_pt_lock); return old_val; } #define AO_HAVE_int_fetch_and_add_full AO_INLINE void AO_and_full(volatile AO_t *p, AO_t value) { pthread_mutex_lock(&AO_pt_lock); *p &= value; pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_and_full AO_INLINE void AO_or_full(volatile AO_t *p, AO_t value) { pthread_mutex_lock(&AO_pt_lock); *p |= value; pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_or_full AO_INLINE void AO_xor_full(volatile AO_t *p, AO_t value) { pthread_mutex_lock(&AO_pt_lock); *p ^= value; pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_xor_full AO_INLINE void AO_char_and_full(volatile unsigned char *p, unsigned char value) { pthread_mutex_lock(&AO_pt_lock); *p &= value; pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_char_and_full AO_INLINE void AO_char_or_full(volatile unsigned char *p, unsigned char value) { pthread_mutex_lock(&AO_pt_lock); *p |= value; pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_char_or_full AO_INLINE void AO_char_xor_full(volatile unsigned char *p, unsigned char value) { pthread_mutex_lock(&AO_pt_lock); *p ^= value; pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_char_xor_full AO_INLINE void AO_short_and_full(volatile unsigned short *p, unsigned short value) { pthread_mutex_lock(&AO_pt_lock); *p &= value; pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_short_and_full AO_INLINE void AO_short_or_full(volatile unsigned short *p, unsigned short value) { pthread_mutex_lock(&AO_pt_lock); *p |= value; pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_short_or_full AO_INLINE void AO_short_xor_full(volatile unsigned short *p, unsigned short value) { pthread_mutex_lock(&AO_pt_lock); *p ^= value; pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_short_xor_full AO_INLINE 
void AO_int_and_full(volatile unsigned *p, unsigned value) { pthread_mutex_lock(&AO_pt_lock); *p &= value; pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_int_and_full AO_INLINE void AO_int_or_full(volatile unsigned *p, unsigned value) { pthread_mutex_lock(&AO_pt_lock); *p |= value; pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_int_or_full AO_INLINE void AO_int_xor_full(volatile unsigned *p, unsigned value) { pthread_mutex_lock(&AO_pt_lock); *p ^= value; pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_int_xor_full AO_INLINE AO_t AO_fetch_compare_and_swap_full(volatile AO_t *addr, AO_t old_val, AO_t new_val) { AO_t fetched_val; pthread_mutex_lock(&AO_pt_lock); fetched_val = *addr; if (fetched_val == old_val) *addr = new_val; pthread_mutex_unlock(&AO_pt_lock); return fetched_val; } #define AO_HAVE_fetch_compare_and_swap_full AO_INLINE unsigned char AO_char_fetch_compare_and_swap_full(volatile unsigned char *addr, unsigned char old_val, unsigned char new_val) { unsigned char fetched_val; pthread_mutex_lock(&AO_pt_lock); fetched_val = *addr; if (fetched_val == old_val) *addr = new_val; pthread_mutex_unlock(&AO_pt_lock); return fetched_val; } #define AO_HAVE_char_fetch_compare_and_swap_full AO_INLINE unsigned short AO_short_fetch_compare_and_swap_full(volatile unsigned short *addr, unsigned short old_val, unsigned short new_val) { unsigned short fetched_val; pthread_mutex_lock(&AO_pt_lock); fetched_val = *addr; if (fetched_val == old_val) *addr = new_val; pthread_mutex_unlock(&AO_pt_lock); return fetched_val; } #define AO_HAVE_short_fetch_compare_and_swap_full AO_INLINE unsigned AO_int_fetch_compare_and_swap_full(volatile unsigned *addr, unsigned old_val, unsigned new_val) { unsigned fetched_val; pthread_mutex_lock(&AO_pt_lock); fetched_val = *addr; if (fetched_val == old_val) *addr = new_val; pthread_mutex_unlock(&AO_pt_lock); return fetched_val; } #define AO_HAVE_int_fetch_compare_and_swap_full /* Unlike real architectures, we define both double-width 
CAS variants. */ typedef struct { AO_t AO_val1; AO_t AO_val2; } AO_double_t; #define AO_HAVE_double_t #define AO_DOUBLE_T_INITIALIZER { (AO_t)0, (AO_t)0 } AO_INLINE AO_double_t AO_double_load_full(const volatile AO_double_t *addr) { AO_double_t result; pthread_mutex_lock(&AO_pt_lock); result.AO_val1 = addr->AO_val1; result.AO_val2 = addr->AO_val2; pthread_mutex_unlock(&AO_pt_lock); return result; } #define AO_HAVE_double_load_full AO_INLINE void AO_double_store_full(volatile AO_double_t *addr, AO_double_t value) { pthread_mutex_lock(&AO_pt_lock); addr->AO_val1 = value.AO_val1; addr->AO_val2 = value.AO_val2; pthread_mutex_unlock(&AO_pt_lock); } #define AO_HAVE_double_store_full AO_INLINE int AO_compare_double_and_swap_double_full(volatile AO_double_t *addr, AO_t old1, AO_t old2, AO_t new1, AO_t new2) { pthread_mutex_lock(&AO_pt_lock); if (addr -> AO_val1 == old1 && addr -> AO_val2 == old2) { addr -> AO_val1 = new1; addr -> AO_val2 = new2; pthread_mutex_unlock(&AO_pt_lock); return 1; } else pthread_mutex_unlock(&AO_pt_lock); return 0; } #define AO_HAVE_compare_double_and_swap_double_full AO_INLINE int AO_compare_and_swap_double_full(volatile AO_double_t *addr, AO_t old1, AO_t new1, AO_t new2) { pthread_mutex_lock(&AO_pt_lock); if (addr -> AO_val1 == old1) { addr -> AO_val1 = new1; addr -> AO_val2 = new2; pthread_mutex_unlock(&AO_pt_lock); return 1; } else pthread_mutex_unlock(&AO_pt_lock); return 0; } #define AO_HAVE_compare_and_swap_double_full /* We can't use hardware loads and stores, since they don't */ /* interact correctly with atomic updates. */ papi-papi-7-2-0-t/src/atomic_ops/sysdeps/hpc/000077500000000000000000000000001502707512200210305ustar00rootroot00000000000000papi-papi-7-2-0-t/src/atomic_ops/sysdeps/hpc/hppa.h000066400000000000000000000104211502707512200221270ustar00rootroot00000000000000/* * Copyright (c) 2003 Hewlett-Packard Development Company, L.P. 
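Every mutex-guarded operation above follows the same pattern: take one global pthread lock, perform the plain memory operation, release the lock. A minimal stand-alone sketch of that fallback pattern (the names here are illustrative, not the actual AO_* API):

```c
#include <pthread.h>

/* One global lock serializes every "atomic" operation, mirroring
   AO_pt_lock in the header above. */
static pthread_mutex_t ao_lock = PTHREAD_MUTEX_INITIALIZER;

unsigned long fetch_compare_and_swap(volatile unsigned long *addr,
                                     unsigned long old_val,
                                     unsigned long new_val)
{
    unsigned long fetched;
    pthread_mutex_lock(&ao_lock);
    fetched = *addr;
    if (fetched == old_val)
        *addr = new_val;          /* swap only when the old value matches */
    pthread_mutex_unlock(&ao_lock);
    return fetched;               /* caller compares result against old_val */
}
```

The caller knows the swap happened exactly when the returned value equals `old_val`; link with `-lpthread`.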
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 * Derived from the corresponding header file for gcc.
 */

#include "../loadstore/atomic_load.h"
#include "../loadstore/atomic_store.h"

/* Some architecture set descriptions include special "ordered" memory */
/* operations.  As far as we can tell, no existing processors actually */
/* require those.  Nor does it appear likely that future processors    */
/* will.                                                                */
/* FIXME: The PA emulator on Itanium may obey weaker restrictions.      */
/* There should be a mode in which we don't assume sequential           */
/* consistency here.                                                    */
#include "../ordered.h"

#include <machine/inline.h>

/* GCC will not guarantee the alignment we need, use four lock words   */
/* and select the correctly aligned datum. See the glibc 2.3.2         */
/* linuxthread port for the original implementation.
*/ struct AO_pa_clearable_loc { int data[4]; }; #undef AO_TS_INITIALIZER #define AO_TS_t struct AO_pa_clearable_loc #define AO_TS_INITIALIZER {1,1,1,1} /* Switch meaning of set and clear, since we only have an atomic clear */ /* instruction. */ typedef enum {AO_PA_TS_set = 0, AO_PA_TS_clear = 1} AO_PA_TS_val; #define AO_TS_VAL_t AO_PA_TS_val #define AO_TS_CLEAR AO_PA_TS_clear #define AO_TS_SET AO_PA_TS_set /* The hppa only has one atomic read and modify memory operation, */ /* load and clear, so hppa spinlocks must use zero to signify that */ /* someone is holding the lock. The address used for the ldcw */ /* semaphore must be 16-byte aligned. */ #define AO_ldcw(a, ret) \ _LDCWX(0 /* index */, 0 /* s */, a /* base */, ret) /* Because malloc only guarantees 8-byte alignment for malloc'd data, */ /* and GCC only guarantees 8-byte alignment for stack locals, we can't */ /* be assured of 16-byte alignment for atomic lock data even if we */ /* specify "__attribute ((aligned(16)))" in the type declaration. So, */ /* we use a struct containing an array of four ints for the atomic lock */ /* type and dynamically select the 16-byte aligned int from the array */ /* for the semaphore. 
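The four-word workaround just described can be sketched in isolation: round the address of a four-int array up to the next 16-byte boundary, which always lands inside the array because the array spans 16 bytes and is itself at least int-aligned. A hypothetical illustration (not the AO_ macros themselves):

```c
#include <stdint.h>

/* A 16-byte-spanning array guarantees that at least one of its int
   elements sits on a 16-byte boundary, wherever the struct starts. */
struct pa_lock { int data[4]; };

static volatile int *aligned_slot(volatile struct pa_lock *p)
{
    uintptr_t a = (uintptr_t)p->data;
    /* round up to the next multiple of 16, as AO_ldcw_align does */
    return (volatile int *)((a + 15) & ~(uintptr_t)15);
}
```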
*/
#define AO_PA_LDCW_ALIGNMENT 16
#define AO_ldcw_align(addr) \
            ((volatile unsigned *)(((unsigned long)(addr) \
                                    + (AO_PA_LDCW_ALIGNMENT - 1)) \
                                   & ~(AO_PA_LDCW_ALIGNMENT - 1)))

/* Works on PA 1.1 and PA 2.0 systems */
AO_INLINE AO_TS_VAL_t
AO_test_and_set_full(volatile AO_TS_t * addr)
{
  register unsigned int ret;
  register unsigned long a = (unsigned long)AO_ldcw_align(addr);

# if defined(CPPCHECK)
    ret = 0; /* to avoid 'uninitialized variable' warning */
# endif
  AO_ldcw(a, ret);
  return (AO_TS_VAL_t)ret;
}
#define AO_HAVE_test_and_set_full

AO_INLINE void
AO_pa_clear(volatile AO_TS_t * addr)
{
  volatile unsigned *a = AO_ldcw_align(addr);

  AO_compiler_barrier();
  *a = 1;
}
#define AO_CLEAR(addr) AO_pa_clear(addr)
#define AO_HAVE_CLEAR

#undef AO_PA_LDCW_ALIGNMENT
#undef AO_ldcw
#undef AO_ldcw_align

papi-papi-7-2-0-t/src/atomic_ops/sysdeps/hpc/ia64.h

/*
 * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

/*
 * This file specifies Itanium primitives for use with the HP compiler
 * under HP/UX.  We use intrinsics instead of the inline assembly code in the
 * gcc file.
 */

#include "../all_atomic_load_store.h"
#include "../all_acquire_release_volatile.h"
#include "../test_and_set_t_is_char.h"

#include <machine/sys/inline.h>

#ifdef __LP64__
# define AO_T_FASIZE _FASZ_D
# define AO_T_SIZE _SZ_D
#else
# define AO_T_FASIZE _FASZ_W
# define AO_T_SIZE _SZ_W
#endif

AO_INLINE void
AO_nop_full(void)
{
  _Asm_mf();
}
#define AO_HAVE_nop_full

#ifndef AO_PREFER_GENERALIZED
AO_INLINE AO_t
AO_fetch_and_add1_acquire (volatile AO_t *p)
{
  return _Asm_fetchadd(AO_T_FASIZE, _SEM_ACQ, p, 1,
                       _LDHINT_NONE, _DOWN_MEM_FENCE);
}
#define AO_HAVE_fetch_and_add1_acquire

AO_INLINE AO_t
AO_fetch_and_add1_release (volatile AO_t *p)
{
  return _Asm_fetchadd(AO_T_FASIZE, _SEM_REL, p, 1,
                       _LDHINT_NONE, _UP_MEM_FENCE);
}
#define AO_HAVE_fetch_and_add1_release

AO_INLINE AO_t
AO_fetch_and_sub1_acquire (volatile AO_t *p)
{
  return _Asm_fetchadd(AO_T_FASIZE, _SEM_ACQ, p, -1,
                       _LDHINT_NONE, _DOWN_MEM_FENCE);
}
#define AO_HAVE_fetch_and_sub1_acquire

AO_INLINE AO_t
AO_fetch_and_sub1_release (volatile AO_t *p)
{
  return _Asm_fetchadd(AO_T_FASIZE, _SEM_REL, p, -1,
                       _LDHINT_NONE, _UP_MEM_FENCE);
}
#define AO_HAVE_fetch_and_sub1_release
#endif /* !AO_PREFER_GENERALIZED */

AO_INLINE AO_t
AO_fetch_compare_and_swap_acquire(volatile AO_t *addr, AO_t old_val,
                                  AO_t new_val)
{
  _Asm_mov_to_ar(_AREG_CCV, old_val, _DOWN_MEM_FENCE);
  return _Asm_cmpxchg(AO_T_SIZE, _SEM_ACQ, addr, new_val,
                      _LDHINT_NONE, _DOWN_MEM_FENCE);
}
#define AO_HAVE_fetch_compare_and_swap_acquire

AO_INLINE AO_t
AO_fetch_compare_and_swap_release(volatile AO_t *addr, AO_t old_val,
                                  AO_t new_val)
{
  _Asm_mov_to_ar(_AREG_CCV,
old_val, _UP_MEM_FENCE); return _Asm_cmpxchg(AO_T_SIZE, _SEM_REL, addr, new_val, _LDHINT_NONE, _UP_MEM_FENCE); } #define AO_HAVE_fetch_compare_and_swap_release AO_INLINE unsigned char AO_char_fetch_compare_and_swap_acquire(volatile unsigned char *addr, unsigned char old_val, unsigned char new_val) { _Asm_mov_to_ar(_AREG_CCV, old_val, _DOWN_MEM_FENCE); return _Asm_cmpxchg(_SZ_B, _SEM_ACQ, addr, new_val, _LDHINT_NONE, _DOWN_MEM_FENCE); } #define AO_HAVE_char_fetch_compare_and_swap_acquire AO_INLINE unsigned char AO_char_fetch_compare_and_swap_release(volatile unsigned char *addr, unsigned char old_val, unsigned char new_val) { _Asm_mov_to_ar(_AREG_CCV, old_val, _UP_MEM_FENCE); return _Asm_cmpxchg(_SZ_B, _SEM_REL, addr, new_val, _LDHINT_NONE, _UP_MEM_FENCE); } #define AO_HAVE_char_fetch_compare_and_swap_release AO_INLINE unsigned short AO_short_fetch_compare_and_swap_acquire(volatile unsigned short *addr, unsigned short old_val, unsigned short new_val) { _Asm_mov_to_ar(_AREG_CCV, old_val, _DOWN_MEM_FENCE); return _Asm_cmpxchg(_SZ_B, _SEM_ACQ, addr, new_val, _LDHINT_NONE, _DOWN_MEM_FENCE); } #define AO_HAVE_short_fetch_compare_and_swap_acquire AO_INLINE unsigned short AO_short_fetch_compare_and_swap_release(volatile unsigned short *addr, unsigned short old_val, unsigned short new_val) { _Asm_mov_to_ar(_AREG_CCV, old_val, _UP_MEM_FENCE); return _Asm_cmpxchg(_SZ_B, _SEM_REL, addr, new_val, _LDHINT_NONE, _UP_MEM_FENCE); } #define AO_HAVE_short_fetch_compare_and_swap_release #ifndef __LP64__ # define AO_T_IS_INT #endif #undef AO_T_FASIZE #undef AO_T_SIZE papi-papi-7-2-0-t/src/atomic_ops/sysdeps/ibmc/000077500000000000000000000000001502707512200211705ustar00rootroot00000000000000papi-papi-7-2-0-t/src/atomic_ops/sysdeps/ibmc/powerpc.h000066400000000000000000000142471502707512200230300ustar00rootroot00000000000000 /* Memory model documented at http://www-106.ibm.com/developerworks/ */ /* eserver/articles/archguide.html and (clearer) */ /* 
http://www-106.ibm.com/developerworks/eserver/articles/powerpc.html. */ /* There appears to be no implicit ordering between any kind of */ /* independent memory references. */ /* Architecture enforces some ordering based on control dependence. */ /* I don't know if that could help. */ /* Data-dependent loads are always ordered. */ /* Based on the above references, eieio is intended for use on */ /* uncached memory, which we don't support. It does not order loads */ /* from cached memory. */ /* Thanks to Maged Michael, Doug Lea, and Roger Hoover for helping to */ /* track some of this down and correcting my misunderstandings. -HB */ #include "../all_aligned_atomic_load_store.h" #include "../test_and_set_t_is_ao_t.h" void AO_sync(void); #pragma mc_func AO_sync { "7c0004ac" } #ifdef __NO_LWSYNC__ # define AO_lwsync AO_sync #else void AO_lwsync(void); #pragma mc_func AO_lwsync { "7c2004ac" } #endif #define AO_nop_write() AO_lwsync() #define AO_HAVE_nop_write #define AO_nop_read() AO_lwsync() #define AO_HAVE_nop_read /* We explicitly specify load_acquire and store_release, since these */ /* rely on the fact that lwsync is also a LoadStore barrier. */ AO_INLINE AO_t AO_load_acquire(const volatile AO_t *addr) { AO_t result = *addr; AO_lwsync(); return result; } #define AO_HAVE_load_acquire AO_INLINE void AO_store_release(volatile AO_t *addr, AO_t value) { AO_lwsync(); *addr = value; } #define AO_HAVE_store_release #ifndef AO_PREFER_GENERALIZED /* This is similar to the code in the garbage collector. Deleting */ /* this and having it synthesized from compare_and_swap would probably */ /* only cost us a load immediate instruction. */ AO_INLINE AO_TS_VAL_t AO_test_and_set(volatile AO_TS_t *addr) { #if defined(__powerpc64__) || defined(__ppc64__) || defined(__64BIT__) /* Completely untested. And we should be using smaller objects anyway. 
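The `lwsync` barriers around the loads and stores above are what give `AO_load_acquire` and `AO_store_release` their ordering guarantees. The same producer/consumer pairing can be expressed portably with C11 atomics; this sketch is an illustration of the pattern, not part of the AO_ API:

```c
#include <stdatomic.h>

static int payload;         /* ordinary, non-atomic data */
static atomic_int ready;    /* flag published with release semantics */

void producer(void)
{
    payload = 42;           /* must become visible before the flag */
    atomic_store_explicit(&ready, 1, memory_order_release);
}

int consumer(void)
{
    /* acquire load: reads after it cannot be hoisted above it */
    if (atomic_load_explicit(&ready, memory_order_acquire))
        return payload;     /* guaranteed to observe 42 */
    return -1;
}
```

On PowerPC the release store compiles to roughly `lwsync; store`, and the acquire load to `load; lwsync`, matching the hand-written barriers in the header.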
*/ unsigned long oldval; unsigned long temp = 1; /* locked value */ __asm__ __volatile__( "1:ldarx %0,0,%1\n" /* load and reserve */ "cmpdi %0, 0\n" /* if load is */ "bne 2f\n" /* non-zero, return already set */ "stdcx. %2,0,%1\n" /* else store conditional */ "bne- 1b\n" /* retry if lost reservation */ "2:\n" /* oldval is zero if we set */ : "=&r"(oldval) : "r"(addr), "r"(temp) : "memory", "cr0"); #else int oldval; int temp = 1; /* locked value */ __asm__ __volatile__( "1:lwarx %0,0,%1\n" /* load and reserve */ "cmpwi %0, 0\n" /* if load is */ "bne 2f\n" /* non-zero, return already set */ "stwcx. %2,0,%1\n" /* else store conditional */ "bne- 1b\n" /* retry if lost reservation */ "2:\n" /* oldval is zero if we set */ : "=&r"(oldval) : "r"(addr), "r"(temp) : "memory", "cr0"); #endif return (AO_TS_VAL_t)oldval; } #define AO_HAVE_test_and_set AO_INLINE AO_TS_VAL_t AO_test_and_set_acquire(volatile AO_TS_t *addr) { AO_TS_VAL_t result = AO_test_and_set(addr); AO_lwsync(); return result; } #define AO_HAVE_test_and_set_acquire AO_INLINE AO_TS_VAL_t AO_test_and_set_release(volatile AO_TS_t *addr) { AO_lwsync(); return AO_test_and_set(addr); } #define AO_HAVE_test_and_set_release AO_INLINE AO_TS_VAL_t AO_test_and_set_full(volatile AO_TS_t *addr) { AO_TS_VAL_t result; AO_lwsync(); result = AO_test_and_set(addr); AO_lwsync(); return result; } #define AO_HAVE_test_and_set_full #endif /* !AO_PREFER_GENERALIZED */ AO_INLINE AO_t AO_fetch_compare_and_swap(volatile AO_t *addr, AO_t old_val, AO_t new_val) { AO_t fetched_val; # if defined(__powerpc64__) || defined(__ppc64__) || defined(__64BIT__) __asm__ __volatile__( "1:ldarx %0,0,%1\n" /* load and reserve */ "cmpd %0, %3\n" /* if load is not equal to */ "bne 2f\n" /* old_val, fail */ "stdcx. 
%2,0,%1\n" /* else store conditional */ "bne- 1b\n" /* retry if lost reservation */ "2:\n" : "=&r"(fetched_val) : "r"(addr), "r"(new_val), "r"(old_val) : "memory", "cr0"); # else __asm__ __volatile__( "1:lwarx %0,0,%1\n" /* load and reserve */ "cmpw %0, %3\n" /* if load is not equal to */ "bne 2f\n" /* old_val, fail */ "stwcx. %2,0,%1\n" /* else store conditional */ "bne- 1b\n" /* retry if lost reservation */ "2:\n" : "=&r"(fetched_val) : "r"(addr), "r"(new_val), "r"(old_val) : "memory", "cr0"); # endif return fetched_val; } #define AO_HAVE_fetch_compare_and_swap AO_INLINE AO_t AO_fetch_compare_and_swap_acquire(volatile AO_t *addr, AO_t old_val, AO_t new_val) { AO_t result = AO_fetch_compare_and_swap(addr, old_val, new_val); AO_lwsync(); return result; } #define AO_HAVE_fetch_compare_and_swap_acquire AO_INLINE AO_t AO_fetch_compare_and_swap_release(volatile AO_t *addr, AO_t old_val, AO_t new_val) { AO_lwsync(); return AO_fetch_compare_and_swap(addr, old_val, new_val); } #define AO_HAVE_fetch_compare_and_swap_release AO_INLINE AO_t AO_fetch_compare_and_swap_full(volatile AO_t *addr, AO_t old_val, AO_t new_val) { AO_t result; AO_lwsync(); result = AO_fetch_compare_and_swap(addr, old_val, new_val); AO_lwsync(); return result; } #define AO_HAVE_fetch_compare_and_swap_full /* TODO: Implement AO_fetch_and_add, AO_and/or/xor directly. */ papi-papi-7-2-0-t/src/atomic_ops/sysdeps/icc/000077500000000000000000000000001502707512200210145ustar00rootroot00000000000000papi-papi-7-2-0-t/src/atomic_ops/sysdeps/icc/ia64.h000066400000000000000000000144611502707512200217360ustar00rootroot00000000000000/* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. 
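The closing TODO asks for `AO_fetch_and_add` and the bitwise operations to be implemented directly; the usual fallback is to synthesize them from compare-and-swap in a retry loop. A sketch using C11 atomics rather than the AO_* primitives (the function name is illustrative):

```c
#include <stdatomic.h>

unsigned long fetch_and_add(atomic_ulong *p, unsigned long incr)
{
    unsigned long old = atomic_load_explicit(p, memory_order_relaxed);
    /* retry until no other thread changed *p between load and CAS;
       on failure 'old' is refreshed with the value currently stored */
    while (!atomic_compare_exchange_weak(p, &old, old + incr))
        ;
    return old;             /* value before the addition */
}
```

On LL/SC machines like PowerPC this loop maps naturally onto `lwarx`/`stwcx.`, which is why the weak (spuriously failing) CAS variant is the right one to use here.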
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

/*
 * This file specifies Itanium primitives for use with the Intel (ecc)
 * compiler.  We use intrinsics instead of the inline assembly code in the
 * gcc file.
 */

#include "../all_atomic_load_store.h"
#include "../test_and_set_t_is_char.h"

#include <ia64intrin.h>

/* The acquire/release semantics of volatile can be turned off, and volatile */
/* operations in icc9 don't imply ordering with respect to other nonvolatile */
/* operations.
*/ #define AO_INTEL_PTR_t void * AO_INLINE AO_t AO_load_acquire(const volatile AO_t *p) { return (AO_t)(__ld8_acq((AO_INTEL_PTR_t)p)); } #define AO_HAVE_load_acquire AO_INLINE void AO_store_release(volatile AO_t *p, AO_t val) { __st8_rel((AO_INTEL_PTR_t)p, (__int64)val); } #define AO_HAVE_store_release AO_INLINE unsigned char AO_char_load_acquire(const volatile unsigned char *p) { /* A normal volatile load generates an ld.acq */ return (__ld1_acq((AO_INTEL_PTR_t)p)); } #define AO_HAVE_char_load_acquire AO_INLINE void AO_char_store_release(volatile unsigned char *p, unsigned char val) { __st1_rel((AO_INTEL_PTR_t)p, val); } #define AO_HAVE_char_store_release AO_INLINE unsigned short AO_short_load_acquire(const volatile unsigned short *p) { /* A normal volatile load generates an ld.acq */ return (__ld2_acq((AO_INTEL_PTR_t)p)); } #define AO_HAVE_short_load_acquire AO_INLINE void AO_short_store_release(volatile unsigned short *p, unsigned short val) { __st2_rel((AO_INTEL_PTR_t)p, val); } #define AO_HAVE_short_store_release AO_INLINE unsigned int AO_int_load_acquire(const volatile unsigned int *p) { /* A normal volatile load generates an ld.acq */ return (__ld4_acq((AO_INTEL_PTR_t)p)); } #define AO_HAVE_int_load_acquire AO_INLINE void AO_int_store_release(volatile unsigned int *p, unsigned int val) { __st4_rel((AO_INTEL_PTR_t)p, val); } #define AO_HAVE_int_store_release AO_INLINE void AO_nop_full(void) { __mf(); } #define AO_HAVE_nop_full #ifndef AO_PREFER_GENERALIZED AO_INLINE AO_t AO_fetch_and_add1_acquire(volatile AO_t *p) { return __fetchadd8_acq((unsigned __int64 *)p, 1); } #define AO_HAVE_fetch_and_add1_acquire AO_INLINE AO_t AO_fetch_and_add1_release(volatile AO_t *p) { return __fetchadd8_rel((unsigned __int64 *)p, 1); } #define AO_HAVE_fetch_and_add1_release AO_INLINE AO_t AO_fetch_and_sub1_acquire(volatile AO_t *p) { return __fetchadd8_acq((unsigned __int64 *)p, -1); } #define AO_HAVE_fetch_and_sub1_acquire AO_INLINE AO_t AO_fetch_and_sub1_release(volatile AO_t 
*p) { return __fetchadd8_rel((unsigned __int64 *)p, -1); } #define AO_HAVE_fetch_and_sub1_release #endif /* !AO_PREFER_GENERALIZED */ AO_INLINE AO_t AO_fetch_compare_and_swap_acquire(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return _InterlockedCompareExchange64_acq(addr, new_val, old_val); } #define AO_HAVE_fetch_compare_and_swap_acquire AO_INLINE AO_t AO_fetch_compare_and_swap_release(volatile AO_t *addr, AO_t old_val, AO_t new_val) { return _InterlockedCompareExchange64_rel(addr, new_val, old_val); } #define AO_HAVE_fetch_compare_and_swap_release AO_INLINE unsigned char AO_char_fetch_compare_and_swap_acquire(volatile unsigned char *addr, unsigned char old_val, unsigned char new_val) { return _InterlockedCompareExchange8_acq(addr, new_val, old_val); } #define AO_HAVE_char_fetch_compare_and_swap_acquire AO_INLINE unsigned char AO_char_fetch_compare_and_swap_release(volatile unsigned char *addr, unsigned char old_val, unsigned char new_val) { return _InterlockedCompareExchange8_rel(addr, new_val, old_val); } #define AO_HAVE_char_fetch_compare_and_swap_release AO_INLINE unsigned short AO_short_fetch_compare_and_swap_acquire(volatile unsigned short *addr, unsigned short old_val, unsigned short new_val) { return _InterlockedCompareExchange16_acq(addr, new_val, old_val); } #define AO_HAVE_short_fetch_compare_and_swap_acquire AO_INLINE unsigned short AO_short_fetch_compare_and_swap_release(volatile unsigned short *addr, unsigned short old_val, unsigned short new_val) { return _InterlockedCompareExchange16_rel(addr, new_val, old_val); } #define AO_HAVE_short_fetch_compare_and_swap_release AO_INLINE unsigned int AO_int_fetch_compare_and_swap_acquire(volatile unsigned int *addr, unsigned int old_val, unsigned int new_val) { return _InterlockedCompareExchange_acq(addr, new_val, old_val); } #define AO_HAVE_int_fetch_compare_and_swap_acquire AO_INLINE unsigned int AO_int_fetch_compare_and_swap_release(volatile unsigned int *addr, unsigned int old_val, unsigned int 
new_val) { return _InterlockedCompareExchange_rel(addr, new_val, old_val); } #define AO_HAVE_int_fetch_compare_and_swap_release #undef AO_INTEL_PTR_t papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/000077500000000000000000000000001502707512200222525ustar00rootroot00000000000000papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/acquire_release_volatile.h000066400000000000000000000050421502707512200274540ustar00rootroot00000000000000/* * Copyright (c) 2003-2004 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* This file adds definitions appropriate for environments in which */ /* volatile load of a given type has acquire semantics, and volatile */ /* store of a given type has release semantics. This is arguably */ /* supposed to be true with the standard Itanium software conventions. */ /* Empirically gcc/ia64 does some reordering of ordinary operations */ /* around volatiles even when we think it should not. 
GCC v3.3 and */ /* earlier could reorder a volatile store with another store. As of */ /* March 2005, gcc pre-4 reuses some previously computed common */ /* subexpressions across a volatile load; hence, we now add compiler */ /* barriers for gcc. */ #ifndef AO_HAVE_GCC_BARRIER /* TODO: Check GCC version (if workaround not needed for modern GCC). */ # if defined(__GNUC__) # define AO_GCC_BARRIER() AO_compiler_barrier() # else # define AO_GCC_BARRIER() (void)0 # endif # define AO_HAVE_GCC_BARRIER #endif AO_INLINE AO_t AO_load_acquire(const volatile AO_t *addr) { AO_t result = *addr; /* A normal volatile load generates an ld.acq (on IA-64). */ AO_GCC_BARRIER(); return result; } #define AO_HAVE_load_acquire AO_INLINE void AO_store_release(volatile AO_t *addr, AO_t new_val) { AO_GCC_BARRIER(); /* A normal volatile store generates an st.rel (on IA-64). */ *addr = new_val; } #define AO_HAVE_store_release papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/acquire_release_volatile.template000066400000000000000000000051041502707512200310370ustar00rootroot00000000000000/* * Copyright (c) 2003-2004 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
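`AO_GCC_BARRIER` expands to `AO_compiler_barrier`, which for GCC-compatible compilers is conventionally an empty asm statement with a `"memory"` clobber: it forces the compiler to discard cached register copies and forbids reordering memory accesses across it, while emitting no CPU fence instruction. A minimal sketch (assuming a GCC-style compiler; `COMPILER_BARRIER` is an illustrative name, not the AO_ macro):

```c
/* Compiler-only barrier: no hardware fence is generated. */
#define COMPILER_BARRIER() __asm__ __volatile__("" : : : "memory")

static int slot;

int read_after_barrier(void)
{
    slot = 7;
    COMPILER_BARRIER();   /* the store above cannot sink below this point */
    return slot;          /* must be re-read, not forwarded from a register */
}
```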
IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* This file adds definitions appropriate for environments in which */ /* volatile load of a given type has acquire semantics, and volatile */ /* store of a given type has release semantics. This is arguably */ /* supposed to be true with the standard Itanium software conventions. */ /* Empirically gcc/ia64 does some reordering of ordinary operations */ /* around volatiles even when we think it should not. GCC v3.3 and */ /* earlier could reorder a volatile store with another store. As of */ /* March 2005, gcc pre-4 reuses some previously computed common */ /* subexpressions across a volatile load; hence, we now add compiler */ /* barriers for gcc. */ #ifndef AO_HAVE_GCC_BARRIER /* TODO: Check GCC version (if workaround not needed for modern GCC). */ # if defined(__GNUC__) # define AO_GCC_BARRIER() AO_compiler_barrier() # else # define AO_GCC_BARRIER() (void)0 # endif # define AO_HAVE_GCC_BARRIER #endif AO_INLINE XCTYPE AO_XSIZE_load_acquire(const volatile XCTYPE *addr) { XCTYPE result = *addr; /* A normal volatile load generates an ld.acq (on IA-64). */ AO_GCC_BARRIER(); return result; } #define AO_HAVE_XSIZE_load_acquire AO_INLINE void AO_XSIZE_store_release(volatile XCTYPE *addr, XCTYPE new_val) { AO_GCC_BARRIER(); /* A normal volatile store generates an st.rel (on IA-64). */ *addr = new_val; } #define AO_HAVE_XSIZE_store_release papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/atomic_load.h000066400000000000000000000032261502707512200247010ustar00rootroot00000000000000/* * Copyright (c) 2004 Hewlett-Packard Development Company, L.P. 
* * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Definitions for architectures on which loads of given type are */ /* atomic (either for suitably aligned data only or for any legal */ /* alignment). */ AO_INLINE AO_t AO_load(const volatile AO_t *addr) { # ifdef AO_ACCESS_CHECK_ALIGNED AO_ASSERT_ADDR_ALIGNED(addr); # endif /* Cast away the volatile for architectures like IA64 where */ /* volatile adds barrier (fence) semantics. */ return *(const AO_t *)addr; } #define AO_HAVE_load papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/atomic_load.template000066400000000000000000000032561502707512200262700ustar00rootroot00000000000000/* * Copyright (c) 2004 Hewlett-Packard Development Company, L.P. 
* * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Definitions for architectures on which loads of given type are */ /* atomic (either for suitably aligned data only or for any legal */ /* alignment). */ AO_INLINE XCTYPE AO_XSIZE_load(const volatile XCTYPE *addr) { # ifdef AO_ACCESS_XSIZE_CHECK_ALIGNED AO_ASSERT_ADDR_ALIGNED(addr); # endif /* Cast away the volatile for architectures like IA64 where */ /* volatile adds barrier (fence) semantics. */ return *(const XCTYPE *)addr; } #define AO_HAVE_XSIZE_load papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/atomic_store.h000066400000000000000000000030271502707512200251150ustar00rootroot00000000000000/* * Copyright (c) 2004 Hewlett-Packard Development Company, L.P. 
* * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Definitions for architectures on which stores of given type are */ /* atomic (either for suitably aligned data only or for any legal */ /* alignment). */ AO_INLINE void AO_store(volatile AO_t *addr, AO_t new_val) { # ifdef AO_ACCESS_CHECK_ALIGNED AO_ASSERT_ADDR_ALIGNED(addr); # endif *(AO_t *)addr = new_val; } #define AO_HAVE_store papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/atomic_store.template000066400000000000000000000030571502707512200265040ustar00rootroot00000000000000/* * Copyright (c) 2004 Hewlett-Packard Development Company, L.P. 
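Both `atomic_load.h` and `atomic_store.h` rest on the same precondition: a plain load or store is only atomic when the address is naturally aligned for the type, which is what the optional `AO_ACCESS_CHECK_ALIGNED` assertion guards. A small sketch of that check (illustrative names, assuming aligned word accesses are single instructions on the target):

```c
#include <stdint.h>
#include <assert.h>

/* Natural alignment check: address must be a multiple of the access size. */
#define ASSERT_ADDR_ALIGNED(p) \
    assert(((uintptr_t)(p) & (sizeof(*(p)) - 1)) == 0)

unsigned long atomic_plain_load(const volatile unsigned long *addr)
{
    ASSERT_ADDR_ALIGNED(addr);
    /* cast away volatile, as the header does for targets like IA-64
       where volatile would add fence semantics */
    return *(const unsigned long *)addr;
}
```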
* * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Definitions for architectures on which stores of given type are */ /* atomic (either for suitably aligned data only or for any legal */ /* alignment). */ AO_INLINE void AO_XSIZE_store(volatile XCTYPE *addr, XCTYPE new_val) { # ifdef AO_ACCESS_XSIZE_CHECK_ALIGNED AO_ASSERT_ADDR_ALIGNED(addr); # endif *(XCTYPE *)addr = new_val; } #define AO_HAVE_XSIZE_store papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/char_acquire_release_volatile.h000066400000000000000000000051621502707512200304540ustar00rootroot00000000000000/* * Copyright (c) 2003-2004 Hewlett-Packard Development Company, L.P. 
* * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* This file adds definitions appropriate for environments in which */ /* volatile load of a given type has acquire semantics, and volatile */ /* store of a given type has release semantics. This is arguably */ /* supposed to be true with the standard Itanium software conventions. */ /* Empirically gcc/ia64 does some reordering of ordinary operations */ /* around volatiles even when we think it should not. GCC v3.3 and */ /* earlier could reorder a volatile store with another store. As of */ /* March 2005, gcc pre-4 reuses some previously computed common */ /* subexpressions across a volatile load; hence, we now add compiler */ /* barriers for gcc. */ #ifndef AO_HAVE_GCC_BARRIER /* TODO: Check GCC version (if workaround not needed for modern GCC). 
*/ # if defined(__GNUC__) # define AO_GCC_BARRIER() AO_compiler_barrier() # else # define AO_GCC_BARRIER() (void)0 # endif # define AO_HAVE_GCC_BARRIER #endif AO_INLINE unsigned/**/char AO_char_load_acquire(const volatile unsigned/**/char *addr) { unsigned/**/char result = *addr; /* A normal volatile load generates an ld.acq (on IA-64). */ AO_GCC_BARRIER(); return result; } #define AO_HAVE_char_load_acquire AO_INLINE void AO_char_store_release(volatile unsigned/**/char *addr, unsigned/**/char new_val) { AO_GCC_BARRIER(); /* A normal volatile store generates an st.rel (on IA-64). */ *addr = new_val; } #define AO_HAVE_char_store_release papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/char_atomic_load.h000066400000000000000000000033111502707512200256710ustar00rootroot00000000000000/* * Copyright (c) 2004 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ /* Definitions for architectures on which loads of given type are */ /* atomic (either for suitably aligned data only or for any legal */ /* alignment). */ AO_INLINE unsigned/**/char AO_char_load(const volatile unsigned/**/char *addr) { # ifdef AO_ACCESS_char_CHECK_ALIGNED AO_ASSERT_ADDR_ALIGNED(addr); # endif /* Cast away the volatile for architectures like IA64 where */ /* volatile adds barrier (fence) semantics. */ return *(const unsigned/**/char *)addr; } #define AO_HAVE_char_load papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/char_atomic_store.h000066400000000000000000000031121502707512200261050ustar00rootroot00000000000000/* * Copyright (c) 2004 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Definitions for architectures on which stores of given type are */ /* atomic (either for suitably aligned data only or for any legal */ /* alignment). 
*/ AO_INLINE void AO_char_store(volatile unsigned/**/char *addr, unsigned/**/char new_val) { # ifdef AO_ACCESS_char_CHECK_ALIGNED AO_ASSERT_ADDR_ALIGNED(addr); # endif *(unsigned/**/char *)addr = new_val; } #define AO_HAVE_char_store papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/double_atomic_load_store.h000066400000000000000000000037271502707512200274550ustar00rootroot00000000000000/* * Copyright (c) 2004 Hewlett-Packard Development Company, L.P. * Copyright (c) 2013 Ivan Maidanski * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Definitions for architectures on which AO_double_t loads and stores */ /* are atomic (either for suitably aligned data only or for any legal */ /* alignment). */ AO_INLINE AO_double_t AO_double_load(const volatile AO_double_t *addr) { AO_double_t result; # ifdef AO_ACCESS_double_CHECK_ALIGNED AO_ASSERT_ADDR_ALIGNED(addr); # endif /* Cast away the volatile in case it adds fence semantics. 
*/ result.AO_whole = ((const AO_double_t *)addr)->AO_whole; return result; } #define AO_HAVE_double_load AO_INLINE void AO_double_store(volatile AO_double_t *addr, AO_double_t new_val) { # ifdef AO_ACCESS_double_CHECK_ALIGNED AO_ASSERT_ADDR_ALIGNED(addr); # endif ((AO_double_t *)addr)->AO_whole = new_val.AO_whole; } #define AO_HAVE_double_store papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/int_acquire_release_volatile.h000066400000000000000000000051061502707512200303270ustar00rootroot00000000000000/* * Copyright (c) 2003-2004 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* This file adds definitions appropriate for environments in which */ /* volatile load of a given type has acquire semantics, and volatile */ /* store of a given type has release semantics. This is arguably */ /* supposed to be true with the standard Itanium software conventions. 
*/ /* Empirically gcc/ia64 does some reordering of ordinary operations */ /* around volatiles even when we think it should not. GCC v3.3 and */ /* earlier could reorder a volatile store with another store. As of */ /* March 2005, gcc pre-4 reuses some previously computed common */ /* subexpressions across a volatile load; hence, we now add compiler */ /* barriers for gcc. */ #ifndef AO_HAVE_GCC_BARRIER /* TODO: Check GCC version (if workaround not needed for modern GCC). */ # if defined(__GNUC__) # define AO_GCC_BARRIER() AO_compiler_barrier() # else # define AO_GCC_BARRIER() (void)0 # endif # define AO_HAVE_GCC_BARRIER #endif AO_INLINE unsigned AO_int_load_acquire(const volatile unsigned *addr) { unsigned result = *addr; /* A normal volatile load generates an ld.acq (on IA-64). */ AO_GCC_BARRIER(); return result; } #define AO_HAVE_int_load_acquire AO_INLINE void AO_int_store_release(volatile unsigned *addr, unsigned new_val) { AO_GCC_BARRIER(); /* A normal volatile store generates an st.rel (on IA-64). */ *addr = new_val; } #define AO_HAVE_int_store_release papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/int_atomic_load.h000066400000000000000000000032561502707512200255560ustar00rootroot00000000000000/* * Copyright (c) 2004 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Definitions for architectures on which loads of given type are */ /* atomic (either for suitably aligned data only or for any legal */ /* alignment). */ AO_INLINE unsigned AO_int_load(const volatile unsigned *addr) { # ifdef AO_ACCESS_int_CHECK_ALIGNED AO_ASSERT_ADDR_ALIGNED(addr); # endif /* Cast away the volatile for architectures like IA64 where */ /* volatile adds barrier (fence) semantics. */ return *(const unsigned *)addr; } #define AO_HAVE_int_load papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/int_atomic_store.h000066400000000000000000000030571502707512200257720ustar00rootroot00000000000000/* * Copyright (c) 2004 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Definitions for architectures on which stores of given type are */ /* atomic (either for suitably aligned data only or for any legal */ /* alignment). */ AO_INLINE void AO_int_store(volatile unsigned *addr, unsigned new_val) { # ifdef AO_ACCESS_int_CHECK_ALIGNED AO_ASSERT_ADDR_ALIGNED(addr); # endif *(unsigned *)addr = new_val; } #define AO_HAVE_int_store papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/ordered_loads_only.h000066400000000000000000000150351502707512200262760ustar00rootroot00000000000000/* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifdef AO_HAVE_char_load /* char_load_read is defined in generalize-small. 
*/ # define AO_char_load_acquire(addr) AO_char_load_read(addr) # define AO_HAVE_char_load_acquire #endif /* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifdef AO_HAVE_short_load /* short_load_read is defined in generalize-small. */ # define AO_short_load_acquire(addr) AO_short_load_read(addr) # define AO_HAVE_short_load_acquire #endif /* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. 
* * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifdef AO_HAVE_int_load /* int_load_read is defined in generalize-small. */ # define AO_int_load_acquire(addr) AO_int_load_read(addr) # define AO_HAVE_int_load_acquire #endif /* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifdef AO_HAVE_load /* load_read is defined in generalize-small. */ # define AO_load_acquire(addr) AO_load_read(addr) # define AO_HAVE_load_acquire #endif /* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifdef AO_HAVE_double_load /* double_load_read is defined in generalize-small. 
*/ # define AO_double_load_acquire(addr) AO_double_load_read(addr) # define AO_HAVE_double_load_acquire #endif papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/ordered_loads_only.template000066400000000000000000000025011502707512200276540ustar00rootroot00000000000000/* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifdef AO_HAVE_XSIZE_load /* XSIZE_load_read is defined in generalize-small. */ # define AO_XSIZE_load_acquire(addr) AO_XSIZE_load_read(addr) # define AO_HAVE_XSIZE_load_acquire #endif papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/ordered_stores_only.h000066400000000000000000000150571502707512200265170ustar00rootroot00000000000000/* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. 
* * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifdef AO_HAVE_char_store # define AO_char_store_release(addr, val) \ (AO_nop_write(), AO_char_store(addr, val)) # define AO_HAVE_char_store_release #endif /* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifdef AO_HAVE_short_store # define AO_short_store_release(addr, val) \ (AO_nop_write(), AO_short_store(addr, val)) # define AO_HAVE_short_store_release #endif /* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifdef AO_HAVE_int_store # define AO_int_store_release(addr, val) \ (AO_nop_write(), AO_int_store(addr, val)) # define AO_HAVE_int_store_release #endif /* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. 
* * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifdef AO_HAVE_store # define AO_store_release(addr, val) \ (AO_nop_write(), AO_store(addr, val)) # define AO_HAVE_store_release #endif /* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifdef AO_HAVE_double_store # define AO_double_store_release(addr, val) \ (AO_nop_write(), AO_double_store(addr, val)) # define AO_HAVE_double_store_release #endif papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/ordered_stores_only.template000066400000000000000000000025031502707512200300730ustar00rootroot00000000000000/* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #ifdef AO_HAVE_XSIZE_store # define AO_XSIZE_store_release(addr, val) \ (AO_nop_write(), AO_XSIZE_store(addr, val)) # define AO_HAVE_XSIZE_store_release #endif papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/short_acquire_release_volatile.h000066400000000000000000000051731502707512200307000ustar00rootroot00000000000000/* * Copyright (c) 2003-2004 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* This file adds definitions appropriate for environments in which */ /* volatile load of a given type has acquire semantics, and volatile */ /* store of a given type has release semantics. This is arguably */ /* supposed to be true with the standard Itanium software conventions. */ /* Empirically gcc/ia64 does some reordering of ordinary operations */ /* around volatiles even when we think it should not. GCC v3.3 and */ /* earlier could reorder a volatile store with another store. 
As of */ /* March 2005, gcc pre-4 reuses some previously computed common */ /* subexpressions across a volatile load; hence, we now add compiler */ /* barriers for gcc. */ #ifndef AO_HAVE_GCC_BARRIER /* TODO: Check GCC version (if workaround not needed for modern GCC). */ # if defined(__GNUC__) # define AO_GCC_BARRIER() AO_compiler_barrier() # else # define AO_GCC_BARRIER() (void)0 # endif # define AO_HAVE_GCC_BARRIER #endif AO_INLINE unsigned/**/short AO_short_load_acquire(const volatile unsigned/**/short *addr) { unsigned/**/short result = *addr; /* A normal volatile load generates an ld.acq (on IA-64). */ AO_GCC_BARRIER(); return result; } #define AO_HAVE_short_load_acquire AO_INLINE void AO_short_store_release(volatile unsigned/**/short *addr, unsigned/**/short new_val) { AO_GCC_BARRIER(); /* A normal volatile store generates an st.rel (on IA-64). */ *addr = new_val; } #define AO_HAVE_short_store_release papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/short_atomic_load.h000066400000000000000000000033171502707512200261210ustar00rootroot00000000000000/* * Copyright (c) 2004 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Definitions for architectures on which loads of given type are */ /* atomic (either for suitably aligned data only or for any legal */ /* alignment). */ AO_INLINE unsigned/**/short AO_short_load(const volatile unsigned/**/short *addr) { # ifdef AO_ACCESS_short_CHECK_ALIGNED AO_ASSERT_ADDR_ALIGNED(addr); # endif /* Cast away the volatile for architectures like IA64 where */ /* volatile adds barrier (fence) semantics. */ return *(const unsigned/**/short *)addr; } #define AO_HAVE_short_load papi-papi-7-2-0-t/src/atomic_ops/sysdeps/loadstore/short_atomic_store.h000066400000000000000000000031201502707512200263260ustar00rootroot00000000000000/* * Copyright (c) 2004 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Definitions for architectures on which stores of given type are */ /* atomic (either for suitably aligned data only or for any legal */ /* alignment). */ AO_INLINE void AO_short_store(volatile unsigned/**/short *addr, unsigned/**/short new_val) { # ifdef AO_ACCESS_short_CHECK_ALIGNED AO_ASSERT_ADDR_ALIGNED(addr); # endif *(unsigned/**/short *)addr = new_val; } #define AO_HAVE_short_store papi-papi-7-2-0-t/src/atomic_ops/sysdeps/msftc/000077500000000000000000000000001502707512200213725ustar00rootroot00000000000000papi-papi-7-2-0-t/src/atomic_ops/sysdeps/msftc/arm.h000066400000000000000000000107731502707512200223320ustar00rootroot00000000000000/* * Copyright (c) 2003 Hewlett-Packard Development Company, L.P. * Copyright (c) 2009-2021 Ivan Maidanski * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* Some ARM slide set, if it has been read correctly, claims that Loads */ /* followed by either a Load or a Store are ordered, but nothing else. */ /* It is assumed that Windows interrupt handlers clear the LL/SC flag. */ /* Unaligned accesses are not guaranteed to be atomic. */ #include "../all_aligned_atomic_load_store.h" #define AO_T_IS_INT #ifndef AO_ASSUME_WINDOWS98 /* CAS is always available */ # define AO_ASSUME_WINDOWS98 #endif #include "common32_defs.h" /* If only a single processor is used, we can define AO_UNIPROCESSOR. */ #ifdef AO_UNIPROCESSOR AO_INLINE void AO_nop_full(void) { AO_compiler_barrier(); } # define AO_HAVE_nop_full #else /* AO_nop_full() is emulated using AO_test_and_set_full(). */ #endif #ifndef AO_HAVE_test_and_set_full # include "../test_and_set_t_is_ao_t.h" /* AO_test_and_set_full() is emulated. */ #endif #if _M_ARM >= 7 && !defined(AO_NO_DOUBLE_CAS) # include "../standard_ao_double_t.h" /* These intrinsics are supposed to use LDREXD/STREXD. 
*/ # pragma intrinsic (_InterlockedCompareExchange64) # pragma intrinsic (_InterlockedCompareExchange64_acq) # pragma intrinsic (_InterlockedCompareExchange64_nf) # pragma intrinsic (_InterlockedCompareExchange64_rel) AO_INLINE int AO_double_compare_and_swap(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { AO_ASSERT_ADDR_ALIGNED(addr); return (double_ptr_storage)_InterlockedCompareExchange64_nf( (__int64 volatile *)addr, new_val.AO_whole /* exchange */, old_val.AO_whole) == old_val.AO_whole; } # define AO_HAVE_double_compare_and_swap AO_INLINE int AO_double_compare_and_swap_acquire(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { AO_ASSERT_ADDR_ALIGNED(addr); return (double_ptr_storage)_InterlockedCompareExchange64_acq( (__int64 volatile *)addr, new_val.AO_whole /* exchange */, old_val.AO_whole) == old_val.AO_whole; } # define AO_HAVE_double_compare_and_swap_acquire AO_INLINE int AO_double_compare_and_swap_release(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { AO_ASSERT_ADDR_ALIGNED(addr); return (double_ptr_storage)_InterlockedCompareExchange64_rel( (__int64 volatile *)addr, new_val.AO_whole /* exchange */, old_val.AO_whole) == old_val.AO_whole; } # define AO_HAVE_double_compare_and_swap_release AO_INLINE int AO_double_compare_and_swap_full(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { AO_ASSERT_ADDR_ALIGNED(addr); return (double_ptr_storage)_InterlockedCompareExchange64( (__int64 volatile *)addr, new_val.AO_whole /* exchange */, old_val.AO_whole) == old_val.AO_whole; } # define AO_HAVE_double_compare_and_swap_full #endif /* _M_ARM >= 7 && !AO_NO_DOUBLE_CAS */ papi-papi-7-2-0-t/src/atomic_ops/sysdeps/msftc/arm64.h000066400000000000000000000111601502707512200224730ustar00rootroot00000000000000/* * Copyright (c) 2003 Hewlett-Packard Development Company, L.P. 
* Copyright (c) 2021 Ivan Maidanski * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include "../all_aligned_atomic_load_store.h" #ifndef AO_ASSUME_WINDOWS98 # define AO_ASSUME_WINDOWS98 #endif #ifndef AO_USE_INTERLOCKED_INTRINSICS # define AO_USE_INTERLOCKED_INTRINSICS #endif #include "common32_defs.h" #ifndef AO_HAVE_test_and_set_full # include "../test_and_set_t_is_ao_t.h" /* AO_test_and_set_full() is emulated using word-wide CAS. 
*/ #endif #ifndef AO_NO_DOUBLE_CAS # include "../standard_ao_double_t.h" # pragma intrinsic (_InterlockedCompareExchange128) # pragma intrinsic (_InterlockedCompareExchange128_acq) # pragma intrinsic (_InterlockedCompareExchange128_nf) # pragma intrinsic (_InterlockedCompareExchange128_rel) AO_INLINE int AO_compare_double_and_swap_double(volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2) { __int64 comparandResult[2]; AO_ASSERT_ADDR_ALIGNED(addr); comparandResult[0] = old_val1; /* low */ comparandResult[1] = old_val2; /* high */ return _InterlockedCompareExchange128_nf((volatile __int64 *)addr, new_val2 /* high */, new_val1 /* low */, comparandResult); } # define AO_HAVE_compare_double_and_swap_double AO_INLINE int AO_compare_double_and_swap_double_acquire(volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2) { __int64 comparandResult[2]; AO_ASSERT_ADDR_ALIGNED(addr); comparandResult[0] = old_val1; /* low */ comparandResult[1] = old_val2; /* high */ return _InterlockedCompareExchange128_acq((volatile __int64 *)addr, new_val2 /* high */, new_val1 /* low */, comparandResult); } # define AO_HAVE_compare_double_and_swap_double_acquire AO_INLINE int AO_compare_double_and_swap_double_release(volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2) { __int64 comparandResult[2]; AO_ASSERT_ADDR_ALIGNED(addr); comparandResult[0] = old_val1; /* low */ comparandResult[1] = old_val2; /* high */ return _InterlockedCompareExchange128_rel((volatile __int64 *)addr, new_val2 /* high */, new_val1 /* low */, comparandResult); } # define AO_HAVE_compare_double_and_swap_double_release AO_INLINE int AO_compare_double_and_swap_double_full(volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2) { __int64 comparandResult[2]; AO_ASSERT_ADDR_ALIGNED(addr); comparandResult[0] = old_val1; /* low */ comparandResult[1] = old_val2; /* high */ return 
_InterlockedCompareExchange128((volatile __int64 *)addr, new_val2 /* high */, new_val1 /* low */, comparandResult); } # define AO_HAVE_compare_double_and_swap_double_full #endif /* !AO_NO_DOUBLE_CAS */ papi-papi-7-2-0-t/src/atomic_ops/sysdeps/msftc/common32_defs.h000066400000000000000000001146631502707512200242140ustar00rootroot00000000000000/* * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * Copyright (c) 2009-2021 Ivan Maidanski * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* This file contains AO primitives based on VC++ built-in intrinsic */ /* functions commonly available across 32- and 64-bit architectures. */ /* This file should be included from arch-specific header files. */ /* Define AO_USE_INTERLOCKED_INTRINSICS if _Interlocked primitives */ /* (used below) are available as intrinsic ones for a target arch */ /* (otherwise "Interlocked" functions family is used instead). */ /* Define AO_ASSUME_WINDOWS98 if CAS is available. 
*/ #if _MSC_VER <= 1400 || !defined(AO_USE_INTERLOCKED_INTRINSICS) \ || defined(_WIN32_WCE) # include /* Seems like over-kill, but that's what MSDN recommends. */ /* And apparently winbase.h is not always self-contained. */ /* Optionally, client could define WIN32_LEAN_AND_MEAN before */ /* include atomic_ops.h to reduce amount of Windows internal */ /* headers included by windows.h one. */ #endif #if _MSC_VER < 1310 || !defined(AO_USE_INTERLOCKED_INTRINSICS) # define _InterlockedIncrement InterlockedIncrement # define _InterlockedDecrement InterlockedDecrement # define _InterlockedExchangeAdd InterlockedExchangeAdd # define _InterlockedCompareExchange InterlockedCompareExchange # define AO_INTERLOCKED_VOLATILE /**/ #else /* elif _MSC_VER >= 1310 */ # if _MSC_VER >= 1400 # ifndef _WIN32_WCE # include # endif # else /* elif _MSC_VER < 1400 */ # ifdef __cplusplus extern "C" { # endif LONG __cdecl _InterlockedIncrement(LONG volatile *); LONG __cdecl _InterlockedDecrement(LONG volatile *); LONG __cdecl _InterlockedExchangeAdd(LONG volatile *, LONG); LONG __cdecl _InterlockedCompareExchange(LONG volatile *, LONG /* Exchange */, LONG /* Comp */); # ifdef __cplusplus } /* extern "C" */ # endif # endif /* _MSC_VER < 1400 */ # if !defined(AO_PREFER_GENERALIZED) || !defined(AO_ASSUME_WINDOWS98) # pragma intrinsic (_InterlockedIncrement) # pragma intrinsic (_InterlockedDecrement) # pragma intrinsic (_InterlockedExchangeAdd) # ifndef AO_T_IS_INT # pragma intrinsic (_InterlockedIncrement64) # pragma intrinsic (_InterlockedDecrement64) # pragma intrinsic (_InterlockedExchangeAdd64) # endif # endif /* !AO_PREFER_GENERALIZED */ # pragma intrinsic (_InterlockedCompareExchange) # ifndef AO_T_IS_INT # pragma intrinsic (_InterlockedCompareExchange64) # endif # define AO_INTERLOCKED_VOLATILE volatile #endif /* _MSC_VER >= 1310 */ #if !defined(AO_PREFER_GENERALIZED) || !defined(AO_ASSUME_WINDOWS98) AO_INLINE AO_t AO_fetch_and_add_full(volatile AO_t *p, AO_t incr) { # ifdef AO_T_IS_INT 
return _InterlockedExchangeAdd((long AO_INTERLOCKED_VOLATILE *)p, incr); # else return _InterlockedExchangeAdd64((__int64 volatile *)p, incr); # endif } # define AO_HAVE_fetch_and_add_full AO_INLINE AO_t AO_fetch_and_add1_full(volatile AO_t *p) { # ifdef AO_T_IS_INT return _InterlockedIncrement((long AO_INTERLOCKED_VOLATILE *)p) - 1; # else return _InterlockedIncrement64((__int64 volatile *)p) - 1; # endif } # define AO_HAVE_fetch_and_add1_full AO_INLINE AO_t AO_fetch_and_sub1_full(volatile AO_t *p) { # ifdef AO_T_IS_INT return _InterlockedDecrement((long AO_INTERLOCKED_VOLATILE *)p) + 1; # else return _InterlockedDecrement64((__int64 volatile *)p) + 1; # endif } # define AO_HAVE_fetch_and_sub1_full # ifndef AO_T_IS_INT AO_INLINE unsigned int AO_int_fetch_and_add_full(volatile unsigned int *p, unsigned int incr) { return _InterlockedExchangeAdd((long volatile *)p, incr); } # define AO_HAVE_int_fetch_and_add_full AO_INLINE unsigned int AO_int_fetch_and_add1_full(volatile unsigned int *p) { return _InterlockedIncrement((long volatile *)p) - 1; } # define AO_HAVE_int_fetch_and_add1_full AO_INLINE unsigned int AO_int_fetch_and_sub1_full(volatile unsigned int *p) { return _InterlockedDecrement((long volatile *)p) + 1; } # define AO_HAVE_int_fetch_and_sub1_full # endif /* !AO_T_IS_INT */ #endif /* !AO_PREFER_GENERALIZED */ #ifdef AO_ASSUME_WINDOWS98 AO_INLINE AO_t AO_fetch_compare_and_swap_full(volatile AO_t *addr, AO_t old_val, AO_t new_val) { # ifndef AO_T_IS_INT return (AO_t)_InterlockedCompareExchange64((__int64 volatile *)addr, new_val, old_val); # elif defined(AO_OLD_STYLE_INTERLOCKED_COMPARE_EXCHANGE) return (AO_t)_InterlockedCompareExchange( (void *AO_INTERLOCKED_VOLATILE *)addr, (void *)new_val, (void *)old_val); # else return _InterlockedCompareExchange((long AO_INTERLOCKED_VOLATILE *)addr, new_val, old_val); # endif } # define AO_HAVE_fetch_compare_and_swap_full # ifndef AO_T_IS_INT AO_INLINE unsigned int AO_int_fetch_compare_and_swap_full(volatile unsigned 
int *addr, unsigned int old_val, unsigned int new_val) { return _InterlockedCompareExchange((long volatile *)addr, new_val, old_val); } # define AO_HAVE_int_fetch_compare_and_swap_full # endif /* !AO_T_IS_INT */ #endif /* AO_ASSUME_WINDOWS98 */ #if (_MSC_VER > 1400) && (!defined(_M_ARM) || _MSC_VER >= 1800) # if _MSC_VER < 1800 || !defined(AO_PREFER_GENERALIZED) # pragma intrinsic (_InterlockedAnd8) # pragma intrinsic (_InterlockedOr8) # pragma intrinsic (_InterlockedXor8) AO_INLINE void AO_char_and_full(volatile unsigned char *p, unsigned char value) { _InterlockedAnd8((char volatile *)p, value); } # define AO_HAVE_char_and_full AO_INLINE void AO_char_or_full(volatile unsigned char *p, unsigned char value) { _InterlockedOr8((char volatile *)p, value); } # define AO_HAVE_char_or_full AO_INLINE void AO_char_xor_full(volatile unsigned char *p, unsigned char value) { _InterlockedXor8((char volatile *)p, value); } # define AO_HAVE_char_xor_full # endif /* _MSC_VER < 1800 || !AO_PREFER_GENERALIZED */ # pragma intrinsic (_InterlockedCompareExchange16) AO_INLINE unsigned short AO_short_fetch_compare_and_swap_full(volatile unsigned short *addr, unsigned short old_val, unsigned short new_val) { return _InterlockedCompareExchange16((short volatile *)addr, new_val, old_val); } # define AO_HAVE_short_fetch_compare_and_swap_full # ifndef AO_PREFER_GENERALIZED # pragma intrinsic (_InterlockedIncrement16) # pragma intrinsic (_InterlockedDecrement16) AO_INLINE unsigned short AO_short_fetch_and_add1_full(volatile unsigned short *p) { return _InterlockedIncrement16((short volatile *)p) - 1; } # define AO_HAVE_short_fetch_and_add1_full AO_INLINE unsigned short AO_short_fetch_and_sub1_full(volatile unsigned short *p) { return _InterlockedDecrement16((short volatile *)p) + 1; } # define AO_HAVE_short_fetch_and_sub1_full # endif /* !AO_PREFER_GENERALIZED */ #endif /* _MSC_VER > 1400 */ #if _MSC_VER >= 1800 /* Visual Studio 2013+ */ # ifndef AO_PREFER_GENERALIZED # pragma intrinsic 
(_InterlockedAnd16) # pragma intrinsic (_InterlockedOr16) # pragma intrinsic (_InterlockedXor16) AO_INLINE void AO_short_and_full(volatile unsigned short *p, unsigned short value) { (void)_InterlockedAnd16((short volatile *)p, value); } # define AO_HAVE_short_and_full AO_INLINE void AO_short_or_full(volatile unsigned short *p, unsigned short value) { (void)_InterlockedOr16((short volatile *)p, value); } # define AO_HAVE_short_or_full AO_INLINE void AO_short_xor_full(volatile unsigned short *p, unsigned short value) { (void)_InterlockedXor16((short volatile *)p, value); } # define AO_HAVE_short_xor_full # pragma intrinsic (_InterlockedAnd) # pragma intrinsic (_InterlockedOr) # pragma intrinsic (_InterlockedXor) # ifndef AO_T_IS_INT AO_INLINE void AO_int_and_full(volatile unsigned int *p, unsigned int value) { (void)_InterlockedAnd((long volatile *)p, value); } # define AO_HAVE_int_and_full AO_INLINE void AO_int_or_full(volatile unsigned int *p, unsigned int value) { (void)_InterlockedOr((long volatile *)p, value); } # define AO_HAVE_int_or_full AO_INLINE void AO_int_xor_full(volatile unsigned int *p, unsigned int value) { (void)_InterlockedXor((long volatile *)p, value); } # define AO_HAVE_int_xor_full # pragma intrinsic (_InterlockedAnd64) # pragma intrinsic (_InterlockedOr64) # pragma intrinsic (_InterlockedXor64) # endif /* !AO_T_IS_INT */ AO_INLINE void AO_and_full(volatile AO_t *p, AO_t value) { # ifdef AO_T_IS_INT (void)_InterlockedAnd((long volatile *)p, value); # else (void)_InterlockedAnd64((__int64 volatile *)p, value); # endif } # define AO_HAVE_and_full AO_INLINE void AO_or_full(volatile AO_t *p, AO_t value) { # ifdef AO_T_IS_INT (void)_InterlockedOr((long volatile *)p, value); # else (void)_InterlockedOr64((__int64 volatile *)p, value); # endif } # define AO_HAVE_or_full AO_INLINE void AO_xor_full(volatile AO_t *p, AO_t value) { # ifdef AO_T_IS_INT (void)_InterlockedXor((long volatile *)p, value); # else (void)_InterlockedXor64((__int64 volatile *)p, 
value); # endif } # define AO_HAVE_xor_full # endif /* !AO_PREFER_GENERALIZED */ # if !defined(AO_PREFER_GENERALIZED) && (defined(_M_ARM) || defined(_M_ARM64)) # pragma intrinsic (_InterlockedAnd8_acq) # pragma intrinsic (_InterlockedAnd8_nf) # pragma intrinsic (_InterlockedAnd8_rel) # pragma intrinsic (_InterlockedOr8_acq) # pragma intrinsic (_InterlockedOr8_nf) # pragma intrinsic (_InterlockedOr8_rel) # pragma intrinsic (_InterlockedXor8_acq) # pragma intrinsic (_InterlockedXor8_nf) # pragma intrinsic (_InterlockedXor8_rel) AO_INLINE void AO_char_and(volatile unsigned char *p, unsigned char value) { _InterlockedAnd8_nf((char volatile *)p, value); } # define AO_HAVE_char_and AO_INLINE void AO_char_or(volatile unsigned char *p, unsigned char value) { _InterlockedOr8_nf((char volatile *)p, value); } # define AO_HAVE_char_or AO_INLINE void AO_char_xor(volatile unsigned char *p, unsigned char value) { _InterlockedXor8_nf((char volatile *)p, value); } # define AO_HAVE_char_xor AO_INLINE void AO_char_and_acquire(volatile unsigned char *p, unsigned char value) { _InterlockedAnd8_acq((char volatile *)p, value); } # define AO_HAVE_char_and_acquire AO_INLINE void AO_char_or_acquire(volatile unsigned char *p, unsigned char value) { _InterlockedOr8_acq((char volatile *)p, value); } # define AO_HAVE_char_or_acquire AO_INLINE void AO_char_xor_acquire(volatile unsigned char *p, unsigned char value) { _InterlockedXor8_acq((char volatile *)p, value); } # define AO_HAVE_char_xor_acquire AO_INLINE void AO_char_and_release(volatile unsigned char *p, unsigned char value) { _InterlockedAnd8_rel((char volatile *)p, value); } # define AO_HAVE_char_and_release AO_INLINE void AO_char_or_release(volatile unsigned char *p, unsigned char value) { _InterlockedOr8_rel((char volatile *)p, value); } # define AO_HAVE_char_or_release AO_INLINE void AO_char_xor_release(volatile unsigned char *p, unsigned char value) { _InterlockedXor8_rel((char volatile *)p, value); } # define 
AO_HAVE_char_xor_release # pragma intrinsic (_InterlockedAnd16_acq) # pragma intrinsic (_InterlockedAnd16_nf) # pragma intrinsic (_InterlockedAnd16_rel) # pragma intrinsic (_InterlockedOr16_acq) # pragma intrinsic (_InterlockedOr16_nf) # pragma intrinsic (_InterlockedOr16_rel) # pragma intrinsic (_InterlockedXor16_acq) # pragma intrinsic (_InterlockedXor16_nf) # pragma intrinsic (_InterlockedXor16_rel) AO_INLINE void AO_short_and(volatile unsigned short *p, unsigned short value) { (void)_InterlockedAnd16_nf((short volatile *)p, value); } # define AO_HAVE_short_and AO_INLINE void AO_short_or(volatile unsigned short *p, unsigned short value) { (void)_InterlockedOr16_nf((short volatile *)p, value); } # define AO_HAVE_short_or AO_INLINE void AO_short_xor(volatile unsigned short *p, unsigned short value) { (void)_InterlockedXor16_nf((short volatile *)p, value); } # define AO_HAVE_short_xor AO_INLINE void AO_short_and_acquire(volatile unsigned short *p, unsigned short value) { (void)_InterlockedAnd16_acq((short volatile *)p, value); } # define AO_HAVE_short_and_acquire AO_INLINE void AO_short_or_acquire(volatile unsigned short *p, unsigned short value) { (void)_InterlockedOr16_acq((short volatile *)p, value); } # define AO_HAVE_short_or_acquire AO_INLINE void AO_short_xor_acquire(volatile unsigned short *p, unsigned short value) { (void)_InterlockedXor16_acq((short volatile *)p, value); } # define AO_HAVE_short_xor_acquire AO_INLINE void AO_short_and_release(volatile unsigned short *p, unsigned short value) { (void)_InterlockedAnd16_rel((short volatile *)p, value); } # define AO_HAVE_short_and_release AO_INLINE void AO_short_or_release(volatile unsigned short *p, unsigned short value) { (void)_InterlockedOr16_rel((short volatile *)p, value); } # define AO_HAVE_short_or_release AO_INLINE void AO_short_xor_release(volatile unsigned short *p, unsigned short value) { (void)_InterlockedXor16_rel((short volatile *)p, value); } # define AO_HAVE_short_xor_release # pragma 
intrinsic (_InterlockedAnd_acq) # pragma intrinsic (_InterlockedAnd_nf) # pragma intrinsic (_InterlockedAnd_rel) # pragma intrinsic (_InterlockedOr_acq) # pragma intrinsic (_InterlockedOr_nf) # pragma intrinsic (_InterlockedOr_rel) # pragma intrinsic (_InterlockedXor_acq) # pragma intrinsic (_InterlockedXor_nf) # pragma intrinsic (_InterlockedXor_rel) # ifndef AO_T_IS_INT AO_INLINE void AO_int_and(volatile unsigned int *p, unsigned int value) { (void)_InterlockedAnd_nf((long volatile *)p, value); } # define AO_HAVE_int_and AO_INLINE void AO_int_or(volatile unsigned int *p, unsigned int value) { (void)_InterlockedOr_nf((long volatile *)p, value); } # define AO_HAVE_int_or AO_INLINE void AO_int_xor(volatile unsigned int *p, unsigned int value) { (void)_InterlockedXor_nf((long volatile *)p, value); } # define AO_HAVE_int_xor AO_INLINE void AO_int_and_acquire(volatile unsigned int *p, unsigned int value) { (void)_InterlockedAnd_acq((long volatile *)p, value); } # define AO_HAVE_int_and_acquire AO_INLINE void AO_int_or_acquire(volatile unsigned int *p, unsigned int value) { (void)_InterlockedOr_acq((long volatile *)p, value); } # define AO_HAVE_int_or_acquire AO_INLINE void AO_int_xor_acquire(volatile unsigned int *p, unsigned int value) { (void)_InterlockedXor_acq((long volatile *)p, value); } # define AO_HAVE_int_xor_acquire AO_INLINE void AO_int_and_release(volatile unsigned int *p, unsigned int value) { (void)_InterlockedAnd_rel((long volatile *)p, value); } # define AO_HAVE_int_and_release AO_INLINE void AO_int_or_release(volatile unsigned int *p, unsigned int value) { (void)_InterlockedOr_rel((long volatile *)p, value); } # define AO_HAVE_int_or_release AO_INLINE void AO_int_xor_release(volatile unsigned int *p, unsigned int value) { (void)_InterlockedXor_rel((long volatile *)p, value); } # define AO_HAVE_int_xor_release # pragma intrinsic (_InterlockedAnd64_acq) # pragma intrinsic (_InterlockedAnd64_nf) # pragma intrinsic (_InterlockedAnd64_rel) # pragma 
intrinsic (_InterlockedOr64_acq) # pragma intrinsic (_InterlockedOr64_nf) # pragma intrinsic (_InterlockedOr64_rel) # pragma intrinsic (_InterlockedXor64_acq) # pragma intrinsic (_InterlockedXor64_nf) # pragma intrinsic (_InterlockedXor64_rel) # endif /* !AO_T_IS_INT */ AO_INLINE void AO_and(volatile AO_t *p, AO_t value) { # ifdef AO_T_IS_INT (void)_InterlockedAnd_nf((long volatile *)p, value); # else (void)_InterlockedAnd64_nf((__int64 volatile *)p, value); # endif } # define AO_HAVE_and AO_INLINE void AO_or(volatile AO_t *p, AO_t value) { # ifdef AO_T_IS_INT (void)_InterlockedOr_nf((long volatile *)p, value); # else (void)_InterlockedOr64_nf((__int64 volatile *)p, value); # endif } # define AO_HAVE_or AO_INLINE void AO_xor(volatile AO_t *p, AO_t value) { # ifdef AO_T_IS_INT (void)_InterlockedXor_nf((long volatile *)p, value); # else (void)_InterlockedXor64_nf((__int64 volatile *)p, value); # endif } # define AO_HAVE_xor AO_INLINE void AO_and_acquire(volatile AO_t *p, AO_t value) { # ifdef AO_T_IS_INT (void)_InterlockedAnd_acq((long volatile *)p, value); # else (void)_InterlockedAnd64_acq((__int64 volatile *)p, value); # endif } # define AO_HAVE_and_acquire AO_INLINE void AO_or_acquire(volatile AO_t *p, AO_t value) { # ifdef AO_T_IS_INT (void)_InterlockedOr_acq((long volatile *)p, value); # else (void)_InterlockedOr64_acq((__int64 volatile *)p, value); # endif } # define AO_HAVE_or_acquire AO_INLINE void AO_xor_acquire(volatile AO_t *p, AO_t value) { # ifdef AO_T_IS_INT (void)_InterlockedXor_acq((long volatile *)p, value); # else (void)_InterlockedXor64_acq((__int64 volatile *)p, value); # endif } # define AO_HAVE_xor_acquire AO_INLINE void AO_and_release(volatile AO_t *p, AO_t value) { # ifdef AO_T_IS_INT (void)_InterlockedAnd_rel((long volatile *)p, value); # else (void)_InterlockedAnd64_rel((__int64 volatile *)p, value); # endif } # define AO_HAVE_and_release AO_INLINE void AO_or_release(volatile AO_t *p, AO_t value) { # ifdef AO_T_IS_INT 
(void)_InterlockedOr_rel((long volatile *)p, value); # else (void)_InterlockedOr64_rel((__int64 volatile *)p, value); # endif } # define AO_HAVE_or_release AO_INLINE void AO_xor_release(volatile AO_t *p, AO_t value) { # ifdef AO_T_IS_INT (void)_InterlockedXor_rel((long volatile *)p, value); # else (void)_InterlockedXor64_rel((__int64 volatile *)p, value); # endif } # define AO_HAVE_xor_release # pragma intrinsic (_InterlockedDecrement16_acq) # pragma intrinsic (_InterlockedDecrement16_nf) # pragma intrinsic (_InterlockedDecrement16_rel) # pragma intrinsic (_InterlockedIncrement16_acq) # pragma intrinsic (_InterlockedIncrement16_nf) # pragma intrinsic (_InterlockedIncrement16_rel) AO_INLINE unsigned short AO_short_fetch_and_add1(volatile unsigned short *p) { return _InterlockedIncrement16_nf((short volatile *)p) - 1; } # define AO_HAVE_short_fetch_and_add1 AO_INLINE unsigned short AO_short_fetch_and_sub1(volatile unsigned short *p) { return _InterlockedDecrement16_nf((short volatile *)p) + 1; } # define AO_HAVE_short_fetch_and_sub1 AO_INLINE unsigned short AO_short_fetch_and_add1_acquire(volatile unsigned short *p) { return _InterlockedIncrement16_acq((short volatile *)p) - 1; } # define AO_HAVE_short_fetch_and_add1_acquire AO_INLINE unsigned short AO_short_fetch_and_sub1_acquire(volatile unsigned short *p) { return _InterlockedDecrement16_acq((short volatile *)p) + 1; } # define AO_HAVE_short_fetch_and_sub1_acquire AO_INLINE unsigned short AO_short_fetch_and_add1_release(volatile unsigned short *p) { return _InterlockedIncrement16_rel((short volatile *)p) - 1; } # define AO_HAVE_short_fetch_and_add1_release AO_INLINE unsigned short AO_short_fetch_and_sub1_release(volatile unsigned short *p) { return _InterlockedDecrement16_rel((short volatile *)p) + 1; } # define AO_HAVE_short_fetch_and_sub1_release # pragma intrinsic (_InterlockedExchangeAdd_acq) # pragma intrinsic (_InterlockedExchangeAdd_nf) # pragma intrinsic (_InterlockedExchangeAdd_rel) # pragma intrinsic 
(_InterlockedDecrement_acq) # pragma intrinsic (_InterlockedDecrement_nf) # pragma intrinsic (_InterlockedDecrement_rel) # pragma intrinsic (_InterlockedIncrement_acq) # pragma intrinsic (_InterlockedIncrement_nf) # pragma intrinsic (_InterlockedIncrement_rel) # ifndef AO_T_IS_INT # pragma intrinsic (_InterlockedExchangeAdd64_acq) # pragma intrinsic (_InterlockedExchangeAdd64_nf) # pragma intrinsic (_InterlockedExchangeAdd64_rel) # pragma intrinsic (_InterlockedDecrement64_acq) # pragma intrinsic (_InterlockedDecrement64_nf) # pragma intrinsic (_InterlockedDecrement64_rel) # pragma intrinsic (_InterlockedIncrement64_acq) # pragma intrinsic (_InterlockedIncrement64_nf) # pragma intrinsic (_InterlockedIncrement64_rel) # endif AO_INLINE AO_t AO_fetch_and_add(volatile AO_t *p, AO_t incr) { # ifdef AO_T_IS_INT return _InterlockedExchangeAdd_nf((long volatile *)p, incr); # else return _InterlockedExchangeAdd64_nf((__int64 volatile *)p, incr); # endif } # define AO_HAVE_fetch_and_add AO_INLINE AO_t AO_fetch_and_add1(volatile AO_t *p) { # ifdef AO_T_IS_INT return _InterlockedIncrement_nf((long volatile *)p) - 1; # else return _InterlockedIncrement64_nf((__int64 volatile *)p) - 1; # endif } # define AO_HAVE_fetch_and_add1 AO_INLINE AO_t AO_fetch_and_sub1(volatile AO_t *p) { # ifdef AO_T_IS_INT return _InterlockedDecrement_nf((long volatile *)p) + 1; # else return _InterlockedDecrement64_nf((__int64 volatile *)p) + 1; # endif } # define AO_HAVE_fetch_and_sub1 AO_INLINE AO_t AO_fetch_and_add_acquire(volatile AO_t *p, AO_t incr) { # ifdef AO_T_IS_INT return _InterlockedExchangeAdd_acq((long volatile *)p, incr); # else return _InterlockedExchangeAdd64_acq((__int64 volatile *)p, incr); # endif } # define AO_HAVE_fetch_and_add_acquire AO_INLINE AO_t AO_fetch_and_add1_acquire(volatile AO_t *p) { # ifdef AO_T_IS_INT return _InterlockedIncrement_acq((long volatile *)p) - 1; # else return _InterlockedIncrement64_acq((__int64 volatile *)p) - 1; # endif } # define 
AO_HAVE_fetch_and_add1_acquire AO_INLINE AO_t AO_fetch_and_sub1_acquire(volatile AO_t *p) { # ifdef AO_T_IS_INT return _InterlockedDecrement_acq((long volatile *)p) + 1; # else return _InterlockedDecrement64_acq((__int64 volatile *)p) + 1; # endif } # define AO_HAVE_fetch_and_sub1_acquire AO_INLINE AO_t AO_fetch_and_add_release(volatile AO_t *p, AO_t incr) { # ifdef AO_T_IS_INT return _InterlockedExchangeAdd_rel((long volatile *)p, incr); # else return _InterlockedExchangeAdd64_rel((__int64 volatile *)p, incr); # endif } # define AO_HAVE_fetch_and_add_release AO_INLINE AO_t AO_fetch_and_add1_release(volatile AO_t *p) { # ifdef AO_T_IS_INT return _InterlockedIncrement_rel((long volatile *)p) - 1; # else return _InterlockedIncrement64_rel((__int64 volatile *)p) - 1; # endif } # define AO_HAVE_fetch_and_add1_release AO_INLINE AO_t AO_fetch_and_sub1_release(volatile AO_t *p) { # ifdef AO_T_IS_INT return _InterlockedDecrement_rel((long volatile *)p) + 1; # else return _InterlockedDecrement64_rel((__int64 volatile *)p) + 1; # endif } # define AO_HAVE_fetch_and_sub1_release # ifndef AO_T_IS_INT AO_INLINE unsigned int AO_int_fetch_and_add(volatile unsigned int *p, unsigned int incr) { return _InterlockedExchangeAdd_nf((long volatile *)p, incr); } # define AO_HAVE_int_fetch_and_add AO_INLINE unsigned int AO_int_fetch_and_add1(volatile unsigned int *p) { return _InterlockedIncrement_nf((long volatile *)p) - 1; } # define AO_HAVE_int_fetch_and_add1 AO_INLINE unsigned int AO_int_fetch_and_sub1(volatile unsigned int *p) { return _InterlockedDecrement_nf((long volatile *)p) + 1; } # define AO_HAVE_int_fetch_and_sub1 AO_INLINE unsigned int AO_int_fetch_and_add_acquire(volatile unsigned int *p, unsigned int incr) { return _InterlockedExchangeAdd_acq((long volatile *)p, incr); } # define AO_HAVE_int_fetch_and_add_acquire AO_INLINE unsigned int AO_int_fetch_and_add1_acquire(volatile unsigned int *p) { return _InterlockedIncrement_acq((long volatile *)p) - 1; } # define 
AO_HAVE_int_fetch_and_add1_acquire AO_INLINE unsigned int AO_int_fetch_and_sub1_acquire(volatile unsigned int *p) { return _InterlockedDecrement_acq((long volatile *)p) + 1; } # define AO_HAVE_int_fetch_and_sub1_acquire AO_INLINE unsigned int AO_int_fetch_and_add_release(volatile unsigned int *p, unsigned int incr) { return _InterlockedExchangeAdd_rel((long volatile *)p, incr); } # define AO_HAVE_int_fetch_and_add_release AO_INLINE unsigned int AO_int_fetch_and_add1_release(volatile unsigned int *p) { return _InterlockedIncrement_rel((long volatile *)p) - 1; } # define AO_HAVE_int_fetch_and_add1_release AO_INLINE unsigned int AO_int_fetch_and_sub1_release(volatile unsigned int *p) { return _InterlockedDecrement_rel((long volatile *)p) + 1; } # define AO_HAVE_int_fetch_and_sub1_release # endif /* !AO_T_IS_INT */ # endif /* !AO_PREFER_GENERALIZED && (_M_ARM || _M_ARM64) */ # pragma intrinsic (_InterlockedCompareExchange8) AO_INLINE unsigned char AO_char_fetch_compare_and_swap_full(volatile unsigned char *addr, unsigned char old_val, unsigned char new_val) { return _InterlockedCompareExchange8((char volatile *)addr, new_val, old_val); } # define AO_HAVE_char_fetch_compare_and_swap_full # if defined(_M_ARM) || defined(_M_ARM64) # pragma intrinsic (_InterlockedCompareExchange_acq) # pragma intrinsic (_InterlockedCompareExchange_nf) # pragma intrinsic (_InterlockedCompareExchange_rel) # ifndef AO_T_IS_INT # pragma intrinsic (_InterlockedCompareExchange64_acq) # pragma intrinsic (_InterlockedCompareExchange64_nf) # pragma intrinsic (_InterlockedCompareExchange64_rel) # endif AO_INLINE AO_t AO_fetch_compare_and_swap(volatile AO_t *addr, AO_t old_val, AO_t new_val) { # ifdef AO_T_IS_INT return _InterlockedCompareExchange_nf((long volatile *)addr, new_val, old_val); # else return (AO_t)_InterlockedCompareExchange64_nf( (__int64 volatile *)addr, new_val, old_val); # endif } # define AO_HAVE_fetch_compare_and_swap AO_INLINE AO_t AO_fetch_compare_and_swap_acquire(volatile AO_t 
*addr, AO_t old_val, AO_t new_val) { # ifdef AO_T_IS_INT return _InterlockedCompareExchange_acq((long volatile *)addr, new_val, old_val); # else return (AO_t)_InterlockedCompareExchange64_acq( (__int64 volatile *)addr, new_val, old_val); # endif } # define AO_HAVE_fetch_compare_and_swap_acquire AO_INLINE AO_t AO_fetch_compare_and_swap_release(volatile AO_t *addr, AO_t old_val, AO_t new_val) { # ifdef AO_T_IS_INT return _InterlockedCompareExchange_rel((long volatile *)addr, new_val, old_val); # else return (AO_t)_InterlockedCompareExchange64_rel( (__int64 volatile *)addr, new_val, old_val); # endif } # define AO_HAVE_fetch_compare_and_swap_release # ifndef AO_T_IS_INT AO_INLINE unsigned int AO_int_fetch_compare_and_swap(volatile unsigned int *addr, unsigned int old_val, unsigned int new_val) { return _InterlockedCompareExchange_nf((long volatile *)addr, new_val, old_val); } # define AO_HAVE_int_fetch_compare_and_swap AO_INLINE unsigned int AO_int_fetch_compare_and_swap_acquire(volatile unsigned int *addr, unsigned int old_val, unsigned int new_val) { return _InterlockedCompareExchange_acq((long volatile *)addr, new_val, old_val); } # define AO_HAVE_int_fetch_compare_and_swap_acquire AO_INLINE unsigned int AO_int_fetch_compare_and_swap_release(volatile unsigned int *addr, unsigned int old_val, unsigned int new_val) { return _InterlockedCompareExchange_rel((long volatile *)addr, new_val, old_val); } # define AO_HAVE_int_fetch_compare_and_swap_release # endif /* !AO_T_IS_INT */ # pragma intrinsic (_InterlockedCompareExchange16_acq) # pragma intrinsic (_InterlockedCompareExchange16_nf) # pragma intrinsic (_InterlockedCompareExchange16_rel) # pragma intrinsic (_InterlockedCompareExchange8_acq) # pragma intrinsic (_InterlockedCompareExchange8_nf) # pragma intrinsic (_InterlockedCompareExchange8_rel) AO_INLINE unsigned short AO_short_fetch_compare_and_swap(volatile unsigned short *addr, unsigned short old_val, unsigned short new_val) { return 
_InterlockedCompareExchange16_nf((short volatile *)addr, new_val, old_val); } # define AO_HAVE_short_fetch_compare_and_swap AO_INLINE unsigned short AO_short_fetch_compare_and_swap_acquire(volatile unsigned short *addr, unsigned short old_val, unsigned short new_val) { return _InterlockedCompareExchange16_acq((short volatile *)addr, new_val, old_val); } # define AO_HAVE_short_fetch_compare_and_swap_acquire AO_INLINE unsigned short AO_short_fetch_compare_and_swap_release(volatile unsigned short *addr, unsigned short old_val, unsigned short new_val) { return _InterlockedCompareExchange16_rel((short volatile *)addr, new_val, old_val); } # define AO_HAVE_short_fetch_compare_and_swap_release AO_INLINE unsigned char AO_char_fetch_compare_and_swap(volatile unsigned char *addr, unsigned char old_val, unsigned char new_val) { return _InterlockedCompareExchange8_nf((char volatile *)addr, new_val, old_val); } # define AO_HAVE_char_fetch_compare_and_swap AO_INLINE unsigned char AO_char_fetch_compare_and_swap_acquire(volatile unsigned char *addr, unsigned char old_val, unsigned char new_val) { return _InterlockedCompareExchange8_acq((char volatile *)addr, new_val, old_val); } # define AO_HAVE_char_fetch_compare_and_swap_acquire AO_INLINE unsigned char AO_char_fetch_compare_and_swap_release(volatile unsigned char *addr, unsigned char old_val, unsigned char new_val) { return _InterlockedCompareExchange8_rel((char volatile *)addr, new_val, old_val); } # define AO_HAVE_char_fetch_compare_and_swap_release # endif /* _M_ARM || _M_ARM64 */ # if !defined(AO_PREFER_GENERALIZED) && !defined(_M_ARM) # pragma intrinsic (_InterlockedExchangeAdd16) # pragma intrinsic (_InterlockedExchangeAdd8) AO_INLINE unsigned char AO_char_fetch_and_add_full(volatile unsigned char *p, unsigned char incr) { return _InterlockedExchangeAdd8((char volatile *)p, incr); } # define AO_HAVE_char_fetch_and_add_full AO_INLINE unsigned short AO_short_fetch_and_add_full(volatile unsigned short *p, unsigned short incr) 
{ return _InterlockedExchangeAdd16((short volatile *)p, incr); } # define AO_HAVE_short_fetch_and_add_full # if defined(_M_ARM64) # pragma intrinsic (_InterlockedExchangeAdd16_acq) # pragma intrinsic (_InterlockedExchangeAdd16_nf) # pragma intrinsic (_InterlockedExchangeAdd16_rel) # pragma intrinsic (_InterlockedExchangeAdd8_acq) # pragma intrinsic (_InterlockedExchangeAdd8_nf) # pragma intrinsic (_InterlockedExchangeAdd8_rel) AO_INLINE unsigned char AO_char_fetch_and_add(volatile unsigned char *p, unsigned char incr) { return _InterlockedExchangeAdd8_nf((char volatile *)p, incr); } # define AO_HAVE_char_fetch_and_add AO_INLINE unsigned short AO_short_fetch_and_add(volatile unsigned short *p, unsigned short incr) { return _InterlockedExchangeAdd16_nf((short volatile *)p, incr); } # define AO_HAVE_short_fetch_and_add AO_INLINE unsigned char AO_char_fetch_and_add_acquire(volatile unsigned char *p, unsigned char incr) { return _InterlockedExchangeAdd8_acq((char volatile *)p, incr); } # define AO_HAVE_char_fetch_and_add_acquire AO_INLINE unsigned short AO_short_fetch_and_add_acquire(volatile unsigned short *p, unsigned short incr) { return _InterlockedExchangeAdd16_acq((short volatile *)p, incr); } # define AO_HAVE_short_fetch_and_add_acquire AO_INLINE unsigned char AO_char_fetch_and_add_release(volatile unsigned char *p, unsigned char incr) { return _InterlockedExchangeAdd8_rel((char volatile *)p, incr); } # define AO_HAVE_char_fetch_and_add_release AO_INLINE unsigned short AO_short_fetch_and_add_release(volatile unsigned short *p, unsigned short incr) { return _InterlockedExchangeAdd16_rel((short volatile *)p, incr); } # define AO_HAVE_short_fetch_and_add_release # endif /* _M_ARM64 */ # endif /* !AO_PREFER_GENERALIZED && !_M_ARM */ # if !defined(_M_ARM) || _M_ARM >= 6 # include "../test_and_set_t_is_char.h" # pragma intrinsic (_InterlockedExchange8) AO_INLINE AO_TS_VAL_t AO_test_and_set_full(volatile AO_TS_t *addr) { return (AO_TS_VAL_t)(_InterlockedExchange8((char 
volatile *)addr, (AO_TS_t)AO_TS_SET) & 0xff); /* Note: bitwise "and 0xff" is applied to the result because cast */ /* to unsigned char does not work properly (for a reason) if /J */ /* option is passed to the MS VC compiler. */ } # define AO_HAVE_test_and_set_full # endif /* !_M_ARM || _M_ARM >= 6 */ # if _M_ARM >= 6 || defined(_M_ARM64) # pragma intrinsic (_InterlockedExchange8_acq) # pragma intrinsic (_InterlockedExchange8_nf) # pragma intrinsic (_InterlockedExchange8_rel) AO_INLINE AO_TS_VAL_t AO_test_and_set(volatile AO_TS_t *addr) { return (AO_TS_VAL_t)(_InterlockedExchange8_nf((char volatile *)addr, (AO_TS_t)AO_TS_SET) & 0xff); } # define AO_HAVE_test_and_set AO_INLINE AO_TS_VAL_t AO_test_and_set_acquire(volatile AO_TS_t *addr) { return (AO_TS_VAL_t)(_InterlockedExchange8_acq((char volatile *)addr, (AO_TS_t)AO_TS_SET) & 0xff); } # define AO_HAVE_test_and_set_acquire AO_INLINE AO_TS_VAL_t AO_test_and_set_release(volatile AO_TS_t *addr) { return (AO_TS_VAL_t)(_InterlockedExchange8_rel((char volatile *)addr, (AO_TS_t)AO_TS_SET) & 0xff); } # define AO_HAVE_test_and_set_release # endif /* _M_ARM >= 6 || _M_ARM64 */ #endif /* _MSC_VER >= 1800 */ papi-papi-7-2-0-t/src/atomic_ops/sysdeps/msftc/x86.h000066400000000000000000000127441502707512200222000ustar00rootroot00000000000000/* * Copyright (c) 2003 Hewlett-Packard Development Company, L.P. * Copyright (c) 2009-2021 Ivan Maidanski * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include "../all_aligned_atomic_load_store.h" #if !defined(AO_ASSUME_VISTA) && _MSC_VER >= 1910 /* Visual Studio 2017 (15.0) discontinued support of Windows XP. */ /* We assume Windows Server 2003, Vista or later. */ # define AO_ASSUME_VISTA #endif #if !defined(AO_ASSUME_WINDOWS98) \ && (defined(AO_ASSUME_VISTA) || _MSC_VER >= 1400) /* Visual Studio 2005 (MS VC++ 8.0) discontinued support of Windows 95. */ # define AO_ASSUME_WINDOWS98 #endif #if !defined(AO_USE_PENTIUM4_INSTRS) && _M_IX86_FP >= 2 /* SSE2 */ /* "mfence" is a part of SSE2 set (introduced on Intel Pentium 4). */ # define AO_USE_PENTIUM4_INSTRS #endif #define AO_T_IS_INT #ifndef AO_USE_INTERLOCKED_INTRINSICS /* _Interlocked primitives (Inc, Dec, Xchg, Add) are always available */ # define AO_USE_INTERLOCKED_INTRINSICS #endif #include "common32_defs.h" /* As far as we can tell, the lfence and sfence instructions are not */ /* currently needed or useful for cached memory accesses. */ /* Unfortunately mfence doesn't exist everywhere. */ /* IsProcessorFeaturePresent(PF_COMPARE_EXCHANGE128) is */ /* probably a conservative test for it? */ #if defined(AO_USE_PENTIUM4_INSTRS) AO_INLINE void AO_nop_full(void) { __asm { mfence } } #define AO_HAVE_nop_full #else /* We could use the cpuid instruction. But that seems to be slower */ /* than the default implementation based on test_and_set_full. Thus */ /* we omit that bit of misinformation here. 
*/ #endif #if !defined(AO_NO_ASM_XADD) && !defined(AO_HAVE_char_fetch_and_add_full) AO_INLINE unsigned char AO_char_fetch_and_add_full(volatile unsigned char *p, unsigned char incr) { __asm { mov al, incr mov ebx, p lock xadd byte ptr [ebx], al } /* Ignore possible "missing return value" warning here. */ } # define AO_HAVE_char_fetch_and_add_full AO_INLINE unsigned short AO_short_fetch_and_add_full(volatile unsigned short *p, unsigned short incr) { __asm { mov ax, incr mov ebx, p lock xadd word ptr [ebx], ax } /* Ignore possible "missing return value" warning here. */ } # define AO_HAVE_short_fetch_and_add_full #endif /* !AO_NO_ASM_XADD */ #ifndef AO_HAVE_test_and_set_full # include "../test_and_set_t_is_char.h" AO_INLINE AO_TS_VAL_t AO_test_and_set_full(volatile AO_TS_t *addr) { __asm { mov eax,0xff ; /* AO_TS_SET */ mov ebx,addr ; xchg byte ptr [ebx],al ; } /* Ignore possible "missing return value" warning here. */ } # define AO_HAVE_test_and_set_full #endif #if defined(_WIN64) && !defined(CPPCHECK) # error wrong architecture #endif #ifdef AO_ASSUME_VISTA # include "../standard_ao_double_t.h" /* Reading or writing a quadword aligned on a 64-bit boundary is */ /* always carried out atomically (requires at least a Pentium). */ # define AO_ACCESS_double_CHECK_ALIGNED # include "../loadstore/double_atomic_load_store.h" /* Whenever we run on a Pentium class machine, we have that certain */ /* function. */ # pragma intrinsic (_InterlockedCompareExchange64) /* Returns nonzero if the comparison succeeded. 
*/ AO_INLINE int AO_double_compare_and_swap_full(volatile AO_double_t *addr, AO_double_t old_val, AO_double_t new_val) { AO_ASSERT_ADDR_ALIGNED(addr); return (double_ptr_storage)_InterlockedCompareExchange64( (__int64 volatile *)addr, new_val.AO_whole /* exchange */, old_val.AO_whole) == old_val.AO_whole; } # define AO_HAVE_double_compare_and_swap_full #endif /* AO_ASSUME_VISTA */ /* Real X86 implementations, except for some old WinChips, appear */ /* to enforce ordering between memory operations, EXCEPT that a later */ /* read can pass earlier writes, presumably due to the visible */ /* presence of store buffers. */ /* We ignore both the WinChips, and the fact that the official specs */ /* seem to be much weaker (and arguably too weak to be usable). */ #include "../ordered_except_wr.h" papi-papi-7-2-0-t/src/atomic_ops/sysdeps/msftc/x86_64.h000066400000000000000000000122141502707512200225010ustar00rootroot00000000000000/* * Copyright (c) 2003-2011 Hewlett-Packard Development Company, L.P. * Copyright (c) 2009-2021 Ivan Maidanski * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include "../all_aligned_atomic_load_store.h" /* Real X86 implementations appear */ /* to enforce ordering between memory operations, EXCEPT that a later */ /* read can pass earlier writes, presumably due to the visible */ /* presence of store buffers. */ /* We ignore the fact that the official specs */ /* seem to be much weaker (and arguably too weak to be usable). */ #include "../ordered_except_wr.h" #ifndef AO_ASSUME_WINDOWS98 /* CAS is always available */ # define AO_ASSUME_WINDOWS98 #endif #ifndef AO_USE_INTERLOCKED_INTRINSICS # define AO_USE_INTERLOCKED_INTRINSICS #endif #include "common32_defs.h" #ifdef AO_ASM_X64_AVAILABLE #if _MSC_VER < 1800 AO_INLINE unsigned char AO_char_fetch_and_add_full(volatile unsigned char *p, unsigned char incr) { __asm { mov al, incr mov rbx, p lock xadd byte ptr [rbx], al } } # define AO_HAVE_char_fetch_and_add_full AO_INLINE unsigned short AO_short_fetch_and_add_full(volatile unsigned short *p, unsigned short incr) { __asm { mov ax, incr mov rbx, p lock xadd word ptr [rbx], ax } } # define AO_HAVE_short_fetch_and_add_full #endif /* _MSC_VER < 1800 */ /* As far as we can tell, the lfence and sfence instructions are not */ /* currently needed or useful for cached memory accesses. */ AO_INLINE void AO_nop_full(void) { /* Note: "mfence" (SSE2) is supported on all x86_64/amd64 chips. 
*/ __asm { mfence } } # define AO_HAVE_nop_full # ifndef AO_HAVE_test_and_set_full # include "../test_and_set_t_is_char.h" AO_INLINE AO_TS_VAL_t AO_test_and_set_full(volatile AO_TS_t *addr) { __asm { mov rax,AO_TS_SET ; mov rbx,addr ; xchg byte ptr [rbx],al ; } } # define AO_HAVE_test_and_set_full # endif #endif /* AO_ASM_X64_AVAILABLE */ #ifndef AO_HAVE_test_and_set_full # include "../test_and_set_t_is_ao_t.h" /* AO_test_and_set_full() is emulated using word-wide CAS. */ #endif #ifdef AO_CMPXCHG16B_AVAILABLE # if _MSC_VER >= 1500 # include "../standard_ao_double_t.h" # pragma intrinsic (_InterlockedCompareExchange128) AO_INLINE int AO_compare_double_and_swap_double_full(volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2) { __int64 comparandResult[2]; AO_ASSERT_ADDR_ALIGNED(addr); comparandResult[0] = old_val1; /* low */ comparandResult[1] = old_val2; /* high */ return _InterlockedCompareExchange128((volatile __int64 *)addr, new_val2 /* high */, new_val1 /* low */, comparandResult); } # define AO_HAVE_compare_double_and_swap_double_full # elif defined(AO_ASM_X64_AVAILABLE) # include "../standard_ao_double_t.h" /* If there is no intrinsic _InterlockedCompareExchange128 then we */ /* need basically what's given below. */ AO_INLINE int AO_compare_double_and_swap_double_full(volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2) { __asm { mov rdx,QWORD PTR [old_val2] ; mov rax,QWORD PTR [old_val1] ; mov rcx,QWORD PTR [new_val2] ; mov rbx,QWORD PTR [new_val1] ; lock cmpxchg16b [addr] ; setz rax ; } } # define AO_HAVE_compare_double_and_swap_double_full # endif /* AO_ASM_X64_AVAILABLE && (_MSC_VER < 1500) */ #endif /* AO_CMPXCHG16B_AVAILABLE */ papi-papi-7-2-0-t/src/atomic_ops/sysdeps/ordered.h000066400000000000000000000025711502707512200220600ustar00rootroot00000000000000/* * Copyright (c) 2003 Hewlett-Packard Development Company, L.P. 
* * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* These are common definitions for architectures that provide */ /* processor ordered memory operations. */ #include "ordered_except_wr.h" AO_INLINE void AO_nop_full(void) { AO_compiler_barrier(); } #define AO_HAVE_nop_full papi-papi-7-2-0-t/src/atomic_ops/sysdeps/ordered_except_wr.h000066400000000000000000000033651502707512200241420ustar00rootroot00000000000000/* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. 
* * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* * These are common definitions for architectures that provide processor * ordered memory operations except that a later read may pass an * earlier write. Real x86 implementations seem to be in this category, * except apparently for some IDT WinChips, which we ignore. */ #include "read_ordered.h" AO_INLINE void AO_nop_write(void) { /* AO_nop_write implementation is the same as of AO_nop_read. */ AO_compiler_barrier(); /* sfence according to Intel docs. Pentium 3 and up. */ /* Unnecessary for cached accesses? */ } #define AO_HAVE_nop_write #include "loadstore/ordered_stores_only.h" papi-papi-7-2-0-t/src/atomic_ops/sysdeps/read_ordered.h000066400000000000000000000030361502707512200230500ustar00rootroot00000000000000/* * Copyright (c) 2003 by Hewlett-Packard Company. All rights reserved. 
* * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* * These are common definitions for architectures that provide processor * ordered memory operations except that a later read may pass an * earlier write. Real x86 implementations seem to be in this category, * except apparently for some IDT WinChips, which we ignore. */ AO_INLINE void AO_nop_read(void) { AO_compiler_barrier(); } #define AO_HAVE_nop_read #include "loadstore/ordered_loads_only.h" papi-papi-7-2-0-t/src/atomic_ops/sysdeps/standard_ao_double_t.h000066400000000000000000000101341502707512200245620ustar00rootroot00000000000000/* * Copyright (c) 2004-2011 Hewlett-Packard Development Company, L.P. 
* Copyright (c) 2012-2021 Ivan Maidanski * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* For 64-bit systems, we expect the double type to hold two int64's. */ #if ((defined(__x86_64__) && defined(AO_GCC_ATOMIC_TEST_AND_SET)) \ || defined(__aarch64__)) && !defined(__ILP32__) /* x86-64: __m128 is not applicable to atomic intrinsics. */ # if AO_GNUC_PREREQ(4, 7) || AO_CLANG_PREREQ(3, 6) # pragma GCC diagnostic push /* Suppress warning about __int128 type. */ # if defined(__clang__) || AO_GNUC_PREREQ(6, 4) # pragma GCC diagnostic ignored "-Wpedantic" # else /* GCC before ~4.8 does not accept "-Wpedantic" quietly. */ # pragma GCC diagnostic ignored "-pedantic" # endif typedef unsigned __int128 double_ptr_storage; # pragma GCC diagnostic pop # else /* pragma diagnostic is not supported */ typedef unsigned __int128 double_ptr_storage; # endif #elif defined(_M_ARM64) && defined(_MSC_VER) /* __int128 does not seem to be available. 
(The MSVC ARM64 fallback below uses a 16-byte-aligned pair of 64-bit words because `__int128` is not available there.)
*/ typedef __declspec(align(16)) unsigned __int64 double_ptr_storage[2]; #elif ((defined(__x86_64__) && AO_GNUC_PREREQ(4, 0)) || defined(_WIN64)) \ && !defined(__ILP32__) /* x86-64 (except for x32): __m128 serves as a placeholder which also */ /* requires the compiler to align it on 16-byte boundary (as required */ /* by cmpxchg16b). */ /* Similar things could be done for PPC 64-bit using a VMX data type. */ # include <xmmintrin.h> typedef __m128 double_ptr_storage; #elif defined(_WIN32) && !defined(__GNUC__) typedef unsigned __int64 double_ptr_storage; #elif defined(__i386__) && defined(__GNUC__) typedef unsigned long long double_ptr_storage __attribute__((__aligned__(8))); #else typedef unsigned long long double_ptr_storage; #endif # define AO_HAVE_DOUBLE_PTR_STORAGE typedef union { struct { AO_t AO_v1; AO_t AO_v2; } AO_parts; /* Note that AO_v1 corresponds to the low or the high part of */ /* AO_whole depending on the machine endianness. */ double_ptr_storage AO_whole; /* AO_whole is now (starting from v7.3alpha3) the 2nd element */ /* of this union to make AO_DOUBLE_T_INITIALIZER portable */ /* (because __m128 definition could vary from a primitive type */ /* to a structure or array/vector). */ } AO_double_t; #define AO_HAVE_double_t /* Note: AO_double_t volatile variables are not intended to be local */ /* ones (at least those which are passed to AO double-wide primitives */ /* as the first argument), otherwise it is the client responsibility to */ /* ensure they have double-word alignment. */ /* Dummy declaration as a compile-time assertion for AO_double_t size. */ struct AO_double_t_size_static_assert { char dummy[sizeof(AO_double_t) == 2 * sizeof(AO_t) ?
1 : -1]; }; #define AO_DOUBLE_T_INITIALIZER { { (AO_t)0, (AO_t)0 } } #define AO_val1 AO_parts.AO_v1 #define AO_val2 AO_parts.AO_v2 papi-papi-7-2-0-t/src/atomic_ops/sysdeps/sunc/000077500000000000000000000000001502707512200212265ustar00rootroot00000000000000papi-papi-7-2-0-t/src/atomic_ops/sysdeps/sunc/sparc.S000066400000000000000000000002011502707512200224530ustar00rootroot00000000000000 .seg "text" .globl AO_test_and_set_full AO_test_and_set_full: retl ldstub [%o0],%o0 papi-papi-7-2-0-t/src/atomic_ops/sysdeps/sunc/sparc.h000066400000000000000000000032231502707512200225070ustar00rootroot00000000000000/* * Copyright (c) 2004 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include "../all_atomic_load_store.h" /* Real SPARC code uses TSO: */ #include "../ordered_except_wr.h" /* Test_and_set location is just a byte. 
*/ #include "../test_and_set_t_is_char.h" #ifdef __cplusplus extern "C" { #endif extern AO_TS_VAL_t AO_test_and_set_full(volatile AO_TS_t *addr); /* Implemented in separate .S file, for now. */ #define AO_HAVE_test_and_set_full /* TODO: Like the gcc version, extend this for V8 and V9. */ #ifdef __cplusplus } /* extern "C" */ #endif papi-papi-7-2-0-t/src/atomic_ops/sysdeps/sunc/x86.h000066400000000000000000000175031502707512200220320ustar00rootroot00000000000000/* * Copyright (c) 1991-1994 by Xerox Corporation. All rights reserved. * Copyright (c) 1996-1999 by Silicon Graphics. All rights reserved. * Copyright (c) 1999-2003 by Hewlett-Packard Company. All rights reserved. * Copyright (c) 2009-2016 Ivan Maidanski * * THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY EXPRESSED * OR IMPLIED. ANY USE IS AT YOUR OWN RISK. * * Permission is hereby granted to use or copy this program * for any purpose, provided the above notices are retained on all copies. * Permission to modify the code and to distribute modified code is granted, * provided the above notices are retained, and a notice that the code was * modified is included with the above copyright notice. * * Some of the machine specific code was borrowed from our GC distribution. */ /* The following really assume we have a 486 or better. */ #include "../all_aligned_atomic_load_store.h" #include "../test_and_set_t_is_char.h" #if !defined(AO_USE_PENTIUM4_INSTRS) && !defined(__i386) /* "mfence" (SSE2) is supported on all x86_64/amd64 chips. */ # define AO_USE_PENTIUM4_INSTRS #endif #if defined(AO_USE_PENTIUM4_INSTRS) AO_INLINE void AO_nop_full(void) { __asm__ __volatile__ ("mfence" : : : "memory"); } # define AO_HAVE_nop_full #else /* We could use the cpuid instruction. But that seems to be slower */ /* than the default implementation based on test_and_set_full. Thus */ /* we omit that bit of misinformation here. 
*/ #endif /* !AO_USE_PENTIUM4_INSTRS */ /* As far as we can tell, the lfence and sfence instructions are not */ /* currently needed or useful for cached memory accesses. */ /* Really only works for 486 and later */ #ifndef AO_PREFER_GENERALIZED AO_INLINE AO_t AO_fetch_and_add_full (volatile AO_t *p, AO_t incr) { AO_t result; __asm__ __volatile__ ("lock; xadd %0, %1" : "=r" (result), "+m" (*p) : "0" (incr) : "memory"); return result; } # define AO_HAVE_fetch_and_add_full #endif /* !AO_PREFER_GENERALIZED */ AO_INLINE unsigned char AO_char_fetch_and_add_full (volatile unsigned char *p, unsigned char incr) { unsigned char result; __asm__ __volatile__ ("lock; xaddb %0, %1" : "=q" (result), "+m" (*p) : "0" (incr) : "memory"); return result; } #define AO_HAVE_char_fetch_and_add_full AO_INLINE unsigned short AO_short_fetch_and_add_full (volatile unsigned short *p, unsigned short incr) { unsigned short result; __asm__ __volatile__ ("lock; xaddw %0, %1" : "=r" (result), "+m" (*p) : "0" (incr) : "memory"); return result; } #define AO_HAVE_short_fetch_and_add_full #ifndef AO_PREFER_GENERALIZED AO_INLINE void AO_and_full (volatile AO_t *p, AO_t value) { __asm__ __volatile__ ("lock; and %1, %0" : "+m" (*p) : "r" (value) : "memory"); } # define AO_HAVE_and_full AO_INLINE void AO_or_full (volatile AO_t *p, AO_t value) { __asm__ __volatile__ ("lock; or %1, %0" : "+m" (*p) : "r" (value) : "memory"); } # define AO_HAVE_or_full AO_INLINE void AO_xor_full (volatile AO_t *p, AO_t value) { __asm__ __volatile__ ("lock; xor %1, %0" : "+m" (*p) : "r" (value) : "memory"); } # define AO_HAVE_xor_full #endif /* !AO_PREFER_GENERALIZED */ AO_INLINE AO_TS_VAL_t AO_test_and_set_full (volatile AO_TS_t *addr) { AO_TS_t oldval; /* Note: the "xchg" instruction does not need a "lock" prefix */ __asm__ __volatile__ ("xchg %b0, %1" : "=q" (oldval), "+m" (*addr) : "0" (0xff) : "memory"); return (AO_TS_VAL_t)oldval; } #define AO_HAVE_test_and_set_full #ifndef AO_GENERALIZE_ASM_BOOL_CAS /* Returns nonzero 
if the comparison succeeded. */ AO_INLINE int AO_compare_and_swap_full(volatile AO_t *addr, AO_t old, AO_t new_val) { char result; __asm__ __volatile__ ("lock; cmpxchg %2, %0; setz %1" : "+m" (*addr), "=a" (result) : "r" (new_val), "a" (old) : "memory"); return (int) result; } # define AO_HAVE_compare_and_swap_full #endif /* !AO_GENERALIZE_ASM_BOOL_CAS */ AO_INLINE AO_t AO_fetch_compare_and_swap_full(volatile AO_t *addr, AO_t old_val, AO_t new_val) { AO_t fetched_val; __asm__ __volatile__ ("lock; cmpxchg %2, %0" : "+m" (*addr), "=a" (fetched_val) : "r" (new_val), "a" (old_val) : "memory"); return fetched_val; } #define AO_HAVE_fetch_compare_and_swap_full #if defined(__i386) # ifndef AO_NO_CMPXCHG8B # include "../standard_ao_double_t.h" /* Reading or writing a quadword aligned on a 64-bit boundary is */ /* always carried out atomically (requires at least a Pentium). */ # define AO_ACCESS_double_CHECK_ALIGNED # include "../loadstore/double_atomic_load_store.h" /* Returns nonzero if the comparison succeeded. */ /* Really requires at least a Pentium. 
*/ AO_INLINE int AO_compare_double_and_swap_double_full(volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2) { AO_t dummy; /* an output for clobbered edx */ char result; __asm__ __volatile__ ("lock; cmpxchg8b %0; setz %1" : "+m" (*addr), "=a" (result), "=d" (dummy) : "d" (old_val2), "a" (old_val1), "c" (new_val2), "b" (new_val1) : "memory"); return (int) result; } # define AO_HAVE_compare_double_and_swap_double_full # endif /* !AO_NO_CMPXCHG8B */ # define AO_T_IS_INT #else /* x64 */ AO_INLINE unsigned int AO_int_fetch_and_add_full (volatile unsigned int *p, unsigned int incr) { unsigned int result; __asm__ __volatile__ ("lock; xaddl %0, %1" : "=r" (result), "+m" (*p) : "0" (incr) : "memory"); return result; } # define AO_HAVE_int_fetch_and_add_full # ifdef AO_CMPXCHG16B_AVAILABLE # include "../standard_ao_double_t.h" /* Older AMD Opterons are missing this instruction (SIGILL should */ /* be thrown in this case). */ AO_INLINE int AO_compare_double_and_swap_double_full (volatile AO_double_t *addr, AO_t old_val1, AO_t old_val2, AO_t new_val1, AO_t new_val2) { AO_t dummy; char result; __asm__ __volatile__ ("lock; cmpxchg16b %0; setz %1" : "+m" (*addr), "=a" (result), "=d" (dummy) : "d" (old_val2), "a" (old_val1), "c" (new_val2), "b" (new_val1) : "memory"); return (int) result; } # define AO_HAVE_compare_double_and_swap_double_full # endif /* !AO_CMPXCHG16B_AVAILABLE */ #endif /* x64 */ /* Real X86 implementations, except for some old 32-bit WinChips, */ /* appear to enforce ordering between memory operations, EXCEPT that */ /* a later read can pass earlier writes, presumably due to the visible */ /* presence of store buffers. */ /* We ignore both the WinChips and the fact that the official specs */ /* seem to be much weaker (and arguably too weak to be usable). 
*/ #include "../ordered_except_wr.h" papi-papi-7-2-0-t/src/atomic_ops/sysdeps/test_and_set_t_is_ao_t.h000066400000000000000000000031341502707512200251240ustar00rootroot00000000000000/* * Copyright (c) 2004 Hewlett-Packard Development Company, L.P. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* * These are common definitions for architectures on which test_and_set * operates on pointer-sized quantities, the "clear" value contains * all zeroes, and the "set" value contains only one lowest bit set. * This can be used if test_and_set is synthesized from compare_and_swap. */ typedef enum {AO_TS_clear = 0, AO_TS_set = 1} AO_TS_val; #define AO_TS_VAL_t AO_TS_val #define AO_TS_CLEAR AO_TS_clear #define AO_TS_SET AO_TS_set #define AO_TS_t AO_t #define AO_AO_TS_T 1 papi-papi-7-2-0-t/src/atomic_ops/sysdeps/test_and_set_t_is_char.h000066400000000000000000000035671502707512200251310ustar00rootroot00000000000000/* * Copyright (c) 2004 Hewlett-Packard Development Company, L.P. 
* * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell * copies of the Software, and to permit persons to whom the Software is * furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* * These are common definitions for architectures on which test_and_set * operates on byte sized quantities, the "clear" value contains * all zeroes, and the "set" value contains all ones typically. 
*/ #ifndef AO_GCC_ATOMIC_TEST_AND_SET # define AO_TS_SET_TRUEVAL 0xff #elif defined(__GCC_ATOMIC_TEST_AND_SET_TRUEVAL) \ && !defined(AO_PREFER_GENERALIZED) # define AO_TS_SET_TRUEVAL __GCC_ATOMIC_TEST_AND_SET_TRUEVAL #else # define AO_TS_SET_TRUEVAL 1 /* true */ #endif typedef enum { AO_BYTE_TS_clear = 0, AO_BYTE_TS_set = AO_TS_SET_TRUEVAL } AO_BYTE_TS_val; #define AO_TS_VAL_t AO_BYTE_TS_val #define AO_TS_CLEAR AO_BYTE_TS_clear #define AO_TS_SET AO_BYTE_TS_set #define AO_TS_t unsigned char #define AO_CHAR_TS_T 1 #undef AO_TS_SET_TRUEVAL papi-papi-7-2-0-t/src/buildbot_configure_with_components.sh000077500000000000000000000011021502707512200241250ustar00rootroot00000000000000#!/bin/sh # this is the configuration that goes into a fedora rpm #./configure --with-debug --with-components="coretemp example infiniband lustre mx net" $1 if [ -f components/cuda/Makefile.cuda ]; then if [ -f components/nvml/Makefile.nvml ]; then ./configure --with-components="appio coretemp example lustre micpower mx net rapl stealtime cuda nvml" $1 else ./configure --with-components="appio coretemp example lustre micpower mx net rapl stealtime cuda" $1 fi else ./configure --with-components="appio coretemp example lustre micpower mx net rapl stealtime" $1 fi papi-papi-7-2-0-t/src/components/000077500000000000000000000000001502707512200166145ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/Makefile_comp_tests000066400000000000000000000013231502707512200225130ustar00rootroot00000000000000UTILOBJS= ../../../testlib/libtestlib.a DOLOOPS= ../../../testlib/do_loops.o INCLUDE = -I../../../testlib -I../../.. -I. LIBRARY = -L../../../ -lpapi PAPILIB = $(LIBRARY) tests: $(NAME)_tests install: @echo "$(NAME) tests (DATADIR) being installed in: \"$(DATADIR)\""; -mkdir -p $(DATADIR)/$(NAME)/tests -chmod go+rx $(DATADIR) -chmod go+rx $(DATADIR)/$(NAME)/tests -find . -perm -100 -type f -exec cp {} $(DATADIR)/$(NAME)/tests \; -chmod go+rx $(DATADIR)/$(NAME)/* -find . 
-name "*.[ch]" -type f -exec cp {} $(DATADIR)/$(NAME)/tests \; -cp Makefile $(DATADIR)/$(NAME)/tests -cp ../../Makefile_comp_tests.target $(DATADIR)/Makefile_comp_tests clean: distclean clobber: clean rm -f Makefile_comp_tests.target papi-papi-7-2-0-t/src/components/Makefile_comp_tests.target.in000066400000000000000000000024161502707512200244110ustar00rootroot00000000000000PACKAGE_TARNAME = @PACKAGE_TARNAME@ exec_prefix = @exec_prefix@ prefix = @prefix@ datarootdir = @datarootdir@ datadir = ../../.. testlibdir = $(datadir)/testlib validationlibdir = $(datadir)/validation_tests INCLUDE = -I. -I$(datadir) -I$(testlibdir) -I$(validationlibdir) -I@includedir@ LIBDIR = @libdir@ LIBRARY = @LIBRARY@ SHLIB = @SHLIB@ PAPILIB = $(datadir)/@LINKLIB@ TESTLIB = $(testlibdir)/libtestlib.a LDFLAGS = @LDFLAGS@ @LDL@ CC = @CC@ MPICC = @MPICC@ F77 = @F77@ CC_R = @CC_R@ CFLAGS = @CFLAGS@ OPTFLAGS= @OPTFLAGS@ TOPTFLAGS= @TOPTFLAGS@ OMPCFLGS = @OMPCFLGS@ UTILOBJS = $(TESTLIB) BUILD_SHARED_LIB = @BUILD_SHARED_LIB@ BUILD_LIBSDE_SHARED = @BUILD_LIBSDE_SHARED@ BUILD_LIBSDE_STATIC = @BUILD_LIBSDE_STATIC@ NO_MPI_TESTS = @NO_MPI_TESTS@ NVPPC64LEFLAGS = @NVPPC64LEFLAGS@ ENABLE_FORTRAN_TESTS = @ENABLE_FORTRAN_TESTS@ tests: $(NAME)_tests install: @echo "$(NAME) tests (DATADIR) being installed in: \"$(DATADIR)\""; -mkdir -p $(DATADIR)/$(NAME)/tests -chmod go+rx $(DATADIR) -chmod go+rx $(DATADIR)/$(NAME)/tests -find . -perm -100 -type f -exec cp {} $(DATADIR)/$(NAME)/tests \; -chmod go+rx $(DATADIR)/$(NAME)/* -find . 
-name "*.[ch]" -type f -exec cp {} $(DATADIR)/$(NAME)/tests \; -cp Makefile $(DATADIR)/$(NAME)/tests -cp ../../Makefile_comp_tests $(DATADIR) papi-papi-7-2-0-t/src/components/README000066400000000000000000000133301502707512200174740ustar00rootroot00000000000000/** * @file: README * @author: Brian Sheely * bsheely@eecs.utk.edu * @defgroup papi_components Components * @brief Component Readme file */ /** @page component_readme Component Readme @section Creating New Components The first step in creating a new component is to create a new directory inside the components directory. The naming convention is to use lower case letters for the directory name. At a minimum, this directory will contain all header files and source code required to build the component along with a Rules file which contains the build rules and compiler settings specific to that component. The Rules file must be named using the format Rules.x where x is the name of the directory. There are no restrictions on the naming of header or source files. If the component requires user input for any of the compiler settings, then the component directory will also contain the files required to generate a configure script using autoconf. The configure script can be used to generate a Makefile which the Rules file will include. The file configure.in is required in order to generate configure. It should specify that the Makefile that gets generated is named Makefile.x where x is the name of the component directory. Finally, configure also needs an input file to create the Makefile. That file must be named Makefile.x.in where x is the name of the component directory. The following comments apply to components that are under source control. Although configure is generated, it requires the correct version of autoconf. For that reason, configure should be placed under source control. The generated Makefile should not be placed under source control. 
In summary, the additional files required for configuration based on user input are: configure.in, configure (generated by autoconf), Makefile.x.in, and Makefile.x (generated by configure) where x is the name of the component directory. There is one final very important naming convention that applies to components. The array of function pointers that the component defines must use the naming convention papi_vector_t _x_vector where x is the name of the component directory. Adding tests to the components: ------------------------------- In order to add tests to a component that will be compiled together with PAPI when typing 'make' (as well as cleaned up when 'make clean' or 'make clobber' is typed and installed when 'make install-all' or 'make install-tests' is called), the following steps need to be carried out: 1. create a directory with name 'tests' in the specific component directory 2. add your test files and a Makefile to the 'tests' directory (see the example test and Makefile in components/example/tests) 3. The components/< component >/tests/Makefile has to have a rule with the name '< component >_tests'; e.g. for tests added to the example component, the name of the rule would be 'example_tests'. See: TESTS = HelloWorld example_tests: $(TESTS) 4. Include components/Makefile_comp_tests in your component test Makefile (see components/example/tests/Makefile for more details) 5. You may also define 'clean' and/or 'install' targets (as shown in the example) which will be called during those parts of the build. If these targets are missing it will just print a message reporting the missing target and continue. NOTE: there is no need to modify any PAPI code other than adding your tests and a Makefile to your component and following steps 1 to 4 listed above. @section Component Specific Information Some components under source control have additional information specific to their build process or operation. 
That information can be found in a README file inside the component directory. If the README doesn't exist, no special information is necessary. */ /*----------------------------------------------------------------------------- Notes on components as of February 2019 release 5.7.0.0. appio bgpm IBM Blue Gene Q specific. coretemp Linux HW Monitor, temperature and other info. coretemp_freebsd FREEBSD version of HW Monitor, temperature and other info. cuda CUDA events and metrics via NVIDIA CUpti interfaces. emon IBM Blue Gene Q specific. example Simple example component. host_micpower Requires its own configure and Knights Corner (KNC) architecture. infiniband Linux Infiniband stats using the sysfs interface. For OFED version < 1.4. infiniband_umad Requires its own configure. Infiniband stats for OFED version >= 1.4. libmsr Requires its own configure and libmsr from LLNL, for power (RAPL) read/write. lmsensors Requires its own configure. lustre lustre filesystem stats micpower For reading power on Intel XEON PHI. (MIC) mx Myricom MX (Myrinet Express) statistics. net Linux network driver statistics nvml Requires its own configure; monitors NVIDIA hardware (power, temp, fan speed, etc). pcp Performance Co-Pilot interface. perf_event Linux perf_event CPU counters perf_event_uncore Linux perf_event CPU uncore and Northbridge perfmon2 OLD, only used for Linux before 2.6.31. perfmon_ia64 OLD, only used for Linux before 2.6.31. powercap Linux Powercap energy measurements. rapl Linux RAPL energy measurements. stealtime Stealtime Filesystem statistics. vmware Requires its own configure. Only runs in the VMWARE virtual environment. 
*/ papi-papi-7-2-0-t/src/components/Rules.components000066400000000000000000000001321502707512200220110ustar00rootroot00000000000000# $Id$ # This file is intended to prevent an empty include compile error in Makefile.inc papi-papi-7-2-0-t/src/components/appio/000077500000000000000000000000001502707512200177245ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/appio/CHANGES000066400000000000000000000003411502707512200207150ustar00rootroot00000000000000AppIO component changelog: 2012-01-19 Tushar Mohan * Support for read/write/fread/fwrite added * Test cases added * Static and dynamic linkage tested * Thread support enabled 2011-12-01 Phil Mucci * Initial skeleton papi-papi-7-2-0-t/src/components/appio/README.md000066400000000000000000000033201502707512200212010ustar00rootroot00000000000000# APPIO Component The APPIO component enables PAPI to access application level file and socket I/O information. * [Enabling the APPIO Component](#enabling-the-appio-component) * [Known Limitations](#known-limitations) * [FAQ](#faq) *** ## Enabling the APPIO Component To enable reading of APPIO counters the user needs to link against a PAPI library that was configured with the APPIO component enabled. As an example the following command: `./configure --with-components="appio"` is sufficient to enable the component. Typically, the utility `papi_components_avail` (available in `papi/src/utils/papi_components_avail`) will display the components available to the user, and whether they are disabled, and when they are disabled why. ## Known Limitations The most important aspect to note is that the code is likely to only work on Linux, given the low-level dependencies on libc features. At present the component intercepts the open(), close(), read(), write(), fread() and fwrite() calls. In the future it's expected that these will be expanded to cover lseek(), select(), and other I/O calls. 
While READ\_* and WRITE\_* calls will not distinguish between file and network I/O, the user can explicitly determine network statistics using SOCK_* calls. Threads are handled using thread-specific structures in the backend. However, no aggregation is currently performed across threads. There is also NO global structure that has the statistics of all the threads. This means the user can call a PAPI read to get statistics for a running thread. However, if the thread has joined, then its statistics can no longer be queried. *** ## FAQ 1. [Testing](#testing) ## Testing Tests live in the tests/ sub-directory. All tests take no arguments. papi-papi-7-2-0-t/src/components/appio/Rules.appio000066400000000000000000000003661502707512200220550ustar00rootroot00000000000000# $Id$ COMPSRCS += components/appio/appio.c COMPOBJS += appio.o SHLIBDEPS += -ldl appio.o: components/appio/appio.h components/appio/appio.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/appio/appio.c -o $@ papi-papi-7-2-0-t/src/components/appio/appio.c000066400000000000000000000640211502707512200212030ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file appio.c * * @author Philip Mucci * phil.mucci@samaratechnologygroup.com * * @author Tushar Mohan * tusharmohan@gmail.com * * Credit to: * Jose Pedro Oliveira * jpo@di.uminho.pt * whose code in the linux net component was used as a template for * many sections of code in this component. * * @ingroup papi_components * * @brief appio component * This file contains the source code for a component that enables * PAPI to access application level file and socket I/O information. * It does this through function replacement in the first person and * by trapping syscalls in the third person. 
*/ #include #include #include #include #include #include #include #include /* Headers required by PAPI */ #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" #include "appio.h" // The PIC test implies it's built for shared linkage #ifdef PIC # include "dlfcn.h" #endif /* #pragma weak dlerror static void *_dlsym_fake(void *handle, const char* symbol) { (void) handle; (void) symbol; return NULL; } void *dlsym(void *handle, const char* symbol) __attribute__ ((weak, alias ("_dlsym_fake"))); */ papi_vector_t _appio_vector; /********************************************************************* * Private ********************************************************************/ //#define APPIO_FOO 1 static APPIO_native_event_entry_t * _appio_native_events; /* If you modify the appio_stats_t below, you MUST update APPIO_MAX_COUNTERS */ static __thread long long _appio_register_current[APPIO_MAX_COUNTERS]; typedef enum { READ_BYTES = 0, READ_CALLS, READ_ERR, READ_INTERRUPTED, READ_WOULD_BLOCK, READ_SHORT, READ_EOF, READ_BLOCK_SIZE, READ_USEC, WRITE_BYTES, WRITE_CALLS, WRITE_ERR, WRITE_SHORT, WRITE_INTERRUPTED, WRITE_WOULD_BLOCK, WRITE_BLOCK_SIZE, WRITE_USEC, OPEN_CALLS, OPEN_ERR, OPEN_FDS, SELECT_USEC, RECV_BYTES, RECV_CALLS, RECV_ERR, RECV_INTERRUPTED, RECV_WOULD_BLOCK, RECV_SHORT, RECV_EOF, RECV_BLOCK_SIZE, RECV_USEC, SOCK_READ_BYTES, SOCK_READ_CALLS, SOCK_READ_ERR, SOCK_READ_SHORT, SOCK_READ_WOULD_BLOCK, SOCK_READ_USEC, SOCK_WRITE_BYTES, SOCK_WRITE_CALLS, SOCK_WRITE_ERR, SOCK_WRITE_SHORT, SOCK_WRITE_WOULD_BLOCK, SOCK_WRITE_USEC, SEEK_CALLS, SEEK_ABS_STRIDE_SIZE, SEEK_USEC } _appio_stats_t ; static const struct appio_counters { const char *name; const char *description; } _appio_counter_info[APPIO_MAX_COUNTERS] = { { "READ_BYTES", "Bytes read"}, { "READ_CALLS", "Number of read calls"}, { "READ_ERR", "Number of read calls that resulted in an error"}, { "READ_INTERRUPTED","Number of read calls that timed out or were interrupted"}, { 
"READ_WOULD_BLOCK","Number of read calls that would have blocked"}, { "READ_SHORT", "Number of read calls that returned less bytes than requested"}, { "READ_EOF", "Number of read calls that returned an EOF"}, { "READ_BLOCK_SIZE", "Average block size of reads"}, { "READ_USEC", "Real microseconds spent in reads"}, { "WRITE_BYTES", "Bytes written"}, { "WRITE_CALLS", "Number of write calls"}, { "WRITE_ERR", "Number of write calls that resulted in an error"}, { "WRITE_SHORT", "Number of write calls that wrote less bytes than requested"}, { "WRITE_INTERRUPTED","Number of write calls that timed out or were interrupted"}, { "WRITE_WOULD_BLOCK","Number of write calls that would have blocked"}, { "WRITE_BLOCK_SIZE","Mean block size of writes"}, { "WRITE_USEC", "Real microseconds spent in writes"}, { "OPEN_CALLS", "Number of open calls"}, { "OPEN_ERR", "Number of open calls that resulted in an error"}, { "OPEN_FDS", "Number of currently open descriptors"}, { "SELECT_USEC", "Real microseconds spent in select calls"}, { "RECV_BYTES", "Bytes read in recv/recvmsg/recvfrom"}, { "RECV_CALLS", "Number of recv/recvmsg/recvfrom calls"}, { "RECV_ERR", "Number of recv/recvmsg/recvfrom calls that resulted in an error"}, { "RECV_INTERRUPTED","Number of recv/recvmsg/recvfrom calls that timed out or were interruped"}, { "RECV_WOULD_BLOCK","Number of recv/recvmsg/recvfrom calls that would have blocked"}, { "RECV_SHORT", "Number of recv/recvmsg/recvfrom calls that returned less bytes than requested"}, { "RECV_EOF", "Number of recv/recvmsg/recvfrom calls that returned an EOF"}, { "RECV_BLOCK_SIZE", "Average block size of recv/recvmsg/recvfrom"}, { "RECV_USEC", "Real microseconds spent in recv/recvmsg/recvfrom"}, { "SOCK_READ_BYTES", "Bytes read from socket"}, { "SOCK_READ_CALLS", "Number of read calls on socket"}, { "SOCK_READ_ERR", "Number of read calls on socket that resulted in an error"}, { "SOCK_READ_SHORT", "Number of read calls on socket that returned less bytes than requested"}, { 
"SOCK_READ_WOULD_BLOCK", "Number of read calls on socket that would have blocked"}, { "SOCK_READ_USEC", "Real microseconds spent in read(s) on socket(s)"}, { "SOCK_WRITE_BYTES","Bytes written to socket"}, { "SOCK_WRITE_CALLS","Number of write calls to socket"}, { "SOCK_WRITE_ERR", "Number of write calls to socket that resulted in an error"}, { "SOCK_WRITE_SHORT","Number of write calls to socket that wrote less bytes than requested"}, { "SOCK_WRITE_WOULD_BLOCK","Number of write calls to socket that would have blocked"}, { "SOCK_WRITE_USEC", "Real microseconds spent in write(s) to socket(s)"}, { "SEEK_CALLS", "Number of seek calls"}, { "SEEK_ABS_STRIDE_SIZE", "Average absolute stride size of seeks"}, { "SEEK_USEC", "Real microseconds spent in seek calls"} }; // The following macro follows if a string function has an error. It should // never happen; but it is necessary to prevent compiler warnings. We print // something just in case there is programmer error in invoking the function. #define HANDLE_STRING_ERROR {fprintf(stderr,"%s:%i unexpected string function error.\n",__FILE__,__LINE__); exit(-1);} /********************************************************************* *** BEGIN FUNCTIONS USED INTERNALLY SPECIFIC TO THIS COMPONENT **** ********************************************************************/ int __close(int fd); int close(int fd) { int retval; SUBDBG("appio: intercepted close(%d)\n", fd); retval = __close(fd); if ((retval == 0) && (_appio_register_current[OPEN_FDS]>0)) _appio_register_current[OPEN_FDS]--; return retval; } int __open(const char *pathname, int flags, mode_t mode); int open(const char *pathname, int flags, mode_t mode) { int retval; SUBDBG("appio: intercepted open(%s,%d,%d)\n", pathname, flags, mode); retval = __open(pathname,flags,mode); _appio_register_current[OPEN_CALLS]++; if (retval < 0) _appio_register_current[OPEN_ERR]++; else _appio_register_current[OPEN_FDS]++; return retval; } /* we use timeval as a zero value timeout to select 
in read/write for polling if the operation would block */ struct timeval zerotv; /* this has to be zero, so define it here */ int __select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout); int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout) { int retval; SUBDBG("appio: intercepted select(%d,%p,%p,%p,%p)\n", nfds,readfds,writefds,exceptfds,timeout); long long start_ts = PAPI_get_real_usec(); retval = __select(nfds,readfds,writefds,exceptfds,timeout); long long duration = PAPI_get_real_usec() - start_ts; _appio_register_current[SELECT_USEC] += duration; return retval; } off_t __lseek(int fd, off_t offset, int whence); off_t lseek(int fd, off_t offset, int whence) { off_t retval; SUBDBG("appio: intercepted lseek(%d,%ld,%d)\n", fd, offset, whence); long long start_ts = PAPI_get_real_usec(); retval = __lseek(fd, offset, whence); long long duration = PAPI_get_real_usec() - start_ts; int n = _appio_register_current[SEEK_CALLS]++; _appio_register_current[SEEK_USEC] += duration; if (offset < 0) offset = -offset; // get abs offset _appio_register_current[SEEK_ABS_STRIDE_SIZE]= (n * _appio_register_current[SEEK_ABS_STRIDE_SIZE] + offset)/(n+1); // mean absolute stride size return retval; } extern int errno; ssize_t __read(int fd, void *buf, size_t count); ssize_t read(int fd, void *buf, size_t count) { int retval; SUBDBG("appio: intercepted read(%d,%p,%lu)\n", fd, buf, (unsigned long)count); struct stat st; int issocket = 0; if (fstat(fd, &st) == 0) { if ((st.st_mode & S_IFMT) == S_IFSOCK) issocket = 1; } // check if read would block on descriptor fd_set readfds; FD_ZERO(&readfds); FD_SET(fd, &readfds); int ready = __select(fd+1, &readfds, NULL, NULL, &zerotv); if (ready == 0) { _appio_register_current[READ_WOULD_BLOCK]++; if (issocket) _appio_register_current[SOCK_READ_WOULD_BLOCK]++; } long long start_ts = PAPI_get_real_usec(); retval = __read(fd,buf, count); long long duration = 
PAPI_get_real_usec() - start_ts; int n = _appio_register_current[READ_CALLS]++; // read calls if (issocket) _appio_register_current[SOCK_READ_CALLS]++; // read calls if (retval > 0) { _appio_register_current[READ_BLOCK_SIZE]= (n * _appio_register_current[READ_BLOCK_SIZE] + count)/(n+1); // mean size _appio_register_current[READ_BYTES] += retval; // read bytes if (issocket) _appio_register_current[SOCK_READ_BYTES] += retval; if (retval < (int)count) { _appio_register_current[READ_SHORT]++; // read short if (issocket) _appio_register_current[SOCK_READ_SHORT]++; // read short } _appio_register_current[READ_USEC] += duration; if (issocket) _appio_register_current[SOCK_READ_USEC] += duration; } if (retval < 0) { _appio_register_current[READ_ERR]++; // read err if (issocket) _appio_register_current[SOCK_READ_ERR]++; // read err if (EINTR == errno) _appio_register_current[READ_INTERRUPTED]++; // signal interrupted the read //if ((EAGAIN == errno) || (EWOULDBLOCK == errno)) { // _appio_register_current[READ_WOULD_BLOCK]++; //read would block on descriptor marked as non-blocking // if (issocket) _appio_register_current[SOCK_READ_WOULD_BLOCK]++; //read would block on descriptor marked as non-blocking //} } if (retval == 0) _appio_register_current[READ_EOF]++; // read eof return retval; } size_t _IO_fread(void *ptr, size_t size, size_t nmemb, FILE *stream); size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream) { size_t retval; SUBDBG("appio: intercepted fread(%p,%lu,%lu,%p)\n", ptr, (unsigned long) size, (unsigned long) nmemb, (void*) stream); long long start_ts = PAPI_get_real_usec(); retval = _IO_fread(ptr,size,nmemb,stream); long long duration = PAPI_get_real_usec() - start_ts; int n = _appio_register_current[READ_CALLS]++; // read calls if (retval > 0) { _appio_register_current[READ_BLOCK_SIZE]= (n * _appio_register_current[READ_BLOCK_SIZE]+ size*nmemb)/(n+1);//mean size _appio_register_current[READ_BYTES]+= retval * size; // read bytes if (retval < nmemb) 
_appio_register_current[READ_SHORT]++; // read short _appio_register_current[READ_USEC] += duration; } /* A value of zero returned means one of two things..*/ if (retval == 0) { if (feof(stream)) _appio_register_current[READ_EOF]++; // read eof else _appio_register_current[READ_ERR]++; // read err } return retval; } ssize_t __write(int fd, const void *buf, size_t count); ssize_t write(int fd, const void *buf, size_t count) { int retval; SUBDBG("appio: intercepted write(%d,%p,%lu)\n", fd, buf, (unsigned long)count); struct stat st; int issocket = 0; if (fstat(fd, &st) == 0) { if ((st.st_mode & S_IFMT) == S_IFSOCK) issocket = 1; } // check if write would block on descriptor fd_set writefds; FD_ZERO(&writefds); FD_SET(fd, &writefds); int ready = __select(fd+1, NULL, &writefds, NULL, &zerotv); if (ready == 0) { _appio_register_current[WRITE_WOULD_BLOCK]++; if (issocket) _appio_register_current[SOCK_WRITE_WOULD_BLOCK]++; } long long start_ts = PAPI_get_real_usec(); retval = __write(fd,buf, count); long long duration = PAPI_get_real_usec() - start_ts; int n = _appio_register_current[WRITE_CALLS]++; // write calls if (issocket) _appio_register_current[SOCK_WRITE_CALLS]++; // socket write if (retval >= 0) { _appio_register_current[WRITE_BLOCK_SIZE]= (n * _appio_register_current[WRITE_BLOCK_SIZE] + count)/(n+1); // mean size _appio_register_current[WRITE_BYTES]+= retval; // write bytes if (issocket) _appio_register_current[SOCK_WRITE_BYTES] += retval; if (retval < (int)count) { _appio_register_current[WRITE_SHORT]++; // short write if (issocket) _appio_register_current[SOCK_WRITE_SHORT]++; } _appio_register_current[WRITE_USEC] += duration; if (issocket) _appio_register_current[SOCK_WRITE_USEC] += duration; } if (retval < 0) { _appio_register_current[WRITE_ERR]++; // err if (issocket) _appio_register_current[SOCK_WRITE_ERR]++; if (EINTR == errno) _appio_register_current[WRITE_INTERRUPTED]++; // signal interrupted the op //if ((EAGAIN == errno) || (EWOULDBLOCK == errno)) { // 
_appio_register_current[WRITE_WOULD_BLOCK]++; //op would block on descriptor marked as non-blocking // if (issocket) _appio_register_current[SOCK_WRITE_WOULD_BLOCK]++; //} } return retval; } // The PIC test implies it's built for shared linkage #ifdef PIC static ssize_t (*__recv)(int sockfd, void *buf, size_t len, int flags) = NULL; ssize_t recv(int sockfd, void *buf, size_t len, int flags) { int retval; SUBDBG("appio: intercepted recv(%d,%p,%lu,%d)\n", sockfd, buf, (unsigned long)len, flags); if (!__recv) __recv = dlsym(RTLD_NEXT, "recv"); if (!__recv) { fprintf(stderr, "appio,c Internal Error: Could not obtain handle for real recv\n"); exit(1); } // check if recv would block on descriptor fd_set readfds; FD_ZERO(&readfds); FD_SET(sockfd, &readfds); int ready = __select(sockfd+1, &readfds, NULL, NULL, &zerotv); if (ready == 0) _appio_register_current[RECV_WOULD_BLOCK]++; long long start_ts = PAPI_get_real_usec(); retval = __recv(sockfd, buf, len, flags); long long duration = PAPI_get_real_usec() - start_ts; int n = _appio_register_current[RECV_CALLS]++; // read calls if (retval > 0) { _appio_register_current[RECV_BLOCK_SIZE]= (n * _appio_register_current[RECV_BLOCK_SIZE] + len)/(n+1); // mean size _appio_register_current[RECV_BYTES] += retval; // read bytes if (retval < (int)len) _appio_register_current[RECV_SHORT]++; // read short _appio_register_current[RECV_USEC] += duration; } if (retval < 0) { _appio_register_current[RECV_ERR]++; // read err if (EINTR == errno) _appio_register_current[RECV_INTERRUPTED]++; // signal interrupted the read if ((EAGAIN == errno) || (EWOULDBLOCK == errno)) _appio_register_current[RECV_WOULD_BLOCK]++; //read would block on descriptor marked as non-blocking } if (retval == 0) _appio_register_current[RECV_EOF]++; // read eof return retval; } #endif /* PIC */ size_t _IO_fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream); size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream) { size_t retval; 
SUBDBG("appio: intercepted fwrite(%p,%lu,%lu,%p)\n", ptr, (unsigned long) size, (unsigned long) nmemb, (void*) stream); long long start_ts = PAPI_get_real_usec(); retval = _IO_fwrite(ptr,size,nmemb,stream); long long duration = PAPI_get_real_usec() - start_ts; int n = _appio_register_current[WRITE_CALLS]++; // write calls if (retval > 0) { _appio_register_current[WRITE_BLOCK_SIZE]= (n * _appio_register_current[WRITE_BLOCK_SIZE] + size*nmemb)/(n+1); // mean block size _appio_register_current[WRITE_BYTES]+= retval * size; // write bytes if (retval < nmemb) _appio_register_current[WRITE_SHORT]++; // short write _appio_register_current[WRITE_USEC] += duration; } if (retval == 0) _appio_register_current[WRITE_ERR]++; // err return retval; } /********************************************************************* *************** BEGIN PAPI's COMPONENT REQUIRED FUNCTIONS ********* *********************************************************************/ /* * This is called whenever a thread is initialized */ static int _appio_init_thread( hwd_context_t *ctx ) { ( void ) ctx; SUBDBG("_appio_init_thread %p\n", ctx); return PAPI_OK; } /* Initialize hardware counters, setup the function vector table * and get hardware information, this routine is called when the * PAPI process is initialized (IE PAPI_library_init) */ static int _appio_init_component( int cidx ) { int strErr; int retval = PAPI_OK; SUBDBG("_appio_component %d\n", cidx); _appio_native_events = (APPIO_native_event_entry_t *) papi_calloc(APPIO_MAX_COUNTERS, sizeof(APPIO_native_event_entry_t)); if (_appio_native_events == NULL ) { PAPIERROR( "malloc():Could not get memory for events table" ); strErr=snprintf(_appio_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "malloc() failed in %s for %lu bytes.", __func__, APPIO_MAX_COUNTERS*sizeof(APPIO_native_event_entry_t)); _appio_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN-1]=0; if (strErr>PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; retval = PAPI_ENOMEM; goto fn_fail; } int 
i;

    /* The extraction-damaged loop below populates the native event table;
       reconstructed from the standard PAPI appio component boilerplate. */
    for (i=0; i<APPIO_MAX_COUNTERS; i++) {
        _appio_native_events[i].name = _appio_counter_info[i].name;
        _appio_native_events[i].description = _appio_counter_info[i].description;
        _appio_native_events[i].resources.selector = i + 1;
    }

    /* Export the total number of events available */
    _appio_vector.cmp_info.num_native_events = APPIO_MAX_COUNTERS;

  fn_exit:
    _papi_hwd[cidx]->cmp_info.disabled = retval;
    return retval;
  fn_fail:
    goto fn_exit;
}

/*
 * Control of counters (Reading/Writing/Starting/Stopping/Setup)
 * functions
 */
static int
_appio_init_control_state( hwd_control_state_t *ctl )
{
    ( void ) ctl;
    return PAPI_OK;
}

static int
_appio_start( hwd_context_t *ctx, hwd_control_state_t *ctl )
{
    ( void ) ctx;
    SUBDBG("_appio_start %p %p\n", ctx, ctl);
    APPIO_control_state_t *appio_ctl = (APPIO_control_state_t *) ctl;
    /* this memset needs to move to thread_init */
    memset(_appio_register_current, 0,
           APPIO_MAX_COUNTERS * sizeof(_appio_register_current[0]));
    /* set initial values to 0 */
    memset(appio_ctl->values, 0,
           APPIO_MAX_COUNTERS * sizeof(appio_ctl->values[0]));
    return PAPI_OK;
}

static int
_appio_read( hwd_context_t *ctx, hwd_control_state_t *ctl,
             long long **events, int flags )
{
    (void) flags;
    (void) ctx;
    SUBDBG("_appio_read %p %p\n", ctx, ctl);
    APPIO_control_state_t *appio_ctl = (APPIO_control_state_t *) ctl;
    int i;
    for ( i=0; i<appio_ctl->num_events; i++ ) {
        int index = appio_ctl->counter_bits[i];
        SUBDBG("event=%d, index=%d, val=%lld\n", i, index,
               _appio_register_current[index]);
        appio_ctl->values[index] = _appio_register_current[index];
    }
    *events = appio_ctl->values;
    return PAPI_OK;
}

static int
_appio_stop( hwd_context_t *ctx, hwd_control_state_t *ctl )
{
    (void) ctx;
    SUBDBG("_appio_stop ctx=%p ctl=%p\n", ctx, ctl);
    APPIO_control_state_t *appio_ctl = (APPIO_control_state_t *) ctl;
    int i;
    for ( i=0; i<appio_ctl->num_events; i++ ) {
        int index = appio_ctl->counter_bits[i];
        SUBDBG("event=%d, index=%d, val=%lld\n", i, index,
               _appio_register_current[index]);
        appio_ctl->values[i] = _appio_register_current[index];
    }
    return PAPI_OK;
}

/*
 * Thread shutdown
 */
static int
_appio_shutdown_thread( hwd_context_t *ctx )
{
    ( void ) ctx;
    return PAPI_OK;
}

/*
 * Clean up what was setup in appio_init_component.
*/ static int _appio_shutdown_component( void ) { papi_free( _appio_native_events ); return PAPI_OK; } /* This function sets various options in the component * The valid codes being passed in are PAPI_SET_DEFDOM, * PAPI_SET_DOMAIN, PAPI_SETDEFGRN, PAPI_SET_GRANUL and * PAPI_SET_INHERIT */ static int _appio_ctl( hwd_context_t *ctx, int code, _papi_int_option_t *option ) { ( void ) ctx; ( void ) code; ( void ) option; return PAPI_OK; } static int _appio_update_control_state( hwd_control_state_t *ctl, NativeInfo_t *native, int count, hwd_context_t *ctx ) { ( void ) ctx; ( void ) ctl; SUBDBG("_appio_update_control_state ctx=%p ctl=%p num_events=%d\n", ctx, ctl, count); int i, index; APPIO_control_state_t *appio_ctl = (APPIO_control_state_t *) ctl; (void) ctx; for ( i = 0; i < count; i++ ) { index = native[i].ni_event; appio_ctl->counter_bits[i] = index; native[i].ni_position = index; } appio_ctl->num_events = count; return PAPI_OK; } /* * This function has to set the bits needed to count different domains * In particular: PAPI_DOM_USER, PAPI_DOM_KERNEL PAPI_DOM_OTHER * By default return PAPI_EINVAL if none of those are specified * and PAPI_OK with success * PAPI_DOM_USER is only user context is counted * PAPI_DOM_KERNEL is only the Kernel/OS context is counted * PAPI_DOM_OTHER is Exception/transient mode (like user TLB misses) * PAPI_DOM_ALL is all of the domains */ static int _appio_set_domain( hwd_control_state_t *ctl, int domain ) { ( void ) ctl; int found = 0; if ( PAPI_DOM_USER == domain ) found = 1; if ( !found ) return PAPI_EINVAL; return PAPI_OK; } static int _appio_reset( hwd_context_t *ctx, hwd_control_state_t *ctl ) { ( void ) ctx; ( void ) ctl; return PAPI_OK; } /* * Native Event functions */ static int _appio_ntv_enum_events( unsigned int *EventCode, int modifier ) { int index; switch ( modifier ) { case PAPI_ENUM_FIRST: *EventCode = 0; return PAPI_OK; break; case PAPI_ENUM_EVENTS: index = *EventCode; if ( index < APPIO_MAX_COUNTERS - 1 ) { *EventCode = 
*EventCode + 1;
            return PAPI_OK;
        } else {
            return PAPI_ENOEVNT;
        }
        break;
    default:
        return PAPI_EINVAL;
        break;
    }
    return PAPI_EINVAL;
}

/*
 *
 */
static int
_appio_ntv_name_to_code( const char *name, unsigned int *EventCode )
{
    /* Loop body reconstructed from the standard PAPI appio sources;
       the extracted text had the condition and match test stripped. */
    int i;
    for ( i=0; i<APPIO_MAX_COUNTERS; i++ ) {
        if ( strcmp( name, _appio_counter_info[i].name ) == 0 ) {
            *EventCode = i;
            return PAPI_OK;
        }
    }
    return PAPI_ENOEVNT;
}

/*
 *
 */
static int
_appio_ntv_code_to_name( unsigned int EventCode, char *name, int len )
{
    int index = EventCode;
    if ( index >= 0 && index < APPIO_MAX_COUNTERS ) {
        strncpy( name, _appio_counter_info[index].name, len );
        return PAPI_OK;
    }
    return PAPI_ENOEVNT;
}

/*
 *
 */
static int
_appio_ntv_code_to_descr( unsigned int EventCode, char *desc, int len )
{
    int index = EventCode;
    if ( index >= 0 && index < APPIO_MAX_COUNTERS ) {
        strncpy( desc, _appio_counter_info[index].description, len );
        return PAPI_OK;
    }
    return PAPI_ENOEVNT;
}

/*
 *
 */
static int
_appio_ntv_code_to_bits( unsigned int EventCode, hwd_register_t *bits )
{
    int index = EventCode;
    if ( index >= 0 && index < APPIO_MAX_COUNTERS ) {
        memcpy( ( APPIO_register_t * ) bits,
                &( _appio_native_events[index].resources ),
                sizeof ( APPIO_register_t ) );
        return PAPI_OK;
    }
    return PAPI_ENOEVNT;
}

/*
 *
 */
papi_vector_t _appio_vector = {
    .cmp_info = {
        /* default component information (unspecified values are initialized to 0) */
        .name = "appio",
        .short_name = "appio",
        .version = "1.1.2.4",
        .description = "Linux I/O system calls",
        .CmpIdx = 0,            /* set by init_component */
        .num_mpx_cntrs = APPIO_MAX_COUNTERS,
        .num_cntrs = APPIO_MAX_COUNTERS,
        .default_domain = PAPI_DOM_USER,
        .available_domains = PAPI_DOM_USER,
        .default_granularity = PAPI_GRN_THR,
        .available_granularities = PAPI_GRN_THR,
        .hardware_intr_sig = PAPI_INT_SIGNAL,

        /* component specific cmp_info initializations */
        .fast_real_timer = 0,
        .fast_virtual_timer = 0,
        .attach = 0,
        .attach_must_ptrace = 0,
    },

    /* sizes of framework-opaque component-private structures */
    .size = {
        .context = sizeof ( APPIO_context_t ),
        .control_state = sizeof ( APPIO_control_state_t ),
        .reg_value = sizeof ( APPIO_register_t ),
        .reg_alloc = sizeof ( APPIO_reg_alloc_t ),
    },

    /* function pointers in this component */
    .init_thread = _appio_init_thread,
    .init_component = _appio_init_component,
.init_control_state = _appio_init_control_state, .start = _appio_start, .stop = _appio_stop, .read = _appio_read, .shutdown_thread = _appio_shutdown_thread, .shutdown_component = _appio_shutdown_component, .ctl = _appio_ctl, .update_control_state = _appio_update_control_state, .set_domain = _appio_set_domain, .reset = _appio_reset, .ntv_enum_events = _appio_ntv_enum_events, .ntv_name_to_code = _appio_ntv_name_to_code, .ntv_code_to_name = _appio_ntv_code_to_name, .ntv_code_to_descr = _appio_ntv_code_to_descr, .ntv_code_to_bits = _appio_ntv_code_to_bits /* .ntv_bits_to_info = NULL, */ }; /* vim:set ts=4 sw=4 sts=4 et: */ papi-papi-7-2-0-t/src/components/appio/appio.h000066400000000000000000000041231502707512200212050ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file appio.h * CVS: $Id: appio.h,v 1.1.2.4 2012/02/01 05:01:00 tmohan Exp $ * * @author Philip Mucci * phil.mucci@samaratechnologygroup.com * * @author Tushar Mohan * tushar.mohan@samaratechnologygroup.com * * @ingroup papi_components * * @brief appio component * This file contains the source code for a component that enables * PAPI to access application level file and socket I/O information. * It does this through function replacement in the first person and * by trapping syscalls in the third person. */ #ifndef _PAPI_APPIO_H #define _PAPI_APPIO_H #include /************************* DEFINES SECTION ***********************************/ /* Set this equal to the number of elements in _appio_counter_info array */ #define APPIO_MAX_COUNTERS 45 /** Structure that stores private information of each event */ typedef struct APPIO_register { /* This is used by the framework. It likes it to be !=0 to do something */ unsigned int selector; } APPIO_register_t; /* * The following structures mimic the ones used by other components. It is more * convenient to use them like that as programming with PAPI makes specific * assumptions for them. 
*/ /* This structure is used to build the table of events */ typedef struct APPIO_native_event_entry { APPIO_register_t resources; const char* name; const char* description; } APPIO_native_event_entry_t; typedef struct APPIO_reg_alloc { APPIO_register_t ra_bits; } APPIO_reg_alloc_t; typedef struct APPIO_control_state { int num_events; int counter_bits[APPIO_MAX_COUNTERS]; long long values[APPIO_MAX_COUNTERS]; // used for caching } APPIO_control_state_t; typedef struct APPIO_context { APPIO_control_state_t state; } APPIO_context_t; /************************* GLOBALS SECTION *********************************** *******************************************************************************/ #endif /* _PAPI_APPIO_H */ /* vim:set ts=4 sw=4 sts=4 et: */ papi-papi-7-2-0-t/src/components/appio/tests/000077500000000000000000000000001502707512200210665ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/appio/tests/Makefile000066400000000000000000000045411502707512200225320ustar00rootroot00000000000000NAME=appio include ../../Makefile_comp_tests.target %.o:%.c $(CC) $(CFLAGS) $(OPTFLAGS) $(INCLUDE) -c -o $@ $< TESTS = appio_list_events appio_values_by_code appio_values_by_name appio_test_read_write appio_test_pthreads appio_test_fread_fwrite appio_test_seek ALL_TESTS = $(TESTS) appio_test_blocking appio_test_select appio_test_recv appio_test_socket appio_tests: $(TESTS) all: $(ALL_TESTS) ARCH=$(shell uname -m) ifeq (x86_64,$(ARCH)) ARCH_SUFFIX="-AMD64" endif %.o:%.c $(CC) $(CFLAGS) $(OPTFLAGS) $(INCLUDE) -c -o $@ $< appio_list_events: appio_list_events.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o $@ appio_list_events.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) appio_values_by_code: appio_values_by_code.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o $@ appio_values_by_code.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) appio_values_by_name: appio_values_by_name.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o $@ appio_values_by_name.o $(UTILOBJS) $(PAPILIB) 
$(LDFLAGS) appio_test_read_write: appio_test_read_write.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o $@ appio_test_read_write.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) appio_test_seek: appio_test_seek.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o $@ appio_test_seek.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) appio_test_blocking: appio_test_blocking.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o $@ appio_test_blocking.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) appio_test_socket: appio_test_socket.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o $@ appio_test_socket.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) appio_test_recv: appio_test_recv.o $(UTILOBJS) ../../../libpapi.so $(CC) $(CFLAGS) $(INCLUDE) -o $@ appio_test_recv.o $(UTILOBJS) -Wl,-rpath ../../.. ../../../libpapi.so $(LDFLAGS) appio_test_select: appio_test_select.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o $@ appio_test_select.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) appio_test_fread_fwrite: appio_test_fread_fwrite.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o $@ appio_test_fread_fwrite.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) appio_test_pthreads: appio_test_pthreads.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o $@ appio_test_pthreads.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) -lpthread init_fini.o: init_fini.c $(CC) $(CFLAGS) $(INCLUDE) -o $@ -c $^ clean: rm -f $(ALL_TESTS) *.o papi-papi-7-2-0-t/src/components/appio/tests/appio_list_events.c000066400000000000000000000041741502707512200247670ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @author Tushar Mohan * (adapted for appio from original linux-net code) * * test case for the appio component * * @brief * List all appio events codes and names */ #include #include #include #include "papi.h" #include "papi_test.h" int main (int argc, char **argv) { int retval,cid,numcmp; int total_events=0; int code; char event_name[PAPI_MAX_STR_LEN]; int r; const 
PAPI_component_info_t *cmpinfo = NULL; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); /* PAPI Initialization */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__,"PAPI_library_init failed\n",retval); } if (!TESTS_QUIET) { printf("Listing all appio events\n"); } numcmp = PAPI_num_components(); for(cid=0; cidname, "appio") == NULL) { continue; } if (!TESTS_QUIET) { printf("Component %d (%d) - %d events - %s\n", cid, cmpinfo->CmpIdx, cmpinfo->num_native_events, cmpinfo->name); } code = PAPI_NATIVE_MASK; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_FIRST, cid ); while ( r == PAPI_OK ) { retval = PAPI_event_code_to_name( code, event_name ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } if (!TESTS_QUIET) { printf("%#x %s\n", code, event_name); } total_events++; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_EVENTS, cid ); } } if (total_events==0) { test_skip(__FILE__,__LINE__,"No appio events found", 0); } test_pass( __FILE__ ); return 0; } // vim:set ai ts=4 sw=4 sts=4 et: papi-papi-7-2-0-t/src/components/appio/tests/appio_test_blocking.c000066400000000000000000000045421502707512200252560ustar00rootroot00000000000000/* * Test case for appio * Author: Tushar Mohan * tusharmohan@gmail.com * * Description: This test case reads from standard linux /etc/group * and writes the output to stdout. 
* Statistics are printed at the end of the run., */ #include #include #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #define NUM_EVENTS 12 int main(int argc, char** argv) { int EventSet = PAPI_NULL; const char* names[NUM_EVENTS] = {"OPEN_CALLS", "OPEN_FDS", "READ_CALLS", "READ_BYTES", "READ_USEC", "READ_ERR", "READ_INTERRUPTED", "READ_WOULD_BLOCK", "WRITE_CALLS","WRITE_BYTES","WRITE_USEC", "WRITE_WOULD_BLOCK"}; long long values[NUM_EVENTS]; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); int version = PAPI_library_init (PAPI_VER_CURRENT); if (version != PAPI_VER_CURRENT) { fprintf(stderr, "PAPI_library_init version mismatch\n"); exit(1); } /* Create the Event Set */ if (PAPI_create_eventset(&EventSet) != PAPI_OK) { fprintf(stderr, "Error creating event set\n"); exit(2); } if (!TESTS_QUIET) fprintf(stderr, "This program will read from stdin and echo it to stdout\n"); int retval; int e; int event_code; for (e=0; e 0) { write(1, buf, bytes); } /* Stop counting events */ if (PAPI_stop(EventSet, values) != PAPI_OK) { fprintf(stderr, "Error in PAPI_stop\n"); } if (!TESTS_QUIET) { printf("----\n"); for (e=0; e #include #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #define NUM_EVENTS 8 int main(int argc, char** argv) { int EventSet = PAPI_NULL; const char* names[NUM_EVENTS] = {"READ_CALLS", "READ_BYTES","READ_USEC","READ_ERR", "READ_EOF", "WRITE_CALLS","WRITE_BYTES","WRITE_USEC"}; long long values[NUM_EVENTS]; char *infile = "/etc/group"; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); int version = PAPI_library_init (PAPI_VER_CURRENT); if (version != PAPI_VER_CURRENT) { fprintf(stderr, "PAPI_library_init version mismatch\n"); exit(1); } /* Create the Event Set */ if (PAPI_create_eventset(&EventSet) != PAPI_OK) { fprintf(stderr, "Error creating event set\n"); exit(2); } if (!TESTS_QUIET) printf("This program will read %s and write it to /dev/null\n", 
infile); FILE* fdin=fopen(infile, "r"); if (fdin == NULL) perror("Could not open file for reading: \n"); FILE* fout=fopen("/dev/null", "w"); if (fout == NULL) perror("Could not open file for writing: \n"); int bytes = 0; char buf[1024]; int retval; int e; int event_code; for (e=0; e 0) { fwrite(buf, 1, bytes, fout); } fclose(fdin); fclose(fout); /* Stop counting events */ if (PAPI_stop(EventSet, values) != PAPI_OK) { fprintf(stderr, "Error in PAPI_stop\n"); } if (!TESTS_QUIET) { printf("----\n"); for (e=0; e #include #include #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #define NUM_EVENTS 6 const char* names[NUM_EVENTS] = {"READ_CALLS", "READ_BYTES","READ_USEC","WRITE_CALLS","WRITE_BYTES","WRITE_USEC"}; #define NUM_INFILES 4 static const char* files[NUM_INFILES] = {"/etc/passwd", "/etc/group", "/etc/protocols", "/etc/nsswitch.conf"}; void *ThreadIO(void *arg) { unsigned long tid = (unsigned long)pthread_self(); if (!TESTS_QUIET) printf("\nThread %#lx: will read %s and write it to /dev/null\n", tid,(const char*) arg); int EventSet = PAPI_NULL; long long values[NUM_EVENTS]; int retval; int e; int event_code; /* Create the Event Set */ if (PAPI_create_eventset(&EventSet) != PAPI_OK) { fprintf(stderr, "Error creating event set\n"); exit(2); } for (e=0; e 0) { write(fdout, buf, bytes); } close(fdout); /* Stop counting events */ if (PAPI_stop(EventSet, values) != PAPI_OK) { fprintf(stderr, "Error in PAPI_stop\n"); } if (!TESTS_QUIET) { for (e=0; e #include #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #define NUM_EVENTS 12 int main(int argc, char** argv) { int EventSet = PAPI_NULL; const char* names[NUM_EVENTS] = {"OPEN_CALLS", "OPEN_FDS", "READ_CALLS", "READ_BYTES", "READ_USEC", "READ_ERR", "READ_INTERRUPTED", "READ_WOULD_BLOCK", "WRITE_CALLS","WRITE_BYTES","WRITE_USEC","WRITE_WOULD_BLOCK"}; long long values[NUM_EVENTS]; char *infile = "/etc/group"; /* Set TESTS_QUIET 
variable */ tests_quiet( argc, argv ); int version = PAPI_library_init (PAPI_VER_CURRENT); if (version != PAPI_VER_CURRENT) { fprintf(stderr, "PAPI_library_init version mismatch\n"); exit(1); } /* Create the Event Set */ if (PAPI_create_eventset(&EventSet) != PAPI_OK) { fprintf(stderr, "Error creating event set\n"); exit(2); } int fdin; if (!TESTS_QUIET) printf("This program will read %s and write it to /dev/null\n", infile); int retval; int e; int event_code; for (e=0; e 0) { write(fdout, buf, bytes); } /* Closing the descriptors before doing the PAPI_stop means, OPEN_FDS will be reported as zero, which is right, since at the time of PAPI_stop, the descriptors we opened have been closed */ close (fdin); close (fdout); /* Stop counting events */ if (PAPI_stop(EventSet, values) != PAPI_OK) { fprintf(stderr, "Error in PAPI_stop\n"); } if (!TESTS_QUIET) { printf("----\n"); for (e=0; e #include #include /* exit() */ #include /* herror() */ #include /* gethostbyname() */ #include /* bind() accept() */ #include /* bind() accept() */ #include #include "papi.h" #include "papi_test.h" #define PORT 3490 #define NUM_EVENTS 6 main(int argc, char *argv[]) { int EventSet = PAPI_NULL; const char* names[NUM_EVENTS] = {"RECV_CALLS", "RECV_BYTES", "RECV_USEC", "RECV_ERR", "RECV_INTERRUPTED", "RECV_WOULD_BLOCK"}; long long values[NUM_EVENTS]; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); int version = PAPI_library_init (PAPI_VER_CURRENT); if (version != PAPI_VER_CURRENT) { fprintf(stderr, "PAPI_library_init version mismatch\n"); exit(1); } /* Create the Event Set */ if (PAPI_create_eventset(&EventSet) != PAPI_OK) { fprintf(stderr, "Error creating event set\n"); exit(2); } if (!TESTS_QUIET) printf("This program will listen on port 3490, and write data received to standard output\n"); int retval; int e; int event_code; for (e=0; e 0) { write(1, buf, bytes); } close(n_sockfd); /* Stop counting events */ if (PAPI_stop(EventSet, values) != PAPI_OK) { fprintf(stderr, "Error in 
PAPI_stop\n"); } if (!TESTS_QUIET) { printf("----\n"); for (e=0; e #include #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #define NUM_EVENTS 7 int main(int argc, char** argv) { int EventSet = PAPI_NULL; const char* names[NUM_EVENTS] = {"READ_CALLS", "READ_BYTES", "READ_BLOCK_SIZE", "READ_USEC", "SEEK_CALLS", "SEEK_USEC", "SEEK_ABS_STRIDE_SIZE"}; long long values[NUM_EVENTS]; char *infile = "/etc/group"; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); int version = PAPI_library_init (PAPI_VER_CURRENT); if (version != PAPI_VER_CURRENT) { fprintf(stderr, "PAPI_library_init version mismatch\n"); exit(1); } /* Create the Event Set */ if (PAPI_create_eventset(&EventSet) != PAPI_OK) { fprintf(stderr, "Error creating event set\n"); exit(2); } int fdin; if (!TESTS_QUIET) printf("This program will do a strided read %s and write it to stdout\n", infile); int retval; int e; int event_code; for (e=0; e 0) { write(1, buf, bytes); lseek(fdin, 16, SEEK_CUR); } /* Closing the descriptors before doing the PAPI_stop means, OPEN_FDS will be reported as zero, which is right, since at the time of PAPI_stop, the descriptors we opened have been closed */ close (fdin); /* Stop counting events */ if (PAPI_stop(EventSet, values) != PAPI_OK) { fprintf(stderr, "Error in PAPI_stop\n"); } if (!TESTS_QUIET) { printf("----\n"); for (e=0; e #include #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #define NUM_EVENTS 1 int main(int argc, char** argv) { int EventSet = PAPI_NULL; const char* names[NUM_EVENTS] = {"SELECT_USEC"}; long long values[NUM_EVENTS]; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); int version = PAPI_library_init (PAPI_VER_CURRENT); if (version != PAPI_VER_CURRENT) { fprintf(stderr, "PAPI_library_init version mismatch\n"); exit(1); } /* Create the Event Set */ if (PAPI_create_eventset(&EventSet) != PAPI_OK) { fprintf(stderr, "Error creating event set\n"); exit(2); } 
if (!TESTS_QUIET) printf("This program will read from stdin and echo it to stdout\n"); int retval; int e; int event_code; for (e=0; e 0) write(1, buf, bytes); if (bytes == 0) break; } /* Stop counting events */ if (PAPI_stop(EventSet, values) != PAPI_OK) { fprintf(stderr, "Error in PAPI_stop\n"); } if (!TESTS_QUIET) { printf("----\n"); for (e=0; e #include #include /* exit() */ #include /* herror() */ #include /* gethostbyname() */ #include /* bind() accept() */ #include /* bind() accept() */ #include #include "papi.h" #include "papi_test.h" #define PORT 3490 #define NUM_EVENTS 15 main(int argc, char *argv[]) { int EventSet = PAPI_NULL; const char* names[NUM_EVENTS] = {"READ_CALLS", "READ_BYTES", "READ_USEC", "READ_WOULD_BLOCK", "SOCK_READ_CALLS", "SOCK_READ_BYTES", "SOCK_READ_USEC", "SOCK_READ_WOULD_BLOCK", "WRITE_BYTES", "WRITE_CALLS", "WRITE_WOULD_BLOCK", "WRITE_USEC", "SOCK_WRITE_BYTES", "SOCK_WRITE_CALLS", "SOCK_WRITE_USEC"}; long long values[NUM_EVENTS]; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); int version = PAPI_library_init (PAPI_VER_CURRENT); if (version != PAPI_VER_CURRENT) { fprintf(stderr, "PAPI_library_init version mismatch\n"); exit(1); } /* Create the Event Set */ if (PAPI_create_eventset(&EventSet) != PAPI_OK) { fprintf(stderr, "Error creating event set\n"); exit(2); } if (!TESTS_QUIET) printf("This program will listen on port 3490, and write data received to standard output AND socket\n" "In the output ensure that the following identities hold:\n" "READ_* == SOCK_READ_*\n" "WRITE_{CALLS,BYTES} = 2 * SOCK_WRITE_{CALLS,BYTES}\n" "SOCK_READ_BYTES == SOCK_WRITE_BYTES\n"); int retval; int e; int event_code; for (e=0; e 0) { write(1, buf, bytes); write(n_sockfd, buf, bytes); } close(n_sockfd); /* Stop counting events */ if (PAPI_stop(EventSet, values) != PAPI_OK) { fprintf(stderr, "Error in PAPI_stop\n"); } if (!TESTS_QUIET) { printf("----\n"); for (e=0; e #include #include #include #include #include "papi.h" #include "papi_test.h" 
#define MAX_EVENTS 48 int main (int argc, char **argv) { int retval,cid,numcmp; int EventSet = PAPI_NULL; int code; char event_names[MAX_EVENTS][PAPI_MAX_STR_LEN]; int event_codes[MAX_EVENTS]; long long event_values[MAX_EVENTS]; int total_events=0; /* events added so far */ int r; const PAPI_component_info_t *cmpinfo = NULL; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); /* PAPI Initialization */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__,"PAPI_library_init failed\n",retval); } if (!TESTS_QUIET) { printf("Trying all appio events\n"); } numcmp = PAPI_num_components(); for(cid=0; cidnum_native_events, cmpinfo->name); } if ( strstr(cmpinfo->name, "appio") == NULL) { continue; } code = PAPI_NATIVE_MASK; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_FIRST, cid ); /* Create and populate the EventSet */ EventSet = PAPI_NULL; retval = PAPI_create_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset()", retval); } while ( r == PAPI_OK ) { retval = PAPI_event_code_to_name( code, event_names[total_events] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } if (!TESTS_QUIET) { printf("Added event %s (code=%#x)\n", event_names[total_events], code); } event_codes[total_events++] = code; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_EVENTS, cid ); } } int fdin,fdout; const char* infile = "/etc/group"; printf("This program will read %s and write it to /dev/null\n", infile); int bytes = 0; char buf[1024]; retval = PAPI_add_events( EventSet, event_codes, total_events); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_add_events()", retval); } retval = PAPI_start( EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start()", retval); } fdin=open(infile, O_RDONLY); if (fdin < 0) perror("Could not open file for reading: \n"); fdout = open("/dev/null", O_WRONLY); if (fdout < 0) 
perror("Could not open /dev/null for writing: \n"); while ((bytes = read(fdin, buf, 1024)) > 0) { write(fdout, buf, bytes); } retval = PAPI_stop( EventSet, event_values ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_stop()", retval); } close(fdin); close(fdout); int i; if (!TESTS_QUIET) { for ( i=0; i #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #define NUM_EVENTS 11 int main (int argc, char **argv) { int i, retval; int EventSet = PAPI_NULL; char *event_name[NUM_EVENTS] = { "READ_BYTES", "READ_CALLS", "READ_USEC", "READ_EOF", "READ_SHORT", "READ_ERR", "WRITE_BYTES", "WRITE_CALLS", "WRITE_USEC", "WRITE_ERR", "WRITE_SHORT" }; int event_code[NUM_EVENTS] = { 0, 0, 0, 0, 0, 0, 0, 0, 0}; long long event_value[NUM_EVENTS]; int total_events=0; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); /* PAPI Initialization */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__,"PAPI_library_init failed\n",retval); } if (!TESTS_QUIET) { printf("Appio events by name\n"); } /* Map names to codes */ for ( i=0; i 0) { write(fdout, buf, bytes); } close(fdin); close(fdout); retval = PAPI_stop( EventSet, event_value ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start()", retval); } if (!TESTS_QUIET) { for ( i=0; i #include #include #include #include "papi.h" #define NUM_EVENTS 6 static int EventSet = PAPI_NULL; static const char* names[NUM_EVENTS] = {"READ_CALLS", "READ_BYTES","READ_USEC","WRITE_CALLS","WRITE_BYTES","WRITE_USEC"}; static long long values[NUM_EVENTS]; __attribute__ ((constructor)) void my_init(void) { //fprintf(stderr, "appio: constructor started\n"); int version = PAPI_library_init (PAPI_VER_CURRENT); if (version != PAPI_VER_CURRENT) { fprintf(stderr, "PAPI_library_init version mismatch\n"); exit(1); } else { fprintf(stderr, "appio: PAPI library initialized\n"); } /* Create the Event Set */ if 
(PAPI_create_eventset(&EventSet) != PAPI_OK) { fprintf(stderr, "Error creating event set\n"); exit(2); } int retval; int e; int event_code; for (e=0; e * < your email address > * BGPM / CNKunit component * * Tested version of bgpm (early access) * * @brief * This file has the source code for a component that enables PAPI-C to * access hardware monitoring counters for BG/Q through the bgpm library. */ #include "linux-CNKunit.h" /* Declare our vector in advance */ papi_vector_t _CNKunit_vector; /***************************************************************************** ******************* BEGIN PAPI's COMPONENT REQUIRED FUNCTIONS ************* *****************************************************************************/ /* * This is called whenever a thread is initialized */ int CNKUNIT_init_thread( hwd_context_t * ctx ) { #ifdef DEBUG_BGQ printf( "CNKUNIT_init_thread\n" ); #endif ( void ) ctx; return PAPI_OK; } /* Initialize hardware counters, setup the function vector table * and get hardware information, this routine is called when the * PAPI process is initialized (IE PAPI_library_init) */ int CNKUNIT_init_component( int cidx ) { #ifdef DEBUG_BGQ printf( "CNKUNIT_init_component\n" ); #endif _CNKunit_vector.cmp_info.CmpIdx = cidx; #ifdef DEBUG_BGQ printf( "CNKUNIT_init_component cidx = %d\n", cidx ); #endif return ( PAPI_OK ); } /* * Control of counters (Reading/Writing/Starting/Stopping/Setup) * functions */ int CNKUNIT_init_control_state( hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "CNKUNIT_init_control_state\n" ); #endif int retval; CNKUNIT_control_state_t * this_state = ( CNKUNIT_control_state_t * ) ptr; this_state->EventGroup = Bgpm_CreateEventSet(); retval = _check_BGPM_error( this_state->EventGroup, "Bgpm_CreateEventSet" ); if ( retval < 0 ) return retval; return PAPI_OK; } /* * */ int CNKUNIT_start( hwd_context_t * ctx, hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "CNKUNIT_start\n" ); #endif ( void ) ctx; int retval; 
CNKUNIT_control_state_t * this_state = ( CNKUNIT_control_state_t * ) ptr; retval = Bgpm_Apply( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_Apply" ); if ( retval < 0 ) return retval; /* Bgpm_Apply() does an implicit reset; hence no need to use Bgpm_ResetStart */ retval = Bgpm_Start( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_Start" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * */ int CNKUNIT_stop( hwd_context_t * ctx, hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "CNKUNIT_stop\n" ); #endif ( void ) ctx; int retval; CNKUNIT_control_state_t * this_state = ( CNKUNIT_control_state_t * ) ptr; retval = Bgpm_Stop( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_Stop" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * */ int CNKUNIT_read( hwd_context_t * ctx, hwd_control_state_t * ptr, long_long ** events, int flags ) { #ifdef DEBUG_BGQ printf( "CNKUNIT_read\n" ); #endif ( void ) ctx; ( void ) flags; int i, numEvts; CNKUNIT_control_state_t * this_state = ( CNKUNIT_control_state_t * ) ptr; numEvts = Bgpm_NumEvents( this_state->EventGroup ); if ( numEvts == 0 ) { #ifdef DEBUG_BGPM printf ("Error: ret value is %d for BGPM API function Bgpm_NumEvents.\n", numEvts ); #endif //return ( EXIT_FAILURE ); } for ( i = 0; i < numEvts; i++ ) this_state->counts[i] = _common_getEventValue( i, this_state->EventGroup ); *events = this_state->counts; return ( PAPI_OK ); } /* * */ int CNKUNIT_shutdown_thread( hwd_context_t * ctx ) { #ifdef DEBUG_BGQ printf( "CNKUNIT_shutdown_thread\n" ); #endif ( void ) ctx; return ( PAPI_OK ); } /* This function sets various options in the component * The valid codes being passed in are PAPI_SET_DEFDOM, * PAPI_SET_DOMAIN, PAPI_SETDEFGRN, PAPI_SET_GRANUL * and PAPI_SET_INHERIT */ int CNKUNIT_ctl( hwd_context_t * ctx, int code, _papi_int_option_t * option ) { #ifdef DEBUG_BGQ printf( "CNKUNIT_ctl\n" ); #endif ( void ) ctx; ( void ) code; ( void ) option; 
return ( PAPI_OK ); } /* * */ int CNKUNIT_update_control_state( hwd_control_state_t * ptr, NativeInfo_t * native, int count, hwd_context_t * ctx ) { #ifdef DEBUG_BGQ printf( "CNKUNIT_update_control_state: count = %d\n", count ); #endif ( void ) ctx; int retval, index, i; CNKUNIT_control_state_t * this_state = ( CNKUNIT_control_state_t * ) ptr; // Delete and re-create BGPM eventset retval = _common_deleteRecreate( &this_state->EventGroup ); if ( retval < 0 ) return retval; // otherwise, add the events to the eventset for ( i = 0; i < count; i++ ) { index = ( native[i].ni_event ) + OFFSET; native[i].ni_position = i; #ifdef DEBUG_BGQ printf("CNKUNIT_update_control_state: ADD event: i = %d, index = %d\n", i, index ); #endif /* Add events to the BGPM eventGroup */ retval = Bgpm_AddEvent( this_state->EventGroup, index ); retval = _check_BGPM_error( retval, "Bgpm_AddEvent" ); if ( retval < 0 ) return retval; } return ( PAPI_OK ); } /* * This function has to set the bits needed to count different domains * In particular: PAPI_DOM_USER, PAPI_DOM_KERNEL PAPI_DOM_OTHER * By default return PAPI_EINVAL if none of those are specified * and PAPI_OK with success * PAPI_DOM_USER is only user context is counted * PAPI_DOM_KERNEL is only the Kernel/OS context is counted * PAPI_DOM_OTHER is Exception/transient mode (like user TLB misses) * PAPI_DOM_ALL is all of the domains */ int CNKUNIT_set_domain( hwd_control_state_t * cntrl, int domain ) { #ifdef DEBUG_BGQ printf( "CNKUNIT_set_domain\n" ); #endif int found = 0; ( void ) cntrl; if ( PAPI_DOM_USER & domain ) found = 1; if ( PAPI_DOM_KERNEL & domain ) found = 1; if ( PAPI_DOM_OTHER & domain ) found = 1; if ( !found ) return ( PAPI_EINVAL ); return ( PAPI_OK ); } /* * */ int CNKUNIT_reset( hwd_context_t * ctx, hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "CNKUNIT_reset\n" ); #endif ( void ) ctx; int retval; CNKUNIT_control_state_t * this_state = ( CNKUNIT_control_state_t * ) ptr; /* we can't simply call Bgpm_Reset() since 
PAPI doesn't have the restriction that an EventSet has to be stopped before resetting is possible. However, BGPM does have this restriction. Hence we need to stop, reset and start */ retval = Bgpm_Stop( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_Stop" ); if ( retval < 0 ) return retval; retval = Bgpm_ResetStart( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_ResetStart" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * PAPI Cleanup Eventset * * Destroy and re-create the BGPM / CNKunit EventSet */ int CNKUNIT_cleanup_eventset( hwd_control_state_t * ctrl ) { #ifdef DEBUG_BGQ printf( "CNKUNIT_cleanup_eventset\n" ); #endif int retval; CNKUNIT_control_state_t * this_state = ( CNKUNIT_control_state_t * ) ctrl; // create a new empty bgpm eventset // reason: bgpm doesn't permit to remove events from an eventset; // hence we delete the old eventset and create a new one retval = _common_deleteRecreate( &this_state->EventGroup ); // HJ try to use delete() only if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * Native Event functions */ int CNKUNIT_ntv_enum_events( unsigned int *EventCode, int modifier ) { #ifdef DEBUG_BGQ // printf( "CNKUNIT_ntv_enum_events\n" ); #endif switch ( modifier ) { case PAPI_ENUM_FIRST: *EventCode = 0; return ( PAPI_OK ); break; case PAPI_ENUM_EVENTS: { int index = ( *EventCode ) + OFFSET; if ( index < CNKUNIT_MAX_COUNTERS ) { *EventCode = *EventCode + 1; return ( PAPI_OK ); } else return ( PAPI_ENOEVNT ); break; } default: return ( PAPI_EINVAL ); } return ( PAPI_EINVAL ); } /* * */ int CNKUNIT_ntv_name_to_code( const char *name, unsigned int *event_code ) { #ifdef DEBUG_BGQ printf( "CNKUNIT_ntv_name_to_code\n" ); #endif int ret; /* Return event id matching a given event label string */ ret = Bgpm_GetEventIdFromLabel ( name ); if ( ret <= 0 ) { #ifdef DEBUG_BGPM printf ("Error: ret value is %d for BGPM API function '%s'.\n", ret, "Bgpm_GetEventIdFromLabel" ); #endif return 
PAPI_ENOEVNT; } else if ( ret < OFFSET || ret > CNKUNIT_MAX_COUNTERS ) // not a CNKUnit event return PAPI_ENOEVNT; else *event_code = ( ret - OFFSET ) ; return PAPI_OK; } /* * */ int CNKUNIT_ntv_code_to_name( unsigned int EventCode, char *name, int len ) { #ifdef DEBUG_BGQ //printf( "CNKUNIT_ntv_code_to_name\n" ); #endif int index; index = ( EventCode ) + OFFSET; if ( index >= MAX_COUNTERS ) return PAPI_ENOEVNT; strncpy( name, Bgpm_GetEventIdLabel( index ), len ); //printf("----%s----\n", name); if ( name == NULL ) { #ifdef DEBUG_BGPM printf ("Error: ret value is NULL for BGPM API function Bgpm_GetEventIdLabel.\n" ); #endif return PAPI_ENOEVNT; } return ( PAPI_OK ); } /* * */ int CNKUNIT_ntv_code_to_descr( unsigned int EventCode, char *name, int len ) { #ifdef DEBUG_BGQ //printf( "CNKUNIT_ntv_code_to_descr\n" ); #endif int retval, index; index = ( EventCode ) + OFFSET; retval = Bgpm_GetLongDesc( index, name, &len ); retval = _check_BGPM_error( retval, "Bgpm_GetLongDesc" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * */ int CNKUNIT_ntv_code_to_bits( unsigned int EventCode, hwd_register_t * bits ) { #ifdef DEBUG_BGQ printf( "CNKUNIT_ntv_code_to_bits\n" ); #endif ( void ) EventCode; ( void ) bits; return ( PAPI_OK ); } /* * */ papi_vector_t _CNKunit_vector = { .cmp_info = { /* default component information (unspecified values are initialized to 0) */ .name = "bgpm/CNKUnit", .short_name = "CNKUnit", .description = "Blue Gene/Q CNKUnit component", .num_native_events = CNKUNIT_MAX_COUNTERS-OFFSET+1, .num_cntrs = CNKUNIT_MAX_COUNTERS, .num_mpx_cntrs = CNKUNIT_MAX_COUNTERS, .default_domain = PAPI_DOM_USER, .available_domains = PAPI_DOM_USER | PAPI_DOM_KERNEL, .default_granularity = PAPI_GRN_THR, .available_granularities = PAPI_GRN_THR, .hardware_intr_sig = PAPI_INT_SIGNAL, .hardware_intr = 1, .kernel_multiplex = 0, /* component specific cmp_info initializations */ .fast_real_timer = 0, .fast_virtual_timer = 0, .attach = 0, .attach_must_ptrace = 0, } , /* 
sizes of framework-opaque component-private structures */ .size = { .context = sizeof ( CNKUNIT_context_t ), .control_state = sizeof ( CNKUNIT_control_state_t ), .reg_value = sizeof ( CNKUNIT_register_t ), .reg_alloc = sizeof ( CNKUNIT_reg_alloc_t ), } , /* function pointers in this component */ .init_thread = CNKUNIT_init_thread, .init_component = CNKUNIT_init_component, .init_control_state = CNKUNIT_init_control_state, .start = CNKUNIT_start, .stop = CNKUNIT_stop, .read = CNKUNIT_read, .shutdown_thread = CNKUNIT_shutdown_thread, .cleanup_eventset = CNKUNIT_cleanup_eventset, .ctl = CNKUNIT_ctl, .update_control_state = CNKUNIT_update_control_state, .set_domain = CNKUNIT_set_domain, .reset = CNKUNIT_reset, .ntv_name_to_code = CNKUNIT_ntv_name_to_code, .ntv_enum_events = CNKUNIT_ntv_enum_events, .ntv_code_to_name = CNKUNIT_ntv_code_to_name, .ntv_code_to_descr = CNKUNIT_ntv_code_to_descr, .ntv_code_to_bits = CNKUNIT_ntv_code_to_bits }; papi-papi-7-2-0-t/src/components/bgpm/CNKunit/linux-CNKunit.h000066400000000000000000000031061502707512200236750ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file linux-CNKunit.h * @author Heike Jagode * jagode@eecs.utk.edu * Mods: < your name here > * < your email address > * BGPM / CNKunit component * * Tested version of bgpm (early access) * * @brief * This file has the source code for a component that enables PAPI-C to * access hardware monitoring counters for BG/Q through the bgpm library. 
 */

#ifndef _PAPI_CNKUNIT_H
#define _PAPI_CNKUNIT_H

#include "papi.h"
#include "papi_internal.h"
#include "papi_vector.h"
#include "papi_memory.h"
#include "extras.h"
#include "../../../linux-bgq-common.h"

/************************* DEFINES SECTION ***********************************
 *******************************************************************************/
/* this number assumes that there will never be more events than indicated */
#define CNKUNIT_MAX_COUNTERS PEVT_CNKUNIT_LAST_EVENT
#define OFFSET ( PEVT_NWUNIT_LAST_EVENT + 1 )

/** Structure that stores private information of each event */
typedef struct CNKUNIT_register {
	unsigned int selector;	/* Signifies which counter slot is being used */
				/* Indexed from 1 as 0 has a special meaning */
} CNKUNIT_register_t;

typedef struct CNKUNIT_reg_alloc {
	CNKUNIT_register_t ra_bits;
} CNKUNIT_reg_alloc_t;

typedef struct CNKUNIT_control_state {
	int EventGroup;
	long long counts[CNKUNIT_MAX_COUNTERS];
} CNKUNIT_control_state_t;

typedef struct CNKUNIT_context {
	CNKUNIT_control_state_t state;
} CNKUNIT_context_t;

#endif /* _PAPI_CNKUNIT_H */

papi-papi-7-2-0-t/src/components/bgpm/IOunit/
papi-papi-7-2-0-t/src/components/bgpm/IOunit/Rules.IOunit
# $Id$

COMPSRCS += components/bgpm/IOunit/linux-IOunit.c
COMPOBJS += linux-IOunit.o

linux-IOunit.o: components/bgpm/IOunit/linux-IOunit.c components/bgpm/IOunit/linux-IOunit.h $(HEADERS)
	$(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/bgpm/IOunit/linux-IOunit.c -o linux-IOunit.o

papi-papi-7-2-0-t/src/components/bgpm/IOunit/linux-IOunit.c
/****************************/
/* THIS IS OPEN SOURCE CODE */
/****************************/
/**
 * @file    linux-IOunit.c
 * @author  Heike Jagode
 *          jagode@eecs.utk.edu
 * Mods:    < your name here >
 *          < your email address >
 *
BGPM / IOunit component * * Tested version of bgpm (early access) * * @brief * This file has the source code for a component that enables PAPI-C to * access hardware monitoring counters for BG/Q through the bgpm library. */ #include "linux-IOunit.h" /* Declare our vector in advance */ papi_vector_t _IOunit_vector; /* prototypes */ void user_signal_handler_IOUNIT( int hEvtSet, uint64_t address, uint64_t ovfVector, const ucontext_t *pContext ); /***************************************************************************** ******************* BEGIN PAPI's COMPONENT REQUIRED FUNCTIONS ************* *****************************************************************************/ /* * This is called whenever a thread is initialized */ int IOUNIT_init_thread( hwd_context_t * ctx ) { #ifdef DEBUG_BGQ printf( "IOUNIT_init_thread\n" ); #endif ( void ) ctx; return PAPI_OK; } /* Initialize hardware counters, setup the function vector table * and get hardware information, this routine is called when the * PAPI process is initialized (IE PAPI_library_init) */ int IOUNIT_init_component( int cidx ) { #ifdef DEBUG_BGQ printf( "IOUNIT_init_component\n" ); #endif _IOunit_vector.cmp_info.CmpIdx = cidx; #ifdef DEBUG_BGQ printf( "IOUNIT_init_component cidx = %d\n", cidx ); #endif return ( PAPI_OK ); } /* * Control of counters (Reading/Writing/Starting/Stopping/Setup) * functions */ int IOUNIT_init_control_state( hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "IOUNIT_init_control_state\n" ); #endif int retval; IOUNIT_control_state_t * this_state = ( IOUNIT_control_state_t * ) ptr; this_state->EventGroup = Bgpm_CreateEventSet(); retval = _check_BGPM_error( this_state->EventGroup, "Bgpm_CreateEventSet" ); if ( retval < 0 ) return retval; // initialize overflow flag to OFF (0) this_state->overflow = 0; this_state->overflow_count = 0; return PAPI_OK; } /* * */ int IOUNIT_start( hwd_context_t * ctx, hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "IOUNIT_start\n" ); #endif ( void 
) ctx; int retval; IOUNIT_control_state_t * this_state = ( IOUNIT_control_state_t * ) ptr; retval = Bgpm_ResetStart( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_ResetStart" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * */ int IOUNIT_stop( hwd_context_t * ctx, hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "IOUNIT_stop\n" ); #endif ( void ) ctx; int retval; IOUNIT_control_state_t * this_state = ( IOUNIT_control_state_t * ) ptr; retval = Bgpm_Stop( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_Stop" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * */ int IOUNIT_read( hwd_context_t * ctx, hwd_control_state_t * ptr, long_long ** events, int flags ) { #ifdef DEBUG_BGQ printf( "IOUNIT_read\n" ); #endif ( void ) ctx; ( void ) flags; int i, numEvts; IOUNIT_control_state_t * this_state = ( IOUNIT_control_state_t * ) ptr; numEvts = Bgpm_NumEvents( this_state->EventGroup ); if ( numEvts == 0 ) { #ifdef DEBUG_BGPM printf ("Error: ret value is %d for BGPM API function Bgpm_NumEvents.\n", numEvts ); #endif //return ( EXIT_FAILURE ); } for ( i = 0; i < numEvts; i++ ) this_state->counts[i] = _common_getEventValue( i, this_state->EventGroup ); *events = this_state->counts; return ( PAPI_OK ); } /* * */ int IOUNIT_shutdown_thread( hwd_context_t * ctx ) { #ifdef DEBUG_BGQ printf( "IOUNIT_shutdown_thread\n" ); #endif ( void ) ctx; return ( PAPI_OK ); } /* * user_signal_handler * * This function is used when hardware overflows are working or when * software overflows are forced */ void user_signal_handler_IOUNIT( int hEvtSet, uint64_t address, uint64_t ovfVector, const ucontext_t *pContext ) { #ifdef DEBUG_BGQ printf( "user_signal_handler_IOUNIT\n" ); #endif ( void ) address; int retval; unsigned i; int isHardware = 1; int cidx = _IOunit_vector.cmp_info.CmpIdx; long_long overflow_bit = 0; vptr_t address1; _papi_hwi_context_t ctx; ctx.ucontext = ( hwd_ucontext_t * ) pContext; ThreadInfo_t *thread = 
_papi_hwi_lookup_thread( 0 ); EventSetInfo_t *ESI; ESI = thread->running_eventset[cidx]; // Get the indices of all events which have overflowed. unsigned ovfIdxs[BGPM_MAX_OVERFLOW_EVENTS]; unsigned len = BGPM_MAX_OVERFLOW_EVENTS; retval = Bgpm_GetOverflowEventIndices( hEvtSet, ovfVector, ovfIdxs, &len ); if ( retval < 0 ) { #ifdef DEBUG_BGPM printf ( "Error: ret value is %d for BGPM API function Bgpm_GetOverflowEventIndices.\n", retval ); #endif return; } if ( thread == NULL ) { PAPIERROR( "thread == NULL in user_signal_handler!" ); return; } if ( ESI == NULL ) { PAPIERROR( "ESI == NULL in user_signal_handler!"); return; } if ( ESI->overflow.flags == 0 ) { PAPIERROR( "ESI->overflow.flags == 0 in user_signal_handler!"); return; } for ( i = 0; i < len; i++ ) { uint64_t hProf; Bgpm_GetEventUser1( hEvtSet, ovfIdxs[i], &hProf ); if ( hProf ) { overflow_bit ^= 1 << ovfIdxs[i]; break; } } if ( ESI->overflow.flags & PAPI_OVERFLOW_FORCE_SW ) { #ifdef DEBUG_BGQ printf("OVERFLOW_SOFTWARE\n"); #endif address1 = GET_OVERFLOW_ADDRESS( ctx ); _papi_hwi_dispatch_overflow_signal( ( void * ) &ctx, address1, NULL, 0, 0, &thread, cidx ); return; } else if ( ESI->overflow.flags & PAPI_OVERFLOW_HARDWARE ) { #ifdef DEBUG_BGQ printf("OVERFLOW_HARDWARE\n"); #endif address1 = GET_OVERFLOW_ADDRESS( ctx ); _papi_hwi_dispatch_overflow_signal( ( void * ) &ctx, address1, &isHardware, overflow_bit, 0, &thread, cidx ); } else { #ifdef DEBUG_BGQ printf("OVERFLOW_NONE\n"); #endif PAPIERROR( "ESI->overflow.flags is set to something other than PAPI_OVERFLOW_HARDWARE or PAPI_OVERFLOW_FORCE_SW (%#x)", thread->running_eventset[cidx]->overflow.flags); } } /* * Set Overflow * * This is commented out in BG/L/P - need to explore and complete... * However, with true 64-bit counters in BG/Q and all counters for PAPI * always starting from a true zero (we don't allow write...), the possibility * for overflow is remote at best... 
*/ int IOUNIT_set_overflow( EventSetInfo_t * ESI, int EventIndex, int threshold ) { #ifdef DEBUG_BGQ printf("BEGIN IOUNIT_set_overflow\n"); #endif IOUNIT_control_state_t * this_state = ( IOUNIT_control_state_t * ) ESI->ctl_state; int retval; int evt_idx; evt_idx = ESI->EventInfoArray[EventIndex].pos[0]; SUBDBG( "Hardware counter %d (vs %d) used in overflow, threshold %d\n", evt_idx, EventIndex, threshold ); #ifdef DEBUG_BGQ printf( "Hardware counter %d (vs %d) used in overflow, threshold %d\n", evt_idx, EventIndex, threshold ); #endif /* If this counter isn't set to overflow, it's an error */ if ( threshold == 0 ) { /* Remove the signal handler */ retval = _papi_hwi_stop_signal( _IOunit_vector.cmp_info.hardware_intr_sig ); if ( retval != PAPI_OK ) return ( retval ); } else { this_state->overflow = 1; this_state->overflow_count++; this_state->overflow_list[this_state->overflow_count-1].threshold = threshold; this_state->overflow_list[this_state->overflow_count-1].EventIndex = evt_idx; #ifdef DEBUG_BGQ printf( "IOUNIT_set_overflow: Enable the signal handler\n" ); #endif /* Enable the signal handler */ retval = _papi_hwi_start_signal( _IOunit_vector.cmp_info.hardware_intr_sig, NEED_CONTEXT, _IOunit_vector.cmp_info.CmpIdx ); if ( retval != PAPI_OK ) return ( retval ); retval = _common_set_overflow_BGPM( this_state->EventGroup, this_state->overflow_list[this_state->overflow_count-1].EventIndex, this_state->overflow_list[this_state->overflow_count-1].threshold, user_signal_handler_IOUNIT ); if ( retval < 0 ) return retval; } return ( PAPI_OK ); } /* This function sets various options in the component * The valid codes being passed in are PAPI_SET_DEFDOM, * PAPI_SET_DOMAIN, PAPI_SETDEFGRN, PAPI_SET_GRANUL * and PAPI_SET_INHERIT */ int IOUNIT_ctl( hwd_context_t * ctx, int code, _papi_int_option_t * option ) { #ifdef DEBUG_BGQ printf( "IOUNIT_ctl\n" ); #endif ( void ) ctx; ( void ) code; ( void ) option; return ( PAPI_OK ); } /* * */ int IOUNIT_update_control_state( 
hwd_control_state_t * ptr, NativeInfo_t * native, int count, hwd_context_t * ctx ) { #ifdef DEBUG_BGQ printf( "IOUNIT_update_control_state: count = %d\n", count ); #endif ( void ) ctx; int retval, index, i, k; IOUNIT_control_state_t * this_state = ( IOUNIT_control_state_t * ) ptr; // Delete and re-create BGPM eventset retval = _common_deleteRecreate( &this_state->EventGroup ); if ( retval < 0 ) return retval; #ifdef DEBUG_BGQ printf( "IOUNIT_update_control_state: EventGroup=%d, overflow = %d\n", this_state->EventGroup, this_state->overflow ); #endif // otherwise, add the events to the eventset for ( i = 0; i < count; i++ ) { index = ( native[i].ni_event ) + OFFSET; native[i].ni_position = i; #ifdef DEBUG_BGQ printf("IOUNIT_update_control_state: ADD event: i = %d, index = %d\n", i, index ); #endif /* Add events to the BGPM eventGroup */ retval = Bgpm_AddEvent( this_state->EventGroup, index ); retval = _check_BGPM_error( retval, "Bgpm_AddEvent" ); if ( retval < 0 ) return retval; } // since update_control_state trashes overflow settings, this puts things // back into balance for BGPM if ( 1 == this_state->overflow ) { for ( k = 0; k < this_state->overflow_count; k++ ) { retval = _common_set_overflow_BGPM( this_state->EventGroup, this_state->overflow_list[k].EventIndex, this_state->overflow_list[k].threshold, user_signal_handler_IOUNIT ); if ( retval < 0 ) return retval; } } return ( PAPI_OK ); } /* * This function has to set the bits needed to count different domains * In particular: PAPI_DOM_USER, PAPI_DOM_KERNEL PAPI_DOM_OTHER * By default return PAPI_EINVAL if none of those are specified * and PAPI_OK with success * PAPI_DOM_USER is only user context is counted * PAPI_DOM_KERNEL is only the Kernel/OS context is counted * PAPI_DOM_OTHER is Exception/transient mode (like user TLB misses) * PAPI_DOM_ALL is all of the domains */ int IOUNIT_set_domain( hwd_control_state_t * cntrl, int domain ) { #ifdef DEBUG_BGQ printf( "IOUNIT_set_domain\n" ); #endif int found = 0; ( 
void ) cntrl; if ( PAPI_DOM_USER & domain ) found = 1; if ( PAPI_DOM_KERNEL & domain ) found = 1; if ( PAPI_DOM_OTHER & domain ) found = 1; if ( !found ) return ( PAPI_EINVAL ); return ( PAPI_OK ); } /* * */ int IOUNIT_reset( hwd_context_t * ctx, hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "IOUNIT_reset\n" ); #endif ( void ) ctx; int retval; IOUNIT_control_state_t * this_state = ( IOUNIT_control_state_t * ) ptr; /* we can't simply call Bgpm_Reset() since PAPI doesn't have the restriction that an EventSet has to be stopped before resetting is possible. However, BGPM does have this restriction. Hence we need to stop, reset and start */ retval = Bgpm_Stop( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_Stop" ); if ( retval < 0 ) return retval; retval = Bgpm_ResetStart( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_ResetStart" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * PAPI Cleanup Eventset * * Destroy and re-create the BGPM / IOunit EventSet */ int IOUNIT_cleanup_eventset( hwd_control_state_t * ctrl ) { #ifdef DEBUG_BGQ printf( "IOUNIT_cleanup_eventset\n" ); #endif int retval; IOUNIT_control_state_t * this_state = ( IOUNIT_control_state_t * ) ctrl; // create a new empty bgpm eventset // reason: bgpm doesn't permit to remove events from an eventset; // hence we delete the old eventset and create a new one retval = _common_deleteRecreate( &this_state->EventGroup ); // HJ try to use delete() only if ( retval < 0 ) return retval; // set overflow flag to OFF (0) this_state->overflow = 0; this_state->overflow_count = 0; return ( PAPI_OK ); } /* * Native Event functions */ int IOUNIT_ntv_enum_events( unsigned int *EventCode, int modifier ) { #ifdef DEBUG_BGQ //printf( "IOUNIT_ntv_enum_events\n" ); #endif switch ( modifier ) { case PAPI_ENUM_FIRST: *EventCode = 0; return ( PAPI_OK ); break; case PAPI_ENUM_EVENTS: { int index = ( *EventCode ) + OFFSET; if ( index < IOUNIT_MAX_EVENTS ) { *EventCode = 
*EventCode + 1; return ( PAPI_OK ); } else return ( PAPI_ENOEVNT ); break; } default: return ( PAPI_EINVAL ); } return ( PAPI_EINVAL ); } /* * */ int IOUNIT_ntv_name_to_code( const char *name, unsigned int *event_code ) { #ifdef DEBUG_BGQ printf( "IOUNIT_ntv_name_to_code\n" ); #endif int ret; /* Return event id matching a given event label string */ ret = Bgpm_GetEventIdFromLabel ( name ); if ( ret <= 0 ) { #ifdef DEBUG_BGPM printf ("Error: ret value is %d for BGPM API function '%s'.\n", ret, "Bgpm_GetEventIdFromLabel" ); #endif return PAPI_ENOEVNT; } else if ( ret < OFFSET || ret > IOUNIT_MAX_EVENTS ) // not an IOUnit event return PAPI_ENOEVNT; else *event_code = ( ret - OFFSET ) ; return PAPI_OK; } /* * */ int IOUNIT_ntv_code_to_name( unsigned int EventCode, char *name, int len ) { #ifdef DEBUG_BGQ //printf( "IOUNIT_ntv_code_to_name\n" ); #endif int index; index = ( EventCode ) + OFFSET; if ( index >= MAX_COUNTERS ) return PAPI_ENOEVNT; strncpy( name, Bgpm_GetEventIdLabel( index ), len ); if ( name == NULL ) { #ifdef DEBUG_BGPM printf ("Error: ret value is NULL for BGPM API function Bgpm_GetEventIdLabel.\n" ); #endif return PAPI_ENOEVNT; } return ( PAPI_OK ); } /* * */ int IOUNIT_ntv_code_to_descr( unsigned int EventCode, char *name, int len ) { #ifdef DEBUG_BGQ //printf( "IOUNIT_ntv_code_to_descr\n" ); #endif int retval, index; index = ( EventCode ) + OFFSET; retval = Bgpm_GetLongDesc( index, name, &len ); retval = _check_BGPM_error( retval, "Bgpm_GetLongDesc" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * */ int IOUNIT_ntv_code_to_bits( unsigned int EventCode, hwd_register_t * bits ) { #ifdef DEBUG_BGQ printf( "IOUNIT_ntv_code_to_bits\n" ); #endif ( void ) EventCode; ( void ) bits; return ( PAPI_OK ); } /* * */ papi_vector_t _IOunit_vector = { .cmp_info = { /* default component information (unspecified values are initialized to 0) */ .name = "bgpm/IOUnit", .short_name = "IOUnit", .description = "Blue Gene/Q IOUnit component", .num_native_events 
= IOUNIT_MAX_EVENTS-OFFSET+1, .num_cntrs = IOUNIT_MAX_COUNTERS, .num_mpx_cntrs = IOUNIT_MAX_COUNTERS, .default_domain = PAPI_DOM_USER, .available_domains = PAPI_DOM_USER | PAPI_DOM_KERNEL, .default_granularity = PAPI_GRN_THR, .available_granularities = PAPI_GRN_THR, .hardware_intr_sig = PAPI_INT_SIGNAL, .hardware_intr = 1, .kernel_multiplex = 0, /* component specific cmp_info initializations */ .fast_real_timer = 0, .fast_virtual_timer = 0, .attach = 0, .attach_must_ptrace = 0, } , /* sizes of framework-opaque component-private structures */ .size = { .context = sizeof ( IOUNIT_context_t ), .control_state = sizeof ( IOUNIT_control_state_t ), .reg_value = sizeof ( IOUNIT_register_t ), .reg_alloc = sizeof ( IOUNIT_reg_alloc_t ), } , /* function pointers in this component */ .init_thread = IOUNIT_init_thread, .init_component = IOUNIT_init_component, .init_control_state = IOUNIT_init_control_state, .start = IOUNIT_start, .stop = IOUNIT_stop, .read = IOUNIT_read, .shutdown_thread = IOUNIT_shutdown_thread, .set_overflow = IOUNIT_set_overflow, .cleanup_eventset = IOUNIT_cleanup_eventset, .ctl = IOUNIT_ctl, .update_control_state = IOUNIT_update_control_state, .set_domain = IOUNIT_set_domain, .reset = IOUNIT_reset, .ntv_name_to_code = IOUNIT_ntv_name_to_code, .ntv_enum_events = IOUNIT_ntv_enum_events, .ntv_code_to_name = IOUNIT_ntv_code_to_name, .ntv_code_to_descr = IOUNIT_ntv_code_to_descr, .ntv_code_to_bits = IOUNIT_ntv_code_to_bits }; papi-papi-7-2-0-t/src/components/bgpm/IOunit/linux-IOunit.h000066400000000000000000000034501502707512200234670ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file linux-IOunit.h * @author Heike Jagode * jagode@eecs.utk.edu * Mods: < your name here > * < your email address > * BGPM / IOunit component * * Tested version of bgpm (early access) * * @brief * This file has the source code for a component that enables PAPI-C to * access hardware monitoring counters 
for BG/Q through the bgpm library.
 */

#ifndef _PAPI_IOUNIT_H
#define _PAPI_IOUNIT_H

#include "papi.h"
#include "papi_internal.h"
#include "papi_vector.h"
#include "papi_memory.h"
#include "extras.h"
#include "../../../linux-bgq-common.h"

/************************* DEFINES SECTION ***********************************
 *******************************************************************************/
/* this number assumes that there will never be more events than indicated */
#define IOUNIT_MAX_COUNTERS UPC_C_IOSRAM_NUM_COUNTERS
#define IOUNIT_MAX_EVENTS PEVT_IOUNIT_LAST_EVENT
#define OFFSET ( PEVT_L2UNIT_LAST_EVENT + 1 )

/** Structure that stores private information of each event */
typedef struct IOUNIT_register {
	unsigned int selector;	/* Signifies which counter slot is being used */
				/* Indexed from 1 as 0 has a special meaning */
} IOUNIT_register_t;

typedef struct IOUNIT_reg_alloc {
	IOUNIT_register_t ra_bits;
} IOUNIT_reg_alloc_t;

typedef struct IOUNIT_overflow {
	int threshold;
	int EventIndex;
} IOUNIT_overflow_t;

typedef struct IOUNIT_control_state {
	int EventGroup;
	int overflow;		// overflow enable
	int overflow_count;
	IOUNIT_overflow_t overflow_list[512];
	long long counts[IOUNIT_MAX_COUNTERS];
} IOUNIT_control_state_t;

typedef struct IOUNIT_context {
	IOUNIT_control_state_t state;
} IOUNIT_context_t;

#endif /* _PAPI_IOUNIT_H */

papi-papi-7-2-0-t/src/components/bgpm/L2unit/
papi-papi-7-2-0-t/src/components/bgpm/L2unit/Rules.L2unit
# $Id$

COMPSRCS += components/bgpm/L2unit/linux-L2unit.c
COMPOBJS += linux-L2unit.o

linux-L2unit.o: components/bgpm/L2unit/linux-L2unit.c components/bgpm/L2unit/linux-L2unit.h $(HEADERS)
	$(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/bgpm/L2unit/linux-L2unit.c -o linux-L2unit.o
papi-papi-7-2-0-t/src/components/bgpm/L2unit/linux-L2unit.c000066400000000000000000000434101502707512200233760ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file linux-L2unit.c * @author Heike Jagode * jagode@eecs.utk.edu * Mods: < your name here > * < your email address > * BGPM / L2unit component * * Tested version of bgpm (early access) * * @brief * This file has the source code for a component that enables PAPI-C to * access hardware monitoring counters for BG/Q through the bgpm library. */ #include "linux-L2unit.h" /* Declare our vector in advance */ papi_vector_t _L2unit_vector; /* prototypes */ void user_signal_handler_L2UNIT( int hEvtSet, uint64_t address, uint64_t ovfVector, const ucontext_t *pContext ); /***************************************************************************** ******************* BEGIN PAPI's COMPONENT REQUIRED FUNCTIONS ************* *****************************************************************************/ /* * This is called whenever a thread is initialized */ int L2UNIT_init_thread( hwd_context_t * ctx ) { #ifdef DEBUG_BGQ printf( "L2UNIT_init_thread\n" ); #endif ( void ) ctx; return PAPI_OK; } /* Initialize hardware counters, setup the function vector table * and get hardware information, this routine is called when the * PAPI process is initialized (IE PAPI_library_init) */ int L2UNIT_init_component( int cidx ) { #ifdef DEBUG_BGQ printf( "L2UNIT_init_component\n" ); #endif _L2unit_vector.cmp_info.CmpIdx = cidx; #ifdef DEBUG_BGQ printf( "L2UNIT_init_component cidx = %d\n", cidx ); #endif return ( PAPI_OK ); } /* * Control of counters (Reading/Writing/Starting/Stopping/Setup) * functions */ int L2UNIT_init_control_state( hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "L2UNIT_init_control_state\n" ); #endif int retval; L2UNIT_control_state_t * this_state = ( L2UNIT_control_state_t * ) ptr; this_state->EventGroup = Bgpm_CreateEventSet(); 
retval = _check_BGPM_error( this_state->EventGroup, "Bgpm_CreateEventSet" ); if ( retval < 0 ) return retval; // initialize overflow flag to OFF (0) this_state->overflow = 0; this_state->overflow_count = 0; // initialized BGPM eventGroup flag to NOT applied yet (0) this_state->bgpm_eventset_applied = 0; return PAPI_OK; } /* * */ int L2UNIT_start( hwd_context_t * ctx, hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "L2UNIT_start\n" ); #endif ( void ) ctx; int retval; L2UNIT_control_state_t * this_state = ( L2UNIT_control_state_t * ) ptr; retval = Bgpm_Apply( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_Apply" ); if ( retval < 0 ) return retval; // set flag to 1: BGPM eventGroup HAS BEEN applied this_state->bgpm_eventset_applied = 1; /* Bgpm_Apply() does an implicit reset; hence no need to use Bgpm_ResetStart */ retval = Bgpm_Start( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_Start" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * */ int L2UNIT_stop( hwd_context_t * ctx, hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "L2UNIT_stop\n" ); #endif ( void ) ctx; int retval; L2UNIT_control_state_t * this_state = ( L2UNIT_control_state_t * ) ptr; retval = Bgpm_Stop( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_Stop" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * */ int L2UNIT_read( hwd_context_t * ctx, hwd_control_state_t * ptr, long_long ** events, int flags ) { #ifdef DEBUG_BGQ printf( "L2UNIT_read\n" ); #endif ( void ) ctx; ( void ) flags; int i, numEvts; L2UNIT_control_state_t * this_state = ( L2UNIT_control_state_t * ) ptr; numEvts = Bgpm_NumEvents( this_state->EventGroup ); if ( numEvts == 0 ) { #ifdef DEBUG_BGPM printf ("Error: ret value is %d for BGPM API function Bgpm_NumEvents.\n", numEvts ); #endif //return ( EXIT_FAILURE ); } for ( i = 0; i < numEvts; i++ ) this_state->counters[i] = _common_getEventValue( i, this_state->EventGroup ); *events = 
this_state->counters; return ( PAPI_OK ); } /* * */ int L2UNIT_shutdown_thread( hwd_context_t * ctx ) { #ifdef DEBUG_BGQ printf( "L2UNIT_shutdown_thread\n" ); #endif ( void ) ctx; return ( PAPI_OK ); } /* * user_signal_handler * * This function is used when hardware overflows are working or when * software overflows are forced */ void user_signal_handler_L2UNIT( int hEvtSet, uint64_t address, uint64_t ovfVector, const ucontext_t *pContext ) { #ifdef DEBUG_BGQ printf( "user_signal_handler_L2UNIT\n" ); #endif ( void ) address; int retval; unsigned i; int isHardware = 1; int cidx = _L2unit_vector.cmp_info.CmpIdx; long_long overflow_bit = 0; vptr_t address1; _papi_hwi_context_t ctx; ctx.ucontext = ( hwd_ucontext_t * ) pContext; ThreadInfo_t *thread = _papi_hwi_lookup_thread( 0 ); EventSetInfo_t *ESI; ESI = thread->running_eventset[cidx]; // Get the indices of all events which have overflowed. unsigned ovfIdxs[BGPM_MAX_OVERFLOW_EVENTS]; unsigned len = BGPM_MAX_OVERFLOW_EVENTS; retval = Bgpm_GetOverflowEventIndices( hEvtSet, ovfVector, ovfIdxs, &len ); if ( retval < 0 ) { #ifdef DEBUG_BGPM printf ( "Error: ret value is %d for BGPM API function Bgpm_GetOverflowEventIndices.\n", retval ); #endif return; } if ( thread == NULL ) { PAPIERROR( "thread == NULL in user_signal_handler!" 
); return; } if ( ESI == NULL ) { PAPIERROR( "ESI == NULL in user_signal_handler!"); return; } if ( ESI->overflow.flags == 0 ) { PAPIERROR( "ESI->overflow.flags == 0 in user_signal_handler!"); return; } for ( i = 0; i < len; i++ ) { uint64_t hProf; Bgpm_GetEventUser1( hEvtSet, ovfIdxs[i], &hProf ); if ( hProf ) { overflow_bit ^= 1 << ovfIdxs[i]; break; } } if ( ESI->overflow.flags & PAPI_OVERFLOW_FORCE_SW ) { #ifdef DEBUG_BGQ printf("OVERFLOW_SOFTWARE\n"); #endif address1 = GET_OVERFLOW_ADDRESS( ctx ); _papi_hwi_dispatch_overflow_signal( ( void * ) &ctx, address1, NULL, 0, 0, &thread, cidx ); return; } else if ( ESI->overflow.flags & PAPI_OVERFLOW_HARDWARE ) { #ifdef DEBUG_BGQ printf("OVERFLOW_HARDWARE\n"); #endif address1 = GET_OVERFLOW_ADDRESS( ctx ); _papi_hwi_dispatch_overflow_signal( ( void * ) &ctx, address1, &isHardware, overflow_bit, 0, &thread, cidx ); } else { #ifdef DEBUG_BGQ printf("OVERFLOW_NONE\n"); #endif PAPIERROR( "ESI->overflow.flags is set to something other than PAPI_OVERFLOW_HARDWARE or PAPI_OVERFLOW_FORCE_SW (%#x)", thread->running_eventset[cidx]->overflow.flags); } } /* * Set Overflow * * This is commented out in BG/L/P - need to explore and complete... * However, with true 64-bit counters in BG/Q and all counters for PAPI * always starting from a true zero (we don't allow write...), the possibility * for overflow is remote at best... 
*/ int L2UNIT_set_overflow( EventSetInfo_t * ESI, int EventIndex, int threshold ) { #ifdef DEBUG_BGQ printf("BEGIN L2UNIT_set_overflow\n"); #endif L2UNIT_control_state_t * this_state = ( L2UNIT_control_state_t * ) ESI->ctl_state; int retval; int evt_idx; /* * In case an BGPM eventGroup HAS BEEN applied or attached before * overflow is set, delete the eventGroup and create an new empty one, * and rebuild as it was prior to deletion */ #ifdef DEBUG_BGQ printf( "L2UNIT_set_overflow: bgpm_eventset_applied = %d, threshold = %d\n", this_state->bgpm_eventset_applied, threshold ); #endif if ( 1 == this_state->bgpm_eventset_applied && 0 != threshold ) { retval = _common_deleteRecreate( &this_state->EventGroup ); if ( retval < 0 ) return retval; retval = _common_rebuildEventgroup( this_state->count, this_state->EventGroup_local, &this_state->EventGroup ); if ( retval < 0 ) return retval; /* set BGPM eventGroup flag back to NOT applied yet (0) * because the eventGroup has been recreated from scratch */ this_state->bgpm_eventset_applied = 0; } evt_idx = ESI->EventInfoArray[EventIndex].pos[0]; SUBDBG( "Hardware counter %d (vs %d) used in overflow, threshold %d\n", evt_idx, EventIndex, threshold ); #ifdef DEBUG_BGQ printf( "Hardware counter %d (vs %d) used in overflow, threshold %d\n", evt_idx, EventIndex, threshold ); #endif /* If this counter isn't set to overflow, it's an error */ if ( threshold == 0 ) { /* Remove the signal handler */ retval = _papi_hwi_stop_signal( _L2unit_vector.cmp_info.hardware_intr_sig ); if ( retval != PAPI_OK ) return ( retval ); } else { this_state->overflow = 1; this_state->overflow_count++; this_state->overflow_list[this_state->overflow_count-1].threshold = threshold; this_state->overflow_list[this_state->overflow_count-1].EventIndex = evt_idx; #ifdef DEBUG_BGQ printf( "L2UNIT_set_overflow: Enable the signal handler\n" ); #endif /* Enable the signal handler */ retval = _papi_hwi_start_signal( _L2unit_vector.cmp_info.hardware_intr_sig, NEED_CONTEXT, 
_L2unit_vector.cmp_info.CmpIdx ); if ( retval != PAPI_OK ) return ( retval ); retval = _common_set_overflow_BGPM( this_state->EventGroup, this_state->overflow_list[this_state->overflow_count-1].EventIndex, this_state->overflow_list[this_state->overflow_count-1].threshold, user_signal_handler_L2UNIT ); if ( retval < 0 ) return retval; } return ( PAPI_OK ); } /* This function sets various options in the component * The valid codes being passed in are PAPI_SET_DEFDOM, * PAPI_SET_DOMAIN, PAPI_SETDEFGRN, PAPI_SET_GRANUL * and PAPI_SET_INHERIT */ int L2UNIT_ctl( hwd_context_t * ctx, int code, _papi_int_option_t * option ) { #ifdef DEBUG_BGQ printf( "L2UNIT_ctl\n" ); #endif ( void ) ctx; ( void ) code; ( void ) option; return ( PAPI_OK ); } /* * PAPI Cleanup Eventset * Destroy and re-create the BGPM / L2unit EventSet */ int L2UNIT_cleanup_eventset( hwd_control_state_t * ctrl ) { #ifdef DEBUG_BGQ printf( "L2UNIT_cleanup_eventset\n" ); #endif int retval; L2UNIT_control_state_t * this_state = ( L2UNIT_control_state_t * ) ctrl; // create a new empty bgpm eventset // reason: bgpm doesn't permit to remove events from an eventset; // hence we delete the old eventset and create a new one retval = _common_deleteRecreate( &this_state->EventGroup ); if ( retval < 0 ) return retval; // set overflow flag to OFF (0) this_state->overflow = 0; this_state->overflow_count = 0; // set BGPM eventGroup flag back to NOT applied yet (0) this_state->bgpm_eventset_applied = 0; return ( PAPI_OK ); } /* * */ int L2UNIT_update_control_state( hwd_control_state_t * ptr, NativeInfo_t * native, int count, hwd_context_t * ctx ) { #ifdef DEBUG_BGQ printf( "L2UNIT_update_control_state: count = %d\n", count ); #endif ( void ) ctx; int retval, index, i, k; L2UNIT_control_state_t * this_state = ( L2UNIT_control_state_t * ) ptr; // Delete and re-create BGPM eventset retval = _common_deleteRecreate( &this_state->EventGroup ); if ( retval < 0 ) return retval; #ifdef DEBUG_BGQ printf( 
"L2UNIT_update_control_state: EventGroup=%d, overflow = %d\n", this_state->EventGroup, this_state->overflow ); #endif // otherwise, add the events to the eventset for ( i = 0; i < count; i++ ) { index = ( native[i].ni_event ) + OFFSET; native[i].ni_position = i; #ifdef DEBUG_BGQ printf("L2UNIT_update_control_state: ADD event: i = %d, index = %d\n", i, index ); #endif this_state->EventGroup_local[i] = index; /* Add events to the BGPM eventGroup */ retval = Bgpm_AddEvent( this_state->EventGroup, index ); retval = _check_BGPM_error( retval, "Bgpm_AddEvent" ); if ( retval < 0 ) return retval; } // store how many events we added to an EventSet this_state->count = count; // since update_control_state trashes overflow settings, this puts things // back into balance for BGPM if ( 1 == this_state->overflow ) { for ( k = 0; k < this_state->overflow_count; k++ ) { retval = _common_set_overflow_BGPM( this_state->EventGroup, this_state->overflow_list[k].EventIndex, this_state->overflow_list[k].threshold, user_signal_handler_L2UNIT ); if ( retval < 0 ) return retval; } } return ( PAPI_OK ); } /* * This function has to set the bits needed to count different domains * In particular: PAPI_DOM_USER, PAPI_DOM_KERNEL PAPI_DOM_OTHER * By default return PAPI_EINVAL if none of those are specified * and PAPI_OK with success * PAPI_DOM_USER is only user context is counted * PAPI_DOM_KERNEL is only the Kernel/OS context is counted * PAPI_DOM_OTHER is Exception/transient mode (like user TLB misses) * PAPI_DOM_ALL is all of the domains */ int L2UNIT_set_domain( hwd_control_state_t * cntrl, int domain ) { #ifdef DEBUG_BGQ printf( "L2UNIT_set_domain\n" ); #endif int found = 0; ( void ) cntrl; if ( PAPI_DOM_USER & domain ) found = 1; if ( PAPI_DOM_KERNEL & domain ) found = 1; if ( PAPI_DOM_OTHER & domain ) found = 1; if ( !found ) return ( PAPI_EINVAL ); return ( PAPI_OK ); } /* * */ int L2UNIT_reset( hwd_context_t * ctx, hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "L2UNIT_reset\n" ); 
#endif ( void ) ctx; int retval; L2UNIT_control_state_t * this_state = ( L2UNIT_control_state_t * ) ptr; /* we can't simply call Bgpm_Reset() since PAPI doesn't have the restriction that an EventSet has to be stopped before resetting is possible. However, BGPM does have this restriction. Hence we need to stop, reset and start */ retval = Bgpm_Stop( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_Stop" ); if ( retval < 0 ) return retval; retval = Bgpm_ResetStart( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_ResetStart" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * Native Event functions */ int L2UNIT_ntv_enum_events( unsigned int *EventCode, int modifier ) { #ifdef DEBUG_BGQ //printf( "L2UNIT_ntv_enum_events, EventCode = %#x\n", *EventCode ); #endif switch ( modifier ) { case PAPI_ENUM_FIRST: *EventCode = 0; return ( PAPI_OK ); break; case PAPI_ENUM_EVENTS: { int index = ( *EventCode ) + OFFSET; if ( index < L2UNIT_MAX_EVENTS ) { *EventCode = *EventCode + 1; return ( PAPI_OK ); } else return ( PAPI_ENOEVNT ); break; } default: return ( PAPI_EINVAL ); } return ( PAPI_EINVAL ); } /* * */ int L2UNIT_ntv_name_to_code( const char *name, unsigned int *event_code ) { #ifdef DEBUG_BGQ printf( "L2UNIT_ntv_name_to_code\n" ); #endif int ret; /* Return event id matching a given event label string */ ret = Bgpm_GetEventIdFromLabel ( name ); if ( ret <= 0 ) { #ifdef DEBUG_BGPM printf ("Error: ret value is %d for BGPM API function '%s'.\n", ret, "Bgpm_GetEventIdFromLabel" ); #endif return PAPI_ENOEVNT; } else if ( ret < OFFSET || ret > L2UNIT_MAX_EVENTS ) // not a L2Unit event return PAPI_ENOEVNT; else *event_code = ( ret - OFFSET ); return PAPI_OK; } /* * */ int L2UNIT_ntv_code_to_name( unsigned int EventCode, char *name, int len ) { #ifdef DEBUG_BGQ //printf( "L2UNIT_ntv_code_to_name\n" ); #endif int index; index = ( EventCode ) + OFFSET; if ( index >= MAX_COUNTERS ) return PAPI_ENOEVNT; strncpy( name, 
Bgpm_GetEventIdLabel( index ), len ); if ( name == NULL ) { #ifdef DEBUG_BGPM printf ("Error: ret value is NULL for BGPM API function Bgpm_GetEventIdLabel.\n" ); #endif return PAPI_ENOEVNT; } return ( PAPI_OK ); } /* * */ int L2UNIT_ntv_code_to_descr( unsigned int EventCode, char *name, int len ) { #ifdef DEBUG_BGQ //printf( "L2UNIT_ntv_code_to_descr\n" ); #endif int retval, index; index = ( EventCode ) + OFFSET; retval = Bgpm_GetLongDesc( index, name, &len ); retval = _check_BGPM_error( retval, "Bgpm_GetLongDesc" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * */ int L2UNIT_ntv_code_to_bits( unsigned int EventCode, hwd_register_t * bits ) { #ifdef DEBUG_BGQ printf( "L2UNIT_ntv_code_to_bits\n" ); #endif ( void ) EventCode; ( void ) bits; return ( PAPI_OK ); } /* * */ papi_vector_t _L2unit_vector = { .cmp_info = { /* default component information (unspecified values are initialized to 0) */ .name = "bgpm/L2Unit", .short_name = "L2Unit", .description = "Blue Gene/Q L2Unit component", .num_cntrs = L2UNIT_MAX_COUNTERS, .num_native_events = L2UNIT_MAX_EVENTS-OFFSET+1, .num_mpx_cntrs = L2UNIT_MAX_COUNTERS, .default_domain = PAPI_DOM_USER, .available_domains = PAPI_DOM_USER | PAPI_DOM_KERNEL, .default_granularity = PAPI_GRN_THR, .available_granularities = PAPI_GRN_THR, .hardware_intr_sig = PAPI_INT_SIGNAL, .hardware_intr = 1, .kernel_multiplex = 0, /* component specific cmp_info initializations */ .fast_real_timer = 0, .fast_virtual_timer = 0, .attach = 0, .attach_must_ptrace = 0, } , /* sizes of framework-opaque component-private structures */ .size = { .context = sizeof ( L2UNIT_context_t ), .control_state = sizeof ( L2UNIT_control_state_t ), .reg_value = sizeof ( L2UNIT_register_t ), .reg_alloc = sizeof ( L2UNIT_reg_alloc_t ), } , /* function pointers in this component */ .init_thread = L2UNIT_init_thread, .init_component = L2UNIT_init_component, .init_control_state = L2UNIT_init_control_state, .start = L2UNIT_start, .stop = L2UNIT_stop, .read = 
L2UNIT_read, .shutdown_thread = L2UNIT_shutdown_thread, .set_overflow = L2UNIT_set_overflow, .cleanup_eventset = L2UNIT_cleanup_eventset, .ctl = L2UNIT_ctl, .update_control_state = L2UNIT_update_control_state, .set_domain = L2UNIT_set_domain, .reset = L2UNIT_reset, .ntv_name_to_code = L2UNIT_ntv_name_to_code, .ntv_enum_events = L2UNIT_ntv_enum_events, .ntv_code_to_name = L2UNIT_ntv_code_to_name, .ntv_code_to_descr = L2UNIT_ntv_code_to_descr, .ntv_code_to_bits = L2UNIT_ntv_code_to_bits }; papi-papi-7-2-0-t/src/components/bgpm/L2unit/linux-L2unit.h000066400000000000000000000037501502707512200234060ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file linux-L2unit.h * @author Heike Jagode * jagode@eecs.utk.edu * Mods: < your name here > * < your email address > * BGPM / L2unit component * * Tested version of bgpm (early access) * * @brief * This file has the source code for a component that enables PAPI-C to * access hardware monitoring counters for BG/Q through the bgpm library. 
*/ #ifndef _PAPI_L2UNIT_H #define _PAPI_L2UNIT_H #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" #include "extras.h" #include "../../../linux-bgq-common.h" /************************* DEFINES SECTION *********************************** *******************************************************************************/ /* this number assumes that there will never be more events than indicated */ #define L2UNIT_MAX_COUNTERS UPC_L2_NUM_COUNTERS #define L2UNIT_MAX_EVENTS PEVT_L2UNIT_LAST_EVENT #define OFFSET ( PEVT_PUNIT_LAST_EVENT + 1 ) /* Stores private information for each event */ typedef struct L2UNIT_register { unsigned int selector; /* Signifies which counter slot is being used */ /* Indexed from 1 as 0 has a special meaning */ } L2UNIT_register_t; /* Used when doing register allocation */ typedef struct L2UNIT_reg_alloc { L2UNIT_register_t ra_bits; } L2UNIT_reg_alloc_t; typedef struct L2UNIT_overflow { int threshold; int EventIndex; } L2UNIT_overflow_t; /* Holds control flags */ typedef struct L2UNIT_control_state { int EventGroup; int EventGroup_local[512]; int count; long long counters[L2UNIT_MAX_COUNTERS]; int overflow; // overflow enable int overflow_count; L2UNIT_overflow_t overflow_list[512]; int bgpm_eventset_applied; // BGPM eventGroup applied yes or no flag } L2UNIT_control_state_t; /* Holds per-thread information */ typedef struct L2UNIT_context { L2UNIT_control_state_t state; } L2UNIT_context_t; #endif /* _PAPI_L2UNIT_H */ papi-papi-7-2-0-t/src/components/bgpm/NWunit/000077500000000000000000000000001502707512200207655ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/bgpm/NWunit/Rules.NWunit000066400000000000000000000004321502707512200232240ustar00rootroot00000000000000# $Id$ COMPSRCS += components/bgpm/NWunit/linux-NWunit.c COMPOBJS += linux-NWunit.o linux-NWunit.o: components/bgpm/NWunit/linux-NWunit.c components/bgpm/NWunit/linux-NWunit.h $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c 
components/bgpm/NWunit/linux-NWunit.c -o linux-NWunit.o papi-papi-7-2-0-t/src/components/bgpm/NWunit/linux-NWunit.c000066400000000000000000000262731502707512200235240ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file linux-NWunit.c * @author Heike Jagode * jagode@eecs.utk.edu * Mods: < your name here > * < your email address > * BGPM / NWunit component * * Tested version of bgpm (early access) * * @brief * This file has the source code for a component that enables PAPI-C to * access hardware monitoring counters for BG/Q through the bgpm library. */ #include "linux-NWunit.h" /* Declare our vector in advance */ papi_vector_t _NWunit_vector; /***************************************************************************** ******************* BEGIN PAPI's COMPONENT REQUIRED FUNCTIONS ************* *****************************************************************************/ /* * This is called whenever a thread is initialized */ int NWUNIT_init_thread( hwd_context_t * ctx ) { #ifdef DEBUG_BGQ printf( "NWUNIT_init_thread\n" ); #endif ( void ) ctx; return PAPI_OK; } /* Initialize hardware counters, setup the function vector table * and get hardware information, this routine is called when the * PAPI process is initialized (IE PAPI_library_init) */ int NWUNIT_init_component( int cidx ) { #ifdef DEBUG_BGQ printf( "NWUNIT_init_component\n" ); #endif _NWunit_vector.cmp_info.CmpIdx = cidx; #ifdef DEBUG_BGQ printf( "NWUNIT_init_component cidx = %d\n", cidx ); #endif return ( PAPI_OK ); } /* * Control of counters (Reading/Writing/Starting/Stopping/Setup) * functions */ int NWUNIT_init_control_state( hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "NWUNIT_init_control_state\n" ); #endif int retval; NWUNIT_control_state_t * this_state = ( NWUNIT_control_state_t * ) ptr; this_state->EventGroup = Bgpm_CreateEventSet(); retval = _check_BGPM_error( this_state->EventGroup, "Bgpm_CreateEventSet" ); 
if ( retval < 0 ) return retval; return PAPI_OK; } /* * */ int NWUNIT_start( hwd_context_t * ctx, hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "NWUNIT_start\n" ); #endif ( void ) ctx; int retval; NWUNIT_control_state_t * this_state = ( NWUNIT_control_state_t * ) ptr; retval = Bgpm_Attach( this_state->EventGroup, UPC_NW_ALL_LINKS, 0); retval = _check_BGPM_error( retval, "Bgpm_Attach" ); if ( retval < 0 ) return retval; retval = Bgpm_ResetStart( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_ResetStart" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * */ int NWUNIT_stop( hwd_context_t * ctx, hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "NWUNIT_stop\n" ); #endif ( void ) ctx; int retval; NWUNIT_control_state_t * this_state = ( NWUNIT_control_state_t * ) ptr; retval = Bgpm_Stop( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_Stop" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * */ int NWUNIT_read( hwd_context_t * ctx, hwd_control_state_t * ptr, long_long ** events, int flags ) { #ifdef DEBUG_BGQ printf( "NWUNIT_read\n" ); #endif ( void ) ctx; ( void ) flags; int i, numEvts; NWUNIT_control_state_t * this_state = ( NWUNIT_control_state_t * ) ptr; numEvts = Bgpm_NumEvents( this_state->EventGroup ); if ( numEvts == 0 ) { #ifdef DEBUG_BGPM printf ("Error: ret value is %d for BGPM API function Bgpm_NumEvents.\n", numEvts ); #endif //return ( EXIT_FAILURE ); } for ( i = 0; i < numEvts; i++ ) this_state->counts[i] = _common_getEventValue( i, this_state->EventGroup ); *events = this_state->counts; return ( PAPI_OK ); } /* * */ int NWUNIT_shutdown_thread( hwd_context_t * ctx ) { #ifdef DEBUG_BGQ printf( "NWUNIT_shutdown_thread\n" ); #endif ( void ) ctx; return ( PAPI_OK ); } /* This function sets various options in the component * The valid codes being passed in are PAPI_SET_DEFDOM, * PAPI_SET_DOMAIN, PAPI_SETDEFGRN, PAPI_SET_GRANUL * and PAPI_SET_INHERIT */ int NWUNIT_ctl( hwd_context_t * 
ctx, int code, _papi_int_option_t * option ) { #ifdef DEBUG_BGQ printf( "NWUNIT_ctl\n" ); #endif ( void ) ctx; ( void ) code; ( void ) option; return ( PAPI_OK ); } //int NWUNIT_ntv_code_to_bits ( unsigned int EventCode, hwd_register_t * bits ); /* * */ int NWUNIT_update_control_state( hwd_control_state_t * ptr, NativeInfo_t * native, int count, hwd_context_t * ctx ) { #ifdef DEBUG_BGQ printf( "NWUNIT_update_control_state: count = %d\n", count ); #endif ( void ) ctx; int retval, index, i; NWUNIT_control_state_t * this_state = ( NWUNIT_control_state_t * ) ptr; // Delete and re-create BGPM eventset retval = _common_deleteRecreate( &this_state->EventGroup ); if ( retval < 0 ) return retval; // otherwise, add the events to the eventset for ( i = 0; i < count; i++ ) { index = ( native[i].ni_event ) + OFFSET; native[i].ni_position = i; #ifdef DEBUG_BGQ printf("NWUNIT_update_control_state: ADD event: i = %d, index = %d\n", i, index ); #endif /* Add events to the BGPM eventGroup */ retval = Bgpm_AddEvent( this_state->EventGroup, index ); retval = _check_BGPM_error( retval, "Bgpm_AddEvent" ); if ( retval < 0 ) return retval; } return ( PAPI_OK ); } /* * This function has to set the bits needed to count different domains * In particular: PAPI_DOM_USER, PAPI_DOM_KERNEL PAPI_DOM_OTHER * By default return PAPI_EINVAL if none of those are specified * and PAPI_OK with success * PAPI_DOM_USER is only user context is counted * PAPI_DOM_KERNEL is only the Kernel/OS context is counted * PAPI_DOM_OTHER is Exception/transient mode (like user TLB misses) * PAPI_DOM_ALL is all of the domains */ int NWUNIT_set_domain( hwd_control_state_t * cntrl, int domain ) { #ifdef DEBUG_BGQ printf( "NWUNIT_set_domain\n" ); #endif int found = 0; ( void ) cntrl; if ( PAPI_DOM_USER & domain ) found = 1; if ( PAPI_DOM_KERNEL & domain ) found = 1; if ( PAPI_DOM_OTHER & domain ) found = 1; if ( !found ) return ( PAPI_EINVAL ); return ( PAPI_OK ); } /* * */ int NWUNIT_reset( hwd_context_t * ctx, 
hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "NWUNIT_reset\n" ); #endif ( void ) ctx; int retval; NWUNIT_control_state_t * this_state = ( NWUNIT_control_state_t * ) ptr; /* we can't simply call Bgpm_Reset() since PAPI doesn't have the restriction that an EventSet has to be stopped before resetting is possible. However, BGPM does have this restriction. Hence we need to stop, reset and start */ retval = Bgpm_Stop( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_Stop" ); if ( retval < 0 ) return retval; retval = Bgpm_ResetStart( this_state->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_ResetStart" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * PAPI Cleanup Eventset * * Destroy and re-create the BGPM / NWunit EventSet */ int NWUNIT_cleanup_eventset( hwd_control_state_t * ctrl ) { #ifdef DEBUG_BGQ printf( "NWUNIT_cleanup_eventset\n" ); #endif int retval; NWUNIT_control_state_t * this_state = ( NWUNIT_control_state_t * ) ctrl; // create a new empty bgpm eventset // reason: bgpm doesn't permit to remove events from an eventset; // hence we delete the old eventset and create a new one retval = _common_deleteRecreate( &this_state->EventGroup ); // HJ try to use delete() only if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * Native Event functions */ int NWUNIT_ntv_enum_events( unsigned int *EventCode, int modifier ) { //printf( "NWUNIT_ntv_enum_events\n" ); switch ( modifier ) { case PAPI_ENUM_FIRST: *EventCode = 0; return ( PAPI_OK ); break; case PAPI_ENUM_EVENTS: { int index = ( *EventCode ) + OFFSET; if ( index < NWUNIT_MAX_EVENTS ) { *EventCode = *EventCode + 1; return ( PAPI_OK ); } else return ( PAPI_ENOEVNT ); break; } default: return ( PAPI_EINVAL ); } return ( PAPI_EINVAL ); } /* * */ int NWUNIT_ntv_name_to_code( const char *name, unsigned int *event_code ) { #ifdef DEBUG_BGQ printf( "NWUNIT_ntv_name_to_code\n" ); #endif int ret; /* Return event id matching a given event label string */ ret = 
Bgpm_GetEventIdFromLabel ( name ); if ( ret <= 0 ) { #ifdef DEBUG_BGPM printf ("Error: ret value is %d for BGPM API function '%s'.\n", ret, "Bgpm_GetEventIdFromLabel" ); #endif return PAPI_ENOEVNT; } else if ( ret < OFFSET || ret > NWUNIT_MAX_EVENTS ) // not a NWUnit event return PAPI_ENOEVNT; else *event_code = ( ret - OFFSET ) ; return PAPI_OK; } /* * */ int NWUNIT_ntv_code_to_name( unsigned int EventCode, char *name, int len ) { #ifdef DEBUG_BGQ //printf( "NWUNIT_ntv_code_to_name\n" ); #endif int index; index = ( EventCode ) + OFFSET; if ( index >= MAX_COUNTERS ) return PAPI_ENOEVNT; strncpy( name, Bgpm_GetEventIdLabel( index ), len ); if ( name == NULL ) { #ifdef DEBUG_BGPM printf ("Error: ret value is NULL for BGPM API function Bgpm_GetEventIdLabel.\n" ); #endif return PAPI_ENOEVNT; } return ( PAPI_OK ); } /* * */ int NWUNIT_ntv_code_to_descr( unsigned int EventCode, char *name, int len ) { #ifdef DEBUG_BGQ //printf( "NWUNIT_ntv_code_to_descr\n" ); #endif int retval, index; index = ( EventCode ) + OFFSET; retval = Bgpm_GetLongDesc( index, name, &len ); retval = _check_BGPM_error( retval, "Bgpm_GetLongDesc" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * */ int NWUNIT_ntv_code_to_bits( unsigned int EventCode, hwd_register_t * bits ) { #ifdef DEBUG_BGQ printf( "NWUNIT_ntv_code_to_bits\n" ); #endif ( void ) EventCode; ( void ) bits; return ( PAPI_OK ); } /* * */ papi_vector_t _NWunit_vector = { .cmp_info = { /* default component information (unspecified values are initialized to 0) */ .name = "bgpm/NWUnit", .short_name = "NWUnit", .description = "Blue Gene/Q NWUnit component", .num_cntrs = NWUNIT_MAX_COUNTERS, .num_native_events = NWUNIT_MAX_EVENTS-OFFSET+1, .num_mpx_cntrs = NWUNIT_MAX_COUNTERS, .default_domain = PAPI_DOM_USER, .available_domains = PAPI_DOM_USER | PAPI_DOM_KERNEL, .default_granularity = PAPI_GRN_THR, .available_granularities = PAPI_GRN_THR, .hardware_intr_sig = PAPI_INT_SIGNAL, .hardware_intr = 1, .kernel_multiplex = 0, /* 
component specific cmp_info initializations */ .fast_real_timer = 0, .fast_virtual_timer = 0, .attach = 0, .attach_must_ptrace = 0, } , /* sizes of framework-opaque component-private structures */ .size = { .context = sizeof ( NWUNIT_context_t ), .control_state = sizeof ( NWUNIT_control_state_t ), .reg_value = sizeof ( NWUNIT_register_t ), .reg_alloc = sizeof ( NWUNIT_reg_alloc_t ), } , /* function pointers in this component */ .init_thread = NWUNIT_init_thread, .init_component = NWUNIT_init_component, .init_control_state = NWUNIT_init_control_state, .start = NWUNIT_start, .stop = NWUNIT_stop, .read = NWUNIT_read, .shutdown_thread = NWUNIT_shutdown_thread, .cleanup_eventset = NWUNIT_cleanup_eventset, .ctl = NWUNIT_ctl, .update_control_state = NWUNIT_update_control_state, .set_domain = NWUNIT_set_domain, .reset = NWUNIT_reset, .ntv_name_to_code = NWUNIT_ntv_name_to_code, .ntv_enum_events = NWUNIT_ntv_enum_events, .ntv_code_to_name = NWUNIT_ntv_code_to_name, .ntv_code_to_descr = NWUNIT_ntv_code_to_descr, .ntv_code_to_bits = NWUNIT_ntv_code_to_bits }; papi-papi-7-2-0-t/src/components/bgpm/NWunit/linux-NWunit.h000066400000000000000000000032201502707512200235140ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file linux-NWunit.h * @author Heike Jagode * jagode@eecs.utk.edu * Mods: < your name here > * < your email address > * BGPM / NWunit component * * Tested version of bgpm (early access) * * @brief * This file has the source code for a component that enables PAPI-C to * access hardware monitoring counters for BG/Q through the bgpm library. 
*/ #ifndef _PAPI_NWUNIT_H #define _PAPI_NWUNIT_H #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" #include "extras.h" #include "../../../linux-bgq-common.h" /************************* DEFINES SECTION *********************************** *******************************************************************************/ /* this number assumes that there will never be more events than indicated */ //#define NWUNIT_MAX_COUNTERS UPC_NW_ALL_LINKCTRS #define NWUNIT_MAX_COUNTERS UPC_NW_NUM_CTRS #define NWUNIT_MAX_EVENTS PEVT_NWUNIT_LAST_EVENT #define OFFSET ( PEVT_IOUNIT_LAST_EVENT + 1 ) /** Structure that stores private information of each event */ typedef struct NWUNIT_register { unsigned int selector; /* Signifies which counter slot is being used */ /* Indexed from 1 as 0 has a special meaning */ } NWUNIT_register_t; typedef struct NWUNIT_reg_alloc { NWUNIT_register_t ra_bits; } NWUNIT_reg_alloc_t; typedef struct NWUNIT_control_state { int EventGroup; long long counts[NWUNIT_MAX_COUNTERS]; } NWUNIT_control_state_t; typedef struct NWUNIT_context { NWUNIT_control_state_t state; } NWUNIT_context_t; #endif /* _PAPI_NWUNIT_H */ papi-papi-7-2-0-t/src/components/bgpm/README.md000066400000000000000000000023031502707512200210160ustar00rootroot00000000000000# BGPM Component Five components have been added to PAPI to support hardware performance monitoring for the BG/Q platform; in particular the BG/Q network, the I/O system, the Compute Node Kernel in addition to the processing core. * [Enabling the BGPM Component](#enabling-the-bgpm-component) *** ## Enabling the BGPM Component To enable reading of BGPM counters the user needs to link against a PAPI library that was configured with the BGPM component enabled. There are no specific component configure scripts for L2unit, IOunit, NWunit, CNKunit. 
In order to configure PAPI for BG/Q, use the following configure options at the papi/src level: ./configure --prefix=< your_choice > \ --with-OS=bgq \ --with-bgpm_installdir=/bgsys/drivers/ppcfloor \ CC=/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc64-bgq-linux-gcc \ F77=/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc64-bgq-linux-gfortran \ --with-components="bgpm/L2unit bgpm/CNKunit bgpm/IOunit bgpm/NWunit" Typically, the utility `papi_components_avail` (available in `papi/src/utils/papi_components_avail`) will display the components available to the user, and whether they are disabled, and when they are disabled why. papi-papi-7-2-0-t/src/components/coretemp/000077500000000000000000000000001502707512200204325ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/coretemp/README.md000066400000000000000000000015271502707512200217160ustar00rootroot00000000000000# CORETEMP Component The CORETEMP component enables PAPI-C to access hardware monitoring sensors through the coretemp sysfs interface. This component will dynamically create a native events table for all the sensors that can be found under /sys/class/hwmon/hwmon[0-9]+. * [Enabling the CORETEMP Component](#enabling-the-coretemp-component) *** ## Enabling the CORETEMP Component To enable reading CORETEMP events the user needs to link against a PAPI library that was configured with the CORETEMP component enabled. As an example the following command: `./configure --with-components="coretemp"` is sufficient to enable the component. Typically, the utility `papi_components_avail` (available in `papi/src/utils/papi_components_avail`) will display the components available to the user, and whether they are disabled, and when they are disabled why. 
papi-papi-7-2-0-t/src/components/coretemp/Rules.coretemp000066400000000000000000000004221502707512200232620ustar00rootroot00000000000000# $Id$ COMPSRCS += components/coretemp/linux-coretemp.c COMPOBJS += linux-coretemp.o linux-coretemp.o: components/coretemp/linux-coretemp.c components/coretemp/linux-coretemp.h $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/coretemp/linux-coretemp.c -o linux-coretemp.o papi-papi-7-2-0-t/src/components/coretemp/linux-coretemp.c000066400000000000000000000521531502707512200235570ustar00rootroot00000000000000#include /* Headers required by PAPI */ #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" #include "linux-coretemp.h" /* this is what I found on my core2 machine * but I have not explored this widely yet*/ #define REFRESH_LAT 4000 #define INVALID_RESULT -1000000L papi_vector_t _coretemp_vector; /* temporary event */ struct temp_event { char name[PAPI_MAX_STR_LEN]; char units[PAPI_MIN_STR_LEN]; char description[PAPI_MAX_STR_LEN]; char location[PAPI_MAX_STR_LEN]; char path[PATH_MAX]; int stone; long count; struct temp_event *next; }; // The following macro follows if a string function has an error. It should // never happen; but it is necessary to prevent compiler warnings. We print // something just in case there is programmer error in invoking the function. 
#define HANDLE_STRING_ERROR {fprintf(stderr,"%s:%i unexpected string function error.\n",__FILE__,__LINE__); exit(-1);} static CORETEMP_native_event_entry_t * _coretemp_native_events; static int num_events = 0; static int is_initialized = 0; /***************************************************************************/ /****** BEGIN FUNCTIONS USED INTERNALLY SPECIFIC TO THIS COMPONENT *******/ /***************************************************************************/ static struct temp_event* root = NULL; static struct temp_event *last = NULL; static int insert_in_list(char *name, char *units, char *description, char *filename) { struct temp_event *temp; /* new_event path, events->d_name */ temp = (struct temp_event *) papi_calloc(1, sizeof(struct temp_event)); if (temp==NULL) { PAPIERROR("out of memory!"); /* We should also free any previously allocated data */ return PAPI_ENOMEM; } temp->next = NULL; if (root == NULL) { root = temp; } else if (last) { last->next = temp; } else { /* Because this is a function, it is possible */ /* we are called with root!=NULL but no last */ /* so add this to keep coverity happy */ papi_free(temp); PAPIERROR("This shouldn't be possible\n"); return PAPI_ECMP; } last = temp; snprintf(temp->name, PAPI_MAX_STR_LEN, "%s", name); snprintf(temp->units, PAPI_MIN_STR_LEN, "%s", units); snprintf(temp->description, PAPI_MAX_STR_LEN, "%s", description); snprintf(temp->path, PATH_MAX, "%s", filename); return PAPI_OK; } /* * find all coretemp information reported by the kernel */ static int generateEventList(char *base_dir) { char path[PATH_MAX],filename[PATH_MAX]; char modulename[PAPI_MIN_STR_LEN], location[PAPI_MIN_STR_LEN], units[PAPI_MIN_STR_LEN], description[PAPI_MAX_STR_LEN], name[PAPI_MAX_STR_LEN]; DIR *dir,*d; FILE *fff; int count = 0; struct dirent *hwmonx; int i,pathnum; int retlen; #define NUM_PATHS 2 char paths[NUM_PATHS][PATH_MAX]={ "device","." 
}; /* Open "/sys/class/hwmon" */ dir = opendir(base_dir); if ( dir == NULL ) { SUBDBG("Can't find %s, are you sure the coretemp module is loaded?\n", base_dir); return 0; } /* Iterate each /sys/class/hwmonX/device directory */ while( (hwmonx = readdir(dir) ) ) { if ( !strncmp("hwmon", hwmonx->d_name, 5) ) { /* Found a hwmon directory */ /* Sometimes the files are in ./, sometimes in device/ */ for(pathnum=0;pathnum<NUM_PATHS;pathnum++) { retlen = snprintf(path, PATH_MAX, "%s/%s/%s", base_dir, hwmonx->d_name,paths[pathnum]); if (retlen <= 0 || PATH_MAX <= retlen) { SUBDBG("Path length is too long.\n"); return PAPI_EINVAL; } SUBDBG("Trying to open %s\n",path); d = opendir(path); if (d==NULL) { continue; } /* Get the name of the module */ retlen = snprintf(filename, PAPI_MAX_STR_LEN, "%s/name",path); if (retlen <= 0 || PAPI_MAX_STR_LEN <= retlen) { SUBDBG("Module name too long.\n"); return PAPI_EINVAL; } fff=fopen(filename,"r"); if (fff==NULL) { snprintf(modulename, PAPI_MIN_STR_LEN, "Unknown"); } else { if (fgets(modulename,PAPI_MIN_STR_LEN,fff)!=NULL) { modulename[strlen(modulename)-1]='\0'; } fclose(fff); } SUBDBG("Found module %s\n",modulename); /******************************************************/ /* Try handling all events starting with in (voltage) */ /******************************************************/ /* arbitrary maximum */ /* the problem is the numbering can be sparse */ /* should probably go back to dirent listing */ for(i=0;i<32;i++) { /* Try looking for a location label */ retlen = snprintf(filename, PAPI_MAX_STR_LEN, "%s/in%d_label", path,i); if (retlen <= 0 || PAPI_MAX_STR_LEN <= retlen) { SUBDBG("Failed to construct location label.\n"); return PAPI_EINVAL; } fff=fopen(filename,"r"); if (fff==NULL) { strncpy(location,"?",PAPI_MIN_STR_LEN); } else { if (fgets(location,PAPI_MIN_STR_LEN,fff)!=NULL) { location[strlen(location)-1]='\0'; } fclose(fff); } /* Look for voltage input */ retlen = snprintf(filename, PAPI_MAX_STR_LEN, "%s/in%d_input", path,i); if (retlen <= 0 || PAPI_MAX_STR_LEN <= retlen) { SUBDBG("Failed input voltage 
string.\n"); return PAPI_EINVAL; } fff=fopen(filename,"r"); if (fff==NULL) continue; fclose(fff); retlen = snprintf(name, PAPI_MAX_STR_LEN, "%s:in%i_input", hwmonx->d_name, i); if (retlen <= 0 || PAPI_MAX_STR_LEN <= retlen) { SUBDBG("Unable to generate name %s:in%i_input\n", hwmonx->d_name, i); closedir(dir); closedir(d); return ( PAPI_EINVAL ); } snprintf(units, PAPI_MIN_STR_LEN, "V"); retlen = snprintf(description, PAPI_MAX_STR_LEN, "%s, %s module, label %s", units,modulename, location); if (retlen <= 0 || PAPI_MAX_STR_LEN <= retlen) { SUBDBG("snprintf failed.\n"); return PAPI_EINVAL; } if (insert_in_list(name,units,description,filename)!=PAPI_OK) { goto done_error; } count++; } /************************************************************/ /* Try handling all events starting with temp (temperature) */ /************************************************************/ for(i=0;i<32;i++) { /* Try looking for a location label */ retlen = snprintf(filename, PAPI_MAX_STR_LEN, "%s/temp%d_label", path,i); if (retlen <= 0 || PAPI_MAX_STR_LEN <= retlen) { SUBDBG("Location label string failed.\n"); return PAPI_EINVAL; } fff=fopen(filename,"r"); if (fff==NULL) { strncpy(location,"?",PAPI_MIN_STR_LEN); } else { if (fgets(location,PAPI_MIN_STR_LEN,fff)!=NULL) { location[strlen(location)-1]='\0'; } fclose(fff); } /* Look for input temperature */ retlen = snprintf(filename, PAPI_MAX_STR_LEN, "%s/temp%d_input", path,i); if (retlen <= 0 || PAPI_MAX_STR_LEN <= retlen) { SUBDBG("Input temperature string failed.\n"); return PAPI_EINVAL; } fff=fopen(filename,"r"); if (fff==NULL) continue; fclose(fff); retlen = snprintf(name, PAPI_MAX_STR_LEN, "%s:temp%i_input", hwmonx->d_name, i); if (retlen <= 0 || PAPI_MAX_STR_LEN <= retlen) { SUBDBG("Unable to generate name %s:temp%i_input\n", hwmonx->d_name, i); closedir(d); closedir(dir); return ( PAPI_EINVAL ); } snprintf(units, PAPI_MIN_STR_LEN, "degrees C"); retlen = snprintf(description, PAPI_MAX_STR_LEN, "%s, %s module, label %s", 
units,modulename, location); if (retlen <= 0 || PAPI_MAX_STR_LEN <= retlen) { SUBDBG("snprintf failed.\n"); return PAPI_EINVAL; } if (insert_in_list(name,units,description,filename)!=PAPI_OK) { goto done_error; } count++; } /************************************************************/ /* Try handling all events starting with fan (fan) */ /************************************************************/ for(i=0;i<32;i++) { /* Try looking for a location label */ retlen = snprintf(filename, PAPI_MAX_STR_LEN, "%s/fan%d_label", path,i); if (retlen <= 0 || PAPI_MAX_STR_LEN <= retlen) { SUBDBG("Failed to write fan label string.\n"); return PAPI_EINVAL; } fff=fopen(filename,"r"); if (fff==NULL) { strncpy(location,"?",PAPI_MIN_STR_LEN); } else { if (fgets(location,PAPI_MIN_STR_LEN,fff)!=NULL) { location[strlen(location)-1]='\0'; } fclose(fff); } /* Look for input fan */ retlen = snprintf(filename, PAPI_MAX_STR_LEN, "%s/fan%d_input", path,i); if (retlen <= 0 || PAPI_MAX_STR_LEN <= retlen) { SUBDBG("Unable to generate filename %s/fan%d_input\n", path,i); closedir(d); closedir(dir); return ( PAPI_EINVAL ); } fff=fopen(filename,"r"); if (fff==NULL) continue; fclose(fff); retlen = snprintf(name, PAPI_MAX_STR_LEN, "%s:fan%i_input", hwmonx->d_name, i); if (retlen <= 0 || PAPI_MAX_STR_LEN <= retlen) { SUBDBG("Unable to generate name %s:fan%i_input\n", hwmonx->d_name, i); closedir(d); closedir(dir); return ( PAPI_EINVAL ); } snprintf(units, PAPI_MIN_STR_LEN, "RPM"); retlen = snprintf(description, PAPI_MAX_STR_LEN, "%s, %s module, label %s", units,modulename, location); if (retlen <= 0 || PAPI_MAX_STR_LEN <= retlen) { SUBDBG("snprintf failed.\n"); return PAPI_EINVAL; } if (insert_in_list(name,units,description,filename)!=PAPI_OK) { goto done_error; } count++; } closedir(d); } } } closedir(dir); return count; done_error: closedir(d); closedir(dir); return PAPI_ECMP; } static long long getEventValue( int index ) { char buf[PAPI_MAX_STR_LEN]; FILE* fp; long result; if 
(_coretemp_native_events[index].stone) { return _coretemp_native_events[index].value; } fp = fopen(_coretemp_native_events[index].path, "r"); if (fp==NULL) { return INVALID_RESULT; } if (fgets(buf, PAPI_MAX_STR_LEN, fp)==NULL) { result=INVALID_RESULT; } else { result=strtoll(buf, NULL, 10); } fclose(fp); return result; } /***************************************************************************** ******************* BEGIN PAPI's COMPONENT REQUIRED FUNCTIONS ************* *****************************************************************************/ /* * This is called whenever a thread is initialized */ static int _coretemp_init_thread( hwd_context_t *ctx ) { ( void ) ctx; return PAPI_OK; } /* Initialize hardware counters, setup the function vector table * and get hardware information, this routine is called when the * PAPI process is initialized (i.e. PAPI_library_init) */ static int _coretemp_init_component( int cidx ) { int retval = PAPI_OK; int i = 0; struct temp_event *t,*last; if ( is_initialized ) goto fn_exit; is_initialized = 1; /* This is the preferred method, all coretemp sensors are symlinked here * see $(kernel_src)/Documentation/hwmon/sysfs-interface */ num_events = generateEventList("/sys/class/hwmon"); if ( num_events < 0 ) { char* strCpy; strCpy=strncpy(_coretemp_vector.cmp_info.disabled_reason, "Cannot open /sys/class/hwmon",PAPI_MAX_STR_LEN); _coretemp_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN-1]=0; if (strCpy == NULL) HANDLE_STRING_ERROR; retval = PAPI_ECMP; goto fn_fail; } if ( num_events == 0 ) { char* strCpy=strncpy(_coretemp_vector.cmp_info.disabled_reason, "No coretemp events found",PAPI_MAX_STR_LEN); _coretemp_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN-1]=0; if (strCpy == NULL) HANDLE_STRING_ERROR; retval = PAPI_ECMP; goto fn_fail; } t = root; _coretemp_native_events = (CORETEMP_native_event_entry_t*) papi_calloc(num_events, sizeof(CORETEMP_native_event_entry_t)); if (_coretemp_native_events == NULL) { int 
strErr=snprintf(_coretemp_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "malloc() of _coretemp_native_events failed for %lu bytes.", num_events*sizeof(CORETEMP_native_event_entry_t)); _coretemp_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN-1]=0; if (strErr > PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; retval = PAPI_ENOMEM; goto fn_fail; } do { int retlen; retlen = snprintf(_coretemp_native_events[i].name, PAPI_MAX_STR_LEN, "%s", t->name); if (retlen <= 0 || retlen >= PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; retlen = snprintf(_coretemp_native_events[i].path, PATH_MAX, "%s", t->path); if (retlen <= 0 || retlen >= PATH_MAX) HANDLE_STRING_ERROR; retlen = snprintf(_coretemp_native_events[i].units, PAPI_MIN_STR_LEN, "%s", t->units); if (retlen <= 0 || retlen >= PAPI_MIN_STR_LEN) HANDLE_STRING_ERROR; retlen = snprintf(_coretemp_native_events[i].description, PAPI_MAX_STR_LEN, "%s",t->description); if (retlen <= 0 || retlen >= PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; _coretemp_native_events[i].stone = 0; _coretemp_native_events[i].resources.selector = i + 1; last = t; t = t->next; papi_free(last); i++; } while (t != NULL); root = NULL; /* Export the total number of events available */ _coretemp_vector.cmp_info.num_native_events = num_events; /* Export the component id */ _coretemp_vector.cmp_info.CmpIdx = cidx; fn_exit: _papi_hwd[cidx]->cmp_info.disabled = retval; return retval; fn_fail: goto fn_exit; } /* * Control of counters (Reading/Writing/Starting/Stopping/Setup) * functions */ static int _coretemp_init_control_state( hwd_control_state_t * ctl) { int i; CORETEMP_control_state_t *coretemp_ctl = (CORETEMP_control_state_t *) ctl; for ( i=0; i < num_events; i++ ) { coretemp_ctl->counts[i] = getEventValue(i); } /* Set last access time for caching results */ coretemp_ctl->lastupdate = PAPI_get_real_usec(); return PAPI_OK; } static int _coretemp_start( hwd_context_t *ctx, hwd_control_state_t *ctl) { ( void ) ctx; ( void ) ctl; return PAPI_OK; } static int _coretemp_read( 
hwd_context_t *ctx, hwd_control_state_t *ctl, long long ** events, int flags) { (void) flags; (void) ctx; CORETEMP_control_state_t* control = (CORETEMP_control_state_t*) ctl; long long now = PAPI_get_real_usec(); int i; /* Only read the values from the kernel if enough time has passed */ /* since the last read. Otherwise return cached values. */ if ( now - control->lastupdate > REFRESH_LAT ) { for ( i = 0; i < num_events; i++ ) { control->counts[i] = getEventValue( i ); } control->lastupdate = now; } /* Pass back a pointer to our results */ *events = control->counts; return PAPI_OK; } static int _coretemp_stop( hwd_context_t *ctx, hwd_control_state_t *ctl ) { (void) ctx; /* read values */ CORETEMP_control_state_t* control = (CORETEMP_control_state_t*) ctl; int i; for ( i = 0; i < num_events; i++ ) { control->counts[i] = getEventValue( i ); } return PAPI_OK; } /* Shutdown a thread */ static int _coretemp_shutdown_thread( hwd_context_t * ctx ) { ( void ) ctx; return PAPI_OK; } /* * Clean up what was setup in coretemp_init_component(). 
*/ static int _coretemp_shutdown_component( ) { if ( is_initialized ) { is_initialized = 0; papi_free(_coretemp_native_events); _coretemp_native_events = NULL; } return PAPI_OK; } /* This function sets various options in the component * The valid codes being passed in are PAPI_SET_DEFDOM, * PAPI_SET_DOMAIN, PAPI_SETDEFGRN, PAPI_SET_GRANUL * and PAPI_SET_INHERIT */ static int _coretemp_ctl( hwd_context_t *ctx, int code, _papi_int_option_t *option ) { ( void ) ctx; ( void ) code; ( void ) option; return PAPI_OK; } static int _coretemp_update_control_state( hwd_control_state_t *ptr, NativeInfo_t * native, int count, hwd_context_t * ctx ) { int i, index; ( void ) ctx; ( void ) ptr; for ( i = 0; i < count; i++ ) { index = native[i].ni_event; native[i].ni_position = _coretemp_native_events[index].resources.selector - 1; } return PAPI_OK; } /* * This function has to set the bits needed to count different domains * In particular: PAPI_DOM_USER, PAPI_DOM_KERNEL PAPI_DOM_OTHER * By default return PAPI_EINVAL if none of those are specified * and PAPI_OK with success * PAPI_DOM_USER is only user context is counted * PAPI_DOM_KERNEL is only the Kernel/OS context is counted * PAPI_DOM_OTHER is Exception/transient mode (like user TLB misses) * PAPI_DOM_ALL is all of the domains */ static int _coretemp_set_domain( hwd_control_state_t * cntl, int domain ) { (void) cntl; if ( PAPI_DOM_ALL != domain ) return PAPI_EINVAL; return PAPI_OK; } static int _coretemp_reset( hwd_context_t *ctx, hwd_control_state_t *ctl ) { ( void ) ctx; ( void ) ctl; return PAPI_OK; } /* * Native Event functions */ static int _coretemp_ntv_enum_events( unsigned int *EventCode, int modifier ) { int index; switch ( modifier ) { case PAPI_ENUM_FIRST: if (num_events==0) { return PAPI_ENOEVNT; } *EventCode = 0; return PAPI_OK; case PAPI_ENUM_EVENTS: index = *EventCode; if ( index < num_events - 1 ) { *EventCode = *EventCode + 1; return PAPI_OK; } else { return PAPI_ENOEVNT; } break; default: return PAPI_EINVAL; } 
return PAPI_EINVAL; } /* * */ static int _coretemp_ntv_code_to_name( unsigned int EventCode, char *name, int len ) { int index = EventCode; if ( index >= 0 && index < num_events ) { strncpy( name, _coretemp_native_events[index].name, len ); return PAPI_OK; } return PAPI_ENOEVNT; } /* * */ static int _coretemp_ntv_code_to_descr( unsigned int EventCode, char *name, int len ) { int index = EventCode; if ( index >= 0 && index < num_events ) { strncpy( name, _coretemp_native_events[index].description, len ); return PAPI_OK; } return PAPI_ENOEVNT; } static int _coretemp_ntv_code_to_info(unsigned int EventCode, PAPI_event_info_t *info) { int index = EventCode; if ( ( index < 0) || (index >= num_events )) return PAPI_ENOEVNT; strncpy( info->symbol, _coretemp_native_events[index].name, sizeof(info->symbol)); strncpy( info->long_descr, _coretemp_native_events[index].description, sizeof(info->long_descr)); strncpy( info->units, _coretemp_native_events[index].units, sizeof(info->units)); info->units[sizeof(info->units)-1] = '\0'; return PAPI_OK; } /* * */ papi_vector_t _coretemp_vector = { .cmp_info = { /* default component information (unspecified values are initialized to 0) */ .name = "coretemp", .short_name = "coretemp", .description = "Linux hwmon temperature and other info", .version = "4.2.1", .num_mpx_cntrs = CORETEMP_MAX_COUNTERS, .num_cntrs = CORETEMP_MAX_COUNTERS, .default_domain = PAPI_DOM_ALL, .available_domains = PAPI_DOM_ALL, .default_granularity = PAPI_GRN_SYS, .available_granularities = PAPI_GRN_SYS, .hardware_intr_sig = PAPI_INT_SIGNAL, /* component specific cmp_info initializations */ .fast_real_timer = 0, .fast_virtual_timer = 0, .attach = 0, .attach_must_ptrace = 0, .kernel_multiplex = 1, } , /* sizes of framework-opaque component-private structures */ .size = { .context = sizeof ( CORETEMP_context_t ), .control_state = sizeof ( CORETEMP_control_state_t ), .reg_value = sizeof ( CORETEMP_register_t ), .reg_alloc = sizeof ( CORETEMP_reg_alloc_t ), } , /* 
function pointers in this component */ .init_thread = _coretemp_init_thread, .init_component = _coretemp_init_component, .init_control_state = _coretemp_init_control_state, .start = _coretemp_start, .stop = _coretemp_stop, .read = _coretemp_read, .shutdown_thread = _coretemp_shutdown_thread, .shutdown_component = _coretemp_shutdown_component, .ctl = _coretemp_ctl, .update_control_state = _coretemp_update_control_state, .set_domain = _coretemp_set_domain, .reset = _coretemp_reset, .ntv_enum_events = _coretemp_ntv_enum_events, .ntv_code_to_name = _coretemp_ntv_code_to_name, .ntv_code_to_descr = _coretemp_ntv_code_to_descr, .ntv_code_to_info = _coretemp_ntv_code_to_info, }; papi-papi-7-2-0-t/src/components/coretemp/linux-coretemp.h000066400000000000000000000046011502707512200235570ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file linux-coretemp.h * CVS: $Id$ * @author James Ralph * ralph@eecs.utk.edu * * @ingroup papi_components * * @brief coretemp component * This file has the source code for a component that enables PAPI-C to access * hardware monitoring sensors through the coretemp sysfs interface. This code * will dynamically create a native events table for all the sensors that can * be found under /sys/class/hwmon/hwmon[0-9]+. * * Notes: * - Based heavily upon the lm-sensors component by Heike Jagode. 
*/ #ifndef _PAPI_CORETEMP_H #define _PAPI_CORETEMP_H #include <unistd.h> #include <limits.h> /* for PATH_MAX */ /************************* DEFINES SECTION *********************************** *******************************************************************************/ /* this number assumes that there will never be more events than indicated */ #define CORETEMP_MAX_COUNTERS 512 /** Structure that stores private information of each event */ typedef struct CORETEMP_register { /* This is used by the framework. It needs to be !=0 for the framework to do something with it */ unsigned int selector; /* This is the only information needed to locate a libsensors event */ int subfeat_nr; } CORETEMP_register_t; /* * The following structures mimic the ones used by other components. It is more * convenient to use them like that as programming with PAPI makes specific * assumptions about them. */ /** This structure is used to build the table of events */ typedef struct CORETEMP_native_event_entry { char name[PAPI_MAX_STR_LEN]; char units[PAPI_MIN_STR_LEN]; char description[PAPI_MAX_STR_LEN]; char path[PATH_MAX]; int stone; /* some counters are set in stone, a max temperature is just that... 
*/ long value; CORETEMP_register_t resources; } CORETEMP_native_event_entry_t; typedef struct CORETEMP_reg_alloc { CORETEMP_register_t ra_bits; } CORETEMP_reg_alloc_t; typedef struct CORETEMP_control_state { long long counts[CORETEMP_MAX_COUNTERS]; // used for caching long long lastupdate; } CORETEMP_control_state_t; typedef struct CORETEMP_context { CORETEMP_control_state_t state; } CORETEMP_context_t; /************************* GLOBALS SECTION *********************************** *******************************************************************************/ #endif /* _PAPI_CORETEMP_H */ papi-papi-7-2-0-t/src/components/coretemp/tests/000077500000000000000000000000001502707512200215745ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/coretemp/tests/Makefile000066400000000000000000000010211502707512200232260ustar00rootroot00000000000000NAME=coretemp include ../../Makefile_comp_tests.target %.o:%.c $(CC) $(CFLAGS) $(OPTFLAGS) $(INCLUDE) -c -o $@ $< TESTS = coretemp_basic coretemp_pretty coretemp_tests: $(TESTS) coretemp_basic: coretemp_basic.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o coretemp_basic coretemp_basic.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) coretemp_pretty: coretemp_pretty.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o coretemp_pretty coretemp_pretty.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) clean: rm -f $(TESTS) *.o papi-papi-7-2-0-t/src/components/coretemp/tests/coretemp_basic.c000066400000000000000000000066131502707512200247250ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @author Vince Weaver * * test case for coretemp component * * * @brief * Tests basic coretemp functionality */ #include #include #include #include "papi.h" #include "papi_test.h" #define NUM_EVENTS 1 int main (int argc, char **argv) { int retval,cid,numcmp,coretemp_cid=-1; int EventSet = PAPI_NULL; long long values[NUM_EVENTS]; int code; char event_name[PAPI_MAX_STR_LEN]; int 
total_events=0; int r; const PAPI_component_info_t *cmpinfo = NULL; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); /* PAPI Initialization */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__,"PAPI_library_init failed\n",retval); } if (!TESTS_QUIET) { printf("Trying all coretemp events\n"); } numcmp = PAPI_num_components(); for(cid=0; cid<numcmp; cid++) { cmpinfo = PAPI_get_component_info(cid); if (cmpinfo == NULL) { test_fail(__FILE__, __LINE__, "PAPI_get_component_info failed\n", 0); } if (strstr(cmpinfo->name,"coretemp")) { coretemp_cid=cid; if (!TESTS_QUIET) { printf("Found coretemp component at cid %d\n", coretemp_cid); } if (cmpinfo->disabled) { if (!TESTS_QUIET) fprintf(stderr,"Coretemp component disabled: %s\n", cmpinfo->disabled_reason); test_skip(__FILE__, __LINE__, "Component disabled\n", 0); } } } if (coretemp_cid==-1) { test_skip(__FILE__,__LINE__,"No coretemp component found",0); } code = PAPI_NATIVE_MASK; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_FIRST, coretemp_cid ); while ( r == PAPI_OK ) { retval = PAPI_event_code_to_name( code, event_name ); if ( retval != PAPI_OK ) { printf("Error translating %#x\n",code); test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } if (!TESTS_QUIET) printf("%s ",event_name); EventSet = PAPI_NULL; retval = PAPI_create_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset()",retval); } retval = PAPI_add_event( EventSet, code ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_add_event()",retval); } retval = PAPI_start( EventSet); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start()",retval); } retval = PAPI_stop( EventSet, values); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_stop()",retval); } if (!TESTS_QUIET) printf(" value: %lld\n",values[0]); retval = PAPI_cleanup_eventset( EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset()",retval); } retval = PAPI_destroy_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, 
"PAPI_destroy_eventset()",retval); } total_events++; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_EVENTS, coretemp_cid ); } if (total_events==0) { test_skip(__FILE__,__LINE__,"No coretemp events found",0); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/components/coretemp/tests/coretemp_pretty.c000066400000000000000000000155601502707512200251730ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @author Vince Weaver * * test case that displays "pretty" coretemp output * * @brief * Shows "pretty" coretemp output */ #include #include #include #include "papi.h" #include "papi_test.h" #define NUM_EVENTS 1 int main (int argc, char **argv) { int retval,cid,coretemp_cid=-1,numcmp; int EventSet = PAPI_NULL; long long values[NUM_EVENTS]; int code; char event_name[PAPI_MAX_STR_LEN]; int r; const PAPI_component_info_t *cmpinfo = NULL; PAPI_event_info_t evinfo; double temperature; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); /* PAPI Initialization */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__,"PAPI_library_init failed\n",retval); } if (!TESTS_QUIET) { printf("Trying all coretemp events\n"); } numcmp = PAPI_num_components(); for(cid=0; cid<numcmp; cid++) { cmpinfo = PAPI_get_component_info(cid); if (cmpinfo == NULL) { test_fail(__FILE__, __LINE__, "PAPI_get_component_info failed\n", 0); } if (strstr(cmpinfo->name,"coretemp")) { coretemp_cid=cid; if (!TESTS_QUIET) printf("Found coretemp component at cid %d\n", coretemp_cid); if (cmpinfo->disabled) { if (!TESTS_QUIET) fprintf(stderr,"Coretemp component disabled: %s\n", cmpinfo->disabled_reason); test_skip(__FILE__, __LINE__, "Component disabled\n", 0); } if (cmpinfo->num_native_events==0) { test_skip(__FILE__,__LINE__,"No coretemp events found",0); } break; } } code = PAPI_NATIVE_MASK; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_FIRST, coretemp_cid ); while ( r == PAPI_OK ) { retval = PAPI_event_code_to_name( code, event_name ); if ( retval != PAPI_OK ) { printf("Error translating %#x\n",code); test_fail( __FILE__, __LINE__, 
"PAPI_event_code_to_name", retval ); } retval = PAPI_get_event_info(code,&evinfo); if (retval != PAPI_OK) { test_fail( __FILE__, __LINE__, "Error getting event info\n",retval); } /****************************/ /* Print Temperature Inputs */ /****************************/ if (strstr(event_name,"temp")) { /* Only print inputs */ if (strstr(event_name,"_input")) { if (!TESTS_QUIET) printf("%s ",event_name); EventSet = PAPI_NULL; retval = PAPI_create_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset()",retval); } retval = PAPI_add_event( EventSet, code ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_add_event()",retval); } retval = PAPI_start( EventSet); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start()",retval); } retval = PAPI_stop( EventSet, values); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start()",retval); } temperature=(values[0]/1000.0); if (!TESTS_QUIET) printf("\tvalue: %.2lf %s\n", temperature, evinfo.long_descr ); retval = PAPI_cleanup_eventset( EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset()",retval); } retval = PAPI_destroy_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset()",retval); } } } /****************************/ /* Print Voltage Inputs */ /****************************/ if (strstr(event_name,".in")) { /* Only print inputs */ if (strstr(event_name,"_input")) { if (!TESTS_QUIET) printf("%s ",event_name); EventSet = PAPI_NULL; retval = PAPI_create_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset()",retval); } retval = PAPI_add_event( EventSet, code ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_add_event()",retval); } retval = PAPI_start( EventSet); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start()",retval); } retval = PAPI_stop( EventSet, values); if (retval != 
PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start()",retval); } temperature=(values[0]/1000.0); if (!TESTS_QUIET) printf("\tvalue: %.2lf %s\n", temperature, evinfo.long_descr ); retval = PAPI_cleanup_eventset( EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset()",retval); } retval = PAPI_destroy_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset()",retval); } } } /********************/ /* Print Fan Inputs */ /********************/ else if (strstr(event_name,"fan")) { /* Only print inputs */ if (strstr(event_name,"_input")) { if (!TESTS_QUIET) printf("%s ",event_name); EventSet = PAPI_NULL; retval = PAPI_create_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset()",retval); } retval = PAPI_add_event( EventSet, code ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_add_event()",retval); } retval = PAPI_start( EventSet); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start()",retval); } retval = PAPI_stop( EventSet, values); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start()",retval); } if (!TESTS_QUIET) printf("\tvalue: %lld %s\n",values[0], evinfo.long_descr); retval = PAPI_cleanup_eventset( EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset()",retval); } retval = PAPI_destroy_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset()",retval); } } } else { /* Skip unknown */ } r = PAPI_enum_cmp_event( &code, PAPI_ENUM_EVENTS, coretemp_cid ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/components/coretemp_freebsd/000077500000000000000000000000001502707512200221245ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/coretemp_freebsd/README.md000066400000000000000000000015561502707512200234120ustar00rootroot00000000000000# CORETEMP_FREEBSD Component The CORETEMP_FREEBSD component 
is intended to access the CPU On-Die Thermal Sensors of the Intel Core architecture on a FreeBSD machine using the coretemp.ko kernel module. The returned values are reported in Kelvin. * [Enabling the CORETEMP_FREEBSD Component](#enabling-the-coretemp_freebsd-component) *** ## Enabling the CORETEMP_FREEBSD Component To enable reading CORETEMP\_FREEBSD events the user needs to link against a PAPI library that was configured with the CORETEMP_FREEBSD component enabled. As an example, the following command: `./configure --with-components="coretemp_freebsd"` is sufficient to enable the component. Typically, the utility `papi_components_avail` (available in `papi/src/utils/papi_components_avail`) will display the components available to the user, whether each is disabled, and, if so, why it is disabled. papi-papi-7-2-0-t/src/components/coretemp_freebsd/Rules.coretemp_freebsd000066400000000000000000000005001502707512200264450ustar00rootroot00000000000000# $Id$ COMPSRCS += components/coretemp_freebsd/coretemp_freebsd.c COMPOBJS += coretemp_freebsd.o coretemp_freebsd.o: components/coretemp_freebsd/coretemp_freebsd.c components/coretemp_freebsd/coretemp_freebsd.h $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/coretemp_freebsd/coretemp_freebsd.c -o coretemp_freebsd.o papi-papi-7-2-0-t/src/components/coretemp_freebsd/coretemp_freebsd.c000066400000000000000000000310651502707512200256050ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file coretemp_freebsd.c * @author Joachim Protze * joachim.protze@zih.tu-dresden.de * @author Vince Weaver * vweaver1@eecs.utk.edu * @author Harald Servat * harald.servat@gmail.com * * @ingroup papi_components * * @brief * This component is intended to access the CPU On-Die Thermal Sensors of * the Intel Core architecture on a FreeBSD machine using the coretemp.ko * kernel module. 
*/ #include #include #include #include #include #include #include /* Headers required by PAPI */ #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" #define CORETEMP_MAX_COUNTERS 32 /* Can we tune this dynamically? */ #define TRUE (1==1) #define FALSE (1!=1) #define UNREFERENCED(x) (void)x /* Structure that stores private information for each event */ typedef struct coretemp_register { int mib[4]; /* Access to registers through these MIBs + sysctl (3) call */ unsigned int selector; /**< Signifies which counter slot is being used */ /**< Indexed from 1 as 0 has a special meaning */ } coretemp_register_t; /** This structure is used to build the table of events */ typedef struct coretemp_native_event_entry { coretemp_register_t resources; /**< Per counter resources */ char name[PAPI_MAX_STR_LEN]; /**< Name of the counter */ char description[PAPI_MAX_STR_LEN]; /**< Description of the counter */ } coretemp_native_event_entry_t; /* This structure is used when doing register allocation; it is possibly not necessary when there are no register constraints */ typedef struct coretemp_reg_alloc { coretemp_register_t ra_bits; } coretemp_reg_alloc_t; /* Holds control flags, usually out-of-band configuration of the hardware */ typedef struct coretemp_control_state { int added[CORETEMP_MAX_COUNTERS]; long_long counters[CORETEMP_MAX_COUNTERS]; /**< Copy of counts, used for caching */ } coretemp_control_state_t; /* Holds per-thread information */ typedef struct coretemp_context { coretemp_control_state_t state; } coretemp_context_t; /** This table contains the native events */ static coretemp_native_event_entry_t *coretemp_native_table; /** number of events in the table*/ static int CORETEMP_NUM_EVENTS = 0; /********************************************************************/ /* Below are the functions required by the PAPI component interface */ /********************************************************************/ /** This is called
whenever a thread is initialized */ int coretemp_init_thread (hwd_context_t * ctx) { int mib[4]; size_t len; UNREFERENCED(ctx); ( void ) mib; /*unused */ ( void ) len; /*unused */ SUBDBG("coretemp_init_thread %p...\n", ctx); #if 0 /* what does this do? VMW */ len = 4; if (sysctlnametomib ("dev.coretemp.0.%driver", mib, &len) == -1) return PAPI_ECMP; #endif return PAPI_OK; } /** Initialize hardware counters, set up the function vector table * and get hardware information; this routine is called when the * PAPI process is initialized (i.e., PAPI_library_init) */ int coretemp_init_component () { int ret; int i; int mib[4]; size_t len; char tmp[128]; int retval = PAPI_OK; SUBDBG("coretemp_init_component...\n"); /* Count the number of cores (counters) that have sensors allocated */ i = 0; CORETEMP_NUM_EVENTS = 0; sprintf (tmp, "dev.coretemp.%d.%%driver", i); len = 4; ret = sysctlnametomib (tmp, mib, &len); while (ret != -1) { CORETEMP_NUM_EVENTS++; i++; sprintf (tmp, "dev.coretemp.%d.%%driver", i); len = 4; ret = sysctlnametomib (tmp, mib, &len); } if (CORETEMP_NUM_EVENTS == 0) goto fn_exit; /* Allocate memory for our event table */ coretemp_native_table = (coretemp_native_event_entry_t *) papi_malloc (sizeof (coretemp_native_event_entry_t) * CORETEMP_NUM_EVENTS); if (coretemp_native_table == NULL) { perror( "malloc():Could not get memory for coretemp events table" ); retval = PAPI_ENOMEM; goto fn_fail; } /* Allocate native events internal structures */ for (i = 0; i < CORETEMP_NUM_EVENTS; i++) { /* Event name */ sprintf (coretemp_native_table[i].name, "CORETEMP_CPU_%d", i); /* Event description */ sprintf (coretemp_native_table[i].description, "CPU On-Die Thermal Sensor #%d", i); /* Event extra bits -> save MIB for faster access later */ sprintf (tmp, "dev.cpu.%d.temperature", i); len = 4; if (sysctlnametomib (tmp, coretemp_native_table[i].resources.mib, &len) == -1) { retval = PAPI_ECMP; goto fn_fail; } coretemp_native_table[i].resources.selector = i+1; } fn_exit:
_papi_hwd[cidx]->cmp_info.disabled = retval; return retval; fn_fail: goto fn_exit; } /** Setup the counter control structure */ int coretemp_init_control_state (hwd_control_state_t * ctrl) { int i; SUBDBG("coretemp_init_control_state... %p\n", ctrl); coretemp_control_state_t *c = (coretemp_control_state_t *) ctrl; for (i = 0; i < CORETEMP_MAX_COUNTERS; i++) c->added[i] = FALSE; return PAPI_OK; } /** Enumerate Native Events @param EventCode is the event of interest @param modifier is one of PAPI_ENUM_FIRST, PAPI_ENUM_EVENTS */ int coretemp_ntv_enum_events (unsigned int *EventCode, int modifier) { switch ( modifier ) { /* return EventCode of first event */ case PAPI_ENUM_FIRST: *EventCode = 0; return PAPI_OK; break; /* return EventCode of passed-in Event */ case PAPI_ENUM_EVENTS: { int index = *EventCode; if ( index < CORETEMP_NUM_EVENTS - 1 ) { *EventCode = *EventCode + 1; return PAPI_OK; } else return PAPI_ENOEVNT; break; } default: return PAPI_EINVAL; } return PAPI_EINVAL; } /** Takes a native event code and passes back the name @param EventCode is the native event code @param name is a pointer for the name to be copied to @param len is the size of the string */ int coretemp_ntv_code_to_name (unsigned int EventCode, char *name, int len) { int index = EventCode; strncpy( name, coretemp_native_table[index].name, len ); return PAPI_OK; } /** Takes a native event code and passes back the event description @param EventCode is the native event code @param name is a pointer for the description to be copied to @param len is the size of the string */ int coretemp_ntv_code_to_descr (unsigned int EventCode, char *name, int len) { int index = EventCode; strncpy( name, coretemp_native_table[index].description, len ); return PAPI_OK; } /** This takes an event and returns the bits that would be written out to the hardware device (this is very much tied to CPU-type support) */ int coretemp_ntv_code_to_bits (unsigned int EventCode, hwd_register_t * bits) { UNREFERENCED(EventCode);
UNREFERENCED(bits); return PAPI_OK; } /** Triggered by eventset operations like add or remove */ int coretemp_update_control_state( hwd_control_state_t * ptr, NativeInfo_t * native, int count, hwd_context_t * ctx ) { int i, index; coretemp_control_state_t *c = (coretemp_control_state_t *) ptr; UNREFERENCED(ctx); SUBDBG("coretemp_update_control_state %p %p...\n", ptr, ctx); for (i = 0; i < count; i++) { index = native[i].ni_event; native[i].ni_position = coretemp_native_table[index].resources.selector - 1; c->added[native[i].ni_position] = TRUE; SUBDBG ("\nnative[%i].ni_position = coretemp_native_table[%i].resources.selector-1 = %i;\n", i, index, native[i].ni_position ); } return PAPI_OK; } /** Triggered by PAPI_start() */ int coretemp_start (hwd_context_t * ctx, hwd_control_state_t * ctrl) { UNREFERENCED(ctx); UNREFERENCED(ctrl); SUBDBG( "coretemp_start %p %p...\n", ctx, ctrl ); /* Nothing to be done */ return PAPI_OK; } /** Triggered by PAPI_stop() */ int coretemp_stop (hwd_context_t * ctx, hwd_control_state_t * ctrl) { UNREFERENCED(ctx); UNREFERENCED(ctrl); SUBDBG("coretemp_stop %p %p...\n", ctx, ctrl); /* Nothing to be done */ return PAPI_OK; } /** Triggered by PAPI_read() */ int coretemp_read (hwd_context_t * ctx, hwd_control_state_t * ctrl, long_long ** events, int flags) { int i; coretemp_control_state_t *c = (coretemp_control_state_t *) ctrl; UNREFERENCED(ctx); UNREFERENCED(flags); SUBDBG("coretemp_read... %p %d\n", ctx, flags); for (i = 0; i < CORETEMP_MAX_COUNTERS; i++) if (c->added[i]) { int tmp; size_t len = sizeof(tmp); if (sysctl (coretemp_native_table[i].resources.mib, 4, &tmp, &len, NULL, 0) == -1) c->counters[i] = 0; else c->counters[i] = tmp/10; /* The coretemp module returns the temperature in tenths of a kelvin; kelvin is used so that the reported values are never negative.
*/ } *events = c->counters; return PAPI_OK; } /** Triggered by PAPI_write(), but only if the counters are running */ /* otherwise, the updated state is written to ESI->hw_start */ int coretemp_write (hwd_context_t * ctx, hwd_control_state_t * ctrl, long_long events[] ) { UNREFERENCED(ctx); UNREFERENCED(events); UNREFERENCED(ctrl); SUBDBG("coretemp_write... %p %p\n", ctx, ctrl); /* These sensor counters cannot be written */ return PAPI_OK; } /** Triggered by PAPI_reset */ int coretemp_reset(hwd_context_t * ctx, hwd_control_state_t * ctrl) { UNREFERENCED(ctx); UNREFERENCED(ctrl); SUBDBG("coretemp_reset ctx=%p ctrl=%p...\n", ctx, ctrl); /* These sensors cannot be reset */ return PAPI_OK; } /** Triggered by PAPI_shutdown() */ int coretemp_shutdown_component (void) { SUBDBG( "coretemp_shutdown_component...\n"); /* Last chance to clean up */ papi_free (coretemp_native_table); return PAPI_OK; } /** This function sets various options in the component @param ctx unused @param code valid are PAPI_SET_DEFDOM, PAPI_SET_DOMAIN, PAPI_SETDEFGRN, PAPI_SET_GRANUL and PAPI_SET_INHERIT @param option unused */ int coretemp_ctl (hwd_context_t * ctx, int code, _papi_int_option_t * option) { UNREFERENCED(ctx); UNREFERENCED(code); UNREFERENCED(option); SUBDBG( "coretemp_ctl... %p %d %p\n", ctx, code, option ); /* FIXME. This should maybe set up more state, such as which counters are active and */ /* counter mappings. */ return PAPI_OK; } /** This function has to set the bits needed to count different domains In particular: PAPI_DOM_USER, PAPI_DOM_KERNEL PAPI_DOM_OTHER By default return PAPI_EINVAL if none of those are specified and PAPI_OK with success PAPI_DOM_USER is only user context is counted PAPI_DOM_KERNEL is only the Kernel/OS context is counted PAPI_DOM_OTHER is Exception/transient mode (like user TLB misses) PAPI_DOM_ALL is all of the domains */ int coretemp_set_domain (hwd_control_state_t * cntrl, int domain) { UNREFERENCED(cntrl); SUBDBG ("coretemp_set_domain...
%p %d\n", cntrl, domain); if (PAPI_DOM_ALL & domain) { SUBDBG( " PAPI_DOM_ALL \n" ); return PAPI_OK; } return PAPI_EINVAL ; } /** Vector that points to entry points for our component */ papi_vector_t _coretemp_freebsd_vector = { .cmp_info = { /* default component information (unspecified values are initialized to 0) */ .name = "coretemp_freebsd", .short_name = "coretemp", .version = "5.0", .num_mpx_cntrs = CORETEMP_MAX_COUNTERS, .num_cntrs = CORETEMP_MAX_COUNTERS, .default_domain = PAPI_DOM_ALL, .available_domains = PAPI_DOM_ALL, .default_granularity = PAPI_GRN_SYS, .available_granularities = PAPI_GRN_SYS, .hardware_intr_sig = PAPI_INT_SIGNAL, /* component specific cmp_info initializations */ .fast_real_timer = 0, .fast_virtual_timer = 0, .attach = 0, .attach_must_ptrace = 0, } , /* sizes of framework-opaque component-private structures */ .size = { .context = sizeof ( coretemp_context_t ), .control_state = sizeof ( coretemp_control_state_t ), .reg_value = sizeof ( coretemp_register_t ), .reg_alloc = sizeof ( coretemp_reg_alloc_t ), } , /* function pointers in this component */ .init_thread = coretemp_init_thread, .init_component = coretemp_init_component, .init_control_state = coretemp_init_control_state, .start = coretemp_start, .stop = coretemp_stop, .read = coretemp_read, .write = coretemp_write, .shutdown_component = coretemp_shutdown_component, .ctl = coretemp_ctl, .update_control_state = coretemp_update_control_state, .set_domain = coretemp_set_domain, .reset = coretemp_reset, .ntv_enum_events = coretemp_ntv_enum_events, .ntv_code_to_name = coretemp_ntv_code_to_name, .ntv_code_to_descr = coretemp_ntv_code_to_descr, .ntv_code_to_bits = coretemp_ntv_code_to_bits, }; 
papi-papi-7-2-0-t/src/components/coretemp_freebsd/coretemp_freebsd.h000066400000000000000000000000001502707512200255730ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/cuda/000077500000000000000000000000001502707512200175305ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/cuda/README.md000066400000000000000000000136031502707512200210120ustar00rootroot00000000000000# CUDA Component The CUDA component exposes counters and controls for NVIDIA GPUs. * [Enabling the CUDA Component](#enabling-the-cuda-component) * [Environment Variables](#environment-variables) * [Known Limitations](#known-limitations) * [FAQ](#faq) *** ## Enabling the CUDA Component To enable reading or writing of CUDA counters the user needs to link against a PAPI library that was configured with the CUDA component enabled. As an example the following command: ./configure --with-components="cuda" is sufficient to enable the component. Typically, the utility `papi_components_avail` (available in `papi/src/utils/papi_components_avail`) will display the components available to the user, and whether they are disabled, and when they are disabled why. ## Environment Variables For CUDA, PAPI requires one environment variable: `PAPI_CUDA_ROOT`. This is required for both compiling and runtime. Typically in Linux one would export this variable (examples are shown below), but some systems have software to manage environment variables (such as `module` or `spack`), so consult with your sysadmin if you have such management software. Eg: export PAPI_CUDA_ROOT=/path/to/installed/cuda Within PAPI_CUDA_ROOT, we expect the following standard directories for building: PAPI_CUDA_ROOT/include PAPI_CUDA_ROOT/extras/CUPTI/include and for runtime: PAPI_CUDA_ROOT/lib64 PAPI_CUDA_ROOT/extras/CUPTI/lib64 As of this writing (07/2021), Nvidia has overhauled performance reporting; it is now divided into "Legacy CUpti" and "CUpti_11", the new approach.
Legacy CUpti works on devices up to Compute Capability 7.0, while only CUpti_11 works on devices with Compute Capability >= 7.0. Both work on CC==7.0. This component automatically distinguishes between the two, but it cannot handle a "mix": one device that can only work with Legacy and another that can only work with CUpti_11. For the CUDA component to be operational, both versions require that the following dynamic libraries be found at runtime: libcuda.so libcudart.so libcupti.so CUpti\_11 also requires: libnvperf_host.so If those libraries cannot be found or some of those are stub libraries in the standard `PAPI_CUDA_ROOT` subdirectories, you must add the correct paths, e.g. `/usr/lib64` or `/usr/lib` to `LD_LIBRARY_PATH`, separated by colons `:`. This can be set using export; e.g. export LD_LIBRARY_PATH=$PAPI_CUDA_ROOT/lib64:$LD_LIBRARY_PATH ## Known Limitations * In CUpti\_11, the number of possible events is vastly expanded; e.g. from some hundreds of events per device to over 110,000 events per device. This can make the utility `papi/src/utils/papi_native_avail` run for several minutes; as much as 2 minutes per GPU. If the output is redirected to a file, this may appear to "hang up". Give it time. * Currently the CUDA component profiling only works with GPUs with compute capability > 7.0 using the NVIDIA Perfworks libraries. *** ## FAQ 1. [Unusual installations](#unusual-installations) 2. [CUDA contexts](#cuda-contexts) 3. [CUDA toolkit versions](#cuda-toolkit-versions) 4. [Custom library paths](#custom-library-paths) 5. [Compute capability 7.0 with CUDA toolkit version 11.0](#compute-capability-70-with-cuda-toolkit-version-110) ## Unusual installations Three libraries are required for the PAPI CUDA component. `libcuda.so`, `libcudart.so` (The CUDA run-time library), and `libcupti.so`. For CUpti_11, `libnvperf_host.so` is also necessary. For the CUDA component to be operational, it must find the dynamic libraries mentioned above.
If they are not found anywhere in the standard `PAPI_CUDA_ROOT` subdirectories mentioned above, or `PAPI_CUDA_ROOT` does not exist at runtime, the component looks in the Linux default directories listed by `/etc/ld.so.conf`, usually `/usr/lib64`, `/lib64`, `/usr/lib` and `/lib`. The system will also search the directories listed in `LD_LIBRARY_PATH`, separated by colons `:`. This can be set using export; e.g. export LD_LIBRARY_PATH=/WhereLib1CanBeFound:/WhereLib2CanBeFound:$LD_LIBRARY_PATH * If CUDA libraries are installed on your system, such that the OS can find `nvcc`, the header files, and the shared libraries, then `PAPI_CUDA_ROOT` and `LD_LIBRARY_PATH` may not be necessary. ## CUDA contexts The CUDA component can profile using contexts created by `cuCtxCreate` or primary device contexts activated by `cudaSetDevice`. Refer to test codes `HelloWorld`, `simpleMultiGPU`, `pthreads`, etc, that use created contexts. Refer to corresponding `*_noCuCtx` tests for profiling using primary device contexts. ## CUDA toolkit versions Once your binaries are compiled, it is possible to swap the CUDA toolkit versions without needing to recompile the source. Simply update `PAPI_CUDA_ROOT` to point to the path where the cuda toolkit version can be found. You might need to update `LD_LIBRARY_PATH` as well. ## Custom library paths PAPI CUDA component loads the CUDA driver library from the system installed path. It loads the other libraries from `$PAPI_CUDA_ROOT`. If that is not set, then it tries to load them from system paths. However, it is possible to load each of these libraries from custom paths by setting each of the following environment variables to point to the desired files. 
These are: - `PAPI_CUDA_RUNTIME` to point to `libcudart.so` - `PAPI_CUDA_CUPTI` to point to `libcupti.so` - `PAPI_CUDA_PERFWORKS` to point to `libnvperf_host.so` ## Compute capability 7.0 with CUDA toolkit version 11.0 NVIDIA GPUs with compute capability 7.0 support profiling on both PerfWorks API and the older Events & Metrics API. If CUDA toolkit version > 11.0 is used, then PAPI uses the newer API, but using toolkit version 11.0, PAPI uses the events API by default. If the environment variable `PAPI_CUDA_110_CC_70_PERFWORKS_API` is set to any non-empty value, then compute capability 7.0 using toolkit version 11.0 will use the Perfworks API. Eg: `export PAPI_CUDA_110_CC_70_PERFWORKS_API=1` papi-papi-7-2-0-t/src/components/cuda/README_internal.md000066400000000000000000000053261502707512200227010ustar00rootroot00000000000000# Cuda Component Native Events in PAPI Currently, the Cuda component uses bit fields to create an event identifier. This internal README breaks down the encoding format. # Event Identifier Encoding Format ## Unused bits As of 02/02/25, there are a total of 2 unused bits. These bits can be used to create a new qualifier or can be used to extend the number of bits for an existing qualifier. ## STAT 3 bits are allocated for the statistic qualifier ([0 - 7] stats). ## Device 7 bits are allocated for the device which accounts for 128 total devices on a node (e.g. [0 - 127] devices). ## Qlmask 2 bits are allocated for the qualifier mask. ## Nameid 18 bits are allocated for the nameid, which roughly accounts for greater than 260k Cuda native events per device on a node.
## Calculations for Bit Masks and Shifts

| #DEFINE | Bits |
| ------------ | -------------------------------------------------------------------------- |
| EVENTS_WIDTH | `(sizeof(uint32_t) * 8)` |
| STAT_WIDTH | `( 3)` |
| DEVICE_WIDTH | `( 7)` |
| QLMASK_WIDTH | `( 2)` |
| NAMEID_WIDTH | `(18)` |
| UNUSED_WIDTH | `(EVENTS_WIDTH - DEVICE_WIDTH - QLMASK_WIDTH - NAMEID_WIDTH - STAT_WIDTH)` |
| STAT_SHIFT | `(EVENTS_WIDTH - UNUSED_WIDTH - STAT_WIDTH)` |
| DEVICE_SHIFT | `(EVENTS_WIDTH - UNUSED_WIDTH - STAT_WIDTH - DEVICE_WIDTH)` |
| QLMASK_SHIFT | `(DEVICE_SHIFT - QLMASK_WIDTH)` |
| NAMEID_SHIFT | `(QLMASK_SHIFT - NAMEID_WIDTH)` |
| STAT_MASK | `((0xFFFFFFFFFFFFFFFF >> (EVENTS_WIDTH - STAT_WIDTH)) << STAT_SHIFT)` |
| DEVICE_MASK | `((0xFFFFFFFFFFFFFFFF >> (EVENTS_WIDTH - DEVICE_WIDTH)) << DEVICE_SHIFT)` |
| QLMASK_MASK | `((0xFFFFFFFFFFFFFFFF >> (EVENTS_WIDTH - QLMASK_WIDTH)) << QLMASK_SHIFT)` |
| NAMEID_MASK | `((0xFFFFFFFFFFFFFFFF >> (EVENTS_WIDTH - NAMEID_WIDTH)) << NAMEID_SHIFT)` |
| STAT_FLAG | `STAT_FLAG (0x2)` |
| DEVICE_FLAG | `DEVICE_FLAG (0x1)` |

**NOTE**: If adding a new qualifier, you must add it to the table found in the section titled [Calculations for Bit Masks and Shifts](#calculations-for-bit-masks-and-shifts) and account for this addition within `cupti_profiler.c`. papi-papi-7-2-0-t/src/components/cuda/Rules.cuda000066400000000000000000000037341502707512200214670ustar00rootroot00000000000000# Note: If PAPI_CUDA_ROOT environment variable is set then build using $PAPI_CUDA_ROOT/bin/nvcc. # If not set find nvcc installed on system and set it.
PAPI_CUDA_ROOT ?= $(shell dirname $(shell dirname $(shell which nvcc))) # obtain user Cuda version to check if Cuda component currently supports it NVCC = $(PAPI_CUDA_ROOT)/bin/nvcc NVCC_VERSION := $(shell $(NVCC) --version | grep -oP '(?<=release )\d+\.\d+') CUDA_MACS = -DPAPI_CUDA_MAIN=$(PAPI_CUDA_MAIN) -DPAPI_CUDA_RUNTIME=$(PAPI_CUDA_RUNTIME) CUDA_MACS+= -DPAPI_CUDA_CUPTI=$(PAPI_CUDA_CUPTI) -DPAPI_CUDA_PERFWORKS=$(PAPI_CUDA_PERFWORKS) COMPSRCS += components/cuda/linux-cuda.c \ components/cuda/cupti_dispatch.c \ components/cuda/cupti_utils.c \ components/cuda/papi_cupti_common.c \ components/cuda/cupti_profiler.c \ components/cuda/cupti_events.c \ COMPOBJS += linux-cuda.o cupti_dispatch.o cupti_utils.o papi_cupti_common.o cupti_profiler.o cupti_events.o # CFLAGS specifies compile flags; need include files here, and macro defines. CFLAGS += -I$(PAPI_CUDA_ROOT)/include -I$(PAPI_CUDA_ROOT)/extras/CUPTI/include -g $(CUDA_MACS) LDFLAGS += $(LDL) linux-cuda.o: components/cuda/linux-cuda.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/cuda/linux-cuda.c -o linux-cuda.o cupti_dispatch.o: components/cuda/cupti_dispatch.c $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/cuda/cupti_dispatch.c -o cupti_dispatch.o cupti_utils.o: components/cuda/cupti_utils.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/cuda/cupti_utils.c -o cupti_utils.o papi_cupti_common.o: components/cuda/papi_cupti_common.c $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/cuda/papi_cupti_common.c -o papi_cupti_common.o cupti_profiler.o: components/cuda/cupti_profiler.c $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/cuda/cupti_profiler.c -o cupti_profiler.o cupti_events.o: components/cuda/cupti_events.c $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/cuda/cupti_events.c -o cupti_events.o papi-papi-7-2-0-t/src/components/cuda/cupti_config.h000066400000000000000000000017341502707512200223570ustar00rootroot00000000000000/** * @file cupti_config.h * * @author Treece Burgess tburgess@icl.utk.edu (updated 
in 2024, redesigned to add device qualifier support.) * @author Anustuv Pal anustuv@icl.utk.edu */ #ifndef __LCUDA_CONFIG_H__ #define __LCUDA_CONFIG_H__ #include /* used to assign the EventSet state */ #define CUDA_EVENTS_STOPPED (0x0) #define CUDA_EVENTS_RUNNING (0x2) #define CUPTI_PROFILER_API_MIN_SUPPORTED_VERSION (13) #if (CUPTI_API_VERSION >= CUPTI_PROFILER_API_MIN_SUPPORTED_VERSION) # define API_PERFWORKS 1 #endif // The Events API has been deprecated in Cuda Toolkit 12.8 and will be removed in a future // CUDA release (https://docs.nvidia.com/cupti/api/group__CUPTI__EVENT__API.html). // TODO: When the Events API has been removed #define CUPTI_EVENTS_API_MAX_SUPPORTED_VERSION // and set it to the last version that is supported. Use this macro as a runtime check in // `cuptic_determine_runtime_api`. #define API_EVENTS 2 #endif /* __LCUDA_CONFIG_H__ */ papi-papi-7-2-0-t/src/components/cuda/cupti_dispatch.c000066400000000000000000000160541502707512200227050ustar00rootroot00000000000000/** * @file cupti_dispatch.c * * @author Treece Burgess tburgess@icl.utk.edu (updated in 2024, redesigned to add device qualifier support.) 
* @author Anustuv Pal anustuv@icl.utk.edu */ #include "cupti_config.h" #include "papi_cupti_common.h" #include "cupti_dispatch.h" #include "lcuda_debug.h" #if defined(API_PERFWORKS) # include "cupti_profiler.h" #endif #if defined(API_EVENTS) # include "cupti_events.h" #endif int cuptid_shutdown(void) { int papi_errno; int cupti_api = cuptic_determine_runtime_api(); if (cupti_api == API_PERFWORKS) { #if defined(API_PERFWORKS) papi_errno = cuptip_shutdown(); if (papi_errno != PAPI_OK) { return papi_errno; } #endif } else if (cupti_api == API_EVENTS) { #if defined(API_EVENTS) papi_errno = cuptie_shutdown(); if (papi_errno != PAPI_OK) { return papi_errno; } #endif } return cuptic_shutdown(); } int cuptid_err_get_last(const char **error_str) { return cuptic_err_get_last(error_str); } int cuptid_get_chip_name(int dev_num, char *name) { return get_chip_name(dev_num, name); } int cuptid_device_get_count(int *num_gpus) { return cuptic_device_get_count(num_gpus); } int cuptid_init(void) { int papi_errno; int init_errno = cuptic_init(); if (init_errno != PAPI_OK && init_errno != PAPI_PARTIAL) { papi_errno = init_errno; goto fn_exit; } int cupti_api = cuptic_determine_runtime_api(); if (cupti_api == API_PERFWORKS) { #if defined(API_PERFWORKS) papi_errno = cuptip_init(); if (papi_errno == PAPI_OK) { if (init_errno == PAPI_PARTIAL) { papi_errno = init_errno; } } #else cuptic_err_set_last("PAPI not built with NVIDIA profiler API support."); papi_errno = PAPI_ECMP; goto fn_exit; #endif } else if (cupti_api == API_EVENTS) { #if defined(API_EVENTS) // TODO: When the Events API is added back, add a similar check // as above papi_errno = cuptie_init(); #else cuptic_err_set_last("Unknown events API problem."); papi_errno = PAPI_ECMP; #endif } else { cuptic_err_set_last("CUDA configuration not supported."); papi_errno = PAPI_ECMP; } fn_exit: return papi_errno; } int cuptid_thread_info_create(cuptid_info_t *info) { return cuptic_ctxarr_create((cuptic_info_t *) info); } int 
cuptid_thread_info_destroy(cuptid_info_t *info) { return cuptic_ctxarr_destroy((cuptic_info_t *) info); } int cuptid_ctx_create(cuptid_info_t info, cuptip_control_t *pcupti_ctl, uint32_t *events_id, int num_events) { int cupti_api = cuptic_determine_runtime_api(); if (cupti_api == API_PERFWORKS) { #if defined(API_PERFWORKS) return cuptip_ctx_create((cuptic_info_t) info, pcupti_ctl, events_id, num_events); #endif } else if (cupti_api == API_EVENTS) { #if defined (API_EVENTS) return cuptie_ctx_create((cuptic_info_t) info, (cuptie_control_t *) pcupti_ctl); #endif } return PAPI_ECMP; } int cuptid_ctx_start(cuptip_control_t cupti_ctl) { int cupti_api = cuptic_determine_runtime_api(); if (cupti_api == API_PERFWORKS) { #if defined(API_PERFWORKS) return cuptip_ctx_start(cupti_ctl); #endif } else if (cupti_api == API_EVENTS) { #if defined(API_EVENTS) return cuptie_ctx_start((cuptie_control_t) cupti_ctl); #endif } return PAPI_ECMP; } int cuptid_ctx_read(cuptip_control_t cupti_ctl, long long **counters) { int cupti_api = cuptic_determine_runtime_api(); if (cupti_api == API_PERFWORKS) { #if defined(API_PERFWORKS) return cuptip_ctx_read(cupti_ctl, counters); #endif } else if (cupti_api == API_EVENTS) { #if defined(API_EVENTS) return cuptie_ctx_read((cuptie_control_t) cupti_ctl, counters); #endif } return PAPI_ECMP; } int cuptid_ctx_reset(cuptip_control_t cupti_ctl) { int cupti_api = cuptic_determine_runtime_api(); if (cupti_api == API_PERFWORKS) { #if defined(API_PERFWORKS) return cuptip_ctx_reset(cupti_ctl); #endif } else if (cupti_api == API_EVENTS) { #if defined(API_EVENTS) return cuptie_ctx_reset((cuptie_control_t) cupti_ctl); #endif } return PAPI_ECMP; } int cuptid_ctx_stop(cuptip_control_t cupti_ctl) { int cupti_api = cuptic_determine_runtime_api(); if (cupti_api == API_PERFWORKS) { #if defined(API_PERFWORKS) return cuptip_ctx_stop(cupti_ctl); #endif } else if (cupti_api == API_EVENTS) { #if defined(API_EVENTS) return cuptie_ctx_stop((cuptie_control_t) cupti_ctl); #endif 
} return PAPI_ECMP; } int cuptid_ctx_destroy(cuptip_control_t *pcupti_ctl) { int cupti_api = cuptic_determine_runtime_api(); if (cupti_api == API_PERFWORKS) { #if defined(API_PERFWORKS) return cuptip_ctx_destroy(pcupti_ctl); #endif } else if (cupti_api == API_EVENTS) { #if defined(API_EVENTS) return cuptie_ctx_destroy((cuptie_control_t *) pcupti_ctl); #endif } return PAPI_ECMP; } int cuptid_evt_enum(uint32_t *event_code, int modifier) { int cupti_api = cuptic_determine_runtime_api(); if (cupti_api == API_PERFWORKS) { #if defined(API_PERFWORKS) return cuptip_evt_enum(event_code, modifier); #endif } else if (cupti_api == API_EVENTS) { #if defined(API_EVENTS) return cuptie_evt_enum(event_code, modifier); #endif } return PAPI_ECMP; } int cuptid_evt_code_to_descr(uint32_t event_code, char *descr, int len) { int cupti_api = cuptic_determine_runtime_api(); if (cupti_api == API_PERFWORKS) { #if defined(API_PERFWORKS) return cuptip_evt_code_to_descr(event_code, descr, len); #endif } else if (cupti_api == API_EVENTS) { #if defined(API_EVENTS) return cuptie_evt_code_to_descr(event_code, descr, len); #endif } return PAPI_ECMP; } int cuptid_evt_name_to_code(const char *name, uint32_t *event_code) { int cupti_api = cuptic_determine_runtime_api(); if (cupti_api == API_PERFWORKS) { #if defined(API_PERFWORKS) return cuptip_evt_name_to_code(name, event_code); #endif } else if (cupti_api == API_EVENTS) { #if defined(API_EVENTS) return cuptie_evt_name_to_code(name, event_code); #endif } return PAPI_ECMP; } int cuptid_evt_code_to_name(uint32_t event_code, char *name, int len) { int cupti_api = cuptic_determine_runtime_api(); if (cupti_api == API_PERFWORKS) { #if defined(API_PERFWORKS) return cuptip_evt_code_to_name(event_code, name, len); #endif } else if(cupti_api == API_EVENTS) { #if defined(API_EVENTS) return cuptie_evt_code_to_name(event_code, name, len); #endif } return PAPI_ECMP; } int cuptid_evt_code_to_info(uint32_t event_code, PAPI_event_info_t *info) { int cupti_api = 
cuptic_determine_runtime_api(); if (cupti_api == API_PERFWORKS) { #if defined(API_PERFWORKS) return cuptip_evt_code_to_info(event_code, info); #endif } else if(cupti_api == API_EVENTS) { #if defined(API_EVENTS) return cuptie_evt_code_to_info(event_code, info); #endif } return PAPI_ECMP; } papi-papi-7-2-0-t/src/components/cuda/cupti_dispatch.h000066400000000000000000000032421502707512200227050ustar00rootroot00000000000000/** * @file cupti_dispatch.h * * @author Treece Burgess tburgess@icl.utk.edu (updated in 2024, redesigned to add device qualifier support.) * @author Anustuv Pal anustuv@icl.utk.edu */ #ifndef __CUPTI_DISPATCH_H__ #define __CUPTI_DISPATCH_H__ #include "cupti_utils.h" #include "cupti_config.h" extern unsigned int _cuda_lock; typedef struct cuptip_control_s *cuptip_control_t; typedef void *cuptid_info_t; typedef cuptiu_event_table_t *ntv_event_table_t; typedef cuptiu_event_t *ntv_event_t; /* init and shutdown interfaces */ int cuptid_init(void); int cuptid_shutdown(void); /* native event interfaces */ int cuptid_evt_enum(uint32_t *event_code, int modifier); int cuptid_evt_code_to_descr(uint32_t event_code, char *descr, int len); int cuptid_evt_name_to_code(const char *name, uint32_t *event_code); int cuptid_evt_code_to_name(uint32_t event_code, char *name, int len); int cuptid_evt_code_to_info(uint32_t event_code, PAPI_event_info_t *info); /* profiling context handling interfaces */ int cuptid_ctx_create(cuptid_info_t thread_info, cuptip_control_t *pcupti_ctl, uint32_t *events_id, int num_events); int cuptid_ctx_start(cuptip_control_t ctl); int cuptid_ctx_read(cuptip_control_t ctl, long long **counters); int cuptid_ctx_reset(cuptip_control_t ctl); int cuptid_ctx_stop(cuptip_control_t ctl); int cuptid_ctx_destroy(cuptip_control_t *ctl); /* thread interfaces */ int cuptid_thread_info_create(cuptid_info_t *info); int cuptid_thread_info_destroy(cuptid_info_t *info); /* misc. 
*/ int cuptid_err_get_last(const char **error_str); int cuptid_get_chip_name(int dev_num, char *name); int cuptid_device_get_count(int *num_gpus); #endif /* __CUPTI_DISPATCH_H__ */ papi-papi-7-2-0-t/src/components/cuda/cupti_events.c000066400000000000000000000030321502707512200224020ustar00rootroot00000000000000/** * @file cupti_events.c * * @author Treece Burgess tburgess@icl.utk.edu (updated in 2024, redesigned to add device qualifier support.) * @author Anustuv Pal anustuv@icl.utk.edu */ #include #include "cupti_events.h" #include "papi_cupti_common.h" #pragma GCC diagnostic ignored "-Wunused-parameter" /* Functions needed by CUPTI Events API */ /* ... */ /* CUPTI Events component API functions */ int cuptie_init(void) { cuptic_err_set_last("CUDA events API not implemented."); return PAPI_ENOIMPL; } int cuptie_ctx_create(void *thr_info, cuptie_control_t *pctl) { return PAPI_ENOIMPL; } int cuptie_ctx_start(cuptie_control_t ctl) { return PAPI_ENOIMPL; } int cuptie_ctx_read(cuptie_control_t ctl, long long **values) { return PAPI_ENOIMPL; } int cuptie_ctx_stop(cuptie_control_t ctl) { return PAPI_ENOIMPL; } int cuptie_ctx_reset(cuptie_control_t ctl) { return PAPI_ENOIMPL; } int cuptie_ctx_destroy(cuptie_control_t *pctl) { return PAPI_ENOIMPL; } int cuptie_evt_enum(uint32_t *event_code, int modifier) { return PAPI_ENOIMPL; } int cuptie_evt_code_to_descr(uint32_t event_code, char *descr, int len) { return PAPI_ENOIMPL; } int cuptie_evt_name_to_code(const char *name, uint32_t *event_code) { return PAPI_ENOIMPL; } int cuptie_evt_code_to_name(uint32_t event_code, char *name, int len) { return PAPI_ENOIMPL; } int cuptie_evt_code_to_info(uint32_t event_code, PAPI_event_info_t *info) { return PAPI_ENOIMPL; } int cuptie_shutdown(void) { return PAPI_ENOIMPL; } papi-papi-7-2-0-t/src/components/cuda/cupti_events.h000066400000000000000000000022341502707512200224120ustar00rootroot00000000000000/** * @file cupti_events.h * * @author Treece Burgess tburgess@icl.utk.edu (updated in 
2024, redesigned to add device qualifier support.) * @author Anustuv Pal anustuv@icl.utk.edu */ #ifndef __CUPTI_EVENTS_H__ #define __CUPTI_EVENTS_H__ #include "cupti_utils.h" #include typedef void *cuptie_control_t; /* init and shutdown interfaces */ int cuptie_init(void); int cuptie_shutdown(void); /* native event interfaces */ int cuptie_evt_enum(uint32_t *event_code, int modifier); int cuptie_evt_code_to_descr(uint32_t event_code, char *descr, int len); int cuptie_evt_name_to_code(const char *name, uint32_t *event_code); int cuptie_evt_code_to_name(uint32_t event_code, char *name, int len); int cuptie_evt_code_to_info(uint32_t event_code, PAPI_event_info_t *info); /* profiling context handling interfaces */ int cuptie_ctx_create(void *thr_info, cuptie_control_t *pctl); int cuptie_ctx_start(cuptie_control_t ctl); int cuptie_ctx_read(cuptie_control_t ctl, long long **counters); int cuptie_ctx_reset(cuptie_control_t ctl); int cuptie_ctx_stop(cuptie_control_t ctl); int cuptie_ctx_destroy(cuptie_control_t *pctl); #endif /* __CUPTI_EVENTS_H__ */ papi-papi-7-2-0-t/src/components/cuda/cupti_profiler.c000066400000000000000000004416331502707512200227350ustar00rootroot00000000000000/** * @file cupti_profiler.c * * @author Treece Burgess tburgess@icl.utk.edu (updated in 2024, redesigned to add device qualifier support.) 
* @author Anustuv Pal anustuv@icl.utk.edu * @author Dong Jun Woun dwoun@vols.utk.edu */ #include #include #include "papi_memory.h" #include #include #include #include #include #include "papi_cupti_common.h" #include "cupti_profiler.h" #include "cupti_config.h" #include "lcuda_debug.h" #include "htable.h" /** * Event identifier encoding format: * +--------+------+-------+----+------------+ * | unused | stat | dev | ql | nameid | * +--------+------+-------+----+------------+ * * unused : 2 bits * stat : 3 bits ([0 - 7] stats) * device : 7 bits ([0 - 127] devices) * qlmask : 2 bits (qualifier mask) * nameid : 18 bits (roughly 262 thousand event names) */ #define EVENTS_WIDTH (sizeof(uint32_t) * 8) #define STAT_WIDTH ( 3) #define DEVICE_WIDTH ( 7) #define QLMASK_WIDTH ( 2) #define NAMEID_WIDTH (18) #define UNUSED_WIDTH (EVENTS_WIDTH - DEVICE_WIDTH - QLMASK_WIDTH - NAMEID_WIDTH - STAT_WIDTH) #define STAT_SHIFT (EVENTS_WIDTH - UNUSED_WIDTH - STAT_WIDTH) #define DEVICE_SHIFT (EVENTS_WIDTH - UNUSED_WIDTH - STAT_WIDTH - DEVICE_WIDTH) #define QLMASK_SHIFT (DEVICE_SHIFT - QLMASK_WIDTH) #define NAMEID_SHIFT (QLMASK_SHIFT - NAMEID_WIDTH) #define STAT_MASK ((0xFFFFFFFF >> (EVENTS_WIDTH - STAT_WIDTH)) << STAT_SHIFT) #define DEVICE_MASK ((0xFFFFFFFF >> (EVENTS_WIDTH - DEVICE_WIDTH)) << DEVICE_SHIFT) #define QLMASK_MASK ((0xFFFFFFFF >> (EVENTS_WIDTH - QLMASK_WIDTH)) << QLMASK_SHIFT) #define NAMEID_MASK ((0xFFFFFFFF >> (EVENTS_WIDTH - NAMEID_WIDTH)) << NAMEID_SHIFT) #define STAT_FLAG (0x2) #define DEVICE_FLAG (0x1) #define NUM_STATS_QUALS 7 char stats[NUM_STATS_QUALS][PAPI_MIN_STR_LEN] = {"avg", "sum", "min", "max", "max_rate", "pct", "ratio"}; typedef struct { int stat; int device; int flags; int nameid; } event_info_t; typedef struct byte_array_s { int size; uint8_t *data; } byte_array_t; typedef struct cuptip_gpu_state_s { int dev_id; cuptiu_event_table_t *added_events; int numberOfRawMetricRequests; NVPA_RawMetricRequest *rawMetricRequests; byte_array_t counterDataPrefixImage;
byte_array_t configImage; byte_array_t counterDataImage; byte_array_t counterDataScratchBuffer; byte_array_t counterAvailabilityImage; } cuptip_gpu_state_t; struct cuptip_control_s { cuptip_gpu_state_t *gpu_ctl; long long *counters; int read_count; int running; cuptic_info_t info; }; static void *dl_nvpw; static int numDevicesOnMachine; static cuptiu_event_table_t *cuptiu_table_p; // Cupti Profiler API function pointers // CUptiResult ( *cuptiProfilerInitializePtr ) (CUpti_Profiler_Initialize_Params* params); CUptiResult ( *cuptiProfilerDeInitializePtr ) (CUpti_Profiler_DeInitialize_Params* params); CUptiResult ( *cuptiProfilerCounterDataImageCalculateSizePtr ) (CUpti_Profiler_CounterDataImage_CalculateSize_Params* params); CUptiResult ( *cuptiProfilerCounterDataImageInitializePtr ) (CUpti_Profiler_CounterDataImage_Initialize_Params* params); CUptiResult ( *cuptiProfilerCounterDataImageCalculateScratchBufferSizePtr ) (CUpti_Profiler_CounterDataImage_CalculateScratchBufferSize_Params* params); CUptiResult ( *cuptiProfilerCounterDataImageInitializeScratchBufferPtr ) (CUpti_Profiler_CounterDataImage_InitializeScratchBuffer_Params* params); CUptiResult ( *cuptiProfilerBeginSessionPtr ) (CUpti_Profiler_BeginSession_Params* params); CUptiResult ( *cuptiProfilerSetConfigPtr ) (CUpti_Profiler_SetConfig_Params* params); CUptiResult ( *cuptiProfilerBeginPassPtr ) (CUpti_Profiler_BeginPass_Params* params); CUptiResult ( *cuptiProfilerEnableProfilingPtr ) (CUpti_Profiler_EnableProfiling_Params* params); CUptiResult ( *cuptiProfilerPushRangePtr ) (CUpti_Profiler_PushRange_Params* params); CUptiResult ( *cuptiProfilerPopRangePtr ) (CUpti_Profiler_PopRange_Params* params); CUptiResult ( *cuptiProfilerDisableProfilingPtr ) (CUpti_Profiler_DisableProfiling_Params* params); CUptiResult ( *cuptiProfilerEndPassPtr ) (CUpti_Profiler_EndPass_Params* params); CUptiResult ( *cuptiProfilerFlushCounterDataPtr ) (CUpti_Profiler_FlushCounterData_Params* params); CUptiResult ( 
*cuptiProfilerUnsetConfigPtr ) (CUpti_Profiler_UnsetConfig_Params* params); CUptiResult ( *cuptiProfilerEndSessionPtr ) (CUpti_Profiler_EndSession_Params* params); CUptiResult ( *cuptiProfilerGetCounterAvailabilityPtr ) (CUpti_Profiler_GetCounterAvailability_Params* params); CUptiResult ( *cuptiFinalizePtr ) (void); // Function wrappers for the Cupti Profiler API // static int initialize_cupti_profiler_api(void); static int deinitialize_cupti_profiler_api(void); static int enable_profiling(void); static int begin_pass(void); static int end_pass(void); static int push_range(const char *pRangeName); static int pop_range(void); static int flush_data(void); static int disable_profiling(void); static int unset_config(void); static int end_session(void); // Perfworks API function pointers // // Initialize NVPA_Status ( *NVPW_InitializeHostPtr ) (NVPW_InitializeHost_Params* params); // Enumeration NVPA_Status ( *NVPW_MetricsEvaluator_GetMetricNamesPtr ) (NVPW_MetricsEvaluator_GetMetricNames_Params* pParams); NVPA_Status ( *NVPW_MetricsEvaluator_GetSupportedSubmetricsPtr ) (NVPW_MetricsEvaluator_GetSupportedSubmetrics_Params* pParams); NVPA_Status ( *NVPW_MetricsEvaluator_GetCounterPropertiesPtr ) (NVPW_MetricsEvaluator_GetCounterProperties_Params* pParams); NVPA_Status ( *NVPW_MetricsEvaluator_GetRatioMetricPropertiesPtr ) (NVPW_MetricsEvaluator_GetRatioMetricProperties_Params* pParams); NVPA_Status ( *NVPW_MetricsEvaluator_GetThroughputMetricPropertiesPtr ) (NVPW_MetricsEvaluator_GetThroughputMetricProperties_Params* pParams); NVPA_Status ( *NVPW_MetricsEvaluator_GetMetricDimUnitsPtr ) (NVPW_MetricsEvaluator_GetMetricDimUnits_Params* pParams); NVPA_Status ( *NVPW_MetricsEvaluator_DimUnitToStringPtr ) (NVPW_MetricsEvaluator_DimUnitToString_Params* pParams); // Configuration NVPA_Status ( *NVPW_MetricsEvaluator_ConvertMetricNameToMetricEvalRequestPtr ) (NVPW_MetricsEvaluator_ConvertMetricNameToMetricEvalRequest_Params* pParams); NVPA_Status ( 
*NVPW_MetricsEvaluator_GetMetricRawDependenciesPtr ) (NVPW_MetricsEvaluator_GetMetricRawDependencies_Params* pParams); NVPA_Status ( *NVPW_CUDA_RawMetricsConfig_Create_V2Ptr ) (NVPW_CUDA_RawMetricsConfig_Create_V2_Params* pParams); NVPA_Status ( *NVPW_RawMetricsConfig_GenerateConfigImagePtr ) (NVPW_RawMetricsConfig_GenerateConfigImage_Params* params); NVPA_Status ( *NVPW_RawMetricsConfig_GetConfigImagePtr ) (NVPW_RawMetricsConfig_GetConfigImage_Params* params); NVPA_Status ( *NVPW_CounterDataBuilder_CreatePtr ) (NVPW_CounterDataBuilder_Create_Params* params); NVPA_Status ( *NVPW_CounterDataBuilder_AddMetricsPtr ) (NVPW_CounterDataBuilder_AddMetrics_Params* params); NVPA_Status ( *NVPW_CounterDataBuilder_GetCounterDataPrefixPtr ) (NVPW_CounterDataBuilder_GetCounterDataPrefix_Params* params); NVPA_Status ( *NVPW_CUDA_CounterDataBuilder_CreatePtr ) (NVPW_CUDA_CounterDataBuilder_Create_Params* pParams); NVPA_Status ( *NVPW_RawMetricsConfig_SetCounterAvailabilityPtr ) (NVPW_RawMetricsConfig_SetCounterAvailability_Params* params); // Evaluation NVPA_Status ( *NVPW_MetricsEvaluator_SetDeviceAttributesPtr ) (NVPW_MetricsEvaluator_SetDeviceAttributes_Params* pParams); NVPA_Status ( *NVPW_MetricsEvaluator_EvaluateToGpuValuesPtr ) (NVPW_MetricsEvaluator_EvaluateToGpuValues_Params* pParams); // Used in both enumeration and evaluation NVPA_Status ( *NVPW_CUDA_MetricsEvaluator_InitializePtr ) (NVPW_CUDA_MetricsEvaluator_Initialize_Params* pParams); NVPA_Status ( *NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSizePtr ) (NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSize_Params* pParams); NVPA_Status ( *NVPW_RawMetricsConfig_GetNumPassesPtr ) (NVPW_RawMetricsConfig_GetNumPasses_Params* params); NVPA_Status ( *NVPW_RawMetricsConfig_BeginPassGroupPtr ) (NVPW_RawMetricsConfig_BeginPassGroup_Params* params); NVPA_Status ( *NVPW_RawMetricsConfig_EndPassGroupPtr ) (NVPW_RawMetricsConfig_EndPassGroup_Params* params); NVPA_Status ( *NVPW_RawMetricsConfig_AddMetricsPtr ) 
(NVPW_RawMetricsConfig_AddMetrics_Params* params); // Destroy NVPA_Status ( *NVPW_RawMetricsConfig_DestroyPtr ) (NVPW_RawMetricsConfig_Destroy_Params* params); NVPA_Status ( *NVPW_CounterDataBuilder_DestroyPtr ) (NVPW_CounterDataBuilder_Destroy_Params* params); NVPA_Status ( *NVPW_MetricsEvaluator_DestroyPtr ) (NVPW_MetricsEvaluator_Destroy_Params* pParams); // Misc. NVPA_Status ( *NVPW_GetSupportedChipNamesPtr ) (NVPW_GetSupportedChipNames_Params* params); // Helper functions for the MetricsEvaluator API // // Initialize static int initialize_perfworks_api(void); // Enumeration static int enumerate_metrics_for_unique_devices(const char *pChipName, int *totalNumMetrics, char ***arrayOfMetricNames); static int get_rollup_metrics(NVPW_RollupOp rollupMetric, char **strRollupMetric); static int get_supported_submetrics(NVPW_Submetric subMetric, char **strSubMetric); static int get_metric_properties(const char *pChipName, const char *metricName, char *fullMetricDescription); static int get_number_of_passes_for_info(const char *pChipName, NVPW_MetricsEvaluator *pMetricsEvaluator, NVPW_MetricEvalRequest *metricEvalRequest, int *numOfPasses); // Configuration static int get_metric_eval_request(NVPW_MetricsEvaluator *metricEvaluator, const char *metricName, NVPW_MetricEvalRequest *pMetricEvalRequest); static int create_raw_metric_requests(NVPW_MetricsEvaluator *pMetricsEvaluator, NVPW_MetricEvalRequest *metricEvalRequest, NVPA_RawMetricRequest **rawMetricRequests, int *rawMetricRequestsCount); // Metric Evaluation static int get_number_of_passes_for_eventsets(const char *pChipName, const char *metricName, int *numOfPasses); static int get_evaluated_metric_values(NVPW_MetricsEvaluator *pMetricsEvaluator, cuptip_gpu_state_t *gpu_ctl, long long *evaluatedMetricValues); // Destroy MetricsEvaluator static int destroy_metrics_evaluator(NVPW_MetricsEvaluator *pMetricsEvaluator); // Helper functions for profiling // static int start_profiling_session(byte_array_t counterDataImage, 
byte_array_t counterDataScratchBufferSize, byte_array_t configImage); static int end_profiling_session(void); static int get_config_image(const char *chipName, const uint8_t *pCounterAvailabilityImageData, NVPA_RawMetricRequest *rawMetricRequests, int rmr_count, byte_array_t *configImage); static int get_counter_data_prefix_image(const char *chipName, NVPA_RawMetricRequest *rawMetricRequests, int rmr_count, byte_array_t *counterDataPrefixImage); static int get_counter_data_image(byte_array_t counterDataPrefixImage, byte_array_t *counterDataScratchBuffer, byte_array_t *counterDataImage); static int get_event_collection_method(const char *evt_name); static int get_counter_availability(cuptip_gpu_state_t *gpu_ctl); static void free_and_reset_configuration_images(cuptip_gpu_state_t *gpu_ctl); // Functions related to Cuda component hash tables static int init_main_htable(void); static int init_event_table(void); static void shutdown_event_table(void); static void shutdown_event_stats_table(void); // Functions related to NVIDIA device chips static int assign_chipnames_for_a_device_index(void); static int find_same_chipname(int dev_id); // Functions related to the native event interface static int get_ntv_events(cuptiu_event_table_t *evt_table, const char *evt_name, int dev_id); static int verify_user_added_events(uint32_t *events_id, int num_events, cuptip_control_t state); static int evt_id_to_info(uint32_t event_id, event_info_t *info); static int evt_id_create(event_info_t *info, uint32_t *event_id); static int evt_code_to_name(uint32_t event_code, char *name, int len); static int evt_name_to_basename(const char *name, char *base, int len); static int evt_name_to_device(const char *name, int *device, const char *base); static int evt_name_to_stat(const char *name, int *stat, const char *base); static int cuda_verify_no_repeated_qualifiers(const char *eventName); static int cuda_verify_qualifiers(int flag, char *qualifierName, int equalitySignPosition, int 
*qualifierValue); // Functions related to the stats qualifier static int restructure_event_name(const char *input, char *output, char *base, char *stat); static int is_stat(const char *token); // Functions related to a partially disabled Cuda component static int determine_dev_cc_major(int dev_id); // Load and unload function pointers static int load_cupti_perf_sym(void); static int unload_cupti_perf_sym(void); static int load_nvpw_sym(void); static int unload_nvpw_sym(void); /** @class load_cupti_perf_sym * @brief Load cupti functions and assign to function pointers. */ static int load_cupti_perf_sym(void) { COMPDBG("Entering.\n"); if (dl_cupti == NULL) { ERRDBG("libcupti.so should already be loaded.\n"); return PAPI_EMISC; } cuptiProfilerInitializePtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerInitialize"); cuptiProfilerDeInitializePtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerDeInitialize"); cuptiProfilerCounterDataImageCalculateSizePtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerCounterDataImageCalculateSize"); cuptiProfilerCounterDataImageInitializePtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerCounterDataImageInitialize"); cuptiProfilerCounterDataImageCalculateScratchBufferSizePtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerCounterDataImageCalculateScratchBufferSize"); cuptiProfilerCounterDataImageInitializeScratchBufferPtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerCounterDataImageInitializeScratchBuffer"); cuptiProfilerBeginSessionPtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerBeginSession"); cuptiProfilerSetConfigPtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerSetConfig"); cuptiProfilerBeginPassPtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerBeginPass"); cuptiProfilerEnableProfilingPtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerEnableProfiling"); cuptiProfilerPushRangePtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerPushRange"); cuptiProfilerPopRangePtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerPopRange"); cuptiProfilerDisableProfilingPtr = DLSYM_AND_CHECK(dl_cupti, 
"cuptiProfilerDisableProfiling"); cuptiProfilerEndPassPtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerEndPass"); cuptiProfilerFlushCounterDataPtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerFlushCounterData"); cuptiProfilerUnsetConfigPtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerUnsetConfig"); cuptiProfilerEndSessionPtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerEndSession"); cuptiProfilerGetCounterAvailabilityPtr = DLSYM_AND_CHECK(dl_cupti, "cuptiProfilerGetCounterAvailability"); cuptiFinalizePtr = DLSYM_AND_CHECK(dl_cupti, "cuptiFinalize"); return PAPI_OK; } /** @class unload_cupti_perf_sym * @brief Unload cupti function pointers. */ static int unload_cupti_perf_sym(void) { if (dl_cupti) { dlclose(dl_cupti); dl_cupti = NULL; } cuptiProfilerInitializePtr = NULL; cuptiProfilerDeInitializePtr = NULL; cuptiProfilerCounterDataImageCalculateSizePtr = NULL; cuptiProfilerCounterDataImageInitializePtr = NULL; cuptiProfilerCounterDataImageCalculateScratchBufferSizePtr = NULL; cuptiProfilerCounterDataImageInitializeScratchBufferPtr = NULL; cuptiProfilerBeginSessionPtr = NULL; cuptiProfilerSetConfigPtr = NULL; cuptiProfilerBeginPassPtr = NULL; cuptiProfilerEnableProfilingPtr = NULL; cuptiProfilerPushRangePtr = NULL; cuptiProfilerPopRangePtr = NULL; cuptiProfilerDisableProfilingPtr = NULL; cuptiProfilerEndPassPtr = NULL; cuptiProfilerFlushCounterDataPtr = NULL; cuptiProfilerUnsetConfigPtr = NULL; cuptiProfilerEndSessionPtr = NULL; cuptiProfilerGetCounterAvailabilityPtr = NULL; cuptiFinalizePtr = NULL; return PAPI_OK; } /** @class load_nvpw_sym * @brief Search for a variation of the shared object libnvperf_host. * The order of the search is outlined below. * * 1. If a user sets PAPI_CUDA_PERFWORKS, it takes precedence over * the other search options listed below. * 2. If PAPI_CUDA_PERFWORKS is not set, or no variation of the shared object * libnvperf_host is found there, we search the path defined by PAPI_CUDA_ROOT, * which should always be set. * 3.
If we fail to find a variation of the shared object libnvperf_host in steps 1 and 2, * then we search the default Linux directories listed in /etc/ld.so.conf. As a note, * updating LD_LIBRARY_PATH is advised for this option. * 4. We use dlopen to search for a variation of the shared object libnvperf_host. * If this fails, then no variation of the shared object libnvperf_host could be found. */ static int load_nvpw_sym(void) { int soNamesToSearchCount = 3; const char *soNamesToSearchFor[] = {"libnvperf_host.so", "libnvperf_host.so.1", "libnvperf_host"}; // If a user set PAPI_CUDA_PERFWORKS with a path, then search it for the shared object (takes precedence over PAPI_CUDA_ROOT) char *papi_cuda_perfworks = getenv("PAPI_CUDA_PERFWORKS"); if (papi_cuda_perfworks) { dl_nvpw = search_and_load_shared_objects(papi_cuda_perfworks, NULL, soNamesToSearchFor, soNamesToSearchCount); } char *soMainName = "libnvperf_host"; // If a user set PAPI_CUDA_ROOT with a path and we did not already find the shared object, then search it for the shared object char *papi_cuda_root = getenv("PAPI_CUDA_ROOT"); if (papi_cuda_root && !dl_nvpw) { dl_nvpw = search_and_load_shared_objects(papi_cuda_root, soMainName, soNamesToSearchFor, soNamesToSearchCount); } // Last-ditch effort to find a variation of libnvperf_host; see the dlopen man page for how the search occurs if (!dl_nvpw) { dl_nvpw = search_and_load_from_system_paths(soNamesToSearchFor, soNamesToSearchCount); if (!dl_nvpw) { ERRDBG("Loading libnvperf_host.so failed.\n"); goto fn_fail; } } // Initialize NVPW_InitializeHostPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_InitializeHost"); // Enumeration NVPW_MetricsEvaluator_GetMetricNamesPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_MetricsEvaluator_GetMetricNames"); NVPW_MetricsEvaluator_GetSupportedSubmetricsPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_MetricsEvaluator_GetSupportedSubmetrics"); NVPW_MetricsEvaluator_GetCounterPropertiesPtr = DLSYM_AND_CHECK(dl_nvpw,
"NVPW_MetricsEvaluator_GetCounterProperties"); NVPW_MetricsEvaluator_GetRatioMetricPropertiesPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_MetricsEvaluator_GetRatioMetricProperties"); NVPW_MetricsEvaluator_GetThroughputMetricPropertiesPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_MetricsEvaluator_GetThroughputMetricProperties"); NVPW_MetricsEvaluator_GetMetricDimUnitsPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_MetricsEvaluator_GetMetricDimUnits"); NVPW_MetricsEvaluator_DimUnitToStringPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_MetricsEvaluator_DimUnitToString"); // Configuration NVPW_MetricsEvaluator_ConvertMetricNameToMetricEvalRequestPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_MetricsEvaluator_ConvertMetricNameToMetricEvalRequest"); NVPW_MetricsEvaluator_GetMetricRawDependenciesPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_MetricsEvaluator_GetMetricRawDependencies"); NVPW_CUDA_RawMetricsConfig_Create_V2Ptr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_CUDA_RawMetricsConfig_Create_V2"); NVPW_RawMetricsConfig_GenerateConfigImagePtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_RawMetricsConfig_GenerateConfigImage"); NVPW_RawMetricsConfig_GetConfigImagePtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_RawMetricsConfig_GetConfigImage"); NVPW_CounterDataBuilder_CreatePtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_CounterDataBuilder_Create"); NVPW_CounterDataBuilder_AddMetricsPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_CounterDataBuilder_AddMetrics"); NVPW_CounterDataBuilder_GetCounterDataPrefixPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_CounterDataBuilder_GetCounterDataPrefix"); NVPW_CUDA_CounterDataBuilder_CreatePtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_CUDA_CounterDataBuilder_Create"); NVPW_RawMetricsConfig_SetCounterAvailabilityPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_RawMetricsConfig_SetCounterAvailability"); // Evaluation NVPW_MetricsEvaluator_SetDeviceAttributesPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_MetricsEvaluator_SetDeviceAttributes"); NVPW_MetricsEvaluator_EvaluateToGpuValuesPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_MetricsEvaluator_EvaluateToGpuValues"); // Used in both enumeration 
and evaluation NVPW_CUDA_MetricsEvaluator_InitializePtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_CUDA_MetricsEvaluator_Initialize"); NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSizePtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSize"); NVPW_RawMetricsConfig_GetNumPassesPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_RawMetricsConfig_GetNumPasses"); NVPW_RawMetricsConfig_BeginPassGroupPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_RawMetricsConfig_BeginPassGroup"); NVPW_RawMetricsConfig_EndPassGroupPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_RawMetricsConfig_EndPassGroup"); NVPW_RawMetricsConfig_AddMetricsPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_RawMetricsConfig_AddMetrics"); // Destroy NVPW_RawMetricsConfig_DestroyPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_RawMetricsConfig_Destroy"); NVPW_CounterDataBuilder_DestroyPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_CounterDataBuilder_Destroy"); NVPW_MetricsEvaluator_DestroyPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_MetricsEvaluator_Destroy"); // Misc. NVPW_GetSupportedChipNamesPtr = DLSYM_AND_CHECK(dl_nvpw, "NVPW_GetSupportedChipNames"); Dl_info info; dladdr(NVPW_GetSupportedChipNamesPtr, &info); LOGDBG("NVPW library loaded from %s\n", info.dli_fname); return PAPI_OK; fn_fail: return PAPI_EMISC; } /** @class unload_nvpw_sym * @brief Unload nvperf function pointers. 
*/ static int unload_nvpw_sym(void) { if (dl_nvpw) { dlclose(dl_nvpw); dl_nvpw = NULL; } // Initialize NVPW_InitializeHostPtr = NULL; // Enumeration NVPW_MetricsEvaluator_GetMetricNamesPtr = NULL; NVPW_MetricsEvaluator_GetSupportedSubmetricsPtr = NULL; NVPW_MetricsEvaluator_GetCounterPropertiesPtr = NULL; NVPW_MetricsEvaluator_GetRatioMetricPropertiesPtr = NULL; NVPW_MetricsEvaluator_GetThroughputMetricPropertiesPtr = NULL; NVPW_MetricsEvaluator_GetMetricDimUnitsPtr = NULL; NVPW_MetricsEvaluator_DimUnitToStringPtr = NULL; // Configuration NVPW_MetricsEvaluator_ConvertMetricNameToMetricEvalRequestPtr = NULL; NVPW_MetricsEvaluator_GetMetricRawDependenciesPtr = NULL; NVPW_CUDA_RawMetricsConfig_Create_V2Ptr = NULL; NVPW_RawMetricsConfig_GenerateConfigImagePtr = NULL; NVPW_RawMetricsConfig_GetConfigImagePtr = NULL; NVPW_CounterDataBuilder_CreatePtr = NULL; NVPW_CounterDataBuilder_AddMetricsPtr = NULL; NVPW_CounterDataBuilder_GetCounterDataPrefixPtr = NULL; NVPW_CUDA_CounterDataBuilder_CreatePtr = NULL; NVPW_RawMetricsConfig_SetCounterAvailabilityPtr = NULL; // Evaluation NVPW_MetricsEvaluator_SetDeviceAttributesPtr = NULL; NVPW_MetricsEvaluator_EvaluateToGpuValuesPtr = NULL; // Used in both enumeration and evaluation NVPW_CUDA_MetricsEvaluator_InitializePtr = NULL; NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSizePtr = NULL; NVPW_RawMetricsConfig_GetNumPassesPtr = NULL; NVPW_RawMetricsConfig_BeginPassGroupPtr = NULL; NVPW_RawMetricsConfig_EndPassGroupPtr = NULL; NVPW_RawMetricsConfig_AddMetricsPtr = NULL; // Destroy NVPW_RawMetricsConfig_DestroyPtr = NULL; NVPW_CounterDataBuilder_DestroyPtr = NULL; NVPW_MetricsEvaluator_DestroyPtr = NULL; // Misc. NVPW_GetSupportedChipNamesPtr = NULL; return PAPI_OK; } /** @class initialize_perfworks_api * @brief Initialize the Perfworks API. 
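initialize_perfworks_api below, like every Perfworks/CUPTI Profiler call in this file, passes a single params struct whose leading size field (set from the corresponding *_STRUCT_SIZE constant, e.g. NVPW_InitializeHost_Params_STRUCT_SIZE) tells the library which struct layout the caller was compiled against. A minimal sketch of that versioning idiom; fake_params_t, FAKE_PARAMS_STRUCT_SIZE, and fake_call are invented stand-ins, not real NVPW symbols:

```c
#include <assert.h>
#include <stddef.h>

// Invented stand-in for an NVPW-style versioned params struct.
typedef struct {
    size_t structSize;   // caller must set this before the call
    void *pPriv;         // reserved, set to NULL
    int result;          // output field filled in by the callee
} fake_params_t;
#define FAKE_PARAMS_STRUCT_SIZE sizeof(fake_params_t)

// The callee rejects a struct size it does not recognize; this is how
// the real API stays ABI-compatible across CUDA toolkit versions.
static int fake_call(fake_params_t *p)
{
    if (p == NULL || p->structSize != FAKE_PARAMS_STRUCT_SIZE) {
        return -1;  // analogous to an NVPA_STATUS error
    }
    p->result = 42;
    return 0;       // analogous to NVPA_STATUS_SUCCESS
}
```

The component follows the same shape throughout: zero the struct, set structSize to the *_STRUCT_SIZE constant, set pPriv to NULL, then call through the dlsym'd function pointer.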
*/ static int initialize_perfworks_api(void) { COMPDBG("Entering.\n"); NVPW_InitializeHost_Params perfInitHostParams = {NVPW_InitializeHost_Params_STRUCT_SIZE}; perfInitHostParams.pPriv = NULL; nvpwCheckErrors( NVPW_InitializeHostPtr(&perfInitHostParams), return PAPI_EMISC ); return PAPI_OK; } /** @class get_counter_availability * @brief Query counter availability. Helps to filter unavailable raw metrics on host. * @param *gpu_ctl * Structure of type cuptip_gpu_state_t which has member variables such as * dev_id, rawMetricRequests, numberOfRawMetricRequests, and more. */ static int get_counter_availability(cuptip_gpu_state_t *gpu_ctl) { CUpti_Profiler_GetCounterAvailability_Params getCounterAvailabilityParams = {CUpti_Profiler_GetCounterAvailability_Params_STRUCT_SIZE}; getCounterAvailabilityParams.pPriv = NULL; getCounterAvailabilityParams.ctx = NULL; // If NULL, the current CUcontext is used getCounterAvailabilityParams.pCounterAvailabilityImage = NULL; cuptiCheckErrors( cuptiProfilerGetCounterAvailabilityPtr(&getCounterAvailabilityParams), return PAPI_EMISC ); // Allocate the necessary memory for data gpu_ctl->counterAvailabilityImage.size = getCounterAvailabilityParams.counterAvailabilityImageSize; gpu_ctl->counterAvailabilityImage.data = (uint8_t *) malloc(gpu_ctl->counterAvailabilityImage.size); if (gpu_ctl->counterAvailabilityImage.data == NULL) { ERRDBG("Failed to allocate memory for counterAvailabilityImage.data.\n"); return PAPI_ENOMEM; } getCounterAvailabilityParams.pCounterAvailabilityImage = gpu_ctl->counterAvailabilityImage.data; cuptiCheckErrors( cuptiProfilerGetCounterAvailabilityPtr(&getCounterAvailabilityParams), return PAPI_EMISC ); return PAPI_OK; } /** @class free_and_reset_configuration_images * @brief Free and reset the configuration images created in * cuptip_ctx_start. * @param *gpu_ctl * Structure of type cuptip_gpu_state_t which has member variables such as * dev_id, rawMetricRequests, numberOfRawMetricRequests, and more. 
*/ static void free_and_reset_configuration_images(cuptip_gpu_state_t *gpu_ctl) { COMPDBG("Entering.\n"); // Note that the memory for the images below is allocated // in cuptip_ctx_start (as of April 21st, 2025) free(gpu_ctl->configImage.data); gpu_ctl->configImage.data = NULL; gpu_ctl->configImage.size = 0; free(gpu_ctl->counterDataPrefixImage.data); gpu_ctl->counterDataPrefixImage.data = NULL; gpu_ctl->counterDataPrefixImage.size = 0; free(gpu_ctl->counterDataScratchBuffer.data); gpu_ctl->counterDataScratchBuffer.data = NULL; gpu_ctl->counterDataScratchBuffer.size = 0; free(gpu_ctl->counterDataImage.data); gpu_ctl->counterDataImage.data = NULL; gpu_ctl->counterDataImage.size = 0; free(gpu_ctl->counterAvailabilityImage.data); gpu_ctl->counterAvailabilityImage.data = NULL; gpu_ctl->counterAvailabilityImage.size = 0; } /** @class find_same_chipname * @brief Check whether two devices have identical chip names. * * @param dev_id * A GPU id number, e.g. 0, 1, 2, etc. */ static int find_same_chipname(int dev_id) { int i; for (i = 0; i < dev_id; i++) { if (!strcmp(cuptiu_table_p->avail_gpu_info[dev_id].chipName, cuptiu_table_p->avail_gpu_info[i].chipName)) { return i; } } return -1; } /** @class init_main_htable * @brief Initialize the main htable used to collect metrics.
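init_main_htable below sizes the event table to 2^NAMEID_WIDTH entries with a multiply loop. A small self-checking sketch of that computation; nameid_capacity_loop and nameid_capacity_shift are illustrative names, not part of the component:

```c
#include <assert.h>

#define NAMEID_WIDTH 18  // same width as in the event encoding format above

// The multiply loop as written in init_main_htable.
static int nameid_capacity_loop(void)
{
    int i, val = 1, base = 2;
    for (i = 0; i < NAMEID_WIDTH; i++) {
        val *= base;
    }
    return val;
}

// The equivalent closed form: 2 to the power NAMEID_WIDTH.
static int nameid_capacity_shift(void)
{
    return 1 << NAMEID_WIDTH;
}
```

Both forms yield 262144 entries, matching the "roughly 262 thousand event names" budget of the 18-bit nameid field.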
*/ static int init_main_htable(void) { // Allocate (2 ^ NAMEID_WIDTH) metric names; this matches the // number of name id bits in the event encoding format int i, val = 1, base = 2; for (i = 0; i < NAMEID_WIDTH; i++) { val *= base; } cuptiu_table_p = (cuptiu_event_table_t *) malloc(sizeof(cuptiu_event_table_t)); if (cuptiu_table_p == NULL) { ERRDBG("Failed to allocate memory for cuptiu_table_p.\n"); return PAPI_ENOMEM; } cuptiu_table_p->capacity = val; cuptiu_table_p->count = 0; cuptiu_table_p->event_stats_count = 0; cuptiu_table_p->events = (cuptiu_event_t *) calloc(val, sizeof(cuptiu_event_t)); if (cuptiu_table_p->events == NULL) { ERRDBG("Failed to allocate memory for cuptiu_table_p->events.\n"); return PAPI_ENOMEM; } cuptiu_table_p->event_stats = (StringVector *) calloc(val, sizeof(StringVector)); if (cuptiu_table_p->event_stats == NULL) { ERRDBG("Failed to allocate memory for cuptiu_table_p->event_stats.\n"); return PAPI_ENOMEM; } cuptiu_table_p->avail_gpu_info = (gpu_record_t *) calloc(numDevicesOnMachine, sizeof(gpu_record_t)); if (cuptiu_table_p->avail_gpu_info == NULL) { ERRDBG("Failed to allocate memory for cuptiu_table_p->avail_gpu_info.\n"); return PAPI_ENOMEM; } // Initialize the main hash table for metric collection htable_init(&cuptiu_table_p->htable); return PAPI_OK; } /** @class cuptip_init * @brief Load and initialize APIs.
*/ int cuptip_init(void) { COMPDBG("Entering.\n"); int papi_errno = load_cupti_perf_sym(); papi_errno += load_nvpw_sym(); if (papi_errno != PAPI_OK) { cuptic_err_set_last("Unable to load CUDA library functions."); return papi_errno; } // Collect the number of devices on the machine papi_errno = cuptic_device_get_count(&numDevicesOnMachine); if (papi_errno != PAPI_OK) { return papi_errno; } if (numDevicesOnMachine <= 0) { cuptic_err_set_last("No GPUs found on system."); return PAPI_ECMP; } // Initialize the Cupti Profiler and Perfworks APIs papi_errno = initialize_cupti_profiler_api(); papi_errno += initialize_perfworks_api(); if (papi_errno != PAPI_OK) { cuptic_err_set_last("Unable to initialize CUPTI profiler libraries."); return PAPI_EMISC; } papi_errno = init_main_htable(); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = assign_chipnames_for_a_device_index(); if (papi_errno != PAPI_OK) { return papi_errno; } // Collect the available metrics on the machine papi_errno = init_event_table(); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = cuInitPtr(0); if (papi_errno != CUDA_SUCCESS) { cuptic_err_set_last("Failed to initialize CUDA driver API."); return PAPI_EMISC; } return PAPI_OK; } /** @class verify_user_added_events * @brief For user-added events, verify they exist and do not require * multiple passes. If both are true, store metadata. * @param *events_id * Cuda native event ids. * @param num_events * Number of Cuda native events a user wants to count. * @param state * Struct that holds read count, running, cuptip_info_t, and * cuptip_gpu_state_t.
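verify_user_added_events begins by decoding each 32-bit event id with evt_id_to_info. The pack/unpack implied by the width and shift macros at the top of this file can be sketched as follows; evt_pack and evt_unpack are illustrative stand-ins, and the EX_* shift amounts are simply what those macros evaluate to:

```c
#include <assert.h>
#include <stdint.h>

// Shift amounts the component's macros evaluate to for a 32-bit code:
// 2 unused | 3 stat | 7 device | 2 qlmask | 18 nameid
enum { EX_STAT_SHIFT = 27, EX_DEVICE_SHIFT = 20, EX_QLMASK_SHIFT = 18, EX_NAMEID_SHIFT = 0 };

// Pack the four fields into one event code (cf. evt_id_create).
static uint32_t evt_pack(int stat, int device, int flags, int nameid)
{
    return ((uint32_t)stat   << EX_STAT_SHIFT)   |
           ((uint32_t)device << EX_DEVICE_SHIFT) |
           ((uint32_t)flags  << EX_QLMASK_SHIFT) |
           ((uint32_t)nameid << EX_NAMEID_SHIFT);
}

// Unpack an event code back into its fields (cf. evt_id_to_info).
static void evt_unpack(uint32_t code, int *stat, int *device, int *flags, int *nameid)
{
    *stat   = (int)((code >> EX_STAT_SHIFT)   & 0x7);
    *device = (int)((code >> EX_DEVICE_SHIFT) & 0x7F);
    *flags  = (int)((code >> EX_QLMASK_SHIFT) & 0x3);
    *nameid = (int)((code >> EX_NAMEID_SHIFT) & 0x3FFFF);
}
```

The masks 0x7, 0x7F, 0x3, and 0x3FFFF correspond to the 3-, 7-, 2-, and 18-bit field widths, so any stat below 8, device below 128, and nameid below 262144 round-trips unchanged.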
*/ int verify_user_added_events(uint32_t *events_id, int num_events, cuptip_control_t state) { int i, papi_errno; for (i = 0; i < numDevicesOnMachine; i++) { papi_errno = cuptiu_event_table_create_init_capacity( num_events, sizeof(cuptiu_event_t), &(state->gpu_ctl[i].added_events) ); if (papi_errno != PAPI_OK) { return papi_errno; } } for (i = 0; i < num_events; i++) { event_info_t info; papi_errno = evt_id_to_info(events_id[i], &info); if (papi_errno != PAPI_OK) { return papi_errno; } // Verify the user added event exists void *p; if (htable_find(cuptiu_table_p->htable, cuptiu_table_p->events[info.nameid].name, (void **) &p) != HTABLE_SUCCESS) { return PAPI_ENOEVNT; } char stat[PAPI_HUGE_STR_LEN]=""; int strLen; if (info.stat < NUM_STATS_QUALS){ strLen = snprintf(stat, sizeof(stat), "%s", stats[info.stat]); if (strLen < 0 || strLen >= sizeof(stat)) { SUBDBG("Failed to fully write statistic qualifier.\n"); return PAPI_ENOMEM; } } const char *stat_position = strstr(cuptiu_table_p->events[info.nameid].basenameWithStatReplaced, "stat"); if (stat_position == NULL) { ERRDBG("Event does not have a 'stat' placeholder.\n"); return PAPI_EBUG; } // Reconstructing event name. Append the basename, stat, and sub-metric. 
size_t basename_len = stat_position - cuptiu_table_p->events[info.nameid].basenameWithStatReplaced; char reconstructedEventName[PAPI_HUGE_STR_LEN]=""; strLen = snprintf(reconstructedEventName, PAPI_MAX_STR_LEN, "%.*s%s%s", (int)basename_len, cuptiu_table_p->events[info.nameid].basenameWithStatReplaced, stat, stat_position + 4); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { SUBDBG("Failed to fully write reconstructed event name.\n"); return PAPI_EBUF; } // Verify the user-added event does not require multiple passes int numOfPasses; papi_errno = get_number_of_passes_for_eventsets(cuptiu_table_p->avail_gpu_info[info.device].chipName, reconstructedEventName, &numOfPasses); if (papi_errno != PAPI_OK) { return papi_errno; } if (numOfPasses > 1) { return PAPI_EMULPASS; } // For a specific device table, get the current event index int idx = state->gpu_ctl[info.device].added_events->count; // Store metadata strLen = snprintf(state->gpu_ctl[info.device].added_events->cuda_evts[idx], PAPI_MAX_STR_LEN, "%s", reconstructedEventName); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { SUBDBG("Failed to fully write reconstructed Cuda event name to array of added events.\n"); return PAPI_EBUF; } state->gpu_ctl[info.device].added_events->cuda_devs[idx] = info.device; state->gpu_ctl[info.device].added_events->evt_pos[idx] = i; state->gpu_ctl[info.device].added_events->count++; /* total number of events added for a specific device */ } return PAPI_OK; } /** @class cuptip_ctx_create * @brief Create a profiling context for the requested Cuda events. * @param thr_info * @param *pstate * Struct that holds read count, running, cuptip_info_t, and * cuptip_gpu_state_t. * @param *events_id * Cuda native event ids. * @param num_events * Number of Cuda native events a user wants to count.
*/
int cuptip_ctx_create(cuptic_info_t thr_info, cuptip_control_t *pstate, uint32_t *events_id, int num_events)
{
    COMPDBG("Entering.\n");
    int papi_errno;
    long long *counters = NULL;
    cuptip_control_t state = (cuptip_control_t) calloc(1, sizeof(struct cuptip_control_s));
    if (state == NULL) {
        SUBDBG("Failed to allocate memory for state.\n");
        return PAPI_ENOMEM;
    }
    state->gpu_ctl = (cuptip_gpu_state_t *) calloc(numDevicesOnMachine, sizeof(cuptip_gpu_state_t));
    if (state->gpu_ctl == NULL) {
        SUBDBG("Failed to allocate memory for state->gpu_ctl.\n");
        papi_errno = PAPI_ENOMEM;
        goto fn_fail;
    }
    counters = (long long *) malloc(num_events * sizeof(*counters));
    if (counters == NULL) {
        SUBDBG("Failed to allocate memory for counters.\n");
        papi_errno = PAPI_ENOMEM;
        goto fn_fail;
    }
    int dev_id;
    for (dev_id = 0; dev_id < numDevicesOnMachine; dev_id++) {
        state->gpu_ctl[dev_id].dev_id = dev_id;
    }
    event_info_t info;
    papi_errno = evt_id_to_info(events_id[num_events - 1], &info);
    if (papi_errno != PAPI_OK) {
        goto fn_fail;
    }
    // Store a user created cuda context or create one
    papi_errno = cuptic_ctxarr_update_current(thr_info, info.device);
    if (papi_errno != PAPI_OK) {
        goto fn_fail;
    }
    // Verify user added events are available on the machine
    papi_errno = verify_user_added_events(events_id, num_events, state);
    if (papi_errno != PAPI_OK) {
        goto fn_fail;
    }
    state->info = thr_info;
    state->counters = counters;
    *pstate = state;
    return PAPI_OK;
fn_fail:
    // Avoid leaking a partially initialized state on error paths
    free(counters);
    free(state->gpu_ctl);
    free(state);
    return papi_errno;
}

/** @class cuptip_ctx_start
 * @brief Code to start counting Cuda hardware events in an event set.
 * @param state
 *   Struct that holds read count, running, cuptip_info_t, and
 *   cuptip_gpu_state_t.
*/
int cuptip_ctx_start(cuptip_control_t state)
{
    COMPDBG("Entering.\n");
    int papi_errno = PAPI_OK;
    cuptip_gpu_state_t *gpu_ctl;
    CUcontext userCtx, ctx;
    // Return the Cuda context bound to the calling CPU thread
    cudaCheckErrors( cuCtxGetCurrentPtr(&userCtx), return PAPI_EMISC );
    // Enumerate through the devices a user has added an event for
    int dev_id;
    for (dev_id = 0; dev_id < numDevicesOnMachine; dev_id++) {
        // Skip devices that will require the Events API to be profiled
        int cupti_api = determine_dev_cc_major(dev_id);
        if (cupti_api != API_PERFWORKS) {
            if (cupti_api == API_EVENTS) {
                continue;
            } else {
                return PAPI_EMISC;
            }
        }
        gpu_ctl = &(state->gpu_ctl[dev_id]);
        if (gpu_ctl->added_events->count == 0) {
            continue;
        }
        LOGDBG("Device num %d: event_count %d, rmr count %d\n", dev_id, gpu_ctl->added_events->count, gpu_ctl->numberOfRawMetricRequests);
        papi_errno = cuptic_device_acquire(state->gpu_ctl[dev_id].added_events);
        if (papi_errno != PAPI_OK) {
            ERRDBG("Profiling same gpu from multiple event sets not allowed.\n");
            return papi_errno;
        }
        // Get the cuda context
        papi_errno = cuptic_ctxarr_get_ctx(state->info, dev_id, &ctx);
        if (papi_errno != PAPI_OK) {
            return papi_errno;
        }
        // Bind the specified CUDA context to the calling CPU thread
        cudaCheckErrors( cuCtxSetCurrentPtr(ctx), return PAPI_EMISC );
        // Query/filter cuda native events available on host
        papi_errno = get_counter_availability(gpu_ctl);
        if (papi_errno != PAPI_OK) {
            ERRDBG("Error getting counter availability image.\n");
            return papi_errno;
        }
        NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSize_Params calculateScratchBufferSizeParam = {NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSize_Params_STRUCT_SIZE};
        calculateScratchBufferSizeParam.pChipName = cuptiu_table_p->avail_gpu_info[dev_id].chipName;
        calculateScratchBufferSizeParam.pCounterAvailabilityImage = NULL;
        calculateScratchBufferSizeParam.pPriv = NULL;
        nvpwCheckErrors( NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSizePtr(&calculateScratchBufferSizeParam), return PAPI_EMISC );
        uint8_t
myScratchBuffer[calculateScratchBufferSizeParam.scratchBufferSize]; NVPW_CUDA_MetricsEvaluator_Initialize_Params metricEvaluatorInitializeParams = {NVPW_CUDA_MetricsEvaluator_Initialize_Params_STRUCT_SIZE}; metricEvaluatorInitializeParams.scratchBufferSize = calculateScratchBufferSizeParam.scratchBufferSize; metricEvaluatorInitializeParams.pScratchBuffer = myScratchBuffer; metricEvaluatorInitializeParams.pChipName = cuptiu_table_p->avail_gpu_info[dev_id].chipName; metricEvaluatorInitializeParams.pCounterAvailabilityImage = NULL; metricEvaluatorInitializeParams.pCounterDataImage = NULL; metricEvaluatorInitializeParams.pPriv = NULL; nvpwCheckErrors( NVPW_CUDA_MetricsEvaluator_InitializePtr(&metricEvaluatorInitializeParams), return PAPI_EMISC ); NVPW_MetricsEvaluator *pMetricsEvaluator = metricEvaluatorInitializeParams.pMetricsEvaluator; NVPA_RawMetricRequest *rawMetricRequests = NULL; int i, numOfRawMetricRequests = 0; for (i = 0; i < gpu_ctl->added_events->count; i++) { NVPW_MetricEvalRequest metricEvalRequest; papi_errno = get_metric_eval_request(pMetricsEvaluator, gpu_ctl->added_events->cuda_evts[i], &metricEvalRequest); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = create_raw_metric_requests(pMetricsEvaluator, &metricEvalRequest, &rawMetricRequests, &numOfRawMetricRequests); if (papi_errno != PAPI_OK) { return papi_errno; } } gpu_ctl->rawMetricRequests = rawMetricRequests; gpu_ctl->numberOfRawMetricRequests = numOfRawMetricRequests; papi_errno = get_config_image(cuptiu_table_p->avail_gpu_info[dev_id].chipName, gpu_ctl->counterAvailabilityImage.data, gpu_ctl->rawMetricRequests, gpu_ctl->numberOfRawMetricRequests, &gpu_ctl->configImage); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = get_counter_data_prefix_image(cuptiu_table_p->avail_gpu_info[dev_id].chipName, gpu_ctl->rawMetricRequests, gpu_ctl->numberOfRawMetricRequests, &gpu_ctl->counterDataPrefixImage); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = 
get_counter_data_image(gpu_ctl->counterDataPrefixImage, &gpu_ctl->counterDataScratchBuffer, &gpu_ctl->counterDataImage); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = start_profiling_session(gpu_ctl->counterDataImage, gpu_ctl->counterDataScratchBuffer, gpu_ctl->configImage); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = begin_pass(); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = enable_profiling(); if (papi_errno != PAPI_OK) { return papi_errno; } char rangeName[PAPI_MIN_STR_LEN]; int strLen = snprintf(rangeName, PAPI_MIN_STR_LEN, "PAPI_Range_%d", gpu_ctl->dev_id); if (strLen < 0 || strLen >= PAPI_MIN_STR_LEN) { ERRDBG("Failed to fully write range name.\n"); return PAPI_EBUF; } papi_errno = push_range(rangeName); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = destroy_metrics_evaluator(pMetricsEvaluator); if (papi_errno != PAPI_OK) { return papi_errno; } } cudaCheckErrors( cuCtxSetCurrentPtr(userCtx), return PAPI_EMISC ); return PAPI_OK; } /** @class cuptip_ctx_read * @brief Query an array of numeric values corresponding * to each user added event. * @param state * Struct that holds read count, running, cuptip_info_t, and * cuptip_gpu_state_t. * @param **counters * An array which holds numeric values for the corresponding * user added event. 
*/ int cuptip_ctx_read(cuptip_control_t state, long long **counters) { COMPDBG("Entering.\n"); long long *counter_vals = state->counters; CUcontext userCtx = NULL, ctx = NULL; cudaArtCheckErrors( cuCtxGetCurrentPtr(&userCtx), return PAPI_EMISC ); int dev_id; for (dev_id = 0; dev_id < numDevicesOnMachine; dev_id++) { // Skip devices that will require the Events API to be profiled int cupti_api = determine_dev_cc_major(dev_id); if (cupti_api != API_PERFWORKS) { if (cupti_api == API_EVENTS) { continue; } else { return PAPI_EMISC; } } cuptip_gpu_state_t *gpu_ctl = &(state->gpu_ctl[dev_id]); if (gpu_ctl->added_events->count == 0) { continue; } cudaArtCheckErrors( cuptic_ctxarr_get_ctx(state->info, dev_id, &ctx), return PAPI_EMISC ); cudaArtCheckErrors( cuCtxSetCurrentPtr(ctx), return PAPI_EMISC ); int papi_errno = pop_range(); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = end_pass(); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = flush_data(); if (papi_errno != PAPI_OK) { return papi_errno; } NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSize_Params calculateScratchBufferSizeParam = {NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSize_Params_STRUCT_SIZE}; calculateScratchBufferSizeParam.pChipName = cuptiu_table_p->avail_gpu_info[dev_id].chipName; calculateScratchBufferSizeParam.pCounterAvailabilityImage = NULL; calculateScratchBufferSizeParam.pPriv = NULL; nvpwCheckErrors( NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSizePtr(&calculateScratchBufferSizeParam), return PAPI_EMISC ); uint8_t myScratchBuffer[calculateScratchBufferSizeParam.scratchBufferSize]; NVPW_CUDA_MetricsEvaluator_Initialize_Params metricEvaluatorInitializeParams = {NVPW_CUDA_MetricsEvaluator_Initialize_Params_STRUCT_SIZE}; metricEvaluatorInitializeParams.scratchBufferSize = calculateScratchBufferSizeParam.scratchBufferSize; metricEvaluatorInitializeParams.pScratchBuffer = myScratchBuffer; metricEvaluatorInitializeParams.pChipName = 
cuptiu_table_p->avail_gpu_info[dev_id].chipName;
        metricEvaluatorInitializeParams.pCounterAvailabilityImage = NULL;
        metricEvaluatorInitializeParams.pCounterDataImage = NULL;
        metricEvaluatorInitializeParams.pPriv = NULL;
        nvpwCheckErrors( NVPW_CUDA_MetricsEvaluator_InitializePtr(&metricEvaluatorInitializeParams), return PAPI_EMISC );
        NVPW_MetricsEvaluator *pMetricsEvaluator = metricEvaluatorInitializeParams.pMetricsEvaluator;
        long long *metricValues = (long long *) calloc(gpu_ctl->added_events->count, sizeof(long long));
        if (metricValues == NULL) {
            SUBDBG("Failed to allocate memory for metricValues.\n");
            return PAPI_ENOMEM;
        }
        papi_errno = get_evaluated_metric_values(pMetricsEvaluator, gpu_ctl, metricValues);
        if (papi_errno != PAPI_OK) {
            free(metricValues);  // do not leak the values buffer on error
            return papi_errno;
        }
        int i;
        for (i = 0; i < gpu_ctl->added_events->count; i++) {
            int evt_pos = gpu_ctl->added_events->evt_pos[i];
            if (state->read_count == 0) {
                counter_vals[evt_pos] = metricValues[i];
            } else {
                int method = get_event_collection_method(gpu_ctl->added_events->cuda_evts[i]);
                switch (method) {
                case CUDA_SUM:
                    counter_vals[evt_pos] += metricValues[i];
                    break;
                case CUDA_MIN:
                    counter_vals[evt_pos] = counter_vals[evt_pos] < metricValues[i] ?
                                            counter_vals[evt_pos] : metricValues[i];
                    break;
                case CUDA_MAX:
                    counter_vals[evt_pos] = counter_vals[evt_pos] > metricValues[i] ?
counter_vals[evt_pos] : metricValues[i];
                    break;
                case CUDA_AVG:
                    // (size * average + value) / (size + 1)
                    // size - current number of values in the average
                    // average - current average
                    // value - number to add to the average
                    // Note: the running average lives at evt_pos, not i
                    counter_vals[evt_pos] = (state->read_count * counter_vals[evt_pos] + metricValues[i]) / (state->read_count + 1);
                    break;
                default:
                    counter_vals[evt_pos] = metricValues[i];
                    break;
                }
            }
        }
        free(metricValues);
        *counters = counter_vals;
        papi_errno = begin_pass();
        if (papi_errno != PAPI_OK) {
            return papi_errno;
        }
        char rangeName[PAPI_MIN_STR_LEN];
        int strLen = snprintf(rangeName, PAPI_MIN_STR_LEN, "PAPI_Range_%d", gpu_ctl->dev_id);
        if (strLen < 0 || strLen >= PAPI_MIN_STR_LEN) {
            ERRDBG("Failed to fully write range name.\n");
            return PAPI_EBUF;
        }
        papi_errno = push_range(rangeName);
        if (papi_errno != PAPI_OK) {
            return papi_errno;
        }
        papi_errno = destroy_metrics_evaluator(pMetricsEvaluator);
        if (papi_errno != PAPI_OK) {
            return papi_errno;
        }
    }
    state->read_count++;
    cudaCheckErrors( cuCtxSetCurrentPtr(userCtx), return PAPI_EMISC );
    return PAPI_OK;
}

/** @class cuptip_ctx_reset
 * @brief Code to reset Cuda hardware counter values.
 * @param state
 *   Struct that holds read count, running, cuptip_info_t, and
 *   cuptip_gpu_state_t.
*/
int cuptip_ctx_reset(cuptip_control_t state)
{
    COMPDBG("Entering.\n");
    int i;
    for (i = 0; i < state->read_count; i++) {
        state->counters[i] = 0;
    }
    state->read_count = 0;
    return PAPI_OK;
}

/** @class cuptip_ctx_stop
 * @brief Code to stop counting a PAPI event set containing Cuda hardware events.
 * @param state
 *   Struct that holds read count, running, cuptip_info_t, and
 *   cuptip_gpu_state_t.
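The CUDA_AVG branch in cuptip_ctx_read folds each new reading into a running mean with (size * average + value) / (size + 1). The sketch below is a standalone check of that update rule (the helper name is made up for the example); it uses the same integer long long arithmetic as the component, so it truncates the same way.

```c
#include <assert.h>

/* Incremental mean: fold one new value x into an average built from n values.
 * Mirrors the CUDA_AVG update in cuptip_ctx_read, where n is read_count. */
static long long ex_running_avg(long long avg, long long n, long long x)
{
    return (n * avg + x) / (n + 1);
}
```

Because the division is integer division, sequences whose true mean is not integral drift slightly, which matches the component's behavior for long long counters.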
*/ int cuptip_ctx_stop(cuptip_control_t state) { COMPDBG("Entering.\n"); CUcontext userCtx = NULL; cudaCheckErrors( cuCtxGetCurrentPtr(&userCtx), return PAPI_EMISC ); int dev_id; for (dev_id=0; dev_id < numDevicesOnMachine; dev_id++) { // Skip devices that will require the Events API to be profiled int cupti_api = determine_dev_cc_major(dev_id); if (cupti_api != API_PERFWORKS) { if (cupti_api == API_EVENTS) { continue; } else { return PAPI_EMISC; } } cuptip_gpu_state_t *gpu_ctl = &(state->gpu_ctl[dev_id]); if (gpu_ctl->added_events->count == 0) { continue; } CUcontext ctx = NULL; int papi_errno = cuptic_ctxarr_get_ctx(state->info, dev_id, &ctx); if (papi_errno != PAPI_OK) { return papi_errno; } cudaCheckErrors( cuCtxSetCurrentPtr(ctx), return PAPI_EMISC ); papi_errno = end_profiling_session(); if (papi_errno != PAPI_OK) { SUBDBG("Failed to end profiling session.\n"); return papi_errno; } papi_errno = cuptic_device_release(state->gpu_ctl[dev_id].added_events); if (papi_errno != PAPI_OK) { return papi_errno; } COMPDBG("Stopped and ended profiling session for device %d\n", gpu_ctl->dev_id); } cudaCheckErrors( cuCtxSetCurrentPtr(userCtx), return PAPI_EMISC ); return PAPI_OK; } /** @class cuptip_ctx_destroy * @brief Free allocated memory in start - stop workflow and * reset config images. * @param *pstate * Struct that holds read count, running, cuptip_info_t, and * cuptip_gpu_state_t. 
*/
int cuptip_ctx_destroy(cuptip_control_t *pstate)
{
    COMPDBG("Entering.\n");
    cuptip_control_t state = *pstate;
    int i;
    for (i = 0; i < numDevicesOnMachine; i++) {
        free_and_reset_configuration_images( &(state->gpu_ctl[i]) );
        cuptiu_event_table_destroy( &(state->gpu_ctl[i].added_events) );
        // Free the created rawMetricRequests from cuptip_ctx_start
        int j;
        for (j = 0; j < state->gpu_ctl[i].numberOfRawMetricRequests; j++) {
            free((void *) state->gpu_ctl[i].rawMetricRequests[j].pMetricName);
        }
        free(state->gpu_ctl[i].rawMetricRequests);
    }
    // Free the allocated memory from cuptip_ctx_create
    free(state->counters);
    free(state->gpu_ctl);
    free(state);
    *pstate = NULL;
    return PAPI_OK;
}

/** @class get_event_collection_method
 * @brief Determine the collection method of the event. Can be avg, max, min, or sum.
 * @param *evt_name
 *   Cuda native event name. E.g. dram__bytes.avg
*/
int get_event_collection_method(const char *evt_name)
{
    if (strstr(evt_name, ".avg") != NULL) {
        return CUDA_AVG;
    } else if (strstr(evt_name, ".max") != NULL) {
        return CUDA_MAX;
    } else if (strstr(evt_name, ".min") != NULL) {
        return CUDA_MIN;
    } else if (strstr(evt_name, ".sum") != NULL) {
        return CUDA_SUM;
    } else {
        return CUDA_DEFAULT;
    }
}

/** @class cuptip_shutdown
 * @brief Free memory and unload function pointers.
*/
int cuptip_shutdown(void)
{
    COMPDBG("Entering.\n");
    shutdown_event_stats_table();
    shutdown_event_table();
    int papi_errno = deinitialize_cupti_profiler_api();
    if (papi_errno != PAPI_OK) {
        return papi_errno;
    }
    papi_errno = unload_nvpw_sym();
    if (papi_errno != PAPI_OK) {
        return papi_errno;
    }
    papi_errno = unload_cupti_perf_sym();
    if (papi_errno != PAPI_OK) {
        return papi_errno;
    }
    return PAPI_OK;
}

/** @class evt_id_create
 * @brief Create event ID. Function is needed for cuptip_event_enum.
 *
 * @param *info
 *   Structure which contains member variables of device, flags, and nameid.
 * @param *event_id
 *   Created event id.
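get_event_collection_method keys purely off the ".avg"/".max"/".min"/".sum" token embedded in the metric name. A minimal standalone version behaves as shown below; the EX_* enum values are stand-ins, since CUDA_AVG and friends are component-internal constants.

```c
#include <string.h>

/* Stand-in values for the component's CUDA_* collection-method constants */
enum { EX_CUDA_DEFAULT, EX_CUDA_AVG, EX_CUDA_MAX, EX_CUDA_MIN, EX_CUDA_SUM };

/* Infer the collection method from the statistic token in the metric name,
 * the same substring test get_event_collection_method performs. */
static int ex_collection_method(const char *evt_name)
{
    if (strstr(evt_name, ".avg") != NULL) return EX_CUDA_AVG;
    if (strstr(evt_name, ".max") != NULL) return EX_CUDA_MAX;
    if (strstr(evt_name, ".min") != NULL) return EX_CUDA_MIN;
    if (strstr(evt_name, ".sum") != NULL) return EX_CUDA_SUM;
    return EX_CUDA_DEFAULT;
}
```

Note the token may sit mid-name (e.g. `sm__cycles_elapsed.sum.per_second`), which is why strstr rather than a suffix comparison is used.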
*/ int evt_id_create(event_info_t *info, uint32_t *event_id) { *event_id = (uint32_t)(info->stat << STAT_SHIFT); *event_id |= (uint32_t)(info->device << DEVICE_SHIFT); *event_id |= (uint32_t)(info->flags << QLMASK_SHIFT); *event_id |= (uint32_t)(info->nameid << NAMEID_SHIFT); return PAPI_OK; } /** @class evt_id_to_info * @brief Convert event id to info. Function is needed for cuptip_event_enum. * * @param event_id * An event id. * @param *info * Structure which contains member variables of device, flags, and nameid. */ int evt_id_to_info(uint32_t event_id, event_info_t *info) { info->stat = (uint32_t)((event_id & STAT_MASK) >> STAT_SHIFT); info->device = (uint32_t)((event_id & DEVICE_MASK) >> DEVICE_SHIFT); info->flags = (uint32_t)((event_id & QLMASK_MASK) >> QLMASK_SHIFT); info->nameid = (uint32_t)((event_id & NAMEID_MASK) >> NAMEID_SHIFT); if (info->stat >= (1 << STAT_WIDTH)) { return PAPI_ENOEVNT; } if (info->device >= numDevicesOnMachine) { return PAPI_ENOEVNT; } if (0 == (info->flags & DEVICE_FLAG) && info->device > 0) { return PAPI_ENOEVNT; } if (info->nameid >= cuptiu_table_p->count) { return PAPI_ENOEVNT; } return PAPI_OK; } /** @class init_event_table * @brief For a device get and store the metric names. 
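evt_id_create and evt_id_to_info pack a statistic index, device id, qualifier flags, and name id into one 32-bit event code via shifts and masks. The round-trip sketch below is illustrative only: the EX_* shifts and widths are assumptions for the example, not the component's real STAT_SHIFT/DEVICE_SHIFT/QLMASK_SHIFT/NAMEID_SHIFT values, which live in the component headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: 16-bit nameid | 2 flag bits | 7 device bits | 7 stat bits */
#define EX_NAMEID_SHIFT 0
#define EX_QLMASK_SHIFT 16
#define EX_DEVICE_SHIFT 18
#define EX_STAT_SHIFT   25

/* Pack the four fields into one 32-bit event code */
static uint32_t ex_pack(uint32_t stat, uint32_t device, uint32_t flags, uint32_t nameid)
{
    return (stat << EX_STAT_SHIFT) | (device << EX_DEVICE_SHIFT) |
           (flags << EX_QLMASK_SHIFT) | (nameid << EX_NAMEID_SHIFT);
}

/* Recover the fields by shifting back and masking off each field's width */
static void ex_unpack(uint32_t id, uint32_t *stat, uint32_t *device,
                      uint32_t *flags, uint32_t *nameid)
{
    *stat   = (id >> EX_STAT_SHIFT)   & 0x7F;
    *device = (id >> EX_DEVICE_SHIFT) & 0x7F;
    *flags  = (id >> EX_QLMASK_SHIFT) & 0x3;
    *nameid = (id >> EX_NAMEID_SHIFT) & 0xFFFF;
}
```

As in evt_id_to_info, decode is the mirror image of encode, so any field that fits its width survives a pack/unpack round trip unchanged.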
*/ int init_event_table(void) { int dev_id, deviceRecord = 0; // Loop through all available devices on the current system for (dev_id = 0; dev_id < numDevicesOnMachine; dev_id++) { // Skip devices that will require the Events API to be profiled int cupti_api = determine_dev_cc_major(dev_id); if (cupti_api != API_PERFWORKS) { if (cupti_api == API_EVENTS) { continue; } else { return PAPI_EMISC; } } int papi_errno; int found = find_same_chipname(dev_id); // Unique device found, collect the constructed metric names if (found == -1) { // Increment device record if (dev_id > 0) deviceRecord++; papi_errno = enumerate_metrics_for_unique_devices( cuptiu_table_p->avail_gpu_info[deviceRecord].chipName, &cuptiu_table_p->avail_gpu_info[deviceRecord].totalMetricCount, &cuptiu_table_p->avail_gpu_info[deviceRecord].metricNames ); if (papi_errno != PAPI_OK) { return papi_errno; } } // Device metadata already collected, set device record else { deviceRecord = found; } int i; for (i = 0; i < cuptiu_table_p->avail_gpu_info[deviceRecord].totalMetricCount; i++) { papi_errno = get_ntv_events(cuptiu_table_p, cuptiu_table_p->avail_gpu_info[deviceRecord].metricNames[i], dev_id); if (papi_errno != PAPI_OK) { return papi_errno; } } } // Free memory allocated in enumerate_metrics_for_unique_devices and reset totalMetricCount to 0 int recordIdx; for (recordIdx = 0; recordIdx < (deviceRecord + 1); recordIdx++) { int metricIdx; for (metricIdx = 0; metricIdx < cuptiu_table_p->avail_gpu_info[recordIdx].totalMetricCount; metricIdx++) { free(cuptiu_table_p->avail_gpu_info[recordIdx].metricNames[metricIdx]); } free(cuptiu_table_p->avail_gpu_info[recordIdx].metricNames); cuptiu_table_p->avail_gpu_info[recordIdx].totalMetricCount = 0; } return PAPI_OK; } /** @class is_stat * @brief Helper function to determine if a token represents a statistical operation. * * @param token * A string from the event name. Ex. 
"dram__bytes", "avg"
*/
int is_stat(const char *token)
{
    int i;
    for (i = 0; i < NUM_STATS_QUALS; i++) {
        if (strcmp(token, stats[i]) == 0)
            return 1;
    }
    return 0;
}

/** @class restructure_event_name
 * @brief Helper function to restructure the event name
 *
 * @param input
 *   Event name string
 * @param output
 *   Event name string (stat string replaced w/ "stat")
 * @param base
 *   Event name string base (w/o stat)
 * @param stat
 *   Event stat string
*/
int restructure_event_name(const char *input, char *output, char *base, char *stat)
{
    char input_copy[PAPI_HUGE_STR_LEN];
    int strLen = snprintf(input_copy, PAPI_HUGE_STR_LEN, "%s", input);
    if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) {
        ERRDBG("String larger than PAPI_HUGE_STR_LEN");
        return PAPI_EBUF;
    }
    input_copy[sizeof(input_copy) - 1] = '\0';

    char *parts[10] = {0};
    char *token;
    char delimiter[] = ".";
    int segment_count = 0;
    int stat_index = -1;

    // Initialize output strings
    output[0] = '\0';
    base[0] = '\0';
    stat[0] = '\0';

    // Split the string by periods, bounded by the capacity of parts[]
    token = strtok(input_copy, delimiter);
    while (token != NULL && segment_count < 10) {
        parts[segment_count] = token;
        if (is_stat(token) == 1) {
            stat_index = segment_count;
        }
        segment_count++;
        token = strtok(NULL, delimiter);
    }

    // Guard against event names that carry no statistic token
    if (stat_index == -1) {
        ERRDBG("Event name does not contain a statistic qualifier.\n");
        return PAPI_EINVAL;
    }

    // Copy the stat
    strLen = snprintf(stat, PAPI_HUGE_STR_LEN, "%s", parts[stat_index]);
    if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) {
        ERRDBG("String larger than PAPI_HUGE_STR_LEN");
        return PAPI_EBUF;
    }

    // Build base name (everything except the stat)
    int i;
    for (i = 0; i < segment_count; i++) {
        if (i != stat_index) {
            if (base[0] != '\0') {
                strcat(base, ".");
                strcat(output, ".");
            }
            strcat(base, parts[i]);
            strcat(output, parts[i]);
        } else {
            if (output[0] != '\0')
                strcat(output, ".");
            strcat(output, "stat");
        }
    }
    return PAPI_OK;
}

/** @class get_ntv_events
 * @brief Store Cuda native events and their corresponding device(s).
 *
 * @param *evt_table
 *   Structure containing member variables such as name, evt_code, evt_pos, and htable.
 * @param *evt_name
 *   Cuda native event name.
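restructure_event_name produces a template in which the statistic token is replaced by the literal word "stat"; verify_user_added_events and cuptip_evt_code_to_info later splice a concrete statistic back in with a single snprintf over the template. A self-contained sketch of that inverse splice (the helper name and buffer size are illustrative, not part of the component):

```c
#include <stdio.h>
#include <string.h>

/* Replace the first "stat" placeholder in template_name with the given
 * statistic, writing the result to out. Returns 0 on success, -1 on a
 * missing placeholder or truncation. */
static int ex_splice_stat(const char *template_name, const char *stat,
                          char *out, size_t out_len)
{
    const char *pos = strstr(template_name, "stat");
    if (pos == NULL) {
        return -1;  /* template has no "stat" placeholder */
    }
    /* prefix up to the placeholder + statistic + tail after the placeholder */
    int n = snprintf(out, out_len, "%.*s%s%s",
                     (int)(pos - template_name), template_name,
                     stat, pos + strlen("stat"));
    return (n < 0 || (size_t) n >= out_len) ? -1 : 0;
}
```

This mirrors the `%.*s%s%s` reconstruction in verify_user_added_events, including the skip over the 4-character placeholder.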
*/ static int get_ntv_events(cuptiu_event_table_t *evt_table, const char *evt_name, int dev_id) { int papi_errno, strLen; char name_restruct[PAPI_HUGE_STR_LEN]="", name_no_stat[PAPI_HUGE_STR_LEN]="", stat[PAPI_HUGE_STR_LEN]=""; int *count = &evt_table->count; int *event_stats_count = &evt_table->event_stats_count; cuptiu_event_t *events = evt_table->events; StringVector *event_stats = evt_table->event_stats; // Check to see if evt_name argument has been provided if (evt_name == NULL) { return PAPI_EINVAL; } // Check to see if capacity has been correctly allocated if (*count >= evt_table->capacity) { return PAPI_EBUG; } papi_errno = restructure_event_name(evt_name, name_restruct, name_no_stat, stat); if (papi_errno != PAPI_OK){ return papi_errno; } cuptiu_event_t *event; StringVector *stat_vec; if ( htable_find(evt_table->htable, name_no_stat, (void **) &event) != HTABLE_SUCCESS ) { event = &events[*count]; // Increment event count (*count)++; strLen = snprintf(event->name, PAPI_2MAX_STR_LEN, "%s", name_no_stat); if (strLen < 0 || strLen >= PAPI_2MAX_STR_LEN) { ERRDBG("Failed to fully write name with no stat.\n"); return PAPI_EBUF; } strLen = snprintf(event->basenameWithStatReplaced, sizeof(event->basenameWithStatReplaced), "%s", name_restruct); if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) { ERRDBG("String larger than PAPI_HUGE_STR_LEN"); return PAPI_EBUF; } stat_vec = &event_stats[*event_stats_count]; (*event_stats_count)++; event->stat = stat_vec; init_vector(event->stat); papi_errno = push_back(event->stat, stat); if (papi_errno != PAPI_OK){ return papi_errno; } if ( htable_insert(evt_table->htable, name_no_stat, event) != HTABLE_SUCCESS ) { return PAPI_ESYS; } } else { papi_errno = push_back(event->stat, stat); if (papi_errno != PAPI_OK){ return papi_errno; } } cuptiu_dev_set(&event->device_map, dev_id); return PAPI_OK; } /** @class shutdown_event_table * @brief Shutdown cuptiu_event_table_t structure that holds the cuda native * event name and the 
corresponding description. */ static void shutdown_event_table(void) { cuptiu_table_p->count = 0; free(cuptiu_table_p->avail_gpu_info); cuptiu_table_p->avail_gpu_info = NULL; free(cuptiu_table_p->events); cuptiu_table_p->events = NULL; free(cuptiu_table_p); cuptiu_table_p = NULL; } /** @class shutdown_event_stats_table * @brief Shutdown StringVector structure that holds the statistic qualifiers * for event names. */ static void shutdown_event_stats_table(void) { int i; for (i = 0; i < cuptiu_table_p->event_stats_count; i++) { free_vector(&cuptiu_table_p->event_stats[i]); } cuptiu_table_p->event_stats_count = 0; free(cuptiu_table_p->event_stats); } /** @class cuptip_evt_enum * @brief Enumerate Cuda native events. * * @param *event_code * Cuda native event code. * @param modifier * Modifies the search logic. Three modifiers are used PAPI_ENUM_FIRST, * PAPI_ENUM_EVENTS, and PAPI_NTV_ENUM_UMASKS. */ int cuptip_evt_enum(uint32_t *event_code, int modifier) { int papi_errno = PAPI_OK; event_info_t info; SUBDBG("ENTER: event_code: %u, modifier: %d\n", *event_code, modifier); switch(modifier) { case PAPI_ENUM_FIRST: if(cuptiu_table_p->count == 0) { papi_errno = PAPI_ENOEVNT; break; } info.stat = 0; info.device = 0; info.flags = 0; info.nameid = 0; papi_errno = evt_id_create(&info, event_code); break; case PAPI_ENUM_EVENTS: papi_errno = evt_id_to_info(*event_code, &info); if (papi_errno != PAPI_OK) { break; } if (cuptiu_table_p->count > info.nameid + 1) { info.stat = 0; info.device = 0; info.flags = 0; info.nameid++; papi_errno = evt_id_create(&info, event_code); break; } papi_errno = PAPI_ENOEVNT; break; case PAPI_NTV_ENUM_UMASKS: papi_errno = evt_id_to_info(*event_code, &info); if (papi_errno != PAPI_OK) { break; } if (info.flags == 0){ info.stat = 0; info.device = 0; info.flags = STAT_FLAG; papi_errno = evt_id_create(&info, event_code); break; } if (info.flags == STAT_FLAG){ info.stat = 0; info.device = 0; info.flags = DEVICE_FLAG; papi_errno = evt_id_create(&info, 
event_code);
            break;
        }
        papi_errno = PAPI_ENOEVNT;
        break;
    default:
        papi_errno = PAPI_EINVAL;
    }
    SUBDBG("EXIT: %s\n", PAPI_strerror(papi_errno));
    return papi_errno;
}

/** @class cuptip_evt_code_to_descr
 * @brief Take a Cuda native event code and retrieve a corresponding description.
 *
 * @param event_code
 *   Cuda native event code.
 * @param *descr
 *   Corresponding description for provided Cuda native event code.
 * @param len
 *   Maximum allotted characters for Cuda native event description.
*/
int cuptip_evt_code_to_descr(uint32_t event_code, char *descr, int len)
{
    event_info_t info;
    int papi_errno = evt_id_to_info(event_code, &info);
    if (papi_errno != PAPI_OK) {
        return papi_errno;
    }
    // Index the event table by the decoded nameid, not the packed event code
    int str_len = snprintf(descr, (size_t) len, "%s", cuptiu_table_p->events[info.nameid].desc);
    if (str_len < 0 || str_len >= len) {
        ERRDBG("String formatting exceeded max string length.\n");
        return PAPI_EBUF;
    }
    return papi_errno;
}

/** @class cuptip_evt_name_to_code
 * @brief Take a Cuda native event name and collect the corresponding event code.
 *
 * @param *name
 *   Cuda native event name.
 * @param *event_code
 *   Corresponding Cuda native event code for provided Cuda native event name.
*/
int cuptip_evt_name_to_code(const char *name, uint32_t *event_code)
{
    int htable_errno, device, stat, flags, nameid, papi_errno = PAPI_OK;
    cuptiu_event_t *event;
    char base[PAPI_MAX_STR_LEN] = { 0 };
    SUBDBG("ENTER: name: %s, event_code: %p\n", name, event_code);
    papi_errno = cuda_verify_no_repeated_qualifiers(name);
    if (papi_errno != PAPI_OK) {
        goto fn_exit;
    }
    papi_errno = evt_name_to_basename(name, base, PAPI_MAX_STR_LEN);
    if (papi_errno != PAPI_OK) {
        goto fn_exit;
    }
    papi_errno = evt_name_to_device(name, &device, base);
    if (papi_errno != PAPI_OK) {
        goto fn_exit;
    }
    papi_errno = evt_name_to_stat(name, &stat, base);
    if (papi_errno != PAPI_OK) {
        goto fn_exit;
    }
    htable_errno = htable_find(cuptiu_table_p->htable, base, (void **) &event);
    if (htable_errno != HTABLE_SUCCESS) {
        papi_errno = (htable_errno == HTABLE_ENOVAL) ?
PAPI_ENOEVNT : PAPI_ECMP;
        goto fn_exit;
    }
    // stat->size is unsigned, so test with > 0: only events that actually
    // carry statistic qualifiers get the STAT_FLAG
    flags = (event->stat->size > 0) ? (STAT_FLAG | DEVICE_FLAG) : DEVICE_FLAG;
    if (flags == 0) {
        papi_errno = PAPI_EINVAL;
        goto fn_exit;
    }
    nameid = (int) (event - cuptiu_table_p->events);
    event_info_t info = { stat, device, flags, nameid };
    papi_errno = evt_id_create(&info, event_code);
    if (papi_errno != PAPI_OK) {
        goto fn_exit;
    }
    papi_errno = evt_id_to_info(*event_code, &info);
    if (papi_errno != PAPI_OK) {
        goto fn_exit;
    }
    // Section handles if the Cuda component is partially disabled
    int *enabledCudaDeviceIds, cudaCmpPartial;
    size_t cudaEnabledDevicesCnt;
    cuptic_partial(&cudaCmpPartial, &enabledCudaDeviceIds, &cudaEnabledDevicesCnt);
    if (cudaCmpPartial) {
        papi_errno = PAPI_PARTIAL;
        int i;
        for (i = 0; i < (int) cudaEnabledDevicesCnt; i++) {
            if (device == enabledCudaDeviceIds[i]) {
                papi_errno = PAPI_OK;
                break;
            }
        }
    }
fn_exit:
    SUBDBG("EXIT: %s\n", PAPI_strerror(papi_errno));
    return papi_errno;
}

/** @class cuptip_evt_code_to_name
 * @brief Returns Cuda native event name for a Cuda native event code. See
 *   evt_code_to_name( ... ) for more details.
 * @param *event_code
 *   Cuda native event code.
 * @param *name
 *   Cuda native event name.
 * @param len
 *   Maximum allotted characters for base Cuda native event name.
*/
int cuptip_evt_code_to_name(uint32_t event_code, char *name, int len)
{
    return evt_code_to_name(event_code, name, len);
}

/** @class evt_code_to_name
 * @brief Helper function for cuptip_evt_code_to_name. Takes a Cuda native event
 *   code and collects the corresponding Cuda native event name.
 * @param *event_code
 *   Cuda native event code.
 * @param *name
 *   Cuda native event name.
 * @param len
 *   Maximum allotted characters for base Cuda native event name.
*/
static int evt_code_to_name(uint32_t event_code, char *name, int len)
{
    event_info_t info;
    int papi_errno = evt_id_to_info(event_code, &info);
    if (papi_errno != PAPI_OK) {
        return papi_errno;
    }
    int str_len;
    char stat[PAPI_HUGE_STR_LEN] = "";
    if (info.stat < NUM_STATS_QUALS) {
        str_len = snprintf(stat, sizeof(stat), "%s", stats[info.stat]);
        if (str_len < 0 || str_len >= PAPI_HUGE_STR_LEN) {
            ERRDBG("String larger than PAPI_HUGE_STR_LEN");
            return PAPI_EBUF;
        }
    }
    switch (info.flags) {
    case (DEVICE_FLAG):
        str_len = snprintf(name, len, "%s:device=%i", cuptiu_table_p->events[info.nameid].name, info.device);
        if (str_len < 0 || str_len >= len) {
            ERRDBG("String formatting exceeded max string length.\n");
            return PAPI_EBUF;
        }
        break;
    case (STAT_FLAG):
        str_len = snprintf(name, len, "%s:stat=%s", cuptiu_table_p->events[info.nameid].name, stat);
        // Check against the snprintf bound (len), not PAPI_HUGE_STR_LEN
        if (str_len < 0 || str_len >= len) {
            ERRDBG("String formatting exceeded max string length.\n");
            return PAPI_EBUF;
        }
        break;
    case (DEVICE_FLAG | STAT_FLAG):
        str_len = snprintf(name, len, "%s:stat=%s:device=%i", cuptiu_table_p->events[info.nameid].name, stat, info.device);
        if (str_len < 0 || str_len >= len) {
            ERRDBG("String formatting exceeded max string length.\n");
            return PAPI_EBUF;
        }
        break;
    default:
        str_len = snprintf(name, len, "%s", cuptiu_table_p->events[info.nameid].name);
        if (str_len < 0 || str_len >= len) {
            ERRDBG("String formatting exceeded max string length.\n");
            return PAPI_EBUF;
        }
        break;
    }
    return papi_errno;
}

/** @class cuptip_evt_code_to_info
 * @brief Takes a Cuda native event code and collects info such as Cuda native
 *   event name, Cuda native event description, and number of devices.
 * @param event_code
 *   Cuda native event code.
 * @param *info
 *   Structure for member variables such as symbol, short description, and
 *   long description.
*/
int cuptip_evt_code_to_info(uint32_t event_code, PAPI_event_info_t *info)
{
    event_info_t inf;
    int papi_errno = evt_id_to_info(event_code, &inf);
    if (papi_errno != PAPI_OK) {
        return papi_errno;
    }
    const char *stat_position = strstr(cuptiu_table_p->events[inf.nameid].basenameWithStatReplaced, "stat");
    if (stat_position == NULL) {
        // A missing "stat" placeholder is an internal table error, not an allocation failure
        return PAPI_EBUG;
    }
    size_t basename_len = stat_position - cuptiu_table_p->events[inf.nameid].basenameWithStatReplaced;
    char reconstructedEventName[PAPI_HUGE_STR_LEN] = "";
    int strLen = snprintf(reconstructedEventName, PAPI_MAX_STR_LEN, "%.*s%s%s",
                          (int) basename_len, cuptiu_table_p->events[inf.nameid].basenameWithStatReplaced,
                          cuptiu_table_p->events[inf.nameid].stat->arrayMetricStatistics[0],
                          stat_position + 4);
    if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) {
        ERRDBG("Failed to fully write reconstructed event name.\n");
        return PAPI_EBUF;
    }
    int i;
    // For a Cuda event collect the description, units, and number of passes
    if (cuptiu_table_p->events[inf.nameid].desc[0] == '\0') {
        int dev_id = -1;
        for (i = 0; i < numDevicesOnMachine; ++i) {
            if (cuptiu_dev_check(cuptiu_table_p->events[inf.nameid].device_map, i)) {
                dev_id = i;
                break;
            }
        }
        if (dev_id == -1) {
            SUBDBG("Failed to find a matching device in the device map.\n");
            return PAPI_EINVAL;
        }
        papi_errno = get_metric_properties( cuptiu_table_p->avail_gpu_info[dev_id].chipName,
                                            reconstructedEventName,
                                            cuptiu_table_p->events[inf.nameid].desc );
        if (papi_errno != PAPI_OK) {
            return papi_errno;
        }
    }
    char all_stat[PAPI_HUGE_STR_LEN] = "";
    switch (inf.flags) {
        case (0):
        {
            // Store details for the Cuda event
            strLen = snprintf( info->symbol, PAPI_HUGE_STR_LEN, "%s", cuptiu_table_p->events[inf.nameid].name );
            if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) {
                ERRDBG("Failed to fully write metric name in case 0.\n");
                return PAPI_EBUF;
            }
            strLen = snprintf( info->long_descr, PAPI_HUGE_STR_LEN, "%s", cuptiu_table_p->events[inf.nameid].desc );
            if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) {
                ERRDBG("Failed to fully write long description in case 0.\n");
                return PAPI_EBUF;
            }
            break;
        }
        case DEVICE_FLAG:
        {
            char devices[PAPI_MAX_STR_LEN] = { 0 };
            int
init_metric_dev_id; for (i = 0; i < numDevicesOnMachine; ++i) { if (cuptiu_dev_check(cuptiu_table_p->events[inf.nameid].device_map, i)) { // For an event, store the first device found to use with :device=#, // as on a heterogeneous system events may not appear on each device if (devices[0] == '\0') { init_metric_dev_id = i; } int strLen = snprintf(devices + strlen(devices), PAPI_MAX_STR_LEN, "%i,", i); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { ERRDBG("Failed to fully write device qualifiers.\n"); } } } *(devices + strlen(devices) - 1) = 0; // Store details for the Cuda event strLen = snprintf( info->symbol, PAPI_HUGE_STR_LEN, "%s:device=%i", cuptiu_table_p->events[inf.nameid].name, init_metric_dev_id ); if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) { ERRDBG("Failed to fully write metric name in case DEVICE_FLAG.\n"); return PAPI_EBUF; } strLen = snprintf( info->long_descr, PAPI_HUGE_STR_LEN, "%s masks:Mandatory device qualifier [%s]", cuptiu_table_p->events[inf.nameid].desc, devices ); if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) { ERRDBG("Failed to fully write long description in case DEVICE_FLAG.\n"); return PAPI_EBUF; } break; } case STAT_FLAG: { all_stat[0]= '\0'; size_t current_len = strlen(all_stat); for (size_t i = 0; i < cuptiu_table_p->events[inf.nameid].stat->size; i++) { size_t remaining_space = PAPI_HUGE_STR_LEN - current_len - 1; // Calculate remaining space // Ensure there's enough space for the string before concatenating if (remaining_space > 0) { strncat(all_stat, cuptiu_table_p->events[inf.nameid].stat->arrayMetricStatistics[i], remaining_space); current_len += strlen(cuptiu_table_p->events[inf.nameid].stat->arrayMetricStatistics[i]); } else { ERRDBG("Not enough space for the all_stat string.\n"); return PAPI_EBUF; } // Add a comma only if there is space and it is not the last element if (i < cuptiu_table_p->events[inf.nameid].stat->size - 1 && remaining_space > 2) { strncat(all_stat, ", ", remaining_space - 2); current_len += 2; // Account for the added comma and space
} } /* cuda native event name */ strLen = snprintf( info->symbol, PAPI_HUGE_STR_LEN, "%s:stat=%s", cuptiu_table_p->events[inf.nameid].name, cuptiu_table_p->events[inf.nameid].stat->arrayMetricStatistics[0] ); if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) { ERRDBG("Failed to fully write metric name in case STAT_FLAG.\n"); return PAPI_EBUF; } /* cuda native event long description */ strLen = snprintf( info->long_descr, PAPI_HUGE_STR_LEN, "%s masks:Mandatory stat qualifier [%s]", cuptiu_table_p->events[inf.nameid].desc, all_stat ); if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) { ERRDBG("Failed to fully write long description in case STAT_FLAG.\n"); return PAPI_EBUF; } break; } case (STAT_FLAG | DEVICE_FLAG): { int init_metric_dev_id; char devices[PAPI_MAX_STR_LEN] = { 0 }; for (i = 0; i < numDevicesOnMachine; ++i) { if (cuptiu_dev_check(cuptiu_table_p->events[inf.nameid].device_map, i)) { /* for an event, store the first device found to use with :device=#, as on a heterogeneous system events may not appear on each device */ if (devices[0] == '\0') { init_metric_dev_id = i; } sprintf(devices + strlen(devices), "%i,", i); } } *(devices + strlen(devices) - 1) = 0; all_stat[0]= '\0'; size_t current_len = strlen(all_stat); for (size_t i = 0; i < cuptiu_table_p->events[inf.nameid].stat->size; i++) { size_t remaining_space = PAPI_HUGE_STR_LEN - current_len - 1; // Calculate remaining space // Ensure there's enough space for the string before concatenating if (remaining_space > 0) { strncat(all_stat, cuptiu_table_p->events[inf.nameid].stat->arrayMetricStatistics[i], remaining_space); current_len += strlen(cuptiu_table_p->events[inf.nameid].stat->arrayMetricStatistics[i]); } else { ERRDBG("Not enough space for the all_stat string.\n"); return PAPI_EBUF; } // Add a comma only if there is space and it is not the last element if (i < cuptiu_table_p->events[inf.nameid].stat->size - 1 && remaining_space > 2) { strncat(all_stat, ", ", remaining_space -
2); current_len += 2; // Account for the added comma and space } } /* cuda native event name */ strLen = snprintf( info->symbol, PAPI_HUGE_STR_LEN, "%s:stat=%s:device=%i", cuptiu_table_p->events[inf.nameid].name, cuptiu_table_p->events[inf.nameid].stat->arrayMetricStatistics[0], init_metric_dev_id); if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) { ERRDBG("String larger than PAPI_HUGE_STR_LEN"); return PAPI_EBUF; } /* cuda native event long description */ strLen = snprintf( info->long_descr, PAPI_HUGE_STR_LEN, "%s masks:Mandatory stat qualifier [%s]:Mandatory device qualifier [%s]", cuptiu_table_p->events[inf.nameid].desc, all_stat, devices ); if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) { ERRDBG("String larger than PAPI_HUGE_STR_LEN"); return PAPI_EBUF; } break; } default: papi_errno = PAPI_EINVAL; } return papi_errno; } /** @class evt_name_to_basename * @brief Convert a Cuda native event name with a device qualifier appended to * it, back to the base Cuda native event name provided by NVIDIA. * @param *name * Cuda native event name with a device qualifier appended. * @param *base * Base Cuda native event name (excludes device qualifier). * @param len * Maximum allotted characters for base Cuda native event name. */ static int evt_name_to_basename(const char *name, char *base, int len) { char *p = strstr(name, ":"); if (p) { if (len <= (int)(p - name)) { return PAPI_EBUF; } strncpy(base, name, (size_t)(p - name)); base[p - name] = '\0'; } else { if (len <= (int) strlen(name)) { return PAPI_EBUF; } strncpy(base, name, (size_t) len); } return PAPI_OK; } /** @class cuda_verify_no_repeated_qualifiers * @brief Verify that a user has not added multiple device or stats qualifiers * to an event name. * * @param *eventName * User provided event name we need to verify.
*/ static int cuda_verify_no_repeated_qualifiers(const char *eventName) { int numDeviceQualifiers = 0, numStatsQualifiers = 0; char tmpEventName[PAPI_2MAX_STR_LEN]; int strLen = snprintf(tmpEventName, PAPI_2MAX_STR_LEN, "%s", eventName); if (strLen < 0 || strLen >= PAPI_2MAX_STR_LEN) { ERRDBG("Failed to fully write eventName into tmpEventName.\n"); return PAPI_EBUF; } char *token = strtok(tmpEventName, ":"); while(token != NULL) { if (strncmp(token, "device", 6) == 0) { numDeviceQualifiers++; } else if (strncmp(token, "stat", 4) == 0){ numStatsQualifiers++; } token = strtok(NULL, ":"); } if (numDeviceQualifiers > 1 || numStatsQualifiers > 1) { ERRDBG("Provided Cuda event has multiple device or stats qualifiers appended.\n"); return PAPI_ENOEVNT; } return PAPI_OK; } /** @class cuda_verify_qualifiers * @brief Verify that the device and/or stats qualifier provided by the user * is valid. E.g. :device=# or :stat=avg. * * @param flag * Device or stats flag define. Allows us to determine the case to enter for * the switch statement. * @param *qualifierName * Name of the qualifier we need to verify. E.g. :device or :stat. * @param equalitySignPosition * Position of where the equal sign is located in the qualifier string name. * @param *qualifierValue * Upon verifying the provided qualifier is valid, store either a device index * or a statistic index. */ static int cuda_verify_qualifiers(int flag, char *qualifierName, int equalitySignPosition, int *qualifierValue) { int pos = equalitySignPosition; // Verify that an equal sign was provided where it was supposed to be if (qualifierName[pos] != '=') { SUBDBG("Improper qualifier name. No equal sign found.\n"); return PAPI_ENOEVNT; } switch(flag) { case DEVICE_FLAG: { // Verify that the next character after the equal sign is indeed a digit pos++; int isDigit = (unsigned) qualifierName[pos] - '0' < 10; if (!isDigit) { SUBDBG("Improper device qualifier name. 
Digit does not follow equal sign.\n"); return PAPI_ENOEVNT; } // Verify that only qualifiers have been appended char *endPtr; *qualifierValue = (int) strtol(qualifierName + strlen(":device="), &endPtr, 10); // Check to make sure only qualifiers have been appended if (*endPtr != '\0') { if (strncmp(endPtr, ":stat", 5) != 0) { return PAPI_ENOEVNT; } } return PAPI_OK; } case STAT_FLAG: { qualifierName += 6; // Move past ":stat=" int i; for (i = 0; i < NUM_STATS_QUALS; i++) { size_t token_len = strlen(stats[i]); if (strncmp(qualifierName, stats[i], token_len) == 0) { // Check to make sure only qualifiers have been appended char *no_excess_chars = qualifierName + token_len; if (strlen(no_excess_chars) == 0 || strncmp(no_excess_chars, ":device", 7) == 0) { *qualifierValue = i; return PAPI_OK; } } } return PAPI_ENOEVNT; } default: SUBDBG("Flag provided is not accounted for in switch statement.\n"); return PAPI_EINVAL; } } /** @class evt_name_to_device * @brief Return the device number for a user provided Cuda native event. * This can be done with a device qualifier present (:device=#) or * we internally find the first device the native event exists for. * @param *name * Cuda native event name with a device qualifier appended. * @param *device * Device number. */ static int evt_name_to_device(const char *name, int *device, const char *base) { char *p = strstr(name, ":device"); // User did provide :device=# qualifier if (p != NULL) { int equalitySignPos = 7; int papi_errno = cuda_verify_qualifiers(DEVICE_FLAG, p, equalitySignPos, device); if (papi_errno != PAPI_OK) { return papi_errno; } } // User did not provide :device=# qualifier else { int i, htable_errno; cuptiu_event_t *event; htable_errno = htable_find(cuptiu_table_p->htable, base, (void **) &event); if (htable_errno != HTABLE_SUCCESS) { return PAPI_EINVAL; } // Search for the first device the event exists for. 
for (i = 0; i < numDevicesOnMachine; ++i) { if (cuptiu_dev_check(event->device_map, i)) { *device = i; return PAPI_OK; } } return PAPI_ENOEVNT; } return PAPI_OK; } /** @class evt_name_to_stat * @brief Take a Cuda native event name with a stat qualifier appended to * it and collect the statistic. * @param *name * Cuda native event name with a stat qualifier appended. * @param *stat * Stat collected. */ static int evt_name_to_stat(const char *name, int *stat, const char *base) { char *p = strstr(name, ":stat"); if (p != NULL) { int equalitySignPos = 5; int papi_errno = cuda_verify_qualifiers(STAT_FLAG, p, equalitySignPos, stat); if (papi_errno != PAPI_OK) { return papi_errno; } } else { cuptiu_event_t *event; int htable_errno = htable_find(cuptiu_table_p->htable, base, (void **) &event); if (htable_errno != HTABLE_SUCCESS) { return PAPI_ENOEVNT; } int i; for (i = 0; i < NUM_STATS_QUALS; i++) { size_t token_len = strlen(stats[i]); if (strncmp(event->stat->arrayMetricStatistics[0], stats[i], token_len) == 0) { *stat = i; return PAPI_OK; } } return PAPI_ENOEVNT; } return PAPI_OK; } /** @class assign_chipnames_for_a_device_index * @brief For each device found, assign a chipname. */ static int assign_chipnames_for_a_device_index(void) { char chipName[PAPI_MIN_STR_LEN]; int dev_id; for (dev_id = 0; dev_id < numDevicesOnMachine; dev_id++) { int retval = get_chip_name(dev_id, chipName); if (PAPI_OK != retval ) { return PAPI_EMISC; } int strLen = snprintf(cuptiu_table_p->avail_gpu_info[dev_id].chipName, PAPI_MIN_STR_LEN, "%s", chipName); if (strLen < 0 || strLen >= PAPI_MIN_STR_LEN) { SUBDBG("Failed to fully write chip name.\n"); return PAPI_EBUF; } } return PAPI_OK; } static int determine_dev_cc_major(int dev_id) { int cc; int papi_errno = get_gpu_compute_capability(dev_id, &cc); if (papi_errno != PAPI_OK) { return papi_errno; } if (cc >= 70) { return API_PERFWORKS; } // TODO: Once the Events API is added back, move this to either cupti_utils or papi_cupti_common // with updated logic. 
else { return API_EVENTS; } } /** * @} ******************************************************************************/ /***************************************************************************//** * @name Metrics Evaluator * @{ */ /** @class enumerate_metrics_for_unique_devices * @brief Get the total number of metrics on a device and the subsequent metric names * using the Metrics Evaluator API. * * @param *pChipName * A Cuda device chip name. * @param *totalNumMetrics * Count of the total number of metrics found on a device. * @param ***arrayOfMetricNames * Constructed metric names. With the Metrics Evaluator API, a metric name must be * reconstructed using metricName.rollup.submetric. */ static int enumerate_metrics_for_unique_devices(const char *pChipName, int *totalNumMetrics, char ***arrayOfMetricNames) { NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSize_Params calculateScratchBufferSizeParam = {NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSize_Params_STRUCT_SIZE}; calculateScratchBufferSizeParam.pChipName = pChipName; calculateScratchBufferSizeParam.pCounterAvailabilityImage = NULL; nvpwCheckErrors( NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSizePtr(&calculateScratchBufferSizeParam), return PAPI_EMISC ); uint8_t myScratchBuffer[calculateScratchBufferSizeParam.scratchBufferSize]; NVPW_CUDA_MetricsEvaluator_Initialize_Params metricEvaluatorInitializeParams = {NVPW_CUDA_MetricsEvaluator_Initialize_Params_STRUCT_SIZE}; metricEvaluatorInitializeParams.scratchBufferSize = calculateScratchBufferSizeParam.scratchBufferSize; metricEvaluatorInitializeParams.pScratchBuffer = myScratchBuffer; metricEvaluatorInitializeParams.pChipName = pChipName; metricEvaluatorInitializeParams.pCounterAvailabilityImage = NULL; nvpwCheckErrors( NVPW_CUDA_MetricsEvaluator_InitializePtr(&metricEvaluatorInitializeParams), return PAPI_EMISC ); NVPW_MetricsEvaluator *pMetricsEvaluator = metricEvaluatorInitializeParams.pMetricsEvaluator; char **metricNames = NULL; int i, 
metricCount = 0, papi_errno; for (i = 0; i < NVPW_METRIC_TYPE__COUNT; ++i) { NVPW_MetricType metricType = (NVPW_MetricType)i; NVPW_MetricsEvaluator_GetMetricNames_Params getMetricNamesParams = {NVPW_MetricsEvaluator_GetMetricNames_Params_STRUCT_SIZE}; getMetricNamesParams.metricType = metricType; getMetricNamesParams.pMetricsEvaluator = pMetricsEvaluator; getMetricNamesParams.pPriv = NULL; nvpwCheckErrors( NVPW_MetricsEvaluator_GetMetricNamesPtr(&getMetricNamesParams), return PAPI_EMISC ); size_t metricIdx; for (metricIdx = 0; metricIdx < getMetricNamesParams.numMetrics; ++metricIdx) { size_t metricNameBeginIndex = getMetricNamesParams.pMetricNameBeginIndices[metricIdx]; const char *baseMetricName = &getMetricNamesParams.pMetricNames[metricNameBeginIndex]; char fullMetricName[PAPI_2MAX_STR_LEN]; int strLen = snprintf(fullMetricName, PAPI_2MAX_STR_LEN, "%s", baseMetricName); if (strLen < 0 || strLen >= PAPI_2MAX_STR_LEN) { SUBDBG("Failed to fully append the base metric name.\n"); return PAPI_EBUF; } int rollupMetricIdx; for (rollupMetricIdx = 0; rollupMetricIdx < NVPW_ROLLUP_OP__COUNT; ++rollupMetricIdx) { // Set the starting offset to be used for a metric int offsetForMetricName = strlen(baseMetricName); // Get the rollup metric if applicable // Rollup's are required for Counter and Throughput, but does not apply to Ratio char *rollupMetricName = NULL; if (metricType != NVPW_METRIC_TYPE_RATIO) { papi_errno = get_rollup_metrics(rollupMetricIdx, &rollupMetricName); if (papi_errno != 0) { return papi_errno; } strLen = snprintf(fullMetricName + offsetForMetricName, PAPI_2MAX_STR_LEN - offsetForMetricName, "%s", rollupMetricName); if (strLen < 0 || strLen >= PAPI_2MAX_STR_LEN) { SUBDBG("Failed to fully append rollup metric name.\n"); return PAPI_EBUF; } // Update the offset as a rollup metric was found offsetForMetricName += strlen(rollupMetricName); } // Get the list of submetrics // Submetrics are required for Ratio and Throughput, optional for Counter (here we do 
collect for Counter as well) NVPW_MetricsEvaluator_GetSupportedSubmetrics_Params supportedSubMetrics = {NVPW_MetricsEvaluator_GetSupportedSubmetrics_Params_STRUCT_SIZE}; supportedSubMetrics.pMetricsEvaluator = pMetricsEvaluator; supportedSubMetrics.metricType = metricType; supportedSubMetrics.pPriv = NULL; nvpwCheckErrors( NVPW_MetricsEvaluator_GetSupportedSubmetricsPtr(&supportedSubMetrics), return PAPI_EMISC ); size_t subMetricIdx; for (subMetricIdx = 0; subMetricIdx < supportedSubMetrics.numSupportedSubmetrics; ++subMetricIdx) { char *subMetricName; papi_errno = get_supported_submetrics(supportedSubMetrics.pSupportedSubmetrics[subMetricIdx], &subMetricName); if (papi_errno != 0) { return papi_errno; } if (supportedSubMetrics.pSupportedSubmetrics[subMetricIdx] != NVPW_SUBMETRIC_NONE) { strLen = snprintf(fullMetricName + offsetForMetricName, PAPI_2MAX_STR_LEN - offsetForMetricName, "%s", subMetricName); if (strLen < 0 || strLen >= PAPI_2MAX_STR_LEN) { SUBDBG("Failed to fully append submetric names.\n"); return PAPI_EBUF; } } metricNames = (char **) realloc(metricNames, (metricCount + 1) * sizeof(char *)); if (metricNames == NULL) { SUBDBG("Failed to allocate memory for metricNames.\n"); return PAPI_ENOMEM; } metricNames[metricCount] = (char *) malloc(PAPI_2MAX_STR_LEN * sizeof(char)); if (metricNames[metricCount] == NULL) { SUBDBG("Failed to allocate memory for the index %d in the array metricNames.\n", metricCount); return PAPI_ENOMEM; } // Store the constructed metric name strLen = snprintf(metricNames[metricCount], PAPI_2MAX_STR_LEN, "%s", fullMetricName); if (strLen < 0 || strLen >= PAPI_2MAX_STR_LEN) { SUBDBG("Failed to fully write constructed metric name: %s\n", fullMetricName); return PAPI_EBUF; } metricCount++; } // Avoid counting ratio metrics 4X more than should occur if (metricType == NVPW_METRIC_TYPE_RATIO) { break; } } } } papi_errno = destroy_metrics_evaluator(pMetricsEvaluator); if (papi_errno != PAPI_OK) { return papi_errno; } *totalNumMetrics = 
metricCount; *arrayOfMetricNames = metricNames; return PAPI_OK; } /** @class get_rollup_metrics * @brief Get the appropriate string for a provided member of the NVPW_RollupOp * enum. Note that rollups are required for Counter and Throughput, but * do not apply to Ratio. * @param rollupMetric * A member of the enum NVPW_RollupOp. See nvperf_host.h for a full list. * @param **strRollupMetric * String rollup metric to store based on the rollupMetric parameter. */ static int get_rollup_metrics(NVPW_RollupOp rollupMetric, char **strRollupMetric) { switch(rollupMetric) { case NVPW_ROLLUP_OP_AVG: *strRollupMetric = ".avg"; return PAPI_OK; case NVPW_ROLLUP_OP_MAX: *strRollupMetric = ".max"; return PAPI_OK; case NVPW_ROLLUP_OP_MIN: *strRollupMetric = ".min"; return PAPI_OK; case NVPW_ROLLUP_OP_SUM: *strRollupMetric = ".sum"; return PAPI_OK; default: SUBDBG("Rollup metric was not one of avg, max, min, or sum.\n"); *strRollupMetric = ""; return PAPI_OK; } } /** @class get_supported_submetrics * @brief Get the appropriate string for a provided member of the NVPW_Submetric * enum. Note that submetrics are required for Ratio and Throughput, and optional * for Counter. * @param subMetric * A member of the enum NVPW_Submetric. See nvperf_host.h for a full list. * @param **strSubMetric * String submetric to store based on the subMetric parameter. */ static int get_supported_submetrics(NVPW_Submetric subMetric, char **strSubMetric) { // NOTE: The following submetrics are not supported in CUPTI 11.3 and onwards: // - Burst submetrics: .peak_burst, .pct_of_peak_burst_active, // .pct_of_peak_burst_elapsed, .pct_of_peak_burst_region, // .pct_of_peak_burst_frame. // - Throughput submetrics: .pct_of_peak_burst_active, .pct_of_peak_burst_elapsed // .pct_of_peak_burst_region, .pct_of_peak_burst_frame. 
switch (subMetric) { case NVPW_SUBMETRIC_PEAK_SUSTAINED: *strSubMetric = ".peak_sustained"; return PAPI_OK; case NVPW_SUBMETRIC_PEAK_SUSTAINED_ACTIVE: *strSubMetric = ".peak_sustained_active"; return PAPI_OK; case NVPW_SUBMETRIC_PEAK_SUSTAINED_ACTIVE_PER_SECOND: *strSubMetric = ".peak_sustained_active.per_second"; return PAPI_OK; case NVPW_SUBMETRIC_PEAK_SUSTAINED_ELAPSED: *strSubMetric = ".peak_sustained_elapsed"; return PAPI_OK; case NVPW_SUBMETRIC_PEAK_SUSTAINED_ELAPSED_PER_SECOND: *strSubMetric = ".peak_sustained_elapsed.per_second"; return PAPI_OK; case NVPW_SUBMETRIC_PEAK_SUSTAINED_FRAME: *strSubMetric = ".peak_sustained_frame"; return PAPI_OK; case NVPW_SUBMETRIC_PEAK_SUSTAINED_FRAME_PER_SECOND: *strSubMetric = ".peak_sustained_frame.per_second"; return PAPI_OK; case NVPW_SUBMETRIC_PEAK_SUSTAINED_REGION: *strSubMetric = ".peak_sustained_region"; return PAPI_OK; case NVPW_SUBMETRIC_PEAK_SUSTAINED_REGION_PER_SECOND: *strSubMetric = ".peak_sustained_region.per_second"; return PAPI_OK; case NVPW_SUBMETRIC_PER_CYCLE_ACTIVE: *strSubMetric = ".per_cycle_active"; return PAPI_OK; case NVPW_SUBMETRIC_PER_CYCLE_ELAPSED: *strSubMetric = ".per_cycle_elapsed"; return PAPI_OK; case NVPW_SUBMETRIC_PER_CYCLE_IN_FRAME: *strSubMetric = ".per_cycle_in_frame"; return PAPI_OK; case NVPW_SUBMETRIC_PER_CYCLE_IN_REGION: *strSubMetric = ".per_cycle_in_region"; return PAPI_OK; case NVPW_SUBMETRIC_PER_SECOND: *strSubMetric = ".per_second"; return PAPI_OK; case NVPW_SUBMETRIC_PCT_OF_PEAK_SUSTAINED_ACTIVE: *strSubMetric = ".pct_of_peak_sustained_active"; return PAPI_OK; case NVPW_SUBMETRIC_PCT_OF_PEAK_SUSTAINED_ELAPSED: *strSubMetric = ".pct_of_peak_sustained_elapsed"; return PAPI_OK; case NVPW_SUBMETRIC_PCT_OF_PEAK_SUSTAINED_FRAME: *strSubMetric = ".pct_of_peak_sustained_frame"; return PAPI_OK; case NVPW_SUBMETRIC_PCT_OF_PEAK_SUSTAINED_REGION: *strSubMetric = ".pct_of_peak_sustained_region"; return PAPI_OK; case NVPW_SUBMETRIC_MAX_RATE: *strSubMetric = ".max_rate"; return PAPI_OK; case 
NVPW_SUBMETRIC_PCT: *strSubMetric = ".pct"; return PAPI_OK; case NVPW_SUBMETRIC_RATIO: *strSubMetric = ".ratio"; return PAPI_OK; case NVPW_SUBMETRIC_NONE: default: *strSubMetric = ""; return PAPI_OK; } } /** @class get_metric_properties * @brief For a metric, get the description, units, and number * of passes. * * @param *pChipName * The device chipname. * @param *metricName * A metric name from the Perfworks api. * @param *fullMetricDescription * The constructed metric description with units and number of * passes. */ static int get_metric_properties(const char *pChipName, const char *metricName, char *fullMetricDescription) { NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSize_Params calculateScratchBufferSizeParam = {NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSize_Params_STRUCT_SIZE}; calculateScratchBufferSizeParam.pChipName = pChipName; calculateScratchBufferSizeParam.pCounterAvailabilityImage = NULL; calculateScratchBufferSizeParam.pPriv = NULL; nvpwCheckErrors( NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSizePtr(&calculateScratchBufferSizeParam), return PAPI_EMISC ); uint8_t myScratchBuffer[calculateScratchBufferSizeParam.scratchBufferSize]; NVPW_CUDA_MetricsEvaluator_Initialize_Params metricEvaluatorInitializeParams = {NVPW_CUDA_MetricsEvaluator_Initialize_Params_STRUCT_SIZE}; metricEvaluatorInitializeParams.scratchBufferSize = calculateScratchBufferSizeParam.scratchBufferSize; metricEvaluatorInitializeParams.pScratchBuffer = myScratchBuffer; metricEvaluatorInitializeParams.pChipName = pChipName; metricEvaluatorInitializeParams.pCounterAvailabilityImage = NULL; metricEvaluatorInitializeParams.pCounterDataImage = NULL; metricEvaluatorInitializeParams.pPriv = NULL; nvpwCheckErrors( NVPW_CUDA_MetricsEvaluator_InitializePtr(&metricEvaluatorInitializeParams), return PAPI_EMISC ); NVPW_MetricsEvaluator *pMetricsEvaluator = metricEvaluatorInitializeParams.pMetricsEvaluator; NVPW_MetricEvalRequest metricEvalRequest; int papi_errno = 
get_metric_eval_request(pMetricsEvaluator, metricName, &metricEvalRequest); if (papi_errno != PAPI_OK) { return papi_errno; } NVPW_MetricType metricType = (NVPW_MetricType) metricEvalRequest.metricType; size_t metricIndex = metricEvalRequest.metricIndex; // For a metric, get the description const char *metricDescription = ""; if (metricType == NVPW_METRIC_TYPE_COUNTER) { NVPW_MetricsEvaluator_GetCounterProperties_Params counterPropParams = {NVPW_MetricsEvaluator_GetCounterProperties_Params_STRUCT_SIZE}; counterPropParams.pMetricsEvaluator = pMetricsEvaluator; counterPropParams.counterIndex = metricIndex; counterPropParams.pPriv = NULL; nvpwCheckErrors( NVPW_MetricsEvaluator_GetCounterPropertiesPtr(&counterPropParams), return PAPI_EMISC ); metricDescription = counterPropParams.pDescription; } else if (metricType == NVPW_METRIC_TYPE_RATIO) { NVPW_MetricsEvaluator_GetRatioMetricProperties_Params ratioPropParams = {NVPW_MetricsEvaluator_GetRatioMetricProperties_Params_STRUCT_SIZE}; ratioPropParams.pMetricsEvaluator = pMetricsEvaluator; ratioPropParams.ratioMetricIndex = metricIndex; ratioPropParams.pPriv = NULL; nvpwCheckErrors( NVPW_MetricsEvaluator_GetRatioMetricPropertiesPtr(&ratioPropParams), return PAPI_EMISC ); metricDescription = ratioPropParams.pDescription; } else if (metricType == NVPW_METRIC_TYPE_THROUGHPUT) { NVPW_MetricsEvaluator_GetThroughputMetricProperties_Params throughputPropParams = {NVPW_MetricsEvaluator_GetThroughputMetricProperties_Params_STRUCT_SIZE}; throughputPropParams.pMetricsEvaluator = pMetricsEvaluator; throughputPropParams.throughputMetricIndex = metricIndex; throughputPropParams.pPriv = NULL; nvpwCheckErrors( NVPW_MetricsEvaluator_GetThroughputMetricPropertiesPtr(&throughputPropParams), return PAPI_EMISC ); metricDescription = throughputPropParams.pDescription; } // For a metric, get the dimensional units NVPW_MetricsEvaluator_GetMetricDimUnits_Params dimUnitsParams = {NVPW_MetricsEvaluator_GetMetricDimUnits_Params_STRUCT_SIZE}; 
dimUnitsParams.pMetricsEvaluator = pMetricsEvaluator; dimUnitsParams.pMetricEvalRequest = &metricEvalRequest; dimUnitsParams.metricEvalRequestStructSize = NVPW_MetricEvalRequest_STRUCT_SIZE; dimUnitsParams.dimUnitFactorStructSize = NVPW_DimUnitFactor_STRUCT_SIZE; dimUnitsParams.pDimUnits = NULL; dimUnitsParams.pPriv = NULL; nvpwCheckErrors( NVPW_MetricsEvaluator_GetMetricDimUnitsPtr(&dimUnitsParams), return PAPI_EMISC ); int strLen; char *metricUnits = "unitless"; // It appears that some metrics have a bug which do not return a value of 1 when they should for unitless. if (dimUnitsParams.numDimUnits > 0) { NVPW_DimUnitFactor *dimUnitsFactor = (NVPW_DimUnitFactor *) malloc(dimUnitsParams.numDimUnits * sizeof(NVPW_DimUnitFactor)); if (dimUnitsFactor == NULL) { SUBDBG("Failed to allocate memory for dimUnitsFactor.\n"); return PAPI_ENOMEM; } dimUnitsParams.pDimUnits = dimUnitsFactor; nvpwCheckErrors( NVPW_MetricsEvaluator_GetMetricDimUnitsPtr(&dimUnitsParams), return PAPI_EMISC ); char tmpMetricUnits[PAPI_MAX_STR_LEN] = { 0 }; int i; for (i = 0; i < dimUnitsParams.numDimUnits; i++) { NVPW_MetricsEvaluator_DimUnitToString_Params dimUnitToStringParams = {NVPW_MetricsEvaluator_DimUnitToString_Params_STRUCT_SIZE}; dimUnitToStringParams.pMetricsEvaluator = pMetricsEvaluator; dimUnitToStringParams.dimUnit = dimUnitsFactor[i].dimUnit; dimUnitToStringParams.pPriv = NULL; nvpwCheckErrors( NVPW_MetricsEvaluator_DimUnitToStringPtr(&dimUnitToStringParams), return PAPI_EMISC ); char *unitsFormat = (i == 0) ? 
"%s" : "/%s"; strLen = snprintf(tmpMetricUnits + strlen(tmpMetricUnits), PAPI_MAX_STR_LEN - strlen(tmpMetricUnits), unitsFormat, dimUnitToStringParams.pPluralName); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { SUBDBG("Failed to fully write dimensional units for a metric.\n"); return PAPI_EBUF; } } free(dimUnitsFactor); metricUnits = tmpMetricUnits; } int numOfPasses = 0; papi_errno = get_number_of_passes_for_info(pChipName, pMetricsEvaluator, &metricEvalRequest, &numOfPasses); if (papi_errno != PAPI_OK) { return papi_errno; } char *multipassSupport = ""; if (numOfPasses > 1) { multipassSupport = "(multiple passes not supported)"; } strLen = snprintf(fullMetricDescription, PAPI_HUGE_STR_LEN, "%s. Units=(%s). Numpass=%d%s.", metricDescription, metricUnits, numOfPasses, multipassSupport); if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) { SUBDBG("Failed to fully write metric description.\n"); return PAPI_EBUF; } papi_errno = destroy_metrics_evaluator(pMetricsEvaluator); if (papi_errno != PAPI_OK) { return papi_errno; } return PAPI_OK; } /** @class get_number_of_passes_for_eventsets * @brief For a metric, get the number of passes. Function is specifically * designed to work with the start - stop workflow. * * @param *pChipName * The device chipname. * @param *metricEvaluator * A NVPW_MetricsEvaluator struct. * @param *metricEvalRequest * A created metric eval request for the current metric. * @param *numOfPasses * The total number of passes required by the metric. 
*/ static int get_number_of_passes_for_eventsets(const char *pChipName, const char *metricName, int *numOfPasses) { NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSize_Params calculateScratchBufferSizeParam = {NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSize_Params_STRUCT_SIZE}; calculateScratchBufferSizeParam.pChipName = pChipName; calculateScratchBufferSizeParam.pCounterAvailabilityImage = NULL; calculateScratchBufferSizeParam.pPriv = NULL; nvpwCheckErrors( NVPW_CUDA_MetricsEvaluator_CalculateScratchBufferSizePtr(&calculateScratchBufferSizeParam), return PAPI_EMISC ); uint8_t myScratchBuffer[calculateScratchBufferSizeParam.scratchBufferSize]; NVPW_CUDA_MetricsEvaluator_Initialize_Params metricEvaluatorInitializeParams = {NVPW_CUDA_MetricsEvaluator_Initialize_Params_STRUCT_SIZE}; metricEvaluatorInitializeParams.scratchBufferSize = calculateScratchBufferSizeParam.scratchBufferSize; metricEvaluatorInitializeParams.pScratchBuffer = myScratchBuffer; metricEvaluatorInitializeParams.pChipName = pChipName; metricEvaluatorInitializeParams.pCounterAvailabilityImage = NULL; metricEvaluatorInitializeParams.pCounterDataImage = NULL; metricEvaluatorInitializeParams.pPriv = NULL; nvpwCheckErrors( NVPW_CUDA_MetricsEvaluator_InitializePtr(&metricEvaluatorInitializeParams), return PAPI_EMISC ); NVPW_MetricsEvaluator *pMetricsEvaluator = metricEvaluatorInitializeParams.pMetricsEvaluator; NVPW_MetricEvalRequest metricEvalRequest; int papi_errno = get_metric_eval_request(pMetricsEvaluator, metricName, &metricEvalRequest); if (papi_errno != PAPI_OK) { return papi_errno; } int rawMetricRequestsCount = 0; NVPA_RawMetricRequest *rawMetricRequests = NULL; papi_errno = create_raw_metric_requests(pMetricsEvaluator, &metricEvalRequest, &rawMetricRequests, &rawMetricRequestsCount); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = destroy_metrics_evaluator(pMetricsEvaluator); if (papi_errno != PAPI_OK) { return papi_errno; } NVPW_CUDA_RawMetricsConfig_Create_V2_Params 
rawMetricsConfigCreateParams = {NVPW_CUDA_RawMetricsConfig_Create_V2_Params_STRUCT_SIZE}; rawMetricsConfigCreateParams.activityKind = NVPA_ACTIVITY_KIND_PROFILER; rawMetricsConfigCreateParams.pChipName = pChipName; rawMetricsConfigCreateParams.pCounterAvailabilityImage = NULL; rawMetricsConfigCreateParams.pPriv = NULL; nvpwCheckErrors( NVPW_CUDA_RawMetricsConfig_Create_V2Ptr(&rawMetricsConfigCreateParams), return PAPI_EMISC ); // Destroy pRawMetricsConfig at the end; otherwise, a memory leak will occur NVPA_RawMetricsConfig *pRawMetricsConfig = rawMetricsConfigCreateParams.pRawMetricsConfig; NVPW_RawMetricsConfig_BeginPassGroup_Params beginPassGroupParams = {NVPW_RawMetricsConfig_BeginPassGroup_Params_STRUCT_SIZE}; beginPassGroupParams.pRawMetricsConfig = pRawMetricsConfig; beginPassGroupParams.pPriv = NULL; nvpwCheckErrors( NVPW_RawMetricsConfig_BeginPassGroupPtr(&beginPassGroupParams), return PAPI_EMISC ); NVPW_RawMetricsConfig_AddMetrics_Params addMetricsParams = {NVPW_RawMetricsConfig_AddMetrics_Params_STRUCT_SIZE}; addMetricsParams.pRawMetricsConfig = pRawMetricsConfig; addMetricsParams.pRawMetricRequests = rawMetricRequests; addMetricsParams.numMetricRequests = rawMetricRequestsCount; addMetricsParams.pPriv = NULL; nvpwCheckErrors( NVPW_RawMetricsConfig_AddMetricsPtr(&addMetricsParams), return PAPI_EMISC ); NVPW_RawMetricsConfig_EndPassGroup_Params endPassGroupParams = { NVPW_RawMetricsConfig_EndPassGroup_Params_STRUCT_SIZE}; endPassGroupParams.pRawMetricsConfig = pRawMetricsConfig; endPassGroupParams.pPriv = NULL; nvpwCheckErrors( NVPW_RawMetricsConfig_EndPassGroupPtr(&endPassGroupParams), return PAPI_EMISC ); NVPW_RawMetricsConfig_GetNumPasses_Params rawMetricsConfigGetNumPassesParams = {NVPW_RawMetricsConfig_GetNumPasses_Params_STRUCT_SIZE}; rawMetricsConfigGetNumPassesParams.pRawMetricsConfig = pRawMetricsConfig; rawMetricsConfigGetNumPassesParams.pPriv = NULL; nvpwCheckErrors( NVPW_RawMetricsConfig_GetNumPassesPtr(&rawMetricsConfigGetNumPassesParams), 
return PAPI_EMISC ); size_t numNestingLevels = 1; size_t numIsolatedPasses = rawMetricsConfigGetNumPassesParams.numIsolatedPasses; size_t numPipelinedPasses = rawMetricsConfigGetNumPassesParams.numPipelinedPasses; *numOfPasses = numPipelinedPasses + numIsolatedPasses * numNestingLevels; NVPW_RawMetricsConfig_Destroy_Params rawMetricsConfigDestroyParams = {NVPW_RawMetricsConfig_Destroy_Params_STRUCT_SIZE}; rawMetricsConfigDestroyParams.pRawMetricsConfig = pRawMetricsConfig; rawMetricsConfigDestroyParams.pPriv = NULL; nvpwCheckErrors( NVPW_RawMetricsConfig_DestroyPtr((NVPW_RawMetricsConfig_Destroy_Params *)&rawMetricsConfigDestroyParams), return PAPI_EMISC ); int i; for (i = 0; i < rawMetricRequestsCount; i++) { free((void *) rawMetricRequests[i].pMetricName); } free(rawMetricRequests); return PAPI_OK; } /** @class get_number_of_passes_for_info * @brief For a metric, get the number of passes. Function is specifically * designed to work with the evt_code_to_info workflow. * * @param *pChipName * The device chipname. * @param *metricEvaluator * A NVPW_MetricsEvaluator struct. * @param *metricEvalRequest * A created metric eval request for the current metric. * @param *numOfPasses * The total number of passes required by the metric. 
*/ static int get_number_of_passes_for_info(const char *pChipName, NVPW_MetricsEvaluator *pMetricsEvaluator, NVPW_MetricEvalRequest *metricEvalRequest, int *numOfPasses) { int rawMetricRequestsCount = 0; NVPA_RawMetricRequest *rawMetricRequests = NULL; int papi_errno = create_raw_metric_requests(pMetricsEvaluator, metricEvalRequest, &rawMetricRequests, &rawMetricRequestsCount); if (papi_errno != PAPI_OK) { return papi_errno; } NVPW_CUDA_RawMetricsConfig_Create_V2_Params rawMetricsConfigCreateParams = {NVPW_CUDA_RawMetricsConfig_Create_V2_Params_STRUCT_SIZE}; rawMetricsConfigCreateParams.activityKind = NVPA_ACTIVITY_KIND_PROFILER; rawMetricsConfigCreateParams.pChipName = pChipName; rawMetricsConfigCreateParams.pCounterAvailabilityImage = NULL; rawMetricsConfigCreateParams.pPriv = NULL; nvpwCheckErrors( NVPW_CUDA_RawMetricsConfig_Create_V2Ptr(&rawMetricsConfigCreateParams), return PAPI_EMISC ); // Destroy pRawMetricsConfig at the end; otherwise, a memory leak will occur NVPA_RawMetricsConfig *pRawMetricsConfig = rawMetricsConfigCreateParams.pRawMetricsConfig; NVPW_RawMetricsConfig_BeginPassGroup_Params beginPassGroupParams = {NVPW_RawMetricsConfig_BeginPassGroup_Params_STRUCT_SIZE}; beginPassGroupParams.pRawMetricsConfig = pRawMetricsConfig; beginPassGroupParams.pPriv = NULL; nvpwCheckErrors( NVPW_RawMetricsConfig_BeginPassGroupPtr(&beginPassGroupParams), return PAPI_EMISC ); NVPW_RawMetricsConfig_AddMetrics_Params addMetricsParams = {NVPW_RawMetricsConfig_AddMetrics_Params_STRUCT_SIZE}; addMetricsParams.pRawMetricsConfig = pRawMetricsConfig; addMetricsParams.pRawMetricRequests = rawMetricRequests; addMetricsParams.numMetricRequests = rawMetricRequestsCount; addMetricsParams.pPriv = NULL; nvpwCheckErrors( NVPW_RawMetricsConfig_AddMetricsPtr(&addMetricsParams), return PAPI_EMISC ); NVPW_RawMetricsConfig_EndPassGroup_Params endPassGroupParams = { NVPW_RawMetricsConfig_EndPassGroup_Params_STRUCT_SIZE}; endPassGroupParams.pRawMetricsConfig = pRawMetricsConfig; 
endPassGroupParams.pPriv = NULL; nvpwCheckErrors( NVPW_RawMetricsConfig_EndPassGroupPtr(&endPassGroupParams), return PAPI_EMISC ); NVPW_RawMetricsConfig_GetNumPasses_Params rawMetricsConfigGetNumPassesParams = {NVPW_RawMetricsConfig_GetNumPasses_Params_STRUCT_SIZE}; rawMetricsConfigGetNumPassesParams.pRawMetricsConfig = pRawMetricsConfig; rawMetricsConfigGetNumPassesParams.pPriv = NULL; nvpwCheckErrors( NVPW_RawMetricsConfig_GetNumPassesPtr(&rawMetricsConfigGetNumPassesParams), return PAPI_EMISC ); size_t numNestingLevels = 1; size_t numIsolatedPasses = rawMetricsConfigGetNumPassesParams.numIsolatedPasses; size_t numPipelinedPasses = rawMetricsConfigGetNumPassesParams.numPipelinedPasses; *numOfPasses = numPipelinedPasses + numIsolatedPasses * numNestingLevels; NVPW_RawMetricsConfig_Destroy_Params rawMetricsConfigDestroyParams = {NVPW_RawMetricsConfig_Destroy_Params_STRUCT_SIZE}; rawMetricsConfigDestroyParams.pRawMetricsConfig = pRawMetricsConfig; rawMetricsConfigDestroyParams.pPriv = NULL; nvpwCheckErrors( NVPW_RawMetricsConfig_DestroyPtr((NVPW_RawMetricsConfig_Destroy_Params *)&rawMetricsConfigDestroyParams), return PAPI_EMISC ); int i; for (i = 0; i < rawMetricRequestsCount; i++) { free((void *) rawMetricRequests[i].pMetricName); } free(rawMetricRequests); return PAPI_OK; } /** @class get_metric_eval_request * @brief A simple wrapper for the perfworks api call * NVPW_MetricsEvaluator_ConvertMetricNameToMetricEvalRequest. * * @param *pMetricsEvaluator * A NVPW_MetricsEvaluator struct. * @param *metricName * The name of the metric you want to convert to a metric eval request. * @param *pMetricEvalRequest * Variable to store the created metric eval request. 
*/ static int get_metric_eval_request(NVPW_MetricsEvaluator *pMetricsEvaluator, const char *metricName, NVPW_MetricEvalRequest *pMetricEvalRequest) { NVPW_MetricsEvaluator_ConvertMetricNameToMetricEvalRequest_Params convertMetricToEvalRequest = {NVPW_MetricsEvaluator_ConvertMetricNameToMetricEvalRequest_Params_STRUCT_SIZE}; convertMetricToEvalRequest.pMetricsEvaluator = pMetricsEvaluator; convertMetricToEvalRequest.pMetricName = metricName; convertMetricToEvalRequest.pMetricEvalRequest = pMetricEvalRequest; convertMetricToEvalRequest.metricEvalRequestStructSize = NVPW_MetricEvalRequest_STRUCT_SIZE; convertMetricToEvalRequest.pPriv = NULL; nvpwCheckErrors( NVPW_MetricsEvaluator_ConvertMetricNameToMetricEvalRequestPtr(&convertMetricToEvalRequest), return PAPI_EMISC ); return PAPI_OK; } /** @class create_raw_metric_requests * @brief Create raw metric requests for a metric. * * @param *pMetricsEvaluator * A NVPW_MetricsEvaluator struct. * @param *metricEvalRequest * A metric eval request for the metric. * @param **rawMetricRequests * Store the raw metric requests for a metric. * @param *rawMetricRequestsCount * Total number of raw metric requests created. 
*/ static int create_raw_metric_requests(NVPW_MetricsEvaluator *pMetricsEvaluator, NVPW_MetricEvalRequest *metricEvalRequest, NVPA_RawMetricRequest **rawMetricRequests, int *rawMetricRequestsCount) { NVPW_MetricsEvaluator_GetMetricRawDependencies_Params getMetricRawDependenciesParams = {NVPW_MetricsEvaluator_GetMetricRawDependencies_Params_STRUCT_SIZE}; getMetricRawDependenciesParams.pMetricsEvaluator = pMetricsEvaluator; getMetricRawDependenciesParams.pMetricEvalRequests = metricEvalRequest; getMetricRawDependenciesParams.numMetricEvalRequests = 1; // Set to 1 as that is the number of eval requests we will have each time getMetricRawDependenciesParams.metricEvalRequestStructSize = NVPW_MetricEvalRequest_STRUCT_SIZE; getMetricRawDependenciesParams.metricEvalRequestStrideSize = sizeof(NVPW_MetricEvalRequest); getMetricRawDependenciesParams.ppRawDependencies = NULL; getMetricRawDependenciesParams.ppOptionalRawDependencies = NULL; getMetricRawDependenciesParams.pPriv = NULL; nvpwCheckErrors( NVPW_MetricsEvaluator_GetMetricRawDependenciesPtr(&getMetricRawDependenciesParams), return PAPI_EMISC ); const char **rawDependencies; rawDependencies = (const char **) malloc(getMetricRawDependenciesParams.numRawDependencies * sizeof(char *)); if (rawDependencies == NULL) { SUBDBG("Failed to allocate memory for variable rawDependencies.\n"); return PAPI_ENOMEM; } getMetricRawDependenciesParams.ppRawDependencies = rawDependencies; nvpwCheckErrors( NVPW_MetricsEvaluator_GetMetricRawDependenciesPtr(&getMetricRawDependenciesParams), return PAPI_EMISC ); *rawMetricRequests = (NVPA_RawMetricRequest *) realloc(*rawMetricRequests, (getMetricRawDependenciesParams.numRawDependencies + (*rawMetricRequestsCount)) * sizeof(NVPA_RawMetricRequest)); if (*rawMetricRequests == NULL) { SUBDBG("Failed to allocate memory for variable rawMetricRequests.\n"); return PAPI_ENOMEM; } int i; for (i = 0; i < 
getMetricRawDependenciesParams.numRawDependencies; i++) { NVPA_RawMetricRequest rawMetricRequestParams = {NVPA_RAW_METRIC_REQUEST_STRUCT_SIZE}; rawMetricRequestParams.pPriv = NULL; rawMetricRequestParams.pMetricName = strdup(rawDependencies[i]); rawMetricRequestParams.isolated = 1; rawMetricRequestParams.keepInstances = 1; (*rawMetricRequests)[(*rawMetricRequestsCount)] = rawMetricRequestParams; (*rawMetricRequestsCount)++; } free(rawDependencies); return PAPI_OK; } /** @class get_evaluated_metric_values * @brief For a user added metric, get the evaluated gpu value. * * @param *pMetricsEvaluator * A NVPW_MetricsEvaluator struct. * @param *gpu_ctl * Structure of type cuptip_gpu_state_t which has member variables such as * dev_id, rawMetricRequests, numberOfRawMetricRequests, and more. * @param *evaluatedMetricValues * Array to store the evaluated GPU values of the added metrics. */ static int get_evaluated_metric_values(NVPW_MetricsEvaluator *pMetricsEvaluator, cuptip_gpu_state_t *gpu_ctl, long long *evaluatedMetricValues) { int i; for (i = 0; i < gpu_ctl->added_events->count; i++) { NVPW_MetricEvalRequest metricEvalRequest; int papi_errno = get_metric_eval_request(pMetricsEvaluator, gpu_ctl->added_events->cuda_evts[i], &metricEvalRequest); if (papi_errno != PAPI_OK) { return papi_errno; } NVPW_MetricsEvaluator_SetDeviceAttributes_Params setDeviceAttributeParams = {NVPW_MetricsEvaluator_SetDeviceAttributes_Params_STRUCT_SIZE}; setDeviceAttributeParams.pMetricsEvaluator = pMetricsEvaluator; setDeviceAttributeParams.pCounterDataImage = (const uint8_t *) gpu_ctl->counterDataImage.data; setDeviceAttributeParams.counterDataImageSize = gpu_ctl->counterDataImage.size; nvpwCheckErrors( NVPW_MetricsEvaluator_SetDeviceAttributesPtr(&setDeviceAttributeParams), return PAPI_EMISC ); double metricValue; NVPW_MetricsEvaluator_EvaluateToGpuValues_Params evaluateToGpuValuesParams = {NVPW_MetricsEvaluator_EvaluateToGpuValues_Params_STRUCT_SIZE}; evaluateToGpuValuesParams.pMetricsEvaluator = pMetricsEvaluator; evaluateToGpuValuesParams.pMetricEvalRequests = 
&metricEvalRequest; evaluateToGpuValuesParams.numMetricEvalRequests = 1; evaluateToGpuValuesParams.metricEvalRequestStructSize = NVPW_MetricEvalRequest_STRUCT_SIZE; evaluateToGpuValuesParams.metricEvalRequestStrideSize = sizeof(NVPW_MetricEvalRequest); evaluateToGpuValuesParams.pCounterDataImage = gpu_ctl->counterDataImage.data; evaluateToGpuValuesParams.counterDataImageSize = gpu_ctl->counterDataImage.size; evaluateToGpuValuesParams.rangeIndex = 0; evaluateToGpuValuesParams.isolated = 1; evaluateToGpuValuesParams.pMetricValues = &metricValue; nvpwCheckErrors( NVPW_MetricsEvaluator_EvaluateToGpuValuesPtr(&evaluateToGpuValuesParams), return PAPI_EMISC ); evaluatedMetricValues[i] = metricValue; } return PAPI_OK; } /** @class destroy_metrics_evaluator * @brief A simple wrapper for the perfworks api call * NVPW_MetricsEvaluator_Destroy. */ static int destroy_metrics_evaluator(NVPW_MetricsEvaluator *pMetricsEvaluator) { NVPW_MetricsEvaluator_Destroy_Params metricEvaluatorDestroyParams = {NVPW_MetricsEvaluator_Destroy_Params_STRUCT_SIZE}; metricEvaluatorDestroyParams.pMetricsEvaluator = pMetricsEvaluator; metricEvaluatorDestroyParams.pPriv = NULL; nvpwCheckErrors( NVPW_MetricsEvaluator_DestroyPtr(&metricEvaluatorDestroyParams), return PAPI_EMISC ); return PAPI_OK; } /** * @} ******************************************************************************/ /***************************************************************************//** * @name Functions necessary for the configuration/profiling stage * @{ */ /** @class start_profiling_session * @brief Start a profiling session. * * @param counterDataImage * Contains the size and data. * @param counterDataScratchBufferSize * Contains the size and data. * @param configImage * Contains the size and data. 
*/ static int start_profiling_session(byte_array_t counterDataImage, byte_array_t counterDataScratchBufferSize, byte_array_t configImage) { CUpti_Profiler_BeginSession_Params beginSessionParams = {CUpti_Profiler_BeginSession_Params_STRUCT_SIZE}; beginSessionParams.counterDataImageSize = counterDataImage.size; beginSessionParams.pCounterDataImage = counterDataImage.data; beginSessionParams.counterDataScratchBufferSize = counterDataScratchBufferSize.size; beginSessionParams.pCounterDataScratchBuffer = counterDataScratchBufferSize.data; beginSessionParams.maxLaunchesPerPass = 1; beginSessionParams.maxRangesPerPass = 1; beginSessionParams.range = CUPTI_UserRange; beginSessionParams.replayMode = CUPTI_UserReplay; beginSessionParams.pPriv = NULL; beginSessionParams.ctx = NULL; cuptiCheckErrors( cuptiProfilerBeginSessionPtr(&beginSessionParams), return PAPI_EMISC ); CUpti_Profiler_SetConfig_Params setConfigParams = {CUpti_Profiler_SetConfig_Params_STRUCT_SIZE}; setConfigParams.pConfig = configImage.data; setConfigParams.configSize = configImage.size; // Only set for Application Replay mode. setConfigParams.passIndex = 0; setConfigParams.minNestingLevel = 1; setConfigParams.numNestingLevels = 1; setConfigParams.targetNestingLevel = 1; setConfigParams.pPriv = NULL; cuptiCheckErrors( cuptiProfilerSetConfigPtr(&setConfigParams), return PAPI_EMISC ); return PAPI_OK; } /** @class get_config_image * @brief Generate the ConfigImage binary configuration image * (file format in memory). * * @param chipName * Name of the device being used. * @param *pCounterAvailabilityImageData * Data from cuptiProfilerGetCounterAvailability. * @param *rawMetricRequests * A filled in NVPA_RawMetricRequest. * @param rmr_count * Number of rawMetricRequests. * @param configImage * Variable to store the generated configImage. 
*/ static int get_config_image(const char *chipName, const uint8_t *pCounterAvailabilityImageData, NVPA_RawMetricRequest *rawMetricRequests, int rmr_count, byte_array_t *configImage) { NVPW_CUDA_RawMetricsConfig_Create_V2_Params rawMetricsConfigCreateParamsV2 = {NVPW_CUDA_RawMetricsConfig_Create_V2_Params_STRUCT_SIZE}; rawMetricsConfigCreateParamsV2.activityKind = NVPA_ACTIVITY_KIND_PROFILER; rawMetricsConfigCreateParamsV2.pChipName = chipName; rawMetricsConfigCreateParamsV2.pPriv = NULL; nvpwCheckErrors( NVPW_CUDA_RawMetricsConfig_Create_V2Ptr(&rawMetricsConfigCreateParamsV2), return PAPI_EMISC ); // Destroy pRawMetricsConfig at the end; otherwise, a memory leak will occur NVPA_RawMetricsConfig *pRawMetricsConfig = rawMetricsConfigCreateParamsV2.pRawMetricsConfig; // Query counter availability before starting the profiling session if (pCounterAvailabilityImageData) { NVPW_RawMetricsConfig_SetCounterAvailability_Params setCounterAvailabilityParams = {NVPW_RawMetricsConfig_SetCounterAvailability_Params_STRUCT_SIZE}; setCounterAvailabilityParams.pPriv = NULL; setCounterAvailabilityParams.pRawMetricsConfig = pRawMetricsConfig; setCounterAvailabilityParams.pCounterAvailabilityImage = pCounterAvailabilityImageData; nvpwCheckErrors( NVPW_RawMetricsConfig_SetCounterAvailabilityPtr(&setCounterAvailabilityParams), return PAPI_EMISC ); } // NOTE: maxPassCount is being set to 1 as a final safety net to limit metric collection to a single pass. // Metrics that require multiple passes would fail further down at AddMetrics due to this. // This failure should never occur as we filter for metrics with multiple passes at get_number_of_passes, // which occurs before the get_config_image call. 
NVPW_RawMetricsConfig_BeginPassGroup_Params beginPassGroupParams = {NVPW_RawMetricsConfig_BeginPassGroup_Params_STRUCT_SIZE}; beginPassGroupParams.pRawMetricsConfig = pRawMetricsConfig; beginPassGroupParams.maxPassCount = 1; beginPassGroupParams.pPriv = NULL; nvpwCheckErrors( NVPW_RawMetricsConfig_BeginPassGroupPtr(&beginPassGroupParams), return PAPI_EMISC ); NVPW_RawMetricsConfig_AddMetrics_Params addMetricsParams = {NVPW_RawMetricsConfig_AddMetrics_Params_STRUCT_SIZE}; addMetricsParams.pRawMetricsConfig = pRawMetricsConfig; addMetricsParams.pRawMetricRequests = rawMetricRequests; addMetricsParams.numMetricRequests = rmr_count; addMetricsParams.pPriv = NULL; nvpwCheckErrors( NVPW_RawMetricsConfig_AddMetricsPtr(&addMetricsParams), return PAPI_EMISC ); NVPW_RawMetricsConfig_EndPassGroup_Params endPassGroupParams = {NVPW_RawMetricsConfig_EndPassGroup_Params_STRUCT_SIZE}; endPassGroupParams.pRawMetricsConfig = pRawMetricsConfig; endPassGroupParams.pPriv = NULL; nvpwCheckErrors( NVPW_RawMetricsConfig_EndPassGroupPtr(&endPassGroupParams), return PAPI_EMISC ); NVPW_RawMetricsConfig_GenerateConfigImage_Params generateConfigImageParams = {NVPW_RawMetricsConfig_GenerateConfigImage_Params_STRUCT_SIZE}; generateConfigImageParams.pRawMetricsConfig = pRawMetricsConfig; generateConfigImageParams.pPriv = NULL; nvpwCheckErrors( NVPW_RawMetricsConfig_GenerateConfigImagePtr(&generateConfigImageParams), return PAPI_EMISC ); NVPW_RawMetricsConfig_GetConfigImage_Params getConfigImageParams = {NVPW_RawMetricsConfig_GetConfigImage_Params_STRUCT_SIZE}; getConfigImageParams.pRawMetricsConfig = pRawMetricsConfig; getConfigImageParams.bytesAllocated = 0; getConfigImageParams.pBuffer = NULL; getConfigImageParams.pPriv = NULL; nvpwCheckErrors( NVPW_RawMetricsConfig_GetConfigImagePtr(&getConfigImageParams), return PAPI_EMISC ); byte_array_t *tmpConfigImage; tmpConfigImage = configImage; tmpConfigImage->size = getConfigImageParams.bytesCopied; tmpConfigImage->data = (uint8_t *) 
calloc(tmpConfigImage->size, sizeof(uint8_t)); if (configImage->data == NULL) { SUBDBG("Failed to allocate memory for configImage->data.\n"); return PAPI_ENOMEM; } getConfigImageParams.bytesAllocated = tmpConfigImage->size; getConfigImageParams.pBuffer = tmpConfigImage->data; nvpwCheckErrors( NVPW_RawMetricsConfig_GetConfigImagePtr(&getConfigImageParams), return PAPI_EMISC ); NVPW_RawMetricsConfig_Destroy_Params rawMetricsConfigDestroyParams = {NVPW_RawMetricsConfig_Destroy_Params_STRUCT_SIZE}; rawMetricsConfigDestroyParams.pRawMetricsConfig = pRawMetricsConfig; rawMetricsConfigDestroyParams.pPriv = NULL; nvpwCheckErrors( NVPW_RawMetricsConfig_DestroyPtr((NVPW_RawMetricsConfig_Destroy_Params *)&rawMetricsConfigDestroyParams), return PAPI_EMISC ); return PAPI_OK; } /** @class get_counter_data_prefix_image * @brief Generate the counterDataPrefix binary configuration image * (file format in memory). * * @param chipName * Name of the device being used. * @param *rawMetricRequests * A filled in NVPA_RawMetricRequest. * @param rmr_count * Number of rawMetricRequests. * @param counterDataPrefixImage * Variable to store the generated counterDataPrefix. 
*/ static int get_counter_data_prefix_image(const char *chipName, NVPA_RawMetricRequest *rawMetricRequests, int rmr_count, byte_array_t *counterDataPrefixImage) { NVPW_CUDA_CounterDataBuilder_Create_Params counterDataBuilderCreateParams = {NVPW_CUDA_CounterDataBuilder_Create_Params_STRUCT_SIZE}; counterDataBuilderCreateParams.pChipName = chipName; counterDataBuilderCreateParams.pPriv = NULL; nvpwCheckErrors( NVPW_CUDA_CounterDataBuilder_CreatePtr(&counterDataBuilderCreateParams), return PAPI_EMISC ); NVPW_CounterDataBuilder_AddMetrics_Params builderAddMetricsParams = {NVPW_CounterDataBuilder_AddMetrics_Params_STRUCT_SIZE}; builderAddMetricsParams.pCounterDataBuilder = counterDataBuilderCreateParams.pCounterDataBuilder; builderAddMetricsParams.pRawMetricRequests = rawMetricRequests; builderAddMetricsParams.numMetricRequests = rmr_count; builderAddMetricsParams.pPriv = NULL; nvpwCheckErrors( NVPW_CounterDataBuilder_AddMetricsPtr(&builderAddMetricsParams), return PAPI_EMISC ); NVPW_CounterDataBuilder_GetCounterDataPrefix_Params getCounterDataPrefixParams = {NVPW_CounterDataBuilder_GetCounterDataPrefix_Params_STRUCT_SIZE}; getCounterDataPrefixParams.pCounterDataBuilder = counterDataBuilderCreateParams.pCounterDataBuilder; getCounterDataPrefixParams.bytesAllocated = 0; getCounterDataPrefixParams.pBuffer = NULL; getCounterDataPrefixParams.pPriv = NULL; nvpwCheckErrors( NVPW_CounterDataBuilder_GetCounterDataPrefixPtr(&getCounterDataPrefixParams), return PAPI_EMISC ); byte_array_t *tmpCounterDataPrefixImage; tmpCounterDataPrefixImage = counterDataPrefixImage; tmpCounterDataPrefixImage->size = getCounterDataPrefixParams.bytesCopied; tmpCounterDataPrefixImage->data = (uint8_t *) calloc(tmpCounterDataPrefixImage->size, sizeof(uint8_t)); if (tmpCounterDataPrefixImage->data == NULL) { SUBDBG("Failed to allocate memory for tmpCounterDataPrefixImage->data.\n"); return PAPI_ENOMEM; } getCounterDataPrefixParams.bytesAllocated = tmpCounterDataPrefixImage->size; 
getCounterDataPrefixParams.pBuffer = tmpCounterDataPrefixImage->data; nvpwCheckErrors( NVPW_CounterDataBuilder_GetCounterDataPrefixPtr(&getCounterDataPrefixParams), return PAPI_EMISC ); NVPW_CounterDataBuilder_Destroy_Params counterDataBuilderDestroyParams = {NVPW_CounterDataBuilder_Destroy_Params_STRUCT_SIZE}; counterDataBuilderDestroyParams.pCounterDataBuilder = counterDataBuilderCreateParams.pCounterDataBuilder; counterDataBuilderDestroyParams.pPriv = NULL; nvpwCheckErrors( NVPW_CounterDataBuilder_DestroyPtr((NVPW_CounterDataBuilder_Destroy_Params *)&counterDataBuilderDestroyParams), return PAPI_EMISC ); return PAPI_OK; } /** @class get_counter_data_image * @brief Create a counterDataImage to be used for metric evaluation. * * @param counterDataPrefixImage * Struct containing the size and data of the counterDataPrefix * binary configuration image. * @param counterDataScratchBuffer * Struct to store the size and data of the scratch buffer. * @param counterDataImage * Struct to store the size and data of the counterDataImage. */ static int get_counter_data_image(byte_array_t counterDataPrefixImage, byte_array_t *counterDataScratchBuffer, byte_array_t *counterDataImage) { CUpti_Profiler_CounterDataImageOptions counterDataImageOptions; counterDataImageOptions.pCounterDataPrefix = counterDataPrefixImage.data; counterDataImageOptions.counterDataPrefixSize = counterDataPrefixImage.size; counterDataImageOptions.maxNumRanges = 1; counterDataImageOptions.maxNumRangeTreeNodes = 1; // Only one range is collected per session, so a single range tree node suffices counterDataImageOptions.maxRangeNameLength = 64; // Calculate size of counterDataImage based on counterDataPrefixImage and options. 
CUpti_Profiler_CounterDataImage_CalculateSize_Params calculateSizeParams = {CUpti_Profiler_CounterDataImage_CalculateSize_Params_STRUCT_SIZE}; calculateSizeParams.pOptions = &counterDataImageOptions; calculateSizeParams.sizeofCounterDataImageOptions = CUpti_Profiler_CounterDataImageOptions_STRUCT_SIZE; calculateSizeParams.pPriv = NULL; cuptiCheckErrors( cuptiProfilerCounterDataImageCalculateSizePtr(&calculateSizeParams), return PAPI_EMISC ); // Initialize counterDataImage. CUpti_Profiler_CounterDataImage_Initialize_Params initializeParams = {CUpti_Profiler_CounterDataImage_Initialize_Params_STRUCT_SIZE}; initializeParams.pOptions = &counterDataImageOptions; initializeParams.sizeofCounterDataImageOptions = CUpti_Profiler_CounterDataImageOptions_STRUCT_SIZE; initializeParams.counterDataImageSize = calculateSizeParams.counterDataImageSize; initializeParams.pPriv = NULL; byte_array_t *tmpCounterDataImage; tmpCounterDataImage = counterDataImage; tmpCounterDataImage->size = calculateSizeParams.counterDataImageSize; tmpCounterDataImage->data = (uint8_t *) calloc(tmpCounterDataImage->size, sizeof(uint8_t)); if (counterDataImage->data == NULL) { SUBDBG("Failed to allocate memory for counterDataImage->data.\n"); return PAPI_ENOMEM; } initializeParams.pCounterDataImage = counterDataImage->data; cuptiCheckErrors( cuptiProfilerCounterDataImageInitializePtr(&initializeParams), return PAPI_EMISC ); // Calculate scratchBuffer size based on counterDataImage size and counterDataImage. 
CUpti_Profiler_CounterDataImage_CalculateScratchBufferSize_Params scratchBufferSizeParams = {CUpti_Profiler_CounterDataImage_CalculateScratchBufferSize_Params_STRUCT_SIZE}; scratchBufferSizeParams.counterDataImageSize = calculateSizeParams.counterDataImageSize; scratchBufferSizeParams.pCounterDataImage = counterDataImage->data; scratchBufferSizeParams.pPriv = NULL; cuptiCheckErrors( cuptiProfilerCounterDataImageCalculateScratchBufferSizePtr(&scratchBufferSizeParams), return PAPI_EMISC ); // Create counterDataScratchBuffer. byte_array_t *tmpCounterDataScratchBuffer; tmpCounterDataScratchBuffer = counterDataScratchBuffer; tmpCounterDataScratchBuffer->size = scratchBufferSizeParams.counterDataScratchBufferSize; tmpCounterDataScratchBuffer->data = (uint8_t *) calloc(tmpCounterDataScratchBuffer->size, sizeof(uint8_t)); if (counterDataScratchBuffer->data == NULL) { SUBDBG("Failed to allocate memory for counterDataScratchBuffer->data.\n"); return PAPI_ENOMEM; } // Initialize counterDataScratchBuffer. CUpti_Profiler_CounterDataImage_InitializeScratchBuffer_Params initScratchBufferParams = { CUpti_Profiler_CounterDataImage_InitializeScratchBuffer_Params_STRUCT_SIZE }; initScratchBufferParams.counterDataImageSize = calculateSizeParams.counterDataImageSize; initScratchBufferParams.pCounterDataImage = counterDataImage->data; //uint8_t* pCounterDataImage initScratchBufferParams.counterDataScratchBufferSize = counterDataScratchBuffer->size; initScratchBufferParams.pCounterDataScratchBuffer = counterDataScratchBuffer->data; initScratchBufferParams.pPriv = NULL; cuptiCheckErrors( cuptiProfilerCounterDataImageInitializeScratchBufferPtr(&initScratchBufferParams), return PAPI_EMISC ); return PAPI_OK; } /** @class end_profiling_session * @brief End the started profiling session. 
*/ static int end_profiling_session(void) { int papi_errno = disable_profiling(); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = pop_range(); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = flush_data(); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = unset_config(); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = end_session(); if (papi_errno != PAPI_OK) { return papi_errno; } return PAPI_OK; } /** * @} ******************************************************************************/ /***************************************************************************//** * @name Wrappers for cupti profiler api calls * @{ */ /** @class initialize_cupti_profiler_api * @brief A simple wrapper for the cupti profiler api call * cuptiProfilerInitialize. */ static int initialize_cupti_profiler_api(void) { COMPDBG("Entering.\n"); CUpti_Profiler_Initialize_Params profilerInitializeParams = {CUpti_Profiler_Initialize_Params_STRUCT_SIZE}; profilerInitializeParams.pPriv = NULL; cuptiCheckErrors( cuptiProfilerInitializePtr(&profilerInitializeParams), return PAPI_EMISC ); return PAPI_OK; } /** @class deinitialize_cupti_profiler_api * @brief A simple wrapper for the cupti profiler api call * cuptiProfilerDeInitialize. */ static int deinitialize_cupti_profiler_api(void) { COMPDBG("Entering.\n"); CUpti_Profiler_DeInitialize_Params profilerDeInitializeParams = {CUpti_Profiler_DeInitialize_Params_STRUCT_SIZE}; profilerDeInitializeParams.pPriv = NULL; cuptiCheckErrors( cuptiProfilerDeInitializePtr(&profilerDeInitializeParams), return PAPI_EMISC ); return PAPI_OK; } /** @class enable_profiling * @brief A simple wrapper for the cupti profiler api call * cuptiProfilerEnableProfiling. 
*/ static int enable_profiling(void) { CUpti_Profiler_EnableProfiling_Params enableProfilingParams = {CUpti_Profiler_EnableProfiling_Params_STRUCT_SIZE}; enableProfilingParams.ctx = NULL; // If NULL, the current CUcontext is used enableProfilingParams.pPriv = NULL; cuptiCheckErrors( cuptiProfilerEnableProfilingPtr(&enableProfilingParams), return PAPI_EMISC ); return PAPI_OK; } /** @class begin_pass * @brief A simple wrapper for the cupti profiler api call * cuptiProfilerBeginPass. */ int begin_pass(void) { CUpti_Profiler_BeginPass_Params beginPassParams = {CUpti_Profiler_BeginPass_Params_STRUCT_SIZE}; beginPassParams.ctx = NULL; // If NULL, the current CUcontext is used beginPassParams.pPriv = NULL; cuptiCheckErrors( cuptiProfilerBeginPassPtr(&beginPassParams), return PAPI_EMISC ); return PAPI_OK; } /** @class end_pass * @brief A simple wrapper for the cupti profiler api call * cuptiProfilerEndPass. */ static int end_pass(void) { CUpti_Profiler_EndPass_Params endPassParams = {CUpti_Profiler_EndPass_Params_STRUCT_SIZE}; endPassParams.ctx = NULL; // If NULL, the current CUcontext is used endPassParams.pPriv = NULL; cuptiCheckErrors( cuptiProfilerEndPassPtr(&endPassParams), return PAPI_EMISC ); return PAPI_OK; } /** @class push_range * @brief A simple wrapper for the cupti profiler api call * cuptiProfilerPushRange. */ static int push_range(const char *pRangeName) { CUpti_Profiler_PushRange_Params pushRangeParams = {CUpti_Profiler_PushRange_Params_STRUCT_SIZE}; pushRangeParams.pRangeName = pRangeName; pushRangeParams.rangeNameLength = strlen(pRangeName); pushRangeParams.ctx = NULL; // If NULL, the current CUcontext is used pushRangeParams.pPriv = NULL; cuptiCheckErrors( cuptiProfilerPushRangePtr(&pushRangeParams), return PAPI_EMISC ); return PAPI_OK; } /** @class pop_range * @brief A simple wrapper for the cupti profiler api call * cuptiProfilerPopRange. 
*/ static int pop_range(void) { CUpti_Profiler_PopRange_Params popRangeParams = {CUpti_Profiler_PopRange_Params_STRUCT_SIZE}; popRangeParams.ctx = NULL; // If NULL, the current CUcontext is used popRangeParams.pPriv = NULL; cuptiCheckErrors( cuptiProfilerPopRangePtr(&popRangeParams), return PAPI_EMISC ); return PAPI_OK; } /** @class flush_data * @brief A simple wrapper for the cupti profiler api call * cuptiProfilerFlushCounterData. * * Note that Flush is required to ensure data is returned from the * device when running User Replay mode. */ static int flush_data(void) { CUpti_Profiler_FlushCounterData_Params flushCounterDataParams = {CUpti_Profiler_FlushCounterData_Params_STRUCT_SIZE}; flushCounterDataParams.ctx = NULL; // If NULL, the current CUcontext is used flushCounterDataParams.pPriv = NULL; cuptiCheckErrors( cuptiProfilerFlushCounterDataPtr(&flushCounterDataParams), return PAPI_EMISC ); return PAPI_OK; } /** @class disable_profiling * @brief A simple wrapper for the cupti profiler api call * cuptiProfilerDisableProfiling. */ static int disable_profiling(void) { CUpti_Profiler_DisableProfiling_Params disableProfilingParams = {CUpti_Profiler_DisableProfiling_Params_STRUCT_SIZE}; disableProfilingParams.ctx = NULL; // If NULL, the current CUcontext is used disableProfilingParams.pPriv = NULL; cuptiCheckErrors( cuptiProfilerDisableProfilingPtr(&disableProfilingParams), return PAPI_EMISC ); return PAPI_OK; } /** @class unset_config * @brief A simple wrapper for the cupti profiler api call * cuptiProfilerUnsetConfig. */ static int unset_config(void) { CUpti_Profiler_UnsetConfig_Params unsetConfigParams = {CUpti_Profiler_UnsetConfig_Params_STRUCT_SIZE}; unsetConfigParams.ctx = NULL; // If NULL, the current CUcontext is used unsetConfigParams.pPriv = NULL; cuptiCheckErrors( cuptiProfilerUnsetConfigPtr(&unsetConfigParams), return PAPI_EMISC ); return PAPI_OK; } /** @class end_session * @brief A simple wrapper for the cupti profiler api call * cuptiProfilerEndSession. 
*/ static int end_session(void) { CUpti_Profiler_EndSession_Params endSessionParams = {CUpti_Profiler_EndSession_Params_STRUCT_SIZE}; endSessionParams.ctx = NULL; // If NULL, the current CUcontext is used endSessionParams.pPriv = NULL; cuptiCheckErrors( cuptiProfilerEndSessionPtr(&endSessionParams), return PAPI_EMISC ); return PAPI_OK; } /** * @} ******************************************************************************/ papi-papi-7-2-0-t/src/components/cuda/cupti_profiler.h000066400000000000000000000026411502707512200227320ustar00rootroot00000000000000/** * @file cupti_profiler.h * * @author Treece Burgess tburgess@icl.utk.edu (updated in 2024, redesigned to add device qualifier support.) * @author Anustuv Pal anustuv@icl.utk.edu */ #ifndef __CUPTI_PROFILER_H__ #define __CUPTI_PROFILER_H__ #include "cupti_utils.h" typedef struct cuptip_control_s *cuptip_control_t; /* used to determine collection method in cupti_profiler.c, see cuptip_ctx_read */ #define CUDA_AVG 0x1 #define CUDA_MAX 0x2 #define CUDA_MIN 0x3 #define CUDA_SUM 0x4 #define CUDA_DEFAULT 0x5 /* init and shutdown interfaces */ int cuptip_init(void); int cuptip_shutdown(void); /* native event interfaces */ int cuptip_evt_enum(uint32_t *event_code, int modifier); int cuptip_evt_code_to_descr(uint32_t event_code, char *descr, int len); int cuptip_evt_name_to_code(const char *name, uint32_t *event_code); int cuptip_evt_code_to_name(uint32_t event_code, char *name, int len); int cuptip_evt_code_to_info(uint32_t event_code, PAPI_event_info_t *info); /* profiling context handling interfaces */ int cuptip_ctx_create(cuptic_info_t thr_info, cuptip_control_t *pstate, uint32_t *events_id, int num_events); int cuptip_ctx_destroy(cuptip_control_t *pstate); int cuptip_ctx_start(cuptip_control_t state); int cuptip_ctx_stop(cuptip_control_t state); int cuptip_ctx_read(cuptip_control_t state, long long **counters); int cuptip_ctx_reset(cuptip_control_t state); #endif /* __CUPTI_PROFILER_H__ */ 
papi-papi-7-2-0-t/src/components/cuda/cupti_utils.c /** * @file cupti_utils.c * * @author Treece Burgess tburgess@icl.utk.edu (updated in 2024, redesigned to add device qualifier support.) * @author Anustuv Pal anustuv@icl.utk.edu */ #include #include "papi_memory.h" #include "cupti_utils.h" #include "htable.h" #include "lcuda_debug.h" #define ADDED_EVENTS_INITIAL_CAPACITY 64 int cuptiu_event_table_create_init_capacity(int capacity, int sizeof_rec, cuptiu_event_table_t **pevt_table) { cuptiu_event_table_t *evt_table = (cuptiu_event_table_t *) malloc(sizeof(cuptiu_event_table_t)); if (evt_table == NULL) { goto fn_fail; } evt_table->capacity = capacity; evt_table->count = 0; evt_table->event_stats_count = 0; if (htable_init(&(evt_table->htable)) != HTABLE_SUCCESS) { cuptiu_event_table_destroy(&evt_table); goto fn_fail; } *pevt_table = evt_table; return 0; fn_fail: *pevt_table = NULL; return PAPI_ENOMEM; } void cuptiu_event_table_destroy(cuptiu_event_table_t **pevt_table) { cuptiu_event_table_t *evt_table = *pevt_table; if (evt_table == NULL) return; if (evt_table->htable) { htable_shutdown(evt_table->htable); evt_table->htable = NULL; } free(evt_table); *pevt_table = NULL; } int cuptiu_files_search_in_path(const char *file_name, const char *search_path, char **file_paths) { char path[PATH_MAX]; char command[PATH_MAX]; snprintf(command, PATH_MAX, "find %s -name %s", search_path, file_name); FILE *fp; fp = popen(command, "r"); if (fp == NULL) { ERRDBG("Failed to run system command find using popen.\n"); return -1; } int count = 0; while (fgets(path, PATH_MAX, fp) != NULL) { path[strcspn(path, "\n")] = 0; file_paths[count] = strdup(path); count++; if (count >= CUPTIU_MAX_FILES) { break; } } pclose(fp); if (count == 0) { ERRDBG("%s not found in path PAPI_CUDA_ROOT.\n", file_name); } return count; } // Initialize the stat StringVector void init_vector(StringVector *vec) { 
vec->arrayMetricStatistics = NULL; vec->size = 0; vec->capacity = 0; } // Add a string to the vector int push_back(StringVector *vec, const char *str) { size_t i; for (i = 0; i < vec->size; i++) { if (strcmp(vec->arrayMetricStatistics[i], str) == 0) { return PAPI_OK; } } // Resize if necessary if (vec->size == vec->capacity) { size_t new_capacity = (vec->capacity == 0) ? 4 : vec->capacity * 2; char **new_data = realloc(vec->arrayMetricStatistics, new_capacity * sizeof(char*)); if (new_data == NULL) { ERRDBG("Memory allocation failed\n"); return PAPI_ENOMEM; } vec->arrayMetricStatistics = new_data; vec->capacity = new_capacity; } // Allocate memory for the new string and copy it vec->arrayMetricStatistics[vec->size] = malloc(strlen(str) + 1); if (vec->arrayMetricStatistics[vec->size] == NULL) { ERRDBG("Memory allocation failed\n"); return PAPI_ENOMEM; } int strLen = snprintf(vec->arrayMetricStatistics[vec->size], strlen(str) + 1, "%s", str); if (strLen < 0 || (size_t)strLen >= strlen(str) + 1) { SUBDBG("Failed to fully write added Cuda native event name.\n"); return PAPI_ENOMEM; } vec->size++; // Increase the size return PAPI_OK; } // Free the memory used by the vector void free_vector(StringVector *vec) { for (size_t i = 0; i < vec->size; i++) { free(vec->arrayMetricStatistics[i]); } free(vec->arrayMetricStatistics); vec->arrayMetricStatistics = NULL; vec->size = 0; vec->capacity = 0; } papi-papi-7-2-0-t/src/components/cuda/cupti_utils.h /** * @file cupti_utils.h * * @author Treece Burgess tburgess@icl.utk.edu (updated in 2024, redesigned to add device qualifier support.) 
* @author Anustuv Pal anustuv@icl.utk.edu */ #ifndef __CUPTI_UTILS_H__ #define __CUPTI_UTILS_H__ #include <stdint.h> #include <stddef.h> #include "papi.h" typedef int64_t cuptiu_bitmap_t; typedef int (*cuptiu_dev_get_map_cb)(uint64_t event_id, int *dev_id); typedef struct { char **arrayMetricStatistics; size_t size; size_t capacity; } StringVector; typedef struct event_record_s { char name[PAPI_2MAX_STR_LEN]; char basenameWithStatReplaced[PAPI_2MAX_STR_LEN]; char desc[PAPI_HUGE_STR_LEN]; StringVector *stat; cuptiu_bitmap_t device_map; } cuptiu_event_t; typedef struct gpu_record_s { char chipName[PAPI_MIN_STR_LEN]; int totalMetricCount; char **metricNames; } gpu_record_t; typedef struct event_table_s { unsigned int count; unsigned int event_stats_count; unsigned int capacity; char cuda_evts[30][PAPI_2MAX_STR_LEN]; int cuda_devs[30]; int evt_pos[30]; gpu_record_t *avail_gpu_info; cuptiu_event_t *events; StringVector *event_stats; void *htable; } cuptiu_event_table_t; /* These functions form a simple API to handle a dynamic list of strings */ int cuptiu_event_table_create_init_capacity(int capacity, int sizeof_rec, cuptiu_event_table_t **pevt_table); void cuptiu_event_table_destroy(cuptiu_event_table_t **pevt_table); /* These functions handle a list of strings for statistics qualifiers */ void init_vector(StringVector *vec); int push_back(StringVector *vec, const char *str); void free_vector(StringVector *vec); /* Utility to locate a file in a given path */ #define CUPTIU_MAX_FILES 100 int cuptiu_files_search_in_path(const char *file_name, const char *search_path, char **file_paths); #endif /* __CUPTI_UTILS_H__ */

papi-papi-7-2-0-t/src/components/cuda/htable.h

/** * @file htable.h * @author Giuseppe Congiu * gcongiu@icl.utk.edu * */ #ifndef __HTABLE_H__ #define __HTABLE_H__ #include <stdint.h> #include <stdlib.h> #include <string.h> #include "papi.h" #include "papi_internal.h" #include "papi_memory.h" #define HTABLE_SUCCESS ( 0) #define HTABLE_ENOVAL (-1) #define
HTABLE_EINVAL (-2) #define HTABLE_ENOMEM (-3) #define HTABLE_NEEDS_TO_GROW(table) (table->size > 0 && table->capacity / table->size < 2) #define HTABLE_NEEDS_TO_SHRINK(table) (table->size > 0 && table->capacity / table->size > 8) struct hash_table_entry { char *key; void *val; struct hash_table_entry *next; }; struct hash_table { uint32_t capacity; uint32_t size; struct hash_table_entry **buckets; }; static uint64_t hash_func(const char *); static int create_table(uint64_t, struct hash_table **); static int destroy_table(struct hash_table *); static int rehash_table(struct hash_table *, struct hash_table *); static int move_table(struct hash_table *, struct hash_table *); static int check_n_resize_table(struct hash_table *); static int destroy_table_entries(struct hash_table *); static int create_table_entry(const char *, void *, struct hash_table_entry **); static int destroy_table_entry(struct hash_table_entry *); static int insert_table_entry(struct hash_table *, struct hash_table_entry *); static int delete_table_entry(struct hash_table *, struct hash_table_entry *); static int find_table_entry(struct hash_table *, const char *, struct hash_table_entry **); static inline int htable_init(void **handle) { int htable_errno = HTABLE_SUCCESS; #define HTABLE_MIN_SIZE (8) struct hash_table *table = NULL; htable_errno = create_table(HTABLE_MIN_SIZE, &table); if (htable_errno != HTABLE_SUCCESS) { goto fn_fail; } *handle = table; fn_exit: return htable_errno; fn_fail: *handle = NULL; goto fn_exit; } static inline int htable_shutdown(void *handle) { int htable_errno = HTABLE_SUCCESS; struct hash_table *table = (struct hash_table *) handle; if (table == NULL) { return HTABLE_EINVAL; } destroy_table_entries(table); destroy_table(table); return htable_errno; } static inline int htable_insert(void *handle, const char *key, void *in) { int htable_errno = HTABLE_SUCCESS; struct hash_table *table = (struct hash_table *) handle; if (table == NULL || key == NULL) { return 
HTABLE_EINVAL; } struct hash_table_entry *entry = NULL; htable_errno = find_table_entry(table, key, &entry); if (htable_errno == HTABLE_SUCCESS) { entry->val = in; goto fn_exit; } htable_errno = create_table_entry(key, in, &entry); if (htable_errno != HTABLE_SUCCESS) { goto fn_fail; } htable_errno = insert_table_entry(table, entry); if (htable_errno != HTABLE_SUCCESS) { goto fn_fail; } htable_errno = check_n_resize_table(table); fn_exit: return htable_errno; fn_fail: if (entry) { free(entry); } goto fn_exit; } static inline int htable_delete(void *handle, const char *key) { int htable_errno = HTABLE_SUCCESS; struct hash_table *table = (struct hash_table *) handle; if (table == NULL || key == NULL) { return HTABLE_EINVAL; } struct hash_table_entry *entry = NULL; htable_errno = find_table_entry(table, key, &entry); if (htable_errno != HTABLE_SUCCESS) { return htable_errno; } entry->val = NULL; htable_errno = delete_table_entry(table, entry); if (htable_errno != HTABLE_SUCCESS) { return htable_errno; } htable_errno = destroy_table_entry(entry); if (htable_errno != HTABLE_SUCCESS) { return htable_errno; } return check_n_resize_table(table); } static inline int htable_find(void *handle, const char *key, void **out) { int htable_errno = HTABLE_SUCCESS; struct hash_table *table = (struct hash_table *) handle; if (table == NULL || key == NULL || out == NULL) { return HTABLE_EINVAL; } struct hash_table_entry *entry = NULL; htable_errno = find_table_entry(table, key, &entry); if (htable_errno != HTABLE_SUCCESS) { return htable_errno; } *out = entry->val; return htable_errno; } /** * djb2 hash function */ uint64_t hash_func(const char *string) { uint64_t hash = 5381; int c; while ((c = *string++)) { hash = ((hash << 5) + hash) + c; } return hash; } int create_table(uint64_t size, struct hash_table **table) { int htable_errno = HTABLE_SUCCESS; *table = calloc(1, sizeof(**table)); if (*table == NULL) { htable_errno = HTABLE_ENOMEM; goto fn_exit; } (*table)->buckets =
calloc(size, sizeof(*(*table)->buckets)); if ((*table)->buckets == NULL) { htable_errno = HTABLE_ENOMEM; goto fn_exit; } (*table)->capacity = size; fn_exit: return htable_errno; } int destroy_table(struct hash_table *table) { int htable_errno = HTABLE_SUCCESS; if (table && table->buckets) { free(table->buckets); } if (table) { free(table); } return htable_errno; } int rehash_table(struct hash_table *old_table, struct hash_table *new_table) { uint64_t old_id; for (old_id = 0; old_id < old_table->capacity; ++old_id) { struct hash_table_entry *entry = old_table->buckets[old_id]; struct hash_table_entry *next; while (entry) { next = entry->next; delete_table_entry(old_table, entry); insert_table_entry(new_table, entry); entry = next; } } return HTABLE_SUCCESS; } int move_table(struct hash_table *new_table, struct hash_table *old_table) { int htable_errno = HTABLE_SUCCESS; struct hash_table_entry **old_buckets = old_table->buckets; old_table->capacity = new_table->capacity; old_table->size = new_table->size; old_table->buckets = new_table->buckets; new_table->buckets = NULL; free(old_buckets); return htable_errno; } int destroy_table_entries(struct hash_table *table) { int htable_errno = HTABLE_SUCCESS; uint64_t i; for (i = 0; i < table->capacity; ++i) { struct hash_table_entry *entry = table->buckets[i]; struct hash_table_entry *tmp = NULL; while (entry) { tmp = entry; entry = entry->next; delete_table_entry(table, tmp); destroy_table_entry(tmp); } } return htable_errno; } int check_n_resize_table(struct hash_table *table) { int htable_errno = HTABLE_SUCCESS; struct hash_table *new_table = NULL; char resize = (HTABLE_NEEDS_TO_GROW(table) << 1) | HTABLE_NEEDS_TO_SHRINK(table); if (resize) { uint64_t new_capacity = (resize & 0x2) ? 
table->capacity * 2 : table->capacity / 2; htable_errno = create_table(new_capacity, &new_table); if (htable_errno != HTABLE_SUCCESS) { goto fn_fail; } htable_errno = rehash_table(table, new_table); if (htable_errno != HTABLE_SUCCESS) { goto fn_fail; } move_table(new_table, table); destroy_table(new_table); } fn_exit: return htable_errno; fn_fail: if (new_table) { destroy_table(new_table); } goto fn_exit; } int create_table_entry(const char *key, void *val, struct hash_table_entry **entry) { int htable_errno = HTABLE_SUCCESS; *entry = calloc(1, sizeof(**entry)); if (*entry == NULL) { return HTABLE_ENOMEM; } (*entry)->key = strdup(key); (*entry)->val = val; (*entry)->next = NULL; return htable_errno; } int destroy_table_entry(struct hash_table_entry *entry) { int htable_errno = HTABLE_SUCCESS; free(entry->key); free(entry); return htable_errno; } int insert_table_entry(struct hash_table *table, struct hash_table_entry *entry) { int htable_errno = HTABLE_SUCCESS; uint64_t id = hash_func(entry->key) % table->capacity; if (table->buckets[id]) { entry->next = table->buckets[id]; } table->buckets[id] = entry; ++table->size; return htable_errno; } int delete_table_entry(struct hash_table *table, struct hash_table_entry *entry) { int htable_errno = HTABLE_SUCCESS; uint64_t id = hash_func(entry->key) % table->capacity; if (table->buckets[id] == entry) { table->buckets[id] = entry->next; entry->next = NULL; goto fn_exit; } struct hash_table_entry *prev = table->buckets[id]; struct hash_table_entry *curr = table->buckets[id]->next; while (curr) { if (curr == entry) { prev->next = curr->next; curr->next = NULL; break; } prev = prev->next; curr = curr->next; } fn_exit: --table->size; return htable_errno; } int find_table_entry(struct hash_table *table, const char *key, struct hash_table_entry **entry) { int htable_errno; uint64_t id = hash_func(key) % table->capacity; struct hash_table_entry *head = table->buckets[id]; if (head == NULL) { htable_errno = HTABLE_ENOVAL; goto 
fn_exit; } struct hash_table_entry *curr = head; while (curr && strcmp(curr->key, key)) { curr = curr->next; } *entry = curr; htable_errno = (curr) ? HTABLE_SUCCESS : HTABLE_ENOVAL; fn_exit: return htable_errno; } #endif /* __HTABLE_H__ */

papi-papi-7-2-0-t/src/components/cuda/lcuda_debug.h

/** * @file lcuda_debug.h * @author Anustuv Pal * anustuv@icl.utk.edu */ #ifndef __LCUDA_DEBUG_H__ #define __LCUDA_DEBUG_H__ #include "papi.h" #include "papi_internal.h" /* Macro to either exit or continue depending on switch */ #define EXIT_OR_NOT #ifdef EXIT_ON_ERROR # undef EXIT_OR_NOT # define EXIT_OR_NOT exit(-1) #endif /* Function calls */ #define COMPDBG(format, args...) SUBDBG("COMPDEBUG: " format, ## args); /* General log */ #define LOGDBG(format, args...) SUBDBG("LOG: " format, ## args); /* Lock and unlock calls */ #define LOCKDBG(format, args...) SUBDBG("LOCK: " format, ## args); /* ERROR */ #define ERRDBG(format, args...) SUBDBG("ERROR: " format, ## args); /* Log cuda driver and runtime calls */ #define LOGCUDACALL(format, args...) SUBDBG("CUDACALL: " format, ## args); /* Log cupti calls */ #define LOGCUPTICALL(format, args...) SUBDBG("CUPTICALL: " format, ## args); /* Log perfworks calls */ #define LOGPERFWORKSCALL(format, args...) SUBDBG("PERFWORKSCALL: " format, ## args); #endif /* __LCUDA_DEBUG_H__ */

papi-papi-7-2-0-t/src/components/cuda/linux-cuda.c

/** * @file linux-cuda.c * * @author Treece Burgess tburgess@icl.utk.edu (updated in 2024, redesigned to add device qualifier support.) * @author Anustuv Pal anustuv@icl.utk.edu (updated in 2023, redesigned with multi-threading support.) * @author Tony Castaldo tonycastaldo@icl.utk.edu (updated in 08/2019, to make counters accumulate.) * @author Tony Castaldo tonycastaldo@icl.utk.edu (updated in 2018, to use batch reads and support nvlink metrics.)
* @author Asim YarKhan yarkhan@icl.utk.edu (updated in 2017 to support CUDA metrics) * @author Asim YarKhan yarkhan@icl.utk.edu (updated in 2015 for multiple CUDA contexts/devices) * @author Heike Jagode (First version, in collaboration with Robert Dietrich, TU Dresden) jagode@icl.utk.edu * * @ingroup papi_components * * @brief * This file implements a PAPI component that enables PAPI-C to access * hardware monitoring counters for NVIDIA GPU devices through the CuPTI library. * * The open source software license for PAPI conforms to the BSD * License template. */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <stdint.h> #include "papi_memory.h" #include "cupti_dispatch.h" #include "cupti_config.h" #include "lcuda_debug.h" #define PAPI_CUDA_MPX_COUNTERS 512 #define PAPI_CUDA_MAX_COUNTERS 30 papi_vector_t _cuda_vector; /* init and shutdown functions */ static int cuda_init_component(int cidx); static int cuda_init_thread(hwd_context_t *ctx); static int cuda_init_control_state(hwd_control_state_t *ctl); static int cuda_shutdown_thread(hwd_context_t *ctx); static int cuda_shutdown_component(void); static int cuda_init_comp_presets(void); /* set and update component state */ static int cuda_update_control_state(hwd_control_state_t *ctl, NativeInfo_t *ntv_info, int ntv_count, hwd_context_t *ctx); static int cuda_set_domain(hwd_control_state_t * ctrl, int domain); /* functions to monitor hardware counters */ static int cuda_start(hwd_context_t *ctx, hwd_control_state_t *ctl); static int cuda_read(hwd_context_t *ctx, hwd_control_state_t *ctl, long long **val, int flags); static int cuda_reset(hwd_context_t *ctx, hwd_control_state_t *ctl); static int cuda_stop(hwd_context_t *ctx, hwd_control_state_t *ctl); static int cuda_cleanup_eventset(hwd_control_state_t *ctl); static int cuda_init_private(void); static int cuda_get_evt_count(int *count); /* cuda native event conversion utility functions */ static int cuda_ntv_enum_events(unsigned int *event_code, int modifier); static int
cuda_ntv_code_to_name(unsigned int event_code, char *name, int len); static int cuda_ntv_name_to_code(const char *name, unsigned int *event_code); static int cuda_ntv_code_to_descr(unsigned int event_code, char *descr, int len); static int cuda_ntv_code_to_info(unsigned int event_code, PAPI_event_info_t *info); /* track metadata, such as the EventSet state */ typedef struct { int initialized; int state; int component_id; } cuda_context_t; typedef struct { int num_events; unsigned int domain; unsigned int granularity; unsigned int overflow; unsigned int overflow_signal; unsigned int attached; int component_id; uint32_t *events_id; cuptid_info_t info; /* struct holding read count, gpu_ctl, etc. */ cuptip_control_t cuptid_ctx; } cuda_control_t; papi_vector_t _cuda_vector = { .cmp_info = { .name = "cuda", .short_name = "cuda", .version = "0.1", .description = "CUDA profiling via NVIDIA CuPTI interfaces", .num_mpx_cntrs = PAPI_CUDA_MPX_COUNTERS, .num_cntrs = PAPI_CUDA_MAX_COUNTERS, .default_domain = PAPI_DOM_USER, .default_granularity = PAPI_GRN_THR, .available_granularities = PAPI_GRN_THR, .hardware_intr_sig = PAPI_INT_SIGNAL, /* component specific cmp_info initializations */ .fast_real_timer = 0, .fast_virtual_timer = 0, .attach = 0, .attach_must_ptrace = 0, .available_domains = PAPI_DOM_USER | PAPI_DOM_KERNEL, .initialized = 0, }, .size = { .context = sizeof(cuda_context_t), .control_state = sizeof(cuda_control_t), .reg_value = 1, .reg_alloc = 1, }, .init_component = cuda_init_component, .shutdown_component = cuda_shutdown_component, .init_thread = cuda_init_thread, .shutdown_thread = cuda_shutdown_thread, .ntv_enum_events = cuda_ntv_enum_events, .ntv_code_to_name = cuda_ntv_code_to_name, .ntv_name_to_code = cuda_ntv_name_to_code, .ntv_code_to_descr = cuda_ntv_code_to_descr, .ntv_code_to_info = cuda_ntv_code_to_info, .init_control_state = cuda_init_control_state, .set_domain = cuda_set_domain, .update_control_state = cuda_update_control_state, .cleanup_eventset = 
cuda_cleanup_eventset, .start = cuda_start, .stop = cuda_stop, .read = cuda_read, .reset = cuda_reset, }; static int cuda_init_component(int cidx) { COMPDBG("Entering with component idx: %d\n", cidx); _cuda_vector.cmp_info.CmpIdx = cidx; _cuda_vector.cmp_info.num_native_events = -1; _cuda_lock = PAPI_NUM_LOCK + NUM_INNER_LOCK + cidx; _cuda_vector.cmp_info.disabled = PAPI_EDELAY_INIT; sprintf(_cuda_vector.cmp_info.disabled_reason, "Not initialized. Access component events to initialize it."); return PAPI_EDELAY_INIT; } static int cuda_shutdown_component(void) { COMPDBG("Entering.\n"); if (!_cuda_vector.cmp_info.initialized || _cuda_vector.cmp_info.disabled != PAPI_OK) { return PAPI_OK; } _cuda_vector.cmp_info.initialized = 0; return cuptid_shutdown(); } static int cuda_init_private(void) { int papi_errno = PAPI_OK; _papi_hwi_lock(COMPONENT_LOCK); SUBDBG("ENTER\n"); if (_cuda_vector.cmp_info.initialized) { SUBDBG("Skipping cuda_init_private, as the Cuda event table has already been initialized.\n"); goto fn_exit; } int strLen = snprintf(_cuda_vector.cmp_info.disabled_reason, PAPI_MIN_STR_LEN, "%s", ""); if (strLen < 0 || strLen >= PAPI_MIN_STR_LEN) { SUBDBG("Failed to fully write initial disabled_reason.\n"); } strLen = snprintf(_cuda_vector.cmp_info.partially_disabled_reason, PAPI_MIN_STR_LEN, "%s", ""); if (strLen < 0 || strLen >= PAPI_MIN_STR_LEN) { SUBDBG("Failed to fully write initial partially_disabled_reason.\n"); } papi_errno = cuptid_init(); if (papi_errno != PAPI_OK) { // Get last error message const char *err_string; cuptid_err_get_last(&err_string); // Cuda component is partially disabled if (papi_errno == PAPI_PARTIAL) { _cuda_vector.cmp_info.partially_disabled = 1; strLen = snprintf(_cuda_vector.cmp_info.partially_disabled_reason, PAPI_HUGE_STR_LEN, "%s", err_string); if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) { SUBDBG("Failed to fully write the partially disabled reason.\n"); } // Reset variable that holds error code papi_errno = PAPI_OK; } // Cuda 
component is disabled else { strLen = snprintf(_cuda_vector.cmp_info.disabled_reason, PAPI_HUGE_STR_LEN, "%s", err_string); if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) { SUBDBG("Failed to fully write the disabled reason.\n"); } goto fn_fail; } } // Get the metric count found on a machine int count = 0; papi_errno = cuda_get_evt_count(&count); if (papi_errno != PAPI_OK) { goto fn_fail; } _cuda_vector.cmp_info.num_native_events = count; _cuda_vector.cmp_info.initialized = 1; fn_exit: _cuda_vector.cmp_info.disabled = papi_errno; SUBDBG("EXIT: %s\n", PAPI_strerror(papi_errno)); _papi_hwi_unlock(COMPONENT_LOCK); return papi_errno; fn_fail: goto fn_exit; } static int check_n_initialize(void) { if (!_cuda_vector.cmp_info.initialized) { int papi_errno = cuda_init_private(); if( PAPI_OK != papi_errno ) { return papi_errno; } // Setup the presets. papi_errno = cuda_init_comp_presets(); if( PAPI_OK != papi_errno ) { return papi_errno; } return papi_errno; } return _cuda_vector.cmp_info.disabled; } static int cuda_ntv_enum_events(unsigned int *event_code, int modifier) { SUBDBG("ENTER: event_code: %u, modifier: %d\n", *event_code, modifier); int papi_errno = check_n_initialize(); if (papi_errno != PAPI_OK) { goto fn_exit; } uint32_t code = *(uint32_t *) event_code; papi_errno = cuptid_evt_enum(&code, modifier); *event_code = (unsigned int) code; fn_exit: SUBDBG("EXIT: %s\n", PAPI_strerror(papi_errno)); return papi_errno; fn_fail: goto fn_exit; } static int cuda_ntv_name_to_code(const char *name, unsigned int *event_code) { int papi_errno = check_n_initialize(); if (papi_errno != PAPI_OK) { goto fn_exit; } uint32_t code; papi_errno = cuptid_evt_name_to_code(name, &code); *event_code = (unsigned int) code; fn_exit: SUBDBG("EXIT: %s\n", PAPI_strerror(papi_errno)); return papi_errno; fn_fail: goto fn_exit; } static int cuda_ntv_code_to_name(unsigned int event_code, char *name, int len) { int papi_errno = check_n_initialize(); if (papi_errno != PAPI_OK) { return papi_errno; }
papi_errno = cuptid_evt_code_to_name((uint32_t) event_code, name, len); fn_exit: SUBDBG("EXIT: %s\n", PAPI_strerror(papi_errno)); return papi_errno; fn_fail: goto fn_exit; } static int cuda_ntv_code_to_descr(unsigned int event_code, char *descr, int len) { SUBDBG("ENTER: event_code: %u, descr: %p, len: %d\n", event_code, descr, len); int papi_errno = check_n_initialize(); if (papi_errno != PAPI_OK) { goto fn_fail; } papi_errno = cuptid_evt_code_to_descr((uint32_t) event_code, descr, len); fn_exit: SUBDBG("EXIT: %s\n", PAPI_strerror(papi_errno)); return papi_errno; fn_fail: goto fn_exit; } static int cuda_ntv_code_to_info(unsigned int event_code, PAPI_event_info_t *info) { SUBDBG("ENTER: event_code: %u, info: %p\n", event_code, info); int papi_errno = check_n_initialize(); if (papi_errno != PAPI_OK) { goto fn_fail; } papi_errno = cuptid_evt_code_to_info((uint32_t) event_code, info); fn_exit: SUBDBG("EXIT: %s\n", PAPI_strerror(papi_errno)); return papi_errno; fn_fail: goto fn_exit; } static int cuda_init_thread(hwd_context_t *ctx) { cuda_context_t *cuda_ctx = (cuda_context_t *) ctx; memset(cuda_ctx, 0, sizeof(*cuda_ctx)); cuda_ctx->initialized = 1; cuda_ctx->component_id = _cuda_vector.cmp_info.CmpIdx; return PAPI_OK; } static int cuda_shutdown_thread(hwd_context_t *ctx) { cuda_context_t *cuda_ctx = (cuda_context_t *) ctx; cuda_ctx->initialized = 0; cuda_ctx->state = 0; return PAPI_OK; } static int cuda_init_comp_presets(void) { SUBDBG("ENTER: Init CUDA component presets.\n"); int cidx = _cuda_vector.cmp_info.CmpIdx; char *cname = _cuda_vector.cmp_info.name; /* Setup presets. */ char arch_name[PAPI_2MAX_STR_LEN]; int devIdx = -1; int numDevices = 0; int retval = cuptid_device_get_count(&numDevices); if ( retval != PAPI_OK ) { return PAPI_EMISC; } /* Load preset table for every device type available on the system. * As long as one of the cards has presets defined, then they should * be available. 
*/ for( devIdx = 0; devIdx < numDevices; ++devIdx ) { retval = cuptid_get_chip_name(devIdx, arch_name); if ( retval == PAPI_OK ) { break; } } if ( devIdx > -1 && devIdx < numDevices ) { retval = _papi_load_preset_table_component( cname, arch_name, cidx ); if ( retval != PAPI_OK ) { SUBDBG("EXIT: Failed to init CUDA component presets.\n"); return retval; } } return PAPI_OK; } static int cuda_init_control_state(hwd_control_state_t __attribute__((unused)) *ctl) { COMPDBG("Entering.\n"); return check_n_initialize(); } static int cuda_set_domain(hwd_control_state_t __attribute__((unused)) *ctrl, int domain) { COMPDBG("Entering\n"); if((PAPI_DOM_USER & domain) || (PAPI_DOM_KERNEL & domain) || (PAPI_DOM_OTHER & domain) || (PAPI_DOM_ALL & domain)) return (PAPI_OK); else return (PAPI_EINVAL); } static int update_native_events(cuda_control_t *, NativeInfo_t *, int); static int cuda_update_control_state(hwd_control_state_t *ctl, NativeInfo_t *ntv_info, int ntv_count, hwd_context_t *ctx __attribute__((unused))) { SUBDBG("ENTER: ctl: %p, ntv_info: %p, ntv_count: %d, ctx: %p\n", ctl, ntv_info, ntv_count, ctx); int papi_errno = check_n_initialize(); if (papi_errno != PAPI_OK) { goto fn_exit; } /* needed to make sure multipass events are caught with proper error code (PAPI_EMULPASS)*/ if (ntv_count == 0) { return PAPI_OK; } cuda_control_t *cuda_ctl = (cuda_control_t *) ctl; /* allocating memory for total number of devices */ if (cuda_ctl->info == NULL) { papi_errno = cuptid_thread_info_create(&(cuda_ctl->info)); if (papi_errno != PAPI_OK) { goto fn_exit; } } papi_errno = update_native_events(cuda_ctl, ntv_info, ntv_count); if (papi_errno != PAPI_OK) { goto fn_exit; } /* needed to make sure multipass events are caught with proper error code (PAPI_EMULPASS)*/ papi_errno = cuptid_ctx_create(cuda_ctl->info, &(cuda_ctl->cuptid_ctx), cuda_ctl->events_id, cuda_ctl->num_events); fn_exit: SUBDBG("EXIT: %s\n", PAPI_strerror(papi_errno)); return papi_errno; } struct event_map_item { int
event_id; int frontend_idx; }; int update_native_events(cuda_control_t *ctl, NativeInfo_t *ntv_info, int ntv_count) { int papi_errno = PAPI_OK; struct event_map_item sorted_events[PAPI_CUDA_MAX_COUNTERS]; if (ntv_count != ctl->num_events) { ctl->num_events = ntv_count; if (ntv_count == 0) { free(ctl->events_id); ctl->events_id = NULL; goto fn_exit; } else { ctl->events_id = realloc(ctl->events_id, ntv_count * sizeof(*ctl->events_id)); if (ctl->events_id == NULL) { papi_errno = PAPI_ENOMEM; goto fn_fail; } } } int i; for (i = 0; i < ntv_count; ++i) { sorted_events[i].event_id = ntv_info[i].ni_event; sorted_events[i].frontend_idx = i; } for (i = 0; i < ntv_count; ++i) { ctl->events_id[i] = sorted_events[i].event_id; ntv_info[sorted_events[i].frontend_idx].ni_position = i; } fn_exit: return papi_errno; fn_fail: ctl->num_events = 0; goto fn_exit; } /** @class cuda_start * @brief Start counting Cuda hardware events. * * @param *ctx * Vestigial pointer to structures defined in the components. * They are opaque to the framework and defined as void. * They are remapped to real data in the component routines that use them. * In this case Cuda. * @param *ctl * Contains the encodings necessary for the hardware to set the counters * to the appropriate conditions. */ static int cuda_start(hwd_context_t *ctx, hwd_control_state_t *ctl) { COMPDBG("Entering.\n"); int papi_errno, i; cuda_context_t *cuda_ctx = (cuda_context_t *) ctx; cuda_control_t *cuda_ctl = (cuda_control_t *) ctl; if (cuda_ctx->state == CUDA_EVENTS_RUNNING) { SUBDBG("Error!
Cannot PAPI_start more than one eventset at a time for every component."); papi_errno = PAPI_EISRUN; goto fn_fail; } papi_errno = cuptid_ctx_create(cuda_ctl->info, &(cuda_ctl->cuptid_ctx), cuda_ctl->events_id, cuda_ctl->num_events); if (papi_errno != PAPI_OK) goto fn_fail; /* start profiling */ papi_errno = cuptid_ctx_start( (void *) cuda_ctl->cuptid_ctx); if (papi_errno != PAPI_OK) goto fn_fail; /* update the EventSet state to running */ cuda_ctx->state = CUDA_EVENTS_RUNNING; fn_exit: SUBDBG("EXIT: %s\n", PAPI_strerror(papi_errno)); return papi_errno; fn_fail: cuda_ctx->state = CUDA_EVENTS_STOPPED; goto fn_exit; } /** @class cuda_read * @brief Read the Cuda hardware counters. * * @param *ctl * Contains the encodings necessary for the hardware to set the counters * to the appropriate conditions. * @param **val * Holds the counter values for each added Cuda native event. */ static int cuda_read(hwd_context_t __attribute__((unused)) *ctx, hwd_control_state_t *ctl, long long **val, int __attribute__((unused)) flags) { COMPDBG("Entering.\n"); int papi_errno = PAPI_OK; cuda_control_t *cuda_ctl = (cuda_control_t *) ctl; SUBDBG("ENTER: ctx: %p, ctl: %p, val: %p, flags: %d\n", ctx, ctl, val, flags); if (cuda_ctl->cuptid_ctx == NULL) { SUBDBG("Error! Cannot PAPI_read counters for an eventset that has not been PAPI_start'ed."); papi_errno = PAPI_EMISC; goto fn_fail; } papi_errno = cuptid_ctx_read( cuda_ctl->cuptid_ctx, val ); fn_exit: SUBDBG("EXIT: %s\n", PAPI_strerror(papi_errno)); return papi_errno; fn_fail: goto fn_exit; } /** @class cuda_reset * @brief Reset the Cuda hardware event counts. * * @param *ctl * Contains the encodings necessary for the hardware to set the counters * to the appropriate conditions.
*/ static int cuda_reset(hwd_context_t __attribute__((unused)) *ctx, hwd_control_state_t *ctl) { int papi_errno; cuda_control_t *cuda_ctl = (cuda_control_t *) ctl; if (cuda_ctl->cuptid_ctx == NULL) { SUBDBG("Cannot reset counters for an eventset that has not been started."); return PAPI_EMISC; } papi_errno = cuptid_ctx_reset(cuda_ctl->cuptid_ctx); return papi_errno; } /** @class cuda_stop * @brief Stop counting Cuda hardware events. * * @param *ctx * Vestigial pointer to structures defined in the components. * They are opaque to the framework and defined as void. * They are remapped to real data in the component routines that use them. * In this case Cuda. * @param *ctl * Contains the encodings necessary for the hardware to set the counters * to the appropriate conditions. E.g. Stopped or running. */ int cuda_stop(hwd_context_t *ctx, hwd_control_state_t *ctl) { COMPDBG("Entering.\n"); int papi_errno = PAPI_OK; cuda_context_t *cuda_ctx = (cuda_context_t *) ctx; cuda_control_t *cuda_ctl = (cuda_control_t *) ctl; if (cuda_ctx->state == CUDA_EVENTS_STOPPED) { SUBDBG("Error! Cannot PAPI_stop counters for an eventset that has not been PAPI_start'ed."); papi_errno = PAPI_EMISC; goto fn_fail; } /* stop counting */ papi_errno = cuptid_ctx_stop(cuda_ctl->cuptid_ctx); if (papi_errno != PAPI_OK) { goto fn_fail; } /* free memory that was used */ papi_errno = cuptid_ctx_destroy( &(cuda_ctl->cuptid_ctx) ); if (papi_errno != PAPI_OK) { } /* update EventSet state to stopped */ cuda_ctx->state = CUDA_EVENTS_STOPPED; cuda_ctl->cuptid_ctx = NULL; fn_exit: SUBDBG("EXIT: %s\n", PAPI_strerror(papi_errno)); return papi_errno; fn_fail: goto fn_exit; } /** @class cuda_cleanup_eventset * @brief Remove all Cuda hardware events from a PAPI event set. * * @param *ctl * Contains the encodings necessary for the hardware to set the counters * to the appropriate conditions.
*/ static int cuda_cleanup_eventset(hwd_control_state_t *ctl) { COMPDBG("Entering.\n"); int papi_errno; cuda_control_t *cuda_ctl = (cuda_control_t *) ctl; if (cuda_ctl->info) { papi_errno = cuptid_thread_info_destroy(&(cuda_ctl->info)); if (papi_errno != PAPI_OK) return papi_errno; } /* free int array of event id's and reset number of events */ free(cuda_ctl->events_id); cuda_ctl->events_id = NULL; cuda_ctl->num_events = 0; return PAPI_OK; } /** @class cuda_get_evt_count * @brief Helper function to count the number of Cuda base event names. * This count is shown in the util papi_component_avail. * * @param *count * Count of Cuda base hardware event names. */ static int cuda_get_evt_count(int *count) { uint32_t event_code = 0; if (cuptid_evt_enum(&event_code, PAPI_ENUM_FIRST) == PAPI_OK) { ++(*count); } while (cuptid_evt_enum(&event_code, PAPI_ENUM_EVENTS) == PAPI_OK) { ++(*count); } return PAPI_OK; } papi-papi-7-2-0-t/src/components/cuda/papi_cuda_presets.h000066400000000000000000000300711502707512200233740ustar00rootroot00000000000000#ifndef __PAPI_CUDA_PRESETS_H__ #define __PAPI_CUDA_PRESETS_H__ hwi_presets_t _cuda_presets[PAPI_MAX_cuda_PRESETS] = { /* 0 */ {"PAPI_CUDA_FP16_FMA", "CUDA FP16 FMA instr", "CUDA Half precision (FP16) FMA instructions", 0, 0, PAPI_PRESET_BIT_MSC, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 1 */ {"PAPI_CUDA_BF16_FMA", "CUDA BF16 FMA instr", "CUDA Half precision (BF16) FMA instructions", 0, 0, PAPI_PRESET_BIT_MSC, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 2 */ {"PAPI_CUDA_FP32_FMA", "CUDA FP32 FMA instr", "CUDA Single precision (FP32) FMA instructions", 0, 0, PAPI_PRESET_BIT_MSC, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 3 */ {"PAPI_CUDA_FP64_FMA", "CUDA FP64 FMA instr", "CUDA Double precision (FP64) FMA instructions", 0, 0, PAPI_PRESET_BIT_MSC, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 4 */ {"PAPI_CUDA_FP_FMA", "CUDA FP 
FMA instr", "CUDA floating-point FMA instructions", 0, 0, PAPI_PRESET_BIT_MSC, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 5 */ {"PAPI_CUDA_FP8_OPS", "CUDA FP8 ops", "CUDA 8-bit precision floating-point operations", 0, 0, PAPI_PRESET_BIT_MSC, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 6 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 7 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 8 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 9 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 10 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 11 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 12 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 13 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 14 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 15 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 16 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 17 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 18 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 19 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 21 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 22 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, 
{NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 23 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 24 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 25 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 26 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 27 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 28 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 29 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 30 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 31 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 32 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 33 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 34 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 35 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 36 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 37 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 38 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 39 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 40 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 41 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, 
{NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 42 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 43 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 44 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 45 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 46 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 47 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 48 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 49 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 50 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 51 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 52 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 53 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 54 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 55 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 56 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 57 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 58 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 59 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 60 */ {NULL, NULL, NULL, 0, 0, 
0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 61 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 62 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 63 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 64 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 65 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 66 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 67 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 68 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 69 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 70 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 71 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 72 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 73 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 74 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 75 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 76 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 77 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 78 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 79 */ {NULL, 
NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 80 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 81 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 82 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 83 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 84 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 85 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 86 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 87 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 88 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 89 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 90 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 91 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 92 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 93 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 94 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 95 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 96 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 97 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, 
    /* 98 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}},
    /* 99 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}},
    /*100 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}},
    /*110 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}},
    /*120 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}},
    /*121 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}},
    /*122 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}},
    /*123 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}},
    /*124 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}},
    /*125 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}},
    /*126 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}},
    /*127 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}},
};

#endif /* __PAPI_CUDA_PRESETS_H__ */

papi-papi-7-2-0-t/src/components/cuda/papi_cuda_std_event_defs.h

#ifndef __PAPI_CUDA_STD_EVENT_DEFS_H__
#define __PAPI_CUDA_STD_EVENT_DEFS_H__

#define PAPI_MAX_cuda_PRESETS 128

enum {
    PAPI_CUDA_FP16_FMA_idx = PAPI_cuda_PRESET_OFFSET,
    PAPI_CUDA_BF16_FMA_idx,
    PAPI_CUDA_FP32_FMA_idx,
    PAPI_CUDA_FP64_FMA_idx,
    PAPI_CUDA_FP_FMA_idx,
    PAPI_CUDA_FP8_OPS_idx
};

#define PAPI_CUDA_FP16_FMA (PAPI_CUDA_FP16_FMA_idx | PAPI_PRESET_MASK)
#define PAPI_CUDA_BF16_FMA (PAPI_CUDA_BF16_FMA_idx | PAPI_PRESET_MASK)
#define PAPI_CUDA_FP32_FMA (PAPI_CUDA_FP32_FMA_idx | PAPI_PRESET_MASK)
#define PAPI_CUDA_FP64_FMA
(PAPI_CUDA_FP64_FMA_idx | PAPI_PRESET_MASK)
#define PAPI_CUDA_FP_FMA   (PAPI_CUDA_FP_FMA_idx | PAPI_PRESET_MASK)
#define PAPI_CUDA_FP8_OPS  (PAPI_CUDA_FP8_OPS_idx | PAPI_PRESET_MASK)

#endif /* __PAPI_CUDA_STD_EVENT_DEFS_H__ */

papi-papi-7-2-0-t/src/components/cuda/papi_cupti_common.c

/**
 * @file papi_cupti_common.c
 *
 * @author Treece Burgess tburgess@icl.utk.edu (updated in 2024, redesigned to add device qualifier support.)
 * @author Anustuv Pal anustuv@icl.utk.edu
 */

#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <dlfcn.h>
#include <link.h>
#include <dirent.h>

#include "papi_memory.h"
#include "cupti_config.h"
#include "papi_cupti_common.h"

static void *dl_drv, *dl_rt;
static char cuda_error_string[PAPI_HUGE_STR_LEN];

void *dl_cupti;
unsigned int _cuda_lock;

typedef int64_t gpu_occupancy_t;
static gpu_occupancy_t global_gpu_bitmask;

// Variables to handle partially disabled Cuda component
static int isCudaPartial = 0;
static int enabledDeviceIds[PAPI_CUDA_MAX_DEVICES];
static size_t enabledDevicesCnt = 0;

typedef enum {
    sys_gpu_ccs_unknown = 0,
    sys_gpu_ccs_mixed,
    sys_gpu_ccs_all_lt_70,
    sys_gpu_ccs_all_eq_70,
    sys_gpu_ccs_all_gt_70,
    sys_gpu_ccs_all_lte_70,
    sys_gpu_ccs_all_gte_70
} sys_compute_capabilities_e;

struct cuptic_info {
    CUcontext ctx;
};

// Load necessary functions from Cuda toolkit e.g. cupti or runtime
static int util_load_cuda_sym(void);
static int load_cuda_sym(void);
static int load_cudart_sym(void);
static int load_cupti_common_sym(void);

// Unload the loaded functions from Cuda toolkit e.g.
// cupti or runtime
static int unload_cudart_sym(void);
static int unload_cupti_common_sym(void);
static void unload_linked_cudart_path(void);

// Functions to get library versions
static int util_dylib_cu_runtime_version(void);
static int util_dylib_cupti_version(void);

// Functions to get cuda runtime library path
static int dl_iterate_phdr_cb(struct dl_phdr_info *info, __attribute__((unused)) size_t size, __attribute__((unused)) void *data);
static int get_user_cudart_path(void);

// Function to determine compute capabilities
static int compute_capabilities_on_system(sys_compute_capabilities_e *system_ccs);

// Functions to handle a partially disabled Cuda component
static int get_enabled_devices(void);

// misc.
static int _devmask_events_get(cuptiu_event_table_t *evt_table, gpu_occupancy_t *bitmask);

/* cuda driver function pointers */
CUresult ( *cuCtxGetCurrentPtr ) (CUcontext *);
CUresult ( *cuCtxSetCurrentPtr ) (CUcontext);
CUresult ( *cuCtxDestroyPtr ) (CUcontext);
CUresult ( *cuCtxCreatePtr ) (CUcontext *pctx, unsigned int flags, CUdevice dev);
CUresult ( *cuCtxGetDevicePtr ) (CUdevice *);
CUresult ( *cuDeviceGetPtr ) (CUdevice *, int);
CUresult ( *cuDeviceGetCountPtr ) (int *);
CUresult ( *cuDeviceGetNamePtr ) (char *, int, CUdevice);
CUresult ( *cuDevicePrimaryCtxRetainPtr ) (CUcontext *pctx, CUdevice);
CUresult ( *cuDevicePrimaryCtxReleasePtr ) (CUdevice);
CUresult ( *cuInitPtr ) (unsigned int);
CUresult ( *cuGetErrorStringPtr ) (CUresult error, const char** pStr);
CUresult ( *cuCtxPopCurrentPtr ) (CUcontext * pctx);
CUresult ( *cuCtxPushCurrentPtr ) (CUcontext pctx);
CUresult ( *cuCtxSynchronizePtr ) ();
CUresult ( *cuDeviceGetAttributePtr ) (int *, CUdevice_attribute, CUdevice);

/* cuda runtime function pointers */
cudaError_t ( *cudaGetDeviceCountPtr ) (int *);
cudaError_t ( *cudaGetDevicePtr ) (int *);
const char *( *cudaGetErrorStringPtr ) (cudaError_t);
cudaError_t ( *cudaSetDevicePtr ) (int);
cudaError_t ( *cudaGetDevicePropertiesPtr ) (struct
cudaDeviceProp* prop, int device); cudaError_t ( *cudaDeviceGetAttributePtr ) (int *value, enum cudaDeviceAttr attr, int device); cudaError_t ( *cudaFreePtr ) (void *); cudaError_t ( *cudaDriverGetVersionPtr ) (int *); cudaError_t ( *cudaRuntimeGetVersionPtr ) (int *); /* cupti function pointer */ CUptiResult ( *cuptiGetVersionPtr ) (uint32_t* ); CUptiResult ( *cuptiDeviceGetChipNamePtr ) (CUpti_Device_GetChipName_Params* params); /**@class load_cuda_sym * @brief Search for a variation of the shared object libcuda. */ int load_cuda_sym(void) { int soNamesToSearchCount = 3; const char *soNamesToSearchFor[] = {"libcuda.so", "libcuda.so.1", "libcuda"}; dl_drv = search_and_load_from_system_paths(soNamesToSearchFor, soNamesToSearchCount); if (!dl_drv) { ERRDBG("Loading installed libcuda.so failed. Check that cuda drivers are installed.\n"); goto fn_fail; } cuCtxSetCurrentPtr = DLSYM_AND_CHECK(dl_drv, "cuCtxSetCurrent"); cuCtxGetCurrentPtr = DLSYM_AND_CHECK(dl_drv, "cuCtxGetCurrent"); cuCtxDestroyPtr = DLSYM_AND_CHECK(dl_drv, "cuCtxDestroy"); cuCtxCreatePtr = DLSYM_AND_CHECK(dl_drv, "cuCtxCreate"); cuCtxGetDevicePtr = DLSYM_AND_CHECK(dl_drv, "cuCtxGetDevice"); cuDeviceGetPtr = DLSYM_AND_CHECK(dl_drv, "cuDeviceGet"); cuDeviceGetCountPtr = DLSYM_AND_CHECK(dl_drv, "cuDeviceGetCount"); cuDeviceGetNamePtr = DLSYM_AND_CHECK(dl_drv, "cuDeviceGetName"); cuDevicePrimaryCtxRetainPtr = DLSYM_AND_CHECK(dl_drv, "cuDevicePrimaryCtxRetain"); cuDevicePrimaryCtxReleasePtr = DLSYM_AND_CHECK(dl_drv, "cuDevicePrimaryCtxRelease"); cuInitPtr = DLSYM_AND_CHECK(dl_drv, "cuInit"); cuGetErrorStringPtr = DLSYM_AND_CHECK(dl_drv, "cuGetErrorString"); cuCtxPopCurrentPtr = DLSYM_AND_CHECK(dl_drv, "cuCtxPopCurrent"); cuCtxPushCurrentPtr = DLSYM_AND_CHECK(dl_drv, "cuCtxPushCurrent"); cuCtxSynchronizePtr = DLSYM_AND_CHECK(dl_drv, "cuCtxSynchronize"); cuDeviceGetAttributePtr = DLSYM_AND_CHECK(dl_drv, "cuDeviceGetAttribute"); Dl_info info; dladdr(cuCtxSetCurrentPtr, &info); LOGDBG("CUDA driver library 
loaded from %s\n", info.dli_fname); return PAPI_OK; fn_fail: return PAPI_EMISC; } static int unload_cuda_sym(void) { if (dl_drv) { dlclose(dl_drv); dl_drv = NULL; } cuCtxSetCurrentPtr = NULL; cuCtxGetCurrentPtr = NULL; cuCtxDestroyPtr = NULL; cuCtxCreatePtr = NULL; cuCtxGetDevicePtr = NULL; cuDeviceGetPtr = NULL; cuDeviceGetCountPtr = NULL; cuDeviceGetNamePtr = NULL; cuDevicePrimaryCtxRetainPtr = NULL; cuDevicePrimaryCtxReleasePtr = NULL; cuInitPtr = NULL; cuGetErrorStringPtr = NULL; cuCtxPopCurrentPtr = NULL; cuCtxPushCurrentPtr = NULL; cuCtxSynchronizePtr = NULL; cuDeviceGetAttributePtr = NULL; return PAPI_OK; } /**@class search_and_load_shared_objects * @brief Search and load Cuda shared objects. * * @param *parentPath * The main path we will use to search for the shared objects. * @param *soMainName * The name of the shared object e.g. libcudart. This is used * to select the standardSubPaths to use. * @param *soNamesToSearchFor[] * Varying names of the shared object we want to search for. * @param soNamesToSearchCount * Total number of names in soNamesToSearchFor. */ void *search_and_load_shared_objects(const char *parentPath, const char *soMainName, const char *soNamesToSearchFor[], int soNamesToSearchCount) { const char *standardSubPaths[3]; // Case for when we want to search explicit subpaths for a shared object if (soMainName != NULL) { if (strcmp(soMainName, "libcudart") == 0) { standardSubPaths[0] = "%s/lib64/"; standardSubPaths[1] = NULL; } else if (strcmp(soMainName, "libcupti") == 0) { standardSubPaths[0] = "%s/extras/CUPTI/lib64/"; standardSubPaths[1] = "%s/lib64/"; standardSubPaths[2] = NULL; } else if (strcmp(soMainName, "libnvperf_host") == 0) { standardSubPaths[0] = "%s/extras/CUPTI/lib64/"; standardSubPaths[1] = "%s/lib64/"; standardSubPaths[2] = NULL; } } // Case for when a user provides an exact path e.g. 
PAPI_CUDA_RUNTIME // and we do not want to search subpaths else{ standardSubPaths[0] = "%s/"; standardSubPaths[1] = NULL; } char pathToSharedLibrary[PAPI_HUGE_STR_LEN], directoryPathToSearch[PAPI_HUGE_STR_LEN]; void *so = NULL; char *soNameFound; int i, strLen; for (i = 0; standardSubPaths[i] != NULL; i++) { // Create path to search for dl names int strLen = snprintf(directoryPathToSearch, PAPI_HUGE_STR_LEN, standardSubPaths[i], parentPath); if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) { ERRDBG("Failed to fully write path to search for dlnames.\n"); return NULL; } DIR *dir = opendir(directoryPathToSearch); if (dir == NULL) { ERRDBG("Directory path could not be opened.\n"); continue; } int j; for (j = 0; j < soNamesToSearchCount; j++) { struct dirent *dirEntry; while( ( dirEntry = readdir(dir) ) != NULL ) { int result; char *p = strstr(soNamesToSearchFor[j], "so"); // Check for an exact match of a shared object name (.so and .so.1 case) if (p) { result = strcmp(dirEntry->d_name, soNamesToSearchFor[j]); } // Check for any match of a shared object name (we could not find .so and .so.1) else { result = strncmp(dirEntry->d_name, soNamesToSearchFor[j], strlen(soNamesToSearchFor[j])); } if (result == 0) { soNameFound = dirEntry->d_name; goto found; } } // Reset the position of the directory stream rewinddir(dir); } } exit: return so; found: // Construct path to shared library strLen = snprintf(pathToSharedLibrary, PAPI_HUGE_STR_LEN, "%s%s", directoryPathToSearch, soNameFound); if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) { ERRDBG("Failed to fully write constructed path to shared library.\n"); return NULL; } so = dlopen(pathToSharedLibrary, RTLD_NOW | RTLD_GLOBAL); goto exit; } /**@class search_and_load_from_system_paths * @brief A simple wrapper to try and search and load * Cuda shared objects from system paths. * * @param *soNamesToSearchFor[] * Varying names of the shared object we want to search for. 
* @param soNamesToSearchCount * Total number of names in soNamesToSearchFor. */ void *search_and_load_from_system_paths(const char *soNamesToSearchFor[], int soNamesToSearchCount) { void *so = NULL; int i; for (i = 0; i < soNamesToSearchCount; i++) { so = dlopen(soNamesToSearchFor[i], RTLD_NOW | RTLD_GLOBAL); if (so) { return so; } } return so; } /**@class load_cudart_sym * @brief Search for a variation of the shared object libcudart. * Order of search is outlined below. * * 1. If a user sets PAPI_CUDA_RUNTIME, this will take precedent over * the options listed below to be searched. * 2. If we fail to collect a variation of the shared object libcudart from PAPI_CUDA_RUNTIME or it is not set, * we will search the path defined with PAPI_CUDA_ROOT; as this is supposed to always be set. * 3. If we fail to collect a variation of the shared object libcudart from steps 1 and 2, then we will search the linux * default directories listed by /etc/ld.so.conf. As a note, updating the LD_LIBRARY_PATH is * advised for this option. * 4. We use dlopen to search for a variation of the shared object libcudart. * If this fails, then we failed to find a variation of the shared object * libcudart. 
*/ int load_cudart_sym(void) { int soNamesToSearchCount = 3; const char *soNamesToSearchFor[] = {"libcudart.so", "libcudart.so.1", "libcudart"}; // If a user set PAPI_CUDA_RUNTIME with a path, then search it for the shared object (takes precedent over PAPI_CUDA_ROOT) char *papi_cuda_runtime = getenv("PAPI_CUDA_RUNTIME"); if (papi_cuda_runtime) { dl_rt = search_and_load_shared_objects(papi_cuda_runtime, NULL, soNamesToSearchFor, soNamesToSearchCount); } char *soMainName = "libcudart"; // If a user set PAPI_CUDA_ROOT with a path and we did not already find the shared object, then search it for the shared object char *papi_cuda_root = getenv("PAPI_CUDA_ROOT"); if (papi_cuda_root && !dl_rt) { dl_rt = search_and_load_shared_objects(papi_cuda_root, soMainName, soNamesToSearchFor, soNamesToSearchCount); } // Last ditch effort to find a variation of libcudart, see dlopen manpages for how search occurs if (!dl_rt) { dl_rt = search_and_load_from_system_paths(soNamesToSearchFor, soNamesToSearchCount); if (!dl_rt) { ERRDBG("Loading libcudart shared library failed. 
Try setting PAPI_CUDA_ROOT\n"); goto fn_fail; } } cudaGetDevicePtr = DLSYM_AND_CHECK(dl_rt, "cudaGetDevice"); cudaGetDeviceCountPtr = DLSYM_AND_CHECK(dl_rt, "cudaGetDeviceCount"); cudaGetDevicePropertiesPtr = DLSYM_AND_CHECK(dl_rt, "cudaGetDeviceProperties"); cudaGetErrorStringPtr = DLSYM_AND_CHECK(dl_rt, "cudaGetErrorString"); cudaDeviceGetAttributePtr = DLSYM_AND_CHECK(dl_rt, "cudaDeviceGetAttribute"); cudaSetDevicePtr = DLSYM_AND_CHECK(dl_rt, "cudaSetDevice"); cudaFreePtr = DLSYM_AND_CHECK(dl_rt, "cudaFree"); cudaDriverGetVersionPtr = DLSYM_AND_CHECK(dl_rt, "cudaDriverGetVersion"); cudaRuntimeGetVersionPtr = DLSYM_AND_CHECK(dl_rt, "cudaRuntimeGetVersion"); Dl_info info; dladdr(cudaGetDevicePtr, &info); LOGDBG("CUDA runtime library loaded from %s\n", info.dli_fname); return PAPI_OK; fn_fail: return PAPI_EMISC; } int unload_cudart_sym(void) { if (dl_rt) { dlclose(dl_rt); dl_rt = NULL; } cudaGetDevicePtr = NULL; cudaGetDeviceCountPtr = NULL; cudaGetDevicePropertiesPtr = NULL; cudaGetErrorStringPtr = NULL; cudaDeviceGetAttributePtr = NULL; cudaSetDevicePtr = NULL; cudaFreePtr = NULL; cudaDriverGetVersionPtr = NULL; cudaRuntimeGetVersionPtr = NULL; return PAPI_OK; } /**@class load_cupti_common_sym * @brief Search for a variation of the shared object libcupti. * Order of search is outlined below. * * 1. If a user sets PAPI_CUDA_CUPTI, this will take precedent over * the options listed below to be searched. * 2. If we fail to collect a variation of the shared object libcupti from PAPI_CUDA_CUPTI or it is not set, * we will search the path defined with PAPI_CUDA_ROOT; as this is supposed to always be set. * 3. If we fail to collect a variation of the shared object libcupti from steps 1 and 2, then we will search the linux * default directories listed by /etc/ld.so.conf. As a note, updating the LD_LIBRARY_PATH is * advised for this option. * 4. We use dlopen to search for a variation of the shared object libcupti. 
* If this fails, then we failed to find a variation of the shared object * libcupti. */ int load_cupti_common_sym(void) { int soNamesToSearchCount = 3; const char *soNamesToSearchFor[] = {"libcupti.so", "libcupti.so.1", "libcupti"}; // If a user set PAPI_CUDA_CUPTI with a path, then search it for the shared object (takes precedent over PAPI_CUDA_ROOT) char *papi_cuda_cupti = getenv("PAPI_CUDA_CUPTI"); if (papi_cuda_cupti) { dl_cupti = search_and_load_shared_objects(papi_cuda_cupti, NULL, soNamesToSearchFor, soNamesToSearchCount); } char *soMainName = "libcupti"; // If a user set PAPI_CUDA_ROOT with a path and we did not already find the shared object, then search it for the shared object char *papi_cuda_root = getenv("PAPI_CUDA_ROOT"); if (papi_cuda_root && !dl_cupti) { dl_cupti = search_and_load_shared_objects(papi_cuda_root, soMainName, soNamesToSearchFor, soNamesToSearchCount); } // Last ditch effort to find a variation of libcupti, see dlopen manpages for how search occurs if (!dl_cupti) { dl_cupti = search_and_load_from_system_paths(soNamesToSearchFor, soNamesToSearchCount); if (!dl_cupti) { ERRDBG("Loading libcupti.so failed. 
Try setting PAPI_CUDA_ROOT\n");
            goto fn_fail;
        }
    }

    cuptiGetVersionPtr = DLSYM_AND_CHECK(dl_cupti, "cuptiGetVersion");
    cuptiDeviceGetChipNamePtr = DLSYM_AND_CHECK(dl_cupti, "cuptiDeviceGetChipName");

    Dl_info info;
    dladdr(cuptiGetVersionPtr, &info);
    LOGDBG("CUPTI library loaded from %s\n", info.dli_fname);

    return PAPI_OK;
fn_fail:
    return PAPI_EMISC;
}

int unload_cupti_common_sym(void)
{
    if (dl_cupti) {
        dlclose(dl_cupti);
        dl_cupti = NULL;
    }

    cuptiGetVersionPtr = NULL;
    cuptiDeviceGetChipNamePtr = NULL;

    return PAPI_OK;
}

int util_load_cuda_sym(void)
{
    int papi_errno;
    papi_errno = load_cuda_sym();
    papi_errno += load_cudart_sym();
    papi_errno += load_cupti_common_sym();
    if (papi_errno != PAPI_OK) {
        return PAPI_EMISC;
    }
    else
        return PAPI_OK;
}

int cuptic_shutdown(void)
{
    unload_cuda_sym();
    unload_cudart_sym();
    unload_cupti_common_sym();
    return PAPI_OK;
}

int util_dylib_cu_runtime_version(void)
{
    int runtimeVersion;
    cudaArtCheckErrors(cudaRuntimeGetVersionPtr(&runtimeVersion), return PAPI_EMISC);
    return runtimeVersion;
}

int util_dylib_cupti_version(void)
{
    unsigned int cuptiVersion;
    cuptiCheckErrors(cuptiGetVersionPtr(&cuptiVersion), return PAPI_EMISC);
    return cuptiVersion;
}

/** @class cuptic_device_get_count
 *  @brief Get total number of gpus on the machine that are compute
 *         capable.
 *  @param *num_gpus
 *    Collect the total number of gpus.
*/ int cuptic_device_get_count(int *num_gpus) { cudaError_t cuda_err; /* find the total number of compute-capable devices */ cuda_err = cudaGetDeviceCountPtr(num_gpus); if (cuda_err != cudaSuccess) { cuptic_err_set_last(cudaGetErrorStringPtr(cuda_err)); return PAPI_EMISC; } return PAPI_OK; } int get_gpu_compute_capability(int dev_num, int *cc) { int cc_major, cc_minor; cudaError_t cuda_errno; cuda_errno = cudaDeviceGetAttributePtr(&cc_major, cudaDevAttrComputeCapabilityMajor, dev_num); if (cuda_errno != cudaSuccess) { cuptic_err_set_last(cudaGetErrorStringPtr(cuda_errno)); return PAPI_EMISC; } cuda_errno = cudaDeviceGetAttributePtr(&cc_minor, cudaDevAttrComputeCapabilityMinor, dev_num); if (cuda_errno != cudaSuccess) { cuptic_err_set_last(cudaGetErrorStringPtr(cuda_errno)); return PAPI_EMISC; } *cc = cc_major * 10 + cc_minor; return PAPI_OK; } int compute_capabilities_on_system(sys_compute_capabilities_e *system_ccs) { int total_gpus; int papi_errno = cuptic_device_get_count(&total_gpus); if (papi_errno != PAPI_OK) { return papi_errno; } int i, cc; int num_gpus_with_ccs_gt_cc70 = 0, num_gpus_with_ccs_eq_cc70 = 0, num_gpus_with_ccs_lt_cc70 = 0; for (i = 0; i < total_gpus; i++) { papi_errno = get_gpu_compute_capability(i, &cc); if (papi_errno != PAPI_OK) { return papi_errno; } if (cc > 70) { ++num_gpus_with_ccs_gt_cc70; } if (cc == 70) { ++num_gpus_with_ccs_eq_cc70; } if (cc < 70) { ++num_gpus_with_ccs_lt_cc70; } } sys_compute_capabilities_e sys_ccs = sys_gpu_ccs_unknown; // All devices have CCs > 7.0. 
if (num_gpus_with_ccs_gt_cc70 == total_gpus) { sys_ccs = sys_gpu_ccs_all_gt_70; } // All devices have CCs = 7.0 else if (num_gpus_with_ccs_eq_cc70 == total_gpus) { sys_ccs = sys_gpu_ccs_all_eq_70; } // All devices have CCs < 7.0 else if (num_gpus_with_ccs_lt_cc70 == total_gpus) { sys_ccs = sys_gpu_ccs_all_lt_70; } // Devices can result in a partially disabled Cuda component else { sys_ccs = sys_gpu_ccs_mixed; int all_ccs_gte_cc70 = num_gpus_with_ccs_eq_cc70 + num_gpus_with_ccs_gt_cc70; if (all_ccs_gte_cc70 == total_gpus) { sys_ccs = sys_gpu_ccs_all_gte_70; } int all_ccs_lte_cc70 = num_gpus_with_ccs_eq_cc70 + num_gpus_with_ccs_lt_cc70; if (all_ccs_lte_cc70 == total_gpus) { sys_ccs = sys_gpu_ccs_all_lte_70; } } *system_ccs = sys_ccs; return PAPI_OK; } /** @class cuptic_err_set_last * @brief For the last error, set an error message. * @param *error_str * Error message to be set. */ int cuptic_err_set_last(const char *error_str) { int strLen = snprintf(cuda_error_string, PAPI_HUGE_STR_LEN, "%s", error_str); if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) { SUBDBG("Last set error message not fully written.\n"); } return PAPI_OK; } /** @class cuptic_err_get_last * @brief Get the last error message set. * @param **error_str * Error message to be returned. 
*/ int cuptic_err_get_last(const char **error_str) { *error_str = cuda_error_string; return PAPI_OK; } int cuptic_init(void) { int papi_errno = util_load_cuda_sym(); if (papi_errno != PAPI_OK) { cuptic_err_set_last("Unable to load CUDA library functions."); return papi_errno; } sys_compute_capabilities_e system_ccs; papi_errno = compute_capabilities_on_system(&system_ccs); if (papi_errno != PAPI_OK) { return papi_errno; } // Get an array of the available devices on the system papi_errno = get_enabled_devices(); if (papi_errno != PAPI_OK) { return papi_errno; } // Handle a partially disabled Cuda component // TODO: Once the Events API is added back, this conditional will need to be updated for Issue #297 section 2 if (system_ccs == sys_gpu_ccs_mixed || system_ccs == sys_gpu_ccs_all_lte_70) { char *PAPI_CUDA_API = getenv("PAPI_CUDA_API"); char *cc_support = ">=7.0"; if (PAPI_CUDA_API != NULL) { int result = strcasecmp(PAPI_CUDA_API, "EVENTS"); if (result == 0) { cc_support = "<=7.0"; } } char errMsg[PAPI_HUGE_STR_LEN]; int strLen = snprintf(errMsg, PAPI_HUGE_STR_LEN, "System includes multiple compute capabilities: <7.0, =7.0, >7.0." " Only support for CC %s enabled.", cc_support); if (strLen < 0 || strLen >= PAPI_HUGE_STR_LEN) { SUBDBG("Failed to fully write the partially disabled error message.\n"); return PAPI_ENOMEM; } cuptic_err_set_last(errMsg); isCudaPartial = 1; return PAPI_PARTIAL; } return PAPI_OK; } void cuptic_partial(int *isCmpPartial, int **cudaEnabledDeviceIds, size_t *totalNumEnabledDevices) { *isCmpPartial = isCudaPartial; *cudaEnabledDeviceIds = enabledDeviceIds; *totalNumEnabledDevices = enabledDevicesCnt; return; } int cuptic_determine_runtime_api(void) { int cupti_api = -1; char *PAPI_CUDA_API = getenv("PAPI_CUDA_API"); // For the Perfworks API to be operational in the Cuda component, // users must link with a Cuda toolkit version that has a CUPTI version >= 13. // TODO: Once the Events API is added back into the Cuda component. 
Add a similar // check as the one shown below. unsigned int cuptiVersion = util_dylib_cupti_version(); if (!(cuptiVersion >= CUPTI_PROFILER_API_MIN_SUPPORTED_VERSION) && PAPI_CUDA_API == NULL) { return cupti_api; } // Determine the compute capabilities on the system sys_compute_capabilities_e system_ccs; int papi_errno = compute_capabilities_on_system(&system_ccs); if (papi_errno != PAPI_OK) { return papi_errno; } // Determine which CUPTI API will be in use switch (system_ccs) { // All devices have CCs < 7.0 case sys_gpu_ccs_all_lt_70: cupti_api = API_EVENTS; break; // All devices have CCs > 7.0 case sys_gpu_ccs_all_gt_70: cupti_api = API_PERFWORKS; break; // All devices have CCs <= 7.0 // TODO: Once the Events API is added back, this case will default to use the Events API case sys_gpu_ccs_all_lte_70: // All devices have CCs >= 7.0 case sys_gpu_ccs_all_gte_70: // ALL devices have CC's = 7.0 case sys_gpu_ccs_all_eq_70: // Devices are mixed with CC's > 7.0 and CC's < 7.0 case sys_gpu_ccs_mixed: // Default will be to use Perfworks API, user can change this by setting PAPI_CUDA_API. 
cupti_api = API_PERFWORKS; if (PAPI_CUDA_API != NULL) { int result = strcasecmp(PAPI_CUDA_API, "EVENTS"); if (result == 0) cupti_api = API_EVENTS; } break; default: SUBDBG("Implemented CUPTI APIs do not support the current GPU configuration.\n"); break; } return cupti_api; } int get_enabled_devices(void) { int total_gpus; int papi_errno = cuptic_device_get_count(&total_gpus); if (papi_errno != PAPI_OK) { return papi_errno; } int cupti_api = cuptic_determine_runtime_api(); if (cupti_api < 0) { return PAPI_ECMP; } int i, cc, collectCudaDevice; for (i = 0; i < total_gpus; i++) { collectCudaDevice = 0; papi_errno = get_gpu_compute_capability(i, &cc); if (papi_errno != PAPI_OK) { return papi_errno; } if (cupti_api == API_PERFWORKS && cc >= 70) { collectCudaDevice = 1; } else if (cupti_api == API_EVENTS && cc <= 70) { collectCudaDevice = 1; } if (collectCudaDevice) { enabledDeviceIds[enabledDevicesCnt] = i; enabledDevicesCnt++; } } return PAPI_OK; } /** @class cuptic_ctxarr_create * @brief Allocate memory for pinfo. * @param *pinfo * Instance of a struct that holds read count, running, cuptic_t * and cuptip_gpu_state_t. */ int cuptic_ctxarr_create(cuptic_info_t *pinfo) { COMPDBG("Entering.\n"); int total_gpus, papi_errno; /* retrieve total number of compute-capable devices */ papi_errno = cuptic_device_get_count(&total_gpus); if (papi_errno != PAPI_OK) { return PAPI_EMISC; } /* allocate memory */ *pinfo = (cuptic_info_t) calloc (total_gpus, sizeof(*pinfo)); if (*pinfo == NULL) { return PAPI_ENOMEM; } return PAPI_OK; } /** @class cuptic_ctxarr_update_current * @brief Updating the current Cuda context. * @param info * Struct that contains a Cuda context, that can be indexed into based * on device id. * @param evt_dev_id * Device id from an appended device qualifier (e.g. :device=#). 
*/ int cuptic_ctxarr_update_current(cuptic_info_t info, int evt_dev_id) { CUcontext pctx; CUresult cuda_err; CUdevice dev_id; // If a Cuda context already exists, get it cuda_err = cuCtxGetCurrentPtr(&pctx); if (cuda_err != CUDA_SUCCESS) { return PAPI_EMISC; } // Get the Device ID for the existing Cuda context if (pctx != NULL) { cuda_err = cuCtxGetDevicePtr(&dev_id); if (cuda_err != CUDA_SUCCESS) { return PAPI_EMISC; } } // A context is not stored for the :device=# qualifier if (info[evt_dev_id].ctx == NULL) { // Cuda context was not found or a user did not provide an appropriate Cuda context for the // device qualifier id that was supplied if (pctx == NULL || dev_id != evt_dev_id) { // If multiple devices are found on the machine, then we need to call cudaSetDevice SUBDBG("A Cuda context was not found. Therefore, one is created for device: %d\n", evt_dev_id); cudaArtCheckErrors(cudaSetDevicePtr(evt_dev_id), return PAPI_EMISC); cudaArtCheckErrors(cudaFreePtr(0), return PAPI_EMISC); cudaCheckErrors(cuCtxGetCurrentPtr(&info[evt_dev_id].ctx), return PAPI_EMISC); cudaCheckErrors(cuCtxPopCurrentPtr(&info[evt_dev_id].ctx), return PAPI_EMISC); } // Cuda context was found else { SUBDBG("A cuda context was found for device: %d\n", evt_dev_id); cudaCheckErrors(cuCtxGetCurrentPtr(&info[evt_dev_id].ctx), return PAPI_EMISC); } } // If the Cuda context has changed for a device keep the first one seen, but output a warning else if (pctx != NULL){ if (evt_dev_id == dev_id) { if (info[dev_id].ctx != pctx) { ERRDBG("Warning: cuda context for device %d has changed from %p to %p\n", dev_id, info[dev_id].ctx, pctx); } } } return PAPI_OK; } int cuptic_ctxarr_get_ctx(cuptic_info_t info, int gpu_idx, CUcontext *ctx) { *ctx = info[gpu_idx].ctx; return PAPI_OK; } int cuptic_ctxarr_destroy(cuptic_info_t *pinfo) { free(*pinfo); *pinfo = NULL; return PAPI_OK; } int _devmask_events_get(cuptiu_event_table_t *evt_table, gpu_occupancy_t *bitmask) { gpu_occupancy_t acq_mask = 0; long i; for (i = 
0; i < evt_table->count; i++) { acq_mask |= (1 << evt_table->cuda_devs[i]); } *bitmask = acq_mask; return PAPI_OK; } int cuptic_device_acquire(cuptiu_event_table_t *evt_table) { gpu_occupancy_t bitmask; int papi_errno = _devmask_events_get(evt_table, &bitmask); if (papi_errno != PAPI_OK) { return papi_errno; } if (bitmask & global_gpu_bitmask) { return PAPI_ECNFLCT; } _papi_hwi_lock(_cuda_lock); global_gpu_bitmask |= bitmask; _papi_hwi_unlock(_cuda_lock); return PAPI_OK; } int cuptic_device_release(cuptiu_event_table_t *evt_table) { gpu_occupancy_t bitmask; int papi_errno = _devmask_events_get(evt_table, &bitmask); if (papi_errno != PAPI_OK) { return papi_errno; } if ((bitmask & global_gpu_bitmask) != bitmask) { return PAPI_EMISC; } _papi_hwi_lock(_cuda_lock); global_gpu_bitmask ^= bitmask; _papi_hwi_unlock(_cuda_lock); return PAPI_OK; } /** @class cuptiu_dev_set * @brief For a Cuda native event, set the device ID. * * @param *bitmap * Device map. * @param i * Device ID. */ int cuptiu_dev_set(cuptiu_bitmap_t *bitmap, int i) { *bitmap |= (1ULL << i); return PAPI_OK; } /** @class cuptiu_dev_check * @brief For a Cuda native event, check for a valid device ID. * * @param *bitmap * Device map. * @param i * Device ID. 
*/ int cuptiu_dev_check(cuptiu_bitmap_t bitmap, int i) { return (bitmap & (1ULL << i)); } int get_chip_name(int dev_num, char* chipName) { int papi_errno; CUpti_Device_GetChipName_Params getChipName = { .structSize = CUpti_Device_GetChipName_Params_STRUCT_SIZE, .pPriv = NULL, .deviceIndex = 0 }; getChipName.deviceIndex = dev_num; papi_errno = cuptiDeviceGetChipNamePtr(&getChipName); if (papi_errno != CUPTI_SUCCESS) { ERRDBG("CUPTI error %d: Failed to get chip name for device %d\n", papi_errno, dev_num); return PAPI_EMISC; } strcpy(chipName, getChipName.pChipName); return PAPI_OK; } papi-papi-7-2-0-t/src/components/cuda/papi_cupti_common.h000066400000000000000000000126251502707512200234140ustar00rootroot00000000000000/** * @file papi_cupti_common.h * * @author Treece Burgess tburgess@icl.utk.edu (updated in 2024, redesigned to add device qualifier support.) * @author Anustuv Pal anustuv@icl.utk.edu */ #ifndef __PAPI_CUPTI_COMMON_H__ #define __PAPI_CUPTI_COMMON_H__ #include #include #include #include "cupti_utils.h" #include "lcuda_debug.h" // Set to match the maximum number of devices allowed for the event identifier // encoding format. See README_internal.md for more details. 
#define PAPI_CUDA_MAX_DEVICES 128 typedef struct cuptic_info *cuptic_info_t; extern void *dl_cupti; extern unsigned int _cuda_lock; /* cuda driver function pointers */ extern CUresult ( *cuCtxGetCurrentPtr ) (CUcontext *); extern CUresult ( *cuCtxSetCurrentPtr ) (CUcontext); extern CUresult ( *cuCtxDestroyPtr ) (CUcontext); extern CUresult ( *cuCtxCreatePtr ) (CUcontext *pctx, unsigned int flags, CUdevice dev); extern CUresult ( *cuCtxGetDevicePtr ) (CUdevice *); extern CUresult ( *cuDeviceGetPtr ) (CUdevice *, int); extern CUresult ( *cuDeviceGetCountPtr ) (int *); extern CUresult ( *cuDeviceGetNamePtr ) (char *, int, CUdevice); extern CUresult ( *cuDevicePrimaryCtxRetainPtr ) (CUcontext *pctx, CUdevice); extern CUresult ( *cuDevicePrimaryCtxReleasePtr ) (CUdevice); extern CUresult ( *cuInitPtr ) (unsigned int); extern CUresult ( *cuGetErrorStringPtr ) (CUresult error, const char** pStr); extern CUresult ( *cuCtxPopCurrentPtr ) (CUcontext * pctx); extern CUresult ( *cuCtxPushCurrentPtr ) (CUcontext pctx); extern CUresult ( *cuCtxSynchronizePtr ) (); extern CUresult ( *cuDeviceGetAttributePtr ) (int *, CUdevice_attribute, CUdevice); /* cuda runtime function pointers */ extern cudaError_t ( *cudaGetDeviceCountPtr ) (int *); extern cudaError_t ( *cudaGetDevicePtr ) (int *); extern cudaError_t ( *cudaSetDevicePtr ) (int); extern cudaError_t ( *cudaGetDevicePropertiesPtr ) (struct cudaDeviceProp* prop, int device); extern cudaError_t ( *cudaDeviceGetAttributePtr ) (int *value, enum cudaDeviceAttr attr, int device); extern cudaError_t ( *cudaFreePtr ) (void *); extern cudaError_t ( *cudaDriverGetVersionPtr ) (int *); extern cudaError_t ( *cudaRuntimeGetVersionPtr ) (int *); /* cupti function pointer */ extern CUptiResult ( *cuptiGetVersionPtr ) (uint32_t* ); /* utility functions to check runtime api, disabled reason, etc. 
*/ int cuptic_init(void); int cuptic_determine_runtime_api(void); int cuptic_device_get_count(int *num_gpus); void *search_and_load_shared_objects(const char *parentPath, const char *soMainName, const char *soNamesToSearchFor[], int soNamesToSearchCount); void *search_and_load_from_system_paths(const char *soNamesToSearchFor[], int soNamesToSearchCount); int cuptic_err_get_last(const char **error_str); int cuptic_err_set_last(const char *error_str); int cuptic_shutdown(void); /* context management interfaces */ int cuptic_ctxarr_create(cuptic_info_t *pinfo); int cuptic_ctxarr_update_current(cuptic_info_t info, int evt_dev_id); int cuptic_ctxarr_get_ctx(cuptic_info_t info, int dev_id, CUcontext *ctx); int cuptic_ctxarr_destroy(cuptic_info_t *pinfo); /* functions to track the occupancy of gpu counters in event sets */ int cuptic_device_acquire(cuptiu_event_table_t *evt_table); int cuptic_device_release(cuptiu_event_table_t *evt_table); /* device qualifier interfaces */ int cuptiu_dev_set(cuptiu_bitmap_t *bitmap, int i); int cuptiu_dev_check(cuptiu_bitmap_t bitmap, int i); /* functions to handle a partially disabled Cuda component */ void cuptic_partial(int *isCmpPartial, int **cudaEnabledDeviceIds, size_t *totalNumEnabledDevices); /* function to get a devices compute capability */ int get_gpu_compute_capability(int dev_num, int *cc); /* misc. 
*/ int get_chip_name(int dev_num, char* chipName); #define DLSYM_AND_CHECK( dllib, name ) dlsym( dllib, name ); \ if (dlerror() != NULL) { \ ERRDBG("A CUDA required function '%s' was not found in lib '%s'.\n", name, #dllib); \ return PAPI_EMISC; \ } /* error handling defines for Cuda related function calls */ #define cudaCheckErrors( call, handleerror ) \ do { \ CUresult _status = (call); \ LOGCUDACALL("\t" #call "\n"); \ if (_status != CUDA_SUCCESS) { \ ERRDBG("CUDA Error %d: Error in call to " #call "\n", _status); \ EXIT_OR_NOT; \ handleerror; \ } \ } while (0); #define cudaArtCheckErrors( call, handleerror ) \ do { \ cudaError_t _status = (call); \ LOGCUDACALL("\t" #call "\n"); \ if (_status != cudaSuccess) { \ ERRDBG("CUDART Error %d: Error in call to " #call "\n", _status); \ EXIT_OR_NOT; \ handleerror; \ } \ } while (0); #define cuptiCheckErrors( call, handleerror ) \ do { \ CUptiResult _status = (call); \ LOGCUPTICALL("\t" #call "\n"); \ if (_status != CUPTI_SUCCESS) { \ ERRDBG("CUPTI Error %d: Error in call to " #call "\n", _status); \ EXIT_OR_NOT; \ handleerror; \ } \ } while (0); #define nvpwCheckErrors( call, handleerror ) \ do { \ NVPA_Status _status = (call); \ LOGPERFWORKSCALL("\t" #call "\n"); \ if (_status != NVPA_STATUS_SUCCESS) { \ ERRDBG("NVPA Error %d: Error in call to " #call "\n", _status); \ EXIT_OR_NOT; \ handleerror; \ } \ } while (0); #endif /* __CUPTI_COMMON_H__ */ papi-papi-7-2-0-t/src/components/cuda/tests/000077500000000000000000000000001502707512200206725ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/cuda/tests/HelloWorld.cu000066400000000000000000000326641502707512200233110ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file HelloWorld.cu * @author Heike Jagode * jagode@eecs.utk.edu * Mods: Anustuv Pal * anustuv@icl.utk.edu * Mods: * * test case for Example component * * * @brief * This file is a very simple HelloWorld C example which 
serves (together
 * with its Makefile) as a guideline on how to add tests to components.
 * The papi configure and papi Makefile will take care of the compilation
 * of the component tests (if all tests are added to a directory named
 * 'tests' in the specific component dir).
 * See components/README for more details.
 *
 * The string "Hello World!" is mangled and then restored.
 *
 * CUDA Context notes for CUPTI_11: Although a cudaSetDevice() will create a
 * primary context for the device that allows kernel execution, PAPI cannot
 * use a primary context to control the Nvidia Performance Profiler.
 * Applications must create a context using cuCtxCreate() that will execute
 * the kernel; this must be done prior to the PAPI_add_events() invocation in
 * the code below. If multiple GPUs are in use, each requires its own context,
 * and that context should be active when PAPI events are added for each
 * device, which means using a separate PAPI_add_events() call for each device.
 * For an example see simpleMultiGPU.cu.
 *
 * There are three points below where cuCtxCreate() is called; this code works
 * if any one of them is used alone.
 */
#include
#include
#include
#ifdef PAPI
#include "papi.h"
#include "papi_test.h"
#endif

#define STEP_BY_STEP_DEBUG 0 /* helps debug CUcontext issues. */
#define PRINT(quiet, format, args...) {if (!quiet) {fprintf(stderr, format, ## args);}}

// Device kernel
__global__ void helloWorld(char* str)
{
    // determine where in the thread grid we are
    int idx = blockIdx.x * blockDim.x + threadIdx.x;

    // unmangle output
    str[idx] += idx;
}

/** @class add_events_from_command_line
 * @brief Try to add each event provided on the command line by the user.
 *
 * @param EventSet
 *   A PAPI eventset.
 * @param totalEventCount
 *   Number of events from the command line.
 * @param **eventNamesFromCommandLine
 *   Events provided on the command line.
 * @param *numEventsSuccessfullyAdded
 *   Total number of successfully added events.
* @param **eventsSuccessfullyAdded * Events that we are able to add to the EventSet. * @param *numMultipassEvents * Counter to see if a multiple pass event was provided on the command line. */ static void add_events_from_command_line(int EventSet, int totalEventCount, char **eventNamesFromCommandLine, int *numEventsSuccessfullyAdded, char **eventsSuccessfullyAdded, int *numMultipassEvents) { int i; for (i = 0; i < totalEventCount; i++) { int papi_errno = PAPI_add_named_event(EventSet, eventNamesFromCommandLine[i]); if (papi_errno != PAPI_OK) { if (papi_errno != PAPI_EMULPASS) { fprintf(stderr, "Unable to add event %s to the EventSet with error code %d.\n", eventNamesFromCommandLine[i], papi_errno); test_skip(__FILE__, __LINE__, "", 0); } // Handle multiple pass events (*numMultipassEvents)++; continue; } // Handle successfully added events int strLen = snprintf(eventsSuccessfullyAdded[(*numEventsSuccessfullyAdded)], PAPI_MAX_STR_LEN, "%s", eventNamesFromCommandLine[i]); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { fprintf(stderr, "Failed to fully write successfully added event.\n"); test_skip(__FILE__, __LINE__, "", 0); } (*numEventsSuccessfullyAdded)++; } return; } // Host function int main(int argc, char** argv) { int quiet = 0; CUcontext getCtx=NULL, sessionCtx=NULL; cudaError_t cudaError; CUresult cuError; (void) cuError; cuError = cuInit(0); if (cuError != CUDA_SUCCESS) { fprintf(stderr, "Failed to initialize the CUDA driver API.\n"); exit(1); } #ifdef PAPI char *test_quiet = getenv("PAPI_CUDA_TEST_QUIET"); if (test_quiet) quiet = (int) strtol(test_quiet, (char**) NULL, 10); /* PAPI Initialization */ int papi_errno = PAPI_library_init( PAPI_VER_CURRENT ); if( papi_errno != PAPI_VER_CURRENT ) { test_fail(__FILE__,__LINE__, "PAPI_library_init failed", 0 ); } printf( "PAPI_VERSION : %4d %6d %7d\n", PAPI_VERSION_MAJOR( PAPI_VERSION ), PAPI_VERSION_MINOR( PAPI_VERSION ), PAPI_VERSION_REVISION( PAPI_VERSION ) ); int i; int EventSet = PAPI_NULL; int eventCount = 
argc - 1; /* if no events passed at command line, just report test skipped. */ if (eventCount == 0) { fprintf(stderr, "No eventnames specified at command line."); test_skip(__FILE__, __LINE__, "", 0); } long long *values = (long long *) calloc(eventCount, sizeof (long long)); if (values == NULL) { test_fail(__FILE__, __LINE__, "Failed to allocate memory for values.\n", 0); } int *events = (int *) calloc(eventCount, sizeof (int)); if (events == NULL) { test_fail(__FILE__, __LINE__, "Failed to allocate memory for events.\n", 0); } if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i before PAPI_create_eventset() getCtx=%p.\n", __FILE__, __func__, __LINE__, getCtx); } papi_errno = PAPI_create_eventset( &EventSet ); if( papi_errno != PAPI_OK ) { test_fail(__FILE__,__LINE__,"Cannot create eventset",papi_errno); } if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i after PAPI_create_eventset() getCtx=%p.\n", __FILE__, __func__, __LINE__, getCtx); } // If multiple GPUs/contexts were being used, you'd need to // create contexts for each device. See, for example, // simpleMultiGPU.cu. // Context Create. We will use this one to run our kernel. cuError = cuCtxCreate(&sessionCtx, 0, 0); // Create a context, NULL flags, Device 0. 
if (cuError != CUDA_SUCCESS) { fprintf(stderr, "Failed to create cuContext: %d\n", cuError); exit(-1); } if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i after cuCtxCreate(&sessionCtx), about to PAPI_start(), sessionCtx=%p, getCtx=%p.\n", __FILE__, __func__, __LINE__, sessionCtx, getCtx); } // Handle the events from the command line int numEventsSuccessfullyAdded = 0, numMultipassEvents = 0; char **eventsSuccessfullyAdded, **metricNames = argv + 1; eventsSuccessfullyAdded = (char **) malloc(eventCount * sizeof(char *)); if (eventsSuccessfullyAdded == NULL) { fprintf(stderr, "Failed to allocate memory for successfully added events.\n"); test_skip(__FILE__, __LINE__, "", 0); } for (i = 0; i < eventCount; i++) { eventsSuccessfullyAdded[i] = (char *) malloc(PAPI_MAX_STR_LEN * sizeof(char)); if (eventsSuccessfullyAdded[i] == NULL) { fprintf(stderr, "Failed to allocate memory for command line argument.\n"); test_skip(__FILE__, __LINE__, "", 0); } } add_events_from_command_line(EventSet, eventCount, metricNames, &numEventsSuccessfullyAdded, eventsSuccessfullyAdded, &numMultipassEvents); // Only multiple pass events were provided on the command line if (numEventsSuccessfullyAdded == 0) { fprintf(stderr, "Events provided on the command line could not be added to an EventSet as they require multiple passes.\n"); test_skip(__FILE__, __LINE__, "", 0); } if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i before PAPI_start(), getCtx=%p.\n", __FILE__, __func__, __LINE__, getCtx); } papi_errno = PAPI_start( EventSet ); if( papi_errno != PAPI_OK ) { test_fail(__FILE__, __LINE__, "PAPI_start failed.", papi_errno); } if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i after PAPI_start(), getCtx=%p.\n", __FILE__, __func__, __LINE__, getCtx); } #endif int j; // desired output char str[] = "Hello World!"; // mangle contents of output // the null character is left intact for simplicity for(j = 0; j < 12; 
j++) { str[j] -= j; } PRINT(quiet, "mangled str=%s\n", str); // allocate memory on the device char *d_str; size_t size = sizeof(str); cudaMalloc((void**)&d_str, size); if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i after cudaMalloc() getCtx=%p.\n", __FILE__, __func__, __LINE__, getCtx); } // copy the string to the device cudaMemcpy(d_str, str, size, cudaMemcpyHostToDevice); if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i after cudaMemcpy(ToDevice) getCtx=%p.\n", __FILE__, __func__, __LINE__, getCtx); } // set the grid and block sizes dim3 dimGrid(2); // one block per word dim3 dimBlock(6); // one thread per character // invoke the kernel helloWorld<<< dimGrid, dimBlock >>>(d_str); cudaError = cudaGetLastError(); if (STEP_BY_STEP_DEBUG) { fprintf(stderr, "%s:%s:%i Kernel Return Code: %s.\n", __FILE__, __func__, __LINE__, cudaGetErrorString(cudaError)); } if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i After Kernel Execution: getCtx=%p.\n", __FILE__, __func__, __LINE__, getCtx); } // retrieve the results from the device cudaMemcpy(str, d_str, size, cudaMemcpyDeviceToHost); if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i after cudaMemcpy(ToHost) getCtx=%p.\n", __FILE__, __func__, __LINE__, getCtx); } // free up the allocated memory on the device cudaFree(d_str); if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i after cudaFree() getCtx=%p.\n", __FILE__, __func__, __LINE__, getCtx); } #ifdef PAPI papi_errno = PAPI_read( EventSet, values ); if( papi_errno != PAPI_OK ) { test_fail(__FILE__, __LINE__, "PAPI_read failed", papi_errno); } if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i after PAPI_read getCtx=%p.\n", __FILE__, __func__, __LINE__, getCtx); } for( i = 0; i < numEventsSuccessfullyAdded; i++ ) { PRINT( quiet, "read: %12lld \t=0X%016llX \t\t --> %s \n", values[i], values[i], 
eventsSuccessfullyAdded[i] ); } papi_errno = cuCtxPopCurrent(&getCtx); if( papi_errno != CUDA_SUCCESS) { fprintf( stderr, "cuCtxPopCurrent failed, papi_errno=%d (%s)\n", papi_errno, PAPI_strerror(papi_errno) ); exit(1); } if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i after cuCtxPopCurrent() getCtx=%p.\n", __FILE__, __func__, __LINE__, getCtx); } papi_errno = PAPI_stop( EventSet, values ); if( papi_errno != PAPI_OK ) { test_fail(__FILE__, __LINE__, "PAPI_stop failed", papi_errno); } if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i after PAPI_stop getCtx=%p.\n", __FILE__, __func__, __LINE__, getCtx); } papi_errno = PAPI_cleanup_eventset(EventSet); if( papi_errno != PAPI_OK ) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset failed", papi_errno); } if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i after PAPI_cleanup_eventset getCtx=%p.\n", __FILE__, __func__, __LINE__, getCtx); } papi_errno = PAPI_destroy_eventset(&EventSet); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset failed", papi_errno); } if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i after PAPI_destroy_eventset getCtx=%p.\n", __FILE__, __func__, __LINE__, getCtx); } for( i = 0; i < numEventsSuccessfullyAdded; i++ ) { PRINT( quiet, "stop: %12lld \t=0X%016llX \t\t --> %s \n", values[i], values[i], eventsSuccessfullyAdded[i] ); } #endif if (STEP_BY_STEP_DEBUG) { fprintf(stderr, "%s:%s:%i before cuCtxDestroy sessionCtx=%p.\n", __FILE__, __func__, __LINE__, sessionCtx); } // Test destroying the session Context. 
if (sessionCtx != NULL) { cuCtxDestroy(sessionCtx); } if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i after cuCtxDestroy(%p) getCtx=%p.\n", __FILE__, __func__, __LINE__, sessionCtx, getCtx); } // Free allocated memory free(values); free(events); for (i = 0; i < eventCount; i++) { free(eventsSuccessfullyAdded[i]); } free(eventsSuccessfullyAdded); #ifdef PAPI PAPI_shutdown(); if (STEP_BY_STEP_DEBUG) { cuCtxGetCurrent(&getCtx); fprintf(stderr, "%s:%s:%i after PAPI_shutdown getCtx=%p.\n", __FILE__, __func__, __LINE__, getCtx); } // Output a note that a multiple pass event was provided on the command line if (numMultipassEvents > 0) { PRINT(quiet, "\033[0;33mNOTE: From the events provided on the command line, an event or events requiring multiple passes was detected and not added to the EventSet. Check your events with utils/papi_native_avail.\n\033[0m"); } test_pass(__FILE__); #endif return 0; } papi-papi-7-2-0-t/src/components/cuda/tests/HelloWorld_noCuCtx.cu000066400000000000000000000223631502707512200247470ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file HelloWorld_noCuCtx.cu * @author Heike Jagode * jagode@eecs.utk.edu * Mods: Anustuv Pal * anustuv@icl.utk.edu * Mods: * * test case for cuda component * * * @brief * This file is a very simple HelloWorld C example which serves (together * with its Makefile) as a guideline on how to add tests to components. * The papi configure and papi Makefile will take care of the compilation * of the component tests (if all tests are added to a directory named * 'tests' in the specific component dir). * See components/README for more details. * * The string "Hello World!" is mangled and then restored. * * CUDA Context notes for CUPTI_11: Although a cudaSetDevice() will create a * primary context for the device that allows kernel execution; PAPI cannot * use a primary context to control the Nvidia Performance Profiler. 
* Applications must create a context using cuCtxCreate() that will execute
 * the kernel; this must be done prior to the PAPI_add_events() invocation in
 * the code below. If multiple GPUs are in use, each requires its own context,
 * and that context should be active when PAPI events are added for each
 * device, which means using a separate PAPI_add_events() call for each device.
 * For an example see simpleMultiGPU.cu.
 *
 * There are three points below where cuCtxCreate() is called; this code works
 * if any one of them is used alone.
 */
#include
#include
#include
#ifdef PAPI
#include "papi.h"
#include "papi_test.h"
#endif

#define STEP_BY_STEP_DEBUG 0 /* helps debug CUcontext issues. */
#define PRINT(quiet, format, args...) {if (!quiet) {fprintf(stderr, format, ## args);}}

// Device kernel
__global__ void helloWorld(char* str)
{
    // determine where in the thread grid we are
    int idx = blockIdx.x * blockDim.x + threadIdx.x;

    // unmangle output
    str[idx] += idx;
}

/** @class add_events_from_command_line
 * @brief Try to add each event provided on the command line by the user.
 *
 * @param EventSet
 *   A PAPI eventset.
 * @param totalEventCount
 *   Number of events from the command line.
 * @param eventNamesFromCommandLine
 *   Events provided on the command line.
 * @param *numEventsSuccessfullyAdded
 *   Total number of successfully added events.
 * @param **eventsSuccessfullyAdded
 *   Events that we are able to add to the EventSet.
 * @param *numMultipassEvents
 *   Counter to see if a multiple pass event was provided on the command line.
*/ static void add_events_from_command_line(int EventSet, int totalEventCount, char **eventNamesFromCommandLine, int *numEventsSuccessfullyAdded, char **eventsSuccessfullyAdded, int *numMultipassEvents) { int i; for (i = 0; i < totalEventCount; i++) { int papi_errno = PAPI_add_named_event(EventSet, eventNamesFromCommandLine[i]); if (papi_errno != PAPI_OK) { if (papi_errno != PAPI_EMULPASS) { fprintf(stderr, "Unable to add event %s to the EventSet with error code %d.\n", eventNamesFromCommandLine[i], papi_errno); test_skip(__FILE__, __LINE__, "", 0); } // Handle multiple pass events (*numMultipassEvents)++; continue; } // Handle successfully added events int strLen = snprintf(eventsSuccessfullyAdded[(*numEventsSuccessfullyAdded)], PAPI_MAX_STR_LEN, "%s", eventNamesFromCommandLine[i]); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { fprintf(stderr, "Failed to fully write successfully added event.\n"); test_skip(__FILE__, __LINE__, "", 0); } (*numEventsSuccessfullyAdded)++; } return; } // Host function int main(int argc, char** argv) { int quiet = 0; cudaError_t cudaError; CUresult cuError; (void) cuError; cuInit(0); #ifdef PAPI char *test_quiet = getenv("PAPI_CUDA_TEST_QUIET"); if (test_quiet) quiet = (int) strtol(test_quiet, (char**) NULL, 10); /* PAPI Initialization */ int papi_errno = PAPI_library_init( PAPI_VER_CURRENT ); if( papi_errno != PAPI_VER_CURRENT ) { test_fail(__FILE__,__LINE__, "PAPI_library_init failed", 0); } printf( "PAPI_VERSION : %4d %6d %7d\n", PAPI_VERSION_MAJOR( PAPI_VERSION ), PAPI_VERSION_MINOR( PAPI_VERSION ), PAPI_VERSION_REVISION( PAPI_VERSION ) ); int i; int EventSet = PAPI_NULL; int eventCount = argc - 1; /* if no events passed at command line, just report test skipped. 
*/ if (eventCount == 0) { fprintf(stderr, "No events specified at command line."); test_skip(__FILE__,__LINE__, "", 0); } long long *values = (long long *) calloc(eventCount, sizeof (long long)); if (values == NULL) { test_fail(__FILE__, __LINE__, "Failed to allocate memory for values.\n", 0); } int *events = (int *) calloc(eventCount, sizeof (int)); if (events == NULL) { test_fail(__FILE__, __LINE__, "Failed to allocate memory for events.\n", 0); } papi_errno = PAPI_create_eventset( &EventSet ); if( papi_errno != PAPI_OK ) { test_fail(__FILE__,__LINE__,"Cannot create eventset",papi_errno); } // Handle the events from the command line int numEventsSuccessfullyAdded = 0, numMultipassEvents = 0; char **eventsSuccessfullyAdded, **metricNames = argv + 1; eventsSuccessfullyAdded = (char **) malloc(eventCount * sizeof(char *)); if (eventsSuccessfullyAdded == NULL) { fprintf(stderr, "Failed to allocate memory for successfully added events.\n"); test_skip(__FILE__, __LINE__, "", 0); } for (i = 0; i < eventCount; i++) { eventsSuccessfullyAdded[i] = (char *) malloc(PAPI_MAX_STR_LEN * sizeof(char)); if (eventsSuccessfullyAdded[i] == NULL) { fprintf(stderr, "Failed to allocate memory for command line argument.\n"); test_skip(__FILE__, __LINE__, "", 0); } } add_events_from_command_line(EventSet, eventCount, metricNames, &numEventsSuccessfullyAdded, eventsSuccessfullyAdded, &numMultipassEvents); // Only multiple pass events were provided on the command line if (numEventsSuccessfullyAdded == 0) { fprintf(stderr, "Events provided on the command line could not be added to an EventSet as they require multiple passes.\n"); test_skip(__FILE__, __LINE__, "", 0); } papi_errno = PAPI_start( EventSet ); if( papi_errno != PAPI_OK ) { test_fail(__FILE__, __LINE__, "PAPI_start failed.", papi_errno); } #endif int j; // desired output char str[] = "Hello World!"; // mangle contents of output // the null character is left intact for simplicity for(j = 0; j < 12; j++) { str[j] -= j; } PRINT( 
quiet, "mangled str=%s\n", str ); // allocate memory on the device char *d_str; size_t size = sizeof(str); cudaMalloc((void**)&d_str, size); // copy the string to the device cudaMemcpy(d_str, str, size, cudaMemcpyHostToDevice); // set the grid and block sizes dim3 dimGrid(2); // one block per word dim3 dimBlock(6); // one thread per character // invoke the kernel helloWorld<<< dimGrid, dimBlock >>>(d_str); cudaError = cudaGetLastError(); if (STEP_BY_STEP_DEBUG) { fprintf(stderr, "%s:%s:%i Kernel Return Code: %s.\n", __FILE__, __func__, __LINE__, cudaGetErrorString(cudaError)); } // retrieve the results from the device cudaMemcpy(str, d_str, size, cudaMemcpyDeviceToHost); // free up the allocated memory on the device cudaFree(d_str); #ifdef PAPI papi_errno = PAPI_read( EventSet, values ); if( papi_errno != PAPI_OK ) { test_fail(__FILE__, __LINE__, "PAPI_read failed", papi_errno); } for( i = 0; i < numEventsSuccessfullyAdded; i++ ) { PRINT( quiet, "read: %12lld \t=0X%016llX \t\t --> %s \n", values[i], values[i], eventsSuccessfullyAdded[i] ); } papi_errno = PAPI_stop( EventSet, values ); if( papi_errno != PAPI_OK ) { test_fail(__FILE__, __LINE__, "PAPI_stop failed", papi_errno); } papi_errno = PAPI_cleanup_eventset(EventSet); if( papi_errno != PAPI_OK ) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset failed", papi_errno); } papi_errno = PAPI_destroy_eventset(&EventSet); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset failed", papi_errno); } for( i = 0; i < numEventsSuccessfullyAdded; i++ ) { PRINT( quiet, "stop: %12lld \t=0X%016llX \t\t --> %s \n", values[i], values[i], eventsSuccessfullyAdded[i] ); } // Free allocated memory free(values); free(events); for (i = 0; i < eventCount; i++) { free(eventsSuccessfullyAdded[i]); } free(eventsSuccessfullyAdded); PAPI_shutdown(); // Output a note that a multiple pass event was provided on the command line if (numMultipassEvents > 0) { PRINT(quiet, "\033[0;33mNOTE: From the events 
provided on the command line, an event or events requiring multiple passes was detected and not added to the EventSet. Check your events with utils/papi_native_avail.\n\033[0m"); } test_pass(__FILE__); #endif return 0; } papi-papi-7-2-0-t/src/components/cuda/tests/Makefile000066400000000000000000000060461502707512200223400ustar00rootroot00000000000000NAME=cuda include ../../Makefile_comp_tests.target PAPI_CUDA_ROOT ?= $(shell dirname $(shell dirname $(shell which nvcc))) TESTS = HelloWorld simpleMultiGPU \ pthreads cudaOpenMP concurrent_profiling \ test_multi_read_and_reset test_multipass_event_fail \ test_2thr_1gpu_not_allowed TESTS_NOCTX = concurrent_profiling_noCuCtx pthreads_noCuCtx \ cudaOpenMP_noCuCtx HelloWorld_noCuCtx \ simpleMultiGPU_noCuCtx NVCC = $(PAPI_CUDA_ROOT)/bin/nvcc PAPI_FLAG = -DPAPI # Comment this line for tests to run without PAPI profiling NVCFLAGS = -g -ccbin='$(CC)' $(PAPI_FLAG) ifeq ($(BUILD_SHARED_LIB),yes) NVCFLAGS += -Xcompiler -fpic endif CFLAGS += -g $(PAPI_FLAG) INCLUDE += -I$(PAPI_CUDA_ROOT)/include CUDALIBS = -L$(PAPI_CUDA_ROOT)/lib64 -lcudart -lcuda cuda_tests: $(TESTS) $(TESTS_NOCTX) %.o:%.cu $(NVCC) $(INCLUDE) $(NVCFLAGS) -c -o $@ $< %.mac:%.cu $(NVCC) $(INCLUDE) $(NVCFLAGS) -E -c -o $@ $< test_multi_read_and_reset: test_multi_read_and_reset.o $(UTILOBJS) $(CXX) $(CFLAGS) -o test_multi_read_and_reset test_multi_read_and_reset.o $(UTILOBJS) $(PAPILIB) $(CUDALIBS) $(LDFLAGS) concurrent_profiling: concurrent_profiling.o $(UTILOBJS) $(CXX) $(CFLAGS) -pthread -o concurrent_profiling concurrent_profiling.o $(UTILOBJS) $(PAPILIB) $(CUDALIBS) $(LDFLAGS) concurrent_profiling_noCuCtx: concurrent_profiling_noCuCtx.o $(UTILOBJS) $(CXX) $(CFLAGS) -pthread -o concurrent_profiling_noCuCtx concurrent_profiling_noCuCtx.o $(UTILOBJS) $(PAPILIB) $(CUDALIBS) $(LDFLAGS) pthreads: pthreads.o $(CXX) $(CFLAGS) -pthread -o pthreads pthreads.o $(UTILOBJS) $(PAPILIB) $(CUDALIBS) $(LDFLAGS) pthreads_noCuCtx: pthreads_noCuCtx.o $(CXX) $(CFLAGS) -pthread -o 
pthreads_noCuCtx pthreads_noCuCtx.o $(UTILOBJS) $(PAPILIB) $(CUDALIBS) $(LDFLAGS)

cudaOpenMP: cudaOpenMP.o
	$(CXX) $(CFLAGS) -o cudaOpenMP cudaOpenMP.o -lgomp -fopenmp $(UTILOBJS) $(PAPILIB) $(CUDALIBS) $(LDFLAGS)

cudaOpenMP_noCuCtx: cudaOpenMP_noCuCtx.o
	$(CXX) $(CFLAGS) -o cudaOpenMP_noCuCtx cudaOpenMP_noCuCtx.o -lgomp -fopenmp $(UTILOBJS) $(PAPILIB) $(CUDALIBS) $(LDFLAGS)

test_multipass_event_fail: test_multipass_event_fail.o $(UTILOBJS)
	$(CXX) $(CFLAGS) -o test_multipass_event_fail test_multipass_event_fail.o $(INCLUDE) $(UTILOBJS) $(PAPILIB) $(LDFLAGS) $(CUDALIBS)

test_2thr_1gpu_not_allowed: test_2thr_1gpu_not_allowed.o
	$(CXX) $(CFLAGS) -pthread -o test_2thr_1gpu_not_allowed test_2thr_1gpu_not_allowed.o $(UTILOBJS) $(PAPILIB) $(CUDALIBS) $(LDFLAGS)

HelloWorld: HelloWorld.o $(UTILOBJS)
	$(CXX) $(CFLAGS) -o HelloWorld HelloWorld.o $(UTILOBJS) $(PAPILIB) $(CUDALIBS) $(LDFLAGS)

HelloWorld_noCuCtx: HelloWorld_noCuCtx.o $(UTILOBJS)
	$(CXX) $(CFLAGS) -o HelloWorld_noCuCtx HelloWorld_noCuCtx.o $(UTILOBJS) $(PAPILIB) $(CUDALIBS) $(LDFLAGS)

simpleMultiGPU: simpleMultiGPU.o $(UTILOBJS)
	$(CXX) $(CFLAGS) -o simpleMultiGPU simpleMultiGPU.o $(UTILOBJS) $(PAPILIB) $(CUDALIBS) $(LDFLAGS)

simpleMultiGPU_noCuCtx: simpleMultiGPU_noCuCtx.o $(UTILOBJS)
	$(CXX) $(CFLAGS) -o simpleMultiGPU_noCuCtx simpleMultiGPU_noCuCtx.o $(UTILOBJS) $(PAPILIB) $(CUDALIBS) $(LDFLAGS)

clean:
	rm -f *.o $(TESTS) $(TESTS_NOCTX)
papi-papi-7-2-0-t/src/components/cuda/tests/concurrent_profiling.cu000066400000000000000000000523761502707512200254710ustar00rootroot00000000000000// Copyright 2021 NVIDIA Corporation. All rights reserved
//
// This sample demonstrates two ways to use the CUPTI Profiler API with concurrent kernels.
// By taking the ratio of runtimes for a consecutive series of kernels, compared
// to a series of concurrent kernels, one can definitively demonstrate that concurrent
// kernels were running while metrics were gathered and the User Replay mechanism was in use.
// // Example: // 4 kernel launches, with 1x, 2x, 3x, and 4x amounts of work, each sized to one SM (one warp // of threads, one thread block). // When run synchronously, this comes to 10x amount of work. // When run concurrently, the longest (4x) kernel should be the only measured time (it hides the others). // Thus w/ 4 kernels, the concurrent : consecutive time ratio should be 4:10. // On test hardware this does simplify to 3.998:10. As the test is affected by memory layout, this may not // hold for certain architectures where, for example, cache sizes may optimize certain kernel calls. // // After demonstrating concurrency using multiple streams, this then demonstrates using multiple devices. // In this 3rd configuration, the same concurrent workload with streams is then duplicated and run // on each device concurrently using streams. // In this case, the wallclock time to launch, run, and join the threads should be roughly the same as the // wallclock time to run the single device case. If concurrency was not working, the wallclock time // would be (num devices) times the single device concurrent case. // // * If the multiple devices have different performance, the runtime may be significantly different between // devices, but this does not mean concurrent profiling is not happening. // This code has been adapted to PAPI from // `/extras/CUPTI/samples/concurrent_profiling/concurrent_profiling.cu` #ifdef PAPI extern "C" { #include <papi.h> #include "papi_test.h" } #endif // Standard CUDA, CUPTI, Profiler, NVPW headers #include "cuda.h" // Standard STL headers #include <chrono> #include <stdio.h> #include <stdlib.h> #include <string> using ::std::string; #include <thread> using ::std::thread; #include <vector> using ::std::vector; #include <algorithm> using ::std::find; #define PRINT(quiet, format, args...)
{if (!quiet) {fprintf(stderr, format, ## args);}} int quiet; #ifdef PAPI #define PAPI_CALL(apiFuncCall) \ do { \ int _status = apiFuncCall; \ if (_status != PAPI_OK) { \ fprintf(stderr, "error: function %s failed.", #apiFuncCall); \ test_fail(__FILE__, __LINE__, "", _status); \ } \ } while (0) #endif // Helpful error handlers for standard CUPTI and CUDA runtime calls #define RUNTIME_API_CALL(apiFuncCall) \ do { \ cudaError_t _status = apiFuncCall; \ if (_status != cudaSuccess) { \ fprintf(stderr, "%s:%d: error: function %s failed with error %s.\n", \ __FILE__, __LINE__, #apiFuncCall, cudaGetErrorString(_status));\ exit(EXIT_FAILURE); \ } \ } while (0) #define MEMORY_ALLOCATION_CALL(var) \ do { \ if (var == NULL) { \ fprintf(stderr, "%s:%d: Error: Memory Allocation Failed \n", \ __FILE__, __LINE__); \ exit(EXIT_FAILURE); \ } \ } while (0) #define DRIVER_API_CALL(apiFuncCall) \ do { \ CUresult _status = apiFuncCall; \ if (_status != CUDA_SUCCESS) { \ fprintf(stderr, "%s:%d: error: function %s failed with error %d.\n", \ __FILE__, __LINE__, #apiFuncCall, _status); \ exit(EXIT_FAILURE); \ } \ } while (0) typedef struct { int device; //!< compute device number CUcontext context; //!< CUDA driver context, or NULL if default context has already been initialized } profilingConfig; // Per-device configuration, buffers, stream and device information, and device pointers typedef struct { int deviceID; profilingConfig config; // Each device (or each context) needs its own CUPTI profiling config vector<cudaStream_t> streams; // Each device needs its own streams vector<double *> d_x; // And device memory allocation vector<double *> d_y; // ..
long long values[100]; // Capture PAPI measured values for each device } perDeviceData; #define DAXPY_REPEAT 32768 // Loop over array of elements performing daxpy multiple times // To be launched with only one block (artificially increasing serial time to better demonstrate overlapping replay) __global__ void daxpyKernel(int elements, double a, double * x, double * y) { for (int i = threadIdx.x; i < elements; i += blockDim.x) // Artificially increase kernel runtime to emphasize concurrency for (int j = 0; j < DAXPY_REPEAT; j++) y[i] = a * x[i] + y[i]; // daxpy } // Initialize kernel values double a = 2.5; // Normally you would want multiple warps, but to emphasize concurrency with streams and multiple devices // we run the kernels on a single warp. int threadsPerBlock = 32; int threadBlocks = 1; // Configurable number of kernels (streams, when running concurrently) int const numKernels = 4; int const numStreams = numKernels; vector<int> elements(numKernels); // Each kernel call allocates and computes (call number) * (blockSize) elements // For 4 calls, this is 4k elements * 2 arrays * (1 + 2 + 3 + 4 stream mul) * 8B/elem =~ 640KB int const blockSize = 4 * 1024; // Globals for successfully added and multiple pass events int numMultipassEvents = 0; vector<string> eventsSuccessfullyAdded; /** @class add_events_from_command_line * @brief Try to add each event provided on the command line by the user. * * @param d * Per device data. * @param EventSet * A PAPI eventset. * @param metricNames * Events provided on the command line. * @param successfullyAddedEvents * Events successfully added to the EventSet. * @param *numMultipassEvents * Counter to see if a multiple pass event was provided on the command line.
*/ static void add_events_from_command_line(perDeviceData &d, int EventSet, vector<string> const &metricNames, vector<string> successfullyAddedEvents, int *numMultipassEvents) { int i; for (i = 0; i < metricNames.size(); i++) { string evt_name = metricNames[i] + std::to_string(d.config.device); int papi_errno = PAPI_add_named_event(EventSet, evt_name.c_str()); if (papi_errno != PAPI_OK) { if (papi_errno != PAPI_EMULPASS) { fprintf(stderr, "Unable to add event %s to the EventSet with error code %d.\n", evt_name.c_str(), papi_errno); test_skip(__FILE__, __LINE__, "", 0); } // Handle multiple pass events (*numMultipassEvents)++; continue; } // Handle successfully added events if (find(eventsSuccessfullyAdded.begin(), eventsSuccessfullyAdded.end(), metricNames[i]) == eventsSuccessfullyAdded.end()) { eventsSuccessfullyAdded.push_back(metricNames[i]); } } return; } // Wrapper which will launch numKernel kernel calls on a single device // The device streams vector is used to control which stream each call is made on // If 'serial' is non-zero, the device streams are ignored and instead the default stream is used void profileKernels(perDeviceData &d, vector<string> const &metricNames, char const * const rangeName, bool serial) { // Switch to desired device RUNTIME_API_CALL(cudaSetDevice(d.config.device)); // Orig code has mistake here DRIVER_API_CALL(cuCtxSetCurrent(d.config.context)); #ifdef PAPI int eventset = PAPI_NULL; PAPI_CALL(PAPI_create_eventset(&eventset)); add_events_from_command_line(d, eventset, metricNames, eventsSuccessfullyAdded, &numMultipassEvents); // Only multiple pass events were provided on the command line if (eventsSuccessfullyAdded.size() == 0) { fprintf(stderr, "Events provided on the command line could not be added to an EventSet as they require multiple passes.\n"); test_skip(__FILE__, __LINE__, "", 0); } PAPI_CALL(PAPI_start(eventset)); #endif for (unsigned int stream = 0; stream < d.streams.size(); stream++) { cudaStream_t streamId = (serial ?
0 : d.streams[stream]); daxpyKernel <<<threadBlocks, threadsPerBlock, 0, streamId>>> (elements[stream], a, d.d_x[stream], d.d_y[stream]); } // After launching all work, synchronize all streams if (serial == false) { for (unsigned int stream = 0; stream < d.streams.size(); stream++) { RUNTIME_API_CALL(cudaStreamSynchronize(d.streams[stream])); } } else { RUNTIME_API_CALL(cudaStreamSynchronize(0)); } #ifdef PAPI PAPI_CALL(PAPI_stop(eventset, d.values)); PAPI_CALL(PAPI_cleanup_eventset(eventset)); PAPI_CALL(PAPI_destroy_eventset(&eventset)); #endif } void print_measured_values(perDeviceData &d, vector<string> const &metricNames) { string evt_name; PRINT(quiet, "PAPI event name\t\t\t\t\t\t\tMeasured value\n"); PRINT(quiet, "%s\n", std::string(80, '-').c_str()); for (int i=0; i < metricNames.size(); i++) { evt_name = metricNames[i] + std::to_string(d.config.device); PRINT(quiet, "%s\t\t\t%lld\n", evt_name.c_str(), d.values[i]); } } int main(int argc, char **argv) { quiet = 0; int i; #ifdef PAPI char *test_quiet = getenv("PAPI_CUDA_TEST_QUIET"); if (test_quiet) quiet = (int) strtol(test_quiet, (char**) NULL, 10); int event_count = argc - 1; /* if no events passed at command line, just report test skipped.
*/ if (event_count == 0) { fprintf(stderr, "No eventnames specified at command line.\n"); test_skip(__FILE__, __LINE__, "", 0); } vector<string> metricNames; for (i=0; i < event_count; i++) { metricNames.push_back(argv[i+1]); } // Initialize the PAPI library if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) { test_fail(__FILE__, __LINE__, "PAPI_library_init failed.", 0); } #else vector<string> metricNames = {""}; #endif int numDevices; RUNTIME_API_CALL(cudaGetDeviceCount(&numDevices)); // Per-device information vector<int> device_ids; // Find all devices capable of running CUPTI Profiling (Compute Capability >= 7.0) for (i = 0; i < numDevices; i++) { // Get device properties int major; RUNTIME_API_CALL(cudaDeviceGetAttribute(&major, cudaDevAttrComputeCapabilityMajor, i)); if (major >= 7) { // Record device number device_ids.push_back(i); } } numDevices = device_ids.size(); PRINT(quiet, "Found %d compatible devices\n", numDevices); // Ensure we found at least one device if (numDevices == 0) { fprintf(stderr, "No devices detected compatible with CUPTI Profiling (Compute Capability >= 7.0)\n"); #ifdef PAPI test_skip(__FILE__, __LINE__, "", 0); #endif } // Initialize kernel input to some known numbers vector<double> h_x(blockSize * numKernels); vector<double> h_y(blockSize * numKernels); for (size_t i = 0; i < blockSize * numKernels; i++) { h_x[i] = 1.5 * i; h_y[i] = 2.0 * (i - 3000); } // Initialize a vector of 'default stream' values to demonstrate serialized kernels vector<cudaStream_t> defaultStreams(numStreams); for (int stream = 0; stream < numStreams; stream++) { defaultStreams[stream] = 0; } // Scale per-kernel work by stream number for (int stream = 0; stream < numStreams; stream++) { elements[stream] = blockSize * (stream + 1); } // For each device, configure profiling, set up buffers, copy kernel data vector<perDeviceData> deviceData(numDevices); for (int device = 0; device < numDevices; device++) { int device_id = device_ids[device]; RUNTIME_API_CALL(cudaSetDevice(device_id)); PRINT(quiet, "Configuring device
%d\n", device_id); deviceData[device].deviceID = device_id; // Required CUPTI Profiling configuration & initialization // Can be done ahead of time or immediately before startSession() call // Initialization & configuration images can be generated separately, then passed to later calls // For simplicity's sake, in this sample, a single config struct is created per device and passed to each CUPTI Profiler API call // For more complex cases, each combination of CUPTI Profiler Session and Config requires additional initialization profilingConfig config; config.device = device_id; // Device ID, used to get device name for metrics enumeration // config.maxLaunchesPerPass = 1; // Must be >= maxRangesPerPass. Set this to the largest count of kernel launches which may be encountered in any Pass in this Session // // Device 0 has max of 3 passes; other devices only run one pass in this sample code DRIVER_API_CALL(cuCtxCreate(&(config.context), 0, device)); // Either set to a context, or may be NULL if a default context has been created deviceData[device].config = config;// Save this device config // Initialize CUPTI Profiling structures // targetInitProfiling(deviceData[device], metricNames); // Per-stream initialization & memory allocation - copy from constant host array to each device array deviceData[device].streams.resize(numStreams); deviceData[device].d_x.resize(numStreams); deviceData[device].d_y.resize(numStreams); for (int stream = 0; stream < numStreams; stream++) { RUNTIME_API_CALL(cudaStreamCreate(&(deviceData[device].streams[stream]))); // Each kernel does (stream #) * blockSize work on doubles size_t size = elements[stream] * sizeof(double); RUNTIME_API_CALL(cudaMalloc(&(deviceData[device].d_x[stream]), size)); MEMORY_ALLOCATION_CALL(deviceData[device].d_x[stream]); // Validate pointer RUNTIME_API_CALL(cudaMemcpy(deviceData[device].d_x[stream], h_x.data(), size, cudaMemcpyHostToDevice)); RUNTIME_API_CALL(cudaMalloc(&(deviceData[device].d_y[stream]), size)); 
MEMORY_ALLOCATION_CALL(deviceData[device].d_y[stream]); // Validate pointer RUNTIME_API_CALL(cudaMemcpy(deviceData[device].d_y[stream], h_x.data(), size, cudaMemcpyHostToDevice)); } } // // First version - single device, kernel calls serialized on default stream // // Use wallclock time to measure performance auto begin_time = ::std::chrono::high_resolution_clock::now(); // Run on first device and use default streams, which run serially profileKernels(deviceData[0], metricNames, "single_device_serial", true); auto end_time = ::std::chrono::high_resolution_clock::now(); int elapsed_serial_ms = ::std::chrono::duration_cast<::std::chrono::milliseconds>(end_time - begin_time).count(); int numBlocks = 0; for (int i = 1; i <= numKernels; i++) { numBlocks += i; } PRINT(quiet, "It took %d ms on the host to profile %d kernels in serial.", elapsed_serial_ms, numKernels); // // Second version - same kernel calls as before on the same device, but now using separate streams for concurrency // (Should be limited by the longest running kernel) // begin_time = ::std::chrono::high_resolution_clock::now(); // Still only use first device, but this time use its allocated streams for parallelism profileKernels(deviceData[0], metricNames, "single_device_async", false); end_time = ::std::chrono::high_resolution_clock::now(); int elapsed_single_device_ms = ::std::chrono::duration_cast<::std::chrono::milliseconds>(end_time - begin_time).count(); PRINT(quiet, "It took %d ms on the host to profile %d kernels on a single device on separate streams.", elapsed_single_device_ms, numKernels); PRINT(quiet, "--> If the separate stream wallclock time is less than the serial version, the streams were profiling concurrently.\n"); // // Third version - same as the second case, but duplicates the concurrent work across devices to show cross-device concurrency // This is done using devices so no serialization is needed between devices // (Should have roughly the same wallclock time as second case if the 
devices have similar performance) // if (numDevices == 1) { PRINT(quiet, "Only one compatible device found; skipping the multi-threaded test.\n"); } else { #ifdef PAPI int papi_errno = PAPI_thread_init((unsigned long (*)(void)) std::this_thread::get_id); if ( papi_errno != PAPI_OK ) { test_fail(__FILE__, __LINE__, "Error setting thread id function.\n", papi_errno); } #endif PRINT(quiet, "Running on %d devices, one thread per device.\n", numDevices); // Time creation of the same multiple streams (on multiple devices, if possible) vector<::std::thread> threads; begin_time = ::std::chrono::high_resolution_clock::now(); // Now launch parallel thread work, duplicated on one thread per device for (int thread = 0; thread < numDevices; thread++) { threads.push_back(::std::thread(profileKernels, ::std::ref(deviceData[thread]), metricNames, "multi_device_async", false)); } // Wait for all threads to finish for (auto &t: threads) { t.join(); } // Record time used when launching on multiple devices end_time = ::std::chrono::high_resolution_clock::now(); int elapsed_multiple_device_ms = ::std::chrono::duration_cast<::std::chrono::milliseconds>(end_time - begin_time).count(); PRINT(quiet, "It took %d ms on the host to profile the same %d kernels on each of the %d devices in parallel\n", elapsed_multiple_device_ms, numKernels, numDevices); PRINT(quiet, "--> Wallclock ratio of parallel device launch to single device launch is %f\n", elapsed_multiple_device_ms / (double) elapsed_single_device_ms); PRINT(quiet, "--> If the ratio is close to 1, that means there was little overhead to profile in parallel on multiple devices compared to profiling on a single device.\n"); PRINT(quiet, "--> If the devices have different performance, the ratio may not be close to one, and this should be limited by the slowest device.\n"); } // Free stream memory for each device for (int i = 0; i < numDevices; i++) { for (int j = 0; j < numKernels; j++) { RUNTIME_API_CALL(cudaFree(deviceData[i].d_x[j])); 
RUNTIME_API_CALL(cudaFree(deviceData[i].d_y[j])); } } #ifdef PAPI // Display metric values PRINT(quiet, "\nMetrics for device #0:\n"); PRINT(quiet, "Look at the sm__cycles_elapsed.max values for each test.\n"); PRINT(quiet, "This value represents the time spent on device to run the kernels in each case, and should be longest for the serial range, and roughly equal for the single and multi device concurrent ranges.\n"); print_measured_values(deviceData[0], eventsSuccessfullyAdded); // Only display next device info if needed if (numDevices > 1) { PRINT(quiet, "\nMetrics for the remaining devices only display the multi device async case and should all be similar to the first device's values if the device has similar performance characteristics.\n"); PRINT(quiet, "If devices have different performance characteristics, the runtime cycles calculation may vary by device.\n"); } for (int i = 1; i < numDevices; i++) { PRINT(quiet, "\nMetrics for device #%d:\n", i); print_measured_values(deviceData[i], eventsSuccessfullyAdded); } PAPI_shutdown(); // Output a note that a multiple pass event was provided on the command line if (numMultipassEvents > 0) { PRINT(quiet, "\033[0;33mNOTE: From the events provided on the command line, an event or events requiring multiple passes was detected and not added to the EventSet. Check your events with utils/papi_native_avail.\n\033[0m"); } test_pass(__FILE__); #endif return 0; } papi-papi-7-2-0-t/src/components/cuda/tests/concurrent_profiling_noCuCtx.cu000066400000000000000000000515731502707512200271340ustar00rootroot00000000000000// Copyright 2021 NVIDIA Corporation. All rights reserved // // This sample demonstrates two ways to use the CUPTI Profiler API with concurrent kernels. // By taking the ratio of runtimes for a consecutive series of kernels, compared // to a series of concurrent kernels, one can difinitively demonstrate that concurrent // kernels were running while metrics were gathered and the User Replay mechanism was in use. 
// // Example: // 4 kernel launches, with 1x, 2x, 3x, and 4x amounts of work, each sized to one SM (one warp // of threads, one thread block). // When run synchronously, this comes to 10x amount of work. // When run concurrently, the longest (4x) kernel should be the only measured time (it hides the others). // Thus w/ 4 kernels, the concurrent : consecutive time ratio should be 4:10. // On test hardware this does simplify to 3.998:10. As the test is affected by memory layout, this may not // hold for certain architectures where, for example, cache sizes may optimize certain kernel calls. // // After demonstrating concurrency using multiple streams, this then demonstrates using multiple devices. // In this 3rd configuration, the same concurrent workload with streams is then duplicated and run // on each device concurrently using streams. // In this case, the wallclock time to launch, run, and join the threads should be roughly the same as the // wallclock time to run the single device case. If concurrency was not working, the wallclock time // would be (num devices) times the single device concurrent case. // // * If the multiple devices have different performance, the runtime may be significantly different between // devices, but this does not mean concurrent profiling is not happening. // This code has been adapted to PAPI from // `/extras/CUPTI/samples/concurrent_profiling/concurrent_profiling.cu` #ifdef PAPI extern "C" { #include <papi.h> #include "papi_test.h" } #endif // Standard CUDA, CUPTI, Profiler, NVPW headers #include "cuda.h" // Standard STL headers #include <chrono> #include <stdio.h> #include <stdlib.h> #include <string> using ::std::string; #include <thread> using ::std::thread; #include <vector> using ::std::vector; #include <algorithm> using ::std::find; #define PRINT(quiet, format, args...)
{if (!quiet) {fprintf(stderr, format, ## args);}} int quiet; #ifdef PAPI #define PAPI_CALL(apiFuncCall) \ do { \ int _status = apiFuncCall; \ if (_status != PAPI_OK) { \ fprintf(stderr, "error: function %s failed.", #apiFuncCall); \ test_fail(__FILE__, __LINE__, "", _status); \ } \ } while (0) #endif // Helpful error handlers for standard CUPTI and CUDA runtime calls #define RUNTIME_API_CALL(apiFuncCall) \ do { \ cudaError_t _status = apiFuncCall; \ if (_status != cudaSuccess) { \ fprintf(stderr, "%s:%d: error: function %s failed with error %s.\n", \ __FILE__, __LINE__, #apiFuncCall, cudaGetErrorString(_status));\ exit(EXIT_FAILURE); \ } \ } while (0) #define MEMORY_ALLOCATION_CALL(var) \ do { \ if (var == NULL) { \ fprintf(stderr, "%s:%d: Error: Memory Allocation Failed \n", \ __FILE__, __LINE__); \ exit(EXIT_FAILURE); \ } \ } while (0) #define DRIVER_API_CALL(apiFuncCall) \ do { \ CUresult _status = apiFuncCall; \ if (_status != CUDA_SUCCESS) { \ fprintf(stderr, "%s:%d: error: function %s failed with error %d.\n", \ __FILE__, __LINE__, #apiFuncCall, _status); \ exit(EXIT_FAILURE); \ } \ } while (0) typedef struct { int device; //!< compute device number } profilingConfig; // Per-device configuration, buffers, stream and device information, and device pointers typedef struct { int deviceID; profilingConfig config; // Each device (or each context) needs its own CUPTI profiling config vector<cudaStream_t> streams; // Each device needs its own streams vector<double *> d_x; // And device memory allocation vector<double *> d_y; // ..
long long values[100]; // Capture PAPI measured values for each device } perDeviceData; #define DAXPY_REPEAT 32768 // Loop over array of elements performing daxpy multiple times // To be launched with only one block (artificially increasing serial time to better demonstrate overlapping replay) __global__ void daxpyKernel(int elements, double a, double * x, double * y) { for (int i = threadIdx.x; i < elements; i += blockDim.x) // Artificially increase kernel runtime to emphasize concurrency for (int j = 0; j < DAXPY_REPEAT; j++) y[i] = a * x[i] + y[i]; // daxpy } // Initialize kernel values double a = 2.5; // Normally you would want multiple warps, but to emphasize concurrency with streams and multiple devices // we run the kernels on a single warp. int threadsPerBlock = 32; int threadBlocks = 1; // Configurable number of kernels (streams, when running concurrently) int const numKernels = 4; int const numStreams = numKernels; vector<int> elements(numKernels); // Each kernel call allocates and computes (call number) * (blockSize) elements // For 4 calls, this is 4k elements * 2 arrays * (1 + 2 + 3 + 4 stream mul) * 8B/elem =~ 640KB int const blockSize = 4 * 1024; // Globals for successfully added and multiple pass events int numMultipassEvents = 0; vector<string> eventsSuccessfullyAdded; /** @class add_events_from_command_line * @brief Try to add each event provided on the command line by the user. * * @param d * Per device data. * @param EventSet * A PAPI eventset. * @param metricNames * Events provided on the command line. * @param successfullyAddedEvents * Events successfully added to the EventSet. * @param *numMultipassEvents * Counter to see if a multiple pass event was provided on the command line.
*/ static void add_events_from_command_line(perDeviceData &d, int EventSet, vector<string> const &metricNames, vector<string> successfullyAddedEvents, int *numMultipassEvents) { int i; for (i = 0; i < metricNames.size(); i++) { string evt_name = metricNames[i] + std::to_string(d.config.device); int papi_errno = PAPI_add_named_event(EventSet, evt_name.c_str()); if (papi_errno != PAPI_OK) { if (papi_errno != PAPI_EMULPASS) { fprintf(stderr, "Unable to add event %s to the EventSet with error code %d.\n", evt_name.c_str(), papi_errno); test_skip(__FILE__, __LINE__, "", 0); } // Handle multiple pass events (*numMultipassEvents)++; continue; } // Handle successfully added events if (find(eventsSuccessfullyAdded.begin(), eventsSuccessfullyAdded.end(), metricNames[i]) == eventsSuccessfullyAdded.end()) { eventsSuccessfullyAdded.push_back(metricNames[i]); } } return; } // Wrapper which will launch numKernel kernel calls on a single device // The device streams vector is used to control which stream each call is made on // If 'serial' is non-zero, the device streams are ignored and instead the default stream is used void profileKernels(perDeviceData &d, vector<string> const &metricNames, char const * const rangeName, bool serial) { RUNTIME_API_CALL(cudaSetDevice(d.config.device)); // Orig code has mistake here #ifdef PAPI int eventset = PAPI_NULL; PAPI_CALL(PAPI_create_eventset(&eventset)); add_events_from_command_line(d, eventset, metricNames, eventsSuccessfullyAdded, &numMultipassEvents); // Only multiple pass events were provided on the command line if (eventsSuccessfullyAdded.size() == 0) { fprintf(stderr, "Events provided on the command line could not be added to an EventSet as they require multiple passes.\n"); test_skip(__FILE__, __LINE__, "", 0); } PAPI_CALL(PAPI_start(eventset)); #endif for (unsigned int stream = 0; stream < d.streams.size(); stream++) { cudaStream_t streamId = (serial ?
0 : d.streams[stream]); daxpyKernel <<<threadBlocks, threadsPerBlock, 0, streamId>>> (elements[stream], a, d.d_x[stream], d.d_y[stream]); } // After launching all work, synchronize all streams if (serial == false) { for (unsigned int stream = 0; stream < d.streams.size(); stream++) { RUNTIME_API_CALL(cudaStreamSynchronize(d.streams[stream])); } } else { RUNTIME_API_CALL(cudaStreamSynchronize(0)); } #ifdef PAPI PAPI_CALL(PAPI_stop(eventset, d.values)); PAPI_CALL(PAPI_cleanup_eventset(eventset)); PAPI_CALL(PAPI_destroy_eventset(&eventset)); #endif } void print_measured_values(perDeviceData &d, vector<string> const &metricNames) { string evt_name; PRINT(quiet, "PAPI event name\t\t\t\t\t\t\tMeasured value\n"); PRINT(quiet, "%s\n", std::string(80, '-').c_str()); for (int i=0; i < metricNames.size(); i++) { evt_name = metricNames[i] + std::to_string(d.config.device); PRINT(quiet, "%s\t\t\t%lld\n", evt_name.c_str(), d.values[i]); } } int main(int argc, char **argv) { quiet = 0; int i; #ifdef PAPI char *test_quiet = getenv("PAPI_CUDA_TEST_QUIET"); if (test_quiet) quiet = (int) strtol(test_quiet, (char**) NULL, 10); int event_count = argc - 1; /* if no events passed at command line, just report test skipped.
*/ if (event_count == 0) { fprintf(stderr, "No eventnames specified at command line.\n"); test_skip(__FILE__, __LINE__, "", 0); } vector<string> metricNames; for (i=0; i < event_count; i++) { metricNames.push_back(argv[i+1]); } // Initialize the PAPI library if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) { test_fail(__FILE__, __LINE__, "PAPI_library_init failed.", 0); } #else vector<string> metricNames = {""}; #endif int numDevices; RUNTIME_API_CALL(cudaGetDeviceCount(&numDevices)); // Per-device information vector<int> device_ids; // Find all devices capable of running CUPTI Profiling (Compute Capability >= 7.0) for (i = 0; i < numDevices; i++) { // Get device properties int major; RUNTIME_API_CALL(cudaDeviceGetAttribute(&major, cudaDevAttrComputeCapabilityMajor, i)); if (major >= 7) { // Record device number device_ids.push_back(i); } } numDevices = device_ids.size(); PRINT(quiet, "Found %d compatible devices\n", numDevices); // Ensure we found at least one device if (numDevices == 0) { fprintf(stderr, "No devices detected compatible with CUPTI Profiling (Compute Capability >= 7.0)\n"); #ifdef PAPI test_skip(__FILE__, __LINE__, "", 0); #endif } // Initialize kernel input to some known numbers vector<double> h_x(blockSize * numKernels); vector<double> h_y(blockSize * numKernels); for (size_t i = 0; i < blockSize * numKernels; i++) { h_x[i] = 1.5 * i; h_y[i] = 2.0 * (i - 3000); } // Initialize a vector of 'default stream' values to demonstrate serialized kernels vector<cudaStream_t> defaultStreams(numStreams); for (int stream = 0; stream < numStreams; stream++) { defaultStreams[stream] = 0; } // Scale per-kernel work by stream number for (int stream = 0; stream < numStreams; stream++) { elements[stream] = blockSize * (stream + 1); } // For each device, configure profiling, set up buffers, copy kernel data vector<perDeviceData> deviceData(numDevices); for (int device = 0; device < numDevices; device++) { int device_id = device_ids[device]; RUNTIME_API_CALL(cudaSetDevice(device_id)); PRINT(quiet, "Configuring device
%d\n", device_id); deviceData[device].deviceID = device_id; // Required CUPTI Profiling configuration & initialization // Can be done ahead of time or immediately before startSession() call // Initialization & configuration images can be generated separately, then passed to later calls // For simplicity's sake, in this sample, a single config struct is created per device and passed to each CUPTI Profiler API call // For more complex cases, each combination of CUPTI Profiler Session and Config requires additional initialization profilingConfig config; config.device = device_id; // Device ID, used to get device name for metrics enumeration // config.maxLaunchesPerPass = 1; // Must be >= maxRangesPerPass. Set this to the largest count of kernel launches which may be encountered in any Pass in this Session // // Device 0 has max of 3 passes; other devices only run one pass in this sample code deviceData[device].config = config;// Save this device config // Initialize CUPTI Profiling structures // Per-stream initialization & memory allocation - copy from constant host array to each device array deviceData[device].streams.resize(numStreams); deviceData[device].d_x.resize(numStreams); deviceData[device].d_y.resize(numStreams); for (int stream = 0; stream < numStreams; stream++) { RUNTIME_API_CALL(cudaStreamCreate(&(deviceData[device].streams[stream]))); // Each kernel does (stream #) * blockSize work on doubles size_t size = elements[stream] * sizeof(double); RUNTIME_API_CALL(cudaMalloc(&(deviceData[device].d_x[stream]), size)); MEMORY_ALLOCATION_CALL(deviceData[device].d_x[stream]); // Validate pointer RUNTIME_API_CALL(cudaMemcpy(deviceData[device].d_x[stream], h_x.data(), size, cudaMemcpyHostToDevice)); RUNTIME_API_CALL(cudaMalloc(&(deviceData[device].d_y[stream]), size)); MEMORY_ALLOCATION_CALL(deviceData[device].d_y[stream]); // Validate pointer RUNTIME_API_CALL(cudaMemcpy(deviceData[device].d_y[stream], h_x.data(), size, cudaMemcpyHostToDevice)); } } // // First 
version - single device, kernel calls serialized on default stream // // Use wallclock time to measure performance auto begin_time = ::std::chrono::high_resolution_clock::now(); // Run on first device and use default streams, which run serially profileKernels(deviceData[0], metricNames, "single_device_serial", true); auto end_time = ::std::chrono::high_resolution_clock::now(); int elapsed_serial_ms = ::std::chrono::duration_cast<::std::chrono::milliseconds>(end_time - begin_time).count(); int numBlocks = 0; for (int i = 1; i <= numKernels; i++) { numBlocks += i; } PRINT(quiet, "It took %d ms on the host to profile %d kernels in serial.", elapsed_serial_ms, numKernels); // // Second version - same kernel calls as before on the same device, but now using separate streams for concurrency // (Should be limited by the longest running kernel) // begin_time = ::std::chrono::high_resolution_clock::now(); // Still only use first device, but this time use its allocated streams for parallelism profileKernels(deviceData[0], metricNames, "single_device_async", false); end_time = ::std::chrono::high_resolution_clock::now(); int elapsed_single_device_ms = ::std::chrono::duration_cast<::std::chrono::milliseconds>(end_time - begin_time).count(); PRINT(quiet, "It took %d ms on the host to profile %d kernels on a single device on separate streams.", elapsed_single_device_ms, numKernels); PRINT(quiet, "--> If the separate stream wallclock time is less than the serial version, the streams were profiling concurrently.\n"); // // Third version - same as the second case, but duplicates the concurrent work across devices to show cross-device concurrency // This is done using devices so no serialization is needed between devices // (Should have roughly the same wallclock time as second case if the devices have similar performance) // if (numDevices == 1) { PRINT(quiet, "Only one compatible device found; skipping the multi-threaded test.\n"); } else { #ifdef PAPI int papi_errno = 
PAPI_thread_init((unsigned long (*)(void)) std::this_thread::get_id); if ( papi_errno != PAPI_OK ) { test_fail(__FILE__, __LINE__, "Error setting thread id function.\n", papi_errno); } #endif PRINT(quiet, "Running on %d devices, one thread per device.\n", numDevices); // Time creation of the same multiple streams (on multiple devices, if possible) vector<::std::thread> threads; begin_time = ::std::chrono::high_resolution_clock::now(); // Now launch parallel thread work, duplicated on one thread per device for (int thread = 0; thread < numDevices; thread++) { threads.push_back(::std::thread(profileKernels, ::std::ref(deviceData[thread]), metricNames, "multi_device_async", false)); } // Wait for all threads to finish for (auto &t: threads) { t.join(); } // Record time used when launching on multiple devices end_time = ::std::chrono::high_resolution_clock::now(); int elapsed_multiple_device_ms = ::std::chrono::duration_cast<::std::chrono::milliseconds>(end_time - begin_time).count(); double ratio = elapsed_multiple_device_ms / (double) elapsed_single_device_ms; PRINT(quiet, "It took %d ms on the host to profile the same %d kernels on each of the %d devices in parallel\n", elapsed_multiple_device_ms, numKernels, numDevices); PRINT(quiet, "--> Wallclock ratio of parallel device launch to single device launch is %f\n", ratio); PRINT(quiet, "--> If the ratio is close to 1, that means there was little overhead to profile in parallel on multiple devices compared to profiling on a single device.\n"); PRINT(quiet, "--> If the devices have different performance, the ratio may not be close to one, and this should be limited by the slowest device.\n"); } // Free stream memory for each device for (int i = 0; i < numDevices; i++) { for (int j = 0; j < numKernels; j++) { RUNTIME_API_CALL(cudaFree(deviceData[i].d_x[j])); RUNTIME_API_CALL(cudaFree(deviceData[i].d_y[j])); } } #ifdef PAPI // Display metric values PRINT(quiet, "\nMetrics for device #0:\n"); PRINT(quiet, "Look at the 
sm__cycles_elapsed.max values for each test.\n"); PRINT(quiet, "This value represents the time spent on device to run the kernels in each case, and should be longest for the serial range, and roughly equal for the single and multi device concurrent ranges.\n"); print_measured_values(deviceData[0], eventsSuccessfullyAdded); // Only display next device info if needed if (numDevices > 1) { PRINT(quiet, "\nMetrics for the remaining devices only display the multi device async case and should all be similar to the first device's values if the device has similar performance characteristics.\n"); PRINT(quiet, "If devices have different performance characteristics, the runtime cycles calculation may vary by device.\n"); } for (int i = 1; i < numDevices; i++) { PRINT(quiet, "\nMetrics for device #%d:\n", i); print_measured_values(deviceData[i], eventsSuccessfullyAdded); } PAPI_shutdown(); // Output a note that a multiple pass event was provided on the command line if (numMultipassEvents > 0) { PRINT(quiet, "\033[0;33mNOTE: From the events provided on the command line, an event or events requiring multiple passes was detected and not added to the EventSet. Check your events with utils/papi_native_avail.\n\033[0m"); } test_pass(__FILE__); #endif return 0; } papi-papi-7-2-0-t/src/components/cuda/tests/cudaOpenMP.cu000066400000000000000000000270251502707512200232240ustar00rootroot00000000000000/* Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. 
* * Neither the name of NVIDIA CORPORATION nor the names of its * contributors may be used to endorse or promote products derived * from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /* * Multi-GPU sample using OpenMP for threading on the CPU side * needs a compiler that supports OpenMP 2.0 */ #ifdef PAPI #include <papi.h> #include "papi_test.h" #define PAPI_CALL(apiFuncCall) \ do { \ int _status = apiFuncCall; \ if (_status != PAPI_OK) { \ fprintf(stderr, "error: function %s failed.", #apiFuncCall); \ test_fail(__FILE__, __LINE__, "", _status); \ } \ } while (0) #endif #include "gpu_work.h" #include <omp.h> #include <stdio.h> // stdio functions are used since C++ streams aren't necessarily thread safe #define PRINT(quiet, format, args...)
{if (!quiet) {fprintf(stderr, format, ## args);}} int quiet; #define RUNTIME_API_CALL(apiFuncCall) \ do { \ cudaError_t _status = apiFuncCall; \ if (_status != cudaSuccess) { \ fprintf(stderr, "%s:%d: error: function %s failed with error %s.\n", \ __FILE__, __LINE__, #apiFuncCall, cudaGetErrorString(_status));\ exit(EXIT_FAILURE); \ } \ } while (0) #define DRIVER_API_CALL(apiFuncCall) \ do { \ CUresult _status = apiFuncCall; \ if (_status != CUDA_SUCCESS) { \ fprintf(stderr, "%s:%d: error: function %s failed with error %d.\n", \ __FILE__, __LINE__, #apiFuncCall, _status); \ exit(EXIT_FAILURE); \ } \ } while (0) #define MAX_THREADS (32) /** @class add_events_from_command_line * @brief Try and add each event provided on the command line by the user. * * @param EventSet * A PAPI eventset. * @param totalEventCount * Number of events from the command line. * @param gpu_id * NVIDIA device index. * @param **eventNamesFromCommandLine * Events provided on the command line. * @param *numEventsSuccessfullyAdded * Total number of successfully added events. * @param **eventsSuccessfullyAdded * Events that we are able to add to the EventSet. * @param *numMultipassEvents * Counter to see if a multiple pass event was provided on the command line. 
*/ static void add_events_from_command_line(int EventSet, int totalEventCount, int gpu_id, char **eventNamesFromCommandLine, int *numEventsSuccessfullyAdded, char **eventsSuccessfullyAdded, int *numMultipassEvents) { int i; for (i = 0; i < totalEventCount; i++) { char tmpEventName[PAPI_MAX_STR_LEN]; int strLen = snprintf(tmpEventName, PAPI_MAX_STR_LEN, "%s:device=%d", eventNamesFromCommandLine[i], gpu_id); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { fprintf(stderr, "Failed to fully write event name with appended device qualifier.\n"); test_skip(__FILE__, __LINE__, "", 0); } int papi_errno = PAPI_add_named_event(EventSet, tmpEventName); if (papi_errno != PAPI_OK) { if (papi_errno != PAPI_EMULPASS) { fprintf(stderr, "Unable to add event %s to the EventSet with error code %d.\n", tmpEventName, papi_errno); test_skip(__FILE__, __LINE__, "", 0); } // Handle multiple pass events (*numMultipassEvents)++; continue; } // Handle successfully added events strLen = snprintf(eventsSuccessfullyAdded[(*numEventsSuccessfullyAdded)], PAPI_MAX_STR_LEN, "%s", tmpEventName); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { fprintf(stderr, "Failed to fully write successfully added event.\n"); test_skip(__FILE__, __LINE__, "", 0); } (*numEventsSuccessfullyAdded)++; } return; } int main(int argc, char *argv[]) { quiet = 0; #ifdef PAPI char *test_quiet = getenv("PAPI_CUDA_TEST_QUIET"); if (test_quiet) quiet = (int) strtol(test_quiet, (char**) NULL, 10); int event_count = argc - 1; /* if no events passed at command line, just report test skipped. 
*/ if (event_count == 0) { fprintf(stderr, "No eventnames specified at command line.\n"); test_skip(__FILE__, __LINE__, "", 0); } #endif int num_gpus = 0, i; CUcontext ctx_arr[MAX_THREADS]; RUNTIME_API_CALL(cudaGetDeviceCount(&num_gpus)); // determine the number of CUDA capable GPUs if (num_gpus < 1) { fprintf(stderr, "no CUDA capable devices were detected\n"); #ifdef PAPI test_skip(__FILE__, __LINE__, "", 0); #endif return 0; } ///////////////////////////////////////////////////////////////// // display CPU and GPU configuration // PRINT(quiet, "number of host CPUs:\t%d\n", omp_get_num_procs()); PRINT(quiet, "number of CUDA devices:\t%d\n", num_gpus); for (i = 0; i < num_gpus; i++) { cudaDeviceProp dprop; RUNTIME_API_CALL(cudaGetDeviceProperties(&dprop, i)); PRINT(quiet, " %d: %s\n", i, dprop.name); } int num_threads = (num_gpus > MAX_THREADS) ? MAX_THREADS : num_gpus; // Create a gpu context for every thread for (i=0; i < num_threads; i++) { DRIVER_API_CALL(cuCtxCreate(&(ctx_arr[i]), 0, i % num_gpus)); // "% num_gpus" allows more CPU threads than GPU devices DRIVER_API_CALL(cuCtxPopCurrent(&(ctx_arr[i]))); } PRINT(quiet, "---------------------------\n"); #ifdef PAPI int papi_errno = PAPI_library_init( PAPI_VER_CURRENT ); if ( papi_errno != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__, "PAPI_library_init failed.", 0); } PAPI_CALL(PAPI_thread_init((unsigned long (*)(void)) omp_get_thread_num)); #endif omp_lock_t lock; omp_init_lock(&lock); PRINT(quiet, "Launching %d threads.\n", num_threads); omp_set_num_threads(num_threads); // create as many CPU threads as there are CUDA devices int numMultipassEvents = 0; #pragma omp parallel { unsigned int cpu_thread_id = omp_get_thread_num(); unsigned int num_cpu_threads = omp_get_num_threads(); PRINT(quiet, "cpu_thread_id %u, num_cpu_threads %u, num_threads %d, num_gpus %d\n", cpu_thread_id, num_cpu_threads, num_threads, num_gpus); DRIVER_API_CALL(cuCtxPushCurrent(ctx_arr[cpu_thread_id])); #ifdef PAPI int gpu_id = 
cpu_thread_id % num_gpus; int EventSet = PAPI_NULL; long long values[MAX_THREADS]; int j, errno; PAPI_CALL(PAPI_create_eventset(&EventSet)); PRINT(quiet, "CPU thread %d (of %d) uses CUDA device %d with context %p @ eventset %d\n", cpu_thread_id, num_cpu_threads, gpu_id, ctx_arr[cpu_thread_id], EventSet); int numEventsSuccessfullyAdded = 0; char **eventsSuccessfullyAdded, **metricNames = argv + 1; eventsSuccessfullyAdded = (char **) malloc(event_count * sizeof(char *)); if (eventsSuccessfullyAdded == NULL) { fprintf(stderr, "Failed to allocate memory for successfully added events.\n"); test_skip(__FILE__, __LINE__, "", 0); } for (i = 0; i < event_count; i++) { eventsSuccessfullyAdded[i] = (char *) malloc(PAPI_MAX_STR_LEN * sizeof(char)); if (eventsSuccessfullyAdded[i] == NULL) { fprintf(stderr, "Failed to allocate memory for command line argument.\n"); test_skip(__FILE__, __LINE__, "", 0); } } add_events_from_command_line(EventSet, event_count, gpu_id, metricNames, &numEventsSuccessfullyAdded, eventsSuccessfullyAdded, &numMultipassEvents); // Only multiple pass events were provided on the command line if (numEventsSuccessfullyAdded == 0) { fprintf(stderr, "Events provided on the command line could not be added to an EventSet as they require multiple passes.\n"); test_skip(__FILE__, __LINE__, "", 0); } PAPI_CALL(PAPI_start(EventSet)); #endif VectorAddSubtract(50000*(cpu_thread_id+1), quiet); // gpu work #ifdef PAPI PAPI_CALL(PAPI_stop(EventSet, values)); PRINT(quiet, "User measured values.\n"); for (j = 0; j < numEventsSuccessfullyAdded; j++) { PRINT(quiet, "%s\t\t%lld\n", eventsSuccessfullyAdded[j], values[j]); } // Free allocated memory for (i = 0; i < event_count; i++) { free(eventsSuccessfullyAdded[i]); } free(eventsSuccessfullyAdded); DRIVER_API_CALL(cuCtxPopCurrent(&(ctx_arr[gpu_id]))); errno = PAPI_cleanup_eventset(EventSet); if (errno != PAPI_OK) { fprintf(stderr, "PAPI_cleanup_eventset(%d) failed with error %d", EventSet, errno); test_fail(__FILE__, 
__LINE__, "", errno); } PAPI_CALL(PAPI_destroy_eventset(&EventSet)); #endif } // omp parallel region end for (i = 0; i < num_threads; i++) { DRIVER_API_CALL(cuCtxDestroy(ctx_arr[i])); } if (cudaSuccess != cudaGetLastError()) fprintf(stderr, "%s\n", cudaGetErrorString(cudaGetLastError())); omp_destroy_lock(&lock); #ifdef PAPI PAPI_shutdown(); // Output a note that a multiple pass event was provided on the command line if (numMultipassEvents > 0) { PRINT(quiet, "\033[0;33mNOTE: From the events provided on the command line, an event or events requiring multiple passes was detected and not added to the EventSet. Check your events with utils/papi_native_avail.\n\033[0m"); } test_pass(__FILE__); #endif return 0; } papi-papi-7-2-0-t/src/components/cuda/tests/cudaOpenMP_noCuCtx.cu000066400000000000000000000255131502707512200246670ustar00rootroot00000000000000/* Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * Neither the name of NVIDIA CORPORATION nor the names of its * contributors may be used to endorse or promote products derived * from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /* * Multi-GPU sample using OpenMP for threading on the CPU side * needs a compiler that supports OpenMP 2.0 */ #ifdef PAPI #include <papi.h> #include "papi_test.h" #define PAPI_CALL(apiFuncCall) \ do { \ int _status = apiFuncCall; \ if (_status != PAPI_OK) { \ fprintf(stderr, "error: function %s failed.", #apiFuncCall); \ test_fail(__FILE__, __LINE__, "", _status); \ } \ } while (0) #endif #include "gpu_work.h" #include <omp.h> #include <stdio.h> // stdio functions are used since C++ streams aren't necessarily thread safe #define PRINT(quiet, format, args...) {if (!quiet) {fprintf(stderr, format, ## args);}} int quiet; #define RUNTIME_API_CALL(apiFuncCall) \ do { \ cudaError_t _status = apiFuncCall; \ if (_status != cudaSuccess) { \ fprintf(stderr, "%s:%d: error: function %s failed with error %s.\n", \ __FILE__, __LINE__, #apiFuncCall, cudaGetErrorString(_status));\ exit(EXIT_FAILURE); \ } \ } while (0) #define DRIVER_API_CALL(apiFuncCall) \ do { \ CUresult _status = apiFuncCall; \ if (_status != CUDA_SUCCESS) { \ fprintf(stderr, "%s:%d: error: function %s failed with error %d.\n", \ __FILE__, __LINE__, #apiFuncCall, _status); \ exit(EXIT_FAILURE); \ } \ } while (0) #define MAX_THREADS (32) /** @class add_events_from_command_line * @brief Try and add each event provided on the command line by the user. * * @param EventSet * A PAPI eventset. * @param totalEventCount * Number of events from the command line. * @param gpu_id * NVIDIA device index.
* @param eventNamesFromCommandLine * Events provided on the command line. * @param *numEventsSuccessfullyAdded * Total number of successfully added events. * @param **eventsSuccessfullyAdded * Events that we are able to add to the EventSet. * @param *numMultipassEvents * Counter to see if a multiple pass event was provided on the command line. */ static void add_events_from_command_line(int EventSet, int totalEventCount, int gpu_id, char **eventNamesFromCommandLine, int *numEventsSuccessfullyAdded, char **eventsSuccessfullyAdded, int *numMultipassEvents) { int i; for (i = 0; i < totalEventCount; i++) { char tmpEventName[PAPI_MAX_STR_LEN]; int strLen = snprintf(tmpEventName, PAPI_MAX_STR_LEN, "%s:device=%d", eventNamesFromCommandLine[i], gpu_id); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { fprintf(stderr, "Failed to fully write event name with appended device qualifier.\n"); test_skip(__FILE__, __LINE__, "", 0); } int papi_errno = PAPI_add_named_event(EventSet, tmpEventName); if (papi_errno != PAPI_OK) { if (papi_errno != PAPI_EMULPASS) { fprintf(stderr, "Unable to add event %s to the EventSet with error code %d.\n", tmpEventName, papi_errno); test_skip(__FILE__, __LINE__, "", 0); } // Handle multiple pass events (*numMultipassEvents)++; continue; } // Handle successfully added events strLen = snprintf(eventsSuccessfullyAdded[(*numEventsSuccessfullyAdded)], PAPI_MAX_STR_LEN, "%s", tmpEventName); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { fprintf(stderr, "Failed to fully write successfully added event.\n"); test_skip(__FILE__, __LINE__, "", 0); } (*numEventsSuccessfullyAdded)++; } return; } int main(int argc, char *argv[]) { quiet = 0; #ifdef PAPI char *test_quiet = getenv("PAPI_CUDA_TEST_QUIET"); if (test_quiet) quiet = (int) strtol(test_quiet, (char**) NULL, 10); int event_count = argc - 1; /* if no events passed at command line, just report test skipped. 
*/ if (event_count == 0) { fprintf(stderr, "No eventnames specified at command line.\n"); test_skip(__FILE__, __LINE__, "", 0); } #endif int num_gpus = 0, i; RUNTIME_API_CALL(cudaGetDeviceCount(&num_gpus)); // determine the number of CUDA capable GPUs if (num_gpus < 1) { fprintf(stderr, "no CUDA capable devices were detected\n"); #ifdef PAPI test_skip(__FILE__, __LINE__, "", 0); #endif return 0; } ///////////////////////////////////////////////////////////////// // display CPU and GPU configuration // PRINT(quiet, "number of host CPUs:\t%d\n", omp_get_num_procs()); PRINT(quiet, "number of CUDA devices:\t%d\n", num_gpus); for (i = 0; i < num_gpus; i++) { cudaDeviceProp dprop; RUNTIME_API_CALL(cudaGetDeviceProperties(&dprop, i)); PRINT(quiet, " %d: %s\n", i, dprop.name); RUNTIME_API_CALL(cudaSetDevice(i)); RUNTIME_API_CALL(cudaFree(NULL)); } int num_threads = (num_gpus > MAX_THREADS) ? MAX_THREADS : num_gpus; PRINT(quiet, "---------------------------\n"); #ifdef PAPI int papi_errno = PAPI_library_init( PAPI_VER_CURRENT ); if( papi_errno != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__, "PAPI_library_init failed.", 0); } PAPI_CALL(PAPI_thread_init((unsigned long (*)(void)) omp_get_thread_num)); #endif omp_lock_t lock; omp_init_lock(&lock); omp_set_num_threads(num_threads); // create as many CPU threads as there are CUDA devices int numMultipassEvents = 0; #pragma omp parallel { unsigned int cpu_thread_id = omp_get_thread_num(); unsigned int num_cpu_threads = omp_get_num_threads(); int gpu_id = cpu_thread_id % num_gpus; RUNTIME_API_CALL(cudaSetDevice(gpu_id)); #ifdef PAPI int EventSet = PAPI_NULL; long long values[MAX_THREADS]; int j, errno; PAPI_CALL(PAPI_create_eventset(&EventSet)); PRINT(quiet, "CPU thread %d (of %d) uses CUDA device %d @ eventset %d\n", cpu_thread_id, num_cpu_threads, gpu_id, EventSet); int numEventsSuccessfullyAdded = 0; char **eventsSuccessfullyAdded, **metricNames = argv + 1; eventsSuccessfullyAdded = (char **) malloc(event_count * 
sizeof(char *)); if (eventsSuccessfullyAdded == NULL) { fprintf(stderr, "Failed to allocate memory for successfully added events.\n"); test_skip(__FILE__, __LINE__, "", 0); } for (i = 0; i < event_count; i++) { eventsSuccessfullyAdded[i] = (char *) malloc(PAPI_MAX_STR_LEN * sizeof(char)); if (eventsSuccessfullyAdded[i] == NULL) { fprintf(stderr, "Failed to allocate memory for command line argument.\n"); test_skip(__FILE__, __LINE__, "", 0); } } add_events_from_command_line(EventSet, event_count, gpu_id, metricNames, &numEventsSuccessfullyAdded, eventsSuccessfullyAdded, &numMultipassEvents); // Only multiple pass events were provided on the command line if (numEventsSuccessfullyAdded == 0) { fprintf(stderr, "Events provided on the command line could not be added to an EventSet as they require multiple passes.\n"); test_skip(__FILE__, __LINE__, "", 0); } PAPI_CALL(PAPI_start(EventSet)); #endif VectorAddSubtract(50000*(cpu_thread_id+1), quiet); // gpu work #ifdef PAPI PAPI_CALL(PAPI_stop(EventSet, values)); PRINT(quiet, "User measured values.\n"); for (j = 0; j < numEventsSuccessfullyAdded; j++) { PRINT(quiet, "%s\t\t%lld\n", eventsSuccessfullyAdded[j], values[j]); } // Free allocated memory for (i = 0; i < event_count; i++) { free(eventsSuccessfullyAdded[i]); } free(eventsSuccessfullyAdded); errno = PAPI_cleanup_eventset(EventSet); if (errno != PAPI_OK) { fprintf(stderr, "PAPI_cleanup_eventset(%d) failed with error %d", EventSet, errno); test_fail(__FILE__, __LINE__, "", errno); } PAPI_CALL(PAPI_destroy_eventset(&EventSet)); #endif } // omp parallel region end if (cudaSuccess != cudaGetLastError()) fprintf(stderr, "%s\n", cudaGetErrorString(cudaGetLastError())); omp_destroy_lock(&lock); #ifdef PAPI PAPI_shutdown(); // Output a note that a multiple pass event was provided on the command line if (numMultipassEvents > 0) { PRINT(quiet, "\033[0;33mNOTE: From the events provided on the command line, an event or events requiring multiple passes was detected and not added 
to the EventSet. Check your events with utils/papi_native_avail.\n\033[0m"); } test_pass(__FILE__); #endif return 0; } papi-papi-7-2-0-t/src/components/cuda/tests/gpu_work.h #include <cuda.h> #include <stdio.h> #define _GW_CALL(call) \ do { \ cudaError_t _status = (call); \ if (_status != cudaSuccess) { \ fprintf(stderr, "%s: %d: " #call "\n", __FILE__, __LINE__); \ } \ } while (0); // Device code __global__ void VecAdd(const int* A, const int* B, int* C, int N) { int i = blockDim.x * blockIdx.x + threadIdx.x; if (i < N) C[i] = A[i] + B[i]; } // Device code __global__ void VecSub(const int* A, const int* B, int* C, int N) { int i = blockDim.x * blockIdx.x + threadIdx.x; if (i < N) C[i] = A[i] - B[i]; } static void initVec(int *vec, int n) { for (int i=0; i< n; i++) vec[i] = i; } static void cleanUp(int *h_A, int *h_B, int *h_C, int *h_D, int *d_A, int *d_B, int *d_C, int *d_D) { if (d_A) _GW_CALL(cudaFree(d_A)); if (d_B) _GW_CALL(cudaFree(d_B)); if (d_C) _GW_CALL(cudaFree(d_C)); if (d_D) _GW_CALL(cudaFree(d_D)); // Free host memory if (h_A) free(h_A); if (h_B) free(h_B); if (h_C) free(h_C); if (h_D) free(h_D); } static void VectorAddSubtract(int N, int quiet) { if (N==0) N = 50000; size_t size = N * sizeof(int); int threadsPerBlock = 0; int blocksPerGrid = 0; int *h_A, *h_B, *h_C, *h_D; int *d_A, *d_B, *d_C, *d_D; int i, sum, diff; int device; cudaGetDevice(&device); // Allocate input vectors h_A and h_B in host memory h_A = (int*)malloc(size); h_B = (int*)malloc(size); h_C = (int*)malloc(size); h_D = (int*)malloc(size); if (h_A == NULL || h_B == NULL || h_C == NULL || h_D == NULL) { fprintf(stderr, "Allocating input vectors failed.\n"); } // Initialize input vectors initVec(h_A, N); initVec(h_B, N); memset(h_C, 0, size); memset(h_D, 0, size); // Allocate vectors in device memory _GW_CALL(cudaMalloc((void**)&d_A, size)); _GW_CALL(cudaMalloc((void**)&d_B, size)); _GW_CALL(cudaMalloc((void**)&d_C, size));
_GW_CALL(cudaMalloc((void**)&d_D, size)); // Copy vectors from host memory to device memory _GW_CALL(cudaMemcpy(d_A, h_A, size, cudaMemcpyHostToDevice)); _GW_CALL(cudaMemcpy(d_B, h_B, size, cudaMemcpyHostToDevice)); // Invoke kernel threadsPerBlock = 256; blocksPerGrid = (N + threadsPerBlock - 1) / threadsPerBlock; if (!quiet) fprintf(stderr, "Launching kernel on device %d: blocks %d, thread/block %d\n", device, blocksPerGrid, threadsPerBlock); VecAdd<<<blocksPerGrid, threadsPerBlock>>>(d_A, d_B, d_C, N); VecSub<<<blocksPerGrid, threadsPerBlock>>>(d_A, d_B, d_D, N); // Copy result from device memory to host memory // h_C contains the result in host memory _GW_CALL(cudaMemcpy(h_C, d_C, size, cudaMemcpyDeviceToHost)); _GW_CALL(cudaMemcpy(h_D, d_D, size, cudaMemcpyDeviceToHost)); if (!quiet) fprintf(stderr, "Kernel launch complete and mem copied back from device %d\n", device); // Verify result for (i = 0; i < N; ++i) { sum = h_A[i] + h_B[i]; diff = h_A[i] - h_B[i]; if (h_C[i] != sum || h_D[i] != diff) { fprintf(stderr, "error: result verification failed\n"); exit(-1); } } cleanUp(h_A, h_B, h_C, h_D, d_A, d_B, d_C, d_D); } papi-papi-7-2-0-t/src/components/cuda/tests/pthreads.cu /** * @file pthreads.cu * @author Anustuv Pal * anustuv@icl.utk.edu */ #include <stdio.h> #include <stdlib.h> #include <pthread.h> #include <cuda.h> #include "gpu_work.h" #ifdef PAPI #include <papi.h> #include "papi_test.h" #define PAPI_CALL(apiFuncCall) \ do { \ int _status = apiFuncCall; \ if (_status != PAPI_OK) { \ fprintf(stderr, "error: function %s failed.", #apiFuncCall); \ test_fail(__FILE__, __LINE__, "", _status); \ } \ } while (0) #endif #define PRINT(quiet, format, args...)
{if (!quiet) {fprintf(stderr, format, ## args);}} int quiet; #define RUNTIME_API_CALL(apiFuncCall) \ do { \ cudaError_t _status = apiFuncCall; \ if (_status != cudaSuccess) { \ fprintf(stderr, "%s:%d: error: function %s failed with error %s.\n", \ __FILE__, __LINE__, #apiFuncCall, cudaGetErrorString(_status));\ exit(EXIT_FAILURE); \ } \ } while (0) #define DRIVER_API_CALL(apiFuncCall) \ do { \ CUresult _status = apiFuncCall; \ if (_status != CUDA_SUCCESS) { \ fprintf(stderr, "%s:%d: error: function %s failed with error %d.\n", \ __FILE__, __LINE__, #apiFuncCall, _status); \ exit(EXIT_FAILURE); \ } \ } while (0) #define MAX_THREADS (32) int numGPUs; int g_event_count; char **g_evt_names; static volatile int global_thread_count = 0; pthread_mutex_t global_mutex; pthread_t tidarr[MAX_THREADS]; CUcontext cuCtx[MAX_THREADS]; pthread_mutex_t lock; // Globals for multiple pass events int numMultipassEvents = 0; /** @class add_events_from_command_line * @brief Try and add each event provided on the command line by the user. * * @param EventSet * A PAPI eventset. * @param totalEventCount * Number of events from the command line. * @param **eventNamesFromCommandLine * Events provided on the command line. * @param gpu_id * NVIDIA device index. * @param *numEventsSuccessfullyAdded * Total number of successfully added events. * @param **eventsSuccessfullyAdded * Events that we are able to add to the EventSet. * @param *numMultipassEvents * Counter to see if a multiple pass event was provided on the command line. 
*/ static void add_events_from_command_line(int EventSet, int totalEventCount, char **eventNamesFromCommandLine, int gpu_id, int *numEventsSuccessfullyAdded, char **eventsSuccessfullyAdded, int *numMultipassEvents) { int i; for (i = 0; i < totalEventCount; i++) { char tmpEventName[PAPI_MAX_STR_LEN]; int strLen = snprintf(tmpEventName, PAPI_MAX_STR_LEN, "%s:device=%d", eventNamesFromCommandLine[i], gpu_id); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { fprintf(stderr, "Failed to fully write event name with appended device qualifier.\n"); test_skip(__FILE__, __LINE__, "", 0); } int papi_errno = PAPI_add_named_event(EventSet, tmpEventName); if (papi_errno != PAPI_OK) { if (papi_errno != PAPI_EMULPASS) { fprintf(stderr, "Unable to add event %s to the EventSet with error code %d.\n", tmpEventName, papi_errno); test_skip(__FILE__, __LINE__, "", 0); } // Handle multiple pass events (*numMultipassEvents)++; continue; } // Handle successfully added events strLen = snprintf(eventsSuccessfullyAdded[(*numEventsSuccessfullyAdded)], PAPI_MAX_STR_LEN, "%s", tmpEventName); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { fprintf(stderr, "Failed to fully write successfully added event.\n"); test_skip(__FILE__, __LINE__, "", 0); } (*numEventsSuccessfullyAdded)++; } return; } void *thread_gpu(void * idx) { int tid = *((int*) idx); unsigned long gettid = (unsigned long) pthread_self(); #ifdef PAPI int gpuid = tid % numGPUs; int i; int EventSet = PAPI_NULL; long long values[MAX_THREADS]; PAPI_CALL(PAPI_create_eventset(&EventSet)); DRIVER_API_CALL(cuCtxSetCurrent(cuCtx[tid])); PRINT(quiet, "This is idx %d thread %lu - using GPU %d context %p!\n", tid, gettid, gpuid, cuCtx[tid]); int numEventsSuccessfullyAdded = 0; char **eventsSuccessfullyAdded; eventsSuccessfullyAdded = (char **) malloc(g_event_count * sizeof(char *)); if (eventsSuccessfullyAdded == NULL) { fprintf(stderr, "Failed to allocate memory for successfully added events.\n"); test_skip(__FILE__, __LINE__, "", 0); } for (i = 
0; i < g_event_count; i++) { eventsSuccessfullyAdded[i] = (char *) malloc(PAPI_MAX_STR_LEN * sizeof(char)); if (eventsSuccessfullyAdded[i] == NULL) { fprintf(stderr, "Failed to allocate memory for command line argument.\n"); test_skip(__FILE__, __LINE__, "", 0); } } pthread_mutex_lock(&global_mutex); add_events_from_command_line(EventSet, g_event_count, g_evt_names, gpuid, &numEventsSuccessfullyAdded, eventsSuccessfullyAdded, &numMultipassEvents); // Only multiple pass events were provided on the command line if (numEventsSuccessfullyAdded == 0) { fprintf(stderr, "Events provided on the command line could not be added to an EventSet as they require multiple passes.\n"); test_skip(__FILE__, __LINE__, "", 0); } ++global_thread_count; pthread_mutex_unlock(&global_mutex); while(global_thread_count < numGPUs); PAPI_CALL(PAPI_start(EventSet)); #endif VectorAddSubtract(50000*(tid+1), quiet); // gpu work #ifdef PAPI PAPI_CALL(PAPI_stop(EventSet, values)); PRINT(quiet, "User measured values in thread id %d.\n", tid); for (i = 0; i < numEventsSuccessfullyAdded; i++) { PRINT(quiet, "%s\t\t%lld\n", eventsSuccessfullyAdded[i], values[i]); } // Free allocated memory for (i = 0; i < g_event_count; i++) { free(eventsSuccessfullyAdded[i]); } free(eventsSuccessfullyAdded); PAPI_CALL(PAPI_cleanup_eventset(EventSet)); PAPI_CALL(PAPI_destroy_eventset(&EventSet)); #endif return NULL; } int main(int argc, char **argv) { quiet = 0; #ifdef PAPI char *test_quiet = getenv("PAPI_CUDA_TEST_QUIET"); if (test_quiet) quiet = (int) strtol(test_quiet, (char**) NULL, 10); g_event_count = argc - 1; /* if no events passed at command line, just report test skipped. */ if (g_event_count == 0) { fprintf(stderr, "No eventnames specified at command line.\n"); test_skip(__FILE__, __LINE__, "", 0); } g_evt_names = argv + 1; #endif int rc, i; int tid[MAX_THREADS]; RUNTIME_API_CALL(cudaGetDeviceCount(&numGPUs)); PRINT(quiet, "No. 
of GPUs = %d\n", numGPUs); if (numGPUs < 1) { fprintf(stderr, "No GPUs found on system.\n"); #ifdef PAPI test_skip(__FILE__, __LINE__, "", 0); #endif return 0; } if (numGPUs > MAX_THREADS) numGPUs = MAX_THREADS; PRINT(quiet, "No. of threads to launch = %d\n", numGPUs); #ifdef PAPI pthread_mutex_init(&global_mutex, NULL); int papi_errno = PAPI_library_init( PAPI_VER_CURRENT ); if( papi_errno != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__, "PAPI_library_init failed.", 0); } // Point PAPI to function that gets the thread id PAPI_CALL(PAPI_thread_init((unsigned long (*)(void)) pthread_self)); #endif // Launch the threads for(i = 0; i < numGPUs; i++) { tid[i] = i; DRIVER_API_CALL(cuCtxCreate(&(cuCtx[i]), 0, i % numGPUs)); DRIVER_API_CALL(cuCtxPopCurrent(&(cuCtx[i]))); rc = pthread_create(&tidarr[i], NULL, thread_gpu, &(tid[i])); if(rc) { fprintf(stderr, "\n ERROR: return code from pthread_create is %d \n", rc); exit(1); } PRINT(quiet, "\n Main thread %lu. Created new thread (%lu) in iteration %d ...\n", (unsigned long)pthread_self(), (unsigned long)tidarr[i], i); } // Join all threads when complete for (i = 0; i < numGPUs; i++) { pthread_join(tidarr[i], NULL); PRINT(quiet, "IDX: %d: TID: %lu: Done! Joined main thread.\n", i, (unsigned long)tidarr[i]); } // Destroy all CUDA contexts for all threads/GPUs for (i = 0; i < numGPUs; i++) { DRIVER_API_CALL(cuCtxDestroy(cuCtx[i])); } #ifdef PAPI PAPI_shutdown(); PRINT(quiet, "Main thread exit!\n"); // Output a note that a multiple pass event was provided on the command line if (numMultipassEvents > 0) { PRINT(quiet, "\033[0;33mNOTE: From the events provided on the command line, an event or events requiring multiple passes was detected and not added to the EventSet. 
Check your events with utils/papi_native_avail.\n\033[0m"); } test_pass(__FILE__); #endif return 0; } papi-papi-7-2-0-t/src/components/cuda/tests/pthreads_noCuCtx.cu000066400000000000000000000226241502707512200245060ustar00rootroot00000000000000/** * @file pthreads_noCuCtx.cu * @author Anustuv Pal * anustuv@icl.utk.edu */ #include #include #include #include #include "gpu_work.h" #ifdef PAPI #include #include "papi_test.h" #define PAPI_CALL(apiFuncCall) \ do { \ int _status = apiFuncCall; \ if (_status != PAPI_OK) { \ fprintf(stderr, "error: function %s failed.", #apiFuncCall); \ test_fail(__FILE__, __LINE__, "", _status); \ } \ } while (0) #endif #define PRINT(quiet, format, args...) {if (!quiet) {fprintf(stderr, format, ## args);}} int quiet; #define RUNTIME_API_CALL(apiFuncCall) \ do { \ cudaError_t _status = apiFuncCall; \ if (_status != cudaSuccess) { \ fprintf(stderr, "%s:%d: error: function %s failed with error %s.\n", \ __FILE__, __LINE__, #apiFuncCall, cudaGetErrorString(_status));\ exit(EXIT_FAILURE); \ } \ } while (0) #define DRIVER_API_CALL(apiFuncCall) \ do { \ CUresult _status = apiFuncCall; \ if (_status != CUDA_SUCCESS) { \ fprintf(stderr, "%s:%d: error: function %s failed with error %d.\n", \ __FILE__, __LINE__, #apiFuncCall, _status); \ exit(EXIT_FAILURE); \ } \ } while (0) #define MAX_THREADS (32) int numGPUs; int g_event_count; char **g_evt_names; static volatile int global_thread_count = 0; pthread_mutex_t global_mutex; pthread_t tidarr[MAX_THREADS]; CUcontext cuCtx[MAX_THREADS]; pthread_mutex_t lock; // Globals for multiple pass events int numMultipassEvents = 0; /** @class add_events_from_command_line * @brief Try and add each event provided on the command line by the user. * * @param EventSet * A PAPI eventset. * @param totalEventCount * Number of events from the command line. * @param **eventNamesFromCommandLine * Events provided on the command line. * @param gpu_id * NVIDIA device index. 
* @param *numEventsSuccessfullyAdded * Total number of successfully added events. * @param **eventsSuccessfullyAdded * Events that we are able to add to the EventSet. * @param *numMultipassEvents * Counter to see if a multiple pass event was provided on the command line. */ static void add_events_from_command_line(int EventSet, int totalEventCount, char **eventNamesFromCommandLine, int gpu_id, int *numEventsSuccessfullyAdded, char **eventsSuccessfullyAdded, int *numMultipassEvents) { int i; for (i = 0; i < totalEventCount; i++) { char tmpEventName[PAPI_MAX_STR_LEN]; int strLen = snprintf(tmpEventName, PAPI_MAX_STR_LEN, "%s:device=%d", eventNamesFromCommandLine[i], gpu_id); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { fprintf(stderr, "Failed to fully write event name with appended device qualifier.\n"); test_skip(__FILE__, __LINE__, "", 0); } int papi_errno = PAPI_add_named_event(EventSet, tmpEventName); if (papi_errno != PAPI_OK) { if (papi_errno != PAPI_EMULPASS) { fprintf(stderr, "Unable to add event %s to the EventSet with error code %d.\n", tmpEventName, papi_errno); test_skip(__FILE__, __LINE__, "", 0); } // Handle multiple pass events (*numMultipassEvents)++; continue; } // Handle successfully added events strLen = snprintf(eventsSuccessfullyAdded[(*numEventsSuccessfullyAdded)], PAPI_MAX_STR_LEN, "%s", tmpEventName); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { fprintf(stderr, "Failed to fully write successfully added event.\n"); test_skip(__FILE__, __LINE__, "", 0); } (*numEventsSuccessfullyAdded)++; } return; } void *thread_gpu(void * idx) { int tid = *((int*) idx); unsigned long gettid = (unsigned long) pthread_self(); #ifdef PAPI int gpuid = tid % numGPUs; int i; int EventSet = PAPI_NULL; long long values[MAX_THREADS]; PAPI_CALL(PAPI_create_eventset(&EventSet)); RUNTIME_API_CALL(cudaSetDevice(gpuid)); PRINT(quiet, "This is idx %d thread %lu - using GPU %d\n", tid, gettid, gpuid); int numEventsSuccessfullyAdded = 0; char **eventsSuccessfullyAdded; 
eventsSuccessfullyAdded = (char **) malloc(g_event_count * sizeof(char *)); if (eventsSuccessfullyAdded == NULL) { fprintf(stderr, "Failed to allocate memory for successfully added events.\n"); test_skip(__FILE__, __LINE__, "", 0); } for (i = 0; i < g_event_count; i++) { eventsSuccessfullyAdded[i] = (char *) malloc(PAPI_MAX_STR_LEN * sizeof(char)); if (eventsSuccessfullyAdded[i] == NULL) { fprintf(stderr, "Failed to allocate memory for command line argument.\n"); test_skip(__FILE__, __LINE__, "", 0); } } pthread_mutex_lock(&global_mutex); add_events_from_command_line(EventSet, g_event_count, g_evt_names, gpuid, &numEventsSuccessfullyAdded, eventsSuccessfullyAdded, &numMultipassEvents); // Only multiple pass events were provided on the command line if (numEventsSuccessfullyAdded == 0) { fprintf(stderr, "Events provided on the command line could not be added to an EventSet as they require multiple passes.\n"); test_skip(__FILE__, __LINE__, "", 0); } ++global_thread_count; pthread_mutex_unlock(&global_mutex); while(global_thread_count < numGPUs); PAPI_CALL(PAPI_start(EventSet)); #endif VectorAddSubtract(50000*(tid+1), quiet); // gpu work #ifdef PAPI PAPI_CALL(PAPI_stop(EventSet, values)); PRINT(quiet, "User measured values in thread id %d.\n", tid); for (i = 0; i < numEventsSuccessfullyAdded; i++) { PRINT(quiet, "%s\t\t%lld\n", eventsSuccessfullyAdded[i], values[i]); } // Free allocated memory for (i = 0; i < g_event_count; i++) { free(eventsSuccessfullyAdded[i]); } free(eventsSuccessfullyAdded); PAPI_CALL(PAPI_cleanup_eventset(EventSet)); PAPI_CALL(PAPI_destroy_eventset(&EventSet)); #endif return NULL; } int main(int argc, char **argv) { quiet = 0; #ifdef PAPI char *test_quiet = getenv("PAPI_CUDA_TEST_QUIET"); if (test_quiet) quiet = (int) strtol(test_quiet, (char**) NULL, 10); g_event_count = argc - 1; /* if no events passed at command line, just report test skipped. 
*/ if (g_event_count == 0) { fprintf(stderr, "No eventnames specified at command line.\n"); test_skip(__FILE__, __LINE__, "", 0); } g_evt_names = argv + 1; #endif int rc, i; int tid[MAX_THREADS]; RUNTIME_API_CALL(cudaGetDeviceCount(&numGPUs)); PRINT(quiet, "No. of GPUs = %d\n", numGPUs); if (numGPUs < 1) { fprintf(stderr, "No GPUs found on system.\n"); #ifdef PAPI test_skip(__FILE__, __LINE__, "", 0); #endif return 0; } if (numGPUs > MAX_THREADS) numGPUs = MAX_THREADS; PRINT(quiet, "No. of threads to launch = %d\n", numGPUs); #ifdef PAPI pthread_mutex_init(&global_mutex, NULL); int papi_errno = PAPI_library_init( PAPI_VER_CURRENT ); if( papi_errno != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__, "PAPI_library_init failed.", 0); } // Point PAPI to function that gets the thread id PAPI_CALL(PAPI_thread_init((unsigned long (*)(void)) pthread_self)); #endif // Launch the threads for(i = 0; i < numGPUs; i++) { tid[i] = i; RUNTIME_API_CALL(cudaSetDevice(tid[i] % numGPUs)); RUNTIME_API_CALL(cudaFree(NULL)); rc = pthread_create(&tidarr[i], NULL, thread_gpu, &(tid[i])); if(rc) { fprintf(stderr, "\n ERROR: return code from pthread_create is %d \n", rc); exit(1); } PRINT(quiet, "\n Main thread %lu. Created new thread (%lu) in iteration %d ...\n", (unsigned long)pthread_self(), (unsigned long)tidarr[i], i); } // Join all threads when complete for (i = 0; i < numGPUs; i++) { pthread_join(tidarr[i], NULL); PRINT(quiet, "IDX: %d: TID: %lu: Done! Joined main thread.\n", i, (unsigned long)tidarr[i]); } #ifdef PAPI PAPI_shutdown(); PRINT(quiet, "Main thread exit!\n"); // Output a note that a multiple pass event was provided on the command line if (numMultipassEvents > 0) { PRINT(quiet, "\033[0;33mNOTE: From the events provided on the command line, an event or events requiring multiple passes was detected and not added to the EventSet. 
Check your events with utils/papi_native_avail.\n\033[0m"); } test_pass(__FILE__); #endif return 0; } papi-papi-7-2-0-t/src/components/cuda/tests/runtest.sh000066400000000000000000000064661502707512200227460ustar00rootroot00000000000000#!/bin/bash export PAPI_CUDA_TEST_QUIET=1 # Comment this line to see standard output from tests evt_names=("cuda:::dram__bytes_read:stat=sum:device=0" \ "cuda:::sm__cycles_active:stat=sum:device=0" \ "cuda:::smsp__warps_launched:stat=sum:device=0") multi_gpu_evt_names=("cuda:::dram__bytes_read:stat=sum" \ "cuda:::sm__cycles_active:stat=sum" \ "cuda:::smsp__warps_launched:stat=sum") multi_pass_evt_name="cuda:::gpu__compute_memory_access_throughput_internal_activity.pct_of_peak_sustained_elapsed:stat=max:device=0" concurrent_evt_names=("cuda:::sm__cycles_active:stat=sum:device=" \ "cuda:::sm__cycles_elapsed:stat=max:device=") make test_multipass_event_fail echo -e "Running: \e[36m./test_multipass_event_fail\e[0m" "${evt_names[@]}" $multi_pass_evt_name ./test_multipass_event_fail "${evt_names[@]}" $multi_pass_evt_name echo -e "-------------------------------------\n" make test_multi_read_and_reset echo -e "Running: \e[36m./test_multi_read_and_reset\e[0m" "${evt_names[@]}" ./test_multi_read_and_reset "${evt_names[@]}" echo -e "-------------------------------------\n" make test_2thr_1gpu_not_allowed echo -e "Running: \e[36m./test_2thr_1gpu_not_allowed\e[0m" "${evt_names[@]:0:2}" ./test_2thr_1gpu_not_allowed "${evt_names[@]:0:2}" echo -e "-------------------------------------\n" make HelloWorld echo -e "Running: \e[36m./HelloWorld\e[0m" "${evt_names[@]}" ./HelloWorld "${evt_names[@]}" echo -e "-------------------------------------\n" make HelloWorld_noCuCtx echo -e "Running: \e[36m./HelloWorld_noCuCtx\e[0m" "${evt_names[@]}" ./HelloWorld_noCuCtx "${evt_names[@]}" echo -e "-------------------------------------\n" make simpleMultiGPU echo -e "Running: \e[36m./simpleMultiGPU\e[0m" "${multi_gpu_evt_names[@]}" ./simpleMultiGPU 
"${multi_gpu_evt_names[@]}" echo -e "-------------------------------------\n" make simpleMultiGPU_noCuCtx echo -e "Running: \e[36m./simpleMultiGPU_noCuCtx\e[0m" "${multi_gpu_evt_names[@]}" ./simpleMultiGPU_noCuCtx "${multi_gpu_evt_names[@]}" echo -e "-------------------------------------\n" make pthreads_noCuCtx echo -e "Running: \e[36m./pthreads_noCuCtx\e[0m" "${multi_gpu_evt_names[@]}" ./pthreads_noCuCtx "${multi_gpu_evt_names[@]}" echo -e "-------------------------------------\n" make pthreads echo -e "Running: \e[36m./pthreads\e[0m" "${multi_gpu_evt_names[@]}" ./pthreads "${multi_gpu_evt_names[@]}" echo -e "-------------------------------------\n" make cudaOpenMP echo -e "Running: \e[36m./cudaOpenMP\e[0m" "${multi_gpu_evt_names[@]}" ./cudaOpenMP "${multi_gpu_evt_names[@]}" echo -e "-------------------------------------\n" make cudaOpenMP_noCuCtx echo -e "Running: \e[36m./cudaOpenMP_noCuCtx\e[0m" "${multi_gpu_evt_names[@]}" ./cudaOpenMP_noCuCtx "${multi_gpu_evt_names[@]}" echo -e "-------------------------------------\n" make concurrent_profiling_noCuCtx echo -e "Running: \e[36m./concurrent_profiling_noCuCtx\e[0m" "${concurrent_evt_names[@]}" ./concurrent_profiling_noCuCtx "${concurrent_evt_names[@]}" echo -e "-------------------------------------\n" make concurrent_profiling echo -e "Running: \e[36m./concurrent_profiling\e[0m" "${concurrent_evt_names[@]}" ./concurrent_profiling "${concurrent_evt_names[@]}" echo -e "-------------------------------------\n" # Finalize tests unset PAPI_CUDA_TEST_QUIET papi-papi-7-2-0-t/src/components/cuda/tests/simpleMultiGPU.cu000066400000000000000000000467051502707512200241170ustar00rootroot00000000000000/* * PAPI Multiple GPU example. This example is taken from the NVIDIA * documentation (Copyright 1993-2013 NVIDIA Corporation) and has been * adapted to show the use of CUPTI and PAPI in collecting event * counters for multiple GPU contexts. PAPI Team (2015) * * Update, July/2021, for CUPTI 11. 
This version is for the CUPTI 11 * API, which PAPI uses for Nvidia GPUs with Compute Capability >= * 7.0. It will only work on cuda distributions of 10.0 or better. * Similar to legacy CUpti API, PAPI is informed of the CUcontexts * that will be used to execute kernels at the time of adding PAPI * events for that device; as shown below. */ /* * This software contains source code provided by NVIDIA Corporation * * According to the Nvidia EULA (compute 5.5 version) * http://developer.download.nvidia.com/compute/cuda/5_5/rel/docs/EULA.pdf * * Chapter 2. NVIDIA CORPORATION CUDA SAMPLES END USER LICENSE AGREEMENT * 2.1.1. Source Code * Developer shall have the right to modify and create derivative works with the Source * Code. Developer shall own any derivative works ("Derivatives") it creates to the Source * Code, provided that Developer uses the Materials in accordance with the terms and * conditions of this Agreement. Developer may distribute the Derivatives, provided that * all NVIDIA copyright notices and trademarks are propagated and used properly and * the Derivatives include the following statement: “This software contains source code * provided by NVIDIA Corporation.†*/ /* * This application demonstrates how to use the CUDA API to use multiple GPUs, * with an emphasis on simple illustration of the techniques (not on performance). * * Note that in order to detect multiple GPUs in your system you have to disable * SLI in the nvidia control panel. Otherwise only one GPU is visible to the * application. On the other side, you can still extend your desktop to screens * attached to both GPUs. * * CUDA Context notes for CUPTI_11: Although a cudaSetDevice() will create a * primary context for the device that allows kernel execution; PAPI cannot * use a primary context to control the Nvidia Performance Profiler. 
* Applications must create a context using cuCtxCreate() that will execute * the kernel, this must be done prior to the PAPI_add_events() invocation in * the code below. When multiple GPUs are in use, each requires its own * context, and that context should be active when PAPI_events are added for * each device. This means using seperate PAPI_add_events() for each device, * as we do here. */ // System includes #include // CUDA runtime #include #include #ifdef PAPI #include "papi.h" #include "papi_test.h" #endif #ifndef MAX #define MAX(a,b) (a > b ? a : b) #endif #include "simpleMultiGPU.h" // ////////////////////////////////////////////////////////////////////////////// // Data configuration // ////////////////////////////////////////////////////////////////////////////// const int MAX_GPU_COUNT = 32; const int DATA_N = 48576 * 32; #ifdef PAPI const int MAX_NUM_EVENTS = 32; #endif #define CHECK_CU_ERROR(err, cufunc) \ if (err != CUDA_SUCCESS) { fprintf (stderr, "Error %d for CUDA Driver API function '%s'\n", err, cufunc); return -1; } #define CHECK_CUDA_ERROR(err) \ if (err != cudaSuccess) { fprintf (stderr, "%s:%i Error %d for CUDA [%s]\n", __FILE__, __LINE__, err, cudaGetErrorString(err) ); return -1; } #define CHECK_CUPTI_ERROR(err, cuptifunc) \ if (err != CUPTI_SUCCESS) { const char *errStr; cuptiGetResultString(err, &errStr); \ fprintf (stderr, "%s:%i Error %d [%s] for CUPTI API function '%s'\n", __FILE__, __LINE__, err, errStr, cuptifunc); return -1; } #define PRINT(quiet, format, args...) {if (!quiet) {fprintf(stderr, format, ## args);}} // ////////////////////////////////////////////////////////////////////////////// // Simple reduction kernel. 
// Refer to the 'reduction' CUDA SDK sample describing // reduction optimization strategies // ////////////////////////////////////////////////////////////////////////////// __global__ static void reduceKernel( float *d_Result, float *d_Input, int N ) { const int tid = blockIdx.x * blockDim.x + threadIdx.x; const int threadN = gridDim.x * blockDim.x; float sum = 0; for( int pos = tid; pos < N; pos += threadN ) sum += d_Input[pos]; d_Result[tid] = sum; } /** @class add_events_from_command_line * @brief Try and add each event provided on the command line by the user. * * @param EventSet * A PAPI eventset. * @param totalEventCount * Number of events from the command line. * @param **eventsFromCommandLine * Events provided on the command line. * @param gpu_id * NVIDIA device index. * @param *numEventsSuccessfullyAdded * Total number of successfully added events. * @param **eventsSuccessfullyAdded * Events that we are able to add to the EventSet. * @param *numMultipassEvents * Counter to see if a multiple pass event was provided on the command line. 
*/ static void add_events_from_command_line(int EventSet, int totalEventCount, char **eventNamesFromCommandLine, int gpu_id, int *numEventsSuccessfullyAdded, char **eventsSuccessfullyAdded, int *numMultipassEvents) { int i; for (i = 0; i < totalEventCount; i++) { char tmpEventName[PAPI_MAX_STR_LEN]; int strLen = snprintf(tmpEventName, PAPI_MAX_STR_LEN, "%s:device=%d", eventNamesFromCommandLine[i], gpu_id); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { fprintf(stderr, "Failed to fully write event name with appended device qualifier.\n"); test_skip(__FILE__, __LINE__, "", 0); } int papi_errno = PAPI_add_named_event(EventSet, tmpEventName); if (papi_errno != PAPI_OK) { if (papi_errno != PAPI_EMULPASS) { fprintf(stderr, "Unable to add event %s to the EventSet with error code %d.\n", tmpEventName, papi_errno); test_skip(__FILE__, __LINE__, "", 0); } // Handle multiple pass events (*numMultipassEvents)++; continue; } // Handle successfully added events strLen = snprintf(eventsSuccessfullyAdded[(*numEventsSuccessfullyAdded)], PAPI_MAX_STR_LEN, "%s", tmpEventName); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { fprintf(stderr, "Failed to fully write successfully added event.\n"); test_skip(__FILE__, __LINE__, "", 0); } (*numEventsSuccessfullyAdded)++; } return; } // ////////////////////////////////////////////////////////////////////////////// // Program main // ////////////////////////////////////////////////////////////////////////////// int main( int argc, char **argv ) { // Solver config TGPUplan plan[MAX_GPU_COUNT]; // GPU reduction results float h_SumGPU[MAX_GPU_COUNT]; float sumGPU; double sumCPU, diff; int i, j, gpuBase, num_gpus; const int BLOCK_N = 32; const int THREAD_N = 256; const int ACCUM_N = BLOCK_N * THREAD_N; CUcontext ctx[MAX_GPU_COUNT]; CUcontext poppedCtx; char *test_quiet = getenv("PAPI_CUDA_TEST_QUIET"); int quiet = 0; if (test_quiet) quiet = (int) strtol(test_quiet, (char**) NULL, 10); PRINT( quiet, "Starting simpleMultiGPU\n" ); #ifdef PAPI int 
event_count = argc - 1; /* if no events passed at command line, just report test skipped. */ if (event_count == 0) { fprintf(stderr, "No eventnames specified at command line.\n"); test_skip(__FILE__, __LINE__, "", 0); } /* PAPI Initialization must occur before any context creation/manipulation. */ /* This is to ensure PAPI can monitor CUpti library calls. */ int papi_errno = PAPI_library_init( PAPI_VER_CURRENT ); if( papi_errno != PAPI_VER_CURRENT ) { fprintf( stderr, "PAPI_library_init failed\n" ); exit(-1); } printf( "PAPI version: %d.%d.%d\n", PAPI_VERSION_MAJOR( PAPI_VERSION ), PAPI_VERSION_MINOR( PAPI_VERSION ), PAPI_VERSION_REVISION( PAPI_VERSION ) ); #endif // Report on the available CUDA devices int computeCapabilityMajor = 0, computeCapabilityMinor = 0; int runtimeVersion = 0, driverVersion = 0; char deviceName[PAPI_MIN_STR_LEN]; CUdevice device[MAX_GPU_COUNT]; CHECK_CUDA_ERROR( cudaGetDeviceCount( &num_gpus ) ); if( num_gpus > MAX_GPU_COUNT ) num_gpus = MAX_GPU_COUNT; PRINT( quiet, "CUDA-capable device count: %i\n", num_gpus ); for ( i=0; i>> ( plan[i].d_Sum, plan[i].d_Data, plan[i].dataN ); if ( cudaGetLastError() != cudaSuccess ) { printf( "reduceKernel() execution failed (GPU %d).\n", i ); exit(EXIT_FAILURE); } // Read back GPU results CHECK_CUDA_ERROR( cudaMemcpyAsync( plan[i].h_Sum_from_device, plan[i].d_Sum, ACCUM_N * sizeof( float ), cudaMemcpyDeviceToHost, plan[i].stream ) ); // Popping a context can change the device to match the previous context. CHECK_CU_ERROR( cuCtxPopCurrent(&(ctx[i])), "cuCtxPopCurrent" ); } // Process GPU results PRINT( quiet, "Process GPU results on %d GPUs...\n", num_gpus ); for( i = 0; i < num_gpus; i++ ) { float sum; // Pushing a context implicitly sets the device for which it was created. 
CHECK_CU_ERROR(cuCtxPushCurrent(ctx[i]), "cuCtxPushCurrent"); // Wait for all operations to finish cudaStreamSynchronize( plan[i].stream ); // Finalize GPU reduction for current subvector sum = 0; for( j = 0; j < ACCUM_N; j++ ) { sum += plan[i].h_Sum_from_device[j]; } *( plan[i].h_Sum ) = ( float ) sum; // Popping a context can change the device to match the previous context. CHECK_CU_ERROR( cuCtxPopCurrent(&(ctx[i])), "cuCtxPopCurrent" ); } double gpuTime = GetTimer(); #ifdef PAPI for ( i=0; i %s \n", values[i], eventsSuccessfullyAdded[i] ); papi_errno = PAPI_cleanup_eventset( EventSet ); if( papi_errno != PAPI_OK ) fprintf( stderr, "PAPI_cleanup_eventset failed\n" ); papi_errno = PAPI_destroy_eventset( &EventSet ); if( papi_errno != PAPI_OK ) fprintf( stderr, "PAPI_destroy_eventset failed\n" ); PAPI_shutdown(); #endif sumGPU = 0; for( i = 0; i < num_gpus; i++ ) { sumGPU += h_SumGPU[i]; } PRINT( quiet, " GPU Processing time: %f (ms)\n", gpuTime ); // Compute on Host CPU PRINT( quiet, "Computing the same result with Host CPU...\n" ); StartTimer(); sumCPU = 0; for( i = 0; i < num_gpus; i++ ) { for( j = 0; j < plan[i].dataN; j++ ) { sumCPU += plan[i].h_Data[j]; } } double cpuTime = GetTimer(); if (gpuTime > 0) { PRINT( quiet, " CPU Processing time: %f (ms) (speedup %.2fX)\n", cpuTime, (cpuTime/gpuTime) ); } else { PRINT( quiet, " CPU Processing time: %f (ms)\n", cpuTime); } // Compare GPU and CPU results PRINT( quiet, "Comparing GPU and Host CPU results...\n" ); diff = fabs( sumCPU - sumGPU ) / fabs( sumCPU ); PRINT( quiet, " GPU sum: %f\n CPU sum: %f\n", sumGPU, sumCPU ); PRINT( quiet, " Relative difference: %E \n", diff ); // Cleanup and shutdown for( i = 0; i < num_gpus; i++ ) { CHECK_CUDA_ERROR( cudaFreeHost( plan[i].h_Sum_from_device ) ); CHECK_CUDA_ERROR( cudaFreeHost( plan[i].h_Data ) ); CHECK_CUDA_ERROR( cudaFree( plan[i].d_Sum ) ); CHECK_CUDA_ERROR( cudaFree( plan[i].d_Data ) ); // Shut down this GPU CHECK_CUDA_ERROR( cudaStreamDestroy( plan[i].stream ) ); 
CHECK_CU_ERROR( cuCtxDestroy(ctx[i]), "cuCtxDestroy"); } //Free allocated memory for (i = 0; i < event_count; i++) { free(eventsSuccessfullyAdded[i]); } free(eventsSuccessfullyAdded); #ifdef PAPI // Output a note that a multiple pass event was provided on the command line if (numMultipassEvents > 0) { PRINT(quiet, "\033[0;33mNOTE: From the events provided on the command line, an event or events requiring multiple passes was detected and not added to the EventSet. Check your events with utils/papi_native_avail.\n\033[0m"); } if ( diff < 1e-5 ) test_pass(__FILE__); else test_fail(__FILE__, __LINE__, "Result of GPU calculation doesn't match CPU.", PAPI_EINVAL); #endif return 0; } papi-papi-7-2-0-t/src/components/cuda/tests/simpleMultiGPU.h000066400000000000000000000037311502707512200237270ustar00rootroot00000000000000/* * PAPI Multiple GPU example. This example is taken from the NVIDIA * documentation (Copyright 1993-2013 NVIDIA Corporation) and has been * adapted to show the use of CUPTI and PAPI in collecting event * counters for multiple GPU contexts. PAPI Team (2015) */ /* * This software contains source code provided by NVIDIA Corporation * * According to the Nvidia EULA (compute 5.5 version) * http://developer.download.nvidia.com/compute/cuda/5_5/rel/docs/EULA.pdf * * Chapter 2. NVIDIA CORPORATION CUDA SAMPLES END USER LICENSE AGREEMENT * 2.1.1. Source Code * Developer shall have the right to modify and create derivative works with the Source * Code. Developer shall own any derivative works ("Derivatives") it creates to the Source * Code, provided that Developer uses the Materials in accordance with the terms and * conditions of this Agreement. 
Developer may distribute the Derivatives, provided that * all NVIDIA copyright notices and trademarks are propagated and used properly and * the Derivatives include the following statement: “This software contains source code * provided by NVIDIA Corporation.†*/ /* * This application demonstrates how to use the CUDA API to use multiple GPUs. * * Note that in order to detect multiple GPUs in your system you have to disable * SLI in the nvidia control panel. Otherwise only one GPU is visible to the * application. On the other side, you can still extend your desktop to screens * attached to both GPUs. */ #ifndef SIMPLEMULTIGPU_H #define SIMPLEMULTIGPU_H typedef struct { //Host-side input data int dataN; float *h_Data; //Partial sum for this GPU float *h_Sum; //Device buffers float *d_Data,*d_Sum; //Reduction copied back from GPU float *h_Sum_from_device; //Stream for asynchronous command execution cudaStream_t stream; } TGPUplan; extern "C" void launch_reduceKernel(float *d_Result, float *d_Input, int N, int BLOCK_N, int THREAD_N, cudaStream_t &s); #endif papi-papi-7-2-0-t/src/components/cuda/tests/simpleMultiGPU_noCuCtx.cu000066400000000000000000000440071502707512200255530ustar00rootroot00000000000000/* * PAPI Multiple GPU example. This example is taken from the NVIDIA * documentation (Copyright 1993-2013 NVIDIA Corporation) and has been * adapted to show the use of CUPTI and PAPI in collecting event * counters for multiple GPU contexts. PAPI Team (2015) * * Update, July/2021, for CUPTI 11. This version is for the CUPTI 11 * API, which PAPI uses for Nvidia GPUs with Compute Capability >= * 7.0. It will only work on cuda distributions of 10.0 or better. * Similar to legacy CUpti API, PAPI is informed of the CUcontexts * that will be used to execute kernels at the time of adding PAPI * events for that device; as shown below. 
*/ /* * This software contains source code provided by NVIDIA Corporation * * According to the Nvidia EULA (compute 5.5 version) * http://developer.download.nvidia.com/compute/cuda/5_5/rel/docs/EULA.pdf * * Chapter 2. NVIDIA CORPORATION CUDA SAMPLES END USER LICENSE AGREEMENT * 2.1.1. Source Code * Developer shall have the right to modify and create derivative works with the Source * Code. Developer shall own any derivative works ("Derivatives") it creates to the Source * Code, provided that Developer uses the Materials in accordance with the terms and * conditions of this Agreement. Developer may distribute the Derivatives, provided that * all NVIDIA copyright notices and trademarks are propagated and used properly and * the Derivatives include the following statement: “This software contains source code * provided by NVIDIA Corporation.†*/ /* * This application demonstrates how to use the CUDA API to use multiple GPUs, * with an emphasis on simple illustration of the techniques (not on performance). * * Note that in order to detect multiple GPUs in your system you have to disable * SLI in the nvidia control panel. Otherwise only one GPU is visible to the * application. On the other side, you can still extend your desktop to screens * attached to both GPUs. * * CUDA Context notes for CUPTI_11: Although a cudaSetDevice() will create a * primary context for the device that allows kernel execution; PAPI cannot * use a primary context to control the Nvidia Performance Profiler. * Applications must create a context using cuCtxCreate() that will execute * the kernel, this must be done prior to the PAPI_add_events() invocation in * the code below. When multiple GPUs are in use, each requires its own * context, and that context should be active when PAPI_events are added for * each device. This means using seperate PAPI_add_events() for each device, * as we do here. 
*/ // System includes #include // CUDA runtime #include #include #ifdef PAPI #include "papi.h" #include "papi_test.h" #endif #ifndef MAX #define MAX(a,b) (a > b ? a : b) #endif #include "simpleMultiGPU.h" // ////////////////////////////////////////////////////////////////////////////// // Data configuration // ////////////////////////////////////////////////////////////////////////////// const int MAX_GPU_COUNT = 32; const int DATA_N = 48576 * 32; #ifdef PAPI const int MAX_NUM_EVENTS = 32; #endif #define CHECK_CU_ERROR(err, cufunc) \ if (err != CUDA_SUCCESS) { fprintf (stderr, "Error %d for CUDA Driver API function '%s'\n", err, cufunc); return -1; } #define CHECK_CUDA_ERROR(err) \ if (err != cudaSuccess) { fprintf (stderr, "%s:%i Error %d for CUDA [%s]\n", __FILE__, __LINE__, err, cudaGetErrorString(err) ); return -1; } #define CHECK_CUPTI_ERROR(err, cuptifunc) \ if (err != CUPTI_SUCCESS) { const char *errStr; cuptiGetResultString(err, &errStr); \ fprintf (stderr, "%s:%i Error %d [%s] for CUPTI API function '%s'\n", __FILE__, __LINE__, err, errStr, cuptifunc); return -1; } #define PRINT(quiet, format, args...) {if (!quiet) {fprintf(stderr, format, ## args);}} // ////////////////////////////////////////////////////////////////////////////// // Simple reduction kernel. // Refer to the 'reduction' CUDA SDK sample describing // reduction optimization strategies // ////////////////////////////////////////////////////////////////////////////// __global__ static void reduceKernel( float *d_Result, float *d_Input, int N ) { const int tid = blockIdx.x * blockDim.x + threadIdx.x; const int threadN = gridDim.x * blockDim.x; float sum = 0; for( int pos = tid; pos < N; pos += threadN ) sum += d_Input[pos]; d_Result[tid] = sum; } /** @class add_events_from_command_line * @brief Try and add each event provided on the command line by the user. * * @param EventSet * A PAPI eventset. * @param totalEventCount * Number of events from the command line. 
* @param **eventsFromCommandLine * Events provided on the command line. * @param gpu_id * Current gpu id. * @param *numEventsSuccessfullyAdded * Total number of successfully added events. * @param **eventsSuccessfullyAdded * Events that we are able to add to the EventSet. * @param *numMultipassEvents * Counter to see if a multiple pass event was provided on the command line. */ static void add_events_from_command_line(int EventSet, int totalEventCount, char **eventNamesFromCommandLine, int gpu_id, int *numEventsSuccessfullyAdded, char **eventsSuccessfullyAdded, int *numMultipassEvents) { int i; for (i = 0; i < totalEventCount; i++) { char tmpEventName[PAPI_MAX_STR_LEN]; int strLen = snprintf(tmpEventName, PAPI_MAX_STR_LEN, "%s:device=%d", eventNamesFromCommandLine[i], gpu_id); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { fprintf(stderr, "Failed to fully write event name with appended device qualifier.\n"); test_skip(__FILE__, __LINE__, "", 0); } int papi_errno = PAPI_add_named_event(EventSet, tmpEventName); if (papi_errno != PAPI_OK) { if (papi_errno != PAPI_EMULPASS) { fprintf(stderr, "Unable to add event %s to the EventSet with error code %d.\n", tmpEventName, papi_errno); test_skip(__FILE__, __LINE__, "", 0); } // Handle multiple pass events (*numMultipassEvents)++; continue; } // Handle successfully added events strLen = snprintf(eventsSuccessfullyAdded[(*numEventsSuccessfullyAdded)], PAPI_MAX_STR_LEN, "%s", tmpEventName); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { fprintf(stderr, "Failed to fully write successfully added event.\n"); test_skip(__FILE__, __LINE__, "", 0); } (*numEventsSuccessfullyAdded)++; } return; } // ////////////////////////////////////////////////////////////////////////////// // Program main // ////////////////////////////////////////////////////////////////////////////// int main( int argc, char **argv ) { // Solver config TGPUplan plan[MAX_GPU_COUNT]; // GPU reduction results float h_SumGPU[MAX_GPU_COUNT]; float sumGPU; double 
sumCPU, diff; int i, j, gpuBase, num_gpus; const int BLOCK_N = 32; const int THREAD_N = 256; const int ACCUM_N = BLOCK_N * THREAD_N; char *test_quiet = getenv("PAPI_CUDA_TEST_QUIET"); int quiet = 0; if (test_quiet) quiet = (int) strtol(test_quiet, (char**) NULL, 10); PRINT( quiet, "Starting simpleMultiGPU\n" ); #ifdef PAPI int event_count = argc - 1; /* if no events passed at command line, just report test skipped. */ if (event_count == 0) { fprintf(stderr, "No eventnames specified at command line.\n"); test_skip(__FILE__, __LINE__, "", 0); } /* PAPI Initialization must occur before any context creation/manipulation. */ /* This is to ensure PAPI can monitor CUpti library calls. */ int papi_errno = PAPI_library_init( PAPI_VER_CURRENT ); if( papi_errno != PAPI_VER_CURRENT ) { fprintf( stderr, "PAPI_library_init failed\n" ); exit(-1); } printf( "PAPI version: %d.%d.%d\n", PAPI_VERSION_MAJOR( PAPI_VERSION ), PAPI_VERSION_MINOR( PAPI_VERSION ), PAPI_VERSION_REVISION( PAPI_VERSION ) ); #endif // Report on the available CUDA devices int computeCapabilityMajor = 0, computeCapabilityMinor = 0; int runtimeVersion = 0, driverVersion = 0; char deviceName[PAPI_MIN_STR_LEN]; CUdevice device[MAX_GPU_COUNT]; CHECK_CUDA_ERROR( cudaGetDeviceCount( &num_gpus ) ); if( num_gpus > MAX_GPU_COUNT ) num_gpus = MAX_GPU_COUNT; PRINT( quiet, "CUDA-capable device count: %i\n", num_gpus ); for ( i=0; i>> ( plan[i].d_Sum, plan[i].d_Data, plan[i].dataN ); if ( cudaGetLastError() != cudaSuccess ) { printf( "reduceKernel() execution failed (GPU %d).\n", i ); exit(EXIT_FAILURE); } // Read back GPU results CHECK_CUDA_ERROR( cudaMemcpyAsync( plan[i].h_Sum_from_device, plan[i].d_Sum, ACCUM_N * sizeof( float ), cudaMemcpyDeviceToHost, plan[i].stream ) ); } // Process GPU results PRINT( quiet, "Process GPU results on %d GPUs...\n", num_gpus ); for( i = 0; i < num_gpus; i++ ) { float sum; // Pushing a context implicitly sets the device for which it was created. 
CHECK_CUDA_ERROR(cudaSetDevice(device[i])); // Wait for all operations to finish cudaStreamSynchronize( plan[i].stream ); // Finalize GPU reduction for current subvector sum = 0; for( j = 0; j < ACCUM_N; j++ ) { sum += plan[i].h_Sum_from_device[j]; } *( plan[i].h_Sum ) = ( float ) sum; } double gpuTime = GetTimer(); #ifdef PAPI for ( i=0; i %s \n", values[i], eventsSuccessfullyAdded[i] ); papi_errno = PAPI_cleanup_eventset( EventSet ); if( papi_errno != PAPI_OK ) fprintf( stderr, "PAPI_cleanup_eventset failed\n" ); papi_errno = PAPI_destroy_eventset( &EventSet ); if( papi_errno != PAPI_OK ) fprintf( stderr, "PAPI_destroy_eventset failed\n" ); PAPI_shutdown(); #endif sumGPU = 0; for( i = 0; i < num_gpus; i++ ) { sumGPU += h_SumGPU[i]; } PRINT( quiet, " GPU Processing time: %f (ms)\n", gpuTime ); // Compute on Host CPU PRINT( quiet, "Computing the same result with Host CPU...\n" ); StartTimer(); sumCPU = 0; for( i = 0; i < num_gpus; i++ ) { for( j = 0; j < plan[i].dataN; j++ ) { sumCPU += plan[i].h_Data[j]; } } double cpuTime = GetTimer(); if (gpuTime > 0) { PRINT( quiet, " CPU Processing time: %f (ms) (speedup %.2fX)\n", cpuTime, (cpuTime/gpuTime) ); } else { PRINT( quiet, " CPU Processing time: %f (ms)\n", cpuTime); } // Compare GPU and CPU results PRINT( quiet, "Comparing GPU and Host CPU results...\n" ); diff = fabs( sumCPU - sumGPU ) / fabs( sumCPU ); PRINT( quiet, " GPU sum: %f\n CPU sum: %f\n", sumGPU, sumCPU ); PRINT( quiet, " Relative difference: %E \n", diff ); // Cleanup and shutdown for( i = 0; i < num_gpus; i++ ) { CHECK_CUDA_ERROR( cudaFreeHost( plan[i].h_Sum_from_device ) ); CHECK_CUDA_ERROR( cudaFreeHost( plan[i].h_Data ) ); CHECK_CUDA_ERROR( cudaFree( plan[i].d_Sum ) ); CHECK_CUDA_ERROR( cudaFree( plan[i].d_Data ) ); // Shut down this GPU CHECK_CUDA_ERROR( cudaStreamDestroy( plan[i].stream ) ); } //Free allocated memory for (i = 0; i < event_count; i++) { free(eventsSuccessfullyAdded[i]); } free(eventsSuccessfullyAdded); #ifdef PAPI // Output a note 
that a multiple pass event was provided on the command line if (numMultipassEvents > 0) { PRINT(quiet, "\033[0;33mNOTE: From the events provided on the command line, an event or events requiring multiple passes was detected and not added to the EventSet. Check your events with utils/papi_native_avail.\n\033[0m"); } if ( diff < 1e-5 ) test_pass(__FILE__); else test_fail(__FILE__, __LINE__, "Result of GPU calculation doesn't match CPU.", PAPI_EINVAL); #endif return 0; } papi-papi-7-2-0-t/src/components/cuda/tests/test_2thr_1gpu_not_allowed.cu000066400000000000000000000152711502707512200264720ustar00rootroot00000000000000/** * @file test_2thr_1gpu_not_allowed.cu * @author Anustuv Pal * anustuv@icl.utk.edu */ #include #include #include #include #include "gpu_work.h" #ifdef PAPI #include #include #define PAPI_CALL(apiFuncCall) \ do { \ int _status = apiFuncCall; \ if (_status != PAPI_OK) { \ fprintf(stderr, "error: function %s failed.", #apiFuncCall); \ test_fail(__FILE__, __LINE__, "", _status); \ } \ } while (0) #endif #define PRINT(quiet, format, args...) 
{if (!quiet) {fprintf(stderr, format, ## args);}} int quiet; #define RUNTIME_API_CALL(apiFuncCall) \ do { \ cudaError_t _status = apiFuncCall; \ if (_status != cudaSuccess) { \ fprintf(stderr, "%s:%d: error: function %s failed with error %s.\n", \ __FILE__, __LINE__, #apiFuncCall, cudaGetErrorString(_status));\ exit(EXIT_FAILURE); \ } \ } while (0) #define DRIVER_API_CALL(apiFuncCall) \ do { \ CUresult _status = apiFuncCall; \ if (_status != CUDA_SUCCESS) { \ fprintf(stderr, "%s:%d: error: function %s failed with error %d.\n", \ __FILE__, __LINE__, #apiFuncCall, _status); \ exit(EXIT_FAILURE); \ } \ } while (0) #define NUM_THREADS 2 int numGPUs; int g_event_count; char **g_evt_names; typedef struct pthread_params_s { pthread_t tid; CUcontext cuCtx; int idx; int retval; } pthread_params_t; void *thread_gpu(void * ptinfo) { pthread_params_t *tinfo = (pthread_params_t *) ptinfo; int idx = tinfo->idx; int gpuid = idx % numGPUs; unsigned long gettid = (unsigned long) pthread_self(); DRIVER_API_CALL(cuCtxSetCurrent(tinfo->cuCtx)); PRINT(quiet, "This is idx %d thread %lu - using GPU %d context %p!\n", idx, gettid, gpuid, tinfo->cuCtx); #ifdef PAPI int papi_errno; int EventSet = PAPI_NULL; long long values[1]; PAPI_CALL(PAPI_create_eventset(&EventSet)); papi_errno = PAPI_add_named_event(EventSet, g_evt_names[idx]); if (papi_errno != PAPI_OK) { if (papi_errno == PAPI_EMULPASS) { fprintf(stderr, "Event %s requires multiple passes and cannot be added to an EventSet. 
Two single pass events are needed for this test see utils/papi_native_avail for more Cuda native events.\n", g_evt_names[idx]); test_skip(__FILE__, __LINE__, "", 0); } else { fprintf(stderr, "Unable to add event %s to the EventSet with error code %d.\n", g_evt_names[idx], papi_errno); test_skip(__FILE__, __LINE__, "", 0); } } papi_errno = PAPI_start(EventSet); if (papi_errno == PAPI_ECNFLCT) { PRINT(quiet, "Thread %d was not allowed to start profiling on same GPU.\n", tinfo->idx); tinfo->retval = papi_errno; return NULL; } #endif VectorAddSubtract(5000000*(idx+1), quiet); // gpu work #ifdef PAPI PAPI_CALL(PAPI_stop(EventSet, values)); PRINT(quiet, "User measured values in thread id %d.\n", idx); PRINT(quiet, "%s\t\t%lld\n", g_evt_names[idx], values[0]); tinfo->retval = PAPI_OK; PAPI_CALL(PAPI_cleanup_eventset(EventSet)); PAPI_CALL(PAPI_destroy_eventset(&EventSet)); #endif return NULL; } int main(int argc, char **argv) { quiet = 0; #ifdef PAPI char *test_quiet = getenv("PAPI_CUDA_TEST_QUIET"); if (test_quiet) quiet = (int) strtol(test_quiet, (char**) NULL, 10); g_event_count = argc - 1; /* if no events passed at command line, just report test skipped. */ if (g_event_count == 0) { fprintf(stderr, "No eventnames specified at command line.\n"); test_skip(__FILE__, __LINE__, "", 0); } else if (g_event_count != 2) { fprintf(stderr, "Two single pass events are needed for this test to run properly.\n"); test_skip(__FILE__, __LINE__, "", 0); } g_evt_names = argv + 1; #endif int rc, i; pthread_params_t data[NUM_THREADS]; RUNTIME_API_CALL(cudaGetDeviceCount(&numGPUs)); PRINT(quiet, "No. of GPUs = %d\n", numGPUs); PRINT(quiet, "No. 
of threads to launch = %d\n", NUM_THREADS); #ifdef PAPI int papi_errno = PAPI_library_init( PAPI_VER_CURRENT ); if( papi_errno != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__, "PAPI_library_init failed.", 0); } // Point PAPI to function that gets the thread id PAPI_CALL(PAPI_thread_init((unsigned long (*)(void)) pthread_self)); #endif // Launch the threads for(i = 0; i < NUM_THREADS; i++) { data[i].idx = i; DRIVER_API_CALL(cuCtxCreate(&(data[i].cuCtx), 0, 0)); DRIVER_API_CALL(cuCtxPopCurrent(&(data[i].cuCtx))); rc = pthread_create(&data[i].tid, NULL, thread_gpu, &(data[i])); if(rc) { fprintf(stderr, "\n ERROR: return code from pthread_create is %d \n", rc); exit(1); } PRINT(quiet, "\n Main thread %lu. Created new thread (%lu) in iteration %d ...\n", (unsigned long)pthread_self(), (unsigned long) data[i].tid, i); } // Join all threads when complete for (i=0; i #include "gpu_work.h" #ifdef PAPI #include #include "papi_test.h" #endif #define COMP_NAME "cuda" #define MAX_EVENT_COUNT (32) #define PRINT(quiet, format, args...) {if (!quiet) {fprintf(stderr, format, ## args);}} int quiet; int approx_equal(long long v1, long long v2) { double err = fabs(v1 - v2) / v1; if (err < 0.1) return 1; return 0; } // Globals for successfully added and multiple pass events int numEventsSuccessfullyAdded = 0, numMultipassEvents = 0; /** @class add_events_from_command_line * @brief Try and add each event provided on the command line by the user. * * @param EventSet * A PAPI eventset. * @param totalEventCount * Number of events from the command line. * @param **eventNamesFromCommandLine * Events provided on the command line. * @param *numEventsSuccessfullyAdded * Total number of successfully added events. * @param **eventsSuccessfullyAdded * Events that we are able to add to the EventSet. * @param *numMultipassEvents * Counter to see if a multiple pass event was provided on the command line. 
*/ static void add_events_from_command_line(int EventSet, int totalEventCount, char **eventNamesFromCommandLine, int *numEventsSuccessfullyAdded, char **eventsSuccessfullyAdded, int *numMultipassEvents) { int i; for (i = 0; i < totalEventCount; i++) { int strLen; int papi_errno = PAPI_add_named_event(EventSet, eventNamesFromCommandLine[i]); if (papi_errno != PAPI_OK) { if (papi_errno != PAPI_EMULPASS) { fprintf(stderr, "Unable to add event %s to the EventSet with error code %d.\n", eventNamesFromCommandLine[i], papi_errno); test_skip(__FILE__, __LINE__, "", 0); } // Handle multiple pass events (*numMultipassEvents)++; continue; } // Handle successfully added events strLen = snprintf(eventsSuccessfullyAdded[(*numEventsSuccessfullyAdded)], PAPI_MAX_STR_LEN, "%s", eventNamesFromCommandLine[i]); if (strLen < 0 || strLen >= PAPI_MAX_STR_LEN) { fprintf(stderr, "Failed to fully write successfully added event.\n"); test_skip(__FILE__, __LINE__, "", 0); } (*numEventsSuccessfullyAdded)++; } return; } void multi_reset(int event_count, char **evt_names, long long *values) { CUcontext ctx; int papi_errno, i; papi_errno = cuCtxCreate(&ctx, 0, 0); if (papi_errno != CUDA_SUCCESS) { fprintf(stderr, "cuda error: failed to create cuda context.\n"); exit(1); } #ifdef PAPI int EventSet = PAPI_NULL; int j; papi_errno = PAPI_create_eventset(&EventSet); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "Failed to create eventset.", papi_errno); } // Handle the events from the command line numEventsSuccessfullyAdded = 0; numMultipassEvents = 0; char **eventsSuccessfullyAdded; eventsSuccessfullyAdded = (char **) malloc(event_count * sizeof(char *)); if (eventsSuccessfullyAdded == NULL) { fprintf(stderr, "Failed to allocate memory for successfully added events.\n"); test_skip(__FILE__, __LINE__, "", 0); } for (i = 0; i < event_count; i++) { eventsSuccessfullyAdded[i] = (char *) malloc(PAPI_MAX_STR_LEN * sizeof(char)); if (eventsSuccessfullyAdded[i] == NULL) { fprintf(stderr, "Failed 
to allocate memory for command line argument.\n"); test_skip(__FILE__, __LINE__, "", 0); } } add_events_from_command_line(EventSet, event_count, evt_names, &numEventsSuccessfullyAdded, eventsSuccessfullyAdded, &numMultipassEvents); // Only multiple pass events were provided on the command line if (numEventsSuccessfullyAdded == 0) { fprintf(stderr, "Events provided on the command line could not be added to an EventSet as they require multiple passes.\n"); test_skip(__FILE__, __LINE__, "", 0); } papi_errno = PAPI_start(EventSet); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start error.", papi_errno); } #endif for (i=0; i<10; i++) { VectorAddSubtract(100000, quiet); #ifdef PAPI papi_errno = PAPI_read(EventSet, values); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_read error.", papi_errno); } PRINT(quiet, "Measured values iter %d\n", i); for (j=0; j < numEventsSuccessfullyAdded; j++) { PRINT(quiet, "%s\t\t%lld\n", eventsSuccessfullyAdded[j], values[j]); } papi_errno = PAPI_reset(EventSet); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_reset error.", papi_errno); } #endif } #ifdef PAPI papi_errno = PAPI_stop(EventSet, values); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_stop error.", papi_errno); } papi_errno = PAPI_cleanup_eventset(EventSet); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset error.", papi_errno); } papi_errno = PAPI_destroy_eventset(&EventSet); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset error.", papi_errno); } #endif papi_errno = cuCtxDestroy(ctx); if (papi_errno != CUDA_SUCCESS) { fprintf(stderr, "cuda error: failed to destroy context.\n"); exit(1); } // Free allocated memory for (i = 0; i < event_count; i++) { free(eventsSuccessfullyAdded[i]); } free(eventsSuccessfullyAdded); } void multi_read(int event_count, char **evt_names, long long *values) { CUcontext ctx; int papi_errno, i; papi_errno = 
cuCtxCreate(&ctx, 0, 0); if (papi_errno != CUDA_SUCCESS) { fprintf(stderr, "cuda error: failed to create cuda context.\n"); exit(1); } #ifdef PAPI int EventSet = PAPI_NULL, j; papi_errno = PAPI_create_eventset(&EventSet); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "Failed to create eventset.", papi_errno); } // Handle the events from the command line numEventsSuccessfullyAdded = 0; numMultipassEvents = 0; char **eventsSuccessfullyAdded; eventsSuccessfullyAdded = (char **) malloc(event_count * sizeof(char *)); if (eventsSuccessfullyAdded == NULL) { fprintf(stderr, "Failed to allocate memory for successfully added events.\n"); test_skip(__FILE__, __LINE__, "", 0); } for (i = 0; i < event_count; i++) { eventsSuccessfullyAdded[i] = (char *) malloc(PAPI_MAX_STR_LEN * sizeof(char)); if (eventsSuccessfullyAdded[i] == NULL) { fprintf(stderr, "Failed to allocate memory for command line argument.\n"); test_skip(__FILE__, __LINE__, "", 0); } } add_events_from_command_line(EventSet, event_count, evt_names, &numEventsSuccessfullyAdded, eventsSuccessfullyAdded, &numMultipassEvents); // Only multiple pass events were provided on the command line if (numEventsSuccessfullyAdded == 0) { fprintf(stderr, "Events provided on the command line could not be added to an EventSet as they require multiple passes.\n"); test_skip(__FILE__, __LINE__, "", 0); } papi_errno = PAPI_start(EventSet); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start error.", papi_errno); } #endif for (i=0; i<10; i++) { VectorAddSubtract(100000, quiet); #ifdef PAPI papi_errno = PAPI_read(EventSet, values); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start error.", papi_errno); } PRINT(quiet, "Measured values iter %d\n", i); for (j=0; j < numEventsSuccessfullyAdded; j++) { PRINT(quiet, "%s\t\t%lld\n", eventsSuccessfullyAdded[j], values[j]); } } papi_errno = PAPI_stop(EventSet, values); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_stop 
error.", papi_errno); } papi_errno = PAPI_cleanup_eventset(EventSet); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset error.", papi_errno); } papi_errno = PAPI_destroy_eventset(&EventSet); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset error.", papi_errno); #endif } papi_errno = cuCtxDestroy(ctx); if (papi_errno != CUDA_SUCCESS) { fprintf(stderr, "cuda error: failed to destroy context.\n"); exit(1); } // Free allocated memory for (i = 0; i < event_count; i++) { free(eventsSuccessfullyAdded[i]); } free(eventsSuccessfullyAdded); } void single_read(int event_count, char **evt_names, long long *values, char ***addedEvents) { int papi_errno, i; CUcontext ctx; papi_errno = cuCtxCreate(&ctx, 0, 0); if (papi_errno != CUDA_SUCCESS) { fprintf(stderr, "cuda error: failed to create cuda context.\n"); exit(1); } #ifdef PAPI int EventSet = PAPI_NULL, j; papi_errno = PAPI_create_eventset(&EventSet); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "Failed to create eventset.", papi_errno); } // Handle the events from the command line numEventsSuccessfullyAdded = 0; numMultipassEvents = 0; char **eventsSuccessfullyAdded; eventsSuccessfullyAdded = (char **) malloc(event_count * sizeof(char *)); if (eventsSuccessfullyAdded == NULL) { fprintf(stderr, "Failed to allocate memory for successfully added events.\n"); test_skip(__FILE__, __LINE__, "", 0); } for (i = 0; i < event_count; i++) { eventsSuccessfullyAdded[i] = (char *) malloc(PAPI_MAX_STR_LEN * sizeof(char)); if (eventsSuccessfullyAdded[i] == NULL) { fprintf(stderr, "Failed to allocate memory for command line argument.\n"); test_skip(__FILE__, __LINE__, "", 0); } } add_events_from_command_line(EventSet, event_count, evt_names, &numEventsSuccessfullyAdded, eventsSuccessfullyAdded, &numMultipassEvents); // Only multiple pass events were provided on the command line if (numEventsSuccessfullyAdded == 0) { fprintf(stderr, "Events provided on the command 
line could not be added to an EventSet as they require multiple passes.\n"); test_skip(__FILE__, __LINE__, "", 0); } papi_errno = PAPI_start(EventSet); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start error.", papi_errno); } #endif for (i=0; i<10; i++) { VectorAddSubtract(100000, quiet); } #ifdef PAPI papi_errno = PAPI_stop(EventSet, values); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_stop error.", papi_errno); } PRINT(quiet, "Measured values from single read\n"); for (j=0; j < numEventsSuccessfullyAdded; j++) { PRINT(quiet, "%s\t\t%lld\n", eventsSuccessfullyAdded[j], values[j]); } papi_errno = PAPI_cleanup_eventset(EventSet); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset error.", papi_errno); } papi_errno = PAPI_destroy_eventset(&EventSet); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset error.", papi_errno); } #endif papi_errno = cuCtxDestroy(ctx); if (papi_errno != CUDA_SUCCESS) { fprintf(stderr, "cuda error: failed to destroy cuda context.\n"); exit(1); } *addedEvents = eventsSuccessfullyAdded; } int main(int argc, char **argv) { cuInit(0); quiet = 0; #ifdef PAPI int papi_errno; char *test_quiet = getenv("PAPI_CUDA_TEST_QUIET"); if (test_quiet) quiet = (int) strtol(test_quiet, (char**) NULL, 10); int event_count = argc - 1; /* if no events passed at command line, just report test skipped. 
*/ if (event_count == 0) { fprintf(stderr, "No eventnames specified at command line.\n"); test_skip(__FILE__, __LINE__, "", 0); } papi_errno = PAPI_library_init(PAPI_VER_CURRENT); if (papi_errno != PAPI_VER_CURRENT) { test_fail(__FILE__, __LINE__, "Failed to initialize PAPI.", 0); } papi_errno = PAPI_get_component_index(COMP_NAME); if (papi_errno < 0) { test_fail(__FILE__, __LINE__, "Failed to get index of cuda component.", PAPI_ECMP); } long long values_multi_reset[MAX_EVENT_COUNT]; long long values_multi_read[MAX_EVENT_COUNT]; long long values_single_read[MAX_EVENT_COUNT]; PRINT(quiet, "Running multi_reset.\n"); multi_reset(event_count, argv + 1, values_multi_reset); PRINT(quiet, "\nRunning multi_read.\n"); multi_read(event_count, argv + 1, values_multi_read); PRINT(quiet, "\nRunning single_read.\n"); char **eventsSuccessfullyAdded = { 0 }; single_read(event_count, argv + 1, values_single_read, &eventsSuccessfullyAdded); int i; PRINT(quiet, "Final measured values\nEvent_name\t\t\t\t\t\tMulti_read\tsingle_read\n"); for (i=0; i < numEventsSuccessfullyAdded; i++) { PRINT(quiet, "%s\t\t\t%lld\t\t%lld\n", eventsSuccessfullyAdded[i], values_multi_read[i], values_single_read[i]); if ( !approx_equal(values_multi_read[i], values_single_read[i]) ) test_warn(__FILE__, __LINE__, "Measured values from multi read and single read don't match.", PAPI_OK); } // Free allocated memory for (i = 0; i < event_count; i++) { free(eventsSuccessfullyAdded[i]); } free(eventsSuccessfullyAdded); PAPI_shutdown(); // Output a note that a multiple pass event was provided on the command line if (numMultipassEvents > 0) { PRINT(quiet, "\033[0;33mNOTE: From the events provided on the command line, an event or events requiring multiple passes was detected and not added to the EventSet. 
Check your events with utils/papi_native_avail.\n\033[0m"); } test_pass(__FILE__); #else fprintf(stderr, "Please compile with -DPAPI to test this feature.\n"); #endif return 0; } papi-papi-7-2-0-t/src/components/cuda/tests/test_multipass_event_fail.cu000066400000000000000000000137201502707512200265020ustar00rootroot00000000000000/** * @file test_multipass_event_fail.cu * @author Anustuv Pal * anustuv@icl.utk.edu */ #include #ifdef PAPI #include "papi.h" #include "papi_test.h" #define PASS 1 #define FAIL 0 #define MAX_EVENT_COUNT (32) #define PRINT(quiet, format, args...) {if (!quiet) {fprintf(stderr, format, ## args);}} int quiet; int test_PAPI_add_named_event(int *EventSet, int numEvents, char **EventName) { int i, papi_errno; PRINT(quiet, "LOG: %s: Entering.\n", __func__); for (i=0; i #if defined(WIN32) || defined(_WIN32) || defined(WIN64) || defined(_WIN64) #define WIN32_LEAN_AND_MEAN #include #else #include #endif #if defined(WIN32) || defined(_WIN32) || defined(WIN64) || defined(_WIN64) double PCFreq = 0.0; __int64 timerStart = 0; #else struct timeval timerStart; #endif void StartTimer() { #if defined(WIN32) || defined(_WIN32) || defined(WIN64) || defined(_WIN64) LARGE_INTEGER li; if (!QueryPerformanceFrequency(&li)) { printf("QueryPerformanceFrequency failed!\n"); } PCFreq = (double)li.QuadPart/1000.0; QueryPerformanceCounter(&li); timerStart = li.QuadPart; #else gettimeofday(&timerStart, NULL); #endif } // time elapsed in ms double GetTimer() { #if defined(WIN32) || defined(_WIN32) || defined(WIN64) || defined(_WIN64) LARGE_INTEGER li; QueryPerformanceCounter(&li); return (double)(li.QuadPart-timerStart)/PCFreq; #else struct timeval timerStop, timerElapsed; gettimeofday(&timerStop, NULL); timersub(&timerStop, &timerStart, &timerElapsed); return timerElapsed.tv_sec*1000.0+timerElapsed.tv_usec/1000.0; #endif } #endif // TIMER_H 
papi-papi-7-2-0-t/src/components/emon/000077500000000000000000000000001502707512200175525ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/emon/README.md000066400000000000000000000020471502707512200210340ustar00rootroot00000000000000# EMON Component The EMON component provides access to Environmental MONitoring power data on BG/Q systems. * [Enabling the EMON Component](#enabling-the-emon-component) *** ## Enabling the EMON Component To enable reading of EMON counters the user needs to link against a PAPI library that was configured with the EMON component enabled. There are no specific component configure scripts. In order to configure PAPI for BG/Q, use the following configure options at the papi/src level: ./configure --prefix=< your_choice > \ --with-OS=bgq \ --with-EMON_installdir=/bgsys/drivers/ppcfloor \ CC=/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc64-bgq-linux-gcc \ F77=/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc64-bgq-linux-gfortran \ --with-components="EMON/L2unit EMON/CNKunit EMON/IOunit EMON/NWunit emon" Typically, the utility `papi_components_avail` (available in `papi/src/utils/papi_components_avail`) will display the components available to the user, whether they are disabled, and, when they are disabled, why. 
papi-papi-7-2-0-t/src/components/emon/Rules.emon000066400000000000000000000003531502707512200215250ustar00rootroot00000000000000# $Id$ COMPSRCS += components/emon/linux-emon.c COMPOBJS += linux-emon.o linux-emon.o: components/emon/linux-emon.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/emon/linux-emon.c -o linux-emon.o -I/bgsys/drivers/ppcfloor papi-papi-7-2-0-t/src/components/emon/linux-emon.c000066400000000000000000000350561502707512200220220ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file linux-emon.c * @author Heike Jagode * jagode@eecs.utk.edu * BGPM / emon component * * @brief * This file has the source code for a component that enables PAPI-C to * access hardware power data for BG/Q through the EMON interface. */ #include #include #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" #include "extras.h" #define EMON_DEFINE_GLOBALS #include #include // the emon library header file (no linking required) #define EMON_MAX_COUNTERS 8 #define EMON_TOTAL_EVENTS 8 #ifndef DEBUG #define EMONDBG( fmt, args...) do {} while(0) #else #define EMONDBG( fmt, args... ) do { printf("%s:%d\t"fmt, __func__, __LINE__, ##args); } while(0) #endif /* Stores private information for each event */ typedef struct EMON_register { unsigned int selector; /* Signifies which counter slot is being used */ /* Indexed from 1 as 0 has a special meaning */ } EMON_register_t; /** This structure is used to build the table of events */ /* The contents of this structure will vary based on */ /* your component, however having name and description */ /* fields are probably useful. 
*/ typedef struct EMON_native_event_entry { EMON_register_t resources; /**< Per counter resources */ char *name; /**< Name of the counter */ char *description; /**< Description of the counter */ int return_type; } EMON_native_event_entry_t; /* Used when doing register allocation */ typedef struct EMON_reg_alloc { EMON_register_t ra_bits; } EMON_reg_alloc_t; typedef struct EMON_overflow { int threshold; int EventIndex; } EMON_overflow_t; /* Holds control flags */ typedef struct EMON_control_state { int count; long long counters[EMON_MAX_COUNTERS]; int being_measured[EMON_MAX_COUNTERS]; long long last_update; } EMON_control_state_t; /* Holds per-thread information */ typedef struct EMON_context { EMON_control_state_t state; } EMON_context_t; /* Declare our vector in advance */ papi_vector_t _emon2_vector; static void _check_EMON_error( char* emon2func, int err ) { ( void ) emon2func; if ( err < 0 ) { printf( "Error: EMON API function '%s' returned %d.\n", emon2func, err ); } } /** This table contains the native events * So with the EMON interface, we get every domain at a time. 
*/ static EMON_native_event_entry_t EMON_native_table[] = { { .name = "DOMAIN1", .description = "Chip core", .resources.selector = 1, .return_type = PAPI_DATATYPE_FP64, }, { .name = "DOMAIN2", .description = "Chip Memory Interface and DRAM", .resources.selector = 2, .return_type = PAPI_DATATYPE_FP64, }, { .name = "DOMAIN3", .description = "Optics", .resources.selector = 3, .return_type = PAPI_DATATYPE_FP64, }, { .name = "DOMAIN4", .description = "Optics + PCIExpress", .resources.selector = 4, .return_type = PAPI_DATATYPE_FP64, }, { .name = "DOMAIN6", .description = "HSS Network and Link Chip", .resources.selector = 5, .return_type = PAPI_DATATYPE_FP64, }, { .name = "DOMAIN8", .description = "Link Chip Core", .resources.selector = 6, .return_type = PAPI_DATATYPE_FP64, }, { .name = "DOMAIN7", .description = "Chip SRAM", .resources.selector = 7, .return_type = PAPI_DATATYPE_FP64, }, { .name="EMON_DOMAIN_ALL", .description = "Measures power on all domains.", .resources.selector = 8, .return_type = PAPI_DATATYPE_FP64, }, }; /***************************************************************************** ******************* BEGIN PAPI's COMPONENT REQUIRED FUNCTIONS ************* *****************************************************************************/ /* * This is called whenever a thread is initialized */ int EMON_init_thread( hwd_context_t * ctx ) { EMONDBG( "EMON_init_thread\n" ); ( void ) ctx; return PAPI_OK; } /* Initialize hardware counters, setup the function vector table * and get hardware information, this routine is called when the * PAPI process is initialized (IE PAPI_library_init) */ int EMON_init_component( int cidx ) { int ret = 0; _emon2_vector.cmp_info.CmpIdx = cidx; EMONDBG( "EMON_init_component cidx = %d\n", cidx ); /* Setup connection with the fpga: * NOTE: any other threads attempting to call into the EMON API * will be turned away. 
*/ ret = EMON_SetupPowerMeasurement(); _check_EMON_error("EMON_SetupPowerMeasurement", ret ); _emon2_vector.cmp_info.num_native_events = EMON_TOTAL_EVENTS; _emon2_vector.cmp_info.num_cntrs = EMON_TOTAL_EVENTS; _emon2_vector.cmp_info.num_mpx_cntrs = EMON_TOTAL_EVENTS; return ( PAPI_OK ); } /* * Control of counters (Reading/Writing/Starting/Stopping/Setup) * functions */ int EMON_init_control_state( hwd_control_state_t * ptr ) { EMONDBG( "EMON_init_control_state\n" ); EMON_control_state_t * this_state = ( EMON_control_state_t * ) ptr; memset( this_state, 0, sizeof ( EMON_control_state_t ) ); return PAPI_OK; } static int _emon_accessor( EMON_control_state_t * this_state ) { union { long long ll; double fp; } return_value; return_value.fp = -1; double volts[14],amps[14]; double cpu = 0; double dram = 0; double link_chip = 0; double network = 0; double optics = 0; double pci = 0; double sram = 0; unsigned k_const; EMONDBG( "_emon_accessor, enter this_state = %x\n", this_state); return_value.fp = EMON_GetPower_impl( volts, amps ); EMONDBG("_emon_accessor, after EMON_GetPower %lf \n", return_value.fp); if ( -1 == return_value.fp ) { PAPIERROR("EMON_GetPower() failed!\n"); return ( PAPI_ESYS ); } this_state->counters[7] = return_value.ll; /* We just stuff everything in counters, there is no extra overhead here */ k_const = domain_info[0].k_const; /* Chip Core Voltage */ cpu += volts[0] * amps[0] * k_const; cpu += volts[1] * amps[1] * k_const; k_const = domain_info[1].k_const; /* Chip Core Voltage */ dram += volts[2] * amps[2] * k_const; dram += volts[3] * amps[3] * k_const; k_const = domain_info[2].k_const; /* Chip Core Voltage */ optics += volts[4] * amps[4] * k_const; optics += volts[5] * amps[5] * k_const; k_const = domain_info[3].k_const; /* Chip Core Voltage */ pci += volts[6] * amps[6] * k_const; pci += volts[7] * amps[7] * k_const; k_const = domain_info[4].k_const; /* Chip Core Voltage */ network += volts[8] * amps[8] * k_const; network += volts[9] * amps[9] * 
k_const; k_const = domain_info[5].k_const; /* Chip Core Voltage */ link_chip += volts[10] * amps[10] * k_const; link_chip += volts[11] * amps[11] * k_const; k_const = domain_info[6].k_const; /* Chip Core Voltage */ sram += volts[12] * amps[12] * k_const; sram += volts[13] * amps[13] * k_const; this_state->counters[0] = *(long long*)&cpu; this_state->counters[1] = *(long long*)&dram; this_state->counters[2] = *(long long*)&optics; this_state->counters[3] = *(long long*)&pci; this_state->counters[4] = *(long long*)&link_chip; this_state->counters[5] = *(long long*)&network; this_state->counters[6] = *(long long*)&sram; EMONDBG("CPU = %lf\n", *(double*)&this_state->counters[0]); EMONDBG("DRAM = %lf\n", *(double*)&this_state->counters[1]); EMONDBG("Optics = %lf\n", *(double*)&this_state->counters[2]); EMONDBG("PCI = %lf\n", *(double*)&this_state->counters[3]); EMONDBG("Link Chip = %lf\n", *(double*)&this_state->counters[4]); EMONDBG("Network = %lf\n", *(double*)&this_state->counters[5]); EMONDBG("SRAM = %lf\n", *(double*)&this_state->counters[6]); EMONDBG("TOTAL = %lf\n", *(double*)&this_state->counters[7] ); return ( PAPI_OK ); } /* * */ int EMON_start( hwd_context_t * ctx, hwd_control_state_t * ptr ) { EMONDBG( "EMON_start\n" ); ( void ) ctx; ( void ) ptr; /*EMON_control_state_t * this_state = ( EMON_control_state_t * ) ptr;*/ return ( PAPI_OK ); } /* * */ int EMON_stop( hwd_context_t * ctx, hwd_control_state_t * ptr ) { EMONDBG( "EMON_stop\n" ); ( void ) ctx; EMON_control_state_t * this_state = ( EMON_control_state_t * ) ptr; return _emon_accessor( this_state ); } /* * */ int EMON_read( hwd_context_t * ctx, hwd_control_state_t * ptr, long long ** events, int flags ) { EMONDBG( "EMON_read\n" ); ( void ) ctx; ( void ) flags; int ret; EMON_control_state_t * this_state = ( EMON_control_state_t * ) ptr; ret = _emon_accessor( this_state ); *events = this_state->counters; return ret; } /* * */ int EMON_shutdown_thread( hwd_context_t * ctx ) { EMONDBG( 
"EMON_shutdown_thread\n" ); ( void ) ctx; return ( PAPI_OK ); } int EMON_shutdown_component( void ) { EMONDBG( "EMON_shutdown_component\n" ); return ( PAPI_OK ); } /* This function sets various options in the component * The valid codes being passed in are PAPI_SET_DEFDOM, * PAPI_SET_DOMAIN, PAPI_SETDEFGRN, PAPI_SET_GRANUL * and PAPI_SET_INHERIT */ int EMON_ctl( hwd_context_t * ctx, int code, _papi_int_option_t * option ) { EMONDBG( "EMON_ctl\n" ); ( void ) ctx; ( void ) code; ( void ) option; return ( PAPI_OK ); } /* * PAPI Cleanup Eventset */ int EMON_cleanup_eventset( hwd_control_state_t * ctrl ) { EMONDBG( "EMON_cleanup_eventset\n" ); EMON_control_state_t * this_state = ( EMON_control_state_t * ) ctrl; ( void ) this_state; return ( PAPI_OK ); } /* * */ int EMON_update_control_state( hwd_control_state_t * ptr, NativeInfo_t * native, int count, hwd_context_t * ctx ) { EMONDBG( "EMON_update_control_state: count = %d\n", count ); ( void ) ctx; int index, i; EMON_control_state_t * this_state = ( EMON_control_state_t * ) ptr; ( void ) ptr; // otherwise, add the events to the eventset for ( i = 0; i < count; i++ ) { index = ( native[i].ni_event ) ; native[i].ni_position = i; EMONDBG("EMON_update_control_state: ADD event: i = %d, index = %d\n", i, index ); } // store how many events we added to an EventSet this_state->count = count; return ( PAPI_OK ); } /* * As a system wide count, PAPI_DOM_ALL is all we support */ int EMON_set_domain( hwd_control_state_t * cntrl, int domain ) { EMONDBG( "EMON_set_domain\n" ); ( void ) cntrl; if ( PAPI_DOM_ALL != domain ) return ( PAPI_EINVAL ); return ( PAPI_OK ); } /* * */ int EMON_reset( hwd_context_t * ctx, hwd_control_state_t * ptr ) { EMONDBG( "EMON_reset\n" ); ( void ) ctx; int retval; EMON_control_state_t * this_state = ( EMON_control_state_t * ) ptr; ( void ) this_state; ( void ) retval; memset( this_state->counters, 0x0, sizeof(long long) * EMON_MAX_COUNTERS); return ( PAPI_OK ); } /* * Native Event functions */ int 
EMON_ntv_enum_events( unsigned int *EventCode, int modifier ) { EMONDBG( "EMON_ntv_enum_events, EventCode = %#x\n", *EventCode ); switch ( modifier ) { case PAPI_ENUM_FIRST: *EventCode = 0; return ( PAPI_OK ); break; case PAPI_ENUM_EVENTS: { int index = ( *EventCode ); /* make sure there is at least one more event after this one */ if ( index < EMON_TOTAL_EVENTS - 1 ) { *EventCode = *EventCode + 1; return ( PAPI_OK ); } else { return ( PAPI_ENOEVNT ); } break; } default: return ( PAPI_EINVAL ); } return ( PAPI_EINVAL ); } /* * */ int EMON_ntv_code_to_name( unsigned int EventCode, char *name, int len ) { EMONDBG( "EMON_ntv_code_to_name\n" ); int index; index = ( EventCode ); if ( index >= EMON_TOTAL_EVENTS || index < 0 ) { return PAPI_ENOEVNT; } strncpy( name, EMON_native_table[index].name, len ); return ( PAPI_OK ); } /* * */ int EMON_ntv_name_to_code( const char *name, unsigned int *code ) { int index; for ( index = 0; index < EMON_TOTAL_EVENTS; index++ ) { if ( 0 == strcmp( name, EMON_native_table[index].name ) ) { *code = index; return ( PAPI_OK ); } } /* no event with a matching name was found */ return ( PAPI_ENOEVNT ); } int EMON_ntv_code_to_descr( unsigned int EventCode, char *name, int len ) { EMONDBG( "EMON_ntv_code_to_descr\n" ); int index; index = ( EventCode ); if ( index >= EMON_TOTAL_EVENTS || index < 0 ) { return PAPI_ENOEVNT; } strncpy( name, EMON_native_table[index].description, len ); return ( PAPI_OK ); } /* * */ int EMON_ntv_code_to_bits( unsigned int EventCode, hwd_register_t * bits ) { EMONDBG( "EMON_ntv_code_to_bits\n" ); ( void ) EventCode; ( void ) bits; return ( PAPI_OK ); } int EMON_ntv_code_to_info(unsigned int EventCode, PAPI_event_info_t *info) { int index = EventCode; if ( ( index < 0) || (index >= EMON_TOTAL_EVENTS )) return PAPI_ENOEVNT; strncpy( info->symbol, EMON_native_table[index].name, sizeof(info->symbol)); strncpy( info->long_descr, EMON_native_table[index].description, sizeof(info->long_descr)); //strncpy( info->units, rapl_native_events[index].units, //sizeof(info->units)); info->data_type =
EMON_native_table[index].return_type; return PAPI_OK; } /* * */ papi_vector_t _emon_vector = { .cmp_info = { /* default component information (unspecified values are initialized to 0) */ .name = "EMON", .short_name = "EMON", .description = "Blue Gene/Q EMON component", .num_native_events = EMON_MAX_COUNTERS, .num_cntrs = EMON_MAX_COUNTERS, .num_mpx_cntrs = EMON_MAX_COUNTERS, .default_domain = PAPI_DOM_ALL, .available_domains = PAPI_DOM_ALL, .default_granularity = PAPI_GRN_SYS, .available_granularities = PAPI_GRN_SYS, .hardware_intr_sig = PAPI_INT_SIGNAL, .hardware_intr = 1, .kernel_multiplex = 0, /* component specific cmp_info initializations */ .fast_real_timer = 0, .fast_virtual_timer = 0, .attach = 0, .attach_must_ptrace = 0, } , /* sizes of framework-opaque component-private structures */ .size = { .context = sizeof ( EMON_context_t ), .control_state = sizeof ( EMON_control_state_t ), .reg_value = sizeof ( EMON_register_t ), .reg_alloc = sizeof ( EMON_reg_alloc_t ), } , /* function pointers in this component */ .init_thread = EMON_init_thread, .init_component = EMON_init_component, .init_control_state = EMON_init_control_state, .start = EMON_start, .stop = EMON_stop, .read = EMON_read, .shutdown_thread = EMON_shutdown_thread, .shutdown_component = EMON_shutdown_component, .cleanup_eventset = EMON_cleanup_eventset, .ctl = EMON_ctl, .update_control_state = EMON_update_control_state, .set_domain = EMON_set_domain, .reset = EMON_reset, .ntv_enum_events = EMON_ntv_enum_events, .ntv_code_to_name = EMON_ntv_code_to_name, .ntv_code_to_descr = EMON_ntv_code_to_descr, .ntv_code_to_bits = EMON_ntv_code_to_bits, .ntv_code_to_info = EMON_ntv_code_to_info, }; papi-papi-7-2-0-t/src/components/example/000077500000000000000000000000001502707512200202475ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/example/README.md000066400000000000000000000012711502707512200215270ustar00rootroot00000000000000# EXAMPLE Component The EXAMPLE component demos the component 
interface and implements three example counters. * [Enabling the EXAMPLE Component](#enabling-the-example-component) *** ## Enabling the EXAMPLE Component To enable reading of EXAMPLE counters, the user needs to link against a PAPI library that was configured with the EXAMPLE component enabled. For example, the following command: `./configure --with-components="example"` is sufficient to enable the component. Typically, the utility `papi_components_avail` (available in `papi/src/utils/papi_components_avail`) will display the components available to the user, whether each is disabled, and if disabled, why. papi-papi-7-2-0-t/src/components/example/Rules.example COMPSRCS += components/example/example.c COMPOBJS += example.o example.o: components/example/example.c components/example/example.h $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/example/example.c -o example.o papi-papi-7-2-0-t/src/components/example/example.c /** * @file example.c * @author Joachim Protze * joachim.protze@zih.tu-dresden.de * @author Vince Weaver * vweaver1@eecs.utk.edu * * @ingroup papi_components * * @brief * This is an example component, it demos the component interface * and implements three example counters. */ #include <stdio.h> #include <stdlib.h> #include <string.h> /* Headers required by PAPI */ #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" /* defines papi_malloc(), etc. */ /** This driver supports three counters counting at once */ /* This is artificially low to allow testing of multiplexing */ #define EXAMPLE_MAX_SIMULTANEOUS_COUNTERS 3 #define EXAMPLE_MAX_MULTIPLEX_COUNTERS 4 // The following macro is invoked if a string function has an error. It should // never happen; but it is necessary to prevent compiler warnings.
We print // something just in case there is programmer error in invoking the function. #define HANDLE_STRING_ERROR {fprintf(stderr,"%s:%i unexpected string function error.\n",__FILE__,__LINE__); exit(-1);} /* Declare our vector in advance */ /* This allows us to modify the component info */ papi_vector_t _example_vector; /** Structure that stores private information for each event */ typedef struct example_register { unsigned int selector; /**< Signifies which counter slot is being used */ /**< Indexed from 1 as 0 has a special meaning */ } example_register_t; /** This structure is used to build the table of events */ /* The contents of this structure will vary based on */ /* your component, however having name and description */ /* fields is probably useful. */ typedef struct example_native_event_entry { example_register_t resources; /**< Per counter resources */ char name[PAPI_MAX_STR_LEN]; /**< Name of the counter */ char description[PAPI_MAX_STR_LEN]; /**< Description of the counter */ int writable; /**< Whether counter is writable */ /* any other counter parameters go here */ } example_native_event_entry_t; /** This structure is used when doing register allocation; it is possibly not necessary when there are no register constraints */ typedef struct example_reg_alloc { example_register_t ra_bits; } example_reg_alloc_t; /** Holds control flags. * There's one of these per event-set.
* Use this to hold data specific to the EventSet, either hardware * counter settings or things like counter start values */ typedef struct example_control_state { int num_events; int domain; int multiplexed; int overflow; int inherit; int which_counter[EXAMPLE_MAX_SIMULTANEOUS_COUNTERS]; long long counter[EXAMPLE_MAX_MULTIPLEX_COUNTERS]; /**< Copy of counts, holds results when stopped */ } example_control_state_t; /** Holds per-thread information */ typedef struct example_context { long long autoinc_value; } example_context_t; /** This table contains the native events */ static example_native_event_entry_t *example_native_table; /** number of events in the table */ static int num_events = 0; /*************************************************************************/ /* Below is the actual "hardware implementation" of our example counters */ /*************************************************************************/ #define EXAMPLE_ZERO_REG 0 #define EXAMPLE_CONSTANT_REG 1 #define EXAMPLE_AUTOINC_REG 2 #define EXAMPLE_GLOBAL_AUTOINC_REG 3 #define EXAMPLE_TOTAL_EVENTS 4 static long long example_global_autoinc_value = 0; /** Code that resets the hardware. */ static void example_hardware_reset( example_context_t *ctx ) { /* reset per-thread count */ ctx->autoinc_value=0; /* reset global count */ example_global_autoinc_value = 0; } /** Code that reads event values. */ /* You might replace this with code that accesses */ /* hardware or reads values from the operating system.
*/ static long long example_hardware_read( int which_one, example_context_t *ctx ) { long long old_value; switch ( which_one ) { case EXAMPLE_ZERO_REG: return 0; case EXAMPLE_CONSTANT_REG: return 42; case EXAMPLE_AUTOINC_REG: old_value = ctx->autoinc_value; ctx->autoinc_value++; return old_value; case EXAMPLE_GLOBAL_AUTOINC_REG: old_value = example_global_autoinc_value; example_global_autoinc_value++; return old_value; default: fprintf(stderr,"Invalid counter read %#x\n",which_one ); return -1; } return 0; } /** Code that writes event values. */ static int example_hardware_write( int which_one, example_context_t *ctx, long long value) { switch ( which_one ) { case EXAMPLE_ZERO_REG: case EXAMPLE_CONSTANT_REG: return PAPI_OK; /* can't be written */ case EXAMPLE_AUTOINC_REG: ctx->autoinc_value=value; return PAPI_OK; case EXAMPLE_GLOBAL_AUTOINC_REG: example_global_autoinc_value=value; return PAPI_OK; default: perror( "Invalid counter write" ); return -1; } return 0; } static int detect_example(void) { return PAPI_OK; } /********************************************************************/ /* Below are the functions required by the PAPI component interface */ /********************************************************************/ /** Initialize hardware counters, setup the function vector table * and get hardware information, this routine is called when the * PAPI process is initialized (IE PAPI_library_init) */ static int _example_init_component( int cidx ) { int retval = PAPI_OK; SUBDBG( "_example_init_component..." ); /* First, detect that our hardware is available */ if (detect_example()!=PAPI_OK) { int strErr=snprintf(_example_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "Example Hardware not present."); _example_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN-1]=0; // force null termination. 
if (strErr > PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; retval = PAPI_ENOSUPP; goto fn_fail; } /* we know in advance how many events we want */ /* for actual hardware this might have to be determined dynamically */ num_events = EXAMPLE_TOTAL_EVENTS; /* Allocate memory for the our native event table */ example_native_table = ( example_native_event_entry_t * ) papi_calloc( num_events, sizeof(example_native_event_entry_t) ); if ( example_native_table == NULL ) { int strErr=snprintf(_example_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "Could not allocate %lu bytes of memory for EXAMPLE device structure.", num_events*sizeof(example_native_event_entry_t)); _example_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN-1]=0; // force null termination. if (strErr > PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; retval = PAPI_ENOMEM; goto fn_fail; } /* fill in the event table parameters */ /* for complicated components this will be done dynamically */ /* or by using an external library */ strcpy( example_native_table[0].name, "EXAMPLE_ZERO" ); strcpy( example_native_table[0].description, "This is an example counter, that always returns 0" ); example_native_table[0].writable = 0; strcpy( example_native_table[1].name, "EXAMPLE_CONSTANT" ); strcpy( example_native_table[1].description, "This is an example counter, that always returns a constant value of 42" ); example_native_table[1].writable = 0; strcpy( example_native_table[2].name, "EXAMPLE_AUTOINC" ); strcpy( example_native_table[2].description, "This is an example counter, that reports a per-thread auto-incrementing value" ); example_native_table[2].writable = 1; strcpy( example_native_table[3].name, "EXAMPLE_GLOBAL_AUTOINC" ); strcpy( example_native_table[3].description, "This is an example counter, that reports a global auto-incrementing value" ); example_native_table[3].writable = 1; /* Export the total number of events available */ _example_vector.cmp_info.num_native_events = num_events; /* Export the component id */ 
_example_vector.cmp_info.CmpIdx = cidx; fn_exit: _papi_hwd[cidx]->cmp_info.disabled = retval; return retval; fn_fail: goto fn_exit; } /** This is called whenever a thread is initialized */ static int _example_init_thread( hwd_context_t *ctx ) { example_context_t *example_context = (example_context_t *)ctx; example_context->autoinc_value=0; SUBDBG( "_example_init_thread %p...", ctx ); return PAPI_OK; } /** Setup a counter control state. * In general a control state holds the hardware info for an * EventSet. */ static int _example_init_control_state( hwd_control_state_t * ctl ) { SUBDBG( "example_init_control_state... %p\n", ctl ); example_control_state_t *example_ctl = ( example_control_state_t * ) ctl; memset( example_ctl, 0, sizeof ( example_control_state_t ) ); return PAPI_OK; } /** Triggered by eventset operations like add or remove */ static int _example_update_control_state( hwd_control_state_t *ctl, NativeInfo_t *native, int count, hwd_context_t *ctx ) { (void) ctx; int i, index; example_control_state_t *example_ctl = ( example_control_state_t * ) ctl; SUBDBG( "_example_update_control_state %p %p...", ctl, ctx ); /* if no events, return */ if (count==0) return PAPI_OK; for( i = 0; i < count; i++ ) { index = native[i].ni_event; /* Map counter #i to Measure Event "index" */ example_ctl->which_counter[i]=index; /* We have no constraints on event position, so any event */ /* can be in any slot. */ native[i].ni_position = i; } example_ctl->num_events=count; return PAPI_OK; } /** Triggered by PAPI_start() */ static int _example_start( hwd_context_t *ctx, hwd_control_state_t *ctl ) { (void) ctx; (void) ctl; SUBDBG( "example_start %p %p...", ctx, ctl ); /* anything that would need to be set at counter start time */ /* reset counters? 
*/ /* For hardware that cannot reset counters, store initial */ /* counter state to the ctl and subtract it off at read time */ /* start the counting ?*/ return PAPI_OK; } /** Triggered by PAPI_stop() */ static int _example_stop( hwd_context_t *ctx, hwd_control_state_t *ctl ) { (void) ctx; (void) ctl; SUBDBG( "example_stop %p %p...", ctx, ctl ); /* anything that would need to be done at counter stop time */ return PAPI_OK; } /** Triggered by PAPI_read() */ /* flags field is never set? */ static int _example_read( hwd_context_t *ctx, hwd_control_state_t *ctl, long long **events, int flags ) { (void) flags; example_context_t *example_ctx = (example_context_t *) ctx; example_control_state_t *example_ctl = ( example_control_state_t *) ctl; SUBDBG( "example_read... %p %d", ctx, flags ); int i; /* Read counters into expected slot */ for(i=0;i<example_ctl->num_events;i++) { example_ctl->counter[i] = example_hardware_read( example_ctl->which_counter[i], example_ctx ); } /* return pointer to the values we read */ *events = example_ctl->counter; return PAPI_OK; } /** Triggered by PAPI_write(), but only if the counters are running */ /* otherwise, the updated state is written to ESI->hw_start */ static int _example_write( hwd_context_t *ctx, hwd_control_state_t *ctl, long long *events ) { example_context_t *example_ctx = (example_context_t *) ctx; example_control_state_t *example_ctl = ( example_control_state_t *) ctl; int i; SUBDBG( "example_write... %p %p", ctx, ctl ); /* Write counters into expected slot */ for(i=0;i<example_ctl->num_events;i++) { example_hardware_write( example_ctl->which_counter[i], example_ctx, events[i] ); } return PAPI_OK; } /** Triggered by PAPI_reset() but only if the EventSet is currently running */ /* If the eventset is not currently running, then the saved value in the */ /* EventSet is set to zero without calling this routine.
*/ static int _example_reset( hwd_context_t *ctx, hwd_control_state_t *ctl ) { example_context_t *event_ctx = (example_context_t *)ctx; (void) ctl; SUBDBG( "example_reset ctx=%p ctrl=%p...", ctx, ctl ); /* Reset the hardware */ example_hardware_reset( event_ctx ); return PAPI_OK; } /** Triggered by PAPI_shutdown() */ static int _example_shutdown_component(void) { SUBDBG( "example_shutdown_component..." ); /* Free anything we allocated */ papi_free(example_native_table); return PAPI_OK; } /** Called at thread shutdown */ static int _example_shutdown_thread( hwd_context_t *ctx ) { (void) ctx; SUBDBG( "example_shutdown_thread... %p", ctx ); /* Last chance to clean up thread */ return PAPI_OK; } /** This function sets various options in the component @param[in] ctx -- hardware context @param[in] code valid are PAPI_SET_DEFDOM, PAPI_SET_DOMAIN, PAPI_SETDEFGRN, PAPI_SET_GRANUL and PAPI_SET_INHERIT @param[in] option -- options to be set */ static int _example_ctl( hwd_context_t *ctx, int code, _papi_int_option_t *option ) { (void) ctx; (void) code; (void) option; SUBDBG( "example_ctl..." ); return PAPI_OK; } /** This function has to set the bits needed to count different domains In particular: PAPI_DOM_USER, PAPI_DOM_KERNEL PAPI_DOM_OTHER By default return PAPI_EINVAL if none of those are specified and PAPI_OK with success PAPI_DOM_USER is only user context is counted PAPI_DOM_KERNEL is only the Kernel/OS context is counted PAPI_DOM_OTHER is Exception/transient mode (like user TLB misses) PAPI_DOM_ALL is all of the domains */ static int _example_set_domain( hwd_control_state_t * cntrl, int domain ) { (void) cntrl; int found = 0; SUBDBG( "example_set_domain..." 
); if ( PAPI_DOM_USER & domain ) { SUBDBG( " PAPI_DOM_USER " ); found = 1; } if ( PAPI_DOM_KERNEL & domain ) { SUBDBG( " PAPI_DOM_KERNEL " ); found = 1; } if ( PAPI_DOM_OTHER & domain ) { SUBDBG( " PAPI_DOM_OTHER " ); found = 1; } if ( PAPI_DOM_ALL & domain ) { SUBDBG( " PAPI_DOM_ALL " ); found = 1; } if ( !found ) return ( PAPI_EINVAL ); return PAPI_OK; } /**************************************************************/ /* Naming functions, used to translate event numbers to names */ /**************************************************************/ /** Enumerate Native Events * @param EventCode is the event of interest * @param modifier is one of PAPI_ENUM_FIRST, PAPI_ENUM_EVENTS * If your component has attribute masks then these need to * be handled here as well. */ static int _example_ntv_enum_events( unsigned int *EventCode, int modifier ) { int index; switch ( modifier ) { /* return EventCode of first event */ case PAPI_ENUM_FIRST: /* return the first event that we support */ *EventCode = 0; return PAPI_OK; /* return EventCode of next available event */ case PAPI_ENUM_EVENTS: index = *EventCode; /* Make sure we have at least 1 more event after us */ if ( index < num_events - 1 ) { /* This assumes a non-sparse mapping of the events */ *EventCode = *EventCode + 1; return PAPI_OK; } else { return PAPI_ENOEVNT; } break; default: return PAPI_EINVAL; } return PAPI_EINVAL; } /** Takes a native event code and passes back the name * @param EventCode is the native event code * @param name is a pointer for the name to be copied to * @param len is the size of the name string */ static int _example_ntv_code_to_name( unsigned int EventCode, char *name, int len ) { int index; index = EventCode; /* Make sure we are in range */ if (index >= 0 && index < num_events) { strncpy( name, example_native_table[index].name, len ); return PAPI_OK; } return PAPI_ENOEVNT; } /** Takes a native event code and passes back the event description * @param EventCode is the native event code * 
@param descr is a pointer for the description to be copied to * @param len is the size of the descr string */ static int _example_ntv_code_to_descr( unsigned int EventCode, char *descr, int len ) { int index; index = EventCode; /* make sure event is in range */ if (index >= 0 && index < num_events) { strncpy( descr, example_native_table[index].description, len ); return PAPI_OK; } return PAPI_ENOEVNT; } /** Vector that points to entry points for our component */ papi_vector_t _example_vector = { .cmp_info = { /* default component information */ /* (unspecified values are initialized to 0) */ /* we explicitly set them to zero in this example */ /* to show what settings are available */ .name = "example", .short_name = "example", .description = "A simple example component", .version = "1.15", .support_version = "n/a", .kernel_version = "n/a", .num_cntrs = EXAMPLE_MAX_SIMULTANEOUS_COUNTERS, .num_mpx_cntrs = EXAMPLE_MAX_SIMULTANEOUS_COUNTERS, .default_domain = PAPI_DOM_USER, .available_domains = PAPI_DOM_USER, .default_granularity = PAPI_GRN_THR, .available_granularities = PAPI_GRN_THR, .hardware_intr_sig = PAPI_INT_SIGNAL, /* component specific cmp_info initializations */ }, /* sizes of framework-opaque component-private structures */ .size = { /* once per thread */ .context = sizeof ( example_context_t ), /* once per eventset */ .control_state = sizeof ( example_control_state_t ), /* ?? */ .reg_value = sizeof ( example_register_t ), /* ?? 
*/ .reg_alloc = sizeof ( example_reg_alloc_t ), }, /* function pointers */ /* by default they are set to NULL */ /* Used for general PAPI interactions */ .start = _example_start, .stop = _example_stop, .read = _example_read, .reset = _example_reset, .write = _example_write, .init_component = _example_init_component, .init_thread = _example_init_thread, .init_control_state = _example_init_control_state, .update_control_state = _example_update_control_state, .ctl = _example_ctl, .shutdown_thread = _example_shutdown_thread, .shutdown_component = _example_shutdown_component, .set_domain = _example_set_domain, /* .cleanup_eventset = NULL, */ /* called in add_native_events() */ /* .allocate_registers = NULL, */ /* Used for overflow/profiling */ /* .dispatch_timer = NULL, */ /* .get_overflow_address = NULL, */ /* .stop_profiling = NULL, */ /* .set_overflow = NULL, */ /* .set_profile = NULL, */ /* ??? */ /* .user = NULL, */ /* Name Mapping Functions */ .ntv_enum_events = _example_ntv_enum_events, .ntv_code_to_name = _example_ntv_code_to_name, .ntv_code_to_descr = _example_ntv_code_to_descr, /* if .ntv_name_to_code not available, PAPI emulates */ /* it by enumerating all events and looking manually */ .ntv_name_to_code = NULL, /* These are only used by _papi_hwi_get_native_event_info() */ /* Which currently only uses the info for printing native */ /* event info, not for any sort of internal use. 
*/ /* .ntv_code_to_bits = NULL, */ }; papi-papi-7-2-0-t/src/components/example/example.h papi-papi-7-2-0-t/src/components/example/tests/ papi-papi-7-2-0-t/src/components/example/tests/Makefile NAME=example include ../../Makefile_comp_tests.target %.o:%.c $(CC) $(CFLAGS) $(OPTFLAGS) $(INCLUDE) -c -o $@ $< TESTS = example_basic example_multiple_components example_tests: $(TESTS) example_basic: example_basic.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o example_basic example_basic.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) example_multiple_components: example_multiple_components.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o example_multiple_components example_multiple_components.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) clean: rm -f $(TESTS) *.o papi-papi-7-2-0-t/src/components/example/tests/example_basic.c /****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file example_basic.c * @author Vince Weaver * vweaver1@eecs.utk.edu * test case for Example component * * * @brief * This file is a very simple example test and Makefile that act * as a guideline on how to add tests to components. * The papi configure and papi Makefile will take care of the compilation * of the component tests (if all tests are added to a directory named * 'tests' in the specific component dir). * See components/README for more details.
*/ #include <stdio.h> #include <stdlib.h> #include <string.h> #include "papi.h" #include "papi_test.h" #define NUM_EVENTS 3 int main (int argc, char **argv) { int retval,i; int EventSet = PAPI_NULL; long long values[NUM_EVENTS]; const PAPI_component_info_t *cmpinfo = NULL; int numcmp,cid,example_cid=-1; int code,maximum_code=0; char event_name[PAPI_MAX_STR_LEN]; PAPI_event_info_t event_info; int quiet=0; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); /* PAPI Initialization */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__,"PAPI_library_init failed\n",retval); } if (!quiet) { printf( "Testing example component with PAPI %d.%d.%d\n", PAPI_VERSION_MAJOR( PAPI_VERSION ), PAPI_VERSION_MINOR( PAPI_VERSION ), PAPI_VERSION_REVISION( PAPI_VERSION ) ); } /* Find our component */ numcmp = PAPI_num_components(); for( cid=0; cid<numcmp; cid++) { cmpinfo = PAPI_get_component_info( cid ); if (cmpinfo==NULL) { test_fail(__FILE__, __LINE__, "PAPI_get_component_info() failed\n", 0); } if (!quiet) { printf("\tComponent %d - %d events - %s\n", cid, cmpinfo->num_native_events, cmpinfo->name); } if (strstr(cmpinfo->name,"example")) { /* FOUND! */ example_cid=cid; } } if (example_cid<0) { test_skip(__FILE__, __LINE__, "Example component not found\n", 0); } if (!quiet) { printf("\nFound Example Component at id %d\n",example_cid); printf("\nListing all events in this component:\n"); } /**************************************************/ /* Listing all available events in this component */ /* Along with descriptions */ /**************************************************/ code = PAPI_NATIVE_MASK; retval = PAPI_enum_cmp_event( &code, PAPI_ENUM_FIRST, example_cid ); while ( retval == PAPI_OK ) { if (PAPI_event_code_to_name( code, event_name )!=PAPI_OK) { printf("Error translating %#x\n",code); test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } if (PAPI_get_event_info( code, &event_info)!=PAPI_OK) { printf("Error getting info for event %#x\n",code); test_fail( __FILE__, __LINE__, "PAPI_get_event_info()", retval ); } if (!quiet) { printf("\tEvent %#x: %s -- %s\n", code,event_name,event_info.long_descr); } maximum_code=code; retval =
PAPI_enum_cmp_event( &code, PAPI_ENUM_EVENTS, example_cid ); } if (!quiet) printf("\n"); /**********************************/ /* Try accessing an invalid event */ /**********************************/ retval=PAPI_event_code_to_name( maximum_code+10, event_name ); if (retval!=PAPI_ENOEVNT) { test_fail( __FILE__, __LINE__, "Failed to return PAPI_ENOEVNT on invalid event", retval ); } /***********************************/ /* Test the EXAMPLE_ZERO event */ /***********************************/ retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset() failed\n", retval ); } retval = PAPI_event_name_to_code("EXAMPLE_ZERO", &code); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "EXAMPLE_ZERO not found\n",retval ); } retval = PAPI_add_event( EventSet, code); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_add_events failed\n", retval ); } retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start failed\n",retval ); } retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop failed\n", retval); } if (!quiet) printf("Testing EXAMPLE_ZERO: %lld\n",values[0]); if (values[0]!=0) { test_fail( __FILE__, __LINE__, "Result should be 0!\n", 0); } retval = PAPI_cleanup_eventset(EventSet); if (retval != PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset!\n", retval); } retval = PAPI_destroy_eventset(&EventSet); if (retval != PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset!\n", retval); } EventSet=PAPI_NULL; /***********************************/ /* Test the EXAMPLE_CONSTANT event */ /***********************************/ retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset() failed\n", retval ); } retval = PAPI_event_name_to_code("EXAMPLE_CONSTANT", &code); if ( retval != PAPI_OK ) { test_fail( 
__FILE__, __LINE__, "EXAMPLE_CONSTANT not found\n",retval ); } retval = PAPI_add_event( EventSet, code); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_add_events failed\n", retval ); } retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start failed\n",retval ); } retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop failed\n", retval); } if (!quiet) printf("Testing EXAMPLE_CONSTANT: %lld\n",values[0]); if (values[0]!=42) { test_fail( __FILE__, __LINE__, "Result should be 42!\n", 0); } retval = PAPI_cleanup_eventset(EventSet); if (retval != PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset!\n", retval); } retval = PAPI_destroy_eventset(&EventSet); if (retval != PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset!\n", retval); } EventSet=PAPI_NULL; /***********************************/ /* Test the EXAMPLE_AUTOINC event */ /***********************************/ retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset() failed\n", retval ); } retval = PAPI_event_name_to_code("EXAMPLE_AUTOINC", &code); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "EXAMPLE_AUTOINC not found\n",retval ); } retval = PAPI_add_event( EventSet, code); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_add_events failed\n", retval ); } if (!quiet) printf("Testing EXAMPLE_AUTOINC: "); for(i=0;i<10;i++) { retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start failed\n",retval ); } retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop failed\n", retval); } if (!quiet) printf("%lld ",values[0]); if (values[0]!=i) { test_fail( __FILE__, __LINE__, "Result wrong!\n", 0); } } if (!quiet) printf("\n"); /***********************************/ /* Test multiple reads */ 
/***********************************/ retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start failed\n",retval ); } for(i=0;i<10;i++) { retval=PAPI_read( EventSet, values); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_read failed\n", retval); } if (!quiet) printf("%lld ",values[0]); } retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop failed\n", retval); } if (!quiet) printf("%lld\n",values[0]); // if (values[0]!=i) { // test_fail( __FILE__, __LINE__, "Result wrong!\n", 0); //} /***********************************/ /* Test PAPI_reset() */ /***********************************/ retval = PAPI_reset( EventSet); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_reset() failed\n",retval ); } retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start failed\n",retval ); } retval = PAPI_reset( EventSet); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_reset() failed\n",retval ); } retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop failed\n", retval); } if (!quiet) printf("Testing EXAMPLE_AUTOINC after PAPI_reset(): %lld\n", values[0]); if (values[0]!=0) { test_fail( __FILE__, __LINE__, "Result not zero!\n", 0); } retval = PAPI_cleanup_eventset(EventSet); if (retval != PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset!\n", retval); } retval = PAPI_destroy_eventset(&EventSet); if (retval != PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset!\n", retval); } EventSet=PAPI_NULL; /***********************************/ /* Test multiple events */ /***********************************/ if (!quiet) printf("Testing Multiple Events: "); retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset() failed\n", retval ); } retval = 
PAPI_event_name_to_code("EXAMPLE_CONSTANT", &code);
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "EXAMPLE_CONSTANT not found\n",retval );
	}

	retval = PAPI_add_event( EventSet, code);
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_add_events failed\n", retval );
	}

	retval = PAPI_event_name_to_code("EXAMPLE_GLOBAL_AUTOINC", &code);
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "EXAMPLE_GLOBAL_AUTOINC not found\n",retval );
	}

	retval = PAPI_add_event( EventSet, code);
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_add_events failed\n", retval );
	}

	retval = PAPI_event_name_to_code("EXAMPLE_ZERO", &code);
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "EXAMPLE_ZERO not found\n",retval );
	}

	retval = PAPI_add_event( EventSet, code);
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_add_events failed\n", retval );
	}

	retval = PAPI_start( EventSet );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_start failed\n",retval );
	}

	retval = PAPI_stop( EventSet, values );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_stop failed\n", retval);
	}

	if (!quiet) {
		for(i=0;i<3;i++) {
			printf("%lld ",values[i]);
		}
		printf("\n");
	}

	if (values[0]!=42) {
		test_fail( __FILE__, __LINE__, "Result should be 42!\n", 0);
	}

	if (values[2]!=0) {
		test_fail( __FILE__, __LINE__, "Result should be 0!\n", 0);
	}

	retval = PAPI_cleanup_eventset(EventSet);
	if (retval != PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset!\n", retval);
	}

	retval = PAPI_destroy_eventset(&EventSet);
	if (retval != PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset!\n", retval);
	}

	EventSet=PAPI_NULL;

	/***********************************/
	/* Test writing to an event        */
	/***********************************/

	if (!quiet) printf("Testing Write\n");

	retval = PAPI_create_eventset( &EventSet );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset() failed\n", retval );
	}

	retval =
PAPI_event_name_to_code("EXAMPLE_CONSTANT", &code);
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "EXAMPLE_CONSTANT not found\n",retval );
	}

	retval = PAPI_add_event( EventSet, code);
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_add_events failed\n", retval );
	}

	retval = PAPI_event_name_to_code("EXAMPLE_GLOBAL_AUTOINC", &code);
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "EXAMPLE_GLOBAL_AUTOINC not found\n",retval );
	}

	retval = PAPI_add_event( EventSet, code);
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_add_events failed\n", retval );
	}

	retval = PAPI_event_name_to_code("EXAMPLE_ZERO", &code);
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "EXAMPLE_ZERO not found\n",retval );
	}

	retval = PAPI_add_event( EventSet, code);
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_add_events failed\n", retval );
	}

	retval = PAPI_start( EventSet );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_start failed\n",retval );
	}

	retval = PAPI_read ( EventSet, values );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_read failed\n",retval );
	}

	if (!quiet) {
		printf("Before values: ");
		for(i=0;i<3;i++) {
			printf("%lld ",values[i]);
		}
		printf("\n");
	}

	values[0]=100;
	values[1]=200;
	values[2]=300;

	retval = PAPI_write ( EventSet, values );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_write failed\n",retval );
	}

	retval = PAPI_stop( EventSet, values );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_stop failed\n", retval);
	}

	if (!quiet) {
		printf("After values: ");
		for(i=0;i<3;i++) {
			printf("%lld ",values[i]);
		}
		printf("\n");
	}

	if (values[0]!=42) {
		test_fail( __FILE__, __LINE__, "Result should be 42!\n", 0);
	}

	if (values[1]!=200) {
		test_fail( __FILE__, __LINE__, "Result should be 200!\n", 0);
	}

	if (values[2]!=0) {
		test_fail( __FILE__, __LINE__, "Result should be 0!\n", 0);
	}

	retval = PAPI_cleanup_eventset(EventSet);
	if (retval != PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset!\n", retval);
	}

	retval = PAPI_destroy_eventset(&EventSet);
	if (retval != PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset!\n", retval);
	}

	EventSet=PAPI_NULL;

	/************/
	/* All Done */
	/************/

	if (!quiet) printf("\n");

	test_pass( __FILE__ );

	return 0;
}

papi-papi-7-2-0-t/src/components/example/tests/example_multiple_components.c

/****************************/
/* THIS IS OPEN SOURCE CODE */
/****************************/

/**
 * @file   example_multiple_components.c
 * @author Vince Weaver
 *         vweaver1@eecs.utk.edu
 * test if multiple components can be used at once
 *
 *
 * @brief
 *   This tests to see if the CPU component and Example component
 *   can be used simultaneously.
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "papi.h"
#include "papi_test.h"

#define NUM_EVENTS 1

int main (int argc, char **argv)
{
	int retval;
	int EventSet1 = PAPI_NULL, EventSet2 = PAPI_NULL;
	long long values1[NUM_EVENTS];
	long long values2[NUM_EVENTS];
	const PAPI_component_info_t *cmpinfo = NULL;
	int numcmp,cid,example_cid=-1;
	int code;
	int quiet=0;

	/* Set TESTS_QUIET variable */
	quiet=tests_quiet( argc, argv );

	/* PAPI Initialization */
	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail(__FILE__, __LINE__,"PAPI_library_init failed\n",retval);
	}

	if (!quiet) {
		printf( "Testing simultaneous component use with PAPI %d.%d.%d\n",
			PAPI_VERSION_MAJOR( PAPI_VERSION ),
			PAPI_VERSION_MINOR( PAPI_VERSION ),
			PAPI_VERSION_REVISION( PAPI_VERSION ) );
	}

	/* Find our component */
	numcmp = PAPI_num_components();
	for( cid=0; cid<numcmp; cid++) {
		cmpinfo = PAPI_get_component_info(cid);
		if (cmpinfo==NULL) {
			test_fail( __FILE__, __LINE__,
				"PAPI_get_component_info() failed\n", 0 );
		}
		if (!quiet) {
			printf("Component %d: %d native events (%s)\n", cid,
				cmpinfo->num_native_events, cmpinfo->name);
		}
		if (strstr(cmpinfo->name,"example")) {
			/* FOUND!
 */
			example_cid=cid;
		}
	}

	if (example_cid<0) {
		test_skip(__FILE__, __LINE__,
			"Example component not found\n", 0);
	}

	if (!quiet) {
		printf("\nFound Example Component at id %d\n",example_cid);
	}

	/* Create an eventset for the Example component */
	retval = PAPI_create_eventset( &EventSet1 );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset() failed\n", retval );
	}

	retval = PAPI_event_name_to_code("EXAMPLE_CONSTANT", &code);
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "EXAMPLE_CONSTANT not found\n",retval );
	}

	retval = PAPI_add_event( EventSet1, code);
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_add_events failed\n", retval );
	}

	/* Create an eventset for the CPU component */
	retval = PAPI_create_eventset( &EventSet2 );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset() failed\n", retval );
	}

	retval = PAPI_event_name_to_code("PAPI_TOT_CYC", &code);
	if ( retval != PAPI_OK ) {
		test_skip( __FILE__, __LINE__, "PAPI_TOT_CYC not available\n",retval );
	}

	retval = PAPI_add_event( EventSet2, code);
	if ( retval != PAPI_OK ) {
		test_skip( __FILE__, __LINE__, "NO CPU component found\n", retval );
	}

	if (!quiet) printf("\nStarting EXAMPLE_CONSTANT and PAPI_TOT_CYC at the same time\n");

	/* Start CPU component event */
	retval = PAPI_start( EventSet2 );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_start failed\n",retval );
	}

	/* Start example component */
	retval = PAPI_start( EventSet1 );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_start failed\n",retval );
	}

	/* Stop example component */
	retval = PAPI_stop( EventSet1, values1 );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_stop failed\n", retval);
	}

	/* Stop CPU component */
	retval = PAPI_stop( EventSet2, values2 );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_stop failed\n", retval);
	}

	if (!quiet) printf("Stopping EXAMPLE_CONSTANT and PAPI_TOT_CYC\n\n");

	if (!quiet)
		printf("Results from EXAMPLE_CONSTANT: %lld\n",values1[0]);

	if (values1[0]!=42) {
		test_fail( __FILE__, __LINE__, "Result should be 42!\n", 0);
	}

	if (!quiet) printf("Results from PAPI_TOT_CYC: %lld\n\n",values2[0]);

	if (values2[0]<1) {
		test_fail( __FILE__, __LINE__, "Result should be greater than 0\n", 0);
	}

	/* Cleanup EventSets */
	retval = PAPI_cleanup_eventset(EventSet1);
	if (retval != PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset!\n", retval);
	}

	retval = PAPI_cleanup_eventset(EventSet2);
	if (retval != PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset!\n", retval);
	}

	/* Destroy EventSets */
	retval = PAPI_destroy_eventset(&EventSet1);
	if (retval != PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset!\n", retval);
	}

	retval = PAPI_destroy_eventset(&EventSet2);
	if (retval != PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset!\n", retval);
	}

	test_pass( __FILE__ );

	return 0;
}

papi-papi-7-2-0-t/src/components/host_micpower/
papi-papi-7-2-0-t/src/components/host_micpower/Makefile.host_micpower.in

SYSMGMT_CFLAGS = @SYSMGMT_CFLAGS@
SYSMGMT_LIBS = @SYSMGMT_LIBS@

papi-papi-7-2-0-t/src/components/host_micpower/README.md

# HOST\_MICPOWER Component

The HOST\_MICPOWER component exports power information for Intel Xeon Phi
cards (MIC).

The component makes use of the MicAccessAPI distributed with the Intel
Manycore Platform Software Stack
(http://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-mpss),
specifically in the intel-mic-sysmgmt package.

* [Enabling the HOST\_MICPOWER Component](#enabling-the-host_micpower-component)
* [FAQ](#faq)

***

## Enabling the HOST\_MICPOWER Component

A configure script allows non-default locations for the sysmgmt SDK to be specified.
See:

    cd src/components/host_micpower
    ./configure --help

To enable reading of HOST\_MICPOWER counters the user needs to link against a
PAPI library that was configured with the HOST\_MICPOWER component enabled. As
an example, the following command:

`./configure --with-components="host_micpower"`

is sufficient to enable the component.

Typically, the utility `papi_components_avail` (available in
`papi/src/utils/papi_components_avail`) will display the components available
to the user, whether each is disabled, and, if disabled, why.

## FAQ

PAPI retrieves the data via the `MicGetPowerUsage` call. Per the SDK
documentation:

MicGetPowerUsage - Retrieve power usage values of the Intel® Xeon Phi™
coprocessor and its components.

Data Fields:

| Type      | Field  | Description                                                                                        |
|-----------|--------|----------------------------------------------------------------------------------------------------|
| MicPwrPws | total0 | Total power utilization by the Intel® Xeon Phi™ product codenamed "Knights Corner", averaged over Time Window 0 (uWatts). |
| MicPwrPws | total1 | Total power utilization by the Intel® Xeon Phi™ product codenamed "Knights Corner", averaged over Time Window 1 (uWatts). |
| MicPwrPws | inst   | Instantaneous power (uWatts).                                                                      |
| MicPwrPws | imax   | Max instantaneous power (uWatts).                                                                  |
| MicPwrPws | pcie   | PCI-E connector power (uWatts).                                                                    |
| MicPwrPws | c2x3   | 2x3 connector power (uWatts).                                                                      |
| MicPwrPws | c2x4   | 2x4 connector power (uWatts).                                                                      |
| MicPwrVrr | vccp   | Core rail (uVolts).                                                                                |
| MicPwrVrr | vddg   | Uncore rail (uVolts).                                                                              |
| MicPwrVrr | vddq   | Memory subsystem rail (uVolts).                                                                    |
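The enabling steps above can be condensed into the following build-and-verify
sequence. This is a hedged sketch of a configuration fragment, not a tested
recipe: it assumes you start at the top of the PAPI source tree, that the
sysmgmt SDK is installed in its default location, and it uses the component
listing utility under the name given earlier in this README (the exact name
may differ between PAPI versions).

```shell
# Sketch: configure PAPI with the host_micpower component, build,
# and check that the component is reported as enabled.
cd src
./configure --with-components="host_micpower"
make

# The listing utility should now show host_micpower; a "disabled"
# annotation here usually means the sysmgmt SDK headers/libraries
# were not found (see ./components/host_micpower/configure --help
# for non-default SDK locations).
./utils/papi_components_avail
```

If the SDK lives somewhere non-standard, the per-component configure script
mentioned above is the place to point PAPI at it before the top-level build.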
papi-papi-7-2-0-t/src/components/host_micpower/Rules.host_micpower

include components/host_micpower/Makefile.host_micpower

COMPSRCS += components/host_micpower/linux-host_micpower.c
COMPOBJS += linux-host_micpower.o

CFLAGS += -D MICACCESSAPI -D LINUX

# default install location
MPSSROOT ?= /opt/intel/mic
SYSMGT = $(MPSSROOT)/sysmgmt/sdk
LIBPATH = -L$(SYSMGT)/lib/Linux
#SCIF_LIBPATH=/usr/lib64

#LDFLAGS += $(LIBPATH) $(SCIF_LIBPATH) -lpthread -ldl
LDFLAGS += -pthread $(LDL) $(SYSMGMT_LIBS)
CFLAGS += $(SYSMGMT_CFLAGS)

linux-host_micpower.o: components/host_micpower/linux-host_micpower.c $(HEADERS)
	$(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/host_micpower/linux-host_micpower.c -o linux-host_micpower.o

papi-papi-7-2-0-t/src/components/host_micpower/configure

#! /bin/sh
# Guess values for system-dependent variables and create Makefiles.
# Generated by GNU Autoconf 2.63 for host_micpower version-0.1.
#
# Copyright (C) 1992, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2001,
# 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc.
# This configure script is free software; the Free Software Foundation
# gives unlimited permission to copy, distribute and modify it.
## --------------------- ##
## M4sh Initialization.  ##
## --------------------- ##

# Be more Bourne compatible
DUALCASE=1; export DUALCASE # for MKS sh
if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then
  emulate sh
  NULLCMD=:
  # Pre-4.2 versions of Zsh do word splitting on ${1+"$@"}, which
  # is contrary to our usage.  Disable this feature.
  alias -g '${1+"$@"}'='"$@"'
  setopt NO_GLOB_SUBST
else
  case `(set -o) 2>/dev/null` in
  *posix*) set -o posix ;;
esac
fi

# PATH needs CR
# Avoid depending upon Character Ranges.
as_cr_letters='abcdefghijklmnopqrstuvwxyz' as_cr_LETTERS='ABCDEFGHIJKLMNOPQRSTUVWXYZ' as_cr_Letters=$as_cr_letters$as_cr_LETTERS as_cr_digits='0123456789' as_cr_alnum=$as_cr_Letters$as_cr_digits as_nl=' ' export as_nl # Printing a long string crashes Solaris 7 /usr/bin/printf. as_echo='\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\' as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo$as_echo if (test "X`printf %s $as_echo`" = "X$as_echo") 2>/dev/null; then as_echo='printf %s\n' as_echo_n='printf %s' else if test "X`(/usr/ucb/echo -n -n $as_echo) 2>/dev/null`" = "X-n $as_echo"; then as_echo_body='eval /usr/ucb/echo -n "$1$as_nl"' as_echo_n='/usr/ucb/echo -n' else as_echo_body='eval expr "X$1" : "X\\(.*\\)"' as_echo_n_body='eval arg=$1; case $arg in *"$as_nl"*) expr "X$arg" : "X\\(.*\\)$as_nl"; arg=`expr "X$arg" : ".*$as_nl\\(.*\\)"`;; esac; expr "X$arg" : "X\\(.*\\)" | tr -d "$as_nl" ' export as_echo_n_body as_echo_n='sh -c $as_echo_n_body as_echo' fi export as_echo_body as_echo='sh -c $as_echo_body as_echo' fi # The user is always right. if test "${PATH_SEPARATOR+set}" != set; then PATH_SEPARATOR=: (PATH='/bin;/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 && { (PATH='/bin:/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 || PATH_SEPARATOR=';' } fi # Support unset when possible. if ( (MAIL=60; unset MAIL) || exit) >/dev/null 2>&1; then as_unset=unset else as_unset=false fi # IFS # We need space, tab and new line, in precisely that order. Quoting is # there to prevent editors from complaining about space-tab. # (If _AS_PATH_WALK were called with IFS unset, it would disable word # splitting by setting IFS to empty value.) IFS=" "" $as_nl" # Find who we are. Look in the path if we contain no directory separator. case $0 in *[\\/]* ) as_myself=$0 ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
test -r "$as_dir/$0" && as_myself=$as_dir/$0 && break done IFS=$as_save_IFS ;; esac # We did not find ourselves, most probably we were run as `sh COMMAND' # in which case we are not to be found in the path. if test "x$as_myself" = x; then as_myself=$0 fi if test ! -f "$as_myself"; then $as_echo "$as_myself: error: cannot find myself; rerun with an absolute file name" >&2 { (exit 1); exit 1; } fi # Work around bugs in pre-3.0 UWIN ksh. for as_var in ENV MAIL MAILPATH do ($as_unset $as_var) >/dev/null 2>&1 && $as_unset $as_var done PS1='$ ' PS2='> ' PS4='+ ' # NLS nuisances. LC_ALL=C export LC_ALL LANGUAGE=C export LANGUAGE # Required to use basename. if expr a : '\(a\)' >/dev/null 2>&1 && test "X`expr 00001 : '.*\(...\)'`" = X001; then as_expr=expr else as_expr=false fi if (basename -- /) >/dev/null 2>&1 && test "X`basename -- / 2>&1`" = "X/"; then as_basename=basename else as_basename=false fi # Name of the executable. as_me=`$as_basename -- "$0" || $as_expr X/"$0" : '.*/\([^/][^/]*\)/*$' \| \ X"$0" : 'X\(//\)$' \| \ X"$0" : 'X\(/\)' \| . 2>/dev/null || $as_echo X/"$0" | sed '/^.*\/\([^/][^/]*\)\/*$/{ s//\1/ q } /^X\/\(\/\/\)$/{ s//\1/ q } /^X\/\(\/\).*/{ s//\1/ q } s/.*/./; q'` # CDPATH. $as_unset CDPATH if test "x$CONFIG_SHELL" = x; then if (eval ":") 2>/dev/null; then as_have_required=yes else as_have_required=no fi if test $as_have_required = yes && (eval ": (as_func_return () { (exit \$1) } as_func_success () { as_func_return 0 } as_func_failure () { as_func_return 1 } as_func_ret_success () { return 0 } as_func_ret_failure () { return 1 } exitcode=0 if as_func_success; then : else exitcode=1 echo as_func_success failed. fi if as_func_failure; then exitcode=1 echo as_func_failure succeeded. fi if as_func_ret_success; then : else exitcode=1 echo as_func_ret_success failed. fi if as_func_ret_failure; then exitcode=1 echo as_func_ret_failure succeeded. 
fi if ( set x; as_func_ret_success y && test x = \"\$1\" ); then : else exitcode=1 echo positional parameters were not saved. fi test \$exitcode = 0) || { (exit 1); exit 1; } ( as_lineno_1=\$LINENO as_lineno_2=\$LINENO test \"x\$as_lineno_1\" != \"x\$as_lineno_2\" && test \"x\`expr \$as_lineno_1 + 1\`\" = \"x\$as_lineno_2\") || { (exit 1); exit 1; } ") 2> /dev/null; then : else as_candidate_shells= as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in /bin$PATH_SEPARATOR/usr/bin$PATH_SEPARATOR$PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. case $as_dir in /*) for as_base in sh bash ksh sh5; do as_candidate_shells="$as_candidate_shells $as_dir/$as_base" done;; esac done IFS=$as_save_IFS for as_shell in $as_candidate_shells $SHELL; do # Try only shells that exist, to save several forks. if { test -f "$as_shell" || test -f "$as_shell.exe"; } && { ("$as_shell") 2> /dev/null <<\_ASEOF if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then emulate sh NULLCMD=: # Pre-4.2 versions of Zsh do word splitting on ${1+"$@"}, which # is contrary to our usage. Disable this feature. alias -g '${1+"$@"}'='"$@"' setopt NO_GLOB_SUBST else case `(set -o) 2>/dev/null` in *posix*) set -o posix ;; esac fi : _ASEOF }; then CONFIG_SHELL=$as_shell as_have_required=yes if { "$as_shell" 2> /dev/null <<\_ASEOF if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then emulate sh NULLCMD=: # Pre-4.2 versions of Zsh do word splitting on ${1+"$@"}, which # is contrary to our usage. Disable this feature. alias -g '${1+"$@"}'='"$@"' setopt NO_GLOB_SUBST else case `(set -o) 2>/dev/null` in *posix*) set -o posix ;; esac fi : (as_func_return () { (exit $1) } as_func_success () { as_func_return 0 } as_func_failure () { as_func_return 1 } as_func_ret_success () { return 0 } as_func_ret_failure () { return 1 } exitcode=0 if as_func_success; then : else exitcode=1 echo as_func_success failed. fi if as_func_failure; then exitcode=1 echo as_func_failure succeeded. 
fi if as_func_ret_success; then : else exitcode=1 echo as_func_ret_success failed. fi if as_func_ret_failure; then exitcode=1 echo as_func_ret_failure succeeded. fi if ( set x; as_func_ret_success y && test x = "$1" ); then : else exitcode=1 echo positional parameters were not saved. fi test $exitcode = 0) || { (exit 1); exit 1; } ( as_lineno_1=$LINENO as_lineno_2=$LINENO test "x$as_lineno_1" != "x$as_lineno_2" && test "x`expr $as_lineno_1 + 1`" = "x$as_lineno_2") || { (exit 1); exit 1; } _ASEOF }; then break fi fi done if test "x$CONFIG_SHELL" != x; then for as_var in BASH_ENV ENV do ($as_unset $as_var) >/dev/null 2>&1 && $as_unset $as_var done export CONFIG_SHELL exec "$CONFIG_SHELL" "$as_myself" ${1+"$@"} fi if test $as_have_required = no; then echo This script requires a shell more modern than all the echo shells that I found on your system. Please install a echo modern shell, or manually run the script under such a echo shell if you do have one. { (exit 1); exit 1; } fi fi fi (eval "as_func_return () { (exit \$1) } as_func_success () { as_func_return 0 } as_func_failure () { as_func_return 1 } as_func_ret_success () { return 0 } as_func_ret_failure () { return 1 } exitcode=0 if as_func_success; then : else exitcode=1 echo as_func_success failed. fi if as_func_failure; then exitcode=1 echo as_func_failure succeeded. fi if as_func_ret_success; then : else exitcode=1 echo as_func_ret_success failed. fi if as_func_ret_failure; then exitcode=1 echo as_func_ret_failure succeeded. fi if ( set x; as_func_ret_success y && test x = \"\$1\" ); then : else exitcode=1 echo positional parameters were not saved. fi test \$exitcode = 0") || { echo No shell found that supports shell functions. echo Please tell bug-autoconf@gnu.org about your system, echo including any error possibly output before this message. echo This can help us improve future autoconf versions. echo Configuration will now proceed without shell functions. 
} as_lineno_1=$LINENO as_lineno_2=$LINENO test "x$as_lineno_1" != "x$as_lineno_2" && test "x`expr $as_lineno_1 + 1`" = "x$as_lineno_2" || { # Create $as_me.lineno as a copy of $as_myself, but with $LINENO # uniformly replaced by the line number. The first 'sed' inserts a # line-number line after each line using $LINENO; the second 'sed' # does the real work. The second script uses 'N' to pair each # line-number line with the line containing $LINENO, and appends # trailing '-' during substitution so that $LINENO is not a special # case at line end. # (Raja R Harinath suggested sed '=', and Paul Eggert wrote the # scripts with optimization help from Paolo Bonzini. Blame Lee # E. McMahon (1931-1989) for sed's syntax. :-) sed -n ' p /[$]LINENO/= ' <$as_myself | sed ' s/[$]LINENO.*/&-/ t lineno b :lineno N :loop s/[$]LINENO\([^'$as_cr_alnum'_].*\n\)\(.*\)/\2\1\2/ t loop s/-\n.*// ' >$as_me.lineno && chmod +x "$as_me.lineno" || { $as_echo "$as_me: error: cannot create $as_me.lineno; rerun with a POSIX shell" >&2 { (exit 1); exit 1; }; } # Don't try to exec as it changes $[0], causing all sort of problems # (the dirname of $[0] is not the place where we might find the # original and so on. Autoconf is especially sensitive to this). . "./$as_me.lineno" # Exit status is that of the last command. exit } if (as_dir=`dirname -- /` && test "X$as_dir" = X/) >/dev/null 2>&1; then as_dirname=dirname else as_dirname=false fi ECHO_C= ECHO_N= ECHO_T= case `echo -n x` in -n*) case `echo 'x\c'` in *c*) ECHO_T=' ';; # ECHO_T is single tab character. *) ECHO_C='\c';; esac;; *) ECHO_N='-n';; esac if expr a : '\(a\)' >/dev/null 2>&1 && test "X`expr 00001 : '.*\(...\)'`" = X001; then as_expr=expr else as_expr=false fi rm -f conf$$ conf$$.exe conf$$.file if test -d conf$$.dir; then rm -f conf$$.dir/conf$$.file else rm -f conf$$.dir mkdir conf$$.dir 2>/dev/null fi if (echo >conf$$.file) 2>/dev/null; then if ln -s conf$$.file conf$$ 2>/dev/null; then as_ln_s='ln -s' # ... 
but there are two gotchas: # 1) On MSYS, both `ln -s file dir' and `ln file dir' fail. # 2) DJGPP < 2.04 has no symlinks; `ln -s' creates a wrapper executable. # In both cases, we have to default to `cp -p'. ln -s conf$$.file conf$$.dir 2>/dev/null && test ! -f conf$$.exe || as_ln_s='cp -p' elif ln conf$$.file conf$$ 2>/dev/null; then as_ln_s=ln else as_ln_s='cp -p' fi else as_ln_s='cp -p' fi rm -f conf$$ conf$$.exe conf$$.dir/conf$$.file conf$$.file rmdir conf$$.dir 2>/dev/null if mkdir -p . 2>/dev/null; then as_mkdir_p=: else test -d ./-p && rmdir ./-p as_mkdir_p=false fi if test -x / >/dev/null 2>&1; then as_test_x='test -x' else if ls -dL / >/dev/null 2>&1; then as_ls_L_option=L else as_ls_L_option= fi as_test_x=' eval sh -c '\'' if test -d "$1"; then test -d "$1/."; else case $1 in -*)set "./$1";; esac; case `ls -ld'$as_ls_L_option' "$1" 2>/dev/null` in ???[sx]*):;;*)false;;esac;fi '\'' sh ' fi as_executable_p=$as_test_x # Sed expression to map a string onto a valid CPP name. as_tr_cpp="eval sed 'y%*$as_cr_letters%P$as_cr_LETTERS%;s%[^_$as_cr_alnum]%_%g'" # Sed expression to map a string onto a valid variable name. as_tr_sh="eval sed 'y%*+%pp%;s%[^_$as_cr_alnum]%_%g'" exec 7<&0 &1 # Name of the host. # hostname on some systems (SVR3.2, Linux) returns a bogus exit status, # so uname gets run too. ac_hostname=`(hostname || uname -n) 2>/dev/null | sed 1q` # # Initializations. # ac_default_prefix=/usr/local ac_clean_files= ac_config_libobj_dir=. LIBOBJS= cross_compiling=no subdirs= MFLAGS= MAKEFLAGS= SHELL=${CONFIG_SHELL-/bin/sh} # Identity of this package. PACKAGE_NAME='host_micpower' PACKAGE_TARNAME='host_micpower' PACKAGE_VERSION='version-0.1' PACKAGE_STRING='host_micpower version-0.1' PACKAGE_BUGREPORT='' # Factoring default headers for most tests. 
ac_includes_default="\ #include #ifdef HAVE_SYS_TYPES_H # include #endif #ifdef HAVE_SYS_STAT_H # include #endif #ifdef STDC_HEADERS # include # include #else # ifdef HAVE_STDLIB_H # include # endif #endif #ifdef HAVE_STRING_H # if !defined STDC_HEADERS && defined HAVE_MEMORY_H # include # endif # include #endif #ifdef HAVE_STRINGS_H # include #endif #ifdef HAVE_INTTYPES_H # include #endif #ifdef HAVE_STDINT_H # include #endif #ifdef HAVE_UNISTD_H # include #endif" ac_subst_vars='LTLIBOBJS LIBOBJS EGREP GREP CPP SYSMGMT_LIBS SYSMGMT_CFLAGS OBJEXT EXEEXT ac_ct_CC CPPFLAGS LDFLAGS CFLAGS CC target_alias host_alias build_alias LIBS ECHO_T ECHO_N ECHO_C DEFS mandir localedir libdir psdir pdfdir dvidir htmldir infodir docdir oldincludedir includedir localstatedir sharedstatedir sysconfdir datadir datarootdir libexecdir sbindir bindir program_transform_name prefix exec_prefix PACKAGE_BUGREPORT PACKAGE_STRING PACKAGE_VERSION PACKAGE_TARNAME PACKAGE_NAME PATH_SEPARATOR SHELL' ac_subst_files='' ac_user_opts=' enable_option_checking with_sysmgmt_include_path with_sysmgmt_lib_path ' ac_precious_vars='build_alias host_alias target_alias CC CFLAGS LDFLAGS LIBS CPPFLAGS CPP' # Initialize some variables set by options. ac_init_help= ac_init_version=false ac_unrecognized_opts= ac_unrecognized_sep= # The variables have the same names as the options, with # dashes changed to underlines. cache_file=/dev/null exec_prefix=NONE no_create= no_recursion= prefix=NONE program_prefix=NONE program_suffix=NONE program_transform_name=s,x,x, silent= site= srcdir= verbose= x_includes=NONE x_libraries=NONE # Installation directory options. # These are left unexpanded so users can "make install exec_prefix=/foo" # and all the variables that are supposed to be based on exec_prefix # by default will actually change. # Use braces instead of parens because sh, perl, etc. also accept them. # (The list follows the same order as the GNU Coding Standards.) 
bindir='${exec_prefix}/bin' sbindir='${exec_prefix}/sbin' libexecdir='${exec_prefix}/libexec' datarootdir='${prefix}/share' datadir='${datarootdir}' sysconfdir='${prefix}/etc' sharedstatedir='${prefix}/com' localstatedir='${prefix}/var' includedir='${prefix}/include' oldincludedir='/usr/include' docdir='${datarootdir}/doc/${PACKAGE_TARNAME}' infodir='${datarootdir}/info' htmldir='${docdir}' dvidir='${docdir}' pdfdir='${docdir}' psdir='${docdir}' libdir='${exec_prefix}/lib' localedir='${datarootdir}/locale' mandir='${datarootdir}/man' ac_prev= ac_dashdash= for ac_option do # If the previous option needs an argument, assign it. if test -n "$ac_prev"; then eval $ac_prev=\$ac_option ac_prev= continue fi case $ac_option in *=*) ac_optarg=`expr "X$ac_option" : '[^=]*=\(.*\)'` ;; *) ac_optarg=yes ;; esac # Accept the important Cygnus configure options, so we can diagnose typos. case $ac_dashdash$ac_option in --) ac_dashdash=yes ;; -bindir | --bindir | --bindi | --bind | --bin | --bi) ac_prev=bindir ;; -bindir=* | --bindir=* | --bindi=* | --bind=* | --bin=* | --bi=*) bindir=$ac_optarg ;; -build | --build | --buil | --bui | --bu) ac_prev=build_alias ;; -build=* | --build=* | --buil=* | --bui=* | --bu=*) build_alias=$ac_optarg ;; -cache-file | --cache-file | --cache-fil | --cache-fi \ | --cache-f | --cache- | --cache | --cach | --cac | --ca | --c) ac_prev=cache_file ;; -cache-file=* | --cache-file=* | --cache-fil=* | --cache-fi=* \ | --cache-f=* | --cache-=* | --cache=* | --cach=* | --cac=* | --ca=* | --c=*) cache_file=$ac_optarg ;; --config-cache | -C) cache_file=config.cache ;; -datadir | --datadir | --datadi | --datad) ac_prev=datadir ;; -datadir=* | --datadir=* | --datadi=* | --datad=*) datadir=$ac_optarg ;; -datarootdir | --datarootdir | --datarootdi | --datarootd | --dataroot \ | --dataroo | --dataro | --datar) ac_prev=datarootdir ;; -datarootdir=* | --datarootdir=* | --datarootdi=* | --datarootd=* \ | --dataroot=* | --dataroo=* | --dataro=* | --datar=*) 
datarootdir=$ac_optarg ;; -disable-* | --disable-*) ac_useropt=`expr "x$ac_option" : 'x-*disable-\(.*\)'` # Reject names that are not valid shell variable names. expr "x$ac_useropt" : ".*[^-+._$as_cr_alnum]" >/dev/null && { $as_echo "$as_me: error: invalid feature name: $ac_useropt" >&2 { (exit 1); exit 1; }; } ac_useropt_orig=$ac_useropt ac_useropt=`$as_echo "$ac_useropt" | sed 's/[-+.]/_/g'` case $ac_user_opts in *" "enable_$ac_useropt" "*) ;; *) ac_unrecognized_opts="$ac_unrecognized_opts$ac_unrecognized_sep--disable-$ac_useropt_orig" ac_unrecognized_sep=', ';; esac eval enable_$ac_useropt=no ;; -docdir | --docdir | --docdi | --doc | --do) ac_prev=docdir ;; -docdir=* | --docdir=* | --docdi=* | --doc=* | --do=*) docdir=$ac_optarg ;; -dvidir | --dvidir | --dvidi | --dvid | --dvi | --dv) ac_prev=dvidir ;; -dvidir=* | --dvidir=* | --dvidi=* | --dvid=* | --dvi=* | --dv=*) dvidir=$ac_optarg ;; -enable-* | --enable-*) ac_useropt=`expr "x$ac_option" : 'x-*enable-\([^=]*\)'` # Reject names that are not valid shell variable names. expr "x$ac_useropt" : ".*[^-+._$as_cr_alnum]" >/dev/null && { $as_echo "$as_me: error: invalid feature name: $ac_useropt" >&2 { (exit 1); exit 1; }; } ac_useropt_orig=$ac_useropt ac_useropt=`$as_echo "$ac_useropt" | sed 's/[-+.]/_/g'` case $ac_user_opts in *" "enable_$ac_useropt" "*) ;; *) ac_unrecognized_opts="$ac_unrecognized_opts$ac_unrecognized_sep--enable-$ac_useropt_orig" ac_unrecognized_sep=', ';; esac eval enable_$ac_useropt=\$ac_optarg ;; -exec-prefix | --exec_prefix | --exec-prefix | --exec-prefi \ | --exec-pref | --exec-pre | --exec-pr | --exec-p | --exec- \ | --exec | --exe | --ex) ac_prev=exec_prefix ;; -exec-prefix=* | --exec_prefix=* | --exec-prefix=* | --exec-prefi=* \ | --exec-pref=* | --exec-pre=* | --exec-pr=* | --exec-p=* | --exec-=* \ | --exec=* | --exe=* | --ex=*) exec_prefix=$ac_optarg ;; -gas | --gas | --ga | --g) # Obsolete; use --with-gas. 
with_gas=yes ;; -help | --help | --hel | --he | -h) ac_init_help=long ;; -help=r* | --help=r* | --hel=r* | --he=r* | -hr*) ac_init_help=recursive ;; -help=s* | --help=s* | --hel=s* | --he=s* | -hs*) ac_init_help=short ;; -host | --host | --hos | --ho) ac_prev=host_alias ;; -host=* | --host=* | --hos=* | --ho=*) host_alias=$ac_optarg ;; -htmldir | --htmldir | --htmldi | --htmld | --html | --htm | --ht) ac_prev=htmldir ;; -htmldir=* | --htmldir=* | --htmldi=* | --htmld=* | --html=* | --htm=* \ | --ht=*) htmldir=$ac_optarg ;; -includedir | --includedir | --includedi | --included | --include \ | --includ | --inclu | --incl | --inc) ac_prev=includedir ;; -includedir=* | --includedir=* | --includedi=* | --included=* | --include=* \ | --includ=* | --inclu=* | --incl=* | --inc=*) includedir=$ac_optarg ;; -infodir | --infodir | --infodi | --infod | --info | --inf) ac_prev=infodir ;; -infodir=* | --infodir=* | --infodi=* | --infod=* | --info=* | --inf=*) infodir=$ac_optarg ;; -libdir | --libdir | --libdi | --libd) ac_prev=libdir ;; -libdir=* | --libdir=* | --libdi=* | --libd=*) libdir=$ac_optarg ;; -libexecdir | --libexecdir | --libexecdi | --libexecd | --libexec \ | --libexe | --libex | --libe) ac_prev=libexecdir ;; -libexecdir=* | --libexecdir=* | --libexecdi=* | --libexecd=* | --libexec=* \ | --libexe=* | --libex=* | --libe=*) libexecdir=$ac_optarg ;; -localedir | --localedir | --localedi | --localed | --locale) ac_prev=localedir ;; -localedir=* | --localedir=* | --localedi=* | --localed=* | --locale=*) localedir=$ac_optarg ;; -localstatedir | --localstatedir | --localstatedi | --localstated \ | --localstate | --localstat | --localsta | --localst | --locals) ac_prev=localstatedir ;; -localstatedir=* | --localstatedir=* | --localstatedi=* | --localstated=* \ | --localstate=* | --localstat=* | --localsta=* | --localst=* | --locals=*) localstatedir=$ac_optarg ;; -mandir | --mandir | --mandi | --mand | --man | --ma | --m) ac_prev=mandir ;; -mandir=* | --mandir=* | --mandi=* | 
--mand=* | --man=* | --ma=* | --m=*) mandir=$ac_optarg ;; -nfp | --nfp | --nf) # Obsolete; use --without-fp. with_fp=no ;; -no-create | --no-create | --no-creat | --no-crea | --no-cre \ | --no-cr | --no-c | -n) no_create=yes ;; -no-recursion | --no-recursion | --no-recursio | --no-recursi \ | --no-recurs | --no-recur | --no-recu | --no-rec | --no-re | --no-r) no_recursion=yes ;; -oldincludedir | --oldincludedir | --oldincludedi | --oldincluded \ | --oldinclude | --oldinclud | --oldinclu | --oldincl | --oldinc \ | --oldin | --oldi | --old | --ol | --o) ac_prev=oldincludedir ;; -oldincludedir=* | --oldincludedir=* | --oldincludedi=* | --oldincluded=* \ | --oldinclude=* | --oldinclud=* | --oldinclu=* | --oldincl=* | --oldinc=* \ | --oldin=* | --oldi=* | --old=* | --ol=* | --o=*) oldincludedir=$ac_optarg ;; -prefix | --prefix | --prefi | --pref | --pre | --pr | --p) ac_prev=prefix ;; -prefix=* | --prefix=* | --prefi=* | --pref=* | --pre=* | --pr=* | --p=*) prefix=$ac_optarg ;; -program-prefix | --program-prefix | --program-prefi | --program-pref \ | --program-pre | --program-pr | --program-p) ac_prev=program_prefix ;; -program-prefix=* | --program-prefix=* | --program-prefi=* \ | --program-pref=* | --program-pre=* | --program-pr=* | --program-p=*) program_prefix=$ac_optarg ;; -program-suffix | --program-suffix | --program-suffi | --program-suff \ | --program-suf | --program-su | --program-s) ac_prev=program_suffix ;; -program-suffix=* | --program-suffix=* | --program-suffi=* \ | --program-suff=* | --program-suf=* | --program-su=* | --program-s=*) program_suffix=$ac_optarg ;; -program-transform-name | --program-transform-name \ | --program-transform-nam | --program-transform-na \ | --program-transform-n | --program-transform- \ | --program-transform | --program-transfor \ | --program-transfo | --program-transf \ | --program-trans | --program-tran \ | --progr-tra | --program-tr | --program-t) ac_prev=program_transform_name ;; -program-transform-name=* | 
--program-transform-name=* \ | --program-transform-nam=* | --program-transform-na=* \ | --program-transform-n=* | --program-transform-=* \ | --program-transform=* | --program-transfor=* \ | --program-transfo=* | --program-transf=* \ | --program-trans=* | --program-tran=* \ | --progr-tra=* | --program-tr=* | --program-t=*) program_transform_name=$ac_optarg ;; -pdfdir | --pdfdir | --pdfdi | --pdfd | --pdf | --pd) ac_prev=pdfdir ;; -pdfdir=* | --pdfdir=* | --pdfdi=* | --pdfd=* | --pdf=* | --pd=*) pdfdir=$ac_optarg ;; -psdir | --psdir | --psdi | --psd | --ps) ac_prev=psdir ;; -psdir=* | --psdir=* | --psdi=* | --psd=* | --ps=*) psdir=$ac_optarg ;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil) silent=yes ;; -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb) ac_prev=sbindir ;; -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \ | --sbi=* | --sb=*) sbindir=$ac_optarg ;; -sharedstatedir | --sharedstatedir | --sharedstatedi \ | --sharedstated | --sharedstate | --sharedstat | --sharedsta \ | --sharedst | --shareds | --shared | --share | --shar \ | --sha | --sh) ac_prev=sharedstatedir ;; -sharedstatedir=* | --sharedstatedir=* | --sharedstatedi=* \ | --sharedstated=* | --sharedstate=* | --sharedstat=* | --sharedsta=* \ | --sharedst=* | --shareds=* | --shared=* | --share=* | --shar=* \ | --sha=* | --sh=*) sharedstatedir=$ac_optarg ;; -site | --site | --sit) ac_prev=site ;; -site=* | --site=* | --sit=*) site=$ac_optarg ;; -srcdir | --srcdir | --srcdi | --srcd | --src | --sr) ac_prev=srcdir ;; -srcdir=* | --srcdir=* | --srcdi=* | --srcd=* | --src=* | --sr=*) srcdir=$ac_optarg ;; -sysconfdir | --sysconfdir | --sysconfdi | --sysconfd | --sysconf \ | --syscon | --sysco | --sysc | --sys | --sy) ac_prev=sysconfdir ;; -sysconfdir=* | --sysconfdir=* | --sysconfdi=* | --sysconfd=* | --sysconf=* \ | --syscon=* | --sysco=* | --sysc=* | --sys=* | --sy=*) sysconfdir=$ac_optarg ;; -target | --target | --targe | 
--targ | --tar | --ta | --t) ac_prev=target_alias ;; -target=* | --target=* | --targe=* | --targ=* | --tar=* | --ta=* | --t=*) target_alias=$ac_optarg ;; -v | -verbose | --verbose | --verbos | --verbo | --verb) verbose=yes ;; -version | --version | --versio | --versi | --vers | -V) ac_init_version=: ;; -with-* | --with-*) ac_useropt=`expr "x$ac_option" : 'x-*with-\([^=]*\)'` # Reject names that are not valid shell variable names. expr "x$ac_useropt" : ".*[^-+._$as_cr_alnum]" >/dev/null && { $as_echo "$as_me: error: invalid package name: $ac_useropt" >&2 { (exit 1); exit 1; }; } ac_useropt_orig=$ac_useropt ac_useropt=`$as_echo "$ac_useropt" | sed 's/[-+.]/_/g'` case $ac_user_opts in *" "with_$ac_useropt" "*) ;; *) ac_unrecognized_opts="$ac_unrecognized_opts$ac_unrecognized_sep--with-$ac_useropt_orig" ac_unrecognized_sep=', ';; esac eval with_$ac_useropt=\$ac_optarg ;; -without-* | --without-*) ac_useropt=`expr "x$ac_option" : 'x-*without-\(.*\)'` # Reject names that are not valid shell variable names. expr "x$ac_useropt" : ".*[^-+._$as_cr_alnum]" >/dev/null && { $as_echo "$as_me: error: invalid package name: $ac_useropt" >&2 { (exit 1); exit 1; }; } ac_useropt_orig=$ac_useropt ac_useropt=`$as_echo "$ac_useropt" | sed 's/[-+.]/_/g'` case $ac_user_opts in *" "with_$ac_useropt" "*) ;; *) ac_unrecognized_opts="$ac_unrecognized_opts$ac_unrecognized_sep--without-$ac_useropt_orig" ac_unrecognized_sep=', ';; esac eval with_$ac_useropt=no ;; --x) # Obsolete; use --with-x. 
with_x=yes ;; -x-includes | --x-includes | --x-include | --x-includ | --x-inclu \ | --x-incl | --x-inc | --x-in | --x-i) ac_prev=x_includes ;; -x-includes=* | --x-includes=* | --x-include=* | --x-includ=* | --x-inclu=* \ | --x-incl=* | --x-inc=* | --x-in=* | --x-i=*) x_includes=$ac_optarg ;; -x-libraries | --x-libraries | --x-librarie | --x-librari \ | --x-librar | --x-libra | --x-libr | --x-lib | --x-li | --x-l) ac_prev=x_libraries ;; -x-libraries=* | --x-libraries=* | --x-librarie=* | --x-librari=* \ | --x-librar=* | --x-libra=* | --x-libr=* | --x-lib=* | --x-li=* | --x-l=*) x_libraries=$ac_optarg ;; -*) { $as_echo "$as_me: error: unrecognized option: $ac_option Try \`$0 --help' for more information." >&2 { (exit 1); exit 1; }; } ;; *=*) ac_envvar=`expr "x$ac_option" : 'x\([^=]*\)='` # Reject names that are not valid shell variable names. expr "x$ac_envvar" : ".*[^_$as_cr_alnum]" >/dev/null && { $as_echo "$as_me: error: invalid variable name: $ac_envvar" >&2 { (exit 1); exit 1; }; } eval $ac_envvar=\$ac_optarg export $ac_envvar ;; *) # FIXME: should be removed in autoconf 3.0. $as_echo "$as_me: WARNING: you should use --build, --host, --target" >&2 expr "x$ac_option" : ".*[^-._$as_cr_alnum]" >/dev/null && $as_echo "$as_me: WARNING: invalid host type: $ac_option" >&2 : ${build_alias=$ac_option} ${host_alias=$ac_option} ${target_alias=$ac_option} ;; esac done if test -n "$ac_prev"; then ac_option=--`echo $ac_prev | sed 's/_/-/g'` { $as_echo "$as_me: error: missing argument to $ac_option" >&2 { (exit 1); exit 1; }; } fi if test -n "$ac_unrecognized_opts"; then case $enable_option_checking in no) ;; fatal) { $as_echo "$as_me: error: unrecognized options: $ac_unrecognized_opts" >&2 { (exit 1); exit 1; }; } ;; *) $as_echo "$as_me: WARNING: unrecognized options: $ac_unrecognized_opts" >&2 ;; esac fi # Check all directory arguments for consistency. 
for ac_var in exec_prefix prefix bindir sbindir libexecdir datarootdir \ datadir sysconfdir sharedstatedir localstatedir includedir \ oldincludedir docdir infodir htmldir dvidir pdfdir psdir \ libdir localedir mandir do eval ac_val=\$$ac_var # Remove trailing slashes. case $ac_val in */ ) ac_val=`expr "X$ac_val" : 'X\(.*[^/]\)' \| "X$ac_val" : 'X\(.*\)'` eval $ac_var=\$ac_val;; esac # Be sure to have absolute directory names. case $ac_val in [\\/$]* | ?:[\\/]* ) continue;; NONE | '' ) case $ac_var in *prefix ) continue;; esac;; esac { $as_echo "$as_me: error: expected an absolute directory name for --$ac_var: $ac_val" >&2 { (exit 1); exit 1; }; } done # There might be people who depend on the old broken behavior: `$host' # used to hold the argument of --host etc. # FIXME: To remove some day. build=$build_alias host=$host_alias target=$target_alias # FIXME: To remove some day. if test "x$host_alias" != x; then if test "x$build_alias" = x; then cross_compiling=maybe $as_echo "$as_me: WARNING: If you wanted to set the --build type, don't use --host. If a cross compiler is detected then cross compile mode will be used." >&2 elif test "x$build_alias" != "x$host_alias"; then cross_compiling=yes fi fi ac_tool_prefix= test -n "$host_alias" && ac_tool_prefix=$host_alias- test "$silent" = yes && exec 6>/dev/null ac_pwd=`pwd` && test -n "$ac_pwd" && ac_ls_di=`ls -di .` && ac_pwd_ls_di=`cd "$ac_pwd" && ls -di .` || { $as_echo "$as_me: error: working directory cannot be determined" >&2 { (exit 1); exit 1; }; } test "X$ac_ls_di" = "X$ac_pwd_ls_di" || { $as_echo "$as_me: error: pwd does not report name of working directory" >&2 { (exit 1); exit 1; }; } # Find the source files, if location was not specified. if test -z "$srcdir"; then ac_srcdir_defaulted=yes # Try the directory containing this script, then the parent directory. 
ac_confdir=`$as_dirname -- "$as_myself" || $as_expr X"$as_myself" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$as_myself" : 'X\(//\)[^/]' \| \ X"$as_myself" : 'X\(//\)$' \| \ X"$as_myself" : 'X\(/\)' \| . 2>/dev/null || $as_echo X"$as_myself" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/ q } /^X\(\/\/\)[^/].*/{ s//\1/ q } /^X\(\/\/\)$/{ s//\1/ q } /^X\(\/\).*/{ s//\1/ q } s/.*/./; q'` srcdir=$ac_confdir if test ! -r "$srcdir/$ac_unique_file"; then srcdir=.. fi else ac_srcdir_defaulted=no fi if test ! -r "$srcdir/$ac_unique_file"; then test "$ac_srcdir_defaulted" = yes && srcdir="$ac_confdir or .." { $as_echo "$as_me: error: cannot find sources ($ac_unique_file) in $srcdir" >&2 { (exit 1); exit 1; }; } fi ac_msg="sources are in $srcdir, but \`cd $srcdir' does not work" ac_abs_confdir=`( cd "$srcdir" && test -r "./$ac_unique_file" || { $as_echo "$as_me: error: $ac_msg" >&2 { (exit 1); exit 1; }; } pwd)` # When building in place, set srcdir=. if test "$ac_abs_confdir" = "$ac_pwd"; then srcdir=. fi # Remove unnecessary trailing slashes from srcdir. # Double slashes in file names in object file debugging info # mess up M-x gdb in Emacs. case $srcdir in */) srcdir=`expr "X$srcdir" : 'X\(.*[^/]\)' \| "X$srcdir" : 'X\(.*\)'`;; esac for ac_var in $ac_precious_vars; do eval ac_env_${ac_var}_set=\${${ac_var}+set} eval ac_env_${ac_var}_value=\$${ac_var} eval ac_cv_env_${ac_var}_set=\${${ac_var}+set} eval ac_cv_env_${ac_var}_value=\$${ac_var} done # # Report the --help message. # if test "$ac_init_help" = "long"; then # Omit some internal or obsolete options to make the list less imposing. # This message is too long to be a string in the A/UX 3.1 sh. cat <<_ACEOF \`configure' configures host_micpower version-0.1 to adapt to many kinds of systems. Usage: $0 [OPTION]... [VAR=VALUE]... To assign environment variables (e.g., CC, CFLAGS...), specify them as VAR=VALUE. See below for descriptions of some of the useful variables. Defaults for the options are specified in brackets. 
Configuration: -h, --help display this help and exit --help=short display options specific to this package --help=recursive display the short help of all the included packages -V, --version display version information and exit -q, --quiet, --silent do not print \`checking...' messages --cache-file=FILE cache test results in FILE [disabled] -C, --config-cache alias for \`--cache-file=config.cache' -n, --no-create do not create output files --srcdir=DIR find the sources in DIR [configure dir or \`..'] Installation directories: --prefix=PREFIX install architecture-independent files in PREFIX [$ac_default_prefix] --exec-prefix=EPREFIX install architecture-dependent files in EPREFIX [PREFIX] By default, \`make install' will install all the files in \`$ac_default_prefix/bin', \`$ac_default_prefix/lib' etc. You can specify an installation prefix other than \`$ac_default_prefix' using \`--prefix', for instance \`--prefix=\$HOME'. For better control, use the options below. Fine tuning of the installation directories: --bindir=DIR user executables [EPREFIX/bin] --sbindir=DIR system admin executables [EPREFIX/sbin] --libexecdir=DIR program executables [EPREFIX/libexec] --sysconfdir=DIR read-only single-machine data [PREFIX/etc] --sharedstatedir=DIR modifiable architecture-independent data [PREFIX/com] --localstatedir=DIR modifiable single-machine data [PREFIX/var] --libdir=DIR object code libraries [EPREFIX/lib] --includedir=DIR C header files [PREFIX/include] --oldincludedir=DIR C header files for non-gcc [/usr/include] --datarootdir=DIR read-only arch.-independent data root [PREFIX/share] --datadir=DIR read-only architecture-independent data [DATAROOTDIR] --infodir=DIR info documentation [DATAROOTDIR/info] --localedir=DIR locale-dependent data [DATAROOTDIR/locale] --mandir=DIR man documentation [DATAROOTDIR/man] --docdir=DIR documentation root [DATAROOTDIR/doc/host_micpower] --htmldir=DIR html documentation [DOCDIR] --dvidir=DIR dvi documentation [DOCDIR] --pdfdir=DIR pdf 
documentation [DOCDIR] --psdir=DIR ps documentation [DOCDIR] _ACEOF cat <<\_ACEOF _ACEOF fi if test -n "$ac_init_help"; then case $ac_init_help in short | recursive ) echo "Configuration of host_micpower version-0.1:";; esac cat <<\_ACEOF Optional Packages: --with-PACKAGE[=ARG] use PACKAGE [ARG=yes] --without-PACKAGE do not use PACKAGE (same as --with-PACKAGE=no) --with-sysmgmt-include-path location of the MPSS sysmgmt api headers, defaults to /opt/intel/mic/sysmgmt/sdk/include --with-sysmgmt-lib-path location of the MPSS sysmgmt libraries, feed to the runtime linker; defaults to /opt/intel/mic/sysmgmt/sdk/lib/Linux Some influential environment variables: CC C compiler command CFLAGS C compiler flags LDFLAGS linker flags, e.g. -L if you have libraries in a nonstandard directory LIBS libraries to pass to the linker, e.g. -l CPPFLAGS C/C++/Objective C preprocessor flags, e.g. -I if you have headers in a nonstandard directory CPP C preprocessor Use these variables to override the choices made by `configure' or to help it to find libraries and programs with nonstandard names/locations. _ACEOF ac_status=$? fi if test "$ac_init_help" = "recursive"; then # If there are subdirs, report their specific --help. for ac_dir in : $ac_subdirs_all; do test "x$ac_dir" = x: && continue test -d "$ac_dir" || { cd "$srcdir" && ac_pwd=`pwd` && srcdir=. && test -d "$ac_dir"; } || continue ac_builddir=. case "$ac_dir" in .) ac_dir_suffix= ac_top_builddir_sub=. ac_top_build_prefix= ;; *) ac_dir_suffix=/`$as_echo "$ac_dir" | sed 's|^\.[\\/]||'` # A ".." for each directory in $ac_dir_suffix. ac_top_builddir_sub=`$as_echo "$ac_dir_suffix" | sed 's|/[^\\/]*|/..|g;s|/||'` case $ac_top_builddir_sub in "") ac_top_builddir_sub=. ac_top_build_prefix= ;; *) ac_top_build_prefix=$ac_top_builddir_sub/ ;; esac ;; esac ac_abs_top_builddir=$ac_pwd ac_abs_builddir=$ac_pwd$ac_dir_suffix # for backward compatibility: ac_top_builddir=$ac_top_build_prefix case $srcdir in .) # We are building in place. 
ac_srcdir=. ac_top_srcdir=$ac_top_builddir_sub ac_abs_top_srcdir=$ac_pwd ;; [\\/]* | ?:[\\/]* ) # Absolute name. ac_srcdir=$srcdir$ac_dir_suffix; ac_top_srcdir=$srcdir ac_abs_top_srcdir=$srcdir ;; *) # Relative name. ac_srcdir=$ac_top_build_prefix$srcdir$ac_dir_suffix ac_top_srcdir=$ac_top_build_prefix$srcdir ac_abs_top_srcdir=$ac_pwd/$srcdir ;; esac ac_abs_srcdir=$ac_abs_top_srcdir$ac_dir_suffix cd "$ac_dir" || { ac_status=$?; continue; } # Check for guested configure. if test -f "$ac_srcdir/configure.gnu"; then echo && $SHELL "$ac_srcdir/configure.gnu" --help=recursive elif test -f "$ac_srcdir/configure"; then echo && $SHELL "$ac_srcdir/configure" --help=recursive else $as_echo "$as_me: WARNING: no configuration information is in $ac_dir" >&2 fi || ac_status=$? cd "$ac_pwd" || { ac_status=$?; break; } done fi test -n "$ac_init_help" && exit $ac_status if $ac_init_version; then cat <<\_ACEOF host_micpower configure version-0.1 generated by GNU Autoconf 2.63 Copyright (C) 1992, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc. This configure script is free software; the Free Software Foundation gives unlimited permission to copy, distribute and modify it. _ACEOF exit fi cat >config.log <<_ACEOF This file contains any messages produced by compilers while running configure, to aid debugging if configure makes a mistake. It was created by host_micpower $as_me version-0.1, which was generated by GNU Autoconf 2.63. Invocation command line was $ $0 $@ _ACEOF exec 5>>config.log { cat <<_ASUNAME ## --------- ## ## Platform. 
## ## --------- ## hostname = `(hostname || uname -n) 2>/dev/null | sed 1q` uname -m = `(uname -m) 2>/dev/null || echo unknown` uname -r = `(uname -r) 2>/dev/null || echo unknown` uname -s = `(uname -s) 2>/dev/null || echo unknown` uname -v = `(uname -v) 2>/dev/null || echo unknown` /usr/bin/uname -p = `(/usr/bin/uname -p) 2>/dev/null || echo unknown` /bin/uname -X = `(/bin/uname -X) 2>/dev/null || echo unknown` /bin/arch = `(/bin/arch) 2>/dev/null || echo unknown` /usr/bin/arch -k = `(/usr/bin/arch -k) 2>/dev/null || echo unknown` /usr/convex/getsysinfo = `(/usr/convex/getsysinfo) 2>/dev/null || echo unknown` /usr/bin/hostinfo = `(/usr/bin/hostinfo) 2>/dev/null || echo unknown` /bin/machine = `(/bin/machine) 2>/dev/null || echo unknown` /usr/bin/oslevel = `(/usr/bin/oslevel) 2>/dev/null || echo unknown` /bin/universe = `(/bin/universe) 2>/dev/null || echo unknown` _ASUNAME as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. $as_echo "PATH: $as_dir" done IFS=$as_save_IFS } >&5 cat >&5 <<_ACEOF ## ----------- ## ## Core tests. ## ## ----------- ## _ACEOF # Keep a trace of the command line. # Strip out --no-create and --no-recursion so they do not pile up. # Strip out --silent because we don't want to record it for future runs. # Also quote any args containing shell meta-characters. # Make two passes to allow for proper duplicate-argument suppression. 
ac_configure_args= ac_configure_args0= ac_configure_args1= ac_must_keep_next=false for ac_pass in 1 2 do for ac_arg do case $ac_arg in -no-create | --no-c* | -n | -no-recursion | --no-r*) continue ;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil) continue ;; *\'*) ac_arg=`$as_echo "$ac_arg" | sed "s/'/'\\\\\\\\''/g"` ;; esac case $ac_pass in 1) ac_configure_args0="$ac_configure_args0 '$ac_arg'" ;; 2) ac_configure_args1="$ac_configure_args1 '$ac_arg'" if test $ac_must_keep_next = true; then ac_must_keep_next=false # Got value, back to normal. else case $ac_arg in *=* | --config-cache | -C | -disable-* | --disable-* \ | -enable-* | --enable-* | -gas | --g* | -nfp | --nf* \ | -q | -quiet | --q* | -silent | --sil* | -v | -verb* \ | -with-* | --with-* | -without-* | --without-* | --x) case "$ac_configure_args0 " in "$ac_configure_args1"*" '$ac_arg' "* ) continue ;; esac ;; -* ) ac_must_keep_next=true ;; esac fi ac_configure_args="$ac_configure_args '$ac_arg'" ;; esac done done $as_unset ac_configure_args0 || test "${ac_configure_args0+set}" != set || { ac_configure_args0=; export ac_configure_args0; } $as_unset ac_configure_args1 || test "${ac_configure_args1+set}" != set || { ac_configure_args1=; export ac_configure_args1; } # When interrupted or exit'd, cleanup temporary files, and complete # config.log. We remove comments because anyway the quotes in there # would cause problems or look ugly. # WARNING: Use '\'' to represent an apostrophe within the trap. # WARNING: Do not start the trap code with a newline, due to a FreeBSD 4.0 bug. trap 'exit_status=$? # Save into config.log some information that might help in debugging. { echo cat <<\_ASBOX ## ---------------- ## ## Cache variables. 
## ## ---------------- ## _ASBOX echo # The following way of writing the cache mishandles newlines in values, ( for ac_var in `(set) 2>&1 | sed -n '\''s/^\([a-zA-Z_][a-zA-Z0-9_]*\)=.*/\1/p'\''`; do eval ac_val=\$$ac_var case $ac_val in #( *${as_nl}*) case $ac_var in #( *_cv_*) { $as_echo "$as_me:$LINENO: WARNING: cache variable $ac_var contains a newline" >&5 $as_echo "$as_me: WARNING: cache variable $ac_var contains a newline" >&2;} ;; esac case $ac_var in #( _ | IFS | as_nl) ;; #( BASH_ARGV | BASH_SOURCE) eval $ac_var= ;; #( *) $as_unset $ac_var ;; esac ;; esac done (set) 2>&1 | case $as_nl`(ac_space='\'' '\''; set) 2>&1` in #( *${as_nl}ac_space=\ *) sed -n \ "s/'\''/'\''\\\\'\'''\''/g; s/^\\([_$as_cr_alnum]*_cv_[_$as_cr_alnum]*\\)=\\(.*\\)/\\1='\''\\2'\''/p" ;; #( *) sed -n "/^[_$as_cr_alnum]*_cv_[_$as_cr_alnum]*=/p" ;; esac | sort ) echo cat <<\_ASBOX ## ----------------- ## ## Output variables. ## ## ----------------- ## _ASBOX echo for ac_var in $ac_subst_vars do eval ac_val=\$$ac_var case $ac_val in *\'\''*) ac_val=`$as_echo "$ac_val" | sed "s/'\''/'\''\\\\\\\\'\'''\''/g"`;; esac $as_echo "$ac_var='\''$ac_val'\''" done | sort echo if test -n "$ac_subst_files"; then cat <<\_ASBOX ## ------------------- ## ## File substitutions. ## ## ------------------- ## _ASBOX echo for ac_var in $ac_subst_files do eval ac_val=\$$ac_var case $ac_val in *\'\''*) ac_val=`$as_echo "$ac_val" | sed "s/'\''/'\''\\\\\\\\'\'''\''/g"`;; esac $as_echo "$ac_var='\''$ac_val'\''" done | sort echo fi if test -s confdefs.h; then cat <<\_ASBOX ## ----------- ## ## confdefs.h. 
## ## ----------- ## _ASBOX echo cat confdefs.h echo fi test "$ac_signal" != 0 && $as_echo "$as_me: caught signal $ac_signal" $as_echo "$as_me: exit $exit_status" } >&5 rm -f core *.core core.conftest.* && rm -f -r conftest* confdefs* conf$$* $ac_clean_files && exit $exit_status ' 0 for ac_signal in 1 2 13 15; do trap 'ac_signal='$ac_signal'; { (exit 1); exit 1; }' $ac_signal done ac_signal=0 # confdefs.h avoids OS command line length limits that DEFS can exceed. rm -f -r conftest* confdefs.h # Predefined preprocessor variables. cat >>confdefs.h <<_ACEOF #define PACKAGE_NAME "$PACKAGE_NAME" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_TARNAME "$PACKAGE_TARNAME" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_VERSION "$PACKAGE_VERSION" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_STRING "$PACKAGE_STRING" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_BUGREPORT "$PACKAGE_BUGREPORT" _ACEOF # Let the site file select an alternate cache file if it wants to. # Prefer an explicitly selected file to automatically selected ones. ac_site_file1=NONE ac_site_file2=NONE if test -n "$CONFIG_SITE"; then ac_site_file1=$CONFIG_SITE elif test "x$prefix" != xNONE; then ac_site_file1=$prefix/share/config.site ac_site_file2=$prefix/etc/config.site else ac_site_file1=$ac_default_prefix/share/config.site ac_site_file2=$ac_default_prefix/etc/config.site fi for ac_site_file in "$ac_site_file1" "$ac_site_file2" do test "x$ac_site_file" = xNONE && continue if test -r "$ac_site_file"; then { $as_echo "$as_me:$LINENO: loading site script $ac_site_file" >&5 $as_echo "$as_me: loading site script $ac_site_file" >&6;} sed 's/^/| /' "$ac_site_file" >&5 . "$ac_site_file" fi done if test -r "$cache_file"; then # Some versions of bash will fail to source /dev/null (special # files actually), so we avoid doing that. 
if test -f "$cache_file"; then { $as_echo "$as_me:$LINENO: loading cache $cache_file" >&5 $as_echo "$as_me: loading cache $cache_file" >&6;} case $cache_file in [\\/]* | ?:[\\/]* ) . "$cache_file";; *) . "./$cache_file";; esac fi else { $as_echo "$as_me:$LINENO: creating cache $cache_file" >&5 $as_echo "$as_me: creating cache $cache_file" >&6;} >$cache_file fi # Check that the precious variables saved in the cache have kept the same # value. ac_cache_corrupted=false for ac_var in $ac_precious_vars; do eval ac_old_set=\$ac_cv_env_${ac_var}_set eval ac_new_set=\$ac_env_${ac_var}_set eval ac_old_val=\$ac_cv_env_${ac_var}_value eval ac_new_val=\$ac_env_${ac_var}_value case $ac_old_set,$ac_new_set in set,) { $as_echo "$as_me:$LINENO: error: \`$ac_var' was set to \`$ac_old_val' in the previous run" >&5 $as_echo "$as_me: error: \`$ac_var' was set to \`$ac_old_val' in the previous run" >&2;} ac_cache_corrupted=: ;; ,set) { $as_echo "$as_me:$LINENO: error: \`$ac_var' was not set in the previous run" >&5 $as_echo "$as_me: error: \`$ac_var' was not set in the previous run" >&2;} ac_cache_corrupted=: ;; ,);; *) if test "x$ac_old_val" != "x$ac_new_val"; then # differences in whitespace do not lead to failure. 
ac_old_val_w=`echo x $ac_old_val` ac_new_val_w=`echo x $ac_new_val` if test "$ac_old_val_w" != "$ac_new_val_w"; then { $as_echo "$as_me:$LINENO: error: \`$ac_var' has changed since the previous run:" >&5 $as_echo "$as_me: error: \`$ac_var' has changed since the previous run:" >&2;} ac_cache_corrupted=: else { $as_echo "$as_me:$LINENO: warning: ignoring whitespace changes in \`$ac_var' since the previous run:" >&5 $as_echo "$as_me: warning: ignoring whitespace changes in \`$ac_var' since the previous run:" >&2;} eval $ac_var=\$ac_old_val fi { $as_echo "$as_me:$LINENO: former value: \`$ac_old_val'" >&5 $as_echo "$as_me: former value: \`$ac_old_val'" >&2;} { $as_echo "$as_me:$LINENO: current value: \`$ac_new_val'" >&5 $as_echo "$as_me: current value: \`$ac_new_val'" >&2;} fi;; esac # Pass precious variables to config.status. if test "$ac_new_set" = set; then case $ac_new_val in *\'*) ac_arg=$ac_var=`$as_echo "$ac_new_val" | sed "s/'/'\\\\\\\\''/g"` ;; *) ac_arg=$ac_var=$ac_new_val ;; esac case " $ac_configure_args " in *" '$ac_arg' "*) ;; # Avoid dups. Use of quotes ensures accuracy. 
*) ac_configure_args="$ac_configure_args '$ac_arg'" ;; esac fi done if $ac_cache_corrupted; then { $as_echo "$as_me:$LINENO: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} { $as_echo "$as_me:$LINENO: error: changes in the environment can compromise the build" >&5 $as_echo "$as_me: error: changes in the environment can compromise the build" >&2;} { { $as_echo "$as_me:$LINENO: error: run \`make distclean' and/or \`rm $cache_file' and start over" >&5 $as_echo "$as_me: error: run \`make distclean' and/or \`rm $cache_file' and start over" >&2;} { (exit 1); exit 1; }; } fi ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu if test -n "$ac_tool_prefix"; then # Extract the first word of "${ac_tool_prefix}gcc", so it can be a program name with args. set dummy ${ac_tool_prefix}gcc; ac_word=$2 { $as_echo "$as_me:$LINENO: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if test "${ac_cv_prog_CC+set}" = set; then $as_echo_n "(cached) " >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_CC="${ac_tool_prefix}gcc" $as_echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then { $as_echo "$as_me:$LINENO: result: $CC" >&5 $as_echo "$CC" >&6; } else { $as_echo "$as_me:$LINENO: result: no" >&5 $as_echo "no" >&6; } fi fi if test -z "$ac_cv_prog_CC"; then ac_ct_CC=$CC # Extract the first word of "gcc", so it can be a program name with args. set dummy gcc; ac_word=$2 { $as_echo "$as_me:$LINENO: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if test "${ac_cv_prog_ac_ct_CC+set}" = set; then $as_echo_n "(cached) " >&6 else if test -n "$ac_ct_CC"; then ac_cv_prog_ac_ct_CC="$ac_ct_CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_ac_ct_CC="gcc" $as_echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi ac_ct_CC=$ac_cv_prog_ac_ct_CC if test -n "$ac_ct_CC"; then { $as_echo "$as_me:$LINENO: result: $ac_ct_CC" >&5 $as_echo "$ac_ct_CC" >&6; } else { $as_echo "$as_me:$LINENO: result: no" >&5 $as_echo "no" >&6; } fi if test "x$ac_ct_CC" = x; then CC="" else case $cross_compiling:$ac_tool_warned in yes:) { $as_echo "$as_me:$LINENO: WARNING: using cross tools not prefixed with host triplet" >&5 $as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;} ac_tool_warned=yes ;; esac CC=$ac_ct_CC fi else CC="$ac_cv_prog_CC" fi if test -z "$CC"; then if test -n "$ac_tool_prefix"; then # Extract the first word of "${ac_tool_prefix}cc", so it can be a program name with args. 
set dummy ${ac_tool_prefix}cc; ac_word=$2 { $as_echo "$as_me:$LINENO: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if test "${ac_cv_prog_CC+set}" = set; then $as_echo_n "(cached) " >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_CC="${ac_tool_prefix}cc" $as_echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then { $as_echo "$as_me:$LINENO: result: $CC" >&5 $as_echo "$CC" >&6; } else { $as_echo "$as_me:$LINENO: result: no" >&5 $as_echo "no" >&6; } fi fi fi if test -z "$CC"; then # Extract the first word of "cc", so it can be a program name with args. set dummy cc; ac_word=$2 { $as_echo "$as_me:$LINENO: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if test "${ac_cv_prog_CC+set}" = set; then $as_echo_n "(cached) " >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else ac_prog_rejected=no as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then if test "$as_dir/$ac_word$ac_exec_ext" = "/usr/ucb/cc"; then ac_prog_rejected=yes continue fi ac_cv_prog_CC="cc" $as_echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS if test $ac_prog_rejected = yes; then # We found a bogon in the path, so make sure we never use it. set dummy $ac_cv_prog_CC shift if test $# != 0; then # We chose a different compiler from the bogus one. 
# However, it has the same basename, so the bogon will be chosen # first if we set CC to just the basename; use the full file name. shift ac_cv_prog_CC="$as_dir/$ac_word${1+' '}$@" fi fi fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then { $as_echo "$as_me:$LINENO: result: $CC" >&5 $as_echo "$CC" >&6; } else { $as_echo "$as_me:$LINENO: result: no" >&5 $as_echo "no" >&6; } fi fi if test -z "$CC"; then if test -n "$ac_tool_prefix"; then for ac_prog in cl.exe do # Extract the first word of "$ac_tool_prefix$ac_prog", so it can be a program name with args. set dummy $ac_tool_prefix$ac_prog; ac_word=$2 { $as_echo "$as_me:$LINENO: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if test "${ac_cv_prog_CC+set}" = set; then $as_echo_n "(cached) " >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_CC="$ac_tool_prefix$ac_prog" $as_echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then { $as_echo "$as_me:$LINENO: result: $CC" >&5 $as_echo "$CC" >&6; } else { $as_echo "$as_me:$LINENO: result: no" >&5 $as_echo "no" >&6; } fi test -n "$CC" && break done fi if test -z "$CC"; then ac_ct_CC=$CC for ac_prog in cl.exe do # Extract the first word of "$ac_prog", so it can be a program name with args. set dummy $ac_prog; ac_word=$2 { $as_echo "$as_me:$LINENO: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if test "${ac_cv_prog_ac_ct_CC+set}" = set; then $as_echo_n "(cached) " >&6 else if test -n "$ac_ct_CC"; then ac_cv_prog_ac_ct_CC="$ac_ct_CC" # Let the user override the test. 
else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_ac_ct_CC="$ac_prog" $as_echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi ac_ct_CC=$ac_cv_prog_ac_ct_CC if test -n "$ac_ct_CC"; then { $as_echo "$as_me:$LINENO: result: $ac_ct_CC" >&5 $as_echo "$ac_ct_CC" >&6; } else { $as_echo "$as_me:$LINENO: result: no" >&5 $as_echo "no" >&6; } fi test -n "$ac_ct_CC" && break done if test "x$ac_ct_CC" = x; then CC="" else case $cross_compiling:$ac_tool_warned in yes:) { $as_echo "$as_me:$LINENO: WARNING: using cross tools not prefixed with host triplet" >&5 $as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;} ac_tool_warned=yes ;; esac CC=$ac_ct_CC fi fi fi test -z "$CC" && { { $as_echo "$as_me:$LINENO: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} { { $as_echo "$as_me:$LINENO: error: no acceptable C compiler found in \$PATH See \`config.log' for more details." >&5 $as_echo "$as_me: error: no acceptable C compiler found in \$PATH See \`config.log' for more details." >&2;} { (exit 1); exit 1; }; }; } # Provide some information about the compiler. $as_echo "$as_me:$LINENO: checking for C compiler version" >&5 set X $ac_compile ac_compiler=$2 { (ac_try="$ac_compiler --version >&5" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_compiler --version >&5") 2>&5 ac_status=$? $as_echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); } { (ac_try="$ac_compiler -v >&5" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_compiler -v >&5") 2>&5 ac_status=$? $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } { (ac_try="$ac_compiler -V >&5" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_compiler -V >&5") 2>&5 ac_status=$? $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { ; return 0; } _ACEOF ac_clean_files_save=$ac_clean_files ac_clean_files="$ac_clean_files a.out a.out.dSYM a.exe b.out" # Try to create an executable without -o first, disregard a.out. # It will help us diagnose broken compilers, and finding out an intuition # of exeext. { $as_echo "$as_me:$LINENO: checking for C compiler default output file name" >&5 $as_echo_n "checking for C compiler default output file name... " >&6; } ac_link_default=`$as_echo "$ac_link" | sed 's/ -o *conftest[^ ]*//'` # The possible output files: ac_files="a.out conftest.exe conftest a.exe a_out.exe b.out conftest.*" ac_rmfiles= for ac_file in $ac_files do case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.map | *.inf | *.dSYM | *.o | *.obj ) ;; * ) ac_rmfiles="$ac_rmfiles $ac_file";; esac done rm -f $ac_rmfiles if { (ac_try="$ac_link_default" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_link_default") 2>&5 ac_status=$? $as_echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; then # Autoconf-2.13 could set the ac_cv_exeext variable to `no'. # So ignore a value of `no', otherwise this would lead to `EXEEXT = no' # in a Makefile. We should not override ac_cv_exeext if it was cached, # so that the user can short-circuit this test for compilers unknown to # Autoconf. for ac_file in $ac_files '' do test -f "$ac_file" || continue case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.map | *.inf | *.dSYM | *.o | *.obj ) ;; [ab].out ) # We found the default executable, but exeext='' is most # certainly right. break;; *.* ) if test "${ac_cv_exeext+set}" = set && test "$ac_cv_exeext" != no; then :; else ac_cv_exeext=`expr "$ac_file" : '[^.]*\(\..*\)'` fi # We set ac_cv_exeext here because the later test for it is not # safe: cross compilers may not add the suffix if given an `-o' # argument, so we may need to know it at that point already. # Even if this section looks crufty: it has the advantage of # actually working. break;; * ) break;; esac done test "$ac_cv_exeext" = no && ac_cv_exeext= else ac_file='' fi { $as_echo "$as_me:$LINENO: result: $ac_file" >&5 $as_echo "$ac_file" >&6; } if test -z "$ac_file"; then $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 { { $as_echo "$as_me:$LINENO: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} { { $as_echo "$as_me:$LINENO: error: C compiler cannot create executables See \`config.log' for more details." >&5 $as_echo "$as_me: error: C compiler cannot create executables See \`config.log' for more details." >&2;} { (exit 77); exit 77; }; }; } fi ac_exeext=$ac_cv_exeext # Check that the compiler produces executables we can run. If not, either # the compiler is broken, or we cross compile. { $as_echo "$as_me:$LINENO: checking whether the C compiler works" >&5 $as_echo_n "checking whether the C compiler works... 
" >&6; } # FIXME: These cross compiler hacks should be removed for Autoconf 3.0 # If not cross compiling, check that we can run a simple program. if test "$cross_compiling" != yes; then if { ac_try='./$ac_file' { (case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_try") 2>&5 ac_status=$? $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then cross_compiling=no else if test "$cross_compiling" = maybe; then cross_compiling=yes else { { $as_echo "$as_me:$LINENO: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} { { $as_echo "$as_me:$LINENO: error: cannot run C compiled programs. If you meant to cross compile, use \`--host'. See \`config.log' for more details." >&5 $as_echo "$as_me: error: cannot run C compiled programs. If you meant to cross compile, use \`--host'. See \`config.log' for more details." >&2;} { (exit 1); exit 1; }; }; } fi fi fi { $as_echo "$as_me:$LINENO: result: yes" >&5 $as_echo "yes" >&6; } rm -f -r a.out a.out.dSYM a.exe conftest$ac_cv_exeext b.out ac_clean_files=$ac_clean_files_save # Check that the compiler produces executables we can run. If not, either # the compiler is broken, or we cross compile. { $as_echo "$as_me:$LINENO: checking whether we are cross compiling" >&5 $as_echo_n "checking whether we are cross compiling... " >&6; } { $as_echo "$as_me:$LINENO: result: $cross_compiling" >&5 $as_echo "$cross_compiling" >&6; } { $as_echo "$as_me:$LINENO: checking for suffix of executables" >&5 $as_echo_n "checking for suffix of executables... " >&6; } if { (ac_try="$ac_link" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_link") 2>&5 ac_status=$? $as_echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; then # If both `conftest.exe' and `conftest' are `present' (well, observable) # catch `conftest.exe'. For instance with Cygwin, `ls conftest' will # work properly (i.e., refer to `conftest.exe'), while it won't with # `rm'. for ac_file in conftest.exe conftest conftest.*; do test -f "$ac_file" || continue case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.map | *.inf | *.dSYM | *.o | *.obj ) ;; *.* ) ac_cv_exeext=`expr "$ac_file" : '[^.]*\(\..*\)'` break;; * ) break;; esac done else { { $as_echo "$as_me:$LINENO: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} { { $as_echo "$as_me:$LINENO: error: cannot compute suffix of executables: cannot compile and link See \`config.log' for more details." >&5 $as_echo "$as_me: error: cannot compute suffix of executables: cannot compile and link See \`config.log' for more details." >&2;} { (exit 1); exit 1; }; }; } fi rm -f conftest$ac_cv_exeext { $as_echo "$as_me:$LINENO: result: $ac_cv_exeext" >&5 $as_echo "$ac_cv_exeext" >&6; } rm -f conftest.$ac_ext EXEEXT=$ac_cv_exeext ac_exeext=$EXEEXT { $as_echo "$as_me:$LINENO: checking for suffix of object files" >&5 $as_echo_n "checking for suffix of object files... " >&6; } if test "${ac_cv_objext+set}" = set; then $as_echo_n "(cached) " >&6 else cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { ; return 0; } _ACEOF rm -f conftest.o conftest.obj if { (ac_try="$ac_compile" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_compile") 2>&5 ac_status=$? $as_echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; then for ac_file in conftest.o conftest.obj conftest.*; do test -f "$ac_file" || continue; case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.map | *.inf | *.dSYM ) ;; *) ac_cv_objext=`expr "$ac_file" : '.*\.\(.*\)'` break;; esac done else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 { { $as_echo "$as_me:$LINENO: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} { { $as_echo "$as_me:$LINENO: error: cannot compute suffix of object files: cannot compile See \`config.log' for more details." >&5 $as_echo "$as_me: error: cannot compute suffix of object files: cannot compile See \`config.log' for more details." >&2;} { (exit 1); exit 1; }; }; } fi rm -f conftest.$ac_cv_objext conftest.$ac_ext fi { $as_echo "$as_me:$LINENO: result: $ac_cv_objext" >&5 $as_echo "$ac_cv_objext" >&6; } OBJEXT=$ac_cv_objext ac_objext=$OBJEXT { $as_echo "$as_me:$LINENO: checking whether we are using the GNU C compiler" >&5 $as_echo_n "checking whether we are using the GNU C compiler... " >&6; } if test "${ac_cv_c_compiler_gnu+set}" = set; then $as_echo_n "(cached) " >&6 else cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { #ifndef __GNUC__ choke me #endif ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (ac_try="$ac_compile" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_compile") 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { test -z "$ac_c_werror_flag" || test ! 
-s conftest.err } && test -s conftest.$ac_objext; then ac_compiler_gnu=yes else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_compiler_gnu=no fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext ac_cv_c_compiler_gnu=$ac_compiler_gnu fi { $as_echo "$as_me:$LINENO: result: $ac_cv_c_compiler_gnu" >&5 $as_echo "$ac_cv_c_compiler_gnu" >&6; } if test $ac_compiler_gnu = yes; then GCC=yes else GCC= fi ac_test_CFLAGS=${CFLAGS+set} ac_save_CFLAGS=$CFLAGS { $as_echo "$as_me:$LINENO: checking whether $CC accepts -g" >&5 $as_echo_n "checking whether $CC accepts -g... " >&6; } if test "${ac_cv_prog_cc_g+set}" = set; then $as_echo_n "(cached) " >&6 else ac_save_c_werror_flag=$ac_c_werror_flag ac_c_werror_flag=yes ac_cv_prog_cc_g=no CFLAGS="-g" cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (ac_try="$ac_compile" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_compile") 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { test -z "$ac_c_werror_flag" || test ! -s conftest.err } && test -s conftest.$ac_objext; then ac_cv_prog_cc_g=yes else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 CFLAGS="" cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. 
*/ int main () { ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (ac_try="$ac_compile" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_compile") 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { test -z "$ac_c_werror_flag" || test ! -s conftest.err } && test -s conftest.$ac_objext; then : else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_c_werror_flag=$ac_save_c_werror_flag CFLAGS="-g" cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (ac_try="$ac_compile" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_compile") 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { test -z "$ac_c_werror_flag" || test ! 
-s conftest.err } && test -s conftest.$ac_objext; then ac_cv_prog_cc_g=yes else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext ac_c_werror_flag=$ac_save_c_werror_flag fi { $as_echo "$as_me:$LINENO: result: $ac_cv_prog_cc_g" >&5 $as_echo "$ac_cv_prog_cc_g" >&6; } if test "$ac_test_CFLAGS" = set; then CFLAGS=$ac_save_CFLAGS elif test $ac_cv_prog_cc_g = yes; then if test "$GCC" = yes; then CFLAGS="-g -O2" else CFLAGS="-g" fi else if test "$GCC" = yes; then CFLAGS="-O2" else CFLAGS= fi fi { $as_echo "$as_me:$LINENO: checking for $CC option to accept ISO C89" >&5 $as_echo_n "checking for $CC option to accept ISO C89... " >&6; } if test "${ac_cv_prog_cc_c89+set}" = set; then $as_echo_n "(cached) " >&6 else ac_cv_prog_cc_c89=no ac_save_CC=$CC cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <stdarg.h> #include <stdio.h> #include <sys/types.h> #include <sys/stat.h> /* Most of the following tests are stolen from RCS 5.7's src/conf.sh. */ struct buf { int x; }; FILE * (*rcsopen) (struct buf *, struct stat *, int); static char *e (p, i) char **p; int i; { return p[i]; } static char *f (char * (*g) (char **, int), char **p, ...) { char *s; va_list v; va_start (v,p); s = g (p, va_arg (v,int)); va_end (v); return s; } /* OSF 4.0 Compaq cc is some sort of almost-ANSI by default. It has function prototypes and stuff, but not '\xHH' hex character constants. These don't provoke an error unfortunately, instead are silently treated as 'x'. The following induces an error, until -std is added to get proper ANSI mode. Curiously '\x00'!='x' always comes out true, for an array size at least. It's necessary to write '\x00'==0 to get something that's true only with -std. */ int osf4_cc_array ['\x00' == 0 ?
1 : -1]; /* IBM C 6 for AIX is almost-ANSI by default, but it replaces macro parameters inside strings and character constants. */ #define FOO(x) 'x' int xlc6_cc_array[FOO(a) == 'x' ? 1 : -1]; int test (int i, double x); struct s1 {int (*f) (int a);}; struct s2 {int (*f) (double a);}; int pairnames (int, char **, FILE *(*)(struct buf *, struct stat *, int), int, int); int argc; char **argv; int main () { return f (e, argv, 0) != argv[0] || f (e, argv, 1) != argv[1]; ; return 0; } _ACEOF for ac_arg in '' -qlanglvl=extc89 -qlanglvl=ansi -std \ -Ae "-Aa -D_HPUX_SOURCE" "-Xc -D__EXTENSIONS__" do CC="$ac_save_CC $ac_arg" rm -f conftest.$ac_objext if { (ac_try="$ac_compile" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_compile") 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { test -z "$ac_c_werror_flag" || test ! 
-s conftest.err } && test -s conftest.$ac_objext; then ac_cv_prog_cc_c89=$ac_arg else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f core conftest.err conftest.$ac_objext test "x$ac_cv_prog_cc_c89" != "xno" && break done rm -f conftest.$ac_ext CC=$ac_save_CC fi # AC_CACHE_VAL case "x$ac_cv_prog_cc_c89" in x) { $as_echo "$as_me:$LINENO: result: none needed" >&5 $as_echo "none needed" >&6; } ;; xno) { $as_echo "$as_me:$LINENO: result: unsupported" >&5 $as_echo "unsupported" >&6; } ;; *) CC="$CC $ac_cv_prog_cc_c89" { $as_echo "$as_me:$LINENO: result: $ac_cv_prog_cc_c89" >&5 $as_echo "$ac_cv_prog_cc_c89" >&6; } ;; esac ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu # Check whether --with-sysmgmt-include-path was given. if test "${with_sysmgmt_include_path+set}" = set; then withval=$with_sysmgmt_include_path; SYSMGMT_CFLAGS="-I$withval" else SYSMGMT_CFLAGS="-I/opt/intel/mic/sysmgmt/sdk/include" fi # Check whether --with-sysmgmt-lib-path was given. if test "${with_sysmgmt_lib_path+set}" = set; then withval=$with_sysmgmt_lib_path; SYSMGMT_LIBS="-Wl,-rpath,$withval" else SYSMGMT_LIBS="-Wl,-rpath,/opt/intel/mic/sysmgmt/sdk/lib/Linux" fi #AC_ARG_WITH([scif-lib-path], # [AS_HELP_STRING([--with-scif-lib-path],[location of the SCIF library, needed by libMicAccessApi.so]), # [], # []) OLD_CPPFLAGS=$CPPFLAGS CPPFLAGS="-DMICACCESSAPI -DLINUX $SYSMGMT_CFLAGS" ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu { $as_echo "$as_me:$LINENO: checking how to run the C preprocessor" >&5 $as_echo_n "checking how to run the C preprocessor... " >&6; } # On Suns, sometimes $CPP names a directory. 
if test -n "$CPP" && test -d "$CPP"; then CPP= fi if test -z "$CPP"; then if test "${ac_cv_prog_CPP+set}" = set; then $as_echo_n "(cached) " >&6 else # Double quotes because CPP needs to be expanded for CPP in "$CC -E" "$CC -E -traditional-cpp" "/lib/cpp" do ac_preproc_ok=false for ac_c_preproc_warn_flag in '' yes do # Use a header file that comes with gcc, so configuring glibc # with a fresh cross-compiler works. # Prefer <limits.h> to <assert.h> if __STDC__ is defined, since # <limits.h> exists even on freestanding compilers. # On the NeXT, cc -E runs the code through the compiler's parser, # not just through cpp. "Syntax error" is here to catch this case. cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif Syntax error _ACEOF if { (ac_try="$ac_cpp conftest.$ac_ext" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_cpp conftest.$ac_ext") 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } >/dev/null && { test -z "$ac_c_preproc_warn_flag$ac_c_werror_flag" || test ! -s conftest.err }; then : else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 # Broken: fails on valid input. continue fi rm -f conftest.err conftest.$ac_ext # OK, works on sane cases. Now check whether nonexistent headers # can be detected and how. cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h.
*/ #include <ac_nonexistent.h> _ACEOF if { (ac_try="$ac_cpp conftest.$ac_ext" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_cpp conftest.$ac_ext") 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } >/dev/null && { test -z "$ac_c_preproc_warn_flag$ac_c_werror_flag" || test ! -s conftest.err }; then # Broken: success on invalid input. continue else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 # Passes both tests. ac_preproc_ok=: break fi rm -f conftest.err conftest.$ac_ext done # Because of `break', _AC_PREPROC_IFELSE's cleaning code was skipped. rm -f conftest.err conftest.$ac_ext if $ac_preproc_ok; then break fi done ac_cv_prog_CPP=$CPP fi CPP=$ac_cv_prog_CPP else ac_cv_prog_CPP=$CPP fi { $as_echo "$as_me:$LINENO: result: $CPP" >&5 $as_echo "$CPP" >&6; } ac_preproc_ok=false for ac_c_preproc_warn_flag in '' yes do # Use a header file that comes with gcc, so configuring glibc # with a fresh cross-compiler works. # Prefer <limits.h> to <assert.h> if __STDC__ is defined, since # <limits.h> exists even on freestanding compilers. # On the NeXT, cc -E runs the code through the compiler's parser, # not just through cpp. "Syntax error" is here to catch this case. cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif Syntax error _ACEOF if { (ac_try="$ac_cpp conftest.$ac_ext" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_cpp conftest.$ac_ext") 2>conftest.er1 ac_status=$?
grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } >/dev/null && { test -z "$ac_c_preproc_warn_flag$ac_c_werror_flag" || test ! -s conftest.err }; then : else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 # Broken: fails on valid input. continue fi rm -f conftest.err conftest.$ac_ext # OK, works on sane cases. Now check whether nonexistent headers # can be detected and how. cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <ac_nonexistent.h> _ACEOF if { (ac_try="$ac_cpp conftest.$ac_ext" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_cpp conftest.$ac_ext") 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } >/dev/null && { test -z "$ac_c_preproc_warn_flag$ac_c_werror_flag" || test ! -s conftest.err }; then # Broken: success on invalid input. continue else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 # Passes both tests. ac_preproc_ok=: break fi rm -f conftest.err conftest.$ac_ext done # Because of `break', _AC_PREPROC_IFELSE's cleaning code was skipped. rm -f conftest.err conftest.$ac_ext if $ac_preproc_ok; then : else { { $as_echo "$as_me:$LINENO: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} { { $as_echo "$as_me:$LINENO: error: C preprocessor \"$CPP\" fails sanity check See \`config.log' for more details." >&5 $as_echo "$as_me: error: C preprocessor \"$CPP\" fails sanity check See \`config.log' for more details."
>&2;} { (exit 1); exit 1; }; }; } fi ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu { $as_echo "$as_me:$LINENO: checking for grep that handles long lines and -e" >&5 $as_echo_n "checking for grep that handles long lines and -e... " >&6; } if test "${ac_cv_path_GREP+set}" = set; then $as_echo_n "(cached) " >&6 else if test -z "$GREP"; then ac_path_GREP_found=false # Loop through the user's path and test for each of PROGNAME-LIST as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH$PATH_SEPARATOR/usr/xpg4/bin do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_prog in grep ggrep; do for ac_exec_ext in '' $ac_executable_extensions; do ac_path_GREP="$as_dir/$ac_prog$ac_exec_ext" { test -f "$ac_path_GREP" && $as_test_x "$ac_path_GREP"; } || continue # Check for GNU ac_path_GREP and select it if it is found. # Check for GNU $ac_path_GREP case `"$ac_path_GREP" --version 2>&1` in *GNU*) ac_cv_path_GREP="$ac_path_GREP" ac_path_GREP_found=:;; *) ac_count=0 $as_echo_n 0123456789 >"conftest.in" while : do cat "conftest.in" "conftest.in" >"conftest.tmp" mv "conftest.tmp" "conftest.in" cp "conftest.in" "conftest.nl" $as_echo 'GREP' >> "conftest.nl" "$ac_path_GREP" -e 'GREP$' -e '-(cannot match)-' < "conftest.nl" >"conftest.out" 2>/dev/null || break diff "conftest.out" "conftest.nl" >/dev/null 2>&1 || break ac_count=`expr $ac_count + 1` if test $ac_count -gt ${ac_path_GREP_max-0}; then # Best one so far, save it but keep looking for a better one ac_cv_path_GREP="$ac_path_GREP" ac_path_GREP_max=$ac_count fi # 10*(2^10) chars as input seems more than enough test $ac_count -gt 10 && break done rm -f conftest.in conftest.tmp conftest.nl conftest.out;; esac $ac_path_GREP_found && break 3 done done done IFS=$as_save_IFS if test -z "$ac_cv_path_GREP"; then { { $as_echo "$as_me:$LINENO: error: no acceptable 
grep could be found in $PATH$PATH_SEPARATOR/usr/xpg4/bin" >&5 $as_echo "$as_me: error: no acceptable grep could be found in $PATH$PATH_SEPARATOR/usr/xpg4/bin" >&2;} { (exit 1); exit 1; }; } fi else ac_cv_path_GREP=$GREP fi fi { $as_echo "$as_me:$LINENO: result: $ac_cv_path_GREP" >&5 $as_echo "$ac_cv_path_GREP" >&6; } GREP="$ac_cv_path_GREP" { $as_echo "$as_me:$LINENO: checking for egrep" >&5 $as_echo_n "checking for egrep... " >&6; } if test "${ac_cv_path_EGREP+set}" = set; then $as_echo_n "(cached) " >&6 else if echo a | $GREP -E '(a|b)' >/dev/null 2>&1 then ac_cv_path_EGREP="$GREP -E" else if test -z "$EGREP"; then ac_path_EGREP_found=false # Loop through the user's path and test for each of PROGNAME-LIST as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH$PATH_SEPARATOR/usr/xpg4/bin do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_prog in egrep; do for ac_exec_ext in '' $ac_executable_extensions; do ac_path_EGREP="$as_dir/$ac_prog$ac_exec_ext" { test -f "$ac_path_EGREP" && $as_test_x "$ac_path_EGREP"; } || continue # Check for GNU ac_path_EGREP and select it if it is found. 
# Check for GNU $ac_path_EGREP case `"$ac_path_EGREP" --version 2>&1` in *GNU*) ac_cv_path_EGREP="$ac_path_EGREP" ac_path_EGREP_found=:;; *) ac_count=0 $as_echo_n 0123456789 >"conftest.in" while : do cat "conftest.in" "conftest.in" >"conftest.tmp" mv "conftest.tmp" "conftest.in" cp "conftest.in" "conftest.nl" $as_echo 'EGREP' >> "conftest.nl" "$ac_path_EGREP" 'EGREP$' < "conftest.nl" >"conftest.out" 2>/dev/null || break diff "conftest.out" "conftest.nl" >/dev/null 2>&1 || break ac_count=`expr $ac_count + 1` if test $ac_count -gt ${ac_path_EGREP_max-0}; then # Best one so far, save it but keep looking for a better one ac_cv_path_EGREP="$ac_path_EGREP" ac_path_EGREP_max=$ac_count fi # 10*(2^10) chars as input seems more than enough test $ac_count -gt 10 && break done rm -f conftest.in conftest.tmp conftest.nl conftest.out;; esac $ac_path_EGREP_found && break 3 done done done IFS=$as_save_IFS if test -z "$ac_cv_path_EGREP"; then { { $as_echo "$as_me:$LINENO: error: no acceptable egrep could be found in $PATH$PATH_SEPARATOR/usr/xpg4/bin" >&5 $as_echo "$as_me: error: no acceptable egrep could be found in $PATH$PATH_SEPARATOR/usr/xpg4/bin" >&2;} { (exit 1); exit 1; }; } fi else ac_cv_path_EGREP=$EGREP fi fi fi { $as_echo "$as_me:$LINENO: result: $ac_cv_path_EGREP" >&5 $as_echo "$ac_cv_path_EGREP" >&6; } EGREP="$ac_cv_path_EGREP" { $as_echo "$as_me:$LINENO: checking for ANSI C header files" >&5 $as_echo_n "checking for ANSI C header files... " >&6; } if test "${ac_cv_header_stdc+set}" = set; then $as_echo_n "(cached) " >&6 else cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. 
*/ #include <stdlib.h> #include <stdarg.h> #include <string.h> #include <float.h> int main () { ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (ac_try="$ac_compile" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_compile") 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { test -z "$ac_c_werror_flag" || test ! -s conftest.err } && test -s conftest.$ac_objext; then ac_cv_header_stdc=yes else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_header_stdc=no fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext if test $ac_cv_header_stdc = yes; then # SunOS 4.x string.h does not declare mem*, contrary to ANSI. cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <string.h> _ACEOF if (eval "$ac_cpp conftest.$ac_ext") 2>&5 | $EGREP "memchr" >/dev/null 2>&1; then : else ac_cv_header_stdc=no fi rm -f conftest* fi if test $ac_cv_header_stdc = yes; then # ISC 2.0.2 stdlib.h does not declare free, contrary to ANSI. cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <stdlib.h> _ACEOF if (eval "$ac_cpp conftest.$ac_ext") 2>&5 | $EGREP "free" >/dev/null 2>&1; then : else ac_cv_header_stdc=no fi rm -f conftest* fi if test $ac_cv_header_stdc = yes; then # /bin/cc in Irix-4.0.5 gets non-ANSI ctype macros unless using -ansi. if test "$cross_compiling" = yes; then : else cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h.
*/ #include <ctype.h> #include <stdlib.h> #if ((' ' & 0x0FF) == 0x020) # define ISLOWER(c) ('a' <= (c) && (c) <= 'z') # define TOUPPER(c) (ISLOWER(c) ? 'A' + ((c) - 'a') : (c)) #else # define ISLOWER(c) \ (('a' <= (c) && (c) <= 'i') \ || ('j' <= (c) && (c) <= 'r') \ || ('s' <= (c) && (c) <= 'z')) # define TOUPPER(c) (ISLOWER(c) ? ((c) | 0x40) : (c)) #endif #define XOR(e, f) (((e) && !(f)) || (!(e) && (f))) int main () { int i; for (i = 0; i < 256; i++) if (XOR (islower (i), ISLOWER (i)) || toupper (i) != TOUPPER (i)) return 2; return 0; } _ACEOF rm -f conftest$ac_exeext if { (ac_try="$ac_link" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_link") 2>&5 ac_status=$? $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='./conftest$ac_exeext' { (case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_try") 2>&5 ac_status=$? $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then : else $as_echo "$as_me: program exited with status $ac_status" >&5 $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ( exit $ac_status ) ac_cv_header_stdc=no fi rm -rf conftest.dSYM rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext conftest.$ac_objext conftest.$ac_ext fi fi fi { $as_echo "$as_me:$LINENO: result: $ac_cv_header_stdc" >&5 $as_echo "$ac_cv_header_stdc" >&6; } if test $ac_cv_header_stdc = yes; then cat >>confdefs.h <<\_ACEOF #define STDC_HEADERS 1 _ACEOF fi # On IRIX 5.3, sys/types and inttypes.h are conflicting.
for ac_header in sys/types.h sys/stat.h stdlib.h string.h memory.h strings.h \ inttypes.h stdint.h unistd.h do as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh` { $as_echo "$as_me:$LINENO: checking for $ac_header" >&5 $as_echo_n "checking for $ac_header... " >&6; } if { as_var=$as_ac_Header; eval "test \"\${$as_var+set}\" = set"; }; then $as_echo_n "(cached) " >&6 else cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ $ac_includes_default #include <$ac_header> _ACEOF rm -f conftest.$ac_objext if { (ac_try="$ac_compile" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_compile") 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { test -z "$ac_c_werror_flag" || test ! -s conftest.err } && test -s conftest.$ac_objext; then eval "$as_ac_Header=yes" else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 eval "$as_ac_Header=no" fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi ac_res=`eval 'as_val=${'$as_ac_Header'} $as_echo "$as_val"'` { $as_echo "$as_me:$LINENO: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } as_val=`eval 'as_val=${'$as_ac_Header'} $as_echo "$as_val"'` if test "x$as_val" = x""yes; then cat >>confdefs.h <<_ACEOF #define `$as_echo "HAVE_$ac_header" | $as_tr_cpp` 1 _ACEOF fi done for ac_header in MicAccessApi.h do as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh` if { as_var=$as_ac_Header; eval "test \"\${$as_var+set}\" = set"; }; then { $as_echo "$as_me:$LINENO: checking for $ac_header" >&5 $as_echo_n "checking for $ac_header... 
" >&6; } if { as_var=$as_ac_Header; eval "test \"\${$as_var+set}\" = set"; }; then $as_echo_n "(cached) " >&6 fi ac_res=`eval 'as_val=${'$as_ac_Header'} $as_echo "$as_val"'` { $as_echo "$as_me:$LINENO: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } else # Is the header compilable? { $as_echo "$as_me:$LINENO: checking $ac_header usability" >&5 $as_echo_n "checking $ac_header usability... " >&6; } cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ $ac_includes_default #include <$ac_header> _ACEOF rm -f conftest.$ac_objext if { (ac_try="$ac_compile" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_compile") 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { test -z "$ac_c_werror_flag" || test ! -s conftest.err } && test -s conftest.$ac_objext; then ac_header_compiler=yes else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_header_compiler=no fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext { $as_echo "$as_me:$LINENO: result: $ac_header_compiler" >&5 $as_echo "$ac_header_compiler" >&6; } # Is the header present? { $as_echo "$as_me:$LINENO: checking $ac_header presence" >&5 $as_echo_n "checking $ac_header presence... " >&6; } cat >conftest.$ac_ext <<_ACEOF /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <$ac_header> _ACEOF if { (ac_try="$ac_cpp conftest.$ac_ext" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:$LINENO: $ac_try_echo\"" $as_echo "$ac_try_echo") >&5 (eval "$ac_cpp conftest.$ac_ext") 2>conftest.er1 ac_status=$? 
grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 $as_echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } >/dev/null && { test -z "$ac_c_preproc_warn_flag$ac_c_werror_flag" || test ! -s conftest.err }; then ac_header_preproc=yes else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_header_preproc=no fi rm -f conftest.err conftest.$ac_ext { $as_echo "$as_me:$LINENO: result: $ac_header_preproc" >&5 $as_echo "$ac_header_preproc" >&6; } # So? What about this header? case $ac_header_compiler:$ac_header_preproc:$ac_c_preproc_warn_flag in yes:no: ) { $as_echo "$as_me:$LINENO: WARNING: $ac_header: accepted by the compiler, rejected by the preprocessor!" >&5 $as_echo "$as_me: WARNING: $ac_header: accepted by the compiler, rejected by the preprocessor!" >&2;} { $as_echo "$as_me:$LINENO: WARNING: $ac_header: proceeding with the compiler's result" >&5 $as_echo "$as_me: WARNING: $ac_header: proceeding with the compiler's result" >&2;} ac_header_preproc=yes ;; no:yes:* ) { $as_echo "$as_me:$LINENO: WARNING: $ac_header: present but cannot be compiled" >&5 $as_echo "$as_me: WARNING: $ac_header: present but cannot be compiled" >&2;} { $as_echo "$as_me:$LINENO: WARNING: $ac_header: check for missing prerequisite headers?" >&5 $as_echo "$as_me: WARNING: $ac_header: check for missing prerequisite headers?" 
>&2;} { $as_echo "$as_me:$LINENO: WARNING: $ac_header: see the Autoconf documentation" >&5 $as_echo "$as_me: WARNING: $ac_header: see the Autoconf documentation" >&2;} { $as_echo "$as_me:$LINENO: WARNING: $ac_header: section \"Present But Cannot Be Compiled\"" >&5 $as_echo "$as_me: WARNING: $ac_header: section \"Present But Cannot Be Compiled\"" >&2;} { $as_echo "$as_me:$LINENO: WARNING: $ac_header: proceeding with the preprocessor's result" >&5 $as_echo "$as_me: WARNING: $ac_header: proceeding with the preprocessor's result" >&2;} { $as_echo "$as_me:$LINENO: WARNING: $ac_header: in the future, the compiler will take precedence" >&5 $as_echo "$as_me: WARNING: $ac_header: in the future, the compiler will take precedence" >&2;} ;; esac { $as_echo "$as_me:$LINENO: checking for $ac_header" >&5 $as_echo_n "checking for $ac_header... " >&6; } if { as_var=$as_ac_Header; eval "test \"\${$as_var+set}\" = set"; }; then $as_echo_n "(cached) " >&6 else eval "$as_ac_Header=\$ac_header_preproc" fi ac_res=`eval 'as_val=${'$as_ac_Header'} $as_echo "$as_val"'` { $as_echo "$as_me:$LINENO: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } fi as_val=`eval 'as_val=${'$as_ac_Header'} $as_echo "$as_val"'` if test "x$as_val" = x""yes; then cat >>confdefs.h <<_ACEOF #define `$as_echo "HAVE_$ac_header" | $as_tr_cpp` 1 _ACEOF else { { $as_echo "$as_me:$LINENO: error: Couldn't find MicAccessApi.h...try installing MPSS from \ http://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-mpss" >&5 $as_echo "$as_me: error: Couldn't find MicAccessApi.h...try installing MPSS from \ http://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-mpss" >&2;} { (exit 1); exit 1; }; } fi done CPPFLAGS=$OLD_CPPFLAGS ac_config_files="$ac_config_files Makefile.host_micpower" cat >confcache <<\_ACEOF # This file is a shell script that caches the results of configure # tests run on this system so they can be shared between configure # scripts and configure runs, see 
configure's option --config-cache. # It is not useful on other systems. If it contains results you don't # want to keep, you may remove or edit it. # # config.status only pays attention to the cache file if you give it # the --recheck option to rerun configure. # # `ac_cv_env_foo' variables (set or unset) will be overridden when # loading this file, other *unset* `ac_cv_foo' will be assigned the # following values. _ACEOF # The following way of writing the cache mishandles newlines in values, # but we know of no workaround that is simple, portable, and efficient. # So, we kill variables containing newlines. # Ultrix sh set writes to stderr and can't be redirected directly, # and sets the high bit in the cache file unless we assign to the vars. ( for ac_var in `(set) 2>&1 | sed -n 's/^\([a-zA-Z_][a-zA-Z0-9_]*\)=.*/\1/p'`; do eval ac_val=\$$ac_var case $ac_val in #( *${as_nl}*) case $ac_var in #( *_cv_*) { $as_echo "$as_me:$LINENO: WARNING: cache variable $ac_var contains a newline" >&5 $as_echo "$as_me: WARNING: cache variable $ac_var contains a newline" >&2;} ;; esac case $ac_var in #( _ | IFS | as_nl) ;; #( BASH_ARGV | BASH_SOURCE) eval $ac_var= ;; #( *) $as_unset $ac_var ;; esac ;; esac done (set) 2>&1 | case $as_nl`(ac_space=' '; set) 2>&1` in #( *${as_nl}ac_space=\ *) # `set' does not quote correctly, so add quotes (double-quote # substitution turns \\\\ into \\, and sed turns \\ into \). sed -n \ "s/'/'\\\\''/g; s/^\\([_$as_cr_alnum]*_cv_[_$as_cr_alnum]*\\)=\\(.*\\)/\\1='\\2'/p" ;; #( *) # `set' quotes correctly as required by POSIX, so do not add quotes. 
sed -n "/^[_$as_cr_alnum]*_cv_[_$as_cr_alnum]*=/p" ;; esac | sort ) | sed ' /^ac_cv_env_/b end t clear :clear s/^\([^=]*\)=\(.*[{}].*\)$/test "${\1+set}" = set || &/ t end s/^\([^=]*\)=\(.*\)$/\1=${\1=\2}/ :end' >>confcache if diff "$cache_file" confcache >/dev/null 2>&1; then :; else if test -w "$cache_file"; then test "x$cache_file" != "x/dev/null" && { $as_echo "$as_me:$LINENO: updating cache $cache_file" >&5 $as_echo "$as_me: updating cache $cache_file" >&6;} cat confcache >$cache_file else { $as_echo "$as_me:$LINENO: not updating unwritable cache $cache_file" >&5 $as_echo "$as_me: not updating unwritable cache $cache_file" >&6;} fi fi rm -f confcache test "x$prefix" = xNONE && prefix=$ac_default_prefix # Let make expand exec_prefix. test "x$exec_prefix" = xNONE && exec_prefix='${prefix}' # Transform confdefs.h into DEFS. # Protect against shell expansion while executing Makefile rules. # Protect against Makefile macro expansion. # # If the first sed substitution is executed (which looks for macros that # take arguments), then branch to the quote section. Otherwise, # look for a macro that doesn't take arguments. ac_script=' :mline /\\$/{ N s,\\\n,, b mline } t clear :clear s/^[ ]*#[ ]*define[ ][ ]*\([^ (][^ (]*([^)]*)\)[ ]*\(.*\)/-D\1=\2/g t quote s/^[ ]*#[ ]*define[ ][ ]*\([^ ][^ ]*\)[ ]*\(.*\)/-D\1=\2/g t quote b any :quote s/[ `~#$^&*(){}\\|;'\''"<>?]/\\&/g s/\[/\\&/g s/\]/\\&/g s/\$/$$/g H :any ${ g s/^\n// s/\n/ /g p } ' DEFS=`sed -n "$ac_script" confdefs.h` ac_libobjs= ac_ltlibobjs= for ac_i in : $LIBOBJS; do test "x$ac_i" = x: && continue # 1. Remove the extension, and $U if already installed. ac_script='s/\$U\././;s/\.o$//;s/\.obj$//' ac_i=`$as_echo "$ac_i" | sed "$ac_script"` # 2. Prepend LIBOBJDIR. When used with automake>=1.10 LIBOBJDIR # will be set to the directory where LIBOBJS objects are built. 
ac_libobjs="$ac_libobjs \${LIBOBJDIR}$ac_i\$U.$ac_objext" ac_ltlibobjs="$ac_ltlibobjs \${LIBOBJDIR}$ac_i"'$U.lo' done LIBOBJS=$ac_libobjs LTLIBOBJS=$ac_ltlibobjs : ${CONFIG_STATUS=./config.status} ac_write_fail=0 ac_clean_files_save=$ac_clean_files ac_clean_files="$ac_clean_files $CONFIG_STATUS" { $as_echo "$as_me:$LINENO: creating $CONFIG_STATUS" >&5 $as_echo "$as_me: creating $CONFIG_STATUS" >&6;} cat >$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 #! $SHELL # Generated by $as_me. # Run this file to recreate the current configuration. # Compiler output produced by configure, useful for debugging # configure, is in config.log if it exists. debug=false ac_cs_recheck=false ac_cs_silent=false SHELL=\${CONFIG_SHELL-$SHELL} _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 ## --------------------- ## ## M4sh Initialization. ## ## --------------------- ## # Be more Bourne compatible DUALCASE=1; export DUALCASE # for MKS sh if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then emulate sh NULLCMD=: # Pre-4.2 versions of Zsh do word splitting on ${1+"$@"}, which # is contrary to our usage. Disable this feature. alias -g '${1+"$@"}'='"$@"' setopt NO_GLOB_SUBST else case `(set -o) 2>/dev/null` in *posix*) set -o posix ;; esac fi # PATH needs CR # Avoid depending upon Character Ranges. as_cr_letters='abcdefghijklmnopqrstuvwxyz' as_cr_LETTERS='ABCDEFGHIJKLMNOPQRSTUVWXYZ' as_cr_Letters=$as_cr_letters$as_cr_LETTERS as_cr_digits='0123456789' as_cr_alnum=$as_cr_Letters$as_cr_digits as_nl=' ' export as_nl # Printing a long string crashes Solaris 7 /usr/bin/printf. 
as_echo='\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\' as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo$as_echo if (test "X`printf %s $as_echo`" = "X$as_echo") 2>/dev/null; then as_echo='printf %s\n' as_echo_n='printf %s' else if test "X`(/usr/ucb/echo -n -n $as_echo) 2>/dev/null`" = "X-n $as_echo"; then as_echo_body='eval /usr/ucb/echo -n "$1$as_nl"' as_echo_n='/usr/ucb/echo -n' else as_echo_body='eval expr "X$1" : "X\\(.*\\)"' as_echo_n_body='eval arg=$1; case $arg in *"$as_nl"*) expr "X$arg" : "X\\(.*\\)$as_nl"; arg=`expr "X$arg" : ".*$as_nl\\(.*\\)"`;; esac; expr "X$arg" : "X\\(.*\\)" | tr -d "$as_nl" ' export as_echo_n_body as_echo_n='sh -c $as_echo_n_body as_echo' fi export as_echo_body as_echo='sh -c $as_echo_body as_echo' fi # The user is always right. if test "${PATH_SEPARATOR+set}" != set; then PATH_SEPARATOR=: (PATH='/bin;/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 && { (PATH='/bin:/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 || PATH_SEPARATOR=';' } fi # Support unset when possible. if ( (MAIL=60; unset MAIL) || exit) >/dev/null 2>&1; then as_unset=unset else as_unset=false fi # IFS # We need space, tab and new line, in precisely that order. Quoting is # there to prevent editors from complaining about space-tab. # (If _AS_PATH_WALK were called with IFS unset, it would disable word # splitting by setting IFS to empty value.) IFS=" "" $as_nl" # Find who we are. Look in the path if we contain no directory separator. case $0 in *[\\/]* ) as_myself=$0 ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. test -r "$as_dir/$0" && as_myself=$as_dir/$0 && break done IFS=$as_save_IFS ;; esac # We did not find ourselves, most probably we were run as `sh COMMAND' # in which case we are not to be found in the path. if test "x$as_myself" = x; then as_myself=$0 fi if test ! 
-f "$as_myself"; then $as_echo "$as_myself: error: cannot find myself; rerun with an absolute file name" >&2 { (exit 1); exit 1; } fi # Work around bugs in pre-3.0 UWIN ksh. for as_var in ENV MAIL MAILPATH do ($as_unset $as_var) >/dev/null 2>&1 && $as_unset $as_var done PS1='$ ' PS2='> ' PS4='+ ' # NLS nuisances. LC_ALL=C export LC_ALL LANGUAGE=C export LANGUAGE # Required to use basename. if expr a : '\(a\)' >/dev/null 2>&1 && test "X`expr 00001 : '.*\(...\)'`" = X001; then as_expr=expr else as_expr=false fi if (basename -- /) >/dev/null 2>&1 && test "X`basename -- / 2>&1`" = "X/"; then as_basename=basename else as_basename=false fi # Name of the executable. as_me=`$as_basename -- "$0" || $as_expr X/"$0" : '.*/\([^/][^/]*\)/*$' \| \ X"$0" : 'X\(//\)$' \| \ X"$0" : 'X\(/\)' \| . 2>/dev/null || $as_echo X/"$0" | sed '/^.*\/\([^/][^/]*\)\/*$/{ s//\1/ q } /^X\/\(\/\/\)$/{ s//\1/ q } /^X\/\(\/\).*/{ s//\1/ q } s/.*/./; q'` # CDPATH. $as_unset CDPATH as_lineno_1=$LINENO as_lineno_2=$LINENO test "x$as_lineno_1" != "x$as_lineno_2" && test "x`expr $as_lineno_1 + 1`" = "x$as_lineno_2" || { # Create $as_me.lineno as a copy of $as_myself, but with $LINENO # uniformly replaced by the line number. The first 'sed' inserts a # line-number line after each line using $LINENO; the second 'sed' # does the real work. The second script uses 'N' to pair each # line-number line with the line containing $LINENO, and appends # trailing '-' during substitution so that $LINENO is not a special # case at line end. # (Raja R Harinath suggested sed '=', and Paul Eggert wrote the # scripts with optimization help from Paolo Bonzini. Blame Lee # E. McMahon (1931-1989) for sed's syntax. 
:-) sed -n ' p /[$]LINENO/= ' <$as_myself | sed ' s/[$]LINENO.*/&-/ t lineno b :lineno N :loop s/[$]LINENO\([^'$as_cr_alnum'_].*\n\)\(.*\)/\2\1\2/ t loop s/-\n.*// ' >$as_me.lineno && chmod +x "$as_me.lineno" || { $as_echo "$as_me: error: cannot create $as_me.lineno; rerun with a POSIX shell" >&2 { (exit 1); exit 1; }; } # Don't try to exec as it changes $[0], causing all sort of problems # (the dirname of $[0] is not the place where we might find the # original and so on. Autoconf is especially sensitive to this). . "./$as_me.lineno" # Exit status is that of the last command. exit } if (as_dir=`dirname -- /` && test "X$as_dir" = X/) >/dev/null 2>&1; then as_dirname=dirname else as_dirname=false fi ECHO_C= ECHO_N= ECHO_T= case `echo -n x` in -n*) case `echo 'x\c'` in *c*) ECHO_T=' ';; # ECHO_T is single tab character. *) ECHO_C='\c';; esac;; *) ECHO_N='-n';; esac if expr a : '\(a\)' >/dev/null 2>&1 && test "X`expr 00001 : '.*\(...\)'`" = X001; then as_expr=expr else as_expr=false fi rm -f conf$$ conf$$.exe conf$$.file if test -d conf$$.dir; then rm -f conf$$.dir/conf$$.file else rm -f conf$$.dir mkdir conf$$.dir 2>/dev/null fi if (echo >conf$$.file) 2>/dev/null; then if ln -s conf$$.file conf$$ 2>/dev/null; then as_ln_s='ln -s' # ... but there are two gotchas: # 1) On MSYS, both `ln -s file dir' and `ln file dir' fail. # 2) DJGPP < 2.04 has no symlinks; `ln -s' creates a wrapper executable. # In both cases, we have to default to `cp -p'. ln -s conf$$.file conf$$.dir 2>/dev/null && test ! -f conf$$.exe || as_ln_s='cp -p' elif ln conf$$.file conf$$ 2>/dev/null; then as_ln_s=ln else as_ln_s='cp -p' fi else as_ln_s='cp -p' fi rm -f conf$$ conf$$.exe conf$$.dir/conf$$.file conf$$.file rmdir conf$$.dir 2>/dev/null if mkdir -p . 
2>/dev/null; then as_mkdir_p=: else test -d ./-p && rmdir ./-p as_mkdir_p=false fi if test -x / >/dev/null 2>&1; then as_test_x='test -x' else if ls -dL / >/dev/null 2>&1; then as_ls_L_option=L else as_ls_L_option= fi as_test_x=' eval sh -c '\'' if test -d "$1"; then test -d "$1/."; else case $1 in -*)set "./$1";; esac; case `ls -ld'$as_ls_L_option' "$1" 2>/dev/null` in ???[sx]*):;;*)false;;esac;fi '\'' sh ' fi as_executable_p=$as_test_x # Sed expression to map a string onto a valid CPP name. as_tr_cpp="eval sed 'y%*$as_cr_letters%P$as_cr_LETTERS%;s%[^_$as_cr_alnum]%_%g'" # Sed expression to map a string onto a valid variable name. as_tr_sh="eval sed 'y%*+%pp%;s%[^_$as_cr_alnum]%_%g'" exec 6>&1 # Save the log message, to keep $[0] and so on meaningful, and to # report actual input values of CONFIG_FILES etc. instead of their # values after options handling. ac_log=" This file was extended by host_micpower $as_me version-0.1, which was generated by GNU Autoconf 2.63. Invocation command line was CONFIG_FILES = $CONFIG_FILES CONFIG_HEADERS = $CONFIG_HEADERS CONFIG_LINKS = $CONFIG_LINKS CONFIG_COMMANDS = $CONFIG_COMMANDS $ $0 $@ on `(hostname || uname -n) 2>/dev/null | sed 1q` " _ACEOF case $ac_config_files in *" "*) set x $ac_config_files; shift; ac_config_files=$*;; esac cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 # Files that config.status was made for. config_files="$ac_config_files" _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 ac_cs_usage="\ \`$as_me' instantiates files from templates according to the current configuration. Usage: $0 [OPTION]... [FILE]... -h, --help print this help, then exit -V, --version print version number and configuration settings, then exit -q, --quiet, --silent do not print progress messages -d, --debug don't remove temporary files --recheck update $as_me by reconfiguring in the same conditions --file=FILE[:TEMPLATE] instantiate the configuration file FILE Configuration files: $config_files Report bugs to ." 
_ACEOF cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 ac_cs_version="\\ host_micpower config.status version-0.1 configured by $0, generated by GNU Autoconf 2.63, with options \\"`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`\\" Copyright (C) 2008 Free Software Foundation, Inc. This config.status script is free software; the Free Software Foundation gives unlimited permission to copy, distribute and modify it." ac_pwd='$ac_pwd' srcdir='$srcdir' test -n "\$AWK" || AWK=awk _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 # The default lists apply if the user does not specify any file. ac_need_defaults=: while test $# != 0 do case $1 in --*=*) ac_option=`expr "X$1" : 'X\([^=]*\)='` ac_optarg=`expr "X$1" : 'X[^=]*=\(.*\)'` ac_shift=: ;; *) ac_option=$1 ac_optarg=$2 ac_shift=shift ;; esac case $ac_option in # Handling of the options. -recheck | --recheck | --rechec | --reche | --rech | --rec | --re | --r) ac_cs_recheck=: ;; --version | --versio | --versi | --vers | --ver | --ve | --v | -V ) $as_echo "$ac_cs_version"; exit ;; --debug | --debu | --deb | --de | --d | -d ) debug=: ;; --file | --fil | --fi | --f ) $ac_shift case $ac_optarg in *\'*) ac_optarg=`$as_echo "$ac_optarg" | sed "s/'/'\\\\\\\\''/g"` ;; esac CONFIG_FILES="$CONFIG_FILES '$ac_optarg'" ac_need_defaults=false;; --he | --h | --help | --hel | -h ) $as_echo "$ac_cs_usage"; exit ;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil | --si | --s) ac_cs_silent=: ;; # This is an error. -*) { $as_echo "$as_me: error: unrecognized option: $1 Try \`$0 --help' for more information." 
>&2 { (exit 1); exit 1; }; } ;; *) ac_config_targets="$ac_config_targets $1" ac_need_defaults=false ;; esac shift done ac_configure_extra_args= if $ac_cs_silent; then exec 6>/dev/null ac_configure_extra_args="$ac_configure_extra_args --silent" fi _ACEOF cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 if \$ac_cs_recheck; then set X '$SHELL' '$0' $ac_configure_args \$ac_configure_extra_args --no-create --no-recursion shift \$as_echo "running CONFIG_SHELL=$SHELL \$*" >&6 CONFIG_SHELL='$SHELL' export CONFIG_SHELL exec "\$@" fi _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 exec 5>>config.log { echo sed 'h;s/./-/g;s/^.../## /;s/...$/ ##/;p;x;p;x' <<_ASBOX ## Running $as_me. ## _ASBOX $as_echo "$ac_log" } >&5 _ACEOF cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 # Handling of arguments. for ac_config_target in $ac_config_targets do case $ac_config_target in "Makefile.host_micpower") CONFIG_FILES="$CONFIG_FILES Makefile.host_micpower" ;; *) { { $as_echo "$as_me:$LINENO: error: invalid argument: $ac_config_target" >&5 $as_echo "$as_me: error: invalid argument: $ac_config_target" >&2;} { (exit 1); exit 1; }; };; esac done # If the user did not use the arguments to specify the items to instantiate, # then the envvar interface is used. Set only those that are not. # We use the long form for the default assignment because of an extremely # bizarre bug on SunOS 4.1.3. if $ac_need_defaults; then test "${CONFIG_FILES+set}" = set || CONFIG_FILES=$config_files fi # Have a temporary directory for convenience. Make it in the build tree # simply because there is no reason against having it here, and in addition, # creating and moving files from /tmp can sometimes cause problems. # Hook for its removal unless debugging. # Note that there is a small window in which the directory will not be cleaned: # after its creation but before its name has been assigned to `$tmp'. $debug || { tmp= trap 'exit_status=$? 
{ test -z "$tmp" || test ! -d "$tmp" || rm -fr "$tmp"; } && exit $exit_status ' 0 trap '{ (exit 1); exit 1; }' 1 2 13 15 } # Create a (secure) tmp directory for tmp files. { tmp=`(umask 077 && mktemp -d "./confXXXXXX") 2>/dev/null` && test -n "$tmp" && test -d "$tmp" } || { tmp=./conf$$-$RANDOM (umask 077 && mkdir "$tmp") } || { $as_echo "$as_me: cannot create a temporary directory in ." >&2 { (exit 1); exit 1; } } # Set up the scripts for CONFIG_FILES section. # No need to generate them if there are no CONFIG_FILES. # This happens for instance with `./config.status config.h'. if test -n "$CONFIG_FILES"; then ac_cr=' ' ac_cs_awk_cr=`$AWK 'BEGIN { print "a\rb" }' /dev/null` if test "$ac_cs_awk_cr" = "a${ac_cr}b"; then ac_cs_awk_cr='\\r' else ac_cs_awk_cr=$ac_cr fi echo 'BEGIN {' >"$tmp/subs1.awk" && _ACEOF { echo "cat >conf$$subs.awk <<_ACEOF" && echo "$ac_subst_vars" | sed 's/.*/&!$&$ac_delim/' && echo "_ACEOF" } >conf$$subs.sh || { { $as_echo "$as_me:$LINENO: error: could not make $CONFIG_STATUS" >&5 $as_echo "$as_me: error: could not make $CONFIG_STATUS" >&2;} { (exit 1); exit 1; }; } ac_delim_num=`echo "$ac_subst_vars" | grep -c '$'` ac_delim='%!_!# ' for ac_last_try in false false false false false :; do . ./conf$$subs.sh || { { $as_echo "$as_me:$LINENO: error: could not make $CONFIG_STATUS" >&5 $as_echo "$as_me: error: could not make $CONFIG_STATUS" >&2;} { (exit 1); exit 1; }; } ac_delim_n=`sed -n "s/.*$ac_delim\$/X/p" conf$$subs.awk | grep -c X` if test $ac_delim_n = $ac_delim_num; then break elif $ac_last_try; then { { $as_echo "$as_me:$LINENO: error: could not make $CONFIG_STATUS" >&5 $as_echo "$as_me: error: could not make $CONFIG_STATUS" >&2;} { (exit 1); exit 1; }; } else ac_delim="$ac_delim!$ac_delim _$ac_delim!! 
" fi done rm -f conf$$subs.sh cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 cat >>"\$tmp/subs1.awk" <<\\_ACAWK && _ACEOF sed -n ' h s/^/S["/; s/!.*/"]=/ p g s/^[^!]*!// :repl t repl s/'"$ac_delim"'$// t delim :nl h s/\(.\{148\}\).*/\1/ t more1 s/["\\]/\\&/g; s/^/"/; s/$/\\n"\\/ p n b repl :more1 s/["\\]/\\&/g; s/^/"/; s/$/"\\/ p g s/.\{148\}// t nl :delim h s/\(.\{148\}\).*/\1/ t more2 s/["\\]/\\&/g; s/^/"/; s/$/"/ p b :more2 s/["\\]/\\&/g; s/^/"/; s/$/"\\/ p g s/.\{148\}// t delim ' >$CONFIG_STATUS || ac_write_fail=1 rm -f conf$$subs.awk cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 _ACAWK cat >>"\$tmp/subs1.awk" <<_ACAWK && for (key in S) S_is_set[key] = 1 FS = "" } { line = $ 0 nfields = split(line, field, "@") substed = 0 len = length(field[1]) for (i = 2; i < nfields; i++) { key = field[i] keylen = length(key) if (S_is_set[key]) { value = S[key] line = substr(line, 1, len) "" value "" substr(line, len + keylen + 3) len += length(value) + length(field[++i]) substed = 1 } else len += 1 + keylen } print line } _ACAWK _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 if sed "s/$ac_cr//" < /dev/null > /dev/null 2>&1; then sed "s/$ac_cr\$//; s/$ac_cr/$ac_cs_awk_cr/g" else cat fi < "$tmp/subs1.awk" > "$tmp/subs.awk" \ || { { $as_echo "$as_me:$LINENO: error: could not setup config files machinery" >&5 $as_echo "$as_me: error: could not setup config files machinery" >&2;} { (exit 1); exit 1; }; } _ACEOF # VPATH may cause trouble with some makes, so we remove $(srcdir), # ${srcdir} and @srcdir@ from VPATH if srcdir is ".", strip leading and # trailing colons and then remove the whole line if VPATH becomes empty # (actually we leave an empty line to preserve line numbers). 
if test "x$srcdir" = x.; then ac_vpsub='/^[ ]*VPATH[ ]*=/{ s/:*\$(srcdir):*/:/ s/:*\${srcdir}:*/:/ s/:*@srcdir@:*/:/ s/^\([^=]*=[ ]*\):*/\1/ s/:*$// s/^[^=]*=[ ]*$// }' fi cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 fi # test -n "$CONFIG_FILES" eval set X " :F $CONFIG_FILES " shift for ac_tag do case $ac_tag in :[FHLC]) ac_mode=$ac_tag; continue;; esac case $ac_mode$ac_tag in :[FHL]*:*);; :L* | :C*:*) { { $as_echo "$as_me:$LINENO: error: invalid tag $ac_tag" >&5 $as_echo "$as_me: error: invalid tag $ac_tag" >&2;} { (exit 1); exit 1; }; };; :[FH]-) ac_tag=-:-;; :[FH]*) ac_tag=$ac_tag:$ac_tag.in;; esac ac_save_IFS=$IFS IFS=: set x $ac_tag IFS=$ac_save_IFS shift ac_file=$1 shift case $ac_mode in :L) ac_source=$1;; :[FH]) ac_file_inputs= for ac_f do case $ac_f in -) ac_f="$tmp/stdin";; *) # Look for the file first in the build tree, then in the source tree # (if the path is not absolute). The absolute path cannot be DOS-style, # because $ac_f cannot contain `:'. test -f "$ac_f" || case $ac_f in [\\/$]*) false;; *) test -f "$srcdir/$ac_f" && ac_f="$srcdir/$ac_f";; esac || { { $as_echo "$as_me:$LINENO: error: cannot find input file: $ac_f" >&5 $as_echo "$as_me: error: cannot find input file: $ac_f" >&2;} { (exit 1); exit 1; }; };; esac case $ac_f in *\'*) ac_f=`$as_echo "$ac_f" | sed "s/'/'\\\\\\\\''/g"`;; esac ac_file_inputs="$ac_file_inputs '$ac_f'" done # Let's still pretend it is `configure' which instantiates (i.e., don't # use $as_me), people would be surprised to read: # /* config.h. Generated by config.status. */ configure_input='Generated from '` $as_echo "$*" | sed 's|^[^:]*/||;s|:[^:]*/|, |g' `' by configure.' if test x"$ac_file" != x-; then configure_input="$ac_file. $configure_input" { $as_echo "$as_me:$LINENO: creating $ac_file" >&5 $as_echo "$as_me: creating $ac_file" >&6;} fi # Neutralize special characters interpreted by sed in replacement strings. 
case $configure_input in #( *\&* | *\|* | *\\* ) ac_sed_conf_input=`$as_echo "$configure_input" | sed 's/[\\\\&|]/\\\\&/g'`;; #( *) ac_sed_conf_input=$configure_input;; esac case $ac_tag in *:-:* | *:-) cat >"$tmp/stdin" \ || { { $as_echo "$as_me:$LINENO: error: could not create $ac_file" >&5 $as_echo "$as_me: error: could not create $ac_file" >&2;} { (exit 1); exit 1; }; } ;; esac ;; esac ac_dir=`$as_dirname -- "$ac_file" || $as_expr X"$ac_file" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$ac_file" : 'X\(//\)[^/]' \| \ X"$ac_file" : 'X\(//\)$' \| \ X"$ac_file" : 'X\(/\)' \| . 2>/dev/null || $as_echo X"$ac_file" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/ q } /^X\(\/\/\)[^/].*/{ s//\1/ q } /^X\(\/\/\)$/{ s//\1/ q } /^X\(\/\).*/{ s//\1/ q } s/.*/./; q'` { as_dir="$ac_dir" case $as_dir in #( -*) as_dir=./$as_dir;; esac test -d "$as_dir" || { $as_mkdir_p && mkdir -p "$as_dir"; } || { as_dirs= while :; do case $as_dir in #( *\'*) as_qdir=`$as_echo "$as_dir" | sed "s/'/'\\\\\\\\''/g"`;; #'( *) as_qdir=$as_dir;; esac as_dirs="'$as_qdir' $as_dirs" as_dir=`$as_dirname -- "$as_dir" || $as_expr X"$as_dir" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$as_dir" : 'X\(//\)[^/]' \| \ X"$as_dir" : 'X\(//\)$' \| \ X"$as_dir" : 'X\(/\)' \| . 2>/dev/null || $as_echo X"$as_dir" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/ q } /^X\(\/\/\)[^/].*/{ s//\1/ q } /^X\(\/\/\)$/{ s//\1/ q } /^X\(\/\).*/{ s//\1/ q } s/.*/./; q'` test -d "$as_dir" && break done test -z "$as_dirs" || eval "mkdir $as_dirs" } || test -d "$as_dir" || { { $as_echo "$as_me:$LINENO: error: cannot create directory $as_dir" >&5 $as_echo "$as_me: error: cannot create directory $as_dir" >&2;} { (exit 1); exit 1; }; }; } ac_builddir=. case "$ac_dir" in .) ac_dir_suffix= ac_top_builddir_sub=. ac_top_build_prefix= ;; *) ac_dir_suffix=/`$as_echo "$ac_dir" | sed 's|^\.[\\/]||'` # A ".." for each directory in $ac_dir_suffix. 
ac_top_builddir_sub=`$as_echo "$ac_dir_suffix" | sed 's|/[^\\/]*|/..|g;s|/||'` case $ac_top_builddir_sub in "") ac_top_builddir_sub=. ac_top_build_prefix= ;; *) ac_top_build_prefix=$ac_top_builddir_sub/ ;; esac ;; esac ac_abs_top_builddir=$ac_pwd ac_abs_builddir=$ac_pwd$ac_dir_suffix # for backward compatibility: ac_top_builddir=$ac_top_build_prefix case $srcdir in .) # We are building in place. ac_srcdir=. ac_top_srcdir=$ac_top_builddir_sub ac_abs_top_srcdir=$ac_pwd ;; [\\/]* | ?:[\\/]* ) # Absolute name. ac_srcdir=$srcdir$ac_dir_suffix; ac_top_srcdir=$srcdir ac_abs_top_srcdir=$srcdir ;; *) # Relative name. ac_srcdir=$ac_top_build_prefix$srcdir$ac_dir_suffix ac_top_srcdir=$ac_top_build_prefix$srcdir ac_abs_top_srcdir=$ac_pwd/$srcdir ;; esac ac_abs_srcdir=$ac_abs_top_srcdir$ac_dir_suffix case $ac_mode in :F) # # CONFIG_FILE # _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 # If the template does not know about datarootdir, expand it. # FIXME: This hack should be removed a few years after 2.60. ac_datarootdir_hack=; ac_datarootdir_seen= ac_sed_dataroot=' /datarootdir/ { p q } /@datadir@/p /@docdir@/p /@infodir@/p /@localedir@/p /@mandir@/p ' case `eval "sed -n \"\$ac_sed_dataroot\" $ac_file_inputs"` in *datarootdir*) ac_datarootdir_seen=yes;; *@datadir@*|*@docdir@*|*@infodir@*|*@localedir@*|*@mandir@*) { $as_echo "$as_me:$LINENO: WARNING: $ac_file_inputs seems to ignore the --datarootdir setting" >&5 $as_echo "$as_me: WARNING: $ac_file_inputs seems to ignore the --datarootdir setting" >&2;} _ACEOF cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 ac_datarootdir_hack=' s&@datadir@&$datadir&g s&@docdir@&$docdir&g s&@infodir@&$infodir&g s&@localedir@&$localedir&g s&@mandir@&$mandir&g s&\\\${datarootdir}&$datarootdir&g' ;; esac _ACEOF # Neutralize VPATH when `$srcdir' = `.'. # Shell code in configure.ac might set extrasub. # FIXME: do we really want to maintain this feature? 
cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 ac_sed_extra="$ac_vpsub $extrasub _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 :t /@[a-zA-Z_][a-zA-Z_0-9]*@/!b s|@configure_input@|$ac_sed_conf_input|;t t s&@top_builddir@&$ac_top_builddir_sub&;t t s&@top_build_prefix@&$ac_top_build_prefix&;t t s&@srcdir@&$ac_srcdir&;t t s&@abs_srcdir@&$ac_abs_srcdir&;t t s&@top_srcdir@&$ac_top_srcdir&;t t s&@abs_top_srcdir@&$ac_abs_top_srcdir&;t t s&@builddir@&$ac_builddir&;t t s&@abs_builddir@&$ac_abs_builddir&;t t s&@abs_top_builddir@&$ac_abs_top_builddir&;t t $ac_datarootdir_hack " eval sed \"\$ac_sed_extra\" "$ac_file_inputs" | $AWK -f "$tmp/subs.awk" >$tmp/out \ || { { $as_echo "$as_me:$LINENO: error: could not create $ac_file" >&5 $as_echo "$as_me: error: could not create $ac_file" >&2;} { (exit 1); exit 1; }; } test -z "$ac_datarootdir_hack$ac_datarootdir_seen" && { ac_out=`sed -n '/\${datarootdir}/p' "$tmp/out"`; test -n "$ac_out"; } && { ac_out=`sed -n '/^[ ]*datarootdir[ ]*:*=/p' "$tmp/out"`; test -z "$ac_out"; } && { $as_echo "$as_me:$LINENO: WARNING: $ac_file contains a reference to the variable \`datarootdir' which seems to be undefined. Please make sure it is defined." >&5 $as_echo "$as_me: WARNING: $ac_file contains a reference to the variable \`datarootdir' which seems to be undefined. Please make sure it is defined." 
>&2;} rm -f "$tmp/stdin" case $ac_file in -) cat "$tmp/out" && rm -f "$tmp/out";; *) rm -f "$ac_file" && mv "$tmp/out" "$ac_file";; esac \ || { { $as_echo "$as_me:$LINENO: error: could not create $ac_file" >&5 $as_echo "$as_me: error: could not create $ac_file" >&2;} { (exit 1); exit 1; }; } ;; esac done # for ac_tag { (exit 0); exit 0; } _ACEOF chmod +x $CONFIG_STATUS ac_clean_files=$ac_clean_files_save test $ac_write_fail = 0 || { { $as_echo "$as_me:$LINENO: error: write failure creating $CONFIG_STATUS" >&5 $as_echo "$as_me: error: write failure creating $CONFIG_STATUS" >&2;} { (exit 1); exit 1; }; } # configure is writing to config.log, and then calls config.status. # config.status does its own redirection, appending to config.log. # Unfortunately, on DOS this fails, as config.log is still kept open # by configure, so config.status won't be able to write to it; its # output is simply discarded. So we exec the FD to /dev/null, # effectively closing config.log, so it can be properly (re)opened and # appended to by config.status. When coming back to configure, we # need to make the FD available again. if test "$no_create" != yes; then ac_cs_success=: ac_config_status_args= test "$silent" = yes && ac_config_status_args="$ac_config_status_args --quiet" exec 5>/dev/null $SHELL $CONFIG_STATUS $ac_config_status_args || ac_cs_success=false exec 5>>config.log # Use ||, not &&, to avoid exiting from the if with $? = 1, which # would make configure fail if this is the last instruction. 
$ac_cs_success || { (exit 1); exit 1; } fi if test -n "$ac_unrecognized_opts" && test "$enable_option_checking" != no; then { $as_echo "$as_me:$LINENO: WARNING: unrecognized options: $ac_unrecognized_opts" >&5 $as_echo "$as_me: WARNING: unrecognized options: $ac_unrecognized_opts" >&2;} fi papi-papi-7-2-0-t/src/components/host_micpower/configure.ac000066400000000000000000000022641502707512200237700ustar00rootroot00000000000000AC_INIT(host_micpower, version-0.1) AC_PROG_CC AC_ARG_WITH([sysmgmt-include-path], [AS_HELP_STRING([--with-sysmgmt-include-path], [location of the MPSS sysmgmt api headers, defaults to /opt/intel/mic/sysmgmt/sdk/include])], [SYSMGMT_CFLAGS="-I$withval"], [SYSMGMT_CFLAGS="-I/opt/intel/mic/sysmgmt/sdk/include"] ) AC_SUBST([SYSMGMT_CFLAGS]) AC_ARG_WITH([sysmgmt-lib-path], [AS_HELP_STRING([--with-sysmgmt-lib-path], [location of the MPSS sysmgmt libraries, feed to the runtime linker; \ defaults to /opt/intel/mic/sysmgmt/sdk/lib/Linux])], [SYSMGMT_LIBS="-Wl,-rpath,$withval"], [SYSMGMT_LIBS="-Wl,-rpath,/opt/intel/mic/sysmgmt/sdk/lib/Linux"]) AC_SUBST([SYSMGMT_LIBS]) #AC_ARG_WITH([scif-lib-path], # [AS_HELP_STRING([--with-scif-lib-path],[location of the SCIF library, needed by libMicAccessApi.so]), # [], # []) OLD_CPPFLAGS=$CPPFLAGS CPPFLAGS=["-DMICACCESSAPI -DLINUX $SYSMGMT_CFLAGS"] AC_CHECK_HEADERS([MicAccessApi.h], [], AC_MSG_ERROR([Couldn't find MicAccessApi.h...try installing MPSS from \ http://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-mpss]) ) CPPFLAGS=$OLD_CPPFLAGS AC_CONFIG_FILES([Makefile.host_micpower]) AC_OUTPUT papi-papi-7-2-0-t/src/components/host_micpower/linux-host_micpower.c000066400000000000000000000521131502707512200256630ustar00rootroot00000000000000/** linux-host_micpower.c * @author James Ralph * ralph@icl.utk.edu * * @ingroup papi_components * * @brief * This component wraps the MicAccessAPI to provide hostside * power information for attached Intel Xeon Phi (MIC) cards. 
*/ /* From intel examples, see $(mic_dir)/sysmgt/sdk/Examples/Usage */ #define MAX_DEVICES (32) #define EVENTS_PER_DEVICE 10 #include #include #include #include #include "MicAccessTypes.h" #include "MicBasicTypes.h" #include "MicAccessErrorTypes.h" #include "MicAccessApi.h" #include "MicPowerManagerAPI.h" #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" void (*_dl_non_dynamic_init)(void) __attribute__((weak)); /* This is a guess, refine this later */ #define UPDATEFREQ 500000 papi_vector_t _host_micpower_vector; typedef struct host_micpower_register { /** Corresponds to counter slot, indexed from 1, 0 has a special meaning */ unsigned int selector; } host_micpower_register_t; typedef struct host_micpower_reg_alloc { host_micpower_register_t ra_bits; } host_micpower_reg_alloc_t; /** Internal structure used to build the table of events */ typedef struct host_micpower_native_event_entry { host_micpower_register_t resources; char name[PAPI_MAX_STR_LEN]; char description[PAPI_MAX_STR_LEN]; char units[3]; } host_micpower_native_event_entry_t; /** Per-eventset structure used to hold control flags. 
*/ typedef struct host_micpower_control_state { int num_events; int resident[MAX_DEVICES*EVENTS_PER_DEVICE]; long long counts[MAX_DEVICES*EVENTS_PER_DEVICE]; long long lastupdate[MAX_DEVICES]; } host_micpower_control_state_t; /** Per-thread data */ typedef struct host_micpower_context { host_micpower_control_state_t state; } host_micpower_context_t; /* Global state info */ static MicDeviceOnSystem adapters[MAX_DEVICES]; static HANDLE handles[MAX_DEVICES]; static long long lastupdate[MAX_DEVICES]; static HANDLE accessHandle = NULL; static U32 nAdapters = MAX_DEVICES; static void* mic_access = NULL; static void* scif_access = NULL; #undef MICACCESS_API #define MICACCESS_API __attribute__((weak)) const char *MicGetErrorString(U32); U32 MICACCESS_API MicCloseAdapter(HANDLE); U32 MICACCESS_API MicInitAPI(HANDLE *, ETarget, MicDeviceOnSystem *, U32 *); U32 MICACCESS_API MicCloseAPI(HANDLE *); U32 MICACCESS_API MicInitAdapter(HANDLE *, MicDeviceOnSystem *); U32 MICACCESS_API MicGetPowerUsage(HANDLE, MicPwrUsage *); const char *(*MicGetErrorStringPtr)(U32); U32 (*MicCloseAdapterPtr)(HANDLE); U32 (*MicInitAPIPtr)(HANDLE *, ETarget, MicDeviceOnSystem *, U32 *); U32 (*MicCloseAPIPtr)(HANDLE *); U32 (*MicInitAdapterPtr)(HANDLE *, MicDeviceOnSystem *); U32 (*MicGetPowerUsagePtr)(HANDLE, MicPwrUsage *); static host_micpower_native_event_entry_t *native_events_table = NULL; struct powers { int total0; int total1; int inst; int imax; int pcie; int c2x3; int c2x4; int vccp; int vddg; int vddq; }; typedef union { struct powers power; int array[EVENTS_PER_DEVICE]; } power_t; static power_t cached_values[MAX_DEVICES]; static int loadFunctionPtrs() { /* Attempt to guess if we were statically linked to libc, if so bail */ if ( _dl_non_dynamic_init != NULL ) { strncpy(_host_micpower_vector.cmp_info.disabled_reason, "The host_micpower component does not support statically linking of libc.", PAPI_MAX_STR_LEN); return PAPI_ENOSUPP; } /* Need to link in the cuda libraries, if not found 
disable the component */ scif_access = dlopen("libscif.so", RTLD_NOW | RTLD_GLOBAL); if (NULL == scif_access) { snprintf(_host_micpower_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "Problem loading the SCIF library: %s\n", dlerror()); _host_micpower_vector.cmp_info.disabled = 1; return ( PAPI_ENOSUPP ); } mic_access = dlopen("libMicAccessSDK.so", RTLD_NOW | RTLD_GLOBAL); if (NULL == mic_access) { snprintf(_host_micpower_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "Problem loading libMicAccessSDK.so: %s\n", dlerror()); _host_micpower_vector.cmp_info.disabled = 1; return ( PAPI_ENOSUPP ); } MicGetErrorStringPtr = dlsym(mic_access, "MicGetErrorString"); if (dlerror() != NULL) { strncpy(_host_micpower_vector.cmp_info.disabled_reason, "MicAccessSDK function MicGetErrorString not found.",PAPI_MAX_STR_LEN); _host_micpower_vector.cmp_info.disabled = 1; return ( PAPI_ENOSUPP ); } MicCloseAdapterPtr = dlsym(mic_access, "MicCloseAdapter"); if (dlerror() != NULL) { strncpy(_host_micpower_vector.cmp_info.disabled_reason, "MicAccessSDK function MicCloseAdapter not found.",PAPI_MAX_STR_LEN); _host_micpower_vector.cmp_info.disabled = 1; return ( PAPI_ENOSUPP ); } MicInitAPIPtr = dlsym(mic_access, "MicInitAPI"); if (dlerror() != NULL) { strncpy(_host_micpower_vector.cmp_info.disabled_reason, "MicAccessSDK function MicInitAPI not found.",PAPI_MAX_STR_LEN); _host_micpower_vector.cmp_info.disabled = 1; return ( PAPI_ENOSUPP ); } MicCloseAPIPtr = dlsym(mic_access, "MicCloseAPI"); if (dlerror() != NULL) { strncpy(_host_micpower_vector.cmp_info.disabled_reason, "MicAccessSDK function MicCloseAPI not found.",PAPI_MAX_STR_LEN); _host_micpower_vector.cmp_info.disabled = 1; return ( PAPI_ENOSUPP ); } MicInitAdapterPtr = dlsym(mic_access, "MicInitAdapter"); if (dlerror() != NULL) { strncpy(_host_micpower_vector.cmp_info.disabled_reason, "MicAccessSDK function MicInitAdapter not found.",PAPI_MAX_STR_LEN); _host_micpower_vector.cmp_info.disabled = 1; return ( PAPI_ENOSUPP ); } 
MicGetPowerUsagePtr = dlsym(mic_access, "MicGetPowerUsage"); if (dlerror() != NULL) { strncpy(_host_micpower_vector.cmp_info.disabled_reason, "MicAccessSDK function MicGetPowerUsage not found.",PAPI_MAX_STR_LEN); _host_micpower_vector.cmp_info.disabled = 1; return ( PAPI_ENOSUPP ); } return 0; } /* ############################################### * Component Interface code * ############################################### */ int _host_micpower_init_component( int cidx ) { int retval = PAPI_OK; U32 ret = MIC_ACCESS_API_ERROR_UNKNOWN; U32 adapterNum = 0; U32 throwaway = 1; _host_micpower_vector.cmp_info.CmpIdx = cidx; if ( loadFunctionPtrs() ) { retval = PAPI_ENOSUPP; goto fn_fail; } memset( lastupdate, 0x0, sizeof(lastupdate)); memset( cached_values, 0x0, sizeof(struct powers)*MAX_DEVICES ); ret = MicInitAPIPtr( &accessHandle, eTARGET_SCIF_DRIVER, adapters, &nAdapters ); if ( MIC_ACCESS_API_SUCCESS != ret ) { snprintf( _host_micpower_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "Failed to init: %s", MicGetErrorStringPtr(ret)); MicCloseAPIPtr(&accessHandle); retval = PAPI_ENOSUPP; goto fn_fail; } /* Sanity check on array size */ if ( nAdapters >= MAX_DEVICES ) { snprintf(_host_micpower_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "Too many MIC cards [%d] found, bailing.", nAdapters); MicCloseAPIPtr(&accessHandle); retval = PAPI_ENOSUPP; goto fn_fail; } /* XXX: This code initializes a token for each adapter, in testing this appeared to be required/ * One has to call MicInitAdapter() before calling into that adapter's entries */ for (adapterNum=0; adapterNum < nAdapters; adapterNum++) { ret = MicInitAPIPtr( &handles[adapterNum], eTARGET_SCIF_DRIVER, adapters, &throwaway ); throwaway = 1; if (MIC_ACCESS_API_SUCCESS != ret) { fprintf(stderr, "%d:MicInitAPI carps: %s\n", __LINE__, MicGetErrorStringPtr(ret)); nAdapters = adapterNum; for (adapterNum=0; adapterNum < nAdapters; adapterNum++) MicCloseAdapterPtr( handles[adapterNum] ); MicCloseAPIPtr( &accessHandle 
); snprintf(_host_micpower_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "Failed to initialize card %d's interface.", nAdapters); retval = PAPI_ENOSUPP; goto fn_fail; } ret = MicInitAdapterPtr(&handles[adapterNum], &adapters[adapterNum]); if (MIC_ACCESS_API_SUCCESS != ret) { fprintf(stderr, "%d:MicInitAdapter carps: %s\n", __LINE__, MicGetErrorStringPtr(ret)); nAdapters = adapterNum; for (adapterNum=0; adapterNum < nAdapters; adapterNum++) MicCloseAdapterPtr( handles[adapterNum] ); MicCloseAPIPtr( &accessHandle ); snprintf(_host_micpower_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "Failed to initialize card %d's interface.", nAdapters); retval = PAPI_ENOSUPP; goto fn_fail; } } native_events_table = ( host_micpower_native_event_entry_t*)papi_malloc( nAdapters * EVENTS_PER_DEVICE * sizeof(host_micpower_native_event_entry_t)); if ( NULL == native_events_table ) { retval = PAPI_ENOMEM; goto fn_fail; } for (adapterNum=0; adapterNum < nAdapters; adapterNum++) { snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE].name, PAPI_MAX_STR_LEN, "mic%d:tot0", adapterNum); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE].description, PAPI_MAX_STR_LEN, "Total power utilization, Averaged over Time Window 0 (uWatts)"); native_events_table[adapterNum*EVENTS_PER_DEVICE].resources.selector = adapterNum*EVENTS_PER_DEVICE + 1; snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE].units, PAPI_MIN_STR_LEN, "uW"); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 1].name, PAPI_MAX_STR_LEN, "mic%d:tot1", adapterNum); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 1].description, PAPI_MAX_STR_LEN, "Total power utilization, Averaged over Time Window 1 (uWatts)"); native_events_table[adapterNum*EVENTS_PER_DEVICE + 1].resources.selector = adapterNum*EVENTS_PER_DEVICE + 2; snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 1].units, PAPI_MIN_STR_LEN, "uW"); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 2].name, 
PAPI_MAX_STR_LEN, "mic%d:pcie", adapterNum); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 2].description, PAPI_MAX_STR_LEN, "PCI-E connector power (uWatts)"); native_events_table[adapterNum*EVENTS_PER_DEVICE + 2].resources.selector = adapterNum*EVENTS_PER_DEVICE + 3; snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 2].units, PAPI_MIN_STR_LEN, "uW"); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 3].name, PAPI_MAX_STR_LEN, "mic%d:inst", adapterNum); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 3].description, PAPI_MAX_STR_LEN, "Instantaneous power (uWatts)"); native_events_table[adapterNum*EVENTS_PER_DEVICE + 3].resources.selector = adapterNum*EVENTS_PER_DEVICE + 4; snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 3].units, PAPI_MIN_STR_LEN, "uW"); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 4].name, PAPI_MAX_STR_LEN, "mic%d:imax", adapterNum); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 4].description, PAPI_MAX_STR_LEN, "Max instantaneous power (uWatts)"); native_events_table[adapterNum*EVENTS_PER_DEVICE + 4].resources.selector = adapterNum*EVENTS_PER_DEVICE + 5; snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 4].units, PAPI_MIN_STR_LEN, "uW"); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 5].name, PAPI_MAX_STR_LEN, "mic%d:c2x3", adapterNum); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 5].description, PAPI_MAX_STR_LEN, "2x3 connector power (uWatts)"); native_events_table[adapterNum*EVENTS_PER_DEVICE + 5].resources.selector = adapterNum*EVENTS_PER_DEVICE + 6; snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 5].units, PAPI_MIN_STR_LEN, "uW"); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 6].name, PAPI_MAX_STR_LEN, "mic%d:c2x4", adapterNum); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 6].description, PAPI_MAX_STR_LEN, "2x4 connector power (uWatts)"); 
native_events_table[adapterNum*EVENTS_PER_DEVICE + 6].resources.selector = adapterNum*EVENTS_PER_DEVICE + 7; snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 6].units, PAPI_MIN_STR_LEN, "uW"); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 7].name, PAPI_MAX_STR_LEN, "mic%d:vccp", adapterNum); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 7].description, PAPI_MAX_STR_LEN, "Core rail (uVolts)"); native_events_table[adapterNum*EVENTS_PER_DEVICE + 7].resources.selector = adapterNum*EVENTS_PER_DEVICE + 8; snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 7].units, PAPI_MIN_STR_LEN, "uV"); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 8].name, PAPI_MAX_STR_LEN, "mic%d:vddg", adapterNum); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 8].description, PAPI_MAX_STR_LEN, "Uncore rail (uVolts)"); native_events_table[adapterNum*EVENTS_PER_DEVICE + 8].resources.selector = adapterNum*EVENTS_PER_DEVICE + 9; snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 8].units, PAPI_MIN_STR_LEN, "uV"); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 9].name, PAPI_MAX_STR_LEN, "mic%d:vddq", adapterNum); snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 9].description, PAPI_MAX_STR_LEN, "Memory subsystem rail (uVolts)"); native_events_table[adapterNum*EVENTS_PER_DEVICE + 9].resources.selector = adapterNum*EVENTS_PER_DEVICE + 10; snprintf(native_events_table[adapterNum*EVENTS_PER_DEVICE + 9].units, PAPI_MIN_STR_LEN, "uV"); } fn_exit: _host_micpower_vector.cmp_info.num_cntrs = EVENTS_PER_DEVICE*nAdapters; _host_micpower_vector.cmp_info.num_mpx_cntrs = EVENTS_PER_DEVICE*nAdapters; _host_micpower_vector.cmp_info.num_native_events = EVENTS_PER_DEVICE*nAdapters; _papi_hwd[cidx]->cmp_info.disabled = retval; return retval; fn_fail: nAdapters = 0; goto fn_exit; } int _host_micpower_init_thread( hwd_context_t *ctx) { (void)ctx; return PAPI_OK; } int _host_micpower_shutdown_component( void ) { U32 i 
= 0; for( i=0; iresident[i] = 0; for (i=0; i < count; i++) { index = info[i].ni_event&PAPI_NATIVE_AND_MASK; info[i].ni_position=native_events_table[index].resources.selector-1; state->resident[index] = 1; } state->num_events = count; return PAPI_OK; } int _host_micpower_start( hwd_context_t *ctx, hwd_control_state_t *ctl ) { (void) ctx; (void) ctl; return PAPI_OK; } static int read_power( struct powers *pwr, int which_one ) { MicPwrUsage power; U32 ret = MIC_ACCESS_API_ERROR_UNKNOWN; if ( which_one < 0 || which_one > (int)nAdapters ) return PAPI_ENOEVNT; ret = MicGetPowerUsagePtr(handles[which_one], &power); if (MIC_ACCESS_API_SUCCESS != ret) { fprintf(stderr,"Oops MicGetPowerUsage failed: %s\n", MicGetErrorStringPtr(ret)); return PAPI_ECMP; } pwr->total0 = power.total0.prr; pwr->total1 = power.total1.prr; pwr->inst = power.inst.prr; pwr->imax = power.imax.prr; pwr->pcie = power.pcie.prr; pwr->c2x3 = power.c2x3.prr; pwr->c2x4 = power.c2x4.prr; pwr->vccp = power.vccp.pwr; pwr->vddg = power.vddg.pwr; pwr->vddq = power.vddq.pwr; return PAPI_OK; } int _host_micpower_read( hwd_context_t *ctx, hwd_control_state_t *ctl, long long **events, int flags) { (void)flags; (void)events; (void)ctx; unsigned int i,j; int needs_update = 0; host_micpower_control_state_t* control = (host_micpower_control_state_t*)ctl; long long now = PAPI_get_real_usec(); for( i=0; iresident[EVENTS_PER_DEVICE*i+j]) { needs_update = 1; break; } } if ( needs_update ) { /* Do the global update */ if ( now >= lastupdate[i] + UPDATEFREQ) { read_power( &cached_values[i].power, i ); lastupdate[i] = now; } /* update from cached values */ if ( control->lastupdate[i] < lastupdate[i]) { control->lastupdate[i] = lastupdate[i]; } for (j=0; jresident[EVENTS_PER_DEVICE*i+j] ) { control->counts[EVENTS_PER_DEVICE*i+j] = (long long)cached_values[i].array[j]; } } } } *events = control->counts; return PAPI_OK; } int _host_micpower_stop( hwd_context_t *ctx, hwd_control_state_t *ctl ) { (void)ctx; int needs_update = 0; 
unsigned int i,j; host_micpower_control_state_t* control = (host_micpower_control_state_t*)ctl; long long now = PAPI_get_real_usec(); for( i=0; iresident[EVENTS_PER_DEVICE*i+j]) { needs_update = 1; break; } } if ( needs_update ) { /* Do the global update */ if ( now >= lastupdate[i] + UPDATEFREQ) { read_power( &cached_values[i].power, i ); lastupdate[i] = now; } /* update from cached values */ if ( control->lastupdate[i] < lastupdate[i]) { control->lastupdate[i] = lastupdate[i]; } for (j=0; jresident[EVENTS_PER_DEVICE*i+j] ) { control->counts[EVENTS_PER_DEVICE*i+j] = (long long)cached_values[i].array[j]; } } } } return PAPI_OK; } int _host_micpower_ntv_enum_events( unsigned int *EventCode, int modifier ) { int index; switch (modifier) { case PAPI_ENUM_FIRST: if (0 == _host_micpower_vector.cmp_info.num_cntrs) return PAPI_ENOEVNT; *EventCode = 0; return PAPI_OK; case PAPI_ENUM_EVENTS: index = *EventCode; if ( index < _host_micpower_vector.cmp_info.num_cntrs - 1) { *EventCode = *EventCode + 1; return PAPI_OK; } else { return PAPI_ENOEVNT; } break; default: return PAPI_EINVAL; } return PAPI_EINVAL; } int _host_micpower_ntv_code_to_name( unsigned int EventCode, char *name, int len ) { unsigned int code = EventCode & PAPI_NATIVE_AND_MASK; if ( code < _host_micpower_vector.cmp_info.num_cntrs ) { strncpy( name, native_events_table[code].name, len); return PAPI_OK; } return PAPI_ENOEVNT; } int _host_micpower_ntv_code_to_descr( unsigned int EventCode, char *name, int len ) { unsigned int code = EventCode & PAPI_NATIVE_AND_MASK; if ( code < _host_micpower_vector.cmp_info.num_cntrs ) { strncpy( name, native_events_table[code].description, len ); return PAPI_OK; } return PAPI_ENOEVNT; } int _host_micpower_ntv_code_to_info( unsigned int EventCode, PAPI_event_info_t *info) { unsigned int code = EventCode & PAPI_NATIVE_AND_MASK; if ( code >= _host_micpower_vector.cmp_info.num_cntrs) return PAPI_ENOEVNT; strncpy( info->symbol, native_events_table[code].name, sizeof(info->symbol) ); 
strncpy( info->long_descr, native_events_table[code].description, sizeof(info->long_descr) ); strncpy( info->units, native_events_table[code].units, sizeof(info->units) ); return PAPI_OK; } int _host_micpower_ctl( hwd_context_t* ctx, int code, _papi_int_option_t *option) { (void)ctx; (void)code; (void)option; return PAPI_OK; } int _host_micpower_set_domain( hwd_control_state_t* ctl, int domain) { (void)ctl; if ( PAPI_DOM_ALL != domain ) return PAPI_EINVAL; return PAPI_OK; } papi_vector_t _host_micpower_vector = { .cmp_info = { .name = "host_micpower", .short_name = "host_micpower", .description = "A host-side component to read power usage on MIC guest cards.", .version = "0.1", .support_version = "n/a", .kernel_version = "n/a", .num_cntrs = 0, .num_mpx_cntrs = 0, .default_domain = PAPI_DOM_ALL, .available_domains = PAPI_DOM_ALL, .default_granularity = PAPI_GRN_SYS, .available_granularities = PAPI_GRN_SYS, .hardware_intr_sig = PAPI_INT_SIGNAL, }, .size = { .context = sizeof(host_micpower_context_t), .control_state = sizeof(host_micpower_control_state_t), .reg_value = sizeof(host_micpower_register_t), .reg_alloc = sizeof(host_micpower_reg_alloc_t), }, .start = _host_micpower_start, .stop = _host_micpower_start, .read = _host_micpower_read, .reset = NULL, .write = NULL, .init_component = _host_micpower_init_component, .init_thread = _host_micpower_init_thread, .init_control_state = _host_micpower_init_control_state, .update_control_state = _host_micpower_update_control_state, .ctl = _host_micpower_ctl, .shutdown_thread = _host_micpower_shutdown_thread, .shutdown_component = _host_micpower_shutdown_component, .set_domain = _host_micpower_set_domain, .ntv_enum_events = _host_micpower_ntv_enum_events, .ntv_code_to_name = _host_micpower_ntv_code_to_name, .ntv_code_to_descr = _host_micpower_ntv_code_to_descr, .ntv_code_to_info = _host_micpower_ntv_code_to_info, }; 
papi-papi-7-2-0-t/src/components/host_micpower/tests/000077500000000000000000000000001502707512200226405ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/host_micpower/tests/Makefile000066400000000000000000000006371502707512200243060ustar00rootroot00000000000000NAME=host_micpower include ../../Makefile_comp_tests.target %.o:%.c $(CC) $(CFLAGS) $(OPTFLAGS) $(INCLUDE) -c -o $@ $< TESTS = host_micpower_basic host_micpower_tests : $(TESTS) micpower_tests: $(TESTS) host_micpower_basic: host_micpower_basic.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o host_micpower_basic host_micpower_basic.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) clean: rm -f $(TESTS) *.o papi-papi-7-2-0-t/src/components/host_micpower/tests/host_micpower_basic.c000066400000000000000000000056321502707512200270350ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @author Vince Weaver * * test case for micpower component * Based on coretemp test code by Vince Weaver * * * @brief * Tests basic component functionality */ #include #include #include #include "papi.h" #include "papi_test.h" #define NUM_EVENTS 1 int main (int argc, char **argv) { int retval,cid,numcmp; int EventSet = PAPI_NULL; long long values[NUM_EVENTS]; int code; char event_name[PAPI_MAX_STR_LEN]; int total_events=0; int r; const PAPI_component_info_t *cmpinfo = NULL; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); /* PAPI Initialization */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__,"PAPI_library_init failed\n",retval); } numcmp = PAPI_num_components(); for(cid=0; cidname); } if ( 0 != strncmp(cmpinfo->name,"host_micpower",13)) { continue; } code = PAPI_NATIVE_MASK; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_FIRST, cid ); while ( r == PAPI_OK ) { retval = PAPI_event_code_to_name( code, event_name ); if ( retval != PAPI_OK ) { printf("Error translating %#x\n",code); 
test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } if (!TESTS_QUIET) printf("%#x %s ",code,event_name); EventSet = PAPI_NULL; retval = PAPI_create_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset()",retval); } retval = PAPI_add_event( EventSet, code ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_add_event()",retval); } retval = PAPI_start( EventSet); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start()",retval); } retval = PAPI_stop( EventSet, values); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_stop()",retval); } if (!TESTS_QUIET) printf(" value: %lld\n",values[0]); retval = PAPI_cleanup_eventset( EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset()",retval); } retval = PAPI_destroy_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset()",retval); } total_events++; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_EVENTS, cid ); } } if (total_events==0) { test_skip(__FILE__,__LINE__,"No events from host_micpower found",0); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/components/host_micpower/utils/000077500000000000000000000000001502707512200226365ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/host_micpower/utils/Makefile000066400000000000000000000006311502707512200242760ustar00rootroot00000000000000CC = gcc CFLAGS = -O2 -Wall LFLAGS = PAPI_INCLUDE = ../../.. 
PAPI_LIBRARY = ../../../libpapi.a all: host_micpower_plot host_micpower_plot: host_micpower_plot.o $(CC) $(LFLAGS) -o host_micpower_plot host_micpower_plot.o $(PAPI_LIBRARY) -ldl -lpthread host_micpower_plot.o: host_micpower_plot.c $(CC) $(CFLAGS) -I$(PAPI_INCLUDE) -c host_micpower_plot.c clean: rm -f *~ *.o host_micpower_plot results.* papi-papi-7-2-0-t/src/components/host_micpower/utils/README000066400000000000000000000015061502707512200235200ustar00rootroot00000000000000This tool can be used to gather Power (and Voltage) measurements on Intel Xeon Phi (aka Intel MIC) chips using the MicAccessAPI. Be sure to configure the PAPI host_micpower component: $ cd "/src/components/host_micpower" $ ./configure as well as PAPI with --with-components: $ cd "/src" $ ./configure --with-components=host_micpower It works by using PAPI to poll the MIC power stats every 100ms. It will dump each statistic to different files, which then can be plotted. The measurements (in uW and uV) are dumped every 100ms. You can adjust the frequency by changing the source code. You can then take those files and put them into your favorite plotting program. You might want to edit the source to remove the extra commentary from the data, the plotting program I use ignores things surrounded by (* brackets. 
papi-papi-7-2-0-t/src/components/host_micpower/utils/host_micpower_plot.c000066400000000000000000000105201502707512200267200ustar00rootroot00000000000000/** * @author Vince Weaver, Heike McCraw */ #include #include #include #include #include "papi.h" #define MAX_DEVICES (32) #define EVENTS_PER_DEVICE 10 #define MAX_EVENTS (MAX_DEVICES*EVENTS_PER_DEVICE) char events[MAX_EVENTS][BUFSIZ]; char filenames[MAX_EVENTS][BUFSIZ]; FILE *fff[MAX_EVENTS]; static int num_events=0; int main (int argc, char **argv) { int retval,cid,host_micpower_cid=-1,numcmp; int EventSet = PAPI_NULL; long long values[MAX_EVENTS]; int i,code,enum_retval; const PAPI_component_info_t *cmpinfo = NULL; long long start_time,before_time,after_time; double elapsed_time,total_time; double energy = 0.0; char event_name[BUFSIZ]; /* PAPI Initialization */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { fprintf(stderr,"PAPI_library_init failed\n"); exit(1); } numcmp = PAPI_num_components(); for(cid=0; cidname,"host_micpower")) { host_micpower_cid=cid; printf("Found host_micpower component at cid %d\n", host_micpower_cid); if (cmpinfo->disabled) { fprintf(stderr,"No host_micpower events found: %s\n", cmpinfo->disabled_reason); exit(1); } break; } } /* Component not found */ if (cid==numcmp) { fprintf(stderr,"No host_micpower component found\n"); exit(1); } /* Find Events */ code = PAPI_NATIVE_MASK; enum_retval = PAPI_enum_cmp_event( &code, PAPI_ENUM_FIRST, cid ); while ( enum_retval == PAPI_OK ) { retval = PAPI_event_code_to_name( code, event_name ); if ( retval != PAPI_OK ) { printf("Error translating %#x\n",code); exit(1); } printf("Found: %s\n",event_name); strncpy(events[num_events],event_name,BUFSIZ); sprintf(filenames[num_events],"results.%s",event_name); num_events++; if (num_events==MAX_EVENTS) { printf("Too many events! %d\n",num_events); exit(1); } enum_retval = PAPI_enum_cmp_event( &code, PAPI_ENUM_EVENTS, cid ); } if (num_events==0) { printf("Error! 
No host_micpower events found!\n"); exit(1); } /* Open output files */ for(i=0;i #include #include #include #include #include #include #include #include #include "pscanf.h" /* Headers required by PAPI */ #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" /************************* DEFINES SECTION *********************************** *******************************************************************************/ /* this number assumes that there will never be more events than indicated */ #define INFINIBAND_MAX_COUNTERS 256 /** Structure that stores private information of each event */ typedef struct infiniband_register { /* This is used by the framework.It likes it to be !=0 to do somehting */ unsigned int selector; } infiniband_register_t; /* * The following structures mimic the ones used by other components. It is more * convenient to use them like that as programming with PAPI makes specific * assumptions for them. */ typedef struct _ib_device_type { char* dev_name; int dev_port; struct _ib_device_type *next; } ib_device_t; typedef struct _ib_counter_type { char* ev_name; char* ev_file_name; ib_device_t* ev_device; int extended; // if this is an extended (64-bit) counter struct _ib_counter_type *next; } ib_counter_t; static const char *ib_dir_path = "/sys/class/infiniband"; /** This structure is used to build the table of events */ typedef struct _infiniband_native_event_entry { infiniband_register_t resources; char *name; char *description; char* file_name; ib_device_t* device; int extended; /* if this is an extended (64-bit) counter */ } infiniband_native_event_entry_t; typedef struct _infiniband_control_state { long long counts[INFINIBAND_MAX_COUNTERS]; int being_measured[INFINIBAND_MAX_COUNTERS]; /* all IB counters need difference, but use a flag for generality */ int need_difference[INFINIBAND_MAX_COUNTERS]; long long lastupdate; } infiniband_control_state_t; typedef struct _infiniband_context { 
infiniband_control_state_t state; long long start_value[INFINIBAND_MAX_COUNTERS]; } infiniband_context_t; /************************* GLOBALS SECTION *********************************** *******************************************************************************/ /* This table contains the component native events */ static infiniband_native_event_entry_t *infiniband_native_events = 0; /* number of events in the table*/ static int num_events = 0; papi_vector_t _infiniband_vector; /****************************************************************************** ******** BEGIN FUNCTIONS USED INTERNALLY SPECIFIC TO THIS COMPONENT ******** *****************************************************************************/ static ib_device_t *root_device = 0; static ib_counter_t *root_counter = 0; static char* make_ib_event_description(const char* input_str, int extended) { int i, len; char *desc = 0; if (! input_str) return (0); desc = (char*) papi_calloc(PAPI_MAX_STR_LEN, 1); if (desc == 0) { PAPIERROR("cannot allocate memory for event description"); return (0); } len = strlen(input_str); // append additional info to counter description int bits = 32; if (extended) bits = 64; int ret; // list of event descriptions if (strstr(input_str, "rx_atomic_requests")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Number of received ATOMIC requests for the associated Queue Pairs", bits); } else if (strstr(input_str, "out_of_buffer")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Number of drops which occurred due to lack of Work Queue Entry for the associated Queue Pairs", bits); } else if (strstr(input_str, "out_of_sequence")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Number of out of sequence packets received", bits); } else if (strstr(input_str, "lifespan")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Maximum sampling period of the counters in milliseconds", bits); } else if (strstr(input_str, "rx_read_requests")) { ret = 
snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Number of received READ requests for the associated Queue Pairs", bits); } else if (strstr(input_str, "rx_write_requests")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Number of received WRITE requests for the associated Queue Pairs", bits); } else if (strstr(input_str, "port_rcv_data")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Total number of data octets, divided by 4 (lanes), received on all Virtual Lanes. " "Multiply this by 4 to get bytes", bits); } else if (strstr(input_str, "port_rcv_packets")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Total number of packets received on all Virtual Lanes from this port, including packets containing errors", bits); } else if (strstr(input_str, "port_multicast_rcv_packets") || strstr(input_str, "multicast_rcv_packets")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Total number of multicast packets, including multicast packets containing errors", bits); } else if (strstr(input_str, "port_unicast_rcv_packets") || strstr(input_str, "unicast_rcv_packets")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Total number of unicast packets, including unicast packets containing errors", bits); } else if (strstr(input_str, "port_xmit_data")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Total number of data octets, divided by 4 (lanes), transmitted on all Virtual Lanes. 
" "Multiply this by 4 to get bytes", bits); } else if (strstr(input_str, "port_xmit_packets")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Total number of packets transmitted on all Virtual Lanes from this port, including packets containing errors", bits); } else if (strstr(input_str, "port_rcv_switch_relay_errors")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Total number of packets received on port that were discarded" " because they could not be forwarded by switch relay", bits); } else if (strstr(input_str, "port_rcv_errors")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Total number of packets containing an error that were received on the port", bits); } else if (strstr(input_str, "port_rcv_constraint_errors")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Total number of packets received on the switch physical port that are discarded", bits); } else if (strstr(input_str, "local_link_integrity_errors")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Number of times the count of local physical errors exceeded threshold", bits); } else if (strstr(input_str, "port_xmit_wait")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Number of ticks during which port had data to transmit but no data was sent during the entire tick", bits); } else if (strstr(input_str, "port_multicast_xmit_packets") || strstr(input_str, "multicast_xmit_packets")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Total number of multicast packets transmitted on all VLs from port," " including multicast packets containing errors", bits); } else if (strstr(input_str, "port_unicast_xmit_packets") || strstr(input_str, "unicast_xmit_packets")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Total number of unicast packets transmitted on all VLs from port," " including unicast packets containing errors", bits); } else if (strstr(input_str, "port_xmit_discards")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s 
(%d-bit).", "Total number of outbound packets discarded by the port because it is down or congested", bits); } else if (strstr(input_str, "port_xmit_constraint_errors")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Total number of packets not transmitted from the switch physical port", bits); } else if (strstr(input_str, "port_rcv_remote_physical_errors")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Total number of packets marked with EBP (End of Bad Packet) delimiter received on the port", bits); } else if (strstr(input_str, "symbol_error")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Total number of minor link errors detected on one or more physical lanes", bits); } else if (strstr(input_str, "VL15_dropped")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Number of incoming VL15 packets (can include management packets) dropped due to resource limitations of the port", bits); } else if (strstr(input_str, "link_error_recovery")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Total number of times the Port Training state machine has successfully completed the link error recovery process", bits); } else if (strstr(input_str, "link_downed")) { ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%d-bit).", "Total number of times the Port Training state machine has failed link error recovery process and downed the link", bits); } else { // default event description: re-format the filename ret = snprintf(desc, PAPI_MAX_STR_LEN, "%s (%s).", input_str, (extended ? 
"free-running 64bit counter" : "overflowing, auto-resetting counter")); desc[0] = toupper(desc[0]); for (i=0 ; idev_name = papi_strdup(name); new_dev->dev_port = port; if (new_dev->dev_name==0) { PAPIERROR("cannot allocate memory for device internal fields"); papi_free(new_dev->dev_name); papi_free(new_dev); return (0); } // prepend the new device to the device list new_dev->next = root_device; root_device = new_dev; return (new_dev); } static ib_counter_t* add_ib_counter(const char* name, const char* file_name, int extended, ib_device_t *device) { ib_counter_t *new_cnt = (ib_counter_t*) papi_calloc(sizeof(ib_counter_t), 1); if (new_cnt == 0) { PAPIERROR("cannot allocate memory for new IB counter structure"); return (0); } new_cnt->ev_name = papi_strdup(name); new_cnt->ev_file_name = papi_strdup(file_name); new_cnt->extended = extended; new_cnt->ev_device = device; if (new_cnt->ev_name==0 || new_cnt->ev_file_name==0) { PAPIERROR("cannot allocate memory for counter internal fields"); /* free(NULL) is allowed and means no operation is performed */ papi_free(new_cnt->ev_name); papi_free(new_cnt->ev_file_name); papi_free(new_cnt); return (0); } // prepend the new counter to the counter list new_cnt->next = root_counter; root_counter = new_cnt; return (new_cnt); } static int find_ib_device_events(ib_device_t *dev, int extended) { int nevents = 0; DIR *cnt_dir = NULL; char counters_path[128]; if ( extended ) { /* mofed driver version <4.0 */ snprintf(counters_path, sizeof(counters_path), "%s/%s/ports/%d/counters_ext", ib_dir_path, dev->dev_name, dev->dev_port); cnt_dir = opendir(counters_path); if (cnt_dir == NULL) { /* directory counters_ext in sysfs fs has changed to hw_counters */ /* in 4.0 version of mofed driver */ SUBDBG("cannot open counters directory `%s'\n", counters_path); snprintf(counters_path, sizeof(counters_path), "%s/%s/ports/%d/%scounters", ib_dir_path, dev->dev_name, dev->dev_port, "hw_"); cnt_dir = opendir(counters_path); } } else { 
snprintf(counters_path, sizeof(counters_path), "%s/%s/ports/%d/counters", ib_dir_path, dev->dev_name, dev->dev_port); cnt_dir = opendir(counters_path); } if (cnt_dir == NULL) { SUBDBG("cannot open counters directory `%s'\n", counters_path); goto out; } struct dirent *ev_ent; /* iterate over all the events */ while ((ev_ent = readdir(cnt_dir)) != NULL) { char *ev_name = ev_ent->d_name; long long value = -1; char event_path[FILENAME_MAX]; char counter_name[PAPI_HUGE_STR_LEN]; if (ev_name[0] == '.') continue; /* Check that we can read an integer from the counter file */ snprintf(event_path, sizeof(event_path), "%s/%s", counters_path, ev_name); if (pscanf(event_path, "%lld", &value) != 1) { SUBDBG("cannot read value for event '%s'\n", ev_name); continue; } /* Adding 3 exceptions to the counter size * https://community.mellanox.com/s/article/understanding-mlx5-linux-counters-and-status-parameters * port_rcv_data, port_rcv_packets, and port_xmit_data are listed as 64 bits counters, * located in the 32 bits directory */ int fixed_extended = extended; if ( !strcmp("port_rcv_data", ev_name) || !strcmp("port_rcv_packets", ev_name) || !strcmp("port_xmit_data", ev_name)) fixed_extended = 3-extended; // higher bit, location is wrong, mark it, and flip the size, 2 + (1-extended) /* Create new counter */ snprintf(counter_name, sizeof(counter_name), "%s_%d%s:%s", dev->dev_name, dev->dev_port, (fixed_extended?"_ext":""), ev_name); if (add_ib_counter(counter_name, ev_name, fixed_extended, dev)) { SUBDBG("Added new counter `%s'\n", counter_name); nevents += 1; } } out: if (cnt_dir != NULL) closedir(cnt_dir); return (nevents); } static int find_ib_devices() { DIR *ib_dir = NULL; int result = PAPI_OK; num_events = 0; ib_dir = opendir(ib_dir_path); if (ib_dir == NULL) { SUBDBG("cannot open `%s'\n", ib_dir_path); strncpy(_infiniband_vector.cmp_info.disabled_reason, "Infiniband sysfs interface not found", PAPI_MAX_STR_LEN); result = PAPI_ENOSUPP; goto out; } struct dirent *hca_ent; while 
((hca_ent = readdir(ib_dir)) != NULL) { char *hca = hca_ent->d_name; char ports_path[FILENAME_MAX]; DIR *ports_dir = NULL; if (hca[0] == '.') goto next_hca; snprintf(ports_path, sizeof(ports_path), "%s/%s/ports", ib_dir_path, hca); ports_dir = opendir(ports_path); if (ports_dir == NULL) { SUBDBG("cannot open `%s'\n", ports_path); goto next_hca; } struct dirent *port_ent; while ((port_ent = readdir(ports_dir)) != NULL) { int port = atoi(port_ent->d_name); if (port <= 0) continue; /* Check that port is active. .../HCA/ports/PORT/state should read "4: ACTIVE." */ int state = -1; char state_path[FILENAME_MAX]; snprintf(state_path, sizeof(state_path), "%s/%s/ports/%d/state", ib_dir_path, hca, port); if (pscanf(state_path, "%d", &state) != 1) { SUBDBG("cannot read state of IB HCA `%s' port %d\n", hca, port); continue; } if (state != 4) { SUBDBG("skipping inactive IB HCA `%s', port %d, state %d\n", hca, port, state); continue; } /* Create dev name (HCA/PORT) and get stats for dev. */ SUBDBG("Found IB device `%s', port %d\n", hca, port); ib_device_t *dev = add_ib_device(hca, port); if (!dev) continue; // do we want to check for short counters only if no extended counters found? num_events += find_ib_device_events(dev, 1); // check if we have extended (64bit) counters num_events += find_ib_device_events(dev, 0); // check also for short counters } next_hca: if (ports_dir != NULL) closedir(ports_dir); } if (root_device == 0) // no active devices found { strncpy(_infiniband_vector.cmp_info.disabled_reason, "No active Infiniband ports found", PAPI_MAX_STR_LEN); result = PAPI_ENOIMPL; } else if (num_events == 0) { strncpy(_infiniband_vector.cmp_info.disabled_reason, "No supported Infiniband events found", PAPI_MAX_STR_LEN); result = PAPI_ENOIMPL; } else { // Events are stored in a linked list, in reverse order than how I found them // Revert them again, so that they are in finding order, not that it matters. 
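The discovery code above prepends each counter to a singly linked list (cheap O(1) insert) and then, as the comment notes, copies the list into the native-event array back-to-front so the array ends up in discovery order. A minimal standalone sketch of that prepend-then-backfill pattern, with illustrative names rather than PAPI's:

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative sketch: items are prepended during discovery, so the list
 * holds them in reverse; filling the array from the back restores
 * discovery order while freeing each node. Names are hypothetical. */
typedef struct node { int value; struct node *next; } node_t;

static node_t *prepend(node_t *head, int value)
{
    node_t *n = malloc(sizeof(*n));
    if (!n) return head;
    n->value = value;
    n->next = head;   /* newest element becomes the list head */
    return n;
}

/* Drain the list into out[0..count-1], restoring discovery order. */
static void list_to_array(node_t *head, int *out, int count)
{
    int i = count - 1;
    while (head) {
        node_t *tmp = head;
        out[i--] = head->value;   /* fill backwards */
        head = head->next;
        free(tmp);
    }
}
```

This mirrors how `find_ib_devices()` walks `root_counter` while decrementing an index into `infiniband_native_events`, freeing each `ib_counter_t` as it goes.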
int i = num_events - 1; // now allocate memory to store the counters into the native table infiniband_native_events = (infiniband_native_event_entry_t*) papi_calloc(num_events, sizeof(infiniband_native_event_entry_t)); ib_counter_t *iter = root_counter; while (iter != 0) { infiniband_native_events[i].name = iter->ev_name; infiniband_native_events[i].file_name = iter->ev_file_name; infiniband_native_events[i].device = iter->ev_device; infiniband_native_events[i].extended = iter->extended; infiniband_native_events[i].resources.selector = i + 1; infiniband_native_events[i].description = make_ib_event_description(iter->ev_file_name, iter->extended); ib_counter_t *tmp = iter; iter = iter->next; papi_free(tmp); -- i; } root_counter = 0; } out: if (ib_dir != NULL) closedir(ib_dir); return (result); } static long long read_ib_counter_value(int index) { char ev_file[FILENAME_MAX]; char counters_path[FILENAME_MAX]; DIR *cnt_dir = NULL; long long value = 0ll; infiniband_native_event_entry_t *iter = &infiniband_native_events[index]; if ( iter->extended == 1 || iter->extended == 2 ) { /* extended == 1, counter is 32b, in the 32b location || 2, counter is 64b in the 64b dir */ /* mofed driver version <4.0 */ snprintf(counters_path, sizeof(counters_path), "%s/%s/ports/%d/counters%s", ib_dir_path, iter->device->dev_name, iter->device->dev_port, "_ext"); cnt_dir = opendir(counters_path); if (cnt_dir == NULL) { /* directory counters_ext in sysfs fs has changed to hw_counters */ /* in 4.0 version of mofed driver */ snprintf(counters_path, sizeof(counters_path), "%s/%s/ports/%d/%scounters", ib_dir_path, iter->device->dev_name, iter->device->dev_port, "hw_"); cnt_dir = opendir(counters_path); } } else { /* extended == 0, counter is 32b, in the 64b location || 3, counter is 64b in the 32b dir */ snprintf(counters_path, sizeof(counters_path), "%s/%s/ports/%d/counters", ib_dir_path, iter->device->dev_name, iter->device->dev_port ); cnt_dir = opendir(counters_path); } if (cnt_dir != NULL) 
closedir(cnt_dir); snprintf(ev_file, strlen(counters_path) + strlen(iter->file_name) + 2, "%s/%s", counters_path, iter->file_name); if (pscanf(ev_file, "%lld", &value) != 1) { PAPIERROR("cannot read value for counter '%s'\n", iter->name); } else { SUBDBG("Counter '%s': %lld\n", iter->name, value); } return (value); } static void deallocate_infiniband_resources() { int i; if (infiniband_native_events) { for (i=0 ; idev_name) papi_free(iter->dev_name); ib_device_t *tmp = iter; iter = iter->next; papi_free(tmp); } root_device = 0; } /***************************************************************************** ******************* BEGIN PAPI's COMPONENT REQUIRED FUNCTIONS ************* *****************************************************************************/ /* * This is called whenever a thread is initialized */ static int _infiniband_init_thread( hwd_context_t *ctx ) { (void) ctx; return PAPI_OK; } /* Initialize hardware counters, setup the function vector table * and get hardware information, this routine is called when the * PAPI process is initialized (IE PAPI_library_init) */ static int _infiniband_init_component( int cidx ) { /* discover Infiniband devices and available events */ int result = find_ib_devices(); if (result != PAPI_OK) // we couldn't initialize the component { // deallocate any eventually allocated memory deallocate_infiniband_resources(); } _infiniband_vector.cmp_info.num_native_events = num_events; _infiniband_vector.cmp_info.num_cntrs = num_events; _infiniband_vector.cmp_info.num_mpx_cntrs = num_events; /* Export the component id */ _infiniband_vector.cmp_info.CmpIdx = cidx; _papi_hwd[cidx]->cmp_info.disabled = result; return (result); } /* * Control of counters (Reading/Writing/Starting/Stopping/Setup) * functions */ static int _infiniband_init_control_state( hwd_control_state_t *ctl ) { infiniband_control_state_t* control = (infiniband_control_state_t*) ctl; int i; for (i=0 ; ibeing_measured[i] = 0; } return PAPI_OK; } /* * */ static int 
_infiniband_start( hwd_context_t *ctx, hwd_control_state_t *ctl ) { infiniband_context_t* context = (infiniband_context_t*) ctx; infiniband_control_state_t* control = (infiniband_control_state_t*) ctl; long long now = PAPI_get_real_usec(); int i; for (i=0 ; ibeing_measured[i] && control->need_difference[i]) { context->start_value[i] = read_ib_counter_value(i); } } control->lastupdate = now; return PAPI_OK; } /* * */ static int _infiniband_stop( hwd_context_t *ctx, hwd_control_state_t *ctl ) { infiniband_context_t* context = (infiniband_context_t*) ctx; infiniband_control_state_t* control = (infiniband_control_state_t*) ctl; long long now = PAPI_get_real_usec(); int i; long long temp; for (i=0 ; ibeing_measured[i]) { temp = read_ib_counter_value(i); if (context->start_value[i] && control->need_difference[i]) { /* Must subtract values, but check for wraparound. * We cannot even detect all wraparound cases. Using the short, * auto-resetting IB counters is error prone. */ if (temp < context->start_value[i]) { SUBDBG("Wraparound!\nstart:\t%#016x\ttemp:\t%#016x", (unsigned)context->start_value[i], (unsigned)temp); /* The counters auto-reset. I cannot even adjust them to * account for a simple wraparound. * Just use the current reading of the counter, which is useless. 
*/ } else temp -= context->start_value[i]; } control->counts[i] = temp; } } control->lastupdate = now; return PAPI_OK; } /* * */ static int _infiniband_read( hwd_context_t *ctx, hwd_control_state_t *ctl, long_long ** events, int flags ) { ( void ) flags; _infiniband_stop(ctx, ctl); /* we cannot actually stop the counters */ /* Pass back a pointer to our results */ *events = ((infiniband_control_state_t*) ctl)->counts; return PAPI_OK; } static int _infiniband_shutdown_component( void ) { /* Cleanup resources used by this component before leaving */ deallocate_infiniband_resources(); return PAPI_OK; } static int _infiniband_shutdown_thread( hwd_context_t *ctx ) { ( void ) ctx; return PAPI_OK; } /* This function sets various options in the component * The valid codes being passed in are PAPI_SET_DEFDOM, * PAPI_SET_DOMAIN, PAPI_SETDEFGRN, PAPI_SET_GRANUL * and PAPI_SET_INHERIT */ static int _infiniband_ctl( hwd_context_t *ctx, int code, _papi_int_option_t *option ) { ( void ) ctx; ( void ) code; ( void ) option; return PAPI_OK; } static int _infiniband_update_control_state( hwd_control_state_t *ctl, NativeInfo_t * native, int count, hwd_context_t *ctx ) { int i, index; ( void ) ctx; infiniband_control_state_t* control = (infiniband_control_state_t*) ctl; for (i=0 ; ibeing_measured[i] = 0; } for (i=0 ; ibeing_measured[index] = 1; control->need_difference[index] = 1; } return PAPI_OK; } /* * This function has to set the bits needed to count different domains * In particular: PAPI_DOM_USER, PAPI_DOM_KERNEL PAPI_DOM_OTHER * By default return PAPI_EINVAL if none of those are specified * and PAPI_OK with success * PAPI_DOM_USER is only user context is counted * PAPI_DOM_KERNEL is only the Kernel/OS context is counted * PAPI_DOM_OTHER is Exception/transient mode (like user TLB misses) * PAPI_DOM_ALL is all of the domains */ static int _infiniband_set_domain( hwd_control_state_t *ctl, int domain ) { int found = 0; (void) ctl; if (PAPI_DOM_USER & domain) found = 1; if 
(PAPI_DOM_KERNEL & domain) found = 1; if (PAPI_DOM_OTHER & domain) found = 1; if (!found) return (PAPI_EINVAL); return (PAPI_OK); } /* * Cannot reset the counters using the sysfs interface. */ static int _infiniband_reset( hwd_context_t *ctx, hwd_control_state_t *ctl ) { (void) ctx; (void) ctl; return PAPI_OK; } /* * Native Event functions */ static int _infiniband_ntv_enum_events( unsigned int *EventCode, int modifier ) { switch (modifier) { case PAPI_ENUM_FIRST: if (num_events == 0) return (PAPI_ENOEVNT); *EventCode = 0; return PAPI_OK; case PAPI_ENUM_EVENTS: { int index = *EventCode & PAPI_NATIVE_AND_MASK; if (index < num_events - 1) { *EventCode = *EventCode + 1; return PAPI_OK; } else return PAPI_ENOEVNT; break; } default: return PAPI_EINVAL; } return PAPI_EINVAL; } /* * */ static int _infiniband_ntv_code_to_name( unsigned int EventCode, char *name, int len ) { int index = EventCode; if (index>=0 && index=0 && index= num_events )) return PAPI_ENOEVNT; if (infiniband_native_events[index].name) { strncpy(info->symbol, infiniband_native_events[index].name, PAPI_HUGE_STR_LEN); info->symbol[PAPI_HUGE_STR_LEN-1] = '\0'; } if (infiniband_native_events[index].description) { strncpy(info->long_descr, infiniband_native_events[index].description, PAPI_MAX_STR_LEN); info->long_descr[PAPI_MAX_STR_LEN-1] = '\0'; } strncpy(info->units, "\0", 1); /* infiniband_native_events[index].units, sizeof(info->units)); */ /* info->data_type = infiniband_native_events[index].return_type; */ return PAPI_OK; } /* * */ papi_vector_t _infiniband_vector = { .cmp_info = { /* component information (unspecified values are initialized to 0) */ .name = "infiniband", .short_name = "infiniband", .version = "5.3.0", .description = "Linux Infiniband statistics using the sysfs interface", .num_mpx_cntrs = INFINIBAND_MAX_COUNTERS, .num_cntrs = INFINIBAND_MAX_COUNTERS, .default_domain = PAPI_DOM_USER | PAPI_DOM_KERNEL, .available_domains = PAPI_DOM_USER | PAPI_DOM_KERNEL, .default_granularity = 
PAPI_GRN_SYS, .available_granularities = PAPI_GRN_SYS, .hardware_intr_sig = PAPI_INT_SIGNAL, /* component specific cmp_info initializations */ .fast_real_timer = 0, .fast_virtual_timer = 0, .attach = 0, .attach_must_ptrace = 0, }, /* sizes of framework-opaque component-private structures */ .size = { .context = sizeof (infiniband_context_t), .control_state = sizeof (infiniband_control_state_t), .reg_value = sizeof (infiniband_register_t), .reg_alloc = 1 /* unused */ }, /* function pointers in this component */ .init_thread = _infiniband_init_thread, .init_component = _infiniband_init_component, .init_control_state = _infiniband_init_control_state, .start = _infiniband_start, .stop = _infiniband_stop, .read = _infiniband_read, .shutdown_thread = _infiniband_shutdown_thread, .shutdown_component = _infiniband_shutdown_component, .ctl = _infiniband_ctl, .update_control_state = _infiniband_update_control_state, .set_domain = _infiniband_set_domain, .reset = _infiniband_reset, .ntv_enum_events = _infiniband_ntv_enum_events, .ntv_code_to_name = _infiniband_ntv_code_to_name, .ntv_code_to_descr = _infiniband_ntv_code_to_descr, .ntv_code_to_info = _infiniband_ntv_code_to_info, }; papi-papi-7-2-0-t/src/components/infiniband/pscanf.h000066400000000000000000000011711502707512200223400ustar00rootroot00000000000000/* This file was taken from the tacc_stats utility, which is distributed * under a GPL license. */ #ifndef _PSCANF_H_ #define _PSCANF_H_ #include #include __attribute__((format(scanf, 2, 3))) static inline int pscanf(const char *path, const char *fmt, ...) 
{ int rc = -1; FILE *file = NULL; char file_buf[4096]; va_list arg_list; va_start(arg_list, fmt); file = fopen(path, "r"); if (file == NULL) goto out; setvbuf(file, file_buf, _IOFBF, sizeof(file_buf)); rc = vfscanf(file, fmt, arg_list); out: if (file != NULL) fclose(file); va_end(arg_list); return rc; } #endif papi-papi-7-2-0-t/src/components/infiniband/tests/000077500000000000000000000000001502707512200220575ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/infiniband/tests/MPI_test_infiniband_events.c000066400000000000000000001074031502707512200274610ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file MPI_test_infiniband_events.c * * @author Rizwan Ashraf * rizwan@icl.utk.edu * * MPI-based test case for the infiniband component. * * @brief * The test code uses the message passing interface (MPI) to test all interconnect * related events available in the infiniband component. It is designed to generate * network traffic using MPI routines with the goal to trigger some network counters. * The code automatically checks if the infiniband component is enabled and * correspondingly adds all available PAPI events in the event set, one at a time. * In each invocation, different data sizes are communicated over the network. * The event values are recorded in each case, and listed at the completion of the * test. Mostly, the event values need to be checked manually for correctness. * The code automatically tests expected behavior of the code for transmit (TX)/ * receive (RX) event types. * * In this test, the master process distributes workload to all other processes * (NumProcs-1) and then receives the results of the corresponding sub-computations. * As far as message transfers is concerned, the expected behavior of this code * is as follows: * 1. Master TX event ~= Sum of all RX events across all workers (NumProcs-1), * 2. 
Master RX event ~= Sum of all TX events across all workers (NumProcs-1). * Usage: mpirun -n ./MPI_test_infiniband_events */ #include #include #include #include #include /* headers required by PAPI */ #include "papi.h" #include "papi_test.h" /* constants */ // NSIZE_MIN/MAX: min/max no. of double floating point // values allocated at each node #define NSIZE_MIN 10000 #define NSIZE_MAX 100000 // No. of different data sizes to be // tested b/w NSIZE_MIN and NSIZE_MAX #define NSTEPS 9 // The max no. of infiniband events expected #define MAX_IB_EVENTS 150 // Threshold value to use when comparing TX/RX event values, // i.e., error will be recorded when any difference greater // than the threshold occurs #define EVENT_VAL_DIFF_THRESHOLD 100 // PASS_THRESHOLD: percentage of values out of all possibilities // which need to be correct for the test to PASS // WARN_THRESHOLD: If PASS_THRESHOLD is not met, then this threshold // is used to check if the test can be declared // PASS WITH WARNING. Otherwise, the test is declared // as FAILED. // NSIZE_* : No. of Data Sizes out of all possible NSTEPS data sizes // where TX/RX event value comparison will be performed to check // expected behavior. #define NSIZE_PASS_THRESHOLD 90 #define NSIZE_WARN_THRESHOLD 50 // EVENT_* : No. of events out of all possible events as reported by // component_info which need to be added successfully to the // event set. 
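The TX/RX comparison described above tolerates a small absolute difference between the master's counter and the sum over workers, governed by `EVENT_VAL_DIFF_THRESHOLD`. A hedged sketch of such a tolerance check (the helper name is hypothetical; the test itself may perform the comparison inline):

```c
#include <assert.h>

/* Illustrative tolerance check in the spirit of the test's
 * EVENT_VAL_DIFF_THRESHOLD comparison: two counter readings "match" if
 * their absolute difference does not exceed the threshold. The function
 * name is hypothetical, not part of the test code. */
static int counts_match(long long a, long long b, long long threshold)
{
    long long diff = (a > b) ? a - b : b - a;   /* absolute difference */
    return diff <= threshold;
}
```

An absolute threshold is the right shape here because unrelated background traffic adds a roughly constant number of packets regardless of the message sizes being swept.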
#define EVENT_PASS_THRESHOLD 90 #define EVENT_WARN_THRESHOLD 50 int main (int argc, char **argv) { /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); /************************* SETUP PAPI ENV ************************************* *******************************************************************************/ int retVal, r, code; int ComponentID, NumComponents, IB_ID = -1; int EventSet = PAPI_NULL; int eventCount = 0; // total events as reported by component info int eventNum = 0; // number of events successfully tested /* error reporting */ int addEventFailCount = 0, codeConvertFailCount = 0, eventInfoFailCount = 0; int PAPIstartFailCount = 0, PAPIstopFailCount = 0; int failedEventCodes[MAX_IB_EVENTS]; int failedEventIndex = 0; /* Note: these are fixed length arrays */ char eventNames[MAX_IB_EVENTS][PAPI_MAX_STR_LEN]; char description[MAX_IB_EVENTS][PAPI_MAX_STR_LEN]; long long values[NSTEPS][MAX_IB_EVENTS]; /* these record certain event values for event value testing */ long long rxCount[NSTEPS], txCount[NSTEPS]; const PAPI_component_info_t *cmpInfo = NULL; PAPI_event_info_t eventInfo; /* for timing the test */ long long startTime, endTime; double elapsedTime; /* PAPI Initialization */ retVal = PAPI_library_init( PAPI_VER_CURRENT ); if ( retVal != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__,"PAPI_library_init failed. 
The test has been terminated.\n",retVal); } /* Get total number of components detected by PAPI */ NumComponents = PAPI_num_components(); /* Check if infiniband component exists */ for ( ComponentID = 0; ComponentID < NumComponents; ComponentID++ ) { if ( (cmpInfo = PAPI_get_component_info(ComponentID)) == NULL ) { fprintf(stderr, "WARNING: PAPI_get_component_info failed on one of the components.\n" "\t The test will continue for now, but it will be skipped later on\n" "\t if this error was for a component under test.\n"); continue; } if (strcmp(cmpInfo->name, "infiniband") != 0) { continue; } // if we are here, Infiniband component is found if (!TESTS_QUIET) { printf("INFO: Component %d (%d) - %d events - %s\n", ComponentID, cmpInfo->CmpIdx, cmpInfo->num_native_events, cmpInfo->name); } if (cmpInfo->disabled) { test_skip(__FILE__,__LINE__,"Infiniband Component is disabled. The test has been terminated.\n", 0); break; } eventCount = cmpInfo->num_native_events; IB_ID = ComponentID; break; } /* if we did not find any valid events, just skip the test. 
*/ if (eventCount==0) { fprintf(stderr, "FATAL: No events found for the Infiniband component, even though it is enabled.\n" " The test will be skipped.\n"); test_skip(__FILE__,__LINE__,"No events found for the Infiniband component.\n", 0); } /************************* SETUP MPI ENV ************************************** *******************************************************************************/ int NumProcs, Rank; /* Initialize MPI environment */ MPI_Init (&argc, &argv); MPI_Comm_size (MPI_COMM_WORLD, &NumProcs); MPI_Comm_rank (MPI_COMM_WORLD, &Rank); if ((!TESTS_QUIET) && (Rank == 0)) { printf("INFO: This test should trigger some network events.\n"); } /* data sizes assigned here */ int Nmax_per_Proc = NSIZE_MAX; int Nmin_per_Proc = NSIZE_MIN; // fix data size if not appropriately set while (Nmax_per_Proc <= Nmin_per_Proc) Nmax_per_Proc = Nmin_per_Proc*10; int Nmax = Nmax_per_Proc * NumProcs; int NstepSize = (Nmax_per_Proc - Nmin_per_Proc)/NSTEPS; int i, j, k; // loop variables int memoryAllocateFailure = 0, ALLmemoryAllocateFailure = 0; // error flags /* data arrays */ double *X, *Y, *Out; double *Xp, *Yp, *Outp; /* Master will initialize data arrays */ if (Rank == 0) { X = (double *) malloc (sizeof(double) * Nmax); Y = (double *) malloc (sizeof(double) * Nmax); Out = (double *) malloc (sizeof(double) * Nmax); // check if memory was successfully allocated. // Do NOT quit from here. Need to quit safely. 
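Later in this file the test sweeps `NSTEPS` message sizes linearly between `NSIZE_MIN` and `NSIZE_MAX`, clamping the last (or any overshooting) step to the maximum. That per-step schedule can be sketched as a pure function (parameter names are illustrative; the test computes this inline with its macros):

```c
#include <assert.h>

/* Sketch of the data-size schedule: sizes grow linearly from nmin toward
 * nmax over nsteps steps; the final step, or any step whose linear value
 * overshoots, is clamped to nmax. Names here are illustrative. */
static int step_size(int nmin, int nmax, int nsteps, int step)
{
    int stride = (nmax - nmin) / nsteps;
    int n = nmin + step * stride;
    if (step == nsteps - 1 || n > nmax)
        n = nmax;                 /* clamp last/overshooting steps */
    return n;
}
```

With the defaults (`NSIZE_MIN` 10000, `NSIZE_MAX` 100000, `NSTEPS` 9) the stride is 10000, so the sweep covers 10000, 20000, ..., 100000 doubles per process.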
if ( (X == NULL) || (Y == NULL) || (Out == NULL) ) { fprintf(stderr, "FATAL: Failed to allocate memory on Master Node.\n"); memoryAllocateFailure = 1; } if (memoryAllocateFailure == 0) { if (!TESTS_QUIET) printf("INFO: Master is initializing data.\n"); for ( i = 0; i < Nmax; i++ ) { X[i] = i*0.25; Y[i] = i*0.75; } if (!TESTS_QUIET) printf("INFO: Master has successfully initialized arrays.\n"); } } // communicate to workers if master was able to successfully allocate memory MPI_Bcast (&memoryAllocateFailure, 1, MPI_INT, 0, MPI_COMM_WORLD); if (memoryAllocateFailure == 1) test_fail(__FILE__,__LINE__,"Could not allocate memory during the test. This is fatal and the test has been terminated.\n", 0); memoryAllocateFailure = 0; // re-use flag /* allocate memory for all nodes */ Xp = (double *) malloc (sizeof(double) * Nmax_per_Proc); Yp = (double *) malloc (sizeof(double) * Nmax_per_Proc); Outp = (double *) malloc (sizeof(double) * Nmax_per_Proc); // handle error cases for memory allocation failure for all nodes. if ( (Xp == NULL) || (Yp == NULL) || (Outp == NULL) ) { fprintf(stderr, "FATAL: Failed to allocate %zu bytes on Rank %d.\n", sizeof(double)*Nmax_per_Proc, Rank); memoryAllocateFailure = 1; } MPI_Allreduce (&memoryAllocateFailure, &ALLmemoryAllocateFailure, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD); if (ALLmemoryAllocateFailure > 0) test_fail(__FILE__,__LINE__,"Could not allocate memory during the test. 
This is fatal and the test has been terminated.\n", 0); /* calculate data size for each compute step */ int Nstep_per_Proc; int DataSizes[NSTEPS]; for (i = 0; i < NSTEPS; i++) { Nstep_per_Proc = Nmin_per_Proc + (i * NstepSize); //last iteration or when max size is exceeded if ((i == (NSTEPS - 1)) || (Nstep_per_Proc > Nmax_per_Proc)) Nstep_per_Proc = Nmax_per_Proc; DataSizes[i] = Nstep_per_Proc; } /************************* MAIN TEST CODE ************************************* *******************************************************************************/ startTime = PAPI_get_real_nsec(); /* create an eventSet */ retVal = PAPI_create_eventset ( &EventSet ); if (retVal != PAPI_OK) { // handle error cases for PAPI_create_eventset() // Two outcomes are possible here: // 1. PAPI_EINVAL: invalid argument. This should not occur. // 2. PAPI_ENOMEM: insufficient memory. If this is the case, then we need to quit the test. fprintf(stderr, "FATAL: Could not create an eventSet on MPI Rank %d due to: %s.\n" " Test will not proceed.\n", Rank, PAPI_strerror(retVal)); test_fail(__FILE__, __LINE__, "PAPI_create_eventset failed. This is fatal and the test has been terminated.\n", retVal); } // end -- handle error cases for PAPI_create_eventset() /* find the code for first event in component */ code = PAPI_NATIVE_MASK; r = PAPI_enum_cmp_event ( &code, PAPI_ENUM_FIRST, IB_ID ); /* add each event individually in the eventSet and measure event values. */ /* for each event, repeat work with different data sizes. 
*/ while ( r == PAPI_OK ) { // attempt to add event to event set retVal = PAPI_add_event (EventSet, code); if (retVal != PAPI_OK ) { // handle error cases for PAPI_add_event() if (retVal == PAPI_ENOMEM) { fprintf(stderr, "FATAL: Could not add an event to eventSet on MPI Rank %d due to insufficient memory.\n" " Test will not proceed.\n", Rank); test_fail(__FILE__, __LINE__, "PAPI_add_event failed due to fatal error and the test has been terminated.\n", retVal); } if (retVal == PAPI_ENOEVST) { fprintf(stderr, "WARNING: Could not add an event to eventSet on MPI Rank %d since eventSet does not exist.\n" "\t Test will proceed attempting to create a new eventSet\n", Rank); EventSet = PAPI_NULL; retVal = PAPI_create_eventset ( &EventSet ); if (retVal != PAPI_OK) test_fail(__FILE__, __LINE__, "PAPI_create_eventset failed while handling failure of PAPI_add_event." " This is fatal and the test has been terminated.\n", retVal); continue; } if (retVal == PAPI_EISRUN) { long long tempValue; fprintf(stderr, "WARNING: Could not add an event to eventSet on MPI Rank %d since eventSet is already counting.\n" "\t Test will proceed attempting to stop counting and re-attempting to add current event.\n", Rank); retVal = PAPI_stop (EventSet, &tempValue); if (retVal != PAPI_OK) test_fail(__FILE__,__LINE__,"PAPI_stop failed while handling failure of PAPI_add_event." " This is fatal and the test has been terminated.\n", retVal); retVal = PAPI_cleanup_eventset( EventSet ); if (retVal != PAPI_OK) test_fail(__FILE__,__LINE__,"PAPI_cleanup_eventset failed while handling failure of PAPI_add_event." 
" This is fatal and the test has been terminated.\n", retVal); continue; } // for all other errors, skip an event addEventFailCount++; // error reporting failedEventCodes[failedEventIndex] = code; failedEventIndex++; fprintf(stderr, "WARNING: Could not add an event to eventSet on MPI Rank %d due to: %s.\n" "\t Test will proceed attempting to add other events.\n", Rank, PAPI_strerror(retVal)); r = PAPI_enum_cmp_event (&code, PAPI_ENUM_EVENTS, IB_ID); if (addEventFailCount >= eventCount) // if no event was added successfully break; continue; } // end -- handle error cases for PAPI_add_event() /* get event name of added event */ retVal = PAPI_event_code_to_name (code, eventNames[eventNum]); if (retVal != PAPI_OK ) { // handle error cases for PAPI_event_code_to_name(). codeConvertFailCount++; // error reporting fprintf(stderr, "WARNING: PAPI_event_code_to_name failed due to: %s.\n" "\t Test will proceed but an event name will not be available.\n", PAPI_strerror(retVal)); strncpy(eventNames[eventNum], "ERROR:NOT_AVAILABLE", sizeof(eventNames[0])-1); eventNames[eventNum][sizeof(eventNames[0])-1] = '\0'; } // end -- handle error cases for PAPI_event_code_to_name() /* get long description of added event */ retVal = PAPI_get_event_info (code, &eventInfo); if (retVal != PAPI_OK ) { // handle error cases for PAPI_get_event_info() eventInfoFailCount++; // error reporting fprintf(stderr, "WARNING: PAPI_get_event_info failed due to: %s.\n" "\t Test will proceed but an event description will not be available.\n", PAPI_strerror(retVal)); strncpy(description[eventNum], "ERROR:NOT_AVAILABLE", sizeof(description[0])-1); description[eventNum][sizeof(description[0])-1] = '\0'; } else { strncpy(description[eventNum], eventInfo.long_descr, sizeof(description[0])-1); description[eventNum][sizeof(description[0])-1] = '\0'; } /****************** PERFORM WORK (W/ DIFFERENT DATA SIZES) ********************* *******************************************************************************/ for 
(i = 0; i < NSTEPS; i++) { /* start recording event value */ retVal = PAPI_start (EventSet); if (retVal != PAPI_OK ) { // handle error cases for PAPI_start() // we need to skip the current event being counted for all errors, // in all cases, errors will be handled later on. PAPIstartFailCount++; // error reporting failedEventCodes[failedEventIndex] = code; failedEventIndex++; fprintf(stderr, "WARNING: PAPI_start failed on Event Number %d (%s) due to: %s.\n" "\t Test will proceed with other events if available.\n", eventNum, eventNames[eventNum], PAPI_strerror(retVal)); for (k = i; k < NSTEPS; k++) // fill invalid event values. values[k][eventNum] = (unsigned long long) - 1; break; // try next event } // end -- handle error cases for PAPI_start() if ((!TESTS_QUIET) && (Rank == 0)) printf("INFO: Doing MPI communication for %s: min. %ld bytes transferred by each process.\n", eventNames[eventNum], DataSizes[i]*sizeof(double)); MPI_Scatter (X, DataSizes[i], MPI_DOUBLE, Xp, DataSizes[i], MPI_DOUBLE, 0, MPI_COMM_WORLD); MPI_Scatter (Y, DataSizes[i], MPI_DOUBLE, Yp, DataSizes[i], MPI_DOUBLE, 0, MPI_COMM_WORLD); /* perform calculation. */ /* Note: there is redundant computation here. */ for (j = 0; j < DataSizes[i]; j++) Outp [j] = Xp [j] + Yp [j]; MPI_Gather (Outp, DataSizes[i], MPI_DOUBLE, Out, DataSizes[i], MPI_DOUBLE, 0, MPI_COMM_WORLD); /* stop recording and collect event value */ retVal = PAPI_stop (EventSet, &values[i][eventNum]); if (retVal != PAPI_OK ) { // handle error cases for PAPI_stop() // we need to skip the current event for all errors // except one case, as below. PAPIstopFailCount++; // error reporting if (retVal == PAPI_ENOTRUN) { fprintf(stderr, "WARNING: PAPI_stop failed on Event Number %d (%s) since eventSet is not running.\n" "\t Test will attempt to restart counting on this eventSet.\n", eventNum, eventNames[eventNum]); if (PAPIstopFailCount < NSTEPS) { i = i - 1; // re-attempt this data size continue; } } // for all other errors, try next event. 
failedEventCodes[failedEventIndex] = code; failedEventIndex++; fprintf(stderr, "WARNING: PAPI_stop failed on Event Number %d (%s) due to: %s.\n" "\t Test will proceed with other events if available.\n", eventNum, eventNames[eventNum], PAPI_strerror(retVal)); for (k = i; k < NSTEPS; k++) // fill invalid event values values[k][eventNum] = (unsigned long long) - 1; break; } // end -- handle error cases for PAPI_stop() /* record number of bytes received */ if (strstr(eventNames[eventNum], ":port_rcv_data")) { rxCount[i] = values[i][eventNum] * 4; // counter value needs to be multiplied by 4 to get total number of bytes } /* record number of bytes transmitted */ if (strstr(eventNames[eventNum], ":port_xmit_data")) { txCount[i] = values[i][eventNum] * 4; } } // end -- work loop /* Done, clean up eventSet for next iteration */ retVal = PAPI_cleanup_eventset( EventSet ); if (retVal != PAPI_OK) { // handle failure cases for PAPI_cleanup_eventset() if (retVal == PAPI_ENOEVST) { fprintf(stderr, "WARNING: Could not clean up eventSet on MPI Rank %d since eventSet does not exist.\n" "\t Test will proceed attempting to create a new eventSet\n", Rank); EventSet = PAPI_NULL; retVal = PAPI_create_eventset ( &EventSet ); if (retVal != PAPI_OK) test_fail(__FILE__, __LINE__, "PAPI_create_eventset failed while handling failure of PAPI_cleanup_eventset.\n" "This is fatal and the test has been terminated.\n", retVal); } else if (retVal == PAPI_EISRUN) { long long tempValue; fprintf(stderr, "WARNING: Could not clean up eventSet on MPI Rank %d since eventSet is already counting.\n" "\t Test will proceed attempting to stop counting and re-attempting to clean up.\n", Rank); retVal = PAPI_stop (EventSet, &tempValue); if (retVal != PAPI_OK) test_fail(__FILE__,__LINE__,"PAPI_stop failed while handling failure of PAPI_cleanup_eventset." 
"This is fatal and the test has been terminated.\n", retVal); retVal = PAPI_cleanup_eventset( EventSet ); if (retVal != PAPI_OK) test_fail(__FILE__,__LINE__,"PAPI_cleanup_eventset failed once again while handling failure of PAPI_cleanup_eventset." "This is fatal and the test has been terminated.\n", retVal); } else { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset failed:", retVal); } } // end -- handle failure cases for PAPI_cleanup_eventset() /* get next event */ eventNum++; r = PAPI_enum_cmp_event (&code, PAPI_ENUM_EVENTS, IB_ID); } // end -- event loop // free memory at all nodes free (Xp); free (Yp); free (Outp); /* Done, destroy eventSet */ retVal = PAPI_destroy_eventset( &EventSet ); if (retVal != PAPI_OK) { // handle error cases for PAPI_destroy_eventset() if (retVal == PAPI_ENOEVST || retVal == PAPI_EINVAL) { fprintf(stderr, "WARNING: Could not destroy eventSet on MPI Rank %d since eventSet does not exist or has invalid value.\n" "\t Test will proceed with other operations.\n", Rank); } else if (retVal == PAPI_EISRUN) { long long tempValue; fprintf(stderr, "WARNING: Could not destroy eventSet on MPI Rank %d since eventSet is already counting.\n" "\t Test will proceed attempting to stop counting and re-attempting to clean up.\n", Rank); retVal = PAPI_stop (EventSet, &tempValue); if (retVal != PAPI_OK) test_fail(__FILE__,__LINE__,"PAPI_stop failed while handling failure of PAPI_destroy_eventset." "This is fatal and the test has been terminated.\n", retVal); retVal = PAPI_cleanup_eventset( EventSet ); if (retVal != PAPI_OK) test_fail(__FILE__,__LINE__,"PAPI_cleanup_eventset failed while handling failure of PAPI_destroy_eventset." "This is fatal and the test has been terminated.\n", retVal); retVal = PAPI_destroy_eventset(&EventSet); if (retVal != PAPI_OK) test_fail(__FILE__,__LINE__,"PAPI_destroy_eventset failed once again while handling failure of PAPI_destroy_eventset." 
" This is fatal and the test has been terminated.\n", retVal); } else { fprintf(stderr, "WARNING: Could not destroy eventSet on MPI Rank %d since there is an internal bug in PAPI.\n" "\t Please report this to the developers. Test will proceed and operation may be unexpected.\n", Rank); } } // end -- handle failure cases for PAPI_destroy_eventset() /*************************** SUMMARIZE RESULTS ******************************** ******************************************************************************/ endTime = PAPI_get_real_nsec(); elapsedTime = ((double) (endTime-startTime))/1.0e9; /* print results: event values and descriptions */ if (!TESTS_QUIET) { int eventX; // print event values at each process/rank printf("POST WORK EVENT VALUES (Rank, Event Name, List of Event Values w/ Different Data Sizes)>>>\n"); for (eventX = 0; eventX < eventNum; eventX++) { printf("\tRank %d> %s --> \t\t", Rank, eventNames[eventX]); for (i = 0; i < NSTEPS; i++) { if (i < NSTEPS-1) printf("%lld, ", values[i][eventX]); else printf("%lld.", values[i][eventX]); } printf("\n"); } // print description of each event if (Rank == 0) { printf("\n\nTHE DESCRIPTION OF EVENTS IS AS FOLLOWS>>>\n"); for (eventX = 0; eventX < eventNum; eventX++) { printf("\t%s \t\t--> %s \n", eventNames[eventX], description[eventX]); } } } /* test summary: 1) sanity check on floating point computation */ int computeTestPass = 0, computeTestPassCount = 0; if (Rank == 0) { // check results of computation for (i = 0; i < Nmax; i++) { if ( fabs(Out[i] - (X[i] + Y[i])) < 0.00001 ) computeTestPassCount++; } // summarize results of computation if (computeTestPassCount == Nmax) computeTestPass = 1; // free memory free (X); free (Y); free (Out); } // communicate test results to everyone MPI_Bcast (&computeTestPass, 1, MPI_INT, 0, MPI_COMM_WORLD); /* test summary: 2) check TX and RX event values, if available */ long long rxCountSumWorkers[NSTEPS], txCountSumWorkers[NSTEPS]; long long *allProcessRxEvents, 
*allProcessTxEvents; int txFailedIndex = 0, rxFailedIndex = 0; int txFailedDataSizes[NSTEPS], rxFailedDataSizes[NSTEPS]; int eventValueTestPass = 0; // for test summary if ((txCount[0] > 0) && (rxCount[0] > 0)) { if (Rank == 0) { allProcessRxEvents = (long long*) malloc(sizeof(long long) * NumProcs * NSTEPS); allProcessTxEvents = (long long*) malloc(sizeof(long long) * NumProcs * NSTEPS); } // get all rxCount/txCount at master. Used to check if rx/tx counts match up. MPI_Gather (&rxCount, NSTEPS, MPI_LONG_LONG, allProcessRxEvents, NSTEPS, MPI_LONG_LONG, 0, MPI_COMM_WORLD); MPI_Gather (&txCount, NSTEPS, MPI_LONG_LONG, allProcessTxEvents, NSTEPS, MPI_LONG_LONG, 0, MPI_COMM_WORLD); // perform event count check at master if (Rank == 0) { memset (rxCountSumWorkers, 0, sizeof(long long) * NSTEPS); memset (txCountSumWorkers, 0, sizeof(long long) * NSTEPS); for (i = 0; i < NSTEPS; i++) { for (j = 1; j < NumProcs; j++) { rxCountSumWorkers[i] += allProcessRxEvents[j*NSTEPS+i]; txCountSumWorkers[i] += allProcessTxEvents[j*NSTEPS+i]; } } if (!TESTS_QUIET) printf("\n\n"); for (i = 0; i < NSTEPS; i++) { // check: Master TX event ~= Sum of all RX events across all workers (NumProcs-1) // difference threshold may need to be adjusted based on observed values if ((llabs(rxCountSumWorkers[i] - txCount[i]) > EVENT_VAL_DIFF_THRESHOLD)) { txFailedDataSizes[txFailedIndex] = DataSizes[i]; txFailedIndex++; if (!TESTS_QUIET) printf("WARNING: The transmit event count at Master Node (%lld) is not equal" " to receive event counts at Worker Nodes (%lld) when using %ld bytes!\n" "\t A difference of %lld was recorded.\n", txCount[i], rxCountSumWorkers[i], DataSizes[i]*sizeof(double), llabs(rxCountSumWorkers[i] - txCount[i])); } else { if (!TESTS_QUIET) printf("PASSED: The transmit event count at Master Node (%lld) is almost equal" " to receive event counts at Worker Nodes (%lld) when using %ld bytes.\n", txCount[i], rxCountSumWorkers[i], DataSizes[i]*sizeof(double)); } // check: Master RX event 
~= Sum of all TX events across all workers (NumProcs-1) if ((llabs(txCountSumWorkers[i] - rxCount[i]) > EVENT_VAL_DIFF_THRESHOLD)) { rxFailedDataSizes[rxFailedIndex] = DataSizes[i]; rxFailedIndex++; if (!TESTS_QUIET) printf("WARNING: The receive event count at Master Node (%lld) is not equal" " to transmit event counts at Worker Nodes (%lld) when using %ld bytes!\n" " A difference of %lld was recorded.\n", rxCount[i], txCountSumWorkers[i], DataSizes[i]*sizeof(double), llabs(txCountSumWorkers[i] - rxCount[i])); } else { if (!TESTS_QUIET) printf("PASSED: The receive event count at Master Node (%lld) is almost equal" " to transmit event counts at Worker Nodes (%lld) when using %ld bytes.\n", rxCount[i], txCountSumWorkers[i], DataSizes[i]*sizeof(double)); } } // test evaluation criteria if ( (((float) txFailedIndex / NSTEPS) <= (1.0 - (float) NSIZE_PASS_THRESHOLD/100)) && (((float) rxFailedIndex / NSTEPS) <= (1.0 - (float) NSIZE_PASS_THRESHOLD/100)) ) eventValueTestPass = 1; // pass else if ( (((float) txFailedIndex / NSTEPS) <= (1.0 - (float) NSIZE_WARN_THRESHOLD/100)) && (((float) rxFailedIndex / NSTEPS) <= (1.0 - (float) NSIZE_WARN_THRESHOLD/100)) ) eventValueTestPass = -1; // warning else eventValueTestPass = 0; // fail } // end -- check RX/TX counts for all data sizes at Master node. // communicate test results to everyone, since only master knows the result MPI_Bcast (&eventValueTestPass, 1, MPI_INT, 0, MPI_COMM_WORLD); } else { eventValueTestPass = -2; // not available } // end -- event value test /* test summary: 3) number of events added and counted successfully */ // Note: under some rare circumstances, the number of failed events at each node may be different. 
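The pass/warn/fail decisions computed above for the event-value check (and just below for the event-count check) share one pattern; a standalone sketch follows, using the same 1/-1/0 encoding as the test. The threshold percentages in the usage example are illustrative, not the test's actual `NSIZE_PASS_THRESHOLD`/`NSIZE_WARN_THRESHOLD` values.

```c
/* Classify a failure ratio against pass/warn thresholds given in percent:
 * returns 1 (pass), -1 (pass with warning), or 0 (fail) -- the encoding
 * used by this test's eventValueTestPass/eventNumTestPass variables. */
static int classify_failures(int failed, int total, int pass_pct, int warn_pct)
{
    float ratio = (float) failed / (float) total;
    if (ratio <= 1.0f - (float) pass_pct / 100.0f)
        return 1;   /* pass */
    if (ratio <= 1.0f - (float) warn_pct / 100.0f)
        return -1;  /* warning */
    return 0;       /* fail */
}
```

For example, with a 90% pass threshold and a 50% warn threshold, 3 failures out of 10 steps classifies as a warning rather than a failure.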
int eventNumTestPass = 0; // test evaluation criteria if (((float) failedEventIndex / eventCount) <= (1.0 - (float) EVENT_PASS_THRESHOLD/100) ) eventNumTestPass = 1; else if (((float) failedEventIndex / eventCount) <= (1.0 - (float) EVENT_WARN_THRESHOLD/100) ) eventNumTestPass = -1; else eventNumTestPass = 0; /* print test summary */ if ((!TESTS_QUIET) && (Rank == 0)) { printf("\n\n************************ TEST SUMMARY (EVENTS) ******************************\n" "No. of Events NOT tested successfully: %d (%.1f%%)\n" "Note: the above failed event count is for Master node.\n" "Total No. of Events reported by component info: %d\n", failedEventIndex, ((float) failedEventIndex/eventCount)*100.0, eventCount); if (failedEventIndex > 0) { printf("\tNames of Events NOT tested: "); char failedEventName[PAPI_MAX_STR_LEN]; for (i = 0; i < failedEventIndex; i++) { retVal = PAPI_event_code_to_name (failedEventCodes[i], failedEventName); if (retVal != PAPI_OK) { strncpy(failedEventName, "ERROR:NOT_AVAILABLE", sizeof(failedEventName)-1); failedEventName[sizeof(failedEventName)-1] = '\0'; } printf("%s ", failedEventName); if ((i > 0) && (i % 2 == 1)) printf("\n \t\t\t\t"); } printf("\n"); printf("\tThe error counts for different PAPI routines are as follows:\n" "\t\t\tNo. of PAPI add event errors (major) --> %d\n" "\t\t\tNo. of PAPI code convert errors (minor) --> %d\n" "\t\t\tNo. of PAPI event info errors (minor) --> %d\n" "\t\t\tNo. of PAPI start errors (major) --> %d\n" "\t\t\tNo. of PAPI stop errors (major) --> %d\n", addEventFailCount, codeConvertFailCount, eventInfoFailCount, PAPIstartFailCount, PAPIstopFailCount); } printf("The PAPI event test has "); if (eventNumTestPass == 1) printf("PASSED\n"); else if (eventNumTestPass == -1) printf("PASSED WITH WARNING\n"); else printf("FAILED\n"); // event values printf("************************ TEST SUMMARY (EVENT VALUES) ************************\n"); if ((txCount[0] > 0) && (rxCount[0] > 0)) { printf("No. 
of times transmit event at Master node did NOT match up receive events at worker nodes: %d (%.1f%%)\n" "No. of times receive event at Master node did NOT match up transmit events at worker nodes: %d (%.1f%%)\n" "Total No. of data sizes tested: %d\n" "\tList of Data Sizes tested in bytes:\n\t\t\t", txFailedIndex, ((float) txFailedIndex/NSTEPS)*100.0, rxFailedIndex, ((float) rxFailedIndex/NSTEPS)*100.0, NSTEPS); for (i = 0; i < NSTEPS; i++) printf("%ld ",DataSizes[i]*sizeof(double)); printf("\n"); if (txFailedIndex > 0 || rxFailedIndex > 0) { printf("\tList of Data Sizes where transmit count at Master was not equal to sum of all worker receive counts:\n" "\t\t\t"); for (i = 0; i < txFailedIndex; i++) printf("%ld ", txFailedDataSizes[i]*sizeof(double)); printf("\n\tList of Data Sizes where receive count at Master was not equal to sum of all worker transmit counts:\n" "\t\t\t"); for (i = 0; i < rxFailedIndex; i++) printf("%ld ", rxFailedDataSizes[i]*sizeof(double)); printf("\n"); } printf("The PAPI event value test has "); if (eventValueTestPass == 1) printf("PASSED\n"); else if (eventValueTestPass == -1) printf("PASSED WITH WARNING\n"); else printf("FAILED\n"); } else { printf("Transmit or receive events were NOT found!\n"); } // compute values printf("************************ TEST SUMMARY (COMPUTE VALUES) **********************\n"); if (computeTestPassCount != Nmax) { printf("No. of times sanity check FAILED on the floating point computation: %d (%.1f%%)\n" "Total No. of floating point computations performed: %d \n", Nmax-computeTestPassCount, ((float) (Nmax-computeTestPassCount)/Nmax)*100.0, Nmax); } else { printf("Sanity check PASSED on all floating point computations.\n" "Note: this may pass even if one event was tested successfully!\n"); } printf("The overall test took %.3f secs.\n\n", elapsedTime); } // end -- print summary of test results. 
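The summary above compares transmit and receive byte counts derived from the `port_xmit_data`/`port_rcv_data` counters, which tick once per four bytes (one 32-bit word). The helpers below sketch that unit conversion and the tolerance comparison; the tolerance value in the usage example is illustrative, not the test's `EVENT_VAL_DIFF_THRESHOLD`.

```c
/* InfiniBand port data counters count 32-bit words; scale by 4 for bytes. */
static long long ib_words_to_bytes(long long words)
{
    return words * 4LL;
}

/* Two byte counts "match" when their absolute difference is within the
 * tolerance, mirroring the threshold check applied in the test summary. */
static int counts_match(long long a, long long b, long long tolerance)
{
    long long diff = (a > b) ? a - b : b - a;
    return diff <= tolerance;
}
```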
/* finalize MPI */ MPI_Finalize(); /* determine success of overall test based on all tests */ if (computeTestPass == 1 && eventValueTestPass == 1 && eventNumTestPass == 1) { // all has to be good for the test to pass. // note: test will generate a warning if tx/rx events are not available. test_pass( __FILE__ ); } else if ( (eventValueTestPass < 0 && (eventNumTestPass < 0 || eventNumTestPass == 1) ) || (eventValueTestPass == 1 && eventNumTestPass < 0) || (eventValueTestPass == 1 && eventNumTestPass == 1 && computeTestPass == 0) ) { test_warn(__FILE__,__LINE__,"A warning was generated during the PAPI event tests, or the sanity check on the computation failed", 0); test_pass(__FILE__); } else { // fail if the event value test or the event count test failed, // irrespective of the result of computeTest. test_fail(__FILE__, __LINE__,"One or more of the PAPI event tests failed", 0); } } // end main
papi-papi-7-2-0-t/src/components/infiniband/tests/Makefile
NAME=infiniband include ../../Makefile_comp_tests.target TESTS = infiniband_list_events infiniband_values_by_code ifneq ($(MPICC),) TESTS += MPI_test_infiniband_events endif infiniband_tests: $(TESTS) MPI_test_infiniband_events.o:MPI_test_infiniband_events.c $(MPICC) $(INCLUDE) -c -o $@ $< %.o:%.c $(CC) $(CFLAGS) $(OPTFLAGS) $(INCLUDE) -c -o $@ $< infiniband_list_events: infiniband_list_events.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o $@ $^ $(LDFLAGS) infiniband_values_by_code: infiniband_values_by_code.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o $@ $^ $(LDFLAGS) MPI_test_infiniband_events: MPI_test_infiniband_events.o $(UTILOBJS) $(PAPILIB) $(MPICC) $(INCLUDE) -o $@ $^ $(LDFLAGS) clean: rm -f $(TESTS) *.o
papi-papi-7-2-0-t/src/components/infiniband/tests/infiniband_list_events.c
/****************************/ /* THIS IS OPEN SOURCE CODE 
*/ /****************************/ /** * @author Jose Pedro Oliveira * * test case for the linux-infiniband component * Adapted from its counterpart in the net component. * * @brief * List all infiniband event codes and names */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include "papi.h" #include "papi_test.h" int main (int argc, char **argv) { int retval,cid,numcmp; int total_events=0; int code; char event_name[PAPI_MAX_STR_LEN]; int r; const PAPI_component_info_t *cmpinfo = NULL; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); /* PAPI Initialization */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__,"PAPI_library_init failed\n",retval); } if (!TESTS_QUIET) { printf("Listing all infiniband events\n"); } numcmp = PAPI_num_components(); for(cid=0; cid<numcmp; cid++) { cmpinfo = PAPI_get_component_info(cid); if (cmpinfo == NULL) { test_fail(__FILE__, __LINE__,"PAPI_get_component_info failed\n", 0); } if ( strstr(cmpinfo->name, "infiniband") == NULL) { continue; } if (!TESTS_QUIET) { printf("Component %d (%d) - %d events - %s\n", cid, cmpinfo->CmpIdx, cmpinfo->num_native_events, cmpinfo->name); } if (cmpinfo->disabled) { test_skip(__FILE__,__LINE__,"Component infiniband is disabled", 0); continue; } code = PAPI_NATIVE_MASK; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_FIRST, cid ); while ( r == PAPI_OK ) { retval = PAPI_event_code_to_name( code, event_name ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } if (!TESTS_QUIET) { printf("%#x %s\n", code, event_name); } total_events++; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_EVENTS, cid ); } } if (total_events==0) { test_skip(__FILE__,__LINE__,"No infiniband events found", 0); } test_pass( __FILE__ ); return 0; } // vim:set ai ts=4 sw=4 sts=4 et:
papi-papi-7-2-0-t/src/components/infiniband/tests/infiniband_values_by_code.c
/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @author Jose Pedro Oliveira * * test case for the linux-infiniband component * Adapted from its 
counterpart in the net component. * * @brief * Prints the value of every native event (by code) */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include "papi.h" #include "papi_test.h" int main (int argc, char **argv) { int retval,cid,numcmp; int EventSet = PAPI_NULL; long long *values = 0; int *codes = 0; char *names = 0; int code, i; int total_events=0; int r; const PAPI_component_info_t *cmpinfo = NULL; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); /* PAPI Initialization */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__,"PAPI_library_init failed\n",retval); } if (!TESTS_QUIET) { printf("Trying all infiniband events\n"); } numcmp = PAPI_num_components(); for(cid=0; cid<numcmp; cid++) { cmpinfo = PAPI_get_component_info(cid); if (cmpinfo == NULL) { test_fail(__FILE__, __LINE__,"PAPI_get_component_info failed\n", 0); } if (!TESTS_QUIET) { printf("Component %d - %d events - %s\n", cid, cmpinfo->num_native_events, cmpinfo->name); } if ( strstr(cmpinfo->name, "infiniband") == NULL) { continue; } if (cmpinfo->disabled) { test_skip(__FILE__,__LINE__,"Component infiniband is disabled", 0); continue; } values = (long long*) malloc(sizeof(long long) * cmpinfo->num_native_events); codes = (int*) malloc(sizeof(int) * cmpinfo->num_native_events); names = (char*) malloc(PAPI_MAX_STR_LEN * cmpinfo->num_native_events); EventSet = PAPI_NULL; retval = PAPI_create_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset()", retval); } code = PAPI_NATIVE_MASK; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_FIRST, cid ); i = 0; while ( r == PAPI_OK ) { retval = PAPI_event_code_to_name( code, &names[i*PAPI_MAX_STR_LEN] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } codes[i] = code; retval = PAPI_add_event( EventSet, code ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_add_event()", retval); } total_events++; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_EVENTS, cid ); i += 1; } retval = PAPI_start( EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start()", retval); } /* XXX figure out a general method to generate 
some traffic * for infiniband * the operation should take more than one second in order * to guarantee that the network counters are updated */ /* For now, just sleep for 10 seconds */ sleep(10); retval = PAPI_stop( EventSet, values); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_stop()", retval); } if (!TESTS_QUIET) { for (i=0 ; i<cmpinfo->num_native_events ; ++i) printf("%#x %-24s = %lld\n", codes[i], names+i*PAPI_MAX_STR_LEN, values[i]); } retval = PAPI_cleanup_eventset( EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset()", retval); } retval = PAPI_destroy_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset()", retval); } free(names); free(codes); free(values); } if (total_events==0) { test_skip(__FILE__,__LINE__,"No infiniband events found", 0); } test_pass( __FILE__ ); return 0; } // vim:set ai ts=4 sw=4 sts=4 et:
papi-papi-7-2-0-t/src/components/intel_gpu/
papi-papi-7-2-0-t/src/components/intel_gpu/README.md
# Intel GPU Component: (intel_gpu)

The PAPI "intel_gpu" component provides access to Intel Graphics Processing Unit (GPU) hardware performance metrics through the Intel oneAPI Level Zero interface.

## Enable intel_gpu component

To enable the intel_gpu component, build the PAPI library with the configure option:

./configure --with-components="intel_gpu"

It is required to build with the Intel oneAPI Level Zero header files. The directory of the Level0 header files can be defined using INTEL_L0_HEADERS. The default installation location is /usr/include/level_zero/. 
## Prerequisites

* [oneAPI Level Zero loader (libze_loader.so)](https://github.com/oneapi-src/level-zero)
* [Intel(R) Metrics Discovery Application Programming Interface](https://github.com/intel/metrics-discovery)
* /proc/sys/dev/i915/perf_stream_paranoid is set to "0"
* User needs to be added to the Linux render/video groups

## Runtime environment:

* ```sh
  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH::
  ```
* To enable metrics:
  ```sh
  ZET_ENABLE_METRICS=1
  ```
* To change the sampling period from the default of 400000:
  ```sh
  METRICS_SAMPLING_PERIOD=value
  ```
* To enable metrics query on a kernel:
  ```sh
  ZE_ENABLE_TRACING_LAYER=1
  ```

## Metric collection mode:

Two metric collection modes are supported.

* Time based sampling. In this mode, data collection and the app can run in separate processes.
* Metrics query on a kernel. In this mode, PAPI_start() must be called before the kernel launch and PAPI_stop() after kernel execution completes.

When ZE_ENABLE_TRACING_LAYER=1 is set, the collection switches to metrics query mode.

## Metrics:

Use "test/gpu_metrc_list" or "papi_native_avail" to find the metric names for the Intel GPU.

* Metrics are named as: intel_gpu:::<group>.<metric>
* Metrics can be added with a qualifier to select a device or subdevice: intel_gpu:::<group>.<metric>:device=<n>:tile=<m> where n is the device id (starting from 0) and m is the subdevice id (starting from 0).

See test/readme.txt for how to use it. 
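As a sketch of the qualifier syntax described above, a small formatter can assemble a fully qualified event name. The group and metric names used in the example ("ComputeBasic", "GpuTime") are hypothetical placeholders and are not guaranteed to exist on any given device.

```c
#include <stdio.h>

/* Compose intel_gpu:::<group>.<metric>:device=<n>:tile=<m> into buf;
 * returns the number of characters snprintf would have written. */
static int format_gpu_event(char *buf, size_t len, const char *group,
                            const char *metric, int device, int tile)
{
    return snprintf(buf, len, "intel_gpu:::%s.%s:device=%d:tile=%d",
                    group, metric, device, tile);
}
```

The resulting string is what one would pass to PAPI event-name lookup for a specific device/tile.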
papi-papi-7-2-0-t/src/components/intel_gpu/Rules.intel_gpu
# $Id$ ## the following set are used to define oneAPI Level Zero Metric API based implementation INTEL_L0_HEADERS ?= /usr/include GPUCOMP = components/intel_gpu GPU_INTERNAL = $(GPUCOMP)/internal GPUSRCS = $(GPUCOMP)/linux_intel_gpu_metrics.c GPUSRCS += $(GPU_INTERNAL)/src/GPUMetricInterface.cpp $(GPU_INTERNAL)/src/GPUMetricHandler.cpp GPUHEADER=$(GPU_INTERNAL)/inc/GPUMetricInterface.h COMPSRCS += $(GPUSRCS) $(GPULIBSRCS) GPUOBJS = GPUMetricInterface.o GPUMetricHandler.o linux_intel_gpu_metrics.o COMPOBJS += $(GPUOBJS) CFLAGS += $(LDL) -g -I$(GPU_INTERNAL) -I$(GPU_INTERNAL)/inc -D_GLIBCXX_USE_CXX11_ABI=1 LDFLAGS += -ldl GPUMetricInterface.o: $(GPU_INTERNAL)/src/GPUMetricInterface.cpp $(GPUHEADER) $(CXX) -g -fpic $(CFLAGS) $(CPPFLAGS) -I$(GPU_INTERNAL)/inc -I$(INTEL_L0_HEADERS) -o GPUMetricInterface.o -c $(GPU_INTERNAL)/src/GPUMetricInterface.cpp GPUMetricHandler.o: $(GPU_INTERNAL)/src/GPUMetricHandler.cpp $(GPUHEADER) $(GPU_INTERNAL)/inc/GPUMetricHandler.h $(CXX) -g -fpic $(CFLAGS) $(CPPFLAGS) -I$(GPU_INTERNAL)/inc -I$(INTEL_L0_HEADERS) -o GPUMetricHandler.o -c $(GPU_INTERNAL)/src/GPUMetricHandler.cpp linux_intel_gpu_metrics.o: $(GPUCOMP)/linux_intel_gpu_metrics.c $(GPUHEADER) $(CC) $(LIBCFLAGS) $(CFLAGS) $(CPPFLAGS) $(OPTFLAGS) -g -o linux_intel_gpu_metrics.o -c $(GPUCOMP)/linux_intel_gpu_metrics.c $(LDFLAGS) ##clean:
papi-papi-7-2-0-t/src/components/intel_gpu/internal/
papi-papi-7-2-0-t/src/components/intel_gpu/internal/inc/
papi-papi-7-2-0-t/src/components/intel_gpu/internal/inc/GPUMetricHandler.h
/* * GPUMetricHandler.h: Intel® Graphics Component for PAPI * * Copyright (c) 2020 Intel Corp. 
All rights reserved * Contributed by Peinan Zhang * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
 */

#ifndef _GPUMETRICHANDLER_H
#define _GPUMETRICHANDLER_H

#pragma once

/* the names of the standard headers below were lost in extraction and are
 * reconstructed from the types used in this file */
#include <atomic>
#include <fstream>
#include <iostream>
#include <map>
#include <mutex>
#include <string>
#include <vector>

#include "level_zero/zet_api.h"
#include "GPUMetricInterface.h"

/* collection status */
#define COLLECTION_IDLE       0
#define COLLECTION_INIT       1
#define COLLECTION_CONFIGED   2
#define COLLECTION_ENABLED    3
#define COLLECTION_DISABLED   4
#define COLLECTION_COMPLETED  5

using namespace std;

class GPUMetricHandler;

typedef struct TMetricNode_S {
    uint32_t                       code;            // 0 means invalid
    uint32_t                       metricId;
    uint32_t                       metricGroupId;
    uint32_t                       metricDomainId;
    int                            metricType;
    zet_metric_properties_t        props;
    zet_metric_handle_t            handle;
} TMetricNode;

typedef struct TMetricGroupNode_S {
    uint32_t                       code;            // 0 means invalid
    uint32_t                       numMetrics;
    TMetricNode                   *metricList;
    int                           *opList;          // list of metric operations
    zet_metric_group_properties_t  props;
    zet_metric_group_handle_t      handle;
} TMetricGroupNode;

typedef struct TMetricGroupInfo_S {
    uint32_t                       numMetricGroups;
    uint32_t                       numMetrics;
    uint32_t                       maxMetricsPerGroup;
    uint32_t                       domainId;
    TMetricGroupNode              *metricGroupList;
} TMetricGroupInfo;

typedef struct QueryData_S {
    string                         kernName;
    zet_metric_query_handle_t      metricQuery;
    ze_event_handle_t              event;
} QueryData;

/* template arguments below are reconstructed from how the fields are used */
typedef struct QueryState_S {
    atomic<uint32_t>                     kernelId{0};
    std::mutex                           lock;
    std::map<ze_kernel_handle_t, string> nameMap;
    zet_metric_query_pool_handle_t       queryPool;
    ze_event_pool_handle_t               eventPool;
    GPUMetricHandler                    *handle;
    vector<QueryData>                    queryList;
} QueryState;

typedef struct InstanceData {
    uint32_t                       kernelId;
    QueryState                    *queryState;
    zet_metric_query_handle_t      metricQuery;
} InstanceData;

class GPUMetricHandler
{
public:
    static int InitMetricDevices(DeviceInfo **deviceInfoList, uint32_t *numDeviceInfo,
                                 uint32_t *totalDevices);
    static GPUMetricHandler *GetInstance(uint32_t driverId, uint32_t deviceId,
                                 uint32_t subdeviceId);
    ~GPUMetricHandler();
    void DestroyMetricDevice();
    int  EnableMetricGroup(uint32_t metricGroupCode, uint32_t mtype, int *status);
    int  EnableMetricGroup(const char *metricGroupName, uint32_t mtype, int *status);
    int  EnableTimeBasedStream(uint32_t timePeriod, uint32_t numReports);
    int  EnableEventBasedQuery();
    void DisableMetricGroup();
    int  GetMetricInfo(int type, MetricInfo *data);
    int  GetMetricInfo(const char *name, int type, MetricInfo *data);
    int  GetMetricCode(const char *mGroupName, const char *metricName, uint32_t mtype,
                       uint32_t *mGroupCode, uint32_t *metricCode);
    MetricData *GetMetricData(uint32_t mode, uint32_t *numReports);
    int  SetControl(uint32_t mode);
    uint32_t GetCurGroupCode();

private:
    GPUMetricHandler(uint32_t driverid, uint32_t deviceid, uint32_t subdeviceid);
    GPUMetricHandler(GPUMetricHandler const&);
    void operator=(GPUMetricHandler const&);
    static int InitMetricGroups(ze_device_handle_t device, TMetricGroupInfo *mgroups);
    string   GetDeviceName(ze_device_handle_t device);
    uint8_t *ReadStreamData(size_t *rawDataSize);
    uint8_t *ReadQueryData(QueryData &data, size_t *rawDataSize, ze_result_t *retStatus);
    void GenerateMetricData(uint8_t *rawData, size_t rawDataSize, uint32_t mode);
    void ProcessMetricDataSet(uint32_t dataSetId, zet_typed_value_t *typedDataList,
                              uint32_t startIdx, uint32_t dataSize,
                              uint32_t metricCount, uint32_t mode);

private:
    // Fields  (element types of the three static vectors were lost in
    // extraction; the reconstructions below are best-effort)
    static GPUMetricHandler              *m_handlerInstance;
    static vector<ze_driver_handle_t>     driverList;
    static vector<ze_device_handle_t>     m_deviceList;
    static vector<GPUMetricHandler *>     handlerList;

    int                                   m_driverId;
    int                                   m_deviceId;
    int                                   m_subdeviceId;
    std::mutex                            m_lock;
    string                                m_dataDumpFileName;
    string                                m_dataDumpFilePath;
    fstream                               m_dataDump;
    // current state
    ze_driver_handle_t                    m_driver;
    ze_context_handle_t                   m_context;
    ze_device_handle_t                    m_device;
    TMetricGroupInfo                     *m_groupInfo;
    uint32_t                              m_domainId;
    int                                   m_groupId;
    uint32_t                              m_groupType;
    QueryState                           *m_queryState;
    ze_event_pool_handle_t                m_eventPool;
    ze_event_handle_t                     m_event;
    zet_metric_streamer_handle_t          m_metricStreamer;
    zet_metric_query_pool_handle_t        m_queryPool;
    zet_tracer_exp_handle_t               m_tracer;
    volatile int                          m_status;
    uint32_t                              m_numDevices;
    uint32_t                              m_numDataSet;
    MetricData                           *m_reportData;
    uint32_t                             *m_reportCount;
};

/* this struct maintains the information for a metric device */
typedef struct TMetricDevice_S {
    uint32_t             driverId;
    uint32_t             deviceId;
    uint32_t             metricHandlerIndex;
    uint32_t             numSubdevices;
    TMetricGroupInfo    *groupList;
    char                *devName;
    ze_driver_handle_t  *ze_driver;
    ze_device_handle_t  *ze_device;
} TMetricDevice;

/* This struct maintains the information about a device handler.
 * It is the base used to access the GPUMetricHandler methods.
 */
typedef struct TMetricDeviceHandler_S {
    uint32_t            driverId;
    uint32_t            deviceId;
    uint32_t            subdeviceId;
    uint32_t            deviceListIndex;
    GPUMetricHandler   *handler;
} TMetricDeviceHandler;

#endif

papi-papi-7-2-0-t/src/components/intel_gpu/internal/inc/GPUMetricInterface.h

/*
 * GPUMetricInterface.h: Intel® Graphics Component for PAPI
 *
 * Copyright (c) 2020 Intel Corp. All rights reserved
 * Contributed by Peinan Zhang
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */

#ifndef _GPUMETRICINTERFACE_H
#define _GPUMETRICINTERFACE_H

/* header names reconstructed; the file uses uint32_t and FILE */
#include <stdint.h>
#include <stdio.h>

#pragma once

#if defined(__cplusplus)
extern "C" {
#endif

#ifndef MAX_STR_LEN
#define MAX_STR_LEN  256
#endif

/* metric group type */
#define TIME_BASED    0
#define EVENT_BASED   1

/* metric type */
#define M_ACCUMULATE  0x0
#define M_AVERAGE     0x1
#define M_RAW         0x2

typedef int  DEVICE_HANDLE;

typedef char NAMESTR[MAX_STR_LEN];

typedef struct MetricInfo_S MetricInfo;
struct MetricInfo_S {
    uint32_t    code;
    uint32_t    dataType;
    uint32_t    metricType;       // 0 accumulate, 1 average, 2 raw (static)
    char        name[MAX_STR_LEN];
    char        desc[MAX_STR_LEN];
    int         numEntries;
    MetricInfo *infoEntries;
};

/* define GPU device info */
typedef struct DeviceInfo_S {
    DEVICE_HANDLE  handle;
    uint32_t       driverId;
    uint32_t       deviceId;
    uint32_t       subdeviceId;
    uint32_t       numSubdevices;
    uint32_t       index;
    char           name[MAX_STR_LEN];
    MetricInfo    *metricGroups;
} DeviceInfo;

typedef struct DataEntry_S {
    uint32_t  code;
    uint32_t  type;
    union {
        long long ival;
        double    fpval;
    } value;
} DataEntry;

/* Metric data for each device.
 * Data can contain multiple sets, one set per subdevice.
 */
typedef struct MetricData_S {
    int        id;
    uint32_t   grpCode;
    uint32_t   numDataSets;       // num of data sets per device
    uint32_t   metricCount;       // metrics per data set
    uint32_t   numEntries;        // total data entries allocated
    int       *dataSetStartIdx;   // start index per each data set, -1 means no data available
    DataEntry *dataEntries;
} MetricData;

typedef struct MetricNode_S {
    char name[MAX_STR_LEN];
    int  code;
} MetricNode;

/* index code for a handle
 *    resv(4) + drv(4) + dev(4) + subdev(4) + index(16)
 */
#define DRV_BITS    4
#define DEV_BITS    4
#define SDEV_BITS   4
#define IDX_BITS   16

#define DMASK       0xf
#define DCODE_MASK  0xfff
#define IDX_MASK    0xffff

/* the middle shift term of CreateDeviceCode and the GetDeviceCode macro
 * were lost in extraction and are reconstructed from the Get* macros below */
#define CreateDeviceCode(drv, dev, sdev)                   \
            ((((drv)&DMASK)<<(SDEV_BITS+DEV_BITS)) |       \
             (((dev)&DMASK)<<SDEV_BITS) | ((sdev)&DMASK))

#define GetDeviceCode(handle)  (((handle) >> IDX_BITS) & DCODE_MASK)
#define GetIdx(handle)         ((handle) & IDX_MASK)
#define GetSDev(handle)        (((handle) >> IDX_BITS) & DMASK)
#define GetDev(handle)         (((handle) >> (IDX_BITS+SDEV_BITS)) & DMASK)
#define GetDrv(handle)         (((handle) >> (IDX_BITS+SDEV_BITS+DEV_BITS)) & DMASK)

#define IsDeviceHandle(devcode, handle)  ((devcode) == (((handle)>>IDX_BITS)&DCODE_MASK))

/*
 * metric code: group(8) + metrics(8)
 */
#define METRIC_BITS        8
#define METRIC_GROUP_MASK  0xfff00
#define METRIC_MASK        0x00ff

/* shift/GetGroupIdx reconstructed (extraction damage), consistent with GetMetricIdx */
#define CreateGroupCode(mGroupId)  (((mGroupId)+1) << METRIC_BITS)
#define GetGroupIdx(code)          ((code) >> METRIC_BITS)
#define GetMetricIdx(code)         ((code) & METRIC_MASK)

#define MAX_METRICS      128
#define MAX_NUM_REPORTS  20

/* collection modes */
#define METRIC_SAMPLE   0x1
#define METRIC_SUMMARY  0x2
#define METRIC_RESET    0x4

MetricInfo *GPUAllocMetricInfo(uint32_t count);
void        GPUFreeMetricInfo(MetricInfo *info, uint32_t count);

MetricData *GPUAllocMetricData(uint32_t count, uint32_t numSets, uint32_t numMetrics);
void        GPUFreeMetricData(MetricData *data, uint32_t count);

extern void strncpy_se(char *dest, size_t destSize, char *src, size_t count);

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUDetectDevice(DEVICE_HANDLE **handle, uint32_t *numDevice);
 *
 * @brief Detect the devices which have the performance monitoring feature available.
 *
 * @param OUT handle    - an array of handles, one per instance of the device
 * @param OUT numDevice - number of detected devices
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUDetectDevice(DEVICE_HANDLE **handle, uint32_t *numDevice);

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUGetDeviceInfo(DEVICE_HANDLE handle, MetricInfo *data);
 *
 * @brief Get the device properties, which are mainly in <name, value> format
 *
 * @param IN  handle - handle to the selected device
 * @param OUT data   - the property data
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUGetDeviceInfo(DEVICE_HANDLE handle, MetricInfo *data);

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUPrintDeviceInfo(DEVICE_HANDLE handle, FILE *stream);
 *
 * @brief Print the device properties, which are mainly in <name, value> format
 *
 * @param IN handle - handle to the selected device
 * @param IN stream - IO stream to print
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUPrintDeviceInfo(DEVICE_HANDLE handle, FILE *stream);

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUGetMetricGroups(DEVICE_HANDLE handle, uint32_t mtype, MetricInfo *data);
 *
 * @brief list available metric groups for the selected type
 *
 * @param IN  handle - handle to the selected device
 * @param IN  mtype  - metric group type
 * @param OUT data   - metric data
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUGetMetricGroups(DEVICE_HANDLE handle, uint32_t mtype, MetricInfo *data);

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUGetMetricList(DEVICE_HANDLE handle,
 *             char *groupName, uint32_t mtype, MetricInfo *data);
 *
 * @brief list available metrics in the named group.
 *        If name is "", list all available metrics in all groups
 *
 * @param IN  handle    - handle to the selected device
 * @param IN  groupName - metric group name. "" means all groups.
 * @param IN  mtype     - metric type
 * @param OUT data      - metric data
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUGetMetricList(DEVICE_HANDLE handle, char *groupName, uint32_t mtype, MetricInfo *data);

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUEnableMetricGroup(DEVICE_HANDLE handle, char *metricGroupName,
 *             uint32_t metricGroupCode, uint32_t mtype,
 *             uint32_t period, uint32_t numReports)
 *
 * @brief enable the named metric group for collection.
 *
 * @param IN handle          - handle to the selected device
 * @param IN metricGroupName - metric group name
 * @param IN metricGroupCode - metric group code
 * @param IN mtype           - metric type
 * @param IN period          - collection timer period
 * @param IN numReports      - number of reports. Default: collect all available reports
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUEnableMetricGroup(DEVICE_HANDLE handle, char *metricGroupName,
        uint32_t metricGroupCode, uint32_t mtype, uint32_t period, uint32_t numReports);

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUEnableMetrics(DEVICE_HANDLE handle, char **metricNameList,
 *             uint32_t numMetrics, uint32_t mtype,
 *             uint32_t period, uint32_t numReports)
 *
 * @brief enable the named metrics of a metric group for collection.
 *
 * @param IN handle         - handle to the selected device
 * @param IN metricNameList - a list of metric names to be collected
 * @param IN numMetrics     - number of metrics to be collected
 * @param IN mtype          - metric type
 * @param IN period         - collection timer period
 * @param IN numReports     - number of reports. Default: collect all available reports
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUEnableMetrics(DEVICE_HANDLE handle, char **metricNameList,
        uint32_t numMetrics, uint32_t mtype, uint32_t period, uint32_t numReports);

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUDisableMetricGroup(DEVICE_HANDLE handle, uint32_t mtype);
 *
 * @brief disable current metric collection
 *
 * @param IN handle - handle to the selected device
 * @param IN mtype  - a metric group type
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUDisableMetricGroup(DEVICE_HANDLE handle, uint32_t mtype);

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUStart(DEVICE_HANDLE handle);
 *
 * @brief start collection
 *
 * @param IN handle - handle to the selected device
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUStart(DEVICE_HANDLE handle);

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUStop(DEVICE_HANDLE handle);
 *
 * @brief stop collection
 *
 * @param IN handle - handle to the selected device
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUStop(DEVICE_HANDLE handle);

/* ------------------------------------------------------------------------- */
/*!
 * @fn MetricData *GPUReadMetricData(DEVICE_HANDLE handle, uint32_t mode,
 *             uint32_t *reportCounts);
 *
 * @brief read metric data
 *
 * @param IN  handle       - handle to the selected device
 * @param IN  mode         - collection mode (sample, summary, reset)
 * @param OUT reportCounts - returned metric data array size
 *
 * @return data - returned metric data array
 */
MetricData *GPUReadMetricData(DEVICE_HANDLE handle, uint32_t mode, uint32_t *reportCounts);

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUSetMetricControl(DEVICE_HANDLE handle, uint32_t mode);
 *
 * @brief set control for metric data collection
 *
 * @param IN handle - handle to the selected device
 * @param IN mode   - collection mode (sample, summary, reset)
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUSetMetricControl(DEVICE_HANDLE handle, uint32_t mode);

/* ------------------------------------------------------------------------- */
/*!
 * @fn void GPUFreeDevice(DEVICE_HANDLE handle);
 *
 * @brief free the resource related to this device handle
 *
 * @param IN handle - handle to the selected device
 *
 * @return 0 -- success, otherwise, error code
 */
//void GPUFreeDevice();
void GPUFreeDevice(DEVICE_HANDLE handle);

#if defined(__cplusplus)
}
#endif

#endif

papi-papi-7-2-0-t/src/components/intel_gpu/internal/src/GPUMetricHandler.cpp

/*
 * GPUMetricHandler.cpp: Intel® Graphics Component for PAPI
 *
 * Copyright (c) 2020 Intel Corp. All rights reserved
 * Contributed by Peinan Zhang
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */

/* the original header names were lost in extraction; the set below is
 * reconstructed from what this file actually uses */
#include <iostream>
#include <fstream>
#include <map>
#include <string>
#include <vector>

#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include "GPUMetricHandler.h"

#define DebugPrintError(format, args...)  fprintf(stderr, format, ## args)

//#define _DEBUG 1

#if defined(_DEBUG)
#define DebugPrint(format, args...)       fprintf(stderr, format, ## args)
#else
#define DebugPrint(format, args...)       {do {} while(0); }
#endif

#define MAX_REPORTS   32768
#define MAX_KERNERLS  1024

#define CHECK_N_RETURN_STATUS(status, retVal)  {if (status) return retVal; }
#define CHECK_N_RETURN(status)                 {if (status) return;}

#define ENABLE_RAW_LOG  "ENABLE_RAW_LOG"

/************************************************************************/
/* Helper function                                                      */
/************************************************************************/

void (*_dl_non_dynamic_init) (void) __attribute__ ((weak));

ze_pfnInit_t                               zeInitFunc;
ze_pfnDriverGet_t                          zeDriverGetFunc;
ze_pfnDriverGetProperties_t                zeDriverGetPropertiesFunc;
ze_pfnDeviceGet_t                          zeDeviceGetFunc;
ze_pfnDeviceGetSubDevices_t                zeDeviceGetSubDevicesFunc;
ze_pfnDeviceGetProperties_t                zeDeviceGetPropertiesFunc;
ze_pfnEventPoolCreate_t                    zeEventPoolCreateFunc;
ze_pfnEventPoolDestroy_t                   zeEventPoolDestroyFunc;
ze_pfnContextCreate_t                      zeContextCreateFunc;
ze_pfnEventCreate_t                        zeEventCreateFunc;
ze_pfnEventDestroy_t                       zeEventDestroyFunc;
ze_pfnEventHostSynchronize_t               zeEventHostSynchronizeFunc;

zet_pfnMetricGroupGet_t                    zetMetricGroupGetFunc;
zet_pfnMetricGroupGetProperties_t          zetMetricGroupGetPropertiesFunc;
zet_pfnMetricGroupCalculateMetricValues_t  zetMetricGroupCalculateMetricValuesFunc;
zet_pfnMetricGroupCalculateMultipleMetricValuesExp_t
                                           zetMetricGroupCalculateMultipleMetricValuesExpFunc;
zet_pfnMetricGet_t                         zetMetricGetFunc;
zet_pfnMetricGetProperties_t               zetMetricGetPropertiesFunc;
zet_pfnContextActivateMetricGroups_t       zetContextActivateMetricGroupsFunc;
zet_pfnMetricStreamerOpen_t                zetMetricStreamerOpenFunc;
zet_pfnMetricStreamerClose_t               zetMetricStreamerCloseFunc;
zet_pfnMetricStreamerReadData_t            zetMetricStreamerReadDataFunc;
zet_pfnMetricQueryPoolCreate_t             zetMetricQueryPoolCreateFunc;
zet_pfnMetricQueryPoolDestroy_t            zetMetricQueryPoolDestroyFunc;
zet_pfnMetricQueryCreate_t                 zetMetricQueryCreateFunc;
zet_pfnMetricQueryDestroy_t                zetMetricQueryDestroyFunc;
zet_pfnMetricQueryReset_t                  zetMetricQueryResetFunc;
zet_pfnMetricQueryGetData_t                zetMetricQueryGetDataFunc;
zet_pfnTracerExpCreate_t                   zetTracerExpCreateFunc;
zet_pfnTracerExpDestroy_t                  zetTracerExpDestroyFunc;
zet_pfnTracerExpSetPrologues_t             zetTracerExpSetProloguesFunc;
zet_pfnTracerExpSetEpilogues_t             zetTracerExpSetEpiloguesFunc;
zet_pfnTracerExpSetEnabled_t               zetTracerExpSetEnabledFunc;
zet_pfnCommandListAppendMetricQueryBegin_t zetCommandListAppendMetricQueryBeginFunc;
zet_pfnCommandListAppendMetricQueryEnd_t   zetCommandListAppendMetricQueryEndFunc;

#define DLL_SYM_CHECK(handle, name, type)              \
    do {                                               \
        name##Func = (type) dlsym(handle, #name);      \
        if (dlerror() != nullptr) {                    \
            DebugPrintError("failed: %s\n", #name);    \
            return 1;                                  \
        }                                              \
    } while (0)

static int functionInit(void *dllHandle)
{
    int ret = 0;
    DLL_SYM_CHECK(dllHandle, zeInit, ze_pfnInit_t);
    DLL_SYM_CHECK(dllHandle, zeDriverGet, ze_pfnDriverGet_t);
    DLL_SYM_CHECK(dllHandle, zeDriverGetProperties, ze_pfnDriverGetProperties_t);
    DLL_SYM_CHECK(dllHandle, zeDeviceGet, ze_pfnDeviceGet_t);
    DLL_SYM_CHECK(dllHandle, zeDeviceGetSubDevices, ze_pfnDeviceGetSubDevices_t);
    DLL_SYM_CHECK(dllHandle, zeDeviceGetProperties, ze_pfnDeviceGetProperties_t);
    DLL_SYM_CHECK(dllHandle, zetMetricGroupGet, zet_pfnMetricGroupGet_t);
    DLL_SYM_CHECK(dllHandle, zetMetricGroupGetProperties, zet_pfnMetricGroupGetProperties_t);
    DLL_SYM_CHECK(dllHandle, zetMetricGet, zet_pfnMetricGet_t);
    DLL_SYM_CHECK(dllHandle, zetMetricGetProperties, zet_pfnMetricGetProperties_t);
    DLL_SYM_CHECK(dllHandle, zetContextActivateMetricGroups,
                  zet_pfnContextActivateMetricGroups_t);
    DLL_SYM_CHECK(dllHandle, zeEventPoolCreate, ze_pfnEventPoolCreate_t);
    DLL_SYM_CHECK(dllHandle, zeEventPoolDestroy, ze_pfnEventPoolDestroy_t);
    DLL_SYM_CHECK(dllHandle, zeContextCreate, ze_pfnContextCreate_t);
    DLL_SYM_CHECK(dllHandle, zeEventCreate, ze_pfnEventCreate_t);
    DLL_SYM_CHECK(dllHandle, zeEventDestroy, ze_pfnEventDestroy_t);
    DLL_SYM_CHECK(dllHandle, zeEventHostSynchronize, ze_pfnEventHostSynchronize_t);
    DLL_SYM_CHECK(dllHandle, zetMetricStreamerOpen, zet_pfnMetricStreamerOpen_t);
    DLL_SYM_CHECK(dllHandle, zetMetricStreamerClose, zet_pfnMetricStreamerClose_t);
    DLL_SYM_CHECK(dllHandle, zetMetricStreamerReadData, zet_pfnMetricStreamerReadData_t);
    DLL_SYM_CHECK(dllHandle, zetMetricQueryPoolCreate, zet_pfnMetricQueryPoolCreate_t);
    DLL_SYM_CHECK(dllHandle, zetMetricQueryPoolDestroy, zet_pfnMetricQueryPoolDestroy_t);
    DLL_SYM_CHECK(dllHandle, zetMetricQueryCreate, zet_pfnMetricQueryCreate_t);
    DLL_SYM_CHECK(dllHandle, zetMetricQueryDestroy, zet_pfnMetricQueryDestroy_t);
    DLL_SYM_CHECK(dllHandle, zetMetricQueryGetData, zet_pfnMetricQueryGetData_t);
    DLL_SYM_CHECK(dllHandle, zetMetricGroupCalculateMetricValues,
                  zet_pfnMetricGroupCalculateMetricValues_t);
    DLL_SYM_CHECK(dllHandle, zetMetricGroupCalculateMultipleMetricValuesExp,
                  zet_pfnMetricGroupCalculateMultipleMetricValuesExp_t);
    DLL_SYM_CHECK(dllHandle, zetTracerExpCreate, zet_pfnTracerExpCreate_t);
    DLL_SYM_CHECK(dllHandle, zetTracerExpDestroy, zet_pfnTracerExpDestroy_t);
    DLL_SYM_CHECK(dllHandle, zetTracerExpSetPrologues, zet_pfnTracerExpSetPrologues_t);
    DLL_SYM_CHECK(dllHandle, zetTracerExpSetEpilogues, zet_pfnTracerExpSetEpilogues_t);
    DLL_SYM_CHECK(dllHandle, zetTracerExpSetEnabled, zet_pfnTracerExpSetEnabled_t);
    DLL_SYM_CHECK(dllHandle, zetCommandListAppendMetricQueryBegin,
                  zet_pfnCommandListAppendMetricQueryBegin_t);
    DLL_SYM_CHECK(dllHandle, zetCommandListAppendMetricQueryEnd,
                  zet_pfnCommandListAppendMetricQueryEnd_t);
    return ret;
}

/*------------------------------------------------------------------------------*/
/*!
 * @fn static void kernelCreateCB(ze_kernel_create_params_t *params,
 *                 ze_result_t result, void *pUserData, void **instanceData)
 *
 * @brief callback function called to add a kernel into the name mapping.
 *        It is called when exiting kernel creation.
 *
 * @param IN    params       -- kernel parameters
 * @param IN    result       -- kernel launch result
 * @param INOUT pUserData    -- pointer to the location which stores global data
 * @param INOUT instanceData -- pointer to the location which stores the data for this query
 *
 * @return
 */
static void kernelCreateCB(ze_kernel_create_params_t *params,
                ze_result_t result, void *pUserData, void **instanceData)
{
    (void)instanceData;
    if (result != ZE_RESULT_SUCCESS) {
        return;
    }
    QueryState *queryState = reinterpret_cast<QueryState *>(pUserData);
    /* null check moved before first use (was after the map insertion) */
    CHECK_N_RETURN(queryState==nullptr);
    queryState->lock.lock();
    ze_kernel_handle_t kernel = **(params->pphKernel);
    queryState->nameMap[kernel] = (*(params->pdesc))->pKernelName;
    queryState->lock.unlock();
    return;
}

/*------------------------------------------------------------------------------*/
/*!
 * @fn static void kernelDestroyCB(ze_kernel_destroy_params_t *params,
 *                 ze_result_t result, void *pUserData, void **instanceData)
 *
 * @brief callback function to remove a kernel from the maintained mapping.
 *        It is called when exiting kernel destroy.
 *
 * @param IN    params       -- kernel parameters
 * @param IN    result       -- kernel launch result
 * @param INOUT pUserData    -- pointer to the location which stores global data
 * @param INOUT instanceData -- pointer to the location which stores the data for this query
 *
 * @return
 */
static void kernelDestroyCB(ze_kernel_destroy_params_t *params,
                ze_result_t result, void *pUserData, void **instanceData)
{
    (void)instanceData;
    if (result != ZE_RESULT_SUCCESS) {
        return;
    }
    QueryState *queryState = reinterpret_cast<QueryState *>(pUserData);
    CHECK_N_RETURN(queryState==nullptr);
    queryState->lock.lock();
    ze_kernel_handle_t kernel = *(params->phKernel);
    if (queryState->nameMap.count(kernel) != 1) {
        queryState->lock.unlock();
        return;
    }
    queryState->nameMap.erase(kernel);
    queryState->lock.unlock();
    return;
}

/*------------------------------------------------------------------------------*/
/*!
 * @fn static void metricQueryBeginCB(
 *                 ze_command_list_append_launch_kernel_params_t *params,
 *                 ze_result_t result, void *pUserData, void **instanceData)
 *
 * @brief callback to append a "query begin" request into the command list
 *        before a kernel launches.
 *
 * @param IN    params       -- kernel parameters
 * @param IN    result       -- kernel launch result
 * @param INOUT pUserData    -- pointer to the location which stores global data
 * @param OUT   instanceData -- pointer to the location which stores the data for this query
 *
 * @return
 */
static void metricQueryBeginCB(
    ze_command_list_append_launch_kernel_params_t* params,
    ze_result_t result, void* pUserData, void** instanceData)
{
    (void)result;
    (void)instanceData;
    QueryState *queryState = reinterpret_cast<QueryState *>(pUserData);
    CHECK_N_RETURN(queryState==nullptr);
    // assign each kernel an id for reference.
    uint32_t kernId = queryState->kernelId.fetch_add(1, std::memory_order_acq_rel);
    if (kernId >= MAX_KERNERLS) {
        *instanceData = nullptr;
        return;
    }
    ze_result_t status = ZE_RESULT_SUCCESS;
    zet_metric_query_handle_t metricQuery = nullptr;
    status = zetMetricQueryCreateFunc(queryState->queryPool, kernId, &metricQuery);
    CHECK_N_RETURN(status!=ZE_RESULT_SUCCESS);
    ze_command_list_handle_t commandList = *(params->phCommandList);
    CHECK_N_RETURN(commandList== nullptr);
    status = zetCommandListAppendMetricQueryBeginFunc(commandList, metricQuery);
    CHECK_N_RETURN(status!=ZE_RESULT_SUCCESS);
    // maintain the query for the current kernel
    InstanceData* data = new InstanceData;
    CHECK_N_RETURN(data==nullptr);
    data->kernelId = kernId;
    data->metricQuery = metricQuery;
    data->queryState = queryState;
    *instanceData = reinterpret_cast<void *>(data);
    return;
}

/*------------------------------------------------------------------------------*/
/*!
 * @fn static void metricQueryEndCB(
 *                 ze_command_list_append_launch_kernel_params_t *params,
 *                 ze_result_t result, void *pUserData, void **instanceData)
 *
 * @brief callback to append a "query end" request into the command list
 *        after a kernel completes.
 *
 * @param IN    params       -- kernel parameters
 * @param IN    result       -- kernel launch result
 * @param INOUT pUserData    -- pointer to the location which stores global data
 * @param INOUT instanceData -- pointer to the location which stores the data for this query
 *
 */
static void metricQueryEndCB(
    ze_command_list_append_launch_kernel_params_t* params,
    ze_result_t result, void* pUserData, void** instanceData)
{
    (void)result;
    InstanceData* data = reinterpret_cast<InstanceData *>(*instanceData);
    CHECK_N_RETURN(data==nullptr);
    QueryState *queryState = reinterpret_cast<QueryState *>(pUserData);
    CHECK_N_RETURN(queryState==nullptr);
    ze_command_list_handle_t commandList = *(params->phCommandList);
    CHECK_N_RETURN(commandList==nullptr);
    ze_event_desc_t eventDesc;
    eventDesc.stype  = ZE_STRUCTURE_TYPE_EVENT_DESC;
    eventDesc.index  = data->kernelId;
    eventDesc.signal = ZE_EVENT_SCOPE_FLAG_HOST;
    eventDesc.wait   = ZE_EVENT_SCOPE_FLAG_HOST;
    ze_event_handle_t event = nullptr;
    ze_result_t status = ZE_RESULT_SUCCESS;
    status = zeEventCreateFunc(queryState->eventPool, &eventDesc, &event);
    CHECK_N_RETURN(status!=ZE_RESULT_SUCCESS);
    status = zetCommandListAppendMetricQueryEndFunc(commandList, data->metricQuery,
                                                    event, 0, nullptr);
    CHECK_N_RETURN(status!=ZE_RESULT_SUCCESS);
    queryState->lock.lock();
    ze_kernel_handle_t kernel = *(params->phKernel);
    if (queryState->nameMap.count(kernel) == 1) {
        QueryData queryData = { queryState->nameMap[kernel], data->metricQuery, event };
        queryState->queryList.push_back(queryData);
    }
    queryState->lock.unlock();
    return;
}

/*------------------------------------------------------------------------------*/
/*!
 * @fn static int getMetricType(char *desc, zet_metric_type_t metric_type)
 *
 * @brief Find the metric type for report (accumulate, average or raw)
 *        based on metric type and description.
 *
 * @param IN desc        -- metric description
 * @param IN metric_type -- metric type
 *
 * @return metric report type <0: accumulate, 1: average, 2: raw (static)>
 */
static int getMetricType(char *desc, zet_metric_type_t metric_type)
{
    if (metric_type == ZET_METRIC_TYPE_THROUGHPUT) {
        return M_ACCUMULATE;
    }
    if (metric_type == ZET_METRIC_TYPE_RATIO) {
        return M_AVERAGE;
    }
    char *ptr = nullptr;
    if ((metric_type == ZET_METRIC_TYPE_EVENT) ||
        (metric_type == ZET_METRIC_TYPE_DURATION)) {
        if ((ptr = strstr(desc, "percentage")) || (ptr = strstr(desc, "Percentage")) ||
            (ptr = strstr(desc, "Average"))    || (ptr = strstr(desc, "average"))) {
            return M_AVERAGE;
        }
        return M_ACCUMULATE;
    }
    return M_RAW;
}

/*------------------------------------------------------------------------------*/
/*!
 * @fn static void typedValue2Value(zet_typed_value_t data, std::fstream *foutstream,
 *                 int outflag, int *dtype, long long *iVal,
 *                 double *fpVal, int isLast)
 *
 * @brief Convert a typed value to long long or double.
 *        If foutstream is open, log the data.
 *
 * @param IN  data       - typed data
 * @param IN  foutstream - output stream to dump raw data (used for debug)
 * @param IN  outflag    - 1 to enable output to stdout, 0 to disable output to stdout
 * @param OUT dtype      - data type, 0: long long, 1: double
 * @param OUT iVal       - converted long long integer value
 * @param OUT fpVal      - converted double value
 * @param IN  isLast     - true if the data is the last value in a metric group
 *
 */
static void typedValue2Value(zet_typed_value_t data, std::fstream *foutstream, int outflag,
                             int *dtype, long long *iVal, double *fpVal, int isLast)
{
    *dtype = 0;
    static int count = 0;
    switch( data.type ) {
        case ZET_VALUE_TYPE_UINT32:
        case ZET_VALUE_TYPE_BOOL8:
            if (foutstream->is_open()) {
                *foutstream << data.value.ui32 << ",";
            }
            if (outflag) {
                cout << data.value.ui32 << ",";
            }
            *iVal = (long long)data.value.ui32;
            break;
        case ZET_VALUE_TYPE_UINT64:
            if (foutstream->is_open()) {
                *foutstream << data.value.ui64 << ",";
            }
            if (outflag) {
                cout << data.value.ui64 << ",";
            }
            *iVal = (long long)data.value.ui64;
            break;
        case ZET_VALUE_TYPE_FLOAT32:
            if (foutstream->is_open()) {
                *foutstream << data.value.fp32 << ",";
            }
            if (outflag) {
                cout << data.value.fp32 << ",";
            }
            *dtype = 1;
            *fpVal = (double)data.value.fp32;
            break;
        case ZET_VALUE_TYPE_FLOAT64:
            if (foutstream->is_open()) {
                *foutstream << data.value.fp64 << ",";
            }
            if (outflag) {
                cout << data.value.fp64 << ",";
            }
            *dtype = 1;
            *fpVal = (double)data.value.fp64;
            break;
        default:
            break;
    }
    count++;
    if (isLast) {
        if (foutstream->is_open()) {
            *foutstream << endl;
        }
        if (outflag) {
            cout << endl;
        }
    }
}

/***************************************************************************/
/* GPUMetricHandler class                                                  */
/***************************************************************************/

using namespace std;

GPUMetricHandler *GPUMetricHandler::m_handlerInstance = nullptr;

/* template arguments reconstructed from how these globals are used below */
static vector<DeviceInfo>                 g_deviceInfo;
static map<uint32_t, GPUMetricHandler *>  g_metricHandlerMap;
static uint32_t                           g_stdout = 0;
static std::mutex                         g_lock;
/*--------------------------------------------------------------------------*/ /*! * GPUMetricHandler constructor */ GPUMetricHandler::GPUMetricHandler(uint32_t driverId, uint32_t deviceId, uint32_t subdeviceId) : m_driverId(driverId), m_deviceId(deviceId), m_subdeviceId(subdeviceId) { m_driver = nullptr; m_device = nullptr; m_context = nullptr; m_domainId = 0; m_groupId = -1; m_dataDumpFileName = ""; m_dataDumpFilePath = ""; m_status = COLLECTION_IDLE; m_numDevices = 0; m_numDataSet = 0; m_reportData = nullptr; m_reportCount = nullptr; m_eventPool = nullptr; m_event = nullptr; m_metricStreamer = nullptr; m_tracer = nullptr; m_queryPool = nullptr; m_groupType = ZET_METRIC_GROUP_SAMPLING_TYPE_FLAG_TIME_BASED; } /*-------------------------------------------------------------------------------------------*/ /*! * ~GPUMetricHandler */ GPUMetricHandler::~GPUMetricHandler() { DestroyMetricDevice(); m_handlerInstance = nullptr; } /*-------------------------------------------------------------------------------------------*/ /*! 
 * fn GPUMetricHandler * GPUMetricHandler::GetInstance(uint32_t driverId,
 *                          uint32_t deviceId, uint32_t subdeviceId)
 *
 * @brief Get an instance of a GPUMetricHandler object
 *
 * @param IN driverId    - given driver id
 * @param IN deviceId    - given device id
 * @param IN subdeviceId - given subdevice id
 *
 * @return GPUMetricHandler object if such a device exists, otherwise nullptr
 */
GPUMetricHandler *
GPUMetricHandler::GetInstance(uint32_t driverId, uint32_t deviceId, uint32_t subdeviceId)
{
    GPUMetricHandler *handler = nullptr;
    if (!g_deviceInfo.size()) {
        DeviceInfo *info = nullptr;
        uint32_t numDevs = 0;
        uint32_t totalAvailDevs = 0;
        GPUMetricHandler::InitMetricDevices(&info, &numDevs, &totalAvailDevs);
    }
    uint32_t key = CreateDeviceCode(driverId, deviceId, subdeviceId);
    auto it = g_metricHandlerMap.find(key);
    if (it == g_metricHandlerMap.end()) {
        DebugPrintError("Device <%d, %d, %d> is not a valid metrics device\n",
                driverId, deviceId, subdeviceId);
    } else {
        handler = (GPUMetricHandler *)it->second;
    }
    return handler;
}

/*-------------------------------------------------------------------------------------------*/
/*!
 * fn int GPUMetricHandler::InitMetricDevices(DeviceInfo **deviceInfo, uint32_t *numDevices,
 *                                            uint32_t *totalDevices)
 *
 * @brief Discover and initialize GPU metric devices
 *
 * @param OUT deviceInfo   -- a list of DeviceInfo objects for devices
 * @param OUT numDevices   -- number of DeviceInfo objects for devices
 * @param OUT totalDevices -- total available devices, including root devices and subdevices
 *
 * @return Status. 0 for success, otherwise 1.
 */
int
GPUMetricHandler::InitMetricDevices(DeviceInfo **deviceInfo, uint32_t *numDevices,
                                    uint32_t *totalDevices)
{
    uint32_t driverCount = 0;
    int ret = 0;
    int retError = 1;
    ze_result_t status = ZE_RESULT_SUCCESS;

    *numDevices = 0;
    *totalDevices = 0;
    *deviceInfo = nullptr;
    g_lock.lock();
    if (g_deviceInfo.size()) {
        *numDevices = g_deviceInfo.size();
        *deviceInfo = g_deviceInfo.data();
        g_lock.unlock();
        return 0;
    }
    void *dlHandle = dlopen("libze_loader.so", RTLD_NOW | RTLD_GLOBAL);
    if (!dlHandle) {
        DebugPrintError("dlopen libze_loader.so failed\n");
        g_lock.unlock();
        return retError;
    }
    ret = functionInit(dlHandle);
    if (ret) {
        DebugPrintError("Failed in finding functions in libze_loader.so\n");
        g_lock.unlock();
        return ret;
    }
    status = zeInitFunc(ZE_INIT_FLAG_GPU_ONLY);
    if (status == ZE_RESULT_SUCCESS) {
        status = zeDriverGetFunc(&driverCount, nullptr);
    }
    if (status != ZE_RESULT_SUCCESS || driverCount == 0) {
        g_lock.unlock();
        return retError;
    }
    vector<ze_driver_handle_t> driverList(driverCount, nullptr);
    status = zeDriverGetFunc(&driverCount, driverList.data());
    CHECK_N_RETURN_STATUS((status!=ZE_RESULT_SUCCESS), 1);
    char *envStr = getenv(ENABLE_RAW_LOG);
    if (envStr) {
        g_stdout = atoi(envStr);
    }
    vector<DeviceInfo> deviceInfoList;
    uint32_t dcount = 0;
    for (uint32_t i = 0; i < driverCount; ++i) {
        uint32_t deviceCount = 0;
        status = zeDeviceGetFunc(driverList[i], &deviceCount, nullptr);
        if (status != ZE_RESULT_SUCCESS || deviceCount == 0) {
            continue;
        }
        vector<ze_device_handle_t> deviceList(deviceCount, nullptr);
        status = zeDeviceGetFunc(driverList[i], &deviceCount, deviceList.data());
        if (status != ZE_RESULT_SUCCESS || deviceCount == 0) {
            continue;
        }
        for (uint32_t j = 0; j < deviceCount; ++j) {
            ze_device_properties_t props;
            status = zeDeviceGetPropertiesFunc(deviceList[j], &props);
            CHECK_N_RETURN_STATUS((status!=ZE_RESULT_SUCCESS), 1);
            if ((props.type != ZE_DEVICE_TYPE_GPU) || (strstr(props.name, "Intel") == nullptr)) {
                continue;
            }
            uint32_t subdeviceCount = 0;
            status = zeDeviceGetSubDevicesFunc(deviceList[j], &subdeviceCount, nullptr);
            if (status != ZE_RESULT_SUCCESS) {
                continue;
            }
            DeviceInfo node;
            node.driverId = i+1;
            node.deviceId = j+1;
            node.subdeviceId = 0;
            node.handle = 0;
            node.index = 0;
            node.numSubdevices = subdeviceCount;
            strncpy_se(node.name, MAX_STR_LEN, props.name, strlen(props.name));
            g_deviceInfo.push_back(node);
            // create metric groups for this device
            GPUMetricHandler *handler = new GPUMetricHandler((i+1), (j+1), 0);
            handler->m_driver = driverList.at(i);
            handler->m_device = deviceList.at(j);
            handler->m_numDevices = (subdeviceCount)?subdeviceCount:1;
            TMetricGroupInfo *mgroups = nullptr;
            mgroups = new TMetricGroupInfo();
            ret = InitMetricGroups(deviceList.at(j), mgroups);
            if (ret) {
                continue;
            }
            handler->m_groupInfo = mgroups;
            uint32_t key = CreateDeviceCode((i+1), (j+1), 0);
            g_metricHandlerMap[key] = handler;
            dcount++;
            DebugPrint("detected device: <%d, %d>, [%s], subdevCount %d, m_dev %p\n",
                    i, j, props.name, subdeviceCount, handler->m_device);
            if (subdeviceCount) {
                vector<ze_device_handle_t> subdeviceList(subdeviceCount, nullptr);
                status = zeDeviceGetSubDevicesFunc(deviceList[j], &subdeviceCount,
                                                   subdeviceList.data());
                if (status != ZE_RESULT_SUCCESS) {
                    continue;
                }
                for (uint32_t k = 0; k < subdeviceCount; ++k) {
                    ze_device_properties_t subprops;
                    status = zeDeviceGetPropertiesFunc(subdeviceList[k], &subprops);
                    GPUMetricHandler *handler = new GPUMetricHandler((i+1), (j+1), (k+1));
                    handler->m_driver = driverList.at(i);
                    handler->m_device = subdeviceList.at(k);
                    handler->m_numDevices = 1;
                    uint32_t key = CreateDeviceCode((i+1), (j+1), (k+1));
                    DebugPrint("detected subdevice: <%d, %d, %d>, "
                               "key 0x%x, name %s, m_device %p\n",
                               i, j, k, key, props.name, handler->m_device);
                    g_metricHandlerMap[key] = handler;
                    handler->m_groupInfo = mgroups;
                    dcount++;
                }
            }
        }
    }
    *numDevices = g_deviceInfo.size();
    *deviceInfo = g_deviceInfo.data();
    *totalDevices = dcount;
#if defined(_DEBUG)
    for (uint32_t i=0; i<*numDevices; i++) {
        DeviceInfo node = g_deviceInfo.at(i);
        DebugPrint("dev[%d]: drv %d, dev %d subdev %d\n", i, node.driverId,
node.deviceId, node.subdeviceId); } #endif g_lock.unlock(); if (!*numDevices) { DebugPrintError("No Intel GPU metric device detected."); return 1; } return ret; } /*------------------------------------------------------------------------------*/ /*! * fn void GPUMetricHandler::DestroyMetricDevice() * * @brief Clean up metric device */ void GPUMetricHandler::DestroyMetricDevice() { DebugPrint("DestroyMetricDevice\n"); m_device = nullptr; m_groupInfo = nullptr; if (m_reportData) { for (uint32_t i=0; i maxMetricsPerGroup) { maxMetricsPerGroup = metricCount; } groupList[gid].code = CreateGroupCode(gid); groupList[gid].props = groupProps; groupList[gid].handle = groupHandles[gid]; groupList[gid].metricList = new TMetricNode[metricCount]; CHECK_N_RETURN_STATUS(( groupList[gid].metricList ==nullptr), 1); DebugPrint("group[%d]: name %s, desc %s\n", gid, groupProps.name, groupProps.description); zet_metric_handle_t* metricHandles = new zet_metric_handle_t[metricCount]; CHECK_N_RETURN_STATUS((metricHandles ==nullptr), 1); status = zetMetricGetFunc(groupHandles[gid], &metricCount, metricHandles); CHECK_N_RETURN_STATUS((status!=ZE_RESULT_SUCCESS), 1); for (uint32_t mid = 0; mid < metricCount; ++mid) { zet_metric_properties_t metricProps = {}; status = zetMetricGetPropertiesFunc(metricHandles[mid], &metricProps); CHECK_N_RETURN_STATUS((status!=ZE_RESULT_SUCCESS), 1); groupList[gid].metricList[mid].props = metricProps; groupList[gid].metricList[mid].handle = metricHandles[mid]; groupList[gid].metricList[mid].code = CreateMetricCode(gid,mid); groupList[gid].metricList[mid].metricGroupId = gid; groupList[gid].metricList[mid].metricId = mid; groupList[gid].metricList[mid].metricType = getMetricType(metricProps.description, metricProps.metricType); DebugPrint(" metric[%d][%d] name %s, desc %s, metric_type %d\n", gid, mid, metricProps.name, metricProps.description, metricProps.metricType); } numMetrics += metricCount; if (groupList[gid].props.samplingType & 
                ZET_METRIC_GROUP_SAMPLING_TYPE_FLAG_EVENT_BASED) {
            eventBasedCount += metricCount;
        } else {
            timeBasedCount += metricCount;
        }
        delete [] metricHandles;
    }
    DebugPrint("init metric groups return: groupCount %d, metric %d, TBS %d, EBS %d\n",
            groupCount, numMetrics, timeBasedCount, eventBasedCount);
    delete [] groupHandles;
    mgroups->metricGroupList = groupList;
    mgroups->numMetricGroups = groupCount;
    mgroups->numMetrics = numMetrics;
    mgroups->maxMetricsPerGroup = maxMetricsPerGroup;
    return ret;
}

/*------------------------------------------------------------------------------*/
/*!
 * @fn int GPUMetricHandler::GetMetricInfo(int type, MetricInfo *data)
 *
 * @brief Get available metric info
 *
 * @param IN    type - metric group type, 0 for time-based, 1 for query-based
 * @param INOUT data - pointer to the MetricInfo data that contains a list of metrics
 *
 * @return - 0 if success.
 */
int GPUMetricHandler::GetMetricInfo(int type, MetricInfo *data)
{
    uint32_t i = 0;
    int ret = 0;
    int retError = 1;
    uint32_t stype = 0;

    if (!m_device) {
        DebugPrintError("MetricsDevice not opened\n");
        return retError;
    }
    if (!data) {
        DebugPrintError("GetMetricInfo: invalid out data\n");
        return retError;
    }
    // get all metricGroups
    if (type) {
        stype = ZET_METRIC_GROUP_SAMPLING_TYPE_FLAG_EVENT_BASED;
    } else {
        stype = ZET_METRIC_GROUP_SAMPLING_TYPE_FLAG_TIME_BASED;
    }
    data->code = 0;
    data->infoEntries = GPUAllocMetricInfo(m_groupInfo->numMetricGroups);
    CHECK_N_RETURN_STATUS((data->infoEntries==nullptr), 1);
    int index = 0;
    TMetricGroupNode *groupList = m_groupInfo->metricGroupList;
    for (i = 0; i < m_groupInfo->numMetricGroups; i++ ) {
        if (groupList[i].props.samplingType & stype) {
            strncpy_se(data->infoEntries[index].name, MAX_STR_LEN,
                    groupList[i].props.name, strlen(groupList[i].props.name));
            strncpy_se(data->infoEntries[index].desc, MAX_STR_LEN,
                    groupList[i].props.description, strlen(groupList[i].props.description));
            data->infoEntries[index].numEntries = groupList[i].props.metricCount;
            data->infoEntries[index].dataType
                    = groupList[i].props.samplingType;
            data->infoEntries[index++].code = groupList[i].code;
        }
        data->numEntries = index;
    }
    return ret;
}

/*------------------------------------------------------------------------------*/
/**
 * @fn int GPUMetricHandler::GetMetricInfo(const char *name, int type, MetricInfo *data)
 *
 * @brief Get available metrics in a certain metric group.
 *        If the metric group is not specified, get all available metrics from all
 *        metric groups.
 *
 * @param IN    name -- metric group name. If nullptr or empty, means all metric groups
 * @param IN    type -- metric group type, 0 for time-based, 1 for query-based
 * @param INOUT data -- pointer to the MetricInfo data that contains a list of metrics
 *
 * @return -- 0 if success.
 */
int GPUMetricHandler::GetMetricInfo(const char *name, int type, MetricInfo *data)
{
    uint32_t i = 0;
    uint32_t numMetrics = 0;
    int ret = 0;
    int retError = 1;
    uint32_t stype = (type)? ZET_METRIC_GROUP_SAMPLING_TYPE_FLAG_EVENT_BASED
                           : ZET_METRIC_GROUP_SAMPLING_TYPE_FLAG_TIME_BASED;

    if (!m_device) {
        DebugPrintError("MetricsDevice not opened\n");
        return retError;
    }
    if (!data) {
        DebugPrintError("GetMetricInfo: invalid out data\n");
        return retError;
    }
    // get all metricGroups
    numMetrics = m_groupInfo->numMetrics;
    int selectedAll = 0;
    if (!name || !strlen(name)) {
        data->infoEntries = GPUAllocMetricInfo(numMetrics);
        CHECK_N_RETURN_STATUS((data->infoEntries==nullptr), 1);
        data->code = 0;
        selectedAll = 1;
    }
    int index = 0;
    TMetricGroupNode *mGroup;
    for (i = 0; i < m_groupInfo->numMetricGroups; i++ ) {
        mGroup = &(m_groupInfo->metricGroupList[i]);
        if (!(mGroup->props.samplingType & stype) || strlen(mGroup->props.name) == 0) {
            continue;
        }
        if (!selectedAll) {
            if (strncmp(name, mGroup->props.name, MAX_STR_LEN) != 0) {
                continue;
            }
            numMetrics = mGroup->props.metricCount;
            data->infoEntries = GPUAllocMetricInfo(numMetrics);
            CHECK_N_RETURN_STATUS((data->infoEntries==nullptr), 1);
            m_groupId = i;
            data->code = mGroup->code;
            index = 0;
        }
        for (uint32_t j=0; j < mGroup->props.metricCount; j++) {
            strncpy_se((char *)(data->infoEntries[index].name), MAX_STR_LEN,
                    mGroup->props.name, strlen(mGroup->props.name));
            size_t mnLen = strlen(data->infoEntries[index].name);
            if ((mnLen+3) < MAX_STR_LEN) {
                data->infoEntries[index].name[mnLen++] = '.';
                strncpy_se((char *)&(data->infoEntries[index].name[mnLen]),
                        (MAX_STR_LEN-mnLen), mGroup->metricList[j].props.name,
                        strlen(mGroup->metricList[j].props.name));
            }
            strncpy_se(data->infoEntries[index].desc, MAX_STR_LEN,
                    mGroup->metricList[j].props.description,
                    strlen(mGroup->metricList[j].props.description));
            data->infoEntries[index].code = mGroup->metricList[j].code;
            data->infoEntries[index].numEntries = 0;
            data->infoEntries[index].dataType = mGroup->metricList[j].props.resultType;
            data->infoEntries[index].metricType = mGroup->metricList[j].metricType;
            index++;
        }
        if (!selectedAll) {
            break;
        }
    }
    if ( !selectedAll && (i == m_groupInfo->numMetricGroups)) {
        DebugPrintError( "GetMetricInfo: metricGroup %s is not found, abort\n", name);
        return retError;
    }
    data->numEntries = index;
    return ret;
}

/*------------------------------------------------------------------------------*/
/*!
 * @fn int GPUMetricHandler::GetMetricCode(
 *             const char *mGroupName, const char *metricName, uint32_t mtype,
 *             uint32_t *mGroupCode, uint32_t *metricCode)
 *
 * @brief Get the metric code and metric group code for a given valid metric group
 *        name and metric name.
 *
 * @param IN  mGroupName - metric group name
 * @param IN  metricName - metric name in the metric group
 * @param IN  mtype      - metric type: <time_based, event_based>
 * @param OUT mGroupCode - metric group code, 0 if the metric group does not exist
 * @param OUT metricCode - metric code, 0 if the metric does not exist
 *
 * @return Status, 0 if success, 1 if no such metric or metric group exists.
 */
int
GPUMetricHandler::GetMetricCode(
    const char *mGroupName, const char *metricName, uint32_t mtype,
    uint32_t *mGroupCode, uint32_t *metricCode)
{
    int ret = 0;
    int retError = 1;
    DebugPrint( "GetMetricCode: metricGroup %s, metric %s == ", mGroupName, metricName);
    int metricType = (mtype)?ZET_METRIC_GROUP_SAMPLING_TYPE_FLAG_EVENT_BASED
                            :ZET_METRIC_GROUP_SAMPLING_TYPE_FLAG_TIME_BASED;
    TMetricGroupNode *groupList = m_groupInfo->metricGroupList;
    for (uint32_t i=0; i< m_groupInfo->numMetricGroups; i++) {
        if (groupList[i].code && (metricType & groupList[i].props.samplingType) &&
            (strncmp(mGroupName, groupList[i].props.name, MAX_STR_LEN) == 0 )) {
            *mGroupCode = groupList[i].code;
            if (!metricName || (strlen(metricName)==0) || !metricCode) {
                DebugPrint( " mGroupCode 0x%x \n", *mGroupCode);
                return ret;
            }
            for (uint32_t j=0; j< groupList[i].props.metricCount; j++) {
                if (strncmp(metricName, groupList[i].metricList[j].props.name,MAX_STR_LEN)==0) {
                    *metricCode = groupList[i].metricList[j].code;
                    DebugPrint( " mGroupCode 0x%x , metricCode 0x%x\n",
                            *mGroupCode, *metricCode);
                    return ret;
                }
            }
        }
    }
    return retError;
}

/*------------------------------------------------------------------------------*/
/*!
 * @fn int GPUMetricHandler::EnableMetricGroup(
 *             const char *metricGroupName, uint32_t mtype, int *enableSt)
 *
 * @brief Enable a named metric group for collection
 *
 * @param IN  metricGroupName - the metric group name
 * @param IN  mtype           - metric group type
 * @param OUT enableSt        - 1 if the metric group is already enabled, otherwise 0
 *
 * @return Status, 0 for success.
 */
int
GPUMetricHandler::EnableMetricGroup(const char *metricGroupName, uint32_t mtype, int *enableSt)
{
    int ret = 0;
    int retError = 1;
    uint32_t groupCode = 0;
    if ((ret=GetMetricCode(metricGroupName, "", mtype, &groupCode, nullptr))) {
        DebugPrintError("MetricGroup %s is not found\n", metricGroupName);
        return retError;
    }
    return EnableMetricGroup(groupCode, mtype, enableSt);
}

/*------------------------------------------------------------------------------*/
/*!
 * @fn int GPUMetricHandler::EnableMetricGroup(
 *             uint32_t groupCode, uint32_t mtype, int *enableSt)
 *
 * @brief Enable a metric group, selected by code, for collection
 *
 * @param IN  groupCode - the metric group code
 * @param IN  mtype     - metric group type
 * @param OUT enableSt  - 1 if the metric group is already enabled, otherwise 0
 *
 * @return Status, 0 for success.
 */
int GPUMetricHandler::EnableMetricGroup(uint32_t groupCode, uint32_t mtype, int *enableSt)
{
    int ret = 0;
    int retError = 1;
    ze_result_t status = ZE_RESULT_SUCCESS;

    *enableSt = 0;
    if ((groupCode-1) >= m_groupInfo->numMetricGroups) {
        DebugPrintError("MetricGroup code 0x%x is not found\n", groupCode);
        return retError;
    }
    m_lock.lock();
    if ( m_status == COLLECTION_CONFIGED) {
        DebugPrint( "EnableMetricGroup: already in enable status\n");
        *enableSt = 1;
        m_lock.unlock();
        return ret;
    }
    m_groupId = groupCode-1;
    if (mtype == EVENT_BASED) {
        m_groupType = ZET_METRIC_GROUP_SAMPLING_TYPE_FLAG_EVENT_BASED;
    } else {
        m_groupType = ZET_METRIC_GROUP_SAMPLING_TYPE_FLAG_TIME_BASED;
    }
    DebugPrint("EnableMetricGroup: code 0x%x, type 0x%x\n", groupCode, m_groupType);
    TMetricGroupNode *groupList = m_groupInfo->metricGroupList;
    uint32_t metricCount = groupList[m_groupId].props.metricCount;
    if (!metricCount) {
        DebugPrintError("MetricGroup does not have metrics\n");
        m_lock.unlock();
        return retError;
    }
    if (!m_context) {
        ze_context_desc_t ctxtDesc = { ZE_STRUCTURE_TYPE_CONTEXT_DESC, nullptr, 0 };
        zeContextCreateFunc(m_driver, &ctxtDesc, &m_context);
    }
    // activate metric
group zet_metric_group_handle_t mGroup = groupList[m_groupId].handle; status = zetContextActivateMetricGroupsFunc(m_context, m_device, 1, &mGroup); if (status != ZE_RESULT_SUCCESS) { DebugPrintError("ActivateMetricGroup code 0x%x failed.\n", groupCode); return retError; } m_eventPool = nullptr; m_event = nullptr; m_tracer = nullptr; // create buffer for report data if (!m_reportData) { m_reportData = new MetricData[m_numDevices]; m_reportCount = new uint32_t[m_numDevices]; CHECK_N_RETURN_STATUS((m_reportData==nullptr), 1); } for (uint32_t i=0; imetricGroupList[m_groupId].handle; status = zetMetricStreamerOpenFunc(m_context, m_device, mGroup, &metricStreamerDesc, m_event, &m_metricStreamer); if (!m_metricStreamer) { DebugPrintError("zetMetricStreamerOpen: failed with null streamer on device [%p]\n", m_device); } } if (status == ZE_RESULT_SUCCESS) { ret = 0; m_status = COLLECTION_ENABLED; } else { DebugPrintError("EnableTimeBasedStream: failed on device [%p], status 0x%x\n", m_device, status); status = zetContextActivateMetricGroupsFunc(m_context, m_device, 0, nullptr); m_status = COLLECTION_INIT; ret = 1; } m_lock.unlock(); return ret; } /*------------------------------------------------------------------------------*/ /*! * @fn int GPUMetricHandler::EnableEventBasedQuery() * * @brief Enable metric query on enabled metrics * * @return Status, 0 for success. */ int GPUMetricHandler::EnableEventBasedQuery() { ze_result_t status = ZE_RESULT_SUCCESS; int ret = 0; int retError = 0; if (m_groupId < 0) { DebugPrintError("No metrics enabled. 
                        Data collection abort\n");
        return retError;
    }
    m_lock.lock();
    if (m_status == COLLECTION_ENABLED) {
        DebugPrint( "EnableEventBasedQuery: already enabled\n");
        m_lock.unlock();
        return ret;
    }
    zet_metric_group_handle_t mGroup = m_groupInfo->metricGroupList[m_groupId].handle;
    m_queryState = new QueryState;
    zet_metric_query_pool_desc_t metricQueryPoolDesc;
    if (!m_context) {
        ze_context_desc_t ctxtDesc = { ZE_STRUCTURE_TYPE_CONTEXT_DESC, nullptr, 0 };
        zeContextCreateFunc(m_driver, &ctxtDesc, &m_context);
    }
    metricQueryPoolDesc.stype = ZET_STRUCTURE_TYPE_METRIC_QUERY_POOL_DESC;
    metricQueryPoolDesc.type = ZET_METRIC_QUERY_POOL_TYPE_PERFORMANCE;
    metricQueryPoolDesc.count = MAX_KERNERLS;
    status = zetMetricQueryPoolCreateFunc(m_context, m_device, mGroup,
                 &metricQueryPoolDesc, &m_queryPool);
    if (status == ZE_RESULT_SUCCESS) {
        m_queryState->queryPool = m_queryPool;
        ze_event_pool_desc_t eventPoolDesc;
        eventPoolDesc.stype = ZE_STRUCTURE_TYPE_EVENT_POOL_DESC;
        eventPoolDesc.flags= ZE_EVENT_POOL_FLAG_HOST_VISIBLE;
        eventPoolDesc.count = MAX_KERNERLS;
        // create event to wait
        status = zeEventPoolCreateFunc(m_context, &eventPoolDesc, 1, &m_device, &m_eventPool);
    }
    zet_tracer_exp_desc_t tracerDesc;
    tracerDesc.stype = ZET_STRUCTURE_TYPE_TRACER_EXP_DESC;
    if (status == ZE_RESULT_SUCCESS) {
        m_queryState->eventPool = m_eventPool;
        m_queryState->handle = this;
        tracerDesc.pUserData = m_queryState;
        status = zetTracerExpCreateFunc(m_context, &tracerDesc, &m_tracer);
    }
    zet_core_callbacks_t prologCB = {};
    zet_core_callbacks_t epilogCB = {};
    if (status == ZE_RESULT_SUCCESS) {
        epilogCB.Kernel.pfnCreateCb = kernelCreateCB;
        epilogCB.Kernel.pfnDestroyCb = kernelDestroyCB;
        prologCB.CommandList.pfnAppendLaunchKernelCb = metricQueryBeginCB;
        epilogCB.CommandList.pfnAppendLaunchKernelCb = metricQueryEndCB;
        status = zetTracerExpSetProloguesFunc(m_tracer, &prologCB);
    }
    if (status == ZE_RESULT_SUCCESS) {
        status = zetTracerExpSetEpiloguesFunc(m_tracer, &epilogCB);
    }
    if (status == ZE_RESULT_SUCCESS) {
        status =
            zetTracerExpSetEnabledFunc(m_tracer, true);
    }
    if (status == ZE_RESULT_SUCCESS) {
        m_status = COLLECTION_ENABLED;
        ret = 0;
    } else {
        DebugPrintError("EnableEventBasedQuery: failed with status 0x%x, abort.\n", status);
        status = zetContextActivateMetricGroupsFunc(m_context, m_device, 0, nullptr);
        m_status = COLLECTION_INIT;
        ret = retError;
    }
    m_lock.unlock();
    return ret;
}

/*------------------------------------------------------------------------------*/
/*!
 * @fn void GPUMetricHandler::DisableMetricGroup()
 *
 * @brief Disable the current metric group.
 *        After the metric group is disabled, data can no longer be read.
 */
void GPUMetricHandler::DisableMetricGroup()
{
    DebugPrint("enter DisableMetricGroup()\n");
    m_lock.lock();
    if ((m_status != COLLECTION_ENABLED) && (m_status != COLLECTION_CONFIGED)) {
        m_lock.unlock();
        return;
    }
    m_status = COLLECTION_DISABLED;
    zetContextActivateMetricGroupsFunc(m_context, m_device, 0, nullptr);
    m_lock.unlock();
    return;
}

/*------------------------------------------------------------------------------*/
/*!
 * @fn MetricData * GPUMetricHandler::GetMetricData(
 *             uint32_t mode, uint32_t *numReports)
 *
 * @brief Read raw event data and calculate metrics.
 *        Returns an array of MetricData as the overall aggregated metrics data
 *        for a device, one MetricData per device or subdevice.
 *
 * @param IN  mode       - report mode, summary or samples
 * @param OUT numReports - total number of sample reports
 *
 * @return - metric data array, one MetricData per report.
 */
MetricData *
GPUMetricHandler::GetMetricData(uint32_t mode, uint32_t *numReports)
{
    uint8_t *rawBuffer = nullptr;
    size_t rawDataSize = 0;
    MetricData *reportArray = nullptr;

    *numReports = 0;
    m_lock.lock();
    if (m_groupType == ZET_METRIC_GROUP_SAMPLING_TYPE_FLAG_TIME_BASED) {
        rawBuffer = ReadStreamData(&rawDataSize);
        if (rawBuffer) {
            GenerateMetricData(rawBuffer, rawDataSize, mode);
            delete [] rawBuffer;
        }
    } else {
        m_queryState->lock.lock();
        vector<QueryData> qList;
        for (auto query : m_queryState->queryList) {
            rawDataSize = 0;
            rawBuffer = nullptr;
            ze_result_t status = ZE_RESULT_SUCCESS;
            rawBuffer = ReadQueryData(query, &rawDataSize, &status);
            if (status != ZE_RESULT_SUCCESS) {
                DebugPrintError("query read failed: status 0x%x\n", status);
            }
            if (status == ZE_RESULT_NOT_READY) {
                qList.push_back(query);
            }
            if ((status == ZE_RESULT_SUCCESS) && rawBuffer) {
                GenerateMetricData(rawBuffer, rawDataSize, mode);
                delete [] rawBuffer;
            }
        }
        m_queryState->queryList.clear();
        for (auto query : qList) {
            m_queryState->queryList.push_back(query);
        }
        m_queryState->lock.unlock();
    }
    // only SUMMARY mode
    TMetricGroupNode *groupList = m_groupInfo->metricGroupList;
    TMetricNode *metricList = groupList[m_groupId].metricList;
    uint32_t metricCount = groupList[m_groupId].props.metricCount;
    reportArray = GPUAllocMetricData(1, m_numDataSet, metricCount);
    if (!reportArray) {
        m_lock.unlock();
        return reportArray;
    }
    *numReports = 1;
    MetricData *reportData = &(reportArray[0]);
    int index = 0;
    for (uint32_t i=0; i<m_numDataSet; i++) {
        if (m_reportCount[i] == 0) {
            reportData->dataSetStartIdx[i] = -1;
            continue;
        }
        reportData->dataSetStartIdx[i] = index;
        for (uint32_t j=0; j < metricCount; j++, index++) {
            reportData->dataEntries[index].code = m_reportData[i].dataEntries[j].code;
            reportData->dataEntries[index].type = m_reportData[i].dataEntries[j].type;
            if (metricList[j].metricType != M_AVERAGE) {
                reportData->dataEntries[index].value.ival =
                        m_reportData[i].dataEntries[j].value.ival;
            } else {
                if (m_reportData[i].dataEntries[j].type) {
                    // calculate avg
                    if (m_reportData[i].dataEntries[j].value.fpval != 0.0) {
                        reportData->dataEntries[index].value.fpval =
                            (m_reportData[i].dataEntries[j].value.fpval)/m_reportCount[i];
                    }
                } else {
                    reportData->dataEntries[index].value.ival =
                            (m_reportData[i].dataEntries[j].value.ival)/m_reportCount[i];
                }
            }
        }
    }
    m_lock.unlock();
    return reportData;
}

/*------------------------------------------------------------------------------*/
/*!
 * @fn uint32_t GPUMetricHandler::GetCurGroupCode()
 *
 * @brief Return the metric group code of the currently activated metric group
 *
 * @return OUT -- metric group code
 */
uint32_t GPUMetricHandler::GetCurGroupCode()
{
    return (uint32_t)(m_groupId + 1);
}

/*------------------------------------------------------------------------------*/
/*!
 * @fn int GPUMetricHandler::SetControl(uint32_t mode)
 *
 * @brief Set control
 *
 * @param IN mode -- control mode to set
 */
int GPUMetricHandler::SetControl(uint32_t mode)
{
    int ret = 0;
    m_lock.lock();
    if (mode & METRIC_RESET) {
        if (m_numDevices && m_reportData) {
            for (uint32_t i=0; i<m_numDevices; i++) {
                MetricData *data = &(m_reportData[i]);
                for (uint32_t j=0; j<data->numEntries; j++) {
                    data->dataEntries[j].value.ival = 0;
                }
                m_reportCount[i] = 0;
            }
        }
    }
    m_lock.unlock();
    return ret;
}

/*------------------------------------------------------------------------------*/
/*!
 * fn string GPUMetricHandler::GetDeviceName(ze_device_handle_t device)
 *
 * @brief Get metric device name
 *
 * @param IN device - device handle
 *
 * @return The device name
 */
string GPUMetricHandler::GetDeviceName(ze_device_handle_t device)
{
    ze_result_t status = ZE_RESULT_SUCCESS;
    ze_device_properties_t props;
    status = zeDeviceGetPropertiesFunc(device, &props);
    if (status == ZE_RESULT_SUCCESS) {
        return props.name;
    }
    return "";
}

/*------------------------------------------------------------------------------*/
/*!
 * @fn void GPUMetricHandler::GenerateMetricData(
 *             uint8_t *rawBuffer, size_t rawDataSize, uint32_t mode)
 *
 * @brief Calculate metric data from raw event data; the result will be aggregated
 *        into the global data
 *
 * @param IN rawBuffer   - buffer for raw event data
 * @param IN rawDataSize - the size of raw event data in the buffer
 * @param IN mode        - report mode
 */
void GPUMetricHandler::GenerateMetricData( uint8_t *rawBuffer, size_t rawDataSize, uint32_t mode )
{
    if (!rawDataSize) {
        return;
    }
    ze_result_t status = ZE_RESULT_SUCCESS;
    TMetricGroupNode *groupList = m_groupInfo->metricGroupList;
    zet_metric_group_handle_t mGroup = groupList[m_groupId].handle;
    uint32_t metricCount = groupList[m_groupId].props.metricCount;
    uint32_t numSets = 0;
    uint32_t numValues = 0;
    status = zetMetricGroupCalculateMultipleMetricValuesExpFunc(mGroup,
                 ZET_METRIC_GROUP_CALCULATION_TYPE_METRIC_VALUES,
                 (size_t)rawDataSize, (uint8_t*)rawBuffer,
                 &numSets, &numValues, nullptr, nullptr);
    if ((status != ZE_RESULT_SUCCESS) || !numSets || !numValues) {
        DebugPrint("Metrics calculation failed on allocating memory space.\n");
        return;
    }
    std::vector<uint32_t> metricSetCounts(numSets);
    std::vector<zet_typed_value_t> metricValues(numValues);
    status = zetMetricGroupCalculateMultipleMetricValuesExpFunc(mGroup,
                 ZET_METRIC_GROUP_CALCULATION_TYPE_METRIC_VALUES,
                 (size_t)rawDataSize, (uint8_t*)rawBuffer,
                 &numSets, &numValues, metricSetCounts.data(), metricValues.data());
    if (status != ZE_RESULT_SUCCESS) {
        DebugPrintError("Failed on metrics calculation\n");
        return;
    }
    if (!numSets || !numValues) {
        DebugPrintError("No metrics calculated.\n");
        return;
    }
    zet_typed_value_t *typedDataList = metricValues.data();
    m_numDataSet = numSets;
    uint32_t index = 0;
    for (uint32_t i=0; imetricGroupList;
    TMetricNode *metricList = groupList[m_groupId].metricList;
    if (dataSetId > m_numDevices) {
        return;
    }
    MetricData *reportData = &(m_reportData[dataSetId]);
    uint32_t reportCounts = metricDataSize/metricCount;
    if (mode & METRIC_RESET) {
        for (uint32_t j=0;
jdataEntries[j].value.ival = 0; } m_reportCount[dataSetId] = 0; } DebugPrint("data[%d], metricDataSize %d, reportCounts %d, metricCount %d\n", (int)dataSetId, (int)metricDataSize, (int)reportCounts, (int)metricCount); // log metric names if (g_stdout) { for (uint32_t j=0; jdataEntries[j].type = dtype; if (!dtype) { reportData->dataEntries[j].value.ival += iVal; } else { reportData->dataEntries[j].value.fpval += fpVal; } } else { // static value, only need to take last one if (i== (reportCounts -1)) { if (!dtype) { reportData->dataEntries[j].value.ival = iVal; } else { reportData->dataEntries[j].value.fpval = fpVal; } } } } } m_reportCount[dataSetId] += reportCounts; return; } /*------------------------------------------------------------------------------*/ /*! * @fn uint8_t * GPUMetricHandler::ReadStreamData(size_t *rawDataSize) * * @brief read raw time based sampling data * * @param OUT rawDataSize - total byte size of raw data read. * * @return buffer pointer - buffer contains the raw data, nulltpr if failed. 
 */
uint8_t *
GPUMetricHandler::ReadStreamData(size_t *rawDataSize)
{
    ze_result_t status = ZE_RESULT_SUCCESS;
    size_t rawSize = 0;
    uint8_t *rawBuffer = nullptr;

    *rawDataSize = 0;
    //read raw data
    status = zeEventHostSynchronizeFunc(m_event, 50000 /* wait delay in nanoseconds */);
    status = zetMetricStreamerReadDataFunc(m_metricStreamer, UINT32_MAX, &rawSize, nullptr);
    if (status != ZE_RESULT_SUCCESS) {
        DebugPrintError("ReadStreamData failed, status 0x%x, rawSize %d\n",
                status, (int)rawSize);
        return nullptr;
    }
    rawBuffer = new uint8_t[rawSize];
    CHECK_N_RETURN_STATUS((rawBuffer==nullptr), nullptr);
    if (!rawSize) {
        *rawDataSize = rawSize;
        return rawBuffer;
    }
    status = zetMetricStreamerReadDataFunc(m_metricStreamer, UINT32_MAX, &rawSize,
                 (uint8_t *)rawBuffer);
    if (status != ZE_RESULT_SUCCESS) {
        DebugPrintError("ReadStreamData failed, status 0x%x\n", status);
        delete [] rawBuffer;
        return nullptr;
    }
    if (!rawSize) {
        // this may not be an error, especially in the multi-thread case.
        DebugPrint("No raw data available. This could mean the collection time was too short "
                   "or a buffer overflow occurred. Please increase the sampling period\n");
    }
    *rawDataSize = rawSize;
    return rawBuffer;
}

/*------------------------------------------------------------------------------*/
/*!
 * @fn uint8_t * GPUMetricHandler::ReadQueryData(
 *              QueryData &data,
 *              size_t *rawDataSize,
 *              ze_result_t *status)
 *
 * @brief Read raw query-based sampling data
 *
 * @param IN  data        - given query data
 * @param OUT rawDataSize - total # of bytes of raw data read
 * @param OUT retStatus   - return status
 *
 * @return buffer pointer - buffer containing the raw data, nullptr if failed.
 */
uint8_t *
GPUMetricHandler::ReadQueryData(QueryData &data, size_t *rawDataSize, ze_result_t *retStatus)
{
    size_t rawSize = 0;
    *rawDataSize = 0;
    std::map dataMap;
    ze_result_t status = ZE_RESULT_SUCCESS;
    status = zeEventHostSynchronizeFunc(data.event, 1000);
    *retStatus = status;
    CHECK_N_RETURN_STATUS((status!=ZE_RESULT_SUCCESS), nullptr);
    status = zeEventDestroyFunc(data.event);
    *retStatus = status;
    CHECK_N_RETURN_STATUS((status!=ZE_RESULT_SUCCESS), nullptr);
    status = zetMetricQueryGetDataFunc(data.metricQuery, &rawSize, nullptr);
    *retStatus = status;
    CHECK_N_RETURN_STATUS((status!=ZE_RESULT_SUCCESS), nullptr);
    uint8_t *rawBuffer = new uint8_t[rawSize];
    CHECK_N_RETURN_STATUS((rawBuffer==nullptr), nullptr);
    status = zetMetricQueryGetDataFunc(data.metricQuery, &rawSize, rawBuffer);
    *retStatus = status;
    CHECK_N_RETURN_STATUS((status!=ZE_RESULT_SUCCESS), nullptr);
    status = zetMetricQueryDestroyFunc(data.metricQuery);
    *retStatus = status;
    CHECK_N_RETURN_STATUS((status!=ZE_RESULT_SUCCESS), nullptr);
    *rawDataSize = rawSize;
    return rawBuffer;
}

/*
 * GPUMetricInterface.cpp: Intel® Graphics Component for PAPI
 *
 * Copyright (c) 2020 Intel Corp. All rights reserved
 * Contributed by Peinan Zhang
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */

#include #include #include #include #include #include #include
#include "GPUMetricInterface.h"
#include "GPUMetricHandler.h"

#ifdef __cplusplus
extern "C" {
#endif

//#define _DEBUG 1

#define DebugPrintError(format, args...) fprintf(stderr, format, ## args)

#if defined(_DEBUG)
#define DebugPrint(format, args...) fprintf(stderr, format, ## args)
#else
#define DebugPrint(format, args...) {do { } while(0);}
#endif

#define UNINITED 0
#define INITED 1
#define ENABLED 2

#define MAX_ENTRY 256
#define MAX_HANDLES 16

/* global device information */
static DeviceInfo *gDeviceInfo = nullptr;
static uint32_t gNumDeviceInfo = 0;

// maintain available handlers, one per subdevice
static GPUMetricHandler **gHandlerTable = nullptr;
static uint32_t gNumHandles = 0;
static DEVICE_HANDLE *gHandles = nullptr;

static int runningSessions = 0;

// lock to make sure API is thread-safe
static std::mutex infLock;

/* build the key from */
static GPUMetricHandler *getHandler(DEVICE_HANDLE handle)
{
    uint32_t index = GetIdx(handle);
    if (index >= gNumHandles) {
        return nullptr;
    }
    return gHandlerTable[index];
}

/*
 * wrapper function for safe strncpy
 */
void strncpy_se(char *dest, size_t destSize, char *src, size_t count)
{
    if (dest && src) {
        size_t toCopy = (count <= strlen(src))?
            count : strlen(src);
        if (toCopy < destSize) {
            memcpy(dest, src, toCopy);
            dest[toCopy] = '\0';
        }
    }
}

/*============================================================================
 * Below are Wrapper interface functions
 *============================================================================*/

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUDetectDevice(DEVICE_HANDLE **handle, uint32_t *num_device);
 *
 * @brief Detect and init GPU devices which have performance metrics available.
 *
 * @param OUT handles    - an array of handles, one for each instance of the device
 * @param OUT numDevices - total number of devices detected
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUDetectDevice(DEVICE_HANDLE **handles, uint32_t *numDevices)
{
    int ret = 0;
    int retError = 1;
    infLock.lock();
    if (gNumHandles) {
        *handles = gHandles;
        *numDevices = gNumHandles;
        infLock.unlock();
        return ret;
    }
    ret = GPUMetricHandler::InitMetricDevices(&gDeviceInfo, &gNumDeviceInfo, &gNumHandles);
    if (!gDeviceInfo || !gNumDeviceInfo) {
        DebugPrintError("InitMetricDevices failed,"
                        " device does not exist or cannot find dependent libraries, abort\n");
        infLock.unlock();
        return retError;
    }
    gHandlerTable = (GPUMetricHandler **)calloc(gNumHandles, sizeof(GPUMetricHandler *));
    gHandles = (DEVICE_HANDLE *)calloc(gNumHandles, sizeof(DEVICE_HANDLE));
    uint32_t index = 0;
    uint32_t dcode = 0;
    for (uint32_t i=0; iDestroyMetricDevice(); }

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUEnableMetricGroup(DEVICE_HANDLE handle, char *metricGroupName
 *              unsigned int mtype, unsigned int period, unsigned int numReports)
 *
 * @brief add named metric group by name or code to collection.
 *
 * @param IN handle          - handle to the selected device
 * @param IN metricGroupName - a metric group name
 * @param IN metricGroupCode - a metric group code
 * @param IN mtype           - metric type
 * @param IN period          - collection timer period
 * @param IN numReports      - number of reports. Default: collect all available metrics
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUEnableMetricGroup(DEVICE_HANDLE handle, char *metricGroupName,
        uint32_t metricGroupCode, unsigned int mtype, unsigned int period,
        unsigned int numReports)
{
    int ret = 0;
    int retError = 1;
    unsigned int groupCode = 0;
    if (!handle || !getHandler(handle)) {
        DebugPrintError("GPUEnableMetricGroup: device handle is not initiated!\n");
        return retError;
    }
    GPUMetricHandler *curMetricHandle = getHandler(handle);
    uint32_t curMetricGroupCode = curMetricHandle->GetCurGroupCode();
    if (metricGroupName && strlen(metricGroupName)) {
        ret = curMetricHandle->GetMetricCode(metricGroupName, "", mtype, &groupCode, nullptr);
        if (ret) {
            DebugPrintError("GPUEnableMetricGroup: metric group %s is not supported, abort!\n",
                            metricGroupName);
            return ret;
        }
    } else {
        groupCode = metricGroupCode;
    }
    if (curMetricGroupCode) {
        // already collecting a metric group
        if (groupCode == curMetricGroupCode) {
            runningSessions++;
            return ret;
        } else {
            DebugPrintError("GPUEnableMetricGroup failed, reason:"
                            " tried to collect more than one metric group at the same time."
                            " Collection abort!\n");
            return retError;
        }
    }
    infLock.lock();
    int enabled = 0;
    ret = curMetricHandle->EnableMetricGroup(groupCode, mtype, &enabled);
    if (!ret && !enabled) {
        if (mtype == TIME_BASED) {
            ret = curMetricHandle->EnableTimeBasedStream(period, numReports);
        } else {
            ret = curMetricHandle->EnableEventBasedQuery();
        }
    }
    if (!ret) {
        runningSessions++;
    }
    infLock.unlock();
    return ret;
}

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUDisableMetricGroup(DEVICE_HANDLE handle, unsigned int mtype);
 *
 * @brief disable a metric group configured for the device
 *
 * @param IN handle - handle to the selected device
 * @param IN mtype  - a metric group type
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUDisableMetricGroup(DEVICE_HANDLE handle, unsigned mtype)
{
    int ret = 0;
    int retError = 1;
    if (!handle || !getHandler(handle)) {
        DebugPrintError("GPUDisableMetricGroup: device handle is not initiated!\n");
        return retError;
    }
    infLock.lock();
    runningSessions--;
    if (runningSessions == 0) {
        if (mtype == TIME_BASED) {
            GPUMetricHandler *curMetricHandler = getHandler(handle);
            curMetricHandler->DisableMetricGroup();
        }
    }
    infLock.unlock();
    return ret;
}

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUSetMetricControl(DEVICE_HANDLE handle, unsigned int mode);
 *
 * @brief set controls for metric collection
 *
 * @param IN handle - handle to the selected device
 * @param IN mode   - a metric collection control mode
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUSetMetricControl(DEVICE_HANDLE handle, unsigned int mode)
{
    int ret = 0;
    int retError = 1;
    if (!handle || !getHandler(handle)) {
        DebugPrintError("GPUSetMetricControl: device handle is not initiated!\n");
        return retError;
    }
    infLock.lock();
    getHandler(handle)->SetControl(mode);
    infLock.unlock();
    return ret;
}

/* ----------------------------------------------------------------------------------------- */
/*!
 * @fn MetricData *GPUReadMetricData(DEVICE_HANDLE handle,
 *                     int mode, unsigned int *reportCounts);
 *
 * @brief read metric data
 *
 * @param IN handle        - handle to the selected device
 * @param IN mode          - report data mode: METRIC_SUMMARY, METRIC_SAMPLE
 * @param OUT reportCounts - returned metric data array size
 *
 * @return data - returned metric data array
 */
MetricData *GPUReadMetricData(DEVICE_HANDLE handle, unsigned int mode, unsigned int *reportCounts)
{
    MetricData *reportData = nullptr;
    if (!handle || !getHandler(handle)) {
        DebugPrintError("GPUReadMetricData: device handle is not initiated!\n");
        return nullptr;
    }
    unsigned int numReports = 0;
    infLock.lock();
    reportData = getHandler(handle)->GetMetricData(mode, &numReports);
    infLock.unlock();
    if (!reportData) {
        DebugPrintError("Failed on GPUReadMetricData\n");
        *reportCounts = 0;
        return nullptr;
    }
    if (!numReports) {
        DebugPrintError("GPUReadMetricData: No metric is collected\n");
        *reportCounts = 0;
        GPUFreeMetricData(reportData, numReports);
        return nullptr;
    }
    DebugPrint("GPUReadMetricData: GetMetricData numReports %d\n", numReports);
#if defined(_DEBUG)
    for (int i=0; i<(int)numReports; i++) {
        DebugPrint("reportData[%d], metrics %d\n", i, reportData[i].metricCount);
        for (int j=0; j<(int)reportData[i].numDataSets; j++) {
            int sidx = reportData[i].dataSetStartIdx[j];
            if (sidx < 0) {
                continue;
            }
            for (int k=0; k<(int)reportData[i].metricCount; k++) {
                DebugPrint("record[%d], dataSet[%d], metric [%d]: code 0x%x, ",
                           i, j, k, reportData[i].dataEntries[sidx+k].code);
                if (reportData[i].dataEntries[sidx+k].type) {
                    DebugPrint("value %lf \n", reportData[i].dataEntries[sidx+k].value.fpval);
                } else {
                    DebugPrint("value %llu\n", reportData[i].dataEntries[sidx+k].value.ival);
                }
            }
        }
    }
#endif
    *reportCounts = (int)numReports;
    return reportData;
}

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUGetMetricGroups(DEVICE_HANDLE handle, unsigned int mtype, MetricInfo *data);
 *
 * @brief list all available metric groups for the selected type
 *
 * @param IN handle - handle to the selected device
 * @param IN mtype  - metric group type
 * @param OUT data  - metric data
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUGetMetricGroups(DEVICE_HANDLE handle, unsigned int mtype, MetricInfo *data)
{
    int ret = 0;
    int retError = 1;
    if (!handle || !getHandler(handle)) {
        DebugPrintError("GPUGetMetricGroups: device handle is not initiated!\n");
        return retError;
    }
    infLock.lock();
    ret = getHandler(handle)->GetMetricInfo(mtype, data);
    infLock.unlock();
    if (ret) {
        DebugPrintError("GPUGetMetricGroups failed, return %d\n", ret);
    }
#if defined(_DEBUG)
    for (int i=0; i<(int)data->numEntries; i++) {
        DebugPrint("GPUGetMetricGroups: metric group[%d]: %s, code 0x%x, numEntries %d\n",
                   i, data->infoEntries[i].name, data->infoEntries[i].code,
                   data->infoEntries[i].numEntries);
    }
#endif
    return ret;
}

/* ------------------------------------------------------------------------- */
/*!
 * @fn int GPUGetMetricList(DEVICE_HANDLE handle,
 *                 char *groupName, unsigned int mtype, MetricInfo *data);
 *
 * @brief list available metrics in the named group.
 *        If name is "", list all available metrics in all groups
 *
 * @param IN handle    - handle to the selected device
 * @param IN groupName - metric group name. "" means all groups.
 * @param IN mtype     - metric type
 * @param OUT data     - metric data
 *
 * @return 0 -- success, otherwise, error code
 */
int GPUGetMetricList(DEVICE_HANDLE handle, char *groupName, unsigned mtype, MetricInfo *data)
{
    int ret = 0;
    int retError = 1;
    if (!handle || !getHandler(handle)) {
        DebugPrintError("GPUGetMetricList: device handle (0x%x) is not initiated!\n", handle);
        return retError;
    }
    if (groupName == nullptr) {
        return retError;
    }
    GPUMetricHandler *mHandler = getHandler(handle);
    if (strlen(groupName) > 0) {
        ret = mHandler->GetMetricInfo(groupName, mtype, data);
    } else {
        ret = mHandler->GetMetricInfo("", mtype, data);
    }
    if (ret) {
        DebugPrintError("GPUGetMetrics [%s] failed, return %d\n", groupName, ret);
    }
    return ret;
}

/************************************************************************************************
 * Memory allocate/free functions are defined to allow C code callers to free up the memory space
 ************************************************************************************************/

/* ------------------------------------------------------------------------- */
/*!
 * @fn MetricInfo *GPUAllocMetricInfo(uint32_t count)
 *
 * @brief allocate memory for metrics info
 *
 * @param IN count - number of entries of MetricInfo
 *
 * @return array of MetricInfo
 */
MetricInfo *
GPUAllocMetricInfo(uint32_t count)
{
    return (MetricInfo *)calloc(count, sizeof(MetricInfo));
}

/* ------------------------------------------------------------------------- */
/*!
* @fn void GPUFreeMetricInfo(MetricInfo *info, uint32_t count); * * @brief free memory space for metrics info * @param IN count - number of entries of MetricInfo * */ void GPUFreeMetricInfo(MetricInfo *info, uint32_t count) { if (!info || !count) { return; } for (uint32_t i=0; i * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include #include #include #include "inc/GPUMetricInterface.h" #include "linux_intel_gpu_metrics.h" #define DEFAULT_LOOPS 0 // collection infinitely until receive stop command #define DEFAULT_MODE METRIC_SUMMARY // summary report #define GPUDEBUG SUBDBG // all available devices static uint32_t num_avail_devices = 0; static DEVICE_HANDLE *avail_devices; // devices currently being queried static uint32_t num_active_devices = 0; DeviceContext *active_devices; static MetricInfo metricInfoList; static int total_metrics = 0; static int global_metrics_type = TIME_BASED; // default type papi_vector_t _intel_gpu_vector; /*! 
* @brief Parser a metric name with qualifiers * metricName can be * component:::metrcGroup.metricname:device=xx:tile=yy * or metrcGroup.metricname:device=xx:tile=yy * @param IN metricName -- metric name in the above format * @param OUT devnum -- device id * @param OUT tilenum -- tile id */ void parseMetricName(const char *metricName, int *devnum, int *tilenum) { char *name = strdup(metricName); char *ptr = name; char *param = name; uint32_t dnum = 0; uint32_t tnum = 0; if ((*param != '\0') && (ptr=strstr(param, ":"))) { if ((ptr=strstr(param, ":device="))) { ptr += 8; dnum = atoi(ptr)+1; } if ((ptr=strstr(param, ":tile="))) { ptr += 6; tnum = atoi(ptr)+1; } } *devnum = (dnum)?dnum:1; *tilenum = (tnum)?tnum:0; // default tile is 0 for root device free(name); } /*! * @brief Get handle from device code */ DEVICE_HANDLE getHandle(uint32_t device_code) { uint32_t i=0; for (i=0; i= total_metrics) { return -1; } uint32_t group = GetGroupIdx(metricInfoList.infoEntries[index].code); uint32_t metric = GetMetricIdx(metricInfoList.infoEntries[index].code); uint32_t devcode = GetDeviceCode(code); if (rootDev) { devcode = devcode & ~DMASK; } GPUDEBUG("addMetricToDevice, code 0x%x, group 0x%x, metric 0x%x, devcode 0x%x\n", code, group, metric, devcode); uint32_t i=0; for (i=0; i= num_avail_devices) { GPUDEBUG("intel_gpu: invalid event code 0x%x\n", code); return -1; } DEVICE_HANDLE hd = getHandle(devcode); if (!hd) { GPUDEBUG("intel_gpu: Metric is not supported. For multi devices and/or multi-tiles GPUs, " "metric name should be qualified with :device=0 and/or :tile=0. " "Failed with return code 0x%x \n", PAPI_ENOSUPP); return -1; } // add a new device entry DeviceContext *dev = &(active_devices[i]); if (i==num_active_devices) { dev->device_code = devcode; dev->mgroup_code = group; dev->handle = hd; num_active_devices++; } // add a new metric to collect on this device dev->metric_code[dev->num_metrics++] = metric; return i; } /*! * @brief Reset metric counts to zero. 
*/ void metricReset( MetricContext *mContext ) { for (uint32_t i=0; iactive_devices[i]; if (dev_idx < num_active_devices) { DeviceContext *dev = &active_devices[dev_idx]; GPUSetMetricControl(dev->handle, METRIC_RESET); } } } /************************* PAPI Functions **********************************/ static int intel_gpu_init_thread(hwd_context_t *ctx) { GPUDEBUG("Entering intel_gpu_init_thread\n"); MetricContext *mContext = (MetricContext *)ctx; mContext->active_devices = calloc(num_avail_devices, sizeof(uint32_t)); return PAPI_OK; } static int intel_gpu_init_component(int cidx) { int retval = PAPI_OK; GPUDEBUG("Entering intel_init_component\n"); if (cidx < 0) { return PAPI_EINVAL; } char *errStr = NULL; memset(_intel_gpu_vector.cmp_info.disabled_reason, 0, PAPI_MAX_STR_LEN); if (putenv("ZET_ENABLE_METRICS=1")) { errStr = "Set ZET_ENABLE_METRICS=1 failed. Cannot access GPU metrics. "; strncpy_se(_intel_gpu_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, errStr, strlen(errStr)); retval = PAPI_ENOSUPP; goto fn_fail; } if ( _dl_non_dynamic_init != NULL ) { errStr = "The intel_gpu component does not support statically linking of libc."; strncpy_se(_intel_gpu_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, errStr, strlen(errStr)); retval = PAPI_ENOSUPP; goto fn_fail; } if (GPUDetectDevice(&avail_devices, &num_avail_devices) || (num_avail_devices==0)) { errStr = "The intel_gpu component does not detect metrics device."; strncpy_se(_intel_gpu_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, errStr, strlen(errStr)); retval = PAPI_ENOSUPP; goto fn_fail; } DEVICE_HANDLE handle = avail_devices[0]; char *envStr = NULL; envStr = getenv(ENABLE_API_TRACING); if (envStr != NULL) { if (atoi(envStr) == 1) { global_metrics_type = EVENT_BASED; } } metricInfoList.numEntries = 0; if (GPUGetMetricList(handle, "", global_metrics_type, &metricInfoList)) { errStr = "The intel_gpu component failed on get all available metrics."; 
strncpy_se(_intel_gpu_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, errStr, strlen(errStr)); GPUFreeDevice(handle); retval = PAPI_ENOSUPP; goto fn_fail; }; total_metrics = metricInfoList.numEntries; GPUDEBUG("total metrics %d\n", total_metrics); _intel_gpu_vector.cmp_info.num_native_events = metricInfoList.numEntries; _intel_gpu_vector.cmp_info.num_cntrs = GPU_MAX_COUNTERS; _intel_gpu_vector.cmp_info.num_mpx_cntrs = GPU_MAX_COUNTERS; /* Export the component id */ _intel_gpu_vector.cmp_info.CmpIdx = cidx; active_devices = calloc(num_avail_devices, sizeof(DeviceContext)); num_active_devices = 0; fn_exit: _papi_hwd[cidx]->cmp_info.disabled = retval; return retval; fn_fail: goto fn_exit; } /*! * @brief Setup a counter control state. * In general a control state holds the hardware info for an EventSet. */ static int intel_gpu_init_control_state( hwd_control_state_t * ctl ) { GPUDEBUG("Entering intel_gpu_control_state\n"); if (!ctl) { return PAPI_EINVAL; } MetricCtlState *mCtlSt = (MetricCtlState *)ctl; mCtlSt->metrics_type = global_metrics_type; char *envStr = NULL; mCtlSt->interval = DEFAULT_SAMPLING_PERIOD; envStr = getenv(METRICS_SAMPLING_PERIOD); if (envStr) { mCtlSt->interval = atoi(envStr); if (mCtlSt->interval < MINIMUM_SAMPLING_PERIOD) { mCtlSt->interval = DEFAULT_SAMPLING_PERIOD; } } // set default mode (get aggragated value or average value overtime). mCtlSt->mode = DEFAULT_MODE; mCtlSt->loops = DEFAULT_LOOPS; return PAPI_OK; } static int intel_gpu_update_control_state( hwd_control_state_t *ctl, NativeInfo_t *native, int count, hwd_context_t *ctx ) { GPUDEBUG("Entering intel_gpu_control_state\n"); (void)ctl; // use local maintained context, MetricContext *mContext = (MetricContext *)ctx; /* This check accounts for calls to PAPI_cleanup_eventset(). */ if ( !count ) { metricReset(mContext); /* Free devices. 
*/ for (uint32_t i=0; inum_devices; i++) { uint32_t dev_idx = mContext->active_devices[i]; if (dev_idx >= num_active_devices) { return PAPI_ENOMEM; } DeviceContext *dev = &active_devices[dev_idx]; DEVICE_HANDLE handle = dev->handle; if( !handle ) { GPUFreeDevice(handle); } mContext->active_devices[i] = 0; } mContext->num_devices = 0; /* Free metric slots. */ for (uint32_t midx=0; midxnum_metrics; midx++) { mContext->metric_idx[midx] = 0; mContext->metric_values[midx] = 0; mContext->dev_ctx_idx[midx] = 0; mContext->subdev_idx[midx] = 0; } mContext->num_metrics = 0; return PAPI_OK; } if ( !native ) { return PAPI_OK; } #if defined(_DEBUG) for (int i=0; inum_metrics; uint32_t midx = 0; int ni = 0; for (ni = 0; ni < count; ni++) { uint32_t index = native[ni].ni_event; // check whether this metric is in the list for (midx=0; midxmetric_idx[midx] == index) { GPUDEBUG("metric code %d: already in the list, ignore\n", index); break; } } if (midx < nmetrics) { // already in the list continue; } // whether use root device or subdevice char *envStr = NULL; int useRootDevice = 1; envStr = getenv(ENABLE_SUB_DEVICE); if (envStr != NULL) { useRootDevice = 0; } int idx = addMetricToDevice(index, useRootDevice); if (idx<0) { return PAPI_ENOSUPP; } mContext->metric_idx[nmetrics] = index; mContext->dev_ctx_idx[nmetrics] = idx; mContext->subdev_idx[nmetrics] = GetSDev(index); GPUDEBUG("add metric[%d] code 0x%x, in device[%d] (event subdev[%d])\n", nmetrics, mContext->metric_idx[nmetrics], mContext->dev_ctx_idx[nmetrics], mContext->subdev_idx[nmetrics]); uint32_t i = 0; for (i=0; inum_devices; i++) { if (mContext->active_devices[i] == (uint32_t)idx) { // already in the list break; } } if (i == mContext->num_devices) { mContext->active_devices[i] = idx; mContext->num_devices++; } native[ni].ni_position = nmetrics; nmetrics++; } // add this metric mContext->num_metrics = nmetrics; return PAPI_OK; } static int intel_gpu_start( hwd_context_t * ctx, hwd_control_state_t * ctl ) { 
GPUDEBUG("Entering intel_gpu_start\n"); MetricCtlState *mCtlSt = (MetricCtlState *)ctl; MetricContext *mContext = (MetricContext *)ctx; int ret = PAPI_OK; metricReset(mContext); if (mContext->num_metrics == 0) { GPUDEBUG("intel_gpu_start : No metric selected, abort.\n"); return PAPI_EINVAL; } char **metrics = calloc(mContext->num_metrics, sizeof(char *)); if (!metrics) { GPUDEBUG("intel_gpu_start : insufficient memory, abort.\n"); return PAPI_ENOMEM; } mContext->num_reports = 0; for (uint32_t i=0; inum_devices; i++) { uint32_t dev_idx = mContext->active_devices[i]; if (dev_idx >= num_active_devices) { ret = PAPI_ENOMEM; break; } DeviceContext *dev = &active_devices[dev_idx]; DEVICE_HANDLE handle = dev->handle; if (GPUEnableMetricGroup(handle, "", dev->mgroup_code, mCtlSt->metrics_type, mCtlSt->interval, mCtlSt->loops)) { GPUDEBUG("intel_gpu_start on EnableMetrics failed, return 0x%x \n", PAPI_ENOSUPP); ret = PAPI_ENOMEM; break; } } return ret; } static int intel_gpu_stop(hwd_context_t *ctx, hwd_control_state_t *ctl) { GPUDEBUG("Entering intel_gpu_stop\n"); MetricContext *mContext = (MetricContext *)ctx; MetricCtlState *mCtlSt = (MetricCtlState *)ctl; int ret = PAPI_OK; for (uint32_t i=0; inum_devices; i++) { uint32_t dev_idx = mContext->active_devices[i]; if (dev_idx < num_active_devices) { DeviceContext *dev = &active_devices[dev_idx]; DEVICE_HANDLE handle = dev->handle; if (GPUDisableMetricGroup(handle, mCtlSt->metrics_type)) { GPUDEBUG("intel_gpu_stop : failed with ret %d\n", ret); ret = PAPI_EINVAL; } } } return ret; } static int intel_gpu_read( hwd_context_t *ctx, hwd_control_state_t *ctl, long long **events, int flags ) { GPUDEBUG("Entering intel_gpu_read\n"); (void)flags; MetricCtlState *mCtlSt = (MetricCtlState *)ctl; MetricContext *mContext = (MetricContext *)ctx; MetricData *reports = NULL; uint32_t numReports = 0; int ret = PAPI_OK; if (!events) { return PAPI_EINVAL; } if (mContext->num_metrics == 0) { GPUDEBUG("intel_gpu_read: no metric is selected\n"); 
return PAPI_OK; } for (uint32_t i=0; inum_devices; i++) { uint32_t dc_idx = mContext->active_devices[i]; numReports = 0; if (dc_idx >= num_active_devices) { continue; } DeviceContext *dc = &active_devices[dc_idx]; DEVICE_HANDLE handle = dc->handle; reports = GPUReadMetricData(handle, mCtlSt->mode, &numReports); if (!reports) { GPUDEBUG("intel_gpu_read failed on device 0x%x\n", GetDeviceCode(handle)); continue; } else if (!numReports) { GPUFreeMetricData(reports, numReports); GPUDEBUG("intel_gpu_read: no data available on device 0x%x\n", GetDeviceCode(handle)); continue; } if (dc->data) { GPUFreeMetricData(dc->data, dc->num_reports); } /* take last report, it is expected the numReports is 1 */ dc->data = &reports[numReports-1]; dc->num_reports = numReports; } for (uint32_t i=0; inum_metrics; i++) { uint32_t dc_idx = mContext->dev_ctx_idx[i]; DeviceContext *dc = &(active_devices[dc_idx]); int index = GetIdx(mContext->metric_idx[i]); int start_idx = 0; if (!(dc->device_code &DMASK)) { // root device uint32_t dc_sidx = GetSDev(mContext->metric_idx[i]); if (dc_sidx > 0) { dc_sidx--; // zero index } if (dc_sidx < dc->data->numDataSets) { start_idx = dc->data->dataSetStartIdx[dc_sidx]; } else { start_idx = -1; } } if ((start_idx < 0) || !dc->data || !dc->num_reports) { mContext->metric_values[i] = 0; // no data available } else { uint32_t midx = GetMetricIdx(metricInfoList.infoEntries[index].code)-1; if (!dc->data->dataEntries[midx].type) { mContext->metric_values[i] = (long long)dc->data->dataEntries[start_idx + midx].value.ival; } else { mContext->metric_values[i] = (long long)dc->data->dataEntries[start_idx + midx].value.fpval; } } } mContext->num_reports = numReports; *events = mContext->metric_values; return ret; } static int intel_gpu_shutdown_thread( hwd_context_t *ctx ) { (void)ctx; GPUDEBUG("Entering intel_gpu_shutdown_thread\n" ); return PAPI_OK; } static int intel_gpu_shutdown_component(void) { GPUDEBUG("Entering intel_gpu_shutdown_component\n"); for (uint32_t 
i=0; idomain = domain; return PAPI_OK; } static int intel_gpu_ntv_enum_events(uint32_t *EventCode, int modifier ) { GPUDEBUG("Entering intel_gpu_ntv_enum_events\n"); int index = 0; if (!EventCode) { return PAPI_EINVAL; } switch ( modifier ) { case PAPI_ENUM_FIRST: *EventCode = 0; break; case PAPI_ENUM_EVENTS: index = GetIdx(*EventCode); uint32_t dev_code = GetDeviceCode(*EventCode); if ( (index < 0 ) || (index >= (total_metrics-1)) ) { return PAPI_ENOEVNT; } *EventCode = CreateIdxCode(dev_code, index+1); break; default: return PAPI_EINVAL; } return PAPI_OK; } static int intel_gpu_ntv_code_to_name( uint32_t EventCode, char *name, int len ) { GPUDEBUG("Entering intel_gpu_ntv_code_to_name\n"); int index = GetIdx(EventCode); if( ( index < 0 ) || ( index >= total_metrics ) || !name || !len ) { return PAPI_EINVAL; } memset(name, 0, len); strncpy_se(name, len, metricInfoList.infoEntries[index].name, strlen(metricInfoList.infoEntries[index].name)); return PAPI_OK; } static int intel_gpu_ntv_code_to_descr( uint32_t EventCode, char *desc, int len ) { GPUDEBUG("Entering intel_gpu_ntv_code_to_descr\n"); int index = GetIdx(EventCode); if( ( index < 0 ) || ( index >= total_metrics ) || !desc || !len) { return PAPI_EINVAL; } memset(desc, 0, len); strncpy_se(desc, len, metricInfoList.infoEntries[index].desc, MAX_STR_LEN-1); return PAPI_OK; } static int intel_gpu_ntv_name_to_code( const char *name, uint32_t *event_code) { GPUDEBUG("Entering intel_gpu_ntv_name_to_code\n"); if( !name || !event_code) { return PAPI_EINVAL; } for (int i=0; i= total_metrics ) || !info) { return PAPI_EINVAL; } info->event_code = EventCode; strncpy_se(info->symbol, PAPI_HUGE_STR_LEN, metricInfoList.infoEntries[index].name, MAX_STR_LEN-1); // short description could be truncated due to the longer string size of metric name strncpy_se(info->short_descr, PAPI_MIN_STR_LEN, metricInfoList.infoEntries[index].name, strlen(metricInfoList.infoEntries[index].name)); strncpy_se(info->long_descr, PAPI_HUGE_STR_LEN, 
metricInfoList.infoEntries[index].desc, MAX_STR_LEN-1); info->component_index = _intel_gpu_vector.cmp_info.CmpIdx; return PAPI_OK; } /* Our component vector */ papi_vector_t _intel_gpu_vector = { .cmp_info = { /* component information (unspecified values initialized to 0) */ .name = "intel_gpu", .short_name = "intel_gpu", .version = "1.0", .description = "Intel GPU performance metrics", .default_domain = PAPI_DOM_ALL, .available_domains = PAPI_DOM_USER | PAPI_DOM_KERNEL | PAPI_DOM_SUPERVISOR, .default_granularity = PAPI_GRN_SYS, .available_granularities = PAPI_GRN_SYS, .num_mpx_cntrs = GPU_MAX_COUNTERS, /* component specific cmp_info initializations */ .fast_virtual_timer = 0, .attach = 0, .attach_must_ptrace = 0, .cpu = 0, .inherit = 0, .cntr_umasks = 0, }, /* sizes of framework-opaque component-private structures */ .size = { .context = sizeof (MetricContext), .control_state = sizeof (MetricCtlState), .reg_value = sizeof ( int ), .reg_alloc = sizeof ( int ), }, /* function pointers in this component */ .init_thread = intel_gpu_init_thread, .init_component = intel_gpu_init_component, .init_control_state = intel_gpu_init_control_state, .update_control_state = intel_gpu_update_control_state, .start = intel_gpu_start, .stop = intel_gpu_stop, .read = intel_gpu_read, .shutdown_thread = intel_gpu_shutdown_thread, .shutdown_component = intel_gpu_shutdown_component, .ctl = intel_gpu_ctl, /* these are dummy implementation */ .set_domain = intel_gpu_set_domain, .reset = intel_gpu_reset, /* from counter name mapper */ .ntv_enum_events = intel_gpu_ntv_enum_events, .ntv_code_to_name = intel_gpu_ntv_code_to_name, .ntv_name_to_code = intel_gpu_ntv_name_to_code, .ntv_code_to_descr = intel_gpu_ntv_code_to_descr, .ntv_code_to_info = intel_gpu_ntv_code_to_info, }; papi-papi-7-2-0-t/src/components/intel_gpu/linux_intel_gpu_metrics.h000066400000000000000000000054741502707512200257200ustar00rootroot00000000000000/* * linux_intel_gpu_metrics.h: Intel® Graphics Processing Unit (GPU) 
Component for PAPI. * * Copyright (c) 2020 Intel Corp. All rights reserved * Contributed by Peinan Zhang * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
 */

#ifndef _INTEL_GPU_METRICS_H
#define _INTEL_GPU_METRICS_H

/* Headers required by PAPI */
#include "papi.h"
#include "papi_internal.h"
#include "papi_vector.h"
#include "papi_memory.h"

/* environment variables for changing the control */
#define METRICS_SAMPLING_PERIOD "METRICS_SAMPLING_PERIOD"      // setting sampling period
#define ENABLE_API_TRACING      "ZET_ENABLE_API_TRACING_EXP"   // for oneAPI Level0 V1.0 +
#define ENABLE_SUB_DEVICE       "ENABLE_SUB_DEVICE"

#define MINIMUM_SAMPLING_PERIOD 100000
#define DEFAULT_SAMPLING_PERIOD 400000

#define GPU_MAX_COUNTERS 54
#define GPU_MAX_METRICS 128

void (*_dl_non_dynamic_init) (void) __attribute__ ((weak));

typedef struct _metric_ctl_s {
    uint32_t interval;
    uint32_t metrics_type;
    int      mode;
    uint32_t loops;
    int      domain;
} MetricCtlState;

typedef struct _device_context_s {
    uint32_t device_code;
    uint32_t mgroup_code;
    uint32_t num_metrics;
    uint32_t metric_code[GPU_MAX_METRICS];
    DEVICE_HANDLE handle;
    uint32_t num_reports;
    uint32_t num_data_sets;
    uint32_t *data_set_sidx;
    uint32_t data_size;
    MetricData *data;
} DeviceContext;

typedef struct _metric_context_s {
    int cmp_id;
    int device_id;
    int domain;
    int thread_id;
    int data_avail;
    uint32_t num_metrics;
    uint32_t num_reports;
    uint32_t num_devices;
    uint32_t *active_sub_devices;
    uint32_t *active_devices;
    uint32_t metric_idx[GPU_MAX_METRICS];
    uint32_t dev_ctx_idx[GPU_MAX_METRICS];
    uint32_t subdev_idx[GPU_MAX_METRICS];
    long long metric_values[GPU_MAX_METRICS];
} MetricContext;

#endif /* _INTEL_GPU_METRICS_H */
papi-papi-7-2-0-t/src/components/intel_gpu/tests/000077500000000000000000000000001502707512200217445ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/intel_gpu/tests/Makefile000066400000000000000000000024431502707512200234070ustar00rootroot00000000000000NAME=intel_gpu

include ../../Makefile_comp_tests.target

CFLAGS += -DDEBUG -I../../../testlib

INTEL_L0_HEADERS ?=/usr/include
INTEL_L0_LIB64   ?=/usr/lib

CPPLDFLAGS+=-ldl
GPULIB=-L$(INTEL_L0_LIB64) -lze_loader

%.o:%.c
	$(CC) $(CFLAGS)
$(OPTFLAGS) $(INCLUDE) -g -fPIC -c -o $@ $< %.o:%.cc $(CXX) $(CFLAGS) $(CPPFLAGS) $(OPTFLAGS) $(INCLUDE) -I$(INTEL_L0_HEADERS) -DENABLE_PAPI -g -fPIC -c -o $@ $< TESTS = gpu_metric_read gpu_metric_list gpu_query_gemm gpu_gemm intel_gpu_tests: $(TESTS) gpu_metric_list: gpu_metric_list.o $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o gpu_metric_list gpu_metric_list.o $(PAPILIB) $(LDFLAGS) gpu_metric_read: gpu_metric_read.o gpu_common_utils.o $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o gpu_metric_read gpu_metric_read.o gpu_common_utils.o $(PAPILIB) $(LDFLAGS) gpu_gemm.o: gpu_query_gemm.cc $(CXX) $(CFLAGS) $(CPPFLAGS) $(OPTFLAGS) $(INCLUDE) -I$(INTEL_L0_HEADERS) -g -fPIC -c -o gpu_gemm.o gpu_query_gemm.cc gpu_gemm: gpu_gemm.o gpu_common_utils.o $(PAPILIB) $(CXX) $(CFLAGS) -o gpu_gemm gpu_gemm.o gpu_common_utils.o $(GPULIB) $(PAPILIB) $(CPPLDFLAGS) $(LDFLAGS) gpu_query_gemm: gpu_query_gemm.o gpu_common_utils.o $(PAPILIB) $(CXX) $(CFLAGS) -o gpu_query_gemm gpu_query_gemm.o gpu_common_utils.o $(GPULIB) $(PAPILIB) $(CPPLDFLAGS) $(LDFLAGS) clean: rm -f $(TESTS) *.o papi-papi-7-2-0-t/src/components/intel_gpu/tests/gemm.spv000066400000000000000000000101501502707512200234200ustar00rootroot00000000000000[binary SPIR-V kernel module; unprintable contents omitted]
papi-papi-7-2-0-t/src/components/intel_gpu/tests/gpu_common_utils.c000066400000000000000000000117141502707512200254770ustar00rootroot00000000000000/* Copyright (c) 2020 Intel Corp. All rights reserved * Contributed by Peinan Zhang * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* * This file contains utility functions.
*/ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include "papi.h" #include "gpu_common_utils.h" #if defined(__cplusplus) extern "C" { #endif const char *default_metrics[] = { "ComputeBasic.GpuTime", "ComputeBasic.GpuCoreClocks", "ComputeBasic.AvgGpuCoreFrequencyMHz", }; int num_default_metrics = 3; void parseMetricList(char *metric_list, InParams *param) { int size = 64; int index = 0; if (!metric_list) { param->num_metrics = num_default_metrics; param->metric_names = (char **)default_metrics; } else { char **metrics = (char **)calloc(size, sizeof(char *)); char *token = strtok(metric_list, ","); while (token) { if (index >= size) { size += 64; metrics = (char **)realloc(metrics, size * sizeof(char *)); } metrics[index++] = token; printf("metric[%d]: %s\n", index-1, metrics[index-1]); token = strtok(NULL, ","); } param->num_metrics = index; param->metric_names = metrics; } } int parseInputParam(int argc, char **argv, InParams *param) { int ch; int duration = 3; int loops = 1; int reset = 0; char *metric_list = NULL; char *app_targets = NULL; while ((ch=getopt(argc, argv, "d:l:e:t:m:s")) != -1) { switch(ch) { case 't': app_targets = optarg; break; case 'd': duration = atoi(optarg); if ((duration <= 0) || (duration > 3600)) { // max 3600 seconds printf("invalid input on duration [1, 3600], use default 3 sec.\n"); duration = 3; } break; case 'l': loops = atoi(optarg); if ((loops <= 0) || (loops > 0x100000)) { // max 1M printf("invalid input on loops [1, 1M], use default 1 loop.\n"); loops = 1; } break; case 's': reset = 1; break; case 'm': metric_list = strdup(optarg); break; default: return 1; } } param->duration = duration; param->loops = loops; param->reset = reset; param->app_dev = 0; param->app_tile = 0; if (app_targets) { char *str = app_targets; int i=0; if ((str[i]=='d') && (str[i+1] != '\0')) { param->app_dev = atoi(&str[++i]); while ((str[i] != '\0') && (str[i] >= '0') && (str[i] <= '9')) { // skip past the device-id digits i++; } } if ((str[i] != '\0') && (str[i] == 't') && (str[i+1] !=
'\0')) { param->app_tile = atoi(&str[i+1])+1; } } parseMetricList(metric_list, param); return 0; } int initPAPIGPUComp() { PAPI_component_info_t *aComponent = NULL; int cid = -1; // init all components including "intel_gpu" int retVal = PAPI_library_init( PAPI_VER_CURRENT ); if( retVal != PAPI_VER_CURRENT ) { fprintf( stderr, "PAPI_library_init failed\n" ); return -1; } int numComponents = PAPI_num_components(); int i = 0; for (i=0; i<numComponents; i++) { aComponent = (PAPI_component_info_t *)PAPI_get_component_info(i); if (aComponent == NULL) continue; if (strcmp(COMP_NAME, aComponent->name) == 0) { cid=i; // If we found our match, record it. } // end search components. } if (cid < 0) { fprintf(stderr, "Failed to find component [%s] in total %i supported components.\n", COMP_NAME, numComponents); PAPI_shutdown(); return -1; } return cid; } int initMetricSet(char **metric_names, int num_metrics, int *eventSet) { int retVal = PAPI_create_eventset(eventSet); if (retVal != PAPI_OK) { fprintf(stderr, "Error on PAPI_create_eventset, retVal %d\n", retVal); PAPI_shutdown(); return retVal; } for (int i=0; i * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* * This test case tests time-based data collection of Intel GPU performance metrics * * @brief Collect metric data for a certain time interval, with one or more loops. * By default, the metric data will aggregate over time for each loop. * When reset, each group reports metric data only for the specified time duration * * @option: * [-d

    *
  *   -u   Display output values as unsigned integers *
  *   -x   Display output values as hexadecimal *
  *   -h   Display help information about this utility. *
* * @section Bugs * There are no known bugs in this utility. * If you find a bug, it should be reported to the * PAPI Mailing List at . */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include "papi.h" #include <hip/hip_runtime.h> // Checks if HIP command (AMD) worked or not. #define HIPCHECK(cmd) \ { \ hipError_t error = cmd; \ if (error != hipSuccess) { \ fprintf(stderr, "error: '%s'(%d) at %s:%d\n", \ hipGetErrorString(error), error,__FILE__, __LINE__); \ exit(EXIT_FAILURE); \ } \ } //----------------------------------------------------------------------------- // HIP routine: Square each element in the array A and write to array C. //----------------------------------------------------------------------------- template <typename T> __global__ void vector_square(T *C_d, T *A_d, size_t N) { size_t offset = (blockIdx.x * blockDim.x + threadIdx.x); size_t stride = blockDim.x * gridDim.x ; for (size_t i=offset; i>4) & 0x000000000000ffff; // Extract minor. major = (startupValues[1]>>8) & 0x000000000000ffff; // Extract major. printf("%i AMD rocm_smi capable devices found. Library version %i:%i:%i.\n", NUMDevices, major, minor, patch); values = ( long long * ) malloc( sizeof ( long long ) * ( size_t ) argc ); // create reading space.
success = ( char * ) malloc( ( size_t ) argc ); if ( success == NULL || values == NULL ) { fprintf(stderr,"Error allocating memory!\n"); exit(1); } for ( num_events = 0, i = 1; i < argc; i++ ) { if ( !strcmp( argv[i], "-h" ) ) { print_help( argv ); exit( 1 ); } else if ( !strcmp( argv[i], "-u" ) ) { u_format = 1; } else if ( !strcmp( argv[i], "-x" ) ) { hex_format = 1; } else { if ( ( retval = PAPI_add_named_event( EventSet, argv[i] ) ) != PAPI_OK ) { printf( "Failed adding: %s\nbecause: %s\n", argv[i], PAPI_strerror(retval)); } else { success[num_events++] = i; printf( "Successfully added: %s\n", argv[i] ); } } } /* Automatically pass if no events, for run_tests.sh */ if ( num_events == 0 ) { printf("No events specified!\n"); printf("Specify events like rocm_smi:::device=0:mem_usage_VRAM rocm_smi:::device=0:pci_throughput_sent\n"); printf("Use papi/src/utils/papi_native_avail for a list of all events; search for 'rocm_smi:::'.\n"); return 0; } // ROCM Activity. printf( "\n" ); retval = PAPI_start( EventSet ); if (retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_start, retval=%i [%s].\n", retval, PAPI_strerror(retval) ); exit( retval ); } // ROCM skipped do_flops(), do_misses() in papi_command_line.c. for (k = 0; k < NUMDevices; k++ ) { // ROCM loop through devices. conductTest(k); // Do some GPU work on device 'k'. sleep(1); // .. sleep between reads to build up events. retval = PAPI_read( EventSet, values ); if (retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_read, retval=%i [%s].\n", retval, PAPI_strerror(retval) ); exit( retval ); } printf( "\n----------------------------------\n" ); for ( j = 0; j < num_events; j++ ) { // Back to original papi_command_line... i = success[j]; if (! 
(u_format || hex_format) ) { retval = PAPI_event_name_to_code( argv[i], &event ); if (retval == PAPI_OK) { retval = PAPI_get_event_info(event, &info); if (retval == PAPI_OK) data_type = info.data_type; else data_type = PAPI_DATATYPE_INT64; } switch (data_type) { case PAPI_DATATYPE_UINT64: printf( "%s : \t%llu(u)", argv[i], (unsigned long long)values[j] ); break; case PAPI_DATATYPE_FP64: printf( "%s : \t%0.3f", argv[i], *((double *)(&values[j])) ); break; case PAPI_DATATYPE_BIT64: printf( "%s : \t%#llX", argv[i], values[j] ); break; case PAPI_DATATYPE_INT64: default: printf( "%s : \t%lld", argv[i], values[j] ); break; } if (retval == PAPI_OK) printf( " %s", info.units ); printf( "\n" ); } if (u_format) printf( "%s : \t%llu(u)\n", argv[i], (unsigned long long)values[j] ); if (hex_format) printf( "%s : \t%#llX\n", argv[i], values[j] ); } } // end ROCM device loop. retval = PAPI_stop( EventSet, values ); // ROCM added stop and test. if (retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_stop, retval=%i [%s].\n", retval, PAPI_strerror(retval) ); exit( retval ); } return 0; } // end main. papi-papi-7-2-0-t/src/components/rocm_smi/tests/rocm_smi_all.cpp000066400000000000000000000742511502707512200247430ustar00rootroot00000000000000//----------------------------------------------------------------------------- // This program must be compiled using a special makefile: // make -f ROCM_SMI_Makefile rocm_smi_all.out //----------------------------------------------------------------------------- #define __HIP_PLATFORM_HCC__ #include #include #include #include "papi.h" #include #include #define CHECK(cmd) \ {\ hipError_t error = cmd;\ if (error != hipSuccess) { \ fprintf(stderr, "error: '%s'(%d) at %s:%d\n", hipGetErrorString(error), error,__FILE__, __LINE__); \ exit(EXIT_FAILURE);\ }\ } // THIS MACRO EXITS if the papi call does not return PAPI_OK. Do not use for routines that // return anything else; e.g. PAPI_num_components, PAPI_get_component_info, PAPI_library_init. 
#define CALL_PAPI_OK(papi_routine) \ do { \ int _papiret = papi_routine; \ if (_papiret != PAPI_OK) { \ fprintf(stderr, "%s:%d macro: PAPI Error: function " #papi_routine " failed with ret=%d [%s].\n", \ __FILE__, __LINE__, _papiret, PAPI_strerror(_papiret)); \ exit(-1); \ } \ } while (0); #define MEMORY_ALLOCATION_CALL(var) \ do { \ if (var == NULL) { \ fprintf(stderr, "%s:%d: Error: Memory Allocation Failed \n",\ __FILE__, __LINE__); \ exit(-1); \ } \ } while (0); #define MAX_DEVICES (32) #define BLOCK_SIZE (1024) #define GRID_SIZE (512) #define BUF_SIZE (32 * 1024) #define ALIGN_SIZE (8) #define SUCCESS (0) #define NUM_METRIC (18) #define NUM_EVENTS (2) #define MAX_SIZE (64*1024*1024) // 64 MB typedef union { long long ll; unsigned long long ull; double d; void *vp; unsigned char ch[8]; } convert_64_t; typedef struct { char name[128]; long long value; int flagged; } eventStore_t; int eventsFoundCount = 0; // occupants of the array. int eventsFoundMax; // Size of the array. int eventsFoundAdd = 32; // Blocksize for increasing the array. int deviceCount=0; // Total devices seen. int deviceEvents[32] = {0}; // Number of events for each device=??. int globalEvents = 0; // events without a "device=". eventStore_t *eventsFound = NULL; // The array. //----------------------------------------------------------------------------- // HIP routine: Square each element in the array A and write to array C. //----------------------------------------------------------------------------- template __global__ void vector_square(T *C_d, T *A_d, size_t N) { size_t offset = (blockIdx.x * blockDim.x + threadIdx.x); size_t stride = blockDim.x * gridDim.x ; for (size_t i=offset; i= eventsFoundMax) { // bump count, if too much, make room. eventsFoundMax += eventsFoundAdd; // Add. eventsFound = (eventStore_t*) realloc(eventsFound, eventsFoundMax*sizeof(eventStore_t)); // Make new room. memset(eventsFound+(eventsFoundMax-eventsFoundAdd), 0, eventsFoundAdd*sizeof(eventStore_t)); // zero it. 
} } // end routine. //----------------------------------------------------------------------------- // conduct a test using HIP. Derived from AMD sample code 'square.cpp'. // coming in, EventSet is already populated, we just run the test and read. // Note values must point at an array large enough to store the events in // Eventset. //----------------------------------------------------------------------------- void conductTest(int EventSet, int device, long long *values) { float *A_d, *C_d; float *A_h, *C_h; size_t N = 1000000; size_t Nbytes = N * sizeof(float); int ret, thisDev, verbose=0; ret = PAPI_start( EventSet ); if (ret != PAPI_OK ) { fprintf(stderr,"Error! PAPI_start\n"); exit( ret ); } hipDeviceProp_t props; if (verbose) fprintf(stderr, "args: EventSet=%i, device=%i, values=%p.\n", EventSet, device, values); CHECK(hipSetDevice(device)); // Set device requested. CHECK(hipGetDevice(&thisDev)); // Double check. CHECK(hipGetDeviceProperties(&props, thisDev)); // Get properties (for name). if (verbose) fprintf (stderr, "info: Requested Device=%i, running on device %i=%s\n", device, thisDev, props.name); if (verbose) fprintf (stderr, "info: allocate host mem (%6.2f MB)\n", 2*Nbytes/1024.0/1024.0); A_h = (float*)malloc(Nbytes); // standard malloc for host. CHECK(A_h == NULL ? hipErrorMemoryAllocation : hipSuccess ); C_h = (float*)malloc(Nbytes); // standard malloc for host. CHECK(C_h == NULL ? hipErrorMemoryAllocation : hipSuccess ); // Fill with Phi + i for (size_t i=0; iname) == 0) cid=i; // If we found our match, record it. } // end search components. if (cid < 0) { // if no PCP component found, fprintf(stderr, "Failed to find rocm_smi component among %i " "reported components.\n", k); FreeGlobals(); PAPI_shutdown(); exit(-1); } printf("Found ROCM_SMI Component at id %d\n", cid); // Add events at a GPU specific level ... eg rocm:::device=0:Whatever eventCount = 0; int eventsRead=0; // Begin enumeration of all events. 
printf("Events with numeric values were read; if they are zero, they may not \n" "be operational, or the exercises performed by this code do not affect \n" "them. We report all 'rocm' events presented by the rocm component. \n" "\n" "------------------------Event Name Found------------------------:---Value---\n"); PAPI_event_info_t info; // To get event enumeration info. m=PAPI_NATIVE_MASK; // Get the PAPI NATIVE mask. CALL_PAPI_OK(PAPI_enum_cmp_event(&m,PAPI_ENUM_FIRST,cid)); // Begin enumeration of ALL papi counters. do { // Enumerate all events. memset(&info,0,sizeof(PAPI_event_info_t)); // Clear event info. k=m; // Make a copy of current code. // enumerate sub-events, with masks. For this test, we do not // have any! But we do this to test our enumeration works as // expected. First time through is guaranteed, of course. do { // enumerate masked events. CALL_PAPI_OK(PAPI_get_event_info(k,&info)); // get name of k symbol. char *devstr = strstr(info.symbol, "device="); // look for device enumerator. if (devstr != NULL) { // If device specific, device=atoi(devstr+7); // Get the device id, for info. // fprintf(stderr, "Found rocm symbol '%s', device=%i.\n", info.symbol , device); if (device < 0 || device >= 32) continue; // skip any not in range. } else { // A few are system wide. // fprintf(stderr, "Found rocm symbol '%s'.\n", info.symbol); globalEvents++; // Add to global events. device=0; // Any device will do. } // Filter for strings being returned. int isString = 0; if (strstr(info.symbol, "device_brand:") != NULL) isString=1; if (strstr(info.symbol, "device_name:") != NULL) isString=1; if (strstr(info.symbol, "device_serial_number:") != NULL) isString=1; if (strstr(info.symbol, "device_subsystem_name:") != NULL) isString=1; if (strstr(info.symbol, "vbios_version:") != NULL) isString=1; if (strstr(info.symbol, "vendor_name:") != NULL) isString=1; if (strstr(info.symbol, "driver_version_str:") != NULL) isString=1; // Filter out crashers. 
if (strstr(info.symbol, "temp_current:device=0:sensor=3") != NULL) continue; if (strstr(info.symbol, "temp_critical:device=0:sensor=3") != NULL) continue; if (strstr(info.symbol, "temp_critical_hyst:device=0:sensor=3") != NULL) continue; if (strstr(info.symbol, "temp_emergency:device=0:sensor=3") != NULL) continue; if (strstr(info.symbol, "temp_emergency:device=0:sensor=3") != NULL) continue; if (strstr(info.symbol, "temp_current:device=1:sensor=3") != NULL) continue; if (strstr(info.symbol, "temp_critical:device=1:sensor=3") != NULL) continue; if (strstr(info.symbol, "temp_critical_hyst:device=1:sensor=3") != NULL) continue; if (strstr(info.symbol, "temp_emergency:device=1:sensor=3") != NULL) continue; if (strstr(info.symbol, "temp_emergency:device=1:sensor=3") != NULL) continue; CALL_PAPI_OK(PAPI_create_eventset(&EventSet)); CALL_PAPI_OK(PAPI_assign_eventset_component(EventSet, cid)); ret = PAPI_add_named_event(EventSet, info.symbol); // Don't want to fail program if name not found... if(ret == PAPI_OK) { eventCount++; // Bump number of events we could test. if (deviceEvents[device] == 0) deviceCount++; // Increase count of devices if first for this device. deviceEvents[device]++; // Add to count of events on this device. } else { fprintf(stderr, "FAILED to add event '%s', ret=%i='%s'.\n", info.symbol, ret, PAPI_strerror(ret)); CALL_PAPI_OK(PAPI_cleanup_eventset(EventSet)); // Delete all events in set. CALL_PAPI_OK(PAPI_destroy_eventset(&EventSet)); // destroy the event set. continue; } long long value=0; // The only value we read. // Prep stuff. fprintf(stderr, "conductTest on single event: %s.\n", info.symbol); conductTest(EventSet, device, &value); // Conduct a test, on device given. addEventsFound(info.symbol, value); // Add to events we were able to read. CALL_PAPI_OK(PAPI_cleanup_eventset(EventSet)); // Delete all events in set. CALL_PAPI_OK(PAPI_destroy_eventset(&EventSet)); // destroy the event set. // report each event counted. eventsRead++; // .. 
count and report. if (value == 0) { printf("%-64s: %lli (perhaps not exercised by current test code.)\n", info.symbol, value); } else { if (isString) printf("%-64s: %-64s\n", info.symbol, ((char*) value)); else printf("%-64s: %lli\n", info.symbol, value); } } while(PAPI_enum_cmp_event(&k,PAPI_NTV_ENUM_UMASKS,cid)==PAPI_OK); // Get next umask entry (bits different) (should return PAPI_NOEVNT). } while(PAPI_enum_cmp_event(&m,PAPI_ENUM_EVENTS,cid)==PAPI_OK); // Get next event code. // fprintf(stderr, "%s:%i Finished Event Loops.\n", __FILE__, __LINE__); if (eventCount < 1) { // If we failed on all of them, fprintf(stderr, "Unable to add any ROCM events; they are not present in the component.\n"); fprintf(stderr, "Unable to proceed with this test.\n"); FreeGlobals(); PAPI_shutdown(); // Returns no value. exit(-1); // exit no matter what. } if (eventsRead < 1) { // If failed to read any, fprintf(stderr, "\nFailed to read any ROCM events.\n"); // report a failure. fprintf(stderr, "Unable to proceed with pair testing.\n"); FreeGlobals(); PAPI_shutdown(); // Returns no value. exit(-1); // exit no matter what. } printf("\nTotal ROCM events identified: %i.\n\n", eventsFoundCount); // EARLY SHUT DOWN. // PAPI_shutdown(); // return(0); // Next section is pair testing information. if (eventsFoundCount < 2) { // If failed to get counts on any, printf("Insufficient events are exercised by the current test code to perform pair testing.\n"); // report a failure. FreeGlobals(); PAPI_shutdown(); // Returns no value. exit(0); // exit no matter what. } for (i=0; i<32; i++) { if (deviceEvents[i] == 0) continue; // skip if none found. if (i==0 && globalEvents >0) { printf("Device %i assigned %i events (%i of which are not device specific). %i potential pairings for this device.\n", i, deviceEvents[i], globalEvents, deviceEvents[i]*(deviceEvents[i]-1)/2); } else { printf("Device %i assigned %i events. 
%i potential pairings for this device.\n", i, deviceEvents[i], deviceEvents[i]*(deviceEvents[i]-1)/2); } } // Begin pair testing. We consider every possible pairing of events // that, tested alone, returned a value greater than zero. // fprintf(stderr, "Begin Pair Testing.\n"); int mainEvent, pairEvent, mainDevice, pairDevice; long long readValues[2]; int goodOnSame=0, failOnDiff=0, badSameCombo=0, pairProblems=0; // Some counters. int type; // 0 succeed on same device, 1 = fail across devices. for (type=0; type<2; type++) { if (type == 0) { printf("List of Pairings on SAME device:\n"); printf("* means value changed by more than 10%% when paired (vs measured singly, above).\n"); printf("^ means a pair was rejected as an invalid combo.\n"); } else { printf("List of Pairings causing an error when on DIFFERENT devices:\n"); } for (mainEvent = 0; mainEvent 1.10) { // Flag as significantly different for main. flag1='*'; eventsFound[mainEvent].flagged = 1; // .. remember this event is suspect. } if (pairCheck < 0.90 || pairCheck > 1.10) { // Flag as significantly different for pair. flag2='*'; eventsFound[pairEvent].flagged = 1; // .. remember this event is suspect. } if (flag1 == '*' || flag2 == '*') { pairProblems++; // Remember number of problems. flag = '*'; // set global flag. } printf("%c %64s + %-64s [", flag, eventsFound[mainEvent].name, eventsFound[pairEvent].name); if (flag1 == '*') printf("%c%lli (vs %lli),", flag1, readValues[0], eventsFound[mainEvent].value); else printf("%c%lli,", flag1, readValues[0]); if (flag2 == '*') printf("%c%lli (vs %lli)]\n", flag2, readValues[1], eventsFound[pairEvent].value); else printf("%c%lli]\n", flag2, readValues[1]); CALL_PAPI_OK(PAPI_cleanup_eventset(EventSet)); // Delete all events in set. CALL_PAPI_OK(PAPI_destroy_eventset(&EventSet)); // destroy the event set. } // end for each possible pairing event. } // end loop for each possible primary event. 
if (type == 0) { // For good pairings on same devices, if (goodOnSame == 0) { printf("NO valid pairings of above events if both on the SAME device.\n"); } else { printf("%i valid pairings of above events if both on the SAME device.\n", goodOnSame); } printf("%i unique pairings on SAME device were rejected as bad combinations.\n", badSameCombo); if (pairProblems > 0) { printf("%i pairings resulted in a change of one or both event values > 10%%.\n", pairProblems); printf("The following events were changed by pairing:\n"); for (mainEvent = 0; mainEvent #include #include #include #include "papi.h" #include #include "rocm_smi.h" #include "force_init.h" // Helper Function void write_papi_event(int cid, const char* event_name, long long value_to_write); void read_and_print_current_values(int cid, const char* perf_name, const char* pcap_name, const char* fan_name, const char* pcap_max_name, const char* fan_max_name, const char* stage_label); #define CHECK(cmd) \ { \ hipError_t error = cmd; \ if (error != hipSuccess) { \ fprintf(stderr, "error: '%s'(%d) at %s:%d\n", hipGetErrorString(error), error,__FILE__, __LINE__); \ exit(EXIT_FAILURE); \ } \ } // THIS MACRO EXITS if the papi call does not return PAPI_OK. #define CALL_PAPI_OK(papi_routine) \ do { \ int _papiret = papi_routine; \ if (_papiret != PAPI_OK) { \ fprintf(stderr, "%s:%d macro: PAPI Error: function " #papi_routine " failed with ret=%d [%s].\n", \ __FILE__, __LINE__, _papiret, PAPI_strerror(_papiret)); \ exit(-1); \ } \ } while (0); // Show help. 
//----------------------------------------------------------------------------- static void printUsage() { printf("Demonstrate use of ROCM API write routines.\n"); printf("This program will use PAPI to read ROCm SMI values, attempt to write\n"); printf("modified values for perf_level, power_cap, and fan_speed (for device 0),\n"); printf("read them back, revert them to original values, and read again.\n"); printf("Requires necessary permissions to write ROCm SMI values.\n"); printf("Compile with: make -f ROCM_SMI_Makefile rocm_smi_writeTests.out\n"); } //----------------------------------------------------------------------------- // Interpret command line flags. //----------------------------------------------------------------------------- void parseCommandLineArgs(int argc, char *argv[]) { int i; for (i = 1; i < argc; ++i) { if ((strcmp(argv[i], "--help") == 0) || (strcmp(argv[i], "-help") == 0) || (strcmp(argv[i], "-h") == 0)) { printUsage(); exit(0); } } } //----------------------------------------------------------------------------- // Main program. 
//----------------------------------------------------------------------------- int main(int argc, char *argv[]) { int devices; int i = 0; int r; parseCommandLineArgs(argc, argv); int ret; int k, cid = -1; // PAPI Initialization ret = PAPI_library_init(PAPI_VER_CURRENT); if (ret != PAPI_VER_CURRENT) { fprintf(stderr, "PAPI_library_init failed, ret=%i [%s]\n", ret, PAPI_strerror(ret)); exit(-1); } printf("PAPI version: %d.%d.%d\n", PAPI_VERSION_MAJOR(PAPI_VERSION), PAPI_VERSION_MINOR(PAPI_VERSION), PAPI_VERSION_REVISION(PAPI_VERSION)); fflush(stdout); // Find rocm_smi component k = PAPI_num_components(); for (i = 0; i < k && cid < 0; i++) { const PAPI_component_info_t *aComponent = PAPI_get_component_info(i); if (aComponent && strcmp("rocm_smi", aComponent->name) == 0) cid = i; } if (cid < 0) { fprintf(stderr, "Failed to find rocm_smi component.\n"); PAPI_shutdown(); exit(-1); } printf("Found ROCM_SMI Component at id %d\n", cid); // Force Init force_rocm_smi_init(cid); // Get Device Count { int tempEventSet = PAPI_NULL; long long numDevValue = 0; CALL_PAPI_OK(PAPI_create_eventset(&tempEventSet)); CALL_PAPI_OK(PAPI_assign_eventset_component(tempEventSet, cid)); ret = PAPI_add_named_event(tempEventSet, "rocm_smi:::NUMDevices"); if (ret == PAPI_OK) { CALL_PAPI_OK(PAPI_start(tempEventSet)); CALL_PAPI_OK(PAPI_stop(tempEventSet, &numDevValue)); devices = (int)numDevValue; printf("Found %d devices.\n", devices); } else { fprintf(stderr, "FAILED to add NUMDevices event.\n"); CALL_PAPI_OK(PAPI_cleanup_eventset(tempEventSet)); CALL_PAPI_OK(PAPI_destroy_eventset(&tempEventSet)); exit(-1); } CALL_PAPI_OK(PAPI_cleanup_eventset(tempEventSet)); CALL_PAPI_OK(PAPI_destroy_eventset(&tempEventSet)); } // Handle no devices if (devices < 1) { fprintf(stderr, "No ROCm devices found.\n"); PAPI_shutdown(); exit(0); } long long initial_perf_level = -1; long long initial_power_cap = -1; long long initial_fan_speed = -1; long long power_cap_range_max_val = -1; long long fan_speed_max_val = 
-1; char perf_level_event_name[PAPI_MAX_STR_LEN] = ""; char power_cap_event_name[PAPI_MAX_STR_LEN] = ""; char fan_speed_event_name[PAPI_MAX_STR_LEN] = ""; char power_cap_range_max_event_name[PAPI_MAX_STR_LEN] = ""; char fan_speed_max_event_name[PAPI_MAX_STR_LEN] = ""; bool can_write_perf = false; bool can_write_pcap = false; bool can_write_fan = false; long long new_perf_level = -1; long long new_power_cap = -1; long long new_fan_speed = -1; // ---- Initial Read ---- printf("\n--- Initial Read: Finding events and getting base values ---\n"); const char* target_substrings[] = { "perf_level", "power_cap:", "power_cap_range_max", "fan_speed:", "fan_speed_max" }; const int num_target_substrings = sizeof(target_substrings) / sizeof(target_substrings[0]); const int MAX_ROCM_EVENTS = 512; char event_names[MAX_ROCM_EVENTS][PAPI_MAX_STR_LEN]; long long *rocm_values = NULL; int num_rocm_events = 0; int event_code = PAPI_NATIVE_MASK; char current_event_name[PAPI_MAX_STR_LEN]; int readEventSet = PAPI_NULL; CALL_PAPI_OK(PAPI_create_eventset(&readEventSet)); CALL_PAPI_OK(PAPI_assign_eventset_component(readEventSet, cid)); printf("Enumerating events to find targets (device=0, sensor=0 where applicable) for initial read...\n"); r = PAPI_enum_cmp_event(&event_code, PAPI_ENUM_FIRST, cid); while (r == PAPI_OK) { ret = PAPI_event_code_to_name(event_code, current_event_name); if (ret != PAPI_OK) { fprintf(stderr, "Warning: PAPI_event_code_to_name failed for code %#x: %s\n", event_code, PAPI_strerror(ret)); r = PAPI_enum_cmp_event(&event_code, PAPI_ENUM_EVENTS, cid); continue; } bool is_target = false; const char* matched_substring = NULL; for (i = 0; i < num_target_substrings; ++i) { if (strstr(current_event_name, target_substrings[i]) != NULL) { bool device_match = (strstr(current_event_name, ":device=0") != NULL); if (strcmp(target_substrings[i],"perf_level") == 0) { if (device_match) { is_target = true; matched_substring = target_substrings[i]; break; } } else { bool sensor_match = 
(strstr(current_event_name, ":sensor=0") != NULL); if (device_match && sensor_match) { is_target = true; matched_substring = target_substrings[i]; break; } else if (device_match && strstr(current_event_name, ":sensor=") == NULL){ if (strcmp(target_substrings[i],"power_cap:")==0 || strcmp(target_substrings[i],"fan_speed:")==0) { printf(" Warning: Matched '%s' for device 0 but no sensor specified: %s\n", target_substrings[i], current_event_name); is_target = true; matched_substring = target_substrings[i]; break; } } } } } if (is_target) { if (num_rocm_events < MAX_ROCM_EVENTS) { ret = PAPI_add_event(readEventSet, event_code); if (ret == PAPI_OK) { printf(" Adding event (matched '%s'): %s\n", matched_substring, current_event_name); strncpy(event_names[num_rocm_events], current_event_name, PAPI_MAX_STR_LEN - 1); event_names[num_rocm_events][PAPI_MAX_STR_LEN - 1] = '\0'; num_rocm_events++; } else { fprintf(stderr, " Warning: Failed to add event %s: %s\n", current_event_name, PAPI_strerror(ret)); if(ret==PAPI_ENOMEM) break; } } else { fprintf(stderr, "Error: Exceeded MAX_ROCM_EVENTS.\n"); break; } } r = PAPI_enum_cmp_event(&event_code, PAPI_ENUM_EVENTS, cid); } printf("Added %d events for initial read.\n", num_rocm_events); if (num_rocm_events > 0) { rocm_values = (long long *)calloc(num_rocm_events, sizeof(long long)); if (!rocm_values) { /* Handle error */ exit(-1); } CALL_PAPI_OK(PAPI_start(readEventSet)); CALL_PAPI_OK(PAPI_stop(readEventSet, rocm_values)); printf("\n--- Extracting Initial Values and Event Names ---\n"); for (i = 0; i < num_rocm_events; ++i) { printf(" Read Event %d: %-60s = %lld\n", i, event_names[i], rocm_values[i]); if (strstr(event_names[i], "power_cap_range_max") != NULL && strstr(event_names[i], ":device=0") != NULL) { power_cap_range_max_val = rocm_values[i]; strncpy(power_cap_range_max_event_name, event_names[i], PAPI_MAX_STR_LEN - 1); power_cap_range_max_event_name[PAPI_MAX_STR_LEN - 1] = '\0'; } else if (strstr(event_names[i], 
"fan_speed_max") != NULL && strstr(event_names[i], ":device=0") != NULL) { fan_speed_max_val = rocm_values[i]; strncpy(fan_speed_max_event_name, event_names[i], PAPI_MAX_STR_LEN - 1); fan_speed_max_event_name[PAPI_MAX_STR_LEN - 1] = '\0'; } else if (strstr(event_names[i], "perf_level") != NULL && strstr(event_names[i], ":device=0") != NULL) { initial_perf_level = rocm_values[i]; strncpy(perf_level_event_name, event_names[i], PAPI_MAX_STR_LEN - 1); perf_level_event_name[PAPI_MAX_STR_LEN - 1] = '\0'; } else if (strstr(event_names[i], "power_cap:") != NULL && strstr(event_names[i], "power_cap_range_max") == NULL && strstr(event_names[i], ":device=0") != NULL) { initial_power_cap = rocm_values[i]; strncpy(power_cap_event_name, event_names[i], PAPI_MAX_STR_LEN - 1); power_cap_event_name[PAPI_MAX_STR_LEN - 1] = '\0'; } else if (strstr(event_names[i], "fan_speed:") != NULL && strstr(event_names[i], "fan_speed_max") == NULL && strstr(event_names[i], ":device=0") != NULL) { initial_fan_speed = rocm_values[i]; strncpy(fan_speed_event_name, event_names[i], PAPI_MAX_STR_LEN - 1); fan_speed_event_name[PAPI_MAX_STR_LEN - 1] = '\0'; } } free(rocm_values); rocm_values = NULL; } else { printf("No target events found for initial read. 
Skipping write tests.\n"); goto cleanup_and_exit; } // Cleanup the initial read EventSet - Pass address to destroy CALL_PAPI_OK(PAPI_cleanup_eventset(readEventSet)); CALL_PAPI_OK(PAPI_destroy_eventset(&readEventSet)); // Pass address readEventSet = PAPI_NULL; // ---- Stage 1: Calculate and Write NEW Values ---- printf("\n=== Stage 1: Calculating and Writing NEW Values ===\n"); can_write_perf = (initial_perf_level != -1 && strcmp(perf_level_event_name, "") != 0); can_write_pcap = (initial_power_cap != -1 && power_cap_range_max_val != -1 && strcmp(power_cap_event_name, "") != 0); can_write_fan = (initial_fan_speed != -1 && strcmp(fan_speed_event_name, "") != 0); if (can_write_perf) { new_perf_level = initial_perf_level + 1; // Example: Increment perf level printf(" Calculating new perf_level: %lld + 1 = %lld\n", initial_perf_level, new_perf_level); write_papi_event(cid, perf_level_event_name, new_perf_level); } else { printf("Skipping perf_level write (initial value/name not found or invalid).\n"); } if (can_write_pcap) { new_power_cap = power_cap_range_max_val - 1000000; // Example: 1W below max if (new_power_cap < 0) { new_power_cap = initial_power_cap; } // Basic sanity check printf(" Calculating new power_cap: %lld uW - 1000000 uW = %lld uW\n", power_cap_range_max_val, new_power_cap); write_papi_event(cid, power_cap_event_name, new_power_cap); } else { printf("Skipping power_cap write (initial value/name/max not found or invalid).\n"); } if (can_write_fan) { new_fan_speed = fan_speed_max_val - 1; // Example: Decrease fan speed slightly if (new_fan_speed < 0) { new_fan_speed = 0; } // Basic sanity check (min speed 0?) 
printf(" Calculating new fan_speed: %lld - 1 = %lld\n", fan_speed_max_val, new_fan_speed); write_papi_event(cid, fan_speed_event_name, new_fan_speed); } else { printf("Skipping fan_speed write (initial value/name not found or invalid).\n"); } // ---- Stage 2: Read values AFTER writing NEW ones ---- printf("\n=== Stage 2: Verifying NEW Values ===\n"); read_and_print_current_values(cid, perf_level_event_name, power_cap_event_name, fan_speed_event_name, power_cap_range_max_event_name, fan_speed_max_event_name, "After Writing New Values"); // ---- Stage 3: Write INITIAL values back (Revert) ---- printf("\n=== Stage 3: Reverting to INITIAL Values ===\n"); if (can_write_perf) { write_papi_event(cid, perf_level_event_name, initial_perf_level); } else { printf("Skipping perf_level revert.\n"); } if (can_write_pcap) { write_papi_event(cid, power_cap_event_name, initial_power_cap); } else { printf("Skipping power_cap revert.\n"); } if (can_write_fan) { write_papi_event(cid, fan_speed_event_name, initial_fan_speed); } else { printf("Skipping fan_speed revert.\n"); } // ---- Stage 4: Read values AFTER reverting ---- printf("\n=== Stage 4: Verifying REVERTED Values ===\n"); read_and_print_current_values(cid, perf_level_event_name, power_cap_event_name, fan_speed_event_name, power_cap_range_max_event_name, fan_speed_max_event_name, "After Reverting to Initial Values"); // ---- Cleanup and Exit ---- cleanup_and_exit: printf("\n--- Write/Revert Test Sequence Finished ---\n"); if (readEventSet != PAPI_NULL) { // Check if cleanup needed after jump printf("Performing cleanup for initial read EventSet after jump...\n"); CALL_PAPI_OK(PAPI_cleanup_eventset(readEventSet)); CALL_PAPI_OK(PAPI_destroy_eventset(&readEventSet)); // Pass address } printf("Finished All Tests.\n"); PAPI_shutdown(); return(0); } // end MAIN. 
// ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ // +++ Helper Function Definitions ++++++++++++++++++++++++++++++++++++++++++++ // ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ //----------------------------------------------------------------------------- // Helper to write a single value to a specific PAPI event name. C version. //----------------------------------------------------------------------------- void write_papi_event(int cid, const char* event_name, long long value_to_write) { printf(" Attempting Write: Set '%s' = %lld\n", event_name, value_to_write); if (event_name == NULL || strcmp(event_name, "") == 0) { /* Handle error */ return; } int writeEventSet = PAPI_NULL; int ret; long long read_back_value; long long write_buffer[1]; write_buffer[0] = value_to_write; CALL_PAPI_OK(PAPI_create_eventset(&writeEventSet)); // Pass address CALL_PAPI_OK(PAPI_assign_eventset_component(writeEventSet, cid)); ret = PAPI_add_named_event(writeEventSet, event_name); if (ret != PAPI_OK) { fprintf(stderr, " Error: FAILED to add event '%s' for writing, ret=%d [%s]. 
Skipping write.\n", event_name, ret, PAPI_strerror(ret)); CALL_PAPI_OK(PAPI_cleanup_eventset(writeEventSet)); CALL_PAPI_OK(PAPI_destroy_eventset(&writeEventSet)); // Pass address return; } CALL_PAPI_OK(PAPI_start(writeEventSet)); ret = PAPI_write(writeEventSet, write_buffer); if (ret != PAPI_OK) { fprintf(stderr, " Error: PAPI_write FAILED for event '%s' with value %lld, ret=%d [%s].\n", event_name, value_to_write, ret, PAPI_strerror(ret)); int stop_ret = PAPI_stop(writeEventSet, &read_back_value); if (stop_ret != PAPI_OK) { fprintf(stderr, " Warning: PAPI_stop after failed PAPI_write also failed: %s\n", PAPI_strerror(stop_ret)); } } else { printf(" PAPI_write call succeeded for '%s' = %lld.\n", event_name, value_to_write); CALL_PAPI_OK(PAPI_stop(writeEventSet, &read_back_value)); printf(" Read back value immediately after write: %lld\n", read_back_value); if (read_back_value != value_to_write) { printf(" Warning: Read-back value (%lld) does not match written value (%lld).\n", read_back_value, value_to_write); } } CALL_PAPI_OK(PAPI_cleanup_eventset(writeEventSet)); CALL_PAPI_OK(PAPI_destroy_eventset(&writeEventSet)); // Pass address printf(" Write attempt finished for '%s'.\n", event_name); } //----------------------------------------------------------------------------- // Helper to read the set of relevant metrics (passed by name) and print them. C version. 
//-----------------------------------------------------------------------------
#define MAX_EVENTS_TO_READ 10 // Max number of events this function will read at once
void read_and_print_current_values(int cid, const char* perf_name, const char* pcap_name,
                                   const char* fan_name, const char* pcap_max_name,
                                   const char* fan_max_name, const char* stage_label) {
    printf(" Reading Values [%s] for Verification <--\n", stage_label);
    int readSet = PAPI_NULL;
    int ret;
    char events_to_read[MAX_EVENTS_TO_READ][PAPI_MAX_STR_LEN];
    char event_short_names[MAX_EVENTS_TO_READ][50];
    bool added_flags[MAX_EVENTS_TO_READ];
    int read_count = 0;
    int i;
    memset(events_to_read, 0, sizeof(events_to_read));
    memset(event_short_names, 0, sizeof(event_short_names));
    for (i = 0; i < MAX_EVENTS_TO_READ; ++i) added_flags[i] = false;

    // Collect the event names we were given; skip any that were never found.
    const char* names[]  = { perf_name, pcap_name, fan_name, pcap_max_name, fan_max_name };
    const char* labels[] = { "perf_level", "power_cap", "fan_speed",
                             "power_cap_range_max", "fan_speed_max" };
    for (i = 0; i < 5 && read_count < MAX_EVENTS_TO_READ; ++i) {
        if (names[i] != NULL && strcmp(names[i], "") != 0) {
            strncpy(events_to_read[read_count], names[i], PAPI_MAX_STR_LEN - 1);
            strncpy(event_short_names[read_count], labels[i], sizeof(event_short_names[0]) - 1);
            read_count++;
        }
    }

    CALL_PAPI_OK(PAPI_create_eventset(&readSet));
    CALL_PAPI_OK(PAPI_assign_eventset_component(readSet, cid));

    int added_count = 0;
    for (i = 0; i < read_count; ++i) {
        ret = PAPI_add_named_event(readSet, events_to_read[i]);
        added_flags[i] = (ret == PAPI_OK);
        if (added_flags[i]) {
            added_count++;
        } else {
            fprintf(stderr, " Warning: could not add '%s' for reading: %s\n",
                    events_to_read[i], PAPI_strerror(ret));
        }
    }

    long long *values = (long long *) calloc(read_count > 0 ? read_count : 1, sizeof(long long));
    if (values == NULL) { fprintf(stderr, " Error: calloc failed.\n"); exit(-1); }

    if (added_count > 0) {
        CALL_PAPI_OK(PAPI_start(readSet));
        ret = PAPI_stop(readSet, values);
        if (ret != PAPI_OK) {
            fprintf(stderr, " Error: PAPI_stop failed during read: %s\n", PAPI_strerror(ret));
            printf(" Current System Values (PAPI_stop failed, results may be inaccurate):\n");
        } else {
            printf(" Current System Values:\n");
        }
        int value_idx = 0;
        for (i = 0; i < read_count; ++i) {
            if (added_flags[i]) {
                printf(" %-20s (%s): %lld\n", event_short_names[i], events_to_read[i],
                       (ret == PAPI_OK) ? values[value_idx] : -999);
                value_idx++;
            } else {
                printf(" %-20s (%s): [Read Skipped - Add Failed]\n",
                       event_short_names[i], events_to_read[i]);
            }
        }
    } else {
        printf(" No events were successfully added to the EventSet for reading.\n");
    }
    free(values);
    CALL_PAPI_OK(PAPI_cleanup_eventset(readSet));
    CALL_PAPI_OK(PAPI_destroy_eventset(&readSet));
    printf(" Finished Reading [%s] --\n", stage_label);
}
papi-papi-7-2-0-t/src/components/rocm_smi/tests/rocmsmi_example.cpp
//-----------------------------------------------------------------------------
// rocmsmi_example.cpp is a minimal example of using PAPI with rocm_smi.
// SMI is for "System Management Interface"; it provides information on hardware
// sensors such as fan speed, temperature, and power consumption.
//
// Unfortunately, the power consumption is a "spot" reading, and most users
// want a reading *during* the execution of a function, not just before or
// after a call to the GPU kernel returns, when the GPU will likely be at or
// near its idling power.
//
// In this example, we show how to call a function from rocblas (the sgemm
// routine, a single-precision general matrix multiply) while using a separate
// pthread to sample and record time and power while it runs.
//
// This is intended as a simple example upon which programmers can expand; for
// a more comprehensive approach see power_monitor_rocm.cpp, which can deal with
// multiple GPUs and allows power-capping and other output control. It is in
// this same directory. power_monitor_rocm is a standalone code, run in the
// background to monitor another application (two processes). (On some clusters
// you must ensure the GPU *can* be shared by two processes simultaneously.)
// A separate process has the advantage of being able to monitor library code
// and other application code that has no PAPI instrumentation code in it. The
// pthread approach coded here has the advantage of being a single executable
// and more flexible; for example, you can incorporate other elements into your
// output, such as PAPI event values in the timed outputs, or output labels
// to indicate processing landmarks. For example, with multiple PAPI event
// sets, we could also read device temperature with every sample, or memory
// usage or cache statistics, or even I/O bandwidth statistics on devices other
// than the GPU, and output those stats for the same time steps as our power
// consumption.
//
// rocblas is generally part of the installed package from AMD, in
// $PAPI_ROCM_ROOT/rocblas/, with subdirectories /lib and /include.
// We don't use HipBlas (also included by AMD); HipBlas is a higher-level
// "switch" that calls either cuBlas or rocBlas. This would be an unnecessary
// complication for this example.
//
// The corresponding Makefile is also instructional. An advantage of rocblas
// is that it is also a "switch", automatically detecting the hardware and
// using the appropriate tuned code for it.
// > make rocmsmi_example
//
// This code is intentionally heavily commented to be instructional. We use
// the library code to exercise the GPU with a realistic workload instead of a
// toy kernel. For examples of how to include your own kernels, see the
// distribution directory $PAPI_ROCM_ROOT/hip/samples/, which contains
// sub-directories with working applications.
//
// To compile, the environment variable PAPI_ROCM_ROOT must be defined to point
// at a rocm directory. No other environment variables are necessary, but if
// you wish to use ROCM events or events from other components, check the
// appropriate README.md files for instructions. These will be found in
// papi/src/components/COMPONENT_NAME/README.md files.
//
// Because this program uses AMD HIP functions to manage memory on the GPU, it
// must be compiled with the hipcc compiler. Typically this is found in:
// $PAPI_ROCM_ROOT/bin/hipcc
// hipcc is a c++ compiler, so c++ conventions for strings must be followed.
// (PAPI does not require c++; it is simple C; but PAPI++ will require c++.)
//
// Note for clusters: Many clusters have "head nodes" (aka "login nodes") that
// do not contain any GPUs; the head node is used for compiling, but the code
// is actually run on a batch node (e.g. using SLURM and srun). Ensure that any
// code which must access a GPU, including our utilities papi_component_avail
// and papi_native_avail and this example code, is run on a batch node, not the
// head node.
//-----------------------------------------------------------------------------

// Necessary to specify a platform for the hipcc compiler.
#define __HIP_PLATFORM_HCC__

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>
#include <time.h>
#include <hip/hip_runtime.h>
#include "papi.h"
#include "rocblas.h"
#include <pthread.h>

// This is a structure to hold variables for our work example to exercise the
// GPU. The work is an SGEMM, using AMD's rocblas library. PrepWork() fills in
// this structure.
typedef struct rb_sgemm_parms {
    rocblas_handle handle;
    hipStream_t hipStream;
    rocblas_status rb_ret;
    rocblas_operation transA;
    rocblas_operation transB;
    rocblas_int m;
    rocblas_int n;
    rocblas_int k;
    float* alpha;
    float* A;    // Host side.
    float* A_d;  // Device side.
    rocblas_int lda;
    float* B;    // Host side.
    float* B_d;  // Device side.
    rocblas_int ldb;
    float* beta;
    float* C;    // Host side.
    float* C_d;  // Device side.
    rocblas_int ldc;
} rb_sgemm_parms_t;

enum {
    samplingCommand_wait   = -1,
    samplingCommand_record =  0,
    samplingCommand_exit   =  1
};

typedef struct samplingOrders {
    // To avoid any multi-thread race conditions:
    // The following variables are only written by Main, not by the sampling Thread.
    int EventSet;            // The events to read.
    volatile int command;    // -1=wait, 0=sample and record, 1=exit.
    int maxSamples;          // extent of array.
    // The following variables are only written by the Thread, not by Main.
    volatile int sampleIdx;  // next sample to store.
    long long *samples;      // array of [2*maxSamples] allocated by caller. Only first event is stored.
    // Note each sample takes two slots; for nanosecond time stamp and event value.
    long long *eventsRead;   // array long enough to read all events in EventSet, allocated by caller.
} samplingOrders_t;

// prototypes for doing some work to exercise the GPU.
rb_sgemm_parms_t* PrepWork(int MNK);
void DoWork(rb_sgemm_parms_t *myParms);
void FinishWork(rb_sgemm_parms_t *myParms);

// Used to initialize arrays.
#define SRAND48_SEED 1001

// Select GPU to test [0,1,...]
#define GPU_TO_TEST 0 #define MS_SAMPLE_INTERVAL 5 #define VERBOSE 0 /* 1 prints action-by-action report for debugging. */ //----------------------------------------------------------------------------- // This is the timer pthread; note that time intervals of 2ms or less may be // a problem for some operating systems. The operation here is simple, we // sample, store the value read in an array, increment our index, and sleep. // but we also check our passed in structure to look for an exit signal. If // we run out of room in the provided array, we stop sampling. This is an // example, programmers can get more sophisticated if they like. // The PAPI EventSet must already be initialized, built, and started; all this // thread does is PAPI_read(). //----------------------------------------------------------------------------- long long countInterrupts=0; void* sampler(void* vMyOrders) { int ret; struct timespec req, rem, cont; if (VERBOSE) fprintf(stderr, "entered sampler.\n"); samplingOrders_t *myOrders = (samplingOrders_t*) vMyOrders; // recast for use. // Initializations first time through. // These are the only two elements in timespec. cont.tv_sec=0; // always. req.tv_sec=0; req.tv_nsec = MS_SAMPLE_INTERVAL*(1000000); myOrders->sampleIdx = 0; myOrders->eventsRead[0]=0; // Sleep for time given in req. If interrupted by a signal, time remaining in rem. while (myOrders->command != samplingCommand_exit) { // run until instructed to exit. ret = nanosleep(&req, &rem); // sleep. while (ret != 0) { // If interrupted by a signal (almost never happens), countInterrupts++; cont = rem; // Set up continuation. ret = nanosleep(&cont, &rem); // try again. } // We have completed a sleep cycle. If we need to sample, do that. if (myOrders->command == samplingCommand_record && myOrders->sampleIdx < myOrders->maxSamples) { long long nsElapsed=0; int x2 = myOrders->sampleIdx<<1; // compute double the sample index. 
long long tstamp = PAPI_get_real_nsec(); // PAPI function returns ret = PAPI_read(myOrders->EventSet, myOrders->eventsRead); if (x2 > 0) nsElapsed=tstamp - myOrders->samples[x2-2]; if (VERBOSE) fprintf(stderr, "reading sample %d elapsed=%llu.\n", myOrders->sampleIdx, nsElapsed); if (ret == PAPI_OK) { myOrders->samples[x2]=tstamp; myOrders->samples[x2+1]=myOrders->eventsRead[0]; myOrders->sampleIdx++; } else { if (VERBOSE) fprintf(stderr, "Failed PAPI_read(EventSet, values), retcode=%d ->'%s'\n", ret, PAPI_strerror(ret)); exit(ret); } } } // end while. // We are exiting the thread. The data is contained in the passed in structure. if (VERBOSE) fprintf(stderr, "exiting sampler.\n"); pthread_exit(NULL); } // end pthread sampler. //----------------------------------------------------------------------------- // Begin main code. //----------------------------------------------------------------------------- int main( int argc, char **argv ) { (void) argc; (void) argv; int i, x2, retval; int EventSet = PAPI_NULL; // Step 1: Initialize the PAPI library. The library returns its version, // and we compare that to the include file version to ensure compatibility. // if for some reason the PAPI_libary_init() fails, it will not return // its version but an error code. // Note that PAPI_strerror(retcode) will return a pointer to a string that // describes the error. retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { fprintf(stderr,"Error! PAPI_library_init failed, retcode=%d -> '%s'\n", retval, PAPI_strerror(retval)); exit(retval); } // Step 2: Create an event set. A PAPI event set is a collection of // events we wish to read. We can add new events, delete events, and // so on. Upon creation the event set is empty. // Note that the EventSet is just an integer, an index into an internal // array of event sets. We pass the address so PAPI can populate the // integer with the correct index. 
retval = PAPI_create_eventset(&EventSet); if (retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_create_eventset failed, retcode=%d -> '%s'\n", retval, PAPI_strerror(retval)); exit(retval); } // When we read an eventset it returns an array of values, one per event // contained in the eventset. PAPI always works in 'long long' 64 bit // values. For some events the value returned should be recast to some // other type, e.g. unsigned long long, or even floating point. ROCM has // several events that are percentages in the range [0,100]. In general it // is up to the programmers to know how to use the event data, based on the // event description, and whether it needs to be recast. Event // descriptions can be obtained using the PAPI utility PAPI_native_avail, // found in the papi/src/utils directory. The following is an excerpt of a // few such such events reported by PAPI_native_avail (out of hundreds). // Note: Be sure to use the events for the device you are testing; defined // above as 'GPU_TO_TEST'. //-------------------------------------------------------------------------------- //| rocm_smi:::power_average:device=0:sensor=0 | //| Current Average Power consumption in microwatts. Requires root pri| //| vilege. | //-------------------------------------------------------------------------------- // We declare an array of long long here to receive counter values from // PAPI. It could also be allocated if the number of events are not known // at compile time. long long values[1]={0}; // declare a single event value. // We define a string for the event we choose to monitor. char eventname[]="rocm_smi:::power_average:device=0:sensor=0"; // Now we add the named event to the event set. It is also possible to add // names by their numerical code; but most applications use the named // events. Be aware that the numeric codes for the same event can change // from run to run; so using the name is the safer approach. 
Also note that // what we have done thus far is all setup, we have not tried to // communicate with the GPU yet. retval = PAPI_add_named_event(EventSet, eventname); // Note we pass pointer to string. if (retval != PAPI_OK) { fprintf(stderr, "Failed PAPI_add_named_event(EventSet, %s), retcode=%d ->'%s'\n", eventname, retval, PAPI_strerror(retval)); exit(retval); } // Now we ask PAPI to interface with the GPU driver software to start // keeping track of this event on the GPU. At this point, if the GPU has // some issue with the events in the event set, we may see an error in // starting the event counting. One example is if we attempt to combine // events that the GPU cannot collect together, due to lack of resources. // The GPU may use the same hardware counter that it switches to count // event A *or* event B, and thus cannot count both event A *and* event B // simultaneously. retval = PAPI_start(EventSet); if (retval != PAPI_OK) { fprintf(stderr, "Failed PAPI_start(EventSet), retcode=%d ->'%s'\n", retval, PAPI_strerror(retval)); exit(retval); } // Set up the sampler thread. pthread_t samplingThread; samplingOrders_t myOrders; myOrders.EventSet = EventSet; myOrders.command = samplingCommand_wait; // Wait for me to say start. myOrders.maxSamples = 20*1000/MS_SAMPLE_INTERVAL; // 20 seconds worth. myOrders.sampleIdx = 0; myOrders.samples = (long long*) calloc(myOrders.maxSamples*2, sizeof(long long)); // allocate space. myOrders.eventsRead = values; retval = pthread_create(&samplingThread, NULL, sampler, &myOrders); if (retval != 0) { fprintf(stderr, "pthread_create() failed, retcode=%d. Aborting.\n", retval); exit(-1); } if (VERBOSE) fprintf(stderr, "Launched Sampler.\n"); // Do Some Work: This is a subroutine do just make the GPU do something to // run up counters so we have something to report. We do this in three // parts. Information for the run is contained in 'myParms'. rb_sgemm_parms_t *myParms = PrepWork(16384); // set up, param is M,N,K. 
    // If something went wrong it was reported by PrepWork; so just exit.
    if (myParms->rb_ret != rocblas_status_success) exit(myParms->rb_ret);

    // Start sampling.
    myOrders.command = samplingCommand_record;
    while (myOrders.sampleIdx < 1);

    // Call rocblas and do an SGEMM.
    long long timeDoWork = PAPI_get_real_nsec();
    DoWork(myParms);
    timeDoWork = PAPI_get_real_nsec() - timeDoWork;
    if (VERBOSE) fprintf(stderr, "DoWork consumed %.3f ms.\n", (timeDoWork+0.)/1000000.0);

    // If something went wrong it was reported by DoWork; so just exit.
    if (myParms->rb_ret != rocblas_status_success) exit(myParms->rb_ret);

    // stop the samplings.
    int checkTime = myOrders.sampleIdx;
    // Wait for some trailing samples, if possible.
    while (myOrders.sampleIdx <= checkTime+5 && myOrders.sampleIdx < myOrders.maxSamples);
    if (VERBOSE) fprintf(stderr, "Stopping Sampler, joining thread.\n");
    myOrders.command = samplingCommand_exit; // Tell the sampling thread to exit.

    // Wait for the sampler thread to finish.
    retval = pthread_join(samplingThread, NULL);
    if (retval != 0) {
        fprintf(stderr, "Failed to join the sampling thread, ret=%d.\n", retval);
        exit(retval);
    }
    if (VERBOSE) fprintf(stderr, "Joined Thread.\n");

    // Success. Report what we read for the value.
    if (VERBOSE) printf("Read %d samples, =%.3f seconds. %llu nanosleepInterrupts.\n",
        myOrders.sampleIdx, (myOrders.sampleIdx*MS_SAMPLE_INTERVAL)/1000., countInterrupts);

    printf("ns timeStamp, ns Diff, microWatts, Joules (Watts*Seconds)\n");
    float duration, avgWatts=0.0, totJoules=0.0;
    long long minWatts=myOrders.samples[1];
    long long maxWatts=minWatts;
    for (i = 0; i < myOrders.sampleIdx; i++) {
        x2 = i << 1;
        printf("%llu,", myOrders.samples[x2]);
        if (i == 0) printf("0,");
        else        printf("%llu,", myOrders.samples[x2] - myOrders.samples[x2-2]);
        if (myOrders.samples[x2+1] < minWatts) minWatts = myOrders.samples[x2+1];
        if (myOrders.samples[x2+1] > maxWatts) maxWatts = myOrders.samples[x2+1];
        avgWatts += (myOrders.samples[x2+1]+0.0)*1.e-6;
        printf("%llu,", myOrders.samples[x2+1]);
        if (i == 0) printf("0.0\n");
        else {
            float w = myOrders.samples[x2+1]*1.e-6;
            float s = (myOrders.samples[x2]-myOrders.samples[x2-2])*1.e-9;
            totJoules += (w*s);
            printf("%.6f\n", (w*s));
        }
    }
    x2 = (myOrders.sampleIdx-1)<<1; // Final index.
duration = (float) (myOrders.samples[x2]-myOrders.samples[0]); duration *= 1.e-6; // compute milliseconds from nano seconds. avgWatts /= (myOrders.sampleIdx-1.0); // one less, first reading is zero. printf("ms Duration=%.3f\n", duration); printf("ms AvgInterval=%.3f\n", duration/(myOrders.sampleIdx-1)); printf("avg Watts=%.3f, minWatts=%.3f, maxWatts=%.3f\n", avgWatts, (minWatts*1.e-6), (maxWatts*1.e-6)); printf("total Joules=%.3f\n", totJoules); // Now we clean up. First, we stop the event set. This will re-do the read; // we could prevent that by passing a NULL pointer for the 'values'. retval = PAPI_stop( EventSet, values ); // ROCM added stop and test. if (retval != PAPI_OK) { fprintf(stderr, "Failed PAPI_stop(EventSet, values), retcode=%d ->'%s'\n", retval, PAPI_strerror(retval)); exit( retval ); } // This is an example of modifying an EventSet. This will happen // automatically if we destroyed the event set, but is shown here as an // example. We can remove events, add other events, do more work and read // those. Of course you should allow room in the 'values[]' array for the // maximum number of events you might read. retval = PAPI_remove_named_event(EventSet, eventname); // remove the event we added. if (retval != PAPI_OK) { fprintf(stderr, "Failed PAPI_remove_named_event(EventSet, eventname), retcode=%d ->'%s'\n", retval, PAPI_strerror(retval)); exit(retval); } // Being good memory citizens, we want to destroy the event set PAPI created for us. // Once again, we pass the address of an integer. retval=PAPI_destroy_eventset(&EventSet); if (retval != PAPI_OK) { fprintf(stderr, "Failed PAPI_destroy_event(&EventSet), retcode=%d ->'%s'\n", retval, PAPI_strerror(retval)); exit(retval); } // Cleanup the work portion. It will delete any allocated memories. FinishWork(myParms); // And finally, we tell PAPI to do a clean shut down, release all its allocations. PAPI_shutdown(); // Shut it down. return 0; } // END MAIN. 
//-----------------------------------------------------------------------------
// The following are dedicated to rocblas; the PAPI examples are above.
// rocblas_handle is a structure holding the rocblas library context.
// It must be created using rocblas_create_handle(), passed to all function
// calls, and destroyed using rocblas_destroy_handle().
//-----------------------------------------------------------------------------
static const char* rocblas_return_strings[13] = { // from comments in rocblas_types.h.
    "Success",
    "Handle not initialized, invalid or null",
    "Function is not implemented",
    "Invalid pointer argument",
    "Invalid size argument",
    "Failed internal memory allocation, copy or dealloc",
    "Other internal library failure",
    "Performance degraded due to low device memory",
    "Unmatched start/stop size query",
    "Queried device memory size increased",
    "Queried device memory size unchanged",
    "Passed argument not valid",
    "Nothing preventing function to proceed",
};

//-----------------------------------------------------------------------------
// helper. Report any info about the context of the error first; e.g.
// fprintf(stderr, "calloc(1, %lu) for rb_sgemm_parms_t ", sizeof(rb_sgemm_parms_t));
//-----------------------------------------------------------------------------
void rb_report_error(int ret)
{
    fprintf(stderr, "failed, ret=%d -> ", ret);
    if (ret >= 0 && ret <= 12) {
        fprintf(stderr, "%s.\n", rocblas_return_strings[ret]);
    } else {
        fprintf(stderr, "Meaning Unknown.\n");
    }
} // end rb_report_error.

//-----------------------------------------------------------------------------
// helper. Report any info about the context of the error first; e.g.
// fprintf(stderr, "calloc(1, %lu) for rb_sgemm_parms_t ", sizeof(rb_sgemm_parms_t));
//-----------------------------------------------------------------------------
void hip_report_error(hipError_t ret)
{
    fprintf(stderr, "failed, ret=%d -> %s.\n", ret, hipGetErrorString(ret));
} // end hip_report_error.
rb_sgemm_parms_t* PrepWork(int MNK) { // We use calloc to ensure all pointers are NULL. rb_sgemm_parms_t *myParms = (rb_sgemm_parms_t*) calloc(1, sizeof(rb_sgemm_parms_t)); // Check that we successfully allocated memory. if (myParms == NULL) { fprintf(stderr, "calloc(1, %lu) for rb_sgemm_parms_t failed.", sizeof(rb_sgemm_parms_t)); exit(-1); } hipError_t hipret; // GPU_TO_TEST is defined at top of file; [0,1,...] hipret = hipSetDevice(GPU_TO_TEST); if (hipret != hipSuccess) { fprintf(stderr, "%s:%s:%i hipSetDevice(%d) ", __FILE__, __func__, __LINE__, GPU_TO_TEST); hip_report_error(hipret); exit(-1); } // initialize rocblas. rocblas_initialize(); // rocblas requires the creation of a 'handle' to call any functions. myParms->rb_ret = rocblas_create_handle(&myParms->handle); // Check that we were able to create a handle. if (myParms->rb_ret != rocblas_status_success) { fprintf(stderr, "rocblas_create_handle "); rb_report_error(myParms->rb_ret); exit(-1); } // Set constants. myParms->m = (rocblas_int) MNK; myParms->n = (rocblas_int) MNK; myParms->k = (rocblas_int) MNK; myParms->lda = (rocblas_int) MNK; myParms->ldb = (rocblas_int) MNK; myParms->ldc = (rocblas_int) MNK; myParms->transA = rocblas_operation_none; myParms->transB = rocblas_operation_none; // Allocate memory; check each time. 
myParms->alpha = (float*) calloc(1, sizeof(float)); if (myParms->alpha == NULL) { fprintf(stderr, "calloc(1, %lu) failed for myParms->alpha.\n", sizeof(float)); exit(-1); } myParms->beta = (float*) calloc(1, sizeof(float)); if (myParms->beta == NULL) { fprintf(stderr, "calloc(1, %lu) failed for myParms->beta.\n", sizeof(float)); exit(-1); } myParms->A = (float*) calloc(1, sizeof(float)*MNK*MNK); if (myParms->A == NULL) { fprintf(stderr, "calloc(1, %lu) failed for myParms->A.\n", sizeof(float)*MNK*MNK); exit(-1); } myParms->B = (float*) calloc(1, sizeof(float)*MNK*MNK); if (myParms->B == NULL) { fprintf(stderr, "calloc(1, %lu) failed for myParms->B.\n", sizeof(float)*MNK*MNK); exit(-1); } myParms->C = (float*) calloc(1, sizeof(float)*MNK*MNK); if (myParms->C == NULL) { fprintf(stderr, "calloc(1, %lu) failed for myParms->A.\n", sizeof(float)*MNK*MNK); exit(-1); } // Set up allocated areas. myParms->alpha[0]= 1.0; myParms->beta[0] = 1.0; srand48(SRAND48_SEED); // Init square arrays, uniform distribution; values [0.0,1.0). int i; for (i=0; i<(MNK*MNK); i++) { myParms->A[i] = (float) drand48(); myParms->B[i] = (float) drand48(); myParms->C[i] = (float) drand48(); } int thisDevice; hipret = hipGetDevice(&thisDevice); if (hipret != hipSuccess) { fprintf(stderr, "%s:%s:%i hipGetDevice(&thisDevice) ", __FILE__, __func__, __LINE__); hip_report_error(hipret); exit(-1); } if (thisDevice != 0) { fprintf(stderr, "%s:%s:%i Unexpected result, thisDevice = %d.\n", __FILE__, __func__, __LINE__, thisDevice); } // Not used here, but useful for debug. hipDeviceProp_t devProps; hipret = hipGetDeviceProperties(&devProps, thisDevice); if (hipret != hipSuccess) { fprintf(stderr, "%s:%s:%i hipGetDeviceProperties(&devProps) ", __FILE__, __func__, __LINE__); hip_report_error(hipret); exit(-1); } if (0) { fprintf(stderr, "info: device=%i name=%s.\n", thisDevice, devProps.name); } // Allocate memory on the GPU for three arrays. 
hipret = hipMalloc(&myParms->A_d, sizeof(float)*MNK*MNK); if (hipret != hipSuccess) { fprintf(stderr, "%s:%s:%i hipMalloc((&myParms->A_d, %lu) ", __FILE__, __func__, __LINE__, sizeof(float)*MNK*MNK); hip_report_error(hipret); exit(-1); } hipret = hipMalloc(&myParms->B_d, sizeof(float)*MNK*MNK); if (hipret != hipSuccess) { fprintf(stderr, "%s:%s:%i hipMalloc((&myParms->B_d, %lu) ", __FILE__, __func__, __LINE__, sizeof(float)*MNK*MNK); hip_report_error(hipret); exit(-1); } hipret = hipMalloc(&myParms->C_d, sizeof(float)*MNK*MNK); if (hipret != hipSuccess) { fprintf(stderr, "%s:%s:%i hipMalloc((&myParms->C_d, %lu) ", __FILE__, __func__, __LINE__, sizeof(float)*MNK*MNK); hip_report_error(hipret); exit(-1); } // Copy each array from Host to Device. Note args for // hipMemcpy(*dest, *source, count, type of copy) hipret = hipMemcpy(myParms->A_d, myParms->A, sizeof(float)*MNK*MNK, hipMemcpyHostToDevice); if (hipret != hipSuccess) { fprintf(stderr, "%s:%s:%i hipMemcpy(A) HostToDevice) ", __FILE__, __func__, __LINE__); hip_report_error(hipret); exit(-1); } hipret = hipMemcpy(myParms->B_d, myParms->B, sizeof(float)*MNK*MNK, hipMemcpyHostToDevice); if (hipret != hipSuccess) { fprintf(stderr, "%s:%s:%i hipMemcpy(B) HostToDevice) ", __FILE__, __func__, __LINE__); hip_report_error(hipret); exit(-1); } hipret = hipMemcpy(myParms->C_d, myParms->C, sizeof(float)*MNK*MNK, hipMemcpyHostToDevice); if (hipret != hipSuccess) { fprintf(stderr, "%s:%s:%i hipMemcpy(C) HostToDevice) ", __FILE__, __func__, __LINE__); hip_report_error(hipret); exit(-1); } return (myParms); } // end PrepWork. void DoWork(rb_sgemm_parms_t *myParms) { long long elapsed; hipError_t hipret; (void) elapsed; // No warnings if not used. // Execute the SGEMM we set up. 
myParms->rb_ret = rocblas_sgemm(
                      myParms->handle,
                      myParms->transA, myParms->transB,
                      myParms->m, myParms->n, myParms->k,
                      myParms->alpha,
                      myParms->A_d, myParms->lda,
                      myParms->B_d, myParms->ldb,
                      myParms->beta,
                      myParms->C_d, myParms->ldc);
    if (myParms->rb_ret != rocblas_status_success) {
        fprintf(stderr, "rocblas_sgemm ");
        rb_report_error(myParms->rb_ret);
        exit(-1);
    }

    long unsigned mtxSize = sizeof(float)*(myParms->m)*(myParms->n);

    // Copy the result (matrix C) back to host space.
    hipret = hipMemcpy(myParms->C, myParms->C_d, mtxSize, hipMemcpyDeviceToHost);
    if (hipret != hipSuccess) {
        fprintf(stderr, "%s:%s:%i hipMemcpy(C) DeviceToHost ", __FILE__, __func__, __LINE__);
        hip_report_error(hipret);
        exit(-1);
    }

    // AMD GPU "Streams" are command queues for the device.
    // hipDeviceSynchronize() blocks until all streams are empty. Failing to
    // synchronize has resulted in incorrect reading of performance counters.
    // In timings, this takes about 3 uS if all commands ARE complete. But
    // immediately after the rocblas call above, it has taken up to 239 ms, and
    // without it, we have read zeros for the performance event. (Typically the
    // memory copy will take long enough that we don't see this; reading zeros
    // for the event occurred when we had a PAPI_read() immediately after the
    // rocblas_sgemm() call.)
    // Example of timing:
    //     elapsed = PAPI_get_real_nsec();
    //     ... do something ...
    //     elapsed = PAPI_get_real_nsec() - elapsed;
    //     fprintf(stderr, "Elapsed time %llu ns.\n", elapsed);
    hipret = hipDeviceSynchronize();
    if (hipret != hipSuccess) {
        fprintf(stderr, "%s:%s:%i hipDeviceSynchronize() ", __FILE__, __func__, __LINE__);
        hip_report_error(hipret);
        exit(-1);
    }

    if (VERBOSE) fprintf(stderr, "Successful rocblas_sgemm with M,N,K=%d,%d,%d.\n", myParms->m, myParms->n, myParms->k);
    return;
} // end DoWork.

void FinishWork(rb_sgemm_parms_t *myParms)
{
    hipError_t hipret;

    // Clean up host memory.
if (myParms->alpha != NULL) {free(myParms->alpha); myParms->alpha = NULL;}
    if (myParms->beta  != NULL) {free(myParms->beta ); myParms->beta  = NULL;}
    if (myParms->A     != NULL) {free(myParms->A    ); myParms->A     = NULL;}
    if (myParms->B     != NULL) {free(myParms->B    ); myParms->B     = NULL;}
    if (myParms->C     != NULL) {free(myParms->C    ); myParms->C     = NULL;}

    // Clean up device memory.
    if (myParms->A_d != NULL) {
        hipret = hipFree(myParms->A_d);
        if (hipret != hipSuccess) {
            fprintf(stderr, "%s:%s:%i hipFree(myParms->A_d) ", __FILE__, __func__, __LINE__);
            hip_report_error(hipret);
            exit(-1);
        }
        myParms->A_d = NULL;
    }

    if (myParms->B_d != NULL) {
        hipret = hipFree(myParms->B_d);
        if (hipret != hipSuccess) {
            fprintf(stderr, "%s:%s:%i hipFree(myParms->B_d) ", __FILE__, __func__, __LINE__);
            hip_report_error(hipret);
            exit(-1);
        }
        myParms->B_d = NULL;
    }

    if (myParms->C_d != NULL) {
        hipret = hipFree(myParms->C_d);
        if (hipret != hipSuccess) {
            fprintf(stderr, "%s:%s:%i hipFree(myParms->C_d) ", __FILE__, __func__, __LINE__);
            hip_report_error(hipret);
            exit(-1);
        }
        myParms->C_d = NULL;
    }

    // Make sure the device is done with everything.
    hipret = hipDeviceSynchronize();
    if (hipret != hipSuccess) {
        fprintf(stderr, "%s:%s:%i hipDeviceSynchronize() ", __FILE__, __func__, __LINE__);
        hip_report_error(hipret);
        exit(-1);
    }

    // Tell rocblas to clean up the handle.
    myParms->rb_ret = rocblas_destroy_handle(myParms->handle);
    if (myParms->rb_ret != rocblas_status_success) {
        fprintf(stderr, "rocblas_destroy_handle ");
        rb_report_error(myParms->rb_ret);
    }

    // Free our parameter structure.
    if (myParms != NULL) {free(myParms); myParms = NULL;}
    return;
} // end FinishWork.
papi-papi-7-2-0-t/src/components/rocp_sdk/000077500000000000000000000000001502707512200204205ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/rocp_sdk/README.md000066400000000000000000000061731502707512200217060ustar00rootroot00000000000000# ROCP\_SDK Component

The ROCP\_SDK component exposes numerous performance events on AMD GPUs and APUs.
The component is an adapter to the ROCm profiling library, ROCprofiler-SDK, which is included in a standard ROCm release.

* [Enabling the ROCP\_SDK Component](#enabling-the-rocp_sdk-component)
* [Environment Variables](#environment-variables)
* [Known Limitations](#known-limitations)
* [FAQ](#faq)

***

## Enabling the ROCP\_SDK Component

To enable reading ROCP\_SDK events the user needs to link against a PAPI library that was configured with the ROCP\_SDK component enabled. As an example, the following command: `./configure --with-components="rocp_sdk"` is sufficient to enable the component.

Typically, the utility `papi_component_avail` (available in `papi/src/utils/papi_component_avail`) displays the components available to the user, whether each is disabled, and, if disabled, why.

## Library Version Limitations

AMD ROCprofiler-SDK releases before rocm-6.3.2 have known bugs.

## Environment Variables

PAPI requires the location of the ROCm install directory. This can be specified by one environment variable: **PAPI\_ROCP\_SDK\_ROOT**. Access to the ROCm main directory is required both at compile time (for include files) and at runtime (for libraries).

Example:

    export PAPI_ROCP_SDK_ROOT=/opt/rocm

Within PAPI\_ROCP\_SDK\_ROOT, we expect the following standard directories:

    PAPI_ROCP_SDK_ROOT/include
    PAPI_ROCP_SDK_ROOT/include/rocprofiler-sdk
    PAPI_ROCP_SDK_ROOT/lib

### Counter Collection Modes

The default mode is device sampling, which allows counter collection during the execution of a kernel. If a PAPI user wants to use dispatch mode, they must set the environment variable **PAPI\_ROCP\_SDK\_DISPATCH\_MODE** before initializing PAPI.

Example:

    export PAPI_ROCP_SDK_DISPATCH_MODE=1

### Unusual Installations

For the ROCP\_SDK component to be operational, it must find the dynamic library `librocprofiler-sdk.so` at runtime. This is normally found in the standard directory structure mentioned above.
For unusual installations that do not follow this structure, the user may provide the full path to the library using the environment variable **PAPI\_ROCP\_SDK\_LIB**.

Example:

    export PAPI_ROCP_SDK_LIB=/opt/rocm-6.3.2/lib/librocprofiler-sdk.so.0

Note that this variable takes precedence over PAPI\_ROCP\_SDK\_ROOT.

## Known Limitations

* In dispatch mode, PAPI may read zeros if the reading takes place immediately after the return of a GPU kernel. This is not a PAPI bug. It may occur because calls such as hipDeviceSynchronize() do not guarantee that ROCprofiler has been called and all counter buffers have been flushed. Therefore, it is recommended that user code add a delay between the return of a kernel and calls to PAPI_read(), PAPI_stop(), etc.

* If an application is linked against the static PAPI library libpapi.a, then the application must call PAPI_library_init() before calling any hip routines (e.g. hipInit(), hipGetDeviceCount(), hipLaunchKernelGGL(), etc.). If the application is linked against the dynamic library libpapi.so, then the order of operations does not matter.
papi-papi-7-2-0-t/src/components/rocp_sdk/Rules.rocp_sdk000066400000000000000000000010401502707512200232360ustar00rootroot00000000000000COMPSRCS += components/rocp_sdk/rocp_sdk.c components/rocp_sdk/sdk_class.cpp
COMPOBJS += rocp_sdk.o sdk_class.o

ROCP_SDK_INCL=-I$(PAPI_ROCP_SDK_ROOT)/include \
              -I$(PAPI_ROCP_SDK_ROOT)/include/hsa \
              -I$(PAPI_ROCP_SDK_ROOT)/hsa/include

CFLAGS += -g $(ROCP_SDK_INCL) -D__HIP_PLATFORM_AMD__
LDFLAGS += $(LDL)

rocp_sdk.o: components/rocp_sdk/rocp_sdk.c $(HEADERS)
	$(CC) $(LIBCFLAGS) $(OPTFLAGS) -c $< -o $@

sdk_class.o: components/rocp_sdk/sdk_class.cpp $(HEADERS)
	$(CXX) $(LIBCFLAGS) $(OPTFLAGS) -c $< -o $@
papi-papi-7-2-0-t/src/components/rocp_sdk/rocp_sdk.c000066400000000000000000000336521502707512200224000ustar00rootroot00000000000000/**
 * @file rocp_sdk.c
 * @author Anthony Danalis
 *         adanalis@icl.utk.edu
 *
 * @ingroup papi_components
 *
 * @brief This implements a PAPI component that accesses hardware
 * monitoring counters for AMD GPU and APU devices through the
 * ROCprofiler-SDK library.
 *
 * The open source software license for PAPI conforms to the BSD
 * License template.
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dirent.h>
#include <sys/stat.h>

#include "papi.h"
#include "papi_internal.h"
#include "papi_vector.h"
#include "papi_memory.h"
#include "extras.h"
#include "sdk_class.h"
#include "rocprofiler-sdk/hsa.h"

#define ROCPROF_SDK_MAX_COUNTERS (64)
#define RPSDK_CTX_RUNNING (1)

#define ROCM_CALL(call, err_handle) do {        \
    hsa_status_t _status = (call);              \
    if (_status == HSA_STATUS_SUCCESS ||        \
        _status == HSA_STATUS_INFO_BREAK)       \
        break;                                  \
    err_handle;                                 \
} while(0)

/* Utility functions */
static int check_for_available_devices(char *err_msg);

unsigned int _rocp_sdk_lock;

/* Init and finalize */
static int rocp_sdk_init_component(int cid);
static int rocp_sdk_init_thread(hwd_context_t *ctx);
static int rocp_sdk_init_control_state(hwd_control_state_t *ctl);
static int rocp_sdk_init_private(void);
static int rocp_sdk_shutdown_component(void);
static int rocp_sdk_shutdown_thread(hwd_context_t *ctx);
static int rocp_sdk_cleanup_eventset(hwd_control_state_t *ctl);

/* Set and update component state */
static int rocp_sdk_update_control_state(hwd_control_state_t *ctl, NativeInfo_t *ntv_info, int ntv_count, hwd_context_t *ctx);

/* Start and stop profiling of hardware events */
static int rocp_sdk_start(hwd_context_t *ctx, hwd_control_state_t *ctl);
static int rocp_sdk_read(hwd_context_t *ctx, hwd_control_state_t *ctl, long long **val, int flags);
static int rocp_sdk_stop(hwd_context_t *ctx, hwd_control_state_t *ctl);
static int rocp_sdk_reset(hwd_context_t *ctx, hwd_control_state_t *ctl);

/* Event conversion */
static int rocp_sdk_ntv_enum_events(unsigned int *event_code, int modifier);
static int rocp_sdk_ntv_code_to_name(unsigned int event_code, char *name, int len);
static int rocp_sdk_ntv_name_to_code(const char *name, unsigned int *event_code);
static int rocp_sdk_ntv_code_to_descr(unsigned int event_code, char *descr, int len);
static int rocp_sdk_ntv_code_to_info(unsigned int event_code, PAPI_event_info_t *info);

static int rocp_sdk_set_domain(hwd_control_state_t
*ctl, int domain);
static int rocp_sdk_ctl_fn(hwd_context_t *ctx, int code, _papi_int_option_t *option);

typedef struct {
    int initialized;
    int state;
    int component_id;
} rocp_sdk_context_t;

typedef struct {
    int *events_id;
    int num_events;
    vendorp_ctx_t vendor_ctx;
} rocp_sdk_control_t;

papi_vector_t _rocp_sdk_vector = {
    .cmp_info = {
        .name = "rocp_sdk",
        .short_name = "rocp_sdk",
        .version = "1.0",
        .description = "GPU events and metrics via AMD ROCprofiler-SDK",
        .initialized = 0,
        .num_mpx_cntrs = 0
    },
    .size = {
        .context = sizeof(rocp_sdk_context_t),
        .control_state = sizeof(rocp_sdk_control_t),
        .reg_value = 1,
        .reg_alloc = 1,
    },
    .init_component = rocp_sdk_init_component,
    .init_thread = rocp_sdk_init_thread,
    .init_control_state = rocp_sdk_init_control_state,
    .shutdown_component = rocp_sdk_shutdown_component,
    .shutdown_thread = rocp_sdk_shutdown_thread,
    .cleanup_eventset = rocp_sdk_cleanup_eventset,
    .update_control_state = rocp_sdk_update_control_state,
    .start = rocp_sdk_start,
    .stop = rocp_sdk_stop,
    .read = rocp_sdk_read,
    .reset = rocp_sdk_reset,
    .ntv_enum_events = rocp_sdk_ntv_enum_events,
    .ntv_code_to_name = rocp_sdk_ntv_code_to_name,
    .ntv_name_to_code = rocp_sdk_ntv_name_to_code,
    .ntv_code_to_descr = rocp_sdk_ntv_code_to_descr,
    .ntv_code_to_info = rocp_sdk_ntv_code_to_info,
    .set_domain = rocp_sdk_set_domain,
    .ctl = rocp_sdk_ctl_fn,
};

static int check_n_initialize(void);

int rocp_sdk_init_component(int cid)
{
    _rocp_sdk_vector.cmp_info.CmpIdx = cid;
    _rocp_sdk_vector.cmp_info.num_native_events = -1;
    _rocp_sdk_vector.cmp_info.num_cntrs = -1;
    _rocp_sdk_lock = PAPI_NUM_LOCK + NUM_INNER_LOCK + cid;

    // We set this env variable to silence some unnecessary ROCprofiler-SDK debug messages.
    // It is not critical, so if it fails to be set, we can safely ignore the error.
    (void)setenv("ROCPROFILER_LOG_LEVEL","fatal",0);

    int papi_errno = rocprofiler_sdk_init_pre();
    if (papi_errno != PAPI_OK) {
        _rocp_sdk_vector.cmp_info.initialized = 1;
        _rocp_sdk_vector.cmp_info.disabled = papi_errno;
        const char *err_string;
        rocprofiler_sdk_err_get_last(&err_string);
        snprintf(_rocp_sdk_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "%s", err_string);
        return papi_errno;
    }

    // This component needs to be fully initialized from the beginning,
    // because interleaving hip calls and PAPI calls leads to errors.
    return check_n_initialize();
}

int rocp_sdk_init_thread(hwd_context_t *ctx)
{
    rocp_sdk_context_t *rocp_sdk_ctx = (rocp_sdk_context_t *) ctx;
    memset(rocp_sdk_ctx, 0, sizeof(*rocp_sdk_ctx));
    rocp_sdk_ctx->initialized = 1;
    rocp_sdk_ctx->component_id = _rocp_sdk_vector.cmp_info.CmpIdx;
    return PAPI_OK;
}

int rocp_sdk_init_control_state(hwd_control_state_t *ctl __attribute__((unused)))
{
    return check_n_initialize();
}

static int evt_get_count(int *count)
{
    unsigned int event_code = 0;
    if (rocprofiler_sdk_evt_enum(&event_code, PAPI_ENUM_FIRST) == PAPI_OK) {
        ++(*count);
    }
    while (rocprofiler_sdk_evt_enum(&event_code, PAPI_ENUM_EVENTS) == PAPI_OK) {
        ++(*count);
    }
    return PAPI_OK;
}

int rocp_sdk_init_private(void)
{
    int papi_errno = PAPI_OK;

    _papi_hwi_lock(COMPONENT_LOCK);

    if (_rocp_sdk_vector.cmp_info.initialized) {
        papi_errno = _rocp_sdk_vector.cmp_info.disabled;
        goto fn_exit;
    }

    papi_errno = check_for_available_devices(_rocp_sdk_vector.cmp_info.disabled_reason);
    if (papi_errno != PAPI_OK) {
        goto fn_fail;
    }

    papi_errno = rocprofiler_sdk_init();
    if (papi_errno != PAPI_OK) {
        _rocp_sdk_vector.cmp_info.disabled = papi_errno;
        const char *err_string;
        rocprofiler_sdk_err_get_last(&err_string);
        snprintf(_rocp_sdk_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "%s", err_string);
        goto fn_fail;
    }

    int count = 0;
    papi_errno = evt_get_count(&count);
    if (papi_errno != PAPI_OK) {
        goto fn_fail;
    }

    _rocp_sdk_vector.cmp_info.num_native_events = count;
    _rocp_sdk_vector.cmp_info.num_cntrs = count;
    _rocp_sdk_vector.cmp_info.num_mpx_cntrs = count;

    _rocp_sdk_vector.cmp_info.initialized = 1;

  fn_exit:
    _rocp_sdk_vector.cmp_info.disabled = papi_errno;
    _papi_hwi_unlock(COMPONENT_LOCK);
    return papi_errno;
  fn_fail:
    goto fn_exit;
}

int rocp_sdk_shutdown_component(void)
{
    _rocp_sdk_vector.cmp_info.initialized = 0;
    return rocprofiler_sdk_shutdown();
}

int rocp_sdk_shutdown_thread(hwd_context_t *ctx)
{
    rocp_sdk_context_t *rocp_sdk_ctx = (rocp_sdk_context_t *) ctx;
    rocp_sdk_ctx->initialized = 0;
    rocp_sdk_ctx->state = 0;
    return PAPI_OK;
}

int rocp_sdk_cleanup_eventset(hwd_control_state_t *ctl)
{
    rocp_sdk_control_t *rocp_sdk_ctl = (rocp_sdk_control_t *) ctl;
    papi_free(rocp_sdk_ctl->events_id);
    rocp_sdk_ctl->events_id = NULL;
    rocp_sdk_ctl->num_events = 0;
    papi_free(rocp_sdk_ctl->vendor_ctx);
    rocp_sdk_ctl->vendor_ctx = NULL;
    return PAPI_OK;
}

int update_native_events(rocp_sdk_control_t *ctl, NativeInfo_t *ntv_info, int ntv_count)
{
    int papi_errno = PAPI_OK;

    if (0 == ntv_count) {
        if ( NULL != ctl->events_id ){
            papi_free(ctl->events_id);
            ctl->events_id = NULL;
        }
        ctl->num_events = ntv_count;
        goto fn_exit;
    }

    if (ntv_count != ctl->num_events) {
        ctl->events_id = papi_realloc(ctl->events_id, ntv_count * sizeof(*ctl->events_id));
        if (NULL == ctl->events_id) {
            papi_errno = PAPI_ENOMEM;
            goto fn_fail;
        }
        ctl->num_events = ntv_count;
    }

    int i;
    for (i = 0; i < ntv_count; ++i) {
        ctl->events_id[i] = ntv_info[i].ni_event;
        ntv_info[i].ni_position = i;
    }

  fn_exit:
    return papi_errno;
  fn_fail:
    ctl->num_events = 0;
    goto fn_exit;
}

int rocp_sdk_update_control_state(hwd_control_state_t *ctl, NativeInfo_t *ntv_info, int ntv_count, hwd_context_t *ctx __attribute__((unused)))
{
    int papi_errno = check_n_initialize();
    if (papi_errno != PAPI_OK) {
        return papi_errno;
    }

    rocp_sdk_control_t *rocp_sdk_ctl = (rocp_sdk_control_t *) ctl;
    if (rocp_sdk_ctl->vendor_ctx != NULL) {
        return PAPI_ECMP;
    }

    papi_errno = update_native_events(rocp_sdk_ctl, ntv_info, ntv_count);
    if
(papi_errno != PAPI_OK) {
        return papi_errno;
    }

    return PAPI_OK;
}

int rocp_sdk_start(hwd_context_t *ctx, hwd_control_state_t *ctl)
{
    int papi_errno = PAPI_OK;
    rocp_sdk_context_t *rocp_sdk_ctx = (rocp_sdk_context_t *) ctx;
    rocp_sdk_control_t *rocp_sdk_ctl = (rocp_sdk_control_t *) ctl;

    if (0 == rocp_sdk_ctl->num_events){
        SUBDBG("Error! Cannot PAPI_start an empty eventset.");
        return PAPI_ENOSUPP;
    }

    if (rocp_sdk_ctx->state & RPSDK_CTX_RUNNING) {
        SUBDBG("Error! Cannot PAPI_start more than one eventset at a time for every component.");
        return PAPI_EINVAL;
    }

    if ( !(rocp_sdk_ctl->vendor_ctx) ) {
        papi_errno = rocprofiler_sdk_ctx_open(rocp_sdk_ctl->events_id, rocp_sdk_ctl->num_events, &rocp_sdk_ctl->vendor_ctx);
        if (papi_errno != PAPI_OK) {
            goto fn_fail;
        }
    }

    papi_errno = rocprofiler_sdk_start(rocp_sdk_ctl->vendor_ctx);
    if (papi_errno != PAPI_OK) {
        goto fn_fail;
    }

    rocp_sdk_ctx->state |= RPSDK_CTX_RUNNING;

  fn_exit:
    return papi_errno;
  fn_fail:
    rocp_sdk_ctx->state = 0;
    goto fn_exit;
}

int rocp_sdk_stop(hwd_context_t *ctx, hwd_control_state_t *ctl)
{
    int papi_errno = PAPI_OK;
    rocp_sdk_context_t *rocp_sdk_ctx = (rocp_sdk_context_t *) ctx;
    rocp_sdk_control_t *rocp_sdk_ctl = (rocp_sdk_control_t *) ctl;

    papi_errno = rocprofiler_sdk_stop(rocp_sdk_ctl->vendor_ctx);
    if (papi_errno != PAPI_OK) {
        goto fn_fail;
    }
    rocp_sdk_ctl->vendor_ctx = NULL;

  fn_exit:
    rocp_sdk_ctx->state = 0;
    return papi_errno;
  fn_fail:
    goto fn_exit;
}

int rocp_sdk_read(hwd_context_t *ctx __attribute__((unused)), hwd_control_state_t *ctl, long long **val, int flags __attribute__((unused)))
{
    rocp_sdk_control_t *rocp_sdk_ctl = (rocp_sdk_control_t *) ctl;
    return rocprofiler_sdk_ctx_read(rocp_sdk_ctl->vendor_ctx, val);
}

int rocp_sdk_reset(hwd_context_t *ctx __attribute__((unused)), hwd_control_state_t *ctl)
{
    rocp_sdk_control_t *rocp_sdk_ctl = (rocp_sdk_control_t *) ctl;
    return rocprofiler_sdk_ctx_reset(rocp_sdk_ctl->vendor_ctx);
}

int rocp_sdk_ntv_enum_events(unsigned int *event_code, int modifier)
{
    int papi_errno =
check_n_initialize();
    if (papi_errno != PAPI_OK) {
        return papi_errno;
    }
    return rocprofiler_sdk_evt_enum(event_code, modifier);
}

int rocp_sdk_ntv_code_to_name(unsigned int event_code, char *name, int len)
{
    int papi_errno = check_n_initialize();
    if (papi_errno != PAPI_OK) {
        return papi_errno;
    }
    return rocprofiler_sdk_evt_code_to_name(event_code, name, len);
}

int rocp_sdk_ntv_name_to_code(const char *name, unsigned int *code)
{
    int papi_errno = check_n_initialize();
    if (papi_errno != PAPI_OK) {
        return papi_errno;
    }
    int papi_errcode = rocprofiler_sdk_evt_name_to_code(name, code);
    return papi_errcode;
}

int rocp_sdk_ntv_code_to_descr(unsigned int event_code, char *descr, int len)
{
    int papi_errno = check_n_initialize();
    if (papi_errno != PAPI_OK) {
        return papi_errno;
    }
    return rocprofiler_sdk_evt_code_to_descr(event_code, descr, len);
}

int rocp_sdk_ntv_code_to_info(unsigned int event_code, PAPI_event_info_t *info)
{
    int papi_errno = check_n_initialize();
    if (papi_errno != PAPI_OK) {
        return papi_errno;
    }
    return rocprofiler_sdk_evt_code_to_info(event_code, info);
}

int rocp_sdk_set_domain(hwd_control_state_t *ctl __attribute__((unused)), int domain __attribute__((unused)))
{
    return PAPI_OK;
}

int rocp_sdk_ctl_fn(hwd_context_t *ctx __attribute__((unused)), int code __attribute__((unused)), _papi_int_option_t *option __attribute__((unused)))
{
    return PAPI_OK;
}

int check_n_initialize(void)
{
    if (!_rocp_sdk_vector.cmp_info.initialized) {
        return rocp_sdk_init_private();
    }
    return _rocp_sdk_vector.cmp_info.disabled;
}

int check_for_available_devices(char *err_msg)
{
    int ret_val;
    struct stat stat_info;
    const char *dir_path="/sys/class/kfd/kfd/topology/nodes";

    // If the path does not exist, there are no AMD devices on this system.
    ret_val = stat(dir_path, &stat_info);
    if (ret_val != 0 || !S_ISDIR(stat_info.st_mode)) {
        goto fn_fail;
    }

    // If we can't open this directory, there are no AMD devices on this system.
    DIR *dir = opendir(dir_path);
    if (dir == NULL) {
        goto fn_fail;
    }

    // If there are no non-trivial entries in this directory, there are no AMD devices on this system.
    struct dirent *dir_entry;
    while( NULL != (dir_entry = readdir(dir)) ) {
        if( strlen(dir_entry->d_name) < 1 || dir_entry->d_name[0] == '.' ){
            continue;
        }
        // If we made it here, it means we found an entry that is not "." or ".."
        closedir(dir);
        goto fn_exit;
    }
    // If we made it here, it means we only found entries that start with a "."
    closedir(dir);
    goto fn_fail;

  fn_exit:
    return PAPI_OK;
  fn_fail:
    sprintf(err_msg, "No compatible devices found.");
    return PAPI_EMISC;
}
papi-papi-7-2-0-t/src/components/rocp_sdk/sdk_class.cpp000066400000000000000000001436031502707512200231010ustar00rootroot00000000000000/**
 * @file sdk_class.cpp
 * @author Anthony Danalis
 *         adanalis@icl.utk.edu
 *
 */

#include "sdk_class.hpp"

namespace papi_rocpsdk {

using agent_map_t = std::map;
using dim_t = std::pair;
using dim_vector_t = std::vector< dim_t >;

static inline bool dimensions_match( dim_vector_t dim_instances, dim_vector_t recorded_dims );

typedef struct {
    rocprofiler_counter_info_v0_t counter_info;
    std::vector dim_info;
} base_event_info_t;

typedef struct {
    uint64_t qualifiers_present;
    std::string event_inst_name;
    rocprofiler_counter_info_v0_t counter_info;
    std::vector dim_info;
    dim_vector_t dim_instances;
    int device;
} event_instance_info_t;

typedef struct {
    rocprofiler_counter_id_t counter_id;
    uint64_t device;
    dim_vector_t recorded_dims;
} rec_info_t;

std::atomic _global_papi_event_count{0};
std::atomic _base_event_count{0};

#if (__cplusplus >= 201703L) // c++17
static std::shared_mutex profile_cache_mutex = {};
#define SHARED_LOCK std::shared_lock
#define UNIQUE_LOCK std::unique_lock
#elif (__cplusplus >= 201402L) // c++14
static std::shared_timed_mutex profile_cache_mutex = {};
#define SHARED_LOCK std::shared_lock
#define UNIQUE_LOCK std::unique_lock
#elif (__cplusplus >= 201103L) // c++11
static std::mutex profile_cache_mutex
= {}; #define SHARED_LOCK std::lock_guard #define UNIQUE_LOCK std::lock_guard #else #error "c++11 or higher is required" #endif static std::mutex agent_mutex = {}; static std::condition_variable agent_cond_var = {}; static bool data_is_ready = false; static std::string _rocp_sdk_error_string; static long long int *_counter_values = NULL; static int rpsdk_profiling_mode = RPSDK_MODE_DEVICE_SAMPLING; static agent_map_t gpu_agents = agent_map_t{}; static std::unordered_map base_events_by_name = {}; static std::set active_device_set = {}; static vendorp_ctx_t active_event_set_ctx = NULL; static std::vector index_mapping; static std::unordered_map rpsdk_profile_cache = {}; static std::unordered_map papi_id_to_event_instance = {}; static std::unordered_map event_instance_name_to_papi_id = {}; /* *** */ typedef rocprofiler_status_t (* rocprofiler_flush_buffer_t) (rocprofiler_buffer_id_t buffer_id); typedef rocprofiler_status_t (* rocprofiler_sample_device_counting_service_t) (rocprofiler_context_id_t context_id, rocprofiler_user_data_t user_data, rocprofiler_counter_flag_t flags, rocprofiler_record_counter_t* output_records, size_t* rec_count); typedef rocprofiler_status_t (* rocprofiler_configure_callback_dispatch_counting_service_t) (rocprofiler_context_id_t context_id, rocprofiler_dispatch_counting_service_callback_t dispatch_callback, void *dispatch_callback_args, rocprofiler_profile_counting_record_callback_t record_callback, void *record_callback_args); typedef rocprofiler_status_t (* rocprofiler_configure_device_counting_service_t) (rocprofiler_context_id_t context_id, rocprofiler_buffer_id_t buffer_id, rocprofiler_agent_id_t agent_id, rocprofiler_device_counting_service_callback_t cb, void *user_data); typedef rocprofiler_status_t (* rocprofiler_create_buffer_t) (rocprofiler_context_id_t context, unsigned long size, unsigned long watermark, rocprofiler_buffer_policy_t policy, rocprofiler_buffer_tracing_cb_t callback, void *callback_data, rocprofiler_buffer_id_t 
*buffer_id); typedef rocprofiler_status_t (* rocprofiler_create_context_t) (rocprofiler_context_id_t *context_id); typedef rocprofiler_status_t (* rocprofiler_start_context_t) (rocprofiler_context_id_t context_id); typedef rocprofiler_status_t (* rocprofiler_stop_context_t) (rocprofiler_context_id_t context_id); typedef rocprofiler_status_t (* rocprofiler_context_is_valid_t) (rocprofiler_context_id_t context_id, int *status); typedef rocprofiler_status_t (* rocprofiler_context_is_active_t) (rocprofiler_context_id_t context_id, int *status); typedef rocprofiler_status_t (* rocprofiler_create_profile_config_t) (rocprofiler_agent_id_t agent_id, rocprofiler_counter_id_t *counters_list, unsigned long counters_count, rocprofiler_profile_config_id_t *config_id); typedef rocprofiler_status_t (* rocprofiler_destroy_profile_config_t) (rocprofiler_profile_config_id_t config_id); typedef rocprofiler_status_t (* rocprofiler_force_configure_t) (rocprofiler_configure_func_t configure_func); typedef const char * (* rocprofiler_get_status_string_t) (rocprofiler_status_t status); typedef rocprofiler_status_t (* rocprofiler_get_thread_id_t) (rocprofiler_thread_id_t *tid); typedef rocprofiler_status_t (* rocprofiler_is_finalized_t) (int *status); typedef rocprofiler_status_t (* rocprofiler_is_initialized_t) (int *status); typedef rocprofiler_status_t (* rocprofiler_iterate_agent_supported_counters_t) (rocprofiler_agent_id_t agent_id, rocprofiler_available_counters_cb_t cb, void* user_data); typedef rocprofiler_status_t (* rocprofiler_iterate_counter_dimensions_t) (rocprofiler_counter_id_t id, rocprofiler_available_dimensions_cb_t info_cb, void *user_data); typedef rocprofiler_status_t (* rocprofiler_query_available_agents_t) (rocprofiler_agent_version_t version, rocprofiler_query_available_agents_cb_t callback, unsigned long agent_size, void *user_data); typedef rocprofiler_status_t (* rocprofiler_query_counter_info_t) (rocprofiler_counter_id_t counter_id, 
rocprofiler_counter_info_version_id_t version, void *info); typedef rocprofiler_status_t (* rocprofiler_query_counter_instance_count_t) (rocprofiler_agent_id_t agent_id, rocprofiler_counter_id_t counter_id, unsigned long *instance_count); typedef rocprofiler_status_t (* rocprofiler_query_record_counter_id_t) (rocprofiler_counter_instance_id_t id, rocprofiler_counter_id_t *counter_id); typedef rocprofiler_status_t (* rocprofiler_query_record_dimension_position_t) (rocprofiler_counter_instance_id_t id, rocprofiler_counter_dimension_id_t dim, unsigned long *pos); rocprofiler_flush_buffer_t rocprofiler_flush_buffer_FPTR; rocprofiler_sample_device_counting_service_t rocprofiler_sample_device_counting_service_FPTR; rocprofiler_configure_callback_dispatch_counting_service_t rocprofiler_configure_callback_dispatch_counting_service_FPTR; rocprofiler_configure_device_counting_service_t rocprofiler_configure_device_counting_service_FPTR; rocprofiler_create_buffer_t rocprofiler_create_buffer_FPTR; rocprofiler_create_context_t rocprofiler_create_context_FPTR; rocprofiler_start_context_t rocprofiler_start_context_FPTR; rocprofiler_stop_context_t rocprofiler_stop_context_FPTR; rocprofiler_context_is_active_t rocprofiler_context_is_active_FPTR; rocprofiler_context_is_valid_t rocprofiler_context_is_valid_FPTR; rocprofiler_create_profile_config_t rocprofiler_create_profile_config_FPTR; rocprofiler_force_configure_t rocprofiler_force_configure_FPTR; rocprofiler_get_status_string_t rocprofiler_get_status_string_FPTR; rocprofiler_get_thread_id_t rocprofiler_get_thread_id_FPTR; rocprofiler_is_finalized_t rocprofiler_is_finalized_FPTR; rocprofiler_is_initialized_t rocprofiler_is_initialized_FPTR; rocprofiler_iterate_agent_supported_counters_t rocprofiler_iterate_agent_supported_counters_FPTR; rocprofiler_iterate_counter_dimensions_t rocprofiler_iterate_counter_dimensions_FPTR; rocprofiler_query_available_agents_t rocprofiler_query_available_agents_FPTR; rocprofiler_query_counter_info_t 
rocprofiler_query_counter_info_FPTR; rocprofiler_query_counter_instance_count_t rocprofiler_query_counter_instance_count_FPTR; rocprofiler_query_record_counter_id_t rocprofiler_query_record_counter_id_FPTR; rocprofiler_query_record_dimension_position_t rocprofiler_query_record_dimension_position_FPTR; /* ** */ rocprofiler_context_id_t& get_client_ctx() { static rocprofiler_context_id_t client_ctx; return client_ctx; } rocprofiler_buffer_id_t& get_buffer() { static rocprofiler_buffer_id_t buf = {}; return buf; } std::string get_error_string() { return _rocp_sdk_error_string; } void set_error_string(std::string str) { _rocp_sdk_error_string = str; } int get_profiling_mode(void) { return rpsdk_profiling_mode; } /* ** */ static const char * obtain_function_pointers() { static bool first_time = true; void *dllHandle = nullptr; const char* pathname; const char *rocm_root; const char *ret_val = NULL; if( !first_time ){ ret_val = NULL; goto fn_exit; } pathname = std::getenv("PAPI_ROCP_SDK_LIB"); // If the user gave us an explicit path to librocprofiler-sdk.so, use it. if ( nullptr != pathname && strlen(pathname) <= PATH_MAX ) { dllHandle = dlopen(pathname, RTLD_NOW | RTLD_GLOBAL); if ( nullptr == dllHandle ) { std::string err_str = std::string("Invalid path in PAPI_ROCP_SDK_LIB: ")+pathname; set_error_string(err_str); ret_val = strdup(err_str.c_str()); SUBDBG("%s\n",ret_val); goto fn_fail; } }else{ // If we were not given an explicit path to the library, try elsewhere. rocm_root = std::getenv("PAPI_ROCP_SDK_ROOT"); if( nullptr == rocm_root || strlen(rocm_root) > PATH_MAX ){ // If we are here, the user has not given us any hint about the // location of the library, so we let dlopen() try the default paths. 
pathname = "librocprofiler-sdk.so"; }else{ int err; struct stat stat_info; std::string tmp_str = std::string(rocm_root) + "/lib/librocprofiler-sdk.so"; pathname = strdup(tmp_str.c_str()); err = stat(pathname, &stat_info); if (err != 0 || !S_ISREG(stat_info.st_mode)) { std::string err_str = std::string("Invalid path in PAPI_ROCP_SDK_ROOT: ")+tmp_str; set_error_string(err_str); ret_val = strdup(err_str.c_str()); SUBDBG("%s\n",ret_val); goto fn_fail; } } dllHandle = dlopen(pathname, RTLD_NOW | RTLD_GLOBAL); if (dllHandle == NULL) { // Nothing worked. Giving up. std::string err_str = std::string("Could not dlopen() librocprofiler-sdk.so. Set either PAPI_ROCP_SDK_ROOT, or PAPI_ROCP_SDK_LIB."); set_error_string(err_str); ret_val = strdup(err_str.c_str()); SUBDBG("%s\n",ret_val); goto fn_fail; } } DLL_SYM_CHECK(rocprofiler_flush_buffer, rocprofiler_flush_buffer_t); DLL_SYM_CHECK(rocprofiler_sample_device_counting_service, rocprofiler_sample_device_counting_service_t); DLL_SYM_CHECK(rocprofiler_configure_callback_dispatch_counting_service, rocprofiler_configure_callback_dispatch_counting_service_t); DLL_SYM_CHECK(rocprofiler_configure_device_counting_service, rocprofiler_configure_device_counting_service_t); DLL_SYM_CHECK(rocprofiler_create_context, rocprofiler_create_context_t); DLL_SYM_CHECK(rocprofiler_create_buffer, rocprofiler_create_buffer_t); DLL_SYM_CHECK(rocprofiler_start_context, rocprofiler_start_context_t); DLL_SYM_CHECK(rocprofiler_stop_context, rocprofiler_stop_context_t); DLL_SYM_CHECK(rocprofiler_context_is_valid, rocprofiler_context_is_valid_t); DLL_SYM_CHECK(rocprofiler_context_is_active, rocprofiler_context_is_active_t); DLL_SYM_CHECK(rocprofiler_create_profile_config, rocprofiler_create_profile_config_t); DLL_SYM_CHECK(rocprofiler_force_configure, rocprofiler_force_configure_t); DLL_SYM_CHECK(rocprofiler_get_status_string, rocprofiler_get_status_string_t); DLL_SYM_CHECK(rocprofiler_get_thread_id, rocprofiler_get_thread_id_t); 
DLL_SYM_CHECK(rocprofiler_is_finalized, rocprofiler_is_finalized_t); DLL_SYM_CHECK(rocprofiler_is_initialized, rocprofiler_is_initialized_t); DLL_SYM_CHECK(rocprofiler_iterate_agent_supported_counters, rocprofiler_iterate_agent_supported_counters_t); DLL_SYM_CHECK(rocprofiler_iterate_counter_dimensions, rocprofiler_iterate_counter_dimensions_t); DLL_SYM_CHECK(rocprofiler_query_available_agents, rocprofiler_query_available_agents_t); DLL_SYM_CHECK(rocprofiler_query_counter_info, rocprofiler_query_counter_info_t); DLL_SYM_CHECK(rocprofiler_query_counter_instance_count, rocprofiler_query_counter_instance_count_t); DLL_SYM_CHECK(rocprofiler_query_record_counter_id, rocprofiler_query_record_counter_id_t); DLL_SYM_CHECK(rocprofiler_query_record_dimension_position, rocprofiler_query_record_dimension_position_t); fn_exit: // Make sure we don't run this code multiple times. first_time = false; return ret_val; fn_fail: goto fn_exit; } /** * For a given counter, query the dimensions that it has. */ std::vector counter_dimensions(rocprofiler_counter_id_t counter) { std::vector dims; rocprofiler_available_dimensions_cb_t cb; cb = [](rocprofiler_counter_id_t, const rocprofiler_record_dimension_info_t* dim_info, size_t num_dims, void *user_data) { std::vector* vec; vec = static_cast*>(user_data); for(size_t i = 0; i < num_dims; i++){ vec->push_back(dim_info[i]); } return ROCPROFILER_STATUS_SUCCESS; }; // Use the callback defined above to populate the vector "dims" with the dimension info of counter "counter". ROCPROFILER_CALL(rocprofiler_iterate_counter_dimensions_FPTR(counter, cb, &dims), "Could not iterate counter dimensions"); return dims; } /* ** */ bool dimensions_match( dim_vector_t dim_instances, dim_vector_t recorded_dims ) { // Traverse all the dimensions in the event instance (i.e. 
base_event+qualifiers) of an event in the active_event_set_ctx for(const auto &ev_inst_dim : dim_instances ){ bool found_dim_id = false; // Traverse all the dimensions of the event in the record_callback() data for(const auto &recorded_dim : recorded_dims ){ if( ev_inst_dim.first == recorded_dim.first ){ found_dim_id = true; // If the ids of two dimensions match, we compare the positions. if( ev_inst_dim.second != recorded_dim.second ){ return false; } // If we found a match, we don't need to check the remaining recorded dimensions against this qualifier. break; } } // if the record_callback() data does not have one of the dimensions of the event instance, then they didn't match. if( !found_dim_id ){ return false; } } return true; } /* ** */ void record_callback(rocprofiler_dispatch_counting_service_data_t dispatch_data, rocprofiler_record_counter_t* record_data, size_t record_count, rocprofiler_user_data_t, void* callback_data_args) { rec_info_t *tmp_rec_info; uint64_t device; if( (NULL == _counter_values) || (NULL == active_event_set_ctx) || (0 == (active_event_set_ctx->state & RPSDK_AES_RUNNING)) ){ return; } _papi_hwi_lock(_rocp_sdk_lock); // Find the logical GPU id of this dispatch. auto agent = gpu_agents.find( dispatch_data.dispatch_info.agent_id.handle ); if( gpu_agents.end() != agent ){ device = agent->second->logical_node_type_id; }else{ device = -1; SUBDBG("agent_id in dispatch_data does not correspond to a known gpu agent.\n"); } // Create the mapping from events in the eventset (passed by the user) to entries in the "record_data" array. // The order of the entries in "record_data" will remain the same for a given profile, so we only need to do this once. if( index_mapping.empty() ){ rec_info_t event_set_to_rec_mapping[record_count]; index_mapping.resize( record_count*(active_event_set_ctx->num_events), false ); // Traverse all the recorded entries and cache some information about them // that we will need further down when doing the matching. 
for(int i=0; i<(int)record_count; i++){ rocprofiler_counter_id_t counter_id; rec_info_t &rec_info = event_set_to_rec_mapping[i]; ROCPROFILER_CALL(rocprofiler_query_record_counter_id_FPTR(record_data[i].id, &counter_id), "Could not retrieve counter_id"); rec_info.counter_id = counter_id; std::vector<rocprofiler_record_dimension_info_t> dimensions = counter_dimensions(counter_id); for(auto& dim : dimensions ){ unsigned long pos=0; ROCPROFILER_CALL(rocprofiler_query_record_dimension_position_FPTR(record_data[i].id, dim.id, &pos), "Could not retrieve dimension"); rec_info.recorded_dims.emplace_back( std::make_pair(dim.id, pos) ); } } // Traverse all events in the active event set and find which recorded entry matches each one of them. for( int ei=0; ei<active_event_set_ctx->num_events; ei++ ){ double counter_value_sum = 0.0; auto e_tmp = papi_id_to_event_instance.find( active_event_set_ctx->event_ids[ei] ); if( papi_id_to_event_instance.end() == e_tmp ){ continue; } event_instance_info_t e_inst = e_tmp->second; for(int i=0; inum_events; ei++ ){ double counter_value_sum = 0.0; for(int i=0; isecond; } return; } /* ** */ agent_map_t get_GPU_agent_info() { auto iterate_cb = [](rocprofiler_agent_version_t agents_ver, const void** agents_arr, size_t num_agents, void* user_data) { if(agents_ver != ROCPROFILER_AGENT_INFO_VERSION_0) throw std::runtime_error{"unexpected rocprofiler agent version"}; auto* agents_v = static_cast<agent_map_t*>(user_data); for(size_t i = 0; i < num_agents; ++i) { const auto* itr = static_cast<const rocprofiler_agent_v0_t*>(agents_arr[i]); if( ROCPROFILER_AGENT_TYPE_GPU == itr->type ){ agents_v->emplace(itr->id.handle, itr); } } return ROCPROFILER_STATUS_SUCCESS; }; auto _agents = agent_map_t{}; ROCPROFILER_CALL( rocprofiler_query_available_agents_FPTR(ROCPROFILER_AGENT_INFO_VERSION_0, iterate_cb, sizeof(rocprofiler_agent_t), static_cast<void*>(&_agents)), "query available agents"); return _agents; } /* ** */ void set_profile(rocprofiler_context_id_t context_id, rocprofiler_agent_id_t agent, rocprofiler_agent_set_profile_callback_t set_config, void*) { const SHARED_LOCK rlock(profile_cache_mutex); auto pos = rpsdk_profile_cache.find(agent.handle); if( rpsdk_profile_cache.end() != pos ){ set_config(context_id, pos->second); } return; } /* ** */ void buffered_callback(rocprofiler_context_id_t, rocprofiler_buffer_id_t,
rocprofiler_record_header_t** headers, size_t num_headers, void* user_data, uint64_t) { return; } /* ** */ int tool_init(rocprofiler_client_finalize_t fini_func, void* tool_data) { assert(tool_data != nullptr); if( NULL != getenv("PAPI_ROCP_SDK_DISPATCH_MODE") ){ rpsdk_profiling_mode = RPSDK_MODE_DISPATCH; } // Obtain the list of available (GPU) agents. gpu_agents = get_GPU_agent_info(); ROCPROFILER_CALL(rocprofiler_create_context_FPTR(&get_client_ctx()), "context creation"); if( RPSDK_MODE_DEVICE_SAMPLING == get_profiling_mode() ){ ROCPROFILER_CALL(rocprofiler_create_buffer_FPTR(get_client_ctx(), 32*1024, 16*1024, ROCPROFILER_BUFFER_POLICY_LOSSLESS, buffered_callback, tool_data, &get_buffer()), "buffer creation failed"); // Configure device_counting_service for all devices. for(auto g_it=gpu_agents.begin(); g_it!=gpu_agents.end(); ++g_it){ ROCPROFILER_CALL(rocprofiler_configure_device_counting_service_FPTR( get_client_ctx(), get_buffer(), g_it->second->id, set_profile, nullptr), "Could not setup sampling"); } }else{ ROCPROFILER_CALL(rocprofiler_configure_callback_dispatch_counting_service_FPTR( get_client_ctx(), dispatch_callback, tool_data, record_callback, tool_data), "Could not setup callback dispatch"); } return 0; } /* ** */ static void delete_event_list(void){ base_events_by_name.clear(); } /* ** */ static int assign_id_to_event(std::string event_name, event_instance_info_t ev_inst_info){ int papi_event_id = -1; // Note: _global_papi_event_count is std::atomic, so the following line is thread safe. papi_event_id = _global_papi_event_count++; papi_id_to_event_instance[ papi_event_id ] = ev_inst_info; event_instance_name_to_papi_id[ event_name ] = papi_event_id; return papi_event_id; } /* ** */ static void populate_event_list(void){ // If the event list is already populated, return without doing anything.
if( !base_events_by_name.empty() ) return; // Pick the first agent, because we currently do not support a mixture of heterogeneous GPUs, so all agents should be the same. const rocprofiler_agent_v0_t *agent = gpu_agents.begin()->second; // GPU Counter IDs std::vector<rocprofiler_counter_id_t> gpu_counters; auto itrt_cntr_cb = [](rocprofiler_agent_id_t, rocprofiler_counter_id_t* counters, size_t num_counters, void* udata) { std::vector<rocprofiler_counter_id_t>* vec = static_cast<std::vector<rocprofiler_counter_id_t>*>(udata); for(size_t i = 0; i < num_counters; i++){ vec->push_back(counters[i]); } return ROCPROFILER_STATUS_SUCCESS; }; // Get the counters available through the selected agent. ROCPROFILER_CALL(rocprofiler_iterate_agent_supported_counters_FPTR(agent->id, itrt_cntr_cb, static_cast<void*>(&gpu_counters)), "Could not fetch supported counters"); for(auto& counter : gpu_counters){ rocprofiler_counter_info_v0_t counter_info; ROCPROFILER_CALL( rocprofiler_query_counter_info_FPTR(counter, ROCPROFILER_COUNTER_INFO_VERSION_0, static_cast<void*>(&counter_info)), "Could not query info"); std::vector<rocprofiler_record_dimension_info_t> dim_info; dim_info = counter_dimensions(counter_info.id); base_events_by_name[counter_info.name].counter_info = counter_info; base_events_by_name[counter_info.name].dim_info = dim_info; ++_base_event_count; // This list does not contain "proper" events, with all qualifiers that // PAPI requires. This is just the list of base events as enumerated by the // vendor API. Therefore, it's ok to set "dim_instances" and "device" to dummy values.
event_instance_info_t ev_inst_info; ev_inst_info.qualifiers_present = 0; ev_inst_info.event_inst_name = counter_info.name; ev_inst_info.counter_info = counter_info; ev_inst_info.dim_info = dim_info; ev_inst_info.dim_instances = {}; ev_inst_info.device = -1; (void)assign_id_to_event(counter_info.name, ev_inst_info); } return; } /* ** */ void stop_counting(void){ int ctx_active, ctx_valid; _counter_values = NULL; ROCPROFILER_CALL(rocprofiler_context_is_valid_FPTR(get_client_ctx(), &ctx_valid), "check context validity"); if( !ctx_valid ){ SUBDBG("client_context is invalid\n"); return; } ROCPROFILER_CALL(rocprofiler_context_is_active_FPTR(get_client_ctx(), &ctx_active), "check if context is active"); if( !ctx_active ){ SUBDBG("client_context is not active\n"); return; } ROCPROFILER_CALL(rocprofiler_stop_context_FPTR(get_client_ctx()), "stop context"); } /* ** */ void start_counting(vendorp_ctx_t ctx){ // Store a pointer to the counter value array in a global variable so that // our functions that are called from the ROCprofiler-SDK (instead of our // API) can still find the array. _counter_values = ctx->counters; ROCPROFILER_CALL(rocprofiler_start_context_FPTR(get_client_ctx()), "start context"); } /* ** */ int read_sample(){ int papi_errno = PAPI_OK; int ret_val; size_t rec_count = 1024; rocprofiler_record_counter_t output_records[1024]; if( (NULL == _counter_values) || (NULL == active_event_set_ctx) || (0 == (active_event_set_ctx->state & RPSDK_AES_RUNNING)) ){ papi_errno = PAPI_ENOTRUN; goto fn_fail; } ret_val = rocprofiler_sample_device_counting_service_FPTR( get_client_ctx(), {}, ROCPROFILER_COUNTER_FLAG_NONE, output_records, &rec_count); if( ret_val != ROCPROFILER_STATUS_SUCCESS ){ papi_errno = PAPI_ECMP; goto fn_fail; } // Create the mapping from events in the eventset (passed by the user) to entries (samples) in the "output_records" array. // The order of the entries in "output_records" will remain the same for a given profile, so we only need to do this once. 
if( index_mapping.empty() ){ rec_info_t event_set_to_rec_mapping[rec_count]; index_mapping.resize( rec_count*(active_event_set_ctx->num_events), false ); // Traverse all the sampled entries and cache some information about them // that we will need further down when doing the matching. for(int i=0; i<(int)rec_count; i++){ rocprofiler_counter_id_t counter_id; rec_info_t &rec_info = event_set_to_rec_mapping[i]; rec_info.device = -1; auto agent = gpu_agents.find( output_records[i].agent_id.handle ); if( gpu_agents.end() != agent ){ rec_info.device = agent->second->logical_node_type_id; }else{ SUBDBG("agent_id of recorded sample %d does not correspond to a known gpu agent.\n", i); } ROCPROFILER_CALL(rocprofiler_query_record_counter_id_FPTR(output_records[i].id, &counter_id), "Could not retrieve counter_id"); rec_info.counter_id = counter_id; std::vector<rocprofiler_record_dimension_info_t> dimensions = counter_dimensions(counter_id); for(auto& dim : dimensions ){ unsigned long pos=0; ROCPROFILER_CALL(rocprofiler_query_record_dimension_position_FPTR(output_records[i].id, dim.id, &pos), "Could not retrieve dimension"); rec_info.recorded_dims.emplace_back( std::make_pair(dim.id, pos) ); } } // Traverse all events in the active event set and find which entries in the set of samples match each one of them. for( int ei=0; ei<active_event_set_ctx->num_events; ei++ ){ double counter_value_sum = 0.0; // Find the internal event instance.
auto tmp = papi_id_to_event_instance.find( active_event_set_ctx->event_ids[ei] ); if( papi_id_to_event_instance.end() == tmp ){ SUBDBG("EventSet contains an event id that is unknown to the rocp_sdk component.\n"); continue; } event_instance_info_t e_inst = tmp->second; for(int i=0; inum_events; ei++ ){ double counter_value_sum = 0.0; for(int i=0; isecond.counter_info.description; return PAPI_OK; } /* ** */ int evt_id_to_name(int papi_event_id, const char **name){ auto it = papi_id_to_event_instance.find( papi_event_id ); if( papi_id_to_event_instance.end() == it ){ return PAPI_ENOEVNT; } *name = it->second.event_inst_name.c_str(); return PAPI_OK; } /* ** */ static int build_event_info_from_name(std::string event_name, event_instance_info_t *ev_inst_info){ int pos=0, ppos=0; std::vector<std::string> qualifiers = {}; dim_vector_t dim_instances = {}; std::string base_event_name; uint64_t qualifiers_present = 0; int device_qualifier_value = -1; pos=event_name.find(':'); if( pos == event_name.npos){ base_event_name = event_name; }else{ base_event_name = event_name.substr(0, pos-0); ppos = pos+1; // Tokenize the event name and keep the qualifiers in a vector. while( (pos=event_name.find(':', ppos)) != event_name.npos){ std::string qual_tuple = event_name.substr(ppos,pos-ppos); qualifiers.emplace_back( qual_tuple ); ppos = pos+1; } // Add the last qualifier we found in the while loop to the vector. qualifiers.emplace_back( event_name.substr(ppos,pos-ppos) ); } auto it0 = base_events_by_name.find(base_event_name); if( base_events_by_name.end() == it0 ){ return PAPI_ENOEVNT; } base_event_info_t base_event_info = it0->second; for( const auto & qual : qualifiers ){ // All qualifiers must have the form "qual_name=qual_value". pos=qual.find('='); if( pos == qual.npos){ return PAPI_EINVAL; } std::string qual_name = qual.substr(0, pos-0); int qual_val = std::stoi( qual.substr(pos+1) ); // The "device" qualifier does not appear as a rocprofiler-sdk dimension.
// It comes from us (the PAPI component), so it needs special treatment. if( qual_name.compare("device") == 0 ){ // We use the first bit past the dimension bits to designate the presence of the "device" qualifier. qualifiers_present |= ((uint64_t)1 << base_event_info.dim_info.size()); device_qualifier_value = qual_val; }else{ int qual_i = 0; // Make sure that the qualifier name corresponds to one of the known dimensions of this event. for( const auto & dim : base_event_info.dim_info ){ if( qual_name.compare(dim.name) == 0 ){ // Make sure that the qualifier value is within the proper range. if( qual_val >= dim.instance_size ){ return PAPI_EINVAL; } dim_instances.emplace_back( std::make_pair(dim.id, qual_val) ); // Mark which qualifiers we have found based on the order in which they appear in // base_event_info.dim_info, NOT based on the order the user provided them. // This will work up to 64 possible qualifiers. if( qual_i < 64 ){ qualifiers_present |= ((uint64_t)1 << qual_i); }else{ SUBDBG("More than 64 qualifiers detected in event name: %s\n",event_name.c_str()); } } ++qual_i; } } } // Sort the qualifiers (dimension instances) based on dimension id. This allows the user to give us the // qualifiers in any order. std::sort(dim_instances.begin(), dim_instances.end(), [](const dim_t &a, const dim_t &b) { return (a.first < b.first); } ); ev_inst_info->qualifiers_present = qualifiers_present; ev_inst_info->event_inst_name = event_name; ev_inst_info->counter_info = base_event_info.counter_info; ev_inst_info->dim_info = base_event_info.dim_info; ev_inst_info->dim_instances = dim_instances; ev_inst_info->device = device_qualifier_value; return PAPI_OK; } int evt_name_to_id(std::string event_name, unsigned int *event_id){ int ret_val = PAPI_OK; event_instance_info_t ev_inst_info; unsigned int papi_event_id; // If the event already exists in our metadata, return its id.
auto it1 = event_instance_name_to_papi_id.find( event_name ); if( event_instance_name_to_papi_id.end() != it1 ){ papi_event_id = it1->second; }else{ // If we've never seen this event before, insert the info into our metadata. ret_val = build_event_info_from_name(event_name, &ev_inst_info); if( PAPI_OK != ret_val ){ return ret_val; } papi_event_id = assign_id_to_event(event_name, ev_inst_info); } *event_id = papi_event_id; return PAPI_OK; } /* ** */ int evt_enum(unsigned int *event_code, int modifier){ int papi_errno=PAPI_OK, tmp_code; base_event_info_t event_info; std::string full_name; event_instance_info_t ev_inst; populate_event_list(); switch(modifier) { case PAPI_ENUM_FIRST: papi_errno = PAPI_OK; *event_code = 0; break; case PAPI_ENUM_EVENTS: tmp_code = *event_code + 1; if( tmp_code >= _base_event_count ){ papi_errno = PAPI_ENOEVNT; break; } papi_errno = PAPI_OK; *event_code = tmp_code; break; case PAPI_NTV_ENUM_UMASKS: tmp_code = *event_code; { std::string qual_ub, tmp_desc; auto it = papi_id_to_event_instance.find( tmp_code ); if( papi_id_to_event_instance.end() == it ){ papi_errno = PAPI_ENOEVNT; break; } ev_inst = it->second; int qual_i=-1; // Find the last qualifier present so that we can create an event instance using the next qualifier in the list. for(int i=0; i<64; ++i){ if( ( ev_inst.qualifiers_present >> i) & 0x1 ){ qual_i = i; } } // Increment the last one found by one to create the next potential qualifier index. ++qual_i; // If we exceeded the number of available dimensions (i.e. qualifiers) then we are done with this base event. if( qual_i > ev_inst.dim_info.size() ){ papi_errno = PAPI_ENOEVNT; break; // Here we insert the "device" qualifier, which does not appear as a dimension in rocprofiler-sdk. 
}else if( qual_i == ev_inst.dim_info.size() ){ full_name = ev_inst.counter_info.name + std::string(":device=0"); qual_ub = std::to_string(gpu_agents.size()-1); tmp_desc = "masks: Range: [0-" + qual_ub + "], default=0."; }else{ rocprofiler_record_dimension_info_t dim = ev_inst.dim_info[qual_i]; full_name = ev_inst.counter_info.name + std::string(":") + dim.name + std::string("=0"); qual_ub = std::to_string(dim.instance_size-1); tmp_desc = "masks: Range: [0-" + qual_ub + "], default=sum."; } // Insert the new event (base_event:SOME_QUALIFIER=0) into the data structures and get an event_code for it. evt_name_to_id(full_name, event_code); papi_id_to_event_instance[*event_code].counter_info.description = strdup(tmp_desc.c_str()); papi_errno = PAPI_OK; break; } default: papi_errno = PAPI_EINVAL; break; } return papi_errno; } /* ** */ void empty_active_event_set(void){ active_event_set_ctx = NULL; index_mapping.clear(); active_device_set.clear(); return; } /* ** */ int set_profile_cache(vendorp_ctx_t ctx){ std::map<int, std::vector<event_instance_info_t> > active_events_per_device; // Acquire a unique lock so that no other thread can try to read // the profile cache while we are modifying it. const UNIQUE_LOCK wlock(profile_cache_mutex); rpsdk_profile_cache.clear(); for( int i=0; i < ctx->num_events; ++i) { // Make sure the event exists.
auto it = papi_id_to_event_instance.find( ctx->event_ids[i] ); if( papi_id_to_event_instance.end() == it ){ return PAPI_ENOEVNT; } active_device_set.insert(it->second.device); active_events_per_device[it->second.device].emplace_back(it->second); } for(const auto &a_it : gpu_agents ){ rocprofiler_profile_config_id_t profile; auto agent = a_it.second; std::vector<rocprofiler_counter_id_t> event_vid_list = {}; std::set<uint64_t> id_set = {}; for( const auto e_inst : active_events_per_device[agent->logical_node_type_id] ){ rocprofiler_counter_id_t vid = e_inst.counter_info.id; // If the vid of the event (base event) is not already in the event_vid_list, then add it. if( id_set.find(vid.handle) == id_set.end() ){ event_vid_list.emplace_back( vid ); id_set.emplace( vid.handle ); } } //TODO Error handling: right now we can't tell which event caused the problem, if a problem occurs. ROCPROFILER_CALL(rocprofiler_create_profile_config_FPTR(agent->id, event_vid_list.data(), event_vid_list.size(), &profile), "Could not construct profile cfg"); rpsdk_profile_cache.emplace(agent->id.handle, profile); } return PAPI_OK; } /* ** */ void tool_fini(void* tool_data) { stop_counting(); empty_active_event_set(); return; } /* ** */ int setup() { int status = 0; // Set sampling as the default mode and allow the users to change this // behavior by setting the environment variable PAPI_ROCP_SDK_DISPATCH_MODE rpsdk_profiling_mode = RPSDK_MODE_DEVICE_SAMPLING; if( NULL != getenv("PAPI_ROCP_SDK_DISPATCH_MODE") ){ rpsdk_profiling_mode = RPSDK_MODE_DISPATCH; } const char *error_msg = obtain_function_pointers(); if( NULL != error_msg ){ if( get_error_string().empty() ){ set_error_string("Could not obtain all functions from librocprofiler-sdk.so.
Possible library version mismatch."); SUBDBG("dlsym(): %s\n", error_msg); } goto fn_fail; } if( (ROCPROFILER_STATUS_SUCCESS == rocprofiler_is_initialized_FPTR(&status)) && (0 == status) ){ ROCPROFILER_CALL(rocprofiler_force_configure_FPTR(&rocprofiler_configure), "force configuration"); } return PAPI_OK; fn_fail: return PAPI_ECMP; } } // namespace papi_rocpsdk //-------------------------------------------------------------------------------- //-------------------------------------------------------------------------------- extern "C" int rocprofiler_sdk_init_pre(void) { return PAPI_OK; } extern "C" int rocprofiler_sdk_init(void) { int papi_errno=PAPI_OK; if( papi_rocpsdk::setup() ){ papi_errno = PAPI_ECMP; goto fn_fail; } papi_rocpsdk::populate_event_list(); fn_exit: return papi_errno; fn_fail: papi_rocpsdk::delete_event_list(); goto fn_exit; } extern "C" int rocprofiler_sdk_shutdown(void) { papi_rocpsdk::stop_counting(); papi_rocpsdk::empty_active_event_set(); papi_rocpsdk::delete_event_list(); return PAPI_OK; } extern "C" int rocprofiler_sdk_stop(vendorp_ctx_t ctx) { if( ctx ){ ctx->state = RPSDK_AES_STOPPED; } finalize_ctx(ctx); papi_rocpsdk::stop_counting(); papi_rocpsdk::empty_active_event_set(); papi_rocpsdk::delete_event_list(); return PAPI_OK; } extern "C" int rocprofiler_sdk_start(vendorp_ctx_t ctx) { int i; for(i=0; inum_events; i++){ ctx->counters[i] = 0; } papi_rocpsdk::start_counting(ctx); ctx->state |= RPSDK_AES_RUNNING; return PAPI_OK; } extern "C" int rocprofiler_sdk_ctx_reset(vendorp_ctx_t ctx) { int i; if( !ctx ){ SUBDBG("Trying to reset a component before calling PAPI_start()."); return PAPI_EINVAL; } for(i=0; inum_events; i++){ ctx->counters[i] = 0; } return PAPI_OK; } extern "C" int rocprofiler_sdk_ctx_open(int *event_ids, int num_events, vendorp_ctx_t *ctx) { int papi_errno=PAPI_OK; *ctx = (vendorp_ctx_t)papi_calloc(1, sizeof(struct vendord_ctx)); if (NULL == *ctx) { return PAPI_ENOMEM; } _papi_hwi_lock(_rocp_sdk_lock); 
papi_rocpsdk::empty_active_event_set(); papi_errno = init_ctx(event_ids, num_events, *ctx); if (papi_errno != PAPI_OK) { goto fn_fail; } papi_rocpsdk::active_event_set_ctx = *ctx; papi_rocpsdk::set_profile_cache(*ctx); (*ctx)->state = RPSDK_AES_OPEN; fn_exit: _papi_hwi_unlock(_rocp_sdk_lock); return papi_errno; fn_fail: finalize_ctx(*ctx); goto fn_exit; } extern "C" int rocprofiler_sdk_ctx_read(vendorp_ctx_t ctx, long long **counters) { int papi_errno = PAPI_OK; // If the collection mode is DEVICE_SAMPLING get an explicit sample. if( RPSDK_MODE_DEVICE_SAMPLING == papi_rocpsdk::get_profiling_mode() ){ papi_errno = papi_rocpsdk::read_sample(); } // If the mode is not sampling the counter data should already be in // "ctx->counters", because record_callback() should have placed it there // asynchronously. // However, we don't use any synchronization mechanisms to guarantee that // record_callback() has indeed completed before this function returns. // Therefore, we recommend that the user code adds a small delay between // the completion of a GPU kernel and the call of PAPI_read().
*counters = ctx->counters; return papi_errno; } extern "C" int rocprofiler_sdk_evt_enum(unsigned int *event_code, int modifier) { return papi_rocpsdk::evt_enum(event_code, modifier); } extern "C" int rocprofiler_sdk_evt_code_to_name(unsigned int event_code, char *name, int len) { int papi_errno = PAPI_OK; const char *tmp_name; papi_errno = papi_rocpsdk::evt_id_to_name(event_code, &tmp_name); if( PAPI_OK == papi_errno ){ snprintf(name, len, "%s", tmp_name); } return papi_errno; } extern "C" int rocprofiler_sdk_evt_code_to_descr(unsigned int event_code, char *descr, int len) { int papi_errno = PAPI_OK; const char *tmp_descr; papi_errno = papi_rocpsdk::evt_id_to_descr(event_code, &tmp_descr); if ( PAPI_OK == papi_errno ) { snprintf(descr, len, "%s", tmp_descr); } return papi_errno; } extern "C" int rocprofiler_sdk_evt_code_to_info(unsigned int event_code, PAPI_event_info_t *info) { int papi_errno = PAPI_OK; const char *tmp_name, *tmp_descr; papi_errno = papi_rocpsdk::evt_id_to_name(event_code, &tmp_name); if ( PAPI_OK == papi_errno ) { snprintf(info->symbol, PAPI_HUGE_STR_LEN, "%s", tmp_name); } papi_errno = papi_rocpsdk::evt_id_to_descr(event_code, &tmp_descr); if ( PAPI_OK == papi_errno ) { snprintf(info->long_descr, PAPI_HUGE_STR_LEN, "%s", tmp_descr); snprintf(info->short_descr, PAPI_MIN_STR_LEN, "%s", tmp_descr); } return papi_errno; } extern "C" int rocprofiler_sdk_evt_name_to_code(const char *event_name, unsigned int *event_code) { int papi_errno = PAPI_OK; // If the "device" qualifier is not provided by the user, default it to zero.
if( NULL == strstr(event_name, "device=") ){ char *amended_event_name; size_t len = strlen(event_name)+strlen(":device=0")+1; // +1 for '\0' if( len > 1024 ){ return PAPI_EMISC; } amended_event_name = (char *)calloc(len, sizeof(char)); int ret = snprintf(amended_event_name, len, "%s:device=0", event_name); if( ret >= len ){ free(amended_event_name); return PAPI_EMISC; } papi_errno = papi_rocpsdk::evt_name_to_id(amended_event_name, event_code); }else{ papi_errno = papi_rocpsdk::evt_name_to_id(event_name, event_code); } return papi_errno; } extern "C" int rocprofiler_sdk_err_get_last(const char **err){ *err = strdup(papi_rocpsdk::get_error_string().substr(0,PAPI_MAX_STR_LEN-1).c_str() ); return PAPI_OK; } static int init_ctx(int *event_ids, int num_events, vendorp_ctx_t ctx) { ctx->event_ids = event_ids; ctx->num_events = num_events; ctx->counters = (long long *)papi_calloc(num_events, sizeof(long long)); if (NULL == ctx->counters) { return PAPI_ENOMEM; } return PAPI_OK; } static int finalize_ctx(vendorp_ctx_t ctx) { if( ctx ){ ctx->event_ids = NULL; ctx->num_events = 0; free(ctx->counters); ctx->counters = NULL; } free(ctx); return PAPI_OK; } rocprofiler_tool_configure_result_t * rocprofiler_configure(uint32_t version, const char* runtime_version, uint32_t priority, rocprofiler_client_id_t* id) { const char *error_msg = papi_rocpsdk::obtain_function_pointers(); if( NULL != error_msg ){ if( papi_rocpsdk::get_error_string().empty() ){ papi_rocpsdk::set_error_string("Could not obtain all functions from librocprofiler-sdk.so. 
Possible library version mismatch."); SUBDBG("dlsym(): %s\n", error_msg); } return NULL; } // set the client name id->name = "PAPI_ROCP_SDK_COMPONENT"; auto* client_tool_data = new std::string("CLIENT_TOOL_STRING"); // create configure data static auto cfg = rocprofiler_tool_configure_result_t{ sizeof(rocprofiler_tool_configure_result_t), &papi_rocpsdk::tool_init, &papi_rocpsdk::tool_fini, static_cast(client_tool_data) }; // return pointer to configure data return &cfg; } papi-papi-7-2-0-t/src/components/rocp_sdk/sdk_class.h000066400000000000000000000022611502707512200225400ustar00rootroot00000000000000#ifndef __VENDOR_PROFILER_V1_H__ #define __VENDOR_PROFILER_V1_H__ typedef struct vendord_ctx *vendorp_ctx_t; extern int rocprofiler_sdk_init_pre(void); extern int rocprofiler_sdk_init(void); extern int rocprofiler_sdk_shutdown(void); extern int rocprofiler_sdk_ctx_open(int *events_id, int num_events, vendorp_ctx_t *ctx); extern int rocprofiler_sdk_start(vendorp_ctx_t ctx); extern int rocprofiler_sdk_stop(vendorp_ctx_t ctx); extern int rocprofiler_sdk_ctx_read(vendorp_ctx_t ctx, long long **counters); extern int rocprofiler_sdk_ctx_stop(vendorp_ctx_t ctx); extern int rocprofiler_sdk_ctx_reset(vendorp_ctx_t ctx); extern int rocprofiler_sdk_ctx_close(vendorp_ctx_t ctx); extern int rocprofiler_sdk_evt_enum(unsigned int *event_code, int modifier); extern int rocprofiler_sdk_evt_code_to_name(unsigned int event_code, char *name, int len); extern int rocprofiler_sdk_evt_code_to_descr(unsigned int event_code, char *descr, int len); extern int rocprofiler_sdk_evt_code_to_info(unsigned int event_code, PAPI_event_info_t *info); extern int rocprofiler_sdk_evt_name_to_code(const char *name, unsigned int *event_code); extern int rocprofiler_sdk_err_get_last(const char **err_string); #endif papi-papi-7-2-0-t/src/components/rocp_sdk/sdk_class.hpp000066400000000000000000000070101502707512200230750ustar00rootroot00000000000000#ifndef __PAPI_ROCPSDK_INTERNAL_H__ #define 
__PAPI_ROCPSDK_INTERNAL_H__ #include #include #include #include #include #include #include #include "papi.h" #ifdef __cplusplus extern "C" { #endif #include "papi_internal.h" #include "papi_memory.h" #ifdef __cplusplus } #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #if (__cplusplus >= 201402L) // c++14 #include #endif #include #include #include #include #include #include #define DLL_SYM_CHECK(name, type) \ do { \ char *err; \ name##_FPTR = (type) dlsym(dllHandle, #name);\ err = dlerror(); \ if(NULL != err) { \ return err; \ } \ } while (0) #if defined(PAPI_ROCPSDK_DEBUG) #define ROCPROFILER_CALL(result, msg) \ { \ rocprofiler_status_t CHECKSTATUS = result; \ if(CHECKSTATUS != ROCPROFILER_STATUS_SUCCESS) \ { \ std::string status_msg = rocprofiler_get_status_string_FPTR(CHECKSTATUS); \ std::cerr << "[" #result "][" << __FILE__ << ":" << __LINE__ << "] " << msg \ << " failed with error code " << CHECKSTATUS << ": " << status_msg \ << std::endl; \ std::stringstream errmsg{}; \ errmsg << "[" #result "][" << __FILE__ << ":" << __LINE__ << "] " << msg " failure (" \ << status_msg << ")"; \ throw std::runtime_error(errmsg.str()); \ } \ } #else #define ROCPROFILER_CALL(result, msg) {(void)result;} #endif #define RPSDK_MODE_DISPATCH (0) #define RPSDK_MODE_DEVICE_SAMPLING (1) #define RPSDK_AES_STOPPED (0x0) #define RPSDK_AES_OPEN (0x1) #define RPSDK_AES_RUNNING (0x2) typedef struct { char name[PAPI_MAX_STR_LEN]; char descr[PAPI_2MAX_STR_LEN]; } ntv_event_t; typedef struct { ntv_event_t *events; int num_events; } ntv_event_table_t; struct vendord_ctx { int state; int *event_ids; long long *counters; int num_events; }; typedef struct vendord_ctx *vendorp_ctx_t; static int init_ctx(int *event_ids, int num_events, vendorp_ctx_t ctx); static int finalize_ctx(vendorp_ctx_t ctx); extern unsigned int _rocp_sdk_lock; #endif 
papi-papi-7-2-0-t/src/components/rocp_sdk/tests/000077500000000000000000000000001502707512200215625ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/rocp_sdk/tests/Makefile000066400000000000000000000025641502707512200232310ustar00rootroot00000000000000NAME=template include ../../Makefile_comp_tests.target ROCP_SDK_INCL=-I$(PAPI_ROCP_SDK_ROOT)/include \ -I$(PAPI_ROCP_SDK_ROOT)/include/hsa \ -I$(PAPI_ROCP_SDK_ROOT)/hsa/include AMDCXX ?= amdclang++ CFLAGS = $(OPTFLAGS) CPPFLAGS += $(INCLUDE) $(ROCP_SDK_INCL) LDFLAGS += $(PAPILIB) $(TESTLIB) $(UTILOBJS) GPUARCH = $(shell rocm_agent_enumerator 2>/dev/null | grep -v "gfx000" | head -1) ifneq ($(GPUARCH),) ARCHFLAG=--offload-arch=$(GPUARCH) endif GPUFLAGS=$(ARCHFLAG) --hip-link --rtlib=compiler-rt -unwindlib=libgcc TESTS = simple advanced two_eventsets simple_sampling template_tests: $(TESTS) %.o: %.c $(CC) $(CPPFLAGS) $(CFLAGS) $(OPTFLAGS) -D__HIP_PLATFORM_AMD__ -c -o $@ $< kernel.o: kernel.cpp $(AMDCXX) -D__HIP_ROCclr__=1 -O2 -g -DNDEBUG $(ARCHFLAG) -W -Wall -Wextra -Wshadow -o kernel.o -x hip -c kernel.cpp simple: simple.o kernel.o $(AMDCXX) -O2 -g -DNDEBUG $(GPUFLAGS) simple.o kernel.o -o simple $(LDFLAGS) advanced: advanced.o kernel.o $(AMDCXX) -O2 -g -DNDEBUG $(GPUFLAGS) advanced.o kernel.o -o advanced $(LDFLAGS) two_eventsets: two_eventsets.o kernel.o $(AMDCXX) -O2 -g -DNDEBUG $(GPUFLAGS) two_eventsets.o kernel.o -o two_eventsets $(LDFLAGS) simple_sampling: simple_sampling.o kernel.o $(AMDCXX) -O2 -g -DNDEBUG $(GPUFLAGS) simple_sampling.o kernel.o -o simple_sampling $(LDFLAGS) -pthread clean: rm -f $(TESTS) *.o papi-papi-7-2-0-t/src/components/rocp_sdk/tests/advanced.c000066400000000000000000000123641502707512200235010ustar00rootroot00000000000000#include #include #include #include #include extern int launch_kernel(int device_id); int main(int argc, char *argv[]) { int dev_count=0; int papi_errno; #define NUM_EVENTS (14) long long counters[NUM_EVENTS] = { 0 }; const char *events[NUM_EVENTS] = { 
"rocp_sdk:::SQ_CYCLES:device=0", "rocp_sdk:::SQ_BUSY_CYCLES:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=0:device=0", "rocp_sdk:::SQ_BUSY_CYCLES:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=1:device=0", "rocp_sdk:::SQ_BUSY_CYCLES:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=2:device=0", "rocp_sdk:::SQ_BUSY_CYCLES:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=3:device=0", "rocp_sdk:::SQ_BUSY_CYCLES:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=4:device=0", "rocp_sdk:::SQ_BUSY_CYCLES:device=0:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=5", "rocp_sdk:::SQ_BUSY_CYCLES:DIMENSION_INSTANCE=0", "rocp_sdk:::SQ_WAVE_CYCLES:DIMENSION_SHADER_ENGINE=0:device=0", "rocp_sdk:::SQ_WAVE_CYCLES:DIMENSION_SHADER_ENGINE=1:device=0", "rocp_sdk:::SQ_WAVE_CYCLES:DIMENSION_SHADER_ENGINE=2:device=0", "rocp_sdk:::SQ_WAVE_CYCLES:DIMENSION_SHADER_ENGINE=3:device=0", "rocp_sdk:::SQ_WAVE_CYCLES:DIMENSION_SHADER_ENGINE=4:device=0", "rocp_sdk:::SQ_WAVE_CYCLES:device=0" }; papi_errno = PAPI_library_init(PAPI_VER_CURRENT); if (papi_errno != PAPI_VER_CURRENT) { test_fail(__FILE__, __LINE__, "PAPI_library_init", papi_errno); } int eventset = PAPI_NULL; papi_errno = PAPI_create_eventset(&eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset", papi_errno); } for (int i = 0; i < NUM_EVENTS; ++i) { papi_errno = PAPI_add_named_event(eventset, events[i]); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_add_named_event", papi_errno); } } papi_errno = PAPI_start(eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start", papi_errno); } for(int rep=0; rep<=4; ++rep){ printf("--------------------- launch_kernel(0)\n"); papi_errno = launch_kernel(0); if (papi_errno != 0) { test_fail(__FILE__, __LINE__, "launch_kernel(0)", papi_errno); } usleep(1000); papi_errno = PAPI_read(eventset, counters); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_read", papi_errno); } printf("--------------------- PAPI_read()\n"); for (int i 
= 0; i < NUM_EVENTS; ++i) { fprintf(stdout, "%s: %.2lfM\n", events[i], (double)counters[i]/1e6); } } papi_errno = PAPI_stop(eventset, counters); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_stop", papi_errno); } printf("--------------------- PAPI_stop()\n"); for (int i = 0; i < NUM_EVENTS; ++i) { fprintf(stdout, "%s: %.2lfM\n", events[i], (double)counters[i]/1e6); } if (hipGetDeviceCount(&dev_count) != hipSuccess){ test_fail(__FILE__, __LINE__, "Error while counting AMD devices:", papi_errno); } if( dev_count > 1 ){ printf("======================================================\n"); printf("==================== SECOND ROUND ====================\n"); printf("======================================================\n"); for(int rep=0; rep<=3; ++rep){ papi_errno = PAPI_start(eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start", papi_errno); } printf("--------------------- launch_kernel(1)\n"); papi_errno = launch_kernel(1); if (papi_errno != 0) { test_fail(__FILE__, __LINE__, "launch_kernel(1)", papi_errno); } usleep(1000); papi_errno = PAPI_read(eventset, counters); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_read", papi_errno); } printf("--------------------- PAPI_read()\n"); for (int i = 0; i < NUM_EVENTS; ++i) { fprintf(stdout, "%s: %.2lfM\n", events[i], (double)counters[i]/1e6); } papi_errno = PAPI_stop(eventset, counters); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_stop", papi_errno); } printf("--------------------- PAPI_stop()\n"); for (int i = 0; i < NUM_EVENTS; ++i) { fprintf(stdout, "%s: %.2lfM\n", events[i], (double)counters[i]/1e6); } } } papi_errno = PAPI_cleanup_eventset(eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset", papi_errno); } papi_errno = PAPI_destroy_eventset(&eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset", papi_errno); } PAPI_shutdown(); test_pass(__FILE__); 
return 0; } papi-papi-7-2-0-t/src/components/rocp_sdk/tests/kernel.cpp000066400000000000000000000032731502707512200235530ustar00rootroot00000000000000#include <hip/hip_runtime.h> #include <iostream> extern "C" int launch_kernel(int device_id); #define HIP_CALL(call) \ do \ { \ hipError_t err = call; \ if(err != hipSuccess) \ { \ std::cerr << hipGetErrorString(err) << std::endl; \ return(-1); \ } \ } while(0) __global__ void kernelA(int x, int y) { volatile int i, t; for(i=0; i<1000000; i++){ t = 173/x; x = t + y; } } template <typename T> __global__ void kernelC(T* C_d, const T* A_d, size_t N) { size_t offset = (blockIdx.x * blockDim.x + threadIdx.x); size_t stride = blockDim.x * gridDim.x; for(size_t i = offset; i < N; i += stride) { C_d[i] = A_d[i] * A_d[i]; } } int launch_kernel(int device_id) { const int NUM_LAUNCH = 1; HIP_CALL(hipSetDevice(device_id)); for(int i = 0; i < NUM_LAUNCH; i++) { hipLaunchKernelGGL(kernelA, dim3(1), dim3(1), 0, 0, 1, 2); } HIP_CALL(hipDeviceSynchronize()); return 0; } papi-papi-7-2-0-t/src/components/rocp_sdk/tests/simple.c000066400000000000000000000054601502707512200232240ustar00rootroot00000000000000#include <stdio.h> #include <unistd.h> #include <papi.h> #include "papi_test.h" extern int launch_kernel(int device_id); int main(int argc, char *argv[]) { int papi_errno; #define NUM_EVENTS (7) long long counters[NUM_EVENTS] = { 0 }; const char *events[NUM_EVENTS] = { "rocp_sdk:::SQ_CYCLES:device=0:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=3", "rocp_sdk:::TCC_CYCLE:device=0:DIMENSION_INSTANCE=2", "rocp_sdk:::SQ_INSTS:device=0:DIMENSION_INSTANCE=0", "rocp_sdk:::SQ_BUSY_CYCLES:device=0:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=0", "rocp_sdk:::SQ_BUSY_CYCLES:device=0:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=1", "rocp_sdk:::SQ_BUSY_CYCLES:device=0:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=2", "rocp_sdk:::SQ_BUSY_CYCLES:device=0:DIMENSION_INSTANCE=0" }; papi_errno = PAPI_library_init(PAPI_VER_CURRENT); if (papi_errno != PAPI_VER_CURRENT) { test_fail(__FILE__, __LINE__, "PAPI_library_init", papi_errno); } int
eventset = PAPI_NULL; papi_errno = PAPI_create_eventset(&eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset", papi_errno); } for (int i = 0; i < NUM_EVENTS; ++i) { papi_errno = PAPI_add_named_event(eventset, events[i]); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_add_named_event", papi_errno); } } papi_errno = PAPI_start(eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start", papi_errno); } printf("--------------------- launch_kernel(0)\n"); papi_errno = launch_kernel(0); if (papi_errno != 0) { test_fail(__FILE__, __LINE__, "launch_kernel(0)", papi_errno); } usleep(10000); papi_errno = PAPI_read(eventset, counters); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_read", papi_errno); } printf("--------------------- PAPI_read()\n"); for (int i = 0; i < NUM_EVENTS; ++i) { printf("%s: %.2lfM\n", events[i], (double)counters[i]/1e6); } papi_errno = PAPI_stop(eventset, counters); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_stop", papi_errno); } printf("--------------------- PAPI_stop()\n"); for (int i = 0; i < NUM_EVENTS; ++i) { printf("%s: %.2lfM\n", events[i], (double)counters[i]/1e6); } papi_errno = PAPI_cleanup_eventset(eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset", papi_errno); } papi_errno = PAPI_destroy_eventset(&eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset", papi_errno); } PAPI_shutdown(); test_pass(__FILE__); return 0; } papi-papi-7-2-0-t/src/components/rocp_sdk/tests/simple_sampling.c000066400000000000000000000075451502707512200251220ustar00rootroot00000000000000#include <stdio.h> #include <unistd.h> #include <pthread.h> #include <papi.h> #include "papi_test.h" #define NUM_EVENTS (12) extern int launch_kernel(int device_id); int eventset = PAPI_NULL; volatile int gv=0; const char *events[NUM_EVENTS] = { "rocp_sdk:::SQ_CYCLES:device=0:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=3",
"rocp_sdk:::SQ_INSTS:device=0:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=0", "rocp_sdk:::SQ_INSTS:device=0:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=1", "rocp_sdk:::SQ_INSTS:device=0:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=2", "rocp_sdk:::SQ_INSTS:device=0:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=3", "rocp_sdk:::SQ_BUSY_CYCLES:device=0:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=0", "rocp_sdk:::SQ_BUSY_CYCLES:device=0:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=1", "rocp_sdk:::SQ_BUSY_CYCLES:device=0:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=2", "rocp_sdk:::SQ_BUSY_CYCLES:device=0:DIMENSION_INSTANCE=0", "rocp_sdk:::SQ_BUSY_CYCLES:device=1:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=0", "rocp_sdk:::SQ_BUSY_CYCLES:device=1:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=1", "rocp_sdk:::SQ_BUSY_CYCLES:device=1:DIMENSION_INSTANCE=0:DIMENSION_SHADER_ENGINE=2" }; void *thread_main(void *arg){ long long counters[NUM_EVENTS] = { 0 }; while(0==gv){;} usleep(150*1000); for(int i=0; i<30; i++){ printf("Sample: %2d\n", gv); fflush(stdout); PAPI_read(eventset, counters); for (int i = 0; i < NUM_EVENTS; ++i) { printf("%s: %.2lfM\n", events[i], (double)counters[i]/1e6); fflush(stdout); } printf("\n"); fflush(stdout); usleep(30*1000); ++gv; } return NULL; } int main(int argc, char *argv[]) { int papi_errno; papi_errno = PAPI_library_init(PAPI_VER_CURRENT); if (papi_errno != PAPI_VER_CURRENT) { test_fail(__FILE__, __LINE__, "PAPI_library_init", papi_errno); } papi_errno = PAPI_create_eventset(&eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset", papi_errno); } for (int i = 0; i < NUM_EVENTS; ++i) { papi_errno = PAPI_add_named_event(eventset, events[i]); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_add_named_event", papi_errno); } } long long counters[NUM_EVENTS] = { 0 }; papi_errno = PAPI_start(eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start", papi_errno); } pthread_t 
tid; pthread_create(&tid, NULL, thread_main, NULL); printf("--------------------- launch_kernel(0)\n"); gv = 1; papi_errno = launch_kernel(0); if (papi_errno != 0) { test_fail(__FILE__, __LINE__, "launch_kernel(0)", papi_errno); } usleep(20000); papi_errno = PAPI_read(eventset, counters); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_read", papi_errno); } printf("--------------------- PAPI_read()\n"); for (int i = 0; i < NUM_EVENTS; ++i) { printf("%s: %.2lfM\n", events[i], (double)counters[i]/1e6); } papi_errno = PAPI_stop(eventset, counters); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_stop", papi_errno); } printf("--------------------- PAPI_stop()\n"); for (int i = 0; i < NUM_EVENTS; ++i) { printf("%s: %.2lfM\n", events[i], (double)counters[i]/1e6); } papi_errno = PAPI_cleanup_eventset(eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset", papi_errno); } papi_errno = PAPI_destroy_eventset(&eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset", papi_errno); } PAPI_shutdown(); test_pass(__FILE__); return 0; } papi-papi-7-2-0-t/src/components/rocp_sdk/tests/two_eventsets.c000066400000000000000000000146061502707512200246460ustar00rootroot00000000000000#include <stdio.h> #include <unistd.h> #include <papi.h> #include "papi_test.h" extern int launch_kernel(int device_id); int main(int argc, char *argv[]) { int papi_errno; #define NUM_EVENTS (5) long long counters1[NUM_EVENTS] = { 0 }; long long counters2[NUM_EVENTS] = { 0 }; int eventset1 = PAPI_NULL; int eventset2 = PAPI_NULL; double exp1[NUM_EVENTS] = {1, 1300000000, 55000000000, 1, 1}; double exp2[NUM_EVENTS] = {45000000000, 1, 40000000, 1, 1300000000}; double exp3[NUM_EVENTS] = {28000000000, 40000000, 1, 1300000000, 1}; const char *events1[NUM_EVENTS] = { "rocp_sdk:::SQ_BUSY_CYCLES:device=0", "rocp_sdk:::SQ_BUSY_CYCLES:device=1", "rocp_sdk:::TCC_CYCLE:device=1", "rocp_sdk:::SQ_WAVES:device=0", "rocp_sdk:::SQ_WAVES:device=1" }; const char
*events2[NUM_EVENTS] = { "rocp_sdk:::TCC_CYCLE:device=1", "rocp_sdk:::SQ_INSTS:device=0", "rocp_sdk:::SQ_INSTS:device=1", "rocp_sdk:::SQ_BUSY_CYCLES:device=0", "rocp_sdk:::SQ_BUSY_CYCLES:device=1" }; papi_errno = PAPI_library_init(PAPI_VER_CURRENT); if (papi_errno != PAPI_VER_CURRENT) { test_fail(__FILE__, __LINE__, "PAPI_library_init", papi_errno); } papi_errno = PAPI_create_eventset(&eventset1); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset", papi_errno); } for (int i = 0; i < NUM_EVENTS; ++i) { papi_errno = PAPI_add_named_event(eventset1, events1[i]); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_add_named_event", papi_errno); } } papi_errno = PAPI_create_eventset(&eventset2); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset", papi_errno); } for (int i = 0; i < NUM_EVENTS; ++i) { papi_errno = PAPI_add_named_event(eventset2, events2[i]); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_add_named_event", papi_errno); } } printf("==================== FIRST EVENTSET - DEVICE 1 ====================\n"); papi_errno = PAPI_start(eventset1); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start", papi_errno); } for(int rep=0; rep<=3; ++rep){ papi_errno = launch_kernel(1); if (papi_errno != 0) { test_fail(__FILE__, __LINE__, "launch_kernel(1)", papi_errno); } usleep(1000); papi_errno = PAPI_read(eventset1, counters1); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_read", papi_errno); } printf("--------------------- PAPI_read()\n"); for (int i = 0; i < NUM_EVENTS; ++i) { printf("%s: %lld (%.2lf)\n", events1[i], counters1[i], 1.0*counters1[i]/((1.0+rep)*exp1[i])); } } papi_errno = PAPI_stop(eventset1, counters1); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_stop", papi_errno); } printf("--------------------- PAPI_stop()\n"); for (int i = 0; i < NUM_EVENTS; ++i) { printf("%s: %lld (%.2lf)\n", events1[i], counters1[i], 
1.0*counters1[i]/((1.0+3)*exp1[i])); } printf("==================== SECOND EVENTSET - DEVICE 1 ====================\n"); papi_errno = PAPI_start(eventset2); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start", papi_errno); } for(int rep=0; rep<=3; ++rep){ papi_errno = launch_kernel(1); if (papi_errno != 0) { test_fail(__FILE__, __LINE__, "launch_kernel(1)", papi_errno); } usleep(1000); papi_errno = PAPI_read(eventset2, counters2); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_read", papi_errno); } printf("--------------------- PAPI_read()\n"); for (int i = 0; i < NUM_EVENTS; ++i) { printf("%s: %lld (%.2lf)\n", events2[i], counters2[i], 1.0*counters2[i]/((1.0+rep)*exp2[i])); } } papi_errno = PAPI_stop(eventset2, counters2); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_stop", papi_errno); } printf("--------------------- PAPI_stop()\n"); for (int i = 0; i < NUM_EVENTS; ++i) { printf("%s: %lld (%.2lf)\n", events2[i], counters2[i], 1.0*counters2[i]/((1.0+3)*exp2[i])); } printf("==================== SECOND EVENTSET - DEVICE 0 ====================\n"); papi_errno = PAPI_start(eventset2); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start", papi_errno); } for(int rep=0; rep<=2; ++rep){ papi_errno = launch_kernel(0); if (papi_errno != 0) { test_fail(__FILE__, __LINE__, "launch_kernel(0)", papi_errno); } usleep(1000); papi_errno = PAPI_read(eventset2, counters2); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_read", papi_errno); } printf("--------------------- PAPI_read()\n"); for (int i = 0; i < NUM_EVENTS; ++i) { printf("%s: %lld (%.2lf)\n", events2[i], counters2[i], 1.0*counters2[i]/((1.0+rep)*exp3[i])); } } papi_errno = PAPI_stop(eventset2, counters2); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_stop", papi_errno); } printf("--------------------- PAPI_stop()\n"); for (int i = 0; i < NUM_EVENTS; ++i) { printf("%s: %lld (%.2lf)\n", events2[i], counters2[i], 
1.0*counters2[i]/((1.0+2)*exp3[i])); } /* * * Cleanup * * */ papi_errno = PAPI_cleanup_eventset(eventset1); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset", papi_errno); } papi_errno = PAPI_destroy_eventset(&eventset1); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset", papi_errno); } papi_errno = PAPI_cleanup_eventset(eventset2); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset", papi_errno); } papi_errno = PAPI_destroy_eventset(&eventset2); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset", papi_errno); } PAPI_shutdown(); test_pass(__FILE__); return 0; } papi-papi-7-2-0-t/src/components/sde/000077500000000000000000000000001502707512200173675ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/sde/README.md000066400000000000000000000014351502707512200206510ustar00rootroot00000000000000# SDE Component The SDE component enables PAPI to read Software Defined Events (SDEs) exported by third-party libraries. * [Enabling the SDE Component](#enabling-the-sde-component) * [FAQ](#faq) ## Enabling the SDE Component To enable reading of SDE counters, the user needs to link against a PAPI library that was configured with the SDE component enabled. As an example, the following command: `./configure --with-components="sde"` is sufficient to enable the component. Typically, the utility `papi_components_avail` (available in `papi/src/utils/papi_components_avail`) will display the components available to the user, whether they are disabled, and, if so, why.
## FAQ * [Software Defined Events](https://github.com/icl-utk-edu/papi/wiki/Software_Defined_Events.md) papi-papi-7-2-0-t/src/components/sde/Rules.sde000066400000000000000000000005461502707512200211630ustar00rootroot00000000000000COMPSRCS += components/sde/sde.c COMPOBJS += sde.o LDFLAGS += -ldl -pthread CC_SHR += -Icomponents/sde SDE_INC = -Icomponents/sde -g SDE_LD = -ldl -pthread F90FLAGS += -ffree-form -ffree-line-length-none -fPIC sde.o: components/sde/sde.c components/sde/sde_internal.h $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) $(LDFLAGS) $(SDE_INC) $(SDE_LD) -c $< -o $@ papi-papi-7-2-0-t/src/components/sde/sde.c000066400000000000000000001146021502707512200203120ustar00rootroot00000000000000/** * @file sde.c * @author Anthony Danalis * adanalis@icl.utk.edu * * @ingroup papi_components * * @brief * This is the component for supporting Software Defined Events (SDE). * It provides an interface for libraries, runtimes, and other software * layers to export events to other software layers through PAPI. 
*/ #include "sde_internal.h" #include <dlfcn.h> papi_vector_t _sde_vector; int _sde_component_lock; // The following two function pointers will be used by libsde in case PAPI is statically linked (libpapi.a) void (*papi_sde_check_overflow_status_ptr)(uint32_t cntr_id, long long int value) = &papi_sde_check_overflow_status; int (*papi_sde_set_timer_for_overflow_ptr)(void) = &papi_sde_set_timer_for_overflow; #define DLSYM_CHECK(name) \ do { \ if ( NULL != (err=dlerror()) ) { \ int strErr=snprintf(_sde_vector.cmp_info.disabled_reason, \ PAPI_MAX_STR_LEN, \ "Function '%s' not found in any dynamic lib", \ #name); \ if (strErr > PAPI_MAX_STR_LEN) \ SUBDBG("Unexpected snprintf error.\n"); \ name##_ptr = NULL; \ SUBDBG("sde_load_sde_ti(): Unable to load symbol %s: %s\n", #name, err);\ return ( PAPI_ECMP ); \ } \ } while (0) /* If the library is being built statically then there is no need (or ability) to access symbols through dlopen/dlsym; applications using the static version of PAPI (libpapi.a) must also be linked against libsde for supporting SDEs. However, if the dynamic library is used (libpapi.so) then we will look for the symbols from libsde.so dynamically. */ static int sde_load_sde_ti( void ){ char *err; // In case of static linking the function pointers will be automatically set // by the linker and the dlopen()/dlsym() would fail at runtime, so we want to // check if the linker has done its magic first.
if( (NULL != sde_ti_reset_counter_ptr) && (NULL != sde_ti_reset_counter_ptr) && (NULL != sde_ti_read_counter_ptr) && (NULL != sde_ti_write_counter_ptr) && (NULL != sde_ti_name_to_code_ptr) && (NULL != sde_ti_is_simple_counter_ptr) && (NULL != sde_ti_is_counter_set_to_overflow_ptr) && (NULL != sde_ti_set_counter_overflow_ptr) && (NULL != sde_ti_get_event_name_ptr) && (NULL != sde_ti_get_event_description_ptr) && (NULL != sde_ti_get_num_reg_events_ptr) && (NULL != sde_ti_shutdown_ptr) ){ return PAPI_OK; } (void)dlerror(); // Clear the internal string so we can diagnose errors later on. void *handle = dlopen(NULL, RTLD_NOW|RTLD_GLOBAL); if( NULL != (err = dlerror()) ){ SUBDBG("sde_load_sde_ti(): %s\n",err); return PAPI_ENOSUPP; } sde_ti_reset_counter_ptr = (int (*)( uint32_t ))dlsym( handle, "sde_ti_reset_counter" ); DLSYM_CHECK(sde_ti_reset_counter); sde_ti_read_counter_ptr = (int (*)( uint32_t, long long int * ))dlsym( handle, "sde_ti_read_counter" ); DLSYM_CHECK(sde_ti_read_counter); sde_ti_write_counter_ptr = (int (*)( uint32_t, long long ))dlsym( handle, "sde_ti_write_counter" ); DLSYM_CHECK(sde_ti_write_counter); sde_ti_name_to_code_ptr = (int (*)( const char *, uint32_t * ))dlsym( handle, "sde_ti_name_to_code" ); DLSYM_CHECK(sde_ti_name_to_code); sde_ti_is_simple_counter_ptr = (int (*)( uint32_t ))dlsym( handle, "sde_ti_is_simple_counter" ); DLSYM_CHECK(sde_ti_is_simple_counter); sde_ti_is_counter_set_to_overflow_ptr = (int (*)( uint32_t ))dlsym( handle, "sde_ti_is_counter_set_to_overflow" ); DLSYM_CHECK(sde_ti_is_counter_set_to_overflow); sde_ti_set_counter_overflow_ptr = (int (*)( uint32_t, int ))dlsym( handle, "sde_ti_set_counter_overflow" ); DLSYM_CHECK(sde_ti_set_counter_overflow); sde_ti_get_event_name_ptr = (char * (*)( int ))dlsym( handle, "sde_ti_get_event_name" ); DLSYM_CHECK(sde_ti_get_event_name); sde_ti_get_event_description_ptr = (char * (*)( int ))dlsym( handle, "sde_ti_get_event_description" ); DLSYM_CHECK(sde_ti_get_event_description); 
sde_ti_get_num_reg_events_ptr = (int (*)( void ))dlsym( handle, "sde_ti_get_num_reg_events" ); DLSYM_CHECK(sde_ti_get_num_reg_events); sde_ti_shutdown_ptr = (int (*)( void ))dlsym( handle, "sde_ti_shutdown" ); DLSYM_CHECK(sde_ti_shutdown); return PAPI_OK; } /********************************************************************/ /* Below are the functions required by the PAPI component interface */ /********************************************************************/ static int _sde_init_component( int cidx ) { int ret_val = PAPI_OK; SUBDBG("_sde_init_component...\n"); _sde_vector.cmp_info.num_native_events = 0; _sde_vector.cmp_info.CmpIdx = cidx; _sde_component_lock = PAPI_NUM_LOCK + NUM_INNER_LOCK + cidx; ret_val = sde_load_sde_ti(); if( PAPI_OK != ret_val ){ _sde_vector.cmp_info.disabled = ret_val; int expect = snprintf(_sde_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "libsde API not found. No SDEs exist in this executable."); if (expect > PAPI_MAX_STR_LEN) { SUBDBG("disabled_reason truncated"); } } return ret_val; } /** This is called whenever a thread is initialized */ static int _sde_init_thread( hwd_context_t *ctx ) { (void)ctx; SUBDBG( "_sde_init_thread %p...\n", ctx ); return PAPI_OK; } /** Setup a counter control state. * In general a control state holds the hardware info for an * EventSet. */ static int _sde_init_control_state( hwd_control_state_t * ctl ) { SUBDBG( "sde_init_control_state... 
%p\n", ctl ); sde_control_state_t *sde_ctl = ( sde_control_state_t * ) ctl; memset( sde_ctl, 0, sizeof ( sde_control_state_t ) ); return PAPI_OK; } /** Triggered by eventset operations like add or remove */ static int _sde_update_control_state( hwd_control_state_t *ctl, NativeInfo_t *native, int count, hwd_context_t *ctx ) { (void) ctx; int i, index; SUBDBG( "_sde_update_control_state %p %p...\n", ctl, ctx ); sde_control_state_t *sde_ctl = ( sde_control_state_t * ) ctl; for( i = 0; i < count; i++ ) { index = native[i].ni_event & PAPI_NATIVE_AND_MASK; if( index < 0 ){ PAPIERROR("_sde_update_control_state(): Event at index %d has a negative native event code = %d.\n",i,index); return PAPI_EINVAL; } SUBDBG("_sde_update_control_state: i=%d index=%u\n", i, index ); sde_ctl->which_counter[i] = (uint32_t)index; native[i].ni_position = i; } // If an event for which overflowing was set is being removed from the eventset, then the // framework will turn overflowing off (by calling PAPI_overflow() with threshold=0), // so we don't need to do anything here. sde_ctl->num_events=count; return PAPI_OK; } /** Triggered by PAPI_start() */ static int _sde_start( hwd_context_t *ctx, hwd_control_state_t *ctl ) { int ret_val = PAPI_OK; ThreadInfo_t *thread; int cidx; struct itimerspec its; ( void ) ctx; ( void ) ctl; SUBDBG( "%p %p...\n", ctx, ctl ); ret_val = _sde_reset(ctx, ctl); sde_control_state_t *sde_ctl = ( sde_control_state_t * ) ctl; its.it_value.tv_sec = 0; // We will start the timer at 100us because we adjust its period in _sde_dispatch_timer() // if the counter is not growing fast enough, or growing too slowly. 
its.it_value.tv_nsec = 100*1000; // 100us its.it_interval.tv_sec = its.it_value.tv_sec; its.it_interval.tv_nsec = its.it_value.tv_nsec; cidx = _sde_vector.cmp_info.CmpIdx; thread = _papi_hwi_lookup_thread( 0 ); if ( (NULL != thread) && (NULL != thread->running_eventset[cidx]) && (thread->running_eventset[cidx]->overflow.flags & PAPI_OVERFLOW_HARDWARE) ) { if( !(sde_ctl->has_timer) ){ // No registered counters went through r[1-3] int i; _papi_hwi_lock(_sde_component_lock); for( i = 0; i < sde_ctl->num_events; i++ ) { if( sde_ti_is_counter_set_to_overflow_ptr(sde_ctl->which_counter[i]) ){ // Registered counters went through r4 if( PAPI_OK == do_set_timer_for_overflow(sde_ctl) ) break; } } _papi_hwi_unlock(_sde_component_lock); } // r[1-4] if( sde_ctl->has_timer ){ SUBDBG( "starting SDE internal timer for emulating HARDWARE overflowing\n"); if (timer_settime(sde_ctl->timerid, 0, &its, NULL) == -1){ PAPIERROR("timer_settime"); timer_delete(sde_ctl->timerid); sde_ctl->has_timer = 0; return PAPI_ECMP; } } } return ret_val; } /** Triggered by PAPI_stop() */ static int _sde_stop( hwd_context_t *ctx, hwd_control_state_t *ctl ) { (void) ctx; (void) ctl; ThreadInfo_t *thread; int cidx; struct itimerspec zero_time; SUBDBG( "sde_stop %p %p...\n", ctx, ctl ); /* anything that would need to be done at counter stop time */ sde_control_state_t *sde_ctl = ( sde_control_state_t * ) ctl; cidx = _sde_vector.cmp_info.CmpIdx; thread = _papi_hwi_lookup_thread( 0 ); if ( (NULL != thread) && (NULL != thread->running_eventset[cidx]) && (thread->running_eventset[cidx]->overflow.flags & PAPI_OVERFLOW_HARDWARE) ) { if( sde_ctl->has_timer ){ SUBDBG( "stopping SDE internal timer\n"); memset(&zero_time, 0, sizeof(struct itimerspec)); if (timer_settime(sde_ctl->timerid, 0, &zero_time, NULL) == -1){ PAPIERROR("timer_settime"); timer_delete(sde_ctl->timerid); sde_ctl->has_timer = 0; return PAPI_ECMP; } } } return PAPI_OK; } /** Triggered by PAPI_read() */ static int _sde_read( hwd_context_t *ctx, 
hwd_control_state_t *ctl, long long **events, int flags ) { int i; int ret_val = PAPI_OK; (void) flags; (void) ctx; SUBDBG( "_sde_read... %p %d\n", ctx, flags ); sde_control_state_t *sde_ctl = ( sde_control_state_t * ) ctl; _papi_hwi_lock(_sde_component_lock); for( i = 0; i < sde_ctl->num_events; i++ ) { uint32_t counter_uniq_id = sde_ctl->which_counter[i]; ret_val = sde_ti_read_counter_ptr( counter_uniq_id, &(sde_ctl->counter[i]) ); if( PAPI_OK != ret_val ){ PAPIERROR("_sde_read(): Error when reading event at index %d.\n",i); goto fnct_exit; } } *events = sde_ctl->counter; fnct_exit: _papi_hwi_unlock(_sde_component_lock); return ret_val; } /** Triggered by PAPI_write(), but only if the counters are running */ /* otherwise, the updated state is written to ESI->hw_start */ static int _sde_write( hwd_context_t *ctx, hwd_control_state_t *ctl, long long *values ) { int i, ret_val = PAPI_OK; (void) ctx; (void) ctl; SUBDBG( "_sde_write... %p\n", ctx ); sde_control_state_t *sde_ctl = ( sde_control_state_t * ) ctl; // Lock before we access global data structures. _papi_hwi_lock(_sde_component_lock); for( i = 0; i < sde_ctl->num_events; i++ ) { uint32_t counter_uniq_id = sde_ctl->which_counter[i]; ret_val = sde_ti_write_counter_ptr( counter_uniq_id, values[i] ); if( PAPI_OK != ret_val ){ PAPIERROR("_sde_write(): Error when writing event at index %d.\n",i); goto fnct_exit; } } fnct_exit: _papi_hwi_unlock(_sde_component_lock); return ret_val; } /** Triggered by PAPI_reset() but only if the EventSet is currently running */ /* If the eventset is not currently running, then the saved value in the */ /* EventSet is set to zero without calling this routine. 
*/ static int _sde_reset( hwd_context_t *ctx, hwd_control_state_t *ctl ) { int i, ret_val=PAPI_OK; (void) ctx; SUBDBG( "_sde_reset ctx=%p ctrl=%p...\n", ctx, ctl ); sde_control_state_t *sde_ctl = ( sde_control_state_t * ) ctl; _papi_hwi_lock(_sde_component_lock); for( i = 0; i < sde_ctl->num_events; i++ ) { uint32_t counter_uniq_id = sde_ctl->which_counter[i]; ret_val = sde_ti_reset_counter_ptr( counter_uniq_id ); if( PAPI_OK != ret_val ){ PAPIERROR("_sde_reset(): Error when resetting event at index %d.\n",i); goto fnct_exit; } } fnct_exit: _papi_hwi_unlock(_sde_component_lock); return ret_val; } /** Triggered by PAPI_shutdown() */ static int _sde_shutdown_component(void) { SUBDBG( "sde_shutdown_component...\n" ); return sde_ti_shutdown_ptr(); } /** Called at thread shutdown */ static int _sde_shutdown_thread( hwd_context_t *ctx ) { (void) ctx; SUBDBG( "sde_shutdown_thread... %p\n", ctx ); /* Last chance to clean up thread */ return PAPI_OK; } /** This function sets various options in the component @param[in] ctx -- hardware context @param[in] code valid are PAPI_SET_DEFDOM, PAPI_SET_DOMAIN, PAPI_SETDEFGRN, PAPI_SET_GRANUL and PAPI_SET_INHERIT @param[in] option -- options to be set */ static int _sde_ctl( hwd_context_t *ctx, int code, _papi_int_option_t *option ) { (void) ctx; (void) code; (void) option; SUBDBG( "sde_ctl...\n" ); return PAPI_OK; } /** This function has to set the bits needed to count different domains In particular: PAPI_DOM_USER, PAPI_DOM_KERNEL PAPI_DOM_OTHER By default return PAPI_EINVAL if none of those are specified and PAPI_OK with success PAPI_DOM_USER is only user context is counted PAPI_DOM_KERNEL is only the Kernel/OS context is counted PAPI_DOM_OTHER is Exception/transient mode (like user TLB misses) PAPI_DOM_ALL is all of the domains */ static int _sde_set_domain( hwd_control_state_t * cntrl, int domain ) { (void) cntrl; int found = 0; SUBDBG( "sde_set_domain...\n" ); if ( PAPI_DOM_USER & domain ) { SUBDBG( " PAPI_DOM_USER\n" ); found =
1; } if ( PAPI_DOM_KERNEL & domain ) { SUBDBG( " PAPI_DOM_KERNEL\n" ); found = 1; } if ( PAPI_DOM_OTHER & domain ) { SUBDBG( " PAPI_DOM_OTHER\n" ); found = 1; } if ( PAPI_DOM_ALL & domain ) { SUBDBG( " PAPI_DOM_ALL\n" ); found = 1; } if ( !found ) return ( PAPI_EINVAL ); return PAPI_OK; } /**************************************************************/ /* Naming functions, used to translate event numbers to names */ /**************************************************************/ /** Enumerate Native Events * @param EventCode is the event of interest * @param modifier is one of PAPI_ENUM_FIRST, PAPI_ENUM_EVENTS * If your component has attribute masks then these need to * be handled here as well. */ static int _sde_ntv_enum_events( unsigned int *EventCode, int modifier ) { unsigned int curr_code, next_code, num_reg_events; int ret_val = PAPI_OK; SUBDBG("_sde_ntv_enum_events begin\n\tEventCode=%u modifier=%d\n", *EventCode, modifier); switch ( modifier ) { /* return EventCode of first event */ case PAPI_ENUM_FIRST: /* return the first event that we support */ if( sde_ti_get_num_reg_events_ptr() <= 0 ){ ret_val = PAPI_ENOEVNT; break; } *EventCode = 0; ret_val = PAPI_OK; break; /* return EventCode of next available event */ case PAPI_ENUM_EVENTS: curr_code = *EventCode & PAPI_NATIVE_AND_MASK; // Lock before we read num_reg_events and the hash-tables. _papi_hwi_lock(_sde_component_lock); num_reg_events = (unsigned int)sde_ti_get_num_reg_events_ptr(); if( curr_code >= num_reg_events-1 ){ ret_val = PAPI_ENOEVNT; goto unlock; } /* * We have to check the events which follow the current one, because unregistering * will create sparsity in the global SDE table, so we can't just return the next * index.
*/ next_code = curr_code; do{ next_code++; char *ev_name = sde_ti_get_event_name_ptr((uint32_t)next_code); if( NULL != ev_name ){ *EventCode = next_code; SUBDBG("Event name = %s (code = %d)\n", ev_name, next_code); ret_val = PAPI_OK; goto unlock; } }while(next_code < num_reg_events); // If we make it here it means that we didn't find the event. ret_val = PAPI_EINVAL; unlock: _papi_hwi_unlock(_sde_component_lock); break; default: ret_val = PAPI_EINVAL; break; } return ret_val; } /** Takes a native event code and passes back the name * @param EventCode is the native event code * @param name is a pointer for the name to be copied to * @param len is the size of the name string */ static int _sde_ntv_code_to_name( unsigned int EventCode, char *name, int len ) { int ret_val = PAPI_OK; unsigned int code = EventCode & PAPI_NATIVE_AND_MASK; SUBDBG("_sde_ntv_code_to_name %u\n", code); _papi_hwi_lock(_sde_component_lock); char *ev_name = sde_ti_get_event_name_ptr((uint32_t)code); if( NULL == ev_name ){ ret_val = PAPI_ENOEVNT; goto fnct_exit; } SUBDBG("Event name = %s (code = %d)\n", ev_name, code); (void)strncpy( name, ev_name, len ); name[len-1] = '\0'; fnct_exit: _papi_hwi_unlock(_sde_component_lock); return ret_val; } /** Takes a native event code and passes back the event description * @param EventCode is the native event code * @param descr is a pointer for the description to be copied to * @param len is the size of the descr string */ static int _sde_ntv_code_to_descr( unsigned int EventCode, char *descr, int len ) { int ret_val = PAPI_OK; unsigned int code = EventCode & PAPI_NATIVE_AND_MASK; SUBDBG("_sde_ntv_code_to_descr %u\n", code); _papi_hwi_lock(_sde_component_lock); char *ev_descr = sde_ti_get_event_description_ptr((uint32_t)code); if( NULL == ev_descr ){ ret_val = PAPI_ENOEVNT; goto fnct_exit; } SUBDBG("Event (code = %d) description: %s\n", code, ev_descr); (void)strncpy( descr, ev_descr, len ); descr[len-1] = '\0'; fnct_exit: 
_papi_hwi_unlock(_sde_component_lock); return ret_val; } /** Takes a native event code and passes back the event info * @param EventCode is the native event code * @param info is a pointer to a PAPI_event_info_t structure where the event information will be stored. */ static int _sde_ntv_code_to_info( unsigned int EventCode, PAPI_event_info_t *info) { int ret_val = PAPI_OK; unsigned int code = EventCode & PAPI_NATIVE_AND_MASK; SUBDBG("_sde_ntv_code_to_info %u\n", code); info->event_code = EventCode; info->component_index = _sde_vector.cmp_info.CmpIdx; _papi_hwi_lock(_sde_component_lock); char *ev_name = sde_ti_get_event_name_ptr((uint32_t)code); char *ev_descr = sde_ti_get_event_description_ptr((uint32_t)code); if( (NULL == ev_descr) || (NULL == ev_name) ){ ret_val = PAPI_ENOEVNT; goto fnct_exit; } SUBDBG("Event (code = %d) name: %s \n", code, ev_name); SUBDBG("Event (code = %d) description: %s\n", code, ev_descr); (void)strncpy( info->symbol, ev_name, PAPI_HUGE_STR_LEN-1 ); info->symbol[PAPI_HUGE_STR_LEN-1] = '\0'; (void)strncpy( info->long_descr, ev_descr, PAPI_HUGE_STR_LEN-1 ); info->long_descr[PAPI_HUGE_STR_LEN-1] = '\0'; (void)strncpy( info->short_descr, ev_descr, PAPI_MIN_STR_LEN-1 ); info->short_descr[PAPI_MIN_STR_LEN-1] = '\0'; fnct_exit: _papi_hwi_unlock(_sde_component_lock); return ret_val; } /** Takes a native event name and passes back the code * @param event_name -- a pointer for the name to be copied to * @param event_code -- the native event code */ static int _sde_ntv_name_to_code(const char *event_name, unsigned int *event_code ) { int ret_val; SUBDBG( "_sde_ntv_name_to_code(%s)\n", event_name ); ret_val = sde_ti_name_to_code_ptr(event_name, (uint32_t *)event_code); return ret_val; } static int _sde_set_overflow( EventSetInfo_t *ESI, int EventIndex, int threshold ){ (void)ESI; (void)EventIndex; (void)threshold; SUBDBG("_sde_set_overflow(%d, %d).\n",EventIndex, threshold); sde_control_state_t *sde_ctl = ( sde_control_state_t * ) ESI->ctl_state; // 
pos[0] holds the first among the native events that compose the given event. If it is a derived event, // then it might be made up of multiple native events, but this is a CPU component concept. The SDE component // does not have derived events (the groups are first class citizens, they don't have multiple pos[] entries). int pos = ESI->EventInfoArray[EventIndex].pos[0]; uint32_t counter_uniq_id = sde_ctl->which_counter[pos]; // If we still don't know what type the counter is, then we are _not_ in r[1-3] so we can't create a timer here, // but we still have to tell the calling tool/app that there was no error, because the timer will be set in the future. int ret_val = sde_ti_set_counter_overflow_ptr(counter_uniq_id, threshold); if( PAPI_OK >= ret_val ){ return ret_val; } // A threshold of zero indicates that overflowing is not needed anymore. if( 0 == threshold ){ // If we had a timer (if the counter was created we wouldn't have one) then delete it. if( sde_ctl->has_timer ) timer_delete(sde_ctl->timerid); sde_ctl->has_timer = 0; }else{ // If we are here we are in r[1-3] so we can create the timer return do_set_timer_for_overflow(sde_ctl); } return PAPI_OK; } /** * This code assumes that it is called _ONLY_ for registered counters, * and that is why it sets has_timer to REGISTERED_EVENT_MASK */ static int do_set_timer_for_overflow( sde_control_state_t *sde_ctl ){ int signo, sig_offset; struct sigevent sigev; struct sigaction sa; sig_offset = 0; // Choose a new real-time signal signo = SIGRTMIN+sig_offset; if(signo > SIGRTMAX){ PAPIERROR("do_set_timer_for_overflow(): Unable to create new timer due to large number of existing timers. 
Overflowing will not be activated for the current event.\n"); return PAPI_ECMP; } // setup the signal handler sa.sa_flags = SA_SIGINFO; sa.sa_sigaction = _sde_dispatch_timer; sigemptyset(&sa.sa_mask); if (sigaction(signo, &sa, NULL) == -1){ PAPIERROR("do_set_timer_for_overflow(): sigaction() failed."); return PAPI_ECMP; } // create the timer sigev.sigev_notify = SIGEV_SIGNAL; sigev.sigev_signo = signo; sigev.sigev_value.sival_ptr = &(sde_ctl->timerid); if (timer_create(CLOCK_REALTIME, &sigev, &(sde_ctl->timerid)) == -1){ PAPIERROR("do_set_timer_for_overflow(): timer_create() failed."); return PAPI_ECMP; } sde_ctl->has_timer |= REGISTERED_EVENT_MASK; return PAPI_OK; } static inline int sde_arm_timer(sde_control_state_t *sde_ctl){ struct itimerspec its; // We will start the timer at 100us because we adjust its period in _sde_dispatch_timer() // if the counter is not growing fast enough, or growing too slowly. its.it_value.tv_sec = 0; its.it_value.tv_nsec = 100*1000; // 100us its.it_interval.tv_sec = its.it_value.tv_sec; its.it_interval.tv_nsec = its.it_value.tv_nsec; SUBDBG( "starting SDE internal timer for emulating HARDWARE overflowing\n"); if (timer_settime(sde_ctl->timerid, 0, &its, NULL) == -1){ PAPIERROR("timer_settime"); timer_delete(sde_ctl->timerid); sde_ctl->has_timer = 0; // If the timer is broken, let the caller know that something internal went wrong. return PAPI_ECMP; } return PAPI_OK; } void _sde_dispatch_timer( int n, hwd_siginfo_t *info, void *uc) { _papi_hwi_context_t hw_context; vptr_t address; ThreadInfo_t *thread; int i, cidx, retval, isHardware, slow_down, speed_up; int found_registered_counters, period_has_changed = 0; EventSetInfo_t *ESI; struct itimerspec its; long long overflow_vector = 0; sde_control_state_t *sde_ctl; (void) n; SUBDBG("SDE timer expired. 
Dispatching (papi internal) overflow handler\n"); thread = _papi_hwi_lookup_thread( 0 ); cidx = _sde_vector.cmp_info.CmpIdx; ESI = thread->running_eventset[cidx]; // This holds only the number of events in the eventset that are set to overflow. int event_counter = ESI->overflow.event_counter; sde_ctl = ( sde_control_state_t * ) ESI->ctl_state; retval = _papi_hwi_read( thread->context[cidx], ESI, ESI->sw_stop ); if ( retval < PAPI_OK ) return; slow_down = 0; speed_up = 0; found_registered_counters = 0; // Reset the deadline of counters which have exceeded the current deadline // and check if we need to slow down the frequency of the timer. for ( i = 0; i < event_counter; i++ ) { int papi_index = ESI->overflow.EventIndex[i]; long long deadline, threshold, latest, previous, diff; uint32_t counter_uniq_id = sde_ctl->which_counter[papi_index]; if( !sde_ti_is_simple_counter_ptr( counter_uniq_id ) ) continue; found_registered_counters = 1; latest = ESI->sw_stop[papi_index]; deadline = ESI->overflow.deadline[i]; threshold = ESI->overflow.threshold[i]; // Find the increment from the previous measurement. previous = sde_ctl->previous_value[papi_index]; // NOTE: The following code assumes that the counters are "long long". No other // NOTE: type will work correctly. diff = latest-previous; // If it's too small we need to slow down the timer, if it's // too large we need to speed up the timer. if( 30*diff < threshold ){ slow_down = 1; // I.e., grow the sampling period }else if( 10*diff > threshold ){ speed_up = 1; // I.e., shrink the sampling period } // Update the "previous" measurement to be the latest one. sde_ctl->previous_value[papi_index] = latest; // If this counter has exceeded the deadline, add it in the vector. if ( latest >= deadline ) { // pos[0] holds the first among the native events that compose the given event. If it is a derived event, // then it might be made up of multiple native events, but this is a CPU component concept. 
The SDE component // does not have derived events (the groups are first class citizens, they don't have multiple pos[] entries). int pos = ESI->EventInfoArray[papi_index].pos[0]; SUBDBG ( "Event at index %d (and pos %d) has value %lld which exceeds deadline %lld (threshold %lld, accuracy %.2lf)\n", papi_index, pos, latest, deadline, threshold, 100.0*(double)(latest-deadline)/(double)threshold); overflow_vector ^= ( long long ) 1 << pos; // We adjust the deadline in a way that it remains a multiple of threshold so we don't create an additive error. ESI->overflow.deadline[i] = threshold*(latest/threshold) + threshold; } } if( !found_registered_counters && sde_ctl->has_timer ){ struct itimerspec zero_time; memset(&zero_time, 0, sizeof(struct itimerspec)); if (timer_settime(sde_ctl->timerid, 0, &zero_time, NULL) == -1){ PAPIERROR("timer_settime"); timer_delete(sde_ctl->timerid); sde_ctl->has_timer = 0; return; } goto no_change_in_period; } // Since we potentially check multiple counters in the loop above, both conditions could be true (for different counters). // In this case, we give speed_up priority. if( speed_up ) slow_down = 0; // If neither was set, there is nothing to do here. if( !speed_up && !slow_down ) goto no_change_in_period; if( !sde_ctl->has_timer ) goto no_change_in_period; // Get the current value of the timer. if( timer_gettime(sde_ctl->timerid, &its) == -1){ PAPIERROR("timer_gettime() failed. Timer will not be modified.\n"); goto no_change_in_period; } period_has_changed = 0; // We only reduce the period if it is above 131.6us, so it never drops below 100us. 
if( speed_up && (its.it_interval.tv_nsec > 131607) ){ double new_val = (double)its.it_interval.tv_nsec; new_val /= 1.31607; // sqrt(sqrt(3)) = 1.316074 its.it_value.tv_nsec = (int)new_val; its.it_interval.tv_nsec = its.it_value.tv_nsec; period_has_changed = 1; SUBDBG ("Timer will be sped up to %ld ns\n", its.it_value.tv_nsec); } // We only increase the period if it is below 75.9ms, so it never grows above 100ms. if( slow_down && (its.it_interval.tv_nsec < 75983800) ){ double new_val = (double)its.it_interval.tv_nsec; new_val *= 1.31607; // sqrt(sqrt(3)) = 1.316074 its.it_value.tv_nsec = (int)new_val; its.it_interval.tv_nsec = its.it_value.tv_nsec; period_has_changed = 1; SUBDBG ("Timer will be slowed down to %ld ns\n", its.it_value.tv_nsec); } if( !period_has_changed ) goto no_change_in_period; if (timer_settime(sde_ctl->timerid, 0, &its, NULL) == -1){ PAPIERROR("timer_settime() failed when modifying PAPI internal timer. This might have broken overflow support for this eventset.\n"); goto no_change_in_period; } no_change_in_period: // If none of the events exceeded their deadline, there is nothing else to do. 
if( 0 == overflow_vector ){ return; } if ( (NULL== thread) || (NULL == thread->running_eventset[cidx]) || (0 == thread->running_eventset[cidx]->overflow.flags) ){ PAPIERROR( "_sde_dispatch_timer(): 'Can not access overflow flags'"); return; } hw_context.si = info; hw_context.ucontext = ( hwd_ucontext_t * ) uc; address = GET_OVERFLOW_ADDRESS( hw_context ); int genOverflowBit = 0; _papi_hwi_dispatch_overflow_signal( ( void * ) &hw_context, address, &isHardware, overflow_vector, genOverflowBit, &thread, cidx ); return; } static void invoke_user_handler(uint32_t cntr_uniq_id){ EventSetInfo_t *ESI; int i, cidx; ThreadInfo_t *thread; sde_control_state_t *sde_ctl; _papi_hwi_context_t hw_context; ucontext_t uc; vptr_t address; long long overflow_vector; thread = _papi_hwi_lookup_thread( 0 ); cidx = _sde_vector.cmp_info.CmpIdx; ESI = thread->running_eventset[cidx]; // checking again, just to be sure. if( !(ESI->overflow.flags & PAPI_OVERFLOW_HARDWARE) ) { return; } sde_ctl = ( sde_control_state_t * ) ESI->ctl_state; // This path comes from papi_sde_inc_counter() which increment _ONLY_ one counter, so we don't // need to check if any others have overflown. overflow_vector = 0; for( i = 0; i < sde_ctl->num_events; i++ ) { uint32_t uniq_id = sde_ctl->which_counter[i]; if( uniq_id == cntr_uniq_id ){ // pos[0] holds the first among the native events that compose the given event. If it is a derived event, // then it might be made up of multiple native events, but this is a CPU component concept. The SDE component // does not have derived events (the groups are first class citizens, they don't have multiple pos[] entries). 
int pos = ESI->EventInfoArray[i].pos[0]; if( pos == -1 ){ PAPIERROR( "The PAPI framework considers this event removed from the eventset, but the component does not\n"); return; } overflow_vector = ( long long ) 1 << pos; } } getcontext( &uc ); hw_context.ucontext = &uc; hw_context.si = NULL; address = GET_OVERFLOW_ADDRESS( hw_context ); ESI->overflow.handler( ESI->EventSetIndex, ( void * ) address, overflow_vector, hw_context.ucontext ); return; } void __attribute__((visibility("default"))) papi_sde_check_overflow_status(uint32_t cntr_uniq_id, long long int latest){ EventSetInfo_t *ESI; int cidx, i, index_in_ESI; ThreadInfo_t *thread; sde_control_state_t *sde_ctl; cidx = _sde_vector.cmp_info.CmpIdx; thread = _papi_hwi_lookup_thread( 0 ); if( NULL == thread ) return; ESI = thread->running_eventset[cidx]; // Check if there is a running event set and it has some events set to overflow if( (NULL == ESI) || !(ESI->overflow.flags & PAPI_OVERFLOW_HARDWARE) ) return; sde_ctl = ( sde_control_state_t * ) ESI->ctl_state; int event_counter = ESI->overflow.event_counter; // Check all the events that are set to overflow index_in_ESI = -1; for (i = 0; i < event_counter; i++ ) { int papi_index = ESI->overflow.EventIndex[i]; uint32_t uniq_id = sde_ctl->which_counter[papi_index]; // If the created counter that we are incrementing corresponds to // an event that was set to overflow, read the deadline and threshold. if( uniq_id == cntr_uniq_id ){ index_in_ESI = i; break; } } if( index_in_ESI >= 0 ){ long long deadline, threshold; deadline = ESI->overflow.deadline[index_in_ESI]; threshold = ESI->overflow.threshold[index_in_ESI]; // If the current value has exceeded the deadline then // invoke the user handler and update the deadline. if( latest > deadline ){ // We adjust the deadline in a way that it remains a multiple of threshold // so we don't create an additive error. 
ESI->overflow.deadline[index_in_ESI] = threshold*(latest/threshold) + threshold; invoke_user_handler(cntr_uniq_id); } } return; } // The following function should only be called from within // sde_do_register() in libsde.so, which guarantees we are in cases r[4-6]. int __attribute__((visibility("default"))) papi_sde_set_timer_for_overflow(void){ ThreadInfo_t *thread; EventSetInfo_t *ESI; sde_control_state_t *sde_ctl; thread = _papi_hwi_lookup_thread( 0 ); if( NULL == thread ) return PAPI_OK; // Get the current running eventset and check if it has some events set to overflow. int cidx = _sde_vector.cmp_info.CmpIdx; ESI = thread->running_eventset[cidx]; if( (NULL == ESI) || !(ESI->overflow.flags & PAPI_OVERFLOW_HARDWARE) ) return PAPI_OK; sde_ctl = ( sde_control_state_t * ) ESI->ctl_state; // Below this point we know we have a running eventset, so we are in case r5. // Since the event is set to overflow, if there is no timer in the eventset, create one and arm it. if( !(sde_ctl->has_timer) ){ int ret = do_set_timer_for_overflow(sde_ctl); if( PAPI_OK != ret ){ return ret; } ret = sde_arm_timer(sde_ctl); return ret; } return PAPI_OK; } /** Vector that points to entry points for our component */ papi_vector_t _sde_vector = { .cmp_info = { /* default component information */ /* (unspecified values are initialized to 0) */ /* we explicitly set them to zero in this sde */ /* to show what settings are available */ .name = "sde", .short_name = "sde", .description = "Software Defined Events (SDE) component", .version = "1.15", .support_version = "n/a", .kernel_version = "n/a", .num_cntrs = SDE_MAX_SIMULTANEOUS_COUNTERS, .num_mpx_cntrs = SDE_MAX_SIMULTANEOUS_COUNTERS, .default_domain = PAPI_DOM_USER, .available_domains = PAPI_DOM_USER, .default_granularity = PAPI_GRN_THR, .available_granularities = PAPI_GRN_THR, .hardware_intr_sig = PAPI_INT_SIGNAL, .hardware_intr = 1, /* component specific cmp_info initializations */ }, /* sizes of framework-opaque component-private structures 
*/ .size = { /* once per thread */ .context = sizeof ( sde_context_t ), /* once per eventset */ .control_state = sizeof ( sde_control_state_t ), /* ?? */ .reg_value = sizeof ( sde_register_t ), /* ?? */ .reg_alloc = sizeof ( sde_reg_alloc_t ), }, /* function pointers */ /* by default they are set to NULL */ /* Used for general PAPI interactions */ .start = _sde_start, .stop = _sde_stop, .read = _sde_read, .reset = _sde_reset, .write = _sde_write, .init_component = _sde_init_component, .init_thread = _sde_init_thread, .init_control_state = _sde_init_control_state, .update_control_state = _sde_update_control_state, .ctl = _sde_ctl, .shutdown_thread = _sde_shutdown_thread, .shutdown_component = _sde_shutdown_component, .set_domain = _sde_set_domain, /* .cleanup_eventset = NULL, */ /* called in add_native_events() */ /* .allocate_registers = NULL, */ /* Used for overflow/profiling */ .dispatch_timer = _sde_dispatch_timer, .set_overflow = _sde_set_overflow, /* .get_overflow_address = NULL, */ /* .stop_profiling = NULL, */ /* .set_profile = NULL, */ /* ??? */ /* .user = NULL, */ /* Name Mapping Functions */ .ntv_enum_events = _sde_ntv_enum_events, .ntv_code_to_name = _sde_ntv_code_to_name, .ntv_code_to_descr = _sde_ntv_code_to_descr, .ntv_code_to_info = _sde_ntv_code_to_info, /* if .ntv_name_to_code not available, PAPI emulates */ /* it by enumerating all events and looking manually */ .ntv_name_to_code = _sde_ntv_name_to_code, /* These are only used by _papi_hwi_get_native_event_info() */ /* Which currently only uses the info for printing native */ /* event info, not for any sort of internal use. 
*/ /* .ntv_code_to_bits = NULL, */ };
papi-papi-7-2-0-t/src/components/sde/sde_F.F90:
module papi_sde_fortran_wrappers use, intrinsic :: ISO_C_BINDING implicit none #include "f90papi.h" integer, parameter :: i_kind = 0 integer, parameter :: PAPI_SDE_RO = int( Z'00', kind=kind(i_kind)) integer, parameter :: PAPI_SDE_RW = int( Z'01', kind=kind(i_kind)) integer, parameter :: PAPI_SDE_DELTA = int( Z'00', kind=kind(i_kind)) integer, parameter :: PAPI_SDE_INSTANT = int( Z'10', kind=kind(i_kind)) integer, parameter :: PAPI_SDE_long_long = int( Z'00', kind=kind(i_kind)) integer, parameter :: PAPI_SDE_int = int( Z'01', kind=kind(i_kind)) integer, parameter :: PAPI_SDE_double = int( Z'02', kind=kind(i_kind)) integer, parameter :: PAPI_SDE_float = int( Z'03', kind=kind(i_kind)) ! ------------------------------------------------------------------- ! ------------ Interfaces for F08 bridge-to-C functions ------------- ! ------------------------------------------------------------------- type, bind(C) :: fptr_struct_t type(C_funptr) init type(C_funptr) register_counter type(C_funptr) register_counter_cb type(C_funptr) unregister_counter type(C_funptr) describe_counter type(C_funptr) add_counter_to_group type(C_funptr) create_counter type(C_funptr) inc_counter type(C_funptr) create_recorder type(C_funptr) record type(c_funptr) reset_recorder type(c_funptr) reset_counter type(c_funptr) get_counter_handle end type fptr_struct_t interface papif_sde_init_F08 type(C_ptr) function papif_sde_init_F08(lib_name_C_str) result(handle) bind(C, name="papi_sde_init") use, intrinsic :: ISO_C_BINDING, only : C_ptr type(C_ptr), value, intent(in) :: lib_name_C_str end function papif_sde_init_F08 end interface papif_sde_init_F08 interface papif_sde_register_counter_F08 integer(kind=C_int) function papif_sde_register_counter_F08(handle, event_name_C_str, cntr_mode, cntr_type, counter) result(error) bind(C, 
name="papi_sde_register_counter") use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_int type(C_ptr), value, intent(in) :: handle type(C_ptr), value, intent(in) :: event_name_C_str integer(kind=C_int), value, intent(in) :: cntr_type integer(kind=C_int), value, intent(in) :: cntr_mode type(C_ptr), value, intent(in) :: counter end function papif_sde_register_counter_F08 end interface papif_sde_register_counter_F08 interface papif_sde_unregister_counter_F08 integer(kind=C_int) function papif_sde_unregister_counter_F08(handle, event_name_C_str) result(error) bind(C, name="papi_sde_unregister_counter") use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_int type(C_ptr), value, intent(in) :: handle type(C_ptr), value, intent(in) :: event_name_C_str end function papif_sde_unregister_counter_F08 end interface papif_sde_unregister_counter_F08 interface papif_sde_register_counter_cb_F08 integer(kind=C_int) function papif_sde_register_counter_cb_F08(handle, event_name_C_str, cntr_mode, cntr_type, func_ptr, param) result(error) bind(C, name="papi_sde_register_counter_cb") use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_funptr, C_int type(C_ptr), value, intent(in) :: handle type(C_ptr), value, intent(in) :: event_name_C_str integer(kind=C_int), value, intent(in) :: cntr_type integer(kind=C_int), value, intent(in) :: cntr_mode type(C_funptr), value, intent(in) :: func_ptr type(C_ptr), value, intent(in) :: param end function papif_sde_register_counter_cb_F08 end interface papif_sde_register_counter_cb_F08 interface papif_sde_describe_counter_F08 integer(kind=C_int) function papif_sde_describe_counter_F08(handle, event_name_C_str, event_desc_C_str) result(error) bind(C, name="papi_sde_describe_counter") use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_int type(C_ptr), value, intent(in) :: handle type(C_ptr), value, intent(in) :: event_name_C_str type(C_ptr), value, intent(in) :: event_desc_C_str end function papif_sde_describe_counter_F08 end interface papif_sde_describe_counter_F08 
interface papif_sde_add_counter_to_group_F08 integer(kind=C_int) function papif_sde_add_counter_to_group_F08(handle, event_name_C_str, group_name_C_str, flags) result(error) bind(C, name="papi_sde_add_counter_to_group") use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_int, C_int32_t type(C_ptr), value, intent(in) :: handle type(C_ptr), value, intent(in) :: event_name_C_str type(C_ptr), value, intent(in) :: group_name_C_str integer(kind=C_INT32_T), value, intent(in) :: flags end function papif_sde_add_counter_to_group_F08 end interface papif_sde_add_counter_to_group_F08 interface papif_sde_create_counter_F08 integer(kind=C_int) function papif_sde_create_counter_F08(handle, event_name_C_str, cntr_type, counter_handle) result(error) bind(C, name="papi_sde_create_counter") use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_int type(C_ptr), value, intent(in) :: handle type(C_ptr), value, intent(in) :: event_name_C_str integer(kind=C_int), value, intent(in) :: cntr_type type(C_ptr), value, intent(in) :: counter_handle ! this argument is "intent(in)" because we will modify the address in which it points to, not the argument itself. 
end function papif_sde_create_counter_F08 end interface papif_sde_create_counter_F08 interface papif_sde_inc_counter_F08 integer(kind=C_int) function papif_sde_inc_counter_F08(counter_handle, increment) result(error) bind(C, name="papi_sde_inc_counter") use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_long_long, C_int type(C_ptr), value, intent(in) :: counter_handle integer(kind=C_long_long), value, intent(in) :: increment end function papif_sde_inc_counter_F08 end interface papif_sde_inc_counter_F08 interface papif_sde_create_recorder_F08 integer(kind=C_int) function papif_sde_create_recorder_F08(handle, event_name_C_str, typesize, cmpr_func_ptr, recorder_handle) result(error) bind(C, name="papi_sde_create_recorder") use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_funptr, C_size_t, C_int type(C_ptr), value, intent(in) :: handle type(C_ptr), value, intent(in) :: event_name_C_str integer(kind=C_size_t), value, intent(in) :: typesize type(C_funptr), value, intent(in) :: cmpr_func_ptr type(C_ptr), value, intent(in) :: recorder_handle ! this argument is "intent(in)" because we will modify the address in which it points to, not the argument itself. 
end function papif_sde_create_recorder_F08 end interface papif_sde_create_recorder_F08 interface papif_sde_record_F08 integer(kind=C_int) function papif_sde_record_F08(recorder_handle, typesize, value_to_rec) result(error) bind(C, name="papi_sde_record") use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_size_t, C_int type(C_ptr), value, intent(in) :: recorder_handle integer(kind=C_size_t), value, intent(in) :: typesize type(C_ptr), value, intent(in) :: value_to_rec end function papif_sde_record_F08 end interface papif_sde_record_F08 interface papif_sde_reset_recorder_F08 integer(kind=C_int) function papif_sde_reset_recorder_F08(recorder_handle) result(error) bind(C, name="papi_sde_reset_recorder") use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_int type(C_ptr), value, intent(in) :: recorder_handle end function papif_sde_reset_recorder_F08 end interface papif_sde_reset_recorder_F08 interface papif_sde_reset_counter_F08 integer(kind=C_int) function papif_sde_reset_counter_F08(counter_handle) result(error) bind(C, name="papi_sde_reset_counter") use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_int type(C_ptr), value, intent(in) :: counter_handle end function papif_sde_reset_counter_F08 end interface papif_sde_reset_counter_F08 interface papif_sde_get_counter_handle_F08 type(C_ptr) function papif_sde_get_counter_handle_F08(handle, event_name_C_str) result(counter_handle) bind(C, name="papi_sde_get_counter_handle") use, intrinsic :: ISO_C_BINDING, only : C_ptr type(C_ptr), value, intent(in) :: handle type(C_ptr), value, intent(in) :: event_name_C_str end function papif_sde_get_counter_handle_F08 end interface papif_sde_get_counter_handle_F08 ! ------------------------------------------------------------------- ! ----------------- Interfaces for helper functions ----------------- ! 
------------------------------------------------------------------- interface C_malloc type(C_ptr) function C_malloc(size) bind(C,name="malloc") use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_size_t integer(C_size_t), value, intent(in) :: size end function C_malloc end interface C_malloc interface C_free subroutine C_free(ptr) bind(C,name="free") use, intrinsic :: ISO_C_BINDING, only : C_ptr type(C_ptr), value, intent(in) :: ptr end subroutine C_free end interface C_free ! ------------------------------------------------------------------- ! ----------------- Interfaces for function pointers ---------------- ! ------------------------------------------------------------------- interface function init_t(lib_name) result(ret_val) use, intrinsic :: ISO_C_BINDING, only: C_ptr type(C_ptr), value :: lib_name type(C_ptr) :: ret_val end function init_t end interface interface function register_counter_t(lib_handle, event_name, cntr_mode, cntr_type, cntr) result(ret_val) use, intrinsic :: ISO_C_BINDING, only: C_ptr, C_int type(C_ptr), value, intent(in) :: lib_handle type(C_ptr), value, intent(in) :: event_name integer(kind=C_int), value, intent(in) :: cntr_mode integer(kind=C_int), value, intent(in) :: cntr_type type(C_ptr), intent(in) :: cntr integer(kind=C_int) :: ret_val end function register_counter_t end interface interface function register_counter_cb_t(lib_handle, event_name, cntr_mode, cntr_type, c_func_ptr, param ) result(ret_val) use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_funptr, C_int type(C_ptr), value, intent(in) :: lib_handle type(C_ptr), value, intent(in) :: event_name integer(kind=C_int), value, intent(in) :: cntr_type integer(kind=C_int), value, intent(in) :: cntr_mode type(C_funptr), value, intent(in) :: c_func_ptr type(C_ptr), value, intent(in) :: param integer(kind=C_int) :: ret_val end function register_counter_cb_t end interface interface function unregister_counter_t(lib_handle, event_name) result(ret_val) use, intrinsic :: ISO_C_BINDING, only 
: C_ptr, C_int type(C_ptr), value, intent(in) :: lib_handle type(C_ptr), value, intent(in) :: event_name integer(kind=C_int) :: ret_val end function unregister_counter_t end interface interface function describe_counter_t(lib_handle, event_name, event_desc) result(ret_val) use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_int type(C_ptr), value, intent(in) :: lib_handle type(C_ptr), value, intent(in) :: event_name type(C_ptr), value, intent(in) :: event_desc integer(kind=C_int) :: ret_val end function describe_counter_t end interface interface function add_counter_to_group_t(handle, event_name_C_str, group_name_C_str, flags) result(ret_val) use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_int, C_int32_t type(C_ptr), value, intent(in) :: handle type(C_ptr), value, intent(in) :: event_name_C_str type(C_ptr), value, intent(in) :: group_name_C_str integer(kind=C_INT32_T), value, intent(in) :: flags integer(kind=C_int) :: ret_val end function add_counter_to_group_t end interface interface function create_counter_t(handle, event_name_C_str, cntr_type, counter_handle) result(ret_val) use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_int type(C_ptr), value, intent(in) :: handle type(C_ptr), value, intent(in) :: event_name_C_str integer(kind=C_int), value, intent(in) :: cntr_type type(C_ptr), value, intent(in) :: counter_handle ! this argument is "intent(in)" because we will modify the address in which it points to, not the argument itself. 
      integer(kind=C_int) :: ret_val
    end function create_counter_t
  end interface

  interface
    function inc_counter_t(counter_handle, increment) result(ret_val)
      use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_long_long, C_int
      type(C_ptr), value, intent(in) :: counter_handle
      integer(kind=C_long_long), value, intent(in) :: increment
      integer(kind=C_int) :: ret_val
    end function inc_counter_t
  end interface

  interface
    function create_recorder_t(handle, event_name_C_str, typesize, recorder_handle) result(ret_val)
      use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_size_t, C_int
      type(C_ptr), value, intent(in) :: handle
      type(C_ptr), value, intent(in) :: event_name_C_str
      integer(kind=C_size_t), value, intent(in) :: typesize
      ! This argument is "intent(in)" because we modify the address it points to, not the argument itself.
      type(C_ptr), value, intent(in) :: recorder_handle
      integer(kind=C_int) :: ret_val
    end function create_recorder_t
  end interface

  interface
    function record_t(recorder_handle, typesize, value_to_rec) result(ret_val)
      use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_size_t, C_int
      type(C_ptr), value, intent(in) :: recorder_handle
      integer(kind=C_size_t), value, intent(in) :: typesize
      type(C_ptr), value, intent(in) :: value_to_rec
      integer(kind=C_int) :: ret_val
    end function record_t
  end interface

  interface
    function reset_recorder_t(recorder_handle) result(ret_val)
      use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_int
      type(C_ptr), value, intent(in) :: recorder_handle
      integer(kind=C_int) :: ret_val
    end function reset_recorder_t
  end interface

  interface
    function reset_counter_t(counter_handle) result(ret_val)
      use, intrinsic :: ISO_C_BINDING, only : C_ptr, C_int
      type(C_ptr), value, intent(in) :: counter_handle
      integer(kind=C_int) :: ret_val
    end function reset_counter_t
  end interface

  interface
    function get_counter_handle_t(handle, event_name_C_str) result(ret_val)
      use, intrinsic :: ISO_C_BINDING, only : C_ptr
      type(C_ptr), value, intent(in) :: handle
      type(C_ptr), value, intent(in) :: event_name_C_str
      type(C_ptr) :: ret_val
    end function get_counter_handle_t
  end interface

  ! -------------------------------------------------------------------
  ! ------------------------ END OF INTERFACES ------------------------
  ! -------------------------------------------------------------------

contains

  ! -------------------------------------------------------------------
  ! ---------------------- F08 API subroutines ------------------------
  ! -------------------------------------------------------------------

  subroutine papif_sde_init(lib_name, handle, error)
    character(len=*), intent(in) :: lib_name
    type(C_ptr), intent(out) :: handle
    integer, intent(out), optional :: error
    type(C_ptr) :: lib_name_C_str

    lib_name_C_str = F_str_to_C(lib_name)
    handle = papif_sde_init_F08(lib_name_C_str)
    call C_free(lib_name_C_str)
    if( present(error) ) then
      error = PAPI_OK
    end if
  end subroutine papif_sde_init

  ! ---------------------------------------------------------

  subroutine papif_sde_register_counter( handle, event_name, cntr_mode, cntr_type, counter, error )
    type(C_ptr), intent(in) :: handle
    character(len=*), intent(in) :: event_name
    integer, intent(in) :: cntr_type
    integer, intent(in) :: cntr_mode
    type(C_ptr), value, intent(in) :: counter
    integer, intent(out), optional :: error
    integer :: tmp
    type(C_ptr) :: event_name_C_str

    event_name_C_str = F_str_to_C(event_name)
    tmp = papif_sde_register_counter_F08(handle, event_name_C_str, cntr_mode, cntr_type, counter)
    if( present(error) ) then
      error = tmp
    end if
    call C_free(event_name_C_str)
  end subroutine papif_sde_register_counter

  ! ---------------------------------------------------------

  subroutine papif_sde_register_counter_cb( handle, event_name, cntr_mode, cntr_type, c_func_ptr, param, error )
    type(C_ptr), intent(in) :: handle
    character(len=*), intent(in) :: event_name
    integer, intent(in) :: cntr_type
    integer, intent(in) :: cntr_mode
    type(C_ptr), value, intent(in) :: param
    integer, intent(out), optional :: error
    integer :: tmp
    type(C_funptr) :: c_func_ptr
    type(C_ptr) :: event_name_C_str

    event_name_C_str = F_str_to_C(event_name)
    tmp = papif_sde_register_counter_cb_F08(handle, event_name_C_str, cntr_mode, cntr_type, c_func_ptr, param)
    if( present(error) ) then
      error = tmp
    end if
    call C_free(event_name_C_str)
  end subroutine papif_sde_register_counter_cb

  ! ---------------------------------------------------------

  subroutine papif_sde_unregister_counter( handle, event_name, error )
    type(C_ptr), intent(in) :: handle
    character(len=*), intent(in) :: event_name
    integer, intent(out), optional :: error
    integer :: tmp
    type(C_ptr) :: event_name_C_str

    event_name_C_str = F_str_to_C(event_name)
    tmp = papif_sde_unregister_counter_F08(handle, event_name_C_str)
    if( present(error) ) then
      error = tmp
    end if
    call C_free(event_name_C_str)
  end subroutine papif_sde_unregister_counter

  ! ---------------------------------------------------------

  subroutine papif_sde_describe_counter( handle, event_name, event_desc, error )
    type(C_ptr), intent(in) :: handle
    character(len=*), intent(in) :: event_name
    character(len=*), intent(in) :: event_desc
    integer, intent(out), optional :: error
    integer :: tmp
    type(C_ptr) :: event_name_C_str
    type(C_ptr) :: event_desc_C_str

    event_name_C_str = F_str_to_C(event_name)
    event_desc_C_str = F_str_to_C(event_desc)
    tmp = papif_sde_describe_counter_F08(handle, event_name_C_str, event_desc_C_str)
    if( present(error) ) then
      error = tmp
    end if
    call C_free(event_name_C_str)
    call C_free(event_desc_C_str)
  end subroutine papif_sde_describe_counter

  ! ---------------------------------------------------------

  subroutine papif_sde_create_counter(handle, event_name, cntr_type, counter_handle, error)
    type(C_ptr), value, intent(in) :: handle
    character(len=*), intent(in) :: event_name
    integer(kind=C_int), value, intent(in) :: cntr_type
    type(C_ptr), value, intent(in) :: counter_handle
    integer, intent(out), optional :: error
    integer :: tmp
    type(C_ptr) :: event_name_C_str

    event_name_C_str = F_str_to_C(event_name)
    tmp = papif_sde_create_counter_F08(handle, event_name_C_str, cntr_type, counter_handle)
    if( present(error) ) then
      error = tmp
    end if
    call C_free(event_name_C_str)
  end subroutine papif_sde_create_counter

  ! ---------------------------------------------------------

  subroutine papif_sde_inc_counter(counter_handle, increment, error)
    type(C_ptr), value, intent(in) :: counter_handle
    integer(kind=C_long_long), value, intent(in) :: increment
    integer, intent(out), optional :: error
    integer :: tmp

    tmp = papif_sde_inc_counter_F08(counter_handle, increment)
    if( present(error) ) then
      error = tmp
    end if
  end subroutine papif_sde_inc_counter

  ! ---------------------------------------------------------

  subroutine papif_sde_create_recorder(handle, event_name, typesize, cmpr_c_func_ptr, recorder_handle, error)
    type(C_ptr), value, intent(in) :: handle
    character(len=*), intent(in) :: event_name
    integer(kind=C_size_t), value, intent(in) :: typesize
    type(C_funptr) :: cmpr_c_func_ptr
    type(C_ptr), value, intent(in) :: recorder_handle
    integer, intent(out), optional :: error
    integer :: tmp
    type(C_ptr) :: event_name_C_str

    event_name_C_str = F_str_to_C(event_name)
    tmp = papif_sde_create_recorder_F08(handle, event_name_C_str, typesize, cmpr_c_func_ptr, recorder_handle)
    if( present(error) ) then
      error = tmp
    end if
    call C_free(event_name_C_str)
  end subroutine papif_sde_create_recorder

  ! ---------------------------------------------------------

  subroutine papif_sde_record(recorder_handle, typesize, value_to_rec, error)
    type(C_ptr), value, intent(in) :: recorder_handle
    integer(kind=C_size_t), value, intent(in) :: typesize
    type(C_ptr), value, intent(in) :: value_to_rec
    integer, intent(out), optional :: error
    integer :: tmp

    tmp = papif_sde_record_F08(recorder_handle, typesize, value_to_rec)
    if( present(error) ) then
      error = tmp
    end if
  end subroutine papif_sde_record

  ! ---------------------------------------------------------

  subroutine papif_sde_reset_recorder(recorder_handle, error)
    type(C_ptr), value, intent(in) :: recorder_handle
    integer, intent(out), optional :: error
    integer :: tmp

    tmp = papif_sde_reset_recorder_F08(recorder_handle)
    if( present(error) ) then
      error = tmp
    end if
  end subroutine papif_sde_reset_recorder

  ! ---------------------------------------------------------

  subroutine papif_sde_reset_counter(counter_handle, error)
    type(C_ptr), value, intent(in) :: counter_handle
    integer, intent(out), optional :: error
    integer :: tmp

    tmp = papif_sde_reset_counter_F08(counter_handle)
    if( present(error) ) then
      error = tmp
    end if
  end subroutine papif_sde_reset_counter

  ! ---------------------------------------------------------

  subroutine papif_sde_get_counter_handle(handle, event_name, counter_handle, error)
    type(C_ptr), value, intent(in) :: handle
    character(len=*), intent(in) :: event_name
    integer, intent(out), optional :: error
    type(C_ptr), intent(out) :: counter_handle
    type(C_ptr) :: event_name_C_str

    event_name_C_str = F_str_to_C(event_name)
    counter_handle = papif_sde_get_counter_handle_F08(handle, event_name_C_str)
    call C_free(event_name_C_str)
    if( present(error) ) then
      error = PAPI_OK
    end if
  end subroutine papif_sde_get_counter_handle

  ! -------------------------------------------------------------------
  ! ------------------------ Helper functions -------------------------
  ! -------------------------------------------------------------------

  type(C_ptr) function F_str_to_C(F_str) result(C_str)
    implicit none
    character(len=*), intent(in) :: F_str
    character(len=1,kind=C_char), pointer :: tmp_str_ptr(:)
    integer(C_size_t) :: i, strlen

    strlen = len(F_str)
    C_str = C_malloc(strlen+1)
    if (C_associated(C_str)) then
      call C_F_pointer(C_str,tmp_str_ptr,[strlen+1])
      forall (i=1:strlen)
        tmp_str_ptr(i) = F_str(i:i)
      end forall
      tmp_str_ptr(strlen+1) = C_NULL_char
    end if
  end function F_str_to_C

end module papi_sde_fortran_wrappers

==== papi-papi-7-2-0-t/src/components/sde/sde_internal.h ====

#ifndef SDE_H
#define SDE_H

#ifndef SDE_MAX_SIMULTANEOUS_COUNTERS
#define SDE_MAX_SIMULTANEOUS_COUNTERS 40
#endif

#include
#include
#include
#include
#include
#include "papi.h"
#include "papi_internal.h"
#include "papi_vector.h"
#include "papi_memory.h"
#include "extras.h"

#define REGISTERED_EVENT_MASK 0x2;

/* We do not use this structure, but the framework needs its size */
typedef struct sde_register {
    int junk;
} sde_register_t;

/* We do not use this structure, but the framework needs its size */
typedef struct sde_reg_alloc {
    sde_register_t junk;
} sde_reg_alloc_t;

/**
 * There's one of these per event-set to hold data specific to the EventSet, like
 * counter start values, number of events in a set and counter uniq ids.
 */
typedef struct sde_control_state {
    int num_events;
    uint32_t which_counter[SDE_MAX_SIMULTANEOUS_COUNTERS];
    long long counter[SDE_MAX_SIMULTANEOUS_COUNTERS];
    long long previous_value[SDE_MAX_SIMULTANEOUS_COUNTERS];
    timer_t timerid;
    int has_timer;
} sde_control_state_t;

typedef struct sde_context {
    long long junk;
} sde_context_t;

// Function prototypes
static int _sde_reset( hwd_context_t *ctx, hwd_control_state_t *ctl );
static int _sde_write( hwd_context_t *ctx, hwd_control_state_t *ctl, long long *events );
static int _sde_read( hwd_context_t *ctx, hwd_control_state_t *ctl, long long **events, int flags );
static int _sde_stop( hwd_context_t *ctx, hwd_control_state_t *ctl );
static int _sde_start( hwd_context_t *ctx, hwd_control_state_t *ctl );
static int _sde_update_control_state( hwd_control_state_t *ctl, NativeInfo_t *native, int count, hwd_context_t *ctx );
static int _sde_init_control_state( hwd_control_state_t * ctl );
static int _sde_init_thread( hwd_context_t *ctx );
static int _sde_init_component( int cidx );
static int _sde_shutdown_component(void);
static int _sde_shutdown_thread( hwd_context_t *ctx );
static int _sde_ctl( hwd_context_t *ctx, int code, _papi_int_option_t *option );
static int _sde_set_domain( hwd_control_state_t * cntrl, int domain );
static int _sde_ntv_enum_events( unsigned int *EventCode, int modifier );
static int _sde_ntv_code_to_name( unsigned int EventCode, char *name, int len );
static int _sde_ntv_code_to_descr( unsigned int EventCode, char *descr, int len );
static int _sde_ntv_name_to_code(const char *name, unsigned int *event_code );
static void _sde_dispatch_timer( int n, hwd_siginfo_t *info, void *uc);
static void invoke_user_handler( unsigned int cntr_uniq_id );
static int do_set_timer_for_overflow( sde_control_state_t *sde_ctl );
static inline int sde_arm_timer(sde_control_state_t *sde_ctl);

int papi_sde_lock(void);
int papi_sde_unlock(void);
void papi_sde_check_overflow_status(unsigned int cntr_uniq_id, long long int latest);
int papi_sde_set_timer_for_overflow(void);

// Function pointers that will be initialized by the linker if libpapi and libsde are static (.a)
__attribute__((__common__)) int   (*sde_ti_reset_counter_ptr)( uint32_t );
__attribute__((__common__)) int   (*sde_ti_read_counter_ptr)( uint32_t, long long int * );
__attribute__((__common__)) int   (*sde_ti_write_counter_ptr)( uint32_t, long long );
__attribute__((__common__)) int   (*sde_ti_name_to_code_ptr)( const char *, uint32_t * );
__attribute__((__common__)) int   (*sde_ti_is_simple_counter_ptr)( uint32_t );
__attribute__((__common__)) int   (*sde_ti_is_counter_set_to_overflow_ptr)( uint32_t );
__attribute__((__common__)) int   (*sde_ti_set_counter_overflow_ptr)( uint32_t, int );
__attribute__((__common__)) char *(*sde_ti_get_event_name_ptr)( int );
__attribute__((__common__)) char *(*sde_ti_get_event_description_ptr)( int );
__attribute__((__common__)) int   (*sde_ti_get_num_reg_events_ptr)( void );
__attribute__((__common__)) int   (*sde_ti_shutdown_ptr)( void );

#endif

==== papi-papi-7-2-0-t/src/components/sde/tests/Advanced_C+FORTRAN/Gamum.c ====

#include
#include
#include
#include "sde_lib.h"

// API functions (FORTRAN 77 friendly).
papi_handle_t papi_sde_hook_list_events( papi_sde_fptr_struct_t *fptr_struct);
void gamum_init_(void);
void gamum_unreg_(void);
void gamum_do_work_(void);

// The following counter is a global variable that can be directly
// modified by programs linking with this library.
long long int gamum_cnt_i1;

// The following counters are hidden from programs linking with
// this library, so they cannot be directly modified.
static double cnt_d1, cnt_d2, cnt_d3, cnt_d4, cnt_d5;
static long long int cnt_i2, cnt_i3;
static double cnt_rm1, cnt_rm2;
static papi_handle_t handle;
static void *cntr_handle;

// For internal use only.
static papi_handle_t gamum_papi_sde_hook_list_events( papi_sde_fptr_struct_t *fptr_struct);

static const char *event_names[2] = {
    "event_with_characters____ __up______to_______60_bytes",
    "event_with_very_long_name_which_is_meant_to_exceed_128_bytes_or_in_other_words_the_size_of_two_cache_lines_so_we_see_if_it_makes_a_difference_in_performance"
};

void gamum_init_(void){
    cnt_d1 = cnt_d2 = cnt_d3 = cnt_d4 = cnt_d5 = 1;
    gamum_cnt_i1 = cnt_i2 = 0;
    papi_sde_fptr_struct_t fptr_struct;
    POPULATE_SDE_FPTR_STRUCT( fptr_struct );
    (void)gamum_papi_sde_hook_list_events(&fptr_struct);
    return;
}

papi_handle_t papi_sde_hook_list_events( papi_sde_fptr_struct_t *fptr_struct){
    return gamum_papi_sde_hook_list_events(fptr_struct);
}

papi_handle_t gamum_papi_sde_hook_list_events( papi_sde_fptr_struct_t *fptr_struct){
    handle = fptr_struct->init("Gamum");
    fptr_struct->register_counter(handle, "rm1", PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_double, &cnt_rm1);
    fptr_struct->register_counter(handle, "ev1", PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_double, &cnt_d1);
    fptr_struct->add_counter_to_group(handle, "ev1", "group0", PAPI_SDE_SUM);
    fptr_struct->register_counter(handle, "ev2", PAPI_SDE_RO|PAPI_SDE_INSTANT, PAPI_SDE_double, &cnt_d2);
    fptr_struct->register_counter(handle, "ev3", PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_double, &cnt_d3);
    fptr_struct->register_counter(handle, "ev4", PAPI_SDE_RO|PAPI_SDE_INSTANT, PAPI_SDE_double, &cnt_d4);
    fptr_struct->add_counter_to_group(handle, "ev4", "group0", PAPI_SDE_SUM);
    fptr_struct->register_counter(handle, "ev5", PAPI_SDE_RO|PAPI_SDE_INSTANT, PAPI_SDE_double, &cnt_d5);
    fptr_struct->register_counter(handle, "rm2", PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_double, &cnt_rm2);
    fptr_struct->register_counter(handle, "i1", PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_long_long, &gamum_cnt_i1);
    fptr_struct->register_counter(handle, "i2", PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_long_long, &cnt_i2);
    fptr_struct->create_counter(handle, "papi_counter", PAPI_SDE_RO|PAPI_SDE_INSTANT, &cntr_handle );
    fptr_struct->register_counter(handle, event_names[0], PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_long_long, &cnt_i3);
    return handle;
}

void gamum_unreg_(void){
    papi_sde_unregister_counter(handle, "rm1");
    papi_sde_unregister_counter(handle, "rm2");
}

void gamum_do_work_(void){
    cnt_d1 += 0.1;
    cnt_d2 += 0.111;
    cnt_d3 += 0.2;
    cnt_d4 += 0.222;
    cnt_d5 += 0.3;
    gamum_cnt_i1 += 6;
    cnt_i2 += 101;
    cnt_i3 += 33;
    papi_sde_inc_counter(cntr_handle, 5);
    papi_sde_inc_counter(papi_sde_get_counter_handle(handle, "papi_counter"), 1);
}

==== papi-papi-7-2-0-t/src/components/sde/tests/Advanced_C+FORTRAN/Xandria.F90 ====

module xandria_mod
  use ISO_C_BINDING, only: C_LONG_LONG, C_DOUBLE, C_ptr, C_funptr
  use papi_sde_fortran_wrappers
  implicit none
  integer(kind=C_LONG_LONG), target :: cntr_i1, cntr_i2, cntr_rw_i1, cntr_iL
  integer(kind=C_LONG_LONG), target :: cntr_i10, cntr_i20, cntr_i30
  real(kind=C_DOUBLE), target :: cntr_r1, cntr_r2, cntr_r3
  TYPE(C_ptr) :: xandria_sde_handle
end module

function papi_sde_hook_list_events(fptr_struct) result(tmp_handle) bind(C)
  use xandria_mod
  use, intrinsic :: ISO_C_BINDING, only: C_ptr, C_null_ptr, C_int, C_F_procpointer
  implicit none
  type(fptr_struct_t) :: fptr_struct
  type(C_ptr) :: tmp_handle
  integer(kind=C_int) :: error_code
  integer(kind=C_int) :: cntr_mode, rw_mode, cntr_type
  procedure(init_t), pointer :: init_fptr
  procedure(register_counter_t), pointer :: reg_cntr_fptr
  procedure(create_counter_t), pointer :: create_cntr_fptr

  cntr_mode = PAPI_SDE_RO+PAPI_SDE_DELTA
  rw_mode = PAPI_SDE_RW+PAPI_SDE_INSTANT
  cntr_type = PAPI_SDE_long_long

  call C_F_procpointer(fptr_struct%init, init_fptr)
  tmp_handle = init_fptr(F_str_to_C('Xandria'))
  call C_F_procpointer(fptr_struct%register_counter, reg_cntr_fptr)
  call C_F_procpointer(fptr_struct%create_counter, create_cntr_fptr)

  error_code = reg_cntr_fptr(tmp_handle, F_str_to_C('EV_I1'), cntr_mode, cntr_type, C_null_ptr)
  if( error_code .ne. PAPI_OK ) then
    print *,'Error in Xandria:papi_sde_hook_list_events() '
    return
  endif
  error_code = reg_cntr_fptr(tmp_handle, F_str_to_C('EV_I2'), cntr_mode, cntr_type, C_null_ptr)
  if( error_code .ne. PAPI_OK ) then
    print *,'Error in Xandria:papi_sde_hook_list_events() '
    return
  endif
  error_code = reg_cntr_fptr(tmp_handle, F_str_to_C('RW_I1'), rw_mode, cntr_type, C_null_ptr)
  if( error_code .ne. PAPI_OK ) then
    print *,'Error in Xandria:papi_sde_hook_list_events() '
    return
  endif
  error_code = reg_cntr_fptr(tmp_handle, F_str_to_C('EV_R1'), cntr_mode, cntr_type, C_null_ptr)
  if( error_code .ne. PAPI_OK ) then
    print *,'Error in Xandria:papi_sde_hook_list_events() '
    return
  endif
  error_code = reg_cntr_fptr(tmp_handle, F_str_to_C('EV_R2'), cntr_mode, cntr_type, C_null_ptr)
  if( error_code .ne. PAPI_OK ) then
    print *,'Error in Xandria:papi_sde_hook_list_events() '
    return
  endif
  error_code = reg_cntr_fptr(tmp_handle, F_str_to_C('EV_R3'), cntr_mode, cntr_type, C_null_ptr)
  if( error_code .ne. PAPI_OK ) then
    print *,'Error in Xandria:papi_sde_hook_list_events() '
    return
  endif
  error_code = reg_cntr_fptr(tmp_handle, F_str_to_C('LATE'), cntr_mode, cntr_type, C_null_ptr)
  if( error_code .ne. PAPI_OK ) then
    print *,'Error in Xandria:papi_sde_hook_list_events() '
    return
  endif
  error_code = create_cntr_fptr(tmp_handle, F_str_to_C('XND_CREATED'), cntr_type, C_null_ptr)
  if( error_code .ne. PAPI_OK ) then
    print *,'Error in Xandria:papi_sde_hook_list_events() '
    return
  endif
end function papi_sde_hook_list_events

subroutine xandria_init
  use, intrinsic :: ISO_C_BINDING
  use :: xandria_mod
  implicit none
  integer :: cntr_mode, rw_mode, cntr_type, error

  cntr_i1 = 0
  cntr_i2 = 0
  cntr_rw_i1 = 0
  cntr_i10 = 0
  cntr_i20 = 0
  cntr_i30 = 0
  cntr_iL = 0

  cntr_mode = PAPI_SDE_RO+PAPI_SDE_DELTA
  rw_mode = PAPI_SDE_RW+PAPI_SDE_INSTANT
  cntr_type = PAPI_SDE_long_long

  call papif_sde_init('Xandria', xandria_sde_handle, error)
  if( error .ne. PAPI_OK ) then
    print *,'Error in papif_sde_init() '
    stop
  endif
  call papif_sde_register_counter(xandria_sde_handle, 'EV_I1', cntr_mode, cntr_type, C_loc(cntr_i1), error)
  if( error .ne. PAPI_OK ) then
    print *,'Error in papif_sde_register_counter() '
    stop
  endif
  call papif_sde_register_counter(xandria_sde_handle, 'EV_I2', cntr_mode, cntr_type, C_loc(cntr_i2), error)
  if( error .ne. PAPI_OK ) then
    print *,'Error in papif_sde_register_counter() '
    stop
  endif
  call papif_sde_register_counter(xandria_sde_handle, 'RW_I1', rw_mode, cntr_type, C_loc(cntr_rw_i1), error)
  if( error .ne. PAPI_OK ) then
    print *,'Error in papif_sde_register_counter() '
    stop
  endif
  call papif_sde_register_counter(xandria_sde_handle, 'EV_R1', cntr_mode, cntr_type, C_loc(cntr_i10), error)
  if( error .ne. PAPI_OK ) then
    print *,'Error in papif_sde_register_counter() '
    stop
  endif
  call papif_sde_register_counter(xandria_sde_handle, 'EV_R2', cntr_mode, cntr_type, C_loc(cntr_i20), error)
  if( error .ne. PAPI_OK ) then
    print *,'Error in papif_sde_register_counter() '
    stop
  endif
  call papif_sde_register_counter(xandria_sde_handle, 'EV_R3', cntr_mode, cntr_type, C_loc(cntr_i30), error)
  if( error .ne. PAPI_OK ) then
    print *,'Error in papif_sde_register_counter() '
    stop
  endif
  call papif_sde_create_counter(xandria_sde_handle, 'XND_CREATED', cntr_type, C_null_ptr, error)
  if( error .ne. PAPI_OK ) then
    print *,'Error in papif_sde_create_counter() '
    return
  endif
end subroutine

subroutine xandria_add_more
  use, intrinsic :: ISO_C_BINDING
  use :: xandria_mod
  implicit none
  integer :: cntr_mode, cntr_type, error

  cntr_mode = PAPI_SDE_RO+PAPI_SDE_DELTA
  cntr_type = PAPI_SDE_long_long

  call papif_sde_register_counter(xandria_sde_handle, 'LATE', cntr_mode, cntr_type, C_loc(cntr_iL), error)
  if( error .ne. PAPI_OK ) then
    print *,'Error in papif_sde_register_counter() '
    stop
  endif
end subroutine

subroutine xandria_do_work
  use, intrinsic :: ISO_C_BINDING
  use :: xandria_mod
  implicit none
  TYPE(C_ptr) :: cntr_handle
  integer :: error

  cntr_i1 = cntr_i1+1
  cntr_i2 = cntr_i2+3
  cntr_rw_i1 = cntr_rw_i1 + 7
  cntr_i10 = cntr_i10+10
  cntr_i20 = cntr_i20+20
  cntr_i30 = cntr_i30+30
  cntr_iL = cntr_iL+7

  call papif_sde_get_counter_handle(xandria_sde_handle, 'XND_CREATED', cntr_handle, error)
  if( error .ne. PAPI_OK ) then
    print *,'Error in papif_sde_get_counter_handle() '
    stop
  endif
  call papif_sde_inc_counter( cntr_handle, 9_8, error)
  if( error .ne. PAPI_OK ) then
    print *,'Error in papif_sde_inc_counter() '
    stop
  endif
end subroutine

==== papi-papi-7-2-0-t/src/components/sde/tests/Advanced_C+FORTRAN/sde_test_f08.F90 ====

program test_F_interface
  use, intrinsic :: ISO_C_BINDING
  use :: ISO_FORTRAN_ENV
  use :: papi_sde_fortran_wrappers
  implicit none

  TYPE(C_ptr) :: handle
  TYPE(C_ptr) :: quantile
  integer, pointer :: quantile_f
  integer(kind=C_LONG_LONG) :: values(22), values_to_write(1)
  integer(kind=C_LONG_LONG), target :: ev_cnt1, ev_cnt2
  integer(kind=C_INT), target :: ev_cnt3_int
  real, target :: ev_cnt3_float
  real (KIND=KIND(0.0)), target :: ev_cnt4_float
  real (KIND=KIND(0.0D0)) :: value_d
  integer :: i, ret_val, error
  integer :: eventset, eventset2, eventcode, junk, codes(3)
  real, target :: internal_variable
  integer :: internal_variable_int
  integer :: all_tests_passed
  character(:), allocatable :: arg
  integer :: arglen, stat
  integer :: be_verbose

  interface
    function callback_t(param) result(ret_val)
      use, intrinsic :: ISO_C_BINDING, only: C_LONG_LONG
      real :: param
      integer(kind=C_LONG_LONG) :: ret_val
    end function callback_t
  end interface

  interface
    function rounding_error(param) result(ret_val)
      real (KIND=KIND(0.0D0)) :: param, ret_val
    end function rounding_error
  end interface

  procedure(callback_t) :: f08_callback

  all_tests_passed = 1
  ev_cnt1 = 73
  ev_cnt3_int = 5
  ev_cnt4_float = 5.431
  values_to_write(1) = 9
  be_verbose = 0

  if(command_argument_count() .eq. 1) then
    call get_command_argument(number=1, length=arglen, status=stat)
    if( stat .eq. 0 ) then
      allocate (character(arglen) :: arg)
      call get_command_argument(number=1, value=arg, status=stat)
      if( arg == '-verbose' ) then
        be_verbose = 1
      endif
    endif
  endif

  call papif_sde_init('TESTLIB', handle, error)
  if(error .ne. PAPI_OK ) print *,'Error in sde_init'

  call papif_sde_register_counter(handle, 'TESTEVENT', PAPI_SDE_RO, PAPI_SDE_long_long, C_loc(ev_cnt1), error)
  if(error .ne. PAPI_OK ) print *,'Error in sde_register_counter'

  call papif_sde_describe_counter(handle, 'TESTEVENT', 'This is a test counter used to test SDE from FORTRAN, for testing purposes only. Use it when you test the functionality in a test or something. Happy testing.', error)
  if(error .ne. PAPI_OK ) print *,'Error in sde_describe_counter'

  call papif_sde_register_counter(handle, 'SERIOUSEVENT', PAPI_SDE_RO, PAPI_SDE_long_long, C_loc(ev_cnt2), error)
  if(error .ne. PAPI_OK ) print *,'Error in sde_register_counter'

  ! The following call should be ignored by the SDE component (since this counter is already registered.)
  call papif_sde_register_counter(handle, 'SERIOUSEVENT', PAPI_SDE_RO, PAPI_SDE_long_long, C_loc(ev_cnt1), error)
  if(error .ne. PAPI_OK ) print *,'Error in sde_register_counter'

  call papif_sde_describe_counter(handle, 'SERIOUSEVENT', 'This is a not a test counter, this one is serious.', error)
  if(error .ne. PAPI_OK ) print *,'Error in sde_describe_counter'

  call papif_sde_register_counter(handle, 'FLOATEVENT', PAPI_SDE_RO, PAPI_SDE_float, C_loc(ev_cnt4_float), error)
  if(error .ne. PAPI_OK ) print *,'Error in sde_register_counter'

  internal_variable = 987.65
  internal_variable_int = 12345

  ! The following call should be ignored by the SDE component, but the returned 'handle' should still be valid.
  call papif_sde_init('TESTLIB', handle, error)
  if(error .ne. PAPI_OK ) print *,'Error in sde_init'

  call papif_sde_register_counter_cb(handle, 'FP_EVENT', PAPI_SDE_RO, PAPI_SDE_long_long, c_funloc(f08_callback), C_loc(internal_variable), error)
  if(error .ne. PAPI_OK ) print *,'Error in sde_register_counter_cb'

  call papif_sde_describe_counter(handle, 'FP_EVENT', 'This is another counter.', error)
  if(error .ne. PAPI_OK ) print *,'Error in sde_describe_counter'

  ! The following call should be ignored by the SDE component (since this counter is already registered.)
  call papif_sde_register_counter_cb(handle, 'FP_EVENT', PAPI_SDE_RO, PAPI_SDE_long_long, c_funloc(f08_callback), C_loc(ev_cnt1), error)
  if(error .ne. PAPI_OK ) print *,'Error in sde_register_counter_cb'

  call xandria_init()
  call gamum_init()
  call recorder_init()

  internal_variable = 11.0

  ret_val = PAPI_VER_CURRENT
  call papif_library_init(ret_val)
  if( ret_val .ne. PAPI_VER_CURRENT ) then
    print *,'Error at papif_init', ret_val, '!=', PAPI_VER_CURRENT
    print *,'PAPI_EINVAL', PAPI_EINVAL
    print *,'PAPI_ENOMEM', PAPI_ENOMEM
    print *,'PAPI_ECMP', PAPI_ECMP
    print *,'PAPI_ESYS', PAPI_ESYS
    stop
  endif

  call recorder_do_work()

  eventset = PAPI_NULL
  call papif_create_eventset( eventset, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_create_eventset'
    stop
  endif

  eventset2 = PAPI_NULL
  call papif_create_eventset( eventset2, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_create_eventset'
    stop
  endif

  ! 1
  call papif_event_name_to_code( 'sde:::TESTLIB::TESTEVENT', eventcode, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_name_to_code'
    stop
  endif
  call papif_add_event( eventset, eventcode, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_add_event'
    stop
  endif

  !-------------------------------------------------------------------------------
  call recorder_do_work()
  !-------------------------------------------------------------------------------
  call papif_start( eventset, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_start'
    stop
  endif

  ev_cnt1 = ev_cnt1+100
  call xandria_do_work()

  call papif_stop( eventset, values, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_stop'
    stop
  endif

  call recorder_do_work()

  if( be_verbose .eq. 1 ) print '(A29,I4)', ' TESTLIB::TESTEVENT (100) = ', values(1)
  if( values(1) .ne. 100 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif

  ! 2
  call papif_event_name_to_code( 'sde:::TESTLIB::FP_EVENT', eventcode, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_name_to_code'
    stop
  endif
  call papif_add_event( eventset, eventcode, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_add_event'
    stop
  endif

  internal_variable = 12.0
  internal_variable_int = 12

  !-------------------------------------------------------------------------------
  if( be_verbose .eq. 1 ) print *,''
  call papif_start( eventset, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_start'
    stop
  endif

  ev_cnt1 = ev_cnt1+9
  ev_cnt4_float = ev_cnt4_float+33
  internal_variable = 12.4

  call papif_stop( eventset, values, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_stop'
    stop
  endif

  if( be_verbose .eq. 1 ) print '(A27,I2)', ' TESTLIB::TESTEVENT (9) = ', values(1)
  if( values(1) .ne. 9 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  if( be_verbose .eq. 1 ) print '(A26,I2)', ' TESTLIB::FP_EVENT (0) = ', values(2)
  if( values(2) .ne. 0 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif

  ! 3
  call papif_event_name_to_code( 'sde:::TESTLIB::FLOATEVENT', eventcode, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_name_to_code'
    stop
  endif
  call recorder_do_work()
  call papif_add_event( eventset, eventcode, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_add_event'
    stop
  endif

  ! 4
  call papif_event_name_to_code( 'sde:::Xandria::EV_I1', eventcode, ret_val )
  ! not added
  call papif_event_name_to_code( 'sde:::Xandria::EV_I2', junk, ret_val )
  ! not added
  call papif_event_name_to_code( 'sde:::Xandria::EV_I2', junk, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_name_to_code'
    stop
  endif
  call papif_add_event( eventset, eventcode, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_add_event'
    stop
  endif

  ! 5
  call papif_add_named_event( eventset, 'sde:::Xandria::RW_I1', ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_add_named_event'
    stop
  endif

  do i=1,37
    call recorder_do_work()
  end do

  !-------------------------------------------------------------------------------
  call papif_start( eventset, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_start'
    stop
  endif

  ev_cnt1 = ev_cnt1+2
  ev_cnt4_float = ev_cnt4_float+3.98
  internal_variable = 20.12
  internal_variable_int = 20
  call xandria_do_work()

  ! Adding the 5th counter into a separate eventset so we can write into it.
  call papif_add_named_event( eventset2, 'sde:::Xandria::RW_I1', ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_add_named_event'
    stop
  endif

  !--------------------
  if( be_verbose .eq. 1 ) print *,''
  call papif_read(eventset, values, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_read'
    stop
  endif

  do i=1,370
    call recorder_do_work()
  end do

  if( be_verbose .eq. 1 ) print '(A27,I2)', ' TESTLIB::TESTEVENT (2) = ', values(1)
  if( values(1) .ne. 2 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  if( be_verbose .eq. 1 ) print '(A26,I2)', ' TESTLIB::FP_EVENT (8) = ', values(2)
  if( values(2) .ne. 8 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  value_d = transfer(values(3), 1.0D0)
  if( be_verbose .eq. 1 ) print '(A31,F4.2)', ' TESTLIB::FLOATEVENT (3.98) = ', value_d
  if( abs(value_d - 3.98) .gt. rounding_error(value_d) ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  if( be_verbose .eq. 1 ) print '(A23,I1)', ' Xandria::EV_I1 (1) = ', values(4)
  if( values(4) .ne. 1 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  if( be_verbose .eq. 1 ) print '(A24,I2)', ' Xandria::RW_I1 (14) = ', values(5)
  if( values(5) .ne. 14 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif

  call xandria_do_work()
  call xandria_do_work()
  call xandria_do_work()

  !--------------------
  if( be_verbose .eq. 1 ) print *,''
  call papif_stop( eventset, values, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_stop'
    stop
  endif

  if( be_verbose .eq. 1 ) print '(A24,I2)', ' Xandria::RW_I1 (35) = ', values(5)
  if( values(5) .ne. 35 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif

  !-------------------------------------------------------------------------------
  ! WRITE and then read the RW counter.
  call papif_start( eventset2, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_start'
    stop
  endif
  call papif_write(eventset2, values_to_write, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_write'
    stop
  endif
  call papif_read(eventset2, values, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_read'
    stop
  endif
  if( be_verbose .eq. 1 ) print '(A23,I1)', ' Xandria::RW_I1 (9) = ', values(1)
  if( values(1) .ne. 9 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  call papif_stop( eventset2, values, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_stop'
    stop
  endif

  !-------------------------------------------------------------------------------
  call papif_start( eventset, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_start'
    stop
  endif

  ev_cnt1 = ev_cnt1+5
  ev_cnt4_float = ev_cnt4_float+18.8
  internal_variable = internal_variable + 30.1
  internal_variable_int = 30
  call xandria_do_work()
  call xandria_do_work()
  call xandria_do_work()

  call papif_read(eventset, values, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_read'
    stop
  endif

  if( be_verbose .eq. 1 ) print '(A27,I1)', ' TESTLIB::TESTEVENT (5) = ', values(1)
  if( values(1) .ne. 5 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  if( be_verbose .eq. 1 ) print '(A27,I2)', ' TESTLIB::FP_EVENT (30) = ', values(2)
  if( values(2) .ne. 30 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  value_d = transfer(values(3), 1.0D0)
  if( be_verbose .eq. 1 ) print '(A31,F4.1)',' TESTLIB::FLOATEVENT (18.8) = ', value_d
  if( abs(value_d - 18.8) .gt. rounding_error(value_d) ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  if( be_verbose .eq. 1 ) print '(A23,I2)', ' Xandria::EV_I1 (3) = ', values(4)
  if( values(4) .ne. 3 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif

  !--------------------
  if( be_verbose .eq. 1 ) print *,''
  call papif_reset(eventset, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_reset'
    stop
  endif
  call papif_stop( eventset, values, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_stop'
    stop
  endif

  if( be_verbose .eq. 1 ) print '(A27,I2)', ' TESTLIB::TESTEVENT (0) = ', values(1)
  if( values(1) .ne. 0 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  if( be_verbose .eq. 1 ) print '(A26,I2)', ' TESTLIB::FP_EVENT (0) = ', values(2)
  if( values(2) .ne. 0 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  value_d = transfer(values(3), 1.0D0)
  if( be_verbose .eq. 1 ) print '(A31,F4.1)', ' TESTLIB::FLOATEVENT, (0.0) = ', value_d
  if( abs(value_d - 0.0) .gt. rounding_error(value_d) ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  if( be_verbose .eq. 1 ) print '(A23,I2)', ' Xandria::EV_I1 (0) = ', values(4)
  if( values(4) .ne. 0 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif

  ! 6
  call papif_event_name_to_code('sde:::Xandria::EV_R1' , codes(1), ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_name_to_code'
    stop
  endif
  ! 7
  call papif_event_name_to_code('sde:::Xandria::EV_R2' , codes(2), ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_name_to_code'
    stop
  endif
  ! 8
  call papif_event_name_to_code('sde:::Xandria::EV_R3' , codes(3), ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_name_to_code'
    stop
  endif
  call papif_add_events( eventset, codes, 3, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_add_events'
    stop
  endif

  do i=1,29
    call recorder_do_work()
  end do

  !-------------------------------------------------------------------------------
  if( be_verbose .eq. 1 ) print *,''
  call papif_start( eventset, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_start'
    stop
  endif

  call xandria_do_work()
  call xandria_do_work()
  call xandria_do_work()

  call papif_stop( eventset, values, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_stop'
    stop
  endif

  if( be_verbose .eq. 1 ) print '(A27,I2)', ' TESTLIB::TESTEVENT (0) = ', values(1)
  if( values(1) .ne. 0 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  if( be_verbose .eq. 1 ) print '(A26,I2)', ' TESTLIB::FP_EVENT (0) = ', values(2)
  if( values(2) .ne. 0 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  value_d = transfer(values(3), 1.0D0)
  if( be_verbose .eq. 1 ) print '(A30,F3.1)' ,' TESTLIB::FLOATEVENT (0.0) = ', value_d
  if( abs(value_d - 0.0) .gt. rounding_error(value_d) ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  if( be_verbose .eq. 1 ) print '(A23,I2)', ' Xandria::EV_I1 (3) = ', values(4)
  if( values(4) .ne. 3 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  if( be_verbose .eq. 1 ) print '(A24,I2)', ' Xandria::EV_R1 (30) = ', values(6)
  if( values(6) .ne. 30 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  if( be_verbose .eq. 1 ) print '(A24,I2)', ' Xandria::EV_R2 (60) = ', values(7)
  if( values(7) .ne. 60 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  if( be_verbose .eq. 1 ) print '(A24,I2)', ' Xandria::EV_R3 (90) = ', values(8)
  if( values(8) .ne. 90 ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif

  !-------------------------------------------------------------------------------
  call gamum_unreg()
  do i=1,248
    call recorder_do_work()
  end do

  ! 9
  call papif_event_name_to_code('sde:::Gamum::ev1' , codes(1), ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_name_to_code'
    stop
  endif
  ! 10
  call papif_event_name_to_code('sde:::Gamum::ev3' , codes(2), ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_name_to_code'
    stop
  endif
  ! 11
  call papif_event_name_to_code('sde:::Gamum::ev4' , codes(3), ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_name_to_code'
    stop
  endif
  call papif_add_events( eventset, codes, 3, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_add_events'
    stop
  endif

  !-------------------------------------------------------------------------------
  if( be_verbose .eq. 1 ) print *,''
  call papif_start( eventset, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_start'
    stop
  endif

  call gamum_do_work()
  call gamum_do_work()
  call gamum_do_work()
  call gamum_do_work()

  do i=1,122
    call recorder_do_work()
  end do

  call papif_stop( eventset, values, ret_val )
  if( ret_val .ne. PAPI_OK ) then
    print *,'Error at papif_stop'
    stop
  endif

  value_d = transfer(values(9), 1.0D0)
  if( be_verbose .eq. 1 ) print '(A21,F4.1)',' Gamum::ev1 (0.4) = ', value_d
  if( abs(value_d - 0.4) .gt. rounding_error(value_d) ) then
    if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
    all_tests_passed = 0
  endif
  value_d = transfer(values(10), 1.0D0)
  if( be_verbose .eq. 1 ) print '(A21,F4.1)',' Gamum::ev3 (0.8) = ', value_d
  if( abs(value_d - 0.8) .gt. rounding_error(value_d) ) then
    if( be_verbose .eq.
1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

   value_d = transfer(values(11), 1.0D0)
   if( be_verbose .eq. 1 ) print '(A23,F5.3)',' Gamum::ev4 (1.888) = ', value_d
   if( abs(value_d - 1.888) .gt. rounding_error(value_d) ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

!-------------------------------------------------------------------------------
   if( be_verbose .eq. 1 ) print *,''

! 12
   call papif_event_name_to_code('sde:::Xandria::LATE' , eventcode, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_name_to_code'
      stop
   endif

   call papif_add_event( eventset, eventcode, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_add_event'
      stop
   endif

! We register this event after the placeholder was created
   call xandria_add_more()

   call papif_start( eventset, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_start'
      stop
   endif

   call xandria_do_work()

   call papif_stop( eventset, values, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_stop'
      stop
   endif

   value_d = transfer(values(9), 1.0D0)
   if( be_verbose .eq. 1 ) print '(A21,F4.1)', ' Gamum::ev1 (0.0) = ', value_d
   if( abs(value_d - 0.0) .gt. rounding_error(value_d) ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

   value_d = transfer(values(10), 1.0D0)
   if( be_verbose .eq. 1 ) print '(A21,F4.1)',' Gamum::ev3 (0.0) = ', value_d
   if( abs(value_d - 0.0) .gt. rounding_error(value_d) ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

   value_d = transfer(values(11), 1.0D0)
   if( be_verbose .eq. 1 ) print '(A23,F5.3)', ' Gamum::ev4 (1.888) = ', value_d
   if( abs(value_d - 1.888) .gt. rounding_error(value_d) ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

   if( be_verbose .eq. 1 ) print '(A22,I2)', ' Xandria::LATE (7) = ', values(12)
   if( values(12) .ne. 7 ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

   do i=1,9
      call recorder_do_work()
   end do

!-------------------------------------------------------------------------------
   if( be_verbose .eq. 1 ) print *,''

! 13
   call papif_event_name_to_code('sde:::Xandria::WRONG' , eventcode, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_name_to_code'
      stop
   endif

   call papif_add_event( eventset, eventcode, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_add_event'
      stop
   endif

   call papif_start( eventset, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_start'
      stop
   endif

   call xandria_do_work()
   call xandria_do_work()
   call xandria_do_work()

   call papif_stop( eventset, values, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_stop'
      stop
   endif

   if( be_verbose .eq. 1 ) print '(A23,I2)', ' Xandria::LATE (21) = ', values(12)
   if( values(12) .ne. 21 ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

   if( be_verbose .eq. 1 ) print '(A24,I2)', ' Xandria::WRONG (-1) = ', values(13)
   if( values(13) .ne. -1 ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

!-------------------------------------------------------------------------------
   if( be_verbose .eq. 1 ) print *,''

! 14
   call papif_event_name_to_code('sde:::Gamum::group0' , eventcode, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_name_to_code'
      stop
   endif

   call papif_add_event( eventset, eventcode, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_add_event'
      stop
   endif

! 15
   call papif_event_name_to_code('sde:::Gamum::papi_counter' , eventcode, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_name_to_code'
      stop
   endif

   call papif_add_event( eventset, eventcode, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_add_event'
      stop
   endif

   call papif_start( eventset, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_start'
      stop
   endif

   call gamum_do_work()
   call gamum_do_work()

   call papif_read(eventset, values, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_read'
      stop
   endif

   if( be_verbose .eq. 1 ) print '(A22,I2)', ' Xandria::LATE (0) = ', values(12)
   if( values(12) .ne. 0 ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

   if( be_verbose .eq. 1 ) print '(A24,I2)', ' Xandria::WRONG (-1) = ', values(13)
   if( values(13) .ne. -1 ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

   value_d = transfer(values(9), 1.0D0)
   if( be_verbose .eq. 1 ) print '(A21,F4.1)', ' Gamum::ev1 (0.2) = ', value_d
   if( abs(value_d - 0.2) .gt. rounding_error(value_d) ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

   value_d = transfer(values(11), 1.0D0)
   if( be_verbose .eq. 1 ) print '(A23,F5.3)', ' Gamum::ev4 (2.332) = ', value_d
   if( abs(value_d - 2.332) .gt. rounding_error(value_d) ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

   value_d = transfer(values(14), 1.0D0)
   if( be_verbose .eq. 1 ) print '(A36,F5.3)', ' Gamum::group0 [ev1+ev4] (2.532) = ', value_d
   if( abs(value_d - 2.532) .gt. rounding_error(value_d) ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

   if( be_verbose .eq. 1 ) print '(A29,I3)', ' Gamum::papi_counter (36) = ', values(15)
   if( abs(values(15) - 36) .gt. 0 ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

   do i=1,5
      call gamum_do_work()
   end do

   call xandria_do_work()

   do i=1,217
      call recorder_do_work()
   end do

   call papif_stop( eventset, values, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_stop'
      stop
   endif

   value_d = transfer(values(9), 1.0D0)
   if( be_verbose .eq. 1 ) print '(A21,F3.1)', ' Gamum::ev1 (0.7) = ', value_d
   if( abs(value_d - 0.7) .gt. rounding_error(value_d) ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

   value_d = transfer(values(11), 1.0D0)
   if( be_verbose .eq. 1 ) print '(A23,F5.3)', ' Gamum::ev4 (3.442) = ', value_d
   if( abs(value_d - 3.442) .gt. rounding_error(value_d) ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

   value_d = transfer(values(14), 1.0D0)
   if( be_verbose .eq. 1 ) print '(A36,F5.3)', ' Gamum::group0 [ev1+ev4] (4.142) = ', value_d
   if( abs(value_d - 4.142) .gt. rounding_error(value_d) ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

   if( be_verbose .eq. 1 ) print '(A29,I3)', ' Gamum::papi_counter (66) = ', values(15)
   if( abs(values(15) - 66) .gt. 0 ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

!-------------------------------------------------------------------------------
   if( be_verbose .eq. 1 ) print *,''

! 16
   call papif_add_named_event(eventset, 'sde:::Lib_With_Recorder::simple_recording:CNT', ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_add_named_event'
      stop
   endif
! 17
   call papif_add_named_event(eventset, 'sde:::Lib_With_Recorder::simple_recording:MIN', ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_add_named_event'
      stop
   endif
! 18
   call papif_add_named_event(eventset, 'sde:::Lib_With_Recorder::simple_recording:Q1', ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_add_named_event'
      stop
   endif
! 19
   call papif_add_named_event(eventset, 'sde:::Lib_With_Recorder::simple_recording:MED', ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_add_named_event'
      stop
   endif
! 20
   call papif_add_named_event(eventset, 'sde:::Lib_With_Recorder::simple_recording:Q3', ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_add_named_event'
      stop
   endif
! 21
   call papif_add_named_event(eventset, 'sde:::Lib_With_Recorder::simple_recording:MAX', ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_add_named_event'
      stop
   endif

   call papif_start( eventset, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_start'
      stop
   endif

   call papif_stop( eventset, values, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_stop'
      stop
   endif

   if( be_verbose .eq. 1 ) print '(A51,I4)', ' Lib_With_Recorder::simple_recording:CNT (1036) = ', values(16)
   if( abs(values(16) - 1036) .gt. 0 ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

   call c_f_pointer(transfer(values(17), quantile), quantile_f)
   if( be_verbose .eq. 1 ) print '(A54,I6)', ' Lib_With_Recorder::simple_recording:MIN ( >0) = ', quantile_f
   call c_f_pointer(transfer(values(18), quantile), quantile_f)
   if( be_verbose .eq. 1 ) print '(A54,I6)', ' Lib_With_Recorder::simple_recording:Q1 ( ~30864) = ', quantile_f
   call c_f_pointer(transfer(values(19), quantile), quantile_f)
   if( be_verbose .eq. 1 ) print '(A54,I6)', ' Lib_With_Recorder::simple_recording:MED ( ~61728) = ', quantile_f
   call c_f_pointer(transfer(values(20), quantile), quantile_f)
   if( be_verbose .eq. 1 ) print '(A54,I6)', ' Lib_With_Recorder::simple_recording:Q3 ( ~92592) = ', quantile_f
   call c_f_pointer(transfer(values(21), quantile), quantile_f)
   if( be_verbose .eq. 1 ) print '(A54,I6)', ' Lib_With_Recorder::simple_recording:MAX (<123456) = ', quantile_f

!-------------------------------------------------------------------------------
   if( be_verbose .eq. 1 ) print *,''

! 22
   call papif_event_name_to_code('sde:::Xandria::XND_CREATED' , eventcode, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_name_to_code'
      stop
   endif

   call papif_add_event( eventset, eventcode, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_add_event'
      stop
   endif

   call papif_start( eventset, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_start'
      stop
   endif

   call xandria_do_work()
   call xandria_do_work()
   call xandria_do_work()

   call papif_stop( eventset, values, ret_val )
   if( ret_val .ne. PAPI_OK ) then
      print *,'Error at papif_stop'
      stop
   endif

   if( be_verbose .eq. 1 ) print '(A30,I2)', ' Xandria::XND_CREATED (27) = ', values(22)
   if( abs(values(22) - 27) .gt. 0 ) then
      if( be_verbose .eq. 1 ) print *,'^^^^^^^^^^^^^^^^^^^'
      all_tests_passed = 0
   endif

!-------------------------------------------------------------------------------
!-------------------------------------------------------------------------------
!-------------------------------------------------------------------------------
   call papif_shutdown( )

   if( be_verbose .eq. 1 ) print *,''

   if( all_tests_passed .eq. 1 ) then
      call ftests_pass('')
   else
      call ftest_fail(__FILE__, __LINE__, 'SDE counters do not match expected values!', 1)
   endif

end program

function rounding_error(param) result(ret_val)
   real (KIND=KIND(0.0D0)) :: param, ret_val
   ret_val = param/100000.0
end function rounding_error

function f08_callback(param) result(ret_val)
   use, intrinsic :: ISO_C_BINDING
   implicit none
   real :: param
   integer(kind=C_LONG_LONG) :: ret_val
   ret_val = int(param, C_LONG_LONG)
end function f08_callback

---- papi-papi-7-2-0-t/src/components/sde/tests/Counting_Set/CountingSet_Lib++.cpp ----

#include <stdio.h>
#include <stdlib.h>
#include "sde_lib.h"
#include "papi.h"
#include "papi_test.h"
#include "cset_lib.hpp"

struct test_type_t{
    int id;
    float x;
    double y;
};

struct mem_type_t{
    void *ptr;
    int line_of_code;
    size_t size;
};

CSetLib::CSetLib(){
    papi_sde::PapiSde sde("CPP_CSET_LIB");
    test_set = sde.create_counting_set("test counting set");
    mem_set = sde.create_counting_set("malloc_tracking");
}

void CSetLib::do_simple_work(){
    int i;
    test_type_t
element;
    for(i=0; i<22390; i++){
        int j = i%5222;
        element.id = j;
        element.x = (float)j*1.037/((float)j+32.1);
        element.y = (double)(element.x)+145.67/((double)j+0.01);
        test_set->insert(sizeof(element), element, 0);
    }
    return;
}

void CSetLib::do_memory_allocations(){
    int i, iter;
    void *ptrs[128];
    for(iter=0; iter<8; iter++){
        mem_type_t alloc_elem;
        for(i=0; i<64; i++){
            size_t len = (17+i)*73;
            ptrs[i] = malloc(len);
            alloc_elem.ptr = ptrs[i];
            alloc_elem.line_of_code = __LINE__;
            alloc_elem.size = len;
            mem_set->insert(sizeof(void *), alloc_elem, 1);
        }
        for(i=iter; i<64; i++){
            mem_set->remove(sizeof(void *), ptrs[i], 1);
            free(ptrs[i]);
        }
        for(i=0; i<32; i++){
            size_t len = (19+i)*73;
            ptrs[i] = malloc(len);
            alloc_elem.ptr = ptrs[i];
            alloc_elem.line_of_code = __LINE__;
            alloc_elem.size = len;
            mem_set->insert(sizeof(void *), alloc_elem, 1);
        }
        for(i=0; i<32-iter; i++){
            mem_set->remove(sizeof(void *), ptrs[i], 1);
            free(ptrs[i]);
        }
    }
}

void CSetLib::dump_set( cset_list_object_t *list_head ){
    cset_list_object_t *list_runner;
    for(list_runner = list_head; NULL != list_runner; list_runner=list_runner->next){
        switch(list_runner->type_id){
            case 0: {
                auto *ptr = static_cast<test_type_t *>(list_runner->ptr);
                printf("count= %d typesize= %lu {id= %d, x= %f, y= %lf}\n",
                       list_runner->count, list_runner->type_size, ptr->id, ptr->x, ptr->y);
                break;
            }
            case 1: {
                auto *ptr = static_cast<mem_type_t *>(list_runner->ptr);
                printf("count= %d typesize= %lu { ptr= %p, line= %d, size= %lu }\n",
                       list_runner->count, list_runner->type_size, ptr->ptr, ptr->line_of_code, ptr->size);
                break;
            }
        }
    }
}

int CSetLib::count_set_elements( cset_list_object_t *list_head ){
    cset_list_object_t *list_runner;
    int element_counter = 0;
    for(list_runner = list_head; NULL != list_runner; list_runner=list_runner->next){
        ++element_counter;
    }
    return element_counter;
}

// Hook for papi_native_avail utility. No user code which links against this library should call
// this function because it has the same name in all SDE-enabled libraries. papi_native_avail
// uses dlopen and dlclose on each library so it only has one version of this symbol at a time.
extern "C" papi_handle_t papi_sde_hook_list_events( papi_sde_fptr_struct_t *fptr_struct){
    papi_handle_t tmp_handle;
    tmp_handle = fptr_struct->init("CPP_CSET_LIB");
    fptr_struct->create_counting_set( tmp_handle, "test counting set", NULL );
    fptr_struct->create_counting_set( tmp_handle, "malloc_tracking", NULL );
    return tmp_handle;
}

---- papi-papi-7-2-0-t/src/components/sde/tests/Counting_Set/CountingSet_Lib.c ----

#include <stdio.h>
#include <stdlib.h>
#include "sde_lib.h"
#include "papi.h"
#include "papi_test.h"

papi_handle_t handle;

typedef struct test_type_s{
    int id;
    float x;
    double y;
} test_type_t;

typedef struct mem_type_s{
    void *ptr;
    int line_of_code;
    size_t size;
} mem_type_t;

void libCSet_do_simple_work(void){
    int i;
    void *test_set;
    test_type_t element;

    handle = papi_sde_init("CSET_LIB");
    papi_sde_create_counting_set( handle, "test counting set", &test_set );

    for(i=0; i<22390; i++){
        int j = i%5222;
        long r = random();
        if ( r%23 == 0 ){
            j = r%2287;
        }
        element.id = j;
        element.x = (float)j*1.037/((float)j+32.1);
        element.y = (double)(element.x)+145.67/((double)j+0.01);
        papi_sde_counting_set_insert( test_set, sizeof(element), sizeof(element), &element, 0);
    }
    return;
}

int libCSet_finalize(void){
    return papi_sde_shutdown(handle);
}

void libCSet_do_memory_allocations(void){
    int i, iter;
    void *mem_set;
    void *ptrs[128];

    handle = papi_sde_init("CSET_LIB");
    papi_sde_create_counting_set( handle, "malloc_tracking", &mem_set );

    for(iter=0; iter<8; iter++){
        mem_type_t alloc_elem;
        for(i=0; i<64; i++){
            size_t len = (17+i)*73;
            ptrs[i] = malloc(len);
            alloc_elem.ptr = ptrs[i];
            alloc_elem.line_of_code = __LINE__;
            alloc_elem.size = len;
            papi_sde_counting_set_insert( mem_set, sizeof(alloc_elem), sizeof(void *), &alloc_elem, 1);
        }
        // "i" does _not_ start from zero so that some pointers are _not_ free()ed
        for(i=iter; i<64; i++){
            papi_sde_counting_set_remove( mem_set, sizeof(void *), &(ptrs[i]), 1);
            free(ptrs[i]);
        }
        for(i=0; i<32; i++){
            size_t len = (19+i)*73;
            ptrs[i] = malloc(len);
            alloc_elem.ptr = ptrs[i];
            alloc_elem.line_of_code = __LINE__;
            alloc_elem.size = len;
            papi_sde_counting_set_insert( mem_set, sizeof(alloc_elem), sizeof(void *), &alloc_elem, 1);
        }
        // "i" does _not_ go to 31 so that some pointers are _not_ free()ed
        for(i=0; i<32-iter; i++){
            papi_sde_counting_set_remove( mem_set, sizeof(void *), &(ptrs[i]), 1);
            free(ptrs[i]);
        }
    }
    return;
}

void libCSet_dump_set( cset_list_object_t *list_head ){
    cset_list_object_t *list_runner;
    for(list_runner = list_head; NULL != list_runner; list_runner=list_runner->next){
        switch(list_runner->type_id){
            case 0: {
                test_type_t *ptr = (test_type_t *)(list_runner->ptr);
                printf("count= %d typesize= %lu {id= %d, x= %f, y= %lf}\n",
                       list_runner->count, list_runner->type_size, ptr->id, ptr->x, ptr->y);
                break;
            }
            case 1: {
                mem_type_t *ptr = (mem_type_t *)(list_runner->ptr);
                printf("count= %d typesize= %lu { ptr= %p, line= %d, size= %lu }\n",
                       list_runner->count, list_runner->type_size, ptr->ptr, ptr->line_of_code, ptr->size);
                break;
            }
        }
    }
    return;
}

int libCSet_count_set_elements( cset_list_object_t *list_head ){
    cset_list_object_t *list_runner;
    int element_count = 0;
    for(list_runner = list_head; NULL != list_runner; list_runner=list_runner->next){
        ++element_count;
    }
    return element_count;
}

// Hook for papi_native_avail utility. No user code which links against this library should call
// this function because it has the same name in all SDE-enabled libraries. papi_native_avail
// uses dlopen and dlclose on each library so it only has one version of this symbol at a time.
papi_handle_t papi_sde_hook_list_events( papi_sde_fptr_struct_t *fptr_struct){
    papi_handle_t tmp_handle;
    tmp_handle = fptr_struct->init("CSET_LIB");
    fptr_struct->create_counting_set( tmp_handle, "test counting set", NULL );
    fptr_struct->create_counting_set( tmp_handle, "malloc_tracking", NULL );
    return tmp_handle;
}

---- papi-papi-7-2-0-t/src/components/sde/tests/Counting_Set/MemoryLeak_CountingSet_Driver++.cpp ----

#include <stdio.h>
#include <string.h>
#include "sde_lib.h"
#include "papi.h"
#include "papi_test.h"
#include "cset_lib.hpp"

void setup_PAPI(int *event_set);

int main(int argc, char **argv){
    int cnt, ret, event_set = PAPI_NULL;
    long long counter_values[1];
    CSetLib LibCSetCPP;

    (void)argc;
    (void)argv;

    setup_PAPI(&event_set);

    // --- Start PAPI
    if((ret=PAPI_start(event_set)) != PAPI_OK){
        test_fail( __FILE__, __LINE__, "PAPI_start", ret );
    }

    LibCSetCPP.do_memory_allocations();

    // --- Stop PAPI
    if((ret=PAPI_stop(event_set, counter_values)) != PAPI_OK){
        test_fail( __FILE__, __LINE__, "PAPI_stop", ret );
    }

    if( (argc > 1) && !strcmp(argv[1], "-verbose") ){
        LibCSetCPP.dump_set( (cset_list_object_t *)counter_values[0] );
    }

    cnt = LibCSetCPP.count_set_elements( (cset_list_object_t *)counter_values[0] );
    if( 56 == cnt )
        test_pass(__FILE__);
    else
        test_fail( __FILE__, __LINE__, "CountingSet contains wrong number of elements", ret );

    return 0;
}

void setup_PAPI(int *event_set){
    int ret;

    if((ret=PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT){
        test_fail( __FILE__, __LINE__, "PAPI_library_init", ret );
    }

    if((ret=PAPI_create_eventset(event_set)) != PAPI_OK){
        test_fail( __FILE__, __LINE__, "PAPI_create_eventset", ret );
    }

    if((ret=PAPI_add_named_event(*event_set, "sde:::CPP_CSET_LIB::malloc_tracking")) != PAPI_OK){
        test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret );
    }

    return;
}
---- papi-papi-7-2-0-t/src/components/sde/tests/Counting_Set/MemoryLeak_CountingSet_Driver.c ----

#include <stdio.h>
#include <string.h>
#include "sde_lib.h"
#include "papi.h"
#include "papi_test.h"

void libCSet_do_memory_allocations(void);
void libCSet_dump_set( cset_list_object_t *list_head);
int libCSet_count_set_elements( cset_list_object_t *list_head);
int libCSet_finalize(void);
void setup_PAPI(int *event_set);

int main(int argc, char **argv){
    int cnt, ret, event_set = PAPI_NULL;
    long long counter_values[1];

    (void)argc;
    (void)argv;

    setup_PAPI(&event_set);

    // --- Start PAPI
    if((ret=PAPI_start(event_set)) != PAPI_OK){
        test_fail( __FILE__, __LINE__, "PAPI_start", ret );
    }

    libCSet_do_memory_allocations();

    // --- Stop PAPI
    if((ret=PAPI_stop(event_set, counter_values)) != PAPI_OK){
        test_fail( __FILE__, __LINE__, "PAPI_stop", ret );
    }

    if( (argc > 1) && !strcmp(argv[1], "-verbose") ){
        libCSet_dump_set( (cset_list_object_t *)counter_values[0] );
    }

    cnt = libCSet_count_set_elements( (cset_list_object_t *)counter_values[0] );
    ret = libCSet_finalize();
    if( 56 == cnt && (SDE_OK==ret) )
        test_pass(__FILE__);
    else
        test_fail( __FILE__, __LINE__, "CountingSet contains wrong number of elements, or libsde finalization failed.", ret );

    return 0;
}

void setup_PAPI(int *event_set){
    int ret;

    if((ret=PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT){
        test_fail( __FILE__, __LINE__, "PAPI_library_init", ret );
    }

    if((ret=PAPI_create_eventset(event_set)) != PAPI_OK){
        test_fail( __FILE__, __LINE__, "PAPI_create_eventset", ret );
    }

    if((ret=PAPI_add_named_event(*event_set, "sde:::CSET_LIB::malloc_tracking")) != PAPI_OK){
        test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret );
    }

    return;
}

---- papi-papi-7-2-0-t/src/components/sde/tests/Counting_Set/Simple_CountingSet_Driver++.cpp ----

#include <stdio.h>
#include <string.h>
#include "sde_lib.h"
#include "papi.h"
#include "papi_test.h"
#include "cset_lib.hpp"

void setup_PAPI(int *event_set);

int main(int argc, char **argv){
    int cnt, ret, event_set = PAPI_NULL;
    long long counter_values[1];
    CSetLib LibCSetCPP;

    (void)argc;
    (void)argv;

    setup_PAPI(&event_set);

    // --- Start PAPI
    if((ret=PAPI_start(event_set)) != PAPI_OK){
        test_fail( __FILE__, __LINE__, "PAPI_start", ret );
    }

    LibCSetCPP.do_simple_work();

    // --- Stop PAPI
    if((ret=PAPI_stop(event_set, counter_values)) != PAPI_OK){
        test_fail( __FILE__, __LINE__, "PAPI_stop", ret );
    }

    if( (argc > 1) && !strcmp(argv[1], "-verbose") ){
        LibCSetCPP.dump_set( (cset_list_object_t *)counter_values[0] );
    }

    cnt = LibCSetCPP.count_set_elements( (cset_list_object_t *)counter_values[0] );
    if( 5222 == cnt )
        test_pass(__FILE__);
    else
        test_fail( __FILE__, __LINE__, "CountingSet contains wrong number of elements", ret );

    return 0;
}

void setup_PAPI(int *event_set){
    int ret;

    if((ret=PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT){
        test_fail( __FILE__, __LINE__, "PAPI_library_init", ret );
    }

    if((ret=PAPI_create_eventset(event_set)) != PAPI_OK){
        test_fail( __FILE__, __LINE__, "PAPI_create_eventset", ret );
    }

    if((ret=PAPI_add_named_event(*event_set, "sde:::CPP_CSET_LIB::test counting set")) != PAPI_OK){
        test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret );
    }

    return;
}

---- papi-papi-7-2-0-t/src/components/sde/tests/Counting_Set/Simple_CountingSet_Driver.c ----

#include <stdio.h>
#include <string.h>
#include "sde_lib.h"
#include "papi.h"
#include "papi_test.h"

void libCSet_do_simple_work(void);
void libCSet_dump_set( cset_list_object_t *list_head );
int libCSet_count_set_elements( cset_list_object_t *list_head );
int libCSet_finalize(void);
void setup_PAPI(int *event_set);

int main(int argc, char **argv){
    int cnt, ret, event_set = PAPI_NULL;
    long long counter_values[1];

    (void)argc;
    (void)argv;

    setup_PAPI(&event_set);

    // --- Start PAPI
    if((ret=PAPI_start(event_set)) != PAPI_OK){
        test_fail( __FILE__, __LINE__, "PAPI_start", ret );
    }

    libCSet_do_simple_work();

    // --- Stop PAPI
    if((ret=PAPI_stop(event_set, counter_values)) != PAPI_OK){
        test_fail( __FILE__, __LINE__, "PAPI_stop", ret );
    }

    if( (argc > 1) && !strcmp(argv[1], "-verbose") ){
        libCSet_dump_set( (cset_list_object_t *)counter_values[0] );
    }

    cnt = libCSet_count_set_elements( (cset_list_object_t *)counter_values[0] );
    if( 5222 == cnt )
        test_pass(__FILE__);
    else
        test_fail( __FILE__, __LINE__, "CountingSet contains wrong number of elements", ret );

    return 0;
}

void setup_PAPI(int *event_set){
    int ret;

    if((ret=PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT){
        test_fail( __FILE__, __LINE__, "PAPI_library_init", ret );
    }

    if((ret=PAPI_create_eventset(event_set)) != PAPI_OK){
        test_fail( __FILE__, __LINE__, "PAPI_create_eventset", ret );
    }

    if((ret=PAPI_add_named_event(*event_set, "sde:::CSET_LIB::test counting set")) != PAPI_OK){
        test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret );
    }

    return;
}

---- papi-papi-7-2-0-t/src/components/sde/tests/Counting_Set/cset_lib.hpp ----

#if !defined(CSET_LIB_H)
#define CSET_LIB_H

#include "sde_lib.h"
#include "sde_lib.hpp"

class CSetLib{
  private:
    papi_sde::PapiSde::CountingSet *test_set;
    papi_sde::PapiSde::CountingSet *mem_set;
  public:
    CSetLib();
    void do_simple_work();
    void do_memory_allocations();
    int count_set_elements( cset_list_object_t *list_head );
    void dump_set( cset_list_object_t *list_head );
};

#endif

---- papi-papi-7-2-0-t/src/components/sde/tests/Created_Counter/Created_Counter_Driver++.cpp ----

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include "papi.h"
#include "papi_test.h"

void cclib_init(void);
void cclib_do_work(void);
void cclib_do_more_work(void);
void setup_PAPI(int *event_set);

long
long int epsilon_v[10] = {14LL, 11LL, 8LL, 13LL, 8LL, 10LL, 12LL, 11LL, 6LL, 8LL}; int be_verbose = 0; int main(int argc, char **argv){ int i, ret, event_set = PAPI_NULL; int discrepancies = 0; long long counter_values[1] = {0}; if( (argc > 1) && !strcmp(argv[1], "-verbose") ) be_verbose = 1; cclib_init(); setup_PAPI(&event_set); // --- Start PAPI if((ret=PAPI_start(event_set)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_start", ret ); } for(i=0; i<10; i++){ cclib_do_work(); // --- Read the event counters _and_ reset them if((ret=PAPI_accum(event_set, counter_values)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_accum", ret ); } if( be_verbose ) printf("Epsilon count in cclib_do_work(): %lld\n",counter_values[0]); if( counter_values[0] != epsilon_v[i] ){ discrepancies++; } counter_values[0] = 0; } // --- Stop PAPI if((ret=PAPI_stop(event_set, counter_values)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_stop", ret ); } if( !discrepancies ) test_pass(__FILE__); else test_fail( __FILE__, __LINE__, "SDE counter values are wrong!", 0 ); // The following "return" is dead code, because both test_pass() and test_fail() call exit(), // however, we need it to prevent compiler warnings. 
return 0; } void setup_PAPI(int *event_set){ int ret; if((ret=PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT){ test_fail( __FILE__, __LINE__, "PAPI_library_init", ret ); } if((ret=PAPI_create_eventset(event_set)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_create_eventset", ret ); } if((ret=PAPI_add_named_event(*event_set, "sde:::CPP_Lib_With_CC::epsilon_count")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); } return; } papi-papi-7-2-0-t/src/components/sde/tests/Created_Counter/Created_Counter_Driver.c000066400000000000000000000042241502707512200303260ustar00rootroot00000000000000#include #include #include #include #include "papi.h" #include "papi_test.h" void cclib_init(void); void cclib_do_work(void); void cclib_do_more_work(void); void setup_PAPI(int *event_set); long long int epsilon_v[10] = {14LL, 11LL, 8LL, 13LL, 8LL, 10LL, 12LL, 11LL, 6LL, 8LL}; int be_verbose = 0; int main(int argc, char **argv){ int i, ret, event_set = PAPI_NULL; int discrepancies = 0; long long counter_values[1] = {0}; if( (argc > 1) && !strcmp(argv[1], "-verbose") ) be_verbose = 1; cclib_init(); setup_PAPI(&event_set); // --- Start PAPI if((ret=PAPI_start(event_set)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_start", ret ); } for(i=0; i<10; i++){ cclib_do_work(); // --- Read the event counters _and_ reset them if((ret=PAPI_accum(event_set, counter_values)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_accum", ret ); } if( be_verbose ) printf("Epsilon count in cclib_do_work(): %lld\n",counter_values[0]); if( counter_values[0] != epsilon_v[i] ){ discrepancies++; } counter_values[0] = 0; } // --- Stop PAPI if((ret=PAPI_stop(event_set, counter_values)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_stop", ret ); } if( !discrepancies ) test_pass(__FILE__); else test_fail( __FILE__, __LINE__, "SDE counter values are wrong!", 0 ); // The following "return" is dead code, because both test_pass() and test_fail() call exit(), // however, 
we need it to prevent compiler warnings. return 0; } void setup_PAPI(int *event_set){ int ret; if((ret=PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT){ test_fail( __FILE__, __LINE__, "PAPI_library_init", ret ); } if((ret=PAPI_create_eventset(event_set)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_create_eventset", ret ); } if((ret=PAPI_add_named_event(*event_set, "sde:::Lib_With_CC::epsilon_count")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); } return; } papi-papi-7-2-0-t/src/components/sde/tests/Created_Counter/Lib_With_Created_Counter++.cpp000066400000000000000000000047061502707512200312670ustar00rootroot00000000000000#include #include #include #include #include "sde_lib.h" #include "sde_lib.hpp" #define MY_EPSILON 0.0001 #define BRNG() {\ b = ((z1 << 6) ^ z1) >> 13;\ z1 = ((z1 & 4294967294U) << 18) ^ b;\ b = ((z2 << 2) ^ z2) >> 27;\ z2 = ((z2 & 4294967288U) << 2) ^ b;\ b = ((z3 << 13) ^ z3) >> 21;\ z3 = ((z3 & 4294967280U) << 7) ^ b;\ b = ((z4 << 3) ^ z4) >> 12;\ z4 = ((z4 & 4294967168U) << 13) ^ b;\ z1++;\ result = z1 ^ z2 ^ z3 ^ z4;\ } volatile int result; volatile unsigned int b, z1, z2, z3, z4; static const char *event_names[1] = { "epsilon_count" }; papi_sde::PapiSde::CreatedCounter *sde_cntr; // API functions. 
void cclib_init(void); void cclib_do_work(void); void cclib_do_more_work(void); void cclib_init(void){ papi_sde::PapiSde sde("CPP_Lib_With_CC"); sde_cntr = sde.create_counter(event_names[0], PAPI_SDE_DELTA); if( nullptr == sde_cntr ){ std::cerr << "Unable to create counter: "<< event_names[0] << std::endl; abort(); } z1=42; z2=420; z3=42000; z4=424242; return; } void cclib_do_work(void){ int i; for(i=0; i<100*1000; i++){ BRNG(); double r = (1.0*result)/(1.0*INT_MAX); if( r < MY_EPSILON && r > -MY_EPSILON ){ ++(*sde_cntr); } // Do some useful work here if( !(i%100) ) (void)usleep(1); } return; } void cclib_do_more_work(void){ int i; for(i=0; i<500*1000; i++){ BRNG(); double r = (1.0*result)/(1.0*INT_MAX); if( r < MY_EPSILON && r > -MY_EPSILON ){ (*sde_cntr)+=1; } // Do some useful work here if( !(i%20) ) (void)usleep(1); } return; } // Hook for papi_native_avail utility. No user code which links against this library should call // this function because it has the same name in all SDE-enabled libraries. papi_native_avail // uses dlopen and dlclose on each library so it only has one version of this symbol at a time.
papi_handle_t papi_sde_hook_list_events( papi_sde_fptr_struct_t *fptr_struct){ papi_handle_t sde_handle; void *cntr_handle; sde_handle = fptr_struct->init("CPP_Lib_With_CC"); fptr_struct->create_counter(sde_handle, event_names[0], PAPI_SDE_DELTA, &cntr_handle); fptr_struct->describe_counter(sde_handle, event_names[0], "Number of times the random value was less than 0.0001"); return sde_handle; } papi-papi-7-2-0-t/src/components/sde/tests/Created_Counter/Lib_With_Created_Counter.c000066400000000000000000000045231502707512200305760ustar00rootroot00000000000000#include #include #include #include #include "sde_lib.h" #define MY_EPSILON 0.0001 #define BRNG() {\ b = ((z1 << 6) ^ z1) >> 13;\ z1 = ((z1 & 4294967294U) << 18) ^ b;\ b = ((z2 << 2) ^ z2) >> 27;\ z2 = ((z2 & 4294967288U) << 2) ^ b;\ b = ((z3 << 13) ^ z3) >> 21;\ z3 = ((z3 & 4294967280U) << 7) ^ b;\ b = ((z4 << 3) ^ z4) >> 12;\ z4 = ((z4 & 4294967168U) << 13) ^ b;\ z1++;\ result = z1 ^ z2 ^ z3 ^ z4;\ } volatile int result; volatile unsigned int b, z1, z2, z3, z4; static const char *event_names[1] = { "epsilon_count" }; void *cntr_handle; // API functions. void cclib_init(void); void cclib_do_work(void); void cclib_do_more_work(void); void cclib_init(void){ papi_handle_t sde_handle; sde_handle = papi_sde_init("Lib_With_CC"); papi_sde_create_counter(sde_handle, event_names[0], PAPI_SDE_DELTA, &cntr_handle); z1=42; z2=420; z3=42000; z4=424242; return; } void cclib_do_work(void){ int i; for(i=0; i<100*1000; i++){ BRNG(); double r = (1.0*result)/(1.0*INT_MAX); if( r < MY_EPSILON && r > -MY_EPSILON ){ papi_sde_inc_counter(cntr_handle, 1); } // Do some useful work here if( !(i%100) ) (void)usleep(1); } return; } void cclib_do_more_work(void){ int i; for(i=0; i<500*1000; i++){ BRNG(); double r = (1.0*result)/(1.0*INT_MAX); if( r < MY_EPSILON && r > -MY_EPSILON ){ papi_sde_inc_counter(cntr_handle, 1); } // Do some useful work here if( !(i%20) ) (void)usleep(1); } return; } // Hook for papi_native_avail utility.
No user code which links against this library should call // this function because it has the same name in all SDE-enabled libraries. papi_native_avail // uses dlopen and dlclose on each library so it only has one version of this symbol at a time. papi_handle_t papi_sde_hook_list_events( papi_sde_fptr_struct_t *fptr_struct){ papi_handle_t sde_handle; sde_handle = fptr_struct->init("Lib_With_CC"); fptr_struct->create_counter(sde_handle, event_names[0], PAPI_SDE_DELTA, &cntr_handle); fptr_struct->describe_counter(sde_handle, event_names[0], "Number of times the random value was less than 0.0001"); return sde_handle; } papi-papi-7-2-0-t/src/components/sde/tests/Created_Counter/Overflow_Driver.c000066400000000000000000000077241502707512200270730ustar00rootroot00000000000000#include #include #include #include #include "papi.h" #include "papi_test.h" #define EV_THRESHOLD 10 void cclib_init(void); void cclib_do_work(void); void cclib_do_more_work(void); void setup_PAPI(int *event_set, int threshold); int remaining_handler_invocations = 22; int be_verbose = 0; int main(int argc, char **argv){ int i, ret, event_set = PAPI_NULL; long long counter_values[1] = {0}; if( (argc > 1) && !strcmp(argv[1], "-verbose") ) be_verbose = 1; cclib_init(); setup_PAPI(&event_set, EV_THRESHOLD); // --- Start PAPI if((ret=PAPI_start(event_set)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_start", ret ); } for(i=0; i<4; i++){ cclib_do_work(); // --- Read the event counters _and_ reset them if((ret=PAPI_accum(event_set, counter_values)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_accum", ret ); } if( be_verbose ) printf("Epsilon count in cclib_do_work(): %lld\n",counter_values[0]); counter_values[0] = 0; cclib_do_more_work(); // --- Read the event counters _and_ reset them if((ret=PAPI_accum(event_set, counter_values)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_accum", ret ); } if( be_verbose ) printf("Epsilon count in cclib_do_more_work(): %lld\n",counter_values[0]); 
counter_values[0] = 0; } // --- Stop PAPI if((ret=PAPI_stop(event_set, counter_values)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_stop", ret ); } if( remaining_handler_invocations <= 1 ) // Let's allow for up to one missed signal, or race condition. test_pass(__FILE__); else test_fail( __FILE__, __LINE__, "SDE overflow handler was not invoked as expected!", 0 ); // The following "return" is dead code, because both test_pass() and test_fail() call exit(), // however, we need it to prevent compiler warnings. return 0; } void overflow_handler(int event_set, void *address, long long overflow_vector, void *context){ char event_name[PAPI_MAX_STR_LEN]; int ret, *event_codes, event_index, number=1; (void)address; (void)context; if( (ret = PAPI_get_overflow_event_index(event_set, overflow_vector, &event_index, &number)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_get_overflow_event_index", ret ); } number = event_index+1; event_codes = (int *)calloc(number, sizeof(int)); if( (ret = PAPI_list_events( event_set, event_codes, &number)) != PAPI_OK ){ test_fail( __FILE__, __LINE__, "PAPI_list_events", ret ); } if( (ret=PAPI_event_code_to_name(event_codes[event_index], event_name)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", ret ); } free(event_codes); if( be_verbose ){ printf("Event \"%s\" at index: %d exceeded its threshold again.\n",event_name, event_index); fflush(stdout); } if( !strcmp(event_name, "sde:::Lib_With_CC::epsilon_count") || !event_index ) remaining_handler_invocations--; return; } void setup_PAPI(int *event_set, int threshold){ int ret, event_code; if((ret=PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT){ test_fail( __FILE__, __LINE__, "PAPI_library_init", ret ); } if((ret=PAPI_create_eventset(event_set)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_create_eventset", ret ); } if((ret=PAPI_event_name_to_code("sde:::Lib_With_CC::epsilon_count", &event_code)) != PAPI_OK){ test_fail( __FILE__, __LINE__, 
"PAPI_event_name_to_code", ret ); } if((ret=PAPI_add_event(*event_set, event_code)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_event", ret ); } if((ret = PAPI_overflow(*event_set, event_code, threshold, PAPI_OVERFLOW_HARDWARE, overflow_handler)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_overflow", ret ); } return; } papi-papi-7-2-0-t/src/components/sde/tests/Makefile000066400000000000000000000141431502707512200221740ustar00rootroot00000000000000NAME=sde include ../../Makefile_comp_tests.target INCLUDE += -I$(datadir)/sde_lib -I.. intel_compilers := ifort ifx cray_compilers := ftn crayftn ifeq ($(notdir $(F77)),gfortran) FFLAGS +=-ffree-form -ffree-line-length-none else ifeq ($(notdir $(F77)),flang) FFLAGS +=-ffree-form else ifeq ($(findstring $(notdir $(F77)), $(intel_compilers)),) FFLAGS +=-free else ifeq ($(findstring $(notdir $(F77)), $(cray_compilers)),) FFLAGS +=-ffree endif FFLAGS +=-g CFLAGS +=-g CXXFLAGS +=-g -std=c++11 ifeq ($(BUILD_LIBSDE_SHARED),yes) LDFLAGS += -Llib -L$(datadir) -L$(datadir)/libpfm4/lib -lpapi -lpfm -lsde LIBSDE=yes else ifeq ($(BUILD_LIBSDE_STATIC),yes) LDFLAGS += -Llib $(datadir)/libsde.a -lpapi LIBSDE=yes endif SDE_F08_API=../sde_F.F90 ifeq ($(LIBSDE),yes) TESTS = Minimal_Test Minimal_Test++ Simple_Test Simple2_Test Simple2_NoPAPI_Test Simple2_Test++ Recorder_Test Recorder_Test++ Created_Counter_Test Created_Counter_Test++ Overflow_Test Counting_Set_Simple_Test Counting_Set_MemLeak_Test Counting_Set_Simple_Test++ Counting_Set_MemLeak_Test++ endif ifeq ($(BUILD_LIBSDE_STATIC),yes) TESTS += Overflow_Static_Test endif ifeq ($(ENABLE_FORTRAN_TESTS),yes) TESTS += sde_test_f08 endif sde_tests: $(TESTS) ################################################################################ ## Minimal test prfx=Minimal Minimal_Test: $(prfx)/Minimal_Test.c $(CC) $< -o $@ $(INCLUDE) $(CFLAGS) $(UTILOBJS) $(LDFLAGS) Minimal_Test++: $(prfx)/Minimal_Test++.cpp $(CXX) $< -o $@ $(INCLUDE) $(CXXFLAGS) $(UTILOBJS) $(LDFLAGS) 
################################################################################ ## Simple test prfx=Simple libSimple.so: $(prfx)/Simple_Lib.c $(CC) -shared -Wall -fPIC $(CFLAGS) $(INCLUDE) -o lib/$@ $^ Simple_Test: $(prfx)/Simple_Driver.c libSimple.so $(CC) $< -o $@ $(INCLUDE) $(CFLAGS) $(UTILOBJS) -lSimple $(LDFLAGS) -lm ################################################################################ ## Simple2 test prfx=Simple2 libSimple2.so: $(prfx)/Simple2_Lib.c $(CC) -shared -Wall -fPIC $(CFLAGS) $(INCLUDE) -o lib/$@ $^ Simple2_Test: $(prfx)/Simple2_Driver.c libSimple2.so $(CC) $< -o $@ $(INCLUDE) $(CFLAGS) $(UTILOBJS) -lSimple2 $(LDFLAGS) -lm Simple2_NoPAPI_Test: $(prfx)/Simple2_NoPAPI_Driver.c libSimple2.so $(CC) $< -o $@ $(INCLUDE) $(CFLAGS) -Llib -lSimple2 -L$(datadir) -lsde -lm -ldl libSimple2++.so: $(prfx)/Simple2_Lib++.cpp $(CXX) -shared -Wall -fPIC $(CXXFLAGS) $(INCLUDE) -o lib/$@ $^ Simple2_Test++: $(prfx)/Simple2_Driver++.cpp libSimple2++.so $(CXX) $< -o $@ $(INCLUDE) $(CXXFLAGS) $(UTILOBJS) -lSimple2++ $(LDFLAGS) -lm ################################################################################ ## Recorder test prfx=Recorder libRecorder.so: $(prfx)/Lib_With_Recorder.c $(CC) -shared -Wall -fPIC $(CFLAGS) $(INCLUDE) -o lib/$@ $^ Recorder_Test: $(prfx)/Recorder_Driver.c libRecorder.so $(CC) $< -o $@ $(INCLUDE) $(CFLAGS) $(UTILOBJS) -lRecorder $(LDFLAGS) -lm libRecorder++.so: $(prfx)/Lib_With_Recorder++.cpp $(CXX) -shared -Wall -fPIC $(CXXFLAGS) $(INCLUDE) -o lib/$@ $^ Recorder_Test++: $(prfx)/Recorder_Driver++.cpp libRecorder++.so $(CXX) $< -o $@ $(INCLUDE) $(CXXFLAGS) $(UTILOBJS) -lRecorder++ $(LDFLAGS) -lm ################################################################################ ## Created Counter test prfx=Created_Counter libCreated_Counter.so: $(prfx)/Lib_With_Created_Counter.c $(CC) -shared -Wall -fPIC $(CFLAGS) $(INCLUDE) -o lib/$@ $^ libCreated_Counter_static.a: $(prfx)/Lib_With_Created_Counter.c $(CC) -Bstatic -static -Wall $(CFLAGS) 
$(INCLUDE) -c -o lib/Lib_With_Created_Counter.o $^ ar rs lib/$@ lib/Lib_With_Created_Counter.o rm lib/Lib_With_Created_Counter.o Created_Counter_Test: $(prfx)/Created_Counter_Driver.c libCreated_Counter.so $(CC) $< -o $@ $(INCLUDE) $(CFLAGS) $(UTILOBJS) -lCreated_Counter $(LDFLAGS) -lm Overflow_Test: $(prfx)/Overflow_Driver.c libCreated_Counter.so $(CC) $< -o $@ $(INCLUDE) $(CFLAGS) $(UTILOBJS) -lCreated_Counter $(LDFLAGS) -lm Overflow_Static_Test: $(prfx)/Overflow_Driver.c libCreated_Counter_static.a $(CC) $< -o $@ $(INCLUDE) $(CFLAGS) $(UTILOBJS) lib/libCreated_Counter_static.a $(datadir)/libpapi.a $(datadir)/libsde.a -ldl -lrt libCreated_Counter++.so: $(prfx)/Lib_With_Created_Counter++.cpp $(CXX) -shared -Wall -fPIC $(CXXFLAGS) $(INCLUDE) -o lib/$@ $^ Created_Counter_Test++: $(prfx)/Created_Counter_Driver++.cpp libCreated_Counter++.so $(CXX) $< -o $@ $(INCLUDE) $(CXXFLAGS) $(UTILOBJS) -lCreated_Counter++ $(LDFLAGS) -lm ################################################################################ ## Counting Set test prfx=Counting_Set libCounting_Set++.so: $(prfx)/CountingSet_Lib++.cpp $(prfx)/cset_lib.hpp $(CXX) -shared -Wall -fPIC $(CXXFLAGS) $(INCLUDE) -o lib/$@ $< Counting_Set_MemLeak_Test++: $(prfx)/MemoryLeak_CountingSet_Driver++.cpp libCounting_Set++.so $(CXX) $< -o $@ $(INCLUDE) $(CXXFLAGS) $(UTILOBJS) -lCounting_Set++ $(LDFLAGS) Counting_Set_Simple_Test++: $(prfx)/Simple_CountingSet_Driver++.cpp libCounting_Set++.so $(CXX) $< -o $@ $(INCLUDE) $(CXXFLAGS) $(UTILOBJS) -lCounting_Set++ $(LDFLAGS) libCounting_Set.so: $(prfx)/CountingSet_Lib.c $(CC) -shared -Wall -fPIC $(CFLAGS) $(INCLUDE) -o lib/$@ $^ Counting_Set_Simple_Test: $(prfx)/Simple_CountingSet_Driver.c libCounting_Set.so $(CC) $< -o $@ $(INCLUDE) $(CFLAGS) $(UTILOBJS) -lCounting_Set $(LDFLAGS) Counting_Set_MemLeak_Test: $(prfx)/MemoryLeak_CountingSet_Driver.c libCounting_Set.so $(CC) $< -o $@ $(INCLUDE) $(CFLAGS) $(UTILOBJS) -lCounting_Set $(LDFLAGS) 
################################################################################ ## Advanced test prfx=Advanced_C+FORTRAN rcrd_prfx=Recorder libXandria.so: $(prfx)/Xandria.F90 $(F77) -shared -Wall -fPIC $(FFLAGS) $(INCLUDE) -o lib/$@ $(SDE_F08_API) $< libGamum.so: $(prfx)/Gamum.c $(CC) -shared -Wall -fPIC $(CFLAGS) $(INCLUDE) -o lib/$@ $^ sde_test_f08: $(prfx)/sde_test_f08.F90 $(UTILOBJS) $(PAPILIB) libXandria.so libGamum.so libRecorder.so $(F77) $< -o $@ $(INCLUDE) $(FFLAGS) $(UTILOBJS) -lXandria -lGamum -lRecorder $(LDFLAGS) ################################################################################ ## Cleaning clean: rm -f *.o *.mod lib/*.so lib/*.a $(TESTS) papi-papi-7-2-0-t/src/components/sde/tests/Minimal/000077500000000000000000000000001502707512200221175ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/sde/tests/Minimal/Minimal_Test++.cpp000066400000000000000000000036411502707512200253420ustar00rootroot00000000000000#include #include #include #include "papi.h" #include "papi_test.h" #include "sde_lib.h" #include "sde_lib.hpp" //////////////////////////////////////////////////////////////////////////////// //------- Library example that exports SDEs class MinTest{ public: MinTest(); void dowork(); private: long long local_var; }; MinTest::MinTest(){ local_var = 0; papi_sde::PapiSde sde("Min Example Code in C++"); sde.register_counter("Example Event", PAPI_SDE_RO|PAPI_SDE_DELTA, local_var); } void MinTest::dowork(){ local_var += 7; } //////////////////////////////////////////////////////////////////////////////// //------- Driver example that uses library and reads the SDEs int main(int argc, char **argv){ int ret, Eventset = PAPI_NULL; long long counter_values[1]; MinTest test_obj; (void)argc; (void)argv; // --- Setup PAPI if((ret=PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT){ test_fail( __FILE__, __LINE__, "PAPI_library_init", ret ); exit(-1); } if((ret=PAPI_create_eventset(&Eventset)) != PAPI_OK){ test_fail( __FILE__, __LINE__, 
"PAPI_create_eventset", ret ); exit(-1); } if((ret=PAPI_add_named_event(Eventset, "sde:::Min Example Code in C++::Example Event")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); exit(-1); } // --- Start PAPI if((ret=PAPI_start(Eventset)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_start", ret ); exit(-1); } test_obj.dowork(); // --- Stop PAPI if((ret=PAPI_stop(Eventset, counter_values)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_stop", ret ); exit(-1); } if( counter_values[0] == 7 ){ test_pass(__FILE__); }else{ test_fail( __FILE__, __LINE__, "SDE counter values are wrong!", ret ); } return 0; } papi-papi-7-2-0-t/src/components/sde/tests/Minimal/Minimal_Test.c000066400000000000000000000031311502707512200246460ustar00rootroot00000000000000#include #include #include #include "papi.h" #include "papi_test.h" #include "sde_lib.h" long long local_var; void mintest_init(void){ local_var =0; papi_handle_t handle = papi_sde_init("Min Example Code"); papi_sde_register_counter(handle, "Example Event", PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_long_long, &local_var); } void mintest_dowork(void){ local_var += 7; } int main(int argc, char **argv){ int ret, Eventset = PAPI_NULL; long long counter_values[1]; (void)argc; (void)argv; mintest_init(); // --- Setup PAPI if((ret=PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT){ test_fail( __FILE__, __LINE__, "PAPI_library_init", ret ); exit(-1); } if((ret=PAPI_create_eventset(&Eventset)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_create_eventset", ret ); exit(-1); } if((ret=PAPI_add_named_event(Eventset, "sde:::Min Example Code::Example Event")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); exit(-1); } // --- Start PAPI if((ret=PAPI_start(Eventset)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_start", ret ); exit(-1); } mintest_dowork(); // --- Stop PAPI if((ret=PAPI_stop(Eventset, counter_values)) != PAPI_OK){ test_fail( __FILE__, __LINE__, 
"PAPI_stop", ret ); exit(-1); } if( counter_values[0] == 7 ){ test_pass(__FILE__); }else{ test_fail( __FILE__, __LINE__, "SDE counter values are wrong!", ret ); } return 0; } papi-papi-7-2-0-t/src/components/sde/tests/README.txt000066400000000000000000000005371502707512200222340ustar00rootroot00000000000000To run the SDE tests the LD_LIBRARY_PATH variable needs to contain the paths to the following: libpapi.so libpfm.so and the libraries under $papi_root/src/components/sde/tests/lib This can be done by adding to your LD_LIBRARY_PATH environment variable the following (from within the tests directory): $PWD/lib:$PWD/../../..:$PWD/../../../libpfm4/lib papi-papi-7-2-0-t/src/components/sde/tests/Recorder/000077500000000000000000000000001502707512200222765ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/sde/tests/Recorder/Lib_With_Recorder++.cpp000066400000000000000000000035401502707512200264600ustar00rootroot00000000000000#include #include #include #include "sde_lib.h" #include "sde_lib.hpp" static const char *event_names[1] = { "simple_recording" }; #define BRNG() {\ b = ((z1 << 6) ^ z1) >> 13;\ z1 = ((z1 & 4294967294U) << 18) ^ b;\ b = ((z2 << 2) ^ z2) >> 27;\ z2 = ((z2 & 4294967288U) << 2) ^ b;\ b = ((z3 << 13) ^ z3) >> 21;\ z3 = ((z3 & 4294967280U) << 7) ^ b;\ b = ((z4 << 3) ^ z4) >> 12;\ z4 = ((z4 & 4294967168U) << 13) ^ b;\ z1++;\ result = z1 ^ z2 ^ z3 ^ z4;\ } volatile int result; volatile unsigned int b, z1, z2, z3, z4; void *rcrd_handle; papi_sde::PapiSde::Recorder *sde_rcrd; // API functions. 
void recorder_init_(void); void recorder_do_work_(void); void recorder_init_(void){ papi_sde::PapiSde sde("CPP_Lib_With_Recorder"); sde_rcrd = sde.create_recorder(event_names[0], sizeof(long long), papi_sde_compare_long_long); if( nullptr == sde_rcrd ){ std::cerr << "Unable to create recorder: "<< event_names[0] << std::endl; abort(); } z1=42; z2=420; z3=42000; z4=424242; return; } void recorder_do_work_(void){ long long r; BRNG(); if( result < 0 ) result *= -1; r = result%123456; sde_rcrd->record(r); return; } // Hook for papi_native_avail utility. No user code which links against this library should call // this function because it has the same name in all SDE-enabled libraries. papi_native_avail // uses dlopen and dlclose on each library so it only has one version of this symbol at a time. papi_handle_t papi_sde_hook_list_events( papi_sde_fptr_struct_t *fptr_struct){ papi_handle_t tmp_handle; tmp_handle = fptr_struct->init("CPP_Lib_With_Recorder"); fptr_struct->create_recorder(tmp_handle, event_names[0], sizeof(long long), papi_sde_compare_long_long, &rcrd_handle); return tmp_handle; } papi-papi-7-2-0-t/src/components/sde/tests/Recorder/Lib_With_Recorder.c000066400000000000000000000033411502707512200257710ustar00rootroot00000000000000#include #include #include #include "sde_lib.h" static const char *event_names[1] = { "simple_recording" }; #define BRNG() {\ b = ((z1 << 6) ^ z1) >> 13;\ z1 = ((z1 & 4294967294U) << 18) ^ b;\ b = ((z2 << 2) ^ z2) >> 27;\ z2 = ((z2 & 4294967288U) << 2) ^ b;\ b = ((z3 << 13) ^ z3) >> 21;\ z3 = ((z3 & 4294967280U) << 7) ^ b;\ b = ((z4 << 3) ^ z4) >> 12;\ z4 = ((z4 & 4294967168U) << 13) ^ b;\ z1++;\ result = z1 ^ z2 ^ z3 ^ z4;\ } volatile int result; volatile unsigned int b, z1, z2, z3, z4; void *rcrd_handle; // API functions. 
void recorder_init_(void); void recorder_do_work_(void); void recorder_init_(void){ papi_handle_t tmp_handle; tmp_handle = papi_sde_init("Lib_With_Recorder"); papi_sde_create_recorder(tmp_handle, event_names[0], sizeof(long long), papi_sde_compare_long_long, &rcrd_handle); z1=42; z2=420; z3=42000; z4=424242; return; } void recorder_do_work_(void){ long long r; BRNG(); if( result < 0 ) result *= -1; r = result%123456; papi_sde_record(rcrd_handle, sizeof(r), &r); return; } // Hook for papi_native_avail utility. No user code which links against this library should call // this function because it has the same name in all SDE-enabled libraries. papi_native_avail // uses dlopen and dlclose on each library so it only has one version of this symbol at a time. papi_handle_t papi_sde_hook_list_events( papi_sde_fptr_struct_t *fptr_struct){ papi_handle_t tmp_handle; tmp_handle = fptr_struct->init("Lib_With_Recorder"); fptr_struct->create_recorder(tmp_handle, event_names[0], sizeof(long long), papi_sde_compare_long_long, &rcrd_handle); return tmp_handle; } papi-papi-7-2-0-t/src/components/sde/tests/Recorder/Recorder_Driver++.cpp000066400000000000000000000053611502707512200262150ustar00rootroot00000000000000#include #include #include #include #include "papi.h" #include "papi_test.h" void recorder_init_(void); void recorder_do_work_(void); void setup_PAPI(int *event_set); long long int expectation[10] = {20674LL, 50122LL, 112964LL, 32904LL, 101565LL, 56993LL, 58388LL, 122543LL, 62312LL, 52433LL}; int main(int argc, char **argv){ int i, j, ret, event_set = PAPI_NULL; int discrepancies = 0; int be_verbose = 0; long long counter_values[2]; if( (argc > 1) && !strcmp(argv[1], "-verbose") ) be_verbose = 1; recorder_init_(); setup_PAPI(&event_set); // --- Start PAPI if((ret=PAPI_start(event_set)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_start", ret ); } for(i=0; i<10; i++){ recorder_do_work_(); // --- read the event counters if((ret=PAPI_read(event_set, counter_values)) != 
PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_read", ret ); } long long *ptr = (long long *)counter_values[1]; if( be_verbose ){ printf("The number of recordings is: %lld (ptr is: %p)\n",counter_values[0],(void *)counter_values[1]); for(j=0; j #include #include #include #include "papi.h" #include "papi_test.h" void recorder_init_(void); void recorder_do_work_(void); void setup_PAPI(int *event_set); long long int expectation[10] = {20674LL, 50122LL, 112964LL, 32904LL, 101565LL, 56993LL, 58388LL, 122543LL, 62312LL, 52433LL}; int main(int argc, char **argv){ int i, j, ret, event_set = PAPI_NULL; int discrepancies = 0; int be_verbose = 0; long long counter_values[2]; if( (argc > 1) && !strcmp(argv[1], "-verbose") ) be_verbose = 1; recorder_init_(); setup_PAPI(&event_set); // --- Start PAPI if((ret=PAPI_start(event_set)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_start", ret ); } for(i=0; i<10; i++){ recorder_do_work_(); // --- read the event counters if((ret=PAPI_read(event_set, counter_values)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_read", ret ); } long long *ptr = (long long *)counter_values[1]; if( be_verbose ){ printf("The number of recordings is: %lld (ptr is: %p)\n",counter_values[0],(void *)counter_values[1]); for(j=0; j #include #include #include #include "papi.h" #include "papi_test.h" long long int low_mark[10] = { 0LL, 2LL, 2LL, 7LL, 21LL, 29LL, 29LL, 29LL, 29LL, 34LL}; long long int high_mark[10] = { 1LL, 1LL, 2LL, 3LL, 4LL, 8LL, 9LL, 9LL, 9LL, 13LL}; long long int tot_iter[10] = { 2LL, 9LL, 13LL, 33LL, 83LL, 122LL, 126LL, 130LL, 135LL, 176LL}; double comp_val[10] = {0.653676, 3.160483, 4.400648, 10.286250, 25.162759, 36.454895, 37.965891, 39.680220, 41.709039, 53.453990}; void setup_PAPI(int *event_set); void simple_init(void); double simple_compute(double x); int main(int argc, char **argv){ int i,ret, event_set = PAPI_NULL; int discrepancies = 0; int be_verbose = 0; long long counter_values[4]; double *dbl_ptr; if( (argc > 1) && 
!strcmp(argv[1], "-verbose") ) be_verbose = 1; simple_init(); setup_PAPI(&event_set); // --- Start PAPI if((ret=PAPI_start(event_set)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_start", ret ); } for(i=0; i<10; i++){ double sum; sum = simple_compute(0.87*i); if( be_verbose ) printf("sum=%lf\n",sum); // --- read the event counters if((ret=PAPI_read(event_set, counter_values)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_read", ret ); } // PAPI has packed the bits of the double inside the long long. dbl_ptr = (double *)&counter_values[3]; if( be_verbose ) printf("Low Mark=%lld, High Mark=%lld, Total Iterations=%lld, Comp. Value=%lf\n", counter_values[0], counter_values[1], counter_values[2], *dbl_ptr); if( counter_values[0] != low_mark[i] || counter_values[1] != high_mark[i] || counter_values[2] != tot_iter[i] || (*dbl_ptr-comp_val[i]) > 0.00001 || (*dbl_ptr-comp_val[i]) < -0.00001 ){ discrepancies++; } } // --- Stop PAPI if((ret=PAPI_stop(event_set, counter_values)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_stop", ret ); } if( !discrepancies ) test_pass(__FILE__); else test_fail( __FILE__, __LINE__, "SDE counter values are wrong!", 0 ); // The following "return" is dead code, because both test_pass() and test_fail() call exit(), // however, we need it to prevent compiler warnings. 
return 0; } void setup_PAPI(int *event_set){ int ret; if((ret=PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT){ test_fail( __FILE__, __LINE__, "PAPI_library_init", ret ); } if((ret=PAPI_create_eventset(event_set)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_create_eventset", ret ); } if((ret=PAPI_add_named_event(*event_set, "sde:::Simple::LOW_WATERMARK_REACHED")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); } if((ret=PAPI_add_named_event(*event_set, "sde:::Simple::HIGH_WATERMARK_REACHED")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); } if((ret=PAPI_add_named_event(*event_set, "sde:::Simple::TOTAL_ITERATIONS")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); } if((ret=PAPI_add_named_event(*event_set, "sde:::Simple::COMPUTED_VALUE")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); } return; } papi-papi-7-2-0-t/src/components/sde/tests/Simple/Simple_Lib.c000066400000000000000000000047511502707512200241540ustar00rootroot00000000000000#include #include #include #include #include "sde_lib.h" // API functions void simple_init(void); double simple_compute(double x); // The following counters are hidden to programs linking with // this library, so they can not be accessed directly. 
static double comp_value; static long long int total_iter_cnt, low_wtrmrk, high_wtrmrk; static papi_handle_t handle; static const char *ev_names[4] = { "COMPUTED_VALUE", "TOTAL_ITERATIONS", "LOW_WATERMARK_REACHED", "HIGH_WATERMARK_REACHED" }; void simple_init(void){ // Initialize library specific variables comp_value = 0.0; total_iter_cnt = 0; low_wtrmrk = 0; high_wtrmrk = 0; // Initialize PAPI SDEs handle = papi_sde_init("Simple"); papi_sde_register_counter(handle, ev_names[0], PAPI_SDE_RO|PAPI_SDE_INSTANT, PAPI_SDE_double, &comp_value); papi_sde_register_counter(handle, ev_names[1], PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_long_long, &total_iter_cnt); papi_sde_register_counter(handle, ev_names[2], PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_long_long, &low_wtrmrk); papi_sde_register_counter(handle, ev_names[3], PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_long_long, &high_wtrmrk); return; } // Perform some nonsense computation to emulate a possible library behavior. // Notice that no SDE routines need to be called in the critical path of the library. double simple_compute(double x){ double sum = 0.0; int lcl_iter = 0; if( x > 1.0 ) x = 1.0/x; if( x < 0.000001 ) x += 0.3579; while( 1 ){ double y,x2,x3,x4; lcl_iter++; // Compute a function with range [0:1] so we can iterate // multiple times without diverging or creating FP exceptions. x2 = x*x; x3 = x2*x; x4 = x2*x2; y = 42.53*x4 -67.0*x3 +25.0*x2 +x/2.15; y = y*y; if( y < 0.01 ) y = 0.5-y; // Now set the next x to be the current y, so we can iterate again. x = y; // Add y to sum unconditionally sum += y; if( y < 0.1 ){ low_wtrmrk++; continue; } if( y > 0.9 ){ high_wtrmrk++; continue; } // Only add y to comp_value if y is between the low and high watermarks. 
comp_value += y; // If some condition is met, terminate the loop if( 0.61 < y && y < 0.69 ) break; } total_iter_cnt += lcl_iter; return sum; } papi-papi-7-2-0-t/src/components/sde/tests/Simple2/000077500000000000000000000000001502707512200220445ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/sde/tests/Simple2/Simple2_Driver++.cpp000066400000000000000000000074151502707512200255330ustar00rootroot00000000000000#include #include #include #include #include "papi.h" #include "papi_test.h" #include "simple2.hpp" long long int low_mark[10] = { 0LL, 2LL, 2LL, 7LL, 21LL, 29LL, 29LL, 29LL, 29LL, 34LL}; long long int high_mark[10] = { 1LL, 1LL, 2LL, 3LL, 4LL, 8LL, 9LL, 9LL, 9LL, 13LL}; long long int tot_iter[10] = { 2LL, 9LL, 13LL, 33LL, 83LL, 122LL, 126LL, 130LL, 135LL, 176LL}; double comp_val[10] = {0.653676, 3.160483, 4.400648, 10.286250, 25.162759, 36.454895, 37.965891, 39.680220, 41.709039, 53.453990}; void setup_PAPI(int *event_set); void simple_init(void); double simple_compute(double x); int main(int argc, char **argv){ int i,ret, event_set = PAPI_NULL; int discrepancies = 0; int be_verbose = 0; long long counter_values[5]; double *dbl_ptr; Simple simp_obj; if( (argc > 1) && !strcmp(argv[1], "-verbose") ) be_verbose = 1; setup_PAPI(&event_set); // --- Start PAPI if((ret=PAPI_start(event_set)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_start", ret ); } for(i=0; i<10; i++){ double sum; sum = simp_obj.simple_compute(0.87*i); if( be_verbose ) printf("sum=%lf\n",sum); // --- read the event counters if((ret=PAPI_read(event_set, counter_values)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_read", ret ); } // PAPI has packed the bits of the double inside the long long. dbl_ptr = (double *)&counter_values[4]; if( be_verbose ) printf("Low Watermark=%lld, High Watermark=%lld, Any Watermark=%lld, Total Iterations=%lld, Comp. 
Value=%lf\n", counter_values[0], counter_values[1], counter_values[2], counter_values[3], *dbl_ptr); if( counter_values[0] != low_mark[i] || counter_values[1] != high_mark[i] || counter_values[2] != (low_mark[i]+high_mark[i]) || counter_values[3] != tot_iter[i] || (*dbl_ptr-2.0*comp_val[i]) > 0.00001 || (*dbl_ptr-2.0*comp_val[i]) < -0.00001 ){ discrepancies++; } } // --- Stop PAPI if((ret=PAPI_stop(event_set, counter_values)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_stop", ret ); } if( !discrepancies ) test_pass(__FILE__); else test_fail( __FILE__, __LINE__, "SDE counter values are wrong!", 0 ); // The following "return" is dead code, because both test_pass() and test_fail() call exit(), // however, we need it to prevent compiler warnings. return 0; } void setup_PAPI(int *event_set){ int ret; if((ret=PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT){ test_fail( __FILE__, __LINE__, "PAPI_library_init", ret ); } if((ret=PAPI_create_eventset(event_set)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_create_eventset", ret ); } if((ret=PAPI_add_named_event(*event_set, "sde:::Simple2_CPP::LOW_WATERMARK_REACHED")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); } if((ret=PAPI_add_named_event(*event_set, "sde:::Simple2_CPP::HIGH_WATERMARK_REACHED")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); } if((ret=PAPI_add_named_event(*event_set, "sde:::Simple2_CPP::ANY_WATERMARK_REACHED")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); } if((ret=PAPI_add_named_event(*event_set, "sde:::Simple2_CPP::TOTAL_ITERATIONS")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); } if((ret=PAPI_add_named_event(*event_set, "sde:::Simple2_CPP::COMPUTED_VALUE")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); } return; } 
papi-papi-7-2-0-t/src/components/sde/tests/Simple2/Simple2_Driver.c000066400000000000000000000073301502707512200250410ustar00rootroot00000000000000#include #include #include #include #include "papi.h" #include "papi_test.h" long long int low_mark[10] = { 0LL, 2LL, 2LL, 7LL, 21LL, 29LL, 29LL, 29LL, 29LL, 34LL}; long long int high_mark[10] = { 1LL, 1LL, 2LL, 3LL, 4LL, 8LL, 9LL, 9LL, 9LL, 13LL}; long long int tot_iter[10] = { 2LL, 9LL, 13LL, 33LL, 83LL, 122LL, 126LL, 130LL, 135LL, 176LL}; double comp_val[10] = {0.653676, 3.160483, 4.400648, 10.286250, 25.162759, 36.454895, 37.965891, 39.680220, 41.709039, 53.453990}; void setup_PAPI(int *event_set); void simple_init(void); double simple_compute(double x); int main(int argc, char **argv){ int i,ret, event_set = PAPI_NULL; int discrepancies = 0; int be_verbose = 0; long long counter_values[5]; double *dbl_ptr; if( (argc > 1) && !strcmp(argv[1], "-verbose") ) be_verbose = 1; simple_init(); setup_PAPI(&event_set); // --- Start PAPI if((ret=PAPI_start(event_set)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_start", ret ); } for(i=0; i<10; i++){ double sum; sum = simple_compute(0.87*i); if( be_verbose ) printf("sum=%lf\n",sum); // --- read the event counters if((ret=PAPI_read(event_set, counter_values)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_read", ret ); } // PAPI has packed the bits of the double inside the long long. dbl_ptr = (double *)&counter_values[4]; if( be_verbose ) printf("Low Watermark=%lld, High Watermark=%lld, Any Watermark=%lld, Total Iterations=%lld, Comp. 
Value=%lf\n", counter_values[0], counter_values[1], counter_values[2], counter_values[3], *dbl_ptr); if( counter_values[0] != low_mark[i] || counter_values[1] != high_mark[i] || counter_values[2] != (low_mark[i]+high_mark[i]) || counter_values[3] != tot_iter[i] || (*dbl_ptr-2.0*comp_val[i]) > 0.00001 || (*dbl_ptr-2.0*comp_val[i]) < -0.00001 ){ discrepancies++; } } // --- Stop PAPI if((ret=PAPI_stop(event_set, counter_values)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_stop", ret ); } if( !discrepancies ) test_pass(__FILE__); else test_fail( __FILE__, __LINE__, "SDE counter values are wrong!", 0 ); // The following "return" is dead code, because both test_pass() and test_fail() call exit(), // however, we need it to prevent compiler warnings. return 0; } void setup_PAPI(int *event_set){ int ret; if((ret=PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT){ test_fail( __FILE__, __LINE__, "PAPI_library_init", ret ); } if((ret=PAPI_create_eventset(event_set)) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_create_eventset", ret ); } if((ret=PAPI_add_named_event(*event_set, "sde:::Simple2::LOW_WATERMARK_REACHED")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); } if((ret=PAPI_add_named_event(*event_set, "sde:::Simple2::HIGH_WATERMARK_REACHED")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); } if((ret=PAPI_add_named_event(*event_set, "sde:::Simple2::ANY_WATERMARK_REACHED")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); } if((ret=PAPI_add_named_event(*event_set, "sde:::Simple2::TOTAL_ITERATIONS")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); } if((ret=PAPI_add_named_event(*event_set, "sde:::Simple2::COMPUTED_VALUE")) != PAPI_OK){ test_fail( __FILE__, __LINE__, "PAPI_add_named_event", ret ); } return; } 
papi-papi-7-2-0-t/src/components/sde/tests/Simple2/Simple2_Lib++.cpp000066400000000000000000000104201502707512200247740ustar00rootroot00000000000000#include #include #include #include #include #include "simple2.hpp" Simple::Simple(){ papi_sde::PapiSde sde("Simple2_CPP"); // Initialize library specific variables comp_value = 0.0; total_iter_cnt = 0; low_wtrmrk = 0; high_wtrmrk = 0; // Initialize PAPI SDEs sde.register_fp_counter(ev_names[0], PAPI_SDE_RO|PAPI_SDE_DELTA, counter_accessor_function, comp_value); sde.register_counter(ev_names[1], PAPI_SDE_RO|PAPI_SDE_DELTA, total_iter_cnt); sde.register_counter(ev_names[2], PAPI_SDE_RO|PAPI_SDE_DELTA, low_wtrmrk); sde.register_counter(ev_names[3], PAPI_SDE_RO|PAPI_SDE_DELTA, high_wtrmrk); sde.add_counter_to_group(ev_names[2], "ANY_WATERMARK_REACHED", PAPI_SDE_SUM); sde.add_counter_to_group(ev_names[3], "ANY_WATERMARK_REACHED", PAPI_SDE_SUM); return; } // This function allows the library to perform operations in order to compute the value of an SDE at run-time long long Simple::counter_accessor_function( double *param ){ long long ll; double *dbl_ptr = param; // Scale the variable by a factor of two. Real libraries will do meaningful work here. double value = *dbl_ptr * 2.0; // Copy the bits of the result in a long long int. (void)memcpy(&ll, &value, sizeof(double)); return ll; } // Perform some nonsense computation to emulate a possible library behavior. // Notice that no SDE routines need to be called in the critical path of the library. double Simple::simple_compute(double x){ double sum = 0.0; int lcl_iter = 0; if( x > 1.0 ) x = 1.0/x; if( x < 0.000001 ) x += 0.3579; while( 1 ){ double y,x2,x3,x4; lcl_iter++; // Compute a function with range [0:1] so we can iterate // multiple times without diverging or creating FP exceptions. x2 = x*x; x3 = x2*x; x4 = x2*x2; y = 42.53*x4 -67.0*x3 +25.0*x2 +x/2.15; y = y*y; if( y < 0.01 ) y = 0.5-y; // Now set the next x to be the current y, so we can iterate again. 
x = y; // Add y to sum unconditionally sum += y; if( y < 0.1 ){ low_wtrmrk++; continue; } if( y > 0.9 ){ high_wtrmrk++; continue; } // Only add y to comp_value if y is between the low and high watermarks. comp_value += y; // If some condition is met, terminate the loop if( 0.61 < y && y < 0.69 ) break; } total_iter_cnt += lcl_iter; return sum; } // The following function will _NOT_ be called by other library functions or normal // applications. It is a hook for the utility 'papi_native_avail' to be able to // discover the SDEs that are exported by this library. extern "C" papi_handle_t papi_sde_hook_list_events( papi_sde_fptr_struct_t *fptr_struct){ papi_handle_t handle = fptr_struct->init("Simple2_CPP"); fptr_struct->register_fp_counter(handle, Simple::ev_names[0], PAPI_SDE_RO|PAPI_SDE_INSTANT, PAPI_SDE_double, NULL, NULL); fptr_struct->register_counter(handle, Simple::ev_names[1], PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_long_long, NULL); fptr_struct->register_counter(handle, Simple::ev_names[2], PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_long_long, NULL); fptr_struct->register_counter(handle, Simple::ev_names[3], PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_long_long, NULL); fptr_struct->add_counter_to_group(handle, Simple::ev_names[2], "ANY_WATERMARK_REACHED", PAPI_SDE_SUM); fptr_struct->add_counter_to_group(handle, Simple::ev_names[3], "ANY_WATERMARK_REACHED", PAPI_SDE_SUM); fptr_struct->describe_counter(handle, Simple::ev_names[0], "Sum of values that are within the watermarks."); fptr_struct->describe_counter(handle, Simple::ev_names[1], "Total iterations executed by the library."); fptr_struct->describe_counter(handle, Simple::ev_names[2], "Number of times a value was below the low watermark."); fptr_struct->describe_counter(handle, Simple::ev_names[3], "Number of times a value was above the high watermark."); fptr_struct->describe_counter(handle, "ANY_WATERMARK_REACHED", "Number of times a value was not between the two
watermarks."); return handle; } papi-papi-7-2-0-t/src/components/sde/tests/Simple2/Simple2_Lib.c000066400000000000000000000115101502707512200243070ustar00rootroot00000000000000#include #include #include #include #include #include "sde_lib.h" // API functions void simple_init(void); double simple_compute(double x); // The following counters are hidden to programs linking with // this library, so they can not be accessed directly. static double comp_value; static long long int total_iter_cnt, low_wtrmrk, high_wtrmrk; static papi_handle_t handle; static const char *ev_names[4] = { "COMPUTED_VALUE", "TOTAL_ITERATIONS", "LOW_WATERMARK_REACHED", "HIGH_WATERMARK_REACHED" }; long long int counter_accessor_function( void *param ); void simple_init(void){ // Initialize library specific variables comp_value = 0.0; total_iter_cnt = 0; low_wtrmrk = 0; high_wtrmrk = 0; // Initialize PAPI SDEs handle = papi_sde_init("Simple2"); papi_sde_register_counter_cb(handle, ev_names[0], PAPI_SDE_RO|PAPI_SDE_INSTANT, PAPI_SDE_double, counter_accessor_function, &comp_value); papi_sde_register_counter(handle, ev_names[1], PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_long_long, &total_iter_cnt); papi_sde_register_counter(handle, ev_names[2], PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_long_long, &low_wtrmrk); papi_sde_register_counter(handle, ev_names[3], PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_long_long, &high_wtrmrk); papi_sde_add_counter_to_group(handle, ev_names[2], "ANY_WATERMARK_REACHED", PAPI_SDE_SUM); papi_sde_add_counter_to_group(handle, ev_names[3], "ANY_WATERMARK_REACHED", PAPI_SDE_SUM); return; } // The following function will _NOT_ be called by other library functions or normal // applications. It is a hook for the utility 'papi_native_avail' to be able to // discover the SDEs that are exported by this library.
papi_handle_t papi_sde_hook_list_events( papi_sde_fptr_struct_t *fptr_struct){ handle = fptr_struct->init("Simple2"); fptr_struct->register_counter_cb(handle, ev_names[0], PAPI_SDE_RO|PAPI_SDE_INSTANT, PAPI_SDE_double, counter_accessor_function, &comp_value); fptr_struct->register_counter(handle, ev_names[1], PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_long_long, &total_iter_cnt); fptr_struct->register_counter(handle, ev_names[2], PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_long_long, &low_wtrmrk); fptr_struct->register_counter(handle, ev_names[3], PAPI_SDE_RO|PAPI_SDE_DELTA, PAPI_SDE_long_long, &high_wtrmrk); fptr_struct->add_counter_to_group(handle, ev_names[2], "ANY_WATERMARK_REACHED", PAPI_SDE_SUM); fptr_struct->add_counter_to_group(handle, ev_names[3], "ANY_WATERMARK_REACHED", PAPI_SDE_SUM); fptr_struct->describe_counter(handle, ev_names[0], "Sum of values that are within the watermarks."); fptr_struct->describe_counter(handle, ev_names[1], "Total iterations executed by the library."); fptr_struct->describe_counter(handle, ev_names[2], "Number of times a value was below the low watermark."); fptr_struct->describe_counter(handle, ev_names[3], "Number of times a value was above the high watermark."); fptr_struct->describe_counter(handle, "ANY_WATERMARK_REACHED", "Number of times a value was not between the two watermarks."); return handle; } // This function allows the library to perform operations in order to compute the value of an SDE at run-time long long counter_accessor_function( void *param ){ long long ll; double *dbl_ptr = (double *)param; // Scale the variable by a factor of two. Real libraries will do meaningful work here. double value = *dbl_ptr * 2.0; // Copy the bits of the result in a long long int. (void)memcpy(&ll, &value, sizeof(double)); return ll; } // Perform some nonsense computation to emulate a possible library behavior. // Notice that no SDE routines need to be called in the critical path of the library. 
double simple_compute(double x){ double sum = 0.0; int lcl_iter = 0; if( x > 1.0 ) x = 1.0/x; if( x < 0.000001 ) x += 0.3579; while( 1 ){ double y,x2,x3,x4; lcl_iter++; // Compute a function with range [0:1] so we can iterate // multiple times without diverging or creating FP exceptions. x2 = x*x; x3 = x2*x; x4 = x2*x2; y = 42.53*x4 -67.0*x3 +25.0*x2 +x/2.15; y = y*y; if( y < 0.01 ) y = 0.5-y; // Now set the next x to be the current y, so we can iterate again. x = y; // Add y to sum unconditionally sum += y; if( y < 0.1 ){ low_wtrmrk++; continue; } if( y > 0.9 ){ high_wtrmrk++; continue; } // Only add y to comp_value if y is between the low and high watermarks. comp_value += y; // If some condition is met, terminate the loop if( 0.61 < y && y < 0.69 ) break; } total_iter_cnt += lcl_iter; return sum; } papi-papi-7-2-0-t/src/components/sde/tests/Simple2/Simple2_NoPAPI_Driver.c000066400000000000000000000013401502707512200261420ustar00rootroot00000000000000#include #include #include #include void simple_init(void); double simple_compute(double x); int main(int argc, char **argv){ int i; int be_verbose = 0; if( (argc > 1) && !strcmp(argv[1], "-verbose") ) be_verbose = 1; simple_init(); for(i=0; i<10; i++){ double sum; sum = simple_compute(0.87*i); if( be_verbose) printf("sum=%lf\n",sum); } // This test exists just to check that a code that links against libsde // _without_ linking against libpapi will still compile and run. Therefore, // if we got to this point then the test has passed. 
fprintf( stdout, "%sPASSED%s\n","\033[1;32m","\033[0m"); return 0; } papi-papi-7-2-0-t/src/components/sde/tests/Simple2/simple2.hpp000066400000000000000000000007261502707512200241350ustar00rootroot00000000000000#ifndef SIMP2_LIB_H #define SIMP2_LIB_H #include "sde_lib.h" #include "sde_lib.hpp" class Simple{ public: Simple(); double simple_compute(double x); static constexpr const char *ev_names[4] = {"COMPUTED_VALUE", "TOTAL_ITERATIONS", "LOW_WATERMARK_REACHED", "HIGH_WATERMARK_REACHED" }; private: static long long int counter_accessor_function( double *param ); double comp_value; long long int total_iter_cnt, low_wtrmrk, high_wtrmrk; }; #endif papi-papi-7-2-0-t/src/components/sde/tests/lib/000077500000000000000000000000001502707512200212775ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/sde/tests/lib/.gitignore000066400000000000000000000000001502707512200232550ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/sensors_ppc/000077500000000000000000000000001502707512200211525ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/sensors_ppc/README.md000066400000000000000000000036561502707512200224430ustar00rootroot00000000000000# SENSORS\_PPC Component The SENSORS\_PPC component supports reading system metrics on recent IBM PowerPC architectures (Power9 and later) using the OCC memory exposed through the Linux kernel. * [Enabling the SENSORS\_PPC Component](#enabling-the-sensors_ppc-component) * [Known Limitations](#known-limitations) * [FAQ](#faq) *** ## Enabling the SENSORS\_PPC Component To enable reading of SENSORS\_PPC counters the user needs to link against a PAPI library that was configured with the SENSORS\_PPC component enabled. As an example the following command: `./configure --with-components="sensors_ppc"` is sufficient to enable the component. 
Typically, the utility `papi_components_avail` (available in `papi/src/utils/papi_components_avail`) will display the components available to the user, whether each is disabled, and, if disabled, why.

## Known Limitations

The actions described below generally require superuser privileges. Note that these actions may have security and performance consequences, so please make sure you know what you are doing. Use chmod to set site-appropriate access permissions (e.g. 440) and chown to set group ownership of /sys/firmware/opal/exports/occ\_inband\_sensors. Finally, have your user added to that group, granting you read access.

## FAQ

1. [Measuring System](#measuring-system)

## Measuring System

The opal/exports sysfs interface exposes sensors and counters as read-only registers. The sensors and counters apply to Power9 systems. These counters and settings are exposed through this PAPI component and can be accessed just like any normal PAPI counter. Running the "sensors\_ppc\_basic" test in the tests directory will report a very limited subset of the information on a system, for instance the voltage received by socket 0 and its extrema since the last reboot.

Note: /sys/firmware/opal/exports/occ\_inband\_sensors is read-only for root, so the PAPI library will need read permissions to access it.
papi-papi-7-2-0-t/src/components/sensors_ppc/Rules.sensors_ppc000066400000000000000000000003761502707512200245320ustar00rootroot00000000000000 COMPSRCS += components/sensors_ppc/linux-sensors-ppc.c COMPOBJS += linux-sensors-ppc.o linux-sensors-ppc.o: components/sensors_ppc/linux-sensors-ppc.c $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/sensors_ppc/linux-sensors-ppc.c -o linux-sensors-ppc.o papi-papi-7-2-0-t/src/components/sensors_ppc/linux-sensors-ppc.c000066400000000000000000001016711502707512200247350ustar00rootroot00000000000000/** * @file linux-sensors_ppc.c * @author Philip Vaccaro * @ingroup papi_components * @brief sensors_ppc component * * To work, the sensors_ppc kernel module must be loaded. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include "linux-sensors-ppc.h" // The following macro exit if a string function has an error. It should // never happen; but it is necessary to prevent compiler warnings. We print // something just in case there is programmer error in invoking the function. 
#define HANDLE_STRING_ERROR {fprintf(stderr,"%s:%i unexpected string function error.\n",__FILE__,__LINE__); exit(-1);} papi_vector_t _sensors_ppc_vector; /***************************************************************************/ /****** BEGIN FUNCTIONS USED INTERNALLY SPECIFIC TO THIS COMPONENT *******/ /***************************************************************************/ /* Null terminated version of strncpy */ static char * _local_strlcpy( char *dst, const char *src, size_t size ) { char *retval = strncpy( dst, src, size ); if ( size > 0 ) dst[size-1] = '\0'; return( retval ); } #define DESC_LINE_SIZE_ALLOWED 66 static void _space_padding(char *buf, size_t max) { size_t len = strlen(buf); /* 80 columns - 12 header - 2 footer*/ size_t nlines = 1+ len / DESC_LINE_SIZE_ALLOWED, c = len; /* space_padding */ for (; c < nlines * DESC_LINE_SIZE_ALLOWED && c < max-1; ++c) buf[c] = ' '; buf[c] = '\0'; } /** @brief Refresh_data locks in write and update ping and pong at * the same time for OCC occ_id. * The occ_names array contains constant memory and doesn't * need to be updated. * Ping and Pong are read outside of the critical path, and * only the swap needs to be protected. 
* */ static void refresh_data(int occ_id, int forced) { long long now = PAPI_get_real_nsec(); if (forced || now > last_refresh[occ_id] + OCC_REFRESH_TIME) { void *buf = double_ping[occ_id]; uint32_t ping_off = be32toh(occ_hdr[occ_id]->reading_ping_offset); uint32_t pong_off = be32toh(occ_hdr[occ_id]->reading_pong_offset); lseek (event_fd, occ_id * OCC_SENSOR_DATA_BLOCK_SIZE + ping_off, SEEK_SET); /* To limit risks of being desynchronized, we read one chunk */ /* In memory, ping and pong are 40kB, with a 4kB buffer * of nothingness in between */ int to_read = pong_off - ping_off + OCC_PING_DATA_BLOCK_SIZE; int rc, bytes; /* copy memory iteratively until the full chunk is saved */ for (rc = bytes = 0; bytes < to_read; bytes += rc) { rc = read(event_fd, buf + bytes, to_read - bytes); if (!rc || rc < 0) /* done */ break; } papi_sensors_ppc_lock(); double_ping[occ_id] = ping[occ_id]; ping[occ_id] = buf; pong[occ_id] = ping[occ_id] + (pong_off - ping_off); last_refresh[occ_id] = now; papi_sensors_ppc_unlock(); } } static double _pow(int x, int y) { if (0 == y) return 1.; if (0 == x) return 0.; if (0 > y) return 1. / _pow(x, -y); if (1 == y) return 1.
* x; if (0 == y%2) return _pow(x, y/2) * _pow(x, y/2); else return _pow(x, y/2) * _pow(x, y/2) * x; } #define TO_FP(f) ((f >> 8) * _pow(10, ((int8_t)(f & 0xFF)))) static long long read_sensors_ppc_record(int s, int gidx, int midx) { uint64_t value = 41; uint32_t offset = be32toh(occ_names[s][gidx].reading_offset); uint32_t scale = be32toh(occ_names[s][gidx].scale_factor); uint32_t freq = be32toh(occ_names[s][gidx].freq); occ_sensor_record_t *record = NULL; /* Let's see if the data segment needs a refresh */ refresh_data(s, 0); papi_sensors_ppc_lock(); occ_sensor_record_t *sping = (occ_sensor_record_t *)((uint64_t)ping[s] + offset); occ_sensor_record_t *spong = (occ_sensor_record_t *)((uint64_t)pong[s] + offset); if (*ping && *pong) { if (be64toh(sping->timestamp) > be64toh(spong->timestamp)) record = sping; else record = spong; } else if (*ping && !*pong) { record = sping; } else if (!*ping && *pong) { record = spong; } else if (!*ping && !*pong) { return value; } switch (midx) { case OCC_SENSORS_ACCUMULATOR_TAG: /* freq, per sensor, contains freq sampling for the last 500us of accumulation */ value = (uint64_t)(be64toh(record->accumulator) / TO_FP(freq)); break; default: /* That one might upset people * All the entries below sample (including it) are uint16_t packed */ value = (uint64_t)(be16toh((&record->sample)[midx]) * TO_FP(scale)); break; } papi_sensors_ppc_unlock(); return value; } static long long read_sensors_ppc_counter(int s, int gidx) { uint32_t offset = be32toh(occ_names[s][gidx].reading_offset); uint32_t scale = be32toh(occ_names[s][gidx].scale_factor); occ_sensor_counter_t *counter = NULL; refresh_data(s, 0); papi_sensors_ppc_lock(); occ_sensor_counter_t *sping = (occ_sensor_counter_t *)((uint64_t)ping[s] + offset); occ_sensor_counter_t *spong = (occ_sensor_counter_t *)((uint64_t)pong[s] + offset); if (*ping && *pong) { if (be64toh(sping->timestamp) > be64toh(spong->timestamp)) counter = sping; else counter = spong; } else if (*ping && !*pong) { 
counter = sping; } else if (!*ping && *pong) { counter = spong; } else if (!*ping && !*pong) { return 40; } uint64_t value = be64toh(counter->accumulator) * TO_FP(scale); papi_sensors_ppc_unlock(); return value; } static int _sensors_ppc_is_counter(int index) { int s = 0; /* get OCC s from index */ for (; index > occ_num_events[s+1] && s < MAX_OCCS; ++s); int ridx = index - occ_num_events[s]; int gidx = ridx / OCC_SENSORS_MASKS; return (OCC_SENSOR_READING_COUNTER == occ_names[s][gidx].structure_type); } static long long read_sensors_ppc_value( int index ) { int s = 0; /* get OCC s from index */ for (; index > occ_num_events[s+1] && s < MAX_OCCS; ++s); int ridx = index - occ_num_events[s]; int gidx = ridx / OCC_SENSORS_MASKS; int midx = ridx % OCC_SENSORS_MASKS; uint8_t structure_type = occ_names[s][gidx].structure_type; switch (structure_type) { case OCC_SENSOR_READING_FULL: return read_sensors_ppc_record(s, gidx, midx); case OCC_SENSOR_READING_COUNTER: if (OCC_SENSORS_ACCUMULATOR_TAG == midx) return read_sensors_ppc_counter(s, gidx); /* fall through */ /* counters only return the accumulator */ default: return 42; } } /************************* PAPI Functions **********************************/ /* * This is called whenever a thread is initialized */ static int _sensors_ppc_init_thread( hwd_context_t *ctx ) { (void) ctx; return PAPI_OK; } /* * Called when PAPI process is initialized (i.e. 
PAPI_library_init) */ static int _sensors_ppc_init_component( int cidx ) { int retval = PAPI_OK; int s = -1; int strErr; char events_dir[128]; char event_path[128]; char *strCpy; DIR *events; const PAPI_hw_info_t *hw_info; hw_info=&( _papi_hwi_system_info.hw_info ); if ( PAPI_VENDOR_IBM != hw_info->vendor ) { strCpy=strncpy(_sensors_ppc_vector.cmp_info.disabled_reason, "Not an IBM processor", PAPI_MAX_STR_LEN); _sensors_ppc_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN-1]=0; if (strCpy == NULL) HANDLE_STRING_ERROR; retval = PAPI_ENOSUPP; goto fn_fail; } int ret = snprintf(events_dir, sizeof(events_dir), "/sys/firmware/opal/exports/"); if (ret <= 0 || (int)(sizeof(events_dir)) <= ret) HANDLE_STRING_ERROR; if (NULL == (events = opendir(events_dir))) { strErr=snprintf(_sensors_ppc_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "%s:%i Could not open events_dir='%s'.", __FILE__, __LINE__, events_dir); _sensors_ppc_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN-1]=0; if (strErr > PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; retval = PAPI_ENOSUPP; goto fn_fail; } ret = snprintf(event_path, sizeof(event_path), "%s%s", events_dir, pkg_sys_name); if (ret <= 0 || (int)(sizeof(event_path)) <= ret) HANDLE_STRING_ERROR; if (-1 == access(event_path, F_OK)) { strErr=snprintf(_sensors_ppc_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "%s:%i Could not access event_path='%s'.", __FILE__, __LINE__, event_path); _sensors_ppc_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN-1]=0; if (strErr > PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; retval = PAPI_ENOSUPP; goto fn_fail; } event_fd = open(event_path, pkg_sys_flag); if (event_fd < 0) { strErr=snprintf(_sensors_ppc_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "%s:%i Could not open event_path='%s'.", __FILE__, __LINE__, event_path); _sensors_ppc_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN-1]=0; if (strErr > PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; retval = PAPI_ENOSUPP; goto fn_fail; } memset(occ_num_events, 0, 
(MAX_OCCS+1)*sizeof(int)); num_events = 0; for ( s = 0; s < MAX_OCCS; ++s ) { void *buf = NULL; if (NULL == (buf = malloc(OCC_SENSOR_DATA_BLOCK_SIZE))) { strErr=snprintf(_sensors_ppc_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "%s:%i Failed to alloc %i bytes for buf.", __FILE__, __LINE__, OCC_SENSOR_DATA_BLOCK_SIZE); _sensors_ppc_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN-1]=0; if (strErr > PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; retval = PAPI_ENOMEM; goto fn_fail; } occ_hdr[s] = (struct occ_sensor_data_header_s*)buf; lseek (event_fd, s * OCC_SENSOR_DATA_BLOCK_SIZE, SEEK_SET); int rc, bytes; /* copy memory iteratively until the full chunk is saved */ for (rc = bytes = 0; bytes < OCC_SENSOR_DATA_BLOCK_SIZE; bytes += rc) { rc = read(event_fd, buf + bytes, OCC_SENSOR_DATA_BLOCK_SIZE - bytes); if (!rc || rc < 0) /* done */ break; } if (OCC_SENSOR_DATA_BLOCK_SIZE != bytes) { /* We are running out of OCCs, let's stop there */ free(buf); num_occs = s; s = MAX_OCCS; continue; } occ_sensor_name_t *names = (occ_sensor_name_t*)((uint64_t)buf + be32toh(occ_hdr[s]->names_offset)); int n_sensors = be16toh(occ_hdr[s]->nr_sensors); /* Prepare the double buffering for the ping/pong buffers */ int ping_off = be32toh(occ_hdr[s]->reading_ping_offset); int pong_off = be32toh(occ_hdr[s]->reading_pong_offset); /* Ping and pong are both 40kB, and we have a 4kB separator. * In theory, the distance between the beginnings of ping and pong is (40+4) kB. * But they expose an offset for the pong buffer. 
* So I won't trust the 4kB distance between buffers, and compute the buffer size * based on both offsets and the size of pong */ int buff_size = pong_off - ping_off + OCC_PING_DATA_BLOCK_SIZE; ping[s] = (uint32_t*)malloc(buff_size); if (ping[s] == NULL) { strErr=snprintf(_sensors_ppc_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "%s:%i Failed to alloc %i bytes for ping[%i].", __FILE__, __LINE__, buff_size, s); _sensors_ppc_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN-1]=0; if (strErr > PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; retval = PAPI_ENOMEM; goto fn_fail; } double_ping[s] = (uint32_t*)malloc(buff_size); if (double_ping[s] == NULL) { strErr=snprintf(_sensors_ppc_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "%s:%i Failed to alloc %i bytes for double_ping[%i].", __FILE__, __LINE__, buff_size, s); _sensors_ppc_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN-1]=0; if (strErr > PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; retval = PAPI_ENOMEM; goto fn_fail; } double_pong[s] = double_ping[s]; refresh_data(s, 1); /* Not all events will exist, counter-based events only have an accumulator to report */ occ_num_events[s+1] = occ_num_events[s] + (n_sensors * OCC_SENSORS_MASKS); num_events += (n_sensors * OCC_SENSORS_MASKS); /* occ_names map to read-only information that changes only after reboot */ occ_names[s] = names; } /* Export the total number of events available */ _sensors_ppc_vector.cmp_info.num_native_events = num_events; _sensors_ppc_vector.cmp_info.num_cntrs = num_events; _sensors_ppc_vector.cmp_info.num_mpx_cntrs = num_events; /* 0 active events */ num_events = 0; /* Export the component id */ _sensors_ppc_vector.cmp_info.CmpIdx = cidx; fn_exit: _papi_hwd[cidx]->cmp_info.disabled = retval; return retval; fn_fail: goto fn_exit; } /* * Control of counters (Reading/Writing/Starting/Stopping/Setup) * functions */ static int _sensors_ppc_init_control_state( hwd_control_state_t *ctl ) { _sensors_ppc_control_state_t* control = (
_sensors_ppc_control_state_t* ) ctl; memset( control, 0, sizeof ( _sensors_ppc_control_state_t ) ); return PAPI_OK; } static int _sensors_ppc_start( hwd_context_t *ctx, hwd_control_state_t *ctl ) { SUBDBG("Enter _sensors_ppc_start\n"); _sensors_ppc_context_t* context = ( _sensors_ppc_context_t* ) ctx; _sensors_ppc_control_state_t* control = ( _sensors_ppc_control_state_t* ) ctl; memset( context->start_value, 0, sizeof(long long) * SENSORS_PPC_MAX_COUNTERS); int c, i; for( c = 0; c < num_events; c++ ) { i = control->which_counter[c]; if (_sensors_ppc_is_counter(i)) context->start_value[c] = read_sensors_ppc_value(i); } /* At the end, ctx->start if full of 0s, except for counter-type sensors */ return PAPI_OK; } static int _sensors_ppc_stop( hwd_context_t *ctx, hwd_control_state_t *ctl ) { (void) ctx; (void) ctl; /* not sure what the side effect of stop is supposed to be, do a read? */ return PAPI_OK; } /* Shutdown a thread */ static int _sensors_ppc_shutdown_thread( hwd_context_t *ctx ) { (void) ctx; return PAPI_OK; } static int _sensors_ppc_read( hwd_context_t *ctx, hwd_control_state_t *ctl, long long **events, int flags ) { SUBDBG("Enter _sensors_ppc_read\n"); (void) flags; _sensors_ppc_control_state_t* control = ( _sensors_ppc_control_state_t* ) ctl; _sensors_ppc_context_t* context = ( _sensors_ppc_context_t* ) ctx; long long start_val = 0; long long curr_val = 0; int c, i; /* c is the index in the dense array of selected counters */ /* using control->which_counters[c], fetch actual indices in i */ /* all subsequent methods use "global" indices i */ for ( c = 0; c < num_events; c++ ) { i = control->which_counter[c]; start_val = context->start_value[c]; curr_val = read_sensors_ppc_value(i); if (start_val) { /* Make sure an event is a counter. */ if (_sensors_ppc_is_counter(i)) { /* Wraparound. */ if(start_val > curr_val) { curr_val += (0x100000000 - start_val); } /* Normal subtraction. 
*/ else if (start_val < curr_val) { curr_val -= start_val; } } } control->count[c]=curr_val; } *events = ( ( _sensors_ppc_control_state_t* ) ctl )->count; return PAPI_OK; } /* * Clean up what was setup in sensors_ppc_init_component(). */ static int _sensors_ppc_shutdown_component( void ) { close(event_fd); int s; papi_sensors_ppc_lock(); for (s = 0; s < num_occs; ++s) { free(occ_hdr[s]); if (ping[s] != NULL) free(ping[s]); if (double_ping[s] != NULL) free(double_ping[s]); } papi_sensors_ppc_unlock(); return PAPI_OK; } /* This function sets various options in the component. The valid * codes being passed in are PAPI_SET_DEFDOM, PAPI_SET_DOMAIN, * PAPI_SETDEFGRN, PAPI_SET_GRANUL and PAPI_SET_INHERIT */ static int _sensors_ppc_ctl( hwd_context_t *ctx, int code, _papi_int_option_t *option ) { SUBDBG( "Enter: ctx: %p\n", ctx ); (void) ctx; (void) code; (void) option; return PAPI_OK; } static int _sensors_ppc_update_control_state( hwd_control_state_t *ctl, NativeInfo_t *native, int count, hwd_context_t *ctx ) { (void) ctx; int i, index; num_events = count; _sensors_ppc_control_state_t* control = ( _sensors_ppc_control_state_t* ) ctl; if (count == 0) return PAPI_OK; /* control contains a dense array of unsorted events */ for ( i = 0; i < count; i++ ) { index = native[i].ni_event; control->which_counter[i]=index; native[i].ni_position = i; } return PAPI_OK; } static int _sensors_ppc_set_domain( hwd_control_state_t *ctl, int domain ) { (void) ctl; if ( PAPI_DOM_ALL != domain ) return PAPI_EINVAL; return PAPI_OK; } static int _sensors_ppc_reset( hwd_context_t *ctx, hwd_control_state_t *ctl ) { (void) ctx; (void) ctl; return PAPI_OK; } /* * Iterator function. Given an Eventcode, returns the next valid Eventcode to consider * returning anything but PAPI_OK will stop lookups and ignore next events. 
*/ static int _sensors_ppc_ntv_enum_events( unsigned int *EventCode, int modifier ) { int index; switch (modifier) { case PAPI_ENUM_FIRST: *EventCode = 0; return PAPI_OK; case PAPI_ENUM_EVENTS: index = *EventCode & PAPI_NATIVE_AND_MASK; if (index < occ_num_events[num_occs] - 1) { if (_sensors_ppc_is_counter(index+1)) /* For counters, exposing only the accumulator, * skips ghost events from _sample to _job_sched_max */ *EventCode = *EventCode + OCC_SENSORS_MASKS; else *EventCode = *EventCode + 1; return PAPI_OK; } else { return PAPI_ENOEVNT; } default: return PAPI_EINVAL; } } /* * */ static int _sensors_ppc_ntv_code_to_name( unsigned int EventCode, char *name, int len ) { int index = EventCode & PAPI_NATIVE_AND_MASK; if ( index < 0 || index >= occ_num_events[num_occs] ) return PAPI_ENOEVNT; int s = 0; /* get OCC s from index */ for (; index > occ_num_events[s+1] && s < MAX_OCCS; ++s); int ridx = index - occ_num_events[s]; int gidx = ridx / OCC_SENSORS_MASKS; int midx = ridx % OCC_SENSORS_MASKS; /* EventCode maps to a counter */ /* Counters only expose their accumulator */ if (_sensors_ppc_is_counter(index) && midx != OCC_SENSORS_ACCUMULATOR_TAG) return PAPI_ENOEVNT; char buf[512]; int ret = snprintf(buf, 512, "%s:occ=%d%s", occ_names[s][gidx].name, s, sensors_ppc_fake_qualifiers[midx]); if (ret <= 0 || 512 <= ret) return PAPI_ENOSUPP; _local_strlcpy( name, buf, len); return PAPI_OK; } /* This is the optional function used by utils/papi_*_avail. * Not providing it will force the tools to forge a description using * ntv_code_to_desc, ntv_code_to_*.
*/ static int _sensors_ppc_ntv_code_to_info( unsigned int EventCode, PAPI_event_info_t *info ) { int index = EventCode; if ( index < 0 || index >= occ_num_events[num_occs]) return PAPI_ENOEVNT; int s = 0; /* get OCC s from index */ for (; index > occ_num_events[s+1] && s < MAX_OCCS; ++s); int ridx = index - occ_num_events[s]; int gidx = ridx / OCC_SENSORS_MASKS; int midx = ridx % OCC_SENSORS_MASKS; /* EventCode maps to a counter */ /* Counters only expose their accumulator */ if (_sensors_ppc_is_counter(index) && midx != OCC_SENSORS_ACCUMULATOR_TAG) return PAPI_ENOEVNT; char buf[512]; int ret = snprintf(buf, 512, "%s:occ=%d%s", occ_names[s][gidx].name, s, sensors_ppc_fake_qualifiers[midx]); if (ret <= 0 || 512 <= ret) return PAPI_ENOSUPP; _local_strlcpy( info->symbol, buf, sizeof( info->symbol )); _local_strlcpy( info->units, occ_names[s][gidx].units, sizeof( info->units ) ); /* If it ends with: * Qw: w-th Quad unit [0-5] * Cxx: xx-th core [0-23] * My: y-th memory channel [0-8] * CHvv: vv-th memory module [0-15] * or starts with: * GPUz: z-th GPU [0-2] * TEMPGPUz: z-th GPU [0-2] * */ uint16_t type = be16toh(occ_names[s][gidx].type); char *name = strdup(occ_names[s][gidx].name); uint32_t freq = be32toh(occ_names[s][gidx].freq); int tgt = -1; switch(type) { /* IPS, STOPDEEPACTCxx, STOPDEEPREQCxx, IPSCxx, NOTBZECxx, NOTFINCxx, * MRDMy, MWRMy, PROCPWRTHROT, PROCOTTHROT, MEMPWRTHROT, MEMOTTHROT, * GPUzHWTHROT, GPUzSWTHROT, GPUzSWOTTHROT, GPUzSWPWRTHROT */ case OCC_SENSOR_TYPE_PERFORMANCE: if (!strncmp(name, "GPU", 3)) { char z[] = {name[3], '\0'}; tgt = atoi(z); name[3] = 'z'; if (!strncmp(name, "GPUzHWTHROT", 11)) ret = snprintf(buf, 512, "Total time GPU %d has been throttled by hardware (thermal or power brake)", tgt); else if (!strncmp(name, "GPUzSWTHROT", 11)) ret = snprintf(buf, 512, "Total time GPU %d has been throttled by software for any reason", tgt); else if (!strncmp(name, "GPUzSWOTTHROT", 13)) ret = snprintf(buf, 512, "Total time GPU %d has been throttled by 
software due to thermal", tgt); else if (!strncmp(name, "GPUzSWPWRTHROT", 14)) ret = snprintf(buf, 512, "Total time GPU %d has been throttled by software due to power", tgt); else ret = snprintf(buf, 512, "[PERFORMANCE] Unexpected: GPU-%d %s", tgt, name); } else if (!strncmp(name, "IPSCxx", 4)) { tgt = atoi(name+4); ret = snprintf(buf, 512, "Instructions per second for core %d on this Processor", tgt); } else if (!strncmp(name, "IPS", 3)) ret = snprintf(buf, 512, "Vector sensor that takes the average of all the cores this Processor"); else if (!strncmp(name, "STOPDEEPACTCxx", 12)) { tgt = atoi(name+12); ret = snprintf(buf, 512, "Deepest actual stop state that was fully entered during sample time for core %d", tgt); } else if (!strncmp(name, "STOPDEEPREQCxx", 12)) { tgt = atoi(name+12); ret = snprintf(buf, 512, "Deepest stop state that has been requested during sample time for core %d", tgt); } else if (!strncmp(name, "MEMPWRTHROT", 11)) ret = snprintf(buf, 512, "Count of memory throttled due to power"); else if (!strncmp(name, "MEMOTTHROT", 10)) ret = snprintf(buf, 512, "Count of memory throttled due to memory Over temperature"); else if (!strncmp(name, "PROCOTTHROT", 11)) ret = snprintf(buf, 512, "Count of processor throttled for temperature"); else if (!strncmp(name, "PROCPWRTHROT", 12)) ret = snprintf(buf, 512, "Count of processor throttled due to power"); else if (!strncmp(name, "MWRM", 4)) { tgt = atoi(name+4); ret = snprintf(buf, 512, "Memory write requests per sec for MC %d", tgt); } else if (!strncmp(name, "MRDM", 4)) { tgt = atoi(name+4); ret = snprintf(buf, 512, "Memory read requests per sec for MC %d", tgt); } else ret = snprintf(buf, 512, "[PERFORMANCE] Unexpected: %s", name); break; /* PWRSYS, PWRGPU, PWRAPSSCHvv, PWRPROC, PWRVDD, PWRVDN, PWRMEM */ case OCC_SENSOR_TYPE_POWER: if (!strncmp(name, "PWRSYS", 6)) ret = snprintf(buf, 512, "Bulk power of the system/node"); else if (!strncmp(name, "PWRGPU", 6)) ret = snprintf(buf, 512, "Power consumption for 
GPUs per socket (OCC) read from APSS"); else if (!strncmp(name, "PWRPROC", 7)) ret = snprintf(buf, 512, "Power consumption for this Processor"); else if (!strncmp(name, "PWRVDD", 6)) ret = snprintf(buf, 512, "Power consumption for this Processor's Vdd (calculated from AVSBus readings)"); else if (!strncmp(name, "PWRVDN", 6)) ret = snprintf(buf, 512, "Power consumption for this Processor's Vdn (nest) (calculated from AVSBus readings)"); else if (!strncmp(name, "PWRMEM", 6)) ret = snprintf(buf, 512, "Power consumption for Memory for this Processor read from APSS"); else if (!strncmp(name, "PWRAPSSCH", 9)) { tgt = atoi(name+9); ret = snprintf(buf, 512, "Power Provided by APSS channel %d", tgt); } else ret = snprintf(buf, 512, "[POWER] Unexpected: %s", name); break; /* FREQA, FREQACxx */ case OCC_SENSOR_TYPE_FREQUENCY: if (!strncmp(name, "FREQACxx", 6)) { tgt = atoi(name+6); ret = snprintf(buf, 512, "Average/actual frequency for this processor, Core %d based on OCA data", tgt); } else if (!strncmp(name, "FREQA", 5)) ret = snprintf(buf, 512, "Average of all core frequencies for Processor"); else ret = snprintf(buf, 512, "[FREQUENCY] Unexpected: %s", name); break; case OCC_SENSOR_TYPE_TIME: ret = snprintf(buf, 512, "[TIME] Unexpected: %s", name); break; /* UTILCxx, UTIL, NUTILCxx, MEMSPSTATMy, MEMSPMy */ case OCC_SENSOR_TYPE_UTILIZATION: if (!strncmp(name, "MEMSPSTATM", 10)) { tgt = atoi(name+10); ret = snprintf(buf, 512, "Static Memory throttle level setting for MCA %d when not in a memory throttle condition", tgt); } else if (!strncmp(name, "MEMSPM", 6)) { tgt = atoi(name+6); ret = snprintf(buf, 512, "Current Memory throttle level setting for MCA %d", tgt); } else if (!strncmp(name, "NUTILC", 6)) { tgt = atoi(name+6); ret = snprintf(buf, 512, "Normalized average utilization, rolling average of this Processor's Core %d", tgt); } else if (!strncmp(name, "UTILC", 5)) { tgt = atoi(name+5); ret = snprintf(buf, 512, "Utilization of this Processor's Core %d (where 100%% means 
fully utilized): NOTE: per thread HW counters are combined as appropriate to give this core level utilization sensor", tgt); } else if (!strncmp(name, "UTIL", 4)) ret = snprintf(buf, 512, "Average of all Cores UTILC[yy] sensor"); else ret = snprintf(buf, 512, "[UTILIZATION] Unexpected: %s", name); break; /* TEMPNEST, TEMPPROCTHRMCxx, TEMPVDD, TEMPDIMMvv, TEMPGPUz, TEMPGPUzMEM*/ case OCC_SENSOR_TYPE_TEMPERATURE: if (!strncmp(name, "TEMPNEST", 8)) ret = snprintf(buf, 512, "Average temperature of nest DTS sensors"); else if (!strncmp(name, "TEMPVDD", 7)) ret = snprintf(buf, 512, "VRM Vdd temperature"); else if (!strncmp(name, "TEMPPROCTHRMCxx", 13)) { tgt = atoi(name+13); ret = snprintf(buf, 512, "The combined weighted core/quad temperature for processor core %d", tgt); } else if (!strncmp(name, "TEMPDIMMvv", 8)) { tgt = atoi(name+8); ret = snprintf(buf, 512, "DIMM temperature for DIMM %d", tgt); } else if (!strncmp(name, "TEMPGPUz", 7)) { char z[] = {name[7], '\0'}; tgt = atoi(z); name[7] = 'z'; if (!strncmp(name, "TEMPGPUzMEM", 11)) ret = snprintf(buf, 512, "GPU %d hottest HBM temperature (individual memory temperatures are not available)", tgt); else if (!strncmp(name, "TEMPGPUz", 8)) ret = snprintf(buf, 512, "GPU %d board temperature", tgt); else ret = snprintf(buf, 512, "[TEMPERATURE] Unexpected: GPU-%d %s", tgt, name); } else ret = snprintf(buf, 512, "[TEMPERATURE] Unexpected: %s", name); break; /* VOLTVDD, VOLTVDDSENSE, VOLTVDN, VOLTVDNSENSE, VOLTDROOPCNTCx, VOLTDROOPCNTQw */ case OCC_SENSOR_TYPE_VOLTAGE: if (!strncmp(name, "VOLTVDDS", 8)) ret = snprintf(buf, 512, "Vdd Voltage at the remote sense. (AVS reading adjusted for loadline)"); else if (!strncmp(name, "VOLTVDNS", 8)) ret = snprintf(buf, 512, "Vdn Voltage at the remote sense.
(AVS reading adjusted for loadline)"); else if (!strncmp(name, "VOLTVDD", 7)) ret = snprintf(buf, 512, "Processor Vdd Voltage (read from AVSBus)"); else if (!strncmp(name, "VOLTVDN", 7)) ret = snprintf(buf, 512, "Processor Vdn Voltage (read from AVSBus)"); else if (!strncmp(name, "VOLTDROOPCNTC", 13)) { tgt = atoi(name+13); ret = snprintf(buf, 512, "Small voltage droop count for core %d", tgt); } else if (!strncmp(name, "VOLTDROOPCNTQ", 13)) { tgt = atoi(name+13); ret = snprintf(buf, 512, "Small voltage droop count for quad %d", tgt); } else ret = snprintf(buf, 512, "[VOLTAGE] Unexpected: %s", name); break; /* CURVDD, CURVDN */ case OCC_SENSOR_TYPE_CURRENT: if (!strncmp(name, "CURVDN", 6)) ret = snprintf(buf, 512, "Processor Vdn Current (read from AVSBus)"); else if (!strncmp(name, "CURVDD", 6)) ret = snprintf(buf, 512, "Processor Vdd Current (read from AVSBus)"); else ret = snprintf(buf, 512, "[CURRENT] Unexpected: %s", name); break; case OCC_SENSOR_TYPE_GENERIC: default: ret = snprintf(buf, 512, "[GENERIC] Unexpected: %s", name); break; } if (ret <= 0 || 512 <= ret) return PAPI_ENOSUPP; _space_padding(buf, sizeof(buf)); ret = snprintf(buf+strlen(buf), 512, "%s", sensors_ppc_fake_qualif_desc[midx]); if (ret <= 0 || 512 <= ret) return PAPI_ENOSUPP; _space_padding(buf, sizeof(buf)); ret = snprintf(buf+strlen(buf), 512, "Sampling period: %lfs", 1./freq); if (ret <= 0 || 512 <= ret) return PAPI_ENOSUPP; _local_strlcpy( info->long_descr, buf, sizeof(info->long_descr)); info->data_type = PAPI_DATATYPE_INT64; return PAPI_OK; } papi_vector_t _sensors_ppc_vector = { .cmp_info = { /* (unspecified values are initialized to 0) */ .name = "sensors_ppc", .short_name = "sensors_ppc", .description = "Linux sensors_ppc energy measurements", .version = "5.3.0", .default_domain = PAPI_DOM_ALL, .default_granularity = PAPI_GRN_SYS, .available_granularities = PAPI_GRN_SYS, .hardware_intr_sig = PAPI_INT_SIGNAL, .available_domains = PAPI_DOM_ALL, }, /* sizes of framework-opaque
component-private structures */ .size = { .context = sizeof ( _sensors_ppc_context_t ), .control_state = sizeof ( _sensors_ppc_control_state_t ), .reg_value = sizeof ( _sensors_ppc_register_t ), .reg_alloc = sizeof ( _sensors_ppc_reg_alloc_t ), }, /* function pointers in this component */ .init_thread = _sensors_ppc_init_thread, .init_component = _sensors_ppc_init_component, .init_control_state = _sensors_ppc_init_control_state, .update_control_state = _sensors_ppc_update_control_state, .start = _sensors_ppc_start, .stop = _sensors_ppc_stop, .read = _sensors_ppc_read, .shutdown_thread = _sensors_ppc_shutdown_thread, .shutdown_component = _sensors_ppc_shutdown_component, .ctl = _sensors_ppc_ctl, .set_domain = _sensors_ppc_set_domain, .reset = _sensors_ppc_reset, .ntv_enum_events = _sensors_ppc_ntv_enum_events, .ntv_code_to_name = _sensors_ppc_ntv_code_to_name, .ntv_code_to_info = _sensors_ppc_ntv_code_to_info, }; papi-papi-7-2-0-t/src/components/sensors_ppc/linux-sensors-ppc.h000066400000000000000000000155071502707512200247440ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file linux-sensors-ppc.h * CVS: $Id$ * * @author PAPI team UTK/ICL * dgenet@icl.utk.edu * * @ingroup papi_components * * @brief OCC Inband Sensors component for PowerPC * This file contains the source code for a component that enables * PAPI to read counters and sensors on PowerPC (Power9) architecture. 
*/ #ifndef _sensors_ppc_H #define _sensors_ppc_H /* Headers required by PAPI */ #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" #define papi_sensors_ppc_lock() _papi_hwi_lock(COMPONENT_LOCK); #define papi_sensors_ppc_unlock() _papi_hwi_unlock(COMPONENT_LOCK); typedef struct _sensors_ppc_register { unsigned int selector; } _sensors_ppc_register_t; typedef struct _sensors_ppc_native_event_entry { char name[PAPI_MAX_STR_LEN]; char units[PAPI_MIN_STR_LEN]; char description[PAPI_MAX_STR_LEN]; int socket_id; int component_id; int event_id; int type; int return_type; _sensors_ppc_register_t resources; } _sensors_ppc_native_event_entry_t; typedef struct _sensors_ppc_reg_alloc { _sensors_ppc_register_t ra_bits; } _sensors_ppc_reg_alloc_t; static int num_events=0; typedef enum occ_sensor_type_e { OCC_SENSOR_TYPE_GENERIC = 0x0001, OCC_SENSOR_TYPE_CURRENT = 0x0002, OCC_SENSOR_TYPE_VOLTAGE = 0x0004, OCC_SENSOR_TYPE_TEMPERATURE = 0x0008, OCC_SENSOR_TYPE_UTILIZATION = 0x0010, OCC_SENSOR_TYPE_TIME = 0x0020, OCC_SENSOR_TYPE_FREQUENCY = 0x0040, OCC_SENSOR_TYPE_POWER = 0x0080, OCC_SENSOR_TYPE_PERFORMANCE = 0x0200, } occ_sensor_type_t; typedef enum occ_sensor_loc_e { OCC_SENSOR_LOC_SYSTEM = 0x0001, OCC_SENSOR_LOC_PROCESSOR = 0x0002, OCC_SENSOR_LOC_PARTITION = 0x0004, OCC_SENSOR_LOC_MEMORY = 0x0008, OCC_SENSOR_LOC_VRM = 0x0010, OCC_SENSOR_LOC_OCC = 0x0020, OCC_SENSOR_LOC_CORE = 0x0040, OCC_SENSOR_LOC_GPU = 0x0080, OCC_SENSOR_LOC_QUAD = 0x0100, } occ_sensor_loc_t; #define OCC_SENSOR_READING_FULL 0x01 #define OCC_SENSOR_READING_COUNTER 0x02 static char *pkg_sys_name = "occ_inband_sensors"; static mode_t pkg_sys_flag = O_RDONLY; /* 8 OCCs, starting at OCC_SENSOR_DATA_BLOCK_OFFSET * OCC0: 0x00580000 -> 0x005A57FF * OCC1: 0x005A5800 -> 0x005CAFFF * Each zone is 150kB (OCC_SENSOR_DATA_BLOCK_SIZE) * OCC7: 0x00686800 -> 0x006ABFFF*/ #define MAX_OCCS 8 #define OCC_SENSOR_DATA_BLOCK_OFFSET 0x00580000 #define OCC_SENSOR_DATA_BLOCK_SIZE 0x00025800 
#define OCC_PING_DATA_BLOCK_SIZE 0xA000 #define OCC_REFRESH_TIME 100000 /* In the 150kB, map the beginning to */ typedef struct occ_sensor_data_header_s { uint8_t valid; /* 0x01 means the block can be read */ uint8_t version; uint16_t nr_sensors; /* number of sensors! */ uint8_t reading_version; /* ping pong version */ uint8_t pad[3]; uint32_t names_offset; uint8_t names_version; uint8_t name_length; uint16_t reserved; uint32_t reading_ping_offset; uint32_t reading_pong_offset; } __attribute__((__packed__)) occ_sensor_data_header_t; /* That header is reset after each reboot */ struct occ_sensor_data_header_s *occ_hdr[MAX_OCCS]; static int event_fd; static long long last_refresh[MAX_OCCS]; static int num_occs; static int occ_num_events[MAX_OCCS+1]; static uint32_t *ping[MAX_OCCS], *pong[MAX_OCCS]; static uint32_t *double_ping[MAX_OCCS], *double_pong[MAX_OCCS]; #define MAX_CHARS_SENSOR_NAME 16 #define MAX_CHARS_SENSOR_UNIT 4 /* After 1kB, the list of sensor names, units */ /* map an array of size header->nr_sensors */ /* the following struct, */ typedef struct occ_sensor_name_s { char name[MAX_CHARS_SENSOR_NAME]; char units[MAX_CHARS_SENSOR_UNIT]; uint16_t gsid; uint32_t freq; uint32_t scale_factor; uint16_t type; uint16_t location; uint8_t structure_type; /* determine size+format of sensor */ uint32_t reading_offset; uint8_t sensor_data; uint8_t pad[8]; } __attribute__((__packed__)) occ_sensor_name_t; struct occ_sensor_name_s *occ_names[MAX_OCCS]; /* The following 4kB, size of a page, has to be skipped */ /* Following 40kB is the ping buffer */ /* Followed by another 4kB of skippable memory */ /* Finally, 40kB for the pong buffer */ typedef struct occ_sensor_record_s { uint16_t gsid; uint64_t timestamp; uint16_t sample; /* latest value */ uint16_t sample_min; /*min max since reboot */ uint16_t sample_max; uint16_t csm_min;/* since CSM reset */ uint16_t csm_max; uint16_t profiler_min; /* since prof reset */ uint16_t profiler_max; uint16_t job_scheduler_min; /* since 
job sched reset */ uint16_t job_scheduler_max; uint64_t accumulator; /* accu if it makes sense */ uint32_t update_tag; /* tics since between that value and previous one */ uint8_t pad[8]; } __attribute__((__packed__)) occ_sensor_record_t; typedef struct occ_sensor_counter_s { uint16_t gsid; uint64_t timestamp; uint64_t accumulator; uint8_t sample; uint8_t pad[5]; } __attribute__((__packed__)) occ_sensor_counter_t; typedef enum occ_sensors_mask_e { OCC_SENSORS_SAMPLE = 0, OCC_SENSORS_SAMPLE_MIN = 1, OCC_SENSORS_SAMPLE_MAX = 2, OCC_SENSORS_CSM_MIN = 3, OCC_SENSORS_CSM_MAX = 4, OCC_SENSORS_PROFILER_MIN = 5, OCC_SENSORS_PROFILER_MAX = 6, OCC_SENSORS_JOB_SCHED_MIN = 7, OCC_SENSORS_JOB_SCHED_MAX = 8, OCC_SENSORS_ACCUMULATOR_TAG = 9, OCC_SENSORS_MASKS } occ_sensors_mask_t; static const char* sensors_ppc_fake_qualifiers[] = {"", ":min", ":max", ":csm_min", ":csm_max", ":profiler_min", ":profiler_max", ":job_scheduler_min", ":job_scheduler_max", ":accumulator", NULL}; static const char *sensors_ppc_fake_qualif_desc[] = { "Last sample of this sensor", "Minimum value since last OCC reset (node reboot)", "Maximum value since last OCC reset (node reboot)", "Minimum value since last reset request by CSM", "Maximum value since last reset request by CSM", "Minimum value since last reset request by profiler", "Maximum value since last reset request by profiler", "Minimum value since last reset by job scheduler", "Maximum value since last reset by job scheduler", "Accumulator register for this sensor", NULL}; #define SENSORS_PPC_MAX_COUNTERS MAX_OCCS * 512 * OCC_SENSORS_MASKS typedef struct _sensors_ppc_control_state { long long count[SENSORS_PPC_MAX_COUNTERS]; long long which_counter[SENSORS_PPC_MAX_COUNTERS]; long long need_difference[SENSORS_PPC_MAX_COUNTERS]; uint32_t occ, scale; } _sensors_ppc_control_state_t; typedef struct _sensors_ppc_context { long long start_value[SENSORS_PPC_MAX_COUNTERS]; _sensors_ppc_control_state_t state; } _sensors_ppc_context_t; #endif /* 
_sensors_ppc_H */ papi-papi-7-2-0-t/src/components/sensors_ppc/tests/000077500000000000000000000000001502707512200223145ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/sensors_ppc/tests/Makefile000066400000000000000000000006551502707512200237600ustar00rootroot00000000000000NAME=sensors_ppc include ../../Makefile_comp_tests.target TESTS = sensors_ppc_basic sensors_ppc_tests: $(TESTS) sensors_ppc_basic.o: sensors_ppc_basic.c $(CC) $(CFLAGS) $(OPTFLAGS) $(INCLUDE) -c sensors_ppc_basic.c -o sensors_ppc_basic.o sensors_ppc_basic: sensors_ppc_basic.o $(UTILOBJS) $(PAPILIB) $(CC) $(INCLUDE) -o sensors_ppc_basic sensors_ppc_basic.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) clean: rm -f $(TESTS) *.o *~ papi-papi-7-2-0-t/src/components/sensors_ppc/tests/sensors_ppc_basic.c000066400000000000000000000061211502707512200261570ustar00rootroot00000000000000/** * @author PAPI team UTK/ICL * Test case for sensors_ppc component * @brief * Tests basic functionality of sensors_ppc component */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include "papi.h" #define MAX_sensors_ppc_EVENTS 64 int main ( int argc, char **argv ) { (void) argv; (void) argc; int retval,cid,sensors_ppc_cid=-1,numcmp; int EventSet = PAPI_NULL; long long *values; int num_events=0; int code; char event_names[MAX_sensors_ppc_EVENTS][PAPI_MAX_STR_LEN]; char units[MAX_sensors_ppc_EVENTS][PAPI_MIN_STR_LEN]; int r,i; const PAPI_component_info_t *cmpinfo = NULL; PAPI_event_info_t evinfo; /* PAPI Initialization */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) fprintf(stderr, "PAPI_library_init failed %d\n",retval ); numcmp = PAPI_num_components(); for( cid=0; cid<numcmp; cid++ ) { cmpinfo = PAPI_get_component_info( cid ); if ( !strcmp( cmpinfo->name,"sensors_ppc" ) ) { sensors_ppc_cid=cid; break; } } /* Component not found */ if ( cid==numcmp ) fprintf(stderr, "No sensors_ppc component found\n"); /* Skip if component has no counters */ if ( cmpinfo->num_cntrs==0 ) fprintf(stderr, "No counters in the sensors_ppc component\n"); /* Create EventSet */ retval =
PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) fprintf(stderr, "PAPI_create_eventset()\n"); /* Add all events */ code = PAPI_NATIVE_MASK; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_FIRST, sensors_ppc_cid ); while ( r == PAPI_OK ) { retval = PAPI_event_code_to_name(code, event_names[num_events]); if ( retval != PAPI_OK ) fprintf(stderr, "Error from PAPI_event_code_to_name, error = %d\n", retval); retval = PAPI_get_event_info(code,&evinfo); if ( retval != PAPI_OK ) fprintf(stderr, "Error getting event info, error = %d\n",retval); char *evt = "sensors_ppc:::VOLTVDD:occ=0"; if (!strncmp(event_names[num_events], evt, strlen(evt))) { retval = PAPI_add_event( EventSet, code ); strcpy(units[num_events], evinfo.units); num_events++; } r = PAPI_enum_cmp_event( &code, PAPI_ENUM_EVENTS, sensors_ppc_cid ); } values=calloc( num_events,sizeof( long long ) ); if ( values==NULL ) fprintf(stderr, "No memory"); PAPI_start(EventSet); PAPI_stop( EventSet, values ); for (i = 0; i < num_events; ++i) fprintf(stdout, "%s > %lld %s\n", event_names[i], values[i], units[i]); /* Done, clean up */ retval = PAPI_cleanup_eventset( EventSet ); if ( retval != PAPI_OK ) fprintf(stderr, "PAPI_cleanup_eventset(), error=%d\n",retval ); retval = PAPI_destroy_eventset( &EventSet ); if ( retval != PAPI_OK ) fprintf(stderr, "PAPI_destroy_eventset(), error=%d\n",retval ); return 0; } papi-papi-7-2-0-t/src/components/stealtime/000077500000000000000000000000001502707512200206035ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/stealtime/README.md000066400000000000000000000012721502707512200220640ustar00rootroot00000000000000# STEALTIME Component The STEALTIME component enables PAPI to access stealtime filesystem statistics. * [Enabling the STEALTIME Component](#enabling-the-stealtime-component) *** ## Enabling the STEALTIME Component To enable reading of STEALTIME counters the user needs to link against a PAPI library that was configured with the STEALTIME component enabled. 
As an example, the following command: `./configure --with-components="stealtime"` is sufficient to enable the component. Typically, the utility `papi_components_avail` (available in `papi/src/utils/papi_components_avail`) will display the components available to the user, whether they are disabled, and, if they are disabled, why. papi-papi-7-2-0-t/src/components/stealtime/Rules.stealtime000066400000000000000000000003701502707512200236060ustar00rootroot00000000000000 COMPSRCS += components/stealtime/linux-stealtime.c COMPOBJS += linux-stealtime.o linux-stealtime.o: components/stealtime/linux-stealtime.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/stealtime/linux-stealtime.c -o linux-stealtime.o papi-papi-7-2-0-t/src/components/stealtime/linux-stealtime.c000066400000000000000000000327111502707512200240770ustar00rootroot00000000000000/** * @file linux-stealtime.c * @author Vince Weaver * vweaver1@eecs.utk.edu * @brief A component that gathers info on VM stealtime */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" struct counter_info { char *name; char *description; char *units; unsigned long long value; }; typedef struct counter_info STEALTIME_register_t; typedef struct counter_info STEALTIME_native_event_entry_t; typedef struct counter_info STEALTIME_reg_alloc_t; struct STEALTIME_control_state { long long *values; int *which_counter; int num_events; }; struct STEALTIME_context { long long *start_count; long long *current_count; long long *value; }; static int num_events = 0; static struct counter_info *event_info=NULL; /* Advance declaration of buffer */ papi_vector_t _stealtime_vector; /****************************************************************************** ******** BEGIN FUNCTIONS USED INTERNALLY SPECIFIC TO THIS COMPONENT ******** *****************************************************************************/ struct statinfo { long long
user; long long nice; long long system; long long idle; long long iowait; long long irq; long long softirq; long long steal; long long guest; }; static int read_stealtime( struct STEALTIME_context *context, int starting) { FILE *fff; char buffer[BUFSIZ],*result; int i,count; struct statinfo our_stat; int hz=sysconf(_SC_CLK_TCK); fff=fopen("/proc/stat","r"); if (fff==NULL) { return PAPI_ESYS; } for(i=0;i<num_events;i++) { result=fgets(buffer,BUFSIZ,fff); if (result==NULL) { fclose(fff); return PAPI_ESYS; } count=sscanf(buffer,"%*s %lld %lld %lld %lld %lld %lld %lld %lld %lld", &our_stat.user, &our_stat.nice, &our_stat.system, &our_stat.idle, &our_stat.iowait, &our_stat.irq, &our_stat.softirq, &our_stat.steal, &our_stat.guest); if (count<8) { fclose(fff); return PAPI_ESYS; } if (starting) { context->start_count[i]=our_stat.steal; } context->current_count[i]=our_stat.steal; /* convert to us */ context->value[i]=(context->current_count[i]-context->start_count[i])* (1000000/hz); } fclose(fff); return PAPI_OK; } /***************************************************************************** ******************* BEGIN PAPI's COMPONENT REQUIRED FUNCTIONS ************* *****************************************************************************/ /* * Component setup and shutdown */ static int _stealtime_shutdown_component( void ); // prototype for later routine. static int _stealtime_init_component( int cidx ) { int retval = PAPI_OK; (void)cidx; FILE *fff; char buffer[BUFSIZ],*result,string[BUFSIZ]; int i; /* Make sure /proc/stat exists */ fff=fopen("/proc/stat","r"); if (fff==NULL) { strncpy(_stealtime_vector.cmp_info.disabled_reason, "Cannot open /proc/stat",PAPI_MAX_STR_LEN); _stealtime_shutdown_component(); retval = PAPI_ESYS; goto fn_fail; } num_events=0; while(1) { result=fgets(buffer,BUFSIZ,fff); if (result==NULL) break; /* /proc/stat line with cpu stats always starts with "cpu" */ if (!strncmp(buffer,"cpu",3)) { num_events++; } else { break; } } fclose(fff); if (num_events<1) { strncpy(_stealtime_vector.cmp_info.disabled_reason, "Cannot find enough CPU lines in /proc/stat", PAPI_MAX_STR_LEN); _stealtime_shutdown_component(); retval = PAPI_ESYS; goto fn_fail; } event_info=calloc(num_events,sizeof(struct counter_info)); if (event_info==NULL) { _stealtime_shutdown_component(); retval = PAPI_ENOMEM; goto fn_fail; } sysconf(_SC_CLK_TCK);
event_info[0].name=strdup("TOTAL"); event_info[0].description=strdup("Total amount of steal time"); event_info[0].units=strdup("us"); for(i=1;i<num_events;i++) { sprintf(string,"CPU%d",i-1); event_info[i].name=strdup(string); sprintf(string,"Steal time for CPU %d",i-1); event_info[i].description=strdup(string); event_info[i].units=strdup("us"); } _stealtime_vector.cmp_info.num_native_events=num_events; _stealtime_vector.cmp_info.num_cntrs=num_events; fn_exit: _papi_hwd[cidx]->cmp_info.disabled = retval; return retval; fn_fail: goto fn_exit; } /* * This is called whenever a thread is initialized */ static int _stealtime_init_thread( hwd_context_t * ctx ) { struct STEALTIME_context *context=(struct STEALTIME_context *)ctx; context->start_count=calloc(num_events,sizeof(long long)); if (context->start_count==NULL) return PAPI_ENOMEM; context->current_count=calloc(num_events,sizeof(long long)); if (context->current_count==NULL) return PAPI_ENOMEM; context->value=calloc(num_events,sizeof(long long)); if (context->value==NULL) return PAPI_ENOMEM; return PAPI_OK; } /* * */ static int _stealtime_shutdown_component( void ) { int i; int num_events = _stealtime_vector.cmp_info.num_native_events; if (event_info!=NULL) { for (i=0; i<num_events; i++) { free(event_info[i].name); free(event_info[i].description); free(event_info[i].units); } free(event_info); event_info=NULL; } return PAPI_OK; } /* * This is called whenever a thread is shutdown */ static int _stealtime_shutdown_thread( hwd_context_t *ctx ) { struct STEALTIME_context *context=(struct STEALTIME_context *)ctx; if (context->start_count!=NULL) free(context->start_count); if (context->current_count!=NULL) free(context->current_count); if (context->value!=NULL) free(context->value); return PAPI_OK; } /* * Control of counters (Reading/Writing/Starting/Stopping/Setup) functions */ static int _stealtime_init_control_state( hwd_control_state_t *ctl ) { struct STEALTIME_control_state *control = (struct STEALTIME_control_state *)ctl; control->values=NULL; control->which_counter=NULL; control->num_events=0; return PAPI_OK; } /* * */ static int _stealtime_update_control_state( hwd_control_state_t *ctl, NativeInfo_t *native, int count, hwd_context_t *ctx ) { struct STEALTIME_control_state *control; ( void ) ctx; int i, index; control= (struct STEALTIME_control_state *)ctl; if (count!=control->num_events) { // printf("Resizing %d to %d\n",control->num_events,count); control->which_counter=realloc(control->which_counter, count*sizeof(int)); control->values=realloc(control->values, count*sizeof(long long)); } for ( i = 0; i < count; i++ ) { index = native[i].ni_event; control->which_counter[i]=index;
native[i].ni_position = i; } control->num_events=count; return PAPI_OK; } /* * */ static int _stealtime_start( hwd_context_t *ctx, hwd_control_state_t *ctl ) { (void)ctl; // struct STEALTIME_control_state *control; struct STEALTIME_context *context; //control = (struct STEALTIME_control_state *)ctl; context = (struct STEALTIME_context *)ctx; read_stealtime( context, 1 ); /* no need to update control, as we assume only one EventSet */ /* is active at once, so starting things at the context level */ /* is fine, since stealtime is system-wide */ return PAPI_OK; } /* * */ static int _stealtime_stop( hwd_context_t *ctx, hwd_control_state_t *ctl ) { (void) ctl; // struct STEALTIME_control_state *control; struct STEALTIME_context *context; //control = (struct STEALTIME_control_state *)ctl; context = (struct STEALTIME_context *)ctx; read_stealtime( context, 0 ); return PAPI_OK; } /* * */ static int _stealtime_read( hwd_context_t *ctx, hwd_control_state_t *ctl, long long **events, int flags ) { ( void ) flags; struct STEALTIME_control_state *control; struct STEALTIME_context *context; int i; control = (struct STEALTIME_control_state *)ctl; context = (struct STEALTIME_context *)ctx; read_stealtime( context, 0 ); for(i=0;i<control->num_events;i++) { control->values[i]= context->value[control->which_counter[i]]; } *events = control->values; return PAPI_OK; } /* * */ static int _stealtime_reset( hwd_context_t * ctx, hwd_control_state_t * ctrl ) { /* re-initializes counter_start values to current */ _stealtime_start(ctx,ctrl); return PAPI_OK; } /* * Unused stealtime write function */ /* static int */ /* _stealtime_write( hwd_context_t * ctx, hwd_control_state_t * ctrl, long long *from ) */ /* { */ /* ( void ) ctx; */ /* ( void ) ctrl; */ /* ( void ) from; */ /* return PAPI_OK; */ /* } */ /* * Functions for setting up various options */ /* This function sets various options in the component * The valid codes being passed in are PAPI_SET_DEFDOM, * PAPI_SET_DOMAIN, PAPI_SETDEFGRN,
PAPI_SET_GRANUL * and PAPI_SET_INHERIT */ static int _stealtime_ctl( hwd_context_t * ctx, int code, _papi_int_option_t * option ) { ( void ) ctx; ( void ) code; ( void ) option; return PAPI_OK; } /* * This function has to set the bits needed to count different domains * In particular: PAPI_DOM_USER, PAPI_DOM_KERNEL PAPI_DOM_OTHER * By default return PAPI_EINVAL if none of those are specified * and PAPI_OK with success * PAPI_DOM_USER is only user context is counted * PAPI_DOM_KERNEL is only the Kernel/OS context is counted * PAPI_DOM_OTHER is Exception/transient mode (like user TLB misses) * PAPI_DOM_ALL is all of the domains */ static int _stealtime_set_domain( hwd_control_state_t * cntrl, int domain ) { ( void ) cntrl; int found = 0; if ( PAPI_DOM_USER & domain ) { found = 1; } if ( PAPI_DOM_KERNEL & domain ) { found = 1; } if ( PAPI_DOM_OTHER & domain ) { found = 1; } if ( !found ) return PAPI_EINVAL; return PAPI_OK; } /* * */ static int _stealtime_ntv_code_to_name( unsigned int EventCode, char *name, int len ) { int event=EventCode; if (event >=0 && event < num_events) { strncpy( name, event_info[event].name, len ); return PAPI_OK; } return PAPI_ENOEVNT; } /* * */ static int _stealtime_ntv_code_to_descr( unsigned int EventCode, char *name, int len ) { int event=EventCode; if (event >=0 && event < num_events) { strncpy( name, event_info[event].description, len ); return PAPI_OK; } return PAPI_ENOEVNT; } static int _stealtime_ntv_code_to_info(unsigned int EventCode, PAPI_event_info_t *info) { int index = EventCode; if ( ( index < 0) || (index >= num_events )) return PAPI_ENOEVNT; strncpy( info->symbol, event_info[index].name,sizeof(info->symbol)); info->symbol[sizeof(info->symbol)-1] = '\0'; strncpy( info->long_descr, event_info[index].description,sizeof(info->long_descr)); info->long_descr[sizeof(info->long_descr)-1] = '\0'; strncpy( info->units, event_info[index].units,sizeof(info->units)); info->units[sizeof(info->units)-1] = '\0'; return PAPI_OK; } /* * */ static int
_stealtime_ntv_enum_events( unsigned int *EventCode, int modifier ) { if ( modifier == PAPI_ENUM_FIRST ) { if (num_events==0) return PAPI_ENOEVNT; *EventCode = 0; return PAPI_OK; } if ( modifier == PAPI_ENUM_EVENTS ) { int index; index = *EventCode; if ( (index+1) < num_events ) { *EventCode = *EventCode + 1; return PAPI_OK; } else { return PAPI_ENOEVNT; } } return PAPI_EINVAL; } /* * */ papi_vector_t _stealtime_vector = { .cmp_info = { /* component information (unspecified values initialized to 0) */ .name = "stealtime", .short_name="stealtime", .version = "5.0", .description = "Stealtime filesystem statistics", .default_domain = PAPI_DOM_USER, .default_granularity = PAPI_GRN_THR, .available_granularities = PAPI_GRN_THR, .hardware_intr_sig = PAPI_INT_SIGNAL, /* component specific cmp_info initializations */ .fast_real_timer = 0, .fast_virtual_timer = 0, .attach = 0, .attach_must_ptrace = 0, .available_domains = PAPI_DOM_USER | PAPI_DOM_KERNEL, }, /* sizes of framework-opaque component-private structures */ .size = { .context = sizeof ( struct STEALTIME_context ), .control_state = sizeof ( struct STEALTIME_control_state ), .reg_value = sizeof ( STEALTIME_register_t ), .reg_alloc = sizeof ( STEALTIME_reg_alloc_t ), }, /* function pointers in this component */ .init_thread = _stealtime_init_thread, .init_component = _stealtime_init_component, .init_control_state = _stealtime_init_control_state, .start = _stealtime_start, .stop = _stealtime_stop, .read = _stealtime_read, .shutdown_thread = _stealtime_shutdown_thread, .shutdown_component = _stealtime_shutdown_component, .ctl = _stealtime_ctl, .update_control_state = _stealtime_update_control_state, .set_domain = _stealtime_set_domain, .reset = _stealtime_reset, .ntv_enum_events = _stealtime_ntv_enum_events, .ntv_code_to_name = _stealtime_ntv_code_to_name, .ntv_code_to_descr = _stealtime_ntv_code_to_descr, .ntv_code_to_info = _stealtime_ntv_code_to_info, }; 
papi-papi-7-2-0-t/src/components/stealtime/tests/000077500000000000000000000000001502707512200217455ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/stealtime/tests/Makefile000066400000000000000000000005371502707512200234120ustar00rootroot00000000000000NAME=stealtime include ../../Makefile_comp_tests.target %.o:%.c $(CC) $(CFLAGS) $(OPTFLAGS) $(INCLUDE) -c -o $@ $< TESTS = stealtime_basic stealtime_tests: $(TESTS) stealtime_basic: stealtime_basic.o $(UTILOBJS) $(PAPILIB) $(CC) $(INCLUDE) -o stealtime_basic stealtime_basic.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) clean: rm -f $(TESTS) *.o papi-papi-7-2-0-t/src/components/stealtime/tests/stealtime_basic.c000066400000000000000000000062331502707512200252450ustar00rootroot00000000000000/** * @author Vince Weaver * * test case for stealtime component * * * @brief * Tests basic stealtime functionality */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include "papi.h" #include "papi_test.h" #define NUM_EVENTS 1 int main (int argc, char **argv) { int retval,cid,numcmp; int EventSet = PAPI_NULL; long long values[NUM_EVENTS]; int code; char event_name[PAPI_MAX_STR_LEN]; int total_events=0; int r; const PAPI_component_info_t *cmpinfo = NULL; int quiet=0; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); /* PAPI Initialization */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__,"PAPI_library_init failed\n",retval); } if (!quiet) { printf("Trying all stealtime events\n"); } numcmp = PAPI_num_components(); for(cid=0; cid<numcmp; cid++) { cmpinfo = PAPI_get_component_info(cid); if (cmpinfo == NULL) { test_fail(__FILE__, __LINE__, "PAPI_get_component_info failed\n", 0); } if (strstr(cmpinfo->name,"stealtime")) { if (!quiet) printf("\tFound stealtime component %d - %s\n", cid, cmpinfo->name); } else { continue; } code = PAPI_NATIVE_MASK; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_FIRST, cid ); while ( r == PAPI_OK ) { retval = PAPI_event_code_to_name( code, event_name ); if ( retval != PAPI_OK ) { printf("Error translating %#x\n",code); test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } if (!quiet) printf(" %s 
",event_name); EventSet = PAPI_NULL; retval = PAPI_create_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset()",retval); } retval = PAPI_add_event( EventSet, code ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_add_event()",retval); } retval = PAPI_start( EventSet); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start()",retval); } retval = PAPI_stop( EventSet, values); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start()",retval); } if (!quiet) printf(" value: %lld\n",values[0]); retval = PAPI_cleanup_eventset( EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset()",retval); } retval = PAPI_destroy_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset()",retval); } total_events++; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_EVENTS, cid ); } } if (total_events==0) { test_skip(__FILE__,__LINE__,"No stealtime events found",0); } if (!quiet) { printf("Note: for this test the values are expected to all be 0\n\t unless run inside a VM on a busy system.\n"); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/components/sysdetect/000077500000000000000000000000001502707512200206235ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/sysdetect/README.md000066400000000000000000000025701502707512200221060ustar00rootroot00000000000000# SYSDETECT Component The SYSDETECT component allows PAPI users to query comprehensive system information. The information is gathered at PAPI_library_init() time and presented to the user through appropriate APIs. The component works similarly to other components, which means that hardware information for a specific device might not be available at runtime if, e.g., the device runtime software is not installed. 
* [Enabling the SYSDETECT Component](#enabling-the-sysdetect-component) ## Enabling the SYSDETECT Component The sysdetect component is enabled by default; however, support for the various devices accessed by the component has to be enabled by exporting the appropriate environment variables. This is required so that the component can dlopen the needed libraries and access hardware information. For example, for the sysdetect component to access CUDA device information, the PAPI_CUDA_ROOT environment variable has to be set to the CUDA toolkit installation path (in the same way the user has to enable the cuda component). Typically, the utility `papi_components_avail` (available in `papi/src/utils/papi_components_avail`) lists the components available to the user and indicates whether each is disabled and, if so, why. The utility program papi_hardware_avail uses the SYSDETECT component to report installed and configured hardware information to the command line. papi-papi-7-2-0-t/src/components/sysdetect/Rules.sysdetect000066400000000000000000000054531502707512200236550ustar00rootroot00000000000000COMPSRCS += components/sysdetect/sysdetect.c \ components/sysdetect/nvidia_gpu.c \ components/sysdetect/amd_gpu.c \ components/sysdetect/cpu.c \ components/sysdetect/cpu_utils.c \ components/sysdetect/os_cpu_utils.c \ components/sysdetect/linux_cpu_utils.c COMPOBJS += sysdetect.o \ nvidia_gpu.o \ amd_gpu.o \ cpu.o \ cpu_utils.o \ os_cpu_utils.o \ linux_cpu_utils.o ifneq ($(PAPI_ROCM_ROOT),) CFLAGS += -I$(PAPI_ROCM_ROOT)/include \ -I$(PAPI_ROCM_ROOT)/include/hsa \ -I$(PAPI_ROCM_ROOT)/hsa/include \ -I$(PAPI_ROCM_ROOT)/hsa/include/hsa \ -I$(PAPI_ROCM_ROOT)/include/rocm_smi \ -I$(PAPI_ROCM_ROOT)/rocm_smi/include \ -I$(PAPI_ROCM_ROOT)/rocm_smi/include/rocm_smi endif ifneq ($(PAPI_CUDA_ROOT),) CFLAGS += -I$(PAPI_CUDA_ROOT)/include endif sysdetect.o: components/sysdetect/sysdetect.c components/sysdetect/sysdetect.h $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c 
components/sysdetect/sysdetect.c -o $@ nvidia_gpu.o: components/sysdetect/nvidia_gpu.c components/sysdetect/sysdetect.h $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/sysdetect/nvidia_gpu.c -o $@ amd_gpu.o: components/sysdetect/amd_gpu.c components/sysdetect/amd_gpu.h $(HEADERS) $(CC) -I$(PAPI_ROCM_ROOT)/include/hsa $(LIBCFLAGS) $(OPTFLAGS) -c components/sysdetect/amd_gpu.c -o $@ cpu.o: components/sysdetect/cpu.c components/sysdetect/sysdetect.h $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/sysdetect/cpu.c -o $@ cpu_utils.o: components/sysdetect/cpu_utils.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/sysdetect/cpu_utils.c -o $@ os_cpu_utils.o: components/sysdetect/os_cpu_utils.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/sysdetect/os_cpu_utils.c -o $@ linux_cpu_utils.o: components/sysdetect/linux_cpu_utils.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/sysdetect/linux_cpu_utils.c -o $@ ifeq ($(CPU), x86) COMPSRCS += components/sysdetect/x86_cpu_utils.c COMPOBJS += x86_cpu_utils.o x86_cpu_utils.o: components/sysdetect/x86_cpu_utils.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/sysdetect/x86_cpu_utils.c -o $@ else ifneq (, $(filter $(CPU), POWER5 POWER5+ POWER6 POWER7 POWER8 POWER9 POWER10 PPC970)) COMPSRCS += components/sysdetect/powerpc_cpu_utils.c COMPOBJS += powerpc_cpu_utils.o powerpc_cpu_utils.o: components/sysdetect/powerpc_cpu_utils.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/sysdetect/powerpc_cpu_utils.c -o $@ else ifeq ($(CPU), arm) COMPSRCS += components/sysdetect/arm_cpu_utils.c COMPOBJS += arm_cpu_utils.o arm_cpu_utils.o: components/sysdetect/arm_cpu_utils.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/sysdetect/arm_cpu_utils.c -o $@ endif papi-papi-7-2-0-t/src/components/sysdetect/amd_gpu.c000066400000000000000000000343771502707512200224210ustar00rootroot00000000000000/** * @file amd_gpu.c * @author Giuseppe Congiu * gcongiu@icl.utk.edu * * @ingroup 
papi_components * * @brief * Scan functions for AMD GPU subsystems. */ #include #include #include #include #include #include "sysdetect.h" #include "amd_gpu.h" #ifdef HAVE_ROCM #include "hsa.h" #include "hsa_ext_amd.h" static void *rocm_dlp = NULL; static hsa_status_t (*hsa_initPtr)( void ) = NULL; static hsa_status_t (*hsa_shut_downPtr)( void ) = NULL; static hsa_status_t (*hsa_iterate_agentsPtr)( hsa_status_t (*)(hsa_agent_t agent, void *value), void *value ) = NULL; static hsa_status_t (*hsa_agent_get_infoPtr)( hsa_agent_t agent, hsa_agent_info_t attribute, void *value ) = NULL; static hsa_status_t (*hsa_amd_agent_iterate_memory_poolsPtr)( hsa_agent_t agent, hsa_status_t (*)(hsa_amd_memory_pool_t pool, void *value), void *value ) = NULL; static hsa_status_t (*hsa_amd_memory_pool_get_infoPtr)( hsa_amd_memory_pool_t pool, hsa_amd_memory_pool_info_t attribute, void *value ) = NULL; static hsa_status_t (*hsa_status_stringPtr)( hsa_status_t status, const char **string ) = NULL; #define ROCM_CALL(call, err_handle) do { \ hsa_status_t _status = (call); \ if (_status == HSA_STATUS_SUCCESS || \ _status == HSA_STATUS_INFO_BREAK) \ break; \ err_handle; \ } while(0) static hsa_status_t count_devices( hsa_agent_t agent, void *data ); static hsa_status_t get_device_count( int *count ); static hsa_status_t get_device_memory( hsa_amd_memory_pool_t pool, void *info ); static hsa_status_t get_device_properties( hsa_agent_t agent, void *info ); static void fill_dev_info( _sysdetect_gpu_info_u *dev_info ); static int hsa_is_enabled( void ); static int load_hsa_sym( char *status ); static int unload_hsa_sym( void ); #endif /* HAVE_ROCM */ #ifdef HAVE_ROCM_SMI #include "rocm_smi.h" static void *rsmi_dlp = NULL; static rsmi_status_t (*rsmi_initPtr)( unsigned long init_flags ) = NULL; static rsmi_status_t (*rsmi_shut_downPtr)( void ) = NULL; static rsmi_status_t (*rsmi_dev_pci_id_getPtr)( unsigned int dev_idx, unsigned long *bdfid ) = NULL; #define ROCM_SMI_CALL(call, err_handle) do { 
\ rsmi_status_t _status = (call); \ if (_status == RSMI_STATUS_SUCCESS) \ break; \ err_handle; \ } while(0) static void fill_dev_affinity_info( _sysdetect_gpu_info_u *dev_info, int dev_count ); static int rsmi_is_enabled( void ); static int load_rsmi_sym( char *status ); static int unload_rsmi_sym( void ); #endif /* HAVE_ROCM_SMI */ #ifdef HAVE_ROCM hsa_status_t count_devices( hsa_agent_t agent, void *data ) { int *count = (int *) data; hsa_device_type_t type; ROCM_CALL((*hsa_agent_get_infoPtr)(agent, HSA_AGENT_INFO_DEVICE, &type), return _status); if (type == HSA_DEVICE_TYPE_GPU) { ++(*count); } return HSA_STATUS_SUCCESS; } hsa_status_t get_device_count( int *count ) { *count = 0; ROCM_CALL((*hsa_iterate_agentsPtr)(&count_devices, count), return _status); return HSA_STATUS_SUCCESS; } hsa_status_t get_device_memory( hsa_amd_memory_pool_t pool, void *info ) { hsa_region_segment_t seg_info; _sysdetect_gpu_info_u *dev_info = info; ROCM_CALL((*hsa_amd_memory_pool_get_infoPtr)(pool, HSA_AMD_MEMORY_POOL_INFO_SEGMENT, &seg_info), return _status); if (seg_info == HSA_REGION_SEGMENT_GROUP) { ROCM_CALL((*hsa_amd_memory_pool_get_infoPtr)(pool, HSA_AMD_MEMORY_POOL_INFO_SIZE, &dev_info->amd.max_shmmem_per_workgroup), return _status); return HSA_STATUS_INFO_BREAK; } return HSA_STATUS_SUCCESS; } hsa_status_t get_device_properties( hsa_agent_t agent, void *info ) { static int count; hsa_device_type_t type; ROCM_CALL((*hsa_agent_get_infoPtr)(agent, HSA_AGENT_INFO_DEVICE, &type), return _status); if (type == HSA_DEVICE_TYPE_GPU) { /* query attributes for this device */ _sysdetect_gpu_info_u *dev_info = &((_sysdetect_gpu_info_u *) info)[count]; ROCM_CALL((*hsa_agent_get_infoPtr)(agent, HSA_AGENT_INFO_NAME, dev_info->amd.name), return _status); ROCM_CALL((*hsa_agent_get_infoPtr)(agent, HSA_AGENT_INFO_WAVEFRONT_SIZE, &dev_info->amd.wavefront_size), return _status); unsigned short wg_dims[3]; ROCM_CALL((*hsa_agent_get_infoPtr)(agent, HSA_AGENT_INFO_WORKGROUP_MAX_DIM, wg_dims), return 
_status); ROCM_CALL((*hsa_agent_get_infoPtr)(agent, HSA_AGENT_INFO_WORKGROUP_MAX_SIZE, &dev_info->amd.max_threads_per_workgroup), return _status); hsa_dim3_t gr_dims; ROCM_CALL((*hsa_agent_get_infoPtr)(agent, HSA_AGENT_INFO_GRID_MAX_DIM, &gr_dims), return _status); ROCM_CALL((*hsa_agent_get_infoPtr)(agent, HSA_AGENT_INFO_VERSION_MAJOR, &dev_info->amd.major), return _status); ROCM_CALL((*hsa_agent_get_infoPtr)(agent, HSA_AGENT_INFO_VERSION_MINOR, &dev_info->amd.minor), return _status); ROCM_CALL((*hsa_agent_get_infoPtr)(agent, (hsa_agent_info_t) HSA_AMD_AGENT_INFO_NUM_SIMDS_PER_CU, &dev_info->amd.simd_per_compute_unit), return _status); ROCM_CALL((*hsa_agent_get_infoPtr)(agent, (hsa_agent_info_t) HSA_AMD_AGENT_INFO_COMPUTE_UNIT_COUNT, &dev_info->amd.compute_unit_count), return _status); ROCM_CALL((*hsa_agent_get_infoPtr)(agent, (hsa_agent_info_t) HSA_AMD_AGENT_INFO_MAX_WAVES_PER_CU, &dev_info->amd.max_waves_per_compute_unit), return _status); ROCM_CALL((*hsa_amd_agent_iterate_memory_poolsPtr)(agent, &get_device_memory, dev_info), return _status); dev_info->amd.max_workgroup_dim_x = wg_dims[0]; dev_info->amd.max_workgroup_dim_y = wg_dims[1]; dev_info->amd.max_workgroup_dim_z = wg_dims[2]; dev_info->amd.max_grid_dim_x = gr_dims.x; dev_info->amd.max_grid_dim_y = gr_dims.y; dev_info->amd.max_grid_dim_z = gr_dims.z; ++count; } return HSA_STATUS_SUCCESS; } void fill_dev_info( _sysdetect_gpu_info_u *dev_info ) { hsa_status_t status = HSA_STATUS_SUCCESS; const char *string = NULL; ROCM_CALL((*hsa_iterate_agentsPtr)(&get_device_properties, dev_info), status = _status); if (status != HSA_STATUS_SUCCESS) { (*hsa_status_stringPtr)(status, &string); SUBDBG( "error: %s\n", string ); } } int hsa_is_enabled( void ) { return (hsa_initPtr != NULL && hsa_shut_downPtr != NULL && hsa_iterate_agentsPtr != NULL && hsa_agent_get_infoPtr != NULL && hsa_amd_agent_iterate_memory_poolsPtr != NULL && hsa_amd_memory_pool_get_infoPtr != NULL && hsa_status_stringPtr != NULL); } int load_hsa_sym( 
char *status ) { char pathname[PATH_MAX] = "libhsa-runtime64.so"; char *rocm_root = getenv("PAPI_ROCM_ROOT"); if (rocm_root != NULL) { sprintf(pathname, "%s/lib/libhsa-runtime64.so", rocm_root); } rocm_dlp = dlopen(pathname, RTLD_NOW | RTLD_GLOBAL); if (rocm_dlp == NULL) { int count = snprintf(status, PAPI_MAX_STR_LEN, "%s", dlerror()); if (count >= PAPI_MAX_STR_LEN) { SUBDBG("Status string truncated."); } return -1; } hsa_initPtr = dlsym(rocm_dlp, "hsa_init"); hsa_shut_downPtr = dlsym(rocm_dlp, "hsa_shut_down"); hsa_iterate_agentsPtr = dlsym(rocm_dlp, "hsa_iterate_agents"); hsa_agent_get_infoPtr = dlsym(rocm_dlp, "hsa_agent_get_info"); hsa_amd_agent_iterate_memory_poolsPtr = dlsym(rocm_dlp, "hsa_amd_agent_iterate_memory_pools"); hsa_amd_memory_pool_get_infoPtr = dlsym(rocm_dlp, "hsa_amd_memory_pool_get_info"); hsa_status_stringPtr = dlsym(rocm_dlp, "hsa_status_string"); if (!hsa_is_enabled() || (*hsa_initPtr)()) { const char *message = "dlsym() of HSA symbols failed or hsa_init() " "failed"; int count = snprintf(status, PAPI_MAX_STR_LEN, "%s", message); if (count >= PAPI_MAX_STR_LEN) { SUBDBG("Status string truncated."); } return -1; } return 0; } int unload_hsa_sym( void ) { if (rocm_dlp != NULL) { (*hsa_shut_downPtr)(); dlclose(rocm_dlp); } hsa_initPtr = NULL; hsa_shut_downPtr = NULL; hsa_iterate_agentsPtr = NULL; hsa_agent_get_infoPtr = NULL; hsa_amd_agent_iterate_memory_poolsPtr = NULL; hsa_amd_memory_pool_get_infoPtr = NULL; hsa_status_stringPtr = NULL; return hsa_is_enabled(); } #endif /* HAVE_ROCM */ #ifdef HAVE_ROCM_SMI void fill_dev_affinity_info( _sysdetect_gpu_info_u *info, int dev_count ) { int dev; for (dev = 0; dev < dev_count; ++dev) { unsigned long uid; ROCM_SMI_CALL((*rsmi_dev_pci_id_getPtr)(dev, &uid), return); _sysdetect_gpu_info_u *dev_info = &info[dev]; dev_info->amd.uid = uid; } } int rsmi_is_enabled( void ) { return (rsmi_initPtr != NULL && rsmi_shut_downPtr != NULL && rsmi_dev_pci_id_getPtr != NULL); } int load_rsmi_sym( char *status ) { 
char pathname[PATH_MAX] = "librocm_smi64.so"; char *rsmi_root = getenv("PAPI_ROCM_ROOT"); if (rsmi_root != NULL) { sprintf(pathname, "%s/lib/librocm_smi64.so", rsmi_root); } rsmi_dlp = dlopen(pathname, RTLD_NOW | RTLD_GLOBAL); if (rsmi_dlp == NULL) { int count = snprintf(status, PAPI_MAX_STR_LEN, "%s", dlerror()); if (count >= PAPI_MAX_STR_LEN) { SUBDBG("Status string truncated."); } return -1; } rsmi_initPtr = dlsym(rsmi_dlp, "rsmi_init"); rsmi_shut_downPtr = dlsym(rsmi_dlp, "rsmi_shut_down"); rsmi_dev_pci_id_getPtr = dlsym(rsmi_dlp, "rsmi_dev_pci_id_get"); if (!rsmi_is_enabled() || (*rsmi_initPtr)(0)) { const char *message = "dlsym() of RSMI symbols failed or rsmi_init() " "failed"; int count = snprintf(status, PAPI_MAX_STR_LEN, "%s", message); if (count >= PAPI_MAX_STR_LEN) { SUBDBG("Status string truncated."); } return -1; } return 0; } int unload_rsmi_sym( void ) { if (rsmi_dlp != NULL) { (*rsmi_shut_downPtr)(); dlclose(rsmi_dlp); } rsmi_initPtr = NULL; rsmi_shut_downPtr = NULL; rsmi_dev_pci_id_getPtr = NULL; return rsmi_is_enabled(); } #endif /* HAVE_ROCM_SMI */ void open_amd_gpu_dev_type( _sysdetect_dev_type_info_t *dev_type_info ) { memset(dev_type_info, 0, sizeof(*dev_type_info)); dev_type_info->id = PAPI_DEV_TYPE_ID__ROCM; strcpy(dev_type_info->vendor, "AMD/ATI"); strcpy(dev_type_info->status, "Device Initialized"); #ifdef HAVE_ROCM if (load_hsa_sym(dev_type_info->status)) { return; } int dev_count = 0; hsa_status_t status = get_device_count(&dev_count); if (status != HSA_STATUS_SUCCESS) { if (status != HSA_STATUS_ERROR_NOT_INITIALIZED) { const char *string; (*hsa_status_stringPtr)(status, &string); printf( "error: %s\n", string ); } return; } dev_type_info->num_devices = dev_count; _sysdetect_gpu_info_u *arr = papi_calloc(dev_count, sizeof(*arr)); fill_dev_info(arr); #ifdef HAVE_ROCM_SMI if (!load_rsmi_sym(dev_type_info->status)) { fill_dev_affinity_info(arr, dev_count); unload_rsmi_sym(); } #else const char *message = "RSMI not configured, no device 
affinity available"; int count = snprintf(dev_type_info->status, PAPI_MAX_STR_LEN, "%s", message); if (count >= PAPI_MAX_STR_LEN) { SUBDBG("Error message truncated."); } #endif /* HAVE_ROCM_SMI */ unload_hsa_sym(); dev_type_info->dev_info_arr = (_sysdetect_dev_info_u *)arr; #else const char *message = "ROCm not configured, no ROCm device available"; int count = snprintf(dev_type_info->status, PAPI_MAX_STR_LEN, "%s", message); if (count >= PAPI_MAX_STR_LEN) { SUBDBG("Error message truncated."); } #endif /* HAVE_ROCM */ } void close_amd_gpu_dev_type( _sysdetect_dev_type_info_t *dev_type_info ) { papi_free(dev_type_info->dev_info_arr); } papi-papi-7-2-0-t/src/components/sysdetect/amd_gpu.h000066400000000000000000000003431502707512200224100ustar00rootroot00000000000000#ifndef __AMD_GPU_H__ #define __AMD_GPU_H__ void open_amd_gpu_dev_type( _sysdetect_dev_type_info_t *dev_type_info ); void close_amd_gpu_dev_type( _sysdetect_dev_type_info_t *dev_type_info ); #endif /* End of __AMD_GPU_H__ */ papi-papi-7-2-0-t/src/components/sysdetect/arm_cpu_utils.c000066400000000000000000000261441502707512200236440ustar00rootroot00000000000000#include #include "sysdetect.h" #include "arm_cpu_utils.h" #include "os_cpu_utils.h" #include #define VENDOR_ARM_ARM 65 #define VENDOR_ARM_BROADCOM 66 #define VENDOR_ARM_CAVIUM 67 #define VENDOR_ARM_FUJITSU 70 #define VENDOR_ARM_HISILICON 72 #define VENDOR_ARM_APM 80 #define VENDOR_ARM_QUALCOMM 81 #define NAMEID_ARM_1176 0xb76 #define NAMEID_ARM_CORTEX_A7 0xc07 #define NAMEID_ARM_CORTEX_A8 0xc08 #define NAMEID_ARM_CORTEX_A9 0xc09 #define NAMEID_ARM_CORTEX_A15 0xc0f #define NAMEID_ARM_CORTEX_A53 0xd03 #define NAMEID_ARM_CORTEX_A57 0xd07 #define NAMEID_ARM_CORTEX_A76 0xd0b #define NAMEID_ARM_NEOVERSE_N1 0xd0c #define NAMEID_ARM_NEOVERSE_N2 0xd49 #define NAMEID_ARM_NEOVERSE_V1 0xd40 #define NAMEID_ARM_NEOVERSE_V2 0xd4f #define NAMEID_BROADCOM_THUNDERX2 0x516 #define NAMEID_CAVIUM_THUNDERX2 0x0af #define NAMEID_FUJITSU_A64FX 0x001 #define 
NAMEID_FUJITSU_MONAKA 0x003 #define NAMEID_HISILICON_KUNPENG 0xd01 #define NAMEID_APM_XGENE 0x000 #define NAMEID_QUALCOMM_KRAIT 0x040 _sysdetect_cache_level_info_t fujitsu_a64fx_cache_info[] = { { // level 1 begins 2, { {PAPI_MH_TYPE_INST, 65536, 256, 64, 4}, {PAPI_MH_TYPE_DATA, 65536, 256, 64, 4} } }, { // level 2 begins 1, { {PAPI_MH_TYPE_UNIFIED, 8388608, 256, 2048, 16}, {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } }, }; _sysdetect_cache_level_info_t arm_neoverse_v2_cache_info[] = { { // level 1 begins 2, { {PAPI_MH_TYPE_INST, 65536, 64, 256, 4}, {PAPI_MH_TYPE_DATA, 65536, 64, 256, 4} } }, { // level 2 begins 1, { {PAPI_MH_TYPE_UNIFIED, 1048576, 64, 2048, 8}, {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } }, { // level 3 begins 1, { {PAPI_MH_TYPE_UNIFIED, 119537664, 64, 155648, 12}, {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } }, }; static int get_cache_info( CPU_attr_e attr, int level, int *value ); static int name_id_arm_cpu_get_name( int name_id, char *name ); static int name_id_broadcom_cpu_get_name( int name_id, char *name ); static int name_id_cavium_cpu_get_name( int name_id, char *name ); static int name_id_fujitsu_cpu_get_name( int name_id, char *name ); static int name_id_hisilicon_cpu_get_name( int name_id, char *name ); static int name_id_apm_cpu_get_name( int name_id, char *name ); static int name_id_qualcomm_cpu_get_name( int name_id, char *name ); int arm_cpu_init( void ) { return CPU_SUCCESS; } int arm_cpu_finalize( void ) { return CPU_SUCCESS; } int arm_cpu_get_vendor( char *vendor ) { int papi_errno; char tmp[PAPI_MAX_STR_LEN]; papi_errno = os_cpu_get_vendor(tmp); if (papi_errno != PAPI_OK) { return papi_errno; } int vendor_id; sscanf(tmp, "%x", &vendor_id); switch(vendor_id) { case VENDOR_ARM_ARM: strcpy(vendor, "Arm"); break; case VENDOR_ARM_BROADCOM: strcpy(vendor, "Broadcom"); break; case VENDOR_ARM_CAVIUM: strcpy(vendor, "Cavium"); break; case VENDOR_ARM_FUJITSU: strcpy(vendor, "Fujitsu"); break; case VENDOR_ARM_HISILICON: strcpy(vendor, "Hisilicon"); break; 
case VENDOR_ARM_APM: strcpy(vendor, "Apm"); break; case VENDOR_ARM_QUALCOMM: strcpy(vendor, "Qualcomm"); break; default: papi_errno = PAPI_ENOSUPP; } return papi_errno; } int arm_cpu_get_name( char *name ) { int papi_errno; papi_errno = os_cpu_get_name(name); if (strlen(name) != 0) { return papi_errno; } char tmp[PAPI_MAX_STR_LEN]; papi_errno = os_cpu_get_vendor(tmp); if (papi_errno != PAPI_OK) { return papi_errno; } int vendor_id; sscanf(tmp, "%x", &vendor_id); int name_id; papi_errno = os_cpu_get_attribute(CPU_ATTR__CPUID_MODEL, &name_id); if (papi_errno != PAPI_OK) { return papi_errno; } switch(vendor_id) { case VENDOR_ARM_ARM: papi_errno = name_id_arm_cpu_get_name(name_id, name); break; case VENDOR_ARM_BROADCOM: papi_errno = name_id_broadcom_cpu_get_name(name_id, name); break; case VENDOR_ARM_CAVIUM: papi_errno = name_id_cavium_cpu_get_name(name_id, name); break; case VENDOR_ARM_FUJITSU: papi_errno = name_id_fujitsu_cpu_get_name(name_id, name); break; case VENDOR_ARM_HISILICON: papi_errno = name_id_hisilicon_cpu_get_name(name_id, name); break; case VENDOR_ARM_APM: papi_errno = name_id_apm_cpu_get_name(name_id, name); break; case VENDOR_ARM_QUALCOMM: papi_errno = name_id_qualcomm_cpu_get_name(name_id, name); break; default: papi_errno = PAPI_ENOSUPP; } return papi_errno; } int arm_cpu_get_attribute( CPU_attr_e attr, int *value ) { return os_cpu_get_attribute(attr, value); } int arm_cpu_get_attribute_at( CPU_attr_e attr, int loc, int *value ) { int status = CPU_SUCCESS; switch(attr) { case CPU_ATTR__CACHE_INST_PRESENT: //fall through case CPU_ATTR__CACHE_DATA_PRESENT: //fall through case CPU_ATTR__CACHE_UNIF_PRESENT: //fall through case CPU_ATTR__CACHE_INST_TOT_SIZE: //fall through case CPU_ATTR__CACHE_INST_LINE_SIZE: //fall through case CPU_ATTR__CACHE_INST_NUM_LINES: //fall through case CPU_ATTR__CACHE_INST_ASSOCIATIVITY: //fall through case CPU_ATTR__CACHE_DATA_TOT_SIZE: //fall through case CPU_ATTR__CACHE_DATA_LINE_SIZE: //fall through case 
CPU_ATTR__CACHE_DATA_NUM_LINES: //fall through case CPU_ATTR__CACHE_DATA_ASSOCIATIVITY: //fall through case CPU_ATTR__CACHE_UNIF_TOT_SIZE: //fall through case CPU_ATTR__CACHE_UNIF_LINE_SIZE: //fall through case CPU_ATTR__CACHE_UNIF_NUM_LINES: //fall through case CPU_ATTR__CACHE_UNIF_ASSOCIATIVITY: status = get_cache_info(attr, loc, value); break; case CPU_ATTR__NUMA_MEM_SIZE: //fall through case CPU_ATTR__HWTHREAD_NUMA_AFFINITY: status = os_cpu_get_attribute_at(attr, loc, value); break; default: status = CPU_ERROR; } return status; } int name_id_arm_cpu_get_name( int name_id, char *name ) { int papi_errno = PAPI_OK; switch(name_id) { case NAMEID_ARM_1176: strcpy(name, "ARM1176"); break; case NAMEID_ARM_CORTEX_A7: strcpy(name, "ARM Cortex A7"); break; case NAMEID_ARM_CORTEX_A8: strcpy(name, "ARM Cortex A8"); break; case NAMEID_ARM_CORTEX_A9: strcpy(name, "ARM Cortex A9"); break; case NAMEID_ARM_CORTEX_A15: strcpy(name, "ARM Cortex A15"); break; case NAMEID_ARM_CORTEX_A53: strcpy(name, "ARM Cortex A53"); break; case NAMEID_ARM_CORTEX_A57: strcpy(name, "ARM Cortex A57"); break; case NAMEID_ARM_CORTEX_A76: strcpy(name, "ARM Cortex A76"); break; case NAMEID_ARM_NEOVERSE_N1: strcpy(name, "ARM Neoverse N1"); break; case NAMEID_ARM_NEOVERSE_N2: strcpy(name, "ARM Neoverse N2"); break; case NAMEID_ARM_NEOVERSE_V1: strcpy(name, "ARM Neoverse V1"); break; case NAMEID_ARM_NEOVERSE_V2: strcpy(name, "ARM Neoverse V2"); break; default: papi_errno = PAPI_ENOSUPP; } return papi_errno; } int name_id_broadcom_cpu_get_name( int name_id, char *name ) { int papi_errno = PAPI_OK; switch(name_id) { case NAMEID_BROADCOM_THUNDERX2: strcpy(name, "Broadcom ThunderX2"); break; default: papi_errno = PAPI_ENOSUPP; } return papi_errno; } int name_id_cavium_cpu_get_name( int name_id, char *name ) { int papi_errno = PAPI_OK; switch(name_id) { case NAMEID_CAVIUM_THUNDERX2: strcpy(name, "Cavium ThunderX2"); break; default: papi_errno = PAPI_ENOSUPP; } return papi_errno; } int 
name_id_fujitsu_cpu_get_name( int name_id, char *name ) { int papi_errno = PAPI_OK; switch(name_id) { case NAMEID_FUJITSU_A64FX: strcpy(name, "Fujitsu A64FX"); break; case NAMEID_FUJITSU_MONAKA: strcpy(name, "Fujitsu FUJITSU-MONAKA"); break; default: papi_errno = PAPI_ENOSUPP; } return papi_errno; } int name_id_hisilicon_cpu_get_name( int name_id, char *name ) { int papi_errno = PAPI_OK; switch(name_id) { case NAMEID_HISILICON_KUNPENG: strcpy(name, "Hisilicon Kunpeng"); break; default: papi_errno = PAPI_ENOSUPP; } return papi_errno; } int name_id_apm_cpu_get_name( int name_id, char *name ) { int papi_errno = PAPI_OK; switch(name_id) { case NAMEID_APM_XGENE: strcpy(name, "Applied Micro X-Gene"); break; default: papi_errno = PAPI_ENOSUPP; } return papi_errno; } int name_id_qualcomm_cpu_get_name( int name_id, char *name ) { int papi_errno = PAPI_OK; switch(name_id) { case NAMEID_QUALCOMM_KRAIT: strcpy(name, "ARM Qualcomm Krait"); break; default: papi_errno = PAPI_ENOSUPP; } return papi_errno; } int get_cache_info( CPU_attr_e attr, int level, int *value ) { int impl, part; unsigned int implementer, partnum; static _sysdetect_cache_level_info_t *clevel_ptr; int status; status = arm_cpu_get_attribute(CPU_ATTR__VENDOR_ID, &impl); if( status != CPU_SUCCESS ) return CPU_ERROR; implementer = impl; status = arm_cpu_get_attribute(CPU_ATTR__CPUID_MODEL, &part); if( status != CPU_SUCCESS ) return CPU_ERROR; partnum = part; if (clevel_ptr) { return cpu_get_cache_info(attr, level, clevel_ptr, value); } switch ( implementer ) { case VENDOR_ARM_FUJITSU: if ( NAMEID_FUJITSU_A64FX == partnum ) { /* Fujitsu A64FX */ clevel_ptr = fujitsu_a64fx_cache_info; } else { return CPU_ERROR; } break; case VENDOR_ARM_ARM: if ( NAMEID_ARM_NEOVERSE_V2 == partnum ) { /* ARM Neoverse V2 */ clevel_ptr = arm_neoverse_v2_cache_info; } else { return CPU_ERROR; } break; default: return CPU_ERROR; } return cpu_get_cache_info(attr, level, clevel_ptr, value); } 
papi-papi-7-2-0-t/src/components/sysdetect/arm_cpu_utils.h000066400000000000000000000005571502707512200236510ustar00rootroot00000000000000#ifndef __ARM_UTIL_H__ #define __ARM_UTIL_H__ #include "cpu_utils.h" int arm_cpu_init( void ); int arm_cpu_finalize( void ); int arm_cpu_get_vendor( char *vendor ); int arm_cpu_get_name( char *name ); int arm_cpu_get_attribute( CPU_attr_e attr, int *value ); int arm_cpu_get_attribute_at( CPU_attr_e attr, int loc, int *value ); #endif /* End of __ARM_UTIL_H__ */ papi-papi-7-2-0-t/src/components/sysdetect/cpu.c000066400000000000000000000157171502707512200215710ustar00rootroot00000000000000/** * @file cpu.c * @author Giuseppe Congiu * gcongiu@icl.utk.edu * * @ingroup papi_components * * @brief * Scan functions for all Vendor CPU systems. */ #include #include #include #include "sysdetect.h" #include "cpu.h" #include "cpu_utils.h" #define CPU_CALL(call, err_handle) do { \ int _status = (call); \ if (_status != CPU_SUCCESS) { \ SUBDBG("error: function %s failed with error %d\n", #call, _status); \ err_handle; \ } \ } while(0) static void fill_cpu_info( _sysdetect_cpu_info_t *info ) { CPU_CALL(cpu_get_name(info->name), strcpy(info->name, "UNKNOWN")); CPU_CALL(cpu_get_attribute(CPU_ATTR__CPUID_FAMILY, &info->cpuid_family), info->cpuid_family = -1); CPU_CALL(cpu_get_attribute(CPU_ATTR__CPUID_MODEL, &info->cpuid_model), info->cpuid_model = -1); CPU_CALL(cpu_get_attribute(CPU_ATTR__CPUID_STEPPING, &info->cpuid_stepping), info->cpuid_stepping = -1); CPU_CALL(cpu_get_attribute( CPU_ATTR__NUM_SOCKETS, &info->sockets ), info->sockets = -1); CPU_CALL(cpu_get_attribute( CPU_ATTR__NUM_NODES, &info->numas ), info->numas = -1); CPU_CALL(cpu_get_attribute(CPU_ATTR__NUM_CORES, &info->cores), info->cores = -1); CPU_CALL(cpu_get_attribute(CPU_ATTR__NUM_THREADS, &info->threads), info->threads = -1); int cache_levels; CPU_CALL(cpu_get_attribute(CPU_ATTR__CACHE_MAX_NUM_LEVELS, &cache_levels), cache_levels = 0); int level, a, b, c; for (level = 1; level <= 
cache_levels; ++level) { CPU_CALL(cpu_get_attribute_at(CPU_ATTR__CACHE_INST_PRESENT, level, &a), a = 0); CPU_CALL(cpu_get_attribute_at(CPU_ATTR__CACHE_DATA_PRESENT, level, &b), b = 0); CPU_CALL(cpu_get_attribute_at(CPU_ATTR__CACHE_UNIF_PRESENT, level, &c), c = 0); if (!(a || b || c)) { /* No caches at this level */ break; } int *num_caches = &info->clevel[level-1].num_caches; if (a) { info->clevel[level-1].cache[*num_caches].type = PAPI_MH_TYPE_INST; CPU_CALL(cpu_get_attribute_at(CPU_ATTR__CACHE_INST_TOT_SIZE, level, &info->clevel[level-1].cache[*num_caches].size), info->clevel[level-1].cache[*num_caches].size = 0); CPU_CALL(cpu_get_attribute_at(CPU_ATTR__CACHE_INST_LINE_SIZE, level, &info->clevel[level-1].cache[*num_caches].line_size), info->clevel[level-1].cache[*num_caches].line_size = 0); CPU_CALL(cpu_get_attribute_at(CPU_ATTR__CACHE_INST_NUM_LINES, level, &info->clevel[level-1].cache[*num_caches].num_lines), info->clevel[level-1].cache[*num_caches].num_lines = 0); CPU_CALL(cpu_get_attribute_at(CPU_ATTR__CACHE_INST_ASSOCIATIVITY, level, &info->clevel[level-1].cache[*num_caches].associativity), info->clevel[level-1].cache[*num_caches].associativity = 0); ++(*num_caches); } if (b) { info->clevel[level-1].cache[*num_caches].type = PAPI_MH_TYPE_DATA; CPU_CALL(cpu_get_attribute_at(CPU_ATTR__CACHE_DATA_TOT_SIZE, level, &info->clevel[level-1].cache[*num_caches].size), info->clevel[level-1].cache[*num_caches].size = 0); CPU_CALL(cpu_get_attribute_at(CPU_ATTR__CACHE_DATA_LINE_SIZE, level, &info->clevel[level-1].cache[*num_caches].line_size), info->clevel[level-1].cache[*num_caches].line_size = 0); CPU_CALL(cpu_get_attribute_at(CPU_ATTR__CACHE_DATA_NUM_LINES, level, &info->clevel[level-1].cache[*num_caches].num_lines), info->clevel[level-1].cache[*num_caches].num_lines = 0); CPU_CALL(cpu_get_attribute_at(CPU_ATTR__CACHE_DATA_ASSOCIATIVITY, level, &info->clevel[level-1].cache[*num_caches].associativity), info->clevel[level-1].cache[*num_caches].associativity = 0); 
++(*num_caches); } if (c) { info->clevel[level-1].cache[*num_caches].type = PAPI_MH_TYPE_UNIFIED; CPU_CALL(cpu_get_attribute_at(CPU_ATTR__CACHE_UNIF_TOT_SIZE, level, &info->clevel[level-1].cache[*num_caches].size), info->clevel[level-1].cache[*num_caches].size = 0); CPU_CALL(cpu_get_attribute_at(CPU_ATTR__CACHE_UNIF_LINE_SIZE, level, &info->clevel[level-1].cache[*num_caches].line_size), info->clevel[level-1].cache[*num_caches].line_size = 0); CPU_CALL(cpu_get_attribute_at(CPU_ATTR__CACHE_UNIF_NUM_LINES, level, &info->clevel[level-1].cache[*num_caches].num_lines), info->clevel[level-1].cache[*num_caches].num_lines= 0); CPU_CALL(cpu_get_attribute_at(CPU_ATTR__CACHE_UNIF_ASSOCIATIVITY, level, &info->clevel[level-1].cache[*num_caches].associativity), info->clevel[level-1].cache[*num_caches].associativity = 0); ++(*num_caches); } } for (a = 0; a < info->numas; ++a) { CPU_CALL(cpu_get_attribute_at(CPU_ATTR__NUMA_MEM_SIZE, a, &info->numa_memory[a]), info->numa_memory[a] = 0); } for (a = 0; a < info->threads * info->cores * info->sockets; ++a) { int status = cpu_get_attribute_at(CPU_ATTR__HWTHREAD_NUMA_AFFINITY, a, &info->numa_affinity[a]); if( CPU_SUCCESS != status ){ SUBDBG("Warning: core ids might not be contiguous.\n"); } } info->cache_levels = level; } void open_cpu_dev_type( _sysdetect_dev_type_info_t *dev_type_info ) { memset(dev_type_info, 0, sizeof(*dev_type_info)); dev_type_info->id = PAPI_DEV_TYPE_ID__CPU; CPU_CALL(cpu_get_vendor(dev_type_info->vendor), strcpy(dev_type_info->vendor, "UNKNOWN")); CPU_CALL(cpu_get_attribute(CPU_ATTR__VENDOR_ID, &dev_type_info->vendor_id), dev_type_info->vendor_id = -1); strcpy(dev_type_info->status, "Device Initialized"); dev_type_info->num_devices = 1; _sysdetect_cpu_info_t *arr = papi_calloc(1, sizeof(*arr)); fill_cpu_info(arr); dev_type_info->dev_info_arr = (_sysdetect_dev_info_u *)arr; } void close_cpu_dev_type( _sysdetect_dev_type_info_t *dev_type_info ) { papi_free(dev_type_info->dev_info_arr); } 
papi-papi-7-2-0-t/src/components/sysdetect/cpu.h000066400000000000000000000003171502707512200215640ustar00rootroot00000000000000#ifndef __CPU_H__ #define __CPU_H__ void open_cpu_dev_type( _sysdetect_dev_type_info_t *dev_type_info ); void close_cpu_dev_type( _sysdetect_dev_type_info_t *dev_type_info ); #endif /* End of __CPU_H__ */ papi-papi-7-2-0-t/src/components/sysdetect/cpu_utils.c000066400000000000000000000132541502707512200230030ustar00rootroot00000000000000/** * @file cpu_utils.c * @author Giuseppe Congiu * gcongiu@icl.utk.edu * * @brief * Returns information about CPUs */ #include #include #include #include #include "sysdetect.h" #include "cpu_utils.h" #if defined(__x86_64__) || defined(__amd64__) #include "x86_cpu_utils.h" #elif defined(__powerpc__) #include "powerpc_cpu_utils.h" #elif defined(__arm__) || defined(__aarch64__) #include "arm_cpu_utils.h" #endif #include "os_cpu_utils.h" int cpu_get_vendor( char *vendor ) { #if defined(__x86_64__) || defined(__amd64__) return x86_cpu_get_vendor(vendor); #elif defined(__powerpc__) return powerpc_cpu_get_vendor(vendor); #elif defined(__arm__) || defined(__aarch64__) return arm_cpu_get_vendor(vendor); #endif return os_cpu_get_vendor(vendor); } int cpu_get_name( char *name ) { #if defined(__x86_64__) || defined(__amd64__) return x86_cpu_get_name(name); #elif defined(__powerpc__) return powerpc_cpu_get_name(name); #elif defined(__arm__) || defined(__aarch64__) return arm_cpu_get_name(name); #endif return os_cpu_get_name(name); } int cpu_get_attribute( CPU_attr_e attr, int *value ) { #if defined(__x86_64__) || defined(__amd64__) return x86_cpu_get_attribute(attr, value); #elif defined(__powerpc__) return powerpc_cpu_get_attribute(attr, value); #elif defined(__arm__) || defined(__aarch64__) return arm_cpu_get_attribute(attr, value); #endif return os_cpu_get_attribute(attr, value); } int cpu_get_attribute_at( CPU_attr_e attr, int loc, int *value ) { #if defined(__x86_64__) || defined(__amd64__) return 
x86_cpu_get_attribute_at(attr, loc, value); #elif defined(__powerpc__) return powerpc_cpu_get_attribute_at(attr, loc, value); #elif defined(__arm__) || defined(__aarch64__) return arm_cpu_get_attribute_at(attr, loc, value); #endif return os_cpu_get_attribute_at(attr, loc, value); } static int get_cache_level( _sysdetect_cache_level_info_t *clevel_ptr, int type ) { int i; for (i = 0; i < clevel_ptr->num_caches; ++i) { if (clevel_ptr->cache[i].type == type) return i + 1; } return 0; } int cpu_get_cache_info( CPU_attr_e attr, int level, _sysdetect_cache_level_info_t *clevel_ptr, int *value ) { int status = CPU_SUCCESS; int i; *value = 0; if (level >= PAPI_MAX_MEM_HIERARCHY_LEVELS) return CPU_ERROR; switch(attr) { case CPU_ATTR__CACHE_INST_PRESENT: if (get_cache_level(&clevel_ptr[level-1], PAPI_MH_TYPE_INST)) { *value = 1; } break; case CPU_ATTR__CACHE_DATA_PRESENT: if (get_cache_level(&clevel_ptr[level-1], PAPI_MH_TYPE_DATA)) { *value = 1; } break; case CPU_ATTR__CACHE_UNIF_PRESENT: if (get_cache_level(&clevel_ptr[level-1], PAPI_MH_TYPE_UNIFIED)) { *value = 1; } break; case CPU_ATTR__CACHE_INST_TOT_SIZE: if ((i = get_cache_level(&clevel_ptr[level-1], PAPI_MH_TYPE_INST))) { *value = clevel_ptr[level-1].cache[i-1].size; } break; case CPU_ATTR__CACHE_INST_LINE_SIZE: if ((i = get_cache_level(&clevel_ptr[level-1], PAPI_MH_TYPE_INST))) { *value = clevel_ptr[level-1].cache[i-1].line_size; } break; case CPU_ATTR__CACHE_INST_NUM_LINES: if ((i = get_cache_level(&clevel_ptr[level-1], PAPI_MH_TYPE_INST))) { *value = clevel_ptr[level-1].cache[i-1].num_lines; } break; case CPU_ATTR__CACHE_INST_ASSOCIATIVITY: if ((i = get_cache_level(&clevel_ptr[level-1], PAPI_MH_TYPE_INST))) { *value = clevel_ptr[level-1].cache[i-1].associativity; } break; case CPU_ATTR__CACHE_DATA_TOT_SIZE: if ((i = get_cache_level(&clevel_ptr[level-1], PAPI_MH_TYPE_DATA))) { *value = clevel_ptr[level-1].cache[i-1].size; } break; case CPU_ATTR__CACHE_DATA_LINE_SIZE: if ((i = get_cache_level(&clevel_ptr[level-1], 
PAPI_MH_TYPE_DATA))) { *value = clevel_ptr[level-1].cache[i-1].line_size; } break; case CPU_ATTR__CACHE_DATA_NUM_LINES: if ((i = get_cache_level(&clevel_ptr[level-1], PAPI_MH_TYPE_DATA))) { *value = clevel_ptr[level-1].cache[i-1].num_lines; } break; case CPU_ATTR__CACHE_DATA_ASSOCIATIVITY: if ((i = get_cache_level(&clevel_ptr[level-1], PAPI_MH_TYPE_DATA))) { *value = clevel_ptr[level-1].cache[i-1].associativity; } break; case CPU_ATTR__CACHE_UNIF_TOT_SIZE: if ((i = get_cache_level(&clevel_ptr[level-1], PAPI_MH_TYPE_UNIFIED))) { *value = clevel_ptr[level-1].cache[i-1].size; } break; case CPU_ATTR__CACHE_UNIF_LINE_SIZE: if ((i = get_cache_level(&clevel_ptr[level-1], PAPI_MH_TYPE_UNIFIED))) { *value = clevel_ptr[level-1].cache[i-1].line_size; } break; case CPU_ATTR__CACHE_UNIF_NUM_LINES: if ((i = get_cache_level(&clevel_ptr[level-1], PAPI_MH_TYPE_UNIFIED))) { *value = clevel_ptr[level-1].cache[i-1].num_lines; } break; case CPU_ATTR__CACHE_UNIF_ASSOCIATIVITY: if ((i = get_cache_level(&clevel_ptr[level-1], PAPI_MH_TYPE_UNIFIED))) { *value = clevel_ptr[level-1].cache[i-1].associativity; } break; default: status = CPU_ERROR; } return status; } papi-papi-7-2-0-t/src/components/sysdetect/cpu_utils.h000066400000000000000000000026751502707512200230150ustar00rootroot00000000000000#ifndef __CPU_UTILS_H__ #define __CPU_UTILS_H__ #define CPU_SUCCESS 0 #define CPU_ERROR -1 typedef enum { CPU_ATTR__NUM_SOCKETS, CPU_ATTR__NUM_NODES, CPU_ATTR__NUM_CORES, CPU_ATTR__NUM_THREADS, CPU_ATTR__VENDOR_ID, CPU_ATTR__CPUID_FAMILY, CPU_ATTR__CPUID_MODEL, CPU_ATTR__CPUID_STEPPING, /* Cache Attributes */ CPU_ATTR__CACHE_MAX_NUM_LEVELS, CPU_ATTR__CACHE_INST_PRESENT, CPU_ATTR__CACHE_DATA_PRESENT, CPU_ATTR__CACHE_UNIF_PRESENT, CPU_ATTR__CACHE_INST_TOT_SIZE, CPU_ATTR__CACHE_INST_LINE_SIZE, CPU_ATTR__CACHE_INST_NUM_LINES, CPU_ATTR__CACHE_INST_ASSOCIATIVITY, CPU_ATTR__CACHE_DATA_TOT_SIZE, CPU_ATTR__CACHE_DATA_LINE_SIZE, CPU_ATTR__CACHE_DATA_NUM_LINES, CPU_ATTR__CACHE_DATA_ASSOCIATIVITY, 
CPU_ATTR__CACHE_UNIF_TOT_SIZE, CPU_ATTR__CACHE_UNIF_LINE_SIZE, CPU_ATTR__CACHE_UNIF_NUM_LINES, CPU_ATTR__CACHE_UNIF_ASSOCIATIVITY, /* Hardware Thread Affinity Attributes */ CPU_ATTR__HWTHREAD_NUMA_AFFINITY, /* Memory Attributes */ CPU_ATTR__NUMA_MEM_SIZE, } CPU_attr_e; int cpu_init( void ); int cpu_finalize( void ); int cpu_get_vendor( char *vendor ); int cpu_get_name( char *name ); int cpu_get_attribute( CPU_attr_e attr, int *value ); int cpu_get_attribute_at( CPU_attr_e attr, int loc, int *value ); int cpu_get_cache_info( CPU_attr_e attr, int level, _sysdetect_cache_level_info_t *clevel_ptr, int *value ); #endif /* End of __CPU_UTILS_H__ */ papi-papi-7-2-0-t/src/components/sysdetect/linux_cpu_utils.c000066400000000000000000000635021502707512200242230ustar00rootroot00000000000000#include #include #include #include #include #include #include #include #include "sysdetect.h" #include "linux_cpu_utils.h" #define VENDOR_UNKNOWN -1 #define VENDOR_UNINITED 0 #define VENDOR_INTEL_X86 1 #define VENDOR_AMD 2 #define VENDOR_IBM 3 #define VENDOR_CRAY 4 #define VENDOR_MIPS 8 #define VENDOR_INTEL_IA64 9 #define VENDOR_ARM_ARM 65 #define VENDOR_ARM_BROADCOM 66 #define VENDOR_ARM_CAVIUM 67 #define VENDOR_ARM_FUJITSU 70 #define VENDOR_ARM_HISILICON 72 #define VENDOR_ARM_APM 80 #define VENDOR_ARM_QUALCOMM 81 #define _PATH_SYS_SYSTEM "/sys/devices/system/" #define _PATH_SYS_CPU0 _PATH_SYS_SYSTEM "/cpu/cpu0" static int get_topology_info( const char *key, int *value ); static int get_naming_info( const char *key, char *value ); static int get_versioning_info( const char *key, int *value ); static int get_cache_info( CPU_attr_e attr, int level, int *value ); static int get_cache_level( const char *dirname, int *value ); static int get_cache_type( const char *dirname, int *value ); static int get_cache_size( const char *dirname, int *value ); static int get_cache_line_size( const char *dirname, int *value ); static int get_cache_associativity( const char *dirname, int *value ); static 
int get_cache_partition_count( const char *dirname, int *value ); static int get_cache_set_count( const char *dirname, int *value ); static int get_mem_info( int node, int *value ); static int get_thread_affinity( int thread, int *value ); static int path_sibling( const char *path, ... ); static char *search_cpu_info( FILE *fp, const char *key ); static int path_exist( const char *path, ... ); static void decode_vendor_string( char *s, int *vendor ); static int get_vendor_id( void ); int linux_cpu_get_vendor( char *vendor ) { const char *namekey_x86 = "vendor_id"; const char *namekey_ia64 = "vendor"; const char *namekey_ibm = "platform"; const char *namekey_mips = "system type"; const char *namekey_arm = "CPU implementer"; const char *namekey_dum = "none"; const char *namekey_ptr = NULL; int vendor_id = get_vendor_id(); if (vendor_id == VENDOR_INTEL_X86 || vendor_id == VENDOR_AMD) { namekey_ptr = namekey_x86; } else if (vendor_id == VENDOR_INTEL_IA64) { namekey_ptr = namekey_ia64; } else if (vendor_id == VENDOR_IBM) { namekey_ptr = namekey_ibm; } else if (vendor_id == VENDOR_MIPS) { namekey_ptr = namekey_mips; } else if (vendor_id == VENDOR_ARM_ARM || vendor_id == VENDOR_ARM_BROADCOM || vendor_id == VENDOR_ARM_CAVIUM || vendor_id == VENDOR_ARM_FUJITSU || vendor_id == VENDOR_ARM_HISILICON || vendor_id == VENDOR_ARM_APM || vendor_id == VENDOR_ARM_QUALCOMM) { namekey_ptr = namekey_arm; } else { namekey_ptr = namekey_dum; } return get_naming_info(namekey_ptr, vendor); } int linux_cpu_get_name( char *name ) { const char *namekey_x86 = "model name"; const char *namekey_ibm = "model"; const char *namekey_arm = "model name"; const char *namekey_dum = "none"; const char *namekey_ptr = NULL; int vendor_id = get_vendor_id(); if (vendor_id == VENDOR_INTEL_X86 || vendor_id == VENDOR_AMD) { namekey_ptr = namekey_x86; } else if (vendor_id == VENDOR_IBM) { namekey_ptr = namekey_ibm; } else if (vendor_id == VENDOR_ARM_ARM || vendor_id == VENDOR_ARM_BROADCOM || vendor_id == 
VENDOR_ARM_CAVIUM || vendor_id == VENDOR_ARM_FUJITSU || vendor_id == VENDOR_ARM_HISILICON || vendor_id == VENDOR_ARM_APM || vendor_id == VENDOR_ARM_QUALCOMM) { namekey_ptr = namekey_arm; } else { namekey_ptr = namekey_dum; } return get_naming_info(namekey_ptr, name); } int linux_cpu_get_attribute( CPU_attr_e attr, int *value ) { int status = CPU_SUCCESS; #define TOPOKEY_NUM_KEY 4 #define VERKEY_NUM_KEY 4 int topo_idx = TOPOKEY_NUM_KEY; int ver_idx = VERKEY_NUM_KEY; const char *topokey[TOPOKEY_NUM_KEY] = { "sockets", "nodes", "threads", "cores", }; const char *verkey_x86[VERKEY_NUM_KEY] = { "cpu family", /* cpuid_family */ "model", /* cpuid_model */ "stepping", /* cpuid_stepping */ "vendor_id", /* vendor id */ }; const char *verkey_ibm[VERKEY_NUM_KEY] = { "none", /* cpuid_family */ "none", /* cpuid_model */ "revision", /* cpuid_stepping */ "vendor_id", /* vendor id */ }; const char *verkey_arm[VERKEY_NUM_KEY] = { "CPU architecture", /* cpuid_family */ "CPU part", /* cpuid_model */ "CPU variant", /* cpuid_stepping */ "CPU implementer", /* vendor id */ }; const char *verkey_dum[VERKEY_NUM_KEY] = { "none", "none", "none", "none", }; const char **verkey_ptr = NULL; int vendor_id = get_vendor_id(); if (vendor_id == VENDOR_INTEL_X86 || vendor_id == VENDOR_AMD) { verkey_ptr = verkey_x86; } else if (vendor_id == VENDOR_IBM) { verkey_ptr = verkey_ibm; } else if (vendor_id == VENDOR_ARM_ARM || vendor_id == VENDOR_ARM_BROADCOM || vendor_id == VENDOR_ARM_CAVIUM || vendor_id == VENDOR_ARM_FUJITSU || vendor_id == VENDOR_ARM_HISILICON || vendor_id == VENDOR_ARM_APM || vendor_id == VENDOR_ARM_QUALCOMM) { verkey_ptr = verkey_arm; } else { verkey_ptr = verkey_dum; } switch(attr) { case CPU_ATTR__NUM_SOCKETS: --topo_idx; // fall through case CPU_ATTR__NUM_NODES: --topo_idx; // fall through case CPU_ATTR__NUM_THREADS: --topo_idx; // fall through case CPU_ATTR__NUM_CORES: --topo_idx; status = get_topology_info(topokey[topo_idx], value); break; case CPU_ATTR__CPUID_FAMILY: --ver_idx; // 
fall through case CPU_ATTR__CPUID_MODEL: --ver_idx; // fall through case CPU_ATTR__CPUID_STEPPING: --ver_idx; // fall through case CPU_ATTR__VENDOR_ID: --ver_idx; status = get_versioning_info(verkey_ptr[ver_idx], value); break; case CPU_ATTR__CACHE_MAX_NUM_LEVELS: *value = PAPI_MAX_MEM_HIERARCHY_LEVELS; break; default: status = CPU_ERROR; } return status; } int linux_cpu_get_attribute_at( CPU_attr_e attr, int loc, int *value ) { int status = CPU_SUCCESS; switch(attr) { case CPU_ATTR__CACHE_INST_PRESENT: case CPU_ATTR__CACHE_DATA_PRESENT: case CPU_ATTR__CACHE_UNIF_PRESENT: case CPU_ATTR__CACHE_INST_TOT_SIZE: case CPU_ATTR__CACHE_INST_LINE_SIZE: case CPU_ATTR__CACHE_INST_NUM_LINES: case CPU_ATTR__CACHE_INST_ASSOCIATIVITY: case CPU_ATTR__CACHE_DATA_TOT_SIZE: case CPU_ATTR__CACHE_DATA_LINE_SIZE: case CPU_ATTR__CACHE_DATA_NUM_LINES: case CPU_ATTR__CACHE_DATA_ASSOCIATIVITY: case CPU_ATTR__CACHE_UNIF_TOT_SIZE: case CPU_ATTR__CACHE_UNIF_LINE_SIZE: case CPU_ATTR__CACHE_UNIF_NUM_LINES: case CPU_ATTR__CACHE_UNIF_ASSOCIATIVITY: status = get_cache_info(attr, loc, value); break; case CPU_ATTR__NUMA_MEM_SIZE: status = get_mem_info(loc, value); break; case CPU_ATTR__HWTHREAD_NUMA_AFFINITY: status = get_thread_affinity(loc, value); break; default: status = CPU_ERROR; } return status; } int linux_cpu_set_affinity( int cpu ) { cpu_set_t cpuset; CPU_ZERO(&cpuset); CPU_SET(cpu, &cpuset); return sched_setaffinity(0, sizeof(cpuset), &cpuset); } int linux_cpu_get_num_supported( void ) { return sysconf(_SC_NPROCESSORS_CONF); } static cpu_set_t saved_affinity; int linux_cpu_store_affinity( void ) { if (!CPU_COUNT(&saved_affinity)) return sched_getaffinity(0, sizeof(cpu_set_t), &saved_affinity); return CPU_SUCCESS; } int linux_cpu_load_affinity( void ) { return sched_setaffinity(0, sizeof(cpu_set_t), &saved_affinity); } int get_topology_info( const char *key, int *val ) { int status = CPU_SUCCESS; static int sockets, nodes, threads, cores; if (!strcmp("sockets", key) && sockets) { *val = 
sockets; return status; } else if (!strcmp("nodes", key) && nodes) { *val = nodes; return status; } else if (!strcmp("threads", key) && threads) { *val = threads; return status; } else if (!strcmp("cores", key) && cores) { *val = cores; return status; } /* Query topology information once and store results for later */ int totalcpus = 0; while (path_exist(_PATH_SYS_SYSTEM "/cpu/cpu%d", totalcpus)) { ++totalcpus; } if (path_exist(_PATH_SYS_CPU0 "/topology/thread_siblings")) { threads = path_sibling(_PATH_SYS_CPU0 "/topology/thread_siblings"); } if (path_exist(_PATH_SYS_CPU0 "/topology/core_siblings")) { cores = path_sibling(_PATH_SYS_CPU0 "/topology/core_siblings") / threads; } sockets = totalcpus / cores / threads; while (path_exist(_PATH_SYS_SYSTEM "/node/node%d", nodes)) ++nodes; if (!strcmp("sockets", key)) { *val = sockets; } else if (!strcmp("nodes", key)) { *val = (nodes == 0) ? nodes = 1 : nodes; } else if (!strcmp("cores", key)) { *val = cores; } else if (!strcmp("threads", key)) { *val = threads; } else { status = CPU_ERROR; } return status; } int get_naming_info( const char *key, char *val ) { if (!strcmp(key, "none")) { strcpy(val, "UNKNOWN"); return CPU_SUCCESS; } FILE *fp = fopen("/proc/cpuinfo", "r"); if (fp == NULL) { return CPU_ERROR; } char *str = search_cpu_info(fp, key); if (str) { strncpy(val, str, PAPI_MAX_STR_LEN); val[PAPI_MAX_STR_LEN - 1] = 0; } fclose(fp); return CPU_SUCCESS; } int get_versioning_info( const char *key, int *val ) { if (!strcmp(key, "none")) { *val = -1; return CPU_SUCCESS; } if (!strcmp(key, "vendor_id") || !strcmp(key, "CPU implementer")) { *val = get_vendor_id(); return CPU_SUCCESS; } FILE *fp = fopen("/proc/cpuinfo", "r"); if (fp == NULL) { return CPU_ERROR; } char *str = search_cpu_info(fp, key); if (str) { /* FIXME: this is an ugly hack to handle old (prior to Linux 3.19) ARM64 */ if (strcmp(key, "CPU architecture") == 0) { /* Prior to version 3.19 'CPU architecture' is always 'AArch64' * so we convert it to '8', which is 
the value since 3.19. */ if (strstr(str, "AArch64")) *val = 8; else *val = strtol(str, NULL, 10); /* Old Fallbacks if the above didn't work (e.g. Raspberry Pi) */ if (*val < 0) { str = search_cpu_info(fp, "Processor"); if (str) { char *t = strchr(str, '('); int tmp = *(t + 2) - '0'; *val = tmp; } else { /* Try the model name and look inside of parens */ str = search_cpu_info(fp, "model name"); if (str) { char *t = strchr(str, '('); int tmp = *(t + 2) - '0'; *val = tmp; } } } } else { sscanf(str, "%x", val); } } fclose(fp); return CPU_SUCCESS; } static _sysdetect_cache_level_info_t clevel[PAPI_MAX_MEM_HIERARCHY_LEVELS]; int get_cache_info( CPU_attr_e attr, int level, int *val ) { int type = 0; int size, line_size, associativity, sets; DIR *dir; struct dirent *d; int max_level = 0; int *level_count, level_index; static _sysdetect_cache_level_info_t *L; if (L) { return cpu_get_cache_info(attr, level, L, val); } L = clevel; /* open Linux cache dir */ /* assume all CPUs same as cpu0. */ /* Not necessarily a good assumption */ dir = opendir("/sys/devices/system/cpu/cpu0/cache"); if (dir == NULL) { goto fn_fail; } while(1) { d = readdir(dir); if (d == NULL) break; if (strncmp(d->d_name, "index", 5)) continue; if (get_cache_level(d->d_name, &level_index)) { goto fn_fail; } if (get_cache_type(d->d_name, &type)) { goto fn_fail; } level_count = &L[level_index].num_caches; L[level_index].cache[*level_count].type = type; if (get_cache_size(d->d_name, &size)) { goto fn_fail; } /* Linux reports in kB, PAPI expects in Bytes */ L[level_index].cache[*level_count].size = size * 1024; if (get_cache_line_size(d->d_name, &line_size)) { goto fn_fail; } L[level_index].cache[*level_count].line_size = line_size; if (get_cache_associativity(d->d_name, &associativity)) { goto fn_fail; } L[level_index].cache[*level_count].associativity = associativity; int partitions; if (get_cache_partition_count(d->d_name, &partitions)) { goto fn_fail; } if (get_cache_set_count(d->d_name, &sets)) { goto 
fn_fail; } L[level_index].cache[*level_count].num_lines = (sets * associativity * partitions); if (((size * 1024) / line_size / associativity) != sets) { MEMDBG("Warning! sets %d != expected %d\n", sets, ((size * 1024) / line_size / associativity)); } if (level > max_level) { max_level = level; } if (level >= PAPI_MAX_MEM_HIERARCHY_LEVELS) { MEMDBG("Exceeded maximum cache level %d\n", PAPI_MAX_MEM_HIERARCHY_LEVELS); break; } ++(*level_count); } closedir(dir); return cpu_get_cache_info(attr, level, L, val); fn_fail: closedir(dir); return CPU_ERROR; } int get_cache_level( const char *dirname, int *value ) { char filename[BUFSIZ]; int level_index; sprintf(filename, "/sys/devices/system/cpu/cpu0/cache/%s/level", dirname); FILE *fff = fopen(filename,"r"); if (fff == NULL) { MEMDBG("Cannot open level.\n"); return CPU_ERROR; } int result = fscanf(fff, "%d", &level_index); fclose(fff); if (result != 1) { MEMDBG("Could not read cache level\n"); return CPU_ERROR; } /* Index arrays from 0 */ level_index -= 1; *value = level_index; return CPU_SUCCESS; } int get_cache_type( const char *dirname, int *value ) { char filename[BUFSIZ]; char type_string[BUFSIZ]; char buffer[BUFSIZ]; int type = PAPI_MH_TYPE_EMPTY; sprintf(filename, "/sys/devices/system/cpu/cpu0/cache/%s/type", dirname); FILE *fff = fopen(filename, "r"); if (fff == NULL) { MEMDBG("Cannot open type\n"); return CPU_ERROR; } char *result = fgets(buffer, BUFSIZ, fff); fclose(fff); if (result == NULL) { MEMDBG("Could not read cache type\n"); return CPU_ERROR; } sscanf(buffer,"%s",type_string); if (!strcmp(type_string, "Data")) { type = PAPI_MH_TYPE_DATA; } if (!strcmp(type_string, "Instruction")) { type = PAPI_MH_TYPE_INST; } if (!strcmp(type_string, "Unified")) { type = PAPI_MH_TYPE_UNIFIED; } *value = type; return CPU_SUCCESS; } int get_cache_size( const char *dirname, int *value ) { char filename[BUFSIZ]; int size; sprintf(filename, "/sys/devices/system/cpu/cpu0/cache/%s/size", dirname); FILE *fff = fopen(filename, 
"r"); if (fff == NULL) { MEMDBG("Cannot open size\n"); return CPU_ERROR; } int result = fscanf(fff, "%d", &size); fclose(fff); if (result != 1) { MEMDBG("Could not read cache size\n"); return CPU_ERROR; } *value = size; return CPU_SUCCESS; } int get_cache_line_size( const char *dirname, int *value ) { char filename[BUFSIZ]; int line_size; sprintf(filename, "/sys/devices/system/cpu/cpu0/cache/%s/coherency_line_size", dirname); FILE *fff = fopen(filename, "r"); if (fff == NULL) { MEMDBG("Cannot open linesize\n"); return CPU_ERROR; } int result = fscanf(fff, "%d", &line_size); fclose(fff); if (result != 1) { MEMDBG("Could not read cache line-size\n"); return CPU_ERROR; } *value = line_size; return CPU_SUCCESS; } int get_cache_associativity( const char *dirname, int *value ) { char filename[BUFSIZ]; int associativity; sprintf(filename, "/sys/devices/system/cpu/cpu0/cache/%s/ways_of_associativity", dirname); FILE *fff = fopen(filename, "r"); if (fff == NULL) { MEMDBG("Cannot open associativity\n"); return CPU_ERROR; } int result = fscanf(fff, "%d", &associativity); fclose(fff); if (result != 1) { MEMDBG("Could not read cache associativity\n"); return CPU_ERROR; } *value = associativity; return CPU_SUCCESS; } int get_cache_partition_count( const char *dirname, int *value ) { char filename[BUFSIZ]; int partitions; sprintf(filename, "/sys/devices/system/cpu/cpu0/cache/%s/physical_line_partition", dirname); FILE *fff = fopen(filename, "r"); if (fff == NULL) { MEMDBG("Cannot open partitions\n"); return CPU_ERROR; } int result = fscanf(fff, "%d", &partitions); fclose(fff); if (result != 1) { MEMDBG("Could not read partitions count\n"); return CPU_ERROR; } *value = partitions; return CPU_SUCCESS; } int get_cache_set_count( const char *dirname, int *value ) { char filename[BUFSIZ]; int sets; sprintf(filename, "/sys/devices/system/cpu/cpu0/cache/%s/number_of_sets", dirname); FILE *fff = fopen(filename, "r"); if (fff == NULL) { MEMDBG("Cannot open sets\n"); return CPU_ERROR; } 
int result = fscanf(fff, "%d", &sets); fclose(fff); if (result != 1) { MEMDBG("Could not read cache sets\n"); return CPU_ERROR; } *value = sets; return CPU_SUCCESS; } int get_mem_info( int node, int *val ) { if (path_exist(_PATH_SYS_SYSTEM "/node/node%d", node)) { char filename[PAPI_MAX_STR_LEN]; sprintf(filename, _PATH_SYS_SYSTEM "/node/node%d/meminfo", node); FILE *fp = fopen(filename, "r"); if (!fp) { return CPU_ERROR; } char search_str[PAPI_MIN_STR_LEN]; sprintf(search_str, "Node %d MemTotal", node); char *str = search_cpu_info(fp, search_str); if (str) { sprintf(search_str, "%s", str); int len = strlen(search_str); search_str[len-3] = '\0'; /* Remove trailing "KB" */ *val = atoi(search_str); } fclose(fp); } return CPU_SUCCESS; } int get_thread_affinity( int thread, int *val ) { if (!path_exist(_PATH_SYS_SYSTEM "/cpu/cpu0/node0")) { *val = 0; return CPU_SUCCESS; } // If gaps exist in the core numbering, the caller of this // function will likely inquire about cpu-ids that do not // exist in the system (i.e., the gaps). if( !path_exist(_PATH_SYS_SYSTEM "/cpu/cpu%d", thread) ){ *val = -1; return CPU_ERROR; } int i = 0; while (!path_exist(_PATH_SYS_SYSTEM "/cpu/cpu%d/node%d", thread, i)) { ++i; } *val = i; return CPU_SUCCESS; } static char pathbuf[PATH_MAX] = "/"; FILE * xfopen( const char *path, const char *mode ) { FILE *fd = fopen(path, mode); return fd; } FILE * path_vfopen( const char *mode, const char *path, va_list ap ) { vsnprintf( pathbuf, sizeof ( pathbuf ), path, ap ); return xfopen( pathbuf, mode ); } int path_sibling( const char *path, ... 
) { int c; long n; int result = CPU_SUCCESS; char s[2]; FILE *fp; va_list ap; va_start( ap, path ); fp = path_vfopen( "r", path, ap ); va_end( ap ); while ((c = fgetc(fp)) != EOF) { if (isxdigit(c)) { s[0] = (char) c; s[1] = '\0'; for (n = strtol(s, NULL, 16); n > 0; n /= 2) { if (n % 2) result++; } } } fclose(fp); return result; } char * search_cpu_info( FILE * f, const char *search_str ) { static char line[PAPI_HUGE_STR_LEN] = ""; char *s, *start = NULL; rewind(f); while (fgets(line, PAPI_HUGE_STR_LEN,f) != NULL) { s=strstr(line, search_str); if (s != NULL) { /* skip all characters in line up to the colon */ /* and then spaces */ s=strchr(s, ':'); if (s == NULL) break; s++; while (isspace(*s)) { s++; } start = s; /* Find and clear newline */ s=strrchr(start, '\n'); if (s != NULL) *s = 0; break; } } return start; } int path_exist( const char *path, ... ) { va_list ap; va_start(ap, path); vsnprintf(pathbuf, sizeof ( pathbuf ), path, ap); va_end(ap); return access(pathbuf, F_OK) == 0; } void decode_vendor_string( char *s, int *vendor ) { if (strcasecmp(s, "GenuineIntel") == 0) *vendor = VENDOR_INTEL_X86; else if ((strcasecmp(s, "AMD") == 0) || (strcasecmp(s, "AuthenticAMD") == 0 )) *vendor = VENDOR_AMD; else if (strcasecmp(s, "IBM") == 0) *vendor = VENDOR_IBM; else if (strcasecmp(s, "Cray") == 0) *vendor = VENDOR_CRAY; else if (strcasecmp(s, "ARM_ARM") == 0) *vendor = VENDOR_ARM_ARM; else if (strcasecmp(s, "ARM_BROADCOM") == 0) *vendor = VENDOR_ARM_BROADCOM; else if (strcasecmp(s, "ARM_CAVIUM") == 0) *vendor = VENDOR_ARM_CAVIUM; else if (strcasecmp(s, "ARM_FUJITSU") == 0) *vendor = VENDOR_ARM_FUJITSU; else if (strcasecmp(s, "ARM_HISILICON") == 0) *vendor = VENDOR_ARM_HISILICON; else if (strcasecmp(s, "ARM_APM") == 0) *vendor = VENDOR_ARM_APM; else if (strcasecmp(s, "ARM_QUALCOMM") == 0) *vendor = VENDOR_ARM_QUALCOMM; else if (strcasecmp(s, "MIPS") == 0) *vendor = VENDOR_MIPS; else if (strcasecmp(s, "SiCortex") == 0) *vendor = VENDOR_MIPS; else *vendor = 
VENDOR_UNKNOWN; } int get_vendor_id( void ) { static int vendor_id; // VENDOR_UNINITED; if (vendor_id != VENDOR_UNINITED) return vendor_id; FILE *fp = fopen("/proc/cpuinfo", "r"); if (fp == NULL) { return CPU_ERROR; } char vendor_string[PAPI_MAX_STR_LEN] = ""; char *s = search_cpu_info(fp, "vendor_id"); if (s) { strncpy(vendor_string, s, PAPI_MAX_STR_LEN); vendor_string[PAPI_MAX_STR_LEN - 1] = 0; } else { s = search_cpu_info(fp, "vendor"); if (s) { strncpy(vendor_string, s, PAPI_MAX_STR_LEN); vendor_string[PAPI_MAX_STR_LEN - 1] = 0; } else { s = search_cpu_info(fp, "system type"); if (s) { strncpy(vendor_string, s, PAPI_MAX_STR_LEN); vendor_string[PAPI_MAX_STR_LEN - 1] = 0; } else { s = search_cpu_info(fp, "platform"); if (s) { if (strcasecmp(s, "pSeries") == 0 || strcasecmp(s, "PowerNV") == 0 || strcasecmp(s, "PowerMac") == 0) { strcpy(vendor_string, "IBM"); } } else { s = search_cpu_info(fp, "CPU implementer"); if (s) { int tmp; sscanf(s, "%x", &tmp); switch(tmp) { case VENDOR_ARM_ARM: strcpy(vendor_string, "ARM_ARM"); break; case VENDOR_ARM_BROADCOM: strcpy(vendor_string, "ARM_BROADCOM"); break; case VENDOR_ARM_CAVIUM: strcpy(vendor_string, "ARM_CAVIUM"); break; case VENDOR_ARM_FUJITSU: strcpy(vendor_string, "ARM_FUJITSU"); break; case VENDOR_ARM_HISILICON: strcpy(vendor_string, "ARM_HISILICON"); break; case VENDOR_ARM_APM: strcpy(vendor_string, "ARM_APM"); break; case VENDOR_ARM_QUALCOMM: strcpy(vendor_string, "ARM_QUALCOMM"); break; default: strcpy(vendor_string, "UNKNOWN"); } } } } } } if (strlen(vendor_string)) { decode_vendor_string(vendor_string, &vendor_id); } fclose(fp); return vendor_id; } papi-papi-7-2-0-t/src/components/sysdetect/linux_cpu_utils.h000066400000000000000000000007541502707512200242300ustar00rootroot00000000000000#ifndef __LINUX_CPU_UTIL_H__ #define __LINUX_CPU_UTIL_H__ #include "cpu_utils.h" int linux_cpu_get_vendor( char *vendor ); int linux_cpu_get_name( char *name ); int linux_cpu_get_attribute( CPU_attr_e attr, int *value ); int 
linux_cpu_get_attribute_at( CPU_attr_e attr, int loc, int *value ); int linux_cpu_set_affinity( int cpu ); int linux_cpu_get_num_supported( void ); int linux_cpu_store_affinity( void ); int linux_cpu_load_affinity( void ); #endif /* End of __LINUX_CPU_UTIL_H__ */ papi-papi-7-2-0-t/src/components/sysdetect/nvidia_gpu.c000066400000000000000000000361131502707512200231200ustar00rootroot00000000000000/** * @file nvidia_gpu.c * @author Giuseppe Congiu * gcongiu@icl.utk.edu * * @ingroup papi_components * * @brief * Scan functions for NVIDIA GPU subsystems. */ #include #include #include #include #include "sysdetect.h" #include "nvidia_gpu.h" #ifdef HAVE_CUDA #include "cuda.h" static void *cuda_dlp = NULL; static CUresult (*cuInitPtr)( unsigned int flags ) = NULL; static CUresult (*cuDeviceGetPtr)( CUdevice *device, int ordinal ) = NULL; static CUresult (*cuDeviceGetNamePtr)( char *name, int len, CUdevice dev ) = NULL; static CUresult (*cuDeviceGetCountPtr)( int *count ) = NULL; static CUresult (*cuDeviceGetAttributePtr)( int *pi, CUdevice_attribute attrib, CUdevice dev ) = NULL; static CUresult (*cuDeviceGetPCIBusIdPtr)( char *bus_id_string, int len, CUdevice dev ) = NULL; #define CU_CALL(call, err_handle) do { \ CUresult _status = (call); \ if (_status != CUDA_SUCCESS) { \ if (_status == CUDA_ERROR_NOT_INITIALIZED) { \ if ((*cuInitPtr)(0) == CUDA_SUCCESS) { \ _status = (call); \ if (_status == CUDA_SUCCESS) { \ break; \ } \ } \ } \ SUBDBG("error: function %s failed with error %d.\n", #call, _status); \ err_handle; \ } \ } while(0) static void fill_dev_info( _sysdetect_gpu_info_u *dev_info, int dev ); static int cuda_is_enabled( void ); static int load_cuda_sym( char *status ); static int unload_cuda_sym( void ); #endif /* HAVE_CUDA */ #ifdef HAVE_NVML #include "nvml.h" static void *nvml_dlp = NULL; static nvmlReturn_t (*nvmlInitPtr)( void ); static nvmlReturn_t (*nvmlDeviceGetCountPtr)( unsigned int *deviceCount ) = NULL; static nvmlReturn_t 
(*nvmlDeviceGetHandleByPciBusIdPtr)( const char *bus_id_str, nvmlDevice_t *device ) = NULL; static nvmlReturn_t (*nvmlDeviceGetUUIDPtr)( nvmlDevice_t device, char *uuid, unsigned int length ) = NULL; #define NVML_CALL(call, err_handle) do { \ nvmlReturn_t _status = (call); \ if (_status != NVML_SUCCESS) { \ if (_status == NVML_ERROR_UNINITIALIZED) { \ if ((*nvmlInitPtr)() == NVML_SUCCESS) { \ _status = (call); \ if (_status == NVML_SUCCESS) { \ break; \ } \ } \ } \ SUBDBG("error: function %s failed with error %d.\n", #call, _status); \ err_handle; \ } \ } while(0) static void fill_dev_affinity_info( _sysdetect_gpu_info_u *dev_info, int dev_count ); static int nvml_is_enabled( void ); static int load_nvml_sym( char *status ); static int unload_nvml_sym( void ); static unsigned long hash(unsigned char *str); #endif /* HAVE_NVML */ #ifdef HAVE_CUDA void fill_dev_info( _sysdetect_gpu_info_u *dev_info, int dev ) { CUdevice device; CU_CALL((*cuDeviceGetPtr)(&device, dev), return); CU_CALL((*cuDeviceGetNamePtr)(dev_info->nvidia.name, PAPI_2MAX_STR_LEN, device), dev_info->nvidia.name[0] = '\0'); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.warp_size, CU_DEVICE_ATTRIBUTE_WARP_SIZE, device), dev_info->nvidia.warp_size = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.max_shmmem_per_block, CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK, device), dev_info->nvidia.max_shmmem_per_block = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.max_shmmem_per_multi_proc, CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_MULTIPROCESSOR, device), dev_info->nvidia.max_shmmem_per_multi_proc = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.max_block_dim_x, CU_DEVICE_ATTRIBUTE_MAX_BLOCK_DIM_X, device), dev_info->nvidia.max_block_dim_x = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.max_block_dim_y, CU_DEVICE_ATTRIBUTE_MAX_BLOCK_DIM_Y, device), dev_info->nvidia.max_block_dim_y = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.max_block_dim_z, 
CU_DEVICE_ATTRIBUTE_MAX_BLOCK_DIM_Z, device), dev_info->nvidia.max_block_dim_z = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.max_grid_dim_x, CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_X, device), dev_info->nvidia.max_grid_dim_x = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.max_grid_dim_y, CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_Y, device), dev_info->nvidia.max_grid_dim_y = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.max_grid_dim_z, CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_Z, device), dev_info->nvidia.max_grid_dim_z = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.max_threads_per_block, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK, device), dev_info->nvidia.max_threads_per_block = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.multi_processor_count, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT, device), dev_info->nvidia.multi_processor_count = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.multi_kernel_per_ctx, CU_DEVICE_ATTRIBUTE_CONCURRENT_KERNELS, device), dev_info->nvidia.multi_kernel_per_ctx = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.can_map_host_mem, CU_DEVICE_ATTRIBUTE_CAN_MAP_HOST_MEMORY, device), dev_info->nvidia.can_map_host_mem = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.can_overlap_comp_and_data_xfer, CU_DEVICE_ATTRIBUTE_GPU_OVERLAP, device), dev_info->nvidia.can_overlap_comp_and_data_xfer = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.unified_addressing, CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING, device), dev_info->nvidia.unified_addressing = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.managed_memory, CU_DEVICE_ATTRIBUTE_MANAGED_MEMORY, device), dev_info->nvidia.managed_memory = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.major, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR, device), dev_info->nvidia.major = -1); CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.minor, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR, device), 
dev_info->nvidia.minor = -1); #if CUDA_VERSION >= 11000 CU_CALL((*cuDeviceGetAttributePtr)(&dev_info->nvidia.max_blocks_per_multi_proc, CU_DEVICE_ATTRIBUTE_MAX_BLOCKS_PER_MULTIPROCESSOR, device), dev_info->nvidia.max_blocks_per_multi_proc = -1); #else dev_info->nvidia.max_blocks_per_multi_proc = -1; #endif /* CUDA_VERSION */ } int cuda_is_enabled( void ) { return (cuInitPtr != NULL && cuDeviceGetPtr != NULL && cuDeviceGetNamePtr != NULL && cuDeviceGetCountPtr != NULL && cuDeviceGetAttributePtr != NULL && cuDeviceGetPCIBusIdPtr != NULL); } int load_cuda_sym( char *status ) { cuda_dlp = dlopen("libcuda.so", RTLD_NOW | RTLD_GLOBAL); if (cuda_dlp == NULL) { int count = snprintf(status, PAPI_MAX_STR_LEN, "%s", dlerror()); if (count >= PAPI_MAX_STR_LEN) { SUBDBG("Status string truncated."); } return -1; } cuInitPtr = dlsym(cuda_dlp, "cuInit"); cuDeviceGetPtr = dlsym(cuda_dlp, "cuDeviceGet"); cuDeviceGetNamePtr = dlsym(cuda_dlp, "cuDeviceGetName"); cuDeviceGetCountPtr = dlsym(cuda_dlp, "cuDeviceGetCount"); cuDeviceGetAttributePtr = dlsym(cuda_dlp, "cuDeviceGetAttribute"); cuDeviceGetPCIBusIdPtr = dlsym(cuda_dlp, "cuDeviceGetPCIBusId"); if (!cuda_is_enabled()) { const char *message = "dlsym() of CUDA symbols failed"; int count = snprintf(status, PAPI_MAX_STR_LEN, "%s", message); if (count >= PAPI_MAX_STR_LEN) { SUBDBG("Status string truncated."); } return -1; } return 0; } int unload_cuda_sym( void ) { if (cuda_dlp != NULL) { dlclose(cuda_dlp); } cuInitPtr = NULL; cuDeviceGetPtr = NULL; cuDeviceGetNamePtr = NULL; cuDeviceGetCountPtr = NULL; cuDeviceGetAttributePtr = NULL; cuDeviceGetPCIBusIdPtr = NULL; return cuda_is_enabled(); } #endif /* HAVE_CUDA */ #ifdef HAVE_NVML void fill_dev_affinity_info( _sysdetect_gpu_info_u *info, int dev_count ) { int dev; for (dev = 0; dev < dev_count; ++dev) { char bus_id_str[20] = { 0 }; CU_CALL((*cuDeviceGetPCIBusIdPtr)(bus_id_str, 20, dev), return); nvmlDevice_t device; NVML_CALL((*nvmlDeviceGetHandleByPciBusIdPtr)(bus_id_str, &device), 
return); char uuid_str[PAPI_NVML_DEV_BUFFER_SIZE] = { 0 }; NVML_CALL((*nvmlDeviceGetUUIDPtr)(device, uuid_str, PAPI_NVML_DEV_BUFFER_SIZE), return); _sysdetect_gpu_info_u *dev_info = &info[dev]; dev_info->nvidia.uid = hash((unsigned char *) uuid_str); } } unsigned long hash(unsigned char *str) { unsigned long hash = 5381; int c; while ((c = *str++)) { hash = ((hash << 5) + hash) + c; } return hash; } int nvml_is_enabled( void ) { return (nvmlInitPtr != NULL && nvmlDeviceGetCountPtr != NULL && nvmlDeviceGetHandleByPciBusIdPtr != NULL && nvmlDeviceGetUUIDPtr != NULL); } int load_nvml_sym( char *status ) { nvml_dlp = dlopen("libnvidia-ml.so", RTLD_NOW | RTLD_GLOBAL); if (nvml_dlp == NULL) { int count = snprintf(status, PAPI_MAX_STR_LEN, "%s", dlerror()); if (count >= PAPI_MAX_STR_LEN) { SUBDBG("Status string truncated."); } return -1; } nvmlInitPtr = dlsym(nvml_dlp, "nvmlInit_v2"); nvmlDeviceGetCountPtr = dlsym(nvml_dlp, "nvmlDeviceGetCount_v2"); nvmlDeviceGetHandleByPciBusIdPtr = dlsym(nvml_dlp, "nvmlDeviceGetHandleByPciBusId_v2"); nvmlDeviceGetUUIDPtr = dlsym(nvml_dlp, "nvmlDeviceGetUUID"); if (!nvml_is_enabled()) { const char *message = "dlsym() of NVML symbols failed"; int count = snprintf(status, PAPI_MAX_STR_LEN, "%s", message); if (count >= PAPI_MAX_STR_LEN) { SUBDBG("Status string truncated."); } return -1; } return 0; } int unload_nvml_sym( void ) { if (nvml_dlp != NULL) { dlclose(nvml_dlp); } nvmlInitPtr = NULL; nvmlDeviceGetCountPtr = NULL; nvmlDeviceGetHandleByPciBusIdPtr = NULL; nvmlDeviceGetUUIDPtr = NULL; return nvml_is_enabled(); } #endif /* HAVE_NVML */ void open_nvidia_gpu_dev_type( _sysdetect_dev_type_info_t *dev_type_info ) { memset(dev_type_info, 0, sizeof(*dev_type_info)); dev_type_info->id = PAPI_DEV_TYPE_ID__CUDA; strcpy(dev_type_info->vendor, "NVIDIA"); strcpy(dev_type_info->status, "Device Initialized"); #ifdef HAVE_CUDA if (load_cuda_sym(dev_type_info->status)) { return; } int dev, dev_count; CU_CALL((*cuDeviceGetCountPtr)(&dev_count), 
return); dev_type_info->num_devices = dev_count; if (dev_count == 0) { return; } _sysdetect_gpu_info_u *arr = papi_calloc(dev_count, sizeof(*arr)); for (dev = 0; dev < dev_count; ++dev) { fill_dev_info(&arr[dev], dev); } #ifdef HAVE_NVML if (!load_nvml_sym(dev_type_info->status)) { fill_dev_affinity_info(arr, dev_count); unload_nvml_sym(); } #else const char *message = "NVML not configured, no device affinity available"; int count = snprintf(dev_type_info->status, PAPI_MAX_STR_LEN, "%s", message); if (count >= PAPI_MAX_STR_LEN) { SUBDBG("Status string truncated."); } #endif /* HAVE_NVML */ unload_cuda_sym(); dev_type_info->dev_info_arr = (_sysdetect_dev_info_u *)arr; #else const char *message = "CUDA not configured, no CUDA device available"; int count = snprintf(dev_type_info->status, PAPI_MAX_STR_LEN, "%s", message); if (count >= PAPI_MAX_STR_LEN) { SUBDBG("Status string truncated."); } #endif /* HAVE_CUDA */ } void close_nvidia_gpu_dev_type( _sysdetect_dev_type_info_t *dev_type_info ) { papi_free(dev_type_info->dev_info_arr); } papi-papi-7-2-0-t/src/components/sysdetect/nvidia_gpu.h000066400000000000000000000006431502707512200231240ustar00rootroot00000000000000#ifndef __NVIDIA_GPU_H__ #define __NVIDIA_GPU_H__ #if CUDA_VERSION >= 11000 #define PAPI_NVML_DEV_BUFFER_SIZE NVML_DEVICE_UUID_V2_BUFFER_SIZE #else #define PAPI_NVML_DEV_BUFFER_SIZE NVML_DEVICE_UUID_BUFFER_SIZE #endif void open_nvidia_gpu_dev_type( _sysdetect_dev_type_info_t *dev_type_info ); void close_nvidia_gpu_dev_type( _sysdetect_dev_type_info_t *dev_type_info ); #endif /* End of __NVIDIA_GPU_H__ */ papi-papi-7-2-0-t/src/components/sysdetect/os_cpu_utils.c000066400000000000000000000044741502707512200235100ustar00rootroot00000000000000#include "sysdetect.h" #include "os_cpu_utils.h" #include "linux_cpu_utils.h" int os_cpu_get_vendor( char *vendor ) { #if defined(__linux__) return linux_cpu_get_vendor(vendor); #elif defined(__APPLE__) || defined(__MACH__) #warning "WARNING! 
Darwin support of " __func__ " not yet implemented." return CPU_ERROR; #endif return CPU_ERROR; } int os_cpu_get_name( char *name ) { #if defined(__linux__) return linux_cpu_get_name(name); #elif defined(__APPLE__) || defined(__MACH__) #warning "WARNING! Darwin support of " __func__ " not yet implemented." return CPU_ERROR; #endif return CPU_ERROR; } int os_cpu_get_attribute( CPU_attr_e attr, int *value ) { #if defined(__linux__) return linux_cpu_get_attribute(attr, value); #elif defined(__APPLE__) || defined(__MACH__) #warning "WARNING! Darwin support of " __func__ " not yet implemented." return CPU_ERROR; #endif return CPU_ERROR; } int os_cpu_get_attribute_at( CPU_attr_e attr, int loc, int *value ) { #if defined(__linux__) return linux_cpu_get_attribute_at(attr, loc, value); #elif defined(__APPLE__) || defined(__MACH__) #warning "WARNING! Darwin support of " __func__ " not yet implemented." return CPU_ERROR; #endif return CPU_ERROR; } int os_cpu_set_affinity( int cpu ) { #if defined(__linux__) return linux_cpu_set_affinity(cpu); #elif defined(__APPLE__) || defined(__MACH__) #warning "WARNING! Darwin support of " __func__ " not yet implemented." return CPU_ERROR; #endif return CPU_ERROR; } int os_cpu_get_num_supported( void ) { #if defined(__linux__) return linux_cpu_get_num_supported(); #elif defined(__APPLE__) || defined(__MACH__) #warning "WARNING! Darwin support of " __func__ " not yet implemented." return CPU_ERROR; #endif return CPU_ERROR; } int os_cpu_store_affinity( void ) { #if defined(__linux__) return linux_cpu_store_affinity(); #elif defined(__APPLE__) || defined(__MACH__) #warning "WARNING! Darwin support of " __func__ " not yet implemented." return CPU_ERROR; #endif return CPU_ERROR; } int os_cpu_load_affinity( void ) { #if defined(__linux__) return linux_cpu_load_affinity(); #elif defined(__APPLE__) || defined(__MACH__) #warning "WARNING! Darwin support of " __func__ " not yet implemented." 
return CPU_ERROR; #endif return CPU_ERROR; } papi-papi-7-2-0-t/src/components/sysdetect/os_cpu_utils.h000066400000000000000000000007161502707512200235100ustar00rootroot00000000000000#ifndef __OS_CPU_UTILS_H__ #define __OS_CPU_UTILS_H__ #include "cpu_utils.h" int os_cpu_get_vendor( char *vendor ); int os_cpu_get_name( char *name ); int os_cpu_get_attribute( CPU_attr_e attr, int *value ); int os_cpu_get_attribute_at( CPU_attr_e attr, int loc, int *value ); int os_cpu_set_affinity( int cpu ); int os_cpu_get_num_supported( void ); int os_cpu_store_affinity( void ); int os_cpu_load_affinity( void ); #endif /* End of __OS_CPU_UTILS_H__ */ papi-papi-7-2-0-t/src/components/sysdetect/powerpc_cpu_utils.c000066400000000000000000000157141502707512200245450ustar00rootroot00000000000000#include "sysdetect.h" #include "powerpc_cpu_utils.h" #include "os_cpu_utils.h" _sysdetect_cache_level_info_t ppc970_cache_info[] = { { // level 1 begins 2, { {PAPI_MH_TYPE_INST, 65536, 128, 512, 1}, {PAPI_MH_TYPE_DATA, 32768, 128, 256, 2} } }, { // level 2 begins 1, { {PAPI_MH_TYPE_UNIFIED, 524288, 128, 4096, 8}, {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } }, }; _sysdetect_cache_level_info_t power5_cache_info[] = { { // level 1 begins 2, { {PAPI_MH_TYPE_INST, 65536, 128, 512, 2}, {PAPI_MH_TYPE_DATA, 32768, 128, 256, 4} } }, { // level 2 begins 1, { {PAPI_MH_TYPE_UNIFIED, 1966080, 128, 15360, 10}, {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } }, { // level 3 begins 1, { {PAPI_MH_TYPE_UNIFIED, 37748736, 256, 147456, 12}, {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } }, }; _sysdetect_cache_level_info_t power6_cache_info[] = { { // level 1 begins 2, { {PAPI_MH_TYPE_INST, 65536, 128, 512, 4}, {PAPI_MH_TYPE_DATA, 65536, 128, 512, 8} } }, { // level 2 begins 1, { {PAPI_MH_TYPE_UNIFIED, 4194304, 128, 16384, 8}, {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } }, { // level 3 begins 1, { {PAPI_MH_TYPE_UNIFIED, 33554432, 128, 262144, 16}, {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } }, }; _sysdetect_cache_level_info_t power7_cache_info[] = 
{ { // level 1 begins 2, { {PAPI_MH_TYPE_INST, 32768, 128, 64, 4}, {PAPI_MH_TYPE_DATA, 32768, 128, 32, 8} } }, { // level 2 begins 1, { {PAPI_MH_TYPE_UNIFIED, 524288, 128, 256, 8}, {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } }, { // level 3 begins 1, { {PAPI_MH_TYPE_UNIFIED, 4194304, 128, 4096, 8}, {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } }, }; _sysdetect_cache_level_info_t power8_cache_info[] = { { // level 1 begins 2, { {PAPI_MH_TYPE_INST, 32768, 128, 64, 8}, {PAPI_MH_TYPE_DATA, 65536, 128, 512, 8} } }, { // level 2 begins 1, { {PAPI_MH_TYPE_UNIFIED, 262144, 128, 256, 8}, {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } }, { // level 3 begins 1, { {PAPI_MH_TYPE_UNIFIED, 8388608, 128, 65536, 8}, {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } }, }; _sysdetect_cache_level_info_t power9_cache_info[] = { { // level 1 begins 2, { {PAPI_MH_TYPE_INST, 32768, 128, 256, 8}, {PAPI_MH_TYPE_DATA, 32768, 128, 256, 8} } }, { // level 2 begins 1, { {PAPI_MH_TYPE_UNIFIED, 524288, 128, 4096, 8}, {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } }, { // level 3 begins 1, { {PAPI_MH_TYPE_UNIFIED, 10485760, 128, 81920, 20}, {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } }, }; _sysdetect_cache_level_info_t power10_cache_info[] = { { // level 1 begins 2, { {PAPI_MH_TYPE_INST, 49152, 128, 384, 6}, {PAPI_MH_TYPE_DATA, 32768, 128, 256, 8} } }, { // level 2 begins 1, { {PAPI_MH_TYPE_UNIFIED, 1048576, 128, 8192, 8}, {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } }, { // level 3 begins 1, { {PAPI_MH_TYPE_UNIFIED, 4194304, 128, 32768, 16}, {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } }, }; #define SPRN_PVR 0x11F /* Processor Version Register */ #define PVR_PROCESSOR_SHIFT 16 static unsigned int mfpvr( void ); static int get_cache_info( CPU_attr_e attr, int level, int *value ); int powerpc_cpu_init( void ) { return CPU_SUCCESS; } int powerpc_cpu_finalize( void ) { return CPU_SUCCESS; } int powerpc_cpu_get_vendor( char *vendor ) { return os_cpu_get_vendor(vendor); } int powerpc_cpu_get_name( char *name ) { return os_cpu_get_name(name); } int 
powerpc_cpu_get_attribute( CPU_attr_e attr, int *value ) { return os_cpu_get_attribute(attr, value); } int powerpc_cpu_get_attribute_at( CPU_attr_e attr, int loc, int *value ) { int status = CPU_SUCCESS; switch(attr) { case CPU_ATTR__CACHE_INST_PRESENT: case CPU_ATTR__CACHE_DATA_PRESENT: case CPU_ATTR__CACHE_UNIF_PRESENT: case CPU_ATTR__CACHE_INST_TOT_SIZE: case CPU_ATTR__CACHE_INST_LINE_SIZE: case CPU_ATTR__CACHE_INST_NUM_LINES: case CPU_ATTR__CACHE_INST_ASSOCIATIVITY: case CPU_ATTR__CACHE_DATA_TOT_SIZE: case CPU_ATTR__CACHE_DATA_LINE_SIZE: case CPU_ATTR__CACHE_DATA_NUM_LINES: case CPU_ATTR__CACHE_DATA_ASSOCIATIVITY: case CPU_ATTR__CACHE_UNIF_TOT_SIZE: case CPU_ATTR__CACHE_UNIF_LINE_SIZE: case CPU_ATTR__CACHE_UNIF_NUM_LINES: case CPU_ATTR__CACHE_UNIF_ASSOCIATIVITY: status = get_cache_info(attr, loc, value); break; case CPU_ATTR__NUMA_MEM_SIZE: case CPU_ATTR__HWTHREAD_NUMA_AFFINITY: status = os_cpu_get_attribute_at(attr, loc, value); break; default: status = CPU_ERROR; } return status; } int get_cache_info( CPU_attr_e attr, int level, int *value ) { unsigned int pvr = mfpvr() >> PVR_PROCESSOR_SHIFT; static _sysdetect_cache_level_info_t *clevel_ptr; if (clevel_ptr) { return cpu_get_cache_info(attr, level, clevel_ptr, value); } switch(pvr) { case 0x39: /* PPC970 */ case 0x3C: /* PPC970FX */ case 0x44: /* PPC970MP */ case 0x45: /* PPC970GX */ clevel_ptr = ppc970_cache_info; break; case 0x3A: /* POWER5 */ case 0x3B: /* POWER5+ */ clevel_ptr = power5_cache_info; break; case 0x3E: /* POWER6 */ clevel_ptr = power6_cache_info; break; case 0x3F: /* POWER7 */ clevel_ptr = power7_cache_info; break; case 0x4b: /* POWER8 */ clevel_ptr = power8_cache_info; break; case 0x4e: /* POWER9 */ clevel_ptr = power9_cache_info; break; case 0x80: /* POWER10 */ clevel_ptr = power10_cache_info; break; default: return CPU_ERROR; } return cpu_get_cache_info(attr, level, clevel_ptr, value); } unsigned int mfpvr( void ) { unsigned long pvr; __asm__ ("mfspr %0,%1" : "=r" (pvr) : "i" (SPRN_PVR)); 
return pvr; } papi-papi-7-2-0-t/src/components/sysdetect/powerpc_cpu_utils.h000066400000000000000000000006231502707512200245430ustar00rootroot00000000000000#ifndef __POWERPC_UTIL_H__ #define __POWERPC_UTIL_H__ #include "cpu_utils.h" int powerpc_cpu_init( void ); int powerpc_cpu_finalize( void ); int powerpc_cpu_get_vendor( char *vendor ); int powerpc_cpu_get_name( char *name ); int powerpc_cpu_get_attribute( CPU_attr_e attr, int *value ); int powerpc_cpu_get_attribute_at( CPU_attr_e attr, int loc, int *value ); #endif /* End of __POWERPC_UTIL_H__ */ papi-papi-7-2-0-t/src/components/sysdetect/sysdetect.c000066400000000000000000000407001502707512200227770ustar00rootroot00000000000000/** * @file sysdetect.c * @author Giuseppe Congiu * gcongiu@icl.utk.edu * * @ingroup papi_components * * @brief * This is a system info detection component, it provides general hardware * information across the system, additionally to CPU, such as GPU, Network, * installed runtime libraries, etc. */ #include #include #include #include #include "sysdetect.h" #include "nvidia_gpu.h" #include "amd_gpu.h" #include "cpu.h" papi_vector_t _sysdetect_vector; typedef struct { void (*open) ( _sysdetect_dev_type_info_t *dev_type_info ); void (*close)( _sysdetect_dev_type_info_t *dev_type_info ); } dev_fn_ptr_vector; dev_fn_ptr_vector dev_fn_vector[PAPI_DEV_TYPE_ID__MAX_NUM] = { { open_cpu_dev_type, close_cpu_dev_type, }, { open_nvidia_gpu_dev_type, close_nvidia_gpu_dev_type, }, { open_amd_gpu_dev_type, close_amd_gpu_dev_type, }, }; static _sysdetect_dev_type_info_t dev_type_info_arr[PAPI_DEV_TYPE_ID__MAX_NUM]; static int _sysdetect_enum_dev_type( int enum_modifier, void **handle ); static int _sysdetect_get_dev_type_attr( void *handle, PAPI_dev_type_attr_e attr, void *val ); static int _sysdetect_get_dev_attr( void *handle, int id, PAPI_dev_attr_e attr, void *val ); static void get_num_threads_per_numa( _sysdetect_cpu_info_t *cpu_info ); static void init_dev_info( void ) { int id; for (id = 0; id < 
PAPI_DEV_TYPE_ID__MAX_NUM; ++id) { dev_fn_vector[id].open( &dev_type_info_arr[id] ); } } static void cleanup_dev_info( void ) { int id; for (id = 0; id < PAPI_DEV_TYPE_ID__MAX_NUM; ++id) { dev_fn_vector[id].close( &dev_type_info_arr[id] ); } } /** Initialize hardware counters, set up the function vector table * and get hardware information. This routine is called when the * PAPI process is initialized (i.e., PAPI_library_init). */ static int _sysdetect_init_component( int cidx ) { SUBDBG( "_sysdetect_init_component...\n" ); /* Export the component id */ _sysdetect_vector.cmp_info.CmpIdx = cidx; return PAPI_OK; } static int _sysdetect_init_thread( hwd_context_t *ctx __attribute__((unused)) ) { return PAPI_OK; } /** Triggered by PAPI_shutdown() */ static int _sysdetect_shutdown_component( void ) { SUBDBG( "_sysdetect_shutdown_component...\n" ); cleanup_dev_info( ); return PAPI_OK; } static int _sysdetect_shutdown_thread(hwd_context_t *ctx __attribute__((unused)) ) { return PAPI_OK; } static void _sysdetect_init_private( void ) { static int initialized; if (initialized) { return; } init_dev_info( ); initialized = 1; } /** Triggered by PAPI_{enum,get}_dev_xxx interfaces */ int _sysdetect_user( int unused __attribute__((unused)), void *in, void *out ) { int papi_errno = PAPI_OK; _sysdetect_init_private(); _papi_hwi_sysdetect_t *args = (_papi_hwi_sysdetect_t *) in; int modifier; void *handle; int id; PAPI_dev_type_attr_e dev_type_attr; PAPI_dev_attr_e dev_attr; switch (args->query_type) { case PAPI_SYSDETECT_QUERY__DEV_TYPE_ENUM: modifier = args->query.enumerate.modifier; papi_errno = _sysdetect_enum_dev_type(modifier, out); break; case PAPI_SYSDETECT_QUERY__DEV_TYPE_ATTR: handle = args->query.dev_type.handle; dev_type_attr = args->query.dev_type.attr; papi_errno = _sysdetect_get_dev_type_attr(handle, dev_type_attr, out); break; case PAPI_SYSDETECT_QUERY__DEV_ATTR: handle = args->query.dev.handle; id = args->query.dev.id; dev_attr = args->query.dev.attr; papi_errno =
_sysdetect_get_dev_attr(handle, id, dev_attr, out); break; default: papi_errno = PAPI_EMISC; } return papi_errno; } int _sysdetect_enum_dev_type( int enum_modifier, void **handle ) { static int dev_type_id; if (PAPI_DEV_TYPE_ENUM__FIRST == enum_modifier) { dev_type_id = 0; *(void **) handle = &dev_type_info_arr[dev_type_id]; return PAPI_OK; } int not_found = 1; while (not_found && dev_type_id < PAPI_DEV_TYPE_ID__MAX_NUM) { if ((1 << dev_type_info_arr[dev_type_id].id) & enum_modifier) { *handle = &dev_type_info_arr[dev_type_id]; not_found = 0; } ++dev_type_id; } if (not_found) { *handle = NULL; dev_type_id = 0; return PAPI_EINVAL; } return PAPI_OK; } int _sysdetect_get_dev_type_attr( void *handle, PAPI_dev_type_attr_e attr, void *val ) { int papi_errno = PAPI_OK; _sysdetect_dev_type_info_t *dev_type_info = (_sysdetect_dev_type_info_t *) handle; switch(attr) { case PAPI_DEV_TYPE_ATTR__INT_PAPI_ID: *(int *) val = dev_type_info->id; break; case PAPI_DEV_TYPE_ATTR__INT_VENDOR_ID: *(int *) val = dev_type_info->vendor_id; break; case PAPI_DEV_TYPE_ATTR__CHAR_NAME: *(const char **) val = dev_type_info->vendor; break; case PAPI_DEV_TYPE_ATTR__INT_COUNT: *(int *) val = dev_type_info->num_devices; break; case PAPI_DEV_TYPE_ATTR__CHAR_STATUS: *(const char **) val = dev_type_info->status; break; default: papi_errno = PAPI_ENOSUPP; } return papi_errno; } int _sysdetect_get_dev_attr( void *handle, int id, PAPI_dev_attr_e attr, void *val ) { int papi_errno = PAPI_OK; _sysdetect_dev_type_info_t *dev_type_info = (_sysdetect_dev_type_info_t *) handle; /* there is only one cpu vendor/model per system, hence id = 0 */ _sysdetect_cpu_info_t *cpu_info = (_sysdetect_cpu_info_t *) &dev_type_info->dev_info_arr[0]; _sysdetect_gpu_info_u *gpu_info = (_sysdetect_gpu_info_u *) (dev_type_info->dev_info_arr) + id; switch(attr) { /* CPU attributes */ case PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_SIZE: *(unsigned int *) val = cpu_info->clevel[0].cache[0].size; break; case 
PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_SIZE: *(unsigned int *) val = cpu_info->clevel[0].cache[1].size; break; case PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_SIZE: *(unsigned int *) val = cpu_info->clevel[1].cache[0].size; break; case PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_SIZE: *(unsigned int *) val = cpu_info->clevel[2].cache[0].size; break; case PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_LINE_SIZE: *(unsigned int *) val = cpu_info->clevel[0].cache[0].line_size; break; case PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_LINE_SIZE: *(unsigned int *) val = cpu_info->clevel[0].cache[1].line_size; break; case PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_LINE_SIZE: *(unsigned int *) val = cpu_info->clevel[1].cache[0].line_size; break; case PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_LINE_SIZE: *(unsigned int *) val = cpu_info->clevel[2].cache[0].line_size; break; case PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_LINE_COUNT: *(unsigned int *) val = cpu_info->clevel[0].cache[0].num_lines; break; case PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_LINE_COUNT: *(unsigned int *) val = cpu_info->clevel[0].cache[1].num_lines; break; case PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_LINE_COUNT: *(unsigned int *) val = cpu_info->clevel[1].cache[0].num_lines; break; case PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_LINE_COUNT: *(unsigned int *) val = cpu_info->clevel[2].cache[0].num_lines; break; case PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_ASSOC: *(unsigned int *) val = cpu_info->clevel[0].cache[0].associativity; break; case PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_ASSOC: *(unsigned int *) val = cpu_info->clevel[0].cache[1].associativity; break; case PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_ASSOC: *(unsigned int *) val = cpu_info->clevel[1].cache[0].associativity; break; case PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_ASSOC: *(unsigned int *) val = cpu_info->clevel[2].cache[0].associativity; break; case PAPI_DEV_ATTR__CPU_CHAR_NAME: *(const char **) val = cpu_info->name; break; case PAPI_DEV_ATTR__CPU_UINT_FAMILY: *(unsigned int *) val = cpu_info->cpuid_family; break; case PAPI_DEV_ATTR__CPU_UINT_MODEL: 
*(unsigned int *) val = cpu_info->cpuid_model; break; case PAPI_DEV_ATTR__CPU_UINT_STEPPING: *(unsigned int *) val = cpu_info->cpuid_stepping; break; case PAPI_DEV_ATTR__CPU_UINT_SOCKET_COUNT: *(unsigned int *) val = cpu_info->sockets; break; case PAPI_DEV_ATTR__CPU_UINT_NUMA_COUNT: *(unsigned int *) val = cpu_info->numas; break; case PAPI_DEV_ATTR__CPU_UINT_CORE_COUNT: *(int *) val = cpu_info->cores; break; case PAPI_DEV_ATTR__CPU_UINT_THREAD_COUNT: *(int *) val = cpu_info->threads * cpu_info->cores * cpu_info->sockets; break; case PAPI_DEV_ATTR__CPU_UINT_THR_NUMA_AFFINITY: *(int *) val = cpu_info->numa_affinity[id]; break; case PAPI_DEV_ATTR__CPU_UINT_THR_PER_NUMA: get_num_threads_per_numa(cpu_info); *(int *) val = cpu_info->num_threads_per_numa[id]; break; case PAPI_DEV_ATTR__CPU_UINT_NUMA_MEM_SIZE: *(unsigned int *) val = (cpu_info->numa_memory[id] >> 10); break; /* NVIDIA GPU attributes */ case PAPI_DEV_ATTR__CUDA_ULONG_UID: *(unsigned long *) val = gpu_info->nvidia.uid; break; case PAPI_DEV_ATTR__CUDA_CHAR_DEVICE_NAME: *(const char **) val = gpu_info->nvidia.name; break; case PAPI_DEV_ATTR__CUDA_UINT_WARP_SIZE: *(unsigned int *) val = gpu_info->nvidia.warp_size; break; case PAPI_DEV_ATTR__CUDA_UINT_THR_PER_BLK: *(unsigned int *) val = gpu_info->nvidia.max_threads_per_block; break; case PAPI_DEV_ATTR__CUDA_UINT_BLK_PER_SM: *(unsigned int *) val = gpu_info->nvidia.max_blocks_per_multi_proc; break; case PAPI_DEV_ATTR__CUDA_UINT_SHM_PER_BLK: *(unsigned int *) val = gpu_info->nvidia.max_shmmem_per_block; break; case PAPI_DEV_ATTR__CUDA_UINT_SHM_PER_SM: *(unsigned int *) val = gpu_info->nvidia.max_shmmem_per_multi_proc; break; case PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_X: *(unsigned int *) val = gpu_info->nvidia.max_block_dim_x; break; case PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_Y: *(unsigned int *) val = gpu_info->nvidia.max_block_dim_y; break; case PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_Z: *(unsigned int *) val = gpu_info->nvidia.max_block_dim_z; break; case 
PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_X: *(unsigned int *) val = gpu_info->nvidia.max_grid_dim_x; break; case PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_Y: *(unsigned int *) val = gpu_info->nvidia.max_grid_dim_y; break; case PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_Z: *(unsigned int *) val = gpu_info->nvidia.max_grid_dim_z; break; case PAPI_DEV_ATTR__CUDA_UINT_SM_COUNT: *(unsigned int *) val = gpu_info->nvidia.multi_processor_count; break; case PAPI_DEV_ATTR__CUDA_UINT_MULTI_KERNEL: *(unsigned int *) val = gpu_info->nvidia.multi_kernel_per_ctx; break; case PAPI_DEV_ATTR__CUDA_UINT_MAP_HOST_MEM: *(unsigned int *) val = gpu_info->nvidia.can_map_host_mem; break; case PAPI_DEV_ATTR__CUDA_UINT_MEMCPY_OVERLAP: *(unsigned int *) val = gpu_info->nvidia.can_overlap_comp_and_data_xfer; break; case PAPI_DEV_ATTR__CUDA_UINT_UNIFIED_ADDR: *(unsigned int *) val = gpu_info->nvidia.unified_addressing; break; case PAPI_DEV_ATTR__CUDA_UINT_MANAGED_MEM: *(unsigned int *) val = gpu_info->nvidia.managed_memory; break; case PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MAJOR: *(unsigned int *) val = gpu_info->nvidia.major; break; case PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MINOR: *(unsigned int *) val = gpu_info->nvidia.minor; break; /* AMD GPU attributes */ case PAPI_DEV_ATTR__ROCM_ULONG_UID: *(unsigned long *) val = gpu_info->amd.uid; break; case PAPI_DEV_ATTR__ROCM_CHAR_DEVICE_NAME: *(const char **) val = gpu_info->amd.name; break; case PAPI_DEV_ATTR__ROCM_UINT_SIMD_PER_CU: *(unsigned int *) val = gpu_info->amd.simd_per_compute_unit; break; case PAPI_DEV_ATTR__ROCM_UINT_WORKGROUP_SIZE: *(unsigned int *) val = gpu_info->amd.max_threads_per_workgroup; break; case PAPI_DEV_ATTR__ROCM_UINT_WAVEFRONT_SIZE: *(unsigned int *) val = gpu_info->amd.wavefront_size; break; case PAPI_DEV_ATTR__ROCM_UINT_WAVE_PER_CU: *(unsigned int *) val = gpu_info->amd.max_waves_per_compute_unit; break; case PAPI_DEV_ATTR__ROCM_UINT_SHM_PER_WG: *(unsigned int *) val = gpu_info->amd.max_shmmem_per_workgroup; break; case 
PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_X: *(unsigned int *) val = gpu_info->amd.max_workgroup_dim_x; break; case PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_Y: *(unsigned int *) val = gpu_info->amd.max_workgroup_dim_y; break; case PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_Z: *(unsigned int *) val = gpu_info->amd.max_workgroup_dim_z; break; case PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_X: *(unsigned int *) val = gpu_info->amd.max_grid_dim_x; break; case PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_Y: *(unsigned int *) val = gpu_info->amd.max_grid_dim_y; break; case PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_Z: *(unsigned int *) val = gpu_info->amd.max_grid_dim_z; break; case PAPI_DEV_ATTR__ROCM_UINT_CU_COUNT: *(unsigned int *) val = gpu_info->amd.compute_unit_count; break; case PAPI_DEV_ATTR__ROCM_UINT_COMP_CAP_MAJOR: *(unsigned int *) val = gpu_info->amd.major; break; case PAPI_DEV_ATTR__ROCM_UINT_COMP_CAP_MINOR: *(unsigned int *) val = gpu_info->amd.minor; break; default: papi_errno = PAPI_ENOSUPP; } return papi_errno; } void get_num_threads_per_numa( _sysdetect_cpu_info_t *cpu_info ) { static int initialized; int k; if (initialized) { return; } int threads = cpu_info->threads * cpu_info->cores * cpu_info->sockets; for (k = 0; k < threads; ++k) { int tmp = cpu_info->numa_affinity[k]; // If gaps exist in the core numbering, the numa_affinity entries // for the skipped core ids will be negative. 
if( tmp >= 0 ) cpu_info->num_threads_per_numa[cpu_info->numa_affinity[k]]++; } initialized = 1; } /** Vector that points to entry points for our component */ papi_vector_t _sysdetect_vector = { .cmp_info = { .name = "sysdetect", .short_name = "sysdetect", .description = "System info detection component", .version = "1.0", .support_version = "n/a", .kernel_version = "n/a", }, /* Sizes of framework-opaque component-private structures */ .size = { .context = 1, /* unused */ .control_state = 1, /* unused */ .reg_value = 1, /* unused */ .reg_alloc = 1, /* unused */ }, /* Used for general PAPI interactions */ .init_component = _sysdetect_init_component, .init_thread = _sysdetect_init_thread, .shutdown_component = _sysdetect_shutdown_component, .shutdown_thread = _sysdetect_shutdown_thread, .user = _sysdetect_user, }; papi-papi-7-2-0-t/src/components/sysdetect/sysdetect.h000066400000000000000000000050751502707512200230120ustar00rootroot00000000000000#ifndef __SYSDETECT_H__ #define __SYSDETECT_H__ /* Headers required by PAPI */ #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" typedef union { struct { unsigned long uid; char name[PAPI_2MAX_STR_LEN]; int warp_size; int max_threads_per_block; int max_blocks_per_multi_proc; int max_shmmem_per_block; int max_shmmem_per_multi_proc; int max_block_dim_x; int max_block_dim_y; int max_block_dim_z; int max_grid_dim_x; int max_grid_dim_y; int max_grid_dim_z; int multi_processor_count; int multi_kernel_per_ctx; int can_map_host_mem; int can_overlap_comp_and_data_xfer; int unified_addressing; int managed_memory; int major; int minor; } nvidia; struct { unsigned long uid; char name[PAPI_2MAX_STR_LEN]; unsigned int wavefront_size; unsigned int simd_per_compute_unit; unsigned int max_threads_per_workgroup; unsigned int max_waves_per_compute_unit; unsigned int max_shmmem_per_workgroup; unsigned short max_workgroup_dim_x; unsigned short max_workgroup_dim_y; unsigned short max_workgroup_dim_z; 
unsigned int max_grid_dim_x; unsigned int max_grid_dim_y; unsigned int max_grid_dim_z; unsigned int compute_unit_count; unsigned int major; unsigned int minor; } amd; } _sysdetect_gpu_info_u; typedef struct { int num_caches; PAPI_mh_cache_info_t cache[PAPI_MH_MAX_LEVELS]; } _sysdetect_cache_level_info_t; typedef struct { char name[PAPI_MAX_STR_LEN]; int cpuid_family; int cpuid_model; int cpuid_stepping; int sockets; int numas; int cores; int threads; int cache_levels; _sysdetect_cache_level_info_t clevel[PAPI_MAX_MEM_HIERARCHY_LEVELS]; #define PAPI_MAX_NUM_NODES 8 int numa_memory[PAPI_MAX_NUM_NODES]; #define PAPI_MAX_NUM_THREADS 512 int numa_affinity[PAPI_MAX_NUM_THREADS]; #define PAPI_MAX_THREADS_PER_NUMA (PAPI_MAX_NUM_THREADS / PAPI_MAX_NUM_NODES) int num_threads_per_numa[PAPI_MAX_THREADS_PER_NUMA]; } _sysdetect_cpu_info_t; typedef union { _sysdetect_gpu_info_u gpu; _sysdetect_cpu_info_t cpu; } _sysdetect_dev_info_u; typedef struct { PAPI_dev_type_id_e id; char vendor[PAPI_MAX_STR_LEN]; int vendor_id; char status[PAPI_MAX_STR_LEN]; int num_devices; _sysdetect_dev_info_u *dev_info_arr; } _sysdetect_dev_type_info_t; #endif /* End of __SYSDETECT_H__ */ papi-papi-7-2-0-t/src/components/sysdetect/tests/000077500000000000000000000000001502707512200217655ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/sysdetect/tests/Makefile000066400000000000000000000026321502707512200234300ustar00rootroot00000000000000NAME=sysdetect include ../../Makefile_comp_tests.target %.o:%.c $(CC) $(CFLAGS) $(OPTFLAGS) $(INCLUDE) -c -o $@ $< ifneq ($(MPICC),) ifeq ($(NO_MPI_TESTS),) MPITESTS = query_device_mpi else MPITESTS = endif endif FTESTS = ifeq ($(ENABLE_FORTRAN_TESTS),yes) FTESTS = query_device_simple_f endif intel_compilers := ifort ifx cray_compilers := ftn crayftn ifeq ($(notdir $(F77)),gfortran) FFLAGS +=-ffree-form -ffree-line-length-none else ifeq ($(patsubst %flang,,$(notdir $(F77))),) # compiler name ends with flang FFLAGS +=-ffree-form else ifneq ($(findstring 
$(notdir $(F77)), $(intel_compilers)),) FFLAGS +=-free else ifneq ($(findstring $(notdir $(F77)), $(cray_compilers)),) FFLAGS +=-ffree endif TESTS = query_device_simple \ $(FTESTS) \ $(MPITESTS) sysdetect_tests: $(TESTS) query_device_simple: query_device_simple.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o query_device_simple query_device_simple.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) query_device_mpi: query_device_mpi.o $(UTILOBJS) $(PAPILIB) $(MPICC) $(CFLAGS) $(INCLUDE) -o query_device_mpi query_device_mpi.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) query_device_mpi.o: query_device_mpi.c $(MPICC) $(CFLAGS) $(OPTFLAGS) $(INCLUDE) -c -o $@ $< query_device_simple_f: query_device_simple_f.F $(F77) $(FFLAGS) -I../../.. -o $@ $< $(PAPILIB) $(LDFLAGS) clean: rm -f $(TESTS) *.o papi-papi-7-2-0-t/src/components/sysdetect/tests/query_device_mpi.c000066400000000000000000000077151502707512200254740ustar00rootroot00000000000000#include #include #include #include #include "papi.h" #include "papi_test.h" #define MAX_LOCAL_RANKS (512) #define MPI_CALL(call, err_handle) do { \ int _status = (call); \ if (_status == MPI_SUCCESS) \ break; \ err_handle; \ } while(0) static int cmp_fn(const void *a, const void *b) { return (*(unsigned long *)a - *(unsigned long *)b); } static int print_cuda_affinity( MPI_Comm comm, void *handle ) { int shm_comm_size; MPI_Comm shm_comm = MPI_COMM_NULL; const char *name; int dev_count; int rank, local_rank; int ranks[MAX_LOCAL_RANKS] = { 0 }; unsigned long uid; unsigned long uids[MAX_LOCAL_RANKS] = { 0 }; MPI_CALL(MPI_Comm_rank(comm, &rank), return _status); MPI_CALL(MPI_Comm_split_type(comm, MPI_COMM_TYPE_SHARED, rank, MPI_INFO_NULL, &shm_comm), return _status); MPI_CALL(MPI_Comm_size(shm_comm, &shm_comm_size), return _status); MPI_CALL(MPI_Comm_rank(shm_comm, &local_rank), return _status); PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_COUNT, &dev_count); int i; for (i = 0; i < dev_count; ++i) { PAPI_get_dev_attr(handle, i, 
PAPI_DEV_ATTR__CUDA_CHAR_DEVICE_NAME, &name);
        PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_ULONG_UID, &uid);

        MPI_CALL(MPI_Allgather(&rank, 1, MPI_INT,
                               ranks, 1, MPI_INT, shm_comm), return _status);
        MPI_CALL(MPI_Allgather(&uid, 1, MPI_UNSIGNED_LONG,
                               uids, 1, MPI_UNSIGNED_LONG, shm_comm), return _status);

        unsigned long sorted_uids[MAX_LOCAL_RANKS] = { 0 };
        unsigned long uniq_sorted_uids[MAX_LOCAL_RANKS] = { 0 };
        memcpy(sorted_uids, uids, sizeof(unsigned long) * shm_comm_size);
        qsort(sorted_uids, shm_comm_size, sizeof(unsigned long), cmp_fn);

        if (local_rank == 0) {
            int j, uniq_uids = 0;
            unsigned long curr_uid = 0;
            for (j = 0; j < shm_comm_size; ++j) {
                if (sorted_uids[j] != curr_uid) {
                    curr_uid = sorted_uids[j];
                    uniq_sorted_uids[uniq_uids++] = curr_uid;
                }
            }

            int k, l, list[MAX_LOCAL_RANKS] = { 0 };
            for (j = 0, l = 0; j < uniq_uids; ++j) {
                for (k = 0; k < shm_comm_size; ++k) {
                    if (uids[k] == uniq_sorted_uids[j]) {
                        list[l++] = ranks[k];
                    }
                }
                printf( "GPU-%i Affinity : Name: %s, UID: %lu, Ranks: [ ",
                        i, name, uniq_sorted_uids[j] );
                for (k = 0; k < l; ++k) {
                    printf( "%d ", list[k] );
                }
                printf( "]\n" );
            }
        }
    }

    MPI_CALL(MPI_Comm_free(&shm_comm), return _status);
    return 0;
}

int main(int argc, char *argv[])
{
    int quiet = 0;
    quiet = tests_quiet(argc, argv);

    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int retval = PAPI_library_init(PAPI_VER_CURRENT);
    if (retval != PAPI_VER_CURRENT) {
        test_fail(__FILE__, __LINE__, "PAPI_library_init failed\n", retval);
    }

    if (!quiet && rank == 0) {
        printf("Testing sysdetect component with PAPI %d.%d.%d\n",
               PAPI_VERSION_MAJOR(PAPI_VERSION),
               PAPI_VERSION_MINOR(PAPI_VERSION),
               PAPI_VERSION_REVISION(PAPI_VERSION));
    }

    void *handle;
    int enum_modifier = PAPI_DEV_TYPE_ENUM__CUDA;

    while (PAPI_enum_dev_type(enum_modifier, &handle) == PAPI_OK) {
        if (!quiet) {
            print_cuda_affinity(MPI_COMM_WORLD, handle);
        }
    }

    MPI_Barrier(MPI_COMM_WORLD);

    if (rank == 0) {
        test_pass(__FILE__);
    }

    MPI_Finalize();
    PAPI_shutdown();
    return 0;
}
papi-papi-7-2-0-t/src/components/sysdetect/tests/query_device_simple.c000066400000000000000000000374571502707512200262060ustar00rootroot00000000000000/** * @file query_device_simple.c * @author Giuseppe Congiu * gcongiu@icl.utk.edu * * test case for sysdetect component * * @brief * This file contains an example of how to use the sysdetect component to * query NVIDIA GPU device information. */ #include #include #include "papi.h" #include "papi_test.h" int main(int argc, char *argv[]) { int i, quiet = 0; quiet = tests_quiet(argc, argv); int retval = PAPI_library_init(PAPI_VER_CURRENT); if (retval != PAPI_VER_CURRENT) { test_fail(__FILE__, __LINE__, "PAPI_library_init failed\n", retval); } if (!quiet) { printf("Testing sysdetect component with PAPI %d.%d.%d\n", PAPI_VERSION_MAJOR(PAPI_VERSION), PAPI_VERSION_MINOR(PAPI_VERSION), PAPI_VERSION_REVISION(PAPI_VERSION)); } if (!quiet) { printf( "\nDevice Summary -----------------------------------------------------------------\n" ); } void *handle; int enum_modifier = PAPI_DEV_TYPE_ENUM__ALL; int id, vendor_id, dev_count; const char *vendor_name, *status; if (!quiet) { printf( "Vendor DevCount \n" ); } while (PAPI_enum_dev_type(enum_modifier, &handle) == PAPI_OK) { PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__CHAR_NAME, &vendor_name); PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_COUNT, &dev_count); PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__CHAR_STATUS, &status); if (!quiet) { printf( "%-18s (%d)\n", vendor_name, dev_count); printf( " \\-> Status: %s\n", status ); printf( "\n" ); } } if (!quiet) { printf( "\nDevice Information -------------------------------------------------------------\n" ); } while (PAPI_enum_dev_type(enum_modifier, &handle) == PAPI_OK) { PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_PAPI_ID, &id); PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_VENDOR_ID, &vendor_id); PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__CHAR_NAME, &vendor_name); 
PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_COUNT, &dev_count); if ( id == PAPI_DEV_TYPE_ID__CPU && dev_count > 0 ) { int numas = 1; for ( i = 0; i < dev_count; ++i ) { const char *cpu_name; unsigned int family, model, stepping; unsigned int sockets, cores, threads; unsigned int l1i_size, l1d_size, l2u_size, l3u_size; unsigned int l1i_line_sz, l1d_line_sz, l2u_line_sz, l3u_line_sz; unsigned int l1i_line_cnt, l1d_line_cnt, l2u_line_cnt, l3u_line_cnt; unsigned int l1i_cache_ass, l1d_cache_ass, l2u_cache_ass, l3u_cache_ass; PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_CHAR_NAME, &cpu_name); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_FAMILY, &family); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_MODEL, &model); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_STEPPING, &stepping); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_SOCKET_COUNT, &sockets); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_NUMA_COUNT, &numas); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_CORE_COUNT, &cores); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_THREAD_COUNT, &threads); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_SIZE, &l1i_size); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_SIZE, &l1d_size); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_SIZE, &l2u_size); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_SIZE, &l3u_size); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_LINE_SIZE, &l1i_line_sz); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_LINE_SIZE, &l1d_line_sz); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_LINE_SIZE, &l2u_line_sz); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_LINE_SIZE, &l3u_line_sz); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_LINE_COUNT, &l1i_line_cnt); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_LINE_COUNT, &l1d_line_cnt); 
PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_LINE_COUNT, &l2u_line_cnt); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_LINE_COUNT, &l3u_line_cnt); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_ASSOC, &l1i_cache_ass); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_ASSOC, &l1d_cache_ass); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_ASSOC, &l2u_cache_ass); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_ASSOC, &l3u_cache_ass); if (!quiet) { printf( "Vendor : %s (%u,0x%x)\n", vendor_name, vendor_id, vendor_id ); printf( "Id : %u\n", i ); printf( "Name : %s\n", cpu_name ); printf( "CPUID : Family/Model/Stepping %u/%u/%u 0x%02x/0x%02x/0x%02x\n", family, model, stepping, family, model, stepping ); printf( "Sockets : %u\n", sockets ); printf( "Numa regions : %u\n", numas ); printf( "Cores per socket : %u\n", cores ); printf( "Cores per NUMA region : %u\n", threads / numas ); printf( "SMT threads per core : %u\n", threads / sockets / cores ); if (l1i_size > 0) { printf( "L1i Cache : Size/LineSize/Lines/Assoc %uKB/%uB/%u/%u\n", l1i_size >> 10, l1i_line_sz, l1i_line_cnt, l1i_cache_ass); printf( "L1d Cache : Size/LineSize/Lines/Assoc %uKB/%uB/%u/%u\n", l1d_size >> 10, l1d_line_sz, l1d_line_cnt, l1d_cache_ass); } if (l2u_size > 0) { printf( "L2 Cache : Size/LineSize/Lines/Assoc %uKB/%uB/%u/%u\n", l2u_size >> 10, l2u_line_sz, l2u_line_cnt, l2u_cache_ass ); } if (l3u_size > 0) { printf( "L3 Cache : Size/LineSize/Lines/Assoc %uKB/%uB/%u/%u\n", l3u_size >> 10, l3u_line_sz, l3u_line_cnt, l3u_cache_ass ); } } } if (!quiet) { printf( "\n" ); } } if ( id == PAPI_DEV_TYPE_ID__CUDA && dev_count > 0 ) { if (!quiet) { printf( "Vendor : %s\n", vendor_name ); } for ( i = 0; i < dev_count; ++i ) { unsigned long uid; unsigned int warp_size, thread_per_block, block_per_sm; unsigned int shm_per_block, shm_per_sm; unsigned int blk_dim_x, blk_dim_y, blk_dim_z; unsigned int grd_dim_x, grd_dim_y, 
grd_dim_z; unsigned int sm_count, multi_kernel, map_host_mem, async_memcpy; unsigned int unif_addr, managed_mem; unsigned int cc_major, cc_minor; const char *dev_name; PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_ULONG_UID, &uid); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_CHAR_DEVICE_NAME, &dev_name); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_WARP_SIZE, &warp_size); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_THR_PER_BLK, &thread_per_block); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_BLK_PER_SM, &block_per_sm); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_SHM_PER_BLK, &shm_per_block); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_SHM_PER_SM, &shm_per_sm); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_X, &blk_dim_x); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_Y, &blk_dim_y); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_Z, &blk_dim_z); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_X, &grd_dim_x); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_Y, &grd_dim_y); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_Z, &grd_dim_z); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_SM_COUNT, &sm_count); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_MULTI_KERNEL, &multi_kernel); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_MAP_HOST_MEM, &map_host_mem); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_MEMCPY_OVERLAP, &async_memcpy); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_UNIFIED_ADDR, &unif_addr); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_MANAGED_MEM, &managed_mem); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MAJOR, &cc_major); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MINOR, &cc_minor); if (!quiet) { printf( "Id : %d\n", i ); printf( "UID : %lu\n", uid ); printf( "Name : %s\n", dev_name ); printf( "Warp size : %u\n", warp_size ); 
printf( "Max threads per block : %u\n", thread_per_block ); printf( "Max blocks per multiprocessor : %u\n", block_per_sm ); printf( "Max shared memory per block : %u\n", shm_per_block ); printf( "Max shared memory per multiprocessor : %u\n", shm_per_sm ); printf( "Max block dim x : %u\n", blk_dim_x ); printf( "Max block dim y : %u\n", blk_dim_y ); printf( "Max block dim z : %u\n", blk_dim_z ); printf( "Max grid dim x : %u\n", grd_dim_x ); printf( "Max grid dim y : %u\n", grd_dim_y ); printf( "Max grid dim z : %u\n", grd_dim_z ); printf( "Multiprocessor count : %u\n", sm_count ); printf( "Multiple kernels per context : %s\n", multi_kernel ? "yes" : "no" ); printf( "Can map host memory : %s\n", map_host_mem ? "yes" : "no"); printf( "Can overlap compute and data transfer : %s\n", async_memcpy ? "yes" : "no" ); printf( "Has unified addressing : %s\n", unif_addr ? "yes" : "no" ); printf( "Has managed memory : %s\n", managed_mem ? "yes" : "no" ); printf( "Compute capability : %u.%u\n", cc_major, cc_minor ); printf( "\n" ); } } } if ( id == PAPI_DEV_TYPE_ID__ROCM && dev_count > 0 ) { if (!quiet) { printf( "Vendor : %s\n", vendor_name ); } unsigned long uid; const char *dev_name; unsigned int wf_size, simd_per_cu, wg_size; unsigned int wf_per_cu, shm_per_wg, wg_dim_x, wg_dim_y, wg_dim_z; unsigned int grd_dim_x, grd_dim_y, grd_dim_z; unsigned int cu_count; unsigned int cc_major, cc_minor; for ( i = 0; i < dev_count; ++i ) { PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_ULONG_UID, &uid); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_CHAR_DEVICE_NAME, &dev_name); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_WAVEFRONT_SIZE, &wf_size); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_SIMD_PER_CU, &simd_per_cu); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_WORKGROUP_SIZE, &wg_size); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_WAVE_PER_CU, &wf_per_cu); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_SHM_PER_WG, &shm_per_wg); 
PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_X, &wg_dim_x); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_Y, &wg_dim_y); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_Z, &wg_dim_z); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_X, &grd_dim_x); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_Y, &grd_dim_y); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_Z, &grd_dim_z); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_CU_COUNT, &cu_count); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_COMP_CAP_MAJOR, &cc_major); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_COMP_CAP_MINOR, &cc_minor); if (!quiet) { printf( "Id : %d\n", i ); printf( "Name : %s\n", dev_name ); printf( "Wavefront size : %u\n", wf_size ); printf( "SIMD per compute unit : %u\n", simd_per_cu ); printf( "Max threads per workgroup : %u\n", wg_size ); printf( "Max waves per compute unit : %u\n", wf_per_cu ); printf( "Max shared memory per workgroup : %u\n", shm_per_wg ); printf( "Max workgroup dim x : %u\n", wg_dim_x ); printf( "Max workgroup dim y : %u\n", wg_dim_y ); printf( "Max workgroup dim z : %u\n", wg_dim_z ); printf( "Max grid dim x : %u\n", grd_dim_x ); printf( "Max grid dim y : %u\n", grd_dim_y ); printf( "Max grid dim z : %u\n", grd_dim_z ); printf( "Compute unit count : %u\n", cu_count ); printf( "Compute capability : %u.%u\n", cc_major, cc_minor ); printf( "\n" ); } } } } if (!quiet) { printf( "--------------------------------------------------------------------------------\n" ); } if (!quiet) printf("\n"); test_pass(__FILE__); return 0; } papi-papi-7-2-0-t/src/components/sysdetect/tests/query_device_simple_f.F000066400000000000000000000171121502707512200264400ustar00rootroot00000000000000 program test_F_interface use :: ISO_C_BINDING implicit none #include "f90papi.h" integer :: i, j, ret_val, handle, modifier, id, vendor_id integer :: dev_count, numas, dummy_val 
character(len=PAPI_MAX_STR_LEN) :: vendor_name, cpu_name character(len=1) :: dummy_str integer :: family, model, stepping, sockets, cores, threads integer :: l1i_size, l1d_size, l2u_size, l3u_size integer :: l1i_line_sz, l1d_line_sz, l2u_line_sz, l3u_line_sz integer :: l1i_line_cnt, l1d_line_cnt, l2u_line_cnt integer :: l3u_line_cnt, l1i_cache_ass, l1d_cache_ass integer :: l2u_cache_ass, l3u_cache_ass integer :: thr_list(256) integer :: mem_size, thread_count ret_val = PAPI_VER_CURRENT call papif_library_init(ret_val) if( ret_val .ne. PAPI_VER_CURRENT ) then print *,'Error at papif_init', ret_val, '!=', PAPI_VER_CURRENT print *,'PAPI_EINVAL', PAPI_EINVAL print *,'PAPI_ENOMEM', PAPI_ENOMEM print *,'PAPI_ECMP', PAPI_ECMP print *,'PAPI_ESYS', PAPI_ESYS stop endif modifier = PAPI_DEV_TYPE_ENUM__ALL call papif_enum_dev_type(modifier, handle, ret_val) do while (ret_val == PAPI_OK) call papif_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_PAPI_ID, id, dummy_str, ret_val) call papif_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_VENDOR_ID, vendor_id, dummy_str, ret_val) call papif_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__CHAR_NAME, dummy_val, vendor_name, ret_val) call papif_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_COUNT, dev_count, dummy_str, ret_val) if (id == PAPI_DEV_TYPE_ID__CPU .and. 
dev_count > 0) then numas = 1 do i = 0, dev_count - 1 call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_CHAR_NAME, dummy_val, cpu_name, & ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_FAMILY, family, dummy_str, & ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_MODEL, model, dummy_str, & ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_STEPPING, stepping, & dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_SOCKET_COUNT, sockets, & dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_NUMA_COUNT, numas, & dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_CORE_COUNT, cores, & dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_THREAD_COUNT, threads, & dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_SIZE, l1i_size, & dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_SIZE, l1d_size, & dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_SIZE, l2u_size, & dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_SIZE, l3u_size, & dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_LINE_SIZE, l1i_line_sz, & dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_LINE_SIZE, l1d_line_sz, & dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_LINE_SIZE, l2u_line_sz, & dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_LINE_SIZE, l3u_line_sz, & dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_LINE_COUNT, l1i_line_cnt,& dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_LINE_COUNT, l1d_line_cnt,& dummy_str, ret_val) call 
papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_LINE_COUNT, l2u_line_cnt,& dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_LINE_COUNT, l3u_line_cnt,& dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_ASSOC, l1i_cache_ass, & dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_ASSOC, l1d_cache_ass, & dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_ASSOC, l2u_cache_ass, & dummy_str, ret_val) call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_ASSOC, l3u_cache_ass, & dummy_str, ret_val) print *, 'Vendor : ', trim(vendor_name), '(', vendor_id, ')' print *, 'Id : ', i print *, 'Name : ', trim(cpu_name) print *, 'CPUID : ', family, '/', model, '/', stepping print *, 'Sockets : ', sockets print *, 'Numa regions : ', numas print *, 'Cores/socket : ', cores print *, 'Cores/numa : ', threads / numas print *, 'SMT/core : ', threads / sockets / cores if (l1i_size > 0) then print *, 'L1i cache : ', 'size/line size/lines/associativity ', & l1i_size / 1024, l1i_line_sz, l1i_line_cnt, l1i_cache_ass print *, 'L1d cache : ', 'size/line size/lines/associativity ', & l1d_size / 1024, l1d_line_sz, l1d_line_cnt, l1d_cache_ass endif if (l2u_size > 0) then print *, 'L2 cache : ', 'size/line size/lines/associativity ', & l2u_size / 1024, l2u_line_sz, l2u_line_cnt, l2u_cache_ass endif if (l3u_size > 0) then print *, 'L3 cache : ', 'size/line size/lines/associativity ', & l3u_size / 1024, l3u_line_sz, l3u_line_cnt, l3u_cache_ass endif enddo do i = 0, numas - 1 call papif_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_NUMA_MEM_SIZE, & mem_size, dummy_str, ret_val) enddo endif call papif_enum_dev_type(modifier, handle, ret_val) end do call papif_shutdown() end program 
papi-papi-7-2-0-t/src/components/sysdetect/x86_cpu_utils.c000066400000000000000000000530321502707512200235060ustar00rootroot00000000000000#include #include #include #include "sysdetect.h" #include "x86_cpu_utils.h" #include "os_cpu_utils.h" typedef struct { int smt_mask; int smt_width; int core_mask; int core_width; int pkg_mask; int pkg_width; } apic_subid_mask_t; typedef struct { int pkg; int core; int smt; } apic_subid_t; typedef struct { unsigned int eax; unsigned int ebx; unsigned int ecx; unsigned int edx; } cpuid_reg_t; static _sysdetect_cache_level_info_t clevel[PAPI_MAX_MEM_HIERARCHY_LEVELS]; static int cpuid_get_vendor( char *vendor ); static int cpuid_get_name( char *name ); static int cpuid_get_attribute( CPU_attr_e attr, int *value ); static int cpuid_get_attribute_at( CPU_attr_e attr, int loc, int *value ); static int cpuid_get_topology_info( CPU_attr_e attr, int *value ); static int cpuid_get_cache_info( CPU_attr_e attr, int level, int *value ); static int intel_get_cache_info( CPU_attr_e attr, int level, int *value ); static int amd_get_cache_info( CPU_attr_e attr, int level, int *value ); static int cpuid_supports_leaves_4_11( void ); static int enum_cpu_resources( int num_mappings, apic_subid_mask_t *mask, apic_subid_t *subids, int *sockets, int *cores, int *threads ); static int cpuid_get_versioning_info( CPU_attr_e attr, int *value ); static int cpuid_parse_id_foreach_thread( unsigned int num_mappings, apic_subid_mask_t *mask, apic_subid_t *subid ); static int cpuid_parse_ids( int os_proc_count, apic_subid_mask_t *mask, apic_subid_t *subid ); static int cpuid_get_mask( apic_subid_mask_t *mask ); static int cpuid_get_leaf11_mask( apic_subid_mask_t *mask ); static int cpuid_get_leaf4_mask( apic_subid_mask_t *mask ); static unsigned int cpuid_get_x2apic_id( void ); static unsigned int cpuid_get_apic_id( void ); static unsigned int bit_width( unsigned int x ); static void cpuid( cpuid_reg_t *reg, const unsigned int func ); static void cpuid2( 
cpuid_reg_t *reg, const unsigned int func, const unsigned int subfunc );

static int cpuid_has_leaf4;  /* support legacy leaf1 and leaf4 interface */
static int cpuid_has_leaf11; /* support modern leaf11 interface */

int x86_cpu_init( void )
{
    /*
     * In the future we might need to dynamically
     * allocate and free objects; init/finalize
     * functions are a good place for doing that.
     */
    return CPU_SUCCESS;
}

int x86_cpu_finalize( void )
{
    return CPU_SUCCESS;
}

int x86_cpu_get_vendor( char *vendor )
{
    return cpuid_get_vendor(vendor);
}

int x86_cpu_get_name( char *name )
{
    return cpuid_get_name(name);
}

int x86_cpu_get_attribute( CPU_attr_e attr, int *value )
{
    return cpuid_get_attribute(attr, value);
}

int x86_cpu_get_attribute_at( CPU_attr_e attr, int loc, int *value )
{
    return cpuid_get_attribute_at(attr, loc, value);
}

int cpuid_get_vendor( char *vendor )
{
    cpuid_reg_t reg;
    cpuid(&reg, 0); /* Highest function parameter and manufacturer ID */
    memcpy(vendor    , &reg.ebx, 4);
    memcpy(vendor + 4, &reg.edx, 4);
    memcpy(vendor + 8, &reg.ecx, 4);
    vendor[12] = '\0';
    return CPU_SUCCESS;
}

int cpuid_get_name( char *name )
{
    cpuid_reg_t reg;
    cpuid(&reg, 0x80000000);
    if (reg.eax < 0x80000004) {
        /* Feature not implemented. Fallback! */
        return os_cpu_get_name(name);
    }

    cpuid(&reg, 0x80000002);
    memcpy(name     , &reg.eax, 4);
    memcpy(name + 4 , &reg.ebx, 4);
    memcpy(name + 8 , &reg.ecx, 4);
    memcpy(name + 12, &reg.edx, 4);
    cpuid(&reg, 0x80000003);
    memcpy(name + 16, &reg.eax, 4);
    memcpy(name + 20, &reg.ebx, 4);
    memcpy(name + 24, &reg.ecx, 4);
    memcpy(name + 28, &reg.edx, 4);
    cpuid(&reg, 0x80000004);
    memcpy(name + 32, &reg.eax, 4);
    memcpy(name + 36, &reg.ebx, 4);
    memcpy(name + 40, &reg.ecx, 4);
    memcpy(name + 44, &reg.edx, 4);
    name[48] = '\0';
    return CPU_SUCCESS;
}

int cpuid_get_attribute( CPU_attr_e attr, int *value )
{
    int status = CPU_SUCCESS;

    switch(attr) {
        case CPU_ATTR__NUM_SOCKETS:
        case CPU_ATTR__NUM_NODES:
        case CPU_ATTR__NUM_THREADS:
        case CPU_ATTR__NUM_CORES:
            status = cpuid_get_topology_info(attr, value);
            break;
        case CPU_ATTR__CPUID_FAMILY:
        case CPU_ATTR__CPUID_MODEL:
        case CPU_ATTR__CPUID_STEPPING:
            status = cpuid_get_versioning_info(attr, value);
            break;
        case CPU_ATTR__CACHE_MAX_NUM_LEVELS:
            *value = PAPI_MAX_MEM_HIERARCHY_LEVELS;
            break;
        default:
            status = os_cpu_get_attribute(attr, value);
    }

    return status;
}

int cpuid_get_attribute_at( CPU_attr_e attr, int loc, int *value )
{
    int status = CPU_SUCCESS;

    switch(attr) {
        case CPU_ATTR__CACHE_INST_PRESENT:
        case CPU_ATTR__CACHE_DATA_PRESENT:
        case CPU_ATTR__CACHE_UNIF_PRESENT:
        case CPU_ATTR__CACHE_INST_TOT_SIZE:
        case CPU_ATTR__CACHE_INST_LINE_SIZE:
        case CPU_ATTR__CACHE_INST_NUM_LINES:
        case CPU_ATTR__CACHE_INST_ASSOCIATIVITY:
        case CPU_ATTR__CACHE_DATA_TOT_SIZE:
        case CPU_ATTR__CACHE_DATA_LINE_SIZE:
        case CPU_ATTR__CACHE_DATA_NUM_LINES:
        case CPU_ATTR__CACHE_DATA_ASSOCIATIVITY:
        case CPU_ATTR__CACHE_UNIF_TOT_SIZE:
        case CPU_ATTR__CACHE_UNIF_LINE_SIZE:
        case CPU_ATTR__CACHE_UNIF_NUM_LINES:
        case CPU_ATTR__CACHE_UNIF_ASSOCIATIVITY:
            status = cpuid_get_cache_info(attr, loc, value);
            break;
        case CPU_ATTR__NUMA_MEM_SIZE:
        case CPU_ATTR__HWTHREAD_NUMA_AFFINITY:
            status = os_cpu_get_attribute_at(attr, loc, value);
            break;
        default:
            status = CPU_ERROR;
    }

    return status;
}

int cpuid_get_topology_info( CPU_attr_e attr, int
*value ) { int status = CPU_SUCCESS; static int sockets, nodes, cores, threads; if (attr == CPU_ATTR__NUM_SOCKETS && sockets) { *value = sockets; return status; } else if (attr == CPU_ATTR__NUM_THREADS && threads) { *value = threads; return status; } else if (attr == CPU_ATTR__NUM_CORES && cores) { *value = cores; return status; } else if (attr == CPU_ATTR__NUM_NODES && nodes) { *value = nodes; return status; } /* Query for cpuid supported topology enumeration capabilities: * - cpuid in the first generation of Intel Xeon and Intel Pentium 4 * supporting hyper-threading (2002) provides information that allows * to decompose the 8-bit wide APIC IDs into a two-level topology * enumeration; * - with the introduction of dual-core Intel 64 processors in 2005, * system topology enumeration using cpuid evolved into a three-level * algorithm (to account for physical cores) on the 8-bit wide APIC ID; * - modern Intel 64 platforms with support for large number of logical * processors use an extended 32-bit wide x2APIC ID. This is known as * cpuid leaf11 interface. Legacy cpuid interface with limited 256 * APIC IDs, is referred to as leaf4. */ if (!cpuid_supports_leaves_4_11()) { return os_cpu_get_attribute(attr, value); } /* Allocate SUBIDs' space for each logical processor */ int os_proc_count = os_cpu_get_num_supported(); apic_subid_t *subids = papi_malloc(os_proc_count * sizeof(*subids)); if (!subids) return CPU_ERROR; /* Get masks for later SUBIDs extraction */ apic_subid_mask_t mask = { 0 }; if (cpuid_get_mask(&mask)) goto fn_fail; /* For each logical processor get the unique APIC/x2APIC ID and use * use previously retrieved masks to extract package, core and smt * SUBIDs. 
*/ int num_mappings = cpuid_parse_ids(os_proc_count, &mask, subids); if (num_mappings == -1) goto fn_fail; /* Enumerate all cpu resources once and store them for later */ status = enum_cpu_resources(num_mappings, &mask, subids, &sockets, &cores, &threads); if (status != CPU_SUCCESS) goto fn_fail; if (attr == CPU_ATTR__NUM_SOCKETS && sockets) { *value = sockets; } else if (attr == CPU_ATTR__NUM_THREADS && threads) { *value = threads; } else if (attr == CPU_ATTR__NUM_CORES && cores) { *value = cores; } else if (attr == CPU_ATTR__NUM_NODES) { /* We can't read the number of numa nodes using cpuid */ status = os_cpu_get_attribute(attr, &nodes); *value = nodes; } /* Parse subids and get package, core and smt counts */ papi_free(subids); fn_exit: return status; fn_fail: papi_free(subids); status = CPU_ERROR; goto fn_exit; } int cpuid_get_cache_info( CPU_attr_e attr, int level, int *value ) { int status = CPU_SUCCESS; char vendor[13] = { 0 }; cpuid_get_vendor(vendor); if (!strcmp(vendor, "GenuineIntel")) { status = intel_get_cache_info(attr, level, value); } else if (!strcmp(vendor, "AuthenticAMD")) { status = amd_get_cache_info(attr, level, value); } else { status = CPU_ERROR; } return status; } int intel_get_cache_info( CPU_attr_e attr, int level, int *value ) { static _sysdetect_cache_level_info_t *clevel_ptr; if (clevel_ptr) { return cpu_get_cache_info(attr, level, clevel_ptr, value); } if (!cpuid_supports_leaves_4_11()) { return os_cpu_get_attribute_at(attr, level, value); } clevel_ptr = clevel; cpuid_reg_t reg; int subleaf = 0; while(1) { /* * We query cache info only for the logical processor we are running on * and rely on the fact that the rest are all identical */ cpuid2(®, 4, subleaf); /* * Decoded as per table 3-12 in Intel's Software Developer's Manual * Volume 2A */ int type = reg.eax & 0x1f; if (type == 0) break; switch(type) { case 1: type = PAPI_MH_TYPE_DATA; break; case 2: type = PAPI_MH_TYPE_INST; break; case 3: type = PAPI_MH_TYPE_UNIFIED; break; 
default: type = PAPI_MH_TYPE_UNKNOWN; } int level = (reg.eax >> 5) & 0x3; int fully_assoc = (reg.eax >> 9) & 0x1; int line_size = (reg.ebx & 0xfff) + 1; int partitions = ((reg.ebx >> 12) & 0x3ff) + 1; int ways = ((reg.ebx >> 22) & 0x3ff) + 1; int sets = (reg.ecx + 1); int *num_caches = &clevel[level-1].num_caches; clevel_ptr[level-1].cache[*num_caches].type = type; clevel_ptr[level-1].cache[*num_caches].size = (ways * partitions * sets * line_size); clevel_ptr[level-1].cache[*num_caches].line_size = line_size; clevel_ptr[level-1].cache[*num_caches].num_lines = (ways * partitions * sets); clevel_ptr[level-1].cache[*num_caches].associativity = (fully_assoc) ? SHRT_MAX : ways; ++(*num_caches); ++subleaf; } return cpu_get_cache_info(attr, level, clevel_ptr, value); } int amd_get_cache_info( CPU_attr_e attr, int level, int *value ) { static _sysdetect_cache_level_info_t *clevel_ptr; if (clevel_ptr) { return cpu_get_cache_info(attr, level, clevel_ptr, value); } cpuid_reg_t reg; /* L1 Caches */ cpuid(®, 0x80000005); unsigned char byt[16]; memcpy(byt , ®.eax, 4); memcpy(byt + 4 , ®.ebx, 4); memcpy(byt + 8 , ®.ecx, 4); memcpy(byt + 12, ®.edx, 4); clevel_ptr = clevel; clevel_ptr[0].cache[0].type = PAPI_MH_TYPE_DATA; clevel_ptr[0].cache[0].size = byt[11] << 10; clevel_ptr[0].cache[0].line_size = byt[8]; clevel_ptr[0].cache[0].num_lines = clevel_ptr[0].cache[0].size / clevel_ptr[0].cache[0].line_size; clevel_ptr[0].cache[0].associativity = byt[10]; clevel_ptr[0].cache[1].type = PAPI_MH_TYPE_INST; clevel_ptr[0].cache[1].size = byt[15] << 10; clevel_ptr[0].cache[1].line_size = byt[12]; clevel_ptr[0].cache[1].num_lines = clevel_ptr[0].cache[1].size / clevel_ptr[0].cache[1].line_size; clevel_ptr[0].cache[1].associativity = byt[14]; clevel_ptr[0].num_caches = 2; /* L2 and L3 caches */ cpuid(®, 0x80000006); memcpy(byt , ®.eax, 4); memcpy(byt + 4 , ®.ebx, 4); memcpy(byt + 8 , ®.ecx, 4); memcpy(byt + 12, ®.edx, 4); static short int assoc[16] = { 0, 1, 2, -1, 4, -1, 8, -1, 16, -1, 32, 
48, 64, 96, 128, SHRT_MAX };

    if (reg.ecx) {
        clevel_ptr[1].cache[0].type = PAPI_MH_TYPE_UNIFIED;
        clevel_ptr[1].cache[0].size = (int)((reg.ecx & 0xffff0000) >> 6);
        clevel_ptr[1].cache[0].line_size = byt[8];
        clevel_ptr[1].cache[0].num_lines =
            clevel_ptr[1].cache[0].size / clevel_ptr[1].cache[0].line_size;
        clevel_ptr[1].cache[0].associativity = assoc[(byt[9] & 0xf0) >> 4];
        clevel_ptr[1].num_caches = 1;
    }

    if (reg.edx) {
        clevel_ptr[2].cache[0].type = PAPI_MH_TYPE_UNIFIED;
        clevel_ptr[2].cache[0].size = (int)((reg.edx & 0xfffc0000) << 1);
        clevel_ptr[2].cache[0].line_size = byt[12];
        clevel_ptr[2].cache[0].num_lines =
            clevel_ptr[2].cache[0].size / clevel_ptr[2].cache[0].line_size;
        clevel_ptr[2].cache[0].associativity = assoc[(byt[13] & 0xf0) >> 4];
        clevel_ptr[2].num_caches = 1;
    }

    return cpu_get_cache_info(attr, level, clevel_ptr, value);
}

int cpuid_supports_leaves_4_11( void )
{
    char vendor[13];
    cpuid_get_vendor(vendor);

    cpuid_reg_t reg;
    cpuid(&reg, 0);

    /* If leaf4 not supported or vendor is not Intel, fallback */
    int fallback = (reg.eax < 4 || strcmp(vendor, "GenuineIntel"));
    if (!fallback) {
        cpuid_has_leaf4 = 1;
        cpuid(&reg, 11);
        if (reg.ebx != 0)
            cpuid_has_leaf11 = 1;
    }

    return !fallback;
}

int enum_cpu_resources( int num_mappings, apic_subid_mask_t *mask,
                        apic_subid_t *subids, int *sockets, int *cores,
                        int *threads )
{
    int status = CPU_SUCCESS;

    int max_num_pkgs = (1 << mask->pkg_width);
    int max_num_cores = (1 << mask->core_width);
    int max_num_threads = (1 << mask->smt_width);

    int *pkg_arr = papi_calloc(max_num_pkgs, sizeof(int));
    if (!pkg_arr) goto fn_fail_pkg;
    int *core_arr = papi_calloc(max_num_cores, sizeof(int));
    if (!core_arr) goto fn_fail_core;
    int *smt_arr = papi_calloc(max_num_threads, sizeof(int));
    if (!smt_arr) goto fn_fail_thread;

    int i;
    for (i = 0; i < num_mappings; ++i) {
        pkg_arr[subids[i].pkg] = core_arr[subids[i].core] = smt_arr[subids[i].smt] = 1;
    }

    i = 0, *sockets = 0;
    while (i < max_num_pkgs) {
        if (pkg_arr[i++] != 0) (*sockets)++;
    }

    i = 0, *cores = 0;
    while (i
< max_num_cores) { if (core_arr[i++] != 0) (*cores)++; } i = 0, *threads = 0; while (i < max_num_threads) { if (smt_arr[i++] != 0) (*threads)++; } papi_free(pkg_arr); papi_free(core_arr); papi_free(smt_arr); fn_exit: return status; fn_fail_thread: papi_free(core_arr); fn_fail_core: papi_free(pkg_arr); fn_fail_pkg: status = CPU_ERROR; goto fn_exit; } int cpuid_get_versioning_info( CPU_attr_e attr, int *value ) { static int family, model, stepping; if (attr == CPU_ATTR__CPUID_FAMILY && family) { *value = family; return CPU_SUCCESS; } else if (attr == CPU_ATTR__CPUID_MODEL && model) { *value = model; return CPU_SUCCESS; } else if (attr == CPU_ATTR__CPUID_STEPPING && stepping) { *value = stepping; return CPU_SUCCESS; } cpuid_reg_t reg; cpuid(®, 1); /* Query versioning info once and store results for later */ family = (reg.eax >> 8) & 0x0000000f; model = (family == 6 || family == 15) ? ((reg.eax >> 4) & 0x0000000f) + ((reg.eax >> 12) & 0x000000f0) : ((reg.eax >> 4) & 0x0000000f); stepping = reg.eax & 0x0000000f; char vendor[13]; cpuid_get_vendor(vendor); if (!strcmp(vendor, "AuthenticAMD") && family == 15) { /* Adjust family for AMD processors */ family += (reg.eax >> 20) & 0x000000ff; } if (attr == CPU_ATTR__CPUID_FAMILY) { *value = family; } else if (attr == CPU_ATTR__CPUID_MODEL) { *value = model; } else { *value = stepping; } return CPU_SUCCESS; } int cpuid_parse_id_foreach_thread( unsigned int num_mappings, apic_subid_mask_t *mask, apic_subid_t *subid ) { unsigned int apic_id = cpuid_get_apic_id(); subid[num_mappings].pkg = ((apic_id & mask->pkg_mask ) >> (mask->smt_width + mask->core_width)); subid[num_mappings].core = ((apic_id & mask->core_mask) >> (mask->smt_width)); subid[num_mappings].smt = ((apic_id & mask->smt_mask )); return CPU_SUCCESS; } int cpuid_parse_ids( int os_proc_count, apic_subid_mask_t *mask, apic_subid_t *subid ) { int i, ret = 0; int num_mappings = 0; /* save cpu affinity */ os_cpu_store_affinity(); for (i = 0; i < os_proc_count; ++i) { /* 
check if we are allowed to run on this logical processor */ if (os_cpu_set_affinity(i)) { ret = -1; break; } /* now query id for the logical processor */ cpuid_parse_id_foreach_thread(num_mappings, mask, subid); /* increment parsed ids */ ret = ++num_mappings; } /* restore cpu affinity */ os_cpu_load_affinity(); return ret; } int cpuid_get_mask( apic_subid_mask_t *mask ) { if (cpuid_has_leaf11) { return cpuid_get_leaf11_mask(mask); } return cpuid_get_leaf4_mask(mask); } int cpuid_get_leaf11_mask( apic_subid_mask_t *mask ) { int status = CPU_SUCCESS; int core_reported = 0; int thread_reported = 0; int sub_leaf = 0, level_type, level_shift; unsigned int core_plus_smt_mask = 0; unsigned int core_plus_smt_width = 0; do { cpuid_reg_t reg; cpuid2(&reg, 11, sub_leaf); if (reg.ebx == 0) break; level_type = (reg.ecx >> 8) & 0x000000ff; level_shift = reg.eax & 0x0000001f; /* * x2APIC ID layout (32 bits) * +---------+----------+---------+ * | pkg | core | smt | * +---------+----------+---------+ * <---------> * level type = 1 * level shift = smt width * <--------------------> * level type = 2 * level shift = core + smt width * */ switch (level_type) { case 1: /* level type is SMT, so the mask width is level shift */ mask->smt_mask = ~(0xFFFFFFFF << level_shift); mask->smt_width = level_shift; thread_reported = 1; break; case 2: /* level type is core, so the core + smt mask width is level shift */ core_plus_smt_mask = ~(0xFFFFFFFF << level_shift); core_plus_smt_width = level_shift; mask->pkg_mask = 0xFFFFFFFF ^ core_plus_smt_mask; mask->pkg_width = 8; /* use reasonably high value */ core_reported = 1; break; default: break; } ++sub_leaf; } while(1); if (thread_reported && core_reported) { mask->core_mask = core_plus_smt_mask ^ mask->smt_mask; mask->core_width = core_plus_smt_width - mask->smt_width; } else if (!core_reported && thread_reported) { mask->core_mask = 0; mask->core_width = 0; mask->pkg_mask = 0xFFFFFFFF ^ mask->smt_mask; mask->pkg_width = 8; /* use reasonably high 
value */ } else { status = CPU_ERROR; } return status; } int cpuid_get_leaf4_mask( apic_subid_mask_t *mask ) { cpuid_reg_t reg; cpuid(&reg, 1); unsigned int core_plus_smt_max_cnt = (reg.ebx >> 16) & 0x000000ff; cpuid(&reg, 4); unsigned int core_max_cnt = ((reg.eax >> 26) & 0x0000003f) + 1; unsigned int core_width = bit_width(core_max_cnt); unsigned int smt_width = bit_width(core_plus_smt_max_cnt) - core_width; mask->smt_mask = ~(0xFFFFFFFF << smt_width); mask->smt_width = smt_width; mask->core_mask = ~(0xFFFFFFFF << bit_width(core_plus_smt_max_cnt)) ^ mask->smt_mask; mask->core_width = core_width; mask->pkg_mask = 0xFFFFFFFF << bit_width(core_plus_smt_max_cnt); mask->pkg_width = 8; /* use reasonably high value */ return CPU_SUCCESS; } unsigned int cpuid_get_x2apic_id( void ) { cpuid_reg_t reg; cpuid(&reg, 11); return reg.edx; } unsigned int cpuid_get_apic_id( void ) { if (cpuid_has_leaf11) { return cpuid_get_x2apic_id(); } cpuid_reg_t reg; cpuid(&reg, 1); return (reg.ebx >> 24) & 0x000000ff; } unsigned int bit_width( unsigned int x ) { int count = 0; double y = (double) x; while ((y /= 2) > 1) { ++count; } return (y < 1) ? 
count + 1 : count; } void cpuid2( cpuid_reg_t *reg, unsigned int func, unsigned int subfunc ) { __asm__ ("cpuid;" : "=a" (reg->eax), "=b" (reg->ebx), "=c" (reg->ecx), "=d" (reg->edx) : "a" (func), "c" (subfunc)); } void cpuid( cpuid_reg_t *reg, unsigned int func ) { cpuid2(reg, func, 0); } papi-papi-7-2-0-t/src/components/sysdetect/x86_cpu_utils.h000066400000000000000000000005601502707512200235110ustar00rootroot00000000000000#ifndef __X86_UTIL_H__ #define __X86_UTIL_H__ #include "cpu_utils.h" int x86_cpu_init( void ); int x86_cpu_finalize( void ); int x86_cpu_get_vendor( char *vendor ); int x86_cpu_get_name( char *name ); int x86_cpu_get_attribute( CPU_attr_e attr, int *value ); int x86_cpu_get_attribute_at( CPU_attr_e attr, int loc, int *value ); #endif /* End of __X86_UTIL_H__ */ papi-papi-7-2-0-t/src/components/template/000077500000000000000000000000001502707512200204275ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/template/README.md000066400000000000000000000032141502707512200217060ustar00rootroot00000000000000# Template Component The Template component showcases a possible approach to component design that varies from the more traditional monolitic approach, used in some existing components. The Template component is split into a front-end (template.c) and a back-end (vendor\_{dispatch,common,profiler}.c). The goal of this layered design is to decouple changes in the vendor libraries from the existing implementation. If, for example, the vendor library introduces a new set of interfaces and deprecates the old ones, a new vendor\_dispatch\_v2.c can be written to add support for the new vendor library interface without affecting the old vendor related code (i.e., vendor\_profiler.c). This can still be kept for backward compatibility with older vendor library versions. The dispatch layer (i.e., vendor\_dispatch.c) takes care of forwarding profiling calls to the right vendor library version. 
Any code that is shared between the different vendor library versions is placed in vendor\_common.c. This can contain common init routines (e.g., for the vendor runtime system initialization, like cuda or hsa), utility routines and shared data structures (e.g., device information tables). The template component emulates support for a generic vendor library to profile associated device hardware counters. The implementation is fairly detailed in stating the design decisions. For example, there is a clear separation between the front-end and the back-end with no sharing of data between the two. The front-end only takes care of doing some book keeping work (perhaps rearrange the order of events as it sees fit) and calling into the back-end functions to do actual work. papi-papi-7-2-0-t/src/components/template/Rules.template000066400000000000000000000013601502707512200232560ustar00rootroot00000000000000COMPSRCS += components/template/template.c \ components/template/vendor_dispatch.c \ components/template/vendor_common.c \ components/template/vendor_profiler_v1.c COMPOBJS += template.o vendor_dispatch.o vendor_common.o vendor_profiler_v1.o CFLAGS += -g LDFLAGS += $(LDL) template.o: components/template/template.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c $< -o $@ vendor_dispatch.o: components/template/vendor_dispatch.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c $< -o $@ vendor_common.o: components/template/vendor_common.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c $< -o $@ vendor_profiler_v1.o: components/template/vendor_profiler_v1.c $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c $< -o $@ papi-papi-7-2-0-t/src/components/template/template.c000066400000000000000000000265551502707512200224230ustar00rootroot00000000000000#include #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" #include "extras.h" #include "vendor_dispatch.h" #define TEMPLATE_MAX_COUNTERS (16) /* Init and finalize */ static int templ_init_component(int 
cid); static int templ_init_thread(hwd_context_t *ctx); static int templ_init_control_state(hwd_control_state_t *ctl); static int templ_init_private(void); static int templ_shutdown_component(void); static int templ_shutdown_thread(hwd_context_t *ctx); static int templ_cleanup_eventset(hwd_control_state_t *ctl); /* Set and update component state */ static int templ_update_control_state(hwd_control_state_t *ctl, NativeInfo_t *ntv_info, int ntv_count, hwd_context_t *ctx); /* Start and stop profiling of hardware events */ static int templ_start(hwd_context_t *ctx, hwd_control_state_t *ctl); static int templ_read(hwd_context_t *ctx, hwd_control_state_t *ctl, long long **val, int flags); static int templ_stop(hwd_context_t *ctx, hwd_control_state_t *ctl); static int templ_reset(hwd_context_t *ctx, hwd_control_state_t *ctl); /* Event conversion */ static int templ_ntv_enum_events(unsigned int *event_code, int modifier); static int templ_ntv_code_to_name(unsigned int event_code, char *name, int len); static int templ_ntv_name_to_code(const char *name, unsigned int *event_code); static int templ_ntv_code_to_descr(unsigned int event_code, char *descr, int len); static int templ_ntv_code_to_info(unsigned int event_code, PAPI_event_info_t *info); static int templ_set_domain(hwd_control_state_t *ctl, int domain); static int templ_ctl(hwd_context_t *ctx, int code, _papi_int_option_t *option); typedef struct { int initialized; int state; int component_id; } templ_context_t; typedef struct { unsigned int *events_id; int num_events; vendord_ctx_t vendor_ctx; } templ_control_t; papi_vector_t _template_vector = { .cmp_info = { .name = "templ", .short_name = "templ", .version = "1.0", .description = "Template component for new components", .initialized = 0, .num_mpx_cntrs = TEMPLATE_MAX_COUNTERS, }, .size = { .context = sizeof(templ_context_t), .control_state = sizeof(templ_control_t), .reg_value = 1, .reg_alloc = 1, }, .init_component = templ_init_component, .init_thread = 
templ_init_thread, .init_control_state = templ_init_control_state, .shutdown_component = templ_shutdown_component, .shutdown_thread = templ_shutdown_thread, .cleanup_eventset = templ_cleanup_eventset, .update_control_state = templ_update_control_state, .start = templ_start, .stop = templ_stop, .read = templ_read, .reset = templ_reset, .ntv_enum_events = templ_ntv_enum_events, .ntv_code_to_name = templ_ntv_code_to_name, .ntv_name_to_code = templ_ntv_name_to_code, .ntv_code_to_descr = templ_ntv_code_to_descr, .ntv_code_to_info = templ_ntv_code_to_info, .set_domain = templ_set_domain, .ctl = templ_ctl, }; static int check_n_initialize(void); int templ_init_component(int cid) { _template_vector.cmp_info.CmpIdx = cid; _template_vector.cmp_info.num_native_events = -1; _template_vector.cmp_info.num_cntrs = -1; _templ_lock = PAPI_NUM_LOCK + NUM_INNER_LOCK + cid; int papi_errno = vendord_init_pre(); if (papi_errno != PAPI_OK) { _template_vector.cmp_info.initialized = 1; _template_vector.cmp_info.disabled = papi_errno; const char *err_string; vendord_err_get_last(&err_string); snprintf(_template_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "%s", err_string); return papi_errno; } sprintf(_template_vector.cmp_info.disabled_reason, "Not initialized. 
Access component events to initialize it."); _template_vector.cmp_info.disabled = PAPI_EDELAY_INIT; return PAPI_EDELAY_INIT; } int templ_init_thread(hwd_context_t *ctx) { templ_context_t *templ_ctx = (templ_context_t *) ctx; memset(templ_ctx, 0, sizeof(*templ_ctx)); templ_ctx->initialized = 1; templ_ctx->component_id = _template_vector.cmp_info.CmpIdx; return PAPI_OK; } int templ_init_control_state(hwd_control_state_t *ctl __attribute__((unused))) { return check_n_initialize(); } static int evt_get_count(int *count) { unsigned int event_code = 0; if (vendord_evt_enum(&event_code, PAPI_ENUM_FIRST) == PAPI_OK) { ++(*count); } while (vendord_evt_enum(&event_code, PAPI_ENUM_EVENTS) == PAPI_OK) { ++(*count); } return PAPI_OK; } int templ_init_private(void) { int papi_errno = PAPI_OK; _papi_hwi_lock(COMPONENT_LOCK); if (_template_vector.cmp_info.initialized) { papi_errno = _template_vector.cmp_info.disabled; goto fn_exit; } papi_errno = vendord_init(); if (papi_errno != PAPI_OK) { _template_vector.cmp_info.disabled = papi_errno; const char *err_string; vendord_err_get_last(&err_string); snprintf(_template_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "%s", err_string); goto fn_fail; } int count = 0; papi_errno = evt_get_count(&count); if (papi_errno != PAPI_OK) { goto fn_fail; } _template_vector.cmp_info.num_native_events = count; _template_vector.cmp_info.num_cntrs = count; _template_vector.cmp_info.initialized = 1; fn_exit: _template_vector.cmp_info.disabled = papi_errno; _papi_hwi_unlock(COMPONENT_LOCK); return papi_errno; fn_fail: goto fn_exit; } int templ_shutdown_component(void) { _template_vector.cmp_info.initialized = 0; return vendord_shutdown(); } int templ_shutdown_thread(hwd_context_t *ctx) { templ_context_t *templ_ctx = (templ_context_t *) ctx; templ_ctx->initialized = 0; templ_ctx->state = 0; return PAPI_OK; } int templ_cleanup_eventset(hwd_control_state_t *ctl) { templ_control_t *templ_ctl = (templ_control_t *) ctl; papi_free(templ_ctl->events_id); 
templ_ctl->events_id = NULL; templ_ctl->num_events = 0; return PAPI_OK; } static int update_native_events(templ_control_t *, NativeInfo_t *, int); static int try_open_events(templ_control_t *); int templ_update_control_state(hwd_control_state_t *ctl, NativeInfo_t *ntv_info, int ntv_count, hwd_context_t *ctx __attribute__((unused))) { int papi_errno = check_n_initialize(); if (papi_errno != PAPI_OK) { return papi_errno; } templ_control_t *templ_ctl = (templ_control_t *) ctl; if (templ_ctl->vendor_ctx != NULL) { return PAPI_ECMP; } papi_errno = update_native_events(templ_ctl, ntv_info, ntv_count); if (papi_errno != PAPI_OK) { return papi_errno; } return try_open_events(templ_ctl); } int update_native_events(templ_control_t *ctl, NativeInfo_t *ntv_info, int ntv_count) { int papi_errno = PAPI_OK; if (ntv_count != ctl->num_events) { ctl->num_events = ntv_count; if (ntv_count == 0) { papi_free(ctl->events_id); ctl->events_id = NULL; goto fn_exit; } else { ctl->events_id = papi_realloc(ctl->events_id, ntv_count * sizeof(*ctl->events_id)); if (ctl->events_id == NULL) { papi_errno = PAPI_ENOMEM; goto fn_fail; } } } int i; for (i = 0; i < ntv_count; ++i) { ctl->events_id[i] = ntv_info[i].ni_event; ntv_info[i].ni_position = i; } fn_exit: return papi_errno; fn_fail: ctl->num_events = 0; goto fn_exit; } int try_open_events(templ_control_t *templ_ctl) { int papi_errno = PAPI_OK; vendord_ctx_t vendor_ctx; papi_errno = vendord_ctx_open(templ_ctl->events_id, templ_ctl->num_events, &vendor_ctx); if (papi_errno != PAPI_OK) { templ_cleanup_eventset(templ_ctl); return papi_errno; } return vendord_ctx_close(vendor_ctx); } int templ_start(hwd_context_t *ctx, hwd_control_state_t *ctl) { int papi_errno = PAPI_OK; templ_context_t *templ_ctx = (templ_context_t *) ctx; templ_control_t *templ_ctl = (templ_control_t *) ctl; if (templ_ctx->state & TEMPL_CTX_OPENED) { return PAPI_EINVAL; } papi_errno = vendord_ctx_open(templ_ctl->events_id, templ_ctl->num_events, &templ_ctl->vendor_ctx); if 
(papi_errno != PAPI_OK) { goto fn_fail; } templ_ctx->state = TEMPL_CTX_OPENED; papi_errno = vendord_ctx_start(templ_ctl->vendor_ctx); if (papi_errno != PAPI_OK) { goto fn_fail; } templ_ctx->state |= TEMPL_CTX_RUNNING; fn_exit: return papi_errno; fn_fail: vendord_ctx_close(templ_ctl->vendor_ctx); templ_ctx->state = 0; goto fn_exit; } int templ_read(hwd_context_t *ctx __attribute__((unused)), hwd_control_state_t *ctl, long long **val, int flags __attribute__((unused))) { templ_control_t *templ_ctl = (templ_control_t *) ctl; return vendord_ctx_read(templ_ctl->vendor_ctx, val); } int templ_stop(hwd_context_t *ctx, hwd_control_state_t *ctl) { int papi_errno = PAPI_OK; templ_context_t *templ_ctx = (templ_context_t *) ctx; templ_control_t *templ_ctl = (templ_control_t *) ctl; if (!(templ_ctx->state & TEMPL_CTX_OPENED)) { return PAPI_EINVAL; } papi_errno = vendord_ctx_stop(templ_ctl->vendor_ctx); if (papi_errno != PAPI_OK) { return papi_errno; } templ_ctx->state &= ~TEMPL_CTX_RUNNING; papi_errno = vendord_ctx_close(templ_ctl->vendor_ctx); if (papi_errno != PAPI_OK) { return papi_errno; } templ_ctx->state = 0; templ_ctl->vendor_ctx = NULL; return papi_errno; } int templ_reset(hwd_context_t *ctx __attribute__((unused)), hwd_control_state_t *ctl) { templ_control_t *templ_ctl = (templ_control_t *) ctl; return vendord_ctx_reset(templ_ctl->vendor_ctx); } int templ_ntv_enum_events(unsigned int *event_code, int modifier) { int papi_errno = check_n_initialize(); if (papi_errno != PAPI_OK) { return papi_errno; } return vendord_evt_enum(event_code, modifier); } int templ_ntv_code_to_name(unsigned int event_code, char *name, int len) { int papi_errno = check_n_initialize(); if (papi_errno != PAPI_OK) { return papi_errno; } return vendord_evt_code_to_name(event_code, name, len); } int templ_ntv_name_to_code(const char *name, unsigned int *code) { int papi_errno = check_n_initialize(); if (papi_errno != PAPI_OK) { return papi_errno; } return vendord_evt_name_to_code(name, code); } int 
templ_ntv_code_to_descr(unsigned int event_code, char *descr, int len) { int papi_errno = check_n_initialize(); if (papi_errno != PAPI_OK) { return papi_errno; } return vendord_evt_code_to_descr(event_code, descr, len); } int templ_ntv_code_to_info(unsigned int event_code, PAPI_event_info_t *info) { int papi_errno = check_n_initialize(); if (papi_errno != PAPI_OK) { return papi_errno; } return vendord_evt_code_to_info(event_code, info); } int templ_set_domain(hwd_control_state_t *ctl __attribute__((unused)), int domain __attribute__((unused))) { return PAPI_OK; } int templ_ctl(hwd_context_t *ctx __attribute__((unused)), int code __attribute__((unused)), _papi_int_option_t *option __attribute__((unused))) { return PAPI_OK; } int check_n_initialize(void) { if (!_template_vector.cmp_info.initialized) { return templ_init_private(); } return _template_vector.cmp_info.disabled; } papi-papi-7-2-0-t/src/components/template/tests/000077500000000000000000000000001502707512200215715ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/template/tests/Makefile000066400000000000000000000005061502707512200232320ustar00rootroot00000000000000NAME=template include ../../Makefile_comp_tests.target CFLAGS = $(OPTFLAGS) CPPFLAGS += $(INCLUDE) LDFLAGS += $(PAPILIB) $(TESTLIB) $(UTILOBJS) TESTS = simple template_tests: $(TESTS) %.o: %.c $(CC) $(CPPFLAGS) $(CFLAGS) $(OPTFLAGS) -c -o $@ $< simple: simple.o $(CC) -o $@ $^ $(LDFLAGS) clean: rm -f $(TESTS) *.o papi-papi-7-2-0-t/src/components/template/tests/simple.c000066400000000000000000000053511502707512200232320ustar00rootroot00000000000000#include #include #include int quiet; int main(int argc, char *argv[]) { int papi_errno; quiet = tests_quiet(argc, argv); papi_errno = PAPI_library_init(PAPI_VER_CURRENT); if (papi_errno != PAPI_VER_CURRENT) { test_fail(__FILE__, __LINE__, "PAPI_library_init", papi_errno); } #define NUM_EVENTS (4) const char *events[NUM_EVENTS] = { "templ:::TEMPLATE_ZERO:device=0", 
"templ:::TEMPLATE_CONSTANT:device=1", "templ:::TEMPLATE_FUNCTION:device=2:function=exp", "templ:::TEMPLATE_FUNCTION:device=3:function=sum", }; int eventset = PAPI_NULL; papi_errno = PAPI_create_eventset(&eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset", papi_errno); } for (int i = 0; i < NUM_EVENTS; ++i) { papi_errno = PAPI_add_named_event(eventset, events[i]); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_add_named_event", papi_errno); } } long long counters[NUM_EVENTS] = { 0 }; papi_errno = PAPI_start(eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start", papi_errno); } papi_errno = PAPI_read(eventset, counters); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_read", papi_errno); } for (int i = 0; i < NUM_EVENTS && !quiet; ++i) { fprintf(stdout, "%s: %lli\n", events[i], counters[i]); } papi_errno = PAPI_read(eventset, counters); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_read", papi_errno); } for (int i = 0; i < NUM_EVENTS && !quiet; ++i) { fprintf(stdout, "%s: %lli\n", events[i], counters[i]); } papi_errno = PAPI_read(eventset, counters); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_read", papi_errno); } for (int i = 0; i < NUM_EVENTS && !quiet; ++i) { fprintf(stdout, "%s: %lli\n", events[i], counters[i]); } papi_errno = PAPI_stop(eventset, counters); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_read", papi_errno); } for (int i = 0; i < NUM_EVENTS && !quiet; ++i) { fprintf(stdout, "%s: %lli\n", events[i], counters[i]); } papi_errno = PAPI_cleanup_eventset(eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset", papi_errno); } papi_errno = PAPI_destroy_eventset(&eventset); if (papi_errno != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset", papi_errno); } PAPI_shutdown(); test_pass(__FILE__); return 0; } 
papi-papi-7-2-0-t/src/components/template/vendor_common.c000066400000000000000000000030771502707512200234470ustar00rootroot00000000000000#include "vendor_common.h" char error_string[PAPI_MAX_STR_LEN]; unsigned int _templ_lock; device_table_t device_table; device_table_t *device_table_p; static int load_common_symbols(void); static int unload_common_symbols(void); static int initialize_device_table(void); static int finalize_device_table(void); int vendorc_init(void) { int papi_errno; papi_errno = load_common_symbols(); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = initialize_device_table(); if (papi_errno != PAPI_OK) { goto fn_fail; } device_table_p = &device_table; fn_exit: return papi_errno; fn_fail: unload_common_symbols(); goto fn_exit; } int vendorc_shutdown(void) { finalize_device_table(); device_table_p = NULL; unload_common_symbols(); return PAPI_OK; } int vendorc_err_get_last(const char **error) { *error = error_string; return PAPI_OK; } int load_common_symbols(void) { return PAPI_OK; } int unload_common_symbols(void) { return PAPI_OK; } int initialize_device_table(void) { #define MAX_DEVICE_COUNT (8) device_t *devices = papi_calloc(MAX_DEVICE_COUNT, sizeof(device_t)); if (NULL == devices) { return PAPI_ENOMEM; } int i; for (i = 0; i < MAX_DEVICE_COUNT; ++i) { devices[i].id = (unsigned int) i; } device_table.devices = devices; device_table.num_devices = MAX_DEVICE_COUNT; return PAPI_OK; } int finalize_device_table(void) { papi_free(device_table_p->devices); device_table_p->num_devices = 0; return PAPI_OK; } papi-papi-7-2-0-t/src/components/template/vendor_common.h000066400000000000000000000012421502707512200234440ustar00rootroot00000000000000/* * This file contains all the functions that are shared between * different vendor versions of the profiling library. This can * include vendor runtime functionalities. 
*/ #ifndef __VENDOR_COMMON_H__ #define __VENDOR_COMMON_H__ #include #include "papi.h" #include "papi_memory.h" #include "papi_internal.h" #include "vendor_config.h" typedef struct { unsigned int id; } device_t; typedef struct { device_t *devices; int num_devices; } device_table_t; extern char error_string[PAPI_MAX_STR_LEN]; extern device_table_t *device_table_p; int vendorc_init(void); int vendorc_shutdown(void); int vendorc_err_get_last(const char **error); #endif papi-papi-7-2-0-t/src/components/template/vendor_config.h000066400000000000000000000002661502707512200234260ustar00rootroot00000000000000#ifndef __VENDOR_CONFIG_H__ #define __VENDOR_CONFIG_H__ #define TEMPL_CTX_OPENED (0x1) #define TEMPL_CTX_RUNNING (0x2) #include "papi.h" extern unsigned int _templ_lock; #endif papi-papi-7-2-0-t/src/components/template/vendor_dispatch.c000066400000000000000000000034511502707512200237520ustar00rootroot00000000000000#include "vendor_dispatch.h" #include "vendor_common.h" #include "vendor_profiler_v1.h" int vendord_init_pre(void) { return vendorp1_init_pre(); } int vendord_init(void) { int papi_errno = vendorc_init(); if (papi_errno != PAPI_OK) { return papi_errno; } return vendorp1_init(); } int vendord_shutdown(void) { int papi_errno = vendorp1_shutdown(); if (papi_errno != PAPI_OK) { return papi_errno; } return vendorc_shutdown(); } int vendord_ctx_open(unsigned int *events_id, int num_events, vendord_ctx_t *ctx) { return vendorp1_ctx_open(events_id, num_events, ctx); } int vendord_ctx_start(vendord_ctx_t ctx) { return vendorp1_ctx_start(ctx); } int vendord_ctx_read(vendord_ctx_t ctx, long long **counters) { return vendorp1_ctx_read(ctx, counters); } int vendord_ctx_stop(vendord_ctx_t ctx) { return vendorp1_ctx_stop(ctx); } int vendord_ctx_reset(vendord_ctx_t ctx) { return vendorp1_ctx_reset(ctx); } int vendord_ctx_close(vendord_ctx_t ctx) { return vendorp1_ctx_close(ctx); } int vendord_err_get_last(const char **error) { return vendorc_err_get_last(error); } int 
vendord_evt_enum(unsigned int *event_code, int modifier) { return vendorp1_evt_enum(event_code, modifier); } int vendord_evt_code_to_name(unsigned int event_code, char *name, int len) { return vendorp1_evt_code_to_name(event_code, name, len); } int vendord_evt_code_to_descr(unsigned int event_code, char *descr, int len) { return vendorp1_evt_code_to_descr(event_code, descr, len); } int vendord_evt_code_to_info(unsigned int event_code, PAPI_event_info_t *info) { return vendorp1_evt_code_to_info(event_code, info); } int vendord_evt_name_to_code(const char *name, unsigned int *event_code) { return vendorp1_evt_name_to_code(name, event_code); } papi-papi-7-2-0-t/src/components/template/vendor_dispatch.h000066400000000000000000000016711502707512200237610ustar00rootroot00000000000000#ifndef __VENDOR_DISPATCH_H__ #define __VENDOR_DISPATCH_H__ #include "vendor_config.h" typedef struct vendord_ctx *vendord_ctx_t; int vendord_init_pre(void); int vendord_init(void); int vendord_shutdown(void); int vendord_ctx_open(unsigned int *events_id, int num_events, vendord_ctx_t *ctx); int vendord_ctx_start(vendord_ctx_t ctx); int vendord_ctx_read(vendord_ctx_t ctx, long long **counters); int vendord_ctx_stop(vendord_ctx_t ctx); int vendord_ctx_reset(vendord_ctx_t ctx); int vendord_ctx_close(vendord_ctx_t ctx); int vendord_err_get_last(const char **error); int vendord_evt_enum(unsigned int *event_code, int modifier); int vendord_evt_code_to_name(unsigned int event_code, char *name, int len); int vendord_evt_code_to_descr(unsigned int event_code, char *descr, int len); int vendord_evt_code_to_info(unsigned int event_code, PAPI_event_info_t *info); int vendord_evt_name_to_code(const char *name, unsigned int *event_code); #endif papi-papi-7-2-0-t/src/components/template/vendor_profiler_v1.c000066400000000000000000000404471502707512200244110ustar00rootroot00000000000000/* include for rand() */ #include #include #include "vendor_common.h" #include "vendor_profiler_v1.h" /** * Event 
identifier encoding format: * +--------------------+-------+-+--+--+ * | unused | dev | | |id| * +--------------------+-------+-+--+--+ * * unused : 18 bits * device : 7 bits ([0 - 127] devices) * function : 1 bits (exponential or sum) * qlmask : 2 bits (qualifier mask) * nameid : 2 bits ([0 - 3] event names) */ #define EVENTS_WIDTH (sizeof(uint32_t) * 8) #define DEVICE_WIDTH (7) #define OPCODE_WIDTH (1) #define QLMASK_WIDTH (2) #define NAMEID_WIDTH (2) #define UNUSED_WIDTH (EVENTS_WIDTH - DEVICE_WIDTH - OPCODE_WIDTH - QLMASK_WIDTH - NAMEID_WIDTH) #define DEVICE_SHIFT (EVENTS_WIDTH - UNUSED_WIDTH - DEVICE_WIDTH) #define OPCODE_SHIFT (DEVICE_SHIFT - OPCODE_WIDTH) #define QLMASK_SHIFT (OPCODE_SHIFT - QLMASK_WIDTH) #define NAMEID_SHIFT (QLMASK_SHIFT - NAMEID_WIDTH) #define DEVICE_MASK ((0xFFFFFFFF >> (EVENTS_WIDTH - DEVICE_WIDTH)) << DEVICE_SHIFT) #define OPCODE_MASK ((0xFFFFFFFF >> (EVENTS_WIDTH - OPCODE_WIDTH)) << OPCODE_SHIFT) #define QLMASK_MASK ((0xFFFFFFFF >> (EVENTS_WIDTH - QLMASK_WIDTH)) << QLMASK_SHIFT) #define NAMEID_MASK ((0xFFFFFFFF >> (EVENTS_WIDTH - NAMEID_WIDTH)) << NAMEID_SHIFT) #define DEVICE_FLAG (0x2) #define OPCODE_FLAG (0x1) #define OPCODE_EXP (0x0) #define OPCODE_SUM (0x1) typedef struct { char name[PAPI_MAX_STR_LEN]; char descr[PAPI_2MAX_STR_LEN]; } ntv_event_t; typedef struct { ntv_event_t *events; int num_events; } ntv_event_table_t; struct vendord_ctx { int state; unsigned int *events_id; long long *counters; int num_events; }; static struct { char *name; char *descr; } vendor_events[] = { { "TEMPLATE_ZERO" , "This is a template counter, that always returns 0" }, { "TEMPLATE_CONSTANT", "This is a template counter, that always returns a constant value of 42" }, { "TEMPLATE_FUNCTION", "This is a template counter, that allows for different functions" }, { NULL, NULL } }; static ntv_event_table_t ntv_table; static ntv_event_table_t *ntv_table_p; int vendorp1_init_pre(void) { return PAPI_OK; } static int load_profiler_v1_symbols(void); static int 
unload_profiler_v1_symbols(void); static int initialize_event_table(void); static int finalize_event_table(void); typedef struct { int device; int opcode; int flags; int nameid; } event_info_t; static int evt_id_create(event_info_t *info, uint32_t *event_id); static int evt_id_to_info(uint32_t event_id, event_info_t *info); static int evt_name_to_device(const char *name, int *device); static int evt_name_to_opcode(const char *name, int *opcode); static int evt_name_to_basename(const char *name, char *base, int len); int vendorp1_init(void) { int papi_errno; papi_errno = load_profiler_v1_symbols(); if (papi_errno != PAPI_OK) { return papi_errno; } papi_errno = initialize_event_table(); if (papi_errno != PAPI_OK) { goto fn_fail; } ntv_table_p = &ntv_table; fn_exit: return papi_errno; fn_fail: finalize_event_table(); unload_profiler_v1_symbols(); goto fn_exit; } int vendorp1_shutdown(void) { finalize_event_table(); ntv_table_p = NULL; unload_profiler_v1_symbols(); return PAPI_OK; } static int init_ctx(unsigned int *events_id, int num_events, vendorp_ctx_t ctx); static int open_ctx(vendorp_ctx_t ctx); static int close_ctx(vendorp_ctx_t ctx); static int finalize_ctx(vendorp_ctx_t ctx); int vendorp1_ctx_open(unsigned int *events_id, int num_events, vendorp_ctx_t *ctx) { int papi_errno; *ctx = papi_calloc(1, sizeof(struct vendord_ctx)); if (NULL == *ctx) { return PAPI_ENOMEM; } _papi_hwi_lock(_templ_lock); papi_errno = init_ctx(events_id, num_events, *ctx); if (papi_errno != PAPI_OK) { goto fn_fail; } papi_errno = open_ctx(*ctx); if (papi_errno != PAPI_OK) { goto fn_fail; } (*ctx)->state |= TEMPL_CTX_OPENED; fn_exit: _papi_hwi_unlock(_templ_lock); return papi_errno; fn_fail: close_ctx(*ctx); finalize_ctx(*ctx); goto fn_exit; } int vendorp1_ctx_start(vendorp_ctx_t ctx) { ctx->state |= TEMPL_CTX_RUNNING; return PAPI_OK; } int vendorp1_ctx_read(vendorp_ctx_t ctx, long long **counters) { int papi_errno; int i; for (i = 0; i < ctx->num_events; ++i) { event_info_t info; 
papi_errno = evt_id_to_info(ctx->events_id[i], &info); if (papi_errno != PAPI_OK) { return papi_errno; } if (0 == strcmp(ntv_table_p->events[info.nameid].name, "TEMPLATE_ZERO")) { ctx->counters[i] = (long long) 0; } else if (0 == strcmp(ntv_table_p->events[info.nameid].name, "TEMPLATE_CONSTANT")) { ctx->counters[i] = (long long) 42; } else if (0 == strcmp(ntv_table_p->events[info.nameid].name, "TEMPLATE_FUNCTION")) { if (info.opcode == OPCODE_EXP) { ctx->counters[i] = (ctx->counters[i]) ? ctx->counters[i] * 2 : 2; } else { ctx->counters[i] = (ctx->counters[i]) ? ctx->counters[i] + 1 : 1; } } } *counters = ctx->counters; return PAPI_OK; } int vendorp1_ctx_stop(vendorp_ctx_t ctx) { ctx->state &= ~TEMPL_CTX_RUNNING; return PAPI_OK; } int vendorp1_ctx_reset(vendorp_ctx_t ctx) { memset(ctx->counters, 0, sizeof(*ctx->counters) * ctx->num_events); return PAPI_OK; } int vendorp1_ctx_close(vendorp_ctx_t ctx) { int papi_errno; _papi_hwi_lock(_templ_lock); papi_errno = close_ctx(ctx); if (papi_errno != PAPI_OK) { goto fn_fail; } ctx->state &= ~TEMPL_CTX_OPENED; papi_errno = finalize_ctx(ctx); if (papi_errno != PAPI_OK) { goto fn_fail; } papi_free(ctx); fn_exit: _papi_hwi_unlock(_templ_lock); return papi_errno; fn_fail: goto fn_exit; } int vendorp1_evt_enum(unsigned int *event_code, int modifier) { int papi_errno; event_info_t info; papi_errno = evt_id_to_info(*event_code, &info); if (papi_errno != PAPI_OK) { return papi_errno; } switch(modifier) { case PAPI_ENUM_FIRST: if (ntv_table_p->num_events == 0) { papi_errno = PAPI_ENOEVNT; break; } info.device = 0; info.opcode = 0; info.flags = 0; info.nameid = 0; papi_errno = evt_id_create(&info, event_code); break; case PAPI_ENUM_EVENTS: if (info.nameid + 1 >= ntv_table_p->num_events) { papi_errno = PAPI_ENOEVNT; break; } ++info.nameid; papi_errno = evt_id_create(&info, event_code); break; case PAPI_NTV_ENUM_UMASKS: if (info.flags == 0) { info.flags = DEVICE_FLAG; papi_errno = evt_id_create(&info, event_code); break; } if (info.flags & 
DEVICE_FLAG && info.nameid == 2) { info.flags = OPCODE_FLAG; papi_errno = evt_id_create(&info, event_code); break; } papi_errno = PAPI_ENOEVNT; break; default: papi_errno = PAPI_EINVAL; } return papi_errno; } int vendorp1_evt_code_to_name(unsigned int event_code, char *name, int len) { int papi_errno; event_info_t info; papi_errno = evt_id_to_info(event_code, &info); if (papi_errno != PAPI_OK) { return papi_errno; } switch (info.flags) { case DEVICE_FLAG | OPCODE_FLAG: snprintf(name, len, "%s:device=%i:function=%s", ntv_table_p->events[info.nameid].name, info.device, (info.opcode == OPCODE_EXP) ? "exp" : "sum"); break; case DEVICE_FLAG: snprintf(name, len, "%s:device=%i", ntv_table_p->events[info.nameid].name, info.device); break; case OPCODE_FLAG: snprintf(name, len, "%s:function=%s", ntv_table_p->events[info.nameid].name, (info.opcode == OPCODE_EXP) ? "exp" : "sum"); break; case 0: /* unqualified event: plain base name */ snprintf(name, len, "%s", ntv_table_p->events[info.nameid].name); break; default: papi_errno = PAPI_ENOEVNT; } return papi_errno; } int vendorp1_evt_code_to_descr(unsigned int event_code, char *descr, int len) { int papi_errno; event_info_t info; papi_errno = evt_id_to_info(event_code, &info); if (papi_errno != PAPI_OK) { return papi_errno; } snprintf(descr, len, "%s", ntv_table_p->events[info.nameid].descr); return PAPI_OK; } int vendorp1_evt_code_to_info(unsigned int event_code, PAPI_event_info_t *info) { int papi_errno; event_info_t code_info; papi_errno = evt_id_to_info(event_code, &code_info); if (papi_errno != PAPI_OK) { return papi_errno; } switch (code_info.flags) { case 0: sprintf(info->symbol, "%s", ntv_table_p->events[code_info.nameid].name); sprintf(info->long_descr, "%s", ntv_table_p->events[code_info.nameid].descr); break; case DEVICE_FLAG | OPCODE_FLAG: sprintf(info->symbol, "%s:device=%i:function=%s", ntv_table_p->events[code_info.nameid].name, code_info.device, (code_info.opcode == OPCODE_EXP) ? 
"exp" : "sum"); sprintf(info->long_descr, "%s", ntv_table_p->events[code_info.nameid].descr); break; case DEVICE_FLAG: { int i; char devices[PAPI_MAX_STR_LEN] = { 0 }; for (i = 0; i < device_table_p->num_devices; ++i) { sprintf(devices + strlen(devices), "%i,", i); } *(devices + strlen(devices) - 1) = 0; sprintf(info->symbol, "%s:device=%i", ntv_table_p->events[code_info.nameid].name, code_info.device); sprintf(info->long_descr, "%s masks:Device qualifier [%s]", ntv_table_p->events[code_info.nameid].descr, devices); break; } case OPCODE_FLAG: sprintf(info->symbol, "%s:function=%s", ntv_table_p->events[code_info.nameid].name, (code_info.opcode == OPCODE_EXP) ? "exp" : "sum"); sprintf(info->long_descr, "%s masks:Mandatory function qualifier (exp,sum)", ntv_table_p->events[code_info.nameid].descr); break; default: papi_errno = PAPI_EINVAL; } return papi_errno; } int vendorp1_evt_name_to_code(const char *name, unsigned int *event_code) { int papi_errno; char basename[PAPI_MAX_STR_LEN] = { 0 }; papi_errno = evt_name_to_basename(name, basename, PAPI_MAX_STR_LEN); if (papi_errno != PAPI_OK) { return papi_errno; } int device; papi_errno = evt_name_to_device(name, &device); if (papi_errno != PAPI_OK) { return papi_errno; } int opcode = 0; papi_errno = evt_name_to_opcode(name, &opcode); if (papi_errno != PAPI_OK) { return papi_errno; } int i, nameid = 0; for (i = 0; i < ntv_table_p->num_events; ++i) { if (0 == strcmp(ntv_table_p->events[i].name, basename)) { nameid = i; break; } } event_info_t info = { 0, 0, 0, 0 }; if (0 == strcmp(ntv_table_p->events[nameid].name, "TEMPLATE_FUNCTION")) { info.device = device; info.opcode = opcode; info.flags = (DEVICE_FLAG | OPCODE_FLAG); info.nameid = nameid; papi_errno = evt_id_create(&info, event_code); if (papi_errno != PAPI_OK) { return papi_errno; } } else { info.device = device; info.opcode = 0; info.flags = DEVICE_FLAG; info.nameid = nameid; papi_errno = evt_id_create(&info, event_code); if (papi_errno != PAPI_OK) { return 
papi_errno; } } papi_errno = evt_id_to_info(*event_code, &info); return papi_errno; } int load_profiler_v1_symbols(void) { return PAPI_OK; } int unload_profiler_v1_symbols(void) { return PAPI_OK; } static int get_events_count(int *num_events); static int get_events(ntv_event_t *events, int num_events); int initialize_event_table(void) { int papi_errno, num_events; papi_errno = get_events_count(&num_events); if (papi_errno != PAPI_OK) { return papi_errno; } ntv_event_t *events = papi_calloc(num_events, sizeof(ntv_event_t)); if (NULL == events) { return PAPI_ENOMEM; } papi_errno = get_events(events, num_events); if (papi_errno != PAPI_OK) { goto fn_fail; } ntv_table.events = events; ntv_table.num_events = num_events; fn_exit: return papi_errno; fn_fail: papi_free(events); goto fn_exit; } int finalize_event_table(void) { papi_free(ntv_table_p->events); ntv_table_p->num_events = 0; ntv_table_p = NULL; return PAPI_OK; } int init_ctx(unsigned int *events_id, int num_events, vendorp_ctx_t ctx) { ctx->events_id = events_id; ctx->num_events = num_events; ctx->counters = papi_calloc(num_events, sizeof(long long)); if (NULL == ctx->counters) { return PAPI_ENOMEM; } return PAPI_OK; } int open_ctx(vendorp_ctx_t ctx __attribute__((unused))) { return PAPI_OK; } int close_ctx(vendorp_ctx_t ctx __attribute__((unused))) { return PAPI_OK; } int finalize_ctx(vendorp_ctx_t ctx) { ctx->events_id = NULL; ctx->num_events = 0; papi_free(ctx->counters); return PAPI_OK; } int get_events_count(int *num_events) { int i = 0; while (vendor_events[i++].name != NULL); *num_events = i - 1; return PAPI_OK; } int get_events(ntv_event_t *events, int num_events) { int i = 0; while (vendor_events[i].name != NULL) { snprintf(events[i].name, PAPI_MAX_STR_LEN, "%s", vendor_events[i].name); snprintf(events[i].descr, PAPI_2MAX_STR_LEN, "%s", vendor_events[i].descr); ++i; } return (num_events - i) ? 
PAPI_EMISC : PAPI_OK; } int evt_id_create(event_info_t *info, uint32_t *event_id) { *event_id = (uint32_t)(info->device << DEVICE_SHIFT); *event_id |= (uint32_t)(info->opcode << OPCODE_SHIFT); *event_id |= (uint32_t)(info->flags << QLMASK_SHIFT); *event_id |= (uint32_t)(info->nameid << NAMEID_SHIFT); return PAPI_OK; } int evt_id_to_info(uint32_t event_id, event_info_t *info) { info->device = (int)((event_id & DEVICE_MASK) >> DEVICE_SHIFT); info->opcode = (int)((event_id & OPCODE_MASK) >> OPCODE_SHIFT); info->flags = (int)((event_id & QLMASK_MASK) >> QLMASK_SHIFT); info->nameid = (int)((event_id & NAMEID_MASK) >> NAMEID_SHIFT); if (info->device >= device_table_p->num_devices) { return PAPI_ENOEVNT; } if (info->nameid >= ntv_table_p->num_events) { return PAPI_ENOEVNT; } if (0 == strcmp(ntv_table_p->events[info->nameid].name, "TEMPLATE_FUNCTION") && 0 == info->flags) { return PAPI_ENOEVNT; } return PAPI_OK; } int evt_name_to_device(const char *name, int *device) { *device = 0; char *p = strstr(name, ":device="); if (p) { *device = (int) strtol(p + strlen(":device="), NULL, 10); } return PAPI_OK; } int evt_name_to_opcode(const char *name, int *opcode) { char basename[PAPI_MAX_STR_LEN] = { 0 }; evt_name_to_basename(name, basename, PAPI_MAX_STR_LEN); if (0 == strcmp(basename, "TEMPLATE_FUNCTION")) { char *p = strstr(name, ":function="); if (p) { if (strncmp(p + strlen(":function="), "exp", strlen("exp")) == 0) { *opcode = OPCODE_EXP; } else if (strncmp(p + strlen(":function="), "sum", strlen("sum")) == 0) { *opcode = OPCODE_SUM; } else { return PAPI_ENOEVNT; } } else { return PAPI_ENOEVNT; } } return PAPI_OK; } int evt_name_to_basename(const char *name, char *base, int len) { char *p = strstr(name, ":"); if (p) { if (len < (int)(p - name)) { return PAPI_EBUF; } strncpy(base, name, (size_t)(p - name)); } else { if (len < (int) strlen(name)) { return PAPI_EBUF; } strncpy(base, name, (size_t) len); } return PAPI_OK; } 
papi-papi-7-2-0-t/src/components/template/vendor_profiler_v1.h000066400000000000000000000016021502707512200244040ustar00rootroot00000000000000#ifndef __VENDOR_PROFILER_V1_H__ #define __VENDOR_PROFILER_V1_H__ typedef struct vendord_ctx *vendorp_ctx_t; int vendorp1_init_pre(void); int vendorp1_init(void); int vendorp1_shutdown(void); int vendorp1_ctx_open(unsigned int *events_id, int num_events, vendorp_ctx_t *ctx); int vendorp1_ctx_start(vendorp_ctx_t ctx); int vendorp1_ctx_read(vendorp_ctx_t ctx, long long **counters); int vendorp1_ctx_stop(vendorp_ctx_t ctx); int vendorp1_ctx_reset(vendorp_ctx_t ctx); int vendorp1_ctx_close(vendorp_ctx_t ctx); int vendorp1_evt_enum(unsigned int *event_code, int modifier); int vendorp1_evt_code_to_name(unsigned int event_code, char *name, int len); int vendorp1_evt_code_to_descr(unsigned int event_code, char *descr, int len); int vendorp1_evt_code_to_info(unsigned int event_code, PAPI_event_info_t *info); int vendorp1_evt_name_to_code(const char *name, unsigned int *event_code); #endif papi-papi-7-2-0-t/src/components/topdown/000077500000000000000000000000001502707512200203065ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/topdown/README.md000066400000000000000000000032531502707512200215700ustar00rootroot00000000000000# TOPDOWN Component The `topdown` component enables accessing the `PERF_METRICS` Model Specific Register (MSR) of modern Intel PMUs, and makes it simple to properly interpret the results. * [Enabling the TOPDOWN Component](#enabling-the-topdown-component) * [Adding More Architectures](#adding_more_architectures) ## Enabling the TOPDOWN Component To enable reading of topdown metrics the user needs to link against a PAPI library that was configured with the topdown component enabled. As an example the following command: `./configure --with-components="topdown"` is sufficient to enable the component. 
## Interpreting Results

The events added by this component ending in `_PERC` should be cast to double values in order to be properly interpreted as percentages. An example of how to do so follows:

    PAPI_start(EventSet);
    /* some block of code... */
    PAPI_stop(EventSet, values);
    printf("First metric was %.1f\n", *((double *)(&values[0])));

## Adding More Architectures

To contribute more supported architectures to the component, add the cpuid model of the architecture to the switch statement in `_topdown_init_component` of [topdown.c](./topdown.c) and set the relevant options (`supports_l2`, `required_core_type`, etc.).

## Warning on Heterogeneous CPU Affinity

As of 2024-12-11, all of Intel's hybrid CPU architectures support the PERF_METRICS MSR only on their 'performance' cores (p-cores). This means that to measure topdown events on a heterogeneous processor, one must limit the process affinity to p-cores using a program such as `taskset` or `numactl`. Otherwise, PAPI will exit to avoid encountering a segmentation fault.
papi-papi-7-2-0-t/src/components/topdown/Rules.topdown000066400000000000000000000003771502707512200230230ustar00rootroot00000000000000COMPSRCS += components/topdown/topdown.c COMPOBJS += topdown.o LDFLAGS+=-ldl topdown.o: components/topdown/topdown.c components/topdown/topdown.h $(HEADERS) $(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/topdown/topdown.c -o topdown.o $(LDFLAGS) papi-papi-7-2-0-t/src/components/topdown/tests/000077500000000000000000000000001502707512200214505ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/topdown/tests/Makefile000066400000000000000000000012031502707512200231040ustar00rootroot00000000000000NAME=topdown include ../../Makefile_comp_tests.target %.o:%.c $(CC) $(CFLAGS) $(OPTFLAGS) $(INCLUDE) -c -o $@ $< TESTS = topdown_basic topdown_L1 topdown_L2 topdown_tests: $(TESTS) topdown_basic: topdown_basic.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o topdown_basic topdown_basic.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) topdown_L1: topdown_L1.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o topdown_L1 topdown_L1.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) topdown_L2: topdown_L2.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o topdown_L2 topdown_L2.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) clean: rm -f $(TESTS) *.o papi-papi-7-2-0-t/src/components/topdown/tests/topdown_L1.c000066400000000000000000000112421502707512200236420ustar00rootroot00000000000000/* * Specifically tests that the Level 1 topdown events make sense. 
*/ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include "papi.h" #include "papi_test.h" #define NUM_EVENTS 4 #define PERC_TOLERANCE 1.5 // fibonacci function to serve as a benchable code section void __attribute__((optimize("O0"))) fib(int n) { long i, a = 0; int b = 1; for (i = 0; i < n; i++) { b = b + a; a = b - a; } } int main(int argc, char **argv) { int i, quiet, retval; int EventSet = PAPI_NULL; const PAPI_component_info_t *cmpinfo = NULL; int numcmp, cid, topdown_cid = -1; long long values[NUM_EVENTS]; double tmp; /* Set TESTS_QUIET variable */ quiet = tests_quiet(argc, argv); /* PAPI Initialization */ retval = PAPI_library_init(PAPI_VER_CURRENT); if (retval != PAPI_VER_CURRENT) { test_fail(__FILE__, __LINE__, "PAPI_library_init failed\n", retval); } if (!quiet) { printf("Testing topdown component with PAPI %d.%d.%d\n", PAPI_VERSION_MAJOR(PAPI_VERSION), PAPI_VERSION_MINOR(PAPI_VERSION), PAPI_VERSION_REVISION(PAPI_VERSION)); } /*******************************/ /* Find the topdown component */ /*******************************/ numcmp = PAPI_num_components(); for (cid = 0; cid < numcmp; cid++) { if ((cmpinfo = PAPI_get_component_info(cid)) == NULL) { test_fail(__FILE__, __LINE__, "PAPI_get_component_info failed\n", 0); } if (!quiet) { printf("\tComponent %d - %d events - %s\n", cid, cmpinfo->num_native_events, cmpinfo->name); } if (strstr(cmpinfo->name, "topdown")) { topdown_cid = cid; /* check that the component is enabled */ if (cmpinfo->disabled) { printf("Topdown component is disabled: %s\n", cmpinfo->disabled_reason); test_fail(__FILE__, __LINE__, "Component is not enabled\n", 0); } } } if (topdown_cid < 0) { test_skip(__FILE__, __LINE__, "Topdown component not found\n", 0); } if (!quiet) { printf("\nFound Topdown Component at id %d\n", topdown_cid); printf("\nAdding the level 1 topdown metrics..\n"); } /* Create EventSet */ retval = PAPI_create_eventset(&EventSet); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset()", retval); } /* Add 
the level 1 topdown metrics */ retval = PAPI_add_named_event(EventSet, "TOPDOWN_RETIRING_PERC"); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "Error adding TOPDOWN_RETIRING_PERC", retval); } retval = PAPI_add_named_event(EventSet, "TOPDOWN_BAD_SPEC_PERC"); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "Error adding TOPDOWN_BAD_SPEC_PERC", retval); } retval = PAPI_add_named_event(EventSet, "TOPDOWN_FE_BOUND_PERC"); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "Error adding TOPDOWN_FE_BOUND_PERC", retval); } retval = PAPI_add_named_event(EventSet, "TOPDOWN_BE_BOUND_PERC"); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "Error adding TOPDOWN_BE_BOUND_PERC", retval); } /* stat a loop-based calculation of the sum of the fibonacci sequence */ /* the workload needs to be fairly large in order to acquire an accurate */ /* set of measurements */ PAPI_start(EventSet); fib(6000000); PAPI_stop(EventSet, values); /* run some sanity checks: */ /* first, the sum of all level 1 metric percentages should be 100% */ tmp = 0; for (i = 0; i < NUM_EVENTS; i++) { tmp += *((double *)(&values[i])); } if (tmp < 100 - PERC_TOLERANCE || tmp > 100 + PERC_TOLERANCE) { test_fail(__FILE__, __LINE__, "Level 1 topdown metric percentages did not sum to 100%%\n", 1); } if (!quiet) printf("\tRetiring:\t%.1f%%\n", *((double *)(&values[0]))); /* next, verify that the percentage of bad spec slots is reasonable. 
*/ /* for this benchmark, we can expect very low rate of bad speculation */ /* due to the fact that it consists of a simple for loop */ if (!quiet) printf("\tBad spec:\t%.1f%%\n", *((double *)(&values[1]))); if (*((double *)(&values[1])) > 5.0) { test_warn(__FILE__, __LINE__, "The percentage of slots affected by bad speculation was unexpectedly high", 1); } /* finally, make sure the frontend/backend bound percentages make sense */ /* we should expect this benchmark to be significantly more limited */ /* by the back end, so check that be bound is larger than the fe bound */ if (!quiet) { printf("\tFrontend bound:\t%.1f%%\n", *((double *)(&values[2]))); printf("\tBackend bound:\t%.1f%%\n", *((double *)(&values[3]))); } if (*((double *)(&values[2])) > *((double *)(&values[3]))) { test_warn(__FILE__, __LINE__, "Frontend bound should be significantly smaller than backend bound", 1); } return 0; }papi-papi-7-2-0-t/src/components/topdown/tests/topdown_L2.c000066400000000000000000000145241502707512200236510ustar00rootroot00000000000000/* * Specifically tests that the Level 2 topdown events make sense. 
*/ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include "papi.h" #include "papi_test.h" #define NUM_EVENTS 8 #define PERC_TOLERANCE 1.5 // fibonacci function to serve as a benchable code section void __attribute__((optimize("O0"))) fib(int n) { long i, a = 0; int b = 1; for (i = 0; i < n; i++) { b = b + a; a = b - a; } } int main(int argc, char **argv) { int i, quiet, retval; int EventSet = PAPI_NULL; const PAPI_component_info_t *cmpinfo = NULL; int numcmp, cid, topdown_cid = -1; long long values[NUM_EVENTS]; double tmp; /* Set TESTS_QUIET variable */ quiet = tests_quiet(argc, argv); /* PAPI Initialization */ retval = PAPI_library_init(PAPI_VER_CURRENT); if (retval != PAPI_VER_CURRENT) { test_fail(__FILE__, __LINE__, "PAPI_library_init failed\n", retval); } if (!quiet) { printf("Testing topdown component with PAPI %d.%d.%d\n", PAPI_VERSION_MAJOR(PAPI_VERSION), PAPI_VERSION_MINOR(PAPI_VERSION), PAPI_VERSION_REVISION(PAPI_VERSION)); } /*******************************/ /* Find the topdown component */ /*******************************/ numcmp = PAPI_num_components(); for (cid = 0; cid < numcmp; cid++) { if ((cmpinfo = PAPI_get_component_info(cid)) == NULL) { test_fail(__FILE__, __LINE__, "PAPI_get_component_info failed\n", 0); } if (!quiet) { printf("\tComponent %d - %d events - %s\n", cid, cmpinfo->num_native_events, cmpinfo->name); } if (strstr(cmpinfo->name, "topdown")) { topdown_cid = cid; /* check that the component is enabled */ if (cmpinfo->disabled) { printf("Topdown component is disabled: %s\n", cmpinfo->disabled_reason); test_fail(__FILE__, __LINE__, "The TOPDOWN component is not enabled\n", 0); } } } if (topdown_cid < 0) { test_skip(__FILE__, __LINE__, "Topdown component not found\n", 0); } if (!quiet) { printf("\nFound Topdown Component at id %d\n", topdown_cid); printf("\nAdding the level 2 topdown metrics..\n"); } /* Create EventSet */ retval = PAPI_create_eventset(&EventSet); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset()", 
retval); } /* Add the level 2 topdown metrics */ /* if we can't, just skip because not all processors support level 2 */ retval = PAPI_add_named_event(EventSet, "TOPDOWN_HEAVY_OPS_PERC"); if (retval != PAPI_OK) { test_skip(__FILE__, __LINE__, "Error adding TOPDOWN_HEAVY_OPS_PERC", retval); } retval = PAPI_add_named_event(EventSet, "TOPDOWN_LIGHT_OPS_PERC"); if (retval != PAPI_OK) { /* if the first L2 event was successfully added though, */ /* subsequent failures indicate a deeper problem */ test_fail(__FILE__, __LINE__, "Error adding TOPDOWN_LIGHT_OPS_PERC", retval); } retval = PAPI_add_named_event(EventSet, "TOPDOWN_BR_MISPREDICT_PERC"); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "Error adding TOPDOWN_BR_MISPREDICT_PERC", retval); } retval = PAPI_add_named_event(EventSet, "TOPDOWN_MACHINE_CLEARS_PERC"); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "Error adding TOPDOWN_MACHINE_CLEARS_PERC", retval); } retval = PAPI_add_named_event(EventSet, "TOPDOWN_FETCH_LAT_PERC"); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "Error adding TOPDOWN_FETCH_LAT_PERC", retval); } retval = PAPI_add_named_event(EventSet, "TOPDOWN_FETCH_BAND_PERC"); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "Error adding TOPDOWN_FETCH_BAND_PERC", retval); } retval = PAPI_add_named_event(EventSet, "TOPDOWN_MEM_BOUND_PERC"); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "Error adding TOPDOWN_MEM_BOUND_PERC", retval); } retval = PAPI_add_named_event(EventSet, "TOPDOWN_CORE_BOUND_PERC"); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "Error adding TOPDOWN_CORE_BOUND_PERC", retval); } /* stat a loop-based calculation of the sum of the fibonacci sequence */ /* the workload needs to be fairly large in order to acquire an accurate */ /* set of measurements */ PAPI_start(EventSet); fib(6000000); PAPI_stop(EventSet, values); /* run some sanity checks: */ /* first, the sum of all level 2 metric percentages should be 100% */ tmp = 0; for (i = 0; i < NUM_EVENTS; i++) { tmp += *((double *)(&values[i])); } if (tmp < 100 - PERC_TOLERANCE || tmp > 100 + 
PERC_TOLERANCE) { test_fail(__FILE__, __LINE__, "Level 2 topdown metric percentages did not sum to 100%%\n", 1); } /* next, check that we are retiring more light ops than heavy ops */ /* this is a very reasonable expectation for a simple loop performing */ /* scalar add and multiply operations */ if (!quiet) { printf("\tHeavy ops:\t%.1f%%\n", *((double *)(&values[0]))); printf("\tLight ops:\t%.1f%%\n", *((double *)(&values[1]))); } if (*((double *)(&values[0])) > *((double *)(&values[1]))) { test_warn(__FILE__, __LINE__, "Heavy ops should be much smaller than light ops", 1); } /* next, check that the branch mispredictions and machine clears */ /* are insignificant as this benchmark should have good speculation */ if (!quiet) { printf("\tBranch mispredictions:\t%.1f%%\n", *((double *)(&values[2]))); printf("\tMachine clears:\t%.1f%%\n", *((double *)(&values[3]))); } if ((*((double *)(&values[2])) + *((double *)(&values[3]))) > 5.0) { test_warn(__FILE__, __LINE__, "Bad speculation should be insignificant for this workload", 1); } /* next, check that the fetch latency and bandwidth are insignificant */ if (!quiet) { printf("\tFetch latency:\t%.1f%%\n", *((double *)(&values[4]))); printf("\tFetch bandwidth:\t%.1f%%\n", *((double *)(&values[5]))); } if ((*((double *)(&values[4])) + *((double *)(&values[5]))) > 10.0) { test_warn(__FILE__, __LINE__, "Frontend bound should be insignificant for this workload", 1); } /* finally, check that core bound is greater than memory bound. 
*/ /* we can expect this because there are no memory loads/stores here */ if (!quiet) { printf("\tMemory bound:\t%.1f%%\n", *((double *)(&values[6]))); printf("\tCore bound:\t%.1f%%\n", *((double *)(&values[7]))); } if (*((double *)(&values[6])) > *((double *)(&values[7]))) { test_warn(__FILE__, __LINE__, "The workload should be significantly more core bound than memory bound", 1); } return 0; }papi-papi-7-2-0-t/src/components/topdown/tests/topdown_basic.c000066400000000000000000000100601502707512200244440ustar00rootroot00000000000000/* * Basic test that just adds all of the topdown events and make sure they dont * produce any errors. */ #include #include #include #include #include "papi.h" #include "papi_test.h" // fibonacci function to serve as a benchable code section void __attribute__((optimize("O0"))) fib(int n) { long i, a = 0; int b = 1; for (i = 0; i < n; i++) { b = b + a; a = b - a; } } int main(int argc, char **argv) { int i, quiet, retval; int EventSet = PAPI_NULL; const PAPI_component_info_t *cmpinfo = NULL; int numcmp, cid, topdown_cid = -1; int code, maximum_code = 0; char event_name[PAPI_MAX_STR_LEN]; PAPI_event_info_t event_info; int num_events = 0; long long *values; /* Set TESTS_QUIET variable */ quiet = tests_quiet(argc, argv); /* PAPI Initialization */ retval = PAPI_library_init(PAPI_VER_CURRENT); if (retval != PAPI_VER_CURRENT) { test_fail(__FILE__, __LINE__, "PAPI_library_init failed\n", retval); } if (!quiet) { printf("Testing topdown component with PAPI %d.%d.%d\n", PAPI_VERSION_MAJOR(PAPI_VERSION), PAPI_VERSION_MINOR(PAPI_VERSION), PAPI_VERSION_REVISION(PAPI_VERSION)); } /*******************************/ /* Find the topdown component */ /*******************************/ numcmp = PAPI_num_components(); for (cid = 0; cid < numcmp; cid++) { if ((cmpinfo = PAPI_get_component_info(cid)) == NULL) { test_fail(__FILE__, __LINE__, "PAPI_get_component_info failed\n", 0); } if (!quiet) { printf("\tComponent %d - %d events - %s\n", cid, 
cmpinfo->num_native_events, cmpinfo->name); } if (strstr(cmpinfo->name, "topdown")) { topdown_cid = cid; /* check that the component is enabled */ if (cmpinfo->disabled) { printf("Topdown component is disabled: %s\n", cmpinfo->disabled_reason); test_fail(__FILE__, __LINE__, "Component is not enabled\n", 0); } } } if (topdown_cid < 0) { test_skip(__FILE__, __LINE__, "Topdown component not found\n", 0); } if (!quiet) { printf("\nFound Topdown Component at id %d\n", topdown_cid); printf("\nListing all events in this component:\n"); } /* Create EventSet */ retval = PAPI_create_eventset(&EventSet); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset()", retval); } /*****************************************************/ /* Add all the events to an eventset as a basic test */ /*****************************************************/ code = PAPI_NATIVE_MASK; retval = PAPI_enum_cmp_event(&code, PAPI_ENUM_FIRST, topdown_cid); while (retval == PAPI_OK) { if (PAPI_event_code_to_name(code, event_name) != PAPI_OK) { printf("Error translating %#x\n", code); test_fail(__FILE__, __LINE__, "PAPI_event_code_to_name", retval); } if (PAPI_get_event_info(code, &event_info) != PAPI_OK) { printf("Error getting info for event %#x\n", code); test_fail(__FILE__, __LINE__, "PAPI_get_event_info()", retval); } retval = PAPI_add_event(EventSet, code); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_add_event()", retval); } if (!quiet) { printf("\tEvent %#x: %s -- %s\n", code, event_name, event_info.long_descr); } num_events += 1; maximum_code = code; retval = PAPI_enum_cmp_event(&code, PAPI_ENUM_EVENTS, topdown_cid); } if (!quiet) printf("\n"); /* ensure there is space for the output values */ values = calloc(num_events, sizeof(long long)); if (values == NULL) { test_fail(__FILE__, __LINE__, "Insufficient memory", retval); } /* now stat some code to make sure the events work */ PAPI_start(EventSet); fib(6000000); PAPI_stop(EventSet, values); if (!quiet) 
printf("Values:\n"); for (i = 0; i < num_events; i++) { /* ensure the metric percentages are between 0 and 100 */ if (*((double *)(&values[i])) < 0 || *((double *)(&values[i])) > 100.0) { test_fail(__FILE__, __LINE__, "Topdown metric was not a valid percentage", retval); } if (!quiet) printf("\t%d:\t%.1lf%%\n", i, *((double *)(&values[i]))); } return 0; }papi-papi-7-2-0-t/src/components/topdown/topdown.c000066400000000000000000000723441502707512200221560ustar00rootroot00000000000000#include #include #include #include #include #include #include #include #include #include #include #ifndef _GNU_SOURCE #define _GNU_SOURCE #endif #include /* Headers required by PAPI */ #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" /* defines papi_malloc(), etc. */ #include "topdown.h" // The following macro follows if a string function has an error. It should // never happen; but it is necessary to prevent compiler warnings. We print // something just in case there is programmer error in invoking the function. 
#define HANDLE_STRING_ERROR \ { \ fprintf(stderr, "%s:%i unexpected string function error.\n", __FILE__, __LINE__); \ exit(-1); \ } papi_vector_t _topdown_vector; static _topdown_native_event_entry_t *topdown_native_events = NULL; static int num_events = 0; static int librseq_loaded = 0; #define INTEL_CORE_TYPE_EFFICIENT 0x20 /* also known as 'ATOM' */ #define INTEL_CORE_TYPE_PERFORMANCE 0x40 /* also known as 'CORE' */ #define INTEL_CORE_TYPE_HOMOGENEOUS -1 /* core type is non-issue */ static int required_core_type = INTEL_CORE_TYPE_HOMOGENEOUS; /**************************/ /* x86 specific functions */ /**************************/ /* forward declarations */ void assert_affinity(int core_type); static inline __attribute__((always_inline)) unsigned long long rdpmc_rseq_protected(unsigned int counter, int allowed_core_type); /* rdpmc instruction wrapper */ static inline unsigned long long _rdpmc(unsigned int counter) { unsigned int low, high; /* if we need protection... */ if (required_core_type != INTEL_CORE_TYPE_HOMOGENEOUS) { /* if librseq is available, protect with librseq */ if (librseq_loaded) return rdpmc_rseq_protected(counter, required_core_type); /* otherwise, just hope we aren't moved to an unsupported core */ /* between assert_affinity() and the inline asm */ assert_affinity(required_core_type); } __asm__ volatile("rdpmc" : "=a" (low), "=d" (high) : "c" (counter)); return (unsigned long long)low | ((unsigned long long)high) <<32; } typedef struct { unsigned int eax; unsigned int ebx; unsigned int ecx; unsigned int edx; } cpuid_reg_t; void cpuid2( cpuid_reg_t *reg, unsigned int func, unsigned int subfunc ) { __asm__ ("cpuid;" : "=a" (reg->eax), "=b" (reg->ebx), "=c" (reg->ecx), "=d" (reg->edx) : "a" (func), "c" (subfunc)); } /**************************************/ /* Hybrid processor support functions */ /**************************************/ /* ensure the core this process is running on is of the correct type */ int active_core_type_is(int core_type) { 
cpuid_reg_t reg; /* check that CPUID leaf 0x1A is supported */ cpuid2(®, 0, 0); if (reg.eax < 0x1a) return PAPI_ENOSUPP; cpuid2(®, 0x1a, 0); if (reg.eax == 0) return PAPI_ENOSUPP; return ((reg.eax >> 24) & 0xff) == core_type; } /* helper to allow printing core type in errors */ void core_type_to_name(int core_type, char *out) { int err; switch (core_type) { case INTEL_CORE_TYPE_EFFICIENT: err = snprintf(out, PAPI_MIN_STR_LEN, "e-core (Atom)"); if (err > PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; break; case INTEL_CORE_TYPE_PERFORMANCE: err = snprintf(out, PAPI_MIN_STR_LEN, "p-core (Core)"); if (err > PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; break; default: err = snprintf(out, PAPI_MIN_STR_LEN, "not applicable (N/A)"); if (err > PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; break; } } /* exit if the core affinity is disallowed in order to avoid segfaulting */ void handle_affinity_error(int allowed_type) { char allowed_name[PAPI_MIN_STR_LEN]; core_type_to_name(allowed_type, allowed_name); fprintf(stderr, "Error: Process was moved to an unsupported core type. 
To use the PAPI topdown component, process affinity must be limited to cores of type '%s' on this architecture.\n", allowed_name); exit(127); } /* assert that the current process affinity is to an allowed core type */ void assert_affinity(int core_type) { /* ensure the process is still on a valid core to avoid segfaulting */ if (!active_core_type_is(core_type)) { handle_affinity_error(core_type); } } /**********************************************/ /* Restartable sequence heterogeneous support */ /**********************************************/ /* dlsym access to librseq symbols */ static ptrdiff_t *rseq_offset_ptr; static int (*rseq_available_ptr)(unsigned int query); /* local wrappers for dlsym function pointers */ static int librseq_rseq_available(unsigned int query) { return (*rseq_available_ptr)(query); } int link_librseq() { void* lib = dlopen("librseq.so", RTLD_NOW); if (!lib) { return PAPI_ENOSUPP; } rseq_available_ptr = dlsym(lib, "rseq_available"); if (rseq_available_ptr == NULL) { return PAPI_ENOSUPP; } rseq_offset_ptr = dlsym(lib, "rseq_offset"); if (rseq_offset_ptr == NULL) { return PAPI_ENOSUPP; } if (!rseq_available_ptr(0)) { return PAPI_ENOSUPP; } return 0; } /* This function assumes some properties of the system have been verified. */ /* 1. Must be an Intel x86 processor */ /* 2. Processor must be hybrid/heterogeneous (e-core/p-core) */ /* 3. 
perf_event_open() + mmap() have been used to enable userspace rdpmc */ static inline __attribute__((always_inline)) unsigned long long rdpmc_rseq_protected(unsigned int counter, int allowed_core_type) { unsigned int low = -1; unsigned int high = -1; int core_check; restart_sequence: core_check = 0; __asm__ __volatile__ goto ( /* set up critical section of restartable sequence */ ".pushsection __rseq_cs, \"aw\"\n\t" ".balign 32\n\t" "3:\n\t" ".long 0x0\n\t" ".long 0x0\n\t" ".quad 1f\n\t" ".quad (2f) - (1f)\n\t" ".quad 4f\n\t" ".long 0x0\n\t" ".long 0x0\n\t" ".quad 1f\n\t" ".quad (2f) - (1f)\n\t" ".quad 4f\n\t" ".popsection\n\t" ".pushsection __rseq_cs_ptr_array, \"aw\"\n\t" ".quad 3b\n\t" ".popsection\n\t" /* start rseq by storing table entry pointer into rseq_cs. */ "leaq 3b(%%rip), %%rax\n\t" "movq %%rax, %%fs:8(%[rseq_offset])\n\t" "1:\n\t" /* check if core type is valid */ "mov $0x1A, %%eax\n\t" "mov $0x00, %%ecx\n\t" "cpuid\n\t" "mov %%eax, %[core_check]\n\t" "test %[core_type], %%eax\n\t" "jz 4f\n\t" /* abort if core type is invalid */ /* make the rdpmc call */ "movl %[counter], %%ecx\n\t" "rdpmc\n\t" /* retrieve results of rdpmc */ "mov %%edx, %[high]\n\t" "mov %%eax, %[low]\n\t" "2:\n\t" /* define abort section */ ".pushsection __rseq_failure, \"ax\"\n\t" ".byte 0x0f, 0xb9, 0x3d\n\t" ".long " "0x53053053" "\n\t" "4" ":\n\t" "" "jmp %l[" "abort" "]\n\t" ".popsection\n\t" : : [core_check] "m" (core_check), [low] "m" (low), [high] "m" (high), [rseq_offset] "r" (*rseq_offset_ptr), [counter] "r" (counter), [core_type] "r" (allowed_core_type << 24) /* shift mask into place */ : "memory", "cc", "rax", "eax", "ecx", "edx" : abort ); return (unsigned long long)low | ((unsigned long long)high) << 32; abort: /* we may abort because the core type was found to be invalid, or */ /* we might abort because the restartable sequence was preempted */ /* therefore we have to check why the abort happened here */ if ((((core_check >> 24) & 0xff) != allowed_core_type) && 
core_check != 0) { /* sequence reached the core check, and the core type was disallowed !*/ handle_affinity_error(allowed_core_type); return PAPI_EBUG; /* should never return, handle_affinity_error exits */ } /* if the critical section aborted, but not because the core type is */ /* invalid, then give it another shot */ /* while theoretically possible, this has never been observed to restart */ /* more than once before either succeeding or failing the check */ goto restart_sequence; } /********************************/ /* Internal component functions */ /********************************/ /* In case headers aren't new enough to have __NR_perf_event_open */ #ifndef __NR_perf_event_open #define __NR_perf_event_open 298 /* __x86_64__ is the only arch we support */ #endif __attribute__((weak)) int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu, int group_fd, unsigned long flags) { return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags); } /* read PERF_METRICS */ static inline unsigned long long read_metrics(void) { return _rdpmc(TOPDOWN_PERF_METRICS | TOPDOWN_METRIC_COUNTER_TOPDOWN_L1_L2); } /* extract the metric defined by event i from the value */ float extract_metric(int i, unsigned long long val) { return (double)(((val) >> (i * 8)) & 0xff) / 0xff; } /***********************************************/ /* Required PAPI component interface functions */ /***********************************************/ static int _topdown_init_component(int cidx) { unsigned long long val; int err, i; int retval = PAPI_OK; int supports_l2; char *strCpy; char typeStr[PAPI_MIN_STR_LEN]; const PAPI_hw_info_t *hw_info; /* Check for processor support */ hw_info = &(_papi_hwi_system_info.hw_info); switch (hw_info->vendor) { case PAPI_VENDOR_INTEL: break; default: err = snprintf(_topdown_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "Not a supported CPU vendor"); _topdown_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN - 1] = 0; if (err > PAPI_MAX_STR_LEN) 
HANDLE_STRING_ERROR;
        retval = PAPI_ENOSUPP;
        goto fn_fail;
    }

    /* Ideally, we should check the IA32_PERF_CAPABILITIES MSR for */
    /* PERF_METRICS support. However, since doing this requires a */
    /* sysadmin to go through a lot of hassle, it may be better to */
    /* just hardcode supported platforms instead */
    if (hw_info->cpuid_family != 6) {
        /* Not a family 6 machine */
        strCpy = strncpy(_topdown_vector.cmp_info.disabled_reason,
                         "CPU family not supported", PAPI_MAX_STR_LEN);
        _topdown_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN - 1] = 0;
        if (strCpy == NULL)
            HANDLE_STRING_ERROR;
        retval = PAPI_ENOIMPL;
        goto fn_fail;
    }

    /* Detect topdown support */
    switch (hw_info->cpuid_model) {
    /* The model id can be found in Table 2-1 of the */
    /* IA-32 Architectures Software Developer’s Manual */

    /* homogeneous machines that do not support l2 TMA */
    case 0x6a: /* IceLake 3rd gen Xeon */
    case 0x6c: /* IceLake 3rd gen Xeon */
    case 0x7d: /* IceLake 10th gen Core */
    case 0x7e: /* IceLake 10th gen Core */
    case 0x8c: /* TigerLake 11th gen Core */
    case 0x8d: /* TigerLake 11th gen Core */
    case 0xa7: /* RocketLake 11th gen Core */
        required_core_type = INTEL_CORE_TYPE_HOMOGENEOUS;
        supports_l2 = 0;
        break;

    /* homogeneous machines that support l2 TMA */
    case 0x8f: /* SapphireRapids 4th gen Xeon */
    case 0xcf: /* EmeraldRapids 5th gen Xeon */
        required_core_type = INTEL_CORE_TYPE_HOMOGENEOUS;
        supports_l2 = 1;
        break;

    /* hybrid machines that support l2 TMA and are locked to the P-core */
    case 0xaa: /* MeteorLake Core Ultra 7 hybrid */
    case 0xad: /* GraniteRapids 6th gen Xeon P-core */
    case 0xae: /* GraniteRapids 6th gen Xeon P-core */
    case 0x97: /* AlderLake 12th gen Core hybrid */
    case 0x9a: /* AlderLake 12th gen Core hybrid */
    case 0xb7: /* RaptorLake-S/HX 13th gen Core hybrid */
    case 0xba: /* RaptorLake 13th gen Core hybrid */
    case 0xbd: /* LunarLake Series 2 Core Ultra hybrid */
    case 0xbf: /* RaptorLake 13th gen Core hybrid */
        required_core_type =
INTEL_CORE_TYPE_PERFORMANCE; supports_l2 = 1; /* if we are on a heterogeneous processor, try and load librseq */ if (link_librseq() == PAPI_OK) { librseq_loaded = 1; /* indicate in desc that librseq was found and is being used */ err = snprintf(_topdown_vector.cmp_info.description, PAPI_MAX_STR_LEN, TOPDOWN_COMPONENT_DESCRIPTION " (librseq in use)"); _topdown_vector.cmp_info.description[PAPI_MAX_STR_LEN - 1] = 0; } break; default: /* not a supported model */ strCpy = strncpy(_topdown_vector.cmp_info.disabled_reason, "CPU model not supported", PAPI_MAX_STR_LEN); _topdown_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN - 1] = 0; if (strCpy == NULL) HANDLE_STRING_ERROR; retval = PAPI_ENOIMPL; goto fn_fail; } } /* if there is a core type requirement for this platform, check it */ if (!active_core_type_is(required_core_type) && required_core_type != INTEL_CORE_TYPE_HOMOGENEOUS) { core_type_to_name(required_core_type, typeStr); err = snprintf(_topdown_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "The PERF_EVENT MSR does not exist on this core. 
Limit process affinity to cores of type '%s' only.", typeStr); _topdown_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN - 1] = 0; if (err > PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; retval = PAPI_ECMP; goto fn_fail; } /* allocate the events table */ topdown_native_events = (_topdown_native_event_entry_t *) papi_calloc(TOPDOWN_MAX_COUNTERS, sizeof(_topdown_native_event_entry_t)); if (topdown_native_events == NULL) { err = snprintf(_topdown_vector.cmp_info.disabled_reason, PAPI_MAX_STR_LEN, "%s:%i topdown_native_events papi_calloc for %lu bytes failed.", __FILE__, __LINE__, TOPDOWN_MAX_COUNTERS * sizeof(_topdown_native_event_entry_t)); _topdown_vector.cmp_info.disabled_reason[PAPI_MAX_STR_LEN - 1] = 0; if (err > PAPI_MAX_STR_LEN) HANDLE_STRING_ERROR; retval = PAPI_ENOMEM; goto fn_fail; } /* fill out the events table */ i = 0; /* level 1 events */ strcpy(topdown_native_events[i].name, "TOPDOWN_RETIRING_PERC"); strcpy(topdown_native_events[i].description, "The percentage of pipeline slots that were retiring instructions"); strcpy(topdown_native_events[i].units, "%"); topdown_native_events[i].return_type = PAPI_DATATYPE_FP64; topdown_native_events[i].metric_idx = TOPDOWN_METRIC_IDX_RETIRING; topdown_native_events[i].selector = i + 1; i++; strcpy(topdown_native_events[i].name, "TOPDOWN_BAD_SPEC_PERC"); strcpy(topdown_native_events[i].description, "The percentage of pipeline slots that were stalled due to bad speculation"); strcpy(topdown_native_events[i].units, "%"); topdown_native_events[i].return_type = PAPI_DATATYPE_FP64; topdown_native_events[i].metric_idx = TOPDOWN_METRIC_IDX_BAD_SPEC; topdown_native_events[i].selector = i + 1; i++; strcpy(topdown_native_events[i].name, "TOPDOWN_FE_BOUND_PERC"); strcpy(topdown_native_events[i].description, "The percentage of pipeline slots that were waiting on the frontend"); strcpy(topdown_native_events[i].units, "%"); topdown_native_events[i].return_type = PAPI_DATATYPE_FP64; topdown_native_events[i].metric_idx = 
TOPDOWN_METRIC_IDX_FE_BOUND;
    topdown_native_events[i].selector = i + 1;

    i++;
    strcpy(topdown_native_events[i].name, "TOPDOWN_BE_BOUND_PERC");
    strcpy(topdown_native_events[i].description,
           "The percentage of pipeline slots that were waiting on the backend");
    strcpy(topdown_native_events[i].units, "%");
    topdown_native_events[i].return_type = PAPI_DATATYPE_FP64;
    topdown_native_events[i].metric_idx = TOPDOWN_METRIC_IDX_BE_BOUND;
    topdown_native_events[i].selector = i + 1;

    if (supports_l2) {
        /* level 2 events */
        i++;
        strcpy(topdown_native_events[i].name, "TOPDOWN_HEAVY_OPS_PERC");
        strcpy(topdown_native_events[i].description,
               "The percentage of pipeline slots that were retiring heavy operations");
        strcpy(topdown_native_events[i].units, "%");
        topdown_native_events[i].return_type = PAPI_DATATYPE_FP64;
        topdown_native_events[i].metric_idx = TOPDOWN_METRIC_IDX_HEAVY_OPS;
        topdown_native_events[i].selector = i + 1;

        i++;
        strcpy(topdown_native_events[i].name, "TOPDOWN_BR_MISPREDICT_PERC");
        strcpy(topdown_native_events[i].description,
               "The percentage of pipeline slots that were wasted due to branch misses");
        strcpy(topdown_native_events[i].units, "%");
        topdown_native_events[i].return_type = PAPI_DATATYPE_FP64;
        topdown_native_events[i].metric_idx = TOPDOWN_METRIC_IDX_BR_MISPREDICT;
        topdown_native_events[i].selector = i + 1;

        i++;
        strcpy(topdown_native_events[i].name, "TOPDOWN_FETCH_LAT_PERC");
        strcpy(topdown_native_events[i].description,
               "The percentage of pipeline slots that were stalled due to no uops being issued");
        strcpy(topdown_native_events[i].units, "%");
        topdown_native_events[i].return_type = PAPI_DATATYPE_FP64;
        topdown_native_events[i].metric_idx = TOPDOWN_METRIC_IDX_FETCH_LAT;
        topdown_native_events[i].selector = i + 1;

        i++;
        strcpy(topdown_native_events[i].name, "TOPDOWN_MEM_BOUND_PERC");
        strcpy(topdown_native_events[i].description,
               "The percentage of pipeline slots that were stalled due to demand load/store instructions");
        /* units and return type were missing for this event; set them like the others */
        strcpy(topdown_native_events[i].units, "%");
        topdown_native_events[i].return_type = PAPI_DATATYPE_FP64;
        topdown_native_events[i].metric_idx =
TOPDOWN_METRIC_IDX_MEM_BOUND; topdown_native_events[i].selector = i + 1; /* derived level 2 events */ i++; strcpy(topdown_native_events[i].name, "TOPDOWN_LIGHT_OPS_PERC"); strcpy(topdown_native_events[i].description, "The percentage of pipeline slots that were retiring light operations"); strcpy(topdown_native_events[i].units, "%"); topdown_native_events[i].return_type = PAPI_DATATYPE_FP64; topdown_native_events[i].metric_idx = -1; topdown_native_events[i].derived_parent_idx = TOPDOWN_METRIC_IDX_RETIRING; topdown_native_events[i].derived_sibling_idx = TOPDOWN_METRIC_IDX_HEAVY_OPS; topdown_native_events[i].selector = i + 1; i++; strcpy(topdown_native_events[i].name, "TOPDOWN_MACHINE_CLEARS_PERC"); strcpy(topdown_native_events[i].description, "The percentage of pipeline slots that were wasted due to pipeline resets"); strcpy(topdown_native_events[i].units, "%"); topdown_native_events[i].return_type = PAPI_DATATYPE_FP64; topdown_native_events[i].metric_idx = -1; topdown_native_events[i].derived_parent_idx = TOPDOWN_METRIC_IDX_BAD_SPEC; topdown_native_events[i].derived_sibling_idx = TOPDOWN_METRIC_IDX_BR_MISPREDICT; topdown_native_events[i].selector = i + 1; i++; strcpy(topdown_native_events[i].name, "TOPDOWN_FETCH_BAND_PERC"); strcpy(topdown_native_events[i].description, "The percentage of pipeline slots that were wasted due to less uops being issued than there are slots"); strcpy(topdown_native_events[i].units, "%"); topdown_native_events[i].return_type = PAPI_DATATYPE_FP64; topdown_native_events[i].metric_idx = -1; topdown_native_events[i].derived_parent_idx = TOPDOWN_METRIC_IDX_FE_BOUND; topdown_native_events[i].derived_sibling_idx = TOPDOWN_METRIC_IDX_FETCH_LAT; topdown_native_events[i].selector = i + 1; i++; strcpy(topdown_native_events[i].name, "TOPDOWN_CORE_BOUND_PERC"); strcpy(topdown_native_events[i].description, "The percentage of pipeline slots that were stalled due to insufficient non-memory core resources"); strcpy(topdown_native_events[i].units, "%"); 
topdown_native_events[i].return_type = PAPI_DATATYPE_FP64;
        topdown_native_events[i].metric_idx = -1;
        topdown_native_events[i].derived_parent_idx = TOPDOWN_METRIC_IDX_BE_BOUND;
        topdown_native_events[i].derived_sibling_idx = TOPDOWN_METRIC_IDX_MEM_BOUND;
        topdown_native_events[i].selector = i + 1;
    }

    num_events = i + 1;

    /* Export the total number of events available */
    _topdown_vector.cmp_info.num_native_events = num_events;
    _topdown_vector.cmp_info.num_cntrs = num_events;
    _topdown_vector.cmp_info.num_mpx_cntrs = num_events;

    /* Export the component id */
    _topdown_vector.cmp_info.CmpIdx = cidx;

fn_exit:
    _papi_hwd[cidx]->cmp_info.disabled = retval;
    return retval;
fn_fail:
    goto fn_exit;
}

static int _topdown_init_thread(hwd_context_t *ctx)
{
    (void)ctx;
    return PAPI_OK;
}

static int _topdown_init_control_state(hwd_control_state_t *ctl)
{
    _topdown_control_state_t *control = (_topdown_control_state_t *)ctl;
    int retval = PAPI_OK;
    struct perf_event_attr slots, metrics;
    int slots_fd = -1;
    int metrics_fd = -1;
    void *slots_p = NULL;   /* NULL so the failure path below is safe */
    void *metrics_p = NULL; /* before the corresponding mmap() has run  */

    /* set up slots */
    memset(&slots, 0, sizeof(slots));
    slots.type = PERF_TYPE_RAW;
    slots.size = sizeof(struct perf_event_attr);
    slots.config = 0x0400ull;
    slots.exclude_kernel = 1;

    /* open slots */
    slots_fd = perf_event_open(&slots, 0, -1, -1, 0);
    if (slots_fd < 0) {
        retval = PAPI_ENOMEM;
        goto fn_fail;
    }

    /* memory mapping the fd to permit _rdpmc calls from userspace */
    slots_p = mmap(0, getpagesize(), PROT_READ, MAP_SHARED, slots_fd, 0);
    if (slots_p == MAP_FAILED) {
        slots_p = NULL;
        retval = PAPI_ENOMEM;
        goto fn_fail;
    }

    /* set up metrics */
    memset(&metrics, 0, sizeof(metrics));
    metrics.type = PERF_TYPE_RAW;
    metrics.size = sizeof(struct perf_event_attr);
    metrics.config = 0x8000;
    metrics.exclude_kernel = 1;

    /* open metrics with slots as the group leader */
    metrics_fd = perf_event_open(&metrics, 0, -1, slots_fd, 0);
    if (metrics_fd < 0) {
        retval = PAPI_ENOMEM;
        goto fn_fail;
    }

    /* memory mapping the fd to permit _rdpmc calls from userspace */
    metrics_p = mmap(0, getpagesize(), PROT_READ, MAP_SHARED, metrics_fd, 0);
    if (metrics_p == MAP_FAILED) {
        metrics_p = NULL;
        retval = PAPI_ENOMEM;
        goto fn_fail;
    }

    /* we set up with no errors, so fill out the control state */
    control->slots_fd = slots_fd;
    control->slots_p = slots_p;
    control->metrics_fd = metrics_fd;
    control->metrics_p = metrics_p;

fn_exit:
    return retval;
fn_fail:
    /* we need to close & free whatever we opened and allocated */
    if (slots_p != NULL)
        munmap(slots_p, getpagesize());
    if (metrics_p != NULL)
        munmap(metrics_p, getpagesize());
    if (slots_fd >= 0)
        close(slots_fd);
    if (metrics_fd >= 0)
        close(metrics_fd);
    goto fn_exit;
}

static int _topdown_update_control_state(hwd_control_state_t *ctl,
                                         NativeInfo_t *native, int count,
                                         hwd_context_t *ctx)
{
    int i, index;
    (void)ctx;
    _topdown_control_state_t *control = (_topdown_control_state_t *)ctl;

    for (i = 0; i < TOPDOWN_MAX_COUNTERS; i++) {
        control->being_measured[i] = 0;
    }
    for (i = 0; i < count; i++) {
        index = native[i].ni_event & PAPI_NATIVE_AND_MASK;
        native[i].ni_position = topdown_native_events[index].selector - 1;
        control->being_measured[index] = 1;
    }
    return PAPI_OK;
}

static int _topdown_start(hwd_context_t *ctx, hwd_control_state_t *ctl)
{
    (void) ctx;
    _topdown_control_state_t *control = (_topdown_control_state_t *)ctl;

    /* reset the PERF_METRICS counter and slots to maintain precision */
    /* as per the recommendation of section 21.3.9.3 of the IA-32 */
    /* Architectures Software Developer’s Manual. Resetting means we do not */
    /* need to record 'before' metrics/slots values, as they are always */
    /* effectively 0. Despite the reset meaning we don't need to record */
    /* the slots value at all, the SDM states that SLOTS and the PERF_METRICS */
    /* MSR should be reset together, so we do that here.
*/
    /* these ioctl calls do not need to be protected by assert_affinity() */
    ioctl(control->slots_fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(control->metrics_fd, PERF_EVENT_IOC_RESET, 0);
    return PAPI_OK;
}

static int _topdown_stop(hwd_context_t *ctx, hwd_control_state_t *ctl)
{
    (void)ctx; /* the per-thread context is unused here */
    _topdown_control_state_t *control = (_topdown_control_state_t *)ctl;
    unsigned long long metrics_after;
    int i, retval;
    double perc;

    retval = PAPI_OK;
    metrics_after = read_metrics();

    /* extract the values */
    for (i = 0; i < TOPDOWN_MAX_COUNTERS; i++) {
        if (control->being_measured[i]) {
            /* handle case where the metric is not derived */
            if (topdown_native_events[i].metric_idx >= 0) {
                perc = extract_metric(topdown_native_events[i].metric_idx,
                                      metrics_after) * 100.0;
            } else /* handle case where the metric is derived */ {
                /* metric perc = parent perc - sibling perc */
                perc = extract_metric(
                           topdown_native_events[i].derived_parent_idx,
                           metrics_after) * 100.0 -
                       extract_metric(
                           topdown_native_events[i].derived_sibling_idx,
                           metrics_after) * 100.0;
            }
            /* sometimes the percentage will be a very small negative value */
            /* instead of 0 due to floating point error. tidy that up: */
            if (perc < 0.0) {
                perc = 0.0;
            }
            /* store the raw bits of the double into the counter value */
            /* (memcpy instead of a pointer cast avoids strict-aliasing issues) */
            memcpy(&control->count[i], &perc, sizeof(control->count[i]));
        }
    }

    /* free & close everything in the control state */
    munmap(control->slots_p, getpagesize());
    control->slots_p = NULL;
    munmap(control->metrics_p, getpagesize());
    control->metrics_p = NULL;
    close(control->slots_fd);
    control->slots_fd = -1;
    close(control->metrics_fd);
    control->metrics_fd = -1;

    return retval;
}

static int _topdown_read(hwd_context_t *ctx, hwd_control_state_t *ctl,
                         long long **events, int flags)
{
    (void)flags;
    _topdown_stop(ctx, ctl);
    /* Pass back a pointer to our results */
    *events = ((_topdown_control_state_t *)ctl)->count;
    return PAPI_OK;
}

static int _topdown_reset(hwd_context_t *ctx, hwd_control_state_t *ctl)
{
    (void) ctx;
    (void) ctl;
    return PAPI_OK;
}

static int _topdown_shutdown_component(void)
{
    /* Free anything we allocated */
    papi_free(topdown_native_events);
    return PAPI_OK;
}

static int _topdown_shutdown_thread(hwd_context_t *ctx)
{
    (void) ctx;
    return PAPI_OK;
}

static int _topdown_ctl(hwd_context_t *ctx, int code, _papi_int_option_t *option)
{
    (void) ctx;
    (void) code;
    (void) option;
    return PAPI_OK;
}

static int _topdown_set_domain(hwd_control_state_t *cntrl, int domain)
{
    (void) cntrl;
    (void) domain;
    return PAPI_OK;
}

static int _topdown_ntv_enum_events(unsigned int *EventCode, int modifier)
{
    int index;

    switch (modifier) {
    case PAPI_ENUM_FIRST:
        /* return the first event that we support */
        *EventCode = 0;
        return PAPI_OK;
    case PAPI_ENUM_EVENTS:
        index = *EventCode;
        /* Make sure we have at least 1 more event after us */
        if (index < num_events - 1) {
            /* This assumes a non-sparse mapping of the events */
            *EventCode = *EventCode + 1;
            return PAPI_OK;
        } else {
            return PAPI_ENOEVNT;
        }
        break;
    default:
        return PAPI_EINVAL;
    }
    return PAPI_EINVAL;
}

static int _topdown_ntv_code_to_name(unsigned int EventCode, char *name, int len)
{
    int index = EventCode & PAPI_NATIVE_AND_MASK;
    if
(index >= 0 && index < num_events) { strncpy(name, topdown_native_events[index].name, len); return PAPI_OK; } return PAPI_ENOEVNT; } static int _topdown_ntv_code_to_descr(unsigned int EventCode, char *descr, int len) { int index = EventCode; if (index >= 0 && index < num_events) { strncpy(descr, topdown_native_events[index].description, len); return PAPI_OK; } return PAPI_ENOEVNT; } static int _topdown_ntv_code_to_info(unsigned int EventCode, PAPI_event_info_t *info) { int index = EventCode; if ((index < 0) || (index >= num_events)) return PAPI_ENOEVNT; strncpy(info->symbol, topdown_native_events[index].name, sizeof(info->symbol) - 1); info->symbol[sizeof(info->symbol) - 1] = '\0'; strncpy(info->long_descr, topdown_native_events[index].description, sizeof(info->long_descr) - 1); info->long_descr[sizeof(info->long_descr) - 1] = '\0'; strncpy(info->units, topdown_native_events[index].units, sizeof(info->units) - 1); info->units[sizeof(info->units) - 1] = '\0'; info->data_type = topdown_native_events[index].return_type; return PAPI_OK; } /** Vector that points to entry points for our component */ papi_vector_t _topdown_vector = { .cmp_info = { .name = "topdown", .short_name = "topdown", .description = TOPDOWN_COMPONENT_DESCRIPTION, .version = "1.0", .support_version = "n/a", .kernel_version = "n/a", .default_domain = PAPI_DOM_USER, .available_domains = PAPI_DOM_USER, .default_granularity = PAPI_GRN_THR, .available_granularities = PAPI_GRN_THR, .hardware_intr_sig = PAPI_INT_SIGNAL, }, /* Sizes of framework-opaque component-private structures */ .size = { .context = sizeof(_topdown_context_t), .control_state = sizeof(_topdown_control_state_t), .reg_value = 1, /* unused */ .reg_alloc = 1, /* unused */ }, /* Used for general PAPI interactions */ .start = _topdown_start, .stop = _topdown_stop, .read = _topdown_read, .reset = _topdown_reset, .init_component = _topdown_init_component, .init_thread = _topdown_init_thread, .init_control_state = _topdown_init_control_state, 
.update_control_state = _topdown_update_control_state,
    .ctl = _topdown_ctl,
    .shutdown_thread = _topdown_shutdown_thread,
    .shutdown_component = _topdown_shutdown_component,
    .set_domain = _topdown_set_domain,

    /* Name Mapping Functions */
    .ntv_enum_events = _topdown_ntv_enum_events,
    .ntv_code_to_name = _topdown_ntv_code_to_name,
    .ntv_code_to_descr = _topdown_ntv_code_to_descr,
    .ntv_code_to_info = _topdown_ntv_code_to_info,
};
papi-papi-7-2-0-t/src/components/topdown/topdown.h000066400000000000000000000063301502707512200221530ustar00rootroot00000000000000#define TOPDOWN_COMPONENT_DESCRIPTION "A component for accessing topdown " \
    "metrics on 10th gen+ Intel processors"

/* these MSR access defines are constant based on the assumption that */
/* new architectures will not change them */
#define TOPDOWN_PERF_FIXED (1 << 30)   /* return fixed counters */
#define TOPDOWN_PERF_METRICS (1 << 29) /* return metric counters */
#define TOPDOWN_FIXED_COUNTER_SLOTS 3
#define TOPDOWN_METRIC_COUNTER_TOPDOWN_L1_L2 0

/* L1 Topdown indices in the PERF_METRICS counter */
#define TOPDOWN_METRIC_IDX_RETIRING 0
#define TOPDOWN_METRIC_IDX_BAD_SPEC 1
#define TOPDOWN_METRIC_IDX_FE_BOUND 2
#define TOPDOWN_METRIC_IDX_BE_BOUND 3

/* L2 Topdown indices in the PERF_METRICS counter */
/* The L2 events not here are derived from the others */
#define TOPDOWN_METRIC_IDX_HEAVY_OPS 4
#define TOPDOWN_METRIC_IDX_BR_MISPREDICT 5
#define TOPDOWN_METRIC_IDX_FETCH_LAT 6
#define TOPDOWN_METRIC_IDX_MEM_BOUND 7

/** Holds per event information */
typedef struct topdown_native_event_entry {
    int selector; /* signifies which counter slot is being used. indexed from 1 */
    char name[PAPI_MAX_STR_LEN];
    char description[PAPI_MAX_STR_LEN];
    char units[PAPI_MIN_STR_LEN]; /* the unit to use for this event */
    int return_type;              /* the PAPI return type to use for this event */
    int metric_idx;               /* index in PERF_METRICS.
if -1, it's derived */
    int derived_parent_idx;  /* if derived, which parent do we subtract from */
    int derived_sibling_idx; /* if derived, which metric do we subtract */
} _topdown_native_event_entry_t;

/** Holds per event-set information */
typedef struct topdown_control_state {
#define TOPDOWN_MAX_COUNTERS 16
    int being_measured[TOPDOWN_MAX_COUNTERS];
    long long count[TOPDOWN_MAX_COUNTERS];
    int slots_fd;    /* file descriptor for the slots fixed counter */
    void *slots_p;   /* we need this in ctl so it can be freed */
    unsigned long long slots_before;
    int metrics_fd;  /* file descriptor for the PERF_METRICS counter */
    void *metrics_p; /* we need this in ctl so it can be freed */
    unsigned long long metrics_before;
} _topdown_control_state_t;

/* Holds per thread information; however, we do not use this structure, */
/* but the framework still needs its size */
typedef struct topdown_context {
    int junk;
} _topdown_context_t;
papi-papi-7-2-0-t/src/components/vmware/000077500000000000000000000000001502707512200201155ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/vmware/Makefile.vmware.in000066400000000000000000000000711502707512200234600ustar00rootroot00000000000000VMWARE_INCDIR = @VMWARE_INCDIR@
VMGUESTLIB = @VMGUESTLIB@
papi-papi-7-2-0-t/src/components/vmware/PAPI-VMwareComponentDocument.pdf000066400000000000004660251502707512200261270ustar00rootroot00000000000000[binary payload of PAPI-VMwareComponentDocument.pdf omitted]
đËKœ05ñÓÓK¬r•¢m9ºµ=8S«×Ù†µÚQ ­Q7Xeå£ÓĉÇyZ1kâ«ø}êaº7S{Đ½)‰6ÎĐ?’hă ³§™f~mÜÄ«Vå ·QßvÆyåà˽-öƒWåÚ;<̃ùú¶QcX-û̃ˆä¡ïa₫ˆ+¦f~>ÉZ­¼Á_z¾cµ•¸ṾÍ+ÿĐ6G¿Îσ+¦ @²đFyc“w]ï8Q+̃ze×Whë}ÓØ–nuµÑí¿¯}₫àD¿fÏmó¿é‡̀kgÉĂó†̃i«=xCï·ƒp\{-GĂI,¶­1åâ(Ù&6¦]Đ˜4;XQn8Ă÷ưÛdĂÆ´Û,YxNÓè₫iàf·\%^³đØÍE\+)úS ?ƒåÛÉ6g^ùHÚ·Œ×ưv8¤CăsÜ5 NÉ,×nâÉĐṽm«Ă“¡Y¥Ñ"lƯU ÜÙÓui4‰]ëIœ¢…Ÿ;’‚·íÁ¯R÷S?́}ñ“ƒfKí\»v> Áaw—úE{¿ă₫}Ü?ùÚwÓƒ×#©íAómùnßÇùEmùFĂQ/´å»N2v“Äm¹ă ¹klĐÎñê—]Ùî‰WÙ?‡|gk ¶öˆ-giO·ö`·“/êƠø_c·JP$«å2äe˜₫r̉e~y©[ÅpøŒ)îê ·gø8Yu%9+Ơ2^m¹_ëxÁ´aÉYquÛaxS?x}•²«́ÖjËwÛàö(·kå×KjëwS?đ4“®² Øuă¼ºÍXu(9+>ÿÛăë₫€?£5é›äâaí́× [eÎø¯^nq•̀ø·^½™gü‡×$ÿÄ5ă–nŒÜ=Œná¿ù›3œ3ÿ=È«sbÆá¨:‰1¯} ß ¨lhM¸% læù¨Ç÷™ÿ>æ5qæùîàÛ“̉₫ưC^‚̀Ñ0{$ȼ~ÿ+̀«œùïB¾¨äÇ«^rç§OƠBP3ÿuÄKJÆö}8^¾i§°Nç_B¾)À¶¡Óư齓8Æ̀ÿóI>ƯŸrÑ%%~¾ÉE—²~ÿçêJù?=)ñ¢;ă¦m %k9…ÿö¡ª—~江ªu|L ËÉ5=U{ư‚\f§u»oæ×ưâú}œÙƒ¸=f₫àu­ª–éö$/É 3½M¯•ux™c̃0¶wyµ¿áß”`ʆ¹ă>¸‡û×·\ç;¿¸}2]´[{ƠänBôdn§Ø´N~|wU2©ßQÊÛx̣ê²Üʘy\¿­bs0|ßqÚĐă+JÜW~₫H^©=¾•5óø̣dSâƠ0Ÿå¶btüú’Û)Æ7j=„™ÇNËf…:K{$¸¿Î́ä¼5ÙƒÏ₫Ö̀"z½ø¬É|{~ÈkûqĐ¡­÷­ưx₫Ôơăønf³?ÜÊŸ®+¾,ÊỶärä'xt }#Ơê¶ 7Ç¿éÂCù9§lvlC[‹†­B«ƠECïKøfå÷Ë₫}Üûm_7hfqÿn+‡äÊ“7|ÿ®&ͲVÖ¤5aÈÑÖríí9”{ßÙq{$«Î0™Û1<’„W “_ÜÅ—A»Øâ.¾×é0_Ôú˜xXEî+n5₫~xÓm°Æ‡\X¼‹A^ ÖFCÿÇ]bo¥â¯«ä~c5œ cªê,™xœ÷v'ñ&Î<®%×Îå!äÍy·ép÷·Ơ[¡û¥̃mhÔÂ+ụ̈ƠZx†æW½»=ó;ïưvagî-·¥hVJƠöl§¶qŸîMI¬£ûR”~3IS"ö\½O8ÓÄ}Â‡ç—ø6ă𨬉ot[4‘mm .j3|ÅÏÜ÷ă¿/Îß¼đ‡û~·z¿&†#\3œâÚ›‡NZ„hæ5¾_đCM§¾`éæÄn'N<Î ŒƯN¤»Sô½¡9I/(đÍi‡»«fC;Ü­“óPÁJk†æK†«e²§®Đ‰Ç^¢uÁé•.®ëhY\MƯo“óàUyÓ<‡Ô±^K†Ö&¾×’™yœåơv2/€øº×öàÔ´đ¦M¹xoÚáp1m«{U¢6gßW¯‡A;H™~ í¹/ •°óY¼$Ûtøß˜/û|;ÜoŒ»vẶf¹^J/_IÄ+·a6”¬‡_º9ƠïĂ‹oÇJ¨ÈwúUn _÷“Ăѯ²mØÔuU¿ô÷ÛiđZ‡÷đn@̃w |}2„m·8Ü>Lê“æù²7çøl€3hIÜ»ùÅ(y{Wâg›øm’á(̉®Ûn¡ºë’C™©$u/øÉpw<ỨƠ’cđ!w7Ư₫̉=̃ôä”0¶á , äÛ0\Uk½ĐÊ!^Zëeæu©ê6ixrG€-ÉÚs†ƒ¤\†ô«jĂYTáÚU!₫~¸·s0NRlç†`P%1º}̣ăæH¸díNœô–̃4-nO;ÇÇl˜<·×đ=¼÷½O₫Ă]Ë*%îéÅsÚæ‡$º /vú•¼Ÿ«°SQ¢£™ÿ~’Ûg«¸ĐÈL’2oĐÍRgj›øÅç÷Û Éå}>.P†—aº%ÿ¦Û°¼¡GóX§@ ÷® ÿăÉ5Ôö³œ¹Î®q±#ùƯ¶—ˆºyX\0ÊËùAaưq¹/,6ÍÅqHÙÔk45[Ù.ªO˜’Nk€O_ÆƠ€¼_qH*j¼S=§ÏT¶7âơ¢›,Y"ÜW%.çÙnkD¾Œ“[R–Ë®ŸáCƠn}ó3ŒƯ_Ă |àĂ5Ë[÷đ>÷VGÖĂ>ù¥É‡L™´4ùP*,_ÆcRµÀ¹HÚZßpkûÅ]ÈfH* ¤Đw"Ùụ(c å“-Ÿ¸…2¹®5ÿ±²M–Ë–ÓZ=xƒôơR@I¿c—«÷¢‹^¦<Í#́ƒH«²=ø—̣̉ ́¼jk;Đ“îî8JHmñXÙáÎúÅgøPĐª§q#Xºw̉GRËÊVÀªw†>Ă'mƠ:;Ÿéå¨ó~7 
5ÂëÓ®çñ–sÓ£éùLÚkK̀8.EÑó1'?<ƯSx&—–ˆâ?ápÈqÊĐ˜Ú÷N¶1 ™X¼]=À:ñ¸TëÙ,¨C!ïÄÎ8̃r½7ª¾°²NĂ Ơú2ầăCgÈzđ¥Û#¯LFVînÜy›]¡ûSP5G·'M˜xèp×'S¡?ÅÎ64?ëæ3?EïṇâJ­wgè₫ª/óÍÇRƠTæå­÷~ø3è³¾¢†sÙ₫O®çz°ëW\úa_“đ—Z¡́Ÿ₫&/î ÇIÏ®xMöd—{³t¥„e;½O…Ư©?ăø9ơëÏ8t›t×₫Œc[4I>ôŒC5å./³`æá(‰ă¾¬ß‡jJ F¿v\…’™Vq¡pÅñ¤ƒnk?öă§K®₫Ik3̀?€jJÎĐAR̉y>POÉ#•Íbœy¨§\¹%£Î pÛHóÚCPOÉMÎ̀€xŸ‹aIÆ¡K|ÿø6AcµđuŸ@¸=Ab¦'´d:g蟼èÙùđ‘\Éñ «ÀPïû¶n :+¦µ§ä́ùơûØ0’¤½uaĂ«èÍR¾ưƠK ó̀c+¶ ăÊă¼ŸKoJỌ́©§• |ßâè₫ĂèZ'4ÎC̣q_`8ÉK5ÂÇIor Ø.~ƒ!¼-øă›.Úß«_Χû¿?L@O1©é±’v/ĂE›Ô“¢úJúëưAÆ?%̀É`Âá–X£©1îºöÖÀT^‰—›o»ô<2á‡û™ư<2ñ‡÷ơá‘™Ç^ú`ÜŒÎZ aæEµ‚G&Çףߛgq»ÎüáyE}‚k’ZÓÂW¼$GlæE²nyˆoÏ]wyq9ëcç3Ÿ#-úØ9ßÿåÖÓ;½ºªVoå¿_߯‡ÅziÍ™Ça1§ÍÓóG.D&ƒ6ñÍÚº ëËëK´voF4 —_^¸ Ẁ&ỚÓµëf,n̉ éååå>ó‚ăˆó­£ó£ƠV—‹¼.—÷S1t¿T•ă•¡Ú&ç¡h”>Å<óWĨG ›̉×¾Wà pWÚ'¶uÛw̉ƒ§D¯Î<.ü̃ÖîeØ[äိ=8®×yqCƯUíÁT5Ç·?VơLüŸc×GÙ÷ lzßqàø@\çÏñ!€xå 7k5ôO {âˆƠ$Ç™Ç!ó*É[ts׫;µR“a¸$¢½ ×Áwpí'ÏËmiZ=H¡₫mù̃?”‹ôIF®đyĂÁYêôoÚáPIIoóÍ—@”a¯ç ¯ûåFI\)ˆº‹ó'ä`¸6çOÈÛÀë÷áÁ*¯̣̣Jxưæ‹TÑ÷™×UÉé}!º;ơ…BĂ¹__( ülKí\#¯;G0Öê9`…ÏÁ­ưy>ĂHÂ₫½àPIÉEβà81\K/Í8”²&=mN8ƯëK%3m¹ṽ kk—î̉‚ă›f⸛àĂ帠QBVT7”7Û“’`馻³34?ö´ºù1íCu¸‹§oÍ<~¯ÙÅÂß~ÿ₫±üµh?ṽ»ô¨Ăóuÿ₫Á¬×ùö½À<óØ-Q{/IRđkârÊ—&GÍ<6ä®ºÏ·ă›ƒum?¾k%÷42ßÿzanm?.ẃơ‘˜™?¸49væu—ôMZë‹ßÀ2ßÄo°Í·Ă“†ú$à̀®¯©[…Ơn^,ƒzóbÙCṣỤ̂™đ’4dظ|[¾ñ6È[49–—W Ѱ\j–\fºưAªË¯ư §’“±µNÅsp~º‰'àö¼º’Ó9̣ËQr& Ư¢ÛÅ=8â®­p‰ç¶ăÊRÛó~ ƒ—⤄tåg[è/ñƯÓ_ă{¿­®bø|[,Ởûmo ksp¡£÷S*Nù§Ư+¯|Ô-aØë$s¾$^YIEèyå#•ˆ.CJÎÄ6;q‚¸à¿x¥‚t6œe¤¶P1œe$g¢®Ê;2’×;¾R[(6ypSæ¸=÷mRæ̣  3i1kÉZ=HÎĶYr ̉¾YJN——ᨭï.8öêêûüçÓµoØ®o;c4tø=Rá— 2œL’Ï»?lGÅiú *~`Đưê÷`/~¤¯œ;rdLL8Q\hÆá12&&.‘Ú5ø„Ÿü’6áØ1qéKb3cØ’Bç´ %®ß?Ö'J«¸Đç$)’„Ç•øJ̣ÚĂ;dU.ÑÓƯ/9~mÏ1g"CûcÙûçàüˆdÆÎ6É5NχÛë=ñ‡KñE7hv)JÑé ưŸơ&%?ßúƠ3¾ä}2ŒWÛQ\6´§fÓ|SçÊ:^Øø¾¤è7=¼^^YÅÅ7§›qæ ÓMæÚ–Ëá}U7-Ư÷ép¸v_Ô>`‡Kßc\Û¸v¯enùöH *đêP‹TGCÿÜaߌ7ï«únèögµéîlÖ_µLI¡µLI¡µt z‚™øƒ/ẳû‰Ç¾ )"íùƠ«¾Ănä„·NlMû1Çö§dM¤̀÷§>·Åo^z]„_\’3±~»¬-–M™\ĐK7§ÙN÷ÚœĂu‘Û´5Ê­₫ídˆ¿/¶a+’”‰kmÏ¡ @¿¾BO†Z÷cöñ5¬m+ŶåƠo²{]t×̃ÿ‡*̀·i«–œ Ë^'9ÛÑüר—áö£?œÎq›û8c"Æ}°pçH₫’Áä)í₫₫íöî‘‘öÎ?IîAz²Éeƒæ”·°,3–hÑœ’ÁaĐœúÖ:ZøjÆ¥rhƠ£ŒÄ÷¾”HïM©§p¬̉Ô_¥g[z·5’¯ƯJÆ̣ö·¶èñRÏÛ9`„*n¼sCz¡Á‡³¾ôa‡©tg÷„cW†ÓApèʨZZuÆñM”ª p±#ăêM¶'åjÉ&,vd8­)3ó‡”"O$̀üÁ‘qéîÏv¾T 
k{°#£YZ1Ú/u×₫ÁŒ¨%¤è©)›½ă§ƒ8JÄ̉øƒ#CëÑ̉s_.‹”µưø-µfiƠh¯¶IöƠÄăÚmí̃̃0Ú´©†« §Û_ûíº‰?<’̃/3±ëÑ_=ÉΩ7¸©“CVI¿ĂêÍBñ|ÿx9=&~½HJN|Êå•’y}%·Wjå׋îË÷µRöí ßopaßNïëÛ´^̃¦å›zÆ6ÛưR`±~ơÊ}ƒ2ÇM4–BÛܯ›åµ­Û:3ơ Ưü»»'₫đ|Ö­Æ1»tơ™tËäÏÑ6ùóÛQøđÜVØ÷́†l'sW ³­ÍR g›d¹¬íÁY ’åb8Û§g;âœÚ̃~öRÉrÉe½>eJ/É †ù,¡{ƯKY4yß»đm¦èöåmoO·¡ưɽª¡;å̃·a´¤®·çW»äĐ$Ëhµ“üfÇâÛ4̣àµÁ’’:"›vÀ£U´Î;?Z5́[#UG„=©>uDØñzꈰ'Ÿ§«}:"́ø>uDØ₫|ꈰă5ꈰ³_}ĆXI»[óÛô¤}CÎø†&'ÑtßÛ˜QÉs±+)ˆF`Û"®¡µc®!I’›x́ èUDض‹g(¬‹H¥¥9¾Ñ›x́)‘—c×̃<¼+Ô’›x\Æ"hu_¾¢×Ư–7vgôÄø¼+QW?̉éýÉÓă:^‡û@½R-oÛOV^w‘w%ª7tÑgêé•++đzD=.VcíeÄØî÷W¿©>ñØv½Œ;}[Π7½ï<¶ÿ¥JIˆ|ÿËˡں#nR¦ä6¨7}æ½ú¿ «Ẽ–0(Iˆ‰†Ù/nMÙ¼6~W¶ørXÖw=øöd}Ù€Ÿ m›Î–ÙÙöébP>¾FCăk̃—Ö!Ư&îSùpu(èIpâW‡ô’/{ĺ™Ûpn›F̃Đ—’=cèM)κís¸Ô Z{Vo–j«n]Z[{µR7?ŒîtY\‹ú‹?Á~¯ÉÓÿ‰¹Ó+€Åë₫Àéש_î Ÿêz̃́¯V<®Üo V²ügúW–’ue¡¿iq ÎôWˆ–́Í…₫¤ó¢îRôÑ-ø×!®&áŒÿá’D½â ñ"î@ºc”2r´¨±î¢₫ÂSÜE…_¿¯]Ôâiu*́|;+n²₫â÷.ëw.åĂ Ă*6̃:{„đêẺ=Ù¸‹Çåù±èéaơW‘̉ùô×å™åÿ&Äë.*êH‰í]¼ß&ê L¢öZ=3₫g×¢‰́b•èkÅñ,¾¡ÿ.Âo-‘;ă?ƒx‘Œ¿ÿ!ÂsØEư[ˆ×]T¤®%  )uN±¶‰m¢¢)&‰Ï›¨h΄Kë5±‹O6ß₫̣€¢tÜ{Ix́½4­{/K÷½—¦3Ͻ—Æû̃Ëâcï¥ñ¾÷²øØ{i¼D{/‹½—Æ“IÔ±ởømul½4̃·^[/÷­—ÄŸ­—ÆûÖËâcë¥ñjơÙzY|l½4L¢­—ÆûÖËâcë¥ñ¾ơ²øØzi¼o½,>¶^¯&QÇÖKăƠ$êØzIüÙzi¼o½,î4´Ëăú̉*{M„àñ,y4.ơË-x1‰ƒIÔXM¢J:ª—˜ƒO&Qó%¿%„Æ‹̃1åqủxÛÊ¢ÏQă,¢Ê•$ƒ¨RơƠ jl[™ATq“YD•jü‰V*¾̃¼me—¿¥4'­NÍăÙ$ªÜ·àÅ$jÛÊ,¢æbUWXđ*Pi¼êĂº,ÚV-¸¦}ѸÓ:„´%yuEvs¨Ü ưOGût‚ àaŸNô=Û§,ƯíS²ÙjŸ²́°Oi¼Û§,>́Sïö)‹ût¡ϱ­Í°ö9ô KpÎÓ_ĂGN¥^î‚J'ü‚xÑSé„ÿÂ%,½â_ƒ¸&½ÍxBxÛ-6QaˆÅÇk—ơ[O»¬đ¶]l²zˆßzV›päĂËçZñ ñ¬g5VỐơXÊ~=kÁºí%́¢¢¸X>›¨Đ-ئdH<.¯¤ßôê )è¾Ë~½hàÏ’ó;ăƒđ^›N…q•¦•G~p9—ÖGÎay)q“¹Íå\ºÉúm„ûk—Å}¢ORÆŒ')™W ƯñR1Ï×G~p¹7~W¾ßÅé·âȇÅéWiQơÀC/m=ïĐ+;5Ëư¦ṿü6&§£6FAơtDÂătDÓz:bé~:¢é̀³ătDăưtÄâătDăưtÄâătDăƠ$êđ̃³øđ̃Óx2‰:¼÷4~›D̃{ï̃{̃{ï̃{¼÷4̃½÷,>¼÷4^-¢>̃{̃{O&Q‡÷Æ»÷~ÂO1Ó¨¯‹ŸÍWƠº$<´.M«Öeé®ui:ó́Đº4̃µ.‹­Kă]ë²øĐº4^M¢­ËâCë̉x2‰:´.ß&Q‡Ö¥ñ®uY|h]ïZ—Ä­Kă]ë²øĐº4^-¢>Z—Ň֥ñduh]ïZ—ÅG̀”Æ{̀”ÅG̀”Æ{̀”ÅG̀”Æ«IÔ3¥ñjuÄLIü‰™̉x™²øˆ™̉x™²øˆ™̉x™²øˆ™̉x1‰:b¦4^M¢˜)‹˜)'“¨#fJă=fÊâ#fJă=fÊâ#fJăÙ"ê3¥ñbơ‰™̉xµˆúÄLY|ÄLi¼ÇLY|ÄLi¼ÇLY|ÄLi<›D1S/&QG̀”Æ‹IÔ3¥ñ3eñ3%ñ'fJă=fÊâ#fÊZ’—n •D¥:ñ¯c¦LÄL'ˆ™²t·OÉf«}ʲĂ>¥ñnŸ²ø°Oi¼Û§₫'öé„cGœÛ††L¥lÿmY™ ¦̣đá:œ(^êÚ¬Y,•E¹Đ0VÚƯú C¥Ư­?Ñ8RÚưú£‡̣Ö¡3|] †¯<¼çü>oaÇ´‰¸Í[‚óít¹Í[Ël§K“¬íx¹É /º¤ºË {^ ¯8Û^»¬ ñ{—æ´ăe´L²r« 
N.S¹·ëy!×p7YaœWêS¾ă¶V¡¨b“¬‹Å2ǻÚ/(P*ɺVÑ}!©:¶É‰B“!»}ú¢‹WblÓƯÁ”h¹eHë®zѽ«x%ºƯr¡wƦ]ÚÅıig3ÊNºĐ0×y#öâå]èĂĂyÜ&-́Âáăe…̀y6\¼:‹n•Éëm‘nGzÛFÑ·ÓÛ6OzÛEÿÑ×…r–² ªç`ç`Ös0K÷s0Mgç`ïç`ç`ïç`ç`¯&QGœ†ÅǦñduÄihü6‰:â44̃ă4~ ŒJŹLبºàHx,8ÖÇ̉}ÁÑtæÙ±àh¼/8 Æû‚cñ±àh¼D ÅÇ‚£ñdu,8¿M¢Gă}Á±øŒ̉xŒ’ø¥ñeñ¥ñjơ Œ²øŒ̉x2‰:£4̃£,>£4̃£,>£4̃£,>£4^M¢À(W“¨#0JâO`”Æ{`”ÅG`”Æ{`”ÅG`”Æ{`”ÅG`”Æ‹IÔ¥ñjuFY|Fi<™DQïQQïQQÏQŸÀ(‹¨O`”Æ«EÔ'0Êâ#0Jă=0Êâ#0Jă=0Êâ#0JăÙ$êŒ̉x1‰:£4^L¢À(÷À(‹À(‰?QïQQÖ’¼ô-YG¢ÅËíEÑ>Fgú¥iµOÙf‹}J³Ư>åqµOi¼Û§<®öéŒÿç³}:ăØĂsoˆ£ÿ²¼²Đ8,ºÎ@-ûDaQ}ăk¡QôO ?¯2°èµOA±ºÖ9Cṣ4ŸáÓîm˜¨›ƒøbíÛ„1Ñđ6aL4Ô}¹îáÆ]V½vYaǧ´Ë »æv»¬°kî{—âÙí¢ÂÆä¼‹ §oQ' ßñE½$3ï ¨^’‡Ñåª^V¿¨]Ç+ºàßæ$ç¾ÍIÏ5t£\!^—6æúMÁhn*›₫ÂÑܰÏuͽë̃‡8÷>„ÑÜ\÷¹£¹»‚„ư"VƯ/jè²í8øû6aÚ½MDFơoº;Ă›n„Áâđ¦a6î“~<…­aH7½ÍEŒ¾ßæ"F›æb,û\ÄÑè}.Â!ªû\<Ç£Ùo÷x4»ü{Jă=>ÊăưäOâ#>ÊăưäÏâ=>ÊăƠ"êˆ̉x̣x2‰Úă£<®ñQïñQ×ø(÷ø(k|”Æ{|”Ç«IÔåñjµÇGY|ÄGy\ă£4̃ă£<®ñQïñQ×ø(÷ø(“¨=>ÊăƠ$j̉x̣x2‰Úă£<®ñQïñQ×ø(÷ø(g‹¨#>ÊăÅ"êˆ̣xµˆ:â£4̃ă£<®ñQïñQ×ø(÷ø(g“¨=>ÊăÅ$j̣x1‰Úă£<®ñQïñQñQ×ø(÷ø(mI^z •C£§ƒ¢çøèñщ&â£,ƯíS²ÙjŸ²́°Oi¼Û§,>́Sïöé„ÿÂ>p½^a¡ñ;lé•ùoßë`Âøè·i#¤ÙoÓ FHsÙ¦Œ–°M+br¥., Œ¸÷y…"uăÑƯ‡ua¯´Ï+zs×>¯ Äï}^á;¯nÇkï]Txó2ø]T4]|È»¨đëí\¼‰ ăẓô5¿âäÆk6à[Ăá{z#øÊ~Zœ—ôL—¢Æë·ásm¥…ÆÏ®•׺à‹qm­: ¾ÑÖVQ¦ÛÚ"r«˜è¡ÑĐQâ{%¸ôª+ă‘m~<Ơ¸¤û¼Ù– +‡·)ÅẲĐî¶­ô/!¶™…#À×6³N·/fvÛ̀B³6ä¼Ï,\);l" ¢·Tè%jÜ6DŹ×Ùo˵bv kÅl—Ķ­ư ‹L·CÜúíïC:o³ _A[wĂøy¨[wĂ€xŒ[wC)Ózr‡ÆÓ~t?]X.´ª÷~t?Uô6ȘùU¦‘jĂ”jë&Óª'¶uĂkïØÖM¢»;]qST0ÊÚÛä퉆AsïévˆƯüùd"œ¯¯?ô~ø4/9vX—,¬3”„»mÉ™F‡eÉ̉Ư°$éaW²t7+IzX•ä^z₫gââ}´?³̀h“píÏ04'Çh†Ñ;&}´?£ÏŒ6×äg´?Ó(Íêóh¦Ñ9ơóhsß~Fû3´Ëâ₫BĂÈG„›lÉp¡áÓé#¾M¶d„·¿ĐđûƯ&Çr·¿ĐèôĶ¿ĐđÅ÷Ú&[2"Û\KÀö™O\›ëÁ'¬Íµû‰jsóû js3ö‰i¡¡¹9BÚdˆö(€6·x67¿Ÿp6Ùß#ư…F6Á̀&Û=bÙ_hø~üe“spD²É‘́/4: =qln~?á/4J|¢ØÜèk–®)Ÿx5Ip5K÷h5I`5K÷X5IP5KçMÊ“é´p¡¢çl ¯&Q‡•ÁâẰ ñdu4~›D¦w[ƒÅ‡±AăƯÚ ñÇÜ ñno°ø08h¼ZD}L6'“¨Ăê ñnv°ø°;h¼,>,曆ۃƫIÔa}Đx5‰:́ ï‹„Æ» ÂâĂ¡ñn…°ø0Ch¼˜D†W“¨Ăaña‹Đx2‰:¬ïæ‹{„Æ»AÂâĂ"¡ñlơ±Ih¼XD}¬¯Q»„ŇaBăƯ2añaĐx·MX|'4M¢eÅnư}N‹éSŹ̀ÓbÙ9ØÓbÙ•ÖÓbi:ïóΓœùùWvÍưHçMF8î5у®v=½ØÓLÿƯɪ£Ç±çIJ³OrbÙ₫H₫mb’%%„sqY´r—-0_Ơ9MÂĂº¤iÓ,Ư­KÎ<;¬Kï‡oÖ%÷Ă7‹ë’MøKBç9̉₫ ̀ ü&zÿB#•ó 
<Ùø/,´YŸÿ‚Ă„Ưgà¿à0i÷x²áÏÀÁ‘"ù<đ_p˜uÚẴ36ö¦¿̃ẴtÛ{Ø{ÆaÎq{ÓéaïGƠ4GØ›¦öqth„½g§äjØ›íÈöfÛ>Ẫ33P{Ø{Æae{³Ă4ẪtÛ{Ø›3#́Íj¤ö¦û½‡½é́a¼=́=ă0)¶‡½gfóö°7Ưï=́=ă0Ÿ·‡½é¶÷°÷ŒĂŒ̃ö¦gd{³“`„½Ù}i„½Ùù>Ẫ3Óz{Ø›¦öæq {Óx{óxÙE%̣oy¼DíaoïaoO&Q{Ø›Ç5́Mă=́Íăö¦ñöæñlu„½y¼XDao¯QGØ›Æ{Ø›Ç5́Mă=́Íăö¦ñöæñ¼‹z²KoU³rYx˜'4­æ Kwó„¦3Ïó„Æ»yÂâĂ<¡ñn°ø0Oh¼Dæ ‹ó„Æ“IÔaĐømu˜'4̃Íæ wó„Äó„Æ»yÂâĂ<¡ñjơ1OX|˜'4L¢ó„Æ»yÂâĂ<¡ñn°ø0Oh¼›',>̀¯&Q‡yBăƠ$ê0OHü1Oh¼›',>̀ïæ ‹ó„Æ»yÂâĂ<¡ñbu˜'4^M¢ó„ŇyBăÉ$ê0Oh¼›',>̀ïæ ‹󄯳EÔÇ<¡ñbơ1Oh¼ZD}̀æ wó„ŇyBăƯ²ry\³ri¼gå̉–ä¥ÅK›¤Đx‰zWôœ•;ÁDVîDY¹,ƯíS²ÙjŸ²́°Oi¼Û§,>́Sïöé„ÿ7Â>đcuÛºàÇ̣¶•ÿzÓ+n£S‰Û´â83׿êŸÜ†uœªÜæ?”¹}mŚXé6­ü©̉m]ñS¥ÛMÖS¥ÛMÖS¥[Ç·]n®o¢Â4ZàµTº5 «Tº5 «Tºu…V)u› /k ¯Zèơ䛿ɀ߯èÅ-Ơn]à¿̃ÖSr|O¶ơTW—Ç­RÙuÆa$²jÍà‡9¬m_r+hiGÄ®]T\"÷̃Eʽn—ÈÍ/çéQ“3VºßÅä,+³YÛîä+/j39o~» éÖĂÚ„Ă8çíô°6á°¬n[MeÅa¶l[M~ÅQe6)~{g¾gÚjº ?ßKyE^ÏH¦oYơ ̀%m«É¯8¬$ÛVÓÍk±èºÜú¢ë^ó ‡¥{GÀ|Âa5̃0gû}̀'WŸ€9Ûö0Ÿp˜ß8æäŒ|æä$xæ‹–€99ߟ€ù„£üÆ'`NÓ0gñ0§ñ0gñ0§ñ~Ôfñ0§ñ~Ôfñ0§ñjùi¼Dsæ4̃æ,>æ4̃æ,>æ4̃æ,>æ4^L¢€9W“¨#`Îâ#`NăÉ$ê˜Óx˜³ø˜Óx˜³ø˜Óx¶ˆú̀i¼XD}æ4^-¢>ssïssïssÏ»¨'»T7 ôœÏKÂĂ<¡i5OXº›'4yv˜'4̃Íæ wó„ŇyBăƠ$ê0OX|˜'4L¢ó„Æo“¨Ă<¡ñn°ø0Oh¼›'$₫˜'4̃Íæ W‹¨yÂâĂ<¡ñdu˜'4̃Íæ wó„ŇyBăỪO&Q‡yBăƯæ ‹¨yBăƠ"êc°ø0Oh¼›',>̀ïæ ‹󄯳IÔ‘ÏKăÅ$êÈç¥ñbuäó̉xÏçeñ‘ÏKâO>/÷|^ù¼¬%y}ªUz![·øA³y'öœ̀;Áç\^VÓ”l²X¦,Ú SV»”¥»YJÓj•Nô?¥MäđN4‘ÂË~»gđN4‘À;ÑD₫îDé»MdïN4‘¼;ÑLîî„©»MdîN4‘¸;ÑD̃.Ùî‘¶;ÑDÖîDẶ±=iw¢a9س;Ñß„F¥¦́²ƒ92v'–í »äêùº4­éºä:Ùº́·{².¹F®îD©ºMdêN4‘¨;ÑD.Ù;ÑD–.ûí¤;ÑD.9–#E—́ï‘¡;ÑD‚.+eÏÏeÛƯÓs'ÈÎh"9w¢‰Ü܉&RsÙ>陹́ülö¼Ü‰&̉r'ÈÊ%[2’rÉ₫9¹ä*)¹ “1{F.Kuñ- Uœä*–¼1²ó$¶µSúT‚×Gzäå†`dW”à]¥D™R‚7̣}’ïMJ8«ÚÚñƠl±­Û³-¶µs-4L K‹nIº¢xÉI){.ûiMĂ%eLnD3ÅŜ׃ÆW¿ÿÿà?ưîC¸ß~Üm_z₫”¿ÿíÇ?üâă_ùƠÇÿœ Î endstream endobj 20 0 obj <>>>/MediaBox[0 0 612 792]>> endobj 21 0 obj <>stream xœ•Me;v–‘˜U”A$`PAB§÷öǶ2H€† Ém1‰:QÔ©™đ¿àâµ–O—íBë<Ö•nI¥§¼üưn¯ß|ùă¾ÄçkÉÏ×₫êËŸüđåg_~óåúZZøúë₫çơơW_;üöOùû_}ù›/?ỵ̈w_î¯ÿûKøúgúÛ/÷ơơ?|ùË_\_ÿªÿóû«ü÷¿₫ú·¿óí_¾ëÿÿ̃/ơo.ùG½A?ưÓøµ}ưá—ư—ä/ﯭ“wxåûë¿₫̣‡ïïÿó₫VÚúÿEó«FLß×ưz¼¼Răø_1àíug‡ôj•ăñzƠû^Oâx ¯Tđú:¥œ^÷ÿOo¯|đëO~Åuü/׫L™̣¼̣:e₫‡‡×ûƠVüß»xŸ 
ư_=º…×:a~î̉ơµÑÿîĐ᯵W₫•K¯6₫G½Ó‹OÄ®WXÿ?¹x~=‘·;̃¯kÅÿÔÅŸWZñçáé̃7/ođC*»©?óđöíÍóëë:è÷'¾êºü…‹·W(¼ßKz=+₫ß<¼^¯‹ïF¡æ_s¡Ư¯ƒ^oÏ‹·$^aߺ¼n‰ưÙ¡Ó³ïĐnŸ<ûí₫v ›•î`–"î”^ßÑS­¯’¸¡­_£o{ë×h¾*̣ơƯíÍ–|íÏù̃÷goâæÀU—hË~êF ,:úôNy˜ M¾¯.Íè§oú˜¾»Â=  gC—|÷^Db<ÆW8Á«pOéuq\¾¯ûæˆ ü7˜ ü7Ú“XïÿF{{É{àaKt࿱̃füÛÿ†»:̣=đßpocøíÀĂ_ü7ÜUïïÿ†ÿøK—o»­?ùđ¥b³Ơÿôpí¶z· ùR±ÙêΗrﶺ=YƯT·íỡMơŸ|©Ï‚Î=OÉÖÅjÀm—‰/¥pƠW]qïà”›©^¿‡.³ƒ¶‡¼›êJáxí¦ú_,̣nê§/›©î7QYqÁ=Í'_,®ˆç{¥µâîw%‘Z‘·]´ÖAÛElE>ª]mm¦ºs¦«­ÍTwFÖ¶›ê6¦b›©îw‚~=|˜b?Æ®„_́çXZñàâÏ«̣8ö³,œàE¾¡a¼Ÿe× ^wSưoñÈÔ~”˜Ú²SûQvbj?ÊNL-ư¢”đ~S:ÀûQöœàåuàK'x915ơ£́ÀÔÔ²SS?ÊLMư(;05…tdj?Ê¿̉¦~”=x?Ê®¼ß–đÜoK'xÙMư$Låáh ¿ûQŸ@xèL«>¡´éLÎ}‚qÓ'úă¦O(>ô ÆÛ‘©CP|ÈŒç#S‡<Áøsdê'7yBñ!O0ṇâoy‚q“'̣ăíÄÔ·<¡ø'ÏG¦y‚q“'̣ă&O(>ä ÆMP|ÈŒ·#S‡<Áx;2uȈ¿å ÆMP|ÈŒ›<¡ø'7yBñ!O0^ḶăíÈÔ!O(>ä Æó‘©C`Üä Ň<Á¸Éy‚ñrbê[`¼˜ú–'o'¦¾å Ň<Á¸Éy‚q“'̣ăåÈÔ~”˜Ú²SûQvbj©G¦ö£́ÄÔ~”=\ª¤~”]Ïư(K'xŸ,Œ÷£,p\Dg¿Â—ÈĐçÏ&Eï£>à¿ü¬O'ú÷?ëSJ›>…ÍV}JÙ¡O1nú”âCŸbÜôé„ÿC O'Ü÷ܽ^±-¸ï¹›_¥ñ_ïûʽQsñ"j3~¹‚3ˆƒÚŒÿK¯âÿ0ăÿÄœQüf<»x{mŚÇ?tgzå•ÿco¯¶âà*μÛ] yí¶Vv[½¶‡p了¿ƯÔÇUœa7ƠíøËnë?s%g|ƯÏøĐ/É9s\ܦ"^~Aü¦"ÿơ¾Ê§°HÎ{ÅƯ÷¶¾̣û/bQ\§fÜ{³ÉWÜ÷ùM»©®§e?—6S=§ÿØ—S>øơ¾œ6S]gƠ¾œ"ß²Er> ÷{ ư²¶ẫƒ±HÎÔ¸©ưtªü@ˆ)ë fÂƯ§Å~:=+î>\æG/kî…ÄÎ¥w]\ŸG/k´gJØMußhKÙMơŸ¸LXM¸÷¾« « ÿ·®†4a#̃¼!â~õ'â¥-̃¼×»®Å·}Å£x°¯x´1¡î¦ºn±1~αéG ÷|AR¿ë¥„'AêgSMxñ‰KoXqÏŸ"=—~ï¡=ó<»©îë«)̃˜¾jÀGêgSx­[ïøœé«é ¼ßûUo3ơÏ\ ™vS}?Ư{ÓXS˜yở³#̃Qœ¹ßf̉ç[ éÏèÏ¿jz̉\ÚƠ1Ç‹t¥ï¨§ Æ›tÆC–a¤´DÅ´%öøA[RPư‹ñª'Ås/÷ÿ[z̉ö'ïmwĂ9Ë­Ú}½ ZDŸ÷ÖÏ]ƒjwÚöÚ7¹ºà~¸hßä*oLëĂ´â®èÛĐUqÛĂƠ‡)/¸ww׳“Ę–wµFWœwÆ)1¦yÅ]¯GYMë¾áFRöƠWÜu’LEBq¿ç¨Gׄ»Ñ·ÙiÛ{V„ËCbLkÄ32{V¤“ Ú³â„»>ÄơÙMuç{»wS}Ị́â–J”iÁ½¯úZw _mÆWÆí–ÓXđ\—ÓRx[⥒`ŸKˆi^q7"5]ú!‰́åî1ÛË/̉l/¿ÔÔÇ^~iÛŸª_&Üû<K̉/î}”HÓ²ẫ¼X³~\™p×!¸]úq…ÚW\̣A›~ê¥~Ç»+̃cD›æç»hÓÆI‰4­üL¡éS.Ü6$ÔôYqW(÷ƠtñSO‚MSÆó]¢M+?ÆR~^%đazÔM“Ĩ!VÜ•ạ́yGÊ;Dä)ïoẓ¾yócL´i|Ë3?ơô}ï¨C•:¤)´S¥)‘]oöƒ©AijwM$M#¦G̃#k̃#Œ[̃#k̃#Œ[̃#Œ[̃#k̃#Œ[̃#K̃#L[̃£yđ¯[̃#Œwmz¯sàS4a^q?‘‘æ=qO8X̃£™vƒ>5ï6T”éÚ~⣤û wăñ.M«A›.κۺv£Î,û5U”éµNï‰B”iZq7‹HÓÊg˜HӇϰ́åiÂ]y—ă¾‡}JTù–'̉´¬[÷U5”k_Ùîë̉4L.MÛÁ”́̉4F̃ïÍ>k̉iæu1áîS̉e^₫_>ˆÓmóŸN5̉Œƒ H3î§4J»©₫ÓiÛMơ¶È‘iÆƯÇMK€Dg¤ÈÓ›d’‰Î/Ë€„w K„\S á.×Hx¶~"É‹)¿6ŒôGxbYú#:ô#ư=4Fú#Úö‘₫ˆvùHD·é‘₫ˆ‘<˜Ü¿,ÿnºæ?¢ëỌ́Ñùbù°™ÿ[ù|·C»°ëÑ›¢G#y+m“±ëÑx0‡_¼Á¤áG÷¢4üâ`k† 
hïƒÊ¤pz A O.¤T:v•y_̣æ€éuɉ‹iĂô=¾£aÜ6ËGé|€7Ư)²̃Ÿ(-( ăE?wR<´$Ơ£1êrt#W>—~뤿₫˜™p__wƠ„û¹p̀»jÂƯô3Ơθ ÷¾¹‰Ư¦£+v›q₫‹‚4ñ¶Kôèfª{¿Únªû́u§ƯT7óH¸vSƯW²¾”_×"Hă:g>¥ă-o{×£÷ÚW{§¢gË„û3èÑBMͶ2á‹èÑp`êÓô{à„»/Ú]nkƠm»èÑ̀g¤èÑ̀ç»èÑwƠn×£÷ÿù=ù¯kF^~̉ˆ­ ªèÑđ¨}Uy.½ø±'z´E<ª1Ưû)é>hv=Z"̃–ä¹ô|˜º$Íü ”ç̉´Ù®?÷cù´í%©¤ĂTAÚï5«„¥(Óí ñI™Öÿ”˜7đ~×ç̉µ#]M%Ï¥·]ŸKWÜ}‡ öª'°(ÓÂ'(ÓÍTWâwiº™ê°”vS½7s§7Zy. ëơ>ÈÓçÆ»˜ÈÓ+à üvå¥m—ç̉›Oy.½ùyzóQyÊïú›>Ä)́ơ!NáÊât¢=äOWœ¹oI7‚ƠëDœ¶Ó"Nû~Îñ"i™1̃ïO>À›|dƸƯ¨1.7êz€‹ß¦“†¦r¼É–ˆñ¬ƯŒ’§é í]ÆuP?ỵnsÀMÖ[ôR:ăîƒfƠK)n{­̣n7ăŸª¹ÉÛ-báẸ́vK§˜<—¦ÆMµTa¸1–* 7FR…Ư|˜$UßôB«r±£¿¯(;:#ăƠäKç{¼ơU{ÆƯÇØpí¶+ÂúÑtbjÔWí÷ Æè«6í÷˜ÔߖΙ˜Ôß–.˜ƠßvÆƯÇĬ™–pÏ<ú›ú4ù`2ăne—’÷³É˲ëµßO¼, "OÛ{YbÓ/~3îeYõ²â^ˆzº4¡ÄŒ»Á₫¢O7[½L¢OÛÉÏßu7ö§já}#₫¼›±—¶ÔÍV/o‚Ô#[S̃Ob/o‚(Ô²ân‘¨7¿‹GïUñn 5­¸_&HdnL—¨¥áƯ@2Ư ïÂ"Q3?¹SËx×3}J7=Ó§ôh2}J[mú”ö èS¬$»è”ư(AÅE³‚_í³Ăwߵ­<0.Ơđ Ñ×FŒ÷ u905êwXŒÛG2Œç[¾Kq\½,fÜ—§êeưQ§UŒ—$_×gÜU„ơÚÛîæ½í¾{®]́hÛ[Ñ»üơĐWGZq÷á¬éơ¿£^icú‰₫¬¸û`̉nª+dBÛMu5[?Ñ7S]­œ®#Sû¾™ê¿oê+nL¶&ÔÔÇ>˜L¸û¾ùè+î÷åÁrÆ}õº›ê~tó|ÅƯµí¦úz3ë×Ç w•‰¸̣¶K2]ụ̀ç»$Ó­̃gâ­i(é$Đ̀F+î:ĐMCIç»f6zp¿ÇhŸq'Ü ©”³iÅƯʬ_×gÜ ©”¼{…·ư¹ÅĂdÆ}Ư"&3î~Eè·ă§đ~ï«é:À«æÍ˜q÷+B­»©~°iÚMơß7/ưÚ>á~°iÖÏpy¤¾î—Gêwä\ñ¨¦pëçGjª¬¦÷wÉl”*̃g’Üô*̃#å94T¼aKf£§â)&κ̣q“ö̀£Ux˜ ¬ÂĂT4° 7¦Ô½í₫û¦Ơà_Z“¶½ñCuh͉v_ MkÂv ­ ‡hMxwP­ »/‡z Enö‰ØPô‘9₫ùW%?/…-?/§%?/¦5?/§ g-?/Ç5?/Æ-?/Ç5?/Æ-?/Í>đư"̉·ŸÏsD~‚ÁÀO´÷=l üD{’k ^1®…ƒ0n…ƒ8®…ƒ(> qܾQÜ q¼˜: aÜ q<™j…ƒ8®)G1n…ƒ8®)G1n…ƒ8®)G1n…ƒ8̃LµÂAoG¦Zá ÂA×”£·ÂA×ôQ·ÂA·çl[á ×#S­pÇÛ‘©V8ăV8ˆăùÈT+Äq-„q+Äq-„q+Äñrbê(Äñzbê(Äñvbê(„q+Äq-„q+ÄqûbKq+Äñ²›úI—vÍÛú±®)…‡<Á´ÊJ›<Átá́'7yBñ!O0ṇ„âC`¼™:ä Ň<Áx>2uÈŒ?G¦y‚q“'̣ă&O ₫–'7yBñ!O0̃NL}Ëy‚ñ|dê'7yBñ!O0ṇ„âC`Üä Ň<Áx;2uÈŒ·#S‡<ø[`Üä Ň<Á¸Éy‚q“'̣ăơÈÔ!O0̃Ḷ„âC`<™:ä ÆMP|ÈŒ›<¡ø'/'¦¾å Æë‰©oy‚ñvbê[P|ÈŒ›<¡ø'7yBñ!O0^Lµº¦¯G¦Z]S×#S­®)ǵ®)Æ­®)ÅG]SkŒ[]S¬$»è¼{_2R¶‘«Ń碦ü¹¦)…M²&«2…覔6] é!K)mªôư€(ưFû¼—ú˜|£ưd¼º˜Àß~¢äh¿iU¿o´_Ç4ª3ă7Ú/cÚÔ—ñíW1MêÊøv‹˜J ÷B¨a73Ư2 ×µ™é×é¼ÍN/´&Ü÷f§×+á.›nuÔ63Ưz¤}ơ´‡›Ù—O\p·i_>å¡“V=|¦»ªW!¶|DTÖ†[²ï>tj̣Ư…öëfuÔüFû±¤×6<~̉g¿é½Yéǖ׺…û%Hƒ$¯‚¿­YúSI˜XèXJiH´¿%†ôYh¿úè­j Z5/*mw—…q9Lǘ¼ỤˆL´_y4I‘‰ö£G›Æà|£ưº£IsaÀ>ékgµ̉OU”7+ư÷'­v2Ñ~ÍQ-v2Ñ~ÉѨ¶¬%â—2íoqË}2]Åâ•{-´_nôQiép«²ƒ-éGÉj¥_k4lVú¥Fëf¥_i4nVú…FMé°•–² 
o´›~6k"<Ú'æÁ£³J\q#ƯÙ$onŒtgK}í”…öƯv«ÆsÁ–ôµ“#µr8â²f?\Öêá†Ë´á…Ëf«:áBó‚0º+Ä$Ù;ïÏ7Aû°„üªÊ_J2¤—2¤˜¶2¤×2¤×2¤˜¶2¤×2¤·2¤×2¤·2¤3Êâ_·2¤3îºZ̉w[­ 錻̃V†”ăEÖÏŒ¬ơ²â®ck_O±Q†tÆư\GM¢XfÜÍ~+Å^*oŒå÷q?:TËθ›æÆỂ~eHqÛe5­ưîÇ~7L~6"-C:ă® ³2¤¸#­ 錻áV†”®¦Q†tÆưpO}Äữç:¨n}Ö®8o¹!i7ÖS‹̉ÙhEHéø[Rº́â]7+]̉‹÷÷(AJ7»Q‚Ûi%H±¡é!GHB² ï˜\vS}OI}¤Å¦>úHKרz©ü€%Hé¢%HéÂ%Hñ$°́”ÔTÉ©ùÅd” ¥%Hqc¬)íÍ©[ñF­9uoLÔL8tkNƯw¥¢„d7̃ïV‚›*ï+î×pÑ‚t'Đœº¸äÔ}đZéZ~½ÆƯT¯R•p¿iŒ¤3î~\°¤´í"`#?LẨùk–‘ùà+b³¨ï̉¥}¶Ô€u)¤ßºăªK)=t)ÆM—RÜt)¥‡.ŸéR]qÓ¥ºt‰.¥¿>té„»ÚnèRúë}±6áŸtiª¼1]—Ö÷ösIªVܪV‹7æiª¿h¿Kbøï‘Ŕe»xO7¢LóÍM•Äđ«©¾}¤¸ÑŒR¦%à)6j‘̉)6j‘̉a’R/鯫I”i½ñZ•R/ª(ÓçÆ›*S> T™̣K¤„p®kÏ aº̃ļĐ( à¼ñBUYʧ—¨R̃åV†7ÅÊâ²2¤3î–µ2¤t„†.hïsÿĐ¥đ¾9t)́Å¡KáªÈÊGÄf¿± 5 Z¢‚è̉vcz”!å¸Ư0)neH9®ùë1neH1neH9®g”¶2¤×2¤·2¤3Êâ_·2¤ï›PXGÉ}úÊôYq_jF•ÉîÊd+C:ẵ7ÊθïøÛvS}Õ¬Oî>ôƯ—¾̣L¸+¬ )5u”!qWeX̉wú¬ éŒûBVËâL~Æ]—̉ñgÂ}O^-CJ§Ø(Cq+CceHqG–¼›êễzo›˜;!«^tè|lzÑ¡ó«éEg¢]1uéEvˆ̉ƠJWyơC‰Ÿ"IŸƠLW^HJ¯H̉´â®Ö±”^t/–̉ ·]Rz>F’̉+àu!¥)đyÂnª_¦î¦ú¯Ÿq7ơ“+oxeÄôËÁ„ûq¤—~9 iiƽïp"I·;˜û’uiF$:L"IÓrå­ q7ƠmL¨»©®;lL»©~T­>Mµâ£ØT+>MµÚ£¸íV{”̃ÙFíQz†Ú£¸ß». üª!ºôYq÷3F×¥›©î{|×¥›©n4t¿nç¯/cU˜Â{̀¦°åC˜Â³=¾Ư‰Ú´Ü{H˜^¬ºÏ¿úèÇZHk̉r€Wù^@éÛ°;áû!£ê¤2ă®₫’`·uCuÅ»E<%ém|Tûj x¿!?‘O‚¾6S}­™wS]U%ÉñøÉ$Io[Á“ ör,xT¥₫hYqïvƒ9ĂQ•ú£¹à)Io?l$ttƯóüS{¸ƒÛŒ”m7oz¿ G~HùÑÂO)?zß|†ơ ræ§”}ÖSØ÷ÍƠÎôœŒU3»Î¸ÿª™]icF#jªp±̣_¿ÍªÈÍ{Å]Ô`¿p1Éuº­‡‡û&ióøY£‘£ëiđG®ÜÔBót›¹^|R~´̃öGSä̉£IäfxđŒ¹ùqû›j1¦3bLgÜMïb1¦3î¾­YŒéŒûÊTcLgÜ•<’%´á1¦3îúĂZŒ)ƠcJ§˜Ä˜ÆO±4|›j1¦¸1cc1¦x˜,Æ·ƯbLgü\e*1¦3íæYÖÓ™®®.•ÔG¸áí»1rU€…˜̉ù5BLgÜơá´S:F#Ä”*¯¦!áu-¯¦OÂ;̃1Åm·Ó÷e¬~)Ǧæ BsÆƯLº¹Đœñt…©Æ˜âÖ Ä—wÅ?e@ÚLu¿XŒ)î÷¢¡À3₫;ŸúŒ»NΦcgÜơrnë„ñÊ?˜i¯Đ騙ọ̈î›]¶G¯t Ù÷Ê?ˆẠ̊̉RU‚>*9‰2m7¦‡3/ÇƠ™ăæ̀ËquæÅ¸9óbÜœy9.μ˜6g^«3/ÆÍ™wÆ3/₫usæqïº+̉t›̃}Wœyïux^‘¦yŽơ&̉´üºäß_q¯Ê‰HӲ⪹¤ƯV·ÎIצ›­?₫̣ƒ8m|9‰8­ërú”˜7¬¸û–Ø—ÓĂWßpçqW±uqÚVÜ}Lu_Û¾ÚŒz)pßû·é¥tÂ]ßâ.N7SƯưBƯÚ̃Åi<èÈ~¡.|×3^ÜơçiW´«?/m¸ùó̉v‹6­7ÿq)̉»â®»¥yôbÜ…‹)¾ûè́ÂăCÑ Ó…èÓ„a+BÊi©Bi-CÊéÂY+DÊq­Dq+EÊq­Eq+FGóR§¬TÁÑŸ`0đíV±Ÿh7¢Ó~¢]É(?±îCéø w=~ÇÀO¸[$f ü„»WÇÀO¸«uÇÀÓJm7Ơ}qÎi7ƠíÈçÚMưô¸¾™êW ºL-ÏnªÿZ~了³±½ŸO¸÷º#……ÀÓ4« 
‘á̉¬.3îê?Q]w¤|¿ØLơ£—Ûnª›¼WT×AcDu˜:Tׄû¯å÷nªû…ATWÄ£*^âWä¦fÍ.1ăn•#‰¹ˆx5‰—xXq×!Câ(#^â%¾™êîH»©nJñÚvSư÷ơ¼›ê”¸vS?y‰_üˆŒ·=:cÜ)4…Ç5…Æc<25Ö#SS<2µe'¦ö£́ÄÔ~”˜Ú²SË¥ăŒñªÀ·ÏoöÙăåÄÔtÅSÓUOLMw:15ƯíÄỔ‘©ư( ÏÅăư(»NđG̉«`<«ß7ÇËnê']*á]ưỰÂC`Zå ¥M`ºpvÈŒ›<¡ø'7yBñ!O0̃Ḷ„âC`<™:ä ÆŸ#S‡<Á¸Éy‚q“'ËŒ›<¡ø'o'¦¾å Ň<Áx>2uÈŒ›<¡ø'7yBñ!O0ṇ„âC`¼™:ä ÆÛ‘©C@ü-O0ṇ„âC`Üä Ň<Á¸Éy‚ñzdê'oG¦yBñ!O0Ḷă&O(>ä ÆMP|ÈŒ—Sß̣ăơÄÔ·<Áx;1ơ-O(>ä ÆMP|ÈŒ›<¡ø'/G¦>áÈÔ§™Ú²SK=2µe'¦ö£́áR%ơ£́â¸ú Ÿà¾Iqñ済Π*gư½út‚=Ï€¡O'ú÷?ëSJ›>…ÍV}JÙ¡O1nú”âCŸbÜôé„ÿc O'Ü÷è½Ä»x¢ưXÓü*ü·Ÿu0=₫[̉¶-´û¤(YÛÚsâ¿ÅÅc¡='~Éâ»N+ωÿÿ‹‰uƯrï–öyåú·¶Ï+¯S$…ï6¯|'áûÚ'–ûóưb|ñÆKßÍV¯×Åéw³Ơ uŒa·Ơ­©Ënê¿pefÜMơæ®8ư¦“Ïq·Ơ}́y´¨ÇŸ½5̃’å>jL_K7^¡¯¥¼Đ₫›Y}­«É÷ùMÛÎå&¨êkiƯ¹üÓ¼w¹ïÅ«%#gÜu½-4†₫zĐ’‘3J6l> 1FÜåª/ןvc(ÓwóÜ^mûöŸ¿̣6·\?Ơ¾ˆÖ¹å† K$ÊB»y¯$e¡]ÛRö¹å¦kªaŸ[¾·oÙç–ÿ–•¶£Ñ5´/¢ơhôü ̉•q;$9RÄĂ#> W’éæ¿Ê6­Üœ81lÓÊÏÚ[7+Ư€è”âÁm+å Ïɉ”đ:NÏ~‰÷‹›î—xßư6lƯí&÷íkgín×¥ºŸ>ü„Hưô)¼û¹ñG]¾Wå+mVz~,â¦q/]¸ÔK₫²HÈ̃đÜ9:*§7K¼›"…%ïnÍœn¢¢)ƯḮrcZ¶̃nÉlø`:& Ó¢t̉k8¦‹|¦tÖ//”~Ô|¢ưĐÑG>\Đß.Z]l¢}·Ë"qúíV•]d¡ƯH²‹,´ë*»ÈB»y|[Û¬ô](Óf¥ïyK@*l·hÁ²Đ~Ú\MùK["«¡̉± ’æ³̉± Qs†L´Ÿ˜HS†Đ>É1d¢]߯œ7+ư¼A×f¥_ÆåÙ¬t¯¢ ̃PD†eCq½&åcÊ'JƠ¬Ù´Ăík \<Á>¦L´ŸW7oV~̣€|m·¤É½m·<0&¼%K’ÜZfÚû¤ûQúŸº´~9§-Iúáœ̉Y¿›Ó̀y³̉•DϽYéWcyN¬,ñuÓU,¹qñ¦ûÊiØÄ¾p"î½-ºrèºVƯ€ÊKÓ"ĂF§[³"ĂÎÍÚ-7îƒëZ²LíV¥éGN^V¤¯«¤gœhW„ô#'.ôŸ»t“à[8î"ún¼ïˆè[­t‹˜>y³̣“è‹xEè+Ï“ª¹äè¬J¶»áåk‰ƒàÈX̃  ₫…«à$ḿ “{́||]”Ü·\˜å˜„YŸ0|É’¯tQU¾#*qù·¡=v)l/¢œ–)˜ÖQNÎÚ‹(ÇơEăö"Êq}Ÿ½ˆr¼™j»7]ç#SÍc—ăÏ‘©æ±ËqơØÅ¸ýr\=v)>p÷9rèÓ ÿ́±;ÓŸ=vño?ë`~öØéÏ»3ưÙcw¦?{́Îô'Ư™%»3éµ-<7[s¿_¯éṔá„â5́ïº×z²y™Ó.]ªæ´;Ó~¢^uÚq÷Í̀œvgÜͺjN»3î>²™Ó.n»9íθëAÜÓ}đë’ózÅƯ§GÉyÍÇtøíâ~•¹â®±῭SEe´]]wéI=\wgÜơ"6×Ưw½kÍuwÆưD½qŸa®¤ç£”®´›êûï^{·»¢w̃7Tד8Üû óŸsÅ…—’¹đ̉ùe.¼ø·Ơƒ÷cà;R’:ÛíÁu'–B½ǿW{w¾”ưjïz¾èRưîjï¿è~wµ÷ŸtÅ·[ưxiËÍ—îºæÇKÛ=v'Ú-°ø}A„eŸ¶₫‚FÍŒ₫ùW³8SZë”Ö¼èßQÊ]p¼é‡K‡,Æ£½§`Ü §ƒ–ôKñÉå´Ñ§Z0÷Á>ZhÆƯ ¶}ûIåô+΄»·} ó«ß‰ïµí̃>.µ`̣©ư o+öM(®ø§2¥eÅ]÷Û~o¦º.›áÚMơOyøl¼g¤̀fªç#&eJ7S]é̉oÅϺk¸Y`ûjº2²;§wS̉fûâJ;̣±/®ÔÔ®0ïµ1Ÿ*Álû©›’¶^R=Ï™ú¨gú„»>ÄíV×tjj³ç½ w…×eÏ{°íR f;:|iÏ{°ß¥LXqW0„´›êxí¦z‡$¥88%¥̀vJºR*=RQ·=[ü Ü#¥LxÇ'¨³#OUŸNx؈KqIxÓ“j0÷A¿W­à6ăn צÜfܯá*÷<~år0%à 
/å`î€ÇIÊÁä€{RÊÁ4~ÓJ¥7̃9´Ré{Yđ¥̀³ân‘d_IhÏ$ûJB{&_»©¾0}vSư€Ñ{7Ơ×¥ªîáÆ¡…JySJƠo*ôÇkÔo*đĐ–‡̀û ›é{ºöDœ6̃ĂénĂëî¿C£#葜}áƯĐƒ÷Öú󯓤ïá€ñöª•ă·ÔeÇ´uÇ›̃(-’Œâé %©ơy~ú\<ưÖ†ûâT‹@á_/úag¢½é}Wư°3Ñ̃&qW}°¥t¿íÊ–2á~é£[ ´Sœf·Éå ª~ÛÍ+îJ~Ûm+îÇœ6½1N¸`Ùo»eÅ}íxé5¶½ßvsâÙµ£Èä ÷ê_‡TT&O¸+Â¥F̉ƒç£hǸ⮠z¢†ßÀ)UDïÂ{¦kÇ̉x¿×[*ƒăQíÚ1¯¸«Ùûjj¿̃µcäm—B¢ơ€VØQ^¨qëB?¢4ëăô„û)‚n}œ†³%FKđ@#Ïü/Œ(ÏüÏô(ÏüüŒ‘—ÉTyÏHµ»wEf¹÷#É—Ô%Äm¯öx8áÿÇÅíñî½aºJ®¸—WËÓd»qÏÈÓd<øơ‘ècÂư¢ ßÀ®‚ ßÀn¹Ô¾6Sÿ¯‹—ƯT¿BFÔïp×çÉÈHy,+îF§>ßÀ®œ~¾;€}awí¦ºm/Ïnª[=ƠûäZ-Q§±đ®ißÀnän;:‡œhϧlèÀ‰ö|Ćœèñ§´ƠR]—YRæ̣’ÔÉ&J³î(J [±ªĂJ)ln»œ–·}L«Û.§ gÍm—ăú¶qsÛ帺íbÜÜv9̃Lµ°RŒ[X)Çó‘©VÊñçÈT +帆•bÜÂJ9®a¥a¥×°RŒ[X)ÇÛ‰©#¬ăVÊñ|dª…•r\¿ØaÜÂJ9®a¥·°RkX)Æ-¬”ăíÈT +åx;2ƠÂJ)>ÂJ9®a¥·°RkX)Æ-¬”ăúîŒq +åx=2ƠÂJ9̃Lµ°RŒ[X)Çó‘©VÊqứŒq +帆•bÜÂJ9^NLa¥¯'¦°R·SGX)Æ-¬”ăVq +帆•bÜÂJ9^Lµ°R×#S-¬”ăơÈT +帆•bÜÂJ)>ÂJ9®a¥·°R¬$/}„ª¡̣ßÈQ¥û9¨t‚?Ç”RØ´)k²JSˆeJi¦º”̉&K¿Ñ̃;ư[•~£?Å’†e?Å’> ÿöcƠ7úS4iXèOѤÏB&½–‘ÿMÚ&Ơ„›ư14,ø§p̉§ĐN‘h̉‹7E‚Iÿñ;ov~ %]Ítƒ7Ă³™ég‰a3Ómw,›•~áu˜h·ªLR¯¸‚„ë`¸_‚K¤ S₫̃ÿ¶d_|è’}q¡ư×(Ưí:WM|>Ñ~ —(ÏÜígn›•~́hÚ¬ô ¸\›•₫Ûl̃¬ôË·\›•₫ël‘÷pØß6Ú}́K§5le×±áv'ûTơv=—³}©úF{¾ú.{@Ûw*he±ÏTlÆê£lÀ}Rµ|ÁD»₫Ö}í„LWƒ‹> ưéAö¿-ï±)Ó9˜Äơ÷Z¶X×ÁR2,´Uúl›½Û’~i»oÚƒ&o:«ô!–ÿ¶¼Ẵ¸Oäö¦36Y­±ùẜ>y3ÓOû{Ë(lèf%Q¢y¡ư˜̉"{´%iï½#ZèâÑÑ…₫#ZhḰơvIx£˜¥TÛÇ‹‰’¢æ>ÿ¦ªIÆ5IaU“65Iá‚Ñ¡&)mj̉CMRÚÔ$¤‡„cxi i7Vî ă=Ñ̃ål øD»É—lÄ'Ú­×"C>±®ÛïsÚ1èîêü1ê´1cØ'Ü1ăNh¼mO¸·Ÿ¾ß¶'ÜơqoÛ´1ăm›6f¼mO¸[ßg¼mÓoÛ´íăm›Nơñ¶MgïxÛpWo·mØ‘ï·í w+ĂŒ·m¸<̃oÛp¾¿ß¶a¿¿ß¶'Ü §oÛ´íăm{ÂƯˆíñ¶ gäûmN‚÷Ûö„»esÆÛ6œïï·mÚïăm›¶}¼mÓ¶·mÚ˜ñ¶=áî—¥ñ¶=ánơœñ¶=á/Üûm¶ưư¶q{Û¦øxÛÆ¸½mS|¼mc¼™:̃¶1̃LoÛoÛÏG¦·mŒÛÛ6ÅÇÛ6Æím›âămăåÄÔ÷Û6Æë‰©ï·mŒ·SßoÛoÛ··m·mŒÛÛ6ÅÇÛ6ÆËnê'1jŸeứz á!O0­̣„̉&O0]8;ä ÆMP|ÈŒ›<¡ø'oG¦yBñ!O0ḶăÏ‘©C`Üä Ň<Á¸Éˆ¿å ÆMP|ÈŒ·Sß̣„âC`<™:ä ÆMP|ÈŒ›<¡ø'7yBñ!O0̃ḶăíÈÔ!O ₫–'7yBñ!O0ṇ„âC`Üä Ň<Áx=2uÈŒ·#S‡<¡ø'ÏG¦y‚q“'̣ă&O(>ä ÆË‰©oy‚ñzbê[`¼˜ú–'̣ă&O(>ä ÆMP|ÈŒ—#S‡ë-Æë‘©ĂơăơÈÔáz‹qók£øp½…øÛơăæÙFñázK•dÿ_¨¯]oï* Îứ{;ÁÀùv¢÷-¥MŸÂf«>¥́Ч7}Jñ¡O1nút½/øo}:áŸ+º̀ôç.ø·Ÿu0?Wt™éO>¸ë´ú\Ñe¦?Wt™éO]f–Tt™yPÑ…vʨè‚[3*ºàŸ·.´ñ£¢ ₫u«èBiTt™ñèâe7ƠÅ­¢ ïI+é2ón+éB—Q~QöƸ₫»å>jL_K7^â’›ÚwàÇ™vß|µœËLû¼mÛ¹üܹyïrßËV˹̀¸ï«å\đ¯[9—÷xµœ Đ.1q—«¾\ÚuâMßÍsß‹·íkÚ₫ÊÛÜrÓ1ơE´Î-× µ¯¡µ[ÜDRZÈe¦]‡b+ä2ănJ^+ä‚‡È 
¹àN”˜n¨Ä„,´ëăzeÜÔïrxæ<¯R¿ÍƯü·µ€ í+àB7!+à²+®‡«VpÁ]˜/1'i¸†4¦ƠE¶Ä<¤á&8¤éè˜ôDƒèMj¥yGÓ–˜s4íóhÏj¸FĂy2<£'ÚÍâcÑí>[_4Ëá ûdxEÓß6§è‰v¿˜O4Ëáé¶Y â5)m₫Đ˜Î'V74¦Ởæ iu…¦´yBcºX9ü 1]¬^Đ˜nVhJ› 4¦Ởæiơ¦´¹?cºlV¾ơÂϾüæK-|ươaơå¹Ăoÿ”¿ÿƠ—¿ụ̀ó/×ÉŸ}ù±ëAd endstream endobj 22 0 obj <>>>/MediaBox[0 0 612 792]>> endobj 23 0 obj <>stream xœ•K¯n¹qd¶3±‘Û(À$² _Öâeq1F&†%˲b«¥ÖŲ„L Å0$̣$¿,ÿ/,V}:$O£ÖC4ĐØứƠ,̃_Vùû?ÿú#^ŸJ¾>}ưßưúă«ßŸJ Ÿ~×~Ÿ~ûqá?å÷¿ưøÇŸüóÇùéÿ~„O?hÔ?}œÇ§ÿơñ÷¿>>ưCûóó“üó/ÿçßùü—ïoöï÷÷~Ósȵư÷ïÅOơÓ׿i_’_Ÿj#ÏđÊ积÷ñ§ÿê₫ë×ÿ$eưF4¿†ÏăxƯôơ œ>Ï×µAΆđJç^^÷ă+́à÷ëÚÀSz;xƯ25§-S¯cËÔ+o™ZÎ-S˵eê}n™z—WØè¼5¼®ü~G|¥ü~Ưø™vL gƯ15„¼ej<¶LyËÔtn™®×9ĂëØÁË+màWxƯ;øư x‰[¦–ºeê¶L½ë–©m Û05¶ElĂÔØV±ƒ/‘±-ci¿^÷̃–²°ƒ—×µ·¥lÇÔ¶”í˜â–©m)Û1µ-e;¦¶¥lÇÔ¶”í˜ZÚ>)oàm£´·¥́ÚÁËëØÀÛR–vđ²cjjKÙ†©©-e¦¦¶”m˜ÚR¶aj iËÔ¶”¾£Mm)»6đ¶”;xÛ-mà¹í–vđ²ej[ÊvLmKÙ©m)Û1µÜ[¦¶¥lÇÔ¶”]\ª¤¶”Ïm)K;xÛ-màm) Ñy„WdèƠ¦™&n;úoŸôéÿư£>éÿø¨O1Ươ)-¶èS̀ª>åx×§W}Êñ®OGü/ŸơéˆÿkWp6n¢ÿ·K—Wæß¹9ÑƠW›{ø•KçÅJ¯mRÛgE\ƒ]Âa+»‚›è¿zp¼$mă*ÇS|];x]Úǵ³†ŒÇqÊucHm¯et¿xÿÎm4¼Ăæ6Î ¼M÷Oœ¹uđ¸·MMÚh Öi3ÇEͶ®XÁEĐr¾J_m“æ‰éó8ÅÂñÖaÇÏ(Çpo[”‹ăm÷.­Cñxö¶Çx‘‘Lé¤jăUºÆs·Éˆ»ú÷:Äm‚¿~µÍơܨ̃"!₫Ö¥x3ùYÊëœûÀ¯~å ÛđÊ}æ¾åÔnĽÙü¬QNíFÜÛMµÊ©ÅÑ䀒Öd89 ñï¹x^Mơ¶đ!«©^Í„p­¦zÛ­ĐFÓ5>o} m43î-£!…¡̉½í_º̉6¾ÊÍ+²ísΛ×̀Ơ6Q3îí]ÂƠvQ|Z ¥m*/̀}.³˜[–¶×)¼^j?”£c£öC9h¦èæ4Ü•”v¬KíT NaÑN•`G;UpOE;U‚ă4jèˆÿíƒ||¥‰©ÍÖs‡qÅbn“uäeÏåU#đâ_‘÷̃xƯÂáNP̉jª§Fc©«©® o¢»̣58ÖîùÅ]¬vÏ/­™tœë>̀•6ʯWƯ(KR¿ Æï¾­§¸Î…ñ³u«́W^ËîMp"LĂŒ».è¢₫3Z˜¦Ke>pW%ߥχîÍ¢KĂÍ Ót©ˆêwU@Ó¥yÆ]p´fʸ̃E—ƌ˂îHia.½®÷ƒÄ ¸«J.MW¤Œ&>´E—Êa=-LMÇŒ»‹¢Ó´ñù&LïÊm-GwLĐ^Đ„éb«·‰0åƒO„i§÷'e/<øD¾xˆ6½æ•É‹IˆGyñỵ̈ MÓŒÿñƒ6½ù2)Út̃½€‡×F̣ưº×‹Ï"LçîâJđ¦Kă‰' Ñ¥…¯1ßƯÇ<஌mº4ó…Ftiå}]ti¹ñ@÷ÑO§¨©wî§SôëM—V¾ê‰.7nUÑ¥‹©¾ h›¼“¯’©ị́Œg ¦)ăéT„éÍ—=¦%à>&Âô ¸‰0Í;x¿®áø*)´l|ư «©®“º ÓÅTWÇ6aZ#¯÷&LăŒ»:¶é̉ •̣Ö¥pô¥]˜éRøeÓ¥°ÊM—ÂQmº”êø†JԦĭ!hjÓ“—]´éÉË.ÚtÆ¿~̉¦ơ.ÚtÆ]U"Û<̃L]̣fŸiæG:Ëjª¯5c?8đï>HÓ0ă¾ ©ŸœÓ̉¤ơ GS—¦x÷Ó•)¯ơÜă[h¹›.ÍsQÜpè¦Kkåư«ÄW¬¼5]Z*¯Å;ơ0Z86̃º.iÙÎz Ùt)¬Ó¥ưëg] KmºÖ`̃>¢5ÛzW2“¥ç-ÇRD–& [ +¦{ +¥5…Ó…³–ÂqMa¥¸¥°b\SX)n)¬´5¥áÑöÏ}D₫3L₫3í¯wæ]'¦5ügÚơaö†‡¥~7ügÜ›D₫ĐđŸqWä¾₫3îÇ¿[₫3ï:ß-ÿw# 
ơr­÷<:v¹®I½\‹ăy5ơé́âØÁ¯ƠTß§~n™ª—ka\/×â¸n´!n—kq\7Ú×˵8^wLµËµ0®—kq̃ưKaÓ'˜îú„̉ªO0]8kúăªO(núă*O(ṇăuËT“'7y‚ñ¼eªÉŒ_[¦<Á¸Ê›<Á¸Êˆ¿å ÆUPÜä Æë©oyBq“'Ï[¦<Á¸Ê›<Á¸Ê›<Á¸Ê›<ÁxƯ2Ơä Æë–©&O ₫–'WyBq“'WyBq“'WyBq“'¿·L5y‚ñºeªÉ›<Áx̃2Ơä ÆUPÜä ÆUPÜä ÆË©oy‚ñ{ÇÔ·<ÁxƯ1ơ-O(ṇă*O(ṇă*O(ṇăeËT½û—ă÷–©z÷/Çï-Sơî_÷»1®wÿRÜî₫å¸fmR\ï₫ÅJ̣ø”ë!‰m;ˆZ}¼ûw„Ÿï₫éç»1Ươ)-¶èS̀ª>åx×§W}Êñ®OG¸ÏFÜåí±^#îç˜öX/üu‰—{‹{ư¯ÄK͸p’ Ɉ»W7îqÿà,’÷/>VSnΫ­₫EÀçj«đµÚúppu₫,²Nă·ñtg\•!ܲ%á…ư"´‘¿\¼®¶>ÜœVcư›ƒëj¬{qΫ±₫MÀÇjkqñkµơ;® —Ó)[/i?˜W.i?–W.i÷Ö~y0.·^<ân«^Œ¿̃V¦:ăn¬mî9¬¸=¯$ñª¸̉¯_8ân†léù…ØÔûØr\ö¦ó<¹ûY¦Aấp¨·$Œ¸p”\ úơtTÉ¥ eư¸˜êFZu5Ơưzèw+ĐzO±'¾¸ŸdÚßqa$ËdÆƯDMÉ2¹xÍH–ÉÅ[U²L.<p?˜÷[ÏFÜæÍrë₫ºÜ$>·‘̀[z€Ê€ûÁ¼¡¨ ¸̀{¿âÜ ü`̃(!#îó¶yeébÁ¼Ĩ&y/V.̣­WMhKÜ(Üä[6>/7ụ̀Â÷›|gܶ•›|güÛ®ä ;íÚọåíÚọ½7jR‚¬f₫O\Íå=: $–WÂÏ0~Émøtt‡6 ÎÈ¿̃T>yM¶UgÜơ‹¶gÜơ-Ö~ꈻA®G¿ uÄ}—X^Mơ}\ÇjªæÚ†SÜøzN‹©₫Í¿ư&TÚª¢9SÅơ.óq×Ù–§P¹©m«sñơ&æ«ïÖÜ7ǽ»µw£bÛhºgÜ}{¥¦0ăîu¸¥¿K0â®ç²¦ăæÍ$Ïróy&VUVĐÔtô‹àé%µÑtñYLîæ=æié'.®Çx°G¦ Çxăï^¶ô¾ơN4I2Mæưµÿ₫iêÇ °Ă'I4™q÷1I4I¼ÚpZvªnX¬„ö<È»1)đ¯·át̃ƒÛp 3î̃₫Û†Ó6:MOŸƒ- ¶ªEĂiQÀ˜¾‘åFn£Hγ­ÏÛ!åØ |óêûŸ‡ÎÏ”¾¥I l9GŸù)}IkC8j"¥k_>!r×±Ö…0Ư;hÿ¹˜₫Ö1ưö¥ÑçŸio†•ŒÑ4Ñnhl©ư(à3í¿ư²Ö ÿôËѧ×Ï´ưÙÆ‚̀8Ÿi÷!Œắ“+«Áp¹Âl éÅ ÏíÀ ̣y¦uä1̣‰ö#3û+­ÔʘûV–$}§ K’.yY”ÖI;i¢ưWK¯¾#‡V^±o®>Ón8éu÷½,w‰}kơ™vƒIÛØ ×÷ú¾ê3톒¶±3[é¿ó’+½]’Hɯ:⽬ÓëJSÏ6X¹ă©Glf‹ḿœ…öØØ–ơ\hË‹ˆœ­ô£0ëb¥¯ Ób¥/ U³^%2¸N.â°u®₫-IÛîf¼ºÆrwg+l6vDç“x«S–¤ö÷LaáX'Ú—™ư5Óöu ¦ ~¦ư-ú©)ƒ̀LyÑ%âS4cátLưÀVJ<9Đ¡&‚±̣’dơiRZ]¾Ô£É6"ĂDû³¨?“ q8ÎVú/¹ÄÅJWµ̃÷b¥['ó>é߯Øoô!« ‘‘\Ëå6d¸–ËéàZ.Ëœ–ÇÄpA²Mơ}AD°ø¢ib+dƠ?‹aqÏR¸{g1\0ª¾YLw×,¥Ơ3‹é´úei}æ¤côæ`Đ̃íf?jƒôs:ëH?¥³¬›'im>à®ôµFp7ñÑZV¡5û€»OöX»¸ûfÆ‹ăÂh¼øˆ»G/[TăÅq½k¼8.»Æ‹ăÂh¼8®wÇ]]ăÅqEj¼85ƠâÅG܆·Ç‹Ó>cñâ¸0/>⮯_ăÅGÜKb³xñ÷’Ø,^—]ăÅ9̃ăÅ1®ñâï&Œk¼8Ç»Ÿ ă/ÎñºeªÆ‹s¼n™ªñâ·xq÷xqŒk¼8Ç{¼8Æ5^œă=^ă/Îñ{ËTçxƯ2UăÅ1®ñâÏ[¦j¼8Ç{¼8Æ5^œă=^ă/Îñ²cªÅ‹sü̃1ƠâÅ9^wLµxqŒk¼8Ç{¼8Æ5^œăưpă/Îñ²ú$FKßuô1•Â&O0Ưå ¥U`ºpÖä ÆUPÜä ÆUPÜä Æë–©&O(ṇăyËT“'¿¶L5y‚q•'7y‚q•'ËŒ«<¡¸ÉŒ×Sß̣„â&O0·L5y‚q•'7y‚q•'7y‚q•'7y‚ñºeªÉŒ×-SM@ü-O0®̣„â&O0®̣„â&O0®̣„â&O0~o™j̣ăuËT“'7y‚ñ¼eªÉŒ«<¡¸ÉŒ«<¡¸ÉŒ—Sß̣ă÷©oy‚ñºcê[PÜä ÆUPÜä ÆUPÜä ÆË–©ÎÊñ{ËTMgåø½eª¦³r¼§³b\ÓY)né¬ïé¬×tV¬$pC´u™ËĐÇtÖ~NgéçtVL«>…Åîú”²¦O1®ú”â¦Oüß}J¿nútÀA:눃tVüuMgqÎ:â 
uÄA:눃tÖé¬#N̉YG¤³̉ª±tV\KgÅŸ×tVZxKgqÎ:â •¶«¥³̉vµtV^“ÎmƠtV: ,•ă=•nKgÅ_×tV\“Î:â uÄA:눃tÖ鬴́–Î:â ]ÓYG¤³̉VµtVZï–Î:â ›ªé¬¸́Î:ânµ¦³¸›pªé¬#k:ëˆ×Å{:+®MgÅư]ÓYqEj:눃tÖ鬴0–ÎJëỬYiEZ:+₫ºf³âÑlVÚL–Í:­6n ¶¦³âl»½›ÏŸ¼́TƯÄ`Mgq7·VÓYqƠh:+nVMgŦj:+®H}Đ›Zç̃îeÇhœ2§ó"°´i/×@C•©y£úDn¦>J÷$M²‹y₫jW¦6eé®L)­ÊÓ…³¦L1®Ê”â¦L1®Ê”â¦Likưu¯@úHoø ?ĐÏÔŒ´Đ© ?Đn­4<-µ5<-¶5ü€»±ŸÖđ´0Öđ´0ÖđîÆ­ËœÖ¹¹̀iÙÍe>àî2æ2p?ù\Mơß³¹¶ZƠ\æîíWß.óẁÍe>à̃Êóv™Ăz»̀ÜÛf½]æ°́o—ù€»Ï˜ËöÈ·Ëv‚·Ë|ÀƯhds™Ă₫₫v™¸/l.sÚLæ2Ǹîµ)n.sŒë^›âæ2ÇxƯ2Ơ\æ¯[¦Ëâo—9ÆƠeNqs™c\]æ7—9ÆƠeNqs™cü̃2Ơ\æ¯[¦Ëœâæ2Çx̃2Ơ\æW—9ÅÍequ™SÜ\æ/;¦¾]æ¿wL}»̀1^wL}»̀)n.sŒ«Ëœâæ2Ǹº̀)n.sŒ—ƠÔ']Úf¦6₫:úÑ a“'˜ị̂„̉*O0]8ḳă*O(ṇă*O(ṇăuËT“'7y‚ñ¼eªÉŒ_[¦<Á¸Ê›<Á¸Êˆ¿å ÆUPÜä Æë©oyBq“'Ï[¦<Á¸Ê›<Á¸Ê›<Á¸Ê›<ÁxƯ2Ơä Æë–©&O ₫–'WyBq“'WyBq“'WyBq“'¿·L5y‚ñºeªÉ›<Áx̃2Ơä ÆUPÜä ÆUPÜä ÆË©oy‚ñ{ÇÔ·<ÁxƯ1ơ-O(ṇă*O(ṇă*O(ṇăeËT‹èÅø½eªEôbü̃2Ơ"z1®½·ˆ^ˆ¿#z1®½·ˆ^ª$›èÔƠ ¡mYm#»£Ï½ "zDôRZơ),v×§”5}qƠ§7}:à$¢—â¦Oü1¢w¢zù·¯¹1Ÿ‚yçnơË;w«§P̃¹$O‘¼s·zäØÇ(̃¥_=Eñ.ưê)wéWQ¼×ÆçÛTrđÂKïbëSïb«×¤Å»ØúÅ»˜]™WS}ü^Mơ+>ÇƠVÿ ›Ú52uá̉àS—s-Œô[Ö÷ưZñ•ñ` ÷ư‡“ëÚl£iºüˆßºL]~Àoî¿îü¶¡´Ô¹đ›{À¿m(ů·¡TfÜ ømCéä-Ú5扫½k̀÷~ëÚ½\SEcn”½ì¹ù _Kỵ̈ăwÏ¥ù×”µ̉ƯXâ;¬ư˼¦¬ưËm£×₫åû¿ê²@úÁ¾óøw#}Û0J¼ ²¹-¸$r—÷,Ùcg₫my‚WIBOE© Ùʇߺ3œå „{JºÖí¼á»nçƯ¨×²nçƯ’¾•hƯÈKƯ¡ÂKƯ!>צ6t2.·Eë´—`Ѻ°[´.¥7öMJÆzÉ1ÎăfBĐûơ₫üƠ¾bRú7&±p—usßà‘È“]"NÜ¢Dœ¸Ú£D̀¸«$ràÄ#;JäÀ‰ç(‘|5ˆ±©­ˆûoŒm¾¼́é”Ä:EÆT$Y™öߘ›à¸Ä\VSưL̉¸êªÜë^Mơc+»ăuÄƯL̉»;^GÜñîW\ÚÆêÅ;|Û:¾®¦£_A;M:ʺs_s9û•tZJ̣¶ÚŒ»èÈăje—×Ơ6j&¦ƠTWïÆºê¿sÓoŒ ­*™¤¥à‰F2IÏ»¤¤’f¾e–T̉tóf-ưùĂwŸ̉oĂÍ›UÜ 7ï4w•DnŒ‹ÇáÆó˜èÓeözÔ0ă?p5ç±n9\ùÛ%*₫xàÂCtg¹d—$j‰2¬‰DMVÇ(§å$ÓƯ1ÊéÂYuŒr¼;F1®QwÇ(ÆƠ1[óè1Fw}¤7üƒ†h7)O~ ŸJGÚ}“C~`ư Qmxj¤5ü€{[¿wĂS3­áé×­áÜQ¦®¦ºÊ^#¶GÜ]«,dפ†lăv̉ml«†lăÂhÈöˆ»Ë²=ân6¬†lăî«!Û#₫7î)FÙ¦¦ZÈ6-»…l¸ÿ¤L]Mu £!Û#îzê5d{Äư'ẹjªŸ"z®¦ºe×mŒkÈ6Ç{È6Æ5d›ă=dă²Íñºeª†ls¼n™ª!Û·m÷mŒkÈ6Ç{Æ5d›ă=dă²Íñ{ËT ÙæxƯ2UC¶1®!ÛÏ[¦jÈ6Ç{È6Æ5d›ă=dă²Íñ²cª…lsü̃1ƠB¶9^wLµmŒkÈ6Ç{È6Æ5d›ă=dă²Íñ²ú$Ls–Pæ>f”RØô ¦»>¡´êLÎ>Á¸ê›>Á¸ê›>ÁxƯ2Ơô ÅM`b\3J±’l¢3eñ(#´u÷`ècFé?g”ôsF)¦UŸÂbw}JYÓ§W}JqÓ§2JGüß}:à¥uÂÁ1øëúF̀ˆƒ7bFü)¯´Î8x#fÄÁ1#̃ˆq̣F̀ȃ7bhƠØ1¸4öF ₫¼¾C ooÄŒxqñkµƠO «­nÚ¢¾3âÿÍ•œư^“úF ¶U߈Á5™k2p7O÷J=ÊxÀ¿ăÈcµơºx^Mơ«æ>V[Ÿ"ăÅñ{Ü%œÉ$ĂôLøëöH 
ídöH̀ˆû)¦ư‘˜wĂ.ơ‘˜w£˜ơ‘˜wĂcÛxÊ—½§:ănzlOqăëm8•wÓc›è\L}̣‰e^íMsÎkŸ”\zØ;µ³Hv$.·>ƒkEˆÁ@Tfü×.^{%´4I‚ Ÿ̉©²jÀƯlÍSeƠ€»9¬Ae-Œ>C;Œ=C{Œ=ƒ+R߇Áe×÷apÙơ}˜÷|9×]ªï¬ïĂŒ¸.jïĂàÑ÷a°©ú> ®H}fÄƯlăqÁơ}úeÍ8¥#C3NiY8︴{qry£₫Dk†Ÿô¸Hw4Èçù«]–BØd)¦»,¥´ÊRLÎ,ŸÊR›,ŸÊR›,¥­yố¨›ô‘̃đ Ó½áÚ}cC~ Ư´NixÊZø'j ?àn>ª5<5Ó~ÀÁ1#¿œj₫̣wÓ†Í_NÛßüåîm&̃₫rZv󗸟”|®¦úÑ¿ê/§Íd₫̣wßñ19lƠ·¿|ÀƯŒQó—CSß₫̣wSFÍ_NËn₫rØLo9¬÷·¿|À½ ƒ·¿Ơ·¿Î3o9­w󗸛̣l₫rZvó—ĂÝí/‡øí/§ÀüåpyûËÜ¿+¯¦’p^Xö·¿–ưí/§…1ù€{'‡où€{Ñîoù€{Ñîo9-»ùË)n₫rŒ×ƠTÎKqó—c‡óBØä ¦»<¡´ÊLÎ<Á¸Ê›<Á¸Ê›<ÁxƯ2Ơä ÅM`±`Üq1vïïƠ`\8è-c Åh/æC¹îàmá™Íto5nÛ¶ÙJ÷öÛ ë[#qa§̉@\XÙ9à‹F,IÖÖÇ]Œ Wß<ơ곤ÏCÆ«œrc\ßmø„¦Ö üî Åc–÷M0ú“8—7ª0{VÆÛd"ËÈ€û¿ô—‡đ×e;—Ư¿2ơ–«₫ñ×ï₫ˆ{«ÚYû“f#î^m[s?1€_ÇÙçđw_‘'œfÜƯ…Ÿa5ƠÆĂYúk2„Ø·‘î†ă…*—¸×û¶55U°A7U€5dyvp¤ƯH¿|½xï=8Ïn8f[—w´­Ë|܉ ¬7oOYUñw›$\¦·Û@:gÜmc›oܹD.¦ºâà=Êù^C\†e#ưWV̉:¥úO†öüM¼®¦Ê§ Mà¤Mjªv/“°ïn„Ă® GØ:9Ü;"/·A´!̣DlÊÛxé̉¶Ä|µû9!lNLwO'¥ƠƠ‰éÂYsvb\½7w'ÆƠßIqsx̉Ö<º'“>̉~€AĂ´+µáúùµümiøơÓ=µáÜÍ#´†pÿ kù÷Rp̃-O«ÜZăuµơé¤ẵÀ5₫ăyËT¿æøµeªÆ_s¼Ç_c\ă¯9̃ă¯)nñ×ïñ××øk×S-₫ăÍñ¼eªÆ_s¼Ç_c\ă¯9̃ă¯1®ñ×ïñ××øk×-S5₫ăuËT¿¦¸Å_s¼Ç_c\ă¯9̃ă¯1®ñ×ïñ××økß[¦jü5Çë–©q¿æx̃2Uă¯9̃ă¯1®ñ×ïñ××øk—S-₫ă÷©ÍñºcªÅ_c\ă¯9̃ă¯1®ñ××kü5ÇËjê“0ưÑè>¦‡RØô ¦»>¡´êLÎ>Á¸ê›<Á¸Ê›<ÁxƯ2Ơä ÅM`" ?ÂÏ ?̉î+%½áGÚ}`¥7üHûÏÈ”‰Ă`#µáGÜÓ]Öđ#îé.kø÷t—5<.»ºÖq ©k}ÄÁĂ0¸0êZqÿă\Mơ¤½¹ÖGÜ{Ï\ë#îh«k›ª®uw×:Å͵ÎñîZǸºÖ9^wL5×:ÆƠµÎñ¼eªºÖ9̃]ëW×:Ç»kăêZçxw­c\]ë¯[¦ªkăuËTu­SÜ\ëï®uŒ«kăƯµqu­s¼»Ö1®®uß[¦ªkăuËTu­c\]ëÏ[¦ªkăƯµqu­s¼»Ö1®®u—S͵Îñ{ÇTs­s¼î˜j®uŒ«kăƯµqu­s¼»Ö1®®u—ƠÔ]z̃‡,Ü_~Y¶ö#[B÷ƒƠozó›Øœÿà..Û_̀ñzxråđyx=å×ÿ¾‹÷«óFüo<ülƒ¨T₫y¹ª¼øµâÄÿÄåÛªT2¯̀3ȱΈåâúÔúÈÿÀçuåx·Ûœ)¼fk¿åâùX?ÿcŸ¿$µ{äạ́W”§¿yơ9RñŸûx¿ü„w†6RϹ8?ơùÔgÿ™Ë×~MÍÈÿ§§Ã́ú¦ !|ÑñÛ‡Û4||ÓŒđÜùøÀÿµË‡£ïg₫û>Ée)#ï^.aœđŸøø-7_ŒüÏ\>¥.æqñ%*"Nü/|₫ê‹ëÀÿ¥Ë7}“ËÄÿØçï>L₫\^n×ÈơÙVˆT6ê§ơû<—ç§>÷ĐÀÿ•ËWññê©ưâ2\¹vm-?rùóX‹ÿ>½âÜ¿ry9´›Ëó=Ÿï›Ü\r—ÚÉchkJIÅ‘‚o.¹Ôa©wtơĂƒ{£:Û¢’ëÄÿÜçûư„¼:%r¢lt·̉ïGâơ‰9Ä£%ÜY‚üxùë¹×ÛđZÚË­O9tX†£Û¾rê¯O »ÜÜ~MwàƠĂÙ÷0¸ørU÷Ü=ƯÅHṆÆè•+Ù®ºñư6Ϻao₫bmw»ƒÜʶ³¶Ëµl;ĂQΖáèw‡«Ç_ṇ̃—¼Gwq”#ˆ¼cï]Öú÷í­_¬.îôë«‹[tôk@ñt"QM¸>Óû!9µW®₫¾6V»$÷#μ»WMƯ#Í«§ 
ǺÑư{LÇNơ'‰™Å½­ŸFĺeRÓËâøCŸo{‡ùûç̣W\'sÿûWiù¿ơù{½îl˜JZGï\¾Iº¼±¸§ûêGSt+Ÿjè§%¸;Ô{­OwÎn÷̀m4>¸̣Y—µÔ[9äu-u{Cn;Ơ²±α?g€[+§ƒW½̃³È•̃Ï/ưâ₫wË-́pw‰n}́˜qwƔLj治S‚ơÈw÷/U¢oGÚí0ç‘^‰ưlƯW”1­÷~|7á¾̣“Ó»ÀÛINïd«<đ₫¹DèQï¼øL˜yG€ºÅ×ơû₫Á„\t]&₫áàắ{Í÷2ÚjroT§¼B1óßơùơós”Û—çûÇ‹S¼~Åk¿ ³«lôλßZ:̣Ç÷Îàªi,¿ô=Âúy_÷wBt̃‘¨¾+̣ÎλŸ3Đæ’c’kc”¤’°1JVÉÆœâƯê1Ÿ̉Z=î\Réº?Ç×\wc²Ê&\ûm#xÖâ_y5×/¾DÔç̣4·ôÿ”§ÜëÜănu$ẵè̀µ?Æ‹/ÍƠéîĂåP%lT”¸Ù¹ú}-—xóÚŒÇ̃B$¹/ËBä‹ôđÅBäʈ̉ÖẪmæær·á1~±TøåOqm.w[ÓK…Û;%d§7´u:ÎÍåÆ(ªocß g0yc#5%œWg©k÷ñËç­}L¬1‰Ë/êÏö>œñ|±Ôùgm—}_¼₫“<\8—Ç?585‚?]Çьܿԧ»S’køëÆl%‡6¹n”_\í›̉ß÷Û+I!îÎrhsÏk—»ñ”\œ6ßÖ̃w'79S‰×Fk5¹›vz›\Ç¿ÓÛªFasë“¿[₫¬Ï âµWRs–­‰*!/\^?YÇÈ{[‡Ëc§0´{¾Oaè́ÙOahcÉ)L›ƯÊ7¥ }Áæ~N2àn½Üg_£Ü5SẪë„»kJ=À];ëƠ÷_î»È₫f÷ÈûJ]œf'7ö4q6đ¾²7q6đ'g—´6%ód.«Vä $^ûg¼ÖâûƠ“B_pñ÷Óµö5?Â"‡µ³=ÄÜkoó»Ï¥>ă÷.ơ¼”QÔgŒËÓ–”:—ß]‚$¢äØ»IµƠƯj«yäư˜†C7„´½ÂQú†öŸ ù›i£<çưÙ±_ôgÿ́#ôÇßp}J2Ê\=¾ø÷ë̀Ơ)[…w79+Ùª₫üEw~Y)̣:=/ÏƠ_Đâå¹R×g¸>府¹₫ưĂŒ̉ï)æơßôÖ•6Ê÷w[G̃?”¬—kîn—Ë×₫ ñÈËăă‘Ọ̈øzî¸ûQ!í±é§¸1ưÄp¬«ơĂñD́YAty‰¢Ÿ2ï1^ëÎÊ?YI£¼‡?xB)ëÿÀ? h*òÏrȱ±ư‘·ă—ơÎ?O’²²Ñ`̣Ú|Ưh€÷Ê_ÔH' Ød™Ëă€ô;xFÜu%H Êyñ%¥ǹỴ̈¤à²[}8ÿШüÿcŸïQù´u%yf©}wzKôŸxoHñ\§«‡ăŒ{Ư-=gœëtåßÈ»ơ˵\D’7vorÉ2ưû!C¹®»w7ÓƒP®î&Ç—ËcßĂÆ̣(WZ.íë/ B©|̣”;-câ“[ªç:9<đ©'bskY¯¦ùˆƯ¯JWÓ|ô·„qùs›–̣û«]>Ë*̃]ƒå@æØØ¾å^âî}~C‡×ûü†îúù µµ 09ˆS£´n0Ân'¸{á@»U"g77¦Å7—FÚƯµo]¨‘g[Ụ́„?ÄäÅN·ëǽæ!âŒKiüc ‰ÅœJặ´up.È Ñ6¦­gá5zÖÿ€û)L±K,Ú ÎÔọÅeׇ`p§iÛƯ#à±! 
›d2.•h<̉|Âa¢Iè?θït<ßUĐˆ›æÊLOœ;ÿ'Nœ8qâĉ'Nœ8qâĉ'Nœ8qâĉ'Nœ8qâĉ'Nœ8qâĉ'Nœ8qâĉçÿQ₫Å&úœªèQr’ ?•ñWMµÙZ.9H́ ñjI[qFđçƠtµgkinp‡ÚCÍm ·«m´r_Mo5„»•ÉcÇù`3Ø 4¦æÁ̃ă"›ÁnđЉpä«!0¬‡ù«æ´†‚₫j&üf">µP)ˆc¦{ÁZ Kwl3,»Ág̣JXíÖú@?Ľ[ë2)Úæ^].WÆ©—KcÛ¤FS^<Δu£LgƒMgç÷7­ûÔ²Çy¦ •GXz¼å{j̉ƠtmKJ._[3Z|@›Án đï÷â÷´Hæ4DZ¬»Áëà(ĐÅaüû₫½/̃'ŸxÊ@5˜Ö‚Ưà(p÷pô‹w¹´È#ŸW!̃ÅÑ/̃Ác½ƒ£O¼³·ÅÛˆÚµVT–ï'¥eÖI°È:é–m̉ËÛů[÷D‰*FN£DíR»ÓPê§vo-:?Ø®f´VÍ ¶‹ÛB¥Áu5}ÅAj19ˆRŒÓÁ÷³7qö&EÀ}`h(e8úAH¿oR_cK¼Ñ`ÚÅë­ÅµÁtñx‰º!Å_/KùK±OÊWÄ/¤Ü™y@́kÍ RM®üø!ưe¸î/´‚FM²Ø´ âXªÁ0 Ü t±[tom p“]tÀEpÙJKùmpQxn0\|! `ˆŃ/ÀkCk‹E¸xåC0̣¡øpƇâÛ—ăŒŷ܆3>_}ÎøPÜ4g|(2 g|(3g8´‹5Û {+Æ̀SB5>q#RéF¤̉H¥I7̣?:®qÜníƠ )¶:\Ú³W0²S‰<¯D.U"”ÈL%²P‰Ü¦Dª”ÈJ¤T‰ä(‘<%V"»”AHˆ̃z±2œ¡D(‘g•H³)V"EJ¤P‰„”p»ÈoƠOaR´Ơp¥ƒ¼`(ZŸÈGæ£̀ç£MØăëÀ¦0…º›3óXvoëUmû .Ÿ_3Ŕ…ǽȆ½ô; !ƒö¢íÅMöâ>«Á4°Đáº;"~¯<úp,Ơ`X]Fç(4ßâf±2+̉cØ$öâ_wüËùá\¿Ô?R½7Gñå)c̣Œá¥ÙN’®?N¬̉®[€̉ÿ²·[ endstream endobj 36 0 obj <> endobj 37 0 obj <>/W [3[277]7[556]15[277 333 277 277 556 556 556 556 556]26[556 556]29[277]31[583 583 583]35[1015 666 666 722 722 666 610 777 722 277 500 666 556 833 722 777 666]53[722 666 610 722 666 943]62[277]64[277]66[556]68[556 556 500 556 556 277 556 556 222 222 500 222 833 556 556 556]85[333 500 277 556 500 722 500 500 500]97[583]182[222]]/Type/Font/Subtype/CIDFontType2/FontDescriptor 36 0 R/DW 1000/CIDToGIDMap/Identity>> endobj 38 0 obj <>stream xœ]ƠOÚ@ đ{>E­z ™±g@Bï̉ª̉úGƯmƠkH&©„(°‡ưö ~[¯T$~†€ưŒ&›Ÿ¦Ó­̃|_.ưc¹Ơăi–r½«¿Ê†€Ç endstream endobj 3 0 obj <> endobj 39 0 obj <>stream xœœ| `EÚvUuÏ}ơôœ=÷Ù“¹29fB€„4„CA$(W€@AA²¨+x"…u½aWqU<á ‚.®ë¢+ë~à~ˆ®kưƠ…̀üoơLpƯÿûÿI¦ºªú˜êª÷}êyßz»FiĐJÄ ñUW6ηBÍĂ©Ö,Xvñ’-|MBê2„^¼xù‚á·m !d„C®~ø’ùs.zïúë9„n…ăQÍ%P¡»ÔAù”#—,¹âưáƯ ¡üœX¼t̃l[µ¡;œik—̀¹f™̣¿U^„îm‚ăË~1Ù ×ú¡¼ öoa?Gˆ½¹aëcæ"B…Ă¥ïß̣7À>ØŸï)Èpö¤̉·ø™÷Êé$<¾¸E¡Ch úú5ÔUă?¡§„LP1ái¨­GW£÷Đä·PD¡oP F—̣ÈŒV <₫%z Dà¬Zô.Ö‘:&É₫a”À̀f|JĂU&¡û„+& Z(w/ô ú̀luªPQøïcß(̀E¿Åuä}ö9ôGÔC,Êß\¸£đPáaè̃Œ·ç÷…ÊÂ8k2jEW¢ë¡+Ñ£è-ÜLêÉ̃ÂjhÓ4hĂ ´ÀI±­ˆGÀÑ· ûÑ.ô:ˆ>DŸaŒM¸ ¯ÄïâC Ô³?¿¿pnana)…ÎGMh%́ơâ(N¦3Ó™g™z₫;¤àƒkOBW¡kĐuh-Z‡6£ĐGè/˜!Z2‰LfEnT¦£¹Đ›ë¡MO¡7Đa¬ÆY<Kø6ü ¹ezöƒL±È=xÜû¿BAŸnBÏ£ưèmô\ó[èS 
8‰'ă™ø—øV|¾oÂÏàçđ?ˆ‚|È0̀́Øäß/h ‚ßu#  8ŒL-:Æó-ô%Ü_§p₫3I’ƒY}O>_]SXQxµđ £[FÂ=GS¡ƠËÑÍhúœûú:†₫ ½Ä`-æ¡/8Œ/Àâ+¡Ïâop±ĂøƠ’Åd+9Ä$™·Ø©́s=Ụ̂¶üÖü7ùBas¡£đûÂåñ­ßi„hAËĐạ̊ˆm‡ßyEGßĂo(±Úz÷{?\ÿ0> â¤&7gH©gÖ1o°{₫üü’üưùÎB¶0d‹A $ ,ü iŒáÚ7Ao>††‘ééy}؇+đ¹x †[ñ%x)^†ÛđuøzèƠ§đ6¼¿ÿ‚¿&,QôS’̀#7‘ơdÙÕ'GÄ\ÈLcÚ˜ë˜ờ6æmæ –cSl;me—³×*‚QÚƠ<í8½¤gnσ=¿Ï—çGæ/Íß‘9ÿ~₫o]aoá3¤DĐÆft1´ñ—pÿ·¡»Đ§¡Ÿ¢ÏÑ?`̀¿ƒ¾`°» Å~yÜ¡Ưă¡åSq3^—àEĐÿ+ñf¼¿€÷á—ñø₫3₫C0´¾₫†‚L& à$›Iù₫¾'?2"“bª˜jfÓ w³¹îç×̀'̀g,aml%{!»‚}MÁ(.RܧxH±_ñºâK%§œQˆ~óG̣2;ŒYŒ6¢&Â0_’?“:üKr ÿxñËđk^¦‰i"d("xHùdU=¤ *ƒÄ8U+½y€¤™©¬ÈèÑ oˆL'·‘Vô~"瀤]żE6’Ù̀ĆỨ0üZ¿‰ˆŸDĂÑp< Æî]Ô#”fgÿD¯¨P3§Kˆ¡°ư\A˜?Öc¼‰§ănÜD́Đ[CÉ]( ewĂö\ĐÀ@̣wᩨ–=¬!cÉ_ n1Z_†{܃“=ø·0.µ ¿ÀMøa¦Ư€Û 7£Eä"ËHäy2ú|¶æ‚±‰ˆe d:DaÔ߯<)Ç7€œ.Awàv”Â=xú#ùªÁó™—N =eŸîÆ[˜sĐ|}ƒ}ƒ°p¥—¡7+=$Ç#&ƒf¤¦)H ä¿đó˜@«|Xà̀#%8rÁYGJÅ#¥¾#1¨CuéT`T8ĐñÖÈp  OŸ8 ̣w 7:ºåüx9¿NÎ   QÎKF:pk`TÇè«.iƠ:.·E§m 7ÎצSh‹VYä:áe[°c–3Ä1jÈ‚ÔhT‡+5Ê?Ó¡ĺPÉ?XHïƯØ’Ú×¾¦‹Cs[“ú‹ÂÍ™9­ƒ™ÓLĂœ„ßÙá¸ö¨³¿ç§­¸×Í´r. Đb{ûª@ÇÆ‰Óî ̉´¹®ç’èèÖöÑđÓk Ç]€_#·6OëÀ·ÂOèĐ»*̃ßüđ(ZÓº(Đ¡ _̉¾¨†ÆỠ.XÜêrI» GkT }̉´p°£Án3̉³ÅÚ/X̃)Há̀=éÔÎ\́Ø-FS)£7 ̀̀ïÛ'çäĂinÜ}=‹i‹Âç‚@tæ %ÓÂpOµ4™_‹ÚçƠÂađiÆpVÇE0" ;4­íÜZOÏïPD¹p ư{î₫ề9¥e”ûÑ,•“>Qƒư½ùd²#‘ "¢j„1…6“˹têª.²0¼Œ Àº5AßÎi’îéßÑ%¡¹PèX9qZ±@sƯ[‘”I6wVºg_ïÛdºgeï¾Ó[Ă ÉÛ%ù¶µØ÷oấ–Q— éÀöÿ°{~qÿ¸ Ăă&NŸỠZêÛq“Î(÷×öí+å:,Ó7)创‘÷‚PÎ́;˜¦é;Ø(ü+e¡¾¨K¥©”kp`t×zN1mÖƒÿË“º ÇéỴ¦ÿ´R3;†$Ï,=£|Fóôí 4˜ɸIÓÛÛµǵ Ỗ>:ỮÚ>§«°rn8À…ÛwÛ—jíÑ®Âî;Ü£×4ĂM\‚‡€´4bKß>q‹„o¿pú´]`?nŸ4m+P›ÆÖÍ["°oÚ®B’\Kúji)@KhIß ̀‘îrï’Z)ïeå ¹<¯ #¹NƯ[‡Ñ¼.R¬ăä:ø¤·ÂÜÈ^¬ºƒThô¥ª ë·\+XaV©€̀†!.ÖíÀHPO¸Î™<Ÿ;Q7¾§î|îdƯx®§5ÔơÔÑoeEµ9hÍÁ‹Yt:À́;-)Đ)`÷!LẒ§É|<¾IÅơ (F‹–·+• Îa·Ø†Yă5ËFc!'x̃Üȉ…[é/¶Œï9QÇusGáçềüàÁ˜&•¸gy~PMù?* ¹VÎDÊp(&‘´Ô=ÓyAuÙ¬Y—©̃¨>)áï.Ç_Ö9ÍZư›ù®Ç7å»̃ĐkÍ‚.„ÇæÆéüi²¢ÔÚ¸†h\\,m±†W:́œB ­Ơj¡ÑĐ^L®¹¼ï›³Ỗ“´½G¡ÁrsÏh­•´ǸeƠđ¹,‰AZï°óv²âß¶öÛËó…ü³!½­}Ÿóø&|ΛĐZ§.”ßA[›̀"ïàr°ç«%ç+èÏè:&ØÿyưÙ¤̣«ˆê|?̉¢%Ø‹œIh̃Ñ£(Ó-7(ˆKÍÊmÎà…0ƒË{>¬ Z=ùÙMT¬…¬IqIz´îUAṽf* G¹c(3^È̀±–Ó¿#+®¹ÚôVáo`^‹ È#iñVµưP'—́Â>$KĐxè8+*›2,7 †LÔ6MD“o'Ô9Ÿ~á÷¦2_*–€X,‘†h4v,h˜Z4X3Ÿ«™¡¹Ts¾F³Z½Zs~@³ ?¥Ùvà×đ÷ñ1üwÍIüƒÆ¡Ó`]~};£†fhºđVhÔ ơ‹3˜»đ-/@¯œhéé>Ñ]ê—¶–Ü×15Eñbồ4»Í‚–<¦³Í‚"̣¯iQÁ¤·)t“Äà3¸ï/ÔrËàg;y¢ 
ï.|‡˜Â‰­iu|¸̣e…(Vø'²Ă×VøçQcTÉîˆ+|·ƠkLÓ3…ï¤p\á1ú!~‰ÚçáQ9) ¡°1Xϧê¼BapƠ£.̣Ç•‘z£Pñ›ƯX â—ºµØ½ÜIèáĐ˜n<>óà¢6.—¦“rNt Á.Ø« PzÜ^·Ïíw³Ê˜X&ÆÅ„È*uz­^£WëUz…’C戄—„“ʨ„̉lFÂaSPÂnQŸ’P9$ L“d>ÉQméƒk~É%›Ùg¬>³£ÁL»ÏÇ7„º §$ 21«Ç ‰›ƒD0Aâ06„i³Ú ƒ„±ÂqŒ×5¤µØiÎk‚ô"_IȘ¬?=Ëß@´œy˜ƒ&ÅÖù¡ÍnÆ6NÖ˘ÿ¹'CĂÿª,ÔÄÄpˆØlV(;́ƠU|ùâÆù½¹Ü;Êä€Ü¸›Ê}#9û¤Æ„P6x̀“βÁç¬ÙH₫̣v₫ÛG¯ ̃]?ạ̊·1Gó¡»ë¦¬¸ú­ú°ÎÙ·ëê?Ơ‡„î£Úv ù öÀ-[yµ»«đƒd2+‘Ză–ÜM|“›Ơ˜v“§?$i8½̃Ľ¤QZ£€+¿¤.Ñqï¶î&€ uñN¤Đ¨ơ±î!7"3r?IZt±ÙŒ/Fæ^$Ëưÿ©(AƯu\O7wB°în€/Ç`ÄơÔóƒ3Ǹ}bÿ…Ê Ô"²9XÔç ¬ÏƠA³¢ê‘u8às¹|=‹iù¯­“ U ́§f:,¼ÓÉ[lÅ¥`6Ôè‰ÍĐ€.%q`‹’4N¶Ó­K*X+B]xÆv­̃ZR4ôĐÆUV¸wNư—”rG²ç˜®5̃»­́¶øeOÄ÷è·%4^kÏékl<œđ%­1_YXoƠQI1|ÉwÛÿÅ÷ØÙ2uoO~²³Ô‘ñQV6¨ÍئÑhơ®.üă6ù·÷à”êƠŸë£Ă d)˜y¨ơÁñ:²Œá_ơj%ẉUJH(ôu7@ÿåºq©Q±A;=₫ï´G¢-è”%l–°Ăo•0¤¤]7̃X́oø 6Ü–l”'» P82hÉeA\•*ei¦)á—R©Bªr«zûô!Œ¾k›äîºË”=gv,Ü5ç‘¿‰3®Ê¸{R̉•×ûzé%Ê?qC‹S¥upÏú¸}Ȝ˯Ẹ̀*«¿/ü…B0đ‹k1êÔª®ªÊ™‡DÎŒ6Ö₫)Wo«½—]Ÿ»¯vSî‰Ú]–Ư–Ö·±üƠñ•å_BÆLÏÛn ÁÀ™»`=‰«Mºd™™É@CœHö Á(S }g À§ºđb}µ¶Ûùze¸¾¦ $­­ñx3®!™Ư0răN0¸Z¡4|µ¯,À"¦yôèùÜ1èûñ ¢£ÑsƯ€“.e‘‡s4=Ù\$j±²h6,a‹Â&áHN”°•åwÉăr#|`SÛ̉V‹jÛ°½HCľ ¾ºªÆE,HµC.É£Ô«#ÅAb,W\û}×â/ÊM³>ốƯ¯ÎÙÑâs Â9më¼~êƯ)ά3;§.pĂç’ÍÙísưù̀ 眦Ëw.·îBªK¸}ƬuuY«ÆÁ•ƠÕ{ˤû`nzŸê°/ ¢w$̀çâ *¼~ºơد÷%»ÉÆwáV‰7_²‚Á‹ |…!A?Đ3v2 «ú >ÈoEF˜|`¾̣z¨Ø‘ ế6¦‹Ü,™°Âx±×ëG&Uđí&—¡ !é@‡°bY›f«?ĂpDú†£m|ÏÉ–¶º:P‹:`y\]7Í| ªƒJZm¬XUü%·´çûCu½[SeEæpµ¹—OôfJ@Tm6‡1Ăô¼‹ß}~´ßạ̊–Óüë4}$•ŸgÏab§ÿHû.ÿ}/áÙäpOä|?•sè¹ú/)¤skh¥ ­Ü¢aèÄ­èJ¥’!/iôĂÅ6dµÙ È„̃¡³éĂar±Nk6qZ–Óëvƒ&ḅä6‡F°5€>/Ÿxw²6QeZe,O)5oă\°Ø9Èà^@gńy‚b ĂäŸSÛ¼SÉ.eµØpÛ©×]f'§å…?›ásÙfˆ¢J¼JÉÿ.ô&ú}­g]¬×–LOMÎ' ‘uºVg»óü úAƯú؆äĂé§đc±íd¯v·~ẉ-í›IËr¼)H*­i`6[=a_Wá¯[+Âå» că‡mfuYY„Ö%ÊB» _¡háË­±P̉ >Y&©Ăơñ¸̉[oQdꕆp₫Hââq;'Ö3Ÿºế́Ä̃…»%]u û4U¯ªÎ2;@DO´@J¡è˜,¨TNeѬHWºưf«öñ y¬€Cå*°*0úÍ€Hn$iuFB•``ôtbư©%ZpKj[bJ¾èknä‹N0èVªAá„’Â 9LsØ)×Yơ 6'n£u6Zg£ug˜Í}ó7`à ^(”} ƒäiLmË€CEi8ÚP´'̀²eZ®X¬<fm£̃H”f 3ÎʱJE4¡)ÓŒÄÄ-B‘Ê‚ËYjlªăPi„$̀¡~;J+3}Ø5¼PK’VîĂ0ÈËJZU‡<®2[0ç¡\6&öï fïđÎYSkƯ»ñ/d‹ëg̃pûôÁ.§YïˆU¿‡«¬¹G^úÛß.zyuüạ́+.zyу=w­zö³­W5Ư—iqN³CgÁƠŸ'><°~Û«;%) ă,ûJ˜¹È6_¥¤1mµëÔ[‘’ßƒí€ ,¶o×éÁÓï<©Ï­êBÁg¸P,?çPéO˜¹Mƒ†O¿=kû¼,ÍÇí́BF/·bĐVUw‘$·-b̉ ./;Çđoâ3|Ïđ‚§älÊÆ¬C`o´-½v_‘ÈQb+OË–sMäÉNÊè7ÿ®ĂbqĐ/´çƒÂ `ư™Q=ƒ[v!®°¯ÓăẸ̈]…}’†÷e $ Pê„­¥´uÊ[g¶‘å€̀nơ 
Ṍ!̀1—¨^\_3¸ %mMMU=fÇ éíb’F¨L?6C[ßÅ0’™M)h•‘”  <7xX pßÑ’Á®–«‰ A5Œ [‡§T]8#é́¶HÚI7¥pêü9:½F—Ôˆ 6_ω–JÔ€×r'äÙ´€–Gå‰E¦±\j‘UႆÆhVáL$ăɲd,)&J‹•·­œ•UfÄêhƒfxrFí³‘©Â6ªte¸Qu’zXv$…ÙÈZnœơYc¡YVÔ‰d/ÀơÎè¥"Ε&¬ểT º`3[ùêª>¿jŸcÄ\tŒØmf•aFÙ«̀ !Á_÷È}Á˜†¤§bTçư÷3sfg]ëCM·¦=çsÛ?zAû"«Ơ`q¼å̃%sÍQ+.×±́ư‹¯Û2ç²5!̉°åæüöẈÿĂ9¹€X?4ë¿oÈÄe¸ £ß̃4ê7 {ö  6 >‚ot̃ ’bFˆƯË̃¨ï’5fèQ÷°Ô ô₫\×y©Ñé&¾É>Û5;Ơ”₫!aJ¢D"U Ik¹.̣¸d7¬5l0Ălˆ› Î́Ơùpœî2buBă o8‘̉0r•RY- ¯†¤‹\e·Oáív ïxsÈC«Îñ#ÿJÿ:?ó¶ûăn¿ßăö†Ü.W*‘đ¹]V·ÛÅ›Í>’Û* ‡µ5¾¤©Ü_NÊË5B:%º,¢K ®ƯxJáa’5!º%“¦™±Éíwqw³@(S;*ˆhN‹ün< ™AöÍÚ3•}5™12O0c.˜Y3Û™µØÙUrï´,œ¥“f{d?µ¸è¼Ư";ôA^W)dskU¹3¹ê—`u©û|>ß¶´eŃXñÿT”ÏV½¢ß¢9Ë`Ă%‰ â³v0L˜a®ëù°í7Ạ ép|ù²WéwøÁárơkÔ°Û¸₫ ÿ§xU₫­^ƒù’‚Ë©Wú ¼Ud^Ï#t d*ÈP3ÈÅP^ ½ø|bṣÚWuhkíÉGE7$Ÿ‹*¯‹¬ˆ^¼2½V»ÖzGdmT=™›Ï­Đ.ă–™—ñË,ª±ñÁs#ă’·U¦¡!Á!цÄĐä(ÓN­ÉOĐu'Ü™°)‘T/ç^ˆ¼–aFÎ^¸-Đ^qo`S`{@RƒIDÈk'jEc¯º"`dÂeƪ@̀í1Qíóú*«ª́jbW‡£&½_ŸÑ7è'ègë—êUú.|³OG(1™×™÷™ß617+Í®l¬ ŒrºÄr(´P=vyQ&(˜·•ÖuZdcœ2`/ÙÄä^‘’đLă[Æ,_$Å[µ:‹˜Œ&¬é4jĂiœâăiщiŒú™jkÁmmm-đ‰ĂfU PJm úȶC ΢ë$ˆQ_Â=̣ꦛ¯mÚ4§g -¿ă³'Ô¼çê|'~jâ5Ă½#ÿçIÅá̃~탳3ÏtÇ\:ä¤&́Y4h­§íç,,]3 :aEá0{û,ªE‡¥k̉VœA hbv›}c¾ơ"ûẬeÖËíËœÛÚA±ö±533r‹—änơ<ÑVWîFŒÚhw ª „}&bx]x[’̉ƯÁú¢ÉA K’£¨n ¢kˆ[4Uú+3• •l¥0xƠ€AßMÙROí~Ù_́}™.•Ö½ƒ)s̃„Æuè.×™88°¿Ù(­÷¾Ún·;LƒñÈÀxä0/¹fû—*W(³Î˜äy¯.äñçÂa¯‡Ñ()×2ùèVJ™„å³¢ƠåHZ,^W¶œ 8©Læṛ̃Xú:H")̃´¿ /–ê\‹ºpDtåơ!¤s:$<øOÁC<Ăip“f£æmÍÍqB“År”æ̉$Ư3¢=À¤©¹À’á¿áSRZ3v©³4rƯ=Ô“x‚Îd\K[7@[ Íz®DúèƠ æ{Ë¡º¾L Ñäb2Ù»£¯.s`s¯«ƯÜç[́%s¯  ÿ˜R Ln£Ư~z‘6Ă˜ËiMÏXöƒÁ(8I.ï—ç±ü¶₫Ù*˜Ö¼•7[̃ó5MgĂ(­ÔºP«‹RíÀƒ’Ḡs®~td®E?%̣„î‰È ÊƯz v„E},,Fj"Ê4x<ykrª,Ơ¦*\US^U•)÷æ´jŒK[°ÏáhL×$ü^ ºëÄŒX³ —c-Á¨‘ê±P X­’ˆ²ß‚̣̣´cäMj¿¨…úUKÏ‚49X€“鯬K”i퇶¢Ă²äf9CÅZ:V,pûÏ@½áZ0 ¾B ”s6߲—Û#ö=Ü‹}~m-ǜƠÊI Ô¥U‘^,,ºâ¡ÈŔ\EqƯ„Í´î½ê­»&¬₫zÍ5*êËq̣fV¾sƯ{&Ö`ôéy7M-›†³â­ùûs5Më¶®~°+Ú—VZM.ßK~Áá¼x₫]-W=đÎÉ@́Ä‹Á®‚½ôn)è]#~Ẻó¿±?—é´ïͰEª3$ḰÔY'çÅ̃dĐë ½®T•\…28¯Îdªª½©º´35øHC²±¡aD£·®ÈauÊd‰Â ¬Î/ñ×dT¾© —%#eeш794G«Q-®MfkksYïĐpȇ0ÖUb*• ˆ®¨˜LùjƯĐ¡Z ³Ơ¾HÖi”<₫́†ÆçÉÚÆĂ¤±‹́‘Ü£x_0höU‰¬#̣̀6!&2›,% ýA#ÑXü>’×âAe)ñƠMÖÉÖ.ƠÜ:ÊRe>BSs‰œ-*ÿ¶ôó…ÿtÖÙ×ɪ́rÊPCÔdm°Kdw-P€¤è> ₫d ¢}KÁŸÔœMuoîyW†ü'²Ög)©ưQF’^æs ₫iMvvï1‚©ÉûΤ»2ˆœ‡·ơæOÛ{÷ƒ̀}ä÷ï s~ô”ΰå°>`X¶Œ'㦨ÖWX+l ßùF½d•lă<¼|6I–}¼ 
’ä—Ëäñø‘W(ÎĂ:EMiṿ´³Ơ˜m6̃́uúE'!¢Ú$j4jj™'p˜k;ûæ_:ê0Ø ²!ư¿Ê7Z?Y“>Ă-&¿:k]úˆL0e/;¬¿³ú;“®|nô}úMDßI×ßÀ¬°\k]MÖ0k-íÖ́j ÑYu6æẠˆêiƠçÜgÖḮJ–[ÀíàvXÙ*µç€Ü¿ç/N§×¯2ñ:đĐÔé P*̀ H̉s G^ ¿ç*3-0úÔjƯ¡¢;Vª°Jˆ=² ¿Wt µ} {ôü’yWÎê¦ñGÀÖGÛCœ(mv+¤f… PפTYGsJÀ`;áÓEbBU/™(‘wè̉"f‹“C½ü”«ŸƠ³̀óG½s1Ùđø_½àÜA3̣;©À’ÅÎíq=pdê<\#‹ïwcÆ”ùîHơu3FÍ€ˆ@/ I-δ³Æ5h†¨×Hæ‘ÁƠ;ÄáăTơÇîÆ“ú °lÖƠ Ü Wm.ºPNA×%ÈØÛílpu˰ÜS'“c®®øWÖW•'i¤€$_H³g$B$4$Ö²†¢[ªy•â—û=Á9ĐGùψ=Đ»Đ̉öA[eäêê!$»P¨p¤Sˆd©±¶Đ„f ²yôśh|bcäè-1†§<$/,óçy¸•ÖŸ—¸ááÇåÅư…ƒV‚vLí¢ÅR-ô%èA·b‹ú_J|¸ŸZ“†üÍ{̀GF£±µûĐÛè]ü¡çïItŸôj£(æùÄÚ1©'}»|‡Đ!|Èû%₫Âk˜æĂzYó,¨‰ç/n1™x‹Wï—© ‡BM!‹¡PTôú32¹ÑUU×TUåj¼B.««YµZÁzun[ñbNlrúÄ·:6«×]^VÔødS’$ă±d²,æ-ï*Ü!y¼<^¯+¦©¯!Ÿ×g…*ĐV¯¤óEE¿ßçóxELËc=wí ÂØD7)ÏÄjÄLF§Ó³Q¯cµµ^ŸÏ;¨Æ“ĐÁÍ-=ÛSĤX<“øœ)¶6öv́H́8Ôu‘O%›×gc²ÄcÖăa aÁD^.Ù-†µ²¾ –ƒ–Ă–o,¬EüJÉ^O•Ø%pƯNóàLñ¿¥ -Éd›“;æ’c/h-óIOQÇé¦Âƒ\(j>Hu®úå₫UêrgRñKn̉ùóÔªíÿŸµÉóú/€Ó·á0₫i¤G¯bü³Á ạhk₫Eî!yây“¦cr4ư†ÿIæTÅø>7¨.OAÎf=)rèL:Å|IµLyöFâ^*ÙƠk<‚‡¼F°+Ưnlw³:³,dÆ8o4Ac£É¢0é§ÊÊ’)oTËʇ¨ª•e€̣[å2XÖ‡”9â£åP°Ú ú¼̃ˆ›`û Æ™äF(%Es¦Ô„ỐỔÔÚÔá”2å*'ŒwÓĂ-ülËRËZËq k²`‹riŸQ×F­l®è!¡TăX‘¬Ơ•È·#;KzgVâc«Ù ç–W[›e·̣ÿ3±û))—á;₫Y¯ÆgC9K÷Ü[Äêr@ŒƠŸÅQt*r’1¬ătưY¾ăÏ™Wûy A—¼¬pđi©đ†éUđḈÇœ?r?̣'́'åkö¸ø÷í8ÿÎưW¹8o³Û́kü¿L'-̀#{ơ“§Oi׿©|S­¾™¬QÜ©^©_mYm»‡<¤PRRWkêôC¸j¾Ú>Ä©N¤>ĂEù¨=ăJT/˜ör[ù­–­¶û^çnAư¬é9nÿ[Ëc¶ÇíÏ;ŸÔS-í-Πܽ–ơö‡êQ–Q¶Qö±Îó„é¦éܼ:îbª± ² voËâƠ:¥VíVºƠqS̀³],`Vm1X¤rI5GµŒ1JƯÈT6"ºÚU ®ÆkKáƯ4(‰.PÊîJºÆK¾%¿/hnK ÈÄv»Öcnà» ';aËu~èä ṿj´º́N»·ÁI LÖ&îú’n]…÷ûÊ:–_¡[Mik¡[ Ê6z^q{B2i¶ ü0‹LC,Bƒ¡´%tËÙô¥­“® ̀–a؉>Dsÿ>r˜Ê1¢qưÀ‰‘™C ‚¼*Kh¸0ơ©̣́%·³ú@₫ÎXươêÉ_¿¸åVmzñk2úÉü§q36b¶1ÿ·§̃£óo|̣e₫<ÊV' É @’0J£ă’“u±n•ù-ñuçܣܻ’Úë*|-qWºnq‘˜:¡^ïº×OÎæ³ÿ̃«́£§)Ù #_”7E"$q‚I€»2i ˆœP~²ßIƯë¥Qj]´ YĂ#ô}¬bHt&j7— ùß›k4¢•:?Ï6ÛÎârhk´´.⢗wà]_<û—1Uă†LÉÿˆơ-{ú¦ü{øH₫35ú«'̃­uY&]xͰỷ~§₫Ï— ßÓh₫Í.,́—·%­ç°¹•WV2ªäʱ•Ó]Ó*¯\‘º&wgnSâéʃâ{₫w‡Å÷̉߈f0k+GùG¯IƯêoOưÊÿ[ÿæÔë7‚Ç’ßÂHƒLÿvŒÎ4!†ö‘?H•¡t*́/G5%{ |™rÚíå´ÇËËƠ`jˆ‰µ¨ư»Éµ(M6J7âăª£$b± ·l_áYë—Á¾n m ½:bC”‘˜̀‡3ÜqpBíØÅg®µ´m9ÚBmq˜÷O–œ2Ựª!¨B¯0p­è;đµh\_̣êmơë» ' çOlKêsv? 
ÄÖl ´½7ö§äÔ£Ñÿ™´«́gđóhuŸÈL₫)S?ưÈ·><}å--{xó̉ü÷Ÿ]Ö9ñ©åùD›{¦à¼öËérĂ₫N¦ä—r“×NºÀ.˜¬0'ŒD‘ơƠç¹'T·T_m¿Í¾ÊµÚ½fđ#´çF'T$₫äˆ÷Çß;Tnz“g mNJñú¡.§IaEx±ª"̀”gé’Y'ˆuuYs´Qw[~G, 62,(~P^Zí[ê#>×hkTªĂ¢4|i|E|m|Cüù¸".Œzd7öˆ(<Ú Suñi¥âzSï‚SY„.CIe/meĂÂÀÏ^K*Êø]¨(ö†Ä•blJ±€1±/²”Y_œKyVÛÁóĐÑyßG——n[;§ê‚j¿ÙăµÏZw`uƯơ]yùëG¸Øgs…Ï9çÖ[^¢̀l-èuŒ§wI “FŸă@íB ]Đ jE.´Z{ ØQB˜êG@„.¼tÇ™}p ÈJΓñ´zzX“§Á3Á3Û³ ´éyÏaÚó÷(¥ÔÔÿr¢ÇÜ ĂßYëg¯*üT„{—ú3dƯDz—J¶?Î?)Ç>K»ïL'U₫¯TªñƠùỢ,=t!Èăơpß8´yzü…¶ú9}äÈ smèj÷Qå1Ï—₫É÷ÊïƯ?øO4:Â*±[ç¿ƠưRÉ;‹¶“³[µ`³9/_tˆQ§ă(®@̃„Y[ôTÇ5ƒVă5½cÄê’·£),ÆăN‘×¼™xI„‚>Œ—ÂØ€fÓ‡̀ª—O­ ™­YªY¡Y«Qh„Ề³EvlQ©méwqơQÎÿ/‡¯ü¼— +µ%ßôxoP~Ÿñ/G¹ÔäÎb’LÏW¿[öܵc|.£̃Wä½tÓ…«/–­‹b;¬gÄ–ăs_»†¼#fĐÊöĂˆ;^9ïÑyrM¯̀•f$n–<*¤rV óœc“­̃»¹·½?:LjŸDOz‰̃Z\àªmgµyơ6{(A«à—‰‰œØ*¾-²¢Ob"é %‘Nvư;—ª0}p© ,‹8Q©âƠèÎs½̃j§×+8½A§Ăö£ntÙ™Öá´:N‡=!†1hơŒ¨ ƒz½ ¬†'Å g“³ĂyÜÉ:éÎAÄŒm¶m¯±A¹³àÀƯøfd'ow¦äH”‹hpà±–-r̀h‹¬+½v$ưËdz­É€rv€‰lM₫Ç̉0ƒ•P ü“‡°sÆƠÿ®–lº2ß<Üa5¬<Øi1-ßàÛ”øÆN+œ¸¶¸•Øa›^oÓÓÓvæËe>À–ÙÇalËÈ·EW”ä´9‰Ư¡P²@­Ë\V¥ĐM„ØâÅ©“Gü°«üt“táR×R÷RÏRïíöÛûû¬_Ø5­\«¹•oµ° ǽœC²KÖIÜŸà÷úÊâRc¯tŒ&£íĂÍx†}ăvÇ“7Èëöá¶äå3×Äa.gå8‹Ơk°Ú‚1Zë‹"Ë"E¸HSd_äíˆ"²®,‰•yƒeH¯”ј4~ 1iöjk¾Ñ@Q×)4¥Â«W°=ÄêíÅ̃œàơºo@p"¸á@W₫_RÖÆ2«‚e}6«æ21§`u:‚ ƒ}Nä„!˜ñÙ́p„ˆ.r•äscÆ&2¬:&]ô?°ˆ¥hĐüN!3\  Ó[¤ªƒö X9AÊÖd…•È„#YAcYA”Le₫²Ùe+ÊÖ–m(;XöM™ºlY$Đö±Ă§Ù¥ |áT»äÊ™́ßÈ!ĐÓ¶Ìc[¾U°½?gE ü4‹Ó’ÍoÅû¬Ø*r Œk¬âEØG£dÿêEŬnѯëIö´Q;ÂyLàzÚ\ÎîbŒjËQØëä¾F}øVZ ¡.ÖÙ×¢¦ñY Đ¾LÀ\íPưO!\?­(ề¸ˆYˆÙN²’¸.»«DÁÆu¸úÂOHá«­Díè*ßbçz)uª¶´4Ă4jë, Üb©¶XΪc>¸åë¿ßr½_†ÎZ:ƒí_úß7₫}É«E,¥~¦áôË́°¾Ơ¬“9ưó_P´ 4m%]G%Ở½| Ö†s’~‚}BùđÚ‰úÙö–̣‰µsô—Ù/+ŸSûhùºÚß…ºø®`W¶«ñu₫ơàëÙ×?D_e¿iènü'úË…œpÙ*̀7ùÆ0 sÁluf³<Ïû‚Yk0˜­ s<çĂUVŒ«đlN4‰Z‹È‹A1 ºFˆbV̀‰C+Å*1ĐE®‘<ÀÓµj—z(Io²8+666ÔÖ6„Ăåå±FJÍù†á NÄX¡×+¼^½ƯîÅ´ÚlRd W³ …kT•†Úí±^ø%º_;tn©—ñ #÷`Qh±çMaü '@­)=…ñG|¯¿O CNwÊuú*[JÄP&‹Ư*€ÔđóqÔ­ÏQ·>GƯú\ˆ÷5pFƒ½µ7®¾¹„ạ̈ê)_8Ô ç€í¨N“·p&O­o8Y.Ăùt»ờK˜à#¹|ê·’Åèh0›̀³ÂF°̃©u/Y *è„4áœ&kñú°­‚íNØ`ÓáÆ|3>;àđ¬¨8{à́ ̣^[´G¿£émùMù§n“Ë'hp@5nÏß.ËøgT¢gá‘xÄ,;FëdjOO_„âKùżѮ`ü¼o= ?6@æç̀_2_…¦KŸw:üô9èÏ¥œEhذ¦J U Ă «”ÓùYöÙ̃éAV¼(tOhSˆưgđ‡0Q5a[P³½îƒ\É4-Fú ‚¡d%Ôtr帼‹¼"éªÊË+«¼É*ÔkÄæJF¬`ÙM‡)é ‘MD䨓X2‹EẪD8Â\ÈŒAS¶TɈ˜LÄ„+ÈórTđ°ØT…«ºÈ̃í€ê¢™ƒœd 
ˆŸÀ¯¥¡?Ơùáønù¹¸-Ư%ú]\’’×ëêz-@÷ư°÷â¨>“9₫ü‘²X¢6°d€Å(Àb'r¥đ»6ôïŒƠÿ»,­ï#₫‚ÿăüÚ*Zúœ&â!¸ú‚>¹©"\~dŸÜ¼@ÊûD%†?¸vưBLdfFiđrós4sl­öeÚ6Ư2»²‹ù‚|abÔ¾&?ă0•‘ SK˜ ̀ æjÓUÜjr;c*2fJiÂ/F’$Àv*3g"Œ™UøŒƒé„Ơ&)3I^ø ¬Id©9,9ÿRÄ"—Û,ê‡0€—âø8f±à‚Q-ºÀR¥U7¾‡N}¯¡Of |ë́§²ÆuØúæ'Sá»-ÄØ Dc8¶3ôƠ\Wáo¥íN‡±0&] Z‚Á\ÿ“]æ  Ûz—¾˜üéÈÓ=ïĐe.̣^Ïr íƯk™Uÿ”í®qWŸ¡ÁQ§ïÙé9Ⱦ•#2lSÍÀ:6“)¾eầ'$î=ơ˜l-´ø¡çà€g!Ö7…Y‰ÊP +M|Zơ¸ÿérFTEưCÙ+,W»®r¯´̃êºÛz¯k³j£ơq×s™íªŒ[¬Û\»|Œ'*mZ,àf4ßă"ו·—?T₫´qsù«•ïU~V©. u‘ç$W4ŒFCÁPïµ8â5ATÇLµ^“ªéÂG¤éøö2¤­2:M¥¸Ô²“ƠëˬsA¯î0 @ (j›‚8lNÎn>Ü<T]µµA%Ư¿T¹A¹WyXÉ*…A‰=ư‹“ă{•Âѽ:fZº©•'?WÖç‘l|–ú•4ø^¤ó6[8rđ ':yu¹º´¤Df…C÷ b)́£{ÚZ`Øs½ÏCTÙ|¥¯1(.|–”“å}½Ï?LÛùö¯Ÿ>̣ÁÛ'¬\9wK@Ă9´Æy7mغŒÊÁ«Co9wçÅç_ư‹%{æ-đ¥×î0q·Z0XëäÍZ“+ñȼC²/é·fnÂРλdêlêLĂØOe?GT†#[(eN̉q™®‡ ;-[„ŒḾ¶Ç§b°. ê[t]x̃v1¨ 5̀“Œ´^¥Ñyƒ&èy¢t%“>`³̉Ç LÖ¥ÖĂVP‡ø¬»„£½̃††:JGE$-Eëư§WBŒëĐ—C´Hƒ+t‘1eSÊ.*{*´)²ï̉½àÛÛ¯8 >Ä~¢>ªøRm¶³•¸JQ¯kÄtçú¦àÉU‹î"¼@±Xw%¹N{o¹µo·ÿÅĐö¨æøă[u\YWáË->{ñ]-¸­›aŒÍ¨*|–Sx^'îÿ  +óÿÜ₫ÉúWÄe>úñƯwL¿́ç=ï₫!ÿư+ûóÇÿ°I~=Å0yYûơ ưëø–â€Æf&ĐñíA­ÎD—NJ)ȼfû$úQ́ˆÿHđÑ/cªˆ-f›h‰N-2-FW z;]h¸Übm¶L±]];éR(]gsŹ8uµsq÷9ïum²m‚cĂ"o6 V·k.xEŸ ºƯŒ«t¬̉ó[G0¬3U7oôău₫}~âw¥¬A‘̣FÓđ“u"# ÉưÆ´MĐ˜=Q|7ü-Åeö‡›ƯÔOBC€ècC% r Đ>0<B¹,ª®b^•ƒ€äråó÷́yåư§ç¸ÀÆ™ó{ư@₫Öx™1x¨–¼äw9ÜcV~ùëÇÓdu˜“#.Å̀k°ê ĐÛ›é{¡¿?Ưqnâ’¡Œă¹bđMF&!µÏI«8wÆáv;!ŸÖ*Ó´hA :Ë‚Đß PĐêCzUÖ"vø5•ô¹»RÑàJ0‡»đÎdbeïó2m¥₫¡nï:90¬±£đ‚êÁÏ»™*+ÆuØKJĐiTój 1ưz± %Àj XcÔ'k «#BFơ‘…pNÙ眨rộÀGOYR„˜»?ưÅ;Ë—¿sù'÷Éåẽ{߇̃wï‡́ç§–PlùƯëË\}Íák_Ç%yă'Ÿl¤’Läxâ H²€èmi¡Ö₫€T‘ä2üüẠ́¦đ1ÿ±đ‰û¿Ÿùÿe7„'Kj}cƯçùgº§û—ºûop¯q?àyÀ·Saº̉¾Û³ŸÙÏ¿áyçT¿jv`¶›½A‡ uúI®¡^Ô…?“¡ÀP $Hñä.œ@®Â‰­ 5X©?lơ©Q”Rê º‰ë«ieÚé_ + :e.8×óbUI¸ÿ]3ƒjrEH.ËWWñÔúŹ/â\6s¼ütÁ·c¯ÙW½÷́©SϾ·êÀw¾ùæw ¯?(#Æ®I#R³Êä¸ơóÎM ?½ ăíÛ1Ê»ço­¿ç­·@&ƒ.,]¨Å¿̉¸N‹mø"å•Êuø²?N:p'ÑnR>¡Ú¦Ø®úƒêCƠa—Ê¥6;dÜ6YưVbé´ZÎ9‘ OjfE*•©Å9mï Ø0S^fqE₫ª‹Î,ñ×Ú*Zç2•¹\Ue¨Ó‡†ØxY w-bUœV­ ‡æ‰Ç$Ư Tî­8XA*ºđ?:™Ó÷”dÑŒ¬ëé…|Ù9m₫YÀÿßFŸĂ®̃hD\ØGß`‚i4¢ÙUF…ä\n…Ju+?v©Ơ-=ÚûsË¥ø‚¦ơ3æ®9 Œ₫Ù1qó•3‡g|đDÖlàE§¦µvBÏ?ûô—™qm:puÏW}ọV|» z¤Á®ó́ )ªI¸@˜'\!Ü"¨,nx¬R¯™¦P„ôvp¯ x,ó*éÂ÷́đ( z-Â{0]"`†YV°M°b«à¸¢?Pœë‘G©®ád÷Y‹[°-œ³ü$¼ÔdƯơ+đXzß=NÙƯ8ö{ư¥0ôQ~âéï pKỤ̂70µ̣yÑ)ÉÑ©™ajö»ó\aZ‰Vâ•d%s¯Éx¾z­zƒz³g·GáQ»éâ¶´Y¡Swágw°lHW¼aɨSº& 
̃b´¯÷ÑÇÙ’™†ñùơ†€×;ŬàÛ·ăw³1L~¢wÁ±çhĂÉ₫'–è3`¤w̃wÇ#STƠäÈáëò¨ƒŒ™6­~R₫{¹4—̃Bï¾ç´¬ùó.]—öË¿æbĐ̣½0®ëAËs¤kƒÛ qúL˜U/o¥ ¼®áb˲?‹ÖD´<È–åG¢ơñ†́"뢰n‡-5’´Nˆư(ûUô«́©è©¬zHtHvQdQn³usXɅèăº> ÷P¥ß†üØï§?ªçụ̈ƒÏÀ¼ư3Ă~(̣„QºZF‹ÑÙêl(Í™ụ…Œ­Ñ¨Ó†̀4Æ,¨b€§ó9Â3ä¶ZR"­ÏŒÆăb4”F¢‘H —µærÙ°ƠÂ[(lE(Œ,¹ˆUÆ¡¡m¨[)MUM§S)¢Ê›‘z(&Z+5¡5KĂ8ü`429·oDQ¨1,Ë®̀’@¶"Ûe²¼ƒ,0÷Ắ³L³RC8M@S:)5BÍüZY\@èĐæ(Fµ•̃¼B— J+̣ƒ¥(<ÇàUlyÑój)êôƠÑ©CAÅ­PUÜ:̣̉vkh6¦±Ù«Œ¥hÎÿ¼÷ó~³³(ûÉági[ G:]‘¬Uû2Ó÷;Ù [ùE²_§ŸA‡A[ƠQú̉̀láäÀç½à8ª©tÔQ!è{–:}û‚ûyMŸ?.Øçû‰;ø…Æâ«x~RÖ9ù.¼aVrœÖÍÿ_•o`:₫ §(|Èo»ø:ßÜç®»4jh”4ʉZ¤́\Ûå¶›m@>ôÓ(g–82D̃i»×l9C„f›ÀíåN¢¡ü̉ÆŸGÁŸEÀ_‰ßQüë5€:´ƠFß¹œl4IHu¦A¦Ză`ÓS©̃$™M£4¼¨¯ÑosoM±1\ƒÉdÏ\Ơ\Ϫ+<U•g”j”g²JQ¡T/ëçá!xÈèaC†Ô ²™h•/Àă&₫m₫œgÏñÏđ£ª"]ô«:½No§÷î¤×́ v“>Ù@‰²H"DE6–€zA ÎüFôFF™aF̃Gg!$0† ƒ£ŒË\|Wtzg@.÷Íe`F¯õh̉ï«:§“N@ÎûƯ?íꯪNUª:µ~ơïûNœ€Éº̉ˆà‚¾f„‚—h¶„=n9T=Ö½ÙMƯ_æĂE…́ºps!-ü²*©:FÈ‚º“u§ë„:ß”̉g½Y4ôÄ“CÎè*Ê”¡@KáÿC­ưJa‡̀¬Ñ±U˜#£‘ «—x}bFk*(Ñ…ˆVç=!R¬- o?D £ơ‹«\joGt#0,j)¦ÿ }ú=,ë=D^̃ÊàDÑäg5đ×qÙ`tYMzĐU¬¶;\œ¢ÆO±ĂJ̀®¨Ay…©Œ¤Ÿ¬º½~Q¤¶ăy¦p™ç=ÓÆ—ßV?™{§S6©‘Ÿç<¼Ü+,ÓÑŒà(¿Gù\åµd‹\ó¶îm=¥;e ?5ôèz ÂZưf=]¬_bXö~¦£›B½ärC+Cˆ†̉ ARhVWÈE]“9ëLTÓ*[’,Ä2YƯ•œÖ¶: ±5WOVÛÊD­ô“s&‹eG^D£GW’́¢Q ûÏúˆm(6̃n»Ñ[Ăm‡Q&³UçÀg¸Yüặ‘ÿ(^ë äj zƒÎ@u¹ZpC‚Û–rÜ60ÄèÄ[ÿx8àT†×Z®4¢½1¸ êÁóÑ1r]̃Îm}¤máôÚy|<|ÀY¾¿{Ǭk³±[u¬t¶5•»®¸4ŒƯ¶mjü₫À_F Äw¤Ïj’8BLà!×ʵ’[ăvzÜÂkä5ÓÛô}íôo›t«ô+́t)]ªYaX!®4ßn_ê¸ÍcpEkÄ(˜ŒúœpYz_»we³«ºˆ ÆÂBD1ûèÙ+Et2“´—1ÍjƯIƯiƯ9ƯeV×GÎ÷zq Êœ[ps»8Đ¾–2Z GˆM7b ÎôgGlN‹ÓÓŸ>;îù^sĐ>O¶ó—Ýư®ÉÍö̀²3̣¦ĂL™œhD´ồ²3ryˆñé& #Ñr;í:'³Nöªµ/}J–Đ#ˆ¬˜Å_µ u–ê5F»ÍĐg²©\ÉÁ‹/ü3‘N½Hs>Ø·ïäĐ ƒ—‰ư$S_~ù×?ùăÙï=w–QÎïă³—iƠ#§Æ‰Ö‰EƠcn$sh»y Á>Ñ­2¯'›Jו›^̉½ ¾«×ø^Ñ»ă>̉}(|B™°Iÿ°°[8 èܹ|Êú*̣|¾Ü¼¨[Ù¥L̉«#¶¤úh…ºsI…5áÊMàHµTDLbI„<ªÑC(Q +ŒX Äà_–pЧ¼¯×äù*³‰ïµË̃/&9ájôƒ¯g«Í&çŒeç1œ©Ö&¬×Ç¥ÿp¸(6BP‰*³ŒQwư+§ÔZzË ÷ükÇàÀ‰~O©ƠY$ơ¿µk÷™3»tFX´{̃üơ§×L?7¨Sx<¯Hp„hÅÓonßñæiFľ{û.äz¦ßïóëÄ6øj­ÂÓƒÂLXà_ßñ¯ŒtÀ=₫;Ë¿[ư”ï.Ü[ö£̣_({ªÜ₫dŒ́)Ù̃_"(çK6HY›M®WƠeYY†g²e8s8ÑoBbˆ¼eL$W42 QQêó‰Ïè o‰U<'^Ñ?®4”>í u‡4§CçB—CBÈ76CDΦ q+\z±SËt*y5²Đ׬²#;Ö¯(}©@D¸ÀYÆ?;ă}س%£zV¡Ú¥T”BÍE:đNIæôäÁ5œ¤×u₫ùÁ"üê\×™]»Î0 ¯íf=øÅo2=J₫₫!G™lÙqúôo¾©èóÖÜ,Ü…«½Kṽc!eÆéâJ韤‡¤ê~́Đç*dœĐ«êé-àê§ñ°#ËFơPÆÔ”§Oăº 
¢q“ÅÉ?5¤Ơ›‰œ›˜_€¸NLÙp3ij;’D«₫²êưcÀηÆfÄ̃å˜.æ+xÄ›¥́…‰@*\C7?Xg´ûÄoÊÎ₫µ› öœ]í¹£§Å-åf0(uÖPHñUXJŸüisËư>‡hqĪ|5{N’ơu¿ƒâÿ… 1 ‹Î<6g©ßáÓ;b₫ÖưƒU¼s$»‡>¯â7§Óg…AœeMä/̣ƒÎTn=•®‡6XÑt | æ¿Ơ¾îx­áwÜïÔ½ßđ'Ç…ªO¾t|Vơ·ÉäĐ¹µuƆĂåvƠº¢;«[Ms7×®¨]™ØX{_â¡Ú‡?sö8ÅGGCôFC¼$V8N”¬̣{­½+g"TUiÊ'X-9‚‚Ư—˜4)b4}¤úˆf¬%ä‡rná„Hú9#ÓƒŒ¹]ú'›K”¸"2Û%Ư¸Êm«KH‰¯¹Q/è ňéuÊqÆ%¢ĐÅIü¢*ÏyƯY·3ºÒbuWơ‘H^ÊÚ)œ[à(đÔ¹BL ‘0ZR^ºS̃x¼u“®ÉK".ăO$kCBଷsTKÍsKe1â8N¦÷$œUbîóéÁƒ³· §m³gooỒ¦ërÖ…Ù©÷X#;N´jÙëµ¹đ ­&¶Å69qSmr¬©\–¶ KôC,œ̀ÊÚbqw¿*(¦²7‹ß©ËbàVç[T˜¯jÎîQN³́½^í̀-[§%&}àPÓ­ ~÷Ê+—™³đû<±Ư«ŸÜwằÁW¼₫̀£…xÔíA¿Û—,ª¯NçZ̃Ø=×®zziÔiñŸÅáë*MmlVQ®Z¼½“:€ØV‚ÉVÂkr₫bøôIñ¨ø¢ø–xAỖiyÀ²Ọ́”åeÓ;&ÇÀti Y'» ̃%6§Ñe·Ú́’SëË)é#?•íÁD~¾>Aèr">“óAMù…́,+3Ă…‘—!×–Î]“{2W‹À‡½cØA}?…¿fù,£Î†‰À*/J¯Đ¥¼_ñD“Éo È ̣~EU¦™ávçèWT…Ơ#ß·¸]ˆîsiêÁÚ kç¼\ă4Û¼æđÿYûèAÎl¾‡u†°ˆMîÿqƯ¢ña3û–Bä†̃@+X ×-ÄÚq¶c›°p%Î5GƯ´ØMü«‘¯À9†œ£!jU^¤ÓÔ©Ev=†)ªÎÏ„£EÄmu†# (=̃D(´Œ ›U猦pÀăfgc‰Í6œÖ=#°&°'“2S•SKeíù¦Ûaf¹•E"³Å6<‚Œ.9˜Y‡ÆIçTZ^™†uN?7"CRú¼úÆ3€e5?ï›áË ÿÇ^Ư$ÏR(A˧½±ŸwĂ%~ŒØ´·±u ̣ÎØ:såóW¡#³>H°¯ÜbÄÈyÜ~²_:àÂbØf X,akOn R+]ă¸.³¯p®ˆÂDÏ8$9D˜J ƒ²Ë f›¹Â,˜§qƠ@QÑ.)›(>mˆd‘A™̃ŸƒLwå|®ø'j¤D!{¦¼ ƯsÚÙÓN Kv'#1€°Ăét8œ‰€¨8¶„($D£.–pö‘•²ÉAö”ư]°÷“•à FÙ,Kd¬´ZÚ'½)i¤ä™QyŸúˆsư_„,ÉṬk₫G³/_…Wù*¼ËŒ¨pü êB»|ú&NöâÚëºHU)çºI’½™#˜3jj¦(ç¿ -¬6Ö<†=Y,”É?+vyqÿ̀ÓG¹x @m´Ó½Í}Èư+÷Y÷ Û°vÓÓT0h .¯Æë*¦%bW‘§VSëºVs­k®f®³ƠƠêk-¾¬̉,w-ó,ó-+̃¤¹ÛµËưCÏSt¿æç®}£ô¸¦ÏƠíyÎ÷\ñkîW<ï»Ïx₫Ư}Á7¹î8»ă-¾-ÅÜÇƯ/k_v₫›ụ̂‰çoô ÷ßNæ"]²Ç=‡Êă&VQ–²tT¶Ù«hyZ6kĂ ]Äơ’Só¨3¡e¤®±Ờé­XÅ/ăÊ%Ă]̀»x?w13æÊ’ÛS¥•]ƠÚmZÊ$¨ö9%Y3æóöö¡½û"“ hgBøà"í‘øg±Hđ”¤I}Æt£ç¸‚Èß̃¾ö I›«ªü—ĂÑbƒÏ ± #4d]D/EÂ(n×l¡€L˜đạ̀c}Ë– 1¾®ÚÙ»¤oÛJF₫ˆ!½Å„æ\ Y3ô6êøúxö,]ëíJœ¥ộÎ=$Q©Ö>×NŒÆ.$wH«#«c _"/Ù~'ư.̣źơÊ«^l´À »¢ÂƠ8ÓXaúçÂaE€&HÂÀ…̉‘‘D8á—¨Lä'b‰̉†Dc¢:Q•HȆÿ¢̣̣¢T›¶ª” 7>²±—EÆø‰¸sr´à&Œùÿq«v5 s%Æ÷Æ/’xºÈăEmÖ¼ •Œ Íó5‰¢_,Ơ%tơưĐ'o2ˆđÆ₫ï³”hg̀₫L€3ü3‘”‹^Û…Œ€êúÁ;ÿŸ[Ú-å–₫ßbé? 
²ỗ+Ř{ѶĐưcO Yw…P€³M`beCb"̃f 2̃ư “%ˆƯ5$ lưGí^³µj|_ú“tU>ü*#p^6J¦”=h’RăÙGA¦¢Ç.º=uvÜ2ëëƒR0«±&×"̀j¬ ØĐ‡V#c&̀ˆyáº*+Z•N_ ÎÆđîJ†h£+©nc_úT¯ÍÉ(ß§d3zbI´"̀úJơ '̀Ù× ¨Úê¿‚Q\£ûÈư…N+̉ÿÊ&E×à±Áă|¼ô[…ä₫Ág̣ÿ!ÛÏ–É[¦Ї,6Ÿüfp›̃mV_GM|E¡uƯz<˜^kà1ŒFs‰Ø•Y•ă6à¬Ú9xŸfΪJ̣<"à•¼Ñ¸9â©&ƠöéfÙó…ăÿFMFG‹cjt9Yn¿ÛqwôAǃÑcö₫èËÑßG-85¥JÉ^éPp Ù\1„̀¢ÁÍAÜ £Ñ@4Ƥ)ËφÚ ¯t&@­v·Âh$ÀT¦à–ăë! ®6%êwT2Ùƒƒ̣EE\ø Z‹:*+ñ¨3‹Úqúq‚äR‰’€!¨•Œ ơ œ ¿g4e¨O~¢t\"/µ@pF® ^f'ÓªLä˦ k×hÏi/kuZßø̉~¾+i_kû—¿ ‘! ùQ9-·Êù›Ö-Ê¢üukñ7Eˆ2—¶Ñ©ơ[̉T„"Iæ;a_9¼F½ĐÛ7ú‚~³ËÍ¥Ö’¹dæZU¦Åæ,øïñ±Ç…µ‰×fÉ́2̣Åy:=¬ !\Ă#¶F_₫Œ£Éÿ.‹F"&‹HO¤?súo ‚†Gô\-NTtó!Ó䨰:6kÔm!T¢a³Åi6[̀9ÔBÜfC,Ö0xë ›rD̉®IXÅ”¸Ñë|îöƠ9$ÇçƯE¢»Aåw¾0$d0qøkJ¸Ô)Ê©̣̉œ²m—+îâ…îÿ́Áơ*³Dy¿V:Í$N\$£-V©&Cß‘̃袵œßbèºÏ•Ă]ËÀ¤ơ\«V }qó¼„&èÍÇ`†È1l£ñ²Ù¤ ̣ơŸeé/2:¾.đOnAgàẻD}©áƠüû–¡öŸpùàúGèyá/˜«Đ$ç™h€RÁLôö´(’œyÓzñSÇ{.DÄ>Ńă§G?ïÅ͆đoƯ}Æ?u§~¹‹û÷v•°0́ë}o¿1èÎơu¡›çÎói>¼Ñ“ă “?¨¬Éûô¼Æ§Ö$%Û ÑXX%Äơ¦Oí¿ÎÁÂe« Ö“Oơ¿¦ïa5^Ù¯Tă³‹ê7÷²«1~è³Ă¾íï¾ï º‚F:ÿƯÁó幃"V$Vt=äêÁJ èVú/¼&×È2¸+§×ü’M¡‡|j¥!J7üN¯HàÉcüc«¬íJ[\”¸"œ¬O®ö}rUóqæ“«́»̀Z€{üó₫s5ùŸ£Øï§7̃ÊÜ——Û”>3°[ó…~/^²/%möMzÄs̃HŸIÿ«æ 5|ø'h̃`ßưe)ø ""‡ ƒäÚNÆÀyØNâĐÕ€àCŒÙ/Á;pHđ|Lä R ‹—zŒ8à÷`‡¹Đ ?Vx6Ă*¼c?´1(‡åЋРÇ`̀À X oÓIđ¿H’}å’<Ûa ̃q̃ñ{¸æÀ 8'±6.¸v`ÜfŒ= ?€yxZ®ÅRwÂE²“&Éc˜Æ¦óg%Íœ†Í~¼O1ưªa¹è<Ơ|InÄZÜÛÈj^k̃,ä8Ia9Öời<†p3tC&À¤˜Â$|–5đù>åCpk2 Ÿ«ïb5Z Áô_đéß'¤sÙƒơ^Œí®‡Ut6Xp í‡s˜—Ÿ€A+¶b–s3‹›~’Ä2“ä ä0•9ƒmw–y Ûåm¸H“é¸s߉åÁ¾³;ɲXíoÖ+÷b,u'>%ƒû̉̉SXæv?Áë,}3‡Í˜sʱƠ,Ç6kÅû°|¶a0˜…mÈkÁ¡Ÿđfl­£$»àMØ”₫Hè·%÷f€Ùđ l«Ư°æ±áIóh³ÈüȽËRóßWù¿úG—e%Åç{‚X±̃leÁ`́¯ăGÉ ²âÈ`m”i¹L+)-uï¬Â‘» ê°gÁ ¼ă«“ØV™öܬ¶g¦M•öÜ8Ô–(ÀÑÎúô÷¼| GÜ Xƒs’…găq|%áA¬}¦3A€p|'Ó_âóÔă¶Y>åót)–ø6Ÿ£mØl†>ơX‚ăæÖa1–IŒ] ‹°×ºÈq˜K40™Ü]¸Xq¤ÔĂl˜J±î¿ÅzÏÅ>l† ¤};6đ‘܉æÇû!†ío‡»  Ka5`kÅThMë Í]˜Â‹5Rjщµ(ăơhƒĐ€†÷Ư\Ưn¬ïvl»M8®nF׉W× ¹ÆCïßÀÖ‘§°₫wás̃“!‚¦s ¾ ùđ=¼ë¼›­&'p=8‚g‚?cƯw¬Â’wáü‡ơø:j¾óÍä"üvh)Â-Ïàr{.ä»pU₫ï†&\ªÿ `*ÈùgË‹Vtíç@×iË‚}ĂàÆ<}ˆ%D€\LÄ­9ô‚‘äë ê Ÿ(Ájæ*„@qèÁI„ÓgXƒacacááưE<•§»„Fpa₫#ÓhùƯ!„¬\Xh1†ăU1̃SŒi‹1ô,Ú„ßÁâg lC8©ÆEù`̣Áż¢XÛ ´SÜgE;$D{¸è®rµ¾Û}:F̉­Ø[±Ư¶²BÙ$®À˜”bÂ!­p M "4Åh¢h"hÂh°… 
ö̃v4ÛĐ<‚f+‡Ñtao8ÅOÆé‚êƠƠƠÛªŸ¨>T}²Zœ̃f!](‹àvăÙY²üơ6ªù`&çö³Ü^Çm™ÛÙ?ß|a¾ùƠùæƯóÍ;ç›[ç›§Í7Oo®˜oî#‹dOÜü~ܼ=n¾)n7WÇÍăăæ’¸¹̃NÚÈ\0Ă¯¸ƯÀíJnG¹Gæö˜Áø<™xRt$rèĂHŸ†ô„¾é3 ó]åjâ$Xà/Cc#ËBeJH¡âäGNh0˜C€Äå2ưkúzY?Q_®£/Öécú̃i 6ƒÅc ƒÎ 1P˜„›g´[§ÎƆÙî·QfS…´K‰ÂTèv-´eVié~a1´, w>+ÖGÄoîÖÆH·Ô-³¼Ư5ñ–>}zfwm¼¥Û8c^ëaBiĂ«nú`Ù­}$Í‚¾è–«*)û₫ր궵±{ZkÈÖ­mà¾3åMIuö‰“›®b-Tí¬·̃́Wœ¬&yƯ?l™ƠÚưL^[w%ó¤óÚZ°åf…ç·£µtBsÓ1ZĂœ¶ÖcâfZÛ<“…‹››Ú†ÓAĂ›A„9<„Y:J¤5,]s”tA.8"ƯáI‘æ¦Ă‘H&Í$f̉È4ËF¦YÆÓ,SÓJHVư9ˆđ4ư¹+̉¿A‚«¦ÉjÍ¥ W¾D₫‘ćói‡76/5/Œ5/EXØƯuçro÷æEáđ1h$ï°¨p·P¸pÑâå̀½uiy'¶´©»1Ö>;¢¸‡2Å.½ơ*™ƯÊ2+eeMyö*Ñϲè)¬¬gYYϲ²¦ÈSxY|Ôă°4@C[ă|Åí¥&đÂ@¤­Ám[SÇGs"â½/Đ¯̣s0ÅÛºsb Ưf5¦~L=‹ÂYÆ¢,lU£¼÷%"~̣s5ʆÁöXx›W4á¿£Cơ|ĂGGÇú[:né`.ÿw¬ß€Àßúw@Çz¦1¿>‡ïo!\ÙÚÜ…đ0_£…¶ơ s@Ç`¹­gÖpæC¾ ˜3éÁSĐ1úÇFFÀ́:6Îy€uØtŒÄl€URÍàÿ+´$ endstream endobj 40 0 obj <> endobj 41 0 obj <>/W [3[600]11[600 600]15[600 600 600 600 600 600 600 600 600 600 600 600 600 600 600]32[600]35[600 600 600 600 600 600]42[600 600 600 600 600 600 600 600 600 600]53[600 600 600 600 600 600 600 600 600]66[600]68[600 600 600 600 600 600 600 600 600 600 600 600 600 600 600 600 600 600 600 600 600 600 600 600 600 600]95[600]178[600 600]181[600]]/Type/Font/Subtype/CIDFontType2/FontDescriptor 40 0 R/DW 1000/CIDToGIDMap/Identity>> endobj 42 0 obj <>stream xœ]ƠOkÛ@đ»>…-=XÚÙµÁ¼KK!‡₫¡IK¯²´ †ZsÈ·¯4/@ ùÇQâ÷Fhwï>ƯMç[½û¾\ûûr«Çó4,åéú¼ô¥>•ÇóTµ¡ÎưíơÙ_º¹Ú­ß¿<ƯÊån¯ƠñXï~¬>Ư–—úƯĂĂïÍûj÷mÊr׉„Ÿ¿ÖÉưó<ÿ)—2Ưê¦ꡌëŸú̉Í_»K©wváÛđáe.u°÷-¿AÊÓÜơeé¦ÇR›ơ…ăçơ…ªLĂgåU§ñí×#ÜĐÀF'¸aÏQ78á†̃Fm7 µpCá(À #Göïiä—hnl9R¸1p”àÆÈQ†…£=ܨàÆÄQ7f¬ÙDkĐÈ&Úńl¸‘MVl 3KG…ƒ¥£ÂŒÁ̉QaÆ`é¨0cØĂf … .Ơ =\a 0À•×@®œ8á o€ØÀÆ-\á \á #\e9Qá*›ˆ ®²‰˜á*›ˆ{¸Ê&â®rÛ±ƒ«,'rϦ²œÈ=›ÊrâWY¸Ê@"pW+ 71p©fb ÉpÉnb 9ÀM $ÜÄ@r‚›Hz¸‰d€›^¸‰Û–nâ¶µ›¸mµ=ÓÄmk€›XF¸™ÛV›Ù—*Ǜ¾4ÁÍ́K3Ǜ¾t7³/=ÀÍ́K;¸™})Wofö¥\½™Ù—rơff_:ÂÍ,çd7Ăö´°‘̃ ÛÓÂF–n3l£í‘ưïÙ¼=½·ƒÅƒ₫yYÖsÂN; ¶sà<? 
æë¼]U¯?Ơ_©́–B endstream endobj 4 0 obj <> endobj 43 0 obj <>stream xœí{ tTUºî¿÷©)I%© !#ÔÊ@¨$U™B NfBB A"$ V@DE…8kˆÚ8÷‡vhl©Ú¶6ôƠÛíÄ“Z¤ oÛmp‚Ô¹ß̃UAè¶ïëµ̃»ë­ơ–ụíÿ÷Ÿ½ÿŸ}’ 1"¡^RȲ|ĂzÇăƯRƒ‡‰L +zV^ü‹_]=ùgˆŒÿ±rÍ•+v”ƯPJdYXVu/íz{ܧ‰7`̀¤U¨ˆû„=ˆ̣ (g¯ºxưwæ]u)ÊG‰ô¶5åK);đŸD¹h-»xé=q?2.#̣–¡Âѳ®»çÛÿm&ÊmDc“téî¦LH›²ŒlDZ(ÿ oBÛ‡46ê›Đ}¸¾F)‰~…öưÚgTM©MöŸ€ºûP~™]ĂÇs;qگˡƒLÓ}Ê’”Ѷ™ưMYư÷AC˜ïƠfQ ƯH› ´g1*]LWÓ]ôC–Ȳ´K´#d p×i?̉^¡¥h !ö¥Yw¶ #[麛v³"_÷êÈŸÂ×kí-2Ó­ô‹cNôµ…4¦Óë‘»g]₫ˆ₫£6ưnª‚¦Í`½‹₫Đg¬–Ôåê)̀4»öºö;2̉ ŒƯÊ\–ÅêÙª¼©|CzJ£Œ^LƯ´’´Äơ ¬<ÆÊX9«åµ¼“ß·̣—”{t×è6ae6ÓÏ1›ÈT6›µ²́-öfëJå0Áî·†êè<êÄưnÁJ½"­>B#ŒÁ‚,À®a±íl?{¿¬´éfê>ƠVh7¸Ư$̀—“̣h:4´a}Ÿ¥]´£ßc:l/e>Üßuü<¾A)S•ó•«•~åGÊ!ƯBƯ³á²đ_µµG´´·µßkĂĐg¥,*¤Ù˜é6j§«°rwÑ£ĐúK:LŸ3«f—°ëØØ£́'́Yö{›…y<ß¡LRîQÓ1ªÛªûUØ~,<>¦ƠiÚiÜß2ºn·=FOÀăvC[ˆ5°óØ<¶ˆù¡ñ&v+{’½Ä₫Âu|1ÿ©’«¬U6*W)[•“ºƯFƯoơÂá{Â{4¯v),¾Eû¶&R:M¦FXz­†gôĐº6_9¿–ß(¯;p?çÏè瘗£ô:ÉbX¼đIƯ³z̉[ ÓMoœ:rzçÈư#†y8?Ü©µLmœÖ íÔ^̉hEä:¨~91u5ơĂk†°R¯Ă`­ÿL‡ôđ7+Ëf¹l[̀®ÅLß„¹~˜=†ëixÎN6„ë\ûذ˜ưẮ(û3;Åà¼<—{`ñb¾‚_ÅŸâ/̣—xX‰S2æ³RéÆœ^£Ü¬<{xKùLùR— £ËƠMÓuëîÖíĐưRwDwJß Ÿ£¿Ü`5ÜnØƯ9¾ƯOđau¼ ú9ë@ü›1ă?å¿â…ˆˆưÿ×­́Kz…UÓŸÙ¼üV\×̉Gˆ£…¼†}Oz”Mfw³G¸Â»Ñwm§G”gØÛüzºÑ_DŸ"e|+b·đqØ ïâ»èOđŒưˆ—Ïx̣û±̉i´_ÙÏzè+ö9»ƒá^ü|,­doÑv «¥5<Ÿ\´í‡‡á£WuL>öÛ•bïƠmåŸđ­́5đm̉æÛÙRÚỊ̂áoûÙù´“‡t“t/ÂKë¥èƯ ́JøæĂ\GỌ_Áwgs÷!z·#Nª`ơZO5l1ö%‹!+»̃~"óVسƒv°% ®zíy‰¸~¾•î‡y{(›~¬ƯI¿`ËÇ»Y,bïÑyÊ ƯX<1ëÆëë4^Fïhóè5́Xå]I¿g·aߘI¿c)ô¶F+ƒ7î×:`ç ´æë«ô6́ÆKùú¥q»á]C¥¡ØÀôơ]úưl}~²¾XŸ¯wêÓơ‰úXƯ1Ưut¿Đ=®»±[¤«3+ïbÿPPnSÊŧÁ'Ç+:₫5ÿ+ÿ˜ÿ¿Ă÷ñ§ùf„•¿×^ÑеéÚdmL8>~)ülø¡đÖđá̃pOØ?̣̣é?>xzàôØ#ï`ÿú%{-| π˴EÚyÚˆ·dímzø0Û‚{̀¡ÄרWïÁº<¹mǧ̣™̀Ba:IĂ˜¡·Ñ¾‡‚]N~Z`h£&¬w."óú¨7vc¯}%k•„'€3~Öd1q́̉yx̉¾LÏh(ó¡c@Ë“üMæ?FyØe.Áói6ư‰Í Opí¦Ư#‚í)Ă“`ƯcxN~¨œ‚Æ=t¯Ó[uøü°;´óĂçcO»öè₫Ló‰ÔóÚ.˜ßÖÚ2¯¹ină,ߌé•Ó*¦N™\^VZŔơ¸'æOÈËÍÉve9vÛøq™éi©)c“Ç$Y-‰ ñæ¸Ø“Ñ ×)œQA«̃ïæúƒº\×̀™…¢́Z¥gUøƒTƠŸÛ'èđËns{ªè¹âïzª‘ê™̀⨤ÊÂGËÜ_ër ±EóÚ‘¿£ÖƠáËü™×åÊB< N'F8ểVƠ:‚̀ï¨ ÖoXƠW篅¾¸ØWMwla ÄÆ!‡\0ƠƠ3ÀRg0™á©uœLñ°*˜áª­ ¦»j… A%§niW°y^{]m¦ÓÙQXd5Ë]Ë‚äª&ºeª‘4ACMĐ(i«ÅíĐm‚}}·Yh™ßmîru-]ÜT–v«¼µÁÔï§}[„̣¤ö›ÏnÍTúểV;D±¯ïfGpû¼ö³["í而å9ơ₫¾zPß.f1ÍC„ùâV"7Ơíª5₫ ÁWµkUß…~,HF_Z®tfd¨{´£”Qçèkkw9ƒ¾LWÇ̉ÚqÉÔ×rå®tƠ‘~nKaÁ€Å™Í„ÄhÆv¦ûL›̀Éî"7»å̀t2a‘kÜ èXî€%í.ÜÈ‘tO¡¾åSĐ 
Ÿ†QÁ.,Ăê`L¿ÏR!êÅø >Çârô$,»kø/çÖ,Ör,ˆä¿ˆÁ‘bÔÁĐ>ºƯÁ‰…_k°°q†,—lâû]=¦Û1¬£Âƒ9w:Ū̃6¤̉2‚½óÚ#e-Ë$Ơăîr¿hÙ7Ú2v¾hém93Üï‚ûî&ñ¦16hÊ=ó“hIS·ª"ÈR₫›æîHû́V×́y‹Úu}₫èÜÎn;§iŸr¦- ©iW2y4Ç3Ù O\|¦³(´›ƒºü¤'w MpEYĂơA‹f$íˆu:ÿÅACÚq1Jo‡EÍ V¸Ï-O;§|yæ>ërù́¶E}}±ç>×4çcràÁøœ`‚̀ÉLI˜ïvü9Ø@Ϥ"a–ùí‡2vG°m"v–Ê´ăă•Áf„{0.₫*R½Ô•(Óx©tlN05'Y*OWNîI;z\t‹Íô‰25å-9A«̀§ä ¦[…VÉt& ưƒÂKåÿ̃†Dù“LÏI#K¥é4Em‘ûCE&¿¹ƯŸ¹´CDøÑç̀oäô:Å6¯Ia‘?µmˆÛ`“?ˆ̉k#‘éŒ ;ë J.³̀VXàBdΑëÂj„S:üĂœ¾)™.gǦùÅ®*'€ûs¢¹Ï¬+Ø:Q´æ:2±øs;0LAßzß̉!­w™ËaqơíQR””¾:ÿhiÏß–¬¿½~¹U`âT=àb·̀PÙ-­‹Ú÷XđJ}K[û ô5₫êl´µïqà9*kù™ZQrˆÍf˜„An’M™{T¢^Ùª“²¼|ˆ‘¬3Ö1Z>Ä#uY‡X¶ö³£RNhG¡8p'È•z‰ÂHÓT›ÁxOd½î¸B±ưqEá1FƯqFé¦ÙW¥¹çZNTΩœkù¢re¤’|•#•Å̃R«Óă´:Wêè´CÙwZƠÓ)rèöƒí×₫ (ÊmG©T®•½1†±{c-ffbv56i²¶2¤§=·"’!ÑbƯ©ÅkØ,m%jûTsÿö tö oK£µfÛ,»§đz;k³3û¶oWVNäĐ®̉ !~ÎSRör¤IUkZË́W₫¤đÅB¥P­)/t4”ºY¡hóxHH*đ©‰É¾¢"SÚä‚ô)éei|¢Ûc2¹¢ú!̃UTR&ËàRM†̣^×s®—]k³Ûïîu+n4́n-wÿ­lˆ·«É₫q=ăzÇơÛ7N?îË)EExkâ ÔX«×Â,éS[¦È5îÎd×Ă#îµĂkƯøÀ«Dïưa˰5iªÇ]9́¶ »̣n÷Ú©nÑiƯZ’E·»Ø[s¥êđNʰ•”—r䌲læµ!)ÏœœÍJÇ{²ÉfÇ&åv3·{âµ×²uk;im'³ÂÆÂ—œg*µF½J/«)©)©%©²‡‹çå•–L4©¼,×åb÷46ú|áW}ê,‡CäÙ#ÇzÖV5Í›^8£ñ²¯}{GưUW_₫){3ZT8U@ÉZT4•~ĐÿĂéS;¦^i±O)œÊ®»?3»9¯b9b`ö®î6åIrÓ1uÎëÁäĂÙG̣>Nú ùƒ́óN%ŸrÅ’c\|RR·ueR÷ØN™ qf–4+iN^G̉“düq¶1#=̃LzĂ˜ồs¼%Æ’É2‡˜swm̀Oâß́¶8ó1C¬Qá†gVœa®M,¬%½¼ÇvÔÆ›mlÜ–Q8fˆÏW“zrå:r½¹=¹ºÜô‚ÿuudéÖÎk ¯ëtÏy_©oä}Ëû–áN,ËT«•:µØKx:«“Ÿâ‹IŒH2Á6˜àë×Á"Ë!Ây¬˜í¬¼Üܼ\W–alrœu¹NFƒ åe$¢₫‰œ¼̀èÄñcÓæmºkçS/ơÎó.pMœ̃Ù₫âØ»Yö§óïVVº|³nhœ‘–Èô₫øº+n˰̀™1±vúùËoüđ÷̀î»_µ̉]ˆùOÙ́!5v(i(ùg™¿ÎÔÅiGƠYăle]|Ṃ¯ ‡ ï$¿“₫¡á£ä̉ÿÆO₫–t:ù+û×®ÄI†OZ¼:íÂŒ í+\?àÛ́ư®ǵ»¾IoÔ+qc²m̀$¢gbE™ª9=«¬×tÀÄ›ĐÀR~dSÇ—ËùOï(³Ø˜jëµñ-6fbij9©I.©Ndƕۉ%â¥đMR4x‹jN,ĂÆæÑ́¡́t¦uNKœmˆûẹ́8<´v¹ê}RÎÎü®́²£q,.#7ûrĤ_M£ºÊíczÆđ1j|bÙ˜ôœYkä»±¶#ïc¯ïÄ2Í91,–yÄí¶"";×®CƯ°XàŸÚÔt§OÜĂ®ä  £¥t%I98!Y®¶»ă¯ˆ_„wÔA˜p=ĵCjŒ5Ơg/@‚]æĐ ¤è.\ĂƯÉœ)̉¢® seQĤ)cá0p£nơéŸ9»mƯ‹smùSl¯mù"|„ù\ó›̉™ÇŸ<÷¯^u¿—]Đ¼¬8¹¢`¸œ–̣ú;,±½´ñâóº6´/\Ø9Ư ½Go¥R6Gu3S3ó2'gêÈe<Ñ’TJjœjæÿ¸7µÍRcƠѪûv»¯2Ÿ³ßÇ1@œX³® °Èă-.)%vîc!̉ü…*Û¬lµ®¡,[Ó¤¼IaIYv÷çxJê*,¥®Â‚K7MMkNó§ơ¦̉ ‰]11¼ËKnïIưûD5;œ^'wf”cCÅú/T–±W˜Ă5Ù́ X¶YvZöZtdi†xÓ¢³¤— 160ôđƒ÷-•ĂX?ás̃GnQiö­EƯˆÈ »£lºâ4‰]`¶[çwo·¥‘'yJ4ôEÜvåƯ,[l¯#[Eºq™H—½ºw~È7%Lîæe+æ°JQÇ÷†F÷Xö¹Hç̃4`ŸRà™fLŸ^8WT`eK±²÷#ÖËÙX5Ó`H1Ü7^ 
Å„́ë‹Ơµ&I›’’ú»âgÆî7°$&&"'£<Ö̉há;-;­Ü®êÄʉ˜Săơé™ăÆÛ́gÖßE‰̀@¼\Ư·¿‡:û´Eg~[5Z›n›e7%Y…2«ZƠ€Ä™SfMÈ–̣́–”IiË•̣¹äô2–Ÿ7ÄÆ«Î¬e¾!#=–L“×Ôḷ›zL†~́ön¼Z[“ÄâY³È)±Ùéwö8 Îô‰gE™X7a¬Ăk7gx8rÚ‰¤â$4ơï—rLd)iVÛ•jR¢…+à̉'*VY¬œ1ánr_{-!ÅC9Ù¢¦'ú¬"‘±o‰Ü¢ÑăŒ3Èö\ÿ¿¼̣Ú­U]¨ÓƯ¹mN÷̃sBV””Û{;g4–”L?oÍđëçF*†t•Xï₫Kµ<&ÉP”R¾Â{“÷^ïăE»‹^*z;æṔÛÅÄ|X|Âü…ÇËŒzcŒq̉ï$OC~½Ç”-¼£'.ÏI$±”ÈL®É4#¿ reO(÷Ô{n.¾¯øk̉ØW®Ø$}œbñ˜½©qÉæñiöô oRÅq·yßû½'ჩïU|íQ©̀›ª”™cIç6f;S̀é^^äÀÚ{EbÆAœc£̉,så±!['M´BÖ]Í­¢,¥lol´CÊÑ bôóqT«)÷‚\—GuQ!Ơ˜Œ¼²JÅ;Äרũ¢d¯·HqN6Úë6׫Së긽Ơ©®œ²:uRyƯÛÓ§WRỜ²Ô+,đ·£N…œ><̃ΈÍs&Ç©$ UsƯâÀgí±ôZú-AË>ËQ‹Á’1Ëøs>»g6Î q¶ñsí¥Ro©R*æ<Æé*+MŸÙ´%z`˜s¢‡vùhÀ~S)†k;ßwc‡öNßđÍ Eîk,/GïISqpŸûYgñ³V × gÍ 56̃W#’z‘Ô‰¤V$YbB!³£̉•Ni2^5%̃+K8rz²Ó}8̃½» R¾ÖÄZ£:¬rŒB‡ ­¾‘Ô‹¤N$îïút0q¬‡q1ấ*Ï&̣¸‚HÉ3È—Öȳ.5r˜)ÇÎ*.¹¢vRô,(µ%—T^^oŸè¼Ñ¼zỬÛ̃í¸Ï—˜•äÉä”$xnXpÇÜœ̣̣'¾lmíÜôFĂơ•cœ §X“s¦đ‡íö<+ °$—s÷¼K/²Ûâ|u¾ü’ ù)i22’2g]tɬ®̀q h*©I“/iâKDΙӮ[’Xỷc’Dxl~ JÈ_­‰zûV¹N¿•₫XÄö‹úQˆvÄ̉à_„èú«öí+¶ƒm£ƒ(|{Ñ>¥f¹R"B3¡OÄçjX11*¾‡ăF„~MăĐêÆ ₫†Ö±!C̉D6 ÷};{^î…Å5̀Ëw`'˜ƒi ?S[øaµ†ûTqOOÉ}€#c¡;̣¤ØÀgĂË×r ZH¯ưå1Nô,À}aŸhµ£ˆ̀öÜ}æJŒ¯ĂƯƃ×k)™HÁØĐ V¶ĐDÄ©ăU1+ú ÆTk¢2¹·Ô¢Ï}r7j_Âç3 a"塾–rpot×™}²ëÿ>øŸ…rçÿ9táŒ"Ó'ÿ=bp.̀7ˆïÈ}ïñ=¾Ç÷øßă{ü¼eóÛñî\I⽓…<8M’i₫ xÿXh¯®cä¼€"s>  Xø7p 0‘#Úw °-Ú¢'»¢‘đ ù.BgJ[€mÀvà8 'U ïK(±W5(a S° Đaè·¥c²fK´´P(Q§'a´¦êaˆ¦éÑË¡œF½E9E`;JG´#NQ¤_âpöB±Ú>å«]óZK¨ªRù¾†•_S3ĐôAà(€y@êQFpÇ_Cñˆ́åú½(ïƒ<í=¢ÇzŒĐN xV/Ñă8ú¯§Ư[²Gfâ­2srWEeɪdå$î­_¦‰H=€h¶;hN Ƙ布S+JªÄ-Ạ̀—j½­(ïׂy·¡Â4¢ñ ‡̃0̣˜NªÛ 0œÀüŸÀj *>œT!Y>œÛVR5WäèÔ₫9ŒÊç¢̣Ѩ¼)*oŒÊK¢rUT.ˆÊÖ¨œ•Ó£²2*K¢²8*s¢2+*Qi—̣³ÁÖ̉₫ª|å3Lœ_ù+ù1n÷c¸Q3̉³kúí@Øb¨_§¿@G »”/ùBOvè=.ơf*Ç¥̃ å#hùHêưèœ~`;ö”c’Uªr¼ç²ĐaÔƒơ F=ˆQ¢†ZàT 0 å0Z#Œ*á?"  ©p^@ôç”å%¾/uvåq̃9Øe÷À áƒpƒAØ~T9]‡¤®CĐu£aô!Œ>$u}[R”EƒJ—}Hù÷Á!~¹ËÙeO¬*Vj ¾Tƒª‘qYIÚ‡ô(ÀáQƠh­†’jô¨Æ-W“^iPÜ”‹‘•|•CNCYÈ ¥@Ê©Q9Eq–ƒ'KñB‹¾é{‚’‡RJy²”R6JÙ0Ó‹4#ó K!³—(cƒc̉¥;9ÑLQIÉ‹“ϧi²‹sW]C‰¿*N;ÇÁú<%“™ƒÅ%rXæ`}C4ƒư£Êª¤̣5’k,? 
G´+Éùc¢̉>h«¶ïaU¼«@đ#3fÛŒ©2c~͘3ÖÙŒé1ƒÖ 0Ă#̀đ#3üÈŒÉ4Ằ»’’Ô!₫ê`vé¶çù+tŒ¿¢Îç'Û¦?¦çÛtÇt|›rLáÛø1Î÷ö¹Ưà3,1 [ z»Ñg\b ·ơ>îSx“¢sØYTÇoߺ)₫¦Mñ‹7Å7n¯̃_µ)~ê¦øI›â=È;Ø_X%:>*Ó{ez—Hé”L¿’éQ™^ ÓJ™:djc•ƒñ3ÄN:§ă¾O :› †Ë t–Ù_`OoŒvöø óÔ>6èlX9è,‡X1è,†¨tÖ@Tívzíß8‡tLM´¿ç\gËÙh:§Úuƒöm²)ξÎé¶w;'Ú»"Ơ #¢FˆḉÓ;́…‘‚HÍü11cbú‡ØµÔØÿkc¿ßØï5ö»ưư¹Æ₫lc¿ƯØ?̃˜lJ2YL &³)Öd2L:7‘)Y|Ă¢@üE;Ù` ©Næ-\¤<̣oÎLœÉÿ<ŸcÂô>98F™Íg·V³ÙÁ}Ëiö2Gđ‹V׋·(¨wU³`̉lƯVí¾4mv0½uv°ũ¢ö!>=Ø[;ÛO0½E÷Ơvsevˆ̣%Ѽ|E4ß‹|C4₫ÁÉîÙCF­%8Å=;Ó|~ûcwv ä·@K[ûÓDƠ™â_¿öcöïÈR»ñJÙàKó%ͰN­¯ưÄMÏú¾CÚ·YÁƯ|¥j¶?k´×í¥F»Ë(êg·¢²ÿYc± ©L¼wvk{P‹ffcƠZ‹Û÷pŸ^W»‡Ï¢£}Oúvî«kơéÛq“gú!8}è‡ØôEûQèG9×/‹Ïự„ˆôË’ư²Îé7Đଫp:Gû4È> çöÙ~nŸí²Ïöh%̉ÇyVŸ1SÈ)û8ÇLù‡>YÿBŸ¼ḯó_>ŸîêÚtö‡í¡˜¶Aü—ßU× øƒ·mX•́]æṕ¡i,ư¾\ÿ²å«„\Ú=ÄB®îÚà4W­c eĂ?¶7ˆæWím¨kkØ v×¶¨-u®¥µ»VúÖœCwë(Ư€oåw([)”ùWÓïh^#›×ÁµFp5©M’«nµˆ¾æöUwÔ,È]<.^ïÏtvT§XzfȘæLÛ”ù<^ưŸ¢8wGĐ́ªÆ¢©°ª°J4!đES‚ø̀hSÚ¦iÎ̀çÙSÑ& ª­®jBüç®öÿ₫µ^~.ư>ÿJOm_ŸV·ºö́ÔîơîKñă¾́Œ"” ˜.V¬¿ÔMâ»Yf¿Àß øm~'¿ổQù"̃ªÄ[x¿b¨cëI₫ÿ…ü``ô-‘ u$j ›E„0ª'R6AI»tưeèqEäw|F"R¤f.sư’’oÊ endstream endobj 44 0 obj <> endobj 45 0 obj <>/W [3[600]29[600]53[600]70[600]72[600 600]81[600]85[600 600]]/Type/Font/Subtype/CIDFontType2/FontDescriptor 44 0 R/DW 1000/CIDToGIDMap/Identity>> endobj 46 0 obj <>stream xœ]‘ËjÄ †÷>…Ë–.̀eât œMK!‹^褥[GO‚Đ1f‘·¯ÑÖ ~à§ç ¿́¡{́Œö”½¹YÑÓAåp™W'‘^pÔ†”UZúßU¤œ„%,Ÿ·ÅăÔ™a&mKÙ{Ø\¼ÛèMßƯ·„½:…N›1˜Cơñ̀yµö'4€*B«ga_Ä„”Å«́7‹´ë2Ư@Î +$:aF$m´OaA£₫mŸRÑe¸®!³*`W¥‚̀ZDU7ÙTQ8ḍ:©{ÈäMR'Èä<ª¦„LI¥Æ‘ÇÔ¾‰u|Ëß­÷gí‰ç”äê\0~KŒiH̀?gg»WÑ0É«‹f endstream endobj 25 0 obj <> endobj 18 0 obj <> endobj 47 0 obj <> endobj 48 0 obj <> endobj 49 0 obj <>stream application/pdf iText 2.1.6 by 1T3XT 2012-01-26T10:46:36-08:002012-01-26T10:46:36-08:00Documill Publishor 6.3.9 by Documill (http://www.documill.com/) endstream endobj 50 0 obj <> endobj 51 0 obj <> endobj xref 0 52 0000000000 65535 f 0000012943 00000 n 0000086144 00000 n 0000115785 00000 n 0000143154 00000 n 0000000015 00000 n 0000000146 00000 n 0000000277 00000 n 0000000408 00000 n 0000000539 00000 n 0000000670 00000 n 0000000801 00000 n 0000000931 00000 n 0000001062 00000 n 0000001194 
papi-papi-7-2-0-t/src/components/vmware/README.md

# VMWARE Component

The VMWARE component supports reading vmguest and pseudo counters.

* [Enabling the VMWARE Component](#enabling-the-vmware-component)

***

## Enabling the VMWARE Component

To enable the generic VMware component do:

    ./configure --with-vmware_incdir=< path_to_VMWare_Guest_SDK >

from the component directory.

For further information see the VMwareComponentDocument.txt file in the component directory, or the ComponentGuide pdf file.
papi-papi-7-2-0-t/src/components/vmware/Rules.vmware

include components/vmware/Makefile.vmware

COMPSRCS += components/vmware/vmware.c
COMPOBJS += vmware.o

CFLAGS += -I$(VMWARE_INCDIR) -DVMGUESTLIB=$(VMGUESTLIB) -DVMWARE_INCDIR=\"$(VMWARE_INCDIR)\"
LDFLAGS += $(LDL)

vmware.o: components/vmware/vmware.c $(HEADERS)
	$(CC) $(LIBCFLAGS) $(OPTFLAGS) -c components/vmware/vmware.c -o vmware.o

papi-papi-7-2-0-t/src/components/vmware/VMwareComponentDocument.txt

PAPI-V VMware Component Document
Matthew R. Johnson
John Nelson
21 November 2011
Revised: 23 January 2012

This document details the features of the PAPI-V VMware component, and more specifically its installation, usage, and the pseudo performance counters it makes available. Making this component possible required extensive research into the counters actually available, as well as leveraging the VMware Guest API [1]. As the first of the PAPI-V components, it steps into a realm of performance measurement that was previously a new frontier, or unexplored altogether.

Installation:

To build PAPI with the VMware component you must go to the PAPI_ROOT/papi/src/components/vmware directory and configure with the flag --with-vmware_incdir=<path>, where <path> is the path to the VMware Guest SDK on your machine.

NOTE: The VMware Guest SDK is normally found in one of the following default vmware-tools paths:

    /usr/lib/vmware-tools/GuestSDK
    /opt/GuestSDK

e.g.:

    ./configure --with-vmware_incdir=/usr/lib/vmware-tools/GuestSDK

After running configure in the vmware directory, go to PAPI_ROOT/papi/src and configure again using the flag --with-components=vmware, e.g.:

    ./configure --with-components=vmware

After running the main configure script you can then type make; the Makefiles have been generated automatically.
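The two configure steps and the final make described above can be sketched as a single shell session. This is a build recipe, not runnable outside a PAPI source tree; the GuestSDK path is one of the common defaults mentioned above and may differ on your machine.

```shell
# Sketch of the installation steps above; adjust paths to your system.
cd PAPI_ROOT/papi/src/components/vmware

# Step 1: component-level configure, pointing at the VMware Guest SDK headers
./configure --with-vmware_incdir=/usr/lib/vmware-tools/GuestSDK

# Step 2: top-level configure, enabling the vmware component
cd ../..                                 # back to PAPI_ROOT/papi/src
./configure --with-components=vmware

# Step 3: build PAPI with the component
make
```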
If at any point you would like to uninstall PAPI and the VMware component, just type the following from the PAPI_ROOT/papi/src directory:

    make clean clobber

To make use of the VMware timekeeping pseudo-performance counters, the following configuration must be added through the vSphere client:

    monitor_control.pseudo_perfctr = TRUE

as well as adding the --with-vmware_pseudo_perfctr flag during component configure in the vmware component directory.

WARNING: If you do not enable monitor_control.pseudo_perfctr on the host side but do give configure the --with-vmware_pseudo_perfctr flag, you will get a segmentation fault when RDPMC tries to access protected memory without privileged access. This is expected behavior.

Available Performance Counters:

Below is the list of performance metrics available to PAPI through the VMware component. If at any time you would like to see the full list of counters available to PAPI, type ./papi_native_avail from within the utils directory.

It is important to know that VMWARE_HOST_TSC, VMWARE_ELAPSED_TIME, and VMWARE_ELAPSED_APPARENT are currently the only true-to-name register counters available from within a VMware virtual machine. Any Guest OS running on a VMware host must have this access enabled from within the VMware vSphere client managing the system, for each virtual machine that wishes to use the VMware component; this exposes the counters to the virtual machine. All other counters in the following list are software counters exposed through the VMware Guest API [1].

Event Code | Symbol                  | Long Description
-----------|-------------------------|-----------------------------------------------------------------
0x44000000 | VMWARE_HOST_TSC         | Physical host TSC.
0x44000001 | VMWARE_ELAPSED_TIME     | Elapsed real time in ns.
0x44000002 | VMWARE_ELAPSED_APPARENT | Elapsed apparent time in ns.
0x44000003 | VMWARE_CPU_LIMIT        | Retrieves the upper limit of processor use in MHz available to the virtual machine.
0x44000004 | VMWARE_CPU_RESERVATION  | Retrieves the minimum processing power in MHz reserved for the virtual machine.
0x44000005 | VMWARE_CPU_SHARES       | Retrieves the number of CPU shares allocated to the virtual machine.
0x44000006 | VMWARE_CPU_STOLEN       | Retrieves the number of milliseconds that the virtual machine was in a ready state (able to transition to a run state), but was not scheduled to run.
0x44000007 | VMWARE_CPU_USED         | Retrieves the number of milliseconds during which the virtual machine has used the CPU. This value includes the time used by the guest operating system and the time used by virtualization code for tasks for this virtual machine. You can combine this value with the elapsed time (VMWARE_ELAPSED) to estimate the effective virtual machine CPU speed. This value is a subset of elapsedMs.
0x44000008 | VMWARE_ELAPSED          | Retrieves the number of milliseconds that have passed in the virtual machine since it last started running on the server. The count of elapsed time restarts each time the virtual machine is powered on, resumed, or migrated using VMotion. This value counts milliseconds regardless of whether the virtual machine is using processing power during that time. You can combine this value with the CPU time used by the virtual machine (VMWARE_CPU_USED) to estimate the effective virtual machine CPU speed. cpuUsedMs is a subset of this value.
0x44000009 | VMWARE_MEM_ACTIVE       | Retrieves the amount of memory the virtual machine is actively using in MB (its estimated working set size).
0x4400000a | VMWARE_MEM_BALLOONED    | Retrieves the amount of memory that has been reclaimed from this virtual machine by the vSphere memory balloon driver (also referred to as the "vmmemctl" driver) in MB.
0x4400000b | VMWARE_MEM_LIMIT        | Retrieves the upper limit of memory that is available to the virtual machine in MB.
0x4400000c | VMWARE_MEM_MAPPED       | Retrieves the amount of memory that is allocated to the virtual machine in MB. Memory that is ballooned, swapped, or has never been accessed is excluded.
0x4400000d | VMWARE_MEM_OVERHEAD     | Retrieves the amount of "overhead memory" associated with this virtual machine that is currently consumed on the host system in MB. Overhead memory is additional memory that is reserved for data structures required by the virtualization layer.
0x4400000e | VMWARE_MEM_RESERVATION  | Retrieves the minimum amount of memory that is reserved for the virtual machine in MB.
0x4400000f | VMWARE_MEM_SHARED       | Retrieves the amount of physical memory associated with this virtual machine that is copy-on-write (COW) shared on the host in MB.
0x44000010 | VMWARE_MEM_SHARES       | Retrieves the number of memory shares allocated to the virtual machine.
0x44000011 | VMWARE_MEM_SWAPPED      | Retrieves the amount of memory that has been reclaimed from this virtual machine by transparently swapping guest memory to disk in MB.
0x44000012 | VMWARE_MEM_TARGET_SIZE  | Retrieves the size of the target memory allocation for this virtual machine in MB.
0x44000013 | VMWARE_MEM_USED         | Retrieves the estimated amount of physical host memory currently consumed for this virtual machine's physical memory.
0x44000014 | VMWARE_HOST_CPU         | Retrieves the speed of the ESX system's physical CPU in MHz.

Timekeeping Counters:

The pseudo-performance counter feature uses a trap to catch a privileged machine instruction issued by software running in the virtual machine, and therefore has more overhead than reading a performance counter or the TSC on physical hardware. The feature will only trap correctly if the configuration setting is applied as described in Installation. The timekeeping counters behave as follows:

VMWARE_HOST_TSC - Provides access to the Time Stamp Counter on the host machine, which counts ticks since reset.

VMWARE_ELAPSED_TIME - Provides access to the elapsed time in ns since reset as measured on the host machine.
VMWARE_ELAPSED_APPARENT - Apparent time is the time visible to the Guest OS through virtualized timer devices. This timer may fall behind real time and catch up as needed.

Usage:

After installation of the VMware component, you may use the papi_command_line utility, found in PAPI_ROOT/papi/src/utils, to read out an instantaneous value of any counter from the above list. To read out a counter, type ./papi_command_line COUNTER_SYMBOL_NAME, e.g., to read out the value of the VMWARE_MEM_USED counter:

    user@vm1:~/papi/src/utils$ ./papi_command_line VMWARE_MEM_USED
    Successfully added: VMWARE_MEM_USED
    VMWARE_MEM_USED : 13074
    ----------------------------------
    Verification: Checks for valid event name.
    This utility lets you add events from the command line interface to see if they work.
    command_line.c PASSED

For further usage of PAPI and its API, and for how to incorporate these counters into a program of your own, please see the PAPI Documentation [2].

________________

References:
[1] VMware: http://www.vmware.com/support/developer/guest-sdk. Last accessed November 28, 2011.
[2] PAPI: http://icl.cs.utk.edu/projects/papi/wiki/Main_Page. Last accessed November 28, 2011.

papi-papi-7-2-0-t/src/components/vmware/configure

#! /bin/sh # Guess values for system-dependent variables and create Makefiles. # Generated by GNU Autoconf 2.67. # # # Copyright (C) 1992, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2001, # 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 Free Software # Foundation, Inc. # # # This configure script is free software; the Free Software Foundation # gives unlimited permission to copy, distribute and modify it. ## -------------------- ## ## M4sh Initialization.
## ## -------------------- ## # Be more Bourne compatible DUALCASE=1; export DUALCASE # for MKS sh if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then : emulate sh NULLCMD=: # Pre-4.2 versions of Zsh do word splitting on ${1+"$@"}, which # is contrary to our usage. Disable this feature. alias -g '${1+"$@"}'='"$@"' setopt NO_GLOB_SUBST else case `(set -o) 2>/dev/null` in #( *posix*) : set -o posix ;; #( *) : ;; esac fi as_nl=' ' export as_nl # Printing a long string crashes Solaris 7 /usr/bin/printf. as_echo='\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\' as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo$as_echo # Prefer a ksh shell builtin over an external printf program on Solaris, # but without wasting forks for bash or zsh. if test -z "$BASH_VERSION$ZSH_VERSION" \ && (test "X`print -r -- $as_echo`" = "X$as_echo") 2>/dev/null; then as_echo='print -r --' as_echo_n='print -rn --' elif (test "X`printf %s $as_echo`" = "X$as_echo") 2>/dev/null; then as_echo='printf %s\n' as_echo_n='printf %s' else if test "X`(/usr/ucb/echo -n -n $as_echo) 2>/dev/null`" = "X-n $as_echo"; then as_echo_body='eval /usr/ucb/echo -n "$1$as_nl"' as_echo_n='/usr/ucb/echo -n' else as_echo_body='eval expr "X$1" : "X\\(.*\\)"' as_echo_n_body='eval arg=$1; case $arg in #( *"$as_nl"*) expr "X$arg" : "X\\(.*\\)$as_nl"; arg=`expr "X$arg" : ".*$as_nl\\(.*\\)"`;; esac; expr "X$arg" : "X\\(.*\\)" | tr -d "$as_nl" ' export as_echo_n_body as_echo_n='sh -c $as_echo_n_body as_echo' fi export as_echo_body as_echo='sh -c $as_echo_body as_echo' fi # The user is always right. if test "${PATH_SEPARATOR+set}" != set; then PATH_SEPARATOR=: (PATH='/bin;/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 && { (PATH='/bin:/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 || PATH_SEPARATOR=';' } fi # IFS # We need space, tab and new line, in precisely that order. 
Quoting is # there to prevent editors from complaining about space-tab. # (If _AS_PATH_WALK were called with IFS unset, it would disable word # splitting by setting IFS to empty value.) IFS=" "" $as_nl" # Find who we are. Look in the path if we contain no directory separator. case $0 in #(( *[\\/]* ) as_myself=$0 ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. test -r "$as_dir/$0" && as_myself=$as_dir/$0 && break done IFS=$as_save_IFS ;; esac # We did not find ourselves, most probably we were run as `sh COMMAND' # in which case we are not to be found in the path. if test "x$as_myself" = x; then as_myself=$0 fi if test ! -f "$as_myself"; then $as_echo "$as_myself: error: cannot find myself; rerun with an absolute file name" >&2 exit 1 fi # Unset variables that we do not need and which cause bugs (e.g. in # pre-3.0 UWIN ksh). But do not cause bugs in bash 2.01; the "|| exit 1" # suppresses any "Segmentation fault" message there. '((' could # trigger a bug in pdksh 5.2.14. for as_var in BASH_ENV ENV MAIL MAILPATH do eval test x\${$as_var+set} = xset \ && ( (unset $as_var) || exit 1) >/dev/null 2>&1 && unset $as_var || : done PS1='$ ' PS2='> ' PS4='+ ' # NLS nuisances. LC_ALL=C export LC_ALL LANGUAGE=C export LANGUAGE # CDPATH. (unset CDPATH) >/dev/null 2>&1 && unset CDPATH if test "x$CONFIG_SHELL" = x; then as_bourne_compatible="if test -n \"\${ZSH_VERSION+set}\" && (emulate sh) >/dev/null 2>&1; then : emulate sh NULLCMD=: # Pre-4.2 versions of Zsh do word splitting on \${1+\"\$@\"}, which # is contrary to our usage. Disable this feature. 
alias -g '\${1+\"\$@\"}'='\"\$@\"' setopt NO_GLOB_SUBST else case \`(set -o) 2>/dev/null\` in #( *posix*) : set -o posix ;; #( *) : ;; esac fi " as_required="as_fn_return () { (exit \$1); } as_fn_success () { as_fn_return 0; } as_fn_failure () { as_fn_return 1; } as_fn_ret_success () { return 0; } as_fn_ret_failure () { return 1; } exitcode=0 as_fn_success || { exitcode=1; echo as_fn_success failed.; } as_fn_failure && { exitcode=1; echo as_fn_failure succeeded.; } as_fn_ret_success || { exitcode=1; echo as_fn_ret_success failed.; } as_fn_ret_failure && { exitcode=1; echo as_fn_ret_failure succeeded.; } if ( set x; as_fn_ret_success y && test x = \"\$1\" ); then : else exitcode=1; echo positional parameters were not saved. fi test x\$exitcode = x0 || exit 1" as_suggested=" as_lineno_1=";as_suggested=$as_suggested$LINENO;as_suggested=$as_suggested" as_lineno_1a=\$LINENO as_lineno_2=";as_suggested=$as_suggested$LINENO;as_suggested=$as_suggested" as_lineno_2a=\$LINENO eval 'test \"x\$as_lineno_1'\$as_run'\" != \"x\$as_lineno_2'\$as_run'\" && test \"x\`expr \$as_lineno_1'\$as_run' + 1\`\" = \"x\$as_lineno_2'\$as_run'\"' || exit 1 test \$(( 1 + 1 )) = 2 || exit 1" if (eval "$as_required") 2>/dev/null; then : as_have_required=yes else as_have_required=no fi if test x$as_have_required = xyes && (eval "$as_suggested") 2>/dev/null; then : else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR as_found=false for as_dir in /bin$PATH_SEPARATOR/usr/bin$PATH_SEPARATOR$PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. as_found=: case $as_dir in #( /*) for as_base in sh bash ksh sh5; do # Try only shells that exist, to save several forks. 
as_shell=$as_dir/$as_base if { test -f "$as_shell" || test -f "$as_shell.exe"; } && { $as_echo "$as_bourne_compatible""$as_required" | as_run=a "$as_shell"; } 2>/dev/null; then : CONFIG_SHELL=$as_shell as_have_required=yes if { $as_echo "$as_bourne_compatible""$as_suggested" | as_run=a "$as_shell"; } 2>/dev/null; then : break 2 fi fi done;; esac as_found=false done $as_found || { if { test -f "$SHELL" || test -f "$SHELL.exe"; } && { $as_echo "$as_bourne_compatible""$as_required" | as_run=a "$SHELL"; } 2>/dev/null; then : CONFIG_SHELL=$SHELL as_have_required=yes fi; } IFS=$as_save_IFS if test "x$CONFIG_SHELL" != x; then : # We cannot yet assume a decent shell, so we have to provide a # neutralization value for shells without unset; and this also # works around shells that cannot unset nonexistent variables. BASH_ENV=/dev/null ENV=/dev/null (unset BASH_ENV) >/dev/null 2>&1 && unset BASH_ENV ENV export CONFIG_SHELL exec "$CONFIG_SHELL" "$as_myself" ${1+"$@"} fi if test x$as_have_required = xno; then : $as_echo "$0: This script requires a shell more modern than all" $as_echo "$0: the shells that I found on your system." if test x${ZSH_VERSION+set} = xset ; then $as_echo "$0: In particular, zsh $ZSH_VERSION has bugs and should" $as_echo "$0: be upgraded to zsh 4.3.4 or later." else $as_echo "$0: Please tell bug-autoconf@gnu.org about your system, $0: including any error possibly output before this $0: message. Then install a modern shell, or manually run $0: the script under such a shell if you do have one." fi exit 1 fi fi fi SHELL=${CONFIG_SHELL-/bin/sh} export SHELL # Unset more variables known to interfere with behavior of common tools. CLICOLOR_FORCE= GREP_OPTIONS= unset CLICOLOR_FORCE GREP_OPTIONS ## --------------------- ## ## M4sh Shell Functions. ## ## --------------------- ## # as_fn_unset VAR # --------------- # Portably unset VAR. as_fn_unset () { { eval $1=; unset $1;} } as_unset=as_fn_unset # as_fn_set_status STATUS # ----------------------- # Set $? 
to STATUS, without forking. as_fn_set_status () { return $1 } # as_fn_set_status # as_fn_exit STATUS # ----------------- # Exit the shell with STATUS, even in a "trap 0" or "set -e" context. as_fn_exit () { set +e as_fn_set_status $1 exit $1 } # as_fn_exit # as_fn_mkdir_p # ------------- # Create "$as_dir" as a directory, including parents if necessary. as_fn_mkdir_p () { case $as_dir in #( -*) as_dir=./$as_dir;; esac test -d "$as_dir" || eval $as_mkdir_p || { as_dirs= while :; do case $as_dir in #( *\'*) as_qdir=`$as_echo "$as_dir" | sed "s/'/'\\\\\\\\''/g"`;; #'( *) as_qdir=$as_dir;; esac as_dirs="'$as_qdir' $as_dirs" as_dir=`$as_dirname -- "$as_dir" || $as_expr X"$as_dir" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$as_dir" : 'X\(//\)[^/]' \| \ X"$as_dir" : 'X\(//\)$' \| \ X"$as_dir" : 'X\(/\)' \| . 2>/dev/null || $as_echo X"$as_dir" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/ q } /^X\(\/\/\)[^/].*/{ s//\1/ q } /^X\(\/\/\)$/{ s//\1/ q } /^X\(\/\).*/{ s//\1/ q } s/.*/./; q'` test -d "$as_dir" && break done test -z "$as_dirs" || eval "mkdir $as_dirs" } || test -d "$as_dir" || as_fn_error $? "cannot create directory $as_dir" } # as_fn_mkdir_p # as_fn_append VAR VALUE # ---------------------- # Append the text in VALUE to the end of the definition contained in VAR. Take # advantage of any shell optimizations that allow amortized linear growth over # repeated appends, instead of the typical quadratic growth present in naive # implementations. if (eval "as_var=1; as_var+=2; test x\$as_var = x12") 2>/dev/null; then : eval 'as_fn_append () { eval $1+=\$2 }' else as_fn_append () { eval $1=\$$1\$2 } fi # as_fn_append # as_fn_arith ARG... # ------------------ # Perform arithmetic evaluation on the ARGs, and store the result in the # global $as_val. Take advantage of shells that can avoid forks. The arguments # must be portable across $(()) and expr. 
if (eval "test \$(( 1 + 1 )) = 2") 2>/dev/null; then : eval 'as_fn_arith () { as_val=$(( $* )) }' else as_fn_arith () { as_val=`expr "$@" || test $? -eq 1` } fi # as_fn_arith # as_fn_error STATUS ERROR [LINENO LOG_FD] # ---------------------------------------- # Output "`basename $0`: error: ERROR" to stderr. If LINENO and LOG_FD are # provided, also output the error to LOG_FD, referencing LINENO. Then exit the # script with STATUS, using 1 if that was 0. as_fn_error () { as_status=$1; test $as_status -eq 0 && as_status=1 if test "$4"; then as_lineno=${as_lineno-"$3"} as_lineno_stack=as_lineno_stack=$as_lineno_stack $as_echo "$as_me:${as_lineno-$LINENO}: error: $2" >&$4 fi $as_echo "$as_me: error: $2" >&2 as_fn_exit $as_status } # as_fn_error if expr a : '\(a\)' >/dev/null 2>&1 && test "X`expr 00001 : '.*\(...\)'`" = X001; then as_expr=expr else as_expr=false fi if (basename -- /) >/dev/null 2>&1 && test "X`basename -- / 2>&1`" = "X/"; then as_basename=basename else as_basename=false fi if (as_dir=`dirname -- /` && test "X$as_dir" = X/) >/dev/null 2>&1; then as_dirname=dirname else as_dirname=false fi as_me=`$as_basename -- "$0" || $as_expr X/"$0" : '.*/\([^/][^/]*\)/*$' \| \ X"$0" : 'X\(//\)$' \| \ X"$0" : 'X\(/\)' \| . 2>/dev/null || $as_echo X/"$0" | sed '/^.*\/\([^/][^/]*\)\/*$/{ s//\1/ q } /^X\/\(\/\/\)$/{ s//\1/ q } /^X\/\(\/\).*/{ s//\1/ q } s/.*/./; q'` # Avoid depending upon Character Ranges. as_cr_letters='abcdefghijklmnopqrstuvwxyz' as_cr_LETTERS='ABCDEFGHIJKLMNOPQRSTUVWXYZ' as_cr_Letters=$as_cr_letters$as_cr_LETTERS as_cr_digits='0123456789' as_cr_alnum=$as_cr_Letters$as_cr_digits as_lineno_1=$LINENO as_lineno_1a=$LINENO as_lineno_2=$LINENO as_lineno_2a=$LINENO eval 'test "x$as_lineno_1'$as_run'" != "x$as_lineno_2'$as_run'" && test "x`expr $as_lineno_1'$as_run' + 1`" = "x$as_lineno_2'$as_run'"' || { # Blame Lee E. McMahon (1931-1989) for sed's syntax. 
:-) sed -n ' p /[$]LINENO/= ' <$as_myself | sed ' s/[$]LINENO.*/&-/ t lineno b :lineno N :loop s/[$]LINENO\([^'$as_cr_alnum'_].*\n\)\(.*\)/\2\1\2/ t loop s/-\n.*// ' >$as_me.lineno && chmod +x "$as_me.lineno" || { $as_echo "$as_me: error: cannot create $as_me.lineno; rerun with a POSIX shell" >&2; as_fn_exit 1; } # Don't try to exec as it changes $[0], causing all sort of problems # (the dirname of $[0] is not the place where we might find the # original and so on. Autoconf is especially sensitive to this). . "./$as_me.lineno" # Exit status is that of the last command. exit } ECHO_C= ECHO_N= ECHO_T= case `echo -n x` in #((((( -n*) case `echo 'xy\c'` in *c*) ECHO_T=' ';; # ECHO_T is single tab character. xy) ECHO_C='\c';; *) echo `echo ksh88 bug on AIX 6.1` > /dev/null ECHO_T=' ';; esac;; *) ECHO_N='-n';; esac rm -f conf$$ conf$$.exe conf$$.file if test -d conf$$.dir; then rm -f conf$$.dir/conf$$.file else rm -f conf$$.dir mkdir conf$$.dir 2>/dev/null fi if (echo >conf$$.file) 2>/dev/null; then if ln -s conf$$.file conf$$ 2>/dev/null; then as_ln_s='ln -s' # ... but there are two gotchas: # 1) On MSYS, both `ln -s file dir' and `ln file dir' fail. # 2) DJGPP < 2.04 has no symlinks; `ln -s' creates a wrapper executable. # In both cases, we have to default to `cp -p'. ln -s conf$$.file conf$$.dir 2>/dev/null && test ! -f conf$$.exe || as_ln_s='cp -p' elif ln conf$$.file conf$$ 2>/dev/null; then as_ln_s=ln else as_ln_s='cp -p' fi else as_ln_s='cp -p' fi rm -f conf$$ conf$$.exe conf$$.dir/conf$$.file conf$$.file rmdir conf$$.dir 2>/dev/null if mkdir -p . 
2>/dev/null; then as_mkdir_p='mkdir -p "$as_dir"' else test -d ./-p && rmdir ./-p as_mkdir_p=false fi if test -x / >/dev/null 2>&1; then as_test_x='test -x' else if ls -dL / >/dev/null 2>&1; then as_ls_L_option=L else as_ls_L_option= fi as_test_x=' eval sh -c '\'' if test -d "$1"; then test -d "$1/."; else case $1 in #( -*)set "./$1";; esac; case `ls -ld'$as_ls_L_option' "$1" 2>/dev/null` in #(( ???[sx]*):;;*)false;;esac;fi '\'' sh ' fi as_executable_p=$as_test_x # Sed expression to map a string onto a valid CPP name. as_tr_cpp="eval sed 'y%*$as_cr_letters%P$as_cr_LETTERS%;s%[^_$as_cr_alnum]%_%g'" # Sed expression to map a string onto a valid variable name. as_tr_sh="eval sed 'y%*+%pp%;s%[^_$as_cr_alnum]%_%g'" test -n "$DJDIR" || exec 7<&0 &1 # Name of the host. # hostname on some systems (SVR3.2, old GNU/Linux) returns a bogus exit status, # so uname gets run too. ac_hostname=`(hostname || uname -n) 2>/dev/null | sed 1q` # # Initializations. # ac_default_prefix=/usr/local ac_clean_files= ac_config_libobj_dir=. LIBOBJS= cross_compiling=no subdirs= MFLAGS= MAKEFLAGS= # Identity of this package. PACKAGE_NAME= PACKAGE_TARNAME= PACKAGE_VERSION= PACKAGE_STRING= PACKAGE_BUGREPORT= PACKAGE_URL= # Factoring default headers for most tests. 
ac_includes_default="\ #include #ifdef HAVE_SYS_TYPES_H # include #endif #ifdef HAVE_SYS_STAT_H # include #endif #ifdef STDC_HEADERS # include # include #else # ifdef HAVE_STDLIB_H # include # endif #endif #ifdef HAVE_STRING_H # if !defined STDC_HEADERS && defined HAVE_MEMORY_H # include # endif # include #endif #ifdef HAVE_STRINGS_H # include #endif #ifdef HAVE_INTTYPES_H # include #endif #ifdef HAVE_STDINT_H # include #endif #ifdef HAVE_UNISTD_H # include #endif" ac_subst_vars='LTLIBOBJS LIBOBJS VMGUESTLIB VMWARE_INCDIR EGREP GREP CPP OBJEXT EXEEXT ac_ct_CC CPPFLAGS LDFLAGS CFLAGS CC target_alias host_alias build_alias LIBS ECHO_T ECHO_N ECHO_C DEFS mandir localedir libdir psdir pdfdir dvidir htmldir infodir docdir oldincludedir includedir localstatedir sharedstatedir sysconfdir datadir datarootdir libexecdir sbindir bindir program_transform_name prefix exec_prefix PACKAGE_URL PACKAGE_BUGREPORT PACKAGE_STRING PACKAGE_VERSION PACKAGE_TARNAME PACKAGE_NAME PATH_SEPARATOR SHELL' ac_subst_files='' ac_user_opts=' enable_option_checking with_vmware_incdir ' ac_precious_vars='build_alias host_alias target_alias CC CFLAGS LDFLAGS LIBS CPPFLAGS CPP' # Initialize some variables set by options. ac_init_help= ac_init_version=false ac_unrecognized_opts= ac_unrecognized_sep= # The variables have the same names as the options, with # dashes changed to underlines. cache_file=/dev/null exec_prefix=NONE no_create= no_recursion= prefix=NONE program_prefix=NONE program_suffix=NONE program_transform_name=s,x,x, silent= site= srcdir= verbose= x_includes=NONE x_libraries=NONE # Installation directory options. # These are left unexpanded so users can "make install exec_prefix=/foo" # and all the variables that are supposed to be based on exec_prefix # by default will actually change. # Use braces instead of parens because sh, perl, etc. also accept them. # (The list follows the same order as the GNU Coding Standards.) 
bindir='${exec_prefix}/bin' sbindir='${exec_prefix}/sbin' libexecdir='${exec_prefix}/libexec' datarootdir='${prefix}/share' datadir='${datarootdir}' sysconfdir='${prefix}/etc' sharedstatedir='${prefix}/com' localstatedir='${prefix}/var' includedir='${prefix}/include' oldincludedir='/usr/include' docdir='${datarootdir}/doc/${PACKAGE}' infodir='${datarootdir}/info' htmldir='${docdir}' dvidir='${docdir}' pdfdir='${docdir}' psdir='${docdir}' libdir='${exec_prefix}/lib' localedir='${datarootdir}/locale' mandir='${datarootdir}/man' ac_prev= ac_dashdash= for ac_option do # If the previous option needs an argument, assign it. if test -n "$ac_prev"; then eval $ac_prev=\$ac_option ac_prev= continue fi case $ac_option in *=?*) ac_optarg=`expr "X$ac_option" : '[^=]*=\(.*\)'` ;; *=) ac_optarg= ;; *) ac_optarg=yes ;; esac # Accept the important Cygnus configure options, so we can diagnose typos. case $ac_dashdash$ac_option in --) ac_dashdash=yes ;; -bindir | --bindir | --bindi | --bind | --bin | --bi) ac_prev=bindir ;; -bindir=* | --bindir=* | --bindi=* | --bind=* | --bin=* | --bi=*) bindir=$ac_optarg ;; -build | --build | --buil | --bui | --bu) ac_prev=build_alias ;; -build=* | --build=* | --buil=* | --bui=* | --bu=*) build_alias=$ac_optarg ;; -cache-file | --cache-file | --cache-fil | --cache-fi \ | --cache-f | --cache- | --cache | --cach | --cac | --ca | --c) ac_prev=cache_file ;; -cache-file=* | --cache-file=* | --cache-fil=* | --cache-fi=* \ | --cache-f=* | --cache-=* | --cache=* | --cach=* | --cac=* | --ca=* | --c=*) cache_file=$ac_optarg ;; --config-cache | -C) cache_file=config.cache ;; -datadir | --datadir | --datadi | --datad) ac_prev=datadir ;; -datadir=* | --datadir=* | --datadi=* | --datad=*) datadir=$ac_optarg ;; -datarootdir | --datarootdir | --datarootdi | --datarootd | --dataroot \ | --dataroo | --dataro | --datar) ac_prev=datarootdir ;; -datarootdir=* | --datarootdir=* | --datarootdi=* | --datarootd=* \ | --dataroot=* | --dataroo=* | --dataro=* | --datar=*) 
datarootdir=$ac_optarg ;; -disable-* | --disable-*) ac_useropt=`expr "x$ac_option" : 'x-*disable-\(.*\)'` # Reject names that are not valid shell variable names. expr "x$ac_useropt" : ".*[^-+._$as_cr_alnum]" >/dev/null && as_fn_error $? "invalid feature name: $ac_useropt" ac_useropt_orig=$ac_useropt ac_useropt=`$as_echo "$ac_useropt" | sed 's/[-+.]/_/g'` case $ac_user_opts in *" "enable_$ac_useropt" "*) ;; *) ac_unrecognized_opts="$ac_unrecognized_opts$ac_unrecognized_sep--disable-$ac_useropt_orig" ac_unrecognized_sep=', ';; esac eval enable_$ac_useropt=no ;; -docdir | --docdir | --docdi | --doc | --do) ac_prev=docdir ;; -docdir=* | --docdir=* | --docdi=* | --doc=* | --do=*) docdir=$ac_optarg ;; -dvidir | --dvidir | --dvidi | --dvid | --dvi | --dv) ac_prev=dvidir ;; -dvidir=* | --dvidir=* | --dvidi=* | --dvid=* | --dvi=* | --dv=*) dvidir=$ac_optarg ;; -enable-* | --enable-*) ac_useropt=`expr "x$ac_option" : 'x-*enable-\([^=]*\)'` # Reject names that are not valid shell variable names. expr "x$ac_useropt" : ".*[^-+._$as_cr_alnum]" >/dev/null && as_fn_error $? "invalid feature name: $ac_useropt" ac_useropt_orig=$ac_useropt ac_useropt=`$as_echo "$ac_useropt" | sed 's/[-+.]/_/g'` case $ac_user_opts in *" "enable_$ac_useropt" "*) ;; *) ac_unrecognized_opts="$ac_unrecognized_opts$ac_unrecognized_sep--enable-$ac_useropt_orig" ac_unrecognized_sep=', ';; esac eval enable_$ac_useropt=\$ac_optarg ;; -exec-prefix | --exec_prefix | --exec-prefix | --exec-prefi \ | --exec-pref | --exec-pre | --exec-pr | --exec-p | --exec- \ | --exec | --exe | --ex) ac_prev=exec_prefix ;; -exec-prefix=* | --exec_prefix=* | --exec-prefix=* | --exec-prefi=* \ | --exec-pref=* | --exec-pre=* | --exec-pr=* | --exec-p=* | --exec-=* \ | --exec=* | --exe=* | --ex=*) exec_prefix=$ac_optarg ;; -gas | --gas | --ga | --g) # Obsolete; use --with-gas. 
with_gas=yes ;; -help | --help | --hel | --he | -h) ac_init_help=long ;; -help=r* | --help=r* | --hel=r* | --he=r* | -hr*) ac_init_help=recursive ;; -help=s* | --help=s* | --hel=s* | --he=s* | -hs*) ac_init_help=short ;; -host | --host | --hos | --ho) ac_prev=host_alias ;; -host=* | --host=* | --hos=* | --ho=*) host_alias=$ac_optarg ;; -htmldir | --htmldir | --htmldi | --htmld | --html | --htm | --ht) ac_prev=htmldir ;; -htmldir=* | --htmldir=* | --htmldi=* | --htmld=* | --html=* | --htm=* \ | --ht=*) htmldir=$ac_optarg ;; -includedir | --includedir | --includedi | --included | --include \ | --includ | --inclu | --incl | --inc) ac_prev=includedir ;; -includedir=* | --includedir=* | --includedi=* | --included=* | --include=* \ | --includ=* | --inclu=* | --incl=* | --inc=*) includedir=$ac_optarg ;; -infodir | --infodir | --infodi | --infod | --info | --inf) ac_prev=infodir ;; -infodir=* | --infodir=* | --infodi=* | --infod=* | --info=* | --inf=*) infodir=$ac_optarg ;; -libdir | --libdir | --libdi | --libd) ac_prev=libdir ;; -libdir=* | --libdir=* | --libdi=* | --libd=*) libdir=$ac_optarg ;; -libexecdir | --libexecdir | --libexecdi | --libexecd | --libexec \ | --libexe | --libex | --libe) ac_prev=libexecdir ;; -libexecdir=* | --libexecdir=* | --libexecdi=* | --libexecd=* | --libexec=* \ | --libexe=* | --libex=* | --libe=*) libexecdir=$ac_optarg ;; -localedir | --localedir | --localedi | --localed | --locale) ac_prev=localedir ;; -localedir=* | --localedir=* | --localedi=* | --localed=* | --locale=*) localedir=$ac_optarg ;; -localstatedir | --localstatedir | --localstatedi | --localstated \ | --localstate | --localstat | --localsta | --localst | --locals) ac_prev=localstatedir ;; -localstatedir=* | --localstatedir=* | --localstatedi=* | --localstated=* \ | --localstate=* | --localstat=* | --localsta=* | --localst=* | --locals=*) localstatedir=$ac_optarg ;; -mandir | --mandir | --mandi | --mand | --man | --ma | --m) ac_prev=mandir ;; -mandir=* | --mandir=* | --mandi=* | 
--mand=* | --man=* | --ma=* | --m=*) mandir=$ac_optarg ;; -nfp | --nfp | --nf) # Obsolete; use --without-fp. with_fp=no ;; -no-create | --no-create | --no-creat | --no-crea | --no-cre \ | --no-cr | --no-c | -n) no_create=yes ;; -no-recursion | --no-recursion | --no-recursio | --no-recursi \ | --no-recurs | --no-recur | --no-recu | --no-rec | --no-re | --no-r) no_recursion=yes ;; -oldincludedir | --oldincludedir | --oldincludedi | --oldincluded \ | --oldinclude | --oldinclud | --oldinclu | --oldincl | --oldinc \ | --oldin | --oldi | --old | --ol | --o) ac_prev=oldincludedir ;; -oldincludedir=* | --oldincludedir=* | --oldincludedi=* | --oldincluded=* \ | --oldinclude=* | --oldinclud=* | --oldinclu=* | --oldincl=* | --oldinc=* \ | --oldin=* | --oldi=* | --old=* | --ol=* | --o=*) oldincludedir=$ac_optarg ;; -prefix | --prefix | --prefi | --pref | --pre | --pr | --p) ac_prev=prefix ;; -prefix=* | --prefix=* | --prefi=* | --pref=* | --pre=* | --pr=* | --p=*) prefix=$ac_optarg ;; -program-prefix | --program-prefix | --program-prefi | --program-pref \ | --program-pre | --program-pr | --program-p) ac_prev=program_prefix ;; -program-prefix=* | --program-prefix=* | --program-prefi=* \ | --program-pref=* | --program-pre=* | --program-pr=* | --program-p=*) program_prefix=$ac_optarg ;; -program-suffix | --program-suffix | --program-suffi | --program-suff \ | --program-suf | --program-su | --program-s) ac_prev=program_suffix ;; -program-suffix=* | --program-suffix=* | --program-suffi=* \ | --program-suff=* | --program-suf=* | --program-su=* | --program-s=*) program_suffix=$ac_optarg ;; -program-transform-name | --program-transform-name \ | --program-transform-nam | --program-transform-na \ | --program-transform-n | --program-transform- \ | --program-transform | --program-transfor \ | --program-transfo | --program-transf \ | --program-trans | --program-tran \ | --progr-tra | --program-tr | --program-t) ac_prev=program_transform_name ;; -program-transform-name=* | 
--program-transform-name=* \ | --program-transform-nam=* | --program-transform-na=* \ | --program-transform-n=* | --program-transform-=* \ | --program-transform=* | --program-transfor=* \ | --program-transfo=* | --program-transf=* \ | --program-trans=* | --program-tran=* \ | --progr-tra=* | --program-tr=* | --program-t=*) program_transform_name=$ac_optarg ;; -pdfdir | --pdfdir | --pdfdi | --pdfd | --pdf | --pd) ac_prev=pdfdir ;; -pdfdir=* | --pdfdir=* | --pdfdi=* | --pdfd=* | --pdf=* | --pd=*) pdfdir=$ac_optarg ;; -psdir | --psdir | --psdi | --psd | --ps) ac_prev=psdir ;; -psdir=* | --psdir=* | --psdi=* | --psd=* | --ps=*) psdir=$ac_optarg ;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil) silent=yes ;; -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb) ac_prev=sbindir ;; -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \ | --sbi=* | --sb=*) sbindir=$ac_optarg ;; -sharedstatedir | --sharedstatedir | --sharedstatedi \ | --sharedstated | --sharedstate | --sharedstat | --sharedsta \ | --sharedst | --shareds | --shared | --share | --shar \ | --sha | --sh) ac_prev=sharedstatedir ;; -sharedstatedir=* | --sharedstatedir=* | --sharedstatedi=* \ | --sharedstated=* | --sharedstate=* | --sharedstat=* | --sharedsta=* \ | --sharedst=* | --shareds=* | --shared=* | --share=* | --shar=* \ | --sha=* | --sh=*) sharedstatedir=$ac_optarg ;; -site | --site | --sit) ac_prev=site ;; -site=* | --site=* | --sit=*) site=$ac_optarg ;; -srcdir | --srcdir | --srcdi | --srcd | --src | --sr) ac_prev=srcdir ;; -srcdir=* | --srcdir=* | --srcdi=* | --srcd=* | --src=* | --sr=*) srcdir=$ac_optarg ;; -sysconfdir | --sysconfdir | --sysconfdi | --sysconfd | --sysconf \ | --syscon | --sysco | --sysc | --sys | --sy) ac_prev=sysconfdir ;; -sysconfdir=* | --sysconfdir=* | --sysconfdi=* | --sysconfd=* | --sysconf=* \ | --syscon=* | --sysco=* | --sysc=* | --sys=* | --sy=*) sysconfdir=$ac_optarg ;; -target | --target | --targe | 
--targ | --tar | --ta | --t) ac_prev=target_alias ;; -target=* | --target=* | --targe=* | --targ=* | --tar=* | --ta=* | --t=*) target_alias=$ac_optarg ;; -v | -verbose | --verbose | --verbos | --verbo | --verb) verbose=yes ;; -version | --version | --versio | --versi | --vers | -V) ac_init_version=: ;; -with-* | --with-*) ac_useropt=`expr "x$ac_option" : 'x-*with-\([^=]*\)'` # Reject names that are not valid shell variable names. expr "x$ac_useropt" : ".*[^-+._$as_cr_alnum]" >/dev/null && as_fn_error $? "invalid package name: $ac_useropt" ac_useropt_orig=$ac_useropt ac_useropt=`$as_echo "$ac_useropt" | sed 's/[-+.]/_/g'` case $ac_user_opts in *" "with_$ac_useropt" "*) ;; *) ac_unrecognized_opts="$ac_unrecognized_opts$ac_unrecognized_sep--with-$ac_useropt_orig" ac_unrecognized_sep=', ';; esac eval with_$ac_useropt=\$ac_optarg ;; -without-* | --without-*) ac_useropt=`expr "x$ac_option" : 'x-*without-\(.*\)'` # Reject names that are not valid shell variable names. expr "x$ac_useropt" : ".*[^-+._$as_cr_alnum]" >/dev/null && as_fn_error $? "invalid package name: $ac_useropt" ac_useropt_orig=$ac_useropt ac_useropt=`$as_echo "$ac_useropt" | sed 's/[-+.]/_/g'` case $ac_user_opts in *" "with_$ac_useropt" "*) ;; *) ac_unrecognized_opts="$ac_unrecognized_opts$ac_unrecognized_sep--without-$ac_useropt_orig" ac_unrecognized_sep=', ';; esac eval with_$ac_useropt=no ;; --x) # Obsolete; use --with-x. 
with_x=yes ;; -x-includes | --x-includes | --x-include | --x-includ | --x-inclu \ | --x-incl | --x-inc | --x-in | --x-i) ac_prev=x_includes ;; -x-includes=* | --x-includes=* | --x-include=* | --x-includ=* | --x-inclu=* \ | --x-incl=* | --x-inc=* | --x-in=* | --x-i=*) x_includes=$ac_optarg ;; -x-libraries | --x-libraries | --x-librarie | --x-librari \ | --x-librar | --x-libra | --x-libr | --x-lib | --x-li | --x-l) ac_prev=x_libraries ;; -x-libraries=* | --x-libraries=* | --x-librarie=* | --x-librari=* \ | --x-librar=* | --x-libra=* | --x-libr=* | --x-lib=* | --x-li=* | --x-l=*) x_libraries=$ac_optarg ;; -*) as_fn_error $? "unrecognized option: \`$ac_option' Try \`$0 --help' for more information" ;; *=*) ac_envvar=`expr "x$ac_option" : 'x\([^=]*\)='` # Reject names that are not valid shell variable names. case $ac_envvar in #( '' | [0-9]* | *[!_$as_cr_alnum]* ) as_fn_error $? "invalid variable name: \`$ac_envvar'" ;; esac eval $ac_envvar=\$ac_optarg export $ac_envvar ;; *) # FIXME: should be removed in autoconf 3.0. $as_echo "$as_me: WARNING: you should use --build, --host, --target" >&2 expr "x$ac_option" : ".*[^-._$as_cr_alnum]" >/dev/null && $as_echo "$as_me: WARNING: invalid host type: $ac_option" >&2 : ${build_alias=$ac_option} ${host_alias=$ac_option} ${target_alias=$ac_option} ;; esac done if test -n "$ac_prev"; then ac_option=--`echo $ac_prev | sed 's/_/-/g'` as_fn_error $? "missing argument to $ac_option" fi if test -n "$ac_unrecognized_opts"; then case $enable_option_checking in no) ;; fatal) as_fn_error $? "unrecognized options: $ac_unrecognized_opts" ;; *) $as_echo "$as_me: WARNING: unrecognized options: $ac_unrecognized_opts" >&2 ;; esac fi # Check all directory arguments for consistency. for ac_var in exec_prefix prefix bindir sbindir libexecdir datarootdir \ datadir sysconfdir sharedstatedir localstatedir includedir \ oldincludedir docdir infodir htmldir dvidir pdfdir psdir \ libdir localedir mandir do eval ac_val=\$$ac_var # Remove trailing slashes. 
case $ac_val in */ ) ac_val=`expr "X$ac_val" : 'X\(.*[^/]\)' \| "X$ac_val" : 'X\(.*\)'` eval $ac_var=\$ac_val;; esac # Be sure to have absolute directory names. case $ac_val in [\\/$]* | ?:[\\/]* ) continue;; NONE | '' ) case $ac_var in *prefix ) continue;; esac;; esac as_fn_error $? "expected an absolute directory name for --$ac_var: $ac_val" done # There might be people who depend on the old broken behavior: `$host' # used to hold the argument of --host etc. # FIXME: To remove some day. build=$build_alias host=$host_alias target=$target_alias # FIXME: To remove some day. if test "x$host_alias" != x; then if test "x$build_alias" = x; then cross_compiling=maybe $as_echo "$as_me: WARNING: if you wanted to set the --build type, don't use --host. If a cross compiler is detected then cross compile mode will be used" >&2 elif test "x$build_alias" != "x$host_alias"; then cross_compiling=yes fi fi ac_tool_prefix= test -n "$host_alias" && ac_tool_prefix=$host_alias- test "$silent" = yes && exec 6>/dev/null ac_pwd=`pwd` && test -n "$ac_pwd" && ac_ls_di=`ls -di .` && ac_pwd_ls_di=`cd "$ac_pwd" && ls -di .` || as_fn_error $? "working directory cannot be determined" test "X$ac_ls_di" = "X$ac_pwd_ls_di" || as_fn_error $? "pwd does not report name of working directory" # Find the source files, if location was not specified. if test -z "$srcdir"; then ac_srcdir_defaulted=yes # Try the directory containing this script, then the parent directory. ac_confdir=`$as_dirname -- "$as_myself" || $as_expr X"$as_myself" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$as_myself" : 'X\(//\)[^/]' \| \ X"$as_myself" : 'X\(//\)$' \| \ X"$as_myself" : 'X\(/\)' \| . 2>/dev/null || $as_echo X"$as_myself" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/ q } /^X\(\/\/\)[^/].*/{ s//\1/ q } /^X\(\/\/\)$/{ s//\1/ q } /^X\(\/\).*/{ s//\1/ q } s/.*/./; q'` srcdir=$ac_confdir if test ! -r "$srcdir/$ac_unique_file"; then srcdir=.. fi else ac_srcdir_defaulted=no fi if test ! 
-r "$srcdir/$ac_unique_file"; then test "$ac_srcdir_defaulted" = yes && srcdir="$ac_confdir or .." as_fn_error $? "cannot find sources ($ac_unique_file) in $srcdir" fi ac_msg="sources are in $srcdir, but \`cd $srcdir' does not work" ac_abs_confdir=`( cd "$srcdir" && test -r "./$ac_unique_file" || as_fn_error $? "$ac_msg" pwd)` # When building in place, set srcdir=. if test "$ac_abs_confdir" = "$ac_pwd"; then srcdir=. fi # Remove unnecessary trailing slashes from srcdir. # Double slashes in file names in object file debugging info # mess up M-x gdb in Emacs. case $srcdir in */) srcdir=`expr "X$srcdir" : 'X\(.*[^/]\)' \| "X$srcdir" : 'X\(.*\)'`;; esac for ac_var in $ac_precious_vars; do eval ac_env_${ac_var}_set=\${${ac_var}+set} eval ac_env_${ac_var}_value=\$${ac_var} eval ac_cv_env_${ac_var}_set=\${${ac_var}+set} eval ac_cv_env_${ac_var}_value=\$${ac_var} done # # Report the --help message. # if test "$ac_init_help" = "long"; then # Omit some internal or obsolete options to make the list less imposing. # This message is too long to be a string in the A/UX 3.1 sh. cat <<_ACEOF \`configure' configures this package to adapt to many kinds of systems. Usage: $0 [OPTION]... [VAR=VALUE]... To assign environment variables (e.g., CC, CFLAGS...), specify them as VAR=VALUE. See below for descriptions of some of the useful variables. Defaults for the options are specified in brackets. Configuration: -h, --help display this help and exit --help=short display options specific to this package --help=recursive display the short help of all the included packages -V, --version display version information and exit -q, --quiet, --silent do not print \`checking ...' 
messages --cache-file=FILE cache test results in FILE [disabled] -C, --config-cache alias for \`--cache-file=config.cache' -n, --no-create do not create output files --srcdir=DIR find the sources in DIR [configure dir or \`..'] Installation directories: --prefix=PREFIX install architecture-independent files in PREFIX [$ac_default_prefix] --exec-prefix=EPREFIX install architecture-dependent files in EPREFIX [PREFIX] By default, \`make install' will install all the files in \`$ac_default_prefix/bin', \`$ac_default_prefix/lib' etc. You can specify an installation prefix other than \`$ac_default_prefix' using \`--prefix', for instance \`--prefix=\$HOME'. For better control, use the options below. Fine tuning of the installation directories: --bindir=DIR user executables [EPREFIX/bin] --sbindir=DIR system admin executables [EPREFIX/sbin] --libexecdir=DIR program executables [EPREFIX/libexec] --sysconfdir=DIR read-only single-machine data [PREFIX/etc] --sharedstatedir=DIR modifiable architecture-independent data [PREFIX/com] --localstatedir=DIR modifiable single-machine data [PREFIX/var] --libdir=DIR object code libraries [EPREFIX/lib] --includedir=DIR C header files [PREFIX/include] --oldincludedir=DIR C header files for non-gcc [/usr/include] --datarootdir=DIR read-only arch.-independent data root [PREFIX/share] --datadir=DIR read-only architecture-independent data [DATAROOTDIR] --infodir=DIR info documentation [DATAROOTDIR/info] --localedir=DIR locale-dependent data [DATAROOTDIR/locale] --mandir=DIR man documentation [DATAROOTDIR/man] --docdir=DIR documentation root [DATAROOTDIR/doc/PACKAGE] --htmldir=DIR html documentation [DOCDIR] --dvidir=DIR dvi documentation [DOCDIR] --pdfdir=DIR pdf documentation [DOCDIR] --psdir=DIR ps documentation [DOCDIR] _ACEOF cat <<\_ACEOF _ACEOF fi if test -n "$ac_init_help"; then cat <<\_ACEOF Optional Packages: --with-PACKAGE[=ARG] use PACKAGE [ARG=yes] --without-PACKAGE do not use PACKAGE (same as --with-PACKAGE=no) 
--with-vmware_incdir= Specify path to VMware GuestSDK includes Some influential environment variables: CC C compiler command CFLAGS C compiler flags LDFLAGS linker flags, e.g. -L if you have libraries in a nonstandard directory LIBS libraries to pass to the linker, e.g. -l CPPFLAGS (Objective) C/C++ preprocessor flags, e.g. -I if you have headers in a nonstandard directory CPP C preprocessor Use these variables to override the choices made by `configure' or to help it to find libraries and programs with nonstandard names/locations. Report bugs to the package provider. _ACEOF ac_status=$? fi if test "$ac_init_help" = "recursive"; then # If there are subdirs, report their specific --help. for ac_dir in : $ac_subdirs_all; do test "x$ac_dir" = x: && continue test -d "$ac_dir" || { cd "$srcdir" && ac_pwd=`pwd` && srcdir=. && test -d "$ac_dir"; } || continue ac_builddir=. case "$ac_dir" in .) ac_dir_suffix= ac_top_builddir_sub=. ac_top_build_prefix= ;; *) ac_dir_suffix=/`$as_echo "$ac_dir" | sed 's|^\.[\\/]||'` # A ".." for each directory in $ac_dir_suffix. ac_top_builddir_sub=`$as_echo "$ac_dir_suffix" | sed 's|/[^\\/]*|/..|g;s|/||'` case $ac_top_builddir_sub in "") ac_top_builddir_sub=. ac_top_build_prefix= ;; *) ac_top_build_prefix=$ac_top_builddir_sub/ ;; esac ;; esac ac_abs_top_builddir=$ac_pwd ac_abs_builddir=$ac_pwd$ac_dir_suffix # for backward compatibility: ac_top_builddir=$ac_top_build_prefix case $srcdir in .) # We are building in place. ac_srcdir=. ac_top_srcdir=$ac_top_builddir_sub ac_abs_top_srcdir=$ac_pwd ;; [\\/]* | ?:[\\/]* ) # Absolute name. ac_srcdir=$srcdir$ac_dir_suffix; ac_top_srcdir=$srcdir ac_abs_top_srcdir=$srcdir ;; *) # Relative name. ac_srcdir=$ac_top_build_prefix$srcdir$ac_dir_suffix ac_top_srcdir=$ac_top_build_prefix$srcdir ac_abs_top_srcdir=$ac_pwd/$srcdir ;; esac ac_abs_srcdir=$ac_abs_top_srcdir$ac_dir_suffix cd "$ac_dir" || { ac_status=$?; continue; } # Check for guested configure. 
if test -f "$ac_srcdir/configure.gnu"; then echo && $SHELL "$ac_srcdir/configure.gnu" --help=recursive elif test -f "$ac_srcdir/configure"; then echo && $SHELL "$ac_srcdir/configure" --help=recursive else $as_echo "$as_me: WARNING: no configuration information is in $ac_dir" >&2 fi || ac_status=$? cd "$ac_pwd" || { ac_status=$?; break; } done fi test -n "$ac_init_help" && exit $ac_status if $ac_init_version; then cat <<\_ACEOF configure generated by GNU Autoconf 2.67 Copyright (C) 2010 Free Software Foundation, Inc. This configure script is free software; the Free Software Foundation gives unlimited permission to copy, distribute and modify it. _ACEOF exit fi ## ------------------------ ## ## Autoconf initialization. ## ## ------------------------ ## # ac_fn_c_try_compile LINENO # -------------------------- # Try to compile conftest.$ac_ext, and return whether this succeeded. ac_fn_c_try_compile () { as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack rm -f conftest.$ac_objext if { { ac_try="$ac_compile" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_compile") 2>conftest.err ac_status=$? if test -s conftest.err; then grep -v '^ *+' conftest.err >conftest.er1 cat conftest.er1 >&5 mv -f conftest.er1 conftest.err fi $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; } && { test -z "$ac_c_werror_flag" || test ! -s conftest.err } && test -s conftest.$ac_objext; then : ac_retval=0 else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_retval=1 fi eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} as_fn_set_status $ac_retval } # ac_fn_c_try_compile # ac_fn_c_try_cpp LINENO # ---------------------- # Try to preprocess conftest.$ac_ext, and return whether this succeeded. 
ac_fn_c_try_cpp () { as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack if { { ac_try="$ac_cpp conftest.$ac_ext" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_cpp conftest.$ac_ext") 2>conftest.err ac_status=$? if test -s conftest.err; then grep -v '^ *+' conftest.err >conftest.er1 cat conftest.er1 >&5 mv -f conftest.er1 conftest.err fi $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; } > conftest.i && { test -z "$ac_c_preproc_warn_flag$ac_c_werror_flag" || test ! -s conftest.err }; then : ac_retval=0 else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_retval=1 fi eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} as_fn_set_status $ac_retval } # ac_fn_c_try_cpp # ac_fn_c_check_header_mongrel LINENO HEADER VAR INCLUDES # ------------------------------------------------------- # Tests whether HEADER exists, giving a warning if it cannot be compiled using # the include files in INCLUDES and setting the cache variable VAR # accordingly. ac_fn_c_check_header_mongrel () { as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack if eval "test \"\${$3+set}\"" = set; then : { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5 $as_echo_n "checking for $2... " >&6; } if eval "test \"\${$3+set}\"" = set; then : $as_echo_n "(cached) " >&6 fi eval ac_res=\$$3 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } else # Is the header compilable? { $as_echo "$as_me:${as_lineno-$LINENO}: checking $2 usability" >&5 $as_echo_n "checking $2 usability... " >&6; } cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. 
*/ $4 #include <$2> _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_header_compiler=yes else ac_header_compiler=no fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_header_compiler" >&5 $as_echo "$ac_header_compiler" >&6; } # Is the header present? { $as_echo "$as_me:${as_lineno-$LINENO}: checking $2 presence" >&5 $as_echo_n "checking $2 presence... " >&6; } cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <$2> _ACEOF if ac_fn_c_try_cpp "$LINENO"; then : ac_header_preproc=yes else ac_header_preproc=no fi rm -f conftest.err conftest.i conftest.$ac_ext { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_header_preproc" >&5 $as_echo "$ac_header_preproc" >&6; } # So? What about this header? case $ac_header_compiler:$ac_header_preproc:$ac_c_preproc_warn_flag in #(( yes:no: ) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: accepted by the compiler, rejected by the preprocessor!" >&5 $as_echo "$as_me: WARNING: $2: accepted by the compiler, rejected by the preprocessor!" >&2;} { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: proceeding with the compiler's result" >&5 $as_echo "$as_me: WARNING: $2: proceeding with the compiler's result" >&2;} ;; no:yes:* ) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: present but cannot be compiled" >&5 $as_echo "$as_me: WARNING: $2: present but cannot be compiled" >&2;} { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: check for missing prerequisite headers?" >&5 $as_echo "$as_me: WARNING: $2: check for missing prerequisite headers?" 
>&2;} { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: see the Autoconf documentation" >&5 $as_echo "$as_me: WARNING: $2: see the Autoconf documentation" >&2;} { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: section \"Present But Cannot Be Compiled\"" >&5 $as_echo "$as_me: WARNING: $2: section \"Present But Cannot Be Compiled\"" >&2;} { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: proceeding with the compiler's result" >&5 $as_echo "$as_me: WARNING: $2: proceeding with the compiler's result" >&2;} ;; esac { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5 $as_echo_n "checking for $2... " >&6; } if eval "test \"\${$3+set}\"" = set; then : $as_echo_n "(cached) " >&6 else eval "$3=\$ac_header_compiler" fi eval ac_res=\$$3 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } fi eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} } # ac_fn_c_check_header_mongrel # ac_fn_c_try_run LINENO # ---------------------- # Try to link conftest.$ac_ext, and return whether this succeeded. Assumes # that executables *can* be run. ac_fn_c_try_run () { as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack if { { ac_try="$ac_link" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_link") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; } && { ac_try='./conftest$ac_exeext' { { case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_try") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? 
= $ac_status" >&5 test $ac_status = 0; }; }; then : ac_retval=0 else $as_echo "$as_me: program exited with status $ac_status" >&5 $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_retval=$ac_status fi rm -rf conftest.dSYM conftest_ipa8_conftest.oo eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} as_fn_set_status $ac_retval } # ac_fn_c_try_run # ac_fn_c_check_header_compile LINENO HEADER VAR INCLUDES # ------------------------------------------------------- # Tests whether HEADER exists and can be compiled using the include files in # INCLUDES, setting the cache variable VAR accordingly. ac_fn_c_check_header_compile () { as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5 $as_echo_n "checking for $2... " >&6; } if eval "test \"\${$3+set}\"" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ $4 #include <$2> _ACEOF if ac_fn_c_try_compile "$LINENO"; then : eval "$3=yes" else eval "$3=no" fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi eval ac_res=\$$3 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} } # ac_fn_c_check_header_compile cat >config.log <<_ACEOF This file contains any messages produced by compilers while running configure, to aid debugging if configure makes a mistake. It was created by $as_me, which was generated by GNU Autoconf 2.67. Invocation command line was $ $0 $@ _ACEOF exec 5>>config.log { cat <<_ASUNAME ## --------- ## ## Platform. 
## ## --------- ## hostname = `(hostname || uname -n) 2>/dev/null | sed 1q` uname -m = `(uname -m) 2>/dev/null || echo unknown` uname -r = `(uname -r) 2>/dev/null || echo unknown` uname -s = `(uname -s) 2>/dev/null || echo unknown` uname -v = `(uname -v) 2>/dev/null || echo unknown` /usr/bin/uname -p = `(/usr/bin/uname -p) 2>/dev/null || echo unknown` /bin/uname -X = `(/bin/uname -X) 2>/dev/null || echo unknown` /bin/arch = `(/bin/arch) 2>/dev/null || echo unknown` /usr/bin/arch -k = `(/usr/bin/arch -k) 2>/dev/null || echo unknown` /usr/convex/getsysinfo = `(/usr/convex/getsysinfo) 2>/dev/null || echo unknown` /usr/bin/hostinfo = `(/usr/bin/hostinfo) 2>/dev/null || echo unknown` /bin/machine = `(/bin/machine) 2>/dev/null || echo unknown` /usr/bin/oslevel = `(/usr/bin/oslevel) 2>/dev/null || echo unknown` /bin/universe = `(/bin/universe) 2>/dev/null || echo unknown` _ASUNAME as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. $as_echo "PATH: $as_dir" done IFS=$as_save_IFS } >&5 cat >&5 <<_ACEOF ## ----------- ## ## Core tests. ## ## ----------- ## _ACEOF # Keep a trace of the command line. # Strip out --no-create and --no-recursion so they do not pile up. # Strip out --silent because we don't want to record it for future runs. # Also quote any args containing shell meta-characters. # Make two passes to allow for proper duplicate-argument suppression. 
ac_configure_args= ac_configure_args0= ac_configure_args1= ac_must_keep_next=false for ac_pass in 1 2 do for ac_arg do case $ac_arg in -no-create | --no-c* | -n | -no-recursion | --no-r*) continue ;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil) continue ;; *\'*) ac_arg=`$as_echo "$ac_arg" | sed "s/'/'\\\\\\\\''/g"` ;; esac case $ac_pass in 1) as_fn_append ac_configure_args0 " '$ac_arg'" ;; 2) as_fn_append ac_configure_args1 " '$ac_arg'" if test $ac_must_keep_next = true; then ac_must_keep_next=false # Got value, back to normal. else case $ac_arg in *=* | --config-cache | -C | -disable-* | --disable-* \ | -enable-* | --enable-* | -gas | --g* | -nfp | --nf* \ | -q | -quiet | --q* | -silent | --sil* | -v | -verb* \ | -with-* | --with-* | -without-* | --without-* | --x) case "$ac_configure_args0 " in "$ac_configure_args1"*" '$ac_arg' "* ) continue ;; esac ;; -* ) ac_must_keep_next=true ;; esac fi as_fn_append ac_configure_args " '$ac_arg'" ;; esac done done { ac_configure_args0=; unset ac_configure_args0;} { ac_configure_args1=; unset ac_configure_args1;} # When interrupted or exit'd, cleanup temporary files, and complete # config.log. We remove comments because anyway the quotes in there # would cause problems or look ugly. # WARNING: Use '\'' to represent an apostrophe within the trap. # WARNING: Do not start the trap code with a newline, due to a FreeBSD 4.0 bug. trap 'exit_status=$? # Save into config.log some information that might help in debugging. { echo $as_echo "## ---------------- ## ## Cache variables. 
## ## ---------------- ##" echo # The following way of writing the cache mishandles newlines in values, ( for ac_var in `(set) 2>&1 | sed -n '\''s/^\([a-zA-Z_][a-zA-Z0-9_]*\)=.*/\1/p'\''`; do eval ac_val=\$$ac_var case $ac_val in #( *${as_nl}*) case $ac_var in #( *_cv_*) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: cache variable $ac_var contains a newline" >&5 $as_echo "$as_me: WARNING: cache variable $ac_var contains a newline" >&2;} ;; esac case $ac_var in #( _ | IFS | as_nl) ;; #( BASH_ARGV | BASH_SOURCE) eval $ac_var= ;; #( *) { eval $ac_var=; unset $ac_var;} ;; esac ;; esac done (set) 2>&1 | case $as_nl`(ac_space='\'' '\''; set) 2>&1` in #( *${as_nl}ac_space=\ *) sed -n \ "s/'\''/'\''\\\\'\'''\''/g; s/^\\([_$as_cr_alnum]*_cv_[_$as_cr_alnum]*\\)=\\(.*\\)/\\1='\''\\2'\''/p" ;; #( *) sed -n "/^[_$as_cr_alnum]*_cv_[_$as_cr_alnum]*=/p" ;; esac | sort ) echo $as_echo "## ----------------- ## ## Output variables. ## ## ----------------- ##" echo for ac_var in $ac_subst_vars do eval ac_val=\$$ac_var case $ac_val in *\'\''*) ac_val=`$as_echo "$ac_val" | sed "s/'\''/'\''\\\\\\\\'\'''\''/g"`;; esac $as_echo "$ac_var='\''$ac_val'\''" done | sort echo if test -n "$ac_subst_files"; then $as_echo "## ------------------- ## ## File substitutions. ## ## ------------------- ##" echo for ac_var in $ac_subst_files do eval ac_val=\$$ac_var case $ac_val in *\'\''*) ac_val=`$as_echo "$ac_val" | sed "s/'\''/'\''\\\\\\\\'\'''\''/g"`;; esac $as_echo "$ac_var='\''$ac_val'\''" done | sort echo fi if test -s confdefs.h; then $as_echo "## ----------- ## ## confdefs.h. 
## ## ----------- ##" echo cat confdefs.h echo fi test "$ac_signal" != 0 && $as_echo "$as_me: caught signal $ac_signal" $as_echo "$as_me: exit $exit_status" } >&5 rm -f core *.core core.conftest.* && rm -f -r conftest* confdefs* conf$$* $ac_clean_files && exit $exit_status ' 0 for ac_signal in 1 2 13 15; do trap 'ac_signal='$ac_signal'; as_fn_exit 1' $ac_signal done ac_signal=0 # confdefs.h avoids OS command line length limits that DEFS can exceed. rm -f -r conftest* confdefs.h $as_echo "/* confdefs.h */" > confdefs.h # Predefined preprocessor variables. cat >>confdefs.h <<_ACEOF #define PACKAGE_NAME "$PACKAGE_NAME" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_TARNAME "$PACKAGE_TARNAME" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_VERSION "$PACKAGE_VERSION" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_STRING "$PACKAGE_STRING" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_BUGREPORT "$PACKAGE_BUGREPORT" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_URL "$PACKAGE_URL" _ACEOF # Let the site file select an alternate cache file if it wants to. # Prefer an explicitly selected file to automatically selected ones. ac_site_file1=NONE ac_site_file2=NONE if test -n "$CONFIG_SITE"; then # We do not want a PATH search for config.site. case $CONFIG_SITE in #(( -*) ac_site_file1=./$CONFIG_SITE;; */*) ac_site_file1=$CONFIG_SITE;; *) ac_site_file1=./$CONFIG_SITE;; esac elif test "x$prefix" != xNONE; then ac_site_file1=$prefix/share/config.site ac_site_file2=$prefix/etc/config.site else ac_site_file1=$ac_default_prefix/share/config.site ac_site_file2=$ac_default_prefix/etc/config.site fi for ac_site_file in "$ac_site_file1" "$ac_site_file2" do test "x$ac_site_file" = xNONE && continue if test /dev/null != "$ac_site_file" && test -r "$ac_site_file"; then { $as_echo "$as_me:${as_lineno-$LINENO}: loading site script $ac_site_file" >&5 $as_echo "$as_me: loading site script $ac_site_file" >&6;} sed 's/^/| /' "$ac_site_file" >&5 . 
"$ac_site_file" \ || { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "failed to load site script $ac_site_file See \`config.log' for more details" "$LINENO" 5 ; } fi done if test -r "$cache_file"; then # Some versions of bash will fail to source /dev/null (special files # actually), so we avoid doing that. DJGPP emulates it as a regular file. if test /dev/null != "$cache_file" && test -f "$cache_file"; then { $as_echo "$as_me:${as_lineno-$LINENO}: loading cache $cache_file" >&5 $as_echo "$as_me: loading cache $cache_file" >&6;} case $cache_file in [\\/]* | ?:[\\/]* ) . "$cache_file";; *) . "./$cache_file";; esac fi else { $as_echo "$as_me:${as_lineno-$LINENO}: creating cache $cache_file" >&5 $as_echo "$as_me: creating cache $cache_file" >&6;} >$cache_file fi # Check that the precious variables saved in the cache have kept the same # value. ac_cache_corrupted=false for ac_var in $ac_precious_vars; do eval ac_old_set=\$ac_cv_env_${ac_var}_set eval ac_new_set=\$ac_env_${ac_var}_set eval ac_old_val=\$ac_cv_env_${ac_var}_value eval ac_new_val=\$ac_env_${ac_var}_value case $ac_old_set,$ac_new_set in set,) { $as_echo "$as_me:${as_lineno-$LINENO}: error: \`$ac_var' was set to \`$ac_old_val' in the previous run" >&5 $as_echo "$as_me: error: \`$ac_var' was set to \`$ac_old_val' in the previous run" >&2;} ac_cache_corrupted=: ;; ,set) { $as_echo "$as_me:${as_lineno-$LINENO}: error: \`$ac_var' was not set in the previous run" >&5 $as_echo "$as_me: error: \`$ac_var' was not set in the previous run" >&2;} ac_cache_corrupted=: ;; ,);; *) if test "x$ac_old_val" != "x$ac_new_val"; then # differences in whitespace do not lead to failure. 
ac_old_val_w=`echo x $ac_old_val` ac_new_val_w=`echo x $ac_new_val` if test "$ac_old_val_w" != "$ac_new_val_w"; then { $as_echo "$as_me:${as_lineno-$LINENO}: error: \`$ac_var' has changed since the previous run:" >&5 $as_echo "$as_me: error: \`$ac_var' has changed since the previous run:" >&2;} ac_cache_corrupted=: else { $as_echo "$as_me:${as_lineno-$LINENO}: warning: ignoring whitespace changes in \`$ac_var' since the previous run:" >&5 $as_echo "$as_me: warning: ignoring whitespace changes in \`$ac_var' since the previous run:" >&2;} eval $ac_var=\$ac_old_val fi { $as_echo "$as_me:${as_lineno-$LINENO}: former value: \`$ac_old_val'" >&5 $as_echo "$as_me: former value: \`$ac_old_val'" >&2;} { $as_echo "$as_me:${as_lineno-$LINENO}: current value: \`$ac_new_val'" >&5 $as_echo "$as_me: current value: \`$ac_new_val'" >&2;} fi;; esac # Pass precious variables to config.status. if test "$ac_new_set" = set; then case $ac_new_val in *\'*) ac_arg=$ac_var=`$as_echo "$ac_new_val" | sed "s/'/'\\\\\\\\''/g"` ;; *) ac_arg=$ac_var=$ac_new_val ;; esac case " $ac_configure_args " in *" '$ac_arg' "*) ;; # Avoid dups. Use of quotes ensures accuracy. *) as_fn_append ac_configure_args " '$ac_arg'" ;; esac fi done if $ac_cache_corrupted; then { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} { $as_echo "$as_me:${as_lineno-$LINENO}: error: changes in the environment can compromise the build" >&5 $as_echo "$as_me: error: changes in the environment can compromise the build" >&2;} as_fn_error $? "run \`make distclean' and/or \`rm $cache_file' and start over" "$LINENO" 5 fi ## -------------------- ## ## Main body of script. 
## ## -------------------- ## ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu if test -n "$ac_tool_prefix"; then # Extract the first word of "${ac_tool_prefix}gcc", so it can be a program name with args. set dummy ${ac_tool_prefix}gcc; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if test "${ac_cv_prog_CC+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_CC="${ac_tool_prefix}gcc" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $CC" >&5 $as_echo "$CC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi fi if test -z "$ac_cv_prog_CC"; then ac_ct_CC=$CC # Extract the first word of "gcc", so it can be a program name with args. set dummy gcc; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if test "${ac_cv_prog_ac_ct_CC+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$ac_ct_CC"; then ac_cv_prog_ac_ct_CC="$ac_ct_CC" # Let the user override the test. 
else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_ac_ct_CC="gcc" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi ac_ct_CC=$ac_cv_prog_ac_ct_CC if test -n "$ac_ct_CC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_ct_CC" >&5 $as_echo "$ac_ct_CC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi if test "x$ac_ct_CC" = x; then CC="" else case $cross_compiling:$ac_tool_warned in yes:) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not prefixed with host triplet" >&5 $as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;} ac_tool_warned=yes ;; esac CC=$ac_ct_CC fi else CC="$ac_cv_prog_CC" fi if test -z "$CC"; then if test -n "$ac_tool_prefix"; then # Extract the first word of "${ac_tool_prefix}cc", so it can be a program name with args. set dummy ${ac_tool_prefix}cc; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if test "${ac_cv_prog_CC+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_CC="${ac_tool_prefix}cc" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $CC" >&5 $as_echo "$CC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi fi fi if test -z "$CC"; then # Extract the first word of "cc", so it can be a program name with args. set dummy cc; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if test "${ac_cv_prog_CC+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else ac_prog_rejected=no as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then if test "$as_dir/$ac_word$ac_exec_ext" = "/usr/ucb/cc"; then ac_prog_rejected=yes continue fi ac_cv_prog_CC="cc" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS if test $ac_prog_rejected = yes; then # We found a bogon in the path, so make sure we never use it. set dummy $ac_cv_prog_CC shift if test $# != 0; then # We chose a different compiler from the bogus one. # However, it has the same basename, so the bogon will be chosen # first if we set CC to just the basename; use the full file name. 
shift ac_cv_prog_CC="$as_dir/$ac_word${1+' '}$@" fi fi fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $CC" >&5 $as_echo "$CC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi fi if test -z "$CC"; then if test -n "$ac_tool_prefix"; then for ac_prog in cl.exe do # Extract the first word of "$ac_tool_prefix$ac_prog", so it can be a program name with args. set dummy $ac_tool_prefix$ac_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if test "${ac_cv_prog_CC+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_CC="$ac_tool_prefix$ac_prog" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $CC" >&5 $as_echo "$CC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi test -n "$CC" && break done fi if test -z "$CC"; then ac_ct_CC=$CC for ac_prog in cl.exe do # Extract the first word of "$ac_prog", so it can be a program name with args. set dummy $ac_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if test "${ac_cv_prog_ac_ct_CC+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$ac_ct_CC"; then ac_cv_prog_ac_ct_CC="$ac_ct_CC" # Let the user override the test. 
else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then ac_cv_prog_ac_ct_CC="$ac_prog" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi ac_ct_CC=$ac_cv_prog_ac_ct_CC if test -n "$ac_ct_CC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_ct_CC" >&5 $as_echo "$ac_ct_CC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi test -n "$ac_ct_CC" && break done if test "x$ac_ct_CC" = x; then CC="" else case $cross_compiling:$ac_tool_warned in yes:) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not prefixed with host triplet" >&5 $as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;} ac_tool_warned=yes ;; esac CC=$ac_ct_CC fi fi fi test -z "$CC" && { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "no acceptable C compiler found in \$PATH See \`config.log' for more details" "$LINENO" 5 ; } # Provide some information about the compiler. $as_echo "$as_me:${as_lineno-$LINENO}: checking for C compiler version" >&5 set X $ac_compile ac_compiler=$2 for ac_option in --version -v -V -qversion; do { { ac_try="$ac_compiler $ac_option >&5" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_compiler $ac_option >&5") 2>conftest.err ac_status=$? if test -s conftest.err; then sed '10a\ ... rest of stderr output deleted ... 10q' conftest.err >conftest.er1 cat conftest.er1 >&5 fi rm -f conftest.er1 conftest.err $as_echo "$as_me:${as_lineno-$LINENO}: \$? 
= $ac_status" >&5 test $ac_status = 0; } done cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ int main () { ; return 0; } _ACEOF ac_clean_files_save=$ac_clean_files ac_clean_files="$ac_clean_files a.out a.out.dSYM a.exe b.out" # Try to create an executable without -o first, disregard a.out. # It will help us diagnose broken compilers, and finding out an intuition # of exeext. { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether the C compiler works" >&5 $as_echo_n "checking whether the C compiler works... " >&6; } ac_link_default=`$as_echo "$ac_link" | sed 's/ -o *conftest[^ ]*//'` # The possible output files: ac_files="a.out conftest.exe conftest a.exe a_out.exe b.out conftest.*" ac_rmfiles= for ac_file in $ac_files do case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.map | *.inf | *.dSYM | *.o | *.obj ) ;; * ) ac_rmfiles="$ac_rmfiles $ac_file";; esac done rm -f $ac_rmfiles if { { ac_try="$ac_link_default" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_link_default") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; }; then : # Autoconf-2.13 could set the ac_cv_exeext variable to `no'. # So ignore a value of `no', otherwise this would lead to `EXEEXT = no' # in a Makefile. We should not override ac_cv_exeext if it was cached, # so that the user can short-circuit this test for compilers unknown to # Autoconf. for ac_file in $ac_files '' do test -f "$ac_file" || continue case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.map | *.inf | *.dSYM | *.o | *.obj ) ;; [ab].out ) # We found the default executable, but exeext='' is most # certainly right. 
break;; *.* ) if test "${ac_cv_exeext+set}" = set && test "$ac_cv_exeext" != no; then :; else ac_cv_exeext=`expr "$ac_file" : '[^.]*\(\..*\)'` fi # We set ac_cv_exeext here because the later test for it is not # safe: cross compilers may not add the suffix if given an `-o' # argument, so we may need to know it at that point already. # Even if this section looks crufty: it has the advantage of # actually working. break;; * ) break;; esac done test "$ac_cv_exeext" = no && ac_cv_exeext= else ac_file='' fi if test -z "$ac_file"; then : { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "C compiler cannot create executables See \`config.log' for more details" "$LINENO" 5 ; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for C compiler default output file name" >&5 $as_echo_n "checking for C compiler default output file name... " >&6; } { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_file" >&5 $as_echo "$ac_file" >&6; } ac_exeext=$ac_cv_exeext rm -f -r a.out a.out.dSYM a.exe conftest$ac_cv_exeext b.out ac_clean_files=$ac_clean_files_save { $as_echo "$as_me:${as_lineno-$LINENO}: checking for suffix of executables" >&5 $as_echo_n "checking for suffix of executables... " >&6; } if { { ac_try="$ac_link" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_link") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; }; then : # If both `conftest.exe' and `conftest' are `present' (well, observable) # catch `conftest.exe'. 
For instance with Cygwin, `ls conftest' will # work properly (i.e., refer to `conftest.exe'), while it won't with # `rm'. for ac_file in conftest.exe conftest conftest.*; do test -f "$ac_file" || continue case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.map | *.inf | *.dSYM | *.o | *.obj ) ;; *.* ) ac_cv_exeext=`expr "$ac_file" : '[^.]*\(\..*\)'` break;; * ) break;; esac done else { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot compute suffix of executables: cannot compile and link See \`config.log' for more details" "$LINENO" 5 ; } fi rm -f conftest conftest$ac_cv_exeext { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_exeext" >&5 $as_echo "$ac_cv_exeext" >&6; } rm -f conftest.$ac_ext EXEEXT=$ac_cv_exeext ac_exeext=$EXEEXT cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <stdio.h> int main () { FILE *f = fopen ("conftest.out", "w"); return ferror (f) || fclose (f) != 0; ; return 0; } _ACEOF ac_clean_files="$ac_clean_files conftest.out" # Check that the compiler produces executables we can run. If not, either # the compiler is broken, or we cross compile. { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether we are cross compiling" >&5 $as_echo_n "checking whether we are cross compiling... " >&6; } if test "$cross_compiling" != yes; then { { ac_try="$ac_link" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_link") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$?
= $ac_status" >&5 test $ac_status = 0; } if { ac_try='./conftest$ac_cv_exeext' { { case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_try") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; }; }; then cross_compiling=no else if test "$cross_compiling" = maybe; then cross_compiling=yes else { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot run C compiled programs. If you meant to cross compile, use \`--host'. See \`config.log' for more details" "$LINENO" 5 ; } fi fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $cross_compiling" >&5 $as_echo "$cross_compiling" >&6; } rm -f conftest.$ac_ext conftest$ac_cv_exeext conftest.out ac_clean_files=$ac_clean_files_save { $as_echo "$as_me:${as_lineno-$LINENO}: checking for suffix of object files" >&5 $as_echo_n "checking for suffix of object files... " >&6; } if test "${ac_cv_objext+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ int main () { ; return 0; } _ACEOF rm -f conftest.o conftest.obj if { { ac_try="$ac_compile" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_compile") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? 
= $ac_status" >&5 test $ac_status = 0; }; then : for ac_file in conftest.o conftest.obj conftest.*; do test -f "$ac_file" || continue; case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.map | *.inf | *.dSYM ) ;; *) ac_cv_objext=`expr "$ac_file" : '.*\.\(.*\)'` break;; esac done else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot compute suffix of object files: cannot compile See \`config.log' for more details" "$LINENO" 5 ; } fi rm -f conftest.$ac_cv_objext conftest.$ac_ext fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_objext" >&5 $as_echo "$ac_cv_objext" >&6; } OBJEXT=$ac_cv_objext ac_objext=$OBJEXT { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether we are using the GNU C compiler" >&5 $as_echo_n "checking whether we are using the GNU C compiler... " >&6; } if test "${ac_cv_c_compiler_gnu+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ int main () { #ifndef __GNUC__ choke me #endif ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_compiler_gnu=yes else ac_compiler_gnu=no fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext ac_cv_c_compiler_gnu=$ac_compiler_gnu fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_c_compiler_gnu" >&5 $as_echo "$ac_cv_c_compiler_gnu" >&6; } if test $ac_compiler_gnu = yes; then GCC=yes else GCC= fi ac_test_CFLAGS=${CFLAGS+set} ac_save_CFLAGS=$CFLAGS { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC accepts -g" >&5 $as_echo_n "checking whether $CC accepts -g... " >&6; } if test "${ac_cv_prog_cc_g+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_save_c_werror_flag=$ac_c_werror_flag ac_c_werror_flag=yes ac_cv_prog_cc_g=no CFLAGS="-g" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. 
*/ int main () { ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_cv_prog_cc_g=yes else CFLAGS="" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ int main () { ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : else ac_c_werror_flag=$ac_save_c_werror_flag CFLAGS="-g" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ int main () { ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_cv_prog_cc_g=yes fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext ac_c_werror_flag=$ac_save_c_werror_flag fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_prog_cc_g" >&5 $as_echo "$ac_cv_prog_cc_g" >&6; } if test "$ac_test_CFLAGS" = set; then CFLAGS=$ac_save_CFLAGS elif test $ac_cv_prog_cc_g = yes; then if test "$GCC" = yes; then CFLAGS="-g -O2" else CFLAGS="-g" fi else if test "$GCC" = yes; then CFLAGS="-O2" else CFLAGS= fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $CC option to accept ISO C89" >&5 $as_echo_n "checking for $CC option to accept ISO C89... " >&6; } if test "${ac_cv_prog_cc_c89+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_cv_prog_cc_c89=no ac_save_CC=$CC cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <stdarg.h> #include <stdio.h> #include <sys/types.h> #include <sys/stat.h> /* Most of the following tests are stolen from RCS 5.7's src/conf.sh. */ struct buf { int x; }; FILE * (*rcsopen) (struct buf *, struct stat *, int); static char *e (p, i) char **p; int i; { return p[i]; } static char *f (char * (*g) (char **, int), char **p, ...) { char *s; va_list v; va_start (v,p); s = g (p, va_arg (v,int)); va_end (v); return s; } /* OSF 4.0 Compaq cc is some sort of almost-ANSI by default. It has function prototypes and stuff, but not '\xHH' hex character constants. These don't provoke an error unfortunately, instead are silently treated as 'x'.
The following induces an error, until -std is added to get proper ANSI mode. Curiously '\x00'!='x' always comes out true, for an array size at least. It's necessary to write '\x00'==0 to get something that's true only with -std. */ int osf4_cc_array ['\x00' == 0 ? 1 : -1]; /* IBM C 6 for AIX is almost-ANSI by default, but it replaces macro parameters inside strings and character constants. */ #define FOO(x) 'x' int xlc6_cc_array[FOO(a) == 'x' ? 1 : -1]; int test (int i, double x); struct s1 {int (*f) (int a);}; struct s2 {int (*f) (double a);}; int pairnames (int, char **, FILE *(*)(struct buf *, struct stat *, int), int, int); int argc; char **argv; int main () { return f (e, argv, 0) != argv[0] || f (e, argv, 1) != argv[1]; ; return 0; } _ACEOF for ac_arg in '' -qlanglvl=extc89 -qlanglvl=ansi -std \ -Ae "-Aa -D_HPUX_SOURCE" "-Xc -D__EXTENSIONS__" do CC="$ac_save_CC $ac_arg" if ac_fn_c_try_compile "$LINENO"; then : ac_cv_prog_cc_c89=$ac_arg fi rm -f core conftest.err conftest.$ac_objext test "x$ac_cv_prog_cc_c89" != "xno" && break done rm -f conftest.$ac_ext CC=$ac_save_CC fi # AC_CACHE_VAL case "x$ac_cv_prog_cc_c89" in x) { $as_echo "$as_me:${as_lineno-$LINENO}: result: none needed" >&5 $as_echo "none needed" >&6; } ;; xno) { $as_echo "$as_me:${as_lineno-$LINENO}: result: unsupported" >&5 $as_echo "unsupported" >&6; } ;; *) CC="$CC $ac_cv_prog_cc_c89" { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_prog_cc_c89" >&5 $as_echo "$ac_cv_prog_cc_c89" >&6; } ;; esac if test "x$ac_cv_prog_cc_c89" != xno; then : fi ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu { $as_echo 
"$as_me:${as_lineno-$LINENO}: checking how to run the C preprocessor" >&5 $as_echo_n "checking how to run the C preprocessor... " >&6; } # On Suns, sometimes $CPP names a directory. if test -n "$CPP" && test -d "$CPP"; then CPP= fi if test -z "$CPP"; then if test "${ac_cv_prog_CPP+set}" = set; then : $as_echo_n "(cached) " >&6 else # Double quotes because CPP needs to be expanded for CPP in "$CC -E" "$CC -E -traditional-cpp" "/lib/cpp" do ac_preproc_ok=false for ac_c_preproc_warn_flag in '' yes do # Use a header file that comes with gcc, so configuring glibc # with a fresh cross-compiler works. # Prefer to if __STDC__ is defined, since # exists even on freestanding compilers. # On the NeXT, cc -E runs the code through the compiler's parser, # not just through cpp. "Syntax error" is here to catch this case. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #ifdef __STDC__ # include #else # include #endif Syntax error _ACEOF if ac_fn_c_try_cpp "$LINENO"; then : else # Broken: fails on valid input. continue fi rm -f conftest.err conftest.i conftest.$ac_ext # OK, works on sane cases. Now check whether nonexistent headers # can be detected and how. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include _ACEOF if ac_fn_c_try_cpp "$LINENO"; then : # Broken: success on invalid input. continue else # Passes both tests. ac_preproc_ok=: break fi rm -f conftest.err conftest.i conftest.$ac_ext done # Because of `break', _AC_PREPROC_IFELSE's cleaning code was skipped. rm -f conftest.i conftest.err conftest.$ac_ext if $ac_preproc_ok; then : break fi done ac_cv_prog_CPP=$CPP fi CPP=$ac_cv_prog_CPP else ac_cv_prog_CPP=$CPP fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $CPP" >&5 $as_echo "$CPP" >&6; } ac_preproc_ok=false for ac_c_preproc_warn_flag in '' yes do # Use a header file that comes with gcc, so configuring glibc # with a fresh cross-compiler works. # Prefer to if __STDC__ is defined, since # exists even on freestanding compilers. 
# On the NeXT, cc -E runs the code through the compiler's parser, # not just through cpp. "Syntax error" is here to catch this case. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif Syntax error _ACEOF if ac_fn_c_try_cpp "$LINENO"; then : else # Broken: fails on valid input. continue fi rm -f conftest.err conftest.i conftest.$ac_ext # OK, works on sane cases. Now check whether nonexistent headers # can be detected and how. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <ac_nonexistent.h> _ACEOF if ac_fn_c_try_cpp "$LINENO"; then : # Broken: success on invalid input. continue else # Passes both tests. ac_preproc_ok=: break fi rm -f conftest.err conftest.i conftest.$ac_ext done # Because of `break', _AC_PREPROC_IFELSE's cleaning code was skipped. rm -f conftest.i conftest.err conftest.$ac_ext if $ac_preproc_ok; then : else { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "C preprocessor \"$CPP\" fails sanity check See \`config.log' for more details" "$LINENO" 5 ; } fi ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu { $as_echo "$as_me:${as_lineno-$LINENO}: checking for grep that handles long lines and -e" >&5 $as_echo_n "checking for grep that handles long lines and -e... " >&6; } if test "${ac_cv_path_GREP+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -z "$GREP"; then ac_path_GREP_found=false # Loop through the user's path and test for each of PROGNAME-LIST as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH$PATH_SEPARATOR/usr/xpg4/bin do IFS=$as_save_IFS test -z "$as_dir" && as_dir=.
for ac_prog in grep ggrep; do for ac_exec_ext in '' $ac_executable_extensions; do ac_path_GREP="$as_dir/$ac_prog$ac_exec_ext" { test -f "$ac_path_GREP" && $as_test_x "$ac_path_GREP"; } || continue # Check for GNU ac_path_GREP and select it if it is found. # Check for GNU $ac_path_GREP case `"$ac_path_GREP" --version 2>&1` in *GNU*) ac_cv_path_GREP="$ac_path_GREP" ac_path_GREP_found=:;; *) ac_count=0 $as_echo_n 0123456789 >"conftest.in" while : do cat "conftest.in" "conftest.in" >"conftest.tmp" mv "conftest.tmp" "conftest.in" cp "conftest.in" "conftest.nl" $as_echo 'GREP' >> "conftest.nl" "$ac_path_GREP" -e 'GREP$' -e '-(cannot match)-' < "conftest.nl" >"conftest.out" 2>/dev/null || break diff "conftest.out" "conftest.nl" >/dev/null 2>&1 || break as_fn_arith $ac_count + 1 && ac_count=$as_val if test $ac_count -gt ${ac_path_GREP_max-0}; then # Best one so far, save it but keep looking for a better one ac_cv_path_GREP="$ac_path_GREP" ac_path_GREP_max=$ac_count fi # 10*(2^10) chars as input seems more than enough test $ac_count -gt 10 && break done rm -f conftest.in conftest.tmp conftest.nl conftest.out;; esac $ac_path_GREP_found && break 3 done done done IFS=$as_save_IFS if test -z "$ac_cv_path_GREP"; then as_fn_error $? "no acceptable grep could be found in $PATH$PATH_SEPARATOR/usr/xpg4/bin" "$LINENO" 5 fi else ac_cv_path_GREP=$GREP fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_path_GREP" >&5 $as_echo "$ac_cv_path_GREP" >&6; } GREP="$ac_cv_path_GREP" { $as_echo "$as_me:${as_lineno-$LINENO}: checking for egrep" >&5 $as_echo_n "checking for egrep... 
" >&6; } if test "${ac_cv_path_EGREP+set}" = set; then : $as_echo_n "(cached) " >&6 else if echo a | $GREP -E '(a|b)' >/dev/null 2>&1 then ac_cv_path_EGREP="$GREP -E" else if test -z "$EGREP"; then ac_path_EGREP_found=false # Loop through the user's path and test for each of PROGNAME-LIST as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH$PATH_SEPARATOR/usr/xpg4/bin do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_prog in egrep; do for ac_exec_ext in '' $ac_executable_extensions; do ac_path_EGREP="$as_dir/$ac_prog$ac_exec_ext" { test -f "$ac_path_EGREP" && $as_test_x "$ac_path_EGREP"; } || continue # Check for GNU ac_path_EGREP and select it if it is found. # Check for GNU $ac_path_EGREP case `"$ac_path_EGREP" --version 2>&1` in *GNU*) ac_cv_path_EGREP="$ac_path_EGREP" ac_path_EGREP_found=:;; *) ac_count=0 $as_echo_n 0123456789 >"conftest.in" while : do cat "conftest.in" "conftest.in" >"conftest.tmp" mv "conftest.tmp" "conftest.in" cp "conftest.in" "conftest.nl" $as_echo 'EGREP' >> "conftest.nl" "$ac_path_EGREP" 'EGREP$' < "conftest.nl" >"conftest.out" 2>/dev/null || break diff "conftest.out" "conftest.nl" >/dev/null 2>&1 || break as_fn_arith $ac_count + 1 && ac_count=$as_val if test $ac_count -gt ${ac_path_EGREP_max-0}; then # Best one so far, save it but keep looking for a better one ac_cv_path_EGREP="$ac_path_EGREP" ac_path_EGREP_max=$ac_count fi # 10*(2^10) chars as input seems more than enough test $ac_count -gt 10 && break done rm -f conftest.in conftest.tmp conftest.nl conftest.out;; esac $ac_path_EGREP_found && break 3 done done done IFS=$as_save_IFS if test -z "$ac_cv_path_EGREP"; then as_fn_error $? 
"no acceptable egrep could be found in $PATH$PATH_SEPARATOR/usr/xpg4/bin" "$LINENO" 5 fi else ac_cv_path_EGREP=$EGREP fi fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_path_EGREP" >&5 $as_echo "$ac_cv_path_EGREP" >&6; } EGREP="$ac_cv_path_EGREP" { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ANSI C header files" >&5 $as_echo_n "checking for ANSI C header files... " >&6; } if test "${ac_cv_header_stdc+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <stdlib.h> #include <stdarg.h> #include <string.h> #include <float.h> int main () { ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_cv_header_stdc=yes else ac_cv_header_stdc=no fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext if test $ac_cv_header_stdc = yes; then # SunOS 4.x string.h does not declare mem*, contrary to ANSI. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <string.h> _ACEOF if (eval "$ac_cpp conftest.$ac_ext") 2>&5 | $EGREP "memchr" >/dev/null 2>&1; then : else ac_cv_header_stdc=no fi rm -f conftest* fi if test $ac_cv_header_stdc = yes; then # ISC 2.0.2 stdlib.h does not declare free, contrary to ANSI. cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <stdlib.h> _ACEOF if (eval "$ac_cpp conftest.$ac_ext") 2>&5 | $EGREP "free" >/dev/null 2>&1; then : else ac_cv_header_stdc=no fi rm -f conftest* fi if test $ac_cv_header_stdc = yes; then # /bin/cc in Irix-4.0.5 gets non-ANSI ctype macros unless using -ansi. if test "$cross_compiling" = yes; then : : else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <ctype.h> #include <stdlib.h> #if ((' ' & 0x0FF) == 0x020) # define ISLOWER(c) ('a' <= (c) && (c) <= 'z') # define TOUPPER(c) (ISLOWER(c) ? 'A' + ((c) - 'a') : (c)) #else # define ISLOWER(c) \ (('a' <= (c) && (c) <= 'i') \ || ('j' <= (c) && (c) <= 'r') \ || ('s' <= (c) && (c) <= 'z')) # define TOUPPER(c) (ISLOWER(c) ?
((c) | 0x40) : (c)) #endif #define XOR(e, f) (((e) && !(f)) || (!(e) && (f))) int main () { int i; for (i = 0; i < 256; i++) if (XOR (islower (i), ISLOWER (i)) || toupper (i) != TOUPPER (i)) return 2; return 0; } _ACEOF if ac_fn_c_try_run "$LINENO"; then : else ac_cv_header_stdc=no fi rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \ conftest.$ac_objext conftest.beam conftest.$ac_ext fi fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_header_stdc" >&5 $as_echo "$ac_cv_header_stdc" >&6; } if test $ac_cv_header_stdc = yes; then $as_echo "#define STDC_HEADERS 1" >>confdefs.h fi # On IRIX 5.3, sys/types and inttypes.h are conflicting. for ac_header in sys/types.h sys/stat.h stdlib.h string.h memory.h strings.h \ inttypes.h stdint.h unistd.h do : as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh` ac_fn_c_check_header_compile "$LINENO" "$ac_header" "$as_ac_Header" "$ac_includes_default " if eval test \"x\$"$as_ac_Header"\" = x"yes"; then : cat >>confdefs.h <<_ACEOF #define `$as_echo "HAVE_$ac_header" | $as_tr_cpp` 1 _ACEOF fi done # Check whether --with-vmware_incdir was given. 
if test "${with_vmware_incdir+set}" = set; then : withval=$with_vmware_incdir; VMWARE_INCDIR=$withval CFLAGS="$CFLAGS -I$withval" ac_fn_c_check_header_mongrel "$LINENO" "vmGuestLib.h" "ac_cv_header_vmGuestLib_h" "$ac_includes_default" if test "x$ac_cv_header_vmGuestLib_h" = x""yes; then : VMGUESTLIB=1 else { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: vmGuestLib.h not found" >&5 $as_echo "$as_me: WARNING: vmGuestLib.h not found" >&2;} fi else { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: Component requires path to vmware includes" >&5 $as_echo "$as_me: WARNING: Component requires path to vmware includes" >&2;} fi ac_config_files="$ac_config_files Makefile.vmware" cat >confcache <<\_ACEOF # This file is a shell script that caches the results of configure # tests run on this system so they can be shared between configure # scripts and configure runs, see configure's option --config-cache. # It is not useful on other systems. If it contains results you don't # want to keep, you may remove or edit it. # # config.status only pays attention to the cache file if you give it # the --recheck option to rerun configure. # # `ac_cv_env_foo' variables (set or unset) will be overridden when # loading this file, other *unset* `ac_cv_foo' will be assigned the # following values. _ACEOF # The following way of writing the cache mishandles newlines in values, # but we know of no workaround that is simple, portable, and efficient. # So, we kill variables containing newlines. # Ultrix sh set writes to stderr and can't be redirected directly, # and sets the high bit in the cache file unless we assign to the vars. 
( for ac_var in `(set) 2>&1 | sed -n 's/^\([a-zA-Z_][a-zA-Z0-9_]*\)=.*/\1/p'`; do eval ac_val=\$$ac_var case $ac_val in #( *${as_nl}*) case $ac_var in #( *_cv_*) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: cache variable $ac_var contains a newline" >&5 $as_echo "$as_me: WARNING: cache variable $ac_var contains a newline" >&2;} ;; esac case $ac_var in #( _ | IFS | as_nl) ;; #( BASH_ARGV | BASH_SOURCE) eval $ac_var= ;; #( *) { eval $ac_var=; unset $ac_var;} ;; esac ;; esac done (set) 2>&1 | case $as_nl`(ac_space=' '; set) 2>&1` in #( *${as_nl}ac_space=\ *) # `set' does not quote correctly, so add quotes: double-quote # substitution turns \\\\ into \\, and sed turns \\ into \. sed -n \ "s/'/'\\\\''/g; s/^\\([_$as_cr_alnum]*_cv_[_$as_cr_alnum]*\\)=\\(.*\\)/\\1='\\2'/p" ;; #( *) # `set' quotes correctly as required by POSIX, so do not add quotes. sed -n "/^[_$as_cr_alnum]*_cv_[_$as_cr_alnum]*=/p" ;; esac | sort ) | sed ' /^ac_cv_env_/b end t clear :clear s/^\([^=]*\)=\(.*[{}].*\)$/test "${\1+set}" = set || &/ t end s/^\([^=]*\)=\(.*\)$/\1=${\1=\2}/ :end' >>confcache if diff "$cache_file" confcache >/dev/null 2>&1; then :; else if test -w "$cache_file"; then test "x$cache_file" != "x/dev/null" && { $as_echo "$as_me:${as_lineno-$LINENO}: updating cache $cache_file" >&5 $as_echo "$as_me: updating cache $cache_file" >&6;} cat confcache >$cache_file else { $as_echo "$as_me:${as_lineno-$LINENO}: not updating unwritable cache $cache_file" >&5 $as_echo "$as_me: not updating unwritable cache $cache_file" >&6;} fi fi rm -f confcache test "x$prefix" = xNONE && prefix=$ac_default_prefix # Let make expand exec_prefix. test "x$exec_prefix" = xNONE && exec_prefix='${prefix}' # Transform confdefs.h into DEFS. # Protect against shell expansion while executing Makefile rules. # Protect against Makefile macro expansion. # # If the first sed substitution is executed (which looks for macros that # take arguments), then branch to the quote section. 
Otherwise, # look for a macro that doesn't take arguments. ac_script=' :mline /\\$/{ N s,\\\n,, b mline } t clear :clear s/^[ ]*#[ ]*define[ ][ ]*\([^ (][^ (]*([^)]*)\)[ ]*\(.*\)/-D\1=\2/g t quote s/^[ ]*#[ ]*define[ ][ ]*\([^ ][^ ]*\)[ ]*\(.*\)/-D\1=\2/g t quote b any :quote s/[ `~#$^&*(){}\\|;'\''"<>?]/\\&/g s/\[/\\&/g s/\]/\\&/g s/\$/$$/g H :any ${ g s/^\n// s/\n/ /g p } ' DEFS=`sed -n "$ac_script" confdefs.h` ac_libobjs= ac_ltlibobjs= U= for ac_i in : $LIBOBJS; do test "x$ac_i" = x: && continue # 1. Remove the extension, and $U if already installed. ac_script='s/\$U\././;s/\.o$//;s/\.obj$//' ac_i=`$as_echo "$ac_i" | sed "$ac_script"` # 2. Prepend LIBOBJDIR. When used with automake>=1.10 LIBOBJDIR # will be set to the directory where LIBOBJS objects are built. as_fn_append ac_libobjs " \${LIBOBJDIR}$ac_i\$U.$ac_objext" as_fn_append ac_ltlibobjs " \${LIBOBJDIR}$ac_i"'$U.lo' done LIBOBJS=$ac_libobjs LTLIBOBJS=$ac_ltlibobjs : ${CONFIG_STATUS=./config.status} ac_write_fail=0 ac_clean_files_save=$ac_clean_files ac_clean_files="$ac_clean_files $CONFIG_STATUS" { $as_echo "$as_me:${as_lineno-$LINENO}: creating $CONFIG_STATUS" >&5 $as_echo "$as_me: creating $CONFIG_STATUS" >&6;} as_write_fail=0 cat >$CONFIG_STATUS <<_ASEOF || as_write_fail=1 #! $SHELL # Generated by $as_me. # Run this file to recreate the current configuration. # Compiler output produced by configure, useful for debugging # configure, is in config.log if it exists. debug=false ac_cs_recheck=false ac_cs_silent=false SHELL=\${CONFIG_SHELL-$SHELL} export SHELL _ASEOF cat >>$CONFIG_STATUS <<\_ASEOF || as_write_fail=1 ## -------------------- ## ## M4sh Initialization. ## ## -------------------- ## # Be more Bourne compatible DUALCASE=1; export DUALCASE # for MKS sh if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then : emulate sh NULLCMD=: # Pre-4.2 versions of Zsh do word splitting on ${1+"$@"}, which # is contrary to our usage. Disable this feature. 
alias -g '${1+"$@"}'='"$@"' setopt NO_GLOB_SUBST else case `(set -o) 2>/dev/null` in #( *posix*) : set -o posix ;; #( *) : ;; esac fi as_nl=' ' export as_nl # Printing a long string crashes Solaris 7 /usr/bin/printf. as_echo='\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\' as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo$as_echo # Prefer a ksh shell builtin over an external printf program on Solaris, # but without wasting forks for bash or zsh. if test -z "$BASH_VERSION$ZSH_VERSION" \ && (test "X`print -r -- $as_echo`" = "X$as_echo") 2>/dev/null; then as_echo='print -r --' as_echo_n='print -rn --' elif (test "X`printf %s $as_echo`" = "X$as_echo") 2>/dev/null; then as_echo='printf %s\n' as_echo_n='printf %s' else if test "X`(/usr/ucb/echo -n -n $as_echo) 2>/dev/null`" = "X-n $as_echo"; then as_echo_body='eval /usr/ucb/echo -n "$1$as_nl"' as_echo_n='/usr/ucb/echo -n' else as_echo_body='eval expr "X$1" : "X\\(.*\\)"' as_echo_n_body='eval arg=$1; case $arg in #( *"$as_nl"*) expr "X$arg" : "X\\(.*\\)$as_nl"; arg=`expr "X$arg" : ".*$as_nl\\(.*\\)"`;; esac; expr "X$arg" : "X\\(.*\\)" | tr -d "$as_nl" ' export as_echo_n_body as_echo_n='sh -c $as_echo_n_body as_echo' fi export as_echo_body as_echo='sh -c $as_echo_body as_echo' fi # The user is always right. if test "${PATH_SEPARATOR+set}" != set; then PATH_SEPARATOR=: (PATH='/bin;/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 && { (PATH='/bin:/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 || PATH_SEPARATOR=';' } fi # IFS # We need space, tab and new line, in precisely that order. Quoting is # there to prevent editors from complaining about space-tab. # (If _AS_PATH_WALK were called with IFS unset, it would disable word # splitting by setting IFS to empty value.) IFS=" "" $as_nl" # Find who we are. Look in the path if we contain no directory separator. 
case $0 in #(( *[\\/]* ) as_myself=$0 ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. test -r "$as_dir/$0" && as_myself=$as_dir/$0 && break done IFS=$as_save_IFS ;; esac # We did not find ourselves, most probably we were run as `sh COMMAND' # in which case we are not to be found in the path. if test "x$as_myself" = x; then as_myself=$0 fi if test ! -f "$as_myself"; then $as_echo "$as_myself: error: cannot find myself; rerun with an absolute file name" >&2 exit 1 fi # Unset variables that we do not need and which cause bugs (e.g. in # pre-3.0 UWIN ksh). But do not cause bugs in bash 2.01; the "|| exit 1" # suppresses any "Segmentation fault" message there. '((' could # trigger a bug in pdksh 5.2.14. for as_var in BASH_ENV ENV MAIL MAILPATH do eval test x\${$as_var+set} = xset \ && ( (unset $as_var) || exit 1) >/dev/null 2>&1 && unset $as_var || : done PS1='$ ' PS2='> ' PS4='+ ' # NLS nuisances. LC_ALL=C export LC_ALL LANGUAGE=C export LANGUAGE # CDPATH. (unset CDPATH) >/dev/null 2>&1 && unset CDPATH # as_fn_error STATUS ERROR [LINENO LOG_FD] # ---------------------------------------- # Output "`basename $0`: error: ERROR" to stderr. If LINENO and LOG_FD are # provided, also output the error to LOG_FD, referencing LINENO. Then exit the # script with STATUS, using 1 if that was 0. as_fn_error () { as_status=$1; test $as_status -eq 0 && as_status=1 if test "$4"; then as_lineno=${as_lineno-"$3"} as_lineno_stack=as_lineno_stack=$as_lineno_stack $as_echo "$as_me:${as_lineno-$LINENO}: error: $2" >&$4 fi $as_echo "$as_me: error: $2" >&2 as_fn_exit $as_status } # as_fn_error # as_fn_set_status STATUS # ----------------------- # Set $? to STATUS, without forking. as_fn_set_status () { return $1 } # as_fn_set_status # as_fn_exit STATUS # ----------------- # Exit the shell with STATUS, even in a "trap 0" or "set -e" context. 
as_fn_exit () { set +e as_fn_set_status $1 exit $1 } # as_fn_exit # as_fn_unset VAR # --------------- # Portably unset VAR. as_fn_unset () { { eval $1=; unset $1;} } as_unset=as_fn_unset # as_fn_append VAR VALUE # ---------------------- # Append the text in VALUE to the end of the definition contained in VAR. Take # advantage of any shell optimizations that allow amortized linear growth over # repeated appends, instead of the typical quadratic growth present in naive # implementations. if (eval "as_var=1; as_var+=2; test x\$as_var = x12") 2>/dev/null; then : eval 'as_fn_append () { eval $1+=\$2 }' else as_fn_append () { eval $1=\$$1\$2 } fi # as_fn_append # as_fn_arith ARG... # ------------------ # Perform arithmetic evaluation on the ARGs, and store the result in the # global $as_val. Take advantage of shells that can avoid forks. The arguments # must be portable across $(()) and expr. if (eval "test \$(( 1 + 1 )) = 2") 2>/dev/null; then : eval 'as_fn_arith () { as_val=$(( $* )) }' else as_fn_arith () { as_val=`expr "$@" || test $? -eq 1` } fi # as_fn_arith if expr a : '\(a\)' >/dev/null 2>&1 && test "X`expr 00001 : '.*\(...\)'`" = X001; then as_expr=expr else as_expr=false fi if (basename -- /) >/dev/null 2>&1 && test "X`basename -- / 2>&1`" = "X/"; then as_basename=basename else as_basename=false fi if (as_dir=`dirname -- /` && test "X$as_dir" = X/) >/dev/null 2>&1; then as_dirname=dirname else as_dirname=false fi as_me=`$as_basename -- "$0" || $as_expr X/"$0" : '.*/\([^/][^/]*\)/*$' \| \ X"$0" : 'X\(//\)$' \| \ X"$0" : 'X\(/\)' \| . 2>/dev/null || $as_echo X/"$0" | sed '/^.*\/\([^/][^/]*\)\/*$/{ s//\1/ q } /^X\/\(\/\/\)$/{ s//\1/ q } /^X\/\(\/\).*/{ s//\1/ q } s/.*/./; q'` # Avoid depending upon Character Ranges. 
as_cr_letters='abcdefghijklmnopqrstuvwxyz' as_cr_LETTERS='ABCDEFGHIJKLMNOPQRSTUVWXYZ' as_cr_Letters=$as_cr_letters$as_cr_LETTERS as_cr_digits='0123456789' as_cr_alnum=$as_cr_Letters$as_cr_digits ECHO_C= ECHO_N= ECHO_T= case `echo -n x` in #((((( -n*) case `echo 'xy\c'` in *c*) ECHO_T=' ';; # ECHO_T is single tab character. xy) ECHO_C='\c';; *) echo `echo ksh88 bug on AIX 6.1` > /dev/null ECHO_T=' ';; esac;; *) ECHO_N='-n';; esac rm -f conf$$ conf$$.exe conf$$.file if test -d conf$$.dir; then rm -f conf$$.dir/conf$$.file else rm -f conf$$.dir mkdir conf$$.dir 2>/dev/null fi if (echo >conf$$.file) 2>/dev/null; then if ln -s conf$$.file conf$$ 2>/dev/null; then as_ln_s='ln -s' # ... but there are two gotchas: # 1) On MSYS, both `ln -s file dir' and `ln file dir' fail. # 2) DJGPP < 2.04 has no symlinks; `ln -s' creates a wrapper executable. # In both cases, we have to default to `cp -p'. ln -s conf$$.file conf$$.dir 2>/dev/null && test ! -f conf$$.exe || as_ln_s='cp -p' elif ln conf$$.file conf$$ 2>/dev/null; then as_ln_s=ln else as_ln_s='cp -p' fi else as_ln_s='cp -p' fi rm -f conf$$ conf$$.exe conf$$.dir/conf$$.file conf$$.file rmdir conf$$.dir 2>/dev/null # as_fn_mkdir_p # ------------- # Create "$as_dir" as a directory, including parents if necessary. as_fn_mkdir_p () { case $as_dir in #( -*) as_dir=./$as_dir;; esac test -d "$as_dir" || eval $as_mkdir_p || { as_dirs= while :; do case $as_dir in #( *\'*) as_qdir=`$as_echo "$as_dir" | sed "s/'/'\\\\\\\\''/g"`;; #'( *) as_qdir=$as_dir;; esac as_dirs="'$as_qdir' $as_dirs" as_dir=`$as_dirname -- "$as_dir" || $as_expr X"$as_dir" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$as_dir" : 'X\(//\)[^/]' \| \ X"$as_dir" : 'X\(//\)$' \| \ X"$as_dir" : 'X\(/\)' \| . 
2>/dev/null || $as_echo X"$as_dir" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/ q } /^X\(\/\/\)[^/].*/{ s//\1/ q } /^X\(\/\/\)$/{ s//\1/ q } /^X\(\/\).*/{ s//\1/ q } s/.*/./; q'` test -d "$as_dir" && break done test -z "$as_dirs" || eval "mkdir $as_dirs" } || test -d "$as_dir" || as_fn_error $? "cannot create directory $as_dir" } # as_fn_mkdir_p if mkdir -p . 2>/dev/null; then as_mkdir_p='mkdir -p "$as_dir"' else test -d ./-p && rmdir ./-p as_mkdir_p=false fi if test -x / >/dev/null 2>&1; then as_test_x='test -x' else if ls -dL / >/dev/null 2>&1; then as_ls_L_option=L else as_ls_L_option= fi as_test_x=' eval sh -c '\'' if test -d "$1"; then test -d "$1/."; else case $1 in #( -*)set "./$1";; esac; case `ls -ld'$as_ls_L_option' "$1" 2>/dev/null` in #(( ???[sx]*):;;*)false;;esac;fi '\'' sh ' fi as_executable_p=$as_test_x # Sed expression to map a string onto a valid CPP name. as_tr_cpp="eval sed 'y%*$as_cr_letters%P$as_cr_LETTERS%;s%[^_$as_cr_alnum]%_%g'" # Sed expression to map a string onto a valid variable name. as_tr_sh="eval sed 'y%*+%pp%;s%[^_$as_cr_alnum]%_%g'" exec 6>&1 ## ----------------------------------- ## ## Main body of $CONFIG_STATUS script. ## ## ----------------------------------- ## _ASEOF test $as_write_fail = 0 && chmod +x $CONFIG_STATUS || ac_write_fail=1 cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 # Save the log message, to keep $0 and so on meaningful, and to # report actual input values of CONFIG_FILES etc. instead of their # values after options handling. ac_log=" This file was extended by $as_me, which was generated by GNU Autoconf 2.67. Invocation command line was CONFIG_FILES = $CONFIG_FILES CONFIG_HEADERS = $CONFIG_HEADERS CONFIG_LINKS = $CONFIG_LINKS CONFIG_COMMANDS = $CONFIG_COMMANDS $ $0 $@ on `(hostname || uname -n) 2>/dev/null | sed 1q` " _ACEOF case $ac_config_files in *" "*) set x $ac_config_files; shift; ac_config_files=$*;; esac cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 # Files that config.status was made for. 
config_files="$ac_config_files" _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 ac_cs_usage="\ \`$as_me' instantiates files and other configuration actions from templates according to the current configuration. Unless the files and actions are specified as TAGs, all are instantiated by default. Usage: $0 [OPTION]... [TAG]... -h, --help print this help, then exit -V, --version print version number and configuration settings, then exit --config print configuration, then exit -q, --quiet, --silent do not print progress messages -d, --debug don't remove temporary files --recheck update $as_me by reconfiguring in the same conditions --file=FILE[:TEMPLATE] instantiate the configuration file FILE Configuration files: $config_files Report bugs to the package provider." _ACEOF cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`" ac_cs_version="\\ config.status configured by $0, generated by GNU Autoconf 2.67, with options \\"\$ac_cs_config\\" Copyright (C) 2010 Free Software Foundation, Inc. This config.status script is free software; the Free Software Foundation gives unlimited permission to copy, distribute and modify it." ac_pwd='$ac_pwd' srcdir='$srcdir' test -n "\$AWK" || AWK=awk _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 # The default lists apply if the user does not specify any file. ac_need_defaults=: while test $# != 0 do case $1 in --*=?*) ac_option=`expr "X$1" : 'X\([^=]*\)='` ac_optarg=`expr "X$1" : 'X[^=]*=\(.*\)'` ac_shift=: ;; --*=) ac_option=`expr "X$1" : 'X\([^=]*\)='` ac_optarg= ac_shift=: ;; *) ac_option=$1 ac_optarg=$2 ac_shift=shift ;; esac case $ac_option in # Handling of the options. 
-recheck | --recheck | --rechec | --reche | --rech | --rec | --re | --r) ac_cs_recheck=: ;; --version | --versio | --versi | --vers | --ver | --ve | --v | -V ) $as_echo "$ac_cs_version"; exit ;; --config | --confi | --conf | --con | --co | --c ) $as_echo "$ac_cs_config"; exit ;; --debug | --debu | --deb | --de | --d | -d ) debug=: ;; --file | --fil | --fi | --f ) $ac_shift case $ac_optarg in *\'*) ac_optarg=`$as_echo "$ac_optarg" | sed "s/'/'\\\\\\\\''/g"` ;; '') as_fn_error $? "missing file argument" ;; esac as_fn_append CONFIG_FILES " '$ac_optarg'" ac_need_defaults=false;; --he | --h | --help | --hel | -h ) $as_echo "$ac_cs_usage"; exit ;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil | --si | --s) ac_cs_silent=: ;; # This is an error. -*) as_fn_error $? "unrecognized option: \`$1' Try \`$0 --help' for more information." ;; *) as_fn_append ac_config_targets " $1" ac_need_defaults=false ;; esac shift done ac_configure_extra_args= if $ac_cs_silent; then exec 6>/dev/null ac_configure_extra_args="$ac_configure_extra_args --silent" fi _ACEOF cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 if \$ac_cs_recheck; then set X '$SHELL' '$0' $ac_configure_args \$ac_configure_extra_args --no-create --no-recursion shift \$as_echo "running CONFIG_SHELL=$SHELL \$*" >&6 CONFIG_SHELL='$SHELL' export CONFIG_SHELL exec "\$@" fi _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 exec 5>>config.log { echo sed 'h;s/./-/g;s/^.../## /;s/...$/ ##/;p;x;p;x' <<_ASBOX ## Running $as_me. ## _ASBOX $as_echo "$ac_log" } >&5 _ACEOF cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 # Handling of arguments. for ac_config_target in $ac_config_targets do case $ac_config_target in "Makefile.vmware") CONFIG_FILES="$CONFIG_FILES Makefile.vmware" ;; *) as_fn_error $? 
"invalid argument: \`$ac_config_target'" "$LINENO" 5 ;; esac done # If the user did not use the arguments to specify the items to instantiate, # then the envvar interface is used. Set only those that are not. # We use the long form for the default assignment because of an extremely # bizarre bug on SunOS 4.1.3. if $ac_need_defaults; then test "${CONFIG_FILES+set}" = set || CONFIG_FILES=$config_files fi # Have a temporary directory for convenience. Make it in the build tree # simply because there is no reason against having it here, and in addition, # creating and moving files from /tmp can sometimes cause problems. # Hook for its removal unless debugging. # Note that there is a small window in which the directory will not be cleaned: # after its creation but before its name has been assigned to `$tmp'. $debug || { tmp= trap 'exit_status=$? { test -z "$tmp" || test ! -d "$tmp" || rm -fr "$tmp"; } && exit $exit_status ' 0 trap 'as_fn_exit 1' 1 2 13 15 } # Create a (secure) tmp directory for tmp files. { tmp=`(umask 077 && mktemp -d "./confXXXXXX") 2>/dev/null` && test -n "$tmp" && test -d "$tmp" } || { tmp=./conf$$-$RANDOM (umask 077 && mkdir "$tmp") } || as_fn_error $? "cannot create a temporary directory in ." "$LINENO" 5 # Set up the scripts for CONFIG_FILES section. # No need to generate them if there are no CONFIG_FILES. # This happens for instance with `./config.status config.h'. if test -n "$CONFIG_FILES"; then ac_cr=`echo X | tr X '\015'` # On cygwin, bash can eat \r inside `` if the user requested igncr. # But we know of no other shell where ac_cr would be empty at this # point, so we can use a bashism as a fallback. 
if test "x$ac_cr" = x; then eval ac_cr=\$\'\\r\' fi ac_cs_awk_cr=`$AWK 'BEGIN { print "a\rb" }' </dev/null 2>/dev/null` if test "$ac_cs_awk_cr" = "a${ac_cr}b"; then ac_cs_awk_cr='\\r' else ac_cs_awk_cr=$ac_cr fi echo 'BEGIN {' >"$tmp/subs1.awk" && _ACEOF { echo "cat >conf$$subs.awk <<_ACEOF" && echo "$ac_subst_vars" | sed 's/.*/&!$&$ac_delim/' && echo "_ACEOF" } >conf$$subs.sh || as_fn_error $? "could not make $CONFIG_STATUS" "$LINENO" 5 ac_delim_num=`echo "$ac_subst_vars" | grep -c '^'` ac_delim='%!_!# ' for ac_last_try in false false false false false :; do . ./conf$$subs.sh || as_fn_error $? "could not make $CONFIG_STATUS" "$LINENO" 5 ac_delim_n=`sed -n "s/.*$ac_delim\$/X/p" conf$$subs.awk | grep -c X` if test $ac_delim_n = $ac_delim_num; then break elif $ac_last_try; then as_fn_error $? "could not make $CONFIG_STATUS" "$LINENO" 5 else ac_delim="$ac_delim!$ac_delim _$ac_delim!! " fi done rm -f conf$$subs.sh cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 cat >>"\$tmp/subs1.awk" <<\\_ACAWK && _ACEOF sed -n ' h s/^/S["/; s/!.*/"]=/ p g s/^[^!]*!// :repl t repl s/'"$ac_delim"'$// t delim :nl h s/\(.\{148\}\)..*/\1/ t more1 s/["\\]/\\&/g; s/^/"/; s/$/\\n"\\/ p n b repl :more1 s/["\\]/\\&/g; s/^/"/; s/$/"\\/ p g s/.\{148\}// t nl :delim h s/\(.\{148\}\)..*/\1/ t more2 s/["\\]/\\&/g; s/^/"/; s/$/"/ p b :more2 s/["\\]/\\&/g; s/^/"/; s/$/"\\/ p g s/.\{148\}// t delim ' <conf$$subs.awk | sed ' /^[^""]/{ N s/\n// } ' >>$CONFIG_STATUS || ac_write_fail=1 rm -f conf$$subs.awk cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 _ACAWK cat >>"\$tmp/subs1.awk" <<_ACAWK && for (key in S) S_is_set[key] = 1 FS = "" } { line = $ 0 nfields = split(line, field, "@") substed = 0 len = length(field[1]) for (i = 2; i < nfields; i++) { key = field[i] keylen = length(key) if (S_is_set[key]) { value = S[key] line = substr(line, 1, len) "" value "" substr(line, len + keylen + 3) len += length(value) + length(field[++i]) substed = 1 } else len += 1 + keylen } print line } _ACAWK _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 if sed
"s/$ac_cr//" < /dev/null > /dev/null 2>&1; then sed "s/$ac_cr\$//; s/$ac_cr/$ac_cs_awk_cr/g" else cat fi < "$tmp/subs1.awk" > "$tmp/subs.awk" \ || as_fn_error $? "could not setup config files machinery" "$LINENO" 5 _ACEOF # VPATH may cause trouble with some makes, so we remove sole $(srcdir), # ${srcdir} and @srcdir@ entries from VPATH if srcdir is ".", strip leading and # trailing colons and then remove the whole line if VPATH becomes empty # (actually we leave an empty line to preserve line numbers). if test "x$srcdir" = x.; then ac_vpsub='/^[ ]*VPATH[ ]*=[ ]*/{ h s/// s/^/:/ s/[ ]*$/:/ s/:\$(srcdir):/:/g s/:\${srcdir}:/:/g s/:@srcdir@:/:/g s/^:*// s/:*$// x s/\(=[ ]*\).*/\1/ G s/\n// s/^[^=]*=[ ]*$// }' fi cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 fi # test -n "$CONFIG_FILES" eval set X " :F $CONFIG_FILES " shift for ac_tag do case $ac_tag in :[FHLC]) ac_mode=$ac_tag; continue;; esac case $ac_mode$ac_tag in :[FHL]*:*);; :L* | :C*:*) as_fn_error $? "invalid tag \`$ac_tag'" "$LINENO" 5 ;; :[FH]-) ac_tag=-:-;; :[FH]*) ac_tag=$ac_tag:$ac_tag.in;; esac ac_save_IFS=$IFS IFS=: set x $ac_tag IFS=$ac_save_IFS shift ac_file=$1 shift case $ac_mode in :L) ac_source=$1;; :[FH]) ac_file_inputs= for ac_f do case $ac_f in -) ac_f="$tmp/stdin";; *) # Look for the file first in the build tree, then in the source tree # (if the path is not absolute). The absolute path cannot be DOS-style, # because $ac_f cannot contain `:'. test -f "$ac_f" || case $ac_f in [\\/$]*) false;; *) test -f "$srcdir/$ac_f" && ac_f="$srcdir/$ac_f";; esac || as_fn_error 1 "cannot find input file: \`$ac_f'" "$LINENO" 5 ;; esac case $ac_f in *\'*) ac_f=`$as_echo "$ac_f" | sed "s/'/'\\\\\\\\''/g"`;; esac as_fn_append ac_file_inputs " '$ac_f'" done # Let's still pretend it is `configure' which instantiates (i.e., don't # use $as_me), people would be surprised to read: # /* config.h. Generated by config.status. 
*/ configure_input='Generated from '` $as_echo "$*" | sed 's|^[^:]*/||;s|:[^:]*/|, |g' `' by configure.' if test x"$ac_file" != x-; then configure_input="$ac_file. $configure_input" { $as_echo "$as_me:${as_lineno-$LINENO}: creating $ac_file" >&5 $as_echo "$as_me: creating $ac_file" >&6;} fi # Neutralize special characters interpreted by sed in replacement strings. case $configure_input in #( *\&* | *\|* | *\\* ) ac_sed_conf_input=`$as_echo "$configure_input" | sed 's/[\\\\&|]/\\\\&/g'`;; #( *) ac_sed_conf_input=$configure_input;; esac case $ac_tag in *:-:* | *:-) cat >"$tmp/stdin" \ || as_fn_error $? "could not create $ac_file" "$LINENO" 5 ;; esac ;; esac ac_dir=`$as_dirname -- "$ac_file" || $as_expr X"$ac_file" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$ac_file" : 'X\(//\)[^/]' \| \ X"$ac_file" : 'X\(//\)$' \| \ X"$ac_file" : 'X\(/\)' \| . 2>/dev/null || $as_echo X"$ac_file" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/ q } /^X\(\/\/\)[^/].*/{ s//\1/ q } /^X\(\/\/\)$/{ s//\1/ q } /^X\(\/\).*/{ s//\1/ q } s/.*/./; q'` as_dir="$ac_dir"; as_fn_mkdir_p ac_builddir=. case "$ac_dir" in .) ac_dir_suffix= ac_top_builddir_sub=. ac_top_build_prefix= ;; *) ac_dir_suffix=/`$as_echo "$ac_dir" | sed 's|^\.[\\/]||'` # A ".." for each directory in $ac_dir_suffix. ac_top_builddir_sub=`$as_echo "$ac_dir_suffix" | sed 's|/[^\\/]*|/..|g;s|/||'` case $ac_top_builddir_sub in "") ac_top_builddir_sub=. ac_top_build_prefix= ;; *) ac_top_build_prefix=$ac_top_builddir_sub/ ;; esac ;; esac ac_abs_top_builddir=$ac_pwd ac_abs_builddir=$ac_pwd$ac_dir_suffix # for backward compatibility: ac_top_builddir=$ac_top_build_prefix case $srcdir in .) # We are building in place. ac_srcdir=. ac_top_srcdir=$ac_top_builddir_sub ac_abs_top_srcdir=$ac_pwd ;; [\\/]* | ?:[\\/]* ) # Absolute name. ac_srcdir=$srcdir$ac_dir_suffix; ac_top_srcdir=$srcdir ac_abs_top_srcdir=$srcdir ;; *) # Relative name. 
ac_srcdir=$ac_top_build_prefix$srcdir$ac_dir_suffix ac_top_srcdir=$ac_top_build_prefix$srcdir ac_abs_top_srcdir=$ac_pwd/$srcdir ;; esac ac_abs_srcdir=$ac_abs_top_srcdir$ac_dir_suffix case $ac_mode in :F) # # CONFIG_FILE # _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 # If the template does not know about datarootdir, expand it. # FIXME: This hack should be removed a few years after 2.60. ac_datarootdir_hack=; ac_datarootdir_seen= ac_sed_dataroot=' /datarootdir/ { p q } /@datadir@/p /@docdir@/p /@infodir@/p /@localedir@/p /@mandir@/p' case `eval "sed -n \"\$ac_sed_dataroot\" $ac_file_inputs"` in *datarootdir*) ac_datarootdir_seen=yes;; *@datadir@*|*@docdir@*|*@infodir@*|*@localedir@*|*@mandir@*) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $ac_file_inputs seems to ignore the --datarootdir setting" >&5 $as_echo "$as_me: WARNING: $ac_file_inputs seems to ignore the --datarootdir setting" >&2;} _ACEOF cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 ac_datarootdir_hack=' s&@datadir@&$datadir&g s&@docdir@&$docdir&g s&@infodir@&$infodir&g s&@localedir@&$localedir&g s&@mandir@&$mandir&g s&\\\${datarootdir}&$datarootdir&g' ;; esac _ACEOF # Neutralize VPATH when `$srcdir' = `.'. # Shell code in configure.ac might set extrasub. # FIXME: do we really want to maintain this feature? cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 ac_sed_extra="$ac_vpsub $extrasub _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 :t /@[a-zA-Z_][a-zA-Z_0-9]*@/!b s|@configure_input@|$ac_sed_conf_input|;t t s&@top_builddir@&$ac_top_builddir_sub&;t t s&@top_build_prefix@&$ac_top_build_prefix&;t t s&@srcdir@&$ac_srcdir&;t t s&@abs_srcdir@&$ac_abs_srcdir&;t t s&@top_srcdir@&$ac_top_srcdir&;t t s&@abs_top_srcdir@&$ac_abs_top_srcdir&;t t s&@builddir@&$ac_builddir&;t t s&@abs_builddir@&$ac_abs_builddir&;t t s&@abs_top_builddir@&$ac_abs_top_builddir&;t t $ac_datarootdir_hack " eval sed \"\$ac_sed_extra\" "$ac_file_inputs" | $AWK -f "$tmp/subs.awk" >$tmp/out \ || as_fn_error $? 
"could not create $ac_file" "$LINENO" 5 test -z "$ac_datarootdir_hack$ac_datarootdir_seen" && { ac_out=`sed -n '/\${datarootdir}/p' "$tmp/out"`; test -n "$ac_out"; } && { ac_out=`sed -n '/^[ ]*datarootdir[ ]*:*=/p' "$tmp/out"`; test -z "$ac_out"; } && { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $ac_file contains a reference to the variable \`datarootdir' which seems to be undefined. Please make sure it is defined" >&5 $as_echo "$as_me: WARNING: $ac_file contains a reference to the variable \`datarootdir' which seems to be undefined. Please make sure it is defined" >&2;} rm -f "$tmp/stdin" case $ac_file in -) cat "$tmp/out" && rm -f "$tmp/out";; *) rm -f "$ac_file" && mv "$tmp/out" "$ac_file";; esac \ || as_fn_error $? "could not create $ac_file" "$LINENO" 5 ;; esac done # for ac_tag as_fn_exit 0 _ACEOF ac_clean_files=$ac_clean_files_save test $ac_write_fail = 0 || as_fn_error $? "write failure creating $CONFIG_STATUS" "$LINENO" 5 # configure is writing to config.log, and then calls config.status. # config.status does its own redirection, appending to config.log. # Unfortunately, on DOS this fails, as config.log is still kept open # by configure, so config.status won't be able to write to it; its # output is simply discarded. So we exec the FD to /dev/null, # effectively closing config.log, so it can be properly (re)opened and # appended to by config.status. When coming back to configure, we # need to make the FD available again. if test "$no_create" != yes; then ac_cs_success=: ac_config_status_args= test "$silent" = yes && ac_config_status_args="$ac_config_status_args --quiet" exec 5>/dev/null $SHELL $CONFIG_STATUS $ac_config_status_args || ac_cs_success=false exec 5>>config.log # Use ||, not &&, to avoid exiting from the if with $? = 1, which # would make configure fail if this is the last instruction. 
$ac_cs_success || as_fn_exit 1 fi if test -n "$ac_unrecognized_opts" && test "$enable_option_checking" != no; then { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: unrecognized options: $ac_unrecognized_opts" >&5 $as_echo "$as_me: WARNING: unrecognized options: $ac_unrecognized_opts" >&2;} fi papi-papi-7-2-0-t/src/components/vmware/configure.in000066400000000000000000000010021502707512200224270ustar00rootroot00000000000000AC_INIT AC_ARG_WITH(vmware_incdir, [--with-vmware_incdir=<incdir> Specify path to VMware GuestSDK includes], [VMWARE_INCDIR=$withval CFLAGS="$CFLAGS -I$withval" AC_CHECK_HEADER([vmGuestLib.h], [VMGUESTLIB=1], [AC_MSG_WARN([vmGuestLib.h not found])], )], [AC_MSG_WARN([Component requires path to vmware includes])]) AC_SUBST(VMWARE_INCDIR) AC_SUBST(VMGUESTLIB) AC_CONFIG_FILES([Makefile.vmware]) AC_OUTPUT papi-papi-7-2-0-t/src/components/vmware/tests/000077500000000000000000000000001502707512200212575ustar00rootroot00000000000000papi-papi-7-2-0-t/src/components/vmware/tests/Makefile000066400000000000000000000005011502707512200227130ustar00rootroot00000000000000NAME=vmware include ../../Makefile_comp_tests %.o:%.c $(CC) $(CFLAGS) $(INCLUDE) -c -o $@ $< TESTS = vmware_basic vmware_tests: $(TESTS) vmware_basic: vmware_basic.o $(UTILOBJS) $(PAPILIB) $(CC) $(CFLAGS) $(INCLUDE) -o vmware_basic vmware_basic.o $(UTILOBJS) $(PAPILIB) $(LDFLAGS) clean: rm -f $(TESTS) *.o papi-papi-7-2-0-t/src/components/vmware/tests/vmware_basic.c000066400000000000000000000067411502707512200240750ustar00rootroot00000000000000/** * @author Vince Weaver * * test case for vmware component * * * @brief * Tests basic vmware functionality */ #include <stdio.h> #include <string.h> #include <unistd.h> #include "papi.h" #include "papi_test.h" #define NUM_EVENTS 1 int main (int argc, char **argv) { int retval,cid,numcmp; int EventSet = PAPI_NULL; long long values[NUM_EVENTS]; int code; char event_name[PAPI_MAX_STR_LEN]; int total_events=0; int r; const PAPI_component_info_t *cmpinfo = NULL; /* Set TESTS_QUIET variable */ 
tests_quiet( argc, argv ); /* PAPI Initialization */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__,"PAPI_library_init failed\n",retval); } if (!TESTS_QUIET) { printf("Trying all vmware events\n"); } /* Find our Component */ numcmp = PAPI_num_components(); for(cid=0; cid<numcmp; cid++) { cmpinfo = PAPI_get_component_info(cid); if (cmpinfo == NULL) { test_fail(__FILE__, __LINE__, "PAPI_get_component_info failed\n", 0); } if (strstr(cmpinfo->name,"vmware")) { if (!TESTS_QUIET) printf("\tFound vmware component %d - %s\n", cid, cmpinfo->name); } else { continue; } PAPI_event_info_t info; /* Try all events one by one */ code = PAPI_NATIVE_MASK; r = PAPI_enum_cmp_event( &code, PAPI_ENUM_FIRST, cid ); while ( r == PAPI_OK ) { retval=PAPI_get_event_info(code,&info); if (retval!=PAPI_OK) { printf("Error getting event info\n"); test_fail( __FILE__, __LINE__, "PAPI_get_event_info", retval ); } retval = PAPI_event_code_to_name( code, event_name ); if ( retval != PAPI_OK ) { printf("Error translating %#x\n",code); test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } if (!TESTS_QUIET) printf(" %s ",event_name); EventSet = PAPI_NULL; retval = PAPI_create_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset()",retval); } retval = PAPI_add_event( EventSet, code ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_add_event()",retval); } /* start */ retval = PAPI_start( EventSet); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_start()",retval); } /* do something */ usleep(100); /* stop */ retval = PAPI_stop( EventSet, values); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_stop()",retval); } if (!TESTS_QUIET) printf(" value: %lld %s\n",values[0], info.units); retval = PAPI_cleanup_eventset( EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_cleanup_eventset()",retval); } retval = PAPI_destroy_eventset( &EventSet ); if (retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "PAPI_destroy_eventset()",retval); } total_events++; r = PAPI_enum_cmp_event( &code, 
PAPI_ENUM_EVENTS, cid ); } } if (total_events==0) { test_skip(__FILE__,__LINE__,"No vmware events found",0); } if (!TESTS_QUIET) { printf("\n"); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/components/vmware/vmware.c000066400000000000000000001167271502707512200216000ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file vmware.c * @author Matt Johnson * mrj@eecs.utk.edu * @author John Nelson * jnelso37@eecs.utk.edu * @author Vince Weaver * vweaver1@eecs.utk.edu * * @ingroup papi_components * * VMware component * * @brief * This is the VMware component for PAPI-V. It will allow user access to * hardware information available from a VMware virtual machine. */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <dlfcn.h> #include <stdint.h> /* Headers required by PAPI */ #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" #define VMWARE_MAX_COUNTERS 256 #define VMWARE_CPU_LIMIT_MHZ 0 #define VMWARE_CPU_RESERVATION_MHZ 1 #define VMWARE_CPU_SHARES 2 #define VMWARE_CPU_STOLEN_MS 3 #define VMWARE_CPU_USED_MS 4 #define VMWARE_ELAPSED_MS 5 #define VMWARE_MEM_ACTIVE_MB 6 #define VMWARE_MEM_BALLOONED_MB 7 #define VMWARE_MEM_LIMIT_MB 8 #define VMWARE_MEM_MAPPED_MB 9 #define VMWARE_MEM_OVERHEAD_MB 10 #define VMWARE_MEM_RESERVATION_MB 11 #define VMWARE_MEM_SHARED_MB 12 #define VMWARE_MEM_SHARES 13 #define VMWARE_MEM_SWAPPED_MB 14 #define VMWARE_MEM_TARGET_SIZE_MB 15 #define VMWARE_MEM_USED_MB 16 #define VMWARE_HOST_CPU_MHZ 17 /* The following 3 require VMWARE_PSEUDO_PERFORMANCE env_var to be set. 
*/ #define VMWARE_HOST_TSC 18 #define VMWARE_ELAPSED_TIME 19 #define VMWARE_ELAPSED_APPARENT 20 /* Begin PAPI definitions */ papi_vector_t _vmware_vector; void (*_dl_non_dynamic_init)(void) __attribute__((weak)); /** Structure that stores private information for each event */ struct _vmware_register { unsigned int selector; /**< Signifies which counter slot is being used */ /**< Indexed from 1 as 0 has a special meaning */ }; /** This structure is used to build the table of events */ struct _vmware_native_event_entry { char name[PAPI_MAX_STR_LEN]; /**< Name of the counter */ char description[PAPI_HUGE_STR_LEN]; /**< Description of counter */ char units[PAPI_MIN_STR_LEN]; int which_counter; int report_difference; }; struct _vmware_reg_alloc { struct _vmware_register ra_bits; }; inline uint64_t rdpmc(int c) { uint32_t low, high; __asm__ __volatile__("rdpmc" : "=a" (low), "=d" (high) : "c" (c)); return (uint64_t)high << 32 | (uint64_t)low; } #ifdef VMGUESTLIB /* Headers required by VMware */ #include "vmGuestLib.h" /* Functions to dynamically load from the GuestLib library. 
*/ char const * (*GuestLib_GetErrorText)(VMGuestLibError); VMGuestLibError (*GuestLib_OpenHandle)(VMGuestLibHandle*); VMGuestLibError (*GuestLib_CloseHandle)(VMGuestLibHandle); VMGuestLibError (*GuestLib_UpdateInfo)(VMGuestLibHandle handle); VMGuestLibError (*GuestLib_GetSessionId)(VMGuestLibHandle handle, VMSessionId *id); VMGuestLibError (*GuestLib_GetCpuReservationMHz)(VMGuestLibHandle handle, uint32 *cpuReservationMHz); VMGuestLibError (*GuestLib_GetCpuLimitMHz)(VMGuestLibHandle handle, uint32 *cpuLimitMHz); VMGuestLibError (*GuestLib_GetCpuShares)(VMGuestLibHandle handle, uint32 *cpuShares); VMGuestLibError (*GuestLib_GetCpuUsedMs)(VMGuestLibHandle handle, uint64 *cpuUsedMs); VMGuestLibError (*GuestLib_GetHostProcessorSpeed)(VMGuestLibHandle handle, uint32 *mhz); VMGuestLibError (*GuestLib_GetMemReservationMB)(VMGuestLibHandle handle, uint32 *memReservationMB); VMGuestLibError (*GuestLib_GetMemLimitMB)(VMGuestLibHandle handle, uint32 *memLimitMB); VMGuestLibError (*GuestLib_GetMemShares)(VMGuestLibHandle handle, uint32 *memShares); VMGuestLibError (*GuestLib_GetMemMappedMB)(VMGuestLibHandle handle, uint32 *memMappedMB); VMGuestLibError (*GuestLib_GetMemActiveMB)(VMGuestLibHandle handle, uint32 *memActiveMB); VMGuestLibError (*GuestLib_GetMemOverheadMB)(VMGuestLibHandle handle, uint32 *memOverheadMB); VMGuestLibError (*GuestLib_GetMemBalloonedMB)(VMGuestLibHandle handle, uint32 *memBalloonedMB); VMGuestLibError (*GuestLib_GetMemSwappedMB)(VMGuestLibHandle handle, uint32 *memSwappedMB); VMGuestLibError (*GuestLib_GetMemSharedMB)(VMGuestLibHandle handle, uint32 *memSharedMB); VMGuestLibError (*GuestLib_GetMemSharedSavedMB)(VMGuestLibHandle handle, uint32 *memSharedSavedMB); VMGuestLibError (*GuestLib_GetMemUsedMB)(VMGuestLibHandle handle, uint32 *memUsedMB); VMGuestLibError (*GuestLib_GetElapsedMs)(VMGuestLibHandle handle, uint64 *elapsedMs); VMGuestLibError (*GuestLib_GetResourcePoolPath)(VMGuestLibHandle handle, size_t *bufferSize, char *pathBuffer); 
VMGuestLibError (*GuestLib_GetCpuStolenMs)(VMGuestLibHandle handle, uint64 *cpuStolenMs); VMGuestLibError (*GuestLib_GetMemTargetSizeMB)(VMGuestLibHandle handle, uint64 *memTargetSizeMB); VMGuestLibError (*GuestLib_GetHostNumCpuCores)(VMGuestLibHandle handle, uint32 *hostNumCpuCores); VMGuestLibError (*GuestLib_GetHostCpuUsedMs)(VMGuestLibHandle handle, uint64 *hostCpuUsedMs); VMGuestLibError (*GuestLib_GetHostMemSwappedMB)(VMGuestLibHandle handle, uint64 *hostMemSwappedMB); VMGuestLibError (*GuestLib_GetHostMemSharedMB)(VMGuestLibHandle handle, uint64 *hostMemSharedMB); VMGuestLibError (*GuestLib_GetHostMemUsedMB)(VMGuestLibHandle handle, uint64 *hostMemUsedMB); VMGuestLibError (*GuestLib_GetHostMemPhysMB)(VMGuestLibHandle handle, uint64 *hostMemPhysMB); VMGuestLibError (*GuestLib_GetHostMemPhysFreeMB)(VMGuestLibHandle handle, uint64 *hostMemPhysFreeMB); VMGuestLibError (*GuestLib_GetHostMemKernOvhdMB)(VMGuestLibHandle handle, uint64 *hostMemKernOvhdMB); VMGuestLibError (*GuestLib_GetHostMemMappedMB)(VMGuestLibHandle handle, uint64 *hostMemMappedMB); VMGuestLibError (*GuestLib_GetHostMemUnmappedMB)(VMGuestLibHandle handle, uint64 *hostMemUnmappedMB); static void *dlHandle = NULL; /* * Macro to load a single GuestLib function from the shared library. 
*/ #define LOAD_ONE_FUNC(funcname) \ do { \ funcname = dlsym(dlHandle, "VM" #funcname); \ if ((dlErrStr = dlerror()) != NULL) { \ fprintf(stderr, "Failed to load \'%s\': \'%s\'\n", \ #funcname, dlErrStr); \ return FALSE; \ } \ } while (0) #endif /** Holds control flags, usually out-of band configuration of the hardware */ struct _vmware_control_state { long long value[VMWARE_MAX_COUNTERS]; int which_counter[VMWARE_MAX_COUNTERS]; int num_events; }; /** Holds per-thread information */ struct _vmware_context { long long values[VMWARE_MAX_COUNTERS]; long long start_values[VMWARE_MAX_COUNTERS]; #ifdef VMGUESTLIB VMGuestLibHandle glHandle; #endif }; /* *----------------------------------------------------------------------------- * * LoadFunctions -- * * Load the functions from the shared library. * * Results: * TRUE on success * FALSE on failure * * Side effects: * None * * Credit: VMware *----------------------------------------------------------------------------- */ static int LoadFunctions(void) { #ifdef VMGUESTLIB /* * First, try to load the shared library. 
*/ /* Attempt to guess if we were statically linked to libc, if so bail */ if ( _dl_non_dynamic_init != NULL ) { strncpy(_vmware_vector.cmp_info.disabled_reason, "The VMware component does not support statically linking of libc.", PAPI_MAX_STR_LEN); return PAPI_ENOSUPP; } char const *dlErrStr; char filename[BUFSIZ]; sprintf(filename,"%s","libvmGuestLib.so"); dlHandle = dlopen(filename, RTLD_NOW); if (!dlHandle) { dlErrStr = dlerror(); fprintf(stderr, "dlopen of %s failed: \'%s\'\n", filename, dlErrStr); sprintf(filename,"%s/lib/lib64/libvmGuestLib.so",VMWARE_INCDIR); dlHandle = dlopen(filename, RTLD_NOW); if (!dlHandle) { dlErrStr = dlerror(); fprintf(stderr, "dlopen of %s failed: \'%s\'\n", filename, dlErrStr); sprintf(filename,"%s/lib/lib32/libvmGuestLib.so",VMWARE_INCDIR); dlHandle = dlopen(filename, RTLD_NOW); if (!dlHandle) { dlErrStr = dlerror(); fprintf(stderr, "dlopen of %s failed: \'%s\'\n", filename, dlErrStr); return PAPI_ECMP; } } } /* Load all the individual library functions. 
*/ LOAD_ONE_FUNC(GuestLib_GetErrorText); LOAD_ONE_FUNC(GuestLib_OpenHandle); LOAD_ONE_FUNC(GuestLib_CloseHandle); LOAD_ONE_FUNC(GuestLib_UpdateInfo); LOAD_ONE_FUNC(GuestLib_GetSessionId); LOAD_ONE_FUNC(GuestLib_GetCpuReservationMHz); LOAD_ONE_FUNC(GuestLib_GetCpuLimitMHz); LOAD_ONE_FUNC(GuestLib_GetCpuShares); LOAD_ONE_FUNC(GuestLib_GetCpuUsedMs); LOAD_ONE_FUNC(GuestLib_GetHostProcessorSpeed); LOAD_ONE_FUNC(GuestLib_GetMemReservationMB); LOAD_ONE_FUNC(GuestLib_GetMemLimitMB); LOAD_ONE_FUNC(GuestLib_GetMemShares); LOAD_ONE_FUNC(GuestLib_GetMemMappedMB); LOAD_ONE_FUNC(GuestLib_GetMemActiveMB); LOAD_ONE_FUNC(GuestLib_GetMemOverheadMB); LOAD_ONE_FUNC(GuestLib_GetMemBalloonedMB); LOAD_ONE_FUNC(GuestLib_GetMemSwappedMB); LOAD_ONE_FUNC(GuestLib_GetMemSharedMB); LOAD_ONE_FUNC(GuestLib_GetMemSharedSavedMB); LOAD_ONE_FUNC(GuestLib_GetMemUsedMB); LOAD_ONE_FUNC(GuestLib_GetElapsedMs); LOAD_ONE_FUNC(GuestLib_GetResourcePoolPath); LOAD_ONE_FUNC(GuestLib_GetCpuStolenMs); LOAD_ONE_FUNC(GuestLib_GetMemTargetSizeMB); LOAD_ONE_FUNC(GuestLib_GetHostNumCpuCores); LOAD_ONE_FUNC(GuestLib_GetHostCpuUsedMs); LOAD_ONE_FUNC(GuestLib_GetHostMemSwappedMB); LOAD_ONE_FUNC(GuestLib_GetHostMemSharedMB); LOAD_ONE_FUNC(GuestLib_GetHostMemUsedMB); LOAD_ONE_FUNC(GuestLib_GetHostMemPhysMB); LOAD_ONE_FUNC(GuestLib_GetHostMemPhysFreeMB); LOAD_ONE_FUNC(GuestLib_GetHostMemKernOvhdMB); LOAD_ONE_FUNC(GuestLib_GetHostMemMappedMB); LOAD_ONE_FUNC(GuestLib_GetHostMemUnmappedMB); #endif return PAPI_OK; } /** This table contains the native events */ static struct _vmware_native_event_entry *_vmware_native_table; /** number of events in the table*/ static int num_events = 0; static int use_pseudo=0; static int use_guestlib=0; /************************************************************************/ /* Below is the actual "hardware implementation" of our VMWARE counters */ /************************************************************************/ /** Code that reads event values. 
You might replace this with code that accesses hardware or reads values from the operatings system. */ static long long _vmware_hardware_read( struct _vmware_context *context, int starting) { int i; if (use_pseudo) { context->values[VMWARE_HOST_TSC]=rdpmc(0x10000); context->values[VMWARE_ELAPSED_TIME]=rdpmc(0x10001); context->values[VMWARE_ELAPSED_APPARENT]=rdpmc(0x10002); } #ifdef VMGUESTLIB static VMSessionId sessionId = 0; VMSessionId tmpSession; uint32_t temp32; uint64_t temp64; VMGuestLibError glError; if (use_guestlib) { glError = GuestLib_UpdateInfo(context->glHandle); if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr,"UpdateInfo failed: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } /* Retrieve and check the session ID */ glError = GuestLib_GetSessionId(context->glHandle, &tmpSession); if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr, "Failed to get session ID: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } if (tmpSession == 0) { fprintf(stderr, "Error: Got zero sessionId from GuestLib\n"); return PAPI_ECMP; } if (sessionId == 0) { sessionId = tmpSession; } else if (tmpSession != sessionId) { sessionId = tmpSession; } glError = GuestLib_GetCpuLimitMHz(context->glHandle,&temp32); context->values[VMWARE_CPU_LIMIT_MHZ]=temp32; if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr,"Failed to get CPU limit: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } glError = GuestLib_GetCpuReservationMHz(context->glHandle,&temp32); context->values[VMWARE_CPU_RESERVATION_MHZ]=temp32; if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr,"Failed to get CPU reservation: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } glError = GuestLib_GetCpuShares(context->glHandle,&temp32); context->values[VMWARE_CPU_SHARES]=temp32; if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr,"Failed to get cpu shares: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } glError = 
GuestLib_GetCpuStolenMs(context->glHandle,&temp64); context->values[VMWARE_CPU_STOLEN_MS]=temp64; if (glError != VMGUESTLIB_ERROR_SUCCESS) { if (glError == VMGUESTLIB_ERROR_UNSUPPORTED_VERSION) { context->values[VMWARE_CPU_STOLEN_MS]=0; fprintf(stderr, "Skipping CPU stolen, not supported...\n"); } else { fprintf(stderr, "Failed to get CPU stolen: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } } glError = GuestLib_GetCpuUsedMs(context->glHandle,&temp64); context->values[VMWARE_CPU_USED_MS]=temp64; if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr, "Failed to get used ms: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } glError = GuestLib_GetElapsedMs(context->glHandle, &temp64); context->values[VMWARE_ELAPSED_MS]=temp64; if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr, "Failed to get elapsed ms: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } glError = GuestLib_GetMemActiveMB(context->glHandle, &temp32); context->values[VMWARE_MEM_ACTIVE_MB]=temp32; if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr, "Failed to get active mem: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } glError = GuestLib_GetMemBalloonedMB(context->glHandle, &temp32); context->values[VMWARE_MEM_BALLOONED_MB]=temp32; if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr, "Failed to get ballooned mem: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } glError = GuestLib_GetMemLimitMB(context->glHandle, &temp32); context->values[VMWARE_MEM_LIMIT_MB]=temp32; if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr,"Failed to get mem limit: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } glError = GuestLib_GetMemMappedMB(context->glHandle, &temp32); context->values[VMWARE_MEM_MAPPED_MB]=temp32; if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr, "Failed to get mapped mem: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } glError = GuestLib_GetMemOverheadMB(context->glHandle, &temp32); 
context->values[VMWARE_MEM_OVERHEAD_MB]=temp32; if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr, "Failed to get overhead mem: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } glError = GuestLib_GetMemReservationMB(context->glHandle, &temp32); context->values[VMWARE_MEM_RESERVATION_MB]=temp32; if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr, "Failed to get mem reservation: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } glError = GuestLib_GetMemSharedMB(context->glHandle, &temp32); context->values[VMWARE_MEM_SHARED_MB]=temp32; if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr, "Failed to get swapped mem: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } glError = GuestLib_GetMemShares(context->glHandle, &temp32); context->values[VMWARE_MEM_SHARES]=temp32; if (glError != VMGUESTLIB_ERROR_SUCCESS) { if (glError == VMGUESTLIB_ERROR_NOT_AVAILABLE) { context->values[VMWARE_MEM_SHARES]=0; fprintf(stderr, "Skipping mem shares, not supported...\n"); } else { fprintf(stderr, "Failed to get mem shares: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } } glError = GuestLib_GetMemSwappedMB(context->glHandle, &temp32); context->values[VMWARE_MEM_SWAPPED_MB]=temp32; if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr, "Failed to get swapped mem: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } glError = GuestLib_GetMemTargetSizeMB(context->glHandle, &temp64); context->values[VMWARE_MEM_TARGET_SIZE_MB]=temp64; if (glError != VMGUESTLIB_ERROR_SUCCESS) { if (glError == VMGUESTLIB_ERROR_UNSUPPORTED_VERSION) { context->values[VMWARE_MEM_TARGET_SIZE_MB]=0; fprintf(stderr, "Skipping target mem size, not supported...\n"); } else { fprintf(stderr, "Failed to get target mem size: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } } glError = GuestLib_GetMemUsedMB(context->glHandle, &temp32); context->values[VMWARE_MEM_USED_MB]=temp32; if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr, 
"Failed to get swapped mem: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } glError = GuestLib_GetHostProcessorSpeed(context->glHandle, &temp32); context->values[VMWARE_HOST_CPU_MHZ]=temp32; if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr, "Failed to get host proc speed: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } } #endif if (starting) { for(i=0;istart_values[i]=context->values[i]; } } return PAPI_OK; } /********************************************************************/ /* Below are the functions required by the PAPI component interface */ /********************************************************************/ /** This is called whenever a thread is initialized */ int _vmware_init_thread( hwd_context_t *ctx ) { (void) ctx; #ifdef VMGUESTLIB struct _vmware_context *context; VMGuestLibError glError; context=(struct _vmware_context *)ctx; if (use_guestlib) { glError = GuestLib_OpenHandle(&(context->glHandle)); if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr,"OpenHandle failed: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } } #endif return PAPI_OK; } /** Initialize hardware counters, setup the function vector table * and get hardware information, this routine is called when the * PAPI process is initialized (IE PAPI_library_init) */ int _vmware_init_component( int cidx ) { int retval = PAPI_OK; (void) cidx; int result; SUBDBG( "_vmware_init_component..." ); /* Initialize and try to load the VMware library */ /* Try to load the library. 
*/ result=LoadFunctions(); if (result!=PAPI_OK) { strncpy(_vmware_vector.cmp_info.disabled_reason, "GuestLibTest: Failed to load shared library", PAPI_MAX_STR_LEN); retval = PAPI_ECMP; goto fn_fail; } /* we know in advance how many events we want */ /* for actual hardware this might have to be determined dynamically */ /* Allocate memory for the our event table */ _vmware_native_table = ( struct _vmware_native_event_entry * ) calloc( VMWARE_MAX_COUNTERS, sizeof ( struct _vmware_native_event_entry )); if ( _vmware_native_table == NULL ) { retval = PAPI_ENOMEM; goto fn_fail; } #ifdef VMGUESTLIB /* Detect if GuestLib works */ { VMGuestLibError glError; VMGuestLibHandle glHandle; use_guestlib=0; /* try to open */ glError = GuestLib_OpenHandle(&glHandle); if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr,"OpenHandle failed: %s\n", GuestLib_GetErrorText(glError)); } else { /* open worked, try to update */ glError = GuestLib_UpdateInfo(glHandle); if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr,"UpdateInfo failed: %s\n", GuestLib_GetErrorText(glError)); } else { /* update worked, things work! 
*/ use_guestlib=1; } /* shut things down */ glError = GuestLib_CloseHandle(glHandle); } } if (use_guestlib) { /* fill in the event table parameters */ strcpy( _vmware_native_table[num_events].name, "CPU_LIMIT" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the upper limit of processor use in MHz " "available to the virtual machine.", PAPI_HUGE_STR_LEN); strcpy( _vmware_native_table[num_events].units,"MHz"); _vmware_native_table[num_events].which_counter= VMWARE_CPU_LIMIT_MHZ; _vmware_native_table[num_events].report_difference=0; num_events++; strcpy( _vmware_native_table[num_events].name, "CPU_RESERVATION" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the minimum processing power in MHz " "reserved for the virtual machine.", PAPI_HUGE_STR_LEN); strcpy( _vmware_native_table[num_events].units,"MHz"); _vmware_native_table[num_events].which_counter= VMWARE_CPU_RESERVATION_MHZ; _vmware_native_table[num_events].report_difference=0; num_events++; strcpy( _vmware_native_table[num_events].name, "CPU_SHARES" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the number of CPU shares allocated " "to the virtual machine.", PAPI_HUGE_STR_LEN); strcpy( _vmware_native_table[num_events].units,"shares"); _vmware_native_table[num_events].which_counter= VMWARE_CPU_SHARES; _vmware_native_table[num_events].report_difference=0; num_events++; strcpy( _vmware_native_table[num_events].name, "CPU_STOLEN" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the number of milliseconds that the " "virtual machine was in a ready state (able to " "transition to a run state), but was not scheduled to run.", PAPI_HUGE_STR_LEN); strcpy( _vmware_native_table[num_events].units,"ms"); _vmware_native_table[num_events].which_counter= VMWARE_CPU_STOLEN_MS; _vmware_native_table[num_events].report_difference=0; num_events++; strcpy( _vmware_native_table[num_events].name, "CPU_USED" ); strncpy( 
_vmware_native_table[num_events].description, "Retrieves the number of milliseconds during which " "the virtual machine has used the CPU. This value " "includes the time used by the guest operating system " "and the time used by virtualization code for tasks for " "this virtual machine. You can combine this value with " "the elapsed time (VMWARE_ELAPSED) to estimate the " "effective virtual machine CPU speed. This value is a " "subset of elapsedMs.", PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"ms"); _vmware_native_table[num_events].which_counter= VMWARE_CPU_USED_MS; _vmware_native_table[num_events].report_difference=1; num_events++; strcpy( _vmware_native_table[num_events].name, "ELAPSED" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the number of milliseconds that have passed " "in the virtual machine since it last started running on " "the server. The count of elapsed time restarts each time " "the virtual machine is powered on, resumed, or migrated " "using VMotion. This value counts milliseconds, regardless " "of whether the virtual machine is using processing power " "during that time. You can combine this value with the CPU " "time used by the virtual machine (VMWARE_CPU_USED) to " "estimate the effective virtual machine xCPU speed. 
" "cpuUsedMS is a subset of this value.", PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"ms"); _vmware_native_table[num_events].which_counter= VMWARE_ELAPSED_MS; _vmware_native_table[num_events].report_difference=1; num_events++; strcpy( _vmware_native_table[num_events].name, "MEM_ACTIVE" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the amount of memory the virtual machine is " "actively using in MB - Its estimated working set size.", PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"MB"); _vmware_native_table[num_events].which_counter= VMWARE_MEM_ACTIVE_MB; _vmware_native_table[num_events].report_difference=0; num_events++; strcpy( _vmware_native_table[num_events].name, "MEM_BALLOONED" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the amount of memory that has been reclaimed " "from this virtual machine by the vSphere memory balloon " "driver (also referred to as the 'vmemctl' driver) in MB.", PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"MB"); _vmware_native_table[num_events].which_counter= VMWARE_MEM_BALLOONED_MB; _vmware_native_table[num_events].report_difference=0; num_events++; strcpy( _vmware_native_table[num_events].name, "MEM_LIMIT" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the upper limit of memory that is available " "to the virtual machine in MB.", PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"MB"); _vmware_native_table[num_events].which_counter= VMWARE_MEM_LIMIT_MB; _vmware_native_table[num_events].report_difference=0; num_events++; strcpy( _vmware_native_table[num_events].name, "MEM_MAPPED" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the amount of memory that is allocated to " "the virtual machine in MB. 
Memory that is ballooned, " "swapped, or has never been accessed is excluded.", PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"MB"); _vmware_native_table[num_events].which_counter= VMWARE_MEM_MAPPED_MB; _vmware_native_table[num_events].report_difference=0; num_events++; strcpy( _vmware_native_table[num_events].name, "MEM_OVERHEAD" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the amount of 'overhead' memory associated " "with this virtual machine that is currently consumed " "on the host system in MB. Overhead memory is additional " "memory that is reserved for data structures required by " "the virtualization layer.", PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"MB"); _vmware_native_table[num_events].which_counter= VMWARE_MEM_OVERHEAD_MB; _vmware_native_table[num_events].report_difference=0; num_events++; strcpy( _vmware_native_table[num_events].name, "MEM_RESERVATION" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the minimum amount of memory that is " "reserved for the virtual machine in MB.", PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"MB"); _vmware_native_table[num_events].which_counter= VMWARE_MEM_RESERVATION_MB; _vmware_native_table[num_events].report_difference=0; num_events++; strcpy( _vmware_native_table[num_events].name, "MEM_SHARED" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the amount of physical memory associated " "with this virtual machine that is copy-on-write (COW) " "shared on the host in MB.", PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"MB"); _vmware_native_table[num_events].which_counter= VMWARE_MEM_SHARED_MB; _vmware_native_table[num_events].report_difference=0; num_events++; strcpy( _vmware_native_table[num_events].name, "MEM_SHARES" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the number of memory shares allocated to " "the virtual machine.", 
PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"shares"); _vmware_native_table[num_events].which_counter= VMWARE_MEM_SHARES; _vmware_native_table[num_events].report_difference=0; num_events++; strcpy( _vmware_native_table[num_events].name, "MEM_SWAPPED" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the amount of memory that has been reclaimed " "from this virtual machine by transparently swapping " "guest memory to disk in MB.", PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"MB"); _vmware_native_table[num_events].which_counter= VMWARE_MEM_SWAPPED_MB; _vmware_native_table[num_events].report_difference=0; num_events++; strcpy( _vmware_native_table[num_events].name, "MEM_TARGET_SIZE" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the size of the target memory allocation " "for this virtual machine in MB.", PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"MB"); _vmware_native_table[num_events].which_counter= VMWARE_MEM_TARGET_SIZE_MB; _vmware_native_table[num_events].report_difference=0; num_events++; strcpy( _vmware_native_table[num_events].name, "MEM_USED" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the estimated amount of physical host memory " "currently consumed for this virtual machine's " "physical memory.", PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"MB"); _vmware_native_table[num_events].which_counter= VMWARE_MEM_USED_MB; _vmware_native_table[num_events].report_difference=0; num_events++; strcpy( _vmware_native_table[num_events].name, "HOST_CPU" ); strncpy( _vmware_native_table[num_events].description, "Retrieves the speed of the ESX system's physical " "CPU in MHz.", PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"MHz"); _vmware_native_table[num_events].which_counter= VMWARE_HOST_CPU_MHZ; _vmware_native_table[num_events].report_difference=0; num_events++; } #endif /* For VMWare 
Pseudo Performance Counters */ if ( getenv( "PAPI_VMWARE_PSEUDOPERFORMANCE" ) ) { use_pseudo=1; strcpy( _vmware_native_table[num_events].name, "HOST_TSC" ); strncpy( _vmware_native_table[num_events].description, "Physical host TSC", PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"cycles"); _vmware_native_table[num_events].which_counter= VMWARE_HOST_TSC; _vmware_native_table[num_events].report_difference=1; num_events++; strcpy( _vmware_native_table[num_events].name, "ELAPSED_TIME" ); strncpy( _vmware_native_table[num_events].description, "Elapsed real time in ns.", PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"ns"); _vmware_native_table[num_events].which_counter= VMWARE_ELAPSED_TIME; _vmware_native_table[num_events].report_difference=1; num_events++; strcpy( _vmware_native_table[num_events].name, "ELAPSED_APPARENT" ); strncpy( _vmware_native_table[num_events].description, "Elapsed apparent time in ns.", PAPI_HUGE_STR_LEN ); strcpy( _vmware_native_table[num_events].units,"ns"); _vmware_native_table[num_events].which_counter= VMWARE_ELAPSED_APPARENT; _vmware_native_table[num_events].report_difference=1; num_events++; } if (num_events==0) { strncpy(_vmware_vector.cmp_info.disabled_reason, "VMware SDK not installed, and PAPI_VMWARE_PSEUDOPERFORMANCE not set", PAPI_MAX_STR_LEN); retval = PAPI_ECMP; goto fn_fail; } _vmware_vector.cmp_info.num_native_events = num_events; fn_exit: _papi_hwd[cidx]->cmp_info.disabled = retval; return retval; fn_fail: goto fn_exit; } /** Setup the counter control structure */ int _vmware_init_control_state( hwd_control_state_t *ctl ) { (void) ctl; return PAPI_OK; } /** Enumerate Native Events @param EventCode is the event of interest @param modifier is one of PAPI_ENUM_FIRST, PAPI_ENUM_EVENTS */ int _vmware_ntv_enum_events( unsigned int *EventCode, int modifier ) { switch ( modifier ) { /* return EventCode of first event */ case PAPI_ENUM_FIRST: if (num_events==0) return PAPI_ENOEVNT; *EventCode = 
0; return PAPI_OK; break; /* return EventCode of passed-in Event */ case PAPI_ENUM_EVENTS:{ int index = *EventCode; if ( index < num_events - 1 ) { *EventCode = *EventCode + 1; return PAPI_OK; } else { return PAPI_ENOEVNT; } break; } default: return PAPI_EINVAL; } return PAPI_EINVAL; } int _vmware_ntv_code_to_info(unsigned int EventCode, PAPI_event_info_t *info) { int index = EventCode; if ( ( index < 0) || (index >= num_events )) return PAPI_ENOEVNT; strncpy( info->symbol, _vmware_native_table[index].name, sizeof(info->symbol)); strncpy( info->long_descr, _vmware_native_table[index].description, sizeof(info->long_descr)); strncpy( info->units, _vmware_native_table[index].units, sizeof(info->units)); return PAPI_OK; } /** Takes a native event code and passes back the name @param EventCode is the native event code @param name is a pointer for the name to be copied to @param len is the size of the string */ int _vmware_ntv_code_to_name( unsigned int EventCode, char *name, int len ) { int index = EventCode; if ( index >= 0 && index < num_events ) { strncpy( name, _vmware_native_table[index].name, len ); } return PAPI_OK; } /** Takes a native event code and passes back the event description @param EventCode is the native event code @param name is a pointer for the description to be copied to @param len is the size of the string */ int _vmware_ntv_code_to_descr( unsigned int EventCode, char *name, int len ) { int index = EventCode; if ( index >= 0 && index < num_events ) { strncpy( name, _vmware_native_table[index].description, len ); } return PAPI_OK; } /** Triggered by eventset operations like add or remove */ int _vmware_update_control_state( hwd_control_state_t *ctl, NativeInfo_t *native, int count, hwd_context_t *ctx ) { (void) ctx; struct _vmware_control_state *control; int i, index; control=(struct _vmware_control_state *)ctl; for ( i = 0; i < count; i++ ) { index = native[i].ni_event; control->which_counter[i]=_vmware_native_table[index].which_counter; 
native[i].ni_position = i; } control->num_events=count; return PAPI_OK; } /** Triggered by PAPI_start() */ int _vmware_start( hwd_context_t *ctx, hwd_control_state_t *ctl ) { struct _vmware_context *context; (void) ctl; context=(struct _vmware_context *)ctx; _vmware_hardware_read( context, 1 ); return PAPI_OK; } /** Triggered by PAPI_stop() */ int _vmware_stop( hwd_context_t *ctx, hwd_control_state_t *ctl ) { struct _vmware_context *context; (void) ctl; context=(struct _vmware_context *)ctx; _vmware_hardware_read( context, 0 ); return PAPI_OK; } /** Triggered by PAPI_read() */ int _vmware_read( hwd_context_t *ctx, hwd_control_state_t *ctl, long_long **events, int flags ) { struct _vmware_context *context; struct _vmware_control_state *control; (void) flags; int i; context=(struct _vmware_context *)ctx; control=(struct _vmware_control_state *)ctl; _vmware_hardware_read( context, 0 ); for (i=0; i<control->num_events; i++) { if (_vmware_native_table[ _vmware_native_table[control->which_counter[i]].which_counter]. report_difference) { control->value[i]=context->values[control->which_counter[i]]- context->start_values[control->which_counter[i]]; } else { control->value[i]=context->values[control->which_counter[i]]; } // printf("%d %d %lld-%lld=%lld\n",i,control->which_counter[i], // context->values[control->which_counter[i]], // context->start_values[control->which_counter[i]], // control->value[i]); } *events = control->value; return PAPI_OK; } /** Triggered by PAPI_write(), but only if the counters are running */ /* otherwise, the updated state is written to ESI->hw_start */ int _vmware_write( hwd_context_t * ctx, hwd_control_state_t * ctrl, long_long events[] ) { (void) ctx; (void) ctrl; (void) events; SUBDBG( "_vmware_write... %p %p", ctx, ctrl ); /* FIXME... this should actually carry out the write, though */ /* this is non-trivial as which counter being written has to be */ /* determined somehow. 
*/ return PAPI_OK; } /** Triggered by PAPI_reset */ int _vmware_reset( hwd_context_t *ctx, hwd_control_state_t *ctl ) { (void) ctx; (void) ctl; return PAPI_OK; } /** Shutting down a context */ int _vmware_shutdown_thread( hwd_context_t *ctx ) { (void) ctx; #ifdef VMGUESTLIB VMGuestLibError glError; struct _vmware_context *context; context=(struct _vmware_context *)ctx; if (use_guestlib) { glError = GuestLib_CloseHandle(context->glHandle); if (glError != VMGUESTLIB_ERROR_SUCCESS) { fprintf(stderr, "Failed to CloseHandle: %s\n", GuestLib_GetErrorText(glError)); return PAPI_ECMP; } } #endif return PAPI_OK; } /** Triggered by PAPI_shutdown() */ int _vmware_shutdown_component( void ) { #ifdef VMGUESTLIB if (dlclose(dlHandle)) { fprintf(stderr, "dlclose failed\n"); return EXIT_FAILURE; } #endif return PAPI_OK; } /** This function sets various options in the component @param ctx @param code valid are PAPI_SET_DEFDOM, PAPI_SET_DOMAIN, PAPI_SETDEFGRN, PAPI_SET_GRANUL and PAPI_SET_INHERIT @param option */ int _vmware_ctl( hwd_context_t *ctx, int code, _papi_int_option_t *option ) { (void) ctx; (void) code; (void) option; SUBDBG( "_vmware_ctl..." ); return PAPI_OK; } /** This function has to set the bits needed to count different domains In particular: PAPI_DOM_USER, PAPI_DOM_KERNEL PAPI_DOM_OTHER By default return PAPI_EINVAL if none of those are specified and PAPI_OK with success PAPI_DOM_USER is only user context is counted PAPI_DOM_KERNEL is only the Kernel/OS context is counted PAPI_DOM_OTHER is Exception/transient mode (like user TLB misses) PAPI_DOM_ALL is all of the domains */ int _vmware_set_domain( hwd_control_state_t *ctl, int domain ) { (void) ctl; int found = 0; SUBDBG( "_vmware_set_domain..." 
); if ( PAPI_DOM_USER & domain ) { SUBDBG( " PAPI_DOM_USER " ); found = 1; } if ( PAPI_DOM_KERNEL & domain ) { SUBDBG( " PAPI_DOM_KERNEL " ); found = 1; } if ( PAPI_DOM_OTHER & domain ) { SUBDBG( " PAPI_DOM_OTHER " ); found = 1; } if ( PAPI_DOM_ALL & domain ) { SUBDBG( " PAPI_DOM_ALL " ); found = 1; } if ( !found ) { return ( PAPI_EINVAL ); } return PAPI_OK; } /** Vector that points to entry points for our component */ papi_vector_t _vmware_vector = { .cmp_info = { /* default component information (unspecified values are initialized to 0) */ .name = "vmware", .short_name = "vmware", .description = "Provide support for VMware vmguest and pseudo counters", .version = "5.0", .num_mpx_cntrs = VMWARE_MAX_COUNTERS, .num_cntrs = VMWARE_MAX_COUNTERS, .default_domain = PAPI_DOM_USER, .available_domains = PAPI_DOM_USER, .default_granularity = PAPI_GRN_THR, .available_granularities = PAPI_GRN_THR, .hardware_intr_sig = PAPI_INT_SIGNAL, /* component specific cmp_info initializations */ .fast_real_timer = 0, .fast_virtual_timer = 0, .attach = 0, .attach_must_ptrace = 0, }, /* sizes of framework-opaque component-private structures */ .size = { .context = sizeof ( struct _vmware_context ), .control_state = sizeof ( struct _vmware_control_state ), .reg_value = sizeof ( struct _vmware_register ), .reg_alloc = sizeof ( struct _vmware_reg_alloc ), } , /* function pointers in this component */ .init_thread = _vmware_init_thread, .init_component = _vmware_init_component, .init_control_state = _vmware_init_control_state, .start = _vmware_start, .stop = _vmware_stop, .read = _vmware_read, .write = _vmware_write, .shutdown_thread = _vmware_shutdown_thread, .shutdown_component = _vmware_shutdown_component, .ctl = _vmware_ctl, .update_control_state = _vmware_update_control_state, .set_domain = _vmware_set_domain, .reset = _vmware_reset, .ntv_enum_events = _vmware_ntv_enum_events, .ntv_code_to_name = _vmware_ntv_code_to_name, .ntv_code_to_descr = _vmware_ntv_code_to_descr, .ntv_code_to_info = 
_vmware_ntv_code_to_info, }; papi-papi-7-2-0-t/src/config.h.in000066400000000000000000000116011502707512200164510ustar00rootroot00000000000000/* config.h.in. Generated from configure.in by autoheader. */ /* cpu type */ #undef CPU /* POSIX 1b clock */ #undef HAVE_CLOCK_GETTIME /* POSIX 1b realtime clock */ #undef HAVE_CLOCK_GETTIME_REALTIME /* POSIX 1b realtime HR clock */ #undef HAVE_CLOCK_GETTIME_REALTIME_HR /* POSIX 1b per-thread clock */ #undef HAVE_CLOCK_GETTIME_THREAD /* Native access to a hardware cycle counter */ #undef HAVE_CYCLE /* Define to 1 if you have the <c_asm.h> header file. */ #undef HAVE_C_ASM_H /* This platform has the ffsll() function */ #undef HAVE_FFSLL /* Define to 1 if you have the `gethrtime' function. */ #undef HAVE_GETHRTIME /* Full gettid function */ #undef HAVE_GETTID /* Normal gettimeofday timer */ #undef HAVE_GETTIMEOFDAY /* Define if hrtime_t is defined in <sys/time.h> */ #undef HAVE_HRTIME_T /* Define to 1 if you have the <intrinsics.h> header file. */ #undef HAVE_INTRINSICS_H /* Define to 1 if you have the <inttypes.h> header file. */ #undef HAVE_INTTYPES_H /* Define to 1 if you have the `cpc' library (-lcpc). */ #undef HAVE_LIBCPC /* perfctr header file */ #undef HAVE_LIBPERFCTR_H /* Define to 1 if you have the `mach_absolute_time' function. */ #undef HAVE_MACH_ABSOLUTE_TIME /* Define to 1 if you have the <mach/mach_time.h> header file. */ #undef HAVE_MACH_MACH_TIME_H /* Define to 1 if you have the <memory.h> header file. */ #undef HAVE_MEMORY_H /* Altix memory mapped global cycle counter */ #undef HAVE_MMTIMER /* Define to 1 if you have the <perfmon/pfmlib.h> header file. 
*/ #undef HAVE_PERFMON_PFMLIB_H /* Montecito headers */ #undef HAVE_PERFMON_PFMLIB_MONTECITO_H /* Working per thread getrusage */ #undef HAVE_PER_THREAD_GETRUSAGE /* Working per thread timer */ #undef HAVE_PER_THREAD_TIMES /* new pfmlib_output_param_t */ #undef HAVE_PFMLIB_OUTPUT_PFP_PMD_COUNT /* event description function */ #undef HAVE_PFM_GET_EVENT_DESCRIPTION /* new pfm_msg_t */ #undef HAVE_PFM_MSG_TYPE /* old reg_evt_idx */ #undef HAVE_PFM_REG_EVT_IDX /* Define to 1 if you have the `read_real_time' function. */ #undef HAVE_READ_REAL_TIME /* Define to 1 if you have the `sched_getcpu' function. */ #undef HAVE_SCHED_GETCPU /* Define to 1 if you have the <sched.h> header file. */ #undef HAVE_SCHED_H /* Define to 1 if you have the <stdint.h> header file. */ #undef HAVE_STDINT_H /* Define to 1 if you have the <stdlib.h> header file. */ #undef HAVE_STDLIB_H /* Define to 1 if you have the <strings.h> header file. */ #undef HAVE_STRINGS_H /* Define to 1 if you have the <string.h> header file. */ #undef HAVE_STRING_H /* gettid syscall function */ #undef HAVE_SYSCALL_GETTID /* Define to 1 if you have the <sys/stat.h> header file. */ #undef HAVE_SYS_STAT_H /* Define to 1 if you have the <sys/time.h> header file. */ #undef HAVE_SYS_TIME_H /* Define to 1 if you have the <sys/types.h> header file. */ #undef HAVE_SYS_TYPES_H /* Keyword for per-thread variables */ #undef HAVE_THREAD_LOCAL_STORAGE /* Define to 1 if you have the `time_base_to_time' function. */ #undef HAVE_TIME_BASE_TO_TIME /* Define to 1 if you have the <unistd.h> header file. */ #undef HAVE_UNISTD_H /* Define for _rtc() intrinsic. */ #undef HAVE__RTC /* Define if _rtc() is not found. */ #undef NO_RTC_INTRINSIC /* Define to the address where bug reports for this package should be sent. */ #undef PACKAGE_BUGREPORT /* Define to the full name of this package. */ #undef PACKAGE_NAME /* Define to the full name and version of this package. */ #undef PACKAGE_STRING /* Define to the one symbol short name of this package. */ #undef PACKAGE_TARNAME /* Define to the home page for this package. 
*/ #undef PACKAGE_URL /* Define to the version of this package. */ #undef PACKAGE_VERSION /* Define to 1 if you have the ANSI C header files. */ #undef STDC_HEADERS /* Define to 1 if you can safely include both <sys/time.h> and <time.h>. */ #undef TIME_WITH_SYS_TIME /* Use the perfctr virtual TSC for per-thread times */ #undef USE_PERFCTR_PTTIMER /* Use /proc for per-thread times */ #undef USE_PROC_PTTIMER /* Enable extensions on AIX 3, Interix. */ #ifndef _ALL_SOURCE # undef _ALL_SOURCE #endif /* Enable GNU extensions on systems that have them. */ #ifndef _GNU_SOURCE # undef _GNU_SOURCE #endif /* Enable threading extensions on Solaris. */ #ifndef _POSIX_PTHREAD_SEMANTICS # undef _POSIX_PTHREAD_SEMANTICS #endif /* Enable extensions on HP NonStop. */ #ifndef _TANDEM_SOURCE # undef _TANDEM_SOURCE #endif /* Enable general extensions on Solaris. */ #ifndef __EXTENSIONS__ # undef __EXTENSIONS__ #endif /* Define to 1 if on MINIX. */ #undef _MINIX /* Define to 2 if the system does not provide POSIX.1 features except with this defined. */ #undef _POSIX_1_SOURCE /* Define to 1 if you need to in order for `stat' and other things to work. */ #undef _POSIX_SOURCE /* Define to `__inline__' or `__inline' if that's what the C compiler calls it, or to nothing if 'inline' is not supported under any name. */ #ifndef __cplusplus #undef inline #endif papi-papi-7-2-0-t/src/configure000077500000000000000000007705761502707512200163520ustar00rootroot00000000000000#! /bin/sh # Guess values for system-dependent variables and create Makefiles. # Generated by GNU Autoconf 2.69 for PAPI 7.2.0.0. # # Report bugs to <ptools-perfapi@icl.utk.edu>. # # # Copyright (C) 1992-1996, 1998-2012 Free Software Foundation, Inc. # # # This configure script is free software; the Free Software Foundation # gives unlimited permission to copy, distribute and modify it. ## -------------------- ## ## M4sh Initialization. 
## ## -------------------- ## # Be more Bourne compatible DUALCASE=1; export DUALCASE # for MKS sh if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then : emulate sh NULLCMD=: # Pre-4.2 versions of Zsh do word splitting on ${1+"$@"}, which # is contrary to our usage. Disable this feature. alias -g '${1+"$@"}'='"$@"' setopt NO_GLOB_SUBST else case `(set -o) 2>/dev/null` in #( *posix*) : set -o posix ;; #( *) : ;; esac fi as_nl=' ' export as_nl # Printing a long string crashes Solaris 7 /usr/bin/printf. as_echo='\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\' as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo$as_echo # Prefer a ksh shell builtin over an external printf program on Solaris, # but without wasting forks for bash or zsh. if test -z "$BASH_VERSION$ZSH_VERSION" \ && (test "X`print -r -- $as_echo`" = "X$as_echo") 2>/dev/null; then as_echo='print -r --' as_echo_n='print -rn --' elif (test "X`printf %s $as_echo`" = "X$as_echo") 2>/dev/null; then as_echo='printf %s\n' as_echo_n='printf %s' else if test "X`(/usr/ucb/echo -n -n $as_echo) 2>/dev/null`" = "X-n $as_echo"; then as_echo_body='eval /usr/ucb/echo -n "$1$as_nl"' as_echo_n='/usr/ucb/echo -n' else as_echo_body='eval expr "X$1" : "X\\(.*\\)"' as_echo_n_body='eval arg=$1; case $arg in #( *"$as_nl"*) expr "X$arg" : "X\\(.*\\)$as_nl"; arg=`expr "X$arg" : ".*$as_nl\\(.*\\)"`;; esac; expr "X$arg" : "X\\(.*\\)" | tr -d "$as_nl" ' export as_echo_n_body as_echo_n='sh -c $as_echo_n_body as_echo' fi export as_echo_body as_echo='sh -c $as_echo_body as_echo' fi # The user is always right. if test "${PATH_SEPARATOR+set}" != set; then PATH_SEPARATOR=: (PATH='/bin;/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 && { (PATH='/bin:/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 || PATH_SEPARATOR=';' } fi # IFS # We need space, tab and new line, in precisely that order. 
Quoting is # there to prevent editors from complaining about space-tab. # (If _AS_PATH_WALK were called with IFS unset, it would disable word # splitting by setting IFS to empty value.) IFS=" "" $as_nl" # Find who we are. Look in the path if we contain no directory separator. as_myself= case $0 in #(( *[\\/]* ) as_myself=$0 ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. test -r "$as_dir/$0" && as_myself=$as_dir/$0 && break done IFS=$as_save_IFS ;; esac # We did not find ourselves, most probably we were run as `sh COMMAND' # in which case we are not to be found in the path. if test "x$as_myself" = x; then as_myself=$0 fi if test ! -f "$as_myself"; then $as_echo "$as_myself: error: cannot find myself; rerun with an absolute file name" >&2 exit 1 fi # Unset variables that we do not need and which cause bugs (e.g. in # pre-3.0 UWIN ksh). But do not cause bugs in bash 2.01; the "|| exit 1" # suppresses any "Segmentation fault" message there. '((' could # trigger a bug in pdksh 5.2.14. for as_var in BASH_ENV ENV MAIL MAILPATH do eval test x\${$as_var+set} = xset \ && ( (unset $as_var) || exit 1) >/dev/null 2>&1 && unset $as_var || : done PS1='$ ' PS2='> ' PS4='+ ' # NLS nuisances. LC_ALL=C export LC_ALL LANGUAGE=C export LANGUAGE # CDPATH. (unset CDPATH) >/dev/null 2>&1 && unset CDPATH # Use a proper internal environment variable to ensure we don't fall # into an infinite loop, continuously re-executing ourselves. if test x"${_as_can_reexec}" != xno && test "x$CONFIG_SHELL" != x; then _as_can_reexec=no; export _as_can_reexec; # We cannot yet assume a decent shell, so we have to provide a # neutralization value for shells without unset; and this also # works around shells that cannot unset nonexistent variables. # Preserve -v and -x to the replacement shell. 
BASH_ENV=/dev/null ENV=/dev/null (unset BASH_ENV) >/dev/null 2>&1 && unset BASH_ENV ENV case $- in # (((( *v*x* | *x*v* ) as_opts=-vx ;; *v* ) as_opts=-v ;; *x* ) as_opts=-x ;; * ) as_opts= ;; esac exec $CONFIG_SHELL $as_opts "$as_myself" ${1+"$@"} # Admittedly, this is quite paranoid, since all the known shells bail # out after a failed `exec'. $as_echo "$0: could not re-execute with $CONFIG_SHELL" >&2 as_fn_exit 255 fi # We don't want this to propagate to other subprocesses. { _as_can_reexec=; unset _as_can_reexec;} if test "x$CONFIG_SHELL" = x; then as_bourne_compatible="if test -n \"\${ZSH_VERSION+set}\" && (emulate sh) >/dev/null 2>&1; then : emulate sh NULLCMD=: # Pre-4.2 versions of Zsh do word splitting on \${1+\"\$@\"}, which # is contrary to our usage. Disable this feature. alias -g '\${1+\"\$@\"}'='\"\$@\"' setopt NO_GLOB_SUBST else case \`(set -o) 2>/dev/null\` in #( *posix*) : set -o posix ;; #( *) : ;; esac fi " as_required="as_fn_return () { (exit \$1); } as_fn_success () { as_fn_return 0; } as_fn_failure () { as_fn_return 1; } as_fn_ret_success () { return 0; } as_fn_ret_failure () { return 1; } exitcode=0 as_fn_success || { exitcode=1; echo as_fn_success failed.; } as_fn_failure && { exitcode=1; echo as_fn_failure succeeded.; } as_fn_ret_success || { exitcode=1; echo as_fn_ret_success failed.; } as_fn_ret_failure && { exitcode=1; echo as_fn_ret_failure succeeded.; } if ( set x; as_fn_ret_success y && test x = \"\$1\" ); then : else exitcode=1; echo positional parameters were not saved. 
fi test x\$exitcode = x0 || exit 1 test -x / || exit 1" as_suggested=" as_lineno_1=";as_suggested=$as_suggested$LINENO;as_suggested=$as_suggested" as_lineno_1a=\$LINENO as_lineno_2=";as_suggested=$as_suggested$LINENO;as_suggested=$as_suggested" as_lineno_2a=\$LINENO eval 'test \"x\$as_lineno_1'\$as_run'\" != \"x\$as_lineno_2'\$as_run'\" && test \"x\`expr \$as_lineno_1'\$as_run' + 1\`\" = \"x\$as_lineno_2'\$as_run'\"' || exit 1 test \$(( 1 + 1 )) = 2 || exit 1" if (eval "$as_required") 2>/dev/null; then : as_have_required=yes else as_have_required=no fi if test x$as_have_required = xyes && (eval "$as_suggested") 2>/dev/null; then : else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR as_found=false for as_dir in /bin$PATH_SEPARATOR/usr/bin$PATH_SEPARATOR$PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. as_found=: case $as_dir in #( /*) for as_base in sh bash ksh sh5; do # Try only shells that exist, to save several forks. as_shell=$as_dir/$as_base if { test -f "$as_shell" || test -f "$as_shell.exe"; } && { $as_echo "$as_bourne_compatible""$as_required" | as_run=a "$as_shell"; } 2>/dev/null; then : CONFIG_SHELL=$as_shell as_have_required=yes if { $as_echo "$as_bourne_compatible""$as_suggested" | as_run=a "$as_shell"; } 2>/dev/null; then : break 2 fi fi done;; esac as_found=false done $as_found || { if { test -f "$SHELL" || test -f "$SHELL.exe"; } && { $as_echo "$as_bourne_compatible""$as_required" | as_run=a "$SHELL"; } 2>/dev/null; then : CONFIG_SHELL=$SHELL as_have_required=yes fi; } IFS=$as_save_IFS if test "x$CONFIG_SHELL" != x; then : export CONFIG_SHELL # We cannot yet assume a decent shell, so we have to provide a # neutralization value for shells without unset; and this also # works around shells that cannot unset nonexistent variables. # Preserve -v and -x to the replacement shell. 
BASH_ENV=/dev/null ENV=/dev/null (unset BASH_ENV) >/dev/null 2>&1 && unset BASH_ENV ENV case $- in # (((( *v*x* | *x*v* ) as_opts=-vx ;; *v* ) as_opts=-v ;; *x* ) as_opts=-x ;; * ) as_opts= ;; esac exec $CONFIG_SHELL $as_opts "$as_myself" ${1+"$@"} # Admittedly, this is quite paranoid, since all the known shells bail # out after a failed `exec'. $as_echo "$0: could not re-execute with $CONFIG_SHELL" >&2 exit 255 fi if test x$as_have_required = xno; then : $as_echo "$0: This script requires a shell more modern than all" $as_echo "$0: the shells that I found on your system." if test x${ZSH_VERSION+set} = xset ; then $as_echo "$0: In particular, zsh $ZSH_VERSION has bugs and should" $as_echo "$0: be upgraded to zsh 4.3.4 or later." else $as_echo "$0: Please tell bug-autoconf@gnu.org and $0: ptools-perfapi@icl.utk.edu about your system, including $0: any error possibly output before this message. Then $0: install a modern shell, or manually run the script $0: under such a shell if you do have one." fi exit 1 fi fi fi SHELL=${CONFIG_SHELL-/bin/sh} export SHELL # Unset more variables known to interfere with behavior of common tools. CLICOLOR_FORCE= GREP_OPTIONS= unset CLICOLOR_FORCE GREP_OPTIONS ## --------------------- ## ## M4sh Shell Functions. ## ## --------------------- ## # as_fn_unset VAR # --------------- # Portably unset VAR. as_fn_unset () { { eval $1=; unset $1;} } as_unset=as_fn_unset # as_fn_set_status STATUS # ----------------------- # Set $? to STATUS, without forking. as_fn_set_status () { return $1 } # as_fn_set_status # as_fn_exit STATUS # ----------------- # Exit the shell with STATUS, even in a "trap 0" or "set -e" context. as_fn_exit () { set +e as_fn_set_status $1 exit $1 } # as_fn_exit # as_fn_mkdir_p # ------------- # Create "$as_dir" as a directory, including parents if necessary. 
as_fn_mkdir_p () { case $as_dir in #( -*) as_dir=./$as_dir;; esac test -d "$as_dir" || eval $as_mkdir_p || { as_dirs= while :; do case $as_dir in #( *\'*) as_qdir=`$as_echo "$as_dir" | sed "s/'/'\\\\\\\\''/g"`;; #'( *) as_qdir=$as_dir;; esac as_dirs="'$as_qdir' $as_dirs" as_dir=`$as_dirname -- "$as_dir" || $as_expr X"$as_dir" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$as_dir" : 'X\(//\)[^/]' \| \ X"$as_dir" : 'X\(//\)$' \| \ X"$as_dir" : 'X\(/\)' \| . 2>/dev/null || $as_echo X"$as_dir" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/ q } /^X\(\/\/\)[^/].*/{ s//\1/ q } /^X\(\/\/\)$/{ s//\1/ q } /^X\(\/\).*/{ s//\1/ q } s/.*/./; q'` test -d "$as_dir" && break done test -z "$as_dirs" || eval "mkdir $as_dirs" } || test -d "$as_dir" || as_fn_error $? "cannot create directory $as_dir" } # as_fn_mkdir_p # as_fn_executable_p FILE # ----------------------- # Test if FILE is an executable regular file. as_fn_executable_p () { test -f "$1" && test -x "$1" } # as_fn_executable_p # as_fn_append VAR VALUE # ---------------------- # Append the text in VALUE to the end of the definition contained in VAR. Take # advantage of any shell optimizations that allow amortized linear growth over # repeated appends, instead of the typical quadratic growth present in naive # implementations. if (eval "as_var=1; as_var+=2; test x\$as_var = x12") 2>/dev/null; then : eval 'as_fn_append () { eval $1+=\$2 }' else as_fn_append () { eval $1=\$$1\$2 } fi # as_fn_append # as_fn_arith ARG... # ------------------ # Perform arithmetic evaluation on the ARGs, and store the result in the # global $as_val. Take advantage of shells that can avoid forks. The arguments # must be portable across $(()) and expr. if (eval "test \$(( 1 + 1 )) = 2") 2>/dev/null; then : eval 'as_fn_arith () { as_val=$(( $* )) }' else as_fn_arith () { as_val=`expr "$@" || test $? 
-eq 1` } fi # as_fn_arith # as_fn_error STATUS ERROR [LINENO LOG_FD] # ---------------------------------------- # Output "`basename $0`: error: ERROR" to stderr. If LINENO and LOG_FD are # provided, also output the error to LOG_FD, referencing LINENO. Then exit the # script with STATUS, using 1 if that was 0. as_fn_error () { as_status=$1; test $as_status -eq 0 && as_status=1 if test "$4"; then as_lineno=${as_lineno-"$3"} as_lineno_stack=as_lineno_stack=$as_lineno_stack $as_echo "$as_me:${as_lineno-$LINENO}: error: $2" >&$4 fi $as_echo "$as_me: error: $2" >&2 as_fn_exit $as_status } # as_fn_error if expr a : '\(a\)' >/dev/null 2>&1 && test "X`expr 00001 : '.*\(...\)'`" = X001; then as_expr=expr else as_expr=false fi if (basename -- /) >/dev/null 2>&1 && test "X`basename -- / 2>&1`" = "X/"; then as_basename=basename else as_basename=false fi if (as_dir=`dirname -- /` && test "X$as_dir" = X/) >/dev/null 2>&1; then as_dirname=dirname else as_dirname=false fi as_me=`$as_basename -- "$0" || $as_expr X/"$0" : '.*/\([^/][^/]*\)/*$' \| \ X"$0" : 'X\(//\)$' \| \ X"$0" : 'X\(/\)' \| . 2>/dev/null || $as_echo X/"$0" | sed '/^.*\/\([^/][^/]*\)\/*$/{ s//\1/ q } /^X\/\(\/\/\)$/{ s//\1/ q } /^X\/\(\/\).*/{ s//\1/ q } s/.*/./; q'` # Avoid depending upon Character Ranges. as_cr_letters='abcdefghijklmnopqrstuvwxyz' as_cr_LETTERS='ABCDEFGHIJKLMNOPQRSTUVWXYZ' as_cr_Letters=$as_cr_letters$as_cr_LETTERS as_cr_digits='0123456789' as_cr_alnum=$as_cr_Letters$as_cr_digits as_lineno_1=$LINENO as_lineno_1a=$LINENO as_lineno_2=$LINENO as_lineno_2a=$LINENO eval 'test "x$as_lineno_1'$as_run'" != "x$as_lineno_2'$as_run'" && test "x`expr $as_lineno_1'$as_run' + 1`" = "x$as_lineno_2'$as_run'"' || { # Blame Lee E. McMahon (1931-1989) for sed's syntax. 
:-) sed -n ' p /[$]LINENO/= ' <$as_myself | sed ' s/[$]LINENO.*/&-/ t lineno b :lineno N :loop s/[$]LINENO\([^'$as_cr_alnum'_].*\n\)\(.*\)/\2\1\2/ t loop s/-\n.*// ' >$as_me.lineno && chmod +x "$as_me.lineno" || { $as_echo "$as_me: error: cannot create $as_me.lineno; rerun with a POSIX shell" >&2; as_fn_exit 1; } # If we had to re-execute with $CONFIG_SHELL, we're ensured to have # already done that, so ensure we don't try to do so again and fall # in an infinite loop. This has already happened in practice. _as_can_reexec=no; export _as_can_reexec # Don't try to exec as it changes $[0], causing all sort of problems # (the dirname of $[0] is not the place where we might find the # original and so on. Autoconf is especially sensitive to this). . "./$as_me.lineno" # Exit status is that of the last command. exit } ECHO_C= ECHO_N= ECHO_T= case `echo -n x` in #((((( -n*) case `echo 'xy\c'` in *c*) ECHO_T=' ';; # ECHO_T is single tab character. xy) ECHO_C='\c';; *) echo `echo ksh88 bug on AIX 6.1` > /dev/null ECHO_T=' ';; esac;; *) ECHO_N='-n';; esac rm -f conf$$ conf$$.exe conf$$.file if test -d conf$$.dir; then rm -f conf$$.dir/conf$$.file else rm -f conf$$.dir mkdir conf$$.dir 2>/dev/null fi if (echo >conf$$.file) 2>/dev/null; then if ln -s conf$$.file conf$$ 2>/dev/null; then as_ln_s='ln -s' # ... but there are two gotchas: # 1) On MSYS, both `ln -s file dir' and `ln file dir' fail. # 2) DJGPP < 2.04 has no symlinks; `ln -s' creates a wrapper executable. # In both cases, we have to default to `cp -pR'. ln -s conf$$.file conf$$.dir 2>/dev/null && test ! -f conf$$.exe || as_ln_s='cp -pR' elif ln conf$$.file conf$$ 2>/dev/null; then as_ln_s=ln else as_ln_s='cp -pR' fi else as_ln_s='cp -pR' fi rm -f conf$$ conf$$.exe conf$$.dir/conf$$.file conf$$.file rmdir conf$$.dir 2>/dev/null if mkdir -p . 
2>/dev/null; then as_mkdir_p='mkdir -p "$as_dir"' else test -d ./-p && rmdir ./-p as_mkdir_p=false fi as_test_x='test -x' as_executable_p=as_fn_executable_p # Sed expression to map a string onto a valid CPP name. as_tr_cpp="eval sed 'y%*$as_cr_letters%P$as_cr_LETTERS%;s%[^_$as_cr_alnum]%_%g'" # Sed expression to map a string onto a valid variable name. as_tr_sh="eval sed 'y%*+%pp%;s%[^_$as_cr_alnum]%_%g'" test -n "$DJDIR" || exec 7<&0 </dev/null 6>&1 # Name of the host. # hostname on some systems (SVR3.2, old GNU/Linux) returns a bogus exit status, # so uname gets run too. ac_hostname=`(hostname || uname -n) 2>/dev/null | sed 1q` # # Initializations. # ac_default_prefix=/usr/local ac_clean_files= ac_config_libobj_dir=. LIBOBJS= cross_compiling=no subdirs= MFLAGS= MAKEFLAGS= # Identity of this package. PACKAGE_NAME='PAPI' PACKAGE_TARNAME='papi' PACKAGE_VERSION='7.2.0.0' PACKAGE_STRING='PAPI 7.2.0.0' PACKAGE_BUGREPORT='ptools-perfapi@icl.utk.edu' PACKAGE_URL='' ac_unique_file="papi.c" # Factoring default headers for most tests. 
ac_includes_default="\ #include <stdio.h> #ifdef HAVE_SYS_TYPES_H # include <sys/types.h> #endif #ifdef HAVE_SYS_STAT_H # include <sys/stat.h> #endif #ifdef STDC_HEADERS # include <stdlib.h> # include <stddef.h> #else # ifdef HAVE_STDLIB_H # include <stdlib.h> # endif #endif #ifdef HAVE_STRING_H # if !defined STDC_HEADERS && defined HAVE_MEMORY_H # include <memory.h> # endif # include <string.h> #endif #ifdef HAVE_STRINGS_H # include <strings.h> #endif #ifdef HAVE_INTTYPES_H # include <inttypes.h> #endif #ifdef HAVE_STDINT_H # include <stdint.h> #endif #ifdef HAVE_UNISTD_H # include <unistd.h> #endif" ac_subst_vars='LTLIBOBJS LIBOBJS ENABLE_FORTRAN_TESTS ENABLE_FORTRAN FORT_HEADERS FORT_WRAPPERS_OBJ FORT_WRAPPERS_SRC NVPPC64LEFLAGS CC_COMMON_NAME BGPM_INSTALL_DIR HAVE_NO_OVERRIDE_INIT FTEST_TARGETS COMPONENTS COMPONENT_RULES BITFLAGS BGP_SYSDIR SHOW_CONF tests TESTS TOPTFLAGS SHLIBDEPS MISCHDRS FLAGS ARG64 cpu_option CPU_MODEL ARCH_EVENTS POST_BUILD MISCOBJS MISCSRCS NOOPT OMPCFLGS SMPCFLGS CC_SHR CC_R CTEST_TARGETS DESCR OSCONTEXT OSLOCK OSFILESHDR OSFILESOBJ OSFILESSRC CPUCOMPONENT_OBJ CPUCOMPONENT_C CPUCOMPONENT_NAME OPTFLAGS PAPICFLAGS VLIB PAPISOVER SHLIB LIBRARY FILENAME CPU VERSION LINKLIB SETPATH PAPI_EVENTS_CSV PAPI_EVENTS OS pfm_libdir pfm_incdir pfm_prefix old_pfmv2 pfm_root altix NO_MPI_TESTS STATIC papiLIBS AR PMINIT PMAPI MAKEVER arch LIBSDEFLAGS BUILD_LIBSDE_STATIC BUILD_LIBSDE_SHARED BUILD_SHARED_LIB LDL LRT EGREP GREP RANLIB SET_MAKE LN_S CPP AWK MPIICC MPICC ac_ct_F77 FFLAGS F77 OBJEXT EXEEXT ac_ct_CC CPPFLAGS LDFLAGS CFLAGS CC NEC MIC target_alias host_alias build_alias LIBS ECHO_T ECHO_N ECHO_C DEFS mandir localedir libdir psdir pdfdir dvidir htmldir infodir docdir oldincludedir includedir runstatedir localstatedir sharedstatedir sysconfdir datadir datarootdir libexecdir sbindir bindir program_transform_name prefix exec_prefix PACKAGE_URL PACKAGE_BUGREPORT PACKAGE_STRING PACKAGE_VERSION PACKAGE_TARNAME PACKAGE_NAME PATH_SEPARATOR SHELL' ac_subst_files='' ac_user_opts=' enable_option_checking with_arch with_bitmode with_OS with_OSVER with_assumed_kernel with_mic with_nec 
with_bgpm_installdir with_nativecc with_tests with_debug enable_warnings with_CPU with_pthread_mutexes with_ffsll with_walltimer with_tls with_virtualtimer with_pmapi with_static_user_events with_static_papi_events with_static_lib with_shared_lib with_static_tools with_shlib_tools with_libsde with_perfnec with_perfmon with_pfm_root with_pfm_prefix with_pfm_incdir with_pfm_libdir enable_cpu enable_perf_event enable_perf_event_uncore with_perf_events enable_perfevent_rdpmc with_pe_incdir with_sysdetect with_components enable_fortran ' ac_precious_vars='build_alias host_alias target_alias CC CFLAGS LDFLAGS LIBS CPPFLAGS F77 FFLAGS CPP' # Initialize some variables set by options. ac_init_help= ac_init_version=false ac_unrecognized_opts= ac_unrecognized_sep= # The variables have the same names as the options, with # dashes changed to underlines. cache_file=/dev/null exec_prefix=NONE no_create= no_recursion= prefix=NONE program_prefix=NONE program_suffix=NONE program_transform_name=s,x,x, silent= site= srcdir= verbose= x_includes=NONE x_libraries=NONE # Installation directory options. # These are left unexpanded so users can "make install exec_prefix=/foo" # and all the variables that are supposed to be based on exec_prefix # by default will actually change. # Use braces instead of parens because sh, perl, etc. also accept them. # (The list follows the same order as the GNU Coding Standards.) 
bindir='${exec_prefix}/bin'
sbindir='${exec_prefix}/sbin'
libexecdir='${exec_prefix}/libexec'
datarootdir='${prefix}/share'
datadir='${datarootdir}'
sysconfdir='${prefix}/etc'
sharedstatedir='${prefix}/com'
localstatedir='${prefix}/var'
runstatedir='${localstatedir}/run'
includedir='${prefix}/include'
oldincludedir='/usr/include'
docdir='${datarootdir}/doc/${PACKAGE_TARNAME}'
infodir='${datarootdir}/info'
htmldir='${docdir}'
dvidir='${docdir}'
pdfdir='${docdir}'
psdir='${docdir}'
libdir='${exec_prefix}/lib'
localedir='${datarootdir}/locale'
mandir='${datarootdir}/man'

ac_prev=
ac_dashdash=
for ac_option
do
  # If the previous option needs an argument, assign it.
  if test -n "$ac_prev"; then
    eval $ac_prev=\$ac_option
    ac_prev=
    continue
  fi

  case $ac_option in
  *=?*) ac_optarg=`expr "X$ac_option" : '[^=]*=\(.*\)'` ;;
  *=)   ac_optarg= ;;
  *)    ac_optarg=yes ;;
  esac

  # Accept the important Cygnus configure options, so we can diagnose typos.

  case $ac_dashdash$ac_option in
  --)
    ac_dashdash=yes ;;

  -bindir | --bindir | --bindi | --bind | --bin | --bi)
    ac_prev=bindir ;;
  -bindir=* | --bindir=* | --bindi=* | --bind=* | --bin=* | --bi=*)
    bindir=$ac_optarg ;;

  -build | --build | --buil | --bui | --bu)
    ac_prev=build_alias ;;
  -build=* | --build=* | --buil=* | --bui=* | --bu=*)
    build_alias=$ac_optarg ;;

  -cache-file | --cache-file | --cache-fil | --cache-fi \
  | --cache-f | --cache- | --cache | --cach | --cac | --ca | --c)
    ac_prev=cache_file ;;
  -cache-file=* | --cache-file=* | --cache-fil=* | --cache-fi=* \
  | --cache-f=* | --cache-=* | --cache=* | --cach=* | --cac=* | --ca=* | --c=*)
    cache_file=$ac_optarg ;;

  --config-cache | -C)
    cache_file=config.cache ;;

  -datadir | --datadir | --datadi | --datad)
    ac_prev=datadir ;;
  -datadir=* | --datadir=* | --datadi=* | --datad=*)
    datadir=$ac_optarg ;;

  -datarootdir | --datarootdir | --datarootdi | --datarootd | --dataroot \
  | --dataroo | --dataro | --datar)
    ac_prev=datarootdir ;;
  -datarootdir=* | --datarootdir=* | --datarootdi=* | --datarootd=* \
  | --dataroot=* | --dataroo=* | --dataro=* | --datar=*)
    datarootdir=$ac_optarg ;;

  -disable-* | --disable-*)
    ac_useropt=`expr "x$ac_option" : 'x-*disable-\(.*\)'`
    # Reject names that are not valid shell variable names.
    expr "x$ac_useropt" : ".*[^-+._$as_cr_alnum]" >/dev/null &&
      as_fn_error $? "invalid feature name: $ac_useropt"
    ac_useropt_orig=$ac_useropt
    ac_useropt=`$as_echo "$ac_useropt" | sed 's/[-+.]/_/g'`
    case $ac_user_opts in
      *"
"enable_$ac_useropt"
"*) ;;
      *) ac_unrecognized_opts="$ac_unrecognized_opts$ac_unrecognized_sep--disable-$ac_useropt_orig"
	 ac_unrecognized_sep=', ';;
    esac
    eval enable_$ac_useropt=no ;;

  -docdir | --docdir | --docdi | --doc | --do)
    ac_prev=docdir ;;
  -docdir=* | --docdir=* | --docdi=* | --doc=* | --do=*)
    docdir=$ac_optarg ;;

  -dvidir | --dvidir | --dvidi | --dvid | --dvi | --dv)
    ac_prev=dvidir ;;
  -dvidir=* | --dvidir=* | --dvidi=* | --dvid=* | --dvi=* | --dv=*)
    dvidir=$ac_optarg ;;

  -enable-* | --enable-*)
    ac_useropt=`expr "x$ac_option" : 'x-*enable-\([^=]*\)'`
    # Reject names that are not valid shell variable names.
    expr "x$ac_useropt" : ".*[^-+._$as_cr_alnum]" >/dev/null &&
      as_fn_error $? "invalid feature name: $ac_useropt"
    ac_useropt_orig=$ac_useropt
    ac_useropt=`$as_echo "$ac_useropt" | sed 's/[-+.]/_/g'`
    case $ac_user_opts in
      *"
"enable_$ac_useropt"
"*) ;;
      *) ac_unrecognized_opts="$ac_unrecognized_opts$ac_unrecognized_sep--enable-$ac_useropt_orig"
	 ac_unrecognized_sep=', ';;
    esac
    eval enable_$ac_useropt=\$ac_optarg ;;

  -exec-prefix | --exec_prefix | --exec-prefix | --exec-prefi \
  | --exec-pref | --exec-pre | --exec-pr | --exec-p | --exec- \
  | --exec | --exe | --ex)
    ac_prev=exec_prefix ;;
  -exec-prefix=* | --exec_prefix=* | --exec-prefix=* | --exec-prefi=* \
  | --exec-pref=* | --exec-pre=* | --exec-pr=* | --exec-p=* | --exec-=* \
  | --exec=* | --exe=* | --ex=*)
    exec_prefix=$ac_optarg ;;

  -gas | --gas | --ga | --g)
    # Obsolete; use --with-gas.
    with_gas=yes ;;

  -help | --help | --hel | --he | -h)
    ac_init_help=long ;;
  -help=r* | --help=r* | --hel=r* | --he=r* | -hr*)
    ac_init_help=recursive ;;
  -help=s* | --help=s* | --hel=s* | --he=s* | -hs*)
    ac_init_help=short ;;

  -host | --host | --hos | --ho)
    ac_prev=host_alias ;;
  -host=* | --host=* | --hos=* | --ho=*)
    host_alias=$ac_optarg ;;

  -htmldir | --htmldir | --htmldi | --htmld | --html | --htm | --ht)
    ac_prev=htmldir ;;
  -htmldir=* | --htmldir=* | --htmldi=* | --htmld=* | --html=* | --htm=* \
  | --ht=*)
    htmldir=$ac_optarg ;;

  -includedir | --includedir | --includedi | --included | --include \
  | --includ | --inclu | --incl | --inc)
    ac_prev=includedir ;;
  -includedir=* | --includedir=* | --includedi=* | --included=* | --include=* \
  | --includ=* | --inclu=* | --incl=* | --inc=*)
    includedir=$ac_optarg ;;

  -infodir | --infodir | --infodi | --infod | --info | --inf)
    ac_prev=infodir ;;
  -infodir=* | --infodir=* | --infodi=* | --infod=* | --info=* | --inf=*)
    infodir=$ac_optarg ;;

  -libdir | --libdir | --libdi | --libd)
    ac_prev=libdir ;;
  -libdir=* | --libdir=* | --libdi=* | --libd=*)
    libdir=$ac_optarg ;;

  -libexecdir | --libexecdir | --libexecdi | --libexecd | --libexec \
  | --libexe | --libex | --libe)
    ac_prev=libexecdir ;;
  -libexecdir=* | --libexecdir=* | --libexecdi=* | --libexecd=* | --libexec=* \
  | --libexe=* | --libex=* | --libe=*)
    libexecdir=$ac_optarg ;;

  -localedir | --localedir | --localedi | --localed | --locale)
    ac_prev=localedir ;;
  -localedir=* | --localedir=* | --localedi=* | --localed=* | --locale=*)
    localedir=$ac_optarg ;;

  -localstatedir | --localstatedir | --localstatedi | --localstated \
  | --localstate | --localstat | --localsta | --localst | --locals)
    ac_prev=localstatedir ;;
  -localstatedir=* | --localstatedir=* | --localstatedi=* | --localstated=* \
  | --localstate=* | --localstat=* | --localsta=* | --localst=* | --locals=*)
    localstatedir=$ac_optarg ;;

  -mandir | --mandir | --mandi | --mand | --man | --ma | --m)
    ac_prev=mandir ;;
  -mandir=* | --mandir=* | --mandi=* | --mand=* | --man=* | --ma=* | --m=*)
    mandir=$ac_optarg ;;

  -nfp | --nfp | --nf)
    # Obsolete; use --without-fp.
    with_fp=no ;;

  -no-create | --no-create | --no-creat | --no-crea | --no-cre \
  | --no-cr | --no-c | -n)
    no_create=yes ;;

  -no-recursion | --no-recursion | --no-recursio | --no-recursi \
  | --no-recurs | --no-recur | --no-recu | --no-rec | --no-re | --no-r)
    no_recursion=yes ;;

  -oldincludedir | --oldincludedir | --oldincludedi | --oldincluded \
  | --oldinclude | --oldinclud | --oldinclu | --oldincl | --oldinc \
  | --oldin | --oldi | --old | --ol | --o)
    ac_prev=oldincludedir ;;
  -oldincludedir=* | --oldincludedir=* | --oldincludedi=* | --oldincluded=* \
  | --oldinclude=* | --oldinclud=* | --oldinclu=* | --oldincl=* | --oldinc=* \
  | --oldin=* | --oldi=* | --old=* | --ol=* | --o=*)
    oldincludedir=$ac_optarg ;;

  -prefix | --prefix | --prefi | --pref | --pre | --pr | --p)
    ac_prev=prefix ;;
  -prefix=* | --prefix=* | --prefi=* | --pref=* | --pre=* | --pr=* | --p=*)
    prefix=$ac_optarg ;;

  -program-prefix | --program-prefix | --program-prefi | --program-pref \
  | --program-pre | --program-pr | --program-p)
    ac_prev=program_prefix ;;
  -program-prefix=* | --program-prefix=* | --program-prefi=* \
  | --program-pref=* | --program-pre=* | --program-pr=* | --program-p=*)
    program_prefix=$ac_optarg ;;

  -program-suffix | --program-suffix | --program-suffi | --program-suff \
  | --program-suf | --program-su | --program-s)
    ac_prev=program_suffix ;;
  -program-suffix=* | --program-suffix=* | --program-suffi=* \
  | --program-suff=* | --program-suf=* | --program-su=* | --program-s=*)
    program_suffix=$ac_optarg ;;

  -program-transform-name | --program-transform-name \
  | --program-transform-nam | --program-transform-na \
  | --program-transform-n | --program-transform- \
  | --program-transform | --program-transfor \
  | --program-transfo | --program-transf \
  | --program-trans | --program-tran \
  | --progr-tra | --program-tr | --program-t)
    ac_prev=program_transform_name ;;
  -program-transform-name=* | --program-transform-name=* \
  | --program-transform-nam=* | --program-transform-na=* \
  | --program-transform-n=* | --program-transform-=* \
  | --program-transform=* | --program-transfor=* \
  | --program-transfo=* | --program-transf=* \
  | --program-trans=* | --program-tran=* \
  | --progr-tra=* | --program-tr=* | --program-t=*)
    program_transform_name=$ac_optarg ;;

  -pdfdir | --pdfdir | --pdfdi | --pdfd | --pdf | --pd)
    ac_prev=pdfdir ;;
  -pdfdir=* | --pdfdir=* | --pdfdi=* | --pdfd=* | --pdf=* | --pd=*)
    pdfdir=$ac_optarg ;;

  -psdir | --psdir | --psdi | --psd | --ps)
    ac_prev=psdir ;;
  -psdir=* | --psdir=* | --psdi=* | --psd=* | --ps=*)
    psdir=$ac_optarg ;;

  -q | -quiet | --quiet | --quie | --qui | --qu | --q \
  | -silent | --silent | --silen | --sile | --sil)
    silent=yes ;;

  -runstatedir | --runstatedir | --runstatedi | --runstated \
  | --runstate | --runstat | --runsta | --runst | --runs \
  | --run | --ru | --r)
    ac_prev=runstatedir ;;
  -runstatedir=* | --runstatedir=* | --runstatedi=* | --runstated=* \
  | --runstate=* | --runstat=* | --runsta=* | --runst=* | --runs=* \
  | --run=* | --ru=* | --r=*)
    runstatedir=$ac_optarg ;;

  -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb)
    ac_prev=sbindir ;;
  -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \
  | --sbi=* | --sb=*)
    sbindir=$ac_optarg ;;

  -sharedstatedir | --sharedstatedir | --sharedstatedi \
  | --sharedstated | --sharedstate | --sharedstat | --sharedsta \
  | --sharedst | --shareds | --shared | --share | --shar \
  | --sha | --sh)
    ac_prev=sharedstatedir ;;
  -sharedstatedir=* | --sharedstatedir=* | --sharedstatedi=* \
  | --sharedstated=* | --sharedstate=* | --sharedstat=* | --sharedsta=* \
  | --sharedst=* | --shareds=* | --shared=* | --share=* | --shar=* \
  | --sha=* | --sh=*)
    sharedstatedir=$ac_optarg ;;

  -site | --site | --sit)
    ac_prev=site ;;
  -site=* | --site=* | --sit=*)
    site=$ac_optarg ;;

  -srcdir | --srcdir | --srcdi | --srcd | --src | --sr)
    ac_prev=srcdir ;;
  -srcdir=* | --srcdir=* | --srcdi=* | --srcd=* | --src=* | --sr=*)
    srcdir=$ac_optarg ;;

  -sysconfdir | --sysconfdir | --sysconfdi | --sysconfd | --sysconf \
  | --syscon | --sysco | --sysc | --sys | --sy)
    ac_prev=sysconfdir ;;
  -sysconfdir=* | --sysconfdir=* | --sysconfdi=* | --sysconfd=* | --sysconf=* \
  | --syscon=* | --sysco=* | --sysc=* | --sys=* | --sy=*)
    sysconfdir=$ac_optarg ;;

  -target | --target | --targe | --targ | --tar | --ta | --t)
    ac_prev=target_alias ;;
  -target=* | --target=* | --targe=* | --targ=* | --tar=* | --ta=* | --t=*)
    target_alias=$ac_optarg ;;

  -v | -verbose | --verbose | --verbos | --verbo | --verb)
    verbose=yes ;;

  -version | --version | --versio | --versi | --vers | -V)
    ac_init_version=: ;;

  -with-* | --with-*)
    ac_useropt=`expr "x$ac_option" : 'x-*with-\([^=]*\)'`
    # Reject names that are not valid shell variable names.
    expr "x$ac_useropt" : ".*[^-+._$as_cr_alnum]" >/dev/null &&
      as_fn_error $? "invalid package name: $ac_useropt"
    ac_useropt_orig=$ac_useropt
    ac_useropt=`$as_echo "$ac_useropt" | sed 's/[-+.]/_/g'`
    case $ac_user_opts in
      *"
"with_$ac_useropt"
"*) ;;
      *) ac_unrecognized_opts="$ac_unrecognized_opts$ac_unrecognized_sep--with-$ac_useropt_orig"
	 ac_unrecognized_sep=', ';;
    esac
    eval with_$ac_useropt=\$ac_optarg ;;

  -without-* | --without-*)
    ac_useropt=`expr "x$ac_option" : 'x-*without-\(.*\)'`
    # Reject names that are not valid shell variable names.
    expr "x$ac_useropt" : ".*[^-+._$as_cr_alnum]" >/dev/null &&
      as_fn_error $? "invalid package name: $ac_useropt"
    ac_useropt_orig=$ac_useropt
    ac_useropt=`$as_echo "$ac_useropt" | sed 's/[-+.]/_/g'`
    case $ac_user_opts in
      *"
"with_$ac_useropt"
"*) ;;
      *) ac_unrecognized_opts="$ac_unrecognized_opts$ac_unrecognized_sep--without-$ac_useropt_orig"
	 ac_unrecognized_sep=', ';;
    esac
    eval with_$ac_useropt=no ;;

  --x)
    # Obsolete; use --with-x.
    with_x=yes ;;

  -x-includes | --x-includes | --x-include | --x-includ | --x-inclu \
  | --x-incl | --x-inc | --x-in | --x-i)
    ac_prev=x_includes ;;
  -x-includes=* | --x-includes=* | --x-include=* | --x-includ=* | --x-inclu=* \
  | --x-incl=* | --x-inc=* | --x-in=* | --x-i=*)
    x_includes=$ac_optarg ;;

  -x-libraries | --x-libraries | --x-librarie | --x-librari \
  | --x-librar | --x-libra | --x-libr | --x-lib | --x-li | --x-l)
    ac_prev=x_libraries ;;
  -x-libraries=* | --x-libraries=* | --x-librarie=* | --x-librari=* \
  | --x-librar=* | --x-libra=* | --x-libr=* | --x-lib=* | --x-li=* | --x-l=*)
    x_libraries=$ac_optarg ;;

  -*) as_fn_error $? "unrecognized option: \`$ac_option'
Try \`$0 --help' for more information"
    ;;

  *=*)
    ac_envvar=`expr "x$ac_option" : 'x\([^=]*\)='`
    # Reject names that are not valid shell variable names.
    case $ac_envvar in #(
      '' | [0-9]* | *[!_$as_cr_alnum]* )
      as_fn_error $? "invalid variable name: \`$ac_envvar'" ;;
    esac
    eval $ac_envvar=\$ac_optarg
    export $ac_envvar ;;

  *)
    # FIXME: should be removed in autoconf 3.0.
    $as_echo "$as_me: WARNING: you should use --build, --host, --target" >&2
    expr "x$ac_option" : ".*[^-._$as_cr_alnum]" >/dev/null &&
      $as_echo "$as_me: WARNING: invalid host type: $ac_option" >&2
    : "${build_alias=$ac_option} ${host_alias=$ac_option} ${target_alias=$ac_option}"
    ;;

  esac
done

if test -n "$ac_prev"; then
  ac_option=--`echo $ac_prev | sed 's/_/-/g'`
  as_fn_error $? "missing argument to $ac_option"
fi

if test -n "$ac_unrecognized_opts"; then
  case $enable_option_checking in
    no) ;;
    fatal) as_fn_error $? "unrecognized options: $ac_unrecognized_opts" ;;
    *)     $as_echo "$as_me: WARNING: unrecognized options: $ac_unrecognized_opts" >&2 ;;
  esac
fi

# Check all directory arguments for consistency.
for ac_var in	exec_prefix prefix bindir sbindir libexecdir datarootdir \
		datadir sysconfdir sharedstatedir localstatedir includedir \
		oldincludedir docdir infodir htmldir dvidir pdfdir psdir \
		libdir localedir mandir runstatedir
do
  eval ac_val=\$$ac_var
  # Remove trailing slashes.
  case $ac_val in
    */ )
      ac_val=`expr "X$ac_val" : 'X\(.*[^/]\)' \| "X$ac_val" : 'X\(.*\)'`
      eval $ac_var=\$ac_val;;
  esac
  # Be sure to have absolute directory names.
  case $ac_val in
    [\\/$]* | ?:[\\/]* )  continue;;
    NONE | '' ) case $ac_var in *prefix ) continue;; esac;;
  esac
  as_fn_error $? "expected an absolute directory name for --$ac_var: $ac_val"
done

# There might be people who depend on the old broken behavior: `$host'
# used to hold the argument of --host etc.
# FIXME: To remove some day.
build=$build_alias
host=$host_alias
target=$target_alias

# FIXME: To remove some day.
if test "x$host_alias" != x; then
  if test "x$build_alias" = x; then
    cross_compiling=maybe
  elif test "x$build_alias" != "x$host_alias"; then
    cross_compiling=yes
  fi
fi

ac_tool_prefix=
test -n "$host_alias" && ac_tool_prefix=$host_alias-

test "$silent" = yes && exec 6>/dev/null


ac_pwd=`pwd` && test -n "$ac_pwd" &&
ac_ls_di=`ls -di .` &&
ac_pwd_ls_di=`cd "$ac_pwd" && ls -di .` ||
  as_fn_error $? "working directory cannot be determined"
test "X$ac_ls_di" = "X$ac_pwd_ls_di" ||
  as_fn_error $? "pwd does not report name of working directory"


# Find the source files, if location was not specified.
if test -z "$srcdir"; then
  ac_srcdir_defaulted=yes
  # Try the directory containing this script, then the parent directory.
  ac_confdir=`$as_dirname -- "$as_myself" ||
$as_expr X"$as_myself" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \
	 X"$as_myself" : 'X\(//\)[^/]' \| \
	 X"$as_myself" : 'X\(//\)$' \| \
	 X"$as_myself" : 'X\(/\)' \| . 2>/dev/null ||
$as_echo X"$as_myself" |
    sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{
	    s//\1/
	    q
	  }
	  /^X\(\/\/\)[^/].*/{
	    s//\1/
	    q
	  }
	  /^X\(\/\/\)$/{
	    s//\1/
	    q
	  }
	  /^X\(\/\).*/{
	    s//\1/
	    q
	  }
	  s/.*/./; q'`
  srcdir=$ac_confdir
  if test ! -r "$srcdir/$ac_unique_file"; then
    srcdir=..
  fi
else
  ac_srcdir_defaulted=no
fi
if test ! -r "$srcdir/$ac_unique_file"; then
  test "$ac_srcdir_defaulted" = yes && srcdir="$ac_confdir or .."
  as_fn_error $? "cannot find sources ($ac_unique_file) in $srcdir"
fi
ac_msg="sources are in $srcdir, but \`cd $srcdir' does not work"
ac_abs_confdir=`(
	cd "$srcdir" && test -r "./$ac_unique_file" || as_fn_error $? "$ac_msg"
	pwd)`
# When building in place, set srcdir=.
if test "$ac_abs_confdir" = "$ac_pwd"; then
  srcdir=.
fi
# Remove unnecessary trailing slashes from srcdir.
# Double slashes in file names in object file debugging info
# mess up M-x gdb in Emacs.
case $srcdir in
*/) srcdir=`expr "X$srcdir" : 'X\(.*[^/]\)' \| "X$srcdir" : 'X\(.*\)'`;;
esac
for ac_var in $ac_precious_vars; do
  eval ac_env_${ac_var}_set=\${${ac_var}+set}
  eval ac_env_${ac_var}_value=\$${ac_var}
  eval ac_cv_env_${ac_var}_set=\${${ac_var}+set}
  eval ac_cv_env_${ac_var}_value=\$${ac_var}
done

#
# Report the --help message.
#
if test "$ac_init_help" = "long"; then
  # Omit some internal or obsolete options to make the list less imposing.
  # This message is too long to be a string in the A/UX 3.1 sh.
  cat <<_ACEOF
\`configure' configures PAPI 7.2.0.0 to adapt to many kinds of systems.

Usage: $0 [OPTION]... [VAR=VALUE]...

To assign environment variables (e.g., CC, CFLAGS...), specify them as
VAR=VALUE.  See below for descriptions of some of the useful variables.

Defaults for the options are specified in brackets.

Configuration:
  -h, --help              display this help and exit
      --help=short        display options specific to this package
      --help=recursive    display the short help of all the included packages
  -V, --version           display version information and exit
  -q, --quiet, --silent   do not print \`checking ...' messages
      --cache-file=FILE   cache test results in FILE [disabled]
  -C, --config-cache      alias for \`--cache-file=config.cache'
  -n, --no-create         do not create output files
      --srcdir=DIR        find the sources in DIR [configure dir or \`..']

Installation directories:
  --prefix=PREFIX         install architecture-independent files in PREFIX
                          [$ac_default_prefix]
  --exec-prefix=EPREFIX   install architecture-dependent files in EPREFIX
                          [PREFIX]

By default, \`make install' will install all the files in
\`$ac_default_prefix/bin', \`$ac_default_prefix/lib' etc.  You can specify
an installation prefix other than \`$ac_default_prefix' using \`--prefix',
for instance \`--prefix=\$HOME'.

For better control, use the options below.

Fine tuning of the installation directories:
  --bindir=DIR            user executables [EPREFIX/bin]
  --sbindir=DIR           system admin executables [EPREFIX/sbin]
  --libexecdir=DIR        program executables [EPREFIX/libexec]
  --sysconfdir=DIR        read-only single-machine data [PREFIX/etc]
  --sharedstatedir=DIR    modifiable architecture-independent data [PREFIX/com]
  --localstatedir=DIR     modifiable single-machine data [PREFIX/var]
  --runstatedir=DIR       modifiable per-process data [LOCALSTATEDIR/run]
  --libdir=DIR            object code libraries [EPREFIX/lib]
  --includedir=DIR        C header files [PREFIX/include]
  --oldincludedir=DIR     C header files for non-gcc [/usr/include]
  --datarootdir=DIR       read-only arch.-independent data root [PREFIX/share]
  --datadir=DIR           read-only architecture-independent data [DATAROOTDIR]
  --infodir=DIR           info documentation [DATAROOTDIR/info]
  --localedir=DIR         locale-dependent data [DATAROOTDIR/locale]
  --mandir=DIR            man documentation [DATAROOTDIR/man]
  --docdir=DIR            documentation root [DATAROOTDIR/doc/papi]
  --htmldir=DIR           html documentation [DOCDIR]
  --dvidir=DIR            dvi documentation [DOCDIR]
  --pdfdir=DIR            pdf documentation [DOCDIR]
  --psdir=DIR             ps documentation [DOCDIR]
_ACEOF

  cat <<\_ACEOF
_ACEOF
fi

if test -n "$ac_init_help"; then
  case $ac_init_help in
     short | recursive ) echo "Configuration of PAPI 7.2.0.0:";;
   esac
  cat <<\_ACEOF

Optional Features:
  --disable-option-checking  ignore unrecognized --enable/--with options
  --disable-FEATURE       do not include FEATURE (same as --enable-FEATURE=no)
  --enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
  --enable-warnings       Enable build with -Wall -Wextra (default: disabled)
  --disable-cpu           Disable cpu component
  --disable-perf-event    Disable perf_event component
  --disable-perf-event-uncore
                          Disable perf_event_uncore component
  --enable-perfevent-rdpmc
                          Enable userspace rdpmc instruction on perf_event,
                          default: yes
  --disable-fortran       Whether to disable fortran bindings

Optional Packages:
  --with-PACKAGE[=ARG]    use PACKAGE [ARG=yes]
  --without-PACKAGE       do not use PACKAGE (same as --with-PACKAGE=no)
  --with-arch=            Specify architecture (uname -m)
  --with-bitmode=<32,64>  Specify bit mode of library
  --with-OS=              Specify operating system
  --with-OSVER=           Specify operating system version
  --with-assumed-kernel=  Assume kernel version is for purposes of workarounds
  --with-mic              To compile for Intel MIC
  --with-nec              To compile for NEC
  --with-bgpm_installdir=
                          Specify the installation path of BGPM
  --with-nativecc=        Specify native C compiler for header generation
  --with-tests=<"ctests ftests mpitests comp_tests", no>
                          Specify which tests to run on install, or "no"
                          tests (default: all available tests)
  --with-debug=           Build a debug version, debug version plus memory
                          tracker or none
  --with-CPU=             Specify CPU type
  --with-pthread-mutexes  Specify use of pthread mutexes rather than custom
                          PAPI locks
  --with-ffsll            Specify use of the ffsll() function
  --with-walltimer=       Specify realtime timer
  --with-tls=             This platform supports thread local storage with a
                          keyword
  --with-virtualtimer=    Specify per-thread virtual timer
  --with-pmapi=           Specify path of pmapi on aix system
  --with-static-user-events
                          Build with a static user events file.
  --with-static-papi-events
                          Build with a static papi events file.
  --with-static-lib=      Build a static library
  --with-shared-lib=      Build a shared library
  --with-static-tools     Specify static compile of tests and utilities
  --with-shlib-tools      Specify linking with papi library of tests and
                          utilities
  --with-libsde=          Build the standalone libsde (default: yes)
  --with-perfnec          Specify perf_nec as the performance interface
  --with-perfmon=         Specify perfmon as the performance interface and
                          specify version
  --with-pfm-root=        Specify path to source tree (for use by developers
                          only)
  --with-pfm-prefix=      Specify prefix to installed pfm distribution
  --with-pfm-incdir=      Specify directory of pfm header files in
                          non-standard location
  --with-pfm-libdir=      Specify directory of pfm library in non-standard
                          location
  --with-perf-events      Specify use of Linux Performance Event (requires
                          kernel 2.6.32 or greater)
  --with-pe-incdir=       Specify path to the correct perf header file
  --with-sysdetect=       Build the sysdetect component (default: yes)
  --with-components=<"component1 component2">
                          Specify which components to build

Some influential environment variables:
  CC          C compiler command
  CFLAGS      C compiler flags
  LDFLAGS     linker flags, e.g. -L<lib dir> if you have libraries in a
              nonstandard directory <lib dir>
  LIBS        libraries to pass to the linker, e.g. -l<library>
  CPPFLAGS    (Objective) C/C++ preprocessor flags, e.g. -I<include dir> if
              you have headers in a nonstandard directory <include dir>
  F77         Fortran 77 compiler command
  FFLAGS      Fortran 77 compiler flags
  CPP         C preprocessor

Use these variables to override the choices made by `configure' or to help
it to find libraries and programs with nonstandard names/locations.

Report bugs to <ptools-perfapi@icl.utk.edu>.
_ACEOF
ac_status=$?
fi

if test "$ac_init_help" = "recursive"; then
  # If there are subdirs, report their specific --help.
  for ac_dir in : $ac_subdirs_all; do test "x$ac_dir" = x: && continue
    test -d "$ac_dir" ||
      { cd "$srcdir" && ac_pwd=`pwd` && srcdir=. && test -d "$ac_dir"; } ||
      continue
    ac_builddir=.

case "$ac_dir" in
.) ac_dir_suffix= ac_top_builddir_sub=. ac_top_build_prefix= ;;
*)
  ac_dir_suffix=/`$as_echo "$ac_dir" | sed 's|^\.[\\/]||'`
  # A ".." for each directory in $ac_dir_suffix.
  ac_top_builddir_sub=`$as_echo "$ac_dir_suffix" | sed 's|/[^\\/]*|/..|g;s|/||'`
  case $ac_top_builddir_sub in
  "") ac_top_builddir_sub=. ac_top_build_prefix= ;;
  *)  ac_top_build_prefix=$ac_top_builddir_sub/ ;;
  esac ;;
esac
ac_abs_top_builddir=$ac_pwd
ac_abs_builddir=$ac_pwd$ac_dir_suffix
# for backward compatibility:
ac_top_builddir=$ac_top_build_prefix

case $srcdir in
  .)  # We are building in place.
    ac_srcdir=.
    ac_top_srcdir=$ac_top_builddir_sub
    ac_abs_top_srcdir=$ac_pwd ;;
  [\\/]* | ?:[\\/]* )  # Absolute name.
    ac_srcdir=$srcdir$ac_dir_suffix;
    ac_top_srcdir=$srcdir
    ac_abs_top_srcdir=$srcdir ;;
  *) # Relative name.
    ac_srcdir=$ac_top_build_prefix$srcdir$ac_dir_suffix
    ac_top_srcdir=$ac_top_build_prefix$srcdir
    ac_abs_top_srcdir=$ac_pwd/$srcdir ;;
esac
ac_abs_srcdir=$ac_abs_top_srcdir$ac_dir_suffix

    cd "$ac_dir" || { ac_status=$?; continue; }
    # Check for guested configure.
    if test -f "$ac_srcdir/configure.gnu"; then
      echo &&
      $SHELL "$ac_srcdir/configure.gnu" --help=recursive
    elif test -f "$ac_srcdir/configure"; then
      echo &&
      $SHELL "$ac_srcdir/configure" --help=recursive
    else
      $as_echo "$as_me: WARNING: no configuration information is in $ac_dir" >&2
    fi || ac_status=$?
    cd "$ac_pwd" || { ac_status=$?; break; }
  done
fi

test -n "$ac_init_help" && exit $ac_status
if $ac_init_version; then
  cat <<\_ACEOF
PAPI configure 7.2.0.0
generated by GNU Autoconf 2.69

Copyright (C) 2012 Free Software Foundation, Inc.
This configure script is free software; the Free Software Foundation
gives unlimited permission to copy, distribute and modify it.
_ACEOF
  exit
fi

## ------------------------ ##
## Autoconf initialization. ##
## ------------------------ ##

# ac_fn_c_try_compile LINENO
# --------------------------
# Try to compile conftest.$ac_ext, and return whether this succeeded.
ac_fn_c_try_compile ()
{
  as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack
  rm -f conftest.$ac_objext
  if { { ac_try="$ac_compile"
case "(($ac_try" in
  *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;;
  *) ac_try_echo=$ac_try;;
esac
eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\""
$as_echo "$ac_try_echo"; } >&5
  (eval "$ac_compile") 2>conftest.err
  ac_status=$?
  if test -s conftest.err; then
    grep -v '^ *+' conftest.err >conftest.er1
    cat conftest.er1 >&5
    mv -f conftest.er1 conftest.err
  fi
  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
  test $ac_status = 0; } && {
	 test -z "$ac_c_werror_flag" ||
	 test ! -s conftest.err
       } && test -s conftest.$ac_objext; then :
  ac_retval=0
else
  $as_echo "$as_me: failed program was:" >&5
sed 's/^/| /' conftest.$ac_ext >&5

	ac_retval=1
fi
  eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno
  as_fn_set_status $ac_retval

} # ac_fn_c_try_compile

# ac_fn_f77_try_compile LINENO
# ----------------------------
# Try to compile conftest.$ac_ext, and return whether this succeeded.
ac_fn_f77_try_compile ()
{
  as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack
  rm -f conftest.$ac_objext
  if { { ac_try="$ac_compile"
case "(($ac_try" in
  *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;;
  *) ac_try_echo=$ac_try;;
esac
eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\""
$as_echo "$ac_try_echo"; } >&5
  (eval "$ac_compile") 2>conftest.err
  ac_status=$?
  if test -s conftest.err; then
    grep -v '^ *+' conftest.err >conftest.er1
    cat conftest.er1 >&5
    mv -f conftest.er1 conftest.err
  fi
  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
  test $ac_status = 0; } && {
	 test -z "$ac_f77_werror_flag" ||
	 test ! -s conftest.err
       } && test -s conftest.$ac_objext; then :
  ac_retval=0
else
  $as_echo "$as_me: failed program was:" >&5
sed 's/^/| /' conftest.$ac_ext >&5

	ac_retval=1
fi
  eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno
  as_fn_set_status $ac_retval

} # ac_fn_f77_try_compile

# ac_fn_c_try_cpp LINENO
# ----------------------
# Try to preprocess conftest.$ac_ext, and return whether this succeeded.
ac_fn_c_try_cpp ()
{
  as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack
  if { { ac_try="$ac_cpp conftest.$ac_ext"
case "(($ac_try" in
  *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;;
  *) ac_try_echo=$ac_try;;
esac
eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\""
$as_echo "$ac_try_echo"; } >&5
  (eval "$ac_cpp conftest.$ac_ext") 2>conftest.err
  ac_status=$?
  if test -s conftest.err; then
    grep -v '^ *+' conftest.err >conftest.er1
    cat conftest.er1 >&5
    mv -f conftest.er1 conftest.err
  fi
  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
  test $ac_status = 0; } > conftest.i && {
	 test -z "$ac_c_preproc_warn_flag$ac_c_werror_flag" ||
	 test ! -s conftest.err
       }; then :
  ac_retval=0
else
  $as_echo "$as_me: failed program was:" >&5
sed 's/^/| /' conftest.$ac_ext >&5

    ac_retval=1
fi
  eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno
  as_fn_set_status $ac_retval

} # ac_fn_c_try_cpp

# ac_fn_c_check_header_mongrel LINENO HEADER VAR INCLUDES
# -------------------------------------------------------
# Tests whether HEADER exists, giving a warning if it cannot be compiled using
# the include files in INCLUDES and setting the cache variable VAR
# accordingly.
ac_fn_c_check_header_mongrel ()
{
  as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack
  if eval \${$3+:} false; then :
  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5
$as_echo_n "checking for $2... " >&6; }
if eval \${$3+:} false; then :
  $as_echo_n "(cached) " >&6
fi
eval ac_res=\$$3
	       { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5
$as_echo "$ac_res" >&6; }
else
  # Is the header compilable?
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking $2 usability" >&5
$as_echo_n "checking $2 usability... " >&6; }
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
$4
#include <$2>
_ACEOF
if ac_fn_c_try_compile "$LINENO"; then :
  ac_header_compiler=yes
else
  ac_header_compiler=no
fi
rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_header_compiler" >&5
$as_echo "$ac_header_compiler" >&6; }

# Is the header present?
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking $2 presence" >&5
$as_echo_n "checking $2 presence... " >&6; }
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
#include <$2>
_ACEOF
if ac_fn_c_try_cpp "$LINENO"; then :
  ac_header_preproc=yes
else
  ac_header_preproc=no
fi
rm -f conftest.err conftest.i conftest.$ac_ext
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_header_preproc" >&5
$as_echo "$ac_header_preproc" >&6; }

# So?  What about this header?
case $ac_header_compiler:$ac_header_preproc:$ac_c_preproc_warn_flag in #((
  yes:no: )
    { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: accepted by the compiler, rejected by the preprocessor!" >&5
$as_echo "$as_me: WARNING: $2: accepted by the compiler, rejected by the preprocessor!" >&2;}
    { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: proceeding with the compiler's result" >&5
$as_echo "$as_me: WARNING: $2: proceeding with the compiler's result" >&2;}
    ;;
  no:yes:* )
    { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: present but cannot be compiled" >&5
$as_echo "$as_me: WARNING: $2: present but cannot be compiled" >&2;}
    { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: check for missing prerequisite headers?" >&5
$as_echo "$as_me: WARNING: $2: check for missing prerequisite headers?" >&2;}
    { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: see the Autoconf documentation" >&5
$as_echo "$as_me: WARNING: $2: see the Autoconf documentation" >&2;}
    { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: section \"Present But Cannot Be Compiled\"" >&5
$as_echo "$as_me: WARNING: $2: section \"Present But Cannot Be Compiled\"" >&2;}
    { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: proceeding with the compiler's result" >&5
$as_echo "$as_me: WARNING: $2: proceeding with the compiler's result" >&2;}
( $as_echo "## ----------------------------------------- ##
## Report this to ptools-perfapi@icl.utk.edu ##
## ----------------------------------------- ##"
     ) | sed "s/^/$as_me: WARNING: /" >&2
    ;;
esac
  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5
$as_echo_n "checking for $2... " >&6; }
if eval \${$3+:} false; then :
  $as_echo_n "(cached) " >&6
else
  eval "$3=\$ac_header_compiler"
fi
eval ac_res=\$$3
	       { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5
$as_echo "$ac_res" >&6; }
fi
  eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno

} # ac_fn_c_check_header_mongrel

# ac_fn_c_try_run LINENO
# ----------------------
# Try to link conftest.$ac_ext, and return whether this succeeded. Assumes
# that executables *can* be run.
ac_fn_c_try_run ()
{
  as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack
  if { { ac_try="$ac_link"
case "(($ac_try" in
  *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;;
  *) ac_try_echo=$ac_try;;
esac
eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\""
$as_echo "$ac_try_echo"; } >&5
  (eval "$ac_link") 2>&5
  ac_status=$?
  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
  test $ac_status = 0; } && { ac_try='./conftest$ac_exeext'
  { { case "(($ac_try" in
  *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;;
  *) ac_try_echo=$ac_try;;
esac
eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\""
$as_echo "$ac_try_echo"; } >&5
  (eval "$ac_try") 2>&5
  ac_status=$?
  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
  test $ac_status = 0; }; }; then :
  ac_retval=0
else
  $as_echo "$as_me: program exited with status $ac_status" >&5
       $as_echo "$as_me: failed program was:" >&5
sed 's/^/| /' conftest.$ac_ext >&5

       ac_retval=$ac_status
fi
  rm -rf conftest.dSYM conftest_ipa8_conftest.oo
  eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno
  as_fn_set_status $ac_retval

} # ac_fn_c_try_run

# ac_fn_c_check_header_compile LINENO HEADER VAR INCLUDES
# -------------------------------------------------------
# Tests whether HEADER exists and can be compiled using the include files in
# INCLUDES, setting the cache variable VAR accordingly.
ac_fn_c_check_header_compile ()
{
  as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack
  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5
$as_echo_n "checking for $2... " >&6; }
if eval \${$3+:} false; then :
  $as_echo_n "(cached) " >&6
else
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
$4
#include <$2>
_ACEOF
if ac_fn_c_try_compile "$LINENO"; then :
  eval "$3=yes"
else
  eval "$3=no"
fi
rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
fi
eval ac_res=\$$3
	       { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5
$as_echo "$ac_res" >&6; }
  eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno

} # ac_fn_c_check_header_compile

# ac_fn_c_try_link LINENO
# -----------------------
# Try to link conftest.$ac_ext, and return whether this succeeded.
ac_fn_c_try_link ()
{
  as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack
  rm -f conftest.$ac_objext conftest$ac_exeext
  if { { ac_try="$ac_link"
case "(($ac_try" in
  *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;;
  *) ac_try_echo=$ac_try;;
esac
eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\""
$as_echo "$ac_try_echo"; } >&5
  (eval "$ac_link") 2>conftest.err
  ac_status=$?
  if test -s conftest.err; then
    grep -v '^ *+' conftest.err >conftest.er1
    cat conftest.er1 >&5
    mv -f conftest.er1 conftest.err
  fi
  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
  test $ac_status = 0; } && {
	 test -z "$ac_c_werror_flag" ||
	 test ! -s conftest.err
       } && test -s conftest$ac_exeext && {
	 test "$cross_compiling" = yes ||
	 test -x conftest$ac_exeext
       }; then :
  ac_retval=0
else
  $as_echo "$as_me: failed program was:" >&5
sed 's/^/| /' conftest.$ac_ext >&5

	ac_retval=1
fi
  # Delete the IPA/IPO (Inter Procedural Analysis/Optimization) information
  # created by the PGI compiler (conftest_ipa8_conftest.oo), as it would
  # interfere with the next link command; also delete a directory that is
  # left behind by Apple's compiler.  We do this before executing the actions.
  rm -rf conftest.dSYM conftest_ipa8_conftest.oo
  eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno
  as_fn_set_status $ac_retval

} # ac_fn_c_try_link

# ac_fn_c_check_func LINENO FUNC VAR
# ----------------------------------
# Tests whether FUNC exists, setting the cache variable VAR accordingly
ac_fn_c_check_func ()
{
  as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack
  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5
$as_echo_n "checking for $2... " >&6; }
if eval \${$3+:} false; then :
  $as_echo_n "(cached) " >&6
else
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
/* Define $2 to an innocuous variant, in case <limits.h> declares $2.
   For example, HP-UX 11i <limits.h> declares gettimeofday.  */
#define $2 innocuous_$2

/* System header to define __stub macros and hopefully few prototypes,
    which can conflict with char $2 (); below.
    Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
    <limits.h> exists even on freestanding compilers.  */

#ifdef __STDC__
# include <limits.h>
#else
# include <assert.h>
#endif

#undef $2

/* Override any GCC internal prototype to avoid an error.
   Use char because int might match the return type of a GCC
   builtin and then its argument prototype would still apply.  */
#ifdef __cplusplus
extern "C"
#endif
char $2 ();
/* The GNU C library defines this for functions which it implements
    to always fail with ENOSYS.  Some functions are actually named
    something starting with __ and the normal name is an alias.  */
#if defined __stub_$2 || defined __stub___$2
choke me
#endif

int
main ()
{
return $2 ();
  ;
  return 0;
}
_ACEOF
if ac_fn_c_try_link "$LINENO"; then :
  eval "$3=yes"
else
  eval "$3=no"
fi
rm -f core conftest.err conftest.$ac_objext \
    conftest$ac_exeext conftest.$ac_ext
fi
eval ac_res=\$$3
	       { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5
$as_echo "$ac_res" >&6; }
  eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno

} # ac_fn_c_check_func

# ac_fn_c_check_type LINENO TYPE VAR INCLUDES
# -------------------------------------------
# Tests whether TYPE exists after having included INCLUDES, setting cache
# variable VAR accordingly.
ac_fn_c_check_type ()
{
  as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack
  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5
$as_echo_n "checking for $2... " >&6; }
if eval \${$3+:} false; then :
  $as_echo_n "(cached) " >&6
else
  eval "$3=no"
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
$4
int
main ()
{
if (sizeof ($2))
	 return 0;
  ;
  return 0;
}
_ACEOF
if ac_fn_c_try_compile "$LINENO"; then :
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.
*/ $4 int main () { if (sizeof (($2))) return 0; ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : else eval "$3=yes" fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi eval ac_res=\$$3 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno } # ac_fn_c_check_type # ac_fn_c_check_member LINENO AGGR MEMBER VAR INCLUDES # ---------------------------------------------------- # Tries to find if the field MEMBER exists in type AGGR, after including # INCLUDES, setting cache variable VAR accordingly. ac_fn_c_check_member () { as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2.$3" >&5 $as_echo_n "checking for $2.$3... " >&6; } if eval \${$4+:} false; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ $5 int main () { static $2 ac_aggr; if (ac_aggr.$3) return 0; ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : eval "$4=yes" else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ $5 int main () { static $2 ac_aggr; if (sizeof ac_aggr.$3) return 0; ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : eval "$4=yes" else eval "$4=no" fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi eval ac_res=\$$4 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno } # ac_fn_c_check_member cat >config.log <<_ACEOF This file contains any messages produced by compilers while running configure, to aid debugging if configure makes a mistake. It was created by PAPI $as_me 7.2.0.0, which was generated by GNU Autoconf 2.69. 
Invocation command line was $ $0 $@ _ACEOF exec 5>>config.log { cat <<_ASUNAME ## --------- ## ## Platform. ## ## --------- ## hostname = `(hostname || uname -n) 2>/dev/null | sed 1q` uname -m = `(uname -m) 2>/dev/null || echo unknown` uname -r = `(uname -r) 2>/dev/null || echo unknown` uname -s = `(uname -s) 2>/dev/null || echo unknown` uname -v = `(uname -v) 2>/dev/null || echo unknown` /usr/bin/uname -p = `(/usr/bin/uname -p) 2>/dev/null || echo unknown` /bin/uname -X = `(/bin/uname -X) 2>/dev/null || echo unknown` /bin/arch = `(/bin/arch) 2>/dev/null || echo unknown` /usr/bin/arch -k = `(/usr/bin/arch -k) 2>/dev/null || echo unknown` /usr/convex/getsysinfo = `(/usr/convex/getsysinfo) 2>/dev/null || echo unknown` /usr/bin/hostinfo = `(/usr/bin/hostinfo) 2>/dev/null || echo unknown` /bin/machine = `(/bin/machine) 2>/dev/null || echo unknown` /usr/bin/oslevel = `(/usr/bin/oslevel) 2>/dev/null || echo unknown` /bin/universe = `(/bin/universe) 2>/dev/null || echo unknown` _ASUNAME as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. $as_echo "PATH: $as_dir" done IFS=$as_save_IFS } >&5 cat >&5 <<_ACEOF ## ----------- ## ## Core tests. ## ## ----------- ## _ACEOF # Keep a trace of the command line. # Strip out --no-create and --no-recursion so they do not pile up. # Strip out --silent because we don't want to record it for future runs. # Also quote any args containing shell meta-characters. # Make two passes to allow for proper duplicate-argument suppression. 
ac_configure_args= ac_configure_args0= ac_configure_args1= ac_must_keep_next=false for ac_pass in 1 2 do for ac_arg do case $ac_arg in -no-create | --no-c* | -n | -no-recursion | --no-r*) continue ;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil) continue ;; *\'*) ac_arg=`$as_echo "$ac_arg" | sed "s/'/'\\\\\\\\''/g"` ;; esac case $ac_pass in 1) as_fn_append ac_configure_args0 " '$ac_arg'" ;; 2) as_fn_append ac_configure_args1 " '$ac_arg'" if test $ac_must_keep_next = true; then ac_must_keep_next=false # Got value, back to normal. else case $ac_arg in *=* | --config-cache | -C | -disable-* | --disable-* \ | -enable-* | --enable-* | -gas | --g* | -nfp | --nf* \ | -q | -quiet | --q* | -silent | --sil* | -v | -verb* \ | -with-* | --with-* | -without-* | --without-* | --x) case "$ac_configure_args0 " in "$ac_configure_args1"*" '$ac_arg' "* ) continue ;; esac ;; -* ) ac_must_keep_next=true ;; esac fi as_fn_append ac_configure_args " '$ac_arg'" ;; esac done done { ac_configure_args0=; unset ac_configure_args0;} { ac_configure_args1=; unset ac_configure_args1;} # When interrupted or exit'd, cleanup temporary files, and complete # config.log. We remove comments because anyway the quotes in there # would cause problems or look ugly. # WARNING: Use '\'' to represent an apostrophe within the trap. # WARNING: Do not start the trap code with a newline, due to a FreeBSD 4.0 bug. trap 'exit_status=$? # Save into config.log some information that might help in debugging. { echo $as_echo "## ---------------- ## ## Cache variables. 
## ## ---------------- ##" echo # The following way of writing the cache mishandles newlines in values, ( for ac_var in `(set) 2>&1 | sed -n '\''s/^\([a-zA-Z_][a-zA-Z0-9_]*\)=.*/\1/p'\''`; do eval ac_val=\$$ac_var case $ac_val in #( *${as_nl}*) case $ac_var in #( *_cv_*) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: cache variable $ac_var contains a newline" >&5 $as_echo "$as_me: WARNING: cache variable $ac_var contains a newline" >&2;} ;; esac case $ac_var in #( _ | IFS | as_nl) ;; #( BASH_ARGV | BASH_SOURCE) eval $ac_var= ;; #( *) { eval $ac_var=; unset $ac_var;} ;; esac ;; esac done (set) 2>&1 | case $as_nl`(ac_space='\'' '\''; set) 2>&1` in #( *${as_nl}ac_space=\ *) sed -n \ "s/'\''/'\''\\\\'\'''\''/g; s/^\\([_$as_cr_alnum]*_cv_[_$as_cr_alnum]*\\)=\\(.*\\)/\\1='\''\\2'\''/p" ;; #( *) sed -n "/^[_$as_cr_alnum]*_cv_[_$as_cr_alnum]*=/p" ;; esac | sort ) echo $as_echo "## ----------------- ## ## Output variables. ## ## ----------------- ##" echo for ac_var in $ac_subst_vars do eval ac_val=\$$ac_var case $ac_val in *\'\''*) ac_val=`$as_echo "$ac_val" | sed "s/'\''/'\''\\\\\\\\'\'''\''/g"`;; esac $as_echo "$ac_var='\''$ac_val'\''" done | sort echo if test -n "$ac_subst_files"; then $as_echo "## ------------------- ## ## File substitutions. ## ## ------------------- ##" echo for ac_var in $ac_subst_files do eval ac_val=\$$ac_var case $ac_val in *\'\''*) ac_val=`$as_echo "$ac_val" | sed "s/'\''/'\''\\\\\\\\'\'''\''/g"`;; esac $as_echo "$ac_var='\''$ac_val'\''" done | sort echo fi if test -s confdefs.h; then $as_echo "## ----------- ## ## confdefs.h. 
## ## ----------- ##" echo cat confdefs.h echo fi test "$ac_signal" != 0 && $as_echo "$as_me: caught signal $ac_signal" $as_echo "$as_me: exit $exit_status" } >&5 rm -f core *.core core.conftest.* && rm -f -r conftest* confdefs* conf$$* $ac_clean_files && exit $exit_status ' 0 for ac_signal in 1 2 13 15; do trap 'ac_signal='$ac_signal'; as_fn_exit 1' $ac_signal done ac_signal=0 # confdefs.h avoids OS command line length limits that DEFS can exceed. rm -f -r conftest* confdefs.h $as_echo "/* confdefs.h */" > confdefs.h # Predefined preprocessor variables. cat >>confdefs.h <<_ACEOF #define PACKAGE_NAME "$PACKAGE_NAME" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_TARNAME "$PACKAGE_TARNAME" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_VERSION "$PACKAGE_VERSION" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_STRING "$PACKAGE_STRING" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_BUGREPORT "$PACKAGE_BUGREPORT" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_URL "$PACKAGE_URL" _ACEOF # Let the site file select an alternate cache file if it wants to. # Prefer an explicitly selected file to automatically selected ones. ac_site_file1=NONE ac_site_file2=NONE if test -n "$CONFIG_SITE"; then # We do not want a PATH search for config.site. case $CONFIG_SITE in #(( -*) ac_site_file1=./$CONFIG_SITE;; */*) ac_site_file1=$CONFIG_SITE;; *) ac_site_file1=./$CONFIG_SITE;; esac elif test "x$prefix" != xNONE; then ac_site_file1=$prefix/share/config.site ac_site_file2=$prefix/etc/config.site else ac_site_file1=$ac_default_prefix/share/config.site ac_site_file2=$ac_default_prefix/etc/config.site fi for ac_site_file in "$ac_site_file1" "$ac_site_file2" do test "x$ac_site_file" = xNONE && continue if test /dev/null != "$ac_site_file" && test -r "$ac_site_file"; then { $as_echo "$as_me:${as_lineno-$LINENO}: loading site script $ac_site_file" >&5 $as_echo "$as_me: loading site script $ac_site_file" >&6;} sed 's/^/| /' "$ac_site_file" >&5 . 
"$ac_site_file" \ || { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "failed to load site script $ac_site_file See \`config.log' for more details" "$LINENO" 5; } fi done if test -r "$cache_file"; then # Some versions of bash will fail to source /dev/null (special files # actually), so we avoid doing that. DJGPP emulates it as a regular file. if test /dev/null != "$cache_file" && test -f "$cache_file"; then { $as_echo "$as_me:${as_lineno-$LINENO}: loading cache $cache_file" >&5 $as_echo "$as_me: loading cache $cache_file" >&6;} case $cache_file in [\\/]* | ?:[\\/]* ) . "$cache_file";; *) . "./$cache_file";; esac fi else { $as_echo "$as_me:${as_lineno-$LINENO}: creating cache $cache_file" >&5 $as_echo "$as_me: creating cache $cache_file" >&6;} >$cache_file fi # Check that the precious variables saved in the cache have kept the same # value. ac_cache_corrupted=false for ac_var in $ac_precious_vars; do eval ac_old_set=\$ac_cv_env_${ac_var}_set eval ac_new_set=\$ac_env_${ac_var}_set eval ac_old_val=\$ac_cv_env_${ac_var}_value eval ac_new_val=\$ac_env_${ac_var}_value case $ac_old_set,$ac_new_set in set,) { $as_echo "$as_me:${as_lineno-$LINENO}: error: \`$ac_var' was set to \`$ac_old_val' in the previous run" >&5 $as_echo "$as_me: error: \`$ac_var' was set to \`$ac_old_val' in the previous run" >&2;} ac_cache_corrupted=: ;; ,set) { $as_echo "$as_me:${as_lineno-$LINENO}: error: \`$ac_var' was not set in the previous run" >&5 $as_echo "$as_me: error: \`$ac_var' was not set in the previous run" >&2;} ac_cache_corrupted=: ;; ,);; *) if test "x$ac_old_val" != "x$ac_new_val"; then # differences in whitespace do not lead to failure. 
ac_old_val_w=`echo x $ac_old_val` ac_new_val_w=`echo x $ac_new_val` if test "$ac_old_val_w" != "$ac_new_val_w"; then { $as_echo "$as_me:${as_lineno-$LINENO}: error: \`$ac_var' has changed since the previous run:" >&5 $as_echo "$as_me: error: \`$ac_var' has changed since the previous run:" >&2;} ac_cache_corrupted=: else { $as_echo "$as_me:${as_lineno-$LINENO}: warning: ignoring whitespace changes in \`$ac_var' since the previous run:" >&5 $as_echo "$as_me: warning: ignoring whitespace changes in \`$ac_var' since the previous run:" >&2;} eval $ac_var=\$ac_old_val fi { $as_echo "$as_me:${as_lineno-$LINENO}: former value: \`$ac_old_val'" >&5 $as_echo "$as_me: former value: \`$ac_old_val'" >&2;} { $as_echo "$as_me:${as_lineno-$LINENO}: current value: \`$ac_new_val'" >&5 $as_echo "$as_me: current value: \`$ac_new_val'" >&2;} fi;; esac # Pass precious variables to config.status. if test "$ac_new_set" = set; then case $ac_new_val in *\'*) ac_arg=$ac_var=`$as_echo "$ac_new_val" | sed "s/'/'\\\\\\\\''/g"` ;; *) ac_arg=$ac_var=$ac_new_val ;; esac case " $ac_configure_args " in *" '$ac_arg' "*) ;; # Avoid dups. Use of quotes ensures accuracy. *) as_fn_append ac_configure_args " '$ac_arg'" ;; esac fi done if $ac_cache_corrupted; then { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} { $as_echo "$as_me:${as_lineno-$LINENO}: error: changes in the environment can compromise the build" >&5 $as_echo "$as_me: error: changes in the environment can compromise the build" >&2;} as_fn_error $? "run \`make distclean' and/or \`rm $cache_file' and start over" "$LINENO" 5 fi ## -------------------- ## ## Main body of script. 
## ## -------------------- ## ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu ac_config_headers="$ac_config_headers config.h" { $as_echo "$as_me:${as_lineno-$LINENO}: checking for architecture" >&5 $as_echo_n "checking for architecture... " >&6; } # Check whether --with-arch was given. if test "${with_arch+set}" = set; then : withval=$with_arch; arch=$withval else arch=`uname -m` fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $arch" >&5 $as_echo "$arch" >&6; } # Check whether --with-bitmode was given. if test "${with_bitmode+set}" = set; then : withval=$with_bitmode; bitmode=$withval fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for OS" >&5 $as_echo_n "checking for OS... " >&6; } # Check whether --with-OS was given. if test "${with_OS+set}" = set; then : withval=$with_OS; OS=$withval else OS="`uname | tr 'A-Z' 'a-z'`" if (test "$OS" = "SunOS" || test "$OS" = "sunos"); then OS=solaris fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $OS" >&5 $as_echo "$OS" >&6; } { $as_echo "$as_me:${as_lineno-$LINENO}: checking for OS version" >&5 $as_echo_n "checking for OS version... " >&6; } # Check whether --with-OSVER was given. if test "${with_OSVER+set}" = set; then : withval=$with_OSVER; OSVER=$withval else if test "$OS" != "bgp" -o "$OS" != "bgq"; then OSVER="`uname -r`" fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $OSVER" >&5 $as_echo "$OSVER" >&6; } CFLAGS="$CFLAGS" { $as_echo "$as_me:${as_lineno-$LINENO}: checking for perf_event workaround level" >&5 $as_echo_n "checking for perf_event workaround level... " >&6; } # Check whether --with-assumed_kernel was given. 
if test "${with_assumed_kernel+set}" = set; then : withval=$with_assumed_kernel; assumed_kernel=$withval; CFLAGS="$CFLAGS -DASSUME_KERNEL=\\\"$with_assumed_kernel\\\"" else assumed_kernel="autodetect" fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $assumed_kernel" >&5 $as_echo "$assumed_kernel" >&6; } { $as_echo "$as_me:${as_lineno-$LINENO}: checking for if MIC should be used" >&5 $as_echo_n "checking for if MIC should be used... " >&6; } # Check whether --with-mic was given. if test "${with_mic+set}" = set; then : withval=$with_mic; MIC=yes tls=__thread virtualtimer=cputime_id perf_events=yes walltimer=clock_realtime_hr ffsll=no cross_compiling=yes arch=k1om else MIC=no fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $MIC" >&5 $as_echo "$MIC" >&6; } { $as_echo "$as_me:${as_lineno-$LINENO}: checking for if NEC should be used" >&5 $as_echo_n "checking for if NEC should be used... " >&6; } # Check whether --with-nec was given. if test "${with_nec+set}" = set; then : withval=$with_nec; NEC=yes tls=__thread cross_compiling=yes ffsll=/opt/nec/ve/lib/libc-2.21.so virtualtimer=cputime_id walltimer=clock_realtime_hr CC=ncc CC_COMMON_NAME=ncc else NEC=no fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $NEC" >&5 $as_echo "$NEC" >&6; } #If not set, set FFLAGS to null to prevent AC_PROG_F77 from defaulting it to -g -O2 if test "x$FFLAGS" = "x"; then FFLAGS="" fi OPTFLAGS="-O2" TOPTFLAGS="-O1" ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu if test -n "$ac_tool_prefix"; then for ac_prog in xlc icc gcc cc do # Extract the first word of "$ac_tool_prefix$ac_prog", so it can be a program name with args. set dummy $ac_tool_prefix$ac_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... 
" >&6; } if ${ac_cv_prog_CC+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_CC="$ac_tool_prefix$ac_prog" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $CC" >&5 $as_echo "$CC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi test -n "$CC" && break done fi if test -z "$CC"; then ac_ct_CC=$CC for ac_prog in xlc icc gcc cc do # Extract the first word of "$ac_prog", so it can be a program name with args. set dummy $ac_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_ac_ct_CC+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$ac_ct_CC"; then ac_cv_prog_ac_ct_CC="$ac_ct_CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_ac_ct_CC="$ac_prog" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi ac_ct_CC=$ac_cv_prog_ac_ct_CC if test -n "$ac_ct_CC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_ct_CC" >&5 $as_echo "$ac_ct_CC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi test -n "$ac_ct_CC" && break done if test "x$ac_ct_CC" = x; then CC="" else case $cross_compiling:$ac_tool_warned in yes:) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not prefixed with host triplet" >&5 $as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;} ac_tool_warned=yes ;; esac CC=$ac_ct_CC fi fi test -z "$CC" && { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "no acceptable C compiler found in \$PATH See \`config.log' for more details" "$LINENO" 5; } # Provide some information about the compiler. $as_echo "$as_me:${as_lineno-$LINENO}: checking for C compiler version" >&5 set X $ac_compile ac_compiler=$2 for ac_option in --version -v -V -qversion; do { { ac_try="$ac_compiler $ac_option >&5" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_compiler $ac_option >&5") 2>conftest.err ac_status=$? if test -s conftest.err; then sed '10a\ ... rest of stderr output deleted ... 10q' conftest.err >conftest.er1 cat conftest.er1 >&5 fi rm -f conftest.er1 conftest.err $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; } done cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. 
*/ int main () { ; return 0; } _ACEOF ac_clean_files_save=$ac_clean_files ac_clean_files="$ac_clean_files a.out a.out.dSYM a.exe b.out" # Try to create an executable without -o first, disregard a.out. # It will help us diagnose broken compilers, and finding out an intuition # of exeext. { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether the C compiler works" >&5 $as_echo_n "checking whether the C compiler works... " >&6; } ac_link_default=`$as_echo "$ac_link" | sed 's/ -o *conftest[^ ]*//'` # The possible output files: ac_files="a.out conftest.exe conftest a.exe a_out.exe b.out conftest.*" ac_rmfiles= for ac_file in $ac_files do case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.map | *.inf | *.dSYM | *.o | *.obj ) ;; * ) ac_rmfiles="$ac_rmfiles $ac_file";; esac done rm -f $ac_rmfiles if { { ac_try="$ac_link_default" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_link_default") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; }; then : # Autoconf-2.13 could set the ac_cv_exeext variable to `no'. # So ignore a value of `no', otherwise this would lead to `EXEEXT = no' # in a Makefile. We should not override ac_cv_exeext if it was cached, # so that the user can short-circuit this test for compilers unknown to # Autoconf. for ac_file in $ac_files '' do test -f "$ac_file" || continue case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.map | *.inf | *.dSYM | *.o | *.obj ) ;; [ab].out ) # We found the default executable, but exeext='' is most # certainly right. 
break;; *.* ) if test "${ac_cv_exeext+set}" = set && test "$ac_cv_exeext" != no; then :; else ac_cv_exeext=`expr "$ac_file" : '[^.]*\(\..*\)'` fi # We set ac_cv_exeext here because the later test for it is not # safe: cross compilers may not add the suffix if given an `-o' # argument, so we may need to know it at that point already. # Even if this section looks crufty: it has the advantage of # actually working. break;; * ) break;; esac done test "$ac_cv_exeext" = no && ac_cv_exeext= else ac_file='' fi if test -z "$ac_file"; then : { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "C compiler cannot create executables See \`config.log' for more details" "$LINENO" 5; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for C compiler default output file name" >&5 $as_echo_n "checking for C compiler default output file name... " >&6; } { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_file" >&5 $as_echo "$ac_file" >&6; } ac_exeext=$ac_cv_exeext rm -f -r a.out a.out.dSYM a.exe conftest$ac_cv_exeext b.out ac_clean_files=$ac_clean_files_save { $as_echo "$as_me:${as_lineno-$LINENO}: checking for suffix of executables" >&5 $as_echo_n "checking for suffix of executables... " >&6; } if { { ac_try="$ac_link" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_link") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; }; then : # If both `conftest.exe' and `conftest' are `present' (well, observable) # catch `conftest.exe'. 
For instance with Cygwin, `ls conftest' will # work properly (i.e., refer to `conftest.exe'), while it won't with # `rm'. for ac_file in conftest.exe conftest conftest.*; do test -f "$ac_file" || continue case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.map | *.inf | *.dSYM | *.o | *.obj ) ;; *.* ) ac_cv_exeext=`expr "$ac_file" : '[^.]*\(\..*\)'` break;; * ) break;; esac done else { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot compute suffix of executables: cannot compile and link See \`config.log' for more details" "$LINENO" 5; } fi rm -f conftest conftest$ac_cv_exeext { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_exeext" >&5 $as_echo "$ac_cv_exeext" >&6; } rm -f conftest.$ac_ext EXEEXT=$ac_cv_exeext ac_exeext=$EXEEXT cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include int main () { FILE *f = fopen ("conftest.out", "w"); return ferror (f) || fclose (f) != 0; ; return 0; } _ACEOF ac_clean_files="$ac_clean_files conftest.out" # Check that the compiler produces executables we can run. If not, either # the compiler is broken, or we cross compile. { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether we are cross compiling" >&5 $as_echo_n "checking whether we are cross compiling... " >&6; } if test "$cross_compiling" != yes; then { { ac_try="$ac_link" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_link") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? 
= $ac_status" >&5 test $ac_status = 0; } if { ac_try='./conftest$ac_cv_exeext' { { case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_try") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; }; }; then cross_compiling=no else if test "$cross_compiling" = maybe; then cross_compiling=yes else { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot run C compiled programs. If you meant to cross compile, use \`--host'. See \`config.log' for more details" "$LINENO" 5; } fi fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $cross_compiling" >&5 $as_echo "$cross_compiling" >&6; } rm -f conftest.$ac_ext conftest$ac_cv_exeext conftest.out ac_clean_files=$ac_clean_files_save { $as_echo "$as_me:${as_lineno-$LINENO}: checking for suffix of object files" >&5 $as_echo_n "checking for suffix of object files... " >&6; } if ${ac_cv_objext+:} false; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ int main () { ; return 0; } _ACEOF rm -f conftest.o conftest.obj if { { ac_try="$ac_compile" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_compile") 2>&5 ac_status=$? $as_echo "$as_me:${as_lineno-$LINENO}: \$? 
= $ac_status" >&5 test $ac_status = 0; }; then : for ac_file in conftest.o conftest.obj conftest.*; do test -f "$ac_file" || continue; case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.map | *.inf | *.dSYM ) ;; *) ac_cv_objext=`expr "$ac_file" : '.*\.\(.*\)'` break;; esac done else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot compute suffix of object files: cannot compile See \`config.log' for more details" "$LINENO" 5; } fi rm -f conftest.$ac_cv_objext conftest.$ac_ext fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_objext" >&5 $as_echo "$ac_cv_objext" >&6; } OBJEXT=$ac_cv_objext ac_objext=$OBJEXT { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether we are using the GNU C compiler" >&5 $as_echo_n "checking whether we are using the GNU C compiler... " >&6; } if ${ac_cv_c_compiler_gnu+:} false; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ int main () { #ifndef __GNUC__ choke me #endif ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_compiler_gnu=yes else ac_compiler_gnu=no fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext ac_cv_c_compiler_gnu=$ac_compiler_gnu fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_c_compiler_gnu" >&5 $as_echo "$ac_cv_c_compiler_gnu" >&6; } if test $ac_compiler_gnu = yes; then GCC=yes else GCC= fi ac_test_CFLAGS=${CFLAGS+set} ac_save_CFLAGS=$CFLAGS { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC accepts -g" >&5 $as_echo_n "checking whether $CC accepts -g... " >&6; } if ${ac_cv_prog_cc_g+:} false; then : $as_echo_n "(cached) " >&6 else ac_save_c_werror_flag=$ac_c_werror_flag ac_c_werror_flag=yes ac_cv_prog_cc_g=no CFLAGS="-g" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. 
*/ int main () { ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_cv_prog_cc_g=yes else CFLAGS="" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ int main () { ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : else ac_c_werror_flag=$ac_save_c_werror_flag CFLAGS="-g" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ int main () { ; return 0; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : ac_cv_prog_cc_g=yes fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext ac_c_werror_flag=$ac_save_c_werror_flag fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_prog_cc_g" >&5 $as_echo "$ac_cv_prog_cc_g" >&6; } if test "$ac_test_CFLAGS" = set; then CFLAGS=$ac_save_CFLAGS elif test $ac_cv_prog_cc_g = yes; then if test "$GCC" = yes; then CFLAGS="-g -O2" else CFLAGS="-g" fi else if test "$GCC" = yes; then CFLAGS="-O2" else CFLAGS= fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $CC option to accept ISO C89" >&5 $as_echo_n "checking for $CC option to accept ISO C89... " >&6; } if ${ac_cv_prog_cc_c89+:} false; then : $as_echo_n "(cached) " >&6 else ac_cv_prog_cc_c89=no ac_save_CC=$CC cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include #include struct stat; /* Most of the following tests are stolen from RCS 5.7's src/conf.sh. */ struct buf { int x; }; FILE * (*rcsopen) (struct buf *, struct stat *, int); static char *e (p, i) char **p; int i; { return p[i]; } static char *f (char * (*g) (char **, int), char **p, ...) { char *s; va_list v; va_start (v,p); s = g (p, va_arg (v,int)); va_end (v); return s; } /* OSF 4.0 Compaq cc is some sort of almost-ANSI by default. It has function prototypes and stuff, but not '\xHH' hex character constants. These don't provoke an error unfortunately, instead are silently treated as 'x'. 
The following induces an error, until -std is added to get proper ANSI mode. Curiously '\x00'!='x' always comes out true, for an array size at least. It's necessary to write '\x00'==0 to get something that's true only with -std. */ int osf4_cc_array ['\x00' == 0 ? 1 : -1]; /* IBM C 6 for AIX is almost-ANSI by default, but it replaces macro parameters inside strings and character constants. */ #define FOO(x) 'x' int xlc6_cc_array[FOO(a) == 'x' ? 1 : -1]; int test (int i, double x); struct s1 {int (*f) (int a);}; struct s2 {int (*f) (double a);}; int pairnames (int, char **, FILE *(*)(struct buf *, struct stat *, int), int, int); int argc; char **argv; int main () { return f (e, argv, 0) != argv[0] || f (e, argv, 1) != argv[1]; ; return 0; } _ACEOF for ac_arg in '' -qlanglvl=extc89 -qlanglvl=ansi -std \ -Ae "-Aa -D_HPUX_SOURCE" "-Xc -D__EXTENSIONS__" do CC="$ac_save_CC $ac_arg" if ac_fn_c_try_compile "$LINENO"; then : ac_cv_prog_cc_c89=$ac_arg fi rm -f core conftest.err conftest.$ac_objext test "x$ac_cv_prog_cc_c89" != "xno" && break done rm -f conftest.$ac_ext CC=$ac_save_CC fi # AC_CACHE_VAL case "x$ac_cv_prog_cc_c89" in x) { $as_echo "$as_me:${as_lineno-$LINENO}: result: none needed" >&5 $as_echo "none needed" >&6; } ;; xno) { $as_echo "$as_me:${as_lineno-$LINENO}: result: unsupported" >&5 $as_echo "unsupported" >&6; } ;; *) CC="$CC $ac_cv_prog_cc_c89" { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_prog_cc_c89" >&5 $as_echo "$ac_cv_prog_cc_c89" >&6; } ;; esac if test "x$ac_cv_prog_cc_c89" != xno; then : fi ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu ac_ext=f ac_compile='$F77 -c $FFLAGS conftest.$ac_ext >&5' ac_link='$F77 -o conftest$ac_exeext $FFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_f77_compiler_gnu if test -n "$ac_tool_prefix"; then for ac_prog in xlf 
ifort gfortran f95 f90 f77 do # Extract the first word of "$ac_tool_prefix$ac_prog", so it can be a program name with args. set dummy $ac_tool_prefix$ac_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_F77+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$F77"; then ac_cv_prog_F77="$F77" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_F77="$ac_tool_prefix$ac_prog" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi F77=$ac_cv_prog_F77 if test -n "$F77"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $F77" >&5 $as_echo "$F77" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi test -n "$F77" && break done fi if test -z "$F77"; then ac_ct_F77=$F77 for ac_prog in xlf ifort gfortran f95 f90 f77 do # Extract the first word of "$ac_prog", so it can be a program name with args. set dummy $ac_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_ac_ct_F77+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$ac_ct_F77"; then ac_cv_prog_ac_ct_F77="$ac_ct_F77" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_ac_ct_F77="$ac_prog" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi ac_ct_F77=$ac_cv_prog_ac_ct_F77 if test -n "$ac_ct_F77"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_ct_F77" >&5 $as_echo "$ac_ct_F77" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi test -n "$ac_ct_F77" && break done if test "x$ac_ct_F77" = x; then F77="" else case $cross_compiling:$ac_tool_warned in yes:) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not prefixed with host triplet" >&5 $as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;} ac_tool_warned=yes ;; esac F77=$ac_ct_F77 fi fi # Provide some information about the compiler. $as_echo "$as_me:${as_lineno-$LINENO}: checking for Fortran 77 compiler version" >&5 set X $ac_compile ac_compiler=$2 for ac_option in --version -v -V -qversion; do { { ac_try="$ac_compiler $ac_option >&5" case "(($ac_try" in *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;; *) ac_try_echo=$ac_try;; esac eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\"" $as_echo "$ac_try_echo"; } >&5 (eval "$ac_compiler $ac_option >&5") 2>conftest.err ac_status=$? if test -s conftest.err; then sed '10a\ ... rest of stderr output deleted ... 10q' conftest.err >conftest.er1 cat conftest.er1 >&5 fi rm -f conftest.er1 conftest.err $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 test $ac_status = 0; } done rm -f a.out # If we don't use `.F' as extension, the preprocessor is not run on the # input file. (Note that this only needs to work for GNU compilers.) ac_save_ext=$ac_ext ac_ext=F { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether we are using the GNU Fortran 77 compiler" >&5 $as_echo_n "checking whether we are using the GNU Fortran 77 compiler... 
" >&6; } if ${ac_cv_f77_compiler_gnu+:} false; then : $as_echo_n "(cached) " >&6 else cat > conftest.$ac_ext <<_ACEOF program main #ifndef __GNUC__ choke me #endif end _ACEOF if ac_fn_f77_try_compile "$LINENO"; then : ac_compiler_gnu=yes else ac_compiler_gnu=no fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext ac_cv_f77_compiler_gnu=$ac_compiler_gnu fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_f77_compiler_gnu" >&5 $as_echo "$ac_cv_f77_compiler_gnu" >&6; } ac_ext=$ac_save_ext ac_test_FFLAGS=${FFLAGS+set} ac_save_FFLAGS=$FFLAGS FFLAGS= { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $F77 accepts -g" >&5 $as_echo_n "checking whether $F77 accepts -g... " >&6; } if ${ac_cv_prog_f77_g+:} false; then : $as_echo_n "(cached) " >&6 else FFLAGS=-g cat > conftest.$ac_ext <<_ACEOF program main end _ACEOF if ac_fn_f77_try_compile "$LINENO"; then : ac_cv_prog_f77_g=yes else ac_cv_prog_f77_g=no fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_prog_f77_g" >&5 $as_echo "$ac_cv_prog_f77_g" >&6; } if test "$ac_test_FFLAGS" = set; then FFLAGS=$ac_save_FFLAGS elif test $ac_cv_prog_f77_g = yes; then if test "x$ac_cv_f77_compiler_gnu" = xyes; then FFLAGS="-g -O2" else FFLAGS="-g" fi else if test "x$ac_cv_f77_compiler_gnu" = xyes; then FFLAGS="-O2" else FFLAGS= fi fi if test $ac_compiler_gnu = yes; then G77=yes else G77= fi ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu if test "x$F77" = "x"; then F77= fi # Extract the first word of "mpicc", so it can be a program name with args. set dummy mpicc; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... 
" >&6; } if ${ac_cv_prog_MPICC+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$MPICC"; then ac_cv_prog_MPICC="$MPICC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_MPICC="mpicc" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi MPICC=$ac_cv_prog_MPICC if test -n "$MPICC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $MPICC" >&5 $as_echo "$MPICC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi # Lets figure out what CC actually is... # Used in later checks to set compiler specific options if `$CC --version 2>&1 | grep '^ncc (NCC)' >/dev/null 2>&1` ; then CC_COMMON_NAME="ncc" elif `$CC -V 2>&1 | grep '^Intel(R) C' >/dev/null 2>&1` ; then CC_COMMON_NAME="icc" if test "$MPICC" = "mpicc"; then # Check if mpiicc is available # Extract the first word of "mpiicc", so it can be a program name with args. set dummy mpiicc; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_MPIICC+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$MPIICC"; then ac_cv_prog_MPIICC="$MPIICC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_MPIICC="mpiicc" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi MPIICC=$ac_cv_prog_MPIICC if test -n "$MPIICC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $MPIICC" >&5 $as_echo "$MPIICC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi if test "x$MPIICC" = "xmpiicc"; then MPICC=mpiicc fi MPI_COMPILER_CHECK=`$MPICC -V 2>&1 | grep '^Intel(R) C'` if test "x$MPI_COMPILER_CHECK" = "x"; then { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $MPICC is using a different compiler than $CC. MPI tests disabled." >&5 $as_echo "$as_me: WARNING: $MPICC is using a different compiler than $CC. MPI tests disabled." >&2;} NO_MPI_TESTS=yes fi fi elif `$CC -v 2>&1 | grep 'gcc version' >/dev/null 2>&1` ; then CC_COMMON_NAME="gcc" if test "$MPICC" = "mpicc"; then MPI_COMPILER_CHECK=`$MPICC -v 2>&1 | grep 'gcc version'` if test "x$MPI_COMPILER_CHECK" = "x"; then { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $MPICC is using a different compiler than $CC. MPI tests disabled." >&5 $as_echo "$as_me: WARNING: $MPICC is using a different compiler than $CC. MPI tests disabled." >&2;} NO_MPI_TESTS=yes fi fi elif `$CC -qversion 2>&1 | grep 'IBM XL C' >/dev/null 2>&1`; then CC_COMMON_NAME="xlc" if test "$MPICC" = "mpicc"; then MPI_COMPILER_CHECK=`$MPICC -qversion 2>&1 | grep 'IBM XL C'` if test "x$MPI_COMPILER_CHECK" = "x"; then { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $MPICC is using a different compiler than $CC. MPI tests disabled." >&5 $as_echo "$as_me: WARNING: $MPICC is using a different compiler than $CC. MPI tests disabled." 
>&2;} NO_MPI_TESTS=yes fi fi else CC_COMMON_NAME="unknown" fi #prevent icc warnings about overriding optimization settings set by AC_PROG_CC # remark #869: parameter was never referenced # remark #271: trailing comma is nonstandard if test "$CC_COMMON_NAME" = "icc"; then CFLAGS="$CFLAGS -diag-disable 188,869,271" if test "$MIC" = "yes"; then CC="$CC -mmic -fPIC" fi fi if test "$F77" = "ifort" -a "$MIC" = "yes"; then F77="$F77 -mmic -fPIC" fi for ac_prog in gawk mawk nawk awk do # Extract the first word of "$ac_prog", so it can be a program name with args. set dummy $ac_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_AWK+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$AWK"; then ac_cv_prog_AWK="$AWK" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_AWK="$ac_prog" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi AWK=$ac_cv_prog_AWK if test -n "$AWK"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $AWK" >&5 $as_echo "$AWK" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi test -n "$AWK" && break done ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu { $as_echo "$as_me:${as_lineno-$LINENO}: checking how to run the C preprocessor" >&5 $as_echo_n "checking how to run the C preprocessor... " >&6; } # On Suns, sometimes $CPP names a directory. 
if test -n "$CPP" && test -d "$CPP"; then
  CPP=
fi
if test -z "$CPP"; then
  if ${ac_cv_prog_CPP+:} false; then :
  $as_echo_n "(cached) " >&6
else
      # Double quotes because CPP needs to be expanded
    for CPP in "$CC -E" "$CC -E -traditional-cpp" "/lib/cpp"
    do
      ac_preproc_ok=false
for ac_c_preproc_warn_flag in '' yes
do
  # Use a header file that comes with gcc, so configuring glibc
  # with a fresh cross-compiler works.
  # Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
  # <limits.h> exists even on freestanding compilers.
  # On the NeXT, cc -E runs the code through the compiler's parser,
  # not just through cpp. "Syntax error" is here to catch this case.
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
#ifdef __STDC__
# include <limits.h>
#else
# include <assert.h>
#endif
		     Syntax error
_ACEOF
if ac_fn_c_try_cpp "$LINENO"; then :

else
  # Broken: fails on valid input.
continue
fi
rm -f conftest.err conftest.i conftest.$ac_ext

  # OK, works on sane cases.  Now check whether nonexistent headers
  # can be detected and how.
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
#include <ac_nonexistent.h>
_ACEOF
if ac_fn_c_try_cpp "$LINENO"; then :
  # Broken: success on invalid input.
continue
else
  # Passes both tests.
ac_preproc_ok=:
break
fi
rm -f conftest.err conftest.i conftest.$ac_ext

done
# Because of `break', _AC_PREPROC_IFELSE's cleaning code was skipped.
rm -f conftest.i conftest.err conftest.$ac_ext
if $ac_preproc_ok; then :
  break
fi

    done
    ac_cv_prog_CPP=$CPP

fi
  CPP=$ac_cv_prog_CPP
else
  ac_cv_prog_CPP=$CPP
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $CPP" >&5
$as_echo "$CPP" >&6; }
ac_preproc_ok=false
for ac_c_preproc_warn_flag in '' yes
do
  # Use a header file that comes with gcc, so configuring glibc
  # with a fresh cross-compiler works.
  # Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
  # <limits.h> exists even on freestanding compilers.
  # On the NeXT, cc -E runs the code through the compiler's parser,
  # not just through cpp. "Syntax error" is here to catch this case.
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
#ifdef __STDC__
# include <limits.h>
#else
# include <assert.h>
#endif
		     Syntax error
_ACEOF
if ac_fn_c_try_cpp "$LINENO"; then :

else
  # Broken: fails on valid input.
continue
fi
rm -f conftest.err conftest.i conftest.$ac_ext

  # OK, works on sane cases.  Now check whether nonexistent headers
  # can be detected and how.
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
#include <ac_nonexistent.h>
_ACEOF
if ac_fn_c_try_cpp "$LINENO"; then :
  # Broken: success on invalid input.
continue
else
  # Passes both tests.
ac_preproc_ok=:
break
fi
rm -f conftest.err conftest.i conftest.$ac_ext

done
# Because of `break', _AC_PREPROC_IFELSE's cleaning code was skipped.
rm -f conftest.i conftest.err conftest.$ac_ext
if $ac_preproc_ok; then :

else
  { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
as_fn_error $? "C preprocessor \"$CPP\" fails sanity check
See \`config.log' for more details" "$LINENO" 5; }
fi

ac_ext=c
ac_cpp='$CPP $CPPFLAGS'
ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5'
ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5'
ac_compiler_gnu=$ac_cv_c_compiler_gnu

{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ln -s works" >&5
$as_echo_n "checking whether ln -s works... " >&6; }
LN_S=$as_ln_s
if test "$LN_S" = "ln -s"; then
  { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
$as_echo "yes" >&6; }
else
  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no, using $LN_S" >&5
$as_echo "no, using $LN_S" >&6; }
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${MAKE-make} sets \$(MAKE)" >&5
$as_echo_n "checking whether ${MAKE-make} sets \$(MAKE)...
" >&6; } set x ${MAKE-make} ac_make=`$as_echo "$2" | sed 's/+/p/g; s/[^a-zA-Z0-9_]/_/g'` if eval \${ac_cv_prog_make_${ac_make}_set+:} false; then : $as_echo_n "(cached) " >&6 else cat >conftest.make <<\_ACEOF SHELL = /bin/sh all: @echo '@@@%%%=$(MAKE)=@@@%%%' _ACEOF # GNU make sometimes prints "make[1]: Entering ...", which would confuse us. case `${MAKE-make} -f conftest.make 2>/dev/null` in *@@@%%%=?*=@@@%%%*) eval ac_cv_prog_make_${ac_make}_set=yes;; *) eval ac_cv_prog_make_${ac_make}_set=no;; esac rm -f conftest.make fi if eval test \$ac_cv_prog_make_${ac_make}_set = yes; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } SET_MAKE= else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } SET_MAKE="MAKE=${MAKE-make}" fi if test -n "$ac_tool_prefix"; then # Extract the first word of "${ac_tool_prefix}ranlib", so it can be a program name with args. set dummy ${ac_tool_prefix}ranlib; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_RANLIB+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$RANLIB"; then ac_cv_prog_RANLIB="$RANLIB" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_RANLIB="${ac_tool_prefix}ranlib" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi RANLIB=$ac_cv_prog_RANLIB if test -n "$RANLIB"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $RANLIB" >&5 $as_echo "$RANLIB" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi fi if test -z "$ac_cv_prog_RANLIB"; then ac_ct_RANLIB=$RANLIB # Extract the first word of "ranlib", so it can be a program name with args. 
set dummy ranlib; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_ac_ct_RANLIB+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$ac_ct_RANLIB"; then ac_cv_prog_ac_ct_RANLIB="$ac_ct_RANLIB" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_ac_ct_RANLIB="ranlib" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi ac_ct_RANLIB=$ac_cv_prog_ac_ct_RANLIB if test -n "$ac_ct_RANLIB"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_ct_RANLIB" >&5 $as_echo "$ac_ct_RANLIB" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi if test "x$ac_ct_RANLIB" = x; then RANLIB=":" else case $cross_compiling:$ac_tool_warned in yes:) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: using cross tools not prefixed with host triplet" >&5 $as_echo "$as_me: WARNING: using cross tools not prefixed with host triplet" >&2;} ac_tool_warned=yes ;; esac RANLIB=$ac_ct_RANLIB fi else RANLIB="$ac_cv_prog_RANLIB" fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for grep that handles long lines and -e" >&5 $as_echo_n "checking for grep that handles long lines and -e... " >&6; } if ${ac_cv_path_GREP+:} false; then : $as_echo_n "(cached) " >&6 else if test -z "$GREP"; then ac_path_GREP_found=false # Loop through the user's path and test for each of PROGNAME-LIST as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH$PATH_SEPARATOR/usr/xpg4/bin do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_prog in grep ggrep; do for ac_exec_ext in '' $ac_executable_extensions; do ac_path_GREP="$as_dir/$ac_prog$ac_exec_ext" as_fn_executable_p "$ac_path_GREP" || continue # Check for GNU ac_path_GREP and select it if it is found. # Check for GNU $ac_path_GREP case `"$ac_path_GREP" --version 2>&1` in *GNU*) ac_cv_path_GREP="$ac_path_GREP" ac_path_GREP_found=:;; *) ac_count=0 $as_echo_n 0123456789 >"conftest.in" while : do cat "conftest.in" "conftest.in" >"conftest.tmp" mv "conftest.tmp" "conftest.in" cp "conftest.in" "conftest.nl" $as_echo 'GREP' >> "conftest.nl" "$ac_path_GREP" -e 'GREP$' -e '-(cannot match)-' < "conftest.nl" >"conftest.out" 2>/dev/null || break diff "conftest.out" "conftest.nl" >/dev/null 2>&1 || break as_fn_arith $ac_count + 1 && ac_count=$as_val if test $ac_count -gt ${ac_path_GREP_max-0}; then # Best one so far, save it but keep looking for a better one ac_cv_path_GREP="$ac_path_GREP" ac_path_GREP_max=$ac_count fi # 10*(2^10) chars as input seems more than enough test $ac_count -gt 10 && break done rm -f conftest.in conftest.tmp conftest.nl conftest.out;; esac $ac_path_GREP_found && break 3 done done done IFS=$as_save_IFS if test -z "$ac_cv_path_GREP"; then as_fn_error $? "no acceptable grep could be found in $PATH$PATH_SEPARATOR/usr/xpg4/bin" "$LINENO" 5 fi else ac_cv_path_GREP=$GREP fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_path_GREP" >&5 $as_echo "$ac_cv_path_GREP" >&6; } GREP="$ac_cv_path_GREP" { $as_echo "$as_me:${as_lineno-$LINENO}: checking for egrep" >&5 $as_echo_n "checking for egrep... " >&6; } if ${ac_cv_path_EGREP+:} false; then : $as_echo_n "(cached) " >&6 else if echo a | $GREP -E '(a|b)' >/dev/null 2>&1 then ac_cv_path_EGREP="$GREP -E" else if test -z "$EGREP"; then ac_path_EGREP_found=false # Loop through the user's path and test for each of PROGNAME-LIST as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH$PATH_SEPARATOR/usr/xpg4/bin do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_prog in egrep; do for ac_exec_ext in '' $ac_executable_extensions; do ac_path_EGREP="$as_dir/$ac_prog$ac_exec_ext" as_fn_executable_p "$ac_path_EGREP" || continue # Check for GNU ac_path_EGREP and select it if it is found. # Check for GNU $ac_path_EGREP case `"$ac_path_EGREP" --version 2>&1` in *GNU*) ac_cv_path_EGREP="$ac_path_EGREP" ac_path_EGREP_found=:;; *) ac_count=0 $as_echo_n 0123456789 >"conftest.in" while : do cat "conftest.in" "conftest.in" >"conftest.tmp" mv "conftest.tmp" "conftest.in" cp "conftest.in" "conftest.nl" $as_echo 'EGREP' >> "conftest.nl" "$ac_path_EGREP" 'EGREP$' < "conftest.nl" >"conftest.out" 2>/dev/null || break diff "conftest.out" "conftest.nl" >/dev/null 2>&1 || break as_fn_arith $ac_count + 1 && ac_count=$as_val if test $ac_count -gt ${ac_path_EGREP_max-0}; then # Best one so far, save it but keep looking for a better one ac_cv_path_EGREP="$ac_path_EGREP" ac_path_EGREP_max=$ac_count fi # 10*(2^10) chars as input seems more than enough test $ac_count -gt 10 && break done rm -f conftest.in conftest.tmp conftest.nl conftest.out;; esac $ac_path_EGREP_found && break 3 done done done IFS=$as_save_IFS if test -z "$ac_cv_path_EGREP"; then as_fn_error $? "no acceptable egrep could be found in $PATH$PATH_SEPARATOR/usr/xpg4/bin" "$LINENO" 5 fi else ac_cv_path_EGREP=$EGREP fi fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_path_EGREP" >&5 $as_echo "$ac_cv_path_EGREP" >&6; } EGREP="$ac_cv_path_EGREP" { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ANSI C header files" >&5 $as_echo_n "checking for ANSI C header files... " >&6; } if ${ac_cv_header_stdc+:} false; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. 
*/
#include <stdlib.h>
#include <stdarg.h>
#include <string.h>
#include <float.h>

int
main ()
{

  ;
  return 0;
}
_ACEOF
if ac_fn_c_try_compile "$LINENO"; then :
  ac_cv_header_stdc=yes
else
  ac_cv_header_stdc=no
fi
rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext

if test $ac_cv_header_stdc = yes; then
  # SunOS 4.x string.h does not declare mem*, contrary to ANSI.
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
#include <string.h>

_ACEOF
if (eval "$ac_cpp conftest.$ac_ext") 2>&5 |
  $EGREP "memchr" >/dev/null 2>&1; then :

else
  ac_cv_header_stdc=no
fi
rm -f conftest*

fi

if test $ac_cv_header_stdc = yes; then
  # ISC 2.0.2 stdlib.h does not declare free, contrary to ANSI.
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
#include <stdlib.h>

_ACEOF
if (eval "$ac_cpp conftest.$ac_ext") 2>&5 |
  $EGREP "free" >/dev/null 2>&1; then :

else
  ac_cv_header_stdc=no
fi
rm -f conftest*

fi

if test $ac_cv_header_stdc = yes; then
  # /bin/cc in Irix-4.0.5 gets non-ANSI ctype macros unless using -ansi.
  if test "$cross_compiling" = yes; then :
  :
else
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
#include <ctype.h>
#include <stdlib.h>
#if ((' ' & 0x0FF) == 0x020)
# define ISLOWER(c) ('a' <= (c) && (c) <= 'z')
# define TOUPPER(c) (ISLOWER(c) ? 'A' + ((c) - 'a') : (c))
#else
# define ISLOWER(c) \
		   (('a' <= (c) && (c) <= 'i') \
		     || ('j' <= (c) && (c) <= 'r') \
		     || ('s' <= (c) && (c) <= 'z'))
# define TOUPPER(c) (ISLOWER(c) ?
((c) | 0x40) : (c)) #endif #define XOR(e, f) (((e) && !(f)) || (!(e) && (f))) int main () { int i; for (i = 0; i < 256; i++) if (XOR (islower (i), ISLOWER (i)) || toupper (i) != TOUPPER (i)) return 2; return 0; } _ACEOF if ac_fn_c_try_run "$LINENO"; then : else ac_cv_header_stdc=no fi rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \ conftest.$ac_objext conftest.beam conftest.$ac_ext fi fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_header_stdc" >&5 $as_echo "$ac_cv_header_stdc" >&6; } if test $ac_cv_header_stdc = yes; then $as_echo "#define STDC_HEADERS 1" >>confdefs.h fi # On IRIX 5.3, sys/types and inttypes.h are conflicting. for ac_header in sys/types.h sys/stat.h stdlib.h string.h memory.h strings.h \ inttypes.h stdint.h unistd.h do : as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh` ac_fn_c_check_header_compile "$LINENO" "$ac_header" "$as_ac_Header" "$ac_includes_default " if eval test \"x\$"$as_ac_Header"\" = x"yes"; then : cat >>confdefs.h <<_ACEOF #define `$as_echo "HAVE_$ac_header" | $as_tr_cpp` 1 _ACEOF fi done ac_fn_c_check_header_mongrel "$LINENO" "minix/config.h" "ac_cv_header_minix_config_h" "$ac_includes_default" if test "x$ac_cv_header_minix_config_h" = xyes; then : MINIX=yes else MINIX= fi if test "$MINIX" = yes; then $as_echo "#define _POSIX_SOURCE 1" >>confdefs.h $as_echo "#define _POSIX_1_SOURCE 2" >>confdefs.h $as_echo "#define _MINIX 1" >>confdefs.h fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether it is safe to define __EXTENSIONS__" >&5 $as_echo_n "checking whether it is safe to define __EXTENSIONS__... " >&6; } if ${ac_cv_safe_to_define___extensions__+:} false; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. 
*/
#	  define __EXTENSIONS__ 1
	  $ac_includes_default
int
main ()
{

  ;
  return 0;
}
_ACEOF
if ac_fn_c_try_compile "$LINENO"; then :
  ac_cv_safe_to_define___extensions__=yes
else
  ac_cv_safe_to_define___extensions__=no
fi
rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_safe_to_define___extensions__" >&5
$as_echo "$ac_cv_safe_to_define___extensions__" >&6; }
  test $ac_cv_safe_to_define___extensions__ = yes &&
    $as_echo "#define __EXTENSIONS__ 1" >>confdefs.h

  $as_echo "#define _ALL_SOURCE 1" >>confdefs.h

  $as_echo "#define _GNU_SOURCE 1" >>confdefs.h

  $as_echo "#define _POSIX_PTHREAD_SEMANTICS 1" >>confdefs.h

  $as_echo "#define _TANDEM_SOURCE 1" >>confdefs.h

{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for ANSI C header files" >&5
$as_echo_n "checking for ANSI C header files... " >&6; }
if ${ac_cv_header_stdc+:} false; then :
  $as_echo_n "(cached) " >&6
else
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
#include <stdlib.h>
#include <stdarg.h>
#include <string.h>
#include <float.h>

int
main ()
{

  ;
  return 0;
}
_ACEOF
if ac_fn_c_try_compile "$LINENO"; then :
  ac_cv_header_stdc=yes
else
  ac_cv_header_stdc=no
fi
rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext

if test $ac_cv_header_stdc = yes; then
  # SunOS 4.x string.h does not declare mem*, contrary to ANSI.
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
#include <string.h>

_ACEOF
if (eval "$ac_cpp conftest.$ac_ext") 2>&5 |
  $EGREP "memchr" >/dev/null 2>&1; then :

else
  ac_cv_header_stdc=no
fi
rm -f conftest*

fi

if test $ac_cv_header_stdc = yes; then
  # ISC 2.0.2 stdlib.h does not declare free, contrary to ANSI.
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
#include <stdlib.h>

_ACEOF
if (eval "$ac_cpp conftest.$ac_ext") 2>&5 |
  $EGREP "free" >/dev/null 2>&1; then :

else
  ac_cv_header_stdc=no
fi
rm -f conftest*

fi

if test $ac_cv_header_stdc = yes; then
  # /bin/cc in Irix-4.0.5 gets non-ANSI ctype macros unless using -ansi.
  if test "$cross_compiling" = yes; then :
  :
else
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
#include <ctype.h>
#include <stdlib.h>
#if ((' ' & 0x0FF) == 0x020)
# define ISLOWER(c) ('a' <= (c) && (c) <= 'z')
# define TOUPPER(c) (ISLOWER(c) ? 'A' + ((c) - 'a') : (c))
#else
# define ISLOWER(c) \
		   (('a' <= (c) && (c) <= 'i') \
		     || ('j' <= (c) && (c) <= 'r') \
		     || ('s' <= (c) && (c) <= 'z'))
# define TOUPPER(c) (ISLOWER(c) ? ((c) | 0x40) : (c))
#endif

#define XOR(e, f) (((e) && !(f)) || (!(e) && (f)))
int
main ()
{
  int i;
  for (i = 0; i < 256; i++)
    if (XOR (islower (i), ISLOWER (i))
	|| toupper (i) != TOUPPER (i))
      return 2;
  return 0;
}
_ACEOF
if ac_fn_c_try_run "$LINENO"; then :

else
  ac_cv_header_stdc=no
fi
rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \
  conftest.$ac_objext conftest.beam conftest.$ac_ext
fi

fi
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_header_stdc" >&5
$as_echo "$ac_cv_header_stdc" >&6; }
if test $ac_cv_header_stdc = yes; then

$as_echo "#define STDC_HEADERS 1" >>confdefs.h

fi

{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for inline" >&5
$as_echo_n "checking for inline... " >&6; }
if ${ac_cv_c_inline+:} false; then :
  $as_echo_n "(cached) " >&6
else
  ac_cv_c_inline=no
for ac_kw in inline __inline__ __inline; do
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.
*/
#ifndef __cplusplus
typedef int foo_t;
static $ac_kw foo_t static_foo () {return 0; }
$ac_kw foo_t foo () {return 0; }
#endif

_ACEOF
if ac_fn_c_try_compile "$LINENO"; then :
  ac_cv_c_inline=$ac_kw
fi
rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
  test "$ac_cv_c_inline" != no && break
done

fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_c_inline" >&5
$as_echo "$ac_cv_c_inline" >&6; }

case $ac_cv_c_inline in
  inline | yes) ;;
  *)
    case $ac_cv_c_inline in
      no) ac_val=;;
      *) ac_val=$ac_cv_c_inline;;
    esac
    cat >>confdefs.h <<_ACEOF
#ifndef __cplusplus
#define inline $ac_val
#endif
_ACEOF
    ;;
esac

{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether time.h and sys/time.h may both be included" >&5
$as_echo_n "checking whether time.h and sys/time.h may both be included... " >&6; }
if ${ac_cv_header_time+:} false; then :
  $as_echo_n "(cached) " >&6
else
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h.  */
#include <sys/types.h>
#include <sys/time.h>
#include <time.h>

int
main ()
{
if ((struct tm *) 0)
return 0;
  ;
  return 0;
}
_ACEOF
if ac_fn_c_try_compile "$LINENO"; then :
  ac_cv_header_time=yes
else
  ac_cv_header_time=no
fi
rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_header_time" >&5
$as_echo "$ac_cv_header_time" >&6; }
if test $ac_cv_header_time = yes; then

$as_echo "#define TIME_WITH_SYS_TIME 1" >>confdefs.h

fi

for ac_header in sys/time.h c_asm.h intrinsics.h mach/mach_time.h sched.h
do :
  as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
ac_fn_c_check_header_mongrel "$LINENO" "$ac_header" "$as_ac_Header" "$ac_includes_default"
if eval test \"x\$"$as_ac_Header"\" = x"yes"; then :
  cat >>confdefs.h <<_ACEOF
#define `$as_echo "HAVE_$ac_header" | $as_tr_cpp` 1
_ACEOF

fi

done

for ac_func in gethrtime read_real_time time_base_to_time clock_gettime mach_absolute_time sched_getcpu
do :
  as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
if
eval test \"x\$"$as_ac_var"\" = x"yes"; then : cat >>confdefs.h <<_ACEOF #define `$as_echo "HAVE_$ac_func" | $as_tr_cpp` 1 _ACEOF fi done # # Check if the system provides time_* symbols without -lrt, and if not, # check for -lrt existence. # { $as_echo "$as_me:${as_lineno-$LINENO}: checking for timer_create and timer_*ettime symbols in base system" >&5 $as_echo_n "checking for timer_create and timer_*ettime symbols in base system... " >&6; } cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <time.h> #include <signal.h> int main () { timer_t timerid; timer_create(CLOCK_REALTIME, NULL, &timerid); ; return 0; } _ACEOF if ac_fn_c_try_link "$LINENO"; then : rtsymbols_in_base="yes" else rtsymbols_in_base="no" fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext if test "${rtsymbols_in_base}" = "yes"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: found" >&5 $as_echo "found" >&6; } LRT="" else { $as_echo "$as_me:${as_lineno-$LINENO}: result: not found" >&5 $as_echo "not found" >&6; } { $as_echo "$as_me:${as_lineno-$LINENO}: checking for timer_create and timer_*ettime symbols in -lrt" >&5 $as_echo_n "checking for timer_create and timer_*ettime symbols in -lrt... " >&6; } SAVED_LIBS=${LIBS} LIBS="${LIBS} -lrt" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h.
*/ #include <time.h> #include <signal.h> int main () { timer_t timerid; timer_create(CLOCK_REALTIME, NULL, &timerid); ; return 0; } _ACEOF if ac_fn_c_try_link "$LINENO"; then : has_lrt="yes" else has_lrt="no" fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext LIBS=${SAVED_LIBS} if test "${has_lrt}" = "yes" ; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: found" >&5 $as_echo "found" >&6; } LRT="-lrt" else { $as_echo "$as_me:${as_lineno-$LINENO}: result: not found" >&5 $as_echo "not found" >&6; } { $as_echo "$as_me:${as_lineno-$LINENO}: checking for timer_create and timer_*ettime symbols in -lrt -lpthread" >&5 $as_echo_n "checking for timer_create and timer_*ettime symbols in -lrt -lpthread... " >&6; } SAVED_LIBS=${LIBS} LIBS="${LIBS} -lrt -lpthread" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <time.h> #include <signal.h> int main () { timer_t timerid; timer_create(CLOCK_REALTIME, NULL, &timerid); ; return 0; } _ACEOF if ac_fn_c_try_link "$LINENO"; then : has_lrt_lpthd="yes" else has_lrt_lpthd="no" fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext LIBS=${SAVED_LIBS} if test "${has_lrt_lpthd}" = "yes" ; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: found" >&5 $as_echo "found" >&6; } LRT="-lrt -lpthread" else { $as_echo "$as_me:${as_lineno-$LINENO}: result: not found" >&5 $as_echo "not found" >&6; } fi fi fi # # Check if the system provides dl* symbols without -ldl, and if not, # check for -ldl existence. # { $as_echo "$as_me:${as_lineno-$LINENO}: checking for dlopen and dlerror symbols in base system" >&5 $as_echo_n "checking for dlopen and dlerror symbols in base system... " >&6; } cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h.
*/ #include <dlfcn.h> int main () { void *p = dlopen ("", 0); char *c = dlerror(); ; return 0; } _ACEOF if ac_fn_c_try_link "$LINENO"; then : dlsymbols_in_base="yes" else dlsymbols_in_base="no" fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext if test "${dlsymbols_in_base}" = "yes"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: found" >&5 $as_echo "found" >&6; } LDL="" else { $as_echo "$as_me:${as_lineno-$LINENO}: result: not found" >&5 $as_echo "not found" >&6; } { $as_echo "$as_me:${as_lineno-$LINENO}: checking for dlopen and dlerror symbols in -ldl" >&5 $as_echo_n "checking for dlopen and dlerror symbols in -ldl... " >&6; } SAVED_LIBS=${LIBS} LIBS="${LIBS} -ldl" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <dlfcn.h> int main () { void *p = dlopen ("", 0); char *c = dlerror(); ; return 0; } _ACEOF if ac_fn_c_try_link "$LINENO"; then : has_ldl="yes" else has_ldl="no" fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext LIBS=${SAVED_LIBS} if test "${has_ldl}" = "yes" ; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: found" >&5 $as_echo "found" >&6; } LDL="-ldl" else as_fn_error $? "cannot find dlopen and dlerror symbols in either the base system libraries or -ldl" "$LINENO" 5 fi fi if test "$OS" = "CLE"; then virtualtimer=times tls=__thread walltimer=cycle ffsll=yes cross_compiling=yes STATIC="-static" # _rtc is only defined when using the Cray compiler { $as_echo "$as_me:${as_lineno-$LINENO}: checking for _rtc intrinsic" >&5 $as_echo_n "checking for _rtc intrinsic... " >&6; } rtc_ok=yes cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h.
*/ #ifdef HAVE_INTRINSICS_H #include <intrinsics.h> #endif int main () { _rtc() ; return 0; } _ACEOF if ac_fn_c_try_link "$LINENO"; then : $as_echo "#define HAVE__RTC 1" >>confdefs.h else rtc_ok=no $as_echo "#define NO_RTC_INTRINSIC 1" >>confdefs.h fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext { $as_echo "$as_me:${as_lineno-$LINENO}: result: $rtc_ok" >&5 $as_echo "$rtc_ok" >&6; } elif test "$OS" = "bgp"; then CC=powerpc-bgp-linux-gcc F77=powerpc-bgp-linux-gfortran walltimer=cycle virtualtimer=proc tls=no ffsll=yes cross_compiling=yes elif test "$OS" = "bgq"; then # Check whether --with-bgpm_installdir was given. if test "${with_bgpm_installdir+set}" = set; then : withval=$with_bgpm_installdir; BGPM_INSTALL_DIR=$withval CFLAGS="$CFLAGS -I$withval" else as_fn_error $? "BGQ CPU component requires installation path of BGPM (see --with-bgpm_installdir)" "$LINENO" 5 fi bitmode=64 tls=no elif test "$OS" = "linux"; then if test "$arch" = "ppc64" -o "$arch" = "x86_64"; then if test "$bitmode" = "64" -a "$libdir" = '${exec_prefix}/lib'; then libdir='${exec_prefix}/lib64' fi fi elif test "$OS" = "solaris"; then ac_fn_c_check_type "$LINENO" "hrtime_t" "ac_cv_type_hrtime_t" "#if HAVE_SYS_TIME_H #include <sys/time.h> #endif " if test "x$ac_cv_type_hrtime_t" = xyes; then : $as_echo "#define HAVE_HRTIME_T 1" >>confdefs.h fi if test "x$AR" = "x"; then AR=/usr/ccs/bin/ar fi fi if test "x$AR" = "x"; then AR=ar fi if test "$cross_compiling" = "yes" ; then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for native compiler for header generation" >&5 $as_echo_n "checking for native compiler for header generation... " >&6; } # Check whether --with-nativecc was given. if test "${with_nativecc+set}" = set; then : withval=$with_nativecc; nativecc=$withval else nativecc=gcc fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $nativecc" >&5 $as_echo "$nativecc" >&6; } fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for tests" >&5 $as_echo_n "checking for tests...
" >&6; } # Check whether --with-tests was given. if test "${with_tests+set}" = set; then : withval=$with_tests; tests=$withval else tests="ctests ftests mpitests comp_tests" fi if test "$tests" = "no"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $tests" >&5 $as_echo "$tests" >&6; } tests= NO_MPI_TESTS=yes else # mpitests is not a target tmp_tests= mpi_tests=no case "$tests" in *ctests*) tmp_tests+="ctests " ;; esac case "$tests" in *ftests*) tmp_tests+="ftests " ;; esac case "$tests" in *comp_tests*) tmp_tests+="comp_tests " ;; esac case "$tests" in *mpitests*) # we already checked if mpicc is working if test "x$MPICC" != "x"; then if test "x$NO_MPI_TESTS" = "x"; then mpi_tests=yes # mpitests only works together with ctests if test "x$tmp_tests" != "x"; then tmp_tests+="mpitests " fi fi fi ;; esac if test "x$tmp_tests" = "x"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: $tmp_tests" >&5 $as_echo "$tmp_tests" >&6; } fi # do not list mpitests for makefile target case "$tmp_tests" in *mpitests* ) tmp_tests=$(echo "$tmp_tests" | sed 's/ mpitests//') ;; esac tests=$tmp_tests # mpitests is not listed by the user if test "$mpi_tests" = "no"; then NO_MPI_TESTS=yes fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for debug build" >&5 $as_echo_n "checking for debug build... " >&6; } # default value for --with-debug if not set by user debug="no" # Check whether --with-debug was given. 
if test "${with_debug+set}" = set; then : withval=$with_debug; debug=$withval fi if test "$debug" = "yes"; then if test "$CC_COMMON_NAME" = "gcc"; then CFLAGS="$CFLAGS -g3" fi OPTFLAGS="-O0" PAPICFLAGS+=" -DDEBUG -DPAPI_NO_MEMORY_MANAGEMENT" elif test "$debug" = "memory"; then if test "$CC_COMMON_NAME" = "gcc"; then CFLAGS="$CFLAGS -g3" fi OPTFLAGS="-O0" PAPICFLAGS+=" -DDEBUG" else PAPICFLAGS+="-DPAPI_NO_MEMORY_MANAGEMENT" fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $debug" >&5 $as_echo "$debug" >&6; } # Check whether --enable-warnings was given. if test "${enable_warnings+set}" = set; then : enableval=$enable_warnings; else enable_warnings=no fi if test "$CC_COMMON_NAME" = "gcc"; then if test "$enable_warnings" = "yes"; then gcc_version=`gcc -v 2>&1 | tail -n 1 | awk '{printf $3}'` major=`echo $gcc_version | sed 's/\([^.][^.]*\).*/\1/'` minor=`echo $gcc_version | sed 's/[^.][^.]*.\([^.][^.]*\).*/\1/'` if (test "$major" -ge 4 || test "$major" = 3 -a "$minor" -ge 4); then CFLAGS+=" -Wall -Wextra" else CFLAGS+=" -W" fi # -Wextra => -Woverride-init on gcc >= 4.2 # This issues a warning (error under -Werror) for some libpfm4 code. fi oldcflags="$CFLAGS" { $as_echo "$as_me:${as_lineno-$LINENO}: checking for -Wno-override-init" >&5 $as_echo_n "checking for -Wno-override-init... " >&6; } CFLAGS+=" -Wall -Werror -Wno-override-init" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ struct A { int x; int y; }; int main(void) { struct A a = {.x = 0, .y = 0, .y = 5 }; return a.x; } _ACEOF if ac_fn_c_try_compile "$LINENO"; then : HAVE_NO_OVERRIDE_INIT=1 else HAVE_NO_OVERRIDE_INIT=0 fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext CFLAGS="$oldcflags" { $as_echo "$as_me:${as_lineno-$LINENO}: result: $HAVE_NO_OVERRIDE_INIT" >&5 $as_echo "$HAVE_NO_OVERRIDE_INIT" >&6; } fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for CPU type" >&5 $as_echo_n "checking for CPU type... " >&6; } # Check whether --with-CPU was given. 
if test "${with_CPU+set}" = set; then : withval=$with_CPU; CPU=$withval case "$CPU" in core|core2|i7|atom|p4|p3|opteron|athlon|x86) MISCSRCS="$MISCSRCS x86_cpuid_info.c" esac else case "$OS" in aix) CPU="`/usr/sbin/lsattr -E -l proc0 | grep type | cut -d '_' -f 2 | cut -d ' ' -f 1 | tr 'A-Z' 'a-z'`" if test "$CPU" = ""; then CPU="`/usr/sbin/lsattr -E -l proc1 | grep type | cut -d '_' -f 2 | cut -d ' ' -f 1 | tr 'A-Z' 'a-z'`" fi ;; nec) family=nec ;; freebsd) family=`uname -m` if test "$family" = "amd64"; then MISCSRCS="$MISCSRCS x86_cpuid_info.c" elif test "$family" = "i386"; then MISCSRCS="$MISCSRCS x86_cpuid_info.c" fi ;; darwin) family=`uname -m` MISCSRCS="$MISCSRCS x86_cpuid_info.c" ;; linux) family=`uname -m` if test "$family" = "x86_64"; then MISCSRCS="$MISCSRCS x86_cpuid_info.c" CPU="x86" elif test "$family" = "i686"; then MISCSRCS="$MISCSRCS x86_cpuid_info.c" CPU="x86" elif test "$family" = "ppc64" || test "$family" = "ppc64le"; then if (test "$family" = "ppc64le" && test "$CC_COMMON_NAME" = "gcc"); then # set cuda_version to be 0, such that we avoid "integer expression expected" cuda_version=0 if (test ! 
-z "$PAPI_CUDA_ROOT"); then update_cuda_version=`grep -r '#define CUDA_VERSION [0-9]' $PAPI_CUDA_ROOT/include/cuda.h 2>/dev/null | awk '{print $3}'` # update cuda_version, this will stay 0 unless PAPI_CUDA_ROOT was properly set to a Cuda install cuda_version=`echo "$cuda_version $update_cuda_version" | awk '{print $1 + $2}'` fi # get gcc major number gcc_major=`gcc -v 2>&1 | tail -n 1 | awk '{printf $3}' | sed 's/\([^.][^.]*\).*/\1/'` # set variable if the gcc and cuda versions match conditions if (test "$cuda_version" -ge 10000 && test "$cuda_version" -lt 11000 \ && test "$gcc_major" -ge 8 && test "$gcc_major" -lt 9); then NVPPC64LEFLAGS="-Xcompiler -mno-float128" fi fi CPU_info="`cat /proc/cpuinfo | grep cpu | cut -d: -f2 | cut -d' ' -f2 | sed '2,$d' | tr -d ","`" case "$CPU_info" in PPC970*) CPU="PPC970";; POWER5) CPU="POWER5";; POWER5+) CPU="POWER5+";; POWER6) CPU="POWER6";; POWER7) CPU="POWER7";; POWER8) CPU="POWER8";; POWER9) CPU="POWER9";; POWER10) CPU="POWER10";; esac elif test "${family:0:3}" = "arm" || test "$family" = "aarch64"; then CPU="arm" fi ;; solaris) ac_fn_c_check_header_mongrel "$LINENO" "libcpc.h" "ac_cv_header_libcpc_h" "$ac_includes_default" if test "x$ac_cv_header_libcpc_h" = xyes; then : CFLAGS="$CFLAGS -lcpc" if test "$cross_compiling" = yes; then : { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot run test program while cross compiling See \`config.log' for more details" "$LINENO" 5; } else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <libcpc.h> #include <stdlib.h> int main() { // Check for libcpc 2 if(CPC_VER_CURRENT == 2) exit(0); exit(1); } _ACEOF if ac_fn_c_try_run "$LINENO"; then : cpc_version=2 else cpc_version=0 fi rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \ conftest.$ac_objext conftest.beam conftest.$ac_ext fi else as_fn_error $?
"libcpc is needed for running PAPI on Solaris" "$LINENO" 5 fi processor=`uname -p` machinetype=`uname -m` if test "$processor" = "sparc"; then if test "$machinetype" = "sun4u"; then CPU=ultra { $as_echo "$as_me:${as_lineno-$LINENO}: checking for cpc_take_sample in -lcpc" >&5 $as_echo_n "checking for cpc_take_sample in -lcpc... " >&6; } if ${ac_cv_lib_cpc_cpc_take_sample+:} false; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-lcpc $LIBS" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ /* Override any GCC internal prototype to avoid an error. Use char because int might match the return type of a GCC builtin and then its argument prototype would still apply. */ #ifdef __cplusplus extern "C" #endif char cpc_take_sample (); int main () { return cpc_take_sample (); ; return 0; } _ACEOF if ac_fn_c_try_link "$LINENO"; then : ac_cv_lib_cpc_cpc_take_sample=yes else ac_cv_lib_cpc_cpc_take_sample=no fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_cpc_cpc_take_sample" >&5 $as_echo "$ac_cv_lib_cpc_cpc_take_sample" >&6; } if test "x$ac_cv_lib_cpc_cpc_take_sample" = xyes; then : cat >>confdefs.h <<_ACEOF #define HAVE_LIBCPC 1 _ACEOF LIBS="-lcpc $LIBS" else as_fn_error $? "libcpc.a is needed on Solaris, install SUNWcpc" "$LINENO" 5 fi elif test "$machinetype" = "sun4v"; then CPU=niagara2 if test "$cpc_version" != "2"; then as_fn_error $? "libcpc2 needed for Niagara 2" "$LINENO" 5 fi else as_fn_error $? "$machinetype not supported" "$LINENO" 5 fi else as_fn_error $? 
"Only SPARC processors are supported on Solaris" "$LINENO" 5 fi ;; bgp) CPU=bgp ;; bgq) CPU=bgq ;; esac fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $CPU" >&5 $as_echo "$CPU" >&6; } cat >>confdefs.h <<_ACEOF #define CPU $CPU _ACEOF # First set pthread-mutexes based on arch case $arch in aarch64|arm*|parisc*) pthread_mutexes=yes CFLAGS="$CFLAGS -DUSE_PTHREAD_MUTEXES" echo "forcing use of pthread mutexes... " >&6 ;; esac # Check whether --with-pthread-mutexes was given. if test "${with_pthread_mutexes+set}" = set; then : withval=$with_pthread_mutexes; pthread_mutexes=yes CFLAGS="$CFLAGS -DUSE_PTHREAD_MUTEXES" fi # Check whether --with-ffsll was given. if test "${with_ffsll+set}" = set; then : withval=$with_ffsll; ffsll=$withval else if test "$cross_compiling" = "yes" ; then as_fn_error $? "ffsll must be specified for cross compile" "$LINENO" 5 fi didcheck=1 ac_fn_c_check_func "$LINENO" "ffsll" "ac_cv_func_ffsll" if test "x$ac_cv_func_ffsll" = xyes; then : ffsll=yes else ffsll=no fi fi if test "$ffsll" = "yes" ; then $as_echo "#define HAVE_FFSLL 1" >>confdefs.h fi if test "$didcheck" != "1"; then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ffsll" >&5 $as_echo_n "checking for ffsll... " >&6; } if test "$ffsll" = "yes" ; then $as_echo "#define HAVE_FFSLL 1" >>confdefs.h fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ffsll" >&5 $as_echo "$ffsll" >&6; } fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for working gettid" >&5 $as_echo_n "checking for working gettid... " >&6; } cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. 
*/ #include <sys/types.h> #include <unistd.h> int main() { pid_t a = gettid(); } _ACEOF if ac_fn_c_try_link "$LINENO"; then : { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } $as_echo "#define HAVE_GETTID 1" >>confdefs.h else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } { $as_echo "$as_me:${as_lineno-$LINENO}: checking for working syscall(SYS_gettid)" >&5 $as_echo_n "checking for working syscall(SYS_gettid)... " >&6; } cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <sys/types.h> #include <unistd.h> #include <sys/syscall.h> int main() { pid_t a = syscall(SYS_gettid); } _ACEOF if ac_fn_c_try_link "$LINENO"; then : { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } $as_echo "#define HAVE_SYSCALL_GETTID 1" >>confdefs.h else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext # Check whether --with-walltimer was given. if test "${with_walltimer+set}" = set; then : withval=$with_walltimer; walltimer=$withval else if test "$cross_compiling" = "yes" ; then as_fn_error $? "walltimer must be specified for cross compile" "$LINENO" 5 fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for working MMTIMER" >&5 $as_echo_n "checking for working MMTIMER... " >&6; } if test "$cross_compiling" = yes; then : { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot run test program while cross compiling See \`config.log' for more details" "$LINENO" 5; } else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h.
*/ #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <fcntl.h> #include <sys/ioctl.h> #include <linux/mmtimer.h> #ifndef MMTIMER_FULLNAME #define MMTIMER_FULLNAME "/dev/mmtimer" #endif int main() { int offset; int fd; if((fd = open(MMTIMER_FULLNAME, O_RDONLY)) == -1) exit(1); if ((offset = ioctl(fd, MMTIMER_GETOFFSET, 0)) < 0) exit(1); close(fd); exit(0); } _ACEOF if ac_fn_c_try_run "$LINENO"; then : walltimer="mmtimer" { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } { $as_echo "$as_me:${as_lineno-$LINENO}: checking for working CLOCK_REALTIME_HR POSIX 1b timer" >&5 $as_echo_n "checking for working CLOCK_REALTIME_HR POSIX 1b timer... " >&6; } if test "$cross_compiling" = yes; then : { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot run test program while cross compiling See \`config.log' for more details" "$LINENO" 5; } else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <time.h> #include <sys/syscall.h> int main() { struct timespec t1, t2; double seconds; if (syscall(__NR_clock_gettime,CLOCK_REALTIME_HR,&t1) == -1) exit(1); sleep(1); if (syscall(__NR_clock_gettime,CLOCK_REALTIME_HR,&t2) == -1) exit(1); seconds = ((double)t2.tv_sec + (double)t2.tv_nsec/1000000000.0) - ((double)t1.tv_sec + (double)t1.tv_nsec/1000000000.0); if (seconds > 1.0) exit(0); else exit(1); } _ACEOF if ac_fn_c_try_run "$LINENO"; then : walltimer="clock_realtime_hr" { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } { $as_echo "$as_me:${as_lineno-$LINENO}: checking for working CLOCK_REALTIME POSIX 1b timer" >&5 $as_echo_n "checking for working CLOCK_REALTIME POSIX 1b timer...
" >&6; } if test "$cross_compiling" = yes; then : { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot run test program while cross compiling See \`config.log' for more details" "$LINENO" 5; } else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <time.h> #include <sys/syscall.h> int main() { struct timespec t1, t2; double seconds; if (syscall(__NR_clock_gettime,CLOCK_REALTIME,&t1) == -1) exit(1); sleep(1); if (syscall(__NR_clock_gettime,CLOCK_REALTIME,&t2) == -1) exit(1); seconds = ((double)t2.tv_sec + (double)t2.tv_nsec/1000000000.0) - ((double)t1.tv_sec + (double)t1.tv_nsec/1000000000.0); if (seconds > 1.0) exit(0); else exit(1); } _ACEOF if ac_fn_c_try_run "$LINENO"; then : walltimer="clock_realtime" { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } else walltimer="cycle" { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \ conftest.$ac_objext conftest.beam conftest.$ac_ext fi fi rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \ conftest.$ac_objext conftest.beam conftest.$ac_ext fi fi rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \ conftest.$ac_objext conftest.beam conftest.$ac_ext fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for which real time clock to use" >&5 $as_echo_n "checking for which real time clock to use...
" >&6; } if test "$walltimer" = "gettimeofday"; then $as_echo "#define HAVE_GETTIMEOFDAY 1" >>confdefs.h elif test "$walltimer" = "mmtimer"; then $as_echo "#define HAVE_MMTIMER 1" >>confdefs.h altix="-DALTIX" elif test "$walltimer" = "clock_realtime_hr"; then $as_echo "#define HAVE_CLOCK_GETTIME 1" >>confdefs.h $as_echo "#define HAVE_CLOCK_GETTIME_REALTIME_HR 1" >>confdefs.h elif test "$walltimer" = "clock_realtime"; then $as_echo "#define HAVE_CLOCK_GETTIME 1" >>confdefs.h $as_echo "#define HAVE_CLOCK_GETTIME_REALTIME 1" >>confdefs.h elif test "$walltimer" = "cycle"; then $as_echo "#define HAVE_CYCLE 1" >>confdefs.h else as_fn_error $? "Unknown value for walltimer" "$LINENO" 5 fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $walltimer" >&5 $as_echo "$walltimer" >&6; } SAVED_LIBS=$LIBS SAVED_LDFLAGS=$LDFLAGS SAVED_CFLAGS=$CFLAGS LIBS="" LDFLAGS="" CFLAGS="-pthread" # Check whether --with-tls was given. if test "${with_tls+set}" = set; then : withval=$with_tls; tls=$withval else if test "$cross_compiling" = "yes" ; then as_fn_error $? "tls must be specified for cross compile" "$LINENO" 5 fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for working __thread" >&5 $as_echo_n "checking for working __thread... " >&6; } if test "$cross_compiling" = yes; then : { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot run test program while cross compiling See \`config.log' for more details" "$LINENO" 5; } else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. 
*/ #include <pthread.h> #include <unistd.h> extern __thread int i; static int res1, res2; void *thread_main (void *arg) { i = (int)arg; sleep (1); if ((int)arg == 1) res1 = (i == (int)arg); else res2 = (i == (int)arg); return NULL; } __thread int i; int main () { pthread_t t1, t2; i = 5; pthread_create (&t1, NULL, thread_main, (void *)1); pthread_create (&t2, NULL, thread_main, (void *)2); pthread_join (t1, NULL); pthread_join (t2, NULL); return !(res1 + res2 == 2); } _ACEOF if ac_fn_c_try_run "$LINENO"; then : { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } tls="__thread" else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } tls="no" fi rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \ conftest.$ac_objext conftest.beam conftest.$ac_ext fi if test "$OS" = "linux"; then if test "x$tls" = "x__thread"; then # On some linux distributions, TLS works in executables, but linking against # a shared library containing TLS fails with: undefined reference to `__tls_get_addr' cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ static __thread int foo; void main() { foo = 5; } _ACEOF if ac_fn_c_try_link "$LINENO"; then : else tls="no" ; { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: Disabling usage of __thread" >&5 $as_echo "$as_me: WARNING: Disabling usage of __thread" >&2;} fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext fi fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for high performance thread local storage" >&5 $as_echo_n "checking for high performance thread local storage... " >&6; } if test "$tls" = "no"; then NOTLS="-DNO_TLS" elif test "x$tls" != "x"; then if test "$tls" = "yes"; then tls="__thread" fi NOTLS="-DUSE_COMPILER_TLS" cat >>confdefs.h <<_ACEOF #define HAVE_THREAD_LOCAL_STORAGE $tls _ACEOF fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $tls" >&5 $as_echo "$tls" >&6; } # Check whether --with-virtualtimer was given.
if test "${with_virtualtimer+set}" = set; then : withval=$with_virtualtimer; virtualtimer=$withval else if test "$cross_compiling" = "yes" ; then as_fn_error $? "virtualtimer must be specified for cross compile" "$LINENO" 5 fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for working CLOCK_THREAD_CPUTIME_ID POSIX 1b timer" >&5 $as_echo_n "checking for working CLOCK_THREAD_CPUTIME_ID POSIX 1b timer... " >&6; } if test "$cross_compiling" = yes; then : { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot run test program while cross compiling See \`config.log' for more details" "$LINENO" 5; } else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <errno.h> #include <assert.h> #include <pthread.h> #include <time.h> #include <sys/types.h> #include <sys/syscall.h> #if !defined( SYS_gettid ) #define SYS_gettid 1105 #endif struct timespec threadone = { 0, 0 }; struct timespec threadtwo = { 0, 0 }; pthread_t threadOne, threadTwo; volatile int done = 0; int gettid() { return syscall( SYS_gettid ); } void *doThreadOne( void * v ) { while (!done) sleep(1); if (syscall(__NR_clock_gettime,CLOCK_THREAD_CPUTIME_ID,&threadone) == -1) { perror("clock_gettime(CLOCK_THREAD_CPUTIME_ID)"); exit(1); } return 0; } void *doThreadTwo( void * v ) { long i, j = 0xdeadbeef; for( i = 0; i < 0xFFFFFFF; ++i ) { j = j ^ i; } if (syscall(__NR_clock_gettime,CLOCK_THREAD_CPUTIME_ID,&threadtwo) == -1) { perror("clock_gettime(CLOCK_THREAD_CPUTIME_ID)"); exit(1); } done = 1; return (void *) j; } int main( int argc, char ** argv ) { int status = pthread_create( & threadOne, NULL, doThreadOne, NULL ); assert( status == 0 ); status = pthread_create( & threadTwo, NULL, doThreadTwo, NULL ); assert( status == 0 ); status = pthread_join( threadTwo, NULL ); assert( status == 0 ); status = pthread_join( threadOne, NULL ); assert( status == 0 ); if ((threadone.tv_sec != threadtwo.tv_sec) || (threadone.tv_nsec != threadtwo.tv_nsec))
exit(0); else { fprintf(stderr,"T1 %ld %ld T2 %ld %ld\n",threadone.tv_sec,threadone.tv_nsec,threadtwo.tv_sec,threadtwo.tv_nsec); exit(1); } } _ACEOF if ac_fn_c_try_run "$LINENO"; then : { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } virtualtimer="clock_thread_cputime_id" else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } # *** Checks for working per thread timer*** { $as_echo "$as_me:${as_lineno-$LINENO}: checking for working per-thread times() timer" >&5 $as_echo_n "checking for working per-thread times() timer... " >&6; } if test "$cross_compiling" = yes; then : { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot run test program while cross compiling See \`config.log' for more details" "$LINENO" 5; } else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <errno.h> #include <assert.h> #include <pthread.h> #include <sys/times.h> #include <sys/types.h> #include <sys/syscall.h> #if !defined( SYS_gettid ) #define SYS_gettid 1105 #endif long threadone = 0, threadtwo = 0; pthread_t threadOne, threadTwo; volatile int done = 0; int gettid() { return syscall( SYS_gettid ); } void *doThreadOne( void * v ) { struct tms tm; int status; while (!done) sleep(1); status = times( & tm ); assert( status != -1 ); threadone = tm.tms_utime; return 0; } void *doThreadTwo( void * v ) { struct tms tm; long i, j = 0xdeadbeef; int status; for( i = 0; i < 0xFFFFFFF; ++i ) { j = j ^ i; } status = times( & tm ); assert( status != -1 ); threadtwo = tm.tms_utime; done = 1; return (void *) j; } int main( int argc, char ** argv ) { int status = pthread_create( & threadOne, NULL, doThreadOne, NULL ); assert( status == 0 ); status = pthread_create( & threadTwo, NULL, doThreadTwo, NULL ); assert( status == 0 ); status = pthread_join( threadTwo, NULL ); assert( status == 0 ); status = pthread_join( threadOne, NULL ); assert( status == 0 ); return (threadone == threadtwo); }
_ACEOF if ac_fn_c_try_run "$LINENO"; then : { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } virtualtimer="times" else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } virtualtimer="default" fi rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \ conftest.$ac_objext conftest.beam conftest.$ac_ext fi fi rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \ conftest.$ac_objext conftest.beam conftest.$ac_ext fi fi LDFLAGS=$SAVED_LDFLAGS CFLAGS=$SAVED_CFLAGS LIBS=$SAVED_LIBS { $as_echo "$as_me:${as_lineno-$LINENO}: checking for which virtual timer to use" >&5 $as_echo_n "checking for which virtual timer to use... " >&6; } case "$virtualtimer" in times) $as_echo "#define HAVE_PER_THREAD_TIMES 1" >>confdefs.h ;; getrusage) $as_echo "#define HAVE_PER_THREAD_GETRUSAGE 1" >>confdefs.h ;; clock_thread_cputime_id) $as_echo "#define HAVE_CLOCK_GETTIME_THREAD CLOCK_THREAD_CPUTIME_ID" >>confdefs.h ;; proc|default) $as_echo "#define USE_PROC_PTTIMER 1" >>confdefs.h esac { $as_echo "$as_me:${as_lineno-$LINENO}: result: $virtualtimer" >&5 $as_echo "$virtualtimer" >&6; } if test "$OS" = "aix"; then # Check whether --with-pmapi was given. if test "${with_pmapi+set}" = set; then : withval=$with_pmapi; PMAPI=$withval else PMAPI="/usr/pmapi" fi LIBS="-L$PMAPI/lib -lpmapi" CPPFLAGS="$CPPFLAGS -I$PMAPI/include" { $as_echo "$as_me:${as_lineno-$LINENO}: checking for pm_initialize in -lpmapi" >&5 $as_echo_n "checking for pm_initialize in -lpmapi... " >&6; } if ${ac_cv_lib_pmapi_pm_initialize+:} false; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-lpmapi $LIBS" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ /* Override any GCC internal prototype to avoid an error. Use char because int might match the return type of a GCC builtin and then its argument prototype would still apply. 
*/ #ifdef __cplusplus extern "C" #endif char pm_initialize (); int main () { return pm_initialize (); ; return 0; } _ACEOF if ac_fn_c_try_link "$LINENO"; then : ac_cv_lib_pmapi_pm_initialize=yes else ac_cv_lib_pmapi_pm_initialize=no fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_pmapi_pm_initialize" >&5 $as_echo "$ac_cv_lib_pmapi_pm_initialize" >&6; } if test "x$ac_cv_lib_pmapi_pm_initialize" = xyes; then : PMINIT="-DPM_INITIALIZE" else { $as_echo "$as_me:${as_lineno-$LINENO}: checking for pm_init in -lpmapi" >&5 $as_echo_n "checking for pm_init in -lpmapi... " >&6; } if ${ac_cv_lib_pmapi_pm_init+:} false; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-lpmapi $LIBS" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ /* Override any GCC internal prototype to avoid an error. Use char because int might match the return type of a GCC builtin and then its argument prototype would still apply. */ #ifdef __cplusplus extern "C" #endif char pm_init (); int main () { return pm_init (); ; return 0; } _ACEOF if ac_fn_c_try_link "$LINENO"; then : ac_cv_lib_pmapi_pm_init=yes else ac_cv_lib_pmapi_pm_init=no fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_pmapi_pm_init" >&5 $as_echo "$ac_cv_lib_pmapi_pm_init" >&6; } if test "x$ac_cv_lib_pmapi_pm_init" = xyes; then : PMINIT="-DPM_INIT" else as_fn_error $? "libpmapi.a not found, rerun configure with different flags" "$LINENO" 5 fi fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for static user preset events" >&5 $as_echo_n "checking for static user preset events... " >&6; } # Check whether --with-static_user_events was given. 
if test "${with_static_user_events+set}" = set; then : withval=$with_static_user_events; STATIC_USER_EVENTS=$withval else STATIC_USER_EVENTS=no fi if test "$STATIC_USER_EVENTS" = "yes"; then PAPICFLAGS+=" -DSTATIC_USER_EVENTS" fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $STATIC_USER_EVENTS" >&5 $as_echo "$STATIC_USER_EVENTS" >&6; } { $as_echo "$as_me:${as_lineno-$LINENO}: checking for static PAPI preset events" >&5 $as_echo_n "checking for static PAPI preset events... " >&6; } # Check whether --with-static_papi_events was given. if test "${with_static_papi_events+set}" = set; then : withval=$with_static_papi_events; STATIC_PAPI_EVENTS=$withval else STATIC_PAPI_EVENTS=yes fi if test "$STATIC_PAPI_EVENTS" = "yes"; then PAPICFLAGS+=" -DSTATIC_PAPI_EVENTS_TABLE" fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $STATIC_PAPI_EVENTS" >&5 $as_echo "$STATIC_PAPI_EVENTS" >&6; } { $as_echo "$as_me:${as_lineno-$LINENO}: checking for whether to build static library" >&5 $as_echo_n "checking for whether to build static library... " >&6; } # Check whether --with-static_lib was given. if test "${with_static_lib+set}" = set; then : withval=$with_static_lib; static_lib=$withval else static_lib=yes fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $static_lib" >&5 $as_echo "$static_lib" >&6; } { $as_echo "$as_me:${as_lineno-$LINENO}: checking for whether to build shared library" >&5 $as_echo_n "checking for whether to build shared library... " >&6; } # Check whether --with-shared_lib was given. if test "${with_shared_lib+set}" = set; then : withval=$with_shared_lib; shared_lib=$withval else shared_lib=yes fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $shared_lib" >&5 $as_echo "$shared_lib" >&6; } if test "$shared_lib" = "no" -a "$static_lib" = "no"; then as_fn_error $? 
"Both shared and static libs are disabled" "$LINENO" 5 fi BUILD_SHARED_LIB="no" if test "$shared_lib" = "yes"; then BUILD_SHARED_LIB="yes" papiLIBS="shared" fi if test "$static_lib" = "yes"; then papiLIBS="$papiLIBS static" fi if test "$shared_lib" = "no" -a "$static_lib" = "yes"; then NO_MPI_TESTS="yes" fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for static compile of tests and utilities" >&5 $as_echo_n "checking for static compile of tests and utilities... " >&6; } # Check whether --with-static_tools was given. if test "${with_static_tools+set}" = set; then : withval=$with_static_tools; STATIC="-static" { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi # Disable LDL for static builds if test "$STATIC" = "-static"; then LDL="" fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for linking with papi shared library of tests and utilities" >&5 $as_echo_n "checking for linking with papi shared library of tests and utilities... " >&6; } # Check whether --with-shlib_tools was given. if test "${with_shlib_tools+set}" = set; then : withval=$with_shlib_tools; shlib_tools=yes { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } else shlib_tools=no { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi if test "$static_lib" = "no"; then shlib_tools=yes fi if test "$static_lib" = "no" -a "$shlib_tools" = "no"; then as_fn_error $? "Building tests and utilities static but no static papi library to be built" "$LINENO" 5 fi if test "$shlib_tools" = "yes"; then if test "$shared_lib" != "yes"; then as_fn_error $? "Building static but specified shared linking for tests and utilities" "$LINENO" 5 fi if test "$STATIC" = "-static"; then as_fn_error $? 
"Building shared but specified static linking" "$LINENO" 5 fi LINKLIB='$(SHLIB)' # Set rpath and runpath to find libpfm.so and libpapi.so if not specified via LD_LIBRARY_PATH. The search path at runtime can be overridden by LD_LIBRARY_PATH. LDFLAGS="$LDFLAGS -Wl,-rpath=$PWD/libpfm4/lib:$PWD,--enable-new-dtags" elif test "$shlib_tools" = "no"; then if test "$static_lib" != "yes"; then as_fn_error $? "Building shared but specified static linking for tests and utilities" "$LINENO" 5 fi LINKLIB='$(LIBRARY)' fi ## By default we want libsde built, so if the user does not ## give an option, then we set BUILD_LIBSDE_* to yes. { $as_echo "$as_me:${as_lineno-$LINENO}: checking for building libsde" >&5 $as_echo_n "checking for building libsde... " >&6; } # Check whether --with-libsde was given. if test "${with_libsde+set}" = set; then : withval=$with_libsde; else with_libsde=yes fi if test "$with_libsde" = "no"; then BUILD_LIBSDE_SHARED="no" BUILD_LIBSDE_STATIC="no" { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } else BUILD_LIBSDE_SHARED="yes" if test "$static_lib" = "yes"; then BUILD_LIBSDE_STATIC="yes" { $as_echo "$as_me:${as_lineno-$LINENO}: result: shared and static" >&5 $as_echo "shared and static" >&6; } else BUILD_LIBSDE_STATIC="no" { $as_echo "$as_me:${as_lineno-$LINENO}: result: shared only" >&5 $as_echo "shared only" >&6; } fi if test "$PWD" != ""; then TOPDIR=$PWD else TOPDIR=$(pwd) fi fi user_specified_interface=no ################################################## # perfnec ################################################## force_perfnec=no perfnec=0 # Check whether --with-perfnec was given.
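# Illustrative invocations (sketches, not exhaustive) for the library-selection
# options handled above; the flag spellings are inferred from the
# with_shared_lib/with_static_lib/with_static_tools variables this script reads:
#   ./configure --with-shared-lib=yes --with-static-lib=no   # shared libpapi only
#   ./configure --with-static-lib=yes --with-static-tools    # static library, static tools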
if test "${with_perfnec+set}" = set; then : withval=$with_perfnec; perfnec=$withval user_specified_interface=perfnec force_perfnec=yes pfm_root=libperfnec pfm_incdir="libperfnec/include" else perfnec=0 if test "$perfnec" != 0; then pfm_incdir="libpperfnec/include" fi fi force_pfm_incdir=no ################################################## # perfmon ################################################## old_pfmv2=n perfmon=0 perfmon2=no force_perfmon2=no # Check whether --with-perfmon was given. if test "${with_perfmon+set}" = set; then : withval=$with_perfmon; perfmon=$withval user_specified_interface=perfmon force_perfmon2=yes pfm_incdir="libpfm-3.y/include" perfmon=`echo ${perfmon} | sed 's/^ \t*//;s/ \t*$//'` perfmon=`echo ${perfmon} | grep -e '[1-9]\.[0-9][0-9]*'` if test "x$perfmon" = "x"; then as_fn_error $? "\"Badly formed perfmon version string\"" "$LINENO" 5 fi perfmon=`echo ${perfmon} | sed 's/\.//'` if test $perfmon -gt 20; then perfmon2=yes fi if test $perfmon -lt 25; then old_pfmv2=y PFMCFLAGS="-DPFMLIB_OLD_PFMV2" fi else perfmon=0 if test "$cross_compiling" = "no" ; then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for /sys/kernel/perfmon/version" >&5 $as_echo_n "checking for /sys/kernel/perfmon/version... " >&6; } if ${ac_cv_file__sys_kernel_perfmon_version+:} false; then : $as_echo_n "(cached) " >&6 else test "$cross_compiling" = yes && as_fn_error $? "cannot check for file existence when cross compiling" "$LINENO" 5 if test -r "/sys/kernel/perfmon/version"; then ac_cv_file__sys_kernel_perfmon_version=yes else ac_cv_file__sys_kernel_perfmon_version=no fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_file__sys_kernel_perfmon_version" >&5 $as_echo "$ac_cv_file__sys_kernel_perfmon_version" >&6; } if test "x$ac_cv_file__sys_kernel_perfmon_version" = xyes; then : perfmon=`cat /sys/kernel/perfmon/version` else { $as_echo "$as_me:${as_lineno-$LINENO}: checking for /proc/perfmon" >&5 $as_echo_n "checking for /proc/perfmon... 
" >&6; } if ${ac_cv_file__proc_perfmon+:} false; then : $as_echo_n "(cached) " >&6 else test "$cross_compiling" = yes && as_fn_error $? "cannot check for file existence when cross compiling" "$LINENO" 5 if test -r "/proc/perfmon"; then ac_cv_file__proc_perfmon=yes else ac_cv_file__proc_perfmon=no fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_file__proc_perfmon" >&5 $as_echo "$ac_cv_file__proc_perfmon" >&6; } if test "x$ac_cv_file__proc_perfmon" = xyes; then : perfmon=`cat /proc/perfmon | grep version | cut -d: -f2` else perfmon=0 fi fi if test "$perfmon" != 0; then pfm_incdir="libpfm-3.y/include" perfmon=`echo ${perfmon} | sed 's/^ \t*//;s/ \t*$//'` perfmon=`echo ${perfmon} | grep -e '[1-9]\.[0-9][0-9]*'` perfmon=`echo ${perfmon} | sed 's/\.//'` if test $perfmon -gt 20; then perfmon2=yes fi if test $perfmon -lt 25; then # must be y, not yes, or libpfm breaks old_pfmv2="y" PFMCFLAGS="-DPFMLIB_OLD_PFMV2" fi fi fi fi force_pfm_incdir=no # default # Check whether --with-pfm_root was given. if test "${with_pfm_root+set}" = set; then : withval=$with_pfm_root; pfm_root=$withval pfm_incdir=$withval/include pfm_libdir=$withval/lib fi # Check whether --with-pfm_prefix was given. if test "${with_pfm_prefix+set}" = set; then : withval=$with_pfm_prefix; pfm_prefix=$withval pfm_incdir=$pfm_prefix/include pfm_libdir=$pfm_prefix/lib fi # Check whether --with-pfm_incdir was given. if test "${with_pfm_incdir+set}" = set; then : withval=$with_pfm_incdir; pfm_incdir=$withval fi # Check whether --with-pfm_libdir was given. if test "${with_pfm_libdir+set}" = set; then : withval=$with_pfm_libdir; pfm_libdir=$withval fi # if these are both empty, it means we haven't set either pfm_prefix or pfm_root # which would have set them. Thus it means that we set this to our included # libpfm4 library. Shame on the person that sets one but not the other. 
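# For example (sketch): --with-pfm-prefix=/opt/libpfm4 sets pfm_incdir to
# /opt/libpfm4/include and pfm_libdir to /opt/libpfm4/lib; leaving pfm_incdir
# and pfm_libdir both unset falls through to the bundled libpfm4 tree selected
# just below.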
if test "x$pfm_incdir" = "x" -a "x$pfm_libdir" = "x"; then pfm_root="libpfm4" pfm_incdir="libpfm4/include" pfm_libdir="libpfm4/lib" fi ################################################## # Linux perf_event/perf_counter ################################################## perf_events=yes force_perf_events=no perf_events_uncore=yes if test "x$mic" = "xno"; then perf_events=no fi awk ' /^# validation_tests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib$/ { sub(/^# /, ""); print; next } /^# \t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+validation_tests$/ { sub(/^# /, ""); print; next } # ctests lines /^# ctests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib[[:space:]]+validation_tests$/ { sub(/^# /, ""); print; next } /^# \t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+ctests$/ { sub(/^# /, ""); print; next } # ftests lines /^# ftests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib$/ { sub(/^# /, ""); print; next } /^# \t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+ftests$/ { sub(/^# /, ""); print; next } { print } ' "Makefile.inc" > "t_Makefile.inc" mv "t_Makefile.inc" "Makefile.inc" # Check whether --enable-cpu was given. 
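# The awk program above strips a leading "# " from the validation_tests/ctests/
# ftests rules in Makefile.inc, re-enabling those targets; e.g. (sketch):
#   before:  # ctests: $(LIBS) testlib validation_tests
#   after:   ctests: $(LIBS) testlib validation_tests
# The --enable-cpu=no / --enable-perf_event=no paths below apply the inverse
# transform, prefixing the same rules with "# ".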
if test "${enable_cpu+set}" = set; then : enableval=$enable_cpu; fi if test "x$enable_cpu" = "xno"; then : perf_events=no force_perf_events=no perf_events_uncore=no FILE_PATH="Makefile.inc" TEMP_FILE="temp_$FILE_PATH" # Comment out lines to disable ctest ftest vtest awk ' /^validation_tests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib$/ { print "# " $0; next } /^\t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+validation_tests$/ { print "# " $0; next } /^ctests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib[[:space:]]+validation_tests$/ { print "# " $0; next } /^\t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+ctests$/ { print "# " $0; next } /^ftests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib$/ { print "# " $0; next } /^\t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+ftests$/ { print "# " $0; next } { print } ' "$FILE_PATH" > "$TEMP_FILE" mv "$TEMP_FILE" "$FILE_PATH" fi # Check whether --enable-perf_event was given. if test "${enable_perf_event+set}" = set; then : enableval=$enable_perf_event; fi if test "x$enable_perf_event" = "xno"; then : perf_events=no force_perf_events=no FILE_PATH="Makefile.inc" TEMP_FILE="temp_$FILE_PATH" awk ' /^validation_tests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib$/ { print "# " $0; next } /^\t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+validation_tests$/ { print "# " $0; next } /^ctests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib[[:space:]]+validation_tests$/ { print "# " $0; next } /^\t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+ctests$/ { print "# " $0; next } /^ftests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib$/ { print "# " $0; next } /^\t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+ftests$/ { print "# " $0; next } { print } ' "$FILE_PATH" > "$TEMP_FILE" mv "$TEMP_FILE" "$FILE_PATH" fi # Check whether --enable-perf_event_uncore was given.
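# Illustrative invocations (sketches) for the enable/disable switches handled
# in this region; spellings are inferred from the enable_cpu, enable_perf_event
# and enable_perf_event_uncore variables this script reads:
#   ./configure --enable-cpu=no               # no CPU counters; test targets commented out
#   ./configure --enable-perf_event=no        # drop the perf_event component
#   ./configure --enable-perf_event_uncore=no # keep perf_event, drop uncore support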
if test "${enable_perf_event_uncore+set}" = set; then : enableval=$enable_perf_event_uncore; fi if test "x$enable_perf_event_uncore" = "xno"; then : perf_events_uncore=no fi # Check whether --with-perf_events was given. if test "${with_perf_events+set}" = set; then : withval=$with_perf_events; force_perf_events=yes user_specified_interface=pe fi # RDPMC support # Check whether --enable-perfevent_rdpmc was given. if test "${enable_perfevent_rdpmc+set}" = set; then : enableval=$enable_perfevent_rdpmc; case "${enableval}" in yes) enable_perfevent_rdpmc=true ;; no) enable_perfevent_rdpmc=false ;; *) as_fn_error $? "bad value ${enableval} for --enable-perfevent-rdpmc" "$LINENO" 5 ;; esac else enable_perfevent_rdpmc=true fi if test "$enable_perfevent_rdpmc" = "true"; then PECFLAGS="$PECFLAGS -DUSE_PERFEVENT_RDPMC=1" fi # Uncore support # Check whether --with-pe_incdir was given. if test "${with_pe_incdir+set}" = set; then : withval=$with_pe_incdir; pe_incdir=$withval force_perf_events=yes user_specified_interface=pe else pe_incdir=$pfm_incdir/perfmon fi # Check for perf_event.h if test "$force_perf_events" = "yes"; then perf_events="yes" fi if test "$cross_compiling" = "no"; then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for /proc/sys/kernel/perf_event_paranoid" >&5 $as_echo_n "checking for /proc/sys/kernel/perf_event_paranoid... " >&6; } if ${ac_cv_file__proc_sys_kernel_perf_event_paranoid+:} false; then : $as_echo_n "(cached) " >&6 else test "$cross_compiling" = yes && as_fn_error $?
"cannot check for file existence when cross compiling" "$LINENO" 5 if test -r "/proc/sys/kernel/perf_event_paranoid"; then ac_cv_file__proc_sys_kernel_perf_event_paranoid=yes else ac_cv_file__proc_sys_kernel_perf_event_paranoid=no fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_file__proc_sys_kernel_perf_event_paranoid" >&5 $as_echo "$ac_cv_file__proc_sys_kernel_perf_event_paranoid" >&6; } if test "x$ac_cv_file__proc_sys_kernel_perf_event_paranoid" = xyes; then : have_paranoid=yes as_ac_File=`$as_echo "ac_cv_file_$pe_incdir/perf_event.h" | $as_tr_sh` { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $pe_incdir/perf_event.h" >&5 $as_echo_n "checking for $pe_incdir/perf_event.h... " >&6; } if eval \${$as_ac_File+:} false; then : $as_echo_n "(cached) " >&6 else test "$cross_compiling" = yes && as_fn_error $? "cannot check for file existence when cross compiling" "$LINENO" 5 if test -r "$pe_incdir/perf_event.h"; then eval "$as_ac_File=yes" else eval "$as_ac_File=no" fi fi eval ac_res=\$$as_ac_File { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } if eval test \"x\$"$as_ac_File"\" = x"yes"; then : if test "$perf_events" != "no"; then perf_events="yes" fi fi fi fi if test "$perf_events" = "yes"; then PECFLAGS="$PECFLAGS -DPEINCLUDE=\\\"$pe_incdir/perf_event.h\\\"" fi # # Sort out the choice of the user vs. what we detected # # MESSING WITH CFLAGS IS STUPID! # if test "$user_specified_interface" != "no"; then if test "$user_specified_interface" = "perfmon"; then perf_events="no" PAPICFLAGS+=" $PFMCFLAGS" perfnec=0 else if test "$user_specified_interface" = "pe"; then perfmon=0 PAPICFLAGS+=" $PECFLAGS" perfnec=0 else if test "$user_specified_interface" = "perfnec"; then perfmon=0 perf_events=0 PAPICFLAGS+=" -DPERFNEC" { $as_echo "$as_me:${as_lineno-$LINENO}: XXXXX user_specified_interface perfnec" >&5 $as_echo "$as_me: XXXXX user_specified_interface perfnec" >&6;} else as_fn_error $? 
"\"Unknown user_specified_interface=$user_specified_interface perfmon=$perfmon perfmon2=$perfmon2 perf-events=$perf_events perfnec=$perfnec\"" "$LINENO" 5 fi fi fi else if test "$perfmon" != 0; then PAPICFLAGS+=" $PFMCFLAGS" fi if test "$perf_events" = "yes"; then PAPICFLAGS+=" $PECFLAGS" fi fi # # User has made no choice, so we default to the ordering below in the platform section, if # we detect more than one. # # # What does this next section do? It determines whether or not to run the tests for libpfm # based on the settings of pfm_root, pfm_prefix, pfm_incdir, pfm_libdir # # Both should be 0 for NEC if test "$perfmon" != 0 -o "$perf_events" = "yes"; then # if prefix set, then yes if test "x$pfm_prefix" != "x"; then dotest=1 # if root not set and libdir set, then yes elif test "x$pfm_root" = "x" -a "x$pfm_libdir" != "x"; then dotest=1 else dotest=0 fi if test "$dotest" = 1; then LIBS="-L$pfm_libdir -lpfm" CPPFLAGS="$CPPFLAGS -I$pfm_incdir" { $as_echo "$as_me:${as_lineno-$LINENO}: checking for pfm_initialize in -lpfm" >&5 $as_echo_n "checking for pfm_initialize in -lpfm... " >&6; } if ${ac_cv_lib_pfm_pfm_initialize+:} false; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-lpfm $LIBS" cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ /* Override any GCC internal prototype to avoid an error. Use char because int might match the return type of a GCC builtin and then its argument prototype would still apply. 
*/ #ifdef __cplusplus extern "C" #endif char pfm_initialize (); int main () { return pfm_initialize (); ; return 0; } _ACEOF if ac_fn_c_try_link "$LINENO"; then : ac_cv_lib_pfm_pfm_initialize=yes else ac_cv_lib_pfm_pfm_initialize=no fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_pfm_pfm_initialize" >&5 $as_echo "$ac_cv_lib_pfm_pfm_initialize" >&6; } if test "x$ac_cv_lib_pfm_pfm_initialize" = xyes; then : for ac_header in perfmon/pfmlib.h do : ac_fn_c_check_header_mongrel "$LINENO" "perfmon/pfmlib.h" "ac_cv_header_perfmon_pfmlib_h" "$ac_includes_default" if test "x$ac_cv_header_perfmon_pfmlib_h" = xyes; then : cat >>confdefs.h <<_ACEOF #define HAVE_PERFMON_PFMLIB_H 1 _ACEOF if test "$arch" = "ia64"; then for ac_header in perfmon/pfmlib_montecito.h do : ac_fn_c_check_header_mongrel "$LINENO" "perfmon/pfmlib_montecito.h" "ac_cv_header_perfmon_pfmlib_montecito_h" "$ac_includes_default" if test "x$ac_cv_header_perfmon_pfmlib_montecito_h" = xyes; then : cat >>confdefs.h <<_ACEOF #define HAVE_PERFMON_PFMLIB_MONTECITO_H 1 _ACEOF fi done fi ac_fn_c_check_func "$LINENO" "pfm_get_event_description" "ac_cv_func_pfm_get_event_description" if test "x$ac_cv_func_pfm_get_event_description" = xyes; then : $as_echo "#define HAVE_PFM_GET_EVENT_DESCRIPTION 1" >>confdefs.h fi ac_fn_c_check_member "$LINENO" "pfmlib_reg_t" "reg_evt_idx" "ac_cv_member_pfmlib_reg_t_reg_evt_idx" "#include \"perfmon/pfmlib.h\" " if test "x$ac_cv_member_pfmlib_reg_t_reg_evt_idx" = xyes; then : $as_echo "#define HAVE_PFM_REG_EVT_IDX 1" >>confdefs.h fi ac_fn_c_check_member "$LINENO" "pfmlib_output_param_t" "pfp_pmd_count" "ac_cv_member_pfmlib_output_param_t_pfp_pmd_count" "#include \"perfmon/pfmlib.h\" " if test "x$ac_cv_member_pfmlib_output_param_t_pfp_pmd_count" = xyes; then : $as_echo "#define HAVE_PFMLIB_OUTPUT_PFP_PMD_COUNT 1" >>confdefs.h fi ac_fn_c_check_member "$LINENO" 
"pfm_msg_t" "type" "ac_cv_member_pfm_msg_t_type" "#include \"perfmon/perfmon.h\" " if test "x$ac_cv_member_pfm_msg_t_type" = xyes; then : $as_echo "#define HAVE_PFM_MSG_TYPE 1" >>confdefs.h fi else as_fn_error $? "perfmon/pfmlib.h not found, rerun configure with different flags" "$LINENO" 5 fi done else as_fn_error $? "libpfm.a not found, rerun configure with different flags" "$LINENO" 5 fi else $as_echo "#define HAVE_PERFMON_PFMLIB_MONTECITO_H 1" >>confdefs.h $as_echo "#define HAVE_PFM_GET_EVENT_DESCRIPTION 1" >>confdefs.h $as_echo "#define HAVE_PFMLIB_OUTPUT_PFP_PMD_COUNT 1" >>confdefs.h fi fi ################################################## # Checking platform ################################################## { $as_echo "$as_me:${as_lineno-$LINENO}: checking platform" >&5 $as_echo_n "checking platform... " >&6; } case "$OS" in nec) MAKEVER=nec-nec ;; aix) MAKEVER="$OS"-"$CPU" ;; bgp) MAKEVER=bgp ;; bgq) MAKEVER=bgq ;; CLE) if test "$perfmon2" = "yes"; then # major_version=`echo $OSVER | sed 's/\([[^.]][[^.]]*\).*/\1/'` # minor_version=`echo $OSVER | sed 's/[[^.]][[^.]]*.\([[^.]][[^.]]*\).*/\1/'` # point_version=`echo $OSVER | sed -e 's/[[^.]][[^.]]*.[[^.]][[^.]]*.\(.*\)/\1/' -e 's/[[^0-9]].*//'` # if (test "$major_version" = 2 -a "$minor_version" = 6 -a "$point_version" -lt 31 -a "$perfmon2" != "yes" ); then MAKEVER="$OS"-perfmon2 else MAKEVER="$OS"-pe fi ;; freebsd) MAKEVER="freebsd" LDFLAGS="-lpmc" # HWPMC driver is available for FreeBSD >= 6 FREEBSD_VERSION=`uname -r | cut -d'.' -f1` if test "${FREEBSD_VERSION}" -lt 6 ; then as_fn_error $? "PAPI requires FreeBSD 6 or greater" "$LINENO" 5 fi # Determine if HWPMC module is on the kernel dmesg | grep hwpmc 2> /dev/null > /dev/null if test "$?" != "0" ; then as_fn_error $? "HWPMC module not found. 
(see INSTALL.TXT)" "$LINENO" 5 fi # Determine the number of counters echo "/* Automatically generated file by configure */" > freebsd-config.h echo "#ifndef _FREEBSD_CONFIG_H_" >> freebsd-config.h echo "#define _FREEBSD_CONFIG_H_" >> freebsd-config.h echo "" >> freebsd-config.h cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <sys/types.h> #include <pmc.h> int main () { int i = pmc_init(); ; return 0; } _ACEOF if ac_fn_c_try_link "$LINENO"; then : pmc_pmc_init_linked="yes" else pmc_pmc_init_linked="no" fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext if test "${pmc_pmc_init_linked}" = "no" ; then as_fn_error $? "Failed to link hwpmc example" "$LINENO" 5 fi if test "$cross_compiling" = yes; then : { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot run test program while cross compiling See \`config.log' for more details" "$LINENO" 5; } else cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include <sys/types.h> #include <pmc.h> int main() { const struct pmc_cpuinfo *info; if (pmc_init() < 0) return 0; if (pmc_cpuinfo (&info) < 0) return 0; return info->pm_npmc-1; } _ACEOF if ac_fn_c_try_run "$LINENO"; then : num_counters="0" else num_counters="$?" fi rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \ conftest.$ac_objext conftest.beam conftest.$ac_ext fi if test "${num_counters}" = "0" ; then as_fn_error $? "pmc_npmc info returned 0.
Determine if the HWPMC module is loaded (see hwpmc(4))" "$LINENO" 5 fi echo "#define HWPMC_NUM_COUNTERS ${num_counters}" >> freebsd-config.h echo "" >> freebsd-config.h echo "#endif" >> freebsd-config.h ;; linux) if test "$force_perf_events" = "yes" ; then MAKEVER="$OS"-pe elif test "$force_perfmon2" = "yes" ; then MAKEVER="$OS"-perfmon2 elif test "$perf_events" = "yes" ; then MAKEVER="$OS"-pe elif test "$perfmon2" = "yes" ; then MAKEVER="$OS"-perfmon2 elif test "$old_pfmv2" = "y" ; then MAKEVER="$OS"-pfm-"$CPU" else MAKEVER="$OS"-generic fi ;; solaris) if test "$bitmode" = "64" -a "`isainfo -v | grep "64"`" = ""; then as_fn_error $? "The bitmode you specified is not supported" "$LINENO" 5 fi MAKEVER="$OS"-"$CPU" ;; darwin) MAKEVER="$OS" ;; esac { $as_echo "$as_me:${as_lineno-$LINENO}: result: $MAKEVER" >&5 $as_echo "$MAKEVER" >&6; } if test "x$MAKEVER" = "x"; then { $as_echo "$as_me:${as_lineno-$LINENO}: This platform is not supported so a generic build without CPU counters will be used" >&5 $as_echo "$as_me: This platform is not supported so a generic build without CPU counters will be used" >&6;} MAKEVER="generic_platform" fi ################################################## # Set build macros ################################################## FILENAME=Makefile.inc SHOW_CONF=showconf CTEST_TARGETS="all" FTEST_TARGETS="all" LIBRARY=libpapi.a SHLIB='libpapi.so.7.2.0.0' PAPISOVER='$(PAPIVER).$(PAPIREV)' VLIB='libpapi.so.$(PAPISOVER)' OMPCFLGS=-fopenmp CC_R='$(CC) -pthread' CC_SHR='$(CC) -fPIC -DPIC -shared -Wl,-soname -Wl,$(VLIB) -Xlinker "-rpath" -Xlinker "$(LIBDIR)"' if test "$CC_COMMON_NAME" = "gcc"; then if test "$bitmode" = "32"; then BITFLAGS=-m32 elif test "$bitmode" = "64"; then BITFLAGS=-m64 fi fi OPTFLAGS="$OPTFLAGS" PAPICFLAGS+=" -D_REENTRANT -D_GNU_SOURCE $NOTLS" CFLAGS="$CFLAGS $BITFLAGS" FFLAGS="$CFLAGS $BITFLAGS $FFLAGS -Dlinux" # OS Support if (test "$OS" = "aix"); then OSFILESSRC=aix-memory.c OSLOCK=aix-lock.h OSCONTEXT=aix-context.h elif (test 
"$OS" = "bgp"); then OSFILESSRC=linux-bgp-memory.c OSLOCK=linux-bgp-lock.h OSCONTEXT=linux-bgp-context.h elif (test "$OS" = "bgq"); then OSFILESSRC=linux-bgq-memory.c OSLOCK=linux-bgq-lock.h OSCONTEXT=linux-context.h elif (test "$OS" = "freebsd"); then OSFILESSRC=freebsd-memory.c OSLOCK="freebsd-lock.h" OSCONTEXT="freebsd-context.h" elif (test "$OS" = "nec"); then OSFILESSRC="linux-memory.c linux-timer.c linux-common.c" OSFILESHDR="linux-memory.h linux-timer.h linux-common.h" OSLOCK="linux-lock.h" OSCONTEXT="linux-context.h" elif (test "$OS" = "linux"); then OSFILESSRC="linux-memory.c linux-timer.c linux-common.c" OSFILESHDR="linux-memory.h linux-timer.h linux-common.h" OSLOCK="linux-lock.h" OSCONTEXT="linux-context.h" elif (test "$OS" = "solaris"); then OSFILESSRC="solaris-memory.c solaris-common.c" OSFILESHDR="solaris-memory.h solaris-common.h" OSLOCK="solaris-lock.h" OSCONTEXT="solaris-context.h" elif (test "$OS" = "darwin"); then OSFILESSRC="darwin-memory.c darwin-common.c" OSFILESHDR="darwin-memory.h darwin-common.h" OSLOCK="darwin-lock.h" OSCONTEXT="darwin-context.h" fi OSFILESOBJ='$(OSFILESSRC:.c=.o)' if (test "$MAKEVER" = "aix-power5" || test "$MAKEVER" = "aix-power6" || test "$MAKEVER" = "aix-power7"); then if test "$bitmode" = "64"; then LIBRARY=libpapi64.a SHLIB=libpapi64.so # By default AIX enforces a limit on heap space #( limiting the heap to share the same 256MB memory segment as stack ) # changing the max data paramater moves the heap off the stack's memory segment BITFLAGS='-q64 -bmaxdata:0x07000000000000' ARG64=-X64 else # If the issue ever comes up, /dsa requires AIX v5.1 or higher # and the Large address-space model (-bmaxdata) requires v4.3 or later # see http://publib.boulder.ibm.com/infocenter/pseries/v5r3/topic/com.ibm.aix.genprogc/doc/genprogc/lrg_prg_support.htm#a179c11c5d SHLIB=libpapi.so BITFLAGS="-bmaxdata:0x80000000/dsa" fi CPUCOMPONENT_NAME=aix CPUCOMPONENT_C=aix.c CPUCOMPONENT_OBJ=aix.o VECTOR=_aix_vector 
PAPI_EVENTS_CSV="papi_events.csv" MISCHDRS="aix.h papi_events_table.h" MISCSRCS="aix.c" CFLAGS+='-qenum=4 -DNO_VARARG_MACRO -D_AIX -D_$(CPU_MODEL) -DNEED_FFSLL -DARCH_EVTS=\"$(ARCH_EVENTS).h\" -DCOMP_VECTOR=_ppc64_vectors -DSTATIC_PAPI_EVENTS_TABLE' FFLAGS+='-WF,-D_$(CPU_MODEL) -WF,-DARCH_EVTS=\"$(ARCH_EVENTS).h\"' CFLAGS+='-I$(PMAPI)/include -qmaxmem=-1 -qarch=$(cpu_option) -qtune=$(cpu_option) -qlanglvl=extended $(BITFLAGS)' if test $debug != "yes"; then OPTFLAGS='-O3 -qstrict $(PMINIT)' else OPTFLAGS='$(PMINIT)' fi SMPCFLGS=-qsmp OMPCFLGS='-qsmp=omp' LDFLAGS='-L$(PMAPI)/lib -lpmapi' CC_R=xlc_r CC=xlc CC_SHR="xlc -G -bnoentry" for ac_prog in mpicc mpcc do # Extract the first word of "$ac_prog", so it can be a program name with args. set dummy $ac_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } if ${ac_cv_prog_MPICC+:} false; then : $as_echo_n "(cached) " >&6 else if test -n "$MPICC"; then ac_cv_prog_MPICC="$MPICC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_MPICC="$ac_prog" $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done IFS=$as_save_IFS fi fi MPICC=$ac_cv_prog_MPICC if test -n "$MPICC"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: $MPICC" >&5 $as_echo "$MPICC" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi test -n "$MPICC" && break done F77=xlf CPP='xlc -E $(CPPFLAGS)' if test "$MAKEVER" = "aix-power5"; then ARCH_EVENTS=power5_events CPU_MODEL=POWER5 cpu_option=pwr5 DESCR="AIX 5.1.0 or greater with POWER5" if test "$bitmode" = "64"; then DESCR="$DESCR 64 bit build" fi elif test "$MAKEVER" = "aix-power6"; then ARCH_EVENTS=power6_events CPU_MODEL=POWER6 cpu_option=pwr6 DESCR="AIX 5.1.0 or greater with POWER6" CPPFLAGS="-qlanglvl=extended" if test "$bitmode" = "64"; then DESCR="$DESCR 64 bit build" fi elif test "$MAKEVER" = "aix-power7"; then ARCH_EVENTS=power7_events CPU_MODEL=POWER7 cpu_option=pwr7 DESCR="AIX 5.1.0 or greater with POWER7" CPPFLAGS="-qlanglvl=extended" if test "$bitmode" = "64"; then DESCR="$DESCR 64 bit build" fi fi elif test "$MAKEVER" = "bgp"; then CPP="$CC -E" CPUCOMPONENT_NAME=linux-bgp CPUCOMPONENT_C=linux-bgp.c CPUCOMPONENT_OBJ=linux-bgp.o VECTOR=_bgp_vectors PAPI_EVENTS_CSV="papi_events.csv" MISCSRCS= CFLAGS='-g -gdwarf-2 -O2 -Wall -I. -I$(BGP_SYSDIR)/arch/include -DCOMP_VECTOR=_bgp_vectors' tests="$tests bgp_tests" SHOW_CONF=show_bgp_conf BGP_SYSDIR=/bgsys/drivers/ppcfloor BGP_GNU_LINUX_PATH='${BGP_SYSDIR}/gnu-linux' LDFLAGS='-L$(BGP_SYSDIR)/runtime/SPI -lSPI.cna' FFLAGS='-g -gdwarf-2 -O2 -Wall -I. 
-Dlinux' OPTFLAGS="-g -Wall -O3" TOPTFLAGS="-g -Wall -O0" SHLIB=libpapi.so DESCR="Linux for BlueGene/P" LIBS=static CC_SHR='$(CC) -shared -Xlinker "-soname" -Xlinker "$(SHLIB)" -Xlinker "-rpath" -Xlinker "$(LIBDIR)"' OMPCFLGS="" elif test "$MAKEVER" = "bgq"; then FILENAME=Rules.bgpm VECTOR=_bgq_vectors CPUCOMPONENT_NAME=linux-bgq CPUCOMPONENT_C=linux-bgq.c CPUCOMPONENT_OBJ=linux-bgq.o PAPI_EVENTS_CSV="papi_events.csv" MISCSRCS="linux-bgq-common.c" OPTFLAGS="-g -Wall -O3" TOPTFLAGS="-g -Wall -O0" SHLIB=libpapi.so DESCR="Linux for Blue Gene/Q" CC_SHR='$(CC) -fPIC -DPIC -shared -Wl,-soname -Wl,$(SHLIB) -Xlinker "-rpath" -Xlinker "$(LIBDIR)"' OMPCFLGS="" elif test "$MAKEVER" = "CLE-perfmon2"; then FILENAME=Rules.perfmon2 CPUCOMPONENT_NAME=perfmon CPUCOMPONENT_C=perfmon.c CPUCOMPONENT_OBJ=perfmon.o VECTOR=_papi_pfm_vector PAPI_EVENTS_CSV="papi_events.csv" F77=gfortran CFLAGS="$CFLAGS -D__crayxt" FFLAGS="" elif test "$MAKEVER" = "freebsd"; then CPUCOMPONENT_NAME=freebsd CPUCOMPONENT_C=freebsd.c CPUCOMPONENT_OBJ=freebsd.o VECTOR=_papi_freebsd_vector PAPI_EVENTS_CSV="freebsd_events.csv" MISCHDRS="freebsd/map-unknown.h freebsd/map.h freebsd/map-p6.h freebsd/map-p6-m.h freebsd/map-p6-3.h freebsd/map-p6-2.h freebsd/map-p6-c.h freebsd/map-k7.h freebsd/map-k8.h freebsd/map-p4.h freebsd/map-atom.h freebsd/map-core.h freebsd/map-core2.h freebsd/map-core2-extreme.h freebsd/map-i7.h freebsd/map-westmere.h" MISCSRCS="$MISCSRCS freebsd/map-unknown.c freebsd/map.c freebsd/map-p6.c freebsd/map-p6-m.c freebsd/map-p6-3.c freebsd/map-p6-2.c freebsd/map-p6-c.c freebsd/map-k7.c freebsd/map-k8.c freebsd/map-p4.c freebsd/map-atom.c freebsd/map-core.c freebsd/map-core2.c freebsd/map-core2-extreme.c freebsd/map-i7.c freebsd/map-westmere.c" DESCR="FreeBSD -over libpmc- " CFLAGS+=" -I. -Ifreebsd -DPIC -fPIC" CC_SHR='$(CC) -shared -Xlinker "-soname" -Xlinker "libpapi.so" -Xlinker "-rpath" -Xlinker "$(LIBDIR)" -DPIC -fPIC -I.
-Ifreebsd' elif test "$MAKEVER" = "linux-generic"; then CPUCOMPONENT_NAME=linux-generic CPUCOMPONENT_C=linux-generic.c CPUCOMPONENT_OBJ=linux-generic.o PAPI_EVENTS_CSV="papi_events.csv" VECTOR=_papi_dummy_vector elif test "$MAKEVER" = "linux-pe"; then FILENAME=Rules.pfm4_pe CPUCOMPONENT_NAME=perf_event if test "$perf_events" = "no"; then components="$components" else components="$components perf_event" fi if test "$perf_events_uncore" = "no"; then components="$components" else components="$components perf_event_uncore" fi elif test "$MAKEVER" = "nec-nec"; then FILENAME=Rules.perfnec CPUCOMPONENT_NAME=perfnec components="perfnec" elif test "$MAKEVER" = "linux-perfmon2"; then FILENAME=Rules.perfmon2 CPUCOMPONENT_NAME=perfmon2 components="perfmon2" elif (test "$MAKEVER" = "linux-pfm-ia64" || test "$MAKEVER" = "linux-pfm-itanium2" || test "$MAKEVER" = "linux-pfm-montecito"); then FILENAME=Rules.pfm CPUCOMPONENT_NAME=perfmon-ia64 components="perfmon_ia64" VERSION=3.y if test "$MAKEVER" = "linux-pfm-itanium2"; then CPU=2 else CPU=3 fi CFLAGS="$CFLAGS -DITANIUM$CPU" FFLAGS="$FFLAGS -DITANIUM$CPU" CC_SHR='$(CC) -fPIC -DPIC -shared -Wl,-soname -Wl,$(SHLIB) -Xlinker "-rpath" -Xlinker "$(LIBDIR)"' elif test "$MAKEVER" = "solaris-ultra"; then CPUCOMPONENT_NAME=solaris-ultra CPUCOMPONENT_C=solaris-ultra.c CPUCOMPONENT_OBJ=solaris-ultra.obj VECTOR=_solaris_vector PAPI_EVENTS_CSV="papi_events.csv" DESCR="Solaris 5.8 or greater with UltraSPARC I, II or III" if test "$CC" = "gcc"; then F77=g77 CPP="$CC -E" CC_R="$CC" CC_SHR="$CC -shared -fpic" OPTFLAGS=-O3 CFLAGS="$CFLAGS -DNEED_FFSLL" FFLAGS=$CFLAGS else # Sun Workshop compilers: V5.0 and V6.0 R2 CPP="$CC -E" CC_R="$CC -mt" CC_SHR="$CC -ztext -G -Kpic" CFLAGS="-xtarget=ultra3 -xarch=v8plusa -DNO_VARARG_MACRO -D__EXTENSIONS__ -DPAPI_NO_MEMORY_MANAGEMENT -DCOMP_VECTOR=_solaris_vectors" SMPCFLGS=-xexplicitpar OMPCFLGS=-xopenmp F77=f90 FFLAGS=$CFLAGS NOOPT=-xO0 OPTFLAGS="-g -fast -xtarget=ultra3 -xarch=v8plusa" fi LDFLAGS="$LDFLAGS 
-lcpc" if test "$bitmode" = "64"; then LIBRARY=libpapi64.a SHLIB=libpapi64.so CFLAGS="-xtarget=ultra3 -xarch=v9a -DNO_VARARG_MACRO -D__EXTENSIONS__ -DPAPI_NO_MEMORY_MANAGEMENT -DCOMP_VECTOR=_solaris_vectors" OPTFLAGS="-g -fast -xtarget=ultra3 -xarch=v9a" fi elif test "$MAKEVER" = "solaris-niagara2"; then CPUCOMPONENT_NAME=solaris-niagara2 CPUCOMPONENT_C=solaris-niagara2.c CPUCOMPONENT_OBJ=solaris-niagara2.obj VECTOR=_niagara2_vector PAPI_EVENTS_CSV="papi_events.csv" CFLAGS="-xtarget=native -xarch=native -DNO_VARARG_MACRO -D__EXTENSIONS__ -DPAPI_NO_MEMORY_MANAGEMENT -DCOMP_VECTOR=_niagara2_vector" DESCR="Solaris 10 with libcpc2 and UltraSPARC T2 (Niagara 2)" CPP="$CC -E" CC_R="$CC -mt" CC_SHR="$CC -ztext -G -Kpic" SMPCFLGS=-xexplicitpar OMPCFLGS=-xopenmp F77=f90 FFLAGS=$CFLAGS NOOPT=-xO0 OPTFLAGS="-fast" FOPTFLAGS=$OPTFLAGS LDFLAGS="$LDFLAGS -lcpc" if test "$bitmode" = "64"; then LIBRARY=libpapi64.a SHLIB=libpapi64.so CFLAGS="$CFLAGS -m64" FFLAGS="$FFLAGS -m64" fi elif test "$MAKEVER" = "darwin"; then DESCR="Darwin" CPUCOMPONENT_NAME=darwin CPUCOMPONENT_C=linux-generic.c CPUCOMPONENT_OBJ=linux-generic.obj CFLAGS="-DNEED_FFSLL" CC_SHR='$(CC) -fPIC -DPIC -shared -Wl,-dylib -Xlinker "-rpath" -Xlinker "$(LIBDIR)"' SHLIB=libpapi.dylib elif test "$MAKEVER" = "generic_platform"; then DESCR="Generic platform" fi MISCOBJS='$(MISCSRCS:.c=.o)' if test "$F77" = "pgf77"; then FFLAGS="$FFLAGS -Wall -Mextend" elif test "$F77" = "ifort"; then FFLAGS="$FFLAGS -warn all" elif test "$F77" != "xlf"; then FFLAGS="$FFLAGS -ffixed-line-length-132" fi if test "$CC_COMMON_NAME" = "icc"; then OMPCFLGS=-qopenmp fi ## By default we want the sysdetect component built, so if the user does not ## give an option we will add it to the list. { $as_echo "$as_me:${as_lineno-$LINENO}: checking for building sysdetect" >&5 $as_echo_n "checking for building sysdetect... " >&6; } # Check whether --with-sysdetect was given. 
if test "${with_sysdetect+set}" = set; then : withval=$with_sysdetect; else with_sysdetect=yes fi # Enable sysdetect unless the user has explicitly told us not to. if test "$with_sysdetect" = "yes"; then { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 $as_echo "no" >&6; } fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for components to build" >&5 $as_echo_n "checking for components to build... " >&6; } COMPONENT_RULES=components/Rules.components echo "/* Automatically generated by configure */" > components_config.h echo "#ifndef COMPONENTS_CONFIG_H" >> components_config.h echo "#define COMPONENTS_CONFIG_H" >> components_config.h echo "" >> components_config.h # Check whether --with-components was given. if test "${with_components+set}" = set; then : withval=$with_components; if test -n "${with_components//[[:space:]]/}"; then components="$components $withval" fi fi # Enable sysdetect unless the user has explicitly told us not to. if test "$with_sysdetect" = "yes"; then if test "$perf_events" != "no"; then components="$components sysdetect" fi fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $components" >&5 $as_echo "$components" >&6; } # Check whether rocm and rocp_sdk were configured together rocm_found=0 rocp_sdk_found=0 for comp in $components do if test "$comp" = "rocm"; then rocm_found=1 fi if test "$comp" = "rocp_sdk"; then rocp_sdk_found=1 fi done if test $rocm_found -eq 1 && test $rocp_sdk_found -eq 1; then echo "WARNING: Components rocm and rocp_sdk should not be configured together. See components/rocm/README.md for more details." fi # This is an ugly hack to keep building on configurations covered by any-null in the past. 
if test "$VECTOR" = "_papi_dummy_vector"; then if test "x$components" = "x"; then echo "papi_vector_t ${VECTOR} = {" >> components_config.h echo " .size = { .context = sizeof ( int ), .control_state = sizeof ( int ), .reg_value = sizeof ( int ), .reg_alloc = sizeof ( int ), }, .cmp_info = { .num_native_events = 0, .num_preset_events = 0, .num_cntrs = 0, .name = \"No Components Configured. \", .short_name = \"UNSUPPORTED!\" }, .dispatch_timer = NULL, .get_overflow_address = NULL, .start = NULL, .stop = NULL, .read = NULL, .reset = NULL, .write = NULL, .cleanup_eventset = NULL, .stop_profiling = NULL, .init_component = NULL, .init_thread = NULL, .init_control_state = NULL, .update_control_state = NULL, .ctl = NULL, .set_overflow = NULL, .set_profile = NULL, .set_domain = NULL, .ntv_enum_events = NULL, .ntv_name_to_code = NULL, .ntv_code_to_name = NULL, .ntv_code_to_descr = NULL, .ntv_code_to_bits = NULL, .ntv_code_to_info = NULL, .allocate_registers = NULL, .shutdown_thread = NULL, .shutdown_component = NULL, .user = NULL, };" >> components_config.h # but in the face of actual components, we don't have to do hacky size games else VECTOR="" fi elif test "x$VECTOR" != "x"; then echo "extern papi_vector_t ${VECTOR};" >> components_config.h fi # construct papi_components_config_event_defs.h echo "#ifndef _PAPICOMPCFGEVENTDEFS" > papi_components_config_event_defs.h echo "#define _PAPICOMPCFGEVENTDEFS" >> papi_components_config_event_defs.h echo "" >> papi_components_config_event_defs.h numLine=`grep "#define PAPI_MAX_PRESET_EVENTS" papiStdEventDefs.h` sumNum=`echo ${numLine} | awk '{print $3}'` for comp in $components; do idx=`echo "$comp" | sed -n "s/\/.*//p" | wc -c` if test "$idx" = 0; then subcomp=$comp else subcomp=`echo $comp | sed -E "s/^.{${idx}}//"` fi if test "${subcomp}" != "perf_event"; then subcomp_defs_inc=components/${subcomp}/papi_${subcomp}_std_event_defs.h if test -f ${subcomp_defs_inc}; then `cp ${subcomp_defs_inc} ./` `echo "#define 
PAPI_${subcomp}_PRESET_OFFSET ${sumNum}" >> papi_components_config_event_defs.h` numLine=`grep "#define PAPI_MAX_${subcomp}_PRESETS" ${subcomp_defs_inc}` singleNum=`echo ${numLine} | awk '{print $3}'` sumNum=$(( ${sumNum} + ${singleNum} )) fi fi done echo "" >> papi_components_config_event_defs.h for comp in $components; do idx=`echo "$comp" | sed -n "s/\/.*//p" | wc -c` if test "$idx" = 0; then subcomp=$comp else subcomp=`echo $comp | sed -E "s/^.{${idx}}//"` fi if test "${subcomp}" != "perf_event"; then subcomp_defs_inc=components/${subcomp}/papi_${subcomp}_std_event_defs.h if test -f ${subcomp_defs_inc}; then `echo "#include \"papi_${subcomp}_std_event_defs.h\"" >> papi_components_config_event_defs.h` fi fi done echo "" >> papi_components_config_event_defs.h echo "#endif" >> papi_components_config_event_defs.h # includes for preset headers for comp in $components; do idx=`echo "$comp" | sed -n "s/\/.*//p" | wc -c` if test "$idx" = 0; then subcomp=$comp else subcomp=`echo $comp | sed -E "s/^.{${idx}}//"` fi if test "${subcomp}" != "perf_event"; then subcomp_preset_inc=components/${subcomp}/papi_${subcomp}_presets.h as_ac_File=`$as_echo "ac_cv_file_${subcomp_preset_inc}" | $as_tr_sh` { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ${subcomp_preset_inc}" >&5 $as_echo_n "checking for ${subcomp_preset_inc}... " >&6; } if eval \${$as_ac_File+:} false; then : $as_echo_n "(cached) " >&6 else test "$cross_compiling" = yes && as_fn_error $? 
"cannot check for file existence when cross compiling" "$LINENO" 5 if test -r "${subcomp_preset_inc}"; then eval "$as_ac_File=yes" else eval "$as_ac_File=no" fi fi eval ac_res=\$$as_ac_File { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } if eval test \"x\$"$as_ac_File"\" = x"yes"; then : `echo "#include \"${subcomp_preset_inc}\"" >> components_config.h` fi fi done echo "" >> components_config.h # array tracking max number of presets per component echo "int _papi_hwi_max_presets[] = {" >> components_config.h for comp in $components; do idx=`echo "$comp" | sed -n "s/\/.*//p" | wc -c` if test "$idx" = 0; then subcomp=$comp else subcomp=`echo $comp | sed -E "s/^.{${idx}}//"` fi if test "${subcomp}" != "perf_event"; then subcomp_preset_inc=components/${subcomp}/papi_${subcomp}_presets.h if test -f ${subcomp_preset_inc}; then `echo " PAPI_MAX_${subcomp}_PRESETS," >> components_config.h` else `echo " 0," >> components_config.h` fi else `echo " PAPI_MAX_PRESET_EVENTS," >> components_config.h` fi done echo " 0" >> components_config.h echo "};" >> components_config.h echo "" >> components_config.h # preset arrays echo "hwi_presets_t *_papi_hwi_comp_presets[] = {" >> components_config.h for comp in $components; do idx=`echo "$comp" | sed -n "s/\/.*//p" | wc -c` if test "$idx" = 0; then subcomp=$comp else subcomp=`echo $comp | sed -E "s/^.{${idx}}//"` fi if test "${subcomp}" != "perf_event"; then subcomp_preset_inc=components/${subcomp}/papi_${subcomp}_presets.h if test -f ${subcomp_preset_inc}; then `echo " _${subcomp}_presets," >> components_config.h` else `echo " NULL," >> components_config.h` fi else `echo " _papi_hwi_presets," >> components_config.h` fi done echo " NULL" >> components_config.h echo "};" >> components_config.h echo "" >> components_config.h PAPI_NUM_COMP=0 for comp in $components; do idx=`echo "$comp" | sed -n "s/\/.*//p" | wc -c` if test "$idx" = 0; then subcomp=$comp else subcomp=`echo $comp | sed -E 
"s/^.{${idx}}//"` fi COMPONENT_RULES="$COMPONENT_RULES components/$comp/Rules.$subcomp" echo "extern papi_vector_t _${subcomp}_vector;" >> components_config.h PAPI_NUM_COMP=$((PAPI_NUM_COMP+1)) done echo "" >> components_config.h echo "struct papi_vectors *_papi_hwd[] = {" >> components_config.h if test "x$VECTOR" != "x"; then echo " &${VECTOR}," >> components_config.h fi for comp in $components; do idx=`echo "$comp" | sed -n "s/\/.*//p" | wc -c` if test "$idx" = 0; then subcomp=$comp else subcomp=`echo $comp | sed -E "s/^.{${idx}}//"` fi echo " &_${subcomp}_vector," >> components_config.h done echo " NULL" >> components_config.h echo "};" >> components_config.h echo "" >> components_config.h echo "#endif" >> components_config.h # check for component tests for comp in $components; do if test "`find components/$comp -name "tests"`" != "" ; then COMPONENTS="$COMPONENTS $comp" fi done for comp in $components; do # check for SDE component to determine linking flags. if test "x$comp" = "xsde" ; then LDFLAGS="$LDFLAGS $LRT" LIBS="$LIBS $LRT" if test "$with_libsde" = "yes"; then if test "$shlib_tools" = "yes"; then LIBSDEFLAGS="-L${TOPDIR} -lsde -DSDE" else LIBSDEFLAGS="${TOPDIR}/libsde.a -DSDE" fi fi fi # check for intel_gpu or rocp_sdk component to determine if we need -lstdc++ in LDFLAGS if (test "x$comp" = "xintel_gpu" || test "x$comp" = "xrocp_sdk"); then LDFLAGS="$LDFLAGS -lstdc++ -pthread" fi if test "x$comp" = "xsysdetect" ; then if test "x`find $PAPI_CUDA_ROOT -name "cuda.h"`" != "x" ; then CFLAGS="$CFLAGS -DHAVE_CUDA" fi if test "x`find $PAPI_CUDA_ROOT -name "nvml.h"`" != "x" ; then CFLAGS="$CFLAGS -DHAVE_NVML" fi if test "x`find $PAPI_ROCM_ROOT -name "hsa.h"`" != "x" ; then CFLAGS="$CFLAGS -DHAVE_ROCM" fi if test "x`find $PAPI_ROCMSMI_ROOT -name "rocm_smi.h"`" != "x" ; then CFLAGS="$CFLAGS -DHAVE_ROCM_SMI" fi fi done CFLAGS="$CFLAGS -DPAPI_NUM_COMP=$PAPI_NUM_COMP" { $as_echo "$as_me:${as_lineno-$LINENO}: checking for PAPI event CSV filename to use" >&5 
$as_echo_n "checking for PAPI event CSV filename to use... " >&6; } if test "x$PAPI_EVENTS_CSV" = "x"; then PAPI_EVENTS_CSV="papi_events.csv" fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $PAPI_EVENTS_CSV" >&5 $as_echo "$PAPI_EVENTS_CSV" >&6; } # Check whether --enable-fortran was given. if test "${enable_fortran+set}" = set; then : enableval=$enable_fortran; else enable_fortran=yes fi if test "x$F77" != "x" -a "x$enable_fortran" = "xyes" ; then ac_config_commands="$ac_config_commands genpapifdef" FORT_WRAPPERS_SRC="upper_PAPI_FWRAPPERS.c papi_fwrappers_.c papi_fwrappers__.c" FORT_WRAPPERS_OBJ="upper_PAPI_FWRAPPERS.o papi_fwrappers_.o papi_fwrappers__.o" ENABLE_FORTRAN="$enable_fortran" ENABLE_FORTRAN_TESTS="$enable_fortran" FORT_HEADERS="fpapi.h f77papi.h f90papi.h" fi { $as_echo "$as_me:${as_lineno-$LINENO}: $FILENAME will be included in the generated Makefile" >&5 $as_echo "$as_me: $FILENAME will be included in the generated Makefile" >&6;} ac_config_files="$ac_config_files Makefile papi.pc" ac_config_files="$ac_config_files components/Makefile_comp_tests.target testlib/Makefile.target utils/Makefile.target ctests/Makefile.target ftests/Makefile.target validation_tests/Makefile.target" cat >confcache <<\_ACEOF # This file is a shell script that caches the results of configure # tests run on this system so they can be shared between configure # scripts and configure runs, see configure's option --config-cache. # It is not useful on other systems. If it contains results you don't # want to keep, you may remove or edit it. # # config.status only pays attention to the cache file if you give it # the --recheck option to rerun configure. # # `ac_cv_env_foo' variables (set or unset) will be overridden when # loading this file, other *unset* `ac_cv_foo' will be assigned the # following values. _ACEOF # The following way of writing the cache mishandles newlines in values, # but we know of no workaround that is simple, portable, and efficient. 
# So, we kill variables containing newlines. # Ultrix sh set writes to stderr and can't be redirected directly, # and sets the high bit in the cache file unless we assign to the vars. ( for ac_var in `(set) 2>&1 | sed -n 's/^\([a-zA-Z_][a-zA-Z0-9_]*\)=.*/\1/p'`; do eval ac_val=\$$ac_var case $ac_val in #( *${as_nl}*) case $ac_var in #( *_cv_*) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: cache variable $ac_var contains a newline" >&5 $as_echo "$as_me: WARNING: cache variable $ac_var contains a newline" >&2;} ;; esac case $ac_var in #( _ | IFS | as_nl) ;; #( BASH_ARGV | BASH_SOURCE) eval $ac_var= ;; #( *) { eval $ac_var=; unset $ac_var;} ;; esac ;; esac done (set) 2>&1 | case $as_nl`(ac_space=' '; set) 2>&1` in #( *${as_nl}ac_space=\ *) # `set' does not quote correctly, so add quotes: double-quote # substitution turns \\\\ into \\, and sed turns \\ into \. sed -n \ "s/'/'\\\\''/g; s/^\\([_$as_cr_alnum]*_cv_[_$as_cr_alnum]*\\)=\\(.*\\)/\\1='\\2'/p" ;; #( *) # `set' quotes correctly as required by POSIX, so do not add quotes. sed -n "/^[_$as_cr_alnum]*_cv_[_$as_cr_alnum]*=/p" ;; esac | sort ) | sed ' /^ac_cv_env_/b end t clear :clear s/^\([^=]*\)=\(.*[{}].*\)$/test "${\1+set}" = set || &/ t end s/^\([^=]*\)=\(.*\)$/\1=${\1=\2}/ :end' >>confcache if diff "$cache_file" confcache >/dev/null 2>&1; then :; else if test -w "$cache_file"; then if test "x$cache_file" != "x/dev/null"; then { $as_echo "$as_me:${as_lineno-$LINENO}: updating cache $cache_file" >&5 $as_echo "$as_me: updating cache $cache_file" >&6;} if test ! 
-f "$cache_file" || test -h "$cache_file"; then cat confcache >"$cache_file" else case $cache_file in #( */* | ?:*) mv -f confcache "$cache_file"$$ && mv -f "$cache_file"$$ "$cache_file" ;; #( *) mv -f confcache "$cache_file" ;; esac fi fi else { $as_echo "$as_me:${as_lineno-$LINENO}: not updating unwritable cache $cache_file" >&5 $as_echo "$as_me: not updating unwritable cache $cache_file" >&6;} fi fi rm -f confcache test "x$prefix" = xNONE && prefix=$ac_default_prefix # Let make expand exec_prefix. test "x$exec_prefix" = xNONE && exec_prefix='${prefix}' DEFS=-DHAVE_CONFIG_H ac_libobjs= ac_ltlibobjs= U= for ac_i in : $LIBOBJS; do test "x$ac_i" = x: && continue # 1. Remove the extension, and $U if already installed. ac_script='s/\$U\././;s/\.o$//;s/\.obj$//' ac_i=`$as_echo "$ac_i" | sed "$ac_script"` # 2. Prepend LIBOBJDIR. When used with automake>=1.10 LIBOBJDIR # will be set to the directory where LIBOBJS objects are built. as_fn_append ac_libobjs " \${LIBOBJDIR}$ac_i\$U.$ac_objext" as_fn_append ac_ltlibobjs " \${LIBOBJDIR}$ac_i"'$U.lo' done LIBOBJS=$ac_libobjs LTLIBOBJS=$ac_ltlibobjs : "${CONFIG_STATUS=./config.status}" ac_write_fail=0 ac_clean_files_save=$ac_clean_files ac_clean_files="$ac_clean_files $CONFIG_STATUS" { $as_echo "$as_me:${as_lineno-$LINENO}: creating $CONFIG_STATUS" >&5 $as_echo "$as_me: creating $CONFIG_STATUS" >&6;} as_write_fail=0 cat >$CONFIG_STATUS <<_ASEOF || as_write_fail=1 #! $SHELL # Generated by $as_me. # Run this file to recreate the current configuration. # Compiler output produced by configure, useful for debugging # configure, is in config.log if it exists. debug=false ac_cs_recheck=false ac_cs_silent=false SHELL=\${CONFIG_SHELL-$SHELL} export SHELL _ASEOF cat >>$CONFIG_STATUS <<\_ASEOF || as_write_fail=1 ## -------------------- ## ## M4sh Initialization. 
## ## -------------------- ## # Be more Bourne compatible DUALCASE=1; export DUALCASE # for MKS sh if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then : emulate sh NULLCMD=: # Pre-4.2 versions of Zsh do word splitting on ${1+"$@"}, which # is contrary to our usage. Disable this feature. alias -g '${1+"$@"}'='"$@"' setopt NO_GLOB_SUBST else case `(set -o) 2>/dev/null` in #( *posix*) : set -o posix ;; #( *) : ;; esac fi as_nl=' ' export as_nl # Printing a long string crashes Solaris 7 /usr/bin/printf. as_echo='\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\' as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo as_echo=$as_echo$as_echo$as_echo$as_echo$as_echo$as_echo # Prefer a ksh shell builtin over an external printf program on Solaris, # but without wasting forks for bash or zsh. if test -z "$BASH_VERSION$ZSH_VERSION" \ && (test "X`print -r -- $as_echo`" = "X$as_echo") 2>/dev/null; then as_echo='print -r --' as_echo_n='print -rn --' elif (test "X`printf %s $as_echo`" = "X$as_echo") 2>/dev/null; then as_echo='printf %s\n' as_echo_n='printf %s' else if test "X`(/usr/ucb/echo -n -n $as_echo) 2>/dev/null`" = "X-n $as_echo"; then as_echo_body='eval /usr/ucb/echo -n "$1$as_nl"' as_echo_n='/usr/ucb/echo -n' else as_echo_body='eval expr "X$1" : "X\\(.*\\)"' as_echo_n_body='eval arg=$1; case $arg in #( *"$as_nl"*) expr "X$arg" : "X\\(.*\\)$as_nl"; arg=`expr "X$arg" : ".*$as_nl\\(.*\\)"`;; esac; expr "X$arg" : "X\\(.*\\)" | tr -d "$as_nl" ' export as_echo_n_body as_echo_n='sh -c $as_echo_n_body as_echo' fi export as_echo_body as_echo='sh -c $as_echo_body as_echo' fi # The user is always right. if test "${PATH_SEPARATOR+set}" != set; then PATH_SEPARATOR=: (PATH='/bin;/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 && { (PATH='/bin:/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 || PATH_SEPARATOR=';' } fi # IFS # We need space, tab and new line, in precisely that order. 
Quoting is # there to prevent editors from complaining about space-tab. # (If _AS_PATH_WALK were called with IFS unset, it would disable word # splitting by setting IFS to empty value.) IFS=" "" $as_nl" # Find who we are. Look in the path if we contain no directory separator. as_myself= case $0 in #(( *[\\/]* ) as_myself=$0 ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. test -r "$as_dir/$0" && as_myself=$as_dir/$0 && break done IFS=$as_save_IFS ;; esac # We did not find ourselves, most probably we were run as `sh COMMAND' # in which case we are not to be found in the path. if test "x$as_myself" = x; then as_myself=$0 fi if test ! -f "$as_myself"; then $as_echo "$as_myself: error: cannot find myself; rerun with an absolute file name" >&2 exit 1 fi # Unset variables that we do not need and which cause bugs (e.g. in # pre-3.0 UWIN ksh). But do not cause bugs in bash 2.01; the "|| exit 1" # suppresses any "Segmentation fault" message there. '((' could # trigger a bug in pdksh 5.2.14. for as_var in BASH_ENV ENV MAIL MAILPATH do eval test x\${$as_var+set} = xset \ && ( (unset $as_var) || exit 1) >/dev/null 2>&1 && unset $as_var || : done PS1='$ ' PS2='> ' PS4='+ ' # NLS nuisances. LC_ALL=C export LC_ALL LANGUAGE=C export LANGUAGE # CDPATH. (unset CDPATH) >/dev/null 2>&1 && unset CDPATH # as_fn_error STATUS ERROR [LINENO LOG_FD] # ---------------------------------------- # Output "`basename $0`: error: ERROR" to stderr. If LINENO and LOG_FD are # provided, also output the error to LOG_FD, referencing LINENO. Then exit the # script with STATUS, using 1 if that was 0. 
as_fn_error () { as_status=$1; test $as_status -eq 0 && as_status=1 if test "$4"; then as_lineno=${as_lineno-"$3"} as_lineno_stack=as_lineno_stack=$as_lineno_stack $as_echo "$as_me:${as_lineno-$LINENO}: error: $2" >&$4 fi $as_echo "$as_me: error: $2" >&2 as_fn_exit $as_status } # as_fn_error # as_fn_set_status STATUS # ----------------------- # Set $? to STATUS, without forking. as_fn_set_status () { return $1 } # as_fn_set_status # as_fn_exit STATUS # ----------------- # Exit the shell with STATUS, even in a "trap 0" or "set -e" context. as_fn_exit () { set +e as_fn_set_status $1 exit $1 } # as_fn_exit # as_fn_unset VAR # --------------- # Portably unset VAR. as_fn_unset () { { eval $1=; unset $1;} } as_unset=as_fn_unset # as_fn_append VAR VALUE # ---------------------- # Append the text in VALUE to the end of the definition contained in VAR. Take # advantage of any shell optimizations that allow amortized linear growth over # repeated appends, instead of the typical quadratic growth present in naive # implementations. if (eval "as_var=1; as_var+=2; test x\$as_var = x12") 2>/dev/null; then : eval 'as_fn_append () { eval $1+=\$2 }' else as_fn_append () { eval $1=\$$1\$2 } fi # as_fn_append # as_fn_arith ARG... # ------------------ # Perform arithmetic evaluation on the ARGs, and store the result in the # global $as_val. Take advantage of shells that can avoid forks. The arguments # must be portable across $(()) and expr. if (eval "test \$(( 1 + 1 )) = 2") 2>/dev/null; then : eval 'as_fn_arith () { as_val=$(( $* )) }' else as_fn_arith () { as_val=`expr "$@" || test $? 
-eq 1` } fi # as_fn_arith if expr a : '\(a\)' >/dev/null 2>&1 && test "X`expr 00001 : '.*\(...\)'`" = X001; then as_expr=expr else as_expr=false fi if (basename -- /) >/dev/null 2>&1 && test "X`basename -- / 2>&1`" = "X/"; then as_basename=basename else as_basename=false fi if (as_dir=`dirname -- /` && test "X$as_dir" = X/) >/dev/null 2>&1; then as_dirname=dirname else as_dirname=false fi as_me=`$as_basename -- "$0" || $as_expr X/"$0" : '.*/\([^/][^/]*\)/*$' \| \ X"$0" : 'X\(//\)$' \| \ X"$0" : 'X\(/\)' \| . 2>/dev/null || $as_echo X/"$0" | sed '/^.*\/\([^/][^/]*\)\/*$/{ s//\1/ q } /^X\/\(\/\/\)$/{ s//\1/ q } /^X\/\(\/\).*/{ s//\1/ q } s/.*/./; q'` # Avoid depending upon Character Ranges. as_cr_letters='abcdefghijklmnopqrstuvwxyz' as_cr_LETTERS='ABCDEFGHIJKLMNOPQRSTUVWXYZ' as_cr_Letters=$as_cr_letters$as_cr_LETTERS as_cr_digits='0123456789' as_cr_alnum=$as_cr_Letters$as_cr_digits ECHO_C= ECHO_N= ECHO_T= case `echo -n x` in #((((( -n*) case `echo 'xy\c'` in *c*) ECHO_T=' ';; # ECHO_T is single tab character. xy) ECHO_C='\c';; *) echo `echo ksh88 bug on AIX 6.1` > /dev/null ECHO_T=' ';; esac;; *) ECHO_N='-n';; esac rm -f conf$$ conf$$.exe conf$$.file if test -d conf$$.dir; then rm -f conf$$.dir/conf$$.file else rm -f conf$$.dir mkdir conf$$.dir 2>/dev/null fi if (echo >conf$$.file) 2>/dev/null; then if ln -s conf$$.file conf$$ 2>/dev/null; then as_ln_s='ln -s' # ... but there are two gotchas: # 1) On MSYS, both `ln -s file dir' and `ln file dir' fail. # 2) DJGPP < 2.04 has no symlinks; `ln -s' creates a wrapper executable. # In both cases, we have to default to `cp -pR'. ln -s conf$$.file conf$$.dir 2>/dev/null && test ! -f conf$$.exe || as_ln_s='cp -pR' elif ln conf$$.file conf$$ 2>/dev/null; then as_ln_s=ln else as_ln_s='cp -pR' fi else as_ln_s='cp -pR' fi rm -f conf$$ conf$$.exe conf$$.dir/conf$$.file conf$$.file rmdir conf$$.dir 2>/dev/null # as_fn_mkdir_p # ------------- # Create "$as_dir" as a directory, including parents if necessary. 
as_fn_mkdir_p () { case $as_dir in #( -*) as_dir=./$as_dir;; esac test -d "$as_dir" || eval $as_mkdir_p || { as_dirs= while :; do case $as_dir in #( *\'*) as_qdir=`$as_echo "$as_dir" | sed "s/'/'\\\\\\\\''/g"`;; #'( *) as_qdir=$as_dir;; esac as_dirs="'$as_qdir' $as_dirs" as_dir=`$as_dirname -- "$as_dir" || $as_expr X"$as_dir" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$as_dir" : 'X\(//\)[^/]' \| \ X"$as_dir" : 'X\(//\)$' \| \ X"$as_dir" : 'X\(/\)' \| . 2>/dev/null || $as_echo X"$as_dir" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/ q } /^X\(\/\/\)[^/].*/{ s//\1/ q } /^X\(\/\/\)$/{ s//\1/ q } /^X\(\/\).*/{ s//\1/ q } s/.*/./; q'` test -d "$as_dir" && break done test -z "$as_dirs" || eval "mkdir $as_dirs" } || test -d "$as_dir" || as_fn_error $? "cannot create directory $as_dir" } # as_fn_mkdir_p if mkdir -p . 2>/dev/null; then as_mkdir_p='mkdir -p "$as_dir"' else test -d ./-p && rmdir ./-p as_mkdir_p=false fi # as_fn_executable_p FILE # ----------------------- # Test if FILE is an executable regular file. as_fn_executable_p () { test -f "$1" && test -x "$1" } # as_fn_executable_p as_test_x='test -x' as_executable_p=as_fn_executable_p # Sed expression to map a string onto a valid CPP name. as_tr_cpp="eval sed 'y%*$as_cr_letters%P$as_cr_LETTERS%;s%[^_$as_cr_alnum]%_%g'" # Sed expression to map a string onto a valid variable name. as_tr_sh="eval sed 'y%*+%pp%;s%[^_$as_cr_alnum]%_%g'" exec 6>&1 ## ----------------------------------- ## ## Main body of $CONFIG_STATUS script. ## ## ----------------------------------- ## _ASEOF test $as_write_fail = 0 && chmod +x $CONFIG_STATUS || ac_write_fail=1 cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 # Save the log message, to keep $0 and so on meaningful, and to # report actual input values of CONFIG_FILES etc. instead of their # values after options handling. ac_log=" This file was extended by PAPI $as_me 7.2.0.0, which was generated by GNU Autoconf 2.69. 
Invocation command line was CONFIG_FILES = $CONFIG_FILES CONFIG_HEADERS = $CONFIG_HEADERS CONFIG_LINKS = $CONFIG_LINKS CONFIG_COMMANDS = $CONFIG_COMMANDS $ $0 $@ on `(hostname || uname -n) 2>/dev/null | sed 1q` " _ACEOF case $ac_config_files in *" "*) set x $ac_config_files; shift; ac_config_files=$*;; esac case $ac_config_headers in *" "*) set x $ac_config_headers; shift; ac_config_headers=$*;; esac cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 # Files that config.status was made for. config_files="$ac_config_files" config_headers="$ac_config_headers" config_commands="$ac_config_commands" _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 ac_cs_usage="\ \`$as_me' instantiates files and other configuration actions from templates according to the current configuration. Unless the files and actions are specified as TAGs, all are instantiated by default. Usage: $0 [OPTION]... [TAG]... -h, --help print this help, then exit -V, --version print version number and configuration settings, then exit --config print configuration, then exit -q, --quiet, --silent do not print progress messages -d, --debug don't remove temporary files --recheck update $as_me by reconfiguring in the same conditions --file=FILE[:TEMPLATE] instantiate the configuration file FILE --header=FILE[:TEMPLATE] instantiate the configuration header FILE Configuration files: $config_files Configuration headers: $config_headers Configuration commands: $config_commands Report bugs to ." _ACEOF cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`" ac_cs_version="\\ PAPI config.status 7.2.0.0 configured by $0, generated by GNU Autoconf 2.69, with options \\"\$ac_cs_config\\" Copyright (C) 2012 Free Software Foundation, Inc. This config.status script is free software; the Free Software Foundation gives unlimited permission to copy, distribute and modify it." 
ac_pwd='$ac_pwd' srcdir='$srcdir' AWK='$AWK' test -n "\$AWK" || AWK=awk _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 # The default lists apply if the user does not specify any file. ac_need_defaults=: while test $# != 0 do case $1 in --*=?*) ac_option=`expr "X$1" : 'X\([^=]*\)='` ac_optarg=`expr "X$1" : 'X[^=]*=\(.*\)'` ac_shift=: ;; --*=) ac_option=`expr "X$1" : 'X\([^=]*\)='` ac_optarg= ac_shift=: ;; *) ac_option=$1 ac_optarg=$2 ac_shift=shift ;; esac case $ac_option in # Handling of the options. -recheck | --recheck | --rechec | --reche | --rech | --rec | --re | --r) ac_cs_recheck=: ;; --version | --versio | --versi | --vers | --ver | --ve | --v | -V ) $as_echo "$ac_cs_version"; exit ;; --config | --confi | --conf | --con | --co | --c ) $as_echo "$ac_cs_config"; exit ;; --debug | --debu | --deb | --de | --d | -d ) debug=: ;; --file | --fil | --fi | --f ) $ac_shift case $ac_optarg in *\'*) ac_optarg=`$as_echo "$ac_optarg" | sed "s/'/'\\\\\\\\''/g"` ;; '') as_fn_error $? "missing file argument" ;; esac as_fn_append CONFIG_FILES " '$ac_optarg'" ac_need_defaults=false;; --header | --heade | --head | --hea ) $ac_shift case $ac_optarg in *\'*) ac_optarg=`$as_echo "$ac_optarg" | sed "s/'/'\\\\\\\\''/g"` ;; esac as_fn_append CONFIG_HEADERS " '$ac_optarg'" ac_need_defaults=false;; --he | --h) # Conflict between --help and --header as_fn_error $? "ambiguous option: \`$1' Try \`$0 --help' for more information.";; --help | --hel | -h ) $as_echo "$ac_cs_usage"; exit ;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil | --si | --s) ac_cs_silent=: ;; # This is an error. -*) as_fn_error $? "unrecognized option: \`$1' Try \`$0 --help' for more information." 
;; *) as_fn_append ac_config_targets " $1" ac_need_defaults=false ;; esac shift done ac_configure_extra_args= if $ac_cs_silent; then exec 6>/dev/null ac_configure_extra_args="$ac_configure_extra_args --silent" fi _ACEOF cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 if \$ac_cs_recheck; then set X $SHELL '$0' $ac_configure_args \$ac_configure_extra_args --no-create --no-recursion shift \$as_echo "running CONFIG_SHELL=$SHELL \$*" >&6 CONFIG_SHELL='$SHELL' export CONFIG_SHELL exec "\$@" fi _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 exec 5>>config.log { echo sed 'h;s/./-/g;s/^.../## /;s/...$/ ##/;p;x;p;x' <<_ASBOX ## Running $as_me. ## _ASBOX $as_echo "$ac_log" } >&5 _ACEOF cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 # Handling of arguments. for ac_config_target in $ac_config_targets do case $ac_config_target in "config.h") CONFIG_HEADERS="$CONFIG_HEADERS config.h" ;; "genpapifdef") CONFIG_COMMANDS="$CONFIG_COMMANDS genpapifdef" ;; "Makefile") CONFIG_FILES="$CONFIG_FILES Makefile" ;; "papi.pc") CONFIG_FILES="$CONFIG_FILES papi.pc" ;; "components/Makefile_comp_tests.target") CONFIG_FILES="$CONFIG_FILES components/Makefile_comp_tests.target" ;; "testlib/Makefile.target") CONFIG_FILES="$CONFIG_FILES testlib/Makefile.target" ;; "utils/Makefile.target") CONFIG_FILES="$CONFIG_FILES utils/Makefile.target" ;; "ctests/Makefile.target") CONFIG_FILES="$CONFIG_FILES ctests/Makefile.target" ;; "ftests/Makefile.target") CONFIG_FILES="$CONFIG_FILES ftests/Makefile.target" ;; "validation_tests/Makefile.target") CONFIG_FILES="$CONFIG_FILES validation_tests/Makefile.target" ;; *) as_fn_error $? "invalid argument: \`$ac_config_target'" "$LINENO" 5;; esac done # If the user did not use the arguments to specify the items to instantiate, # then the envvar interface is used. Set only those that are not. # We use the long form for the default assignment because of an extremely # bizarre bug on SunOS 4.1.3. 
if $ac_need_defaults; then test "${CONFIG_FILES+set}" = set || CONFIG_FILES=$config_files test "${CONFIG_HEADERS+set}" = set || CONFIG_HEADERS=$config_headers test "${CONFIG_COMMANDS+set}" = set || CONFIG_COMMANDS=$config_commands fi # Have a temporary directory for convenience. Make it in the build tree # simply because there is no reason against having it here, and in addition, # creating and moving files from /tmp can sometimes cause problems. # Hook for its removal unless debugging. # Note that there is a small window in which the directory will not be cleaned: # after its creation but before its name has been assigned to `$tmp'. $debug || { tmp= ac_tmp= trap 'exit_status=$? : "${ac_tmp:=$tmp}" { test ! -d "$ac_tmp" || rm -fr "$ac_tmp"; } && exit $exit_status ' 0 trap 'as_fn_exit 1' 1 2 13 15 } # Create a (secure) tmp directory for tmp files. { tmp=`(umask 077 && mktemp -d "./confXXXXXX") 2>/dev/null` && test -d "$tmp" } || { tmp=./conf$$-$RANDOM (umask 077 && mkdir "$tmp") } || as_fn_error $? "cannot create a temporary directory in ." "$LINENO" 5 ac_tmp=$tmp # Set up the scripts for CONFIG_FILES section. # No need to generate them if there are no CONFIG_FILES. # This happens for instance with `./config.status config.h'. if test -n "$CONFIG_FILES"; then ac_cr=`echo X | tr X '\015'` # On cygwin, bash can eat \r inside `` if the user requested igncr. # But we know of no other shell where ac_cr would be empty at this # point, so we can use a bashism as a fallback. if test "x$ac_cr" = x; then eval ac_cr=\$\'\\r\' fi ac_cs_awk_cr=`$AWK 'BEGIN { print "a\rb" }' /dev/null` if test "$ac_cs_awk_cr" = "a${ac_cr}b"; then ac_cs_awk_cr='\\r' else ac_cs_awk_cr=$ac_cr fi echo 'BEGIN {' >"$ac_tmp/subs1.awk" && _ACEOF { echo "cat >conf$$subs.awk <<_ACEOF" && echo "$ac_subst_vars" | sed 's/.*/&!$&$ac_delim/' && echo "_ACEOF" } >conf$$subs.sh || as_fn_error $? 
"could not make $CONFIG_STATUS" "$LINENO" 5 ac_delim_num=`echo "$ac_subst_vars" | grep -c '^'` ac_delim='%!_!# ' for ac_last_try in false false false false false :; do . ./conf$$subs.sh || as_fn_error $? "could not make $CONFIG_STATUS" "$LINENO" 5 ac_delim_n=`sed -n "s/.*$ac_delim\$/X/p" conf$$subs.awk | grep -c X` if test $ac_delim_n = $ac_delim_num; then break elif $ac_last_try; then as_fn_error $? "could not make $CONFIG_STATUS" "$LINENO" 5 else ac_delim="$ac_delim!$ac_delim _$ac_delim!! " fi done rm -f conf$$subs.sh cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 cat >>"\$ac_tmp/subs1.awk" <<\\_ACAWK && _ACEOF sed -n ' h s/^/S["/; s/!.*/"]=/ p g s/^[^!]*!// :repl t repl s/'"$ac_delim"'$// t delim :nl h s/\(.\{148\}\)..*/\1/ t more1 s/["\\]/\\&/g; s/^/"/; s/$/\\n"\\/ p n b repl :more1 s/["\\]/\\&/g; s/^/"/; s/$/"\\/ p g s/.\{148\}// t nl :delim h s/\(.\{148\}\)..*/\1/ t more2 s/["\\]/\\&/g; s/^/"/; s/$/"/ p b :more2 s/["\\]/\\&/g; s/^/"/; s/$/"\\/ p g s/.\{148\}// t delim ' >$CONFIG_STATUS || ac_write_fail=1 rm -f conf$$subs.awk cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 _ACAWK cat >>"\$ac_tmp/subs1.awk" <<_ACAWK && for (key in S) S_is_set[key] = 1 FS = "" } { line = $ 0 nfields = split(line, field, "@") substed = 0 len = length(field[1]) for (i = 2; i < nfields; i++) { key = field[i] keylen = length(key) if (S_is_set[key]) { value = S[key] line = substr(line, 1, len) "" value "" substr(line, len + keylen + 3) len += length(value) + length(field[++i]) substed = 1 } else len += 1 + keylen } print line } _ACAWK _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 if sed "s/$ac_cr//" < /dev/null > /dev/null 2>&1; then sed "s/$ac_cr\$//; s/$ac_cr/$ac_cs_awk_cr/g" else cat fi < "$ac_tmp/subs1.awk" > "$ac_tmp/subs.awk" \ || as_fn_error $? 
"could not setup config files machinery" "$LINENO" 5 _ACEOF # VPATH may cause trouble with some makes, so we remove sole $(srcdir), # ${srcdir} and @srcdir@ entries from VPATH if srcdir is ".", strip leading and # trailing colons and then remove the whole line if VPATH becomes empty # (actually we leave an empty line to preserve line numbers). if test "x$srcdir" = x.; then ac_vpsub='/^[ ]*VPATH[ ]*=[ ]*/{ h s/// s/^/:/ s/[ ]*$/:/ s/:\$(srcdir):/:/g s/:\${srcdir}:/:/g s/:@srcdir@:/:/g s/^:*// s/:*$// x s/\(=[ ]*\).*/\1/ G s/\n// s/^[^=]*=[ ]*$// }' fi cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 fi # test -n "$CONFIG_FILES" # Set up the scripts for CONFIG_HEADERS section. # No need to generate them if there are no CONFIG_HEADERS. # This happens for instance with `./config.status Makefile'. if test -n "$CONFIG_HEADERS"; then cat >"$ac_tmp/defines.awk" <<\_ACAWK || BEGIN { _ACEOF # Transform confdefs.h into an awk script `defines.awk', embedded as # here-document in config.status, that substitutes the proper values into # config.h.in to produce config.h. # Create a delimiter string that does not exist in confdefs.h, to ease # handling of long lines. ac_delim='%!_!# ' for ac_last_try in false false :; do ac_tt=`sed -n "/$ac_delim/p" confdefs.h` if test -z "$ac_tt"; then break elif $ac_last_try; then as_fn_error $? "could not make $CONFIG_HEADERS" "$LINENO" 5 else ac_delim="$ac_delim!$ac_delim _$ac_delim!! " fi done # For the awk script, D is an array of macro values keyed by name, # likewise P contains macro parameters if any. Preserve backslash # newline sequences. 
ac_word_re=[_$as_cr_Letters][_$as_cr_alnum]* sed -n ' s/.\{148\}/&'"$ac_delim"'/g t rset :rset s/^[ ]*#[ ]*define[ ][ ]*/ / t def d :def s/\\$// t bsnl s/["\\]/\\&/g s/^ \('"$ac_word_re"'\)\(([^()]*)\)[ ]*\(.*\)/P["\1"]="\2"\ D["\1"]=" \3"/p s/^ \('"$ac_word_re"'\)[ ]*\(.*\)/D["\1"]=" \2"/p d :bsnl s/["\\]/\\&/g s/^ \('"$ac_word_re"'\)\(([^()]*)\)[ ]*\(.*\)/P["\1"]="\2"\ D["\1"]=" \3\\\\\\n"\\/p t cont s/^ \('"$ac_word_re"'\)[ ]*\(.*\)/D["\1"]=" \2\\\\\\n"\\/p t cont d :cont n s/.\{148\}/&'"$ac_delim"'/g t clear :clear s/\\$// t bsnlc s/["\\]/\\&/g; s/^/"/; s/$/"/p d :bsnlc s/["\\]/\\&/g; s/^/"/; s/$/\\\\\\n"\\/p b cont ' >$CONFIG_STATUS || ac_write_fail=1 cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 for (key in D) D_is_set[key] = 1 FS = "" } /^[\t ]*#[\t ]*(define|undef)[\t ]+$ac_word_re([\t (]|\$)/ { line = \$ 0 split(line, arg, " ") if (arg[1] == "#") { defundef = arg[2] mac1 = arg[3] } else { defundef = substr(arg[1], 2) mac1 = arg[2] } split(mac1, mac2, "(") #) macro = mac2[1] prefix = substr(line, 1, index(line, defundef) - 1) if (D_is_set[macro]) { # Preserve the white space surrounding the "#". print prefix "define", macro P[macro] D[macro] next } else { # Replace #undef with comments. This is necessary, for example, # in the case of _POSIX_SOURCE, which is predefined and required # on some systems where configure will not decide to define it. if (defundef == "undef") { print "/*", prefix defundef, macro, "*/" next } } } { print } _ACAWK _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 as_fn_error $? "could not setup config headers machinery" "$LINENO" 5 fi # test -n "$CONFIG_HEADERS" eval set X " :F $CONFIG_FILES :H $CONFIG_HEADERS :C $CONFIG_COMMANDS" shift for ac_tag do case $ac_tag in :[FHLC]) ac_mode=$ac_tag; continue;; esac case $ac_mode$ac_tag in :[FHL]*:*);; :L* | :C*:*) as_fn_error $? 
"invalid tag \`$ac_tag'" "$LINENO" 5;; :[FH]-) ac_tag=-:-;; :[FH]*) ac_tag=$ac_tag:$ac_tag.in;; esac ac_save_IFS=$IFS IFS=: set x $ac_tag IFS=$ac_save_IFS shift ac_file=$1 shift case $ac_mode in :L) ac_source=$1;; :[FH]) ac_file_inputs= for ac_f do case $ac_f in -) ac_f="$ac_tmp/stdin";; *) # Look for the file first in the build tree, then in the source tree # (if the path is not absolute). The absolute path cannot be DOS-style, # because $ac_f cannot contain `:'. test -f "$ac_f" || case $ac_f in [\\/$]*) false;; *) test -f "$srcdir/$ac_f" && ac_f="$srcdir/$ac_f";; esac || as_fn_error 1 "cannot find input file: \`$ac_f'" "$LINENO" 5;; esac case $ac_f in *\'*) ac_f=`$as_echo "$ac_f" | sed "s/'/'\\\\\\\\''/g"`;; esac as_fn_append ac_file_inputs " '$ac_f'" done # Let's still pretend it is `configure' which instantiates (i.e., don't # use $as_me), people would be surprised to read: # /* config.h. Generated by config.status. */ configure_input='Generated from '` $as_echo "$*" | sed 's|^[^:]*/||;s|:[^:]*/|, |g' `' by configure.' if test x"$ac_file" != x-; then configure_input="$ac_file. $configure_input" { $as_echo "$as_me:${as_lineno-$LINENO}: creating $ac_file" >&5 $as_echo "$as_me: creating $ac_file" >&6;} fi # Neutralize special characters interpreted by sed in replacement strings. case $configure_input in #( *\&* | *\|* | *\\* ) ac_sed_conf_input=`$as_echo "$configure_input" | sed 's/[\\\\&|]/\\\\&/g'`;; #( *) ac_sed_conf_input=$configure_input;; esac case $ac_tag in *:-:* | *:-) cat >"$ac_tmp/stdin" \ || as_fn_error $? "could not create $ac_file" "$LINENO" 5 ;; esac ;; esac ac_dir=`$as_dirname -- "$ac_file" || $as_expr X"$ac_file" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$ac_file" : 'X\(//\)[^/]' \| \ X"$ac_file" : 'X\(//\)$' \| \ X"$ac_file" : 'X\(/\)' \| . 
2>/dev/null || $as_echo X"$ac_file" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/ q } /^X\(\/\/\)[^/].*/{ s//\1/ q } /^X\(\/\/\)$/{ s//\1/ q } /^X\(\/\).*/{ s//\1/ q } s/.*/./; q'` as_dir="$ac_dir"; as_fn_mkdir_p ac_builddir=. case "$ac_dir" in .) ac_dir_suffix= ac_top_builddir_sub=. ac_top_build_prefix= ;; *) ac_dir_suffix=/`$as_echo "$ac_dir" | sed 's|^\.[\\/]||'` # A ".." for each directory in $ac_dir_suffix. ac_top_builddir_sub=`$as_echo "$ac_dir_suffix" | sed 's|/[^\\/]*|/..|g;s|/||'` case $ac_top_builddir_sub in "") ac_top_builddir_sub=. ac_top_build_prefix= ;; *) ac_top_build_prefix=$ac_top_builddir_sub/ ;; esac ;; esac ac_abs_top_builddir=$ac_pwd ac_abs_builddir=$ac_pwd$ac_dir_suffix # for backward compatibility: ac_top_builddir=$ac_top_build_prefix case $srcdir in .) # We are building in place. ac_srcdir=. ac_top_srcdir=$ac_top_builddir_sub ac_abs_top_srcdir=$ac_pwd ;; [\\/]* | ?:[\\/]* ) # Absolute name. ac_srcdir=$srcdir$ac_dir_suffix; ac_top_srcdir=$srcdir ac_abs_top_srcdir=$srcdir ;; *) # Relative name. ac_srcdir=$ac_top_build_prefix$srcdir$ac_dir_suffix ac_top_srcdir=$ac_top_build_prefix$srcdir ac_abs_top_srcdir=$ac_pwd/$srcdir ;; esac ac_abs_srcdir=$ac_abs_top_srcdir$ac_dir_suffix case $ac_mode in :F) # # CONFIG_FILE # _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 # If the template does not know about datarootdir, expand it. # FIXME: This hack should be removed a few years after 2.60. 
ac_datarootdir_hack=; ac_datarootdir_seen= ac_sed_dataroot=' /datarootdir/ { p q } /@datadir@/p /@docdir@/p /@infodir@/p /@localedir@/p /@mandir@/p' case `eval "sed -n \"\$ac_sed_dataroot\" $ac_file_inputs"` in *datarootdir*) ac_datarootdir_seen=yes;; *@datadir@*|*@docdir@*|*@infodir@*|*@localedir@*|*@mandir@*) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $ac_file_inputs seems to ignore the --datarootdir setting" >&5 $as_echo "$as_me: WARNING: $ac_file_inputs seems to ignore the --datarootdir setting" >&2;} _ACEOF cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 ac_datarootdir_hack=' s&@datadir@&$datadir&g s&@docdir@&$docdir&g s&@infodir@&$infodir&g s&@localedir@&$localedir&g s&@mandir@&$mandir&g s&\\\${datarootdir}&$datarootdir&g' ;; esac _ACEOF # Neutralize VPATH when `$srcdir' = `.'. # Shell code in configure.ac might set extrasub. # FIXME: do we really want to maintain this feature? cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 ac_sed_extra="$ac_vpsub $extrasub _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1 :t /@[a-zA-Z_][a-zA-Z_0-9]*@/!b s|@configure_input@|$ac_sed_conf_input|;t t s&@top_builddir@&$ac_top_builddir_sub&;t t s&@top_build_prefix@&$ac_top_build_prefix&;t t s&@srcdir@&$ac_srcdir&;t t s&@abs_srcdir@&$ac_abs_srcdir&;t t s&@top_srcdir@&$ac_top_srcdir&;t t s&@abs_top_srcdir@&$ac_abs_top_srcdir&;t t s&@builddir@&$ac_builddir&;t t s&@abs_builddir@&$ac_abs_builddir&;t t s&@abs_top_builddir@&$ac_abs_top_builddir&;t t $ac_datarootdir_hack " eval sed \"\$ac_sed_extra\" "$ac_file_inputs" | $AWK -f "$ac_tmp/subs.awk" \ >$ac_tmp/out || as_fn_error $? "could not create $ac_file" "$LINENO" 5 test -z "$ac_datarootdir_hack$ac_datarootdir_seen" && { ac_out=`sed -n '/\${datarootdir}/p' "$ac_tmp/out"`; test -n "$ac_out"; } && { ac_out=`sed -n '/^[ ]*datarootdir[ ]*:*=/p' \ "$ac_tmp/out"`; test -z "$ac_out"; } && { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $ac_file contains a reference to the variable \`datarootdir' which seems to be undefined. 
Please make sure it is defined" >&5 $as_echo "$as_me: WARNING: $ac_file contains a reference to the variable \`datarootdir' which seems to be undefined. Please make sure it is defined" >&2;} rm -f "$ac_tmp/stdin" case $ac_file in -) cat "$ac_tmp/out" && rm -f "$ac_tmp/out";; *) rm -f "$ac_file" && mv "$ac_tmp/out" "$ac_file";; esac \ || as_fn_error $? "could not create $ac_file" "$LINENO" 5 ;; :H) # # CONFIG_HEADER # if test x"$ac_file" != x-; then { $as_echo "/* $configure_input */" \ && eval '$AWK -f "$ac_tmp/defines.awk"' "$ac_file_inputs" } >"$ac_tmp/config.h" \ || as_fn_error $? "could not create $ac_file" "$LINENO" 5 if diff "$ac_file" "$ac_tmp/config.h" >/dev/null 2>&1; then { $as_echo "$as_me:${as_lineno-$LINENO}: $ac_file is unchanged" >&5 $as_echo "$as_me: $ac_file is unchanged" >&6;} else rm -f "$ac_file" mv "$ac_tmp/config.h" "$ac_file" \ || as_fn_error $? "could not create $ac_file" "$LINENO" 5 fi else $as_echo "/* $configure_input */" \ && eval '$AWK -f "$ac_tmp/defines.awk"' "$ac_file_inputs" \ || as_fn_error $? "could not create -" "$LINENO" 5 fi ;; :C) { $as_echo "$as_me:${as_lineno-$LINENO}: executing $ac_file commands" >&5 $as_echo "$as_me: executing $ac_file commands" >&6;} ;; esac case $ac_file$ac_mode in "genpapifdef":C) maint/genpapifdef.pl -c > fpapi.h maint/genpapifdef.pl -f77 > f77papi.h maint/genpapifdef.pl -f90 > f90papi.h ;; esac done # for ac_tag as_fn_exit 0 _ACEOF ac_clean_files=$ac_clean_files_save test $ac_write_fail = 0 || as_fn_error $? "write failure creating $CONFIG_STATUS" "$LINENO" 5 # configure is writing to config.log, and then calls config.status. # config.status does its own redirection, appending to config.log. # Unfortunately, on DOS this fails, as config.log is still kept open # by configure, so config.status won't be able to write to it; its # output is simply discarded. So we exec the FD to /dev/null, # effectively closing config.log, so it can be properly (re)opened and # appended to by config.status. 
When coming back to configure, we # need to make the FD available again. if test "$no_create" != yes; then ac_cs_success=: ac_config_status_args= test "$silent" = yes && ac_config_status_args="$ac_config_status_args --quiet" exec 5>/dev/null $SHELL $CONFIG_STATUS $ac_config_status_args || ac_cs_success=false exec 5>>config.log # Use ||, not &&, to avoid exiting from the if with $? = 1, which # would make configure fail if this is the last instruction. $ac_cs_success || as_fn_exit 1 fi if test -n "$ac_unrecognized_opts" && test "$enable_option_checking" != no; then { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: unrecognized options: $ac_unrecognized_opts" >&5 $as_echo "$as_me: WARNING: unrecognized options: $ac_unrecognized_opts" >&2;} fi if test "$have_paranoid" = "yes"; then paranoid_level=`cat /proc/sys/kernel/perf_event_paranoid` if test $paranoid_level -gt 2; then warning_text=`echo -e "\n *************************************************************************** * Insufficient permissions for accessing any hardware counters. * * Your current paranoid level is $paranoid_level. * * Set /proc/sys/kernel/perf_event_paranoid to 2 (or less) or run as root. * * * * Example: * * sudo sh -c \"echo 2 > /proc/sys/kernel/perf_event_paranoid\" * *************************************************************************** "\ ` { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $warning_text" >&5 $as_echo "$as_me: WARNING: $warning_text" >&2;} fi fi
# Process this file with autoconf to produce a configure script. # File: configure.in # cross compile sample # ARCH=mips CC=scgcc ./configure --with-arch=mips --host=mips64el-gentoo-linux-gnu- --with-ffsll --with-libpfm4 --with-perf-events --with-virtualtimer=times --with-walltimer=gettimeofday --with-tls=__thread --with-CPU=mips # cross compiling should work differently...
AC_PREREQ(2.59) AC_INIT(PAPI, 7.2.0.0, ptools-perfapi@icl.utk.edu) AC_CONFIG_SRCDIR([papi.c]) AC_CONFIG_HEADER([config.h]) AC_DEFUN([AS_AC_EXPAND], [EXP_VAR=[$1] FROM_VAR=[$2] prefix_save=$prefix exec_prefix_save=$exec_prefix if test "x$prefix" = "xNONE"; then prefix="$ac_default_prefix" fi if test "x$exec_prefix" = "xNONE"; then exec_prefix=$prefix fi full_var="$FROM_VAR" while true; do new_full_var="`eval echo $full_var`" if test "x$new_full_var" = "x$full_var"; then break; fi full_var=$new_full_var done full_var=$new_full_var AC_DEFINE_UNQUOTED([$1], "$full_var") prefix=$prefix_save exec_prefix=$exec_prefix_save ]) AC_MSG_CHECKING(for architecture) AC_ARG_WITH([arch], [AS_HELP_STRING([--with-arch=], [Specify architecture (uname -m)])], [arch=$withval], [arch=`uname -m`]) AC_MSG_RESULT($arch) AC_ARG_WITH([bitmode], [AS_HELP_STRING([--with-bitmode=<32,64>], [Specify bit mode of library])], [bitmode=$withval]) AC_MSG_CHECKING(for OS) AC_ARG_WITH(OS, [AS_HELP_STRING([--with-OS=], [Specify operating system])], [OS=$withval], [OS="`uname | tr '[A-Z]' '[a-z]'`" if (test "$OS" = "SunOS" || test "$OS" = "sunos"); then OS=solaris fi ]) AC_MSG_RESULT($OS) AC_MSG_CHECKING(for OS version) AC_ARG_WITH(OSVER, [AS_HELP_STRING([--with-OSVER=], [Specify operating system version])], [OSVER=$withval], [if test "$OS" != "bgp" -o "$OS" != "bgq"; then OSVER="`uname -r`" fi ]) AC_MSG_RESULT($OSVER) CFLAGS="$CFLAGS" AC_MSG_CHECKING(for perf_event workaround level) AC_ARG_WITH(assumed_kernel, [AS_HELP_STRING([--with-assumed-kernel=], [Assume kernel version is for purposes of workarounds])], [assumed_kernel=$withval; CFLAGS="$CFLAGS -DASSUME_KERNEL=\\\"$with_assumed_kernel\\\""], [assumed_kernel="autodetect"] ) AC_MSG_RESULT($assumed_kernel) AC_MSG_CHECKING([for if MIC should be used]) AC_ARG_WITH(mic, [AS_HELP_STRING([--with-mic], [To compile for Intel MIC])], [MIC=yes tls=__thread virtualtimer=cputime_id perf_events=yes walltimer=clock_realtime_hr ffsll=no cross_compiling=yes 
arch=k1om], [MIC=no]) AC_MSG_RESULT($MIC) AC_SUBST(MIC) AC_MSG_CHECKING([for if NEC should be used]) AC_ARG_WITH(nec, [AS_HELP_STRING([--with-nec], [To compile for NEC])], [NEC=yes tls=__thread cross_compiling=yes ffsll=/opt/nec/ve/lib/libc-2.21.so virtualtimer=cputime_id walltimer=clock_realtime_hr CC=ncc CC_COMMON_NAME=ncc], [NEC=no]) AC_MSG_RESULT($NEC) AC_SUBST(NEC) #If not set, set FFLAGS to null to prevent AC_PROG_F77 from defaulting it to -g -O2 if test "x$FFLAGS" = "x"; then FFLAGS="" fi OPTFLAGS="-O2" TOPTFLAGS="-O1" AC_PROG_CC([xlc icc gcc cc]) AC_PROG_F77([xlf ifort gfortran f95 f90 f77]) if test "x$F77" = "x"; then F77= fi AC_CHECK_PROG( [MPICC], mpicc, [mpicc], []) # Let's figure out what CC actually is... # Used in later checks to set compiler specific options if `$CC --version 2>&1 | grep '^ncc (NCC)' >/dev/null 2>&1` ; then CC_COMMON_NAME="ncc" elif `$CC -V 2>&1 | grep '^Intel(R) C' >/dev/null 2>&1` ; then CC_COMMON_NAME="icc" if test "$MPICC" = "mpicc"; then # Check if mpiicc is available AC_CHECK_PROG( [MPIICC], mpiicc, mpiicc, []) if test "x$MPIICC" = "xmpiicc"; then MPICC=mpiicc fi MPI_COMPILER_CHECK=`$MPICC -V 2>&1 | grep '^Intel(R) C'` if test "x$MPI_COMPILER_CHECK" = "x"; then AC_MSG_WARN([$MPICC is using a different compiler than $CC. MPI tests disabled.]) NO_MPI_TESTS=yes fi fi elif `$CC -v 2>&1 | grep 'gcc version' >/dev/null 2>&1` ; then CC_COMMON_NAME="gcc" if test "$MPICC" = "mpicc"; then MPI_COMPILER_CHECK=`$MPICC -v 2>&1 | grep 'gcc version'` if test "x$MPI_COMPILER_CHECK" = "x"; then AC_MSG_WARN([$MPICC is using a different compiler than $CC. MPI tests disabled.]) NO_MPI_TESTS=yes fi fi elif `$CC -qversion 2>&1 | grep 'IBM XL C' >/dev/null 2>&1`; then CC_COMMON_NAME="xlc" if test "$MPICC" = "mpicc"; then MPI_COMPILER_CHECK=`$MPICC -qversion 2>&1 | grep 'IBM XL C'` if test "x$MPI_COMPILER_CHECK" = "x"; then AC_MSG_WARN([$MPICC is using a different compiler than $CC.
MPI tests disabled.]) NO_MPI_TESTS=yes fi fi else CC_COMMON_NAME="unknown" fi #prevent icc warnings about overriding optimization settings set by AC_PROG_CC # remark #869: parameter was never referenced # remark #271: trailing comma is nonstandard if test "$CC_COMMON_NAME" = "icc"; then CFLAGS="$CFLAGS -diag-disable 188,869,271" if test "$MIC" = "yes"; then CC="$CC -mmic -fPIC" fi fi if test "$F77" = "ifort" -a "$MIC" = "yes"; then F77="$F77 -mmic -fPIC" fi AC_PROG_AWK AC_PROG_CPP AC_PROG_LN_S AC_PROG_MAKE_SET AC_PROG_RANLIB AC_GNU_SOURCE AC_HEADER_STDC AC_C_INLINE AC_HEADER_TIME AC_CHECK_HEADERS([sys/time.h c_asm.h intrinsics.h mach/mach_time.h sched.h]) AC_CHECK_FUNCS([gethrtime read_real_time time_base_to_time clock_gettime mach_absolute_time sched_getcpu]) # # Check if the system provides time_* symbols without -lrt, and if not, # check for -lrt existence. # AC_MSG_CHECKING([for timer_create and timer_*ettime symbols in base system]) AC_TRY_LINK([#include <time.h> #include <signal.h>], [timer_t timerid; timer_create(CLOCK_REALTIME, NULL, &timerid);], [rtsymbols_in_base="yes"], [rtsymbols_in_base="no"]) if test "${rtsymbols_in_base}" = "yes"; then AC_MSG_RESULT([found]) LRT="" else AC_MSG_RESULT([not found]) AC_MSG_CHECKING([for timer_create and timer_*ettime symbols in -lrt]) SAVED_LIBS=${LIBS} LIBS="${LIBS} -lrt" AC_TRY_LINK([#include <time.h> #include <signal.h>], [timer_t timerid; timer_create(CLOCK_REALTIME, NULL, &timerid);], [has_lrt="yes"], [has_lrt="no"]) LIBS=${SAVED_LIBS} if test "${has_lrt}" = "yes" ; then AC_MSG_RESULT([found]) LRT="-lrt" else AC_MSG_RESULT([not found]) AC_MSG_CHECKING([for timer_create and timer_*ettime symbols in -lrt -lpthread]) SAVED_LIBS=${LIBS} LIBS="${LIBS} -lrt -lpthread" AC_TRY_LINK([#include <time.h> #include <signal.h>], [timer_t timerid; timer_create(CLOCK_REALTIME, NULL, &timerid);], [has_lrt_lpthd="yes"], [has_lrt_lpthd="no"]) LIBS=${SAVED_LIBS} if test "${has_lrt_lpthd}" = "yes" ; then AC_MSG_RESULT([found]) LRT="-lrt -lpthread" else AC_MSG_RESULT([not found]) fi fi fi
AC_SUBST(LRT) # # Check if the system provides dl* symbols without -ldl, and if not, # check for -ldl existence. # AC_MSG_CHECKING([for dlopen and dlerror symbols in base system]) AC_TRY_LINK([#include <dlfcn.h>], [void *p = dlopen ("", 0); char *c = dlerror();], [dlsymbols_in_base="yes"], [dlsymbols_in_base="no"]) if test "${dlsymbols_in_base}" = "yes"; then AC_MSG_RESULT([found]) LDL="" else AC_MSG_RESULT([not found]) AC_MSG_CHECKING([for dlopen and dlerror symbols in -ldl]) SAVED_LIBS=${LIBS} LIBS="${LIBS} -ldl" AC_TRY_LINK([#include <dlfcn.h>], [void *p = dlopen ("", 0); char *c = dlerror();], [has_ldl="yes"], [has_ldl="no"]) LIBS=${SAVED_LIBS} if test "${has_ldl}" = "yes" ; then AC_MSG_RESULT([found]) LDL="-ldl" else AC_MSG_ERROR([cannot find dlopen and dlerror symbols either in the base system libraries or in -ldl]) fi fi AC_SUBST(LDL) if test "$OS" = "CLE"; then virtualtimer=times tls=__thread walltimer=cycle ffsll=yes cross_compiling=yes STATIC="-static" # _rtc is only defined when using the Cray compiler AC_MSG_CHECKING([for _rtc intrinsic]) rtc_ok=yes AC_TRY_LINK([#ifdef HAVE_INTRINSICS_H #include <intrinsics.h> #endif], [_rtc()], [AC_DEFINE(HAVE__RTC,1,[Define for _rtc() intrinsic.])], [rtc_ok=no AC_DEFINE(NO_RTC_INTRINSIC,1,[Define if _rtc() is not found.])]) AC_MSG_RESULT($rtc_ok) elif test "$OS" = "bgp"; then CC=powerpc-bgp-linux-gcc F77=powerpc-bgp-linux-gfortran walltimer=cycle virtualtimer=proc tls=no ffsll=yes cross_compiling=yes elif test "$OS" = "bgq"; then AC_ARG_WITH(bgpm_installdir, [AS_HELP_STRING([--with-bgpm_installdir=], [Specify the installation path of BGPM])], [BGPM_INSTALL_DIR=$withval CFLAGS="$CFLAGS -I$withval"], [AC_MSG_ERROR([BGQ CPU component requires installation path of BGPM (see --with-bgpm_installdir)])]) bitmode=64 tls=no elif test "$OS" = "linux"; then if test "$arch" = "ppc64" -o "$arch" = "x86_64"; then if test "$bitmode" = "64" -a "$libdir" = '${exec_prefix}/lib'; then libdir='${exec_prefix}/lib64' fi fi elif test "$OS" = "solaris"; then
AC_CHECK_TYPE([hrtime_t], [AC_DEFINE(HAVE_HRTIME_T, 1, [Define if hrtime_t is defined in ])],[], [#if HAVE_SYS_TIME_H #include #endif]) if test "x$AR" = "x"; then AR=/usr/ccs/bin/ar fi fi if test "x$AR" = "x"; then AR=ar fi if test "$cross_compiling" = "yes" ; then AC_MSG_CHECKING(for native compiler for header generation) AC_ARG_WITH(nativecc, [AS_HELP_STRING([--with-nativecc=], [Specify native C compiler for header generation])], [nativecc=$withval], [nativecc=gcc]) AC_MSG_RESULT($nativecc) fi AC_MSG_CHECKING(for tests) AC_ARG_WITH(tests, [AS_HELP_STRING([--with-tests=<"ctests ftests mpitests comp_tests", no>], [Specify which tests to run on install, or "no" tests (default: all available tests)])], [tests=$withval], [tests="ctests ftests mpitests comp_tests"]) if test "$tests" = "no"; then AC_MSG_RESULT($tests) tests= NO_MPI_TESTS=yes else # mpitests is not a target tmp_tests= mpi_tests=no case "$tests" in *ctests*) tmp_tests+="ctests " ;; esac case "$tests" in *ftests*) tmp_tests+="ftests " ;; esac case "$tests" in *comp_tests*) tmp_tests+="comp_tests " ;; esac case "$tests" in *mpitests*) # we already checked if mpicc is working if test "x$MPICC" != "x"; then if test "x$NO_MPI_TESTS" = "x"; then mpi_tests=yes # mpitests only works together with ctests if test "x$tmp_tests" != "x"; then tmp_tests+="mpitests " fi fi fi ;; esac if test "x$tmp_tests" = "x"; then AC_MSG_RESULT(no) else AC_MSG_RESULT($tmp_tests) fi # do not list mpitests for makefile target case "$tmp_tests" in *mpitests* ) tmp_tests=$(echo "$tmp_tests" | sed 's/ mpitests//') ;; esac tests=$tmp_tests # mpitests is not listed by the user if test "$mpi_tests" = "no"; then NO_MPI_TESTS=yes fi fi AC_MSG_CHECKING(for debug build) # default value for --with-debug if not set by user debug="no" AC_ARG_WITH(debug, [AS_HELP_STRING([--with-debug=], [Build a debug version, debug version plus memory tracker or none])], [debug=$withval]) if test "$debug" = "yes"; then if test "$CC_COMMON_NAME" = "gcc"; then 
CFLAGS="$CFLAGS -g3" fi OPTFLAGS="-O0" PAPICFLAGS+=" -DDEBUG -DPAPI_NO_MEMORY_MANAGEMENT" elif test "$debug" = "memory"; then if test "$CC_COMMON_NAME" = "gcc"; then CFLAGS="$CFLAGS -g3" fi OPTFLAGS="-O0" PAPICFLAGS+=" -DDEBUG" else PAPICFLAGS+="-DPAPI_NO_MEMORY_MANAGEMENT" fi AC_MSG_RESULT($debug) AC_ARG_ENABLE([warnings], [AS_HELP_STRING([--enable-warnings], [Enable build with -Wall -Wextra (default: disabled)])], [], [enable_warnings=no]) if test "$CC_COMMON_NAME" = "gcc"; then if test "$enable_warnings" = "yes"; then gcc_version=`gcc -v 2>&1 | tail -n 1 | awk '{printf $3}'` major=`echo $gcc_version | sed 's/\([[^.]][[^.]]*\).*/\1/'` minor=`echo $gcc_version | sed 's/[[^.]][[^.]]*.\([[^.]][[^.]]*\).*/\1/'` if (test "$major" -ge 4 || test "$major" = 3 -a "$minor" -ge 4); then CFLAGS+=" -Wall -Wextra" else CFLAGS+=" -W" fi # -Wextra => -Woverride-init on gcc >= 4.2 # This issues a warning (error under -Werror) for some libpfm4 code. fi oldcflags="$CFLAGS" AC_MSG_CHECKING(for -Wno-override-init) CFLAGS+=" -Wall -Werror -Wno-override-init" AC_COMPILE_IFELSE([AC_LANG_SOURCE( [ struct A { int x; int y; }; int main(void) { struct A a = {.x = 0, .y = 0, .y = 5 }; return a.x; } ])], [HAVE_NO_OVERRIDE_INIT=1], [HAVE_NO_OVERRIDE_INIT=0]) CFLAGS="$oldcflags" AC_MSG_RESULT($HAVE_NO_OVERRIDE_INIT) fi AC_MSG_CHECKING(for CPU type) AC_ARG_WITH(CPU, [AS_HELP_STRING([--with-CPU=], [Specify CPU type])], [CPU=$withval case "$CPU" in core|core2|i7|atom|p4|p3|opteron|athlon|x86) MISCSRCS="$MISCSRCS x86_cpuid_info.c" esac], [case "$OS" in aix) CPU="`/usr/sbin/lsattr -E -l proc0 | grep type | cut -d '_' -f 2 | cut -d ' ' -f 1 | tr '[A-Z]' '[a-z]'`" if test "$CPU" = ""; then CPU="`/usr/sbin/lsattr -E -l proc1 | grep type | cut -d '_' -f 2 | cut -d ' ' -f 1 | tr '[A-Z]' '[a-z]'`" fi ;; nec) family=nec ;; freebsd) family=`uname -m` if test "$family" = "amd64"; then MISCSRCS="$MISCSRCS x86_cpuid_info.c" elif test "$family" = "i386"; then MISCSRCS="$MISCSRCS x86_cpuid_info.c" fi ;; darwin) 
family=`uname -m` MISCSRCS="$MISCSRCS x86_cpuid_info.c" ;; linux) family=`uname -m` if test "$family" = "x86_64"; then MISCSRCS="$MISCSRCS x86_cpuid_info.c" CPU="x86" elif test "$family" = "i686"; then MISCSRCS="$MISCSRCS x86_cpuid_info.c" CPU="x86" elif test "$family" = "ppc64" || test "$family" = "ppc64le"; then if (test "$family" = "ppc64le" && test "$CC_COMMON_NAME" = "gcc"); then # set cuda_version to be 0, such that we avoid "integer expression expected" cuda_version=0 if (test ! -z "$PAPI_CUDA_ROOT"); then update_cuda_version=`grep -r '#define CUDA_VERSION [0-9]' $PAPI_CUDA_ROOT/include/cuda.h 2>/dev/null | awk '{print $3}'` # update cuda_version, this will stay 0 unless PAPI_CUDA_ROOT was properly set to a Cuda install cuda_version=`echo "$cuda_version $update_cuda_version" | awk '{print $1 + $2}'` fi # get gcc major number gcc_major=`gcc -v 2>&1 | tail -n 1 | awk '{printf $3}' | sed 's/\([[^.]][[^.]]*\).*/\1/'` # set variable if the gcc and cuda versions match conditions if (test "$cuda_version" -ge 10000 && test "$cuda_version" -lt 11000 \ && test "$gcc_major" -ge 8 && test "$gcc_major" -lt 9); then NVPPC64LEFLAGS="-Xcompiler -mno-float128" fi fi CPU_info="`cat /proc/cpuinfo | grep cpu | cut -d: -f2 | cut -d' ' -f2 | sed '2,$d' | tr -d ","`" case "$CPU_info" in PPC970*) CPU="PPC970";; POWER5) CPU="POWER5";; POWER5+) CPU="POWER5+";; POWER6) CPU="POWER6";; POWER7) CPU="POWER7";; POWER8) CPU="POWER8";; POWER9) CPU="POWER9";; POWER10) CPU="POWER10";; esac elif test "${family:0:3}" = "arm" || test "$family" = "aarch64"; then CPU="arm" fi ;; solaris) AC_CHECK_HEADER([libcpc.h], [CFLAGS="$CFLAGS -lcpc" AC_TRY_RUN([#include #include int main() { // Check for libcpc 2 if(CPC_VER_CURRENT == 2) exit(0); exit(1); } ], [cpc_version=2], [cpc_version=0])], [AC_MSG_ERROR([libcpc is needed for running PAPI on Solaris]) ]) processor=`uname -p` machinetype=`uname -m` if test "$processor" = "sparc"; then if test "$machinetype" = "sun4u"; then CPU=ultra AC_CHECK_LIB([cpc], 
[cpc_take_sample], [], [AC_MSG_ERROR([libcpc.a is needed on Solaris, install SUNWcpc]) ]) elif test "$machinetype" = "sun4v"; then CPU=niagara2 if test "$cpc_version" != "2"; then AC_MSG_ERROR([libcpc2 needed for Niagara 2]) fi else AC_MSG_ERROR([$machinetype not supported]) fi else AC_MSG_ERROR([Only SPARC processors are supported on Solaris]) fi ;; bgp) CPU=bgp ;; bgq) CPU=bgq ;; esac ]) AC_MSG_RESULT($CPU) AC_DEFINE_UNQUOTED(CPU,$CPU,[cpu type]) # First set pthread-mutexes based on arch case $arch in aarch64|arm*|parisc*) pthread_mutexes=yes CFLAGS="$CFLAGS -DUSE_PTHREAD_MUTEXES" echo "forcing use of pthread mutexes... " >&6 ;; esac AC_ARG_WITH(pthread-mutexes, [AS_HELP_STRING([--with-pthread-mutexes], [Specify use of pthread mutexes rather than custom PAPI locks])], [pthread_mutexes=yes CFLAGS="$CFLAGS -DUSE_PTHREAD_MUTEXES" ]) AC_ARG_WITH(ffsll, [AS_HELP_STRING([--with-ffsll], [Specify use of the ffsll() function])], [ffsll=$withval], [if test "$cross_compiling" = "yes" ; then AC_MSG_ERROR([ffsll must be specified for cross compile]) fi didcheck=1 AC_CHECK_FUNC(ffsll,[ffsll=yes],[ffsll=no]) ]) if test "$ffsll" = "yes" ; then AC_DEFINE(HAVE_FFSLL, 1, This platform has the ffsll() function) fi if test "$didcheck" != "1"; then AC_MSG_CHECKING(for ffsll) if test "$ffsll" = "yes" ; then AC_DEFINE(HAVE_FFSLL, 1, This platform has the ffsll() function) fi AC_MSG_RESULT($ffsll) fi AC_MSG_CHECKING(for working gettid) AC_LINK_IFELSE([AC_LANG_SOURCE([#include #include int main() { pid_t a = gettid(); }])], [AC_MSG_RESULT(yes) AC_DEFINE(HAVE_GETTID, 1, [Full gettid function])], [AC_MSG_RESULT(no) AC_MSG_CHECKING(for working syscall(SYS_gettid)) AC_LINK_IFELSE([AC_LANG_SOURCE([#include #include #include int main() { pid_t a = syscall(SYS_gettid); }])], [AC_MSG_RESULT(yes) AC_DEFINE(HAVE_SYSCALL_GETTID, 1, [gettid syscall function])], [AC_MSG_RESULT(no)]) ]) AC_ARG_WITH(walltimer, [AS_HELP_STRING([--with-walltimer=], [Specify realtime timer])], [walltimer=$withval], [if 
test "$cross_compiling" = "yes" ; then AC_MSG_ERROR([walltimer must be specified for cross compile]) fi AC_MSG_CHECKING(for working MMTIMER) AC_TRY_RUN([#include #include #include #include #include #include #ifndef MMTIMER_FULLNAME #define MMTIMER_FULLNAME "/dev/mmtimer" #endif int main() { int offset; int fd; if((fd = open(MMTIMER_FULLNAME, O_RDONLY)) == -1) exit(1); if ((offset = ioctl(fd, MMTIMER_GETOFFSET, 0)) < 0) exit(1); close(fd); exit(0); } ], [walltimer="mmtimer" AC_MSG_RESULT(yes)], [AC_MSG_RESULT(no) AC_MSG_CHECKING(for working CLOCK_REALTIME_HR POSIX 1b timer) AC_TRY_RUN([#include #include #include #include #include int main() { struct timespec t1, t2; double seconds; if (syscall(__NR_clock_gettime,CLOCK_REALTIME_HR,&t1) == -1) exit(1); sleep(1); if (syscall(__NR_clock_gettime,CLOCK_REALTIME_HR,&t2) == -1) exit(1); seconds = ((double)t2.tv_sec + (double)t2.tv_nsec/1000000000.0) - ((double)t1.tv_sec + (double)t1.tv_nsec/1000000000.0); if (seconds > 1.0) exit(0); else exit(1); } ], [walltimer="clock_realtime_hr" AC_MSG_RESULT(yes)], [AC_MSG_RESULT(no) AC_MSG_CHECKING(for working CLOCK_REALTIME POSIX 1b timer) AC_TRY_RUN([#include #include #include #include #include int main() { struct timespec t1, t2; double seconds; if (syscall(__NR_clock_gettime,CLOCK_REALTIME,&t1) == -1) exit(1); sleep(1); if (syscall(__NR_clock_gettime,CLOCK_REALTIME,&t2) == -1) exit(1); seconds = ((double)t2.tv_sec + (double)t2.tv_nsec/1000000000.0) - ((double)t1.tv_sec + (double)t1.tv_nsec/1000000000.0); if (seconds > 1.0) exit(0); else exit(1); } ], [walltimer="clock_realtime" AC_MSG_RESULT(yes) ], [walltimer="cycle" AC_MSG_RESULT(no)]) ]) ]) ]) AC_MSG_CHECKING(for which real time clock to use) if test "$walltimer" = "gettimeofday"; then AC_DEFINE(HAVE_GETTIMEOFDAY, 1, [Normal gettimeofday timer]) elif test "$walltimer" = "mmtimer"; then AC_DEFINE(HAVE_MMTIMER, 1, [Altix memory mapped global cycle counter]) altix="-DALTIX" elif test "$walltimer" = "clock_realtime_hr"; then 
AC_DEFINE(HAVE_CLOCK_GETTIME, 1, [POSIX 1b clock])
    AC_DEFINE(HAVE_CLOCK_GETTIME_REALTIME_HR, 1, [POSIX 1b realtime HR clock])
elif test "$walltimer" = "clock_realtime"; then
    AC_DEFINE(HAVE_CLOCK_GETTIME, 1, [POSIX 1b clock])
    AC_DEFINE(HAVE_CLOCK_GETTIME_REALTIME, 1, [POSIX 1b realtime clock])
elif test "$walltimer" = "cycle"; then
    AC_DEFINE(HAVE_CYCLE, 1, [Native access to a hardware cycle counter])
else
    AC_MSG_ERROR([Unknown value for walltimer])
fi
AC_MSG_RESULT($walltimer)

SAVED_LIBS=$LIBS
SAVED_LDFLAGS=$LDFLAGS
SAVED_CFLAGS=$CFLAGS
LIBS=""
LDFLAGS=""
CFLAGS="-pthread"

AC_ARG_WITH(tls,
    [AS_HELP_STRING([--with-tls=], [This platform supports thread local storage with a keyword])],
    [tls=$withval],
    [if test "$cross_compiling" = "yes" ; then
        AC_MSG_ERROR([tls must be specified for cross compile])
     fi
     AC_MSG_CHECKING(for working __thread)
     AC_TRY_RUN([#include <pthread.h>
#include <unistd.h>
extern __thread int i;
static int res1, res2;
void *thread_main (void *arg)
{
    i = (int)arg;
    sleep (1);
    if ((int)arg == 1) res1 = (i == (int)arg);
    else res2 = (i == (int)arg);
    return NULL;
}
__thread int i;
int main ()
{
    pthread_t t1, t2;
    i = 5;
    pthread_create (&t1, NULL, thread_main, (void *)1);
    pthread_create (&t2, NULL, thread_main, (void *)2);
    pthread_join (t1, NULL);
    pthread_join (t2, NULL);
    return !(res1 + res2 == 2);
}
],
     [AC_MSG_RESULT(yes)
      tls="__thread"],
     [AC_MSG_RESULT(no)
      tls="no"])
     if test "$OS" = "linux"; then
        if test "x$tls" = "x__thread"; then
            # On some linux distributions, TLS works in executables, but linking against
            # a shared library containing TLS fails with: undefined reference to `__tls_get_addr'
            AC_LINK_IFELSE([AC_LANG_SOURCE([
static __thread int foo;
void main() { foo = 5; }
])],
                [],
                [tls="no" ; AC_MSG_WARN([Disabling usage of __thread])])
        fi
     fi])

AC_MSG_CHECKING(for high performance thread local storage)
if test "$tls" = "no"; then
    NOTLS="-DNO_TLS"
elif test "x$tls" != "x"; then
    if test "$tls" = "yes"; then
        tls="__thread"
    fi
    NOTLS="-DUSE_COMPILER_TLS"
AC_DEFINE_UNQUOTED(HAVE_THREAD_LOCAL_STORAGE,$tls,[Keyword for per-thread variables])
fi
AC_MSG_RESULT($tls)

AC_ARG_WITH(virtualtimer,
    [AS_HELP_STRING([--with-virtualtimer=], [Specify per-thread virtual timer])],
    [virtualtimer=$withval],
    [if test "$cross_compiling" = "yes" ; then
        AC_MSG_ERROR([virtualtimer must be specified for cross compile])
     fi
     AC_MSG_CHECKING(for working CLOCK_THREAD_CPUTIME_ID POSIX 1b timer)
     AC_TRY_RUN([#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <sys/types.h>
#if !defined( SYS_gettid )
#define SYS_gettid 1105
#endif
struct timespec threadone = { 0, 0 };
struct timespec threadtwo = { 0, 0 };
pthread_t threadOne, threadTwo;
volatile int done = 0;
int gettid() { return syscall( SYS_gettid ); }
void *doThreadOne( void * v )
{
    while (!done) sleep(1);
    if (syscall(__NR_clock_gettime,CLOCK_THREAD_CPUTIME_ID,&threadone) == -1) {
        perror("clock_gettime(CLOCK_THREAD_CPUTIME_ID)");
        exit(1);
    }
    return 0;
}
void *doThreadTwo( void * v )
{
    long i, j = 0xdeadbeef;
    for( i = 0; i < 0xFFFFFFF; ++i ) { j = j ^ i; }
    if (syscall(__NR_clock_gettime,CLOCK_THREAD_CPUTIME_ID,&threadtwo) == -1) {
        perror("clock_gettime(CLOCK_THREAD_CPUTIME_ID)");
        exit(1);
    }
    done = 1;
    return (void *) j;
}
int main( int argc, char ** argv )
{
    int status = pthread_create( & threadOne, NULL, doThreadOne, NULL );
    assert( status == 0 );
    status = pthread_create( & threadTwo, NULL, doThreadTwo, NULL );
    assert( status == 0 );
    status = pthread_join( threadTwo, NULL );
    assert( status == 0 );
    status = pthread_join( threadOne, NULL );
    assert( status == 0 );
    if ((threadone.tv_sec != threadtwo.tv_sec) || (threadone.tv_nsec != threadtwo.tv_nsec))
        exit(0);
    else {
        fprintf(stderr,"T1 %ld %ld T2 %ld %ld\n",threadone.tv_sec,threadone.tv_nsec,threadtwo.tv_sec,threadtwo.tv_nsec);
        exit(1);
    }
}
],
     [AC_MSG_RESULT(yes)
      virtualtimer="clock_thread_cputime_id"],
     [AC_MSG_RESULT(no)
      # *** Checks for working per thread timer***
      AC_MSG_CHECKING(for working per-thread times() timer)
AC_TRY_RUN([#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <sys/times.h>
#include <sys/types.h>
#if !defined( SYS_gettid )
#define SYS_gettid 1105
#endif
long threadone = 0, threadtwo = 0;
pthread_t threadOne, threadTwo;
volatile int done = 0;
int gettid() { return syscall( SYS_gettid ); }
void *doThreadOne( void * v )
{
    struct tms tm;
    int status;
    while (!done) sleep(1);
    status = times( & tm );
    assert( status != -1 );
    threadone = tm.tms_utime;
    return 0;
}
void *doThreadTwo( void * v )
{
    struct tms tm;
    long i, j = 0xdeadbeef;
    int status;
    for( i = 0; i < 0xFFFFFFF; ++i ) { j = j ^ i; }
    status = times( & tm );
    assert( status != -1 );
    threadtwo = tm.tms_utime;
    done = 1;
    return (void *) j;
}
int main( int argc, char ** argv )
{
    int status = pthread_create( & threadOne, NULL, doThreadOne, NULL );
    assert( status == 0 );
    status = pthread_create( & threadTwo, NULL, doThreadTwo, NULL );
    assert( status == 0 );
    status = pthread_join( threadTwo, NULL );
    assert( status == 0 );
    status = pthread_join( threadOne, NULL );
    assert( status == 0 );
    return (threadone == threadtwo);
}
],
      [AC_MSG_RESULT(yes)
       virtualtimer="times"],
      [AC_MSG_RESULT(no)
       virtualtimer="default"])
     ])
])
LDFLAGS=$SAVED_LDFLAGS
CFLAGS=$SAVED_CFLAGS
LIBS=$SAVED_LIBS

AC_MSG_CHECKING(for which virtual timer to use)
case "$virtualtimer" in
times)
    AC_DEFINE(HAVE_PER_THREAD_TIMES, 1, [Working per thread timer]) ;;
getrusage)
    AC_DEFINE(HAVE_PER_THREAD_GETRUSAGE, 1, [Working per thread getrusage]) ;;
clock_thread_cputime_id)
    AC_DEFINE(HAVE_CLOCK_GETTIME_THREAD, CLOCK_THREAD_CPUTIME_ID, [POSIX 1b per-thread clock]) ;;
proc|default)
    AC_DEFINE(USE_PROC_PTTIMER, 1, [Use /proc for per-thread times])
esac
AC_MSG_RESULT($virtualtimer)

if test "$OS" = "aix"; then
    AC_ARG_WITH(pmapi,
        [AS_HELP_STRING([--with-pmapi=], [Specify path of pmapi on aix system])],
        [PMAPI=$withval],
        [PMAPI="/usr/pmapi"])
    LIBS="-L$PMAPI/lib -lpmapi"
    CPPFLAGS="$CPPFLAGS -I$PMAPI/include"
    AC_CHECK_LIB([pmapi], [pm_initialize], [PMINIT="-DPM_INITIALIZE"],
[AC_CHECK_LIB([pmapi], [pm_init], [PMINIT="-DPM_INIT"], [AC_MSG_ERROR([libpmapi.a not found, rerun configure with different flags]) ]) ]) fi AC_MSG_CHECKING(for static user preset events) AC_ARG_WITH(static_user_events, [AS_HELP_STRING([--with-static-user-events], [Build with a static user events file.])], [STATIC_USER_EVENTS=$withval], [STATIC_USER_EVENTS=no]) if test "$STATIC_USER_EVENTS" = "yes"; then PAPICFLAGS+=" -DSTATIC_USER_EVENTS" fi AC_MSG_RESULT($STATIC_USER_EVENTS) AC_MSG_CHECKING(for static PAPI preset events) AC_ARG_WITH(static_papi_events, [AS_HELP_STRING([--with-static-papi-events], [Build with a static papi events file.])], [STATIC_PAPI_EVENTS=$withval], [STATIC_PAPI_EVENTS=yes]) if test "$STATIC_PAPI_EVENTS" = "yes"; then PAPICFLAGS+=" -DSTATIC_PAPI_EVENTS_TABLE" fi AC_MSG_RESULT($STATIC_PAPI_EVENTS) AC_MSG_CHECKING(for whether to build static library) AC_ARG_WITH(static_lib, [AS_HELP_STRING([--with-static-lib=], [Build a static library])], [static_lib=$withval], [static_lib=yes]) AC_MSG_RESULT($static_lib) AC_MSG_CHECKING(for whether to build shared library) AC_ARG_WITH(shared_lib, [AS_HELP_STRING([--with-shared-lib=], [Build a shared library])], [shared_lib=$withval], [shared_lib=yes]) AC_MSG_RESULT($shared_lib) if test "$shared_lib" = "no" -a "$static_lib" = "no"; then AC_MSG_ERROR(Both shared and static libs are disabled) fi BUILD_SHARED_LIB="no" if test "$shared_lib" = "yes"; then BUILD_SHARED_LIB="yes" papiLIBS="shared" fi AC_SUBST(BUILD_SHARED_LIB) if test "$static_lib" = "yes"; then papiLIBS="$papiLIBS static" fi if test "$shared_lib" = "no" -a "$static_lib" = "yes"; then NO_MPI_TESTS="yes" fi AC_MSG_CHECKING(for static compile of tests and utilities) AC_ARG_WITH(static_tools, [AS_HELP_STRING([--with-static-tools], [Specify static compile of tests and utilities])], [STATIC="-static" AC_MSG_RESULT(yes)], [AC_MSG_RESULT(no)]) # Disable LDL for static builds if test "$STATIC" = "-static"; then LDL="" fi AC_MSG_CHECKING(for linking with papi 
shared library of tests and utilities)
AC_ARG_WITH(shlib_tools,
    [AS_HELP_STRING([--with-shlib-tools], [Specify linking with papi library of tests and utilities])],
    [shlib_tools=yes
     AC_MSG_RESULT(yes)],
    [shlib_tools=no
     AC_MSG_RESULT(no)])
if test "$static_lib" = "no"; then
    shlib_tools=yes
fi
if test "$static_lib" = "no" -a "$shlib_tools" = "no"; then
    AC_MSG_ERROR(Building tests and utilities static but no static papi library to be built)
fi
if test "$shlib_tools" = "yes"; then
    if test "$shared_lib" != "yes"; then
        AC_MSG_ERROR(Building static but specified shared linking for tests and utilities)
    fi
    if test "$STATIC" = "-static"; then
        AC_MSG_ERROR([Building shared but specified static linking])
    fi
    LINKLIB='$(SHLIB)'
    # Set rpath and runpath to find libpfm.so and libpapi.so if not specified via
    # LD_LIBRARY_PATH. The search path at runtime can be overridden by LD_LIBRARY_PATH.
    LDFLAGS="$LDFLAGS -Wl,-rpath=$PWD/libpfm4/lib:$PWD,--enable-new-dtags"
elif test "$shlib_tools" = "no"; then
    if test "$static_lib" != "yes"; then
        AC_MSG_ERROR([Building shared but specified static linking for tests and utilities])
    fi
    LINKLIB='$(LIBRARY)'
fi

## By default we want libsde built, so if the user does not
## give an option, then we set BUILD_LIBSDE_* to yes.
AC_MSG_CHECKING(for building libsde)
AC_ARG_WITH(libsde,
    [AS_HELP_STRING([--with-libsde=], [Build the standalone libsde (default: yes)])],
    [],
    [with_libsde=yes])
if test "$with_libsde" = "no"; then
    BUILD_LIBSDE_SHARED="no"
    BUILD_LIBSDE_STATIC="no"
    AC_MSG_RESULT(no)
else
    BUILD_LIBSDE_SHARED="yes"
    if test "$static_lib" = "yes"; then
        BUILD_LIBSDE_STATIC="yes"
        AC_MSG_RESULT(shared and static)
    else
        BUILD_LIBSDE_STATIC="no"
        AC_MSG_RESULT(shared only)
    fi
    if test "$PWD" != ""; then
        TOPDIR=$PWD
    else
        TOPDIR=$(pwd)
    fi
fi
AC_SUBST(BUILD_LIBSDE_SHARED)
AC_SUBST(BUILD_LIBSDE_STATIC)
AC_SUBST(LIBSDEFLAGS)

user_specified_interface=no

##################################################
# perfnec
##################################################
force_perfnec=no
perfnec=0
AC_ARG_WITH(perfnec,
    [AS_HELP_STRING([--with-perfnec], [Specify perf_nec as the performance interface])],
    [perfnec=$withval
     user_specified_interface=perfnec
     force_perfnec=yes
     pfm_root=libperfnec
     pfm_incdir="libperfnec/include"],
    [perfnec=0
     if test "$perfnec" != 0; then
        pfm_incdir="libperfnec/include"
     fi])
force_pfm_incdir=no

##################################################
# perfmon
##################################################
old_pfmv2=n
perfmon=0
perfmon2=no
force_perfmon2=no
AC_ARG_WITH(perfmon,
    [AS_HELP_STRING([--with-perfmon=], [Specify perfmon as the performance interface and specify version])],
    [perfmon=$withval
     user_specified_interface=perfmon
     force_perfmon2=yes
     pfm_incdir="libpfm-3.y/include"
     perfmon=`echo ${perfmon} | sed 's/^[ \t]*//;s/[ \t]*$//'`
     perfmon=`echo ${perfmon} | grep -e '[[1-9]]\.[[0-9]][[0-9]]*'`
     if test "x$perfmon" = "x"; then
        AC_MSG_ERROR("Badly formed perfmon version string")
     fi
     perfmon=`echo ${perfmon} | sed 's/\.//'`
     if test $perfmon -gt 20; then
        perfmon2=yes
     fi
     if test $perfmon -lt 25; then
        old_pfmv2=y
        PFMCFLAGS="-DPFMLIB_OLD_PFMV2"
     fi],
    [perfmon=0
     if test "$cross_compiling" = "no" ; then
        AC_CHECK_FILE(/sys/kernel/perfmon/version,
            [perfmon=`cat /sys/kernel/perfmon/version`],
[AC_CHECK_FILE(/proc/perfmon, [perfmon=`cat /proc/perfmon | grep version | cut -d: -f2`], [perfmon=0])]) if test "$perfmon" != 0; then pfm_incdir="libpfm-3.y/include" perfmon=`echo ${perfmon} | sed 's/^[ \t]*//;s/[ \t]*$//'` perfmon=`echo ${perfmon} | grep -e '[[1-9]]\.[[0-9]][[0-9]]*'` perfmon=`echo ${perfmon} | sed 's/\.//'` if test $perfmon -gt 20; then perfmon2=yes fi if test $perfmon -lt 25; then # must be y, not yes, or libpfm breaks old_pfmv2="y" PFMCFLAGS="-DPFMLIB_OLD_PFMV2" fi fi fi]) force_pfm_incdir=no # default AC_ARG_WITH(pfm_root, [AS_HELP_STRING([--with-pfm-root=], [Specify path to source tree (for use by developers only)])], [pfm_root=$withval pfm_incdir=$withval/include pfm_libdir=$withval/lib]) AC_ARG_WITH(pfm_prefix, [AS_HELP_STRING([--with-pfm-prefix=], [Specify prefix to installed pfm distribution])], [pfm_prefix=$withval pfm_incdir=$pfm_prefix/include pfm_libdir=$pfm_prefix/lib]) AC_ARG_WITH(pfm_incdir, [AS_HELP_STRING([--with-pfm-incdir=], [Specify directory of pfm header files in non-standard location])], [pfm_incdir=$withval]) AC_ARG_WITH(pfm_libdir, [AS_HELP_STRING([--with-pfm-libdir=], [Specify directory of pfm library in non-standard location])], [pfm_libdir=$withval]) # if these are both empty, it means we haven't set either pfm_prefix or pfm_root # which would have set them. Thus it means that we set this to our included # libpfm4 library. Shame on the person that sets one but not the other. 
if test "x$pfm_incdir" = "x" -a "x$pfm_libdir" = "x"; then pfm_root="libpfm4" pfm_incdir="libpfm4/include" pfm_libdir="libpfm4/lib" fi ################################################## # Linux perf_event/perf_counter ################################################## perf_events=yes force_perf_events=no perf_events_uncore=yes if test "x$mic" = "xno"; then perf_events=no fi awk ' /^# validation_tests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib$/ { sub(/^# /, ""); print; next } /^# \t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+validation_tests$/ { sub(/^# /, ""); print; next } # ctests lines /^# ctests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib[[:space:]]+validation_tests$/ { sub(/^# /, ""); print; next } /^# \t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+ctests$/ { sub(/^# /, ""); print; next } # ftests lines /^# ftests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib$/ { sub(/^# /, ""); print; next } /^# \t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+ftests$/ { sub(/^# /, ""); print; next } { print } ' "Makefile.inc" > "t_Makefile.inc" mv "t_Makefile.inc" "Makefile.inc" AC_ARG_ENABLE(cpu, [AS_HELP_STRING([--disable-cpu], [Disable cpu component])]) AS_IF([test "x$enable_cpu" = "xno"],[ perf_events=no force_perf_events=no perf_events_uncore=no FILE_PATH="Makefile.inc" TEMP_FILE="temp_$FILE_PATH" # Comment out lines to disable ctest ftest vtest awk ' /^validation_tests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib$/ { print "# " $0; next } /^\t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+validation_tests$/ { print "# " $0; next } /^ctests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib[[:space:]]+validation_tests$/ { print "# " $0; next } /^\t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+ctests$/ { print "# " $0; next } /^ftests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib$/ { print "# " $0; next } /^\t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+ftests$/ { print "# " $0; next } { print } ' 
"$FILE_PATH" > "$TEMP_FILE"
    mv "$TEMP_FILE" "$FILE_PATH"
])

AC_ARG_ENABLE(perf_event,
    [AS_HELP_STRING([--disable-perf-event], [Disable perf_event component])])
AS_IF([test "x$enable_perf_event" = "xno"],[
    perf_events=no
    force_perf_events=no
    FILE_PATH="Makefile.inc"
    TEMP_FILE="temp_$FILE_PATH"
    awk '
    /^validation_tests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib$/ { print "# " $0; next }
    /^\t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+validation_tests$/ { print "# " $0; next }
    /^ctests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib[[:space:]]+validation_tests$/ { print "# " $0; next }
    /^\t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+ctests$/ { print "# " $0; next }
    /^ftests:[[:space:]]+\$\(LIBS\)[[:space:]]+testlib$/ { print "# " $0; next }
    /^\t\$\(SETPATH\)[[:space:]]+\$\(MAKE\)[[:space:]]+-C[[:space:]]+ftests$/ { print "# " $0; next }
    { print }
    ' "$FILE_PATH" > "$TEMP_FILE"
    mv "$TEMP_FILE" "$FILE_PATH"
])

AC_ARG_ENABLE(perf_event_uncore,
    [AS_HELP_STRING([--disable-perf-event-uncore], [Disable perf_event_uncore component])])
AS_IF([test "x$enable_perf_event_uncore" = "xno"],[
    perf_events_uncore=no
])

AC_ARG_WITH(perf_events,
    [AS_HELP_STRING([--with-perf-events], [Specify use of Linux Performance Event (requires kernel 2.6.32 or greater)])],
    [force_perf_events=yes
     user_specified_interface=pe])

# RDPMC support
AC_ARG_ENABLE(perfevent_rdpmc,
    [AS_HELP_STRING([--enable-perfevent-rdpmc], [Enable userspace rdpmc instruction on perf_event, default: yes])],
    [case "${enableval}" in
        yes) enable_perfevent_rdpmc=true ;;
        no)  enable_perfevent_rdpmc=false ;;
        *)   AC_MSG_ERROR([bad value ${enableval} for --enable-perfevent-rdpmc]) ;;
     esac],
    [enable_perfevent_rdpmc=true])
if test "$enable_perfevent_rdpmc" = "true"; then
    PECFLAGS="$PECFLAGS -DUSE_PERFEVENT_RDPMC=1"
fi

# Uncore support
AC_ARG_WITH(pe_incdir,
    [AS_HELP_STRING([--with-pe-incdir=], [Specify path to the correct perf header file])],
    [pe_incdir=$withval
     force_perf_events=yes
user_specified_interface=pe], [pe_incdir=$pfm_incdir/perfmon]) # Check for perf_event.h if test "$force_perf_events" = "yes"; then perf_events="yes" fi if test "$cross_compiling" = "no"; then AC_CHECK_FILE(/proc/sys/kernel/perf_event_paranoid,[ have_paranoid=yes AC_CHECK_FILE([$pe_incdir/perf_event.h], [ if test "$perf_events" != "no"; then perf_events="yes" fi ]) ]) fi if test "$perf_events" = "yes"; then PECFLAGS="$PECFLAGS -DPEINCLUDE=\\\"$pe_incdir/perf_event.h\\\"" fi # # Sort out the choice of the user vs. what we detected # # MESSING WITH CFLAGS IS STUPID! # if test "$user_specified_interface" != "no"; then if test "$user_specified_interface" = "perfmon"; then perf_events="no" PAPICFLAGS+=" $PFMCFLAGS" perfnec=0 else if test "$user_specified_interface" = "pe"; then perfmon=0 PAPICFLAGS+=" $PECFLAGS" perfnec=0 else if test "$user_specified_interface" = "perfnec"; then perfmon=0 perf_events=0 PAPICFLAGS+=" -DPERFNEC" AC_MSG_NOTICE([XXXXX user_specified_interface perfnec]) else AC_MSG_ERROR("Unknown user_specified_interface=$user_specified_interface perfmon=$perfmon perfmon2=$perfmon2 perf-events=$perf_events perfnec=$perfnec") fi fi fi else if test "$perfmon" != 0; then PAPICFLAGS+=" $PFMCFLAGS" fi if test "$perf_events" = "yes"; then PAPICFLAGS+=" $PECFLAGS" fi fi # # User has made no choice, so we default to the ordering below in the platform section, if # we detect more than one. # # # What does this next section do? 
It determines whether or not to run the tests for libpfm # based on the settings of pfm_root, pfm_prefix, pfm_incdir, pfm_libdir # # Both should be 0 for NEC if test "$perfmon" != 0 -o "$perf_events" = "yes"; then # if prefix set, then yes if test "x$pfm_prefix" != "x"; then dotest=1 # if root not set and libdir set, then yes elif test "x$pfm_root" = "x" -a "x$pfm_libdir" != "x"; then dotest=1 else dotest=0 fi if test "$dotest" = 1; then LIBS="-L$pfm_libdir -lpfm" CPPFLAGS="$CPPFLAGS -I$pfm_incdir" AC_CHECK_LIB([pfm], [pfm_initialize], [AC_CHECK_HEADERS([perfmon/pfmlib.h], [if test "$arch" = "ia64"; then AC_CHECK_HEADERS([perfmon/pfmlib_montecito.h]) fi AC_CHECK_FUNC(pfm_get_event_description, [AC_DEFINE(HAVE_PFM_GET_EVENT_DESCRIPTION,1,[event description function])],[]) AC_CHECK_MEMBER(pfmlib_reg_t.reg_evt_idx, [AC_DEFINE(HAVE_PFM_REG_EVT_IDX,1,[old reg_evt_idx])],[],[#include "perfmon/pfmlib.h"]) AC_CHECK_MEMBER(pfmlib_output_param_t.pfp_pmd_count, [AC_DEFINE(HAVE_PFMLIB_OUTPUT_PFP_PMD_COUNT,1,[new pfmlib_output_param_t])],[],[#include "perfmon/pfmlib.h"]) AC_CHECK_MEMBER(pfm_msg_t.type, [AC_DEFINE(HAVE_PFM_MSG_TYPE,1,[new pfm_msg_t])],[],[#include "perfmon/perfmon.h"]) ], [AC_MSG_ERROR([perfmon/pfmlib.h not found, rerun configure with different flags]) ]) ], [AC_MSG_ERROR([libpfm.a not found, rerun configure with different flags]) ]) else AC_DEFINE(HAVE_PERFMON_PFMLIB_MONTECITO_H,1,[Montecito headers]) AC_DEFINE(HAVE_PFM_GET_EVENT_DESCRIPTION,1,[event description function]) AC_DEFINE(HAVE_PFMLIB_OUTPUT_PFP_PMD_COUNT,1,[new pfmlib_output_param_t]) fi fi ################################################## # Checking platform ################################################## AC_MSG_CHECKING(platform) case "$OS" in nec) MAKEVER=nec-nec ;; aix) MAKEVER="$OS"-"$CPU" ;; bgp) MAKEVER=bgp ;; bgq) MAKEVER=bgq ;; CLE) if test "$perfmon2" = "yes"; then # major_version=`echo $OSVER | sed 's/\([[^.]][[^.]]*\).*/\1/'` # minor_version=`echo $OSVER | sed 
's/[[^.]][[^.]]*.\([[^.]][[^.]]*\).*/\1/'`
    # point_version=`echo $OSVER | sed -e 's/[[^.]][[^.]]*.[[^.]][[^.]]*.\(.*\)/\1/' -e 's/[[^0-9]].*//'`
    # if (test "$major_version" = 2 -a "$minor_version" = 6 -a "$point_version" -lt 31 -a "$perfmon2" != "yes" ); then
        MAKEVER="$OS"-perfmon2
    else
        MAKEVER="$OS"-pe
    fi
    ;;
freebsd)
    MAKEVER="freebsd"
    LDFLAGS="-lpmc"
    # HWPMC driver is available for FreeBSD >= 6
    FREEBSD_VERSION=`uname -r | cut -d'.' -f1`
    if test "${FREEBSD_VERSION}" -lt 6 ; then
        AC_MSG_ERROR([PAPI requires FreeBSD 6 or greater])
    fi
    # Determine if HWPMC module is on the kernel
    dmesg | grep hwpmc 2> /dev/null > /dev/null
    if test "$?" != "0" ; then
        AC_MSG_ERROR([HWPMC module not found. (see INSTALL.TXT)])
    fi
    # Determine the number of counters
    echo "/* Automatically generated file by configure */" > freebsd-config.h
    echo "#ifndef _FREEBSD_CONFIG_H_" >> freebsd-config.h
    echo "#define _FREEBSD_CONFIG_H_" >> freebsd-config.h
    echo "" >> freebsd-config.h
    AC_TRY_LINK([#include <sys/types.h>
#include <pmc.h>],
        [int i = pmc_init();],
        [pmc_init_linked="yes"],
        [pmc_init_linked="no"])
    if test "${pmc_init_linked}" = "no" ; then
        AC_MSG_ERROR([Failed to link hwpmc example])
    fi
    AC_TRY_RUN([#include <sys/types.h>
#include <pmc.h>
int main() {
    const struct pmc_cpuinfo *info;
    if (pmc_init() < 0) return 0;
    if (pmc_cpuinfo (&info) < 0) return 0;
    return info->pm_npmc-1;
}
],
        [ num_counters="0" ],
        [ num_counters="$?"])
    if test "${num_counters}" = "0" ; then
        AC_MSG_ERROR([pmc_npmc info returned 0.
Determine if the HWPMC module is loaded (see hwpmc(4))]) fi echo "#define HWPMC_NUM_COUNTERS ${num_counters}" >> freebsd-config.h echo "" >> freebsd-config.h echo "#endif" >> freebsd-config.h ;; linux) if test "$force_perf_events" = "yes" ; then MAKEVER="$OS"-pe elif test "$force_perfmon2" = "yes" ; then MAKEVER="$OS"-perfmon2 elif test "$perf_events" = "yes" ; then MAKEVER="$OS"-pe elif test "$perfmon2" = "yes" ; then MAKEVER="$OS"-perfmon2 elif test "$old_pfmv2" = "y" ; then MAKEVER="$OS"-pfm-"$CPU" else MAKEVER="$OS"-generic fi ;; solaris) if test "$bitmode" = "64" -a "`isainfo -v | grep "64"`" = ""; then AC_MSG_ERROR([The bitmode you specified is not supported]) fi MAKEVER="$OS"-"$CPU" ;; darwin) MAKEVER="$OS" ;; esac AC_MSG_RESULT($MAKEVER) if test "x$MAKEVER" = "x"; then AC_MSG_NOTICE(This platform is not supported so a generic build without CPU counters will be used) MAKEVER="generic_platform" fi ################################################## # Set build macros ################################################## FILENAME=Makefile.inc SHOW_CONF=showconf CTEST_TARGETS="all" FTEST_TARGETS="all" LIBRARY=libpapi.a SHLIB='libpapi.so.AC_PACKAGE_VERSION' PAPISOVER='$(PAPIVER).$(PAPIREV)' VLIB='libpapi.so.$(PAPISOVER)' OMPCFLGS=-fopenmp CC_R='$(CC) -pthread' CC_SHR='$(CC) -fPIC -DPIC -shared -Wl,-soname -Wl,$(VLIB) -Xlinker "-rpath" -Xlinker "$(LIBDIR)"' if test "$CC_COMMON_NAME" = "gcc"; then if test "$bitmode" = "32"; then BITFLAGS=-m32 elif test "$bitmode" = "64"; then BITFLAGS=-m64 fi fi OPTFLAGS="$OPTFLAGS" PAPICFLAGS+=" -D_REENTRANT -D_GNU_SOURCE $NOTLS" CFLAGS="$CFLAGS $BITFLAGS" FFLAGS="$CFLAGS $BITFLAGS $FFLAGS -Dlinux" # OS Support if (test "$OS" = "aix"); then OSFILESSRC=aix-memory.c OSLOCK=aix-lock.h OSCONTEXT=aix-context.h elif (test "$OS" = "bgp"); then OSFILESSRC=linux-bgp-memory.c OSLOCK=linux-bgp-lock.h OSCONTEXT=linux-bgp-context.h elif (test "$OS" = "bgq"); then OSFILESSRC=linux-bgq-memory.c OSLOCK=linux-bgq-lock.h OSCONTEXT=linux-context.h elif 
(test "$OS" = "freebsd"); then
    OSFILESSRC=freebsd-memory.c
    OSLOCK="freebsd-lock.h"
    OSCONTEXT="freebsd-context.h"
elif (test "$OS" = "nec"); then
    OSFILESSRC="linux-memory.c linux-timer.c linux-common.c"
    OSFILESHDR="linux-memory.h linux-timer.h linux-common.h"
    OSLOCK="linux-lock.h"
    OSCONTEXT="linux-context.h"
elif (test "$OS" = "linux"); then
    OSFILESSRC="linux-memory.c linux-timer.c linux-common.c"
    OSFILESHDR="linux-memory.h linux-timer.h linux-common.h"
    OSLOCK="linux-lock.h"
    OSCONTEXT="linux-context.h"
elif (test "$OS" = "solaris"); then
    OSFILESSRC="solaris-memory.c solaris-common.c"
    OSFILESHDR="solaris-memory.h solaris-common.h"
    OSLOCK="solaris-lock.h"
    OSCONTEXT="solaris-context.h"
elif (test "$OS" = "darwin"); then
    OSFILESSRC="darwin-memory.c darwin-common.c"
    OSFILESHDR="darwin-memory.h darwin-common.h"
    OSLOCK="darwin-lock.h"
    OSCONTEXT="darwin-context.h"
fi
OSFILESOBJ='$(OSFILESSRC:.c=.o)'

if (test "$MAKEVER" = "aix-power5" || test "$MAKEVER" = "aix-power6" || test "$MAKEVER" = "aix-power7"); then
    if test "$bitmode" = "64"; then
        LIBRARY=libpapi64.a
        SHLIB=libpapi64.so
        # By default AIX enforces a limit on heap space
        # (limiting the heap to share the same 256MB memory segment as stack);
        # changing the max data parameter moves the heap off the stack's memory segment
        BITFLAGS='-q64 -bmaxdata:0x07000000000000'
        ARG64=-X64
    else
        # If the issue ever comes up, /dsa requires AIX v5.1 or higher
        # and the Large address-space model (-bmaxdata) requires v4.3 or later
        # see http://publib.boulder.ibm.com/infocenter/pseries/v5r3/topic/com.ibm.aix.genprogc/doc/genprogc/lrg_prg_support.htm#a179c11c5d
        SHLIB=libpapi.so
        BITFLAGS="-bmaxdata:0x80000000/dsa"
    fi
    CPUCOMPONENT_NAME=aix
    CPUCOMPONENT_C=aix.c
    CPUCOMPONENT_OBJ=aix.o
    VECTOR=_aix_vector
    PAPI_EVENTS_CSV="papi_events.csv"
    MISCHDRS="aix.h papi_events_table.h"
    MISCSRCS="aix.c"
    CFLAGS+='-qenum=4 -DNO_VARARG_MACRO -D_AIX -D_$(CPU_MODEL) -DNEED_FFSLL -DARCH_EVTS=\"$(ARCH_EVENTS).h\" -DCOMP_VECTOR=_ppc64_vectors -DSTATIC_PAPI_EVENTS_TABLE'
FFLAGS+='-WF,-D_$(CPU_MODEL) -WF,-DARCH_EVTS=\"$(ARCH_EVENTS).h\"' CFLAGS+='-I$(PMAPI)/include -qmaxmem=-1 -qarch=$(cpu_option) -qtune=$(cpu_option) -qlanglvl=extended $(BITFLAGS)' if test $debug != "yes"; then OPTFLAGS='-O3 -qstrict $(PMINIT)' else OPTFLAGS='$(PMINIT)' fi SMPCFLGS=-qsmp OMPCFLGS='-qsmp=omp' LDFLAGS='-L$(PMAPI)/lib -lpmapi' CC_R=xlc_r CC=xlc CC_SHR="xlc -G -bnoentry" AC_CHECK_PROGS( [MPICC], [mpicc mpcc], []) F77=xlf CPP='xlc -E $(CPPFLAGS)' if test "$MAKEVER" = "aix-power5"; then ARCH_EVENTS=power5_events CPU_MODEL=POWER5 cpu_option=pwr5 DESCR="AIX 5.1.0 or greater with POWER5" if test "$bitmode" = "64"; then DESCR="$DESCR 64 bit build" fi elif test "$MAKEVER" = "aix-power6"; then ARCH_EVENTS=power6_events CPU_MODEL=POWER6 cpu_option=pwr6 DESCR="AIX 5.1.0 or greater with POWER6" CPPFLAGS="-qlanglvl=extended" if test "$bitmode" = "64"; then DESCR="$DESCR 64 bit build" fi elif test "$MAKEVER" = "aix-power7"; then ARCH_EVENTS=power7_events CPU_MODEL=POWER7 cpu_option=pwr7 DESCR="AIX 5.1.0 or greater with POWER7" CPPFLAGS="-qlanglvl=extended" if test "$bitmode" = "64"; then DESCR="$DESCR 64 bit build" fi fi elif test "$MAKEVER" = "bgp"; then CPP="$CC -E" CPUCOMPONENT_NAME=linux-bgp CPUCOMPONENT_C=linux-bgp.c CPUCOMPONENT_OBJ=linux-bgp.o VECTOR=_bgp_vectors PAPI_EVENTS_CSV="papi_events.csv" MISCSRCS= CFLAGS='-g -gdwarf-2 -O2 -Wall -I. -I$(BGP_SYSDIR)/arch/include -DCOMP_VECTOR=_bgp_vectors' tests="$tests bgp_tests" SHOW_CONF=show_bgp_conf BGP_SYSDIR=/bgsys/drivers/ppcfloor BGP_GNU_LINUX_PATH='${BGP_SYSDIR}/gnu-linux' LDFLAGS='-L$(BGP_SYSDIR)/runtime/SPI -lSPI.cna' FFLAGS='-g -gdwarf-2 -O2 -Wall -I. 
-Dlinux'
    OPTFLAGS="-g -Wall -O3"
    TOPTFLAGS="-g -Wall -O0"
    SHLIB=libpapi.so
    DESCR="Linux for BlueGene/P"
    LIBS=static
    CC_SHR='$(CC) -shared -Xlinker "-soname" -Xlinker "$(SHLIB)" -Xlinker "-rpath" -Xlinker "$(LIBDIR)"'
    OMPCFLGS=""
elif test "$MAKEVER" = "bgq"; then
    FILENAME=Rules.bgpm
    VECTOR=_bgq_vectors
    CPUCOMPONENT_NAME=linux-bgq
    CPUCOMPONENT_C=linux-bgq.c
    CPUCOMPONENT_OBJ=linux-bgq.o
    PAPI_EVENTS_CSV="papi_events.csv"
    MISCSRCS="linux-bgq-common.c"
    OPTFLAGS="-g -Wall -O3"
    TOPTFLAGS="-g -Wall -O0"
    SHLIB=libpapi.so
    DESCR="Linux for Blue Gene/Q"
    CC_SHR='$(CC) -fPIC -DPIC -shared -Wl,-soname -Wl,$(SHLIB) -Xlinker "-rpath" -Xlinker "$(LIBDIR)"'
    OMPCFLGS=""
elif test "$MAKEVER" = "CLE-perfmon2"; then
    FILENAME=Rules.perfmon2
    CPUCOMPONENT_NAME=perfmon
    CPUCOMPONENT_C=perfmon.c
    CPUCOMPONENT_OBJ=perfmon.o
    VECTOR=_papi_pfm_vector
    PAPI_EVENTS_CSV="papi_events.csv"
    F77=gfortran
    CFLAGS="$CFLAGS -D__crayxt"
    FFLAGS=""
elif test "$MAKEVER" = "freebsd"; then
    CPUCOMPONENT_NAME=freebsd
    CPUCOMPONENT_C=freebsd.c
    CPUCOMPONENT_OBJ=freebsd.o
    VECTOR=_papi_freebsd_vector
    PAPI_EVENTS_CSV="freebsd_events.csv"
    MISCHDRS="freebsd/map-unknown.h freebsd/map.h freebsd/map-p6.h freebsd/map-p6-m.h freebsd/map-p6-3.h freebsd/map-p6-2.h freebsd/map-p6-c.h freebsd/map-k7.h freebsd/map-k8.h freebsd/map-p4.h freebsd/map-atom.h freebsd/map-core.h freebsd/map-core2.h freebsd/map-core2-extreme.h freebsd/map-i7.h freebsd/map-westmere.h"
    MISCSRCS="$MISCSRCS freebsd/map-unknown.c freebsd/map.c freebsd/map-p6.c freebsd/map-p6-m.c freebsd/map-p6-3.c freebsd/map-p6-2.c freebsd/map-p6-c.c freebsd/map-k7.c freebsd/map-k8.c freebsd/map-p4.c freebsd/map-atom.c freebsd/map-core.c freebsd/map-core2.c freebsd/map-core2-extreme.c freebsd/map-i7.c freebsd/map-westmere.c"
    DESCR="FreeBSD -over libpmc- "
    CFLAGS+=" -I. -Ifreebsd -DPIC -fPIC"
    CC_SHR='$(CC) -shared -Xlinker "-soname" -Xlinker "libpapi.so" -Xlinker "-rpath" -Xlinker "$(LIBDIR)" -DPIC -fPIC -I.
-Ifreebsd' elif test "$MAKEVER" = "linux-generic"; then CPUCOMPONENT_NAME=linux-generic CPUCOMPONENT_C=linux-generic.c CPUCOMPONENT_OBJ=linux-generic.o PAPI_EVENTS_CSV="papi_events.csv" VECTOR=_papi_dummy_vector elif test "$MAKEVER" = "linux-pe"; then FILENAME=Rules.pfm4_pe CPUCOMPONENT_NAME=perf_event if test "$perf_events" = "no"; then components="$components" else components="$components perf_event" fi if test "$perf_events_uncore" = "no"; then components="$components" else components="$components perf_event_uncore" fi elif test "$MAKEVER" = "nec-nec"; then FILENAME=Rules.perfnec CPUCOMPONENT_NAME=perfnec components="perfnec" elif test "$MAKEVER" = "linux-perfmon2"; then FILENAME=Rules.perfmon2 CPUCOMPONENT_NAME=perfmon2 components="perfmon2" elif (test "$MAKEVER" = "linux-pfm-ia64" || test "$MAKEVER" = "linux-pfm-itanium2" || test "$MAKEVER" = "linux-pfm-montecito"); then FILENAME=Rules.pfm CPUCOMPONENT_NAME=perfmon-ia64 components="perfmon_ia64" VERSION=3.y if test "$MAKEVER" = "linux-pfm-itanium2"; then CPU=2 else CPU=3 fi CFLAGS="$CFLAGS -DITANIUM$CPU" FFLAGS="$FFLAGS -DITANIUM$CPU" CC_SHR='$(CC) -fPIC -DPIC -shared -Wl,-soname -Wl,$(SHLIB) -Xlinker "-rpath" -Xlinker "$(LIBDIR)"' elif test "$MAKEVER" = "solaris-ultra"; then CPUCOMPONENT_NAME=solaris-ultra CPUCOMPONENT_C=solaris-ultra.c CPUCOMPONENT_OBJ=solaris-ultra.obj VECTOR=_solaris_vector PAPI_EVENTS_CSV="papi_events.csv" DESCR="Solaris 5.8 or greater with UltraSPARC I, II or III" if test "$CC" = "gcc"; then F77=g77 CPP="$CC -E" CC_R="$CC" CC_SHR="$CC -shared -fpic" OPTFLAGS=-O3 CFLAGS="$CFLAGS -DNEED_FFSLL" FFLAGS=$CFLAGS else # Sun Workshop compilers: V5.0 and V6.0 R2 CPP="$CC -E" CC_R="$CC -mt" CC_SHR="$CC -ztext -G -Kpic" CFLAGS="-xtarget=ultra3 -xarch=v8plusa -DNO_VARARG_MACRO -D__EXTENSIONS__ -DPAPI_NO_MEMORY_MANAGEMENT -DCOMP_VECTOR=_solaris_vectors" SMPCFLGS=-xexplicitpar OMPCFLGS=-xopenmp F77=f90 FFLAGS=$CFLAGS NOOPT=-xO0 OPTFLAGS="-g -fast -xtarget=ultra3 -xarch=v8plusa" fi LDFLAGS="$LDFLAGS 
-lcpc"
    if test "$bitmode" = "64"; then
        LIBRARY=libpapi64.a
        SHLIB=libpapi64.so
        CFLAGS="-xtarget=ultra3 -xarch=v9a -DNO_VARARG_MACRO -D__EXTENSIONS__ -DPAPI_NO_MEMORY_MANAGEMENT -DCOMP_VECTOR=_solaris_vectors"
        OPTFLAGS="-g -fast -xtarget=ultra3 -xarch=v9a"
    fi
elif test "$MAKEVER" = "solaris-niagara2"; then
    CPUCOMPONENT_NAME=solaris-niagara2
    CPUCOMPONENT_C=solaris-niagara2.c
    CPUCOMPONENT_OBJ=solaris-niagara2.obj
    VECTOR=_niagara2_vector
    PAPI_EVENTS_CSV="papi_events.csv"
    CFLAGS="-xtarget=native -xarch=native -DNO_VARARG_MACRO -D__EXTENSIONS__ -DPAPI_NO_MEMORY_MANAGEMENT -DCOMP_VECTOR=_niagara2_vector"
    DESCR="Solaris 10 with libcpc2 and UltraSPARC T2 (Niagara 2)"
    CPP="$CC -E"
    CC_R="$CC -mt"
    CC_SHR="$CC -ztext -G -Kpic"
    SMPCFLGS=-xexplicitpar
    OMPCFLGS=-xopenmp
    F77=f90
    FFLAGS=$CFLAGS
    NOOPT=-xO0
    OPTFLAGS="-fast"
    FOPTFLAGS=$OPTFLAGS
    LDFLAGS="$LDFLAGS -lcpc"
    if test "$bitmode" = "64"; then
        LIBRARY=libpapi64.a
        SHLIB=libpapi64.so
        CFLAGS="$CFLAGS -m64"
        FFLAGS="$FFLAGS -m64"
    fi
elif test "$MAKEVER" = "darwin"; then
    DESCR="Darwin"
    CPUCOMPONENT_NAME=darwin
    CPUCOMPONENT_C=linux-generic.c
    CPUCOMPONENT_OBJ=linux-generic.o
    CFLAGS="-DNEED_FFSLL"
    CC_SHR='$(CC) -fPIC -DPIC -shared -Wl,-dylib -Xlinker "-rpath" -Xlinker "$(LIBDIR)"'
    SHLIB=libpapi.dylib
elif test "$MAKEVER" = "generic_platform"; then
    DESCR="Generic platform"
fi
MISCOBJS='$(MISCSRCS:.c=.o)'

if test "$F77" = "pgf77"; then
    FFLAGS="$FFLAGS -Wall -Mextend"
elif test "$F77" = "ifort"; then
    FFLAGS="$FFLAGS -warn all"
elif test "$F77" != "xlf"; then
    FFLAGS="$FFLAGS -ffixed-line-length-132"
fi
if test "$CC_COMMON_NAME" = "icc"; then
    OMPCFLGS=-qopenmp
fi

## By default we want the sysdetect component built, so if the user does not
## give an option we will add it to the list.
AC_MSG_CHECKING(for building sysdetect)
AC_ARG_WITH(sysdetect,
    [AS_HELP_STRING([--with-sysdetect=], [Build the sysdetect component (default: yes)])],
    [],
    [with_sysdetect=yes])
# Enable sysdetect unless the user has explicitly told us not to.
if test "$with_sysdetect" = "yes"; then AC_MSG_RESULT(yes) else AC_MSG_RESULT(no) fi AC_MSG_CHECKING(for components to build) COMPONENT_RULES=components/Rules.components echo "/* Automatically generated by configure */" > components_config.h echo "#ifndef COMPONENTS_CONFIG_H" >> components_config.h echo "#define COMPONENTS_CONFIG_H" >> components_config.h echo "" >> components_config.h AC_ARG_WITH([components], [AS_HELP_STRING([--with-components=<"component1 component2">], [Specify which components to build])], [ if test -n "m4_quote(${with_components//[[[:space:]]]/})"; then components="$components $withval" fi ] ) # Enable sysdetect unless the user has explicitly told us not to. if test "$with_sysdetect" = "yes"; then if test "$perf_events" != "no"; then components="$components sysdetect" fi fi AC_MSG_RESULT($components) # Check whether rocm and rocp_sdk were configured together rocm_found=0 rocp_sdk_found=0 for comp in $components do if test "$comp" = "rocm"; then rocm_found=1 fi if test "$comp" = "rocp_sdk"; then rocp_sdk_found=1 fi done if test $rocm_found -eq 1 && test $rocp_sdk_found -eq 1; then echo "WARNING: Components rocm and rocp_sdk should not be configured together. See components/rocm/README.md for more details." fi # This is an ugly hack to keep building on configurations covered by any-null in the past. if test "$VECTOR" = "_papi_dummy_vector"; then if test "x$components" = "x"; then echo "papi_vector_t ${VECTOR} = {" >> components_config.h echo " .size = { .context = sizeof ( int ), .control_state = sizeof ( int ), .reg_value = sizeof ( int ), .reg_alloc = sizeof ( int ), }, .cmp_info = { .num_native_events = 0, .num_preset_events = 0, .num_cntrs = 0, .name = \"No Components Configured. 
\", .short_name = \"UNSUPPORTED!\" }, .dispatch_timer = NULL, .get_overflow_address = NULL, .start = NULL, .stop = NULL, .read = NULL, .reset = NULL, .write = NULL, .cleanup_eventset = NULL, .stop_profiling = NULL, .init_component = NULL, .init_thread = NULL, .init_control_state = NULL, .update_control_state = NULL, .ctl = NULL, .set_overflow = NULL, .set_profile = NULL, .set_domain = NULL, .ntv_enum_events = NULL, .ntv_name_to_code = NULL, .ntv_code_to_name = NULL, .ntv_code_to_descr = NULL, .ntv_code_to_bits = NULL, .ntv_code_to_info = NULL, .allocate_registers = NULL, .shutdown_thread = NULL, .shutdown_component = NULL, .user = NULL, };" >> components_config.h # but in the face of actual components, we don't have to do hacky size games else VECTOR="" fi elif test "x$VECTOR" != "x"; then echo "extern papi_vector_t ${VECTOR};" >> components_config.h fi # construct papi_components_config_event_defs.h echo "#ifndef _PAPICOMPCFGEVENTDEFS" > papi_components_config_event_defs.h echo "#define _PAPICOMPCFGEVENTDEFS" >> papi_components_config_event_defs.h echo "" >> papi_components_config_event_defs.h numLine=`grep "#define PAPI_MAX_PRESET_EVENTS" papiStdEventDefs.h` sumNum=`echo ${numLine} | awk '{print $3}'` for comp in $components; do idx=`echo "$comp" | sed -n "s/\/.*//p" | wc -c` if test "$idx" = 0; then subcomp=$comp else subcomp=`echo $comp | sed -E "s/^.{${idx}}//"` fi if test "${subcomp}" != "perf_event"; then subcomp_defs_inc=components/${subcomp}/papi_${subcomp}_std_event_defs.h if test -f ${subcomp_defs_inc}; then `cp ${subcomp_defs_inc} ./` `echo "#define PAPI_${subcomp}_PRESET_OFFSET ${sumNum}" >> papi_components_config_event_defs.h` numLine=`grep "#define PAPI_MAX_${subcomp}_PRESETS" ${subcomp_defs_inc}` singleNum=`echo ${numLine} | awk '{print $3}'` sumNum=$(( ${sumNum} + ${singleNum} )) fi fi done echo "" >> papi_components_config_event_defs.h for comp in $components; do idx=`echo "$comp" | sed -n "s/\/.*//p" | wc -c` if test "$idx" = 0; then 
subcomp=$comp else subcomp=`echo $comp | sed -E "s/^.{${idx}}//"` fi if test "${subcomp}" != "perf_event"; then subcomp_defs_inc=components/${subcomp}/papi_${subcomp}_std_event_defs.h if test -f ${subcomp_defs_inc}; then `echo "#include \"papi_${subcomp}_std_event_defs.h\"" >> papi_components_config_event_defs.h` fi fi done echo "" >> papi_components_config_event_defs.h echo "#endif" >> papi_components_config_event_defs.h # includes for preset headers for comp in $components; do idx=`echo "$comp" | sed -n "s/\/.*//p" | wc -c` if test "$idx" = 0; then subcomp=$comp else subcomp=`echo $comp | sed -E "s/^.{${idx}}//"` fi if test "${subcomp}" != "perf_event"; then subcomp_preset_inc=components/${subcomp}/papi_${subcomp}_presets.h AC_CHECK_FILE(${subcomp_preset_inc}, [`echo "#include \"${subcomp_preset_inc}\"" >> components_config.h`]) fi done echo "" >> components_config.h # array tracking max number of presets per component echo "int _papi_hwi_max_presets[[]] = {" >> components_config.h for comp in $components; do idx=`echo "$comp" | sed -n "s/\/.*//p" | wc -c` if test "$idx" = 0; then subcomp=$comp else subcomp=`echo $comp | sed -E "s/^.{${idx}}//"` fi if test "${subcomp}" != "perf_event"; then subcomp_preset_inc=components/${subcomp}/papi_${subcomp}_presets.h if test -f ${subcomp_preset_inc}; then `echo " PAPI_MAX_${subcomp}_PRESETS," >> components_config.h` else `echo " 0," >> components_config.h` fi else `echo " PAPI_MAX_PRESET_EVENTS," >> components_config.h` fi done echo " 0" >> components_config.h echo "};" >> components_config.h echo "" >> components_config.h # preset arrays echo "hwi_presets_t *_papi_hwi_comp_presets[[]] = {" >> components_config.h for comp in $components; do idx=`echo "$comp" | sed -n "s/\/.*//p" | wc -c` if test "$idx" = 0; then subcomp=$comp else subcomp=`echo $comp | sed -E "s/^.{${idx}}//"` fi if test "${subcomp}" != "perf_event"; then subcomp_preset_inc=components/${subcomp}/papi_${subcomp}_presets.h if test -f ${subcomp_preset_inc}; 
then `echo " _${subcomp}_presets," >> components_config.h` else `echo " NULL," >> components_config.h` fi else `echo " _papi_hwi_presets," >> components_config.h` fi done echo " NULL" >> components_config.h echo "};" >> components_config.h echo "" >> components_config.h PAPI_NUM_COMP=0 for comp in $components; do idx=`echo "$comp" | sed -n "s/\/.*//p" | wc -c` if test "$idx" = 0; then subcomp=$comp else subcomp=`echo $comp | sed -E "s/^.{${idx}}//"` fi COMPONENT_RULES="$COMPONENT_RULES components/$comp/Rules.$subcomp" echo "extern papi_vector_t _${subcomp}_vector;" >> components_config.h PAPI_NUM_COMP=$((PAPI_NUM_COMP+1)) done echo "" >> components_config.h echo "struct papi_vectors *_papi_hwd[[]] = {" >> components_config.h if test "x$VECTOR" != "x"; then echo " &${VECTOR}," >> components_config.h fi for comp in $components; do idx=`echo "$comp" | sed -n "s/\/.*//p" | wc -c` if test "$idx" = 0; then subcomp=$comp else subcomp=`echo $comp | sed -E "s/^.{${idx}}//"` fi echo " &_${subcomp}_vector," >> components_config.h done echo " NULL" >> components_config.h echo "};" >> components_config.h echo "" >> components_config.h echo "#endif" >> components_config.h # check for component tests for comp in $components; do if test "`find components/$comp -name "tests"`" != "" ; then COMPONENTS="$COMPONENTS $comp" fi done for comp in $components; do # check for SDE component to determine linking flags. 
if test "x$comp" = "xsde" ; then LDFLAGS="$LDFLAGS $LRT" LIBS="$LIBS $LRT" if test "$with_libsde" = "yes"; then if test "$shlib_tools" = "yes"; then LIBSDEFLAGS="-L${TOPDIR} -lsde -DSDE" else LIBSDEFLAGS="${TOPDIR}/libsde.a -DSDE" fi fi fi # check for intel_gpu or rocp_sdk component to determine if we need -lstdc++ in LDFLAGS if (test "x$comp" = "xintel_gpu" || test "x$comp" = "xrocp_sdk"); then LDFLAGS="$LDFLAGS -lstdc++ -pthread" fi if test "x$comp" = "xsysdetect" ; then if test "x`find $PAPI_CUDA_ROOT -name "cuda.h"`" != "x" ; then CFLAGS="$CFLAGS -DHAVE_CUDA" fi if test "x`find $PAPI_CUDA_ROOT -name "nvml.h"`" != "x" ; then CFLAGS="$CFLAGS -DHAVE_NVML" fi if test "x`find $PAPI_ROCM_ROOT -name "hsa.h"`" != "x" ; then CFLAGS="$CFLAGS -DHAVE_ROCM" fi if test "x`find $PAPI_ROCMSMI_ROOT -name "rocm_smi.h"`" != "x" ; then CFLAGS="$CFLAGS -DHAVE_ROCM_SMI" fi fi done CFLAGS="$CFLAGS -DPAPI_NUM_COMP=$PAPI_NUM_COMP" AC_MSG_CHECKING(for PAPI event CSV filename to use) if test "x$PAPI_EVENTS_CSV" == "x"; then PAPI_EVENTS_CSV="papi_events.csv" fi AC_MSG_RESULT($PAPI_EVENTS_CSV) AC_SUBST(prefix) AC_SUBST(exec_prefix) AC_SUBST(libdir) AC_SUBST(includedir) AC_SUBST(mandir) AC_SUBST(bindir) AC_SUBST(datadir) AC_SUBST(datarootdir) AC_SUBST(docdir) AC_SUBST(PACKAGE_TARNAME) AC_SUBST(arch) AC_SUBST(MAKEVER) AC_SUBST(PMAPI) AC_SUBST(PMINIT) AC_SUBST(F77) AC_SUBST(CPP) AC_SUBST(CC) AC_SUBST(AR) AC_SUBST(papiLIBS) AC_SUBST(STATIC) AC_SUBST(NO_MPI_TESTS) AC_SUBST(LDFLAGS) AC_SUBST(altix) AC_SUBST(pfm_root) AC_SUBST(old_pfmv2) AC_SUBST(pfm_prefix) AC_SUBST(pfm_incdir) AC_SUBST(pfm_libdir) AC_SUBST(OS) AC_SUBST(CFLAGS) AC_SUBST(FFLAGS) AC_SUBST(CPPFLAGS) AC_SUBST(PAPI_EVENTS) AC_SUBST(PAPI_EVENTS_CSV) AC_SUBST(SETPATH) AC_SUBST(LINKLIB) AC_SUBST(VERSION) AC_SUBST(CPU) AC_SUBST(FILENAME) AC_SUBST(LIBRARY) AC_SUBST(SHLIB) AC_SUBST(PAPISOVER) AC_SUBST(VLIB) AC_SUBST(PAPICFLAGS) AC_SUBST(OPTFLAGS) AC_SUBST(CPUCOMPONENT_NAME) AC_SUBST(CPUCOMPONENT_C) AC_SUBST(CPUCOMPONENT_OBJ) 
AC_SUBST(OSFILESSRC) AC_SUBST(OSFILESOBJ) AC_SUBST(OSFILESHDR) AC_SUBST(OSLOCK) AC_SUBST(OSCONTEXT) AC_SUBST(DESCR) AC_SUBST(LIBS) AC_SUBST(CTEST_TARGETS) AC_SUBST(CC_R) AC_SUBST(CC_SHR) AC_SUBST(SMPCFLGS) AC_SUBST(OMPCFLGS) AC_SUBST(NOOPT) AC_SUBST(MISCSRCS) AC_SUBST(MISCOBJS) AC_SUBST(POST_BUILD) AC_SUBST(ARCH_EVENTS) AC_SUBST(CPU_MODEL) AC_SUBST(cpu_option) AC_SUBST(ARG64) AC_SUBST(FLAGS) AC_SUBST(MPICC) AC_SUBST(MISCHDRS) AC_SUBST(SHLIBDEPS) AC_SUBST(TOPTFLAGS) AC_SUBST(TESTS) AC_SUBST(tests) AC_SUBST(SHOW_CONF) AC_SUBST(BGP_SYSDIR) AC_SUBST(BITFLAGS) AC_SUBST(COMPONENT_RULES) AC_SUBST(COMPONENTS) AC_SUBST(FTEST_TARGETS) AC_SUBST(HAVE_NO_OVERRIDE_INIT) AC_SUBST(BGPM_INSTALL_DIR) AC_SUBST(CC_COMMON_NAME) AC_SUBST(NVPPC64LEFLAGS) AC_ARG_ENABLE([fortran], [AS_HELP_STRING([--disable-fortran], [Whether to disable fortran bindings])], [], [enable_fortran=yes]) if test "x$F77" != "x" -a "x$enable_fortran" = "xyes" ; then AC_CONFIG_COMMANDS([genpapifdef], [ maint/genpapifdef.pl -c > fpapi.h maint/genpapifdef.pl -f77 > f77papi.h maint/genpapifdef.pl -f90 > f90papi.h ], []) FORT_WRAPPERS_SRC="upper_PAPI_FWRAPPERS.c papi_fwrappers_.c papi_fwrappers__.c" FORT_WRAPPERS_OBJ="upper_PAPI_FWRAPPERS.o papi_fwrappers_.o papi_fwrappers__.o" ENABLE_FORTRAN="$enable_fortran" ENABLE_FORTRAN_TESTS="$enable_fortran" FORT_HEADERS="fpapi.h f77papi.h f90papi.h" fi AC_SUBST([FORT_WRAPPERS_SRC]) AC_SUBST([FORT_WRAPPERS_OBJ]) AC_SUBST([FORT_HEADERS]) AC_SUBST([ENABLE_FORTRAN]) AC_SUBST([ENABLE_FORTRAN_TESTS]) AC_MSG_NOTICE($FILENAME will be included in the generated Makefile) AC_CONFIG_FILES([Makefile papi.pc]) AC_CONFIG_FILES([components/Makefile_comp_tests.target testlib/Makefile.target utils/Makefile.target ctests/Makefile.target ftests/Makefile.target validation_tests/Makefile.target]) AC_OUTPUT if test "$have_paranoid" = "yes"; then paranoid_level=`cat /proc/sys/kernel/perf_event_paranoid` if test $paranoid_level -gt 2; then warning_text=`echo -e "\n 
*************************************************************************** * Insufficient permissions for accessing any hardware counters. * * Your current paranoid level is $paranoid_level. * * Set /proc/sys/kernel/perf_event_paranoid to 2 (or less) or run as root. * * * * Example: * * sudo sh -c \"echo 2 > /proc/sys/kernel/perf_event_paranoid\" * *************************************************************************** "\ ` AC_MSG_WARN($warning_text) fi fi papi-papi-7-2-0-t/src/counter_analysis_toolkit/000077500000000000000000000000001502707512200215565ustar00rootroot00000000000000papi-papi-7-2-0-t/src/counter_analysis_toolkit/.cat_cfg000066400000000000000000000005301502707512200231430ustar00rootroot00000000000000#AUTO_DISCOVERY_MODE = 1 #L1_DCACHE_LINE_SIZE = 64 #L1_ICACHE_LINE_SIZE = 64 #L1_DCACHE_SIZE=32768 #L1_ICACHE_SIZE = 32768 #L2_UCACHE_SIZE=1048576 #L3_DCACHE_SIZE = 33554432 #L4_DCACHE_SIZE = 67108864 #L1_SPLIT=1 #L2_SPLIT=1 #L3_SPLIT=8 #L4_SPLIT=8 #MM_SPLIT=8 #PTS_PER_L1=7 #PTS_PER_L2=7 #PTS_PER_L3=7 #PTS_PER_L4=7 #PTS_PER_MM=7 #MAX_PPB=256 papi-papi-7-2-0-t/src/counter_analysis_toolkit/Makefile000066400000000000000000000145741502707512200232310ustar00rootroot00000000000000PAPIDIR?=${PAPI_DIR} USEMPI?=false LDFLAGS=-L$(PAPIDIR)/lib -lpapi -lm -lpthread -ldl -lrt INCFLAGS=-I$(PAPIDIR)/include CFLAGS+=-g -Wall -Wextra OPT0=-O0 OPT1=-O1 OPT2=-O2 OPT3=-O3 CC=gcc ## Check to see if MPI is used. ifeq ($(USEMPI),true) CC=mpicc CFLAGS+=-DUSE_MPI endif ## Try to auto-detect architecture if it is not set. ifndef ARCH UNAME_M:=$(shell uname -m) ARCH=POWER ifeq ($(UNAME_M),x86_64) ARCH=X86 endif ifeq ($(UNAME_M),aarch64) ARCH=ARM endif $(info Detected ARCH=$(ARCH)) endif ## Architecture determines vector instruction set. 
ifeq ($(ARCH),X86) FLOP+=-mfma -DX86 VECSRC=vec_fma_hp vec_fma_sp vec_fma_dp vec_nonfma_hp vec_nonfma_sp vec_nonfma_dp VEC_META=-DAVX128_AVAIL -DAVX256_AVAIL -DAVX512_AVAIL VEC128=-mavx -O0 -DX86 $(VEC_META) -DX86_VEC_WIDTH_128B VEC256=-mavx -O0 -DX86 $(VEC_META) -DX86_VEC_WIDTH_256B VEC512=-mavx512f -O0 -DX86 $(VEC_META) -DX86_VEC_WIDTH_512B VEC128_FMA=-mfma4 -mfma -O0 -DX86 $(VEC_META) -DX86_VEC_WIDTH_128B VEC256_FMA=-mfma4 -mfma -O0 -DX86 $(VEC_META) -DX86_VEC_WIDTH_256B VEC512_FMA=-mfma4 -mfma -mavx512f -O0 -DX86 $(VEC_META) -DX86_VEC_WIDTH_512B VEC=-mavx -O0 -DX86 VEC_FMA=-mfma4 -mfma -O0 -DX86 VEC_ALL=$(VEC) $(VEC_FMA) -O0 -DX86 INSTR=-mavx512vl endif ifeq ($(ARCH),POWER) FLOP+=-maltivec -DPOWER VECSRC=vec_fma_hp.o vec_fma_sp.o vec_fma_dp.o vec_nonfma_hp.o vec_nonfma_sp.o vec_nonfma_dp.o VEC=-maltivec -DPOWER VEC_FMA=-maltivec -DPOWER VEC_ALL=$(VEC) -DPOWER endif ifeq ($(ARCH),ARM) FLOP+=-march=armv8.2-a+fp16 -DARM VECSRC=vec_fma_hp.o vec_fma_sp.o vec_fma_dp.o vec_nonfma_hp.o vec_nonfma_sp.o vec_nonfma_dp.o VEC=-march=armv8.2-a+fp16 -O0 -DARM VEC_FMA=-march=armv8.2-a+fp16 -O0 -DARM VEC_ALL=$(VEC) -O0 -DARM endif all: branch.o d_cache eventstock.o flops i_cache instr vector make cat_collect PAPIDIR=$(PAPIDIR) d_cache: timing_kernels.o prepareArray.o compar.o dcache.o i_cache: icache.o icache_seq_kernel_0.o vector: weak_symbols.o vec.o vec_scalar_verify.o $(VECSRC) branch.o: branch.c branch.h $(CC) $(OPT0) $(CFLAGS) $(INCFLAGS) -c branch.c -o branch.o timing_kernels.o: timing_kernels.c timing_kernels.h $(CC) $(OPT2) $(CFLAGS) -fopenmp $(INCFLAGS) -c timing_kernels.c -o timing_kernels.o prepareArray.o: prepareArray.c prepareArray.h $(CC) $(OPT2) $(CFLAGS) -c prepareArray.c -o prepareArray.o compar.o: compar.c $(CC) $(CFLAGS) $(OPT2) -c compar.c -o compar.o dcache.o: dcache.c dcache.h $(CC) $(CFLAGS) $(OPT2) -fopenmp $(INCFLAGS) -c dcache.c -o dcache.o eventstock.o: eventstock.c eventstock.h $(CC) $(CFLAGS) $(OPT0) $(INCFLAGS) -c eventstock.c -o eventstock.o 
flops: flops.c flops.h cat_arch.h $(CC) $(CFLAGS) $(FLOP) $(OPT1) $(INCFLAGS) -c flops.c -o flops.o icache.o: icache.c icache.h bash gen_seq_dlopen.sh $(CC) $(CFLAGS) $(OPT0) $(INCFLAGS) -c icache.c -o icache.o icache_seq_kernel_0.o: icache_seq.c icache_seq.h $(CC) $(CFLAGS) $(OPT0) $(INCFLAGS) -c icache_seq.c -o icache_seq.o $(CC) $(CFLAGS) $(OPT0) $(INCFLAGS) -fPIC -c icache_seq_kernel.c -o icache_seq_kernel_0.o $(CC) $(CFLAGS) $(OPT0) -shared -o icache_seq_kernel_0.so icache_seq_kernel_0.o bash replicate.sh rm icache_seq_kernel_0.o instr: instructions.c instr.h -$(CC) -c $(CFLAGS) $(OPT2) -ftree-vectorize $(FLOP) $(INSTR) $(INCFLAGS) instructions.c -o instructions.o weak_symbols.o: weak_symbols.c vec.h -$(CC) -c $(CFLAGS) weak_symbols.c vec.o: vec.c vec.h -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) -D$(ARCH) $(VEC_META) vec.c vec_scalar_verify.o: vec_scalar_verify.c vec_scalar_verify.h cat_arch.h -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC_ALL) vec_scalar_verify.c vec_fma_hp.o: vec_fma_hp.c vec_scalar_verify.h -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC_FMA) vec_fma_hp.c vec_fma_hp: vec_fma_hp.c vec_scalar_verify.h -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC128_FMA) vec_fma_hp.c -o vec_fma_hp-128B.o -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC256_FMA) vec_fma_hp.c -o vec_fma_hp-256B.o -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC512_FMA) vec_fma_hp.c -o vec_fma_hp-512B.o vec_fma_sp.o: vec_fma_sp.c vec_scalar_verify.h -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC_FMA) vec_fma_sp.c vec_fma_sp: vec_fma_sp.c vec_scalar_verify.h -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC128_FMA) vec_fma_sp.c -o vec_fma_sp-128B.o -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC256_FMA) vec_fma_sp.c -o vec_fma_sp-256B.o -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC512_FMA) vec_fma_sp.c -o vec_fma_sp-512B.o vec_fma_dp.o: vec_fma_dp.c vec_scalar_verify.h -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC_FMA) vec_fma_dp.c vec_fma_dp: vec_fma_dp.c vec_scalar_verify.h -$(CC) -c $(CFLAGS) 
$(OPT1) $(INCFLAGS) $(VEC128_FMA) vec_fma_dp.c -o vec_fma_dp-128B.o -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC256_FMA) vec_fma_dp.c -o vec_fma_dp-256B.o -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC512_FMA) vec_fma_dp.c -o vec_fma_dp-512B.o vec_nonfma_hp.o: vec_nonfma_hp.c vec_scalar_verify.h -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC) vec_nonfma_hp.c vec_nonfma_hp: vec_nonfma_hp.c vec_scalar_verify.h -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC128) vec_nonfma_hp.c -o vec_nonfma_hp-128B.o -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC256) vec_nonfma_hp.c -o vec_nonfma_hp-256B.o -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC512) vec_nonfma_hp.c -o vec_nonfma_hp-512B.o vec_nonfma_sp.o: vec_nonfma_sp.c vec_scalar_verify.h -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC) vec_nonfma_sp.c vec_nonfma_sp: vec_nonfma_sp.c vec_scalar_verify.h -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC128) vec_nonfma_sp.c -o vec_nonfma_sp-128B.o -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC256) vec_nonfma_sp.c -o vec_nonfma_sp-256B.o -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC512) vec_nonfma_sp.c -o vec_nonfma_sp-512B.o vec_nonfma_dp.o: vec_nonfma_dp.c vec_scalar_verify.h -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC) vec_nonfma_dp.c vec_nonfma_dp: vec_nonfma_dp.c vec_scalar_verify.h -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC128) vec_nonfma_dp.c -o vec_nonfma_dp-128B.o -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC256) vec_nonfma_dp.c -o vec_nonfma_dp-256B.o -$(CC) -c $(CFLAGS) $(OPT1) $(INCFLAGS) $(VEC512) vec_nonfma_dp.c -o vec_nonfma_dp-512B.o cat_collect: $(CC) $(CFLAGS) -fopenmp $(INCFLAGS) main.c $(wildcard *.o) -o cat_collect $(LDFLAGS) clean: rm -f *.o realclean: rm -f cat_collect *.o *.so icache_seq.c icache_seq.h icache_seq_kernel.c papi-papi-7-2-0-t/src/counter_analysis_toolkit/README000066400000000000000000000016241502707512200224410ustar00rootroot00000000000000Description: Benchmarks for helping in the understanding of native events by stressing different aspects of the 
architecture selectively. Compilation: make PAPIDIR=/path/to/your/papi/installation Usage: ./cat_collect -in event_list.txt -out OUTPUT_DIRECTORY -branch -dcr The following five flags specify the corresponding benchmarks: -branch Branch kernels. -dcr Data cache reading kernels. -dcw Data cache writing kernels. -flops Floating point operations kernels. -ic Instruction cache kernels. -vec Vector FLOPs kernels. -instr Instrution kernels. Each line in the event-list file should contain ether the name of a base event followed by the number of qualifiers to be appended, or a fully expanded event with qualifiers followed by the number zero, as in the following example: L2_RQSTS 1 ICACHE:MISSES 0 ICACHE:HIT 0 OFFCORE_RESPONSE_0:DMND_DATA_RD:L3_HIT:SNP_ANY 0 papi-papi-7-2-0-t/src/counter_analysis_toolkit/branch.c000066400000000000000000000223021502707512200231560ustar00rootroot00000000000000#include #include #include #include #include #include #include "papi.h" #include "branch.h" volatile int iter_count, global_var1, global_var2; volatile int result; volatile unsigned int b, z1, z2, z3, z4; void branch_driver(char *papi_event_name, int junk, hw_desc_t *hw_desc, char* outdir){ int papi_eventset = PAPI_NULL; int i, iter, sz, ret_val, max_iter = 16*1024; long long int cnt; double avg, round; FILE* ofp_papi; const char *sufx = ".branch"; int l = strlen(outdir)+strlen(papi_event_name)+strlen(sufx); (void)hw_desc; char *papiFileName = (char *)calloc( 1+l, sizeof(char) ); if (l != (sprintf(papiFileName, "%s%s%s", outdir, papi_event_name, sufx))) { goto error0; } if (NULL == (ofp_papi = fopen(papiFileName,"w"))) { fprintf(stderr, "Unable to open file %s.\n", papiFileName); goto error0; } // Initialize undecidible values for the BRNG macro. 
z1 = junk*7; z2 = (junk+4)/(junk+1); z3 = junk; z4 = (z3+z2)/z1; ret_val = PAPI_create_eventset( &papi_eventset ); if (ret_val != PAPI_OK){ goto error1; } ret_val = PAPI_add_named_event( papi_eventset, papi_event_name ); if (ret_val != PAPI_OK){ goto error1; } BRANCH_BENCH(1); BRANCH_BENCH(2); BRANCH_BENCH(3); BRANCH_BENCH(4); BRANCH_BENCH(4a); BRANCH_BENCH(4b); BRANCH_BENCH(5); BRANCH_BENCH(5a); BRANCH_BENCH(5b); BRANCH_BENCH(6); BRANCH_BENCH(7); if( result == 143526 ){ printf("Random side effect\n"); } ret_val = PAPI_cleanup_eventset( papi_eventset ); if (ret_val != PAPI_OK ){ goto error1; } ret_val = PAPI_destroy_eventset( &papi_eventset ); if (ret_val != PAPI_OK ){ goto error1; } error1: fclose(ofp_papi); error0: free(papiFileName); return; } long long int branch_char_b1(int size, int event_set){ int retval; long long int value; if ( (retval=PAPI_start(event_set)) != PAPI_OK){ return -1; } /* 1. Conditional EXECUTED = 2 1. Conditional RETIRED = 2 2. Conditional TAKEN = 1.5 4. Direct JUMP = 0 3. Branch MISPREDICT = 0 5. 
All Branches = 2 */ iter_count = 1; global_var2 = 1; do{ if ( iter_count < (size/2) ){ global_var2 += 2; } BRNG(); iter_count++; }while(iter_count global_var2 ){ global_var1+=2; } BRNG(); iter_count++; }while(iter_count> 13;\ z1 = ((z1 & 4294967294U) << 18) ^ b;\ b = ((z2 << 2) ^ z2) >> 27;\ z2 = ((z2 & 4294967288U) << 2) ^ b;\ b = ((z3 << 13) ^ z3) >> 21;\ z3 = ((z3 & 4294967280U) << 7) ^ b;\ b = ((z4 << 3) ^ z4) >> 12;\ z4 = ((z4 & 4294967168U) << 13) ^ b;\ z1++;\ result = z1 ^ z2 ^ z3 ^ z4;\ } #define BUSY_WORK() {BRNG(); BRNG(); BRNG(); BRNG();} extern volatile int result; extern volatile unsigned int b, z1, z2, z3, z4; void branch_driver(char *papi_event_name, int junk, hw_desc_t *hw_desc, char* outdir); long long int branch_char_b1(int size, int papi_eventset); long long int branch_char_b2(int size, int papi_eventset); long long int branch_char_b3(int size, int papi_eventset); long long int branch_char_b4(int size, int papi_eventset); long long int branch_char_b4a(int size, int papi_eventset); long long int branch_char_b4b(int size, int papi_eventset); long long int branch_char_b5(int size, int papi_eventset); long long int branch_char_b5a(int size, int papi_eventset); long long int branch_char_b5b(int size, int papi_eventset); long long int branch_char_b6(int size, int papi_eventset); long long int branch_char_b7(int size, int papi_eventset); #endif papi-papi-7-2-0-t/src/counter_analysis_toolkit/caches.h000066400000000000000000000023111502707512200231520ustar00rootroot00000000000000#ifndef _CACHES_ #define _CACHES_ #include #include #include #include #include #include // Header files for uintptr_t #if defined (__SVR4) && defined (__sun) # include #else # include #endif #include // Header files for setting the affinity #if defined(__linux__) # define __USE_GNU 1 # include #elif defined (__SVR4) && defined (__sun) //#elif defined(__sparc) # include # include # include #endif #include #define SIZE (512*1024) #define L_SIZE 0 #define C_SIZE 1 #define ASSOC 2 
//#define DEBUG #define MAXTHREADS 128 typedef struct run_output_s{ double dt[MAXTHREADS]; double counter[MAXTHREADS]; int status; }run_output_t; static inline double getticks(void){ double ret; struct timeval tv; gettimeofday(&tv, NULL); ret = 1000*1000*(double)tv.tv_sec + (double)tv.tv_usec; return ret; } static inline double elapsed(double t1, double t0){ return (double)t1 - (double)t0; } extern int compar_lf(const void *a, const void *b); extern int compar_lld(const void *a, const void *b); #endif papi-papi-7-2-0-t/src/counter_analysis_toolkit/cat_arch.h000066400000000000000000000175501502707512200235030ustar00rootroot00000000000000#include typedef unsigned long long uint64; #if defined(X86) void test_hp_x86_128B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_sp_x86_128B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_dp_x86_128B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_hp_x86_256B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_sp_x86_256B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_dp_x86_256B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_hp_x86_512B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_sp_x86_512B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_dp_x86_512B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_hp_x86_128B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_sp_x86_128B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_dp_x86_128B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_hp_x86_256B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_sp_x86_256B_VEC_FMA( int instr_per_loop, uint64 iterations, int 
EventSet, FILE *fp ); void test_dp_x86_256B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_hp_x86_512B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_sp_x86_512B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_dp_x86_512B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); #include typedef __m128 SP_SCALAR_TYPE; typedef __m128d DP_SCALAR_TYPE; #define SET_VEC_SS(_I_) _mm_set_ss( _I_ ); #define ADD_VEC_SS(_I_,_J_) _mm_add_ss( _I_ , _J_ ); #define MUL_VEC_SS(_I_,_J_) _mm_mul_ss( _I_ , _J_ ); #define FMA_VEC_SS(_out_,_I_,_J_,_K_) { _out_ = _mm_fmadd_ss( _I_ , _J_ , _K_ ); } #define SET_VEC_SD(_I_) _mm_set_sd( _I_ ); #define ADD_VEC_SD(_I_,_J_) _mm_add_sd( _I_ , _J_ ); #define MUL_VEC_SD(_I_,_J_) _mm_mul_sd( _I_ , _J_ ); #define FMA_VEC_SD(_out_,_I_,_J_,_K_) { _out_ = _mm_fmadd_sd( _I_ , _J_ , _K_ ); } #if defined(X86_VEC_WIDTH_128B) typedef __m128 SP_VEC_TYPE; typedef __m128d DP_VEC_TYPE; #define SET_VEC_PS(_I_) _mm_set1_ps( _I_ ); #define ADD_VEC_PS(_I_,_J_) _mm_add_ps( _I_ , _J_ ); #define MUL_VEC_PS(_I_,_J_) _mm_mul_ps( _I_ , _J_ ); #define FMA_VEC_PS(_I_,_J_,_K_) _mm_fmadd_ps( _I_ , _J_ , _K_ ); #define SET_VEC_PD(_I_) _mm_set1_pd( _I_ ); #define ADD_VEC_PD(_I_,_J_) _mm_add_pd( _I_ , _J_ ); #define MUL_VEC_PD(_I_,_J_) _mm_mul_pd( _I_ , _J_ ); #define FMA_VEC_PD(_I_,_J_,_K_) _mm_fmadd_pd( _I_ , _J_ , _K_ ); #elif defined(X86_VEC_WIDTH_512B) typedef __m512 SP_VEC_TYPE; typedef __m512d DP_VEC_TYPE; #define SET_VEC_PS(_I_) _mm512_set1_ps( _I_ ); #define ADD_VEC_PS(_I_,_J_) _mm512_add_ps( _I_ , _J_ ); #define MUL_VEC_PS(_I_,_J_) _mm512_mul_ps( _I_ , _J_ ); #define FMA_VEC_PS(_I_,_J_,_K_) _mm512_fmadd_ps( _I_ , _J_ , _K_ ); #define SET_VEC_PD(_I_) _mm512_set1_pd( _I_ ); #define ADD_VEC_PD(_I_,_J_) _mm512_add_pd( _I_ , _J_ ); #define MUL_VEC_PD(_I_,_J_) _mm512_mul_pd( _I_ , _J_ ); #define FMA_VEC_PD(_I_,_J_,_K_) _mm512_fmadd_pd( _I_ , _J_ , 
_K_ ); #else typedef __m256 SP_VEC_TYPE; typedef __m256d DP_VEC_TYPE; #define SET_VEC_PS(_I_) _mm256_set1_ps( _I_ ); #define ADD_VEC_PS(_I_,_J_) _mm256_add_ps( _I_ , _J_ ); #define MUL_VEC_PS(_I_,_J_) _mm256_mul_ps( _I_ , _J_ ); #define FMA_VEC_PS(_I_,_J_,_K_) _mm256_fmadd_ps( _I_ , _J_ , _K_ ); #define SET_VEC_PD(_I_) _mm256_set1_pd( _I_ ); #define ADD_VEC_PD(_I_,_J_) _mm256_add_pd( _I_ , _J_ ); #define MUL_VEC_PD(_I_,_J_) _mm256_mul_pd( _I_ , _J_ ); #define FMA_VEC_PD(_I_,_J_,_K_) _mm256_fmadd_pd( _I_ , _J_ , _K_ ); #endif #elif defined(ARM) void test_hp_arm_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_sp_arm_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_dp_arm_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_hp_arm_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_sp_arm_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_dp_arm_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); #include typedef __fp16 half; typedef float SP_SCALAR_TYPE; typedef double DP_SCALAR_TYPE; typedef float16x8_t HP_VEC_TYPE; typedef float32x4_t SP_VEC_TYPE; typedef float64x2_t DP_VEC_TYPE; #define SET_VEC_PH(_I_) (HP_VEC_TYPE)vdupq_n_f16( _I_ ); #define SET_VEC_PS(_I_) (SP_VEC_TYPE)vdupq_n_f32( _I_ ); #define SET_VEC_PD(_I_) (DP_VEC_TYPE)vdupq_n_f64( _I_ ); #define ADD_VEC_PH(_I_,_J_) (HP_VEC_TYPE)vaddq_f16( _I_ , _J_ ); #define ADD_VEC_PS(_I_,_J_) (SP_VEC_TYPE)vaddq_f32( _I_ , _J_ ); #define ADD_VEC_PD(_I_,_J_) (DP_VEC_TYPE)vaddq_f64( _I_ , _J_ ); #define MUL_VEC_PH(_I_,_J_) (HP_VEC_TYPE)vmulq_f16( _I_ , _J_ ); #define MUL_VEC_PS(_I_,_J_) (SP_VEC_TYPE)vmulq_f32( _I_ , _J_ ); #define MUL_VEC_PD(_I_,_J_) (DP_VEC_TYPE)vmulq_f64( _I_ , _J_ ); #define FMA_VEC_PH(_I_,_J_,_K_) (HP_VEC_TYPE)vfmaq_f16( _K_ , _J_ , _I_ ); #define FMA_VEC_PS(_I_,_J_,_K_) (SP_VEC_TYPE)vfmaq_f32( _K_ , _J_ , _I_ ); #define 
FMA_VEC_PD(_I_,_J_,_K_) (DP_VEC_TYPE)vfmaq_f64( _K_ , _J_ , _I_ ); /* There is no scalar FMA intrinsic available on this architecture. */ #define SET_VEC_SH(_I_) _I_ ; #define ADD_VEC_SH(_I_,_J_) vaddh_f16( _I_ , _J_ ); #define MUL_VEC_SH(_I_,_J_) vmulh_f16( _I_ , _J_ ); #define SQRT_VEC_SH(_I_) vsqrth_f16( _I_ ); #define FMA_VEC_SH(_out_,_I_,_J_,_K_) _out_ = _I_ * _J_ + _K_; #define SET_VEC_SS(_I_) _I_ ; #define ADD_VEC_SS(_I_,_J_) _I_ + _J_ ; #define MUL_VEC_SS(_I_,_J_) _I_ * _J_ ; #define FMA_VEC_SS(_out_,_I_,_J_,_K_) _out_ = _I_ * _J_ + _K_; #define SET_VEC_SD(_I_) _I_ ; #define ADD_VEC_SD(_I_,_J_) _I_ + _J_ ; #define MUL_VEC_SD(_I_,_J_) _I_ * _J_ ; #define FMA_VEC_SD(_out_,_I_,_J_,_K_) _out_ = _I_ * _J_ + _K_; #elif defined(POWER) void test_hp_power_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_sp_power_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_dp_power_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_hp_power_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_sp_power_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); void test_dp_power_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); #include typedef float SP_SCALAR_TYPE; typedef double DP_SCALAR_TYPE; typedef __vector float SP_VEC_TYPE; typedef __vector double DP_VEC_TYPE; #define SET_VEC_PS(_I_) (SP_VEC_TYPE){ _I_ , _I_ , _I_ , _I_ }; #define SET_VEC_PD(_I_) (DP_VEC_TYPE){ _I_ , _I_ }; #define ADD_VEC_PS(_I_,_J_) (SP_VEC_TYPE)vec_add( _I_ , _J_ ); #define ADD_VEC_PD(_I_,_J_) (DP_VEC_TYPE)vec_add( _I_ , _J_ ); #define MUL_VEC_PS(_I_,_J_) (SP_VEC_TYPE)vec_mul( _I_ , _J_ ); #define MUL_VEC_PD(_I_,_J_) (DP_VEC_TYPE)vec_mul( _I_ , _J_ ); #define FMA_VEC_PS(_I_,_J_,_K_) (SP_VEC_TYPE)vec_madd( _I_ , _J_ , _K_ ); #define FMA_VEC_PD(_I_,_J_,_K_) (DP_VEC_TYPE)vec_madd( _I_ , _J_ , _K_ ); /* There is no scalar FMA intrinsic available on 
this architecture. */ #define SET_VEC_SS(_I_) _I_ ; #define ADD_VEC_SS(_I_,_J_) _I_ + _J_ ; #define MUL_VEC_SS(_I_,_J_) _I_ * _J_ ; #define FMA_VEC_SS(_out_,_I_,_J_,_K_) _out_ = _I_ * _J_ + _K_; #define SET_VEC_SD(_I_) _I_ ; #define ADD_VEC_SD(_I_,_J_) _I_ + _J_ ; #define MUL_VEC_SD(_I_,_J_) _I_ * _J_ ; #define FMA_VEC_SD(_out_,_I_,_J_,_K_) _out_ = _I_ * _J_ + _K_; #endif papi-papi-7-2-0-t/src/counter_analysis_toolkit/compar.c000066400000000000000000000006661502707512200232130ustar00rootroot00000000000000int compar_lf(const void *a, const void *b){ const double *da = (const double *)a; const double *db = (const double *)b; if( *da < *db) return -1; if( *da > *db) return 1; return 0; } int compar_lld(const void *a, const void *b){ const long long int *da = (const long long int *)a; const long long int *db = (const long long int *)b; if( *da < *db) return -1; if( *da > *db) return 1; return 0; } papi-papi-7-2-0-t/src/counter_analysis_toolkit/dcache.c000066400000000000000000000377111502707512200231420ustar00rootroot00000000000000#include "papi.h" #include "caches.h" #include "timing_kernels.h" #include "dcache.h" #include "params.h" #include static void print_header(FILE *ofp_papi, hw_desc_t *hw_desc); static void print_cache_sizes(FILE *ofp_papi, hw_desc_t *hw_desc); static void print_core_affinities(FILE *ofp); extern char* eventname; long long min_size, max_size; int is_core = 0; void d_cache_driver(char* papi_event_name, cat_params_t params, hw_desc_t *hw_desc, int latency_only, int mode) { int pattern = 3; long long stride; int f, cache_line; int status, evtCode, test_cnt = 0; float ppb = 16; FILE *ofp_papi; char *sufx, *papiFileName; // Use component ID to check if event is a core event. if( strcmp(papi_event_name, "cat::latencies") && PAPI_OK != (status = PAPI_event_name_to_code(papi_event_name, &evtCode)) ) { error_handler(status, __LINE__); } else { if( 0 == PAPI_get_event_component(evtCode) ) is_core = 1; } // Open file (pass handle to d_cache_test()). 
if(CACHE_READ_WRITE == mode){ sufx = strdup(".data.writes"); }else{ sufx = strdup(".data.reads"); } int l = strlen(params.outputdir)+strlen(papi_event_name)+strlen(sufx); papiFileName = (char *)calloc( 1+l, sizeof(char) ); if (!papiFileName) { fprintf(stderr, "Unable to allocate memory. Skipping event %s.\n", papi_event_name); goto error0; } if (l != (sprintf(papiFileName, "%s%s%s", params.outputdir, papi_event_name, sufx))) { fprintf(stderr, "sprintf error. Skipping event %s.\n", papi_event_name); goto error1; } if (NULL == (ofp_papi = fopen(papiFileName,"w"))) { fprintf(stderr, "Unable to open file %s. Skipping event %s.\n", papiFileName, papi_event_name); goto error1; } if( (NULL==hw_desc) || (0==hw_desc->dcache_line_size[0]) ) cache_line = 64; else cache_line = hw_desc->dcache_line_size[0]; // Print meta-data about this run in the first few lines of the output file. print_header(ofp_papi, hw_desc); // Go through each parameter variant. for(pattern = 3; pattern <= 4; ++pattern) { for(f = 1; f <= 2; f *= 2) { stride = cache_line*f; // PPB variation only makes sense if the pattern is not sequential. 
if(pattern != 4) { for(ppb = (float)hw_desc->maxPPB; ppb >= 16; ppb *= 16.0/(hw_desc->maxPPB)) { if( params.show_progress ) { printf("%3d%%\b\b\b\b",(100*test_cnt++)/6); fflush(stdout); } status = d_cache_test(pattern, params, hw_desc, stride, ppb, papi_event_name, latency_only, mode, ofp_papi); if( status < 0 ) goto error2; } } else { if( params.show_progress ) { printf("%3d%%\b\b\b\b",(100*test_cnt++)/6); fflush(stdout); } status = d_cache_test(pattern, params, hw_desc, stride, ppb, papi_event_name, latency_only, mode, ofp_papi); if( status < 0 ) goto error2; } } } error2: if( params.show_progress ) { size_t i; printf("100%%"); for(i=0; icache_levels<=0) ){ for(i=min_size; icache_levels; for(j=0; jpts_per_reg[j]; } guessCount += hw_desc->pts_per_mm; int llc_idx = hw_desc->cache_levels-1; int num_pts = hw_desc->pts_per_mm+1; double factor = pow((double)FACTOR, ((double)(num_pts-1))/((double)num_pts)); max_size = factor*(hw_desc->dcache_size[llc_idx])/hw_desc->mmsplit; } // Get the number of threads. ONT = get_thread_count(); // Latency results from the benchmark. rslts = (double ***)malloc(max_iter*sizeof(double **)); for(i=0; icache_levels<=0) ){ cnt = 0; // If we don't know the cache sizes, space the measurements between two default values. for(active_buf_len=min_size; active_buf_lencache_levels; int numHier = numCaches+1; int llc_idx = numCaches-1; int len = 0, ptsToNextCache, tmpIdx = 0; long long currCacheSize, nextCacheSize; long long *bufSizes; // Calculate the length of the array of buffer sizes. for(j=0; jpts_per_reg[j]; } len += hw_desc->pts_per_mm; // Allocate space for the array of buffer sizes. if( NULL == (bufSizes = (long long *)calloc(len, sizeof(long long))) ) goto error; // Define buffer sizes. 
tmpIdx = 0; for(j=0; jdcache_size[0]/(8.0*hw_desc->split[0]); } else { currCacheSize = hw_desc->dcache_size[j-1]/hw_desc->split[j-1]; } /* The upper bound of the final "cache" region (memory in this case) is set to FACTOR times the * size of the LLC so that all threads cumulatively will exceed the LLC by a factor of FACTOR. * All other upper bounds are set to the capacity of the cache, as observed per core. */ if( llc_idx+1 == j ) { nextCacheSize = 16LL*(hw_desc->dcache_size[llc_idx])/hw_desc->mmsplit; ptsToNextCache = hw_desc->pts_per_mm+1; } else { nextCacheSize = hw_desc->dcache_size[j]/hw_desc->split[j]; ptsToNextCache = hw_desc->pts_per_reg[j]+1; } /* Choose a factor "f" to grow the buffer size by, such that we collect the user-specified * number of samples between each cache size, evenly distributed in a geometric fashion * (i.e., sizes will be equally spaced in a log graph). */ for(k = 1; k < ptsToNextCache; ++k) { f = pow(((double)nextCacheSize)/currCacheSize, ((double)k)/ptsToNextCache); bufSizes[tmpIdx+k-1] = f*currCacheSize; } if( llc_idx+1 == j ) { tmpIdx += hw_desc->pts_per_mm; } else { tmpIdx += hw_desc->pts_per_reg[j]; } } cnt=0; for(j=0; jdcache_size[llc_idx]/hw_desc->split[llc_idx]; llc_size /= sizeof(uintptr_t); out = probeBufferSize(active_buf_len, stride, pages_per_block, pattern, llc_size, v, &rslt, latency_only, mode, ONT); if(out.status != 0) goto error; for(k = 0; k < ONT; ++k) { rslts[cnt][k] = out.dt[k]; counter[cnt][k] = out.counter[k]; } values[cnt++] = bufSizes[j]; } if( params.show_progress ){ printf(" \b"); fflush(stdout); } free(bufSizes); } // Free each thread's memory. for(j=0; jcache_levels; ++i) { long long sz = hw_desc->dcache_size[i]/hw_desc->split[i]; fprintf(ofp, " L%d:%lld", i+1, sz); } fprintf(ofp, "\n"); } void print_core_affinities(FILE *ofp) { int k, ONT; int *pinnings = NULL; // Get the number of threads. ONT = get_thread_count(); // List of core affinities in which the index is the thread ID. 
pinnings = (int *)malloc(ONT*sizeof(int)); if( NULL == pinnings ) { fprintf(stderr, "Error: cannot allocate space for experiment.\n"); return; } #pragma omp parallel default(shared) { int idx = omp_get_thread_num(); pinnings[idx] = sched_getcpu(); } fprintf(ofp, "# Core:"); for(k=0; k #include #include "hw_desc.h" #include "params.h" #define FACTOR 12LL int varyBufferSizes(long long *values, double **rslts, double **counter, cat_params_t params, hw_desc_t *hw_desc, long long line_size_in_bytes, float pages_per_block, int pattern, int latency_only, int mode, int ONT); int get_thread_count(); void d_cache_driver(char* papi_event_name, cat_params_t params, hw_desc_t *hw_desc, int latency_only, int mode); int d_cache_test(int pattern, cat_params_t params, hw_desc_t *hw_desc, long long stride_in_bytes, float pages_per_block, char* papi_event_name, int latency_only, int mode, FILE* ofp); #endif papi-papi-7-2-0-t/src/counter_analysis_toolkit/driver.h000066400000000000000000000025731502707512200232310ustar00rootroot00000000000000#include "eventstock.h" #include "dcache.h" #include "branch.h" #include "icache.h" #include "flops.h" #include "vec.h" #include "instr.h" #include "hw_desc.h" #include "params.h" #define USE_ALL_EVENTS 0x0 #define READ_FROM_FILE 0x1 #define BENCH_FLOPS 0x01 #define BENCH_BRANCH 0x02 #define BENCH_DCACHE_READ 0x04 #define BENCH_DCACHE_WRITE 0x08 #define BENCH_ICACHE_READ 0x10 #define BENCH_VEC 0x20 #define BENCH_INSTR 0x40 int parseArgs(int argc, char **argv, cat_params_t *params); int setup_evts(char* inputfile, char*** basenames, int** cards); unsigned long int omp_get_thread_num_wrapper(); int check_cards(cat_params_t mode, int** indexmemo, char** basenames, int* cards, int ct, int nevts, evstock* data); void combine_qualifiers(int n, int pk, int ct, char** list, char* name, char** allevts, int* track, int flag, int* bitmap); void trav_evts(evstock* stock, int pk, int* cards, int nevts, int selexnsize, int mode, char** allevts, int* track, int* 
indexmemo, char** basenames); int perm(int n, int k); int comb(int n, int k); void testbench(char** allevts, int cmbtotal, hw_desc_t *hw_desc, cat_params_t params, int myid, int nprocs); void print_usage(); static int parse_line(FILE *input, char **key, long long *value); static void read_conf_file(char *conf_file, hw_desc_t *hw_desc); static hw_desc_t *obtain_hardware_description(char *conf_file_name); papi-papi-7-2-0-t/src/counter_analysis_toolkit/event_list.txt000066400000000000000000000001171502707512200244720ustar00rootroot00000000000000MEM_LOAD_RETIRED 1 L2_RQSTS 1 OFFCORE_RESPONSE_0:DMND_DATA_RD:L3_HIT:SNP_ANY 0 papi-papi-7-2-0-t/src/counter_analysis_toolkit/eventstock.c000066400000000000000000000146201502707512200241120ustar00rootroot00000000000000#include #include #include #include #include "papi.h" #include "eventstock.h" #if !defined(_PAPI_CPU_COMPONENT_NAME) #define _PAPI_CPU_COMPONENT_NAME "perf_event" #endif int build_stock(evstock* stock) { int ret; PAPI_event_info_t info; int cid; int ncomps = PAPI_num_components(); int event_counter = 0; int subctr = 0; int tmp_event_count = 0; /* initialized so the "component not found" check below is well defined */ int event_qual_i, event_i; if (!stock) return 1; event_i = 0 | PAPI_NATIVE_MASK; // Add the names to the stock. event_counter = 0; for(cid = 0; cid < ncomps; ++cid) { const PAPI_component_info_t *cmp_info = PAPI_get_component_info(cid); if( strcmp(cmp_info->name, _PAPI_CPU_COMPONENT_NAME) ) continue; if (cmp_info->disabled == PAPI_EDELAY_INIT) { int nvt_code = 0 | PAPI_NATIVE_MASK; PAPI_enum_cmp_event(&nvt_code, PAPI_ENUM_FIRST, cid); } tmp_event_count = cmp_info->num_native_events; // Set the data stock's sizes all to zero.
if (NULL == (stock->evtsizes = (int*)calloc( (tmp_event_count),sizeof(int) ))) { fprintf(stderr, "Failed allocation of stock->evtsizes.\n"); goto gracious_error; } if (NULL == (stock->base_evts = (char**)malloc( (tmp_event_count)*sizeof(char*) ))) { fprintf(stderr, "Failed allocation of stock->base_evts.\n"); goto gracious_error; } if (NULL == (stock->evts = (char***)malloc((tmp_event_count)*sizeof(char**)))) { fprintf(stderr, "Failed allocation of stock->evts.\n"); goto gracious_error; } if (NULL == (stock->maxqualsize = (size_t *)calloc( tmp_event_count, sizeof(size_t) ))) { fprintf(stderr, "Failed allocation of stock->maxqualsize.\n"); goto gracious_error; } break; } if( 0 == tmp_event_count ){ fprintf(stderr,"ERROR: CPU component (%s) not found. Exiting.\n",_PAPI_CPU_COMPONENT_NAME); goto gracious_error; } // At this point "cid" contains the id of the perf_event (CPU) component. ret=PAPI_enum_cmp_event(&event_i,PAPI_ENUM_FIRST,cid); if(ret!=PAPI_OK){ return 0; } do{ int i, max_qual_count = 32; size_t max_qual_len, tmp_qual_len; memset(&info,0,sizeof(info)); event_qual_i = event_i; // Resize the arrays if needed. if( event_counter >= tmp_event_count ){ tmp_event_count *= 2; stock->evts = (char ***)realloc( stock->evts, tmp_event_count*sizeof(char **) ); stock->evtsizes = (int *)realloc( stock->evtsizes, tmp_event_count*sizeof(int) ); stock->base_evts = (char **)realloc( stock->base_evts, tmp_event_count*sizeof(char *) ); stock->maxqualsize = (size_t *)realloc( stock->maxqualsize, tmp_event_count*sizeof(size_t) ); } if (NULL == (stock->evts[event_counter] = (char**)malloc( max_qual_count*sizeof(char*) )) ) { fprintf(stderr, "Failed allocation of stock->evts[i].\n"); goto gracious_error; } max_qual_len = 0; subctr = 0; i = 0; do { char *col_pos; ret=PAPI_get_event_info(event_qual_i,&info); if(ret != PAPI_OK) continue; if( 0 == i ){ // The first iteration of the inner do loop will give us // the base event, without qualifiers. 
stock->base_evts[event_counter] = strdup(info.symbol); i++; continue; } // TODO: For the CPU component, we skip qualifiers that // contain the string "=". This assumption should be // removed when working with other components. if( NULL != strstr(info.symbol, "=") ) continue; col_pos = rindex(info.symbol, ':'); if ( NULL == col_pos ){ continue; } // Resize the array of qualifiers as needed. if( subctr >= max_qual_count ){ max_qual_count *= 2; stock->evts[event_counter] = (char **)realloc( stock->evts[event_counter], max_qual_count*sizeof(char *) ); } // Copy the qualifier name into the array. stock->evts[event_counter][subctr] = strdup(col_pos+1); tmp_qual_len = strlen( stock->evts[event_counter][subctr] ) + 1; if( tmp_qual_len > max_qual_len ) max_qual_len = tmp_qual_len; subctr++; } while(PAPI_enum_cmp_event(&event_qual_i,PAPI_NTV_ENUM_UMASKS,cid)==PAPI_OK); stock->evtsizes[event_counter] = subctr; stock->maxqualsize[event_counter] = max_qual_len; event_counter++; } while( PAPI_enum_cmp_event(&event_i,PAPI_ENUM_EVENTS,cid)==PAPI_OK ); stock->size = event_counter; return 0; gracious_error: // Frees only the successfully allocated arrays remove_stock(stock); return 1; } void print_stock(evstock* stock) { int i, j; for(i = 0; i < stock->size; ++i) { fprintf(stdout, "BASE EVENT <%s>\n", stock->base_evts[i]); for(j = 0; j < stock->evtsizes[i]; ++j) { fprintf(stdout, "%s\n", stock->evts[i][j]); } } return; } int num_evts(evstock* stock) { return stock->size; } int num_quals(evstock* stock, int base_evt) { return stock->evtsizes[base_evt]; } size_t max_qual_size(evstock* stock, int base_evt) { return stock->maxqualsize[base_evt]; } char* evt_qual(evstock* stock, int base_evt, int tag) { return stock->evts[base_evt][tag]; } char* evt_name(evstock* stock, int index) { return stock->base_evts[index]; } void remove_stock(evstock* stock) { if (!stock) return; int i, j; for(i = 0; i < stock->size; ++i) { if (stock->evtsizes) for(j = 0; j < stock->evtsizes[i]; ++j) { if
(stock->evts[i][j]) free(stock->evts[i][j]); } if (stock->evts[i]) free(stock->evts[i]); if (stock->base_evts[i]) free(stock->base_evts[i]); } if (stock->evts) free(stock->evts); if (stock->base_evts) free(stock->base_evts); if (stock->evtsizes) free(stock->evtsizes); if (stock->maxqualsize) free(stock->maxqualsize); free(stock); return; } papi-papi-7-2-0-t/src/counter_analysis_toolkit/eventstock.h000066400000000000000000000010221502707512200241070ustar00rootroot00000000000000#ifndef _EVENT_STOCK_ #define _EVENT_STOCK_ typedef struct { int size; int* evtsizes; size_t* maxqualsize; char** base_evts; char*** evts; } evstock; int build_stock(evstock* stock); void print_stock(evstock* stock); int num_evts(evstock* stock); int num_quals(evstock* stock, int base_evt); size_t max_qual_size(evstock* stock, int base_evt); char* evt_qual(evstock* stock, int base_evt, int tag); char* evt_name(evstock* stock, int index); void remove_stock(evstock* stock); #endif papi-papi-7-2-0-t/src/counter_analysis_toolkit/flops.c000066400000000000000000000542711502707512200230560ustar00rootroot00000000000000#define _GNU_SOURCE #include #include #include #include #include #include #include "flops.h" #define DOUBLE 2 #define SINGLE 1 #define HALF 0 #define CHOLESKY 3 #define GEMM 2 #define NORMALIZE 1 #define MAXDIM 51 #if defined(mips) #define FMA 1 #elif (defined(sparc) && defined(sun)) #define FMA 1 #else #define FMA 0 #endif /* Function prototypes. 
*/ void print_header( FILE *fp, char *prec, char *kernel ); void resultline( int i, int kernel, int EventSet, FILE *fp ); void exec_flops( int precision, int EventSet, FILE *fp ); double normalize_double( int n, double *xd ); void cholesky_double( int n, double *ld, double *ad ); void exec_double_norm( int EventSet, FILE *fp ); void exec_double_cholesky( int EventSet, FILE *fp ); void exec_double_gemm( int EventSet, FILE *fp ); void keep_double_vec_res( int n, double *xd ); void keep_double_mat_res( int n, double *ld ); float normalize_single( int n, float *xs ); void cholesky_single( int n, float *ls, float *as ); void exec_single_norm( int EventSet, FILE *fp ); void exec_single_cholesky( int EventSet, FILE *fp ); void exec_single_gemm( int EventSet, FILE *fp ); void keep_single_vec_res( int n, float *xs ); void keep_single_mat_res( int n, float *ls ); #if defined(ARM) half normalize_half( int n, half *xh ); void cholesky_half( int n, half *lh, half *ah ); void exec_half_norm( int EventSet, FILE *fp ); void exec_half_cholesky( int EventSet, FILE *fp ); void exec_half_gemm( int EventSet, FILE *fp ); void keep_half_vec_res( int n, half *xh ); void keep_half_mat_res( int n, half *lh ); #endif void print_header( FILE *fp, char *prec, char *kernel ) { fprintf(fp, "#%s %s\n", prec, kernel); fprintf(fp, "#N RawEvtCnt NormdEvtCnt ExpectedAdd ExpectedSub ExpectedMul ExpectedDiv ExpectedSqrt ExpectedFMA ExpectedTotal\n"); } void resultline( int i, int kernel, int EventSet, FILE *fp ) { long long flpins = 0, denom; long long papi, all, add, sub, mul, div, sqrt, fma; int retval; if ( (retval=PAPI_stop(EventSet, &flpins)) != PAPI_OK ) { return; } switch(kernel) { case NORMALIZE: all = 3*i+1; denom = all; add = i; sub = 0; mul = i; div = i; if ( 0 == i ) { sqrt = 0; } else { sqrt = 1; } fma = 0; break; case GEMM: all = 2*i*i*i; if ( 0 == i ) { denom = 1; } else { denom = all; } add = 0; sub = 0; mul = 0; div = 0; sqrt = 0; fma = i*i*i; // Need to derive. 
break; case CHOLESKY: all = i*(2*i*i+9*i+1)/6.0; if ( 0 == i ) { denom = 1; } else { denom = all; } add = i*(i-1)*(i+1)/6.0; sub = i*(i+1)/2.0; mul = i*(i-1)*(i+4)/6.0; div = i*(i-1)/2.0; sqrt = i; fma = 0; break; default: all = -1; denom = -1; add = -1; sub = -1; mul = -1; div = -1; sqrt = -1; fma = -1; } papi = flpins << FMA; fprintf(fp, "%d %lld %.17g %lld %lld %lld %lld %lld %lld %lld\n", i, papi, ((double)papi)/((double)denom), add, sub, mul, div, sqrt, fma, all); } #if defined(ARM) half normalize_half( int n, half *xh ) { if ( 0 == n ) return 0.0; half aa = 0.0; half buff = 0.0; int i; for ( i = 0; i < n; i++ ) { buff = xh[i] * xh[i]; aa += buff; } aa = SQRT_VEC_SH(aa); for ( i = 0; i < n; i++ ) xh[i] = xh[i]/aa; return ( aa ); } void cholesky_half( int n, half *lh, half *ah ) { int i, j, k; half sum = 0.0; half buff = 0.0; for (i = 0; i < n; i++) { for (j = 0; j <= i; j++) { sum = 0.0; for (k = 0; k < j; k++) { buff = lh[i * n + k] * lh[j * n + k]; sum += buff; } if( i == j ) { buff = ah[i * n + i] - sum; lh[i * n + j] = SQRT_VEC_SH(buff); } else { buff = ah[i * n + i] - sum; sum = ((half)1.0); sum = sum/lh[j * n + j]; lh[i * n + j] = sum * buff; } } } } void gemm_half( int n, half *ch, half *ah, half *bh ) { int i, j, k; half sum = 0.0; for (i = 0; i < n; i++) { for (j = 0; j < n; j++) { sum = 0.0; for (k = 0; k < n; k++) { FMA_VEC_SH(sum, ah[i * n + k], bh[k * n + j], sum); } ch[i * n + j] = sum; } } } #endif float normalize_single( int n, float *xs ) { if ( 0 == n ) return 0.0; float aa = 0.0; int i; for ( i = 0; i < n; i++ ) aa = aa + xs[i] * xs[i]; aa = sqrtf(aa); for ( i = 0; i < n; i++ ) xs[i] = xs[i]/aa; return ( aa ); } void cholesky_single( int n, float *ls, float *as ) { int i, j, k; float sum = 0.0; for (i = 0; i < n; i++) { for (j = 0; j <= i; j++) { sum = 0.0; for (k = 0; k < j; k++) { sum += ls[i * n + k] * ls[j * n + k]; } if( i == j ) { ls[i * n + j] = sqrtf(as[i * n + i] - sum); } else { ls[i * n + j] = ((float)1.0)/ls[j * n + j] * (as[i * 
n + j] - sum); } } } } void gemm_single( int n, float *cs, float *as, float *bs ) { int i, j, k; SP_SCALAR_TYPE argI, argJ, argK; for (i = 0; i < n; i++) { for (j = 0; j < n; j++) { argK = SET_VEC_SS(0.0); for (k = 0; k < n; k++) { argI = SET_VEC_SS(as[i * n + k]); argJ = SET_VEC_SS(bs[k * n + j]); FMA_VEC_SS(argK, argI, argJ, argK); } cs[i * n + j] = ((float*)&argK)[0]; } } } double normalize_double( int n, double *xd ) { if ( 0 == n ) return 0.0; double aa = 0.0; int i; for ( i = 0; i < n; i++ ) aa = aa + xd[i] * xd[i]; aa = sqrt(aa); for ( i = 0; i < n; i++ ) xd[i] = xd[i]/aa; return ( aa ); } void cholesky_double( int n, double *ld, double *ad ) { int i, j, k; double sum = 0.0; for (i = 0; i < n; i++) { for (j = 0; j <= i; j++) { sum = 0.0; for (k = 0; k < j; k++) { sum += ld[i * n + k] * ld[j * n + k]; } if( i == j ) { ld[i * n + j] = sqrt(ad[i * n + i] - sum); } else { ld[i * n + j] = ((double)1.0)/ld[j * n + j] * (ad[i * n + j] - sum); } } } } void gemm_double( int n, double *cd, double *ad, double *bd ) { int i, j, k; DP_SCALAR_TYPE argI, argJ, argK; for (i = 0; i < n; i++) { for (j = 0; j < n; j++) { argK = SET_VEC_SD(0.0); for (k = 0; k < n; k++) { argI = SET_VEC_SD(ad[i * n + k]); argJ = SET_VEC_SD(bd[k * n + j]); FMA_VEC_SD(argK, argI, argJ, argK); } cd[i * n + j] = ((double*)&argK)[0]; } } } void exec_double_norm( int EventSet, FILE *fp ) { int i, n, retval; double *xd=NULL; /* Print info about the computational kernel. */ print_header( fp, "Double-Precision", "Vector Normalization" ); /* Allocate the linear arrays. */ xd = malloc( MAXDIM * sizeof(double) ); /* Step through the different array sizes. */ for ( n = 0; n < MAXDIM; n++ ) { /* Initialize the needed arrays at this size. */ for ( i = 0; i < n; i++ ) { xd[i] = ((double)random())/((double)RAND_MAX) * (double)1.1; } /* Reset PAPI count. */ if ( (retval = PAPI_start( EventSet )) != PAPI_OK ) { return; } /* Run the kernel. */ normalize_double( n, xd ); usleep(1); /* Stop and print count. 
*/ resultline( n, NORMALIZE, EventSet, fp ); keep_double_vec_res( n, xd ); } /* Free dynamically allocated memory. */ free( xd ); } void exec_double_cholesky( int EventSet, FILE *fp ) { int i, j, n, retval; double *ad=NULL, *ld=NULL; double sumd = 0.0; /* Print info about the computational kernel. */ print_header( fp, "Double-Precision", "Cholesky Decomposition" ); /* Allocate the matrices. */ ad = malloc( MAXDIM * MAXDIM * sizeof(double) ); ld = malloc( MAXDIM * MAXDIM * sizeof(double) ); /* Step through the different array sizes. */ for ( n = 0; n < MAXDIM; n++ ) { /* Initialize the needed arrays at this size. */ for ( i = 0; i < n; i++ ) { for ( j = 0; j < i; j++ ) { ld[i * n + j] = 0.0; ld[j * n + i] = 0.0; ad[i * n + j] = ((double)random())/((double)RAND_MAX) * (double)1.1; ad[j * n + i] = ad[i * n + j]; } ad[i * n + i] = 0.0; ld[i * n + i] = 0.0; } /* Guarantee diagonal dominance for successful Cholesky. */ for ( i = 0; i < n; i++ ) { sumd = 0.0; for ( j = 0; j < n; j++ ) { sumd += fabs(ad[i * n + j]); } ad[i * n + i] = sumd + (double)1.1; } /* Reset PAPI count. */ if ( (retval = PAPI_start( EventSet )) != PAPI_OK ) { return; } /* Run the kernel. */ cholesky_double( n, ld, ad ); usleep(1); /* Stop and print count. */ resultline( n, CHOLESKY, EventSet, fp ); keep_double_mat_res( n, ld ); } free( ad ); free( ld ); } void exec_double_gemm( int EventSet, FILE *fp ) { int i, j, n, retval; double *ad=NULL, *bd=NULL, *cd=NULL; /* Print info about the computational kernel. */ print_header( fp, "Double-Precision", "GEMM" ); /* Allocate the matrices. */ ad = malloc( MAXDIM * MAXDIM * sizeof(double) ); bd = malloc( MAXDIM * MAXDIM * sizeof(double) ); cd = malloc( MAXDIM * MAXDIM * sizeof(double) ); /* Step through the different array sizes. */ for ( n = 0; n < MAXDIM; n++ ) { /* Initialize the needed arrays at this size. 
*/ for ( i = 0; i < n; i++ ) { for ( j = 0; j < n; j++ ) { cd[i * n + j] = 0.0; ad[i * n + j] = ((double)random())/((double)RAND_MAX) * (double)1.1; bd[i * n + j] = ((double)random())/((double)RAND_MAX) * (double)1.1; } } /* Reset PAPI count. */ if ( (retval = PAPI_start( EventSet )) != PAPI_OK ) { return; } /* Run the kernel. */ gemm_double( n, cd, ad, bd ); usleep(1); /* Stop and print count. */ resultline( n, GEMM, EventSet, fp ); keep_double_mat_res( n, cd ); } free( ad ); free( bd ); free( cd ); } void keep_double_vec_res( int n, double *xd ) { int i; double sum = 0.0; for( i = 0; i < n; ++i ) { sum += xd[i]; } if( 1.2345 == sum ) { fprintf(stderr, "Side-effect to disable dead code elimination by the compiler. Please ignore.\n"); } } void keep_double_mat_res( int n, double *ld ) { int i, j; double sum = 0.0; for( i = 0; i < n; ++i ) { for( j = 0; j < n; ++j ) { sum += ld[i * n + j]; } } if( 1.2345 == sum ) { fprintf(stderr, "Side-effect to disable dead code elimination by the compiler. Please ignore.\n"); } } void exec_single_norm( int EventSet, FILE *fp ) { int i, n, retval; float *xs=NULL; /* Print info about the computational kernel. */ print_header( fp, "Single-Precision", "Vector Normalization" ); /* Allocate the linear arrays. */ xs = malloc( MAXDIM * sizeof(float) ); /* Step through the different array sizes. */ for ( n = 0; n < MAXDIM; n++ ) { /* Initialize the needed arrays at this size. */ for ( i = 0; i < n; i++ ) { xs[i] = ((float)random())/((float)RAND_MAX) * (float)1.1; } /* Reset PAPI count. */ if ( (retval = PAPI_start( EventSet )) != PAPI_OK ) { return; } /* Run the kernel. */ normalize_single( n, xs ); usleep(1); /* Stop and print count. */ resultline( n, NORMALIZE, EventSet, fp ); keep_single_vec_res( n, xs ); } /* Free dynamically allocated memory. */ free( xs ); } void exec_single_cholesky( int EventSet, FILE *fp ) { int i, j, n, retval; float *as=NULL, *ls=NULL; float sums = 0.0; /* Print info about the computational kernel. 
*/ print_header( fp, "Single-Precision", "Cholesky Decomposition" ); /* Allocate the matrices. */ as = malloc( MAXDIM * MAXDIM * sizeof(float) ); ls = malloc( MAXDIM * MAXDIM * sizeof(float) ); /* Step through the different array sizes. */ for ( n = 0; n < MAXDIM; n++ ) { /* Initialize the needed arrays at this size. */ for ( i = 0; i < n; i++ ) { for ( j = 0; j < i; j++ ) { ls[i * n + j] = 0.0; ls[j * n + i] = 0.0; as[i * n + j] = ((float)random())/((float)RAND_MAX) * (float)1.1; as[j * n + i] = as[i * n + j]; } as[i * n + i] = 0.0; ls[i * n + i] = 0.0; } /* Guarantee diagonal dominance for successful Cholesky. */ for ( i = 0; i < n; i++ ) { sums = 0.0; for ( j = 0; j < n; j++ ) { sums += fabs(as[i * n + j]); } as[i * n + i] = sums + (float)1.1; } /* Reset PAPI count. */ if ( (retval = PAPI_start( EventSet )) != PAPI_OK ) { return; } /* Run the kernel. */ cholesky_single( n, ls, as ); usleep(1); /* Stop and print count. */ resultline( n, CHOLESKY, EventSet, fp ); keep_single_mat_res( n, ls ); } free( as ); free( ls ); } void exec_single_gemm( int EventSet, FILE *fp ) { int i, j, n, retval; float *as=NULL, *bs=NULL, *cs=NULL; /* Print info about the computational kernel. */ print_header( fp, "Single-Precision", "GEMM" ); /* Allocate the matrices. */ as = malloc( MAXDIM * MAXDIM * sizeof(float) ); bs = malloc( MAXDIM * MAXDIM * sizeof(float) ); cs = malloc( MAXDIM * MAXDIM * sizeof(float) ); /* Step through the different array sizes. */ for ( n = 0; n < MAXDIM; n++ ) { /* Initialize the needed arrays at this size. */ for ( i = 0; i < n; i++ ) { for ( j = 0; j < n; j++ ) { cs[i * n + j] = 0.0; as[i * n + j] = ((float)random())/((float)RAND_MAX) * (float)1.1; bs[i * n + j] = ((float)random())/((float)RAND_MAX) * (float)1.1; } } /* Reset PAPI count. */ if ( (retval = PAPI_start( EventSet )) != PAPI_OK ) { return; } /* Run the kernel. */ gemm_single( n, cs, as, bs ); usleep(1); /* Stop and print count. 
*/ resultline( n, GEMM, EventSet, fp ); keep_single_mat_res( n, cs ); } free( as ); free( bs ); free( cs ); } void keep_single_vec_res( int n, float *xs ) { int i; float sum = 0.0; for( i = 0; i < n; ++i ) { sum += xs[i]; } if( 1.2345 == sum ) { fprintf(stderr, "Side-effect to disable dead code elimination by the compiler. Please ignore.\n"); } } void keep_single_mat_res( int n, float *ls ) { int i, j; float sum = 0.0; for( i = 0; i < n; ++i ) { for( j = 0; j < n; ++j ) { sum += ls[i * n + j]; } } if( 1.2345 == sum ) { fprintf(stderr, "Side-effect to disable dead code elimination by the compiler. Please ignore.\n"); } } #if defined(ARM) void exec_half_norm( int EventSet, FILE *fp ) { int i, n, retval; half *xh=NULL; /* Print info about the computational kernel. */ print_header( fp, "Half-Precision", "Vector Normalization" ); /* Allocate the linear arrays. */ xh = malloc( MAXDIM * sizeof(half) ); /* Step through the different array sizes. */ for ( n = 0; n < MAXDIM; n++ ) { /* Initialize the needed arrays at this size. */ for ( i = 0; i < n; i++ ) { xh[i] = ((half)random())/((half)RAND_MAX) * (half)1.1; } /* Reset PAPI count. */ if ( (retval = PAPI_start( EventSet )) != PAPI_OK ) { return; } /* Run the kernel. */ normalize_half( n, xh ); usleep(1); /* Stop and print count. */ resultline( n, NORMALIZE, EventSet, fp ); keep_half_vec_res( n, xh ); } /* Free dynamically allocated memory. */ free( xh ); } void exec_half_cholesky( int EventSet, FILE *fp ) { int i, j, n, retval; half *ah=NULL, *lh=NULL; half sumh = 0.0; /* Print info about the computational kernel. */ print_header( fp, "Half-Precision", "Cholesky Decomposition" ); /* Allocate the matrices. */ ah = malloc( MAXDIM * MAXDIM * sizeof(half) ); lh = malloc( MAXDIM * MAXDIM * sizeof(half) ); /* Step through the different array sizes. */ for ( n = 0; n < MAXDIM; n++ ) { /* Initialize the needed arrays at this size. 
*/ for ( i = 0; i < n; i++ ) { for ( j = 0; j < i; j++ ) { lh[i * n + j] = 0.0; lh[j * n + i] = 0.0; ah[i * n + j] = ((half)random())/((half)RAND_MAX) * (half)1.1; ah[j * n + i] = ah[i * n + j]; } ah[i * n + i] = 0.0; lh[i * n + i] = 0.0; } /* Guarantee diagonal dominance for successful Cholesky. */ for ( i = 0; i < n; i++ ) { sumh = 0.0; for ( j = 0; j < n; j++ ) { sumh += fabs(ah[i * n + j]); } ah[i * n + i] = sumh + (half)1.1; } /* Reset PAPI count. */ if ( (retval = PAPI_start( EventSet )) != PAPI_OK ) { return; } /* Run the kernel. */ cholesky_half( n, lh, ah ); usleep(1); /* Stop and print count. */ resultline( n, CHOLESKY, EventSet, fp ); keep_half_mat_res( n, lh ); } free( ah ); free( lh ); } void exec_half_gemm( int EventSet, FILE *fp ) { int i, j, n, retval; half *ah=NULL, *bh=NULL, *ch=NULL; /* Print info about the computational kernel. */ print_header( fp, "Half-Precision", "GEMM" ); /* Allocate the matrices. */ ah = malloc( MAXDIM * MAXDIM * sizeof(half) ); bh = malloc( MAXDIM * MAXDIM * sizeof(half) ); ch = malloc( MAXDIM * MAXDIM * sizeof(half) ); /* Step through the different array sizes. */ for ( n = 0; n < MAXDIM; n++ ) { /* Initialize the needed arrays at this size. */ for ( i = 0; i < n; i++ ) { for ( j = 0; j < n; j++ ) { ch[i * n + j] = 0.0; ah[i * n + j] = ((half)random())/((half)RAND_MAX) * (half)1.1; bh[i * n + j] = ((half)random())/((half)RAND_MAX) * (half)1.1; } } /* Reset PAPI count. */ if ( (retval = PAPI_start( EventSet )) != PAPI_OK ) { return; } /* Run the kernel. */ gemm_half( n, ch, ah, bh ); usleep(1); /* Stop and print count. */ resultline( n, GEMM, EventSet, fp ); keep_half_mat_res( n, ch ); } free( ah ); free( bh ); free( ch ); } void keep_half_vec_res( int n, half *xh ) { int i; half sum = 0.0; for( i = 0; i < n; ++i ) { sum += xh[i]; } if( 1.2345 == sum ) { fprintf(stderr, "Side-effect to disable dead code elimination by the compiler. 
Please ignore.\n"); } } void keep_half_mat_res( int n, half *lh ) { int i, j; half sum = 0.0; for( i = 0; i < n; ++i ) { for( j = 0; j < n; ++j ) { sum += lh[i * n + j]; } } if( 1.2345 == sum ) { fprintf(stderr, "Side-effect to disable dead code elimination by the compiler. Please ignore.\n"); } } #endif void exec_flops( int precision, int EventSet, FILE *fp ) { /* Vector Normalization and Cholesky Decomposition tests. */ switch(precision) { case DOUBLE: exec_double_norm(EventSet, fp); exec_double_cholesky(EventSet, fp); exec_double_gemm(EventSet, fp); break; case SINGLE: exec_single_norm(EventSet, fp); exec_single_cholesky(EventSet, fp); exec_single_gemm(EventSet, fp); break; case HALF: #if defined(ARM) exec_half_norm(EventSet, fp); exec_half_cholesky(EventSet, fp); exec_half_gemm(EventSet, fp); #endif break; default: ; } return; } void flops_driver( char* papi_event_name, hw_desc_t *hw_desc, char* outdir ) { int retval = PAPI_OK; int EventSet = PAPI_NULL; FILE* ofp_papi; const char *sufx = ".flops"; char *papiFileName; (void)hw_desc; int l = strlen(outdir)+strlen(papi_event_name)+strlen(sufx); if (NULL == (papiFileName = (char *)calloc( 1+l, sizeof(char)))) { return; } if (l != (sprintf(papiFileName, "%s%s%s", outdir, papi_event_name, sufx))) { goto error0; } if (NULL == (ofp_papi = fopen(papiFileName,"w"))) { fprintf(stderr, "Failed to open file %s.\n", papiFileName); goto error0; } retval = PAPI_create_eventset( &EventSet ); if (retval != PAPI_OK ){ goto error1; } retval = PAPI_add_named_event( EventSet, papi_event_name ); if (retval != PAPI_OK ){ goto error1; } exec_flops(HALF, EventSet, ofp_papi); exec_flops(SINGLE, EventSet, ofp_papi); exec_flops(DOUBLE, EventSet, ofp_papi); retval = PAPI_cleanup_eventset( EventSet ); if (retval != PAPI_OK ){ goto error1; } retval = PAPI_destroy_eventset( &EventSet ); if (retval != PAPI_OK ){ goto error1; } error1: fclose(ofp_papi); error0: free(papiFileName); return; } 
papi-papi-7-2-0-t/src/counter_analysis_toolkit/flops.h000066400000000000000000000002401502707512200230460ustar00rootroot00000000000000#ifndef _FLOPS_ #define _FLOPS_ #include "hw_desc.h" #include "cat_arch.h" void flops_driver(char* papi_event_str, hw_desc_t *hw_desc, char* outdir); #endif papi-papi-7-2-0-t/src/counter_analysis_toolkit/gen_seq_dlopen.sh000066400000000000000000000260231502707512200250770ustar00rootroot00000000000000#!/bin/bash DRV_F=icache_seq.c KRN_F=icache_seq_kernel.c HEAD_F=icache_seq.h TRUE_IF=1 FALSE_IF=0 ################################################################################ create_common_prefix(){ cat < #include #include #include #include #include #include "papi.h" #include "icache_seq.h" EOF } ################################################################################ create_kernel(){ basic_block_copies=$1; block_type=$2; for((i=0; i<$((${basic_block_copies}-1)); i++)); do deref[$i]=$(($i+1)) done available=$((${basic_block_copies}-1)); indx=0; for((i=1; i<${basic_block_copies}; i++)); do rnd=$((RANDOM % ${available})) next=${deref[${rnd}]}; # If the next jump is too close, try one more time. if (( ${next} <= $((${indx}+2)) && ${next} > ${indx} )); then rnd=$((RANDOM % ${available})) next=${deref[${rnd}]}; fi permutation[${indx}]=$next; indx=${next}; deref[${rnd}]=${deref[$((${available}-1))]} # replace the element we used with the last one ((available--)); # reduce the number of available elements (to ditch the last one). 
done permutation[${indx}]=-1; last_link_in_chain=${indx}; if (( $block_type == $TRUE_IF )); then echo "long long seq_kernel_TRUE_IF_${basic_block_copies}(int epilogue){" else echo "long long seq_kernel_FALSE_IF_${basic_block_copies}(int epilogue){" fi cat < 3 ){" echo " RNG();" echo " }" echo " result = z1 ^ z2 ^ z3 ^ z4;" echo " is_zero *= result;" fi done cat <= 10000 )); then dl_reps=$(( ${basic_block_copies}/5000 )) tmp=$(( ${basic_block_copies}/${dl_reps} )) basic_block_copies=$tmp else create_kernel $basic_block_copies $j $TRUE_IF >> ${KRN_F} create_kernel $basic_block_copies $j $FALSE_IF >> ${KRN_F} echo "" >> ${KRN_F} echo "long long seq_kernel_TRUE_IF_${basic_block_copies}(int epilogue);" >> ${HEAD_F} echo "long long seq_kernel_FALSE_IF_${basic_block_copies}(int epilogue);" >> ${HEAD_F} fi echo "int seq_jumps_${basic_block_copies}x${dl_reps}(int iter_count, int eventset, int epilogue, int branch_type, int run_type, FILE* ofp_papi);" >> ${HEAD_F} create_caller ${basic_block_copies} $dl_reps >> ${DRV_F} echo "" >> ${DRV_F} } ################################################################################ create_main(){ cat <= 10000 )); then dl_reps=$(( ${basic_block_copies}/5000 )) tmp=$(( ${basic_block_copies}/${dl_reps} )) basic_block_copies=$tmp fi echo " if( show_progress ){" echo " printf(\"%3d%%\b\b\b\b\",(100*exp_cnt)/(4*$#));" echo " exp_cnt++;" echo " fflush(stdout);" echo " }" echo " side_effect += seq_jumps_${basic_block_copies}x${dl_reps}(1, eventset, NO_COPY, TRUE_IF, COLD_RUN, NULL);" echo " if(side_effect < init){" echo " return;" echo " }" echo " side_effect += seq_jumps_${basic_block_copies}x${dl_reps}(150, eventset, ${copy_type}, TRUE_IF, NORMAL_RUN, ofp_papi);" echo " if(side_effect < init){" echo " return;" echo " }" echo "" done done for copy_type in "NO_COPY"; do for ((prm=1; prm<=$#; prm++)); do basic_block_copies=${!prm} dl_reps=1; if (( $basic_block_copies >= 10000 )); then dl_reps=$(( ${basic_block_copies}/5000 )) tmp=$(( 
${basic_block_copies}/${dl_reps} )) basic_block_copies=$tmp fi echo " if( show_progress ){" echo " printf(\"%3d%%\b\b\b\b\",(100*exp_cnt)/(4*$#));" echo " exp_cnt++;" echo " fflush(stdout);" echo " }" echo " side_effect += seq_jumps_${basic_block_copies}x${dl_reps}(1, eventset, NO_COPY, FALSE_IF, COLD_RUN, NULL);" echo " if(side_effect < init){" echo " return;" echo " }" echo " side_effect += seq_jumps_${basic_block_copies}x${dl_reps}(150, eventset, ${copy_type}, FALSE_IF, NORMAL_RUN, ofp_papi);" echo " if(side_effect < init){" echo " return;" echo " }" echo "" done done cat < ${HEAD_F} echo "#include " >> ${HEAD_F} echo "" >> ${HEAD_F} echo "float buff[BUF_ELEM_CNT];" >> ${HEAD_F} echo "volatile int global_zero;" >> ${HEAD_F} echo "" >> ${HEAD_F} create_common_prefix > ${DRV_F} create_common_prefix > ${KRN_F} for sz in 10 20 30 50 100 150 200 300 400 600 800 1200 1600 2400 3200 5000 10000 15000 20000 25000 35000 40000 50000 60000; do create_functions ${sz} done create_main 10 20 30 50 100 150 200 300 400 600 800 1200 1600 2400 3200 5000 10000 15000 20000 25000 35000 40000 50000 60000 >> ${DRV_F} papi-papi-7-2-0-t/src/counter_analysis_toolkit/hw_desc.h000066400000000000000000000011531502707512200233430ustar00rootroot00000000000000#ifndef _HW_DESC_ #define _HW_DESC_ #define _MAX_SUPPORTED_CACHE_LEVELS 16 typedef struct _hw_desc{ int numcpus; int cache_levels; int maxPPB; int mmsplit; int pts_per_mm; int split[_MAX_SUPPORTED_CACHE_LEVELS]; int pts_per_reg[_MAX_SUPPORTED_CACHE_LEVELS]; long long dcache_line_size[_MAX_SUPPORTED_CACHE_LEVELS]; long long dcache_size[_MAX_SUPPORTED_CACHE_LEVELS]; int dcache_assoc[_MAX_SUPPORTED_CACHE_LEVELS]; long long icache_line_size[_MAX_SUPPORTED_CACHE_LEVELS]; long long icache_size[_MAX_SUPPORTED_CACHE_LEVELS]; int icache_assoc[_MAX_SUPPORTED_CACHE_LEVELS]; } hw_desc_t; #endif papi-papi-7-2-0-t/src/counter_analysis_toolkit/icache.c000066400000000000000000000022431502707512200231370ustar00rootroot00000000000000#include #include 
#include #include #include #include #include #include #include "papi.h" #include "icache.h" void i_cache_driver(char* papi_event_name, int junk, hw_desc_t *hw_desc, char* outdir, int show_progress) { // Open output file. const char *sufx = ".icache"; char *papiFileName; FILE *ofp_papi; (void)hw_desc; int l = strlen(outdir)+strlen(papi_event_name)+strlen(sufx); if (NULL == (papiFileName = (char *)calloc( 1+l, sizeof(char) ))) { fprintf(stderr, "Failed to allocate papiFileName.\n"); return; } if (l != (sprintf(papiFileName, "%s%s%s", outdir, papi_event_name, sufx))) { fprintf(stderr, "sprintf failed to copy into papiFileName.\n"); free(papiFileName); return; } if (NULL == (ofp_papi = fopen(papiFileName,"w"))) { fprintf(stderr, "Failed to open file %s.\n", papiFileName); free(papiFileName); return; } seq_driver(ofp_papi, papi_event_name, junk, show_progress); // Close output file. fclose(ofp_papi); free(papiFileName); return; } papi-papi-7-2-0-t/src/counter_analysis_toolkit/icache.h000066400000000000000000000017501502707512200231460ustar00rootroot00000000000000#ifndef _INSTR_CACHE_ #define _INSTR_CACHE_ #include #include "hw_desc.h" #define NO_COPY 0 #define DO_COPY 1 #define FALSE_IF 0 #define TRUE_IF 1 #define COLD_RUN 0 #define NORMAL_RUN 1 #define BUF_ELEM_CNT 32*1024*1024 // Hopefully larger than the L3 cache. 
#define RNG() {\ b = ((z1 << 6) ^ z1) >> 13;\ z1 = ((z1 & 4294967294U) << 18) ^ b;\ b = ((z2 << 2) ^ z2) >> 27;\ z2 = ((z2 & 4294967288U) << 2) ^ b;\ b = ((z3 << 13) ^ z3) >> 21;\ z3 = ((z3 & 4294967280U) << 7) ^ b;\ b = ((z4 << 3) ^ z4) >> 12;\ z4 = ((z4 & 4294967168U) << 13) ^ b;\ b = ((z1 << 6) ^ z4) >> 13;\ z1 = ((z1 & 4294967294U) << 18) ^ b;\ b = ((z2 << 2) ^ z1) >> 27;\ b += z4;\ z2 = ((z2 & 4294967288U) << 2) ^ b;\ result = z1 ^ z2 ^ z3 ^ z4;\ } void i_cache_driver(char* papi_event_name, int init, hw_desc_t *hw_desc, char* outdir, int show_progress); void seq_driver(FILE* ofp_papi, char* papi_event_name, int init, int show_progress); #endif papi-papi-7-2-0-t/src/counter_analysis_toolkit/instr.h000066400000000000000000000002131502707512200230620ustar00rootroot00000000000000#ifndef _INSTR_ #define _INSTR_ #include "hw_desc.h" void instr_driver(char* papi_event_name, hw_desc_t *hw_desc, char* outdir); #endif papi-papi-7-2-0-t/src/counter_analysis_toolkit/instructions.c000066400000000000000000002053641502707512200245000ustar00rootroot00000000000000#include #include #include #include #include #include #include #include #include #include "instr.h" int sum_i32=0; float sum_f32=0.0; double sum_f64=0.0; void test_int_add(int p, int M, int N, int EventSet, FILE *fp){ int ret; long long int ev_values[2]; int i32_00, i32_01, i32_02, i32_03, i32_04, i32_05, i32_06, i32_07, i32_08, i32_09; /* Initialize the variables with values that the compiler cannot guess. */ i32_00 = p/2; i32_01 = -p/3; i32_02 = p/4; i32_03 = -p/5; i32_04 = p/6; i32_05 = -p/7; i32_06 = p/8; i32_07 = -p/9; i32_08 = p/10; i32_09 = -p/11; // Start the counters. ret = PAPI_start(EventSet); if ( PAPI_OK != ret ) { fprintf(stderr, "PAPI_start() error: %s\n", PAPI_strerror(ret)); // If we can't measure events, no need to run the kernel. 
goto clean_up; } for(int i=0; i 100 ){ I32_ADDS(i32_07); } } } ret = PAPI_stop(EventSet, ev_values); if ( PAPI_OK != ret ) { fprintf(stderr, "PAPI_stop() error: %s\n", PAPI_strerror(ret)); // If we can't measure events, no need to print anything. goto clean_up; } fprintf(fp, "%d %lld %lld %.3lf\n", N*M, ev_values[0], 12LL*3LL*N*M, (double)ev_values[0]/(12.0*3.0*N*M)); sum_i32 += i32_00 + i32_01 + i32_02 + i32_03 + i32_04 + i32_05 + i32_06 + i32_07; sum_i32 += i32_08 + i32_09 + i32_10 + i32_11; clean_up: return; } void test_int_mul_max(int p, int M, int N, int EventSet, FILE *fp){ int ret; long long int ev_values[2]; int i32_00, i32_01, i32_02, i32_03, i32_04, i32_05, i32_06, i32_07; int i32_08, i32_09, i32_10, i32_11; int i32_100, i32_101, i32_102; /* Initialize the variables with values that the compiler cannot guess. */ i32_00 = 2*p; i32_01 = -p/3; i32_02 = p/4; i32_03 = -p/5; i32_04 = p/6; i32_05 = -p/7; i32_06 = p/8; i32_07 = 1/p; i32_08 = 1+p/2; i32_09 = 1-p/2; i32_10 = 1+p/3; i32_11 = 1-p/3; i32_100 = 17; i32_101 = -18; i32_102 = 12; // Start the counters. ret = PAPI_start(EventSet); if ( PAPI_OK != ret ) { fprintf(stderr, "PAPI_start() error: %s\n", PAPI_strerror(ret)); // If we can't measure events, no need to run the kernel. goto clean_up; } if( p == 12345678 ){ p /= 2; i32_100 *= 13; i32_101 *= 12; i32_102 *= 11; }else{ // Almost certainly this is what will execute and all variables will // end up with the value one, but the compiler doesn't know that. 
i32_100 = 1 + i32_100 / (i32_00+16); i32_101 = 1 + i32_101 / (i32_00+17); i32_102 = 1 + i32_102 / (i32_00+11); } #define I32_MULS(_X) {i32_00 *= _X; i32_01 *= _X; i32_02 *= _X; i32_03 *= _X; i32_04 *= _X; i32_05 *= _X; i32_06 *= _X; i32_07 *= _X; i32_08 *= _X; i32_09 *= _X; i32_10 *= _X; i32_11 *= _X;} for(int i=0; i 100 ){ I32_MULS(i32_07); } } } ret = PAPI_stop(EventSet, ev_values); if ( PAPI_OK != ret ) { fprintf(stderr, "PAPI_stop() error: %s\n", PAPI_strerror(ret)); // If we can't measure events, no need to print anything. goto clean_up; } fprintf(fp, "%d %lld %lld %.3lf\n", N*M, ev_values[0], 50LL*N*M, (double)ev_values[0]/(50.0*N*M)); sum_i32 += i32_00 + i32_01 + i32_02 + i32_03 + i32_04 + i32_05 + i32_06 + i32_07; sum_i32 += i32_08 + i32_09 + i32_10 + i32_11; clean_up: return; } void test_int_div_max(int p, int M, int N, int EventSet, FILE *fp){ int ret; long long int ev_values[2]; int i32_00, i32_01, i32_02, i32_03, i32_04, i32_05, i32_06, i32_07; int i32_08, i32_09, i32_10, i32_11; int i32_100, i32_101, i32_102; /* Initialize the variables with values that the compiler cannot guess. */ i32_00 = 2*p; i32_01 = -p/3; i32_02 = p/4; i32_03 = -p/5; i32_04 = p/6; i32_05 = -p/7; i32_06 = p/8; i32_07 = 1+1/p; i32_08 = 1+p/2; i32_09 = 1-p/2; i32_10 = 1+p/3; i32_11 = 1-p/3; i32_100 = 17; i32_101 = -18; i32_102 = 12; // Start the counters. ret = PAPI_start(EventSet); if ( PAPI_OK != ret ) { fprintf(stderr, "PAPI_start() error: %s\n", PAPI_strerror(ret)); // If we can't measure events, no need to run the kernel. goto clean_up; } if( p == 12345678 ){ p /= 2; i32_100 *= 13; i32_101 *= 12; i32_102 *= 11; }else{ // Almost certainly this is what will execute and all variables will // end up with the value one, but the compiler doesn't know that. 
i32_100 = 1 + i32_100 / (i32_00+16); i32_101 = 1 + i32_101 / (i32_00+17); i32_102 = 1 + i32_102 / (i32_00+11); } #define I32_DIVS(_X) {i32_00 /= _X; i32_01 /= _X; i32_02 /= _X; i32_03 /= _X; i32_04 /= _X; i32_05 /= _X; i32_06 /= _X; i32_07 /= _X; i32_08 /= _X; i32_09 /= _X; i32_10 /= _X; i32_11 /= _X;} for(int i=0; i 100 ){ I32_DIVS(i32_07); } } } ret = PAPI_stop(EventSet, ev_values); if ( PAPI_OK != ret ) { fprintf(stderr, "PAPI_stop() error: %s\n", PAPI_strerror(ret)); // If we can't measure events, no need to print anything. goto clean_up; } fprintf(fp, "%d %lld %lld %.3lf\n", N*M, ev_values[0], 50LL*N*M, (double)ev_values[0]/(50.0*N*M)); sum_i32 += i32_00 + i32_01 + i32_02 + i32_03 + i32_04 + i32_05 + i32_06 + i32_07; sum_i32 += i32_08 + i32_09 + i32_10 + i32_11; clean_up: return; } //////////////////////////////////////////////////////////////////////////////// // f32 ADD void test_f32_add(int p, int M, int N, int EventSet, FILE *fp){ int ret; long long int ev_values[2]; float f32_00, f32_01, f32_02, f32_03; /* Initialize the variables with values that the compiler cannot guess. */ f32_00 = (float)p/1.02; f32_01 = -(float)p/1.03; f32_02 = (float)p/1.04; f32_03 = -(float)p/1.05; // Start the counters. ret = PAPI_start(EventSet); if ( PAPI_OK != ret ) { fprintf(stderr, "PAPI_start() error: %s\n", PAPI_strerror(ret)); // If we can't measure events, no need to run the kernel. 
goto clean_up; } #define F32ADD_BLOCK() {f32_01 += f32_00; f32_02 += f32_01; f32_03 += f32_02; f32_00 += f32_03;} for(int i=0; i #include #include #include #include #include #include #include "papi.h" #include "driver.h" #if defined(USE_MPI) #include #endif int main(int argc, char*argv[]) { int cmbtotal = 0, ct = 0, track = 0, ret = 0; int i, nevts = 0, status; int *cards = NULL, *indexmemo = NULL; char **allevts = NULL, **basenames = NULL; evstock *data = NULL; cat_params_t params = {-1,0,1,0,0,0,NULL,NULL,NULL}; int nprocs = 1, myid = 0; #if defined(USE_MPI) MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &nprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myid); #endif // Initialize PAPI. ret = PAPI_library_init(PAPI_VER_CURRENT); if(ret != PAPI_VER_CURRENT){ fprintf(stderr,"PAPI shared library version error: %s Exiting...\n", PAPI_strerror(ret)); return 0; } // Initialize PAPI thread support. ret = PAPI_thread_init( omp_get_thread_num_wrapper ); if( ret != PAPI_OK ) { fprintf(stderr,"PAPI thread init error: %s Exiting...\n", PAPI_strerror(ret)); return 0; } // Parse the command-line arguments. status = parseArgs(argc, argv, ¶ms); if(0 != status) { free(params.outputdir); PAPI_shutdown(); return 0; } // Allocate space for the native events and qualifiers. data = (evstock*)calloc(1,sizeof(evstock)); if(NULL == data) { free(params.outputdir); fprintf(stderr, "Could not initialize event stock. Exiting...\n"); PAPI_shutdown(); return 0; } // Read the list of base event names and maximum qualifier set cardinalities. if( READ_FROM_FILE == params.mode) { ct = setup_evts(params.inputfile, &basenames, &cards); if(ct == -1) { free(params.outputdir); remove_stock(data); PAPI_shutdown(); return 0; } } // Populate the event stock. status = build_stock(data); if(status) { free(params.outputdir); if(READ_FROM_FILE == params.mode) { for(i = 0; i < ct; ++i) { free(basenames[i]); } free(basenames); free(cards); } fprintf(stderr, "Could not populate event stock. 
Exiting...\n"); PAPI_shutdown(); return 0; } // Get the number of events contained in the event stock. nevts = num_evts(data); // Verify the validity of the cardinalities. cmbtotal = check_cards(params, &indexmemo, basenames, cards, ct, nevts, data); if(-1 == cmbtotal) { free(params.outputdir); remove_stock(data); if(READ_FROM_FILE == params.mode) { for(i = 0; i < ct; ++i) { free(basenames[i]); } free(basenames); free(cards); } PAPI_shutdown(); return 0; } // Allocate enough space for all of the event+qualifier combinations. if (NULL == (allevts = (char**)malloc(cmbtotal*sizeof(char*)))) { fprintf(stderr, "Failed to allocate memory.\n"); PAPI_shutdown(); return 0; } // Create the qualifier combinations for each event. trav_evts(data, params.subsetsize, cards, nevts, ct, params.mode, allevts, &track, indexmemo, basenames); char *conf_file_name = ".cat_cfg"; if( NULL != params.conf_file ) { conf_file_name = params.conf_file; } hw_desc_t *hw_desc = obtain_hardware_description(conf_file_name); /* Set the default number of threads to the OMP_NUM_THREADS environment * variable if it is defined. Otherwise, set it to the number of CPUs * in a single socket. */ int numSetThreads = 1; char* envVarDefined = getenv("OMP_NUM_THREADS"); if (NULL == envVarDefined) { omp_set_num_threads(hw_desc->numcpus); #pragma omp parallel default(shared) { if(!omp_get_thread_num()) { numSetThreads = omp_get_num_threads(); } } if (numSetThreads != hw_desc->numcpus) { fprintf(stderr, "Warning! Failed to set default number of threads to number of CPUs in a single socket.\n"); } } // Run the benchmark for each qualifier combination. testbench(allevts, cmbtotal, hw_desc, params, myid, nprocs); // Free dynamically allocated memory. 
free(params.outputdir); remove_stock(data); if(READ_FROM_FILE == params.mode) { for(i = 0; i < ct; ++i) { free(basenames[i]); } free(basenames); free(cards); free(indexmemo); } for(i = 0; i < cmbtotal; ++i) { free(allevts[i]); } free(allevts); free(hw_desc); PAPI_shutdown(); #if defined(USE_MPI) MPI_Barrier(MPI_COMM_WORLD); MPI_Finalize(); #endif return 0; } unsigned long int omp_get_thread_num_wrapper() { return omp_get_thread_num(); } // Verify that valid qualifier counts are provided and count their combinations. int check_cards(cat_params_t params, int** indexmemo, char** basenames, int* cards, int ct, int nevts, evstock* data) { int i, j, minim, n, cmbtotal = 0; char *name; int mode = params.mode; int pk = params.subsetsize; // User provided a file of events. if(READ_FROM_FILE == mode) { // Compute the total number of qualifier combinations and allocate memory to store them. if (NULL == ((*indexmemo) = (int*)malloc(ct*sizeof(int)))) { fprintf(stderr, "Failed to allocate memory.\n"); return 0; } // Find the index in the main stock whose event corresponds to that in the file provided. // This simplifies looking up event qualifiers later. for(i = 0; i < ct; ++i) { if(NULL == basenames[i]) { (*indexmemo)[i] = -1; cmbtotal -= 1; continue; } // j is the index of the event name provided by the user. for(j = 0; j < nevts; ++j) { name = evt_name(data, j); if(strcmp(basenames[i], name) == 0) { break; } } // If the event name provided by the user does not match any of the main event // names in the architecture, then it either contains qualifiers or it does not // exist. if(cards[i] != 0 && j == nevts) { fprintf(stderr, "The provided event '%s' is either not in the architecture or contains qualifiers.\n" \ "If the latter, use '0' in place of the provided '%d'.\n", basenames[i], cards[i]); cards[i] = 0; } // If an invalid (negative) qualifier count was given, use zero qualifiers. 
if(cards[i] < 0) { fprintf(stderr, "The qualifier count (provided for event '%s') cannot be negative.\n", basenames[i]); cards[i] = 0; } (*indexmemo)[i] = j; } // Count the total number of events to test. for(i = 0; i < ct; ++i) { // If no qualifiers are used, then just count the event itself. if(cards[i] <= 0) { cmbtotal += 1; continue; } // Get the number of qualifiers which belong to the main event. if((*indexmemo)[i] != -1) { n = num_quals(data, (*indexmemo)[i]); } else { n = 0; } // If the user specifies to use more qualifiers than are available // for the main event, do not use any qualifiers. Otherwise, count // the number of combinations of qualifiers for the main event. minim = cards[i]; if(cards[i] > n || cards[i] < 0) { minim = 0; } cmbtotal += comb(n, minim); } } // User wants to inspect all events in the architecture. else { for(i = 0; i < nevts; ++i) { // Get the number of qualifiers which belong to the main event. n = num_quals(data, i); // If the user specifies to use more qualifiers than are available // for the main event, do not use any qualifiers. Otherwise, count // the number of combinations of qualifiers for the main event. minim = pk; if(pk > n || pk < 0) { minim = 0; } cmbtotal += comb(n, minim); } } return cmbtotal; } static hw_desc_t *obtain_hardware_description(char *conf_file_name){ int i,j; hw_desc_t *hw_desc; PAPI_mh_level_t *L; const PAPI_hw_info_t *meminfo; // Allocate some space. hw_desc = (hw_desc_t *)calloc(1, sizeof(hw_desc_t)); // Set at least the L1 cache size to a default value. hw_desc->dcache_line_size[0] = 64; // Set other default values. for( i=0; i<_MAX_SUPPORTED_CACHE_LEVELS; ++i ) { hw_desc->split[i] = 1; hw_desc->pts_per_reg[i] = 3; } hw_desc->mmsplit = 1; hw_desc->pts_per_mm = 3; hw_desc->maxPPB = 512; // Obtain hardware values through PAPI_get_hardware_info(). 
meminfo = PAPI_get_hardware_info(); if( NULL != meminfo ) { hw_desc->numcpus = meminfo->ncpu; hw_desc->cache_levels = meminfo->mem_hierarchy.levels; L = ( PAPI_mh_level_t * ) & ( meminfo->mem_hierarchy.level[0] ); for ( i = 0; i < meminfo->mem_hierarchy.levels && i<_MAX_SUPPORTED_CACHE_LEVELS; i++ ) { for ( j = 0; j < 2; j++ ) { if ( (PAPI_MH_TYPE_DATA == PAPI_MH_CACHE_TYPE(L[i].cache[j].type)) || (PAPI_MH_TYPE_UNIFIED == PAPI_MH_CACHE_TYPE(L[i].cache[j].type)) ){ hw_desc->dcache_line_size[i] = L[i].cache[j].line_size; hw_desc->dcache_size[i] = L[i].cache[j].size; hw_desc->dcache_assoc[i] = L[i].cache[j].associativity; } if ( (PAPI_MH_TYPE_INST == PAPI_MH_CACHE_TYPE(L[i].cache[j].type)) || (PAPI_MH_TYPE_UNIFIED == PAPI_MH_CACHE_TYPE(L[i].cache[j].type)) ){ hw_desc->icache_line_size[i] = L[i].cache[j].line_size; hw_desc->icache_size[i] = L[i].cache[j].size; hw_desc->icache_assoc[i] = L[i].cache[j].associativity; } } } } // Read the config file, if there, in case the user wants to overwrite some values. read_conf_file(conf_file_name, hw_desc); return hw_desc; } static int parse_line(FILE *input, char **key, long long *value){ int status; size_t linelen=0, len; char *line=NULL; char *pos=NULL; // Read one line from the input file. int ret_val = (int)getline(&line, &linelen, input); if( ret_val < 0 ) return ret_val; // Kill the part of the line after the comment character '#'. pos = strchr(line, '#'); if( NULL != pos ){ *pos = '\0'; } // Make sure the line is an assignment. 
pos = strchr(line, '='); if( NULL == pos ){ goto handle_error; } len = strcspn(line, " ="); *key = (char *)calloc((1+len),sizeof(char)); strncpy(*key, line, len); // Scan the line to make sure it has the form "key = value" status = sscanf(pos, "= %lld", value); if(1 != status){ fprintf(stderr,"Malformed line in conf file: '%s'\n", line); goto handle_error; } return 0; handle_error: free(line); key = NULL; *value = 0; line = NULL; linelen = 0; return 1; } static void read_conf_file(char *conf_file_name, hw_desc_t *hw_desc){ FILE *input; // Try to open the file. input = fopen(conf_file_name, "r"); if (NULL == input ){ return; } while(1){ long long value; char *key=NULL; int ret_val = parse_line(input, &key, &value); if( ret_val < 0 ){ free(key); break; }else if( ret_val > 0 ){ continue; } // If the user has set "AUTO_DISCOVERY_MODE = 1" then we don't need to process this file. // Otherwise, any entry in this file should overwrite what we auto discovered. if( !strcmp(key, "AUTO_DISCOVERY_MODE") && (value == 1) ){ return; // Data caches (including unified caches) }else if( !strcmp(key, "L1_DCACHE_LINE_SIZE") || !strcmp(key, "L1_UCACHE_LINE_SIZE") ){ hw_desc->dcache_line_size[0] = value; }else if( !strcmp(key, "L2_DCACHE_LINE_SIZE") || !strcmp(key, "L2_UCACHE_LINE_SIZE") ){ hw_desc->dcache_line_size[1] = value; }else if( !strcmp(key, "L3_DCACHE_LINE_SIZE") || !strcmp(key, "L3_UCACHE_LINE_SIZE") ){ hw_desc->dcache_line_size[2] = value; }else if( !strcmp(key, "L4_DCACHE_LINE_SIZE") || !strcmp(key, "L4_UCACHE_LINE_SIZE") ){ hw_desc->dcache_line_size[3] = value; }else if( !strcmp(key, "L1_DCACHE_SIZE") || !strcmp(key, "L1_UCACHE_SIZE") ){ if( hw_desc->cache_levels < 1 ) hw_desc->cache_levels = 1; hw_desc->dcache_size[0] = value; }else if( !strcmp(key, "L2_DCACHE_SIZE") || !strcmp(key, "L2_UCACHE_SIZE") ){ if( hw_desc->cache_levels < 2 ) hw_desc->cache_levels = 2; hw_desc->dcache_size[1] = value; }else if( !strcmp(key, "L3_DCACHE_SIZE") || !strcmp(key, "L3_UCACHE_SIZE") ){ 
if( hw_desc->cache_levels < 3 ) hw_desc->cache_levels = 3; hw_desc->dcache_size[2] = value; }else if( !strcmp(key, "L4_DCACHE_SIZE") || !strcmp(key, "L4_UCACHE_SIZE") ){ if( hw_desc->cache_levels < 4 ) hw_desc->cache_levels = 4; hw_desc->dcache_size[3] = value; // Instruction caches (including unified caches) }else if( !strcmp(key, "L1_ICACHE_LINE_SIZE") || !strcmp(key, "L1_UCACHE_LINE_SIZE") ){ hw_desc->icache_line_size[0] = value; }else if( !strcmp(key, "L2_ICACHE_LINE_SIZE") || !strcmp(key, "L2_UCACHE_LINE_SIZE") ){ hw_desc->icache_line_size[1] = value; }else if( !strcmp(key, "L3_ICACHE_LINE_SIZE") || !strcmp(key, "L3_UCACHE_LINE_SIZE") ){ hw_desc->icache_line_size[2] = value; }else if( !strcmp(key, "L4_ICACHE_LINE_SIZE") || !strcmp(key, "L4_UCACHE_LINE_SIZE") ){ hw_desc->icache_line_size[3] = value; }else if( !strcmp(key, "L1_ICACHE_SIZE") || !strcmp(key, "L1_UCACHE_SIZE") ){ hw_desc->icache_size[0] = value; }else if( !strcmp(key, "L2_ICACHE_SIZE") || !strcmp(key, "L2_UCACHE_SIZE") ){ hw_desc->icache_size[1] = value; }else if( !strcmp(key, "L3_ICACHE_SIZE") || !strcmp(key, "L3_UCACHE_SIZE") ){ hw_desc->icache_size[2] = value; }else if( !strcmp(key, "L4_ICACHE_SIZE") || !strcmp(key, "L4_UCACHE_SIZE") ){ hw_desc->icache_size[3] = value; }else if( !strcmp(key, "L1_SPLIT") ){ hw_desc->split[0] = value; }else if( !strcmp(key, "L2_SPLIT") ){ hw_desc->split[1] = value; }else if( !strcmp(key, "L3_SPLIT") ){ hw_desc->split[2] = value; }else if( !strcmp(key, "L4_SPLIT") ){ hw_desc->split[3] = value; }else if( !strcmp(key, "MM_SPLIT") ){ hw_desc->mmsplit = value; }else if( !strcmp(key, "PTS_PER_L1") ){ hw_desc->pts_per_reg[0] = value; }else if( !strcmp(key, "PTS_PER_L2") ){ hw_desc->pts_per_reg[1] = value; }else if( !strcmp(key, "PTS_PER_L3") ){ hw_desc->pts_per_reg[2] = value; }else if( !strcmp(key, "PTS_PER_L4") ){ hw_desc->pts_per_reg[3] = value; }else if( !strcmp(key, "PTS_PER_MM") ){ hw_desc->pts_per_mm = value; }else if( !strcmp(key, "MAX_PPB") ){ hw_desc->maxPPB = 
value; } free(key); key = NULL; } fclose(input); return; } // Read the contents of the file supplied by the user. int setup_evts(char* inputfile, char*** basenames, int** evnt_cards) { size_t linelen = 0; int cnt = 0, status = 0; char *line = NULL, *place; FILE *input; int evnt_count = 256; char **names = (char **)calloc(evnt_count, sizeof(char *)); int *cards = (int *)calloc(evnt_count, sizeof(int)); if (NULL == names || NULL == cards) { fprintf(stderr, "Failed to allocate memory.\n"); return 0; } // Read the base event name and cardinality columns. input = fopen(inputfile, "r"); for(cnt=0; 1; cnt++) { ssize_t ret_val = getline(&line, &linelen, input); if( ret_val < 0 ) break; if( cnt >= evnt_count ) { evnt_count *= 2; names = realloc(names, evnt_count*sizeof(char *)); cards = realloc(cards, evnt_count*sizeof(int)); if (NULL == names || NULL == cards) { fprintf(stderr, "Failed to allocate memory.\n"); return 0; } } place = strstr(line, " "); // If this line was commented, silently ignore it. if(strlen(line) > 0 && line[0] == '#') { names[cnt] = NULL; cards[cnt] = -1; cnt--; free(line); line = NULL; linelen = 0; continue; } else if( NULL == place ) { fprintf(stderr,"problem with line: '%s'\n",line); names[cnt] = NULL; cards[cnt] = -1; cnt--; free(line); line = NULL; linelen = 0; continue; } names[cnt] = NULL; status = sscanf(line, "%ms %d", &(names[cnt]), &(cards[cnt]) ); // If this line was malformed, ignore it. if(2 != status) { fprintf(stderr,"problem with line: '%s'\n",line); names[cnt] = NULL; cards[cnt] = -1; cnt--; } free(line); line = NULL; linelen = 0; } free(line); fclose(input); *basenames = names; *evnt_cards = cards; return cnt; } // Recursively builds the list of all combinations of an event's qualifiers. void combine_qualifiers(int n, int pk, int ct, char** list, char* name, char** allevts, int* track, int flag, int* bitmap) { int original; int counter; int i; // Set flag in the array. 
original = bitmap[ct]; bitmap[ct] = flag; // Only make recursive calls if there are more items. // Ensure proper cardinality. counter = 0; for(i = 0; i < n; ++i) { counter += bitmap[i]; } // Cannot use more qualifiers than are available. if(ct+1 < n) { // Make recursive calls both with and without a given qualifier. // Recursion cannot exceed the number of qualifiers specified by // the user. if(counter < pk) { combine_qualifiers(n, pk, ct+1, list, name, allevts, track, 1, bitmap); } combine_qualifiers(n, pk, ct+1, list, name, allevts, track, 0, bitmap); } // Qualifier count matches that specified by the user. else { if(counter == pk) { // Construct the qualifier combination string. char* chunk; size_t evtsize = strlen(name)+1; for(i = 0; i < n; ++i) { if(bitmap[i] == 1) { // Add one to account for the colon in front of the qualifier. evtsize += strlen(list[i])+1; } } if (NULL == (chunk = (char*)malloc((evtsize+1)*sizeof(char)))) { fprintf(stderr, "Failed to allocate memory.\n"); return; } strcpy(chunk,name); for(i = 0; i < n; ++i) { if(bitmap[i] == 1) { strcat(chunk,":"); strcat(chunk,list[i]); } } // Add qualifier combination string to the list. allevts[*track] = strdup(chunk); *track += 1; free(chunk); } } // Undo effect of recursive call to combine other qualifiers. bitmap[ct] = original; return; } // Create the combinations of qualifiers for the events. void trav_evts(evstock* stock, int pk, int* cards, int nevts, int selexnsize, int mode, char** allevts, int* track, int* indexmemo, char** basenames) { int i, j, k, n = 0; char** chosen = NULL; char* name = NULL; int* bitmap = NULL; // User provided a file of events. if(READ_FROM_FILE == mode) { for(i = 0; i < selexnsize; ++i) { // Iterate through whole stock. If there are matches, proceed normally using the given cardinalities. j = indexmemo[i]; if( -1 == j ) { allevts[i] = NULL; continue; } // Get event's name and qualifier count. if(j == nevts) { // User a provided specific qualifier combination. 
name = basenames[i]; } else { name = evt_name(stock, j); n = num_quals(stock, j); } // Create a list to contain the qualifiers. if(cards[i] > 0) { chosen = (char**)malloc(n*sizeof(char*)); bitmap = (int*)calloc(n, sizeof(int)); if (NULL == chosen || NULL == bitmap) { fprintf(stderr, "Failed to allocate memory.\n"); return; } // Store the qualifiers for the current event. for(k = 0; k < n; ++k) { chosen[k] = strdup(stock->evts[j][k]); } } // Get combinations of all current event's qualifiers. if (n!=0 && cards[i]>0) { combine_qualifiers(n, cards[i], 0, chosen, name, allevts, track, 0, bitmap); combine_qualifiers(n, cards[i], 0, chosen, name, allevts, track, 1, bitmap); } else { allevts[*track] = strdup(name); *track += 1; } // Free the space back up. if(cards[i] > 0) { for(k = 0; k < n; ++k) { free(chosen[k]); } free(chosen); free(bitmap); } } } // User wants to inspect all events in the architecture. else { for(i = 0; i < nevts; ++i) { // Get event's name and qualifier count. n = num_quals(stock, i); name = evt_name(stock, i); // Show progress to the user. //fprintf(stderr, "CURRENT EVENT: %s (%d/%d)\n", name, (i+1), nevts); // Create a list to contain the qualifiers. chosen = (char**)malloc(n*sizeof(char*)); bitmap = (int*)calloc(n, sizeof(int)); if (NULL == chosen || NULL == bitmap) { fprintf(stderr, "Failed to allocate memory.\n"); return; } // Store the qualifiers for the current event. for(j = 0; j < n; ++j) { chosen[j] = strdup(stock->evts[i][j]); } // Get combinations of all current event's qualifiers. if (n!=0) { combine_qualifiers(n, pk, 0, chosen, name, allevts, track, 0, bitmap); combine_qualifiers(n, pk, 0, chosen, name, allevts, track, 1, bitmap); } else { allevts[*track] = strdup(name); *track += 1; } // Free the space back up. for(j = 0; j < n; ++j) { free(chosen[j]); } free(chosen); free(bitmap); } } return; } // Compute the permutations of k objects from a set of n objects. 
int perm(int n, int k) { int i; int prod = 1; int diff = n-k; for(i = n; i > diff; --i) { prod *= i; } return prod; } // Compute the combinations of k objects from a set of n objects. int comb(int n, int k) { return perm(n, k)/perm(k, k); } static void print_progress(int prg) { if(prg < 100) printf("%3d%%\b\b\b\b",prg); else printf("%3d%%\n",prg); fflush(stdout); } static void print_progress2(int prg) { if(prg < 100) printf("Total:%3d%% Current test: 0%%\b\b\b\b",prg); else printf("Total:%3d%%\n",prg); fflush(stdout); } void testbench(char** allevts, int cmbtotal, hw_desc_t *hw_desc, cat_params_t params, int myid, int nprocs) { int i; int junk=((int)getpid()+123)/456; int low = myid*(cmbtotal/nprocs); int cap = (myid+1)*(cmbtotal/nprocs); int offset = nprocs*(1+cmbtotal/nprocs)-cmbtotal; // Divide the work as evenly as possible. if(myid >= offset) { cap += myid-offset+1; low += myid-offset; } // Make sure the user provided events and iterate through all events. if( 0 == cmbtotal ) { fprintf(stderr, "No events to measure.\n"); return; } // Run the branch benchmark by default if none are specified. if( 0 == params.bench_type ) { params.bench_type |= BENCH_BRANCH; fprintf(stderr, "Warning: No benchmark specified. 
Running 'branch' by default.\n"); } /* Benchmark I - Branch*/ if( params.bench_type & BENCH_BRANCH ) { if(params.show_progress) printf("Branch Benchmarks: "); for(i = low; i < cap; ++i) { if(params.show_progress) print_progress((100*i)/cmbtotal); if( allevts[i] != NULL ) branch_driver(allevts[i], junk, hw_desc, params.outputdir); } if(params.show_progress) print_progress(100); } /* Benchmark II - Data Cache Reads*/ if( params.bench_type & BENCH_DCACHE_READ ) { if ( !params.quick && 0 == myid ) { if(params.show_progress) { printf("D-Cache Latencies: 0%%\b\b\b\b"); fflush(stdout); } d_cache_driver("cat::latencies", params, hw_desc, 1, 0); if(params.show_progress) printf("\n"); } if(params.show_progress) printf("D-Cache Read Benchmarks: "); for(i = low; i < cap; ++i) { if(params.show_progress) print_progress2((100*i)/cmbtotal); if( allevts[i] != NULL ) { d_cache_driver(allevts[i], params, hw_desc, 0, 0); } } if(params.show_progress) print_progress2(100); } /* Benchmark III - Data Cache Writes*/ if( params.bench_type & BENCH_DCACHE_WRITE ) { // If the READ benchmark was run, do not recompute the latencies. 
if ( !(params.bench_type & BENCH_DCACHE_READ) && !params.quick) { if(params.show_progress) { printf("D-Cache Latencies: 0%%\b\b\b\b"); fflush(stdout); } d_cache_driver("cat::latencies", params, hw_desc, 1, 0); if(params.show_progress) printf("\n"); } if(params.show_progress) printf("D-Cache Write Benchmarks: "); for(i = low; i < cap; ++i) { if(params.show_progress) print_progress2((100*i)/cmbtotal); if( allevts[i] != NULL ) { d_cache_driver(allevts[i], params, hw_desc, 0, 1); } } if(params.show_progress) print_progress2(100); } /* Benchmark IV - FLOPS*/ if( params.bench_type & BENCH_FLOPS ) { if(params.show_progress) printf("FLOP Benchmarks: "); for(i = low; i < cap; ++i) { if(params.show_progress) print_progress((100*i)/cmbtotal); if( allevts[i] != NULL ) flops_driver(allevts[i], hw_desc, params.outputdir); } if(params.show_progress) print_progress(100); } /* Benchmark V - Instruction Cache*/ if( params.bench_type & BENCH_ICACHE_READ ) { if(params.show_progress) printf("I-Cache Benchmarks: "); for(i = low; i < cap; ++i) { if(params.show_progress) print_progress2((100*i)/cmbtotal); if( allevts[i] != NULL ) i_cache_driver(allevts[i], junk, hw_desc, params.outputdir, params.show_progress); } if(params.show_progress) print_progress2(100); } /* Benchmark VI - Vector FLOPS*/ if( params.bench_type & BENCH_VEC ) { if(params.show_progress) printf("Vector FLOP Benchmarks: "); for(i = low; i < cap; ++i) { if(params.show_progress) print_progress((100*i)/cmbtotal); if( allevts[i] != NULL ) vec_driver(allevts[i], hw_desc, params.outputdir); } if(params.show_progress) print_progress(100); } /* Benchmark VII - Instructions*/ if( params.bench_type & BENCH_INSTR ) { if(params.show_progress) printf("Instruction Benchmarks: "); for(i = low; i < cap; ++i) { if(params.show_progress) print_progress((100*i)/cmbtotal); if( allevts[i] != NULL ) instr_driver(allevts[i], hw_desc, params.outputdir); } if(params.show_progress) print_progress(100); } return; } int parseArgs(int argc, char 
**argv, cat_params_t *params){ char *name = argv[0]; char *tmp = NULL; int dirlen = 0; int kflag = 0; int inflag = 0; FILE *test = NULL; int len, status = 0; params->subsetsize = -1; // Parse the command line arguments while(--argc){ ++argv; if( !strcmp(argv[0],"-h") ){ print_usage(name); return -1; } if( argc > 1 && !strcmp(argv[0],"-k") ){ params->subsetsize = atoi(argv[1]); if( params->subsetsize < 0 ) { params->subsetsize = 0; fprintf(stderr, "Warning: Cannot pass a negative value to -k.\n"); } params->mode = USE_ALL_EVENTS; kflag = 1; --argc; ++argv; continue; } if( argc > 1 && !strcmp(argv[0],"-n") ){ params->max_iter = atoi(argv[1]); --argc; ++argv; continue; } if( argc > 1 && !strcmp(argv[0],"-conf") ){ params->conf_file = argv[1]; --argc; ++argv; continue; } if( argc > 1 && !strcmp(argv[0],"-in") ){ params->inputfile = argv[1]; params->mode = READ_FROM_FILE; inflag = 1; --argc; ++argv; continue; } if( argc > 1 && !strcmp(argv[0],"-out") ){ tmp = argv[1]; --argc; ++argv; continue; } if( !strcmp(argv[0],"-verbose") ){ params->show_progress = 1; continue; } if( !strcmp(argv[0],"-quick") ){ params->quick = 1; continue; } if( !strcmp(argv[0],"-branch") ){ params->bench_type |= BENCH_BRANCH; continue; } if( !strcmp(argv[0],"-dcr") ){ params->bench_type |= BENCH_DCACHE_READ; continue; } if( !strcmp(argv[0],"-dcw") ){ params->bench_type |= BENCH_DCACHE_WRITE; continue; } if( !strcmp(argv[0],"-flops") ){ params->bench_type |= BENCH_FLOPS; continue; } if( !strcmp(argv[0],"-ic") ){ params->bench_type |= BENCH_ICACHE_READ; continue; } if( !strcmp(argv[0],"-vec") ){ params->bench_type |= BENCH_VEC; continue; } if( !strcmp(argv[0],"-instr") ){ params->bench_type |= BENCH_INSTR; continue; } print_usage(name); return -1; } // MODE INFO: mode 1 uses file; mode 2 uses all native events. if(READ_FROM_FILE == params->mode) { test = fopen(params->inputfile, "r"); if(test == NULL) { fprintf(stderr, "Could not open %s. 
Exiting...\n", params->inputfile); return -1; } fclose(test); } // Make sure user does not specify both modes simultaneously. if(kflag == 1 && inflag == 1) { fprintf(stderr, "Cannot use -k flag with -in flag. Exiting...\n"); return -1; } // Make sure user specifies mode explicitly. if(kflag == 0 && inflag == 0) { print_usage(name); return -1; } // Make sure output path was provided. if(tmp == NULL) { fprintf(stderr, "Output path not provided. Exiting...\n"); return -1; } // Write output files in the user-specified directory. dirlen = strlen(tmp); params->outputdir = (char*)malloc((2+dirlen)*sizeof(char)); if (NULL == params->outputdir) { fprintf(stderr, "Failed to allocate memory.\n"); return -1; } len = snprintf( params->outputdir, 2+dirlen, "%s/", tmp); if( len < 1+dirlen ) { fprintf(stderr, "Problem with output directory name.\n"); return -1; } // Make sure files can be written to the provided path. status = access(params->outputdir, W_OK); if(status != 0) { fprintf(stderr, "Permission to write files to \"%s\" denied. Make sure the path exists and is writable.\n", tmp); return -1; } return 0; } // Show the user how to properly use the program. 
void print_usage(char* name) { fprintf(stdout, "\nUsage: %s [OPTIONS...]\n", name); fprintf(stdout, "\nRequired:\n"); fprintf(stdout, " -out Output files location.\n"); fprintf(stdout, " -in Events and cardinalities file.\n"); fprintf(stdout, " -k Cardinality of subsets.\n"); fprintf(stdout, " Parameters \"-k\" and \"-in\" are mutually exclusive.\n"); fprintf(stdout, "\nOptional:\n"); fprintf(stdout, " -conf Configuration file location.\n"); fprintf(stdout, " -verbose Show benchmark progress in the standard output.\n"); fprintf(stdout, " -quick Skip latency tests.\n"); fprintf(stdout, " -n Number of iterations for data cache kernels.\n"); fprintf(stdout, " -branch Branch kernels.\n"); fprintf(stdout, " -dcr Data cache reading kernels.\n"); fprintf(stdout, " -dcw Data cache writing kernels.\n"); fprintf(stdout, " -flops Floating point operations kernels.\n"); fprintf(stdout, " -ic Instruction cache kernels.\n"); fprintf(stdout, " -vec Vector FLOPs kernels.\n"); fprintf(stdout, " -instr Instructions kernels.\n"); fprintf(stdout, "\n"); fprintf(stdout, "EXAMPLE: %s -in event_list.txt -out OUTPUT_DIRECTORY -branch -dcw\n", name); fprintf(stdout, "\n"); return; } papi-papi-7-2-0-t/src/counter_analysis_toolkit/params.h000066400000000000000000000004351502707512200232140ustar00rootroot00000000000000#ifndef _CAT_PARAMS_ #define _CAT_PARAMS_ typedef struct cat_params_s{ int subsetsize; int mode; int max_iter; int bench_type; int show_progress; int quick; char *conf_file; char *inputfile; char *outputdir; } cat_params_t; #endif // _CAT_PARAMS_ papi-papi-7-2-0-t/src/counter_analysis_toolkit/prepareArray.c000066400000000000000000000074371502707512200243700ustar00rootroot00000000000000#include <stdio.h> #include <stdlib.h> #include <string.h> #include <stdint.h> #include <assert.h> #include "prepareArray.h" volatile uintptr_t opt_killer_zero; static void _prepareArray_sections_random(uintptr_t *array, long long len, long long stride, long long secSize); static void _prepareArray_sequential(uintptr_t *array, long long len, long long
stride); /* * "stride" is in "uintptr_t" elements, NOT in bytes * Note: It is wise to provide an "array" that is aligned to the cache line size. */ int prepareArray(uintptr_t *array, long long len, long long stride, long long secSize, int pattern){ assert( array != NULL ); opt_killer_zero = (uintptr_t)( (len+37)/(len+36) - 1 ); switch(pattern){ case SECRND: _prepareArray_sections_random(array, len, stride, secSize); break; case SEQUEN: _prepareArray_sequential(array, len, stride); break; default: fprintf(stderr,"prepareArray() unknown array access pattern: %d\n",pattern); return -1; break; } return 0; } /* * "stride" is in "uintptr_t" elements, NOT in bytes * Note: It is wise to provide an "array" that is aligned to the cache line size. */ static void _prepareArray_sections_random(uintptr_t *array, long long len, long long stride, long long secSize){ assert( array != NULL ); long long elemCnt, maxElemCnt, sec, i; long long currElemCnt, uniqIndex, taken; uintptr_t **p, *next; long long currSecSize = secSize; long long secCnt = 1+len/secSize; long long *availableNumbers; p = (uintptr_t **)&array[0]; maxElemCnt = currSecSize/stride; availableNumbers = (long long *)calloc(maxElemCnt, sizeof(long long)); // Loop through every section in the array. 
for(sec=0; sec #define RANDOM 0x2 #define SECRND 0x3 #define SEQUEN 0x4 int prepareArray(uintptr_t *array, long long len, long long stride, long long secSize, int pattern); #endif papi-papi-7-2-0-t/src/counter_analysis_toolkit/replicate.sh000066400000000000000000000001471502707512200240640ustar00rootroot00000000000000#!/bin/bash for ((i=1; i<12; i++)); do cp icache_seq_kernel_0.so icache_seq_kernel_${i}.so; done papi-papi-7-2-0-t/src/counter_analysis_toolkit/scripts/000077500000000000000000000000001502707512200232455ustar00rootroot00000000000000papi-papi-7-2-0-t/src/counter_analysis_toolkit/scripts/README.md000066400000000000000000000031351502707512200245260ustar00rootroot00000000000000# Contents The directory 'scripts' contains the files: * README.md * process_dcache_output.sh * default_gnp.inc * single_plot.gnp * multi_plot.gnp and the directory 'sample_data'. # Examples Executing the following commands: ``` gnuplot single_plot.gnp gnuplot multi_plot.gnp ``` will produce PDF files (single_plot.pdf and multi_plot.pdf) with graphs showing the data in the directory 'sample_data'. # Data Post-processing To use these gnuplot programs with data generated by the cache benchmarks of CAT (cat_collect -dcr, and cat_collect -dcw), the user must post-process the output files of the benchmarks. Executing the bash script 'process_dcache_output.sh' using as input a data file generated by the data cache benchmarks will compute basic statistics (min, avg, max) of the data gathered by each thread for each test size. The output is automatically stored in a new file with the keyword '.stat' appended to its name. These ".stat" files can be used as input for the gnuplot scripts. # Modifying the Gnuplot scripts The example scripts 'single_plot.gnp' and 'multi_plot.gnp' contain variables (at the top of each script) that must be modified to fit the user's use case and environment. 
* The variables 'L1_per_core', 'L2_per_core', and 'L3_per_core' must be set to indicate the sizes of the caches per core * The variables 'SML_STRIDE', 'BIG_STRIDE', 'SML_PPB', and 'BIG_PPB' must be set to the corresponding values reported in the benchmark output file. * The directory 'DIR' where the user data resides, the name of the event 'EVENT' whose data is being plotted, and the plot title 'PLOT_TITLE'. papi-papi-7-2-0-t/src/counter_analysis_toolkit/scripts/default_gnp.inc000066400000000000000000000123201502707512200262260ustar00rootroot00000000000000unset clip points set clip one unset clip two set errorbars front 1.000000 set border 31 front lt black linewidth 1.000 dashtype solid set zdata set ydata set xdata set y2data set x2data set boxwidth set style fill empty border set style rectangle back fc bgnd fillstyle solid 1.00 border lt -1 set style circle radius graph 0.02 set style ellipse size graph 0.05, 0.03 angle 0 units xy set dummy x, y set format x "% h" set format y "% h" set format x2 "% h" set format y2 "% h" set format z "% h" set format cb "% h" set format r "% h" set ttics format "% h" set timefmt "%d/%m/%y,%H:%M" set angles radians set tics back unset raxis set theta counterclockwise right set style parallel front lt black linewidth 2.000 dashtype solid set key title "" center set key fixed right top vertical Right noreverse enhanced autotitle nobox set key noinvert samplen 4 spacing 1 width 0 height 0 set key maxcolumns 0 maxrows 0 set key noopaque unset label unset arrow unset style line unset style arrow set style histogram clustered gap 2 title textcolor lt -1 unset object set style textbox transparent margins 1.0, 1.0 border lt -1 linewidth 1.0 set offsets 0, 0, 0, 0 set pointsize 1 set pointintervalbox 1 set encoding default unset polar unset parametric unset decimalsign unset micro unset minussign set view 60, 30, 1, 1 set view azimuth 0 set rgbmax 255 set samples 100, 100 set isosamples 10, 10 set surface unset contour set cntrlabel 
format '%8.3g' font '' start 5 interval 20 set mapping cartesian set datafile separator whitespace unset hidden3d set cntrparam order 4 set cntrparam linear set cntrparam levels 5 set cntrparam levels auto set cntrparam firstlinetype 0 unsorted set cntrparam points 5 #set size ratio 0 0.333333,1 set origin 0,0 set style data points set style function lines unset xzeroaxis unset yzeroaxis unset zzeroaxis unset x2zeroaxis unset y2zeroaxis set xyplane relative 0.5 set tics scale 1, 0.5, 1, 1, 1 set mxtics default set mytics default set mztics default set mx2tics default set my2tics default set mcbtics default set mrtics default set nomttics set xtics border in scale 0.5,0.5 mirror norotate autojustify set xtics norangelimit logscale autofreq set ytics border in scale 0.75,0.5 nomirror norotate autojustify set ytics norangelimit logscale autofreq set ztics border in scale 1,0.5 nomirror norotate autojustify set ztics norangelimit logscale autofreq unset x2tics unset y2tics set cbtics border in scale 1,0.5 mirror norotate autojustify set cbtics norangelimit logscale autofreq set rtics axis in scale 1,0.5 nomirror norotate autojustify set rtics norangelimit autofreq unset ttics set title "" set title font "" textcolor lt -1 norotate set timestamp bottom set timestamp "" set timestamp font "" textcolor lt -1 norotate set trange [ * : * ] noreverse nowriteback set urange [ * : * ] noreverse nowriteback set vrange [ * : * ] noreverse nowriteback set xlabel "" set xlabel font "" textcolor lt -1 norotate set x2label "" set x2label font "" textcolor lt -1 norotate set xrange [ * : * ] noreverse writeback set x2range [ * : * ] noreverse writeback set ylabel "" set ylabel font "" textcolor lt -1 rotate set y2label "" set y2label font "" textcolor lt -1 rotate set yrange [ * : * ] noreverse writeback set y2range [ * : * ] noreverse writeback set zlabel "" set zlabel font "" textcolor lt -1 norotate set zrange [ * : * ] noreverse writeback set cblabel "" set cblabel font "" 
textcolor lt -1 rotate set cbrange [ * : * ] noreverse writeback set rlabel "" set rlabel font "" textcolor lt -1 norotate set rrange [ * : * ] noreverse writeback unset logscale set logscale x 2 unset jitter set zero 1e-08 set lmargin -1 set bmargin -1 set rmargin -1 set tmargin -1 set locale "en_US.UTF-8" set pm3d explicit at s set pm3d scansautomatic set pm3d interpolate 1,1 flush begin noftriangles noborder corners2color mean set pm3d nolighting set palette positive nops_allcF maxcolors 0 gamma 1.5 color model RGB set palette rgbformulae 7, 5, 15 set colorbox default set colorbox vertical origin screen 0.9, 0.2 size screen 0.05, 0.6 front noinvert bdefault set style boxplot candles range 1.50 outliers pt 7 separation 1 labels auto unsorted set loadpath set fontpath set psdir set fit brief errorvariables nocovariancevariables errorscaling prescale nowrap v5 set grid xtics nomxtics ytics nomytics noztics nomztics nortics nomrtics \ nox2tics nomx2tics noy2tics nomy2tics nocbtics nomcbtics set linestyle 1 lw 3 lt 1 pt 7 ps 0.33 lc rgb "#005aad" # blue (mid dark) set linestyle 2 lw 4 lt 1 pt 4 ps 0.33 lc rgb "#3d2a23" # brown (dark) set linestyle 3 lw 4 lt 2 pt 6 ps 0.33 lc rgb "#ff003a" # red (fire) set linestyle 4 lw 4 lt 2 pt 13 ps 0.50 lc rgb "#ebb600" # egg yoke set linestyle 5 lw 4 lt 1 pt 9 ps 0.50 lc rgb "#a1dc89" # green (bright cabage) set linestyle 6 lw 4 lt 1 pt 7 ps 0.60 lc rgb "#8ab0ff" # blue (sky) set linestyle 7 lw 4 lt 1 pt 5 ps 0.50 lc rgb "#016e4e" # green (evergreen) set linestyle 8 lw 4 lt 1 pt 10 ps 0.50 lc rgb "#ceaac9" # lilac set linestyle 9 lw 4 lt 1 pt 2 ps 0.50 lc rgb "#593f86" # purple (mid dark) set grid lt 1 linecolor rgb "#c0c0c0" linewidth 0.5, lt 0 linecolor 0 linewidth 0.500 set grid front unset key set xrange [3.0e3:1.0e8] set yrange [0:1.02] papi-papi-7-2-0-t/src/counter_analysis_toolkit/scripts/multi_plot.gnp000066400000000000000000000066721502707512200261560ustar00rootroot00000000000000L1_per_core=32768 L2_per_core=1048576 
L3_per_core=4194304 SML_STRIDE=128 BIG_STRIDE=256 SML_PPB=16 BIG_PPB=512 DIR="sample_data/" EVENT1="PM_DATA_FROM_L2" EVENT2="PM_DATA_FROM_L3" EVENT3="PM_DATA_FROM_L3MISS" PLOT_TITLE="L2 Reads / L3_reads / L3 Misses" SUFFIX=".data.reads.stat" set label 1 at 6000+(L1_per_core-6000)/20, graph 1.020 "L1" font ",9" set label 2 at L1_per_core+(L2_per_core-L1_per_core)/20, graph 1.020 "L2" font ",9" set label 3 at L2_per_core+(L3_per_core-L2_per_core)/20, graph 1.020 "L3" font ",9" ################################################################################ set terminal pdfcairo noenhanced font "Sans,12" set output "multi_plot.pdf" load "default_gnp.inc" set xtics ("" L1_per_core, "" L2_per_core, "" L3_per_core) FILE1=DIR.EVENT1.SUFFIX FILE2=DIR.EVENT2.SUFFIX FILE3=DIR.EVENT3.SUFFIX unset key set multiplot layout 1,6 title PLOT_TITLE." set margin 0,0,1.5,2.5 set label 20 "Normalized Event Count" at screen 0.015,0.26 rotate by 90 # Y-label OFFSET=0.09 GAP=0.015 WIDTH=(1.0-(OFFSET+5.0*GAP+GAP/4.0))/6.0 ### set label 21 sprintf("RND:%d:%d",SML_STRIDE,BIG_PPB) at graph 0.1,-0.05 font ",10" # X-label set size 0.1475,1 set size WIDTH,1 set origin OFFSET,0.0 plot \ FILE1 every :::0::0 using 1:2 w lp ls 7,\ FILE2 every :::0::0 using 1:2 w lp ls 3,\ FILE3 every :::0::0 using 1:2 w lp ls 1 unset title unset ylabel set format y '' set ytics scale 0.1,0.5 unset label 20 ### set label 21 sprintf("RND:%d:%d",SML_STRIDE,SML_PPB) at graph 0.16,-0.05 font ",10" # X-label set object 1 rectangle from graph 0,0 to graph 1,1 behind fillcolor rgb '#FFFAE8' fillstyle solid noborder set size WIDTH,1 set origin OFFSET+(WIDTH+GAP)*1,0 plot \ FILE1 every :::1::1 using 1:2 w lp ls 7,\ FILE2 every :::1::1 using 1:2 w lp ls 3,\ FILE3 every :::1::1 using 1:2 w lp ls 1 unset object 1 ### set label 21 sprintf("RND:%d:%d",BIG_STRIDE,BIG_PPB) at graph 0.1,-0.05 font ",10" # X-label set size WIDTH,1 set origin OFFSET+(WIDTH+GAP)*2,0 plot \ FILE1 every :::2::2 using 1:2 w lp ls 7,\ FILE2 every :::2::2 
using 1:2 w lp ls 3,\ FILE3 every :::2::2 using 1:2 w lp ls 1 ### set label 21 sprintf("RND:%d:%d",BIG_STRIDE,SML_PPB) at graph 0.1,-0.05 font ",10" # X-label set object rectangle from graph 0,0 to graph 1,1 behind fillcolor rgb '#FFFAE8' fillstyle solid noborder set size WIDTH,1 set origin OFFSET+(WIDTH+GAP)*3,0 plot \ FILE1 every :::3::3 using 1:2 w lp ls 7,\ FILE2 every :::3::3 using 1:2 w lp ls 3,\ FILE3 every :::3::3 using 1:2 w lp ls 1 ### set label 21 sprintf("SEQ:%d",SML_STRIDE) at graph 0.25,-0.05 font ",10" # X-label set object rectangle from graph 0,0 to graph 1,1 behind fillcolor rgb '#F0F6FF' fillstyle solid noborder set size WIDTH,1 set origin OFFSET+(WIDTH+GAP)*4,0 plot \ FILE1 every :::4::4 using 1:2 w lp ls 7,\ FILE2 every :::4::4 using 1:2 w lp ls 3,\ FILE3 every :::4::4 using 1:2 w lp ls 1 ### set key at screen 0.997,0.997 set key font ",6" set key opaque box samplen 4 width 2 height 1 set label 21 sprintf("SEQ:%d",BIG_STRIDE) at graph 0.2,-0.05 font ",10" # X-label set object rectangle from graph 0,0 to graph 1,1 behind fillcolor rgb '#F0F6FF' fillstyle solid noborder set size WIDTH,1 set origin OFFSET+(WIDTH+GAP)*5,0 plot \ FILE1 every :::5::5 using 1:2 w lp ls 7 title EVENT1,\ FILE2 every :::5::5 using 1:2 w lp ls 3 title EVENT2,\ FILE3 every :::5::5 using 1:2 w lp ls 1 title EVENT3 unset multiplot papi-papi-7-2-0-t/src/counter_analysis_toolkit/scripts/process_dcache_output.sh000066400000000000000000000012341502707512200301660ustar00rootroot00000000000000#!/bin/bash if (($# != 1 )); then echo "Usage: process_dcache_output.sh datafile"; exit 1; fi ifile=$1 awk 'BEGIN{ first=1; }{ sum=0; min=-1; max=-1; # Process all lines that are not comments. if($1!="#"){ first=0; # Find the min, max and sum of measurements across threads. for(i=2; i<=NF; i++){ sum+=$i; if($i<min || min==-1) min=$i; if($i>max) max=$i; } # Print only the size and three statistics print $1" "min" "sum/(NF-1)" "max; }else{ # Add a space between data blocks to facilitate gnuplot.
if(!first){ printf("\n"); } printf("%s\n",$0); } }' $ifile > ${ifile}.stat papi-papi-7-2-0-t/src/counter_analysis_toolkit/scripts/sample_data/000077500000000000000000000000001502707512200255175ustar00rootroot00000000000000papi-papi-7-2-0-t/src/counter_analysis_toolkit/scripts/sample_data/PM_DATA_FROM_L2.data.reads.stat000066400000000000000000000054611502707512200327140ustar00rootroot00000000000000# Core: 3 8 17 26 # L1:32768 L2:1048576 L3:4194304 # PTRN=3, STRIDE=128, PPB=512.000000, ThreadCount=4 6888 0.000092 0.000126 0.000168 11585 0.000366 0.0008465 0.002029 19483 0.001251 0.00196825 0.003784 77935 1.000275 1.00029 1.000305 185363 1.000119 1.00018 1.000227 440871 1.000082 1.00008 1.000086 1482910 0.001117 0.0875627 0.346732 2097152 0.000593 0.000724 0.000852 2965820 0.000197 0.00030175 0.000459 8388608 0.000057 8.4e-05 0.000130 16777216 0.000019 3.15e-05 0.000050 33554432 0.000023 2.8e-05 0.000031 # PTRN=3, STRIDE=128, PPB=16.000000, ThreadCount=4 6888 0.000092 0.00016425 0.000229 11585 0.000366 0.00048825 0.000626 19483 0.001297 0.0026475 0.006409 77935 0.997040 0.999466 1.000275 185363 1.000194 1.0002 1.000205 440871 1.000077 1.00009 1.000100 1482910 0.001231 0.160944 0.331392 2097152 0.000240 0.00061675 0.000855 2965820 0.000173 0.0003355 0.000463 8388608 0.000046 6.1e-05 0.000084 16777216 0.000013 2.375e-05 0.000036 33554432 0.000012 1.925e-05 0.000031 # PTRN=3, STRIDE=256, PPB=512.000000, ThreadCount=4 6888 0.000061 0.00012975 0.000183 11585 0.000244 0.00042725 0.000519 19483 0.001282 0.001404 0.001526 77935 1.000214 1.00034 1.000641 185363 1.000151 1.00017 1.000173 440871 1.000073 1.0001 1.000118 1482910 0.000790 0.0738718 0.292812 2097152 0.000097 0.00048575 0.000700 2965820 0.000069 0.000158 0.000233 8388608 0.000046 6.875e-05 0.000084 16777216 0.000023 6.375e-05 0.000088 33554432 0.000017 3.475e-05 0.000059 # PTRN=3, STRIDE=256, PPB=16.000000, ThreadCount=4 6888 0.000031 0.00014525 0.000244 11585 0.000458 0.00055675 0.000671 19483 0.001129 0.00141925 
0.001740 77935 1.000214 1.00032 1.000610 185363 1.000151 1.00018 1.000194 440871 1.000054 1.00007 1.000082 1482910 0.002369 0.153592 0.350087 2097152 0.000143 0.00045925 0.000736 2965820 0.000030 0.0001805 0.000281 8388608 0.000046 9.525e-05 0.000137 16777216 0.000042 5.225e-05 0.000057 33554432 0.000032 3.9e-05 0.000050 # PTRN=4, STRIDE=128, PPB=0.500000, ThreadCount=4 6888 0.000076 0.00011825 0.000153 11585 0.000153 0.000191 0.000229 19483 0.000244 0.00029725 0.000366 77935 0.006851 0.0069005 0.006973 185363 0.002978 0.00302675 0.003129 440871 0.001229 0.0012565 0.001302 1482910 0.000004 5.5e-06 0.000007 2097152 0.000003 4e-06 0.000005 2965820 0.000003 3.75e-06 0.000005 8388608 0.000011 1.5e-05 0.000019 16777216 0.000008 2.175e-05 0.000050 33554432 0.000006 8.75e-06 0.000013 # PTRN=4, STRIDE=256, PPB=0.500000, ThreadCount=4 6888 0.000092 0.00014525 0.000214 11585 0.000244 0.00051875 0.001068 19483 0.000610 0.00084675 0.001160 77935 0.042297 0.0506438 0.058685 185363 0.019790 0.020432 0.021754 440871 0.007295 0.010784 0.015779 1482910 0.000005 0.00011125 0.000394 2097152 0.000015 4.175e-05 0.000080 2965820 0.000007 1.425e-05 0.000031 8388608 0.000023 0.00011075 0.000198 16777216 0.000011 5.4e-05 0.000095 33554432 0.000015 3.425e-05 0.000046 papi-papi-7-2-0-t/src/counter_analysis_toolkit/scripts/sample_data/PM_DATA_FROM_L3.data.reads.stat000066400000000000000000000054041502707512200327120ustar00rootroot00000000000000# Core: 1 10 17 26 # L1:32768 L2:1048576 L3:4194304 # PTRN=3, STRIDE=128, PPB=512.000000, ThreadCount=4 6888 0.000015 1.5e-05 0.000015 11585 0.000015 1.5e-05 0.000015 19483 0.000015 1.5e-05 0.000015 77935 0.000000 1.125e-05 0.000015 185363 0.000011 1.1e-05 0.000011 440871 0.000000 1.6e-05 0.000050 1482910 0.999005 0.999038 0.999100 2097152 0.999119 0.999288 0.999408 2965820 0.991387 0.997449 0.999549 8388608 0.301460 0.315969 0.343498 16777216 0.112118 0.129813 0.150003 33554432 0.000083 0.0117015 0.046288 # PTRN=3, STRIDE=128, PPB=16.000000, 
ThreadCount=4 6888 0.000015 1.5e-05 0.000015 11585 0.000015 1.5e-05 0.000015 19483 0.000015 1.5e-05 0.000015 77935 0.000015 1.5e-05 0.000015 185363 0.000011 1.1e-05 0.000011 440871 0.000005 1.5e-05 0.000041 1482910 0.655751 0.882003 0.999098 2097152 0.993767 0.996488 0.999195 2965820 0.992997 0.996036 0.999543 8388608 0.491600 0.539617 0.647552 16777216 0.513214 0.553109 0.631111 33554432 0.565421 0.58971 0.643687 # PTRN=3, STRIDE=256, PPB=512.000000, ThreadCount=4 6888 0.000031 3.1e-05 0.000031 11585 0.000031 3.1e-05 0.000031 19483 0.000031 3.1e-05 0.000031 77935 0.000031 3.1e-05 0.000031 185363 0.000022 2.2e-05 0.000022 440871 0.000009 9e-06 0.000009 1482910 0.997917 0.998812 0.999153 2097152 0.999319 0.999556 0.999836 2965820 0.996523 0.998938 0.999826 8388608 0.004730 0.0807703 0.256523 16777216 0.000011 0.0174197 0.047981 33554432 0.000011 0.00096125 0.003748 # PTRN=3, STRIDE=256, PPB=16.000000, ThreadCount=4 6888 0.000031 3.1e-05 0.000031 11585 0.000031 3.1e-05 0.000031 19483 0.000031 3.1e-05 0.000031 77935 0.000031 3.1e-05 0.000031 185363 0.000022 2.2e-05 0.000022 440871 0.000009 1.575e-05 0.000036 1482910 0.663712 0.799784 0.999007 2097152 0.999332 0.999602 0.999916 2965820 0.999401 0.999626 0.999891 8388608 0.003288 0.0361785 0.122711 16777216 0.000069 0.00028925 0.000546 33554432 0.000475 0.127035 0.331295 # PTRN=4, STRIDE=128, PPB=0.500000, ThreadCount=4 6888 0.000015 1.5e-05 0.000015 11585 0.000015 1.5e-05 0.000015 19483 0.000015 1.5e-05 0.000015 77935 0.000015 1.5e-05 0.000015 185363 0.000011 1.1e-05 0.000011 440871 0.000005 8.25e-06 0.000014 1482910 0.000376 0.0003795 0.000383 2097152 0.000203 0.00024875 0.000306 2965820 0.000142 0.00014525 0.000152 8388608 0.000381 0.000447 0.000515 16777216 0.000422 0.000732 0.000877 33554432 0.000504 0.0009735 0.001148 # PTRN=4, STRIDE=256, PPB=0.500000, ThreadCount=4 6888 0.000031 3.1e-05 0.000031 11585 0.000031 3.1e-05 0.000031 19483 0.000031 3.1e-05 0.000031 77935 0.000031 3.1e-05 0.000031 185363 0.000022 
2.2e-05 0.000022 440871 0.000009 1.35e-05 0.000027 1482910 0.001905 0.00233825 0.002555 2097152 0.001337 0.0018195 0.002367 2965820 0.000730 0.00128925 0.001681 8388608 0.000687 0.00076875 0.000832 16777216 0.001003 0.0014515 0.001675 33554432 0.001017 0.00172525 0.002224 PM_DATA_FROM_L3MISS.data.reads.stat000066400000000000000000000052201502707512200333230ustar00rootroot00000000000000papi-papi-7-2-0-t/src/counter_analysis_toolkit/scripts/sample_data# Core: 1 10 19 26 # L1:32768 L2:1048576 L3:4194304 # PTRN=3, STRIDE=128, PPB=512.000000, ThreadCount=4 6888 0.000000 7.75e-06 0.000031 11585 0.000000 0 0.000000 19483 0.000000 0 0.000000 77935 0.000000 7.75e-06 0.000031 185363 0.000000 1.1e-05 0.000022 440871 0.000000 2.25e-06 0.000009 1482910 0.000001 3.5e-06 0.000007 2097152 0.000003 1.5e-05 0.000034 2965820 0.000045 0.0001315 0.000264 8388608 0.653606 0.717872 0.756207 16777216 0.848867 0.872215 0.908665 33554432 0.946744 0.975656 0.999899 # PTRN=3, STRIDE=128, PPB=16.000000, ThreadCount=4 6888 0.000000 0 0.000000 11585 0.000000 0 0.000000 19483 0.000000 0 0.000000 77935 0.000000 7.75e-06 0.000031 185363 0.000000 5.5e-06 0.000022 440871 0.000000 4.5e-06 0.000009 1482910 0.000001 2.75e-06 0.000004 2097152 0.000002 7.25e-06 0.000016 2965820 0.000053 0.0001225 0.000163 8388608 0.343933 0.458251 0.503326 16777216 0.359467 0.435253 0.469786 33554432 0.351163 0.399427 0.446514 # PTRN=3, STRIDE=256, PPB=512.000000, ThreadCount=4 6888 0.000000 0 0.000000 11585 0.000000 0 0.000000 19483 0.000000 0 0.000000 77935 0.000000 1.525e-05 0.000061 185363 0.000000 0 0.000000 440871 0.000000 0 0.000000 1482910 0.000000 5.25e-06 0.000008 2097152 0.000002 1.925e-05 0.000031 2965820 0.000054 0.000304 0.000654 8388608 0.770248 0.92411 0.990852 16777216 0.891674 0.96542 0.999943 33554432 0.989222 0.997285 1.000004 # PTRN=3, STRIDE=256, PPB=16.000000, ThreadCount=4 6888 0.000000 0 0.000000 11585 0.000000 1.525e-05 0.000061 19483 0.000000 0 0.000000 77935 0.000000 1.525e-05 0.000061 185363 
0.000000 1.075e-05 0.000043 440871 0.000000 4.5e-06 0.000018 1482910 0.000005 5e-06 0.000005 2097152 0.000000 2.75e-05 0.000086 2965820 0.000070 0.00035525 0.000948 8388608 0.872818 0.953537 0.992989 16777216 0.861000 0.962182 0.999973 33554432 0.756063 0.90988 0.999779 # PTRN=4, STRIDE=128, PPB=0.500000, ThreadCount=4 6888 0.000000 0 0.000000 11585 0.000000 0 0.000000 19483 0.000000 0 0.000000 77935 0.000000 7.75e-06 0.000031 185363 0.000000 5.5e-06 0.000022 440871 0.000000 2.25e-06 0.000009 1482910 0.000000 2.25e-06 0.000003 2097152 0.000000 1.25e-06 0.000002 2965820 0.000001 2e-06 0.000003 8388608 0.000286 0.00033575 0.000389 16777216 0.000193 0.00059925 0.001678 33554432 0.000062 0.00013725 0.000261 # PTRN=4, STRIDE=256, PPB=0.500000, ThreadCount=4 6888 0.000000 0 0.000000 11585 0.000000 0 0.000000 19483 0.000000 0 0.000000 77935 0.000000 1.525e-05 0.000061 185363 0.000000 1.075e-05 0.000043 440871 0.000000 4.5e-06 0.000018 1482910 0.000000 3.75e-06 0.000005 2097152 0.000000 2e-06 0.000004 2965820 0.000000 2.25e-06 0.000003 8388608 0.000450 0.00048825 0.000542 16777216 0.000374 0.000539 0.000809 33554432 0.000124 0.00033975 0.000607 papi-papi-7-2-0-t/src/counter_analysis_toolkit/scripts/single_plot.gnp000066400000000000000000000052161502707512200262760ustar00rootroot00000000000000L1_per_core=32768 L2_per_core=1048576 L3_per_core=4194304 SML_STRIDE=128 BIG_STRIDE=256 SML_PPB=16 BIG_PPB=512 DIR="sample_data/" EVENT="PM_DATA_FROM_L2" PLOT_TITLE="Measurements of event ".EVENT SUFFIX=".data.reads.stat" set label 1 at 6000+(L1_per_core-6000)/20, graph 1.020 "L1" font ",9" set label 2 at L1_per_core+(L2_per_core-L1_per_core)/20, graph 1.020 "L2" font ",9" set label 3 at L2_per_core+(L3_per_core-L2_per_core)/20, graph 1.020 "L3" font ",9" ################################################################################ set terminal pdfcairo noenhanced font "Sans,12" set output "single_plot.pdf" load "default_gnp.inc" set xtics ("" L1_per_core, "" L2_per_core, "" 
L3_per_core) FILE=DIR.EVENT.SUFFIX unset key set multiplot layout 1,6 title PLOT_TITLE." set margin 0,0,1.5,2.5 set label 20 "Normalized Event Count" at screen 0.015,0.26 rotate by 90 # Y-label OFFSET=0.09 GAP=0.015 WIDTH=(1.0-(OFFSET+5.0*GAP+GAP/4.0))/6.0 ### set label 21 sprintf("RND:%d:%d",SML_STRIDE,BIG_PPB) at graph 0.1,-0.05 font ",10" # X-label set size 0.1475,1 set size WIDTH,1 set origin OFFSET,0.0 plot \ FILE every :::0::0 using 1:2 w lp ls 1 unset title unset ylabel set format y '' set ytics scale 0.1,0.5 unset label 20 ### set label 21 sprintf("RND:%d:%d",SML_STRIDE,SML_PPB) at graph 0.16,-0.05 font ",10" # X-label set object 1 rectangle from graph 0,0 to graph 1,1 behind fillcolor rgb '#FFFAE8' fillstyle solid noborder set size WIDTH,1 set origin OFFSET+(WIDTH+GAP)*1,0 plot \ FILE every :::1::1 using 1:2 w lp ls 1 unset object 1 ### set label 21 sprintf("RND:%d:%d",BIG_STRIDE,BIG_PPB) at graph 0.1,-0.05 font ",10" # X-label set size WIDTH,1 set origin OFFSET+(WIDTH+GAP)*2,0 plot \ FILE every :::2::2 using 1:2 w lp ls 1 ### set label 21 sprintf("RND:%d:%d",BIG_STRIDE,SML_PPB) at graph 0.1,-0.05 font ",10" # X-label set object rectangle from graph 0,0 to graph 1,1 behind fillcolor rgb '#FFFAE8' fillstyle solid noborder set size WIDTH,1 set origin OFFSET+(WIDTH+GAP)*3,0 plot \ FILE every :::3::3 using 1:2 w lp ls 1 ### set label 21 sprintf("SEQ:%d",SML_STRIDE) at graph 0.25,-0.05 font ",10" # X-label set object rectangle from graph 0,0 to graph 1,1 behind fillcolor rgb '#F0F6FF' fillstyle solid noborder set size WIDTH,1 set origin OFFSET+(WIDTH+GAP)*4,0 plot \ FILE every :::4::4 using 1:2 w lp ls 1 ### set label 21 sprintf("SEQ:%d",BIG_STRIDE) at graph 0.2,-0.05 font ",10" # X-label set object rectangle from graph 0,0 to graph 1,1 behind fillcolor rgb '#F0F6FF' fillstyle solid noborder set size WIDTH,1 set origin OFFSET+(WIDTH+GAP)*5,0 plot \ FILE every :::5::5 using 1:2 w lp ls 1 unset multiplot 
papi-papi-7-2-0-t/src/counter_analysis_toolkit/timing_kernels.c000066400000000000000000000203361502707512200247400ustar00rootroot00000000000000#include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <assert.h> #include <math.h> #include <omp.h> #include <papi.h> #include "prepareArray.h" #include "timing_kernels.h" // For do_work macro in the header file volatile double x,y; extern int is_core; char* eventname = NULL; run_output_t probeBufferSize(long long active_buf_len, long long line_size, float pageCountPerBlock, int pattern, long long llc_size, uintptr_t **v, uintptr_t *rslt, int latency_only, int mode, int ONT){ int _papi_eventset = PAPI_NULL; int retval, buffer = 0, status = 0; int error_line = -1, error_type = PAPI_OK; register uintptr_t *p = NULL; register uintptr_t p_prime; long long pageSize, blockSize; long long int counter[ONT]; run_output_t out; out.status = 0; assert( sizeof(int) >= 4 ); x = (double)*rslt; x = floor(1.3*x/(1.4*x+1.8)); y = x*3.97; if( x > 0 || y > 0 ) printf("WARNING: x=%lf y=%lf\n",x,y); long long countSingle = active_buf_len/line_size; long long threshold = 1024LL*1024LL/64LL; long long len = (active_buf_len > threshold) ? active_buf_len : threshold; long long countMax; if( len > llc_size ){ countMax = 4LL*(len/line_size); }else{ countMax = 64LL*(len/line_size); } // Get the size of a page of memory. pageSize = sysconf(_SC_PAGESIZE)/sizeof(uintptr_t); if( pageSize <= 0 ){ fprintf(stderr,"Cannot determine pagesize, sysconf() returned an error code.\n"); out.status = -1; return out; } // Compute the size of a block in the pointer chain and create the pointer chain. blockSize = (long long)(pageCountPerBlock*(float)pageSize); // Start of threaded benchmark. #pragma omp parallel private(p,retval) reduction(+:buffer) reduction(+:status) firstprivate(_papi_eventset) default(shared) { int idx = omp_get_thread_num(); int thdStatus = 0; double divisor = 1.0; double time1=0, time2=0, dt, factor; long long count; // Initialize the result to a value indicating an error. 
// If no error occurs, it will be overwritten. if ( !latency_only ) { out.counter[idx] = -1; } status += prepareArray(v[idx], active_buf_len, line_size, blockSize, pattern); // We will use "p" even after the epilogue, so let's set // it here in case an error occurs. p = &v[idx][0]; if ( !latency_only && (is_core || 0 == idx) ) { retval = PAPI_create_eventset( &_papi_eventset ); if (retval != PAPI_OK ){ error_type = retval; error_line = __LINE__; thdStatus = -1; // If we can't measure events, no need to run the kernel. goto skip_epilogue; } retval = PAPI_add_named_event( _papi_eventset, eventname ); if (retval != PAPI_OK ){ error_type = retval; error_line = __LINE__; thdStatus = -1; // If we can't measure events, no need to run the kernel. goto clean_up; } } // Make sure all threads start at about the same time so that we get high pressure on the memory subsystem. #pragma omp barrier count = countSingle; // Make a warm-up pass to fetch the data into the cache, if it fits in any cache. while(count > 0){ N_128; count -= 128; } // Start the counters. if ( !latency_only && (is_core || 0 == idx) ) { retval = PAPI_start(_papi_eventset); if ( PAPI_OK != retval ) { error_type = retval; error_line = __LINE__; thdStatus = -1; // If we can't measure events, no need to run the kernel. goto clean_up; } } // Start the actual test. count = countMax; // Micro-kernel for memory reading. if( CACHE_READ_ONLY == mode || latency_only ) { if( latency_only ) time1 = getticks(); while(count > 0){ N_128; count -= 128; } if( latency_only ) time2 = getticks(); } // Micro-kernel for memory writing. else { while(count > 0){ NW_128; count -= 128; } } if ( !latency_only && (is_core || 0 == idx) ) { // Stop the counters. retval = PAPI_stop(_papi_eventset, &counter[idx]); if ( PAPI_OK != retval ) { error_type = retval; error_line = __LINE__; thdStatus = -1; goto clean_up; } // Get the average event count per access in pointer chase. // If it is not a core event, get average count per thread. 
divisor = 1.0*countMax; if( !is_core && 0 == idx ) divisor *= ONT; out.counter[idx] = (1.0*counter[idx])/divisor; clean_up: retval = PAPI_cleanup_eventset(_papi_eventset); if (retval != PAPI_OK ){ error_type = retval; error_line = __LINE__; thdStatus = -1; } retval = PAPI_destroy_eventset(&_papi_eventset); if (retval != PAPI_OK ){ error_type = retval; error_line = __LINE__; thdStatus = -1; } }else{ // Compute the duration of the pointer chase. dt = elapsed(time2, time1); // Convert time into nanoseconds. factor = 1000.0; // Number of accesses per pointer chase. factor /= (1.0*countMax); // Get the average nanoseconds per access. out.dt[idx] = dt*factor; } skip_epilogue: buffer += (uintptr_t)p+(uintptr_t)(x+y); status += thdStatus; } // Get the collective status. if(status < 0) { error_handler(error_type, error_line); out.status = -1; } // Prevent compiler optimization. *rslt = buffer; return out; } void error_handler(int e, int line){ int idx; const char *errors[26] = { "No error", "Invalid argument", "Insufficient memory", "A System/C library call failed", "Not supported by component", "Access to the counters was lost or interrupted", "Internal error, please send mail to the developers", "Event does not exist", "Event exists, but cannot be counted due to counter resource limitations", "EventSet is currently not running", "EventSet is currently counting", "No such EventSet Available", "Event in argument is not a valid preset", "Hardware does not support performance counters", "Unknown error code", "Permission level does not permit operation", "PAPI hasn't been initialized yet", "Component Index isn't set", "Not supported", "Not implemented", "Buffer size exceeded", "EventSet domain is not supported for the operation", "Invalid or missing event attributes", "Too many events or attributes", "Bad combination of features", "Component containing event is disabled" }; idx = -e; if(idx >= 26 || idx < 0 ) idx = 14; /* default to "Unknown error code" */ if( NULL != eventname ) fprintf(stderr,"\nError \"%s\" 
occurred at line %d when processing event %s.\n", errors[idx], line, eventname); else fprintf(stderr,"\nError \"%s\" occurred at line %d.\n", errors[idx], line); } papi-papi-7-2-0-t/src/counter_analysis_toolkit/timing_kernels.h000066400000000000000000000026311502707512200247430ustar00rootroot00000000000000#ifndef _TIMING_KERNELS_ #define _TIMING_KERNELS_ #include <stdint.h> #include "caches.h" #define N_1 p = (uintptr_t *)*p; #define N_2 N_1 N_1 #define N_16 N_2 N_2 N_2 N_2 N_2 N_2 N_2 N_2 #define N_128 N_16 N_16 N_16 N_16 N_16 N_16 N_16 N_16 // NW_1 reads a pointer from an array and then modifies the pointer // and stores it back to the same place in the array. This way it // causes exactly one read and one write operation in memory, assuming // that the variable "p_prime" resides in a register. The exact steps // are the following: // 1. Read the element pointed to by "p" into "p_prime". // This element is almost a pointer to the next element in the chain, but the least significant bit might be set to 1. // 2. Flip the least significant bit of "p_prime" and store it back into the buffer. // 3. Clear the least significant bit of "p_prime" and store the result in "p". 
#define NW_1 {p_prime = *p; *p = p_prime ^ 0x1; p = (uintptr_t *)(p_prime & (~0x1));} #define NW_2 NW_1 NW_1 #define NW_16 NW_2 NW_2 NW_2 NW_2 NW_2 NW_2 NW_2 NW_2 #define NW_128 NW_16 NW_16 NW_16 NW_16 NW_16 NW_16 NW_16 NW_16 #define CACHE_READ_ONLY 0x0 #define CACHE_READ_WRITE 0x1 run_output_t probeBufferSize(long long active_buf_len, long long line_size, float pageCountPerBlock, int pattern, long long llc_size, uintptr_t **v, uintptr_t *rslt, int detect_size, int mode, int ONT); void error_handler(int e, int line); #endif papi-papi-7-2-0-t/src/counter_analysis_toolkit/vec.c000066400000000000000000000315761502707512200225130ustar00rootroot00000000000000#include #include #include #include #include #include "vec.h" #include "cat_arch.h" #include "vec_scalar_verify.h" void vec_driver(char* papi_event_name, hw_desc_t *hw_desc, char* outdir) { int retval = PAPI_OK; int EventSet = PAPI_NULL; FILE* ofp_papi; const char *sufx = ".vec"; char *papiFileName; (void)hw_desc; int l = strlen(outdir)+strlen(papi_event_name)+strlen(sufx); if (NULL == (papiFileName = (char *)calloc( 1+l, sizeof(char)))) { return; } if (l != (sprintf(papiFileName, "%s%s%s", outdir, papi_event_name, sufx))) { goto error0; } if (NULL == (ofp_papi = fopen(papiFileName,"w"))) { fprintf(stderr, "Failed to open file %s.\n", papiFileName); goto error0; } retval = PAPI_create_eventset( &EventSet ); if (retval != PAPI_OK ){ goto error1; } retval = PAPI_add_named_event( EventSet, papi_event_name ); if (retval != PAPI_OK ){ goto error1; } // Header to label the columns in the output file. fprintf(ofp_papi, "# ExpectedInstrs EventCount\n"); #if defined(X86) #if defined(AVX128_AVAIL) // HP Non-FMA instruction trials. 
fprintf(ofp_papi, "# HP Non-FMA Scalar\n"); test_hp_scalar_VEC_24( ITER, EventSet, ofp_papi ); test_hp_scalar_VEC_48( ITER, EventSet, ofp_papi ); test_hp_scalar_VEC_96( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# HP Non-FMA Vector AVX128\n"); test_hp_x86_128B_VEC( 24, ITER, EventSet, ofp_papi ); test_hp_x86_128B_VEC( 48, ITER, EventSet, ofp_papi ); test_hp_x86_128B_VEC( 96, ITER, EventSet, ofp_papi ); #if defined(AVX256_AVAIL) fprintf(ofp_papi, "# HP Non-FMA Vector AVX256\n"); test_hp_x86_256B_VEC( 24, ITER, EventSet, ofp_papi ); test_hp_x86_256B_VEC( 48, ITER, EventSet, ofp_papi ); test_hp_x86_256B_VEC( 96, ITER, EventSet, ofp_papi ); #if defined(AVX512_AVAIL) fprintf(ofp_papi, "# HP Non-FMA Vector AVX512\n"); test_hp_x86_512B_VEC( 24, ITER, EventSet, ofp_papi ); test_hp_x86_512B_VEC( 48, ITER, EventSet, ofp_papi ); test_hp_x86_512B_VEC( 96, ITER, EventSet, ofp_papi ); #endif #endif // SP Non-FMA instruction trials. fprintf(ofp_papi, "# SP Non-FMA Scalar\n"); test_sp_scalar_VEC_24( ITER, EventSet, ofp_papi ); test_sp_scalar_VEC_48( ITER, EventSet, ofp_papi ); test_sp_scalar_VEC_96( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# SP Non-FMA Vector AVX128\n"); test_sp_x86_128B_VEC( 24, ITER, EventSet, ofp_papi ); test_sp_x86_128B_VEC( 48, ITER, EventSet, ofp_papi ); test_sp_x86_128B_VEC( 96, ITER, EventSet, ofp_papi ); #if defined(AVX256_AVAIL) fprintf(ofp_papi, "# SP Non-FMA Vector AVX256\n"); test_sp_x86_256B_VEC( 24, ITER, EventSet, ofp_papi ); test_sp_x86_256B_VEC( 48, ITER, EventSet, ofp_papi ); test_sp_x86_256B_VEC( 96, ITER, EventSet, ofp_papi ); #if defined(AVX512_AVAIL) fprintf(ofp_papi, "# SP Non-FMA Vector AVX512\n"); test_sp_x86_512B_VEC( 24, ITER, EventSet, ofp_papi ); test_sp_x86_512B_VEC( 48, ITER, EventSet, ofp_papi ); test_sp_x86_512B_VEC( 96, ITER, EventSet, ofp_papi ); #endif #endif // DP Non-FMA instruction trials. 
fprintf(ofp_papi, "# DP Non-FMA Scalar\n"); test_dp_scalar_VEC_24( ITER, EventSet, ofp_papi ); test_dp_scalar_VEC_48( ITER, EventSet, ofp_papi ); test_dp_scalar_VEC_96( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# DP Non-FMA Vector AVX128\n"); test_dp_x86_128B_VEC( 24, ITER, EventSet, ofp_papi ); test_dp_x86_128B_VEC( 48, ITER, EventSet, ofp_papi ); test_dp_x86_128B_VEC( 96, ITER, EventSet, ofp_papi ); #if defined(AVX256_AVAIL) fprintf(ofp_papi, "# DP Non-FMA Vector AVX256\n"); test_dp_x86_256B_VEC( 24, ITER, EventSet, ofp_papi ); test_dp_x86_256B_VEC( 48, ITER, EventSet, ofp_papi ); test_dp_x86_256B_VEC( 96, ITER, EventSet, ofp_papi ); #if defined(AVX512_AVAIL) fprintf(ofp_papi, "# DP Non-FMA Vector AVX512\n"); test_dp_x86_512B_VEC( 24, ITER, EventSet, ofp_papi ); test_dp_x86_512B_VEC( 48, ITER, EventSet, ofp_papi ); test_dp_x86_512B_VEC( 96, ITER, EventSet, ofp_papi ); #endif #endif // HP FMA instruction trials. fprintf(ofp_papi, "# HP FMA Scalar\n"); test_hp_scalar_VEC_FMA_12( ITER, EventSet, ofp_papi ); test_hp_scalar_VEC_FMA_24( ITER, EventSet, ofp_papi ); test_hp_scalar_VEC_FMA_48( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# HP FMA Vector AVX128\n"); test_hp_x86_128B_VEC_FMA( 12, ITER, EventSet, ofp_papi ); test_hp_x86_128B_VEC_FMA( 24, ITER, EventSet, ofp_papi ); test_hp_x86_128B_VEC_FMA( 48, ITER, EventSet, ofp_papi ); #if defined(AVX256_AVAIL) fprintf(ofp_papi, "# HP FMA Vector AVX256\n"); test_hp_x86_256B_VEC_FMA( 12, ITER, EventSet, ofp_papi ); test_hp_x86_256B_VEC_FMA( 24, ITER, EventSet, ofp_papi ); test_hp_x86_256B_VEC_FMA( 48, ITER, EventSet, ofp_papi ); #if defined(AVX512_AVAIL) fprintf(ofp_papi, "# HP FMA Vector AVX512\n"); test_hp_x86_512B_VEC_FMA( 12, ITER, EventSet, ofp_papi ); test_hp_x86_512B_VEC_FMA( 24, ITER, EventSet, ofp_papi ); test_hp_x86_512B_VEC_FMA( 48, ITER, EventSet, ofp_papi ); #endif #endif // SP FMA instruction trials. 
fprintf(ofp_papi, "# SP FMA Scalar\n"); test_sp_scalar_VEC_FMA_12( ITER, EventSet, ofp_papi ); test_sp_scalar_VEC_FMA_24( ITER, EventSet, ofp_papi ); test_sp_scalar_VEC_FMA_48( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# SP FMA Vector AVX128\n"); test_sp_x86_128B_VEC_FMA( 12, ITER, EventSet, ofp_papi ); test_sp_x86_128B_VEC_FMA( 24, ITER, EventSet, ofp_papi ); test_sp_x86_128B_VEC_FMA( 48, ITER, EventSet, ofp_papi ); #if defined(AVX256_AVAIL) fprintf(ofp_papi, "# SP FMA Vector AVX256\n"); test_sp_x86_256B_VEC_FMA( 12, ITER, EventSet, ofp_papi ); test_sp_x86_256B_VEC_FMA( 24, ITER, EventSet, ofp_papi ); test_sp_x86_256B_VEC_FMA( 48, ITER, EventSet, ofp_papi ); #if defined(AVX512_AVAIL) fprintf(ofp_papi, "# SP FMA Vector AVX512\n"); test_sp_x86_512B_VEC_FMA( 12, ITER, EventSet, ofp_papi ); test_sp_x86_512B_VEC_FMA( 24, ITER, EventSet, ofp_papi ); test_sp_x86_512B_VEC_FMA( 48, ITER, EventSet, ofp_papi ); #endif #endif // DP FMA instruction trials. fprintf(ofp_papi, "# DP FMA Scalar\n"); test_dp_scalar_VEC_FMA_12( ITER, EventSet, ofp_papi ); test_dp_scalar_VEC_FMA_24( ITER, EventSet, ofp_papi ); test_dp_scalar_VEC_FMA_48( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# DP FMA Vector AVX128\n"); test_dp_x86_128B_VEC_FMA( 12, ITER, EventSet, ofp_papi ); test_dp_x86_128B_VEC_FMA( 24, ITER, EventSet, ofp_papi ); test_dp_x86_128B_VEC_FMA( 48, ITER, EventSet, ofp_papi ); #if defined(AVX256_AVAIL) fprintf(ofp_papi, "# DP FMA Vector AVX256\n"); test_dp_x86_256B_VEC_FMA( 12, ITER, EventSet, ofp_papi ); test_dp_x86_256B_VEC_FMA( 24, ITER, EventSet, ofp_papi ); test_dp_x86_256B_VEC_FMA( 48, ITER, EventSet, ofp_papi ); #if defined(AVX512_AVAIL) fprintf(ofp_papi, "# DP FMA Vector AVX512\n"); test_dp_x86_512B_VEC_FMA( 12, ITER, EventSet, ofp_papi ); test_dp_x86_512B_VEC_FMA( 24, ITER, EventSet, ofp_papi ); test_dp_x86_512B_VEC_FMA( 48, ITER, EventSet, ofp_papi ); #endif #endif #else fprintf(stderr, "Vector FLOP benchmark is not supported on this architecture: AVX 
unavailable!\n"); #endif #elif defined(ARM) // Non-FMA instruction trials. fprintf(ofp_papi, "# HP Non-FMA Scalar\n"); test_hp_scalar_VEC_24( ITER, EventSet, ofp_papi ); test_hp_scalar_VEC_48( ITER, EventSet, ofp_papi ); test_hp_scalar_VEC_96( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# HP Non-FMA Vector\n"); test_hp_arm_VEC( 24, ITER, EventSet, ofp_papi ); test_hp_arm_VEC( 48, ITER, EventSet, ofp_papi ); test_hp_arm_VEC( 96, ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# SP Non-FMA Scalar\n"); test_sp_scalar_VEC_24( ITER, EventSet, ofp_papi ); test_sp_scalar_VEC_48( ITER, EventSet, ofp_papi ); test_sp_scalar_VEC_96( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# SP Non-FMA Vector\n"); test_sp_arm_VEC( 24, ITER, EventSet, ofp_papi ); test_sp_arm_VEC( 48, ITER, EventSet, ofp_papi ); test_sp_arm_VEC( 96, ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# DP Non-FMA Scalar\n"); test_dp_scalar_VEC_24( ITER, EventSet, ofp_papi ); test_dp_scalar_VEC_48( ITER, EventSet, ofp_papi ); test_dp_scalar_VEC_96( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# DP Non-FMA Vector\n"); test_dp_arm_VEC( 24, ITER, EventSet, ofp_papi ); test_dp_arm_VEC( 48, ITER, EventSet, ofp_papi ); test_dp_arm_VEC( 96, ITER, EventSet, ofp_papi ); // FMA instruction trials. 
fprintf(ofp_papi, "# HP FMA Scalar\n"); test_hp_scalar_VEC_FMA_12( ITER, EventSet, ofp_papi ); test_hp_scalar_VEC_FMA_24( ITER, EventSet, ofp_papi ); test_hp_scalar_VEC_FMA_48( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# HP FMA Vector\n"); test_hp_arm_VEC_FMA( 12, ITER, EventSet, ofp_papi ); test_hp_arm_VEC_FMA( 24, ITER, EventSet, ofp_papi ); test_hp_arm_VEC_FMA( 48, ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# SP FMA Scalar\n"); test_sp_scalar_VEC_FMA_12( ITER, EventSet, ofp_papi ); test_sp_scalar_VEC_FMA_24( ITER, EventSet, ofp_papi ); test_sp_scalar_VEC_FMA_48( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# SP FMA Vector\n"); test_sp_arm_VEC_FMA( 12, ITER, EventSet, ofp_papi ); test_sp_arm_VEC_FMA( 24, ITER, EventSet, ofp_papi ); test_sp_arm_VEC_FMA( 48, ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# DP FMA Scalar\n"); test_dp_scalar_VEC_FMA_12( ITER, EventSet, ofp_papi ); test_dp_scalar_VEC_FMA_24( ITER, EventSet, ofp_papi ); test_dp_scalar_VEC_FMA_48( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# DP FMA Vector\n"); test_dp_arm_VEC_FMA( 12, ITER, EventSet, ofp_papi ); test_dp_arm_VEC_FMA( 24, ITER, EventSet, ofp_papi ); test_dp_arm_VEC_FMA( 48, ITER, EventSet, ofp_papi ); #elif defined(POWER) // Non-FMA instruction trials. 
fprintf(ofp_papi, "# HP Non-FMA Scalar\n"); test_hp_scalar_VEC_24( ITER, EventSet, ofp_papi ); test_hp_scalar_VEC_48( ITER, EventSet, ofp_papi ); test_hp_scalar_VEC_96( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# HP Non-FMA Vector\n"); test_hp_power_VEC( 24, ITER, EventSet, ofp_papi ); test_hp_power_VEC( 48, ITER, EventSet, ofp_papi ); test_hp_power_VEC( 96, ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# SP Non-FMA Scalar\n"); test_sp_scalar_VEC_24( ITER, EventSet, ofp_papi ); test_sp_scalar_VEC_48( ITER, EventSet, ofp_papi ); test_sp_scalar_VEC_96( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# SP Non-FMA Vector\n"); test_sp_power_VEC( 24, ITER, EventSet, ofp_papi ); test_sp_power_VEC( 48, ITER, EventSet, ofp_papi ); test_sp_power_VEC( 96, ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# DP Non-FMA Scalar\n"); test_dp_scalar_VEC_24( ITER, EventSet, ofp_papi ); test_dp_scalar_VEC_48( ITER, EventSet, ofp_papi ); test_dp_scalar_VEC_96( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# DP Non-FMA Vector\n"); test_dp_power_VEC( 24, ITER, EventSet, ofp_papi ); test_dp_power_VEC( 48, ITER, EventSet, ofp_papi ); test_dp_power_VEC( 96, ITER, EventSet, ofp_papi ); // FMA instruction trials. 
fprintf(ofp_papi, "# HP FMA Scalar\n"); test_hp_scalar_VEC_FMA_12( ITER, EventSet, ofp_papi ); test_hp_scalar_VEC_FMA_24( ITER, EventSet, ofp_papi ); test_hp_scalar_VEC_FMA_48( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# HP FMA Vector\n"); test_hp_power_VEC_FMA( 12, ITER, EventSet, ofp_papi ); test_hp_power_VEC_FMA( 24, ITER, EventSet, ofp_papi ); test_hp_power_VEC_FMA( 48, ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# SP FMA Scalar\n"); test_sp_scalar_VEC_FMA_12( ITER, EventSet, ofp_papi ); test_sp_scalar_VEC_FMA_24( ITER, EventSet, ofp_papi ); test_sp_scalar_VEC_FMA_48( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# SP FMA Vector\n"); test_sp_power_VEC_FMA( 12, ITER, EventSet, ofp_papi ); test_sp_power_VEC_FMA( 24, ITER, EventSet, ofp_papi ); test_sp_power_VEC_FMA( 48, ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# DP FMA Scalar\n"); test_dp_scalar_VEC_FMA_12( ITER, EventSet, ofp_papi ); test_dp_scalar_VEC_FMA_24( ITER, EventSet, ofp_papi ); test_dp_scalar_VEC_FMA_48( ITER, EventSet, ofp_papi ); fprintf(ofp_papi, "# DP FMA Vector\n"); test_dp_power_VEC_FMA( 12, ITER, EventSet, ofp_papi ); test_dp_power_VEC_FMA( 24, ITER, EventSet, ofp_papi ); test_dp_power_VEC_FMA( 48, ITER, EventSet, ofp_papi ); #endif retval = PAPI_cleanup_eventset( EventSet ); if (retval != PAPI_OK ){ goto error1; } retval = PAPI_destroy_eventset( &EventSet ); if (retval != PAPI_OK ){ goto error1; } error1: fclose(ofp_papi); error0: free(papiFileName); return; } papi-papi-7-2-0-t/src/counter_analysis_toolkit/vec.h000066400000000000000000000002051502707512200225010ustar00rootroot00000000000000#ifndef _VEC_ #define _VEC_ #include "hw_desc.h" void vec_driver(char* papi_event_name, hw_desc_t *hw_desc, char* outdir); #endif papi-papi-7-2-0-t/src/counter_analysis_toolkit/vec_fma_dp.c000066400000000000000000000234061502707512200240120ustar00rootroot00000000000000#include "vec_scalar_verify.h" static double test_dp_mac_VEC_FMA_12( uint64 iterations, int EventSet, FILE *fp ); static double 
test_dp_mac_VEC_FMA_24( uint64 iterations, int EventSet, FILE *fp ); static double test_dp_mac_VEC_FMA_48( uint64 iterations, int EventSet, FILE *fp ); static void test_dp_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); /* Wrapper functions of different vector widths. */ #if defined(X86_VEC_WIDTH_128B) void test_dp_x86_128B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { return test_dp_VEC_FMA( instr_per_loop, iterations, EventSet, fp ); } #elif defined(X86_VEC_WIDTH_512B) void test_dp_x86_512B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { return test_dp_VEC_FMA( instr_per_loop, iterations, EventSet, fp ); } #elif defined(X86_VEC_WIDTH_256B) void test_dp_x86_256B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { return test_dp_VEC_FMA( instr_per_loop, iterations, EventSet, fp ); } #elif defined(ARM) void test_dp_arm_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { return test_dp_VEC_FMA( instr_per_loop, iterations, EventSet, fp ); } #elif defined(POWER) void test_dp_power_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { return test_dp_VEC_FMA( instr_per_loop, iterations, EventSet, fp ); } #endif /************************************/ /* Loop unrolling: 12 instructions */ /************************************/ static double test_dp_mac_VEC_FMA_12( uint64 iterations, int EventSet, FILE *fp ){ register DP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_PD(0.01); r1 = SET_VEC_PD(0.02); r2 = SET_VEC_PD(0.03); r3 = SET_VEC_PD(0.04); r4 = SET_VEC_PD(0.05); r5 = SET_VEC_PD(0.06); r6 = SET_VEC_PD(0.07); r7 = SET_VEC_PD(0.08); r8 = SET_VEC_PD(0.09); r9 = SET_VEC_PD(0.10); rA = SET_VEC_PD(0.11); rB = SET_VEC_PD(0.12); rC = SET_VEC_PD(0.13); rD = SET_VEC_PD(0.14); rE = SET_VEC_PD(0.15); rF = SET_VEC_PD(0.16); /* Start PAPI counters */ if ( PAPI_start( EventSet ) != 
PAPI_OK ) { return -1; } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = FMA_VEC_PD(r0,r7,r9); r1 = FMA_VEC_PD(r1,r8,rA); r2 = FMA_VEC_PD(r2,r9,rB); r3 = FMA_VEC_PD(r3,rA,rC); r4 = FMA_VEC_PD(r4,rB,rD); r5 = FMA_VEC_PD(r5,rC,rE); r0 = FMA_VEC_PD(r0,rD,rF); r1 = FMA_VEC_PD(r1,rC,rE); r2 = FMA_VEC_PD(r2,rB,rD); r3 = FMA_VEC_PD(r3,rA,rC); r4 = FMA_VEC_PD(r4,r9,rB); r5 = FMA_VEC_PD(r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(12, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PD(r0,r1); r2 = ADD_VEC_PD(r2,r3); r4 = ADD_VEC_PD(r4,r5); r0 = ADD_VEC_PD(r0,r6); r2 = ADD_VEC_PD(r2,r4); r0 = ADD_VEC_PD(r0,r2); double out = 0; DP_VEC_TYPE temp = r0; out += ((double*)&temp)[0]; out += ((double*)&temp)[1]; return out; } /************************************/ /* Loop unrolling: 24 instructions */ /************************************/ static double test_dp_mac_VEC_FMA_24( uint64 iterations, int EventSet, FILE *fp ){ register DP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_PD(0.01); r1 = SET_VEC_PD(0.02); r2 = SET_VEC_PD(0.03); r3 = SET_VEC_PD(0.04); r4 = SET_VEC_PD(0.05); r5 = SET_VEC_PD(0.06); r6 = SET_VEC_PD(0.07); r7 = SET_VEC_PD(0.08); r8 = SET_VEC_PD(0.09); r9 = SET_VEC_PD(0.10); rA = SET_VEC_PD(0.11); rB = SET_VEC_PD(0.12); rC = SET_VEC_PD(0.13); rD = SET_VEC_PD(0.14); rE = SET_VEC_PD(0.15); rF = SET_VEC_PD(0.16); /* Start PAPI counters */ if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = FMA_VEC_PD(r0,r7,r9); r1 = FMA_VEC_PD(r1,r8,rA); r2 = FMA_VEC_PD(r2,r9,rB); r3 = FMA_VEC_PD(r3,rA,rC); r4 = FMA_VEC_PD(r4,rB,rD); r5 = FMA_VEC_PD(r5,rC,rE); r0 = FMA_VEC_PD(r0,rD,rF); r1 = FMA_VEC_PD(r1,rC,rE); r2 = FMA_VEC_PD(r2,rB,rD); r3 = FMA_VEC_PD(r3,rA,rC); r4 = 
FMA_VEC_PD(r4,r9,rB); r5 = FMA_VEC_PD(r5,r8,rA); r0 = FMA_VEC_PD(r0,r7,r9); r1 = FMA_VEC_PD(r1,r8,rA); r2 = FMA_VEC_PD(r2,r9,rB); r3 = FMA_VEC_PD(r3,rA,rC); r4 = FMA_VEC_PD(r4,rB,rD); r5 = FMA_VEC_PD(r5,rC,rE); r0 = FMA_VEC_PD(r0,rD,rF); r1 = FMA_VEC_PD(r1,rC,rE); r2 = FMA_VEC_PD(r2,rB,rD); r3 = FMA_VEC_PD(r3,rA,rC); r4 = FMA_VEC_PD(r4,r9,rB); r5 = FMA_VEC_PD(r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(24, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PD(r0,r1); r2 = ADD_VEC_PD(r2,r3); r4 = ADD_VEC_PD(r4,r5); r0 = ADD_VEC_PD(r0,r6); r2 = ADD_VEC_PD(r2,r4); r0 = ADD_VEC_PD(r0,r2); double out = 0; DP_VEC_TYPE temp = r0; out += ((double*)&temp)[0]; out += ((double*)&temp)[1]; return out; } /************************************/ /* Loop unrolling: 48 instructions */ /************************************/ static double test_dp_mac_VEC_FMA_48( uint64 iterations, int EventSet, FILE *fp ){ register DP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_PD(0.01); r1 = SET_VEC_PD(0.02); r2 = SET_VEC_PD(0.03); r3 = SET_VEC_PD(0.04); r4 = SET_VEC_PD(0.05); r5 = SET_VEC_PD(0.06); r6 = SET_VEC_PD(0.07); r7 = SET_VEC_PD(0.08); r8 = SET_VEC_PD(0.09); r9 = SET_VEC_PD(0.10); rA = SET_VEC_PD(0.11); rB = SET_VEC_PD(0.12); rC = SET_VEC_PD(0.13); rD = SET_VEC_PD(0.14); rE = SET_VEC_PD(0.15); rF = SET_VEC_PD(0.16); /* Start PAPI counters */ if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = FMA_VEC_PD(r0,r7,r9); r1 = FMA_VEC_PD(r1,r8,rA); r2 = FMA_VEC_PD(r2,r9,rB); r3 = FMA_VEC_PD(r3,rA,rC); r4 = FMA_VEC_PD(r4,rB,rD); r5 = FMA_VEC_PD(r5,rC,rE); r0 = FMA_VEC_PD(r0,rD,rF); r1 = FMA_VEC_PD(r1,rC,rE); r2 = FMA_VEC_PD(r2,rB,rD); r3 = FMA_VEC_PD(r3,rA,rC); r4 = FMA_VEC_PD(r4,r9,rB); r5 = FMA_VEC_PD(r5,r8,rA); r0 = FMA_VEC_PD(r0,r7,r9); r1 = 
FMA_VEC_PD(r1,r8,rA); r2 = FMA_VEC_PD(r2,r9,rB); r3 = FMA_VEC_PD(r3,rA,rC); r4 = FMA_VEC_PD(r4,rB,rD); r5 = FMA_VEC_PD(r5,rC,rE); r0 = FMA_VEC_PD(r0,rD,rF); r1 = FMA_VEC_PD(r1,rC,rE); r2 = FMA_VEC_PD(r2,rB,rD); r3 = FMA_VEC_PD(r3,rA,rC); r4 = FMA_VEC_PD(r4,r9,rB); r5 = FMA_VEC_PD(r5,r8,rA); r0 = FMA_VEC_PD(r0,r7,r9); r1 = FMA_VEC_PD(r1,r8,rA); r2 = FMA_VEC_PD(r2,r9,rB); r3 = FMA_VEC_PD(r3,rA,rC); r4 = FMA_VEC_PD(r4,rB,rD); r5 = FMA_VEC_PD(r5,rC,rE); r0 = FMA_VEC_PD(r0,rD,rF); r1 = FMA_VEC_PD(r1,rC,rE); r2 = FMA_VEC_PD(r2,rB,rD); r3 = FMA_VEC_PD(r3,rA,rC); r4 = FMA_VEC_PD(r4,r9,rB); r5 = FMA_VEC_PD(r5,r8,rA); r0 = FMA_VEC_PD(r0,r7,r9); r1 = FMA_VEC_PD(r1,r8,rA); r2 = FMA_VEC_PD(r2,r9,rB); r3 = FMA_VEC_PD(r3,rA,rC); r4 = FMA_VEC_PD(r4,rB,rD); r5 = FMA_VEC_PD(r5,rC,rE); r0 = FMA_VEC_PD(r0,rD,rF); r1 = FMA_VEC_PD(r1,rC,rE); r2 = FMA_VEC_PD(r2,rB,rD); r3 = FMA_VEC_PD(r3,rA,rC); r4 = FMA_VEC_PD(r4,r9,rB); r5 = FMA_VEC_PD(r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(48, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PD(r0,r1); r2 = ADD_VEC_PD(r2,r3); r4 = ADD_VEC_PD(r4,r5); r0 = ADD_VEC_PD(r0,r6); r2 = ADD_VEC_PD(r2,r4); r0 = ADD_VEC_PD(r0,r2); double out = 0; DP_VEC_TYPE temp = r0; out += ((double*)&temp)[0]; out += ((double*)&temp)[1]; return out; } static void test_dp_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { double sum = 0.0; double scalar_sum = 0.0; if ( instr_per_loop == 12 ) { sum += test_dp_mac_VEC_FMA_12( iterations, EventSet, fp ); scalar_sum += test_dp_scalar_VEC_FMA_12( iterations, EventSet, NULL ); } else if ( instr_per_loop == 24 ) { sum += test_dp_mac_VEC_FMA_24( iterations, EventSet, fp ); scalar_sum += test_dp_scalar_VEC_FMA_24( iterations, EventSet, NULL ); } else if ( instr_per_loop == 48 ) { sum += test_dp_mac_VEC_FMA_48( iterations, EventSet, fp ); scalar_sum += test_dp_scalar_VEC_FMA_48( iterations, EventSet, NULL ); } if( sum/2.0 != scalar_sum 
) { fprintf(stderr, "FMA: Inconsistent FLOP results detected!\n"); } } papi-papi-7-2-0-t/src/counter_analysis_toolkit/vec_fma_hp.c000066400000000000000000000276751502707512200240320ustar00rootroot00000000000000#include "vec_scalar_verify.h" #if defined(ARM) static half test_hp_mac_VEC_FMA_12( uint64 iterations, int EventSet, FILE *fp ); static half test_hp_mac_VEC_FMA_24( uint64 iterations, int EventSet, FILE *fp ); static half test_hp_mac_VEC_FMA_48( uint64 iterations, int EventSet, FILE *fp ); #else static float test_hp_mac_VEC_FMA_12( uint64 iterations, int EventSet, FILE *fp ); static float test_hp_mac_VEC_FMA_24( uint64 iterations, int EventSet, FILE *fp ); static float test_hp_mac_VEC_FMA_48( uint64 iterations, int EventSet, FILE *fp ); #endif static void test_hp_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); /* Wrapper functions of different vector widths. */ #if defined(X86_VEC_WIDTH_128B) void test_hp_x86_128B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { return test_hp_VEC_FMA( instr_per_loop, iterations, EventSet, fp ); } #elif defined(X86_VEC_WIDTH_512B) void test_hp_x86_512B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { return test_hp_VEC_FMA( instr_per_loop, iterations, EventSet, fp ); } #elif defined(X86_VEC_WIDTH_256B) void test_hp_x86_256B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { return test_hp_VEC_FMA( instr_per_loop, iterations, EventSet, fp ); } #elif defined(ARM) void test_hp_arm_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { return test_hp_VEC_FMA( instr_per_loop, iterations, EventSet, fp ); } #elif defined(POWER) void test_hp_power_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { return test_hp_VEC_FMA( instr_per_loop, iterations, EventSet, fp ); } #endif #if defined(ARM) /************************************/ /* Loop unrolling: 12 instructions */ 
/************************************/ static half test_hp_mac_VEC_FMA_12( uint64 iterations, int EventSet, FILE *fp ){ register HP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_PH(0.01); r1 = SET_VEC_PH(0.02); r2 = SET_VEC_PH(0.03); r3 = SET_VEC_PH(0.04); r4 = SET_VEC_PH(0.05); r5 = SET_VEC_PH(0.06); r6 = SET_VEC_PH(0.07); r7 = SET_VEC_PH(0.08); r8 = SET_VEC_PH(0.09); r9 = SET_VEC_PH(0.10); rA = SET_VEC_PH(0.11); rB = SET_VEC_PH(0.12); rC = SET_VEC_PH(0.13); rD = SET_VEC_PH(0.14); rE = SET_VEC_PH(0.15); rF = SET_VEC_PH(0.16); /* Start PAPI counters */ if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = FMA_VEC_PH(r0,r7,r9); r1 = FMA_VEC_PH(r1,r8,rA); r2 = FMA_VEC_PH(r2,r9,rB); r3 = FMA_VEC_PH(r3,rA,rC); r4 = FMA_VEC_PH(r4,rB,rD); r5 = FMA_VEC_PH(r5,rC,rE); r0 = FMA_VEC_PH(r0,rD,rF); r1 = FMA_VEC_PH(r1,rC,rE); r2 = FMA_VEC_PH(r2,rB,rD); r3 = FMA_VEC_PH(r3,rA,rC); r4 = FMA_VEC_PH(r4,r9,rB); r5 = FMA_VEC_PH(r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(12, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PH(r0,r1); r2 = ADD_VEC_PH(r2,r3); r4 = ADD_VEC_PH(r4,r5); r0 = ADD_VEC_PH(r0,r6); r2 = ADD_VEC_PH(r2,r4); r0 = ADD_VEC_PH(r0,r2); half out = 0; HP_VEC_TYPE temp = r0; out = vaddh_f16(out,((half*)&temp)[0]); out = vaddh_f16(out,((half*)&temp)[1]); out = vaddh_f16(out,((half*)&temp)[2]); out = vaddh_f16(out,((half*)&temp)[3]); return out; } /************************************/ /* Loop unrolling: 24 instructions */ /************************************/ static half test_hp_mac_VEC_FMA_24( uint64 iterations, int EventSet, FILE *fp ){ register HP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_PH(0.01); r1 = SET_VEC_PH(0.02); r2 = SET_VEC_PH(0.03); r3 = SET_VEC_PH(0.04); r4 = 
SET_VEC_PH(0.05); r5 = SET_VEC_PH(0.06); r6 = SET_VEC_PH(0.07); r7 = SET_VEC_PH(0.08); r8 = SET_VEC_PH(0.09); r9 = SET_VEC_PH(0.10); rA = SET_VEC_PH(0.11); rB = SET_VEC_PH(0.12); rC = SET_VEC_PH(0.13); rD = SET_VEC_PH(0.14); rE = SET_VEC_PH(0.15); rF = SET_VEC_PH(0.16); /* Start PAPI counters */ if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = FMA_VEC_PH(r0,r7,r9); r1 = FMA_VEC_PH(r1,r8,rA); r2 = FMA_VEC_PH(r2,r9,rB); r3 = FMA_VEC_PH(r3,rA,rC); r4 = FMA_VEC_PH(r4,rB,rD); r5 = FMA_VEC_PH(r5,rC,rE); r0 = FMA_VEC_PH(r0,rD,rF); r1 = FMA_VEC_PH(r1,rC,rE); r2 = FMA_VEC_PH(r2,rB,rD); r3 = FMA_VEC_PH(r3,rA,rC); r4 = FMA_VEC_PH(r4,r9,rB); r5 = FMA_VEC_PH(r5,r8,rA); r0 = FMA_VEC_PH(r0,r7,r9); r1 = FMA_VEC_PH(r1,r8,rA); r2 = FMA_VEC_PH(r2,r9,rB); r3 = FMA_VEC_PH(r3,rA,rC); r4 = FMA_VEC_PH(r4,rB,rD); r5 = FMA_VEC_PH(r5,rC,rE); r0 = FMA_VEC_PH(r0,rD,rF); r1 = FMA_VEC_PH(r1,rC,rE); r2 = FMA_VEC_PH(r2,rB,rD); r3 = FMA_VEC_PH(r3,rA,rC); r4 = FMA_VEC_PH(r4,r9,rB); r5 = FMA_VEC_PH(r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(24, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PH(r0,r1); r2 = ADD_VEC_PH(r2,r3); r4 = ADD_VEC_PH(r4,r5); r0 = ADD_VEC_PH(r0,r6); r2 = ADD_VEC_PH(r2,r4); r0 = ADD_VEC_PH(r0,r2); half out = 0; HP_VEC_TYPE temp = r0; out = vaddh_f16(out,((half*)&temp)[0]); out = vaddh_f16(out,((half*)&temp)[1]); out = vaddh_f16(out,((half*)&temp)[2]); out = vaddh_f16(out,((half*)&temp)[3]); return out; } /************************************/ /* Loop unrolling: 48 instructions */ /************************************/ static half test_hp_mac_VEC_FMA_48( uint64 iterations, int EventSet, FILE *fp ){ register HP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_PH(0.01); r1 = SET_VEC_PH(0.02); r2 = SET_VEC_PH(0.03); r3 = SET_VEC_PH(0.04); r4 
= SET_VEC_PH(0.05); r5 = SET_VEC_PH(0.06); r6 = SET_VEC_PH(0.07); r7 = SET_VEC_PH(0.08); r8 = SET_VEC_PH(0.09); r9 = SET_VEC_PH(0.10); rA = SET_VEC_PH(0.11); rB = SET_VEC_PH(0.12); rC = SET_VEC_PH(0.13); rD = SET_VEC_PH(0.14); rE = SET_VEC_PH(0.15); rF = SET_VEC_PH(0.16); /* Start PAPI counters */ if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = FMA_VEC_PH(r0,r7,r9); r1 = FMA_VEC_PH(r1,r8,rA); r2 = FMA_VEC_PH(r2,r9,rB); r3 = FMA_VEC_PH(r3,rA,rC); r4 = FMA_VEC_PH(r4,rB,rD); r5 = FMA_VEC_PH(r5,rC,rE); r0 = FMA_VEC_PH(r0,rD,rF); r1 = FMA_VEC_PH(r1,rC,rE); r2 = FMA_VEC_PH(r2,rB,rD); r3 = FMA_VEC_PH(r3,rA,rC); r4 = FMA_VEC_PH(r4,r9,rB); r5 = FMA_VEC_PH(r5,r8,rA); r0 = FMA_VEC_PH(r0,r7,r9); r1 = FMA_VEC_PH(r1,r8,rA); r2 = FMA_VEC_PH(r2,r9,rB); r3 = FMA_VEC_PH(r3,rA,rC); r4 = FMA_VEC_PH(r4,rB,rD); r5 = FMA_VEC_PH(r5,rC,rE); r0 = FMA_VEC_PH(r0,rD,rF); r1 = FMA_VEC_PH(r1,rC,rE); r2 = FMA_VEC_PH(r2,rB,rD); r3 = FMA_VEC_PH(r3,rA,rC); r4 = FMA_VEC_PH(r4,r9,rB); r5 = FMA_VEC_PH(r5,r8,rA); r0 = FMA_VEC_PH(r0,r7,r9); r1 = FMA_VEC_PH(r1,r8,rA); r2 = FMA_VEC_PH(r2,r9,rB); r3 = FMA_VEC_PH(r3,rA,rC); r4 = FMA_VEC_PH(r4,rB,rD); r5 = FMA_VEC_PH(r5,rC,rE); r0 = FMA_VEC_PH(r0,rD,rF); r1 = FMA_VEC_PH(r1,rC,rE); r2 = FMA_VEC_PH(r2,rB,rD); r3 = FMA_VEC_PH(r3,rA,rC); r4 = FMA_VEC_PH(r4,r9,rB); r5 = FMA_VEC_PH(r5,r8,rA); r0 = FMA_VEC_PH(r0,r7,r9); r1 = FMA_VEC_PH(r1,r8,rA); r2 = FMA_VEC_PH(r2,r9,rB); r3 = FMA_VEC_PH(r3,rA,rC); r4 = FMA_VEC_PH(r4,rB,rD); r5 = FMA_VEC_PH(r5,rC,rE); r0 = FMA_VEC_PH(r0,rD,rF); r1 = FMA_VEC_PH(r1,rC,rE); r2 = FMA_VEC_PH(r2,rB,rD); r3 = FMA_VEC_PH(r3,rA,rC); r4 = FMA_VEC_PH(r4,r9,rB); r5 = FMA_VEC_PH(r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(48, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PH(r0,r1); r2 = ADD_VEC_PH(r2,r3); r4 = ADD_VEC_PH(r4,r5); r0 = ADD_VEC_PH(r0,r6); 
    r2 = ADD_VEC_PH(r2,r4);
    r0 = ADD_VEC_PH(r0,r2);

    half out = 0;
    HP_VEC_TYPE temp = r0;
    out = vaddh_f16(out,((half*)&temp)[0]);
    out = vaddh_f16(out,((half*)&temp)[1]);
    out = vaddh_f16(out,((half*)&temp)[2]);
    out = vaddh_f16(out,((half*)&temp)[3]);

    return out;
}

static void test_hp_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    half sum = 0.0;
    half scalar_sum = 0.0;

    if ( instr_per_loop == 12 ) {
        sum = vaddh_f16(sum,test_hp_mac_VEC_FMA_12( iterations, EventSet, fp ));
        scalar_sum = vaddh_f16(scalar_sum,test_hp_scalar_VEC_FMA_12( iterations, EventSet, NULL ));
    } else if ( instr_per_loop == 24 ) {
        sum = vaddh_f16(sum,test_hp_mac_VEC_FMA_24( iterations, EventSet, fp ));
        scalar_sum = vaddh_f16(scalar_sum,test_hp_scalar_VEC_FMA_24( iterations, EventSet, NULL ));
    } else if ( instr_per_loop == 48 ) {
        sum = vaddh_f16(sum,test_hp_mac_VEC_FMA_48( iterations, EventSet, fp ));
        scalar_sum = vaddh_f16(scalar_sum,test_hp_scalar_VEC_FMA_48( iterations, EventSet, NULL ));
    }

    if( vdivh_f16(sum,4.0) != scalar_sum ) {
        fprintf(stderr, "FMA: Inconsistent FLOP results detected!\n");
    }
}

#else

static float test_hp_mac_VEC_FMA_12( uint64 iterations, int EventSet, FILE *fp ){
    (void)iterations;
    (void)EventSet;
    if ( NULL != fp ) {
        papi_stop_and_print_placeholder(12, fp);
    }
    return 0.0;
}

static float test_hp_mac_VEC_FMA_24( uint64 iterations, int EventSet, FILE *fp ){
    (void)iterations;
    (void)EventSet;
    if ( NULL != fp ) {
        papi_stop_and_print_placeholder(24, fp);
    }
    return 0.0;
}

static float test_hp_mac_VEC_FMA_48( uint64 iterations, int EventSet, FILE *fp ){
    (void)iterations;
    (void)EventSet;
    if ( NULL != fp ) {
        papi_stop_and_print_placeholder(48, fp);
    }
    return 0.0;
}

static void test_hp_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    float sum = 0.0;
    float scalar_sum = 0.0;

    if ( instr_per_loop == 12 ) {
        sum += test_hp_mac_VEC_FMA_12( iterations, EventSet, fp );
        scalar_sum += test_hp_scalar_VEC_FMA_12( iterations, EventSet, NULL );
    } else if (
instr_per_loop == 24 ) { sum += test_hp_mac_VEC_FMA_24( iterations, EventSet, fp ); scalar_sum += test_hp_scalar_VEC_FMA_24( iterations, EventSet, NULL ); } else if ( instr_per_loop == 48 ) { sum += test_hp_mac_VEC_FMA_48( iterations, EventSet, fp ); scalar_sum += test_hp_scalar_VEC_FMA_48( iterations, EventSet, NULL ); } if( sum/4.0 != scalar_sum ) { fprintf(stderr, "FMA: Inconsistent FLOP results detected!\n"); } } #endif papi-papi-7-2-0-t/src/counter_analysis_toolkit/vec_fma_sp.c000066400000000000000000000237141502707512200240330ustar00rootroot00000000000000#include "vec_scalar_verify.h" static float test_sp_mac_VEC_FMA_12( uint64 iterations, int EventSet, FILE *fp ); static float test_sp_mac_VEC_FMA_24( uint64 iterations, int EventSet, FILE *fp ); static float test_sp_mac_VEC_FMA_48( uint64 iterations, int EventSet, FILE *fp ); static void test_sp_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); /* Wrapper functions of different vector widths. */ #if defined(X86_VEC_WIDTH_128B) void test_sp_x86_128B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { return test_sp_VEC_FMA( instr_per_loop, iterations, EventSet, fp ); } #elif defined(X86_VEC_WIDTH_512B) void test_sp_x86_512B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { return test_sp_VEC_FMA( instr_per_loop, iterations, EventSet, fp ); } #elif defined(X86_VEC_WIDTH_256B) void test_sp_x86_256B_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { return test_sp_VEC_FMA( instr_per_loop, iterations, EventSet, fp ); } #elif defined(ARM) void test_sp_arm_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { return test_sp_VEC_FMA( instr_per_loop, iterations, EventSet, fp ); } #elif defined(POWER) void test_sp_power_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { return test_sp_VEC_FMA( instr_per_loop, iterations, EventSet, fp ); } #endif 
/************************************/
/* Loop unrolling: 12 instructions  */
/************************************/
static float test_sp_mac_VEC_FMA_12( uint64 iterations, int EventSet, FILE *fp ){
    register SP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF;

    /* Generate starting data */
    r0 = SET_VEC_PS(0.01);
    r1 = SET_VEC_PS(0.02);
    r2 = SET_VEC_PS(0.03);
    r3 = SET_VEC_PS(0.04);
    r4 = SET_VEC_PS(0.05);
    r5 = SET_VEC_PS(0.06);
    r6 = SET_VEC_PS(0.07);
    r7 = SET_VEC_PS(0.08);
    r8 = SET_VEC_PS(0.09);
    r9 = SET_VEC_PS(0.10);
    rA = SET_VEC_PS(0.11);
    rB = SET_VEC_PS(0.12);
    rC = SET_VEC_PS(0.13);
    rD = SET_VEC_PS(0.14);
    rE = SET_VEC_PS(0.15);
    rF = SET_VEC_PS(0.16);

    /* Start PAPI counters */
    if ( PAPI_start( EventSet ) != PAPI_OK ) {
        return -1;
    }

    uint64 c = 0;
    while (c < iterations){
        size_t i = 0;
        while (i < ITER){
            /* The performance critical part */
            r0 = FMA_VEC_PS(r0,r7,r9);
            r1 = FMA_VEC_PS(r1,r8,rA);
            r2 = FMA_VEC_PS(r2,r9,rB);
            r3 = FMA_VEC_PS(r3,rA,rC);
            r4 = FMA_VEC_PS(r4,rB,rD);
            r5 = FMA_VEC_PS(r5,rC,rE);
            r0 = FMA_VEC_PS(r0,rD,rF);
            r1 = FMA_VEC_PS(r1,rC,rE);
            r2 = FMA_VEC_PS(r2,rB,rD);
            r3 = FMA_VEC_PS(r3,rA,rC);
            r4 = FMA_VEC_PS(r4,r9,rB);
            r5 = FMA_VEC_PS(r5,r8,rA);
            i++;
        }
        c++;
    }

    /* Stop PAPI counters */
    papi_stop_and_print(12, EventSet, fp);

    /* Use data so that compiler does not eliminate it when using -O2 */
    r0 = ADD_VEC_PS(r0,r1);
    r2 = ADD_VEC_PS(r2,r3);
    r4 = ADD_VEC_PS(r4,r5);
    r0 = ADD_VEC_PS(r0,r6);
    r2 = ADD_VEC_PS(r2,r4);
    r0 = ADD_VEC_PS(r0,r2);

    float out = 0;
    SP_VEC_TYPE temp = r0;
    out += ((float*)&temp)[0];
    out += ((float*)&temp)[1];
    out += ((float*)&temp)[2];
    out += ((float*)&temp)[3];

    return out;
}

/************************************/
/* Loop unrolling: 24 instructions */
/************************************/
static float test_sp_mac_VEC_FMA_24( uint64 iterations, int EventSet, FILE *fp ){
    register SP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF;

    /* Generate starting data */
    r0 = SET_VEC_PS(0.01);
    r1 = SET_VEC_PS(0.02);
    r2 = SET_VEC_PS(0.03);
    r3 =
SET_VEC_PS(0.04); r4 = SET_VEC_PS(0.05); r5 = SET_VEC_PS(0.06); r6 = SET_VEC_PS(0.07); r7 = SET_VEC_PS(0.08); r8 = SET_VEC_PS(0.09); r9 = SET_VEC_PS(0.10); rA = SET_VEC_PS(0.11); rB = SET_VEC_PS(0.12); rC = SET_VEC_PS(0.13); rD = SET_VEC_PS(0.14); rE = SET_VEC_PS(0.15); rF = SET_VEC_PS(0.16); /* Start PAPI counters */ if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = FMA_VEC_PS(r0,r7,r9); r1 = FMA_VEC_PS(r1,r8,rA); r2 = FMA_VEC_PS(r2,r9,rB); r3 = FMA_VEC_PS(r3,rA,rC); r4 = FMA_VEC_PS(r4,rB,rD); r5 = FMA_VEC_PS(r5,rC,rE); r0 = FMA_VEC_PS(r0,rD,rF); r1 = FMA_VEC_PS(r1,rC,rE); r2 = FMA_VEC_PS(r2,rB,rD); r3 = FMA_VEC_PS(r3,rA,rC); r4 = FMA_VEC_PS(r4,r9,rB); r5 = FMA_VEC_PS(r5,r8,rA); r0 = FMA_VEC_PS(r0,r7,r9); r1 = FMA_VEC_PS(r1,r8,rA); r2 = FMA_VEC_PS(r2,r9,rB); r3 = FMA_VEC_PS(r3,rA,rC); r4 = FMA_VEC_PS(r4,rB,rD); r5 = FMA_VEC_PS(r5,rC,rE); r0 = FMA_VEC_PS(r0,rD,rF); r1 = FMA_VEC_PS(r1,rC,rE); r2 = FMA_VEC_PS(r2,rB,rD); r3 = FMA_VEC_PS(r3,rA,rC); r4 = FMA_VEC_PS(r4,r9,rB); r5 = FMA_VEC_PS(r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(24, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PS(r0,r1); r2 = ADD_VEC_PS(r2,r3); r4 = ADD_VEC_PS(r4,r5); r0 = ADD_VEC_PS(r0,r6); r2 = ADD_VEC_PS(r2,r4); r0 = ADD_VEC_PS(r0,r2); float out = 0; SP_VEC_TYPE temp = r0; out += ((float*)&temp)[0]; out += ((float*)&temp)[1]; out += ((float*)&temp)[2]; out += ((float*)&temp)[3]; return out; } /************************************/ /* Loop unrolling: 48 instructions */ /************************************/ static float test_sp_mac_VEC_FMA_48( uint64 iterations, int EventSet, FILE *fp ){ register SP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_PS(0.01); r1 = SET_VEC_PS(0.02); r2 = SET_VEC_PS(0.03); r3 = SET_VEC_PS(0.04); r4 = SET_VEC_PS(0.05); r5 = 
SET_VEC_PS(0.06); r6 = SET_VEC_PS(0.07); r7 = SET_VEC_PS(0.08); r8 = SET_VEC_PS(0.09); r9 = SET_VEC_PS(0.10); rA = SET_VEC_PS(0.11); rB = SET_VEC_PS(0.12); rC = SET_VEC_PS(0.13); rD = SET_VEC_PS(0.14); rE = SET_VEC_PS(0.15); rF = SET_VEC_PS(0.16); /* Start PAPI counters */ if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = FMA_VEC_PS(r0,r7,r9); r1 = FMA_VEC_PS(r1,r8,rA); r2 = FMA_VEC_PS(r2,r9,rB); r3 = FMA_VEC_PS(r3,rA,rC); r4 = FMA_VEC_PS(r4,rB,rD); r5 = FMA_VEC_PS(r5,rC,rE); r0 = FMA_VEC_PS(r0,rD,rF); r1 = FMA_VEC_PS(r1,rC,rE); r2 = FMA_VEC_PS(r2,rB,rD); r3 = FMA_VEC_PS(r3,rA,rC); r4 = FMA_VEC_PS(r4,r9,rB); r5 = FMA_VEC_PS(r5,r8,rA); r0 = FMA_VEC_PS(r0,r7,r9); r1 = FMA_VEC_PS(r1,r8,rA); r2 = FMA_VEC_PS(r2,r9,rB); r3 = FMA_VEC_PS(r3,rA,rC); r4 = FMA_VEC_PS(r4,rB,rD); r5 = FMA_VEC_PS(r5,rC,rE); r0 = FMA_VEC_PS(r0,rD,rF); r1 = FMA_VEC_PS(r1,rC,rE); r2 = FMA_VEC_PS(r2,rB,rD); r3 = FMA_VEC_PS(r3,rA,rC); r4 = FMA_VEC_PS(r4,r9,rB); r5 = FMA_VEC_PS(r5,r8,rA); r0 = FMA_VEC_PS(r0,r7,r9); r1 = FMA_VEC_PS(r1,r8,rA); r2 = FMA_VEC_PS(r2,r9,rB); r3 = FMA_VEC_PS(r3,rA,rC); r4 = FMA_VEC_PS(r4,rB,rD); r5 = FMA_VEC_PS(r5,rC,rE); r0 = FMA_VEC_PS(r0,rD,rF); r1 = FMA_VEC_PS(r1,rC,rE); r2 = FMA_VEC_PS(r2,rB,rD); r3 = FMA_VEC_PS(r3,rA,rC); r4 = FMA_VEC_PS(r4,r9,rB); r5 = FMA_VEC_PS(r5,r8,rA); r0 = FMA_VEC_PS(r0,r7,r9); r1 = FMA_VEC_PS(r1,r8,rA); r2 = FMA_VEC_PS(r2,r9,rB); r3 = FMA_VEC_PS(r3,rA,rC); r4 = FMA_VEC_PS(r4,rB,rD); r5 = FMA_VEC_PS(r5,rC,rE); r0 = FMA_VEC_PS(r0,rD,rF); r1 = FMA_VEC_PS(r1,rC,rE); r2 = FMA_VEC_PS(r2,rB,rD); r3 = FMA_VEC_PS(r3,rA,rC); r4 = FMA_VEC_PS(r4,r9,rB); r5 = FMA_VEC_PS(r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(48, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PS(r0,r1); r2 = ADD_VEC_PS(r2,r3); r4 = ADD_VEC_PS(r4,r5); r0 = ADD_VEC_PS(r0,r6); r2 = ADD_VEC_PS(r2,r4); 
r0 = ADD_VEC_PS(r0,r2); float out = 0; SP_VEC_TYPE temp = r0; out += ((float*)&temp)[0]; out += ((float*)&temp)[1]; out += ((float*)&temp)[2]; out += ((float*)&temp)[3]; return out; } static void test_sp_VEC_FMA( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { float sum = 0.0; float scalar_sum = 0.0; if ( instr_per_loop == 12 ) { sum += test_sp_mac_VEC_FMA_12( iterations, EventSet, fp ); scalar_sum += test_sp_scalar_VEC_FMA_12( iterations, EventSet, NULL ); } else if ( instr_per_loop == 24 ) { sum += test_sp_mac_VEC_FMA_24( iterations, EventSet, fp ); scalar_sum += test_sp_scalar_VEC_FMA_24( iterations, EventSet, NULL ); } else if ( instr_per_loop == 48 ) { sum += test_sp_mac_VEC_FMA_48( iterations, EventSet, fp ); scalar_sum += test_sp_scalar_VEC_FMA_48( iterations, EventSet, NULL ); } if( sum/4.0 != scalar_sum ) { fprintf(stderr, "FMA: Inconsistent FLOP results detected! %f vs %f\n", sum/4.0, scalar_sum); } } papi-papi-7-2-0-t/src/counter_analysis_toolkit/vec_nonfma_dp.c000066400000000000000000000314311502707512200245220ustar00rootroot00000000000000#include "vec_scalar_verify.h" static double test_dp_mac_VEC_24( uint64 iterations, int EventSet, FILE *fp ); static double test_dp_mac_VEC_48( uint64 iterations, int EventSet, FILE *fp ); static double test_dp_mac_VEC_96( uint64 iterations, int EventSet, FILE *fp ); static void test_dp_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ); /* Wrapper functions of different vector widths. 
*/
#if defined(X86_VEC_WIDTH_128B)
void test_dp_x86_128B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    return test_dp_VEC( instr_per_loop, iterations, EventSet, fp );
}
#elif defined(X86_VEC_WIDTH_512B)
void test_dp_x86_512B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    return test_dp_VEC( instr_per_loop, iterations, EventSet, fp );
}
#elif defined(X86_VEC_WIDTH_256B)
void test_dp_x86_256B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    return test_dp_VEC( instr_per_loop, iterations, EventSet, fp );
}
#elif defined(ARM)
void test_dp_arm_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    return test_dp_VEC( instr_per_loop, iterations, EventSet, fp );
}
#elif defined(POWER)
void test_dp_power_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    return test_dp_VEC( instr_per_loop, iterations, EventSet, fp );
}
#endif

/************************************/
/* Loop unrolling: 24 instructions */
/************************************/
static double test_dp_mac_VEC_24( uint64 iterations, int EventSet, FILE *fp ){
    register DP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF;

    /* Generate starting data */
    r0 = SET_VEC_PD(0.01);
    r1 = SET_VEC_PD(0.02);
    r2 = SET_VEC_PD(0.03);
    r3 = SET_VEC_PD(0.04);
    r4 = SET_VEC_PD(0.05);
    r5 = SET_VEC_PD(0.06);
    r6 = SET_VEC_PD(0.07);
    r7 = SET_VEC_PD(0.08);
    r8 = SET_VEC_PD(0.09);
    r9 = SET_VEC_PD(0.10);
    rA = SET_VEC_PD(0.11);
    rB = SET_VEC_PD(0.12);
    rC = SET_VEC_PD(0.13);
    rD = SET_VEC_PD(0.14);
    rE = SET_VEC_PD(0.15);
    rF = SET_VEC_PD(0.16);

    /* Start PAPI counters */
    if ( PAPI_start( EventSet ) != PAPI_OK ) {
        return -1;
    }

    uint64 c = 0;
    while (c < iterations){
        size_t i = 0;
        while (i < ITER){
            /* The performance critical part */
            r0 = MUL_VEC_PD(r0,rC);
            r1 = ADD_VEC_PD(r1,rD);
            r2 = MUL_VEC_PD(r2,rE);
            r3 = ADD_VEC_PD(r3,rF);
            r4 = MUL_VEC_PD(r4,rC);
            r5 = ADD_VEC_PD(r5,rD);
            r6 = MUL_VEC_PD(r6,rE);
            r7 = ADD_VEC_PD(r7,rF);
            r8 =
MUL_VEC_PD(r8,rC); r9 = ADD_VEC_PD(r9,rD); rA = MUL_VEC_PD(rA,rE); rB = ADD_VEC_PD(rB,rF); r0 = ADD_VEC_PD(r0,rF); r1 = MUL_VEC_PD(r1,rE); r2 = ADD_VEC_PD(r2,rD); r3 = MUL_VEC_PD(r3,rC); r4 = ADD_VEC_PD(r4,rF); r5 = MUL_VEC_PD(r5,rE); r6 = ADD_VEC_PD(r6,rD); r7 = MUL_VEC_PD(r7,rC); r8 = ADD_VEC_PD(r8,rF); r9 = MUL_VEC_PD(r9,rE); rA = ADD_VEC_PD(rA,rD); rB = MUL_VEC_PD(rB,rC); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(24, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PD(r0,r1); r2 = ADD_VEC_PD(r2,r3); r4 = ADD_VEC_PD(r4,r5); r6 = ADD_VEC_PD(r6,r7); r8 = ADD_VEC_PD(r8,r9); rA = ADD_VEC_PD(rA,rB); r0 = ADD_VEC_PD(r0,r2); r4 = ADD_VEC_PD(r4,r6); r8 = ADD_VEC_PD(r8,rA); r0 = ADD_VEC_PD(r0,r4); r0 = ADD_VEC_PD(r0,r8); double out = 0; DP_VEC_TYPE temp = r0; out += ((double*)&temp)[0]; out += ((double*)&temp)[1]; return out; } /************************************/ /* Loop unrolling: 48 instructions */ /************************************/ static double test_dp_mac_VEC_48( uint64 iterations, int EventSet, FILE *fp ){ register DP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_PD(0.01); r1 = SET_VEC_PD(0.02); r2 = SET_VEC_PD(0.03); r3 = SET_VEC_PD(0.04); r4 = SET_VEC_PD(0.05); r5 = SET_VEC_PD(0.06); r6 = SET_VEC_PD(0.07); r7 = SET_VEC_PD(0.08); r8 = SET_VEC_PD(0.09); r9 = SET_VEC_PD(0.10); rA = SET_VEC_PD(0.11); rB = SET_VEC_PD(0.12); rC = SET_VEC_PD(0.13); rD = SET_VEC_PD(0.14); rE = SET_VEC_PD(0.15); rF = SET_VEC_PD(0.16); /* Start PAPI counters */ if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = MUL_VEC_PD(r0,rC); r1 = ADD_VEC_PD(r1,rD); r2 = MUL_VEC_PD(r2,rE); r3 = ADD_VEC_PD(r3,rF); r4 = MUL_VEC_PD(r4,rC); r5 = ADD_VEC_PD(r5,rD); r6 = MUL_VEC_PD(r6,rE); r7 = ADD_VEC_PD(r7,rF); r8 = MUL_VEC_PD(r8,rC); r9 = ADD_VEC_PD(r9,rD); rA = 
MUL_VEC_PD(rA,rE); rB = ADD_VEC_PD(rB,rF); r0 = ADD_VEC_PD(r0,rF); r1 = MUL_VEC_PD(r1,rE); r2 = ADD_VEC_PD(r2,rD); r3 = MUL_VEC_PD(r3,rC); r4 = ADD_VEC_PD(r4,rF); r5 = MUL_VEC_PD(r5,rE); r6 = ADD_VEC_PD(r6,rD); r7 = MUL_VEC_PD(r7,rC); r8 = ADD_VEC_PD(r8,rF); r9 = MUL_VEC_PD(r9,rE); rA = ADD_VEC_PD(rA,rD); rB = MUL_VEC_PD(rB,rC); r0 = MUL_VEC_PD(r0,rC); r1 = ADD_VEC_PD(r1,rD); r2 = MUL_VEC_PD(r2,rE); r3 = ADD_VEC_PD(r3,rF); r4 = MUL_VEC_PD(r4,rC); r5 = ADD_VEC_PD(r5,rD); r6 = MUL_VEC_PD(r6,rE); r7 = ADD_VEC_PD(r7,rF); r8 = MUL_VEC_PD(r8,rC); r9 = ADD_VEC_PD(r9,rD); rA = MUL_VEC_PD(rA,rE); rB = ADD_VEC_PD(rB,rF); r0 = ADD_VEC_PD(r0,rF); r1 = MUL_VEC_PD(r1,rE); r2 = ADD_VEC_PD(r2,rD); r3 = MUL_VEC_PD(r3,rC); r4 = ADD_VEC_PD(r4,rF); r5 = MUL_VEC_PD(r5,rE); r6 = ADD_VEC_PD(r6,rD); r7 = MUL_VEC_PD(r7,rC); r8 = ADD_VEC_PD(r8,rF); r9 = MUL_VEC_PD(r9,rE); rA = ADD_VEC_PD(rA,rD); rB = MUL_VEC_PD(rB,rC); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(48, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PD(r0,r1); r2 = ADD_VEC_PD(r2,r3); r4 = ADD_VEC_PD(r4,r5); r6 = ADD_VEC_PD(r6,r7); r8 = ADD_VEC_PD(r8,r9); rA = ADD_VEC_PD(rA,rB); r0 = ADD_VEC_PD(r0,r2); r4 = ADD_VEC_PD(r4,r6); r8 = ADD_VEC_PD(r8,rA); r0 = ADD_VEC_PD(r0,r4); r0 = ADD_VEC_PD(r0,r8); double out = 0; DP_VEC_TYPE temp = r0; out += ((double*)&temp)[0]; out += ((double*)&temp)[1]; return out; } /************************************/ /* Loop unrolling: 96 instructions */ /************************************/ static double test_dp_mac_VEC_96( uint64 iterations, int EventSet, FILE *fp ){ register DP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_PD(0.01); r1 = SET_VEC_PD(0.02); r2 = SET_VEC_PD(0.03); r3 = SET_VEC_PD(0.04); r4 = SET_VEC_PD(0.05); r5 = SET_VEC_PD(0.06); r6 = SET_VEC_PD(0.07); r7 = SET_VEC_PD(0.08); r8 = SET_VEC_PD(0.09); r9 = SET_VEC_PD(0.10); rA = SET_VEC_PD(0.11); rB = SET_VEC_PD(0.12); rC 
= SET_VEC_PD(0.13); rD = SET_VEC_PD(0.14); rE = SET_VEC_PD(0.15); rF = SET_VEC_PD(0.16); /* Start PAPI counters */ if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = MUL_VEC_PD(r0,rC); r1 = ADD_VEC_PD(r1,rD); r2 = MUL_VEC_PD(r2,rE); r3 = ADD_VEC_PD(r3,rF); r4 = MUL_VEC_PD(r4,rC); r5 = ADD_VEC_PD(r5,rD); r6 = MUL_VEC_PD(r6,rE); r7 = ADD_VEC_PD(r7,rF); r8 = MUL_VEC_PD(r8,rC); r9 = ADD_VEC_PD(r9,rD); rA = MUL_VEC_PD(rA,rE); rB = ADD_VEC_PD(rB,rF); r0 = ADD_VEC_PD(r0,rF); r1 = MUL_VEC_PD(r1,rE); r2 = ADD_VEC_PD(r2,rD); r3 = MUL_VEC_PD(r3,rC); r4 = ADD_VEC_PD(r4,rF); r5 = MUL_VEC_PD(r5,rE); r6 = ADD_VEC_PD(r6,rD); r7 = MUL_VEC_PD(r7,rC); r8 = ADD_VEC_PD(r8,rF); r9 = MUL_VEC_PD(r9,rE); rA = ADD_VEC_PD(rA,rD); rB = MUL_VEC_PD(rB,rC); r0 = MUL_VEC_PD(r0,rC); r1 = ADD_VEC_PD(r1,rD); r2 = MUL_VEC_PD(r2,rE); r3 = ADD_VEC_PD(r3,rF); r4 = MUL_VEC_PD(r4,rC); r5 = ADD_VEC_PD(r5,rD); r6 = MUL_VEC_PD(r6,rE); r7 = ADD_VEC_PD(r7,rF); r8 = MUL_VEC_PD(r8,rC); r9 = ADD_VEC_PD(r9,rD); rA = MUL_VEC_PD(rA,rE); rB = ADD_VEC_PD(rB,rF); r0 = ADD_VEC_PD(r0,rF); r1 = MUL_VEC_PD(r1,rE); r2 = ADD_VEC_PD(r2,rD); r3 = MUL_VEC_PD(r3,rC); r4 = ADD_VEC_PD(r4,rF); r5 = MUL_VEC_PD(r5,rE); r6 = ADD_VEC_PD(r6,rD); r7 = MUL_VEC_PD(r7,rC); r8 = ADD_VEC_PD(r8,rF); r9 = MUL_VEC_PD(r9,rE); rA = ADD_VEC_PD(rA,rD); rB = MUL_VEC_PD(rB,rC); r0 = MUL_VEC_PD(r0,rC); r1 = ADD_VEC_PD(r1,rD); r2 = MUL_VEC_PD(r2,rE); r3 = ADD_VEC_PD(r3,rF); r4 = MUL_VEC_PD(r4,rC); r5 = ADD_VEC_PD(r5,rD); r6 = MUL_VEC_PD(r6,rE); r7 = ADD_VEC_PD(r7,rF); r8 = MUL_VEC_PD(r8,rC); r9 = ADD_VEC_PD(r9,rD); rA = MUL_VEC_PD(rA,rE); rB = ADD_VEC_PD(rB,rF); r0 = ADD_VEC_PD(r0,rF); r1 = MUL_VEC_PD(r1,rE); r2 = ADD_VEC_PD(r2,rD); r3 = MUL_VEC_PD(r3,rC); r4 = ADD_VEC_PD(r4,rF); r5 = MUL_VEC_PD(r5,rE); r6 = ADD_VEC_PD(r6,rD); r7 = MUL_VEC_PD(r7,rC); r8 = ADD_VEC_PD(r8,rF); r9 = MUL_VEC_PD(r9,rE); rA = ADD_VEC_PD(rA,rD); rB = 
MUL_VEC_PD(rB,rC); r0 = MUL_VEC_PD(r0,rC); r1 = ADD_VEC_PD(r1,rD); r2 = MUL_VEC_PD(r2,rE); r3 = ADD_VEC_PD(r3,rF); r4 = MUL_VEC_PD(r4,rC); r5 = ADD_VEC_PD(r5,rD); r6 = MUL_VEC_PD(r6,rE); r7 = ADD_VEC_PD(r7,rF); r8 = MUL_VEC_PD(r8,rC); r9 = ADD_VEC_PD(r9,rD); rA = MUL_VEC_PD(rA,rE); rB = ADD_VEC_PD(rB,rF); r0 = ADD_VEC_PD(r0,rF); r1 = MUL_VEC_PD(r1,rE); r2 = ADD_VEC_PD(r2,rD); r3 = MUL_VEC_PD(r3,rC); r4 = ADD_VEC_PD(r4,rF); r5 = MUL_VEC_PD(r5,rE); r6 = ADD_VEC_PD(r6,rD); r7 = MUL_VEC_PD(r7,rC); r8 = ADD_VEC_PD(r8,rF); r9 = MUL_VEC_PD(r9,rE); rA = ADD_VEC_PD(rA,rD); rB = MUL_VEC_PD(rB,rC); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(96, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PD(r0,r1); r2 = ADD_VEC_PD(r2,r3); r4 = ADD_VEC_PD(r4,r5); r6 = ADD_VEC_PD(r6,r7); r8 = ADD_VEC_PD(r8,r9); rA = ADD_VEC_PD(rA,rB); r0 = ADD_VEC_PD(r0,r2); r4 = ADD_VEC_PD(r4,r6); r8 = ADD_VEC_PD(r8,rA); r0 = ADD_VEC_PD(r0,r4); r0 = ADD_VEC_PD(r0,r8); double out = 0; DP_VEC_TYPE temp = r0; out += ((double*)&temp)[0]; out += ((double*)&temp)[1]; return out; } static void test_dp_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { double sum = 0.0; double scalar_sum = 0.0; if ( instr_per_loop == 24 ) { sum += test_dp_mac_VEC_24( iterations, EventSet, fp ); scalar_sum += test_dp_scalar_VEC_24( iterations, EventSet, NULL ); } else if ( instr_per_loop == 48 ) { sum += test_dp_mac_VEC_48( iterations, EventSet, fp ); scalar_sum += test_dp_scalar_VEC_48( iterations, EventSet, NULL ); } else if ( instr_per_loop == 96 ) { sum += test_dp_mac_VEC_96( iterations, EventSet, fp ); scalar_sum += test_dp_scalar_VEC_96( iterations, EventSet, NULL ); } if( sum/2.0 != scalar_sum ) { fprintf(stderr, "Inconsistent FLOP results detected!\n"); } } papi-papi-7-2-0-t/src/counter_analysis_toolkit/vec_nonfma_hp.c000066400000000000000000000356271502707512200245410ustar00rootroot00000000000000#include "vec_scalar_verify.h" #if 
defined(ARM)
static half test_hp_mac_VEC_24( uint64 iterations, int EventSet, FILE *fp );
static half test_hp_mac_VEC_48( uint64 iterations, int EventSet, FILE *fp );
static half test_hp_mac_VEC_96( uint64 iterations, int EventSet, FILE *fp );
#else
static float test_hp_mac_VEC_24( uint64 iterations, int EventSet, FILE *fp );
static float test_hp_mac_VEC_48( uint64 iterations, int EventSet, FILE *fp );
static float test_hp_mac_VEC_96( uint64 iterations, int EventSet, FILE *fp );
#endif
static void test_hp_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp );

/* Wrapper functions of different vector widths. */
#if defined(X86_VEC_WIDTH_128B)
void test_hp_x86_128B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    return test_hp_VEC( instr_per_loop, iterations, EventSet, fp );
}
#elif defined(X86_VEC_WIDTH_512B)
void test_hp_x86_512B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    return test_hp_VEC( instr_per_loop, iterations, EventSet, fp );
}
#elif defined(X86_VEC_WIDTH_256B)
void test_hp_x86_256B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    return test_hp_VEC( instr_per_loop, iterations, EventSet, fp );
}
#elif defined(ARM)
void test_hp_arm_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    return test_hp_VEC( instr_per_loop, iterations, EventSet, fp );
}
#elif defined(POWER)
void test_hp_power_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    return test_hp_VEC( instr_per_loop, iterations, EventSet, fp );
}
#endif

#if defined(ARM)
/************************************/
/* Loop unrolling: 24 instructions */
/************************************/
static half test_hp_mac_VEC_24( uint64 iterations, int EventSet, FILE *fp ){
    register HP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF;

    /* Generate starting data */
    r0 = SET_VEC_PH(0.01);
    r1 = SET_VEC_PH(0.02);
    r2 = SET_VEC_PH(0.03);
    r3 = SET_VEC_PH(0.04);
    r4 = SET_VEC_PH(0.05);
    r5 =
SET_VEC_PH(0.06); r6 = SET_VEC_PH(0.07); r7 = SET_VEC_PH(0.08); r8 = SET_VEC_PH(0.09); r9 = SET_VEC_PH(0.10); rA = SET_VEC_PH(0.11); rB = SET_VEC_PH(0.12); rC = SET_VEC_PH(0.13); rD = SET_VEC_PH(0.14); rE = SET_VEC_PH(0.15); rF = SET_VEC_PH(0.16); /* Start PAPI counters */ if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = MUL_VEC_PH(r0,rC); r1 = ADD_VEC_PH(r1,rD); r2 = MUL_VEC_PH(r2,rE); r3 = ADD_VEC_PH(r3,rF); r4 = MUL_VEC_PH(r4,rC); r5 = ADD_VEC_PH(r5,rD); r6 = MUL_VEC_PH(r6,rE); r7 = ADD_VEC_PH(r7,rF); r8 = MUL_VEC_PH(r8,rC); r9 = ADD_VEC_PH(r9,rD); rA = MUL_VEC_PH(rA,rE); rB = ADD_VEC_PH(rB,rF); r0 = ADD_VEC_PH(r0,rF); r1 = MUL_VEC_PH(r1,rE); r2 = ADD_VEC_PH(r2,rD); r3 = MUL_VEC_PH(r3,rC); r4 = ADD_VEC_PH(r4,rF); r5 = MUL_VEC_PH(r5,rE); r6 = ADD_VEC_PH(r6,rD); r7 = MUL_VEC_PH(r7,rC); r8 = ADD_VEC_PH(r8,rF); r9 = MUL_VEC_PH(r9,rE); rA = ADD_VEC_PH(rA,rD); rB = MUL_VEC_PH(rB,rC); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(24, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PH(r0,r1); r2 = ADD_VEC_PH(r2,r3); r4 = ADD_VEC_PH(r4,r5); r6 = ADD_VEC_PH(r6,r7); r8 = ADD_VEC_PH(r8,r9); rA = ADD_VEC_PH(rA,rB); r0 = ADD_VEC_PH(r0,r2); r4 = ADD_VEC_PH(r4,r6); r8 = ADD_VEC_PH(r8,rA); r0 = ADD_VEC_PH(r0,r4); r0 = ADD_VEC_PH(r0,r8); half out = 0; HP_VEC_TYPE temp = r0; out = vaddh_f16(out,((half*)&temp)[0]); out = vaddh_f16(out,((half*)&temp)[1]); out = vaddh_f16(out,((half*)&temp)[2]); out = vaddh_f16(out,((half*)&temp)[3]); return out; } /************************************/ /* Loop unrolling: 48 instructions */ /************************************/ static half test_hp_mac_VEC_48( uint64 iterations, int EventSet, FILE *fp ){ register HP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_PH(0.01); r1 = SET_VEC_PH(0.02); r2 = SET_VEC_PH(0.03); r3 = 
SET_VEC_PH(0.04); r4 = SET_VEC_PH(0.05); r5 = SET_VEC_PH(0.06); r6 = SET_VEC_PH(0.07); r7 = SET_VEC_PH(0.08); r8 = SET_VEC_PH(0.09); r9 = SET_VEC_PH(0.10); rA = SET_VEC_PH(0.11); rB = SET_VEC_PH(0.12); rC = SET_VEC_PH(0.13); rD = SET_VEC_PH(0.14); rE = SET_VEC_PH(0.15); rF = SET_VEC_PH(0.16); /* Start PAPI counters */ if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = MUL_VEC_PH(r0,rC); r1 = ADD_VEC_PH(r1,rD); r2 = MUL_VEC_PH(r2,rE); r3 = ADD_VEC_PH(r3,rF); r4 = MUL_VEC_PH(r4,rC); r5 = ADD_VEC_PH(r5,rD); r6 = MUL_VEC_PH(r6,rE); r7 = ADD_VEC_PH(r7,rF); r8 = MUL_VEC_PH(r8,rC); r9 = ADD_VEC_PH(r9,rD); rA = MUL_VEC_PH(rA,rE); rB = ADD_VEC_PH(rB,rF); r0 = ADD_VEC_PH(r0,rF); r1 = MUL_VEC_PH(r1,rE); r2 = ADD_VEC_PH(r2,rD); r3 = MUL_VEC_PH(r3,rC); r4 = ADD_VEC_PH(r4,rF); r5 = MUL_VEC_PH(r5,rE); r6 = ADD_VEC_PH(r6,rD); r7 = MUL_VEC_PH(r7,rC); r8 = ADD_VEC_PH(r8,rF); r9 = MUL_VEC_PH(r9,rE); rA = ADD_VEC_PH(rA,rD); rB = MUL_VEC_PH(rB,rC); r0 = MUL_VEC_PH(r0,rC); r1 = ADD_VEC_PH(r1,rD); r2 = MUL_VEC_PH(r2,rE); r3 = ADD_VEC_PH(r3,rF); r4 = MUL_VEC_PH(r4,rC); r5 = ADD_VEC_PH(r5,rD); r6 = MUL_VEC_PH(r6,rE); r7 = ADD_VEC_PH(r7,rF); r8 = MUL_VEC_PH(r8,rC); r9 = ADD_VEC_PH(r9,rD); rA = MUL_VEC_PH(rA,rE); rB = ADD_VEC_PH(rB,rF); r0 = ADD_VEC_PH(r0,rF); r1 = MUL_VEC_PH(r1,rE); r2 = ADD_VEC_PH(r2,rD); r3 = MUL_VEC_PH(r3,rC); r4 = ADD_VEC_PH(r4,rF); r5 = MUL_VEC_PH(r5,rE); r6 = ADD_VEC_PH(r6,rD); r7 = MUL_VEC_PH(r7,rC); r8 = ADD_VEC_PH(r8,rF); r9 = MUL_VEC_PH(r9,rE); rA = ADD_VEC_PH(rA,rD); rB = MUL_VEC_PH(rB,rC); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(48, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PH(r0,r1); r2 = ADD_VEC_PH(r2,r3); r4 = ADD_VEC_PH(r4,r5); r6 = ADD_VEC_PH(r6,r7); r8 = ADD_VEC_PH(r8,r9); rA = ADD_VEC_PH(rA,rB); r0 = ADD_VEC_PH(r0,r2); r4 = ADD_VEC_PH(r4,r6); r8 = ADD_VEC_PH(r8,rA); r0 
= ADD_VEC_PH(r0,r4); r0 = ADD_VEC_PH(r0,r8); half out = 0; HP_VEC_TYPE temp = r0; out = vaddh_f16(out,((half*)&temp)[0]); out = vaddh_f16(out,((half*)&temp)[1]); out = vaddh_f16(out,((half*)&temp)[2]); out = vaddh_f16(out,((half*)&temp)[3]); return out; } /************************************/ /* Loop unrolling: 96 instructions */ /************************************/ static half test_hp_mac_VEC_96( uint64 iterations, int EventSet, FILE *fp ){ register HP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_PH(0.01); r1 = SET_VEC_PH(0.02); r2 = SET_VEC_PH(0.03); r3 = SET_VEC_PH(0.04); r4 = SET_VEC_PH(0.05); r5 = SET_VEC_PH(0.06); r6 = SET_VEC_PH(0.07); r7 = SET_VEC_PH(0.08); r8 = SET_VEC_PH(0.09); r9 = SET_VEC_PH(0.10); rA = SET_VEC_PH(0.11); rB = SET_VEC_PH(0.12); rC = SET_VEC_PH(0.13); rD = SET_VEC_PH(0.14); rE = SET_VEC_PH(0.15); rF = SET_VEC_PH(0.16); /* Start PAPI counters */ if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = MUL_VEC_PH(r0,rC); r1 = ADD_VEC_PH(r1,rD); r2 = MUL_VEC_PH(r2,rE); r3 = ADD_VEC_PH(r3,rF); r4 = MUL_VEC_PH(r4,rC); r5 = ADD_VEC_PH(r5,rD); r6 = MUL_VEC_PH(r6,rE); r7 = ADD_VEC_PH(r7,rF); r8 = MUL_VEC_PH(r8,rC); r9 = ADD_VEC_PH(r9,rD); rA = MUL_VEC_PH(rA,rE); rB = ADD_VEC_PH(rB,rF); r0 = ADD_VEC_PH(r0,rF); r1 = MUL_VEC_PH(r1,rE); r2 = ADD_VEC_PH(r2,rD); r3 = MUL_VEC_PH(r3,rC); r4 = ADD_VEC_PH(r4,rF); r5 = MUL_VEC_PH(r5,rE); r6 = ADD_VEC_PH(r6,rD); r7 = MUL_VEC_PH(r7,rC); r8 = ADD_VEC_PH(r8,rF); r9 = MUL_VEC_PH(r9,rE); rA = ADD_VEC_PH(rA,rD); rB = MUL_VEC_PH(rB,rC); r0 = MUL_VEC_PH(r0,rC); r1 = ADD_VEC_PH(r1,rD); r2 = MUL_VEC_PH(r2,rE); r3 = ADD_VEC_PH(r3,rF); r4 = MUL_VEC_PH(r4,rC); r5 = ADD_VEC_PH(r5,rD); r6 = MUL_VEC_PH(r6,rE); r7 = ADD_VEC_PH(r7,rF); r8 = MUL_VEC_PH(r8,rC); r9 = ADD_VEC_PH(r9,rD); rA = MUL_VEC_PH(rA,rE); rB = ADD_VEC_PH(rB,rF); r0 = ADD_VEC_PH(r0,rF); r1 = 
MUL_VEC_PH(r1,rE); r2 = ADD_VEC_PH(r2,rD); r3 = MUL_VEC_PH(r3,rC); r4 = ADD_VEC_PH(r4,rF); r5 = MUL_VEC_PH(r5,rE); r6 = ADD_VEC_PH(r6,rD); r7 = MUL_VEC_PH(r7,rC); r8 = ADD_VEC_PH(r8,rF); r9 = MUL_VEC_PH(r9,rE); rA = ADD_VEC_PH(rA,rD); rB = MUL_VEC_PH(rB,rC); r0 = MUL_VEC_PH(r0,rC); r1 = ADD_VEC_PH(r1,rD); r2 = MUL_VEC_PH(r2,rE); r3 = ADD_VEC_PH(r3,rF); r4 = MUL_VEC_PH(r4,rC); r5 = ADD_VEC_PH(r5,rD); r6 = MUL_VEC_PH(r6,rE); r7 = ADD_VEC_PH(r7,rF); r8 = MUL_VEC_PH(r8,rC); r9 = ADD_VEC_PH(r9,rD); rA = MUL_VEC_PH(rA,rE); rB = ADD_VEC_PH(rB,rF); r0 = ADD_VEC_PH(r0,rF); r1 = MUL_VEC_PH(r1,rE); r2 = ADD_VEC_PH(r2,rD); r3 = MUL_VEC_PH(r3,rC); r4 = ADD_VEC_PH(r4,rF); r5 = MUL_VEC_PH(r5,rE); r6 = ADD_VEC_PH(r6,rD); r7 = MUL_VEC_PH(r7,rC); r8 = ADD_VEC_PH(r8,rF); r9 = MUL_VEC_PH(r9,rE); rA = ADD_VEC_PH(rA,rD); rB = MUL_VEC_PH(rB,rC); r0 = MUL_VEC_PH(r0,rC); r1 = ADD_VEC_PH(r1,rD); r2 = MUL_VEC_PH(r2,rE); r3 = ADD_VEC_PH(r3,rF); r4 = MUL_VEC_PH(r4,rC); r5 = ADD_VEC_PH(r5,rD); r6 = MUL_VEC_PH(r6,rE); r7 = ADD_VEC_PH(r7,rF); r8 = MUL_VEC_PH(r8,rC); r9 = ADD_VEC_PH(r9,rD); rA = MUL_VEC_PH(rA,rE); rB = ADD_VEC_PH(rB,rF); r0 = ADD_VEC_PH(r0,rF); r1 = MUL_VEC_PH(r1,rE); r2 = ADD_VEC_PH(r2,rD); r3 = MUL_VEC_PH(r3,rC); r4 = ADD_VEC_PH(r4,rF); r5 = MUL_VEC_PH(r5,rE); r6 = ADD_VEC_PH(r6,rD); r7 = MUL_VEC_PH(r7,rC); r8 = ADD_VEC_PH(r8,rF); r9 = MUL_VEC_PH(r9,rE); rA = ADD_VEC_PH(rA,rD); rB = MUL_VEC_PH(rB,rC); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(96, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PH(r0,r1); r2 = ADD_VEC_PH(r2,r3); r4 = ADD_VEC_PH(r4,r5); r6 = ADD_VEC_PH(r6,r7); r8 = ADD_VEC_PH(r8,r9); rA = ADD_VEC_PH(rA,rB); r0 = ADD_VEC_PH(r0,r2); r4 = ADD_VEC_PH(r4,r6); r8 = ADD_VEC_PH(r8,rA); r0 = ADD_VEC_PH(r0,r4); r0 = ADD_VEC_PH(r0,r8); half out = 0; HP_VEC_TYPE temp = r0; out = vaddh_f16(out,((half*)&temp)[0]); out = vaddh_f16(out,((half*)&temp)[1]); out = vaddh_f16(out,((half*)&temp)[2]); out = 
vaddh_f16(out,((half*)&temp)[3]); return out; } static void test_hp_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { half sum = 0.0; half scalar_sum = 0.0; if ( instr_per_loop == 24 ) { sum = vaddh_f16(sum,test_hp_mac_VEC_24( iterations, EventSet, fp )); scalar_sum = vaddh_f16(scalar_sum,test_hp_scalar_VEC_24( iterations, EventSet, NULL )); } else if ( instr_per_loop == 48 ) { sum = vaddh_f16(sum,test_hp_mac_VEC_48( iterations, EventSet, fp )); scalar_sum = vaddh_f16(scalar_sum,test_hp_scalar_VEC_48( iterations, EventSet, NULL )); } else if ( instr_per_loop == 96 ) { sum = vaddh_f16(sum,test_hp_mac_VEC_96( iterations, EventSet, fp )); scalar_sum = vaddh_f16(scalar_sum,test_hp_scalar_VEC_96( iterations, EventSet, NULL )); } if( vdivh_f16(sum,4.0) != scalar_sum ) { fprintf(stderr, "Inconsistent FLOP results detected!\n"); } } #else static float test_hp_mac_VEC_24( uint64 iterations, int EventSet, FILE *fp ){ (void)iterations; (void)EventSet; if ( NULL != fp ) { papi_stop_and_print_placeholder(24, fp); } return 0.0; } static float test_hp_mac_VEC_48( uint64 iterations, int EventSet, FILE *fp ){ (void)iterations; (void)EventSet; if ( NULL != fp ) { papi_stop_and_print_placeholder(48, fp); } return 0.0; } static float test_hp_mac_VEC_96( uint64 iterations, int EventSet, FILE *fp ){ (void)iterations; (void)EventSet; if ( NULL != fp ) { papi_stop_and_print_placeholder(96, fp); } return 0.0; } static void test_hp_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) { float sum = 0.0; float scalar_sum = 0.0; if ( instr_per_loop == 24 ) { sum += test_hp_mac_VEC_24( iterations, EventSet, fp ); scalar_sum += test_hp_scalar_VEC_24( iterations, EventSet, NULL ); } else if ( instr_per_loop == 48 ) { sum += test_hp_mac_VEC_48( iterations, EventSet, fp ); scalar_sum += test_hp_scalar_VEC_48( iterations, EventSet, NULL ); } else if ( instr_per_loop == 96 ) { sum += test_hp_mac_VEC_96( iterations, EventSet, fp ); scalar_sum += 
test_hp_scalar_VEC_96( iterations, EventSet, NULL );
    }
    if( sum/4.0 != scalar_sum ) {
        fprintf(stderr, "Inconsistent FLOP results detected!\n");
    }
}
#endif

/* ==== File: src/counter_analysis_toolkit/vec_nonfma_sp.c ==== */

#include "vec_scalar_verify.h"

static float test_sp_mac_VEC_24( uint64 iterations, int EventSet, FILE *fp );
static float test_sp_mac_VEC_48( uint64 iterations, int EventSet, FILE *fp );
static float test_sp_mac_VEC_96( uint64 iterations, int EventSet, FILE *fp );
static void test_sp_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp );

/* Wrapper functions of different vector widths. */
#if defined(X86_VEC_WIDTH_128B)
void test_sp_x86_128B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    return test_sp_VEC( instr_per_loop, iterations, EventSet, fp );
}
#elif defined(X86_VEC_WIDTH_512B)
void test_sp_x86_512B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    return test_sp_VEC( instr_per_loop, iterations, EventSet, fp );
}
#elif defined(X86_VEC_WIDTH_256B)
void test_sp_x86_256B_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    return test_sp_VEC( instr_per_loop, iterations, EventSet, fp );
}
#elif defined(ARM)
void test_sp_arm_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    return test_sp_VEC( instr_per_loop, iterations, EventSet, fp );
}
#elif defined(POWER)
void test_sp_power_VEC( int instr_per_loop, uint64 iterations, int EventSet, FILE *fp ) {
    return test_sp_VEC( instr_per_loop, iterations, EventSet, fp );
}
#endif

/************************************/
/* Loop unrolling: 24 instructions  */
/************************************/
static float test_sp_mac_VEC_24( uint64 iterations, int EventSet, FILE *fp ){
    register SP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF;

    /* Generate starting data */
    r0 = SET_VEC_PS(0.01); r1 = SET_VEC_PS(0.02); r2 = SET_VEC_PS(0.03);
r3 = SET_VEC_PS(0.04); r4 = SET_VEC_PS(0.05); r5 = SET_VEC_PS(0.06); r6 = SET_VEC_PS(0.07); r7 = SET_VEC_PS(0.08); r8 = SET_VEC_PS(0.09); r9 = SET_VEC_PS(0.10); rA = SET_VEC_PS(0.11); rB = SET_VEC_PS(0.12); rC = SET_VEC_PS(0.13); rD = SET_VEC_PS(0.14); rE = SET_VEC_PS(0.15); rF = SET_VEC_PS(0.16); /* Start PAPI counters */ if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = MUL_VEC_PS(r0,rC); r1 = ADD_VEC_PS(r1,rD); r2 = MUL_VEC_PS(r2,rE); r3 = ADD_VEC_PS(r3,rF); r4 = MUL_VEC_PS(r4,rC); r5 = ADD_VEC_PS(r5,rD); r6 = MUL_VEC_PS(r6,rE); r7 = ADD_VEC_PS(r7,rF); r8 = MUL_VEC_PS(r8,rC); r9 = ADD_VEC_PS(r9,rD); rA = MUL_VEC_PS(rA,rE); rB = ADD_VEC_PS(rB,rF); r0 = ADD_VEC_PS(r0,rF); r1 = MUL_VEC_PS(r1,rE); r2 = ADD_VEC_PS(r2,rD); r3 = MUL_VEC_PS(r3,rC); r4 = ADD_VEC_PS(r4,rF); r5 = MUL_VEC_PS(r5,rE); r6 = ADD_VEC_PS(r6,rD); r7 = MUL_VEC_PS(r7,rC); r8 = ADD_VEC_PS(r8,rF); r9 = MUL_VEC_PS(r9,rE); rA = ADD_VEC_PS(rA,rD); rB = MUL_VEC_PS(rB,rC); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(24, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PS(r0,r1); r2 = ADD_VEC_PS(r2,r3); r4 = ADD_VEC_PS(r4,r5); r6 = ADD_VEC_PS(r6,r7); r8 = ADD_VEC_PS(r8,r9); rA = ADD_VEC_PS(rA,rB); r0 = ADD_VEC_PS(r0,r2); r4 = ADD_VEC_PS(r4,r6); r8 = ADD_VEC_PS(r8,rA); r0 = ADD_VEC_PS(r0,r4); r0 = ADD_VEC_PS(r0,r8); float out = 0; SP_VEC_TYPE temp = r0; out += ((float*)&temp)[0]; out += ((float*)&temp)[1]; out += ((float*)&temp)[2]; out += ((float*)&temp)[3]; return out; } /************************************/ /* Loop unrolling: 48 instructions */ /************************************/ static float test_sp_mac_VEC_48( uint64 iterations, int EventSet, FILE *fp ){ register SP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_PS(0.01); r1 = SET_VEC_PS(0.02); r2 = SET_VEC_PS(0.03); r3 
= SET_VEC_PS(0.04); r4 = SET_VEC_PS(0.05); r5 = SET_VEC_PS(0.06); r6 = SET_VEC_PS(0.07); r7 = SET_VEC_PS(0.08); r8 = SET_VEC_PS(0.09); r9 = SET_VEC_PS(0.10); rA = SET_VEC_PS(0.11); rB = SET_VEC_PS(0.12); rC = SET_VEC_PS(0.13); rD = SET_VEC_PS(0.14); rE = SET_VEC_PS(0.15); rF = SET_VEC_PS(0.16); /* Start PAPI counters */ if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = MUL_VEC_PS(r0,rC); r1 = ADD_VEC_PS(r1,rD); r2 = MUL_VEC_PS(r2,rE); r3 = ADD_VEC_PS(r3,rF); r4 = MUL_VEC_PS(r4,rC); r5 = ADD_VEC_PS(r5,rD); r6 = MUL_VEC_PS(r6,rE); r7 = ADD_VEC_PS(r7,rF); r8 = MUL_VEC_PS(r8,rC); r9 = ADD_VEC_PS(r9,rD); rA = MUL_VEC_PS(rA,rE); rB = ADD_VEC_PS(rB,rF); r0 = ADD_VEC_PS(r0,rF); r1 = MUL_VEC_PS(r1,rE); r2 = ADD_VEC_PS(r2,rD); r3 = MUL_VEC_PS(r3,rC); r4 = ADD_VEC_PS(r4,rF); r5 = MUL_VEC_PS(r5,rE); r6 = ADD_VEC_PS(r6,rD); r7 = MUL_VEC_PS(r7,rC); r8 = ADD_VEC_PS(r8,rF); r9 = MUL_VEC_PS(r9,rE); rA = ADD_VEC_PS(rA,rD); rB = MUL_VEC_PS(rB,rC); r0 = MUL_VEC_PS(r0,rC); r1 = ADD_VEC_PS(r1,rD); r2 = MUL_VEC_PS(r2,rE); r3 = ADD_VEC_PS(r3,rF); r4 = MUL_VEC_PS(r4,rC); r5 = ADD_VEC_PS(r5,rD); r6 = MUL_VEC_PS(r6,rE); r7 = ADD_VEC_PS(r7,rF); r8 = MUL_VEC_PS(r8,rC); r9 = ADD_VEC_PS(r9,rD); rA = MUL_VEC_PS(rA,rE); rB = ADD_VEC_PS(rB,rF); r0 = ADD_VEC_PS(r0,rF); r1 = MUL_VEC_PS(r1,rE); r2 = ADD_VEC_PS(r2,rD); r3 = MUL_VEC_PS(r3,rC); r4 = ADD_VEC_PS(r4,rF); r5 = MUL_VEC_PS(r5,rE); r6 = ADD_VEC_PS(r6,rD); r7 = MUL_VEC_PS(r7,rC); r8 = ADD_VEC_PS(r8,rF); r9 = MUL_VEC_PS(r9,rE); rA = ADD_VEC_PS(rA,rD); rB = MUL_VEC_PS(rB,rC); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(48, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PS(r0,r1); r2 = ADD_VEC_PS(r2,r3); r4 = ADD_VEC_PS(r4,r5); r6 = ADD_VEC_PS(r6,r7); r8 = ADD_VEC_PS(r8,r9); rA = ADD_VEC_PS(rA,rB); r0 = ADD_VEC_PS(r0,r2); r4 = ADD_VEC_PS(r4,r6); r8 = ADD_VEC_PS(r8,rA); 
r0 = ADD_VEC_PS(r0,r4); r0 = ADD_VEC_PS(r0,r8); float out = 0; SP_VEC_TYPE temp = r0; out += ((float*)&temp)[0]; out += ((float*)&temp)[1]; out += ((float*)&temp)[2]; out += ((float*)&temp)[3]; return out; } /************************************/ /* Loop unrolling: 96 instructions */ /************************************/ static float test_sp_mac_VEC_96( uint64 iterations, int EventSet, FILE *fp ){ register SP_VEC_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_PS(0.01); r1 = SET_VEC_PS(0.02); r2 = SET_VEC_PS(0.03); r3 = SET_VEC_PS(0.04); r4 = SET_VEC_PS(0.05); r5 = SET_VEC_PS(0.06); r6 = SET_VEC_PS(0.07); r7 = SET_VEC_PS(0.08); r8 = SET_VEC_PS(0.09); r9 = SET_VEC_PS(0.10); rA = SET_VEC_PS(0.11); rB = SET_VEC_PS(0.12); rC = SET_VEC_PS(0.13); rD = SET_VEC_PS(0.14); rE = SET_VEC_PS(0.15); rF = SET_VEC_PS(0.16); /* Start PAPI counters */ if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = MUL_VEC_PS(r0,rC); r1 = ADD_VEC_PS(r1,rD); r2 = MUL_VEC_PS(r2,rE); r3 = ADD_VEC_PS(r3,rF); r4 = MUL_VEC_PS(r4,rC); r5 = ADD_VEC_PS(r5,rD); r6 = MUL_VEC_PS(r6,rE); r7 = ADD_VEC_PS(r7,rF); r8 = MUL_VEC_PS(r8,rC); r9 = ADD_VEC_PS(r9,rD); rA = MUL_VEC_PS(rA,rE); rB = ADD_VEC_PS(rB,rF); r0 = ADD_VEC_PS(r0,rF); r1 = MUL_VEC_PS(r1,rE); r2 = ADD_VEC_PS(r2,rD); r3 = MUL_VEC_PS(r3,rC); r4 = ADD_VEC_PS(r4,rF); r5 = MUL_VEC_PS(r5,rE); r6 = ADD_VEC_PS(r6,rD); r7 = MUL_VEC_PS(r7,rC); r8 = ADD_VEC_PS(r8,rF); r9 = MUL_VEC_PS(r9,rE); rA = ADD_VEC_PS(rA,rD); rB = MUL_VEC_PS(rB,rC); r0 = MUL_VEC_PS(r0,rC); r1 = ADD_VEC_PS(r1,rD); r2 = MUL_VEC_PS(r2,rE); r3 = ADD_VEC_PS(r3,rF); r4 = MUL_VEC_PS(r4,rC); r5 = ADD_VEC_PS(r5,rD); r6 = MUL_VEC_PS(r6,rE); r7 = ADD_VEC_PS(r7,rF); r8 = MUL_VEC_PS(r8,rC); r9 = ADD_VEC_PS(r9,rD); rA = MUL_VEC_PS(rA,rE); rB = ADD_VEC_PS(rB,rF); r0 = ADD_VEC_PS(r0,rF); r1 = MUL_VEC_PS(r1,rE); r2 = ADD_VEC_PS(r2,rD); r3 = 
MUL_VEC_PS(r3,rC); r4 = ADD_VEC_PS(r4,rF); r5 = MUL_VEC_PS(r5,rE); r6 = ADD_VEC_PS(r6,rD); r7 = MUL_VEC_PS(r7,rC); r8 = ADD_VEC_PS(r8,rF); r9 = MUL_VEC_PS(r9,rE); rA = ADD_VEC_PS(rA,rD); rB = MUL_VEC_PS(rB,rC); r0 = MUL_VEC_PS(r0,rC); r1 = ADD_VEC_PS(r1,rD); r2 = MUL_VEC_PS(r2,rE); r3 = ADD_VEC_PS(r3,rF); r4 = MUL_VEC_PS(r4,rC); r5 = ADD_VEC_PS(r5,rD); r6 = MUL_VEC_PS(r6,rE); r7 = ADD_VEC_PS(r7,rF); r8 = MUL_VEC_PS(r8,rC); r9 = ADD_VEC_PS(r9,rD); rA = MUL_VEC_PS(rA,rE); rB = ADD_VEC_PS(rB,rF); r0 = ADD_VEC_PS(r0,rF); r1 = MUL_VEC_PS(r1,rE); r2 = ADD_VEC_PS(r2,rD); r3 = MUL_VEC_PS(r3,rC); r4 = ADD_VEC_PS(r4,rF); r5 = MUL_VEC_PS(r5,rE); r6 = ADD_VEC_PS(r6,rD); r7 = MUL_VEC_PS(r7,rC); r8 = ADD_VEC_PS(r8,rF); r9 = MUL_VEC_PS(r9,rE); rA = ADD_VEC_PS(rA,rD); rB = MUL_VEC_PS(rB,rC); r0 = MUL_VEC_PS(r0,rC); r1 = ADD_VEC_PS(r1,rD); r2 = MUL_VEC_PS(r2,rE); r3 = ADD_VEC_PS(r3,rF); r4 = MUL_VEC_PS(r4,rC); r5 = ADD_VEC_PS(r5,rD); r6 = MUL_VEC_PS(r6,rE); r7 = ADD_VEC_PS(r7,rF); r8 = MUL_VEC_PS(r8,rC); r9 = ADD_VEC_PS(r9,rD); rA = MUL_VEC_PS(rA,rE); rB = ADD_VEC_PS(rB,rF); r0 = ADD_VEC_PS(r0,rF); r1 = MUL_VEC_PS(r1,rE); r2 = ADD_VEC_PS(r2,rD); r3 = MUL_VEC_PS(r3,rC); r4 = ADD_VEC_PS(r4,rF); r5 = MUL_VEC_PS(r5,rE); r6 = ADD_VEC_PS(r6,rD); r7 = MUL_VEC_PS(r7,rC); r8 = ADD_VEC_PS(r8,rF); r9 = MUL_VEC_PS(r9,rE); rA = ADD_VEC_PS(rA,rD); rB = MUL_VEC_PS(rB,rC); i++; } c++; } /* Stop PAPI counters */ papi_stop_and_print(96, EventSet, fp); /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_PS(r0,r1); r2 = ADD_VEC_PS(r2,r3); r4 = ADD_VEC_PS(r4,r5); r6 = ADD_VEC_PS(r6,r7); r8 = ADD_VEC_PS(r8,r9); rA = ADD_VEC_PS(rA,rB); r0 = ADD_VEC_PS(r0,r2); r4 = ADD_VEC_PS(r4,r6); r8 = ADD_VEC_PS(r8,rA); r0 = ADD_VEC_PS(r0,r4); r0 = ADD_VEC_PS(r0,r8); float out = 0; SP_VEC_TYPE temp = r0; out += ((float*)&temp)[0]; out += ((float*)&temp)[1]; out += ((float*)&temp)[2]; out += ((float*)&temp)[3]; return out; } static void test_sp_VEC( int instr_per_loop, uint64 iterations, 
int EventSet, FILE *fp )
{
    float sum = 0.0;
    float scalar_sum = 0.0;

    if ( instr_per_loop == 24 ) {
        sum += test_sp_mac_VEC_24( iterations, EventSet, fp );
        scalar_sum += test_sp_scalar_VEC_24( iterations, EventSet, NULL );
    } else if ( instr_per_loop == 48 ) {
        sum += test_sp_mac_VEC_48( iterations, EventSet, fp );
        scalar_sum += test_sp_scalar_VEC_48( iterations, EventSet, NULL );
    } else if ( instr_per_loop == 96 ) {
        sum += test_sp_mac_VEC_96( iterations, EventSet, fp );
        scalar_sum += test_sp_scalar_VEC_96( iterations, EventSet, NULL );
    }

    if( sum/4.0 != scalar_sum ) {
        fprintf(stderr, "Inconsistent FLOP results detected!\n");
    }
}

/* ==== File: src/counter_analysis_toolkit/vec_scalar_verify.c ==== */

#include "vec_scalar_verify.h"

void papi_stop_and_print_placeholder(long long theory, FILE *fp)
{
    fprintf(fp, "%lld 0\n", theory);
}

void papi_stop_and_print(long long theory, int EventSet, FILE *fp)
{
    long long flpins = 0;
    int retval;

    if ( (retval=PAPI_stop(EventSet, &flpins)) != PAPI_OK){
        fprintf(stderr, "PAPI_stop failed: %s\n", PAPI_strerror(retval));
        return;
    }

    fprintf(fp, "%lld %lld\n", theory, flpins);
}

#if defined(ARM)
half test_hp_scalar_VEC_24( uint64 iterations, int EventSet, FILE *fp ){
    register half r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF;

    /* Generate starting data */
    r0 = SET_VEC_SH(0.01); r1 = SET_VEC_SH(0.02); r2 = SET_VEC_SH(0.03);
    r3 = SET_VEC_SH(0.04); r4 = SET_VEC_SH(0.05); r5 = SET_VEC_SH(0.06);
    r6 = SET_VEC_SH(0.07); r7 = SET_VEC_SH(0.08); r8 = SET_VEC_SH(0.09);
    r9 = SET_VEC_SH(0.10); rA = SET_VEC_SH(0.11); rB = SET_VEC_SH(0.12);
    rC = SET_VEC_SH(0.13); rD = SET_VEC_SH(0.14); rE = SET_VEC_SH(0.15);
    rF = SET_VEC_SH(0.16);

    /* Start PAPI counters */
    if ( NULL != fp ) {
        if ( PAPI_start( EventSet ) != PAPI_OK ) {
            return -1;
        }
    }

    uint64 c = 0;
    while (c < iterations){
        size_t i = 0;
        while (i < ITER){
            /* The performance critical part */
            r0 = MUL_VEC_SH(r0,rC); r1 = ADD_VEC_SH(r1,rD); r2 = MUL_VEC_SH(r2,rE); r3 =
ADD_VEC_SH(r3,rF); r4 = MUL_VEC_SH(r4,rC); r5 = ADD_VEC_SH(r5,rD); r6 = MUL_VEC_SH(r6,rE); r7 = ADD_VEC_SH(r7,rF); r8 = MUL_VEC_SH(r8,rC); r9 = ADD_VEC_SH(r9,rD); rA = MUL_VEC_SH(rA,rE); rB = ADD_VEC_SH(rB,rF); r0 = ADD_VEC_SH(r0,rF); r1 = MUL_VEC_SH(r1,rE); r2 = ADD_VEC_SH(r2,rD); r3 = MUL_VEC_SH(r3,rC); r4 = ADD_VEC_SH(r4,rF); r5 = MUL_VEC_SH(r5,rE); r6 = ADD_VEC_SH(r6,rD); r7 = MUL_VEC_SH(r7,rC); r8 = ADD_VEC_SH(r8,rF); r9 = MUL_VEC_SH(r9,rE); rA = ADD_VEC_SH(rA,rD); rB = MUL_VEC_SH(rB,rC); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(24, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SH(r0,r1); r2 = ADD_VEC_SH(r2,r3); r4 = ADD_VEC_SH(r4,r5); r6 = ADD_VEC_SH(r6,r7); r8 = ADD_VEC_SH(r8,r9); rA = ADD_VEC_SH(rA,rB); r0 = ADD_VEC_SH(r0,r2); r4 = ADD_VEC_SH(r4,r6); r8 = ADD_VEC_SH(r8,rA); r0 = ADD_VEC_SH(r0,r4); r0 = ADD_VEC_SH(r0,r8); half out = 0; half temp = r0; out = ADD_VEC_SH(out,temp); return out; } half test_hp_scalar_VEC_48( uint64 iterations, int EventSet, FILE *fp ){ register half r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_SH(0.01); r1 = SET_VEC_SH(0.02); r2 = SET_VEC_SH(0.03); r3 = SET_VEC_SH(0.04); r4 = SET_VEC_SH(0.05); r5 = SET_VEC_SH(0.06); r6 = SET_VEC_SH(0.07); r7 = SET_VEC_SH(0.08); r8 = SET_VEC_SH(0.09); r9 = SET_VEC_SH(0.10); rA = SET_VEC_SH(0.11); rB = SET_VEC_SH(0.12); rC = SET_VEC_SH(0.13); rD = SET_VEC_SH(0.14); rE = SET_VEC_SH(0.15); rF = SET_VEC_SH(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = MUL_VEC_SH(r0,rC); r1 = ADD_VEC_SH(r1,rD); r2 = MUL_VEC_SH(r2,rE); r3 = ADD_VEC_SH(r3,rF); r4 = MUL_VEC_SH(r4,rC); r5 = ADD_VEC_SH(r5,rD); r6 = MUL_VEC_SH(r6,rE); r7 = ADD_VEC_SH(r7,rF); r8 = MUL_VEC_SH(r8,rC); r9 = ADD_VEC_SH(r9,rD); rA = 
MUL_VEC_SH(rA,rE); rB = ADD_VEC_SH(rB,rF); r0 = ADD_VEC_SH(r0,rF); r1 = MUL_VEC_SH(r1,rE); r2 = ADD_VEC_SH(r2,rD); r3 = MUL_VEC_SH(r3,rC); r4 = ADD_VEC_SH(r4,rF); r5 = MUL_VEC_SH(r5,rE); r6 = ADD_VEC_SH(r6,rD); r7 = MUL_VEC_SH(r7,rC); r8 = ADD_VEC_SH(r8,rF); r9 = MUL_VEC_SH(r9,rE); rA = ADD_VEC_SH(rA,rD); rB = MUL_VEC_SH(rB,rC); r0 = MUL_VEC_SH(r0,rC); r1 = ADD_VEC_SH(r1,rD); r2 = MUL_VEC_SH(r2,rE); r3 = ADD_VEC_SH(r3,rF); r4 = MUL_VEC_SH(r4,rC); r5 = ADD_VEC_SH(r5,rD); r6 = MUL_VEC_SH(r6,rE); r7 = ADD_VEC_SH(r7,rF); r8 = MUL_VEC_SH(r8,rC); r9 = ADD_VEC_SH(r9,rD); rA = MUL_VEC_SH(rA,rE); rB = ADD_VEC_SH(rB,rF); r0 = ADD_VEC_SH(r0,rF); r1 = MUL_VEC_SH(r1,rE); r2 = ADD_VEC_SH(r2,rD); r3 = MUL_VEC_SH(r3,rC); r4 = ADD_VEC_SH(r4,rF); r5 = MUL_VEC_SH(r5,rE); r6 = ADD_VEC_SH(r6,rD); r7 = MUL_VEC_SH(r7,rC); r8 = ADD_VEC_SH(r8,rF); r9 = MUL_VEC_SH(r9,rE); rA = ADD_VEC_SH(rA,rD); rB = MUL_VEC_SH(rB,rC); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(48, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SH(r0,r1); r2 = ADD_VEC_SH(r2,r3); r4 = ADD_VEC_SH(r4,r5); r6 = ADD_VEC_SH(r6,r7); r8 = ADD_VEC_SH(r8,r9); rA = ADD_VEC_SH(rA,rB); r0 = ADD_VEC_SH(r0,r2); r4 = ADD_VEC_SH(r4,r6); r8 = ADD_VEC_SH(r8,rA); r0 = ADD_VEC_SH(r0,r4); r0 = ADD_VEC_SH(r0,r8); half out = 0; half temp = r0; out = ADD_VEC_SH(out,temp); return out; } half test_hp_scalar_VEC_96( uint64 iterations, int EventSet, FILE *fp ){ register half r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_SH(0.01); r1 = SET_VEC_SH(0.02); r2 = SET_VEC_SH(0.03); r3 = SET_VEC_SH(0.04); r4 = SET_VEC_SH(0.05); r5 = SET_VEC_SH(0.06); r6 = SET_VEC_SH(0.07); r7 = SET_VEC_SH(0.08); r8 = SET_VEC_SH(0.09); r9 = SET_VEC_SH(0.10); rA = SET_VEC_SH(0.11); rB = SET_VEC_SH(0.12); rC = SET_VEC_SH(0.13); rD = SET_VEC_SH(0.14); rE = SET_VEC_SH(0.15); rF = SET_VEC_SH(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( 
PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = MUL_VEC_SH(r0,rC); r1 = ADD_VEC_SH(r1,rD); r2 = MUL_VEC_SH(r2,rE); r3 = ADD_VEC_SH(r3,rF); r4 = MUL_VEC_SH(r4,rC); r5 = ADD_VEC_SH(r5,rD); r6 = MUL_VEC_SH(r6,rE); r7 = ADD_VEC_SH(r7,rF); r8 = MUL_VEC_SH(r8,rC); r9 = ADD_VEC_SH(r9,rD); rA = MUL_VEC_SH(rA,rE); rB = ADD_VEC_SH(rB,rF); r0 = ADD_VEC_SH(r0,rF); r1 = MUL_VEC_SH(r1,rE); r2 = ADD_VEC_SH(r2,rD); r3 = MUL_VEC_SH(r3,rC); r4 = ADD_VEC_SH(r4,rF); r5 = MUL_VEC_SH(r5,rE); r6 = ADD_VEC_SH(r6,rD); r7 = MUL_VEC_SH(r7,rC); r8 = ADD_VEC_SH(r8,rF); r9 = MUL_VEC_SH(r9,rE); rA = ADD_VEC_SH(rA,rD); rB = MUL_VEC_SH(rB,rC); r0 = MUL_VEC_SH(r0,rC); r1 = ADD_VEC_SH(r1,rD); r2 = MUL_VEC_SH(r2,rE); r3 = ADD_VEC_SH(r3,rF); r4 = MUL_VEC_SH(r4,rC); r5 = ADD_VEC_SH(r5,rD); r6 = MUL_VEC_SH(r6,rE); r7 = ADD_VEC_SH(r7,rF); r8 = MUL_VEC_SH(r8,rC); r9 = ADD_VEC_SH(r9,rD); rA = MUL_VEC_SH(rA,rE); rB = ADD_VEC_SH(rB,rF); r0 = ADD_VEC_SH(r0,rF); r1 = MUL_VEC_SH(r1,rE); r2 = ADD_VEC_SH(r2,rD); r3 = MUL_VEC_SH(r3,rC); r4 = ADD_VEC_SH(r4,rF); r5 = MUL_VEC_SH(r5,rE); r6 = ADD_VEC_SH(r6,rD); r7 = MUL_VEC_SH(r7,rC); r8 = ADD_VEC_SH(r8,rF); r9 = MUL_VEC_SH(r9,rE); rA = ADD_VEC_SH(rA,rD); rB = MUL_VEC_SH(rB,rC); r0 = MUL_VEC_SH(r0,rC); r1 = ADD_VEC_SH(r1,rD); r2 = MUL_VEC_SH(r2,rE); r3 = ADD_VEC_SH(r3,rF); r4 = MUL_VEC_SH(r4,rC); r5 = ADD_VEC_SH(r5,rD); r6 = MUL_VEC_SH(r6,rE); r7 = ADD_VEC_SH(r7,rF); r8 = MUL_VEC_SH(r8,rC); r9 = ADD_VEC_SH(r9,rD); rA = MUL_VEC_SH(rA,rE); rB = ADD_VEC_SH(rB,rF); r0 = ADD_VEC_SH(r0,rF); r1 = MUL_VEC_SH(r1,rE); r2 = ADD_VEC_SH(r2,rD); r3 = MUL_VEC_SH(r3,rC); r4 = ADD_VEC_SH(r4,rF); r5 = MUL_VEC_SH(r5,rE); r6 = ADD_VEC_SH(r6,rD); r7 = MUL_VEC_SH(r7,rC); r8 = ADD_VEC_SH(r8,rF); r9 = MUL_VEC_SH(r9,rE); rA = ADD_VEC_SH(rA,rD); rB = MUL_VEC_SH(rB,rC); r0 = MUL_VEC_SH(r0,rC); r1 = ADD_VEC_SH(r1,rD); r2 = MUL_VEC_SH(r2,rE); r3 = ADD_VEC_SH(r3,rF); r4 = 
MUL_VEC_SH(r4,rC); r5 = ADD_VEC_SH(r5,rD); r6 = MUL_VEC_SH(r6,rE); r7 = ADD_VEC_SH(r7,rF); r8 = MUL_VEC_SH(r8,rC); r9 = ADD_VEC_SH(r9,rD); rA = MUL_VEC_SH(rA,rE); rB = ADD_VEC_SH(rB,rF); r0 = ADD_VEC_SH(r0,rF); r1 = MUL_VEC_SH(r1,rE); r2 = ADD_VEC_SH(r2,rD); r3 = MUL_VEC_SH(r3,rC); r4 = ADD_VEC_SH(r4,rF); r5 = MUL_VEC_SH(r5,rE); r6 = ADD_VEC_SH(r6,rD); r7 = MUL_VEC_SH(r7,rC); r8 = ADD_VEC_SH(r8,rF); r9 = MUL_VEC_SH(r9,rE); rA = ADD_VEC_SH(rA,rD); rB = MUL_VEC_SH(rB,rC); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(96, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SH(r0,r1); r2 = ADD_VEC_SH(r2,r3); r4 = ADD_VEC_SH(r4,r5); r6 = ADD_VEC_SH(r6,r7); r8 = ADD_VEC_SH(r8,r9); rA = ADD_VEC_SH(rA,rB); r0 = ADD_VEC_SH(r0,r2); r4 = ADD_VEC_SH(r4,r6); r8 = ADD_VEC_SH(r8,rA); r0 = ADD_VEC_SH(r0,r4); r0 = ADD_VEC_SH(r0,r8); half out = 0; half temp = r0; out = ADD_VEC_SH(out,temp); return out; } #else float test_hp_scalar_VEC_24( uint64 iterations, int EventSet, FILE *fp ){ (void)iterations; (void)EventSet; if ( NULL != fp ) { papi_stop_and_print_placeholder(24, fp); } return 0.0; } float test_hp_scalar_VEC_48( uint64 iterations, int EventSet, FILE *fp ){ (void)iterations; (void)EventSet; if ( NULL != fp ) { papi_stop_and_print_placeholder(48, fp); } return 0.0; } float test_hp_scalar_VEC_96( uint64 iterations, int EventSet, FILE *fp ){ (void)iterations; (void)EventSet; if ( NULL != fp ) { papi_stop_and_print_placeholder(96, fp); } return 0.0; } #endif /************************************/ /* Loop unrolling: 24 instructions */ /************************************/ float test_sp_scalar_VEC_24( uint64 iterations, int EventSet, FILE *fp ){ register SP_SCALAR_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_SS(0.01); r1 = SET_VEC_SS(0.02); r2 = SET_VEC_SS(0.03); r3 = SET_VEC_SS(0.04); r4 = SET_VEC_SS(0.05); r5 = SET_VEC_SS(0.06); r6 = 
SET_VEC_SS(0.07); r7 = SET_VEC_SS(0.08); r8 = SET_VEC_SS(0.09); r9 = SET_VEC_SS(0.10); rA = SET_VEC_SS(0.11); rB = SET_VEC_SS(0.12); rC = SET_VEC_SS(0.13); rD = SET_VEC_SS(0.14); rE = SET_VEC_SS(0.15); rF = SET_VEC_SS(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = MUL_VEC_SS(r0,rC); r1 = ADD_VEC_SS(r1,rD); r2 = MUL_VEC_SS(r2,rE); r3 = ADD_VEC_SS(r3,rF); r4 = MUL_VEC_SS(r4,rC); r5 = ADD_VEC_SS(r5,rD); r6 = MUL_VEC_SS(r6,rE); r7 = ADD_VEC_SS(r7,rF); r8 = MUL_VEC_SS(r8,rC); r9 = ADD_VEC_SS(r9,rD); rA = MUL_VEC_SS(rA,rE); rB = ADD_VEC_SS(rB,rF); r0 = ADD_VEC_SS(r0,rF); r1 = MUL_VEC_SS(r1,rE); r2 = ADD_VEC_SS(r2,rD); r3 = MUL_VEC_SS(r3,rC); r4 = ADD_VEC_SS(r4,rF); r5 = MUL_VEC_SS(r5,rE); r6 = ADD_VEC_SS(r6,rD); r7 = MUL_VEC_SS(r7,rC); r8 = ADD_VEC_SS(r8,rF); r9 = MUL_VEC_SS(r9,rE); rA = ADD_VEC_SS(rA,rD); rB = MUL_VEC_SS(rB,rC); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(24, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SS(r0,r1); r2 = ADD_VEC_SS(r2,r3); r4 = ADD_VEC_SS(r4,r5); r6 = ADD_VEC_SS(r6,r7); r8 = ADD_VEC_SS(r8,r9); rA = ADD_VEC_SS(rA,rB); r0 = ADD_VEC_SS(r0,r2); r4 = ADD_VEC_SS(r4,r6); r8 = ADD_VEC_SS(r8,rA); r0 = ADD_VEC_SS(r0,r4); r0 = ADD_VEC_SS(r0,r8); float out = 0; SP_SCALAR_TYPE temp = r0; out += ((float*)&temp)[0]; return out; } /************************************/ /* Loop unrolling: 48 instructions */ /************************************/ float test_sp_scalar_VEC_48( uint64 iterations, int EventSet, FILE *fp ){ register SP_SCALAR_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_SS(0.01); r1 = SET_VEC_SS(0.02); r2 = SET_VEC_SS(0.03); r3 = SET_VEC_SS(0.04); r4 = SET_VEC_SS(0.05); r5 = SET_VEC_SS(0.06); r6 = SET_VEC_SS(0.07); r7 = 
SET_VEC_SS(0.08); r8 = SET_VEC_SS(0.09); r9 = SET_VEC_SS(0.10); rA = SET_VEC_SS(0.11); rB = SET_VEC_SS(0.12); rC = SET_VEC_SS(0.13); rD = SET_VEC_SS(0.14); rE = SET_VEC_SS(0.15); rF = SET_VEC_SS(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = MUL_VEC_SS(r0,rC); r1 = ADD_VEC_SS(r1,rD); r2 = MUL_VEC_SS(r2,rE); r3 = ADD_VEC_SS(r3,rF); r4 = MUL_VEC_SS(r4,rC); r5 = ADD_VEC_SS(r5,rD); r6 = MUL_VEC_SS(r6,rE); r7 = ADD_VEC_SS(r7,rF); r8 = MUL_VEC_SS(r8,rC); r9 = ADD_VEC_SS(r9,rD); rA = MUL_VEC_SS(rA,rE); rB = ADD_VEC_SS(rB,rF); r0 = ADD_VEC_SS(r0,rF); r1 = MUL_VEC_SS(r1,rE); r2 = ADD_VEC_SS(r2,rD); r3 = MUL_VEC_SS(r3,rC); r4 = ADD_VEC_SS(r4,rF); r5 = MUL_VEC_SS(r5,rE); r6 = ADD_VEC_SS(r6,rD); r7 = MUL_VEC_SS(r7,rC); r8 = ADD_VEC_SS(r8,rF); r9 = MUL_VEC_SS(r9,rE); rA = ADD_VEC_SS(rA,rD); rB = MUL_VEC_SS(rB,rC); r0 = MUL_VEC_SS(r0,rC); r1 = ADD_VEC_SS(r1,rD); r2 = MUL_VEC_SS(r2,rE); r3 = ADD_VEC_SS(r3,rF); r4 = MUL_VEC_SS(r4,rC); r5 = ADD_VEC_SS(r5,rD); r6 = MUL_VEC_SS(r6,rE); r7 = ADD_VEC_SS(r7,rF); r8 = MUL_VEC_SS(r8,rC); r9 = ADD_VEC_SS(r9,rD); rA = MUL_VEC_SS(rA,rE); rB = ADD_VEC_SS(rB,rF); r0 = ADD_VEC_SS(r0,rF); r1 = MUL_VEC_SS(r1,rE); r2 = ADD_VEC_SS(r2,rD); r3 = MUL_VEC_SS(r3,rC); r4 = ADD_VEC_SS(r4,rF); r5 = MUL_VEC_SS(r5,rE); r6 = ADD_VEC_SS(r6,rD); r7 = MUL_VEC_SS(r7,rC); r8 = ADD_VEC_SS(r8,rF); r9 = MUL_VEC_SS(r9,rE); rA = ADD_VEC_SS(rA,rD); rB = MUL_VEC_SS(rB,rC); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(48, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SS(r0,r1); r2 = ADD_VEC_SS(r2,r3); r4 = ADD_VEC_SS(r4,r5); r6 = ADD_VEC_SS(r6,r7); r8 = ADD_VEC_SS(r8,r9); rA = ADD_VEC_SS(rA,rB); r0 = ADD_VEC_SS(r0,r2); r4 = ADD_VEC_SS(r4,r6); r8 = ADD_VEC_SS(r8,rA); r0 = ADD_VEC_SS(r0,r4); r0 = ADD_VEC_SS(r0,r8); 
float out = 0; SP_SCALAR_TYPE temp = r0; out += ((float*)&temp)[0]; return out; } /************************************/ /* Loop unrolling: 96 instructions */ /************************************/ float test_sp_scalar_VEC_96( uint64 iterations, int EventSet, FILE *fp ){ register SP_SCALAR_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_SS(0.01); r1 = SET_VEC_SS(0.02); r2 = SET_VEC_SS(0.03); r3 = SET_VEC_SS(0.04); r4 = SET_VEC_SS(0.05); r5 = SET_VEC_SS(0.06); r6 = SET_VEC_SS(0.07); r7 = SET_VEC_SS(0.08); r8 = SET_VEC_SS(0.09); r9 = SET_VEC_SS(0.10); rA = SET_VEC_SS(0.11); rB = SET_VEC_SS(0.12); rC = SET_VEC_SS(0.13); rD = SET_VEC_SS(0.14); rE = SET_VEC_SS(0.15); rF = SET_VEC_SS(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = MUL_VEC_SS(r0,rC); r1 = ADD_VEC_SS(r1,rD); r2 = MUL_VEC_SS(r2,rE); r3 = ADD_VEC_SS(r3,rF); r4 = MUL_VEC_SS(r4,rC); r5 = ADD_VEC_SS(r5,rD); r6 = MUL_VEC_SS(r6,rE); r7 = ADD_VEC_SS(r7,rF); r8 = MUL_VEC_SS(r8,rC); r9 = ADD_VEC_SS(r9,rD); rA = MUL_VEC_SS(rA,rE); rB = ADD_VEC_SS(rB,rF); r0 = ADD_VEC_SS(r0,rF); r1 = MUL_VEC_SS(r1,rE); r2 = ADD_VEC_SS(r2,rD); r3 = MUL_VEC_SS(r3,rC); r4 = ADD_VEC_SS(r4,rF); r5 = MUL_VEC_SS(r5,rE); r6 = ADD_VEC_SS(r6,rD); r7 = MUL_VEC_SS(r7,rC); r8 = ADD_VEC_SS(r8,rF); r9 = MUL_VEC_SS(r9,rE); rA = ADD_VEC_SS(rA,rD); rB = MUL_VEC_SS(rB,rC); r0 = MUL_VEC_SS(r0,rC); r1 = ADD_VEC_SS(r1,rD); r2 = MUL_VEC_SS(r2,rE); r3 = ADD_VEC_SS(r3,rF); r4 = MUL_VEC_SS(r4,rC); r5 = ADD_VEC_SS(r5,rD); r6 = MUL_VEC_SS(r6,rE); r7 = ADD_VEC_SS(r7,rF); r8 = MUL_VEC_SS(r8,rC); r9 = ADD_VEC_SS(r9,rD); rA = MUL_VEC_SS(rA,rE); rB = ADD_VEC_SS(rB,rF); r0 = ADD_VEC_SS(r0,rF); r1 = MUL_VEC_SS(r1,rE); r2 = ADD_VEC_SS(r2,rD); r3 = MUL_VEC_SS(r3,rC); r4 = ADD_VEC_SS(r4,rF); r5 = MUL_VEC_SS(r5,rE); r6 = ADD_VEC_SS(r6,rD); r7 = 
MUL_VEC_SS(r7,rC); r8 = ADD_VEC_SS(r8,rF); r9 = MUL_VEC_SS(r9,rE); rA = ADD_VEC_SS(rA,rD); rB = MUL_VEC_SS(rB,rC); r0 = MUL_VEC_SS(r0,rC); r1 = ADD_VEC_SS(r1,rD); r2 = MUL_VEC_SS(r2,rE); r3 = ADD_VEC_SS(r3,rF); r4 = MUL_VEC_SS(r4,rC); r5 = ADD_VEC_SS(r5,rD); r6 = MUL_VEC_SS(r6,rE); r7 = ADD_VEC_SS(r7,rF); r8 = MUL_VEC_SS(r8,rC); r9 = ADD_VEC_SS(r9,rD); rA = MUL_VEC_SS(rA,rE); rB = ADD_VEC_SS(rB,rF); r0 = ADD_VEC_SS(r0,rF); r1 = MUL_VEC_SS(r1,rE); r2 = ADD_VEC_SS(r2,rD); r3 = MUL_VEC_SS(r3,rC); r4 = ADD_VEC_SS(r4,rF); r5 = MUL_VEC_SS(r5,rE); r6 = ADD_VEC_SS(r6,rD); r7 = MUL_VEC_SS(r7,rC); r8 = ADD_VEC_SS(r8,rF); r9 = MUL_VEC_SS(r9,rE); rA = ADD_VEC_SS(rA,rD); rB = MUL_VEC_SS(rB,rC); r0 = MUL_VEC_SS(r0,rC); r1 = ADD_VEC_SS(r1,rD); r2 = MUL_VEC_SS(r2,rE); r3 = ADD_VEC_SS(r3,rF); r4 = MUL_VEC_SS(r4,rC); r5 = ADD_VEC_SS(r5,rD); r6 = MUL_VEC_SS(r6,rE); r7 = ADD_VEC_SS(r7,rF); r8 = MUL_VEC_SS(r8,rC); r9 = ADD_VEC_SS(r9,rD); rA = MUL_VEC_SS(rA,rE); rB = ADD_VEC_SS(rB,rF); r0 = ADD_VEC_SS(r0,rF); r1 = MUL_VEC_SS(r1,rE); r2 = ADD_VEC_SS(r2,rD); r3 = MUL_VEC_SS(r3,rC); r4 = ADD_VEC_SS(r4,rF); r5 = MUL_VEC_SS(r5,rE); r6 = ADD_VEC_SS(r6,rD); r7 = MUL_VEC_SS(r7,rC); r8 = ADD_VEC_SS(r8,rF); r9 = MUL_VEC_SS(r9,rE); rA = ADD_VEC_SS(rA,rD); rB = MUL_VEC_SS(rB,rC); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(96, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SS(r0,r1); r2 = ADD_VEC_SS(r2,r3); r4 = ADD_VEC_SS(r4,r5); r6 = ADD_VEC_SS(r6,r7); r8 = ADD_VEC_SS(r8,r9); rA = ADD_VEC_SS(rA,rB); r0 = ADD_VEC_SS(r0,r2); r4 = ADD_VEC_SS(r4,r6); r8 = ADD_VEC_SS(r8,rA); r0 = ADD_VEC_SS(r0,r4); r0 = ADD_VEC_SS(r0,r8); float out = 0; SP_SCALAR_TYPE temp = r0; out += ((float*)&temp)[0]; return out; } /************************************/ /* Loop unrolling: 24 instructions */ /************************************/ double test_dp_scalar_VEC_24( uint64 iterations, int EventSet, FILE *fp ){ register DP_SCALAR_TYPE 
r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_SD(0.01); r1 = SET_VEC_SD(0.02); r2 = SET_VEC_SD(0.03); r3 = SET_VEC_SD(0.04); r4 = SET_VEC_SD(0.05); r5 = SET_VEC_SD(0.06); r6 = SET_VEC_SD(0.07); r7 = SET_VEC_SD(0.08); r8 = SET_VEC_SD(0.09); r9 = SET_VEC_SD(0.10); rA = SET_VEC_SD(0.11); rB = SET_VEC_SD(0.12); rC = SET_VEC_SD(0.13); rD = SET_VEC_SD(0.14); rE = SET_VEC_SD(0.15); rF = SET_VEC_SD(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = MUL_VEC_SD(r0,rC); r1 = ADD_VEC_SD(r1,rD); r2 = MUL_VEC_SD(r2,rE); r3 = ADD_VEC_SD(r3,rF); r4 = MUL_VEC_SD(r4,rC); r5 = ADD_VEC_SD(r5,rD); r6 = MUL_VEC_SD(r6,rE); r7 = ADD_VEC_SD(r7,rF); r8 = MUL_VEC_SD(r8,rC); r9 = ADD_VEC_SD(r9,rD); rA = MUL_VEC_SD(rA,rE); rB = ADD_VEC_SD(rB,rF); r0 = ADD_VEC_SD(r0,rF); r1 = MUL_VEC_SD(r1,rE); r2 = ADD_VEC_SD(r2,rD); r3 = MUL_VEC_SD(r3,rC); r4 = ADD_VEC_SD(r4,rF); r5 = MUL_VEC_SD(r5,rE); r6 = ADD_VEC_SD(r6,rD); r7 = MUL_VEC_SD(r7,rC); r8 = ADD_VEC_SD(r8,rF); r9 = MUL_VEC_SD(r9,rE); rA = ADD_VEC_SD(rA,rD); rB = MUL_VEC_SD(rB,rC); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(24, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SD(r0,r1); r2 = ADD_VEC_SD(r2,r3); r4 = ADD_VEC_SD(r4,r5); r6 = ADD_VEC_SD(r6,r7); r8 = ADD_VEC_SD(r8,r9); rA = ADD_VEC_SD(rA,rB); r0 = ADD_VEC_SD(r0,r2); r4 = ADD_VEC_SD(r4,r6); r8 = ADD_VEC_SD(r8,rA); r0 = ADD_VEC_SD(r0,r4); r0 = ADD_VEC_SD(r0,r8); double out = 0; DP_SCALAR_TYPE temp = r0; out += ((double*)&temp)[0]; return out; } /************************************/ /* Loop unrolling: 48 instructions */ /************************************/ double test_dp_scalar_VEC_48( uint64 iterations, int EventSet, FILE *fp ){ register DP_SCALAR_TYPE 
r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_SD(0.01); r1 = SET_VEC_SD(0.02); r2 = SET_VEC_SD(0.03); r3 = SET_VEC_SD(0.04); r4 = SET_VEC_SD(0.05); r5 = SET_VEC_SD(0.06); r6 = SET_VEC_SD(0.07); r7 = SET_VEC_SD(0.08); r8 = SET_VEC_SD(0.09); r9 = SET_VEC_SD(0.10); rA = SET_VEC_SD(0.11); rB = SET_VEC_SD(0.12); rC = SET_VEC_SD(0.13); rD = SET_VEC_SD(0.14); rE = SET_VEC_SD(0.15); rF = SET_VEC_SD(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = MUL_VEC_SD(r0,rC); r1 = ADD_VEC_SD(r1,rD); r2 = MUL_VEC_SD(r2,rE); r3 = ADD_VEC_SD(r3,rF); r4 = MUL_VEC_SD(r4,rC); r5 = ADD_VEC_SD(r5,rD); r6 = MUL_VEC_SD(r6,rE); r7 = ADD_VEC_SD(r7,rF); r8 = MUL_VEC_SD(r8,rC); r9 = ADD_VEC_SD(r9,rD); rA = MUL_VEC_SD(rA,rE); rB = ADD_VEC_SD(rB,rF); r0 = ADD_VEC_SD(r0,rF); r1 = MUL_VEC_SD(r1,rE); r2 = ADD_VEC_SD(r2,rD); r3 = MUL_VEC_SD(r3,rC); r4 = ADD_VEC_SD(r4,rF); r5 = MUL_VEC_SD(r5,rE); r6 = ADD_VEC_SD(r6,rD); r7 = MUL_VEC_SD(r7,rC); r8 = ADD_VEC_SD(r8,rF); r9 = MUL_VEC_SD(r9,rE); rA = ADD_VEC_SD(rA,rD); rB = MUL_VEC_SD(rB,rC); r0 = MUL_VEC_SD(r0,rC); r1 = ADD_VEC_SD(r1,rD); r2 = MUL_VEC_SD(r2,rE); r3 = ADD_VEC_SD(r3,rF); r4 = MUL_VEC_SD(r4,rC); r5 = ADD_VEC_SD(r5,rD); r6 = MUL_VEC_SD(r6,rE); r7 = ADD_VEC_SD(r7,rF); r8 = MUL_VEC_SD(r8,rC); r9 = ADD_VEC_SD(r9,rD); rA = MUL_VEC_SD(rA,rE); rB = ADD_VEC_SD(rB,rF); r0 = ADD_VEC_SD(r0,rF); r1 = MUL_VEC_SD(r1,rE); r2 = ADD_VEC_SD(r2,rD); r3 = MUL_VEC_SD(r3,rC); r4 = ADD_VEC_SD(r4,rF); r5 = MUL_VEC_SD(r5,rE); r6 = ADD_VEC_SD(r6,rD); r7 = MUL_VEC_SD(r7,rC); r8 = ADD_VEC_SD(r8,rF); r9 = MUL_VEC_SD(r9,rE); rA = ADD_VEC_SD(rA,rD); rB = MUL_VEC_SD(rB,rC); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(48, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = 
ADD_VEC_SD(r0,r1); r2 = ADD_VEC_SD(r2,r3); r4 = ADD_VEC_SD(r4,r5); r6 = ADD_VEC_SD(r6,r7); r8 = ADD_VEC_SD(r8,r9); rA = ADD_VEC_SD(rA,rB); r0 = ADD_VEC_SD(r0,r2); r4 = ADD_VEC_SD(r4,r6); r8 = ADD_VEC_SD(r8,rA); r0 = ADD_VEC_SD(r0,r4); r0 = ADD_VEC_SD(r0,r8); double out = 0; DP_SCALAR_TYPE temp = r0; out += ((double*)&temp)[0]; return out; } /************************************/ /* Loop unrolling: 96 instructions */ /************************************/ double test_dp_scalar_VEC_96( uint64 iterations, int EventSet, FILE *fp ){ register DP_SCALAR_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_SD(0.01); r1 = SET_VEC_SD(0.02); r2 = SET_VEC_SD(0.03); r3 = SET_VEC_SD(0.04); r4 = SET_VEC_SD(0.05); r5 = SET_VEC_SD(0.06); r6 = SET_VEC_SD(0.07); r7 = SET_VEC_SD(0.08); r8 = SET_VEC_SD(0.09); r9 = SET_VEC_SD(0.10); rA = SET_VEC_SD(0.11); rB = SET_VEC_SD(0.12); rC = SET_VEC_SD(0.13); rD = SET_VEC_SD(0.14); rE = SET_VEC_SD(0.15); rF = SET_VEC_SD(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ r0 = MUL_VEC_SD(r0,rC); r1 = ADD_VEC_SD(r1,rD); r2 = MUL_VEC_SD(r2,rE); r3 = ADD_VEC_SD(r3,rF); r4 = MUL_VEC_SD(r4,rC); r5 = ADD_VEC_SD(r5,rD); r6 = MUL_VEC_SD(r6,rE); r7 = ADD_VEC_SD(r7,rF); r8 = MUL_VEC_SD(r8,rC); r9 = ADD_VEC_SD(r9,rD); rA = MUL_VEC_SD(rA,rE); rB = ADD_VEC_SD(rB,rF); r0 = ADD_VEC_SD(r0,rF); r1 = MUL_VEC_SD(r1,rE); r2 = ADD_VEC_SD(r2,rD); r3 = MUL_VEC_SD(r3,rC); r4 = ADD_VEC_SD(r4,rF); r5 = MUL_VEC_SD(r5,rE); r6 = ADD_VEC_SD(r6,rD); r7 = MUL_VEC_SD(r7,rC); r8 = ADD_VEC_SD(r8,rF); r9 = MUL_VEC_SD(r9,rE); rA = ADD_VEC_SD(rA,rD); rB = MUL_VEC_SD(rB,rC); r0 = MUL_VEC_SD(r0,rC); r1 = ADD_VEC_SD(r1,rD); r2 = MUL_VEC_SD(r2,rE); r3 = ADD_VEC_SD(r3,rF); r4 = MUL_VEC_SD(r4,rC); r5 = ADD_VEC_SD(r5,rD); r6 = MUL_VEC_SD(r6,rE); r7 = ADD_VEC_SD(r7,rF); r8 = 
MUL_VEC_SD(r8,rC); r9 = ADD_VEC_SD(r9,rD); rA = MUL_VEC_SD(rA,rE); rB = ADD_VEC_SD(rB,rF); r0 = ADD_VEC_SD(r0,rF); r1 = MUL_VEC_SD(r1,rE); r2 = ADD_VEC_SD(r2,rD); r3 = MUL_VEC_SD(r3,rC); r4 = ADD_VEC_SD(r4,rF); r5 = MUL_VEC_SD(r5,rE); r6 = ADD_VEC_SD(r6,rD); r7 = MUL_VEC_SD(r7,rC); r8 = ADD_VEC_SD(r8,rF); r9 = MUL_VEC_SD(r9,rE); rA = ADD_VEC_SD(rA,rD); rB = MUL_VEC_SD(rB,rC); r0 = MUL_VEC_SD(r0,rC); r1 = ADD_VEC_SD(r1,rD); r2 = MUL_VEC_SD(r2,rE); r3 = ADD_VEC_SD(r3,rF); r4 = MUL_VEC_SD(r4,rC); r5 = ADD_VEC_SD(r5,rD); r6 = MUL_VEC_SD(r6,rE); r7 = ADD_VEC_SD(r7,rF); r8 = MUL_VEC_SD(r8,rC); r9 = ADD_VEC_SD(r9,rD); rA = MUL_VEC_SD(rA,rE); rB = ADD_VEC_SD(rB,rF); r0 = ADD_VEC_SD(r0,rF); r1 = MUL_VEC_SD(r1,rE); r2 = ADD_VEC_SD(r2,rD); r3 = MUL_VEC_SD(r3,rC); r4 = ADD_VEC_SD(r4,rF); r5 = MUL_VEC_SD(r5,rE); r6 = ADD_VEC_SD(r6,rD); r7 = MUL_VEC_SD(r7,rC); r8 = ADD_VEC_SD(r8,rF); r9 = MUL_VEC_SD(r9,rE); rA = ADD_VEC_SD(rA,rD); rB = MUL_VEC_SD(rB,rC); r0 = MUL_VEC_SD(r0,rC); r1 = ADD_VEC_SD(r1,rD); r2 = MUL_VEC_SD(r2,rE); r3 = ADD_VEC_SD(r3,rF); r4 = MUL_VEC_SD(r4,rC); r5 = ADD_VEC_SD(r5,rD); r6 = MUL_VEC_SD(r6,rE); r7 = ADD_VEC_SD(r7,rF); r8 = MUL_VEC_SD(r8,rC); r9 = ADD_VEC_SD(r9,rD); rA = MUL_VEC_SD(rA,rE); rB = ADD_VEC_SD(rB,rF); r0 = ADD_VEC_SD(r0,rF); r1 = MUL_VEC_SD(r1,rE); r2 = ADD_VEC_SD(r2,rD); r3 = MUL_VEC_SD(r3,rC); r4 = ADD_VEC_SD(r4,rF); r5 = MUL_VEC_SD(r5,rE); r6 = ADD_VEC_SD(r6,rD); r7 = MUL_VEC_SD(r7,rC); r8 = ADD_VEC_SD(r8,rF); r9 = MUL_VEC_SD(r9,rE); rA = ADD_VEC_SD(rA,rD); rB = MUL_VEC_SD(rB,rC); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(96, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SD(r0,r1); r2 = ADD_VEC_SD(r2,r3); r4 = ADD_VEC_SD(r4,r5); r6 = ADD_VEC_SD(r6,r7); r8 = ADD_VEC_SD(r8,r9); rA = ADD_VEC_SD(rA,rB); r0 = ADD_VEC_SD(r0,r2); r4 = ADD_VEC_SD(r4,r6); r8 = ADD_VEC_SD(r8,rA); r0 = ADD_VEC_SD(r0,r4); r0 = ADD_VEC_SD(r0,r8); double out = 0; DP_SCALAR_TYPE temp 
= r0; out += ((double*)&temp)[0]; return out; } #if defined(ARM) half test_hp_scalar_VEC_FMA_12( uint64 iterations, int EventSet, FILE *fp ){ register half r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_SH(0.01); r1 = SET_VEC_SH(0.02); r2 = SET_VEC_SH(0.03); r3 = SET_VEC_SH(0.04); r4 = SET_VEC_SH(0.05); r5 = SET_VEC_SH(0.06); r6 = SET_VEC_SH(0.07); r7 = SET_VEC_SH(0.08); r8 = SET_VEC_SH(0.09); r9 = SET_VEC_SH(0.10); rA = SET_VEC_SH(0.11); rB = SET_VEC_SH(0.12); rC = SET_VEC_SH(0.13); rD = SET_VEC_SH(0.14); rE = SET_VEC_SH(0.15); rF = SET_VEC_SH(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ FMA_VEC_SH(r0,r0,r7,r9); FMA_VEC_SH(r1,r1,r8,rA); FMA_VEC_SH(r2,r2,r9,rB); FMA_VEC_SH(r3,r3,rA,rC); FMA_VEC_SH(r4,r4,rB,rD); FMA_VEC_SH(r5,r5,rC,rE); FMA_VEC_SH(r0,r0,rD,rF); FMA_VEC_SH(r1,r1,rC,rE); FMA_VEC_SH(r2,r2,rB,rD); FMA_VEC_SH(r3,r3,rA,rC); FMA_VEC_SH(r4,r4,r9,rB); FMA_VEC_SH(r5,r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(12, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SH(r0,r1); r2 = ADD_VEC_SH(r2,r3); r4 = ADD_VEC_SH(r4,r5); r0 = ADD_VEC_SH(r0,r6); r2 = ADD_VEC_SH(r2,r4); r0 = ADD_VEC_SH(r0,r2); half out = 0; half temp = r0; out = ADD_VEC_SH(out,temp); return out; } half test_hp_scalar_VEC_FMA_24( uint64 iterations, int EventSet, FILE *fp ){ register half r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_SH(0.01); r1 = SET_VEC_SH(0.02); r2 = SET_VEC_SH(0.03); r3 = SET_VEC_SH(0.04); r4 = SET_VEC_SH(0.05); r5 = SET_VEC_SH(0.06); r6 = SET_VEC_SH(0.07); r7 = SET_VEC_SH(0.08); r8 = SET_VEC_SH(0.09); r9 = SET_VEC_SH(0.10); rA = SET_VEC_SH(0.11); rB = SET_VEC_SH(0.12); rC = SET_VEC_SH(0.13); rD = SET_VEC_SH(0.14); rE = 
SET_VEC_SH(0.15); rF = SET_VEC_SH(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ FMA_VEC_SH(r0,r0,r7,r9); FMA_VEC_SH(r1,r1,r8,rA); FMA_VEC_SH(r2,r2,r9,rB); FMA_VEC_SH(r3,r3,rA,rC); FMA_VEC_SH(r4,r4,rB,rD); FMA_VEC_SH(r5,r5,rC,rE); FMA_VEC_SH(r0,r0,rD,rF); FMA_VEC_SH(r1,r1,rC,rE); FMA_VEC_SH(r2,r2,rB,rD); FMA_VEC_SH(r3,r3,rA,rC); FMA_VEC_SH(r4,r4,r9,rB); FMA_VEC_SH(r5,r5,r8,rA); FMA_VEC_SH(r0,r0,r7,r9); FMA_VEC_SH(r1,r1,r8,rA); FMA_VEC_SH(r2,r2,r9,rB); FMA_VEC_SH(r3,r3,rA,rC); FMA_VEC_SH(r4,r4,rB,rD); FMA_VEC_SH(r5,r5,rC,rE); FMA_VEC_SH(r0,r0,rD,rF); FMA_VEC_SH(r1,r1,rC,rE); FMA_VEC_SH(r2,r2,rB,rD); FMA_VEC_SH(r3,r3,rA,rC); FMA_VEC_SH(r4,r4,r9,rB); FMA_VEC_SH(r5,r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(24, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SH(r0,r1); r2 = ADD_VEC_SH(r2,r3); r4 = ADD_VEC_SH(r4,r5); r0 = ADD_VEC_SH(r0,r6); r2 = ADD_VEC_SH(r2,r4); r0 = ADD_VEC_SH(r0,r2); half out = 0; half temp = r0; out = ADD_VEC_SH(out,temp); return out; } half test_hp_scalar_VEC_FMA_48( uint64 iterations, int EventSet, FILE *fp ){ register half r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_SH(0.01); r1 = SET_VEC_SH(0.02); r2 = SET_VEC_SH(0.03); r3 = SET_VEC_SH(0.04); r4 = SET_VEC_SH(0.05); r5 = SET_VEC_SH(0.06); r6 = SET_VEC_SH(0.07); r7 = SET_VEC_SH(0.08); r8 = SET_VEC_SH(0.09); r9 = SET_VEC_SH(0.10); rA = SET_VEC_SH(0.11); rB = SET_VEC_SH(0.12); rC = SET_VEC_SH(0.13); rD = SET_VEC_SH(0.14); rE = SET_VEC_SH(0.15); rF = SET_VEC_SH(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ 
FMA_VEC_SH(r0,r0,r7,r9); FMA_VEC_SH(r1,r1,r8,rA); FMA_VEC_SH(r2,r2,r9,rB); FMA_VEC_SH(r3,r3,rA,rC); FMA_VEC_SH(r4,r4,rB,rD); FMA_VEC_SH(r5,r5,rC,rE); FMA_VEC_SH(r0,r0,rD,rF); FMA_VEC_SH(r1,r1,rC,rE); FMA_VEC_SH(r2,r2,rB,rD); FMA_VEC_SH(r3,r3,rA,rC); FMA_VEC_SH(r4,r4,r9,rB); FMA_VEC_SH(r5,r5,r8,rA); FMA_VEC_SH(r0,r0,r7,r9); FMA_VEC_SH(r1,r1,r8,rA); FMA_VEC_SH(r2,r2,r9,rB); FMA_VEC_SH(r3,r3,rA,rC); FMA_VEC_SH(r4,r4,rB,rD); FMA_VEC_SH(r5,r5,rC,rE); FMA_VEC_SH(r0,r0,rD,rF); FMA_VEC_SH(r1,r1,rC,rE); FMA_VEC_SH(r2,r2,rB,rD); FMA_VEC_SH(r3,r3,rA,rC); FMA_VEC_SH(r4,r4,r9,rB); FMA_VEC_SH(r5,r5,r8,rA); FMA_VEC_SH(r0,r0,r7,r9); FMA_VEC_SH(r1,r1,r8,rA); FMA_VEC_SH(r2,r2,r9,rB); FMA_VEC_SH(r3,r3,rA,rC); FMA_VEC_SH(r4,r4,rB,rD); FMA_VEC_SH(r5,r5,rC,rE); FMA_VEC_SH(r0,r0,rD,rF); FMA_VEC_SH(r1,r1,rC,rE); FMA_VEC_SH(r2,r2,rB,rD); FMA_VEC_SH(r3,r3,rA,rC); FMA_VEC_SH(r4,r4,r9,rB); FMA_VEC_SH(r5,r5,r8,rA); FMA_VEC_SH(r0,r0,r7,r9); FMA_VEC_SH(r1,r1,r8,rA); FMA_VEC_SH(r2,r2,r9,rB); FMA_VEC_SH(r3,r3,rA,rC); FMA_VEC_SH(r4,r4,rB,rD); FMA_VEC_SH(r5,r5,rC,rE); FMA_VEC_SH(r0,r0,rD,rF); FMA_VEC_SH(r1,r1,rC,rE); FMA_VEC_SH(r2,r2,rB,rD); FMA_VEC_SH(r3,r3,rA,rC); FMA_VEC_SH(r4,r4,r9,rB); FMA_VEC_SH(r5,r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(48, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SH(r0,r1); r2 = ADD_VEC_SH(r2,r3); r4 = ADD_VEC_SH(r4,r5); r0 = ADD_VEC_SH(r0,r6); r2 = ADD_VEC_SH(r2,r4); r0 = ADD_VEC_SH(r0,r2); half out = 0; half temp = r0; out = ADD_VEC_SH(out,temp); return out; } #else float test_hp_scalar_VEC_FMA_12( uint64 iterations, int EventSet, FILE *fp ){ (void)iterations; (void)EventSet; if ( NULL != fp ) { papi_stop_and_print_placeholder(12, fp); } return 0.0; } float test_hp_scalar_VEC_FMA_24( uint64 iterations, int EventSet, FILE *fp ){ (void)iterations; (void)EventSet; if ( NULL != fp ) { papi_stop_and_print_placeholder(24, fp); } return 0.0; } float 
test_hp_scalar_VEC_FMA_48( uint64 iterations, int EventSet, FILE *fp ){ (void)iterations; (void)EventSet; if ( NULL != fp ) { papi_stop_and_print_placeholder(48, fp); } return 0.0; } #endif /************************************/ /* Loop unrolling: 12 instructions */ /************************************/ #pragma GCC optimize ("O2") float test_sp_scalar_VEC_FMA_12( uint64 iterations, int EventSet, FILE *fp ){ register SP_SCALAR_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_SS(0.01); r1 = SET_VEC_SS(0.02); r2 = SET_VEC_SS(0.03); r3 = SET_VEC_SS(0.04); r4 = SET_VEC_SS(0.05); r5 = SET_VEC_SS(0.06); r6 = SET_VEC_SS(0.07); r7 = SET_VEC_SS(0.08); r8 = SET_VEC_SS(0.09); r9 = SET_VEC_SS(0.10); rA = SET_VEC_SS(0.11); rB = SET_VEC_SS(0.12); rC = SET_VEC_SS(0.13); rD = SET_VEC_SS(0.14); rE = SET_VEC_SS(0.15); rF = SET_VEC_SS(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ FMA_VEC_SS(r0,r0,r7,r9); FMA_VEC_SS(r1,r1,r8,rA); FMA_VEC_SS(r2,r2,r9,rB); FMA_VEC_SS(r3,r3,rA,rC); FMA_VEC_SS(r4,r4,rB,rD); FMA_VEC_SS(r5,r5,rC,rE); FMA_VEC_SS(r0,r0,rD,rF); FMA_VEC_SS(r1,r1,rC,rE); FMA_VEC_SS(r2,r2,rB,rD); FMA_VEC_SS(r3,r3,rA,rC); FMA_VEC_SS(r4,r4,r9,rB); FMA_VEC_SS(r5,r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(12, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SS(r0,r1); r2 = ADD_VEC_SS(r2,r3); r4 = ADD_VEC_SS(r4,r5); r0 = ADD_VEC_SS(r0,r6); r2 = ADD_VEC_SS(r2,r4); r0 = ADD_VEC_SS(r0,r2); float out = 0; SP_SCALAR_TYPE temp = r0; out += ((float*)&temp)[0]; return out; } /************************************/ /* Loop unrolling: 24 instructions */ /************************************/ float test_sp_scalar_VEC_FMA_24( uint64 iterations, int EventSet, FILE *fp ){ register 
SP_SCALAR_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_SS(0.01); r1 = SET_VEC_SS(0.02); r2 = SET_VEC_SS(0.03); r3 = SET_VEC_SS(0.04); r4 = SET_VEC_SS(0.05); r5 = SET_VEC_SS(0.06); r6 = SET_VEC_SS(0.07); r7 = SET_VEC_SS(0.08); r8 = SET_VEC_SS(0.09); r9 = SET_VEC_SS(0.10); rA = SET_VEC_SS(0.11); rB = SET_VEC_SS(0.12); rC = SET_VEC_SS(0.13); rD = SET_VEC_SS(0.14); rE = SET_VEC_SS(0.15); rF = SET_VEC_SS(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ FMA_VEC_SS(r0,r0,r7,r9); FMA_VEC_SS(r1,r1,r8,rA); FMA_VEC_SS(r2,r2,r9,rB); FMA_VEC_SS(r3,r3,rA,rC); FMA_VEC_SS(r4,r4,rB,rD); FMA_VEC_SS(r5,r5,rC,rE); FMA_VEC_SS(r0,r0,rD,rF); FMA_VEC_SS(r1,r1,rC,rE); FMA_VEC_SS(r2,r2,rB,rD); FMA_VEC_SS(r3,r3,rA,rC); FMA_VEC_SS(r4,r4,r9,rB); FMA_VEC_SS(r5,r5,r8,rA); FMA_VEC_SS(r0,r0,r7,r9); FMA_VEC_SS(r1,r1,r8,rA); FMA_VEC_SS(r2,r2,r9,rB); FMA_VEC_SS(r3,r3,rA,rC); FMA_VEC_SS(r4,r4,rB,rD); FMA_VEC_SS(r5,r5,rC,rE); FMA_VEC_SS(r0,r0,rD,rF); FMA_VEC_SS(r1,r1,rC,rE); FMA_VEC_SS(r2,r2,rB,rD); FMA_VEC_SS(r3,r3,rA,rC); FMA_VEC_SS(r4,r4,r9,rB); FMA_VEC_SS(r5,r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(24, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SS(r0,r1); r2 = ADD_VEC_SS(r2,r3); r4 = ADD_VEC_SS(r4,r5); r0 = ADD_VEC_SS(r0,r6); r2 = ADD_VEC_SS(r2,r4); r0 = ADD_VEC_SS(r0,r2); float out = 0; SP_SCALAR_TYPE temp = r0; out += ((float*)&temp)[0]; return out; } /************************************/ /* Loop unrolling: 48 instructions */ /************************************/ float test_sp_scalar_VEC_FMA_48( uint64 iterations, int EventSet, FILE *fp ){ register SP_SCALAR_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_SS(0.01); r1 = 
SET_VEC_SS(0.02); r2 = SET_VEC_SS(0.03); r3 = SET_VEC_SS(0.04); r4 = SET_VEC_SS(0.05); r5 = SET_VEC_SS(0.06); r6 = SET_VEC_SS(0.07); r7 = SET_VEC_SS(0.08); r8 = SET_VEC_SS(0.09); r9 = SET_VEC_SS(0.10); rA = SET_VEC_SS(0.11); rB = SET_VEC_SS(0.12); rC = SET_VEC_SS(0.13); rD = SET_VEC_SS(0.14); rE = SET_VEC_SS(0.15); rF = SET_VEC_SS(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ FMA_VEC_SS(r0,r0,r7,r9); FMA_VEC_SS(r1,r1,r8,rA); FMA_VEC_SS(r2,r2,r9,rB); FMA_VEC_SS(r3,r3,rA,rC); FMA_VEC_SS(r4,r4,rB,rD); FMA_VEC_SS(r5,r5,rC,rE); FMA_VEC_SS(r0,r0,rD,rF); FMA_VEC_SS(r1,r1,rC,rE); FMA_VEC_SS(r2,r2,rB,rD); FMA_VEC_SS(r3,r3,rA,rC); FMA_VEC_SS(r4,r4,r9,rB); FMA_VEC_SS(r5,r5,r8,rA); FMA_VEC_SS(r0,r0,r7,r9); FMA_VEC_SS(r1,r1,r8,rA); FMA_VEC_SS(r2,r2,r9,rB); FMA_VEC_SS(r3,r3,rA,rC); FMA_VEC_SS(r4,r4,rB,rD); FMA_VEC_SS(r5,r5,rC,rE); FMA_VEC_SS(r0,r0,rD,rF); FMA_VEC_SS(r1,r1,rC,rE); FMA_VEC_SS(r2,r2,rB,rD); FMA_VEC_SS(r3,r3,rA,rC); FMA_VEC_SS(r4,r4,r9,rB); FMA_VEC_SS(r5,r5,r8,rA); FMA_VEC_SS(r0,r0,r7,r9); FMA_VEC_SS(r1,r1,r8,rA); FMA_VEC_SS(r2,r2,r9,rB); FMA_VEC_SS(r3,r3,rA,rC); FMA_VEC_SS(r4,r4,rB,rD); FMA_VEC_SS(r5,r5,rC,rE); FMA_VEC_SS(r0,r0,rD,rF); FMA_VEC_SS(r1,r1,rC,rE); FMA_VEC_SS(r2,r2,rB,rD); FMA_VEC_SS(r3,r3,rA,rC); FMA_VEC_SS(r4,r4,r9,rB); FMA_VEC_SS(r5,r5,r8,rA); FMA_VEC_SS(r0,r0,r7,r9); FMA_VEC_SS(r1,r1,r8,rA); FMA_VEC_SS(r2,r2,r9,rB); FMA_VEC_SS(r3,r3,rA,rC); FMA_VEC_SS(r4,r4,rB,rD); FMA_VEC_SS(r5,r5,rC,rE); FMA_VEC_SS(r0,r0,rD,rF); FMA_VEC_SS(r1,r1,rC,rE); FMA_VEC_SS(r2,r2,rB,rD); FMA_VEC_SS(r3,r3,rA,rC); FMA_VEC_SS(r4,r4,r9,rB); FMA_VEC_SS(r5,r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(48, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SS(r0,r1); r2 = ADD_VEC_SS(r2,r3); r4 = ADD_VEC_SS(r4,r5); r0 = 
ADD_VEC_SS(r0,r6); r2 = ADD_VEC_SS(r2,r4); r0 = ADD_VEC_SS(r0,r2); float out = 0; SP_SCALAR_TYPE temp = r0; out += ((float*)&temp)[0]; return out; } /************************************/ /* Loop unrolling: 12 instructions */ /************************************/ double test_dp_scalar_VEC_FMA_12( uint64 iterations, int EventSet, FILE *fp ){ register DP_SCALAR_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_SD(0.01); r1 = SET_VEC_SD(0.02); r2 = SET_VEC_SD(0.03); r3 = SET_VEC_SD(0.04); r4 = SET_VEC_SD(0.05); r5 = SET_VEC_SD(0.06); r6 = SET_VEC_SD(0.07); r7 = SET_VEC_SD(0.08); r8 = SET_VEC_SD(0.09); r9 = SET_VEC_SD(0.10); rA = SET_VEC_SD(0.11); rB = SET_VEC_SD(0.12); rC = SET_VEC_SD(0.13); rD = SET_VEC_SD(0.14); rE = SET_VEC_SD(0.15); rF = SET_VEC_SD(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ FMA_VEC_SD(r0,r0,r7,r9); FMA_VEC_SD(r1,r1,r8,rA); FMA_VEC_SD(r2,r2,r9,rB); FMA_VEC_SD(r3,r3,rA,rC); FMA_VEC_SD(r4,r4,rB,rD); FMA_VEC_SD(r5,r5,rC,rE); FMA_VEC_SD(r0,r0,rD,rF); FMA_VEC_SD(r1,r1,rC,rE); FMA_VEC_SD(r2,r2,rB,rD); FMA_VEC_SD(r3,r3,rA,rC); FMA_VEC_SD(r4,r4,r9,rB); FMA_VEC_SD(r5,r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(12, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SD(r0,r1); r2 = ADD_VEC_SD(r2,r3); r4 = ADD_VEC_SD(r4,r5); r0 = ADD_VEC_SD(r0,r6); r2 = ADD_VEC_SD(r2,r4); r0 = ADD_VEC_SD(r0,r2); double out = 0; DP_SCALAR_TYPE temp = r0; out += ((double*)&temp)[0]; return out; } /************************************/ /* Loop unrolling: 24 instructions */ /************************************/ double test_dp_scalar_VEC_FMA_24( uint64 iterations, int EventSet, FILE *fp ){ register DP_SCALAR_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* 
Generate starting data */ r0 = SET_VEC_SD(0.01); r1 = SET_VEC_SD(0.02); r2 = SET_VEC_SD(0.03); r3 = SET_VEC_SD(0.04); r4 = SET_VEC_SD(0.05); r5 = SET_VEC_SD(0.06); r6 = SET_VEC_SD(0.07); r7 = SET_VEC_SD(0.08); r8 = SET_VEC_SD(0.09); r9 = SET_VEC_SD(0.10); rA = SET_VEC_SD(0.11); rB = SET_VEC_SD(0.12); rC = SET_VEC_SD(0.13); rD = SET_VEC_SD(0.14); rE = SET_VEC_SD(0.15); rF = SET_VEC_SD(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ FMA_VEC_SD(r0,r0,r7,r9); FMA_VEC_SD(r1,r1,r8,rA); FMA_VEC_SD(r2,r2,r9,rB); FMA_VEC_SD(r3,r3,rA,rC); FMA_VEC_SD(r4,r4,rB,rD); FMA_VEC_SD(r5,r5,rC,rE); FMA_VEC_SD(r0,r0,rD,rF); FMA_VEC_SD(r1,r1,rC,rE); FMA_VEC_SD(r2,r2,rB,rD); FMA_VEC_SD(r3,r3,rA,rC); FMA_VEC_SD(r4,r4,r9,rB); FMA_VEC_SD(r5,r5,r8,rA); FMA_VEC_SD(r0,r0,r7,r9); FMA_VEC_SD(r1,r1,r8,rA); FMA_VEC_SD(r2,r2,r9,rB); FMA_VEC_SD(r3,r3,rA,rC); FMA_VEC_SD(r4,r4,rB,rD); FMA_VEC_SD(r5,r5,rC,rE); FMA_VEC_SD(r0,r0,rD,rF); FMA_VEC_SD(r1,r1,rC,rE); FMA_VEC_SD(r2,r2,rB,rD); FMA_VEC_SD(r3,r3,rA,rC); FMA_VEC_SD(r4,r4,r9,rB); FMA_VEC_SD(r5,r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(24, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SD(r0,r1); r2 = ADD_VEC_SD(r2,r3); r4 = ADD_VEC_SD(r4,r5); r0 = ADD_VEC_SD(r0,r6); r2 = ADD_VEC_SD(r2,r4); r0 = ADD_VEC_SD(r0,r2); double out = 0; DP_SCALAR_TYPE temp = r0; out += ((double*)&temp)[0]; return out; } /************************************/ /* Loop unrolling: 48 instructions */ /************************************/ double test_dp_scalar_VEC_FMA_48( uint64 iterations, int EventSet, FILE *fp ){ register DP_SCALAR_TYPE r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,rA,rB,rC,rD,rE,rF; /* Generate starting data */ r0 = SET_VEC_SD(0.01); r1 = SET_VEC_SD(0.02); r2 = SET_VEC_SD(0.03); r3 = SET_VEC_SD(0.04); r4 = 
SET_VEC_SD(0.05); r5 = SET_VEC_SD(0.06); r6 = SET_VEC_SD(0.07); r7 = SET_VEC_SD(0.08); r8 = SET_VEC_SD(0.09); r9 = SET_VEC_SD(0.10); rA = SET_VEC_SD(0.11); rB = SET_VEC_SD(0.12); rC = SET_VEC_SD(0.13); rD = SET_VEC_SD(0.14); rE = SET_VEC_SD(0.15); rF = SET_VEC_SD(0.16); /* Start PAPI counters */ if ( NULL != fp ) { if ( PAPI_start( EventSet ) != PAPI_OK ) { return -1; } } uint64 c = 0; while (c < iterations){ size_t i = 0; while (i < ITER){ /* The performance critical part */ FMA_VEC_SD(r0,r0,r7,r9); FMA_VEC_SD(r1,r1,r8,rA); FMA_VEC_SD(r2,r2,r9,rB); FMA_VEC_SD(r3,r3,rA,rC); FMA_VEC_SD(r4,r4,rB,rD); FMA_VEC_SD(r5,r5,rC,rE); FMA_VEC_SD(r0,r0,rD,rF); FMA_VEC_SD(r1,r1,rC,rE); FMA_VEC_SD(r2,r2,rB,rD); FMA_VEC_SD(r3,r3,rA,rC); FMA_VEC_SD(r4,r4,r9,rB); FMA_VEC_SD(r5,r5,r8,rA); FMA_VEC_SD(r0,r0,r7,r9); FMA_VEC_SD(r1,r1,r8,rA); FMA_VEC_SD(r2,r2,r9,rB); FMA_VEC_SD(r3,r3,rA,rC); FMA_VEC_SD(r4,r4,rB,rD); FMA_VEC_SD(r5,r5,rC,rE); FMA_VEC_SD(r0,r0,rD,rF); FMA_VEC_SD(r1,r1,rC,rE); FMA_VEC_SD(r2,r2,rB,rD); FMA_VEC_SD(r3,r3,rA,rC); FMA_VEC_SD(r4,r4,r9,rB); FMA_VEC_SD(r5,r5,r8,rA); FMA_VEC_SD(r0,r0,r7,r9); FMA_VEC_SD(r1,r1,r8,rA); FMA_VEC_SD(r2,r2,r9,rB); FMA_VEC_SD(r3,r3,rA,rC); FMA_VEC_SD(r4,r4,rB,rD); FMA_VEC_SD(r5,r5,rC,rE); FMA_VEC_SD(r0,r0,rD,rF); FMA_VEC_SD(r1,r1,rC,rE); FMA_VEC_SD(r2,r2,rB,rD); FMA_VEC_SD(r3,r3,rA,rC); FMA_VEC_SD(r4,r4,r9,rB); FMA_VEC_SD(r5,r5,r8,rA); FMA_VEC_SD(r0,r0,r7,r9); FMA_VEC_SD(r1,r1,r8,rA); FMA_VEC_SD(r2,r2,r9,rB); FMA_VEC_SD(r3,r3,rA,rC); FMA_VEC_SD(r4,r4,rB,rD); FMA_VEC_SD(r5,r5,rC,rE); FMA_VEC_SD(r0,r0,rD,rF); FMA_VEC_SD(r1,r1,rC,rE); FMA_VEC_SD(r2,r2,rB,rD); FMA_VEC_SD(r3,r3,rA,rC); FMA_VEC_SD(r4,r4,r9,rB); FMA_VEC_SD(r5,r5,r8,rA); i++; } c++; } /* Stop PAPI counters */ if ( NULL != fp ) { papi_stop_and_print(48, EventSet, fp); } /* Use data so that compiler does not eliminate it when using -O2 */ r0 = ADD_VEC_SD(r0,r1); r2 = ADD_VEC_SD(r2,r3); r4 = ADD_VEC_SD(r4,r5); r0 = ADD_VEC_SD(r0,r6); r2 = ADD_VEC_SD(r2,r4); r0 = ADD_VEC_SD(r0,r2); 
	double out = 0;
	DP_SCALAR_TYPE temp = r0;
	out += ((double*)&temp)[0];
	return out;
}
// End of pragma.
papi-papi-7-2-0-t/src/counter_analysis_toolkit/vec_scalar_verify.h
#include <stdio.h>
#include
#include
#include "cat_arch.h"

#define ITER 1

void papi_stop_and_print_placeholder(long long theory, FILE *fp);
void papi_stop_and_print(long long theory, int EventSet, FILE *fp);

// Non-FMA-like computations.
#if defined(ARM)
half test_hp_scalar_VEC_24( uint64 iterations, int EventSet, FILE *fp );
half test_hp_scalar_VEC_48( uint64 iterations, int EventSet, FILE *fp );
half test_hp_scalar_VEC_96( uint64 iterations, int EventSet, FILE *fp );
#else
float test_hp_scalar_VEC_24( uint64 iterations, int EventSet, FILE *fp );
float test_hp_scalar_VEC_48( uint64 iterations, int EventSet, FILE *fp );
float test_hp_scalar_VEC_96( uint64 iterations, int EventSet, FILE *fp );
#endif

float test_sp_scalar_VEC_24( uint64 iterations, int EventSet, FILE *fp );
float test_sp_scalar_VEC_48( uint64 iterations, int EventSet, FILE *fp );
float test_sp_scalar_VEC_96( uint64 iterations, int EventSet, FILE *fp );

double test_dp_scalar_VEC_24( uint64 iterations, int EventSet, FILE *fp );
double test_dp_scalar_VEC_48( uint64 iterations, int EventSet, FILE *fp );
double test_dp_scalar_VEC_96( uint64 iterations, int EventSet, FILE *fp );

// Functions to emulate FMA.
#if defined(ARM)
half test_hp_scalar_VEC_FMA_12( uint64 iterations, int EventSet, FILE *fp );
half test_hp_scalar_VEC_FMA_24( uint64 iterations, int EventSet, FILE *fp );
half test_hp_scalar_VEC_FMA_48( uint64 iterations, int EventSet, FILE *fp );
#else
float test_hp_scalar_VEC_FMA_12( uint64 iterations, int EventSet, FILE *fp );
float test_hp_scalar_VEC_FMA_24( uint64 iterations, int EventSet, FILE *fp );
float test_hp_scalar_VEC_FMA_48( uint64 iterations, int EventSet, FILE *fp );
#endif

float test_sp_scalar_VEC_FMA_12( uint64 iterations, int EventSet, FILE *fp );
float test_sp_scalar_VEC_FMA_24( uint64 iterations, int EventSet, FILE *fp );
float test_sp_scalar_VEC_FMA_48( uint64 iterations, int EventSet, FILE *fp );

double test_dp_scalar_VEC_FMA_12( uint64 iterations, int EventSet, FILE *fp );
double test_dp_scalar_VEC_FMA_24( uint64 iterations, int EventSet, FILE *fp );
double test_dp_scalar_VEC_FMA_48( uint64 iterations, int EventSet, FILE *fp );
papi-papi-7-2-0-t/src/counter_analysis_toolkit/weak_symbols.c
#include <stdio.h>
#include "vec.h"

#pragma weak vec_driver
void __attribute__((weak)) vec_driver(char* papi_event_name, hw_desc_t *hw_desc, char* outdir)
{
	(void)hw_desc;
	fprintf(stderr, "Failed to create %s.vec in %s. The Vector FLOP benchmark is not supported on this architecture!\n", papi_event_name, outdir);
}
papi-papi-7-2-0-t/src/cpus.c
/****************************/
/* THIS IS OPEN SOURCE CODE */
/****************************/

/*
 * File:    cpus.c
 * Author:  Gary Mohr
 *          gary.mohr@bull.com
 *          - based on threads.c by Philip Mucci -
 */

/* This file contains cpu allocation and bookkeeping functions */

#include "papi.h"
#include "papi_internal.h"
#include "papi_vector.h"
#include "papi_memory.h"
#include "cpus.h"
#include
#include

/* The list of cpus; this gets built as user apps set the cpu papi */
/* option on an event set                                          */
static CpuInfo_t *_papi_hwi_cpu_head;

static CpuInfo_t *
_papi_hwi_lookup_cpu( unsigned int cpu_num )
{
	APIDBG("Entry:\n");

	CpuInfo_t *tmp;

	tmp = ( CpuInfo_t * ) _papi_hwi_cpu_head;
	while ( tmp != NULL ) {
		THRDBG( "Examining cpu %#x at %p\n", tmp->cpu_num, tmp );
		if ( tmp->cpu_num == cpu_num ) {
			break;
		}
		tmp = tmp->next;
		if ( tmp == _papi_hwi_cpu_head ) {
			tmp = NULL;
			break;
		}
	}

	if ( tmp ) {
		_papi_hwi_cpu_head = tmp;
		THRDBG( "Found cpu %#x at %p\n", cpu_num, tmp );
	} else {
		THRDBG( "Did not find cpu %#x\n", cpu_num );
	}

	return tmp;
}

int
_papi_hwi_lookup_or_create_cpu( CpuInfo_t **here, unsigned int cpu_num )
{
	APIDBG("Entry: here: %p\n", here);

	CpuInfo_t *tmp = NULL;
	int retval = PAPI_OK;

	_papi_hwi_lock( CPUS_LOCK );

	tmp = _papi_hwi_lookup_cpu(cpu_num);
	if ( tmp == NULL ) {
		retval = _papi_hwi_initialize_cpu( &tmp, cpu_num );
	}

	/* Increment use count, but only if we actually have a valid cpu */
	if ( retval == PAPI_OK ) {
		tmp->num_users++;
		*here = tmp;
	}

	_papi_hwi_unlock( CPUS_LOCK );

	return retval;
}

static CpuInfo_t *
allocate_cpu( unsigned int cpu_num )
{
	APIDBG("Entry: cpu_num: %d\n", cpu_num);

	CpuInfo_t *cpu;
	int i;

	/* Allocate new CpuInfo structure */
	cpu = ( CpuInfo_t * ) papi_calloc( 1, sizeof ( CpuInfo_t ) );
	if ( cpu == NULL ) {
		goto allocate_error;
	}

	/* identify the cpu this info structure represents */
	cpu->cpu_num = cpu_num;

	cpu->context = ( hwd_context_t ** )
		papi_calloc( ( size_t ) papi_num_components , sizeof ( hwd_context_t * ) );
	if ( !cpu->context ) {
		goto error_free_cpu;
	}

	/* Allocate an eventset per component per cpu?  Why? */
	cpu->running_eventset = ( EventSetInfo_t ** )
		papi_calloc(( size_t ) papi_num_components, sizeof ( EventSetInfo_t * ) );
	if ( !cpu->running_eventset ) {
		goto error_free_context;
	}

	for ( i = 0; i < papi_num_components; i++ ) {
		cpu->context[i] = ( void * )
			papi_calloc( 1, ( size_t ) _papi_hwd[i]->size.context );
		cpu->running_eventset[i] = NULL;
		if ( cpu->context[i] == NULL ) {
			goto error_free_contexts;
		}
	}

	cpu->num_users=0;

	THRDBG( "Allocated CpuInfo: %p\n", cpu );

	return cpu;

error_free_contexts:
	for ( i--; i >= 0; i-- )
		papi_free( cpu->context[i] );
error_free_context:
	papi_free( cpu->context );
error_free_cpu:
	papi_free( cpu );
allocate_error:
	return NULL;
}

/* Must be called with CPUS_LOCK held! */
static int
remove_cpu( CpuInfo_t * entry )
{
	APIDBG("Entry: entry: %p\n", entry);

	CpuInfo_t *tmp = NULL, *prev = NULL;

	THRDBG( "_papi_hwi_cpu_head was cpu %d at %p\n",
			_papi_hwi_cpu_head->cpu_num, _papi_hwi_cpu_head );

	/* Find the preceding element and the matched element,
	   short circuit if we've seen the head twice */
	for ( tmp = ( CpuInfo_t * ) _papi_hwi_cpu_head;
		  ( entry != tmp ) || ( prev == NULL ); tmp = tmp->next ) {
		prev = tmp;
	}

	if ( tmp != entry ) {
		THRDBG( "Cpu %d at %p was not found in the cpu list!\n",
				entry->cpu_num, entry );
		return PAPI_EBUG;
	}

	/* Only 1 element in list */
	if ( prev == tmp ) {
		_papi_hwi_cpu_head = NULL;
		tmp->next = NULL;
		THRDBG( "_papi_hwi_cpu_head now NULL\n" );
	} else {
		prev->next = tmp->next;
		/* If we're removing the head, better advance it! */
		if ( _papi_hwi_cpu_head == tmp ) {
			_papi_hwi_cpu_head = tmp->next;
			THRDBG( "_papi_hwi_cpu_head now cpu %d at %p\n",
					_papi_hwi_cpu_head->cpu_num, _papi_hwi_cpu_head );
		}
		THRDBG( "Removed cpu %p from list\n", tmp );
	}
	return PAPI_OK;
}

static void
free_cpu( CpuInfo_t **cpu )
{
	APIDBG( "Entry: *cpu: %p, cpu_num: %d, cpu_users: %d\n",
			*cpu, ( *cpu )->cpu_num, (*cpu)->num_users);

	int i,users,retval;

	_papi_hwi_lock( CPUS_LOCK );

	(*cpu)->num_users--;
	users=(*cpu)->num_users;

	/* Remove from linked list if no users */
	if (!users) remove_cpu( *cpu );

	_papi_hwi_unlock( CPUS_LOCK );

	/* Exit early if still users of this CPU */
	if (users!=0) return;

	THRDBG( "Shutting down cpu %d at %p\n", (*cpu)->cpu_num, cpu );

	for ( i = 0; i < papi_num_components; i++ ) {
		if (_papi_hwd[i]->cmp_info.disabled) continue;
		retval = _papi_hwd[i]->shutdown_thread( (*cpu)->context[i] );
		if ( retval != PAPI_OK ) {
			// failure = retval;
		}
	}

	for ( i = 0; i < papi_num_components; i++ ) {
		if ( ( *cpu )->context[i] ) {
			papi_free( ( *cpu )->context[i] );
		}
	}

	if ( ( *cpu )->context ) {
		papi_free( ( *cpu )->context );
	}

	if ( ( *cpu )->running_eventset ) {
		papi_free( ( *cpu )->running_eventset );
	}

	/* why do we clear this? */
	memset( *cpu, 0x00, sizeof ( CpuInfo_t ) );
	papi_free( *cpu );
	*cpu = NULL;
}

/* Must be called with CPUS_LOCK held! */
static void
insert_cpu( CpuInfo_t * entry )
{
	APIDBG("Entry: entry: %p\n", entry);

	if ( _papi_hwi_cpu_head == NULL ) {
		/* 0 elements */
		THRDBG( "_papi_hwi_cpu_head is NULL\n" );
		entry->next = entry;
	} else if ( _papi_hwi_cpu_head->next == _papi_hwi_cpu_head ) {
		/* 1 element */
		THRDBG( "_papi_hwi_cpu_head was cpu %d at %p\n",
				_papi_hwi_cpu_head->cpu_num, _papi_hwi_cpu_head );
		_papi_hwi_cpu_head->next = entry;
		entry->next = ( CpuInfo_t * ) _papi_hwi_cpu_head;
	} else {
		/* 2+ elements */
		THRDBG( "_papi_hwi_cpu_head was cpu %d at %p\n",
				_papi_hwi_cpu_head->cpu_num, _papi_hwi_cpu_head );
		entry->next = _papi_hwi_cpu_head->next;
		_papi_hwi_cpu_head->next = entry;
	}

	_papi_hwi_cpu_head = entry;
	THRDBG( "_papi_hwi_cpu_head now cpu %d at %p\n",
			_papi_hwi_cpu_head->cpu_num, _papi_hwi_cpu_head );
}

/* Must be called with CPUS_LOCK held! */
int
_papi_hwi_initialize_cpu( CpuInfo_t **dest, unsigned int cpu_num )
{
	APIDBG("Entry: dest: %p, *dest: %p, cpu_num: %d\n", dest, *dest, cpu_num);

	int retval;
	CpuInfo_t *cpu;
	int i;

	if ( ( cpu = allocate_cpu(cpu_num) ) == NULL ) {
		*dest = NULL;
		return PAPI_ENOMEM;
	}

	/* Call the component to fill in anything special. */
	for ( i = 0; i < papi_num_components; i++ ) {
		if (_papi_hwd[i]->cmp_info.disabled &&
			_papi_hwd[i]->cmp_info.disabled != PAPI_EDELAY_INIT) continue;
		retval = _papi_hwd[i]->init_thread( cpu->context[i] );
		if ( retval ) {
			free_cpu( &cpu );
			*dest = NULL;
			return retval;
		}
	}

	insert_cpu( cpu );
	*dest = cpu;

	return PAPI_OK;
}

int
_papi_hwi_shutdown_cpu( CpuInfo_t *cpu )
{
	APIDBG("Entry: cpu: %p, cpu_num: %d\n", cpu, cpu->cpu_num);
	free_cpu( &cpu );
	return PAPI_OK;
}
papi-papi-7-2-0-t/src/cpus.h
/** @file cpus.h
 * Author: Gary Mohr
 *         gary.mohr@bull.com
 *         - based on threads.h by unknown author -
 */

#ifndef PAPI_CPUS_H
#define PAPI_CPUS_H

typedef struct _CpuInfo {
	unsigned int cpu_num;
	struct _CpuInfo *next;
	hwd_context_t **context;
	EventSetInfo_t **running_eventset;
	EventSetInfo_t *from_esi;   /* ESI used for last update this control state */
	int num_users;
} CpuInfo_t;

int _papi_hwi_initialize_cpu( CpuInfo_t **dest, unsigned int cpu_num );
int _papi_hwi_shutdown_cpu( CpuInfo_t *cpu );
int _papi_hwi_lookup_or_create_cpu( CpuInfo_t ** here, unsigned int cpu_num );

#endif
papi-papi-7-2-0-t/src/ctests/
papi-papi-7-2-0-t/src/ctests/Makefile
# File: ctests/Makefile

include Makefile.target

INCLUDE = -I../testlib -I../validation_tests -I.. -I.
testlibdir = ../testlib
TESTLIB = $(testlibdir)/libtestlib.a
DOLOOPS = $(testlibdir)/do_loops.o
CLOCKCORE = $(testlibdir)/clockcore.o

validationlibdir = ../validation_tests
TESTFLOPS = $(validationlibdir)/flops_testcode.o
TESTINS = $(validationlibdir)/instructions_testcode.o
TESTCYCLES = $(validationlibdir)/busy_work.o
DISPLAYERROR = $(validationlibdir)/display_error.o

include Makefile.recipies

.PHONY : install

install: default
	@echo "C tests (DATADIR) being installed in: \"$(DATADIR)\"";
	-mkdir -p $(DATADIR)/ctests
	-chmod go+rx $(DATADIR)
	-chmod go+rx $(DATADIR)/ctests
	-find . -perm -100 -type f -exec cp {} $(DATADIR)/ctests \;
	-chmod go+rx $(DATADIR)/ctests/*
	-find . -name "*.[ch]" -type f -exec cp {} $(DATADIR)/ctests \;
	-cp Makefile.target $(DATADIR)/ctests/Makefile
	-cat Makefile.recipies >> $(DATADIR)/ctests/Makefile

papi-papi-7-2-0-t/src/ctests/Makefile.recipies

OMP      = omp_hl \
	zero_omp omptough
SMP      = zero_smp
SHMEM    = zero_shmem
PTHREADS = pthread_hl \
	pthrtough pthrtough2 thrspecific profile_pthreads overflow_pthreads \
	zero_pthreads clockres_pthreads overflow3_pthreads locks_pthreads \
	krentel_pthreads
MPX      = max_multiplex multiplex1 multiplex2 mendes-alt sdsc-mpx sdsc2-mpx \
	sdsc2-mpx-noreset sdsc4-mpx reset_multiplex
MPXPTHR  = multiplex1_pthreads multiplex3_pthreads kufrin
MPI      = mpi_hl mpi_omp_hl \
	mpifirst

ifeq ($(STATIC),)
SHARED   = shlib
endif

SERIAL   = serial_hl serial_hl_ll_comb \
	all_events all_native_events branches calibrate case1 case2 \
	cmpinfo code2name derived describe destroy disable_component \
	dmem_info eventname exeinfo failed_events first \
	get_event_component inherit \
	hwinfo johnmay2 low-level memory \
	realtime remove_events reset second tenth version virttime \
	zero zero_flip zero_named

FORKEXEC = fork fork2 exec exec2 forkexec forkexec2 forkexec3 forkexec4 \
	fork_overflow exec_overflow child_overflow system_child_overflow \
	system_overflow burn zero_fork

OVERFLOW = \
fork_overflow exec_overflow child_overflow system_child_overflow \ system_overflow burn overflow overflow_force_software \ overflow_single_event overflow_twoevents timer_overflow overflow2 \ overflow_index overflow_one_and_read overflow_allcounters PROFILE = profile profile_force_software sprofile profile_twoevents \ byte_profile ATTACH = multiattach multiattach2 zero_attach attach3 attach2 attach_target \ attach_cpu attach_validate attach_cpu_validate attach_cpu_sys_validate P4_TEST = p4_lst_ins EAR = earprofile RANGE = data_range BROKEN = pernode val_omp _ALL = $(PTHREADS) $(SERIAL) $(FORKEXEC) $(OVERFLOW) $(PROFILE) $(MPX) $(MPXPTHR) \ $(OMP) $(SMP) $(SHMEM) $(SHARED) $(EAR) $(RANGE) $(P4_TEST) $(ATTACH) $(API) ifneq ($(MPICC),) ifeq ($(NO_MPI_TESTS),) ALL = $(_ALL) $(MPI) else ALL = $(_ALL) endif else ALL = $(_ALL) endif DEFAULT = papi_api serial forkexec_tests overflow_tests profile_tests attach multiplex_and_pthreads shared all: $(ALL) default ctests ctest: $(DEFAULT) attach: $(ATTACH) p4: $(P4_TEST) ear: $(EAR) range: $(RANGE) mpi: $(MPI) shared: $(SHARED) multiplex_and_pthreads: $(MPXPTHR) $(MPX) $(PTHREADS) multiplex: $(MPX) omp: $(OMP) smp: $(SMP) pthreads: $(PTHREADS) shmem: $(SHMEM) serial: $(SERIAL) forkexec_tests: $(FORKEXEC) overflow_tests: $(OVERFLOW) profile_tests: $(PROFILE) papi_api: $(API) sdsc2: sdsc2.c $(TESTLIB) $(PAPILIB) $(TESTFLOPS) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) sdsc.c $(TESTLIB) $(TESTFLOPS) $(PAPILIB) $(LDFLAGS) -lm -o $@ sdsc2-mpx: sdsc2.c $(TESTLIB) $(PAPILIB) $(TESTFLOPS) $(CC) $(INCLUDE) $(CFLAGS) -DMPX $(TOPTFLAGS) sdsc2.c $(TESTLIB) $(TESTFLOPS) $(PAPILIB) $(LDFLAGS) -lm -o $@ branches: branches.c $(TESTLIB) $(PAPILIB) $(TESTFLOPS) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) branches.c $(TESTLIB) $(PAPILIB) $(TESTFLOPS) $(LDFLAGS) -lm -o $@ sdsc2-mpx-noreset: sdsc2.c $(TESTLIB) $(PAPILIB) $(TESTFLOPS) $(CC) $(INCLUDE) $(CFLAGS) -DMPX -DSTARTSTOP $(TOPTFLAGS) sdsc2.c $(TESTLIB) $(TESTFLOPS) $(PAPILIB) -lm $(LDFLAGS) -o $@ 
sdsc-mpx: sdsc-mpx.c $(TESTLIB) $(PAPILIB) $(TESTFLOPS) $(CC) $(INCLUDE) $(CFLAGS) -DMPX $(TOPTFLAGS) sdsc-mpx.c $(TESTLIB) $(TESTFLOPS) $(PAPILIB) $(LDFLAGS) -o $@ sdsc4-mpx: sdsc4-mpx.c $(TESTLIB) $(PAPILIB) $(TESTFLOPS) $(CC) $(INCLUDE) $(CFLAGS) -DMPX $(TOPTFLAGS) sdsc4-mpx.c $(TESTLIB) $(TESTFLOPS) $(PAPILIB) $(LDFLAGS) -lm -o $@ calibrate: calibrate.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) calibrate.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o calibrate data_range: data_range.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) data_range.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o data_range p4_lst_ins: p4_lst_ins.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) p4_lst_ins.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o p4_lst_ins acpi: acpi.c dummy.o $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) acpi.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o acpi timer_overflow: timer_overflow.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) timer_overflow.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o $@ mendes-alt: mendes-alt.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) -DMULTIPLEX mendes-alt.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o $@ max_multiplex: max_multiplex.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) max_multiplex.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o $@ multiplex1: multiplex1.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) multiplex1.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o $@ multiplex2: multiplex2.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) multiplex2.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o $@ multiplex1_pthreads: multiplex1_pthreads.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC_R) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) multiplex1_pthreads.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o $@ -lpthread kufrin: kufrin.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC_R) $(INCLUDE) $(CFLAGS) 
$(TOPTFLAGS) kufrin.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o $@ -lpthread multiplex3_pthreads: multiplex3_pthreads.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC_R) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) multiplex3_pthreads.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o $@ -lpthread overflow3_pthreads: overflow3_pthreads.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC_R) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) overflow3_pthreads.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o $@ -lpthread thrspecific: thrspecific.c $(TESTLIB) $(PAPILIB) $(CC_R) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) thrspecific.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o thrspecific -lpthread pthread_hl: pthread_hl.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC_R) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) pthread_hl.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o pthread_hl -lpthread pthrtough: pthrtough.c $(TESTLIB) $(PAPILIB) $(CC_R) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) pthrtough.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o pthrtough -lpthread pthrtough2: pthrtough2.c $(TESTLIB) $(PAPILIB) $(CC_R) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) pthrtough2.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o pthrtough2 -lpthread profile_pthreads: profile_pthreads.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC_R) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) profile_pthreads.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o profile_pthreads -lpthread locks_pthreads: locks_pthreads.c $(TESTLIB) $(PAPILIB) $(CC_R) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) locks_pthreads.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o locks_pthreads -lpthread -lm krentel_pthreads: krentel_pthreads.c $(TESTLIB) $(PAPILIB) $(CC_R) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) krentel_pthreads.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o krentel_pthreads -lpthread # krentel_pthreads_race is not included with the standard tests; # it is a modification of krentel_pthreads intended to be run with # "valgrind --tool=helgrind" to test for race conditions. 
krentel_pthreads_race: krentel_pthreads_race.c $(TESTLIB) $(PAPILIB) $(CC_R) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) krentel_pthreads_race.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o krentel_pthreads_race -lpthread overflow_pthreads: overflow_pthreads.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC_R) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) overflow_pthreads.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o overflow_pthreads -lpthread version: version.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) version.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o version zero_pthreads: zero_pthreads.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC_R) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) zero_pthreads.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o zero_pthreads -lpthread zero_smp: zero_smp.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC_R) $(INCLUDE) $(SMPCFLGS) $(CFLAGS) $(TOPTFLAGS) zero_smp.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o zero_smp $(SMPLIBS) zero_shmem: zero_shmem.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC_R) $(INCLUDE) $(SMPCFLGS) $(CFLAGS) $(TOPTFLAGS) zero_shmem.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o zero_shmem $(SMPLIBS) omp_hl: omp_hl.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) -$(CC_R) $(INCLUDE) $(OMPCFLGS) $(CFLAGS) $(TOPTFLAGS) omp_hl.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o omp_hl $(OMPLIBS) zero_omp: zero_omp.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) -$(CC_R) $(INCLUDE) $(OMPCFLGS) $(CFLAGS) $(TOPTFLAGS) zero_omp.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o zero_omp $(OMPLIBS) omptough: omptough.c $(TESTLIB) $(PAPILIB) -$(CC_R) $(INCLUDE) $(OMPCFLGS) $(CFLAGS) $(TOPTFLAGS) omptough.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o omptough $(OMPLIBS) val_omp: val_omp.c $(TESTLIB) $(PAPILIB) -$(CC_R) $(INCLUDE) $(OMPCFLGS) $(CFLAGS) $(TOPTFLAGS) val_omp.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o val_omp $(OMPLIBS) clockres_pthreads: clockres_pthreads.c $(TESTLIB) $(CLOCKCORE) $(PAPILIB) $(CC_R) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) clockres_pthreads.c $(TESTLIB) $(CLOCKCORE) $(PAPILIB) $(LDFLAGS) -o 
clockres_pthreads -lpthread -lm inherit: inherit.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) inherit.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o inherit johnmay2: johnmay2.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) johnmay2.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o johnmay2 describe: describe.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) describe.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o describe derived: derived.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) derived.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o derived destroy: destroy.c $(TESTLIB) $(TESTINS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) destroy.c $(TESTLIB) $(TESTINS) $(PAPILIB) $(LDFLAGS) -o destroy zero: zero.c $(TESTLIB) $(TESTINS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) zero.c $(TESTLIB) $(TESTINS) $(PAPILIB) $(LDFLAGS) -o zero zero_named: zero_named.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) zero_named.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o zero_named remove_events: remove_events.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) remove_events.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o remove_events zero_fork: zero_fork.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) zero_fork.c $(DOLOOPS) $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o zero_fork try: try.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) try.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o try zero_flip: zero_flip.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) zero_flip.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o zero_flip realtime: realtime.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) realtime.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o realtime virttime: virttime.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) virttime.c $(TESTLIB) 
$(PAPILIB) $(LDFLAGS) -o virttime first: first.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) first.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o first mpi_hl: mpi_hl.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(MPICC) $(INCLUDE) $(MPFLAGS) $(CFLAGS) $(TOPTFLAGS) mpi_hl.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o mpi_hl mpi_omp_hl: mpi_omp_hl.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(MPICC) $(INCLUDE) $(MPFLAGS) $(OMPCFLGS) $(CFLAGS) $(TOPTFLAGS) mpi_omp_hl.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o mpi_omp_hl $(OMPLIBS) mpifirst: mpifirst.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(MPICC) $(INCLUDE) $(MPFLAGS) $(CFLAGS) $(TOPTFLAGS) first.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o mpifirst first-twice: first-twice.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) first-twice.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o first-twice second: second.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) second.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o second overflow: overflow.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) overflow.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o overflow overflow_allcounters: overflow_allcounters.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) overflow_allcounters.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o overflow_allcounters overflow_twoevents: overflow_twoevents.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) overflow_twoevents.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o overflow_twoevents overflow_one_and_read: overflow_one_and_read.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) overflow_one_and_read.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o overflow_one_and_read overflow_index: overflow_index.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) overflow_index.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o 
overflow_index overflow_values: overflow_values.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) overflow_values.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o overflow_values overflow2: overflow2.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) overflow2.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o overflow2 overflow_single_event: overflow_single_event.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) overflow_single_event.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o overflow_single_event overflow_force_software: overflow_force_software.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) overflow_force_software.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o overflow_force_software sprofile: sprofile.c $(TESTLIB) $(DOLOOPS) prof_utils.o $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) sprofile.c prof_utils.o $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o sprofile profile: profile.c $(TESTLIB) $(DOLOOPS) prof_utils.o $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) profile.c prof_utils.o $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o profile profile_force_software: profile.c $(TESTLIB) $(DOLOOPS) prof_utils.o $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) -DSWPROFILE profile.c prof_utils.o $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o profile_force_software profile_twoevents: profile_twoevents.c $(TESTLIB) $(DOLOOPS) prof_utils.o $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) profile_twoevents.c prof_utils.o $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o profile_twoevents earprofile: earprofile.c $(TESTLIB) $(DOLOOPS) prof_utils.o $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) earprofile.c $(TESTLIB) $(DOLOOPS) prof_utils.o $(PAPILIB) $(LDFLAGS) -o earprofile byte_profile: byte_profile.c $(TESTLIB) $(DOLOOPS) prof_utils.o $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) byte_profile.c prof_utils.o $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o byte_profile 
pernode: pernode.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) pernode.c $(LDFLAGS) -o pernode dmem_info: dmem_info.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) dmem_info.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o dmem_info serial_hl: serial_hl.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) serial_hl.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o serial_hl serial_hl_ll_comb: serial_hl_ll_comb.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) serial_hl_ll_comb.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o serial_hl_ll_comb all_events: all_events.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) all_events.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o all_events all_native_events: all_native_events.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) all_native_events.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o all_native_events failed_events: failed_events.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) failed_events.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o failed_events get_event_component: get_event_component.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) get_event_component.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o get_event_component disable_component: disable_component.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) disable_component.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o disable_component memory: memory.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) memory.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o memory tenth: tenth.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) tenth.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o tenth eventname: eventname.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) eventname.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o eventname case1: case1.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) 
$(TOPTFLAGS) case1.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o case1 case2: case2.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) case2.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o case2 low-level: low-level.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) low-level.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o low-level ifeq ($(STATIC),) shlib: shlib.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) shlib.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o shlib $(LDL) endif exeinfo: exeinfo.c $(TESTLIB) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) exeinfo.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o exeinfo cmpinfo: cmpinfo.c $(TESTLIB) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) cmpinfo.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o cmpinfo hwinfo: hwinfo.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) hwinfo.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o hwinfo code2name: code2name.c $(TESTLIB) $(PAPILIB) $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) code2name.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o code2name attach_target: attach_target.c $(DOLOOPS) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) attach_target.c -o attach_target $(DOLOOPS) $(TESTLIB) $(LDFLAGS) zero_attach: zero_attach.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) zero_attach.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o zero_attach multiattach: multiattach.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) multiattach.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o multiattach multiattach2: multiattach2.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) multiattach2.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o multiattach2 attach3: attach3.c attach_target $(TESTLIB) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) attach3.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o attach3 attach2: attach2.c attach_target $(TESTLIB) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) 
attach2.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o attach2 attach_cpu: attach_cpu.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) attach_cpu.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o attach_cpu attach_cpu_validate: attach_cpu_validate.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) attach_cpu_validate.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o attach_cpu_validate attach_cpu_sys_validate: attach_cpu_sys_validate.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) attach_cpu_sys_validate.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o attach_cpu_sys_validate attach_validate: attach_validate.c attach_target $(TESTLIB) $(TESTINS) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) attach_validate.c $(TESTLIB) $(TESTINS) $(PAPILIB) $(LDFLAGS) -o attach_validate reset: reset.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) reset.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o reset reset_multiplex: reset_multiplex.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) reset_multiplex.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o reset_multiplex fork_overflow: fork_overflow.c $(TESTLIB) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) fork_overflow.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o fork_overflow exec_overflow: exec_overflow.c $(TESTLIB) $(PAPILIB) $(TESTCYCLES) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) -DPEXEC exec_overflow.c $(TESTLIB) $(PAPILIB) $(TESTCYCLES) $(LDFLAGS) -o exec_overflow child_overflow: child_overflow.c $(TESTLIB) $(PAPILIB) $(TESTCYCLES) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) -DPCHILD child_overflow.c $(TESTLIB) $(PAPILIB) $(TESTCYCLES) $(LDFLAGS) -o child_overflow system_child_overflow: system_child_overflow.c $(TESTLIB) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) -DSYSTEM system_child_overflow.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o system_child_overflow system_overflow: system_overflow.c $(TESTLIB) 
$(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) -DSYSTEM2 system_overflow.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o system_overflow burn: burn.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) burn.c $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o burn fork: fork.c $(TESTLIB) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) fork.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o fork exec: exec.c $(TESTLIB) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) exec.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o exec exec2: exec2.c $(TESTLIB) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) exec2.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o exec2 fork2: fork2.c $(TESTLIB) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) fork2.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o fork2 forkexec: forkexec.c $(TESTLIB) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) forkexec.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o forkexec forkexec2: forkexec2.c $(TESTLIB) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) forkexec2.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o forkexec2 forkexec3: forkexec3.c $(TESTLIB) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) forkexec3.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o forkexec3 forkexec4: forkexec4.c $(TESTLIB) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) forkexec4.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o forkexec4 prof_utils.o: prof_utils.c $(testlibdir)/papi_test.h prof_utils.h $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) -c prof_utils.c filter_helgrind: filter_helgrind.c $(TESTLIB) $(PAPILIB) -$(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) filter_helgrind.c $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o filter_helgrind .PHONY : all default ctests ctest clean clean: rm -f *.o *.stderr *.stdout core *~ $(ALL) unregister_pthreads distclean clobber: clean rm -f Makefile.target papi-papi-7-2-0-t/src/ctests/Makefile.target.in000066400000000000000000000011671502707512200212730ustar00rootroot00000000000000PACKAGE_TARNAME = @PACKAGE_TARNAME@ prefix = @prefix@ exec_prefix = 
@exec_prefix@
datarootdir = @datarootdir@
datadir = @datadir@/${PACKAGE_TARNAME}
testlibdir = $(datadir)/testlib
validationlibdir = $(datadir)/validation_tests
DATADIR = $(DESTDIR)$(datadir)
INCLUDE = -I. -I@includedir@ -I$(testlibdir) -I$(validationlibdir)
LIBDIR = @libdir@
LIBRARY = @LIBRARY@
SHLIB = @SHLIB@
STATIC = @STATIC@
PAPILIB = ../@LINKLIB@
TESTLIB = $(testlibdir)/libtestlib.a
LDFLAGS = @LDFLAGS@ @LDL@ @STATIC@
CC = @CC@
MPICC = @MPICC@
NO_MPI_TESTS = @NO_MPI_TESTS@
F77 = @F77@
CC_R = @CC_R@
CFLAGS = @CFLAGS@ @TOPTFLAGS@
OMPCFLGS = @OMPCFLGS@

papi-papi-7-2-0-t/src/ctests/all_events.c

/* This file tries to add, start, stop, and remove all pre-defined events.
 * It is meant not to test the accuracy of the mapping but to make sure
 * that all events in the component will at least start (helps to
 * catch typos).
 *
 * Author: Kevin London
 *         london@cs.utk.edu
 */

#include <stdio.h>
#include <stdlib.h>

#include "papi.h"
#include "papi_test.h"

int
main( int argc, char **argv )
{
	int retval, i;
	int EventSet = PAPI_NULL, count = 0, err_count = 0;
	long long values;
	PAPI_event_info_t info;
	int quiet = 0;
	char error_message[BUFSIZ];

	/* Set TESTS_QUIET variable */
	quiet = tests_quiet( argc, argv );

	if ( !quiet ) {
		printf( "\nTrying all pre-defined events:\n" );
	}

	/* Initialize PAPI */
	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	/* Create an EventSet */
	retval = PAPI_create_eventset( &EventSet );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	/* Add all preset events */
	for ( i = 0; i < PAPI_MAX_PRESET_EVENTS; i++ ) {

		if ( PAPI_get_event_info( PAPI_PRESET_MASK | i, &info ) != PAPI_OK )
			continue;

		if ( !( info.count ) )
			continue;

		if ( !quiet )
			printf( "Adding %-14s", info.symbol );

		retval = PAPI_add_event( EventSet, ( int ) info.event_code );
		if ( retval != PAPI_OK ) {
			if ( !quiet ) {
				printf( "Error adding event %s\n", info.symbol );
				if ( retval == PAPI_ECNFLCT ) {
					printf( "Probably NMI watchdog related\n" );
				}
			}
			if ( retval == PAPI_ECNFLCT ) {
				sprintf( error_message,
					"Problem adding %s (probably NMI Watchdog related)",
					info.symbol );
			} else {
				sprintf( error_message, "Problem adding %s", info.symbol );
			}
			test_warn( __FILE__, __LINE__, error_message, retval );
			err_count++;
		} else {
			retval = PAPI_start( EventSet );
			if ( retval != PAPI_OK ) {
				PAPI_perror( "PAPI_start" );
				err_count++;
			} else {
				retval = PAPI_stop( EventSet, &values );
				if ( retval != PAPI_OK ) {
					PAPI_perror( "PAPI_stop" );
					err_count++;
				} else {
					if ( !quiet )
						printf( "successful\n" );
					count++;
				}
			}
			retval = PAPI_remove_event( EventSet, ( int ) info.event_code );
			if ( retval != PAPI_OK )
				test_fail( __FILE__, __LINE__, "PAPI_remove_event", retval );
		}
	}

	retval = PAPI_destroy_eventset( &EventSet );
	if ( retval != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval );

	if ( !quiet ) {
		printf( "Successfully added, started and stopped %d events.\n", count );
	}

	if ( err_count ) {
		if ( !quiet )
			printf( "Failed to add, start or stop %d events.\n", err_count );
	}

	if ( count <= 0 ) {
		test_fail( __FILE__, __LINE__, "No events added", 1 );
	}

	test_pass( __FILE__ );

	return 0;
}

papi-papi-7-2-0-t/src/ctests/all_native_events.c

/*
 * File:    all_native_events.c
 * Author:  Haihang You
 * Author:  Treece Burgess
 *          tburgess@icl.utk.edu
 *          (updated in November 2024 to add a flag to enable or
 *          disable Cuda events.)
 */

/* This test tries to add all native events from all components */

/* This file prints hardware info and performs the following test:
     - Start and stop all native events.

   This is a good preliminary way to validate native event tables.
   In its current form this test also stresses the number of
   event sets the library can handle outstanding.
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "papi.h"
#include "papi_test.h"

static int
check_event( int event_code, char *name, int quiet )
{
	int retval;
	long long values;
	int EventSet = PAPI_NULL;

	/* Possibly there was an older issue with the */
	/* REPLAY_EVENT:BR_MSP on Pentium4 ???        */

	/* Create an eventset */
	retval = PAPI_create_eventset( &EventSet );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	/* Add the event */
	retval = PAPI_add_event( EventSet, event_code );
	if ( retval != PAPI_OK ) {
		if ( !quiet )
			printf( "Error adding %s %d\n", name, retval );
		return retval;
	}

	/* Start the event */
	retval = PAPI_start( EventSet );
	if ( retval != PAPI_OK ) {
		PAPI_perror( "PAPI_start" );
	} else {
		retval = PAPI_stop( EventSet, &values );
		if ( retval != PAPI_OK ) {
			PAPI_perror( "PAPI_stop" );
			return retval;
		} else {
			if ( !quiet )
				printf( "Added and Stopped %s successfully.\n", name );
		}
	}

	/* Cleanup the eventset */
	retval = PAPI_cleanup_eventset( EventSet );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval );
	}

	/* Destroy the eventset */
	retval = PAPI_destroy_eventset( &EventSet );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval );
	}

	return PAPI_OK;
}

static void
print_help( char **argv )
{
	printf( "This is the all_native_events program.\n" );
	printf( "For all components compiled in, it attempts to: "
		"add the component events, start, and stop.\n" );
	printf( "Usage: %s [options]\n", argv[0] );
	printf( "General command options:\n" );
	printf( "\t-h, --help                Print the help message.\n" );
	printf( "\t--disable-cuda-events=    Optionally disable processing "
		"the Cuda native events. Default is no.\n" );
	printf( "\n" );
}

int
main( int argc, char **argv )
{
	int i, k, add_count = 0, err_count = 0;
	int retval;
	PAPI_event_info_t info, info1;
	const PAPI_hw_info_t *hwinfo = NULL;
	const PAPI_component_info_t *cmpinfo;
	char disableCudaEvts[PAPI_MIN_STR_LEN] = "no";
	int event_code;
	int numcmp, cid;
	int quiet;

	/* Set quiet variable */
	quiet = tests_quiet( argc, argv );

	/* parse command line flags */
	for ( i = 0; i < argc; i++ ) {
		if ( strncmp( argv[i], "--disable-cuda-events=", 22 ) == 0 ) {
			strncpy( disableCudaEvts, argv[i] + 22, PAPI_MIN_STR_LEN );
		}
		if ( strncmp( argv[i], "--help", 6 ) == 0 ||
		     strncmp( argv[i], "-h", 2 ) == 0 ) {
			print_help( argv );
			exit( -1 );
		}
	}

	/* Init PAPI library */
	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	if ( !quiet ) {
		printf( "Test case ALL_NATIVE_EVENTS: Available "
			"native events and hardware "
			"information.\n" );
	}

	hwinfo = PAPI_get_hardware_info(  );
	if ( hwinfo == NULL ) {
		test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 );
	}

	numcmp = PAPI_num_components(  );

	int rocm_id = PAPI_get_component_index( "rocm" );
	int cuda_id = PAPI_get_component_index( "cuda" );

	/* Loop through all components */
	for ( cid = 0; cid < numcmp; cid++ ) {

		if ( cid == rocm_id ) {
			/* skip rocm component due to a bug in rocprofiler that
			 * crashes PAPI if multiple GPUs are present */
			continue;
		}

		/* optionally skip the Cuda native events, default is no */
		if ( strcmp( disableCudaEvts, "yes" ) == 0 && cid == cuda_id ) {
			continue;
		}

		cmpinfo = PAPI_get_component_info( cid );
		if ( cmpinfo == NULL ) {
			test_fail( __FILE__, __LINE__, "PAPI_get_component_info", 2 );
		}

		/* Skip disabled components */
		if ( cmpinfo->disabled != PAPI_OK &&
		     cmpinfo->disabled != PAPI_EDELAY_INIT ) {
			if ( !quiet ) {
				printf( "Name:   %-23s %s\n",
					cmpinfo->name, cmpinfo->description );
				printf( "   \\-> Disabled: %s\n",
					cmpinfo->disabled_reason );
			}
			continue;
		}

		/* For platform independence, always ASK FOR the first event */
		/* Don't just assume it'll be the first numeric value        */
		i = 0 | PAPI_NATIVE_MASK;
		retval = PAPI_enum_cmp_event( &i, PAPI_ENUM_FIRST, cid );

		do {
			retval = PAPI_get_event_info( i, &info );
			event_code = ( int ) info.event_code;
			if ( check_event( event_code, info.symbol, quiet ) == PAPI_OK ) {
				add_count++;
			} else {
				err_count++;
			}

			/* We used to skip OFFCORE and UNCORE events */
			/* Why? */

			/* Enumerate all umasks */
			k = i;
			if ( PAPI_enum_cmp_event( &k, PAPI_NTV_ENUM_UMASKS,
						  cid ) == PAPI_OK ) {
				do {
					retval = PAPI_get_event_info( k, &info1 );
					event_code = ( int ) info1.event_code;
					if ( check_event( event_code, info1.symbol,
							  quiet ) == PAPI_OK ) {
						add_count++;
					} else {
						err_count++;
					}
				} while ( PAPI_enum_cmp_event( &k, PAPI_NTV_ENUM_UMASKS,
							       cid ) == PAPI_OK );
			}

		} while ( PAPI_enum_cmp_event( &i, PAPI_ENUM_EVENTS,
					       cid ) == PAPI_OK );
	}

	if ( !quiet ) {
		printf( "\n\nSuccessfully found and added %d events "
			"(in %d eventsets).\n", add_count, add_count );
	}

	if ( err_count ) {
		if ( !quiet )
			printf( "Failed to add %d events.\n", err_count );
	}

	if ( add_count <= 0 ) {
		test_fail( __FILE__, __LINE__, "No events added", 1 );
	}

	test_pass( __FILE__ );

	return 0;
}

papi-papi-7-2-0-t/src/ctests/attach2.c

/* This file performs the following test: start, stop and timer
   functionality for attached processes.

   - It attempts to use the following two counters. It may use fewer
     depending on hardware counter resource limitations. These are counted
     in the default counting domain and default granularity, depending on
     the platform. Usually this is the user domain (PAPI_DOM_USER) and
     thread context (PAPI_GRN_THR).
     + PAPI_FP_INS
     + PAPI_TOT_CYC
   - Get us.
   - Start counters
   - Do flops
   - Stop and read counters
   - Get us.
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <limits.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

#include "papi.h"
#include "papi_test.h"

#include "do_loops.h"

#ifdef _AIX
#define _LINUX_SOURCE_COMPAT
#endif

#if defined(__FreeBSD__)
# define PTRACE_ATTACH PT_ATTACH
# define PTRACE_TRACEME PT_TRACE_ME
#endif

static int
wait_for_attach_and_loop( void )
{
	char *path;
	char newpath[PATH_MAX];

	path = getenv( "PATH" );

	sprintf( newpath, "PATH=./:%s", ( path ) ? path : "\0" );
	putenv( newpath );

	if ( ptrace( PTRACE_TRACEME, 0, 0, 0 ) == 0 ) {
		execlp( "attach_target", "attach_target", "100000000", NULL );
		perror( "execl(attach_target) failed" );
	}
	perror( "PTRACE_TRACEME" );
	return ( 1 );
}

int
main( int argc, char **argv )
{
	int status, retval, tmp;
	int EventSet1 = PAPI_NULL;
	long long **values;
	long long elapsed_us, elapsed_cyc, elapsed_virt_us, elapsed_virt_cyc;
	char event_name[PAPI_MAX_STR_LEN];
	const PAPI_hw_info_t *hw_info;
	const PAPI_component_info_t *cmpinfo;
	pid_t pid;
	int quiet;

	/* Fork before doing anything with the PMU */

	setbuf( stdout, NULL );
	pid = fork(  );
	if ( pid < 0 )
		test_fail( __FILE__, __LINE__, "fork()", PAPI_ESYS );
	if ( pid == 0 )
		exit( wait_for_attach_and_loop(  ) );

	/* Set TESTS_QUIET variable */
	quiet = tests_quiet( argc, argv );

	/* Master only process below here */

	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT )
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );

	if ( ( cmpinfo = PAPI_get_component_info( 0 ) ) == NULL )
		test_fail( __FILE__, __LINE__, "PAPI_get_component_info", 0 );

	if ( cmpinfo->attach == 0 )
		test_skip( __FILE__, __LINE__,
			"Platform does not support attaching", 0 );

	hw_info = PAPI_get_hardware_info(  );
	if ( hw_info == NULL )
		test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 0 );

	/* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS,
	   PAPI_FP_OPS or PAPI_TOT_INS, depending on the availability of
	   the event on the platform */
	retval = PAPI_create_eventset( &EventSet1 );
	if ( retval != PAPI_OK )
		test_fail( __FILE__, __LINE__,
"PAPI_create_eventset", retval ); /* Here we are testing that this does not cause a fail */ retval = PAPI_assign_eventset_component( EventSet1, 0 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_assign_eventset_component", retval ); retval = PAPI_attach( EventSet1, ( unsigned long ) pid ); if ( retval != PAPI_OK ) { if (!quiet) printf("Cannot attach: %s\n",PAPI_strerror(retval)); test_skip( __FILE__, __LINE__, "PAPI_attach", retval ); } retval = PAPI_add_event(EventSet1, PAPI_TOT_CYC); if ( retval != PAPI_OK ) { if (!quiet) printf("Problem adding PAPI_TOT_CYC\n"); test_skip( __FILE__, __LINE__, "PAPI_add_event", retval ); } strcpy(event_name,"PAPI_FP_INS"); retval = PAPI_add_named_event(EventSet1, event_name); if ( retval == PAPI_ENOEVNT ) { strcpy(event_name,"PAPI_TOT_INS"); retval = PAPI_add_named_event(EventSet1, event_name); } if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); } values = allocate_test_space( 1, 2); elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); elapsed_virt_us = PAPI_get_virt_usec( ); elapsed_virt_cyc = PAPI_get_virt_cyc( ); if (!quiet) printf("must_ptrace is %d\n",cmpinfo->attach_must_ptrace); pid_t child = wait( &status ); if (!quiet) printf( "Debugger exited wait() with %d\n",child ); if (WIFSTOPPED( status )) { if (!quiet) printf( "Child has stopped due to signal %d (%s)\n", WSTOPSIG( status ), strsignal(WSTOPSIG( status )) ); } if (WIFSIGNALED( status )) { if (!quiet) printf( "Child %ld received signal %d (%s)\n", (long)child, WTERMSIG(status) , strsignal(WTERMSIG( status )) ); } if (!quiet) printf("After %d\n",retval); retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); if (!quiet) printf("Continuing\n"); #if defined(__FreeBSD__) if ( ptrace( PT_CONTINUE, pid, (vptr_t) 1, 0 ) == -1 ) { #else if ( ptrace( PTRACE_CONT, pid, NULL, NULL ) == -1 ) { #endif perror( "ptrace(PTRACE_CONT)" ); return 1; } do { 
child = wait( &status ); if (!quiet) printf( "Debugger exited wait() with %d\n", child); if (WIFSTOPPED( status )) { if (!quiet) printf( "Child has stopped due to signal %d (%s)\n", WSTOPSIG( status ), strsignal(WSTOPSIG( status )) ); } if (WIFSIGNALED( status )) { if (!quiet) printf( "Child %ld received signal %d (%s)\n", (long)child, WTERMSIG(status) , strsignal(WTERMSIG( status )) ); } } while (!WIFEXITED( status )); if (!quiet) printf("Child exited with value %d\n",WEXITSTATUS(status)); if (WEXITSTATUS(status) != 0) { test_fail( __FILE__, __LINE__, "Exit status of child to attach to", PAPI_EMISC); } retval = PAPI_stop( EventSet1, values[0] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); elapsed_virt_us = PAPI_get_virt_usec( ) - elapsed_virt_us; elapsed_virt_cyc = PAPI_get_virt_cyc( ) - elapsed_virt_cyc; elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; retval = PAPI_cleanup_eventset(EventSet1); if (retval != PAPI_OK) test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); retval = PAPI_destroy_eventset(&EventSet1); if (retval != PAPI_OK) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); if (!quiet) { printf( "Test case: 3rd party attach start, stop.\n" ); printf( "-----------------------------------------------\n" ); tmp = PAPI_get_opt( PAPI_DEFDOM, NULL ); printf( "Default domain is: %d (%s)\n", tmp, stringify_all_domains( tmp ) ); tmp = PAPI_get_opt( PAPI_DEFGRN, NULL ); printf( "Default granularity is: %d (%s)\n", tmp, stringify_granularity( tmp ) ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf( "-------------------------------------------------------------------------\n" ); printf( "Test type : \t 1\n" ); printf( TAB1, "PAPI_TOT_CYC : \t", ( values[0] )[0] ); printf( "%s : \t %12lld\n",event_name, ( values[0] )[1]); printf( TAB1, "Real usec : \t", elapsed_us ); printf( TAB1, "Real cycles : \t", elapsed_cyc ); printf( TAB1, "Virt 
usec : \t", elapsed_virt_us ); printf( TAB1, "Virt cycles : \t", elapsed_virt_cyc ); printf( "-------------------------------------------------------------------------\n" ); printf( "Verification: none\n" ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/attach3.c000066400000000000000000000162201502707512200174300ustar00rootroot00000000000000/* This file performs the following test: start, stop and timer functionality for attached processes. - It attempts to use the following two counters. It may use less depending on hardware counter resource limitations. These are counted in the default counting domain and default granularity, depending on the platform. Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). + PAPI_FP_INS + PAPI_TOT_CYC - Get us. - Start counters - Do flops - Stop and read counters - Get us. */ #include #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #ifdef _AIX #define _LINUX_SOURCE_COMPAT #endif #if defined(__FreeBSD__) # define PTRACE_ATTACH PT_ATTACH # define PTRACE_TRACEME PT_TRACE_ME #endif static int wait_for_attach_and_loop( void ) { char *path; char newpath[PATH_MAX]; path = getenv("PATH"); sprintf(newpath, "PATH=./:%s", (path)?path:"\0" ); putenv(newpath); if (ptrace(PTRACE_TRACEME, 0, 0, 0) == 0) { execlp("attach_target","attach_target","100000000",NULL); perror("execl(attach_target) failed"); } perror("PTRACE_TRACEME"); return ( 1 ); } int main( int argc, char **argv ) { int status, retval, tmp; int EventSet1 = PAPI_NULL; long long **values; long long elapsed_us, elapsed_cyc, elapsed_virt_us, elapsed_virt_cyc; char event_name[PAPI_MAX_STR_LEN];; const PAPI_hw_info_t *hw_info; const PAPI_component_info_t *cmpinfo; pid_t pid; int quiet; /* Fork before doing anything with the PMU */ setbuf(stdout,NULL); pid = fork( ); if ( pid < 0 ) test_fail( __FILE__, __LINE__, "fork()", PAPI_ESYS ); if ( pid == 0 ) exit( 
wait_for_attach_and_loop( ) ); /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); /* Master only process below here */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); if ( ( cmpinfo = PAPI_get_component_info( 0 ) ) == NULL ) test_fail( __FILE__, __LINE__, "PAPI_get_component_info", 0 ); if ( cmpinfo->attach == 0 ) test_skip( __FILE__, __LINE__, "Platform does not support attaching", 0 ); hw_info = PAPI_get_hardware_info( ); if ( hw_info == NULL ) test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 0 ); /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depending on the availability of the event on the platform */ retval = PAPI_create_eventset(&EventSet1); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_attach", retval ); } /* Force addition of component */ retval = PAPI_assign_eventset_component( EventSet1, 0 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_assign_eventset_component", retval ); /* The following call causes this test to fail for perf_events */ retval = PAPI_attach( EventSet1, ( unsigned long ) pid ); if ( retval != PAPI_OK ) { if (!quiet) printf("Cannot attach: %s\n",PAPI_strerror(retval)); test_skip( __FILE__, __LINE__, "PAPI_attach", retval ); } retval = PAPI_add_event(EventSet1, PAPI_TOT_CYC); if ( retval != PAPI_OK ) { if (!quiet) printf("Could not add PAPI_TOT_CYC\n"); test_skip( __FILE__, __LINE__, "PAPI_add_event", retval ); } strcpy(event_name,"PAPI_FP_INS"); retval = PAPI_add_named_event(EventSet1, event_name); if ( retval == PAPI_ENOEVNT ) { strcpy(event_name,"PAPI_TOT_INS"); retval = PAPI_add_named_event(EventSet1, event_name); } if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); } values = allocate_test_space( 1, 2); elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); elapsed_virt_us = PAPI_get_virt_usec( 
); elapsed_virt_cyc = PAPI_get_virt_cyc( ); if (!quiet) printf("must_ptrace is %d\n",cmpinfo->attach_must_ptrace); pid_t child = wait( &status ); if (!quiet) printf( "Debugger exited wait() with %d\n",child ); if (WIFSTOPPED( status )) { if (!quiet) printf( "Child has stopped due to signal %d (%s)\n", WSTOPSIG( status ), strsignal(WSTOPSIG( status )) ); } if (WIFSIGNALED( status )) { if (!quiet) printf( "Child %ld received signal %d (%s)\n", (long)child, WTERMSIG(status) , strsignal(WTERMSIG( status )) ); } if (!quiet) printf("After %d\n",retval); retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); if (!quiet) printf("Continuing\n"); #if defined(__FreeBSD__) if ( ptrace( PT_CONTINUE, pid, (vptr_t) 1, 0 ) == -1 ) { #else if ( ptrace( PTRACE_CONT, pid, NULL, NULL ) == -1 ) { #endif perror( "ptrace(PTRACE_CONT)" ); return 1; } do { child = wait( &status ); if (!quiet) printf( "Debugger exited wait() with %d\n", child); if (WIFSTOPPED( status )) { if (!quiet) printf( "Child has stopped due to signal %d (%s)\n", WSTOPSIG( status ), strsignal(WSTOPSIG( status )) ); } if (WIFSIGNALED( status )) { if (!quiet) printf( "Child %ld received signal %d (%s)\n", (long)child, WTERMSIG(status) , strsignal(WTERMSIG( status )) ); } } while (!WIFEXITED( status )); if (!quiet) printf("Child exited with value %d\n",WEXITSTATUS(status)); if (WEXITSTATUS(status) != 0) { test_fail( __FILE__, __LINE__, "Exit status of child to attach to", PAPI_EMISC); } retval = PAPI_stop( EventSet1, values[0] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); elapsed_virt_us = PAPI_get_virt_usec( ) - elapsed_virt_us; elapsed_virt_cyc = PAPI_get_virt_cyc( ) - elapsed_virt_cyc; elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; retval = PAPI_cleanup_eventset(EventSet1); if (retval != PAPI_OK) test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); retval = 
PAPI_destroy_eventset(&EventSet1); if (retval != PAPI_OK) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); if (!quiet) { printf( "Test case: 3rd party attach start, stop.\n" ); printf( "-----------------------------------------------\n" ); tmp = PAPI_get_opt( PAPI_DEFDOM, NULL ); printf( "Default domain is: %d (%s)\n", tmp, stringify_all_domains( tmp ) ); tmp = PAPI_get_opt( PAPI_DEFGRN, NULL ); printf( "Default granularity is: %d (%s)\n", tmp, stringify_granularity( tmp ) ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf( "-------------------------------------------------------------------------\n" ); printf( "Test type : \t 1\n" ); printf( TAB1, "PAPI_TOT_CYC : \t", ( values[0] )[0] ); printf( "%s : \t %12lld\n", event_name, ( values[0] )[1] ); printf( TAB1, "Real usec : \t", elapsed_us ); printf( TAB1, "Real cycles : \t", elapsed_cyc ); printf( TAB1, "Virt usec : \t", elapsed_virt_us ); printf( TAB1, "Virt cycles : \t", elapsed_virt_cyc ); printf( "-------------------------------------------------------------------------\n" ); printf( "Verification: none\n" ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/attach_cpu.c000066400000000000000000000061611502707512200202170ustar00rootroot00000000000000/* * This test case creates an event set and attaches it to a cpu. This causes only activity * on that cpu to get counted. The test case then starts the event set does a little work and * then stops the event set. It then prints out the event, count and cpu number which was used * during the test case. * * Since this test case does not try to force its own execution to the cpu which it is using to * count events, it is fairly normal to get zero counts printed at the end of the test. But every * now and then it will count the cpu where the test case is running and then the counts will be non-zero. 
* * The test case allows the user to specify which cpu should be counted by providing an argument to the * test case (ie: ./attach_cpu 3). Sometimes by trying different cpu numbers with the test case, you * can find the cpu used to run the test (because counts will look like cycle counts). * */ #include <stdio.h> #include <stdlib.h> #include "papi.h" #include "papi_test.h" #include "do_loops.h" int main( int argc, char **argv ) { int num_tests=1; int num_events=1; int retval; int cpu_num = 1; int EventSet1 = PAPI_NULL; long long **values; char event_name[PAPI_MAX_STR_LEN] = "PAPI_TOT_CYC"; PAPI_option_t opts; int quiet; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); // user can provide the cpu number on which to count events as arg 1 if (argc > 1) { retval = atoi(argv[1]); if (retval >= 0) { cpu_num = retval; } } retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); retval = PAPI_create_eventset(&EventSet1); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); // Force event set to be associated with component 0 (the perf_event component provides all core events) retval = PAPI_assign_eventset_component( EventSet1, 0 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_assign_eventset_component", retval ); // Attach this event set to the selected cpu opts.cpu.eventset = EventSet1; opts.cpu.cpu_num = cpu_num; retval = PAPI_set_opt( PAPI_CPU_ATTACH, &opts ); if ( retval != PAPI_OK ) { if (!quiet) printf("Can't PAPI_CPU_ATTACH: %s\n", PAPI_strerror(retval)); test_skip( __FILE__, __LINE__, "PAPI_set_opt", retval ); } retval = PAPI_add_named_event(EventSet1, event_name); if ( retval != PAPI_OK ) { if (!quiet) printf("Trouble adding event %s\n",event_name); test_skip( __FILE__, __LINE__, "PAPI_add_named_event", retval ); } // get space for counter values (this call is needed because it mallocs space that test_pass and friends free) values = 
allocate_test_space( num_tests, num_events); retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } // do some work do_flops(NUM_FLOPS); retval = PAPI_stop( EventSet1, values[0] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); if (!quiet) printf ("Event: %s: %8lld on Cpu: %d\n", event_name, values[0][0], cpu_num); PAPI_shutdown( ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/attach_cpu_sys_validate.c000066400000000000000000000070211502707512200227620ustar00rootroot00000000000000/* * This test attempts to attach to each CPU * Then it runs some code on one CPU * Then it reads the results, they should be different. * It sets the granularity to SYS as this is known to be broken * with some Linux/rdpmc combinations */ #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #define MAX_CPUS 16 int main( int argc, char **argv ) { int i; int retval; int num_cpus = 8; int EventSet[MAX_CPUS]; const PAPI_hw_info_t *hwinfo; double diff; long long values[MAX_CPUS]; char event_name[PAPI_MAX_STR_LEN] = "PAPI_TOT_INS"; PAPI_option_t opts; int quiet; long long average=0; int same=0; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } hwinfo = PAPI_get_hardware_info( ); if ( hwinfo==NULL) { test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", retval ); } num_cpus=hwinfo->totalcpus; if ( num_cpus < 2 ) { if (!quiet) printf("Need at least 1 CPU\n"); test_skip( __FILE__, __LINE__, "num_cpus", 0 ); } if (num_cpus > MAX_CPUS) { num_cpus=MAX_CPUS; } for(i=0;i-0.01)) same++; } if (same) { if (!quiet) { printf("Error! 
%d events were the same\n",same); } test_fail( __FILE__, __LINE__, "Too similar", 0 ); } PAPI_shutdown( ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/attach_cpu_validate.c000066400000000000000000000070031502707512200220640ustar00rootroot00000000000000/* * This test attempts to attach to each CPU * Then it runs some code on one CPU * Then it reads the results, they should be different. */ #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #define MAX_CPUS 16 int main( int argc, char **argv ) { int i; int retval; int num_cpus = 8; int EventSet[MAX_CPUS]; const PAPI_hw_info_t *hwinfo; double diff; long long values[MAX_CPUS]; char event_name[PAPI_MAX_STR_LEN] = "PAPI_TOT_INS"; PAPI_option_t opts; int quiet; long long average=0; int same=0; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } hwinfo = PAPI_get_hardware_info( ); if ( hwinfo==NULL) { test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", retval ); } num_cpus=hwinfo->totalcpus; if ( num_cpus < 2 ) { if (!quiet) printf("Need at least 1 CPU\n"); test_skip( __FILE__, __LINE__, "num_cpus", 0 ); } if (num_cpus > MAX_CPUS) { num_cpus = MAX_CPUS; } for(i=0;i-0.01)) same++; } if (same) { if (!quiet) { printf("Error! 
%d events were the same\n",same); } test_fail( __FILE__, __LINE__, "Too similar", 0 ); } PAPI_shutdown( ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/attach_target.c000066400000000000000000000003371502707512200207150ustar00rootroot00000000000000#include #include #include "do_loops.h" int main(int argc, char **argv) { int c, i = NUM_FLOPS; if (argc > 1) { c = atoi(argv[1]); if (c >= 0) { i = c; } } do_flops(i); return 0; } papi-papi-7-2-0-t/src/ctests/attach_validate.c000066400000000000000000000141061502707512200212170ustar00rootroot00000000000000/* This test attempts to attach to a child and makes sure we are */ /* getting the expected results for the child only, not local. */ #include #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #include "testcode.h" #ifdef _AIX #define _LINUX_SOURCE_COMPAT #endif #if defined(__FreeBSD__) # define PTRACE_ATTACH PT_ATTACH # define PTRACE_TRACEME PT_TRACE_ME #endif static int wait_for_attach_and_loop( int quiet ) { int i,result; if (ptrace(PTRACE_TRACEME, 0, 0, 0) == 0) { raise(SIGSTOP); if (!quiet) printf("Child running 50 million instructions\n"); /* Run 50 million instructions */ for(i=0;i<50;i++) { result=instructions_million(); } } perror("PTRACE_TRACEME"); (void)result; return 0; } int main( int argc, char **argv ) { int status, retval, tmp; int EventSet1 = PAPI_NULL; long long values[1]; const PAPI_hw_info_t *hw_info; const PAPI_component_info_t *cmpinfo; pid_t pid; int quiet; int i,result; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); /* Fork before doing anything with the PMU */ setbuf(stdout,NULL); pid = fork( ); if ( pid < 0 ) { test_fail( __FILE__, __LINE__, "fork()", PAPI_ESYS ); } /* If child */ if ( pid == 0 ) { exit(wait_for_attach_and_loop(quiet) ); } /* Parent process below here */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } 
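/* Note (comment added for clarity): the attach flow used below is the
 * standard PAPI sequence. As a sketch, with error handling omitted and
 * component 0 assumed to be perf_event:
 *     int es = PAPI_NULL;
 *     PAPI_create_eventset(&es);
 *     PAPI_assign_eventset_component(es, 0);
 *     PAPI_attach(es, (unsigned long)pid);
 *     PAPI_add_event(es, PAPI_TOT_INS);
 */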
if ( ( cmpinfo = PAPI_get_component_info( 0 ) ) == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_component_info", 0 ); } if ( cmpinfo->attach == 0 ) { test_skip( __FILE__, __LINE__, "Platform does not support attaching", 0 ); } hw_info = PAPI_get_hardware_info( ); if ( hw_info == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 0 ); } /* Create Eventset */ retval = PAPI_create_eventset(&EventSet1); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } retval = PAPI_assign_eventset_component( EventSet1, 0 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_assign_eventset_component", retval ); } /* Attach to our child */ retval = PAPI_attach( EventSet1, ( unsigned long ) pid ); if ( retval != PAPI_OK ) { if (!quiet) printf("Cannot attach: %s\n",PAPI_strerror(retval)); test_skip( __FILE__, __LINE__, "PAPI_attach", retval ); } /* Add instructions event */ retval = PAPI_add_event(EventSet1, PAPI_TOT_INS); if ( retval != PAPI_OK ) { if (!quiet) printf("Problem adding PAPI_TOT_INS\n"); test_skip( __FILE__, __LINE__, "PAPI_add_event", retval ); } if (!quiet) { printf("must_ptrace is %d\n",cmpinfo->attach_must_ptrace); } /* Wait for child to stop for debugging */ pid_t child = wait( &status ); if (!quiet) printf( "Debugger exited wait() with %d\n",child ); if (WIFSTOPPED( status )) { if (!quiet) { printf( "Child has stopped due to signal %d (%s)\n", WSTOPSIG( status ), strsignal(WSTOPSIG( status )) ); } } if (WIFSIGNALED( status )) { if (!quiet) { printf( "Child %ld received signal %d (%s)\n", (long)child, WTERMSIG(status), strsignal(WTERMSIG( status )) ); } } if (!quiet) { printf("After %d\n",retval); } /* Start eventset */ retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } if (!quiet) { printf("Continuing\n"); } #if defined(__FreeBSD__) if ( ptrace( PT_CONTINUE, pid, (vptr_t) 1, 0 ) == -1 ) { perror( "ptrace(PTRACE_CONT)" ); 
test_fail( __FILE__, __LINE__, "Continuing", PAPI_EMISC); return 1; } #else if ( ptrace( PTRACE_CONT, pid, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_CONT)" ); test_fail( __FILE__, __LINE__, "Continuing", PAPI_EMISC); } #endif /* Run a billion instructions, should not appear in count */ for(i=0;i<1000;i++) { result=instructions_million(); } /* Wait for child to finish */ do { child = wait( &status ); if (!quiet) { printf( "Debugger exited wait() with %d\n", child); } if (WIFSTOPPED( status )) { if (!quiet) { printf( "Child has stopped due to signal " "%d (%s)\n", WSTOPSIG( status ), strsignal(WSTOPSIG( status )) ); } } if (WIFSIGNALED( status )) { if (!quiet) { printf( "Child %ld received signal " "%d (%s)\n", (long)child, WTERMSIG(status), strsignal(WTERMSIG( status )) ); } } } while (!WIFEXITED( status )); if (!quiet) { printf("Child exited with value %d\n",WEXITSTATUS(status)); } if (WEXITSTATUS(status) != 0) { test_fail( __FILE__, __LINE__, "Exit status of child to attach to", PAPI_EMISC); } /* Stop counts */ retval = PAPI_stop( EventSet1, &values[0] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } retval = PAPI_cleanup_eventset(EventSet1); if (retval != PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); } retval = PAPI_destroy_eventset(&EventSet1); if (retval != PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); } if (!quiet) { printf( "Test case: attach validation.\n" ); printf( "-----------------------------------------------\n" ); tmp = PAPI_get_opt( PAPI_DEFDOM, NULL ); printf( "Default domain is: %d (%s)\n", tmp, stringify_all_domains( tmp ) ); tmp = PAPI_get_opt( PAPI_DEFGRN, NULL ); printf( "Default granularity is: %d (%s)\n", tmp, stringify_granularity( tmp ) ); printf( "Using 50 million instructions\n"); printf( "-------------------------------------------------------------------------\n" ); printf( "Test type : \t 1\n" ); printf( TAB1, "PAPI_TOT_INS : \t", ( 
values[0] ) ); printf( "-------------------------------------------------------------------------\n" ); } if (values[0]<100) { test_fail( __FILE__, __LINE__, "wrong result", PAPI_EMISC ); } if (values[0]>60000000) { test_fail( __FILE__, __LINE__, "wrong result", PAPI_EMISC ); } (void)result; test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/branches.c000066400000000000000000000161141502707512200176700ustar00rootroot00000000000000/* * Test example for branch accuracy and functionality, originally * provided by Timothy Kaiser, SDSC. It was modified to fit the * PAPI test suite by Nils Smeds * and Phil Mucci. * This example verifies the accuracy of branch events */ /* Measures 4 events: PAPI_BR_NTK -- branches not taken PAPI_BR_PRC -- branches predicted correctly PAPI_BR_INS -- total branch instructions PAPI_BR_MSP -- branches mispredicted First measure all 4 at once (or as many as will fit). Then run them one by one. Compare results to see if they match. Note: failures have occasionally been seen when the system is under heavy (fuzzing) load */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include "papi.h" #include "papi_test.h" #include "testcode.h" #define MAXEVENTS 4 #define MINCOUNTS 100000 #define MPX_TOLERANCE .20 int main( int argc, char **argv ) { PAPI_event_info_t info; int i, j, retval, errors=0; int iters = 10000000; double x = 1.1, y; long long t1, t2; long long values[MAXEVENTS], refvalues[MAXEVENTS]; double spread[MAXEVENTS]; int nevents = MAXEVENTS; int eventset = PAPI_NULL; int events[MAXEVENTS]; int quiet; char event_names[MAXEVENTS][256] = { "PAPI_BR_NTK", // not taken "PAPI_BR_PRC", // predicted correctly "PAPI_BR_INS", // total branches "PAPI_BR_MSP", // branches mispredicted }; /* Set quiet variable */ quiet = tests_quiet( argc, argv ); /* Parse command line args */ if ( argc > 1 ) { if ( !strcmp( argv[1], "TESTS_QUIET" ) ) { } } events[0] = PAPI_BR_NTK; // not taken events[1] = PAPI_BR_PRC; // predicted correctly events[2] = PAPI_BR_INS; // total branches
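/* Note (comment added for clarity): rough sanity relations among these
 * presets, approximate and architecture dependent, so not asserted by
 * this test: predicted-correctly plus mispredicted should roughly cover
 * the conditional branches, and branches-not-taken can never exceed
 * total branch instructions (PAPI_BR_NTK <= PAPI_BR_INS). */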
events[3] = PAPI_BR_MSP; // branches mispredicted /* Why were these disabled? events[3]=PAPI_BR_CN; events[4]=PAPI_BR_UCN; events[5]=PAPI_BR_TKN; */ /* Clear out the results to zero */ for ( i = 0; i < MAXEVENTS; i++ ) { values[i] = 0; } if ( !quiet ) { printf( "\nAccuracy check of branch presets.\n" ); printf( "Comparing a measurement with separate measurements.\n\n" ); } /* Initialize library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* Create Eventset */ retval = PAPI_create_eventset( &eventset ); if ( retval ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } #ifdef MPX retval = PAPI_multiplex_init( ); if ( retval ) { test_fail( __FILE__, __LINE__, "PAPI_multiplex_init", retval ); } retval = PAPI_set_multiplex( eventset ); if ( retval ) { test_fail( __FILE__, __LINE__, "PAPI_set_multiplex", retval ); } #endif nevents = 0; /* Add as many of the 4 events that exist on this machine */ for ( i = 0; i < MAXEVENTS; i++ ) { if ( PAPI_query_event( events[i] ) != PAPI_OK ) continue; if ( PAPI_add_event( eventset, events[i] ) == PAPI_OK ) { events[nevents] = events[i]; nevents++; } } /* If none of the events can be added, skip this test */ if ( nevents < 1 ) { test_skip( __FILE__, __LINE__, "Not enough events left...", 0 ); } /* Find a reasonable number of iterations (each * event active 20 times) during the measurement */ /* Target: 10000 usec/multiplex, 20 repeats */ t2 = (long long)(10000 * 20) * nevents; if ( t2 > 30e6 ) { test_skip( __FILE__, __LINE__, "This test takes too much time", retval ); } /* Measure one run */ t1 = PAPI_get_real_usec( ); y = do_flops3( x, iters, 1 ); t1 = PAPI_get_real_usec( ) - t1; if ( t2 > t1 ) /* Scale up execution time to match t2 */ iters = iters * ( int ) ( t2 / t1 ); else if ( t1 > 30e6 ) /* Make sure execution time is < 30s per repeated test */ test_skip( __FILE__, __LINE__, "This test takes too much time", 
retval ); x = 1.0; /**********************************/ /* First run: Grouped Measurement */ /**********************************/ if ( !quiet ) { printf( "\nFirst run: Together.\n" ); } t1 = PAPI_get_real_usec( ); retval = PAPI_start( eventset ); if (retval) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); y = do_flops3( x, iters, 1 ); retval = PAPI_stop( eventset, values ); if (retval) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); t2 = PAPI_get_real_usec( ); if ( !quiet ) { printf( "\tOperations= %.1f Mflop", y * 1e-6 ); printf( "\t(%g Mflop/s)\n\n", ( y / ( double ) ( t2 - t1 ) ) ); printf( "PAPI grouped measurement:\n" ); for ( j = 0; j < nevents; j++ ) { PAPI_get_event_info( events[j], &info ); printf( "%20s = ", info.short_descr ); printf( LLDFMT, values[j] ); printf( "\n" ); } printf( "\n" ); } /* Remove all the events, start again */ retval = PAPI_remove_events( eventset, events, nevents ); if (retval) test_fail( __FILE__, __LINE__, "PAPI_remove_events", retval ); retval = PAPI_destroy_eventset( &eventset ); if (retval) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); /* Recreate eventset */ eventset = PAPI_NULL; retval = PAPI_create_eventset( &eventset ); if (retval) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); /* Run events one by one */ for ( i = 0; i < nevents; i++ ) { /* Clear out old event */ retval = PAPI_cleanup_eventset( eventset ); if (retval) test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); /* Add the event */ retval = PAPI_add_event( eventset, events[i] ); if (retval) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); x = 1.0; if ( !quiet ) { printf( "\nReference measurement %d (of %d):\n", i + 1, nevents ); } t1 = PAPI_get_real_usec( ); retval = PAPI_start( eventset ); if (retval) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); y = do_flops3( x, iters, 1 ); retval = PAPI_stop( eventset, &refvalues[i] ); if (retval) test_fail( __FILE__, __LINE__, "PAPI_stop", 
retval ); t2 = PAPI_get_real_usec( ); if ( !quiet ) { printf( "\tOperations= %.1f Mflop", y * 1e-6 ); printf( "\t(%g Mflop/s)\n\n", ( y / ( double ) ( t2 - t1 ) ) ); PAPI_get_event_info( events[i], &info ); printf( "PAPI results:\n%20s = ", info.short_descr ); printf( LLDFMT, refvalues[i] ); printf( "\n" ); } } if ( !quiet ) { printf( "\n" ); } /* Validate the results */ if ( !quiet ) { printf( "\n\nRelative accuracy:\n" ); printf( "\tEvent\t\tGroup\t\tIndividual\tSpread\n"); } for ( j = 0; j < nevents; j++ ) { spread[j] = ( double ) ( refvalues[j] - values[j] ); /* compute the difference in double to avoid truncating large counts through int */ if ( spread[j] < 0 ) spread[j] = -spread[j]; if ( values[j] ) spread[j] /= ( double ) values[j]; if ( !quiet ) { printf( "\t%02d: ",j); printf( "%s",event_names[j]); printf( "\t%10lld", values[j] ); printf( "\t%10lld", refvalues[j] ); printf("\t%10.3g\n", spread[j] ); } /* Make sure that NaN gets counted as an error */ if ( spread[j] > MPX_TOLERANCE ) { /* Neglect imprecise results with low counts */ if ( refvalues[j] >= MINCOUNTS ) { errors++; if (!quiet) { printf("\tError: Spread > %lf\n",MPX_TOLERANCE); } } } } if ( !quiet ) { printf( "\n\n" ); } if ( errors ) { test_fail( __FILE__, __LINE__, "Values outside threshold", i ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/burn.c000066400000000000000000000002071502707512200170450ustar00rootroot00000000000000#include #include "do_loops.h" int main( int argc, char **argv ) { (void)argc; (void)argv; do_stuff( ); return 0; } papi-papi-7-2-0-t/src/ctests/byte_profile.c000066400000000000000000000151241502707512200205660ustar00rootroot00000000000000/* * File: byte_profile.c * Author: Dan Terpstra * terpstra@cs.utk.edu * Mods: Maynard Johnson * maynardj@us.ibm.com */ /* This file profiles multiple events with byte level address resolution. It's patterned after code suggested by John Mellor-Crummey, Rob Fowler, and Nathan Tallent. It is intended to illustrate the use of Multiprofiling on a very tight block of code at byte level resolution of the instruction addresses.
*/ #include #include #include #include "papi.h" #include "papi_test.h" #include "prof_utils.h" #include "do_loops.h" #define PROFILE_ALL static const PAPI_hw_info_t *hw_info; static int num_events = 0; #define N (1 << 23) #define T (10) double aa[N], bb[N]; double s = 0, s2 = 0; static void cleara( double a[N] ) { int i; for ( i = 0; i < N; i++ ) { a[i] = 0; } } static int my_dummy( int i ) { return ( i + 1 ); } static void my_main( void ) { int i, j; for ( j = 0; j < T; j++ ) { for ( i = 0; i < N; i++ ) { bb[i] = 0; } cleara( aa ); memset( aa, 0, sizeof ( aa ) ); for ( i = 0; i < N; i++ ) { s += aa[i] * bb[i]; s2 += aa[i] * aa[i] + bb[i] * bb[i]; } } } static int do_profile( vptr_t start, unsigned long plength, unsigned scale, int thresh, int bucket, unsigned int mask ) { int i, retval; unsigned long blength; int num_buckets,j=0; int num_bufs = num_events; int event = num_events; int events[MAX_TEST_EVENTS]; char header[BUFSIZ]; strncpy(header,"address\t\t",BUFSIZ); //= "address\t\t\tcyc\tins\tfp_ins\n"; for(i=0;imodel_string, "POWER6" ) != 0 ) { printf( TAB1, "PAPI_TOT_INS:", ( values[0] )[--event] ); } #if defined(__powerpc__) printf( TAB1, "PAPI_FP_INS", ( values[0] )[--event] ); #else if ( strcmp( hw_info->model_string, "Intel Pentium III" ) != 0 ) { printf( TAB1, "PAPI_FP_OPS:", ( values[0] )[--event] ); printf( TAB1, "PAPI_L2_TCM:", ( values[0] )[--event] ); } #endif } for ( i = 0; i < num_events; i++ ) { if ( ( retval = PAPI_profil( profbuf[i], ( unsigned int ) blength, start, scale, EventSet, events[i], 0, PAPI_PROFIL_POSIX ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_profil", retval ); } if (!TESTS_QUIET) { prof_head( blength, bucket, num_buckets, header ); prof_out( start, num_events, bucket, num_buckets, scale ); } retval = prof_check( num_bufs, bucket, num_buckets ); for ( i = 0; i < num_bufs; i++ ) { free( profbuf[i] ); } return retval; } int main( int argc, char **argv ) { long length; int mask; int retval; const PAPI_exe_info_t *prginfo; 
vptr_t start, end; int quiet; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } if ( ( prginfo = PAPI_get_executable_info( ) ) == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_executable_info", 1 ); } hw_info = PAPI_get_hardware_info( ); if ( hw_info == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 ); } mask = MASK_TOT_CYC | MASK_TOT_INS | MASK_FP_OPS | MASK_L2_TCM; #if defined(__powerpc__) if ( strcmp( hw_info->model_string, "POWER6" ) == 0 ) mask = MASK_TOT_CYC | MASK_FP_INS; else mask = MASK_TOT_CYC | MASK_TOT_INS | MASK_FP_INS; #endif #if defined(ITANIUM2) mask = MASK_TOT_CYC | MASK_FP_OPS | MASK_L2_TCM | MASK_L1_DCM; #endif EventSet = add_test_events( &num_events, &mask, 0 ); if (num_events==0) { if (!quiet) printf("Trouble adding events\n"); test_skip(__FILE__,__LINE__,"add_test_events",2); } values = allocate_test_space( 1, num_events ); /* profile the cleara and my_main address space */ start = ( vptr_t ) cleara; end = ( vptr_t ) my_dummy; /* Itanium and PowerPC64 processors return function descriptors instead * of function addresses. You must dereference the descriptor to get the address. */ #if defined(ITANIUM1) || defined(ITANIUM2) \ || (defined(__powerpc64__) && (_CALL_ELF != 2)) start = ( vptr_t ) ( ( ( struct fdesc * ) start )->ip ); end = ( vptr_t ) ( ( ( struct fdesc * ) end )->ip ); /* PPC64 Big Endian is ELF version 1 which uses function descriptors. 
* PPC64 Little Endian is ELF version 2 which does not use * function descriptors */ #endif /* call dummy so it doesn't get optimized away */ retval = my_dummy( 1 ); length = end - start; if ( length < 0 ) test_fail( __FILE__, __LINE__, "Profile length < 0!", ( int ) length ); if (!quiet) { prof_print_address( "Test case byte_profile: " "Multi-event profiling at byte resolution.\n", prginfo ); prof_print_prof_info( start, end, THRESHOLD, event_name ); } retval = do_profile( start, ( unsigned ) length, FULL_SCALE * 2, THRESHOLD, PAPI_PROFIL_BUCKET_32, mask ); remove_test_events( &EventSet, mask ); if (retval == 0) { test_fail( __FILE__, __LINE__, "No information in buffers", 1 ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/calibrate.c000066400000000000000000000302101502707512200200220ustar00rootroot00000000000000/* Calibrate.c A program to perform one or all of three tests to count flops. Test 1. Inner Product: 2*n operations for i = 1:n; a = a + x(i)*y(i); end Test 2. Matrix Vector Product: 2*n^2 operations for i = 1:n; for j = 1:n; x(i) = x(i) + a(i,j)*y(j); end; end; Test 3. Matrix Matrix Multiply: 2*n^3 operations for i = 1:n; for j = 1:n; for k = 1:n; c(i,j) = c(i,j) + a(i,k)*b(k,j); end; end; end; Supply a command line argument of 1, 2, or 3 to perform each test, or no argument to perform all three. Each test initializes PAPI and presents a header with processor information. Then it performs 500 iterations, printing result lines containing: n, measured counts, theoretical counts, (measured - theory), % error */ #include #include #include #include "papi.h" #include "papi_test.h" #define INDEX1 100 #define INDEX5 500 #define MAX_WARN 10 #define MAX_ERROR 80 #define MAX_DIFF 14 /* Extract and display hardware information for this processor. (Re)Initialize PAPI_flops() and begin counting floating ops. 
*/ static void headerlines( const char *title, int quiet ) { if ( !quiet ) { printf( "\n%s:\n%8s %12s %12s %8s %8s\n", title, "i", "papi", "theory", "diff", "%error" ); printf( "-------------------------------------------------------------------------\n" ); } } /* Read PAPI_flops. Format and display results. Compute error without using floating ops. */ #if defined(mips) #define FMA 1 #elif (defined(sparc) && defined(sun)) #define FMA 1 #else #define FMA 0 #endif static void resultline( int i, int j, int EventSet, int fail, int quiet ) { float ferror = 0; long long flpins = 0; long long papi, theory; int diff, retval; char err_str[PAPI_MAX_STR_LEN]; retval = PAPI_stop( EventSet, &flpins ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); i++; /* convert to 1s base */ theory = 2; while ( j-- ) theory *= i; /* theoretical ops */ papi = flpins << FMA; diff = ( int ) ( papi - theory ); ferror = ( ( float ) abs( diff ) ) / ( ( float ) theory ) * 100; if (!quiet) { printf( "%8d %12lld %12lld %8d %10.4f\n", i, papi, theory, diff, ferror ); } if ( ferror > MAX_WARN && abs( diff ) > MAX_DIFF && i > 20 ) { sprintf( err_str, "Calibrate: difference exceeds %d percent", MAX_WARN ); test_warn( __FILE__, __LINE__, err_str, 0 ); } if (fail) { if ( ferror > MAX_ERROR && abs( diff ) > MAX_DIFF && i > 20 ) { sprintf( err_str, "Calibrate: error exceeds %d percent", MAX_ERROR ); test_fail( __FILE__, __LINE__, err_str, PAPI_EMISC ); } } } static void print_help( char **argv ) { printf( "Usage: %s [-ivmdh] [-e event]\n", argv[0] ); printf( "Options:\n\n" ); printf( "\t-i Inner Product test.\n" ); printf( "\t-v Matrix-Vector multiply test.\n" ); printf( "\t-m Matrix-Matrix multiply test.\n" ); printf( "\t-d Double precision data. 
Default is float.\n" ); printf( "\t-e event Use as PAPI event instead of PAPI_FP_OPS\n" ); printf( "\t-f Suppress failures\n" ); printf( "\t-h Print this help message\n" ); printf( "\n" ); printf( "This test measures floating point operations for the specified test.\n" ); printf( "Operations can be performed in single or double precision.\n" ); printf( "Default operation is all three tests in single precision.\n" ); } static float inner_single( int n, float *x, float *y ) { float aa = 0.0; int i; for ( i = 0; i <= n; i++ ) aa = aa + x[i] * y[i]; return ( aa ); } static double inner_double( int n, double *x, double *y ) { double aa = 0.0; int i; for ( i = 0; i <= n; i++ ) aa = aa + x[i] * y[i]; return ( aa ); } static void vector_single( int n, float *a, float *x, float *y ) { int i, j; for ( i = 0; i <= n; i++ ) for ( j = 0; j <= n; j++ ) y[i] = y[i] + a[i * n + j] * x[i]; } static void vector_double( int n, double *a, double *x, double *y ) { int i, j; for ( i = 0; i <= n; i++ ) for ( j = 0; j <= n; j++ ) y[i] = y[i] + a[i * n + j] * x[i]; } static void matrix_single( int n, float *c, float *a, float *b ) { int i, j, k; for ( i = 0; i <= n; i++ ) for ( j = 0; j <= n; j++ ) for ( k = 0; k <= n; k++ ) c[i * n + j] = c[i * n + j] + a[i * n + k] * b[k * n + j]; } static void matrix_double( int n, double *c, double *a, double *b ) { int i, j, k; for ( i = 0; i <= n; i++ ) for ( j = 0; j <= n; j++ ) for ( k = 0; k <= n; k++ ) c[i * n + j] = c[i * n + j] + a[i * n + k] * b[k * n + j]; } static void reset_flops( const char *title, int EventSet ) { int retval; char err_str[PAPI_MAX_STR_LEN]; retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { sprintf( err_str, "%s: PAPI_start", title ); test_fail( __FILE__, __LINE__, err_str, retval ); } } int main( int argc, char *argv[] ) { extern void dummy( void * ); float aa, *a=NULL, *b=NULL, *c=NULL, *x=NULL, *y=NULL; double aad, *ad=NULL, *bd=NULL, *cd=NULL, *xd=NULL, *yd=NULL; int i, j, n; int inner = 0; int vector = 0; 
int matrix = 0; int double_precision = 0; int fail = 1; int retval = PAPI_OK; char papi_event_str[PAPI_MIN_STR_LEN] = "PAPI_FP_OPS"; int papi_event; int EventSet = PAPI_NULL; int quiet; /* Parse the input arguments */ for ( i = 0; i < argc; i++ ) { if ( strstr( argv[i], "-i" ) ) inner = 1; else if ( strstr( argv[i], "-f" ) ) fail = 0; else if ( strstr( argv[i], "-v" ) ) vector = 1; else if ( strstr( argv[i], "-m" ) ) matrix = 1; else if ( strstr( argv[i], "-e" ) ) { if ( ( argv[i + 1] == NULL ) || ( strlen( argv[i + 1] ) == 0 ) ) { print_help( argv ); exit( 1 ); } strncpy( papi_event_str, argv[i + 1], sizeof ( papi_event_str ) - 1); papi_event_str[sizeof ( papi_event_str )-1] = '\0'; i++; } else if ( strstr( argv[i], "-d" ) ) double_precision = 1; else if ( strstr( argv[i], "-h" ) ) { print_help( argv ); exit( 1 ); } } /* if no options specified, set all tests to TRUE */ if ( inner + vector + matrix == 0 ) inner = vector = matrix = 1; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); if ( !quiet ) { printf( "Initializing..." 
); } /* Initialize PAPI */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* Translate name */ retval = PAPI_event_name_to_code( papi_event_str, &papi_event ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_event_name_to_code", retval ); } if ( PAPI_query_event( papi_event ) != PAPI_OK ) { test_skip( __FILE__, __LINE__, "PAPI_query_event", PAPI_ENOEVNT ); } if ( ( retval = PAPI_create_eventset( &EventSet ) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } if ( ( retval = PAPI_add_event( EventSet, papi_event ) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); } if (!quiet) printf( "\n" ); retval = PAPI_OK; /* Inner Product test */ if ( inner ) { /* Allocate the linear arrays */ if (double_precision) { xd = malloc( INDEX5 * sizeof(double) ); yd = malloc( INDEX5 * sizeof(double) ); if ( !( xd && yd ) ) retval = PAPI_ENOMEM; } else { x = malloc( INDEX5 * sizeof(float) ); y = malloc( INDEX5 * sizeof(float) ); if ( !( x && y ) ) retval = PAPI_ENOMEM; } if ( retval == PAPI_OK ) { headerlines( "Inner Product Test", quiet ); /* step through the different array sizes */ for ( n = 0; n < INDEX5; n++ ) { if ( n < INDEX1 || ( ( n + 1 ) % 50 ) == 0 ) { /* Initialize the needed arrays at this size */ if ( double_precision ) { for ( i = 0; i <= n; i++ ) { xd[i] = ( double ) rand( ) * ( double ) 1.1; yd[i] = ( double ) rand( ) * ( double ) 1.1; } } else { for ( i = 0; i <= n; i++ ) { x[i] = ( float ) rand( ) * ( float ) 1.1; y[i] = ( float ) rand( ) * ( float ) 1.1; } } /* reset PAPI flops count */ reset_flops( "Inner Product Test", EventSet ); /* do the multiplication */ if ( double_precision ) { aad = inner_double( n, xd, yd ); dummy( ( void * ) &aad ); } else { aa = inner_single( n, x, y ); dummy( ( void * ) &aa ); } resultline( n, 1, EventSet, fail, quiet ); } } } if (double_precision) { free( xd ); 
free( yd ); } else { free( x ); free( y ); } } /* Matrix Vector test */ if ( vector && retval != PAPI_ENOMEM ) { /* Allocate the needed arrays */ if (double_precision) { ad = malloc( INDEX5 * INDEX5 * sizeof(double) ); xd = malloc( INDEX5 * sizeof(double) ); yd = malloc( INDEX5 * sizeof(double) ); if ( !( ad && xd && yd ) ) retval = PAPI_ENOMEM; } else { a = malloc( INDEX5 * INDEX5 * sizeof(float) ); x = malloc( INDEX5 * sizeof(float) ); y = malloc( INDEX5 * sizeof(float) ); if ( !( a && x && y ) ) retval = PAPI_ENOMEM; } if ( retval == PAPI_OK ) { headerlines( "Matrix Vector Test", quiet ); /* step through the different array sizes */ for ( n = 0; n < INDEX5; n++ ) { if ( n < INDEX1 || ( ( n + 1 ) % 50 ) == 0 ) { /* Initialize the needed arrays at this size */ if ( double_precision ) { for ( i = 0; i <= n; i++ ) { yd[i] = 0.0; xd[i] = ( double ) rand( ) * ( double ) 1.1; for ( j = 0; j <= n; j++ ) ad[i * n + j] = ( double ) rand( ) * ( double ) 1.1; } } else { for ( i = 0; i <= n; i++ ) { y[i] = 0.0; x[i] = ( float ) rand( ) * ( float ) 1.1; for ( j = 0; j <= n; j++ ) a[i * n + j] = ( float ) rand( ) * ( float ) 1.1; } } /* reset PAPI flops count */ reset_flops( "Matrix Vector Test", EventSet ); /* compute the resultant vector */ if ( double_precision ) { vector_double( n, ad, xd, yd ); dummy( ( void * ) yd ); } else { vector_single( n, a, x, y ); dummy( ( void * ) y ); } resultline( n, 2, EventSet, fail, quiet ); } } } if (double_precision) { free( ad ); free( xd ); free( yd ); } else { free( a ); free( x ); free( y ); } } /* Matrix Multiply test */ if ( matrix && retval != PAPI_ENOMEM ) { /* Allocate the needed arrays */ if (double_precision) { ad = malloc( INDEX5 * INDEX5 * sizeof(double) ); bd = malloc( INDEX5 * INDEX5 * sizeof(double) ); cd = malloc( INDEX5 * INDEX5 * sizeof(double) ); if ( !( ad && bd && cd ) ) retval = PAPI_ENOMEM; } else { a = malloc( INDEX5 * INDEX5 * sizeof(float) ); b = malloc( INDEX5 * INDEX5 * sizeof(float) ); c = malloc( INDEX5 * 
INDEX5 * sizeof(float) ); if ( !( a && b && c ) ) retval = PAPI_ENOMEM; } if ( retval == PAPI_OK ) { headerlines( "Matrix Multiply Test", quiet ); /* step through the different array sizes */ for ( n = 0; n < INDEX5; n++ ) { if ( n < INDEX1 || ( ( n + 1 ) % 50 ) == 0 ) { /* Initialize the needed arrays at this size */ if ( double_precision ) { for ( i = 0; i <= n * n + n; i++ ) { cd[i] = 0.0; ad[i] = ( double ) rand( ) * ( double ) 1.1; bd[i] = ( double ) rand( ) * ( double ) 1.1; } } else { for ( i = 0; i <= n * n + n; i++ ) { c[i] = 0.0; a[i] = ( float ) rand( ) * ( float ) 1.1; b[i] = ( float ) rand( ) * ( float ) 1.1; } } /* reset PAPI flops count */ reset_flops( "Matrix Multiply Test", EventSet ); /* compute the resultant matrix */ if ( double_precision ) { matrix_double( n, cd, ad, bd ); dummy( ( void * ) c ); } else { matrix_single( n, c, a, b ); dummy( ( void * ) c ); } resultline( n, 3, EventSet, fail, quiet ); } } } if (double_precision) { free( ad ); free( bd ); free( cd ); } else { free( a ); free( b ); free( c ); } } /* exit with status code */ if ( retval == PAPI_ENOMEM ) { test_fail( __FILE__, __LINE__, "malloc", retval ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/case1.c000066400000000000000000000037701502707512200171030ustar00rootroot00000000000000/* From Dave McNamara at PSRV. Thanks! */ /* If you try to add an event that doesn't exist, you get the correct error message, yet you get subsequent Seg. Faults when you try to do PAPI_start and PAPI_stop. I would expect some bizarre behavior if I had no events added to the event set and then tried to PAPI_start but if I had successfully added one event, then the 2nd one get an error when I tried to add it, is it possible for PAPI_start to work but just count the first event? 
*/ #include #include #include "papi.h" #include "papi_test.h" int main( int argc, char **argv ) { double c, a = 0.999, b = 1.001; int n = 1000; int EventSet = PAPI_NULL; int retval; int i, j = 0; long long g1[2]; tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ if ( ( retval = PAPI_library_init( PAPI_VER_CURRENT ) ) != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); if ( ( retval = PAPI_create_eventset( &EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); if ( PAPI_query_event( PAPI_L2_TCM ) == PAPI_OK ) j++; if ( j == 1 && ( retval = PAPI_add_event( EventSet, PAPI_L2_TCM ) ) != PAPI_OK ) { if ( retval != PAPI_ECNFLCT ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); j--; /* The event was not added */ } i = j; if ( PAPI_query_event( PAPI_L2_DCM ) == PAPI_OK ) j++; if ( j == ( i + 1 ) && ( retval = PAPI_add_event( EventSet, PAPI_L2_DCM ) ) != PAPI_OK ) { if ( retval != PAPI_ECNFLCT ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); j--; /* The event was not added */ } if ( j ) { if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); for ( i = 0; i < n; i++ ) { c = a * b; } if (!TESTS_QUIET) fprintf(stdout,"c=%lf\n",c); if ( ( retval = PAPI_stop( EventSet, g1 ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/case2.c000066400000000000000000000043061502707512200171000ustar00rootroot00000000000000/* From Dave McNamara at PSRV. Thanks! */ /* If an event is countable but you've exhausted the counter resources and you try to add an event, it seems subsequent PAPI_start and/or PAPI_stop will causes a Seg. Violation. I got around this by calling PAPI to get the # of countable events, then making sure that I didn't try to add more than these number of events. 
I still have a problem if someone adds Level 2 cache misses and then adds FLOPS 'cause I didn't count FLOPS as actually requiring 2 counters. */ #include #include #include "papi.h" #include "papi_test.h" int main( int argc, char **argv ) { double c, a = 0.999, b = 1.001; int n = 1000; int EventSet = PAPI_NULL; int retval; int j = 0, i; long long g1[3]; tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ if ( ( retval = PAPI_library_init( PAPI_VER_CURRENT ) ) != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); if ( ( retval = PAPI_create_eventset( &EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); if ( PAPI_query_event( PAPI_BR_CN ) == PAPI_OK ) j++; if ( j == 1 && ( retval = PAPI_add_event( EventSet, PAPI_BR_CN ) ) != PAPI_OK ) { if ( retval != PAPI_ECNFLCT ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); } i = j; if ( PAPI_query_event( PAPI_TOT_CYC ) == PAPI_OK ) j++; if ( j == ( i + 1 ) && ( retval = PAPI_add_event( EventSet, PAPI_TOT_CYC ) ) != PAPI_OK ) { if ( retval != PAPI_ECNFLCT ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); } i = j; if ( PAPI_query_event( PAPI_TOT_INS ) == PAPI_OK ) j++; if ( j == ( i + 1 ) && ( retval = PAPI_add_event( EventSet, PAPI_TOT_INS ) ) != PAPI_OK ) { if ( retval != PAPI_ECNFLCT ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); } if ( j ) { if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); for ( i = 0; i < n; i++ ) { c = a * b; } if (!TESTS_QUIET) fprintf(stdout,"c=%lf\n",c); if ( ( retval = PAPI_stop( EventSet, g1 ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/child_overflow.c000066400000000000000000000065111502707512200211110ustar00rootroot00000000000000/* * Test PAPI with fork() and exec(). 
*/ #include #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #include "testcode.h" #define MAX_EVENTS 3 static int Event[MAX_EVENTS] = { PAPI_TOT_CYC, PAPI_FP_INS, PAPI_FAD_INS, }; static int Threshold[MAX_EVENTS] = { 8000000, 4000000, 4000000, }; static struct timeval start, last; static long count, total; static void my_handler( int EventSet, void *pc, long long ovec, void *context ) { ( void ) EventSet; ( void ) pc; ( void ) ovec; ( void ) context; count++; total++; } static void print_rate( const char *str ) { static int last_count = -1; struct timeval now; double st_secs, last_secs; gettimeofday( &now, NULL ); st_secs = ( double ) ( now.tv_sec - start.tv_sec ) + ( ( double ) ( now.tv_usec - start.tv_usec ) ) / 1000000.0; last_secs = ( double ) ( now.tv_sec - last.tv_sec ) + ( ( double ) ( now.tv_usec - last.tv_usec ) ) / 1000000.0; if ( last_secs <= 0.001 ) last_secs = 0.001; if (!TESTS_QUIET) { printf( "[%d] %s, time = %.3f, total = %ld, last = %ld, rate = %.1f/sec\n", getpid( ), str, st_secs, total, count, ( ( double ) count ) / last_secs ); } if ( last_count != -1 ) { if ( count < .1 * last_count ) { test_fail( __FILE__, __LINE__, "Interrupt rate changed!", 1 ); exit( 1 ); } } last_count = ( int ) count; count = 0; last = now; } static void run( const char *str, int len ) { int n; for ( n = 1; n <= len; n++ ) { do_cycles( 1 ); print_rate( str ); } } int main( int argc, char **argv ) { int quiet,retval; int ev, EventSet = PAPI_NULL; int num_events; const char *name = "unknown"; /* Used to be able to set this via command line */ num_events=1; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); do_cycles( 1 ); /* zero out the count fields */ gettimeofday( &start, NULL ); last = start; count = 0; total = 0; /* Initialize PAPI */ retval=PAPI_library_init( PAPI_VER_CURRENT ); if (retval!=PAPI_VER_CURRENT) { test_fail( name, __LINE__, "PAPI_library_init failed", 1 ); } name = argv[0]; if (!quiet) { printf( 
"[%d] %s, num_events = %d\n", getpid(), name, num_events ); } /* Set up eventset */ if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) { test_fail( name, __LINE__, "PAPI_create_eventset failed", 1 ); } /* Add events */ for ( ev = 0; ev < num_events; ev++ ) { if ( PAPI_add_event( EventSet, Event[ev] ) != PAPI_OK ) { if (!quiet) printf("Trouble adding event.\n"); test_skip( name, __LINE__, "PAPI_add_event failed", 1 ); } } /* Set up overflow handler */ for ( ev = 0; ev < num_events; ev++ ) { if ( PAPI_overflow( EventSet, Event[ev], Threshold[ev], 0, my_handler ) != PAPI_OK ) { test_fail( name, __LINE__, "PAPI_overflow failed", 1 ); } } /* Start the eventset */ if ( PAPI_start( EventSet ) != PAPI_OK ) { test_fail( name, __LINE__, "PAPI_start failed", 1 ); } /* Generate some workload */ run( name, 3 ); if (!quiet) { printf("[%d] %s, %s\n", getpid(), name, "stop"); } /* Stop measuring */ if ( PAPI_stop( EventSet, NULL ) != PAPI_OK ) { test_fail( name, __LINE__, "PAPI_stop failed", 1 ); } if (!quiet) { printf("[%d] %s, %s\n", getpid(), name, "end"); } test_pass(__FILE__); return 0; } papi-papi-7-2-0-t/src/ctests/clockres_pthreads.c000066400000000000000000000050171502707512200216020ustar00rootroot00000000000000#include #include #include #include "papi.h" #include "papi_test.h" #include "clockcore.h" void * pthread_main( void *arg ) { ( void ) arg; int retval = PAPI_register_thread( ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_register_thread", retval ); } retval=clockcore( TESTS_QUIET ); if (retval != PAPI_OK ) { test_fail(__FILE__, __LINE__, "clockcore failure", retval ); } retval = PAPI_unregister_thread( ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_unregister_thread", retval ); } return NULL; } int main( int argc, char **argv ) { pthread_t t1, t2, t3, t4; pthread_attr_t attr; int retval; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); if (( retval = PAPI_library_init( PAPI_VER_CURRENT)) != PAPI_VER_CURRENT) { 
test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } retval = PAPI_thread_init( ( unsigned long ( * )(void) ) (pthread_self) ); if ( retval != PAPI_OK ) { if ( retval == PAPI_ECMP ) { test_skip( __FILE__, __LINE__, "PAPI_thread_init", retval ); } else { test_fail( __FILE__, __LINE__, "PAPI_thread_init", retval ); } } if ( !TESTS_QUIET ) { printf( "Test case: Clock latency and resolution.\n" ); printf( "Note: Virtual timers are proportional to # CPUs.\n" ); printf( "------------------------------------------------\n" ); } pthread_attr_init( &attr ); #ifdef PTHREAD_CREATE_UNDETACHED pthread_attr_setdetachstate( &attr, PTHREAD_CREATE_UNDETACHED ); #endif #ifdef PTHREAD_SCOPE_SYSTEM retval = pthread_attr_setscope( &attr, PTHREAD_SCOPE_SYSTEM ); if ( retval != 0 ) { test_skip( __FILE__, __LINE__, "pthread_attr_setscope", retval ); } #endif if (pthread_create( &t1, &attr, pthread_main, NULL )) { test_fail(__FILE__, __LINE__, "cannot create thread", retval); } if (pthread_create( &t2, &attr, pthread_main, NULL )) { test_fail(__FILE__, __LINE__, "cannot create thread", retval); } if (pthread_create( &t3, &attr, pthread_main, NULL )) { test_fail(__FILE__, __LINE__, "cannot create thread", retval); } if (pthread_create( &t4, &attr, pthread_main, NULL )) { test_fail(__FILE__, __LINE__, "cannot create thread", retval); } pthread_main( NULL ); pthread_join( t1, NULL ); pthread_join( t2, NULL ); pthread_join( t3, NULL ); pthread_join( t4, NULL ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/cmpinfo.c000066400000000000000000000056071502707512200175430ustar00rootroot00000000000000/* * File: cmpinfo.c * Author: Philip Mucci * mucci@cs.utk.edu * Mods: * */ #include #include #include "papi.h" #include "papi_test.h" int main( int argc, char **argv ) { int retval; const PAPI_component_info_t *cmpinfo; tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ if ( ( retval = PAPI_library_init( PAPI_VER_CURRENT ) ) != PAPI_VER_CURRENT ) test_fail( __FILE__, 
__LINE__, "PAPI_library_init", retval ); if ( ( cmpinfo = PAPI_get_component_info( 0 ) ) == NULL ) test_fail( __FILE__, __LINE__, "PAPI_get_component_info", retval ); if (!TESTS_QUIET) { printf( "name: %s\n", cmpinfo->name ); printf( "component_version: %s\n", cmpinfo->version ); printf( "support_version: %s\n", cmpinfo->support_version ); printf( "kernel_version: %s\n", cmpinfo->kernel_version ); printf( "num_cntrs: %d\n", cmpinfo->num_cntrs ); printf( "num_mpx_cntrs: %d\n", cmpinfo->num_mpx_cntrs ); printf( "num_preset_events: %d\n", cmpinfo->num_preset_events ); /* Number of counters the component supports */ printf( "num_native_events: %d\n", cmpinfo->num_native_events ); /* Number of counters the component supports */ printf( "default_domain: %#x (%s)\n", cmpinfo->default_domain, stringify_all_domains( cmpinfo->default_domain ) ); printf( "available_domains: %#x (%s)\n", cmpinfo->available_domains, stringify_all_domains( cmpinfo->available_domains ) ); /* Available domains */ printf( "default_granularity: %#x (%s)\n", cmpinfo->default_granularity, stringify_granularity( cmpinfo->default_granularity ) ); /* The default granularity when this component is used */ printf( "available_granularities: %#x (%s)\n", cmpinfo->available_granularities, stringify_all_granularities( cmpinfo->available_granularities ) ); /* Available granularities */ printf( "hardware_intr_sig: %d\n", cmpinfo->hardware_intr_sig ); printf( "hardware_intr: %d\n", cmpinfo->hardware_intr ); /* Needs hw overflow intr to be emulated in software */ printf( "precise_intr: %d\n", cmpinfo->precise_intr ); /* Performance interrupts happen precisely */ printf( "posix1b_timers: %d\n", cmpinfo->posix1b_timers ); /* Performance interrupts happen precisely */ printf( "kernel_profile: %d\n", cmpinfo->kernel_profile ); /* Needs kernel profile support (buffered interrupts) to be emulated */ printf( "kernel_multiplex: %d\n", cmpinfo->kernel_multiplex ); /* In kernel multiplexing */ printf( "fast_counter_read: 
%d\n", cmpinfo->fast_counter_read );	/* Has a fast counter read */
		printf( "fast_real_timer: %d\n", cmpinfo->fast_real_timer );	/* Has a fast real timer */
		printf( "fast_virtual_timer: %d\n", cmpinfo->fast_virtual_timer );	/* Has a fast virtual timer */
		printf( "attach: %d\n", cmpinfo->attach );	/* Supports attach */
		printf( "attach_must_ptrace: %d\n", cmpinfo->attach_must_ptrace );	/* */
	}

	test_pass( __FILE__ );

	return 0;
}
papi-papi-7-2-0-t/src/ctests/code2name.c
/* This file performs the following test: event_code_to_name */

#include <stdio.h>
#include <stdlib.h>

#include "papi.h"
#include "papi_test.h"

static void
test_continue( const char *call, int retval )
{
	if (!TESTS_QUIET) {
		printf( "Expected error in %s: %s\n", call, PAPI_strerror(retval) );
	}
}

int
main( int argc, char **argv )
{
	int retval;
	int code = PAPI_TOT_CYC, last;
	char event_name[PAPI_MAX_STR_LEN];
	const PAPI_component_info_t *cmp_info;
	int quiet;

	/* Set TESTS_QUIET variable */
	quiet = tests_quiet( argc, argv );

	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT )
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );

	if (!quiet) {
		printf( "Test case code2name.c: "
			"Check limits and indexing of event tables.\n");
		printf( "Looking for PAPI_TOT_CYC...\n" );
	}

	retval = PAPI_event_code_to_name( code, event_name );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval );
	}
	if (!quiet) printf( "Found |%s|\n", event_name );

	code = PAPI_FP_OPS;
	if (!quiet) {
		printf( "Looking for highest defined preset event "
			"(PAPI_FP_OPS): %#x...\n", code );
	}
	retval = PAPI_event_code_to_name( code, event_name );
	if ( retval != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval );
	if (!quiet) printf( "Found |%s|\n", event_name );

	code = PAPI_PRESET_MASK | ( PAPI_MAX_PRESET_EVENTS - 1 );
	if (!quiet) {
		printf( "Looking for highest allocated preset event: %#x...\n", code );
	}
	retval = PAPI_event_code_to_name( code, event_name );
	if ( retval != PAPI_OK ) {
		test_continue( "PAPI_event_code_to_name", retval );
	} else {
		if (!quiet) printf( "Found |%s|\n", event_name );
	}

	code = PAPI_PRESET_MASK | ( unsigned int ) PAPI_NATIVE_AND_MASK;
	if (!quiet) {
		printf( "Looking for highest possible preset event: %#x...\n", code );
	}
	retval = PAPI_event_code_to_name( code, event_name );
	if ( retval != PAPI_OK ) {
		test_continue( "PAPI_event_code_to_name", retval );
	} else {
		if (!quiet) printf( "Found |%s|\n", event_name );
	}

	/* Find the first defined native event in component 0 */
	/* For platform independence, always ASK FOR the first event */
	/* Don't just assume it'll be the first numeric value */
	code = PAPI_NATIVE_MASK;
	PAPI_enum_event( &code, PAPI_ENUM_FIRST );
	if (!quiet) {
		printf( "Looking for first native event: %#x...\n", code );
	}
	retval = PAPI_event_code_to_name( code, event_name );
	if ( retval != PAPI_OK ) {
		if (!quiet) printf("Could not find first native event\n");
		test_skip( __FILE__, __LINE__, "PAPI_event_code_to_name", retval );
	} else {
		if (!quiet) printf( "Found |%s|\n", event_name );
	}

	/* Find the last defined native event */
	/* FIXME: hardcoded cmp 0 */
	cmp_info = PAPI_get_component_info( 0 );
	if ( cmp_info == NULL ) {
		test_fail( __FILE__, __LINE__, "PAPI_get_component_info", PAPI_ECMP );
	}
	code = PAPI_NATIVE_MASK;
	last = code;
	PAPI_enum_event( &code, PAPI_ENUM_FIRST );
	while ( PAPI_enum_event( &code, PAPI_ENUM_EVENTS ) == PAPI_OK ) {
		last = code;
	}
	code = last;
	if (!quiet) printf( "Looking for last native event: %#x...\n", code );
	retval = PAPI_event_code_to_name( code, event_name );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval );
	} else {
		if (!quiet) printf( "Found |%s|\n", event_name );
	}

	/* Highly doubtful we have this many natives */
	/* Turn on all bits *except* PRESET bit and COMPONENT bits */
	code = PAPI_PRESET_AND_MASK;
	if (!quiet) printf( "Looking for highest definable native event: %#x...\n", code );
	retval = PAPI_event_code_to_name( code, event_name );
	if ( retval != PAPI_OK ) {
		test_continue( "PAPI_event_code_to_name", retval );
	} else {
		if (!quiet) printf( "Found |%s|\n", event_name );
	}

	if ( ( retval == PAPI_ENOCMP) || ( retval == PAPI_ENOEVNT ) || ( retval == PAPI_OK ) ) {
		test_pass( __FILE__ );
	}

	test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", PAPI_EBUG );

	return 1;
}
papi-papi-7-2-0-t/src/ctests/data_range.c
/*
 * File:    data_range.c
 * Author:  Dan Terpstra
 *          terpstra@cs.utk.edu
 * Mods:
 */

/* This file performs the following test: */
/* exercise the Itanium data address range interface */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "papi.h"
#include "papi_test.h"

#define NUM 16384

static void init_array( void );
static int do_malloc_work( long loop );
static int do_static_work( long loop );
static void measure_load_store( vptr_t start, vptr_t end );
static void measure_event( int index, PAPI_option_t * option );

int *parray1, *parray2, *parray3;
int array1[NUM], array2[NUM], array3[NUM];
char event_name[2][PAPI_MAX_STR_LEN];
int PAPI_event[2];
int EventSet = PAPI_NULL;

int
main( int argc, char **argv )
{
	int retval;
	const PAPI_exe_info_t *prginfo = NULL;
	const PAPI_hw_info_t *hw_info;

	/* Set TESTS_QUIET variable */
	tests_quiet( argc, argv );

#if !defined(ITANIUM2) && !defined(ITANIUM3)
	test_skip( __FILE__, __LINE__, "Currently only works on itanium2", 0 );
	exit( 1 );
#endif

	tests_quiet( argc, argv );	/* Set TESTS_QUIET variable */

	init_array( );
	printf( "Malloc'd array pointers: %p %p %p\n", &parray1, &parray2, &parray3 );
	printf( "Malloc'd array addresses: %p %p %p\n", parray1, parray2, parray3 );
	printf( "Static array addresses: %p %p %p\n", &array1, &array2, &array3 );

	tests_quiet( argc, argv );	/* Set TESTS_QUIET variable */

	if ( ( retval = PAPI_library_init( PAPI_VER_CURRENT ) ) != PAPI_VER_CURRENT )
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );

	hw_info = PAPI_get_hardware_info( );
	if ( hw_info == NULL )
		test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 );

	prginfo = PAPI_get_executable_info( );
	if ( prginfo == NULL )
		test_fail( __FILE__, __LINE__, "PAPI_get_executable_info", 1 );

#if defined(linux) && defined(__ia64__)
	sprintf( event_name[0], "loads_retired" );
	sprintf( event_name[1], "stores_retired" );
	PAPI_event_name_to_code( event_name[0], &PAPI_event[0] );
	PAPI_event_name_to_code( event_name[1], &PAPI_event[1] );
#else
	test_skip( __FILE__, __LINE__, "only works for Itanium", PAPI_ENOSUPP );
#endif

	if ( ( retval = PAPI_create_eventset( &EventSet ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );

	retval = PAPI_cleanup_eventset( EventSet );
	if ( retval != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval );

	retval = PAPI_assign_eventset_component( EventSet, 0 );
	if ( retval != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_assign_eventset_component", retval );

	/***************************************************************************************/
	printf( "\n\nMeasure loads and stores on the pointers to the allocated arrays\n" );
	printf( "Expected loads: %d; Expected stores: 0\n", NUM * 2 );
	printf( "These loads result from accessing the pointers to compute array addresses.\n" );
	printf( "They will likely disappear with higher levels of optimization.\n" );
	measure_load_store( ( vptr_t ) & parray1, ( vptr_t ) ( &parray1 + 1 ) );
	measure_load_store( ( vptr_t ) & parray2, ( vptr_t ) ( &parray2 + 1 ) );
	measure_load_store( ( vptr_t ) & parray3, ( vptr_t ) ( &parray3 + 1 ) );

	/***************************************************************************************/
	printf( "\n\nMeasure loads and stores on the allocated arrays themselves\n" );
	printf( "Expected loads: %d; Expected stores: %d\n", NUM, NUM );
	measure_load_store( ( vptr_t ) parray1, ( vptr_t ) ( parray1 + NUM ) );
	measure_load_store( ( vptr_t )
parray2, ( vptr_t ) ( parray2 + NUM ) );
	measure_load_store( ( vptr_t ) parray3, ( vptr_t ) ( parray3 + NUM ) );

	/***************************************************************************************/
	printf( "\n\nMeasure loads and stores on the static arrays\n" );
	printf( "These values will differ from the expected values by the size of the offsets.\n" );
	printf( "Expected loads: %d; Expected stores: %d\n", NUM, NUM );
	measure_load_store( ( vptr_t ) array1, ( vptr_t ) ( array1 + NUM ) );
	measure_load_store( ( vptr_t ) array2, ( vptr_t ) ( array2 + NUM ) );
	measure_load_store( ( vptr_t ) array3, ( vptr_t ) ( array3 + NUM ) );

	/***************************************************************************************/
	retval = PAPI_destroy_eventset( &EventSet );
	if ( retval != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_destroy", retval );

	free( parray1 );
	free( parray2 );
	free( parray3 );

	test_pass( __FILE__ );

	return 0;
}

static void
measure_load_store( vptr_t start, vptr_t end )
{
	PAPI_option_t option;
	int retval;

	/* set up the optional address structure for starting and ending data addresses */
	option.addr.eventset = EventSet;
	option.addr.start = start;
	option.addr.end = end;
	if ( ( retval = PAPI_set_opt( PAPI_DATA_ADDRESS, &option ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_set_opt(PAPI_DATA_ADDRESS)", retval );

	measure_event( 0, &option );
	measure_event( 1, &option );
}

static void
measure_event( int index, PAPI_option_t * option )
{
	int retval;
	long long value;

	if ( ( retval = PAPI_add_event( EventSet, PAPI_event[index] ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_add_event", retval );

	if ( index == 0 ) {
		/* if ((retval = PAPI_get_opt(PAPI_DATA_ADDRESS, option)) != PAPI_OK)
		   test_fail(__FILE__, __LINE__, "PAPI_get_opt(PAPI_DATA_ADDRESS)", retval); */
		printf( "Requested Start Address: %p; Start Offset: %#5x; Actual Start Address: %p\n",
			option->addr.start, option->addr.start_off,
			option->addr.start - option->addr.start_off );
		printf( "Requested End Address: %p; End Offset: %#5x; Actual End Address: %p\n",
			option->addr.end, option->addr.end_off,
			option->addr.end + option->addr.end_off );
	}

	retval = PAPI_start( EventSet );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_start", retval );
	}

	do_malloc_work( NUM );
	do_static_work( NUM );

	retval = PAPI_stop( EventSet, &value );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_stop", retval );
	}

	printf( "%s: %lld\n", event_name[index], value );

	if ( ( retval = PAPI_remove_event( EventSet, PAPI_event[index] ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_remove_event", retval );
}

static void
init_array( void )
{
	parray1 = ( int * ) malloc( NUM * sizeof ( int ) );
	if ( parray1 == NULL )
		test_fail( __FILE__, __LINE__, "No memory available!\n", 0 );
	memset( parray1, 0x0, NUM * sizeof ( int ) );

	parray2 = ( int * ) malloc( NUM * sizeof ( int ) );
	if ( parray2 == NULL )
		test_fail( __FILE__, __LINE__, "No memory available!\n", 0 );
	memset( parray2, 0x0, NUM * sizeof ( int ) );

	parray3 = ( int * ) malloc( NUM * sizeof ( int ) );
	if ( parray3 == NULL )
		test_fail( __FILE__, __LINE__, "No memory available!\n", 0 );
	memset( parray3, 0x0, NUM * sizeof ( int ) );
}

static int
do_static_work( long loop )
{
	int i;
	int sum = 0;

	for ( i = 0; i < loop; i++ ) {
		array1[i] = i;
		sum += array1[i];
	}
	for ( i = 0; i < loop; i++ ) {
		array2[i] = i;
		sum += array2[i];
	}
	for ( i = 0; i < loop; i++ ) {
		array3[i] = i;
		sum += array3[i];
	}
	return sum;
}

static int
do_malloc_work( long loop )
{
	int i;
	int sum = 0;

	for ( i = 0; i < loop; i++ ) {
		parray1[i] = i;
		sum += parray1[i];
	}
	for ( i = 0; i < loop; i++ ) {
		parray2[i] = i;
		sum += parray2[i];
	}
	for ( i = 0; i < loop; i++ ) {
		parray3[i] = i;
		sum += parray3[i];
	}
	return sum;
}
papi-papi-7-2-0-t/src/ctests/derived.c
/* This file performs the following test: start, stop with a derived event */

#include <stdio.h>
#include <stdlib.h>

#include "papi.h"
#include "papi_test.h"
#include "do_loops.h"

#define EVENTSLEN 2

unsigned int PAPI_events[EVENTSLEN] = { 0, 0 };
static const int PAPI_events_len = 1;

int
main( int argc, char **argv )
{
	int retval, tmp;
	int EventSet = PAPI_NULL;
	int i;
	PAPI_event_info_t info;
	long long values;
	char event_name[PAPI_MAX_STR_LEN], add_event_str[PAPI_2MAX_STR_LEN];
	int quiet = 0;

	/* Set TESTS_QUIET variable */
	quiet = tests_quiet( argc, argv );

	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	if (!quiet) {
		printf( "Test case %s: start, stop with a derived counter.\n", __FILE__ );
		printf( "------------------------------------------------\n" );
		tmp = PAPI_get_opt( PAPI_DEFDOM, NULL );
		printf( "Default domain is: %d (%s)\n", tmp, stringify_all_domains( tmp ) );
		tmp = PAPI_get_opt( PAPI_DEFGRN, NULL );
		printf( "Default granularity is: %d (%s)\n\n", tmp, stringify_granularity( tmp ) );
	}

	i = PAPI_PRESET_MASK;
	do {
		if ( PAPI_get_event_info( i, &info ) == PAPI_OK ) {
			if ( info.count > 1 ) {
				PAPI_events[0] = ( unsigned int ) info.event_code;
				break;
			}
		}
	} while ( PAPI_enum_event( &i, 0 ) == PAPI_OK );

	if ( PAPI_events[0] == 0 ) {
		test_skip(__FILE__, __LINE__, "No events found", 0);
	}

	retval = PAPI_create_eventset( &EventSet );
	if ( retval != PAPI_OK ) {
		test_fail(__FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	for ( i = 0; i < PAPI_events_len; i++ ) {
		PAPI_event_code_to_name( ( int ) PAPI_events[i], event_name );
		if ( !quiet ) {
			printf( "Adding %s\n", event_name );
		}
		retval = PAPI_add_event( EventSet, ( int ) PAPI_events[i] );
		if ( retval != PAPI_OK ) {
			test_fail(__FILE__, __LINE__, "PAPI_add_event", retval );
		}
	}

	retval = PAPI_start( EventSet );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_start", retval );
	}

	if (!quiet) printf( "Running do_stuff().\n" );

	do_stuff( );

	retval = PAPI_stop( EventSet, &values );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__,
__LINE__, "PAPI_stop", retval );
	}

	if (!quiet) {
		sprintf( add_event_str, "%-12s : \t", event_name );
		printf( TAB1, add_event_str, values );
		printf( "------------------------------------------------\n" );
	}

	retval = PAPI_cleanup_eventset( EventSet );	/* JT */
	if ( retval != PAPI_OK ) {
		test_fail(__FILE__,__LINE__, "PAPI_cleanup_eventset", retval );
	}

	retval = PAPI_destroy_eventset( &EventSet );
	if ( retval != PAPI_OK ) {
		test_fail(__FILE__,__LINE__, "PAPI_cleanup_eventset", retval );
	}

	if (!quiet) printf( "Verification: Does it produce a non-zero value?\n" );

	if ( values != 0 ) {
		if (!quiet) {
			printf( "Yes: " );
			printf( LLDFMT, values );
			printf( "\n" );
		}
	} else {
		test_fail(__FILE__,__LINE__, "Validation", 1 );
	}

	test_pass(__FILE__);

	return 0;
}
papi-papi-7-2-0-t/src/ctests/describe.c
/* From Paul Drongowski at HP. Thanks. */
/* I have not been able to call PAPI_describe_event without
   incurring a segv, including the sample code on the man page.
   I noticed that PAPI_describe_event is not exercised by the PAPI
   test programs, so I haven't been able to check the function call
   using known good code. (Or steal your code for that matter. :-)
*/
/* PAPI_describe_event has been deprecated in PAPI 3, since its
   functionality exists in other API calls. Below shows several ways
   that this call was used, with replacement code compatible with PAPI 3.
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "papi.h"
#include "papi_test.h"

int
main( int argc, char **argv )
{
	int EventSet = PAPI_NULL;
	int retval;
	long long g1[2];
	int eventcode = PAPI_TOT_INS;
	PAPI_event_info_t info, info1, info2;
	int quiet;

	/* Set TESTS_QUIET variable */
	quiet = tests_quiet( argc, argv );

	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if (retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	if ( ( retval = PAPI_create_eventset( &EventSet ) ) != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	if ( ( retval = PAPI_query_event( eventcode ) ) != PAPI_OK ) {
		if (!quiet) printf("Trouble checking event\n");
		test_skip( __FILE__, __LINE__, "PAPI_query_event(PAPI_TOT_INS)", retval );
	}

	if ( ( retval = PAPI_add_event( EventSet, eventcode ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_add_event(PAPI_TOT_INS)", retval );

	if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_start", retval );

	if ( ( retval = PAPI_stop( EventSet, g1 ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_stop", retval );

	/* Case 0, no info, should fail */
	eventcode = 0;
	/* if ( ( retval = PAPI_describe_event(eventname,(int *)&eventcode,eventdesc) ) == PAPI_OK)
	   test_fail(__FILE__,__LINE__,"PAPI_describe_event",retval); */
	if (!quiet) {
		printf("This test expects a 'PAPI Error' to be returned from this PAPI call.\n");
	}
	if ( ( retval = PAPI_get_event_info( eventcode, &info ) ) == PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_get_event_info", retval );

	/* Case 1, fill in name field. */
	eventcode = PAPI_TOT_INS;
	/* if ( ( retval = PAPI_describe_event(eventname,(int *)&eventcode,eventdesc) ) != PAPI_OK)
	   test_fail(__FILE__,__LINE__,"PAPI_describe_event",retval); */
	if ( ( retval = PAPI_get_event_info( eventcode, &info1 ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_get_event_info", retval );
	if ( strcmp( info1.symbol, "PAPI_TOT_INS" ) != 0 )
		test_fail( __FILE__, __LINE__,
			   "PAPI_get_event_info symbol value is bogus", retval );
	if ( strlen( info1.long_descr ) == 0 )
		test_fail( __FILE__, __LINE__,
			   "PAPI_get_event_info long_descr value is bogus", retval );

	eventcode = 0;

	/* Case 2, fill in code field. */
	/* if ( ( retval = PAPI_describe_event(eventname,(int *)&eventcode,eventdesc) ) != PAPI_OK)
	   test_fail(__FILE__,__LINE__,"PAPI_describe_event",retval); */
	if ( ( retval = PAPI_event_name_to_code( info1.symbol, ( int * ) &eventcode ) ) != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_event_name_to_code", retval );
	}
	if ( eventcode != PAPI_TOT_INS )
		test_fail( __FILE__, __LINE__,
			   "PAPI_event_name_to_code code value is bogus", retval );
	if ( ( retval = PAPI_get_event_info( eventcode, &info2 ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_get_event_info", retval );
	if ( strcmp( info2.symbol, "PAPI_TOT_INS" ) != 0 )
		test_fail( __FILE__, __LINE__,
			   "PAPI_get_event_info symbol value is bogus", retval );
	if ( strlen( info2.long_descr ) == 0 )
		test_fail( __FILE__, __LINE__,
			   "PAPI_get_event_info long_descr value is bogus", retval );

	test_pass( __FILE__ );

	return 0;
}
papi-papi-7-2-0-t/src/ctests/destroy.c
/* destroy.c */
/* Test that PAPI_destroy_eventset() doesn't leak file descriptors */
/* Run create/add/start/stop/remove/destroy in a large loop */
/* and make sure PAPI handles it OK */

#include <stdio.h>
#include <stdlib.h>

#include "papi.h"
#include "papi_test.h"
#include "testcode.h"

#define NUM_EVENTS 1
#define NUM_LOOPS 16384

int
main( int argc, char **argv )
{
	int retval, i;
	int
EventSet1 = PAPI_NULL;
	long long values[NUM_EVENTS];
	int quiet = 0;

	/* Set TESTS_QUIET variable */
	quiet = tests_quiet( argc, argv );

	/* Init the PAPI library */
	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	if (!quiet) {
		printf("Testing to make sure we can destroy eventsets\n");
	}

	for(i=0;i

#include "papi.h"
#include "papi_test.h"

int
main( int argc, char **argv )
{
	int retval;
	const PAPI_component_info_t* cmpinfo;
	int numcmp, cid, active_components = 0;

	/* Set TESTS_QUIET variable */
	tests_quiet( argc, argv );

	/* Disable All Compiled-in Components */
	numcmp = PAPI_num_components( );

	if (!TESTS_QUIET) printf("Compiled-in components:\n");
	for( cid = 0; cid < numcmp; cid++ ) {
		cmpinfo = PAPI_get_component_info( cid );
		if (!TESTS_QUIET) {
			printf( "Name: %-23s %s\n", cmpinfo->name, cmpinfo->description);
		}
		retval = PAPI_disable_component( cid );
		if (retval!=PAPI_OK) {
			test_fail(__FILE__,__LINE__,"Error disabling component",retval);
		}
	}

	/* Initialize the library */
	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	/* Try to disable after init, should fail */
	retval = PAPI_disable_component( 0 );
	if (retval==PAPI_OK) {
		test_fail( __FILE__, __LINE__,
			   "PAPI_disable_component should fail", retval );
	}

	if (!TESTS_QUIET) printf("\nAfter init components:\n");
	for( cid = 0; cid < numcmp; cid++ ) {
		cmpinfo = PAPI_get_component_info( cid );
		if (!TESTS_QUIET) {
			printf( "%d %d Name: %-23s %s\n", cid,
				PAPI_get_component_index((char *)cmpinfo->name),
				cmpinfo->name, cmpinfo->description);
		}
		if (cid!=PAPI_get_component_index((char *)cmpinfo->name)) {
			test_fail( __FILE__, __LINE__,
				   "PAPI_get_component_index mismatch", 2 );
		}
		if (cmpinfo->disabled) {
			if (!TESTS_QUIET) {
				printf("  \\-> Disabled: %s\n",cmpinfo->disabled_reason);
			}
		} else {
			active_components++;
		}
	}

	if (active_components>0) {
		test_fail( __FILE__, __LINE__, "too many active components", retval );
	}

	test_pass( __FILE__ );

	return PAPI_OK;
}
papi-papi-7-2-0-t/src/ctests/dmem_info.c
/*
 * This file perfoms the following test:  dynamic memory info
 * The pages used should increase steadily.
 *
 * Author: Kevin London
 *         london@cs.utk.edu
 */

#include <stdio.h>
#include <stdlib.h>

#include "papi.h"
#include "papi_test.h"
#include "do_loops.h"

#define ALLOCMEM 200000

static void
dump_memory_info( FILE * output, PAPI_dmem_info_t * d )
{
	fprintf( output, "\n--------\n" );
	fprintf( output, "Mem Size:\t\t%lld\n", d->size );
	fprintf( output, "Mem Peak Size:\t\t%lld\n", d->peak );
	fprintf( output, "Mem Resident:\t\t%lld\n", d->resident );
	fprintf( output, "Mem High Water Mark:\t%lld\n", d->high_water_mark );
	fprintf( output, "Mem Shared:\t\t%lld\n", d->shared );
	fprintf( output, "Mem Text:\t\t%lld\n", d->text );
	fprintf( output, "Mem Library:\t\t%lld\n", d->library );
	fprintf( output, "Mem Heap:\t\t%lld\n", d->heap );
	fprintf( output, "Mem Locked:\t\t%lld\n", d->locked );
	fprintf( output, "Mem Stack:\t\t%lld\n", d->stack );
	fprintf( output, "Mem Pagesize:\t\t%lld\n", d->pagesize );
	fprintf( output, "Mem Page Table Entries:\t\t%lld\n", d->pte );
	fprintf( output, "--------\n\n" );
}

int
main( int argc, char **argv )
{
	PAPI_dmem_info_t dmem;
	long long value[7];
	int retval, i = 0, j = 0;
	double *m[7];

	tests_quiet( argc, argv );	/* Set TESTS_QUIET variable */

	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT )
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );

	for ( i = 0; i < 7; i++ ) {
		retval = PAPI_get_dmem_info( &dmem );
		if ( retval != PAPI_OK )
			test_fail( __FILE__, __LINE__, "PAPI_get_dmem_info", retval );
		/* dump_memory_info(stdout,&dmem); */
		value[i] = dmem.size;
		m[i] = ( double * ) malloc( ALLOCMEM * sizeof ( double ) );
		touch_dummy( m[j], ALLOCMEM );
	}

	if ( !TESTS_QUIET ) {
		printf( "Test case: Dynamic Memory
Information.\n" );
		dump_memory_info( stdout, &dmem );
		printf( "------------------------------------------------------------------------\n" );
		for ( i = 0; i < 7; i++ )
			printf( "Malloc additional: %d KB  Memory Size in KB: %d\n",
				( int ) ( ( sizeof ( double ) * ALLOCMEM ) / 1024 ),
				( int ) value[i] );
		printf( "------------------------------------------------------------------------\n" );
	}

	if ( value[6] >= value[5] && value[5] >= value[4] && value[4] >= value[3] &&
	     value[3] >= value[2] && value[2] >= value[1] && value[1] >= value[0] )
		test_pass( __FILE__ );
	else
		test_fail( __FILE__, __LINE__, "Calculating Resident Memory",
			   ( int ) value[6] );

	return 1;
}
papi-papi-7-2-0-t/src/ctests/earprofile.c
/*
 * File:    profile.c
 * Author:  Philip Mucci
 *          mucci@cs.utk.edu
 * Mods:    Dan Terpstra
 *          terpstra@cs.utk.edu
 */

/* This file performs the following test: profiling and program info
   option call

   - This tests the SVR4 profiling interface of PAPI. These are counted
   in the default counting domain and default granularity, depending on
   the platform. Usually this is the user domain (PAPI_DOM_USER) and
   thread context (PAPI_GRN_THR).
The Eventset contains:
   + PAPI_FP_INS (to profile)
   + PAPI_TOT_CYC

   - Set up profile
   - Start eventset 1
   - Do both (flops and reads)
   - Stop eventset 1
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "papi.h"
#include "papi_test.h"
#include "prof_utils.h"
#include "do_loops.h"

#undef THRESHOLD
#define THRESHOLD 1000

static void
ear_no_profile( void )
{
	int retval;

	if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_start", retval );

	do_l1misses( 10000 );

	if ( ( retval = PAPI_stop( EventSet, values[0] ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_stop", retval );

	printf( "Test type   : \tNo profiling\n" );
	printf( TAB1, event_name, ( values[0] )[0] );
	printf( TAB1, "PAPI_TOT_CYC:", ( values[0] )[1] );
}

static int
do_profile( vptr_t start, unsigned long plength, unsigned scale, int thresh,
	    int bucket )
{
	int i, retval;
	unsigned long blength;
	int num_buckets;
	const char *profstr[2] = { "PAPI_PROFIL_POSIX", "PAPI_PROFIL_INST_EAR" };
	int profflags[2] = { PAPI_PROFIL_POSIX, PAPI_PROFIL_POSIX | PAPI_PROFIL_INST_EAR };
	int num_profs;

	do_stuff( );

	num_profs = sizeof ( profflags ) / sizeof ( int );
	ear_no_profile( );
	blength = prof_size( plength, scale, bucket, &num_buckets );
	prof_alloc( num_profs, blength );

	for ( i = 0; i < num_profs; i++ ) {
		if ( !TESTS_QUIET )
			printf( "Test type   : \t%s\n", profstr[i] );

		if ( ( retval = PAPI_profil( profbuf[i], blength, start, scale,
					     EventSet, PAPI_event, thresh,
					     profflags[i] | bucket ) ) != PAPI_OK ) {
			test_fail( __FILE__, __LINE__, "PAPI_profil", retval );
		}
		if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK )
			test_fail( __FILE__, __LINE__, "PAPI_start", retval );

		do_stuff( );

		if ( ( retval = PAPI_stop( EventSet, values[1] ) ) != PAPI_OK )
			test_fail( __FILE__, __LINE__, "PAPI_stop", retval );

		if ( !TESTS_QUIET ) {
			printf( TAB1, event_name, ( values[1] )[0] );
			printf( TAB1, "PAPI_TOT_CYC:", ( values[1] )[1] );
		}
		if ( ( retval = PAPI_profil( profbuf[i], blength, start, scale,
					     EventSet, PAPI_event, 0,
					     profflags[i] ) ) != PAPI_OK )
			test_fail( __FILE__, __LINE__, "PAPI_profil", retval );
	}

	prof_head( blength, bucket, num_buckets,
		   "address\t\t\tPOSIX\tINST_DEAR\n" );
	prof_out( start, num_profs, bucket, num_buckets, scale );
	retval = prof_check( num_profs, bucket, num_buckets );
	for ( i = 0; i < num_profs; i++ ) {
		free( profbuf[i] );
	}
	return retval;
}

int
main( int argc, char **argv )
{
	int num_events, num_tests = 6;
	long length;
	int retval, retval2;
	const PAPI_hw_info_t *hw_info;
	const PAPI_exe_info_t *prginfo;
	vptr_t start, end;
	int quiet;

	/* Set TESTS_QUIET variable */
	quiet = tests_quiet( argc, argv );

	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if (retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	if ( ( prginfo = PAPI_get_executable_info( ) ) == NULL ) {
		test_fail( __FILE__, __LINE__, "PAPI_get_executable_info", 1 );
	}

	if ( ( hw_info = PAPI_get_hardware_info( ) ) == NULL ) {
		test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 0 );
	}

	if ( ( strncasecmp( hw_info->model_string, "Itanium",
			    strlen( "Itanium" ) ) != 0 ) &&
	     ( strncasecmp( hw_info->model_string, "32",
			    strlen( "32" ) ) != 0 ) ) {
		if (!quiet) printf("Itanium only for now.\n");
		test_skip( __FILE__, __LINE__, "Test unsupported", PAPI_ENOIMPL );
	}

//	if ( quiet ) {
//		test_skip( __FILE__, __LINE__,
//			   "Test deprecated in quiet mode for PAPI 3.6", 0 );
//	}

	sprintf( event_name, "DATA_EAR_CACHE_LAT4" );
	if ( ( retval = PAPI_event_name_to_code( event_name, &PAPI_event ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_event_name_to_code", retval );

	if ( ( retval = PAPI_create_eventset( &EventSet ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );

	if ( ( retval = PAPI_add_event( EventSet, PAPI_event ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_add_event", retval );

	if ( ( retval = PAPI_add_event( EventSet, PAPI_TOT_CYC ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_add_event", retval );

	num_events = 2;
	values = allocate_test_space( num_tests, num_events );

	/* use these lines to profile entire code address space */
	start = prginfo->address_info.text_start;
	end = prginfo->address_info.text_end;
	length = end - start;
	if ( length < 0 )
		test_fail( __FILE__, __LINE__, "Profile length < 0!", length );

	prof_print_address( "Test earprofile: POSIX compatible event address register profiling.\n",
			    prginfo );
	prof_print_prof_info( start, end, THRESHOLD, event_name );
	retval = do_profile( start, length, FULL_SCALE, THRESHOLD,
			     PAPI_PROFIL_BUCKET_16 );

	retval2 = PAPI_remove_event( EventSet, PAPI_event );
	if ( retval2 == PAPI_OK )
		retval2 = PAPI_remove_event( EventSet, PAPI_TOT_CYC );
	if ( retval2 != PAPI_OK )
		test_fail( __FILE__, __LINE__, "Can't remove events", retval2 );

	if ( retval )
		test_pass( __FILE__ );
	else
		test_fail( __FILE__, __LINE__, "No information in buffers", 1 );

	return 1;
}
papi-papi-7-2-0-t/src/ctests/eventname.c
#include <stdio.h>
#include <stdlib.h>

#include "papi.h"
#include "papi_test.h"

int
main( int argc, char **argv )
{
	int retval;
	int preset;

	tests_quiet( argc, argv );	/* Set TESTS_QUIET variable */

	if ( ( retval = PAPI_library_init( PAPI_VER_CURRENT ) ) != PAPI_VER_CURRENT )
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );

	retval = PAPI_event_name_to_code( "PAPI_FP_INS", &preset );
	if ( retval != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_event_name_to_code", retval );

	if ( preset != PAPI_FP_INS )
		test_fail( __FILE__, __LINE__, "Wrong preset returned", retval );

	retval = PAPI_event_name_to_code( "PAPI_TOT_CYC", &preset );
	if ( retval != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_event_name_to_code", retval );

	if ( preset != PAPI_TOT_CYC )
		test_fail( __FILE__, __LINE__,
			   "*preset returned did not equal PAPI_TOT_CYC", retval );

	test_pass( __FILE__ );

	return 0;
}
papi-papi-7-2-0-t/src/ctests/exec.c
/*
 * File:    exec.c
 * Author:  Philip Mucci
 *          mucci@cs.utk.edu
 * Mods:
 */

/* This file performs the following test: start, stop and timer
   functionality for a parent and a forked child. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

#include "papi.h"
#include "papi_test.h"

int
main( int argc, char **argv )
{
	int retval;

	tests_quiet( argc, argv );	/* Set TESTS_QUIET variable */

	if ( ( argc > 1 ) && ( strcmp( argv[1], "xxx" ) == 0 ) ) {
		retval = PAPI_library_init( PAPI_VER_CURRENT );
		if ( retval != PAPI_VER_CURRENT )
			test_fail( __FILE__, __LINE__, "execed PAPI_library_init", retval );
	} else {
		retval = PAPI_library_init( PAPI_VER_CURRENT );
		if ( retval != PAPI_VER_CURRENT )
			test_fail( __FILE__, __LINE__, "main PAPI_library_init", retval );

		PAPI_shutdown( );

		if ( execlp( argv[0], argv[0], "xxx", NULL ) == -1 )
			test_fail( __FILE__, __LINE__, "execlp", PAPI_ESYS );
	}

	test_pass( __FILE__ );

	return 0;
}
papi-papi-7-2-0-t/src/ctests/exec2.c
/*
 * File:    exec.c
 * Author:  Philip Mucci
 *          mucci@cs.utk.edu
 * Mods:
 */

/* This file performs the following test: start, stop and timer
   functionality for a parent and a forked child. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

#include "papi.h"
#include "papi_test.h"

int
main( int argc, char **argv )
{
	int retval;

	tests_quiet( argc, argv );	/* Set TESTS_QUIET variable */

	if ( ( argc > 1 ) && ( strcmp( argv[1], "xxx" ) == 0 ) ) {
		retval = PAPI_library_init( PAPI_VER_CURRENT );
		if ( retval != PAPI_VER_CURRENT )
			test_fail( __FILE__, __LINE__, "execed PAPI_library_init", retval );
	} else {
		retval = PAPI_library_init( PAPI_VER_CURRENT );
		if ( retval != PAPI_VER_CURRENT )
			test_fail( __FILE__, __LINE__, "main PAPI_library_init", retval );

		if ( execlp( argv[0], argv[0], "xxx", NULL ) == -1 )
			test_fail( __FILE__, __LINE__, "execlp", PAPI_ESYS );
	}

	test_pass( __FILE__ );

	return 0;
}
papi-papi-7-2-0-t/src/ctests/exec_overflow.c
/*
 * Test PAPI with fork() and exec().
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>

#include "papi.h"
#include "papi_test.h"
#include "testcode.h"

#define MAX_EVENTS 3

static int Event[MAX_EVENTS] = {
	PAPI_TOT_CYC,
	PAPI_FP_INS,
	PAPI_FAD_INS,
};

static int Threshold[MAX_EVENTS] = {
	8000000,
	4000000,
	4000000,
};

static struct timeval start, last;
static long count, total;

static void
my_handler( int EventSet, void *pc, long long ovec, void *context )
{
	( void ) EventSet;
	( void ) pc;
	( void ) ovec;
	( void ) context;

	count++;
	total++;
}

static void
print_rate( const char *str )
{
	static int last_count = -1;
	struct timeval now;
	double st_secs, last_secs;

	gettimeofday( &now, NULL );
	st_secs = ( double ) ( now.tv_sec - start.tv_sec ) +
		( ( double ) ( now.tv_usec - start.tv_usec ) ) / 1000000.0;
	last_secs = ( double ) ( now.tv_sec - last.tv_sec ) +
		( ( double ) ( now.tv_usec - last.tv_usec ) ) / 1000000.0;
	if ( last_secs <= 0.001 )
		last_secs = 0.001;

	if (!TESTS_QUIET) {
		printf( "[%d] %s, time = %.3f, total = %ld, last = %ld, rate = %.1f/sec\n",
			getpid( ), str, st_secs, total, count,
			( ( double ) count ) / last_secs );
	}

	if ( last_count
!= -1 ) { if ( count < .1 * last_count ) { test_fail( __FILE__, __LINE__, "Interrupt rate changed!", 1 ); exit( 1 ); } } last_count = ( int ) count; count = 0; last = now; } static void run( const char *str, int len ) { int n; for ( n = 1; n <= len; n++ ) { do_cycles( 1 ); print_rate( str ); } } int main( int argc, char **argv ) { int num_events = 1; const char *name = "unknown"; int ev,EventSet = PAPI_NULL; int quiet,retval; /* Used to be able to set this via command line */ num_events=1; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); do_cycles( 1 ); /* Zero out the Counters */ gettimeofday( &start, NULL ); last = start; count = 0; total = 0; /* Initialize PAPI */ retval=PAPI_library_init( PAPI_VER_CURRENT ); if (retval!=PAPI_VER_CURRENT) { test_fail( __FILE__, __LINE__, "PAPI_library_init failed", 1 ); } name = argv[0]; if (!quiet) { printf( "[%d] %s, num_events = %d\n", getpid(), name, num_events ); } /* Create eventset */ if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset failed", 1 ); } /* Add events */ for ( ev = 0; ev < num_events; ev++ ) { if ( PAPI_add_event( EventSet, Event[ev] ) != PAPI_OK ) { if (!quiet) printf("Trouble adding event\n"); test_skip( __FILE__, __LINE__, "PAPI_add_event failed", 1 ); } } /* Set overflow */ for ( ev = 0; ev < num_events; ev++ ) { if ( PAPI_overflow( EventSet, Event[ev], Threshold[ev], 0, my_handler ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_overflow failed", 1 ); } } /* Start measuring */ if ( PAPI_start( EventSet ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start failed", 1 ); } /* Tun a bit */ run( name, 3 ); /* Stop measuring */ if (!quiet) { printf("[%d] %s, %s\n", getpid(), name, "stop"); } if ( PAPI_stop( EventSet, NULL ) != PAPI_OK ) { test_fail( name, __LINE__, "PAPI_stop failed", 1 ); } if (!quiet) { printf("[%d] %s, %s\n", getpid(), name, "exec(./child_overflow)"); } /* exec the child_overflow helper program */ /* we 
should never return from this */ if ( access( "./child_overflow", X_OK ) == 0 ) execl( "./child_overflow", "./child_overflow", ( quiet ? "TESTS_QUIET" : NULL ), NULL ); else if ( access( "./ctests/child_overflow", X_OK ) == 0 ) execl( "./ctests/child_overflow", "./ctests/child_overflow", ( quiet ? "TESTS_QUIET" : NULL ), NULL ); test_fail( name, __LINE__, "exec failed", 1 ); return 0; } papi-papi-7-2-0-t/src/ctests/exeinfo.c000066400000000000000000000043641502707512200175440ustar00rootroot00000000000000/* * File: exeinfo.c * Author: Philip Mucci * mucci@cs.utk.edu * Mods: * */ #include #include #include #include #include "papi.h" #include "papi_test.h" int main( int argc, char **argv ) { int retval; const PAPI_exe_info_t *exeinfo; tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ if ( ( retval = PAPI_library_init( PAPI_VER_CURRENT ) ) != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); if ( ( exeinfo = PAPI_get_executable_info( ) ) == NULL ) test_fail( __FILE__, __LINE__, "PAPI_get_executable_info", retval ); if (!TESTS_QUIET) { printf( "Path+Program: %s\n", exeinfo->fullname ); printf( "Program: %s\n", exeinfo->address_info.name ); printf( "Text start: %p, Text end: %p\n", exeinfo->address_info.text_start, exeinfo->address_info.text_end ); printf( "Data start: %p, Data end: %p\n", exeinfo->address_info.data_start, exeinfo->address_info.data_end ); printf( "Bss start: %p, Bss end: %p\n", exeinfo->address_info.bss_start, exeinfo->address_info.bss_end ); } if ( ( strlen( &(exeinfo->fullname[0]) ) == 0 ) ) test_fail( __FILE__, __LINE__, "PAPI_get_executable_info", 1 ); if ( ( strlen( &(exeinfo->address_info.name[0]) ) == 0 ) ) test_fail( __FILE__, __LINE__, "PAPI_get_executable_info", 1 ); if ( ( exeinfo->address_info.text_start == 0x0 ) || ( exeinfo->address_info.text_end == 0x0 ) || ( exeinfo->address_info.text_start >= exeinfo->address_info.text_end ) ) test_fail( __FILE__, __LINE__, "PAPI_get_executable_info", 1 ); if ( ( 
exeinfo->address_info.data_start == 0x0 ) || ( exeinfo->address_info.data_end == 0x0 ) || ( exeinfo->address_info.data_start >= exeinfo->address_info.data_end ) ) test_fail( __FILE__, __LINE__, "PAPI_get_executable_info", 1 ); /* if ((exeinfo->address_info.bss_start == 0x0) || (exeinfo->address_info.bss_end == 0x0) || (exeinfo->address_info.bss_start >= exeinfo->address_info.bss_end)) test_fail(__FILE__, __LINE__, "PAPI_get_executable_info",1); */ sleep( 1 ); /* Needed for debugging, so you can ^Z and stop the process, inspect /proc to see if it's right */ test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/failed_events.c000066400000000000000000000106171502707512200207150ustar00rootroot00000000000000/* * File: failed_events.c * Author: Vince Weaver */ /* This test tries adding events that don't exist */ /* We've had issues where the name resolution code might do weird */ /* things when passed invalid event names */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include "papi.h" #include "papi_test.h" #define LARGE_NAME_SIZE 4096 char large_name[LARGE_NAME_SIZE]; int main( int argc, char **argv ) { int i, k, err_count = 0; int retval; PAPI_event_info_t info, info1; const PAPI_component_info_t* cmpinfo; int numcmp, cid; int quiet; int EventSet = PAPI_NULL; /* Set quiet variable */ quiet=tests_quiet( argc, argv ); /* Init PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } if (!quiet) { printf("Test adding invalid events.\n"); } /* Create an eventset */ retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } /* Simple Event */ if (!quiet) { printf("+ Simple invalid event\t"); } retval=PAPI_add_named_event(EventSet,"INVALID_EVENT"); if (retval==PAPI_OK) { if (!quiet) { printf("Unexpectedly opened!\n"); err_count++; } } else { if (!quiet) printf("OK\n"); } /* Extra Colons */ if
(!quiet) { printf("+ Extra colons\t"); } retval=PAPI_add_named_event(EventSet,"INV::::AL:ID:::_E=3V::E=NT"); if (retval==PAPI_OK) { if (!quiet) { printf("Unexpectedly opened!\n"); err_count++; } } else { if (!quiet) printf("OK\n"); } /* Large Invalid Event */ if (!quiet) { printf("+ Large invalid event\t"); } memset(large_name,'A',LARGE_NAME_SIZE); large_name[LARGE_NAME_SIZE-1]=0; retval=PAPI_add_named_event(EventSet,large_name); if (retval==PAPI_OK) { if (!quiet) { printf("Unexpectedly opened!\n"); err_count++; } } else { if (!quiet) printf("OK\n"); } /* Large Unterminated Invalid Event */ if (!quiet) { printf("+ Large unterminated invalid event\t"); } memset(large_name,'A',LARGE_NAME_SIZE); retval=PAPI_add_named_event(EventSet,large_name); if (retval==PAPI_OK) { if (!quiet) { printf("Unexpectedly opened!\n"); err_count++; } } else { if (!quiet) printf("OK\n"); } /* Randomly modifying valid events */ if (!quiet) { printf("+ Randomly modifying valid events\t"); } numcmp = PAPI_num_components( ); /* Loop through all components */ for( cid = 0; cid < numcmp; cid++ ) { cmpinfo = PAPI_get_component_info( cid ); if (cmpinfo == NULL) { test_fail( __FILE__, __LINE__, "PAPI_get_component_info", 2 ); } /* Include disabled components */ if (cmpinfo->disabled) { // continue; } /* For platform independence, always ASK FOR the first event */ /* Don't just assume it'll be the first numeric value */ i = 0 | PAPI_NATIVE_MASK; retval = PAPI_enum_cmp_event( &i, PAPI_ENUM_FIRST, cid ); do { retval = PAPI_get_event_info( i, &info ); k = i; if ( PAPI_enum_cmp_event(&k, PAPI_NTV_ENUM_UMASKS, cid )==PAPI_OK ) { do { retval = PAPI_get_event_info( k, &info1 ); /* Skip perf_raw event as it is hard to error out */ if (strstr(info1.symbol,"perf_raw")) { break; } // printf("%s\n",info1.symbol); if (strlen(info1.symbol)>5) { info1.symbol[strlen(info1.symbol)-4]^=0xa5; retval=PAPI_add_named_event(EventSet,info1.symbol); if (retval==PAPI_OK) { if (!quiet) { printf("Unexpectedly opened %s!\n", 
info1.symbol); err_count++; } } } while ( PAPI_enum_cmp_event( &k, PAPI_NTV_ENUM_UMASKS, cid ) == PAPI_OK ); } else { /* Event didn't have any umasks */ // PROBLEM: info1 is NOT initialized by anyone! // Original code referenced info1, changed to info. [Tony C. 11-27-19] // printf("%s\n",info.symbol); if (strlen(info.symbol)>5) { info.symbol[strlen(info.symbol)-4]^=0xa5; retval=PAPI_add_named_event(EventSet,info.symbol); if (retval==PAPI_OK) { if (!quiet) { printf("Unexpectedly opened %s!\n", info.symbol); err_count++; } } } } } while ( PAPI_enum_cmp_event( &i, PAPI_ENUM_EVENTS, cid ) == PAPI_OK ); } if ( err_count ) { if (!quiet) { printf( "%d Invalid events added.\n", err_count ); } test_fail( __FILE__, __LINE__, "Invalid events added", 1 ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/filter_helgrind.c000066400000000000000000000130711502707512200212430ustar00rootroot00000000000000/* * This code is a simple filter for the helgrind_out.txt file * produced by: * "valgrind --tool=helgrind --log-file=helgrind_out.txt someProgram" * * This is useful because the tool does not recognize PAPI locks, * thus reports as possible race conditions reads/writes by * different threads that are actually fine (surrounded by locks). * * This was written particularly for krentel_pthreads_race.c * when processed by the above valgrind. We produce a line per * condition, in the form: * OP@file:line OP@file:line * where OP is R or W. The first file:line code occurred * after the second file:line code, and on a different thread. * * We print the results to stdout. It is useful to filter this * through the standard utility 'uniq'; each occurrence only * needs to be investigated once. Just ensure there are * MATCHING locks around each operation within the code. * * An example run (using uniq): The options -uc will print * only unique lines, preceded by a count of how many times * it occurs.
* * ./filter_helgrind | uniq -uc * * An example output line (piped through uniq as above): * 1 R@threads.c:190 W@threads.c:206 * An investigation shows threads.c:190 is protected by * _papi_hwi_lock(THREADS_LOCK); and threads.c:206 is * protected by the same lock. Thus no data race can * occur for this instance. * * Compilation within the papi/src/ctests directory: * make filter_helgrind * */ #include <stdio.h> #include <stdlib.h> #include <string.h> int main(int argc, char** args) { (void) argc; (void) args; char myLine[16384]; int state, size; char type1, type2; char fname1[256], fname2[256]; char *paren1, *paren2; FILE *HELOUT = fopen("helgrind_out.txt", "r"); // Read the file. if (HELOUT == NULL) { fprintf(stderr, "Could not open helgrind_out.txt.\n"); exit(-1); } char PDRR[]="Possible data race during read"; char PDRW[]="Possible data race during write"; char TCWW[]="This conflicts with a previous write"; char TCWR[]="This conflicts with a previous read"; char atSTR[]=" at "; // State machine: // State 0: We are looking for a line with PDRR or PDRW. // We don't exit until we find it, or run out of lines. // if we find it, we remember which and go to state 1. // State 1: Looking for " at " in column 11. // When found, we extract the string between '(' and ')' // which is program name:line. go to state 2. // State 2: We are searching for TCWW, TCWR, PDRW, PDRR. // If we find the first two: // Remember which, and go to state 3. // If we find either of the second two, go back to State 1. // State 3: Looking for " at " in column 11. // When found, extract the string between '(' and ')', // which is program name:line. // OUTPUT LINE for an investigation. // Go to State 0. state = 0; // looking for PDRR, PDRW. while (fgets(myLine, 16384, HELOUT) != NULL) { if (strlen(myLine) < 20) continue; switch (state) { case 0: // Looking for PDRR or PDRW.
if (strstr(myLine, PDRR) != NULL) { type1='R'; state=1; continue; } if (strstr(myLine, PDRW) != NULL) { type1='W'; state=1; continue; } continue; break; case 1: // Looking for atSTR in column 11. if (strncmp(myLine+10, atSTR, 6) != 0) continue; paren1=strchr(myLine, '('); paren2=strchr(myLine, ')'); if (paren1 == NULL || paren2 == NULL || paren1 > paren2) { state=0; // Abort, found something I don't understand. continue; } size = paren2-paren1-1; // compute length of name. strncpy(fname1, paren1+1, size); // Copy the name. fname1[size]=0; // install z-terminator. state=2; continue; break; case 2: // Looking for TCWW, TCWR, PDRR, PDRW. if (strstr(myLine, TCWR) != NULL) { type2='R'; state=3; continue; } if (strstr(myLine, TCWW) != NULL) { type2='W'; state=3; continue; } if (strstr(myLine, PDRR) != NULL) { type1='R'; state=1; continue; } if (strstr(myLine, PDRW) != NULL) { type1='W'; state=1; continue; } continue; break; case 3: // Looking for atSTR in column 11. if (strncmp(myLine+10, atSTR, 6) != 0) continue; paren1=strchr(myLine, '('); paren2=strchr(myLine, ')'); if (paren1 == NULL || paren2 == NULL || paren1 > paren2) { state=0; // Abort, found something I don't understand. continue; } size = paren2-paren1-1; // compute length of name. strncpy(fname2, paren1+1, size); // Copy the name. fname2[size]=0; // install z-terminator. fprintf(stdout, "%c@%-32s %c@%-32s\n", type1, fname1, type2, fname2); state=0; continue; break; } // end switch. } // end while. fclose(HELOUT); exit(0); } papi-papi-7-2-0-t/src/ctests/first.c000066400000000000000000000152301502707512200172300ustar00rootroot00000000000000/* This file performs the following test: start, read, stop and again functionality - It attempts to use the following three counters. It may use fewer depending on hardware counter resource limitations. These are counted in the default counting domain and default granularity, depending on the platform. 
Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). + PAPI_FP_INS (or PAPI_TOT_INS if PAPI_FP_INS doesn't exist) + PAPI_TOT_CYC - Start counters - Do flops - Read counters - Reset counters - Do flops - Read counters - Do flops - Read counters - Do flops - Stop and read counters - Read counters */ #include <stdio.h> #include <stdlib.h> #include "papi.h" #include "papi_test.h" #include "do_loops.h" int main( int argc, char **argv ) { int retval, num_tests = 5, num_events, tmp; long long **values; int EventSet = PAPI_NULL; char event_name1[]="PAPI_TOT_CYC"; char event_name2[]="PAPI_TOT_INS"; char add_event_str[PAPI_MAX_STR_LEN]; long long min, max; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); /* Init PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* create the eventset */ retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } retval = PAPI_add_named_event( EventSet, event_name1); if ( retval != PAPI_OK ) { if (!quiet) printf("Couldn't add %s\n",event_name1); test_skip(__FILE__,__LINE__,"Couldn't add PAPI_TOT_CYC",0); } retval = PAPI_add_named_event( EventSet, event_name2); if ( retval != PAPI_OK ) { if (!quiet) printf("Couldn't add %s\n",event_name2); test_skip(__FILE__,__LINE__,"Couldn't add PAPI_TOT_INS",0); } num_events=2; sprintf( add_event_str, "PAPI_add_event[%s]", event_name2 ); /* Allocate space for results */ values = allocate_test_space( num_tests, num_events ); /* Start PAPI */ retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } /* Benchmark code */ do_flops( NUM_FLOPS ); /* read results 0 */ retval = PAPI_read( EventSet, values[0] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_read", retval ); } /* Reset */ retval = PAPI_reset(
EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_reset", retval ); } /* Benchmark some more */ do_flops( NUM_FLOPS ); /* Read Results 1 */ retval = PAPI_read( EventSet, values[1] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_read", retval ); } /* Benchmark some more */ do_flops( NUM_FLOPS ); /* Read results 2 */ retval = PAPI_read( EventSet, values[2] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_read", retval ); } /* Benchmark some more */ do_flops( NUM_FLOPS ); /* Read results 3 */ retval = PAPI_stop( EventSet, values[3] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } /* Read results 4 */ retval = PAPI_read( EventSet, values[4] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_read", retval ); } /* remove results. We never stop??? */ PAPI_remove_named_event(EventSet,event_name1); PAPI_remove_named_event(EventSet,event_name2); if ( !quiet ) { printf( "Test case 1: Non-overlapping start, stop, read.\n" ); printf( "-----------------------------------------------\n" ); tmp = PAPI_get_opt( PAPI_DEFDOM, NULL ); printf( "Default domain is: %d (%s)\n", tmp, stringify_all_domains( tmp ) ); tmp = PAPI_get_opt( PAPI_DEFGRN, NULL ); printf( "Default granularity is: %d (%s)\n", tmp, stringify_granularity( tmp ) ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf( "-------------------------------------------------------------------------\n" ); printf( "Test type : 1 2 3 4 5\n" ); sprintf( add_event_str, "%s:", event_name2 ); printf( TAB5, add_event_str, values[0][1], values[1][1], values[2][1], values[3][1], values[4][1] ); printf( TAB5, "PAPI_TOT_CYC:", values[0][0], values[1][0], values[2][0], values[3][0], values[4][0] ); printf( "-------------------------------------------------------------------------\n" ); printf( "Verification:\n" ); printf( "Row 1 Column 1 at least %d\n", NUM_FLOPS ); printf( "%% difference between %s 1 & 2: 
%.2f\n", add_event_str, 100.0 * ( float ) values[0][1] / ( float ) values[1][1] ); printf( "%% difference between %s 1 & 2: %.2f\n", "PAPI_TOT_CYC", 100.0 * ( float ) values[0][0] / ( float ) values[1][0] ); printf( "Column 1 approximately equals column 2\n" ); printf( "Column 3 approximately equals 2 * column 2\n" ); printf( "Column 4 approximately equals 3 * column 2\n" ); printf( "Column 4 exactly equals column 5\n" ); } /* Validation */ /* Check cycles constraints */ min = ( long long ) ( ( double ) values[1][0] * .8 ); max = ( long long ) ( ( double ) values[1][0] * 1.2 ); /* Check constraint Col1=Col2 */ if ( values[0][0] > max || values[0][0] < min ) { test_fail( __FILE__, __LINE__, "Cycle Col1!=Col2", 1 ); } /* Check constraint col3 == 2*col2 */ if ( (values[2][0] > ( 2 * max )) || (values[2][0] < ( 2 * min )) ) { test_fail( __FILE__, __LINE__, "Cycle Col3!=2*Col2", 1 ); } /* Check constraint col4 == 3*col2 */ if ( (values[3][0] > ( 3 * max )) || (values[3][0] < ( 3 * min )) ) { test_fail( __FILE__, __LINE__, "Cycle Col4!=3*Col2", 1 ); } /* Check constraint col4 == col5 */ if ( values[3][0] != values[4][0] ) { test_fail( __FILE__, __LINE__, "Cycle Col4!=Col5", 1 ); } /* Check FLOP constraints */ min = ( long long ) ( ( double ) values[1][1] * .9 ); max = ( long long ) ( ( double ) values[1][1] * 1.1 ); /* Check constraint Col1=Col2 */ if ( values[0][1] > max || values[0][1] < min ) { test_fail( __FILE__, __LINE__, "FLOP Col1!=Col2", 1 ); } /* Check constraint col3 == 2*col2 */ if ( (values[2][1] > ( 2 * max )) || (values[2][1] < ( 2 * min )) ) { test_fail( __FILE__, __LINE__, "FLOP Col3!=2*Col2", 1 ); } /* Check constraint col4 == 3*col2 */ if ( (values[3][1] > ( 3 * max )) || (values[3][1] < ( 3 * min )) ) { test_fail( __FILE__, __LINE__, "FLOP Col4!=3*Col2", 1 ); } /* Check constraint col4 == col5 */ if (values[3][1] != values[4][1]) { test_fail( __FILE__, __LINE__, "FLOP Col4!=Col5", 1 ); } /* Check flops are sane */ if (values[0][1] < ( long long )
NUM_FLOPS ) { test_fail( __FILE__, __LINE__, "FLOP sanity", 1 ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/fork.c000066400000000000000000000020541502707512200170420ustar00rootroot00000000000000/* * File: fork.c * Author: Philip Mucci * mucci@cs.utk.edu * Mods: * */ /* This file performs the following test: PAPI_library_init() fork(); / \ parent child wait() PAPI_library_init() */ #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <sys/wait.h> #include "papi.h" #include "papi_test.h" int main( int argc, char **argv ) { int retval; int status; tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "main PAPI_library_init", retval ); if ( fork( ) == 0 ) { retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "forked PAPI_library_init", retval ); exit( 0 ); } else { wait( &status ); if ( WEXITSTATUS( status ) != 0 ) test_fail( __FILE__, __LINE__, "fork", WEXITSTATUS( status ) ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/fork2.c000066400000000000000000000021371502707512200171260ustar00rootroot00000000000000/* * File: fork2.c * Author: Philip Mucci * mucci@cs.utk.edu * Mods: * */ /* This file performs the following test: PAPI_library_init() fork(); / \ parent child wait() PAPI_shutdown() PAPI_library_init() */ #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <sys/wait.h> #include "papi.h" #include "papi_test.h" int main( int argc, char **argv ) { int retval; int status; tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "main PAPI_library_init", retval ); if ( fork( ) == 0 ) { PAPI_shutdown(); retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "forked PAPI_library_init", retval ); exit( 0 ); } else { wait( &status ); if (
WEXITSTATUS( status ) != 0 ) test_fail( __FILE__, __LINE__, "fork", WEXITSTATUS( status ) ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/fork_overflow.c000066400000000000000000000105571502707512200207740ustar00rootroot00000000000000/* * Test PAPI with fork() and exec(). */ #include #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #define MAX_EVENTS 3 static int Event[MAX_EVENTS] = { PAPI_TOT_CYC, PAPI_FP_INS, PAPI_FAD_INS, }; static int Threshold[MAX_EVENTS] = { 8000000, 4000000, 4000000, }; static int num_events = 1; static int EventSet = PAPI_NULL; static const char *name = "unknown"; static struct timeval start, last; static long count, total; static void my_handler( int EventSet, void *pc, long long ovec, void *context ) { ( void ) EventSet; ( void ) pc; ( void ) ovec; ( void ) context; count++; total++; } static void zero_count( void ) { gettimeofday( &start, NULL ); last = start; count = 0; total = 0; } static void print_here( const char *str) { if (!TESTS_QUIET) printf("[%d] %s, %s\n", getpid(), name, str); } static void print_rate( const char *str ) { static int last_count = -1; struct timeval now; double st_secs, last_secs; gettimeofday( &now, NULL ); st_secs = ( double ) ( now.tv_sec - start.tv_sec ) + ( ( double ) ( now.tv_usec - start.tv_usec ) ) / 1000000.0; last_secs = ( double ) ( now.tv_sec - last.tv_sec ) + ( ( double ) ( now.tv_usec - last.tv_usec ) ) / 1000000.0; if ( last_secs <= 0.001 ) last_secs = 0.001; if (!TESTS_QUIET) { printf( "[%d] %s, time = %.3f, total = %ld, last = %ld, rate = %.1f/sec\n", getpid( ), str, st_secs, total, count, ( ( double ) count ) / last_secs ); } if ( last_count != -1 ) { if ( count < .1 * last_count ) { test_fail( name, __LINE__, "Interrupt rate changed!", 1 ); exit( 1 ); } } last_count = ( int ) count; count = 0; last = now; } static void do_cycles( int program_time ) { struct timeval start, now; double x, sum; gettimeofday( &start, NULL ); for ( 
;; ) { sum = 1.0; for ( x = 1.0; x < 250000.0; x += 1.0 ) sum += x; if ( sum < 0.0 ) printf( "==>> SUM IS NEGATIVE !! <<==\n" ); gettimeofday( &now, NULL ); if ( now.tv_sec >= start.tv_sec + program_time ) break; } } static void my_papi_init( void ) { if ( PAPI_library_init( PAPI_VER_CURRENT ) != PAPI_VER_CURRENT ) test_fail( name, __LINE__, "PAPI_library_init failed", 1 ); } static void my_papi_start( void ) { int ev; EventSet = PAPI_NULL; if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) test_fail( name, __LINE__, "PAPI_create_eventset failed", 1 ); for ( ev = 0; ev < num_events; ev++ ) { if ( PAPI_add_event( EventSet, Event[ev] ) != PAPI_OK ) { if (!TESTS_QUIET) printf("Trouble adding event\n"); test_skip( name, __LINE__, "PAPI_add_event failed", 1 ); } } for ( ev = 0; ev < num_events; ev++ ) { if ( PAPI_overflow( EventSet, Event[ev], Threshold[ev], 0, my_handler ) != PAPI_OK ) { test_fail( name, __LINE__, "PAPI_overflow failed", 1 ); } } if ( PAPI_start( EventSet ) != PAPI_OK ) test_fail( name, __LINE__, "PAPI_start failed", 1 ); } static void my_papi_stop( void ) { if ( PAPI_stop( EventSet, NULL ) != PAPI_OK ) test_fail( name, __LINE__, "PAPI_stop failed", 1 ); } static void run( const char *str, int len ) { int n; for ( n = 1; n <= len; n++ ) { do_cycles( 1 ); print_rate( str ); } } int main( int argc, char **argv ) { char buf[100]; int quiet,retval; /* Used to be able to set this via command line */ num_events=1; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); do_cycles( 1 ); zero_count( ); retval=PAPI_library_init( PAPI_VER_CURRENT ); if (retval!=PAPI_VER_CURRENT) { test_fail( name, __LINE__, "PAPI_library_init failed", 1 ); } name = argv[0]; if (!quiet) printf( "[%d] %s, num_events = %d\n", getpid( ), name, num_events ); sprintf( buf, "%d", num_events ); my_papi_start( ); run( name, 3 ); print_here( "fork" ); { int ret = fork( ); if ( ret < 0 ) test_fail( name, __LINE__, "fork failed", 1 ); if ( ret == 0 ) { /* * Child process. 
*/ zero_count( ); my_papi_init( ); my_papi_start( ); run( "child", 5 ); print_here( "stop" ); my_papi_stop( ); sleep( 3 ); print_here( "end" ); exit( 0 ); } run( "main", 14 ); my_papi_stop( ); { int status; wait( &status ); print_here( "end" ); if ( WEXITSTATUS( status ) != 0 ) test_fail( name, __LINE__, "child failed", 1 ); else test_pass( name); } } return 0; } papi-papi-7-2-0-t/src/ctests/forkexec.c000066400000000000000000000031531502707512200177100ustar00rootroot00000000000000/* * File: forkexec.c * Author: Philip Mucci * mucci@cs.utk.edu * Mods: * */ /* This file performs the following test: PAPI_library_init(); PAPI_shutdown() fork() / \ parent child wait() execlp() PAPI_library_init() */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <sys/wait.h> #include "papi.h" #include "papi_test.h" int main( int argc, char **argv ) { int retval; int quiet; int status; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); if ( ( argc > 1 ) && ( strcmp( argv[1], "xxx" ) == 0 ) ) { /* In child */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "execed PAPI_library_init", retval ); } return 0; } else { if (!quiet) printf("Test fork/exec/PAPI_init\n"); /* Init PAPI */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "main PAPI_library_init", retval ); } /* Then shut down ?
*/ PAPI_shutdown( ); if ( fork( ) == 0 ) { /* In child, exec ourself with "xxx" command line */ if ( execlp( argv[0], argv[0], "xxx", NULL ) == -1 ) { test_fail( __FILE__, __LINE__, "execlp", PAPI_ESYS ); } } else { /* In parent, wait for child to finish */ wait( &status ); if ( WEXITSTATUS( status ) != 0 ) { test_fail( __FILE__, __LINE__, "fork", WEXITSTATUS( status ) ); } } } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/forkexec2.c000066400000000000000000000034061502707512200177730ustar00rootroot00000000000000/* * File: forkexec2.c * Author: Philip Mucci * mucci@cs.utk.edu */ /* This file performs the following test: PAPI_library_init() PAPI_shutdown() fork() / \ parent child wait() PAPI_library_init() PAPI_shutdown() execlp() PAPI_library_init() */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <sys/wait.h> #include "papi.h" #include "papi_test.h" int main( int argc, char **argv ) { int retval; int status; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); if ( ( argc > 1 ) && ( strcmp( argv[1], "xxx" ) == 0 ) ) { /* In child */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "execed PAPI_library_init", retval ); } return 0; } else { if (!quiet) printf("Testing fork/PAPI_init/PAPI_shutdown/exec/PAPI_init\n"); /* Init PAPI */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "main PAPI_library_init", retval ); } PAPI_shutdown( ); if ( fork( ) == 0 ) { /* Init PAPI in child before exec */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "forked PAPI_library_init", retval ); } PAPI_shutdown( ); if ( execlp( argv[0], argv[0], "xxx", NULL ) == -1 ) { test_fail( __FILE__, __LINE__, "execlp", PAPI_ESYS ); } } else { /* In parent, wait for child to finish */ wait( &status ); if ( WEXITSTATUS( status ) != 0 ) { test_fail( __FILE__, __LINE__,
"fork", WEXITSTATUS( status ) ); } } } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/forkexec3.c000066400000000000000000000032631502707512200177750ustar00rootroot00000000000000/* * File: forkexec3.c * Author: Philip Mucci * mucci@cs.utk.edu */ /* This file performs the following test: PAPI_library_init() PAPI_shutdown() fork() / \ parent child wait() PAPI_library_init() **unlike forkexec2, no shutdown here** execlp() PAPI_library_init() */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <sys/wait.h> #include "papi.h" #include "papi_test.h" int main( int argc, char **argv ) { int retval; int status; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); if ( ( argc > 1 ) && ( strcmp( argv[1], "xxx" ) == 0 ) ) { /* In child */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "execed PAPI_library_init", retval ); } return 0; } else { if (!quiet) printf("Testing Init/Shutdown/fork/init/exec/init\n"); /* Init PAPI */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "main PAPI_library_init", retval ); PAPI_shutdown( ); if ( fork( ) == 0 ) { /* In child */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "forked PAPI_library_init", retval ); } if ( execlp( argv[0], argv[0], "xxx", NULL ) == -1 ) { test_fail( __FILE__, __LINE__, "execlp", PAPI_ESYS ); } } else { wait( &status ); if ( WEXITSTATUS( status ) != 0 ) { test_fail( __FILE__, __LINE__, "fork", WEXITSTATUS( status ) ); } } } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/forkexec4.c000066400000000000000000000033321502707512200177730ustar00rootroot00000000000000/* * File: forkexec4.c * Author: Philip Mucci * mucci@cs.utk.edu */ /* This file performs the following test: PAPI_library_init() ** unlike forkexec2/forkexec3, no shutdown here ** fork() / \ parent child wait()
PAPI_library_init() execlp() PAPI_library_init() */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <sys/wait.h> #include "papi.h" #include "papi_test.h" int main( int argc, char **argv ) { int retval; int status; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); if ( ( argc > 1 ) && ( strcmp( argv[1], "xxx" ) == 0 ) ) { /* In child */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "execed PAPI_library_init", retval ); } return 0; } else { if (!quiet) printf("Testing Init/fork/exec/Init\n"); /* Init PAPI */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "main PAPI_library_init", retval ); } if ( fork( ) == 0 ) { /* In Child */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "forked PAPI_library_init", retval ); } if ( execlp( argv[0], argv[0], "xxx", NULL ) == -1 ) { test_fail( __FILE__, __LINE__, "execlp", PAPI_ESYS ); } } else { /* Waiting in parent */ wait( &status ); if ( WEXITSTATUS( status ) != 0 ) { test_fail( __FILE__, __LINE__, "fork", WEXITSTATUS( status ) ); } } } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/get_event_component.c000066400000000000000000000067461502707512200221550ustar00rootroot00000000000000/* * File: get_event_component.c * Author: Vince Weaver vweaver1@eecs.utk.edu * Author: Treece Burgess tburgess@icl.utk.edu (updated in November 2024 to add a flag to enable or disable Cuda events.)
*/ /* This test makes sure PAPI_get_event_component() works */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include "papi.h" #include "papi_test.h" static void print_help(char **argv) { printf( "This is the get_event_component program.\n" ); printf( "For all components compiled in, it uses the function\n" ); printf( "PAPI_get_event_component to get the appropriate component index for a native event.\n"); printf( "Usage: %s [options]\n", argv[0] ); printf( "General command options:\n" ); printf( "\t-h, --help Print the help message.\n" ); printf( "\t--disable-cuda-events= Optionally disable processing the Cuda native events. Default is no.\n" ); printf( "\n" ); } int main( int argc, char **argv ) { int i; int retval; PAPI_event_info_t info; int numcmp, cid, our_cid; const PAPI_component_info_t* cmpinfo; char disableCudaEvts[PAPI_MIN_STR_LEN] = "no"; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); /* parse command line flags */ for (i = 0; i < argc; i++) { if (strncmp(argv[i], "--disable-cuda-events=", 22) == 0) { strncpy(disableCudaEvts, argv[i] + 22, PAPI_MIN_STR_LEN); } if (strncmp(argv[i], "--help", 6) == 0 || strncmp(argv[i], "-h", 2) == 0) { print_help(argv); exit(-1); } } /* Init PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } numcmp = PAPI_num_components( ); /* Loop through all components */ for( cid = 0; cid < numcmp; cid++ ) { cmpinfo = PAPI_get_component_info( cid ); if (cmpinfo == NULL) { test_fail( __FILE__, __LINE__, "PAPI_get_component_info", 2 ); } /* optionally skip the Cuda native events, default is no */ if (strcmp(cmpinfo->name, "cuda") == 0 && strcmp(disableCudaEvts, "yes") == 0) { continue; } if (cmpinfo->disabled != PAPI_OK && cmpinfo->disabled != PAPI_EDELAY_INIT && !TESTS_QUIET) { printf( "Name: %-23s %s\n", cmpinfo->name ,cmpinfo->description); printf(" \\-> Disabled: %s\n",cmpinfo->disabled_reason); continue; } i = 0 |
PAPI_NATIVE_MASK; retval = PAPI_enum_cmp_event( &i, PAPI_ENUM_FIRST, cid ); if (retval!=PAPI_OK) continue; do { if (PAPI_get_event_info( i, &info ) != PAPI_OK) { if (!TESTS_QUIET) { printf("Getting information about event: %#x failed\n", i); } continue; } our_cid=PAPI_get_event_component(i); if (our_cid!=cid) { if (!TESTS_QUIET) { printf("%d %d %s\n",cid,our_cid,info.symbol); } test_fail( __FILE__, __LINE__, "component mismatch", 1 ); } if (!TESTS_QUIET) { printf("%d %d %s\n",cid,our_cid,info.symbol); } } while ( PAPI_enum_cmp_event( &i, PAPI_ENUM_EVENTS, cid ) == PAPI_OK ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/hwinfo.c000066400000000000000000000040651502707512200173770ustar00rootroot00000000000000/* This file performs the following test: valid fields in hw_info */ #include <stdio.h> #include <stdlib.h> #include "papi.h" #include "papi_test.h" int main( int argc, char **argv ) { int retval, i, j; const PAPI_hw_info_t *hwinfo = NULL; const PAPI_mh_info_t *mh; tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); if (!TESTS_QUIET) { printf( "Test case hwinfo.c: " "Check output of PAPI_get_hardware_info.\n"); } hwinfo=PAPI_get_hardware_info(); if ( hwinfo == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 ); } mh = &hwinfo->mem_hierarchy; validate_string( hwinfo->vendor_string, "vendor_string" ); validate_string( hwinfo->model_string, "model_string" ); if ( hwinfo->vendor == PAPI_VENDOR_UNKNOWN ) test_fail( __FILE__, __LINE__, "Vendor unknown", 0 ); if ( hwinfo->cpu_max_mhz == 0.0 ) test_fail( __FILE__, __LINE__, "Mhz unknown", 0 ); if ( hwinfo->ncpu < 1 ) test_fail( __FILE__, __LINE__, "ncpu < 1", 0 ); if ( hwinfo->totalcpus < 1 ) test_fail( __FILE__, __LINE__, "totalcpus < 1", 0 ); /* if ( PAPI_get_opt( PAPI_MAX_HWCTRS, NULL ) < 1 ) test_fail( __FILE__, __LINE__, "get_opt(MAX_HWCTRS) < 1", 0
 );
	if ( PAPI_get_opt( PAPI_MAX_MPX_CTRS, NULL ) < 1 )
		test_fail( __FILE__, __LINE__, "get_opt(MAX_MPX_CTRS) < 1", 0 );*/

	if ( mh->levels < 0 )
		test_fail( __FILE__, __LINE__, "max mh level < 0", 0 );

	if (!TESTS_QUIET) {
		printf( "Max level of TLB or Cache: %d\n", mh->levels );
		for ( i = 0; i < mh->levels; i++ ) {
			for ( j = 0; j < PAPI_MH_MAX_LEVELS; j++ ) {
				const PAPI_mh_cache_info_t *c = &mh->level[i].cache[j];
				const PAPI_mh_tlb_info_t *t = &mh->level[i].tlb[j];
				printf( "Level %d, TLB %d: %d, %d, %d\n",
					i, j, t->type, t->num_entries, t->associativity );
				printf( "Level %d, Cache %d: %d, %d, %d, %d, %d\n",
					i, j, c->type, c->size, c->line_size,
					c->num_lines, c->associativity );
			}
		}
	}

	test_pass( __FILE__ );

	return 0;
}
papi-papi-7-2-0-t/src/ctests/inherit.c000066400000000000000000000062711502707512200175500ustar00rootroot00000000000000
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#if defined(_AIX) || defined (__FreeBSD__) || defined (__APPLE__)
#include <sys/wait.h>	/* ARGH! */
#else
#include <wait.h>
#endif

#include "papi.h"
#include "papi_test.h"
#include "do_loops.h"

int main( int argc, char **argv )
{
	int retval, pid, status, EventSet = PAPI_NULL;
	long long int values[] = {0,0};
	PAPI_option_t opt;
	char event_name[BUFSIZ];
	int quiet;

	quiet=tests_quiet( argc, argv );

	if ( ( retval = PAPI_library_init( PAPI_VER_CURRENT ) ) != PAPI_VER_CURRENT )
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );

	if ( ( retval = PAPI_create_eventset( &EventSet ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );

	if ( ( retval = PAPI_assign_eventset_component( EventSet, 0 ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_assign_eventset_component", retval );

	memset( &opt, 0x0, sizeof ( PAPI_option_t ) );
	opt.inherit.inherit = PAPI_INHERIT_ALL;
	opt.inherit.eventset = EventSet;

	if ( ( retval = PAPI_set_opt( PAPI_INHERIT, &opt ) ) != PAPI_OK ) {
		if ( retval == PAPI_ECMP) {
			test_skip( __FILE__, __LINE__, "Inherit not supported by current component.\n", retval );
		} else if (retval == PAPI_EPERM) {
			test_skip( __FILE__, __LINE__, "Inherit not supported by current component.\n", retval );
		} else {
			test_fail( __FILE__, __LINE__, "PAPI_set_opt", retval );
		}
	}

	if ( ( retval = PAPI_query_event( PAPI_TOT_CYC ) ) != PAPI_OK ) {
		if (!quiet) printf("Trouble finding PAPI_TOT_CYC\n");
		test_skip( __FILE__, __LINE__, "PAPI_query_event", retval );
	}

	if ( ( retval = PAPI_add_event( EventSet, PAPI_TOT_CYC ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_add_event", retval );

	strcpy(event_name,"PAPI_FP_INS");
	retval = PAPI_add_named_event( EventSet, event_name );
	if (retval == PAPI_ENOEVNT) {
		strcpy(event_name,"PAPI_TOT_INS");
		retval = PAPI_add_named_event( EventSet, event_name );
	}
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_add_event", retval );
	}

	if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_start", retval );

	pid = fork( );
	if ( pid == 0 ) {
		do_flops( NUM_FLOPS );
		exit( 0 );
	}
	if ( waitpid( pid, &status, 0 ) == -1 ) {
		perror( "waitpid()" );
		exit( 1 );
	}

	if ( ( retval = PAPI_stop( EventSet, values ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_stop", retval );

	if (!quiet) {
		printf( "Test case inherit: parent starts, child works, parent stops.\n" );
		printf( "------------------------------------------------------------\n" );
		printf( "Test run : \t1\n" );
		printf( "%s : \t%lld\n", event_name, values[1] );
		printf( "PAPI_TOT_CYC: \t%lld\n", values[0] );
		printf( "------------------------------------------------------------\n" );
		printf( "Verification:\n" );
		printf( "Row 1 at least %d\n", NUM_FLOPS );
		printf( "Row 2 greater than row 1\n");
	}

	if ( values[1] < 100 ) {
		test_fail( __FILE__, __LINE__, event_name, 1 );
	}

	if ( (!strcmp(event_name,"PAPI_FP_INS")) && (values[1] < NUM_FLOPS)) {
		test_fail( __FILE__, __LINE__, "PAPI_FP_INS", 1 );
	}

	test_pass( __FILE__ );

	return 0;
}
papi-papi-7-2-0-t/src/ctests/johnmay2.c000066400000000000000000000071061502707512200176330ustar00rootroot00000000000000#include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" int main( int argc, char **argv ) { int FPEventSet = PAPI_NULL; long long values; int PAPI_event, retval; char event_name[PAPI_MAX_STR_LEN]; int quiet; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); /* init PAPI */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* Use PAPI_FP_INS if available, otherwise use PAPI_TOT_INS */ if ( PAPI_query_event( PAPI_FP_INS ) == PAPI_OK ) PAPI_event = PAPI_FP_INS; else PAPI_event = PAPI_TOT_INS; retval = PAPI_query_event( PAPI_event ); if (retval != PAPI_OK ) { if (!quiet) printf("Trouble querying event\n"); test_skip( __FILE__, __LINE__, "PAPI_query_event", retval ); } /* Create the eventset */ if ( ( retval = PAPI_create_eventset( &FPEventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); /* Add event to the eventset */ if ( ( retval = PAPI_add_event( FPEventSet, PAPI_event ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); /* Start counting */ if ( ( retval = PAPI_start( FPEventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); /* Try to cleanup while running */ /* Fail test if this isn't refused */ if ( ( retval = PAPI_cleanup_eventset( FPEventSet ) ) != PAPI_EISRUN ) test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); /* Try to destroy eventset while running */ /* Fail test if this isn't refused */ if ( ( retval = PAPI_destroy_eventset( &FPEventSet ) ) != PAPI_EISRUN ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); /* do some work */ do_flops( 1000000 ); /* stop counting */ if ( ( retval = PAPI_stop( FPEventSet, &values ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); /* Try to 
destroy eventset without cleaning first */ /* Fail test if this isn't refused */ if ( ( retval = PAPI_destroy_eventset( &FPEventSet ) ) != PAPI_EINVAL ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); /* Try to cleanup eventset. */ /* This should pass. */ if ( ( retval = PAPI_cleanup_eventset( FPEventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); /* Try to destroy eventset. */ /* This should pass. */ if ( ( retval = PAPI_destroy_eventset( &FPEventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); /* Make sure eventset was set to PAPI_NULL */ if ( FPEventSet != PAPI_NULL ) test_fail( __FILE__, __LINE__, "FPEventSet != PAPI_NULL", retval ); if ( !quiet ) { if ( ( retval = PAPI_event_code_to_name( PAPI_event, event_name ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); printf( "Test case John May 2: cleanup / destroy eventset.\n" ); printf( "-------------------------------------------------\n" ); printf( "Test run : \t1\n" ); printf( "%s : \t", event_name ); printf( LLDFMT, values ); printf( "\n" ); printf( "-------------------------------------------------\n" ); printf( "The following messages will appear if PAPI is compiled with debug enabled:\n" ); printf ( "\tPAPI Error Code -10: PAPI_EISRUN: EventSet is currently counting\n" ); printf ( "\tPAPI Error Code -10: PAPI_EISRUN: EventSet is currently counting\n" ); printf( "\tPAPI Error Code -1: PAPI_EINVAL: Invalid argument\n" ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/krentel_pthreads.c000066400000000000000000000122211502707512200214340ustar00rootroot00000000000000/* * Test PAPI with multiple threads. 
*/ #define MAX_THREADS 256 #include #include #include #include #include "papi.h" #include "papi_test.h" #define EVENT PAPI_TOT_CYC static int program_time = 5; static int threshold = 20000000; static int num_threads = 3; static long count[MAX_THREADS]; static long iter[MAX_THREADS]; static struct timeval last[MAX_THREADS]; static pthread_key_t key; static struct timeval start; static void my_handler( int EventSet, void *pc, long long ovec, void *context ) { ( void ) EventSet; ( void ) pc; ( void ) ovec; ( void ) context; long num = ( long ) pthread_getspecific( key ); if ( num < 0 || num > num_threads ) test_fail( __FILE__, __LINE__, "getspecific failed", 1 ); count[num]++; } static void print_rate( long num ) { struct timeval now; long st_secs; double last_secs; gettimeofday( &now, NULL ); st_secs = now.tv_sec - start.tv_sec; last_secs = ( double ) ( now.tv_sec - last[num].tv_sec ) + ( ( double ) ( now.tv_usec - last[num].tv_usec ) ) / 1000000.0; if ( last_secs <= 0.001 ) last_secs = 0.001; if (!TESTS_QUIET) { printf( "[%ld] time = %ld, count = %ld, iter = %ld, " "rate = %.1f/Kiter\n", num, st_secs, count[num], iter[num], ( 1000.0 * ( double ) count[num] ) / ( double ) iter[num] ); } count[num] = 0; iter[num] = 0; last[num] = now; } static void do_cycles( long num, int len ) { struct timeval start, now; double x, sum; gettimeofday( &start, NULL ); for ( ;; ) { sum = 1.0; for ( x = 1.0; x < 250000.0; x += 1.0 ) sum += x; if ( sum < 0.0 ) printf( "==>> SUM IS NEGATIVE !! 
<<==\n" ); iter[num]++; gettimeofday( &now, NULL ); if ( now.tv_sec >= start.tv_sec + len ) break; } } static void * my_thread( void *v ) { long num = ( long ) v; int n; int EventSet = PAPI_NULL; long long value; int retval; retval = PAPI_register_thread( ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_register_thread", retval ); } pthread_setspecific( key, v ); count[num] = 0; iter[num] = 0; last[num] = start; retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset failed", retval ); } retval = PAPI_add_event( EventSet, EVENT ); if (retval != PAPI_OK ) { if (!TESTS_QUIET) printf("Trouble adding event\n"); test_fail( __FILE__, __LINE__, "PAPI_add_event failed", retval ); } if ( PAPI_overflow( EventSet, EVENT, threshold, 0, my_handler ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow failed", 1 ); if ( PAPI_start( EventSet ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start failed", 1 ); if (!TESTS_QUIET) printf( "launched timer in thread %ld\n", num ); for ( n = 1; n <= program_time; n++ ) { do_cycles( num, 1 ); print_rate( num ); } PAPI_stop( EventSet, &value ); retval = PAPI_overflow( EventSet, EVENT, 0, 0, my_handler); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow failed to reset the overflow handler", retval ); if ( PAPI_remove_event( EventSet, EVENT ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_remove_event", 1 ); if ( PAPI_destroy_eventset( &EventSet ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", 1 ); if ( PAPI_unregister_thread( ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_unregister_thread", 1 ); return ( NULL ); } int main( int argc, char **argv ) { pthread_t *td = NULL; long n; int quiet,retval; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); if ( argc < 2 || sscanf( argv[1], "%d", &program_time ) < 1 ) program_time = 6; if ( argc < 3 || sscanf( argv[2], "%d", &threshold ) < 1 
) threshold = 20000000; if ( argc < 4 || sscanf( argv[3], "%d", &num_threads ) < 1 ) num_threads = 3; td = malloc((num_threads+1) * sizeof(pthread_t)); if (!td) { test_fail( __FILE__, __LINE__, "td malloc failed", 1 ); } if (!quiet) { printf( "program_time = %d, threshold = %d, num_threads = %d\n\n", program_time, threshold, num_threads ); } if ( PAPI_library_init( PAPI_VER_CURRENT ) != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init failed", 1 ); /* Test to be sure we can add events */ retval = PAPI_query_event( EVENT ); if (retval!=PAPI_OK) { if (!quiet) printf("Trouble finding event\n"); test_skip(__FILE__,__LINE__,"Event not available",1); } if ( PAPI_thread_init( ( unsigned long ( * )( void ) ) ( pthread_self ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_thread_init failed", 1 ); if ( pthread_key_create( &key, NULL ) != 0 ) test_fail( __FILE__, __LINE__, "pthread key create failed", 1 ); gettimeofday( &start, NULL ); for ( n = 1; n <= num_threads; n++ ) { if ( pthread_create( &(td[n]), NULL, my_thread, ( void * ) n ) != 0 ) test_fail( __FILE__, __LINE__, "pthread create failed", 1 ); } my_thread( ( void * ) 0 ); /* wait for all the threads */ for ( n = 1; n <= num_threads; n++ ) { if ( pthread_join( td[n], NULL)) test_fail( __FILE__, __LINE__, "pthread join failed", 1 ); } free(td); if (!quiet) printf( "done\n" ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/krentel_pthreads_race.c000066400000000000000000000142771502707512200224430ustar00rootroot00000000000000/* * Test PAPI with multiple threads. * This code is a modification of krentel_pthreads.c by William Cohen * , on Sep 10 2019, to exercise and test for the race * condition in papi_internal.c involving the formerly static variables * papi_event_code and papi_event_code_changed. This code should be run with * "valgrind --tool=helgrind" to show any data races. 
If run with: * "valgrind --tool=helgrind --log-file=helgrind_out.txt" * The output will be captured in helgrind_out.txt and can then be processed * with the program filter_helgrind.c; see commentary at the top of that file. */ #define MAX_THREADS 256 #include #include #include #include #include "papi.h" #include "papi_test.h" #define EVENT PAPI_TOT_CYC static int program_time = 5; static int threshold = 20000000; static int num_threads = 3; static long count[MAX_THREADS]; static long iter[MAX_THREADS]; static struct timeval last[MAX_THREADS]; static pthread_key_t key; static struct timeval start; static void my_handler( int EventSet, void *pc, long long ovec, void *context ) { ( void ) EventSet; ( void ) pc; ( void ) ovec; ( void ) context; long num = ( long ) pthread_getspecific( key ); if ( num < 0 || num > num_threads ) test_fail( __FILE__, __LINE__, "getspecific failed", 1 ); count[num]++; } static void print_rate( long num ) { struct timeval now; long st_secs; double last_secs; gettimeofday( &now, NULL ); st_secs = now.tv_sec - start.tv_sec; last_secs = ( double ) ( now.tv_sec - last[num].tv_sec ) + ( ( double ) ( now.tv_usec - last[num].tv_usec ) ) / 1000000.0; if ( last_secs <= 0.001 ) last_secs = 0.001; if (!TESTS_QUIET) { printf( "[%ld] time = %ld, count = %ld, iter = %ld, " "rate = %.1f/Kiter\n", num, st_secs, count[num], iter[num], ( 1000.0 * ( double ) count[num] ) / ( double ) iter[num] ); } count[num] = 0; iter[num] = 0; last[num] = now; } static void do_cycles( long num, int len ) { struct timeval start, now; double x, sum; gettimeofday( &start, NULL ); for ( ;; ) { sum = 1.0; for ( x = 1.0; x < 250000.0; x += 1.0 ) sum += x; if ( sum < 0.0 ) printf( "==>> SUM IS NEGATIVE !! 
<<==\n" ); iter[num]++; gettimeofday( &now, NULL ); if ( now.tv_sec >= start.tv_sec + len ) break; } } static void * my_thread( void *v ) { long num = ( long ) v; int n; int EventSet = PAPI_NULL; int event_code; long long value; int retval; retval = PAPI_register_thread( ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_register_thread", retval ); } pthread_setspecific( key, v ); count[num] = 0; iter[num] = 0; last[num] = start; retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset failed", retval ); } retval = PAPI_event_name_to_code("PAPI_TOT_CYC", &event_code); if (retval != PAPI_OK ) { if (!TESTS_QUIET) printf("Trouble creating event name\n"); test_fail( __FILE__, __LINE__, "PAPI_event_name_to_code failed", retval ); } retval = PAPI_add_event( EventSet, EVENT ); if (retval != PAPI_OK ) { if (!TESTS_QUIET) printf("Trouble adding event\n"); test_fail( __FILE__, __LINE__, "PAPI_add_event failed", retval ); } if ( PAPI_overflow( EventSet, EVENT, threshold, 0, my_handler ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow failed", 1 ); if ( PAPI_start( EventSet ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start failed", 1 ); if (!TESTS_QUIET) printf( "launched timer in thread %ld\n", num ); for ( n = 1; n <= program_time; n++ ) { do_cycles( num, 1 ); print_rate( num ); } PAPI_stop( EventSet, &value ); retval = PAPI_overflow( EventSet, EVENT, 0, 0, my_handler); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow failed to reset the overflow handler", retval ); if ( PAPI_remove_event( EventSet, EVENT ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_remove_event", 1 ); if ( PAPI_destroy_eventset( &EventSet ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", 1 ); if ( PAPI_unregister_thread( ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_unregister_thread", 1 ); return ( NULL ); } int main( int argc, char **argv ) { 
pthread_t *td = NULL; long n; int quiet,retval; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); if ( argc < 2 || sscanf( argv[1], "%d", &program_time ) < 1 ) program_time = 6; if ( argc < 3 || sscanf( argv[2], "%d", &threshold ) < 1 ) threshold = 20000000; if ( argc < 4 || sscanf( argv[3], "%d", &num_threads ) < 1 ) num_threads = 32; td = malloc((num_threads+1) * sizeof(pthread_t)); if (!td) { test_fail( __FILE__, __LINE__, "td malloc failed", 1 ); } if (!quiet) { printf( "program_time = %d, threshold = %d, num_threads = %d\n\n", program_time, threshold, num_threads ); } if ( PAPI_library_init( PAPI_VER_CURRENT ) != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init failed", 1 ); /* Test to be sure we can add events */ retval = PAPI_query_event( EVENT ); if (retval!=PAPI_OK) { if (!quiet) printf("Trouble finding event\n"); test_skip(__FILE__,__LINE__,"Event not available",1); } if ( PAPI_thread_init( ( unsigned long ( * )( void ) ) ( pthread_self ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_thread_init failed", 1 ); if ( pthread_key_create( &key, NULL ) != 0 ) test_fail( __FILE__, __LINE__, "pthread key create failed", 1 ); gettimeofday( &start, NULL ); for ( n = 1; n <= num_threads; n++ ) { if ( pthread_create( &(td[n]), NULL, my_thread, ( void * ) n ) != 0 ) test_fail( __FILE__, __LINE__, "pthread create failed", 1 ); } my_thread( ( void * ) 0 ); /* wait for all the threads */ for ( n = 1; n <= num_threads; n++ ) { if ( pthread_join( td[n], NULL)) test_fail( __FILE__, __LINE__, "pthread join failed", 1 ); } free(td); if (!quiet) printf( "done\n" ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/kufrin.c000066400000000000000000000120651502707512200174020ustar00rootroot00000000000000/* * File: multiplex1_pthreads.c * Author: Rick Kufrin * rkufrin@ncsa.uiuc.edu * Mods: Philip Mucci * mucci@cs.utk.edu */ /* This file really bangs on the multiplex pthread functionality */ #include #include #include #include 
#include "papi.h" #include "papi_test.h" #include "do_loops.h" static int *events; static int numevents = 0; static int max_events=0; double loop( long n ) { long i; double a = 0.0012; for ( i = 0; i < n; i++ ) { a += 0.01; } return a; } void * thread( void *arg ) { ( void ) arg; /*unused */ int eventset = PAPI_NULL; long long *values; int ret = PAPI_register_thread( ); if ( ret != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_register_thread", ret ); ret = PAPI_create_eventset( &eventset ); if ( ret != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", ret ); values=calloc(max_events,sizeof(long long)); if (!TESTS_QUIET) printf( "Event set %d created\n", eventset ); /* In Component PAPI, EventSets must be assigned a component index before you can fiddle with their internals. 0 is always the cpu component */ ret = PAPI_assign_eventset_component( eventset, 0 ); if ( ret != PAPI_OK ) { free(values); test_fail( __FILE__, __LINE__, "PAPI_assign_eventset_component", ret ); } ret = PAPI_set_multiplex( eventset ); if ( ret == PAPI_ENOSUPP) { free(values); test_skip( __FILE__, __LINE__, "Multiplexing not supported", 1 ); } else if ( ret != PAPI_OK ) { free(values); test_fail( __FILE__, __LINE__, "PAPI_set_multiplex", ret ); } ret = PAPI_add_events( eventset, events, numevents ); if ( ret < PAPI_OK ) { free(values); test_fail( __FILE__, __LINE__, "PAPI_add_events", ret ); } ret = PAPI_start( eventset ); if ( ret != PAPI_OK ) { free(values); test_fail( __FILE__, __LINE__, "PAPI_start", ret ); } do_stuff( ); ret = PAPI_stop( eventset, values ); if ( ret != PAPI_OK ) { free(values); test_fail( __FILE__, __LINE__, "PAPI_stop", ret ); } ret = PAPI_cleanup_eventset( eventset ); if ( ret != PAPI_OK ) { free(values); test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", ret ); } ret = PAPI_destroy_eventset( &eventset ); if ( ret != PAPI_OK ) { free(values); test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", ret ); } ret = PAPI_unregister_thread( ); if ( ret != 
PAPI_OK ) { free(values); test_fail( __FILE__, __LINE__, "PAPI_unregister_thread", ret ); } free(values); return ( NULL ); } int main( int argc, char **argv ) { int nthreads = 8, retval, i; PAPI_event_info_t info; pthread_t *threads; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); if ( !quiet ) { if ( argc > 1 ) { int tmp = atoi( argv[1] ); if ( tmp >= 1 ) nthreads = tmp; } } retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } retval = PAPI_thread_init( ( unsigned long ( * )( void ) ) pthread_self ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_thread_init", retval ); } retval = PAPI_multiplex_init( ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_multiplex_init", retval ); } if ((max_events = PAPI_get_cmp_opt(PAPI_MAX_MPX_CTRS,NULL,0)) <= 0) { test_fail( __FILE__, __LINE__, "PAPI_get_cmp_opt", max_events ); } if ((events = calloc(max_events,sizeof(int))) == NULL) { test_fail( __FILE__, __LINE__, "calloc", PAPI_ESYS ); } /* Fill up the event set with as many non-derived events as we can */ i = PAPI_PRESET_MASK; do { if ( PAPI_get_event_info( i, &info ) == PAPI_OK ) { if ( info.count == 1 ) { events[numevents++] = ( int ) info.event_code; if (!quiet) printf( "Added %s\n", info.symbol ); } else { if (!quiet) printf( "Skipping derived event %s\n", info.symbol ); } } } while ( ( PAPI_enum_event( &i, PAPI_PRESET_ENUM_AVAIL ) == PAPI_OK ) && ( numevents < max_events ) ); if (!quiet) printf( "Found %d events\n", numevents ); if (numevents==0) { test_skip(__FILE__,__LINE__,"No events found",0); } do_stuff( ); if (!quiet) printf( "Creating %d threads:\n", nthreads ); threads = ( pthread_t * ) malloc( ( size_t ) nthreads * sizeof ( pthread_t ) ); if ( threads == NULL ) { free(events); test_fail( __FILE__, __LINE__, "malloc", PAPI_ENOMEM ); } /* Create the threads */ for ( i = 0; i < nthreads; i++ ) { retval = 
pthread_create( &threads[i], NULL, thread, NULL ); if ( retval != 0 ) { free(events); free(threads); test_fail( __FILE__, __LINE__, "pthread_create", PAPI_ESYS ); } } /* Wait for thread completion */ for ( i = 0; i < nthreads; i++ ) { retval = pthread_join( threads[i], NULL ); if ( retval != 0 ) { free(events); free(threads); test_fail( __FILE__, __LINE__, "pthread_join", PAPI_ESYS ); } } if (!quiet) printf( "Done." ); free(events); free(threads); test_pass( __FILE__ ); pthread_exit( NULL ); return 0; } papi-papi-7-2-0-t/src/ctests/locks_pthreads.c000066400000000000000000000051201502707512200211030ustar00rootroot00000000000000/* This file checks to make sure the locking mechanisms work correctly */ /* on the platform. */ /* Platforms where the locking mechanisms are not implemented or are */ /* incorrectly implemented will fail. -KSL */ #define MAX_THREADS 256 #define APPR_TOTAL_ITER 1000000 #include #include #include #include #include #include "papi.h" #include "papi_test.h" volatile long long count = 0; volatile long long tmpcount = 0; volatile long long thread_iter = 0; static int quiet=0; void lockloop( int iters, volatile long long *mycount ) { int i; for ( i = 0; i < iters; i++ ) { PAPI_lock( PAPI_USR1_LOCK ); *mycount = *mycount + 1; PAPI_unlock( PAPI_USR1_LOCK ); } } void * Slave( void *arg ) { long long duration; duration = PAPI_get_real_usec( ); lockloop( thread_iter, &count ); duration = PAPI_get_real_usec( ) - duration; if (!quiet) { printf("%f lock/unlocks per us\n", (float)thread_iter/(float)duration); } pthread_exit( arg ); } int main( int argc, char **argv ) { pthread_t slaves[MAX_THREADS]; int rc, i, nthr; int retval; const PAPI_hw_info_t *hwinfo = NULL; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } hwinfo = PAPI_get_hardware_info( ); if (hwinfo == NULL ) { test_fail( __FILE__, 
__LINE__, "PAPI_get_hardware_info", 2 ); } retval = PAPI_thread_init((unsigned long (*)(void)) ( pthread_self ) ); if ( retval != PAPI_OK ) { if ( retval == PAPI_ECMP ) { test_skip( __FILE__, __LINE__, "PAPI_thread_init", retval ); } else { test_fail( __FILE__, __LINE__, "PAPI_thread_init", retval ); } } if ( hwinfo->ncpu > MAX_THREADS ) { nthr = MAX_THREADS; } else { nthr = hwinfo->ncpu; } /* Scale the per thread work to keep the serial runtime about the same. */ thread_iter = APPR_TOTAL_ITER/sqrt(nthr); if (!quiet) { printf( "Creating %d threads, %lld lock/unlock\n", nthr , thread_iter); } for ( i = 0; i < nthr; i++ ) { rc = pthread_create( &slaves[i], NULL, Slave, NULL ); if ( rc ) { retval = PAPI_ESYS; test_fail( __FILE__, __LINE__, "pthread_create", retval ); } } for ( i = 0; i < nthr; i++ ) { pthread_join( slaves[i], NULL ); } if (!quiet) { printf( "Expected: %lld Received: %lld\n", ( long long ) nthr * thread_iter, count ); } if ( nthr * thread_iter != count ) { test_fail( __FILE__, __LINE__, "Thread Locks", 1 ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/low-level.c000066400000000000000000000135521502707512200200140ustar00rootroot00000000000000/* This examples show the essentials in using the PAPI low-level interface. The program consists of 3 examples where the work done over some work-loops. The example tries to illustrate some simple mistakes that are easily made and how a correct code would accomplish the same thing. Example 1: The total count over two work loops (Loops 1 and 2) are supposed to be measured. Due to a mis-understanding of the semantics of the API the total count gets wrong. The example also illustrates that it is legal to read both running and stopped counters. Example 2: The total count over two work loops (Loops 1 and 3) is supposed to be measured while discarding the counts made in loop 2. Instead the counts in loop1 are counted twice and the counts in loop2 are added to the total number of counts. 
Example 3: One correct way of accomplishing the result aimed for in example 2. */ #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #define NUM_EVENTS 2 int main( int argc, char **argv ) { int retval; long long values[NUM_EVENTS], dummyvalues[NUM_EVENTS]; int Events[NUM_EVENTS]; int EventSet = PAPI_NULL; int quiet; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* query and set up the right events to monitor */ if ( PAPI_query_event( PAPI_FP_INS ) == PAPI_OK ) { Events[0] = PAPI_FP_INS; Events[1] = PAPI_TOT_CYC; } else { Events[0] = PAPI_TOT_INS; Events[1] = PAPI_TOT_CYC; } retval = PAPI_create_eventset( &EventSet ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } retval = PAPI_add_events( EventSet, ( int * ) Events, NUM_EVENTS ); if (retval < PAPI_OK ) { if (!quiet) printf("Trouble adding events\n"); test_skip( __FILE__, __LINE__, "PAPI_add_events", retval ); } if ( !quiet ) { printf( "\n Incorrect usage of read and accum.\n" ); printf( " Some cycles are counted twice\n" ); } if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); /* Loop 1 */ do_flops( NUM_FLOPS ); if ( ( retval = PAPI_read( EventSet, values ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_read", retval ); if ( !quiet ) printf( TWO12, values[0], values[1], "(Counters continuing...)\n" ); /* Loop 2 */ do_flops( NUM_FLOPS ); /* Using PAPI_accum here is incorrect. 
The result is that Loop 1 * * is being counted twice */ if ( ( retval = PAPI_accum( EventSet, values ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_accum", retval ); if ( !quiet ) printf( TWO12, values[0], values[1], "(Counters being accumulated)\n" ); /* Loop 3 */ do_flops( NUM_FLOPS ); if ( ( retval = PAPI_stop( EventSet, dummyvalues ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); if ( ( retval = PAPI_read( EventSet, dummyvalues ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_read", retval ); if ( !quiet ) { printf( TWO12, dummyvalues[0], dummyvalues[1], "(Reading stopped counters)\n" ); printf( TWO12, values[0], values[1], "" ); printf( "\n Incorrect usage of read and accum.\n" ); printf( " Another incorrect use\n" ); } if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); /* Loop 1 */ do_flops( NUM_FLOPS ); if ( ( retval = PAPI_read( EventSet, values ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_read", retval ); if ( !quiet ) printf( TWO12, values[0], values[1], "(Counters continuing...)\n" ); /* Loop 2 */ /* Code that should not be counted */ do_flops( NUM_FLOPS ); if ( ( retval = PAPI_read( EventSet, dummyvalues ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_read", retval ); if ( !quiet ) printf( TWO12, dummyvalues[0], dummyvalues[1], "(Intermediate counts...)\n" ); /* Loop 3 */ do_flops( NUM_FLOPS ); /* Since PAPI_read does not reset the counters it's use above after * * loop 2 is incorrect. Instead Loop1 will in effect be counted twice. 
* * and the counts in loop 2 are included in the total counts */ if ( ( retval = PAPI_accum( EventSet, values ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_accum", retval ); if ( !quiet ) printf( TWO12, values[0], values[1], "" ); if ( ( retval = PAPI_stop( EventSet, dummyvalues ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); if ( !quiet ) { printf( "\n Correct usage of read and accum.\n" ); printf( " PAPI_reset and PAPI_accum used to skip counting\n" ); printf( " a section of the code.\n" ); } if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); if ( ( retval = PAPI_read( EventSet, values ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_read", retval ); if ( !quiet ) printf( TWO12, values[0], values[1], "(Counters continuing)\n" ); /* Code that should not be counted */ do_flops( NUM_FLOPS ); if ( ( retval = PAPI_reset( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_reset", retval ); if ( !quiet ) printf( "%12s %12s (Counters reset)\n", "", "" ); do_flops( NUM_FLOPS ); if ( ( retval = PAPI_accum( EventSet, values ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_accum", retval ); if ( !quiet ) printf( TWO12, values[0], values[1], "" ); if ( !quiet ) { printf( "----------------------------------\n" ); printf( "Verification: The last line in each experiment should be\n" ); printf( "approximately twice the value of the first line.\n" ); printf ( "The third case illustrates one possible way to accomplish this.\n" ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/max_multiplex.c000066400000000000000000000046601502707512200207760ustar00rootroot00000000000000/* this tests attempts to add the maximum number of pre-defined events */ /* to a multiplexed event set. This tests that we properly set the */ /* maximum events value. 
*/ #include #include #include "papi.h" #include "papi_test.h" int main(int argc, char **argv) { int retval,max_multiplex,i,EventSet=PAPI_NULL; PAPI_event_info_t info; int added=0; int events_tried=0; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); /* Initialize the library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } retval = PAPI_multiplex_init( ); if ( retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "Multiplex not supported", 1); } max_multiplex=PAPI_get_opt( PAPI_MAX_MPX_CTRS, NULL ); if (!TESTS_QUIET) { printf("Maximum multiplexed counters=%d\n",max_multiplex); } if (!TESTS_QUIET) { printf("Trying to multiplex as many as possible:\n"); } retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail(__FILE__, __LINE__, "PAPI_create_eventset", retval ); } retval = PAPI_assign_eventset_component( EventSet, 0 ); if ( retval != PAPI_OK ) { test_fail(__FILE__, __LINE__, "PAPI_assign_eventset_component", retval ); } retval = PAPI_set_multiplex( EventSet ); if ( retval != PAPI_OK ) { test_fail(__FILE__, __LINE__, "PAPI_create_multiplex", retval ); } i = 0 | PAPI_PRESET_MASK; PAPI_enum_event( &i, PAPI_ENUM_FIRST ); do { retval = PAPI_get_event_info( i, &info ); if (retval==PAPI_OK) { if (!TESTS_QUIET) printf("Adding %s: ",info.symbol); } retval = PAPI_add_event( EventSet, info.event_code ); if (retval!=PAPI_OK) { if (!TESTS_QUIET) printf("Fail!\n"); } else { if (!TESTS_QUIET) printf("Success!\n"); added++; } events_tried++; } while (PAPI_enum_event( &i, PAPI_PRESET_ENUM_AVAIL ) == PAPI_OK ); PAPI_shutdown( ); if (!TESTS_QUIET) { printf("Added %d of theoretical max %d\n",added,max_multiplex); } if (events_tried #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #define OUT_FMT "%12d\t%12lld\t%12lld\t%.2f\n" int main( int argc, char **argv ) { int retval, i, j; int EventSet = PAPI_NULL; long long values[2]; const 
PAPI_hw_info_t *hwinfo = NULL; char descr[PAPI_MAX_STR_LEN]; PAPI_event_info_t evinfo; PAPI_mh_level_t *L; const int eventlist[] = { PAPI_L1_DCA, PAPI_L1_DCM, PAPI_L1_DCH, PAPI_L2_DCA, PAPI_L2_DCM, PAPI_L2_DCH, #if 0 PAPI_L1_LDM, PAPI_L1_STM, PAPI_L1_DCR, PAPI_L1_DCW, PAPI_L1_ICM, PAPI_L1_TCM, PAPI_LD_INS, PAPI_SR_INS, PAPI_LST_INS, PAPI_L2_DCR, PAPI_L2_DCW, PAPI_CSR_TOT, PAPI_MEM_SCY, PAPI_MEM_RCY, PAPI_MEM_WCY, PAPI_L1_ICH, PAPI_L1_ICA, PAPI_L1_ICR, PAPI_L1_ICW, PAPI_L1_TCH, PAPI_L1_TCA, PAPI_L1_TCR, PAPI_L1_TCW, PAPI_L2_DCM, PAPI_L2_ICM, PAPI_L2_TCM, PAPI_L2_LDM, PAPI_L2_STM, PAPI_L2_DCH, PAPI_L2_DCA, PAPI_L2_DCR, PAPI_L2_DCW, PAPI_L2_ICH, PAPI_L2_ICA, PAPI_L2_ICR, PAPI_L2_ICW, PAPI_L2_TCH, PAPI_L2_TCA, PAPI_L2_TCR, PAPI_L2_TCW, #endif 0 }; tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ if ( ( retval = PAPI_library_init( PAPI_VER_CURRENT ) ) != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); if ( ( hwinfo = PAPI_get_hardware_info( ) ) == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 ); } if ( ( retval = PAPI_create_eventset( &EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); /* Extract and report the cache information */ L = ( PAPI_mh_level_t * ) ( hwinfo->mem_hierarchy.level ); for ( i = 0; i < hwinfo->mem_hierarchy.levels; i++ ) { for ( j = 0; j < 2; j++ ) { int tmp; tmp = PAPI_MH_CACHE_TYPE( L[i].cache[j].type ); if ( tmp == PAPI_MH_TYPE_UNIFIED ) { if (!TESTS_QUIET) printf( "L%d Unified ", i + 1 ); } else if ( tmp == PAPI_MH_TYPE_DATA ) { if (!TESTS_QUIET) printf( "L%d Data ", i + 1 ); } else if ( tmp == PAPI_MH_TYPE_INST ) { if (!TESTS_QUIET) printf( "L%d Instruction ", i + 1 ); } else if ( tmp == PAPI_MH_TYPE_VECTOR ) { if (!TESTS_QUIET) printf( "L%d Vector ", i + 1 ); } else if ( tmp == PAPI_MH_TYPE_TRACE ) { if (!TESTS_QUIET) printf( "L%d Trace ", i + 1 ); } else if ( tmp == PAPI_MH_TYPE_EMPTY ) { break; } else { test_fail( __FILE__, __LINE__, 
"PAPI_get_hardware_info", PAPI_EBUG ); } tmp = PAPI_MH_CACHE_WRITE_POLICY( L[i].cache[j].type ); if ( tmp == PAPI_MH_TYPE_WB ) { if (!TESTS_QUIET) printf( "Write back " ); } else if ( tmp == PAPI_MH_TYPE_WT ) { if (!TESTS_QUIET) printf( "Write through " ); } else if ( tmp == PAPI_MH_TYPE_UNKNOWN ) { if (!TESTS_QUIET) printf( "Unknown Write policy " ); } else { test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", PAPI_EBUG ); } tmp = PAPI_MH_CACHE_REPLACEMENT_POLICY( L[i].cache[j].type ); if ( tmp == PAPI_MH_TYPE_PSEUDO_LRU ) { if (!TESTS_QUIET) printf( "Pseudo LRU policy " ); } else if ( tmp == PAPI_MH_TYPE_LRU ) { if (!TESTS_QUIET) printf( "LRU policy " ); } else if ( tmp == PAPI_MH_TYPE_UNKNOWN ) { if (!TESTS_QUIET) printf( "Unknown Replacement policy " ); } else { test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", PAPI_EBUG ); } tmp = PAPI_MH_CACHE_ALLOCATION_POLICY( L[i].cache[j].type ); if ( tmp == PAPI_MH_TYPE_RD_ALLOC ) { if (!TESTS_QUIET) printf( "Read Allocate " ); } else if ( tmp == PAPI_MH_TYPE_WR_ALLOC ) { if (!TESTS_QUIET) printf( "Write Allocate " ); } else if ( tmp == PAPI_MH_TYPE_RW_ALLOC ) { if (!TESTS_QUIET) printf( "Read Write Allocate " ); } else if ( tmp == PAPI_MH_TYPE_UNKNOWN ) { if (!TESTS_QUIET) printf( "Unknown Allocate policy " ); } else { test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", PAPI_EBUG ); } if (!TESTS_QUIET) { printf( "Cache:\n" ); if ( L[i].cache[j].type ) { printf( " Total size: %dKB\n" " Line size: %dB\n" " Number of Lines: %d\n" " Associativity: %d\n\n", ( L[i].cache[j].size ) >> 10, L[i].cache[j].line_size, L[i].cache[j].num_lines, L[i].cache[j].associativity ); } } } } for ( i = 0; eventlist[i] != 0; i++ ) { if (PAPI_event_code_to_name( eventlist[i], descr ) != PAPI_OK) continue; if ( PAPI_add_event( EventSet, eventlist[i] ) != PAPI_OK ) continue; if ( PAPI_get_event_info( eventlist[i], &evinfo ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_get_event_info", retval ); if (!TESTS_QUIET) { printf( 
"\nEvent: %s\nShort: %s\nLong: %s\n\n", evinfo.symbol, evinfo.short_descr, evinfo.long_descr ); printf( " Bytes\t\tCold\t\tWarm\tPercent\n" ); } if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); for ( j = 512; j <= 16 * ( 1024 * 1024 ); j = j * 2 ) { do_misses( 1, j ); do_flush( ); if ( ( retval = PAPI_reset( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_reset", retval ); do_misses( 1, j ); if ( ( retval = PAPI_read( EventSet, &values[0] ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_read", retval ); if ( ( retval = PAPI_reset( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_reset", retval ); do_misses( 1, j ); if ( ( retval = PAPI_read( EventSet, &values[1] ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_read", retval ); if (!TESTS_QUIET) { printf( OUT_FMT, j, values[0], values[1], ( ( float ) values[1] / ( float ) ( ( values[0] !=0 ) ? values[0] : 1 ) * 100.0 ) ); } } if ( ( retval = PAPI_stop( EventSet, NULL ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); if ( ( retval = PAPI_remove_event( EventSet, eventlist[i] ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_remove_event", retval ); } if ( ( retval = PAPI_destroy_eventset( &EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/mendes-alt.c000066400000000000000000000111531502707512200201320ustar00rootroot00000000000000#include #include #include #include "papi.h" #include "papi_test.h" #ifdef SETMAX #define MAX SETMAX #else #define MAX 10000 #endif #define TIMES 1000 #define PAPI_MAX_EVENTS 2 long long PAPI_values1[PAPI_MAX_EVENTS]; long long PAPI_values2[PAPI_MAX_EVENTS]; long long PAPI_values3[PAPI_MAX_EVENTS]; static int EventSet = PAPI_NULL; void funcX( double a[MAX], double b[MAX], int n) { int i, k; for ( k = 0; k < TIMES; k++ ) for ( i = 0; i < n; i++ ) a[i] = a[i] * b[i] + 
1.; } void funcA( double a[MAX], double b[MAX], int n) { int i, k; double t[MAX]; for ( k = 0; k < TIMES; k++ ) for ( i = 0; i < n; i++ ) { t[i] = b[n - i]; b[i] = a[n - i]; a[i] = t[i]; } } int main( int argc, char **argv ) { int i, retval; double a[MAX], b[MAX]; int quiet; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); for ( i = 0; i < MAX; i++ ) { a[i] = 0.0; b[i] = 0.; } for ( i = 0; i < PAPI_MAX_EVENTS; i++ ) PAPI_values1[i] = PAPI_values2[i] = 0; retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); #ifdef MULTIPLEX if ( !quiet ) { printf( "Activating PAPI Multiplex\n" ); } retval = PAPI_multiplex_init( ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI multiplex init fail\n", retval ); } #endif retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI set event fail\n", retval ); #ifdef MULTIPLEX /* In Component PAPI, EventSets must be assigned a component index before you can fiddle with their internals. 
0 is always the cpu component */ retval = PAPI_assign_eventset_component( EventSet, 0 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_assign_eventset_component", retval ); retval = PAPI_set_multiplex( EventSet ); if (retval == PAPI_ENOSUPP) { test_skip( __FILE__, __LINE__, "Multiplex not supported", 1 ); } else if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_multiplex fails \n", retval ); #endif retval = PAPI_add_event( EventSet, PAPI_FP_INS ); if ( retval < PAPI_OK ) { retval = PAPI_add_event( EventSet, PAPI_TOT_INS ); if ( retval < PAPI_OK ) { if (!quiet) printf("Trouble adding events\n"); test_skip( __FILE__, __LINE__, "PAPI add PAPI_FP_INS or PAPI_TOT_INS fail\n", retval ); } else if ( !quiet ) { printf( "PAPI_TOT_INS\n" ); } } else if ( !quiet ) { printf( "PAPI_FP_INS\n" ); } retval = PAPI_add_event( EventSet, PAPI_TOT_CYC ); if ( retval < PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI add PAPI_TOT_CYC fail\n", retval ); if ( !quiet ) { printf( "PAPI_TOT_CYC\n" ); } retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI start fail\n", retval ); funcX( a, b, MAX ); retval = PAPI_read( EventSet, PAPI_values1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI read fail \n", retval ); funcX( a, b, MAX ); retval = PAPI_read( EventSet, PAPI_values2 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI read fail \n", retval ); #ifdef RESET retval = PAPI_reset( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI read fail \n", retval ); #endif funcA( a, b, MAX ); retval = PAPI_stop( EventSet, PAPI_values3 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI read fail \n", retval ); if ( !quiet ) { printf( "values1 is:\n" ); for ( i = 0; i < PAPI_MAX_EVENTS; i++ ) printf( LLDFMT15, PAPI_values1[i] ); printf( "\nvalues2 is:\n" ); for ( i = 0; i < PAPI_MAX_EVENTS; i++ ) printf( LLDFMT15, PAPI_values2[i] ); printf( "\nvalues3 is:\n" ); for 
( i = 0; i < PAPI_MAX_EVENTS; i++ ) printf( LLDFMT15, PAPI_values3[i] ); #ifndef RESET printf( "\nPAPI value (2-1) is : \n" ); for ( i = 0; i < PAPI_MAX_EVENTS; i++ ) printf( LLDFMT15, PAPI_values2[i] - PAPI_values1[i] ); printf( "\nPAPI value (3-2) is : \n" ); for ( i = 0; i < PAPI_MAX_EVENTS; i++ ) { long long diff; diff = PAPI_values3[i] - PAPI_values2[i]; printf( LLDFMT15, diff); if (diff<0) { test_fail( __FILE__, __LINE__, "Multiplexed counter decreased", 1 ); } } #endif printf( "\n\nVerification:\n" ); printf( "From start to first PAPI_read %d fp operations are made.\n", 2 * MAX * TIMES ); printf( "Between 1st and 2nd PAPI_read %d fp operations are made.\n", 2 * MAX * TIMES ); printf( "Between 2nd and 3rd PAPI_read %d fp operations are made.\n", 0 ); printf( "\n" ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/mpi_hl.c000066400000000000000000000017061502707512200173540ustar00rootroot00000000000000#include #include #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" int main( int argc, char **argv ) { int retval; int quiet = 0; char* region_name; int world_size, world_rank; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); MPI_Init( &argc, &argv ); MPI_Comm_size(MPI_COMM_WORLD, &world_size); MPI_Comm_rank(MPI_COMM_WORLD, &world_rank); region_name = "do_flops"; if ( !quiet ) { printf("\nRank %d: instrument flops\n", world_rank); } retval = PAPI_hl_region_begin(region_name); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_hl_region_begin", retval ); } do_flops( NUM_FLOPS ); retval = PAPI_hl_region_end(region_name); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_hl_region_end", retval ); } MPI_Finalize(); test_hl_pass( __FILE__ ); return 0; }papi-papi-7-2-0-t/src/ctests/mpi_omp_hl.c000066400000000000000000000022211502707512200202200ustar00rootroot00000000000000#include #include #include #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" 
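/* Illustrative sketch only, not part of PAPI or of this test: the
 * high-level API used in mpi_hl.c above wraps a workload in a
 * PAPI_hl_region_begin()/PAPI_hl_region_end() pair.  A minimal stand-in
 * for that pattern, runnable without libpapi, times the region with the
 * C clock; timed_region() is a hypothetical helper name, not a PAPI API. */
#include <time.h>
static double timed_region( void ( *work )( void ) )
{
	clock_t t0 = clock( );       /* analogous to "region begin" */
	work( );                     /* the instrumented workload   */
	clock_t t1 = clock( );       /* analogous to "region end"   */
	return ( double ) ( t1 - t0 ) / ( double ) CLOCKS_PER_SEC;
}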
int main( int argc, char **argv ) { int retval, i; int quiet = 0; char* region_name; int world_size, world_rank; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); MPI_Init( &argc, &argv ); MPI_Comm_size(MPI_COMM_WORLD, &world_size); MPI_Comm_rank(MPI_COMM_WORLD, &world_rank); region_name = "do_flops"; #pragma omp parallel #pragma omp for for ( i = 1; i <= 2; ++i ) { int tid; tid = omp_get_thread_num(); if ( !quiet ) { printf("\nRank %d, Thread %d: instrument flops\n", world_rank, tid); } retval = PAPI_hl_region_begin(region_name); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_hl_region_begin", retval ); } do_flops( NUM_FLOPS ); retval = PAPI_hl_region_end(region_name); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_hl_region_end", retval ); } } MPI_Finalize(); test_hl_pass( __FILE__ ); return 0; }papi-papi-7-2-0-t/src/ctests/mpifirst.c000066400000000000000000000124201502707512200177340ustar00rootroot00000000000000/* This file performs the following test: start, read, stop and again functionality - It attempts to use the following three counters. It may use less depending on hardware counter resource limitations. These are counted in the default counting domain and default granularity, depending on the platform. Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). 
+ PAPI_FP_INS or PAPI_TOT_INS if PAPI_FP_INS doesn't exist + PAPI_TOT_CYC - Start counters - Do flops - Read counters - Reset counters - Do flops - Read counters - Do flops - Read counters - Do flops - Stop and read counters - Read counters */ #include #include #include #include "papi.h" #include "papi_test.h" int main( int argc, char **argv ) { int retval, num_tests = 5, num_events, tmp; long long **values; int EventSet = PAPI_NULL; int PAPI_event, mask; char event_name[PAPI_MAX_STR_LEN], add_event_str[PAPI_MAX_STR_LEN]; int quiet; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); MPI_Init( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); /* query and set up the right instruction to monitor */ if ( PAPI_query_event( PAPI_FP_INS ) == PAPI_OK ) { PAPI_event = PAPI_FP_INS; mask = MASK_FP_INS | MASK_TOT_CYC; } else { PAPI_event = PAPI_TOT_INS; mask = MASK_TOT_INS | MASK_TOT_CYC; } retval = PAPI_event_code_to_name( PAPI_event, event_name ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); sprintf( add_event_str, "PAPI_add_event[%s]", event_name ); EventSet = add_test_events( &num_events, &mask ); values = allocate_test_space( num_tests, num_events ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); retval = PAPI_read( EventSet, values[0] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_read", retval ); retval = PAPI_reset( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_reset", retval ); do_flops( NUM_FLOPS ); retval = PAPI_read( EventSet, values[1] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_read", retval ); do_flops( NUM_FLOPS ); retval = PAPI_read( EventSet, values[2] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_read", retval ); 
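	/* Illustrative aside, not part of the original test: values[1] was
	 * read after one do_flops() run following the reset, and values[2]
	 * after a second run, so values[2] should be roughly twice values[1].
	 * The formal tolerance check happens below; an inline peek at that
	 * ratio could look like this (statement fragment, uses the enclosing
	 * test's values/quiet variables): */
	{
		if ( !quiet && values[1][0] > 0 ) {
			printf( "event ratio after 2 runs vs 1 run: %.2f (expect ~2)\n",
				( double ) values[2][0] / ( double ) values[1][0] );
		}
	}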
do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet, values[3] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_read( EventSet, values[4] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_read", retval ); remove_test_events( &EventSet, mask ); if ( !quiet ) { printf( "Test case 1: Non-overlapping start, stop, read.\n" ); printf( "-----------------------------------------------\n" ); tmp = PAPI_get_opt( PAPI_DEFDOM, NULL ); printf( "Default domain is: %d (%s)\n", tmp, stringify_all_domains( tmp ) ); tmp = PAPI_get_opt( PAPI_DEFGRN, NULL ); printf( "Default granularity is: %d (%s)\n", tmp, stringify_granularity( tmp ) ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf ( "-------------------------------------------------------------------------\n" ); printf( "Test type : \t1\t\t2\t\t3\t\t4\t\t5\n" ); sprintf( add_event_str, "%s : ", event_name ); printf( TAB5, add_event_str, ( values[0] )[0], ( values[1] )[0], ( values[2] )[0], ( values[3] )[0], ( values[4] )[0] ); printf( TAB5, "PAPI_TOT_CYC: ", ( values[0] )[1], ( values[1] )[1], ( values[2] )[1], ( values[3] )[1], ( values[4] )[1] ); printf ( "-------------------------------------------------------------------------\n" ); printf( "Verification:\n" ); printf( "Column 1 approximately equals column 2\n" ); printf( "Column 3 approximately equals 2 * column 2\n" ); printf( "Column 4 approximately equals 3 * column 2\n" ); printf( "Column 4 exactly equals column 5\n" ); } { long long min, max; min = ( long long ) ( values[1][0] * .9 ); max = ( long long ) ( values[1][0] * 1.1 ); if ( values[0][0] > max || values[0][0] < min || values[2][0] > ( 2 * max ) || values[2][0] < ( 2 * min ) || values[3][0] > ( 3 * max ) || values[3][0] < ( 3 * min ) || values[3][0] != values[4][0] ) { printf( "min: " ); printf( LLDFMT, min ); printf( "max: " ); printf( LLDFMT, max ); printf( "1st: " ); printf( LLDFMT, values[0][0] ); printf( "2nd: " ); printf( LLDFMT, 
values[1][0] ); printf( "3rd: " ); printf( LLDFMT, values[2][0] ); printf( "4th: " ); printf( LLDFMT, values[3][0] ); printf( "5th: " ); printf( LLDFMT, values[4][0] ); printf( "\n" ); test_fail( __FILE__, __LINE__, event_name, 1 ); } min = ( long long ) ( values[1][1] * .9 ); max = ( long long ) ( values[1][1] * 1.1 ); if ( values[0][1] > max || values[0][1] < min || values[2][1] > ( 2 * max ) || values[2][1] < ( 2 * min ) || values[3][1] > ( 3 * max ) || values[3][1] < ( 3 * min ) || values[3][1] != values[4][1] ) { test_fail( __FILE__, __LINE__, "PAPI_TOT_CYC", 1 ); } } test_pass( __FILE__, values, num_tests ); MPI_Finalize( ); exit( 1 ); } papi-papi-7-2-0-t/src/ctests/multiattach.c000066400000000000000000000256211502707512200204250ustar00rootroot00000000000000/* This file performs the following test: start, stop and timer functionality for multiple attached processes. - It attempts to use the following two counters. It may use less depending on hardware counter resource limitations. These are counted in the default counting domain and default granularity, depending on the platform. Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). + PAPI_FP_INS + PAPI_TOT_CYC - Get us. - Start counters - Do flops - Stop and read counters - Get us. 
*/ #include #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #ifdef _AIX #define _LINUX_SOURCE_COMPAT #endif #if defined(__FreeBSD__) # define PTRACE_ATTACH PT_ATTACH # define PTRACE_CONT PT_CONTINUE #endif #define MULTIPLIER 5 static int wait_for_attach_and_loop( int num ) { kill( getpid( ), SIGSTOP ); do_flops( NUM_FLOPS * num ); kill( getpid( ), SIGSTOP ); return 0; } int main( int argc, char **argv ) { int status, retval, num_tests = 2, tmp; int EventSet1 = PAPI_NULL, EventSet2 = PAPI_NULL; int PAPI_event, PAPI_event2, mask1, mask2; int num_events1, num_events2; long long **values; long long elapsed_us, elapsed_cyc, elapsed_virt_us, elapsed_virt_cyc; char event_name[PAPI_MAX_STR_LEN], add_event_str[PAPI_2MAX_STR_LEN]; const PAPI_component_info_t *cmpinfo; pid_t pid, pid2; double ratio1,ratio2; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); /* Initialize the library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* get the component info and check if we support attach */ if ( ( cmpinfo = PAPI_get_component_info( 0 ) ) == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_component_info", 0 ); } if ( cmpinfo->attach == 0 ) { test_skip( __FILE__, __LINE__, "Platform does not support attaching", 0 ); } /* fork off first child */ pid = fork( ); if ( pid < 0 ) { test_fail( __FILE__, __LINE__, "fork()", PAPI_ESYS ); } if ( pid == 0 ) { exit( wait_for_attach_and_loop( 1 ) ); } /* fork off second child, does twice as much */ pid2 = fork( ); if ( pid2 < 0 ) { test_fail( __FILE__, __LINE__, "fork()", PAPI_ESYS ); } if ( pid2 == 0 ) { exit( wait_for_attach_and_loop( MULTIPLIER ) ); } /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depending on the availability of the event on the platform */ EventSet1 = add_two_events( &num_events1, &PAPI_event, &mask1 ); 
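	/* Illustrative aside, not part of the original test: the first child
	 * runs do_flops( NUM_FLOPS ) and the second runs
	 * do_flops( NUM_FLOPS * MULTIPLIER ), so the counter ratios verified
	 * at the end of this test are expected to land within roughly +/-10%
	 * (FLOPS) or +/-20% (cycles) of MULTIPLIER.  A statement fragment
	 * announcing that expectation, using the file's MULTIPLIER macro and
	 * TESTS_QUIET flag, might read: */
	{
		double expect_lo = ( double ) MULTIPLIER * 0.90;
		double expect_hi = ( double ) MULTIPLIER * 1.10;
		if ( !TESTS_QUIET )
			printf( "expecting child FLOPS ratio in [%.2f, %.2f]\n",
				expect_lo, expect_hi );
	}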
EventSet2 = add_two_events( &num_events2, &PAPI_event2, &mask2 ); if ( cmpinfo->attach_must_ptrace ) { if ( ptrace( PTRACE_ATTACH, pid, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_ATTACH)" ); return 1 ; } if ( waitpid( pid, &status, 0 ) == -1 ) { perror( "waitpid()" ); exit( 1 ); } if ( WIFSTOPPED( status ) == 0 ) { test_fail( __FILE__, __LINE__, "Child process didnt return true to WIFSTOPPED", 0 ); } if ( ptrace( PTRACE_ATTACH, pid2, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_ATTACH)" ); return 1; } if ( waitpid( pid2, &status, 0 ) == -1 ) { perror( "waitpid()" ); exit( 1 ); } if ( WIFSTOPPED( status ) == 0 ) { test_fail( __FILE__, __LINE__, "Child process didnt return true to WIFSTOPPED", 0 ); } } retval = PAPI_attach( EventSet1, ( unsigned long ) pid ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_attach", retval ); } retval = PAPI_attach( EventSet2, ( unsigned long ) pid2 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_attach", retval ); } strcpy(event_name, "PAPI_TOT_INS"); sprintf( add_event_str, "PAPI_add_event[%s]", event_name ); /* num_events1 is greater than num_events2 so don't worry. */ values = allocate_test_space( num_tests, num_events1 ); /* Gather before values */ elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); elapsed_virt_us = PAPI_get_virt_usec( ); elapsed_virt_cyc = PAPI_get_virt_cyc( ); /* Wait for the SIGSTOP. 
*/ if ( cmpinfo->attach_must_ptrace ) { if ( ptrace( PTRACE_CONT, pid, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_CONT)" ); return 1; } if ( waitpid( pid, &status, 0 ) == -1 ) { perror( "waitpid()" ); exit( 1 ); } if ( WIFSTOPPED( status ) == 0 ) { test_fail( __FILE__, __LINE__, "Child process didn't return true to WIFSTOPPED", 0 ); } if ( WSTOPSIG( status ) != SIGSTOP ) { test_fail( __FILE__, __LINE__, "Child process didn't stop on SIGSTOP", 0 ); } if ( ptrace( PTRACE_CONT, pid2, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_CONT)" ); return 1; } if ( waitpid( pid2, &status, 0 ) == -1 ) { perror( "waitpid()" ); exit( 1 ); } if ( WIFSTOPPED( status ) == 0 ) { test_fail( __FILE__, __LINE__, "Child process didn't return true to WIFSTOPPED", 0 ); } if ( WSTOPSIG( status ) != SIGSTOP ) { test_fail( __FILE__, __LINE__, "Child process didn't stop on SIGSTOP", 0 ); } } /* start measuring in first child */ retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } /* start measuring in second child */ retval = PAPI_start( EventSet2 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } /* Start first child and Wait for the SIGSTOP. */ if ( cmpinfo->attach_must_ptrace ) { if ( ptrace( PTRACE_CONT, pid, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_ATTACH)" ); return 1; } if ( waitpid( pid, &status, 0 ) == -1 ) { perror( "waitpid()" ); exit( 1 ); } if ( WIFSTOPPED( status ) == 0 ) { test_fail( __FILE__, __LINE__, "Child process didn't return true to WIFSTOPPED", 0 ); } if ( WSTOPSIG( status ) != SIGSTOP ) { test_fail( __FILE__, __LINE__, "Child process didn't stop on SIGSTOP", 0 ); } /* Start second child and Wait for the SIGSTOP. 
*/ if ( ptrace( PTRACE_CONT, pid2, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_ATTACH)" ); return 1; } if ( waitpid( pid2, &status, 0 ) == -1 ) { perror( "waitpid()" ); exit( 1 ); } if ( WIFSTOPPED( status ) == 0 ) { test_fail( __FILE__, __LINE__, "Child process didn't return true to WIFSTOPPED", 0 ); } if ( WSTOPSIG( status ) != SIGSTOP ) { test_fail( __FILE__, __LINE__, "Child process didn't stop on SIGSTOP", 0 ); } } elapsed_virt_us = PAPI_get_virt_usec( ) - elapsed_virt_us; elapsed_virt_cyc = PAPI_get_virt_cyc( ) - elapsed_virt_cyc; elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; /* stop measuring and read first child */ retval = PAPI_stop( EventSet1, values[0] ); if ( retval != PAPI_OK ) { printf( "Warning: PAPI_stop returned error %d, probably ok.\n", retval ); } /* stop measuring and read second child */ retval = PAPI_stop( EventSet2, values[1] ); if ( retval != PAPI_OK ) { printf( "Warning: PAPI_stop returned error %d, probably ok.\n", retval ); } /* close down the measurements */ remove_test_events( &EventSet1, mask1 ); remove_test_events( &EventSet2, mask2 ); /* restart events so they can end */ if ( cmpinfo->attach_must_ptrace ) { if ( ptrace( PTRACE_CONT, pid, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_CONT)" ); return 1; } if ( ptrace( PTRACE_CONT, pid2, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_CONT)" ); return 1; } } if ( waitpid( pid, &status, 0 ) == -1 ) { perror( "waitpid()" ); exit( 1 ); } if ( WIFEXITED( status ) == 0 ) { test_fail( __FILE__, __LINE__, "Child process didn't return true to WIFEXITED", 0 ); } if ( waitpid( pid2, &status, 0 ) == -1 ) { perror( "waitpid()" ); exit( 1 ); } if ( WIFEXITED( status ) == 0 ) { test_fail( __FILE__, __LINE__, "Child process didn't return true to WIFEXITED", 0 ); } /* This code isn't necessary as we know the child has exited, */ /* it *may* return an error if the component so chooses. You */ /* should use read() instead. 
*/ if (!TESTS_QUIET) { printf( "Test case: multiple 3rd party attach start, stop.\n" ); printf( "-----------------------------------------------\n" ); tmp = PAPI_get_opt( PAPI_DEFDOM, NULL ); printf( "Default domain is: %d (%s)\n", tmp, stringify_all_domains( tmp ) ); tmp = PAPI_get_opt( PAPI_DEFGRN, NULL ); printf( "Default granularity is: %d (%s)\n", tmp, stringify_granularity( tmp ) ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf( "-------------------------------------------------------------------------\n" ); sprintf( add_event_str, "(PID %jd) %-12s : \t", ( intmax_t ) pid, event_name ); printf( TAB1, add_event_str, values[0][1] ); sprintf( add_event_str, "(PID %jd) PAPI_TOT_CYC : \t", ( intmax_t ) pid ); printf( TAB1, add_event_str, values[0][0] ); sprintf( add_event_str, "(PID %jd) %-12s : \t", ( intmax_t ) pid2, event_name ); printf( TAB1, add_event_str,values[1][1] ); sprintf( add_event_str, "(PID %jd) PAPI_TOT_CYC : \t", ( intmax_t ) pid2 ); printf( TAB1, add_event_str, values[1][0] ); printf( TAB1, "Real usec : \t", elapsed_us ); printf( TAB1, "Real cycles : \t", elapsed_cyc ); printf( TAB1, "Virt usec : \t", elapsed_virt_us ); printf( TAB1, "Virt cycles : \t", elapsed_virt_cyc ); printf( "-------------------------------------------------------------------------\n" ); printf("Verification: pid %d results should be %dx pid %d\n", pid2,MULTIPLIER,pid ); } /* FLOPS ratio */ ratio1=(double)values[1][0]/(double)values[0][0]; /* CYCLES ratio */ ratio2=(double)values[1][1]/(double)values[0][1]; if (!TESTS_QUIET) { printf("\tFLOPS ratio %lld/%lld = %lf\n", values[1][0],values[0][0],ratio1); } double ratio1_high,ratio1_low,ratio2_high,ratio2_low; ratio1_high=(double)MULTIPLIER *1.10; ratio1_low=(double)MULTIPLIER * 0.90; if ((ratio1 > ratio1_high ) || (ratio1 < ratio1_low)) { printf("Ratio out of range, should be ~%lf not %lf\n", (double)MULTIPLIER, ratio1); test_fail( __FILE__, __LINE__, "Error: Counter ratio not two", 0 ); } if (!TESTS_QUIET) 
{ printf("\tCycles ratio %lld/%lld = %lf\n", values[1][1],values[0][1],ratio2); } ratio2_high=(double)MULTIPLIER *1.20; ratio2_low=(double)MULTIPLIER * 0.80; if ((ratio2 > ratio2_high ) || (ratio2 < ratio2_low )) { printf("Ratio out of range, should be ~%lf, not %lf\n", (double)MULTIPLIER, ratio2); test_fail( __FILE__, __LINE__, "Known issue: Counter ratio not two", 0 ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/multiattach2.c000066400000000000000000000166131502707512200205100ustar00rootroot00000000000000/* This file performs the following test: start, stop and timer functionality for an attached process as well as itself. - It attempts to use the following two counters. It may use less depending on hardware counter resource limitations. These are counted in the default counting domain and default granularity, depending on the platform. Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). + PAPI_FP_INS + PAPI_TOT_CYC - Get us. - Start counters - Do flops - Stop and read counters - Get us. 
*/ #include #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #ifdef _AIX #define _LINUX_SOURCE_COMPAT #endif #if defined(__FreeBSD__) # define PTRACE_ATTACH PT_ATTACH # define PTRACE_CONT PT_CONTINUE #endif int wait_for_attach_and_loop( int num ) { kill( getpid( ), SIGSTOP ); do_flops( NUM_FLOPS * num ); kill( getpid( ), SIGSTOP ); return 0; } int main( int argc, char **argv ) { int status, retval, num_tests = 2, tmp; int EventSet1 = PAPI_NULL, EventSet2 = PAPI_NULL; int PAPI_event, PAPI_event2, mask1, mask2; int num_events1, num_events2; long long **values; long long elapsed_us, elapsed_cyc, elapsed_virt_us, elapsed_virt_cyc; char event_name[PAPI_MAX_STR_LEN], add_event_str[PAPI_2MAX_STR_LEN]; const PAPI_component_info_t *cmpinfo; pid_t pid; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); /* init the library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* get component info */ if ( ( cmpinfo = PAPI_get_component_info( 0 ) ) == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_component_info", 0 ); } /* see if we support attach */ if ( cmpinfo->attach == 0 ) { test_skip( __FILE__, __LINE__, "Platform does not support attaching",0 ); } /* fork! 
*/ pid = fork( ); if ( pid < 0 ) { test_fail( __FILE__, __LINE__, "fork()", PAPI_ESYS ); } /* if child, wait_for_attach_and_loop */ if ( pid == 0 ) { exit( wait_for_attach_and_loop( 2 ) ); } /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depending on the availability of the event on the platform */ EventSet1 = add_two_events( &num_events1, &PAPI_event, &mask1 ); EventSet2 = add_two_events( &num_events2, &PAPI_event2, &mask2 ); if ( cmpinfo->attach_must_ptrace ) { if ( ptrace( PTRACE_ATTACH, pid, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_ATTACH)" ); return 1; } if ( waitpid( pid, &status, 0 ) == -1 ) { perror( "waitpid()" ); exit( 1 ); } if ( WIFSTOPPED( status ) == 0 ) { test_fail( __FILE__, __LINE__, "Child process didnt return true to WIFSTOPPED", 0 ); } } retval = PAPI_attach( EventSet2, ( unsigned long ) pid ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_attach", retval ); } strcpy(event_name,"PAPI_TOT_INS"); sprintf( add_event_str, "PAPI_add_event[%s]", event_name ); /* num_events1 is greater than num_events2 so don't worry. */ values = allocate_test_space( num_tests, num_events1 ); /* get before values */ elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); elapsed_virt_us = PAPI_get_virt_usec( ); elapsed_virt_cyc = PAPI_get_virt_cyc( ); /* Wait for the SIGSTOP. 
*/ if ( cmpinfo->attach_must_ptrace ) { if ( ptrace( PTRACE_CONT, pid, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_CONT)" ); return 1; } if ( waitpid( pid, &status, 0 ) == -1 ) { perror( "waitpid()" ); exit( 1 ); } if ( WIFSTOPPED( status ) == 0 ) { test_fail( __FILE__, __LINE__, "Child process didn't return true to WIFSTOPPED", 0 ); } if ( WSTOPSIG( status ) != SIGSTOP ) { test_fail( __FILE__, __LINE__, "Child process didn't stop on SIGSTOP", 0 ); } } retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } retval = PAPI_start( EventSet2 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } /* Wait for the SIGSTOP. */ if ( cmpinfo->attach_must_ptrace ) { if ( ptrace( PTRACE_CONT, pid, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_ATTACH)" ); return 1; } if ( waitpid( pid, &status, 0 ) == -1 ) { perror( "waitpid()" ); exit( 1 ); } if ( WIFSTOPPED( status ) == 0 ) { test_fail( __FILE__, __LINE__, "Child process didn't return true to WIFSTOPPED", 0 ); } if ( WSTOPSIG( status ) != SIGSTOP ) { test_fail( __FILE__, __LINE__, "Child process didn't stop on SIGSTOP", 0 ); } } elapsed_virt_us = PAPI_get_virt_usec( ) - elapsed_virt_us; elapsed_virt_cyc = PAPI_get_virt_cyc( ) - elapsed_virt_cyc; elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; retval = PAPI_stop( EventSet1, values[0] ); if ( retval != PAPI_OK ) { printf( "Warning: PAPI_stop returned error %d, probably ok.\n", retval ); } retval = PAPI_stop( EventSet2, values[1] ); if ( retval != PAPI_OK ) { printf( "Warning: PAPI_stop returned error %d, probably ok.\n", retval ); } remove_test_events( &EventSet1, mask1 ); remove_test_events( &EventSet2, mask2 ); if ( cmpinfo->attach_must_ptrace ) { if ( ptrace( PTRACE_CONT, pid, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_CONT)" ); return 1; } } if ( waitpid( pid, &status, 0 ) == -1 ) { perror( "waitpid()" ); exit( 1 ); } if ( 
WIFEXITED( status ) == 0 ) { test_fail( __FILE__, __LINE__, "Child process didn't return true to WIFEXITED", 0 ); } /* This code isn't necessary as we know the child has exited, it *may* return an error if the component so chooses. You should use read() instead. */ if (!TESTS_QUIET) { printf( "Test case: multiple 3rd party attach start, stop.\n" ); printf( "-----------------------------------------------\n" ); tmp = PAPI_get_opt( PAPI_DEFDOM, NULL ); printf( "Default domain is: %d (%s)\n", tmp, stringify_all_domains( tmp ) ); tmp = PAPI_get_opt( PAPI_DEFGRN, NULL ); printf( "Default granularity is: %d (%s)\n", tmp, stringify_granularity( tmp ) ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf( "-------------------------------------------------------------------------\n" ); sprintf( add_event_str, "(PID self) %-12s : \t", event_name ); printf( TAB1, add_event_str, values[0][1] ); sprintf( add_event_str, "(PID self) PAPI_TOT_CYC : \t" ); printf( TAB1, add_event_str, values[0][0] ); sprintf( add_event_str, "(PID %jd) %-12s : \t", ( intmax_t ) pid, event_name ); printf( TAB1, add_event_str, values[1][1] ); sprintf( add_event_str, "(PID %jd) PAPI_TOT_CYC : \t", ( intmax_t ) pid ); printf( TAB1, add_event_str, values[1][0] ); printf( TAB1, "Real usec : \t", elapsed_us ); printf( TAB1, "Real cycles : \t", elapsed_cyc ); printf( TAB1, "Virt usec : \t", elapsed_virt_us ); printf( TAB1, "Virt cycles : \t", elapsed_virt_cyc ); printf( "-------------------------------------------------------------------------\n" ); printf( "Verification: none\n" ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/multiplex1.c000066400000000000000000000272671502707512200202220ustar00rootroot00000000000000/* * File: multiplex.c * Author: Philip Mucci * mucci@cs.utk.edu * Mods: * */ /* This file tests the multiplex functionality, originally developed by John May of LLNL. 
*/ #include <stdio.h> #include <stdlib.h> #include <string.h> #include "papi.h" #include "papi_test.h" #include "do_loops.h" /* Event to use in all cases; initialized in init_papi() */ #define TOTAL_EVENTS 6 int solaris_preset_PAPI_events[TOTAL_EVENTS] = { PAPI_TOT_CYC, PAPI_BR_MSP, PAPI_L2_TCM, PAPI_L1_ICM, 0 }; int power6_preset_PAPI_events[TOTAL_EVENTS] = { PAPI_TOT_CYC, PAPI_FP_INS, PAPI_L1_DCM, PAPI_L1_ICM, 0 }; int preset_PAPI_events[TOTAL_EVENTS] = { PAPI_TOT_CYC, PAPI_FP_INS, PAPI_TOT_INS, PAPI_L1_DCM, PAPI_L1_ICM, 0 }; static int PAPI_events[TOTAL_EVENTS] = { 0, }; static int PAPI_events_len = 0; static void init_papi( int *out_events, int *len ) { int retval; int i, real_len = 0; int *in_events = preset_PAPI_events; const PAPI_hw_info_t *hw_info; /* Initialize the library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail(__FILE__,__LINE__, "PAPI_library_init", retval ); } hw_info = PAPI_get_hardware_info( ); if ( hw_info == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 ); } if ( strstr( hw_info->model_string, "UltraSPARC" ) ) { in_events = solaris_preset_PAPI_events; } if ( strcmp( hw_info->model_string, "POWER6" ) == 0 ) { in_events = power6_preset_PAPI_events; retval = PAPI_set_domain( PAPI_DOM_ALL ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_set_domain", retval ); } retval = PAPI_multiplex_init( ); if ( retval == PAPI_ENOSUPP) { test_skip(__FILE__, __LINE__, "Multiplex not supported", 1); } else if ( retval != PAPI_OK ) { test_fail(__FILE__,__LINE__, "PAPI_multiplex_init", retval ); } for ( i = 0; in_events[i] != 0; i++ ) { char out[PAPI_MAX_STR_LEN]; /* query and set up the right instruction to monitor */ retval = PAPI_query_event( in_events[i] ); if ( retval == PAPI_OK ) { out_events[real_len++] = in_events[i]; PAPI_event_code_to_name( in_events[i], out ); if ( real_len == *len ) break; } else { PAPI_event_code_to_name( in_events[i], out ); if ( !TESTS_QUIET ) printf( "%s does not exist\n", out 
); } } if ( real_len < 1 ) { if (!TESTS_QUIET) printf("Trouble adding events\n"); test_skip(__FILE__,__LINE__, "No counters available", 0 ); } *len = real_len; } /* Tests that PAPI_multiplex_init does not mess with normal operation. */ int case1( void ) { int retval, i, EventSet = PAPI_NULL; long long values[2]; PAPI_events_len = 2; init_papi( PAPI_events, &PAPI_events_len ); retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_create_eventset", retval ); for ( i = 0; i < PAPI_events_len; i++ ) { char out[PAPI_MAX_STR_LEN]; retval = PAPI_add_event( EventSet, PAPI_events[i] ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_add_event", retval ); PAPI_event_code_to_name( PAPI_events[i], out ); if ( !TESTS_QUIET ) printf( "Added %s\n", out ); } do_stuff( ); if ( PAPI_start( EventSet ) != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_start", retval ); do_stuff( ); retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_stop", retval ); if ( !TESTS_QUIET ) { test_print_event_header( "case1:", EventSet ); printf( TAB2, "case1:", values[0], values[1] ); } retval = PAPI_cleanup_eventset( EventSet ); /* JT */ if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_cleanup_eventset", retval ); PAPI_shutdown( ); return ( SUCCESS ); } /* Tests that PAPI_set_multiplex() works before adding events */ int case2( void ) { int retval, i, EventSet = PAPI_NULL; long long values[2]; PAPI_events_len = 2; init_papi( PAPI_events, &PAPI_events_len ); retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_create_eventset", retval ); /* In Component PAPI, EventSets must be assigned a component index before you can fiddle with their internals. 
0 is always the cpu component */ retval = PAPI_assign_eventset_component( EventSet, 0 ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_assign_eventset_component", retval ); retval = PAPI_set_multiplex( EventSet ); if ( retval == PAPI_ENOSUPP) { test_skip(__FILE__, __LINE__, "Multiplex not supported", 1); } else if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_set_multiplex", retval ); for ( i = 0; i < PAPI_events_len; i++ ) { char out[PAPI_MAX_STR_LEN]; retval = PAPI_add_event( EventSet, PAPI_events[i] ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_add_event", retval ); PAPI_event_code_to_name( PAPI_events[i], out ); if ( !TESTS_QUIET ) printf( "Added %s\n", out ); } do_stuff( ); if ( PAPI_start( EventSet ) != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_start", retval ); do_stuff( ); retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_stop", retval ); if ( !TESTS_QUIET ) { test_print_event_header( "case2:", EventSet ); printf( TAB2, "case2:", values[0], values[1] ); } retval = PAPI_cleanup_eventset( EventSet ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_cleanup_eventset", retval ); PAPI_shutdown( ); return ( SUCCESS ); } /* Tests that PAPI_set_multiplex() works after adding events */ int case3( void ) { int retval, i, EventSet = PAPI_NULL; long long values[2]; PAPI_events_len = 2; init_papi( PAPI_events, &PAPI_events_len ); retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_create_eventset", retval ); for ( i = 0; i < PAPI_events_len; i++ ) { char out[PAPI_MAX_STR_LEN]; retval = PAPI_add_event( EventSet, PAPI_events[i] ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_add_event", retval ); PAPI_event_code_to_name( PAPI_events[i], out ); if ( !TESTS_QUIET ) printf( "Added %s\n", out ); } retval = PAPI_set_multiplex( EventSet ); if ( retval == PAPI_ENOSUPP) { test_skip(__FILE__, __LINE__, 
"Multiplex not supported", 1); } else if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_set_multiplex", retval ); do_stuff( ); if ( PAPI_start( EventSet ) != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_start", retval ); do_stuff( ); retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_stop", retval ); if ( !TESTS_QUIET ) { test_print_event_header( "case3:", EventSet ); printf( TAB2, "case3:", values[0], values[1] ); } retval = PAPI_cleanup_eventset( EventSet ); /* JT */ if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_cleanup_eventset", retval ); PAPI_shutdown( ); return ( SUCCESS ); } /* Tests that PAPI_set_multiplex() works before adding events */ /* Tests that PAPI_add_event() works after PAPI_add_event()/PAPI_set_multiplex() */ int case4( void ) { int retval, i, EventSet = PAPI_NULL; long long values[4]; char out[PAPI_MAX_STR_LEN]; PAPI_events_len = 2; init_papi( PAPI_events, &PAPI_events_len ); retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_create_eventset", retval ); i = 0; retval = PAPI_add_event( EventSet, PAPI_events[i] ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_add_event", retval ); PAPI_event_code_to_name( PAPI_events[i], out ); if (!TESTS_QUIET) printf( "Added %s\n", out ); retval = PAPI_set_multiplex( EventSet ); if ( retval == PAPI_ENOSUPP) { test_skip(__FILE__, __LINE__, "Multiplex not supported", 1); } else if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_set_multiplex", retval ); i = 1; retval = PAPI_add_event( EventSet, PAPI_events[i] ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_add_event", retval ); PAPI_event_code_to_name( PAPI_events[i], out ); if (!TESTS_QUIET) printf( "Added %s\n", out ); do_stuff( ); if ( PAPI_start( EventSet ) != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_start", retval ); do_stuff( ); retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) 
test_fail(__FILE__,__LINE__, "PAPI_stop", retval ); if ( !TESTS_QUIET ) { test_print_event_header( "case4:", EventSet ); printf( TAB2, "case4:", values[0], values[1] ); } retval = PAPI_cleanup_eventset( EventSet ); /* JT */ if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_cleanup_eventset", retval ); PAPI_shutdown( ); return ( SUCCESS ); } /* Tests that PAPI_read() works immediately after PAPI_start() */ int case5( void ) { int retval, i, j, EventSet = PAPI_NULL; long long start_values[4] = { 0,0,0,0 }, values[4] = {0,0,0,0}; char out[PAPI_MAX_STR_LEN]; PAPI_events_len = 2; init_papi( PAPI_events, &PAPI_events_len ); retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_create_eventset", retval ); /* In Component PAPI, EventSets must be assigned a component index before you can fiddle with their internals. 0 is always the cpu component */ retval = PAPI_assign_eventset_component( EventSet, 0 ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_assign_eventset_component", retval ); retval = PAPI_set_multiplex( EventSet ); if ( retval == PAPI_ENOSUPP) { test_skip(__FILE__, __LINE__, "Multiplex not supported", 1); } else if ( retval != PAPI_OK ) { test_fail(__FILE__,__LINE__, "PAPI_set_multiplex", retval ); } /* Add 2 events... 
*/ i = 0; retval = PAPI_add_event( EventSet, PAPI_events[i] ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_add_event", retval ); PAPI_event_code_to_name( PAPI_events[i], out ); if (!TESTS_QUIET) printf( "Added %s\n", out ); i++; retval = PAPI_add_event( EventSet, PAPI_events[i] ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_add_event", retval ); PAPI_event_code_to_name( PAPI_events[i], out ); if (!TESTS_QUIET) printf( "Added %s\n", out ); i++; do_stuff( ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_start", retval ); retval = PAPI_read( EventSet, start_values ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_read", retval ); do_stuff( ); retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) test_fail(__FILE__,__LINE__, "PAPI_stop", retval ); for (j=0;j #include #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #define TOTAL_EVENTS 10 static int solaris_preset_PAPI_events[TOTAL_EVENTS] = { PAPI_BR_MSP, PAPI_TOT_CYC, PAPI_L2_TCM, PAPI_L1_ICM, 0 }; static int power6_preset_PAPI_events[TOTAL_EVENTS] = { PAPI_FP_INS, PAPI_TOT_CYC, PAPI_L1_DCM, PAPI_L1_ICM, 0 }; static int preset_PAPI_events[TOTAL_EVENTS] = { PAPI_FP_INS, PAPI_TOT_INS, PAPI_L1_DCM, PAPI_L1_ICM, 0 }; static int PAPI_events[TOTAL_EVENTS] = { 0, }; static int PAPI_events_len = 0; static void init_papi_pthreads( int *out_events, int *len ) { int retval; int i, real_len = 0; int *in_events = preset_PAPI_events; const PAPI_hw_info_t *hw_info; /* Initialize the library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__, "PAPI_library_init", retval ); } hw_info = PAPI_get_hardware_info( ); if ( hw_info == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 ); } if ( strstr( hw_info->model_string, "UltraSPARC" ) ) { in_events = solaris_preset_PAPI_events; } if ( strcmp( hw_info->model_string, 
"POWER6" ) == 0 ) { in_events = power6_preset_PAPI_events; retval = PAPI_set_domain( PAPI_DOM_ALL ); if ( retval != PAPI_OK ) { test_fail(__FILE__, __LINE__, "PAPI_set_domain", retval ); } } retval = PAPI_multiplex_init( ); if ( retval == PAPI_ENOSUPP) { test_skip(__FILE__, __LINE__, "Multiplex not supported", 1); } else if ( retval != PAPI_OK ) { test_fail(__FILE__, __LINE__, "PAPI_multiplex_init", retval ); } retval = PAPI_thread_init( ( unsigned long ( * )( void ) ) ( pthread_self ) ); if (retval != PAPI_OK ) { if ( retval == PAPI_ECMP ) test_skip( __FILE__, __LINE__, "PAPI_thread_init", retval ); else test_fail( __FILE__, __LINE__, "PAPI_thread_init", retval ); } for ( i = 0; in_events[i] != 0; i++ ) { char out[PAPI_MAX_STR_LEN]; /* query and set up the right instruction to monitor */ retval = PAPI_query_event( in_events[i] ); if ( retval == PAPI_OK ) { out_events[real_len++] = in_events[i]; PAPI_event_code_to_name( in_events[i], out ); if ( real_len == *len ) break; } else { PAPI_event_code_to_name( in_events[i], out ); if ( !TESTS_QUIET ) printf( "%s does not exist\n", out ); } } if ( real_len < 1 ) { if (!TESTS_QUIET) printf("No counters available\n"); test_skip(__FILE__, __LINE__, "No counters available", 0 ); } *len = real_len; } static int do_pthreads( void *( *fn ) ( void * ) ) { int i, rc, retval; pthread_attr_t attr; pthread_t id[NUM_THREADS]; pthread_attr_init( &attr ); #ifdef PTHREAD_CREATE_UNDETACHED pthread_attr_setdetachstate( &attr, PTHREAD_CREATE_UNDETACHED ); #endif #ifdef PTHREAD_SCOPE_SYSTEM retval = pthread_attr_setscope( &attr, PTHREAD_SCOPE_SYSTEM ); if ( retval != 0 ) test_skip( __FILE__, __LINE__, "pthread_attr_setscope", retval ); #endif for ( i = 0; i < NUM_THREADS; i++ ) { rc = pthread_create( &id[i], &attr, fn, NULL ); if ( rc ) return ( FAILURE ); } for ( i = 0; i < NUM_THREADS; i++ ) pthread_join( id[i], NULL ); pthread_attr_destroy( &attr ); return ( SUCCESS ); } /* Tests that PAPI_multiplex_init does not mess with normal 
operation. */ static void * case1_pthreads( void *arg ) { ( void ) arg; /*unused */ int retval, i, EventSet = PAPI_NULL; long long values[2]; if ( ( retval = PAPI_register_thread( ) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_register_thread", retval ); } if ( ( retval = PAPI_create_eventset( &EventSet ) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } for ( i = 0; i < PAPI_events_len; i++ ) { char out[PAPI_MAX_STR_LEN]; retval = PAPI_add_event( EventSet, PAPI_events[i] ); if ( retval != PAPI_OK ) test_fail(__FILE__, __LINE__, "PAPI_add_event", retval ); PAPI_event_code_to_name( PAPI_events[i], out ); if ( !TESTS_QUIET ) printf( "Added %s\n", out ); } do_stuff( ); if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_stuff( ); if ( ( retval = PAPI_stop( EventSet, values ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); if ( !TESTS_QUIET ) { printf( "case1 thread %4x:", ( unsigned ) pthread_self( ) ); test_print_event_header( "", EventSet ); printf( "case1 thread %4x:", ( unsigned ) pthread_self( ) ); printf( TAB2, "", values[0], values[1] ); } if ( ( retval = PAPI_cleanup_eventset( EventSet ) ) != PAPI_OK ) /* JT */ test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); if ( ( retval = PAPI_destroy_eventset( &EventSet) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); if ( ( retval = PAPI_unregister_thread( ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_unregister_thread", retval ); return ( ( void * ) SUCCESS ); } /* Tests that PAPI_set_multiplex() works before adding events */ static void * case2_pthreads( void *arg ) { ( void ) arg; /*unused */ int retval, i, EventSet = PAPI_NULL; long long values[2]; if ( ( retval = PAPI_register_thread( ) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_register_thread", retval ); } if ( ( retval = PAPI_create_eventset( &EventSet ) ) != PAPI_OK ) { 
test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } /* In Component PAPI, EventSets must be assigned a component index before you can fiddle with their internals. 0 is always the cpu component */ retval = PAPI_assign_eventset_component( EventSet, 0 ); if ( retval != PAPI_OK ) { test_fail(__FILE__, __LINE__, "PAPI_assign_eventset_component", retval ); } if ( ( retval = PAPI_set_multiplex( EventSet ) ) != PAPI_OK ) { if ( retval == PAPI_ENOSUPP) { test_skip(__FILE__, __LINE__, "Multiplex not supported", 1); } test_fail( __FILE__, __LINE__, "PAPI_set_multiplex", retval ); } if (!TESTS_QUIET) { printf( "++case2 thread %4x:", ( unsigned ) pthread_self( ) ); } for ( i = 0; i < PAPI_events_len; i++ ) { char out[PAPI_MAX_STR_LEN]; retval = PAPI_add_event( EventSet, PAPI_events[i] ); if ( retval != PAPI_OK ) test_fail(__FILE__, __LINE__, "PAPI_add_event", retval ); PAPI_event_code_to_name( PAPI_events[i], out ); if ( !TESTS_QUIET ) printf( "Added %s\n", out ); } do_stuff( ); if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_stuff( ); if ( ( retval = PAPI_stop( EventSet, values ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); if ( !TESTS_QUIET ) { printf( "case2 thread %4x:", ( unsigned ) pthread_self( ) ); test_print_event_header( "", EventSet ); printf( "case2 thread %4x:", ( unsigned ) pthread_self( ) ); printf( TAB2, "", values[0], values[1] ); } /* JT */ if ( ( retval = PAPI_cleanup_eventset( EventSet ) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); } if ( ( retval = PAPI_destroy_eventset( &EventSet) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); } if ( ( retval = PAPI_unregister_thread( ) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_unregister_thread", retval ); } return ( ( void * ) SUCCESS ); } /* Tests that PAPI_set_multiplex() works after adding events */ static void * case3_pthreads( 
void *arg ) { ( void ) arg; /*unused */ int retval, i, EventSet = PAPI_NULL; long long values[2]; if ( ( retval = PAPI_register_thread( ) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_register_thread", retval ); } if ( ( retval = PAPI_create_eventset( &EventSet ) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } for ( i = 0; i < PAPI_events_len; i++ ) { char out[PAPI_MAX_STR_LEN]; retval = PAPI_add_event( EventSet, PAPI_events[i] ); if ( retval != PAPI_OK ) test_fail(__FILE__, __LINE__, "PAPI_add_event", retval ); PAPI_event_code_to_name( PAPI_events[i], out ); if ( !TESTS_QUIET ) printf( "Added %s\n", out ); } if ( ( retval = PAPI_set_multiplex( EventSet ) ) != PAPI_OK ) { if ( retval == PAPI_ENOSUPP) { test_skip(__FILE__, __LINE__, "Multiplex not supported", 1); } test_fail( __FILE__, __LINE__, "PAPI_set_multiplex", retval ); } do_stuff( ); if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_stuff( ); if ( ( retval = PAPI_stop( EventSet, values ) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } if ( !TESTS_QUIET ) { printf( "case3 thread %4x:", ( unsigned ) pthread_self( ) ); test_print_event_header( "", EventSet ); printf( "case3 thread %4x:", ( unsigned ) pthread_self( ) ); printf( TAB2, "", values[0], values[1] ); } if ( ( retval = PAPI_cleanup_eventset( EventSet ) ) != PAPI_OK ) /* JT */ test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); if ( ( retval = PAPI_destroy_eventset( &EventSet) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); if ( ( retval = PAPI_unregister_thread( ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_unregister_thread", retval ); return ( ( void * ) SUCCESS ); } /* Tests that PAPI_set_multiplex() works before/after adding events */ static void * case4_pthreads( void *arg ) { ( void ) arg; /*unused */ int retval, i, EventSet = PAPI_NULL; long long values[4]; 
char out[PAPI_MAX_STR_LEN]; if ( ( retval = PAPI_register_thread( ) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_register_thread", retval ); } if ( ( retval = PAPI_create_eventset( &EventSet ) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } i = 0; retval = PAPI_add_event( EventSet, PAPI_events[i] ); if ( retval != PAPI_OK ) test_fail(__FILE__, __LINE__, "PAPI_add_event", retval ); PAPI_event_code_to_name( PAPI_events[i], out ); if (!TESTS_QUIET) printf( "Added %s\n", out ); if ( ( retval = PAPI_set_multiplex( EventSet ) ) != PAPI_OK ) { if ( retval == PAPI_ENOSUPP) { test_skip(__FILE__, __LINE__, "Multiplex not supported", 1); } test_fail( __FILE__, __LINE__, "PAPI_set_multiplex", retval ); } i = 1; retval = PAPI_add_event( EventSet, PAPI_events[i] ); if ( retval != PAPI_OK ) test_fail(__FILE__, __LINE__, "PAPI_add_event", retval ); PAPI_event_code_to_name( PAPI_events[i], out ); if (!TESTS_QUIET) printf( "Added %s\n", out ); do_stuff( ); if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_stuff( ); if ( ( retval = PAPI_stop( EventSet, values ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); if ( !TESTS_QUIET ) { printf( "case4 thread %4x:", ( unsigned ) pthread_self( ) ); test_print_event_header( "", EventSet ); printf( "case4 thread %4x:", ( unsigned ) pthread_self( ) ); printf( TAB2, "", values[0], values[1] ); } if ( ( retval = PAPI_cleanup_eventset( EventSet ) ) != PAPI_OK ) /* JT */ test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); if ( ( retval = PAPI_destroy_eventset( &EventSet) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); if ( ( retval = PAPI_unregister_thread( ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_unregister_thread", retval ); return ( ( void * ) SUCCESS ); } static int case1( void ) { int retval; PAPI_events_len = 2; init_papi_pthreads( PAPI_events, 
&PAPI_events_len ); retval = do_pthreads( case1_pthreads ); PAPI_shutdown( ); return retval; } static int case2( void ) { int retval; PAPI_events_len = 2; init_papi_pthreads( PAPI_events, &PAPI_events_len ); retval = do_pthreads( case2_pthreads ); PAPI_shutdown( ); return retval; } static int case3( void ) { int retval; PAPI_events_len = 2; init_papi_pthreads( PAPI_events, &PAPI_events_len ); retval = do_pthreads( case3_pthreads ); PAPI_shutdown( ); return retval; } static int case4( void ) { int retval; PAPI_events_len = 2; init_papi_pthreads( PAPI_events, &PAPI_events_len ); retval = do_pthreads( case4_pthreads ); PAPI_shutdown( ); return retval; } int main( int argc, char **argv ) { int retval; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); if (!quiet) { printf( "%s: Using %d threads\n\n", argv[0], NUM_THREADS ); } /* Case1 */ if (!quiet) { printf ( "case1: Does PAPI_multiplex_init() " "not break regular operation?\n" ); } if ( case1() != SUCCESS ) { test_fail( __FILE__, __LINE__, "case1", PAPI_ESYS ); } /* Case2 */ if (!quiet) { printf( "case2: Does setmpx/add work?\n" ); } if ( case2( ) != SUCCESS ) { test_fail( __FILE__, __LINE__, "case2", PAPI_ESYS ); } /* Case3 */ if (!quiet) { printf( "case3: Does add/setmpx work?\n" ); } if ( case3( ) != SUCCESS ) { test_fail( __FILE__, __LINE__, "case3", PAPI_ESYS ); } /* Case4 */ if (!quiet) { printf( "case4: Does add/setmpx/add work?\n" ); } if ( case4( ) != SUCCESS ) { test_fail( __FILE__, __LINE__, "case4", PAPI_ESYS ); } /* Finally init PAPI? 
*/ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail(__FILE__, __LINE__, "PAPI_library_init", retval ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/multiplex2.c000066400000000000000000000122451502707512200202110ustar00rootroot00000000000000/* * File: multiplex2.c * Author: Philip Mucci * mucci@cs.utk.edu */ /* This file tests the multiplex functionality, originally developed by John May of LLNL. */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include "papi.h" #include "papi_test.h" #include "do_loops.h" /* Tests that we can really multiplex a lot. */ static int case1( void ) { int retval, i, EventSet = PAPI_NULL, j = 0, k = 0, allvalid = 1; int max_mux, nev, *events; long long *values; PAPI_event_info_t pset; char evname[PAPI_MAX_STR_LEN]; /* Initialize PAPI */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } retval = PAPI_multiplex_init( ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI multiplex init fail\n", retval ); } #if 0 if ( PAPI_set_domain( PAPI_DOM_KERNEL ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_domain", retval ); #endif retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } #if 0 if ( PAPI_set_domain( PAPI_DOM_KERNEL ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_domain", retval ); #endif /* In Component PAPI, EventSets must be assigned a component index before you can fiddle with their internals. 
0 is always the cpu component */ retval = PAPI_assign_eventset_component( EventSet, 0 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_assign_eventset_component", retval ); } #if 0 if ( PAPI_set_domain( PAPI_DOM_KERNEL ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_domain", retval ); #endif retval = PAPI_set_multiplex( EventSet ); if ( retval == PAPI_ENOSUPP) { test_skip(__FILE__, __LINE__, "Multiplex not supported", 1); } else if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_set_multiplex", retval ); } max_mux = PAPI_get_opt( PAPI_MAX_MPX_CTRS, NULL ); if ( max_mux > 32 ) max_mux = 32; #if 0 if ( PAPI_set_domain( PAPI_DOM_KERNEL ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_domain", retval ); #endif /* Fill up the event set with as many non-derived events as we can */ if (!TESTS_QUIET) { printf( "\nFilling the event set with as many non-derived events as we can...\n" ); } i = PAPI_PRESET_MASK; do { if ( PAPI_get_event_info( i, &pset ) == PAPI_OK ) { if ( pset.count && ( strcmp( pset.derived, "NOT_DERIVED" ) == 0 ) ) { retval = PAPI_add_event( EventSet, ( int ) pset.event_code ); if ( retval != PAPI_OK ) { printf("Failed trying to add %s\n",pset.symbol); break; } else { if (!TESTS_QUIET) printf( "Added %s\n", pset.symbol ); j++; } } } } while ( ( PAPI_enum_event( &i, PAPI_PRESET_ENUM_AVAIL ) == PAPI_OK ) && ( j < max_mux ) ); if (j==0) { if (!TESTS_QUIET) printf("No events found\n"); test_skip(__FILE__,__LINE__,"No events",0); } events = ( int * ) malloc( ( size_t ) j * sizeof ( int ) ); if ( events == NULL ) test_fail( __FILE__, __LINE__, "malloc events", 0 ); values = ( long long * ) malloc( ( size_t ) j * sizeof ( long long ) ); if ( values == NULL ) test_fail( __FILE__, __LINE__, "malloc values", 0 ); do_stuff( ); #if 0 if ( PAPI_set_domain( PAPI_DOM_KERNEL ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_domain", retval ); #endif if ( PAPI_start( EventSet ) != PAPI_OK ) test_fail( __FILE__, __LINE__, 
"PAPI_start", retval ); do_stuff( ); retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); nev = j; retval = PAPI_list_events( EventSet, events, &nev ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_list_events", retval ); if (!TESTS_QUIET) printf( "\nEvent Counts:\n" ); for ( i = 0, allvalid = 0; i < j; i++ ) { PAPI_event_code_to_name( events[i], evname ); if (!TESTS_QUIET) printf( TAB1, evname, values[i] ); if ( values[i] == 0 ) allvalid++; } if (!TESTS_QUIET) { printf( "\n" ); if ( allvalid ) { printf( "Caution: %d counters had zero values\n", allvalid ); } } if (allvalid==j) { test_fail( __FILE__, __LINE__, "All counters returned zero", 5 ); } for ( i = 0, allvalid = 0; i < j; i++ ) { for ( k = i + 1; k < j; k++ ) { if ( ( i != k ) && ( values[i] == values[k] ) ) { allvalid++; break; } } } if (!TESTS_QUIET) { if ( allvalid ) { printf( "Caution: %d counter pair(s) had identical values\n", allvalid ); } } free( events ); free( values ); retval = PAPI_cleanup_eventset( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); retval = PAPI_destroy_eventset( &EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); return ( SUCCESS ); } int main( int argc, char **argv ) { int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); if (!quiet) { printf( "%s: Does PAPI_multiplex_init() handle lots of events?\n", argv[0] ); printf( "Using %d iterations\n", NUM_ITERS ); } case1( ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/multiplex3_pthreads.c000066400000000000000000000152111502707512200221000ustar00rootroot00000000000000/* * File: multiplex3_pthreads.c * Author: Philip Mucci * mucci@cs.utk.edu * Mods: John May * johnmay@llnl.gov */ /* This file tests the multiplex functionality when there are * threads in which the application isn't calling PAPI (and only * 
one thread that is calling PAPI.) */ #include <stdio.h> #include <stdlib.h> #include <pthread.h> #include "papi.h" #include "papi_test.h" #include "do_loops.h" #define MAX_TO_ADD 5 /* A thread function that does nothing forever, while the other * tests are running. */ void * thread_fn( void *dummy ) { ( void ) dummy; while ( 1 ) { do_stuff( ); } return NULL; } /* Runs a bunch of multiplexed events */ static void mainloop( int arg ) { int allvalid; long long *values; int EventSet = PAPI_NULL; int retval, i, j = 2, skipped_counters=0; PAPI_event_info_t pset; ( void ) arg; /* Initialize the library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } retval = PAPI_multiplex_init( ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI multiplex init fail\n", retval ); } retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } /* In Component PAPI, EventSets must be assigned a component index before you can fiddle with their internals. 
0 is always the cpu component */ retval = PAPI_assign_eventset_component( EventSet, 0 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_assign_eventset_component", retval ); } retval = PAPI_set_multiplex( EventSet ); if ( retval == PAPI_ENOSUPP) { test_skip(__FILE__, __LINE__, "Multiplex not supported", 1); } else if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_set_multiplex", retval ); } retval = PAPI_thread_init( ( unsigned long ( * )( void ) ) ( pthread_self ) ); if (retval != PAPI_OK ) { if ( retval == PAPI_ECMP ) test_skip( __FILE__, __LINE__, "PAPI_thread_init", retval ); else test_fail( __FILE__, __LINE__, "PAPI_thread_init", retval ); } retval = PAPI_add_event( EventSet, PAPI_TOT_INS ); if ( ( retval != PAPI_OK ) && ( retval != PAPI_ECNFLCT ) ) { if (!TESTS_QUIET) printf("Trouble adding PAPI_TOT_INS\n"); test_skip( __FILE__, __LINE__, "PAPI_add_event", retval ); } if ( !TESTS_QUIET ) { printf( "Added %s\n", "PAPI_TOT_INS" ); } retval = PAPI_add_event( EventSet, PAPI_TOT_CYC ); if ( ( retval != PAPI_OK ) && ( retval != PAPI_ECNFLCT ) ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); if ( !TESTS_QUIET ) { printf( "Added %s\n", "PAPI_TOT_CYC" ); } values = ( long long * ) malloc( MAX_TO_ADD * sizeof ( long long ) ); if ( values == NULL ) test_fail( __FILE__, __LINE__, "malloc", 0 ); for ( i = 0; i < PAPI_MAX_PRESET_EVENTS; i++ ) { retval = PAPI_get_event_info( i | PAPI_PRESET_MASK, &pset ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_get_event_info", retval ); if ( pset.count ) { if (!TESTS_QUIET) printf( "Adding %s\n", pset.symbol ); retval = PAPI_add_event( EventSet, ( int ) pset.event_code ); if ( ( retval != PAPI_OK ) && ( retval != PAPI_ECNFLCT ) ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); if ( retval == PAPI_OK ) { if (!TESTS_QUIET) printf( "Added %s\n", pset.symbol ); } else { if (!TESTS_QUIET) printf( "Could not add %s\n", pset.symbol ); } do_stuff( ); if ( retval == PAPI_OK 
) { retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_stuff( ); retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); if ( values[j] ) { if ( ++j >= MAX_TO_ADD ) break; } else { retval = PAPI_remove_event( EventSet, ( int ) pset.event_code ); if ( retval == PAPI_OK ) if (!TESTS_QUIET) printf( "Removed %s\n", pset.symbol ); /* This added because the test */ /* can take a long time if mplexing */ /* is broken and all values are 0 */ skipped_counters++; if (skipped_counters>MAX_TO_ADD) break; } } } } retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_stuff( ); retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); if (!TESTS_QUIET) { test_print_event_header( "multiplex3_pthreads:\n", EventSet ); } allvalid = 0; for ( i = 0; i < MAX_TO_ADD; i++ ) { if (!TESTS_QUIET) printf( ONENUM, values[i] ); if ( values[i] != 0 ) allvalid++; } if (!TESTS_QUIET) printf( "\n" ); if ( !allvalid ) test_fail( __FILE__, __LINE__, "all counter registered no counts", 1 ); retval = PAPI_cleanup_eventset( EventSet ); /* JT */ if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); retval = PAPI_destroy_eventset( &EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); free( values ); PAPI_shutdown( ); } int main( int argc, char **argv ) { int i, rc, retval; pthread_t id[NUM_THREADS]; pthread_attr_t attr; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); if (!quiet) { printf( "%s: Using %d threads\n\n", argv[0], NUM_THREADS ); printf( "Does non-threaded multiplexing work " "with extraneous threads present?\n" ); } /* Create a bunch of unused pthreads, to simulate threads created * by the system that the user doesn't know about. 
*/ pthread_attr_init( &attr ); #ifdef PTHREAD_CREATE_UNDETACHED pthread_attr_setdetachstate( &attr, PTHREAD_CREATE_UNDETACHED ); #endif #ifdef PTHREAD_SCOPE_SYSTEM retval = pthread_attr_setscope( &attr, PTHREAD_SCOPE_SYSTEM ); if ( retval != 0 ) test_skip( __FILE__, __LINE__, "pthread_attr_setscope", retval ); #endif #ifdef PPC64 sigset_t sigprof; sigemptyset( &sigprof ); sigaddset( &sigprof, SIGPROF ); retval = sigprocmask( SIG_BLOCK, &sigprof, NULL ); if ( retval != 0 ) test_fail( __FILE__, __LINE__, "sigprocmask SIG_BLOCK", retval ); #endif for ( i = 0; i < NUM_THREADS; i++ ) { rc = pthread_create( &id[i], &attr, thread_fn, NULL ); if ( rc ) test_fail( __FILE__, __LINE__, "pthread_create", rc ); } pthread_attr_destroy( &attr ); #ifdef PPC64 retval = sigprocmask( SIG_UNBLOCK, &sigprof, NULL ); if ( retval != 0 ) test_fail( __FILE__, __LINE__, "sigprocmask SIG_UNBLOCK", retval ); #endif mainloop( NUM_ITERS ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/native.c000066400000000000000000000132131502707512200173660ustar00rootroot00000000000000/* * File: native.c * Mods: Maynard Johnson * maynardj@us.ibm.com */ /* This test defines an array of native event names, either at compile time or at run time (some x86 platforms). It then: - add the table of events to an event set; - starts counting - does a little work - stops counting; - reports the results. 
*/ #include "papi_test.h" static int EventSet = PAPI_NULL; extern int TESTS_QUIET; /* Declared in test_utils.c */ #if (defined(PPC32)) /* Select 4 events common to both ppc750 and ppc7450 */ static char *native_name[] = { "CPU_CLK", "FLOPS", "TOT_INS", "BR_MSP", NULL }; #elif defined(_POWER4) || defined(_PPC970) /* arbitrarily code events from group 28: pm_fpu3 - Floating point events by unit */ static char *native_name[] = { "PM_FPU0_FDIV", "PM_FPU1_FDIV", "PM_FPU0_FRSP_FCONV", "PM_FPU1_FRSP_FCONV", "PM_FPU0_FMA", "PM_FPU1_FMA", "PM_INST_CMPL", "PM_CYC", NULL }; #elif defined(_POWER5p) /* arbitrarily code events from group 33: pm_fpustall - Floating Point Unit stalls */ static char *native_name[] = { "PM_FPU_FULL_CYC", "PM_CMPLU_STALL_FDIV", "PM_CMPLU_STALL_FPU", "PM_RUN_INST_CMPL", "PM_RUN_CYC", NULL }; #elif defined(_POWER5) /* arbitrarily code events from group 78: pm_fpu1 - Floating Point events */ static char *native_name[] = { "PM_FPU_FDIV", "PM_FPU_FMA", "PM_FPU_FMOV_FEST", "PM_FPU_FEST", "PM_INST_CMPL", "PM_RUN_CYC", NULL }; #elif defined(POWER3) static char *native_name[] = { "PM_IC_MISS", "PM_FPU1_CMPL", "PM_LD_MISS_L1", "PM_LD_CMPL", "PM_FPU0_CMPL", "PM_CYC", "PM_TLB_MISS", NULL }; #elif defined(__ia64__) #ifdef ITANIUM2 static char *native_name[] = { "CPU_CYCLES", "L1I_READS", "L1D_READS_SET0", "IA64_INST_RETIRED", NULL }; #else static char *native_name[] = { "DEPENDENCY_SCOREBOARD_CYCLE", "DEPENDENCY_ALL_CYCLE", "UNSTALLED_BACKEND_CYCLE", "MEMORY_CYCLE", NULL }; #endif #elif ((defined(linux) && (defined(__i386__) || (defined __x86_64__))) ) static char *p3_native_name[] = { "DATA_MEM_REFS", "DCU_LINES_IN", NULL }; static char *core_native_name[] = { "UnhltCore_Cycles", "Instr_Retired", NULL }; static char *k7_native_name[] = { "TOT_CYC", "IC_MISSES", "DC_ACCESSES", "DC_MISSES", NULL }; // static char *k8_native_name[] = { "FP_ADD_PIPE", "FP_MULT_PIPE", "FP_ST_PIPE", "FP_NONE_RET", NULL }; static char *k8_native_name[] = { "DISPATCHED_FPU:OPS_ADD", 
"DISPATCHED_FPU:OPS_MULTIPLY", "DISPATCHED_FPU:OPS_STORE", "CYCLES_NO_FPU_OPS_RETIRED", NULL }; static char *p4_native_name[] = { "retired_mispred_branch_type:CONDITIONAL", "resource_stall:SBFULL", "tc_ms_xfer:CISC", "instr_retired:BOGUSNTAG:BOGUSTAG", "BSQ_cache_reference:RD_2ndL_HITS", NULL }; static char **native_name = p3_native_name; #elif defined(mips) && defined(sgi) static char *native_name[] = { "Primary_instruction_cache_misses", "Primary_data_cache_misses", NULL }; #elif defined(mips) && defined(linux) static char *native_name[] = { "CYCLES", NULL }; #elif defined(sun) && defined(sparc) static char *native_name[] = { "Cycle_cnt", "Instr_cnt", NULL }; #elif defined(_BGL) static char *native_name[] = { "BGL_UPC_PU0_PREF_STREAM_HIT", "BGL_PAPI_TIMEBASE", "BGL_UPC_PU1_PREF_STREAM_HIT", NULL }; #elif defined(__bgp__) static char *native_name[] = { "PNE_BGP_PU0_JPIPE_LOGICAL_OPS", "PNE_BGP_PU0_JPIPE_LOGICAL_OPS", "PNE_BGP_PU2_IPIPE_INSTRUCTIONS", NULL }; #else #error "Architecture not supported in test file." 
#endif int main( int argc, char **argv ) { int i, retval, native; const PAPI_hw_info_t *hwinfo; long long values[8]; tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ if ( ( retval = PAPI_library_init( PAPI_VER_CURRENT ) ) != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); if ( ( retval = PAPI_create_eventset( &EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); if ( ( hwinfo = PAPI_get_hardware_info( ) ) == NULL ) test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", PAPI_EMISC ); printf( "Architecture %s, %d\n", hwinfo->model_string, hwinfo->model ); #if ((defined(linux) && (defined(__i386__) || (defined __x86_64__))) ) if ( !strncmp( hwinfo->model_string, "Intel Pentium 4", 15 ) ) { native_name = p4_native_name; } else if ( !strncmp( hwinfo->model_string, "AMD K7", 6 ) ) { native_name = k7_native_name; } else if ( !strncmp( hwinfo->model_string, "AMD K8", 6 ) ) { native_name = k8_native_name; } else if ( !strncmp( hwinfo->model_string, "Intel Core", 17 ) || !strncmp( hwinfo->model_string, "Intel Core 2", 17 ) ) { native_name = core_native_name; } #endif for ( i = 0; native_name[i] != NULL; i++ ) { retval = PAPI_event_name_to_code( native_name[i], &native ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_event_name_to_code", retval ); printf( "Adding %s\n", native_name[i] ); if ( ( retval = PAPI_add_event( EventSet, native ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); } if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_both( 1000 ); if ( ( retval = PAPI_stop( EventSet, values ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); if ( !TESTS_QUIET ) { for ( i = 0; native_name[i] != NULL; i++ ) { fprintf( stderr, "%-40s: ", native_name[i] ); fprintf( stderr, LLDFMT, values[i] ); fprintf( stderr, "\n" ); } } retval = PAPI_cleanup_eventset( EventSet ); if ( retval 
	     != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_cleanup", retval );

	retval = PAPI_destroy_eventset( &EventSet );
	if ( retval != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval );

	test_pass( __FILE__, NULL, 0 );
	exit( 0 );
}

===== papi-papi-7-2-0-t/src/ctests/net-mpi-test/Makefile =====

CC = gcc
CC_R = gcc -pthread
CC_SHR = gcc -shared
#MXMPIPATH = /usr/local/mpich/mpich-gcc
#MXMPIPATH = /usr/local/mpich-mx
#MPICC = $(MXMPIPATH)/bin/mpicc
#MPICC = /usr/bin/mpicc
MPICC = mpicc
MPICC_SHR = $(MPICC) -shared
MPICCLD_SHR = $(MPICC_SHR)
F77 = g77
FLAGS = -g -Wall
CFLAGS = $(FLAGS) -O3
# -DPROFILE_TIMER -DDEBUG -DVERBOSE
BLASLIBS = -lblas
#BLASLIBS = -L/usr/local/lib -lf77blas -latlas
LAPACKLIBS = -llapack
UTILOBJS= ../do_loops.o ../test_utils.o ../dummy.o
INCLUDE = -I.. -I../.. -I/usr/include
PAPILIB = -L../.. -lpapi
MPILIBS =
MPIINC =
XTRALIBS =
PTHRLIBS =
MPILIBS =
LIBS =$(PAPILIB) -lm

TESTS = cpi

tests: $(TESTS)

# Applications

# Test programs

../test_utils.o: ../test_utils.c ../papi_test.h ../test_utils.h
	$(CC) $(CFLAGS) $(INCLUDE) -c ../test_utils.c -o ../test_utils.o

../do_loops.o: ../do_loops.c ../papi_test.h ../test_utils.h
	$(CC) $(CFLAGS) $(INCLUDE) -c ../do_loops.c -o ../do_loops.o

../dummy.o: ../dummy.c
	$(CC) $(CFLAGS) $(INCLUDE) -c ../dummy.c -o ../dummy.o

cpi: cpi.c $(UTILOBJS)
	$(MPICC) $(MPFLAGS) $(CFLAGS) $(INCLUDE) $(MPIINC) $(TOPTFLAGS) cpi.c $(UTILOBJS) $(PAPILIB) $(MPILIBS) -o cpi

#cpi: cpi.c
#	$(MPICC) $(FLAGS) cpi.c -o $@ $(MPIPERFLIBS) $(XTRALIBS) $(MPILIBS) -lm

clean:
	rm -f core $(TESTS) *~ *.o

===== papi-papi-7-2-0-t/src/ctests/net-mpi-test/cpi.c =====

/* From Dave McNamara at PSRV. Thanks!
*/

/* If an event is countable but you've exhausted the counter resources
   and you try to add an event, it seems subsequent PAPI_start and/or
   PAPI_stop will cause a Seg. Violation. I got around this by calling
   PAPI to get the # of countable events, then making sure that I
   didn't try to add more than this number of events. I still have a
   problem if someone adds Level 2 cache misses and then adds FLOPS
   'cause I didn't count FLOPS as actually requiring 2 counters. */

#include "papi_test.h"
#include <mpi.h>
#include <stdio.h>
#include <math.h>

extern int TESTS_QUIET;				   /* Declared in test_utils.c */

char *netevents[] = { "LO_RX_PACKETS", "LO_TX_PACKETS",
	"ETH0_RX_PACKETS", "ETH0_TX_PACKETS"
};

double
f( double a )
{
	return ( 4.0 / ( 1.0 + a * a ) );
}

int
main( int argc, char **argv )
{
	int EventSet = PAPI_NULL, EventSet1 = PAPI_NULL;
	int evtcode;
	int retval, i, ins = 0;
	long long g1[2], g2[2];
	int done = 0, n, myid, numprocs;
	double PI25DT = 3.141592653589793238462643;
	double mypi, pi, h, sum, x;
	double startwtime = 0.0, endwtime;
	int namelen;
	char processor_name[MPI_MAX_PROCESSOR_NAME];

	tests_quiet( argc, argv );	/* Set TESTS_QUIET variable */

	if ( ( retval = PAPI_library_init( PAPI_VER_CURRENT ) ) != PAPI_VER_CURRENT )
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );

	if ( ( retval = PAPI_create_eventset( &EventSet ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );

	if ( ( retval = PAPI_create_eventset( &EventSet1 ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );

	PAPI_event_name_to_code( netevents[2], &evtcode );
	if ( ( retval = PAPI_query_event( evtcode ) ) != PAPI_OK ) {
		if ( retval != PAPI_ECNFLCT )
			test_fail( __FILE__, __LINE__, "PAPI_query_event", retval );
	}

	if ( ( retval = PAPI_add_event( EventSet, evtcode ) ) != PAPI_OK ) {
		if ( retval != PAPI_ECNFLCT )
			test_fail( __FILE__, __LINE__, "PAPI_add_event", retval );
	}

	PAPI_event_name_to_code( netevents[3], &evtcode );
	if ( ( retval = PAPI_query_event( evtcode ) ) !=
PAPI_OK ) { if ( retval != PAPI_ECNFLCT ) test_fail( __FILE__, __LINE__, "PAPI_aquery_event", retval ); } if ( ( retval = PAPI_add_event( EventSet, evtcode ) ) != PAPI_OK ) { if ( retval != PAPI_ECNFLCT ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); } if ( ( retval = PAPI_query_event( PAPI_FP_INS ) ) != PAPI_OK ) { if ( ( retval = PAPI_query_event( PAPI_FP_OPS ) ) == PAPI_OK ) { ins = 2; if ( ( retval = PAPI_add_event( EventSet1, PAPI_FP_OPS ) ) != PAPI_OK ) { if ( retval != PAPI_ECNFLCT ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); } } } else { ins = 1; if ( ( retval = PAPI_add_event( EventSet1, PAPI_FP_INS ) ) != PAPI_OK ) { if ( retval != PAPI_ECNFLCT ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); } } if ( ( retval = PAPI_add_event( EventSet1, PAPI_TOT_CYC ) ) != PAPI_OK ) { if ( retval != PAPI_ECNFLCT ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); } MPI_Init( &argc, &argv ); MPI_Comm_size( MPI_COMM_WORLD, &numprocs ); MPI_Comm_rank( MPI_COMM_WORLD, &myid ); MPI_Get_processor_name( processor_name, &namelen ); fprintf( stdout, "Process %d of %d on %s\n", myid, numprocs, processor_name ); fflush( stdout ); if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); if ( ( retval = PAPI_start( EventSet1 ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); n = 0; while ( !done ) { if ( myid == 0 ) { if ( n == 0 ) n = 1000000; else n = 0; startwtime = MPI_Wtime( ); } MPI_Bcast( &n, 1, MPI_INT, 0, MPI_COMM_WORLD ); if ( n == 0 ) done = 1; else { h = 1.0 / ( double ) n; sum = 0.0; /* A slightly better approach starts from large i and works back */ for ( i = myid + 1; i <= n; i += numprocs ) { x = h * ( ( double ) i - 0.5 ); sum += f( x ); } mypi = h * sum; MPI_Reduce( &mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD ); if ( myid == 0 ) { printf( "pi is approximately %.16f, Error is %.16f\n", pi, fabs( pi - PI25DT ) ); endwtime = MPI_Wtime( 
);
				printf( "wall clock time = %f\n", endwtime - startwtime );
				fflush( stdout );
			}
		}
	}

	if ( ( retval = PAPI_stop( EventSet1, g1 ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_stop", retval );

	if ( ( retval = PAPI_stop( EventSet, g2 ) ) != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_stop", retval );

	MPI_Finalize( );

	printf( "ETH0_RX_PACKETS: %lld   ETH0_TX_PACKETS: %lld\n", g2[0], g2[1] );

	if ( ins == 0 ) {
		printf( "PAPI_TOT_CYC : %lld\n", g1[0] );
	} else if ( ins == 1 ) {
		printf( "PAPI_FP_INS : %lld   PAPI_TOT_CYC : %lld\n", g1[0], g1[1] );
	} else if ( ins == 2 ) {
		printf( "PAPI_FP_OPS : %lld   PAPI_TOT_CYC : %lld\n", g1[0], g1[1] );
	}

	test_pass( __FILE__, NULL, 0 );
	return 0;
}

===== papi-papi-7-2-0-t/src/ctests/net-mpi-test/cpi.pbs =====

#!/bin/bash
############################################################
## Template PBS Job Script for Parallel Job on Myrinet Nodes
##
## Lines beginning with '#PBS' are PBS directives, see
## 'man qsub' for additional information.
############################################################

### Set the job name
#PBS -N cpi

### Set the queue to submit this job: ALWAYS use the default queue
##PBS -q workq

### Set the number of nodes that will be used, 4 in this case,
### use a single processor per node (ppn=1), and use Myrinet
#PBS -l nodes=4

### The following command computes the number of processors requested
### from the file containing the list of nodes assigned to the job
export NPROCS=`wc -l $PBS_NODEFILE |gawk '//{print $1}'`

### The following statements dump some diagnostic information to
### the batch job's standard output.
echo The master node of this job is `hostname` echo The working directory is `echo $PBS_O_WORKDIR` echo The node file is $PBS_NODEFILE echo "=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-" echo This job runs on the following nodes: echo `cat $PBS_NODEFILE` echo "=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-" echo This job has allocated $NPROCS nodes echo "=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-" ### Change to the working directory of the qsub command. cd $PBS_O_WORKDIR ### Execute the MPI job --- NOTE: It is *crucial* that the proper ### 'mpirun' command (there are several versions of the command ### on the cluster) be used to launch the job---it is safest to use ### the full pathname as is done here. #/usr/local/mpich-mx/bin/mpirun -np $NPROCS -machinefile $PBS_NODEFILE cpi /usr/local/mpich/bin/mpirun -np $NPROCS -machinefile $PBS_NODEFILE cpi papi-papi-7-2-0-t/src/ctests/nineth.c000066400000000000000000000076551502707512200174020ustar00rootroot00000000000000/* This file performs the following test: start, stop and timer functionality for derived events NOTE: This test becomes useless when rate events like PAPI_FLOPS are removed. - It tests the derived metric FLOPS using the following two counters. They are counted in the default counting domain and default granularity, depending on the platform. Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). + PAPI_FP_INS + PAPI_TOT_CYC - Get us. - Start counters - Do flops - Stop and read counters - Get us. 
*/

#include "papi_test.h"

extern int TESTS_QUIET;				   /* Declared in test_utils.c */

int
main( int argc, char **argv )
{
	int retval, num_tests = 2, tmp;
	int EventSet1 = PAPI_NULL;
	int EventSet2 = PAPI_NULL;
	int mask1 = 0x80001;	/* FP_OPS and TOT_CYC */
	int mask2 = 0x8;	/* FLOPS */
	int num_events1;
	int num_events2 = 1;	/* EventSet2 is never populated (its setup below
				   is commented out); initialize so the check
				   below does not read an indeterminate value */
	long long **values;
	int clockrate;
	double test_flops;

	tests_quiet( argc, argv );	/* Set TESTS_QUIET variable */

	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT )
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );

	/* gotta count flops to run this test */
	if ( ( retval = PAPI_query_event( PAPI_FP_OPS ) ) != PAPI_OK )
		test_skip( __FILE__, __LINE__, "PAPI_query_event", retval );

	EventSet1 = add_test_events( &num_events1, &mask1 );
	/* EventSet2 = add_test_events(&num_events2, &mask2); */

	if ( num_events1 == 0 || num_events2 == 0 )
		test_skip( __FILE__, __LINE__, "add_test_events", PAPI_ENOEVNT );

	/* num_events1 is greater than num_events2 so don't worry.
*/ values = allocate_test_space( num_tests, num_events1 ); clockrate = PAPI_get_opt( PAPI_CLOCKRATE, NULL ); if ( clockrate < 1 ) test_fail( __FILE__, __LINE__, "PAPI_get_opt", retval ); retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet1, values[0] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); /* retval = PAPI_start(EventSet2); if (retval != PAPI_OK) test_fail(__FILE__, __LINE__, "PAPI_start", retval); do_flops(NUM_FLOPS); retval = PAPI_stop(EventSet2, values[1]); if (retval != PAPI_OK) test_fail(__FILE__, __LINE__, "PAPI_stop", retval); */ remove_test_events( &EventSet1, mask1 ); /* remove_test_events(&EventSet2, mask2); */ test_flops = ( double ) ( values[0] )[0] * ( double ) clockrate *( double ) 1000000.0; test_flops = test_flops / ( double ) ( values[0] )[1]; if ( !TESTS_QUIET ) { printf( "Test case 9: start, stop for derived event PAPI_FLOPS.\n" ); printf( "------------------------------------------------------\n" ); tmp = PAPI_get_opt( PAPI_DEFDOM, NULL ); printf( "Default domain is: %d (%s)\n", tmp, stringify_all_domains( tmp ) ); tmp = PAPI_get_opt( PAPI_DEFGRN, NULL ); printf( "Default granularity is: %d (%s)\n", tmp, stringify_granularity( tmp ) ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf ( "-------------------------------------------------------------------------\n" ); printf( "Test type : %12s%12s\n", "1", "2" ); printf( TAB2, "PAPI_FP_OPS : ", ( values[0] )[0], ( long long ) 0 ); printf( TAB2, "PAPI_TOT_CYC: ", ( values[0] )[1], ( long long ) 0 ); printf( TAB2, "PAPI_FLOPS : ", ( long long ) 0, ( values[1] )[0] ); printf ( "-------------------------------------------------------------------------\n" ); printf( "Verification:\n" ); printf( "Last number in row 3 approximately equals %f\n", test_flops ); printf( "This test is no longer valid: PAPI_FLOPS is deprecated.\n" ); } /* { 
	   double min, max;
	   min = values[1][0] * .9;
	   max = values[1][0] * 1.1;
	   if (test_flops > max || test_flops < min)
	   test_fail(__FILE__, __LINE__, "PAPI_FLOPS", 1);
	   }
	 */

	test_pass( __FILE__, values, num_tests );
	exit( 1 );
}

===== papi-papi-7-2-0-t/src/ctests/omp_hl.c =====

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <omp.h>
#include "papi.h"
#include "papi_test.h"
#include "do_loops.h"

int main( int argc, char **argv )
{
	int retval, i;
	int quiet = 0;
	char* region_name;

	/* Set TESTS_QUIET variable */
	quiet = tests_quiet( argc, argv );

	region_name = "do_flops";

#pragma omp parallel
#pragma omp for
	for ( i = 1; i <= 4; ++i ) {
		int tid;
		tid = omp_get_thread_num();
		if ( !quiet ) {
			printf("\nThread %d: instrument flops\n", tid);
		}
		retval = PAPI_hl_region_begin(region_name);
		if ( retval != PAPI_OK ) {
			test_fail( __FILE__, __LINE__, "PAPI_hl_region_begin", retval );
		}

		do_flops( NUM_FLOPS );

		retval = PAPI_hl_region_end(region_name);
		if ( retval != PAPI_OK ) {
			test_fail( __FILE__, __LINE__, "PAPI_hl_region_end", retval );
		}
	}

	region_name = "do_flops_2";

#pragma omp parallel
#pragma omp for
	for ( i = 1; i <= 4; ++i ) {
		int tid;
		tid = omp_get_thread_num();
		if ( !quiet ) {
			printf("\nThread %d: instrument flops_2\n", tid);
		}
		retval = PAPI_hl_region_begin(region_name);
		if ( retval != PAPI_OK ) {
			test_fail( __FILE__, __LINE__, "PAPI_hl_region_begin", retval );
		}

		do_flops( NUM_FLOPS );

		retval = PAPI_hl_region_end(region_name);
		if ( retval != PAPI_OK ) {
			test_fail( __FILE__, __LINE__, "PAPI_hl_region_end", retval );
		}
	}

	test_hl_pass( __FILE__ );

	return 0;
}

===== papi-papi-7-2-0-t/src/ctests/omptough.c =====

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <pthread.h>
#include <omp.h>
#include "papi.h"
#include "papi_test.h"

#define NITER (100000)

int
main( int argc, char *argv[] )
{
	int i;
	int ret;
	int nthreads;
	int *evtset;
	int *ctrcode;

	nthreads = omp_get_max_threads( );
	evtset = ( int * )
malloc( sizeof ( int ) * nthreads ); ctrcode = ( int * ) malloc( sizeof ( int ) * nthreads ); tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ ret = PAPI_library_init( PAPI_VER_CURRENT ); if ( ret != PAPI_VER_CURRENT && ret > 0 ) { fprintf( stderr, "PAPI library version mismatch '%s'\n", PAPI_strerror( ret ) ); exit( 1 ); } if ( ret < 0 ) { fprintf( stderr, "PAPI initialization error '%s'\n", PAPI_strerror( ret ) ); exit( 1 ); } if ( ( ret = PAPI_thread_init( ( unsigned long ( * )( void ) ) pthread_self ) ) != PAPI_OK ) { fprintf( stderr, "PAPI thread initialization error '%s'\n", PAPI_strerror( ret ) ); exit( 1 ); } for ( i = 0; i < nthreads; i++ ) { evtset[i] = PAPI_NULL; if ( ( ret = PAPI_event_name_to_code( "PAPI_TOT_INS", &ctrcode[i] ) ) != PAPI_OK ) { fprintf( stderr, "PAPI evt-name-to-code error '%s'\n", PAPI_strerror( ret ) ); } } for ( i = 0; i < NITER; i++ ) { #pragma omp parallel { int tid; int pid; tid = omp_get_thread_num( ); pid = pthread_self( ); if ( ( ret = PAPI_register_thread( ) ) != PAPI_OK ) { if ( !TESTS_QUIET ) { fprintf( stderr, "[%5d] Error in register thread (tid=%d pid=%d) '%s'\n", i, tid, pid, PAPI_strerror( ret ) ); test_fail( __FILE__, __LINE__, "omptough", 1 ); } } evtset[tid] = PAPI_NULL; if ( ( ret = PAPI_create_eventset( &( evtset[tid] ) ) ) != PAPI_OK ) { if ( !TESTS_QUIET ) { fprintf( stderr, "[%5d] Error creating eventset (tid=%d pid=%d) '%s'\n", i, tid, pid, PAPI_strerror( ret ) ); test_fail( __FILE__, __LINE__, "omptough", 1 ); } } if ( ( ret = PAPI_destroy_eventset( &( evtset[tid] ) ) ) != PAPI_OK ) { if ( !TESTS_QUIET ) { fprintf( stderr, "[%5d] Error destroying eventset (tid=%d pid=%d) '%s'\n", i, tid, pid, PAPI_strerror( ret ) ); evtset[tid] = PAPI_NULL; test_fail( __FILE__, __LINE__, "omptough", 1 ); } } if ( ( ret = PAPI_unregister_thread( ) ) != PAPI_OK ) { if ( !TESTS_QUIET ) { fprintf( stderr, "[%5d] Error in unregister thread (tid=%d pid=%d) ret='%s'\n", i, tid, pid, PAPI_strerror( ret ) ); test_fail( 
						  __FILE__, __LINE__, "omptough", 1 );
				}
			}
		}
	}

	test_pass( __FILE__ );

	return 0;
}

===== papi-papi-7-2-0-t/src/ctests/overflow.c =====

/*
* File:    overflow.c
* Author:  Philip Mucci
*          mucci@cs.utk.edu
*/

/* This file performs the following test: overflow dispatch

   The Eventset contains:
   + PAPI_TOT_CYC
   + PAPI_FP_INS (overflow monitor)

   - Start eventset 1
   - Do flops
   - Stop and measure eventset 1
   - Set up overflow on eventset 1
   - Start eventset 1
   - Do flops
   - Stop eventset 1
*/

#include <stdio.h>
#include <stdlib.h>
#include "papi.h"
#include "papi_test.h"
#include "do_loops.h"

#define OVER_FMT	"handler(%d ) Overflow at %p! bit=%#llx \n"
#define OUT_FMT		"%-12s : %16lld%16lld\n"

static int total = 0;				   /* total overflows */

void
handler( int EventSet, void *address, long long overflow_vector, void *context )
{
	( void ) context;

	if ( !TESTS_QUIET ) {
		fprintf( stderr, OVER_FMT, EventSet, address, overflow_vector );
	}
	total++;
}

int
main( int argc, char **argv )
{
	int EventSet = PAPI_NULL;
	long long ( values[2] )[2];
	long long min, max;
	int num_flops = NUM_FLOPS, retval;
	int PAPI_event, mythreshold = THRESHOLD;
	char event_name1[PAPI_MAX_STR_LEN];
	const PAPI_hw_info_t *hw_info = NULL;
	int num_events, mask;
	int quiet;

	/* Set TESTS_QUIET variable */
	quiet = tests_quiet( argc, argv );

	/* Init PAPI */
	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	/* Get hardware info */
	hw_info = PAPI_get_hardware_info( );
	if ( hw_info == NULL ) {
		test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 );
	}

	/* add PAPI_TOT_CYC and one of the events in       */
	/* PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS,       */
	/* depending on the availability of the event on   */
	/* the platform                                    */
	EventSet = add_two_nonderived_events( &num_events, &PAPI_event, &mask );

	if (num_events==0) {
		if (!quiet) printf("Trouble adding event!\n");
		test_skip(__FILE__,__LINE__,"Event add",1);
	}
if (!quiet) { printf("Using %#x for the overflow event\n",PAPI_event); } if ( PAPI_event == PAPI_FP_INS ) { mythreshold = THRESHOLD; } else { #if defined(linux) mythreshold = ( int ) hw_info->cpu_max_mhz * 20000; #else mythreshold = THRESHOLD * 2; #endif } /* Start the run calibration run */ retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); /* stop the calibration run */ retval = PAPI_stop( EventSet, values[0] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); /* set up overflow handler */ retval = PAPI_overflow( EventSet, PAPI_event, mythreshold, 0, handler ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); } /* Start overflow run */ retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_flops( num_flops ); /* stop overflow run */ retval = PAPI_stop( EventSet, values[1] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_overflow( EventSet, PAPI_event, 0, 0, handler ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); if ( !TESTS_QUIET ) { retval = PAPI_event_code_to_name( PAPI_event, event_name1 ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } printf( "Test case: Overflow dispatch of 2nd event in set with 2 events.\n" ); printf( "---------------------------------------------------------------\n" ); printf( "Threshold for overflow is: %d\n", mythreshold ); printf( "Using %d iterations of c += a*b\n", num_flops ); printf( "-----------------------------------------------\n" ); printf( "Test type : %16d%16d\n", 1, 2 ); printf( OUT_FMT, event_name1, ( values[0] )[1], ( values[1] )[1] ); printf( OUT_FMT, "PAPI_TOT_CYC", ( values[0] )[0], ( values[1] )[0] ); printf( "Overflows : %16s%16d\n", "", total ); printf( 
				"-----------------------------------------------\n" );
	}

	retval = PAPI_cleanup_eventset( EventSet );
	if ( retval != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval );

	retval = PAPI_destroy_eventset( &EventSet );
	if ( retval != PAPI_OK )
		test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval );

	if ( !TESTS_QUIET ) {
		printf( "Verification:\n" );
#if defined(linux) || defined(__ia64__) || defined(_POWER4)
		num_flops *= 2;
#endif
		if ( PAPI_event == PAPI_FP_INS || PAPI_event == PAPI_FP_OPS ) {
			printf( "Row 1 approximately equals %d %d\n", num_flops, num_flops );
		}
		printf( "Column 1 approximately equals column 2\n" );
		printf( "Row 3 approximately equals %u +- %u %%\n",
				( unsigned ) ( ( values[0] )[1] / ( long long ) mythreshold ),
				( unsigned ) ( OVR_TOLERANCE * 100.0 ) );
	}

	/*
	   min = (long long)((values[0])[1]*(1.0-TOLERANCE));
	   max = (long long)((values[0])[1]*(1.0+TOLERANCE));
	   if ( (values[0])[1] > max || (values[0])[1] < min )
	   test_fail(__FILE__, __LINE__, event_name, 1);
	 */

	min = ( long long ) ( ( ( double ) values[0][1] * ( 1.0 - OVR_TOLERANCE ) ) /
						  ( double ) mythreshold );
	max = ( long long ) ( ( ( double ) values[0][1] * ( 1.0 + OVR_TOLERANCE ) ) /
						  ( double ) mythreshold );

	if (!quiet) {
		printf( "Overflows: total(%d) > max(%lld) || "
				"total(%d) < min(%lld) \n",
				total, max, total, min );
	}

	if ( total > max || total < min ) {
		test_fail( __FILE__, __LINE__, "Overflows", 1 );
	}

	test_pass( __FILE__ );

	return 0;
}

===== papi-papi-7-2-0-t/src/ctests/overflow2.c =====

/*
* File:    overflow2.c
* Author:  Nils Smeds  [Based on tests/overflow.c by Philip Mucci]
*          smeds@pdc.kth.se
*/

/* This file performs the following test: overflow dispatch

   The Eventset contains:
   + PAPI_TOT_CYC (overflow monitor)
   + PAPI_FP_INS

   - Start eventset 1
   - Do flops
   - Stop and measure eventset 1
   - Set up overflow on eventset 1
   - Start eventset 1
   - Do flops
   - Stop eventset 1
*/

#include <stdio.h>
#include <stdlib.h>
#include
"papi.h" #include "papi_test.h" #include "do_loops.h" #define OVER_FMT "handler(%d ) Overflow at %p! bit=%#llx \n" #define OUT_FMT "%-12s : %16lld%16lld\n" int total = 0; /* total overflows */ void handler( int EventSet, void *address, long long overflow_vector, void *context ) { ( void ) context; if ( !TESTS_QUIET ) { fprintf( stderr, OVER_FMT, EventSet, address, overflow_vector ); } total++; } int main( int argc, char **argv ) { int EventSet = PAPI_NULL; long long ( values[2] )[2]; long long min, max; int num_flops, retval; int PAPI_event, mythreshold; char event_name[PAPI_MAX_STR_LEN]; const PAPI_hw_info_t *hw_info = NULL; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } hw_info = PAPI_get_hardware_info( ); if ( hw_info == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 ); } #if defined(POWER3) || defined(__sparc__) PAPI_event = PAPI_TOT_INS; #else /* query and set up the right instruction to monitor */ PAPI_event = find_nonderived_event( ); #endif if (PAPI_event==0) { if (!quiet) printf("Trouble creating events\n"); test_skip(__FILE__,__LINE__,"Creating event",1); } if (( PAPI_event == PAPI_FP_OPS ) || ( PAPI_event == PAPI_FP_INS )) mythreshold = THRESHOLD; else #if defined(linux) mythreshold = ( int ) hw_info->cpu_max_mhz * 10000 * 2; #else mythreshold = THRESHOLD * 2; #endif retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); retval = PAPI_add_event( EventSet, PAPI_event ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); retval = PAPI_add_event( EventSet, PAPI_TOT_CYC ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, 
"PAPI_start", retval ); do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet, values[0] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_overflow( EventSet, PAPI_event, mythreshold, 0, handler ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet, values[1] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_overflow( EventSet, PAPI_event, 0, 0, handler ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); num_flops = NUM_FLOPS; #if defined(linux) || defined(__ia64__) || defined(_POWER4) num_flops *= 2; #endif if ( !quiet ) { if ( ( retval = PAPI_event_code_to_name( PAPI_event, event_name ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); printf ( "Test case: Overflow dispatch of 1st event in set with 2 events.\n" ); printf ( "---------------------------------------------------------------\n" ); printf( "Threshold for overflow is: %d\n", mythreshold ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf( "-----------------------------------------------\n" ); printf( "Test type : %16d%16d\n", 1, 2 ); printf( OUT_FMT, event_name, ( values[0] )[0], ( values[1] )[0] ); printf( OUT_FMT, "PAPI_TOT_CYC", ( values[0] )[1], ( values[1] )[1] ); printf( "Overflows : %16s%16d\n", "", total ); printf( "-----------------------------------------------\n" ); printf( "Verification:\n" ); /* if (PAPI_event == PAPI_FP_INS) printf("Row 1 approximately equals %d %d\n", num_flops, num_flops); */ /* Note that the second run prints output on stdout. On some systems * this is costly. PAPI_TOT_INS or PAPI_TOT_CYC are likely to be _very_ * different between the two runs. 
* printf("Column 1 approximately equals column 2\n"); */ printf( "Row 3 approximately equals %u +- %u %%\n", ( unsigned ) ( ( values[0] )[0] / ( long long ) mythreshold ), ( unsigned ) ( OVR_TOLERANCE * 100.0 ) ); } /* min = (long long)((values[0])[0]*(1.0-TOLERANCE)); max = (long long)((values[0])[0]*(1.0+TOLERANCE)); if ( (values[1])[0] > max || (values[1])[0] < min ) test_fail(__FILE__, __LINE__, event_name, 1); */ min = ( long long ) ( ( ( double ) values[0][0] * ( 1.0 - OVR_TOLERANCE ) ) / ( double ) mythreshold ); max = ( long long ) ( ( ( double ) values[0][0] * ( 1.0 + OVR_TOLERANCE ) ) / ( double ) mythreshold ); if ( total > max || total < min ) test_fail( __FILE__, __LINE__, "Overflows", 1 ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/overflow3_pthreads.c000066400000000000000000000074511502707512200217270ustar00rootroot00000000000000/* * File: overflow3_pthreads.c * Author: Philip Mucci * mucci@cs.utk.edu */ /* This file tests the overflow functionality when there are * threads in which the application isn't calling PAPI (and only * one thread that is calling PAPI.) 
*/ #include #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" int total = 0; void * thread_fn( void *dummy ) { ( void ) dummy; while ( 1 ) { do_stuff( ); } return ( NULL ); } void handler( int EventSet, void *address, long long overflow_vector, void *context ) { ( void ) overflow_vector; ( void ) context; if ( !TESTS_QUIET ) { fprintf( stderr, "handler(%d ) Overflow at %p, thread %#lx!\n", EventSet, address, PAPI_thread_id( ) ); } total++; } void mainloop( int arg ) { int retval, num_tests = 1; int EventSet1 = PAPI_NULL; int mask1 = 0x0; int num_events1; long long **values; int PAPI_event; char event_name[PAPI_MAX_STR_LEN]; ( void ) arg; retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depending on the availability of the event on the platform */ EventSet1 = add_two_nonderived_events( &num_events1, &PAPI_event, &mask1 ); if (num_events1==0) { if (!TESTS_QUIET) printf("Trouble creating events\n"); test_skip(__FILE__,__LINE__,"Creating events",0); } values = allocate_test_space( num_tests, num_events1 ); if ( ( retval = PAPI_overflow( EventSet1, PAPI_event, THRESHOLD, 0, handler ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); do_stuff( ); if ( ( retval = PAPI_start( EventSet1 ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_stuff( ); if ( ( retval = PAPI_stop( EventSet1, values[0] ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); /* clear the papi_overflow event */ if ( ( retval = PAPI_overflow( EventSet1, PAPI_event, 0, 0, NULL ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); if ( ( retval = PAPI_event_code_to_name( PAPI_event, event_name ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); if ( !TESTS_QUIET ) { printf( 
"Thread %#x %s : \t%lld\n", ( int ) pthread_self( ), event_name, ( values[0] )[0] ); printf( "Thread %#x PAPI_TOT_CYC: \t%lld\n", ( int ) pthread_self( ), ( values[0] )[1] ); } retval = PAPI_cleanup_eventset( EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); retval = PAPI_destroy_eventset( &EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); free_test_space( values, num_tests ); PAPI_shutdown( ); } int main( int argc, char **argv ) { int i, rc, retval; pthread_t id[NUM_THREADS]; pthread_attr_t attr; int quiet; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); if (!quiet) { printf( "%s: Using %d threads\n\n", argv[0], NUM_THREADS ); printf( "Does non-threaded overflow work " "with extraneous threads present?\n" ); } pthread_attr_init( &attr ); #ifdef PTHREAD_CREATE_UNDETACHED pthread_attr_setdetachstate( &attr, PTHREAD_CREATE_UNDETACHED ); #endif #ifdef PTHREAD_SCOPE_SYSTEM retval = pthread_attr_setscope( &attr, PTHREAD_SCOPE_SYSTEM ); if ( retval != 0 ) test_skip( __FILE__, __LINE__, "pthread_attr_setscope", retval ); #endif for ( i = 0; i < NUM_THREADS; i++ ) { rc = pthread_create( &id[i], &attr, thread_fn, NULL ); if ( rc ) test_fail( __FILE__, __LINE__, "pthread_create", rc ); } pthread_attr_destroy( &attr ); mainloop( NUM_ITERS ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/overflow_allcounters.c000066400000000000000000000170451502707512200223650ustar00rootroot00000000000000/* * File: overflow_allcounters.c * Author: Haihang You * you@cs.utk.edu * Mods: Vince Weaver * vweaver1@eecs.utk.edu */ /* This file performs the following test: overflow all counters to test availability of overflow of all counters - Start eventset 1 - Do flops - Stop and measure eventset 1 - Set up overflow on eventset 1 - Start eventset 1 - Do flops - Stop eventset 1 */ #include #include #include #include "papi.h" #include "papi_test.h" #include 
"do_loops.h" #define OVER_FMT "handler(%d ) Overflow at %p! bit=%#llx \n" #define OUT_FMT "%-12s : %16lld%16lld\n" static int total = 0; /* total overflows */ void handler( int EventSet, void *address, long long overflow_vector, void *context ) { ( void ) context; if ( !TESTS_QUIET ) { printf( OVER_FMT, EventSet, address, overflow_vector ); } total++; } int main( int argc, char **argv ) { int EventSet = PAPI_NULL; long long *values; int num_flops, retval, i, j; int *events, mythreshold; char **names; const PAPI_hw_info_t *hw_info = NULL; int num_events, *ovt; char name[PAPI_MAX_STR_LEN]; int using_perfmon = 0; int using_aix = 0; int cid; int quiet; long long value; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } hw_info = PAPI_get_hardware_info( ); if ( hw_info == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", retval ); } cid = PAPI_get_component_index("perfmon"); if (cid>=0) using_perfmon = 1; cid = PAPI_get_component_index("aix"); if (cid>=0) using_aix = 1; /* add PAPI_TOT_CYC and one of the events in */ /* PAPI_FP_INS, PAPI_FP_OPS PAPI_TOT_INS, */ /* depending on the availability of the event*/ /* on the platform */ EventSet = enum_add_native_events( &num_events, &events, 1 , 1, 0); if (num_events==0) { if (!quiet) printf("No events found\n"); test_skip(__FILE__,__LINE__,"No events found",0); } if (!quiet) printf("Trying %d events\n",num_events); names = ( char ** ) calloc( ( unsigned int ) num_events, sizeof ( char * ) ); for ( i = 0; i < num_events; i++ ) { if ( PAPI_event_code_to_name( events[i], name ) != PAPI_OK ) { test_fail( __FILE__, __LINE__,"PAPI_event_code_to_name", retval); } else { names[i] = strdup( name ); if (!quiet) printf("%i: %s\n",i,names[i]); } } values = ( long long * ) calloc( ( unsigned int ) ( num_events * ( num_events + 1 ) ), sizeof ( long long ) ); ovt 
= ( int * ) calloc( ( unsigned int ) num_events, sizeof ( int ) ); #if defined(linux) { char *tmp = getenv( "THRESHOLD" ); if ( tmp ) { mythreshold = atoi( tmp ); } else if (hw_info->cpu_max_mhz!=0) { mythreshold = ( int ) hw_info->cpu_max_mhz * 20000; if (!quiet) printf("Using a threshold of %d (20,000 * MHz)\n",mythreshold); } else { if (!quiet) printf("Using default threshold of %d\n",THRESHOLD); mythreshold = THRESHOLD; } } #else mythreshold = THRESHOLD; #endif num_flops = NUM_FLOPS * 2; /* initial test to make sure they all work */ if (!quiet) printf("Testing that the events all work with no overflow\n"); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_flops( num_flops ); retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } /* done with initial test */ /* keep adding events? */ for ( i = 0; i < num_events; i++ ) { /* Enable overflow */ if (!quiet) printf("Testing with overflow set on %s\n", names[i]); retval = PAPI_overflow( EventSet, events[i], mythreshold, 0, handler ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); } retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_flops( num_flops ); retval = PAPI_stop( EventSet, values + ( i + 1 ) * num_events ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } /* Disable overflow */ retval = PAPI_overflow( EventSet, events[i], 0, 0, handler ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); } ovt[i] = total; total = 0; } if ( !quiet ) { printf("\nResults in Matrix-view:\n"); printf( "Test Overflow on %d counters with %d events.\n", num_events,num_events ); printf( "-----------------------------------------------\n" ); printf( "Threshold for overflow is: %d\n", mythreshold ); printf( "Using %d iterations 
of c += a*b\n", num_flops ); printf( "-----------------------------------------------\n" ); printf( "Test type : " ); for ( i = 0; i < num_events + 1; i++ ) { printf( "%16d", i ); } printf( "\n" ); for ( j = 0; j < num_events; j++ ) { printf( "%-27s : ", names[j] ); for ( i = 0; i < num_events + 1; i++ ) { printf( "%16lld", *( values + j + num_events * i ) ); } printf( "\n" ); } printf( "Overflows : %16s", "" ); for ( i = 0; i < num_events; i++ ) { printf( "%16d", ovt[i] ); } printf( "\n" ); printf( "-----------------------------------------------\n" ); } /* validation */ if ( !quiet ) { printf("\nResults broken out for validation\n"); } if (!quiet) { for ( j = 0; j < num_events+1; j++ ) { if (j==0) { printf("Test results, no overflow:\n\t"); } else { printf("Overflow of event %d, %s\n\t",j-1,names[j-1]); } for(i=0; i < num_events; i++) { if (i==j-1) { printf("*%lld* ",values[(num_events*j)+i]); } else { printf("%lld ",values[(num_events*j)+i]); } } printf("\n"); if (j!=0) { printf("\tOverflow should be %lld / %d = %lld\n", values[(num_events*j)+(j-1)], mythreshold, values[(num_events*j)+(j-1)]/mythreshold); printf("\tOverflow was %d\n",ovt[j-1]); } } } for ( j = 0; j < num_events; j++ ) { //printf("Validation: %lld / %d != %d (%lld)\n", // *( values + j + num_events * (j+1) ) , // mythreshold, // ovt[j], // *(values+j+num_events*(j+1))/mythreshold); value = values[j+num_events*(j+1)]; if ( value / mythreshold != ovt[j] ) { char error_string[BUFSIZ]; if ( using_perfmon ) test_warn( __FILE__, __LINE__, "perfmon component handles overflow differently than perf_events", 1 ); else if ( using_aix ) test_warn( __FILE__, __LINE__, "AIX (pmapi) component handles overflow differently than various other components", 1 ); else { sprintf( error_string, "Overflow value differs from expected %lld / %d should be %lld, we got %d", value , mythreshold, value / mythreshold, ovt[j] ); test_fail( __FILE__, __LINE__, error_string, 1 ); } } } retval = PAPI_cleanup_eventset( EventSet ); 
if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); retval = PAPI_destroy_eventset( &EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); free( ovt ); for ( i = 0; i < num_events; i++ ) free( names[i] ); free( names ); free( events ); free( values ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/overflow_force_software.c000066400000000000000000000224461502707512200230430ustar00rootroot00000000000000/* * File: overflow_force_software.c * Author: Kevin London * london@cs.utk.edu * Mods: Maynard Johnson * maynardj@us.ibm.com * Philip Mucci * mucci@cs.utk.edu * Haihang You * you@cs.utk.edu * * */ /* This file performs the following test: overflow dispatch of an eventset with just a single event. Using both Hardware and software overflows The Eventset contains: + PAPI_FP_INS (overflow monitor) - Start eventset 1 - Do flops - Stop and measure eventset 1 - Set up overflow on eventset 1 - Start eventset 1 - Do flops - Stop eventset 1 - Set up forced software overflow on eventset 1 - Start eventset 1 - Do flops - Stop eventset 1 */ #include #include #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #define OVER_FMT "handler(%d) Overflow at %p overflow_vector=%#llx!\n" #define OUT_FMT "%-12s : %16lld%16d%16lld\n" #define SOFT_TOLERANCE 0.90 #define MY_NUM_TESTS 5 static int total[MY_NUM_TESTS] = { 0, }; /* total overflows */ static int use_total = 0; /* which total field to bump */ static long long values[MY_NUM_TESTS] = { 0, }; void handler( int EventSet, void *address, long long overflow_vector, void *context ) { ( void ) context; if ( !TESTS_QUIET ) { fprintf( stderr, OVER_FMT, EventSet, address, overflow_vector ); } total[use_total]++; } int main( int argc, char **argv ) { int EventSet = PAPI_NULL; long long hard_min, hard_max, soft_min, soft_max; int retval; int PAPI_event = 0, mythreshold; char event_name[PAPI_MAX_STR_LEN]; 
PAPI_option_t opt; PAPI_event_info_t info; PAPI_option_t itimer; const PAPI_hw_info_t *hw_info = NULL; tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); /* query and set up the right instruction to monitor */ if ( PAPI_query_event( PAPI_FP_INS ) == PAPI_OK ) { if ( PAPI_query_event( PAPI_FP_INS ) == PAPI_OK ) { PAPI_get_event_info( PAPI_FP_INS, &info ); if ( info.count == 1 || !strcmp( info.derived, "DERIVED_CMPD" ) ) PAPI_event = PAPI_FP_INS; } } if ( PAPI_event == 0 ) { if ( PAPI_query_event( PAPI_FP_OPS ) == PAPI_OK ) { PAPI_get_event_info( PAPI_FP_OPS, &info ); if ( info.count == 1 || !strcmp( info.derived, "DERIVED_CMPD" ) ) PAPI_event = PAPI_FP_OPS; } } if ( PAPI_event == 0 ) { if ( PAPI_query_event( PAPI_TOT_INS ) == PAPI_OK ) { PAPI_get_event_info( PAPI_TOT_INS, &info ); if ( info.count == 1 || !strcmp( info.derived, "DERIVED_CMPD" ) ) PAPI_event = PAPI_TOT_INS; } } if ( PAPI_event == 0 ) test_skip( __FILE__, __LINE__, "No suitable event for this test found!", 0 ); hw_info = PAPI_get_hardware_info( ); if ( hw_info == NULL ) test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 ); if ( PAPI_event == PAPI_FP_INS ) mythreshold = THRESHOLD; else #if defined(linux) mythreshold = ( int ) hw_info->cpu_max_mhz * 20000; #else mythreshold = THRESHOLD * 2; #endif retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); retval = PAPI_add_event( EventSet, PAPI_event ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); retval = PAPI_get_opt( PAPI_COMPONENTINFO, &opt ); if ( retval != PAPI_OK ) test_skip( __FILE__, __LINE__, "Platform does not support Hardware overflow", 0 ); do_stuff( ); /* Do reference count */ retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, 
"PAPI_start", retval ); do_stuff( ); retval = PAPI_stop( EventSet, &values[use_total] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); use_total++; /* Now do hardware overflow reference count */ retval = PAPI_overflow( EventSet, PAPI_event, mythreshold, 0, handler ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_stuff( ); retval = PAPI_stop( EventSet, &values[use_total] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); use_total++; retval = PAPI_overflow( EventSet, PAPI_event, 0, 0, handler ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); /* Now do software overflow reference count, uses SIGPROF */ retval = PAPI_overflow( EventSet, PAPI_event, mythreshold, PAPI_OVERFLOW_FORCE_SW, handler ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_stuff( ); retval = PAPI_stop( EventSet, &values[use_total] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); use_total++; retval = PAPI_overflow( EventSet, PAPI_event, 0, PAPI_OVERFLOW_FORCE_SW, handler ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); /* Now do software overflow with SIGVTALRM */ memset( &itimer, 0, sizeof ( itimer ) ); itimer.itimer.itimer_num = ITIMER_VIRTUAL; itimer.itimer.itimer_sig = SIGVTALRM; if ( PAPI_set_opt( PAPI_DEF_ITIMER, &itimer ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_opt", retval ); retval = PAPI_overflow( EventSet, PAPI_event, mythreshold, PAPI_OVERFLOW_FORCE_SW, handler ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) 
test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_stuff( ); retval = PAPI_stop( EventSet, &values[use_total] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); use_total++; retval = PAPI_overflow( EventSet, PAPI_event, 0, PAPI_OVERFLOW_FORCE_SW, handler ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); /* Now do software overflow with SIGALRM */ memset( &itimer, 0, sizeof ( itimer ) ); itimer.itimer.itimer_num = ITIMER_REAL; itimer.itimer.itimer_sig = SIGALRM; if ( PAPI_set_opt( PAPI_DEF_ITIMER, &itimer ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_opt", retval ); retval = PAPI_overflow( EventSet, PAPI_event, mythreshold, PAPI_OVERFLOW_FORCE_SW, handler ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_stuff( ); retval = PAPI_stop( EventSet, &values[use_total] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); use_total++; retval = PAPI_overflow( EventSet, PAPI_event, 0, PAPI_OVERFLOW_FORCE_SW, handler ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); if ( !TESTS_QUIET ) { if ( ( retval = PAPI_event_code_to_name( PAPI_event, event_name ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); printf ( "Test case: Software overflow of various types with 1 event in set.\n" ); printf ( "------------------------------------------------------------------------------\n" ); printf( "Threshold for overflow is: %d\n", mythreshold ); printf ( "------------------------------------------------------------------------------\n" ); printf( "Test type : %11s%13s%13s%13s%13s\n", "Reference", "Hardware", "ITIMER_PROF", "ITIMER_VIRT", "ITIMER_REAL" ); printf( "%-12s: %11lld%13lld%13lld%13lld%13lld\n", info.symbol, values[0], values[1], values[2], values[3], 
values[4] ); printf( "Overflows : %11d%13d%13d%13d%13d\n", total[0], total[1], total[2], total[3], total[4] ); printf ( "------------------------------------------------------------------------------\n" ); printf( "Verification:\n" ); printf ( "Overflow in Column 2 greater than or equal to overflows in Columns 3, 4, 5\n" ); printf( "Overflow in Columns 3, 4, 5 greater than 0\n" ); } hard_min = ( long long ) ( ( ( double ) values[0] * ( 1.0 - OVR_TOLERANCE ) ) / ( double ) mythreshold ); hard_max = ( long long ) ( ( ( double ) values[0] * ( 1.0 + OVR_TOLERANCE ) ) / ( double ) mythreshold ); soft_min = ( long long ) ( ( ( double ) values[0] * ( 1.0 - SOFT_TOLERANCE ) ) / ( double ) mythreshold ); soft_max = ( long long ) ( ( ( double ) values[0] * ( 1.0 + SOFT_TOLERANCE ) ) / ( double ) mythreshold ); if ( total[1] > hard_max || total[1] < hard_min ) test_fail( __FILE__, __LINE__, "Hardware Overflows outside limits", 1 ); if ( total[2] > soft_max || total[3] > soft_max || total[4] > soft_max ) test_fail( __FILE__, __LINE__, "Software Overflows exceed theoretical maximum", 1 ); if ( total[2] < soft_min || total[3] < soft_min || total[4] < soft_min ) printf( "WARNING: Software Overflow occurring but suspiciously low\n" ); if ( ( total[2] == 0 ) || ( total[3] == 0 ) || ( total[4] == 0 ) ) test_fail( __FILE__, __LINE__, "Software Overflows", 1 ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/overflow_index.c000066400000000000000000000114121502707512200211310ustar00rootroot00000000000000/* * File: overflow_index.c * Author: min@cs.utk.edu * Min Zhou */ /* This file performs the following test: overflow dispatch on 2 counters. */ #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #define OVER_FMT "handler(%d) Overflow at %p! 
vector=%#llx\n" #define OUT_FMT "%-12s : %16lld%16lld\n" #define INDEX_FMT "Overflows vector %#llx: \n" typedef struct { long long mask; int count; } ocount_t; /* there are three possible vectors, one counter overflows, the other counter overflows, both overflow */ static ocount_t overflow_counts[3] = { {0, 0}, {0, 0}, {0, 0} }; static int total_unknown = 0; static void handler( int EventSet, void *address, long long overflow_vector, void *context ) { int i; ( void ) context; if ( !TESTS_QUIET ) { fprintf( stderr, OVER_FMT, EventSet, address, overflow_vector ); } /* Look for the overflow_vector entry */ for ( i = 0; i < 3; i++ ) { if ( overflow_counts[i].mask == overflow_vector ) { overflow_counts[i].count++; return; } } /* Didn't find it so add it. */ for ( i = 0; i < 3; i++ ) { if ( overflow_counts[i].mask == ( long long ) 0 ) { overflow_counts[i].mask = overflow_vector; overflow_counts[i].count = 1; return; } } /* Unknown entry!?! */ total_unknown++; } int main( int argc, char **argv ) { int EventSet = PAPI_NULL; long long ( values[3] )[2]; int retval; int PAPI_event, k, i; char event_name[PAPI_MAX_STR_LEN]; int index_array[2], number; int num_events1, mask1; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depends on the availability of the event on the platform */ EventSet = add_two_nonderived_events( &num_events1, &PAPI_event, &mask1 ); if (num_events1==0) { if (!quiet) printf("Trouble adding events\n"); test_skip(__FILE__,__LINE__,"Adding events",0); } retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet, values[0] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", 
retval ); retval = PAPI_overflow( EventSet, PAPI_event, THRESHOLD, 0, handler ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); retval = PAPI_overflow( EventSet, PAPI_TOT_CYC, THRESHOLD, 0, handler ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet, values[1] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_event_code_to_name( PAPI_event, event_name ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } if (!quiet) { printf( "Test case: Overflow dispatch of 2nd event in set with 2 events.\n" ); printf( "---------------------------------------------------------------\n" ); printf( "Threshold for overflow is: %d\n", THRESHOLD ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf( "-----------------------------------------------\n" ); printf( "Test type : %16d%16d\n", 1, 2 ); printf( OUT_FMT, "PAPI_TOT_CYC", ( values[0] )[0], ( values[1] )[0] ); printf( OUT_FMT, event_name, ( values[0] )[1], ( values[1] )[1] ); } if ( overflow_counts[0].count == 0 && overflow_counts[1].count == 0 ) { test_fail( __FILE__, __LINE__, "one counter had no overflows", 1 ); } for ( k = 0; k < 3; k++ ) { if ( overflow_counts[k].mask ) { number = 2; retval = PAPI_get_overflow_event_index( EventSet, overflow_counts[k].mask, index_array, &number ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_get_overflow_event_index", retval ); } if (!quiet) { printf( INDEX_FMT, ( long long ) overflow_counts[k].mask ); printf( " counts: %d ", overflow_counts[k].count ); for ( i = 0; i < number; i++ ) printf( " Event Index %d ", index_array[i] ); printf( "\n" ); } } } if (!quiet) { printf( "Case 2 %s Overflows: %d\n", "Unknown", total_unknown ); printf( 
"-----------------------------------------------\n" ); } if ( total_unknown > 0 ) { test_fail( __FILE__, __LINE__, "Unknown counter had overflows", 1 ); } retval = PAPI_cleanup_eventset( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/overflow_one_and_read.c000066400000000000000000000072641502707512200224320ustar00rootroot00000000000000/* * File: overflow_one_and_read.c : based on overflow_twoevents.c * Mods: Philip Mucci * mucci@cs.utk.edu * Kevin London * london@cs.utk.edu */ /* This file performs the following test: overflow dispatch on 1 counter. * In the handler read events. */ #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #define OVER_FMT "handler(%d) Overflow at %p! vector=%#llx\n" #define OUT_FMT "%-12s : %16lld%16lld\n" typedef struct { long long mask; int count; } ocount_t; /* there are three possible vectors, one counter overflows, the other counter overflows, both overflow */ /*not used*/ ocount_t overflow_counts[3] = { {0, 0}, {0, 0}, {0, 0} }; /*not used*/ int total_unknown = 0; /*added*/ long long dummyvalues[2]; void handler( int EventSet, void *address, long long overflow_vector, void *context ) { int retval; ( void ) context; if ( !TESTS_QUIET ) { fprintf( stderr, OVER_FMT, EventSet, address, overflow_vector ); } if ( ( retval = PAPI_read( EventSet, dummyvalues ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_read", retval ); if ( !TESTS_QUIET ) { fprintf( stderr, TWO12, dummyvalues[0], dummyvalues[1], "(Reading counters)\n" ); } if ( dummyvalues[1] == 0 ) test_fail( __FILE__, __LINE__, "Total Cycles == 0", 1 ); } int main( int argc, char **argv ) { int EventSet; long long **values = NULL; int retval; int PAPI_event; char event_name[PAPI_MAX_STR_LEN]; int num_events1, mask1; int quiet; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if 
( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depends on the availability of the event on the platform */ /* NOTE: Only adding one overflow on PAPI_event -- no overflow for PAPI_TOT_CYC*/ EventSet = add_two_nonderived_events( &num_events1, &PAPI_event, &mask1 ); if (num_events1==0) { if (!quiet) printf("Trouble adding events\n"); test_skip(__FILE__,__LINE__,"Adding event",1); } values = allocate_test_space( 2, num_events1 ); if ( ( retval = PAPI_event_code_to_name( PAPI_event, event_name ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet, values[0] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_overflow( EventSet, PAPI_event, THRESHOLD, 0, handler ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet, values[1] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); remove_test_events( &EventSet, mask1 ); if ( !TESTS_QUIET ) { printf ( "Test case: Overflow dispatch of 1st event in set with 2 events.\n" ); printf ( "---------------------------------------------------------------\n" ); printf( "Threshold for overflow is: %d\n", THRESHOLD ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf( "-----------------------------------------------\n" ); printf( "Test type : %16d%16d\n", 1, 2 ); printf( OUT_FMT, event_name, ( values[0] )[0], ( values[1] )[0] ); printf( OUT_FMT, "PAPI_TOT_CYC", ( values[0] )[1], ( values[1] )[1] ); } test_pass( __FILE__ ); return 0; 
} papi-papi-7-2-0-t/src/ctests/overflow_pthreads.c000066400000000000000000000144151502707512200216420ustar00rootroot00000000000000/* This file performs the following test: overflow dispatch with pthreads - This tests the dispatch of overflow calls from PAPI. These are counted in the default counting domain and default granularity, depending on the platform. Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). The Eventset contains: + PAPI_FP_INS (overflow monitor) + PAPI_TOT_CYC - Set up overflow - Start eventset 1 - Do flops - Stop eventset 1 */ #include #include #include #include #include "papi.h" #include "do_loops.h" #include "papi_test.h" static const PAPI_hw_info_t *hw_info = NULL; static int total[NUM_THREADS]; static int expected[NUM_THREADS]; static pthread_t myid[NUM_THREADS]; static void handler( int EventSet, void *address, long long overflow_vector, void *context ) { #if 0 printf( "handler(%d,%#lx,%llx) Overflow %d in thread %lx\n", EventSet, ( unsigned long ) address, overflow_vector, total[EventSet], PAPI_thread_id( ) ); printf( "%lx vs %lx\n", myid[EventSet], PAPI_thread_id( ) ); #else /* eliminate unused parameter warning message */ ( void ) address; ( void ) overflow_vector; ( void ) context; #endif total[EventSet]++; } static long long mythreshold=0; static void * Thread( void *arg ) { int retval, num_tests = 1; int EventSet1 = PAPI_NULL; int mask1, papi_event; int num_events1; long long **values; long long elapsed_us, elapsed_cyc; char event_name[PAPI_MAX_STR_LEN]; retval = PAPI_register_thread( ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_register_thread", retval ); /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depends on the availability of the event on the platform */ EventSet1 = add_two_nonderived_events( &num_events1, &papi_event, &mask1 ); if (EventSet1 < 0) return NULL; /* Wait, we're indexing a per-thread array with the EventSet number? 
*/ /* does that make any sense at all???? -- vmw */ expected[EventSet1] = *( int * ) arg / mythreshold; myid[EventSet1] = PAPI_thread_id( ); values = allocate_test_space( num_tests, num_events1 ); elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); if ((retval = PAPI_overflow( EventSet1, papi_event, mythreshold, 0, handler ) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); } /* start_timer(1); */ if ( ( retval = PAPI_start( EventSet1 ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_stuff( ); if ( ( retval = PAPI_stop( EventSet1, values[0] ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; retval = PAPI_overflow( EventSet1, papi_event, 0, 0, NULL ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); } remove_test_events( &EventSet1, mask1 ); retval = PAPI_event_code_to_name( papi_event, event_name ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } if ( !TESTS_QUIET ) { printf( "Thread %#x %s : \t%lld\n", ( int ) pthread_self( ), event_name, ( values[0] )[0] ); printf( "Thread %#x PAPI_TOT_CYC: \t%lld\n", ( int ) pthread_self( ), ( values[0] )[1] ); printf( "Thread %#x Real usec : \t%lld\n", ( int ) pthread_self( ), elapsed_us ); printf( "Thread %#x Real cycles : \t%lld\n", ( int ) pthread_self( ), elapsed_cyc ); } free_test_space( values, num_tests ); retval = PAPI_unregister_thread( ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_unregister_thread", retval ); return ( NULL ); } int main( int argc, char **argv ) { pthread_t id[NUM_THREADS]; int flops[NUM_THREADS]; int i, rc, retval; pthread_attr_t attr; float ratio; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); memset( total, 0x0, NUM_THREADS * sizeof ( *total ) ); memset( expected, 0x0, NUM_THREADS * 
sizeof ( *expected ) ); memset( myid, 0x0, NUM_THREADS * sizeof ( *myid ) ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } hw_info = PAPI_get_hardware_info( ); if ( hw_info == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 ); } retval = PAPI_thread_init( ( unsigned long ( * )( void ) ) ( pthread_self ) ); if (retval != PAPI_OK ) { if ( retval == PAPI_ECMP ) test_skip( __FILE__, __LINE__, "PAPI_thread_init", retval ); else test_fail( __FILE__, __LINE__, "PAPI_thread_init", retval ); } #if defined(linux) mythreshold = ((long long)hw_info->cpu_max_mhz) * 10000 * 2; #else mythreshold = THRESHOLD * 2; #endif pthread_attr_init( &attr ); #ifdef PTHREAD_CREATE_UNDETACHED pthread_attr_setdetachstate( &attr, PTHREAD_CREATE_UNDETACHED ); #endif #ifdef PTHREAD_SCOPE_SYSTEM retval = pthread_attr_setscope( &attr, PTHREAD_SCOPE_SYSTEM ); if ( retval != 0 ) test_skip( __FILE__, __LINE__, "pthread_attr_setscope", retval ); #endif for ( i = 0; i < NUM_THREADS; i++ ) { flops[i] = NUM_FLOPS * ( i + 1 ); rc = pthread_create( &id[i], &attr, Thread, ( void * ) &flops[i] ); if ( rc ) test_fail( __FILE__, __LINE__, "pthread_create", PAPI_ESYS ); } for ( i = 0; i < NUM_THREADS; i++ ) pthread_join( id[i], NULL ); pthread_attr_destroy( &attr ); { long long t = 0, r = 0; for ( i = 0; i < NUM_THREADS; i++ ) { t += ( NUM_FLOPS * ( i + 1 ) ) / mythreshold; r += total[i]; } if (!quiet) { printf( "Expected total overflows: %lld\n", t ); printf( "Received total overflows: %lld\n", r ); } } // FIXME: are we actually testing this properly? 
/* ratio = (float)total[0] / (float)expected[0]; */ /* printf("Ratio of total to expected: %f\n",ratio); */ ratio = 1.0; for ( i = 0; i < NUM_THREADS; i++ ) { if (!quiet) printf( "Overflows thread %d: %d, expected %d\n", i, total[i], ( int ) ( ratio * ( float ) expected[i] ) ); } for ( i = 0; i < NUM_THREADS; i++ ) { if ( total[i] < ( int ) ( ( ratio * ( float ) expected[i] ) / 2.0 ) ) test_fail( __FILE__, __LINE__, "not enough overflows", PAPI_EMISC ); } test_pass( __FILE__ ); pthread_exit( NULL ); return 0; } papi-papi-7-2-0-t/src/ctests/overflow_single_event.c000066400000000000000000000123541502707512200225120ustar00rootroot00000000000000/* * File: overflow_single_event.c * Author: Philip Mucci * mucci@cs.utk.edu */ /* This file performs the following test: overflow dispatch of an eventset with just a single event. The Eventset contains: + PAPI_FP_INS (overflow monitor) - Start eventset 1 - Do flops - Stop and measure eventset 1 - Set up overflow on eventset 1 - Start eventset 1 - Do flops - Stop eventset 1 */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include "papi.h" #include "papi_test.h" #include "do_loops.h" #define OVER_FMT "handler(%d) Overflow at %p overflow_vector=%#llx!\n" #define OUT_FMT "%-12s : %16lld%16lld\n" static int total = 0; /* total overflows */ void handler( int EventSet, void *address, long long overflow_vector, void *context ) { ( void ) context; if ( !TESTS_QUIET ) { fprintf( stderr, OVER_FMT, EventSet, address, overflow_vector ); } total++; } int main( int argc, char **argv ) { int EventSet = PAPI_NULL; long long values[2] = { 0, 0 }; long long min, max; int num_flops = NUM_FLOPS, retval; int PAPI_event = 0, mythreshold; char event_name[PAPI_MAX_STR_LEN]; const PAPI_hw_info_t *hw_info = NULL; int quiet; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } hw_info = PAPI_get_hardware_info( ); if
( hw_info == NULL ) test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 ); /* Ugh */ if ( ( !strncmp( hw_info->model_string, "UltraSPARC", 10 ) && !( strncmp( hw_info->vendor_string, "SUN", 3 ) ) ) || ( !strncmp( hw_info->model_string, "AMD K7", 6 ) ) || ( !strncmp( hw_info->vendor_string, "Cray", 4 ) ) || ( strstr( hw_info->model_string, "POWER3" ) ) ) { /* query and set up the right instruction to monitor */ if ( PAPI_query_event( PAPI_TOT_INS ) == PAPI_OK ) { PAPI_event = PAPI_TOT_INS; } else { test_fail( __FILE__, __LINE__, "PAPI_TOT_INS not available on this Sun platform!", 0 ); } } else { /* query and set up the right instruction to monitor */ PAPI_event = find_nonderived_event( ); } if (PAPI_event==0) { if (!quiet) printf("Trouble adding event\n"); test_skip(__FILE__,__LINE__,"Event trouble",1); } if (( PAPI_event == PAPI_FP_OPS ) || ( PAPI_event == PAPI_FP_INS )) { mythreshold = THRESHOLD; } else { #if defined(linux) mythreshold = ( int ) hw_info->cpu_max_mhz * 20000; #else mythreshold = THRESHOLD * 2; #endif } retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } retval = PAPI_add_event( EventSet, PAPI_event ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); } retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet, &values[0] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } retval = PAPI_overflow( EventSet, PAPI_event, mythreshold, 0, handler ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); } retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet, &values[1] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", 
retval ); } /* double ugh */ #if defined(linux) || defined(__ia64__) || defined(_POWER4) num_flops *= 2; #endif if ( !quiet ) { if ( ( retval = PAPI_event_code_to_name( PAPI_event, event_name ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); printf ( "Test case: Overflow dispatch of 1st event in set with 1 event.\n" ); printf ( "--------------------------------------------------------------\n" ); printf( "Threshold for overflow is: %d\n", mythreshold ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf( "-----------------------------------------------\n" ); printf( "Test type : %16d%16d\n", 1, 2 ); printf( OUT_FMT, event_name, values[0], values[1] ); printf( "Overflows : %16s%16d\n", "", total ); printf( "-----------------------------------------------\n" ); printf( "Verification:\n" ); /* if (PAPI_event == PAPI_FP_INS) printf("Row 1 approximately equals %d %d\n", num_flops, num_flops); printf("Column 1 approximately equals column 2\n"); */ printf( "Row 3 approximately equals %u +- %u %%\n", ( unsigned ) ( ( values[0] ) / ( long long ) mythreshold ), ( unsigned ) ( OVR_TOLERANCE * 100.0 ) ); } /* min = (long long)(values[0]*(1.0-TOLERANCE)); max = (long long)(values[0]*(1.0+TOLERANCE)); if ( values[1] > max || values[1] < min ) test_fail(__FILE__, __LINE__, event_name, 1); */ min = ( long long ) ( ( ( double ) values[0] * ( 1.0 - OVR_TOLERANCE ) ) / ( double ) mythreshold ); max = ( long long ) ( ( ( double ) values[0] * ( 1.0 + OVR_TOLERANCE ) ) / ( double ) mythreshold ); if ( total > max || total < min ) test_fail( __FILE__, __LINE__, "Overflows", 1 ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/overflow_twoevents.c000066400000000000000000000205721502707512200220670ustar00rootroot00000000000000/* * File: overflow_twoevents.c * Author: min@cs.utk.edu * Min Zhou * Mods: Philip Mucci * mucci@cs.utk.edu */ /* This file performs the following test: overflow dispatch on 2 counters. 
*/ #include <stdio.h> #include <stdlib.h> #include <string.h> #include "papi.h" #include "papi_test.h" #include "do_loops.h" #define OVER_FMT "handler(%d) Overflow at %p! vector=%#llx\n" #define OUT_FMT "%-12s : %18lld%18lld%18lld\n" #define VEC_FMT " at vector %#llx, event %-12s : %6d\n" typedef struct { long long mask; int count; } ocount_t; /* there are two experiments: batch and interleaf; for each experiment there are three possible vectors, one counter overflows, the other counter overflows, both overflow */ static ocount_t overflow_counts[2][3] = { {{0, 0}, {0, 0}, {0, 0}}, {{0, 0}, {0, 0}, {0, 0}} }; static int total_unknown = 0; static void handler( int mode, void *address, long long overflow_vector, void *context ) { int i; ( void ) context; /* unused */ if ( !TESTS_QUIET ) { fprintf( stderr, OVER_FMT, mode, address, overflow_vector ); } /* Look for the overflow_vector entry */ for ( i = 0; i < 3; i++ ) { if ( overflow_counts[mode][i].mask == overflow_vector ) { overflow_counts[mode][i].count++; return; } } /* Didn't find it so add it. */ for ( i = 0; i < 3; i++ ) { if ( overflow_counts[mode][i].mask == ( long long ) 0 ) { overflow_counts[mode][i].mask = overflow_vector; overflow_counts[mode][i].count = 1; return; } } /* Unknown entry!?!
*/ total_unknown++; } static void handler_batch( int EventSet, void *address, long long overflow_vector, void *context ) { ( void ) EventSet; /*unused */ handler( 0, address, overflow_vector, context ); } static void handler_interleaf( int EventSet, void *address, long long overflow_vector, void *context ) { ( void ) EventSet; /*unused */ handler( 1, address, overflow_vector, context ); } int main( int argc, char **argv ) { int EventSet = PAPI_NULL; long long ( values[3] )[2]; int retval; int PAPI_event, k, idx[4]; char event_name[3][PAPI_MAX_STR_LEN]; int num_events1; int threshold = THRESHOLD; int quiet; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } retval = PAPI_create_eventset( &EventSet ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } /* decide which of PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS to add, depending on the availability and derived status of the event on this platform */ if ( ( PAPI_event = find_nonderived_event( ) ) == 0 ) { if (!quiet) printf("No events found!\n"); test_skip( __FILE__, __LINE__, "no PAPI_event", 0 ); } if ( ( retval = PAPI_add_event( EventSet, PAPI_event ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); if ( ( retval = PAPI_add_event( EventSet, PAPI_TOT_CYC ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); if ( ( retval = PAPI_stop( EventSet, values[0] ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); /* Set both overflows after adding both events (batch) */ retval = PAPI_overflow( EventSet, PAPI_event, threshold, 0, handler_batch ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); } 
retval = PAPI_overflow( EventSet, PAPI_TOT_CYC, threshold, 0, handler_batch ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); } if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet, values[1] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); num_events1 = 1; retval = PAPI_get_overflow_event_index( EventSet, 1, &idx[0], &num_events1 ); if ( retval != PAPI_OK ) { printf( "PAPI_get_overflow_event_index error: %s\n", PAPI_strerror( retval ) ); } num_events1 = 1; retval = PAPI_get_overflow_event_index( EventSet, 2, &idx[1], &num_events1 ); if ( retval != PAPI_OK ) { printf( "PAPI_get_overflow_event_index error: %s\n", PAPI_strerror( retval ) ); } if ( ( retval = PAPI_cleanup_eventset( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); /* Add each event and set its overflow (interleaved) */ if ( ( retval = PAPI_add_event( EventSet, PAPI_event ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); if ( ( retval = PAPI_overflow( EventSet, PAPI_event, threshold, 0, handler_interleaf ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); if ( ( retval = PAPI_add_event( EventSet, PAPI_TOT_CYC ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); if ( ( retval = PAPI_overflow( EventSet, PAPI_TOT_CYC, threshold, 0, handler_interleaf ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); if ( ( retval = PAPI_stop( EventSet, values[2] ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); num_events1 = 1; retval = PAPI_get_overflow_event_index( EventSet, 1, &idx[2], &num_events1 ); if ( retval != PAPI_OK ) { printf( 
"PAPI_get_overflow_event_index error: %s\n", PAPI_strerror( retval ) ); } num_events1 = 1; retval = PAPI_get_overflow_event_index( EventSet, 2, &idx[3], &num_events1 ); if ( retval != PAPI_OK ) { printf( "PAPI_get_overflow_event_index error: %s\n", PAPI_strerror( retval ) ); } if ( ( retval = PAPI_cleanup_eventset( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); retval = PAPI_event_code_to_name( PAPI_event, event_name[0] ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } retval = PAPI_event_code_to_name( PAPI_TOT_CYC, event_name[1] ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } strcpy( event_name[2], "Unknown" ); if (!TESTS_QUIET) { printf( "Test case: Overflow dispatch of both events in set with 2 events.\n" ); printf( "---------------------------------------------------------------\n" ); printf( "Threshold for overflow is: %d\n", threshold ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf( "-----------------------------------------------\n" ); printf( "Test type : %18s%18s%18s\n", "1 (no overflow)", "2 (batch)", "3 (interleaf)" ); printf( OUT_FMT, event_name[0], ( values[0] )[0], ( values[1] )[0], ( values[2] )[0] ); printf( OUT_FMT, event_name[1], ( values[0] )[1], ( values[1] )[1], ( values[2] )[1] ); printf( "\n" ); printf( "Predicted overflows at event %-12s : %6d\n", event_name[0], ( int ) ( ( values[0] )[0] / threshold ) ); printf( "Predicted overflows at event %-12s : %6d\n", event_name[1], ( int ) ( ( values[0] )[1] / threshold ) ); printf( "\nBatch overflows (add, add, over, over):\n" ); for ( k = 0; k < 2; k++ ) { if ( overflow_counts[0][k].mask ) { printf( VEC_FMT, ( long long ) overflow_counts[0][k].mask, event_name[idx[k]], overflow_counts[0][k].count ); } } printf( "\nInterleaved overflows (add, over, add, over):\n" ); for ( k = 0; k < 2; k++ ) { if ( overflow_counts[1][k].mask ) printf( 
VEC_FMT, ( long long ) overflow_counts[1][k].mask, event_name[idx[k + 2]], overflow_counts[1][k].count ); } printf( "\nCases 2+3 Unknown overflows: %d\n", total_unknown ); printf( "-----------------------------------------------\n" ); } if ( overflow_counts[0][0].count == 0 || overflow_counts[0][1].count == 0 ) test_fail( __FILE__, __LINE__, "a batch counter had no overflows", 1 ); if ( overflow_counts[1][0].count == 0 || overflow_counts[1][1].count == 0 ) test_fail( __FILE__, __LINE__, "an interleaved counter had no overflows", 1 ); if ( total_unknown > 0 ) test_fail( __FILE__, __LINE__, "Unknown counter had overflows", 1 ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/overflow_values.c000066400000000000000000000117141502707512200213260ustar00rootroot00000000000000/* * File: overflow_values.c * CVS: $Id$ * Author: Harald Servat * harald@cepba.upc.edu * Mods: * */ /* This file performs the following test: overflow values check The Eventset contains: + PAPI_TOT_INS (overflow monitor) + PAPI_TOT_CYC + PAPI_L1_DCM - Start eventset - Read and report event counts mod 1000 - report overflow event counts - visually inspect for consistency - Stop eventset */ #include "papi_test.h" #define OVRFLOW 5000000 #define LOWERFLOW (OVRFLOW - (OVRFLOW/100)) #define UPPERFLOW (OVRFLOW/100) #define ERRORFLOW (UPPERFLOW/5) static long long ovrflow = 0; void handler( int EventSet, void *address, long long overflow_vector, void *context ) { int ret; int i; long long vals[8] = { 0, 0, 0, 0, 0, 0, 0, 0 }; printf( "\nOverflow at %p! 
bit=%#llx \n", address, overflow_vector ); ret = PAPI_read( EventSet, vals ); printf( "Overflow read vals :" ); for ( i = 0; i < 3 /* 8 */ ; i++ ) printf( "%lld ", vals[i] ); printf( "\n\n" ); ovrflow = vals[0]; } int main( int argc, char *argv[] ) { int EventSet = PAPI_NULL; int retval, i, dash = 0, evt3 = PAPI_L1_DCM; PAPI_option_t options; PAPI_option_t options2; const PAPI_hw_info_t *hwinfo; long long lwrflow = 0, error, max_error = 0; long long vals[8] = { 0, 0, 0, 0, 0, 0, 0, 0 }; tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT && retval > 0 ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); retval = PAPI_get_opt( PAPI_HWINFO, &options ); if ( retval < 0 ) test_fail( __FILE__, __LINE__, "PAPI_get_opt", retval ); printf( "ovf_info = %d (%#x)\n", options.ovf_info.type, options.ovf_info.type ); retval = PAPI_get_opt( PAPI_SUBSTRATEINFO, &options2 ); if ( retval < 0 ) test_fail( __FILE__, __LINE__, "PAPI_get_opt", retval ); printf( "sub_info->hardware_intr = %d\n\n", options2.sub_info->hardware_intr ); if ( ( hwinfo = PAPI_get_hardware_info( ) ) == NULL ) test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", PAPI_EMISC ); printf( "Architecture %s, %d\n", hwinfo->model_string, hwinfo->model ); /* processing exceptions is a pain */ #if ((defined(linux) && (defined(__i386__) || (defined __x86_64__))) ) if ( !strncmp( hwinfo->model_string, "Intel Pentium 4", 15 ) ) { evt3 = PAPI_L2_TCM; } else if ( !strncmp( hwinfo->model_string, "AMD K7", 6 ) ) { /* do nothing */ } else if ( !strncmp( hwinfo->model_string, "AMD K8", 6 ) ) { /* do nothing */ } else if ( !strncmp( hwinfo->model_string, "Intel Core", 10 ) ) { evt3 = 0; } else evt3 = 0; /* for default PIII */ #endif retval = PAPI_create_eventset( &EventSet ); if ( retval < 0 ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); retval = PAPI_add_event( EventSet, PAPI_TOT_INS ); if ( retval < 0 ) 
test_fail( __FILE__, __LINE__, "PAPI_add_event:PAPI_TOT_INS", retval ); retval = PAPI_add_event( EventSet, PAPI_TOT_CYC ); if ( retval < 0 ) test_fail( __FILE__, __LINE__, "PAPI_add_event:PAPI_TOT_CYC", retval ); if ( evt3 ) { retval = PAPI_add_event( EventSet, evt3 ); if ( retval < 0 ) test_fail( __FILE__, __LINE__, "PAPI_add_event:evt3", retval ); } retval = PAPI_overflow( EventSet, PAPI_TOT_INS, OVRFLOW, 0, handler ); if ( retval < 0 ) test_fail( __FILE__, __LINE__, "PAPI_overflow", retval ); retval = PAPI_start( EventSet ); if ( retval < 0 ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); for ( i = 0; i < 1000000; i++ ) { if ( i % 1000 == 0 ) { int i; PAPI_read( EventSet, vals ); if ( vals[0] % OVRFLOW > LOWERFLOW || vals[0] % OVRFLOW < UPPERFLOW ) { dash = 0; printf( "Main loop read vals :" ); for ( i = 0; i < 3 /* 8 */ ; i++ ) printf( "%lld ", vals[i] ); printf( "\n" ); if ( ovrflow ) { error = ovrflow - ( lwrflow + vals[0] ) / 2; printf( "Difference: %lld\n", error ); ovrflow = 0; if ( error < 0 ) error = -error; if ( error > max_error ) max_error = error; } lwrflow = vals[0]; } else if ( vals[0] % OVRFLOW > UPPERFLOW && !dash ) { dash = 1; printf( "---------------------\n" ); } } } retval = PAPI_stop( EventSet, vals ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_cleanup_eventset( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); retval = PAPI_destroy_eventset( &EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); printf( "Verification:\n" ); printf ( "Maximum absolute difference between overflow value\nand adjacent measured values is: %lld\n", max_error ); if ( max_error >= ERRORFLOW ) { printf( "This exceeds the error limit: %d\n", ERRORFLOW ); test_fail( __FILE__, __LINE__, "Overflows", 1 ); } printf( "This is within the error limit: %d\n", ERRORFLOW ); test_pass( __FILE__ ); return 0; }
papi-papi-7-2-0-t/src/ctests/p4_lst_ins.c000066400000000000000000000170521502707512200201630ustar00rootroot00000000000000/* This code demonstrates the behavior of PAPI_LD_INS, PAPI_SR_INS and PAPI_LST_INS on a Pentium 4 processor. Because of the way these events are implemented in hardware, LD and SR cannot be counted in the presence of either of the other two events. */ #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" int main( int argc, char **argv ) { int retval, num_tests = 6, tmp; long long **values; int EventSet = PAPI_NULL; const PAPI_hw_info_t *hw_info; tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); hw_info = PAPI_get_hardware_info( ); if ( hw_info == NULL ) test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 ); if ( hw_info->vendor == PAPI_VENDOR_INTEL ) { /* Check for Pentium4 */ if ( hw_info->cpuid_family != 15 ) { test_skip( __FILE__, __LINE__, "This test is intended only for Pentium 4.", 1 ); } } else { test_skip( __FILE__, __LINE__, "This test is intended only for Pentium 4.", 1 ); } retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); values = allocate_test_space( num_tests, 2 ); /* First test: just PAPI_LD_INS */ retval = PAPI_add_event( EventSet, PAPI_LD_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event: PAPI_LD_INS", retval ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS / 10 ); retval = PAPI_stop( EventSet, values[0] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_remove_event( EventSet, PAPI_LD_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_remove_event: PAPI_LD_INS", retval ); /* Second 
test: just PAPI_SR_INS */ retval = PAPI_add_event( EventSet, PAPI_SR_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event: PAPI_SR_INS", retval ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS / 10 ); retval = PAPI_stop( EventSet, values[1] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_remove_event( EventSet, PAPI_SR_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_remove_event: PAPI_SR_INS", retval ); /* Third test: just PAPI_LST_INS */ retval = PAPI_add_event( EventSet, PAPI_LST_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event: PAPI_LST_INS", retval ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS / 10 ); retval = PAPI_stop( EventSet, values[2] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); /* Fourth test: PAPI_LST_INS and PAPI_LD_INS */ retval = PAPI_add_event( EventSet, PAPI_LD_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event: PAPI_LD_INS", retval ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS / 10 ); retval = PAPI_stop( EventSet, values[3] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_remove_event( EventSet, PAPI_LD_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_remove_event: PAPI_LD_INS", retval ); /* Fifth test: PAPI_LST_INS and PAPI_SR_INS */ retval = PAPI_add_event( EventSet, PAPI_SR_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event: PAPI_SR_INS", retval ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS / 10 ); retval = 
PAPI_stop( EventSet, values[4] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_remove_event( EventSet, PAPI_SR_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_remove_event: PAPI_SR_INS", retval ); retval = PAPI_remove_event( EventSet, PAPI_LST_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_remove_event: PAPI_LST_INS", retval ); /* Sixth test: PAPI_LD_INS and PAPI_SR_INS */ retval = PAPI_add_event( EventSet, PAPI_LD_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event: PAPI_LD_INS", retval ); retval = PAPI_add_event( EventSet, PAPI_SR_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event: PAPI_SR_INS", retval ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS / 10 ); retval = PAPI_stop( EventSet, values[5] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_remove_event( EventSet, PAPI_LD_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_remove_event: PAPI_LD_INS", retval ); retval = PAPI_remove_event( EventSet, PAPI_SR_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_remove_event: PAPI_SR_INS", retval ); if ( !TESTS_QUIET ) { printf( "Pentium 4 Load / Store tests.\n" ); printf ( "These PAPI events are counted by setting a tag at the front of the pipeline,\n" ); printf ( "and counting tags at the back of the pipeline. All the tags are the same 'color'\n" ); printf ( "and can't be distinguished from each other. 
Therefore, PAPI_LD_INS and PAPI_SR_INS\n" ); printf ( "cannot be counted with the other two events, or the answer will always == PAPI_LST_INS.\n" ); printf ( "-------------------------------------------------------------------------------------------\n" ); tmp = PAPI_get_opt( PAPI_DEFDOM, NULL ); printf( "Default domain is: %d (%s)\n", tmp, stringify_all_domains( tmp ) ); tmp = PAPI_get_opt( PAPI_DEFGRN, NULL ); printf( "Default granularity is: %d (%s)\n", tmp, stringify_granularity( tmp ) ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS / 10 ); printf ( "-------------------------------------------------------------------------------------------\n" ); printf ( "Test: 1 2 3 4 5 6\n" ); printf( "%s %12lld %12s %12s %12lld %12s %12lld\n", "PAPI_LD_INS: ", ( values[0] )[0], "------", "------", ( values[3] )[1], "------", ( values[5] )[0] ); printf( "%s %12s %12lld %12s %12s %12lld %12lld\n", "PAPI_SR_INS: ", "------", ( values[1] )[0], "------", "------", ( values[4] )[1], ( values[5] )[1] ); printf( "%s %12s %12s %12lld %12lld %12lld %12s\n", "PAPI_LST_INS:", "------", "------", ( values[2] )[0], ( values[3] )[0], ( values[4] )[0], "------" ); printf ( "-------------------------------------------------------------------------------------------\n" ); printf( "Test 1: PAPI_LD_INS only.\n" ); printf( "Test 2: PAPI_SR_INS only.\n" ); printf( "Test 3: PAPI_LST_INS only.\n" ); printf( "Test 4: PAPI_LD_INS and PAPI_LST_INS.\n" ); printf( "Test 5: PAPI_SR_INS and PAPI_LST_INS.\n" ); printf( "Test 6: PAPI_LD_INS and PAPI_SR_INS.\n" ); printf ( "Verification: Values within each column should be the same.\n" ); printf( " R3C3 ~= (R1C1 + R2C2) ~= all other entries.\n" ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/pernode.c000066400000000000000000000060101502707512200175310ustar00rootroot00000000000000/* This file performs the following test: - make an event set with PAPI_TOT_INS and PAPI_TOT_CYC. 
- enable per node counting - enable full domain counting - sleeps for 5 seconds - print the results */ #include #include #include #include #include #include #include #include "papi_test.h" int main( ) { int ncpu, nctr, i, actual_domain; int retval; int EventSet = PAPI_NULL; long long *values; long long elapsed_us, elapsed_cyc; PAPI_option_t options; retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { fprintf( stderr, "Library mismatch: code %d, library %d\n", retval, PAPI_VER_CURRENT ); exit( 1 ); } if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) exit( 1 ); /* Set the domain as high as it will go. */ options.domain.eventset = EventSet; options.domain.domain = PAPI_DOM_ALL; retval = PAPI_set_opt( PAPI_DOMAIN, &options ); if ( retval != PAPI_OK ) exit( 1 ); actual_domain = options.domain.domain; /* This should only happen to an empty eventset */ options.granularity.eventset = EventSet; options.granularity.granularity = PAPI_GRN_SYS_CPU; retval = PAPI_set_opt( PAPI_GRANUL, &options ); if ( retval != PAPI_OK ) exit( 1 ); /* Malloc the output array */ ncpu = PAPI_get_opt( PAPI_MAX_CPUS, NULL ); nctr = PAPI_get_opt( PAPI_MAX_HWCTRS, NULL ); values = ( long long * ) malloc( ncpu * nctr * sizeof ( long long ) ); memset( values, 0x0, ( ncpu * nctr * sizeof ( long long ) ) ); /* Add the counters */ if ( PAPI_add_event( EventSet, PAPI_TOT_CYC ) != PAPI_OK ) exit( 1 ); if ( PAPI_add_event( EventSet, PAPI_TOT_INS ) != PAPI_OK ) exit( 1 ); elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) exit( 1 ); sleep( 5 ); retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) exit( 1 ); elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; printf( "Test case: per node\n" ); printf( "-------------------\n\n" ); printf( "This machine has %d cpus, each with %d counters.\n", ncpu, nctr ); printf( "Test case asked for: 
PAPI_DOM_ALL\n" ); printf( "Test case got: " ); if ( actual_domain & PAPI_DOM_USER ) printf( "PAPI_DOM_USER " ); if ( actual_domain & PAPI_DOM_KERNEL ) printf( "PAPI_DOM_KERNEL " ); if ( actual_domain & PAPI_DOM_OTHER ) printf( "PAPI_DOM_OTHER " ); printf( "\n" ); for ( i = 0; i < ncpu; i++ ) { printf( "CPU %d\n", i ); printf( "PAPI_TOT_CYC: \t%lld\n", values[0 + i * nctr] ); printf( "PAPI_TOT_INS: \t%lld\n", values[1 + i * nctr] ); } printf ( "\n-------------------------------------------------------------------------\n" ); printf( "Real usec : \t%lld\n", elapsed_us ); printf( "Real cycles : \t%lld\n", elapsed_cyc ); printf ( "-------------------------------------------------------------------------\n" ); free( values ); PAPI_shutdown( ); exit( 0 ); } papi-papi-7-2-0-t/src/ctests/prof_utils.c000066400000000000000000000232461502707512200202750ustar00rootroot00000000000000/* * File: prof_utils.c * Author: Dan Terpstra * terpstra@cs.utk.edu */ /* This file contains utility functions useful for all profiling tests It can be used by: - profile.c, - sprofile.c, - profile_pthreads.c, - profile_twoevents.c, - earprofile.c, - future profiling tests. */ #include #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #include "prof_utils.h" /* variables global to profiling tests */ long long **values; char event_name[PAPI_MAX_STR_LEN]; int PAPI_event; int EventSet = PAPI_NULL; void *profbuf[5]; /* Many profiling tests count one of {FP_INS, FP_OPS, TOT_INS} and TOT_CYC. This function creates an event set containing the appropriate pair of events. It also initializes the global event_name string to the event selected. Assumed globals: EventSet, PAPI_event, event_name. 
*/ int prof_events( int num_tests) { int retval; int num_events, mask; /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depending on the availability of the event on the platform */ EventSet = add_two_nonderived_events( &num_events, &PAPI_event, &mask ); if (num_events==0) { return 0; } values = allocate_test_space( num_tests, num_events ); retval = PAPI_event_code_to_name( PAPI_event, event_name ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } return mask; } /* This function displays info from the prginfo structure in a standardized format. */ void prof_print_address( const char *title, const PAPI_exe_info_t * prginfo ) { printf( "%s\n", title ); printf ( "----------------------------------------------------------------\n" ); printf( "Text start: %p, Text end: %p, Text length: %#x\n", prginfo->address_info.text_start, prginfo->address_info.text_end, ( unsigned int ) ( prginfo->address_info.text_end - prginfo->address_info.text_start ) ); printf( "Data start: %p, Data end: %p\n", prginfo->address_info.data_start, prginfo->address_info.data_end ); printf( "BSS start : %p, BSS end : %p\n", prginfo->address_info.bss_start, prginfo->address_info.bss_end ); printf ( "----------------------------------------------------------------\n" ); } /* This function displays profiling information useful for several profile tests. It (probably inappropriately) assumes use of a common THRESHOLD. This should probably be a passed parameter. Assumed globals: event_name, start, stop. */ void prof_print_prof_info( vptr_t start, vptr_t end, int threshold, char *event_name ) { printf( "Profiling event : %s\n", event_name ); printf( "Profile Threshold: %d\n", threshold ); printf( "Profile Iters : %d\n", ( getenv( "NUM_ITERS" ) ?
atoi( getenv( "NUM_ITERS" ) ) : NUM_ITERS ) ); printf( "Profile Range : %p to %p\n", start, end ); printf ( "----------------------------------------------------------------\n" ); printf( "\n" ); } /* Most profile tests begin by counting the eventset with no profiling enabled. This function does that work. It assumes that the 'work' routine is do_both(). A better implementation would pass a pointer to the work function. Assumed globals: EventSet, values, event_name. */ void do_no_profile( int quiet ) { int retval; if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( getenv( "NUM_ITERS" ) ? atoi( getenv( "NUM_ITERS" ) ) : NUM_ITERS ); if ( ( retval = PAPI_stop( EventSet, values[0] ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); if (!quiet) { printf( "Test type : \t%s\n", "No profiling" ); printf( TAB1, event_name, ( values[0] )[0] ); printf( TAB1, "PAPI_TOT_CYC", ( values[0] )[1] ); } } /* This routine allocates and initializes up to 5 equal sized profiling buffers. They need to be freed when profiling is completed. The number and size are passed parameters. The profbuf[] array of void * pointers is an assumed global. It should be cast to the required type by the parent routine. */ void prof_alloc( int num, unsigned long blength ) { int i; for ( i = 0; i < num; i++ ) { profbuf[i] = malloc( blength ); if ( profbuf[i] == NULL ) { test_fail( __FILE__, __LINE__, "malloc", PAPI_ESYS ); } memset( profbuf[i], 0x00, blength ); } } /* Given the profiling type (16, 32, or 64) this function returns the bucket size in bytes. NOTE: the bucket size does not ALWAYS correspond to the expected value, esp on architectures like Cray with weird data types. This is necessary because the posix_profile routine in extras.c relies on the data types and sizes produced by the compiler. 
*/ int prof_buckets( int bucket ) { int bucket_size; switch ( bucket ) { case PAPI_PROFIL_BUCKET_16: bucket_size = sizeof ( short ); break; case PAPI_PROFIL_BUCKET_32: bucket_size = sizeof ( int ); break; case PAPI_PROFIL_BUCKET_64: bucket_size = sizeof ( unsigned long long ); break; default: bucket_size = 0; break; } return ( bucket_size ); } /* A standardized header printing routine. No assumed globals. */ void prof_head( unsigned long blength, int bucket, int num_buckets, const char *header ) { int bucket_size = prof_buckets( bucket ); printf ( "\n------------------------------------------------------------\n" ); printf( "PAPI_profil() hash table, Bucket size: %d bits.\n", bucket_size * 8 ); printf( "Number of buckets: %d.\nLength of buffer: %ld bytes.\n", num_buckets, blength ); printf( "------------------------------------------------------------\n" ); printf( "%s\n", header ); } /* This function prints a standardized profile output based on the bucket size. A row consisting of an address and 'n' data elements is displayed for each address with at least one non-zero bucket. Assumes global profbuf[] array pointers. 
*/ void prof_out( vptr_t start, int n, int bucket, int num_buckets, unsigned int scale ) { int i, j; unsigned short buf_16; unsigned int buf_32; unsigned long long buf_64; unsigned short **buf16 = ( unsigned short ** ) profbuf; unsigned int **buf32 = ( unsigned int ** ) profbuf; unsigned long long **buf64 = ( unsigned long long ** ) profbuf; if ( !TESTS_QUIET ) { /* printf("%#lx\n",(unsigned long) start + (unsigned long) (2 * i)); */ /* printf("start: %p; i: %#x; scale: %#x; i*scale: %#x; i*scale >>15: %#x\n", start, i, scale, i*scale, (i*scale)>>15); */ switch ( bucket ) { case PAPI_PROFIL_BUCKET_16: for ( i = 0; i < num_buckets; i++ ) { for ( j = 0, buf_16 = 0; j < n; j++ ) buf_16 |= ( buf16[j] )[i]; if ( buf_16 ) { /* On 32bit builds with gcc 4.3 gcc complained about casting vptr_t => long long * Thus the unsigned long to long long cast */ printf( "%#-16llx", (long long) (unsigned long)start + ( ( ( long long ) i * scale ) >> 15 ) ); for ( j = 0, buf_16 = 0; j < n; j++ ) printf( "\t%d", ( buf16[j] )[i] ); printf( "\n" ); } } break; case PAPI_PROFIL_BUCKET_32: for ( i = 0; i < num_buckets; i++ ) { for ( j = 0, buf_32 = 0; j < n; j++ ) buf_32 |= ( buf32[j] )[i]; if ( buf_32 ) { printf( "%#-16llx", (long long) (unsigned long)start + ( ( ( long long ) i * scale ) >> 15 ) ); for ( j = 0, buf_32 = 0; j < n; j++ ) printf( "\t%d", ( buf32[j] )[i] ); printf( "\n" ); } } break; case PAPI_PROFIL_BUCKET_64: for ( i = 0; i < num_buckets; i++ ) { for ( j = 0, buf_64 = 0; j < n; j++ ) buf_64 |= ( buf64[j] )[i]; if ( buf_64 ) { printf( "%#-16llx", (long long) (unsigned long)start + ( ( ( long long ) i * scale ) >> 15 ) ); for ( j = 0, buf_64 = 0; j < n; j++ ) printf( "\t%lld", ( buf64[j] )[i] ); printf( "\n" ); } } break; } printf ( "------------------------------------------------------------\n\n" ); } } /* This function checks to make sure that some buffer value somewhere is nonzero. If all buffers are empty, zero is returned. This usually indicates a profiling failure. 
Assumes global profbuf[]. */ int prof_check( int n, int bucket, int num_buckets ) { int i, j; int retval = 0; unsigned short **buf16 = ( unsigned short ** ) profbuf; unsigned int **buf32 = ( unsigned int ** ) profbuf; unsigned long long **buf64 = ( unsigned long long ** ) profbuf; switch ( bucket ) { case PAPI_PROFIL_BUCKET_16: for ( i = 0; i < num_buckets; i++ ) for ( j = 0; j < n; j++ ) retval = retval || buf16[j][i]; break; case PAPI_PROFIL_BUCKET_32: for ( i = 0; i < num_buckets; i++ ) for ( j = 0; j < n; j++ ) retval = retval || buf32[j][i]; break; case PAPI_PROFIL_BUCKET_64: for ( i = 0; i < num_buckets; i++ ) for ( j = 0; j < n; j++ ) retval = retval || buf64[j][i]; break; } return ( retval ); } /* Computes the length (in bytes) of the buffer required for profiling. 'plength' is the profile length, or address range to be profiled. By convention, it is assumed that there are half as many buckets as addresses. The scale factor is a fixed point fraction in which 0xffff = ~1 0x8000 = 1/2 0x4000 = 1/4, etc. Thus, the number of profile buckets is (plength/2) * (scale/65536), and the length (in bytes) of the profile buffer is buckets * bucket size. */ unsigned long prof_size( unsigned long plength, unsigned scale, int bucket, int *num_buckets ) { unsigned long blength; long long llength = ( ( long long ) plength * scale ); int bucket_size = prof_buckets( bucket ); *num_buckets = ( int ) ( llength / 65536 / 2 ); blength = ( unsigned long ) ( *num_buckets * bucket_size ); return ( blength ); } papi-papi-7-2-0-t/src/ctests/prof_utils.h000066400000000000000000000033131502707512200202730ustar00rootroot00000000000000/* * File: prof_utils.h * Author: Dan Terpstra * terpstra@cs.utk.edu * Mods: Maynard Johnson * maynardj@us.ibm.com */ /* This file contains utility definitions useful for all profiling tests It should be #included in: - profile.c, - sprofile.c, - profile_pthreads.c, - profile_twoevents.c, - earprofile.c, - future profiling tests. 
*/ /* value for scale parameter that sets scale to 1 */ #define FULL_SCALE 65536 /* Internal prototype */ int prof_events(int num_tests); void prof_print_address(const char *title, const PAPI_exe_info_t *prginfo); void prof_print_prof_info(vptr_t start, vptr_t end, int threshold, char *event_name); void prof_alloc(int num, unsigned long plength); void prof_head(unsigned long blength, int bucket_size, int num_buckets, const char *header); void prof_out(vptr_t start, int n, int bucket, int num_buckets, unsigned int scale); unsigned long prof_size(unsigned long plength, unsigned scale, int bucket, int *num_buckets); int prof_check(int n, int bucket, int num_buckets); int prof_buckets(int bucket); void do_no_profile(int quiet); /* variables global to profiling tests */ extern long long **values; extern char event_name[PAPI_MAX_STR_LEN]; extern int PAPI_event; extern int EventSet; extern void *profbuf[5]; /* Itanium returns function descriptors instead of function addresses. I couldn't find the following structure in a header file, so I duplicated it below. */ #if (defined(ITANIUM1) || defined(ITANIUM2)) struct fdesc { void *ip; /* entry point (code address) */ void *gp; /* global-pointer */ }; #elif defined(__powerpc64__) struct fdesc { void * ip; // function entry point void * toc; void * env; }; #endif papi-papi-7-2-0-t/src/ctests/profile.c000066400000000000000000000132701502707512200175430ustar00rootroot00000000000000/* * File: profile.c * Author: Philip Mucci * mucci@cs.utk.edu * Mods: Dan Terpstra * terpstra@cs.utk.edu * Mods: Maynard Johnson * maynardj@us.ibm.com */ /* This file performs the following test: profiling and program info option call - This tests the SVR4 profiling interface of PAPI. These are counted in the default counting domain and default granularity, depending on the platform. Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). 
The Eventset contains: + PAPI_FP_INS (to profile) + PAPI_TOT_CYC - Set up profile - Start eventset 1 - Do both (flops and reads) - Stop eventset 1 */ #include #include #include "papi.h" #include "papi_test.h" #include "prof_utils.h" #include "do_loops.h" #define PROFILE_ALL static int do_profile( vptr_t start, unsigned long plength, unsigned scale, int thresh, int bucket ) { int i, retval; unsigned long blength; int num_buckets; const char *profstr[5] = { "PAPI_PROFIL_POSIX", "PAPI_PROFIL_RANDOM", "PAPI_PROFIL_WEIGHTED", "PAPI_PROFIL_COMPRESS", "PAPI_PROFIL_" }; int profflags[5] = { PAPI_PROFIL_POSIX, PAPI_PROFIL_POSIX | PAPI_PROFIL_RANDOM, PAPI_PROFIL_POSIX | PAPI_PROFIL_WEIGHTED, PAPI_PROFIL_POSIX | PAPI_PROFIL_COMPRESS, PAPI_PROFIL_POSIX | PAPI_PROFIL_WEIGHTED | PAPI_PROFIL_RANDOM | PAPI_PROFIL_COMPRESS }; do_no_profile( TESTS_QUIET ); blength = prof_size( plength, scale, bucket, &num_buckets ); prof_alloc( 5, blength ); for ( i = 0; i < 5; i++ ) { if ( !TESTS_QUIET ) { printf( "Test type : \t%s\n", profstr[i] ); } #ifndef SWPROFILE if ( ( retval = PAPI_profil( profbuf[i], ( unsigned int ) blength, start, scale, EventSet, PAPI_event, thresh, profflags[i] | bucket ) ) != PAPI_OK ) { if (retval==PAPI_ENOSUPP) { char warning[BUFSIZ]; sprintf(warning,"PAPI_profil %s not supported", profstr[i]); test_warn( __FILE__, __LINE__, warning, 1 ); } else { test_fail( __FILE__, __LINE__, "PAPI_profil", retval ); } } #else if ( ( retval = PAPI_profil( profbuf[i], ( unsigned int ) blength, start, scale, EventSet, PAPI_event, thresh, profflags[i] | bucket | PAPI_PROFIL_FORCE_SW ) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_profil", retval ); } #endif if ( retval != PAPI_OK ) break; if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( getenv( "NUM_FLOPS" ) ? 
atoi( getenv( "NUM_FLOPS" ) ) : NUM_FLOPS ); if ( ( retval = PAPI_stop( EventSet, values[1] ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); if ( !TESTS_QUIET ) { printf( TAB1, event_name, ( values[1] )[0] ); printf( TAB1, "PAPI_TOT_CYC", ( values[1] )[1] ); } retval = PAPI_profil( profbuf[i], ( unsigned int ) blength, start, scale, EventSet, PAPI_event, 0, profflags[i] ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_profil", retval ); } } if ( retval == PAPI_OK ) { if (!TESTS_QUIET) prof_head( blength, bucket, num_buckets, "address\t\t\tflat\trandom\tweight\tcomprs\tall\n" ); if (!TESTS_QUIET) prof_out( start, 5, bucket, num_buckets, scale ); retval = prof_check( 5, bucket, num_buckets ); } for ( i = 0; i < 5; i++ ) { free( profbuf[i] ); } return retval; } int main( int argc, char **argv ) { int num_tests = 6; long length; int mask; int retval; int mythreshold = THRESHOLD; const PAPI_exe_info_t *prginfo; vptr_t start, end; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } if ( ( prginfo = PAPI_get_executable_info( ) ) == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_executable_info", 1 ); } retval = PAPI_query_event(PAPI_TOT_CYC); if (retval!=PAPI_OK) { if (!quiet) printf("No events found\n"); test_skip(__FILE__, __LINE__,"No events found",1); } mask = prof_events( num_tests ); #ifdef PROFILE_ALL /* use these lines to profile entire code address space */ start = prginfo->address_info.text_start; end = prginfo->address_info.text_end; #else /* use these lines to profile only do_flops address space */ start = ( vptr_t ) do_flops; end = ( vptr_t ) fdo_flops; /* Itanium and ppc64 processors return function descriptors instead of function addresses. You must dereference the descriptor to get the address. 
*/ #if defined(ITANIUM1) || defined(ITANIUM2) || defined(__powerpc64__) start = ( vptr_t ) ( ( ( struct fdesc * ) start )->ip ); end = ( vptr_t ) ( ( ( struct fdesc * ) end )->ip ); #endif #endif #if defined(linux) { char *tmp = getenv( "THRESHOLD" ); if ( tmp ) mythreshold = atoi( tmp ); } #endif length = end - start; if ( length < 0 ) { test_fail( __FILE__, __LINE__, "Profile length < 0!", ( int ) length ); } if (!quiet) { prof_print_address( "Test case profile: " "POSIX compatible profiling with hardware counters.\n", prginfo ); prof_print_prof_info( start, end, mythreshold, event_name ); } retval = do_profile( start, ( unsigned long ) length, FULL_SCALE, mythreshold, PAPI_PROFIL_BUCKET_16 ); if ( retval == PAPI_OK ) { retval = do_profile( start, ( unsigned long ) length, FULL_SCALE, mythreshold, PAPI_PROFIL_BUCKET_32 ); } if ( retval == PAPI_OK ) { retval = do_profile( start, ( unsigned long ) length, FULL_SCALE, mythreshold, PAPI_PROFIL_BUCKET_64 ); } remove_test_events( &EventSet, mask ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/profile_pthreads.c000066400000000000000000000130161502707512200214330ustar00rootroot00000000000000/* This file performs the following test: profile for pthreads */ #include #include #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #define THR 1000000 #define FLOPS 100000000 unsigned int length; vptr_t my_start, my_end; void * Thread( void *arg ) { int retval, num_tests = 1, i; int EventSet1 = PAPI_NULL, mask1, PAPI_event; int num_events1; long long **values; long long elapsed_us, elapsed_cyc; unsigned short *profbuf; char event_name[PAPI_MAX_STR_LEN]; retval = PAPI_register_thread( ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_register_thread", retval ); } profbuf = ( unsigned short * ) malloc( length * sizeof ( unsigned short ) ); if ( profbuf == NULL ) { test_fail(__FILE__, __LINE__, "Allocate memory",0); } memset( profbuf, 0x00, length * sizeof ( unsigned 
short ) ); /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depends on the availability of the event on the platform */ EventSet1 = add_two_nonderived_events( &num_events1, &PAPI_event, &mask1 ); values = allocate_test_space( num_tests, num_events1 ); retval = PAPI_event_code_to_name( PAPI_event, event_name ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); retval = PAPI_profil( profbuf, length, my_start, 65536, EventSet1, PAPI_event, THR, PAPI_PROFIL_POSIX ); if ( retval ) { test_fail( __FILE__, __LINE__, "PAPI_profil", retval ); } retval = PAPI_start( EventSet1 ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_flops( *( int * ) arg ); retval = PAPI_stop( EventSet1, values[0] ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; /* to remove the profile flag */ retval = PAPI_profil( profbuf, length, my_start, 65536, EventSet1, PAPI_event, 0, PAPI_PROFIL_POSIX ); if ( retval ) { test_fail( __FILE__, __LINE__, "PAPI_profil", retval ); } remove_test_events( &EventSet1, mask1 ); if ( !TESTS_QUIET ) { if ( mask1 == 0x3 ) { printf( "Thread %#x PAPI_TOT_INS : \t%lld\n", ( int ) pthread_self( ), ( values[0] )[0] ); } else { printf( "Thread %#x PAPI_FP_INS : \t%lld\n", ( int ) pthread_self( ), ( values[0] )[0] ); } printf( "Thread %#x PAPI_TOT_CYC: \t%lld\n", ( int ) pthread_self( ), ( values[0] )[1] ); printf( "Thread %#x Real usec : \t%lld\n", ( int ) pthread_self( ), elapsed_us ); printf( "Thread %#x Real cycles : \t%lld\n", ( int ) pthread_self( ), elapsed_cyc ); printf( "Test case: PAPI_profil() for pthreads\n" ); printf( "----Profile buffer for Thread %#x---\n", ( int ) pthread_self( ) ); for ( i = 0; i < ( int ) length; i++ ) { if ( profbuf[i] ) 
printf( "%#lx\t%d\n", ( unsigned long ) ( my_start + 2 * i ), profbuf[i] ); } } for ( i = 0; i < ( int ) length; i++ ) if ( profbuf[i] ) break; if ( i >= ( int ) length ) { test_fail( __FILE__, __LINE__, "No information in buffers", 1 ); } free_test_space( values, num_tests ); retval = PAPI_unregister_thread( ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_unregister_thread", retval ); } return NULL; } int main( int argc, char **argv ) { pthread_t id[NUM_THREADS]; int flops[NUM_THREADS]; int i, rc, retval; pthread_attr_t attr; long long elapsed_us, elapsed_cyc; const PAPI_exe_info_t *prginfo = NULL; int quiet; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } retval = PAPI_query_event(PAPI_TOT_CYC); if (retval != PAPI_OK) { if (!quiet) printf("Trouble adding event\n"); test_skip(__FILE__,__LINE__,"No events",0); } retval = PAPI_thread_init( ( unsigned long ( * )( void ) ) ( pthread_self )); if (retval != PAPI_OK ) { if ( retval == PAPI_ECMP ) test_skip( __FILE__, __LINE__, "PAPI_thread_init", retval ); else test_fail( __FILE__, __LINE__, "PAPI_thread_init", retval ); } if ( ( prginfo = PAPI_get_executable_info( ) ) == NULL ) { retval = 1; test_fail( __FILE__, __LINE__, "PAPI_get_executable_info", retval ); } my_start = prginfo->address_info.text_start; my_end = prginfo->address_info.text_end; length = ( unsigned int ) ( my_end - my_start ); elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); pthread_attr_init( &attr ); #ifdef PTHREAD_CREATE_UNDETACHED pthread_attr_setdetachstate( &attr, PTHREAD_CREATE_UNDETACHED ); #endif #ifdef PTHREAD_SCOPE_SYSTEM retval = pthread_attr_setscope( &attr, PTHREAD_SCOPE_SYSTEM ); if ( retval != 0 ) test_skip( __FILE__, __LINE__, "pthread_attr_setscope", retval ); #endif for ( i = 0; i < NUM_THREADS; i++ ) { flops[i] = FLOPS * ( i + 1 ); 
rc = pthread_create( &id[i], &attr, Thread, ( void * ) &flops[i] ); if ( rc ) return ( FAILURE ); } for ( i = 0; i < NUM_THREADS; i++ ) pthread_join( id[i], NULL ); pthread_attr_destroy( &attr ); elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; elapsed_us = PAPI_get_real_usec( ) - elapsed_us; if ( !quiet ) { printf( "Master real usec : \t%lld\n", elapsed_us ); printf( "Master real cycles : \t%lld\n", elapsed_cyc ); } test_pass( __FILE__ ); pthread_exit( NULL ); return 0; } papi-papi-7-2-0-t/src/ctests/profile_twoevents.c000066400000000000000000000070261502707512200216630ustar00rootroot00000000000000/* * File: profile_twoevents.c * Author: Philip Mucci * mucci@cs.utk.edu */ /* This file performs the following test: profiling two events */ #include #include #include "papi.h" #include "papi_test.h" #include "prof_utils.h" #include "do_loops.h" int main( int argc, char **argv ) { int i, num_tests = 6; unsigned long length, blength; int num_buckets, mask; char title[PAPI_2MAX_STR_LEN]; int retval; const PAPI_exe_info_t *prginfo; vptr_t start, end; int quiet; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } if ( ( prginfo = PAPI_get_executable_info( ) ) == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_executable_info", 1 ); } mask = prof_events( num_tests ); start = prginfo->address_info.text_start; end = prginfo->address_info.text_end; /* Must have at least FP instr or Tot ins */ if ( ( ( mask & MASK_FP_INS ) == 0 ) && ( ( mask & MASK_TOT_INS ) == 0 ) ) { if (!quiet) printf("No events could be added\n"); test_skip( __FILE__, __LINE__, "No FP or Total Ins. 
event", 1 ); } if ( start > end ) test_fail( __FILE__, __LINE__, "Profile length < 0!", 0 ); length = ( unsigned long ) ( end - start ); if (!quiet) { prof_print_address( "Test case profile: POSIX compatible profiling with two events.\n", prginfo ); prof_print_prof_info( start, end, THRESHOLD, event_name ); } prof_alloc( 2, length ); blength = prof_size( length, FULL_SCALE, PAPI_PROFIL_BUCKET_16, &num_buckets ); do_no_profile( quiet ); if ( !quiet ) { printf( "Test type : \tPAPI_PROFIL_POSIX\n" ); } if ( ( retval = PAPI_profil( profbuf[0], ( unsigned int ) blength, start, FULL_SCALE, EventSet, PAPI_event, THRESHOLD, PAPI_PROFIL_POSIX ) ) != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_profil", retval ); } if ( ( retval = PAPI_profil( profbuf[1], ( unsigned int ) blength, start, FULL_SCALE, EventSet, PAPI_TOT_CYC, THRESHOLD, PAPI_PROFIL_POSIX ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_profil", retval ); do_stuff( ); if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_stuff( ); if ( ( retval = PAPI_stop( EventSet, values[1] ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); if ( !quiet ) { printf( TAB1, event_name, ( values[1] )[0] ); printf( TAB1, "PAPI_TOT_CYC:", ( values[1] )[1] ); } if ( ( retval = PAPI_profil( profbuf[0], ( unsigned int ) blength, start, FULL_SCALE, EventSet, PAPI_event, 0, PAPI_PROFIL_POSIX ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_profil", retval ); if ( ( retval = PAPI_profil( profbuf[1], ( unsigned int ) blength, start, FULL_SCALE, EventSet, PAPI_TOT_CYC, 0, PAPI_PROFIL_POSIX ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_profil", retval ); sprintf( title, " \t\t %s\tPAPI_TOT_CYC\naddress\t\t\tcounts\tcounts\n", event_name ); if (!quiet) { prof_head( blength, PAPI_PROFIL_BUCKET_16, num_buckets, title ); prof_out( start, 2, PAPI_PROFIL_BUCKET_16, num_buckets, FULL_SCALE ); } remove_test_events( &EventSet, mask ); retval = 
prof_check( 2, PAPI_PROFIL_BUCKET_16, num_buckets ); for ( i = 0; i < 2; i++ ) { free( profbuf[i] ); } if ( retval == 0 ) { test_fail( __FILE__, __LINE__, "No information in buffers", 1 ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/pthread_hl.c000066400000000000000000000035611502707512200202170ustar00rootroot00000000000000#include #include #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #define NUM_THREADS 4 typedef struct papi_args { long tid; int quiet; } papi_args_t; void *CallMatMul(void *args) { long tid; int retval, quiet; char* region_name; papi_args_t* papi_args = (papi_args_t*)args; tid = (*papi_args).tid; quiet = (*papi_args).quiet; region_name = "do_flops"; if ( !quiet ) { printf("\nThread %ld: instrument flops\n", tid); } retval = PAPI_hl_region_begin(region_name); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_hl_region_begin", retval ); } do_flops( NUM_FLOPS ); retval = PAPI_hl_region_end(region_name); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_hl_region_end", retval ); } pthread_exit(NULL); } int main( int argc, char **argv ) { pthread_t threads[NUM_THREADS]; papi_args_t args[NUM_THREADS]; int rc; long t; int quiet = 0; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); for( t = 0; t < NUM_THREADS; t++) { args[t].tid = t; args[t].quiet = quiet; rc = pthread_create(&threads[t], NULL, CallMatMul, (void *)&args[t]); if (rc) { printf("ERROR; return code from pthread_create() is %d\n", rc); exit(-1); } } for( t = 0; t < NUM_THREADS; t++) { pthread_join(threads[t], NULL); } for( t = 0; t < NUM_THREADS; t++) { args[t].tid = t; args[t].quiet = quiet; rc = pthread_create(&threads[t], NULL, CallMatMul, (void *)&args[t]); if (rc) { printf("ERROR; return code from pthread_create() is %d\n", rc); exit(-1); } } for( t = 0; t < NUM_THREADS; t++) { pthread_join(threads[t], NULL); } test_hl_pass( __FILE__ ); return 0; 
}papi-papi-7-2-0-t/src/ctests/pthrtough.c000066400000000000000000000046511502707512200201320ustar00rootroot00000000000000#include #include #include #include #include "papi.h" #include "papi_test.h" #define NITER 1000 void * Thread( void *data ) { int i, ret, evtset; ( void ) data; for ( i = 0; i < NITER; i++ ) { if ( ( ret = PAPI_register_thread( ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_thread_init", ret ); evtset = PAPI_NULL; if ( ( ret = PAPI_create_eventset( &evtset ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", ret ); if ( ( ret = PAPI_destroy_eventset( &evtset ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", ret ); if ( ( ret = PAPI_unregister_thread( ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_unregister_thread", ret ); } return ( NULL ); } int main( int argc, char *argv[] ) { int j; pthread_t *th = NULL; pthread_attr_t attr; int ret; long nthr; const PAPI_hw_info_t *hwinfo; tests_quiet( argc, argv ); /*Set TESTS_QUIET variable */ ret = PAPI_library_init( PAPI_VER_CURRENT ); if ( ret != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", ret ); if ( ( ret = PAPI_thread_init( ( unsigned long ( * )( void ) ) ( pthread_self ) ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_thread_init", ret ); pthread_attr_init( &attr ); #ifdef PTHREAD_CREATE_UNDETACHED pthread_attr_setdetachstate( &attr, PTHREAD_CREATE_UNDETACHED ); #endif #ifdef PTHREAD_SCOPE_SYSTEM ret=pthread_attr_setscope( &attr, PTHREAD_SCOPE_SYSTEM ); if ( ret != 0 ) test_skip( __FILE__, __LINE__, "pthread_attr_setscope", ret ); #endif if ( ( hwinfo = PAPI_get_hardware_info( ) ) == NULL ) test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 0 ); nthr = hwinfo->ncpu; if ( !TESTS_QUIET ) { printf( "Creating %ld threads for %d iterations each of:\n", nthr, NITER ); printf( "\tregister\n" ); printf( "\tcreate_eventset\n" ); printf( "\tdestroy_eventset\n" ); printf( "\tunregister\n" ); } th = ( pthread_t * ) 
malloc( ( size_t ) nthr * sizeof ( pthread_t ) ); if ( th == NULL ) test_fail( __FILE__, __LINE__, "malloc", PAPI_ESYS ); for ( j = 0; j < nthr; j++ ) { ret = pthread_create( &th[j], &attr, &Thread, NULL ); if ( ret ) { free(th); test_fail( __FILE__, __LINE__, "pthread_create", PAPI_ESYS ); } } for ( j = 0; j < nthr; j++ ) { pthread_join( th[j], NULL ); } free(th); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/pthrtough2.c000066400000000000000000000047511502707512200202150ustar00rootroot00000000000000#include #include #include #include #include "papi.h" #include "papi_test.h" #define NITER 2000 void * Thread( void *data ) { int ret, evtset; ( void ) data; if ( ( ret = PAPI_register_thread( ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_thread_init", ret ); evtset = PAPI_NULL; if ( ( ret = PAPI_create_eventset( &evtset ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", ret ); if ( ( ret = PAPI_destroy_eventset( &evtset ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", ret ); if ( ( ret = PAPI_unregister_thread( ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_unregister_thread", ret ); return ( NULL ); } int main( int argc, char *argv[] ) { int j; pthread_t *th = NULL; pthread_attr_t attr; int ret; long nthr; tests_quiet( argc, argv ); /*Set TESTS_QUIET variable */ ret = PAPI_library_init( PAPI_VER_CURRENT ); if ( ret != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", ret ); if ( ( ret = PAPI_thread_init( ( unsigned long ( * )( void ) ) ( pthread_self ) ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_thread_init", ret ); pthread_attr_init( &attr ); #ifdef PTHREAD_CREATE_UNDETACHED pthread_attr_setdetachstate( &attr, PTHREAD_CREATE_UNDETACHED ); #endif #ifdef PTHREAD_SCOPE_SYSTEM ret = pthread_attr_setscope( &attr, PTHREAD_SCOPE_SYSTEM ); if ( ret != 0 ) test_skip( __FILE__, __LINE__, "pthread_attr_setscope", ret ); #endif nthr = NITER; if ( !TESTS_QUIET ) { printf( 
"Creating %d threads for %d iterations each of:\n", ( int ) nthr, 1 ); printf( "\tregister\n" ); printf( "\tcreate_eventset\n" ); printf( "\tdestroy_eventset\n" ); printf( "\tunregister\n" ); } th = ( pthread_t * ) malloc( ( size_t ) nthr * sizeof ( pthread_t ) ); if ( th == NULL ) test_fail( __FILE__, __LINE__, "malloc", PAPI_ESYS ); for ( j = 0; j < nthr; j++ ) { ret = pthread_create( &th[j], &attr, &Thread, NULL ); if ( ret ) { printf( "Failed to create thread: %d\n", j ); if ( j < 10 ) { free(th); test_fail( __FILE__, __LINE__, "pthread_create", PAPI_ESYS ); } printf( "Continuing test with %d threads.\n", j - 1 ); nthr = j - 1; th = ( pthread_t * ) realloc( th, ( size_t ) nthr * sizeof ( pthread_t ) ); break; } } for ( j = 0; j < nthr; j++ ) { pthread_join( th[j], NULL ); } free(th); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/realtime.c000066400000000000000000000056371502707512200177150ustar00rootroot00000000000000#include #include #include #include "papi.h" #include "papi_test.h" int main( int argc, char **argv ) { int retval; long long elapsed_us, elapsed_cyc; const PAPI_hw_info_t *hw_info; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } hw_info = PAPI_get_hardware_info( ); if ( hw_info == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 ); } elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); if (!quiet) { printf( "Testing real time clock. (CPU Max %d MHz, CPU Min %d MHz)\n", hw_info->cpu_max_mhz, hw_info->cpu_min_mhz ); printf( "Sleeping for 10 seconds.\n" ); } sleep( 10 ); elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; if (!quiet) { printf( "%lld us. 
%lld cyc.\n", elapsed_us, elapsed_cyc ); printf( "%f Computed MHz.\n", ( float ) elapsed_cyc / ( float ) elapsed_us ); } /* Elapsed microseconds and elapsed cycles are not as unambiguous as they appear. On Pentium III and 4, for example, cycles is a measured value, while useconds is computed from cycles and mhz. MHz is read from /proc/cpuinfo (on linux). Thus, any error in MHz is propagated to useconds. Conversely, on ultrasparc useconds are extracted from a system call (gethrtime()) and cycles are computed from useconds. Also, MHz comes from a scan of system info. Thus any error in gethrtime() propagates to both cycles and useconds, and cycles can be further impacted by errors in reported MHz. Without knowing the error bars on these system values, we can't really specify error ranges for our reported values, but we *DO* know that errors for at least one instance of Pentium 4 (torc17@utk) are on the order of one part per thousand. Newer multicore Intel processors seem to have broken the relationship between the clock rate reported in /proc/cpuinfo and the actual computed clock. To accommodate this artifact, the test no longer fails, but merely reports results out of range. */ if ( elapsed_us < 9000000 ) { if (!quiet) printf( "NOTE: Elapsed real time less than 9 seconds (%lld us)!\n",elapsed_us ); test_fail(__FILE__,__LINE__,"Real time too short",1); } if ( elapsed_us > 11000000 ) { if (!quiet) printf( "NOTE: Elapsed real time greater than 11 seconds! 
(%lld us)\n", elapsed_us ); test_fail(__FILE__,__LINE__,"Real time too long",1); } if ( ( float ) elapsed_cyc < 9.0 * hw_info->cpu_max_mhz * 1000000.0 ) if (!quiet) printf( "NOTE: Elapsed real cycles less than 9*MHz*1000000.0!\n" ); if ( ( float ) elapsed_cyc > 11.0 * hw_info->cpu_max_mhz * 1000000.0 ) if (!quiet) printf( "NOTE: Elapsed real cycles greater than 11*MHz*1000000.0!\n" ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/remove_events.c000066400000000000000000000065361502707512200207710ustar00rootroot00000000000000/* This test checks if removing events works properly at the low level by Vince Weaver (vweaver1@eecs.utk.edu) */ #include <stdio.h> #include "papi.h" #include "papi_test.h" #include "do_loops.h" int main( int argc, char **argv ) { int retval; int EventSet = PAPI_NULL; long long values1[2],values2[2]; const char *event_names[] = {"PAPI_TOT_CYC","PAPI_TOT_INS"}; char add_event_str[PAPI_MAX_STR_LEN]; double instructions_error; long long old_instructions; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); /* Init the PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* Create an empty event set */ retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } /* add the events named above */ retval = PAPI_add_named_event( EventSet, event_names[0] ); if ( retval != PAPI_OK ) { sprintf( add_event_str, "PAPI_add_named_event[%s]", event_names[0] ); if (!quiet) printf("Trouble %s\n",add_event_str); test_skip( __FILE__, __LINE__, add_event_str, retval ); } retval = PAPI_add_named_event( EventSet, event_names[1] ); if ( retval != PAPI_OK ) { sprintf( add_event_str, "PAPI_add_named_event[%s]", event_names[1] ); test_fail( __FILE__, __LINE__, add_event_str, retval ); } /* Start PAPI */ retval = PAPI_start( EventSet ); if ( retval != 
PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } /* our test code */ do_flops( NUM_FLOPS ); /* Stop PAPI */ retval = PAPI_stop( EventSet, values1 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } old_instructions=values1[1]; if ( !quiet ) { printf( "========================\n" ); /* cycles is first, other event second */ sprintf( add_event_str, "%-12s : \t", event_names[0] ); printf( TAB1, add_event_str, values1[0] ); sprintf( add_event_str, "%-12s : \t", event_names[1] ); printf( TAB1, add_event_str, values1[1] ); } /* remove PAPI_TOT_CYC */ retval = PAPI_remove_named_event( EventSet, event_names[0] ); if ( retval != PAPI_OK ) { sprintf( add_event_str, "PAPI_remove_named_event[%s]", event_names[0] ); test_fail( __FILE__, __LINE__, add_event_str, retval ); } /* Start PAPI */ retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } /* our test code */ do_flops( NUM_FLOPS ); /* Stop PAPI */ retval = PAPI_stop( EventSet, values2 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } /* test if after removing the event, the second event */ /* still points to the proper native event */ /* this only works if IPC != 1 */ if ( !quiet ) { printf( "==========================\n" ); printf( "After removing PAPI_TOT_CYC\n"); sprintf( add_event_str, "%-12s : \t", event_names[1] ); printf( TAB1, add_event_str, values2[0] ); } /* validate regardless of quiet mode; express the error in percent so the 10.0 threshold means 10% */ instructions_error=100.0*((double)old_instructions - (double)values2[0])/ (double)old_instructions; if (instructions_error>10.0) { if (!quiet) printf("Error of %.2f%%\n",instructions_error); test_fail( __FILE__, __LINE__, "validation", 0 ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/reset.c000066400000000000000000000217261502707512200172300ustar00rootroot00000000000000/* This file performs the following test: start, read, stop and again functionality - It attempts to use the following two counters.
It may use fewer depending on hardware counter resource limitations. These are counted in the default counting domain and default granularity, depending on the platform. Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). + PAPI_FP_INS or PAPI_TOT_INS if PAPI_FP_INS doesn't exist + PAPI_TOT_CYC 1 - Start counters - Do flops - Stop counters 2 - Start counters - Do flops - Stop counters (should duplicate above) 3 - Reset counters (should be redundant if stop works properly) - Start counters - Do flops - Stop counters 4 - Start counters - Do flops/2 - Read counters (flops/2; counters keep counting) 5 - Do flops/2 - Read counters (2*flops/2; counters keep counting) 6 - Do flops/2 - Read counters (3*flops/2; counters keep counting) - Accum counters (2*(3*flops/2); counters clear and counting) 7 - Do flops/2 - Read counters (flops/2; counters keep counting) 8 - Reset (counters set to zero; still counting) - Do flops/2 - Stop counters (flops/2; counters stopped) 9 - Reset (counters set to zero and stopped) - Read counters (should be zero) */ #include <stdio.h> #include "papi.h" #include "papi_test.h" #include "do_loops.h" int main( int argc, char **argv ) { int retval, num_tests = 9, num_events, tmp, i; long long **values; int EventSet = PAPI_NULL; int PAPI_event, mask; char event_name[PAPI_MAX_STR_LEN], add_event_str[PAPI_2MAX_STR_LEN]; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); /* Init the PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depending on the availability of the event on the platform */ EventSet = add_two_events( &num_events, &PAPI_event, &mask ); retval = PAPI_event_code_to_name( PAPI_event, event_name ); if ( retval
!= PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } sprintf( add_event_str, "PAPI_add_event[%s]", event_name ); values = allocate_test_space( num_tests, num_events ); /*===== Test 1: Start/Stop =======================*/ retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet, values[0] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } /*===== Test 2 Start/Stop =======================*/ retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet, values[1] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } /*===== Test 3: Reset/Start/Stop =======================*/ retval = PAPI_reset( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_reset", retval ); } retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet, values[2] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } /*===== Test 4: Start/Read =======================*/ retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_flops( NUM_FLOPS / 2 ); retval = PAPI_read( EventSet, values[3] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_read", retval ); } /*===== Test 5: Read =======================*/ do_flops( NUM_FLOPS / 2 ); retval = PAPI_read( EventSet, values[4] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_read", retval ); } /*===== Test 6: Read/Accum =======================*/ do_flops( NUM_FLOPS / 2 ); retval = PAPI_read( EventSet, values[5] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_read", 
retval ); } retval = PAPI_accum( EventSet, values[5] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_accum", retval ); } /*===== Test 7: Read =======================*/ do_flops( NUM_FLOPS / 2 ); retval = PAPI_read( EventSet, values[6] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_read", retval ); } /*===== Test 8 Reset/Stop =======================*/ retval = PAPI_reset( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_reset", retval ); } do_flops( NUM_FLOPS / 2 ); retval = PAPI_stop( EventSet, values[7] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } /*===== Test 9: Reset/Read =======================*/ retval = PAPI_reset( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_reset", retval ); } retval = PAPI_read( EventSet, values[8] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_read", retval ); } remove_test_events( &EventSet, mask ); if (!quiet) { printf( "Test case: Start/Stop/Read/Accum/Reset.\n" ); printf( "----------------------------------------------------------------\n" ); tmp = PAPI_get_opt( PAPI_DEFDOM, NULL ); printf( "Default domain is: %d (%s)\n", tmp, stringify_all_domains( tmp ) ); tmp = PAPI_get_opt( PAPI_DEFGRN, NULL ); printf( "Default granularity is: %d (%s)\n", tmp, stringify_granularity( tmp ) ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf( "-------------------------------------------------------------------------\n" ); sprintf( add_event_str, "%s:", event_name ); printf( " PAPI_TOT_CYC %s\n", event_name ); printf( "1. start,ops,stop %10lld %10lld\n", values[0][0], values[0][1] ); printf( "2. start,ops,stop %10lld %10lld\n", values[1][0], values[1][1] ); printf( "3. reset,start,ops,stop %10lld %10lld\n", values[2][0], values[2][1] ); printf( "4. start,ops/2,read %10lld %10lld\n", values[3][0], values[3][1] ); printf( "5. 
ops/2,read %10lld %10lld\n", values[4][0], values[4][1] ); printf( "6. ops/2,accum %10lld %10lld\n", values[5][0], values[5][1] ); printf( "7. ops/2,read %10lld %10lld\n", values[6][0], values[6][1] ); printf( "8. reset,ops/2,stop %10lld %10lld\n", values[7][0], values[7][1] ); printf( "9. reset,read %10lld %10lld\n", values[8][0], values[8][1] ); printf( "-------------------------------------------------------------------------\n" ); printf( "Verification:\n" ); printf( "Row 1 approximately equals rows 2 and 3 \n" ); printf( "Row 4 approximately equals 1/2 of row 3\n" ); printf( "Row 5 approximately equals twice row 4\n" ); printf( "Row 6 approximately equals 6 times row 4\n" ); printf( "Rows 7 and 8 approximately equal row 4\n" ); printf( "Row 9 equals 0\n" ); printf( "%% difference between %s 1 & 2: %.2f\n", "PAPI_TOT_CYC", 100.0 * ( float ) values[0][0] / ( float ) values[1][0] ); printf( "%% difference between %s 1 & 2: %.2f\n", add_event_str, 100.0 * ( float ) values[0][1] / ( float ) values[1][1] ); } for ( i = 0; i <= 1; i++ ) { if ( !approx_equals ( ( double ) values[0][i], ( double ) values[1][i] ) ) test_fail( __FILE__, __LINE__, ( ( i == 0 ) ? "PAPI_TOT_CYC" : add_event_str ), 1 ); if ( !approx_equals ( ( double ) values[1][i], ( double ) values[2][i] ) ) test_fail( __FILE__, __LINE__, ( ( i == 0 ) ? "PAPI_TOT_CYC" : add_event_str ), 1 ); if ( !approx_equals ( ( double ) values[2][i], ( double ) values[3][i] * 2.0 ) ) test_fail( __FILE__, __LINE__, ( ( i == 0 ) ? "PAPI_TOT_CYC" : add_event_str ), 1 ); if ( !approx_equals ( ( double ) values[2][i], ( double ) values[4][i] ) ) test_fail( __FILE__, __LINE__, ( ( i == 0 ) ? "PAPI_TOT_CYC" : add_event_str ), 1 ); if ( !approx_equals ( ( double ) values[5][i], ( double ) values[3][i] * 6.0 ) ) test_fail( __FILE__, __LINE__, ( ( i == 0 ) ? "PAPI_TOT_CYC" : add_event_str ), 1 ); if ( !approx_equals ( ( double ) values[6][i], ( double ) values[3][i] ) ) test_fail( __FILE__, __LINE__, ( ( i == 0 ) ? 
"PAPI_TOT_CYC" : add_event_str ), 1 ); if ( !approx_equals ( ( double ) values[7][i], ( double ) values[3][i] ) ) test_fail( __FILE__, __LINE__, ( ( i == 0 ) ? "PAPI_TOT_CYC" : add_event_str ), 1 ); if ( values[8][i] != 0LL ) test_fail( __FILE__, __LINE__, ( ( i == 0 ) ? "PAPI_TOT_CYC" : add_event_str ), 1 ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/reset_multiplex.c000066400000000000000000000203311502707512200213240ustar00rootroot00000000000000/* This file performs the same tests as the reset test but does it with the events multiplexed. This is mostly to test perf_event, where resetting multiplexed events is handled differently than grouped events. */ #include <stdio.h> #include "papi.h" #include "papi_test.h" #include "do_loops.h" int main( int argc, char **argv ) { int retval, num_tests = 9, num_events, tmp, i; long long **values; int EventSet = PAPI_NULL; int PAPI_event, mask; char event_name[PAPI_MAX_STR_LEN], add_event_str[PAPI_2MAX_STR_LEN]; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); /* Init the PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } retval = PAPI_multiplex_init( ); if ( retval == PAPI_ENOSUPP) { test_skip(__FILE__, __LINE__, "Multiplex not supported", 1); } else if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_multiplex_init", retval ); } /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depending on the availability of the event on the platform */ EventSet = add_two_events( &num_events, &PAPI_event, &mask ); /* Set multiplexing on the eventset */ retval = PAPI_set_multiplex( EventSet ); if ( retval != PAPI_OK) { test_fail(__FILE__, __LINE__, "Setting multiplex", retval); } retval = PAPI_event_code_to_name( PAPI_event, event_name ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } 
sprintf( add_event_str, "PAPI_add_event[%s]", event_name ); values = allocate_test_space( num_tests, num_events ); /*===== Test 1: Start/Stop =======================*/ retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet, values[0] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } /*===== Test 2 Start/Stop =======================*/ retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet, values[1] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } /*===== Test 3: Reset/Start/Stop =======================*/ retval = PAPI_reset( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_reset", retval ); } retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet, values[2] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } /*===== Test 4: Start/Read =======================*/ retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_flops( NUM_FLOPS / 2 ); retval = PAPI_read( EventSet, values[3] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_read", retval ); } /*===== Test 5: Read =======================*/ do_flops( NUM_FLOPS / 2 ); retval = PAPI_read( EventSet, values[4] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_read", retval ); } /*===== Test 6: Read/Accum =======================*/ do_flops( NUM_FLOPS / 2 ); retval = PAPI_read( EventSet, values[5] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_read", retval ); } retval = PAPI_accum( EventSet, values[5] ); if ( retval != PAPI_OK ) { 
test_fail( __FILE__, __LINE__, "PAPI_accum", retval ); } /*===== Test 7: Read =======================*/ do_flops( NUM_FLOPS / 2 ); retval = PAPI_read( EventSet, values[6] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_read", retval ); } /*===== Test 8 Reset/Stop =======================*/ retval = PAPI_reset( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_reset", retval ); } do_flops( NUM_FLOPS / 2 ); retval = PAPI_stop( EventSet, values[7] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } /*===== Test 9: Reset/Read =======================*/ retval = PAPI_reset( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_reset", retval ); } retval = PAPI_read( EventSet, values[8] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_read", retval ); } remove_test_events( &EventSet, mask ); if (!quiet) { printf( "Test case: Start/Stop/Read/Accum/Reset.\n" ); printf( "----------------------------------------------------------------\n" ); tmp = PAPI_get_opt( PAPI_DEFDOM, NULL ); printf( "Default domain is: %d (%s)\n", tmp, stringify_all_domains( tmp ) ); tmp = PAPI_get_opt( PAPI_DEFGRN, NULL ); printf( "Default granularity is: %d (%s)\n", tmp, stringify_granularity( tmp ) ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf( "-------------------------------------------------------------------------\n" ); sprintf( add_event_str, "%s:", event_name ); printf( " PAPI_TOT_CYC %s\n", event_name ); printf( "1. start,ops,stop %10lld %10lld\n", values[0][0], values[0][1] ); printf( "2. start,ops,stop %10lld %10lld\n", values[1][0], values[1][1] ); printf( "3. reset,start,ops,stop %10lld %10lld\n", values[2][0], values[2][1] ); printf( "4. start,ops/2,read %10lld %10lld\n", values[3][0], values[3][1] ); printf( "5. ops/2,read %10lld %10lld\n", values[4][0], values[4][1] ); printf( "6. 
ops/2,accum %10lld %10lld\n", values[5][0], values[5][1] ); printf( "7. ops/2,read %10lld %10lld\n", values[6][0], values[6][1] ); printf( "8. reset,ops/2,stop %10lld %10lld\n", values[7][0], values[7][1] ); printf( "9. reset,read %10lld %10lld\n", values[8][0], values[8][1] ); printf( "-------------------------------------------------------------------------\n" ); printf( "Verification:\n" ); printf( "Row 1 approximately equals rows 2 and 3 \n" ); printf( "Row 4 approximately equals 1/2 of row 3\n" ); printf( "Row 5 approximately equals twice row 4\n" ); printf( "Row 6 approximately equals 6 times row 4\n" ); printf( "Rows 7 and 8 approximately equal row 4\n" ); printf( "Row 9 equals 0\n" ); printf( "%% difference between %s 1 & 2: %.2f\n", "PAPI_TOT_CYC", 100.0 * ( float ) values[0][0] / ( float ) values[1][0] ); printf( "%% difference between %s 1 & 2: %.2f\n", add_event_str, 100.0 * ( float ) values[0][1] / ( float ) values[1][1] ); } for ( i = 0; i <= 1; i++ ) { if ( !approx_equals ( ( double ) values[0][i], ( double ) values[1][i] ) ) test_fail( __FILE__, __LINE__, ( ( i == 0 ) ? "PAPI_TOT_CYC" : add_event_str ), 1 ); if ( !approx_equals ( ( double ) values[1][i], ( double ) values[2][i] ) ) test_fail( __FILE__, __LINE__, ( ( i == 0 ) ? "PAPI_TOT_CYC" : add_event_str ), 1 ); if ( !approx_equals ( ( double ) values[2][i], ( double ) values[3][i] * 2.0 ) ) test_fail( __FILE__, __LINE__, ( ( i == 0 ) ? "PAPI_TOT_CYC" : add_event_str ), 1 ); if ( !approx_equals ( ( double ) values[2][i], ( double ) values[4][i] ) ) test_fail( __FILE__, __LINE__, ( ( i == 0 ) ? "PAPI_TOT_CYC" : add_event_str ), 1 ); if ( !approx_equals ( ( double ) values[5][i], ( double ) values[3][i] * 6.0 ) ) test_fail( __FILE__, __LINE__, ( ( i == 0 ) ? "PAPI_TOT_CYC" : add_event_str ), 1 ); if ( !approx_equals ( ( double ) values[6][i], ( double ) values[3][i] ) ) test_fail( __FILE__, __LINE__, ( ( i == 0 ) ? 
"PAPI_TOT_CYC" : add_event_str ), 1 ); if ( !approx_equals ( ( double ) values[7][i], ( double ) values[3][i] ) ) test_fail( __FILE__, __LINE__, ( ( i == 0 ) ? "PAPI_TOT_CYC" : add_event_str ), 1 ); if ( values[8][i] != 0LL ) test_fail( __FILE__, __LINE__, ( ( i == 0 ) ? "PAPI_TOT_CYC" : add_event_str ), 1 ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/sdsc-mpx.c000066400000000000000000000206161502707512200176430ustar00rootroot00000000000000/* * Test example for multiplex functionality, originally * provided by Timothy Kaiser, SDSC. It was modified to fit the * PAPI test suite by Nils Smeds, . * * This example verifies the accuracy of multiplexed events */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include "papi.h" #include "papi_test.h" #include "testcode.h" #define REPEATS 5 #define MAXEVENTS 14 #define SLEEPTIME 100 #define MINCOUNTS 100000 #define MPX_TOLERANCE 0.20 #define NUM_FLOPS 20000000 void check_values( int eventset, int *events, int nevents, long long *values, long long *refvalues ) { double spread[MAXEVENTS]; int i = nevents, j = 0; if ( !TESTS_QUIET ) { printf( "\nRelative accuracy:\n" ); for ( j = 0; j < nevents; j++ ) printf( " Event %.2d", j + 1 ); printf( "\n" ); } for ( j = 0; j < nevents; j++ ) { spread[j] = ( double ) llabs( refvalues[j] - values[j] ); if ( values[j] ) spread[j] /= ( double ) values[j]; if ( !TESTS_QUIET ) printf( "%10.3g ", spread[j] ); /* Make sure that NaN gets counted as an error */ if ( spread[j] < MPX_TOLERANCE ) { i--; } else if ( refvalues[j] < MINCOUNTS ) { /* Neglect imprecise results with low counts */ i--; } else { char buff[BUFSIZ]; if (!TESTS_QUIET) { printf("reference = %lld, value = %lld, diff = %lld\n", refvalues[j],values[j],refvalues[j] - values[j] ); } sprintf(buff,"Error on %d, spread %lf > threshold %lf AND count %lld > minimum size threshold %d\n",j,spread[j],MPX_TOLERANCE, refvalues[j],MINCOUNTS); test_fail( __FILE__, __LINE__, buff, 1 ); } } if (!TESTS_QUIET) printf( "\n\n" ); #if 0 if ( 
!TESTS_QUIET ) { for ( j = 0; j < nevents; j++ ) { PAPI_get_event_info( events[j], &info ); printf( "Event %.2d: ref=", j ); printf( LLDFMT10, refvalues[j] ); printf( ", diff/ref=%7.2g -- %s\n", spread[j], info.short_descr ); printf( "\n" ); } printf( "\n" ); } #else ( void ) eventset; ( void ) events; #endif } void ref_measurements( int iters, int *eventset, int *events, int nevents, long long *refvalues ) { PAPI_event_info_t info; int i, retval; double x = 1.1, y; long long t1, t2; if (!TESTS_QUIET) printf( "PAPI reference measurements:\n" ); if ( ( retval = PAPI_create_eventset( eventset ) ) ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); for ( i = 0; i < nevents; i++ ) { if ( ( retval = PAPI_add_event( *eventset, events[i] ) ) ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); x = 1.0; t1 = PAPI_get_real_usec( ); if ( ( retval = PAPI_start( *eventset ) ) ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); y = do_flops3( x, iters, 1 ); if ( ( retval = PAPI_stop( *eventset, &refvalues[i] ) ) ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); t2 = PAPI_get_real_usec( ); if (!TESTS_QUIET) { printf( "\tOperations= %.1f Mflop", y * 1e-6 ); printf( "\t(%g Mflop/s)\n\n", ( ( float ) y / ( t2 - t1 ) ) ); } PAPI_get_event_info( events[i], &info ); if (!TESTS_QUIET) { printf( "%20s = ", info.short_descr ); printf( LLDFMT, refvalues[i] ); printf( "\n" ); } if ( ( retval = PAPI_cleanup_eventset( *eventset ) ) ) test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); } if ( ( retval = PAPI_destroy_eventset( eventset ) ) ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); *eventset = PAPI_NULL; } int main( int argc, char **argv ) { PAPI_event_info_t info; int i, j, retval; int iters = NUM_FLOPS; double x = 1.1, y; long long t1, t2; long long values[MAXEVENTS], refvalues[MAXEVENTS]; int sleep_time = SLEEPTIME; int nevents = MAXEVENTS; int eventset = PAPI_NULL; int events[MAXEVENTS]; int quiet; quiet = 
tests_quiet( argc, argv ); if ( argc > 1 ) { if ( !strcmp( argv[1], "TESTS_QUIET" ) ) { } else { sleep_time = atoi( argv[1] ); if ( sleep_time <= 0 ) sleep_time = SLEEPTIME; } } events[0] = PAPI_FP_INS; events[1] = PAPI_TOT_INS; events[2] = PAPI_INT_INS; events[3] = PAPI_TOT_CYC; events[4] = PAPI_STL_CCY; events[5] = PAPI_BR_INS; events[6] = PAPI_SR_INS; events[7] = PAPI_LD_INS; events[8] = PAPI_TOT_IIS; events[9] = PAPI_FAD_INS; events[10] = PAPI_BR_TKN; events[11] = PAPI_BR_MSP; events[12] = PAPI_L1_ICA; events[13] = PAPI_L1_DCA; for ( i = 0; i < MAXEVENTS; i++ ) { values[i] = 0; } if ( !quiet ) { printf( "\nAccuracy check of multiplexing routines.\n" ); printf( "Comparing a multiplex measurement with separate measurements.\n\n" ); } /* Initialize PAPI */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* Iterate through event list and remove those that aren't suitable */ nevents = MAXEVENTS; for ( i = 0; i < nevents; i++ ) { if (( PAPI_get_event_info( events[i], &info ) == PAPI_OK ) && (info.count && (strcmp( info.derived, "NOT_DERIVED")==0))) { if (!quiet) printf( "Added %s\n", info.symbol ); } else { for ( j = i; j < MAXEVENTS-1; j++ ) { events[j] = events[j + 1]; } nevents--; i--; } } /* Skip test if not enough events available */ if ( nevents < 2 ) { test_skip( __FILE__, __LINE__, "Not enough events to multiplex...", 0 ); } if (!quiet) printf( "Using %d events\n\n", nevents ); retval = PAPI_multiplex_init( ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI multiplex init fail\n", retval ); } /* Find a reasonable number of iterations (each * event active 20 times) during the measurement */ /* Target: 10000 usec/multiplex, 20 repeats */ t2 = 10000 * 20 * nevents; if ( t2 > 30e6 ) { test_skip( __FILE__, __LINE__, "This test takes too much time", retval ); } /* Warmup? 
*/ y = do_flops3( x, iters, 1 ); /* Measure time of one run */ t1 = PAPI_get_real_usec( ); y = do_flops3( x, iters, 1 ); t1 = PAPI_get_real_usec( ) - t1; if (t1==0) { test_fail(__FILE__, __LINE__, "do_flops3 takes no time to run!\n", retval); } /* Scale up execution time to match t2 */ if ( t2 > t1 ) { iters = iters * ( int ) ( t2 / t1 ); if (!quiet) { printf( "Modified iteration count to %d\n\n", iters ); } } if (!quiet) fprintf(stdout,"y=%lf\n",y); /* Now loop through the items one at a time */ ref_measurements( iters, &eventset, events, nevents, refvalues ); /* Now check multiplexed */ if ( ( retval = PAPI_create_eventset( &eventset ) ) ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); /* In Component PAPI, EventSets must be assigned a component index before you can fiddle with their internals. 0 is always the cpu component */ retval = PAPI_assign_eventset_component( eventset, 0 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_assign_eventset_component", retval ); if ( ( retval = PAPI_set_multiplex( eventset ) ) ) { if ( retval == PAPI_ENOSUPP) { test_skip(__FILE__, __LINE__, "Multiplex not supported", 1); } test_fail( __FILE__, __LINE__, "PAPI_set_multiplex", retval ); } if ( ( retval = PAPI_add_events( eventset, events, nevents ) ) ) test_fail( __FILE__, __LINE__, "PAPI_add_events", retval ); if (!quiet) printf( "\nPAPI multiplexed measurements:\n" ); x = 1.0; t1 = PAPI_get_real_usec( ); if ( ( retval = PAPI_start( eventset ) ) ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); y = do_flops3( x, iters, 1 ); if ( ( retval = PAPI_stop( eventset, values ) ) ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); t2 = PAPI_get_real_usec( ); for ( j = 0; j < nevents; j++ ) { PAPI_get_event_info( events[j], &info ); if ( !quiet ) { printf( "%20s = ", info.short_descr ); printf( LLDFMT, values[j] ); printf( "\n" ); } } check_values( eventset, events, nevents, values, refvalues ); if ( ( retval = PAPI_remove_events( eventset, 
events, nevents ) ) ) test_fail( __FILE__, __LINE__, "PAPI_remove_events", retval ); if ( ( retval = PAPI_cleanup_eventset( eventset ) ) ) test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); if ( ( retval = PAPI_destroy_eventset( &eventset ) ) ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); eventset = PAPI_NULL; /* Now loop through the items one at a time */ ref_measurements( iters, &eventset, events, nevents, refvalues ); check_values( eventset, events, nevents, values, refvalues ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/sdsc2.c000066400000000000000000000150401502707512200171160ustar00rootroot00000000000000/* * Test example for multiplex functionality, originally * provided by Timothy Kaiser, SDSC. It was modified to fit the * PAPI test suite by Nils Smeds, . * * This example verifies the PAPI_reset function for * multiplexed events */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include "papi.h" #include "papi_test.h" #include "testcode.h" #define REPEATS 5 #define MAXEVENTS 9 #define SLEEPTIME 100 #define MINCOUNTS 100000 #define MPX_TOLERANCE 0.20 #define NUM_FLOPS 20000000 int main( int argc, char **argv ) { PAPI_event_info_t info; int i, j, retval; int iters = NUM_FLOPS; double x = 1.1, y, dtmp; long long t1, t2; long long values[MAXEVENTS]; int sleep_time = SLEEPTIME; #ifdef STARTSTOP long long dummies[MAXEVENTS]; #endif double valsample[MAXEVENTS][REPEATS]; double valsum[MAXEVENTS]; double avg[MAXEVENTS]; double spread[MAXEVENTS]; int nevents = MAXEVENTS; int eventset = PAPI_NULL; int events[MAXEVENTS]; int fails; int quiet; /* Set the quiet variable */ quiet = tests_quiet( argc, argv ); /* Parse command line */ if ( argc > 1 ) { if ( !strcmp( argv[1], "TESTS_QUIET" ) ) { } else { sleep_time = atoi( argv[1] ); if ( sleep_time <= 0 ) sleep_time = SLEEPTIME; } } events[0] = PAPI_FP_INS; events[1] = PAPI_TOT_INS; events[2] = PAPI_INT_INS; events[3] = PAPI_TOT_CYC; events[4] = PAPI_STL_CCY; events[5] = 
PAPI_BR_INS; events[6] = PAPI_SR_INS; events[7] = PAPI_LD_INS; events[8] = PAPI_TOT_IIS; for ( i = 0; i < MAXEVENTS; i++ ) { values[i] = 0; valsum[i] = 0; } if ( !quiet ) { printf( "\nAccuracy check of multiplexing routines.\n" ); printf( "Investigating the variance of multiplexed measurements.\n\n" ); } retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } #ifdef MPX retval = PAPI_multiplex_init( ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI multiplex init fail\n", retval ); } #endif if ( ( retval = PAPI_create_eventset( &eventset ) ) ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } #ifdef MPX /* In Component PAPI, EventSets must be assigned a component index before you can fiddle with their internals. 0 is always the cpu component */ retval = PAPI_assign_eventset_component( eventset, 0 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_assign_eventset_component", retval ); } if ( ( retval = PAPI_set_multiplex( eventset ) ) ) { if ( retval == PAPI_ENOSUPP) { test_skip(__FILE__, __LINE__, "Multiplex not supported", 1); } test_fail( __FILE__, __LINE__, "PAPI_set_multiplex", retval ); } #endif /* Iterate through event list and remove those that aren't available */ nevents = MAXEVENTS; for ( i = 0; i < nevents; i++ ) { if ( ( retval = PAPI_add_event( eventset, events[i] ) ) ) { for ( j = i; j < MAXEVENTS-1; j++ ) { events[j] = events[j + 1]; } nevents--; i--; } } /* Skip test if not enough events available */ if ( nevents < 2 ) { test_skip( __FILE__, __LINE__, "Not enough events left...", 0 ); } /* Find a reasonable number of iterations (each * event active 20 times) during the measurement */ /* Target: 10000 usec/multiplex, 20 repeats */ t2 = 10000 * 20 * nevents; if ( t2 > 30e6 ) { test_skip( __FILE__, __LINE__, "This test takes too much time", retval ); } /* Measure time of one iteration */ t1 = PAPI_get_real_usec( 
); y = do_flops3( x, iters, 1 ); t1 = PAPI_get_real_usec( ) - t1; /* Scale up execution time to match t2 */ if ( t2 > t1 ) { iters = iters * ( int ) ( t2 / t1 ); } /* Make sure execution time is < 30s per repeated test */ else if ( t1 > 30e6 ) { test_skip( __FILE__, __LINE__, "This test takes too much time", retval ); } if ( ( retval = PAPI_start( eventset ) ) ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } for ( i = 1; i <= REPEATS; i++ ) { x = 1.0; #ifndef STARTSTOP if ( ( retval = PAPI_reset( eventset ) ) ) test_fail( __FILE__, __LINE__, "PAPI_reset", retval ); #else if ( ( retval = PAPI_stop( eventset, dummies ) ) ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); if ( ( retval = PAPI_start( eventset ) ) ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); #endif if ( !quiet ) { printf( "\nTest %d (of %d):\n", i, REPEATS ); } t1 = PAPI_get_real_usec( ); y = do_flops3( x, iters, 1 ); PAPI_read( eventset, values ); t2 = PAPI_get_real_usec( ); if ( !quiet ) { printf( "\n(calculated independent of PAPI)\n" ); printf( "\tOperations= %.1f Mflop", y * 1e-6 ); printf( "\t(%g Mflop/s)\n\n", ( y / ( double ) ( t2 - t1 ) ) ); printf( "PAPI measurements:\n" ); for ( j = 0; j < nevents; j++ ) { PAPI_get_event_info( events[j], &info ); printf( "%20s = ", info.short_descr ); printf( "%lld", values[j] ); printf( "\n" ); } printf( "\n" ); } /* Calculate values */ for ( j = 0; j < nevents; j++ ) { dtmp = ( double ) values[j]; valsum[j] += dtmp; valsample[j][i - 1] = dtmp; } } if ( ( retval = PAPI_stop( eventset, values ) ) ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); if ( !quiet ) { printf( "\n\nEstimated variance relative " "to average counts:\n" ); for ( j = 0; j < nevents; j++ ) printf( " Event %.2d", j ); printf( "\n" ); } fails = nevents; /* Due to the limited precision of floating point we cannot really use the typical standard deviation computation for large numbers with very small variations. Instead compute the std deviation from differences to the mean to avoid problems with precision. 
*/ for ( j = 0; j < nevents; j++ ) { avg[j] = valsum[j] / REPEATS; spread[j] = 0; for ( i = 0; i < REPEATS; ++i ) { double diff = ( valsample[j][i] - avg[j] ); spread[j] += diff * diff; } spread[j] = sqrt( spread[j] / REPEATS ) / avg[j]; if ( !quiet ) printf( "%9.2g ", spread[j] ); /* Make sure that NaN gets counted as an error */ if ( spread[j] < MPX_TOLERANCE ) { --fails; } /* Neglect imprecise results with low counts */ else if ( valsum[j] < MINCOUNTS ) { --fails; } } if ( !quiet ) { printf( "\n\n" ); for ( j = 0; j < nevents; j++ ) { PAPI_get_event_info( events[j], &info ); printf( "Event %.2d: mean=%10.0f, " "sdev/mean=%7.2g nrpt=%2d -- %s\n", j, avg[j], spread[j], REPEATS, info.short_descr ); } printf( "\n\n" ); } if ( fails ) { test_fail( __FILE__, __LINE__, "Values outside threshold", fails ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/sdsc4-mpx.c000066400000000000000000000261261502707512200177310ustar00rootroot00000000000000/* * Test example for multiplex functionality, originally * provided by Timothy Kaiser, SDSC. It was modified to fit the * PAPI test suite by Nils Smeds. * * This example verifies the adding and removal of multiplexed * events in an event set. 
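Both sdsc multiplex tests filter their preset event list with the same in-place compaction idiom: when an event cannot be added, the tail of the array is shifted down one slot, the count shrinks, and the loop index is decremented so the shifted-in element is re-examined. A minimal sketch with a stand-in predicate (`keep_even`) in place of a successful `PAPI_add_event` call:

```c
/* Compact arr[] in place, keeping only values for which keep()
   returns nonzero.  Mirrors the tests' removal loop: on a failed
   add, shift the tail left, shrink the count, and retry the same
   index so the element shifted into this slot is not skipped. */
int compact( int *arr, int n, int ( *keep )( int ) )
{
    int i, j;

    for ( i = 0; i < n; i++ ) {
        if ( !keep( arr[i] ) ) {
            for ( j = i; j < n - 1; j++ )
                arr[j] = arr[j + 1];
            n--;
            i--;    /* re-test the element just shifted in */
        }
    }
    return n;       /* new length */
}

/* Example predicate: keep even values (stands in for
   "this event is available on the current hardware"). */
static int keep_even( int v ) { return ( v % 2 ) == 0; }
```

Forgetting the `i--` is the classic bug in this pattern: two unavailable events in a row would leave the second one in the list.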
*/ #include #include #include #include #include #include "papi.h" #include "papi_test.h" #include "testcode.h" #define MAXEVENTS 9 #define REPEATS (MAXEVENTS * 4) #define SLEEPTIME 100 #define MINCOUNTS 100000 #define MPX_TOLERANCE 0.20 #define NUM_FLOPS 20000000 int main( int argc, char **argv ) { PAPI_event_info_t info; char name2[PAPI_MAX_STR_LEN]; int i, j, retval, idx, repeats; int iters = NUM_FLOPS; double x = 1.1, y, dtmp; long long t1, t2; long long values[MAXEVENTS], refvals[MAXEVENTS]; int nsamples[MAXEVENTS], truelist[MAXEVENTS], ntrue; #ifdef STARTSTOP long long dummies[MAXEVENTS]; #endif int sleep_time = SLEEPTIME; double valsample[MAXEVENTS][REPEATS]; double valsum[MAXEVENTS]; double avg[MAXEVENTS]; double spread[MAXEVENTS]; int nevents = MAXEVENTS, nev1; int eventset = PAPI_NULL; int events[MAXEVENTS]; int eventidx[MAXEVENTS]; int eventmap[MAXEVENTS]; int fails; int quiet; quiet = tests_quiet( argc, argv ); if ( argc > 1 ) { if ( !strcmp( argv[1], "quiet" ) ) { } else { sleep_time = atoi( argv[1] ); if ( sleep_time <= 0 ) sleep_time = SLEEPTIME; } } events[0] = PAPI_FP_INS; events[1] = PAPI_TOT_CYC; events[2] = PAPI_TOT_INS; events[3] = PAPI_TOT_IIS; events[4] = PAPI_INT_INS; events[5] = PAPI_STL_CCY; events[6] = PAPI_BR_INS; events[7] = PAPI_SR_INS; events[8] = PAPI_LD_INS; for ( i = 0; i < MAXEVENTS; i++ ) { values[i] = 0; valsum[i] = 0; nsamples[i] = 0; } /* Print test summary */ if ( !quiet ) { printf( "\nFunctional check of multiplexing routines.\n" ); printf( "Adding and removing events from an event set.\n\n" ); } /* Init the library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* Enable multiplexing */ #ifdef MPX retval = PAPI_multiplex_init( ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI multiplex init fail\n", retval ); } #endif /* Create an eventset */ if ( ( retval = PAPI_create_eventset( &eventset ) ) ) { 
test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } /* Enable multiplexing on the eventset */ #ifdef MPX /* In Component PAPI, EventSets must be assigned a component index before you can fiddle with their internals. 0 is always the cpu component */ retval = PAPI_assign_eventset_component( eventset, 0 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_assign_eventset_component", retval ); if ( ( retval = PAPI_set_multiplex( eventset ) ) ) { if ( retval == PAPI_ENOSUPP) { test_skip(__FILE__, __LINE__, "Multiplex not supported", 1); } test_fail( __FILE__, __LINE__, "PAPI_set_multiplex", retval ); } #endif /* See which events are available and remove the ones that aren't */ nevents = MAXEVENTS; for ( i = 0; i < nevents; i++ ) { if ( ( retval = PAPI_add_event( eventset, events[i] ) ) ) { for ( j = i; j < MAXEVENTS-1; j++ ) events[j] = events[j + 1]; nevents--; i--; } } /* We want at least three events? */ /* Seems arbitrary. Might be because intel machines used to */ /* Only have two event slots */ if ( nevents < 3 ) { test_skip( __FILE__, __LINE__, "Not enough events left...", 0 ); } /* Find a reasonable number of iterations (each * event active 20 times) during the measurement */ /* TODO: find Linux multiplex interval */ /* not sure if 10ms is close or not */ /* Target: 10000 usec/multiplex, 20 repeats */ t2 = 10000 * 20 * nevents; if ( t2 > 30e6 ) { test_skip( __FILE__, __LINE__, "This test takes too much time", retval ); } /* Measure one run */ t1 = PAPI_get_real_usec( ); y = do_flops3( x, iters, 1 ); t1 = PAPI_get_real_usec( ) - t1; /* Scale up execution time to match t2 */ if ( t2 > t1 ) { iters = iters * ( int ) ( t2 / t1 ); } /* Make sure execution time is < 30s per repeated test */ else if ( t1 > 30e6 ) { test_skip( __FILE__, __LINE__, "This test takes too much time", retval ); } /* Split the events up by odd and even? 
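The odd/even split asked about in the comment above is done by two loops that fill `eventidx[]` back to front: odd indices first, then even ones, so removals later alternate between the two halves of the list. A standalone sketch of the resulting order (helper name is ours):

```c
/* Fill out[] back to front: odd indices first, then even ones.
   For n events the result is the even indices in descending order
   followed by the odd indices in descending order,
   e.g. n = 4 gives { 2, 0, 3, 1 }. */
void interleave_indices( int *out, int n )
{
    int i, j = n;

    for ( i = 1; i < n; i += 2 )
        out[--j] = i;   /* odds land at the back */
    for ( i = 0; i < n; i += 2 )
        out[--j] = i;   /* evens fill the front */
}
```

The `assert( j == 0 )` in the test verifies that the two loops together touch every slot exactly once.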
*/ j = nevents; for ( i = 1; i < nevents; i = i + 2 ) eventidx[--j] = i; for ( i = 0; i < nevents; i = i + 2 ) eventidx[--j] = i; assert( j == 0 ); /* put event mapping in eventmap? */ for ( i = 0; i < nevents; i++ ) eventmap[i] = i; x = 1.0; /* Make a reference run */ if ( !quiet ) { printf( "\nReference run:\n" ); } t1 = PAPI_get_real_usec( ); if ( ( retval = PAPI_start( eventset ) ) ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } y = do_flops3( x, iters, 1 ); PAPI_read( eventset, refvals ); t2 = PAPI_get_real_usec( ); /* Print results */ ntrue = nevents; PAPI_list_events( eventset, truelist, &ntrue ); if ( !quiet ) { printf( "\tOperations= %.1f Mflop", y * 1e-6 ); printf( "\t(%g Mflop/s)\n\n", ( y / ( double ) ( t2 - t1 ) ) ); printf( "%20s %16s %-15s %-15s\n", "PAPI measurement:", "Acquired count", "Expected event", "PAPI_list_events" ); for ( j = 0; j < nevents; j++ ) { PAPI_get_event_info( events[j], &info ); PAPI_event_code_to_name( truelist[j], name2 ); printf( "%20s = %16lld %-15s %-15s %s\n", info.short_descr, refvals[j], info.symbol, name2, strcmp( info.symbol,name2 ) ? "*** MISMATCH ***" : "" ); } printf( "\n" ); } /* Make repeated runs while removing/re-adding events */ nev1 = nevents; repeats = nevents * 4; /* Repeat four times for each event? */ for ( i = 0; i < repeats; i++ ) { /* What's going on here? As an example, nevents=4, repeats=16*/ /* 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 == i*/ /* 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 == i%nevents */ /* 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 == (i%nevents)+1 */ /* 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 */ /* so we skip every NEVENTS-th time through the loop? 
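The table in the comment above can be checked directly: with nevents = 4 the loop body is skipped exactly on iterations where i % nevents equals nevents - 1, i.e. i = 3, 7, 11, 15. A tiny sketch of that predicate (helper name is ours):

```c
/* Mirror of the test's skip condition: one iteration out of every
   nevents is skipped, namely those where i % nevents == nevents-1. */
int is_skipped( int i, int nevents )
{
    return ( i % nevents ) + 1 == nevents;
}
```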
*/ if ( ( i % nevents ) + 1 == nevents ) continue; if ( !quiet ) { printf( "\nTest %d (of %d):\n", i + 1 - (i / nevents), repeats - 4 ); } /* Stop the counter, it's been left running */ if ( ( retval = PAPI_stop( eventset, values ) ) ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } /* We run through a 4-way pattern */ /* 1st quarter, remove events */ /* 2nd quarter, add back events */ /* 3rd quarter, remove events again */ /* 4th quarter, re-add events */ j = eventidx[i % nevents]; if ( ( i / nevents ) % 2 == 0 ) { /* Remove event */ PAPI_get_event_info( events[j], &info ); if ( !quiet ) { printf( "Removing event[%d]: %s\n", j, info.short_descr ); } retval = PAPI_remove_event( eventset, events[j] ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_remove_event", retval ); } /* Update the complex event mapping */ nev1--; for ( idx = 0; eventmap[idx] != j; idx++ ); for ( j = idx; j < nev1; j++ ) eventmap[j] = eventmap[j + 1]; } else { /* Add an event back in */ PAPI_get_event_info( events[j], &info ); if ( !quiet ) { printf( "Adding event[%d]: %s\n", j, info.short_descr ); } retval = PAPI_add_event( eventset, events[j] ); if (retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); } eventmap[nev1] = j; nev1++; } if ( ( retval = PAPI_start( eventset ) ) ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } x = 1.0; /* This startstop is leftover from sdsc2? 
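The `eventmap[]` bookkeeping above keeps a dense list of which original event slots are currently in the event set: removal finds the slot and shifts the tail down, re-adding appends at the end. A standalone sketch of the two operations (function names are ours):

```c
/* Remove value v from a dense map of n entries; returns the new n.
   v is assumed to be present, as in the test (it was just removed
   from the event set successfully). */
int map_remove( int *map, int n, int v )
{
    int idx, j;

    for ( idx = 0; map[idx] != v; idx++ );  /* find the slot */
    n--;
    for ( j = idx; j < n; j++ )
        map[j] = map[j + 1];                /* close the gap */
    return n;
}

/* Re-add value v at the end of the dense map; returns the new n. */
int map_add( int *map, int n, int v )
{
    map[n] = v;
    return n + 1;
}
```

Because re-added events go to the end, counter slot j no longer corresponds to original event j, which is why the printing loop indirects through `eventmap`.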
*/ #ifndef STARTSTOP if ( ( retval = PAPI_reset( eventset ) ) ) test_fail( __FILE__, __LINE__, "PAPI_reset", retval ); #else if ( ( retval = PAPI_stop( eventset, dummies ) ) ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); if ( ( retval = PAPI_start( eventset ) ) ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); #endif /* Run the actual workload */ t1 = PAPI_get_real_usec( ); y = do_flops3( x, iters, 1 ); PAPI_read( eventset, values ); t2 = PAPI_get_real_usec( ); /* Print approximate flops plus header */ if ( !quiet ) { printf( "\n(calculated independent of PAPI)\n" ); printf( "\tOperations= %.1f Mflop", y * 1e-6 ); printf( "\t(%g Mflop/s)\n\n", ( y / ( double ) ( t2 - t1 ) ) ); printf( "%20s %16s %-15s %-15s\n", "PAPI measurement:", "Acquired count", "Expected event", "PAPI_list_events" ); ntrue = nev1; PAPI_list_events( eventset, truelist, &ntrue ); for ( j = 0; j < nev1; j++ ) { idx = eventmap[j]; /* printf("Mapping: Counter %d -> slot %d.\n",j,idx); */ PAPI_get_event_info( events[idx], &info ); PAPI_event_code_to_name( truelist[j], name2 ); printf( "%20s = %16lld %-15s %-15s %s\n", info.short_descr, values[j], info.symbol, name2, strcmp( info.symbol, name2 ) ? "*** MISMATCH ***" : "" ); } printf( "\n" ); } /* Calculate results */ for ( j = 0; j < nev1; j++ ) { idx = eventmap[j]; dtmp = ( double ) values[j]; valsum[idx] += dtmp; valsample[idx][nsamples[idx]] = dtmp; nsamples[idx]++; } } /* Stop event for good */ if ( ( retval = PAPI_stop( eventset, values ) ) ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } if ( !quiet ) { printf( "\n\nEstimated variance relative " "to average counts:\n" ); for ( j = 0; j < nev1; j++ ) { printf( " Event %.2d", j ); } printf( "\n" ); } fails = nevents; /* Due to the limited precision of floating point we cannot really use the typical standard deviation computation for large numbers with very small variations. Instead compute the std deviation from differences to the mean to avoid problems with precision. 
*/ /* Update so that if our event count is small (<1000 or so) */ /* then don't fail with high variation. Since we're multiplexing */ /* it's hard to capture such small counts, and it makes the test */ /* fail on machines such as Haswell and the PAPI_SR_INS event */ for ( j = 0; j < nev1; j++ ) { avg[j] = valsum[j] / nsamples[j]; spread[j] = 0; for ( i = 0; i < nsamples[j]; ++i ) { double diff = ( valsample[j][i] - avg[j] ); spread[j] += diff * diff; } spread[j] = sqrt( spread[j] / nsamples[j] ) / avg[j]; if ( !quiet ) { printf( "%9.2g ", spread[j] ); } } for ( j = 0; j < nev1; j++ ) { /* Make sure that NaN gets counted as an error */ if ( spread[j] < MPX_TOLERANCE ) { if (!quiet) printf("Event %d tolerance good\n",j); fails--; } /* Neglect imprecise results with low counts */ else if ( avg[j] < MINCOUNTS ) { if (!quiet) printf("Event %d too small to fail\n",j); fails--; } else { if (!quiet) printf("Event %d failed!\n",j); } } if ( !quiet ) { printf( "\n\n" ); for ( j = 0; j < nev1; j++ ) { PAPI_get_event_info( events[j], &info ); printf( "Event %.2d: mean=%10.0f, " "sdev/mean=%7.2g nrpt=%2d -- %s\n", j, avg[j], spread[j], nsamples[j], info.short_descr ); } printf( "\n\n" ); } if ( fails ) { test_fail( __FILE__, __LINE__, "Values differ from reference", fails ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/second.c000066400000000000000000000450301502707512200173550ustar00rootroot00000000000000/* This file performs the following test: counter domain testing - It attempts to use the following two counters. It may use fewer depending on hardware counter resource limitations. 
+ PAPI_TOT_INS + PAPI_TOT_CYC - Start system domain counters - Do flops - Stop and read system domain counters - Start kernel domain counters - Do flops - Stop and read kernel domain counters - Start user domain counters - Do flops - Stop and read user domain counters */ #include #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #define TAB_DOM "%s%12lld%15lld%17lld\n" #define CASE2 0 #define CREATE 1 #define ADD 2 #define MIDDLE 3 #define CHANGE 4 #define SUPERVISOR 5 void dump_and_verify( int test_case, long long **values ) { long long min, max, min2, max2; if (!TESTS_QUIET) { printf( "-----------------------------------------------------------------\n" ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf( "-------------------------------------------------------------\n" ); } if ( test_case == CASE2 ) { if (!TESTS_QUIET) { printf( "Test type : Before Create Before Add Between Adds\n" ); printf( TAB_DOM, "PAPI_TOT_INS: ", ( values[0] )[0], ( values[1] )[0], ( values[2] )[0] ); printf( TAB_DOM, "PAPI_TOT_CYC: ", ( values[0] )[1], ( values[1] )[1], ( values[2] )[1] ); printf( "-------------------------------------------------------------\n" ); printf( "Verification:\n" ); printf( "Both rows equal 'n N N' where n << N\n" ); return; } } else if ( test_case == CHANGE ) { min = ( long long ) ( ( double ) values[0][0] * ( 1 - TOLERANCE ) ); max = ( long long ) ( ( double ) values[0][0] * ( 1 + TOLERANCE ) ); if ( values[1][0] > max || values[1][0] < min ) test_fail( __FILE__, __LINE__, "PAPI_TOT_INS", 1 ); min = ( long long ) ( ( double ) values[1][1] * ( 1 - TOLERANCE ) ); max = ( long long ) ( ( double ) values[1][1] * ( 1 + TOLERANCE ) ); if ( ( values[2][1] + values[0][1] ) > max || ( values[2][1] + values[0][1] ) < min ) test_fail( __FILE__, __LINE__, "PAPI_TOT_CYC", 1 ); if (!TESTS_QUIET) { printf( "Test type : PAPI_DOM_ALL PAPI_DOM_KERNEL PAPI_DOM_USER\n" ); printf( TAB_DOM, "PAPI_TOT_INS: ", ( values[1] )[0], ( 
values[2] )[0], ( values[0] )[0] ); printf( TAB_DOM, "PAPI_TOT_CYC: ", ( values[1] )[1], ( values[2] )[1], ( values[0] )[1] ); printf( "-------------------------------------------------------------\n" ); printf( "Verification:\n" ); printf( "Both rows approximately equal '(N+n) n N', where n << N\n" ); printf( "Column 1 approximately equals column 2 plus column 3\n" ); } } else if ( test_case == SUPERVISOR ) { if (!TESTS_QUIET) { printf( "Test type : PAPI_DOM_ALL All-minus-supervisor Supervisor-only\n" ); printf( TAB_DOM, "PAPI_TOT_INS: ", ( values[0] )[0], ( values[1] )[0], ( values[2] )[0] ); printf( TAB_DOM, "PAPI_TOT_CYC: ", ( values[0] )[1], ( values[1] )[1], ( values[2] )[1] ); printf( "-------------------------------------------------------------\n" ); printf( "Verification:\n" ); printf( "Both rows approximately equal '(N+n) n N', where n << N\n" ); printf( "Column 1 approximately equals column 2 plus column 3\n" ); } } else { min = ( long long ) ( ( double ) values[2][0] * ( 1 - TOLERANCE ) ); max = ( long long ) ( ( double ) values[2][0] * ( 1 + TOLERANCE ) ); min2 = ( long long ) ( ( double ) values[0][1] * ( 1 - TOLERANCE ) ); max2 = ( long long ) ( ( double ) ( double ) values[0][1] * ( 1 + TOLERANCE ) ); if (!TESTS_QUIET) { printf( "Test type : PAPI_DOM_ALL PAPI_DOM_KERNEL PAPI_DOM_USER\n" ); printf( TAB_DOM, "PAPI_TOT_INS: ", ( values[0] )[0], ( values[1] )[0], ( values[2] )[0] ); printf( TAB_DOM, "PAPI_TOT_CYC: ", ( values[0] )[1], ( values[1] )[1], ( values[2] )[1] ); printf( "-------------------------------------------------------------\n" ); printf( "Verification:\n" ); printf( "Both rows approximately equal '(N+n) n N', where n << N\n" ); printf( "Column 1 approximately equals column 2 plus column 3\n" ); } if ( values[0][0] > max || values[0][0] < min ) test_fail( __FILE__, __LINE__, "PAPI_TOT_INS", 1 ); if ( ( values[1][1] + values[2][1] ) > max2 || ( values[1][1] + values[2][1] ) < min2 ) test_fail( __FILE__, __LINE__, "PAPI_TOT_CYC", 1 ); } 
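dump_and_verify accepts a measurement when it falls inside a ±TOLERANCE window around the reference value, and it additionally checks that the DOM_ALL column roughly equals DOM_KERNEL plus DOM_USER. Those two checks can be sketched as small helpers (names are ours, not from second.c):

```c
/* 1 if measured lies within ref * (1 -/+ tol), matching the
   min/max window computation used by dump_and_verify. */
int within_tolerance( long long ref, long long measured, double tol )
{
    long long min = ( long long ) ( ( double ) ref * ( 1.0 - tol ) );
    long long max = ( long long ) ( ( double ) ref * ( 1.0 + tol ) );

    return measured >= min && measured <= max;
}

/* Counts in DOM_ALL should approximately equal the sum of the
   DOM_KERNEL and DOM_USER counts. */
int sums_to_all( long long all, long long kernel, long long user, double tol )
{
    return within_tolerance( all, kernel + user, tol );
}
```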
if ( values[0][0] == 0 || values[0][1] == 0 || values[1][0] == 0 || values[1][1] == 0 ) test_fail( __FILE__, __LINE__, "Verify non-zero count for all domain types", 1 ); if ( values[2][0] == 0 || values[2][1] == 0 ) { if ( test_case == SUPERVISOR ) { if (!TESTS_QUIET) printf( "WARNING: No events counted in supervisor context. This is expected in a non-virtualized environment.\n" ); } else { test_fail( __FILE__, __LINE__, "Verify non-zero count for all domain types", 1 ); } } } /* Do the set_domain on the eventset before adding events */ void case1( int num ) { int retval, num_tests = 3; long long **values; int EventSet1 = PAPI_NULL, EventSet2 = PAPI_NULL, EventSet3 = PAPI_NULL; PAPI_option_t options; const PAPI_component_info_t *cmpinfo; memset( &options, 0x0, sizeof ( options ) ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); /* get info from cpu component */ cmpinfo = PAPI_get_component_info( 0 ); if ( cmpinfo == NULL ) { test_fail( __FILE__, __LINE__,"PAPI_get_component_info", PAPI_ECMP); } if ( ( retval = PAPI_query_event( PAPI_TOT_INS ) ) != PAPI_OK ) test_skip( __FILE__, __LINE__, "PAPI_query_event", retval ); if ( ( retval = PAPI_query_event( PAPI_TOT_CYC ) ) != PAPI_OK ) test_skip( __FILE__, __LINE__, "PAPI_query_event", retval ); retval = PAPI_create_eventset( &EventSet1 ); if ( retval == PAPI_OK ) retval = PAPI_create_eventset( &EventSet2 ); if ( retval == PAPI_OK ) retval = PAPI_create_eventset( &EventSet3 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); /* In Component PAPI, EventSets must be assigned a component index before you can fiddle with their internals. 
0 is always the cpu component */ retval = PAPI_assign_eventset_component( EventSet1, 0 ); if ( retval == PAPI_OK ) retval = PAPI_assign_eventset_component( EventSet2, 0 ); if ( retval == PAPI_OK ) retval = PAPI_assign_eventset_component( EventSet3, 0 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_assign_eventset_component", retval ); if ( num == CREATE ) { if (!TESTS_QUIET) printf( "\nTest case CREATE: Call PAPI_set_opt(PAPI_DOMAIN) on EventSet before add\n" ); options.domain.eventset = EventSet1; options.domain.domain = PAPI_DOM_ALL; retval = PAPI_set_opt( PAPI_DOMAIN, &options ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_opt", retval ); options.domain.eventset = EventSet2; options.domain.domain = PAPI_DOM_KERNEL; retval = PAPI_set_opt( PAPI_DOMAIN, &options ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_opt", retval ); options.domain.eventset = EventSet3; options.domain.domain = PAPI_DOM_USER; retval = PAPI_set_opt( PAPI_DOMAIN, &options ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_opt", retval ); } retval = PAPI_add_event( EventSet1, PAPI_TOT_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event(PAPI_TOT_INS)", retval ); retval = PAPI_add_event( EventSet1, PAPI_TOT_CYC ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event(PAPI_TOT_CYC)", retval ); retval = PAPI_add_event( EventSet2, PAPI_TOT_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event(PAPI_TOT_INS)", retval ); retval = PAPI_add_event( EventSet2, PAPI_TOT_CYC ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event(PAPI_TOT_CYC)", retval ); retval = PAPI_add_event( EventSet3, PAPI_TOT_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event(PAPI_TOT_INS)", retval ); if ( num == MIDDLE ) { if (!TESTS_QUIET) printf( "\nTest case MIDDLE: Call PAPI_set_opt(PAPI_DOMAIN) on EventSet between adds\n" ); 
options.domain.eventset = EventSet1; options.domain.domain = PAPI_DOM_ALL; retval = PAPI_set_opt( PAPI_DOMAIN, &options ); if ( retval != PAPI_OK && retval != PAPI_ECMP ) { test_fail( __FILE__, __LINE__, "PAPI_set_opt", retval ); } options.domain.eventset = EventSet2; options.domain.domain = PAPI_DOM_KERNEL; retval = PAPI_set_opt( PAPI_DOMAIN, &options ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_opt", retval ); options.domain.eventset = EventSet3; options.domain.domain = PAPI_DOM_USER; retval = PAPI_set_opt( PAPI_DOMAIN, &options ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_opt", retval ); } retval = PAPI_add_event( EventSet3, PAPI_TOT_CYC ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event(PAPI_TOT_CYC)", retval ); if ( num == ADD ) { if (!TESTS_QUIET) printf( "\nTest case ADD: Call PAPI_set_opt(PAPI_DOMAIN) on EventSet after add\n" ); options.domain.eventset = EventSet1; options.domain.domain = PAPI_DOM_ALL; retval = PAPI_set_opt( PAPI_DOMAIN, &options ); if ( retval != PAPI_OK && retval != PAPI_ECMP ) { test_fail( __FILE__, __LINE__, "PAPI_set_opt", retval ); } options.domain.eventset = EventSet2; options.domain.domain = PAPI_DOM_KERNEL; retval = PAPI_set_opt( PAPI_DOMAIN, &options ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_opt", retval ); options.domain.eventset = EventSet3; options.domain.domain = PAPI_DOM_USER; retval = PAPI_set_opt( PAPI_DOMAIN, &options ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_opt", retval ); } /* 2 events */ values = allocate_test_space( num_tests, 2 ); if ( num == CHANGE ) { /* This testcase is dependent on the CREATE testcase running immediately before it, using * domain settings of "All", "Kernel" and "User", on event sets 1, 2, and 3, respectively. 
*/ PAPI_option_t option; if (!TESTS_QUIET) printf( "\nTest case CHANGE 1: Change domain on EventSet between runs, using generic domain options:\n" ); PAPI_start( EventSet1 ); PAPI_stop( EventSet1, values[0] ); // change EventSet1 domain from All to User option.domain.domain = PAPI_DOM_USER; option.domain.eventset = EventSet1; retval = PAPI_set_opt( PAPI_DOMAIN, &option ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_domain", retval ); PAPI_start( EventSet2 ); PAPI_stop( EventSet2, values[1] ); // change EventSet2 domain from Kernel to All option.domain.domain = PAPI_DOM_ALL; option.domain.eventset = EventSet2; retval = PAPI_set_opt( PAPI_DOMAIN, &option ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_domain", retval ); PAPI_start( EventSet3 ); PAPI_stop( EventSet3, values[2] ); // change EventSet3 domain from User to Kernel option.domain.domain = PAPI_DOM_KERNEL; option.domain.eventset = EventSet3; retval = PAPI_set_opt( PAPI_DOMAIN, &option ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_domain", retval ); free_test_space( values, num_tests ); values = allocate_test_space( num_tests, 2 ); } if ( num == SUPERVISOR && ( cmpinfo->available_domains & PAPI_DOM_SUPERVISOR ) ) { PAPI_option_t option; if (!TESTS_QUIET) printf( "\nTest case CHANGE 2: Change domain on EventSets to include/exclude supervisor events:\n" ); option.domain.domain = PAPI_DOM_ALL; option.domain.eventset = EventSet1; retval = PAPI_set_opt( PAPI_DOMAIN, &option ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_domain ALL ", retval ); option.domain.domain = PAPI_DOM_ALL ^ PAPI_DOM_SUPERVISOR; option.domain.eventset = EventSet2; retval = PAPI_set_opt( PAPI_DOMAIN, &option ); if ( retval != PAPI_OK ) { /* DOM_ALL is special-cased as domains_available */ /* in papi.c . Some machines don't like DOM_OTHER */ /* so try that if the above case fails. 
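The SUPERVISOR case above builds its "all minus supervisor" domain as PAPI_DOM_ALL ^ PAPI_DOM_SUPERVISOR: because each PAPI domain is a single-bit flag, XOR with a bit that is set in the mask clears exactly that bit. A sketch with stand-in flag values (the real PAPI_DOM_* constants live in papi.h; the values below are illustrative only):

```c
/* Illustrative single-bit domain flags (NOT the real papi.h values). */
#define DOM_USER       0x1
#define DOM_KERNEL     0x2
#define DOM_OTHER      0x4
#define DOM_SUPERVISOR 0x8
#define DOM_ALL        ( DOM_USER | DOM_KERNEL | DOM_OTHER | DOM_SUPERVISOR )

/* XOR with a bit that is set in mask clears that bit; this is how
   the test derives "everything except supervisor" from DOM_ALL. */
int all_but( int mask, int bit )
{
    return mask ^ bit;
}
```

Note that XOR only behaves as "minus" when the bit is known to be set; the test relies on DOM_ALL containing every domain bit.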
*/ option.domain.domain ^= PAPI_DOM_OTHER; option.domain.eventset = EventSet2; retval = PAPI_set_opt( PAPI_DOMAIN, &option ); if (retval != PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_set_domain ALL^SUPERVISOR ", retval ); } } option.domain.domain = PAPI_DOM_SUPERVISOR; option.domain.eventset = EventSet3; retval = PAPI_set_opt( PAPI_DOMAIN, &option ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_domain SUPERVISOR ", retval ); free_test_space( values, num_tests ); values = allocate_test_space( num_tests, 2 ); } /* Warm it up dude */ PAPI_start( EventSet1 ); do_flops( NUM_FLOPS ); PAPI_stop( EventSet1, NULL ); retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet1, values[0] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_start( EventSet2 ); do_flops( NUM_FLOPS ); if ( retval == PAPI_OK ) { retval = PAPI_stop( EventSet2, values[1] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } else { values[1][0] = retval; values[1][1] = retval; } retval = PAPI_start( EventSet3 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet3, values[2] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_cleanup_eventset( EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_cleanup", retval ); retval = PAPI_destroy_eventset( &EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy", retval ); retval = PAPI_cleanup_eventset( EventSet2 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_cleanup", retval ); retval = PAPI_destroy_eventset( &EventSet2 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy", retval ); retval = PAPI_cleanup_eventset( EventSet3 ); if ( retval != 
PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_cleanup", retval ); retval = PAPI_destroy_eventset( &EventSet3 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy", retval ); dump_and_verify( num, values ); free(values); PAPI_shutdown( ); } void case2( int num, int domain, long long *values ) { int retval; int EventSet1 = PAPI_NULL; PAPI_option_t options; memset( &options, 0x0, sizeof ( options ) ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); if ( ( retval = PAPI_query_event( PAPI_TOT_INS ) ) != PAPI_OK ) test_skip( __FILE__, __LINE__, "PAPI_query_event", retval ); if ( ( retval = PAPI_query_event( PAPI_TOT_CYC ) ) != PAPI_OK ) test_skip( __FILE__, __LINE__, "PAPI_query_event", retval ); if ( num == CREATE ) { if (!TESTS_QUIET) { printf( "\nTest case 2, CREATE: Call PAPI_set_domain(%s) before create\n", stringify_domain( domain ) ); printf( "This should override the domain setting for this EventSet.\n" ); } retval = PAPI_set_domain( domain ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_domain", retval ); } retval = PAPI_create_eventset( &EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); if ( num == ADD ) { if (!TESTS_QUIET) { printf( "\nTest case 2, ADD: Call PAPI_set_domain(%s) before add\n", stringify_domain( domain ) ); printf( "This should have no effect on the domain setting for this EventSet.\n" ); } retval = PAPI_set_domain( domain ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_domain", retval ); } retval = PAPI_add_event( EventSet1, PAPI_TOT_INS ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event(PAPI_TOT_INS)", retval ); if ( num == MIDDLE ) { if (!TESTS_QUIET) { printf( "\nTest case 2, MIDDLE: Call PAPI_set_domain(%s) between adds\n", stringify_domain( domain ) ); printf( "This should have no effect on the domain 
setting for this EventSet.\n" ); } retval = PAPI_set_domain( domain ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_set_domain", retval ); } retval = PAPI_add_event( EventSet1, PAPI_TOT_CYC ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event(PAPI_TOT_CYC)", retval ); /* Warm it up dude */ PAPI_start( EventSet1 ); do_flops( NUM_FLOPS ); PAPI_stop( EventSet1, NULL ); retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet1, values ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_cleanup_eventset( EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_cleanup", retval ); retval = PAPI_destroy_eventset( &EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy", retval ); PAPI_shutdown( ); } void case2_driver( void ) { long long **values; /* 3 tests, 2 events */ values = allocate_test_space( 3, 2 ); case2( CREATE, PAPI_DOM_KERNEL, values[0] ); case2( ADD, PAPI_DOM_KERNEL, values[1] ); case2( MIDDLE, PAPI_DOM_KERNEL, values[2] ); dump_and_verify( CASE2, values ); free(values); } void case1_driver( void ) { case1( ADD ); case1( MIDDLE ); case1( CREATE ); case1( CHANGE ); case1( SUPERVISOR ); } int main( int argc, char **argv ) { tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ #if defined(sgi) && defined(host_mips) uid_t id; id = getuid( ); if ( id != 0 ) { printf( "IRIX requires root for PAPI_DOM_KERNEL and PAPI_DOM_ALL.\n" ); test_skip( __FILE__, __LINE__, "", 1 ); } #endif if (!TESTS_QUIET) { printf( "Test second.c: set domain of eventset via PAPI_set_domain and PAPI_set_opt.\n\n" ); printf( "* PAPI_set_domain(DOMAIN) sets the default domain \napplied to subsequently created EventSets.\n" ); printf( "It should have no effect on existing EventSets.\n\n" ); printf( "* PAPI_set_opt(DOMAIN,xxx) sets the domain for a specific 
EventSet.\n" ); printf( "It should always override the default setting for that EventSet.\n" ); } case2_driver( ); case1_driver( ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/serial_hl.c000066400000000000000000000014661502707512200200510ustar00rootroot00000000000000#include #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" int main( int argc, char **argv ) { int retval, i; int quiet = 0; char* region_name; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); region_name = "do_flops"; if ( !quiet ) { printf("\nInstrument flops\n"); } for ( i = 1; i <= 4; ++i ) { retval = PAPI_hl_region_begin(region_name); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_hl_region_begin", retval ); } do_flops( NUM_FLOPS ); retval = PAPI_hl_region_end(region_name); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_hl_region_end", retval ); } } test_hl_pass( __FILE__ ); return 0; }papi-papi-7-2-0-t/src/ctests/serial_hl_ll_comb.c000066400000000000000000000044361502707512200215400ustar00rootroot00000000000000#include #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" int main( int argc, char **argv ) { int retval, i; int quiet = 0; char* region_name; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); region_name = "do_flops"; /* three iterations with high-level API */ if ( !quiet ) { printf("\nTesting high-level API: do_flops\n"); } for ( i = 1; i < 4; ++i ) { retval = PAPI_hl_region_begin(region_name); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_hl_region_begin", retval ); } do_flops( NUM_FLOPS ); retval = PAPI_hl_region_end(region_name); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_hl_region_end", retval ); } } if ( !quiet ) { printf("\nTesting low-level API: do_flops\n"); } long long values[2]; int EventSet = PAPI_NULL; char event_name1[]="appio:::READ_BYTES"; char event_name2[]="appio:::WRITE_BYTES"; /* 
create the eventset */ retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } retval = PAPI_add_named_event( EventSet, event_name1); if ( retval != PAPI_OK ) { if (!quiet) printf("Couldn't add %s\n",event_name1); test_skip(__FILE__,__LINE__,"Couldn't add appio:::READ_BYTES",0); } retval = PAPI_add_named_event( EventSet, event_name2); if ( retval != PAPI_OK ) { if (!quiet) printf("Couldn't add %s\n",event_name2); test_skip(__FILE__,__LINE__,"Couldn't add appio:::WRITE_BYTES",0); } /* Start PAPI */ retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_flops( NUM_FLOPS ); /* Read results */ retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } if ( !quiet ) { printf("%s: %lld\n", event_name1, values[0]); printf("%s: %lld\n", event_name2, values[1]); } /* remove results. */ PAPI_remove_named_event(EventSet,event_name1); PAPI_remove_named_event(EventSet,event_name2); test_hl_pass( __FILE__ ); return 0; }papi-papi-7-2-0-t/src/ctests/shlib.c000066400000000000000000000106571502707512200172120ustar00rootroot00000000000000/* * File: profile.c * Author: Philip Mucci * mucci@cs.utk.edu */ #include #include #include #include #if (!defined(NO_DLFCN) && !defined(_BGL) && !defined(_BGP)) #include #endif #include "papi.h" #include "papi_test.h" void print_shlib_info_map(const PAPI_shlib_info_t *shinfo, int quiet) { PAPI_address_map_t *map = shinfo->map; int i; if (NULL == map) { test_fail(__FILE__, __LINE__, "PAPI_get_shared_lib_info", 1); } if (!quiet) for ( i = 0; i < shinfo->count; i++ ) { printf( "Library: %s\n", map->name ); printf( "Text start: %p, Text end: %p\n", map->text_start, map->text_end ); printf( "Data start: %p, Data end: %p\n", map->data_start, map->data_end ); printf( "Bss start: %p, Bss end: %p\n", map->bss_start, map->bss_end ); if ( strlen( 
&(map->name[0]) ) == 0 ) test_fail( __FILE__, __LINE__, "PAPI_get_shared_lib_info", 1 ); if ( ( map->text_start == 0x0 ) || ( map->text_end == 0x0 ) || ( map->text_start >= map->text_end ) ) test_fail( __FILE__, __LINE__, "PAPI_get_shared_lib_info", 1 ); /* if ((map->data_start == 0x0) || (map->data_end == 0x0) || (map->data_start >= map->data_end)) test_fail(__FILE__, __LINE__, "PAPI_get_shared_lib_info",1); if (((map->bss_start) && (!map->bss_end)) || ((!map->bss_start) && (map->bss_end)) || (map->bss_start > map->bss_end)) test_fail(__FILE__, __LINE__, "PAPI_get_shared_lib_info",1); */ map++; } } void display( char *msg ) { int i; for (i=0; i<64; i++) { printf( "%1d", (msg[i] ? 1 : 0) ); } printf("\n"); } int main( int argc, char **argv ) { int retval,quiet; const PAPI_shlib_info_t *shinfo; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } if ( ( shinfo = PAPI_get_shared_lib_info( ) ) == NULL ) { test_skip( __FILE__, __LINE__, "PAPI_get_shared_lib_info", 1 ); } if ( ( shinfo->count == 0 ) && ( shinfo->map ) ) { test_fail( __FILE__, __LINE__, "PAPI_get_shared_lib_info", 1 ); } print_shlib_info_map(shinfo, quiet); /* Needed for debugging, so you can ^Z and stop the process, */ /* inspect /proc to see if it's right */ sleep( 1 ); #ifndef NO_DLFCN { const char *_libname = "libcrypt.so"; void *handle; void ( *setkey) (const char *key); void ( *encrypt) (char block[64], int edflag); char key[64]={ 1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0, 1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0, 1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0, 1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0, }; /* bit pattern for key */ char orig[64]; /* bit pattern for messages */ char txt[64]; /* bit pattern for messages */ int oldcount; handle = dlopen( _libname, RTLD_NOW ); if ( !handle ) { printf( "dlopen: %s\n", dlerror( ) ); if (!quiet) printf( "Did you forget to set the 
environmental " "variable LIBPATH (in AIX) or " "LD_LIBRARY_PATH (in linux) ?\n" ); test_fail( __FILE__, __LINE__, "dlopen", 1 ); } setkey = dlsym( handle, "setkey" ); encrypt = dlsym( handle, "encrypt" ); if ( setkey == NULL || encrypt == NULL) { if (!quiet) printf( "dlsym: %s\n", dlerror( ) ); test_fail( __FILE__, __LINE__, "dlsym", 1 ); } memset(orig,0,64); memcpy(txt,orig,64); setkey(key); if (!quiet) { printf("original "); display(txt); } encrypt(txt, 0); /* encode */ if (!quiet) { printf("encrypted "); display(txt); } if (!memcmp(txt,orig,64)) { test_fail( __FILE__, __LINE__, "encode", 1 ); } encrypt(txt, 1); /* decode */ if (!quiet) { printf("decrypted "); display(txt); } if (memcmp(txt,orig,64)) { test_fail( __FILE__, __LINE__, "decode", 1 ); } oldcount = shinfo->count; if ( ( shinfo = PAPI_get_shared_lib_info( ) ) == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_shared_lib_info", 1 ); } /* Needed for debugging, so you can ^Z and stop the process, */ /* inspect /proc to see if it's right */ sleep( 1 ); if ( ( shinfo->count == 0 ) && ( shinfo->map ) ) { test_fail( __FILE__, __LINE__, "PAPI_get_shared_lib_info", 1 ); } if ( shinfo->count <= oldcount ) { test_fail( __FILE__, __LINE__, "PAPI_get_shared_lib_info", 1 ); } print_shlib_info_map(shinfo, quiet); /* Needed for debugging, so you can ^Z and stop the process, */ /* inspect /proc to see if it's right */ sleep( 1 ); dlclose( handle ); } #endif test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/sprofile.c000066400000000000000000000107521502707512200177300ustar00rootroot00000000000000/* * File: sprofile.c * Author: Philip Mucci * mucci@cs.utk.edu * Mods: Maynard Johnson * maynardj@us.ibm.com */ #include #include #include "papi.h" #include "papi_test.h" #include "prof_utils.h" #include "do_loops.h" /* These architectures use Function Descriptors as Function Pointers */ #if (defined(linux) && defined(__ia64__)) || (defined(_AIX)) \ || ((defined(__powerpc64__) && (_CALL_ELF != 2))) /* PPC64 Big 
Endian is ELF version 1 which uses function descriptors */ #define DO_READS (unsigned long)(*(void **)do_reads) #define DO_FLOPS (unsigned long)(*(void **)do_flops) #else /* PPC64 Little Endian is ELF version 2 which does not use * function descriptors */ #define DO_READS (unsigned long)(do_reads) #define DO_FLOPS (unsigned long)(do_flops) #endif /* This file performs the following test: sprofile */ int main( int argc, char **argv ) { int i, num_events, num_tests = 6, mask = 0x1; int EventSet = PAPI_NULL; unsigned short **buf = ( unsigned short ** ) profbuf; unsigned long length, blength; int num_buckets; PAPI_sprofil_t sprof[3]; int retval; const PAPI_exe_info_t *prginfo; vptr_t start, end; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } if ( ( prginfo = PAPI_get_executable_info( ) ) == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_executable_info", 1 ); } start = prginfo->address_info.text_start; end = prginfo->address_info.text_end; if ( start > end ) { test_fail( __FILE__, __LINE__, "Profile length < 0!", PAPI_ESYS ); } length = ( unsigned long ) ( end - start ); if (!quiet) { prof_print_address( "Test case sprofile: POSIX compatible profiling over multiple regions.\n", prginfo ); } blength = prof_size( length, FULL_SCALE, PAPI_PROFIL_BUCKET_16, &num_buckets ); prof_alloc( 3, blength ); /* First half */ sprof[0].pr_base = buf[0]; sprof[0].pr_size = ( unsigned int ) blength; sprof[0].pr_off = ( vptr_t ) DO_FLOPS; #if defined(linux) && defined(__ia64__) if ( !quiet ) fprintf( stderr, "do_flops is at %p %p\n", &do_flops, sprof[0].pr_off ); #endif sprof[0].pr_scale = FULL_SCALE; /* Second half */ sprof[1].pr_base = buf[1]; sprof[1].pr_size = ( unsigned int ) blength; sprof[1].pr_off = ( vptr_t ) DO_READS; #if defined(linux) && defined(__ia64__) if ( !quiet ) fprintf( stderr, "do_reads is 
at %p %p\n", &do_reads, sprof[1].pr_off ); #endif sprof[1].pr_scale = FULL_SCALE; /* Overflow bin */ sprof[2].pr_base = buf[2]; sprof[2].pr_size = 1; sprof[2].pr_off = 0; sprof[2].pr_scale = 0x2; EventSet = add_test_events( &num_events, &mask, 1 ); values = allocate_test_space( num_tests, num_events ); retval = PAPI_sprofil( sprof, 3, EventSet, PAPI_TOT_CYC, THRESHOLD, PAPI_PROFIL_POSIX | PAPI_PROFIL_BUCKET_16 ); if (retval != PAPI_OK ) { if (retval == PAPI_ENOEVNT) { if (!quiet) printf("Trouble creating events\n"); test_skip(__FILE__,__LINE__,"PAPI_sprofil",retval); } test_fail( __FILE__, __LINE__, "PAPI_sprofil", retval ); } do_stuff( ); if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_stuff( ); if ( ( retval = PAPI_stop( EventSet, values[1] ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); /* clear the profile flag before removing the event */ if ( ( retval = PAPI_sprofil( sprof, 3, EventSet, PAPI_TOT_CYC, 0, PAPI_PROFIL_POSIX | PAPI_PROFIL_BUCKET_16 ) ) != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_sprofil", retval ); remove_test_events( &EventSet, mask ); if ( !quiet ) { printf( "Test case: PAPI_sprofil()\n" ); printf( "---------Buffer 1--------\n" ); for ( i = 0; i < ( int ) length / 2; i++ ) { if ( buf[0][i] ) printf( "%#lx\t%d\n", DO_FLOPS + 2 * ( unsigned long ) i, buf[0][i] ); } printf( "---------Buffer 2--------\n" ); for ( i = 0; i < ( int ) length / 2; i++ ) { if ( buf[1][i] ) printf( "%#lx\t%d\n", DO_READS + 2 * ( unsigned long ) i, buf[1][i] ); } printf( "-------------------------\n" ); printf( "%u samples fell outside the regions.\n", *buf[2] ); } retval = prof_check( 2, PAPI_PROFIL_BUCKET_16, num_buckets ); for ( i = 0; i < 3; i++ ) { free( profbuf[i] ); } if ( retval == 0 ) { test_fail( __FILE__, __LINE__, "No information in buffers", 1 ); } test_pass( __FILE__ ); return 0; } 
papi-papi-7-2-0-t/src/ctests/system_child_overflow.c000066400000000000000000000102661502707512200225170ustar00rootroot00000000000000/* * Use "system() to run child_overflow * Test PAPI with fork() and exec(). */ #include #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #define MAX_EVENTS 3 static int Event[MAX_EVENTS] = { PAPI_TOT_CYC, PAPI_FP_INS, PAPI_FAD_INS, }; static int Threshold[MAX_EVENTS] = { 8000000, 4000000, 4000000, }; static int num_events = 1; static int EventSet = PAPI_NULL; static const char *name = "unknown"; static struct timeval start, last; static long count, total; static void my_handler( int EventSet, void *pc, long long ovec, void *context ) { ( void ) EventSet; ( void ) pc; ( void ) ovec; ( void ) context; count++; total++; } static void zero_count( void ) { gettimeofday( &start, NULL ); last = start; count = 0; total = 0; } static void print_here( const char *str) { if (!TESTS_QUIET) printf("[%d] %s, %s\n", getpid(), name, str); } static void print_rate( const char *str ) { static int last_count = -1; struct timeval now; double st_secs, last_secs; gettimeofday( &now, NULL ); st_secs = ( double ) ( now.tv_sec - start.tv_sec ) + ( ( double ) ( now.tv_usec - start.tv_usec ) ) / 1000000.0; last_secs = ( double ) ( now.tv_sec - last.tv_sec ) + ( ( double ) ( now.tv_usec - last.tv_usec ) ) / 1000000.0; if ( last_secs <= 0.001 ) last_secs = 0.001; if (!TESTS_QUIET) { printf( "[%d] %s, time = %.3f, total = %ld, last = %ld, rate = %.1f/sec\n", getpid( ), str, st_secs, total, count, ( ( double ) count ) / last_secs ); } if ( last_count != -1 ) { if ( count < .1 * last_count ) { test_fail( name, __LINE__, "Interrupt rate changed!", 1 ); exit( 1 ); } } last_count = ( int ) count; count = 0; last = now; } static void do_cycles( int program_time ) { struct timeval start, now; double x, sum; gettimeofday( &start, NULL ); for ( ;; ) { sum = 1.0; for ( x = 1.0; x < 250000.0; x += 1.0 ) sum += x; if ( sum < 0.0 ) 
printf( "==>> SUM IS NEGATIVE !! <<==\n" ); gettimeofday( &now, NULL ); if ( now.tv_sec >= start.tv_sec + program_time ) break; } } static void my_papi_start( void ) { int ev; EventSet = PAPI_NULL; if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) test_fail( name, __LINE__, "PAPI_create_eventset failed", 1 ); for ( ev = 0; ev < num_events; ev++ ) { if ( PAPI_add_event( EventSet, Event[ev] ) != PAPI_OK ) { if (!TESTS_QUIET) printf("Trouble adding event\n"); test_skip( name, __LINE__, "PAPI_add_event failed", 1 ); } } for ( ev = 0; ev < num_events; ev++ ) { if ( PAPI_overflow( EventSet, Event[ev], Threshold[ev], 0, my_handler ) != PAPI_OK ) { test_fail( name, __LINE__, "PAPI_overflow failed", 1 ); } } if ( PAPI_start( EventSet ) != PAPI_OK ) test_fail( name, __LINE__, "PAPI_start failed", 1 ); } static void run( const char *str, int len ) { int n; for ( n = 1; n <= len; n++ ) { do_cycles( 1 ); print_rate( str ); } } int main( int argc, char **argv ) { char buf[100]; int quiet,retval,result=0; /* Used to be able to set this via command line */ num_events=1; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); do_cycles( 1 ); zero_count( ); /* Init library */ retval=PAPI_library_init( PAPI_VER_CURRENT ); if (retval!=PAPI_VER_CURRENT) { test_fail( name, __LINE__, "PAPI_library_init failed", 1 ); } name = argv[0]; if (!quiet) printf( "[%d] %s, num_events = %d\n", getpid( ), name, num_events ); sprintf( buf, "%d", num_events ); my_papi_start( ); run( name, 3 ); print_here( "system(./child_overflow)" ); if ( access( "./child_overflow", X_OK ) == 0 ) { if ( quiet) result=system( "./child_overflow TESTS_QUIET" ); else result=system( "./child_overflow" ); } else if ( access( "./ctests/child_overflow", X_OK ) == 0 ) { if ( quiet) result=system( "./ctests/child_overflow TESTS_QUIET" ); else result=system( "./ctests/child_overflow" ); } if (result<0) { test_fail(__FILE__,__LINE__,"system failed\n",1); } if (!quiet) printf("Successfully returned from system\n"); // 
Rely on test_pass from child_overflow // otherwise the run_tests.sh output is ugly //test_pass(__FILE__); return 0; } papi-papi-7-2-0-t/src/ctests/system_overflow.c000066400000000000000000000074351502707512200213600ustar00rootroot00000000000000/* * Test PAPI with fork() and exec(). */ #include #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #define MAX_EVENTS 3 static int Event[MAX_EVENTS] = { PAPI_TOT_CYC, PAPI_FP_INS, PAPI_FAD_INS, }; static int Threshold[MAX_EVENTS] = { 8000000, 4000000, 4000000, }; static int num_events = 1; static int EventSet = PAPI_NULL; static const char *name = "unknown"; static struct timeval start, last; static long count, total; static void my_handler( int EventSet, void *pc, long long ovec, void *context ) { ( void ) EventSet; ( void ) pc; ( void ) ovec; ( void ) context; count++; total++; } static void zero_count( void ) { gettimeofday( &start, NULL ); last = start; count = 0; total = 0; } static void print_here( const char *str) { if (!TESTS_QUIET) printf("[%d] %s, %s\n", getpid(), name, str); } static void print_rate( const char *str ) { static int last_count = -1; struct timeval now; double st_secs, last_secs; gettimeofday( &now, NULL ); st_secs = ( double ) ( now.tv_sec - start.tv_sec ) + ( ( double ) ( now.tv_usec - start.tv_usec ) ) / 1000000.0; last_secs = ( double ) ( now.tv_sec - last.tv_sec ) + ( ( double ) ( now.tv_usec - last.tv_usec ) ) / 1000000.0; if ( last_secs <= 0.001 ) last_secs = 0.001; if (!TESTS_QUIET) { printf( "[%d] %s, time = %.3f, total = %ld, last = %ld, rate = %.1f/sec\n", getpid( ), str, st_secs, total, count, ( ( double ) count ) / last_secs ); } if ( last_count != -1 ) { if ( count < .1 * last_count ) { test_fail( name, __LINE__, "Interrupt rate changed!", 1 ); exit( 1 ); } } last_count = ( int ) count; count = 0; last = now; } static void do_cycles( int program_time ) { struct timeval start, now; double x, sum; gettimeofday( &start, NULL ); for ( ;; ) { sum 
= 1.0; for ( x = 1.0; x < 250000.0; x += 1.0 ) sum += x; if ( sum < 0.0 ) printf( "==>> SUM IS NEGATIVE !! <<==\n" ); gettimeofday( &now, NULL ); if ( now.tv_sec >= start.tv_sec + program_time ) break; } } static void my_papi_start( void ) { int ev; EventSet = PAPI_NULL; if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) test_fail( name, __LINE__, "PAPI_create_eventset failed", 1 ); for ( ev = 0; ev < num_events; ev++ ) { if ( PAPI_add_event( EventSet, Event[ev] ) != PAPI_OK ) { if (!TESTS_QUIET) printf("Trouble adding event\n"); test_skip( name, __LINE__, "PAPI_add_event failed", 1 ); } } for ( ev = 0; ev < num_events; ev++ ) { if ( PAPI_overflow( EventSet, Event[ev], Threshold[ev], 0, my_handler ) != PAPI_OK ) { test_fail( name, __LINE__, "PAPI_overflow failed", 1 ); } } if ( PAPI_start( EventSet ) != PAPI_OK ) test_fail( name, __LINE__, "PAPI_start failed", 1 ); } static void run( const char *str, int len ) { int n; for ( n = 1; n <= len; n++ ) { do_cycles( 1 ); print_rate( str ); } } int main( int argc, char **argv ) { char buf[100]; int quiet,retval; /* Used to be able to set this via command line */ num_events=1; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); do_cycles( 1 ); zero_count( ); retval=PAPI_library_init( PAPI_VER_CURRENT ); if (retval!=PAPI_VER_CURRENT) { test_fail( name, __LINE__, "PAPI_library_init failed", 1 ); } name = argv[0]; if (!quiet) printf( "[%d] %s, num_events = %d\n", getpid( ), name, num_events ); sprintf( buf, "%d", num_events ); my_papi_start( ); run( name, 3 ); print_here( "system(./burn)" ); if ( access( "./burn", X_OK ) == 0 ) ( quiet ? system( "./burn TESTS_QUIET" ) : system( "./burn" ) ); else if ( access( "./ctests/burn", X_OK ) == 0 ) ( quiet ? 
system( "./ctests/burn TESTS_QUIET" ) : system( "./ctests/burn" ) ); test_pass(__FILE__); return 0; } papi-papi-7-2-0-t/src/ctests/tenth.c000066400000000000000000000147721502707512200172350ustar00rootroot00000000000000/* * File: tenth.c * Mods: Maynard Johnson * maynardj@us.ibm.com */ #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #define ITERS 100 /* This file performs the following test: start, stop and timer functionality for PAPI_L1_TCM derived event - They are counted in the default counting domain and default granularity, depending on the platform. Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). - Get us. - Start counters - Do flops - Stop and read counters - Get us. */ #if defined(sun) && defined(sparc) #define CACHE_LEVEL "PAPI_L2_TCM" #define EVT1 PAPI_L2_TCM #define EVT2 PAPI_L2_TCA #define EVT3 PAPI_L2_TCH #define EVT1_STR "PAPI_L2_TCM" #define EVT2_STR "PAPI_L2_TCA" #define EVT3_STR "PAPI_L2_TCH" #define MASK1 MASK_L2_TCM #define MASK2 MASK_L2_TCA #define MASK3 MASK_L2_TCH #else #if defined(__powerpc__) #define CACHE_LEVEL "PAPI_L1_DCA" #define EVT1 PAPI_L1_DCA #define EVT2 PAPI_L1_DCW #define EVT3 PAPI_L1_DCR #define EVT1_STR "PAPI_L1_DCA" #define EVT2_STR "PAPI_L1_DCW" #define EVT3_STR "PAPI_L1_DCR" #define MASK1 MASK_L1_DCA #define MASK2 MASK_L1_DCW #define MASK3 MASK_L1_DCR #else #define CACHE_LEVEL "PAPI_L1_TCM" #define EVT1 PAPI_L1_TCM #define EVT2 PAPI_L1_ICM #define EVT3 PAPI_L1_DCM #define EVT1_STR "PAPI_L1_TCM" #define EVT2_STR "PAPI_L1_ICM" #define EVT3_STR "PAPI_L1_DCM" #define MASK1 MASK_L1_TCM #define MASK2 MASK_L1_ICM #define MASK3 MASK_L1_DCM #endif #endif int main( int argc, char **argv ) { int retval, num_tests = 30, tmp; int EventSet1 = PAPI_NULL; int EventSet2 = PAPI_NULL; int EventSet3 = PAPI_NULL; int mask1 = MASK1; int mask2 = MASK2; int mask3 = MASK3; int num_events1; int num_events2; int num_events3; long long **values; int i, j; long long min[3]; long long 
max[3]; long long sum[3]; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* Make sure that required resources are available */ /* Skip (don't fail!) if they are not */ retval = PAPI_query_event( EVT1 ); if ( retval != PAPI_OK ) { test_skip( __FILE__, __LINE__, EVT1_STR, retval ); } retval = PAPI_query_event( EVT2 ); if ( retval != PAPI_OK ) { test_skip( __FILE__, __LINE__, EVT2_STR, retval ); } retval = PAPI_query_event( EVT3 ); if ( retval != PAPI_OK ) { test_skip( __FILE__, __LINE__, EVT3_STR, retval ); } EventSet1 = add_test_events( &num_events1, &mask1, 1 ); EventSet2 = add_test_events( &num_events2, &mask2, 1 ); EventSet3 = add_test_events( &num_events3, &mask3, 1 ); values = allocate_test_space( num_tests, 1 ); /* Warm me up */ do_l1misses( ITERS ); do_misses( 1, 1024 * 1024 * 4 ); for ( i = 0; i < 10; i++ ) { retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_l1misses( ITERS ); do_misses( 1, 1024 * 1024 * 4 ); retval = PAPI_stop( EventSet1, values[( i * 3 ) + 0] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_start( EventSet2 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_l1misses( ITERS ); do_misses( 1, 1024 * 1024 * 4 ); retval = PAPI_stop( EventSet2, values[( i * 3 ) + 1] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_start( EventSet3 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_l1misses( ITERS ); do_misses( 1, 1024 * 1024 * 4 ); retval = PAPI_stop( EventSet3, values[( i * 3 ) + 2] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } remove_test_events( &EventSet1, mask1 ); remove_test_events( &EventSet2, mask2 
); remove_test_events( &EventSet3, mask3 ); for ( j = 0; j < 3; j++ ) { min[j] = 65535; max[j] = sum[j] = 0; } for ( i = 0; i < 10; i++ ) { for ( j = 0; j < 3; j++ ) { if ( min[j] > values[( i * 3 ) + j][0] ) min[j] = values[( i * 3 ) + j][0]; if ( max[j] < values[( i * 3 ) + j][0] ) max[j] = values[( i * 3 ) + j][0]; sum[j] += values[( i * 3 ) + j][0]; } } if ( !quiet ) { printf( "Test case 10: start, stop for derived event %s.\n", CACHE_LEVEL ); printf( "--------------------------------------------------------\n" ); tmp = PAPI_get_opt( PAPI_DEFDOM, NULL ); printf( "Default domain is: %d (%s)\n", tmp, stringify_all_domains( tmp ) ); tmp = PAPI_get_opt( PAPI_DEFGRN, NULL ); printf( "Default granularity is: %d (%s)\n", tmp, stringify_granularity( tmp ) ); printf( "Using %d iterations of c += a*b\n", ITERS ); printf( "Repeated 10 times\n" ); printf ( "-------------------------------------------------------------------------\n" ); /* for (i=0;i<10;i++) { printf("Test type : %12s%13s%13s\n", "1", "2", "3"); printf(TAB3, EVT1_STR, values[(i*3)+0][0], (long long)0, (long long)0); printf(TAB3, EVT2_STR, (long long)0, values[(i*3)+1][0], (long long)0); printf(TAB3, EVT3_STR, (long long)0, (long long)0, values[(i*3)+2][0]); printf ("-------------------------------------------------------------------------\n"); } */ printf( "Test type : %12s%13s%13s\n", "min", "max", "sum" ); printf( TAB3, EVT1_STR, min[0], max[0], sum[0] ); printf( TAB3, EVT2_STR, min[1], max[1], sum[1] ); printf( TAB3, EVT3_STR, min[2], max[2], sum[2] ); printf ( "-------------------------------------------------------------------------\n" ); printf( "Verification:\n" ); #if defined(sun) && defined(sparc) printf( TAB1, "Sum 1 approximately equals sum 2 - sum 3 or", ( sum[1] - sum[2] ) ); #else printf( TAB1, "Sum 1 approximately equals sum 2 + sum 3 or", ( sum[1] + sum[2] ) ); #endif } { long long tmin, tmax; #if defined(sun) && defined(sparc) tmax = ( long long ) ( sum[1] - sum[2] ); #else tmax = ( long 
long ) ( sum[1] + sum[2] ); #endif if (!quiet) { printf( "percent error: %f\n", (( float ) abs( ( int ) ( tmax - sum[0] ) ) / (float) sum[0] ) * 100.0 ); } tmin = ( long long ) ( ( double ) tmax * 0.8 ); tmax = ( long long ) ( ( double ) tmax * 1.2 ); if ( sum[0] > tmax || sum[0] < tmin ) { test_fail( __FILE__, __LINE__, CACHE_LEVEL, 1 ); } } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/thrspecific.c000066400000000000000000000103011502707512200203760ustar00rootroot00000000000000/* This file performs the following test: start, stop and timer functionality for 2 slave pthreads */ /* No it doesn't, that description is *completely* wrong */ /* I think this is trying to test the pthread thread-specific */ /* implementation but it is unclear and the git commit history */ /* does not help at all here */ #include #include #include #include #include "papi.h" #include "papi_test.h" static volatile int processing = 1; void * Thread( void *arg ) { int retval; void *arg2; int i; retval = PAPI_register_thread( ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_register_thread", retval ); } if (!TESTS_QUIET) { printf( "Thread %#x started, specific data is at %p\n", ( int ) pthread_self( ), arg ); } retval = PAPI_set_thr_specific( PAPI_USR1_TLS, arg ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_set_thr_specific", retval ); } retval = PAPI_get_thr_specific( PAPI_USR1_TLS, &arg2 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_get_thr_specific", retval ); } if ( arg != arg2 ) { test_fail( __FILE__, __LINE__, "set vs get specific", 0 ); } while ( processing ) { if ( *( ( int * ) arg ) == 500000 ) { sleep( 1 ); PAPI_all_thr_spec_t data; data.num = 10; data.id = ( unsigned long * ) malloc( ( size_t ) data.num * sizeof ( unsigned long ) ); data.data = ( void ** ) malloc( ( size_t ) data.num * sizeof ( void * ) ); retval = PAPI_get_thr_specific( PAPI_USR1_TLS | PAPI_TLS_ALL_THREADS, ( void ** ) &data ); if ( retval 
!= PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_get_thr_specific", retval ); } if ( data.num != 5 ) { test_fail( __FILE__, __LINE__, "data.num != 5", 0 ); } if (!TESTS_QUIET) for ( i = 0; i < data.num; i++ ) { printf( "Entry %d, Thread %#lx, Data Pointer %p, Value %d\n", i, data.id[i], data.data[i], *( int * ) data.data[i] ); } free(data.id); free(data.data); processing = 0; } } retval = PAPI_unregister_thread( ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_unregister_thread", retval ); } return NULL; } int main( int argc, char **argv ) { pthread_t e_th, f_th, g_th, h_th; int flops1, flops2, flops3, flops4, flops5; int retval, rc; pthread_attr_t attr; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); if (!quiet) printf("Testing threads\n"); retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } retval = PAPI_thread_init( ( unsigned long ( * )( void ) ) ( pthread_self ) ); if ( retval != PAPI_OK ) { if ( retval == PAPI_ECMP ) { test_skip( __FILE__, __LINE__, "PAPI_thread_init", retval ); } else { test_fail( __FILE__, __LINE__, "PAPI_thread_init", retval ); } } pthread_attr_init( &attr ); #ifdef PTHREAD_CREATE_UNDETACHED pthread_attr_setdetachstate( &attr, PTHREAD_CREATE_UNDETACHED ); #endif #ifdef PTHREAD_SCOPE_SYSTEM retval = pthread_attr_setscope( &attr, PTHREAD_SCOPE_SYSTEM ); if ( retval != 0 ) { test_skip( __FILE__, __LINE__, "pthread_attr_setscope", retval ); } #endif flops1 = 1000000; rc = pthread_create( &e_th, &attr, Thread, ( void * ) &flops1 ); if ( rc ) { retval = PAPI_ESYS; test_fail( __FILE__, __LINE__, "pthread_create", retval ); } flops2 = 2000000; rc = pthread_create( &f_th, &attr, Thread, ( void * ) &flops2 ); if ( rc ) { retval = PAPI_ESYS; test_fail( __FILE__, __LINE__, "pthread_create", retval ); } flops3 = 4000000; rc = pthread_create( &g_th, &attr, Thread, ( void * ) &flops3 ); if ( rc ) { retval = 
PAPI_ESYS; test_fail( __FILE__, __LINE__, "pthread_create", retval ); } flops4 = 8000000; rc = pthread_create( &h_th, &attr, Thread, ( void * ) &flops4 ); if ( rc ) { retval = PAPI_ESYS; test_fail( __FILE__, __LINE__, "pthread_create", retval ); } pthread_attr_destroy( &attr ); flops5 = 500000; Thread( &flops5 ); pthread_join( h_th, NULL ); pthread_join( g_th, NULL ); pthread_join( f_th, NULL ); pthread_join( e_th, NULL ); test_pass( __FILE__ ); pthread_exit( NULL ); return 1; } papi-papi-7-2-0-t/src/ctests/timer_overflow.c000066400000000000000000000027371502707512200211540ustar00rootroot00000000000000/* * File: timer_overflow.c * Author: Kevin London * london@cs.utk.edu * Mods: * */ /* This file looks for possible timer overflows. */ #include #include #include #include #include "papi.h" #include "papi_test.h" #define TIMER_THRESHOLD 100 int main( int argc, char **argv ) { int sleep_time = TIMER_THRESHOLD; int retval, i; long long timer; if ( argc > 1 ) { if ( !strcmp( argv[1], "TESTS_QUIET" ) ) tests_quiet( argc, argv ); else { sleep_time = atoi( argv[1] ); if ( sleep_time <= 0 ) sleep_time = TIMER_THRESHOLD; } } if ( TESTS_QUIET ) { /* Skip the test in TESTS_QUIET so that the main script doesn't * run this as it takes a long time to check for overflow */ printf( "%-40s SKIPPED\nLine # %d\n", __FILE__, __LINE__ ); printf( "timer_overflow takes a long time to run, run separately.\n" ); exit( 0 ); } printf( "This test will take about: %f minutes.\n", ( float ) ( 20 * ( sleep_time / 60.0 ) ) ); if ( ( retval = PAPI_library_init( PAPI_VER_CURRENT ) ) != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); timer = PAPI_get_real_usec( ); for ( i = 0; i <= 20; i++ ) { if ( timer < 0 ) break; sleep( ( unsigned int ) sleep_time ); timer = PAPI_get_real_usec( ); } if ( timer < 0 ) test_fail( __FILE__, __LINE__, "PAPI_get_real_usec: overflow", 1 ); else test_pass( __FILE__ ); return 0; } 
papi-papi-7-2-0-t/src/ctests/val_omp.c000066400000000000000000000121731502707512200175410ustar00rootroot00000000000000/* This file performs the following test: each OMP thread measures flops for its provided tasks, and compares this to expected flop counts, each thread having been provided with a random amount of work, such that the time and order that they complete their measurements varies. Specifically tested is the case where the value returned for some threads actually corresponds to that for another thread reading its counter values at the same time. - It is based on zero_omp.c but ignored much of its functionality. - It attempts to use the following two counters. It may use less depending on hardware counter resource limitations. These are counted in the default counting domain and default granularity, depending on the platform. Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). + PAPI_FP_INS + PAPI_TOT_CYC Each thread inside the Thread routine: - Do prework (MAX_FLOPS - flops) - Get cyc. - Get us. - Start counters - Do flops - Stop and read counters - Get us. - Get cyc. 
- Return flops */ #include "papi_test.h" #ifdef _OPENMP #include #else #error "This compiler does not understand OPENMP" #endif const int MAX_FLOPS = NUM_FLOPS; extern int TESTS_QUIET; /* Declared in test_utils.c */ const PAPI_hw_info_t *hw_info = NULL; long long Thread( int n ) { int retval, num_tests = 1; int EventSet1 = PAPI_NULL; int PAPI_event, mask1; int num_events1; long long flops; long long **values; long long elapsed_us, elapsed_cyc; char event_name[PAPI_MAX_STR_LEN]; /* printf("Thread(n=%d) %#x started\n", n, omp_get_thread_num()); */ num_events1 = 2; /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depending on the availability of the event on the platform */ EventSet1 = add_two_events( &num_events1, &PAPI_event, &mask1 ); retval = PAPI_event_code_to_name( PAPI_event, event_name ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); values = allocate_test_space( num_tests, num_events1 ); do_flops( MAX_FLOPS - n ); /* prework for balance */ elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( n ); retval = PAPI_stop( EventSet1, values[0] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); flops = ( values[0] )[0]; elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; remove_test_events( &EventSet1, mask1 ); if ( !TESTS_QUIET ) { /*printf("Thread %#x %-12s : \t%lld\t%d\n", omp_get_thread_num(), event_name, (values[0])[0], n); */ #if 0 printf( "Thread %#x PAPI_TOT_CYC: \t%lld\n", omp_get_thread_num( ), values[0][0] ); printf( "Thread %#x Real usec : \t%lld\n", omp_get_thread_num( ), elapsed_us ); printf( "Thread %#x Real cycles : \t%lld\n", omp_get_thread_num( ), elapsed_cyc ); #endif } /* It is illegal for the threads to exit in OpenMP */ /* 
test_pass(__FILE__,0,0); */ free_test_space( values, num_tests ); PAPI_unregister_thread( ); /* printf("Thread %#x finished\n", omp_get_thread_num()); */ return flops; } int main( int argc, char **argv ) { int tid, retval; int maxthr = omp_get_max_threads( ); int flopper = 0; long long *flops = calloc( maxthr, sizeof ( long long ) ); long long *flopi = calloc( maxthr, sizeof ( long long ) ); tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ if ( maxthr < 2 ) test_skip( __FILE__, __LINE__, "omp_get_num_threads < 2", PAPI_EINVAL ); if ( ( flops == NULL ) || ( flopi == NULL ) ) test_fail( __FILE__, __LINE__, "calloc", PAPI_ENOMEM ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); hw_info = PAPI_get_hardware_info( ); if ( hw_info == NULL ) test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 ); retval = PAPI_thread_init( ( unsigned long ( * )( void ) ) ( omp_get_thread_num ) ); if ( retval != PAPI_OK ) if ( retval == PAPI_ECMP ) test_skip( __FILE__, __LINE__, "PAPI_thread_init", retval ); else test_fail( __FILE__, __LINE__, "PAPI_thread_init", retval ); flopper = Thread( 65536 ) / 65536; printf( "flopper=%d\n", flopper ); for ( int i = 0; i < 100000; i++ ) #pragma omp parallel private(tid) { tid = omp_get_thread_num( ); flopi[tid] = rand( ) * 3; flops[tid] = Thread( ( flopi[tid] / flopper ) % MAX_FLOPS ); #pragma omp barrier #pragma omp master if ( flops[tid] < flopi[tid] ) { printf( "test iteration=%d\n", i ); for ( int j = 0; j < omp_get_num_threads( ); j++ ) { printf( "Thread %#x Value %6lld %c %6lld", j, flops[j], ( flops[j] < flopi[j] ) ? 
'<' : '=', flopi[j] ); for ( int k = 0; k < omp_get_num_threads( ); k++ ) if ( ( k != j ) && ( flops[k] == flops[j] ) ) printf( " == Thread %#x!", k ); printf( "\n" ); } test_fail( __FILE__, __LINE__, "value returned for thread", PAPI_EBUG ); } } test_pass( __FILE__, NULL, 0 ); exit( 0 ); } papi-papi-7-2-0-t/src/ctests/version.c000066400000000000000000000040071502707512200175660ustar00rootroot00000000000000/* This file performs the following test: */ /* compare and report versions from papi.h and the papi library */ #include #include #include "papi.h" #include "papi_test.h" int main( int argc, char **argv ) { int init_version, lib_version; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); init_version = PAPI_library_init( PAPI_VER_CURRENT ); lib_version = PAPI_get_opt( PAPI_LIB_VERSION, NULL ); if (lib_version == PAPI_EINVAL ) { test_fail( __FILE__, __LINE__, "PAPI_get_opt", PAPI_EINVAL ); } if ( !quiet) { printf( "Version.c: Compare and report versions from papi.h and the papi library.\n" ); printf( "-------------------------------------------------------------------------\n" ); printf( " MAJOR MINOR REVISION INCREMENT\n" ); printf( "-------------------------------------------------------------------------\n" ); printf( "PAPI_VER_CURRENT : %4d %6d %7d %10d\n", PAPI_VERSION_MAJOR( PAPI_VER_CURRENT ), PAPI_VERSION_MINOR( PAPI_VER_CURRENT ), PAPI_VERSION_REVISION( PAPI_VER_CURRENT ), PAPI_VERSION_INCREMENT( PAPI_VER_CURRENT ) ); printf( "PAPI_library_init: %4d %6d %7d %10d\n", PAPI_VERSION_MAJOR( init_version ), PAPI_VERSION_MINOR( init_version ), PAPI_VERSION_REVISION( init_version ), PAPI_VERSION_INCREMENT( init_version ) ); printf( "PAPI_VERSION : %4d %6d %7d %10d\n", PAPI_VERSION_MAJOR( PAPI_VERSION ), PAPI_VERSION_MINOR( PAPI_VERSION ), PAPI_VERSION_REVISION( PAPI_VERSION ), PAPI_VERSION_INCREMENT (PAPI_VERSION) ); printf( "PAPI_get_opt : %4d %6d %7d %10d\n", PAPI_VERSION_MAJOR( lib_version ), PAPI_VERSION_MINOR( lib_version ), 
PAPI_VERSION_REVISION( lib_version ), PAPI_VERSION_INCREMENT( lib_version) ); printf( "-------------------------------------------------------------------------\n" ); } if ( lib_version != PAPI_VERSION ) { test_fail( __FILE__, __LINE__, "Version Mismatch", PAPI_EINVAL ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/virttime.c000066400000000000000000000041401502707512200177420ustar00rootroot00000000000000#include <stdio.h> #include <stdlib.h> #include <unistd.h> #include "papi.h" #include "papi_test.h" int main( int argc, char **argv ) { int retval; long long elapsed_us, elapsed_cyc; const PAPI_hw_info_t *hw_info; tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); hw_info = PAPI_get_hardware_info( ); if ( hw_info == NULL ) test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 ); elapsed_us = PAPI_get_virt_usec( ); elapsed_cyc = PAPI_get_virt_cyc( ); if (!TESTS_QUIET) { printf( "Testing virt time clock. (CPU Max %d MHz, CPU Min %d MHz)\n", hw_info->cpu_max_mhz, hw_info->cpu_min_mhz ); printf( "Sleeping for 10 seconds.\n" ); } sleep( 10 ); elapsed_us = PAPI_get_virt_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_virt_cyc( ) - elapsed_cyc; if (!TESTS_QUIET) { printf( "%lld us. %lld cyc.\n", elapsed_us, elapsed_cyc ); } /* Elapsed microseconds and elapsed cycles are not as unambiguous as they appear. On Pentium III and 4, for example, cycles is a measured value, while useconds is computed from cycles and mhz. MHz is read from /proc/cpuinfo (on linux). Thus, any error in MHz is propagated to useconds. Conversely, on ultrasparc useconds are extracted from a system call (gethrtime()) and cycles are computed from useconds. Also, MHz comes from a scan of system info; thus any error in gethrtime() propagates to both cycles and useconds, and cycles can be further impacted by errors in reported MHz.
Without knowing the error bars on these system values, we can't really specify error ranges for our reported values, but we *DO* know that errors for at least one instance of Pentium 4 (torc17@utk) are on the order of one part per thousand. */ /* We'll accept 1.5 part per thousand error here (to allow Pentium 4 and Alpha to pass) */ if ( elapsed_us > 100000 ) test_fail( __FILE__, __LINE__, "Virt time greater than .1 seconds!", PAPI_EMISC ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/zero.c000066400000000000000000000124751502707512200170700ustar00rootroot00000000000000/* zero.c */ /* This is possibly the most important PAPI tests, and is the one */ /* that is often used as a quick test that PAPI is working. */ /* We should make sure that it always passes, if possible. */ /* Traditionally it used FLOPS, due to the importance of this to HPC. */ /* This has been changed to use Instructions/Cycles as some recent */ /* major Intel chips do not have good floating point events and would fail. 
*/ #include <stdio.h> #include <stdlib.h> #include "papi.h" #include "papi_test.h" #include "testcode.h" #define NUM_EVENTS 2 #define NUM_LOOPS 200 int main( int argc, char **argv ) { int retval, tmp, result, i; int EventSet1 = PAPI_NULL; long long values[NUM_EVENTS]; long long elapsed_us, elapsed_cyc, elapsed_virt_us, elapsed_virt_cyc; double ipc; int quiet=0; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); /* Init the PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* Initialize the EventSet */ retval=PAPI_create_eventset(&EventSet1); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } /* Add PAPI_TOT_CYC */ retval=PAPI_add_named_event(EventSet1,"PAPI_TOT_CYC"); if (retval!=PAPI_OK) { if (!quiet) { printf("Trouble adding PAPI_TOT_CYC: %s\n", PAPI_strerror(retval)); } test_skip( __FILE__, __LINE__, "adding PAPI_TOT_CYC", retval ); } /* Add PAPI_TOT_INS */ retval=PAPI_add_named_event(EventSet1,"PAPI_TOT_INS"); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "adding PAPI_TOT_INS", retval ); } /* warm up the processor to pull it out of idle state */ for(i=0;i<100;i++) { result=instructions_million(); } if (result==CODE_UNIMPLEMENTED) { if (!quiet) printf("Instructions testcode not available\n"); test_skip( __FILE__, __LINE__, "No instructions code", retval ); } /* Gather before stats */ elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); elapsed_virt_us = PAPI_get_virt_usec( ); elapsed_virt_cyc = PAPI_get_virt_cyc( ); /* Start PAPI */ retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } /* our work code */ for(i=0;i<NUM_LOOPS;i++) { result=instructions_million(); } /* Stop PAPI */ retval = PAPI_stop( EventSet1, values ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } /* Calculate total values */ elapsed_virt_us = PAPI_get_virt_usec( ) - elapsed_virt_us; elapsed_virt_cyc = PAPI_get_virt_cyc( ) - elapsed_virt_cyc; elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; ipc = (double)values[1] / (double)values[0]; /* Validate TOT_INS against the expected instruction count */ if (values[1] > (1000000*NUM_LOOPS)) { printf("%s Error of %.2f%%\n", "PAPI_TOT_INS", (100.0 * (double)(values[1] - (1000000*NUM_LOOPS)))/(1000000*NUM_LOOPS)); test_fail( __FILE__, __LINE__, "Instruction validation", 0 ); } /* Check
that TOT_CYC is non-zero */ if(values[0]==0) { printf("Cycles is zero\n"); test_fail( __FILE__, __LINE__, "Cycles validation", 0 ); } /* Unless you have an amazing processor, IPC should be < 100 */ if ((ipc <=0.01 ) || (ipc >=100.0)) { printf("Unlikely IPC of %.2f%%\n", ipc); test_fail( __FILE__, __LINE__, "IPC validation", 0 ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/zero_attach.c000066400000000000000000000144421502707512200204100ustar00rootroot00000000000000/* This file performs the following test: start, stop and timer functionality for attached processes. - It attempts to use the following two counters. It may use fewer depending on hardware counter resource limitations. These are counted in the default counting domain and default granularity, depending on the platform. Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). + PAPI_FP_INS + PAPI_TOT_CYC - Get us. - Start counters - Do flops - Stop and read counters - Get us. */ #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <signal.h> #include <sys/ptrace.h> #include <sys/wait.h> #include "papi.h" #include "papi_test.h" #include "do_loops.h" #ifdef _AIX #define _LINUX_SOURCE_COMPAT #endif #if defined(__FreeBSD__) # define PTRACE_ATTACH PT_ATTACH # define PTRACE_CONT PT_CONTINUE #endif int wait_for_attach_and_loop( void ) { kill( getpid( ), SIGSTOP ); do_flops( NUM_FLOPS ); kill( getpid( ), SIGSTOP ); return 0; } int main( int argc, char **argv ) { int status, retval, num_tests = 1, tmp; int EventSet1 = PAPI_NULL; int PAPI_event, mask1; int num_events1; long long **values; long long elapsed_us, elapsed_cyc, elapsed_virt_us, elapsed_virt_cyc; char event_name[PAPI_MAX_STR_LEN], add_event_str[PAPI_2MAX_STR_LEN]; const PAPI_component_info_t *cmpinfo; pid_t pid; /* Set TESTS_QUIET variable */ tests_quiet( argc, argv ); /* Initialize the PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } if ( ( cmpinfo =
PAPI_get_component_info( 0 ) ) == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_component_info", 0 ); } if ( cmpinfo->attach == 0 ) { test_skip( __FILE__, __LINE__, "Platform does not support attaching", 0 ); } pid = fork( ); if ( pid < 0 ) { test_fail( __FILE__, __LINE__, "fork()", PAPI_ESYS ); } if ( pid == 0 ) { exit( wait_for_attach_and_loop( ) ); } /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depending on the availability of the event on the platform */ EventSet1 = add_two_events( &num_events1, &PAPI_event, &mask1 ); if ( cmpinfo->attach_must_ptrace ) { if ( ptrace( PTRACE_ATTACH, pid, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_ATTACH)" ); return 1; } if ( waitpid( pid, &status, 0 ) == -1 ) { perror( "waitpid()" ); exit( 1 ); } if ( WIFSTOPPED( status ) == 0 ) test_fail( __FILE__, __LINE__, "Child process didn't return true to WIFSTOPPED", 0 ); } retval = PAPI_attach( EventSet1, ( unsigned long ) pid ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_attach", retval ); retval = PAPI_event_code_to_name( PAPI_event, event_name ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); sprintf( add_event_str, "PAPI_add_event[%s]", event_name ); /* num_events1 is greater than num_events2 so don't worry. */ values = allocate_test_space( num_tests, num_events1 ); elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); elapsed_virt_us = PAPI_get_virt_usec( ); elapsed_virt_cyc = PAPI_get_virt_cyc( ); /* Wait for the SIGSTOP.
*/ if ( cmpinfo->attach_must_ptrace ) { if ( ptrace( PTRACE_CONT, pid, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_CONT)" ); return 1; } if ( waitpid( pid, &status, 0 ) == -1 ) { perror( "waitpid()" ); exit( 1 ); } if ( WIFSTOPPED( status ) == 0 ) { test_fail( __FILE__, __LINE__, "Child process didn't return true to WIFSTOPPED", 0 ); } if ( WSTOPSIG( status ) != SIGSTOP ) { test_fail( __FILE__, __LINE__, "Child process didn't stop on SIGSTOP", 0 ); } } retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } /* Wait for the SIGSTOP. */ if ( cmpinfo->attach_must_ptrace ) { if ( ptrace( PTRACE_CONT, pid, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_CONT)" ); return 1; } if ( waitpid( pid, &status, 0 ) == -1 ) { perror( "waitpid()" ); exit( 1 ); } if ( WIFSTOPPED( status ) == 0 ) { test_fail( __FILE__, __LINE__, "Child process didn't return true to WIFSTOPPED", 0 ); } if ( WSTOPSIG( status ) != SIGSTOP ) { test_fail( __FILE__, __LINE__, "Child process didn't stop on SIGSTOP", 0 ); } } elapsed_virt_us = PAPI_get_virt_usec( ) - elapsed_virt_us; elapsed_virt_cyc = PAPI_get_virt_cyc( ) - elapsed_virt_cyc; elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; retval = PAPI_stop( EventSet1, values[0] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } remove_test_events( &EventSet1, mask1 ); if ( cmpinfo->attach_must_ptrace ) { if ( ptrace( PTRACE_CONT, pid, NULL, NULL ) == -1 ) { perror( "ptrace(PTRACE_CONT)" ); return 1; } } if ( waitpid( pid, &status, 0 ) == -1 ) { perror( "waitpid()" ); exit( 1 ); } if ( WIFEXITED( status ) == 0 ) { test_fail( __FILE__, __LINE__, "Child process didn't return true to WIFEXITED", 0 ); } if (!TESTS_QUIET) { printf( "Test case: 3rd party attach start, stop.\n" ); printf( "-----------------------------------------------\n" ); tmp = PAPI_get_opt( PAPI_DEFDOM, NULL ); printf( "Default domain is: %d 
(%s)\n", tmp, stringify_all_domains( tmp ) ); tmp = PAPI_get_opt( PAPI_DEFGRN, NULL ); printf( "Default granularity is: %d (%s)\n", tmp, stringify_granularity( tmp ) ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf( "-------------------------------------------------------------------------\n" ); printf( "Test type : \t 1\n" ); sprintf( add_event_str, "%-12s : \t", event_name ); printf( TAB1, add_event_str, values[0][1] ); printf( TAB1, "PAPI_TOT_CYC : \t", values[0][0] ); printf( TAB1, "Real usec : \t", elapsed_us ); printf( TAB1, "Real cycles : \t", elapsed_cyc ); printf( TAB1, "Virt usec : \t", elapsed_virt_us ); printf( TAB1, "Virt cycles : \t", elapsed_virt_cyc ); printf( "-------------------------------------------------------------------------\n" ); printf( "Verification: none\n" ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/zero_flip.c000066400000000000000000000120341502707512200200710ustar00rootroot00000000000000/* This file performs the following test: start, stop and timer functionality - It attempts to use the following two counters. It may use less depending on hardware counter resource limitations. These are counted in the default counting domain and default granularity, depending on the platform. Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). + PAPI_FP_INS + PAPI_TOT_CYC - Get us. - Start counters - Do flops - Stop and read counters - Get us. 
*/ #include <stdio.h> #include <stdlib.h> #include "papi.h" #include "papi_test.h" #include "do_loops.h" int main( int argc, char **argv ) { int retval, eventcnt, events[2], i, tmp; int EventSet1 = PAPI_NULL, EventSet2 = PAPI_NULL; int PAPI_event; long long values1[2], values2[2]; long long elapsed_us, elapsed_cyc; char event_name[PAPI_MAX_STR_LEN], add_event_str[PAPI_2MAX_STR_LEN]; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); /* query and set up the right instruction to monitor */ if ( PAPI_query_event( PAPI_FP_OPS ) == PAPI_OK ) PAPI_event = PAPI_FP_OPS; else PAPI_event = PAPI_TOT_INS; retval = PAPI_event_code_to_name( PAPI_event, event_name ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); sprintf( add_event_str, "PAPI_add_event[%s]", event_name ); retval = PAPI_create_eventset( &EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); /* Add the events */ if (!quiet) printf( "Adding: %s\n", event_name ); retval = PAPI_add_event( EventSet1, PAPI_event ); if ( retval != PAPI_OK ) { if (!quiet) printf("Trouble adding event\n"); test_skip( __FILE__, __LINE__, "PAPI_add_event", retval ); } retval = PAPI_add_event( EventSet1, PAPI_TOT_CYC ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); /* Add them reversed to EventSet2 */ retval = PAPI_create_eventset( &EventSet2 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); eventcnt = 2; retval = PAPI_list_events( EventSet1, events, &eventcnt ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_list_events", retval ); for ( i = eventcnt - 1; i >= 0; i-- ) { retval = PAPI_event_code_to_name( events[i], event_name ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__,
"PAPI_event_code_to_name", retval ); retval = PAPI_add_event( EventSet2, events[i] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_add_event", retval ); } elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet1, values1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); retval = PAPI_start( EventSet2 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( NUM_FLOPS ); retval = PAPI_stop( EventSet2, values2 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; retval = PAPI_cleanup_eventset( EventSet1 ); /* JT */ if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); retval = PAPI_destroy_eventset( &EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); retval = PAPI_cleanup_eventset( EventSet2 ); /* JT */ if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_cleanup_eventset", retval ); retval = PAPI_destroy_eventset( &EventSet2 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_destroy_eventset", retval ); if ( !quiet ) { printf( "Test case 0: start, stop.\n" ); printf( "-----------------------------------------------\n" ); tmp = PAPI_get_opt( PAPI_DEFDOM, NULL ); printf( "Default domain is: %d (%s)\n", tmp, stringify_all_domains( tmp ) ); tmp = PAPI_get_opt( PAPI_DEFGRN, NULL ); printf( "Default granularity is: %d (%s)\n", tmp, stringify_granularity( tmp ) ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf ( "-------------------------------------------------------------------------\n" ); printf( "Test type : \t 1\t 2\n" ); sprintf( add_event_str, "%-12s 
: \t", event_name ); printf( TAB2, add_event_str, values1[0], values2[1] ); printf( TAB2, "PAPI_TOT_CYC : \t", values1[1], values2[0] ); printf( TAB1, "Real usec : \t", elapsed_us ); printf( TAB1, "Real cycles : \t", elapsed_cyc ); printf ( "-------------------------------------------------------------------------\n" ); printf( "Verification: none\n" ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/zero_fork.c000066400000000000000000000074521502707512200201100ustar00rootroot00000000000000/* * File: zero_fork.c * Author: Philip Mucci * mucci@cs.utk.edu * Mods: * */ /* This file performs the following test: PAPI_library_init() Add two events PAPI_start() fork() / \ parent child | PAPI_library_init() | Add two events | PAPI_start() | PAPI_stop() | fork()-----\ | child parent PAPI_library_init() | Add two events | PAPI_start() | PAPI_stop() | wait() wait() | PAPI_stop() No validation is done */ #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <sys/wait.h> #include "papi.h" #include "papi_test.h" #include "do_loops.h" int EventSet1 = PAPI_NULL; int PAPI_event, mask1; int num_events1 = 2; long long elapsed_us, elapsed_cyc; long long **values; char event_name[PAPI_MAX_STR_LEN]; int retval, num_tests = 1; void process_init( void ) { if (!TESTS_QUIET) printf( "Process %d \n", ( int ) getpid( ) ); /* Initialize PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depending on the availability of the event on the platform */ EventSet1 = add_two_events( &num_events1, &PAPI_event, &mask1 ); values = allocate_test_space( num_tests, num_events1 ); retval = PAPI_event_code_to_name( PAPI_event, event_name ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); retval = PAPI_start(
EventSet1 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } } void process_fini( void ) { retval = PAPI_stop( EventSet1, values[0] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; remove_test_events( &EventSet1, mask1 ); if (!TESTS_QUIET) { printf( "Process %d %-12s : \t%lld\n", ( int ) getpid( ), event_name, values[0][1] ); printf( "Process %d PAPI_TOT_CYC : \t%lld\n", ( int ) getpid( ), values[0][0] ); printf( "Process %d Real usec : \t%lld\n", ( int ) getpid( ), elapsed_us ); printf( "Process %d Real cycles : \t%lld\n", ( int ) getpid( ), elapsed_cyc ); } free_test_space( values, num_tests ); } int main( int argc, char **argv ) { int flops1; int retval; tests_quiet( argc, argv ); /* Set TESTS_QUIET variable */ # if (defined(__ALPHA) && defined(__osf__)) test_skip( __FILE__, __LINE__, "main: fork not supported.", 0 ); #endif if (!TESTS_QUIET) { printf( "This tests if PAPI_library_init(),2*fork(),PAPI_library_init() works.\n" ); } /* Initialize PAPI for this process */ process_init( ); flops1 = 1000000; if ( fork( ) == 0 ) { /* Initialize PAPI for the child process */ process_init( ); /* Let the child process do work */ do_flops( flops1 ); /* Measure the child process */ process_fini( ); exit( 0 ); } flops1 = 2000000; if ( fork( ) == 0 ) { /* Initialize PAPI for the child process */ process_init( ); /* Let the child process do work */ do_flops( flops1 ); /* Measure the child process */ process_fini( ); exit( 0 ); } /* Let this process do work */ flops1 = 4000000; do_flops( flops1 ); /* Wait for child to finish */ wait( &retval ); /* Wait for child to finish */ wait( &retval ); /* Measure this process */ process_fini( ); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/zero_named.c000066400000000000000000000111671502707512200202310ustar00rootroot00000000000000/* This test 
exercises the PAPI_{query, add, remove}_event APIs for PRESET events. It more or less duplicates the functionality of the classic "zero" test. */ #include <stdio.h> #include "papi.h" #include "papi_test.h" #include "do_loops.h" int main( int argc, char **argv ) { int retval, num_tests = 1, tmp; int EventSet = PAPI_NULL; int num_events = 2; long long **values; long long elapsed_us, elapsed_cyc, elapsed_virt_us, elapsed_virt_cyc; const char *event_names[] = {"PAPI_TOT_CYC","PAPI_TOT_INS"}; char add_event_str[PAPI_MAX_STR_LEN]; double cycles_error; int quiet; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); /* Init the PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* Verify that the named events exist */ retval = PAPI_query_named_event(event_names[0]); if ( retval == PAPI_OK) { retval = PAPI_query_named_event(event_names[1]); } if ( retval != PAPI_OK ) { if (!quiet) printf("Trouble querying events\n"); test_skip( __FILE__, __LINE__, "PAPI_query_named_event", retval ); } /* Create an empty event set */ retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); /* add the events named above */ retval = PAPI_add_named_event( EventSet, event_names[0] ); if ( retval != PAPI_OK ) { sprintf( add_event_str, "PAPI_add_named_event[%s]", event_names[0] ); test_fail( __FILE__, __LINE__, add_event_str, retval ); } retval = PAPI_add_named_event( EventSet, event_names[1] ); if ( retval != PAPI_OK ) { sprintf( add_event_str, "PAPI_add_named_event[%s]", event_names[1] ); test_fail( __FILE__, __LINE__, add_event_str, retval ); } values = allocate_test_space( num_tests, num_events ); /* Gather before stats */ elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); elapsed_virt_us = PAPI_get_virt_usec( ); elapsed_virt_cyc = PAPI_get_virt_cyc( ); /* Start PAPI */ retval =
PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } /* our test code */ do_flops( NUM_FLOPS ); /* Stop PAPI */ retval = PAPI_stop( EventSet, values[0] ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } /* Calculate total values */ elapsed_virt_us = PAPI_get_virt_usec( ) - elapsed_virt_us; elapsed_virt_cyc = PAPI_get_virt_cyc( ) - elapsed_virt_cyc; elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; /* remove PAPI_TOT_CYC and PAPI_TOT_INS */ retval = PAPI_remove_named_event( EventSet, event_names[0] ); if ( retval != PAPI_OK ) { sprintf( add_event_str, "PAPI_add_named_event[%s]", event_names[0] ); test_fail( __FILE__, __LINE__, add_event_str, retval ); } retval = PAPI_remove_named_event( EventSet, event_names[1] ); if ( retval != PAPI_OK ) { sprintf( add_event_str, "PAPI_add_named_event[%s]", event_names[1] ); test_fail( __FILE__, __LINE__, add_event_str, retval ); } if ( !quiet ) { printf( "PAPI_{query, add, remove}_named_event API test.\n" ); printf( "-----------------------------------------------\n" ); tmp = PAPI_get_opt( PAPI_DEFDOM, NULL ); printf( "Default domain is: %d (%s)\n", tmp, stringify_all_domains( tmp ) ); tmp = PAPI_get_opt( PAPI_DEFGRN, NULL ); printf( "Default granularity is: %d (%s)\n", tmp, stringify_granularity( tmp ) ); printf( "Using %d iterations of c += a*b\n", NUM_FLOPS ); printf( "-------------------------------------------------------------------------\n" ); printf( "Test type : \t 1\n" ); /* cycles is first, other event second */ sprintf( add_event_str, "%-12s : \t", event_names[0] ); printf( TAB1, add_event_str, values[0][0] ); sprintf( add_event_str, "%-12s : \t", event_names[1] ); printf( TAB1, add_event_str, values[0][1] ); printf( TAB1, "Real usec : \t", elapsed_us ); printf( TAB1, "Real cycles : \t", elapsed_cyc ); printf( TAB1, "Virt usec : \t", elapsed_virt_us ); printf( TAB1, "Virt cycles : 
\t", elapsed_virt_cyc ); printf( "-------------------------------------------------------------------------\n" ); printf( "Verification: PAPI_TOT_CYC should be roughly real_cycles\n" ); cycles_error=100.0*((double)values[0][0] - (double)elapsed_cyc)/ (double)values[0][0]; if (cycles_error>10.0) { printf("Error of %.2f%%\n",cycles_error); test_fail( __FILE__, __LINE__, "validation", 0 ); } } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/zero_omp.c000066400000000000000000000115371502707512200177410ustar00rootroot00000000000000/* * File: zero_omp.c * Author: Philip Mucci * mucci@cs.utk.edu * Mods: Nils Smeds * smeds@pdc.kth.se * Anders Nilsson * anni@pdc.kth.se */ /* This file performs the following test: start, stop and timer functionality for 2 slave OMP threads - It attempts to use the following two counters. It may use less depending on hardware counter resource limitations. These are counted in the default counting domain and default granularity, depending on the platform. Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). + PAPI_FP_INS + PAPI_TOT_CYC Each thread inside the Thread routine: - Get cyc. - Get us. - Start counters - Do flops - Stop and read counters - Get us. - Get cyc. Master serial thread: - Get us. - Get cyc. - Run parallel for loop - Get us. - Get cyc. 
*/ #include <stdio.h> #include <stdlib.h> #include "papi.h" #include "papi_test.h" #include "do_loops.h" #ifdef _OPENMP #include <omp.h> #else #error "This compiler does not understand OPENMP" #endif const PAPI_hw_info_t *hw_info = NULL; void Thread( int n ) { int retval, num_tests = 1; int EventSet1 = PAPI_NULL; int PAPI_event, mask1; int num_events1; long long **values; long long elapsed_us, elapsed_cyc; char event_name[PAPI_MAX_STR_LEN]; if (!TESTS_QUIET) { printf( "Thread %#x started\n", omp_get_thread_num( ) ); } /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depending on the availability of the event on the platform */ EventSet1 = add_two_events( &num_events1, &PAPI_event, &mask1 ); if (num_events1==0) { if (!TESTS_QUIET) printf("No events added!\n"); test_fail(__FILE__,__LINE__,"No events",0); } retval = PAPI_event_code_to_name( PAPI_event, event_name ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); values = allocate_test_space( num_tests, num_events1 ); elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); do_flops( n ); retval = PAPI_stop( EventSet1, values[0] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; remove_test_events( &EventSet1, mask1 ); if ( !TESTS_QUIET ) { printf( "Thread %#x %-12s : \t%lld\n", omp_get_thread_num( ), event_name, values[0][1] ); printf( "Thread %#x PAPI_TOT_CYC: \t%lld\n", omp_get_thread_num( ), values[0][0] ); printf( "Thread %#x Real usec : \t%lld\n", omp_get_thread_num( ), elapsed_us ); printf( "Thread %#x Real cycles : \t%lld\n", omp_get_thread_num( ), elapsed_cyc ); } /* It is illegal for the threads to exit in OpenMP */ /* test_pass(__FILE__,0,0); */ free_test_space( values, num_tests );
PAPI_unregister_thread( ); if (!TESTS_QUIET) { printf( "Thread %#x finished\n", omp_get_thread_num( ) ); } } unsigned long omp_get_thread_num_wrapper(void){ return (unsigned long)omp_get_thread_num(); } int main( int argc, char **argv ) { int retval; long long elapsed_us, elapsed_cyc; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } hw_info = PAPI_get_hardware_info( ); if ( hw_info == NULL ) { test_fail( __FILE__, __LINE__, "PAPI_get_hardware_info", 2 ); } if (PAPI_query_event(PAPI_TOT_INS)!=PAPI_OK) { if (!quiet) printf("Can't find PAPI_TOT_INS\n"); test_skip(__FILE__,__LINE__,"Event missing",1); } if (PAPI_query_event(PAPI_TOT_CYC)!=PAPI_OK) { if (!quiet) printf("Can't find PAPI_TOT_CYC\n"); test_skip(__FILE__,__LINE__,"Event missing",1); } elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); retval = PAPI_thread_init( omp_get_thread_num_wrapper ); if ( retval != PAPI_OK ) { if ( retval == PAPI_ECMP ) { if (!quiet) printf("Trouble init threads\n"); test_skip( __FILE__, __LINE__, "PAPI_thread_init", retval ); } else { test_fail( __FILE__, __LINE__, "PAPI_thread_init", retval ); } } #pragma omp parallel { Thread( 1000000 * ( omp_get_thread_num( ) + 1 ) ); } omp_set_num_threads( 1 ); Thread( 1000000 * ( omp_get_thread_num( ) + 1 ) ); omp_set_num_threads( omp_get_max_threads( ) ); #pragma omp parallel { Thread( 1000000 * ( omp_get_thread_num( ) + 1 ) ); } elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; elapsed_us = PAPI_get_real_usec( ) - elapsed_us; if ( !TESTS_QUIET ) { printf( "Master real usec : \t%lld\n", elapsed_us ); printf( "Master real cycles : \t%lld\n", elapsed_cyc ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/ctests/zero_pthreads.c000066400000000000000000000131041502707512200207500ustar00rootroot00000000000000/* This file performs the following 
test: start, stop and timer functionality for 2 slave pthreads - It attempts to use the following two counters. It may use less depending on hardware counter resource limitations. These are counted in the default counting domain and default granularity, depending on the platform. Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). + PAPI_FP_INS + PAPI_TOT_CYC Each of 2 slave pthreads: - Get cyc. - Get us. - Start counters - Do flops - Stop and read counters - Get us. - Get cyc. Master pthread: - Get us. - Get cyc. - Fork threads - Wait for threads to exit - Get us. - Get cyc. */ #include #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" void * Thread( void *arg ) { int retval, num_tests = 1; int EventSet1 = PAPI_NULL; int PAPI_event, mask1; int num_events1; long long **values; long long elapsed_us, elapsed_cyc; char event_name[PAPI_MAX_STR_LEN]; retval = PAPI_register_thread( ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_register_thread", retval ); } if (!TESTS_QUIET) { printf( "Thread %#x started\n", ( int ) pthread_self( ) ); } /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depending on the availability of the event on the platform */ EventSet1 = add_two_events( &num_events1, &PAPI_event, &mask1 ); if (!TESTS_QUIET) { printf("Events %d\n",num_events1); } if (num_events1<2) { test_fail( __FILE__, __LINE__, "Not enough events", retval ); } retval = PAPI_event_code_to_name( PAPI_event, event_name ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); } values = allocate_test_space( num_tests, num_events1 ); elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } do_flops( *( int * ) arg ); retval = PAPI_stop( EventSet1, values[0] ); if ( retval != PAPI_OK ) { test_fail( 
__FILE__, __LINE__, "PAPI_stop", retval ); } elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; remove_test_events( &EventSet1, mask1 ); if ( !TESTS_QUIET ) { printf( "Thread %#x %-12s : \t%lld\n", ( int ) pthread_self( ), event_name, values[0][1] ); printf( "Thread %#x PAPI_TOT_CYC : \t%lld\n", (int) pthread_self(), values[0][0] ); printf( "Thread %#x Real usec : \t%lld\n", ( int ) pthread_self( ), elapsed_us ); printf( "Thread %#x Real cycles : \t%lld\n", (int) pthread_self(), elapsed_cyc ); } free_test_space( values, num_tests ); retval = PAPI_unregister_thread( ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_unregister_thread", retval ); return NULL; } int main( int argc, char **argv ) { pthread_t e_th, f_th, g_th, h_th; int flops1, flops2, flops3, flops4; int retval, rc; pthread_attr_t attr; long long elapsed_us, elapsed_cyc; int quiet; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); /* Init PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } if (PAPI_query_event(PAPI_TOT_INS)!=PAPI_OK) { if (!quiet) printf("Can't find PAPI_TOT_INS\n"); test_skip(__FILE__,__LINE__,"Event missing",1); } if (PAPI_query_event(PAPI_TOT_CYC)!=PAPI_OK) { if (!quiet) printf("Can't find PAPI_TOT_CYC\n"); test_skip(__FILE__,__LINE__,"Event missing",1); } retval = PAPI_thread_init( ( unsigned long ( * )( void ) ) ( pthread_self ) ); if ( retval != PAPI_OK ) { if ( retval == PAPI_ECMP ) { test_skip( __FILE__, __LINE__, "PAPI_thread_init", retval ); } else { test_fail( __FILE__, __LINE__, "PAPI_thread_init", retval ); } } elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); pthread_attr_init( &attr ); #ifdef PTHREAD_CREATE_UNDETACHED pthread_attr_setdetachstate( &attr, PTHREAD_CREATE_UNDETACHED ); #endif #ifdef PTHREAD_SCOPE_SYSTEM retval = pthread_attr_setscope( &attr, 
PTHREAD_SCOPE_SYSTEM ); if ( retval != 0 ) test_skip( __FILE__, __LINE__, "pthread_attr_setscope", retval ); #endif flops1 = 1000000; rc = pthread_create( &e_th, &attr, Thread, ( void * ) &flops1 ); if ( rc ) { retval = PAPI_ESYS; test_fail( __FILE__, __LINE__, "pthread_create", retval ); } flops2 = 2000000; rc = pthread_create( &f_th, &attr, Thread, ( void * ) &flops2 ); if ( rc ) { retval = PAPI_ESYS; test_fail( __FILE__, __LINE__, "pthread_create", retval ); } flops3 = 4000000; rc = pthread_create( &g_th, &attr, Thread, ( void * ) &flops3 ); if ( rc ) { retval = PAPI_ESYS; test_fail( __FILE__, __LINE__, "pthread_create", retval ); } flops4 = 8000000; rc = pthread_create( &h_th, &attr, Thread, ( void * ) &flops4 ); if ( rc ) { retval = PAPI_ESYS; test_fail( __FILE__, __LINE__, "pthread_create", retval ); } pthread_attr_destroy( &attr ); flops1 = 500000; Thread( &flops1 ); pthread_join( h_th, NULL ); pthread_join( g_th, NULL ); pthread_join( f_th, NULL ); pthread_join( e_th, NULL ); elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; elapsed_us = PAPI_get_real_usec( ) - elapsed_us; if ( !quiet ) { printf( "Master real usec : \t%lld\n", elapsed_us ); printf( "Master real cycles : \t%lld\n", elapsed_cyc ); } test_pass( __FILE__ ); pthread_exit( NULL ); return 0; } papi-papi-7-2-0-t/src/ctests/zero_shmem.c000066400000000000000000000043551502707512200202570ustar00rootroot00000000000000/* This code attempts to test that SHMEM works with PAPI */ /* SHMEM was developed by Cray and supported by various */ /* other vendors. */ #include #include #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" void Thread( int n ) { int retval, num_tests = 1; int EventSet1 = PAPI_NULL; int mask1 = 0x5; int num_events1; long long **values; long long elapsed_us, elapsed_cyc; EventSet1 = add_test_events( &num_events1, &mask1, 1 ); /* num_events1 is greater than num_events2 so don't worry. 
*/ values = allocate_test_space( num_tests, num_events1 ); elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); retval = PAPI_start( EventSet1 ); /* we should indicate failure somehow, not just exit */ if ( retval != PAPI_OK ) exit( 1 ); do_flops( n ); retval = PAPI_stop( EventSet1, values[0] ); if ( retval != PAPI_OK ) exit( 1 ); elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; remove_test_events( &EventSet1, mask1 ); printf( "Thread %#x PAPI_FP_INS : \t%lld\n", n / 1000000, ( values[0] )[0] ); printf( "Thread %#x PAPI_TOT_CYC: \t%lld\n", n / 1000000, ( values[0] )[1] ); printf( "Thread %#x Real usec : \t%lld\n", n / 1000000, elapsed_us ); printf( "Thread %#x Real cycles : \t%lld\n", n / 1000000, elapsed_cyc ); free_test_space( values, num_tests ); } int main( int argc, char **argv ) { int quiet; long long elapsed_us, elapsed_cyc; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); #ifdef HAVE_OPENSHMEM /* Start 2 processing elements (SHMEM call) */ start_pes( 2 ); Thread( 1000000 * ( _my_pe( ) + 1 ) ); #else if (!quiet) { printf("No OpenSHMEM support\n"); } test_skip( __FILE__, __LINE__, "OpenSHMEM support not found, skipping.", 0); #endif elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; elapsed_us = PAPI_get_real_usec( ) - elapsed_us; printf( "Master real usec : \t%lld\n", elapsed_us ); printf( "Master real cycles : \t%lld\n", elapsed_cyc ); return 0; } papi-papi-7-2-0-t/src/ctests/zero_smp.c000066400000000000000000000104701502707512200177400ustar00rootroot00000000000000/* This file performs the following test: start, stop and timer functionality for 2 slave native SMP threads - It attempts to use the following two counters. It may use less depending on hardware counter resource limitations. These are counted in the default counting domain and default granularity, depending on the platform. 
Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). + PAPI_FP_INS + PAPI_TOT_CYC Each of 2 slave pthreads: - Get cyc. - Get us. - Start counters - Do flops - Stop and read counters - Get us. - Get cyc. Master pthread: - Get us. - Get cyc. - Fork threads - Wait for threads to exit - Get us. - Get cyc. */ #include #include #include "papi.h" #include "papi_test.h" #include "do_loops.h" #if defined(sun) && defined(sparc) #include #elif defined(mips) && defined(sgi) && defined(unix) #include #elif defined(_AIX) || defined(__linux__) #include #endif void Thread( int t, int n ) { int retval, num_tests = 1; int EventSet1 = PAPI_NULL; int PAPI_event, mask1; int num_events1; long long **values; long long elapsed_us, elapsed_cyc; char event_name[PAPI_MAX_STR_LEN]; /* add PAPI_TOT_CYC and one of the events in PAPI_FP_INS, PAPI_FP_OPS or PAPI_TOT_INS, depending on the availability of the event on the platform */ EventSet1 = add_two_events( &num_events1, &PAPI_event, &mask1 ); retval = PAPI_event_code_to_name( PAPI_event, event_name ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_event_code_to_name", retval ); values = allocate_test_space( num_tests, num_events1 ); retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_start", retval ); elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); do_flops( n ); elapsed_us = PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; retval = PAPI_stop( EventSet1, values[0] ); if ( retval != PAPI_OK ) test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); remove_test_events( &EventSet1, mask1 ); if ( !TESTS_QUIET ) { printf( "Thread %#x %-12s : \t%lld\n", t, event_name, values[0][1] ); printf( "Thread %#x PAPI_TOT_CYC : \t%lld\n", t, values[0][0] ); } free_test_space( values, num_tests ); if ( !TESTS_QUIET ) { printf( "Thread %#x Real usec : \t%lld\n", t, elapsed_us ); printf( "Thread %#x Real cycles : 
\t%lld\n", t,
			elapsed_cyc );
	}
	PAPI_unregister_thread( );
}

int
main( int argc, char **argv )
{
	int i, retval, quiet;
	long long elapsed_us, elapsed_cyc;

	/* Set TESTS_QUIET variable */
	quiet=tests_quiet( argc, argv );

	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	elapsed_us = PAPI_get_real_usec( );

	elapsed_cyc = PAPI_get_real_cyc( );

#if defined(_AIX) || defined(__linux__)
	retval = PAPI_thread_init( ( unsigned long ( * )( void ) ) ( pthread_self ) );
	if ( retval != PAPI_OK ) {
		if ( retval == PAPI_ECMP )
			test_skip( __FILE__, __LINE__, "PAPI_thread_init", retval );
		else
			test_fail( __FILE__, __LINE__, "PAPI_thread_init", retval );
	}

#if defined(_AIX)
#pragma ibm parallel_loop
#endif
#elif defined(sgi) && defined(mips)
	retval = PAPI_thread_init( ( unsigned long ( * )( void ) ) ( mp_my_threadnum ) );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_thread_init", retval );
	}
#pragma parallel
#pragma local(i)
#pragma pfor
#elif defined(sun) && defined(sparc)
	retval = PAPI_thread_init( ( unsigned long ( * )( void ) ) ( thr_self ) );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_thread_init", retval );
	}
#pragma MP taskloop private(i)
#else
	if (!quiet) {
		printf("This test only runs on AIX/IRIX/SOLARIS\n");
	}
	test_skip(__FILE__, __LINE__,
		"Architecture not included in this test file yet.", 0);
#endif

	for ( i = 1; i < 3; i++ ) {
		Thread( i, 10000000 * i );
	}

	elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc;

	elapsed_us = PAPI_get_real_usec( ) - elapsed_us;

	if ( !quiet ) {
		printf( "Master real usec   : \t%lld\n", elapsed_us );
		printf( "Master real cycles : \t%lld\n", elapsed_cyc );
	}

	// FIXME: we don't really validate anything here

	test_pass( __FILE__ );

	return 0;
}

papi-papi-7-2-0-t/src/darwin-common.c

#include #include #include #include #include #include #include
#include #include #include #include #include #include #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "darwin-memory.h" #include "darwin-common.h" #include "x86_cpuid_info.h" PAPI_os_info_t _papi_os_info; /* The locks used by Darwin */ #if defined(USE_PTHREAD_MUTEXES) pthread_mutex_t _papi_hwd_lock_data[PAPI_MAX_LOCK]; #else volatile unsigned int _papi_hwd_lock_data[PAPI_MAX_LOCK]; #endif static int _darwin_init_locks(void) { int i; for ( i = 0; i < PAPI_MAX_LOCK; i++ ) { #if defined(USE_PTHREAD_MUTEXES) pthread_mutex_init(&_papi_hwd_lock_data[i],NULL); #else _papi_hwd_lock_data[i] = MUTEX_OPEN; #endif } return PAPI_OK; } int _darwin_detect_hypervisor(char *virtual_vendor_name) { int retval=0; #if defined(__i386__)||defined(__x86_64__) retval=_x86_detect_hypervisor(virtual_vendor_name); #else (void) virtual_vendor_name; #endif return retval; } #define _PATH_SYS_SYSTEM "/sys/devices/system" #define _PATH_SYS_CPU0 _PATH_SYS_SYSTEM "/cpu/cpu0" static char pathbuf[PATH_MAX] = "/"; static char * search_cpu_info( FILE * f, char *search_str, char *line ) { /* This function courtesy of Rudolph Berrendorf! */ /* See the home page for the German version of PAPI. 
*/ char *s; while ( fgets( line, 256, f ) != NULL ) { if ( strstr( line, search_str ) != NULL ) { /* ignore all characters in line up to : */ for ( s = line; *s && ( *s != ':' ); ++s ); if ( *s ) return s; } } return NULL; } int _darwin_get_cpu_info( PAPI_hw_info_t *hwinfo, int *cpuinfo_mhz ) { int mib[4]; size_t len; char buffer[BUFSIZ]; long long ll; /* "sysctl -a" shows lots of info we can get on OSX */ /**********/ /* Vendor */ /**********/ len = 3; sysctlnametomib("machdep.cpu.vendor", mib, &len); len = BUFSIZ; if (sysctl(mib, 3, &buffer, &len, NULL, 0) == -1) { return PAPI_ESYS; } strncpy( hwinfo->vendor_string,buffer,len); hwinfo->vendor = PAPI_VENDOR_INTEL; /**************/ /* Model Name */ /**************/ len = 3; sysctlnametomib("machdep.cpu.brand_string", mib, &len); len = BUFSIZ; if (sysctl(mib, 3, &buffer, &len, NULL, 0) == -1) { return PAPI_ESYS; } strncpy( hwinfo->model_string,buffer,len); /************/ /* Revision */ /************/ len = 3; sysctlnametomib("machdep.cpu.stepping", mib, &len); len = BUFSIZ; if (sysctl(mib, 3, &buffer, &len, NULL, 0) == -1) { return PAPI_ESYS; } hwinfo->cpuid_stepping=buffer[0]; hwinfo->revision=(float)(hwinfo->cpuid_stepping); /**********/ /* Family */ /**********/ len = 3; sysctlnametomib("machdep.cpu.family", mib, &len); len = BUFSIZ; if (sysctl(mib, 3, &buffer, &len, NULL, 0) == -1) { return PAPI_ESYS; } hwinfo->cpuid_family=buffer[0]; /**********/ /* Model */ /**********/ len = 3; sysctlnametomib("machdep.cpu.model", mib, &len); len = BUFSIZ; if (sysctl(mib, 3, &buffer, &len, NULL, 0) == -1) { return PAPI_ESYS; } hwinfo->cpuid_model=buffer[0]; hwinfo->model=hwinfo->cpuid_model; /*************/ /* Frequency */ /*************/ len = 2; sysctlnametomib("hw.cpufrequency_max", mib, &len); len = 8; if (sysctl(mib, 2, &ll, &len, NULL, 0) == -1) { return PAPI_ESYS; } hwinfo->cpu_max_mhz=(int)(ll/(1000*1000)); len = 2; sysctlnametomib("hw.cpufrequency_min", mib, &len); len = 8; if (sysctl(mib, 2, &ll, &len, NULL, 0) == 
-1) { return PAPI_ESYS; } hwinfo->cpu_min_mhz=(int)(ll/(1000*1000)); /**********/ /* ncpu */ /**********/ len = 2; sysctlnametomib("hw.ncpu", mib, &len); len = BUFSIZ; if (sysctl(mib, 2, &buffer, &len, NULL, 0) == -1) { return PAPI_ESYS; } hwinfo->totalcpus=buffer[0]; return PAPI_OK; } int _darwin_get_system_info( papi_mdi_t *mdi ) { int retval; char maxargs[PAPI_HUGE_STR_LEN]; pid_t pid; int cpuinfo_mhz,sys_min_khz,sys_max_khz; /* Software info */ /* Path and args */ pid = getpid( ); if ( pid < 0 ) { PAPIERROR( "getpid() returned < 0" ); return PAPI_ESYS; } mdi->pid = pid; #if 0 sprintf( maxargs, "/proc/%d/exe", ( int ) pid ); if ( readlink( maxargs, mdi->exe_info.fullname, PAPI_HUGE_STR_LEN ) < 0 ) { PAPIERROR( "readlink(%s) returned < 0", maxargs ); return PAPI_ESYS; } /* Careful, basename can modify it's argument */ strcpy( maxargs, mdi->exe_info.fullname ); strcpy( mdi->exe_info.address_info.name, basename( maxargs ) ); SUBDBG( "Executable is %s\n", mdi->exe_info.address_info.name ); SUBDBG( "Full Executable is %s\n", mdi->exe_info.fullname ); /* Executable regions, may require reading /proc/pid/maps file */ retval = _darwin_update_shlib_info( mdi ); SUBDBG( "Text: Start %p, End %p, length %d\n", mdi->exe_info.address_info.text_start, mdi->exe_info.address_info.text_end, ( int ) ( mdi->exe_info.address_info.text_end - mdi->exe_info.address_info.text_start ) ); SUBDBG( "Data: Start %p, End %p, length %d\n", mdi->exe_info.address_info.data_start, mdi->exe_info.address_info.data_end, ( int ) ( mdi->exe_info.address_info.data_end - mdi->exe_info.address_info.data_start ) ); SUBDBG( "Bss: Start %p, End %p, length %d\n", mdi->exe_info.address_info.bss_start, mdi->exe_info.address_info.bss_end, ( int ) ( mdi->exe_info.address_info.bss_end - mdi->exe_info.address_info.bss_start ) ); #endif /* PAPI_preload_option information */ strcpy( mdi->preload_info.lib_preload_env, "LD_PRELOAD" ); mdi->preload_info.lib_preload_sep = ' '; strcpy( mdi->preload_info.lib_dir_env, 
"LD_LIBRARY_PATH" ); mdi->preload_info.lib_dir_sep = ':'; /* Hardware info */ retval = _darwin_get_cpu_info( &mdi->hw_info, &cpuinfo_mhz ); if ( retval ) { return retval; } /* Set Up Memory */ retval = _darwin_get_memory_info( &mdi->hw_info, mdi->hw_info.model ); if ( retval ) return retval; SUBDBG( "Found %d %s(%d) %s(%d) CPUs at %d Mhz.\n", mdi->hw_info.totalcpus, mdi->hw_info.vendor_string, mdi->hw_info.vendor, mdi->hw_info.model_string, mdi->hw_info.model, mdi->hw_info.cpu_max_mhz); /* Get virtualization info */ mdi->hw_info.virtualized=_darwin_detect_hypervisor(mdi->hw_info.virtual_vendor_string); return PAPI_OK; } int _papi_hwi_init_os(void) { int major=0,minor=0,sub=0; char *ptr; struct utsname uname_buffer; /* Initialize the locks */ _darwin_init_locks(); /* Get the kernel info */ uname(&uname_buffer); SUBDBG("Native kernel version %s\n",uname_buffer.release); strncpy(_papi_os_info.name,uname_buffer.sysname,PAPI_MAX_STR_LEN); strncpy(_papi_os_info.version,uname_buffer.release,PAPI_MAX_STR_LEN); ptr=strtok(_papi_os_info.version,"."); if (ptr!=NULL) major=atoi(ptr); ptr=strtok(NULL,"."); if (ptr!=NULL) minor=atoi(ptr); ptr=strtok(NULL,"."); if (ptr!=NULL) sub=atoi(ptr); // _papi_os_info.os_version=LINUX_VERSION(major,minor,sub); _papi_os_info.itimer_sig = PAPI_INT_MPX_SIGNAL; _papi_os_info.itimer_num = PAPI_INT_ITIMER; _papi_os_info.itimer_ns = PAPI_INT_MPX_DEF_US * 1000; _papi_os_info.itimer_res_ns = 1; _papi_os_info.clock_ticks = sysconf( _SC_CLK_TCK ); /* Get Darwin-specific system info */ _darwin_get_system_info( &_papi_hwi_system_info ); return PAPI_OK; } static inline long long get_cycles( void ) { long long ret = 0; #ifdef __x86_64__ do { unsigned int a, d; asm volatile ( "rdtsc":"=a" ( a ), "=d"( d ) ); ( ret ) = ( ( long long ) a ) | ( ( ( long long ) d ) << 32 ); } while ( 0 ); #else __asm__ __volatile__( "rdtsc":"=A"( ret ): ); #endif return ret; } long long _darwin_get_real_cycles( void ) { long long retval; retval = get_cycles( ); return retval; 
} long long _darwin_get_real_usec_gettimeofday( void ) { long long retval; struct timeval buffer; gettimeofday( &buffer, NULL ); retval = ( long long ) buffer.tv_sec * ( long long ) 1000000; retval += ( long long ) ( buffer.tv_usec ); return retval; } long long _darwin_get_virt_usec_times( void ) { long long retval; struct tms buffer; times( &buffer ); SUBDBG( "user %d system %d\n", ( int ) buffer.tms_utime, ( int ) buffer.tms_stime ); retval = ( long long ) ( ( buffer.tms_utime + buffer.tms_stime ) * 1000000 / sysconf( _SC_CLK_TCK )); /* NOT CLOCKS_PER_SEC as in the headers! */ return retval; } papi_os_vector_t _papi_os_vector = { .get_memory_info = _darwin_get_memory_info, .get_dmem_info = _darwin_get_dmem_info, .get_real_cycles = _darwin_get_real_cycles, .update_shlib_info = _darwin_update_shlib_info, .get_system_info = _darwin_get_system_info, .get_real_usec = _darwin_get_real_usec_gettimeofday, .get_virt_usec = _darwin_get_virt_usec_times, }; papi-papi-7-2-0-t/src/darwin-common.h000066400000000000000000000011311502707512200173460ustar00rootroot00000000000000#ifndef _DARWIN_COMMON_H #define _DARWIN_COMMON_H #include #define min(x, y) ({ \ typeof(x) _min1 = (x); \ typeof(y) _min2 = (y); \ (void) (&_min1 == &_min2); \ _min1 < _min2 ? _min1 : _min2; }) static inline pid_t mygettid( void ) { pthread_t ptid = pthread_self(); pid_t thread_id = 0; memcpy(&thread_id, &ptid, sizeof(pid_t) < sizeof(pthread_t) ? 
sizeof(pid_t) : sizeof(pthread_t)); return thread_id; } long long _darwin_get_real_cycles( void ); long long _darwin_get_virt_usec_times( void ); long long _darwin_get_real_usec_gettimeofday( void ); #endif papi-papi-7-2-0-t/src/darwin-context.h000066400000000000000000000000001502707512200175340ustar00rootroot00000000000000papi-papi-7-2-0-t/src/darwin-lock.h000066400000000000000000000031011502707512200170050ustar00rootroot00000000000000#ifndef _DARWIN_LOCK_H #define _DARWIN_LOCK_H #include "mb.h" /* Locking functions */ #if defined(USE_PTHREAD_MUTEXES) #include extern pthread_mutex_t _papi_hwd_lock_data[PAPI_MAX_LOCK]; #define _papi_hwd_lock(lck) \ do \ { \ pthread_mutex_lock (&_papi_hwd_lock_data[lck]); \ } while(0) #define _papi_hwd_unlock(lck) \ do \ { \ pthread_mutex_unlock(&_papi_hwd_lock_data[lck]); \ } while(0) #else extern volatile unsigned int _papi_hwd_lock_data[PAPI_MAX_LOCK]; #define MUTEX_OPEN 0 #define MUTEX_CLOSED 1 #define _papi_hwd_lock(lck) \ do \ { \ unsigned int res = 0; \ do { \ __asm__ __volatile__ ("lock ; " "cmpxchg %1,%2" : "=a"(res) : "q"(MUTEX_CLOSED), "m"(_papi_hwd_lock_data[lck]), "0"(MUTEX_OPEN) : "memory"); \ } while(res != (unsigned int)MUTEX_OPEN); \ } while(0) #define _papi_hwd_unlock(lck) \ do \ { \ unsigned int res = 0; \ __asm__ __volatile__ ("xchg %0,%1" : "=r"(res) : "m"(_papi_hwd_lock_data[lck]), "0"(MUTEX_OPEN) : "memory"); \ } while(0) #endif #endif papi-papi-7-2-0-t/src/darwin-memory.c000066400000000000000000000024241502707512200173670ustar00rootroot00000000000000#include #include #include #include "papi.h" #include "papi_internal.h" #include "papi_memory.h" /* papi_calloc() */ #include "x86_cpuid_info.h" #include "darwin-lock.h" int _darwin_get_dmem_info( PAPI_dmem_info_t * d ) { int mib[4]; size_t len; char buffer[BUFSIZ]; long long ll; /**********/ /* memory */ /**********/ len = 2; sysctlnametomib("hw.memsize", mib, &len); len = 8; if (sysctl(mib, 2, &ll, &len, NULL, 0) == -1) { return PAPI_ESYS; } d->size=ll; 
d->pagesize = getpagesize( ); return PAPI_OK; } /* * Architecture-specific cache detection code */ #if defined(__i386__)||defined(__x86_64__) static int x86_get_memory_info( PAPI_hw_info_t * hw_info ) { int retval = PAPI_OK; switch ( hw_info->vendor ) { case PAPI_VENDOR_AMD: case PAPI_VENDOR_INTEL: retval = _x86_cache_info( &hw_info->mem_hierarchy ); break; default: PAPIERROR( "Unknown vendor in memory information call for x86." ); return PAPI_ENOIMPL; } return retval; } #endif int _darwin_get_memory_info( PAPI_hw_info_t * hwinfo, int cpu_type ) { ( void ) cpu_type; /*unused */ int retval = PAPI_OK; x86_get_memory_info( hwinfo ); return retval; } int _darwin_update_shlib_info( papi_mdi_t *mdi ) { return PAPI_OK; } papi-papi-7-2-0-t/src/darwin-memory.h000066400000000000000000000002541502707512200173730ustar00rootroot00000000000000int _darwin_get_dmem_info( PAPI_dmem_info_t * d ); int _darwin_get_memory_info( PAPI_hw_info_t * hwinfo, int cpu_type ); int _darwin_update_shlib_info( papi_mdi_t *mdi ); papi-papi-7-2-0-t/src/event_data/000077500000000000000000000000001502707512200165415ustar00rootroot00000000000000papi-papi-7-2-0-t/src/event_data/power4/000077500000000000000000000000001502707512200177615ustar00rootroot00000000000000papi-papi-7-2-0-t/src/event_data/power4/events000066400000000000000000004437461502707512200212320ustar00rootroot00000000000000{ **************************** { THIS IS OPEN SOURCE CODE { **************************** { (C) COPYRIGHT International Business Machines Corp. 2005 { This file is licensed under the University of Tennessee license. { See LICENSE.txt. { { File: events/power4/events { Author: Maynard Johnson { maynardj@us.ibm.com { Mods: { { counter 1 } #0,v,g,n,n,PM_BIQ_IDU_FULL_CYC,Cycles BIQ or IDU full ##0224,0824 This signal will be asserted each time either the IDU is full or the BIQ is full. 
#1,u,g,n,n,PM_BRQ_FULL_CYC,Cycles branch queue full ##0105,0605 The ISU sends a signal indicating that the issue queue that feeds the ifu br unit cannot accept any more group (queue is full of groups). #2,v,g,n,n,PM_CR_MAP_FULL_CYC,Cycles CR logical operation mapper full ##0104,0604 The ISU sends a signal indicating that the cr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #3,u,g,n,n,PM_DC_PREF_L2_CLONE_L3,L2 prefetch cloned with L3 ##0C27 A prefetch request was made to the L2 with a cloned request sent to the L3 #4,v,g,n,n,PM_DC_PREF_STREAM_ALLOC,D cache new prefetch stream allocated ##0907 A new Prefetch Stream was allocated #5,v,g,n,n,PM_DSLB_MISS,Data SLB misses ##0905 A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve #6,v,g,n,n,PM_DTLB_MISS,Data TLB misses ##0904 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. #7,v,g,n,n,PM_FPR_MAP_FULL_CYC,Cycles FPR mapper full ##0101,0601 The ISU sends a signal indicating that the FPR mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #8,v,g,n,n,PM_FPU0_ALL,FPU0 executed add, mult, sub, cmp or sel instruction ##0003 This signal is active for one cycle when fp0 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #9,v,g,n,n,PM_FPU0_DENORM,FPU0 received denormalized data ##0020 This signal is active for one cycle when one of the operands is denormalized. 
#10,v,g,n,n,PM_FPU0_FDIV,FPU0 executed FDIV instruction ##0000 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #11,v,g,n,n,PM_FPU0_FMA,FPU0 executed multiply-add instruction ##0001 This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #12,v,g,n,n,PM_FPU0_FSQRT,FPU0 executed FSQRT instruction ##0002 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #13,v,g,n,n,PM_FPU0_FULL_CYC,Cycles FPU0 issue queue full ##0103,0603 The issue queue for FPU unit 0 cannot accept any more instructions. Issue is stopped #14,v,g,n,n,PM_FPU0_SINGLE,FPU0 executed single precision instruction ##0023 This signal is active for one cycle when fp0 is executing single precision instruction. #15,v,g,n,n,PM_FPU0_STALL3,FPU0 stalled in pipe3 ##0021 This signal indicates that fp0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. #16,v,g,n,n,PM_FPU0_STF,FPU0 executed store instruction ##0022 This signal is active for one cycle when fp0 is executing a store instruction. #17,v,g,n,n,PM_FPU1_ALL,FPU1 executed add, mult, sub, cmp or sel instruction ##0007 This signal is active for one cycle when fp1 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #18,v,g,n,n,PM_FPU1_DENORM,FPU1 received denormalized data ##0024 This signal is active for one cycle when one of the operands is denormalized. 
#19,v,g,n,n,PM_FPU1_FDIV,FPU1 executed FDIV instruction ##0004 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #20,v,g,n,n,PM_FPU1_FMA,FPU1 executed multiply-add instruction ##0005 This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #21,v,g,n,n,PM_FPU1_FSQRT,FPU1 executed FSQRT instruction ##0006 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #22,v,g,n,n,PM_FPU1_FULL_CYC,Cycles FPU1 issue queue full ##0107,0607 The issue queue for FPU unit 1 cannot accept any more instructions. Issue is stopped #23,v,g,n,n,PM_FPU1_SINGLE,FPU1 executed single precision instruction ##0027 This signal is active for one cycle when fp1 is executing single precision instruction. #24,v,g,n,n,PM_FPU1_STALL3,FPU1 stalled in pipe3 ##0025 This signal indicates that fp1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. #25,v,g,n,n,PM_FPU1_STF,FPU1 executed store instruction ##0026 This signal is active for one cycle when fp1 is executing a store instruction. #26,v,g,n,n,PM_GCT_FULL_CYC,Cycles GCT full ##0100,0600 The ISU sends a signal indicating the gct is full. #27,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected ##0124,0624 A group that previously attempted dispatch was rejected. #28,v,g,n,n,PM_GRP_DISP_VALID,Group dispatch valid ##0123,0623 Dispatch has been attempted for a valid group. Some groups may be rejected. The total number of successful dispatches is the number of dispatch valid minus dispatch reject. 
#29,v,g,n,n,PM_IC_PREF_INSTALL,Instruction prefetched installed in prefetch buffer ##0225,0825 This signal is asserted when a prefetch buffer entry (line) is allocated but the request is not a demand fetch. #30,v,g,n,n,PM_IC_PREF_REQ,Instruction prefetch requests ##0226,0826 Asserted when a non-canceled prefetch is made to the cache interface unit (CIU). #31,v,g,n,n,PM_IERAT_XLATE_WR,Translation written to ierat ##0227,0827 This signal will be asserted each time the I-ERAT is written. This indicates that an ERAT miss has been serviced. ERAT misses will initiate a sequence resulting in the ERAT being written. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed, This should be a fairly accurate count of ERAT missed (best available). #32,v,g,n,n,PM_INST_DISP,Instructions dispatched ##0121,0621 The ISU sends the number of instructions dispatched. #33,v,g,n,n,PM_INST_FETCH_CYC,Cycles at least 1 instruction fetched ##0223,0823 Asserted each cycle when the IFU sends at least one instruction to the IDU. #34,u,g,n,n,PM_ISLB_MISS,Instruction SLB misses ##0901 A SLB miss for an instruction fetch as occurred #35,v,g,n,n,PM_ITLB_MISS,Instruction TLB misses ##0900 A TLB miss for an Instruction Fetch has occurred #36,v,g,n,n,PM_L1_DCACHE_RELOAD_VALID,L1 reload data source valid ##0C64 The data source information is valid #37,v,g,n,n,PM_L2SA_MOD_INV,L2 slice A transition from modified to invalid ##4007 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #38,v,g,n,n,PM_L2SA_MOD_TAG,L2 slice A transition from modified to tagged ##4006 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. 
This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #39,v,g,n,n,PM_L2SA_SHR_INV,L2 slice A transition from shared to invalid ##4005 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #40,v,g,n,n,PM_L2SA_SHR_MOD,L2 slice A transition from shared to modified ##4004 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. #41,v,g,n,n,PM_L2SB_MOD_INV,L2 slice B transition from modified to invalid ##4023 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #42,v,g,n,n,PM_L2SB_MOD_TAG,L2 slice B transition from modified to tagged ##4022 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #43,v,g,n,n,PM_L2SB_SHR_INV,L2 slice B transition from shared to invalid ##4021 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. 
NOTE: For this event to be useful the tablewalk duration event should also be counted. #44,v,g,n,n,PM_L2SB_SHR_MOD,L2 slice B transition from shared to modified ##4020 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. #45,v,g,n,n,PM_L2SC_MOD_INV,L2 slice C transition from modified to invalid ##4027 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #46,v,g,n,n,PM_L2SC_MOD_TAG,L2 slice C transition from modified to tagged ##4026 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #47,v,g,n,n,PM_L2SC_SHR_INV,L2 slice C transition from shared to invalid ##4025 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #48,v,g,n,n,PM_L2SC_SHR_MOD,L2 slice C transition from shared to modified ##4024 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. 
#49,v,g,n,n,PM_L3B0_DIR_MIS,L3 bank 0 directory misses ##4001 A reference was made to the local L3 directory by a local CPU and it missed in the L3. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3 #50,v,g,n,n,PM_L3B0_DIR_REF,L3 bank 0 directory references ##4000 A reference was made to the local L3 directory by a local CPU. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3 #51,v,g,n,n,PM_L3B1_DIR_MIS,L3 bank 1 directory misses ##4003 A reference was made to the local L3 directory by a local CPU and it missed in the L3. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3 #52,v,g,n,n,PM_L3B1_DIR_REF,L3 bank 1 directory references ##4002 A reference was made to the local L3 directory by a local CPU. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3 #53,u,g,n,n,PM_LR_CTR_MAP_FULL_CYC,Cycles LR/CTR mapper full ##0106,0606 The ISU sends a signal indicating that the lr/ctr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #54,v,g,n,n,PM_LSU0_DERAT_MISS,LSU0 DERAT misses ##0902 A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. 
#55,v,g,n,n,PM_LSU0_FLUSH_LRQ,LSU0 LRQ flushes ##0C02 A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #56,u,g,n,n,PM_LSU0_FLUSH_SRQ,LSU0 SRQ flushes ##0C03 A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. #57,v,g,n,n,PM_LSU0_FLUSH_ULD,LSU0 unaligned load flushes ##0C00 A load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #58,v,g,n,n,PM_LSU0_FLUSH_UST,LSU0 unaligned store flushes ##0C01 A store was flushed from unit 0 because it was unaligned (crossed a 4k boundary) #59,u,g,n,n,PM_LSU0_SRQ_STFWD,LSU0 SRQ store forwarded ##0C20 Data from a store instruction was forwarded to a load on unit 0 #60,v,g,n,n,PM_LSU1_DERAT_MISS,LSU1 DERAT misses ##0906 A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #61,v,g,n,n,PM_LSU1_FLUSH_LRQ,LSU1 LRQ flushes ##0C06 A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #62,u,g,n,n,PM_LSU1_FLUSH_SRQ,LSU1 SRQ flushes ##0C07 A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. 
#63,v,g,n,n,PM_LSU1_FLUSH_ULD,LSU1 unaligned load flushes ##0C04 A load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #64,u,g,n,n,PM_LSU1_FLUSH_UST,LSU1 unaligned store flushes ##0C05 A store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #65,u,g,n,n,PM_LSU1_SRQ_STFWD,LSU1 SRQ store forwarded ##0C24 Data from a store instruction was forwarded to a load on unit 1 #66,u,g,n,n,PM_LSU_LMQ_FULL_CYC,Cycles LMQ full ##0927 The LMQ was full #67,v,g,n,n,PM_LSU_LMQ_LHR_MERGE,LMQ LHR merges ##0926 A dcache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry. #68,v,g,n,n,PM_LSU_LRQ_S0_ALLOC,LRQ slot 0 allocated ##0C26 LRQ slot zero was allocated #69,v,g,n,n,PM_LSU_LRQ_S0_VALID,LRQ slot 0 valid ##0C22 This signal is asserted every cycle that the Load Request Queue slot zero is valid. The LRQ is 32 entries long and is allocated round-robin. #70,v,g,n,n,PM_LSU_SRQ_S0_ALLOC,SRQ slot 0 allocated ##0C25 SRQ Slot zero was allocated #71,v,g,n,n,PM_LSU_SRQ_S0_VALID,SRQ slot 0 valid ##0C21 This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. #72,v,g,n,n,PM_MRK_IMR_RELOAD,Marked IMR reloaded ##0922 A DL1 reload occurred due to a marked load #73,v,g,n,n,PM_MRK_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##0920 A marked load, executing on unit 0, missed the dcache #74,v,g,n,n,PM_MRK_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##0924 A marked load, executing on unit 1, missed the dcache #75,v,g,n,n,PM_MRK_STCX_FAIL,Marked STCX failed ##0925 A marked stcx (stwcx or stdcx) failed #76,v,g,n,n,PM_MRK_ST_MISS_L1,Marked L1 D cache store misses ##0923 A marked store missed the dcache #77,u,g,n,n,PM_SNOOP_TLBIE,Snoop TLBIE ##0903 A TLB miss for a data request occurred. 
Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. #78,v,g,n,n,PM_STCX_FAIL,STCX failed ##0921 A stcx (stwcx or stdcx) failed #79,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##0C23 A store missed the dcache #80,v,g,n,n,PM_XER_MAP_FULL_CYC,Cycles XER mapper full ##0102,0602 The ISU sends a signal indicating that the xer mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #81,v,g,n,n,PM_CYC,Processor cycles ##800F Processor cycles #82,v,g,n,n,PM_DATA_FROM_L3,Data loaded from L3 ##8C66 DL1 was reloaded from the local L3 due to a demand load #83,v,g,n,n,PM_FPU_DENORM,FPU received denormalized data ##8020 This signal is active for one cycle when one of the operands is denormalized. Combined Unit 0 + Unit 1 #84,v,g,n,n,PM_FPU_FDIV,FPU executed FDIV instruction ##8000 This signal is active for one cycle at the end of the microcode executed when FPU is executing a divide instruction. This could be fdiv, fdivs, fdiv., fdivs. Combined Unit 0 + Unit 1 #85,u,g,n,n,PM_GCT_EMPTY_CYC,Cycles GCT empty ##8004 The Global Completion Table is completely empty #86,c,g,n,n,PM_INST_CMPL,Instructions completed ##8001 Number of Eligible Instructions that completed. #87,v,g,n,n,PM_INST_FROM_MEM,Instruction fetched from memory ##8227 An instruction fetch group was fetched from memory. 
Fetch Groups can contain up to 8 instructions #88,v,g,n,n,PM_LSU_FLUSH_ULD,LRQ unaligned load flushes ##8C00 A load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #89,c,g,n,n,PM_LSU_SRQ_STFWD,SRQ store forwarded ##8C20 Data from a store instruction was forwarded to a load #90,v,g,n,n,PM_MRK_DATA_FROM_L3,Marked data loaded from L3 ##8C76 DL1 was reloaded from the local L3 due to a marked demand load #91,v,g,n,n,PM_MRK_GRP_DISP,Marked group dispatched ##8002 A group containing a sampled instruction was dispatched #92,v,g,n,n,PM_MRK_LD_MISS_L1,Marked L1 D cache load misses ##8920 Marked L1 D cache load misses #93,v,g,n,n,PM_MRK_ST_CMPL,Marked store instruction completed ##8003 A sampled store has completed (data home) #94,v,g,n,n,PM_RUN_CYC,Run cycles ##8005 Processor Cycles gated by the run latch $$$$ { counter 2 } #0,v,g,n,n,PM_BIQ_IDU_FULL_CYC,Cycles BIQ or IDU full ##0224,0824 This signal will be asserted each time either the IDU is full or the BIQ is full. #1,u,g,n,n,PM_BRQ_FULL_CYC,Cycles branch queue full ##0105,0605 The ISU sends a signal indicating that the issue queue that feeds the ifu br unit cannot accept any more groups (queue is full of groups). #2,v,g,n,n,PM_CR_MAP_FULL_CYC,Cycles CR logical operation mapper full ##0104,0604 The ISU sends a signal indicating that the cr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #3,u,g,n,n,PM_DC_PREF_L2_CLONE_L3,L2 prefetch cloned with L3 ##0C27 A prefetch request was made to the L2 with a cloned request sent to the L3 #4,v,g,n,n,PM_DC_PREF_STREAM_ALLOC,D cache new prefetch stream allocated ##0907 A new Prefetch Stream was allocated #5,v,g,n,n,PM_DSLB_MISS,Data SLB misses ##0905 A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve #6,v,g,n,n,PM_DTLB_MISS,Data TLB misses ##0904 A TLB miss for a data request occurred. 
Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. #7,v,g,n,n,PM_FPR_MAP_FULL_CYC,Cycles FPR mapper full ##0101,0601 The ISU sends a signal indicating that the FPR mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #8,v,g,n,n,PM_FPU0_ALL,FPU0 executed add, mult, sub, cmp or sel instruction ##0003 This signal is active for one cycle when fp0 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #9,v,g,n,n,PM_FPU0_DENORM,FPU0 received denormalized data ##0020 This signal is active for one cycle when one of the operands is denormalized. #10,v,g,n,n,PM_FPU0_FDIV,FPU0 executed FDIV instruction ##0000 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a divide instruction. This could be fdiv, fdivs, fdiv., fdivs. #11,v,g,n,n,PM_FPU0_FMA,FPU0 executed multiply-add instruction ##0001 This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #12,v,g,n,n,PM_FPU0_FSQRT,FPU0 executed FSQRT instruction ##0002 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #13,v,g,n,n,PM_FPU0_FULL_CYC,Cycles FPU0 issue queue full ##0103,0603 The issue queue for FPU unit 0 cannot accept any more instructions. Issue is stopped #14,v,g,n,n,PM_FPU0_SINGLE,FPU0 executed single precision instruction ##0023 This signal is active for one cycle when fp0 is executing single precision instruction. 
#15,v,g,n,n,PM_FPU0_STALL3,FPU0 stalled in pipe3 ##0021 This signal indicates that fp0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. #16,v,g,n,n,PM_FPU0_STF,FPU0 executed store instruction ##0022 This signal is active for one cycle when fp0 is executing a store instruction. #17,v,g,n,n,PM_FPU1_ALL,FPU1 executed add, mult, sub, cmp or sel instruction ##0007 This signal is active for one cycle when fp1 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #18,v,g,n,n,PM_FPU1_DENORM,FPU1 received denormalized data ##0024 This signal is active for one cycle when one of the operands is denormalized. #19,v,g,n,n,PM_FPU1_FDIV,FPU1 executed FDIV instruction ##0004 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a divide instruction. This could be fdiv, fdivs, fdiv., fdivs. #20,v,g,n,n,PM_FPU1_FMA,FPU1 executed multiply-add instruction ##0005 This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #21,v,g,n,n,PM_FPU1_FSQRT,FPU1 executed FSQRT instruction ##0006 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #22,v,g,n,n,PM_FPU1_FULL_CYC,Cycles FPU1 issue queue full ##0107,0607 The issue queue for FPU unit 1 cannot accept any more instructions. Issue is stopped #23,v,g,n,n,PM_FPU1_SINGLE,FPU1 executed single precision instruction ##0027 This signal is active for one cycle when fp1 is executing single precision instruction. 
#24,v,g,n,n,PM_FPU1_STALL3,FPU1 stalled in pipe3 ##0025 This signal indicates that fp1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. #25,v,g,n,n,PM_FPU1_STF,FPU1 executed store instruction ##0026 This signal is active for one cycle when fp1 is executing a store instruction. #26,v,g,n,n,PM_GCT_FULL_CYC,Cycles GCT full ##0100,0600 The ISU sends a signal indicating the gct is full. #27,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected ##0124,0624 A group that previously attempted dispatch was rejected. #28,v,g,n,n,PM_GRP_DISP_VALID,Group dispatch valid ##0123,0623 Dispatch has been attempted for a valid group. Some groups may be rejected. The total number of successful dispatches is the number of dispatch valid minus dispatch reject. #29,v,g,n,n,PM_IC_PREF_INSTALL,Instruction prefetched installed in prefetch buffer ##0225,0825 This signal is asserted when a prefetch buffer entry (line) is allocated but the request is not a demand fetch. #30,v,g,n,n,PM_IC_PREF_REQ,Instruction prefetch requests ##0226,0826 Asserted when a non-canceled prefetch is made to the cache interface unit (CIU). #31,v,g,n,n,PM_IERAT_XLATE_WR,Translation written to ierat ##0227,0827 This signal will be asserted each time the I-ERAT is written. This indicates that an ERAT miss has been serviced. ERAT misses will initiate a sequence resulting in the ERAT being written. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed. This should be a fairly accurate count of ERAT misses (best available). #32,v,g,n,n,PM_INST_DISP,Instructions dispatched ##0121,0621 The ISU sends the number of instructions dispatched. #33,v,g,n,n,PM_INST_FETCH_CYC,Cycles at least 1 instruction fetched ##0223,0823 Asserted each cycle when the IFU sends at least one instruction to the IDU. 
#34,u,g,n,n,PM_ISLB_MISS,Instruction SLB misses ##0901 A SLB miss for an instruction fetch has occurred #35,v,g,n,n,PM_ITLB_MISS,Instruction TLB misses ##0900 A TLB miss for an Instruction Fetch has occurred #36,v,g,n,n,PM_L1_DCACHE_RELOAD_VALID,L1 reload data source valid ##0C64 The data source information is valid #37,v,g,n,n,PM_L2SA_MOD_INV,L2 slice A transition from modified to invalid ##4007 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #38,v,g,n,n,PM_L2SA_MOD_TAG,L2 slice A transition from modified to tagged ##4006 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #39,v,g,n,n,PM_L2SA_SHR_INV,L2 slice A transition from shared to invalid ##4005 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #40,v,g,n,n,PM_L2SA_SHR_MOD,L2 slice A transition from shared to modified ##4004 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. 
#41,v,g,n,n,PM_L2SB_MOD_INV,L2 slice B transition from modified to invalid ##4023 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #42,v,g,n,n,PM_L2SB_MOD_TAG,L2 slice B transition from modified to tagged ##4022 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #43,v,g,n,n,PM_L2SB_SHR_INV,L2 slice B transition from shared to invalid ##4021 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #44,v,g,n,n,PM_L2SB_SHR_MOD,L2 slice B transition from shared to modified ##4020 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. #45,v,g,n,n,PM_L2SC_MOD_INV,L2 slice C transition from modified to invalid ##4027 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. 
#46,v,g,n,n,PM_L2SC_MOD_TAG,L2 slice C transition from modified to tagged ##4026 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #47,v,g,n,n,PM_L2SC_SHR_INV,L2 slice C transition from shared to invalid ##4025 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #48,v,g,n,n,PM_L2SC_SHR_MOD,L2 slice C transition from shared to modified ##4024 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. #49,v,g,n,n,PM_L3B0_DIR_MIS,L3 bank 0 directory misses ##4001 A reference was made to the local L3 directory by a local CPU and it missed in the L3. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3 #50,v,g,n,n,PM_L3B0_DIR_REF,L3 bank 0 directory references ##4000 A reference was made to the local L3 directory by a local CPU. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3 #51,v,g,n,n,PM_L3B1_DIR_MIS,L3 bank 1 directory misses ##4003 A reference was made to the local L3 directory by a local CPU and it missed in the L3. Only requests from on-MCM CPUs are counted. 
This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3 #52,v,g,n,n,PM_L3B1_DIR_REF,L3 bank 1 directory references ##4002 A reference was made to the local L3 directory by a local CPU. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3 #53,u,g,n,n,PM_LR_CTR_MAP_FULL_CYC,Cycles LR/CTR mapper full ##0106,0606 The ISU sends a signal indicating that the lr/ctr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #54,v,g,n,n,PM_LSU0_DERAT_MISS,LSU0 DERAT misses ##0902 A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #55,v,g,n,n,PM_LSU0_FLUSH_LRQ,LSU0 LRQ flushes ##0C02 A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #56,u,g,n,n,PM_LSU0_FLUSH_SRQ,LSU0 SRQ flushes ##0C03 A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. 
#57,v,g,n,n,PM_LSU0_FLUSH_ULD,LSU0 unaligned load flushes ##0C00 A load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #58,v,g,n,n,PM_LSU0_FLUSH_UST,LSU0 unaligned store flushes ##0C01 A store was flushed from unit 0 because it was unaligned (crossed a 4k boundary) #59,u,g,n,n,PM_LSU0_SRQ_STFWD,LSU0 SRQ store forwarded ##0C20 Data from a store instruction was forwarded to a load on unit 0 #60,v,g,n,n,PM_LSU1_DERAT_MISS,LSU1 DERAT misses ##0906 A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #61,v,g,n,n,PM_LSU1_FLUSH_LRQ,LSU1 LRQ flushes ##0C06 A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #62,u,g,n,n,PM_LSU1_FLUSH_SRQ,LSU1 SRQ flushes ##0C07 A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. #63,v,g,n,n,PM_LSU1_FLUSH_ULD,LSU1 unaligned load flushes ##0C04 A load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #64,u,g,n,n,PM_LSU1_FLUSH_UST,LSU1 unaligned store flushes ##0C05 A store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #65,u,g,n,n,PM_LSU1_SRQ_STFWD,LSU1 SRQ store forwarded ##0C24 Data from a store instruction was forwarded to a load on unit 1 #66,u,g,n,n,PM_LSU_LMQ_FULL_CYC,Cycles LMQ full ##0927 The LMQ was full #67,v,g,n,n,PM_LSU_LMQ_LHR_MERGE,LMQ LHR merges ##0926 A dcache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry. 
#68,v,g,n,n,PM_LSU_LRQ_S0_ALLOC,LRQ slot 0 allocated ##0C26 LRQ slot zero was allocated #69,v,g,n,n,PM_LSU_LRQ_S0_VALID,LRQ slot 0 valid ##0C22 This signal is asserted every cycle that the Load Request Queue slot zero is valid. The LRQ is 32 entries long and is allocated round-robin. #70,v,g,n,n,PM_LSU_SRQ_S0_ALLOC,SRQ slot 0 allocated ##0C25 SRQ Slot zero was allocated #71,v,g,n,n,PM_LSU_SRQ_S0_VALID,SRQ slot 0 valid ##0C21 This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. #72,v,g,n,n,PM_MRK_IMR_RELOAD,Marked IMR reloaded ##0922 A DL1 reload occurred due to a marked load #73,v,g,n,n,PM_MRK_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##0920 A marked load, executing on unit 0, missed the dcache #74,v,g,n,n,PM_MRK_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##0924 A marked load, executing on unit 1, missed the dcache #75,v,g,n,n,PM_MRK_STCX_FAIL,Marked STCX failed ##0925 A marked stcx (stwcx or stdcx) failed #76,v,g,n,n,PM_MRK_ST_MISS_L1,Marked L1 D cache store misses ##0923 A marked store missed the dcache #77,u,g,n,n,PM_SNOOP_TLBIE,Snoop TLBIE ##0903 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. #78,v,g,n,n,PM_STCX_FAIL,STCX failed ##0921 A stcx (stwcx or stdcx) failed #79,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##0C23 A store missed the dcache #80,v,g,n,n,PM_XER_MAP_FULL_CYC,Cycles XER mapper full ##0102,0602 The ISU sends a signal indicating that the xer mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. 
#81,v,g,n,n,PM_CYC,Processor cycles ##800F Processor cycles #82,v,g,n,n,PM_DATA_FROM_MEM,Data loaded from memory ##8C66 DL1 was reloaded from memory due to a demand load #83,v,g,n,n,PM_FPU_FMA,FPU executed multiply-add instruction ##8000 This signal is active for one cycle when FPU is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1 #84,v,g,n,n,PM_FPU_STALL3,FPU stalled in pipe3 ##8020 FPU has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. Combined Unit 0 + Unit 1 #85,v,g,n,n,PM_GRP_DISP,Group dispatches ##8004 A group was dispatched #86,c,g,n,n,PM_INST_FROM_L25_L275,Instruction fetched from L2.5/L2.75 ##8227 An instruction fetch group was fetched from the L2 of another chip. Fetch Groups can contain up to 8 instructions #87,v,g,n,n,PM_LSU_FLUSH_UST,SRQ unaligned store flushes ##8C00 A store was flushed because it was unaligned #88,u,g,n,n,PM_LSU_LMQ_SRQ_EMPTY_CYC,Cycles LMQ and SRQ empty ##8002 Cycles when both the LMQ and SRQ are empty (LSU is idle) #89,v,g,n,n,PM_MRK_BRU_FIN,Marked instruction BRU processing finished ##8005 The branch unit finished a marked instruction. Instructions that finish may not necessarily complete #90,v,g,n,n,PM_MRK_DATA_FROM_MEM,Marked data loaded from memory ##8C76 DL1 was reloaded from memory due to a marked demand load #91,v,g,t,n,PM_THRESH_TIMEO,Threshold timeout ##8003 The threshold timer expired #92,v,g,n,n,PM_WORK_HELD,Work held ##8001 RAS Unit has signaled completion to stop and there are groups waiting to complete $$$$ { counter 3 } #0,v,g,n,n,PM_1INST_CLB_CYC,Cycles 1 instruction in CLB ##0450 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. 
Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #1,v,g,n,n,PM_2INST_CLB_CYC,Cycles 2 instructions in CLB ##0451 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #2,v,g,n,n,PM_3INST_CLB_CYC,Cycles 3 instructions in CLB ##0452 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #3,v,g,n,n,PM_4INST_CLB_CYC,Cycles 4 instructions in CLB ##0453 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #4,v,g,n,n,PM_5INST_CLB_CYC,Cycles 5 instructions in CLB ##0454 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. 
Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #5,v,g,n,n,PM_6INST_CLB_CYC,Cycles 6 instructions in CLB ##0455 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #6,v,g,n,n,PM_7INST_CLB_CYC,Cycles 7 instructions in CLB ##0456 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #7,v,g,n,n,PM_8INST_CLB_CYC,Cycles 8 instructions in CLB ##0457 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #8,v,g,n,n,PM_BR_ISSUED,Branches issued ##0230,0830 This signal will be asserted each time the ISU issues a branch instruction. 
This signal will be asserted each time the ISU selects a branch instruction to issue. #9,v,g,n,n,PM_BR_MPRED_CR,Branch mispredictions due to CR bit setting ##0231,0831 This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction. #10,v,g,n,n,PM_BR_MPRED_TA,Branch mispredictions due to target address ##0232,0832 Branch mispredict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction. #11,u,g,n,n,PM_CRQ_FULL_CYC,Cycles CR issue queue full ##0111,0611 The ISU sends a signal indicating that the issue queue that feeds the ifu cr unit cannot accept any more groups (queue is full of groups). #12,v,g,n,n,PM_DATA_TABLEWALK_CYC,Cycles doing data tablewalks ##0936 This signal is asserted every cycle when a tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried. #13,u,g,n,n,PM_DC_INV_L2,L1 D cache entries invalidated from L2 ##0C17 A dcache invalidate was received from the L2 because a line in L2 was castout. #14,v,g,n,n,PM_DC_PREF_OUT_STREAMS,Out of prefetch streams ##0C36 A new prefetch stream was detected, but no more stream entries were available #15,v,g,n,n,PM_EE_OFF,Cycles MSR(EE) bit off ##0133,0633 The number of cycles the MSR(EE) bit was off. 
#16,u,g,n,n,PM_EE_OFF_EXT_INT,Cycles MSR(EE) bit off and external interrupt pending ##0137,0637 Cycles MSR(EE) bit off and external interrupt pending #17,v,g,n,n,PM_FAB_CMD_ISSUED,Fabric command issued ##4016 A bus command was issued on the MCM to MCM fabric from the local (this chip's) Fabric Bus Controller. This event is scaled to the fabric frequency and must be adjusted for a true count. i.e. if the fabric is running 2:1, divide the count by 2. #18,v,g,n,n,PM_FAB_CMD_RETRIED,Fabric command retried ##4017 A bus command on the MCM to MCM fabric was retried. This event is the total count of all retried fabric commands for the local MCM (all four chips report the same value). This event is scaled to the fabric frequency and must be adjusted for a true count. i.e. if the fabric is running 2:1, divide the count by 2. #19,v,g,n,n,PM_LSU0_LDF,LSU0 executed Floating Point load instruction ##0930 A floating point load was executed from LSU unit 0 #20,v,g,n,n,PM_LSU1_LDF,LSU1 executed Floating Point load instruction ##0934 A floating point load was executed from LSU unit 1 #21,v,g,n,n,PM_FPU0_FEST,FPU0 executed FEST instruction ##0012 This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #22,v,g,n,n,PM_FPU0_FIN,FPU0 produced a result ##0013 fp0 finished, produced a result. This only indicates finish, not completion. #23,v,g,n,n,PM_FPU0_FMOV_FEST,FPU0 executed FMOV or FEST instructions ##0010 This signal is active for one cycle when fp0 is executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ #24,v,g,n,n,PM_FPU0_FPSCR,FPU0 executed FPSCR instruction ##0030 This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*, 
mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs #25,v,g,n,n,PM_FPU0_FRSP_FCONV,FPU0 executed FRSP or FCONV instructions ##0011 This signal is active for one cycle when fp0 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #26,v,g,n,n,PM_FPU1_FEST,FPU1 executed FEST instruction ##0016 This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #27,v,g,n,n,PM_FPU1_FIN,FPU1 produced a result ##0017 fp1 finished, produced a result. This only indicates finish, not completion. #28,v,g,n,n,PM_FPU1_FMOV_FEST,FPU1 executed FMOV or FEST instructions ##0014 This signal is active for one cycle when fp1 is executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ #29,v,g,n,n,PM_FPU1_FRSP_FCONV,FPU1 executed FRSP or FCONV instructions ##0015 This signal is active for one cycle when fp1 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #30,v,g,n,n,PM_FXLS0_FULL_CYC,Cycles FXU0/LS0 queue full ##0110,0610 The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped #31,v,g,n,n,PM_FXU0_FIN,FXU0 produced a result ##0132,0632 The Fixed Point unit 0 finished an instruction and produced a result #32,v,g,n,n,PM_FXU1_FIN,FXU1 produced a result ##0136,0636 The Fixed Point unit 1 finished an instruction and produced a result #33,v,g,n,n,PM_GPR_MAP_FULL_CYC,Cycles GPR mapper full ##0135,0635 The ISU sends a signal indicating that the gpr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mappers is full but the entire mapper may not be. 
#34,v,g,n,n,PM_GRP_DISP_BLK_SB_CYC,Cycles group dispatch blocked by scoreboard ##0131,0631 The ISU sends a signal indicating that dispatch is blocked by scoreboard. #35,v,g,n,n,PM_L1_PREF,L1 cache data prefetches ##0C35 A request to prefetch data into the L1 was made #36,v,g,n,n,PM_L1_WRITE_CYC,Cycles writing to instruction L1 ##0233,0833 This signal is asserted each cycle a cache write is active. #37,v,g,n,n,PM_L2SA_ST_HIT,L2 slice A store hits ##4011 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C. #38,v,g,n,n,PM_L2SA_ST_REQ,L2 slice A store requests ##4010 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C. #39,v,g,n,n,PM_L2SB_ST_HIT,L2 slice B store hits ##4013 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C. #40,v,g,n,n,PM_L2SB_ST_REQ,L2 slice B store requests ##4012 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C. #41,v,g,n,n,PM_L2SC_ST_HIT,L2 slice C store hits ##4015 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C. #42,v,g,n,n,PM_L2SC_ST_REQ,L2 slice C store requests ##4014 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C. 
#43,v,g,n,n,PM_L2_PREF,L2 cache prefetches ##0C34 A request to prefetch data into L2 was made #44,v,g,n,n,PM_LARX_LSU0,Larx executed on LSU0 ##0C73 A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0) #45,c,g,n,n,PM_LARX_LSU1,Larx executed on LSU1 ##0C77 Invalid event, larx instructions are never executed on unit 1 #46,v,g,n,n,PM_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##0C12 A load, executing on unit 0, missed the dcache #47,v,g,n,n,PM_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##0C16 A load, executing on unit 1, missed the dcache #48,v,g,n,n,PM_LD_REF_L1_LSU0,LSU0 L1 D cache load references ##0C10 A load executed on unit 0 #49,v,g,n,n,PM_LD_REF_L1_LSU1,LSU1 L1 D cache load references ##0C14 A load executed on unit 1 #50,v,g,n,n,PM_LSU0_BUSY,LSU0 busy ##0C33 LSU unit 0 is busy rejecting instructions #51,v,g,n,n,PM_LSU1_BUSY,LSU1 busy ##0C37 LSU unit 1 is busy rejecting instructions #52,v,g,n,n,PM_LSU_LMQ_S0_ALLOC,LMQ slot 0 allocated ##0935 The first entry in the LMQ was allocated. #53,v,g,n,n,PM_LSU_LMQ_S0_VALID,LMQ slot 0 valid ##0931 This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ has eight entries that are allocated FIFO #54,v,g,n,n,PM_LSU_LRQ_FULL_CYC,Cycles LRQ full ##0112,0612 The isu sends this signal when the lrq is full. #55,v,g,n,n,PM_LSU_SRQ_FULL_CYC,Cycles SRQ full ##0113,0613 The isu sends this signal when the srq is full. #56,u,g,n,n,PM_LSU_SRQ_SYNC_CYC,SRQ sync duration ##0932 This signal is asserted every cycle when a sync is in the SRQ. 
#57,v,g,n,n,PM_MRK_L1_RELOAD_VALID,Marked L1 reload data source valid ##0C74 The source information is valid and is for a marked load #58,v,g,n,n,PM_MRK_LSU0_FLUSH_LRQ,LSU0 marked LRQ flushes ##0912 A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #59,u,g,n,n,PM_MRK_LSU0_FLUSH_SRQ,LSU0 marked SRQ flushes ##0913 A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #60,v,g,n,n,PM_MRK_LSU0_FLUSH_ULD,LSU0 marked unaligned load flushes ##0910 A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #61,v,g,n,n,PM_MRK_LSU0_FLUSH_UST,LSU0 marked unaligned store flushes ##0911 A marked store was flushed from unit 0 because it was unaligned #62,c,g,n,n,PM_MRK_LSU0_INST_FIN,LSU0 finished a marked instruction ##0C31 LSU unit 0 finished a marked instruction #63,v,g,n,n,PM_MRK_LSU1_FLUSH_LRQ,LSU1 marked LRQ flushes ##0916 A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #64,u,g,n,n,PM_MRK_LSU1_FLUSH_SRQ,LSU1 marked SRQ flushes ##0917 A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group. 
#65,v,g,n,n,PM_MRK_LSU1_FLUSH_ULD,LSU1 marked unaligned load flushes ##0914 A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #66,u,g,n,n,PM_MRK_LSU1_FLUSH_UST,LSU1 marked unaligned store flushes ##0915 A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #67,c,g,n,n,PM_MRK_LSU1_INST_FIN,LSU1 finished a marked instruction ##0C32 LSU unit 1 finished a marked instruction #68,u,g,n,n,PM_MRK_LSU_SRQ_INST_VALID,Marked instruction valid in SRQ ##0933 This signal is asserted every cycle when a marked request is resident in the Store Request Queue #69,v,g,n,n,PM_STCX_PASS,Stcx passes ##0C75 A stcx (stwcx or stdcx) instruction was successful #70,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##0C13 A store missed the dcache #71,v,g,n,n,PM_ST_REF_L1_LSU0,LSU0 L1 D cache store references ##0C11 A store executed on unit 0 #72,v,g,n,n,PM_ST_REF_L1_LSU1,LSU1 L1 D cache store references ##0C15 A store executed on unit 1 #73,v,g,n,n,PM_CYC,Processor cycles ##800F Processor cycles #74,v,g,n,n,PM_DATA_FROM_L35,Data loaded from L3.5 ##8C66 DL1 was reloaded from the L3 of another MCM due to a demand load #75,v,g,n,n,PM_FPU_FEST,FPU executed FEST instruction ##8010 This signal is active for one cycle when executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1. #76,v,g,n,n,PM_FXU_FIN,FXU produced a result ##8130 The fixed point unit (Unit 0 + Unit 1) finished a marked instruction. Instructions that finish may not necessary complete. #77,v,g,n,n,PM_FXU_FIN,FXU produced a result ##8630 The fixed point unit (Unit 0 + Unit 1) finished a marked instruction. Instructions that finish may not necessary complete. #78,v,g,n,n,PM_INST_FROM_L2,Instructions fetched from L2 ##8227 An instruction fetch group was fetched from L2. 
Fetch Groups can contain up to 8 instructions #79,v,g,n,n,PM_LD_MISS_L1,L1 D cache load misses ##8C10 Total DL1 Load references that miss the DL1 #80,v,g,n,n,PM_MRK_DATA_FROM_L35,Marked data loaded from L3.5 ##8C76 DL1 was reloaded from the L3 of another MCM due to a marked demand load #81,v,g,n,n,PM_MRK_LSU_FLUSH_LRQ,Marked LRQ flushes ##8910 A marked load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #82,v,g,n,n,PM_MRK_ST_CMPL_INT,Marked store completed with intervention ##8003 A marked store previously sent to the memory subsystem completed (data home) after requiring intervention #83,v,g,n,n,PM_STOP_COMPLETION,Completion stopped ##8001 RAS Unit has signaled completion to stop #84,v,g,n,n,PM_HV_CYC,Hypervisor Cycles ##8004 Cycles when the processor is executing in Hypervisor (MSR[HV] = 0 and MSR[PR]=0) #85,v,g,n,n,PM_FXLS1_FULL_CYC,Cycles FXU1/LS1 queue full ##0114,0614 The issue queue for FXU/LSU unit 1 cannot accept any more instructions. Issue is stopped $$$$ { counter 4 } #0,v,g,n,n,PM_1INST_CLB_CYC,Cycles 1 instruction in CLB ##0450 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) correspanding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #1,v,g,n,n,PM_2INST_CLB_CYC,Cycles 2 instructions in CLB ##0451 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. 
Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #2,v,g,n,n,PM_3INST_CLB_CYC,Cycles 3 instructions in CLB ##0452 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #3,v,g,n,n,PM_4INST_CLB_CYC,Cycles 4 instructions in CLB ##0453 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #4,v,g,n,n,PM_5INST_CLB_CYC,Cycles 5 instructions in CLB ##0454 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #5,v,g,n,n,PM_6INST_CLB_CYC,Cycles 6 instructions in CLB ##0455 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. 
Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #6,v,g,n,n,PM_7INST_CLB_CYC,Cycles 7 instructions in CLB ##0456 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #7,v,g,n,n,PM_8INST_CLB_CYC,Cycles 8 instructions in CLB ##0457 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #8,v,g,n,n,PM_BR_ISSUED,Branches issued ##0230,0830 This signal will be asserted each time the ISU issues a branch instruction. This signal will be asserted each time the ISU selects a branch instruction to issue. #9,v,g,n,n,PM_BR_MPRED_CR,Branch mispredictions due to CR bit setting ##0231,0831 This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction. 
#10,v,g,n,n,PM_BR_MPRED_TA,Branch mispredictions due to target address ##0232,0832 Branch mispredict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction. #11,u,g,n,n,PM_CRQ_FULL_CYC,Cycles CR issue queue full ##0111,0611 The ISU sends a signal indicating that the issue queue that feeds the ifu cr unit cannot accept any more groups (queue is full of groups). #12,v,g,n,n,PM_DATA_TABLEWALK_CYC,Cycles doing data tablewalks ##0936 This signal is asserted every cycle when a tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried. #13,u,g,n,n,PM_DC_INV_L2,L1 D cache entries invalidated from L2 ##0C17 A dcache invalidate was received from the L2 because a line in L2 was castout. #14,v,g,n,n,PM_DC_PREF_OUT_STREAMS,Out of prefetch streams ##0C36 A new prefetch stream was detected, but no more stream entries were available #15,v,g,n,n,PM_EE_OFF,Cycles MSR(EE) bit off ##0133,0633 The number of cycles the MSR(EE) bit was off. #16,u,g,n,n,PM_EE_OFF_EXT_INT,Cycles MSR(EE) bit off and external interrupt pending ##0137,0637 Cycles MSR(EE) bit off and external interrupt pending #17,v,g,n,n,PM_FAB_CMD_ISSUED,Fabric command issued ##4016 A bus command was issued on the MCM to MCM fabric from the local (this chip's) Fabric Bus Controller. This event is scaled to the fabric frequency and must be adjusted for a true count. i.e. if the fabric is running 2:1, divide the count by 2. #18,v,g,n,n,PM_FAB_CMD_RETRIED,Fabric command retried ##4017 A bus command on the MCM to MCM fabric was retried. This event is the total count of all retried fabric commands for the local MCM (all four chips report the same value). 
This event is scaled to the fabric frequency and must be adjusted for a true count. i.e. if the fabric is running 2:1, divide the count by 2. #19,v,g,n,n,PM_LSU0_LDF,LSU0 executed Floating Point load instruction ##0930 A floating point load was executed from LSU unit 0 #20,v,g,n,n,PM_LSU1_LDF,LSU1 executed Floating Point load instruction ##0934 A floating point load was executed from LSU unit 1 #21,v,g,n,n,PM_FPU0_FEST,FPU0 executed FEST instruction ##0012 This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #22,v,g,n,n,PM_FPU0_FIN,FPU0 produced a result ##0013 fp0 finished, produced a result. This only indicates finish, not completion. #23,v,g,n,n,PM_FPU0_FMOV_FEST,FPU0 executed FMOV or FEST instructions ##0010 This signal is active for one cycle when fp0 is executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ #24,v,g,n,n,PM_FPU0_FPSCR,FPU0 executed FPSCR instruction ##0030 This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*, mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs #25,v,g,n,n,PM_FPU0_FRSP_FCONV,FPU0 executed FRSP or FCONV instructions ##0011 This signal is active for one cycle when fp0 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #26,v,g,n,n,PM_FPU1_FEST,FPU1 executed FEST instruction ##0016 This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #27,v,g,n,n,PM_FPU1_FIN,FPU1 produced a result ##0017 fp1 finished, produced a result. This only indicates finish, not completion. 
#28,v,g,n,n,PM_FPU1_FMOV_FEST,FPU1 executed FMOV or FEST instructions ##0014 This signal is active for one cycle when fp1 is executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ #29,v,g,n,n,PM_FPU1_FRSP_FCONV,FPU1 executed FRSP or FCONV instructions ##0015 This signal is active for one cycle when fp1 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #30,v,g,n,n,PM_FXLS0_FULL_CYC,Cycles FXU0/LS0 queue full ##0110,0610 The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped #31,v,g,n,n,PM_FXU0_FIN,FXU0 produced a result ##0132,0632 The Fixed Point unit 0 finished an instruction and produced a result #32,v,g,n,n,PM_FXU1_FIN,FXU1 produced a result ##0136,0636 The Fixed Point unit 1 finished an instruction and produced a result #33,v,g,n,n,PM_GPR_MAP_FULL_CYC,Cycles GPR mapper full ##0135,0635 The ISU sends a signal indicating that the gpr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mappers is full but the entire mapper may not be. #34,v,g,n,n,PM_GRP_DISP_BLK_SB_CYC,Cycles group dispatch blocked by scoreboard ##0131,0631 The ISU sends a signal indicating that dispatch is blocked by scoreboard. #35,v,g,n,n,PM_L1_PREF,L1 cache data prefetches ##0C35 A request to prefetch data into the L1 was made #36,v,g,n,n,PM_L1_WRITE_CYC,Cycles writing to instruction L1 ##0233,0833 This signal is asserted each cycle a cache write is active. #37,v,g,n,n,PM_L2SA_ST_HIT,L2 slice A store hits ##4011 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C. #38,v,g,n,n,PM_L2SA_ST_REQ,L2 slice A store requests ##4010 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. 
The event is provided on each of the three slices A,B,and C. #39,v,g,n,n,PM_L2SB_ST_HIT,L2 slice B store hits ##4013 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C. #40,v,g,n,n,PM_L2SB_ST_REQ,L2 slice B store requests ##4012 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C. #41,v,g,n,n,PM_L2SC_ST_HIT,L2 slice C store hits ##4015 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C. #42,v,g,n,n,PM_L2SC_ST_REQ,L2 slice C store requests ##4014 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C. #43,v,g,n,n,PM_L2_PREF,L2 cache prefetches ##0C34 A request to prefetch data into L2 was made #44,v,g,n,n,PM_LARX_LSU0,Larx executed on LSU0 ##0C73 A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0) #45,c,g,n,n,PM_LARX_LSU1,Larx executed on LSU1 ##0C77 Invalid event, larx instructions are never executed on unit 1 #46,v,g,n,n,PM_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##0C12 A load, executing on unit 0, missed the dcache #47,v,g,n,n,PM_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##0C16 A load, executing on unit 1, missed the dcache #48,v,g,n,n,PM_LD_REF_L1_LSU0,LSU0 L1 D cache load references ##0C10 A load executed on unit 0 #49,v,g,n,n,PM_LD_REF_L1_LSU1,LSU1 L1 D cache load references ##0C14 A load executed on unit 1 #50,v,g,n,n,PM_LSU0_BUSY,LSU0 busy ##0C33 LSU unit 0 is busy rejecting instructions #51,v,g,n,n,PM_LSU1_BUSY,LSU1 busy ##0C37 LSU unit 1 is busy rejecting instructions #52,v,g,n,n,PM_LSU_LMQ_S0_ALLOC,LMQ slot 0 allocated ##0935 The first entry in the 
LMQ was allocated. #53,v,g,n,n,PM_LSU_LMQ_S0_VALID,LMQ slot 0 valid ##0931 This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ had eight entries that are allocated FIFO #54,v,g,n,n,PM_LSU_LRQ_FULL_CYC,Cycles LRQ full ##0112,0612 The isu sends this signal when the lrq is full. #55,v,g,n,n,PM_LSU_SRQ_FULL_CYC,Cycles SRQ full ##0113,0613 The isu sends this signal when the srq is full. #56,u,g,n,n,PM_LSU_SRQ_SYNC_CYC,SRQ sync duration ##0932 This signal is asserted every cycle when a sync is in the SRQ. #57,v,g,n,n,PM_MRK_L1_RELOAD_VALID,Marked L1 reload data source valid ##0C74 The source information is valid and is for a marked load #58,v,g,n,n,PM_MRK_LSU0_FLUSH_LRQ,LSU0 marked LRQ flushes ##0912 A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #59,u,g,n,n,PM_MRK_LSU0_FLUSH_SRQ,LSU0 marked SRQ flushes ##0913 A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #60,v,g,n,n,PM_MRK_LSU0_FLUSH_ULD,LSU0 marked unaligned load flushes ##0910 A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #61,v,g,n,n,PM_MRK_LSU0_FLUSH_UST,LSU0 marked unaligned store flushes ##0911 A marked store was flushed from unit 0 because it was unaligned #62,c,g,n,n,PM_MRK_LSU0_INST_FIN,LSU0 finished a marked instruction ##0C31 LSU unit 0 finished a marked instruction #63,v,g,n,n,PM_MRK_LSU1_FLUSH_LRQ,LSU1 marked LRQ flushes ##0916 A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. 
#64,u,g,n,n,PM_MRK_LSU1_FLUSH_SRQ,LSU1 marked SRQ flushes ##0917 A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #65,v,g,n,n,PM_MRK_LSU1_FLUSH_ULD,LSU1 marked unaligned load flushes ##0914 A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #66,u,g,n,n,PM_MRK_LSU1_FLUSH_UST,LSU1 marked unaligned store flushes ##0915 A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #67,c,g,n,n,PM_MRK_LSU1_INST_FIN,LSU1 finished a marked instruction ##0C32 LSU unit 1 finished a marked instruction #68,u,g,n,n,PM_MRK_LSU_SRQ_INST_VALID,Marked instruction valid in SRQ ##0933 This signal is asserted every cycle when a marked request is resident in the Store Request Queue #69,v,g,n,n,PM_STCX_PASS,Stcx passes ##0C75 A stcx (stwcx or stdcx) instruction was successful #70,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##0C13 A store missed the dcache #71,v,g,n,n,PM_ST_REF_L1_LSU0,LSU0 L1 D cache store references ##0C11 A store executed on unit 0 #72,v,g,n,n,PM_ST_REF_L1_LSU1,LSU1 L1 D cache store references ##0C15 A store executed on unit 1 #73,v,g,n,n,PM_CYC,Processor cycles ##800F Processor cycles #74,v,g,n,n,PM_DATA_FROM_L2,Data loaded from L2 ##8C66 DL1 was reloaded from the local L2 due to a demand load #75,v,g,n,n,PM_FPU_FIN,FPU produced a result ##8010 FPU finished, produced a result This only indicates finish, not completion. Combined Unit 0 + Unit 1 #76,u,g,n,n,PM_FXU1_BUSY_FXU0_IDLE,FXU1 busy FXU0 idle ##8002 FXU0 was idle while FXU1 was busy #77,c,g,n,n,PM_INST_CMPL,Instructions completed ##8001 Number of Eligible Instructions that completed. #78,v,g,n,n,PM_INST_FROM_L35,Instructions fetched from L3.5 ##8227 An instruction fetch group was fetched from the L3 of another module. Fetch Groups can contain up to 8 instructions #79,v,g,n,n,PM_LARX,Larx executed ##8C70 A Larx (lwarx or ldarx) was executed. 
This is the combined count from LSU0 + LSU1, but these instructions only execute on LSU0 #80,v,g,n,n,PM_LSU_BUSY,LSU busy ##8C30 LSU (unit 0 + unit 1) is busy rejecting instructions #81,u,g,n,n,PM_LSU_SRQ_EMPTY_CYC,Cycles SRQ empty ##8003 The Store Request Queue is empty #82,v,g,n,n,PM_MRK_CRU_FIN,Marked instruction CRU processing finished ##8005 The Condition Register Unit finished a marked instruction. Instructions that finish may not necessary complete #83,v,g,n,n,PM_MRK_DATA_FROM_L2,Marked data loaded from L2 ##8C76 DL1 was reloaded from the local L2 due to a marked demand load #84,v,g,n,n,PM_MRK_GRP_CMPL,Marked group completed ##8004 A group containing a sampled instruction completed. Microcoded instructions that span multiple groups will generate this event once per group. #85,v,g,n,n,PM_MRK_LSU_FLUSH_SRQ,Marked SRQ flushes ##8910 A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #86,v,g,n,n,PM_FXLS1_FULL_CYC,Cycles FXU1/LS1 queue full ##0114,0614 The issue queue for FXU/LSU unit 1 cannot accept any more instructions. Issue is stopped $$$$ { counter 5 } #0,v,g,n,n,PM_BIQ_IDU_FULL_CYC,Cycles BIQ or IDU full ##0224,0824 This signal will be asserted each time either the IDU is full or the BIQ is full. #1,u,g,n,n,PM_BRQ_FULL_CYC,Cycles branch queue full ##0105,0605 The ISU sends a signal indicating that the issue queue that feeds the ifu br unit cannot accept any more group (queue is full of groups). #2,v,g,n,n,PM_CR_MAP_FULL_CYC,Cycles CR logical operation mapper full ##0104,0604 The ISU sends a signal indicating that the cr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. 
#3,u,g,n,n,PM_DC_PREF_L2_CLONE_L3,L2 prefetch cloned with L3 ##0C27 A prefetch request was made to the L2 with a cloned request sent to the L3 #4,v,g,n,n,PM_DC_PREF_STREAM_ALLOC,D cache new prefetch stream allocated ##0907 A new Prefetch Stream was allocated #5,v,g,n,n,PM_DSLB_MISS,Data SLB misses ##0905 A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve #6,v,g,n,n,PM_DTLB_MISS,Data TLB misses ##0904 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. #7,v,g,n,n,PM_FPR_MAP_FULL_CYC,Cycles FPR mapper full ##0101,0601 The ISU sends a signal indicating that the FPR mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #8,v,g,n,n,PM_FPU0_ALL,FPU0 executed add, mult, sub, cmp or sel instruction ##0003 This signal is active for one cycle when fp0 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #9,v,g,n,n,PM_FPU0_DENORM,FPU0 received denormalized data ##0020 This signal is active for one cycle when one of the operands is denormalized. #10,v,g,n,n,PM_FPU0_FDIV,FPU0 executed FDIV instruction ##0000 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #11,v,g,n,n,PM_FPU0_FMA,FPU0 executed multiply-add instruction ##0001 This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
#12,v,g,n,n,PM_FPU0_FSQRT,FPU0 executed FSQRT instruction ##0002 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #13,v,g,n,n,PM_FPU0_FULL_CYC,Cycles FPU0 issue queue full ##0103,0603 The issue queue for FPU unit 0 cannot accept any more instructions. Issue is stopped #14,v,g,n,n,PM_FPU0_SINGLE,FPU0 executed single precision instruction ##0023 This signal is active for one cycle when fp0 is executing single precision instruction. #15,v,g,n,n,PM_FPU0_STALL3,FPU0 stalled in pipe3 ##0021 This signal indicates that fp0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. #16,v,g,n,n,PM_FPU0_STF,FPU0 executed store instruction ##0022 This signal is active for one cycle when fp0 is executing a store instruction. #17,v,g,n,n,PM_FPU1_ALL,FPU1 executed add, mult, sub, cmp or sel instruction ##0007 This signal is active for one cycle when fp1 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #18,v,g,n,n,PM_FPU1_DENORM,FPU1 received denormalized data ##0024 This signal is active for one cycle when one of the operands is denormalized. #19,v,g,n,n,PM_FPU1_FDIV,FPU1 executed FDIV instruction ##0004 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #20,v,g,n,n,PM_FPU1_FMA,FPU1 executed multiply-add instruction ##0005 This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
#21,v,g,n,n,PM_FPU1_FSQRT,FPU1 executed FSQRT instruction ##0006 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #22,v,g,n,n,PM_FPU1_FULL_CYC,Cycles FPU1 issue queue full ##0107,0607 The issue queue for FPU unit 1 cannot accept any more instructions. Issue is stopped #23,v,g,n,n,PM_FPU1_SINGLE,FPU1 executed single precision instruction ##0027 This signal is active for one cycle when fp1 is executing single precision instruction. #24,v,g,n,n,PM_FPU1_STALL3,FPU1 stalled in pipe3 ##0025 This signal indicates that fp1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. #25,v,g,n,n,PM_FPU1_STF,FPU1 executed store instruction ##0026 This signal is active for one cycle when fp1 is executing a store instruction. #26,v,g,n,n,PM_GCT_FULL_CYC,Cycles GCT full ##0100,0600 The ISU sends a signal indicating the gct is full. #27,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected ##0124,0624 A group that previously attempted dispatch was rejected. #28,v,g,n,n,PM_GRP_DISP_VALID,Group dispatch valid ##0123,0623 Dispatch has been attempted for a valid group. Some groups may be rejected. The total number of successful dispatches is the number of dispatch valid minus dispatch reject. #29,v,g,n,n,PM_IC_PREF_INSTALL,Instruction prefetched installed in prefetch buffer ##0225,0825 This signal is asserted when a prefetch buffer entry (line) is allocated but the request is not a demand fetch. #30,v,g,n,n,PM_IC_PREF_REQ,Instruction prefetch requests ##0226,0826 Asserted when a non-canceled prefetch is made to the cache interface unit (CIU). #31,v,g,n,n,PM_IERAT_XLATE_WR,Translation written to ierat ##0227,0827 This signal will be asserted each time the I-ERAT is written. 
This indicates that an ERAT miss has been serviced. ERAT misses will initiate a sequence resulting in the ERAT being written. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed. This should be a fairly accurate count of ERAT misses (best available). #32,v,g,n,n,PM_INST_DISP,Instructions dispatched ##0121,0621 The ISU sends the number of instructions dispatched. #33,v,g,n,n,PM_INST_FETCH_CYC,Cycles at least 1 instruction fetched ##0223,0823 Asserted each cycle when the IFU sends at least one instruction to the IDU. #34,u,g,n,n,PM_ISLB_MISS,Instruction SLB misses ##0901 A SLB miss for an instruction fetch has occurred #35,v,g,n,n,PM_ITLB_MISS,Instruction TLB misses ##0900 A TLB miss for an Instruction Fetch has occurred #36,v,g,n,n,PM_L1_DCACHE_RELOAD_VALID,L1 reload data source valid ##0C64 The data source information is valid #37,v,g,n,n,PM_L2SA_MOD_INV,L2 slice A transition from modified to invalid ##4007 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #38,v,g,n,n,PM_L2SA_MOD_TAG,L2 slice A transition from modified to tagged ##4006 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #39,v,g,n,n,PM_L2SA_SHR_INV,L2 slice A transition from shared to invalid ##4005 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. 
NOTE: For this event to be useful the tablewalk duration event should also be counted. #40,v,g,n,n,PM_L2SA_SHR_MOD,L2 slice A transition from shared to modified ##4004 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. #41,v,g,n,n,PM_L2SB_MOD_INV,L2 slice B transition from modified to invalid ##4023 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #42,v,g,n,n,PM_L2SB_MOD_TAG,L2 slice B transition from modified to tagged ##4022 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #43,v,g,n,n,PM_L2SB_SHR_INV,L2 slice B transition from shared to invalid ##4021 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #44,v,g,n,n,PM_L2SB_SHR_MOD,L2 slice B transition from shared to modified ##4020 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. 
#45,v,g,n,n,PM_L2SC_MOD_INV,L2 slice C transition from modified to invalid ##4027 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #46,v,g,n,n,PM_L2SC_MOD_TAG,L2 slice C transition from modified to tagged ##4026 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #47,v,g,n,n,PM_L2SC_SHR_INV,L2 slice C transition from shared to invalid ##4025 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #48,v,g,n,n,PM_L2SC_SHR_MOD,L2 slice C transition from shared to modified ##4024 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. #49,v,g,n,n,PM_L3B0_DIR_MIS,L3 bank 0 directory misses ##4001 A reference was made to the local L3 directory by a local CPU and it missed in the L3. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3 #50,v,g,n,n,PM_L3B0_DIR_REF,L3 bank 0 directory references ##4000 A reference was made to the local L3 directory by a local CPU. Only requests from on-MCM CPUs are counted. 
This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3 #51,v,g,n,n,PM_L3B1_DIR_MIS,L3 bank 1 directory misses ##4003 A reference was made to the local L3 directory by a local CPU and it missed in the L3. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3 #52,v,g,n,n,PM_L3B1_DIR_REF,L3 bank 1 directory references ##4002 A reference was made to the local L3 directory by a local CPU. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3 #53,u,g,n,n,PM_LR_CTR_MAP_FULL_CYC,Cycles LR/CTR mapper full ##0106,0606 The ISU sends a signal indicating that the lr/ctr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #54,v,g,n,n,PM_LSU0_DERAT_MISS,LSU0 DERAT misses ##0902 A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #55,v,g,n,n,PM_LSU0_FLUSH_LRQ,LSU0 LRQ flushes ##0C02 A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #56,u,g,n,n,PM_LSU0_FLUSH_SRQ,LSU0 SRQ flushes ##0C03 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. 
#57,v,g,n,n,PM_LSU0_FLUSH_ULD,LSU0 unaligned load flushes ##0C00 A load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #58,v,g,n,n,PM_LSU0_FLUSH_UST,LSU0 unaligned store flushes ##0C01 A store was flushed from unit 0 because it was unaligned (crossed a 4k boundary) #59,u,g,n,n,PM_LSU0_SRQ_STFWD,LSU0 SRQ store forwarded ##0C20 Data from a store instruction was forwarded to a load on unit 0 #60,v,g,n,n,PM_LSU1_DERAT_MISS,LSU1 DERAT misses ##0906 A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #61,v,g,n,n,PM_LSU1_FLUSH_LRQ,LSU1 LRQ flushes ##0C06 A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #62,u,g,n,n,PM_LSU1_FLUSH_SRQ,LSU1 SRQ flushes ##0C07 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #63,v,g,n,n,PM_LSU1_FLUSH_ULD,LSU1 unaligned load flushes ##0C04 A load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #64,u,g,n,n,PM_LSU1_FLUSH_UST,LSU1 unaligned store flushes ##0C05 A store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #65,u,g,n,n,PM_LSU1_SRQ_STFWD,LSU1 SRQ store forwarded ##0C24 Data from a store instruction was forwarded to a load on unit 1 #66,u,g,n,n,PM_LSU_LMQ_FULL_CYC,Cycles LMQ full ##0927 The LMQ was full #67,v,g,n,n,PM_LSU_LMQ_LHR_MERGE,LMQ LHR merges ##0926 A dcache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry. 
#68,v,g,n,n,PM_LSU_LRQ_S0_ALLOC,LRQ slot 0 allocated ##0C26 LRQ slot zero was allocated #69,v,g,n,n,PM_LSU_LRQ_S0_VALID,LRQ slot 0 valid ##0C22 This signal is asserted every cycle that the Load Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. #70,v,g,n,n,PM_LSU_SRQ_S0_ALLOC,SRQ slot 0 allocated ##0C25 SRQ Slot zero was allocated #71,v,g,n,n,PM_LSU_SRQ_S0_VALID,SRQ slot 0 valid ##0C21 This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. #72,v,g,n,n,PM_MRK_IMR_RELOAD,Marked IMR reloaded ##0922 A DL1 reload occurred due to a marked load #73,v,g,n,n,PM_MRK_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##0920 A marked load, executing on unit 0, missed the dcache #74,v,g,n,n,PM_MRK_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##0924 A marked load, executing on unit 1, missed the dcache #75,v,g,n,n,PM_MRK_STCX_FAIL,Marked STCX failed ##0925 A marked stcx (stwcx or stdcx) failed #76,v,g,n,n,PM_MRK_ST_MISS_L1,Marked L1 D cache store misses ##0923 A marked store missed the dcache #77,u,g,n,n,PM_SNOOP_TLBIE,Snoop TLBIE ##0903 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. #78,v,g,n,n,PM_STCX_FAIL,STCX failed ##0921 A stcx (stwcx or stdcx) failed #79,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##0C23 A store missed the dcache #80,v,g,n,n,PM_XER_MAP_FULL_CYC,Cycles XER mapper full ##0102,0602 The ISU sends a signal indicating that the xer mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #81,v,g,n,n,PM_1PLUS_PPC_CMPL,One or more PPC instruction completed ##8003 A group containing at least one PPC instruction completed. 
For microcoded instructions that span multiple groups, this will only occur once. #82,v,g,n,n,PM_CYC,Processor cycles ##800F Processor cycles #83,v,g,n,n,PM_DATA_FROM_L25_SHR,Data loaded from L2.5 shared ##8C66 DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a demand load #84,v,g,n,n,PM_FPU_ALL,FPU executed add, mult, sub, cmp or sel instruction ##8000 This signal is active for one cycle when FPU is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo. Combined Unit 0 + Unit 1 #85,c,g,n,n,PM_FPU_FULL_CYC,Cycles FPU issue queue full ##8100 Cycles when one or both FPU issue queues are full #86,c,g,n,n,PM_FPU_FULL_CYC,Cycles FPU issue queue full ##8600 Cycles when one or both FPU issue queues are full #87,v,g,n,n,PM_FPU_SINGLE,FPU executed single precision instruction ##8020 FPU is executing single precision instruction. Combined Unit 0 + Unit 1 #88,u,g,n,n,PM_FXU_IDLE,FXU idle ##8002 FXU0 and FXU1 are both idle #89,v,g,n,n,PM_GRP_DISP_SUCCESS,Group dispatch success ##8001 Number of groups successfully dispatched (not rejected) #90,v,g,n,n,PM_GRP_MRK,Group marked in IDU ##8004 A group was sampled (marked) #91,v,g,n,n,PM_INST_FROM_L3,Instruction fetched from L3 ##8227 An instruction fetch group was fetched from L3. Fetch Groups can contain up to 8 instructions #92,u,g,n,n,PM_LSU_FLUSH_SRQ,SRQ flushes ##8C00 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. 
#93,v,g,n,n,PM_MRK_DATA_FROM_L25_SHR,Marked data loaded from L2.5 shared ##8C76 DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a marked demand load #94,v,g,n,n,PM_MRK_GRP_TIMEO,Marked group completion timeout ##8005 The sampling timeout expired indicating that the previously sampled instruction is no longer in the processor $$$$ { counter 6 } #0,v,g,n,n,PM_BIQ_IDU_FULL_CYC,Cycles BIQ or IDU full ##0224,0824 This signal will be asserted each time either the IDU is full or the BIQ is full. #1,u,g,n,n,PM_BRQ_FULL_CYC,Cycles branch queue full ##0105,0605 The ISU sends a signal indicating that the issue queue that feeds the ifu br unit cannot accept any more group (queue is full of groups). #2,v,g,n,n,PM_CR_MAP_FULL_CYC,Cycles CR logical operation mapper full ##0104,0604 The ISU sends a signal indicating that the cr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #3,u,g,n,n,PM_DC_PREF_L2_CLONE_L3,L2 prefetch cloned with L3 ##0C27 A prefetch request was made to the L2 with a cloned request sent to the L3 #4,v,g,n,n,PM_DC_PREF_STREAM_ALLOC,D cache new prefetch stream allocated ##0907 A new Prefetch Stream was allocated #5,v,g,n,n,PM_DSLB_MISS,Data SLB misses ##0905 A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve #6,v,g,n,n,PM_DTLB_MISS,Data TLB misses ##0904 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. #7,v,g,n,n,PM_FPR_MAP_FULL_CYC,Cycles FPR mapper full ##0101,0601 The ISU sends a signal indicating that the FPR mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. 
#8,v,g,n,n,PM_FPU0_ALL,FPU0 executed add, mult, sub, cmp or sel instruction ##0003 This signal is active for one cycle when fp0 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #9,v,g,n,n,PM_FPU0_DENORM,FPU0 received denormalized data ##0020 This signal is active for one cycle when one of the operands is denormalized. #10,v,g,n,n,PM_FPU0_FDIV,FPU0 executed FDIV instruction ##0000 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #11,v,g,n,n,PM_FPU0_FMA,FPU0 executed multiply-add instruction ##0001 This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #12,v,g,n,n,PM_FPU0_FSQRT,FPU0 executed FSQRT instruction ##0002 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #13,v,g,n,n,PM_FPU0_FULL_CYC,Cycles FPU0 issue queue full ##0103,0603 The issue queue for FPU unit 0 cannot accept any more instructions. Issue is stopped #14,v,g,n,n,PM_FPU0_SINGLE,FPU0 executed single precision instruction ##0023 This signal is active for one cycle when fp0 is executing single precision instruction. #15,v,g,n,n,PM_FPU0_STALL3,FPU0 stalled in pipe3 ##0021 This signal indicates that fp0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. #16,v,g,n,n,PM_FPU0_STF,FPU0 executed store instruction ##0022 This signal is active for one cycle when fp0 is executing a store instruction. 
#17,v,g,n,n,PM_FPU1_ALL,FPU1 executed add, mult, sub, cmp or sel instruction ##0007 This signal is active for one cycle when fp1 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #18,v,g,n,n,PM_FPU1_DENORM,FPU1 received denormalized data ##0024 This signal is active for one cycle when one of the operands is denormalized. #19,v,g,n,n,PM_FPU1_FDIV,FPU1 executed FDIV instruction ##0004 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #20,v,g,n,n,PM_FPU1_FMA,FPU1 executed multiply-add instruction ##0005 This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #21,v,g,n,n,PM_FPU1_FSQRT,FPU1 executed FSQRT instruction ##0006 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #22,v,g,n,n,PM_FPU1_FULL_CYC,Cycles FPU1 issue queue full ##0107,0607 The issue queue for FPU unit 1 cannot accept any more instructions. Issue is stopped #23,v,g,n,n,PM_FPU1_SINGLE,FPU1 executed single precision instruction ##0027 This signal is active for one cycle when fp1 is executing single precision instruction. #24,v,g,n,n,PM_FPU1_STALL3,FPU1 stalled in pipe3 ##0025 This signal indicates that fp1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. #25,v,g,n,n,PM_FPU1_STF,FPU1 executed store instruction ##0026 This signal is active for one cycle when fp1 is executing a store instruction. 
#26,v,g,n,n,PM_GCT_FULL_CYC,Cycles GCT full ##0100,0600 The ISU sends a signal indicating the gct is full. #27,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected ##0124,0624 A group that previously attempted dispatch was rejected. #28,v,g,n,n,PM_GRP_DISP_VALID,Group dispatch valid ##0123,0623 Dispatch has been attempted for a valid group. Some groups may be rejected. The total number of successful dispatches is the number of dispatch valid minus dispatch reject. #29,v,g,n,n,PM_IC_PREF_INSTALL,Instruction prefetched installed in prefetch buffer ##0225,0825 This signal is asserted when a prefetch buffer entry (line) is allocated but the request is not a demand fetch. #30,v,g,n,n,PM_IC_PREF_REQ,Instruction prefetch requests ##0226,0826 Asserted when a non-canceled prefetch is made to the cache interface unit (CIU). #31,v,g,n,n,PM_IERAT_XLATE_WR,Translation written to ierat ##0227,0827 This signal will be asserted each time the I-ERAT is written. This indicates that an ERAT miss has been serviced. ERAT misses will initiate a sequence resulting in the ERAT being written. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed. This should be a fairly accurate count of ERAT misses (best available). #32,v,g,n,n,PM_INST_DISP,Instructions dispatched ##0121,0621 The ISU sends the number of instructions dispatched. #33,v,g,n,n,PM_INST_FETCH_CYC,Cycles at least 1 instruction fetched ##0223,0823 Asserted each cycle when the IFU sends at least one instruction to the IDU. 
#34,u,g,n,n,PM_ISLB_MISS,Instruction SLB misses ##0901 A SLB miss for an instruction fetch has occurred #35,v,g,n,n,PM_ITLB_MISS,Instruction TLB misses ##0900 A TLB miss for an Instruction Fetch has occurred #36,v,g,n,n,PM_L1_DCACHE_RELOAD_VALID,L1 reload data source valid ##0C64 The data source information is valid #37,v,g,n,n,PM_L2SA_MOD_INV,L2 slice A transition from modified to invalid ##4007 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #38,v,g,n,n,PM_L2SA_MOD_TAG,L2 slice A transition from modified to tagged ##4006 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #39,v,g,n,n,PM_L2SA_SHR_INV,L2 slice A transition from shared to invalid ##4005 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #40,v,g,n,n,PM_L2SA_SHR_MOD,L2 slice A transition from shared to modified ##4004 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. 
#41,v,g,n,n,PM_L2SB_MOD_INV,L2 slice B transition from modified to invalid ##4023 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #42,v,g,n,n,PM_L2SB_MOD_TAG,L2 slice B transition from modified to tagged ##4022 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #43,v,g,n,n,PM_L2SB_SHR_INV,L2 slice B transition from shared to invalid ##4021 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #44,v,g,n,n,PM_L2SB_SHR_MOD,L2 slice B transition from shared to modified ##4020 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. #45,v,g,n,n,PM_L2SC_MOD_INV,L2 slice C transition from modified to invalid ##4027 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. 
#46,v,g,n,n,PM_L2SC_MOD_TAG,L2 slice C transition from modified to tagged ##4026 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #47,v,g,n,n,PM_L2SC_SHR_INV,L2 slice C transition from shared to invalid ##4025 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #48,v,g,n,n,PM_L2SC_SHR_MOD,L2 slice C transition from shared to modified ##4024 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. #49,v,g,n,n,PM_L3B0_DIR_MIS,L3 bank 0 directory misses ##4001 A reference was made to the local L3 directory by a local CPU and it missed in the L3. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3 #50,v,g,n,n,PM_L3B0_DIR_REF,L3 bank 0 directory references ##4000 A reference was made to the local L3 directory by a local CPU. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3 #51,v,g,n,n,PM_L3B1_DIR_MIS,L3 bank 1 directory misses ##4003 A reference was made to the local L3 directory by a local CPU and it missed in the L3. Only requests from on-MCM CPUs are counted. 
This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3 #52,v,g,n,n,PM_L3B1_DIR_REF,L3 bank 1 directory references ##4002 A reference was made to the local L3 directory by a local CPU. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3 #53,u,g,n,n,PM_LR_CTR_MAP_FULL_CYC,Cycles LR/CTR mapper full ##0106,0606 The ISU sends a signal indicating that the lr/ctr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #54,v,g,n,n,PM_LSU0_DERAT_MISS,LSU0 DERAT misses ##0902 A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #55,v,g,n,n,PM_LSU0_FLUSH_LRQ,LSU0 LRQ flushes ##0C02 A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #56,u,g,n,n,PM_LSU0_FLUSH_SRQ,LSU0 SRQ flushes ##0C03 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. 
#57,v,g,n,n,PM_LSU0_FLUSH_ULD,LSU0 unaligned load flushes ##0C00 A load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #58,v,g,n,n,PM_LSU0_FLUSH_UST,LSU0 unaligned store flushes ##0C01 A store was flushed from unit 0 because it was unaligned (crossed a 4k boundary) #59,u,g,n,n,PM_LSU0_SRQ_STFWD,LSU0 SRQ store forwarded ##0C20 Data from a store instruction was forwarded to a load on unit 0 #60,v,g,n,n,PM_LSU1_DERAT_MISS,LSU1 DERAT misses ##0906 A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #61,v,g,n,n,PM_LSU1_FLUSH_LRQ,LSU1 LRQ flushes ##0C06 A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #62,u,g,n,n,PM_LSU1_FLUSH_SRQ,LSU1 SRQ flushes ##0C07 A store was flushed because a younger load hit an older store that is already in the SRQ or in the same group. #63,v,g,n,n,PM_LSU1_FLUSH_ULD,LSU1 unaligned load flushes ##0C04 A load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #64,u,g,n,n,PM_LSU1_FLUSH_UST,LSU1 unaligned store flushes ##0C05 A store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #65,u,g,n,n,PM_LSU1_SRQ_STFWD,LSU1 SRQ store forwarded ##0C24 Data from a store instruction was forwarded to a load on unit 1 #66,u,g,n,n,PM_LSU_LMQ_FULL_CYC,Cycles LMQ full ##0927 The LMQ was full #67,v,g,n,n,PM_LSU_LMQ_LHR_MERGE,LMQ LHR merges ##0926 A dcache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry. 
#68,v,g,n,n,PM_LSU_LRQ_S0_ALLOC,LRQ slot 0 allocated ##0C26 LRQ slot zero was allocated #69,v,g,n,n,PM_LSU_LRQ_S0_VALID,LRQ slot 0 valid ##0C22 This signal is asserted every cycle that the Load Request Queue slot zero is valid. The LRQ is 32 entries long and is allocated round-robin. #70,v,g,n,n,PM_LSU_SRQ_S0_ALLOC,SRQ slot 0 allocated ##0C25 SRQ Slot zero was allocated #71,v,g,n,n,PM_LSU_SRQ_S0_VALID,SRQ slot 0 valid ##0C21 This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. #72,v,g,n,n,PM_MRK_IMR_RELOAD,Marked IMR reloaded ##0922 A DL1 reload occurred due to a marked load #73,v,g,n,n,PM_MRK_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##0920 A marked load, executing on unit 0, missed the dcache #74,v,g,n,n,PM_MRK_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##0924 A marked load, executing on unit 1, missed the dcache #75,v,g,n,n,PM_MRK_STCX_FAIL,Marked STCX failed ##0925 A marked stcx (stwcx or stdcx) failed #76,v,g,n,n,PM_MRK_ST_MISS_L1,Marked L1 D cache store misses ##0923 A marked store missed the dcache #77,u,g,n,n,PM_SNOOP_TLBIE,Snoop TLBIE ##0903 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. #78,v,g,n,n,PM_STCX_FAIL,STCX failed ##0921 A stcx (stwcx or stdcx) failed #79,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##0C23 A store missed the dcache #80,v,g,n,n,PM_XER_MAP_FULL_CYC,Cycles XER mapper full ##0102,0602 The ISU sends a signal indicating that the xer mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. 
#81,v,g,n,n,PM_CYC,Processor cycles ##800F Processor cycles #82,v,g,n,n,PM_DATA_FROM_L275_SHR,Data loaded from L2.75 shared ##8C66 DL1 was reloaded with shared (T) data from the L2 of another MCM due to a demand load #83,v,g,n,n,PM_FPU_FSQRT,FPU executed FSQRT instruction ##8000 This signal is active for one cycle at the end of the microcode executed when FPU is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1 #84,v,g,n,n,PM_FPU_STF,FPU executed store instruction ##8020 FPU is executing a store instruction. Combined Unit 0 + Unit 1 #85,u,g,n,n,PM_FXU_BUSY,FXU busy ##8002 FXU0 and FXU1 are both busy #86,c,g,n,n,PM_INST_CMPL,Instructions completed ##8001 Number of Eligible Instructions that completed. #87,v,g,n,n,PM_INST_FROM_L1,Instruction fetched from L1 ##8227 An instruction fetch group was fetched from L1. Fetch Groups can contain up to 8 instructions #88,v,g,n,n,PM_LSU_DERAT_MISS,DERAT misses ##8900 Total D-ERAT Misses (Unit 0 + Unit 1). Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction. #89,v,g,n,n,PM_LSU_FLUSH_LRQ,LRQ flushes ##8C00 A load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #90,v,g,n,n,PM_MRK_DATA_FROM_L275_SHR,Marked data loaded from L2.75 shared ##8C76 DL1 was reloaded with shared (T) data from the L2 of another MCM due to a marked demand load #91,v,g,n,n,PM_MRK_FXU_FIN,Marked instruction FXU processing finished ##8004 One of the Fixed Point Units finished a marked instruction. 
Instructions that finish may not necessarily complete #92,v,g,n,n,PM_MRK_GRP_ISSUED,Marked group issued ##8005 A sampled instruction was issued #93,v,g,n,n,PM_MRK_ST_GPS,Marked store sent to GPS ##8003 A sampled store has been sent to the memory subsystem $$$$ { counter 7 } #0,v,g,n,n,PM_1INST_CLB_CYC,Cycles 1 instruction in CLB ##0450 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #1,v,g,n,n,PM_2INST_CLB_CYC,Cycles 2 instructions in CLB ##0451 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #2,v,g,n,n,PM_3INST_CLB_CYC,Cycles 3 instructions in CLB ##0452 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #3,v,g,n,n,PM_4INST_CLB_CYC,Cycles 4 instructions in CLB ##0453 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. 
Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #4,v,g,n,n,PM_5INST_CLB_CYC,Cycles 5 instructions in CLB ##0454 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #5,v,g,n,n,PM_6INST_CLB_CYC,Cycles 6 instructions in CLB ##0455 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #6,v,g,n,n,PM_7INST_CLB_CYC,Cycles 7 instructions in CLB ##0456 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #7,v,g,n,n,PM_8INST_CLB_CYC,Cycles 8 instructions in CLB ##0457 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. 
Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #8,v,g,n,n,PM_BR_ISSUED,Branches issued ##0230,0830 This signal will be asserted each time the ISU issues a branch instruction. This signal will be asserted each time the ISU selects a branch instruction to issue. #9,v,g,n,n,PM_BR_MPRED_CR,Branch mispredictions due to CR bit setting ##0231,0831 This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction. #10,v,g,n,n,PM_BR_MPRED_TA,Branch mispredictions due to target address ##0232,0832 Branch mispredict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction. #11,u,g,n,n,PM_CRQ_FULL_CYC,Cycles CR issue queue full ##0111,0611 The ISU sends a signal indicating that the issue queue that feeds the ifu cr unit cannot accept any more groups (queue is full of groups). #12,v,g,n,n,PM_DATA_TABLEWALK_CYC,Cycles doing data tablewalks ##0936 This signal is asserted every cycle when a tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried. #13,u,g,n,n,PM_DC_INV_L2,L1 D cache entries invalidated from L2 ##0C17 A dcache invalidate was received from the L2 because a line in L2 was castout. 
#14,v,g,n,n,PM_DC_PREF_OUT_STREAMS,Out of prefetch streams ##0C36 A new prefetch stream was detected, but no more stream entries were available #15,v,g,n,n,PM_EE_OFF,Cycles MSR(EE) bit off ##0133,0633 The number of Cycles MSR(EE) bit was off. #16,u,g,n,n,PM_EE_OFF_EXT_INT,Cycles MSR(EE) bit off and external interrupt pending ##0137,0637 Cycles MSR(EE) bit off and external interrupt pending #17,v,g,n,n,PM_FAB_CMD_ISSUED,Fabric command issued ##4016 A bus command was issued on the MCM to MCM fabric from the local (this chip's) Fabric Bus Controller. This event is scaled to the fabric frequency and must be adjusted for a true count. i.e. if the fabric is running 2:1, divide the count by 2. #18,v,g,n,n,PM_FAB_CMD_RETRIED,Fabric command retried ##4017 A bus command on the MCM to MCM fabric was retried. This event is the total count of all retried fabric commands for the local MCM (all four chips report the same value). This event is scaled to the fabric frequency and must be adjusted for a true count. i.e. if the fabric is running 2:1, divide the count by 2. #19,v,g,n,n,PM_LSU0_LDF,LSU0 executed Floating Point load instruction ##0930 A floating point load was executed from LSU unit 0 #20,v,g,n,n,PM_LSU1_LDF,LSU1 executed Floating Point load instruction ##0934 A floating point load was executed from LSU unit 1 #21,v,g,n,n,PM_FPU0_FEST,FPU0 executed FEST instruction ##0012 This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #22,v,g,n,n,PM_FPU0_FIN,FPU0 produced a result ##0013 fp0 finished, produced a result. This only indicates finish, not completion. #23,v,g,n,n,PM_FPU0_FMOV_FEST,FPU0 executed FMOV or FEST instructions ##0010 This signal is active for one cycle when fp0 is executing a move kind of instruction or one of the estimate instructions. 
This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ #24,v,g,n,n,PM_FPU0_FPSCR,FPU0 executed FPSCR instruction ##0030 This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*, mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs #25,v,g,n,n,PM_FPU0_FRSP_FCONV,FPU0 executed FRSP or FCONV instructions ##0011 This signal is active for one cycle when fp0 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #26,v,g,n,n,PM_FPU1_FEST,FPU1 executed FEST instruction ##0016 This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #27,v,g,n,n,PM_FPU1_FIN,FPU1 produced a result ##0017 fp1 finished, produced a result. This only indicates finish, not completion. #28,v,g,n,n,PM_FPU1_FMOV_FEST,FPU1 executing FMOV or FEST instructions ##0014 This signal is active for one cycle when fp1 is executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ #29,v,g,n,n,PM_FPU1_FRSP_FCONV,FPU1 executed FRSP or FCONV instructions ##0015 This signal is active for one cycle when fp1 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #30,v,g,n,n,PM_FXLS0_FULL_CYC,Cycles FXU0/LS0 queue full ##0110,0610 The issue queue for FXU/LSU unit 0 cannot accept any more instructions. 
Issue is stopped. #31,v,g,n,n,PM_FXU0_FIN,FXU0 produced a result ##0132,0632 The Fixed Point unit 0 finished an instruction and produced a result #32,v,g,n,n,PM_FXU1_FIN,FXU1 produced a result ##0136,0636 The Fixed Point unit 1 finished an instruction and produced a result #33,v,g,n,n,PM_GPR_MAP_FULL_CYC,Cycles GPR mapper full ##0135,0635 The ISU sends a signal indicating that the gpr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #34,v,g,n,n,PM_GRP_DISP_BLK_SB_CYC,Cycles group dispatch blocked by scoreboard ##0131,0631 The ISU sends a signal indicating that dispatch is blocked by scoreboard. #35,v,g,n,n,PM_L1_PREF,L1 cache data prefetches ##0C35 A request to prefetch data into the L1 was made #36,v,g,n,n,PM_L1_WRITE_CYC,Cycles writing to instruction L1 ##0233,0833 This signal is asserted each cycle a cache write is active. #37,v,g,n,n,PM_L2SA_ST_HIT,L2 slice A store hits ##4011 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C. #38,v,g,n,n,PM_L2SA_ST_REQ,L2 slice A store requests ##4010 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C. #39,v,g,n,n,PM_L2SB_ST_HIT,L2 slice B store hits ##4013 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C. #40,v,g,n,n,PM_L2SB_ST_REQ,L2 slice B store requests ##4012 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C. #41,v,g,n,n,PM_L2SC_ST_HIT,L2 slice C store hits ##4015 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C. 
#42,v,g,n,n,PM_L2SC_ST_REQ,L2 slice C store requests ##4014 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C. #43,v,g,n,n,PM_L2_PREF,L2 cache prefetches ##0C34 A request to prefetch data into L2 was made #44,v,g,n,n,PM_LARX_LSU0,Larx executed on LSU0 ##0C73 A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0) #45,c,g,n,n,PM_LARX_LSU1,Larx executed on LSU1 ##0C77 Invalid event, larx instructions are never executed on unit 1 #46,v,g,n,n,PM_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##0C12 A load, executing on unit 0, missed the dcache #47,v,g,n,n,PM_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##0C16 A load, executing on unit 1, missed the dcache #48,v,g,n,n,PM_LD_REF_L1_LSU0,LSU0 L1 D cache load references ##0C10 A load executed on unit 0 #49,v,g,n,n,PM_LD_REF_L1_LSU1,LSU1 L1 D cache load references ##0C14 A load executed on unit 1 #50,v,g,n,n,PM_LSU0_BUSY,LSU0 busy ##0C33 LSU unit 0 is busy rejecting instructions #51,v,g,n,n,PM_LSU1_BUSY,LSU1 busy ##0C37 LSU unit 1 is busy rejecting instructions #52,v,g,n,n,PM_LSU_LMQ_S0_ALLOC,LMQ slot 0 allocated ##0935 The first entry in the LMQ was allocated. #53,v,g,n,n,PM_LSU_LMQ_S0_VALID,LMQ slot 0 valid ##0931 This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ has eight entries that are allocated FIFO #54,v,g,n,n,PM_LSU_LRQ_FULL_CYC,Cycles LRQ full ##0112,0612 The ISU sends this signal when the LRQ is full. #55,v,g,n,n,PM_LSU_SRQ_FULL_CYC,Cycles SRQ full ##0113,0613 The ISU sends this signal when the SRQ is full. #56,u,g,n,n,PM_LSU_SRQ_SYNC_CYC,SRQ sync duration ##0932 This signal is asserted every cycle when a sync is in the SRQ. 
#57,v,g,n,n,PM_MRK_L1_RELOAD_VALID,Marked L1 reload data source valid ##0C74 The source information is valid and is for a marked load #58,v,g,n,n,PM_MRK_LSU0_FLUSH_LRQ,LSU0 marked LRQ flushes ##0912 A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #59,u,g,n,n,PM_MRK_LSU0_FLUSH_SRQ,LSU0 marked SRQ flushes ##0913 A marked store was flushed because a younger load hit an older store that is already in the SRQ or in the same group. #60,v,g,n,n,PM_MRK_LSU0_FLUSH_ULD,LSU0 marked unaligned load flushes ##0910 A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #61,v,g,n,n,PM_MRK_LSU0_FLUSH_UST,LSU0 marked unaligned store flushes ##0911 A marked store was flushed from unit 0 because it was unaligned #62,c,g,n,n,PM_MRK_LSU0_INST_FIN,LSU0 finished a marked instruction ##0C31 LSU unit 0 finished a marked instruction #63,v,g,n,n,PM_MRK_LSU1_FLUSH_LRQ,LSU1 marked LRQ flushes ##0916 A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #64,u,g,n,n,PM_MRK_LSU1_FLUSH_SRQ,LSU1 marked SRQ flushes ##0917 A marked store was flushed because a younger load hit an older store that is already in the SRQ or in the same group. 
#65,v,g,n,n,PM_MRK_LSU1_FLUSH_ULD,LSU1 marked unaligned load flushes ##0914 A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #66,u,g,n,n,PM_MRK_LSU1_FLUSH_UST,LSU1 marked unaligned store flushes ##0915 A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #67,c,g,n,n,PM_MRK_LSU1_INST_FIN,LSU1 finished a marked instruction ##0C32 LSU unit 1 finished a marked instruction #68,u,g,n,n,PM_MRK_LSU_SRQ_INST_VALID,Marked instruction valid in SRQ ##0933 This signal is asserted every cycle when a marked request is resident in the Store Request Queue #69,v,g,n,n,PM_STCX_PASS,Stcx passes ##0C75 A stcx (stwcx or stdcx) instruction was successful #70,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##0C13 A store missed the dcache #71,v,g,n,n,PM_ST_REF_L1_LSU0,LSU0 L1 D cache store references ##0C11 A store executed on unit 0 #72,v,g,n,n,PM_ST_REF_L1_LSU1,LSU1 L1 D cache store references ##0C15 A store executed on unit 1 #73,v,g,n,n,PM_CYC,Processor cycles ##800F Processor cycles #74,v,g,n,n,PM_DATA_FROM_L275_MOD,Data loaded from L2.75 modified ##8C66 DL1 was reloaded with modified (M) data from the L2 of another MCM due to a demand load. #75,v,g,n,n,PM_FPU_FRSP_FCONV,FPU executed FRSP or FCONV instructions ##8010 This signal is active for one cycle when executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1 #76,u,g,n,n,PM_FXU0_BUSY_FXU1_IDLE,FXU0 busy FXU1 idle ##8002 FXU0 is busy while FXU1 was idle #77,v,g,n,n,PM_GRP_CMPL,Group completed ##8003 A group completed. Microcoded instructions that span multiple groups will generate this event once per group. #78,c,g,n,n,PM_INST_CMPL,Instructions completed ##8001 Number of Eligible Instructions that completed. 
#79,v,g,n,n,PM_INST_FROM_PREF,Instructions fetched from prefetch ##8227 An instruction fetch group was fetched from the prefetch buffer. Fetch Groups can contain up to 8 instructions #80,v,g,n,n,PM_MRK_DATA_FROM_L275_MOD,Marked data loaded from L2.75 modified ##8C76 DL1 was reloaded with modified (M) data from the L2 of another MCM due to a marked demand load. #81,v,g,n,n,PM_MRK_FPU_FIN,Marked instruction FPU processing finished ##8004 One of the Floating Point Units finished a marked instruction. Instructions that finish may not necessarily complete #82,v,g,n,n,PM_MRK_INST_FIN,Marked instruction finished ##8005 One of the execution units finished a marked instruction. Instructions that finish may not necessarily complete #83,v,g,n,n,PM_MRK_LSU_FLUSH_UST,Marked unaligned store flushes ##8910 A marked store was flushed because it was unaligned #84,v,g,n,n,PM_ST_REF_L1,L1 D cache store references ##8C10 Total DL1 Store references #85,v,g,n,n,PM_FXLS1_FULL_CYC,Cycles FXU1/LS1 queue full ##0114,0614 The issue queue for FXU/LSU unit 1 cannot accept any more instructions. Issue is stopped. $$$$ { counter 8 } #0,v,g,n,n,PM_1INST_CLB_CYC,Cycles 1 instruction in CLB ##0450 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #1,v,g,n,n,PM_2INST_CLB_CYC,Cycles 2 instructions in CLB ##0451 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. 
Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #2,v,g,n,n,PM_3INST_CLB_CYC,Cycles 3 instructions in CLB ##0452 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #3,v,g,n,n,PM_4INST_CLB_CYC,Cycles 4 instructions in CLB ##0453 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #4,v,g,n,n,PM_5INST_CLB_CYC,Cycles 5 instructions in CLB ##0454 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #5,v,g,n,n,PM_6INST_CLB_CYC,Cycles 6 instructions in CLB ##0455 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. 
Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #6,v,g,n,n,PM_7INST_CLB_CYC,Cycles 7 instructions in CLB ##0456 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #7,v,g,n,n,PM_8INST_CLB_CYC,Cycles 8 instructions in CLB ##0457 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #8,v,g,n,n,PM_BR_ISSUED,Branches issued ##0230,0830 This signal will be asserted each time the ISU issues a branch instruction. This signal will be asserted each time the ISU selects a branch instruction to issue. #9,v,g,n,n,PM_BR_MPRED_CR,Branch mispredictions due to CR bit setting ##0231,0831 This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction. 
#10,v,g,n,n,PM_BR_MPRED_TA,Branch mispredictions due to target address ##0232,0832 Branch mispredict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction. #11,u,g,n,n,PM_CRQ_FULL_CYC,Cycles CR issue queue full ##0111,0611 The ISU sends a signal indicating that the issue queue that feeds the ifu cr unit cannot accept any more groups (queue is full of groups). #12,v,g,n,n,PM_DATA_TABLEWALK_CYC,Cycles doing data tablewalks ##0936 This signal is asserted every cycle when a tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried. #13,u,g,n,n,PM_DC_INV_L2,L1 D cache entries invalidated from L2 ##0C17 A dcache invalidate was received from the L2 because a line in L2 was castout. #14,v,g,n,n,PM_DC_PREF_OUT_STREAMS,Out of prefetch streams ##0C36 A new prefetch stream was detected, but no more stream entries were available #15,v,g,n,n,PM_EE_OFF,Cycles MSR(EE) bit off ##0133,0633 The number of Cycles MSR(EE) bit was off. #16,u,g,n,n,PM_EE_OFF_EXT_INT,Cycles MSR(EE) bit off and external interrupt pending ##0137,0637 Cycles MSR(EE) bit off and external interrupt pending #17,v,g,n,n,PM_FAB_CMD_ISSUED,Fabric command issued ##4016 A bus command was issued on the MCM to MCM fabric from the local (this chip's) Fabric Bus Controller. This event is scaled to the fabric frequency and must be adjusted for a true count. i.e. if the fabric is running 2:1, divide the count by 2. #18,v,g,n,n,PM_FAB_CMD_RETRIED,Fabric command retried ##4017 A bus command on the MCM to MCM fabric was retried. This event is the total count of all retried fabric commands for the local MCM (all four chips report the same value). 
This event is scaled to the fabric frequency and must be adjusted for a true count. i.e. if the fabric is running 2:1, divide the count by 2. #19,v,g,n,n,PM_LSU0_LDF,LSU0 executed Floating Point load instruction ##0930 A floating point load was executed from LSU unit 0 #20,v,g,n,n,PM_LSU1_LDF,LSU1 executed Floating Point load instruction ##0934 A floating point load was executed from LSU unit 1 #21,v,g,n,n,PM_FPU0_FEST,FPU0 executed FEST instruction ##0012 This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #22,v,g,n,n,PM_FPU0_FIN,FPU0 produced a result ##0013 fp0 finished, produced a result. This only indicates finish, not completion. #23,v,g,n,n,PM_FPU0_FMOV_FEST,FPU0 executed FMOV or FEST instructions ##0010 This signal is active for one cycle when fp0 is executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ #24,v,g,n,n,PM_FPU0_FPSCR,FPU0 executed FPSCR instruction ##0030 This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*, mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs #25,v,g,n,n,PM_FPU0_FRSP_FCONV,FPU0 executed FRSP or FCONV instructions ##0011 This signal is active for one cycle when fp0 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #26,v,g,n,n,PM_FPU1_FEST,FPU1 executed FEST instruction ##0016 This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #27,v,g,n,n,PM_FPU1_FIN,FPU1 produced a result ##0017 fp1 finished, produced a result. This only indicates finish, not completion. 
#28,v,g,n,n,PM_FPU1_FMOV_FEST,FPU1 executing FMOV or FEST instructions ##0014 This signal is active for one cycle when fp1 is executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ #29,v,g,n,n,PM_FPU1_FRSP_FCONV,FPU1 executed FRSP or FCONV instructions ##0015 This signal is active for one cycle when fp1 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #30,v,g,n,n,PM_FXLS0_FULL_CYC,Cycles FXU0/LS0 queue full ##0110,0610 The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped. #31,v,g,n,n,PM_FXU0_FIN,FXU0 produced a result ##0132,0632 The Fixed Point unit 0 finished an instruction and produced a result #32,v,g,n,n,PM_FXU1_FIN,FXU1 produced a result ##0136,0636 The Fixed Point unit 1 finished an instruction and produced a result #33,v,g,n,n,PM_GPR_MAP_FULL_CYC,Cycles GPR mapper full ##0135,0635 The ISU sends a signal indicating that the gpr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #34,v,g,n,n,PM_GRP_DISP_BLK_SB_CYC,Cycles group dispatch blocked by scoreboard ##0131,0631 The ISU sends a signal indicating that dispatch is blocked by scoreboard. #35,v,g,n,n,PM_L1_PREF,L1 cache data prefetches ##0C35 A request to prefetch data into the L1 was made #36,v,g,n,n,PM_L1_WRITE_CYC,Cycles writing to instruction L1 ##0233,0833 This signal is asserted each cycle a cache write is active. #37,v,g,n,n,PM_L2SA_ST_HIT,L2 slice A store hits ##4011 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C. #38,v,g,n,n,PM_L2SA_ST_REQ,L2 slice A store requests ##4010 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. 
The event is provided on each of the three slices A, B, and C. #39,v,g,n,n,PM_L2SB_ST_HIT,L2 slice B store hits ##4013 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A, B, and C. #40,v,g,n,n,PM_L2SB_ST_REQ,L2 slice B store requests ##4012 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C. #41,v,g,n,n,PM_L2SC_ST_HIT,L2 slice C store hits ##4015 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A, B, and C. #42,v,g,n,n,PM_L2SC_ST_REQ,L2 slice C store requests ##4014 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C. #43,v,g,n,n,PM_L2_PREF,L2 cache prefetches ##0C34 A request to prefetch data into L2 was made #44,v,g,n,n,PM_LARX_LSU0,Larx executed on LSU0 ##0C73 A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0) #45,c,g,n,n,PM_LARX_LSU1,Larx executed on LSU1 ##0C77 Invalid event, larx instructions are never executed on unit 1 #46,v,g,n,n,PM_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##0C12 A load, executing on unit 0, missed the dcache #47,v,g,n,n,PM_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##0C16 A load, executing on unit 1, missed the dcache #48,v,g,n,n,PM_LD_REF_L1_LSU0,LSU0 L1 D cache load references ##0C10 A load executed on unit 0 #49,v,g,n,n,PM_LD_REF_L1_LSU1,LSU1 L1 D cache load references ##0C14 A load executed on unit 1 #50,v,g,n,n,PM_LSU0_BUSY,LSU0 busy ##0C33 LSU unit 0 is busy rejecting instructions #51,v,g,n,n,PM_LSU1_BUSY,LSU1 busy ##0C37 LSU unit 1 is busy rejecting instructions #52,v,g,n,n,PM_LSU_LMQ_S0_ALLOC,LMQ slot 0 allocated ##0935 The first entry in the
LMQ was allocated. #53,v,g,n,n,PM_LSU_LMQ_S0_VALID,LMQ slot 0 valid ##0931 This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ has eight entries that are allocated FIFO. #54,v,g,n,n,PM_LSU_LRQ_FULL_CYC,Cycles LRQ full ##0112,0612 The ISU sends this signal when the LRQ is full. #55,v,g,n,n,PM_LSU_SRQ_FULL_CYC,Cycles SRQ full ##0113,0613 The ISU sends this signal when the SRQ is full. #56,u,g,n,n,PM_LSU_SRQ_SYNC_CYC,SRQ sync duration ##0932 This signal is asserted every cycle when a sync is in the SRQ. #57,v,g,n,n,PM_MRK_L1_RELOAD_VALID,Marked L1 reload data source valid ##0C74 The source information is valid and is for a marked load #58,v,g,n,n,PM_MRK_LSU0_FLUSH_LRQ,LSU0 marked LRQ flushes ##0912 A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #59,u,g,n,n,PM_MRK_LSU0_FLUSH_SRQ,LSU0 marked SRQ flushes ##0913 A marked store was flushed because a younger load hit an older store that was already in the SRQ or in the same group. #60,v,g,n,n,PM_MRK_LSU0_FLUSH_ULD,LSU0 marked unaligned load flushes ##0910 A marked load was flushed from unit 0 because it was unaligned (crossed a 64-byte boundary, or a 32-byte boundary if it missed the L1) #61,v,g,n,n,PM_MRK_LSU0_FLUSH_UST,LSU0 marked unaligned store flushes ##0911 A marked store was flushed from unit 0 because it was unaligned #62,c,g,n,n,PM_MRK_LSU0_INST_FIN,LSU0 finished a marked instruction ##0C31 LSU unit 0 finished a marked instruction #63,v,g,n,n,PM_MRK_LSU1_FLUSH_LRQ,LSU1 marked LRQ flushes ##0916 A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.
#64,u,g,n,n,PM_MRK_LSU1_FLUSH_SRQ,LSU1 marked SRQ flushes ##0917 A marked store was flushed because a younger load hit an older store that was already in the SRQ or in the same group. #65,v,g,n,n,PM_MRK_LSU1_FLUSH_ULD,LSU1 marked unaligned load flushes ##0914 A marked load was flushed from unit 1 because it was unaligned (crossed a 64-byte boundary, or a 32-byte boundary if it missed the L1) #66,u,g,n,n,PM_MRK_LSU1_FLUSH_UST,LSU1 marked unaligned store flushes ##0915 A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #67,c,g,n,n,PM_MRK_LSU1_INST_FIN,LSU1 finished a marked instruction ##0C32 LSU unit 1 finished a marked instruction #68,u,g,n,n,PM_MRK_LSU_SRQ_INST_VALID,Marked instruction valid in SRQ ##0933 This signal is asserted every cycle when a marked request is resident in the Store Request Queue #69,v,g,n,n,PM_STCX_PASS,Stcx passes ##0C75 A stcx (stwcx or stdcx) instruction was successful #70,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##0C13 A store missed the dcache #71,v,g,n,n,PM_ST_REF_L1_LSU0,LSU0 L1 D cache store references ##0C11 A store executed on unit 0 #72,v,g,n,n,PM_ST_REF_L1_LSU1,LSU1 L1 D cache store references ##0C15 A store executed on unit 1 #73,v,g,n,n,PM_0INST_FETCH,No instructions fetched ##8227 No instructions were fetched this cycle (due to IFU hold, redirect, or icache miss) #74,v,g,n,n,PM_CYC,Processor cycles ##800F Processor cycles #75,v,g,n,n,PM_DATA_FROM_L25_MOD,Data loaded from L2.5 modified ##8C66 DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a demand load #76,v,g,n,n,PM_EXT_INT,External interrupts ##8002 An external interrupt occurred #77,v,g,n,n,PM_FPU_FMOV_FEST,FPU executing FMOV or FEST instructions ##8010 This signal is active for one cycle when executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ.
Combined Unit 0 + Unit 1. #78,v,g,n,n,PM_LSU_LDF,LSU executed Floating Point load instruction ##8930 LSU executed Floating Point load instruction #79,c,g,n,n,PM_FXLS_FULL_CYC,Cycles FXLS queue is full ##8110 Cycles when one or both FXU/LSU issue queues are full #80,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected ##8003 A group that previously attempted dispatch was rejected. #81,c,g,n,n,PM_INST_CMPL,Instructions completed ##8001 Number of Eligible Instructions that completed. #82,v,g,n,n,PM_LD_REF_L1,L1 D cache load references ##8C10 Total DL1 Load references #83,v,g,n,n,PM_MRK_DATA_FROM_L25_MOD,Marked data loaded from L2.5 modified ##8C76 DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a marked demand load #84,c,g,n,n,PM_MRK_LSU_FIN,Marked instruction LSU processing finished ##8004 One of the Load/Store Units finished a marked instruction. Instructions that finish may not necessarily complete #85,v,g,n,n,PM_MRK_LSU_FLUSH_ULD,Marked unaligned load flushes ##8910 A marked load was flushed because it was unaligned (crossed a 64-byte boundary, or a 32-byte boundary if it missed the L1) #86,u,g,n,n,PM_TB_BIT_TRANS,Time Base bit transition ##8005 When the selected time base bit (as specified in MMCR0[TBSEL]) transitions from 0 to 1 #87,v,g,n,n,PM_FXLS1_FULL_CYC,Cycles FXU1/LS1 queue full ##0114,0614 The issue queue for FXU/LSU unit 1 cannot accept any more instructions. Issue is stopped. papi-papi-7-2-0-t/src/event_data/power4/groups { **************************** { THIS IS OPEN SOURCE CODE { **************************** { (C) COPYRIGHT International Business Machines Corp. 2005 { This file is licensed under the University of Tennessee license. { See LICENSE.txt.
{ { File: events/power4/groups { Author: Maynard Johnson { maynardj@us.ibm.com { Mods: { { Number of groups 63 { Group descriptions #0,94,81,83,77,81,81,77,80,pm_slice0,Time Slice 0 ##8005,800F,8001,8001,8003,800F,8003,8003 00000D0E,00000000,4A5675AC,00022000 Time Slice 0 #1,81,81,79,13,32,86,84,82,pm_eprof,Group for use with eprof ##800F,800F,8C10,0C17,0621,8001,8C10,8C10 0000070E,10034000,45F29420,00002001 Group for use with eprof #2,86,81,79,13,32,86,84,82,pm_basic,Basic performance indicators ##8001,800F,8C10,0C17,0621,8001,8C10,8C10 0000090E,10034000,45F29420,00002000 Basic performance indicators #3,86,0,8,9,33,81,10,36,pm_ifu,IFU events ##8001,0224,0230,0231,0223,800F,0232,0233 00000938,80000000,C6767D6C,00022000 IFU events #4,7,1,33,77,86,26,73,79,pm_isu,ISU Queue full events ##0601,0605,0635,8001,8600,0600,800F,8110 0000112A,50041000,EA5103A0,00002000 ISU Queue full events #5,82,82,74,74,83,82,74,75,pm_lsource,Information on data source ##8C66,8C66,8C66,8C66,8C66,8C66,8C66,8C66 00000E1C,0010C000,739CE738,00002000 Information on data source #6,87,86,78,78,91,87,79,73,pm_isource,Instruction Source information ##8227,8227,8227,8227,8227,8227,8227,8227 00000F1E,80000000,7BDEF7BC,00022000 Instruction Source information #7,88,87,73,77,92,89,84,82,pm_lsu,Information on the Load Store Unit ##8C00,8C00,800F,8001,8C00,8C00,8C10,8C10 00000810,000F0000,3A508420,00002000 Information on the Load Store Unit #8,35,6,12,53,31,88,78,74,pm_xlate1,Translation Events ##0900,0904,0936,0931,0227,8900,8001,800F 00001028,81082000,F67E849C,00022000 Translation Events #9,34,5,56,52,31,88,78,74,pm_xlate2,Translation Events ##0901,0905,0932,0935,0227,8900,8001,800F 0000112A,81082000,D77E849C,00022000 Translation Events #10,50,49,17,18,52,51,78,74,pm_gps1,L3 Events ##4000,4001,4016,4017,4002,4003,8001,800F 00001022,00000C00,B5E5349C,00022000 L3 Events #11,38,39,38,37,40,37,78,74,pm_l2a,L2 SliceA events ##4006,4005,4010,4011,4004,4007,8001,800F 0000162A,00000C00,8469749C,00022000 L2 
SliceA events #12,42,43,40,39,44,41,78,74,pm_l2b,L2 SliceB events ##4022,4021,4012,4013,4020,4023,8001,800F 00001A32,00000600,94F1B49C,00022000 L2 SliceB events #13,46,47,42,41,48,45,78,74,pm_l2c,L2 SliceC events ##4026,4025,4014,4015,4024,4027,8001,800F 00001E3A,00000600,A579F49C,00022000 L2 SliceC events #14,84,83,75,75,82,83,78,77,pm_fpu1,Floating Point events ##8000,8000,8010,8010,800F,8000,8001,8010 00000810,00000000,420E84A0,00002000 Floating Point events #15,83,84,73,77,84,84,75,78,pm_fpu2,Floating Point events ##8020,8020,800F,8001,8000,8020,8010,8930 00000810,010020E8,3A508420,00002000 Floating Point events #16,86,81,0,1,81,81,2,3,pm_idu1,Instruction Decode Unit events ##8001,800F,0450,0451,8003,800F,0452,0453 0000090E,04010000,8456794C,00022000 Instruction Decode Unit events #17,86,81,4,5,89,81,6,7,pm_idu2,Instruction Decode Unit events ##8001,800F,0454,0455,8001,800F,0456,0457 0000090E,04010000,A5527B5C,00022000 Instruction Decode Unit events #18,80,2,11,34,53,32,78,74,pm_isu_rename,ISU Rename Pool Events ##0602,0604,0611,0631,0606,0621,8001,800F 00001228,10055000,8E6D949C,00022000 ISU Rename Pool Events #19,13,22,30,30,82,86,54,55,pm_isu_queues1,ISU Queue Full Events ##0603,0607,0610,0614,800F,8001,0612,0613 0000132E,10050000,850E994C,00022000 ISU Queue Full Events #20,32,81,31,32,28,27,78,74,pm_isu_flow,ISU Instruction Flow Events ##0621,800F,0632,0636,0623,0624,8001,800F 0000190E,10005000,D7B7C49C,00022000 ISU Instruction Flow Events #21,85,92,83,16,82,86,15,76,pm_isu_work,ISU Indicators of Work Blockage ##8004,8001,8001,0637,800F,8001,0633,8002 00000C12,10001000,4FCE9DA8,00002000 ISU Indicators of Work Blockage #22,77,78,69,73,81,86,44,45,pm_serialize,LSU Serializing Events ##0903,0921,0C75,800F,8003,8001,0C73,0C77 00001332,0118B000,E9D69DFC,00022000 LSU Serializing Events #23,71,70,50,51,69,68,78,74,pm_lsubusy,LSU Busy Events ##0C21,0C25,0C33,0C37,0C22,0C26,8001,800F 0000193A,0000F000,DFF5E49C,00022000 LSU Busy Events 
#24,86,36,73,74,83,82,74,75,pm_lsource2,Information on data source ##8001,0C64,800F,8C66,8C66,8C66,8C66,8C66 00000938,0010C000,3B9CE738,00002000 Information on data source #25,82,82,74,74,36,81,74,81,pm_lsource3,Information on data source ##8C66,8C66,8C66,8C66,0C64,800F,8C66,8001 00000E1C,0010C000,73B87724,00022000 Information on data source #26,86,81,78,78,91,87,79,73,pm_isource2,Instruction Source information ##8001,800F,8227,8227,8227,8227,8227,8227 0000090E,80000000,7BDEF7BC,00022000 Instruction Source information #27,87,86,78,78,91,87,73,81,pm_isource3,Instruction Source information ##8227,8227,8227,8227,8227,8227,800F,8001 00000F1E,80000000,7BDEF3A4,00022000 Instruction Source information #28,10,19,25,29,11,20,78,74,pm_fpu3,Floating Point events by unit ##0000,0004,0011,0015,0001,0005,8001,800F 00001028,00000000,8D63549C,00022000 Floating Point events by unit #29,12,21,22,27,8,17,78,74,pm_fpu4,Floating Point events by unit ##0002,0006,0013,0017,0003,0007,8001,800F 0000122C,00000000,9DE7749C,00022000 Floating Point events by unit #30,9,18,23,28,82,86,21,26,pm_fpu5,Floating Point events by unit ##0020,0024,0010,0014,800F,8001,0012,0016 00001838,00000000,850E9958,00002000 Floating Point events by unit #31,14,23,19,20,16,25,73,81,pm_fpu6,Floating Point events by unit ##0023,0027,0930,0934,0022,0026,800F,8001 00001B3E,01002000,C735E3A4,00022000 Floating Point events by unit #32,15,24,22,27,82,86,73,24,pm_fpu7,Floating Point events by unit ##0021,0025,0013,0017,800F,8001,800F,0030 0000193A,00000000,9DCE93E0,00002000 Floating Point events by unit #33,86,81,76,76,88,85,76,79,pm_fxu,Fix Point Unit events ##8001,800F,8130,8002,8002,8002,8002,8110 0000090E,40000002,4294A520,00002000 Fix Point Unit events #34,67,66,52,53,82,86,56,12,pm_lsu_lmq,LSU Load Miss Queue Events ##0926,0927,0935,0931,800F,8001,0932,0936 00001E3E,0100A000,EE4E9D78,00002000 LSU Load Miss Queue Events #35,55,61,73,73,56,62,78,74,pm_lsu_flush,LSU Flush Events ##0C02,0C06,800F,800F,0C03,0C07,8001,800F 
0000122C,000C0000,39E7749C,00022000 LSU Flush Events #36,57,63,48,49,82,86,46,47,pm_lsu_load1,LSU Load Events ##0C00,0C04,0C10,0C14,800F,8001,0C12,0C16 00001028,000F0000,850E9958,00002000 LSU Load Events #37,58,64,71,72,82,86,70,13,pm_lsu_store1,LSU Store Events ##0C01,0C05,0C11,0C15,800F,8001,0C13,0C17 0000112A,000F0000,8D4E99DC,00022000 LSU Store Events #38,59,65,71,72,79,81,78,74,pm_lsu_store2,LSU Store Events ##0C20,0C24,0C11,0C15,0C23,800F,8001,800F 00001838,0003C000,8D76749C,00022000 LSU Store Events #39,54,60,73,73,36,81,78,74,pm_lsu7,Information on the Load Store Unit ##0902,0906,800F,800F,0C64,800F,8001,800F 0000122C,0118C000,39F8749C,00022000 Information on the Load Store Unit #40,4,3,43,35,82,86,73,14,pm_dpfetch,Data Prefetch Events ##0907,0C27,0C34,0C35,800F,8001,800F,0C36 0000173E,0108F000,E74E93F8,00002000 Data Prefetch Events #41,85,88,84,73,81,86,77,86,pm_misc,Misc Events for testing ##8004,8002,8004,800F,8003,8001,8003,8005 00000C14,00000000,61D695B4,00022000 Misc Events for testing #42,92,91,73,84,90,92,82,81,pm_mark1,Information on marked instructions ##8920,8003,800F,8004,8004,8005,8005,8001 00000816,01008080,3B18D6A4,00722001 Information on marked instructions #43,91,89,73,82,90,91,81,84,pm_mark2,Marked Instructions Processing Flow ##8002,8005,800F,8005,8004,8004,8004,8004 00000A1A,00000000,3B58C630,00002001 Marked Instructions Processing Flow #44,93,81,82,84,94,93,68,81,pm_mark3,Marked Stores Processing Flow ##8003,800F,8003,8004,8005,8003,0933,8001 00000B0E,01002000,5B1ABDA4,00022001 Marked Stores Processing Flow #45,92,81,81,85,94,92,78,85,pm_mark4,Marked Loads Processing FLow ##8920,800F,8910,8910,8005,8005,8001,8910 0000080E,01028080,421AD4A0,00002001 Marked Loads Processing FLow #46,90,90,80,83,93,90,80,83,pm_mark_lsource,Information on marked data source ##8C76,8C76,8C76,8C76,8C76,8C76,8C76,8C76 00000E1C,00103000,739CE738,00002001 Information on marked data source #47,86,81,57,83,93,90,80,83,pm_mark_lsource2,Information on marked data 
source ##8001,800F,0C74,8C76,8C76,8C76,8C76,8C76 0000090E,00103000,E39CE738,00002001 Information on marked data source #48,90,90,80,83,82,86,80,57,pm_mark_lsource3,Information on marked data source ##8C76,8C76,8C76,8C76,800F,8001,8C76,0C74 00000E1C,00103000,738E9770,00002001 Information on marked data source #49,76,72,60,65,82,86,61,66,pm_lsu_mark1,Load Store Unit Marked Events ##0923,0922,0910,0914,800F,8001,0911,0915 00001B34,01028000,850E98D4,00022001 Load Store Unit Marked Events #50,73,74,58,63,82,86,59,64,pm_lsu_mark2,Load Store Unit Marked Events ##0920,0924,0912,0916,800F,8001,0913,0917 00001838,01028000,958E99DC,00022001 Load Store Unit Marked Events #51,75,81,62,67,82,92,82,81,pm_lsu_mark3,Load Store Unit Marked Events ##0925,800F,0C31,0C32,800F,8005,8005,8001 00001D0E,0100B000,CE8ED6A4,00022001 Load Store Unit Marked Events #52,67,91,53,77,82,92,77,52,pm_threshold,Group for pipeline threshold studies ##0926,8003,0931,8001,800F,8005,8003,0935 00001E16,0100A000,CA4ED5F4,00722001 Group for pipeline threshold studies #53,84,83,77,75,82,83,78,77,pm_pe_bench1,PE Benchmarker group for FP analysis ##8000,8000,8630,8010,800F,8000,8001,8010 00000810,10001002,420E84A0,00002000 PE Benchmarker group for FP analysis #54,81,84,22,77,86,84,27,78,pm_pe_bench2,PE Benchmarker group for FP stalls analysis ##800F,8020,0013,8001,8600,8020,0017,8930 00000710,11042068,9A508BA0,00002000 PE Benchmarker group for FP stalls analysis #55,86,0,8,9,1,81,10,36,pm_pe_bench3,PE Benchmarker group for branch analysis ##8001,0224,0230,0231,0605,800F,0232,0233 00000938,90040000,C66A7D6C,00022000 PE Benchmarker group for branch analysis #56,6,35,79,70,82,86,84,82,pm_pe_bench4,PE Benchmarker group for L1 and TLB analysis ##0904,0900,8C10,0C13,800F,8001,8C10,8C10 00001420,010B0000,44CE9420,00002000 PE Benchmarker group for L1 and TLB analysis #57,86,81,74,74,83,82,74,75,pm_pe_bench5,PE Benchmarker group for L2 analysis ##8001,800F,8C66,8C66,8C66,8C66,8C66,8C66 
0000090E,0010C000,739CE738,00002000 PE Benchmarker group for L2 analysis #58,82,82,74,74,83,81,78,75,pm_pe_bench6,PE Benchmarker group for L3 analysis ##8C66,8C66,8C66,8C66,8C66,800F,8001,8C66 00000E1C,0010C000,739C74B8,00002000 PE Benchmarker group for L3 analysis #59,6,88,79,70,82,86,84,82,pm_hpmcount1,Hpmcount group for L1 and TLB behavior analysis ##0904,8002,8C10,0C13,800F,8001,8C10,8C10 00001414,010B0000,44CE9420,00002000 Hpmcount group for L1 and TLB behavior analysis #60,84,83,22,27,82,84,78,78,pm_hpmcount2,Hpmcount group for computation intensity analysis ##8000,8000,0013,0017,800F,8020,8001,8930 00000810,01002028,9DCE84A0,00002000 Hpmcount group for computation intensity analysis #61,86,81,79,8,79,81,9,10,pm_l1andbr,L1 misses and branch mispredict analysis ##8001,800F,8C10,0230,0C23,800F,0231,0232 0000090E,8003C000,46367CE8,00002000 L1 misses and branch mispredict analysis #62,86,81,79,8,82,79,84,82,pm_imix,Instruction mix: loads, stores and branches ##8001,800F,8C10,0230,800F,0C23,8C10,8C10 0000090E,8003C000,460FB420,00002000 Instruction mix: loads, stores and branches papi-papi-7-2-0-t/src/event_data/power5+/ papi-papi-7-2-0-t/src/event_data/power5+/events { File: power5+/events { Date: 12/13/06 { Version: 1.7 { Copyright (c) International Business Machines, 2006. { Contributed by Eric Kjeldergaard 362,356,355,352,1,1 { counter 1 } #0,v,g,n,n,PM_0INST_CLB_CYC,Cycles no instructions in CLB ##400C0 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers.
Each thread has its own set of CLB; these events are thread specific. #1,v,g,n,n,PM_1INST_CLB_CYC,Cycles 1 instruction in CLB ##400C1 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #2,v,g,n,n,PM_1PLUS_PPC_CMPL,One or more PPC instruction completed ##00013 A group containing at least one PPC instruction completed. For microcoded instructions that span multiple groups, this will only occur once. #3,v,g,n,n,PM_2INST_CLB_CYC,Cycles 2 instructions in CLB ##400C2 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #4,v,g,n,n,PM_3INST_CLB_CYC,Cycles 3 instructions in CLB ##400C3 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #5,v,g,n,n,PM_4INST_CLB_CYC,Cycles 4 instructions in CLB ##400C4 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. 
Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #6,v,g,n,n,PM_5INST_CLB_CYC,Cycles 5 instructions in CLB ##400C5 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #7,v,g,n,n,PM_6INST_CLB_CYC,Cycles 6 instructions in CLB ##400C6 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #8,u,g,n,s,PM_BRQ_FULL_CYC,Cycles branch queue full ##100C5 Cycles when the issue queue that feeds the branch unit is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented. #9,v,g,n,n,PM_BR_ISSUED,Branches issued ##230E4 A branch instruction was issued to the branch unit. A branch that was incorrectly predicted may issue and execute multiple times. #10,v,g,n,n,PM_BR_MPRED_CR,Branch mispredictions due to CR bit setting ##230E5 A conditional branch instruction was incorrectly predicted as taken or not taken. 
The branch execution unit detects a branch mispredict because the CR value is the opposite of the predicted value. This will result in a branch redirect flush if not overridden by a flush of an older instruction. #11,v,g,n,n,PM_BR_MPRED_TA,Branch mispredictions due to target address ##230E6 A branch instruction target was incorrectly predicted. This will result in a branch mispredict flush unless a flush is detected from an older instruction. #12,v,g,n,n,PM_BR_UNCOND,Unconditional branch ##23087 An unconditional branch was executed. #13,v,g,n,s,PM_CLB_EMPTY_CYC,Cycles CLB empty ##410C6 Cycles when both threads' CLBs are completely empty. #14,v,g,n,n,PM_CLB_FULL_CYC,Cycles CLB full ##220E5 Cycles when both threads' CLBs are full. #15,u,g,n,s,PM_CRQ_FULL_CYC,Cycles CR issue queue full ##110C1 The issue queue that feeds the Conditional Register unit is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented. #16,v,g,n,s,PM_CR_MAP_FULL_CYC,Cycles CR logical operation mapper full ##100C4 The Conditional Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #17,v,g,n,s,PM_CYC,Processor cycles ##0000F Processor cycles #18,v,g,n,n,PM_DATA_FROM_L2,Data loaded from L2 ##C3087 The processor's Data Cache was reloaded from the local L2 due to a demand load. #19,v,g,n,n,PM_DATA_FROM_L25_SHR,Data loaded from L2.5 shared ##C3097 The processor's Data Cache was reloaded with shared (T or SL) data from the L2 of a chip on the same module as this processor is located due to a demand load. #20,v,g,n,n,PM_DATA_FROM_L275_MOD,Data loaded from L2.75 modified ##C30A3 The processor's Data Cache was reloaded with modified (M) data from the L2 on a different module than this processor is located due to a demand load.
#21,v,g,n,n,PM_DATA_FROM_L3,Data loaded from L3 ##C308E The processor's Data Cache was reloaded from the local L3 due to a demand load. #22,v,g,n,n,PM_DATA_FROM_L35_SHR,Data loaded from L3.5 shared ##C309E The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on the same module as this processor is located due to a demand load. #23,v,g,n,n,PM_DATA_FROM_L375_MOD,Data loaded from L3.75 modified ##C30A7 The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on a different module than this processor is located due to a demand load. #24,v,g,n,n,PM_DATA_FROM_RMEM,Data loaded from remote memory ##C30A1 The processor's Data Cache was reloaded from memory attached to a different module than this processor is located on. #25,v,g,n,n,PM_DATA_TABLEWALK_CYC,Cycles doing data tablewalks ##800C7 Cycles a translation tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried. #26,u,g,n,s,PM_DC_INV_L2,L1 D cache entries invalidated from L2 ##C10C7 A dcache invalidate was received from the L2 because a line in L2 was castout. #27,v,g,n,n,PM_DC_PREF_OUT_OF_STREAMS,D cache out of prefetch streams ##C50C2 A new prefetch stream was detected but no more stream entries were available. #28,v,g,n,n,PM_DC_PREF_DST,DST (Data Stream Touch) stream start ##830E6 A prefetch stream was started using the DST instruction. #29,v,g,n,n,PM_DC_PREF_STREAM_ALLOC,D cache new prefetch stream allocated ##830E7 A new Prefetch Stream was allocated. #30,v,g,n,n,PM_DSLB_MISS,Data SLB misses ##800C5 An SLB miss for a data request occurred. SLB misses trap to the operating system to resolve. #31,v,g,n,n,PM_DTLB_MISS,Data TLB misses ##800C4,C20E0 Data TLB misses, all page sizes. #32,v,g,n,n,PM_DTLB_MISS_4K,Data TLB miss for 4K page ##C208D Data TLB references to 4KB pages that missed the TLB. Page size is determined at TLB reload time.
#33,v,g,n,n,PM_DTLB_REF,Data TLB references ##C20E4 Total number of Data TLB references for all page sizes. Page size is determined at TLB reload time. #34,v,g,n,n,PM_DTLB_REF_4K,Data TLB reference for 4K page ##C2086 Data TLB references for 4KB pages. Includes hits + misses. #35,v,g,n,n,PM_EE_OFF,Cycles MSR(EE) bit off ##130E3 Cycles MSR(EE) bit was off indicating that interrupts due to external exceptions were masked. #36,u,g,n,n,PM_EE_OFF_EXT_INT,Cycles MSR(EE) bit off and external interrupt pending ##130E7 Cycles when an interrupt due to an external exception is pending but external exceptions were masked. #37,v,g,n,s,PM_FAB_CMD_ISSUED,Fabric command issued ##700C7 Incremented when a chip issues a command on its SnoopA address bus. Each of the two address busses (SnoopA and SnoopB) is capable of one transaction per fabric cycle (one fabric cycle = 2 cpu cycles in normal 2:1 mode), but each chip can only drive the SnoopA bus, and can only drive one transaction every two fabric cycles (i.e., every four cpu cycles). In MCM-based systems, two chips interleave their accesses to each of the two fabric busses (SnoopA, SnoopB) to reach a peak capability of one transaction per cpu clock cycle. The two chips that drive SnoopB are wired so that the chips refer to the bus as SnoopA but it is connected to the other two chips as SnoopB. Note that this event will only be recorded by the FBC on the chip that sourced the operation. The signal is delivered at FBC speed and the count must be scaled. #38,v,g,n,n,PM_FAB_CMD_RETRIED,Fabric command retried ##710C7 Incremented when a command issued by a chip on its SnoopA address bus is retried for any reason. The overwhelming majority of retries are due to running out of memory controller queues but retries can also be caused by trying to reference addresses that are in a transient cache state -- e.g. a line is transient after issuing a DCLAIM instruction to a shared line but before the associated store completes. 
Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly. #39,v,g,n,s,PM_FAB_DCLAIM_ISSUED,dclaim issued ##720E7 A DCLAIM command was issued. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly. #40,v,g,n,s,PM_FAB_DCLAIM_RETRIED,dclaim retried ##730E7 A DCLAIM command was retried. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly. #41,v,g,n,s,PM_FAB_HOLDtoNN_EMPTY,Hold buffer to NN empty ##722E7 Fabric cycles when the Next Node out hold-buffers are empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #42,v,g,n,s,PM_FAB_HOLDtoVN_EMPTY,Hold buffer to VN empty ##721E7 Fabric cycles when the Vertical Node out hold-buffers are empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #43,v,g,n,s,PM_FAB_M1toP1_SIDECAR_EMPTY,M1 to P1 sidecar empty ##702C7 Fabric cycles when the Minus-1 hip/hop sidecars (sidecars for chip to chip data transfer) are empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #44,v,g,n,s,PM_FAB_M1toVNorNN_SIDECAR_EMPTY,M1 to VN/NN sidecar empty ##712C7 Fabric cycles when the Minus-1 jump sidecar (sidecars for mcm to mcm data transfer) is empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #45,v,g,n,s,PM_FAB_P1toM1_SIDECAR_EMPTY,P1 to M1 sidecar empty ##701C7 Fabric cycles when the Plus-1 hip/hop sidecars (sidecars for chip to chip data transfer) are empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #46,v,g,n,s,PM_FAB_P1toVNorNN_SIDECAR_EMPTY,P1 to VN/NN sidecar empty ##711C7 Fabric cycles when the Plus-1 jump sidecar (sidecars for mcm to mcm data transfer) is empty. The signal is delivered at FBC speed and the count must be scaled accordingly.
#47,v,g,n,s,PM_FAB_PNtoNN_DIRECT,PN to NN beat went straight to its destination ##703C7 Fabric Data beats that the base chip takes the inbound PN data and passes it through to the outbound NN bus without going into a sidecar. The signal is delivered at FBC speed and the count must be scaled. #48,v,g,n,s,PM_FAB_PNtoNN_SIDECAR,PN to NN beat went to sidecar first ##713C7 Fabric Data beats that the base chip takes the inbound PN data and forwards it on to the outbound NN data bus after going into a sidecar first. The signal is delivered at FBC speed and the count must be scaled. #49,v,g,n,s,PM_FAB_PNtoVN_DIRECT,PN to VN beat went straight to its destination ##723E7 Fabric Data beats that the base chip takes the inbound PN data and passes it through to the outbound VN bus without going into a sidecar. The signal is delivered at FBC speed and the count must be scaled accordingly. #50,v,g,n,s,PM_FAB_PNtoVN_SIDECAR,PN to VN beat went to sidecar first ##733E7 Fabric data beats that the base chip takes the inbound PN data and forwards it on to the outbound VN data bus after going into a sidecar first. The signal is delivered at FBC speed and the count must be scaled accordingly. #51,v,g,n,s,PM_FAB_VBYPASS_EMPTY,Vertical bypass buffer empty ##731E7 Fabric cycles when the Middle Bypass sidecar is empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #52,v,g,n,n,PM_FLUSH,Flushes ##110C7 Flushes occurred including LSU and Branch flushes. #53,v,g,n,n,PM_FLUSH_BR_MPRED,Flush caused by branch mispredict ##110C6 A flush was caused by a branch mispredict. #54,v,g,n,s,PM_FLUSH_IMBAL,Flush caused by thread GCT imbalance ##330E3 This thread has been flushed at dispatch because it is stalled and a GCT imbalance exists. GCT thresholds are set in the TSCR register. This allows the other thread to have more machine resources for it to make progress while this thread is stalled. 
#55,v,g,n,s,PM_FLUSH_SB,Flush caused by scoreboard operation
##330E2
This thread has been flushed at dispatch because its scoreboard bit is set indicating that a non-renamed resource is being updated. This allows the other thread to have more machine resources for it to make progress while this thread is stalled.
#56,v,g,n,s,PM_FLUSH_SYNC,Flush caused by sync
##330E1
This thread has been flushed at dispatch due to a sync, lwsync, ptesync, or tlbsync instruction. This allows the other thread to have more machine resources for it to make progress until the sync finishes.
#57,v,g,n,s,PM_FPR_MAP_FULL_CYC,Cycles FPR mapper full
##100C1
The FPR mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented.
#58,v,g,n,n,PM_FPU0_1FLOP,FPU0 executed add, mult, sub, cmp or sel instruction
##000C3
The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations.
#59,v,g,n,n,PM_FPU0_DENORM,FPU0 received denormalized data
##020E0
FPU0 has encountered a denormalized operand.
#60,v,g,n,n,PM_FPU0_FDIV,FPU0 executed FDIV instruction
##000C0
FPU0 has executed a divide instruction. This could be fdiv, fdivs, fdiv., fdivs..
#61,v,g,n,n,PM_FPU0_FEST,FPU0 executed FEST instruction
##010C2
FPU0 has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ.
#62,v,g,n,n,PM_FPU0_FIN,FPU0 produced a result
##010C3
FPU0 finished, produced a result. This only indicates finish, not completion. Floating Point Stores are included in this count but not Floating Point Loads.
#63,v,g,n,n,PM_FPU0_FMA,FPU0 executed multiply-add instruction
##000C1
The floating point unit has executed a multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.
#64,v,g,n,n,PM_FPU0_FMOV_FEST,FPU0 executed FMOV or FEST instructions
##010C0
FPU0 has executed a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ.
#65,v,g,n,n,PM_FPU0_FPSCR,FPU0 executed FPSCR instruction
##030E0
FPU0 has executed an FPSCR move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*, mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs.
#66,v,g,n,n,PM_FPU0_FRSP_FCONV,FPU0 executed FRSP or FCONV instructions
##010C1
FPU0 has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.
#67,v,g,n,n,PM_FPU0_FSQRT,FPU0 executed FSQRT instruction
##000C2
FPU0 has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.
#68,v,g,n,s,PM_FPU0_FULL_CYC,Cycles FPU0 issue queue full
##100C3
The issue queue for FPU0 cannot accept any more instructions. Dispatch to this issue queue is stopped.
#69,v,g,n,n,PM_FPU0_SINGLE,FPU0 executed single precision instruction
##020E3
FPU0 has executed a single precision instruction.
#70,v,g,n,n,PM_FPU0_STALL3,FPU0 stalled in pipe3
##020E1
FPU0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always).
#71,v,g,n,n,PM_FPU0_STF,FPU0 executed store instruction
##020E2
FPU0 has executed a Floating Point Store instruction.
#72,v,g,n,n,PM_FPU1_1FLOP,FPU1 executed add, mult, sub, cmp or sel instruction
##000C7
The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations.
#73,v,g,n,n,PM_FPU1_DENORM,FPU1 received denormalized data
##020E4
FPU1 has encountered a denormalized operand.
#74,v,g,n,n,PM_FPU1_FDIV,FPU1 executed FDIV instruction
##000C4
FPU1 has executed a divide instruction. This could be fdiv, fdivs, fdiv., fdivs..
#75,v,g,n,n,PM_FPU1_FEST,FPU1 executed FEST instruction
##010C6
FPU1 has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ.
#76,v,g,n,n,PM_FPU1_FIN,FPU1 produced a result
##010C7
FPU1 finished, produced a result. This only indicates finish, not completion. Floating Point Stores are included in this count but not Floating Point Loads.
#77,v,g,n,n,PM_FPU1_FMA,FPU1 executed multiply-add instruction
##000C5
The floating point unit has executed a multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.
#78,v,g,n,n,PM_FPU1_FMOV_FEST,FPU1 executed FMOV or FEST instructions
##010C4
FPU1 has executed a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ.
#79,v,g,n,n,PM_FPU1_FRSP_FCONV,FPU1 executed FRSP or FCONV instructions
##010C5
FPU1 has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.
#80,v,g,n,n,PM_FPU1_FSQRT,FPU1 executed FSQRT instruction
##000C6
FPU1 has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.
#81,v,g,n,s,PM_FPU1_FULL_CYC,Cycles FPU1 issue queue full
##100C7
The issue queue for FPU1 cannot accept any more instructions. Dispatch to this issue queue is stopped.
#82,v,g,n,n,PM_FPU1_SINGLE,FPU1 executed single precision instruction
##020E7
FPU1 has executed a single precision instruction.
#83,v,g,n,n,PM_FPU1_STALL3,FPU1 stalled in pipe3
##020E5
FPU1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always).
#84,v,g,n,n,PM_FPU1_STF,FPU1 executed store instruction
##020E6
FPU1 has executed a Floating Point Store instruction.
#85,v,g,n,n,PM_FPU_1FLOP,FPU executed one flop instruction
##00090
The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations.
#86,v,g,n,n,PM_FPU_DENORM,FPU received denormalized data
##02088
The floating point unit has encountered a denormalized operand. Combined Unit 0 + Unit 1.
#87,v,g,n,n,PM_FPU_FDIV,FPU executed FDIV instruction
##00088
The floating point unit has executed a divide instruction. This could be fdiv, fdivs, fdiv., fdivs.. Combined Unit 0 + Unit 1.
#88,v,g,n,n,PM_FPU_FEST,FPU executed FEST instruction
##010A8
The floating point unit has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1.
#89,c,g,n,n,PM_FPU_FULL_CYC,Cycles FPU issue queue full
##10090
Cycles when one or both FPU issue queues are full. Combined Unit 0 + 1. Use with caution since this is the sum of cycles when Unit 0 was full plus Unit 1 full. It does not indicate when both units were full.
#90,v,g,n,n,PM_FPU_SINGLE,FPU executed single precision instruction
##02090
FPU is executing a single precision instruction. Combined Unit 0 + Unit 1.
#91,v,g,n,s,PM_FXLS0_FULL_CYC,Cycles FXU0/LS0 queue full
##110C0
The issue queue that feeds the Fixed Point unit 0 / Load Store Unit 0 is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented.
#92,v,g,n,s,PM_FXLS1_FULL_CYC,Cycles FXU1/LS1 queue full
##110C4
The issue queue that feeds the Fixed Point unit 1 / Load Store Unit 1 is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented.
#93,c,g,n,n,PM_FXLS_FULL_CYC,Cycles FXLS queue is full
##110A8
Cycles when the issue queues for one or both FXU/LSU units are full.
Use with caution since this is the sum of cycles when Unit 0 was full plus Unit 1 full. It does not indicate when both units were full.
#94,v,g,n,n,PM_FXU0_FIN,FXU0 produced a result
##130E2
The Fixed Point unit 0 finished an instruction and produced a result. Instructions that finish may not necessarily complete.
#95,v,g,n,n,PM_FXU1_FIN,FXU1 produced a result
##130E6
The Fixed Point unit 1 finished an instruction and produced a result. Instructions that finish may not necessarily complete.
#96,u,g,n,n,PM_FXU_IDLE,FXU idle
##00012
FXU0 and FXU1 are both idle.
#97,v,g,n,n,PM_GCT_FULL_CYC,Cycles GCT full
##100C0
The Global Completion Table is completely full.
#98,v,g,n,n,PM_GCT_NOSLOT_CYC,Cycles no GCT slot allocated
##00004
Cycles when the Global Completion Table has no slots from this thread.
#99,v,g,n,s,PM_GCT_USAGE_00to59_CYC,Cycles GCT less than 60% full
##0001F
Cycles when the Global Completion Table has fewer than 60% of its slots used. The GCT has 20 entries shared between threads.
#100,v,g,n,s,PM_GPR_MAP_FULL_CYC,Cycles GPR mapper full
##130E5
The General Purpose Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented.
#101,v,g,n,n,PM_GRP_BR_REDIR,Group experienced branch redirect
##120E6
Number of groups, counted at dispatch, that have encountered a branch redirect. Every group constructed from a fetch group that has been redirected will count.
#102,v,g,n,n,PM_GRP_BR_REDIR_NONSPEC,Group experienced non-speculative branch redirect
##12091
Number of groups, counted at completion, that have encountered a branch redirect.
#103,v,g,n,n,PM_GRP_DISP_BLK_SB_CYC,Cycles group dispatch blocked by scoreboard
##130E1
A scoreboard operation on a non-renamed resource has blocked dispatch.
#104,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected
##120E4
A group that previously attempted dispatch was rejected.
#105,v,g,n,n,PM_GRP_DISP_VALID,Group dispatch valid
##120E3
A group is available for dispatch. This does not mean it was successfully dispatched.
#106,v,g,n,n,PM_GRP_IC_MISS,Group experienced I cache miss
##120E7
Number of groups, counted at dispatch, that have encountered an icache miss redirect. Every group constructed from a fetch group that missed the instruction cache will count.
#107,c,g,n,n,PM_GRP_IC_MISS_BR_REDIR_NONSPEC,Group experienced non-speculative I cache miss or branch redirect
##120E5
Group experienced non-speculative I cache miss or branch redirect.
#108,v,g,n,n,PM_GRP_IC_MISS_NONSPEC,Group experienced non-speculative I cache miss
##12099
Number of groups, counted at completion, that have encountered an instruction cache miss.
#109,v,g,n,n,PM_GRP_MRK,Group marked in IDU
##00014
A group was sampled (marked). The group is called a marked group. One instruction within the group is tagged for detailed monitoring. The sampled instruction is called a marked instruction. Events associated with the marked instruction are annotated with the marked term.
#110,v,g,n,n,PM_IC_DEMAND_L2_BHT_REDIRECT,L2 I cache demand request due to BHT redirect
##230E0
A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (CR mispredict).
#111,v,g,n,n,PM_IC_DEMAND_L2_BR_REDIRECT,L2 I cache demand request due to branch redirect
##230E1
A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (either ALL mispredicted or Target).
#112,v,g,n,n,PM_IC_PREF_REQ,Instruction prefetch requests
##220E6
An instruction prefetch request has been made.
#113,v,g,n,n,PM_IERAT_XLATE_WR,Translation written to ierat
##220E7
An entry was written into the IERAT as a result of an IERAT miss. This event can be used to count IERAT misses. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed.
#114,v,g,n,n,PM_IERAT_XLATE_WR_LP,Large page translation written to ierat
##210C6
An entry was written into the IERAT as a result of an IERAT miss. This event can be used to count IERAT misses. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed.
#115,v,g,n,n,PM_IOPS_CMPL,Internal operations completed
##00001
Number of internal operations that completed.
#116,v,g,n,n,PM_INST_DISP_ATTEMPT,Instructions dispatch attempted
##120E1
Number of PowerPC Instructions dispatched (attempted, not filtered by success).
#117,v,g,n,n,PM_INST_FETCH_CYC,Cycles at least 1 instruction fetched
##220E4
Cycles when at least one instruction was sent from the fetch unit to the decode unit.
#118,v,g,n,n,PM_INST_FROM_L2,Instruction fetched from L2
##22086
An instruction fetch group was fetched from L2. Fetch Groups can contain up to 8 instructions.
#119,v,g,n,n,PM_INST_FROM_L25_SHR,Instruction fetched from L2.5 shared
##22096
An instruction fetch group was fetched with shared (T or SL) data from the L2 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions.
#120,v,g,n,n,PM_INST_FROM_L2MISS,Instruction fetched missed L2
##2209B
An instruction fetch group was fetched from beyond the local L2.
#121,v,g,n,n,PM_INST_FROM_L3,Instruction fetched from L3
##2208D
An instruction fetch group was fetched from the local L3. Fetch groups can contain up to 8 instructions.
#122,v,g,n,n,PM_INST_FROM_L35_SHR,Instruction fetched from L3.5 shared
##2209D
An instruction fetch group was fetched with shared (S) data from the L3 of a chip on the same module as this processor is located.
Fetch groups can contain up to 8 instructions.
#123,u,g,n,n,PM_ISLB_MISS,Instruction SLB misses
##800C1
An SLB miss for an instruction fetch has occurred.
#124,v,g,n,n,PM_ITLB_MISS,Instruction TLB misses
##800C0
A TLB miss for an Instruction Fetch has occurred.
#125,v,g,n,n,PM_L1_DCACHE_RELOAD_VALID,L1 reload data source valid
##C30E4
The data source information is valid, the data cache has been reloaded. Prior to POWER5+ this included data cache reloads due to prefetch activity. With POWER5+ this now only includes reloads due to demand loads.
#126,v,g,n,n,PM_L1_PREF,L1 cache data prefetches
##C70E7
A request to prefetch data into the L1 was made.
#127,v,g,n,n,PM_L1_WRITE_CYC,Cycles writing to instruction L1
##230E7
Cycles that a cache line was written to the instruction cache.
#128,v,g,n,s,PM_L2SA_MOD_INV,L2 slice A transition from modified to invalid
##730E0
A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.
#129,v,g,n,s,PM_L2SA_MOD_TAG,L2 slice A transition from modified to tagged
##720E0
A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.
#130,v,g,n,s,PM_L2SA_RCLD_DISP,L2 slice A RC load dispatch attempt
##701C0
A Read/Claim dispatch for a Load was attempted.
#131,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_ADDR,L2 slice A RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ
##711C0
A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.
#132,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_OTHER,L2 slice A RC load dispatch attempt failed due to other reasons
##731E0
A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions.
#133,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_RC_FULL,L2 slice A RC load dispatch attempt failed due to all RC full
##721E0
A Read/Claim dispatch for a load failed because all RC machines are busy.
#134,v,g,n,s,PM_L2SA_RCST_DISP,L2 slice A RC store dispatch attempt
##702C0
A Read/Claim dispatch for a Store was attempted.
#135,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_ADDR,L2 slice A RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ
##712C0
A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.
#136,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_OTHER,L2 slice A RC store dispatch attempt failed due to other reasons
##732E0
A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted.
#137,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_RC_FULL,L2 slice A RC store dispatch attempt failed due to all RC full
##722E0
A Read/Claim dispatch for a store failed because all RC machines are busy.
#138,v,g,n,s,PM_L2SA_RC_DISP_FAIL_CO_BUSY,L2 slice A RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy
##703C0
A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss (re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access.
Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access.
#139,v,g,n,s,PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL,L2 slice A RC dispatch attempt failed due to all CO busy
##713C0
A Read/Claim dispatch was rejected because all Castout machines were busy.
#140,v,g,n,s,PM_L2SA_SHR_INV,L2 slice A transition from shared to invalid
##710C0
A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful the tablewalk duration event should also be counted.
#141,v,g,n,s,PM_L2SA_SHR_MOD,L2 slice A transition from shared to modified
##700C0
A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C.
#142,v,g,n,n,PM_L2SA_ST_HIT,L2 slice A store hits
##733E0
A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A, B, and C.
#143,v,g,n,n,PM_L2SA_ST_REQ,L2 slice A store requests
##723E0
A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C.
#144,v,g,n,s,PM_L2SB_MOD_INV,L2 slice B transition from modified to invalid
##730E1
A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.
#145,v,g,n,s,PM_L2SB_MOD_TAG,L2 slice B transition from modified to tagged
##720E1
A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.
#146,v,g,n,s,PM_L2SB_RCLD_DISP,L2 slice B RC load dispatch attempt
##701C1
A Read/Claim dispatch for a Load was attempted.
#147,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_ADDR,L2 slice B RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ
##711C1
A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.
#148,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_OTHER,L2 slice B RC load dispatch attempt failed due to other reasons
##731E1
A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions.
#149,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_RC_FULL,L2 slice B RC load dispatch attempt failed due to all RC full
##721E1
A Read/Claim dispatch for a load failed because all RC machines are busy.
#150,v,g,n,s,PM_L2SB_RCST_DISP,L2 slice B RC store dispatch attempt
##702C1
A Read/Claim dispatch for a Store was attempted.
#151,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_ADDR,L2 slice B RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ
##712C1
A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.
#152,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_OTHER,L2 slice B RC store dispatch attempt failed due to other reasons
##732E1
A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted.
#153,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_RC_FULL,L2 slice B RC store dispatch attempt failed due to all RC full
##722E2
A Read/Claim dispatch for a store failed because all RC machines are busy.
#154,v,g,n,s,PM_L2SB_RC_DISP_FAIL_CO_BUSY,L2 slice B RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy
##703C1
A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss (re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access.
#155,v,g,n,s,PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL,L2 slice B RC dispatch attempt failed due to all CO busy
##713C1
A Read/Claim dispatch was rejected because all Castout machines were busy.
#156,v,g,n,s,PM_L2SB_SHR_INV,L2 slice B transition from shared to invalid
##710C1
A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful the tablewalk duration event should also be counted.
#157,v,g,n,s,PM_L2SB_SHR_MOD,L2 slice B transition from shared to modified
##700C1
A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C.
#158,v,g,n,n,PM_L2SB_ST_HIT,L2 slice B store hits
##733E1
A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A, B, and C.
#159,v,g,n,n,PM_L2SB_ST_REQ,L2 slice B store requests
##723E1
A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C.
#160,v,g,n,s,PM_L2SC_MOD_INV,L2 slice C transition from modified to invalid
##730E2
A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.
#161,v,g,n,s,PM_L2SC_MOD_TAG,L2 slice C transition from modified to tagged
##720E2
A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.
#162,v,g,n,s,PM_L2SC_RCLD_DISP,L2 slice C RC load dispatch attempt
##701C2
A Read/Claim dispatch for a Load was attempted.
#163,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_ADDR,L2 slice C RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ
##711C2
A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.
#164,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_OTHER,L2 slice C RC load dispatch attempt failed due to other reasons
##731E2
A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions.
#165,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_RC_FULL,L2 slice C RC load dispatch attempt failed due to all RC full
##721E2
A Read/Claim dispatch for a load failed because all RC machines are busy.
#166,v,g,n,s,PM_L2SC_RCST_DISP,L2 slice C RC store dispatch attempt
##702C2
A Read/Claim dispatch for a Store was attempted.
#167,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_ADDR,L2 slice C RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ
##712C2
A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.
#168,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_OTHER,L2 slice C RC store dispatch attempt failed due to other reasons
##732E2
A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted.
#169,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_RC_FULL,L2 slice C RC store dispatch attempt failed due to all RC full
##722E1
A Read/Claim dispatch for a store failed because all RC machines are busy.
#170,v,g,n,s,PM_L2SC_RC_DISP_FAIL_CO_BUSY,L2 slice C RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy
##703C2
A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss (re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access.
#171,v,g,n,s,PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL,L2 slice C RC dispatch attempt failed due to all CO busy
##713C2
A Read/Claim dispatch was rejected because all Castout machines were busy.
#172,v,g,n,s,PM_L2SC_SHR_INV,L2 slice C transition from shared to invalid
##710C2
A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful the tablewalk duration event should also be counted.
#173,v,g,n,s,PM_L2SC_SHR_MOD,L2 slice C transition from shared to modified
##700C2
A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C.
#174,v,g,n,n,PM_L2SC_ST_HIT,L2 slice C store hits
##733E2
A store request made from the core hit in the L2 directory. The event is provided on each of the three slices A, B, and C.
#175,v,g,n,n,PM_L2SC_ST_REQ,L2 slice C store requests
##723E2
A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C.
#176,v,g,n,n,PM_L2_PREF,L2 cache prefetches
##C50C3
A request to prefetch data into L2 was made.
#177,v,g,n,s,PM_L3SA_ALL_BUSY,L3 slice A active for every cycle all CI/CO machines busy
##721E3
Cycles All Castin/Castout machines are busy.
#178,v,g,n,s,PM_L3SA_HIT,L3 slice A hits
##711C3
Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice.
#179,v,g,n,s,PM_L3SA_MOD_INV,L3 slice A transition from modified to invalid
##730E3
L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I). Mu|Me are not included since they are formed due to a prev read op. Tx is not included since it is considered shared at this point.
#180,v,g,n,s,PM_L3SA_MOD_TAG,L3 slice A transition from modified to TAG
##720E3
L3 snooper detects someone doing a read to a line that is truly M in this L3 (i.e. L3 going M->T or M->I (go_Mu case); Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point.
#181,v,g,n,s,PM_L3SA_REF,L3 slice A references
##701C3
Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice.
#182,v,g,n,s,PM_L3SA_SHR_INV,L3 slice A transition from shared to invalid
##710C3
L3 snooper detects someone doing a store to a line that is Sx in this L3 (i.e. invalidate hit SX and dispatched).
#183,v,g,n,s,PM_L3SA_SNOOP_RETRY,L3 slice A snoop retries
##731E3
Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b).
#184,v,g,n,s,PM_L3SB_ALL_BUSY,L3 slice B active for every cycle all CI/CO machines busy
##721E4
Cycles All Castin/Castout machines are busy.
#185,v,g,n,s,PM_L3SB_HIT,L3 slice B hits
##711C4
Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice.
#186,v,g,n,s,PM_L3SB_MOD_INV,L3 slice B transition from modified to invalid
##730E4
L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I). Mu|Me are not included since they are formed due to a prev read op. Tx is not included since it is considered shared at this point.
#187,v,g,n,s,PM_L3SB_MOD_TAG,L3 slice B transition from modified to TAG
##720E4
L3 snooper detects someone doing a read to a line that is truly M in this L3 (i.e. L3 going M->T or M->I (go_Mu case); Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point.
#188,v,g,n,s,PM_L3SB_REF,L3 slice B references
##701C4
Number of attempts made by this chip's cores to find data in the L3.
Reported per L3 slice.
#189,v,g,n,s,PM_L3SB_SHR_INV,L3 slice B transition from shared to invalid
##710C4
L3 snooper detects someone doing a store to a line that is Sx in this L3 (i.e. invalidate hit SX and dispatched).
#190,v,g,n,s,PM_L3SB_SNOOP_RETRY,L3 slice B snoop retries
##731E4
Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b).
#191,v,g,n,s,PM_L3SC_ALL_BUSY,L3 slice C active for every cycle all CI/CO machines busy
##721E5
Cycles All Castin/Castout machines are busy.
#192,v,g,n,s,PM_L3SC_HIT,L3 slice C hits
##711C5
Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice.
#193,v,g,n,s,PM_L3SC_MOD_INV,L3 slice C transition from modified to invalid
##730E5
L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I). Mu|Me are not included since they are formed due to a previous read op. Tx is not included since it is considered shared at this point.
#194,v,g,n,s,PM_L3SC_MOD_TAG,L3 slice C transition from modified to TAG
##720E5
L3 snooper detects someone doing a read to a line that is truly M in this L3 (i.e. L3 going M->T or M->I (go_Mu case); Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point.
#195,v,g,n,s,PM_L3SC_REF,L3 slice C references
##701C5
Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice.
#196,v,g,n,s,PM_L3SC_SHR_INV,L3 slice C transition from shared to invalid
##710C5
L3 snooper detects someone doing a store to a line that is Sx in this L3 (i.e. invalidate hit SX and dispatched).
#197,v,g,n,s,PM_L3SC_SNOOP_RETRY,L3 slice C snoop retries ##731E5 Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b) #198,v,g,n,n,PM_LARX_LSU0,Larx executed on LSU0 ##820E7 A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0) #199,v,g,n,n,PM_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##C10C2 Load references that miss the Level 1 Data cache, by unit 0. #200,v,g,n,n,PM_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##C10C5,C10C6 Load references that miss the Level 1 Data cache, by unit 1. #201,v,g,n,n,PM_LD_REF_L1,L1 D cache load references ##C10A8 Load references to the Level 1 Data Cache. Combined unit 0 + 1. #202,v,g,n,n,PM_LD_REF_L1_LSU0,LSU0 L1 D cache load references ##C10C0 Load references to Level 1 Data Cache, by unit 0. #203,v,g,n,n,PM_BR_PRED_TA,A conditional branch was predicted, target prediction ##230E3 The target address of a branch instruction was predicted. #204,u,g,n,s,PM_LR_CTR_MAP_FULL_CYC,Cycles LR/CTR mapper full ##100C6 The LR/CTR mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #205,v,g,n,n,PM_LSU0_BUSY_REJECT,LSU0 busy due to reject ##C20E1 Total cycles the Load Store Unit 0 is busy rejecting instructions. #206,v,g,n,n,PM_LSU0_DERAT_MISS,LSU0 DERAT misses ##800C2 Total D-ERAT Misses by LSU0. Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction. #207,v,g,n,n,PM_LSU0_FLUSH_LRQ,LSU0 LRQ flushes ##C00C2 A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. 
#208,u,g,n,n,PM_LSU0_FLUSH_SRQ,LSU0 SRQ lhs flushes ##C00C3 A store was flushed by unit 0 because younger load hits and older store that is already in the SRQ or in the same group. #209,v,g,n,n,PM_LSU0_FLUSH_ULD,LSU0 unaligned load flushes ##C00C0 A load was flushed from unit 0 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1) #210,v,g,n,n,PM_LSU0_FLUSH_UST,LSU0 unaligned store flushes ##C00C1 A store was flushed from unit 0 because it was unaligned (crossed a 4K boundary). #211,v,g,n,n,PM_LSU0_LDF,LSU0 executed Floating Point load instruction ##C50C0 A floating point load was executed by LSU0 #212,v,g,n,n,PM_LSU0_NCLD,LSU0 non-cacheable loads ##C50C1 A non-cacheable load was executed by unit 0. #213,v,g,n,n,PM_LSU0_REJECT_ERAT_MISS,LSU0 reject due to ERAT miss ##C40C3 Total cycles the Load Store Unit 0 is busy rejecting instructions due to an ERAT miss. Requests that miss the Derat are rejected and retried until the request hits in the Erat. #214,v,g,n,n,PM_LSU0_REJECT_LMQ_FULL,LSU0 reject due to LMQ full or missed data coming ##C40C1 Total cycles the Load Store Unit 0 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all eight entries are full, subsequent load instructions are rejected. #215,v,g,n,n,PM_LSU0_REJECT_RELOAD_CDF,LSU0 reject due to reload CDF or tag update collision ##C40C2 Total cycles the Load Store Unit 0 is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same cycle when they are being updated. 
#216,v,g,n,n,PM_LSU0_REJECT_SRQ,LSU0 SRQ lhs rejects ##C40C0 Total cycles the Load Store Unit 0 is busy rejecting instructions because of Load Hit Store conditions. Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue. #217,u,g,n,n,PM_LSU0_SRQ_STFWD,LSU0 SRQ store forwarded ##C60E1 Data from a store instruction was forwarded to a load on unit 0. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. A load that hits L1 but becomes a store forward is not treated as a load miss. #218,v,g,n,n,PM_LSU1_BUSY_REJECT,LSU1 busy due to reject ##C20E5 Total cycles the Load Store Unit 1 is busy rejecting instructions. #219,v,g,n,n,PM_LSU1_DERAT_MISS,LSU1 DERAT misses ##800C6 A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #220,v,g,n,n,PM_LSU1_FLUSH_LRQ,LSU1 LRQ flushes ##C00C6 A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #221,u,g,n,n,PM_LSU1_FLUSH_SRQ,LSU1 SRQ lhs flushes ##C00C7 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #222,v,g,n,n,PM_LSU1_FLUSH_ULD,LSU1 unaligned load flushes ##C00C4 A load was flushed from unit 1 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1). 
#223,u,g,n,n,PM_LSU1_FLUSH_UST,LSU1 unaligned store flushes ##C00C5 A store was flushed from unit 1 because it was unaligned (crossed a 4K boundary) #224,v,g,n,n,PM_LSU1_LDF,LSU1 executed Floating Point load instruction ##C50C4 A floating point load was executed by LSU1 #225,v,g,n,n,PM_LSU1_NCLD,LSU1 non-cacheable loads ##C50C5 A non-cacheable load was executed by unit 1. #226,v,g,n,n,PM_LSU1_REJECT_ERAT_MISS,LSU1 reject due to ERAT miss ##C40C7 Total cycles the Load Store Unit 1 is busy rejecting instructions due to an ERAT miss. Requests that miss the Derat are rejected and retried until the request hits in the Erat. #227,v,g,n,n,PM_LSU1_REJECT_LMQ_FULL,LSU1 reject due to LMQ full or missed data coming ##C40C5 Total cycles the Load Store Unit 1 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all eight entries are full, subsequent load instructions are rejected. #228,v,g,n,n,PM_LSU1_REJECT_RELOAD_CDF,LSU1 reject due to reload CDF or tag update collision ##C40C6 Total cycles the Load Store Unit 1 is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same cycle when they are being updated. #229,v,g,n,n,PM_LSU1_REJECT_SRQ,LSU1 SRQ lhs rejects ##C40C4 Total cycles the Load Store Unit 1 is busy rejecting instructions because of Load Hit Store conditions. Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue. 
#230,u,g,n,n,PM_LSU1_SRQ_STFWD,LSU1 SRQ store forwarded ##C60E5 Data from a store instruction was forwarded to a load on unit 1. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. A load that hits L1 but becomes a store forward is not treated as a load miss. #231,v,g,n,n,PM_LSU_FLUSH,Flush initiated by LSU ##110C5 A flush was initiated by the Load Store Unit #232,v,g,n,s,PM_LSU_FLUSH_LRQ_FULL,Flush caused by LRQ full ##320E7 This thread was flushed at dispatch because its Load Request Queue was full. This allows the other thread to have more machine resources for it to make progress while this thread is stalled. #233,u,g,n,n,PM_LSU_FLUSH_SRQ,SRQ flushes ##C0090 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. Combined Unit 0 + 1. #234,v,g,n,s,PM_LSU_FLUSH_SRQ_FULL,Flush caused by SRQ full ##330E0 This thread was flushed at dispatch because its Store Request Queue was full. This allows the other thread to have more machine resources for it to make progress while this thread is stalled. #235,v,g,n,n,PM_LSU_FLUSH_ULD,LRQ unaligned load flushes ##C0088 A load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1). Combined Unit 0 + 1. #236,v,g,n,n,PM_LSU_LDF,LSU executed Floating Point load instruction ##C50A8 LSU executed Floating Point load instruction. Combined Unit 0 + 1. #237,u,g,n,s,PM_LSU_LMQ_FULL_CYC,Cycles LMQ full ##C30E7 The Load Miss Queue was full. #238,v,g,n,n,PM_LSU_LMQ_LHR_MERGE,LMQ LHR merges ##C70E5 A data cache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry. #239,v,g,n,s,PM_LSU_LMQ_S0_ALLOC,LMQ slot 0 allocated ##C30E6 The first entry in the LMQ was allocated. 
#240,v,g,n,n,PM_LSU_LMQ_S0_VALID,LMQ slot 0 valid ##C30E5 This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ has eight entries that are allocated FIFO #241,v,g,n,s,PM_LSU_LRQ_FULL_CYC,Cycles LRQ full ##110C2 Cycles when the LRQ is full. #242,v,g,n,n,PM_LSU_LRQ_S0_ALLOC,LRQ slot 0 allocated ##C60E7 LRQ slot zero was allocated #243,v,g,n,n,PM_LSU_LRQ_S0_VALID,LRQ slot 0 valid ##C60E6 This signal is asserted every cycle that the Load Request Queue slot zero is valid. The LRQ is 32 entries long and is allocated round-robin. In SMT mode the LRQ is split between the two threads (16 entries each). #244,v,g,n,n,PM_LSU_REJECT_ERAT_MISS,LSU reject due to ERAT miss ##C4090 Total cycles the Load Store Unit is busy rejecting instructions due to an ERAT miss. Combined unit 0 + 1. Requests that miss the Derat are rejected and retried until the request hits in the Erat. #245,v,g,n,n,PM_LSU_REJECT_SRQ,LSU SRQ lhs rejects ##C4088 Total cycles the Load Store Unit is busy rejecting instructions because of Load Hit Store conditions. Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue. Combined Unit 0 + 1. #246,v,g,n,s,PM_LSU_SRQ_FULL_CYC,Cycles SRQ full ##110C3 Cycles the Store Request Queue is full. #247,v,g,n,n,PM_LSU_SRQ_S0_ALLOC,SRQ slot 0 allocated ##C20E7 SRQ Slot zero was allocated #248,v,g,n,n,PM_LSU_SRQ_S0_VALID,SRQ slot 0 valid ##C20E6 This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. In SMT mode the SRQ is split between the two threads (16 entries each). #249,u,g,n,n,PM_LSU_SRQ_SYNC_CYC,SRQ sync duration ##830E5 Cycles that a sync instruction is active in the Store Request Queue. 
#250,v,g,n,n,PM_LWSYNC_HELD,LWSYNC held at dispatch ##130E0 Cycles a LWSYNC instruction was held at dispatch. LWSYNC instructions are held at dispatch until all previous loads are done and all previous stores have issued. LWSYNC enters the Store Request Queue and is sent to the storage subsystem but does not wait for a response. #251,c,g,n,n,PM_MEM_FAST_PATH_RD_DISP,Fast path memory read dispatched ##731E6 Fast path memory read dispatched #252,v,g,n,n,PM_IC_PREF_INSTALL,Instruction prefetched installed in prefetch buffer ##210C7 A prefetch buffer entry (line) is allocated but the request is not a demand fetch. #253,v,g,n,s,PM_MEM_HI_PRIO_WR_CMPL,High priority write completed ##726E6 A memory write, which was upgraded to high priority, completed. Writes can be upgraded to high priority to ensure that read traffic does not lock out writes. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #254,v,g,n,s,PM_MEM_NONSPEC_RD_CANCEL,Non speculative memory read cancelled ##711C6 A non-speculative read was cancelled because the combined response indicated it was sourced from another L2 or L3. This event is sent from the Memory Controller clock domain and must be scaled accordingly #255,v,g,n,s,PM_MEM_LO_PRIO_WR_CMPL,Low priority write completed ##736E6 A memory write, which was not upgraded to high priority, completed. This event is sent from the Memory Controller clock domain and must be scaled accordingly #256,v,g,n,s,PM_MEM_PWQ_DISP,Memory partial-write queue dispatched ##704C6 Number of Partial Writes dispatched. The MC provides resources to gather partial cacheline writes (Partial line DMA writes & CI-stores) to up to four different cachelines at a time. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #257,v,g,n,n,PM_MEM_PWQ_DISP_Q2or3,Memory partial-write queue dispatched to Write Queue 2 or 3 ##734E6 Memory partial-write queue dispatched to Write Queue 2 or 3. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly. #258,v,g,n,s,PM_MEM_PW_CMPL,Memory partial-write completed ##724E6 Number of Partial Writes completed. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #259,v,g,n,s,PM_MEM_PW_GATH,Memory partial-write gathered ##714C6 Two or more partial-writes have been merged into a single memory write. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #260,v,g,n,n,PM_MEM_RQ_DISP_Q0to3,Memory read queue dispatched to queues 0-3 ##702C6 A memory operation was dispatched to read queue 0,1,2, or 3. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #261,v,g,n,s,PM_MEM_RQ_DISP,Memory read queue dispatched ##701C6 A memory read was dispatched. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #262,v,g,n,n,PM_MEM_RQ_DISP_Q4to7,Memory read queue dispatched to queues 4-7 ##712C6 A memory operation was dispatched to read queue 4,5,6 or 7. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #263,v,g,n,n,PM_MEM_RQ_DISP_Q8to11,Memory read queue dispatched to queues 8-11 ##722E6 A memory operation was dispatched to read queue 8,9,10 or 11. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #264,v,g,n,s,PM_MEM_SPEC_RD_CANCEL,Speculative memory read cancelled ##721E6 Speculative memory read cancelled (i.e. cresp = sourced by L2/L3) #265,v,g,n,n,PM_MEM_WQ_DISP_Q0to7,Memory write queue dispatched to queues 0-7 ##723E6 A memory operation was dispatched to a write queue in the range between 0 and 7. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #266,v,g,n,n,PM_MEM_WQ_DISP_Q8to15,Memory write queue dispatched to queues 8-15 ##733E6 A memory operation was dispatched to a write queue in the range between 8 and 15. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly. #267,v,g,n,s,PM_MEM_WQ_DISP_DCLAIM,Memory write queue dispatched due to dclaim/flush ##713C6 A memory dclaim or flush operation was dispatched to a write queue. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #268,v,g,n,s,PM_MEM_WQ_DISP_WRITE,Memory write queue dispatched due to write ##703C6 A memory write was dispatched to a write queue. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #269,v,g,n,n,PM_MRK_DATA_FROM_L2,Marked data loaded from L2 ##C7087 The processor's Data Cache was reloaded from the local L2 due to a marked load. #270,v,g,n,n,PM_MRK_DATA_FROM_L25_SHR,Marked data loaded from L2.5 shared ##C7097 The processor's Data Cache was reloaded with shared (T or SL) data from the L2 of a chip on the same module as this processor is located due to a marked load. #271,v,g,n,n,PM_MRK_DATA_FROM_L275_MOD,Marked data loaded from L2.75 modified ##C70A3 The processor's Data Cache was reloaded with modified (M) data from the L2 on a different module than this processor is located due to a marked load. #272,v,g,n,n,PM_MRK_DATA_FROM_L3,Marked data loaded from L3 ##C708E The processor's Data Cache was reloaded from the local L3 due to a marked load. #273,v,g,n,n,PM_MRK_DATA_FROM_L35_SHR,Marked data loaded from L3.5 shared ##C709E The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on the same module as this processor is located due to a marked load. #274,v,g,n,n,PM_MRK_DATA_FROM_L375_MOD,Marked data loaded from L3.75 modified ##C70A7 The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on a different module than this processor is located due to a marked load. 
#275,v,g,n,n,PM_MRK_DATA_FROM_RMEM,Marked data loaded from remote memory ##C70A1 The processor's Data Cache was reloaded due to a marked load from memory attached to a different module than this processor is located on. #276,v,g,n,n,PM_MRK_DSLB_MISS,Marked Data SLB misses ##C50C7 A Data SLB miss was caused by a marked instruction. #277,v,g,n,n,PM_MRK_DTLB_MISS,Marked Data TLB misses ##C50C6,C60E0 Data TLB references by a marked instruction that missed the TLB (all page sizes). #278,v,g,n,n,PM_MRK_DTLB_MISS_4K,Marked Data TLB misses for 4K page ##C608D Data TLB references to 4KB pages by a marked instruction that missed the TLB. Page size is determined at TLB reload time. #279,v,g,n,n,PM_MRK_DTLB_REF,Marked Data TLB reference ##C60E4 Total number of Data TLB references by a marked instruction for all page sizes. Page size is determined at TLB reload time. #280,v,g,n,n,PM_MRK_DTLB_REF_4K,Marked Data TLB reference for 4K page ##C6086 Data TLB references by a marked instruction for 4KB pages. #281,v,g,n,n,PM_MRK_GRP_DISP,Marked group dispatched ##00002 A group containing a sampled instruction was dispatched #282,v,g,n,n,PM_MRK_GRP_ISSUED,Marked group issued ##00015 A sampled instruction was issued. #283,v,g,n,n,PM_MRK_IMR_RELOAD,Marked IMR reloaded ##820E2 A DL1 reload occurred due to marked load #284,v,g,n,n,PM_MRK_L1_RELOAD_VALID,Marked L1 reload data source valid ##C70E4 The source information is valid and is for a marked load #285,v,g,n,n,PM_MRK_LD_MISS_L1,Marked L1 D cache load misses ##82088 Marked L1 D cache load misses #286,v,g,n,n,PM_MRK_LD_MISS_L1_LSU0,LSU0 marked L1 D cache load misses ##820E0 Load references that miss the Level 1 Data cache, by LSU0. #287,v,g,n,n,PM_MRK_LD_MISS_L1_LSU1,LSU1 marked L1 D cache load misses ##820E4 Load references that miss the Level 1 Data cache, by LSU1. 
#288,v,g,n,n,PM_MRK_LSU0_FLUSH_LRQ,LSU0 marked LRQ flushes ##810C2 A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #289,u,g,n,n,PM_MRK_LSU0_FLUSH_SRQ,LSU0 marked SRQ lhs flushes ##810C3 A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #290,v,g,n,n,PM_MRK_LSU0_FLUSH_ULD,LSU0 marked unaligned load flushes ##810C1 A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #291,v,g,n,n,PM_MRK_LSU0_FLUSH_UST,LSU0 marked unaligned store flushes ##810C0 A marked store was flushed from unit 0 because it was unaligned #292,v,g,n,n,PM_MRK_LSU1_FLUSH_LRQ,LSU1 marked LRQ flushes ##810C6 A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #293,u,g,n,n,PM_MRK_LSU1_FLUSH_SRQ,LSU1 marked SRQ lhs flushes ##810C7 A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group. 
#294,v,g,n,n,PM_MRK_LSU1_FLUSH_ULD,LSU1 marked unaligned load flushes ##810C4 A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #295,u,g,n,n,PM_MRK_LSU1_FLUSH_UST,LSU1 marked unaligned store flushes ##810C5 A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #296,v,g,n,n,PM_MRK_LSU_FLUSH_ULD,Marked unaligned load flushes ##810A8 A marked load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #297,u,g,n,n,PM_MRK_LSU_SRQ_INST_VALID,Marked instruction valid in SRQ ##C70E6 This signal is asserted every cycle when a marked request is resident in the Store Request Queue #298,v,g,n,n,PM_MRK_STCX_FAIL,Marked STCX failed ##820E6 A marked stcx (stwcx or stdcx) failed #299,v,g,n,n,PM_MRK_ST_CMPL,Marked store instruction completed ##00003 A sampled store has completed (data home) #300,v,g,n,n,PM_MRK_ST_MISS_L1,Marked L1 D cache store misses ##820E3 A marked store missed the dcache #301,v,g,n,n,PM_PMC4_OVERFLOW,PMC4 Overflow ##0000A Overflows from PMC4 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow. #302,v,g,n,n,PM_PMC5_OVERFLOW,PMC5 Overflow ##0001A Overflows from PMC5 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow. #303,v,g,n,n,PM_INST_CMPL,Instructions completed ##00009 Number of PowerPC instructions that completed. 
#304,v,g,n,n,PM_PTEG_FROM_L2,PTEG loaded from L2 ##83087 A Page Table Entry was loaded into the TLB from the local L2 due to a demand load #305,v,g,n,n,PM_PTEG_FROM_L25_SHR,PTEG loaded from L2.5 shared ##83097 A Page Table Entry was loaded into the TLB with shared (T or SL) data from the L2 of a chip on the same module as this processor is located due to a demand load. #306,v,g,n,n,PM_PTEG_FROM_L275_MOD,PTEG loaded from L2.75 modified ##830A3 A Page Table Entry was loaded into the TLB with modified (M) data from the L2 on a different module than this processor is located due to a demand load. #307,v,g,n,n,PM_PTEG_FROM_L3,PTEG loaded from L3 ##8308E A Page Table Entry was loaded into the TLB from the local L3 due to a demand load. #308,v,g,n,n,PM_PTEG_FROM_L35_SHR,PTEG loaded from L3.5 shared ##8309E A Page Table Entry was loaded into the TLB with shared (S) data from the L3 of a chip on the same module as this processor is located, due to a demand load. #309,v,g,n,n,PM_PTEG_FROM_L375_MOD,PTEG loaded from L3.75 modified ##830A7 A Page Table Entry was loaded into the TLB with modified (M) data from the L3 of a chip on a different module than this processor is located, due to a demand load. #310,v,g,n,n,PM_PTEG_FROM_RMEM,PTEG loaded from remote memory ##830A1 A Page Table Entry was loaded into the TLB from memory attached to a different module than this processor is located on. #311,v,g,n,n,PM_PTEG_RELOAD_VALID,PTEG reload valid ##830E4 A Page Table Entry was loaded into the TLB. #312,v,g,n,n,PM_RUN_CYC,Run cycles ##00005 Processor Cycles gated by the run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. Gating by the run latch filters out the idle loop. #313,v,g,n,s,PM_SNOOP_DCLAIM_RETRY_QFULL,Snoop dclaim/flush retry due to write/dclaim queues full ##720E6 A snoop request for a dclaim or flush to memory was retried because the write/dclaim queues were full. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly. #314,v,g,n,s,PM_SNOOP_PARTIAL_RTRY_QFULL,Snoop partial write retry due to partial-write queues full ##730E6 A snoop request for a partial write to memory was retried because the write queues that handle partial writes were full. When this happens the active writes are changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #315,v,g,n,s,PM_SNOOP_PW_RETRY_RQ,Snoop partial-write retry due to collision with active read queue ##707C6 A snoop request for a partial write to memory was retried because it matched the cache line of an active read. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #316,v,g,n,s,PM_SNOOP_PW_RETRY_WQ_PWQ,Snoop partial-write retry due to collision with active write or partial-write queue ##717C6 A snoop request for a partial write to memory was retried because it matched the cache line of an active write or partial write. When this happens the snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #317,v,g,n,s,PM_SNOOP_RD_RETRY_QFULL,Snoop read retry due to read queue full ##700C6 A snoop request for a read from memory was retried because the read queues were full. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #318,v,g,n,s,PM_SNOOP_RD_RETRY_RQ,Snoop read retry due to collision with active read queue ##705C6 A snoop request for a read from memory was retried because it matched the cache line of an active read. The snoop request is retried because the L2 may be able to source data via intervention for the 2nd read faster than the MC. This event is sent from the Memory Controller clock domain and must be scaled accordingly. 
#319,v,g,n,s,PM_SNOOP_RD_RETRY_WQ,Snoop read retry due to collision with active write queue ##715C6 A snoop request for a read from memory was retried because it matched the cache line of an active write. The snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #320,v,g,n,s,PM_SNOOP_RETRY_1AHEAD,Snoop retry due to one ahead collision ##725E6 Snoop retry due to one ahead collision #321,u,g,n,s,PM_SNOOP_TLBIE,Snoop TLBIE ##800C3 A tlbie was snooped from another processor. #322,v,g,n,s,PM_SNOOP_WR_RETRY_QFULL,Snoop write retry due to write queue full ##710C6 A snoop request for a write to memory was retried because the write queues were full. When this happens the snoop request is retried and the writes in the write reorder queue are changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #323,v,g,n,s,PM_SNOOP_WR_RETRY_RQ,Snoop write/dclaim retry due to collision with active read queue ##706C6 A snoop request for a write or dclaim to memory was retried because it matched the cacheline of an active read. This event is sent from the Memory Controller clock domain and must be scaled accordingly #324,v,g,n,s,PM_SNOOP_WR_RETRY_WQ,Snoop write/dclaim retry due to collision with active write queue ##716C6 A snoop request for a write or dclaim to memory was retried because it matched the cache line of an active write. The snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #325,v,g,n,n,PM_STCX_FAIL,STCX failed ##820E1 A stcx (stwcx or stdcx) failed #326,v,g,n,n,PM_STCX_PASS,Stcx passes ##820E5 A stcx (stwcx or stdcx) instruction was successful #327,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##C10C3 A store missed the dcache. Combined Unit 0 + 1. 
#328,v,g,n,n,PM_ST_REF_L1_LSU0,LSU0 L1 D cache store references ##C10C1 Store references to the Data Cache by LSU0. #329,v,g,n,n,PM_ST_REF_L1_LSU1,LSU1 L1 D cache store references ##C10C4 Store references to the Data Cache by LSU1. #330,v,g,n,n,PM_SUSPENDED,Suspended ##00000 The counter is suspended (does not count). #331,u,g,n,n,PM_TB_BIT_TRANS,Time Base bit transition ##00018 When the selected time base bit (as specified in MMCR0[TBSEL]) transitions from 0 to 1 #332,v,g,n,s,PM_THRD_L2MISS_BOTH_CYC,Cycles both threads in L2 misses ##410C7 Cycles that both threads have an L2 miss pending. If only one thread has an L2 miss pending, the other thread is given priority at decode. If both threads have an L2 miss pending, decode priority is determined by the number of GCT entries used. #333,v,g,n,s,PM_THRD_ONE_RUN_CYC,One of the threads in run cycles ##0000B At least one thread has set its run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. This event does not respect FCWAIT. #334,v,g,n,n,PM_THRD_PRIO_1_CYC,Cycles thread running at priority level 1 ##420E0 Cycles this thread was running at priority level 1. Priority level 1 is the lowest and indicates the thread is sleeping. #335,v,g,n,n,PM_THRD_PRIO_2_CYC,Cycles thread running at priority level 2 ##420E1 Cycles this thread was running at priority level 2. #336,v,g,n,n,PM_THRD_PRIO_3_CYC,Cycles thread running at priority level 3 ##420E2 Cycles this thread was running at priority level 3. #337,v,g,n,n,PM_THRD_PRIO_4_CYC,Cycles thread running at priority level 4 ##420E3 Cycles this thread was running at priority level 4. #338,v,g,n,n,PM_THRD_PRIO_5_CYC,Cycles thread running at priority level 5 ##420E4 Cycles this thread was running at priority level 5. #339,v,g,n,n,PM_THRD_PRIO_6_CYC,Cycles thread running at priority level 6 ##420E5 Cycles this thread was running at priority level 6. 
#340,v,g,n,n,PM_THRD_PRIO_7_CYC,Cycles thread running at priority level 7 ##420E6 Cycles this thread was running at priority level 7. #341,v,g,n,n,PM_THRD_PRIO_DIFF_0_CYC,Cycles no thread priority difference ##430E3 Cycles when this thread's priority is equal to the other thread's priority. #342,v,g,n,n,PM_THRD_PRIO_DIFF_1or2_CYC,Cycles thread priority difference is 1 or 2 ##430E4 Cycles when this thread's priority is higher than the other thread's priority by 1 or 2. #343,v,g,n,n,PM_THRD_PRIO_DIFF_3or4_CYC,Cycles thread priority difference is 3 or 4 ##430E5 Cycles when this thread's priority is higher than the other thread's priority by 3 or 4. #344,v,g,n,n,PM_THRD_PRIO_DIFF_5or6_CYC,Cycles thread priority difference is 5 or 6 ##430E6 Cycles when this thread's priority is higher than the other thread's priority by 5 or 6. #345,v,g,n,n,PM_THRD_PRIO_DIFF_minus1or2_CYC,Cycles thread priority difference is -1 or -2 ##430E2 Cycles when this thread's priority is lower than the other thread's priority by 1 or 2. #346,v,g,n,n,PM_THRD_PRIO_DIFF_minus3or4_CYC,Cycles thread priority difference is -3 or -4 ##430E1 Cycles when this thread's priority is lower than the other thread's priority by 3 or 4. #347,v,g,n,n,PM_THRD_PRIO_DIFF_minus5or6_CYC,Cycles thread priority difference is -5 or -6 ##430E0 Cycles when this thread's priority is lower than the other thread's priority by 5 or 6. #348,v,g,n,s,PM_THRD_SEL_OVER_CLB_EMPTY,Thread selection overrides caused by CLB empty ##410C2 Thread selection was overridden because one thread's CLB was empty. #349,v,g,n,s,PM_THRD_SEL_OVER_GCT_IMBAL,Thread selection overrides caused by GCT imbalance ##410C4 Thread selection was overridden because of a GCT imbalance. #350,v,g,n,s,PM_THRD_SEL_OVER_ISU_HOLD,Thread selection overrides caused by ISU holds ##410C5 Thread selection was overridden because of an ISU hold. 
#351,v,g,n,s,PM_THRD_SEL_OVER_L2MISS,Thread selection overrides caused by L2 misses ##410C3 Thread selection was overridden because one thread had an L2 miss pending. #352,v,g,n,s,PM_THRD_SEL_T0,Decode selected thread 0 ##410C0 Thread selection picked thread 0 for decode. #353,v,g,n,s,PM_THRD_SEL_T1,Decode selected thread 1 ##410C1 Thread selection picked thread 1 for decode. #354,v,g,n,s,PM_THRD_SMT_HANG,SMT hang detected ##330E7 A hung thread was detected #355,v,g,n,n,PM_TLBIE_HELD,TLBIE held at dispatch ##130E4 Cycles a TLBIE instruction was held at dispatch. #356,v,g,n,n,PM_TLB_MISS,TLB misses ##80088 Total of Data TLB misses + Instruction TLB misses #357,v,g,n,s,PM_XER_MAP_FULL_CYC,Cycles XER mapper full ##100C2 The XER mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #358,v,g,n,n,PM_BR_PRED_CR,A conditional branch was predicted, CR prediction ##230E2 A conditional branch instruction was predicted as taken or not taken. #359,v,g,n,n,PM_MEM_RQ_DISP_Q12to15,Memory read queue dispatched to queues 12-15 ##732E6 A memory operation was dispatched to read queue 12,13,14 or 15. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #360,v,g,n,n,PM_MEM_RQ_DISP_Q16to19,Memory read queue dispatched to queues 16-19 ##727E6 A memory operation was dispatched to read queue 16,17,18 or 19. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #361,v,g,n,n,PM_SNOOP_RETRY_AB_COLLISION,Snoop retry due to a b collision ##735E6 Snoop retry due to a b collision $$$$$$$$ { counter 2 } #0,v,g,n,n,PM_0INST_CLB_CYC,Cycles no instructions in CLB ##400C0 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. 
These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #1,v,g,n,n,PM_1INST_CLB_CYC,Cycles 1 instruction in CLB ##400C1 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #2,v,g,n,n,PM_2INST_CLB_CYC,Cycles 2 instructions in CLB ##400C2 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #3,v,g,n,n,PM_3INST_CLB_CYC,Cycles 3 instructions in CLB ##400C3 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #4,v,g,n,n,PM_4INST_CLB_CYC,Cycles 4 instructions in CLB ##400C4 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. 
These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #5,v,g,n,n,PM_5INST_CLB_CYC,Cycles 5 instructions in CLB ##400C5 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #6,v,g,n,n,PM_6INST_CLB_CYC,Cycles 6 instructions in CLB ##400C6 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #7,u,g,n,s,PM_BRQ_FULL_CYC,Cycles branch queue full ##100C5 Cycles when the issue queue that feeds the branch unit is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented. #8,v,g,n,n,PM_BR_ISSUED,Branches issued ##230E4 A branch instruction was issued to the branch unit. A branch that was incorrectly predicted may issue and execute multiple times. #9,v,g,n,n,PM_BR_MPRED_CR,Branch mispredictions due to CR bit setting ##230E5 A conditional branch instruction was incorrectly predicted as taken or not taken. The branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. 
This will result in a branch redirect flush if not overridden by a flush of an older instruction. #10,v,g,n,n,PM_BR_MPRED_TA,Branch mispredictions due to target address ##230E6 A branch instruction target was incorrectly predicted. This will result in a branch mispredict flush unless a flush is detected from an older instruction. #11,v,g,n,n,PM_BR_PRED_TA,A conditional branch was predicted, target prediction ##23087,230E3 The target address of a branch instruction was predicted. #12,v,g,n,s,PM_CLB_EMPTY_CYC,Cycles CLB empty ##410C6 Cycles when both threads' CLBs are completely empty. #13,v,g,n,n,PM_CLB_FULL_CYC,Cycles CLB full ##220E5 Cycles when both threads' CLBs are full. #14,v,g,n,n,PM_CMPLU_STALL_DCACHE_MISS,Completion stall caused by D cache miss ##1109A Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered a Data Cache Miss. Data Cache Miss has higher priority than any other Load/Store delay, so if an instruction encounters multiple delays only the Data Cache Miss will be reported and the entire delay period will be charged to Data Cache Miss. This is a subset of PM_CMPLU_STALL_LSU. #15,v,g,n,n,PM_CMPLU_STALL_FDIV,Completion stall caused by FDIV or FSQRT instruction ##1109B Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a floating point divide or square root instruction. This is a subset of PM_CMPLU_STALL_FPU. #16,v,g,n,n,PM_CMPLU_STALL_FXU,Completion stall caused by FXU instruction ##11099 Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a fixed point instruction. #17,v,g,n,n,PM_CMPLU_STALL_LSU,Completion stall caused by LSU instruction ##11098 Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a load/store instruction.
#18,u,g,n,s,PM_CRQ_FULL_CYC,Cycles CR issue queue full ##110C1 The issue queue that feeds the Conditional Register unit is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented. #19,v,g,n,s,PM_CR_MAP_FULL_CYC,Cycles CR logical operation mapper full ##100C4 The Conditional Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #20,v,g,n,s,PM_CYC,Processor cycles ##0000F Processor cycles #21,v,g,n,n,PM_DATA_FROM_L25_MOD,Data loaded from L2.5 modified ##C3097 The processor's Data Cache was reloaded with modified (M) data from the L2 of a chip on the same module as this processor is located due to a demand load. #22,v,g,n,n,PM_DATA_FROM_L35_MOD,Data loaded from L3.5 modified ##C309E The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on the same module as this processor is located due to a demand load. #23,v,g,n,n,PM_DATA_FROM_LMEM,Data loaded from local memory ##C3087 The processor's Data Cache was reloaded from memory attached to the same module this processor is located on. #24,v,g,n,n,PM_DATA_TABLEWALK_CYC,Cycles doing data tablewalks ##800C7 Cycles a translation tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried. #25,u,g,n,s,PM_DC_INV_L2,L1 D cache entries invalidated from L2 ##C10C7 A dcache invalidate was received from the L2 because a line in L2 was castout. #26,v,g,n,n,PM_DC_PREF_OUT_OF_STREAMS,D cache out of prefetch streams ##C50C2 A new prefetch stream was detected but no more stream entries were available. #27,v,g,n,n,PM_DC_PREF_DST,DST (Data Stream Touch) stream start ##830E6 A prefetch stream was started using the DST instruction.
#28,v,g,n,n,PM_DC_PREF_STREAM_ALLOC,D cache new prefetch stream allocated ##830E7 A new Prefetch Stream was allocated. #29,v,g,n,n,PM_DSLB_MISS,Data SLB misses ##800C5 A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve. #30,v,g,n,n,PM_DTLB_MISS,Data TLB misses ##800C4,C20E0 Data TLB misses, all page sizes. #31,v,g,n,n,PM_DTLB_MISS_64K,Data TLB miss for 64K page ##C208D Data TLB references to 64KB pages that missed the TLB. Page size is determined at TLB reload time. #32,v,g,n,n,PM_DTLB_REF,Data TLB references ##C20E4 Total number of Data TLB references for all page sizes. Page size is determined at TLB reload time. #33,v,g,n,n,PM_DTLB_REF_64K,Data TLB reference for 64K page ##C2086 Data TLB references for 64KB pages. Includes hits + misses. #34,v,g,n,n,PM_EE_OFF,Cycles MSR(EE) bit off ##130E3 Cycles MSR(EE) bit was off indicating that interrupts due to external exceptions were masked. #35,u,g,n,n,PM_EE_OFF_EXT_INT,Cycles MSR(EE) bit off and external interrupt pending ##130E7 Cycles when an interrupt due to an external exception is pending but external exceptions were masked. #36,v,g,n,s,PM_FAB_CMD_ISSUED,Fabric command issued ##700C7 Incremented when a chip issues a command on its SnoopA address bus. Each of the two address busses (SnoopA and SnoopB) is capable of one transaction per fabric cycle (one fabric cycle = 2 cpu cycles in normal 2:1 mode), but each chip can only drive the SnoopA bus, and can only drive one transaction every two fabric cycles (i.e., every four cpu cycles). In MCM-based systems, two chips interleave their accesses to each of the two fabric busses (SnoopA, SnoopB) to reach a peak capability of one transaction per cpu clock cycle. The two chips that drive SnoopB are wired so that the chips refer to the bus as SnoopA but it is connected to the other two chips as SnoopB. Note that this event will only be recorded by the FBC on the chip that sourced the operation. 
The signal is delivered at FBC speed and the count must be scaled. #37,v,g,n,n,PM_FAB_CMD_RETRIED,Fabric command retried ##710C7 Incremented when a command issued by a chip on its SnoopA address bus is retried for any reason. The overwhelming majority of retries are due to running out of memory controller queues but retries can also be caused by trying to reference addresses that are in a transient cache state -- e.g. a line is transient after issuing a DCLAIM instruction to a shared line but before the associated store completes. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly. #38,v,g,n,s,PM_FAB_DCLAIM_ISSUED,dclaim issued ##720E7 A DCLAIM command was issued. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly. #39,v,g,n,s,PM_FAB_DCLAIM_RETRIED,dclaim retried ##730E7 A DCLAIM command was retried. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly. #40,v,g,n,s,PM_FAB_HOLDtoNN_EMPTY,Hold buffer to NN empty ##722E7 Fabric cycles when the Next Node out hold-buffers are empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #41,v,g,n,s,PM_FAB_HOLDtoVN_EMPTY,Hold buffer to VN empty ##721E7 Fabric cycles when the Vertical Node out hold-buffers are empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #42,v,g,n,s,PM_FAB_M1toP1_SIDECAR_EMPTY,M1 to P1 sidecar empty ##702C7 Fabric cycles when the Minus-1 hip/hop sidecars (sidecars for chip to chip data transfer) are empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #43,v,g,n,s,PM_FAB_M1toVNorNN_SIDECAR_EMPTY,M1 to VN/NN sidecar empty ##712C7 Fabric cycles when the Minus-1 jump sidecar (sidecars for mcm to mcm data transfer) is empty. The signal is delivered at FBC speed and the count must be scaled accordingly.
#44,v,g,n,s,PM_FAB_P1toM1_SIDECAR_EMPTY,P1 to M1 sidecar empty ##701C7 Fabric cycles when the Plus-1 hip/hop sidecars (sidecars for chip to chip data transfer) are empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #45,v,g,n,s,PM_FAB_P1toVNorNN_SIDECAR_EMPTY,P1 to VN/NN sidecar empty ##711C7 Fabric cycles when the Plus-1 jump sidecar (sidecars for mcm to mcm data transfer) is empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #46,v,g,n,s,PM_FAB_PNtoNN_DIRECT,PN to NN beat went straight to its destination ##703C7 Fabric Data beats that the base chip takes the inbound PN data and passes it through to the outbound NN bus without going into a sidecar. The signal is delivered at FBC speed and the count must be scaled. #47,v,g,n,s,PM_FAB_PNtoNN_SIDECAR,PN to NN beat went to sidecar first ##713C7 Fabric Data beats that the base chip takes the inbound PN data and forwards it on to the outbound NN data bus after going into a sidecar first. The signal is delivered at FBC speed and the count must be scaled. #48,v,g,n,s,PM_FAB_PNtoVN_DIRECT,PN to VN beat went straight to its destination ##723E7 Fabric Data beats that the base chip takes the inbound PN data and passes it through to the outbound VN bus without going into a sidecar. The signal is delivered at FBC speed and the count must be scaled accordingly. #49,v,g,n,s,PM_FAB_PNtoVN_SIDECAR,PN to VN beat went to sidecar first ##733E7 Fabric data beats that the base chip takes the inbound PN data and forwards it on to the outbound VN data bus after going into a sidecar first. The signal is delivered at FBC speed and the count must be scaled accordingly. #50,v,g,n,s,PM_FAB_VBYPASS_EMPTY,Vertical bypass buffer empty ##731E7 Fabric cycles when the Middle Bypass sidecar is empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #51,v,g,n,n,PM_FLUSH,Flushes ##110C7 Flushes occurred including LSU and Branch flushes. 
#52,v,g,n,n,PM_FLUSH_BR_MPRED,Flush caused by branch mispredict ##110C6 A flush was caused by a branch mispredict. #53,v,g,n,s,PM_FLUSH_IMBAL,Flush caused by thread GCT imbalance ##330E3 This thread has been flushed at dispatch because it is stalled and a GCT imbalance exists. GCT thresholds are set in the TSCR register. This allows the other thread to have more machine resources for it to make progress while this thread is stalled. #54,v,g,n,s,PM_FLUSH_SB,Flush caused by scoreboard operation ##330E2 This thread has been flushed at dispatch because its scoreboard bit is set indicating that a non-renamed resource is being updated. This allows the other thread to have more machine resources for it to make progress while this thread is stalled. #55,v,g,n,s,PM_FLUSH_SYNC,Flush caused by sync ##330E1 This thread has been flushed at dispatch due to a sync, lwsync, ptesync, or tlbsync instruction. This allows the other thread to have more machine resources for it to make progress until the sync finishes. #56,v,g,n,s,PM_FPR_MAP_FULL_CYC,Cycles FPR mapper full ##100C1 The Floating Point Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #57,v,g,n,n,PM_FPU0_1FLOP,FPU0 executed add, mult, sub, cmp or sel instruction ##000C3 The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations. #58,v,g,n,n,PM_FPU0_DENORM,FPU0 received denormalized data ##020E0 FPU0 has encountered a denormalized operand. #59,v,g,n,n,PM_FPU0_FDIV,FPU0 executed FDIV instruction ##000C0 FPU0 has executed a divide instruction. This could be fdiv, fdivs, fdiv., fdivs. #60,v,g,n,n,PM_FPU0_FEST,FPU0 executed FEST instruction ##010C2 FPU0 has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ.
#61,v,g,n,n,PM_FPU0_FIN,FPU0 produced a result ##010C3 FPU0 finished, produced a result. This only indicates finish, not completion. Floating Point Stores are included in this count but not Floating Point Loads. #62,v,g,n,n,PM_FPU0_FMA,FPU0 executed multiply-add instruction ##000C1 The floating point unit has executed a multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #63,v,g,n,n,PM_FPU0_FMOV_FEST,FPU0 executed FMOV or FEST instructions ##010C0 FPU0 has executed a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ. #64,v,g,n,n,PM_FPU0_FPSCR,FPU0 executed FPSCR instruction ##030E0 FPU0 has executed FPSCR move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*, mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs. #65,v,g,n,n,PM_FPU0_FRSP_FCONV,FPU0 executed FRSP or FCONV instructions ##010C1 FPU0 has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #66,v,g,n,n,PM_FPU0_FSQRT,FPU0 executed FSQRT instruction ##000C2 FPU0 has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #67,v,g,n,s,PM_FPU0_FULL_CYC,Cycles FPU0 issue queue full ##100C3 The issue queue for FPU0 cannot accept any more instructions. Dispatch to this issue queue is stopped. #68,v,g,n,n,PM_FPU0_SINGLE,FPU0 executed single precision instruction ##020E3 FPU0 has executed a single precision instruction. #69,v,g,n,n,PM_FPU0_STALL3,FPU0 stalled in pipe3 ##020E1 FPU0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). #70,v,g,n,n,PM_FPU0_STF,FPU0 executed store instruction ##020E2 FPU0 has executed a Floating Point Store instruction.
#71,v,g,n,n,PM_FPU1_1FLOP,FPU1 executed add, mult, sub, cmp or sel instruction ##000C7 The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations. #72,v,g,n,n,PM_FPU1_DENORM,FPU1 received denormalized data ##020E4 FPU1 has encountered a denormalized operand. #73,v,g,n,n,PM_FPU1_FDIV,FPU1 executed FDIV instruction ##000C4 FPU1 has executed a divide instruction. This could be fdiv, fdivs, fdiv., fdivs. #74,v,g,n,n,PM_FPU1_FEST,FPU1 executed FEST instruction ##010C6 FPU1 has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #75,v,g,n,n,PM_FPU1_FIN,FPU1 produced a result ##010C7 FPU1 finished, produced a result. This only indicates finish, not completion. Floating Point Stores are included in this count but not Floating Point Loads. #76,v,g,n,n,PM_FPU1_FMA,FPU1 executed multiply-add instruction ##000C5 The floating point unit has executed a multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #77,v,g,n,n,PM_FPU1_FMOV_FEST,FPU1 executed FMOV or FEST instructions ##010C4 FPU1 has executed a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ. #78,v,g,n,n,PM_FPU1_FRSP_FCONV,FPU1 executed FRSP or FCONV instructions ##010C5 FPU1 has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #79,v,g,n,n,PM_FPU1_FSQRT,FPU1 executed FSQRT instruction ##000C6 FPU1 has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #80,v,g,n,s,PM_FPU1_FULL_CYC,Cycles FPU1 issue queue full ##100C7 The issue queue for FPU1 cannot accept any more instructions.
Dispatch to this issue queue is stopped #81,v,g,n,n,PM_FPU1_SINGLE,FPU1 executed single precision instruction ##020E7 FPU1 has executed a single precision instruction. #82,v,g,n,n,PM_FPU1_STALL3,FPU1 stalled in pipe3 ##020E5 FPU1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). #83,v,g,n,n,PM_FPU1_STF,FPU1 executed store instruction ##020E6 FPU1 has executed a Floating Point Store instruction. #84,v,g,n,n,PM_FPU_FMA,FPU executed multiply-add instruction ##00088 This signal is active for one cycle when FPU is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1. #85,v,g,n,n,PM_FPU_FRSP_FCONV,FPU executed FRSP or FCONV instructions ##010A8 The floating point unit has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1. #86,v,g,n,n,PM_FPU_FSQRT,FPU executed FSQRT instruction ##00090 The floating point unit has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1. #87,v,g,n,n,PM_FPU_STALL3,FPU stalled in pipe3 ##02088 FPU has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. Combined Unit 0 + Unit 1. #88,v,g,n,n,PM_FPU_STF,FPU executed store instruction ##02090 FPU has executed a store instruction. Combined Unit 0 + Unit 1. #89,v,g,n,s,PM_FXLS0_FULL_CYC,Cycles FXU0/LS0 queue full ##110C0 The issue queue that feeds the Fixed Point unit 0 / Load Store Unit 0 is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented. 
#90,v,g,n,s,PM_FXLS1_FULL_CYC,Cycles FXU1/LS1 queue full ##110C4 The issue queue that feeds the Fixed Point unit 1 / Load Store Unit 1 is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented. #91,v,g,n,n,PM_FXU0_FIN,FXU0 produced a result ##130E2 The Fixed Point unit 0 finished an instruction and produced a result. Instructions that finish may not necessarily complete. #92,v,g,n,n,PM_FXU1_FIN,FXU1 produced a result ##130E6 The Fixed Point unit 1 finished an instruction and produced a result. Instructions that finish may not necessarily complete. #93,u,g,n,n,PM_FXU_BUSY,FXU busy ##00012 Cycles when both FXU0 and FXU1 are busy. #94,v,g,n,n,PM_MRK_FXU_FIN,Marked instruction FXU processing finished ##00014 One of the Fixed Point Units finished a marked instruction. Instructions that finish may not necessarily complete. #95,v,g,n,s,PM_GCT_EMPTY_CYC,Cycles GCT empty ##00004 The Global Completion Table is completely empty. #96,v,g,n,n,PM_GCT_FULL_CYC,Cycles GCT full ##100C0 The Global Completion Table is completely full. #97,v,g,n,n,PM_GCT_NOSLOT_IC_MISS,No slot in GCT caused by I cache miss ##1009C Cycles when the Global Completion Table has no slots from this thread because of an Instruction Cache miss. #98,v,g,n,s,PM_GCT_USAGE_60to79_CYC,Cycles GCT 60-79% full ##0001F Cycles when the Global Completion Table has between 60% and 79% of its slots used. The GCT has 20 entries shared between threads. #99,v,g,n,s,PM_GPR_MAP_FULL_CYC,Cycles GPR mapper full ##130E5 The General Purpose Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #100,v,g,n,n,PM_GRP_BR_REDIR,Group experienced branch redirect ##120E6 Number of groups, counted at dispatch, that have encountered a branch redirect.
Every group constructed from a fetch group that has been redirected will count. #101,c,g,n,n,PM_GRP_IC_MISS_BR_REDIR_NONSPEC,Group experienced non-speculative I cache miss or branch redirect ##120E5 Group experienced non-speculative I cache miss or branch redirect #102,v,g,n,n,PM_GRP_DISP,Group dispatches ##00002 A group was dispatched #103,v,g,n,n,PM_GRP_DISP_BLK_SB_CYC,Cycles group dispatch blocked by scoreboard ##130E1 A scoreboard operation on a non-renamed resource has blocked dispatch. #104,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected ##120E4 A group that previously attempted dispatch was rejected. #105,v,g,n,n,PM_GRP_DISP_VALID,Group dispatch valid ##120E3 A group is available for dispatch. This does not mean it was successfully dispatched. #106,v,g,n,n,PM_GRP_IC_MISS,Group experienced I cache miss ##120E7 Number of groups, counted at dispatch, that have encountered an icache miss redirect. Every group constructed from a fetch group that missed the instruction cache will count. #107,v,g,n,n,PM_HV_CYC,Hypervisor Cycles ##0000B Cycles when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0) #108,v,g,n,n,PM_IC_DEMAND_L2_BHT_REDIRECT,L2 I cache demand request due to BHT redirect ##230E0 A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (CR mispredict). #109,v,g,n,n,PM_IC_DEMAND_L2_BR_REDIRECT,L2 I cache demand request due to branch redirect ##230E1 A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (either ALL mispredicted or Target). #110,v,g,n,n,PM_IC_PREF_REQ,Instruction prefetch requests ##220E6 An instruction prefetch request has been made. #111,v,g,n,n,PM_IERAT_XLATE_WR,Translation written to ierat ##220E7 An entry was written into the IERAT as a result of an IERAT miss. This event can be used to count IERAT misses. 
ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed. #112,v,g,n,n,PM_IERAT_XLATE_WR_LP,Large page translation written to ierat ##210C6 An entry was written into the IERAT as a result of an IERAT miss. This event can be used to count IERAT misses. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed. #113,v,g,n,n,PM_IOPS_CMPL,Internal operations completed ##00001 Number of internal operations that completed. #114,v,g,n,n,PM_INST_DISP_ATTEMPT,Instructions dispatch attempted ##120E1 Number of PowerPC Instructions dispatched (attempted, not filtered by success). #115,v,g,n,n,PM_INST_FETCH_CYC,Cycles at least 1 instruction fetched ##220E4 Cycles when at least one instruction was sent from the fetch unit to the decode unit. #116,v,g,n,n,PM_INST_FROM_L1,Instruction fetched from L1 ##2208D An instruction fetch group was fetched from L1. Fetch Groups can contain up to 8 instructions. #117,v,g,n,n,PM_INST_FROM_L25_MOD,Instruction fetched from L2.5 modified ##22096 An instruction fetch group was fetched with modified (M) data from the L2 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions. #118,v,g,n,n,PM_INST_FROM_L35_MOD,Instruction fetched from L3.5 modified ##2209D An instruction fetch group was fetched with modified (M) data from the L3 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions. #119,v,g,n,n,PM_INST_FROM_LMEM,Instruction fetched from local memory ##22086 An instruction fetch group was fetched from memory attached to the same module this processor is located on.
Fetch groups can contain up to 8 instructions. #120,u,g,n,n,PM_ISLB_MISS,Instruction SLB misses ##800C1 A SLB miss for an instruction fetch has occurred. #121,v,g,n,n,PM_ITLB_MISS,Instruction TLB misses ##800C0 A TLB miss for an Instruction Fetch has occurred. #122,v,g,n,n,PM_L1_DCACHE_RELOAD_VALID,L1 reload data source valid ##C30E4 The data source information is valid, the data cache has been reloaded. Prior to POWER5+ this included data cache reloads due to prefetch activity. With POWER5+ this now only includes reloads due to demand loads. #123,v,g,n,n,PM_L1_PREF,L1 cache data prefetches ##C70E7 A request to prefetch data into the L1 was made. #124,v,g,n,n,PM_L1_WRITE_CYC,Cycles writing to instruction L1 ##230E7 Cycles that a cache line was written to the instruction cache. #125,v,g,n,s,PM_L2SA_MOD_INV,L2 slice A transition from modified to invalid ##730E0 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C. #126,v,g,n,s,PM_L2SA_MOD_TAG,L2 slice A transition from modified to tagged ##720E0 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C. #127,v,g,n,s,PM_L2SA_RCLD_DISP,L2 slice A RC load dispatch attempt ##701C0 A Read/Claim dispatch for a Load was attempted. #128,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_ADDR,L2 slice A RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ ##711C0 A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.
#129,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_OTHER,L2 slice A RC load dispatch attempt failed due to other reasons ##731E0 A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions. #130,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_RC_FULL,L2 slice A RC load dispatch attempt failed due to all RC full ##721E0 A Read/Claim dispatch for a load failed because all RC machines are busy. #131,v,g,n,s,PM_L2SA_RCST_DISP,L2 slice A RC store dispatch attempt ##702C0 A Read/Claim dispatch for a Store was attempted. #132,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_ADDR,L2 slice A RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ ##712C0 A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time. #133,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_OTHER,L2 slice A RC store dispatch attempt failed due to other reasons ##732E0 A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted. #134,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_RC_FULL,L2 slice A RC store dispatch attempt failed due to all RC full ##722E0 A Read/Claim dispatch for a store failed because all RC machines are busy. #135,v,g,n,s,PM_L2SA_RC_DISP_FAIL_CO_BUSY,L2 slice A RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy ##703C0 A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss (re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access.
Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access. #136,v,g,n,s,PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL,L2 slice A RC dispatch attempt failed due to all CO busy ##713C0 A Read/Claim dispatch was rejected because all Castout machines were busy. #137,v,g,n,s,PM_L2SA_SHR_INV,L2 slice A transition from shared to invalid ##710C0 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #138,v,g,n,s,PM_L2SA_SHR_MOD,L2 slice A transition from shared to modified ##700C0 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C. #139,v,g,n,n,PM_L2SA_ST_HIT,L2 slice A store hits ##733E0 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A, B, and C. #140,v,g,n,n,PM_L2SA_ST_REQ,L2 slice A store requests ##723E0 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C. #141,v,g,n,s,PM_L2SB_MOD_INV,L2 slice B transition from modified to invalid ##730E1 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C. 
#142,v,g,n,s,PM_L2SB_MOD_TAG,L2 slice B transition from modified to tagged ##720E1 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C. #143,v,g,n,s,PM_L2SB_RCLD_DISP,L2 slice B RC load dispatch attempt ##701C1 A Read/Claim dispatch for a Load was attempted #144,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_ADDR,L2 slice B RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ ##711C1 A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time. #145,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_OTHER,L2 slice B RC load dispatch attempt failed due to other reasons ##731E1 A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions. #146,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_RC_FULL,L2 slice B RC load dispatch attempt failed due to all RC full ##721E1 A Read/Claim dispatch for a load failed because all RC machines are busy. #147,v,g,n,s,PM_L2SB_RCST_DISP,L2 slice B RC store dispatch attempt ##702C1 A Read/Claim dispatch for a Store was attempted. #148,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_ADDR,L2 slice B RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ ##712C1 A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time. #149,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_OTHER,L2 slice B RC store dispatch attempt failed due to other reasons ##732E1 A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted. 
#150,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_RC_FULL,L2 slice B RC store dispatch attempt failed due to all RC full ##722E1 A Read/Claim dispatch for a store failed because all RC machines are busy. #151,v,g,n,s,PM_L2SB_RC_DISP_FAIL_CO_BUSY,L2 slice B RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy ##703C1 A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss (re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access. #152,v,g,n,s,PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL,L2 slice B RC dispatch attempt failed due to all CO busy ##713C1 A Read/Claim dispatch was rejected because all Castout machines were busy. #153,v,g,n,s,PM_L2SB_SHR_INV,L2 slice B transition from shared to invalid ##710C1 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #154,v,g,n,s,PM_L2SB_SHR_MOD,L2 slice B transition from shared to modified ##700C1 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C. 
#155,v,g,n,n,PM_L2SB_ST_HIT,L2 slice B store hits ##733E1 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A, B and C. #156,v,g,n,n,PM_L2SB_ST_REQ,L2 slice B store requests ##723E1 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C. #157,v,g,n,s,PM_L2SC_MOD_INV,L2 slice C transition from modified to invalid ##730E2 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C. #158,v,g,n,s,PM_L2SC_MOD_TAG,L2 slice C transition from modified to tagged ##720E2 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C. #159,v,g,n,s,PM_L2SC_RCLD_DISP,L2 slice C RC load dispatch attempt ##701C2 A Read/Claim dispatch for a Load was attempted #160,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_ADDR,L2 slice C RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ ##711C2 A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time. #161,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_OTHER,L2 slice C RC load dispatch attempt failed due to other reasons ##731E2 A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions. #162,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_RC_FULL,L2 slice C RC load dispatch attempt failed due to all RC full ##721E2 A Read/Claim dispatch for a load failed because all RC machines are busy. 
#163,v,g,n,s,PM_L2SC_RCST_DISP,L2 slice C RC store dispatch attempt ##702C2 A Read/Claim dispatch for a Store was attempted. #164,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_ADDR,L2 slice C RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ ##712C2 A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time. #165,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_OTHER,L2 slice C RC store dispatch attempt failed due to other reasons ##732E2 A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted. #166,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_RC_FULL,L2 slice C RC store dispatch attempt failed due to all RC full ##722E2 A Read/Claim dispatch for a store failed because all RC machines are busy. #167,v,g,n,s,PM_L2SC_RC_DISP_FAIL_CO_BUSY,L2 slice C RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy ##703C2 A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss (re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access. #168,v,g,n,s,PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL,L2 slice C RC dispatch attempt failed due to all CO busy ##713C2 A Read/Claim dispatch was rejected because all Castout machines were busy. 
#169,v,g,n,s,PM_L2SC_SHR_INV,L2 slice C transition from shared to invalid ##710C2 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #170,v,g,n,s,PM_L2SC_SHR_MOD,L2 slice C transition from shared to modified ##700C2 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C. #171,v,g,n,n,PM_L2SC_ST_HIT,L2 slice C store hits ##733E2 A store request made from the core hit in the L2 directory. The event is provided on each of the three slices A, B, and C. #172,v,g,n,n,PM_L2SC_ST_REQ,L2 slice C store requests ##723E2 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C. #173,v,g,n,n,PM_L2_PREF,L2 cache prefetches ##C50C3 A request to prefetch data into L2 was made #174,v,g,n,s,PM_L3SA_ALL_BUSY,L3 slice A active for every cycle all CI/CO machines busy ##721E3 Cycles All Castin/Castout machines are busy. #175,v,g,n,s,PM_L3SA_HIT,L3 slice A hits ##711C3 Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice #176,v,g,n,s,PM_L3SA_MOD_INV,L3 slice A transition from modified to invalid ##730E3 L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I). Mu|Me are not included since they are formed due to a prev read op. Tx is not included since it is considered shared at this point. 
#177,v,g,n,s,PM_L3SA_MOD_TAG,L3 slice A transition from modified to TAG ##720E3 L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. L3 going M->T or M->I(go_Mu case); Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point. #178,v,g,n,s,PM_L3SA_REF,L3 slice A references ##701C3 Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice #179,v,g,n,s,PM_L3SA_SHR_INV,L3 slice A transition from shared to invalid ##710C3 L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched). #180,v,g,n,s,PM_L3SA_SNOOP_RETRY,L3 slice A snoop retries ##731E3 Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b) #181,v,g,n,s,PM_L3SB_ALL_BUSY,L3 slice B active for every cycle all CI/CO machines busy ##721E4 Cycles All Castin/Castout machines are busy. #182,v,g,n,s,PM_L3SB_HIT,L3 slice B hits ##711C4 Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice #183,v,g,n,s,PM_L3SB_MOD_INV,L3 slice B transition from modified to invalid ##730E4 L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I). Mu|Me are not included since they are formed due to a prev read op. Tx is not included since it is considered shared at this point. #184,v,g,n,s,PM_L3SB_MOD_TAG,L3 slice B transition from modified to TAG ##720E4 L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. L3 going M->T or M->I(go_Mu case); Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point. #185,v,g,n,s,PM_L3SB_REF,L3 slice B references ##701C4 Number of attempts made by this chip's cores to find data in the L3. 
Reported per L3 slice #186,v,g,n,s,PM_L3SB_SHR_INV,L3 slice B transition from shared to invalid ##710C4 L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched). #187,v,g,n,s,PM_L3SB_SNOOP_RETRY,L3 slice B snoop retries ##731E4 Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b) #188,v,g,n,s,PM_L3SC_ALL_BUSY,L3 slice C active for every cycle all CI/CO machines busy ##721E5 Cycles All Castin/Castout machines are busy. #189,v,g,n,s,PM_L3SC_HIT,L3 slice C hits ##711C5 Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice #190,v,g,n,s,PM_L3SC_MOD_INV,L3 slice C transition from modified to invalid ##730E5 L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I). Mu|Me are not included since they are formed due to a previous read op. Tx is not included since it is considered shared at this point. #191,v,g,n,s,PM_L3SC_MOD_TAG,L3 slice C transition from modified to TAG ##720E5 L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. L3 going M->T or M->I(go_Mu case); Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point. #192,v,g,n,s,PM_L3SC_REF,L3 slice C references ##701C5 Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice. #193,v,g,n,s,PM_L3SC_SHR_INV,L3 slice C transition from shared to invalid ##710C5 L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched). 
#194,v,g,n,s,PM_L3SC_SNOOP_RETRY,L3 slice C snoop retries ##731E5 Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b) #195,v,g,n,n,PM_LARX_LSU0,Larx executed on LSU0 ##820E7 A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0) #196,v,g,n,n,PM_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##C10C2 Load references that miss the Level 1 Data cache, by unit 0. #197,v,g,n,n,PM_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##C10C5 Load references that miss the Level 1 Data cache, by unit 1. #198,v,g,n,n,PM_LD_REF_L1_LSU0,LSU0 L1 D cache load references ##C10C0 Load references to Level 1 Data Cache, by unit 0. #199,v,g,n,n,PM_LD_REF_L1_LSU1,LSU1 L1 D cache load references ##C10C4 Load references to Level 1 Data Cache, by unit 1. #200,u,g,n,s,PM_LR_CTR_MAP_FULL_CYC,Cycles LR/CTR mapper full ##100C6 The LR/CTR mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #201,v,g,n,n,PM_LSU0_BUSY_REJECT,LSU0 busy due to reject ##C20E1 Total cycles the Load Store Unit 0 is busy rejecting instructions. #202,v,g,n,n,PM_LSU0_DERAT_MISS,LSU0 DERAT misses ##800C2 Total D-ERAT Misses by LSU0. Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction. #203,v,g,n,n,PM_LSU0_FLUSH_LRQ,LSU0 LRQ flushes ##C00C2 A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #204,u,g,n,n,PM_LSU0_FLUSH_SRQ,LSU0 SRQ lhs flushes ##C00C3 A store was flushed by unit 0 because a younger load hits an older store that is already in the SRQ or in the same group. 
#205,v,g,n,n,PM_LSU0_FLUSH_ULD,LSU0 unaligned load flushes ##C00C0 A load was flushed from unit 0 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1) #206,v,g,n,n,PM_LSU0_FLUSH_UST,LSU0 unaligned store flushes ##C00C1 A store was flushed from unit 0 because it was unaligned (crossed a 4K boundary). #207,v,g,n,n,PM_LSU0_LDF,LSU0 executed Floating Point load instruction ##C50C0 A floating point load was executed by LSU0 #208,v,g,n,n,PM_LSU0_NCLD,LSU0 non-cacheable loads ##C50C1 A non-cacheable load was executed by unit 0. #209,v,g,n,n,PM_LSU0_REJECT_ERAT_MISS,LSU0 reject due to ERAT miss ##C40C3 Total cycles the Load Store Unit 0 is busy rejecting instructions due to an ERAT miss. Requests that miss the Derat are rejected and retried until the request hits in the Erat. #210,v,g,n,n,PM_LSU0_REJECT_LMQ_FULL,LSU0 reject due to LMQ full or missed data coming ##C40C1 Total cycles the Load Store Unit 0 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all eight entries are full, subsequent load instructions are rejected. #211,v,g,n,n,PM_LSU0_REJECT_RELOAD_CDF,LSU0 reject due to reload CDF or tag update collision ##C40C2 Total cycles the Load Store Unit 0 is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same cycle when they are being updated. #212,v,g,n,n,PM_LSU0_REJECT_SRQ,LSU0 SRQ lhs rejects ##C40C0 Total cycles the Load Store Unit 0 is busy rejecting instructions because of Load Hit Store conditions. 
Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue. #213,u,g,n,n,PM_LSU0_SRQ_STFWD,LSU0 SRQ store forwarded ##C60E1 Data from a store instruction was forwarded to a load on unit 0. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load hits L1 but becomes a store forward, it is not treated as a load miss. #214,v,g,n,n,PM_LSU1_BUSY_REJECT,LSU1 busy due to reject ##C20E5 Total cycles the Load Store Unit 1 is busy rejecting instructions. #215,v,g,n,n,PM_LSU1_DERAT_MISS,LSU1 DERAT misses ##800C6 A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #216,v,g,n,n,PM_LSU1_FLUSH_LRQ,LSU1 LRQ flushes ##C00C6 A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #217,u,g,n,n,PM_LSU1_FLUSH_SRQ,LSU1 SRQ lhs flushes ##C00C7 A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. #218,v,g,n,n,PM_LSU1_FLUSH_ULD,LSU1 unaligned load flushes ##C00C4 A load was flushed from unit 1 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1). 
#219,u,g,n,n,PM_LSU1_FLUSH_UST,LSU1 unaligned store flushes ##C00C5 A store was flushed from unit 1 because it was unaligned (crossed a 4K boundary) #220,v,g,n,n,PM_LSU1_LDF,LSU1 executed Floating Point load instruction ##C50C4 A floating point load was executed by LSU1 #221,v,g,n,n,PM_LSU1_NCLD,LSU1 non-cacheable loads ##C50C5 A non-cacheable load was executed by unit 1. #222,v,g,n,n,PM_LSU1_REJECT_ERAT_MISS,LSU1 reject due to ERAT miss ##C40C7 Total cycles the Load Store Unit 1 is busy rejecting instructions due to an ERAT miss. Requests that miss the Derat are rejected and retried until the request hits in the Erat. #223,v,g,n,n,PM_LSU1_REJECT_LMQ_FULL,LSU1 reject due to LMQ full or missed data coming ##C40C5 Total cycles the Load Store Unit 1 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all eight entries are full, subsequent load instructions are rejected. #224,v,g,n,n,PM_LSU1_REJECT_RELOAD_CDF,LSU1 reject due to reload CDF or tag update collision ##C40C6 Total cycles the Load Store Unit 1 is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same cycle when they are being updated. #225,v,g,n,n,PM_LSU1_REJECT_SRQ,LSU1 SRQ lhs rejects ##C40C4 Total cycles the Load Store Unit 1 is busy rejecting instructions because of Load Hit Store conditions. Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue. 
#226,u,g,n,n,PM_LSU1_SRQ_STFWD,LSU1 SRQ store forwarded ##C60E5 Data from a store instruction was forwarded to a load on unit 1. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load hits L1 but becomes a store forward, it is not treated as a load miss. #227,v,g,n,n,PM_LSU_BUSY_REJECT,LSU busy due to reject ##C2088 Total cycles the Load Store Unit is busy rejecting instructions. Combined unit 0 + 1. #228,v,g,n,n,PM_LSU_DERAT_MISS,DERAT misses ##80090 Total D-ERAT Misses. Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction. Combined Unit 0 + 1. #229,v,g,n,n,PM_LSU_FLUSH,Flush initiated by LSU ##110C5 A flush was initiated by the Load Store Unit #230,v,g,n,n,PM_LSU_FLUSH_LRQ,LRQ flushes ##C0090 A load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. Combined Units 0 and 1. #231,v,g,n,s,PM_LSU_FLUSH_LRQ_FULL,Flush caused by LRQ full ##320E7 This thread was flushed at dispatch because its Load Request Queue was full. This allows the other thread to have more machine resources for it to make progress while this thread is stalled. #232,v,g,n,s,PM_LSU_FLUSH_SRQ_FULL,Flush caused by SRQ full ##330E0 This thread was flushed at dispatch because its Store Request Queue was full. This allows the other thread to have more machine resources for it to make progress while this thread is stalled. #233,v,g,n,n,PM_LSU_FLUSH_UST,SRQ unaligned store flushes ##C0088 A store was flushed because it was unaligned (crossed a 4K boundary). Combined Unit 0 + 1. #234,u,g,n,s,PM_LSU_LMQ_FULL_CYC,Cycles LMQ full ##C30E7 The Load Miss Queue was full. 
#235,v,g,n,n,PM_LSU_LMQ_LHR_MERGE,LMQ LHR merges ##C70E5 A data cache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry. #236,v,g,n,s,PM_LSU_LMQ_S0_ALLOC,LMQ slot 0 allocated ##C30E6 The first entry in the LMQ was allocated. #237,v,g,n,n,PM_LSU_LMQ_S0_VALID,LMQ slot 0 valid ##C30E5 This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ has eight entries that are allocated FIFO #238,u,g,n,n,PM_LSU_LMQ_SRQ_EMPTY_CYC,Cycles LMQ and SRQ empty ##00015 Cycles when both the LMQ and SRQ are empty (LSU is idle) #239,v,g,n,s,PM_LSU_LRQ_FULL_CYC,Cycles LRQ full ##110C2 Cycles when the LRQ is full. #240,v,g,n,n,PM_LSU_LRQ_S0_ALLOC,LRQ slot 0 allocated ##C60E7 LRQ slot zero was allocated #241,v,g,n,n,PM_LSU_LRQ_S0_VALID,LRQ slot 0 valid ##C60E6 This signal is asserted every cycle that the Load Request Queue slot zero is valid. The LRQ is 32 entries long and is allocated round-robin. In SMT mode the LRQ is split between the two threads (16 entries each). #242,v,g,n,n,PM_LSU_REJECT_LMQ_FULL,LSU reject due to LMQ full or missed data coming ##C4088 Total cycles the Load Store Unit is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all eight entries are full, subsequent load instructions are rejected. Combined unit 0 + 1. #243,v,g,n,n,PM_LSU_REJECT_RELOAD_CDF,LSU reject due to reload CDF or tag update collision ##C4090 Total cycles the Load Store Unit is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same cycle when they are being updated. 
Combined Unit 0 + 1. #244,v,g,n,s,PM_LSU_SRQ_FULL_CYC,Cycles SRQ full ##110C3 Cycles the Store Request Queue is full. #245,v,g,n,n,PM_LSU_SRQ_S0_ALLOC,SRQ slot 0 allocated ##C20E7 SRQ Slot zero was allocated #246,v,g,n,n,PM_LSU_SRQ_S0_VALID,SRQ slot 0 valid ##C20E6 This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. In SMT mode the SRQ is split between the two threads (16 entries each). #247,c,g,n,n,PM_LSU_SRQ_STFWD,SRQ store forwarded ##C6088 Data from a store instruction was forwarded to a load. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load hits L1 but becomes a store forward, it is not treated as a load miss. Combined Unit 0 + 1. #248,u,g,n,n,PM_LSU_SRQ_SYNC_CYC,SRQ sync duration ##830E5 Cycles that a sync instruction is active in the Store Request Queue. #249,v,g,n,n,PM_LWSYNC_HELD,LWSYNC held at dispatch ##130E0 Cycles a LWSYNC instruction was held at dispatch. LWSYNC instructions are held at dispatch until all previous loads are done and all previous stores have issued. LWSYNC enters the Store Request Queue and is sent to the storage subsystem but does not wait for a response. #250,v,g,n,n,PM_MEM_PWQ_DISP_Q2or3,Memory partial-write queue dispatched to Write Queue 2 or 3 ##734E6 Memory partial-write queue dispatched to Write Queue 2 or 3. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #251,v,g,n,n,PM_IC_PREF_INSTALL,Instruction prefetched installed in prefetch buffer ##210C7 A prefetch buffer entry (line) is allocated but the request is not a demand fetch. #252,v,g,n,s,PM_MEM_HI_PRIO_WR_CMPL,High priority write completed ##726E6 A memory write, which was upgraded to high priority, completed. Writes can be upgraded to high priority to ensure that read traffic does not lock out writes. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly. #253,v,g,n,s,PM_MEM_NONSPEC_RD_CANCEL,Non speculative memory read cancelled ##711C6 A non-speculative read was cancelled because the combined response indicated it was sourced from another L2 or L3. This event is sent from the Memory Controller clock domain and must be scaled accordingly #254,v,g,n,s,PM_MEM_LO_PRIO_WR_CMPL,Low priority write completed ##736E6 A memory write, which was not upgraded to high priority, completed. This event is sent from the Memory Controller clock domain and must be scaled accordingly #255,v,g,n,s,PM_MEM_PWQ_DISP,Memory partial-write queue dispatched ##704C6 Number of Partial Writes dispatched. The MC provides resources to gather partial cacheline writes (Partial line DMA writes & CI-stores) to up to four different cachelines at a time. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #256,v,g,n,n,PM_MEM_RQ_DISP_Q0to3,Memory read queue dispatched to queues 0-3 ##702C6 A memory operation was dispatched to read queue 0,1,2, or 3. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #257,v,g,n,s,PM_MEM_PW_CMPL,Memory partial-write completed ##724E6 Number of Partial Writes completed. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #258,v,g,n,s,PM_MEM_PW_GATH,Memory partial-write gathered ##714C6 Two or more partial-writes have been merged into a single memory write. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #259,v,g,n,n,PM_MEM_RQ_DISP_Q4to7,Memory read queue dispatched to queues 4-7 ##712C6 A memory operation was dispatched to read queue 4,5,6 or 7. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #260,v,g,n,s,PM_MEM_RQ_DISP,Memory read queue dispatched ##701C6 A memory read was dispatched. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly. #261,v,g,n,n,PM_MEM_RQ_DISP_Q8to11,Memory read queue dispatched to queues 8-11 ##722E6 A memory operation was dispatched to read queue 8,9,10 or 11. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #262,v,g,n,n,PM_MEM_RQ_DISP_Q12to15,Memory read queue dispatched to queues 12-15 ##732E6 A memory operation was dispatched to read queue 12,13,14 or 15. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #263,v,g,n,s,PM_MEM_SPEC_RD_CANCEL,Speculative memory read cancelled ##721E6 Speculative memory read cancelled (i.e. cresp = sourced by L2/L3) #264,v,g,n,n,PM_MEM_WQ_DISP_Q0to7,Memory write queue dispatched to queues 0-7 ##723E6 A memory operation was dispatched to a write queue in the range between 0 and 7. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #265,v,g,n,n,PM_MEM_WQ_DISP_Q8to15,Memory write queue dispatched to queues 8-15 ##733E6 A memory operation was dispatched to a write queue in the range between 8 and 15. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #266,v,g,n,s,PM_MEM_WQ_DISP_DCLAIM,Memory write queue dispatched due to dclaim/flush ##713C6 A memory dclaim or flush operation was dispatched to a write queue. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #267,v,g,n,s,PM_MEM_WQ_DISP_WRITE,Memory write queue dispatched due to write ##703C6 A memory write was dispatched to a write queue. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #268,v,g,n,n,PM_MRK_BRU_FIN,Marked instruction BRU processing finished ##00005 The branch unit finished a marked instruction. Instructions that finish may not necessarily complete. 
#269,v,g,n,n,PM_MRK_DATA_FROM_L25_MOD,Marked data loaded from L2.5 modified ##C7097 The processor's Data Cache was reloaded with modified (M) data from the L2 of a chip on the same module as this processor is located due to a marked load. #270,v,g,n,n,PM_MRK_DATA_FROM_L25_SHR_CYC,Marked load latency from L2.5 shared ##C70A2 Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level. #271,v,g,n,n,PM_MRK_DATA_FROM_L275_SHR_CYC,Marked load latency from L2.75 shared ##C70A3 Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level. #272,v,g,n,n,PM_MRK_DATA_FROM_L2_CYC,Marked load latency from L2 ##C70A0 Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level. #273,v,g,n,n,PM_MRK_DATA_FROM_L35_MOD,Marked data loaded from L3.5 modified ##C709E The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on the same module as this processor is located due to a marked load. #274,v,g,n,n,PM_MRK_DATA_FROM_L35_SHR_CYC,Marked load latency from L3.5 shared ##C70A6 Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level. 
#275,v,g,n,n,PM_MRK_DATA_FROM_L375_SHR_CYC,Marked load latency from L3.75 shared ##C70A7 Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level. #276,v,g,n,n,PM_MRK_DATA_FROM_L3_CYC,Marked load latency from L3 ##C70A4 Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level. #277,v,g,n,n,PM_MRK_DATA_FROM_LMEM,Marked data loaded from local memory ##C7087 The processor's Data Cache was reloaded due to a marked load from memory attached to the same module this processor is located on. #278,v,g,n,n,PM_MRK_DSLB_MISS,Marked Data SLB misses ##C50C7 A Data SLB miss was caused by a marked instruction. #279,v,g,n,n,PM_MRK_DTLB_MISS,Marked Data TLB misses ##C50C6,C60E0 Data TLB references by a marked instruction that missed the TLB (all page sizes). #280,v,g,n,n,PM_MRK_DTLB_MISS_64K,Marked Data TLB misses for 64K page ##C608D Data TLB references to 64KB pages by a marked instruction that missed the TLB. Page size is determined at TLB reload time. #281,v,g,n,n,PM_MRK_DTLB_REF,Marked Data TLB reference ##C60E4 Total number of Data TLB references by a marked instruction for all page sizes. Page size is determined at TLB reload time. #282,v,g,n,n,PM_MRK_DTLB_REF_64K,Marked Data TLB reference for 64K page ##C6086 Data TLB references by a marked instruction for 64KB pages. #283,v,g,n,n,PM_MRK_GRP_BR_REDIR,Group experienced marked branch redirect ##12091 A group containing a marked (sampled) instruction experienced a branch redirect.
#284,v,g,n,n,PM_MRK_IMR_RELOAD,Marked IMR reloaded ##820E2 A DL1 reload occurred due to a marked load. #285,v,g,n,n,PM_MRK_L1_RELOAD_VALID,Marked L1 reload data source valid ##C70E4 The source information is valid and is for a marked load. #286,v,g,n,n,PM_MRK_LD_MISS_L1_LSU0,LSU0 marked L1 D cache load misses ##820E0 Load references that miss the Level 1 Data cache, by LSU0. #287,v,g,n,n,PM_MRK_LD_MISS_L1_LSU1,LSU1 marked L1 D cache load misses ##820E4 Load references that miss the Level 1 Data cache, by LSU1. #288,v,g,n,n,PM_MRK_LSU0_FLUSH_LRQ,LSU0 marked LRQ flushes ##810C2 A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #289,u,g,n,n,PM_MRK_LSU0_FLUSH_SRQ,LSU0 marked SRQ lhs flushes ##810C3 A marked store was flushed because a younger load hit an older store that was already in the SRQ or in the same group. #290,v,g,n,n,PM_MRK_LSU0_FLUSH_ULD,LSU0 marked unaligned load flushes ##810C1 A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #291,v,g,n,n,PM_MRK_LSU0_FLUSH_UST,LSU0 marked unaligned store flushes ##810C0 A marked store was flushed from unit 0 because it was unaligned #292,v,g,n,n,PM_MRK_LSU1_FLUSH_LRQ,LSU1 marked LRQ flushes ##810C6 A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #293,u,g,n,n,PM_MRK_LSU1_FLUSH_SRQ,LSU1 marked SRQ lhs flushes ##810C7 A marked store was flushed because a younger load hit an older store that was already in the SRQ or in the same group.
#294,v,g,n,n,PM_MRK_LSU1_FLUSH_ULD,LSU1 marked unaligned load flushes ##810C4 A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #295,u,g,n,n,PM_MRK_LSU1_FLUSH_UST,LSU1 marked unaligned store flushes ##810C5 A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #296,v,g,n,n,PM_MRK_LSU_FLUSH_UST,Marked unaligned store flushes ##810A8 A marked store was flushed because it was unaligned #297,u,g,n,n,PM_MRK_LSU_SRQ_INST_VALID,Marked instruction valid in SRQ ##C70E6 This signal is asserted every cycle when a marked request is resident in the Store Request Queue #298,v,g,n,n,PM_MRK_STCX_FAIL,Marked STCX failed ##820E6 A marked stcx (stwcx or stdcx) failed #299,v,g,n,n,PM_MRK_ST_GPS,Marked store sent to GPS ##00003 A sampled store has been sent to the memory subsystem #300,v,g,n,n,PM_MRK_ST_MISS_L1,Marked L1 D cache store misses ##820E3 A marked store missed the dcache #301,v,g,n,n,PM_PMC1_OVERFLOW,PMC1 Overflow ##0000A Overflows from PMC1 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow. #302,v,g,n,n,PM_INST_CMPL,Instructions completed ##00009 Number of PowerPC instructions that completed. #303,v,g,n,n,PM_PTEG_FROM_L25_MOD,PTEG loaded from L2.5 modified ##83097 A Page Table Entry was loaded into the TLB with modified (M) data from the L2 of a chip on the same module as this processor is located due to a demand load. #304,v,g,n,n,PM_PTEG_FROM_L35_MOD,PTEG loaded from L3.5 modified ##8309E A Page Table Entry was loaded into the TLB with modified (M) data from the L3 of a chip on the same module as this processor is located, due to a demand load. #305,v,g,n,n,PM_PTEG_FROM_LMEM,PTEG loaded from local memory ##83087 A Page Table Entry was loaded into the TLB from memory attached to the same module this processor is located on.
#306,v,g,n,n,PM_PTEG_RELOAD_VALID,PTEG reload valid ##830E4 A Page Table Entry was loaded into the TLB. #307,v,g,n,n,PM_SLB_MISS,SLB misses ##80088 Total of all Segment Lookaside Buffer (SLB) misses, Instructions + Data. #308,v,g,n,s,PM_SNOOP_DCLAIM_RETRY_QFULL,Snoop dclaim/flush retry due to write/dclaim queues full ##720E6 A snoop request for a dclaim or flush to memory was retried because the write/dclaim queues were full. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #309,v,g,n,s,PM_SNOOP_PARTIAL_RTRY_QFULL,Snoop partial write retry due to partial-write queues full ##730E6 A snoop request for a partial write to memory was retried because the write queues that handle partial writes were full. When this happens the active writes are changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #310,v,g,n,s,PM_SNOOP_PW_RETRY_RQ,Snoop partial-write retry due to collision with active read queue ##707C6 A snoop request for a partial write to memory was retried because it matched the cache line of an active read. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #311,v,g,n,s,PM_SNOOP_PW_RETRY_WQ_PWQ,Snoop partial-write retry due to collision with active write or partial-write queue ##717C6 A snoop request for a partial write to memory was retried because it matched the cache line of an active write or partial write. When this happens the snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #312,v,g,n,s,PM_SNOOP_RD_RETRY_QFULL,Snoop read retry due to read queue full ##700C6 A snoop request for a read from memory was retried because the read queues were full. This event is sent from the Memory Controller clock domain and must be scaled accordingly.
#313,v,g,n,s,PM_SNOOP_RD_RETRY_RQ,Snoop read retry due to collision with active read queue ##705C6 A snoop request for a read from memory was retried because it matched the cache line of an active read. The snoop request is retried because the L2 may be able to source data via intervention for the 2nd read faster than the MC. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #314,v,g,n,s,PM_SNOOP_RD_RETRY_WQ,Snoop read retry due to collision with active write queue ##715C6 A snoop request for a read from memory was retried because it matched the cache line of an active write. The snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #315,v,g,n,s,PM_SNOOP_RETRY_1AHEAD,Snoop retry due to one ahead collision ##725E6 Snoop retry due to one ahead collision #316,u,g,n,s,PM_SNOOP_TLBIE,Snoop TLBIE ##800C3 A tlbie was snooped from another processor. #317,v,g,n,s,PM_SNOOP_WR_RETRY_QFULL,Snoop write retry due to write queue full ##710C6 A snoop request for a write to memory was retried because the write queues were full. When this happens the snoop request is retried and the writes in the write reorder queue are changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #318,v,g,n,s,PM_SNOOP_WR_RETRY_RQ,Snoop write/dclaim retry due to collision with active read queue ##706C6 A snoop request for a write or dclaim to memory was retried because it matched the cacheline of an active read. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #319,v,g,n,s,PM_SNOOP_WR_RETRY_WQ,Snoop write/dclaim retry due to collision with active write queue ##716C6 A snoop request for a write or dclaim to memory was retried because it matched the cache line of an active write. The snoop request is retried and the active write is changed to high priority.
This event is sent from the Memory Controller clock domain and must be scaled accordingly. #320,v,g,n,n,PM_STCX_FAIL,STCX failed ##820E1 A stcx (stwcx or stdcx) failed. #321,v,g,n,n,PM_STCX_PASS,Stcx passes ##820E5 A stcx (stwcx or stdcx) instruction was successful. #322,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##C10C3 A store missed the dcache. Combined Unit 0 + 1. #323,v,g,n,n,PM_ST_REF_L1,L1 D cache store references ##C10A8 Store references to the Data Cache. Combined Unit 0 + 1. #324,v,g,n,n,PM_ST_REF_L1_LSU0,LSU0 L1 D cache store references ##C10C1 Store references to the Data Cache by LSU0. #325,v,g,n,n,PM_ST_REF_L1_LSU1,LSU1 L1 D cache store references ##C10C4 Store references to the Data Cache by LSU1. #326,v,g,n,n,PM_SUSPENDED,Suspended ##00000 The counter is suspended (does not count). #327,v,g,n,n,PM_THRD_GRP_CMPL_BOTH_CYC,Cycles group completed by both threads ##00013 Cycles that both threads completed. #328,v,g,n,s,PM_THRD_L2MISS_BOTH_CYC,Cycles both threads in L2 misses ##410C7 Cycles that both threads have an L2 miss pending. If only one thread has an L2 miss pending the other thread is given priority at decode. If both threads have an L2 miss pending decode priority is determined by the number of GCT entries used. #329,v,g,n,n,PM_THRD_PRIO_1_CYC,Cycles thread running at priority level 1 ##420E0 Cycles this thread was running at priority level 1. Priority level 1 is the lowest and indicates the thread is sleeping. #330,v,g,n,n,PM_THRD_PRIO_2_CYC,Cycles thread running at priority level 2 ##420E1 Cycles this thread was running at priority level 2. #331,v,g,n,n,PM_THRD_PRIO_3_CYC,Cycles thread running at priority level 3 ##420E2 Cycles this thread was running at priority level 3. #332,v,g,n,n,PM_THRD_PRIO_4_CYC,Cycles thread running at priority level 4 ##420E3 Cycles this thread was running at priority level 4. #333,v,g,n,n,PM_THRD_PRIO_5_CYC,Cycles thread running at priority level 5 ##420E4 Cycles this thread was running at priority level 5.
#334,v,g,n,n,PM_THRD_PRIO_6_CYC,Cycles thread running at priority level 6 ##420E5 Cycles this thread was running at priority level 6. #335,v,g,n,n,PM_THRD_PRIO_7_CYC,Cycles thread running at priority level 7 ##420E6 Cycles this thread was running at priority level 7. #336,v,g,n,n,PM_THRD_PRIO_DIFF_0_CYC,Cycles no thread priority difference ##430E3 Cycles when this thread's priority is equal to the other thread's priority. #337,v,g,n,n,PM_THRD_PRIO_DIFF_1or2_CYC,Cycles thread priority difference is 1 or 2 ##430E4 Cycles when this thread's priority is higher than the other thread's priority by 1 or 2. #338,v,g,n,n,PM_THRD_PRIO_DIFF_3or4_CYC,Cycles thread priority difference is 3 or 4 ##430E5 Cycles when this thread's priority is higher than the other thread's priority by 3 or 4. #339,v,g,n,n,PM_THRD_PRIO_DIFF_5or6_CYC,Cycles thread priority difference is 5 or 6 ##430E6 Cycles when this thread's priority is higher than the other thread's priority by 5 or 6. #340,v,g,n,n,PM_THRD_PRIO_DIFF_minus1or2_CYC,Cycles thread priority difference is -1 or -2 ##430E2 Cycles when this thread's priority is lower than the other thread's priority by 1 or 2. #341,v,g,n,n,PM_THRD_PRIO_DIFF_minus3or4_CYC,Cycles thread priority difference is -3 or -4 ##430E1 Cycles when this thread's priority is lower than the other thread's priority by 3 or 4. #342,v,g,n,n,PM_THRD_PRIO_DIFF_minus5or6_CYC,Cycles thread priority difference is -5 or -6 ##430E0 Cycles when this thread's priority is lower than the other thread's priority by 5 or 6. #343,v,g,n,s,PM_THRD_SEL_OVER_CLB_EMPTY,Thread selection overrides caused by CLB empty ##410C2 Thread selection was overridden because one thread's CLB was empty. #344,v,g,n,s,PM_THRD_SEL_OVER_GCT_IMBAL,Thread selection overrides caused by GCT imbalance ##410C4 Thread selection was overridden because of a GCT imbalance. 
#345,v,g,n,s,PM_THRD_SEL_OVER_ISU_HOLD,Thread selection overrides caused by ISU holds ##410C5 Thread selection was overridden because of an ISU hold. #346,v,g,n,s,PM_THRD_SEL_OVER_L2MISS,Thread selection overrides caused by L2 misses ##410C3 Thread selection was overridden because one thread had an L2 miss pending. #347,v,g,n,s,PM_THRD_SEL_T0,Decode selected thread 0 ##410C0 Thread selection picked thread 0 for decode. #348,v,g,n,s,PM_THRD_SEL_T1,Decode selected thread 1 ##410C1 Thread selection picked thread 1 for decode. #349,v,g,n,s,PM_THRD_SMT_HANG,SMT hang detected ##330E7 A hung thread was detected. #350,v,g,n,n,PM_TLBIE_HELD,TLBIE held at dispatch ##130E4 Cycles a TLBIE instruction was held at dispatch. #351,v,g,n,s,PM_XER_MAP_FULL_CYC,Cycles XER mapper full ##100C2 The XER mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #352,v,g,n,n,PM_BR_PRED_CR,A conditional branch was predicted, CR prediction ##230E2 A conditional branch instruction was predicted as taken or not taken. #353,v,g,n,n,PM_MEM_RQ_DISP_Q16to19,Memory read queue dispatched to queues 16-19 ##727E6 A memory operation was dispatched to read queue 16,17,18 or 19. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #354,c,g,n,n,PM_MEM_FAST_PATH_RD_DISP,Fast path memory read dispatched ##731E6 Fast path memory read dispatched #355,v,g,n,n,PM_SNOOP_RETRY_AB_COLLISION,Snoop retry due to A/B collision ##735E6 Snoop retry due to A/B collision $$$$$$$$ { counter 3 } #0,v,g,n,n,PM_0INST_CLB_CYC,Cycles no instructions in CLB ##400C0 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied.
These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #1,v,g,n,n,PM_1INST_CLB_CYC,Cycles 1 instruction in CLB ##400C1 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #2,v,g,n,n,PM_2INST_CLB_CYC,Cycles 2 instructions in CLB ##400C2 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #3,v,g,n,n,PM_3INST_CLB_CYC,Cycles 3 instructions in CLB ##400C3 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #4,v,g,n,n,PM_4INST_CLB_CYC,Cycles 4 instructions in CLB ##400C4 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. 
These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #5,v,g,n,n,PM_5INST_CLB_CYC,Cycles 5 instructions in CLB ##400C5 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #6,v,g,n,n,PM_6INST_CLB_CYC,Cycles 6 instructions in CLB ##400C6 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #7,u,g,n,s,PM_BRQ_FULL_CYC,Cycles branch queue full ##100C5 Cycles when the issue queue that feeds the branch unit is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented. #8,v,g,n,n,PM_BR_ISSUED,Branches issued ##230E4 A branch instruction was issued to the branch unit. A branch that was incorrectly predicted may issue and execute multiple times. #9,v,g,n,n,PM_BR_MPRED_CR,Branch mispredictions due to CR bit setting ##230E5 A conditional branch instruction was incorrectly predicted as taken or not taken. The branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. 
This will result in a branch redirect flush if not overridden by a flush of an older instruction. #10,v,g,n,n,PM_BR_MPRED_TA,Branch mispredictions due to target address ##230E6 A branch instruction target was incorrectly predicted. This will result in a branch mispredict flush unless a flush is detected from an older instruction. #11,v,g,n,n,PM_BR_PRED_CR,A conditional branch was predicted, CR prediction ##23087,230E2 A conditional branch instruction was predicted as taken or not taken. #12,v,g,n,s,PM_CLB_EMPTY_CYC,Cycles CLB empty ##410C6 Cycles when both threads' CLBs are completely empty. #13,v,g,n,n,PM_CLB_FULL_CYC,Cycles CLB full ##220E5 Cycles when both threads' CLBs are full. #14,u,g,n,s,PM_CRQ_FULL_CYC,Cycles CR issue queue full ##110C1 The issue queue that feeds the Conditional Register unit is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented. #15,v,g,n,s,PM_CR_MAP_FULL_CYC,Cycles CR logical operation mapper full ##100C4 The Conditional Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #16,v,g,n,s,PM_CYC,Processor cycles ##0000F Processor cycles. #17,v,g,n,n,PM_DATA_FROM_L25_MOD,Data loaded from L2.5 modified ##C30A2 The processor's Data Cache was reloaded with modified (M) data from the L2 of a chip on the same module as this processor is located due to a demand load. #18,v,g,n,n,PM_DATA_FROM_L275_SHR,Data loaded from L2.75 shared ##C3097 The processor's Data Cache was reloaded with shared (T) data from the L2 on a different module than this processor is located due to a demand load. #19,v,g,n,n,PM_DATA_FROM_L2MISS,Data loaded missed L2 ##C309B The processor's Data Cache was reloaded but not from the local L2.
#20,v,g,n,n,PM_DATA_FROM_L3,Data loaded from L3 ##C30AF The processor's Data Cache was reloaded from the local L3 due to a demand load. #21,v,g,n,n,PM_DATA_FROM_L35_MOD,Data loaded from L3.5 modified ##C30A6 The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on the same module as this processor is located due to a demand load. #22,v,g,n,n,PM_DATA_FROM_L375_SHR,Data loaded from L3.75 shared ##C309E The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on a different module than this processor is located due to a demand load. #23,v,g,n,n,PM_DATA_FROM_LMEM,Data loaded from local memory ##C30A0 The processor's Data Cache was reloaded from memory attached to the same module this processor is located on. #24,v,g,n,n,PM_DATA_TABLEWALK_CYC,Cycles doing data tablewalks ##800C7 Cycles a translation tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried. #25,u,g,n,s,PM_DC_INV_L2,L1 D cache entries invalidated from L2 ##C10C7 A dcache invalidate was received from the L2 because a line in L2 was castout. #26,v,g,n,n,PM_DC_PREF_OUT_OF_STREAMS,D cache out of prefetch streams ##C50C2 A new prefetch stream was detected but no more stream entries were available. #27,v,g,n,n,PM_DC_PREF_DST,DST (Data Stream Touch) stream start ##830E6 A prefetch stream was started using the DST instruction. #28,v,g,n,n,PM_DC_PREF_STREAM_ALLOC,D cache new prefetch stream allocated ##830E7 A new Prefetch Stream was allocated. #29,v,g,n,n,PM_DSLB_MISS,Data SLB misses ##800C5 A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve. #30,v,g,n,n,PM_DTLB_MISS,Data TLB misses ##800C4,C20E0 Data TLB misses, all page sizes. #31,v,g,n,n,PM_DTLB_MISS_16M,Data TLB miss for 16M page ##C208D Data TLB references to 16MB pages that missed the TLB. Page size is determined at TLB reload time.
#32,v,g,n,n,PM_DTLB_REF,Data TLB references ##C20E4 Total number of Data TLB references for all page sizes. Page size is determined at TLB reload time. #33,v,g,n,n,PM_DTLB_REF_16M,Data TLB reference for 16M page ##C2086 Data TLB references for 16MB pages. Includes hits + misses. #34,v,g,n,n,PM_EE_OFF,Cycles MSR(EE) bit off ##130E3 Cycles MSR(EE) bit was off indicating that interrupts due to external exceptions were masked. #35,u,g,n,n,PM_EE_OFF_EXT_INT,Cycles MSR(EE) bit off and external interrupt pending ##130E7 Cycles when an interrupt due to an external exception is pending but external exceptions were masked. #36,v,g,n,s,PM_FAB_CMD_ISSUED,Fabric command issued ##700C7 Incremented when a chip issues a command on its SnoopA address bus. Each of the two address busses (SnoopA and SnoopB) is capable of one transaction per fabric cycle (one fabric cycle = 2 cpu cycles in normal 2:1 mode), but each chip can only drive the SnoopA bus, and can only drive one transaction every two fabric cycles (i.e., every four cpu cycles). In MCM-based systems, two chips interleave their accesses to each of the two fabric busses (SnoopA, SnoopB) to reach a peak capability of one transaction per cpu clock cycle. The two chips that drive SnoopB are wired so that the chips refer to the bus as SnoopA but it is connected to the other two chips as SnoopB. Note that this event will only be recorded by the FBC on the chip that sourced the operation. The signal is delivered at FBC speed and the count must be scaled. #37,v,g,n,n,PM_FAB_CMD_RETRIED,Fabric command retried ##710C7 Incremented when a command issued by a chip on its SnoopA address bus is retried for any reason. The overwhelming majority of retries are due to running out of memory controller queues but retries can also be caused by trying to reference addresses that are in a transient cache state -- e.g. a line is transient after issuing a DCLAIM instruction to a shared line but before the associated store completes. 
Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly. #38,v,g,n,s,PM_FAB_DCLAIM_ISSUED,dclaim issued ##720E7 A DCLAIM command was issued. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly. #39,v,g,n,s,PM_FAB_DCLAIM_RETRIED,dclaim retried ##730E7 A DCLAIM command was retried. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly. #40,v,g,n,s,PM_FAB_HOLDtoNN_EMPTY,Hold buffer to NN empty ##722E7 Fabric cycles when the Next Node out hold-buffers are empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #41,v,g,n,s,PM_FAB_HOLDtoVN_EMPTY,Hold buffer to VN empty ##721E7 Fabric cycles when the Vertical Node out hold-buffers are empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #42,v,g,n,s,PM_FAB_M1toP1_SIDECAR_EMPTY,M1 to P1 sidecar empty ##702C7 Fabric cycles when the Minus-1 hip/hop sidecars (sidecars for chip to chip data transfer) are empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #43,v,g,n,s,PM_FAB_M1toVNorNN_SIDECAR_EMPTY,M1 to VN/NN sidecar empty ##712C7 Fabric cycles when the Minus-1 jump sidecar (sidecars for mcm to mcm data transfer) is empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #44,v,g,n,s,PM_FAB_P1toM1_SIDECAR_EMPTY,P1 to M1 sidecar empty ##701C7 Fabric cycles when the Plus-1 hip/hop sidecars (sidecars for chip to chip data transfer) are empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #45,v,g,n,s,PM_FAB_P1toVNorNN_SIDECAR_EMPTY,P1 to VN/NN sidecar empty ##711C7 Fabric cycles when the Plus-1 jump sidecar (sidecars for mcm to mcm data transfer) is empty. The signal is delivered at FBC speed and the count must be scaled accordingly.
#46,v,g,n,s,PM_FAB_PNtoNN_DIRECT,PN to NN beat went straight to its destination ##703C7 Fabric Data beats that the base chip takes the inbound PN data and passes it through to the outbound NN bus without going into a sidecar. The signal is delivered at FBC speed and the count must be scaled. #47,v,g,n,s,PM_FAB_PNtoNN_SIDECAR,PN to NN beat went to sidecar first ##713C7 Fabric Data beats that the base chip takes the inbound PN data and forwards it on to the outbound NN data bus after going into a sidecar first. The signal is delivered at FBC speed and the count must be scaled. #48,v,g,n,s,PM_FAB_PNtoVN_DIRECT,PN to VN beat went straight to its destination ##723E7 Fabric Data beats that the base chip takes the inbound PN data and passes it through to the outbound VN bus without going into a sidecar. The signal is delivered at FBC speed and the count must be scaled accordingly. #49,v,g,n,s,PM_FAB_PNtoVN_SIDECAR,PN to VN beat went to sidecar first ##733E7 Fabric data beats that the base chip takes the inbound PN data and forwards it on to the outbound VN data bus after going into a sidecar first. The signal is delivered at FBC speed and the count must be scaled accordingly. #50,v,g,n,s,PM_FAB_VBYPASS_EMPTY,Vertical bypass buffer empty ##731E7 Fabric cycles when the Middle Bypass sidecar is empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #51,v,g,n,n,PM_FLUSH,Flushes ##110C7 Flushes occurred including LSU and Branch flushes. #52,v,g,n,n,PM_FLUSH_BR_MPRED,Flush caused by branch mispredict ##110C6 A flush was caused by a branch mispredict. #53,v,g,n,s,PM_FLUSH_IMBAL,Flush caused by thread GCT imbalance ##330E3 This thread has been flushed at dispatch because it is stalled and a GCT imbalance exists. GCT thresholds are set in the TSCR register. This allows the other thread to have more machine resources for it to make progress while this thread is stalled. 
#54,v,g,n,s,PM_FLUSH_SB,Flush caused by scoreboard operation ##330E2 This thread has been flushed at dispatch because its scoreboard bit is set indicating that a non-renamed resource is being updated. This allows the other thread to have more machine resources for it to make progress while this thread is stalled. #55,v,g,n,s,PM_FLUSH_SYNC,Flush caused by sync ##330E1 This thread has been flushed at dispatch due to a sync, lwsync, ptesync, or tlbsync instruction. This allows the other thread to have more machine resources for it to make progress until the sync finishes. #56,v,g,n,s,PM_FPR_MAP_FULL_CYC,Cycles FPR mapper full ##100C1 The Floating Point Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #57,v,g,n,n,PM_FPU0_1FLOP,FPU0 executed add, mult, sub, cmp or sel instruction ##000C3 The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations. #58,v,g,n,n,PM_FPU0_DENORM,FPU0 received denormalized data ##020E0 FPU0 has encountered a denormalized operand. #59,v,g,n,n,PM_FPU0_FDIV,FPU0 executed FDIV instruction ##000C0 FPU0 has executed a divide instruction. This could be fdiv, fdivs, fdiv., fdivs. #60,v,g,n,n,PM_FPU0_FEST,FPU0 executed FEST instruction ##010C2 FPU0 has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #61,v,g,n,n,PM_FPU0_FIN,FPU0 produced a result ##010C3 FPU0 finished, produced a result. This only indicates finish, not completion. Floating Point Stores are included in this count but not Floating Point Loads. #62,v,g,n,n,PM_FPU0_FMA,FPU0 executed multiply-add instruction ##000C1 The floating point unit has executed a multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.
#63,v,g,n,n,PM_FPU0_FMOV_FEST,FPU0 executed FMOV or FEST instructions ##010C0 FPU0 has executed a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ. #64,v,g,n,n,PM_FPU0_FPSCR,FPU0 executed FPSCR instruction ##030E0 FPU0 has executed FPSCR move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*, mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs. #65,v,g,n,n,PM_FPU0_FRSP_FCONV,FPU0 executed FRSP or FCONV instructions ##010C1 FPU0 has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #66,v,g,n,n,PM_FPU0_FSQRT,FPU0 executed FSQRT instruction ##000C2 FPU0 has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #67,v,g,n,s,PM_FPU0_FULL_CYC,Cycles FPU0 issue queue full ##100C3 The issue queue for FPU0 cannot accept any more instructions. Dispatch to this issue queue is stopped. #68,v,g,n,n,PM_FPU0_SINGLE,FPU0 executed single precision instruction ##020E3 FPU0 has executed a single precision instruction. #69,v,g,n,n,PM_FPU0_STALL3,FPU0 stalled in pipe3 ##020E1 FPU0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). #70,v,g,n,n,PM_FPU0_STF,FPU0 executed store instruction ##020E2 FPU0 has executed a Floating Point Store instruction. #71,v,g,n,n,PM_FPU1_1FLOP,FPU1 executed add, mult, sub, cmp or sel instruction ##000C7 The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations. #72,v,g,n,n,PM_FPU1_DENORM,FPU1 received denormalized data ##020E4 FPU1 has encountered a denormalized operand. #73,v,g,n,n,PM_FPU1_FDIV,FPU1 executed FDIV instruction ##000C4 FPU1 has executed a divide instruction. This could be fdiv, fdivs, fdiv., fdivs.
#74,v,g,n,n,PM_FPU1_FEST,FPU1 executed FEST instruction ##010C6 FPU1 has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #75,v,g,n,n,PM_FPU1_FIN,FPU1 produced a result ##010C7 FPU1 finished, produced a result. This only indicates finish, not completion. Floating Point Stores are included in this count but not Floating Point Loads. #76,v,g,n,n,PM_FPU1_FMA,FPU1 executed multiply-add instruction ##000C5 The floating point unit has executed a multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #77,v,g,n,n,PM_FPU1_FMOV_FEST,FPU1 executed FMOV or FEST instructions ##010C4 FPU1 has executed a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ. #78,v,g,n,n,PM_FPU1_FRSP_FCONV,FPU1 executed FRSP or FCONV instructions ##010C5 FPU1 has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #79,v,g,n,n,PM_FPU1_FSQRT,FPU1 executed FSQRT instruction ##000C6 FPU1 has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #80,v,g,n,s,PM_FPU1_FULL_CYC,Cycles FPU1 issue queue full ##100C7 The issue queue for FPU1 cannot accept any more instructions. Dispatch to this issue queue is stopped. #81,v,g,n,n,PM_FPU1_SINGLE,FPU1 executed single precision instruction ##020E7 FPU1 has executed a single precision instruction. #82,v,g,n,n,PM_FPU1_STALL3,FPU1 stalled in pipe3 ##020E5 FPU1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). #83,v,g,n,n,PM_FPU1_STF,FPU1 executed store instruction ##020E6 FPU1 has executed a Floating Point Store instruction.
#84,v,g,n,n,PM_FPU_FMOV_FEST,FPU executed FMOV or FEST instructions ##01088 The floating point unit has executed a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1. #85,v,g,n,n,PM_FPU_FRSP_FCONV,FPU executed FRSP or FCONV instructions ##01090 The floating point unit has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1. #86,v,g,n,n,PM_FPU_FSQRT,FPU executed FSQRT instruction ##000A8 The floating point unit has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1. #87,v,g,n,n,PM_FPU_STF,FPU executed store instruction ##020A8 FPU has executed a store instruction. Combined Unit 0 + Unit 1. #88,v,g,n,s,PM_FXLS0_FULL_CYC,Cycles FXU0/LS0 queue full ##110C0 The issue queue that feeds the Fixed Point unit 0 / Load Store Unit 0 is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented. #89,v,g,n,s,PM_FXLS1_FULL_CYC,Cycles FXU1/LS1 queue full ##110C4 The issue queue that feeds the Fixed Point unit 1 / Load Store Unit 1 is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented. #90,u,g,n,n,PM_FXU0_BUSY_FXU1_IDLE,FXU0 busy FXU1 idle ##00012 FXU0 was busy while FXU1 was idle. #91,v,g,n,n,PM_FXU0_FIN,FXU0 produced a result ##130E2 The Fixed Point unit 0 finished an instruction and produced a result. Instructions that finish may not necessarily complete. #92,v,g,n,n,PM_FXU1_FIN,FXU1 produced a result ##130E6 The Fixed Point unit 1 finished an instruction and produced a result. Instructions that finish may not necessarily complete.
#93,v,g,n,n,PM_FXU_FIN,FXU produced a result ##13088 The fixed point unit (Unit 0 + Unit 1) finished an instruction. Instructions that finish may not necessarily complete. #94,v,g,n,n,PM_GCT_FULL_CYC,Cycles GCT full ##100C0 The Global Completion Table is completely full. #95,v,g,n,n,PM_GCT_NOSLOT_SRQ_FULL,No slot in GCT caused by SRQ full ##10084 Cycles when the Global Completion Table has no slots from this thread because the Store Request Queue (SRQ) is full. This happens when the storage subsystem cannot process the stores in the SRQ. Groups cannot be dispatched until an SRQ entry is available. #96,v,g,n,s,PM_GCT_USAGE_80to99_CYC,Cycles GCT 80-99% full ##0001F Cycles when the Global Completion Table has between 80% and 99% of its slots used. The GCT has 20 entries shared between threads. #97,v,g,n,s,PM_GPR_MAP_FULL_CYC,Cycles GPR mapper full ##130E5 The General Purpose Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #98,v,g,n,n,PM_GRP_BR_REDIR,Group experienced branch redirect ##120E6 Number of groups, counted at dispatch, that have encountered a branch redirect. Every group constructed from a fetch group that has been redirected will count. #99,c,g,n,n,PM_GRP_IC_MISS_BR_REDIR_NONSPEC,Group experienced non-speculative I cache miss or branch redirect ##120E5 Group experienced a non-speculative I cache miss or branch redirect. #100,v,g,n,n,PM_GRP_CMPL,Group completed ##00013 A group completed. Microcoded instructions that span multiple groups will generate this event once per group. #101,v,g,n,n,PM_GRP_DISP_BLK_SB_CYC,Cycles group dispatch blocked by scoreboard ##130E1 A scoreboard operation on a non-renamed resource has blocked dispatch. #102,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected ##120E4 A group that previously attempted dispatch was rejected.
#103,v,g,n,n,PM_GRP_DISP_SUCCESS,Group dispatch success ##00002 Number of groups successfully dispatched (not rejected). #104,v,g,n,n,PM_GRP_DISP_VALID,Group dispatch valid ##120E3 A group is available for dispatch. This does not mean it was successfully dispatched. #105,v,g,n,n,PM_GRP_IC_MISS,Group experienced I cache miss ##120E7 Number of groups, counted at dispatch, that have encountered an icache miss redirect. Every group constructed from a fetch group that missed the instruction cache will count. #106,v,g,n,n,PM_IC_DEMAND_L2_BHT_REDIRECT,L2 I cache demand request due to BHT redirect ##230E0 A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (CR mispredict). #107,v,g,n,n,PM_IC_DEMAND_L2_BR_REDIRECT,L2 I cache demand request due to branch redirect ##230E1 A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (either ALL mispredicted or Target). #108,v,g,n,n,PM_IC_PREF_INSTALL,Instruction prefetched installed in prefetch buffer ##210C7 A prefetch buffer entry (line) is allocated but the request is not a demand fetch. #109,v,g,n,n,PM_IC_PREF_REQ,Instruction prefetch requests ##220E6 An instruction prefetch request has been made. #110,v,g,n,n,PM_IERAT_XLATE_WR,Translation written to ierat ##220E7 An entry was written into the IERAT as a result of an IERAT miss. This event can be used to count IERAT misses. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed. #111,v,g,n,n,PM_IERAT_XLATE_WR_LP,Large page translation written to ierat ##210C6 An entry was written into the IERAT as a result of an IERAT miss. This event can be used to count IERAT misses. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed.
#112,v,g,n,n,PM_IOPS_CMPL,Internal operations completed ##00001 Number of internal operations that completed. #113,v,g,n,n,PM_INST_DISP,Instructions dispatched ##00009 Number of PowerPC instructions successfully dispatched. #114,v,g,n,n,PM_INST_FETCH_CYC,Cycles at least 1 instruction fetched ##220E4 Cycles when at least one instruction was sent from the fetch unit to the decode unit. #115,v,g,n,n,PM_INST_FROM_L275_SHR,Instruction fetched from L2.75 shared ##22096 An instruction fetch group was fetched with shared (T) data from the L2 on a different module than this processor is located. Fetch groups can contain up to 8 instructions. #116,v,g,n,n,PM_INST_FROM_L3,Instruction fetched from L3 ##220AE An instruction fetch group was fetched from the local L3. Fetch groups can contain up to 8 instructions. #117,v,g,n,n,PM_INST_FROM_L375_SHR,Instruction fetched from L3.75 shared ##2209D An instruction fetch group was fetched with shared (S) data from the L3 of a chip on a different module than this processor is located. Fetch groups can contain up to 8 instructions. #118,v,g,n,n,PM_INST_FROM_PREF,Instruction fetched from prefetch ##2208D An instruction fetch group was fetched from the prefetch buffer. Fetch groups can contain up to 8 instructions. #119,u,g,n,n,PM_ISLB_MISS,Instruction SLB misses ##800C1 An SLB miss for an instruction fetch has occurred. #120,v,g,n,n,PM_ITLB_MISS,Instruction TLB misses ##800C0 A TLB miss for an Instruction Fetch has occurred. #121,v,g,n,n,PM_L1_DCACHE_RELOAD_VALID,L1 reload data source valid ##C30E4 The data source information is valid; the data cache has been reloaded. Prior to POWER5+ this included data cache reloads due to prefetch activity. With POWER5+ this now only includes reloads due to demand loads. #122,v,g,n,n,PM_L1_PREF,L1 cache data prefetches ##C70E7 A request to prefetch data into the L1 was made. #123,v,g,n,n,PM_L1_WRITE_CYC,Cycles writing to instruction L1 ##230E7 Cycles that a cache line was written to the instruction cache.
#124,v,g,n,s,PM_L2SA_MOD_INV,L2 slice A transition from modified to invalid ##730E0 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C. #125,v,g,n,s,PM_L2SA_MOD_TAG,L2 slice A transition from modified to tagged ##720E0 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C. #126,v,g,n,s,PM_L2SA_RCLD_DISP,L2 slice A RC load dispatch attempt ##701C0 A Read/Claim dispatch for a Load was attempted. #127,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_ADDR,L2 slice A RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ ##711C0 A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time. #128,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_OTHER,L2 slice A RC load dispatch attempt failed due to other reasons ##731E0 A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions. #129,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_RC_FULL,L2 slice A RC load dispatch attempt failed due to all RC full ##721E0 A Read/Claim dispatch for a load failed because all RC machines are busy. #130,v,g,n,s,PM_L2SA_RCST_DISP,L2 slice A RC store dispatch attempt ##702C0 A Read/Claim dispatch for a Store was attempted. #131,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_ADDR,L2 slice A RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ ##712C0 A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.
#132,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_OTHER,L2 slice A RC store dispatch attempt failed due to other reasons ##732E0 A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted. #133,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_RC_FULL,L2 slice A RC store dispatch attempt failed due to all RC full ##722E0 A Read/Claim dispatch for a store failed because all RC machines are busy. #134,v,g,n,s,PM_L2SA_RC_DISP_FAIL_CO_BUSY,L2 slice A RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy ##703C0 A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss (re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access. #135,v,g,n,s,PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL,L2 slice A RC dispatch attempt failed due to all CO busy ##713C0 A Read/Claim dispatch was rejected because all Castout machines were busy. #136,v,g,n,s,PM_L2SA_SHR_INV,L2 slice A transition from shared to invalid ##710C0 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful, the tablewalk duration event should also be counted.
#137,v,g,n,s,PM_L2SA_SHR_MOD,L2 slice A transition from shared to modified ##700C0 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C. #138,v,g,n,n,PM_L2SA_ST_HIT,L2 slice A store hits ##733E0 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A, B, and C. #139,v,g,n,n,PM_L2SA_ST_REQ,L2 slice A store requests ##723E0 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C. #140,v,g,n,s,PM_L2SB_MOD_INV,L2 slice B transition from modified to invalid ##730E1 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C. #141,v,g,n,s,PM_L2SB_MOD_TAG,L2 slice B transition from modified to tagged ##720E1 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C. #142,v,g,n,s,PM_L2SB_RCLD_DISP,L2 slice B RC load dispatch attempt ##701C1 A Read/Claim dispatch for a Load was attempted. #143,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_ADDR,L2 slice B RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ ##711C1 A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.
#144,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_OTHER,L2 slice B RC load dispatch attempt failed due to other reasons ##731E1 A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions. #145,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_RC_FULL,L2 slice B RC load dispatch attempt failed due to all RC full ##721E1 A Read/Claim dispatch for a load failed because all RC machines are busy. #146,v,g,n,s,PM_L2SB_RCST_DISP,L2 slice B RC store dispatch attempt ##702C1 A Read/Claim dispatch for a Store was attempted. #147,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_ADDR,L2 slice B RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ ##712C1 A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time. #148,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_OTHER,L2 slice B RC store dispatch attempt failed due to other reasons ##732E1 A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted. #149,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_RC_FULL,L2 slice B RC store dispatch attempt failed due to all RC full ##722E2 A Read/Claim dispatch for a store failed because all RC machines are busy. #150,v,g,n,s,PM_L2SB_RC_DISP_FAIL_CO_BUSY,L2 slice B RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy ##703C1 A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss (re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access.
Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access. #151,v,g,n,s,PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL,L2 slice B RC dispatch attempt failed due to all CO busy ##713C1 A Read/Claim dispatch was rejected because all Castout machines were busy. #152,v,g,n,s,PM_L2SB_SHR_INV,L2 slice B transition from shared to invalid ##710C1 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful, the tablewalk duration event should also be counted. #153,v,g,n,s,PM_L2SB_SHR_MOD,L2 slice B transition from shared to modified ##700C1 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C. #154,v,g,n,n,PM_L2SB_ST_HIT,L2 slice B store hits ##733E1 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A, B, and C. #155,v,g,n,n,PM_L2SB_ST_REQ,L2 slice B store requests ##723E1 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C. #156,v,g,n,s,PM_L2SC_MOD_INV,L2 slice C transition from modified to invalid ##730E2 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.
#157,v,g,n,s,PM_L2SC_MOD_TAG,L2 slice C transition from modified to tagged ##720E2 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C. #158,v,g,n,s,PM_L2SC_RCLD_DISP,L2 slice C RC load dispatch attempt ##701C2 A Read/Claim dispatch for a Load was attempted. #159,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_ADDR,L2 slice C RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ ##711C2 A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time. #160,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_OTHER,L2 slice C RC load dispatch attempt failed due to other reasons ##731E2 A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions. #161,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_RC_FULL,L2 slice C RC load dispatch attempt failed due to all RC full ##721E2 A Read/Claim dispatch for a load failed because all RC machines are busy. #162,v,g,n,s,PM_L2SC_RCST_DISP,L2 slice C RC store dispatch attempt ##702C2 A Read/Claim dispatch for a Store was attempted. #163,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_ADDR,L2 slice C RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ ##712C2 A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time. #164,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_OTHER,L2 slice C RC store dispatch attempt failed due to other reasons ##732E2 A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted.
#165,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_RC_FULL,L2 slice C RC store dispatch attempt failed due to all RC full ##722E1 A Read/Claim dispatch for a store failed because all RC machines are busy. #166,v,g,n,s,PM_L2SC_RC_DISP_FAIL_CO_BUSY,L2 slice C RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy ##703C2 A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss (re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access. #167,v,g,n,s,PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL,L2 slice C RC dispatch attempt failed due to all CO busy ##713C2 A Read/Claim dispatch was rejected because all Castout machines were busy. #168,v,g,n,s,PM_L2SC_SHR_INV,L2 slice C transition from shared to invalid ##710C2 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful, the tablewalk duration event should also be counted. #169,v,g,n,s,PM_L2SC_SHR_MOD,L2 slice C transition from shared to modified ##700C2 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C.
#170,v,g,n,n,PM_L2SC_ST_HIT,L2 slice C store hits ##733E2 A store request made from the core hit in the L2 directory. The event is provided on each of the three slices A, B, and C. #171,v,g,n,n,PM_L2SC_ST_REQ,L2 slice C store requests ##723E2 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C. #172,v,g,n,n,PM_L2_PREF,L2 cache prefetches ##C50C3 A request to prefetch data into L2 was made. #173,v,g,n,s,PM_L3SA_ALL_BUSY,L3 slice A active for every cycle all CI/CO machines busy ##721E3 Cycles All Castin/Castout machines are busy. #174,v,g,n,s,PM_L3SA_HIT,L3 slice A hits ##711C3 Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice. #175,v,g,n,s,PM_L3SA_MOD_INV,L3 slice A transition from modified to invalid ##730E3 L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I). Mu|Me are not included since they are formed due to a prev read op. Tx is not included since it is considered shared at this point. #176,v,g,n,s,PM_L3SA_MOD_TAG,L3 slice A transition from modified to TAG ##720E3 L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. L3 going M->T or M->I(go_Mu case) Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point. #177,v,g,n,s,PM_L3SA_REF,L3 slice A references ##701C3 Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice. #178,v,g,n,s,PM_L3SA_SHR_INV,L3 slice A transition from shared to invalid ##710C3 L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched).
#179,v,g,n,s,PM_L3SA_SNOOP_RETRY,L3 slice A snoop retries ##731E3 Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b). #180,v,g,n,s,PM_L3SB_ALL_BUSY,L3 slice B active for every cycle all CI/CO machines busy ##721E4 Cycles All Castin/Castout machines are busy. #181,v,g,n,s,PM_L3SB_HIT,L3 slice B hits ##711C4 Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice. #182,v,g,n,s,PM_L3SB_MOD_INV,L3 slice B transition from modified to invalid ##730E4 L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I). Mu|Me are not included since they are formed due to a prev read op. Tx is not included since it is considered shared at this point. #183,v,g,n,s,PM_L3SB_MOD_TAG,L3 slice B transition from modified to TAG ##720E4 L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. L3 going M->T or M->I(go_Mu case); Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point. #184,v,g,n,s,PM_L3SB_REF,L3 slice B references ##701C4 Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice. #185,v,g,n,s,PM_L3SB_SHR_INV,L3 slice B transition from shared to invalid ##710C4 L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched). #186,v,g,n,s,PM_L3SB_SNOOP_RETRY,L3 slice B snoop retries ##731E4 Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b). #187,v,g,n,s,PM_L3SC_ALL_BUSY,L3 slice C active for every cycle all CI/CO machines busy ##721E5 Cycles All Castin/Castout machines are busy. #188,v,g,n,s,PM_L3SC_HIT,L3 slice C hits ##711C5 Number of attempts made by this chip's cores that resulted in an L3 hit.
Reported per L3 slice. #189,v,g,n,s,PM_L3SC_MOD_INV,L3 slice C transition from modified to invalid ##730E5 L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I). Mu|Me are not included since they are formed due to a previous read op. Tx is not included since it is considered shared at this point. #190,v,g,n,s,PM_L3SC_MOD_TAG,L3 slice C transition from modified to TAG ##720E5 L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. L3 going M->T or M->I(go_Mu case); Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point. #191,v,g,n,s,PM_L3SC_REF,L3 slice C references ##701C5 Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice. #192,v,g,n,s,PM_L3SC_SHR_INV,L3 slice C transition from shared to invalid ##710C5 L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched). #193,v,g,n,s,PM_L3SC_SNOOP_RETRY,L3 slice C snoop retries ##731E5 Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b). #194,v,g,n,n,PM_LARX_LSU0,Larx executed on LSU0 ##820E7 A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0). #195,v,g,n,n,PM_LD_MISS_L1,L1 D cache load misses ##C1088 Load references that miss the Level 1 Data cache. Combined unit 0 + 1. #196,v,g,n,n,PM_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##C10C2 Load references that miss the Level 1 Data cache, by unit 0. #197,v,g,n,n,PM_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##C10C6 Load references that miss the Level 1 Data cache, by unit 1. #198,v,g,n,n,PM_LD_REF_L1_LSU0,LSU0 L1 D cache load references ##C10C0 Load references to Level 1 Data Cache, by unit 0.
#199,v,g,n,n,PM_LD_REF_L1_LSU1,LSU1 L1 D cache load references ##C10C5 Load references to Level 1 Data Cache, by unit 1. #200,u,g,n,s,PM_LR_CTR_MAP_FULL_CYC,Cycles LR/CTR mapper full ##100C6 The LR/CTR mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #201,v,g,n,n,PM_LSU0_BUSY_REJECT,LSU0 busy due to reject ##C20E1 Total cycles the Load Store Unit 0 is busy rejecting instructions. #202,v,g,n,n,PM_LSU0_DERAT_MISS,LSU0 DERAT misses ##800C2 Total D-ERAT Misses by LSU0. Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction. #203,v,g,n,n,PM_LSU0_FLUSH_LRQ,LSU0 LRQ flushes ##C00C2 A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #204,u,g,n,n,PM_LSU0_FLUSH_SRQ,LSU0 SRQ lhs flushes ##C00C3 A store was flushed by unit 0 because a younger load hit an older store that is already in the SRQ or in the same group. #205,v,g,n,n,PM_LSU0_FLUSH_ULD,LSU0 unaligned load flushes ##C00C0 A load was flushed from unit 0 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1). #206,v,g,n,n,PM_LSU0_FLUSH_UST,LSU0 unaligned store flushes ##C00C1 A store was flushed from unit 0 because it was unaligned (crossed a 4K boundary). #207,v,g,n,n,PM_LSU0_LDF,LSU0 executed Floating Point load instruction ##C50C0 A floating point load was executed by LSU0. #208,v,g,n,n,PM_LSU0_NCLD,LSU0 non-cacheable loads ##C50C1 A non-cacheable load was executed by unit 0. #209,v,g,n,n,PM_LSU0_REJECT_ERAT_MISS,LSU0 reject due to ERAT miss ##C40C3 Total cycles the Load Store Unit 0 is busy rejecting instructions due to an ERAT miss.
Requests that miss the Derat are rejected and retried until the request hits in the Erat. #210,v,g,n,n,PM_LSU0_REJECT_LMQ_FULL,LSU0 reject due to LMQ full or missed data coming ##C40C1 Total cycles the Load Store Unit 0 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all eight entries are full, subsequent load instructions are rejected. #211,v,g,n,n,PM_LSU0_REJECT_RELOAD_CDF,LSU0 reject due to reload CDF or tag update collision ##C40C2 Total cycles the Load Store Unit 0 is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same cycle that they are being updated. #212,v,g,n,n,PM_LSU0_REJECT_SRQ,LSU0 SRQ lhs rejects ##C40C0 Total cycles the Load Store Unit 0 is busy rejecting instructions because of Load Hit Store conditions. Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue. #213,u,g,n,n,PM_LSU0_SRQ_STFWD,LSU0 SRQ store forwarded ##C60E1 Data from a store instruction was forwarded to a load on unit 0. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load hits L1 but becomes a store forward, it is not treated as a load miss. #214,v,g,n,n,PM_LSU1_BUSY_REJECT,LSU1 busy due to reject ##C20E5 Total cycles the Load Store Unit 1 is busy rejecting instructions.
#215,v,g,n,n,PM_LSU1_DERAT_MISS,LSU1 DERAT misses ##800C6 A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #216,v,g,n,n,PM_LSU1_FLUSH_LRQ,LSU1 LRQ flushes ##C00C6 A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #217,u,g,n,n,PM_LSU1_FLUSH_SRQ,LSU1 SRQ lhs flushes ##C00C7 A store was flushed because a younger load hit an older store that is already in the SRQ or in the same group. #218,v,g,n,n,PM_LSU1_FLUSH_ULD,LSU1 unaligned load flushes ##C00C4 A load was flushed from unit 1 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1). #219,u,g,n,n,PM_LSU1_FLUSH_UST,LSU1 unaligned store flushes ##C00C5 A store was flushed from unit 1 because it was unaligned (crossed a 4K boundary). #220,v,g,n,n,PM_LSU1_LDF,LSU1 executed Floating Point load instruction ##C50C4 A floating point load was executed by LSU1. #221,v,g,n,n,PM_LSU1_NCLD,LSU1 non-cacheable loads ##C50C5 A non-cacheable load was executed by unit 1. #222,v,g,n,n,PM_LSU1_REJECT_ERAT_MISS,LSU1 reject due to ERAT miss ##C40C7 Total cycles the Load Store Unit 1 is busy rejecting instructions due to an ERAT miss. Requests that miss the Derat are rejected and retried until the request hits in the Erat. #223,v,g,n,n,PM_LSU1_REJECT_LMQ_FULL,LSU1 reject due to LMQ full or missed data coming ##C40C5 Total cycles the Load Store Unit 1 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all eight entries are full, subsequent load instructions are rejected.
#224,v,g,n,n,PM_LSU1_REJECT_RELOAD_CDF,LSU1 reject due to reload CDF or tag update collision ##C40C6 Total cycles the Load Store Unit 1 is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT at the same time they are being updated. #225,v,g,n,n,PM_LSU1_REJECT_SRQ,LSU1 SRQ lhs rejects ##C40C4 Total cycles the Load Store Unit 1 is busy rejecting instructions because of Load Hit Store conditions. Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue. #226,u,g,n,n,PM_LSU1_SRQ_STFWD,LSU1 SRQ store forwarded ##C60E5 Data from a store instruction was forwarded to a load on unit 1. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load hits L1 and becomes a store forward, it is not treated as a load miss. #227,v,g,n,n,PM_LSU_DERAT_MISS,DERAT misses ##800A8 Total D-ERAT Misses. Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction. Combined Unit 0 + 1. #228,v,g,n,n,PM_LSU_FLUSH,Flush initiated by LSU ##110C5 A flush was initiated by the Load Store Unit. #229,v,g,n,n,PM_LSU_FLUSH_LRQ,LRQ flushes ##C00A8 A load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.
Combined Units 0 and 1. #230,v,g,n,s,PM_LSU_FLUSH_LRQ_FULL,Flush caused by LRQ full ##320E7 This thread was flushed at dispatch because its Load Request Queue was full. This allows the other thread to have more machine resources for it to make progress while this thread is stalled. #231,v,g,n,s,PM_LSU_FLUSH_SRQ_FULL,Flush caused by SRQ full ##330E0 This thread was flushed at dispatch because its Store Request Queue was full. This allows the other thread to have more machine resources for it to make progress while this thread is stalled. #232,u,g,n,s,PM_LSU_LMQ_FULL_CYC,Cycles LMQ full ##C30E7 The Load Miss Queue was full. #233,v,g,n,n,PM_LSU_LMQ_LHR_MERGE,LMQ LHR merges ##C70E5 A data cache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry. #234,v,g,n,s,PM_LSU_LMQ_S0_ALLOC,LMQ slot 0 allocated ##C30E6 The first entry in the LMQ was allocated. #235,v,g,n,n,PM_LSU_LMQ_S0_VALID,LMQ slot 0 valid ##C30E5 This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ has eight entries that are allocated FIFO. #236,u,g,n,n,PM_LSU_LMQ_SRQ_EMPTY_CYC,Cycles LMQ and SRQ empty ##00015 Cycles when both the LMQ and SRQ are empty (LSU is idle). #237,v,g,n,s,PM_LSU_LRQ_FULL_CYC,Cycles LRQ full ##110C2 Cycles when the LRQ is full. #238,v,g,n,n,PM_LSU_LRQ_S0_ALLOC,LRQ slot 0 allocated ##C60E7 LRQ slot zero was allocated. #239,v,g,n,n,PM_LSU_LRQ_S0_VALID,LRQ slot 0 valid ##C60E6 This signal is asserted every cycle that the Load Request Queue slot zero is valid. The LRQ is 32 entries long and is allocated round-robin. In SMT mode the LRQ is split between the two threads (16 entries each). #240,v,g,n,n,PM_LSU_REJECT_RELOAD_CDF,LSU reject due to reload CDF or tag update collision ##C40A8 Total cycles the Load Store Unit is busy rejecting instructions because of Critical Data Forward.
When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT at the same time they are being updated. Combined Unit 0 + 1. #241,v,g,n,s,PM_LSU_SRQ_FULL_CYC,Cycles SRQ full ##110C3 Cycles the Store Request Queue is full. #242,v,g,n,n,PM_LSU_SRQ_S0_ALLOC,SRQ slot 0 allocated ##C20E7 SRQ slot zero was allocated. #243,v,g,n,n,PM_LSU_SRQ_S0_VALID,SRQ slot 0 valid ##C20E6 This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. In SMT mode the SRQ is split between the two threads (16 entries each). #244,u,g,n,n,PM_LSU_SRQ_SYNC_CYC,SRQ sync duration ##830E5 Cycles that a sync instruction is active in the Store Request Queue. #245,v,g,n,n,PM_LWSYNC_HELD,LWSYNC held at dispatch ##130E0 Cycles a LWSYNC instruction was held at dispatch. LWSYNC instructions are held at dispatch until all previous loads are done and all previous stores have issued. LWSYNC enters the Store Request Queue and is sent to the storage subsystem but does not wait for a response. #246,c,g,n,n,PM_MEM_FAST_PATH_RD_DISP,Fast path memory read dispatched ##731E6 Fast path memory read dispatched. #247,v,g,n,n,PM_MEM_RQ_DISP_Q8to11,Memory read queue dispatched to queues 8-11 ##722E6 A memory operation was dispatched to read queue 8,9,10 or 11. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #248,v,g,n,s,PM_MEM_HI_PRIO_WR_CMPL,High priority write completed ##726E6 A memory write, which was upgraded to high priority, completed. Writes can be upgraded to high priority to ensure that read traffic does not lock out writes.
This event is sent from the Memory Controller clock domain and must be scaled accordingly. #249,v,g,n,s,PM_MEM_NONSPEC_RD_CANCEL,Non speculative memory read cancelled ##711C6 A non-speculative read was cancelled because the combined response indicated it was sourced from another L2 or L3. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #250,v,g,n,s,PM_MEM_LO_PRIO_WR_CMPL,Low priority write completed ##736E6 A memory write, which was not upgraded to high priority, completed. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #251,v,g,n,s,PM_MEM_PWQ_DISP,Memory partial-write queue dispatched ##704C6 Number of Partial Writes dispatched. The MC provides resources to gather partial cacheline writes (Partial line DMA writes & CI-stores) to up to four different cachelines at a time. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #252,v,g,n,n,PM_MEM_PWQ_DISP_Q2or3,Memory partial-write queue dispatched to Write Queue 2 or 3 ##734E6 Memory partial-write queue dispatched to Write Queue 2 or 3. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #253,v,g,n,s,PM_MEM_PW_CMPL,Memory partial-write completed ##724E6 Number of Partial Writes completed. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #254,v,g,n,s,PM_MEM_PW_GATH,Memory partial-write gathered ##714C6 Two or more partial-writes have been merged into a single memory write. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #255,v,g,n,n,PM_MEM_RQ_DISP_Q12to15,Memory read queue dispatched to queues 12-15 ##732E6 A memory operation was dispatched to read queue 12,13,14 or 15. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #256,v,g,n,s,PM_MEM_RQ_DISP,Memory read queue dispatched ##701C6 A memory read was dispatched.
This event is sent from the Memory Controller clock domain and must be scaled accordingly. #257,v,g,n,n,PM_MEM_RQ_DISP_Q0to3,Memory read queue dispatched to queues 0-3 ##702C6 A memory operation was dispatched to read queue 0,1,2, or 3. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #258,v,g,n,n,PM_MEM_RQ_DISP_Q4to7,Memory read queue dispatched to queues 4-7 ##712C6 A memory operation was dispatched to read queue 4,5,6 or 7. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #259,v,g,n,s,PM_MEM_SPEC_RD_CANCEL,Speculative memory read cancelled ##721E6 Speculative memory read cancelled (i.e. cresp = sourced by L2/L3) #260,v,g,n,n,PM_MEM_WQ_DISP_Q0to7,Memory write queue dispatched to queues 0-7 ##723E6 A memory operation was dispatched to a write queue in the range between 0 and 7. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #261,v,g,n,n,PM_MEM_WQ_DISP_Q8to15,Memory write queue dispatched to queues 8-15 ##733E6 A memory operation was dispatched to a write queue in the range between 8 and 15. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #262,v,g,n,s,PM_MEM_WQ_DISP_DCLAIM,Memory write queue dispatched due to dclaim/flush ##713C6 A memory dclaim or flush operation was dispatched to a write queue. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #263,v,g,n,s,PM_MEM_WQ_DISP_WRITE,Memory write queue dispatched due to write ##703C6 A memory write was dispatched to a write queue. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #264,v,g,n,n,PM_MRK_DATA_FROM_L25_MOD,Marked data loaded from L2.5 modified ##C70A2 The processor's Data Cache was reloaded with modified (M) data from the L2 of a chip on the same module as this processor is located due to a marked load. 
#265,v,g,n,n,PM_MRK_DATA_FROM_L275_SHR,Marked data loaded from L2.75 shared ##C7097 The processor's Data Cache was reloaded with shared (T) data from the L2 on a different module than this processor is located due to a marked load. #266,v,g,n,n,PM_MRK_DATA_FROM_L2MISS,Marked data loaded missed L2 ##C709B DL1 was reloaded from beyond L2 due to a marked demand load. #267,v,g,n,n,PM_MRK_DATA_FROM_L3,Marked data loaded from L3 ##C70AF The processor's Data Cache was reloaded from the local L3 due to a marked load. #268,v,g,n,n,PM_MRK_DATA_FROM_L35_MOD,Marked data loaded from L3.5 modified ##C70A6 The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on the same module as this processor is located due to a marked load. #269,v,g,n,n,PM_MRK_DATA_FROM_L375_SHR,Marked data loaded from L3.75 shared ##C709E The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on a different module than this processor is located due to a marked load. #270,v,g,n,n,PM_MRK_DATA_FROM_LMEM,Marked data loaded from local memory ##C70A0 The processor's Data Cache was reloaded due to a marked load from memory attached to the same module this processor is located on. #271,v,g,n,n,PM_MRK_DSLB_MISS,Marked Data SLB misses ##C50C7 A Data SLB miss was caused by a marked instruction. #272,v,g,n,n,PM_MRK_DTLB_MISS,Marked Data TLB misses ##C50C6,C60E0 Data TLB references by a marked instruction that missed the TLB (all page sizes). #273,v,g,n,n,PM_MRK_DTLB_MISS_16M,Marked Data TLB misses for 16M page ##C608D Marked Data TLB misses for 16M page. #274,v,g,n,n,PM_MRK_DTLB_REF,Marked Data TLB reference ##C60E4 Total number of Data TLB references by a marked instruction for all page sizes. Page size is determined at TLB reload time. #275,v,g,n,n,PM_MRK_DTLB_REF_16M,Marked Data TLB reference for 16M page ##C6086 Data TLB references by a marked instruction for 16MB pages.
#276,v,g,n,n,PM_MRK_FPU_FIN,Marked instruction FPU processing finished ##00014 One of the Floating Point Units finished a marked instruction. Instructions that finish may not necessarily complete. #277,v,g,n,n,PM_MRK_IMR_RELOAD,Marked IMR reloaded ##820E2 A DL1 reload occurred due to a marked load. #278,v,g,n,n,PM_MRK_INST_FIN,Marked instruction finished ##00005 One of the execution units finished a marked instruction. Instructions that finish may not necessarily complete. #279,v,g,n,n,PM_MRK_L1_RELOAD_VALID,Marked L1 reload data source valid ##C70E4 The source information is valid and is for a marked load. #280,v,g,n,n,PM_MRK_LD_MISS_L1_LSU0,LSU0 marked L1 D cache load misses ##820E0 Load references that miss the Level 1 Data cache, by LSU0. #281,v,g,n,n,PM_MRK_LD_MISS_L1_LSU1,LSU1 marked L1 D cache load misses ##820E4 Load references that miss the Level 1 Data cache, by LSU1. #282,v,g,n,n,PM_MRK_LSU0_FLUSH_LRQ,LSU0 marked LRQ flushes ##810C2 A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #283,u,g,n,n,PM_MRK_LSU0_FLUSH_SRQ,LSU0 marked SRQ lhs flushes ##810C3 A marked store was flushed because a younger load hit an older store that is already in the SRQ or in the same group.
#284,v,g,n,n,PM_MRK_LSU0_FLUSH_ULD,LSU0 marked unaligned load flushes ##810C1 A marked load was flushed from unit 0 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1). #285,v,g,n,n,PM_MRK_LSU0_FLUSH_UST,LSU0 marked unaligned store flushes ##810C0 A marked store was flushed from unit 0 because it was unaligned. #286,v,g,n,n,PM_MRK_LSU1_FLUSH_LRQ,LSU1 marked LRQ flushes ##810C6 A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #287,u,g,n,n,PM_MRK_LSU1_FLUSH_SRQ,LSU1 marked SRQ lhs flushes ##810C7 A marked store was flushed because a younger load hit an older store that is already in the SRQ or in the same group. #288,v,g,n,n,PM_MRK_LSU1_FLUSH_ULD,LSU1 marked unaligned load flushes ##810C4 A marked load was flushed from unit 1 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1). #289,u,g,n,n,PM_MRK_LSU1_FLUSH_UST,LSU1 marked unaligned store flushes ##810C5 A marked store was flushed from unit 1 because it was unaligned (crossed a 4K boundary). #290,v,g,n,n,PM_MRK_LSU_FLUSH_LRQ,Marked LRQ flushes ##81088 A marked load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.
#291,v,g,n,n,PM_MRK_LSU_FLUSH_UST,Marked unaligned store flushes ##81090 A marked store was flushed because it was unaligned #292,u,g,n,n,PM_MRK_LSU_SRQ_INST_VALID,Marked instruction valid in SRQ ##C70E6 This signal is asserted every cycle when a marked request is resident in the Store Request Queue #293,v,g,n,n,PM_MRK_STCX_FAIL,Marked STCX failed ##820E6 A marked stcx (stwcx or stdcx) failed #294,v,g,n,n,PM_MRK_ST_CMPL_INT,Marked store completed with intervention ##00003 A marked store previously sent to the memory subsystem completed (data home) after requiring intervention #295,v,g,n,n,PM_MRK_ST_MISS_L1,Marked L1 D cache store misses ##820E3 A marked store missed the dcache #296,v,g,n,n,PM_PMC2_OVERFLOW,PMC2 Overflow ##0000A Overflows from PMC2 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow. #297,v,g,n,n,PM_PMC6_OVERFLOW,PMC6 Overflow ##0001A Overflows from PMC6 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow. #298,v,g,n,n,PM_PTEG_FROM_L25_MOD,PTEG loaded from L2.5 modified ##830A2 A Page Table Entry was loaded into the TLB with modified (M) data from the L2 of a chip on the same module as this processor is located due to a demand load. #299,v,g,n,n,PM_PTEG_FROM_L275_SHR,PTEG loaded from L2.75 shared ##83097 A Page Table Entry was loaded into the TLB with shared (T) data from the L2 on a different module than this processor is located due to a demand load. #300,v,g,n,n,PM_PTEG_FROM_L2MISS,PTEG loaded from L2 miss ##8309B A Page Table Entry was loaded into the TLB but not from the local L2. #301,v,g,n,n,PM_PTEG_FROM_L3,PTEG loaded from L3 ##830AF A Page Table Entry was loaded into the TLB from the local L3 due to a demand load. 
#302,v,g,n,n,PM_PTEG_FROM_L35_MOD,PTEG loaded from L3.5 modified ##830A6 A Page Table Entry was loaded into the TLB with modified (M) data from the L3 of a chip on the same module as this processor is located, due to a demand load. #303,v,g,n,n,PM_PTEG_FROM_L375_SHR,PTEG loaded from L3.75 shared ##8309E A Page Table Entry was loaded into the TLB with shared (S) data from the L3 of a chip on a different module than this processor is located, due to a demand load. #304,v,g,n,n,PM_PTEG_FROM_LMEM,PTEG loaded from local memory ##830A0 A Page Table Entry was loaded into the TLB from memory attached to the same module this processor is located on. #305,v,g,n,n,PM_PTEG_RELOAD_VALID,PTEG reload valid ##830E4 A Page Table Entry was loaded into the TLB. #306,v,g,n,s,PM_SNOOP_DCLAIM_RETRY_QFULL,Snoop dclaim/flush retry due to write/dclaim queues full ##720E6 A snoop request for a dclaim or flush to memory was retried because the write/dclaim queues were full. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #307,v,g,n,s,PM_SNOOP_PARTIAL_RTRY_QFULL,Snoop partial write retry due to partial-write queues full ##730E6 A snoop request for a partial write to memory was retried because the write queues that handle partial writes were full. When this happens the active writes are changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #308,v,g,n,s,PM_SNOOP_PW_RETRY_RQ,Snoop partial-write retry due to collision with active read queue ##707C6 A snoop request for a partial write to memory was retried because it matched the cache line of an active read. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #309,v,g,n,s,PM_SNOOP_PW_RETRY_WQ_PWQ,Snoop partial-write retry due to collision with active write or partial-write queue ##717C6 A snoop request for a partial write to memory was retried because it matched the cache line of an active write or partial write.
When this happens the snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #310,v,g,n,s,PM_SNOOP_RD_RETRY_QFULL,Snoop read retry due to read queue full ##700C6 A snoop request for a read from memory was retried because the read queues were full. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #311,v,g,n,s,PM_SNOOP_RD_RETRY_RQ,Snoop read retry due to collision with active read queue ##705C6 A snoop request for a read from memory was retried because it matched the cache line of an active read. The snoop request is retried because the L2 may be able to source data via intervention for the 2nd read faster than the MC. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #312,v,g,n,s,PM_SNOOP_RD_RETRY_WQ,Snoop read retry due to collision with active write queue ##715C6 A snoop request for a read from memory was retried because it matched the cache line of an active write. The snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #313,v,g,n,s,PM_SNOOP_RETRY_1AHEAD,Snoop retry due to one ahead collision ##725E6 Snoop retry due to one ahead collision. #314,u,g,n,s,PM_SNOOP_TLBIE,Snoop TLBIE ##800C3 A tlbie was snooped from another processor. #315,v,g,n,s,PM_SNOOP_WR_RETRY_QFULL,Snoop write retry due to write queue full ##710C6 A snoop request for a write to memory was retried because the write queues were full. When this happens the snoop request is retried and the writes in the write reorder queue are changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly.
#316,v,g,n,s,PM_SNOOP_WR_RETRY_RQ,Snoop write/dclaim retry due to collision with active read queue ##706C6 A snoop request for a write or dclaim to memory was retried because it matched the cacheline of an active read. This event is sent from the Memory Controller clock domain and must be scaled accordingly #317,v,g,n,s,PM_SNOOP_WR_RETRY_WQ,Snoop write/dclaim retry due to collision with active write queue ##716C6 A snoop request for a write or dclaim to memory was retried because it matched the cache line of an active write. The snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #318,v,g,n,n,PM_STCX_FAIL,STCX failed ##820E1 A stcx (stwcx or stdcx) failed #319,v,g,n,n,PM_STCX_PASS,Stcx passes ##820E5 A stcx (stwcx or stdcx) instruction was successful #320,v,g,n,n,PM_STOP_COMPLETION,Completion stopped ##00018 RAS Unit has signaled completion to stop #321,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##C10C3 A store missed the dcache. Combined Unit 0 + 1. #322,v,g,n,n,PM_ST_REF_L1,L1 D cache store references ##C1090 Store references to the Data Cache. Combined Unit 0 + 1. #323,v,g,n,n,PM_ST_REF_L1_LSU0,LSU0 L1 D cache store references ##C10C1 Store references to the Data Cache by LSU0. #324,v,g,n,n,PM_ST_REF_L1_LSU1,LSU1 L1 D cache store references ##C10C4 Store references to the Data Cache by LSU1. #325,v,g,n,n,PM_SUSPENDED,Suspended ##00000 The counter is suspended (does not count). #326,v,g,n,s,PM_THRD_L2MISS_BOTH_CYC,Cycles both threads in L2 misses ##410C7 Cycles that both threads have L2 miss pending. If only one thread has a L2 miss pending the other thread is given priority at decode. If both threads have L2 miss pending decode priority is determined by the number of GCT entries used. #327,v,g,n,n,PM_THRD_PRIO_1_CYC,Cycles thread running at priority level 1 ##420E0 Cycles this thread was running at priority level 1. 
Priority level 1 is the lowest and indicates the thread is sleeping. #328,v,g,n,n,PM_THRD_PRIO_2_CYC,Cycles thread running at priority level 2 ##420E1 Cycles this thread was running at priority level 2. #329,v,g,n,n,PM_THRD_PRIO_3_CYC,Cycles thread running at priority level 3 ##420E2 Cycles this thread was running at priority level 3. #330,v,g,n,n,PM_THRD_PRIO_4_CYC,Cycles thread running at priority level 4 ##420E3 Cycles this thread was running at priority level 4. #331,v,g,n,n,PM_THRD_PRIO_5_CYC,Cycles thread running at priority level 5 ##420E4 Cycles this thread was running at priority level 5. #332,v,g,n,n,PM_THRD_PRIO_6_CYC,Cycles thread running at priority level 6 ##420E5 Cycles this thread was running at priority level 6. #333,v,g,n,n,PM_THRD_PRIO_7_CYC,Cycles thread running at priority level 7 ##420E6 Cycles this thread was running at priority level 7. #334,v,g,n,n,PM_THRD_PRIO_DIFF_0_CYC,Cycles no thread priority difference ##430E3 Cycles when this thread's priority is equal to the other thread's priority. #335,v,g,n,n,PM_THRD_PRIO_DIFF_1or2_CYC,Cycles thread priority difference is 1 or 2 ##430E4 Cycles when this thread's priority is higher than the other thread's priority by 1 or 2. #336,v,g,n,n,PM_THRD_PRIO_DIFF_3or4_CYC,Cycles thread priority difference is 3 or 4 ##430E5 Cycles when this thread's priority is higher than the other thread's priority by 3 or 4. #337,v,g,n,n,PM_THRD_PRIO_DIFF_5or6_CYC,Cycles thread priority difference is 5 or 6 ##430E6 Cycles when this thread's priority is higher than the other thread's priority by 5 or 6. #338,v,g,n,n,PM_THRD_PRIO_DIFF_minus1or2_CYC,Cycles thread priority difference is -1 or -2 ##430E2 Cycles when this thread's priority is lower than the other thread's priority by 1 or 2. #339,v,g,n,n,PM_THRD_PRIO_DIFF_minus3or4_CYC,Cycles thread priority difference is -3 or -4 ##430E1 Cycles when this thread's priority is lower than the other thread's priority by 3 or 4. 
#340,v,g,n,n,PM_THRD_PRIO_DIFF_minus5or6_CYC,Cycles thread priority difference is -5 or -6 ##430E0 Cycles when this thread's priority is lower than the other thread's priority by 5 or 6. #341,v,g,n,s,PM_THRD_SEL_OVER_CLB_EMPTY,Thread selection overrides caused by CLB empty ##410C2 Thread selection was overridden because one thread's CLB was empty. #342,v,g,n,s,PM_THRD_SEL_OVER_GCT_IMBAL,Thread selection overrides caused by GCT imbalance ##410C4 Thread selection was overridden because of a GCT imbalance. #343,v,g,n,s,PM_THRD_SEL_OVER_ISU_HOLD,Thread selection overrides caused by ISU holds ##410C5 Thread selection was overridden because of an ISU hold. #344,v,g,n,s,PM_THRD_SEL_OVER_L2MISS,Thread selection overrides caused by L2 misses ##410C3 Thread selection was overridden because one thread had an L2 miss pending. #345,v,g,n,s,PM_THRD_SEL_T0,Decode selected thread 0 ##410C0 Thread selection picked thread 0 for decode. #346,v,g,n,s,PM_THRD_SEL_T1,Decode selected thread 1 ##410C1 Thread selection picked thread 1 for decode. #347,v,g,n,s,PM_THRD_SMT_HANG,SMT hang detected ##330E7 A hung thread was detected. #348,v,g,t,n,PM_THRESH_TIMEO,Threshold timeout ##0000B The threshold timer expired. #349,v,g,n,n,PM_TLBIE_HELD,TLBIE held at dispatch ##130E4 Cycles a TLBIE instruction was held at dispatch. #350,v,g,n,s,PM_XER_MAP_FULL_CYC,Cycles XER mapper full ##100C2 The XER mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #351,v,g,n,n,PM_BR_PRED_TA,A conditional branch was predicted, target prediction ##230E3 The target address of a branch instruction was predicted. #352,v,g,n,n,PM_MEM_RQ_DISP_Q16to19,Memory read queue dispatched to queues 16-19 ##727E6 A memory operation was dispatched to read queue 16,17,18 or 19. This event is sent from the Memory Controller clock domain and must be scaled accordingly.
#353,v,g,n,n,PM_SNOOP_RETRY_AB_COLLISION,Snoop retry due to a b collision ##735E6 Snoop retry due to a b collision. #354,v,g,n,n,PM_INST_DISP_ATTEMPT,Instructions dispatch attempted ##120E1 Number of PowerPC instructions dispatched (attempted, not filtered by success). $$$$$$$$ { counter 4 } #0,v,g,n,n,PM_0INST_CLB_CYC,Cycles no instructions in CLB ##400C0 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #1,v,g,n,n,PM_0INST_FETCH,No instructions fetched ##2208D No instructions were fetched this cycle (due to IFU hold, redirect, or icache miss). #2,v,g,n,n,PM_1INST_CLB_CYC,Cycles 1 instruction in CLB ##400C1 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #3,v,g,n,n,PM_2INST_CLB_CYC,Cycles 2 instructions in CLB ##400C2 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.
#4,v,g,n,n,PM_3INST_CLB_CYC,Cycles 3 instructions in CLB ##400C3 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #5,v,g,n,n,PM_4INST_CLB_CYC,Cycles 4 instructions in CLB ##400C4 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #6,v,g,n,n,PM_5INST_CLB_CYC,Cycles 5 instructions in CLB ##400C5 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. #7,v,g,n,n,PM_6INST_CLB_CYC,Cycles 6 instructions in CLB ##400C6 The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific. 
#8,u,g,n,s,PM_BRQ_FULL_CYC,Cycles branch queue full ##100C5 Cycles when the issue queue that feeds the branch unit is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented. #9,v,g,n,n,PM_BR_ISSUED,Branches issued ##230E4 A branch instruction was issued to the branch unit. A branch that was incorrectly predicted may issue and execute multiple times. #10,v,g,n,n,PM_BR_MPRED_CR,Branch mispredictions due to CR bit setting ##230E5 A conditional branch instruction was incorrectly predicted as taken or not taken. The branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This will result in a branch redirect flush if not overridden by a flush of an older instruction. #11,v,g,n,n,PM_BR_MPRED_TA,Branch mispredictions due to target address ##230E6 A branch instruction target was incorrectly predicted. This will result in a branch mispredict flush unless a flush is detected from an older instruction. #12,v,g,n,n,PM_BR_PRED_CR_TA,A conditional branch was predicted, CR and target prediction ##23087 Both the condition (taken or not taken) and the target address of a branch instruction were predicted. #13,v,g,n,s,PM_CLB_EMPTY_CYC,Cycles CLB empty ##410C6 Cycles when both threads' CLBs are completely empty. #14,v,g,n,n,PM_CLB_FULL_CYC,Cycles CLB full ##220E5 Cycles when both threads' CLBs are full. #15,v,g,n,n,PM_CMPLU_STALL_DIV,Completion stall caused by DIV instruction ##11099 Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a fixed point divide instruction. This is a subset of PM_CMPLU_STALL_FXU. #16,v,g,n,n,PM_CMPLU_STALL_ERAT_MISS,Completion stall caused by ERAT miss ##1109B Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered an ERAT miss.
This is a subset of PM_CMPLU_STALL_REJECT. #17,v,g,n,n,PM_CMPLU_STALL_FPU,Completion stall caused by FPU instruction ##11098 Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a floating point instruction. #18,v,g,n,n,PM_CMPLU_STALL_REJECT,Completion stall caused by reject ##1109A Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered a load/store reject. This is a subset of PM_CMPLU_STALL_LSU. #19,u,g,n,s,PM_CRQ_FULL_CYC,Cycles CR issue queue full ##110C1 The issue queue that feeds the Conditional Register unit is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented. #20,v,g,n,s,PM_CR_MAP_FULL_CYC,Cycles CR logical operation mapper full ##100C4 The Conditional Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #21,v,g,n,s,PM_CYC,Processor cycles ##0000F Processor cycles #22,v,g,n,n,PM_DATA_FROM_L275_MOD,Data loaded from L2.75 modified ##C3097 The processor's Data Cache was reloaded with modified (M) data from the L2 on a different module than this processor is located due to a demand load. #23,v,g,n,n,PM_DATA_FROM_L375_MOD,Data loaded from L3.75 modified ##C309E The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on a different module than this processor is located due to a demand load. #24,v,g,n,n,PM_DATA_FROM_RMEM,Data loaded from remote memory ##C3087 The processor's Data Cache was reloaded from memory attached to a different module than this processor is located on. #25,v,g,n,n,PM_DATA_TABLEWALK_CYC,Cycles doing data tablewalks ##800C7 Cycles a translation tablewalk is active.
While a tablewalk is active any request attempting to access the TLB will be rejected and retried. #26,u,g,n,s,PM_DC_INV_L2,L1 D cache entries invalidated from L2 ##C10C7 A dcache invalidate was received from the L2 because a line in L2 was castout. #27,v,g,n,n,PM_DC_PREF_OUT_OF_STREAMS,D cache out of prefetch streams ##C50C2 A new prefetch stream was detected but no more stream entries were available. #28,v,g,n,n,PM_DC_PREF_DST,DST (Data Stream Touch) stream start ##830E6 A prefetch stream was started using the DST instruction. #29,v,g,n,n,PM_DC_PREF_STREAM_ALLOC,D cache new prefetch stream allocated ##830E7 A new Prefetch Stream was allocated. #30,v,g,n,n,PM_DSLB_MISS,Data SLB misses ##800C5 A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve. #31,v,g,n,n,PM_DTLB_MISS,Data TLB misses ##800C4,C20E0 Data TLB misses, all page sizes. #32,v,g,n,n,PM_DTLB_MISS_16G,Data TLB miss for 16G page ##C208D Data TLB references to 16GB pages that missed the TLB. Page size is determined at TLB reload time. #33,v,g,n,n,PM_DTLB_REF,Data TLB references ##C20E4 Total number of Data TLB references for all page sizes. Page size is determined at TLB reload time. #34,v,g,n,n,PM_DTLB_REF_16G,Data TLB reference for 16G page ##C2086 Data TLB references for 16GB pages. Includes hits + misses. #35,v,g,n,n,PM_EE_OFF,Cycles MSR(EE) bit off ##130E3 Cycles MSR(EE) bit was off indicating that interrupts due to external exceptions were masked. #36,u,g,n,n,PM_EE_OFF_EXT_INT,Cycles MSR(EE) bit off and external interrupt pending ##130E7 Cycles when an interrupt due to an external exception is pending but external exceptions were masked. #37,v,g,n,n,PM_EXT_INT,External interrupts ##00003 An interrupt due to an external exception occurred. #38,v,g,n,s,PM_FAB_CMD_ISSUED,Fabric command issued ##700C7 Incremented when a chip issues a command on its SnoopA address bus.
Each of the two address busses (SnoopA and SnoopB) is capable of one transaction per fabric cycle (one fabric cycle = 2 cpu cycles in normal 2:1 mode), but each chip can only drive the SnoopA bus, and can only drive one transaction every two fabric cycles (i.e., every four cpu cycles). In MCM-based systems, two chips interleave their accesses to each of the two fabric busses (SnoopA, SnoopB) to reach a peak capability of one transaction per cpu clock cycle. The two chips that drive SnoopB are wired so that the chips refer to the bus as SnoopA but it is connected to the other two chips as SnoopB. Note that this event will only be recorded by the FBC on the chip that sourced the operation. The signal is delivered at FBC speed and the count must be scaled. #39,v,g,n,n,PM_FAB_CMD_RETRIED,Fabric command retried ##710C7 Incremented when a command issued by a chip on its SnoopA address bus is retried for any reason. The overwhelming majority of retries are due to running out of memory controller queues but retries can also be caused by trying to reference addresses that are in a transient cache state -- e.g. a line is transient after issuing a DCLAIM instruction to a shared line but before the associated store completes. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly. #40,v,g,n,s,PM_FAB_DCLAIM_ISSUED,dclaim issued ##720E7 A DCLAIM command was issued. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly. #41,v,g,n,s,PM_FAB_DCLAIM_RETRIED,dclaim retried ##730E7 A DCLAIM command was retried. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly. #42,v,g,n,s,PM_FAB_HOLDtoNN_EMPTY,Hold buffer to NN empty ##722E7 Fabric cycles when the Next Node out hold-buffers are empty. The signal is delivered at FBC speed and the count must be scaled accordingly.
#43,v,g,n,s,PM_FAB_HOLDtoVN_EMPTY,Hold buffer to VN empty ##721E7 Fabric cycles when the Vertical Node out hold-buffers are empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #44,v,g,n,s,PM_FAB_M1toP1_SIDECAR_EMPTY,M1 to P1 sidecar empty ##702C7 Fabric cycles when the Minus-1 hip/hop sidecars (sidecars for chip to chip data transfer) are empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #45,v,g,n,s,PM_FAB_M1toVNorNN_SIDECAR_EMPTY,M1 to VN/NN sidecar empty ##712C7 Fabric cycles when the Minus-1 jump sidecar (sidecars for mcm to mcm data transfer) is empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #46,v,g,n,s,PM_FAB_P1toM1_SIDECAR_EMPTY,P1 to M1 sidecar empty ##701C7 Fabric cycles when the Plus-1 hip/hop sidecars (sidecars for chip to chip data transfer) are empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #47,v,g,n,s,PM_FAB_P1toVNorNN_SIDECAR_EMPTY,P1 to VN/NN sidecar empty ##711C7 Fabric cycles when the Plus-1 jump sidecar (sidecars for mcm to mcm data transfer) is empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #48,v,g,n,s,PM_FAB_PNtoNN_DIRECT,PN to NN beat went straight to its destination ##703C7 Fabric Data beats that the base chip takes the inbound PN data and passes it through to the outbound NN bus without going into a sidecar. The signal is delivered at FBC speed and the count must be scaled. #49,v,g,n,s,PM_FAB_PNtoNN_SIDECAR,PN to NN beat went to sidecar first ##713C7 Fabric Data beats that the base chip takes the inbound PN data and forwards it on to the outbound NN data bus after going into a sidecar first. The signal is delivered at FBC speed and the count must be scaled.
#50,v,g,n,s,PM_FAB_PNtoVN_DIRECT,PN to VN beat went straight to its destination ##723E7 Fabric Data beats that the base chip takes the inbound PN data and passes it through to the outbound VN bus without going into a sidecar. The signal is delivered at FBC speed and the count must be scaled accordingly. #51,v,g,n,s,PM_FAB_PNtoVN_SIDECAR,PN to VN beat went to sidecar first ##733E7 Fabric data beats that the base chip takes the inbound PN data and forwards it on to the outbound VN data bus after going into a sidecar first. The signal is delivered at FBC speed and the count must be scaled accordingly. #52,v,g,n,s,PM_FAB_VBYPASS_EMPTY,Vertical bypass buffer empty ##731E7 Fabric cycles when the Middle Bypass sidecar is empty. The signal is delivered at FBC speed and the count must be scaled accordingly. #53,v,g,n,n,PM_FLUSH,Flushes ##110C7 Flushes occurred including LSU and Branch flushes. #54,v,g,n,n,PM_FLUSH_BR_MPRED,Flush caused by branch mispredict ##110C6 A flush was caused by a branch mispredict. #55,v,g,n,s,PM_FLUSH_IMBAL,Flush caused by thread GCT imbalance ##330E3 This thread has been flushed at dispatch because it is stalled and a GCT imbalance exists. GCT thresholds are set in the TSCR register. This allows the other thread to have more machine resources for it to make progress while this thread is stalled. #56,v,g,n,s,PM_FLUSH_SB,Flush caused by scoreboard operation ##330E2 This thread has been flushed at dispatch because its scoreboard bit is set indicating that a non-renamed resource is being updated. This allows the other thread to have more machine resources for it to make progress while this thread is stalled. #57,v,g,n,s,PM_FLUSH_SYNC,Flush caused by sync ##330E1 This thread has been flushed at dispatch due to a sync, lwsync, ptesync, or tlbsync instruction. This allows the other thread to have more machine resources for it to make progress until the sync finishes. 
#58,v,g,n,s,PM_FPR_MAP_FULL_CYC,Cycles FPR mapper full ##100C1 The Floating Point Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #59,v,g,n,n,PM_FPU0_1FLOP,FPU0 executed add, mult, sub, cmp or sel instruction ##000C3 The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations. #60,v,g,n,n,PM_FPU0_DENORM,FPU0 received denormalized data ##020E0 FPU0 has encountered a denormalized operand. #61,v,g,n,n,PM_FPU0_FDIV,FPU0 executed FDIV instruction ##000C0 FPU0 has executed a divide instruction. This could be fdiv, fdivs, fdiv., fdivs. #62,v,g,n,n,PM_FPU0_FEST,FPU0 executed FEST instruction ##010C2 FPU0 has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #63,v,g,n,n,PM_FPU0_FIN,FPU0 produced a result ##010C3 FPU0 finished, produced a result. This only indicates finish, not completion. Floating Point Stores are included in this count but not Floating Point Loads. #64,v,g,n,n,PM_FPU0_FMA,FPU0 executed multiply-add instruction ##000C1 The floating point unit has executed a multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #65,v,g,n,n,PM_FPU0_FMOV_FEST,FPU0 executed FMOV or FEST instructions ##010C0 FPU0 has executed a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ. #66,v,g,n,n,PM_FPU0_FPSCR,FPU0 executed FPSCR instruction ##030E0 FPU0 has executed FPSCR move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*, mffs*, mtfsf*, mcrfs* where XYZ* means XYZ, XYZs, XYZ., XYZs. #67,v,g,n,n,PM_FPU0_FRSP_FCONV,FPU0 executed FRSP or FCONV instructions ##010C1 FPU0 has executed a frsp or convert kind of instruction.
This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #68,v,g,n,n,PM_FPU0_FSQRT,FPU0 executed FSQRT instruction ##000C2 FPU0 has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #69,v,g,n,s,PM_FPU0_FULL_CYC,Cycles FPU0 issue queue full ##100C3 The issue queue for FPU0 cannot accept any more instructions. Dispatch to this issue queue is stopped. #70,v,g,n,n,PM_FPU0_SINGLE,FPU0 executed single precision instruction ##020E3 FPU0 has executed a single precision instruction. #71,v,g,n,n,PM_FPU0_STALL3,FPU0 stalled in pipe3 ##020E1 FPU0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). #72,v,g,n,n,PM_FPU0_STF,FPU0 executed store instruction ##020E2 FPU0 has executed a Floating Point Store instruction. #73,v,g,n,n,PM_FPU1_1FLOP,FPU1 executed add, mult, sub, cmp or sel instruction ##000C7 The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations. #74,v,g,n,n,PM_FPU1_DENORM,FPU1 received denormalized data ##020E4 FPU1 has encountered a denormalized operand. #75,v,g,n,n,PM_FPU1_FDIV,FPU1 executed FDIV instruction ##000C4 FPU1 has executed a divide instruction. This could be fdiv, fdivs, fdiv., fdivs. #76,v,g,n,n,PM_FPU1_FEST,FPU1 executed FEST instruction ##010C6 FPU1 has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #77,v,g,n,n,PM_FPU1_FIN,FPU1 produced a result ##010C7 FPU1 finished, produced a result. This only indicates finish, not completion. Floating Point Stores are included in this count but not Floating Point Loads. #78,v,g,n,n,PM_FPU1_FMA,FPU1 executed multiply-add instruction ##000C5 The floating point unit has executed a multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.
#79,v,g,n,n,PM_FPU1_FMOV_FEST,FPU1 executed FMOV or FEST instructions ##010C4 FPU1 has executed a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ. #80,v,g,n,n,PM_FPU1_FRSP_FCONV,FPU1 executed FRSP or FCONV instructions ##010C5 FPU1 has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #81,v,g,n,n,PM_FPU1_FSQRT,FPU1 executed FSQRT instruction ##000C6 FPU1 has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #82,v,g,n,s,PM_FPU1_FULL_CYC,Cycles FPU1 issue queue full ##100C7 The issue queue for FPU1 cannot accept any more instructions. Dispatch to this issue queue is stopped. #83,v,g,n,n,PM_FPU1_SINGLE,FPU1 executed single precision instruction ##020E7 FPU1 has executed a single precision instruction. #84,v,g,n,n,PM_FPU1_STALL3,FPU1 stalled in pipe3 ##020E5 FPU1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). #85,v,g,n,n,PM_FPU1_STF,FPU1 executed store instruction ##020E6 FPU1 has executed a Floating Point Store instruction. #86,v,g,n,n,PM_FPU_1FLOP,FPU executed one flop instruction ##000A8 The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations. #87,v,g,n,n,PM_FPU_FEST,FPU executed FEST instruction ##01090 The floating point unit has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1. #88,v,g,n,n,PM_FPU_FIN,FPU produced a result ##01088 FPU finished, produced a result. This only indicates finish, not completion. Combined Unit 0 + Unit 1.
Floating Point Stores are included in this count but not Floating Point Loads. #89,c,g,n,n,PM_FPU_FULL_CYC,Cycles FPU issue queue full ##100A8 Cycles when one or both FPU issue queues are full. Combined Unit 0 + 1. Use with caution since this is the sum of cycles when Unit 0 was full plus Unit 1 full. It does not indicate when both units were full. #90,v,g,n,n,PM_FPU_SINGLE,FPU executed single precision instruction ##020A8 FPU is executing a single precision instruction. Combined Unit 0 + Unit 1. #91,v,g,n,s,PM_FXLS0_FULL_CYC,Cycles FXU0/LS0 queue full ##110C0 The issue queue that feeds the Fixed Point unit 0 / Load Store Unit 0 is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented. #92,v,g,n,s,PM_FXLS1_FULL_CYC,Cycles FXU1/LS1 queue full ##110C4 The issue queue that feeds the Fixed Point unit 1 / Load Store Unit 1 is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented. #93,c,g,n,n,PM_FXLS_FULL_CYC,Cycles FXLS queue is full ##11090 Cycles when the issue queues for one or both FXU/LSU units are full. Use with caution since this is the sum of cycles when Unit 0 was full plus Unit 1 full. It does not indicate when both units were full. #94,v,g,n,n,PM_FXU0_FIN,FXU0 produced a result ##130E2 The Fixed Point unit 0 finished an instruction and produced a result. Instructions that finish may not necessarily complete. #95,u,g,n,n,PM_FXU1_BUSY_FXU0_IDLE,FXU1 busy FXU0 idle ##00012 FXU0 was idle while FXU1 was busy. #96,v,g,n,n,PM_FXU1_FIN,FXU1 produced a result ##130E6 The Fixed Point unit 1 finished an instruction and produced a result. Instructions that finish may not necessarily complete. #97,v,g,n,n,PM_GCT_FULL_CYC,Cycles GCT full ##0001F,100C0 The Global Completion Table is completely full.
#98,v,g,n,n,PM_GCT_NOSLOT_BR_MPRED,No slot in GCT caused by branch mispredict ##1009C Cycles when the Global Completion Table has no slots from this thread because of a branch misprediction. #99,v,g,n,s,PM_GPR_MAP_FULL_CYC,Cycles GPR mapper full ##130E5 The General Purpose Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #100,v,g,n,n,PM_GRP_BR_REDIR,Group experienced branch redirect ##120E6 Number of groups, counted at dispatch, that have encountered a branch redirect. Every group constructed from a fetch group that has been redirected will count. #101,c,g,n,n,PM_GRP_IC_MISS_BR_REDIR_NONSPEC,Group experienced non-speculative I cache miss or branch redirect ##120E5 Group experienced non-speculative I cache miss or branch redirect #102,v,g,n,n,PM_GRP_DISP_BLK_SB_CYC,Cycles group dispatch blocked by scoreboard ##130E1 A scoreboard operation on a non-renamed resource has blocked dispatch. #103,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected ##00002,120E4 A group that previously attempted dispatch was rejected. #104,v,g,n,n,PM_GRP_DISP_VALID,Group dispatch valid ##120E3 A group is available for dispatch. This does not mean it was successfully dispatched. #105,v,g,n,n,PM_GRP_IC_MISS,Group experienced I cache miss ##120E7 Number of groups, counted at dispatch, that have encountered an icache miss redirect. Every group constructed from a fetch group that missed the instruction cache will count. #106,v,g,n,n,PM_IC_DEMAND_L2_BHT_REDIRECT,L2 I cache demand request due to BHT redirect ##230E0 A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (CR mispredict). 
#107,v,g,n,n,PM_IC_DEMAND_L2_BR_REDIRECT,L2 I cache demand request due to branch redirect ##230E1 A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (either ALL mispredicted or Target). #108,v,g,n,n,PM_IC_PREF_INSTALL,Instruction prefetched installed in prefetch buffer ##210C7 A prefetch buffer entry (line) is allocated but the request is not a demand fetch. #109,v,g,n,n,PM_IC_PREF_REQ,Instruction prefetch requests ##220E6 An instruction prefetch request has been made. #110,v,g,n,n,PM_IERAT_XLATE_WR,Translation written to ierat ##220E7 An entry was written into the IERAT as a result of an IERAT miss. This event can be used to count IERAT misses. An ERAT miss that is later ignored will not be counted unless the ERAT is written before the instruction stream is changed. #111,v,g,n,n,PM_IERAT_XLATE_WR_LP,Large page translation written to ierat ##210C6 An entry was written into the IERAT as a result of an IERAT miss. This event can be used to count IERAT misses. An ERAT miss that is later ignored will not be counted unless the ERAT is written before the instruction stream is changed. #112,v,g,n,n,PM_IOPS_CMPL,Internal operations completed ##00001 Number of internal operations that completed. #113,v,g,n,n,PM_INST_DISP,Instructions dispatched ##00009 Number of PowerPC instructions successfully dispatched. #114,v,g,n,n,PM_INST_FETCH_CYC,Cycles at least 1 instruction fetched ##220E4 Cycles when at least one instruction was sent from the fetch unit to the decode unit. #115,v,g,n,n,PM_INST_FROM_L275_MOD,Instruction fetched from L2.75 modified ##22096 An instruction fetch group was fetched with modified (M) data from the L2 on a different module than this processor is located.
Fetch groups can contain up to 8 instructions. #116,v,g,n,n,PM_INST_FROM_L375_MOD,Instruction fetched from L3.75 modified ##2209D An instruction fetch group was fetched with modified (M) data from the L3 of a chip on a different module than this processor is located. Fetch groups can contain up to 8 instructions. #117,v,g,n,n,PM_INST_FROM_RMEM,Instruction fetched from remote memory ##22086 An instruction fetch group was fetched from memory attached to a different module than this processor is located on. Fetch groups can contain up to 8 instructions. #118,u,g,n,n,PM_ISLB_MISS,Instruction SLB misses ##800C1 A SLB miss for an instruction fetch has occurred. #119,v,g,n,n,PM_ITLB_MISS,Instruction TLB misses ##800C0 A TLB miss for an Instruction Fetch has occurred. #120,v,g,n,n,PM_L1_DCACHE_RELOAD_VALID,L1 reload data source valid ##C30E4 The data source information is valid, the data cache has been reloaded. Prior to POWER5+ this included data cache reloads due to prefetch activity. With POWER5+ this now only includes reloads due to demand loads. #121,v,g,n,n,PM_L1_PREF,L1 cache data prefetches ##C70E7 A request to prefetch data into the L1 was made. #122,v,g,n,n,PM_L1_WRITE_CYC,Cycles writing to instruction L1 ##230E7 Cycles that a cache line was written to the instruction cache. #123,v,g,n,s,PM_L2SA_MOD_INV,L2 slice A transition from modified to invalid ##730E0 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C. #124,v,g,n,s,PM_L2SA_MOD_TAG,L2 slice A transition from modified to tagged ##720E0 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2.
The event is provided on each of the three slices A, B, and C. #125,v,g,n,s,PM_L2SA_RCLD_DISP,L2 slice A RC load dispatch attempt ##701C0 A Read/Claim dispatch for a Load was attempted #126,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_ADDR,L2 slice A RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ ##711C0 A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time. #127,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_OTHER,L2 slice A RC load dispatch attempt failed due to other reasons ##731E0 A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions. #128,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_RC_FULL,L2 slice A RC load dispatch attempt failed due to all RC full ##721E0 A Read/Claim dispatch for a load failed because all RC machines are busy. #129,v,g,n,s,PM_L2SA_RCST_DISP,L2 slice A RC store dispatch attempt ##702C0 A Read/Claim dispatch for a Store was attempted. #130,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_ADDR,L2 slice A RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ ##712C0 A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time. #131,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_OTHER,L2 slice A RC store dispatch attempt failed due to other reasons ##732E0 A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted. #132,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_RC_FULL,L2 slice A RC store dispatch attempt failed due to all RC full ##722E0 A Read/Claim dispatch for a store failed because all RC machines are busy. 
#133,v,g,n,s,PM_L2SA_RC_DISP_FAIL_CO_BUSY,L2 slice A RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy ##703C0 A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss (re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access. #134,v,g,n,s,PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL,L2 slice A RC dispatch attempt failed due to all CO busy ##713C0 A Read/Claim dispatch was rejected because all Castout machines were busy. #135,v,g,n,s,PM_L2SA_SHR_INV,L2 slice A transition from shared to invalid ##710C0 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #136,v,g,n,s,PM_L2SA_SHR_MOD,L2 slice A transition from shared to modified ##700C0 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C. #137,v,g,n,n,PM_L2SA_ST_HIT,L2 slice A store hits ##733E0 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A, B, and C.
#138,v,g,n,n,PM_L2SA_ST_REQ,L2 slice A store requests ##723E0 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C. #139,v,g,n,s,PM_L2SB_MOD_INV,L2 slice B transition from modified to invalid ##730E1 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C. #140,v,g,n,s,PM_L2SB_MOD_TAG,L2 slice B transition from modified to tagged ##720E1 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C. #141,v,g,n,s,PM_L2SB_RCLD_DISP,L2 slice B RC load dispatch attempt ##701C1 A Read/Claim dispatch for a Load was attempted #142,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_ADDR,L2 slice B RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ ##711C1 A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time. #143,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_OTHER,L2 slice B RC load dispatch attempt failed due to other reasons ##731E1 A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions. #144,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_RC_FULL,L2 slice B RC load dispatch attempt failed due to all RC full ##721E1 A Read/Claim dispatch for a load failed because all RC machines are busy. #145,v,g,n,s,PM_L2SB_RCST_DISP,L2 slice B RC store dispatch attempt ##702C1 A Read/Claim dispatch for a Store was attempted. 
#146,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_ADDR,L2 slice B RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ ##712C1 A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time. #147,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_OTHER,L2 slice B RC store dispatch attempt failed due to other reasons ##732E1 A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted. #148,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_RC_FULL,L2 slice B RC store dispatch attempt failed due to all RC full ##722E2 A Read/Claim dispatch for a store failed because all RC machines are busy. #149,v,g,n,s,PM_L2SB_RC_DISP_FAIL_CO_BUSY,L2 slice B RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy ##703C1 A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss (re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access. #150,v,g,n,s,PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL,L2 slice B RC dispatch attempt failed due to all CO busy ##713C1 A Read/Claim dispatch was rejected because all Castout machines were busy. #151,v,g,n,s,PM_L2SB_SHR_INV,L2 slice B transition from shared to invalid ##710C1 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state.
This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #152,v,g,n,s,PM_L2SB_SHR_MOD,L2 slice B transition from shared to modified ##700C1 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C. #153,v,g,n,n,PM_L2SB_ST_HIT,L2 slice B store hits ##733E1 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A, B, and C. #154,v,g,n,n,PM_L2SB_ST_REQ,L2 slice B store requests ##723E1 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C. #155,v,g,n,s,PM_L2SC_MOD_INV,L2 slice C transition from modified to invalid ##730E2 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C. #156,v,g,n,s,PM_L2SC_MOD_TAG,L2 slice C transition from modified to tagged ##720E2 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.
#157,v,g,n,s,PM_L2SC_RCLD_DISP,L2 slice C RC load dispatch attempt ##701C2 A Read/Claim dispatch for a Load was attempted #158,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_ADDR,L2 slice C RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ ##711C2 A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time. #159,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_OTHER,L2 slice C RC load dispatch attempt failed due to other reasons ##731E2 A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions. #160,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_RC_FULL,L2 slice C RC load dispatch attempt failed due to all RC full ##721E2 A Read/Claim dispatch for a load failed because all RC machines are busy. #161,v,g,n,s,PM_L2SC_RCST_DISP,L2 slice C RC store dispatch attempt ##702C2 A Read/Claim dispatch for a Store was attempted. #162,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_ADDR,L2 slice C RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ ##712C2 A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time. #163,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_OTHER,L2 slice C RC store dispatch attempt failed due to other reasons ##732E2 A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted. #164,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_RC_FULL,L2 slice C RC store dispatch attempt failed due to all RC full ##722E1 A Read/Claim dispatch for a store failed because all RC machines are busy. #165,v,g,n,s,PM_L2SC_RC_DISP_FAIL_CO_BUSY,L2 slice C RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy ##703C2 A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. 
In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss (re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access. #166,v,g,n,s,PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL,L2 slice C RC dispatch attempt failed due to all CO busy ##713C2 A Read/Claim dispatch was rejected because all Castout machines were busy. #167,v,g,n,s,PM_L2SC_SHR_INV,L2 slice C transition from shared to invalid ##710C2 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #168,v,g,n,s,PM_L2SC_SHR_MOD,L2 slice C transition from shared to modified ##700C2 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C. #169,v,g,n,n,PM_L2SC_ST_HIT,L2 slice C store hits ##733E2 A store request made from the core hit in the L2 directory. The event is provided on each of the three slices A, B, and C. #170,v,g,n,n,PM_L2SC_ST_REQ,L2 slice C store requests ##723E2 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C.
#171,v,g,n,n,PM_L2_PREF,L2 cache prefetches ##C50C3 A request to prefetch data into L2 was made #172,v,g,n,s,PM_L3SA_ALL_BUSY,L3 slice A active for every cycle all CI/CO machines busy ##721E3 Cycles All Castin/Castout machines are busy. #173,v,g,n,s,PM_L3SA_HIT,L3 slice A hits ##711C3 Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice #174,v,g,n,s,PM_L3SA_MOD_INV,L3 slice A transition from modified to invalid ##730E3 L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I) Mu|Me are not included since they are formed due to a prev read op. Tx is not included since it is considered shared at this point. #175,v,g,n,s,PM_L3SA_MOD_TAG,L3 slice A transition from modified to TAG ##720E3 L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. L3 going M->T or M->I(go_Mu case) Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point. #176,v,g,n,s,PM_L3SA_REF,L3 slice A references ##701C3 Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice #177,v,g,n,s,PM_L3SA_SHR_INV,L3 slice A transition from shared to invalid ##710C3 L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched). #178,v,g,n,s,PM_L3SA_SNOOP_RETRY,L3 slice A snoop retries ##731E3 Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b) #179,v,g,n,s,PM_L3SB_ALL_BUSY,L3 slice B active for every cycle all CI/CO machines busy ##721E4 Cycles All Castin/Castout machines are busy. #180,v,g,n,s,PM_L3SB_HIT,L3 slice B hits ##711C4 Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice #181,v,g,n,s,PM_L3SB_MOD_INV,L3 slice B transition from modified to invalid ##730E4 L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e.
L3 going M=>I). Mu|Me are not included since they are formed due to a prev read op. Tx is not included since it is considered shared at this point. #182,v,g,n,s,PM_L3SB_MOD_TAG,L3 slice B transition from modified to TAG ##720E4 L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. L3 going M->T or M->I(go_Mu case); Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point. #183,v,g,n,s,PM_L3SB_REF,L3 slice B references ##701C4 Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice #184,v,g,n,s,PM_L3SB_SHR_INV,L3 slice B transition from shared to invalid ##710C4 L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched). #185,v,g,n,s,PM_L3SB_SNOOP_RETRY,L3 slice B snoop retries ##731E4 Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b) #186,v,g,n,s,PM_L3SC_ALL_BUSY,L3 slice C active for every cycle all CI/CO machines busy ##721E5 Cycles All Castin/Castout machines are busy. #187,v,g,n,s,PM_L3SC_HIT,L3 slice C hits ##711C5 Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice #188,v,g,n,s,PM_L3SC_MOD_INV,L3 slice C transition from modified to invalid ##730E5 L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I) Mu|Me are not included since they are formed due to a previous read op. Tx is not included since it is considered shared at this point. #189,v,g,n,s,PM_L3SC_MOD_TAG,L3 slice C transition from modified to TAG ##720E5 L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. L3 going M->T or M->I(go_Mu case); Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point.
#190,v,g,n,s,PM_L3SC_REF,L3 slice C references ##701C5 Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice. #191,v,g,n,s,PM_L3SC_SHR_INV,L3 slice C transition from shared to invalid ##710C5 L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched). #192,v,g,n,s,PM_L3SC_SNOOP_RETRY,L3 slice C snoop retries ##731E5 Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b) #193,v,g,n,n,PM_LARX_LSU0,Larx executed on LSU0 ##820E7 A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0) #194,v,g,n,n,PM_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##C10C2 Load references that miss the Level 1 Data cache, by unit 0. #195,v,g,n,n,PM_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##C10C6 Load references that miss the Level 1 Data cache, by unit 1. #196,v,g,n,n,PM_LD_REF_L1,L1 D cache load references ##C1090 Load references to the Level 1 Data Cache. Combined unit 0 + 1. #197,v,g,n,n,PM_LD_REF_L1_LSU0,LSU0 L1 D cache load references ##C10C0 Load references to Level 1 Data Cache, by unit 0. #198,v,g,n,n,PM_LD_REF_L1_LSU1,LSU1 L1 D cache load references ##C10C4 Load references to Level 1 Data Cache, by unit 1. #199,u,g,n,s,PM_LR_CTR_MAP_FULL_CYC,Cycles LR/CTR mapper full ##100C6 The LR/CTR mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #200,v,g,n,n,PM_LSU0_BUSY_REJECT,LSU0 busy due to reject ##C20E1 Total cycles the Load Store Unit 0 is busy rejecting instructions. #201,v,g,n,n,PM_LSU0_DERAT_MISS,LSU0 DERAT misses ##800C2 Total D-ERAT Misses by LSU0. Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction.
#202,v,g,n,n,PM_LSU0_FLUSH_LRQ,LSU0 LRQ flushes ##C00C2 A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #203,u,g,n,n,PM_LSU0_FLUSH_SRQ,LSU0 SRQ lhs flushes ##C00C3 A store was flushed by unit 0 because younger load hits and older store that is already in the SRQ or in the same group. #204,v,g,n,n,PM_LSU0_FLUSH_ULD,LSU0 unaligned load flushes ##C00C0 A load was flushed from unit 0 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1) #205,v,g,n,n,PM_LSU0_FLUSH_UST,LSU0 unaligned store flushes ##C00C1 A store was flushed from unit 0 because it was unaligned (crossed a 4K boundary). #206,v,g,n,n,PM_LSU0_LDF,LSU0 executed Floating Point load instruction ##C50C0 A floating point load was executed by LSU0 #207,v,g,n,n,PM_LSU0_NCLD,LSU0 non-cacheable loads ##C50C1 A non-cacheable load was executed by unit 0. #208,v,g,n,n,PM_LSU0_REJECT_ERAT_MISS,LSU0 reject due to ERAT miss ##C40C3 Total cycles the Load Store Unit 0 is busy rejecting instructions due to an ERAT miss. Requests that miss the Derat are rejected and retried until the request hits in the Erat. #209,v,g,n,n,PM_LSU0_REJECT_LMQ_FULL,LSU0 reject due to LMQ full or missed data coming ##C40C1 Total cycles the Load Store Unit 0 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all eight entries are full, subsequent load instructions are rejected. #210,v,g,n,n,PM_LSU0_REJECT_RELOAD_CDF,LSU0 reject due to reload CDF or tag update collision ##C40C2 Total cycles the Load Store Unit 0 is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. 
Any instruction that requires the result bus in the same cycle is rejected. Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same cycle when they are being updated. #211,v,g,n,n,PM_LSU0_REJECT_SRQ,LSU0 SRQ lhs rejects ##C40C0 Total cycles the Load Store Unit 0 is busy rejecting instructions because of Load Hit Store conditions. Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue. #212,u,g,n,n,PM_LSU0_SRQ_STFWD,LSU0 SRQ store forwarded ##C60E1 Data from a store instruction was forwarded to a load on unit 0. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load hits L1 but becomes a store forward, it is not treated as a load miss. #213,v,g,n,n,PM_LSU1_BUSY_REJECT,LSU1 busy due to reject ##C20E5 Total cycles the Load Store Unit 1 is busy rejecting instructions. #214,v,g,n,n,PM_LSU1_DERAT_MISS,LSU1 DERAT misses ##800C6 A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #215,v,g,n,n,PM_LSU1_FLUSH_LRQ,LSU1 LRQ flushes ##C00C6 A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #216,u,g,n,n,PM_LSU1_FLUSH_SRQ,LSU1 SRQ lhs flushes ##C00C7 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group.
#217,v,g,n,n,PM_LSU1_FLUSH_ULD,LSU1 unaligned load flushes ##C00C4 A load was flushed from unit 1 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1). #218,u,g,n,n,PM_LSU1_FLUSH_UST,LSU1 unaligned store flushes ##C00C5 A store was flushed from unit 1 because it was unaligned (crossed a 4K boundary) #219,v,g,n,n,PM_LSU1_LDF,LSU1 executed Floating Point load instruction ##C50C4 A floating point load was executed by LSU1 #220,v,g,n,n,PM_LSU1_NCLD,LSU1 non-cacheable loads ##C50C5 A non-cacheable load was executed by unit 1. #221,v,g,n,n,PM_LSU1_REJECT_ERAT_MISS,LSU1 reject due to ERAT miss ##C40C7 Total cycles the Load Store Unit 1 is busy rejecting instructions due to an ERAT miss. Requests that miss the Derat are rejected and retried until the request hits in the Erat. #222,v,g,n,n,PM_LSU1_REJECT_LMQ_FULL,LSU1 reject due to LMQ full or missed data coming ##C40C5 Total cycles the Load Store Unit 1 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all eight entries are full, subsequent load instructions are rejected. #223,v,g,n,n,PM_LSU1_REJECT_RELOAD_CDF,LSU1 reject due to reload CDF or tag update collision ##C40C6 Total cycles the Load Store Unit 1 is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same cycle when they are being updated. #224,v,g,n,n,PM_LSU1_REJECT_SRQ,LSU1 SRQ lhs rejects ##C40C4 Total cycles the Load Store Unit 1 is busy rejecting instructions because of Load Hit Store conditions.
Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue. #225,u,g,n,n,PM_LSU1_SRQ_STFWD,LSU1 SRQ store forwarded ##C60E5 Data from a store instruction was forwarded to a load on unit 1. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load hits L1 but becomes a store forward, it is not treated as a load miss. #226,v,g,n,n,PM_LSU_FLUSH,Flush initiated by LSU ##110C5 A flush was initiated by the Load Store Unit #227,v,g,n,s,PM_LSU_FLUSH_LRQ_FULL,Flush caused by LRQ full ##320E7 This thread was flushed at dispatch because its Load Request Queue was full. This allows the other thread to have more machine resources for it to make progress while this thread is stalled. #228,u,g,n,n,PM_LSU_FLUSH_SRQ,SRQ flushes ##C00A8 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. Combined Unit 0 + 1. #229,v,g,n,s,PM_LSU_FLUSH_SRQ_FULL,Flush caused by SRQ full ##330E0 This thread was flushed at dispatch because its Store Request Queue was full. This allows the other thread to have more machine resources for it to make progress while this thread is stalled. #230,v,g,n,n,PM_LSU_LDF,LSU executed Floating Point load instruction ##C5090 LSU executed Floating Point load instruction. Combined Unit 0 + 1. #231,u,g,n,s,PM_LSU_LMQ_FULL_CYC,Cycles LMQ full ##C30E7 The Load Miss Queue was full. #232,v,g,n,n,PM_LSU_LMQ_LHR_MERGE,LMQ LHR merges ##C70E5 A data cache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry. #233,v,g,n,s,PM_LSU_LMQ_S0_ALLOC,LMQ slot 0 allocated ##C30E6 The first entry in the LMQ was allocated.
#234,v,g,n,n,PM_LSU_LMQ_S0_VALID,LMQ slot 0 valid ##C30E5 This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ has eight entries that are allocated FIFO. #235,v,g,n,s,PM_LSU_LRQ_FULL_CYC,Cycles LRQ full ##110C2 Cycles when the LRQ is full. #236,v,g,n,n,PM_LSU_LRQ_S0_ALLOC,LRQ slot 0 allocated ##C60E7 LRQ slot zero was allocated #237,v,g,n,n,PM_LSU_LRQ_S0_VALID,LRQ slot 0 valid ##C60E6 This signal is asserted every cycle that the Load Request Queue slot zero is valid. The LRQ is 32 entries long and is allocated round-robin. In SMT mode the LRQ is split between the two threads (16 entries each). #238,v,g,n,n,PM_LSU_REJECT_ERAT_MISS,LSU reject due to ERAT miss ##C40A8 Total cycles the Load Store Unit is busy rejecting instructions due to an ERAT miss. Combined unit 0 + 1. Requests that miss the Derat are rejected and retried until the request hits in the Erat. #239,u,g,n,n,PM_LSU_SRQ_EMPTY_CYC,Cycles SRQ empty ##00015 Cycles the Store Request Queue is empty #240,v,g,n,s,PM_LSU_SRQ_FULL_CYC,Cycles SRQ full ##110C3 Cycles the Store Request Queue is full. #241,v,g,n,n,PM_LSU_SRQ_S0_ALLOC,SRQ slot 0 allocated ##C20E7 SRQ Slot zero was allocated #242,v,g,n,n,PM_LSU_SRQ_S0_VALID,SRQ slot 0 valid ##C20E6 This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. In SMT mode the SRQ is split between the two threads (16 entries each). #243,u,g,n,n,PM_LSU_SRQ_SYNC_CYC,SRQ sync duration ##830E5 Cycles that a sync instruction is active in the Store Request Queue. #244,v,g,n,n,PM_LWSYNC_HELD,LWSYNC held at dispatch ##130E0 Cycles a LWSYNC instruction was held at dispatch. LWSYNC instructions are held at dispatch until all previous loads are done and all previous stores have issued. LWSYNC enters the Store Request Queue and is sent to the storage subsystem but does not wait for a response.
#245,c,g,n,n,PM_MEM_FAST_PATH_RD_DISP,Fast path memory read dispatched ##731E6 Fast path memory read dispatched #246,v,g,n,n,PM_MEM_RQ_DISP_Q16to19,Memory read queue dispatched to queues 16-19 ##727E6 A memory operation was dispatched to read queue 16,17,18 or 19. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #247,v,g,n,s,PM_MEM_HI_PRIO_WR_CMPL,High priority write completed ##726E6 A memory write, which was upgraded to high priority, completed. Writes can be upgraded to high priority to ensure that read traffic does not lock out writes. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #248,v,g,n,n,PM_MEM_RQ_DISP_Q12to15,Memory read queue dispatched to queues 12-15 ##732E6 A memory operation was dispatched to read queue 12,13,14 or 15. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #249,v,g,n,s,PM_MEM_LO_PRIO_WR_CMPL,Low priority write completed ##736E6 A memory write, which was not upgraded to high priority, completed. This event is sent from the Memory Controller clock domain and must be scaled accordingly #250,v,g,n,s,PM_MEM_PWQ_DISP,Memory partial-write queue dispatched ##704C6 Number of Partial Writes dispatched. The MC provides resources to gather partial cacheline writes (Partial line DMA writes & CI-stores) to up to four different cachelines at a time. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #251,v,g,n,n,PM_MEM_PWQ_DISP_Q2or3,Memory partial-write queue dispatched to Write Queue 2 or 3 ##734E6 Memory partial-write queue dispatched to Write Queue 2 or 3. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #252,v,g,n,s,PM_MEM_PW_CMPL,Memory partial-write completed ##724E6 Number of Partial Writes completed. This event is sent from the Memory Controller clock domain and must be scaled accordingly. 
#253,v,g,n,s,PM_MEM_PW_GATH,Memory partial-write gathered ##714C6 Two or more partial-writes have been merged into a single memory write. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #254,v,g,n,n,PM_INST_DISP_ATTEMPT,Instructions dispatch attempted ##120E1 Number of PowerPC Instructions dispatched (attempted, not filtered by success). #255,v,g,n,s,PM_MEM_RQ_DISP,Memory read queue dispatched ##701C6 A memory read was dispatched. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #256,v,g,n,n,PM_MEM_RQ_DISP_Q0to3,Memory read queue dispatched to queues 0-3 ##702C6 A memory operation was dispatched to read queue 0,1,2, or 3. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #257,v,g,n,n,PM_MEM_RQ_DISP_Q4to7,Memory read queue dispatched to queues 4-7 ##712C6 A memory operation was dispatched to read queue 4,5,6 or 7. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #258,v,g,n,s,PM_MEM_SPEC_RD_CANCEL,Speculative memory read cancelled ##721E6 Speculative memory read cancelled (i.e. cresp = sourced by L2/L3) #259,v,g,n,n,PM_MEM_WQ_DISP_Q0to7,Memory write queue dispatched to queues 0-7 ##723E6 A memory operation was dispatched to a write queue in the range between 0 and 7. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #260,v,g,n,n,PM_MEM_WQ_DISP_Q8to15,Memory write queue dispatched to queues 8-15 ##733E6 A memory operation was dispatched to a write queue in the range between 8 and 15. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #261,v,g,n,s,PM_MEM_WQ_DISP_DCLAIM,Memory write queue dispatched due to dclaim/flush ##713C6 A memory dclaim or flush operation was dispatched to a write queue. This event is sent from the Memory Controller clock domain and must be scaled accordingly.
#262,v,g,n,s,PM_MEM_WQ_DISP_WRITE,Memory write queue dispatched due to write ##703C6 A memory write was dispatched to a write queue. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #263,v,g,n,n,PM_MRK_CRU_FIN,Marked instruction CRU processing finished ##00005 The Condition Register Unit finished a marked instruction. Instructions that finish may not necessarily complete. #264,v,g,n,n,PM_MRK_DATA_FROM_L25_MOD_CYC,Marked load latency from L2.5 modified ##C70A2 Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level. #265,v,g,n,n,PM_MRK_DATA_FROM_L275_MOD,Marked data loaded from L2.75 modified ##C7097 The processor's Data Cache was reloaded with modified (M) data from the L2 on a different module than this processor is located due to a marked load. #266,v,g,n,n,PM_MRK_DATA_FROM_L275_MOD_CYC,Marked load latency from L2.75 modified ##C70A3 Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level. #267,v,g,n,n,PM_MRK_DATA_FROM_L35_MOD_CYC,Marked load latency from L3.5 modified ##C70A6 Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.
#268,v,g,n,n,PM_MRK_DATA_FROM_L375_MOD,Marked data loaded from L3.75 modified ##C709E The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on a different module than this processor is located due to a marked load. #269,v,g,n,n,PM_MRK_DATA_FROM_L375_MOD_CYC,Marked load latency from L3.75 modified ##C70A7 Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level. #270,v,g,n,n,PM_MRK_DATA_FROM_LMEM_CYC,Marked load latency from local memory ##C70A0 Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level. #271,v,g,n,n,PM_MRK_DATA_FROM_RMEM,Marked data loaded from remote memory ##C7087 The processor's Data Cache was reloaded due to a marked load from memory attached to a different module than this processor is located on. #272,v,g,n,n,PM_MRK_DATA_FROM_RMEM_CYC,Marked load latency from remote memory ##C70A1 Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level. #273,v,g,n,n,PM_MRK_DSLB_MISS,Marked Data SLB misses ##C50C7 A Data SLB miss was caused by a marked instruction. #274,v,g,n,n,PM_MRK_DTLB_MISS,Marked Data TLB misses ##C50C6,C60E0 Data TLB references by a marked instruction that missed the TLB (all page sizes).
#275,v,g,n,n,PM_MRK_DTLB_MISS_16G,Marked Data TLB misses for 16G page ##C608D Data TLB references to 16GB pages by a marked instruction that missed the TLB. Page size is determined at TLB reload time. #276,v,g,n,n,PM_MRK_DTLB_REF,Marked Data TLB reference ##C60E4 Total number of Data TLB references by a marked instruction for all page sizes. Page size is determined at TLB reload time. #277,v,g,n,n,PM_MRK_DTLB_REF_16G,Marked Data TLB reference for 16G page ##C6086 Data TLB references by a marked instruction for 16GB pages. #278,v,g,n,n,PM_MRK_GRP_CMPL,Marked group completed ##00013 A group containing a sampled instruction completed. Microcoded instructions that span multiple groups will generate this event once per group. #279,v,g,n,n,PM_MRK_GRP_IC_MISS,Group experienced marked I cache miss ##12091 A group containing a marked (sampled) instruction experienced an instruction cache miss. #280,v,g,n,n,PM_MRK_GRP_TIMEO,Marked group completion timeout ##0000B The sampling timeout expired indicating that the previously sampled instruction is no longer in the processor #281,v,g,n,n,PM_MRK_IMR_RELOAD,Marked IMR reloaded ##820E2 A DL1 reload occurred due to marked load #282,v,g,n,n,PM_MRK_L1_RELOAD_VALID,Marked L1 reload data source valid ##C70E4 The source information is valid and is for a marked load #283,v,g,n,n,PM_MRK_LD_MISS_L1_LSU0,LSU0 marked L1 D cache load misses ##820E0 Load references that miss the Level 1 Data cache, by LSU0. #284,v,g,n,n,PM_MRK_LD_MISS_L1_LSU1,LSU1 marked L1 D cache load misses ##820E4 Load references that miss the Level 1 Data cache, by LSU1. #285,v,g,n,n,PM_MRK_LSU0_FLUSH_LRQ,LSU0 marked LRQ flushes ##810C2 A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. 
#286,u,g,n,n,PM_MRK_LSU0_FLUSH_SRQ,LSU0 marked SRQ lhs flushes ##810C3 A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #287,v,g,n,n,PM_MRK_LSU0_FLUSH_ULD,LSU0 marked unaligned load flushes ##810C1 A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #288,v,g,n,n,PM_MRK_LSU0_FLUSH_UST,LSU0 marked unaligned store flushes ##810C0 A marked store was flushed from unit 0 because it was unaligned #289,v,g,n,n,PM_MRK_LSU1_FLUSH_LRQ,LSU1 marked LRQ flushes ##810C6 A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #290,u,g,n,n,PM_MRK_LSU1_FLUSH_SRQ,LSU1 marked SRQ lhs flushes ##810C7 A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #291,v,g,n,n,PM_MRK_LSU1_FLUSH_ULD,LSU1 marked unaligned load flushes ##810C4 A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #292,u,g,n,n,PM_MRK_LSU1_FLUSH_UST,LSU1 marked unaligned store flushes ##810C5 A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #293,c,g,n,n,PM_MRK_LSU_FIN,Marked instruction LSU processing finished ##00014 One of the Load/Store Units finished a marked instruction. Instructions that finish may not necessarily complete. #294,v,g,n,n,PM_MRK_LSU_FLUSH_SRQ,Marked SRQ lhs flushes ##81088 A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group.
#295,v,g,n,n,PM_MRK_LSU_FLUSH_ULD,Marked unaligned load flushes ##81090 A marked load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #296,u,g,n,n,PM_MRK_LSU_SRQ_INST_VALID,Marked instruction valid in SRQ ##C70E6 This signal is asserted every cycle when a marked request is resident in the Store Request Queue #297,v,g,n,n,PM_MRK_STCX_FAIL,Marked STCX failed ##820E6 A marked stcx (stwcx or stdcx) failed #298,v,g,n,n,PM_MRK_ST_MISS_L1,Marked L1 D cache store misses ##820E3 A marked store missed the dcache #299,v,g,n,n,PM_PMC3_OVERFLOW,PMC3 Overflow ##0000A Overflows from PMC3 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow. #300,v,g,n,n,PM_PTEG_FROM_L275_MOD,PTEG loaded from L2.75 modified ##83097 A Page Table Entry was loaded into the TLB with modified (M) data from the L2 on a different module than this processor is located due to a demand load. #301,v,g,n,n,PM_PTEG_FROM_L375_MOD,PTEG loaded from L3.75 modified ##8309E A Page Table Entry was loaded into the TLB with modified (M) data from the L3 of a chip on a different module than this processor is located, due to a demand load. #302,v,g,n,n,PM_PTEG_FROM_RMEM,PTEG loaded from remote memory ##83087 A Page Table Entry was loaded into the TLB from memory attached to a different module than this processor is located on. #303,v,g,n,n,PM_PTEG_RELOAD_VALID,PTEG reload valid ##830E4 A Page Table Entry was loaded into the TLB. #304,v,g,n,s,PM_SNOOP_DCLAIM_RETRY_QFULL,Snoop dclaim/flush retry due to write/dclaim queues full ##720E6 A snoop request for a dclaim or flush to memory was retried because the write/dclaim queues were full. This event is sent from the Memory Controller clock domain and must be scaled accordingly.
#305,v,g,n,s,PM_SNOOP_PARTIAL_RTRY_QFULL,Snoop partial write retry due to partial-write queues full ##730E6 A snoop request for a partial write to memory was retried because the write queues that handle partial writes were full. When this happens the active writes are changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #306,v,g,n,s,PM_SNOOP_PW_RETRY_RQ,Snoop partial-write retry due to collision with active read queue ##707C6 A snoop request for a partial write to memory was retried because it matched the cache line of an active read. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #307,v,g,n,s,PM_SNOOP_PW_RETRY_WQ_PWQ,Snoop partial-write retry due to collision with active write or partial-write queue ##717C6 A snoop request for a partial write to memory was retried because it matched the cache line of an active write or partial write. When this happens the snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #308,v,g,n,s,PM_SNOOP_RD_RETRY_QFULL,Snoop read retry due to read queue full ##700C6 A snoop request for a read from memory was retried because the read queues were full. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #309,v,g,n,s,PM_SNOOP_RD_RETRY_RQ,Snoop read retry due to collision with active read queue ##705C6 A snoop request for a read from memory was retried because it matched the cache line of an active read. The snoop request is retried because the L2 may be able to source data via intervention for the 2nd read faster than the MC. This event is sent from the Memory Controller clock domain and must be scaled accordingly. 
#310,v,g,n,s,PM_SNOOP_RD_RETRY_WQ,Snoop read retry due to collision with active write queue ##715C6 A snoop request for a read from memory was retried because it matched the cache line of an active write. The snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #311,v,g,n,s,PM_SNOOP_RETRY_1AHEAD,Snoop retry due to one ahead collision ##725E6 Snoop retry due to one ahead collision #312,u,g,n,s,PM_SNOOP_TLBIE,Snoop TLBIE ##800C3 A tlbie was snooped from another processor. #313,v,g,n,s,PM_SNOOP_WR_RETRY_QFULL,Snoop write retry due to write queue full ##710C6 A snoop request for a write to memory was retried because the write queues were full. When this happens the snoop request is retried and the writes in the write reorder queue are changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #314,v,g,n,s,PM_SNOOP_WR_RETRY_RQ,Snoop write/dclaim retry due to collision with active read queue ##706C6 A snoop request for a write or dclaim to memory was retried because it matched the cacheline of an active read. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #315,v,g,n,s,PM_SNOOP_WR_RETRY_WQ,Snoop write/dclaim retry due to collision with active write queue ##716C6 A snoop request for a write or dclaim to memory was retried because it matched the cache line of an active write. The snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #316,v,g,n,n,PM_STCX_FAIL,STCX failed ##820E1 A stcx (stwcx or stdcx) failed #317,v,g,n,n,PM_STCX_PASS,Stcx passes ##820E5 A stcx (stwcx or stdcx) instruction was successful #318,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##C10C3 A store missed the dcache. Combined Unit 0 + 1.
#319,v,g,n,n,PM_ST_REF_L1_LSU0,LSU0 L1 D cache store references ##C10C1 Store references to the Data Cache by LSU0. #320,v,g,n,n,PM_ST_REF_L1_LSU1,LSU1 L1 D cache store references ##C10C4 Store references to the Data Cache by LSU1. #321,v,g,n,n,PM_SUSPENDED,Suspended ##00000 The counter is suspended (does not count). #322,v,g,n,s,PM_THRD_L2MISS_BOTH_CYC,Cycles both threads in L2 misses ##41084,410C7 Cycles that both threads have an L2 miss pending. If only one thread has an L2 miss pending, the other thread is given priority at decode. If both threads have an L2 miss pending, decode priority is determined by the number of GCT entries used. #323,v,g,n,n,PM_THRD_PRIO_1_CYC,Cycles thread running at priority level 1 ##420E0 Cycles this thread was running at priority level 1. Priority level 1 is the lowest and indicates the thread is sleeping. #324,v,g,n,n,PM_THRD_PRIO_2_CYC,Cycles thread running at priority level 2 ##420E1 Cycles this thread was running at priority level 2. #325,v,g,n,n,PM_THRD_PRIO_3_CYC,Cycles thread running at priority level 3 ##420E2 Cycles this thread was running at priority level 3. #326,v,g,n,n,PM_THRD_PRIO_4_CYC,Cycles thread running at priority level 4 ##420E3 Cycles this thread was running at priority level 4. #327,v,g,n,n,PM_THRD_PRIO_5_CYC,Cycles thread running at priority level 5 ##420E4 Cycles this thread was running at priority level 5. #328,v,g,n,n,PM_THRD_PRIO_6_CYC,Cycles thread running at priority level 6 ##420E5 Cycles this thread was running at priority level 6. #329,v,g,n,n,PM_THRD_PRIO_7_CYC,Cycles thread running at priority level 7 ##420E6 Cycles this thread was running at priority level 7. #330,v,g,n,n,PM_THRD_PRIO_DIFF_0_CYC,Cycles no thread priority difference ##430E3 Cycles when this thread's priority is equal to the other thread's priority. #331,v,g,n,n,PM_THRD_PRIO_DIFF_1or2_CYC,Cycles thread priority difference is 1 or 2 ##430E4 Cycles when this thread's priority is higher than the other thread's priority by 1 or 2.
#332,v,g,n,n,PM_THRD_PRIO_DIFF_3or4_CYC,Cycles thread priority difference is 3 or 4 ##430E5 Cycles when this thread's priority is higher than the other thread's priority by 3 or 4. #333,v,g,n,n,PM_THRD_PRIO_DIFF_5or6_CYC,Cycles thread priority difference is 5 or 6 ##430E6 Cycles when this thread's priority is higher than the other thread's priority by 5 or 6. #334,v,g,n,n,PM_THRD_PRIO_DIFF_minus1or2_CYC,Cycles thread priority difference is -1 or -2 ##430E2 Cycles when this thread's priority is lower than the other thread's priority by 1 or 2. #335,v,g,n,n,PM_THRD_PRIO_DIFF_minus3or4_CYC,Cycles thread priority difference is -3 or -4 ##430E1 Cycles when this thread's priority is lower than the other thread's priority by 3 or 4. #336,v,g,n,n,PM_THRD_PRIO_DIFF_minus5or6_CYC,Cycles thread priority difference is -5 or -6 ##430E0 Cycles when this thread's priority is lower than the other thread's priority by 5 or 6. #337,v,g,n,s,PM_THRD_SEL_OVER_CLB_EMPTY,Thread selection overrides caused by CLB empty ##410C2 Thread selection was overridden because one thread's CLB was empty. #338,v,g,n,s,PM_THRD_SEL_OVER_GCT_IMBAL,Thread selection overrides caused by GCT imbalance ##410C4 Thread selection was overridden because of a GCT imbalance. #339,v,g,n,s,PM_THRD_SEL_OVER_ISU_HOLD,Thread selection overrides caused by ISU holds ##410C5 Thread selection was overridden because of an ISU hold. #340,v,g,n,s,PM_THRD_SEL_OVER_L2MISS,Thread selection overrides caused by L2 misses ##410C3 Thread selection was overridden because one thread had an L2 miss pending. #341,v,g,n,s,PM_THRD_SEL_T0,Decode selected thread 0 ##410C0 Thread selection picked thread 0 for decode. #342,v,g,n,s,PM_THRD_SEL_T1,Decode selected thread 1 ##410C1 Thread selection picked thread 1 for decode. #343,v,g,n,s,PM_THRD_SMT_HANG,SMT hang detected ##330E7 A hung thread was detected #344,v,g,n,n,PM_TLBIE_HELD,TLBIE held at dispatch ##130E4 Cycles a TLBIE instruction was held at dispatch.
#345,v,g,n,n,PM_WORK_HELD,Work held ##0000C RAS Unit has signaled completion to stop and there are groups waiting to complete #346,v,g,n,s,PM_XER_MAP_FULL_CYC,Cycles XER mapper full ##100C2 The XER mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented. #347,v,g,n,n,PM_BR_PRED_CR,A conditional branch was predicted, CR prediction ##230E2 A conditional branch instruction was predicted as taken or not taken. #348,v,g,n,n,PM_BR_PRED_TA,A conditional branch was predicted, target prediction ##230E3 The target address of a branch instruction was predicted. #349,v,g,n,n,PM_MEM_RQ_DISP_Q8to11,Memory read queue dispatched to queues 8-11 ##722E6 A memory operation was dispatched to read queue 8,9,10 or 11. This event is sent from the Memory Controller clock domain and must be scaled accordingly. #350,v,g,n,n,PM_SNOOP_RETRY_AB_COLLISION,Snoop retry due to a b collision ##735E6 Snoop retry due to a b collision #351,v,g,n,s,PM_MEM_NONSPEC_RD_CANCEL,Non speculative memory read cancelled ##711C6 A non-speculative read was cancelled because the combined response indicated it was sourced from another L2 or L3. This event is sent from the Memory Controller clock domain and must be scaled accordingly $$$$$$$$ { counter 5 } #0,v,g,n,n,PM_RUN_INST_CMPL,Run instructions completed ##00009 Number of run instructions completed. $$$$$$$$ { counter 6 } #0,v,g,n,n,PM_RUN_CYC,Run cycles ##00005 Processor Cycles gated by the run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. Gating by the run latch filters out the idle loop.

{ File: power5+/groups
{ Date: 03/15/07
{ Version: 1.8
{ (C) Copyright IBM Corporation, 2006, 2007. All Rights Reserved.
{ Contributed by Corey Ashford { Number of groups 188 { Group descriptions #0,312,302,113,21,0,0,pm_utilization,CPI and utilization data ##00005,00009,00009,0000F,00009,00005 00000000,00000000,0A12121E,00000000 CPI and utilization data #1,2,95,100,21,0,0,pm_completion,Completion and cycle counts ##00013,00004,00013,0000F,00009,00005 00000000,00000000,2608261E,00000000 Completion and cycle counts #2,105,104,101,113,0,0,pm_group_dispatch,Group dispatch events ##120E3,120E4,130E1,00009,00009,00005 00000000,4000000E,C6C8C212,00000000 Group dispatch events #3,0,2,12,267,0,0,pm_clb1,CLB fullness ##400C0,400C2,410C6,C70A6,00009,00005 00000000,015B0001,80848C4C,00000001 CLB fullness #4,6,6,292,112,0,0,pm_clb2,CLB fullness ##400C5,400C6,C70E6,00001,00009,00005 00000000,01430002,8A8CCC02,00000001 CLB fullness #5,98,97,95,98,0,0,pm_gct_empty,GCT empty reasons ##00004,1009C,10084,1009C,00009,00005 00000000,40000000,08380838,00000000 GCT empty reasons #6,99,98,96,97,0,0,pm_gct_usage,GCT Usage ##0001F,0001F,0001F,0001F,00009,00005 00000000,00000000,3E3E3E3E,00000000 GCT Usage #7,242,241,234,234,0,0,pm_lsu1,LSU LRQ and LMQ events ##C60E7,C60E6,C30E6,C30E5,00009,00005 00000000,020F000F,CECCCCCA,00000000 LSU LRQ and LMQ events #8,247,246,244,240,0,0,pm_lsu2,LSU SRQ events ##C20E7,C20E6,830E5,110C3,00009,00005 00000000,400E000E,CECCCA86,00000000 LSU SRQ events #9,238,247,236,239,0,0,pm_lsu3,LSU SRQ and LMQ events ##C70E5,C6088,00015,00015,00009,00005 00000000,030F0004,EA102A2A,00000000 LSU SRQ and LMQ events #10,237,244,236,239,0,0,pm_lsu4,LSU SRQ and LMQ events ##C30E7,110C3,00015,00015,00009,00005 00000000,40030000,EEA62A2A,00000000 LSU SRQ and LMQ events #11,120,115,26,29,0,0,pm_prefetch1,Prefetch stream allocation ##2209B,220E4,C50C2,830E7,00009,00005 00000000,8432000D,36C884CE,00000000 Prefetch stream allocation #12,115,13,122,108,0,0,pm_prefetch2,Prefetch events ##00001,220E5,C70E7,210C7,00009,00005 00000000,81030006,02CACE8E,00000001 Prefetch events 
#13,1,227,172,112,0,0,pm_prefetch3,L2 prefetch and misc events ##400C1,C2088,C50C3,00001,00009,00005 00000000,047C0004,82108602,00000001 L2 prefetch and misc events #14,216,225,27,171,0,0,pm_prefetch4,Misc prefetch and reject events ##C40C0,C40C4,830E6,C50C3,00009,00005 00000000,0CF20002,8088CC86,00000000 Misc prefetch and reject events #15,244,242,53,294,0,0,pm_lsu_reject1,LSU reject events ##C4090,C4088,330E3,81088,00009,00005 00000000,C8E00002,2010C610,00000001 LSU reject events #16,215,224,112,122,0,0,pm_lsu_reject2,LSU rejects due to reload CDF or tag update collision ##C40C2,C40C6,00001,230E7,00009,00005 00000000,88C00001,848C02CE,00000001 LSU rejects due to reload CDF or tag update collision #17,213,222,245,344,0,0,pm_lsu_reject3,LSU rejects due to ERAT, held instructions ##C40C3,C40C7,130E0,130E4,00009,00005 00000000,48C00003,868EC0C8,00000000 LSU rejects due to ERAT, held instructions #18,214,223,112,9,0,0,pm_lsu_reject4,LSU0/1 reject LMQ full ##C40C1,C40C5,00001,230E4,00009,00005 00000000,88C00001,828A02C8,00000001 LSU0/1 reject LMQ full #19,245,243,228,53,0,0,pm_lsu_reject5,LSU misc reject and flush events ##C4088,C4090,110C5,110C7,00009,00005 00000000,48C00000,10208A8E,00000000 LSU misc reject and flush events #20,115,233,53,26,0,0,pm_flush1,Misc flush events ##00001,C0088,330E3,C10C7,00009,00005 00000000,C0F00002,0210C68E,00000001 Misc flush events #21,124,113,54,57,0,0,pm_flush2,Flushes due to scoreboard and sync ##800C0,00001,330E2,330E1,00009,00005 00000000,C0800003,8002C4C2,00000001 Flushes due to scoreboard and sync #22,233,230,112,226,0,0,pm_lsu_flush_srq_lrq,LSU flush by SRQ and LRQ events ##C0090,C0090,00001,110C5,00009,00005 00000000,40C00000,2020028A,00000001 LSU flush by SRQ and LRQ events #23,207,216,228,112,0,0,pm_lsu_flush_lrq,LSU0/1 flush due to LRQ ##C00C2,C00C6,110C5,00001,00009,00005 00000000,40C00000,848C8A02,00000001 LSU0/1 flush due to LRQ #24,208,217,112,226,0,0,pm_lsu_flush_srq,LSU0/1 flush due to SRQ
##C00C3,C00C7,00001,110C5,00009,00005 00000000,40C00000,868E028A,00000001 LSU0/1 flush due to SRQ #25,235,233,8,112,0,0,pm_lsu_flush_unaligned,LSU flush due to unaligned data ##C0088,C0088,230E4,00001,00009,00005 00000000,80C00002,1010C802,00000001 LSU flush due to unaligned data #26,209,218,228,112,0,0,pm_lsu_flush_uld,LSU0/1 flush due to unaligned load ##C00C0,C00C4,110C5,00001,00009,00005 00000000,40C00000,80888A02,00000001 LSU0/1 flush due to unaligned load #27,210,219,112,226,0,0,pm_lsu_flush_ust,LSU0/1 flush due to unaligned store ##C00C1,C00C5,00001,110C5,00009,00005 00000000,40C00000,828A028A,00000001 LSU0/1 flush due to unaligned store #28,232,113,290,229,0,0,pm_lsu_flush_full,LSU flush due to LRQ/SRQ full ##320E7,00001,81088,330E0,00009,00005 00000000,C0200009,CE0210C0,00000001 LSU flush due to LRQ/SRQ full #29,109,17,112,18,0,0,pm_lsu_stall1,LSU Stalls ##00014,11098,00001,1109A,00009,00005 00000000,40000000,28300234,00000001 LSU Stalls #30,115,14,16,16,0,0,pm_lsu_stall2,LSU Stalls ##00001,1109A,0000F,1109B,00009,00005 00000000,40000000,02341E36,00000001 LSU Stalls #31,107,16,112,15,0,0,pm_fxu_stall,FXU Stalls ##120E5,11099,00001,11099,00009,00005 00000000,40000008,CA320232,00000001 FXU Stalls #32,89,15,112,17,0,0,pm_fpu_stall,FPU Stalls ##10090,1109B,00001,11098,00009,00005 00000000,40000000,20360230,00000001 FPU Stalls #33,198,7,237,231,0,0,pm_queue_full,BRQ LRQ LMQ queue full ##820E7,100C5,110C2,C30E7,00009,00005 00000000,400B0009,CE8A84CE,00000000 BRQ LRQ LMQ queue full #34,68,80,88,92,0,0,pm_issueq_full,FPU FX full ##100C3,100C7,110C0,110C4,00009,00005 00000000,40000000,868E8088,00000000 FPU FX full #35,16,200,97,19,0,0,pm_mapper_full1,CR CTR GPR mapper full ##100C4,100C6,130E5,110C1,00009,00005 00000000,40000002,888CCA82,00000000 CR CTR GPR mapper full #36,57,351,266,112,0,0,pm_mapper_full2,FPR XER mapper full ##100C1,100C2,C709B,00001,00009,00005 00000000,41030002,82843602,00000001 FPR XER mapper full 
#37,325,321,208,220,0,0,pm_misc_load,Non-cachable loads and stcx events ##820E1,820E5,C50C1,C50C5,00009,00005 00000000,0438000C,C2CA828A,00000001 Non-cachable loads and stcx events #38,205,214,106,107,0,0,pm_ic_demand,ICache demand from BR redirect ##C20E1,C20E5,230E0,230E1,00009,00005 00000000,800C000F,C2CAC0C2,00000000 ICache demand from BR redirect #39,113,110,108,1,0,0,pm_ic_pref,ICache prefetch ##220E7,220E6,210C7,2208D,00009,00005 00000000,8000000D,CECC8E1A,00000000 ICache prefetch #40,108,106,121,112,0,0,pm_ic_miss,ICache misses ##12099,120E7,C30E4,00001,00009,00005 00000000,4003000E,32CEC802,00000001 ICache misses #41,356,307,9,11,0,0,pm_branch_miss,Branch mispredict, TLB and SLB misses ##80088,80088,230E5,230E6,00009,00005 00000000,80800003,1010CACC,00000000 Branch mispredict, TLB and SLB misses #42,12,11,11,12,0,0,pm_branch1,Branch operations ##23087,23087,23087,23087,00009,00005 00000000,8000000F,0E0E0E0E,00000000 Branch operations #43,102,100,52,112,0,0,pm_branch2,Branch operations ##12091,120E6,110C6,00001,00009,00005 00000000,4000000C,22CC8C02,00000001 Branch operations #44,25,30,195,196,0,0,pm_L1_tlbmiss,L1 load and TLB misses ##800C7,800C4,C1088,C1090,00009,00005 00000000,00B00000,8E881020,00000000 L1 load and TLB misses #45,18,228,322,318,0,0,pm_L1_DERAT_miss,L1 store and DERAT misses ##C3087,80090,C1090,C10C3,00009,00005 00000000,00B30008,0E202086,00000000 L1 store and DERAT misses #46,30,120,196,195,0,0,pm_L1_slbmiss,L1 load and SLB misses ##800C5,800C1,C10C2,C10C6,00009,00005 00000000,00B00000,8A82848C,00000000 L1 load and SLB misses #47,34,33,33,34,0,0,pm_dtlbref,Data TLB references ##C2086,C2086,C2086,C2086,00009,00005 00000000,000C000F,0C0C0C0C,00000000 Data TLB references #48,32,31,31,32,0,0,pm_dtlbmiss,Data TLB misses ##C208D,C208D,C208D,C208D,00009,00005 00000000,000C000F,1A1A1A1A,00000000 Data TLB misses #49,33,30,16,21,0,0,pm_dtlb,Data TLB references and misses ##C20E4,800C4,0000F,0000F,00009,00005 00000000,008C0008,C8881E1E,00000000 
Data TLB references and misses #50,201,323,195,318,0,0,pm_L1_refmiss,L1 load references and misses and store references and misses ##C10A8,C10A8,C1088,C10C3,00009,00005 00000000,00300000,50501086,00000000 L1 load references and misses and store references and misses #51,21,23,51,112,0,0,pm_dsource1,L3 cache and memory data access ##C308E,C3087,110C7,00001,00009,00005 00000000,4003000C,1C0E8E02,00000001 L3 cache and memory data access #52,21,23,19,24,0,0,pm_dsource2,L3 cache and memory data access ##C308E,C3087,C309B,C3087,00009,00005 00000000,0003000F,1C0E360E,00000000 L3 cache and memory data access #53,19,21,18,22,0,0,pm_dsource_L2,L2 cache data access ##C3097,C3097,C3097,C3097,00009,00005 00000000,0003000F,2E2E2E2E,00000000 L2 cache data access #54,22,22,22,23,0,0,pm_dsource_L3,L3 cache data access ##C309E,C309E,C309E,C309E,00009,00005 00000000,0003000F,3C3C3C3C,00000000 L3 cache data access #55,121,116,118,117,0,0,pm_isource1,Instruction source information ##2208D,2208D,2208D,22086,00009,00005 00000000,8000000F,1A1A1A0C,00000000 Instruction source information #56,118,119,112,1,0,0,pm_isource2,Instruction source information ##22086,22086,00001,2208D,00009,00005 00000000,8000000D,0C0C021A,00000001 Instruction source information #57,119,117,115,115,0,0,pm_isource_L2,L2 instruction source information ##22096,22096,22096,22096,00009,00005 00000000,8000000F,2C2C2C2C,00000000 L2 instruction source information #58,122,118,117,116,0,0,pm_isource_L3,L3 instruction source information ##2209D,2209D,2209D,2209D,00009,00005 00000000,8000000F,3A3A3A3A,00000000 L3 instruction source information #59,305,303,299,300,0,0,pm_pteg_source1,PTEG source information ##83097,83097,83097,83097,00009,00005 00000000,0002000F,2E2E2E2E,00000000 PTEG source information #60,308,304,303,301,0,0,pm_pteg_source2,PTEG source information ##8309E,8309E,8309E,8309E,00009,00005 00000000,0002000F,3C3C3C3C,00000000 PTEG source information #61,304,305,300,302,0,0,pm_pteg_source3,PTEG source information 
##83087,83087,8309B,83087,00009,00005 00000000,0002000F,0E0E360E,00000000 PTEG source information #62,307,102,103,26,0,0,pm_pteg_source4,L3 PTEG and group dispatch events ##8308E,00002,00002,C10C7,00009,00005 00000000,00320008,1C04048E,00000000 L3 PTEG and group dispatch events #63,130,130,127,127,0,0,pm_L2SA_ld,L2 slice A load events ##701C0,721E0,711C0,731E0,00009,00005 00000000,30554005,80C080C0,00000000 L2 slice A load events #64,134,134,131,131,0,0,pm_L2SA_st,L2 slice A store events ##702C0,722E0,712C0,732E0,00009,00005 00000000,30558005,80C080C0,00000000 L2 slice A store events #65,138,140,135,137,0,0,pm_L2SA_st2,L2 slice A store events ##703C0,723E0,713C0,733E0,00009,00005 00000000,3055C005,80C080C0,00000000 L2 slice A store events #66,146,146,143,143,0,0,pm_L2SB_ld,L2 slice B load events ##701C1,721E1,711C1,731E1,00009,00005 00000000,30554005,82C282C2,00000000 L2 slice B load events #67,150,150,147,147,0,0,pm_L2SB_st,L2 slice B store events ##702C1,722E2,712C1,732E1,00009,00005 00000000,30558005,82C482C2,00000000 L2 slice B store events #68,154,156,151,153,0,0,pm_L2SB_st2,L2 slice B store events ##703C1,723E1,713C1,733E1,00009,00005 00000000,3055C005,82C282C2,00000000 L2 slice B store events #69,162,162,159,159,0,0,pm_L2SC_ld,L2 slice C load events ##701C2,721E2,711C2,731E2,00009,00005 00000000,30554005,84C484C4,00000000 L2 slice C load events #70,166,166,163,163,0,0,pm_L2SC_st,L2 slice C store events ##702C2,722E1,712C2,732E2,00009,00005 00000000,30558005,84C284C4,00000000 L2 slice C store events #71,170,172,167,169,0,0,pm_L2SC_st2,L2 slice C store events ##703C2,723E2,713C2,733E2,00009,00005 00000000,3055C005,84C484C4,00000000 L2 slice C store events #72,180,113,175,177,0,0,pm_L3SA_trans,L3 slice A state transitions ##720E3,00001,730E3,710C3,00009,00005 00000000,3015000A,C602C686,00000001 L3 slice A state transitions #73,115,184,182,184,0,0,pm_L3SB_trans,L3 slice B state transitions ##00001,720E4,730E4,710C4,00009,00005
00000000,30150006,02C8C888,00000001 L3 slice B state transitions #74,115,191,189,191,0,0,pm_L3SC_trans,L3 slice C state transitions ##00001,720E5,730E5,710C5,00009,00005 00000000,30150006,02CACA8A,00000001 L3 slice C state transitions #75,129,138,124,135,0,0,pm_L2SA_trans,L2 slice A state transitions ##720E0,700C0,730E0,710C0,00009,00005 00000000,3055000A,C080C080,00000000 L2 slice A state transitions #76,145,154,140,151,0,0,pm_L2SB_trans,L2 slice B state transitions ##720E1,700C1,730E1,710C1,00009,00005 00000000,3055000A,C282C282,00000000 L2 slice B state transitions #77,161,170,156,167,0,0,pm_L2SC_trans,L2 slice C state transitions ##720E2,700C2,730E2,710C2,00009,00005 00000000,3055000A,C484C484,00000000 L2 slice C state transitions #78,177,181,179,185,0,0,pm_L3SAB_retry,L3 slice A/B snoop retry and all CI/CO busy ##721E3,721E4,731E3,731E4,00009,00005 00000000,3005100F,C6C8C6C8,00000000 L3 slice A/B snoop retry and all CI/CO busy #79,181,185,174,180,0,0,pm_L3SAB_hit,L3 slice A/B hit and reference ##701C3,701C4,711C3,711C4,00009,00005 00000000,30501000,86888688,00000000 L3 slice A/B hit and reference #80,191,192,193,187,0,0,pm_L3SC_retry_hit,L3 slice C hit & snoop retry ##721E5,701C5,731E5,711C5,00009,00005 00000000,3055100A,CA8ACA8A,00000000 L3 slice C hit & snoop retry #81,87,84,84,87,0,0,pm_fpu1,Floating Point events ##00088,00088,01088,01090,00009,00005 00000000,00000000,10101020,00000000 Floating Point events #82,85,86,85,88,0,0,pm_fpu2,Floating Point events ##00090,00090,01090,01088,00009,00005 00000000,00000000,20202010,00000000 Floating Point events #83,86,87,61,77,0,0,pm_fpu3,Floating point events ##02088,02088,010C3,010C7,00009,00005 00000000,0000000C,1010868E,00000000 Floating point events #84,90,88,112,230,0,0,pm_fpu4,Floating point events ##02090,02090,00001,C5090,00009,00005 00000000,0430000C,20200220,00000001 Floating point events #85,67,79,60,76,0,0,pm_fpu5,Floating point events by unit ##000C2,000C6,010C2,010C6,00009,00005
00000000,00000000,848C848C,00000000 Floating point events by unit #86,59,72,63,79,0,0,pm_fpu6,Floating point events by unit ##020E0,020E4,010C0,010C4,00009,00005 00000000,0000000C,C0C88088,00000000 Floating point events by unit #87,60,73,65,80,0,0,pm_fpu7,Floating point events by unit ##000C0,000C4,010C1,010C5,00009,00005 00000000,00000000,8088828A,00000000 Floating point events by unit #88,70,82,112,66,0,0,pm_fpu8,Floating point events by unit ##020E1,020E5,00001,030E0,00009,00005 00000000,0000000D,C2CA02C0,00000001 Floating point events by unit #89,69,81,207,219,0,0,pm_fpu9,Floating point events by unit ##020E3,020E7,C50C0,C50C4,00009,00005 00000000,0430000C,C6CE8088,00000000 Floating point events by unit #90,63,76,112,80,0,0,pm_fpu10,Floating point events by unit ##000C1,000C5,00001,010C5,00009,00005 00000000,00000000,828A028A,00000001 Floating point events by unit #91,58,71,61,112,0,0,pm_fpu11,Floating point events by unit ##000C3,000C7,010C3,00001,00009,00005 00000000,00000000,868E8602,00000001 Floating point events by unit #92,71,83,207,112,0,0,pm_fpu12,Floating point events by unit ##020E2,020E6,C50C0,00001,00009,00005 00000000,0430000C,C4CC8002,00000001 Floating point events by unit #93,96,93,90,95,0,0,pm_fxu1,Fixed Point events ##00012,00012,00012,00012,00009,00005 00000000,00000000,24242424,00000000 Fixed Point events #94,281,283,93,93,0,0,pm_fxu2,Fixed Point events ##00002,12091,13088,11090,00009,00005 00000000,40000006,04221020,00000001 Fixed Point events #95,4,4,91,96,0,0,pm_fxu3,Fixed Point events ##400C3,400C4,130E2,130E6,00009,00005 00000000,40400003,8688C4CC,00000000 Fixed Point events #96,337,335,334,331,0,0,pm_smt_priorities1,Thread priority events ##420E3,420E6,430E3,430E4,00009,00005 00000000,0005000F,C6CCC6C8,00000000 Thread priority events #97,336,334,336,333,0,0,pm_smt_priorities2,Thread priority events ##420E2,420E5,430E5,430E6,00009,00005 00000000,0005000F,C4CACACC,00000000 Thread priority events 
#98,335,333,338,335,0,0,pm_smt_priorities3,Thread priority events ##420E1,420E4,430E2,430E1,00009,00005 00000000,0005000F,C2C8C4C2,00000000 Thread priority events #99,334,107,340,112,0,0,pm_smt_priorities4,Thread priority events ##420E0,0000B,430E0,00001,00009,00005 00000000,0005000A,C016C002,00000001 Thread priority events #100,333,327,112,322,0,0,pm_smt_both,Thread common events ##0000B,00013,00001,41084,00009,00005 00000000,00100000,16260208,00000001 Thread common events #101,321,113,345,342,0,0,pm_smt_selection,Thread selection ##800C3,00001,410C0,410C1,00009,00005 00000000,00900000,86028082,00000001 Thread selection #102,115,0,341,338,0,0,pm_smt_selectover1,Thread selection override ##00001,400C0,410C2,410C4,00009,00005 00000000,00500000,02808488,00000001 Thread selection override #103,115,20,343,340,0,0,pm_smt_selectover2,Thread selection override ##00001,0000F,410C5,410C3,00009,00005 00000000,00100000,021E8A86,00000001 Thread selection override #104,37,38,37,41,0,0,pm_fabric1,Fabric events ##700C7,720E7,710C7,730E7,00009,00005 00000000,30550005,8ECE8ECE,00000000 Fabric events #105,45,41,45,52,0,0,pm_fabric2,Fabric data movement ##701C7,721E7,711C7,731E7,00009,00005 00000000,30550085,8ECE8ECE,00000000 Fabric data movement #106,47,48,47,51,0,0,pm_fabric3,Fabric data movement ##703C7,723E7,713C7,733E7,00009,00005 00000000,30550185,8ECE8ECE,00000000 Fabric data movement #107,43,40,34,45,0,0,pm_fabric4,Fabric data movement ##702C7,722E7,130E3,712C7,00009,00005 00000000,70540106,8ECEC68E,00000000 Fabric data movement #108,317,308,315,305,0,0,pm_snoop1,Snoop retry ##700C6,720E6,710C6,730E6,00009,00005 00000000,30550005,8CCC8CCC,00000000 Snoop retry #109,318,315,312,112,0,0,pm_snoop2,Snoop read retry ##705C6,725E6,715C6,00001,00009,00005 00000000,30540A04,8CCC8C02,00000001 Snoop read retry #110,323,252,317,249,0,0,pm_snoop3,Snoop write retry ##706C6,726E6,716C6,736E6,00009,00005 00000000,30550C05,8CCC8CCC,00000000 Snoop write retry
#111,315,353,309,306,0,0,pm_snoop4,Snoop partial write retry ##707C6,727E6,717C6,707C6,00009,00005 00000000,30540E04,8CCC8CAC,00000000 Snoop partial write retry #112,261,263,249,36,0,0,pm_mem_rq,Memory read queue dispatch ##701C6,721E6,711C6,130E7,00009,00005 00000000,70540205,8CCC8CCE,00000000 Memory read queue dispatch #113,260,261,258,37,0,0,pm_mem_read,Memory read complete and cancel ##702C6,722E6,712C6,00003,00009,00005 00000000,30540404,8CCC8C06,00000000 Memory read complete and cancel #114,268,264,262,260,0,0,pm_mem_wq,Memory write queue dispatch ##703C6,723E6,713C6,733E6,00009,00005 00000000,30550605,8CCC8CCC,00000000 Memory write queue dispatch #115,256,257,254,251,0,0,pm_mem_pwq,Memory partial write queue ##704C6,724E6,714C6,734E6,00009,00005 00000000,30550805,8CCC8CCC,00000000 Memory partial write queue #116,281,284,348,293,0,0,pm_threshold,Thresholding ##00002,820E2,0000B,00014,00009,00005 00000000,00080004,04C41628,00000001 Thresholding #117,281,300,278,278,0,0,pm_mrk_grp1,Marked group events ##00002,820E3,00005,00013,00009,00005 00000000,00080004,04C60A26,00000001 Marked group events #118,282,268,279,279,0,0,pm_mrk_grp2,Marked group events ##00015,00005,C70E4,12091,00009,00005 00000000,41030003,2A0AC822,00000001 Marked group events #119,269,272,264,264,0,0,pm_mrk_dsource1,Marked data from ##C7087,C70A0,C70A2,C70A2,00009,00005 00000000,010B000F,0E404444,00000001 Marked data from #120,270,270,112,88,0,0,pm_mrk_dsource2,Marked data from ##C7097,C70A2,00001,01088,00009,00005 00000000,010B000C,2E440210,00000001 Marked data from #121,272,276,268,267,0,0,pm_mrk_dsource3,Marked data from ##C708E,C70A4,C70A6,C70A6,00009,00005 00000000,010B000F,1C484C4C,00000001 Marked data from #122,275,271,265,272,0,0,pm_mrk_dsource4,Marked data from ##C70A1,C70A3,C7097,C70A1,00009,00005 00000000,010B000F,42462E42,00000001 Marked data from #123,273,274,270,270,0,0,pm_mrk_dsource5,Marked data from ##C709E,C70A6,C70A0,C70A0,00009,00005 00000000,010B000F,3C4C4040,00000001 Marked 
data from #124,271,271,112,266,0,0,pm_mrk_dsource6,Marked data from ##C70A3,C70A3,00001,C70A3,00009,00005 00000000,010B000D,46460246,00000001 Marked data from #125,274,275,269,269,0,0,pm_mrk_dsource7,Marked data from ##C70A7,C70A7,C709E,C70A7,00009,00005 00000000,010B000F,4E4E3C4E,00000001 Marked data from #126,280,282,275,277,0,0,pm_mrk_dtlbref,Marked data TLB references ##C6086,C6086,C6086,C6086,00009,00005 00000000,020C000F,0C0C0C0C,00000001 Marked data TLB references #127,278,280,273,275,0,0,pm_mrk_dtlbmiss,Marked data TLB misses ##C608D,C608D,C608D,C608D,00009,00005 00000000,020C000F,1A1A1A1A,00000001 Marked data TLB misses #128,279,279,271,21,0,0,pm_mrk_dtlb_dslb,Marked data TLB references and misses and marked data SLB misses ##C60E4,C50C6,C50C7,0000F,00009,00005 00000000,063C0008,C8AC8E1E,00000001 Marked data TLB references and misses and marked data SLB misses #129,280,113,275,273,0,0,pm_mrk_lbref,Marked TLB and SLB references ##C6086,00001,C6086,C50C7,00009,00005 00000000,063C000A,0C020C8E,00000001 Marked TLB and SLB references #130,285,113,294,263,0,0,pm_mrk_lsmiss,Marked load and store miss ##82088,00001,00003,00005,00009,00005 00000000,00080008,1002060A,00000001 Marked load and store miss #131,299,300,291,295,0,0,pm_mrk_ulsflush,Mark unaligned load and store flushes ##00003,820E3,81090,81090,00009,00005 00000000,00280004,06C62020,00000001 Mark unaligned load and store flushes #132,298,299,276,280,0,0,pm_mrk_misc,Misc marked instructions ##820E6,00003,00014,0000B,00009,00005 00000000,00080008,CC062816,00000001 Misc marked instructions #133,18,116,322,196,0,0,pm_lsref_L1,Load/Store operations and L1 activity ##C3087,2208D,C1090,C1090,00009,00005 00000000,8033000C,0E1A2020,00000000 Load/Store operations and L1 activity #134,21,23,322,196,0,0,pm_lsref_L2L3,Load/Store operations and L2, L3 activity ##C308E,C3087,C1090,C1090,00009,00005 00000000,0033000C,1C0E2020,00000000 Load/Store operations and L2, L3 activity 
#135,124,30,322,196,0,0,pm_lsref_tlbmiss,Load/Store operations and TLB misses ##800C0,800C4,C1090,C1090,00009,00005 00000000,00B00000,80882020,00000000 Load/Store operations and TLB misses #136,21,23,195,318,0,0,pm_Dmiss,Data cache misses ##C308E,C3087,C1088,C10C3,00009,00005 00000000,0033000C,1C0E1086,00000000 Data cache misses #137,17,110,122,171,0,0,pm_prefetchX,Prefetch events ##0000F,220E6,C70E7,C50C3,00009,00005 00000000,85330006,1ECCCE86,00000000 Prefetch events #138,12,11,11,9,0,0,pm_branchX,Branch operations ##23087,23087,23087,230E4,00009,00005 00000000,8000000F,0E0E0EC8,00000000 Branch operations #139,70,82,61,66,0,0,pm_fpuX1,Floating point events by unit ##020E1,020E5,010C3,030E0,00009,00005 00000000,0000000D,C2CA86C0,00000000 Floating point events by unit #140,63,76,65,80,0,0,pm_fpuX2,Floating point events by unit ##000C1,000C5,010C1,010C5,00009,00005 00000000,00000000,828A828A,00000000 Floating point events by unit #141,58,71,61,77,0,0,pm_fpuX3,Floating point events by unit ##000C3,000C7,010C3,010C7,00009,00005 00000000,00000000,868E868E,00000000 Floating point events by unit #142,85,84,322,196,0,0,pm_fpuX4,Floating point and L1 events ##00090,00088,C1090,C1090,00009,00005 00000000,00300000,20102020,00000000 Floating point and L1 events #143,90,88,61,77,0,0,pm_fpuX5,Floating point events ##02090,02090,010C3,010C7,00009,00005 00000000,0000000C,2020868E,00000000 Floating point events #144,87,86,85,88,0,0,pm_fpuX6,Floating point events ##00088,00090,01090,01088,00009,00005 00000000,00000000,10202010,00000000 Floating point events #145,85,84,87,88,0,0,pm_fpuX7,Floating point events ##00090,00088,020A8,01088,00009,00005 00000000,00000002,20105010,00000000 Floating point events #146,17,94,16,88,0,0,pm_hpmcount8,HPM group for set 9 ##0000F,00014,0000F,01088,00009,00005 00000000,00000000,1E281E10,00000000 HPM group for set 9 #147,303,88,113,230,0,0,pm_hpmcount2,HPM group for set 2 ##00009,02090,00009,C5090,00009,00005 00000000,04300004,12201220,00000000 HPM 
group for set 2 #148,17,114,195,318,0,0,pm_hpmcount3,HPM group for set 3 ##0000F,120E1,C1088,C10C3,00009,00005 00000000,40300004,1EC21086,00000000 HPM group for set 3 #149,356,20,322,196,0,0,pm_hpmcount4,HPM group for set 7 ##80088,0000F,C1090,C1090,00009,00005 00000000,00B00000,101E2020,00000000 HPM group for set 7 #150,87,84,86,86,0,0,pm_flop,Floating point operations ##00088,00088,000A8,000A8,00009,00005 00000000,00000000,10105050,00000000 Floating point operations #151,303,20,195,26,0,0,pm_eprof1,Group for use with eprof ##00009,0000F,C1088,C10C7,00009,00005 00000000,00300000,121E108E,00000000 Group for use with eprof #152,303,323,113,196,0,0,pm_eprof2,Group for use with eprof ##00009,C10A8,00009,C1090,00009,00005 00000000,00300000,12501220,00000000 Group for use with eprof #153,17,84,87,88,0,0,pm_flip,Group for flips ##0000F,00088,020A8,01088,00009,00005 00000000,00000002,1E105010,00000000 Group for flips #154,17,30,195,196,0,0,pm_hpmcount5,HPM group for set 5 ##0000F,800C4,C1088,C1090,00009,00005 00000000,00B00000,1E881020,00000000 HPM group for set 5 #155,17,302,322,318,0,0,pm_hpmcount6,HPM group for set 6 ##0000F,00009,C1090,C10C3,00009,00005 00000000,00300000,1E122086,00000000 HPM group for set 6 #156,303,23,16,24,0,0,pm_hpmcount7,HPM group for set 8 ##00009,C3087,0000F,C3087,00009,00005 00000000,00030005,120E1E0E,00000000 HPM group for set 8 #157,281,302,348,293,0,0,pm_ep_threshold,Thresholding ##00002,00009,0000B,00014,00009,00005 00000000,00000000,04121628,00000001 Thresholding #158,281,302,278,278,0,0,pm_ep_mrk_grp1,Marked group events ##00002,00009,00005,00013,00009,00005 00000000,00000000,04120A26,00000001 Marked group events #159,282,302,279,279,0,0,pm_ep_mrk_grp2,Marked group events ##00015,00009,C70E4,12091,00009,00005 00000000,41030003,2A12C822,00000001 Marked group events #160,269,302,264,264,0,0,pm_ep_mrk_dsource1,Marked data from ##C7087,00009,C70A2,C70A2,00009,00005 00000000,010B000B,0E124444,00000001 Marked data from 
#161,270,302,277,88,0,0,pm_ep_mrk_dsource2,Marked data from ##C7097,00009,820E2,01088,00009,00005 00000000,010B0008,2E12E410,00000001 Marked data from #162,303,276,268,267,0,0,pm_ep_mrk_dsource3,Marked data from ##00009,C70A4,C70A6,C70A6,00009,00005 00000000,010B0007,12484C4C,00000001 Marked data from #163,303,271,265,272,0,0,pm_ep_mrk_dsource4,Marked data from ##00009,C70A3,C7097,C70A1,00009,00005 00000000,010B0007,12462E42,00000001 Marked data from #164,273,302,270,270,0,0,pm_ep_mrk_dsource5,Marked data from ##C709E,00009,C70A0,C70A0,00009,00005 00000000,010B000B,3C124040,00000001 Marked data from #165,303,271,112,266,0,0,pm_ep_mrk_dsource6,Marked data from ##00009,C70A3,00001,C70A3,00009,00005 00000000,010B0005,12460246,00000001 Marked data from #166,303,275,269,269,0,0,pm_ep_mrk_dsource7,Marked data from ##00009,C70A7,C709E,C70A7,00009,00005 00000000,010B0007,124E3C4E,00000001 Marked data from #167,303,280,273,275,0,0,pm_ep_mrk_lbmiss,Marked TLB and SLB misses ##00009,C608D,C608D,C608D,00009,00005 00000000,020C0007,121A1A1A,00000001 Marked TLB and SLB misses #168,303,282,275,277,0,0,pm_ep_mrk_dtlbref,Marked data TLB references ##00009,C6086,C6086,C6086,00009,00005 00000000,020C0007,120C0C0C,00000001 Marked data TLB references #169,303,280,273,275,0,0,pm_ep_mrk_dtlbmiss,Marked data TLB misses ##00009,C608D,C608D,C608D,00009,00005 00000000,020C0007,121A1A1A,00000001 Marked data TLB misses #170,280,302,275,273,0,0,pm_ep_mrk_lbref,Marked TLB and SLB references ##C6086,00009,C6086,C50C7,00009,00005 00000000,063C000A,0C120C8E,00000001 Marked TLB and SLB references #171,285,302,294,263,0,0,pm_ep_mrk_lsmiss,Marked load and store miss ##82088,00009,00003,00005,00009,00005 00000000,00080008,1012060A,00000001 Marked load and store miss #172,299,302,291,295,0,0,pm_ep_mrk_ulsflush,Mark unaligned load and store flushes ##00003,00009,81090,81090,00009,00005 00000000,00200000,06122020,00000001 Mark unaligned load and store flushes #173,303,299,276,280,0,0,pm_ep_mrk_misc1,Misc 
marked instructions ##00009,00003,00014,0000B,00009,00005 00000000,00000000,12062816,00000001 Misc marked instructions #174,303,270,267,281,0,0,pm_ep_mrk_misc2,Misc marked instructions ##00009,C70A2,C70AF,820E2,00009,00005 00000000,010B0006,12445EE4,00000001 Misc marked instructions #175,303,274,272,271,0,0,pm_ep_mrk_misc3,Misc marked instructions ##00009,C70A6,C50C6,C7087,00009,00005 00000000,053B0005,124C8C0E,00000001 Misc marked instructions #176,278,302,274,265,0,0,pm_ep_mrk_misc4,Misc marked instructions ##C608D,00009,C60E4,C7097,00009,00005 00000000,030F0009,1A12E82E,00000001 Misc marked instructions #177,280,302,112,286,0,0,pm_ep_mrk_misc5,Misc marked instructions ##C6086,00009,00001,810C3,00009,00005 00000000,022C0008,0C120286,00000001 Misc marked instructions #178,278,302,288,292,0,0,pm_ep_mrk_misc6,Misc marked instructions ##C608D,00009,810C4,810C5,00009,00005 00000000,022C0008,1A12888A,00000001 Misc marked instructions #179,303,272,284,288,0,0,pm_ep_mrk_misc7,Misc marked instructions ##00009,C70A0,810C1,810C0,00009,00005 00000000,012B0004,12408280,00000001 Misc marked instructions #180,303,268,282,286,0,0,pm_ep_mrk_misc8,Misc marked instructions ##00009,00005,810C2,810C3,00009,00005 00000000,00200000,120A8486,00000001 Misc marked instructions #181,303,292,287,297,0,0,pm_ep_mrk_misc9,Misc marked instructions ##00009,810C6,810C7,820E6,00009,00005 00000000,00280000,12AC8EEC,00000001 Misc marked instructions #182,303,286,281,298,0,0,pm_ep_mrk_misc10,Misc marked instructions ##00009,820E0,820E4,820E3,00009,00005 00000000,00080004,12C0E8E6,00000001 Misc marked instructions #183,303,268,264,268,0,0,pm_ep_mrk_misc11,Misc marked instructions ##00009,00005,C70A2,C709E,00009,00005 00000000,01030003,120A443C,00000001 Misc marked instructions #184,303,296,290,294,0,0,pm_ep_mrk_misc12,Misc marked instructions ##00009,810A8,81088,81088,00009,00005 00000000,00200000,12501010,00000001 Misc marked instructions #185,269,302,266,296,0,0,pm_ep_mrk_misc13,Misc marked 
instructions ##C7087,00009,C709B,C70E6,00009,00005 00000000,0103000B,0E1236CC,00000001 Misc marked instructions #186,303,94,276,293,0,0,pm_ep_mrk_misc14,Misc marked instructions ##00009,00014,00014,00014,00009,00005 00000000,00000000,12282828,00000001 Misc marked instructions #187,303,283,278,278,0,0,pm_ep_mrk_misc15,Misc marked instructions ##00009,12091,00005,00013,00009,00005 00000000,40000004,12220A26,00000001 Misc marked instructions
papi-papi-7-2-0-t/src/event_data/power5/events
{ ****************************
{ THIS IS OPEN SOURCE CODE
{ ****************************
{ (C) COPYRIGHT International Business Machines Corp. 2005
{ This file is licensed under the University of Tennessee license.
{ See LICENSE.txt.
{
{ File: events/power5/events
{ Author: Maynard Johnson
{         maynardj@us.ibm.com
{ Mods:
{

{ counter 1 }
#0,v,g,n,n,PM_0INST_CLB_CYC,Cycles no instructions in CLB ##400C0 Cycles no instructions in CLB #1,v,g,n,n,PM_1INST_CLB_CYC,Cycles 1 instruction in CLB ##400C1 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #2,v,g,n,n,PM_1PLUS_PPC_CMPL,One or more PPC instruction completed ##00013 A group containing at least one PPC instruction completed. For microcoded instructions that span multiple groups, this will only occur once. #3,v,g,n,n,PM_2INST_CLB_CYC,Cycles 2 instructions in CLB ##400C2 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer.
Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #4,v,g,n,n,PM_3INST_CLB_CYC,Cycles 3 instructions in CLB ##400C3 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #5,v,g,n,n,PM_4INST_CLB_CYC,Cycles 4 instructions in CLB ##400C4 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #6,v,g,n,n,PM_5INST_CLB_CYC,Cycles 5 instructions in CLB ##400C5 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #7,v,g,n,n,PM_6INST_CLB_CYC,Cycles 6 instructions in CLB ##400C6 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer.
Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #8,u,g,n,s,PM_BRQ_FULL_CYC,Cycles branch queue full ##100C5 The ISU sends a signal indicating that the issue queue that feeds the ifu br unit cannot accept any more groups (queue is full of groups). #9,v,g,n,n,PM_BR_UNCOND,Unconditional branch ##23087 Unconditional branch #10,v,g,n,n,PM_CLB_FULL_CYC,Cycles CLB full ##220E5 Cycles CLB full #11,v,g,n,s,PM_CR_MAP_FULL_CYC,Cycles CR logical operation mapper full ##100C4 The ISU sends a signal indicating that the cr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #12,v,g,n,s,PM_CYC,Processor cycles ##0000F Processor cycles #13,v,g,n,n,PM_DATA_FROM_L2,Data loaded from L2 ##C3087 DL1 was reloaded from the local L2 due to a demand load #14,v,g,n,n,PM_DATA_FROM_L25_SHR,Data loaded from L2.5 shared ##C3097 DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a demand load #15,v,g,n,n,PM_DATA_FROM_L275_MOD,Data loaded from L2.75 modified ##C30A3 DL1 was reloaded with modified (M) data from the L2 of another MCM due to a demand load.
#16,v,g,n,n,PM_DATA_FROM_L3,Data loaded from L3 ##C308E DL1 was reloaded from the local L3 due to a demand load #17,v,g,n,n,PM_DATA_FROM_L35_SHR,Data loaded from L3.5 shared ##C309E Data loaded from L3.5 shared #18,v,g,n,n,PM_DATA_FROM_L375_MOD,Data loaded from L3.75 modified ##C30A7 Data loaded from L3.75 modified #19,v,g,n,n,PM_DATA_FROM_RMEM,Data loaded from remote memory ##C30A1 Data loaded from remote memory #20,v,g,n,n,PM_DATA_TABLEWALK_CYC,Cycles doing data tablewalks ##800C7 This signal is asserted every cycle when a tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried. #21,v,g,n,n,PM_DSLB_MISS,Data SLB misses ##800C5 A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve #22,v,g,n,n,PM_DTLB_MISS,Data TLB misses ##800C4 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. 
#23,v,g,n,n,PM_DTLB_MISS_16M,Data TLB miss for 16M page ##C40C4 Data TLB miss for 16M page #24,v,g,n,n,PM_DTLB_MISS_4K,Data TLB miss for 4K page ##C40C0 Data TLB miss for 4K page #25,v,g,n,n,PM_DTLB_REF_16M,Data TLB reference for 16M page ##C40C6 Data TLB reference for 16M page #26,v,g,n,n,PM_DTLB_REF_4K,Data TLB reference for 4K page ##C40C2 Data TLB reference for 4K page #27,v,g,n,s,PM_FAB_CMD_ISSUED,Fabric command issued ##700C7 Fabric command issued #28,v,g,n,s,PM_FAB_DCLAIM_ISSUED,dclaim issued ##720E7 dclaim issued #29,v,g,n,s,PM_FAB_HOLDtoNN_EMPTY,Hold buffer to NN empty ##722E7 Hold buffer to NN empty #30,v,g,n,s,PM_FAB_HOLDtoVN_EMPTY,Hold buffer to VN empty ##721E7 Hold buffer to VN empty #31,v,g,n,s,PM_FAB_M1toP1_SIDECAR_EMPTY,M1 to P1 sidecar empty ##702C7 M1 to P1 sidecar empty #32,v,g,n,s,PM_FAB_P1toM1_SIDECAR_EMPTY,P1 to M1 sidecar empty ##701C7 P1 to M1 sidecar empty #33,v,g,n,s,PM_FAB_PNtoNN_DIRECT,PN to NN beat went straight to its destination ##703C7 PN to NN beat went straight to its destination #34,v,g,n,s,PM_FAB_PNtoVN_DIRECT,PN to VN beat went straight to its destination ##723E7 PN to VN beat went straight to its destination #35,v,g,n,s,PM_FPR_MAP_FULL_CYC,Cycles FPR mapper full ##100C1 The ISU sends a signal indicating that the FPR mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #36,v,g,n,n,PM_FPU0_1FLOP,FPU0 executed add, mult, sub, cmp or sel instruction ##000C3 This signal is active for one cycle when fp0 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #37,v,g,n,n,PM_FPU0_DENORM,FPU0 received denormalized data ##020E0 This signal is active for one cycle when one of the operands is denormalized. 
#38,v,g,n,n,PM_FPU0_FDIV,FPU0 executed FDIV instruction ##000C0 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #39,v,g,n,n,PM_FPU0_FMA,FPU0 executed multiply-add instruction ##000C1 This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #40,v,g,n,n,PM_FPU0_FSQRT,FPU0 executed FSQRT instruction ##000C2 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #41,v,g,n,s,PM_FPU0_FULL_CYC,Cycles FPU0 issue queue full ##100C3 The issue queue for FPU unit 0 cannot accept any more instructions. Issue is stopped #42,v,g,n,n,PM_FPU0_SINGLE,FPU0 executed single precision instruction ##020E3 This signal is active for one cycle when fp0 is executing single precision instruction. #43,v,g,n,n,PM_FPU0_STALL3,FPU0 stalled in pipe3 ##020E1 This signal indicates that fp0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. #44,v,g,n,n,PM_FPU0_STF,FPU0 executed store instruction ##020E2 This signal is active for one cycle when fp0 is executing a store instruction. #45,v,g,n,n,PM_FPU1_1FLOP,FPU1 executed add, mult, sub, cmp or sel instruction ##000C7 This signal is active for one cycle when fp1 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #46,v,g,n,n,PM_FPU1_DENORM,FPU1 received denormalized data ##020E4 This signal is active for one cycle when one of the operands is denormalized. 
#47,v,g,n,n,PM_FPU1_FDIV,FPU1 executed FDIV instruction ##000C4 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #48,v,g,n,n,PM_FPU1_FMA,FPU1 executed multiply-add instruction ##000C5 This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #49,v,g,n,n,PM_FPU1_FSQRT,FPU1 executed FSQRT instruction ##000C6 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #50,v,g,n,s,PM_FPU1_FULL_CYC,Cycles FPU1 issue queue full ##100C7 The issue queue for FPU unit 1 cannot accept any more instructions. Issue is stopped #51,v,g,n,n,PM_FPU1_SINGLE,FPU1 executed single precision instruction ##020E7 This signal is active for one cycle when fp1 is executing single precision instruction. #52,v,g,n,n,PM_FPU1_STALL3,FPU1 stalled in pipe3 ##020E5 This signal indicates that fp1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. #53,v,g,n,n,PM_FPU1_STF,FPU1 executed store instruction ##020E6 This signal is active for one cycle when fp1 is executing a store instruction. #54,v,g,n,n,PM_FPU_DENORM,FPU received denormalized data ##02088 This signal is active for one cycle when one of the operands is denormalized. Combined Unit 0 + Unit 1 #55,v,g,n,n,PM_FPU_FDIV,FPU executed FDIV instruction ##00088 This signal is active for one cycle at the end of the microcode executed when FPU is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. 
Combined Unit 0 + Unit 1 #56,v,g,n,n,PM_FPU_1FLOP,FPU executed one flop instruction ##00090 This event counts the number of one flop instructions. These could be fadd*, fmul*, fsub*, fneg+, fabs+, fnabs+, fres+, frsqrte+, fcmp**, or fsel where XYZ* means XYZ, XYZs, XYZ., XYZs., XYZ+ means XYZ, XYZ., and XYZ** means XYZu, XYZo. #57,c,g,n,n,PM_FPU_FULL_CYC,Cycles FPU issue queue full ##10090 Cycles when one or both FPU issue queues are full #58,v,g,n,n,PM_FPU_SINGLE,FPU executed single precision instruction ##02090 FPU is executing single precision instruction. Combined Unit 0 + Unit 1 #59,u,g,n,n,PM_FXU_IDLE,FXU idle ##00012 FXU0 and FXU1 are both idle #60,v,g,n,n,PM_GCT_NOSLOT_CYC,Cycles no GCT slot allocated ##00004 Cycles this thread does not have any slots allocated in the GCT. #61,v,g,n,n,PM_GCT_FULL_CYC,Cycles GCT full ##100C0 The ISU sends a signal indicating the gct is full. #62,v,g,n,s,PM_GCT_USAGE_00to59_CYC,Cycles GCT less than 60% full ##0001F Cycles GCT less than 60% full #63,v,g,n,n,PM_GRP_BR_REDIR,Group experienced branch redirect ##120E6 Group experienced branch redirect #64,v,g,n,n,PM_GRP_BR_REDIR_NONSPEC,Group experienced non-speculative branch redirect ##120E5 Group experienced non-speculative branch redirect #65,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected ##120E4 A group that previously attempted dispatch was rejected. #66,v,g,n,n,PM_GRP_DISP_VALID,Group dispatch valid ##120E3 Dispatch has been attempted for a valid group. Some groups may be rejected. The total number of successful dispatches is the number of dispatch valid minus dispatch reject. 
#67,v,g,n,n,PM_GRP_IC_MISS,Group experienced I cache miss ##120E7 Group experienced I cache miss #68,v,g,n,n,PM_GRP_IC_MISS_BR_REDIR_NONSPEC,Group experienced non-speculative I cache miss or branch redirect ##12091 Group experienced non-speculative I cache miss or branch redirect #69,v,g,n,n,PM_GRP_IC_MISS_NONSPEC,Group experienced non-speculative I cache miss ##12099 Group experienced non-speculative I cache miss #70,v,g,n,n,PM_GRP_MRK,Group marked in IDU ##00014 A group was sampled (marked) #71,v,g,n,n,PM_IC_PREF_REQ,Instruction prefetch requests ##220E6 Asserted when a non-canceled prefetch is made to the cache interface unit (CIU). #72,v,g,n,n,PM_IERAT_XLATE_WR,Translation written to ierat ##220E7 This signal will be asserted each time the I-ERAT is written. This indicates that an ERAT miss has been serviced. ERAT misses will initiate a sequence resulting in the ERAT being written. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed. This should be a fairly accurate count of ERAT misses (best available). #73,v,g,n,n,PM_INST_CMPL,Instructions completed ##00001 Number of Eligible Instructions that completed. #74,v,g,n,n,PM_INST_DISP,Instructions dispatched ##120E1 The ISU sends the number of instructions dispatched. #75,v,g,n,n,PM_INST_FETCH_CYC,Cycles at least 1 instruction fetched ##220E4 Asserted each cycle when the IFU sends at least one instruction to the IDU. #76,v,g,n,n,PM_INST_FROM_L2,Instructions fetched from L2 ##22086 An instruction fetch group was fetched from L2. Fetch Groups can contain up to 8 instructions #77,v,g,n,n,PM_INST_FROM_L25_SHR,Instruction fetched from L2.5 shared ##22096 Instruction fetched from L2.5 shared #78,v,g,n,n,PM_INST_FROM_L3,Instruction fetched from L3 ##2208D An instruction fetch group was fetched from L3.
Fetch Groups can contain up to 8 instructions #79,v,g,n,n,PM_INST_FROM_L35_SHR,Instruction fetched from L3.5 shared ##2209D Instruction fetched from L3.5 shared #80,u,g,n,n,PM_ISLB_MISS,Instruction SLB misses ##800C1 A SLB miss for an instruction fetch has occurred #81,v,g,n,n,PM_ITLB_MISS,Instruction TLB misses ##800C0 A TLB miss for an Instruction Fetch has occurred #82,v,g,n,s,PM_L2SA_MOD_TAG,L2 slice A transition from modified to tagged ##720E0 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #83,v,g,n,s,PM_L2SA_RCLD_DISP,L2 Slice A RC load dispatch attempt ##701C0 L2 Slice A RC load dispatch attempt #84,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_RC_FULL,L2 Slice A RC load dispatch attempt failed due to all RC full ##721E0 L2 Slice A RC load dispatch attempt failed due to all RC full #85,v,g,n,s,PM_L2SA_RCST_DISP,L2 Slice A RC store dispatch attempt ##702C0 L2 Slice A RC store dispatch attempt #86,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_RC_FULL,L2 Slice A RC store dispatch attempt failed due to all RC full ##722E0 L2 Slice A RC store dispatch attempt failed due to all RC full #87,v,g,n,s,PM_L2SA_RC_DISP_FAIL_CO_BUSY,L2 Slice A RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy ##703C0 L2 Slice A RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy #88,v,g,n,s,PM_L2SA_SHR_MOD,L2 slice A transition from shared to modified ##700C0 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C.
#89,v,g,n,n,PM_L2SA_ST_REQ,L2 slice A store requests ##723E0 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C. #90,v,g,n,s,PM_L2SB_MOD_TAG,L2 slice B transition from modified to tagged ##720E1 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #91,v,g,n,s,PM_L2SB_RCLD_DISP,L2 Slice B RC load dispatch attempt ##701C1 L2 Slice B RC load dispatch attempt #92,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_RC_FULL,L2 Slice B RC load dispatch attempt failed due to all RC full ##721E1 L2 Slice B RC load dispatch attempt failed due to all RC full #93,v,g,n,s,PM_L2SB_RCST_DISP,L2 Slice B RC store dispatch attempt ##702C1 L2 Slice B RC store dispatch attempt #94,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_RC_FULL,L2 Slice B RC store dispatch attempt failed due to all RC full ##722E1 L2 Slice B RC store dispatch attempt failed due to all RC full #95,v,g,n,s,PM_L2SB_RC_DISP_FAIL_CO_BUSY,L2 Slice B RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy ##703C1 L2 Slice B RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy #96,v,g,n,s,PM_L2SB_SHR_MOD,L2 slice B transition from shared to modified ##700C1 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. #97,v,g,n,n,PM_L2SB_ST_REQ,L2 slice B store requests ##723E1 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. 
The event is provided on each of the three slices A,B,and C. #98,v,g,n,s,PM_L2SC_MOD_TAG,L2 slice C transition from modified to tagged ##720E2 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #99,v,g,n,s,PM_L2SC_RCLD_DISP,L2 Slice C RC load dispatch attempt ##701C2 L2 Slice C RC load dispatch attempt #100,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_RC_FULL,L2 Slice C RC load dispatch attempt failed due to all RC full ##721E2 L2 Slice C RC load dispatch attempt failed due to all RC full #101,v,g,n,s,PM_L2SC_RCST_DISP,L2 Slice C RC store dispatch attempt ##702C2 L2 Slice C RC store dispatch attempt #102,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_RC_FULL,L2 Slice C RC store dispatch attempt failed due to all RC full ##722E2 L2 Slice C RC store dispatch attempt failed due to all RC full #103,v,g,n,s,PM_L2SC_RC_DISP_FAIL_CO_BUSY,L2 Slice C RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy ##703C2 L2 Slice C RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy #104,v,g,n,s,PM_L2SC_SHR_MOD,L2 slice C transition from shared to modified ##700C2 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. #105,v,g,n,n,PM_L2SC_ST_REQ,L2 slice C store requests ##723E2 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C. 
#106,v,g,n,s,PM_L3SA_ALL_BUSY,L3 slice A active for every cycle all CI/CO machines busy ##721E3 L3 slice A active for every cycle all CI/CO machines busy #107,v,g,n,s,PM_L3SA_MOD_TAG,L3 slice A transition from modified to TAG ##720E3 L3 slice A transition from modified to TAG #108,v,g,n,s,PM_L3SA_REF,L3 slice A references ##701C3 L3 slice A references #109,v,g,n,s,PM_L3SB_ALL_BUSY,L3 slice B active for every cycle all CI/CO machines busy ##721E4 L3 slice B active for every cycle all CI/CO machines busy #110,v,g,n,s,PM_L3SB_MOD_TAG,L3 slice B transition from modified to TAG ##720E4 L3 slice B transition from modified to TAG #111,v,g,n,s,PM_L3SB_REF,L3 slice B references ##701C4 L3 slice B references #112,v,g,n,s,PM_L3SC_ALL_BUSY,L3 slice C active for every cycle all CI/CO machines busy ##721E5 L3 slice C active for every cycle all CI/CO machines busy #113,v,g,n,s,PM_L3SC_MOD_TAG,L3 slice C transition from modified to TAG ##720E5 L3 slice C transition from modified to TAG #114,v,g,n,s,PM_L3SC_REF,L3 slice C references ##701C5 L3 slice C references #115,v,g,n,n,PM_LARX_LSU0,Larx executed on LSU0 ##820E7 A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0) #116,u,g,n,s,PM_LR_CTR_MAP_FULL_CYC,Cycles LR/CTR mapper full ##100C6 The ISU sends a signal indicating that the lr/ctr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #117,v,g,n,n,PM_LSU0_BUSY_REJECT,LSU0 busy due to reject ##C20E3 LSU unit 0 busy due to reject #118,v,g,n,n,PM_LSU0_DERAT_MISS,LSU0 DERAT misses ##800C2 A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.
#119,v,g,n,n,PM_LSU0_FLUSH_LRQ,LSU0 LRQ flushes ##C00C2 A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #120,u,g,n,n,PM_LSU0_FLUSH_SRQ,LSU0 SRQ flushes ##C00C3 A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. #121,v,g,n,n,PM_LSU0_FLUSH_ULD,LSU0 unaligned load flushes ##C00C0 A load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #122,v,g,n,n,PM_LSU0_FLUSH_UST,LSU0 unaligned store flushes ##C00C1 A store was flushed from unit 0 because it was unaligned (crossed a 4k boundary) #123,v,g,n,n,PM_LSU0_REJECT_ERAT_MISS,LSU0 reject due to ERAT miss ##C60E3 LSU0 reject due to ERAT miss #124,v,g,n,n,PM_LSU0_REJECT_LMQ_FULL,LSU0 reject due to LMQ full or missed data coming ##C60E1 LSU0 reject due to LMQ full or missed data coming #125,v,g,n,n,PM_LSU0_REJECT_RELOAD_CDF,LSU0 reject due to reload CDF or tag update collision ##C60E2 LSU0 reject due to reload CDF or tag update collision #126,v,g,n,n,PM_LSU0_REJECT_SRQ_LHS,LSU0 SRQ rejects ##C60E0 LSU0 reject due to load hit store #127,u,g,n,n,PM_LSU0_SRQ_STFWD,LSU0 SRQ store forwarded ##C20E0 Data from a store instruction was forwarded to a load on unit 0 #128,v,g,n,n,PM_LSU1_BUSY_REJECT,LSU1 busy due to reject ##C20E7 LSU unit 1 is busy due to reject #129,v,g,n,n,PM_LSU1_DERAT_MISS,LSU1 DERAT misses ##800C6 A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.
#130,v,g,n,n,PM_LSU1_FLUSH_LRQ,LSU1 LRQ flushes ##C00C6 A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #131,u,g,n,n,PM_LSU1_FLUSH_SRQ,LSU1 SRQ flushes ##C00C7 A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. #132,v,g,n,n,PM_LSU1_FLUSH_ULD,LSU1 unaligned load flushes ##C00C4 A load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #133,u,g,n,n,PM_LSU1_FLUSH_UST,LSU1 unaligned store flushes ##C00C5 A store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #134,v,g,n,n,PM_LSU1_REJECT_ERAT_MISS,LSU1 reject due to ERAT miss ##C60E7 LSU1 reject due to ERAT miss #135,v,g,n,n,PM_LSU1_REJECT_LMQ_FULL,LSU1 reject due to LMQ full or missed data coming ##C60E5 LSU1 reject due to LMQ full or missed data coming #136,v,g,n,n,PM_LSU1_REJECT_RELOAD_CDF,LSU1 reject due to reload CDF or tag update collision ##C60E6 LSU1 reject due to reload CDF or tag update collision #137,v,g,n,n,PM_LSU1_REJECT_SRQ_LHS,LSU1 SRQ rejects ##C60E4 LSU1 reject due to load hit store #138,u,g,n,n,PM_LSU1_SRQ_STFWD,LSU1 SRQ store forwarded ##C20E4 Data from a store instruction was forwarded to a load on unit 1 #139,v,g,n,n,PM_LSU_BUSY_REJECT,LSU busy due to reject ##C2090 LSU (unit 0 + unit 1) is busy due to reject #140,v,g,n,s,PM_LSU_FLUSH_LRQ_FULL,Flush caused by LRQ full ##320E7 Flush caused by LRQ full #141,u,g,n,n,PM_LSU_FLUSH_SRQ,SRQ flushes ##C0090 A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group.
#142,v,g,n,n,PM_LSU_FLUSH_ULD,LRQ unaligned load flushes ##C0088 A load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #143,v,g,n,n,PM_LSU_LRQ_S0_ALLOC,LRQ slot 0 allocated ##C20E6 LRQ slot zero was allocated #144,v,g,n,n,PM_LSU_LRQ_S0_VALID,LRQ slot 0 valid ##C20E2 This signal is asserted every cycle that the Load Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. #145,v,g,n,n,PM_LSU_REJECT_ERAT_MISS,LSU reject due to ERAT miss ##C6090 LSU reject due to ERAT miss #146,v,g,n,n,PM_LSU_REJECT_SRQ_LHS,LSU SRQ rejects ##C6088 LSU reject due to load hit store #147,v,g,n,n,PM_LSU_SRQ_S0_ALLOC,SRQ slot 0 allocated ##C20E5 SRQ Slot zero was allocated #148,v,g,n,n,PM_LSU_SRQ_S0_VALID,SRQ slot 0 valid ##C20E1 This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. #149,c,g,n,n,PM_LSU_SRQ_STFWD,SRQ store forwarded ##C2088 Data from a store instruction was forwarded to a load #150,v,g,n,s,PM_MEM_FAST_PATH_RD_CMPL,Fast path memory read completed ##722E6 Fast path memory read completed #151,v,g,n,s,PM_MEM_HI_PRIO_PW_CMPL,High priority partial-write completed ##727E6 High priority partial-write completed #152,v,g,n,s,PM_MEM_HI_PRIO_WR_CMPL,High priority write completed ##726E6 High priority write completed #153,v,g,n,s,PM_MEM_PWQ_DISP,Memory partial-write queue dispatched ##704C6 Memory partial-write queue dispatched #154,v,g,n,s,PM_MEM_PWQ_DISP_BUSY2or3,Memory partial-write queue dispatched with 2-3 queues busy ##724E6 Memory partial-write queue dispatched with 2-3 queues busy #155,v,g,n,s,PM_MEM_READ_CMPL,Memory read completed or canceled ##702C6 Memory read completed or canceled #156,v,g,n,s,PM_MEM_RQ_DISP,Memory read queue dispatched ##701C6 Memory read queue dispatched #157,v,g,n,s,PM_MEM_RQ_DISP_BUSY8to15,Memory read queue dispatched with 8-15 queues busy ##721E6 Memory read queue dispatched with 8-15 
queues busy #158,v,g,n,s,PM_MEM_WQ_DISP_BUSY1to7,Memory write queue dispatched with 1-7 queues busy ##723E6 Memory write queue dispatched with 1-7 queues busy #159,v,g,n,s,PM_MEM_WQ_DISP_WRITE,Memory write queue dispatched due to write ##703C6 Memory write queue dispatched due to write #160,v,g,n,n,PM_MRK_DATA_FROM_L2,Marked data loaded from L2 ##C7087 DL1 was reloaded from the local L2 due to a marked demand load #161,v,g,n,n,PM_MRK_DATA_FROM_L25_SHR,Marked data loaded from L2.5 shared ##C7097 DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a marked demand load #162,v,g,n,n,PM_MRK_DATA_FROM_L275_MOD,Marked data loaded from L2.75 modified ##C70A3 DL1 was reloaded with modified (M) data from the L2 of another MCM due to a marked demand load. #163,v,g,n,n,PM_MRK_DATA_FROM_L3,Marked data loaded from L3 ##C708E DL1 was reloaded from the local L3 due to a marked demand load #164,v,g,n,n,PM_MRK_DATA_FROM_L35_SHR,Marked data loaded from L3.5 shared ##C709E Marked data loaded from L3.5 shared #165,v,g,n,n,PM_MRK_DATA_FROM_L375_MOD,Marked data loaded from L3.75 modified ##C70A7 Marked data loaded from L3.75 modified #166,v,g,n,n,PM_MRK_DATA_FROM_RMEM,Marked data loaded from remote memory ##C70A1 Marked data loaded from remote memory #167,v,g,n,n,PM_MRK_DTLB_MISS_16M,Marked Data TLB misses for 16M page ##C40C5 Marked Data TLB misses for 16M page #168,v,g,n,n,PM_MRK_DTLB_MISS_4K,Marked Data TLB misses for 4K page ##C40C1 Marked Data TLB misses for 4K page #169,v,g,n,n,PM_MRK_DTLB_REF_16M,Marked Data TLB reference for 16M page ##C40C7 Marked Data TLB reference for 16M page #170,v,g,n,n,PM_MRK_DTLB_REF_4K,Marked Data TLB reference for 4K page ##C40C3 Marked Data TLB reference for 4K page #171,v,g,n,n,PM_MRK_GRP_DISP,Marked group dispatched ##00002 A group containing a sampled instruction was dispatched #172,v,g,n,n,PM_MRK_GRP_ISSUED,Marked group issued ##00015 A sampled instruction was issued #173,v,g,n,n,PM_MRK_IMR_RELOAD,Marked IMR 
reloaded ##820E2 A DL1 reload occurred due to a marked load #174,v,g,n,n,PM_INST_CMPL,Instructions completed ##00009 Number of PPC instructions completed. #175,v,g,n,n,PM_MRK_LD_MISS_L1,Marked L1 D cache load misses ##82088 Marked L1 D cache load misses #176,v,g,n,n,PM_MRK_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##820E0 A marked load, executing on unit 0, missed the dcache #177,v,g,n,n,PM_MRK_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##820E4 A marked load, executing on unit 1, missed the dcache #178,v,g,n,n,PM_MRK_STCX_FAIL,Marked STCX failed ##820E6 A marked stcx (stwcx or stdcx) failed #179,v,g,n,n,PM_MRK_ST_CMPL,Marked store instruction completed ##00003 A sampled store has completed (data home) #180,v,g,n,n,PM_MRK_ST_MISS_L1,Marked L1 D cache store misses ##820E3 A marked store missed the dcache #181,v,g,n,n,PM_PMC4_OVERFLOW,PMC4 Overflow ##0000A PMC4 Overflow #182,v,g,n,n,PM_PMC5_OVERFLOW,PMC5 Overflow ##0001A PMC5 Overflow #183,v,g,n,n,PM_PTEG_FROM_L2,PTEG loaded from L2 ##83087 PTEG loaded from L2 #184,v,g,n,n,PM_PTEG_FROM_L25_SHR,PTEG loaded from L2.5 shared ##83097 PTEG loaded from L2.5 shared #185,v,g,n,n,PM_PTEG_FROM_L275_MOD,PTEG loaded from L2.75 modified ##830A3 PTEG loaded from L2.75 modified #186,v,g,n,n,PM_PTEG_FROM_L3,PTEG loaded from L3 ##8308E PTEG loaded from L3 #187,v,g,n,n,PM_PTEG_FROM_L35_SHR,PTEG loaded from L3.5 shared ##8309E PTEG loaded from L3.5 shared #188,v,g,n,n,PM_PTEG_FROM_L375_MOD,PTEG loaded from L3.75 modified ##830A7 PTEG loaded from L3.75 modified #189,v,g,n,n,PM_PTEG_FROM_RMEM,PTEG loaded from remote memory ##830A1 PTEG loaded from remote memory #190,v,g,n,n,PM_RUN_CYC,Run cycles ##00005 Processor Cycles gated by the run latch #191,v,g,n,s,PM_SNOOP_DCLAIM_RETRY_QFULL,Snoop dclaim/flush retry due to write/dclaim queues full ##720E6 Snoop dclaim/flush retry due to write/dclaim queues full #192,v,g,n,s,PM_SNOOP_PW_RETRY_RQ,Snoop partial-write retry due to collision with active read queue ##707C6 Snoop partial-write retry due 
to collision with active read queue #193,v,g,n,s,PM_SNOOP_RD_RETRY_QFULL,Snoop read retry due to read queue full ##700C6 Snoop read retry due to read queue full #194,v,g,n,s,PM_SNOOP_RD_RETRY_RQ,Snoop read retry due to collision with active read queue ##705C6 Snoop read retry due to collision with active read queue #195,v,g,n,s,PM_SNOOP_RETRY_1AHEAD,Snoop retry due to one ahead collision ##725E6 Snoop retry due to one ahead collision #196,u,g,n,s,PM_SNOOP_TLBIE,Snoop TLBIE ##800C3 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. #197,v,g,n,s,PM_SNOOP_WR_RETRY_RQ,Snoop write/dclaim retry due to collision with active read queue ##706C6 Snoop write/dclaim retry due to collision with active read queue #198,v,g,n,n,PM_STCX_FAIL,STCX failed ##820E1 A stcx (stwcx or stdcx) failed #199,v,g,n,n,PM_STCX_PASS,Stcx passes ##820E5 A stcx (stwcx or stdcx) instruction was successful #200,v,g,n,n,PM_SUSPENDED,Suspended ##00000 Suspended #201,u,g,n,n,PM_TB_BIT_TRANS,Time Base bit transition ##00018 When the selected time base bit (as specified in MMCR0[TBSEL]) transitions from 0 to 1 #202,v,g,n,s,PM_THRD_ONE_RUN_CYC,One of the threads in run cycles ##0000B One of the threads in run cycles #203,v,g,n,n,PM_THRD_PRIO_1_CYC,Cycles thread running at priority level 1 ##420E0 Cycles thread running at priority level 1 #204,v,g,n,n,PM_THRD_PRIO_2_CYC,Cycles thread running at priority level 2 ##420E1 Cycles thread running at priority level 2 #205,v,g,n,n,PM_THRD_PRIO_3_CYC,Cycles thread running at priority level 3 ##420E2 Cycles thread running at priority level 3 #206,v,g,n,n,PM_THRD_PRIO_4_CYC,Cycles thread running at priority level 4 ##420E3 Cycles thread running at priority level 4 #207,v,g,n,n,PM_THRD_PRIO_5_CYC,Cycles thread running at priority level 5 ##420E4 Cycles thread running at 
priority level 5 #208,v,g,n,n,PM_THRD_PRIO_6_CYC,Cycles thread running at priority level 6 ##420E5 Cycles thread running at priority level 6 #209,v,g,n,n,PM_THRD_PRIO_7_CYC,Cycles thread running at priority level 7 ##420E6 Cycles thread running at priority level 7 #210,v,g,n,n,PM_TLB_MISS,TLB misses ##80088 TLB misses #211,v,g,n,s,PM_XER_MAP_FULL_CYC,Cycles XER mapper full ##100C2 The ISU sends a signal indicating that the xer mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #212,v,g,n,n,PM_INST_FROM_L2MISS,Instructions fetched missed L2 ##2209B An instruction fetch group was fetched from beyond L2. $$$$$$$$ { counter 2 } #0,v,g,n,n,PM_0INST_CLB_CYC,Cycles no instructions in CLB ##400C0 Cycles no instructions in CLB #1,v,g,n,n,PM_1INST_CLB_CYC,Cycles 1 instruction in CLB ##400C1 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #2,v,g,n,n,PM_2INST_CLB_CYC,Cycles 2 instructions in CLB ##400C2 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #3,v,g,n,n,PM_3INST_CLB_CYC,Cycles 3 instructions in CLB ##400C3 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. 
Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #4,v,g,n,n,PM_4INST_CLB_CYC,Cycles 4 instructions in CLB ##400C4 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #5,v,g,n,n,PM_5INST_CLB_CYC,Cycles 5 instructions in CLB ##400C5 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. #6,v,g,n,n,PM_6INST_CLB_CYC,Cycles 6 instructions in CLB ##400C6 The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue. 
#7,u,g,n,s,PM_BRQ_FULL_CYC,Cycles branch queue full ##100C5 The ISU sends a signal indicating that the issue queue that feeds the ifu br unit cannot accept any more group (queue is full of groups). #8,v,g,n,n,PM_BR_PRED_TA,A conditional branch was predicted, target prediction ##23087 A conditional branch was predicted, target prediction #9,v,g,n,n,PM_CLB_FULL_CYC,Cycles CLB full ##220E5 Cycles CLB full #10,v,g,n,n,PM_CMPLU_STALL_DCACHE_MISS,Completion stall caused by D cache miss ##1109A Completion stall caused by D cache miss #11,v,g,n,n,PM_CMPLU_STALL_FDIV,Completion stall caused by FDIV or FQRT instruction ##1109B Completion stall caused by FDIV or FQRT instruction #12,v,g,n,n,PM_CMPLU_STALL_FXU,Completion stall caused by FXU instruction ##11099 Completion stall caused by FXU instruction #13,v,g,n,n,PM_CMPLU_STALL_LSU,Completion stall caused by LSU instruction ##11098 Completion stall caused by LSU instruction #14,v,g,n,s,PM_CR_MAP_FULL_CYC,Cycles CR logical operation mapper full ##100C4 The ISU sends a signal indicating that the cr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #15,v,g,n,s,PM_CYC,Processor cycles ##0000F Processor cycles #16,v,g,n,n,PM_DATA_FROM_L25_MOD,Data loaded from L2.5 modified ##C3097 DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a demand load #17,v,g,n,n,PM_DATA_FROM_L35_MOD,Data loaded from L3.5 modified ##C309E Data loaded from L3.5 modified #18,v,g,n,n,PM_DATA_FROM_LMEM,Data loaded from local memory ##C3087 Data loaded from local memory #19,v,g,n,n,PM_DATA_TABLEWALK_CYC,Cycles doing data tablewalks ##800C7 This signal is asserted every cycle when a tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried. #20,v,g,n,n,PM_DSLB_MISS,Data SLB misses ##800C5 A SLB miss for a data request occurred. 
SLB misses trap to the operating system to resolve #21,v,g,n,n,PM_DTLB_MISS,Data TLB misses ##800C4 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. #22,v,g,n,n,PM_DTLB_MISS_16M,Data TLB miss for 16M page ##C40C4 Data TLB miss for 16M page #23,v,g,n,n,PM_DTLB_MISS_4K,Data TLB miss for 4K page ##C40C0 Data TLB miss for 4K page #24,v,g,n,n,PM_DTLB_REF_16M,Data TLB reference for 16M page ##C40C6 Data TLB reference for 16M page #25,v,g,n,n,PM_DTLB_REF_4K,Data TLB reference for 4K page ##C40C2 Data TLB reference for 4K page #26,v,g,n,s,PM_FAB_CMD_ISSUED,Fabric command issued ##700C7 Fabric command issued #27,v,g,n,s,PM_FAB_DCLAIM_ISSUED,dclaim issued ##720E7 dclaim issued #28,v,g,n,s,PM_FAB_HOLDtoNN_EMPTY,Hold buffer to NN empty ##722E7 Hold buffer to NN empty #29,v,g,n,s,PM_FAB_HOLDtoVN_EMPTY,Hold buffer to VN empty ##721E7 Hold buffer to VN empty #30,v,g,n,s,PM_FAB_M1toP1_SIDECAR_EMPTY,M1 to P1 sidecar empty ##702C7 M1 to P1 sidecar empty #31,v,g,n,s,PM_FAB_P1toM1_SIDECAR_EMPTY,P1 to M1 sidecar empty ##701C7 P1 to M1 sidecar empty #32,v,g,n,s,PM_FAB_PNtoNN_DIRECT,PN to NN beat went straight to its destination ##703C7 PN to NN beat went straight to its destination #33,v,g,n,s,PM_FAB_PNtoVN_DIRECT,PN to VN beat went straight to its destination ##723E7 PN to VN beat went straight to its destination #34,v,g,n,s,PM_FPR_MAP_FULL_CYC,Cycles FPR mapper full ##100C1 The ISU sends a signal indicating that the FPR mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #35,v,g,n,n,PM_FPU0_1FLOP,FPU0 executed add, mult, sub, cmp or sel instruction ##000C3 This signal is active for one cycle when fp0 is executing an add, mult, sub, compare, or fsel kind of instruction. 
This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #36,v,g,n,n,PM_FPU0_DENORM,FPU0 received denormalized data ##020E0 This signal is active for one cycle when one of the operands is denormalized. #37,v,g,n,n,PM_FPU0_FDIV,FPU0 executed FDIV instruction ##000C0 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #38,v,g,n,n,PM_FPU0_FMA,FPU0 executed multiply-add instruction ##000C1 This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #39,v,g,n,n,PM_FPU0_FSQRT,FPU0 executed FSQRT instruction ##000C2 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #40,v,g,n,s,PM_FPU0_FULL_CYC,Cycles FPU0 issue queue full ##100C3 The issue queue for FPU unit 0 cannot accept any more instructions. Issue is stopped #41,v,g,n,n,PM_FPU0_SINGLE,FPU0 executed single precision instruction ##020E3 This signal is active for one cycle when fp0 is executing single precision instruction. #42,v,g,n,n,PM_FPU0_STALL3,FPU0 stalled in pipe3 ##020E1 This signal indicates that fp0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. #43,v,g,n,n,PM_FPU0_STF,FPU0 executed store instruction ##020E2 This signal is active for one cycle when fp0 is executing a store instruction. #44,v,g,n,n,PM_FPU1_1FLOP,FPU1 executed add, mult, sub, cmp or sel instruction ##000C7 This signal is active for one cycle when fp1 is executing an add, mult, sub, compare, or fsel kind of instruction. 
This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #45,v,g,n,n,PM_FPU1_DENORM,FPU1 received denormalized data ##020E4 This signal is active for one cycle when one of the operands is denormalized. #46,v,g,n,n,PM_FPU1_FDIV,FPU1 executed FDIV instruction ##000C4 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #47,v,g,n,n,PM_FPU1_FMA,FPU1 executed multiply-add instruction ##000C5 This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #48,v,g,n,n,PM_FPU1_FSQRT,FPU1 executed FSQRT instruction ##000C6 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #49,v,g,n,s,PM_FPU1_FULL_CYC,Cycles FPU1 issue queue full ##100C7 The issue queue for FPU unit 1 cannot accept any more instructions. Issue is stopped #50,v,g,n,n,PM_FPU1_SINGLE,FPU1 executed single precision instruction ##020E7 This signal is active for one cycle when fp1 is executing single precision instruction. #51,v,g,n,n,PM_FPU1_STALL3,FPU1 stalled in pipe3 ##020E5 This signal indicates that fp1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. #52,v,g,n,n,PM_FPU1_STF,FPU1 executed store instruction ##020E6 This signal is active for one cycle when fp1 is executing a store instruction. #53,v,g,n,n,PM_FPU_FSQRT,FPU executed FSQRT instruction ##00090 This signal is active for one cycle at the end of the microcode executed when FPU is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
Combined Unit 0 + Unit 1 #54,v,g,n,n,PM_FPU_FMA,FPU executed multiply-add instruction ##00088 This signal is active for one cycle when FPU is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1 #55,v,g,n,n,PM_FPU_STALL3,FPU stalled in pipe3 ##02088 FPU has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. Combined Unit 0 + Unit 1 #56,v,g,n,n,PM_FPU_STF,FPU executed store instruction ##02090 FPU is executing a store instruction. Combined Unit 0 + Unit 1 #57,u,g,n,n,PM_FXU_BUSY,FXU busy ##00012 FXU0 and FXU1 are both busy #58,v,g,n,n,PM_FXU_FIN,FXU produced a result ##00014 The fixed point unit (Unit 0 + Unit 1) finished a marked instruction. Instructions that finish may not necessarily complete. #59,v,g,n,n,PM_GCT_NOSLOT_IC_MISS,No slot in GCT caused by I cache miss ##1009C This thread has no slot in the GCT because of an I cache miss #60,v,g,n,n,PM_GCT_FULL_CYC,Cycles GCT full ##100C0 The ISU sends a signal indicating the gct is full. #61,v,g,n,s,PM_GCT_USAGE_60to79_CYC,Cycles GCT 60-79% full ##0001F Cycles GCT 60-79% full #62,v,g,n,n,PM_GRP_BR_REDIR,Group experienced branch redirect ##120E6 Group experienced branch redirect #63,v,g,n,n,PM_GRP_BR_REDIR_NONSPEC,Group experienced non-speculative branch redirect ##120E5 Group experienced non-speculative branch redirect #64,v,g,n,n,PM_GRP_DISP,Group dispatches ##00002 A group was dispatched #65,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected ##120E4 A group that previously attempted dispatch was rejected. #66,v,g,n,n,PM_GRP_DISP_VALID,Group dispatch valid ##120E3 Dispatch has been attempted for a valid group. Some groups may be rejected. The total number of successful dispatches is the number of dispatch valid minus dispatch reject. 
#67,v,g,n,n,PM_GRP_IC_MISS,Group experienced I cache miss ##120E7 Group experienced I cache miss #68,v,g,n,n,PM_HV_CYC,Hypervisor Cycles ##0000B Cycles when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0) #69,v,g,n,n,PM_IC_PREF_REQ,Instruction prefetch requests ##220E6 Asserted when a non-canceled prefetch is made to the cache interface unit (CIU). #70,v,g,n,n,PM_IERAT_XLATE_WR,Translation written to ierat ##220E7 This signal will be asserted each time the I-ERAT is written. This indicates that an ERAT miss has been serviced. ERAT misses will initiate a sequence resulting in the ERAT being written. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed, This should be a fairly accurate count of ERAT missed (best available). #71,v,g,n,n,PM_INST_CMPL,Instructions completed ##00001 Number of Eligible Instructions that completed. #72,v,g,n,n,PM_INST_DISP,Instructions dispatched ##120E1 The ISU sends the number of instructions dispatched. #73,v,g,n,n,PM_INST_FETCH_CYC,Cycles at least 1 instruction fetched ##220E4 Asserted each cycle when the IFU sends at least one instruction to the IDU. #74,v,g,n,n,PM_INST_FROM_L1,Instruction fetched from L1 ##2208D An instruction fetch group was fetched from L1. 
Fetch Groups can contain up to 8 instructions #75,v,g,n,n,PM_INST_FROM_L25_MOD,Instruction fetched from L2.5 modified ##22096 Instruction fetched from L2.5 modified #76,v,g,n,n,PM_INST_FROM_L35_MOD,Instruction fetched from L3.5 modified ##2209D Instruction fetched from L3.5 modified #77,v,g,n,n,PM_INST_FROM_LMEM,Instruction fetched from local memory ##22086 Instruction fetched from local memory #78,u,g,n,n,PM_ISLB_MISS,Instruction SLB misses ##800C1 A SLB miss for an instruction fetch has occurred #79,v,g,n,n,PM_ITLB_MISS,Instruction TLB misses ##800C0 A TLB miss for an Instruction Fetch has occurred #80,v,g,n,s,PM_L2SA_MOD_TAG,L2 slice A transition from modified to tagged ##720E0 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #81,v,g,n,s,PM_L2SA_RCLD_DISP,L2 Slice A RC load dispatch attempt ##701C0 L2 Slice A RC load dispatch attempt #82,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_RC_FULL,L2 Slice A RC load dispatch attempt failed due to all RC full ##721E0 L2 Slice A RC load dispatch attempt failed due to all RC full #83,v,g,n,s,PM_L2SA_RCST_DISP,L2 Slice A RC store dispatch attempt ##702C0 L2 Slice A RC store dispatch attempt #84,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_RC_FULL,L2 Slice A RC store dispatch attempt failed due to all RC full ##722E0 L2 Slice A RC store dispatch attempt failed due to all RC full #85,v,g,n,s,PM_L2SA_RC_DISP_FAIL_CO_BUSY,L2 Slice A RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy ##703C0 L2 Slice A RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy #86,v,g,n,s,PM_L2SA_SHR_MOD,L2 slice A transition from shared to modified ##700C0 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. 
This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. #87,v,g,n,n,PM_L2SA_ST_REQ,L2 slice A store requests ##723E0 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C. #88,v,g,n,s,PM_L2SB_MOD_TAG,L2 slice B transition from modified to tagged ##720E1 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #89,v,g,n,s,PM_L2SB_RCLD_DISP,L2 Slice B RC load dispatch attempt ##701C1 L2 Slice B RC load dispatch attempt #90,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_RC_FULL,L2 Slice B RC load dispatch attempt failed due to all RC full ##721E1 L2 Slice B RC load dispatch attempt failed due to all RC full #91,v,g,n,s,PM_L2SB_RCST_DISP,L2 Slice B RC store dispatch attempt ##702C1 L2 Slice B RC store dispatch attempt #92,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_RC_FULL,L2 Slice B RC store dispatch attempt failed due to all RC full ##722E1 L2 Slice B RC store dispatch attempt failed due to all RC full #93,v,g,n,s,PM_L2SB_RC_DISP_FAIL_CO_BUSY,L2 Slice B RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy ##703C1 L2 Slice B RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy #94,v,g,n,s,PM_L2SB_SHR_MOD,L2 slice B transition from shared to modified ##700C1 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. 
#95,v,g,n,n,PM_L2SB_ST_REQ,L2 slice B store requests ##723E1 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C. #96,v,g,n,s,PM_L2SC_MOD_TAG,L2 slice C transition from modified to tagged ##720E2 A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #97,v,g,n,s,PM_L2SC_RCLD_DISP,L2 Slice C RC load dispatch attempt ##701C2 L2 Slice C RC load dispatch attempt #98,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_RC_FULL,L2 Slice C RC load dispatch attempt failed due to all RC full ##721E2 L2 Slice C RC load dispatch attempt failed due to all RC full #99,v,g,n,s,PM_L2SC_RCST_DISP,L2 Slice C RC store dispatch attempt ##702C2 L2 Slice C RC store dispatch attempt #100,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_RC_FULL,L2 Slice C RC store dispatch attempt failed due to all RC full ##722E2 L2 Slice C RC store dispatch attempt failed due to all RC full #101,v,g,n,s,PM_L2SC_RC_DISP_FAIL_CO_BUSY,L2 Slice C RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy ##703C2 L2 Slice C RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy #102,v,g,n,s,PM_L2SC_SHR_MOD,L2 slice C transition from shared to modified ##700C2 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. #103,v,g,n,n,PM_L2SC_ST_REQ,L2 slice C store requests ##723E2 A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. 
The event is provided on each of the three slices A,B,and C. #104,v,g,n,s,PM_L3SA_ALL_BUSY,L3 slice A active for every cycle all CI/CO machines busy ##721E3 L3 slice A active for every cycle all CI/CO machines busy #105,v,g,n,s,PM_L3SA_MOD_TAG,L3 slice A transition from modified to TAG ##720E3 L3 slice A transition from modified to TAG #106,v,g,n,s,PM_L3SA_REF,L3 slice A references ##701C3 L3 slice A references #107,v,g,n,s,PM_L3SB_ALL_BUSY,L3 slice B active for every cycle all CI/CO machines busy ##721E4 L3 slice B active for every cycle all CI/CO machines busy #108,v,g,n,s,PM_L3SB_MOD_TAG,L3 slice B transition from modified to TAG ##720E4 L3 slice B transition from modified to TAG #109,v,g,n,s,PM_L3SB_REF,L3 slice B references ##701C4 L3 slice B references #110,v,g,n,s,PM_L3SC_ALL_BUSY,L3 slice C active for every cycle all CI/CO machines busy ##721E5 L3 slice C active for every cycle all CI/CO machines busy #111,v,g,n,s,PM_L3SC_MOD_TAG,L3 slice C transition from modified to TAG ##720E5 L3 slice C transition from modified to TAG #112,v,g,n,s,PM_L3SC_REF,L3 slice C references ##701C5 L3 slice C references #113,v,g,n,n,PM_LARX_LSU0,Larx executed on LSU0 ##820E7 A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0) #114,u,g,n,s,PM_LR_CTR_MAP_FULL_CYC,Cycles LR/CTR mapper full ##100C6 The ISU sends a signal indicating that the lr/ctr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #115,v,g,n,n,PM_LSU0_BUSY_REJECT,LSU0 busy due to reject ##C20E3 LSU unit 0 busy due to reject #116,v,g,n,n,PM_LSU0_DERAT_MISS,LSU0 DERAT misses ##800C2 A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. 
#117,v,g,n,n,PM_LSU0_FLUSH_LRQ,LSU0 LRQ flushes ##C00C2 A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #118,u,g,n,n,PM_LSU0_FLUSH_SRQ,LSU0 SRQ flushes ##C00C3 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #119,v,g,n,n,PM_LSU0_FLUSH_ULD,LSU0 unaligned load flushes ##C00C0 A load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #120,v,g,n,n,PM_LSU0_FLUSH_UST,LSU0 unaligned store flushes ##C00C1 A store was flushed from unit 0 because it was unaligned (crossed a 4k boundary) #121,v,g,n,n,PM_LSU0_REJECT_ERAT_MISS,LSU0 reject due to ERAT miss ##C60E3 LSU0 reject due to ERAT miss #122,v,g,n,n,PM_LSU0_REJECT_LMQ_FULL,LSU0 reject due to LMQ full or missed data coming ##C60E1 LSU0 reject due to LMQ full or missed data coming #123,v,g,n,n,PM_LSU0_REJECT_RELOAD_CDF,LSU0 reject due to reload CDF or tag update collision ##C60E2 LSU0 reject due to reload CDF or tag update collision #124,v,g,n,n,PM_LSU0_REJECT_SRQ_LHS,LSU0 SRQ rejects ##C60E0 LSU0 reject due to load hit store #125,u,g,n,n,PM_LSU0_SRQ_STFWD,LSU0 SRQ store forwarded ##C20E0 Data from a store instruction was forwarded to a load on unit 0 #126,v,g,n,n,PM_LSU1_BUSY_REJECT,LSU1 busy due to reject ##C20E7 LSU unit 1 is busy due to reject #127,v,g,n,n,PM_LSU1_DERAT_MISS,LSU1 DERAT misses ##800C6 A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. 
#128,v,g,n,n,PM_LSU1_FLUSH_LRQ,LSU1 LRQ flushes ##C00C6 A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #129,u,g,n,n,PM_LSU1_FLUSH_SRQ,LSU1 SRQ flushes ##C00C7 A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. #130,v,g,n,n,PM_LSU1_FLUSH_ULD,LSU1 unaligned load flushes ##C00C4 A load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #131,u,g,n,n,PM_LSU1_FLUSH_UST,LSU1 unaligned store flushes ##C00C5 A store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #132,v,g,n,n,PM_LSU1_REJECT_ERAT_MISS,LSU1 reject due to ERAT miss ##C60E7 LSU1 reject due to ERAT miss #133,v,g,n,n,PM_LSU1_REJECT_LMQ_FULL,LSU1 reject due to LMQ full or missed data coming ##C60E5 LSU1 reject due to LMQ full or missed data coming #134,v,g,n,n,PM_LSU1_REJECT_RELOAD_CDF,LSU1 reject due to reload CDF or tag update collision ##C60E6 LSU1 reject due to reload CDF or tag update collision #135,v,g,n,n,PM_LSU1_REJECT_SRQ_LHS,LSU1 SRQ rejects ##C60E4 LSU1 reject due to load hit store #136,u,g,n,n,PM_LSU1_SRQ_STFWD,LSU1 SRQ store forwarded ##C20E4 Data from a store instruction was forwarded to a load on unit 1 #137,v,g,n,n,PM_LSU_DERAT_MISS,DERAT misses ##80090 Total D-ERAT Misses (Unit 0 + Unit 1). Requests that miss the DERAT are rejected and retried until the request hits in the ERAT. This may result in multiple ERAT misses for the same instruction. #138,v,g,n,n,PM_LSU_FLUSH_LRQ,LRQ flushes ##C0090 A load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. 
#139,v,g,n,s,PM_LSU_FLUSH_LRQ_FULL,Flush caused by LRQ full ##320E7 Flush caused by LRQ full #140,v,g,n,n,PM_LSU_FLUSH_UST,SRQ unaligned store flushes ##C0088 A store was flushed because it was unaligned #141,u,g,n,n,PM_LSU_LMQ_SRQ_EMPTY_CYC,Cycles LMQ and SRQ empty ##00015 Cycles when both the LMQ and SRQ are empty (LSU is idle) #142,v,g,n,n,PM_LSU_LRQ_S0_ALLOC,LRQ slot 0 allocated ##C20E6 LRQ slot zero was allocated #143,v,g,n,n,PM_LSU_LRQ_S0_VALID,LRQ slot 0 valid ##C20E2 This signal is asserted every cycle that the Load Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. #144,v,g,n,n,PM_LSU_REJECT_LMQ_FULL,LSU reject due to LMQ full or missed data coming ##C6088 LSU reject due to LMQ full or missed data coming #145,v,g,n,n,PM_LSU_REJECT_RELOAD_CDF,LSU reject due to reload CDF or tag update collision ##C6090 LSU reject due to reload CDF or tag update collision #146,v,g,n,n,PM_LSU_SRQ_S0_ALLOC,SRQ slot 0 allocated ##C20E5 SRQ Slot zero was allocated #147,v,g,n,n,PM_LSU_SRQ_S0_VALID,SRQ slot 0 valid ##C20E1 This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. 
#148,v,g,n,s,PM_MEM_FAST_PATH_RD_CMPL,Fast path memory read completed ##722E6 Fast path memory read completed #149,v,g,n,s,PM_MEM_HI_PRIO_PW_CMPL,High priority partial-write completed ##727E6 High priority partial-write completed #150,v,g,n,s,PM_MEM_HI_PRIO_WR_CMPL,High priority write completed ##726E6 High priority write completed #151,v,g,n,s,PM_MEM_PWQ_DISP,Memory partial-write queue dispatched ##704C6 Memory partial-write queue dispatched #152,v,g,n,s,PM_MEM_PWQ_DISP_BUSY2or3,Memory partial-write queue dispatched with 2-3 queues busy ##724E6 Memory partial-write queue dispatched with 2-3 queues busy #153,v,g,n,s,PM_MEM_READ_CMPL,Memory read completed or canceled ##702C6 Memory read completed or canceled #154,v,g,n,s,PM_MEM_RQ_DISP,Memory read queue dispatched ##701C6 Memory read queue dispatched #155,v,g,n,s,PM_MEM_RQ_DISP_BUSY8to15,Memory read queue dispatched with 8-15 queues busy ##721E6 Memory read queue dispatched with 8-15 queues busy #156,v,g,n,s,PM_MEM_WQ_DISP_BUSY1to7,Memory write queue dispatched with 1-7 queues busy ##723E6 Memory write queue dispatched with 1-7 queues busy #157,v,g,n,s,PM_MEM_WQ_DISP_WRITE,Memory write queue dispatched due to write ##703C6 Memory write queue dispatched due to write #158,v,g,n,n,PM_MRK_BRU_FIN,Marked instruction BRU processing finished ##00005 The branch unit finished a marked instruction. 
Instructions that finish may not necessarily complete #159,v,g,n,n,PM_MRK_DATA_FROM_L25_MOD,Marked data loaded from L2.5 modified ##C7097 DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a marked demand load #160,v,g,n,n,PM_MRK_DATA_FROM_L25_SHR_CYC,Marked load latency from L2.5 shared ##C70A2 Marked load latency from L2.5 shared #161,v,g,n,n,PM_MRK_DATA_FROM_L275_SHR_CYC,Marked load latency from L2.75 shared ##C70A3 Marked load latency from L2.75 shared #162,v,g,n,n,PM_MRK_DATA_FROM_L2_CYC,Marked load latency from L2 ##C70A0 Marked load latency from L2 #163,v,g,n,n,PM_MRK_DATA_FROM_L35_MOD,Marked data loaded from L3.5 modified ##C709E Marked data loaded from L3.5 modified #164,v,g,n,n,PM_MRK_DATA_FROM_L35_SHR_CYC,Marked load latency from L3.5 shared ##C70A6 Marked load latency from L3.5 shared #165,v,g,n,n,PM_MRK_DATA_FROM_L375_SHR_CYC,Marked load latency from L3.75 shared ##C70A7 Marked load latency from L3.75 shared #166,v,g,n,n,PM_MRK_DATA_FROM_L3_CYC,Marked load latency from L3 ##C70A4 Marked load latency from L3 #167,v,g,n,n,PM_MRK_DATA_FROM_LMEM,Marked data loaded from local memory ##C7087 Marked data loaded from local memory #168,v,g,n,n,PM_MRK_DTLB_MISS_16M,Marked Data TLB misses for 16M page ##C40C5 Marked Data TLB misses for 16M page #169,v,g,n,n,PM_MRK_DTLB_MISS_4K,Marked Data TLB misses for 4K page ##C40C1 Marked Data TLB misses for 4K page #170,v,g,n,n,PM_MRK_DTLB_REF_16M,Marked Data TLB reference for 16M page ##C40C7 Marked Data TLB reference for 16M page #171,v,g,n,n,PM_MRK_DTLB_REF_4K,Marked Data TLB reference for 4K page ##C40C3 Marked Data TLB reference for 4K page #172,v,g,n,n,PM_MRK_GRP_BR_REDIR,Group experienced marked branch redirect ##12091 Group experienced marked branch redirect #173,v,g,n,n,PM_MRK_IMR_RELOAD,Marked IMR reloaded ##820E2 A DL1 reload occurred due to a marked load #174,v,g,n,n,PM_INST_CMPL,Instructions completed ##00009 Number of PPC instructions completed. 
#175,v,g,n,n,PM_MRK_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##820E0 A marked load, executing on unit 0, missed the dcache #176,v,g,n,n,PM_MRK_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##820E4 A marked load, executing on unit 1, missed the dcache #177,v,g,n,n,PM_MRK_STCX_FAIL,Marked STCX failed ##820E6 A marked stcx (stwcx or stdcx) failed #178,v,g,n,n,PM_MRK_ST_GPS,Marked store sent to GPS ##00003 A sampled store has been sent to the memory subsystem #179,v,g,n,n,PM_MRK_ST_MISS_L1,Marked L1 D cache store misses ##820E3 A marked store missed the dcache #180,v,g,n,n,PM_PMC1_OVERFLOW,PMC1 Overflow ##0000A PMC1 Overflow #181,v,g,n,n,PM_PTEG_FROM_L25_MOD,PTEG loaded from L2.5 modified ##83097 PTEG loaded from L2.5 modified #182,v,g,n,n,PM_PTEG_FROM_L35_MOD,PTEG loaded from L3.5 modified ##8309E PTEG loaded from L3.5 modified #183,v,g,n,n,PM_PTEG_FROM_LMEM,PTEG loaded from local memory ##83087 PTEG loaded from local memory #184,v,g,n,n,PM_SLB_MISS,SLB misses ##80088 SLB misses #185,v,g,n,s,PM_SNOOP_DCLAIM_RETRY_QFULL,Snoop dclaim/flush retry due to write/dclaim queues full ##720E6 Snoop dclaim/flush retry due to write/dclaim queues full #186,v,g,n,s,PM_SNOOP_PW_RETRY_RQ,Snoop partial-write retry due to collision with active read queue ##707C6 Snoop partial-write retry due to collision with active read queue #187,v,g,n,s,PM_SNOOP_RD_RETRY_QFULL,Snoop read retry due to read queue full ##700C6 Snoop read retry due to read queue full #188,v,g,n,s,PM_SNOOP_RD_RETRY_RQ,Snoop read retry due to collision with active read queue ##705C6 Snoop read retry due to collision with active read queue #189,v,g,n,s,PM_SNOOP_RETRY_1AHEAD,Snoop retry due to one ahead collision ##725E6 Snoop retry due to one ahead collision #190,u,g,n,s,PM_SNOOP_TLBIE,Snoop TLBIE ##800C3 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). 
This may result in multiple TLB misses for the same instruction. #191,v,g,n,s,PM_SNOOP_WR_RETRY_RQ,Snoop write/dclaim retry due to collision with active read queue ##706C6 Snoop write/dclaim retry due to collision with active read queue #192,v,g,n,n,PM_STCX_FAIL,STCX failed ##820E1 A stcx (stwcx or stdcx) failed #193,v,g,n,n,PM_STCX_PASS,Stcx passes ##820E5 A stcx (stwcx or stdcx) instruction was successful #194,v,g,n,n,PM_SUSPENDED,Suspended ##00000 Suspended #195,v,g,n,s,PM_GCT_EMPTY_CYC,Cycles GCT empty ##00004 The Global Completion Table is completely empty #196,v,g,n,n,PM_THRD_GRP_CMPL_BOTH_CYC,Cycles group completed by both threads ##00013 Cycles group completed by both threads #197,v,g,n,n,PM_THRD_PRIO_1_CYC,Cycles thread running at priority level 1 ##420E0 Cycles thread running at priority level 1 #198,v,g,n,n,PM_THRD_PRIO_2_CYC,Cycles thread running at priority level 2 ##420E1 Cycles thread running at priority level 2 #199,v,g,n,n,PM_THRD_PRIO_3_CYC,Cycles thread running at priority level 3 ##420E2 Cycles thread running at priority level 3 #200,v,g,n,n,PM_THRD_PRIO_4_CYC,Cycles thread running at priority level 4 ##420E3 Cycles thread running at priority level 4 #201,v,g,n,n,PM_THRD_PRIO_5_CYC,Cycles thread running at priority level 5 ##420E4 Cycles thread running at priority level 5 #202,v,g,n,n,PM_THRD_PRIO_6_CYC,Cycles thread running at priority level 6 ##420E5 Cycles thread running at priority level 6 #203,v,g,n,n,PM_THRD_PRIO_7_CYC,Cycles thread running at priority level 7 ##420E6 Cycles thread running at priority level 7 #204,v,g,n,s,PM_XER_MAP_FULL_CYC,Cycles XER mapper full ##100C2 The ISU sends a signal indicating that the xer mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. $$$$$$$$ { counter 3 } #0,v,g,n,n,PM_BR_ISSUED,Branches issued ##230E4 This signal will be asserted each time the ISU issues a branch instruction. 
This signal will be asserted each time the ISU selects a branch instruction to issue. #1,v,g,n,n,PM_BR_MPRED_CR,Branch mispredictions due to CR bit setting ##230E5 This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction. #2,v,g,n,n,PM_BR_MPRED_TA,Branch mispredictions due to target address ##230E6 Branch mispredict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction. #3,v,g,n,n,PM_BR_PRED_CR,A conditional branch was predicted, CR prediction ##23087,230E2 A conditional branch was predicted, CR prediction #4,v,g,n,n,PM_BR_PRED_TA,A conditional branch was predicted, target prediction ##230E3 A conditional branch was predicted, target prediction #5,u,g,n,s,PM_CRQ_FULL_CYC,Cycles CR issue queue full ##110C1 The ISU sends a signal indicating that the issue queue that feeds the ifu cr unit cannot accept any more groups (queue is full of groups). 
#6,v,g,n,s,PM_CYC,Processor cycles ##0000F Processor cycles #7,v,g,n,n,PM_DATA_FROM_L25_MOD,Data loaded from L2.5 modified ##C30A2 DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a demand load #8,v,g,n,n,PM_DATA_FROM_L275_SHR,Data loaded from L2.75 shared ##C3097 DL1 was reloaded with shared (T) data from the L2 of another MCM due to a demand load #9,v,g,n,n,PM_DATA_FROM_L35_MOD,Data loaded from L3.5 modified ##C30A6 Data loaded from L3.5 modified #10,v,g,n,n,PM_DATA_FROM_L375_SHR,Data loaded from L3.75 shared ##C309E Data loaded from L3.75 shared #11,v,g,n,n,PM_DATA_FROM_LMEM,Data loaded from local memory ##C30A0 Data loaded from local memory #12,u,g,n,s,PM_DC_INV_L2,L1 D cache entries invalidated from L2 ##C10C7 A dcache invalidate was received from the L2 because a line in the L2 was cast out. #13,v,g,n,n,PM_DC_PREF_DST,DST (Data Stream Touch) stream start ##830E6 DST (Data Stream Touch) stream start #14,v,g,n,n,PM_DC_PREF_STREAM_ALLOC,D cache new prefetch stream allocated ##830E7 A new Prefetch Stream was allocated #15,v,g,n,n,PM_EE_OFF,Cycles MSR(EE) bit off ##130E3 The number of cycles the MSR(EE) bit was off. 
#16,u,g,n,n,PM_EE_OFF_EXT_INT,Cycles MSR(EE) bit off and external interrupt pending ##130E7 Cycles MSR(EE) bit off and external interrupt pending #17,v,g,n,n,PM_FAB_CMD_RETRIED,Fabric command retried ##710C7 Fabric command retried #18,v,g,n,s,PM_FAB_DCLAIM_RETRIED,dclaim retried ##730E7 dclaim retried #19,v,g,n,s,PM_FAB_M1toVNorNN_SIDECAR_EMPTY,M1 to VN/NN sidecar empty ##712C7 M1 to VN/NN sidecar empty #20,v,g,n,s,PM_FAB_P1toVNorNN_SIDECAR_EMPTY,P1 to VN/NN sidecar empty ##711C7 P1 to VN/NN sidecar empty #21,v,g,n,s,PM_FAB_PNtoNN_SIDECAR,PN to NN beat went to sidecar first ##713C7 PN to NN beat went to sidecar first #22,v,g,n,s,PM_FAB_PNtoVN_SIDECAR,PN to VN beat went to sidecar first ##733E7 PN to VN beat went to sidecar first #23,v,g,n,s,PM_FAB_VBYPASS_EMPTY,Vertical bypass buffer empty ##731E7 Vertical bypass buffer empty #24,v,g,n,n,PM_FLUSH_BR_MPRED,Flush caused by branch mispredict ##110C6 Flush caused by branch mispredict #25,v,g,n,s,PM_FLUSH_IMBAL,Flush caused by thread GCT imbalance ##330E3 Flush caused by thread GCT imbalance #26,v,g,n,n,PM_FLUSH,Flushes ##110C7 Flushes #27,v,g,n,s,PM_FLUSH_SB,Flush caused by scoreboard operation ##330E2 Flush caused by scoreboard operation #28,v,g,n,s,PM_FLUSH_SYNC,Flush caused by sync ##330E1 Flush caused by sync #29,v,g,n,n,PM_FPU0_FEST,FPU0 executed FEST instruction ##010C2 This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #30,v,g,n,n,PM_FPU0_FIN,FPU0 produced a result ##010C3 fp0 finished, produced a result. This only indicates finish, not completion. #31,v,g,n,n,PM_FPU0_FMOV_FEST,FPU0 executed FMOV or FEST instructions ##010C0 This signal is active for one cycle when fp0 is executing a move kind of instruction or one of the estimate instructions. 
This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ #32,v,g,n,n,PM_FPU0_FPSCR,FPU0 executed FPSCR instruction ##030E0 This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*, mffs*, mtfsf*, mcrfs* where XYZ* means XYZ, XYZs, XYZ., XYZs #33,v,g,n,n,PM_FPU0_FRSP_FCONV,FPU0 executed FRSP or FCONV instructions ##010C1 This signal is active for one cycle when fp0 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #34,v,g,n,n,PM_FPU1_FEST,FPU1 executed FEST instruction ##010C6 This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #35,v,g,n,n,PM_FPU1_FIN,FPU1 produced a result ##010C7 fp1 finished, produced a result. This only indicates finish, not completion. #36,v,g,n,n,PM_FPU1_FMOV_FEST,FPU1 executing FMOV or FEST instructions ##010C4 This signal is active for one cycle when fp1 is executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ #37,v,g,n,n,PM_FPU1_FRSP_FCONV,FPU1 executed FRSP or FCONV instructions ##010C5 This signal is active for one cycle when fp1 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #38,v,g,n,n,PM_FPU_FMOV_FEST,FPU executing FMOV or FEST instructions ##01088 This signal is active for one cycle when executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1 #39,v,g,n,n,PM_FPU_FRSP_FCONV,FPU executed FRSP or FCONV instructions ##01090 This signal is active for one cycle when executing frsp or convert kind of instruction. 
This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1 #40,v,g,n,s,PM_FXLS0_FULL_CYC,Cycles FXU0/LS0 queue full ##110C0 The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped #41,v,g,n,s,PM_FXLS1_FULL_CYC,Cycles FXU1/LS1 queue full ##110C4 The issue queue for FXU/LSU unit 1 cannot accept any more instructions. Issue is stopped #42,u,g,n,n,PM_FXU0_BUSY_FXU1_IDLE,FXU0 busy FXU1 idle ##00012 FXU0 is busy while FXU1 was idle #43,v,g,n,n,PM_FXU0_FIN,FXU0 produced a result ##130E2 The Fixed Point unit 0 finished an instruction and produced a result #44,v,g,n,n,PM_FXU1_FIN,FXU1 produced a result ##130E6 The Fixed Point unit 1 finished an instruction and produced a result #45,v,g,n,n,PM_FXU_FIN,FXU produced a result ##13088 The fixed point unit (Unit 0 + Unit 1) finished a marked instruction. Instructions that finish may not necessarily complete. #46,v,g,n,n,PM_GCT_NOSLOT_SRQ_FULL,No slot in GCT caused by SRQ full ##10084 This thread has no slot in the GCT because the SRQ is full #47,v,g,n,s,PM_GCT_USAGE_80to99_CYC,Cycles GCT 80-99% full ##0001F Cycles GCT 80-99% full #48,v,g,n,s,PM_GPR_MAP_FULL_CYC,Cycles GPR mapper full ##130E5 The ISU sends a signal indicating that the gpr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #49,v,g,n,n,PM_GRP_CMPL,Group completed ##00013 A group completed. Microcoded instructions that span multiple groups will generate this event once per group. #50,v,g,n,n,PM_GRP_DISP_BLK_SB_CYC,Cycles group dispatch blocked by scoreboard ##130E1 The ISU sends a signal indicating that dispatch is blocked by scoreboard. 
#51,v,g,n,n,PM_GRP_DISP_SUCCESS,Group dispatch success ##00002 Number of groups successfully dispatched (not rejected) #52,v,g,n,n,PM_IC_DEMAND_L2_BHT_REDIRECT,L2 I cache demand request due to BHT redirect ##230E0 L2 I cache demand request due to BHT redirect #53,v,g,n,n,PM_IC_DEMAND_L2_BR_REDIRECT,L2 I cache demand request due to branch redirect ##230E1 L2 I cache demand request due to branch redirect #54,v,g,n,n,PM_IC_PREF_INSTALL,Instruction prefetched installed in prefetch ##210C7 New line coming into the prefetch buffer #55,v,g,n,n,PM_INST_CMPL,Instructions completed ##00001 Number of Eligible Instructions that completed. #56,v,g,n,n,PM_INST_DISP,Instructions dispatched ##00009 The ISU sends the number of instructions dispatched. #57,v,g,n,n,PM_INST_FROM_L275_SHR,Instruction fetched from L2.75 shared ##22096 Instruction fetched from L2.75 shared #58,v,g,n,n,PM_INST_FROM_L375_SHR,Instruction fetched from L3.75 shared ##2209D Instruction fetched from L3.75 shared #59,v,g,n,n,PM_INST_FROM_PREF,Instructions fetched from prefetch ##2208D An instruction fetch group was fetched from the prefetch buffer. Fetch Groups can contain up to 8 instructions #60,v,g,n,n,PM_L1_DCACHE_RELOAD_VALID,L1 reload data source valid ##C30E4 The data source information is valid #61,v,g,n,n,PM_L1_PREF,L1 cache data prefetches ##C70E7 A request to prefetch data into the L1 was made #62,v,g,n,n,PM_L1_WRITE_CYC,Cycles writing to instruction L1 ##230E7 This signal is asserted each cycle a cache write is active. #63,v,g,n,s,PM_L2SA_MOD_INV,L2 slice A transition from modified to invalid ##730E0 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. 
#64,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_ADDR,L2 Slice A RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ ##711C0 L2 Slice A RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ #65,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_OTHER,L2 Slice A RC load dispatch attempt failed due to other reasons ##731E0 L2 Slice A RC load dispatch attempt failed due to other reasons #66,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_ADDR,L2 Slice A RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ ##712C0 L2 Slice A RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ #67,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_OTHER,L2 Slice A RC store dispatch attempt failed due to other reasons ##732E0 L2 Slice A RC store dispatch attempt failed due to other reasons #68,v,g,n,s,PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL,L2 Slice A RC dispatch attempt failed due to all CO busy ##713C0 L2 Slice A RC dispatch attempt failed due to all CO busy #69,v,g,n,s,PM_L2SA_SHR_INV,L2 slice A transition from shared to invalid ##710C0 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #70,v,g,n,n,PM_L2SA_ST_HIT,L2 slice A store hits ##733E0 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C. #71,v,g,n,s,PM_L2SB_MOD_INV,L2 slice B transition from modified to invalid ##730E1 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. 
#72,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_ADDR,L2 Slice B RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ ##711C1 L2 Slice B RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ #73,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_OTHER,L2 Slice B RC load dispatch attempt failed due to other reasons ##731E1 L2 Slice B RC load dispatch attempt failed due to other reasons #74,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_ADDR,L2 Slice B RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ ##712C1 L2 Slice B RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ #75,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_OTHER,L2 Slice B RC store dispatch attempt failed due to other reasons ##732E1 L2 Slice B RC store dispatch attempt failed due to other reasons #76,v,g,n,s,PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL,L2 Slice B RC dispatch attempt failed due to all CO busy ##713C1 L2 Slice B RC dispatch attempt failed due to all CO busy #77,v,g,n,s,PM_L2SB_SHR_INV,L2 slice B transition from shared to invalid ##710C1 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #78,v,g,n,n,PM_L2SB_ST_HIT,L2 slice B store hits ##733E1 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C. #79,v,g,n,s,PM_L2SC_MOD_INV,L2 slice C transition from modified to invalid ##730E2 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. 
#80,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_ADDR,L2 Slice C RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ ##711C2 L2 Slice C RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ #81,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_OTHER,L2 Slice C RC load dispatch attempt failed due to other reasons ##731E2 L2 Slice C RC load dispatch attempt failed due to other reasons #82,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_ADDR,L2 Slice C RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ ##712C2 L2 Slice C RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ #83,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_OTHER,L2 Slice C RC store dispatch attempt failed due to other reasons ##732E2 L2 Slice C RC store dispatch attempt failed due to other reasons #84,v,g,n,s,PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL,L2 Slice C RC dispatch attempt failed due to all CO busy ##713C2 L2 Slice C RC dispatch attempt failed due to all CO busy #85,v,g,n,s,PM_L2SC_SHR_INV,L2 slice C transition from shared to invalid ##710C2 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. 
#86,v,g,n,n,PM_L2SC_ST_HIT,L2 slice C store hits ##733E2 L2 slice C store hits #87,v,g,n,n,PM_L2_PREF,L2 cache prefetches ##C50C3 A request to prefetch data into L2 was made #88,v,g,n,s,PM_L3SA_HIT,L3 slice A hits ##711C3 L3 slice A hits #89,v,g,n,s,PM_L3SA_MOD_INV,L3 slice A transition from modified to invalid ##730E3 L3 slice A transition from modified to invalid #90,v,g,n,s,PM_L3SA_SHR_INV,L3 slice A transition from shared to invalid ##710C3 L3 slice A transition from shared to invalid #91,v,g,n,s,PM_L3SA_SNOOP_RETRY,L3 slice A snoop retries ##731E3 L3 slice A snoop retries #92,v,g,n,s,PM_L3SB_HIT,L3 slice B hits ##711C4 L3 slice B hits #93,v,g,n,s,PM_L3SB_MOD_INV,L3 slice B transition from modified to invalid ##730E4 L3 slice B transition from modified to invalid #94,v,g,n,s,PM_L3SB_SHR_INV,L3 slice B transition from shared to invalid ##710C4 L3 slice B transition from shared to invalid #95,v,g,n,s,PM_L3SB_SNOOP_RETRY,L3 slice B snoop retries ##731E4 L3 slice B snoop retries #96,v,g,n,s,PM_L3SC_HIT,L3 Slice C hits ##711C5 L3 Slice C hits #97,v,g,n,s,PM_L3SC_MOD_INV,L3 slice C transition from modified to invalid ##730E5 L3 slice C transition from modified to invalid #98,v,g,n,s,PM_L3SC_SHR_INV,L3 slice C transition from shared to invalid ##710C5 L3 slice C transition from shared to invalid #99,v,g,n,s,PM_L3SC_SNOOP_RETRY,L3 slice C snoop retries ##731E5 L3 slice C snoop retries #100,v,g,n,n,PM_LD_MISS_L1,L1 D cache load misses ##C1088 Total DL1 Load references that miss the DL1 #101,v,g,n,n,PM_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##C10C2 A load, executing on unit 0, missed the dcache #102,v,g,n,n,PM_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##C10C6 A load, executing on unit 1, missed the dcache #103,v,g,n,n,PM_LD_REF_L1_LSU0,LSU0 L1 D cache load references ##C10C0 A load executed on unit 0 #104,v,g,n,n,PM_LD_REF_L1_LSU1,LSU1 L1 D cache load references ##C10C4 A load executed on unit 1 #105,v,g,n,n,PM_LSU0_LDF,LSU0 executed Floating Point load 
instruction ##C50C0 A floating point load was executed from LSU unit 0 #106,v,g,n,n,PM_LSU0_NCLD,LSU0 non-cacheable loads ##C50C1 LSU0 non-cacheable loads #107,v,g,n,n,PM_LSU1_LDF,LSU1 executed Floating Point load instruction ##C50C4 A floating point load was executed from LSU unit 1 #108,v,g,n,n,PM_LSU1_NCLD,LSU1 non-cacheable loads ##C50C5 LSU1 non-cacheable loads #109,v,g,n,n,PM_LSU_FLUSH,Flush initiated by LSU ##110C5 Flush initiated by LSU #110,v,g,n,s,PM_LSU_FLUSH_SRQ_FULL,Flush caused by SRQ full ##330E0 Flush caused by SRQ full #111,u,g,n,s,PM_LSU_LMQ_FULL_CYC,Cycles LMQ full ##C30E7 The LMQ was full #112,v,g,n,n,PM_LSU_LMQ_LHR_MERGE,LMQ LHR merges ##C70E5 A dcache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry. #113,v,g,n,s,PM_LSU_LMQ_S0_ALLOC,LMQ slot 0 allocated ##C30E6 The first entry in the LMQ was allocated. #114,v,g,n,n,PM_LSU_LMQ_S0_VALID,LMQ slot 0 valid ##C30E5 This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ has eight entries that are allocated FIFO #115,u,g,n,n,PM_LSU_LMQ_SRQ_EMPTY_CYC,Cycles LMQ and SRQ empty ##00015 Cycles when both the LMQ and SRQ are empty (LSU is idle) #116,v,g,n,s,PM_LSU_LRQ_FULL_CYC,Cycles LRQ full ##110C2 The ISU sends this signal when the LRQ is full. #117,u,g,n,n,PM_DC_PREF_STREAM_ALLOC_BLK,D cache out of prefetch streams ##C50C2 D cache out of prefetch streams #118,v,g,n,s,PM_LSU_SRQ_FULL_CYC,Cycles SRQ full ##110C3 The ISU sends this signal when the SRQ is full. #119,u,g,n,n,PM_LSU_SRQ_SYNC_CYC,SRQ sync duration ##830E5 This signal is asserted every cycle when a sync is in the SRQ. 
#120,v,g,n,n,PM_LWSYNC_HELD,LWSYNC held at dispatch ##130E0 LWSYNC held at dispatch #121,v,g,n,s,PM_MEM_LO_PRIO_PW_CMPL,Low priority partial-write completed ##737E6 Low priority partial-write completed #122,v,g,n,s,PM_MEM_LO_PRIO_WR_CMPL,Low priority write completed ##736E6 Low priority write completed #123,v,g,n,s,PM_MEM_PW_CMPL,Memory partial-write completed ##734E6 Memory partial-write completed #124,v,g,n,s,PM_MEM_PW_GATH,Memory partial-write gathered ##714C6 Memory partial-write gathered #125,v,g,n,s,PM_MEM_RQ_DISP_BUSY1to7,Memory read queue dispatched with 1-7 queues busy ##711C6 Memory read queue dispatched with 1-7 queues busy #126,v,g,n,s,PM_MEM_SPEC_RD_CANCEL,Speculative memory read canceled ##712C6 Speculative memory read canceled #127,v,g,n,s,PM_MEM_WQ_DISP_BUSY8to15,Memory write queue dispatched with 8-15 queues busy ##733E6 Memory write queue dispatched with 8-15 queues busy #128,v,g,n,s,PM_MEM_WQ_DISP_DCLAIM,Memory write queue dispatched due to dclaim/flush ##713C6 Memory write queue dispatched due to dclaim/flush #129,v,g,n,n,PM_MRK_DATA_FROM_L25_MOD,Marked data loaded from L2.5 modified ##C70A2 DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a marked demand load #130,v,g,n,n,PM_MRK_DATA_FROM_L275_SHR,Marked data loaded from L2.75 shared ##C7097 DL1 was reloaded with shared (T) data from the L2 of another MCM due to a marked demand load #131,v,g,n,n,PM_MRK_DATA_FROM_L35_MOD,Marked data loaded from L3.5 modified ##C70A6 Marked data loaded from L3.5 modified #132,v,g,n,n,PM_MRK_DATA_FROM_L375_SHR,Marked data loaded from L3.75 shared ##C709E Marked data loaded from L3.75 shared #133,v,g,n,n,PM_MRK_DATA_FROM_LMEM,Marked data loaded from local memory ##C70A0 Marked data loaded from local memory #134,v,g,n,n,PM_MRK_DSLB_MISS,Marked Data SLB misses ##C50C7 Marked Data SLB misses #135,v,g,n,n,PM_MRK_DTLB_MISS,Marked Data TLB misses ##C50C6 Marked Data TLB misses #136,v,g,n,n,PM_MRK_FPU_FIN,Marked instruction FPU processing 
finished ##00014 One of the Floating Point Units finished a marked instruction. Instructions that finish may not necessarily complete #137,v,g,n,n,PM_MRK_INST_FIN,Marked instruction finished ##00005 One of the execution units finished a marked instruction. Instructions that finish may not necessarily complete #138,v,g,n,n,PM_MRK_L1_RELOAD_VALID,Marked L1 reload data source valid ##C70E4 The source information is valid and is for a marked load #139,v,g,n,n,PM_MRK_LSU0_FLUSH_LRQ,LSU0 marked LRQ flushes ##810C2 A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #140,u,g,n,n,PM_MRK_LSU0_FLUSH_SRQ,LSU0 marked SRQ flushes ##810C3 A marked store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. #141,v,g,n,n,PM_MRK_LSU0_FLUSH_UST,LSU0 marked unaligned store flushes ##810C1 A marked store was flushed from unit 0 because it was unaligned #142,v,g,n,n,PM_MRK_LSU0_FLUSH_ULD,LSU0 marked unaligned load flushes ##810C0 A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #143,v,g,n,n,PM_MRK_LSU1_FLUSH_LRQ,LSU1 marked LRQ flushes ##810C6 A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #144,u,g,n,n,PM_MRK_LSU1_FLUSH_SRQ,LSU1 marked SRQ flushes ##810C7 A marked store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. 
#145,v,g,n,n,PM_MRK_LSU1_FLUSH_ULD,LSU1 marked unaligned load flushes ##810C4 A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #146,u,g,n,n,PM_MRK_LSU1_FLUSH_UST,LSU1 marked unaligned store flushes ##810C5 A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #147,v,g,n,n,PM_MRK_LSU_FLUSH_LRQ,Marked LRQ flushes ##81088 A marked load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #148,v,g,n,n,PM_MRK_LSU_FLUSH_UST,Marked unaligned store flushes ##81090 A marked store was flushed because it was unaligned #149,u,g,n,n,PM_MRK_LSU_SRQ_INST_VALID,Marked instruction valid in SRQ ##C70E6 This signal is asserted every cycle when a marked request is resident in the Store Request Queue #150,v,g,n,n,PM_MRK_ST_CMPL_INT,Marked store completed with intervention ##00003 A marked store previously sent to the memory subsystem completed (data home) after requiring intervention #151,v,g,n,n,PM_PMC2_OVERFLOW,PMC2 Overflow ##0000A PMC2 Overflow #152,v,g,n,n,PM_PMC6_OVERFLOW,PMC6 Overflow ##0001A PMC6 Overflow #153,v,g,n,n,PM_PTEG_FROM_L25_MOD,PTEG loaded from L2.5 modified ##830A2 PTEG loaded from L2.5 modified #154,v,g,n,n,PM_PTEG_FROM_L275_SHR,PTEG loaded from L2.75 shared ##83097 PTEG loaded from L2.75 shared #155,v,g,n,n,PM_PTEG_FROM_L35_MOD,PTEG loaded from L3.5 modified ##830A6 PTEG loaded from L3.5 modified #156,v,g,n,n,PM_PTEG_FROM_L375_SHR,PTEG loaded from L3.75 shared ##8309E PTEG loaded from L3.75 shared #157,v,g,n,n,PM_PTEG_FROM_LMEM,PTEG loaded from local memory ##830A0 PTEG loaded from local memory #158,v,g,n,s,PM_SNOOP_PARTIAL_RTRY_QFULL,Snoop partial write retry due to partial-write queues full ##730E6 Snoop partial write retry due to partial-write queues full 
#159,v,g,n,s,PM_SNOOP_PW_RETRY_WQ_PWQ,Snoop partial-write retry due to collision with active write or partial-write queue ##717C6 Snoop partial-write retry due to collision with active write or partial-write queue #160,v,g,n,s,PM_SNOOP_RD_RETRY_WQ,Snoop read retry due to collision with active write queue ##715C6 Snoop read retry due to collision with active write queue #161,v,g,n,s,PM_SNOOP_WR_RETRY_QFULL,Snoop read retry due to read queue full ##710C6 Snoop read retry due to read queue full #162,v,g,n,s,PM_SNOOP_WR_RETRY_WQ,Snoop write/dclaim retry due to collision with active write queue ##716C6 Snoop write/dclaim retry due to collision with active write queue #163,v,g,n,n,PM_STOP_COMPLETION,Completion stopped ##00018 RAS Unit has signaled completion to stop #164,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##C10C3 A store missed the dcache #165,v,g,n,n,PM_ST_REF_L1,L1 D cache store references ##C1090 Total DL1 Store references #166,v,g,n,n,PM_ST_REF_L1_LSU0,LSU0 L1 D cache store references ##C10C1 A store executed on unit 0 #167,v,g,n,n,PM_ST_REF_L1_LSU1,LSU1 L1 D cache store references ##C10C5 A store executed on unit 1 #168,v,g,n,n,PM_SUSPENDED,Suspended ##00000 Suspended #169,v,g,n,s,PM_CLB_EMPTY_CYC,Cycles CLB empty ##410C6 Cycles CLB completely empty #170,v,g,n,s,PM_THRD_L2MISS_BOTH_CYC,Cycles both threads in L2 misses ##410C7 Cycles both threads in L2 misses #171,v,g,n,n,PM_THRD_PRIO_DIFF_0_CYC,Cycles no thread priority difference ##430E3 Cycles no thread priority difference #172,v,g,n,n,PM_THRD_PRIO_DIFF_1or2_CYC,Cycles thread priority difference is 1 or 2 ##430E4 Cycles thread priority difference is 1 or 2 #173,v,g,n,n,PM_THRD_PRIO_DIFF_3or4_CYC,Cycles thread priority difference is 3 or 4 ##430E5 Cycles thread priority difference is 3 or 4 #174,v,g,n,n,PM_THRD_PRIO_DIFF_5or6_CYC,Cycles thread priority difference is 5 or 6 ##430E6 Cycles thread priority difference is 5 or 6 #175,v,g,n,n,PM_THRD_PRIO_DIFF_minus1or2_CYC,Cycles thread priority difference is 
-1 or -2 ##430E2 Cycles thread priority difference is -1 or -2 #176,v,g,n,n,PM_THRD_PRIO_DIFF_minus3or4_CYC,Cycles thread priority difference is -3 or -4 ##430E1 Cycles thread priority difference is -3 or -4 #177,v,g,n,n,PM_THRD_PRIO_DIFF_minus5or6_CYC,Cycles thread priority difference is -5 or -6 ##430E0 Cycles thread priority difference is -5 or -6 #178,v,g,n,s,PM_THRD_SEL_OVER_CLB_EMPTY,Thread selection overrides caused by CLB empty ##410C2 Thread selection overrides caused by CLB empty #179,v,g,n,s,PM_THRD_SEL_OVER_GCT_IMBAL,Thread selection overrides caused by GCT imbalance ##410C4 Thread selection overrides caused by GCT imbalance #180,v,g,n,s,PM_THRD_SEL_OVER_ISU_HOLD,Thread selection overrides caused by ISU holds ##410C5 Thread selection overrides caused by ISU holds #181,v,g,n,s,PM_THRD_SEL_OVER_L2MISS,Thread selection overrides caused by L2 misses ##410C3 Thread selection overrides caused by L2 misses #182,v,g,n,s,PM_THRD_SEL_T0,Decode selected thread 0 ##410C0 Decode selected thread 0 #183,v,g,n,s,PM_THRD_SEL_T1,Decode selected thread 1 ##410C1 Decode selected thread 1 #184,v,g,n,s,PM_THRD_SMT_HANG,SMT hang detected ##330E7 SMT hang detected #185,v,g,t,n,PM_THRESH_TIMEO,Threshold timeout ##0000B The threshold timer expired #186,v,g,n,n,PM_TLBIE_HELD,TLBIE held at dispatch ##130E4 TLBIE held at dispatch #187,v,g,n,n,PM_DATA_FROM_L2MISS,Data loaded missed L2 ##C309B DL1 was reloaded from beyond L2. #188,v,g,n,n,PM_MRK_DATA_FROM_L2MISS,Marked data loaded missed L2 ##C709B DL1 was reloaded from beyond L2 due to a marked demand load. #189,v,g,n,n,PM_PTEG_FROM_L2MISS,PTEG loaded from L2 miss ##8309B PTEG loaded from L2 miss $$$$$$$$ { counter 4 } #0,v,g,n,n,PM_0INST_FETCH,No instructions fetched ##2208D No instructions were fetched this cycle (due to IFU hold, redirect, or icache miss) #1,v,g,n,n,PM_BR_ISSUED,Branches issued ##230E4 This signal will be asserted each time the ISU issues a branch instruction.
This signal will be asserted each time the ISU selects a branch instruction to issue. #2,v,g,n,n,PM_BR_MPRED_CR,Branch mispredictions due to CR bit setting ##230E5 This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction. #3,v,g,n,n,PM_BR_MPRED_TA,Branch mispredictions due to target address ##230E6 branch miss predict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction. #4,v,g,n,n,PM_BR_PRED_CR,A conditional branch was predicted, CR prediction ##230E2 A conditional branch was predicted, CR prediction #5,v,g,n,n,PM_BR_PRED_CR_TA,A conditional branch was predicted, CR and target prediction ##23087 A conditional branch was predicted, CR and target prediction #6,v,g,n,n,PM_BR_PRED_TA,A conditional branch was predicted, target prediction ##230E3 A conditional branch was predicted, target prediction #7,v,g,n,n,PM_CMPLU_STALL_DIV,Completion stall caused by DIV instruction ##11099 Completion stall caused by DIV instruction #8,v,g,n,n,PM_CMPLU_STALL_ERAT_MISS,Completion stall caused by ERAT miss ##1109B Completion stall caused by ERAT miss #9,v,g,n,n,PM_CMPLU_STALL_FPU,Completion stall caused by FPU instruction ##11098 Completion stall caused by FPU instruction #10,v,g,n,n,PM_CMPLU_STALL_REJECT,Completion stall caused by reject ##1109A Completion stall caused by reject #11,u,g,n,s,PM_CRQ_FULL_CYC,Cycles CR issue queue full ##110C1 The ISU sends a signal indicating that the issue queue that feeds the ifu cr unit cannot accept any more group (queue is full of groups). 
#12,v,g,n,s,PM_CYC,Processor cycles ##0000F Processor cycles #13,v,g,n,n,PM_DATA_FROM_L275_MOD,Data loaded from L2.75 modified ##C3097 DL1 was reloaded with modified (M) data from the L2 of another MCM due to a demand load. #14,v,g,n,n,PM_DATA_FROM_L375_MOD,Data loaded from L3.75 modified ##C309E Data loaded from L3.75 modified #15,v,g,n,n,PM_DATA_FROM_RMEM,Data loaded from remote memory ##C3087 Data loaded from remote memory #16,u,g,n,s,PM_DC_INV_L2,L1 D cache entries invalidated from L2 ##C10C7 A dcache invalidated was received from the L2 because a line in L2 was castout. #17,v,g,n,n,PM_DC_PREF_DST,DST (Data Stream Touch) stream start ##830E6 DST (Data Stream Touch) stream start #18,v,g,n,n,PM_DC_PREF_STREAM_ALLOC,D cache new prefetch stream allocated ##830E7 A new Prefetch Stream was allocated #19,v,g,n,n,PM_EE_OFF,Cycles MSR(EE) bit off ##130E3 The number of Cycles MSR(EE) bit was off. #20,u,g,n,n,PM_EE_OFF_EXT_INT,Cycles MSR(EE) bit off and external interrupt pending ##130E7 Cycles MSR(EE) bit off and external interrupt pending #21,v,g,n,n,PM_EXT_INT,External interrupts ##00003 An external interrupt occurred #22,v,g,n,n,PM_FAB_CMD_RETRIED,Fabric command retried ##710C7 Fabric command retried #23,v,g,n,s,PM_FAB_DCLAIM_RETRIED,dclaim retried ##730E7 dclaim retried #24,v,g,n,s,PM_FAB_M1toVNorNN_SIDECAR_EMPTY,M1 to VN/NN sidecar empty ##712C7 M1 to VN/NN sidecar empty #25,v,g,n,s,PM_FAB_P1toVNorNN_SIDECAR_EMPTY,P1 to VN/NN sidecar empty ##711C7 P1 to VN/NN sidecar empty #26,v,g,n,s,PM_FAB_PNtoNN_SIDECAR,PN to NN beat went to sidecar first ##713C7 PN to NN beat went to sidecar first #27,v,g,n,s,PM_FAB_PNtoVN_SIDECAR,PN to VN beat went to sidecar first ##733E7 PN to VN beat went to sidecar first #28,v,g,n,s,PM_FAB_VBYPASS_EMPTY,Vertical bypass buffer empty ##731E7 Vertical bypass buffer empty #29,v,g,n,n,PM_FLUSH_BR_MPRED,Flush caused by branch mispredict ##110C6 Flush caused by branch mispredict #30,v,g,n,s,PM_FLUSH_IMBAL,Flush caused by thread GCT imbalance 
##330E3 Flush caused by thread GCT imbalance #31,v,g,n,n,PM_FLUSH,Flushes ##110C7 Flushes #32,v,g,n,s,PM_FLUSH_SB,Flush caused by scoreboard operation ##330E2 Flush caused by scoreboard operation #33,v,g,n,s,PM_FLUSH_SYNC,Flush caused by sync ##330E1 Flush caused by sync #34,v,g,n,n,PM_FPU0_FEST,FPU0 executed FEST instruction ##010C2 This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #35,v,g,n,n,PM_FPU0_FIN,FPU0 produced a result ##010C3 fp0 finished, produced a result This only indicates finish, not completion. #36,v,g,n,n,PM_FPU0_FMOV_FEST,FPU0 executed FMOV or FEST instructions ##010C0 This signal is active for one cycle when fp0 is executing a move kind of instruction or one of the estimate instructions.. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ #37,v,g,n,n,PM_FPU0_FPSCR,FPU0 executed FPSCR instruction ##030E0 This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*. mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs #38,v,g,n,n,PM_FPU0_FRSP_FCONV,FPU0 executed FRSP or FCONV instructions ##010C1 This signal is active for one cycle when fp0 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #39,v,g,n,n,PM_FPU1_FEST,FPU1 executed FEST instruction ##010C6 This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #40,v,g,n,n,PM_FPU1_FIN,FPU1 produced a result ##010C7 fp1 finished, produced a result. This only indicates finish, not completion. #41,v,g,n,n,PM_FPU1_FMOV_FEST,FPU1 executing FMOV or FEST instructions ##010C4 This signal is active for one cycle when fp1 is executing a move kind of instruction or one of the estimate instructions.. 
This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ #42,v,g,n,n,PM_FPU1_FRSP_FCONV,FPU1 executed FRSP or FCONV instructions ##010C5 This signal is active for one cycle when fp1 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #43,v,g,n,n,PM_FPU_FEST,FPU executed FEST instruction ##01090 This signal is active for one cycle when executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1. #44,v,g,n,n,PM_FPU_FIN,FPU produced a result ##01088 FPU finished, produced a result This only indicates finish, not completion. Combined Unit 0 + Unit 1 #45,v,g,n,s,PM_FXLS0_FULL_CYC,Cycles FXU0/LS0 queue full ##110C0 The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped #46,v,g,n,s,PM_FXLS1_FULL_CYC,Cycles FXU1/LS1 queue full ##110C4 The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped #47,c,g,n,n,PM_FXLS_FULL_CYC,Cycles FXLS queue is full ##11090 Cycles when one or both FXU/LSU issue queue are full #48,v,g,n,n,PM_FXU0_FIN,FXU0 produced a result ##130E2 The Fixed Point unit 0 finished an instruction and produced a result #49,u,g,n,n,PM_FXU1_BUSY_FXU0_IDLE,FXU1 busy FXU0 idle ##00012 FXU0 was idle while FXU1 was busy #50,v,g,n,n,PM_FXU1_FIN,FXU1 produced a result ##130E6 The Fixed Point unit 1 finished an instruction and produced a result #51,v,g,n,n,PM_GCT_NOSLOT_BR_MPRED,No slot in GCT caused by branch mispredict ##1009C This thread has no slot in the GCT because of branch mispredict #52,v,g,n,n,PM_GCT_FULL_CYC,Cycles GCT full ##0001F The ISU sends a signal indicating the gct is full. #53,v,g,n,s,PM_GPR_MAP_FULL_CYC,Cycles GPR mapper full ##130E5 The ISU sends a signal indicating that the gpr mapper cannot accept any more groups. Dispatch is stopped. 
Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #54,v,g,n,n,PM_GRP_DISP_BLK_SB_CYC,Cycles group dispatch blocked by scoreboard ##130E1 The ISU sends a signal indicating that dispatch is blocked by scoreboard. #55,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected ##00002 A group that previously attempted dispatch was rejected. #56,v,g,n,n,PM_IC_DEMAND_L2_BHT_REDIRECT,L2 I cache demand request due to BHT redirect ##230E0 L2 I cache demand request due to BHT redirect #57,v,g,n,n,PM_IC_DEMAND_L2_BR_REDIRECT,L2 I cache demand request due to branch redirect ##230E1 L2 I cache demand request due to branch redirect #58,v,g,n,n,PM_IC_PREF_INSTALL,Instruction prefetched installed in prefetch ##210C7 New line coming into the prefetch buffer #59,v,g,n,n,PM_INST_CMPL,Instructions completed ##00001 Number of Eligible Instructions that completed. #60,v,g,n,n,PM_INST_DISP,Instructions dispatched ##00009 The ISU sends the number of instructions dispatched. #61,v,g,n,n,PM_INST_FROM_L275_MOD,Instruction fetched from L2.75 modified ##22096 Instruction fetched from L2.75 modified #62,v,g,n,n,PM_INST_FROM_L375_MOD,Instruction fetched from L3.75 modified ##2209D Instruction fetched from L3.75 modified #63,v,g,n,n,PM_INST_FROM_RMEM,Instruction fetched from remote memory ##22086 Instruction fetched from remote memory #64,v,g,n,n,PM_L1_DCACHE_RELOAD_VALID,L1 reload data source valid ##C30E4 The data source information is valid #65,v,g,n,n,PM_L1_PREF,L1 cache data prefetches ##C70E7 A request to prefetch data into the L1 was made #66,v,g,n,n,PM_L1_WRITE_CYC,Cycles writing to instruction L1 ##230E7 This signal is asserted each cycle a cache write is active. #67,v,g,n,s,PM_L2SA_MOD_INV,L2 slice A transition from modified to invalid ##730E0 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. 
This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #68,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_ADDR,L2 Slice A RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ ##711C0 L2 Slice A RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ #69,v,g,n,s,PM_L2SA_RCLD_DISP_FAIL_OTHER,L2 Slice A RC load dispatch attempt failed due to other reasons ##731E0 L2 Slice A RC load dispatch attempt failed due to other reasons #70,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_ADDR,L2 Slice A RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ ##712C0 L2 Slice A RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ #71,v,g,n,s,PM_L2SA_RCST_DISP_FAIL_OTHER,L2 Slice A RC store dispatch attempt failed due to other reasons ##732E0 L2 Slice A RC store dispatch attempt failed due to other reasons #72,v,g,n,s,PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL,L2 Slice A RC dispatch attempt failed due to all CO busy ##713C0 L2 Slice A RC dispatch attempt failed due to all CO busy #73,v,g,n,s,PM_L2SA_SHR_INV,L2 slice A transition from shared to invalid ##710C0 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #74,v,g,n,n,PM_L2SA_ST_HIT,L2 slice A store hits ##733E0 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C. #75,v,g,n,s,PM_L2SB_MOD_INV,L2 slice B transition from modified to invalid ##730E1 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. 
This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #76,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_ADDR,L2 Slice B RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ ##711C1 L2 Slice B RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ #77,v,g,n,s,PM_L2SB_RCLD_DISP_FAIL_OTHER,L2 Slice B RC load dispatch attempt failed due to other reasons ##731E1 L2 Slice B RC load dispatch attempt failed due to other reasons #78,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_ADDR,L2 Slice B RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ ##712C1 L2 Slice B RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ #79,v,g,n,s,PM_L2SB_RCST_DISP_FAIL_OTHER,L2 Slice B RC store dispatch attempt failed due to other reasons ##732E1 L2 Slice B RC store dispatch attempt failed due to other reasons #80,v,g,n,s,PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL,L2 Slice B RC dispatch attempt failed due to all CO busy ##713C1 L2 Slice B RC dispatch attempt failed due to all CO busy #81,v,g,n,s,PM_L2SB_SHR_INV,L2 slice B transition from shared to invalid ##710C1 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. #82,v,g,n,n,PM_L2SB_ST_HIT,L2 slice B store hits ##733E1 A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C. #83,v,g,n,s,PM_L2SC_MOD_INV,L2 slice C transition from modified to invalid ##730E2 A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. 
This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C. #84,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_ADDR,L2 Slice C RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ ##711C2 L2 Slice C RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ #85,v,g,n,s,PM_L2SC_RCLD_DISP_FAIL_OTHER,L2 Slice C RC load dispatch attempt failed due to other reasons ##731E2 L2 Slice C RC load dispatch attempt failed due to other reasons #86,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_ADDR,L2 Slice C RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ ##712C2 L2 Slice C RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ #87,v,g,n,s,PM_L2SC_RCST_DISP_FAIL_OTHER,L2 Slice C RC store dispatch attempt failed due to other reasons ##732E2 L2 Slice C RC store dispatch attempt failed due to other reasons #88,v,g,n,s,PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL,L2 Slice C RC dispatch attempt failed due to all CO busy ##713C2 L2 Slice C RC dispatch attempt failed due to all CO busy #89,v,g,n,s,PM_L2SC_SHR_INV,L2 slice C transition from shared to invalid ##710C2 A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted. 
#90,v,g,n,n,PM_L2SC_ST_HIT,L2 slice C store hits ##733E2 L2 slice C store hits #91,v,g,n,n,PM_L2_PREF,L2 cache prefetches ##C50C3 A request to prefetch data into L2 was made #92,v,g,n,s,PM_L3SA_HIT,L3 slice A hits ##711C3 L3 slice A hits #93,v,g,n,s,PM_L3SA_MOD_INV,L3 slice A transition from modified to invalid ##730E3 L3 slice A transition from modified to invalid #94,v,g,n,s,PM_L3SA_SHR_INV,L3 slice A transition from shared to invalid ##710C3 L3 slice A transition from shared to invalid #95,v,g,n,s,PM_L3SA_SNOOP_RETRY,L3 slice A snoop retries ##731E3 L3 slice A snoop retries #96,v,g,n,s,PM_L3SB_HIT,L3 slice B hits ##711C4 L3 slice B hits #97,v,g,n,s,PM_L3SB_MOD_INV,L3 slice B transition from modified to invalid ##730E4 L3 slice B transition from modified to invalid #98,v,g,n,s,PM_L3SB_SHR_INV,L3 slice B transition from shared to invalid ##710C4 L3 slice B transition from shared to invalid #99,v,g,n,s,PM_L3SB_SNOOP_RETRY,L3 slice B snoop retries ##731E4 L3 slice B snoop retries #100,v,g,n,s,PM_L3SC_HIT,L3 Slice C hits ##711C5 L3 Slice C hits #101,v,g,n,s,PM_L3SC_MOD_INV,L3 slice C transition from modified to invalid ##730E5 L3 slice C transition from modified to invalid #102,v,g,n,s,PM_L3SC_SHR_INV,L3 slice C transition from shared to invalid ##710C5 L3 slice C transition from shared to invalid #103,v,g,n,s,PM_L3SC_SNOOP_RETRY,L3 slice C snoop retries ##731E5 L3 slice C snoop retries #104,v,g,n,n,PM_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##C10C2 A load, executing on unit 0, missed the dcache #105,v,g,n,n,PM_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##C10C6 A load, executing on unit 1, missed the dcache #106,v,g,n,n,PM_LD_REF_L1,L1 D cache load references ##C1090 Total DL1 Load references #107,v,g,n,n,PM_LD_REF_L1_LSU0,LSU0 L1 D cache load references ##C10C0 A load executed on unit 0 #108,v,g,n,n,PM_LD_REF_L1_LSU1,LSU1 L1 D cache load references ##C10C4 A load executed on unit 1 #109,v,g,n,n,PM_LSU0_LDF,LSU0 executed Floating Point load instruction ##C50C0 A 
floating point load was executed from LSU unit 0 #110,v,g,n,n,PM_LSU0_NCLD,LSU0 non-cacheable loads ##C50C1 LSU0 non-cacheable loads #111,v,g,n,n,PM_LSU1_LDF,LSU1 executed Floating Point load instruction ##C50C4 A floating point load was executed from LSU unit 1 #112,v,g,n,n,PM_LSU1_NCLD,LSU1 non-cacheable loads ##C50C5 LSU1 non-cacheable loads #113,v,g,n,n,PM_LSU_FLUSH,Flush initiated by LSU ##110C5 Flush initiated by LSU #114,v,g,n,s,PM_LSU_FLUSH_SRQ_FULL,Flush caused by SRQ full ##330E0 Flush caused by SRQ full #115,v,g,n,n,PM_LSU_LDF,LSU executed Floating Point load instruction ##C5090 LSU executed Floating Point load instruction #116,u,g,n,s,PM_LSU_LMQ_FULL_CYC,Cycles LMQ full ##C30E7 The LMQ was full #117,v,g,n,n,PM_LSU_LMQ_LHR_MERGE,LMQ LHR merges ##C70E5 A dcache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry. #118,v,g,n,s,PM_LSU_LMQ_S0_ALLOC,LMQ slot 0 allocated ##C30E6 The first entry in the LMQ was allocated. #119,v,g,n,n,PM_LSU_LMQ_S0_VALID,LMQ slot 0 valid ##C30E5 This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ has eight entries that are allocated FIFO #120,v,g,n,s,PM_LSU_LRQ_FULL_CYC,Cycles LRQ full ##110C2 The ISU sends this signal when the LRQ is full. #121,u,g,n,n,PM_DC_PREF_STREAM_ALLOC_BLK,D cache out of prefetch streams ##C50C2 D cache out of prefetch streams #122,u,g,n,n,PM_LSU_SRQ_EMPTY_CYC,Cycles SRQ empty ##00015 The Store Request Queue is empty #123,v,g,n,s,PM_LSU_SRQ_FULL_CYC,Cycles SRQ full ##110C3 The ISU sends this signal when the srq is full. #124,u,g,n,n,PM_LSU_SRQ_SYNC_CYC,SRQ sync duration ##830E5 This signal is asserted every cycle when a sync is in the SRQ.
#125,v,g,n,n,PM_LWSYNC_HELD,LWSYNC held at dispatch ##130E0 LWSYNC held at dispatch #126,v,g,n,s,PM_MEM_LO_PRIO_PW_CMPL,Low priority partial-write completed ##737E6 Low priority partial-write completed #127,v,g,n,s,PM_MEM_LO_PRIO_WR_CMPL,Low priority write completed ##736E6 Low priority write completed #128,v,g,n,s,PM_MEM_PW_CMPL,Memory partial-write completed ##734E6 Memory partial-write completed #129,v,g,n,s,PM_MEM_PW_GATH,Memory partial-write gathered ##714C6 Memory partial-write gathered #130,v,g,n,s,PM_MEM_RQ_DISP_BUSY1to7,Memory read queue dispatched with 1-7 queues busy ##711C6 Memory read queue dispatched with 1-7 queues busy #131,v,g,n,s,PM_MEM_SPEC_RD_CANCEL,Speculative memory read canceled ##712C6 Speculative memory read canceled #132,v,g,n,s,PM_MEM_WQ_DISP_BUSY8to15,Memory write queue dispatched with 8-15 queues busy ##733E6 Memory write queue dispatched with 8-15 queues busy #133,v,g,n,s,PM_MEM_WQ_DISP_DCLAIM,Memory write queue dispatched due to dclaim/flush ##713C6 Memory write queue dispatched due to dclaim/flush #134,v,g,n,n,PM_MRK_CRU_FIN,Marked instruction CRU processing finished ##00005 The Condition Register Unit finished a marked instruction. Instructions that finish may not necessarily complete #135,v,g,n,n,PM_MRK_DATA_FROM_L25_MOD_CYC,Marked load latency from L2.5 modified ##C70A2 Marked load latency from L2.5 modified #136,v,g,n,n,PM_MRK_DATA_FROM_L275_MOD,Marked data loaded from L2.75 modified ##C7097 DL1 was reloaded with modified (M) data from the L2 of another MCM due to a marked demand load.
#137,v,g,n,n,PM_MRK_DATA_FROM_L275_MOD_CYC,Marked load latency from L2.75 modified ##C70A3 Marked load latency from L2.75 modified #138,v,g,n,n,PM_MRK_DATA_FROM_L35_MOD_CYC,Marked load latency from L3.5 modified ##C70A6 Marked load latency from L3.5 modified #139,v,g,n,n,PM_MRK_DATA_FROM_L375_MOD,Marked data loaded from L3.75 modified ##C709E Marked data loaded from L3.75 modified #140,v,g,n,n,PM_MRK_DATA_FROM_L375_MOD_CYC,Marked load latency from L3.75 modified ##C70A7 Marked load latency from L3.75 modified #141,v,g,n,n,PM_MRK_DATA_FROM_LMEM_CYC,Marked load latency from local memory ##C70A0 Marked load latency from local memory #142,v,g,n,n,PM_MRK_DATA_FROM_RMEM,Marked data loaded from remote memory ##C7087 Marked data loaded from remote memory #143,v,g,n,n,PM_MRK_DATA_FROM_RMEM_CYC,Marked load latency from remote memory ##C70A1 Marked load latency from remote memory #144,v,g,n,n,PM_MRK_DSLB_MISS,Marked Data SLB misses ##C50C7 Marked Data SLB misses #145,v,g,n,n,PM_MRK_DTLB_MISS,Marked Data TLB misses ##C50C6 Marked Data TLB misses #146,v,g,n,n,PM_MRK_GRP_CMPL,Marked group completed ##00013 A group containing a sampled instruction completed. Microcoded instructions that span multiple groups will generate this event once per group. #147,v,g,n,n,PM_MRK_GRP_IC_MISS,Group experienced marked I cache miss ##12091 Group experienced marked I cache miss #148,v,g,n,n,PM_MRK_GRP_TIMEO,Marked group completion timeout ##0000B The sampling timeout expired indicating that the previously sampled instruction is no longer in the processor #149,v,g,n,n,PM_MRK_L1_RELOAD_VALID,Marked L1 reload data source valid ##C70E4 The source information is valid and is for a marked load #150,v,g,n,n,PM_MRK_LSU0_FLUSH_LRQ,LSU0 marked LRQ flushes ##810C2 A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. 
#151,u,g,n,n,PM_MRK_LSU0_FLUSH_SRQ,LSU0 marked SRQ flushes ##810C3 A marked store was flushed because a younger load hit an older store that was already in the SRQ or in the same group. #152,v,g,n,n,PM_MRK_LSU0_FLUSH_UST,LSU0 marked unaligned store flushes ##810C1 A marked store was flushed from unit 0 because it was unaligned #153,v,g,n,n,PM_MRK_LSU0_FLUSH_ULD,LSU0 marked unaligned load flushes ##810C0 A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #154,v,g,n,n,PM_MRK_LSU1_FLUSH_LRQ,LSU1 marked LRQ flushes ##810C6 A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #155,u,g,n,n,PM_MRK_LSU1_FLUSH_SRQ,LSU1 marked SRQ flushes ##810C7 A marked store was flushed because a younger load hit an older store that was already in the SRQ or in the same group. #156,v,g,n,n,PM_MRK_LSU1_FLUSH_ULD,LSU1 marked unaligned load flushes ##810C4 A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #157,u,g,n,n,PM_MRK_LSU1_FLUSH_UST,LSU1 marked unaligned store flushes ##810C5 A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #158,c,g,n,n,PM_MRK_LSU_FIN,Marked instruction LSU processing finished ##00014 One of the Load/Store Units finished a marked instruction. Instructions that finish may not necessarily complete #159,v,g,n,n,PM_MRK_LSU_FLUSH_SRQ,Marked SRQ flushes ##81088 A marked store was flushed because a younger load hit an older store that was already in the SRQ or in the same group.
#160,v,g,n,n,PM_MRK_LSU_FLUSH_ULD,Marked unaligned load flushes ##81090 A marked load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #161,u,g,n,n,PM_MRK_LSU_SRQ_INST_VALID,Marked instruction valid in SRQ ##C70E6 This signal is asserted every cycle when a marked request is resident in the Store Request Queue #162,v,g,n,n,PM_PMC3_OVERFLOW,PMC3 Overflow ##0000A PMC3 Overflow #163,v,g,n,n,PM_PTEG_FROM_L275_MOD,PTEG loaded from L2.75 modified ##83097 PTEG loaded from L2.75 modified #164,v,g,n,n,PM_PTEG_FROM_L375_MOD,PTEG loaded from L3.75 modified ##8309E PTEG loaded from L3.75 modified #165,v,g,n,n,PM_PTEG_FROM_RMEM,PTEG loaded from remote memory ##83087 PTEG loaded from remote memory #166,v,g,n,s,PM_SNOOP_PARTIAL_RTRY_QFULL,Snoop partial write retry due to partial-write queues full ##730E6 Snoop partial write retry due to partial-write queues full #167,v,g,n,s,PM_SNOOP_PW_RETRY_WQ_PWQ,Snoop partial-write retry due to collision with active write or partial-write queue ##717C6 Snoop partial-write retry due to collision with active write or partial-write queue #168,v,g,n,s,PM_SNOOP_RD_RETRY_WQ,Snoop read retry due to collision with active write queue ##715C6 Snoop read retry due to collision with active write queue #169,v,g,n,s,PM_SNOOP_WR_RETRY_QFULL,Snoop read retry due to read queue full ##710C6 Snoop read retry due to read queue full #170,v,g,n,s,PM_SNOOP_WR_RETRY_WQ,Snoop write/dclaim retry due to collision with active write queue ##716C6 Snoop write/dclaim retry due to collision with active write queue #171,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##C10C3 A store missed the dcache #172,v,g,n,n,PM_ST_REF_L1_LSU0,LSU0 L1 D cache store references ##C10C1 A store executed on unit 0 #173,v,g,n,n,PM_ST_REF_L1_LSU1,LSU1 L1 D cache store references ##C10C5 A store executed on unit 1 #174,v,g,n,n,PM_SUSPENDED,Suspended ##00000 Suspended #175,v,g,n,s,PM_CLB_EMPTY_CYC,Cycles CLB empty ##410C6 Cycles CLB completely empty 
#176,v,g,n,s,PM_THRD_L2MISS_BOTH_CYC,Cycles both threads in L2 misses ##41084,410C7 Cycles both threads in L2 misses #177,v,g,n,n,PM_THRD_PRIO_DIFF_0_CYC,Cycles no thread priority difference ##430E3 Cycles no thread priority difference #178,v,g,n,n,PM_THRD_PRIO_DIFF_1or2_CYC,Cycles thread priority difference is 1 or 2 ##430E4 Cycles thread priority difference is 1 or 2 #179,v,g,n,n,PM_THRD_PRIO_DIFF_3or4_CYC,Cycles thread priority difference is 3 or 4 ##430E5 Cycles thread priority difference is 3 or 4 #180,v,g,n,n,PM_THRD_PRIO_DIFF_5or6_CYC,Cycles thread priority difference is 5 or 6 ##430E6 Cycles thread priority difference is 5 or 6 #181,v,g,n,n,PM_THRD_PRIO_DIFF_minus1or2_CYC,Cycles thread priority difference is -1 or -2 ##430E2 Cycles thread priority difference is -1 or -2 #182,v,g,n,n,PM_THRD_PRIO_DIFF_minus3or4_CYC,Cycles thread priority difference is -3 or -4 ##430E1 Cycles thread priority difference is -3 or -4 #183,v,g,n,n,PM_THRD_PRIO_DIFF_minus5or6_CYC,Cycles thread priority difference is -5 or -6 ##430E0 Cycles thread priority difference is -5 or -6 #184,v,g,n,s,PM_THRD_SEL_OVER_CLB_EMPTY,Thread selection overides caused by CLB empty ##410C2 Thread selection overides caused by CLB empty #185,v,g,n,s,PM_THRD_SEL_OVER_GCT_IMBAL,Thread selection overides caused by GCT imbalance ##410C4 Thread selection overides caused by GCT imbalance #186,v,g,n,s,PM_THRD_SEL_OVER_ISU_HOLD,Thread selection overides caused by ISU holds ##410C5 Thread selection overides caused by ISU holds #187,v,g,n,s,PM_THRD_SEL_OVER_L2MISS,Thread selection overides caused by L2 misses ##410C3 Thread selection overides caused by L2 misses #188,v,g,n,s,PM_THRD_SEL_T0,Decode selected thread 0 ##410C0 Decode selected thread 0 #189,v,g,n,s,PM_THRD_SEL_T1,Decode selected thread 1 ##410C1 Decode selected thread 1 #190,v,g,n,s,PM_THRD_SMT_HANG,SMT hang detected ##330E7 SMT hang detected #191,v,g,n,n,PM_TLBIE_HELD,TLBIE held at dispatch ##130E4 TLBIE held at dispatch 
#192,v,g,n,n,PM_WORK_HELD,Work held ##0000C RAS Unit has signaled completion to stop and there are groups waiting to complete $$$$$$$$ { counter 5 } #0,v,g,n,n,PM_INST_CMPL,Instructions completed ##00009 Number of PPC instructions completed. $$$$$$$$ { counter 6 } #0,v,g,n,n,PM_RUN_CYC,Run cycles ##00005 Processor Cycles gated by the run latch papi-papi-7-2-0-t/src/event_data/power5/groups000066400000000000000000000560221502707512200212310ustar00rootroot00000000000000{ **************************** { THIS IS OPEN SOURCE CODE { **************************** { (C) COPYRIGHT International Business Machines Corp. 2005 { This file is licensed under the University of Tennessee license. { See LICENSE.txt. { { File: events/power5/groups { Author: Maynard Johnson { maynardj@us.ibm.com { Mods: { { Number of groups 145 { Group descriptions #0,190,71,56,12,0,0,pm_utilization,CPI and utilization data ##00005,00001,00009,0000F,00009,00005 00000000,00000000,0A02121E,00000000 CPI and utilization data #1,2,195,49,12,0,0,pm_completion,Completion and cycle counts ##00013,00004,00013,0000F,00009,00005 00000000,00000000,2608261E,00000000 Completion and cycle counts #2,66,65,50,60,0,0,pm_group_dispatch,Group dispatch events ##120E3,120E4,130E1,00009,00009,00005 00000000,4000000E,C6C8C212,00000000 Group dispatch events #3,0,2,169,138,0,0,pm_clb1,CLB fullness ##400C0,400C2,410C6,C70A6,00009,00005 00000000,015B0001,80848C4C,00000001 CLB fullness #4,6,6,149,59,0,0,pm_clb2,CLB fullness ##400C5,400C6,C70E6,00001,00009,00005 00000000,01430002,8A8CCC02,00000001 CLB fullness #5,60,59,46,51,0,0,pm_gct_empty,GCT empty reasons ##00004,1009C,10084,1009C,00009,00005 00000000,40000000,08380838,00000000 GCT empty reasons #6,62,61,47,52,0,0,pm_gct_usage,GCT Usage ##0001F,0001F,0001F,0001F,00009,00005 00000000,00000000,3E3E3E3E,00000000 GCT Usage #7,143,143,113,119,0,0,pm_lsu1,LSU LRQ and LMQ events ##C20E6,C20E2,C30E6,C30E5,00009,00005 00000000,000F000F,CCC4CCCA,00000000 LSU LRQ and LMQ events 
#8,147,147,119,123,0,0,pm_lsu2,LSU SRQ events ##C20E5,C20E1,830E5,110C3,00009,00005 00000000,400E000E,CAC2CA86,00000000 LSU SRQ events #9,149,141,112,122,0,0,pm_lsu3,LSU SRQ and LMQ events ##C2088,00015,C70E5,00015,00009,00005 00000000,010F000A,102ACA2A,00000000 LSU SRQ and LMQ events #10,212,73,117,18,0,0,pm_prefetch1,Prefetch stream allocation ##2209B,220E4,C50C2,830E7,00009,00005 00000000,8432000D,36C884CE,00000000 Prefetch stream allocation #11,73,9,61,58,0,0,pm_prefetch2,Prefetch events ##00001,220E5,C70E7,210C7,00009,00005 00000000,81030006,02CACE8E,00000001 Prefetch events #12,139,1,87,59,0,0,pm_prefetch3,L2 prefetch and misc events ##C2090,400C1,C50C3,00001,00009,00005 00000000,047C0008,20828602,00000001 L2 prefetch and misc events #13,126,135,13,91,0,0,pm_prefetch4,Misc prefetch and reject events ##C60E0,C60E4,830E6,C50C3,00009,00005 00000000,063E000E,C0C8CC86,00000000 Misc prefetch and reject events #14,145,144,25,159,0,0,pm_lsu_reject1,LSU reject events ##C6090,C6088,330E3,81088,00009,00005 00000000,C22C000E,2010C610,00000001 LSU reject events #15,125,134,55,66,0,0,pm_lsu_reject2,LSU rejects due to reload CDF or tag update collision ##C60E2,C60E6,00001,230E7,00009,00005 00000000,820C000D,C4CC02CE,00000001 LSU rejects due to reload CDF or tag update collision #16,123,132,120,191,0,0,pm_lsu_reject3,LSU rejects due to ERAT, held instuctions ##C60E3,C60E7,130E0,130E4,00009,00005 00000000,420C000F,C6CEC0C8,00000000 LSU rejects due to ERAT, held instuctions #17,124,133,55,1,0,0,pm_lsu_reject4,LSU0/1 reject LMQ full ##C60E1,C60E5,00001,230E4,00009,00005 00000000,820C000D,C2CA02C8,00000001 LSU0/1 reject LMQ full #18,146,145,109,31,0,0,pm_lsu_reject5,LSU misc reject and flush events ##C6088,C6090,110C5,110C7,00009,00005 00000000,420C000C,10208A8E,00000000 LSU misc reject and flush events #19,73,140,25,16,0,0,pm_flush1,Misc flush events ##00001,C0088,330E3,C10C7,00009,00005 00000000,C0F00002,0210C68E,00000001 Misc flush events #20,81,71,27,33,0,0,pm_flush2,Flushes 
due to scoreboard and sync ##800C0,00001,330E2,330E1,00009,00005 00000000,C0800003,8002C4C2,00000001 Flushes due to scoreboard and sync #21,141,138,55,113,0,0,pm_lsu_flush_srq_lrq,LSU flush by SRQ and LRQ events ##C0090,C0090,00001,110C5,00009,00005 00000000,40C00000,2020028A,00000001 LSU flush by SRQ and LRQ events #22,119,128,109,59,0,0,pm_lsu_flush_lrq,LSU0/1 flush due to LRQ ##C00C2,C00C6,110C5,00001,00009,00005 00000000,40C00000,848C8A02,00000001 LSU0/1 flush due to LRQ #23,120,129,55,113,0,0,pm_lsu_flush_srq,LSU0/1 flush due to SRQ ##C00C3,C00C7,00001,110C5,00009,00005 00000000,40C00000,868E028A,00000001 LSU0/1 flush due to SRQ #24,142,140,0,59,0,0,pm_lsu_flush_unaligned,LSU flush due to unaligned data ##C0088,C0088,230E4,00001,00009,00005 00000000,80C00002,1010C802,00000001 LSU flush due to unaligned data #25,121,130,109,59,0,0,pm_lsu_flush_uld,LSU0/1 flush due to unaligned load ##C00C0,C00C4,110C5,00001,00009,00005 00000000,40C00000,80888A02,00000001 LSU0/1 flush due to unaligned load #26,122,131,55,113,0,0,pm_lsu_flush_ust,LSU0/1 flush due to unaligned store ##C00C1,C00C5,00001,110C5,00009,00005 00000000,40C00000,828A028A,00000001 LSU0/1 flush due to unaligned store #27,140,71,147,114,0,0,pm_lsu_flush_full,LSU flush due to LRQ/SRQ full ##320E7,00001,81088,330E0,00009,00005 00000000,C0200009,CE0210C0,00000001 LSU flush due to LRQ/SRQ full #28,70,13,55,10,0,0,pm_lsu_stall1,LSU Stalls ##00014,11098,00001,1109A,00009,00005 00000000,40000000,28300234,00000001 LSU Stalls #29,73,10,6,8,0,0,pm_lsu_stall2,LSU Stalls ##00001,1109A,0000F,1109B,00009,00005 00000000,40000000,02341E36,00000001 LSU Stalls #30,68,12,55,7,0,0,pm_fxu_stall,FXU Stalls ##12091,11099,00001,11099,00009,00005 00000000,40000008,22320232,00000001 FXU Stalls #31,57,11,55,9,0,0,pm_fpu_stall,FPU Stalls ##10090,1109B,00001,11098,00009,00005 00000000,40000000,20360230,00000001 FPU Stalls #32,115,7,116,116,0,0,pm_queue_full,BRQ LRQ LMQ queue full ##820E7,100C5,110C2,C30E7,00009,00005 
00000000,400B0009,CE8A84CE,00000000 BRQ LRQ LMQ queue full #33,41,49,40,46,0,0,pm_issueq_full,FPU FX full ##100C3,100C7,110C0,110C4,00009,00005 00000000,40000000,868E8088,00000000 FPU FX full #34,11,114,48,11,0,0,pm_mapper_full1,CR CTR GPR mapper full ##100C4,100C6,130E5,110C1,00009,00005 00000000,40000002,888CCA82,00000000 CR CTR GPR mapper full #35,35,204,188,59,0,0,pm_mapper_full2,FPR XER mapper full ##100C1,100C2,C709B,00001,00009,00005 00000000,41030002,82843602,00000001 FPR XER mapper full #36,198,193,106,112,0,0,pm_misc_load,Non-cachable loads and stcx events ##820E1,820E5,C50C1,C50C5,00009,00005 00000000,0438000C,C2CA828A,00000001 Non-cachable loads and stcx events #37,117,126,52,57,0,0,pm_ic_demand,ICache demand from BR redirect ##C20E3,C20E7,230E0,230E1,00009,00005 00000000,800C000F,C6CEC0C2,00000000 ICache demand from BR redirect #38,72,69,54,0,0,0,pm_ic_pref,ICache prefetch ##220E7,220E6,210C7,2208D,00009,00005 00000000,8000000C,CECC8E1A,00000000 ICache prefetch #39,69,67,60,59,0,0,pm_ic_miss,ICache misses ##12099,120E7,C30E4,00001,00009,00005 00000000,4003000E,32CEC802,00000001 ICache misses #40,210,184,1,3,0,0,pm_branch_miss,Branch mispredict, TLB and SLB misses ##80088,80088,230E5,230E6,00009,00005 00000000,80800003,1010CACC,00000000 Branch mispredict, TLB and SLB misses #41,9,8,3,5,0,0,pm_branch1,Branch operations ##23087,23087,23087,23087,00009,00005 00000000,80000003,0E0E0E0E,00000000 Branch operations #42,64,62,24,59,0,0,pm_branch2,Branch operations ##120E5,120E6,110C6,00001,00009,00005 00000000,4000000C,CACC8C02,00000001 Branch operations #43,20,21,100,106,0,0,pm_L1_tlbmiss,L1 load and TLB misses ##800C7,800C4,C1088,C1090,00009,00005 00000000,00B00000,8E881020,00000000 L1 load and TLB misses #44,13,137,165,171,0,0,pm_L1_DERAT_miss,L1 store and DERAT misses ##C3087,80090,C1090,C10C3,00009,00005 00000000,00B30000,0E202086,00000000 L1 store and DERAT misses #45,21,78,101,105,0,0,pm_L1_slbmiss,L1 load and SLB misses 
##800C5,800C1,C10C2,C10C6,00009,00005 00000000,00B00000,8A82848C,00000000 L1 load and SLB misses #46,26,23,103,108,0,0,pm_L1_dtlbmiss_4K,L1 load references and 4K Data TLB references and misses ##C40C2,C40C0,C10C0,C10C4,00009,00005 00000000,08F00000,84808088,00000000 L1 load references and 4K Data TLB references and misses #47,25,22,166,173,0,0,pm_L1_dtlbmiss_16M,L1 store references and 16M Data TLB references and misses ##C40C6,C40C4,C10C1,C10C5,00009,00005 00000000,08F00000,8C88828A,00000000 L1 store references and 16M Data TLB references and misses #48,16,18,26,59,0,0,pm_dsource1,L3 cache and memory data access ##C308E,C3087,110C7,00001,00009,00005 00000000,40030000,1C0E8E02,00000001 L3 cache and memory data access #49,16,18,187,15,0,0,pm_dsource2,L3 cache and memory data access ##C308E,C3087,C309B,C3087,00009,00005 00000000,00030003,1C0E360E,00000000 L3 cache and memory data access #50,14,16,8,13,0,0,pm_dsource_L2,L2 cache data access ##C3097,C3097,C3097,C3097,00009,00005 00000000,00030003,2E2E2E2E,00000000 L2 cache data access #51,17,17,10,14,0,0,pm_dsource_L3,L3 cache data access ##C309E,C309E,C309E,C309E,00009,00005 00000000,00030003,3C3C3C3C,00000000 L3 cache data access #52,78,74,59,63,0,0,pm_isource1,Instruction source information ##2208D,2208D,2208D,22086,00009,00005 00000000,8000000C,1A1A1A0C,00000000 Instruction source information #53,76,77,55,0,0,0,pm_isource2,Instruction source information ##22086,22086,00001,2208D,00009,00005 00000000,8000000C,0C0C021A,00000001 Instruction source information #54,77,75,57,61,0,0,pm_isource_L2,L2 instruction source information ##22096,22096,22096,22096,00009,00005 00000000,8000000C,2C2C2C2C,00000000 L2 instruction source information #55,79,76,58,62,0,0,pm_isource_L3,L3 instruction source information ##2209D,2209D,2209D,2209D,00009,00005 00000000,8000000C,3A3A3A3A,00000000 L3 instruction source information #56,184,181,154,163,0,0,pm_pteg_source1,PTEG source information ##83097,83097,83097,83097,00009,00005 
00000000,00020003,2E2E2E2E,00000000 PTEG source information #57,187,182,156,164,0,0,pm_pteg_source2,PTEG source information ##8309E,8309E,8309E,8309E,00009,00005 00000000,00020003,3C3C3C3C,00000000 PTEG source information #58,183,183,189,165,0,0,pm_pteg_source3,PTEG source information ##83087,83087,8309B,83087,00009,00005 00000000,00020003,0E0E360E,00000000 PTEG source information #59,186,64,51,16,0,0,pm_pteg_source4,L3 PTEG and group disptach events ##8308E,00002,00002,C10C7,00009,00005 00000000,00320000,1C04048E,00000000 L3 PTEG and group disptach events #60,83,82,64,69,0,0,pm_L2SA_ld,L2 slice A load events ##701C0,721E0,711C0,731E0,00009,00005 00000000,30554005,80C080C0,00000000 L2 slice A load events #61,85,84,66,71,0,0,pm_L2SA_st,L2 slice A store events ##702C0,722E0,712C0,732E0,00009,00005 00000000,30558005,80C080C0,00000000 L2 slice A store events #62,87,87,68,74,0,0,pm_L2SA_st2,L2 slice A store events ##703C0,723E0,713C0,733E0,00009,00005 00000000,3055C005,80C080C0,00000000 L2 slice A store events #63,91,90,72,77,0,0,pm_L2SB_ld,L2 slice B load events ##701C1,721E1,711C1,731E1,00009,00005 00000000,30554005,82C282C2,00000000 L2 slice B load events #64,93,92,74,79,0,0,pm_L2SB_st,L2 slice B store events ##702C1,722E1,712C1,732E1,00009,00005 00000000,30558005,82C282C2,00000000 L2 slice B store events #65,95,95,76,82,0,0,pm_L2SB_st2,L2 slice B store events ##703C1,723E1,713C1,733E1,00009,00005 00000000,3055C005,82C282C2,00000000 L2 slice B store events #66,99,98,80,85,0,0,pm_L2SB_ld,L2 slice C load events ##701C2,721E2,711C2,731E2,00009,00005 00000000,30554005,84C484C4,00000000 L2 slice C load events #67,101,100,82,87,0,0,pm_L2SB_st,L2 slice C store events ##702C2,722E2,712C2,732E2,00009,00005 00000000,30558005,84C484C4,00000000 L2 slice C store events #68,103,103,84,90,0,0,pm_L2SB_st2,L2 slice C store events ##703C2,723E2,713C2,733E2,00009,00005 00000000,3055C005,84C484C4,00000000 L2 slice C store events #69,107,71,89,94,0,0,pm_L3SA_trans,L3 slice A state 
transistions ##720E3,00001,730E3,710C3,00009,00005 00000000,3015000A,C602C686,00000001 L3 slice A state transistions #70,73,108,93,98,0,0,pm_L3SB_trans,L3 slice B state transistions ##00001,720E4,730E4,710C4,00009,00005 00000000,30150006,02C8C888,00000001 L3 slice B state transistions #71,73,111,97,102,0,0,pm_L3SC_trans,L3 slice C state transistions ##00001,720E5,730E5,710C5,00009,00005 00000000,30150006,02CACA8A,00000001 L3 slice C state transistions #72,82,86,63,73,0,0,pm_L2SA_trans,L2 slice A state transistions ##720E0,700C0,730E0,710C0,00009,00005 00000000,3055000A,C080C080,00000000 L2 slice A state transistions #73,90,94,71,81,0,0,pm_L2SB_trans,L2 slice B state transistions ##720E1,700C1,730E1,710C1,00009,00005 00000000,3055000A,C282C282,00000000 L2 slice B state transistions #74,98,102,79,89,0,0,pm_L2SC_trans,L2 slice C state transistions ##720E2,700C2,730E2,710C2,00009,00005 00000000,3055000A,C484C484,00000000 L2 slice C state transistions #75,106,107,91,99,0,0,pm_L3SAB_retry,L3 slice A/B snoop retry and all CI/CO busy ##721E3,721E4,731E3,731E4,00009,00005 00000000,3005100F,C6C8C6C8,00000000 L3 slice A/B snoop retry and all CI/CO busy #76,108,109,88,96,0,0,pm_L3SAB_hit,L3 slice A/B hit and reference ##701C3,701C4,711C3,711C4,00009,00005 00000000,30501000,86888688,00000000 L3 slice A/B hit and reference #77,112,112,99,100,0,0,pm_L3SC_retry_hit,L3 slice C hit & snoop retry ##721E5,701C5,731E5,711C5,00009,00005 00000000,3055100A,CA8ACA8A,00000000 L3 slice C hit & snoop retry #78,55,54,38,43,0,0,pm_fpu1,Floating Point events ##00088,00088,01088,01090,00009,00005 00000000,00000000,10101020,00000000 Floating Point events #79,56,53,39,44,0,0,pm_fpu2,Floating Point events ##00090,00090,01090,01088,00009,00005 00000000,00000000,20202010,00000000 Floating Point events #80,54,55,30,40,0,0,pm_fpu3,Floating point events ##02088,02088,010C3,010C7,00009,00005 00000000,0000000C,1010868E,00000000 Floating point events #81,58,56,55,115,0,0,pm_fpu4,Floating point events 
##02090,02090,00001,C5090,00009,00005 00000000,0430000C,20200220,00000001 Floating point events #82,40,48,29,39,0,0,pm_fpu5,Floating point events by unit ##000C2,000C6,010C2,010C6,00009,00005 00000000,00000000,848C848C,00000000 Floating point events by unit #83,37,45,31,41,0,0,pm_fpu6,Floating point events by unit ##020E0,020E4,010C0,010C4,00009,00005 00000000,0000000C,C0C88088,00000000 Floating point events by unit #84,38,46,33,42,0,0,pm_fpu7,Floating point events by unit ##000C0,000C4,010C1,010C5,00009,00005 00000000,00000000,8088828A,00000000 Floating point events by unit #85,43,51,55,37,0,0,pm_fpu8,Floating point events by unit ##020E1,020E5,00001,030E0,00009,00005 00000000,0000000D,C2CA02C0,00000001 Floating point events by unit #86,42,50,105,111,0,0,pm_fpu9,Floating point events by unit ##020E3,020E7,C50C0,C50C4,00009,00005 00000000,0430000C,C6CE8088,00000000 Floating point events by unit #87,39,47,55,42,0,0,pm_fpu10,Floating point events by unit ##000C1,000C5,00001,010C5,00009,00005 00000000,00000000,828A028A,00000001 Floating point events by unit #88,36,44,30,59,0,0,pm_fpu11,Floating point events by unit ##000C3,000C7,010C3,00001,00009,00005 00000000,00000000,868E8602,00000001 Floating point events by unit #89,44,52,105,59,0,0,pm_fpu12,Floating point events by unit ##020E2,020E6,C50C0,00001,00009,00005 00000000,0430000C,C4CC8002,00000001 Floating point events by unit #90,59,57,42,49,0,0,pm_fxu1,Fixed Point events ##00012,00012,00012,00012,00009,00005 00000000,00000000,24242424,00000000 Fixed Point events #91,171,172,45,47,0,0,pm_fxu2,Fixed Point events ##00002,12091,13088,11090,00009,00005 00000000,40000006,04221020,00000001 Fixed Point events #92,4,4,43,50,0,0,pm_fxu3,Fixed Point events ##400C3,400C4,130E2,130E6,00009,00005 00000000,40400003,8688C4CC,00000000 Fixed Point events #93,206,203,171,178,0,0,pm_smt_priorities1,Thread priority events ##420E3,420E6,430E3,430E4,00009,00005 00000000,0005000F,C6CCC6C8,00000000 Thread priority events 
#94,205,202,173,180,0,0,pm_smt_priorities2,Thread priority events ##420E2,420E5,430E5,430E6,00009,00005 00000000,0005000F,C4CACACC,00000000 Thread priority events #95,204,201,175,182,0,0,pm_smt_priorities3,Thread priority events ##420E1,420E4,430E2,430E1,00009,00005 00000000,0005000F,C2C8C4C2,00000000 Thread priority events #96,203,68,177,59,0,0,pm_smt_priorities4,Thread priority events ##420E0,0000B,430E0,00001,00009,00005 00000000,0005000A,C016C002,00000001 Thread priority events #97,202,196,55,176,0,0,pm_smt_both,Thread common events ##0000B,00013,00001,41084,00009,00005 00000000,00100000,16260208,00000001 Thread common events #98,196,71,182,189,0,0,pm_smt_selection,Thread selection ##800C3,00001,410C0,410C1,00009,00005 00000000,00900000,86028082,00000001 Thread selection #99,73,0,178,185,0,0,pm_smt_selectover1,Thread selection overide ##00001,400C0,410C2,410C4,00009,00005 00000000,00500000,02808488,00000001 Thread selection overide #100,73,15,180,187,0,0,pm_smt_selectover2,Thread selection overide ##00001,0000F,410C5,410C3,00009,00005 00000000,00100000,021E8A86,00000001 Thread selection overide #101,27,27,17,23,0,0,pm_fabric1,Fabric events ##700C7,720E7,710C7,730E7,00009,00005 00000000,30550005,8ECE8ECE,00000000 Fabric events #102,32,29,20,28,0,0,pm_fabric2,Fabric data movement ##701C7,721E7,711C7,731E7,00009,00005 00000000,30550085,8ECE8ECE,00000000 Fabric data movement #103,33,33,21,27,0,0,pm_fabric3,Fabric data movement ##703C7,723E7,713C7,733E7,00009,00005 00000000,30550185,8ECE8ECE,00000000 Fabric data movement #104,31,28,15,24,0,0,pm_fabric4,Fabric data movement ##702C7,722E7,130E3,712C7,00009,00005 00000000,70540106,8ECEC68E,00000000 Fabric data movement #105,193,185,161,166,0,0,pm_snoop1,Snoop retry ##700C6,720E6,710C6,730E6,00009,00005 00000000,30550005,8CCC8CCC,00000000 Snoop retry #106,194,189,160,59,0,0,pm_snoop2,Snoop read retry ##705C6,725E6,715C6,00001,00009,00005 00000000,30540A04,8CCC8C02,00000001 Snoop read retry 
#107,197,150,162,127,0,0,pm_snoop3,Snoop write retry ##706C6,726E6,716C6,736E6,00009,00005 00000000,30550C05,8CCC8CCC,00000000 Snoop write retry #108,192,149,159,126,0,0,pm_snoop4,Snoop partial write retry ##707C6,727E6,717C6,737E6,00009,00005 00000000,30550E05,8CCC8CCC,00000000 Snoop partial write retry #109,156,155,125,20,0,0,pm_mem_rq,Memory read queue dispatch ##701C6,721E6,711C6,130E7,00009,00005 00000000,70540205,8CCC8CCE,00000000 Memory read queue dispatch #110,155,148,126,21,0,0,pm_mem_read,Memory read complete and cancel ##702C6,722E6,712C6,00003,00009,00005 00000000,30540404,8CCC8C06,00000000 Memory read complete and cancel #111,159,156,128,132,0,0,pm_mem_wq,Memory write queue dispatch ##703C6,723E6,713C6,733E6,00009,00005 00000000,30550605,8CCC8CCC,00000000 Memory write queue dispatch #112,153,152,124,128,0,0,pm_mem_pwq,Memory partial write queue ##704C6,724E6,714C6,734E6,00009,00005 00000000,30550805,8CCC8CCC,00000000 Memory partial write queue #113,171,173,185,158,0,0,pm_threshold,Thresholding ##00002,820E2,0000B,00014,00009,00005 00000000,00080004,04C41628,00000001 Thresholding #114,171,179,137,146,0,0,pm_mrk_grp1,Marked group events ##00002,820E3,00005,00013,00009,00005 00000000,00080004,04C60A26,00000001 Marked group events #115,172,158,138,147,0,0,pm_mrk_grp2,Marked group events ##00015,00005,C70E4,12091,00009,00005 00000000,41030002,2A0AC822,00000001 Marked group events #116,160,162,129,135,0,0,pm_mrk_dsource1,Marked data from ##C7087,C70A0,C70A2,C70A2,00009,00005 00000000,010B0003,0E404444,00000001 Marked data from #117,161,160,55,44,0,0,pm_mrk_dsource2,Marked data from ##C7097,C70A2,00001,01088,00009,00005 00000000,010B0000,2E440210,00000001 Marked data from #118,163,166,131,138,0,0,pm_mrk_dsource3,Marked data from ##C708E,C70A4,C70A6,C70A6,00009,00005 00000000,010B0003,1C484C4C,00000001 Marked data from #119,166,161,130,143,0,0,pm_mrk_dsource4,Marked data from ##C70A1,C70A3,C7097,C70A1,00009,00005 00000000,010B0003,42462E42,00000001 Marked data 
from #120,164,164,133,141,0,0,pm_mrk_dsource5,Marked data from ##C709E,C70A6,C70A0,C70A0,00009,00005 00000000,010B0003,3C4C4040,00000001 Marked data from #121,162,161,55,137,0,0,pm_mrk_dsource6,Marked data from ##C70A3,C70A3,00001,C70A3,00009,00005 00000000,010B0001,46460246,00000001 Marked data from #122,165,165,132,140,0,0,pm_mrk_dsource7,Marked data from ##C70A7,C70A7,C709E,C70A7,00009,00005 00000000,010B0003,4E4E3C4E,00000001 Marked data from #123,168,168,135,144,0,0,pm_mrk_lbmiss,Marked TLB and SLB misses ##C40C1,C40C5,C50C6,C50C7,00009,00005 00000000,0CF00000,828A8C8E,00000001 Marked TLB and SLB misses #124,170,170,55,144,0,0,pm_mrk_lbref,Marked TLB and SLB references ##C40C3,C40C7,00001,C50C7,00009,00005 00000000,0CF00000,868E028E,00000001 Marked TLB and SLB references #125,175,71,150,134,0,0,pm_mrk_lsmiss,Marked load and store miss ##82088,00001,00003,00005,00009,00005 00000000,00080008,1002060A,00000001 Marked load and store miss #126,179,179,148,160,0,0,pm_mrk_ulsflush,Mark unaligned load and store flushes ##00003,820E3,81090,81090,00009,00005 00000000,00280004,06C62020,00000001 Mark unaligned load and store flushes #127,178,178,136,148,0,0,pm_mrk_misc,Misc marked instructions ##820E6,00003,00014,0000B,00009,00005 00000000,00080008,CC062816,00000001 Misc marked instructions #128,13,74,165,106,0,0,pm_lsref_L1,Load/Store operations and L1 activity ##C3087,2208D,C1090,C1090,00009,00005 00000000,80330004,0E1A2020,00000000 Load/Store operations and L1 activity #129,16,18,165,106,0,0,pm_lsref_L2L3,Load/Store operations and L2,L3 activity ##C308E,C3087,C1090,C1090,00009,00005 00000000,00330000,1C0E2020,00000000 Load/Store operations and L2,L3 activity #130,81,21,165,106,0,0,pm_lsref_tlbmiss,Load/Store operations and TLB misses ##800C0,800C4,C1090,C1090,00009,00005 00000000,00B00000,80882020,00000000 Load/Store operations and TLB misses #131,16,18,100,171,0,0,pm_Dmiss,Data cache misses ##C308E,C3087,C1088,C10C3,00009,00005 00000000,00330000,1C0E1086,00000000 Data 
cache misses #132,12,69,61,91,0,0,pm_prefetchX,Prefetch events ##0000F,220E6,C70E7,C50C3,00009,00005 00000000,85330006,1ECCCE86,00000000 Prefetch events #133,9,8,3,1,0,0,pm_branchX,Branch operations ##23087,23087,23087,230E4,00009,00005 00000000,80000003,0E0E0EC8,00000000 Branch operations #134,43,51,30,37,0,0,pm_fpuX1,Floating point events by unit ##020E1,020E5,010C3,030E0,00009,00005 00000000,0000000D,C2CA86C0,00000000 Floating point events by unit #135,39,47,33,42,0,0,pm_fpuX2,Floating point events by unit ##000C1,000C5,010C1,010C5,00009,00005 00000000,00000000,828A828A,00000000 Floating point events by unit #136,36,44,30,40,0,0,pm_fpuX3,Floating point events by unit ##000C3,000C7,010C3,010C7,00009,00005 00000000,00000000,868E868E,00000000 Floating point events by unit #137,56,54,165,106,0,0,pm_fpuX4,Floating point and L1 events ##00090,00088,C1090,C1090,00009,00005 00000000,00300000,20102020,00000000 Floating point and L1 events #138,58,56,30,40,0,0,pm_fpuX5,Floating point events ##02090,02090,010C3,010C7,00009,00005 00000000,0000000C,2020868E,00000000 Floating point events #139,55,53,39,44,0,0,pm_fpuX6,Floating point events ##00088,00090,01090,01088,00009,00005 00000000,00000000,10202010,00000000 Floating point events #140,12,58,6,44,0,0,pm_hpmcount1,HPM group for set 1 ##0000F,00014,0000F,01088,00009,00005 00000000,00000000,1E281E10,00000000 HPM group for set 1 #141,12,56,56,115,0,0,pm_hpmcount2,HPM group for set 2 ##0000F,02090,00009,C5090,00009,00005 00000000,04300004,1E201220,00000000 HPM group for set 2 #142,12,72,100,171,0,0,pm_hpmcount3,HPM group for set 3 ##0000F,120E1,C1088,C10C3,00009,00005 00000000,40300004,1EC21086,00000000 HPM group for set 3 #143,210,15,165,106,0,0,pm_hpmcount4,HPM group for set 7 ##80088,0000F,C1090,C1090,00009,00005 00000000,00B00000,101E2020,00000000 HPM group for set 7 #144,56,54,6,59,0,0,pm_1flop_with_fma,One flop instructions plus FMA ##00090,00088,0000F,00001,00009,00005 00000000,00000000,20101E02,00000000 One flop 
instructions plus FMA papi-papi-7-2-0-t/src/event_data/ppc970/000077500000000000000000000000001502707512200175635ustar00rootroot00000000000000papi-papi-7-2-0-t/src/event_data/ppc970/events000066400000000000000000003123261502707512200210210ustar00rootroot00000000000000{ **************************** { THIS IS OPEN SOURCE CODE { **************************** { (C) COPYRIGHT International Business Machines Corp. 2005 { This file is licensed under the University of Tennessee license. { See LICENSE.txt. { { File: events/ppc970/events { Author: Maynard Johnson { maynardj@us.ibm.com { Mods: { { counter 1 } #0,u,g,n,n,PM_BRQ_FULL_CYC,Cycles branch queue full ##10095,60095 The ISU sends a signal indicating that the issue queue that feeds the ifu br unit cannot accept any more group (queue is full of groups). #1,v,g,n,n,PM_CR_MAP_FULL_CYC,Cycles CR logical operation mapper full ##10094,60094 The ISU sends a signal indicating that the cr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #2,v,g,n,n,PM_CYC,Processor cycles ##0000F Processor cycles #3,v,g,n,n,PM_DATA_FROM_L2,Data loaded from L2 ##C3087 DL1 was reloaded from the local L2 due to a demand load #4,v,g,n,n,PM_DATA_TABLEWALK_CYC,Cycles doing data tablewalks ##80097 This signal is asserted every cycle when a tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried. #5,v,g,n,n,PM_DSLB_MISS,Data SLB misses ##80095 A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve #6,v,g,n,n,PM_DTLB_MISS,Data TLB misses ##80094 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. 
#7,v,g,n,n,PM_FPR_MAP_FULL_CYC,Cycles FPR mapper full ##10091,60091 The ISU sends a signal indicating that the FPR mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #8,v,g,n,n,PM_FPU0_ALL,FPU0 executed add, mult, sub, cmp or sel instruction ##00093 This signal is active for one cycle when fp0 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #9,v,g,n,n,PM_FPU0_DENORM,FPU0 received denormalized data ##02098 This signal is active for one cycle when one of the operands is denormalized. #10,v,g,n,n,PM_FPU0_FDIV,FPU0 executed FDIV instruction ##00090 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #11,v,g,n,n,PM_FPU0_FMA,FPU0 executed multiply-add instruction ##00091 This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #12,v,g,n,n,PM_FPU0_FSQRT,FPU0 executed FSQRT instruction ##00092 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #13,v,g,n,n,PM_FPU0_FULL_CYC,Cycles FPU0 issue queue full ##10093,60093 The issue queue for FPU unit 0 cannot accept any more instructions. Issue is stopped #14,v,g,n,n,PM_FPU0_SINGLE,FPU0 executed single precision instruction ##0209B This signal is active for one cycle when fp0 is executing single precision instruction. #15,v,g,n,n,PM_FPU0_STALL3,FPU0 stalled in pipe3 ##02099 This signal indicates that fp0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). 
This signal is active during the entire duration of the stall. #16,v,g,n,n,PM_FPU0_STF,FPU0 executed store instruction ##0209A This signal is active for one cycle when fp0 is executing a store instruction. #17,v,g,n,n,PM_FPU1_ALL,FPU1 executed add, mult, sub, cmp or sel instruction ##00097 This signal is active for one cycle when fp1 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #18,v,g,n,n,PM_FPU1_DENORM,FPU1 received denormalized data ##0209C This signal is active for one cycle when one of the operands is denormalized. #19,v,g,n,n,PM_FPU1_FDIV,FPU1 executed FDIV instruction ##00094 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #20,v,g,n,n,PM_FPU1_FMA,FPU1 executed multiply-add instruction ##00095 This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #21,v,g,n,n,PM_FPU1_FSQRT,FPU1 executed FSQRT instruction ##00096 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #22,v,g,n,n,PM_FPU1_FULL_CYC,Cycles FPU1 issue queue full ##10097,60097 The issue queue for FPU unit 1 cannot accept any more instructions. Issue is stopped #23,v,g,n,n,PM_FPU1_SINGLE,FPU1 executed single precision instruction ##0209F This signal is active for one cycle when fp1 is executing single precision instruction. #24,v,g,n,n,PM_FPU1_STALL3,FPU1 stalled in pipe3 ##0209D This signal indicates that fp1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). 
This signal is active during the entire duration of the stall. #25,v,g,n,n,PM_FPU1_STF,FPU1 executed store instruction ##0209E This signal is active for one cycle when fp1 is executing a store instruction. #26,v,g,n,n,PM_FPU_DENORM,FPU received denormalized data ##02080 This signal is active for one cycle when one of the operands is denormalized. Combined Unit 0 + Unit 1 #27,v,g,n,n,PM_FPU_FDIV,FPU executed FDIV instruction ##00080 This signal is active for one cycle at the end of the microcode executed when FPU is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. Combined Unit 0 + Unit 1 #28,v,g,n,n,PM_GCT_EMPTY_CYC,Cycles GCT empty ##00004 The Global Completion Table is completely empty #29,v,g,n,n,PM_GCT_FULL_CYC,Cycles GCT full ##10090,60090 The ISU sends a signal indicating the gct is full. #30,v,g,n,n,PM_GRP_BR_MPRED,Group experienced a branch mispredict ##1209F,6209F Group experienced a branch mispredict #31,v,g,n,n,PM_GRP_BR_REDIR,Group experienced branch redirect ##1209E,6209E Group experienced branch redirect #32,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected ##1209C,6209C A group that previously attempted dispatch was rejected. #33,v,g,n,n,PM_GRP_DISP_VALID,Group dispatch valid ##1209B,6209B Dispatch has been attempted for a valid group. Some groups may be rejected. The total number of successful dispatches is the number of dispatch valid minus dispatch reject. #34,v,g,n,n,PM_IC_PREF_INSTALL,Instruction prefetched installed in prefetch ##2209E New line coming into the prefetch buffer #35,v,g,n,n,PM_IC_PREF_REQ,Instruction prefetch requests ##2209D Asserted when a non-canceled prefetch is made to the cache interface unit (CIU). #36,v,g,n,n,PM_IERAT_XLATE_WR,Translation written to ierat ##2209F This signal will be asserted each time the I-ERAT is written. This indicates that an ERAT miss has been serviced. ERAT misses will initiate a sequence resulting in the ERAT being written. 
ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed. This should be a fairly accurate count of ERAT misses (best available). #37,c,g,n,n,PM_INST_CMPL,Instructions completed ##00009 Number of Eligible Instructions that completed. #38,v,g,n,n,PM_INST_DISP,Instructions dispatched ##12098,12099,1209A,62098,62099,6209A The ISU sends the number of instructions dispatched. #39,v,g,n,n,PM_INST_FROM_L1,Instruction fetched from L1 ##2208D An instruction fetch group was fetched from L1. Fetch Groups can contain up to 8 instructions #40,v,g,n,n,PM_INST_FROM_L2,Instructions fetched from L2 ##22086 An instruction fetch group was fetched from L2. Fetch Groups can contain up to 8 instructions #41,u,g,n,n,PM_ISLB_MISS,Instruction SLB misses ##80091 A SLB miss for an instruction fetch has occurred #42,v,g,n,n,PM_ITLB_MISS,Instruction TLB misses ##80090 A TLB miss for an Instruction Fetch has occurred #43,v,g,n,n,PM_LARX_LSU0,Larx executed on LSU0 ##8209F A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0) #44,u,g,n,n,PM_LR_CTR_MAP_FULL_CYC,Cycles LR/CTR mapper full ##10096,60096 The ISU sends a signal indicating that the lr/ctr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #45,v,g,n,n,PM_LSU0_DERAT_MISS,LSU0 DERAT misses ##80092 A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #46,v,g,n,n,PM_LSU0_FLUSH_LRQ,LSU0 LRQ flushes ##C0092 A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. 
#47,u,g,n,n,PM_LSU0_FLUSH_SRQ,LSU0 SRQ flushes ##C0093 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #48,v,g,n,n,PM_LSU0_FLUSH_ULD,LSU0 unaligned load flushes ##C0090 A load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #49,v,g,n,n,PM_LSU0_FLUSH_UST,LSU0 unaligned store flushes ##C0091 A store was flushed from unit 0 because it was unaligned (crossed a 4k boundary) #50,v,g,n,n,PM_LSU0_REJECT_ERAT_MISS,LSU0 reject due to ERAT miss ##C609B LSU0 reject due to ERAT miss #51,v,g,n,n,PM_LSU0_REJECT_LMQ_FULL,LSU0 reject due to LMQ full or missed data coming ##C6099 LSU0 reject due to LMQ full or missed data coming #52,v,g,n,n,PM_LSU0_REJECT_RELOAD_CDF,LSU0 reject due to reload CDF or tag update collision ##C609A LSU0 reject due to reload CDF or tag update collision #53,v,g,n,n,PM_LSU0_REJECT_SRQ,LSU0 SRQ rejects ##C6098 LSU0 SRQ rejects #54,u,g,n,n,PM_LSU0_SRQ_STFWD,LSU0 SRQ store forwarded ##C2098 Data from a store instruction was forwarded to a load on unit 0 #55,v,g,n,n,PM_LSU1_DERAT_MISS,LSU1 DERAT misses ##80096 A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #56,v,g,n,n,PM_LSU1_FLUSH_LRQ,LSU1 LRQ flushes ##C0096 A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #57,u,g,n,n,PM_LSU1_FLUSH_SRQ,LSU1 SRQ flushes ##C0097 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. 
#58,v,g,n,n,PM_LSU1_FLUSH_ULD,LSU1 unaligned load flushes ##C0094 A load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #59,u,g,n,n,PM_LSU1_FLUSH_UST,LSU1 unaligned store flushes ##C0095 A store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #60,v,g,n,n,PM_LSU1_REJECT_ERAT_MISS,LSU1 reject due to ERAT miss ##C609F LSU1 reject due to ERAT miss #61,v,g,n,n,PM_LSU1_REJECT_LMQ_FULL,LSU1 reject due to LMQ full or missed data coming ##C609D LSU1 reject due to LMQ full or missed data coming #62,v,g,n,n,PM_LSU1_REJECT_RELOAD_CDF,LSU1 reject due to reload CDF or tag update collision ##C609E LSU1 reject due to reload CDF or tag update collision #63,v,g,n,n,PM_LSU1_REJECT_SRQ,LSU1 SRQ rejects ##C609C LSU1 SRQ rejects #64,u,g,n,n,PM_LSU1_SRQ_STFWD,LSU1 SRQ store forwarded ##C209C Data from a store instruction was forwarded to a load on unit 1 #65,v,g,n,n,PM_LSU_FLUSH_ULD,LRQ unaligned load flushes ##C0080 A load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #66,v,g,n,n,PM_LSU_LRQ_S0_ALLOC,LRQ slot 0 allocated ##C209E LRQ slot zero was allocated #67,v,g,n,n,PM_LSU_LRQ_S0_VALID,LRQ slot 0 valid ##C209A This signal is asserted every cycle that the Load Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. #68,v,g,n,n,PM_LSU_REJECT_SRQ,LSU SRQ rejects ##C6080 LSU SRQ rejects #69,v,g,n,n,PM_LSU_SRQ_S0_ALLOC,SRQ slot 0 allocated ##C209D SRQ Slot zero was allocated #70,v,g,n,n,PM_LSU_SRQ_S0_VALID,SRQ slot 0 valid ##C2099 This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. 
#71,c,g,n,n,PM_LSU_SRQ_STFWD,SRQ store forwarded ##C2080 Data from a store instruction was forwarded to a load #72,v,g,n,n,PM_MRK_DATA_FROM_L2,Marked data loaded from L2 ##C7087 DL1 was reloaded from the local L2 due to a marked demand load #73,v,g,n,n,PM_MRK_GRP_DISP,Marked group dispatched ##00002 A group containing a sampled instruction was dispatched #74,v,g,n,n,PM_MRK_IMR_RELOAD,Marked IMR reloaded ##8209A A DL1 reload occurred due to a marked load #75,v,g,n,n,PM_MRK_LD_MISS_L1,Marked L1 D cache load misses ##82080 Marked L1 D cache load misses #76,v,g,n,n,PM_MRK_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##82098 A marked load, executing on unit 0, missed the dcache #77,v,g,n,n,PM_MRK_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##8209C A marked load, executing on unit 1, missed the dcache #78,v,g,n,n,PM_MRK_STCX_FAIL,Marked STCX failed ##8209E A marked stcx (stwcx or stdcx) failed #79,v,g,n,n,PM_MRK_ST_CMPL,Marked store instruction completed ##00003 A sampled store has completed (data home) #80,v,g,n,n,PM_MRK_ST_MISS_L1,Marked L1 D cache store misses ##8209B A marked store missed the dcache #81,v,g,n,n,PM_PMC8_OVERFLOW,PMC8 Overflow ##0000A PMC8 Overflow #82,v,g,n,n,PM_RUN_CYC,Run cycles ##00005 Processor Cycles gated by the run latch #83,u,g,n,n,PM_SNOOP_TLBIE,Snoop TLBIE ##80093 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. 
#84,v,g,n,n,PM_STCX_FAIL,STCX failed ##82099 A stcx (stwcx or stdcx) failed #85,v,g,n,n,PM_STCX_PASS,Stcx passes ##8209D A stcx (stwcx or stdcx) instruction was successful #86,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##C209B A store missed the dcache #87,v,g,n,n,PM_SUSPENDED,Suspended ##00008 Suspended #88,v,g,n,n,PM_XER_MAP_FULL_CYC,Cycles XER mapper full ##10092,60092 The ISU sends a signal indicating that the xer mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. $$$$$$$$ { counter 2 } #0,u,g,n,n,PM_BRQ_FULL_CYC,Cycles branch queue full ##10095,60095 The ISU sends a signal indicating that the issue queue that feeds the ifu br unit cannot accept any more group (queue is full of groups). #1,v,g,n,n,PM_CR_MAP_FULL_CYC,Cycles CR logical operation mapper full ##10094,60094 The ISU sends a signal indicating that the cr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #2,v,g,n,n,PM_CYC,Processor cycles ##0000F Processor cycles #3,v,g,n,n,PM_DATA_TABLEWALK_CYC,Cycles doing data tablewalks ##80097 This signal is asserted every cycle when a tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried. #4,v,g,n,n,PM_DSLB_MISS,Data SLB misses ##80095 A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve #5,v,g,n,n,PM_DTLB_MISS,Data TLB misses ##80094 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. #6,v,g,n,n,PM_FPR_MAP_FULL_CYC,Cycles FPR mapper full ##10091,60091 The ISU sends a signal indicating that the FPR mapper cannot accept any more groups. Dispatch is stopped. 
Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #7,v,g,n,n,PM_FPU0_ALL,FPU0 executed add, mult, sub, cmp or sel instruction ##00093 This signal is active for one cycle when fp0 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #8,v,g,n,n,PM_FPU0_DENORM,FPU0 received denormalized data ##02098 This signal is active for one cycle when one of the operands is denormalized. #9,v,g,n,n,PM_FPU0_FDIV,FPU0 executed FDIV instruction ##00090 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #10,v,g,n,n,PM_FPU0_FMA,FPU0 executed multiply-add instruction ##00091 This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #11,v,g,n,n,PM_FPU0_FSQRT,FPU0 executed FSQRT instruction ##00092 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #12,v,g,n,n,PM_FPU0_FULL_CYC,Cycles FPU0 issue queue full ##10093,60093 The issue queue for FPU unit 0 cannot accept any more instructions. Issue is stopped #13,v,g,n,n,PM_FPU0_SINGLE,FPU0 executed single precision instruction ##0209B This signal is active for one cycle when fp0 is executing single precision instruction. #14,v,g,n,n,PM_FPU0_STALL3,FPU0 stalled in pipe3 ##02099 This signal indicates that fp0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. 
#15,v,g,n,n,PM_FPU0_STF,FPU0 executed store instruction ##0209A This signal is active for one cycle when fp0 is executing a store instruction. #16,v,g,n,n,PM_FPU1_ALL,FPU1 executed add, mult, sub, cmp or sel instruction ##00097 This signal is active for one cycle when fp1 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #17,v,g,n,n,PM_FPU1_DENORM,FPU1 received denormalized data ##0209C This signal is active for one cycle when one of the operands is denormalized. #18,v,g,n,n,PM_FPU1_FDIV,FPU1 executed FDIV instruction ##00094 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #19,v,g,n,n,PM_FPU1_FMA,FPU1 executed multiply-add instruction ##00095 This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #20,v,g,n,n,PM_FPU1_FSQRT,FPU1 executed FSQRT instruction ##00096 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #21,v,g,n,n,PM_FPU1_FULL_CYC,Cycles FPU1 issue queue full ##10097,60097 The issue queue for FPU unit 1 cannot accept any more instructions. Issue is stopped #22,v,g,n,n,PM_FPU1_SINGLE,FPU1 executed single precision instruction ##0209F This signal is active for one cycle when fp1 is executing single precision instruction. #23,v,g,n,n,PM_FPU1_STALL3,FPU1 stalled in pipe3 ##0209D This signal indicates that fp1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. 
#24,v,g,n,n,PM_FPU1_STF,FPU1 executed store instruction ##0209E This signal is active for one cycle when fp1 is executing a store instruction. #25,v,g,n,n,PM_FPU_FMA,FPU executed multiply-add instruction ##00080 This signal is active for one cycle when FPU is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1 #26,v,g,n,n,PM_FPU_STALL3,FPU stalled in pipe3 ##02080 FPU has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. Combined Unit 0 + Unit 1 #27,v,g,n,n,PM_GCT_EMPTY_SRQ_FULL,GCT empty caused by SRQ full ##0000B GCT empty caused by SRQ full #28,v,g,n,n,PM_GCT_FULL_CYC,Cycles GCT full ##10090,60090 The ISU sends a signal indicating the gct is full. #29,v,g,n,n,PM_GRP_BR_MPRED,Group experienced a branch mispredict ##1209F,6209F Group experienced a branch mispredict #30,v,g,n,n,PM_GRP_BR_REDIR,Group experienced branch redirect ##1209E,6209E Group experienced branch redirect #31,v,g,n,n,PM_GRP_DISP,Group dispatches ##00004 A group was dispatched #32,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected ##1209C,6209C A group that previously attempted dispatch was rejected. #33,v,g,n,n,PM_GRP_DISP_VALID,Group dispatch valid ##1209B,6209B Dispatch has been attempted for a valid group. Some groups may be rejected. The total number of successful dispatches is the number of dispatch valid minus dispatch reject. #34,v,g,n,n,PM_IC_PREF_INSTALL,Instruction prefetched installed in prefetch ##2209E New line coming into the prefetch buffer #35,v,g,n,n,PM_IC_PREF_REQ,Instruction prefetch requests ##2209D Asserted when a non-canceled prefetch is made to the cache interface unit (CIU). #36,v,g,n,n,PM_IERAT_XLATE_WR,Translation written to ierat ##2209F This signal will be asserted each time the I-ERAT is written. 
This indicates that an ERAT miss has been serviced. ERAT misses will initiate a sequence resulting in the ERAT being written. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed. This should be a fairly accurate count of ERAT misses (best available). #37,c,g,n,n,PM_INST_CMPL,Instructions completed ##00009 Number of Eligible Instructions that completed. #38,v,g,n,n,PM_INST_DISP,Instructions dispatched ##12098,12099,1209A,62098,62099,6209A The ISU sends the number of instructions dispatched. #39,v,g,n,n,PM_INST_FROM_MEM,Instruction fetched from memory ##22086 Instruction fetched from memory #40,u,g,n,n,PM_ISLB_MISS,Instruction SLB misses ##80091 A SLB miss for an instruction fetch has occurred #41,v,g,n,n,PM_ITLB_MISS,Instruction TLB misses ##80090 A TLB miss for an Instruction Fetch has occurred #42,v,g,n,n,PM_LARX_LSU0,Larx executed on LSU0 ##8209F A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0) #43,u,g,n,n,PM_LR_CTR_MAP_FULL_CYC,Cycles LR/CTR mapper full ##10096,60096 The ISU sends a signal indicating that the lr/ctr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #44,v,g,n,n,PM_LSU0_DERAT_MISS,LSU0 DERAT misses ##80092 A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #45,v,g,n,n,PM_LSU0_FLUSH_LRQ,LSU0 LRQ flushes ##C0092 A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. 
#46,u,g,n,n,PM_LSU0_FLUSH_SRQ,LSU0 SRQ flushes ##C0093 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #47,v,g,n,n,PM_LSU0_FLUSH_ULD,LSU0 unaligned load flushes ##C0090 A load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #48,v,g,n,n,PM_LSU0_FLUSH_UST,LSU0 unaligned store flushes ##C0091 A store was flushed from unit 0 because it was unaligned (crossed a 4k boundary) #49,v,g,n,n,PM_LSU0_REJECT_ERAT_MISS,LSU0 reject due to ERAT miss ##C609B LSU0 reject due to ERAT miss #50,v,g,n,n,PM_LSU0_REJECT_LMQ_FULL,LSU0 reject due to LMQ full or missed data coming ##C6099 LSU0 reject due to LMQ full or missed data coming #51,v,g,n,n,PM_LSU0_REJECT_RELOAD_CDF,LSU0 reject due to reload CDF or tag update collision ##C609A LSU0 reject due to reload CDF or tag update collision #52,v,g,n,n,PM_LSU0_REJECT_SRQ,LSU0 SRQ rejects ##C6098 LSU0 SRQ rejects #53,u,g,n,n,PM_LSU0_SRQ_STFWD,LSU0 SRQ store forwarded ##C2098 Data from a store instruction was forwarded to a load on unit 0 #54,v,g,n,n,PM_LSU1_DERAT_MISS,LSU1 DERAT misses ##80096 A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #55,v,g,n,n,PM_LSU1_FLUSH_LRQ,LSU1 LRQ flushes ##C0096 A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #56,u,g,n,n,PM_LSU1_FLUSH_SRQ,LSU1 SRQ flushes ##C0097 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. 
#57,v,g,n,n,PM_LSU1_FLUSH_ULD,LSU1 unaligned load flushes ##C0094 A load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #58,u,g,n,n,PM_LSU1_FLUSH_UST,LSU1 unaligned store flushes ##C0095 A store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #59,v,g,n,n,PM_LSU1_REJECT_ERAT_MISS,LSU1 reject due to ERAT miss ##C609F LSU1 reject due to ERAT miss #60,v,g,n,n,PM_LSU1_REJECT_LMQ_FULL,LSU1 reject due to LMQ full or missed data coming ##C609D LSU1 reject due to LMQ full or missed data coming #61,v,g,n,n,PM_LSU1_REJECT_RELOAD_CDF,LSU1 reject due to reload CDF or tag update collision ##C609E LSU1 reject due to reload CDF or tag update collision #62,v,g,n,n,PM_LSU1_REJECT_SRQ,LSU1 SRQ rejects ##C609C LSU1 SRQ rejects #63,u,g,n,n,PM_LSU1_SRQ_STFWD,LSU1 SRQ store forwarded ##C209C Data from a store instruction was forwarded to a load on unit 1 #64,v,g,n,n,PM_LSU_FLUSH_UST,SRQ unaligned store flushes ##C0080 A store was flushed because it was unaligned #65,u,g,n,n,PM_LSU_LMQ_SRQ_EMPTY_CYC,Cycles LMQ and SRQ empty ##00002 Cycles when both the LMQ and SRQ are empty (LSU is idle) #66,v,g,n,n,PM_LSU_LRQ_S0_ALLOC,LRQ slot 0 allocated ##C209E LRQ slot zero was allocated #67,v,g,n,n,PM_LSU_LRQ_S0_VALID,LRQ slot 0 valid ##C209A This signal is asserted every cycle that the Load Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. #68,v,g,n,n,PM_LSU_REJECT_LMQ_FULL,LSU reject due to LMQ full or missed data coming ##C6080 LSU reject due to LMQ full or missed data coming #69,v,g,n,n,PM_LSU_SRQ_S0_ALLOC,SRQ slot 0 allocated ##C209D SRQ Slot zero was allocated #70,v,g,n,n,PM_LSU_SRQ_S0_VALID,SRQ slot 0 valid ##C2099 This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. 
#71,v,g,n,n,PM_MRK_BRU_FIN,Marked instruction BRU processing finished ##00005 The branch unit finished a marked instruction. Instructions that finish may not necessarily complete #72,v,g,n,n,PM_MRK_IMR_RELOAD,Marked IMR reloaded ##8209A A DL1 reload occurred due to a marked load #73,v,g,n,n,PM_MRK_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##82098 A marked load, executing on unit 0, missed the dcache #74,v,g,n,n,PM_MRK_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##8209C A marked load, executing on unit 1, missed the dcache #75,v,g,n,n,PM_MRK_STCX_FAIL,Marked STCX failed ##8209E A marked stcx (stwcx or stdcx) failed #76,v,g,n,n,PM_MRK_ST_MISS_L1,Marked L1 D cache store misses ##8209B A marked store missed the dcache #77,v,g,n,n,PM_PMC1_OVERFLOW,PMC1 Overflow ##0000A PMC1 Overflow #78,u,g,n,n,PM_SNOOP_TLBIE,Snoop TLBIE ##80093 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. #79,v,g,n,n,PM_STCX_FAIL,STCX failed ##82099 A stcx (stwcx or stdcx) failed #80,v,g,n,n,PM_STCX_PASS,Stcx passes ##8209D A stcx (stwcx or stdcx) instruction was successful #81,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##C209B A store missed the dcache #82,v,g,n,n,PM_SUSPENDED,Suspended ##00008 Suspended #83,v,g,t,n,PM_THRESH_TIMEO,Threshold timeout ##00003 The threshold timer expired #84,v,g,n,n,PM_WORK_HELD,Work held ##00001 RAS Unit has signaled completion to stop and there are groups waiting to complete #85,v,g,n,n,PM_XER_MAP_FULL_CYC,Cycles XER mapper full ##10092,60092 The ISU sends a signal indicating that the xer mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. 
$$$$$$$$ { counter 3 } #0,v,g,n,n,PM_BR_ISSUED,Branches issued ##23098 This signal will be asserted each time the ISU issues a branch instruction. This signal will be asserted each time the ISU selects a branch instruction to issue. #1,v,g,n,n,PM_BR_MPRED_CR,Branch mispredictions due to CR bit setting ##23099 This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction. #2,v,g,n,n,PM_BR_MPRED_TA,Branch mispredictions due to target address ##2309A branch mispredict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction. #3,u,g,n,n,PM_CRQ_FULL_CYC,Cycles CR issue queue full ##11091,61091 The ISU sends a signal indicating that the issue queue that feeds the ifu cr unit cannot accept any more group (queue is full of groups). #4,v,g,n,n,PM_CYC,Processor cycles ##0000F Processor cycles #5,v,g,n,n,PM_DATA_FROM_MEM,Data loaded from memory ##C3087 Data loaded from memory #6,u,g,n,n,PM_DC_INV_L2,L1 D cache entries invalidated from L2 ##C1097 A dcache invalidate was received from the L2 because a line in L2 was castout. #7,u,g,n,n,PM_DC_PREF_OUT_OF_STREAMS,D cache out of streams ##8309A out of streams #8,v,g,n,n,PM_DC_PREF_STREAM_ALLOC,D cache new prefetch stream allocated ##8309F A new Prefetch Stream was allocated #9,v,g,n,n,PM_EE_OFF,Cycles MSR(EE) bit off ##1309B,6309B The number of Cycles MSR(EE) bit was off. 
#10,u,g,n,n,PM_EE_OFF_EXT_INT,Cycles MSR(EE) bit off and external interrupt pending ##1309F,6309F Cycles MSR(EE) bit off and external interrupt pending #11,v,g,n,n,PM_FLUSH_BR_MPRED,Flush caused by branch mispredict ##11096,61096 Flush caused by branch mispredict #12,v,g,n,n,PM_FLUSH_LSU_BR_MPRED,Flush caused by LSU or branch mispredict ##11097,61097 Flush caused by LSU or branch mispredict #13,v,g,n,n,PM_FPU0_FEST,FPU0 executed FEST instruction ##01092 This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #14,v,g,n,n,PM_FPU0_FIN,FPU0 produced a result ##01093 fp0 finished, produced a result. This only indicates finish, not completion. #15,v,g,n,n,PM_FPU0_FMOV_FEST,FPU0 executed FMOV or FEST instructions ##01090 This signal is active for one cycle when fp0 is executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ #16,v,g,n,n,PM_FPU0_FPSCR,FPU0 executed FPSCR instruction ##03098 This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*, mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs #17,v,g,n,n,PM_FPU0_FRSP_FCONV,FPU0 executed FRSP or FCONV instructions ##01091 This signal is active for one cycle when fp0 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #18,v,g,n,n,PM_FPU1_FEST,FPU1 executed FEST instruction ##01096 This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #19,v,g,n,n,PM_FPU1_FIN,FPU1 produced a result ##01097 fp1 finished, produced a result. This only indicates finish, not completion. 
#20,v,g,n,n,PM_FPU1_FMOV_FEST,FPU1 executed FMOV or FEST instructions ##01094 This signal is active for one cycle when fp1 is executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ #21,v,g,n,n,PM_FPU1_FRSP_FCONV,FPU1 executed FRSP or FCONV instructions ##01095 This signal is active for one cycle when fp1 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #22,v,g,n,n,PM_FPU_FEST,FPU executed FEST instruction ##01080 This signal is active for one cycle when executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1. #23,v,g,n,n,PM_FXLS0_FULL_CYC,Cycles FXU0/LS0 queue full ##11090,61090 The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped #24,v,g,n,n,PM_FXLS1_FULL_CYC,Cycles FXU1/LS1 queue full ##11094,61094 The issue queue for FXU/LSU unit 1 cannot accept any more instructions. Issue is stopped #25,v,g,n,n,PM_FXU0_FIN,FXU0 produced a result ##1309A,6309A The Fixed Point unit 0 finished an instruction and produced a result #26,v,g,n,n,PM_FXU1_FIN,FXU1 produced a result ##1309E,6309E The Fixed Point unit 1 finished an instruction and produced a result #27,v,g,n,n,PM_FXU_FIN,FXU produced a result ##63080 The fixed point unit (Unit 0 + Unit 1) finished a marked instruction. Instructions that finish may not necessarily complete. #28,v,g,n,n,PM_GPR_MAP_FULL_CYC,Cycles GPR mapper full ##1309D,6309D The ISU sends a signal indicating that the gpr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #29,v,g,n,n,PM_GRP_DISP_BLK_SB_CYC,Cycles group dispatch blocked by scoreboard ##13099,63099 The ISU sends a signal indicating that dispatch is blocked by scoreboard. 
#30,v,g,n,n,PM_HV_CYC,Hypervisor Cycles ##00004 Cycles when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0) #31,c,g,n,n,PM_INST_CMPL,Instructions completed ##00009 Number of Eligible Instructions that completed. #32,v,g,n,n,PM_INST_FROM_PREF,Instructions fetched from prefetch ##2208D An instruction fetch group was fetched from the prefetch buffer. Fetch Groups can contain up to 8 instructions #33,v,g,n,n,PM_L1_DCACHE_RELOAD_VALID,L1 reload data source valid ##C309C The data source information is valid #34,v,g,n,n,PM_L1_PREF,L1 cache data prefetches ##83099 A request to prefetch data into the L1 was made #35,v,g,n,n,PM_L1_WRITE_CYC,Cycles writing to instruction L1 ##2309B This signal is asserted each cycle a cache write is active. #36,v,g,n,n,PM_L2_PREF,L2 cache prefetches ##8309B A request to prefetch data into L2 was made #37,v,g,n,n,PM_LD_MISS_L1,L1 D cache load misses ##C1080 Total DL1 Load references that miss the DL1 #38,v,g,n,n,PM_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##C1092 A load, executing on unit 0, missed the dcache #39,v,g,n,n,PM_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##C1096 A load, executing on unit 1, missed the dcache #40,v,g,n,n,PM_LD_REF_L1_LSU0,LSU0 L1 D cache load references ##C1090 A load executed on unit 0 #41,v,g,n,n,PM_LD_REF_L1_LSU1,LSU1 L1 D cache load references ##C1094 A load executed on unit 1 #42,v,g,n,n,PM_LSU0_LDF,LSU0 executed Floating Point load instruction ##83098 A floating point load was executed from LSU unit 0 #43,v,g,n,n,PM_LSU1_LDF,LSU1 executed Floating Point load instruction ##8309C A floating point load was executed from LSU unit 1 #44,v,g,n,n,PM_LSU_FLUSH,Flush initiated by LSU ##11095,61095 Flush initiated by LSU #45,u,g,n,n,PM_LSU_LMQ_FULL_CYC,Cycles LMQ full ##C309F The LMQ was full #46,v,g,n,n,PM_LSU_LMQ_LHR_MERGE,LMQ LHR merges ##C709D A dcache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry. 
#47,v,g,n,n,PM_LSU_LMQ_S0_ALLOC,LMQ slot 0 allocated ##C309E The first entry in the LMQ was allocated. #48,v,g,n,n,PM_LSU_LMQ_S0_VALID,LMQ slot 0 valid ##C309D This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ had eight entries that are allocated FIFO #49,u,g,n,n,PM_LSU_LMQ_SRQ_EMPTY_CYC,Cycles LMQ and SRQ empty ##00002 Cycles when both the LMQ and SRQ are empty (LSU is idle) #50,v,g,n,n,PM_LSU_LRQ_FULL_CYC,Cycles LRQ full ##11092,61092 The ISU sends this signal when the LRQ is full. #51,v,g,n,n,PM_LSU_SRQ_FULL_CYC,Cycles SRQ full ##11093,61093 The ISU sends this signal when the srq is full. #52,u,g,n,n,PM_LSU_SRQ_SYNC_CYC,SRQ sync duration ##8309D This signal is asserted every cycle when a sync is in the SRQ. #53,v,g,n,n,PM_MRK_DATA_FROM_MEM,Marked data loaded from memory ##C7087 Marked data loaded from memory #54,v,g,n,n,PM_MRK_L1_RELOAD_VALID,Marked L1 reload data source valid ##C709C The source information is valid and is for a marked load #55,v,g,n,n,PM_MRK_LSU0_FLUSH_LRQ,LSU0 marked LRQ flushes ##81092 A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #56,u,g,n,n,PM_MRK_LSU0_FLUSH_SRQ,LSU0 marked SRQ flushes ##81093 A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group. 
#57,v,g,n,n,PM_MRK_LSU0_FLUSH_ULD,LSU0 marked unaligned load flushes ##81090 A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #58,v,g,n,n,PM_MRK_LSU0_FLUSH_UST,LSU0 marked unaligned store flushes ##81091 A marked store was flushed from unit 0 because it was unaligned #59,v,g,n,n,PM_MRK_LSU1_FLUSH_LRQ,LSU1 marked LRQ flushes ##81096 A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #60,u,g,n,n,PM_MRK_LSU1_FLUSH_SRQ,LSU1 marked SRQ flushes ##81097 A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #61,v,g,n,n,PM_MRK_LSU1_FLUSH_ULD,LSU1 marked unaligned load flushes ##81094 A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #62,u,g,n,n,PM_MRK_LSU1_FLUSH_UST,LSU1 marked unaligned store flushes ##81095 A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #63,u,g,n,n,PM_MRK_LSU_SRQ_INST_VALID,Marked instruction valid in SRQ ##C709E This signal is asserted every cycle when a marked request is resident in the Store Request Queue #64,v,g,n,n,PM_MRK_ST_CMPL_INT,Marked store completed with intervention ##00003 A marked store previously sent to the memory subsystem completed (data home) after requiring intervention #65,v,g,n,n,PM_MRK_VMX_FIN,Marked instruction VMX processing finished ##00005 Marked instruction VMX processing finished #66,v,g,n,n,PM_PMC2_OVERFLOW,PMC2 Overflow ##0000A PMC2 Overflow #67,v,g,n,n,PM_STOP_COMPLETION,Completion stopped ##00001 RAS Unit has signaled completion to stop #68,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##C1093 A store missed the dcache #69,v,g,n,n,PM_ST_REF_L1_LSU0,LSU0 L1 D cache store references 
##C1091 A store executed on unit 0 #70,v,g,n,n,PM_ST_REF_L1_LSU1,LSU1 L1 D cache store references ##C1095 A store executed on unit 1 #71,v,g,n,n,PM_SUSPENDED,Suspended ##00008 Suspended $$$$$$$$ { counter 4 } #0,v,g,n,n,PM_0INST_FETCH,No instructions fetched ##2208D No instructions were fetched this cycle (due to IFU hold, redirect, or icache miss) #1,v,g,n,n,PM_BR_ISSUED,Branches issued ##23098 This signal will be asserted each time the ISU issues a branch instruction. This signal will be asserted each time the ISU selects a branch instruction to issue. #2,v,g,n,n,PM_BR_MPRED_CR,Branch mispredictions due to CR bit setting ##23099 This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction. #3,v,g,n,n,PM_BR_MPRED_TA,Branch mispredictions due to target address ##2309A A branch mispredict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction. #4,u,g,n,n,PM_CRQ_FULL_CYC,Cycles CR issue queue full ##11091,61091 The ISU sends a signal indicating that the issue queue that feeds the ifu cr unit cannot accept any more groups (queue is full of groups). #5,v,g,n,n,PM_CYC,Processor cycles ##0000F Processor cycles #6,u,g,n,n,PM_DC_INV_L2,L1 D cache entries invalidated from L2 ##C1097 A dcache invalidate was received from the L2 because a line in L2 was castout. 
#7,u,g,n,n,PM_DC_PREF_OUT_OF_STREAMS,D cache out of streams ##8309A out of streams #8,v,g,n,n,PM_DC_PREF_STREAM_ALLOC,D cache new prefetch stream allocated ##8309F A new Prefetch Stream was allocated #9,v,g,n,n,PM_EE_OFF,Cycles MSR(EE) bit off ##1309B,6309B The number of Cycles MSR(EE) bit was off. #10,u,g,n,n,PM_EE_OFF_EXT_INT,Cycles MSR(EE) bit off and external interrupt pending ##1309F,6309F Cycles MSR(EE) bit off and external interrupt pending #11,v,g,n,n,PM_FLUSH_BR_MPRED,Flush caused by branch mispredict ##11096,61096 Flush caused by branch mispredict #12,v,g,n,n,PM_FLUSH_LSU_BR_MPRED,Flush caused by LSU or branch mispredict ##11097,61097 Flush caused by LSU or branch mispredict #13,v,g,n,n,PM_FPU0_FEST,FPU0 executed FEST instruction ##01092 This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #14,v,g,n,n,PM_FPU0_FIN,FPU0 produced a result ##01093 fp0 finished, produced a result This only indicates finish, not completion. #15,v,g,n,n,PM_FPU0_FMOV_FEST,FPU0 executed FMOV or FEST instructions ##01090 This signal is active for one cycle when fp0 is executing a move kind of instruction or one of the estimate instructions.. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ #16,v,g,n,n,PM_FPU0_FPSCR,FPU0 executed FPSCR instruction ##03098 This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*. mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs #17,v,g,n,n,PM_FPU0_FRSP_FCONV,FPU0 executed FRSP or FCONV instructions ##01091 This signal is active for one cycle when fp0 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #18,v,g,n,n,PM_FPU1_FEST,FPU1 executed FEST instruction ##01096 This signal is active for one cycle when fp1 is executing one of the estimate instructions. 
This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #19,v,g,n,n,PM_FPU1_FIN,FPU1 produced a result ##01097 fp1 finished, produced a result. This only indicates finish, not completion. #20,v,g,n,n,PM_FPU1_FMOV_FEST,FPU1 executing FMOV or FEST instructions ##01094 This signal is active for one cycle when fp1 is executing a move kind of instruction or one of the estimate instructions.. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ #21,v,g,n,n,PM_FPU1_FRSP_FCONV,FPU1 executed FRSP or FCONV instructions ##01095 This signal is active for one cycle when fp1 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #22,v,g,n,n,PM_FPU_FIN,FPU produced a result ##01080 FPU finished, produced a result This only indicates finish, not completion. Combined Unit 0 + Unit 1 #23,v,g,n,n,PM_FXLS0_FULL_CYC,Cycles FXU0/LS0 queue full ##11090,61090 The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped #24,v,g,n,n,PM_FXLS1_FULL_CYC,Cycles FXU1/LS1 queue full ##11094,61094 The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped #25,v,g,n,n,PM_FXU0_FIN,FXU0 produced a result ##1309A,6309A The Fixed Point unit 0 finished an instruction and produced a result #26,u,g,n,n,PM_FXU1_BUSY_FXU0_IDLE,FXU1 busy FXU0 idle ##00002 FXU0 was idle while FXU1 was busy #27,v,g,n,n,PM_FXU1_FIN,FXU1 produced a result ##1309E,6309E The Fixed Point unit 1 finished an instruction and produced a result #28,v,g,n,n,PM_GPR_MAP_FULL_CYC,Cycles GPR mapper full ##1309D,6309D The ISU sends a signal indicating that the gpr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. 
#29,v,g,n,n,PM_GRP_DISP_BLK_SB_CYC,Cycles group dispatch blocked by scoreboard ##13099,63099 The ISU sends a signal indicating that dispatch is blocked by scoreboard. #30,c,g,n,n,PM_INST_CMPL,Instructions completed ##00009 Number of Eligible Instructions that completed. #31,v,g,n,n,PM_L1_DCACHE_RELOAD_VALID,L1 reload data source valid ##C309C The data source information is valid #32,v,g,n,n,PM_L1_PREF,L1 cache data prefetches ##83099 A request to prefetch data into the L1 was made #33,v,g,n,n,PM_L1_WRITE_CYC,Cycles writing to instruction L1 ##2309B This signal is asserted each cycle a cache write is active. #34,v,g,n,n,PM_L2_PREF,L2 cache prefetches ##8309B A request to prefetch data into L2 was made #35,v,g,n,n,PM_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##C1092 A load, executing on unit 0, missed the dcache #36,v,g,n,n,PM_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##C1096 A load, executing on unit 1, missed the dcache #37,v,g,n,n,PM_LD_REF_L1_LSU0,LSU0 L1 D cache load references ##C1090 A load executed on unit 0 #38,v,g,n,n,PM_LD_REF_L1_LSU1,LSU1 L1 D cache load references ##C1094 A load executed on unit 1 #39,v,g,n,n,PM_LSU0_LDF,LSU0 executed Floating Point load instruction ##83098 A floating point load was executed from LSU unit 0 #40,v,g,n,n,PM_LSU1_LDF,LSU1 executed Floating Point load instruction ##8309C A floating point load was executed from LSU unit 1 #41,v,g,n,n,PM_LSU_FLUSH,Flush initiated by LSU ##11095,61095 Flush initiated by LSU #42,u,g,n,n,PM_LSU_LMQ_FULL_CYC,Cycles LMQ full ##C309F The LMQ was full #43,v,g,n,n,PM_LSU_LMQ_LHR_MERGE,LMQ LHR merges ##C709D A dcache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry. #44,v,g,n,n,PM_LSU_LMQ_S0_ALLOC,LMQ slot 0 allocated ##C309E The first entry in the LMQ was allocated. #45,v,g,n,n,PM_LSU_LMQ_S0_VALID,LMQ slot 0 valid ##C309D This signal is asserted every cycle when the first entry in the LMQ is valid. 
The LMQ had eight entries that are allocated FIFO #46,v,g,n,n,PM_LSU_LRQ_FULL_CYC,Cycles LRQ full ##11092,61092 The ISU sends this signal when the LRQ is full. #47,u,g,n,n,PM_LSU_SRQ_EMPTY_CYC,Cycles SRQ empty ##00003 The Store Request Queue is empty #48,v,g,n,n,PM_LSU_SRQ_FULL_CYC,Cycles SRQ full ##11093,61093 The ISU sends this signal when the srq is full. #49,u,g,n,n,PM_LSU_SRQ_SYNC_CYC,SRQ sync duration ##8309D This signal is asserted every cycle when a sync is in the SRQ. #50,v,g,n,n,PM_MRK_CRU_FIN,Marked instruction CRU processing finished ##00005 The Condition Register Unit finished a marked instruction. Instructions that finish may not necessary complete #51,v,g,n,n,PM_MRK_GRP_CMPL,Marked group completed ##00004 A group containing a sampled instruction completed. Microcoded instructions that span multiple groups will generate this event once per group. #52,v,g,n,n,PM_MRK_L1_RELOAD_VALID,Marked L1 reload data source valid ##C709C The source information is valid and is for a marked load #53,v,g,n,n,PM_MRK_LSU0_FLUSH_LRQ,LSU0 marked LRQ flushes ##81092 A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #54,u,g,n,n,PM_MRK_LSU0_FLUSH_SRQ,LSU0 marked SRQ flushes ##81093 A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group. 
#55,v,g,n,n,PM_MRK_LSU0_FLUSH_ULD,LSU0 marked unaligned load flushes ##81090 A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #56,v,g,n,n,PM_MRK_LSU0_FLUSH_UST,LSU0 marked unaligned store flushes ##81091 A marked store was flushed from unit 0 because it was unaligned #57,v,g,n,n,PM_MRK_LSU1_FLUSH_LRQ,LSU1 marked LRQ flushes ##81096 A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #58,u,g,n,n,PM_MRK_LSU1_FLUSH_SRQ,LSU1 marked SRQ flushes ##81097 A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #59,v,g,n,n,PM_MRK_LSU1_FLUSH_ULD,LSU1 marked unaligned load flushes ##81094 A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #60,u,g,n,n,PM_MRK_LSU1_FLUSH_UST,LSU1 marked unaligned store flushes ##81095 A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #61,u,g,n,n,PM_MRK_LSU_SRQ_INST_VALID,Marked instruction valid in SRQ ##C709E This signal is asserted every cycle when a marked request is resident in the Store Request Queue #62,v,g,n,n,PM_PMC3_OVERFLOW,PMC3 Overflow ##0000A PMC3 Overflow #63,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##C1093 A store missed the dcache #64,v,g,n,n,PM_ST_REF_L1_LSU0,LSU0 L1 D cache store references ##C1091 A store executed on unit 0 #65,v,g,n,n,PM_ST_REF_L1_LSU1,LSU1 L1 D cache store references ##C1095 A store executed on unit 1 #66,v,g,n,n,PM_SUSPENDED,Suspended ##00008 Suspended $$$$$$$$ { counter 5 } #0,v,g,n,n,PM_1PLUS_PPC_CMPL,One or more PPC instruction completed ##00003 A group containing at least one PPC instruction completed. 
For microcoded instructions that span multiple groups, this will only occur once. #1,u,g,n,n,PM_BRQ_FULL_CYC,Cycles branch queue full ##10095,60095 The ISU sends a signal indicating that the issue queue that feeds the ifu br unit cannot accept any more group (queue is full of groups). #2,v,g,n,n,PM_CR_MAP_FULL_CYC,Cycles CR logical operation mapper full ##10094,60094 The ISU sends a signal indicating that the cr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #3,v,g,n,n,PM_CYC,Processor cycles ##0000F Processor cycles #4,v,g,n,n,PM_DATA_FROM_L25_SHR,Data loaded from L2.5 shared ##C3087 DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a demand load #5,v,g,n,n,PM_DATA_TABLEWALK_CYC,Cycles doing data tablewalks ##80097 This signal is asserted every cycle when a tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried. #6,v,g,n,n,PM_DSLB_MISS,Data SLB misses ##80095 A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve #7,v,g,n,n,PM_DTLB_MISS,Data TLB misses ##80094 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. #8,v,g,n,n,PM_FPR_MAP_FULL_CYC,Cycles FPR mapper full ##10091,60091 The ISU sends a signal indicating that the FPR mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #9,v,g,n,n,PM_FPU0_ALL,FPU0 executed add, mult, sub, cmp or sel instruction ##00093 This signal is active for one cycle when fp0 is executing an add, mult, sub, compare, or fsel kind of instruction. 
This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #10,v,g,n,n,PM_FPU0_DENORM,FPU0 received denormalized data ##02098 This signal is active for one cycle when one of the operands is denormalized. #11,v,g,n,n,PM_FPU0_FDIV,FPU0 executed FDIV instruction ##00090 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #12,v,g,n,n,PM_FPU0_FMA,FPU0 executed multiply-add instruction ##00091 This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #13,v,g,n,n,PM_FPU0_FSQRT,FPU0 executed FSQRT instruction ##00092 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #14,v,g,n,n,PM_FPU0_FULL_CYC,Cycles FPU0 issue queue full ##10093,60093 The issue queue for FPU unit 0 cannot accept any more instructions. Issue is stopped #15,v,g,n,n,PM_FPU0_SINGLE,FPU0 executed single precision instruction ##0209B This signal is active for one cycle when fp0 is executing single precision instruction. #16,v,g,n,n,PM_FPU0_STALL3,FPU0 stalled in pipe3 ##02099 This signal indicates that fp0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. #17,v,g,n,n,PM_FPU0_STF,FPU0 executed store instruction ##0209A This signal is active for one cycle when fp0 is executing a store instruction. #18,v,g,n,n,PM_FPU1_ALL,FPU1 executed add, mult, sub, cmp or sel instruction ##00097 This signal is active for one cycle when fp1 is executing an add, mult, sub, compare, or fsel kind of instruction. 
This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo #19,v,g,n,n,PM_FPU1_DENORM,FPU1 received denormalized data ##0209C This signal is active for one cycle when one of the operands is denormalized. #20,v,g,n,n,PM_FPU1_FDIV,FPU1 executed FDIV instruction ##00094 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #21,v,g,n,n,PM_FPU1_FMA,FPU1 executed multiply-add instruction ##00095 This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #22,v,g,n,n,PM_FPU1_FSQRT,FPU1 executed FSQRT instruction ##00096 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #23,v,g,n,n,PM_FPU1_FULL_CYC,Cycles FPU1 issue queue full ##10097,60097 The issue queue for FPU unit 1 cannot accept any more instructions. Issue is stopped #24,v,g,n,n,PM_FPU1_SINGLE,FPU1 executed single precision instruction ##0209F This signal is active for one cycle when fp1 is executing single precision instruction. #25,v,g,n,n,PM_FPU1_STALL3,FPU1 stalled in pipe3 ##0209D This signal indicates that fp1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. #26,v,g,n,n,PM_FPU1_STF,FPU1 executed store instruction ##0209E This signal is active for one cycle when fp1 is executing a store instruction. #27,v,g,n,n,PM_FPU_ALL,FPU executed add, mult, sub, cmp or sel instruction ##00080 This signal is active for one cycle when FPU is executing an add, mult, sub, compare, or fsel kind of instruction. 
This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo. Combined Unit 0 + Unit 1 #28,v,g,n,n,PM_FPU_SINGLE,FPU executed single precision instruction ##02080 FPU is executing single precision instruction. Combined Unit 0 + Unit 1 #29,u,g,n,n,PM_FXU_IDLE,FXU idle ##00002 FXU0 and FXU1 are both idle #30,v,g,n,n,PM_GCT_FULL_CYC,Cycles GCT full ##10090,60090 The ISU sends a signal indicating the gct is full. #31,v,g,n,n,PM_GRP_BR_MPRED,Group experienced a branch mispredict ##1209F,6209F Group experienced a branch mispredict #32,v,g,n,n,PM_GRP_BR_REDIR,Group experienced branch redirect ##1209E,6209E Group experienced branch redirect #33,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected ##1209C,6209C A group that previously attempted dispatch was rejected. #34,v,g,n,n,PM_GRP_DISP_SUCCESS,Group dispatch success ##00001 Number of groups successfully dispatched (not rejected) #35,v,g,n,n,PM_GRP_DISP_VALID,Group dispatch valid ##1209B,6209B Dispatch has been attempted for a valid group. Some groups may be rejected. The total number of successful dispatches is the number of dispatch valid minus dispatch reject. #36,v,g,n,n,PM_GRP_MRK,Group marked in IDU ##00004 A group was sampled (marked) #37,v,g,n,n,PM_IC_PREF_INSTALL,Instruction prefetched installed in prefetch ##2209E New line coming into the prefetch buffer #38,v,g,n,n,PM_IC_PREF_REQ,Instruction prefetch requests ##2209D Asserted when a non-canceled prefetch is made to the cache interface unit (CIU). #39,v,g,n,n,PM_IERAT_XLATE_WR,Translation written to ierat ##2209F This signal will be asserted each time the I-ERAT is written. This indicates that an ERAT miss has been serviced. ERAT misses will initiate a sequence resulting in the ERAT being written. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed. This should be a fairly accurate count of ERAT misses (best available). 
#40,c,g,n,n,PM_INST_CMPL,Instructions completed ##00009 Number of Eligible Instructions that completed. #41,v,g,n,n,PM_INST_DISP,Instructions dispatched ##12098,12099,1209A,62098,62099,6209A The ISU sends the number of instructions dispatched. #42,v,g,n,n,PM_INST_FROM_L25_SHR,Instruction fetched from L2.5 shared ##22086 Instruction fetched from L2.5 shared #43,u,g,n,n,PM_ISLB_MISS,Instruction SLB misses ##80091 A SLB miss for an instruction fetch has occurred #44,v,g,n,n,PM_ITLB_MISS,Instruction TLB misses ##80090 A TLB miss for an Instruction Fetch has occurred #45,v,g,n,n,PM_LARX_LSU0,Larx executed on LSU0 ##8209F A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0) #46,u,g,n,n,PM_LR_CTR_MAP_FULL_CYC,Cycles LR/CTR mapper full ##10096,60096 The ISU sends a signal indicating that the lr/ctr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #47,v,g,n,n,PM_LSU0_DERAT_MISS,LSU0 DERAT misses ##80092 A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #48,v,g,n,n,PM_LSU0_FLUSH_LRQ,LSU0 LRQ flushes ##C0092 A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #49,u,g,n,n,PM_LSU0_FLUSH_SRQ,LSU0 SRQ flushes ##C0093 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. 
#50,v,g,n,n,PM_LSU0_FLUSH_ULD,LSU0 unaligned load flushes ##C0090 A load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #51,v,g,n,n,PM_LSU0_FLUSH_UST,LSU0 unaligned store flushes ##C0091 A store was flushed from unit 0 because it was unaligned (crossed a 4k boundary) #52,v,g,n,n,PM_LSU0_REJECT_ERAT_MISS,LSU0 reject due to ERAT miss ##C609B LSU0 reject due to ERAT miss #53,v,g,n,n,PM_LSU0_REJECT_LMQ_FULL,LSU0 reject due to LMQ full or missed data coming ##C6099 LSU0 reject due to LMQ full or missed data coming #54,v,g,n,n,PM_LSU0_REJECT_RELOAD_CDF,LSU0 reject due to reload CDF or tag update collision ##C609A LSU0 reject due to reload CDF or tag update collision #55,v,g,n,n,PM_LSU0_REJECT_SRQ,LSU0 SRQ rejects ##C6098 LSU0 SRQ rejects #56,u,g,n,n,PM_LSU0_SRQ_STFWD,LSU0 SRQ store forwarded ##C2098 Data from a store instruction was forwarded to a load on unit 0 #57,v,g,n,n,PM_LSU1_DERAT_MISS,LSU1 DERAT misses ##80096 A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #58,v,g,n,n,PM_LSU1_FLUSH_LRQ,LSU1 LRQ flushes ##C0096 A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #59,u,g,n,n,PM_LSU1_FLUSH_SRQ,LSU1 SRQ flushes ##C0097 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. 
#60,v,g,n,n,PM_LSU1_FLUSH_ULD,LSU1 unaligned load flushes ##C0094 A load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #61,u,g,n,n,PM_LSU1_FLUSH_UST,LSU1 unaligned store flushes ##C0095 A store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #62,v,g,n,n,PM_LSU1_REJECT_ERAT_MISS,LSU1 reject due to ERAT miss ##C609F LSU1 reject due to ERAT miss #63,v,g,n,n,PM_LSU1_REJECT_LMQ_FULL,LSU1 reject due to LMQ full or missed data coming ##C609D LSU1 reject due to LMQ full or missed data coming #64,v,g,n,n,PM_LSU1_REJECT_RELOAD_CDF,LSU1 reject due to reload CDF or tag update collision ##C609E LSU1 reject due to reload CDF or tag update collision #65,v,g,n,n,PM_LSU1_REJECT_SRQ,LSU1 SRQ rejects ##C609C LSU1 SRQ rejects #66,u,g,n,n,PM_LSU1_SRQ_STFWD,LSU1 SRQ store forwarded ##C209C Data from a store instruction was forwarded to a load on unit 1 #67,u,g,n,n,PM_LSU_FLUSH_SRQ,SRQ flushes ##C0080 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #68,v,g,n,n,PM_LSU_LRQ_S0_ALLOC,LRQ slot 0 allocated ##C209E LRQ slot zero was allocated #69,v,g,n,n,PM_LSU_LRQ_S0_VALID,LRQ slot 0 valid ##C209A This signal is asserted every cycle that the Load Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. #70,v,g,n,n,PM_LSU_REJECT_ERAT_MISS,LSU reject due to ERAT miss ##C6080 LSU reject due to ERAT miss #71,v,g,n,n,PM_LSU_SRQ_S0_ALLOC,SRQ slot 0 allocated ##C209D SRQ Slot zero was allocated #72,v,g,n,n,PM_LSU_SRQ_S0_VALID,SRQ slot 0 valid ##C2099 This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. 
#73,v,g,n,n,PM_MRK_DATA_FROM_L25_SHR,Marked data loaded from L2.5 shared ##C7087 DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a marked demand load #74,v,g,n,n,PM_MRK_GRP_TIMEO,Marked group completion timeout ##00005 The sampling timeout expired indicating that the previously sampled instruction is no longer in the processor #75,v,g,n,n,PM_MRK_IMR_RELOAD,Marked IMR reloaded ##8209A A DL1 reload occurred due to a marked load #76,v,g,n,n,PM_MRK_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##82098 A marked load, executing on unit 0, missed the dcache #77,v,g,n,n,PM_MRK_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##8209C A marked load, executing on unit 1, missed the dcache #78,v,g,n,n,PM_MRK_STCX_FAIL,Marked STCX failed ##8209E A marked stcx (stwcx or stdcx) failed #79,v,g,n,n,PM_MRK_ST_MISS_L1,Marked L1 D cache store misses ##8209B A marked store missed the dcache #80,v,g,n,n,PM_PMC4_OVERFLOW,PMC4 Overflow ##0000A PMC4 Overflow #81,u,g,n,n,PM_SNOOP_TLBIE,Snoop TLBIE ##80093 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. #82,v,g,n,n,PM_STCX_FAIL,STCX failed ##82099 A stcx (stwcx or stdcx) failed #83,v,g,n,n,PM_STCX_PASS,Stcx passes ##8209D A stcx (stwcx or stdcx) instruction was successful #84,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##C209B A store missed the dcache #85,v,g,n,n,PM_SUSPENDED,Suspended ##00008 Suspended #86,v,g,n,n,PM_XER_MAP_FULL_CYC,Cycles XER mapper full ##10092,60092 The ISU sends a signal indicating that the xer mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. 
$$$$$$$$ { counter 6 } #0,u,g,n,n,PM_BRQ_FULL_CYC,Cycles branch queue full ##10095,60095 The ISU sends a signal indicating that the issue queue that feeds the ifu br unit cannot accept any more group (queue is full of groups). #1,v,g,n,n,PM_CR_MAP_FULL_CYC,Cycles CR logical operation mapper full ##10094,60094 The ISU sends a signal indicating that the cr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #2,v,g,n,n,PM_CYC,Processor cycles ##0000F Processor cycles #3,v,g,n,n,PM_DATA_FROM_L25_MOD,Data loaded from L2.5 modified ##C3087 DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a demand load #4,v,g,n,n,PM_DATA_TABLEWALK_CYC,Cycles doing data tablewalks ##80097 This signal is asserted every cycle when a tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried. #5,v,g,n,n,PM_DSLB_MISS,Data SLB misses ##80095 A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve #6,v,g,n,n,PM_DTLB_MISS,Data TLB misses ##80094 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. #7,v,g,n,n,PM_FPR_MAP_FULL_CYC,Cycles FPR mapper full ##10091,60091 The ISU sends a signal indicating that the FPR mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #8,v,g,n,n,PM_FPU0_ALL,FPU0 executed add, mult, sub, cmp or sel instruction ##00093 This signal is active for one cycle when fp0 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. 
and XYZ** means XYZu, XYZo #9,v,g,n,n,PM_FPU0_DENORM,FPU0 received denormalized data ##02098 This signal is active for one cycle when one of the operands is denormalized. #10,v,g,n,n,PM_FPU0_FDIV,FPU0 executed FDIV instruction ##00090 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #11,v,g,n,n,PM_FPU0_FMA,FPU0 executed multiply-add instruction ##00091 This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #12,v,g,n,n,PM_FPU0_FSQRT,FPU0 executed FSQRT instruction ##00092 This signal is active for one cycle at the end of the microcode executed when fp0 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #13,v,g,n,n,PM_FPU0_FULL_CYC,Cycles FPU0 issue queue full ##10093,60093 The issue queue for FPU unit 0 cannot accept any more instructions. Issue is stopped #14,v,g,n,n,PM_FPU0_SINGLE,FPU0 executed single precision instruction ##0209B This signal is active for one cycle when fp0 is executing single precision instruction. #15,v,g,n,n,PM_FPU0_STALL3,FPU0 stalled in pipe3 ##02099 This signal indicates that fp0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. #16,v,g,n,n,PM_FPU0_STF,FPU0 executed store instruction ##0209A This signal is active for one cycle when fp0 is executing a store instruction. #17,v,g,n,n,PM_FPU1_ALL,FPU1 executed add, mult, sub, cmp or sel instruction ##00097 This signal is active for one cycle when fp1 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. 
and XYZ** means XYZu, XYZo #18,v,g,n,n,PM_FPU1_DENORM,FPU1 received denormalized data ##0209C This signal is active for one cycle when one of the operands is denormalized. #19,v,g,n,n,PM_FPU1_FDIV,FPU1 executed FDIV instruction ##00094 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. #20,v,g,n,n,PM_FPU1_FMA,FPU1 executed multiply-add instruction ##00095 This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. #21,v,g,n,n,PM_FPU1_FSQRT,FPU1 executed FSQRT instruction ##00096 This signal is active for one cycle at the end of the microcode executed when fp1 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. #22,v,g,n,n,PM_FPU1_FULL_CYC,Cycles FPU1 issue queue full ##10097,60097 The issue queue for FPU unit 1 cannot accept any more instructions. Issue is stopped #23,v,g,n,n,PM_FPU1_SINGLE,FPU1 executed single precision instruction ##0209F This signal is active for one cycle when fp1 is executing single precision instruction. #24,v,g,n,n,PM_FPU1_STALL3,FPU1 stalled in pipe3 ##0209D This signal indicates that fp1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. #25,v,g,n,n,PM_FPU1_STF,FPU1 executed store instruction ##0209E This signal is active for one cycle when fp1 is executing a store instruction. #26,v,g,n,n,PM_FPU_FSQRT,FPU executed FSQRT instruction ##00080 This signal is active for one cycle at the end of the microcode executed when FPU is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
Combined Unit 0 + Unit 1 #27,v,g,n,n,PM_FPU_STF,FPU executed store instruction ##02080 FPU is executing a store instruction. Combined Unit 0 + Unit 1 #28,u,g,n,n,PM_FXU_BUSY,FXU busy ##00002 FXU0 and FXU1 are both busy #29,v,g,n,n,PM_GCT_FULL_CYC,Cycles GCT full ##10090,60090 The ISU sends a signal indicating the gct is full. #30,v,g,n,n,PM_GRP_BR_MPRED,Group experienced a branch mispredict ##1209F,6209F Group experienced a branch mispredict #31,v,g,n,n,PM_GRP_BR_REDIR,Group experienced branch redirect ##1209E,6209E Group experienced branch redirect #32,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected ##1209C,6209C A group that previously attempted dispatch was rejected. #33,v,g,n,n,PM_GRP_DISP_VALID,Group dispatch valid ##1209B,6209B Dispatch has been attempted for a valid group. Some groups may be rejected. The total number of successful dispatches is the number of dispatch valid minus dispatch reject. #34,v,g,n,n,PM_IC_PREF_INSTALL,Instruction prefetched installed in prefetch ##2209E New line coming into the prefetch buffer #35,v,g,n,n,PM_IC_PREF_REQ,Instruction prefetch requests ##2209D Asserted when a non-canceled prefetch is made to the cache interface unit (CIU). #36,v,g,n,n,PM_IERAT_XLATE_WR,Translation written to ierat ##2209F This signal will be asserted each time the I-ERAT is written. This indicates that an ERAT miss has been serviced. ERAT misses will initiate a sequence resulting in the ERAT being written. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed. This should be a fairly accurate count of ERAT misses (best available). #37,c,g,n,n,PM_INST_CMPL,Instructions completed ##00009 Number of Eligible Instructions that completed. #38,v,g,n,n,PM_INST_DISP,Instructions dispatched ##12098,12099,1209A,62098,62099,6209A The ISU sends the number of instructions dispatched. 
#39,v,g,n,n,PM_INST_FROM_L25_MOD,Instruction fetched from L2.5 modified ##22086 Instruction fetched from L2.5 modified #40,u,g,n,n,PM_ISLB_MISS,Instruction SLB misses ##80091 A SLB miss for an instruction fetch has occurred #41,v,g,n,n,PM_ITLB_MISS,Instruction TLB misses ##80090 A TLB miss for an Instruction Fetch has occurred #42,v,g,n,n,PM_LARX_LSU0,Larx executed on LSU0 ##8209F A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0) #43,u,g,n,n,PM_LR_CTR_MAP_FULL_CYC,Cycles LR/CTR mapper full ##10096,60096 The ISU sends a signal indicating that the lr/ctr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #44,v,g,n,n,PM_LSU0_DERAT_MISS,LSU0 DERAT misses ##80092 A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #45,v,g,n,n,PM_LSU0_FLUSH_LRQ,LSU0 LRQ flushes ##C0092 A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #46,u,g,n,n,PM_LSU0_FLUSH_SRQ,LSU0 SRQ flushes ##C0093 A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. 
#47,v,g,n,n,PM_LSU0_FLUSH_ULD,LSU0 unaligned load flushes ##C0090 A load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #48,v,g,n,n,PM_LSU0_FLUSH_UST,LSU0 unaligned store flushes ##C0091 A store was flushed from unit 0 because it was unaligned (crossed a 4k boundary) #49,v,g,n,n,PM_LSU0_REJECT_ERAT_MISS,LSU0 reject due to ERAT miss ##C609B LSU0 reject due to ERAT miss #50,v,g,n,n,PM_LSU0_REJECT_LMQ_FULL,LSU0 reject due to LMQ full or missed data coming ##C6099 LSU0 reject due to LMQ full or missed data coming #51,v,g,n,n,PM_LSU0_REJECT_RELOAD_CDF,LSU0 reject due to reload CDF or tag update collision ##C609A LSU0 reject due to reload CDF or tag update collision #52,v,g,n,n,PM_LSU0_REJECT_SRQ,LSU0 SRQ rejects ##C6098 LSU0 SRQ rejects #53,u,g,n,n,PM_LSU0_SRQ_STFWD,LSU0 SRQ store forwarded ##C2098 Data from a store instruction was forwarded to a load on unit 0 #54,v,g,n,n,PM_LSU1_DERAT_MISS,LSU1 DERAT misses ##80096 A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur. #55,v,g,n,n,PM_LSU1_FLUSH_LRQ,LSU1 LRQ flushes ##C0096 A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #56,u,g,n,n,PM_LSU1_FLUSH_SRQ,LSU1 SRQ flushes ##C0097 A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. 
#57,v,g,n,n,PM_LSU1_FLUSH_ULD,LSU1 unaligned load flushes ##C0094 A load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #58,u,g,n,n,PM_LSU1_FLUSH_UST,LSU1 unaligned store flushes ##C0095 A store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #59,v,g,n,n,PM_LSU1_REJECT_ERAT_MISS,LSU1 reject due to ERAT miss ##C609F LSU1 reject due to ERAT miss #60,v,g,n,n,PM_LSU1_REJECT_LMQ_FULL,LSU1 reject due to LMQ full or missed data coming ##C609D LSU1 reject due to LMQ full or missed data coming #61,v,g,n,n,PM_LSU1_REJECT_RELOAD_CDF,LSU1 reject due to reload CDF or tag update collision ##C609E LSU1 reject due to reload CDF or tag update collision #62,v,g,n,n,PM_LSU1_REJECT_SRQ,LSU1 SRQ rejects ##C609C LSU1 SRQ rejects #63,u,g,n,n,PM_LSU1_SRQ_STFWD,LSU1 SRQ store forwarded ##C209C Data from a store instruction was forwarded to a load on unit 1 #64,v,g,n,n,PM_LSU_DERAT_MISS,DERAT misses ##80080 Total D-ERAT Misses (Unit 0 + Unit 1). Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction. #65,v,g,n,n,PM_LSU_FLUSH_LRQ,LRQ flushes ##C0080 A load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #66,v,g,n,n,PM_LSU_LRQ_S0_ALLOC,LRQ slot 0 allocated ##C209E LRQ slot zero was allocated #67,v,g,n,n,PM_LSU_LRQ_S0_VALID,LRQ slot 0 valid ##C209A This signal is asserted every cycle that the Load Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. 
#68,v,g,n,n,PM_LSU_REJECT_RELOAD_CDF,LSU reject due to reload CDF or tag update collision ##C6080 LSU reject due to reload CDF or tag update collision #69,v,g,n,n,PM_LSU_SRQ_S0_ALLOC,SRQ slot 0 allocated ##C209D SRQ Slot zero was allocated #70,v,g,n,n,PM_LSU_SRQ_S0_VALID,SRQ slot 0 valid ##C2099 This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. #71,v,g,n,n,PM_MRK_DATA_FROM_L25_MOD,Marked data loaded from L2.5 modified ##C7087 DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a marked demand load #72,v,g,n,n,PM_MRK_FXU_FIN,Marked instruction FXU processing finished ##00004 Marked instruction FXU processing finished #73,v,g,n,n,PM_MRK_GRP_ISSUED,Marked group issued ##00005 A sampled instruction was issued #74,v,g,n,n,PM_MRK_IMR_RELOAD,Marked IMR reloaded ##8209A A DL1 reload occurred due to a marked load #75,v,g,n,n,PM_MRK_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##82098 A marked load, executing on unit 0, missed the dcache #76,v,g,n,n,PM_MRK_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##8209C A marked load, executing on unit 1, missed the dcache #77,v,g,n,n,PM_MRK_STCX_FAIL,Marked STCX failed ##8209E A marked stcx (stwcx or stdcx) failed #78,v,g,n,n,PM_MRK_ST_GPS,Marked store sent to GPS ##00003 A sampled store has been sent to the memory subsystem #79,v,g,n,n,PM_MRK_ST_MISS_L1,Marked L1 D cache store misses ##8209B A marked store missed the dcache #80,v,g,n,n,PM_PMC5_OVERFLOW,PMC5 Overflow ##0000A PMC5 Overflow #81,u,g,n,n,PM_SNOOP_TLBIE,Snoop TLBIE ##80093 A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction. 
#82,v,g,n,n,PM_STCX_FAIL,STCX failed ##82099 A stcx (stwcx or stdcx) failed #83,v,g,n,n,PM_STCX_PASS,Stcx passes ##8209D A stcx (stwcx or stdcx) instruction was successful #84,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##C209B A store missed the dcache #85,v,g,n,n,PM_SUSPENDED,Suspended ##00008 Suspended #86,v,g,n,n,PM_XER_MAP_FULL_CYC,Cycles XER mapper full ##10092,60092 The ISU sends a signal indicating that the xer mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. $$$$$$$$ { counter 7 } #0,v,g,n,n,PM_BR_ISSUED,Branches issued ##23098 This signal will be asserted each time the ISU issues a branch instruction. This signal will be asserted each time the ISU selects a branch instruction to issue. #1,v,g,n,n,PM_BR_MPRED_CR,Branch mispredictions due to CR bit setting ##23099 This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction. #2,v,g,n,n,PM_BR_MPRED_TA,Branch mispredictions due to target address ##2309A Branch mispredict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction. #3,u,g,n,n,PM_CRQ_FULL_CYC,Cycles CR issue queue full ##11091,61091 The ISU sends a signal indicating that the issue queue that feeds the ifu cr unit cannot accept any more groups (the queue is full of groups). 
#4,v,g,n,n,PM_CYC,Processor cycles ##0000F Processor cycles #5,u,g,n,n,PM_DC_INV_L2,L1 D cache entries invalidated from L2 ##C1097 A dcache invalidated was received from the L2 because a line in L2 was castout. #6,u,g,n,n,PM_DC_PREF_OUT_OF_STREAMS,D cache out of streams ##8309A out of streams #7,v,g,n,n,PM_DC_PREF_STREAM_ALLOC,D cache new prefetch stream allocated ##8309F A new Prefetch Stream was allocated #8,v,g,n,n,PM_EE_OFF,Cycles MSR(EE) bit off ##1309B,6309B The number of Cycles MSR(EE) bit was off. #9,u,g,n,n,PM_EE_OFF_EXT_INT,Cycles MSR(EE) bit off and external interrupt pending ##1309F,6309F Cycles MSR(EE) bit off and external interrupt pending #10,v,g,n,n,PM_FLUSH_BR_MPRED,Flush caused by branch mispredict ##11096,61096 Flush caused by branch mispredict #11,v,g,n,n,PM_FLUSH_LSU_BR_MPRED,Flush caused by LSU or branch mispredict ##11097,61097 Flush caused by LSU or branch mispredict #12,v,g,n,n,PM_FPU0_FEST,FPU0 executed FEST instruction ##01092 This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #13,v,g,n,n,PM_FPU0_FIN,FPU0 produced a result ##01093 fp0 finished, produced a result This only indicates finish, not completion. #14,v,g,n,n,PM_FPU0_FMOV_FEST,FPU0 executed FMOV or FEST instructions ##01090 This signal is active for one cycle when fp0 is executing a move kind of instruction or one of the estimate instructions.. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ #15,v,g,n,n,PM_FPU0_FPSCR,FPU0 executed FPSCR instruction ##03098 This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*. mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs #16,v,g,n,n,PM_FPU0_FRSP_FCONV,FPU0 executed FRSP or FCONV instructions ##01091 This signal is active for one cycle when fp0 is executing frsp or convert kind of instruction. 
This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #17,v,g,n,n,PM_FPU1_FEST,FPU1 executed FEST instruction ##01096 This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #18,v,g,n,n,PM_FPU1_FIN,FPU1 produced a result ##01097 fp1 finished, produced a result. This only indicates finish, not completion. #19,v,g,n,n,PM_FPU1_FMOV_FEST,FPU1 executing FMOV or FEST instructions ##01094 This signal is active for one cycle when fp1 is executing a move kind of instruction or one of the estimate instructions.. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ #20,v,g,n,n,PM_FPU1_FRSP_FCONV,FPU1 executed FRSP or FCONV instructions ##01095 This signal is active for one cycle when fp1 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #21,v,g,n,n,PM_FPU_FRSP_FCONV,FPU executed FRSP or FCONV instructions ##01080 This signal is active for one cycle when executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1 #22,v,g,n,n,PM_FXLS0_FULL_CYC,Cycles FXU0/LS0 queue full ##11090,61090 The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped #23,v,g,n,n,PM_FXLS1_FULL_CYC,Cycles FXU1/LS1 queue full ##11094,61094 The issue queue for FXU/LSU unit 0 cannot accept any more instructions. 
Issue is stopped #24,u,g,n,n,PM_FXU0_BUSY_FXU1_IDLE,FXU0 busy FXU1 idle ##00002 FXU0 is busy while FXU1 was idle #25,v,g,n,n,PM_FXU0_FIN,FXU0 produced a result ##1309A,6309A The Fixed Point unit 0 finished an instruction and produced a result #26,v,g,n,n,PM_FXU1_FIN,FXU1 produced a result ##1309E,6309E The Fixed Point unit 1 finished an instruction and produced a result #27,v,g,n,n,PM_GPR_MAP_FULL_CYC,Cycles GPR mapper full ##1309D,6309D The ISU sends a signal indicating that the gpr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #28,v,g,n,n,PM_GRP_CMPL,Group completed ##00003 A group completed. Microcoded instructions that span multiple groups will generate this event once per group. #29,v,g,n,n,PM_GRP_DISP_BLK_SB_CYC,Cycles group dispatch blocked by scoreboard ##13099,63099 The ISU sends a signal indicating that dispatch is blocked by scoreboard. #30,c,g,n,n,PM_INST_CMPL,Instructions completed ##00009 Number of Eligible Instructions that completed. #31,v,g,n,n,PM_L1_DCACHE_RELOAD_VALID,L1 reload data source valid ##C309C The data source information is valid #32,v,g,n,n,PM_L1_PREF,L1 cache data prefetches ##83099 A request to prefetch data into the L1 was made #33,v,g,n,n,PM_L1_WRITE_CYC,Cycles writing to instruction L1 ##2309B This signal is asserted each cycle a cache write is active. 
#34,v,g,n,n,PM_L2_PREF,L2 cache prefetches ##8309B A request to prefetch data into L2 was made #35,v,g,n,n,PM_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##C1092 A load, executing on unit 0, missed the dcache #36,v,g,n,n,PM_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##C1096 A load, executing on unit 1, missed the dcache #37,v,g,n,n,PM_LD_REF_L1_LSU0,LSU0 L1 D cache load references ##C1090 A load executed on unit 0 #38,v,g,n,n,PM_LD_REF_L1_LSU1,LSU1 L1 D cache load references ##C1094 A load executed on unit 1 #39,v,g,n,n,PM_LSU0_LDF,LSU0 executed Floating Point load instruction ##83098 A floating point load was executed from LSU unit 0 #40,v,g,n,n,PM_LSU1_LDF,LSU1 executed Floating Point load instruction ##8309C A floating point load was executed from LSU unit 1 #41,v,g,n,n,PM_LSU_FLUSH,Flush initiated by LSU ##11095,61095 Flush initiated by LSU #42,u,g,n,n,PM_LSU_LMQ_FULL_CYC,Cycles LMQ full ##C309F The LMQ was full #43,v,g,n,n,PM_LSU_LMQ_LHR_MERGE,LMQ LHR merges ##C709D A dcache miss occured for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry. #44,v,g,n,n,PM_LSU_LMQ_S0_ALLOC,LMQ slot 0 allocated ##C309E The first entry in the LMQ was allocated. #45,v,g,n,n,PM_LSU_LMQ_S0_VALID,LMQ slot 0 valid ##C309D This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ had eight entries that are allocated FIFO #46,v,g,n,n,PM_LSU_LRQ_FULL_CYC,Cycles LRQ full ##11092,61092 The ISU sends this signal when the LRQ is full. #47,v,g,n,n,PM_LSU_SRQ_FULL_CYC,Cycles SRQ full ##11093,61093 The ISU sends this signal when the srq is full. #48,u,g,n,n,PM_LSU_SRQ_SYNC_CYC,SRQ sync duration ##8309D This signal is asserted every cycle when a sync is in the SRQ. #49,v,g,n,n,PM_MRK_FPU_FIN,Marked instruction FPU processing finished ##00004 One of the Floating Point Units finished a marked instruction. 
Instructions that finish may not necessarily complete #50,v,g,n,n,PM_MRK_INST_FIN,Marked instruction finished ##00005 One of the execution units finished a marked instruction. Instructions that finish may not necessarily complete #51,v,g,n,n,PM_MRK_L1_RELOAD_VALID,Marked L1 reload data source valid ##C709C The source information is valid and is for a marked load #52,v,g,n,n,PM_MRK_LSU0_FLUSH_LRQ,LSU0 marked LRQ flushes ##81092 A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #53,u,g,n,n,PM_MRK_LSU0_FLUSH_SRQ,LSU0 marked SRQ flushes ##81093 A marked store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. #54,v,g,n,n,PM_MRK_LSU0_FLUSH_ULD,LSU0 marked unaligned load flushes ##81090 A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #55,v,g,n,n,PM_MRK_LSU0_FLUSH_UST,LSU0 marked unaligned store flushes ##81091 A marked store was flushed from unit 0 because it was unaligned #56,v,g,n,n,PM_MRK_LSU1_FLUSH_LRQ,LSU1 marked LRQ flushes ##81096 A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #57,u,g,n,n,PM_MRK_LSU1_FLUSH_SRQ,LSU1 marked SRQ flushes ##81097 A marked store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. 
#58,v,g,n,n,PM_MRK_LSU1_FLUSH_ULD,LSU1 marked unaligned load flushes ##81094 A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #59,u,g,n,n,PM_MRK_LSU1_FLUSH_UST,LSU1 marked unaligned store flushes ##81095 A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #60,u,g,n,n,PM_MRK_LSU_SRQ_INST_VALID,Marked instruction valid in SRQ ##C709E This signal is asserted every cycle when a marked request is resident in the Store Request Queue #61,v,g,n,n,PM_PMC6_OVERFLOW,PMC6 Overflow ##0000A PMC6 Overflow #62,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##C1093 A store missed the dcache #63,v,g,n,n,PM_ST_REF_L1,L1 D cache store references ##C1080 Total DL1 Store references #64,v,g,n,n,PM_ST_REF_L1_LSU0,LSU0 L1 D cache store references ##C1091 A store executed on unit 0 #65,v,g,n,n,PM_ST_REF_L1_LSU1,LSU1 L1 D cache store references ##C1095 A store executed on unit 1 #66,v,g,n,n,PM_SUSPENDED,Suspended ##00008 Suspended $$$$$$$$ { counter 8 } #0,v,g,n,n,PM_BR_ISSUED,Branches issued ##23098 This signal will be asserted each time the ISU issues a branch instruction. This signal will be asserted each time the ISU selects a branch instruction to issue. #1,v,g,n,n,PM_BR_MPRED_CR,Branch mispredictions due to CR bit setting ##23099 This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction. #2,v,g,n,n,PM_BR_MPRED_TA,Branch mispredictions due to target address ##2309A branch miss predict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. 
This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction. #3,u,g,n,n,PM_CRQ_FULL_CYC,Cycles CR issue queue full ##11091,61091 The ISU sends a signal indicating that the issue queue that feeds the ifu cr unit cannot accept any more groups (the queue is full of groups). #4,v,g,n,n,PM_CYC,Processor cycles ##0000F Processor cycles #5,u,g,n,n,PM_DC_INV_L2,L1 D cache entries invalidated from L2 ##C1097 A dcache invalidate was received from the L2 because a line in L2 was castout. #6,u,g,n,n,PM_DC_PREF_OUT_OF_STREAMS,D cache out of streams ##8309A out of streams #7,v,g,n,n,PM_DC_PREF_STREAM_ALLOC,D cache new prefetch stream allocated ##8309F A new Prefetch Stream was allocated #8,v,g,n,n,PM_EE_OFF,Cycles MSR(EE) bit off ##1309B,6309B The number of Cycles MSR(EE) bit was off. #9,u,g,n,n,PM_EE_OFF_EXT_INT,Cycles MSR(EE) bit off and external interrupt pending ##1309F,6309F Cycles MSR(EE) bit off and external interrupt pending #10,v,g,n,n,PM_EXT_INT,External interrupts ##00002 An external interrupt occurred #11,v,g,n,n,PM_FLUSH_BR_MPRED,Flush caused by branch mispredict ##11096,61096 Flush caused by branch mispredict #12,v,g,n,n,PM_FLUSH_LSU_BR_MPRED,Flush caused by LSU or branch mispredict ##11097,61097 Flush caused by LSU or branch mispredict #13,v,g,n,n,PM_FPU0_FEST,FPU0 executed FEST instruction ##01092 This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #14,v,g,n,n,PM_FPU0_FIN,FPU0 produced a result ##01093 fp0 finished, produced a result. This only indicates finish, not completion. #15,v,g,n,n,PM_FPU0_FMOV_FEST,FPU0 executed FMOV or FEST instructions ##01090 This signal is active for one cycle when fp0 is executing a move kind of instruction or one of the estimate instructions. 
This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ #16,v,g,n,n,PM_FPU0_FPSCR,FPU0 executed FPSCR instruction ##03098 This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*. mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs #17,v,g,n,n,PM_FPU0_FRSP_FCONV,FPU0 executed FRSP or FCONV instructions ##01091 This signal is active for one cycle when fp0 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #18,v,g,n,n,PM_FPU1_FEST,FPU1 executed FEST instruction ##01096 This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. #19,v,g,n,n,PM_FPU1_FIN,FPU1 produced a result ##01097 fp1 finished, produced a result. This only indicates finish, not completion. #20,v,g,n,n,PM_FPU1_FMOV_FEST,FPU1 executing FMOV or FEST instructions ##01094 This signal is active for one cycle when fp1 is executing a move kind of instruction or one of the estimate instructions.. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ #21,v,g,n,n,PM_FPU1_FRSP_FCONV,FPU1 executed FRSP or FCONV instructions ##01095 This signal is active for one cycle when fp1 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. #22,v,g,n,n,PM_FPU_FMOV_FEST,FPU executing FMOV or FEST instructions ##01080 This signal is active for one cycle when executing a move kind of instruction or one of the estimate instructions.. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ . Combined Unit 0 + Unit 1 #23,v,g,n,n,PM_FXLS0_FULL_CYC,Cycles FXU0/LS0 queue full ##11090,61090 The issue queue for FXU/LSU unit 0 cannot accept any more instructions. 
Issue is stopped #24,v,g,n,n,PM_FXLS1_FULL_CYC,Cycles FXU1/LS1 queue full ##11094,61094 The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped #25,v,g,n,n,PM_FXU0_FIN,FXU0 produced a result ##1309A,6309A The Fixed Point unit 0 finished an instruction and produced a result #26,v,g,n,n,PM_FXU1_FIN,FXU1 produced a result ##1309E,6309E The Fixed Point unit 1 finished an instruction and produced a result #27,v,g,n,n,PM_GPR_MAP_FULL_CYC,Cycles GPR mapper full ##1309D,6309D The ISU sends a signal indicating that the gpr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be. #28,v,g,n,n,PM_GRP_DISP_BLK_SB_CYC,Cycles group dispatch blocked by scoreboard ##13099,63099 The ISU sends a signal indicating that dispatch is blocked by scoreboard. #29,v,g,n,n,PM_GRP_DISP_REJECT,Group dispatch rejected ##00003 A group that previously attempted dispatch was rejected. #30,c,g,n,n,PM_INST_CMPL,Instructions completed ##00009 Number of Eligible Instructions that completed. #31,v,g,n,n,PM_L1_DCACHE_RELOAD_VALID,L1 reload data source valid ##C309C The data source information is valid #32,v,g,n,n,PM_L1_PREF,L1 cache data prefetches ##83099 A request to prefetch data into the L1 was made #33,v,g,n,n,PM_L1_WRITE_CYC,Cycles writing to instruction L1 ##2309B This signal is asserted each cycle a cache write is active. 
#34,v,g,n,n,PM_L2_PREF,L2 cache prefetches ##8309B A request to prefetch data into L2 was made #35,v,g,n,n,PM_LD_MISS_L1_LSU0,LSU0 L1 D cache load misses ##C1092 A load, executing on unit 0, missed the dcache #36,v,g,n,n,PM_LD_MISS_L1_LSU1,LSU1 L1 D cache load misses ##C1096 A load, executing on unit 1, missed the dcache #37,v,g,n,n,PM_LD_REF_L1,L1 D cache load references ##C1080 Total DL1 Load references #38,v,g,n,n,PM_LD_REF_L1_LSU0,LSU0 L1 D cache load references ##C1090 A load executed on unit 0 #39,v,g,n,n,PM_LD_REF_L1_LSU1,LSU1 L1 D cache load references ##C1094 A load executed on unit 1 #40,v,g,n,n,PM_LSU0_LDF,LSU0 executed Floating Point load instruction ##83098 A floating point load was executed from LSU unit 0 #41,v,g,n,n,PM_LSU1_LDF,LSU1 executed Floating Point load instruction ##8309C A floating point load was executed from LSU unit 1 #42,v,g,n,n,PM_LSU_FLUSH,Flush initiated by LSU ##11095,61095 Flush initiated by LSU #43,v,g,n,n,PM_LSU_LDF,LSU executed Floating Point load instruction ##83080 LSU executed Floating Point load instruction #44,u,g,n,n,PM_LSU_LMQ_FULL_CYC,Cycles LMQ full ##C309F The LMQ was full #45,v,g,n,n,PM_LSU_LMQ_LHR_MERGE,LMQ LHR merges ##C709D A dcache miss occured for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry. #46,v,g,n,n,PM_LSU_LMQ_S0_ALLOC,LMQ slot 0 allocated ##C309E The first entry in the LMQ was allocated. #47,v,g,n,n,PM_LSU_LMQ_S0_VALID,LMQ slot 0 valid ##C309D This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ had eight entries that are allocated FIFO #48,v,g,n,n,PM_LSU_LRQ_FULL_CYC,Cycles LRQ full ##11092,61092 The ISU sends this signal when the LRQ is full. #49,v,g,n,n,PM_LSU_SRQ_FULL_CYC,Cycles SRQ full ##11093,61093 The ISU sends this signal when the srq is full. #50,u,g,n,n,PM_LSU_SRQ_SYNC_CYC,SRQ sync duration ##8309D This signal is asserted every cycle when a sync is in the SRQ. 
#51,v,g,n,n,PM_MRK_L1_RELOAD_VALID,Marked L1 reload data source valid ##C709C The source information is valid and is for a marked load #52,v,g,n,n,PM_MRK_LSU0_FLUSH_LRQ,LSU0 marked LRQ flushes ##81092 A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #53,u,g,n,n,PM_MRK_LSU0_FLUSH_SRQ,LSU0 marked SRQ flushes ##81093 A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #54,v,g,n,n,PM_MRK_LSU0_FLUSH_ULD,LSU0 marked unaligned load flushes ##81090 A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #55,v,g,n,n,PM_MRK_LSU0_FLUSH_UST,LSU0 marked unaligned store flushes ##81091 A marked store was flushed from unit 0 because it was unaligned #56,v,g,n,n,PM_MRK_LSU1_FLUSH_LRQ,LSU1 marked LRQ flushes ##81096 A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. #57,u,g,n,n,PM_MRK_LSU1_FLUSH_SRQ,LSU1 marked SRQ flushes ##81097 A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group. #58,v,g,n,n,PM_MRK_LSU1_FLUSH_ULD,LSU1 marked unaligned load flushes ##81094 A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1) #59,u,g,n,n,PM_MRK_LSU1_FLUSH_UST,LSU1 marked unaligned store flushes ##81095 A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary) #60,c,g,n,n,PM_MRK_LSU_FIN,Marked instruction LSU processing finished ##00004 One of the Load/Store Units finished a marked instruction. 
Instructions that finish may not necessarily complete #61,u,g,n,n,PM_MRK_LSU_SRQ_INST_VALID,Marked instruction valid in SRQ ##C709E This signal is asserted every cycle when a marked request is resident in the Store Request Queue #62,v,g,n,n,PM_PMC7_OVERFLOW,PMC7 Overflow ##0000A PMC7 Overflow #63,v,g,n,n,PM_ST_MISS_L1,L1 D cache store misses ##C1093 A store missed the dcache #64,v,g,n,n,PM_ST_REF_L1_LSU0,LSU0 L1 D cache store references ##C1091 A store executed on unit 0 #65,v,g,n,n,PM_ST_REF_L1_LSU1,LSU1 L1 D cache store references ##C1095 A store executed on unit 1 #66,v,g,n,n,PM_SUSPENDED,Suspended ##00008 Suspended #67,u,g,n,n,PM_TB_BIT_TRANS,Time Base bit transition ##00005 When the selected time base bit (as specified in MMCR0[TBSEL]) transitions from 0 to 1 { **************************** { THIS IS OPEN SOURCE CODE { **************************** { (C) COPYRIGHT International Business Machines Corp. 2005 { This file is licensed under the University of Tennessee license. { See LICENSE.txt. 
{ { File: events/ppc970/groups { Author: Maynard Johnson { maynardj@us.ibm.com { Mods: { { Number of groups 42 { Group descriptions #0,82,2,67,30,0,2,28,29,pm_slice0,Time Slice 0 ##00005,0000F,00001,00009,00003,0000F,00003,00003 0000051E,00000000,0A46F18C,00002000 Time Slice 0 #1,2,2,37,6,41,37,63,37,pm_eprof,Group for use with eprof ##0000F,0000F,C1080,C1097,12098,00009,C1080,C1080 00000F1E,40030010,05F09000,00002000 Group for use with eprof #2,37,2,37,6,41,37,63,37,pm_basic,Basic performance indicators ##00009,0000F,C1080,C1097,12098,00009,C1080,C1080 0000091E,40030010,05F09000,00002000 Basic performance indicators #3,65,64,4,30,67,65,63,37,pm_lsu,Information on the Load Store Unit ##C0080,C0080,0000F,00009,C0080,C0080,C1080,C1080 00000000,000F0000,7A400000,00002000 Information on the Load Store Unit #4,27,25,22,22,3,26,30,22,pm_fpu1,Floating Point events ##00080,00080,01080,01080,0000F,00080,00009,01080 00000000,00000000,001E0480,00002000 Floating Point events #5,26,26,4,30,27,27,21,43,pm_fpu2,Floating Point events ##02080,02080,0000F,00009,00080,02080,01080,83080 00000000,000020E8,7A400000,00002000 Floating Point events #6,88,1,3,29,46,38,30,4,pm_isu_rename,ISU Rename Pool Events ##10092,10094,11091,13099,10096,12098,00009,0000F 00001228,40000021,8E6D84BC,00002000 ISU Rename Pool Events #7,13,21,23,24,3,37,46,49,pm_isu_queues1,ISU Rename Pool Events ##10093,10097,11090,11094,0000F,00009,11092,11093 0000132E,40000000,851E994C,00002000 ISU Rename Pool Events #8,38,2,25,27,35,32,30,4,pm_isu_flow,ISU Instruction Flow Events ##12098,0000F,1309A,1309E,1209B,1209C,00009,0000F 0000181E,400000B3,D7B7C4BC,00002000 ISU Instruction Flow Events #9,28,84,67,10,3,37,8,10,pm_isu_work,ISU Indicators of Work Blockage ##00004,00001,00001,1309F,0000F,00009,1309B,00002 00000402,40000005,0FDE9D88,00002000 ISU Indicators of Work Blockage #10,10,18,17,21,12,20,30,4,pm_fpu3,Floating Point events by unit ##00090,00094,01091,01095,00091,00095,00009,0000F 
00001028,00000000,8D6354BC,00002000 Floating Point events by unit #11,12,20,14,19,9,17,30,4,pm_fpu4,Floating Point events by unit ##00092,00096,01093,01097,00093,00097,00009,0000F 0000122C,00000000,9DE774BC,00002000 Floating Point events by unit #12,9,17,15,20,3,37,12,18,pm_fpu5,Floating Point events by unit ##02098,0209C,01090,01094,0000F,00009,01092,01096 00001838,000000C0,851E9958,00002000 Floating Point events by unit #13,15,23,14,19,3,37,4,16,pm_fpu7,Floating Point events by unit ##02099,0209D,01093,01097,0000F,00009,0000F,03098 0000193A,000000C8,9DDE97E0,00002000 Floating Point events by unit #14,46,55,4,5,49,56,30,4,pm_lsu_flush,LSU Flush Events ##C0092,C0096,0000F,0000F,C0093,C0097,00009,0000F 0000122C,000C0000,7BE774BC,00002000 LSU Flush Events #15,48,57,40,38,3,37,35,36,pm_lsu_load1,LSU Load Events ##C0090,C0094,C1090,C1094,0000F,00009,C1092,C1096 00001028,000F0000,851E9958,00002000 LSU Load Events #16,49,58,69,65,3,37,62,5,pm_lsu_store1,LSU Store Events ##C0091,C0095,C1091,C1095,0000F,00009,C1093,C1097 0000112A,000F0000,8D5E99DC,00002000 LSU Store Events #17,54,63,69,65,84,2,30,4,pm_lsu_store2,LSU Store Events ##C2098,C209C,C1091,C1095,C209B,0000F,00009,0000F 00001838,0003C0D0,8D76F4BC,00002000 LSU Store Events #18,45,54,4,5,40,2,31,4,pm_lsu7,Information on the Load Store Unit ##80092,80096,0000F,0000F,00009,0000F,C309C,0000F 0000122C,00083004,7BD2FE3C,00002000 Information on the Load Store Unit #19,28,65,30,5,0,37,28,67,pm_misc,Misc Events for testing ##00004,00002,00004,0000F,00003,00009,00003,00005 00000404,00000000,23C69194,00002000 Misc Events for testing #20,27,25,27,22,3,26,30,22,pm_pe_bench1,PE Benchmarker group for FP analysis ##00080,00080,63080,01080,0000F,00080,00009,01080 00000000,10001002,001E0480,00002000 PE Benchmarker group for FP analysis #21,6,41,37,63,3,37,63,37,pm_pe_bench4,PE Benchmarker group for L1 and TLB ##80094,80090,C1080,C1093,0000F,00009,C1080,C1080 00001420,000B0000,04DE9000,00002000 PE Benchmarker group for L1 and TLB 
#22,6,65,37,63,3,37,63,37,pm_hpmcount1,Hpmcount group for L1 and TLB behavior ##80094,00002,C1080,C1093,0000F,00009,C1080,C1080 00001404,000B0000,04DE9000,00002000 Hpmcount group for L1 and TLB behavior #23,27,25,14,19,3,27,30,43,pm_hpmcount2,Hpmcount group for computation ##00080,00080,01093,01097,0000F,02080,00009,83080 00000000,00002028,9DDE0480,00002000 Hpmcount group for computation #24,37,2,37,1,84,2,1,2,pm_l1andbr,L1 misses and branch misspredict analysis ##00009,0000F,C1080,23098,C209B,0000F,23099,2309A 0000091E,8003C01D,0636FCE8,00002000 L1 misses and branch misspredict analysis #25,37,2,37,1,3,84,63,37,pm_imix,Instruction mix: loads, stores and branches ##00009,0000F,C1080,23098,0000F,C209B,C1080,C1080 0000091E,8003C021,061FB000,00002000 Instruction mix: loads, stores and branches #26,82,4,0,2,43,2,30,2,pm_branch,SLB and branch misspredict analysis ##00005,80095,23098,23099,80091,0000F,00009,2309A 0000052A,8008000B,C662F4E8,00002000 SLB and branch misspredict analysis #27,3,37,5,5,4,3,44,47,pm_data,data source and LMQ ##C3087,00009,C3087,0000F,C3087,C3087,C309E,C309D 00000712,0000300E,3BCE7F74,00002000 data source and LMQ #28,6,41,31,5,68,67,32,34,pm_tlb,TLB and LRQ plus data prefetch ##80094,80090,00009,0000F,C209E,C209A,83099,8309B 00001420,0008E03C,4BFDACEC,00002000 TLB and LRQ plus data prefetch #29,40,39,30,30,5,2,28,5,pm_isource,inst source and tablewalk ##22086,22086,00004,00009,80097,0000F,00003,C1097 0000060C,800B00C0,226EF1DC,00002000 inst source and tablewalk #30,69,70,37,49,40,37,4,37,pm_sync,Sync and SRQ ##C209D,C2099,C1080,8309D,00009,00009,0000F,C1080 00001D32,0003E0C1,07529780,00002000 Sync and SRQ #31,39,36,31,5,40,2,30,4,pm_ierat,IERAT ##2208D,2209F,00009,0000F,00009,0000F,00009,0000F 00000D3E,800000C0,4BD2F4BC,00002000 IERAT #32,28,33,33,30,41,64,63,4,pm_derat,DERAT ##00004,6209B,C309C,00009,6209A,80080,C1080,0000F 00000436,100B7052,E274003C,00002000 DERAT #33,75,83,4,51,36,73,50,30,pm_mark1,Information on marked instructions 
##82080,00003,0000F,00004,00004,00005,00005,00009 00000006,00008080,790852A4,00002001 Information on marked instructions #34,73,71,4,50,36,72,49,60,pm_mark2,Marked Instructions Processing Flow ##00002,00005,0000F,00005,00004,00004,00004,00004 0000020A,00000000,79484210,00002001 Marked Instructions Processing Flow #35,79,2,64,51,74,78,60,30,pm_mark3,Marked Stores Processing Flow ##00003,0000F,00003,00004,00005,00003,C709E,00009 0000031E,00203004,190A3F24,00002001 Marked Stores Processing Flow #36,80,72,58,60,3,37,54,58,pm_lsu_mark1,Load Store Unit Marked Events ##8209B,8209A,81091,81095,0000F,00009,81090,81094 00001B34,000280C0,8D5E9850,00002001 Load Store Unit Marked Events #37,76,74,55,57,3,37,53,57,pm_lsu_mark2,Load Store Unit Marked Events ##82098,8209C,81092,81096,0000F,00009,81093,81097 00001838,000280C0,959E99DC,00002001 Load Store Unit Marked Events #38,37,37,27,26,29,28,24,4,pm_fxu1,Fixed Point events by unit ##00009,00009,63080,00002,00002,00002,00002,0000F 00000912,10001002,0084213C,00002000 Fixed Point events by unit #39,37,2,24,23,29,28,25,26,pm_fxu2,Fixed Point events by unit ##00009,0000F,11094,11090,00002,00002,1309A,1309E 0000091E,4000000C,A4042D78,00002000 Fixed Point events by unit #40,39,39,32,0,40,2,4,30,pm_ifu,Instruction Fetch Unit events ##2208D,22086,2208D,2208D,00009,0000F,0000F,00009 00000D0C,800000C0,6B52F7A4,00002000 Instruction Fetch Unit events #41,40,39,32,0,42,39,4,30,pm_L1_icm, Level 1 instruction cache misses ##22086,22086,2208D,2208D,22086,22086,0000F,00009 0000060C,800000F0,6B4C67A4,00002000 Level 1 instruction cache misses papi-papi-7-2-0-t/src/examples/000077500000000000000000000000001502707512200162455ustar00rootroot00000000000000papi-papi-7-2-0-t/src/examples/Makefile000066400000000000000000000026011502707512200177040ustar00rootroot00000000000000PAPIINC = .. 
PAPILIB = ../libpapi.a CC = gcc CFLAGS += -I$(PAPIINC) OS = $(shell uname) TARGETS_NTHD = PAPI_set_domain sprofile multiplex PAPI_state PAPI_reset PAPI_profil PAPI_perror PAPI_get_virt_cyc PAPI_get_real_cyc PAPI_get_opt PAPI_hw_info PAPI_get_executable_info PAPI_ipc PAPI_epc PAPI_flops PAPI_flips PAPI_mix_hl_rate PAPI_mix_ll_rate PAPI_mix_hl_ll PAPI_overflow PAPI_add_remove_event high_level PAPI_add_remove_events TARGETS_PTHREAD = locks_pthreads overflow_pthreads ifeq ($(OS), SunOS) LDFLAGS = $(PAPILIB) -lcpc LDFLAGS_PTHREAD = $(PAPILIB) -lpthread -lcpc TARGETS = $(TARGETS_NTHD) $(TARGETS_PTHREAD) else ifeq ($(OS), AIX) CC = xlc LDFLAGS = $(PAPILIB) -lpmapi LDFLAGS_PTHREAD = $(PAPILIB) -lpthread -lpmapi TARGETS = $(TARGETS_NTHD) $(TARGETS_PTHREAD) else ifeq ($(OS), OSF1) LDFLAGS = $(PAPILIB) -lrt LDFLAGS_PTHREAD = $(PAPILIB) -lpthread -lrt TARGETS = $(TARGETS_NTHD) else ifeq ($(OS), Linux) TARGETS = $(TARGETS_NTHD) $(TARGETS_PTHREAD) else TARGETS = $(TARGETS_NTHD) endif LDFLAGS += $(PAPILIB) LDFLAGS_PTHREAD = $(LDFLAGS) -lpthread endif endif endif all: $(TARGETS) $(TARGETS_NTHD): %:%.o $(CC) -o $@ $(CFLAGS) $^ $(LDFLAGS) $(TARGETS_PTHREAD): %:%.o $(CC) -o $@ $(CFLAGS) $^ $(LDFLAGS_PTHREAD) clean: $(RM) *.o $(TARGETS) papi-papi-7-2-0-t/src/examples/Makefile.AIX000066400000000000000000000012711502707512200203260ustar00rootroot00000000000000PAPIINC = .. PAPILIB = ../libpapi.a CC = xlc CFLAGS = -I$(PAPIINC) LDFLAGS = $(PAPILIB) -lpmapi LDFLAGS_PTHREAD = $(PAPILIB) -lpthread -lpmapi TARGETS = PAPI_set_domain sprofile multiplex PAPI_state PAPI_reset PAPI_profil PAPI_perror PAPI_get_virt_cyc PAPI_get_real_cyc PAPI_get_opt PAPI_hw_info PAPI_get_executable_info PAPI_ipc PAPI_flops PAPI_flips PAPI_overflow PAPI_add_remove_event high_level PAPI_add_remove_events TARGETS_PTHREAD = locks_pthreads overflow_pthreads all: $(TARGETS) $(TARGETS_PTHREAD) $(TARGETS): $$@.c $(CC) $? -o $@ $(CFLAGS) $(LDFLAGS) $(TARGETS_PTHREAD): $$@.c $(CC) $? 
-o $@ $(CFLAGS) $(LDFLAGS_PTHREAD) clean: rm -f *.o $(TARGETS) $(TARGETS_PTHREAD) papi-papi-7-2-0-t/src/examples/Makefile.IRIX64000066400000000000000000000007421502707512200206340ustar00rootroot00000000000000PAPIINC = .. PAPILIB = ../libpapi.a CC = gcc CFLAGS = -I$(PAPIINC) LDFLAGS = $(PAPILIB) TARGETS = PAPI_set_domain sprofile multiplex PAPI_state PAPI_reset PAPI_profil PAPI_perror PAPI_get_virt_cyc PAPI_get_real_cyc PAPI_get_opt PAPI_hw_info PAPI_get_executable_info PAPI_ipc PAPI_flops PAPI_flips PAPI_overflow PAPI_add_remove_event high_level PAPI_add_remove_events all: $(TARGETS) $(TARGETS): $$@.c $(CC) $? -o $@ $(CFLAGS) $(LDFLAGS) clean: rm -f *.o $(TARGETS) papi-papi-7-2-0-t/src/examples/Makefile.OSF1000066400000000000000000000007441502707512200204210ustar00rootroot00000000000000PAPIINC = .. PAPILIB = ../libpapi.a CC = gcc CFLAGS = -I$(PAPIINC) LDFLAGS = $(PAPILIB) -lrt TARGETS = PAPI_set_domain sprofile multiplex PAPI_state PAPI_reset PAPI_profil PAPI_perror PAPI_get_virt_cyc PAPI_get_real_cyc PAPI_get_opt PAPI_hw_info PAPI_get_executable_info PAPI_ipc PAPI_flops PAPI_flips PAPI_overflow PAPI_add_remove_event high_level PAPI_add_remove_events all: $(TARGETS) $(TARGETS): $$@.c $(CC) $? -o $@ $(CFLAGS) $(LDFLAGS) clean: rm -f *.o $(TARGETS) papi-papi-7-2-0-t/src/examples/PAPI_add_remove_event.c000066400000000000000000000072061502707512200225350ustar00rootroot00000000000000/***************************************************************************** * This example shows how to use PAPI_add_event, PAPI_start, PAPI_read, * * PAPI_stop and PAPI_remove_event. 
* ******************************************************************************/ #include <stdio.h> #include <stdlib.h> #include "papi.h" /* This needs to be included every time you use PAPI */ #define NUM_EVENTS 2 #define ERROR_RETURN(retval) { fprintf(stderr, "Error %d %s:line %d: \n", retval,__FILE__,__LINE__); exit(retval); } int main() { int EventSet = PAPI_NULL; int tmp, i; /*must be initialized to PAPI_NULL before calling PAPI_create_event*/ long long values[NUM_EVENTS]; /*This is where we store the values we read from the eventset */ /* We use number to keep track of the number of events in the EventSet */ int retval, number; char errstring[PAPI_MAX_STR_LEN]; /*************************************************************************** * This part initializes the library and compares the version number of the* * header file, to the version of the library, if these don't match then it * * is likely that PAPI won't work correctly. If there is an error, retval * * keeps track of the version number. * ***************************************************************************/ if((retval = PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT ) ERROR_RETURN(retval); /* Creating the eventset */ if ( (retval = PAPI_create_eventset(&EventSet)) != PAPI_OK) ERROR_RETURN(retval); /* Add Total Instructions Executed to the EventSet */ if ( (retval = PAPI_add_event(EventSet, PAPI_TOT_INS)) != PAPI_OK) ERROR_RETURN(retval); /* Add Total Cycles event to the EventSet */ if ( (retval = PAPI_add_event(EventSet, PAPI_TOT_CYC)) != PAPI_OK) ERROR_RETURN(retval); /* get the number of events in the event set */ number = 0; if ( (retval = PAPI_list_events(EventSet, NULL, &number)) != PAPI_OK) ERROR_RETURN(retval); printf("There are %d events in the event set\n", number); /* Start counting */ if ( (retval = PAPI_start(EventSet)) != PAPI_OK) ERROR_RETURN(retval); /* you can replace your code here */ tmp=0; for (i = 0; i < 2000000; i++) { tmp = i + tmp; } /* read the counter values and store them in
the values array */ if ( (retval=PAPI_read(EventSet, values)) != PAPI_OK) ERROR_RETURN(retval); printf("The total instructions executed for the first loop are %lld \n", values[0] ); printf("The total cycles executed for the first loop are %lld \n",values[1]); /* our slow code again */ tmp=0; for (i = 0; i < 2000000; i++) { tmp = i + tmp; } /* Stop counting and store the values into the array */ if ( (retval = PAPI_stop(EventSet, values)) != PAPI_OK) ERROR_RETURN(retval); printf("Total instructions executed are %lld \n", values[0] ); printf("Total cycles executed are %lld \n",values[1]); /* Remove event: We are going to take the PAPI_TOT_INS from the eventset */ if( (retval = PAPI_remove_event(EventSet, PAPI_TOT_INS)) != PAPI_OK) ERROR_RETURN(retval); printf("Removing PAPI_TOT_INS from the eventset\n"); /* Now we list how many events are left on the event set */ number = 0; if ((retval=PAPI_list_events(EventSet, NULL, &number))!= PAPI_OK) ERROR_RETURN(retval); printf("There is only %d event left in the eventset now\n", number); /* free the resources used by PAPI */ PAPI_shutdown(); exit(0); } papi-papi-7-2-0-t/src/examples/PAPI_add_remove_events.c000066400000000000000000000057321502707512200227220ustar00rootroot00000000000000/****************************************************************************** * This is a simple low level function demonstration on using PAPI_add_events * * to add an array of events to a created eventset, we are going to use these * * events to monitor a set of instructions, start the counters, read the * * counters and then cleanup the eventset when done. In this example we use * * the presets PAPI_TOT_INS and PAPI_TOT_CYC. PAPI_add_events,PAPI_start, * * PAPI_stop, PAPI_clean_eventset, PAPI_destroy_eventset and * * PAPI_create_eventset all return PAPI_OK(which is 0) when succesful. 
* ******************************************************************************/ #include <stdio.h> #include <stdlib.h> #include "papi.h" /* This needs to be included every time you use PAPI */ #define NUM_EVENT 2 #define THRESHOLD 100000 #define ERROR_RETURN(retval) { fprintf(stderr, "Error %d %s:line %d: \n", retval,__FILE__,__LINE__); exit(retval); } int main(){ int i,retval,tmp=0; int EventSet = PAPI_NULL; /*must be initialized to PAPI_NULL before calling PAPI_create_event*/ int event_codes[NUM_EVENT]={PAPI_TOT_INS,PAPI_TOT_CYC}; char errstring[PAPI_MAX_STR_LEN]; long long values[NUM_EVENT]; /*************************************************************************** * This part initializes the library and compares the version number of the * * header file, to the version of the library, if these don't match then it * * is likely that PAPI won't work correctly. If there is an error, retval * * keeps track of the version number. * ****************************************************************************/ if((retval = PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT ) { fprintf(stderr, "Error: %s\n", errstring); exit(1); } /* Creating event set */ if ((retval=PAPI_create_eventset(&EventSet)) != PAPI_OK) ERROR_RETURN(retval); /* Add the array of events PAPI_TOT_INS and PAPI_TOT_CYC to the eventset*/ if ((retval=PAPI_add_events(EventSet, event_codes, NUM_EVENT)) != PAPI_OK) ERROR_RETURN(retval); /* Start counting */ if ( (retval=PAPI_start(EventSet)) != PAPI_OK) ERROR_RETURN(retval); /*** this is where your computation goes *********/ for(i=0;i<1000;i++) { tmp = tmp+i; } /* Stop counting; this reads from the counter as well as stops it. */ if ( (retval=PAPI_stop(EventSet,values)) != PAPI_OK) ERROR_RETURN(retval); printf("\nThe total instructions executed are %lld, total cycles %lld\n", values[0],values[1]); if ( (retval=PAPI_remove_events(EventSet,event_codes, NUM_EVENT))!=PAPI_OK) ERROR_RETURN(retval); /* Free all memory and data structures, EventSet must be empty.
*/ if ( (retval=PAPI_destroy_eventset(&EventSet)) != PAPI_OK) ERROR_RETURN(retval); /* free the resources used by PAPI */ PAPI_shutdown(); exit(0); } papi-papi-7-2-0-t/src/examples/PAPI_epc.c000066400000000000000000000036021502707512200177720ustar00rootroot00000000000000/***************************************************************************** * This example demonstrates the usage of the function PAPI_epc which * * measures arbitrary events per cpu cycle * *****************************************************************************/ /***************************************************************************** * The first call to PAPI_epc() will initialize the PAPI interface, * * set up the counters to monitor the user specified event, PAPI_TOT_CYC, * * and PAPI_REF_CYC (if it exists) and start the counters. Subsequent calls * * will read the counters and return real time, process time, event counts, * * the core and reference cycle count and EPC rate since the latest call to * * PAPI_epc(). 
* *****************************************************************************/ #include #include #include "papi.h" int your_slow_code(); int main() { float real_time, proc_time, epc; long long ref, core, evt; float real_time_i, proc_time_i, epc_i; long long ref_i, core_i, evt_i; int retval; if((retval=PAPI_epc(PAPI_TOT_INS, &real_time_i, &proc_time_i, &ref_i, &core_i, &evt_i, &epc_i)) < PAPI_OK) { printf("Could not initialise PAPI_epc \n"); printf("retval: %d\n", retval); exit(1); } your_slow_code(); if((retval=PAPI_epc(PAPI_TOT_INS, &real_time, &proc_time, &ref, &core, &evt, &epc)) #include #include "papi.h" int your_slow_code(); int main() { float real_time, proc_time,mflips; long long flpins; float ireal_time, iproc_time, imflips; long long iflpins; int retval; /*********************************************************************** * if PAPI_FP_INS is a derived event in your platform, then your * * platform must have at least three counters to support PAPI_flips, * * because PAPI needs one counter to cycles. So in UltraSparcIII, even * * the platform supports PAPI_FP_INS, but UltraSparcIII only have two * * available hardware counters and PAPI_FP_INS is a derived event in * * this platform, so PAPI_flops returns an error. 
* ***********************************************************************/ if((retval=PAPI_flips_rate(PAPI_FP_INS,&ireal_time,&iproc_time,&iflpins,&imflips)) < PAPI_OK) { printf("Could not initialise PAPI_flips \n"); printf("Your platform may not support floating point instruction event.\n"); printf("retval: %d\n", retval); exit(1); } your_slow_code(); if((retval=PAPI_flips_rate(PAPI_FP_INS,&real_time, &proc_time, &flpins, &mflips)) #include #include "papi.h" int your_slow_code(); int main() { float real_time, proc_time,mflops; long long flpops; float ireal_time, iproc_time, imflops; long long iflpops; int retval; /*********************************************************************** * If PAPI_FP_OPS is a derived event in your platform, then your * * platform must have at least three counters to support * * PAPI_flops_rate, because PAPI needs one counter for cycles. So in * * UltraSparcIII, even though the platform supports PAPI_FP_OPS, * * UltraSparcIII only has two available hardware counters, and * * PAPI_FP_OPS is a derived event that requires both of them, so * * PAPI_flops_rate returns an error. 
* ***********************************************************************/ if((retval=PAPI_flops_rate(PAPI_FP_OPS,&ireal_time,&iproc_time,&iflpops,&imflops)) < PAPI_OK) { printf("Could not initialise PAPI_flops \n"); printf("Your platform may not support floating point operation event.\n"); printf("retval: %d\n", retval); exit(1); } your_slow_code(); if((retval=PAPI_flops_rate(PAPI_FP_OPS,&real_time, &proc_time, &flpops, &mflops)) #include #include "papi.h" /* This needs to be included every time you use PAPI */ int main() { int i,tmp=0; int retval; const PAPI_exe_info_t *prginfo = NULL; /**************************************************************************** * This part initializes the library and compares the version number of the * * header file, to the version of the library, if these don't match then it * * is likely that PAPI won't work correctly.If there is an error, retval * * keeps track of the version number. * ****************************************************************************/ if((retval = PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT ) { printf("Library initialization error! \n"); exit(1); } for(i=0;i<1000;i++) tmp=tmp+i; /* PAPI_get_executable_info returns a NULL if there is an error */ if ((prginfo = PAPI_get_executable_info()) == NULL) { printf("PAPI_get_executable_info error! \n"); exit(1); } printf("Start text addess of user program is at %p\n", prginfo->address_info.text_start); printf("End text address of user program is at %p\n", prginfo->address_info.text_end); exit(0); } papi-papi-7-2-0-t/src/examples/PAPI_get_opt.c000066400000000000000000000060431502707512200206660ustar00rootroot00000000000000/***************************************************************************** * This is an example using the low level function PAPI_get_opt to query the * * option settings of the PAPI library or a specific eventset created by the * * PAPI_create_eventset function. 
PAPI_set_opt is used on the other hand to * * set PAPI library or event set options. * *****************************************************************************/ #include <stdio.h> #include <stdlib.h> #include <string.h> #include "papi.h" /* This needs to be included every time you use PAPI */ #define ERROR_RETURN(retval) { fprintf(stderr, "Error %d %s:line %d: \n", retval,__FILE__,__LINE__); exit(retval); } int poorly_tuned_function() { float tmp = 0; int i; for(i=1; i<2000; i++) { tmp=(tmp+100)/i; } return 0; } int main() { int num, retval, EventSet = PAPI_NULL; PAPI_option_t options; long long values[2]; /**************************************************************************** * This part initializes the library and compares the version number of the * * header file, to the version of the library, if these don't match then it * * is likely that PAPI won't work correctly. If there is an error, retval * * keeps track of the version number. * ****************************************************************************/ if((retval = PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT ) { printf("Library initialization error! \n"); exit(1); } /*PAPI_get_opt returns a negative number if there is an error */ /* This call returns the maximum available hardware counters */ if((num = PAPI_get_opt(PAPI_MAX_HWCTRS,NULL)) <= 0) ERROR_RETURN(num); printf("This machine has %d counters.\n",num); if ((retval=PAPI_create_eventset(&EventSet)) != PAPI_OK) ERROR_RETURN(retval); /* Set the domain of this EventSet to count user and kernel modes for this process.
*/ memset(&options,0x0,sizeof(options)); options.domain.eventset = EventSet; /* Default domain is PAPI_DOM_USER */ options.domain.domain = PAPI_DOM_ALL; /* this sets the options for the domain */ if ((retval=PAPI_set_opt(PAPI_DOMAIN, &options)) != PAPI_OK) ERROR_RETURN(retval); /* Add Total Instructions Executed event to the EventSet */ if ( (retval = PAPI_add_event(EventSet, PAPI_TOT_INS)) != PAPI_OK) ERROR_RETURN(retval); /* Add Total Cycles Executed event to the EventSet */ if ( (retval = PAPI_add_event(EventSet, PAPI_TOT_CYC)) != PAPI_OK) ERROR_RETURN(retval); /* Start counting */ if((retval=PAPI_start(EventSet)) != PAPI_OK) ERROR_RETURN(retval); poorly_tuned_function(); /* Stop counting */ if((retval=PAPI_stop(EventSet, values)) != PAPI_OK) ERROR_RETURN(retval); printf(" Total instructions: %lld Total Cycles: %lld \n", values[0], values[1]); /* clean up */ PAPI_shutdown(); exit(0); } papi-papi-7-2-0-t/src/examples/PAPI_get_real_cyc.c000066400000000000000000000033121502707512200216410ustar00rootroot00000000000000/****************************************************************************** * This is an example to show how to use low level function PAPI_get_real_cyc * * and PAPI_get_real_usec. * ******************************************************************************/ #include #include #include "papi.h" /* This needs to be included every time you use PAPI */ int your_slow_code() { int i,tmp; for(i=1; i<20000; i++) { tmp=(tmp+100)/i; } return 0; } int main() { long long s,s1, e, e1; int retval; /**************************************************************************** * This part initializes the library and compares the version number of the * * header file, to the version of the library, if these don't match then it * * is likely that PAPI won't work correctly.If there is an error, retval * * keeps track of the version number. 
* ****************************************************************************/ if((retval = PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT ) { printf("Library initialization error! \n"); exit(1); } /* Here you get initial cycles and time */ /* No error checking is done here because this function call is always successful */ s = PAPI_get_real_cyc(); your_slow_code(); /*Here you get final cycles and time */ e = PAPI_get_real_cyc(); s1= PAPI_get_real_usec(); your_slow_code(); e1= PAPI_get_real_usec(); printf("Wallclock cycles : %lld\nWallclock time(us): %lld\n",e-s,e1-s1); /* clean up */ PAPI_shutdown(); exit(0); } papi-papi-7-2-0-t/src/examples/PAPI_get_virt_cyc.c000066400000000000000000000033111502707512200217010ustar00rootroot00000000000000/****************************************************************************** * This is an example to show how to use low level function PAPI_get_virt_cyc * * and PAPI_get_virt_usec. * ******************************************************************************/ #include <stdio.h> #include <stdlib.h> #include "papi.h" /* This needs to be included every time you use PAPI */ int i; double tmp; int your_slow_code() { for(i=1; i<200000; i++) { tmp= (tmp+i)/2; } return 0; } int main() { long long s,s1, e, e1; int retval; /**************************************************************************** * This part initializes the library and compares the version number of the * * header file, to the version of the library, if these don't match then it * * is likely that PAPI won't work correctly. If there is an error, retval * * keeps track of the version number. * ****************************************************************************/ if((retval = PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT ) { printf("Library initialization error!
\n"); exit(1); } /* Here you get initial cycles and time */ /* No error checking is done here because this function call is always successful */ s = PAPI_get_virt_cyc(); your_slow_code(); /*Here you get final cycles and time */ e = PAPI_get_virt_cyc(); s1= PAPI_get_virt_usec(); your_slow_code(); e1= PAPI_get_virt_usec(); printf("Virtual cycles : %lld\nVirtual time(us): %lld\n",e-s,e1-s1); /* clean up */ PAPI_shutdown(); exit(0); } papi-papi-7-2-0-t/src/examples/PAPI_hw_info.c000066400000000000000000000032511502707512200206540ustar00rootroot00000000000000/**************************************************************************** * This is a simple low level example for getting information on the system * * hardware. This function PAPI_get_hardware_info(), returns a pointer to a * * structure of type PAPI_hw_info_t, which contains number of CPUs, nodes, * * vendor number/name for CPU, CPU revision, clock speed. * ****************************************************************************/ #include <stdio.h> #include <stdlib.h> #include "papi.h" /* This needs to be included every time you use PAPI */ int main() { const PAPI_hw_info_t *hwinfo = NULL; int retval; /*************************************************************************** * This part initializes the library and compares the version number of the* * header file, to the version of the library, if these don't match then it * * is likely that PAPI won't work correctly. If there is an error, retval * * keeps track of the version number. * ***************************************************************************/ if((retval = PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT ) { printf("Library initialization error! \n"); exit(1); } /* Get hardware info*/ if ((hwinfo = PAPI_get_hardware_info()) == NULL) { printf("PAPI_get_hardware_info error!
\n"); exit(1); } /* when there is an error, PAPI_get_hardware_info returns NULL */ printf("%d CPU at %f Mhz.\n",hwinfo->totalcpus,hwinfo->mhz); printf(" model string is %s \n", hwinfo->model_string); /* clean up */ PAPI_shutdown(); exit(0); } papi-papi-7-2-0-t/src/examples/PAPI_ipc.c000066400000000000000000000032641502707512200200020ustar00rootroot00000000000000/***************************************************************************** * This example demonstrates the usage of the function PAPI_ipc which * * measures the number of instructions executed per cpu cycle * *****************************************************************************/ /***************************************************************************** * The first call to PAPI_ipc initializes the PAPI library, set up the * * counters to monitor PAPI_TOT_INS and PAPI_TOT_CYC events, and start the * * counters. Subsequent calls will read the counters and return real time, * * process time, instructions, and the instructions per cycle rate since the * * latest call to PAPI_ipc. 
* *****************************************************************************/ #include #include #include "papi.h" int your_slow_code(); int main() { float real_time, proc_time,ipc; long long ins; float real_time_i, proc_time_i, ipc_i; long long ins_i; int retval; if((retval=PAPI_ipc(&real_time_i,&proc_time_i,&ins_i,&ipc_i)) < PAPI_OK) { printf("Could not initialise PAPI_ipc \n"); printf("retval: %d\n", retval); exit(1); } your_slow_code(); if((retval=PAPI_ipc( &real_time, &proc_time, &ins, &ipc)) #include #include "papi.h" #define THRESHOLD 10000 #define ERROR_RETURN(retval) { fprintf(stderr, "Error %d %s:line %d: \n", retval,__FILE__,__LINE__); exit(retval); } int your_slow_code(); int main() { float ipc; int retval; int EventSet = PAPI_NULL; long_long values[2]; if ( (retval = PAPI_hl_region_begin("slow_code")) < PAPI_OK ) ERROR_RETURN(retval); your_slow_code(); if ( (retval = PAPI_hl_region_end("slow_code")) < PAPI_OK ) ERROR_RETURN(retval); if ( (retval = PAPI_hl_stop()) < PAPI_OK ) ERROR_RETURN(retval); /* get IPC using low-level API */ if ( (retval = PAPI_create_eventset(&EventSet)) < PAPI_OK ) ERROR_RETURN(retval); if ( (retval = PAPI_add_event(EventSet, PAPI_TOT_INS)) < PAPI_OK ) ERROR_RETURN(retval); if ( (retval = PAPI_add_event(EventSet, PAPI_TOT_CYC)) < PAPI_OK ) ERROR_RETURN(retval); if ( (retval = PAPI_start(EventSet)) < PAPI_OK ) ERROR_RETURN(retval); your_slow_code(); if ( (retval = PAPI_stop(EventSet, values)) < PAPI_OK ) ERROR_RETURN(retval); ipc = (float) ((float)values[0] / (float) ( values[1])); printf("Results from the low-level API:\n"); printf("IPC: %f\n", ipc); exit(0); } int your_slow_code() { int i; double tmp=1.1; for(i=1; i<2000; i++) { tmp=(tmp+100)/i; } return 0; } papi-papi-7-2-0-t/src/examples/PAPI_mix_hl_rate.c000066400000000000000000000040721502707512200215200ustar00rootroot00000000000000/***************************************************************************** * This example compares the measurement of IPC using the rate 
function * * PAPI_ipc and the high-level region instrumentation. Both methods should * * deliver the same result for IPC. * * Hint: Use PAPI's high-level output script to print the measurement report * * of the high-level API. * * * * ../high-level/scripts/papi_hl_output_writer.py --type=accumulate * *****************************************************************************/ #include <stdio.h> #include <stdlib.h> #include "papi.h" #define THRESHOLD 10000 #define ERROR_RETURN(retval) { fprintf(stderr, "Error %d %s:line %d: \n", retval,__FILE__,__LINE__); exit(retval); } int your_slow_code(); int main() { float real_time, proc_time,ipc; long long ins; int retval; if ( (retval = PAPI_ipc(&real_time, &proc_time, &ins ,&ipc)) < PAPI_OK ) ERROR_RETURN(retval); your_slow_code(); if ( (retval = PAPI_ipc( &real_time, &proc_time, &ins, &ipc)) < PAPI_OK ) ERROR_RETURN(retval); printf("Real_time: %f Proc_time: %f Instructions: %lld IPC: %f\n", real_time, proc_time,ins,ipc); if ( (retval = PAPI_hl_region_begin("slow_code")) < PAPI_OK ) ERROR_RETURN(retval); your_slow_code(); if ( (retval = PAPI_hl_region_end("slow_code")) < PAPI_OK ) ERROR_RETURN(retval); if ( (retval = PAPI_ipc(&real_time, &proc_time, &ins ,&ipc)) < PAPI_OK ) ERROR_RETURN(retval); your_slow_code(); if ( (retval = PAPI_ipc( &real_time, &proc_time, &ins, &ipc)) < PAPI_OK ) ERROR_RETURN(retval); printf("Real_time: %f Proc_time: %f Instructions: %lld IPC: %f\n", real_time, proc_time,ins,ipc); if ( (retval = PAPI_rate_stop()) < PAPI_OK ) ERROR_RETURN(retval); exit(0); } int your_slow_code() { int i; double tmp=1.1; for(i=1; i<2000; i++) { tmp=(tmp+100)/i; } return 0; } papi-papi-7-2-0-t/src/examples/PAPI_mix_ll_rate.c000066400000000000000000000041331502707512200215220ustar00rootroot00000000000000/***************************************************************************** * This example compares the measurement of IPC using the rate function * * PAPI_ipc and the low-level API.
Both methods should deliver the same * * result for IPC. * * Note: There is no need to initialize PAPI for the low-level functions * * since this is done by PAPI_ipc. * *****************************************************************************/ #include <stdio.h> #include <stdlib.h> #include "papi.h" #define THRESHOLD 10000 #define ERROR_RETURN(retval) { fprintf(stderr, "Error %d %s:line %d: \n", retval,__FILE__,__LINE__); exit(retval); } int your_slow_code(); int main() { float real_time, proc_time, ipc; long long ins; int retval; int EventSet = PAPI_NULL; long_long values[2]; if ( (retval = PAPI_ipc(&real_time, &proc_time, &ins ,&ipc)) < PAPI_OK ) ERROR_RETURN(retval); your_slow_code(); if ( (retval = PAPI_ipc( &real_time, &proc_time, &ins, &ipc)) < PAPI_OK ) ERROR_RETURN(retval); printf("Results from PAPI_ipc:\n"); printf("Real_time: %f Proc_time: %f Instructions: %lld IPC: %f\n", real_time, proc_time,ins,ipc); if ( (retval = PAPI_rate_stop()) < PAPI_OK ) ERROR_RETURN(retval); /* get IPC using low-level API */ if ( (retval = PAPI_create_eventset(&EventSet)) < PAPI_OK ) ERROR_RETURN(retval); if ( (retval = PAPI_add_event(EventSet, PAPI_TOT_INS)) < PAPI_OK ) ERROR_RETURN(retval); if ( (retval = PAPI_add_event(EventSet, PAPI_TOT_CYC)) < PAPI_OK ) ERROR_RETURN(retval); if ( (retval = PAPI_start(EventSet)) < PAPI_OK ) ERROR_RETURN(retval); your_slow_code(); if ( (retval = PAPI_stop(EventSet, values)) < PAPI_OK ) ERROR_RETURN(retval); ipc = (float) ((float)values[0] / (float) ( values[1])); printf("Results from the low-level API:\n"); printf("IPC: %f\n", ipc); exit(0); } int your_slow_code() { int i; double tmp=1.1; for(i=1; i<2000; i++) { tmp=(tmp+100)/i; } return 0; } papi-papi-7-2-0-t/src/examples/PAPI_overflow.c000066400000000000000000000104671502707512200210730ustar00rootroot00000000000000/***************************************************************************** * This example shows how to use PAPI_overflow to set up an event set to * * begin registering overflows.
******************************************************************************/ #include #include #include "papi.h" /* This needs to be included every time you use PAPI */ #include #define OVER_FMT "handler(%d ) Overflow at %p! bit=%#llx \n" #define THRESHOLD 100000 #define ERROR_RETURN(retval) { fprintf(stderr, "Error %d %s:line %d: \n", retval,__FILE__,__LINE__); exit(retval); } int total = 0; /* we use total to track the amount of overflows that occurred */ /* THis is the handler called by PAPI_overflow*/ void handler(int EventSet, void *address, long long overflow_vector, void *context) { fprintf(stderr, OVER_FMT, EventSet, address, overflow_vector); total++; } int main () { int EventSet = PAPI_NULL; /* must be set to null before calling PAPI_create_eventset */ char errstring[PAPI_MAX_STR_LEN]; long long (values[2])[2]; int retval, i; double tmp = 0; int PAPI_event; /* a place holder for an event preset */ char event_name[PAPI_MAX_STR_LEN]; /**************************************************************************** * This part initializes the library and compares the version number of the * * header file, to the version of the library, if these don't match then it * * is likely that PAPI won't work correctly.If there is an error, retval * * keeps track of the version number. * ****************************************************************************/ if ((retval = PAPI_library_init (PAPI_VER_CURRENT)) != PAPI_VER_CURRENT) { printf("Library initialization error! 
\n"); exit(1); } /* Here we create the eventset */ if ((retval=PAPI_create_eventset (&EventSet)) != PAPI_OK) ERROR_RETURN(retval); PAPI_event = PAPI_TOT_INS; /* Here we are querying for the existence of the PAPI presets */ if (PAPI_query_event (PAPI_TOT_INS) != PAPI_OK) { PAPI_event = PAPI_TOT_CYC; if ((retval=PAPI_query_event (PAPI_TOT_INS)) != PAPI_OK) ERROR_RETURN(retval); printf ("PAPI_TOT_INS not available on this platform."); printf (" so subst PAPI_event with PAPI_TOT_CYC !\n\n"); } /* PAPI_event_code_to_name is used to convert a PAPI preset from its integer value to its string name. */ if ((retval = PAPI_event_code_to_name (PAPI_event, event_name)) != PAPI_OK) ERROR_RETURN(retval); /* add event to the event set */ if ((retval = PAPI_add_event (EventSet, PAPI_event)) != PAPI_OK) ERROR_RETURN(retval); /* register overflow and set up threshold */ /* The threshold "THRESHOLD" was set to 100000 */ if ((retval = PAPI_overflow (EventSet, PAPI_event, THRESHOLD, 0, handler)) != PAPI_OK) ERROR_RETURN(retval); printf ("Here are the addresses at which overflows occurred and overflow vectors \n"); printf ("--------------------------------------------------------------\n"); /* Start counting */ if ( (retval=PAPI_start (EventSet)) != PAPI_OK) ERROR_RETURN(retval); for (i = 0; i < 2000000; i++) { tmp = 1.01 + tmp; tmp++; } /* Stops the counters and reads the counter values into the values array */ if ( (retval=PAPI_stop (EventSet, values[0])) != PAPI_OK) ERROR_RETURN(retval); printf ("The total no of overflows was %d\n", total); /* clear the overflow status */ if ((retval = PAPI_overflow (EventSet, PAPI_event, 0, 0, handler)) != PAPI_OK) ERROR_RETURN(retval); /************************************************************************ * PAPI_cleanup_eventset can only be used after the counter has been * * stopped then it remove all events in the eventset * ************************************************************************/ if ( (retval=PAPI_cleanup_eventset (EventSet)) 
!= PAPI_OK) ERROR_RETURN(retval); /* Free all memory and data structures, EventSet must be empty. */ if ( (retval=PAPI_destroy_eventset(&EventSet)) != PAPI_OK) ERROR_RETURN(retval); /* free the resources used by PAPI */ PAPI_shutdown(); exit(0); } papi-papi-7-2-0-t/src/examples/PAPI_perror.c000066400000000000000000000052331502707512200205360ustar00rootroot00000000000000/***************************************************************************** * PAPI_perror converts PAPI error codes to strings,it fills the string * * destination with the error message corresponding to the error code. * * The function copies length worth of the error description string * * corresponding to code into destination. The resulting string is always * * null terminated. If length is 0, then the string is printed on stderr. * * PAPI_strerror does similar but it just returns the corresponding * * error string from the code. * *****************************************************************************/ #include #include #include "papi.h" /* This needs to be included every time you use PAPI */ int main() { int retval; int EventSet = PAPI_NULL; char error_str[PAPI_MAX_STR_LEN]; /**************************************************************************** * This part initializes the library and compares the version number of the * * header file, to the version of the library, if these don't match then it * * is likely that PAPI won't work correctly.If there is an error, retval * * keeps track of the version number. 
* ****************************************************************************/ if((retval = PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT ) { exit(1); } if ((retval = PAPI_create_eventset(&EventSet)) != PAPI_OK) { fprintf(stderr, "PAPI error %d: %s\n",retval,PAPI_strerror(retval)); exit(1); } /* Add Total Instructions Executed to our EventSet */ if ((retval = PAPI_add_event(EventSet, PAPI_TOT_INS)) != PAPI_OK) { PAPI_perror( "PAPI_add_event" ); exit(1); } /* Start counting */ if ((retval = PAPI_start(EventSet)) != PAPI_OK) { PAPI_perror( "PAPI_start" ); exit(1); } /* We are trying to start the counter which has already been started, and this will give an error which will be passed to PAPI_perror via retval and the function will then display the error string on the screen. */ if ((retval = PAPI_start(EventSet)) != PAPI_OK) { PAPI_perror( "PAPI_start" ); } /* The function PAPI_strerror returns the corresponding error string from the error code */ if ((retval = PAPI_start(EventSet)) != PAPI_OK) { printf("%s\n",PAPI_strerror(retval)); } /* finish using PAPI and free all related resources (this is optional, you don't have to use it */ PAPI_shutdown (); exit(0); } papi-papi-7-2-0-t/src/examples/PAPI_profil.c000066400000000000000000000113441502707512200205200ustar00rootroot00000000000000/**************************************************************************** * PAPI_profil - generate PC histogram data * ****************************************************************************/ #include #include #include #include "papi.h" /* This needs to be included every time you use PAPI */ #define FLOPS 1000000 #define THRESHOLD 100000 #define ERROR_RETURN(retval) { fprintf(stderr, "Error %d %s:line %d: \n", retval,__FILE__,__LINE__); exit(retval); } int code_to_monitor() { int i; double tmp=1.1; for(i=0; i < FLOPS; i++) { tmp=i+tmp; tmp++; } i = (int) tmp; return i; } int main() { unsigned long length; vptr_t start, end; PAPI_sprofil_t * prof; int EventSet = 
PAPI_NULL; /*must be initialized to PAPI_NULL before calling PAPI_create_event*/ int PAPI_event,i,tmp = 0; char event_name[PAPI_MAX_STR_LEN]; /*These are going to be used as buffers */ unsigned short *profbuf; long long values[2]; const PAPI_exe_info_t *prginfo = NULL; int retval; /**************************************************************************** * This part initializes the library and compares the version number of the * * header file, to the version of the library, if these don't match then it * * is likely that PAPI won't work correctly.If there is an error, retval * * keeps track of the version number. * ****************************************************************************/ if((retval = PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT ) { printf("Library initialization error! \n"); exit(1); } if ((prginfo = PAPI_get_executable_info()) == NULL) { fprintf(stderr, "Error in get executable information \n"); exit(1); } start = prginfo->address_info.text_start; end = prginfo->address_info.text_end; length = (end - start); /* for PAPI_PROFIL_BUCKET_16 and scale = 65536, profile buffer length == program address length. Larger bucket sizes would increase the buffer length. Smaller scale factors would decrease it. Handle with care... 
*/ profbuf = (unsigned short *)malloc(length); if (profbuf == NULL) { fprintf(stderr, "Not enough memory \n"); exit(1); } memset(profbuf,0x00,length); /* Creating the eventset */ if ( (retval = PAPI_create_eventset(&EventSet)) != PAPI_OK) ERROR_RETURN(retval); PAPI_event = PAPI_TOT_INS; /* Add Total Instructions Executed to our EventSet */ if ( (retval = PAPI_add_event(EventSet, PAPI_event)) != PAPI_OK) ERROR_RETURN(retval); /* Add Total Cycles Executed to our EventSet */ if ( (retval = PAPI_add_event(EventSet, PAPI_TOT_CYC)) != PAPI_OK) ERROR_RETURN(retval); /* enable the collection of profiling information */ if ((retval = PAPI_profil(profbuf, length, start, 65536, EventSet, PAPI_event, THRESHOLD, PAPI_PROFIL_POSIX | PAPI_PROFIL_BUCKET_16)) != PAPI_OK) ERROR_RETURN(retval); /* let's rock and roll */ if ((retval=PAPI_start(EventSet)) != PAPI_OK) ERROR_RETURN(retval); code_to_monitor(); if ((retval=PAPI_stop(EventSet, values)) != PAPI_OK) ERROR_RETURN(retval); /* disable the collection of profiling information by setting threshold to 0 */ if ((retval = PAPI_profil(profbuf, length, start, 65536, EventSet, PAPI_event, 0, PAPI_PROFIL_POSIX)) != PAPI_OK) ERROR_RETURN(retval); printf("-----------------------------------------------------------\n"); printf("Text start: %p, Text end: %p, \n", prginfo->address_info.text_start,prginfo->address_info.text_end); printf("Data start: %p, Data end: %p\n", prginfo->address_info.data_start,prginfo->address_info.data_end); printf("BSS start : %p, BSS end: %p\n", prginfo->address_info.bss_start,prginfo->address_info.bss_end); printf("------------------------------------------\n"); printf("Test type : \tPAPI_PROFIL_POSIX\n"); printf("------------------------------------------\n\n\n"); printf("PAPI_profil() hash table.\n"); printf("address\t\tflat \n"); for (i = 0; i < (int) length/2; i++) { if (profbuf[i]) printf("%#lx\t%d \n", (unsigned long) start + (unsigned long) (2 * i), profbuf[i]); } 
printf("-----------------------------------------\n"); retval = 0; for (i = 0; i < (int) length/2; i++) retval = retval || (profbuf[i]); if (retval) printf("Test succeeds! \n"); else printf( "No information in buffers\n"); /* clean up */ PAPI_shutdown(); exit(0); } papi-papi-7-2-0-t/src/examples/PAPI_reset.c000066400000000000000000000051361502707512200203510ustar00rootroot00000000000000/***************************************************************************** * PAPI_reset - resets the hardware event counters used by an EventSet. * *****************************************************************************/ #include #include #include "papi.h" /* This needs to be included every time you use PAPI */ #define ERROR_RETURN(retval) { fprintf(stderr, "Error %d %s:line %d: \n", retval,__FILE__,__LINE__); exit(retval); } int poorly_tuned_function() { float tmp; int i; for(i=1; i<2000; i++) { tmp=(tmp+100)/i; } return 0; } int main() { int EventSet = PAPI_NULL; /*must be initialized to PAPI_NULL before calling PAPI_create_event*/ int retval; unsigned int event_code=PAPI_TOT_INS; /* By default monitor total instructions */ char errstring[PAPI_MAX_STR_LEN]; long long values[1]; /**************************************************************************** * This part initializes the library and compares the version number of the * * header file, to the version of the library, if these don't match then it * * is likely that PAPI won't work correctly.If there is an error, retval * * keeps track of the version number. * ****************************************************************************/ if((retval = PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT ) { printf("Library initialization error! 
\n"); exit(1); } /* Creating the eventset */ if ( (retval=PAPI_create_eventset(&EventSet)) != PAPI_OK) ERROR_RETURN(retval); /* Add Total Instructions Executed to our EventSet */ if ((retval=PAPI_add_event(EventSet, event_code)) != PAPI_OK) ERROR_RETURN(retval); /* Start counting */ if((retval=PAPI_start(EventSet)) != PAPI_OK) ERROR_RETURN(retval); poorly_tuned_function(); /* Stop counting */ if((retval=PAPI_stop(EventSet, values)) != PAPI_OK) ERROR_RETURN(retval); printf("The first time read value is %lld\n",values[0]); /* This zeroes out the counters on the eventset that was created */ if((retval=PAPI_reset(EventSet)) != PAPI_OK) ERROR_RETURN(retval); /* Start counting */ if((retval=PAPI_start(EventSet)) != PAPI_OK) ERROR_RETURN(retval); poorly_tuned_function(); /* Stop counting */ if((retval=PAPI_stop(EventSet, values)) != PAPI_OK) ERROR_RETURN(retval); printf("The second time read value is %lld\n",values[0]); /* free the resources used by PAPI */ PAPI_shutdown(); exit(0); } papi-papi-7-2-0-t/src/examples/PAPI_set_domain.c000066400000000000000000000072771502707512200213610ustar00rootroot00000000000000/***************************************************************************** * This example shows how to use PAPI_set_domain * *****************************************************************************/ #include #include #include #include #include #include "papi.h" /* This needs to be included every time you use PAPI */ #define ERROR_RETURN(retval) { fprintf(stderr, "Error %d %s:line %d: \n", retval,__FILE__,__LINE__); exit(retval); } int poorly_tuned_function() { float tmp; int i; for(i=1; i<2000; i++) { tmp=(tmp+100)/i; } return 0; } int main() { int num, retval, EventSet = PAPI_NULL; long long values[2]; PAPI_option_t options; int fd; /**************************************************************************** * This part initializes the library and compares the version number of the * * header file, to the version of the library, if these don't match then 
it * * is likely that PAPI won't work correctly.If there is an error, retval * * keeps track of the version number. * ****************************************************************************/ if((retval = PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT ) { printf("Library initialization error! \n"); exit(1); } /* Set the domain of this EventSet to counter user mode. The domain will be valid for all the eventset created after this function call unless you call PAPI_set_domain again */ if ((retval=PAPI_set_domain(PAPI_DOM_USER)) != PAPI_OK) ERROR_RETURN(retval); if ((retval=PAPI_create_eventset(&EventSet)) != PAPI_OK) ERROR_RETURN(retval); /* Add Total Instructions Executed event to the EventSet */ if ( (retval = PAPI_add_event(EventSet, PAPI_TOT_INS)) != PAPI_OK) ERROR_RETURN(retval); /* Add Total Cycles Executed event to the EventSet */ if ( (retval = PAPI_add_event(EventSet, PAPI_TOT_CYC)) != PAPI_OK) ERROR_RETURN(retval); /* Start counting */ if((retval=PAPI_start(EventSet)) != PAPI_OK) ERROR_RETURN(retval); poorly_tuned_function(); /* add some system calls */ fd = open("/dev/zero", O_RDONLY); if (fd == -1) { perror("open(/dev/zero)"); exit(1); } close(fd); /* Stop counting */ if((retval=PAPI_stop(EventSet, values)) != PAPI_OK) ERROR_RETURN(retval); printf(" Total instructions: %lld Total Cycles: %lld \n", values[0], values[1]); /* Set the domain of this EventSet to counter user and kernel modes */ if ((retval=PAPI_set_domain(PAPI_DOM_ALL)) != PAPI_OK) ERROR_RETURN(retval); EventSet = PAPI_NULL; if ((retval=PAPI_create_eventset(&EventSet)) != PAPI_OK) ERROR_RETURN(retval); /* Add Total Instructions Executed to our EventSet */ if ( (retval = PAPI_add_event(EventSet, PAPI_TOT_INS)) != PAPI_OK) ERROR_RETURN(retval); /* Add Total Instructions Executed to our EventSet */ if ( (retval = PAPI_add_event(EventSet, PAPI_TOT_CYC)) != PAPI_OK) ERROR_RETURN(retval); /* Start counting */ if((retval=PAPI_start(EventSet)) != PAPI_OK) ERROR_RETURN(retval); 
poorly_tuned_function(); /* add some system calls */ fd = open("/dev/zero", O_RDONLY); if (fd == -1) { perror("open(/dev/zero)"); exit(1); } close(fd); /* Stop counting */ if((retval=PAPI_stop(EventSet, values)) != PAPI_OK) ERROR_RETURN(retval); printf(" Total instructions: %lld Total Cycles: %lld \n", values[0], values[1]); /* clean up */ PAPI_shutdown(); exit(0); } papi-papi-7-2-0-t/src/examples/PAPI_state.c000066400000000000000000000050351502707512200203450ustar00rootroot00000000000000/***************************************************************************** * We use PAPI_state to get the counting state of an EventSet.This function * * returns the state of the entire EventSet. * *****************************************************************************/ #include #include #include "papi.h" /* This needs to be included every time you use PAPI */ #define ERROR_RETURN(retval) { fprintf(stderr, "Error %d %s:line %d: \n", retval,__FILE__,__LINE__); exit(retval); } int main() { int retval; int status = 0; int EventSet = PAPI_NULL; /**************************************************************************** * This part initializes the library and compares the version number of the * * header file, to the version of the library, if these don't match then it * * is likely that PAPI won't work correctly.If there is an error, retval * * keeps track of the version number. * ****************************************************************************/ if((retval = PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT ) { printf("Library initialization error! 
\n"); exit(-1); } /*Creating the Eventset */ if((retval = PAPI_create_eventset(&EventSet)) != PAPI_OK) ERROR_RETURN(retval); /* Add Total Instructions Executed to our EventSet */ if ((retval=PAPI_add_event(EventSet, PAPI_TOT_INS)) != PAPI_OK) ERROR_RETURN(retval); if ((retval=PAPI_state(EventSet, &status)) != PAPI_OK) ERROR_RETURN(retval); printstate(status); /* Start counting */ if ((retval=PAPI_start(EventSet)) != PAPI_OK) ERROR_RETURN(retval); if (PAPI_state(EventSet, &status) != PAPI_OK) ERROR_RETURN(retval); printstate(status); /* free the resources used by PAPI */ PAPI_shutdown(); exit(0); } int printstate(int status) { if(status & PAPI_STOPPED) printf("Eventset is currently stopped or inactive \n"); if(status & PAPI_RUNNING) printf("Eventset is currently running \n"); if(status & PAPI_PAUSED) printf("Eventset is currently Paused \n"); if(status & PAPI_NOT_INIT) printf(" Eventset defined but not initialized \n"); if(status & PAPI_OVERFLOWING) printf(" Eventset has overflowing enabled \n"); if(status & PAPI_PROFILING) printf(" Eventset has profiling enabled \n"); if(status & PAPI_MULTIPLEXING) printf(" Eventset has multiplexing enabled \n"); return 0; } papi-papi-7-2-0-t/src/examples/README000066400000000000000000000014431502707512200171270ustar00rootroot00000000000000/* * File: papi/src/examples/README * Author: Min Zhou * min@cs.utk.edu * Mods: * */ This directory contains: Makefile example Makefile for platforms that support GNU make Makefile.AIX example Makefile for AIX; Makefile.IRIX64 example Makefile for IRIX64; Makefile.OSF1 example Makefile for OSF1; *.c various example programs run_examples.sh shell script to test the example programs NOTE: not all the example program can be run successfully due to the availability of the events. For example, PAPI_FP_INS is a derived event in power3 and UltraSparc III, so overflow_pthreads can not be run successfully in these platforms. But these programs should help you understand how to use the PAPI functions. 
papi-papi-7-2-0-t/src/examples/add_event/000077500000000000000000000000001502707512200201765ustar00rootroot00000000000000papi-papi-7-2-0-t/src/examples/add_event/Papi_add_env_event.c000066400000000000000000000113041502707512200241140ustar00rootroot00000000000000/* * This example shows how to use PAPI_library_init, PAPI_create_eventset, * PAPI_add_event, * PAPI_start and PAPI_stop. These 5 functions * will allow a user to do most of the performance information gathering * that they would need. PAPI_read could also be used if you don't want * to stop the EventSet from running but only check the counts. * * Also, we will use PAPI_perror for * error information. * * In addition, a new call was created called PAPI_add_env_event * that allows a user to set up an environment variable that selects * which event should be monitored; this allows different events * to be monitored at runtime without recompiling. The syntax * is as follows: * PAPI_add_env_event(int *EventSet, int *Event, char *env_variable); * EventSet is the same as in PAPI_add_event * Event is the default event to monitor if the environment variable * does not exist and differs from PAPI_add_event as it is * a pointer. * env_variable is the name of the environment variable to look for * the event code; this can be a name, number or hex, for example * PAPI_L1_DCM could be defined in the environment variable as * any of the following: PAPI_L1_DCM, 0x80000000, or -2147483648 * * To use only add_event you would change the calls to * PAPI_add_env_event(int *EventSet, int *Event, char *env_variable); * to PAPI_add_event(int *EventSet, int Event); * * We will also use PAPI_event_code_to_name since the event may have * changed. 
* Author: Kevin London * email: london@cs.utk.edu */ #include #include #include "papi.h" /* This needs to be included anytime you use PAPI */ int PAPI_add_env_event(int *EventSet, int *Event, char *env_variable); int main(){ int retval,i; int EventSet=PAPI_NULL; int event_code=PAPI_TOT_INS; /* By default monitor total instructions */ char errstring[PAPI_MAX_STR_LEN]; char event_name[PAPI_MAX_STR_LEN]; float a[1000],b[1000],c[1000]; long long values; /* This initializes the library and checks the version number of the * header file, to the version of the library, if these don't match * then it is likely that PAPI won't work correctly. */ if ((retval = PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT ){ /* This call loads up what the error means into errstring * if retval == PAPI_ESYS then it might be beneficial * to call perror as well to see what system call failed */ PAPI_perror("PAPI_library_init"); exit(-1); } /* Create space for the EventSet */ if ( (retval=PAPI_create_eventset( &EventSet ))!=PAPI_OK){ PAPI_perror(retval, errstring, PAPI_MAX_STR_LEN); exit(-1); } /* After this call if the environment variable PAPI_EVENT is set, * event_code may contain something different than total instructions. */ if ( (retval=PAPI_add_env_event(&EventSet, &event_code, "PAPI_EVENT"))!=PAPI_OK){ PAPI_perror("PAPI_add_env_event"); exit(-1); } /* Now lets start counting */ if ( (retval = PAPI_start(EventSet)) != PAPI_OK ){ PAPI_perror("PAPI_start"); exit(-1); } /* Some work to take up some time, the PAPI_start/PAPI_stop (and/or * PAPI_read) should surround what you want to monitor. 
*/ for ( i=0;i<1000;i++){ a[i] = b[i]-c[i]; c[i] = a[i]*1.2; } if ( (retval = PAPI_stop(EventSet, &values) ) != PAPI_OK ){ PAPI_perror("PAPI_stop"); exit(-1); } if ( (retval=PAPI_event_code_to_name( event_code, event_name))!=PAPI_OK){ PAPI_perror("PAPI_event_code_to_name"); exit(-1); } printf("Ending values for %s: %lld\n", event_name,values); /* Remove PAPI instrumentation, this is necessary on platforms * that need to release shared memory segments and is always * good practice. */ PAPI_shutdown(); exit(0); } int PAPI_add_env_event(int *EventSet, int *EventCode, char *env_variable){ int real_event=*EventCode; char *eventname; int retval; if ( env_variable != NULL ){ if ( (eventname=getenv(env_variable)) ) { if ( eventname[0] == 'P' ) { /* Use the PAPI name */ retval=PAPI_event_name_to_code(eventname, &real_event ); if ( retval != PAPI_OK ) real_event = *EventCode; } else{ if ( strlen(eventname)>1 && eventname[1]=='x') sscanf(eventname, "%#x", &real_event); else real_event = atoi(eventname); } } } if ( (retval = PAPI_add_event( *EventSet, real_event))!= PAPI_OK ){ if ( real_event != *EventCode ) { if ( (retval = PAPI_add_event( *EventSet, *EventCode)) == PAPI_OK ){ real_event = *EventCode; } } } *EventCode = real_event; return retval; } papi-papi-7-2-0-t/src/examples/high_level.c000066400000000000000000000044351502707512200205250ustar00rootroot00000000000000/***************************************************************************** * This example code shows how to use PAPI's High level functions. * * Events to be recorded are determined via an environment variable * * PAPI_EVENTS that lists comma separated events for any component. * * If events are not specified via the environment variable PAPI_EVENTS, an * * output with default events is generated after the run. 
If supported by * * the respective machine the following default events are recorded: * * perf::TASK-CLOCK * * PAPI_TOT_INS * * PAPI_TOT_CYC * * PAPI_FP_INS * * PAPI_FP_OPS or PAPI_DP_OPS or PAPI_SP_OPS * ******************************************************************************/ #include #include #include "papi.h" #define THRESHOLD 10000 #define ERROR_RETURN(retval) { fprintf(stderr, "Error %d %s:line %d: \n", retval,__FILE__,__LINE__); exit(retval); } /* stupid codes to be monitored */ void computation_mult() { double tmp=1.0; int i=1; for( i = 1; i < THRESHOLD; i++ ) { tmp = tmp*i; } } /* stupid codes to be monitored */ void computation_add() { int tmp = 0; int i=0; for( i = 0; i < THRESHOLD; i++ ) { tmp = tmp + i; } } int main() { int retval; char errstring[PAPI_MAX_STR_LEN]; retval = PAPI_hl_region_begin("computation_add"); if ( retval != PAPI_OK ) ERROR_RETURN(retval); /* Your code goes here*/ computation_add(); retval = PAPI_hl_read("computation_add"); if ( retval != PAPI_OK ) ERROR_RETURN(retval); /* Your code goes here*/ computation_add(); retval = PAPI_hl_region_end("computation_add"); if ( retval != PAPI_OK ) ERROR_RETURN(retval); retval = PAPI_hl_region_begin("computation_mult"); if ( retval != PAPI_OK ) ERROR_RETURN(retval); /* Your code goes here*/ computation_mult(); retval = PAPI_hl_region_end("computation_mult"); if ( retval != PAPI_OK ) ERROR_RETURN(retval); exit(0); } papi-papi-7-2-0-t/src/examples/locks_pthreads.c000066400000000000000000000063241502707512200214230ustar00rootroot00000000000000/**************************************************************************** * This program shows how to use PAPI_register_thread, PAPI_lock, * * PAPI_unlock, PAPI_set_thr_specific, PAPI_get_thr_specific. * * Warning: Don't use PAPI_lock and PAPI_unlock on platforms on which the * * locking mechanisms are not implemented. 
* ****************************************************************************/ #include #include #include #include "papi.h" /* This needs to be included every time you use PAPI */ #define ERROR_RETURN(retval) { fprintf(stderr, "Error %d %s:line %d: \n", retval,__FILE__,__LINE__); exit(retval); } #define LOOPS 100000 #define SLEEP_VALUE 20000 int count; int rank; void *Master(void *arg) { int i, retval, tmp; int *pointer, * pointer2; tmp = 20; pointer = &tmp; /* register the thread */ if ( (retval=PAPI_register_thread())!= PAPI_OK ) ERROR_RETURN(retval); /* save the pointer for late use */ if ( (retval=PAPI_set_thr_specific(1,pointer))!= PAPI_OK ) ERROR_RETURN(retval); /* change the value of tmp */ tmp = 15; usleep(SLEEP_VALUE); PAPI_lock(PAPI_USR1_LOCK); /* Make sure Slaves are not sleeping */ for (i = 0; i < LOOPS; i++) { count = 2 * count - i; } PAPI_unlock(PAPI_USR1_LOCK); /* retrieve the pointer saved by PAPI_set_thr_specific */ if ( (retval=PAPI_get_thr_specific(1, (void *)&pointer2)) != PAPI_OK ) ERROR_RETURN(retval); /* the output value should be 15 */ printf("Thread specific data is %d \n", *pointer2); pthread_exit(NULL); } void *Slave(void *arg) { int i; PAPI_lock(PAPI_USR2_LOCK); PAPI_lock(PAPI_USR1_LOCK); for (i = 0; i < LOOPS; i++) { count += i; } PAPI_unlock(PAPI_USR1_LOCK); PAPI_unlock(PAPI_USR2_LOCK); pthread_exit(NULL); } int main(int argc, char **argv) { pthread_t master; pthread_t slave1; int result_m, result_s, rc, i; int retval; /* Setup a random number so compilers can't optimize it out */ count = rand(); result_m = count; rank = 0; for (i = 0; i < LOOPS; i++) { result_m = 2 * result_m - i; } result_s = result_m; for (i = 0; i < LOOPS; i++) { result_s += i; } if ((retval = PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT) { printf("Library initialization error! 
\n"); exit(-1); } if ((retval = PAPI_thread_init(&pthread_self)) != PAPI_OK) ERROR_RETURN(retval); if ((retval = PAPI_set_debug(PAPI_VERB_ECONT)) != PAPI_OK) ERROR_RETURN(retval); PAPI_lock(PAPI_USR2_LOCK); rc = pthread_create(&master, NULL, Master, NULL); if (rc) { retval = PAPI_ESYS; ERROR_RETURN(retval); } rc = pthread_create(&slave1, NULL, Slave, NULL); if (rc) { retval = PAPI_ESYS; ERROR_RETURN(retval); } pthread_join(master, NULL); printf("Master: Expected: %d Received: %d\n", result_m, count); if (result_m != count) ERROR_RETURN(1); PAPI_unlock(PAPI_USR2_LOCK); pthread_join(slave1, NULL); printf("Slave: Expected: %d Received: %d\n", result_s, count); if (result_s != count) ERROR_RETURN(1); exit(0); } papi-papi-7-2-0-t/src/examples/multiplex.c000066400000000000000000000102451502707512200204360ustar00rootroot00000000000000/**************************************************************************** * Multiplexing allows more counters to be used than what is supported by * * the platform, thus allowing a larger number of events to be counted * * simultaneously. When a microprocessor has a very limited number of * * counters that can be counted simultaneously, a large application with * * many hours of run time may require days of profiling in order to gather * * enough information to base a performance analysis. Multiplexing overcomes* * this limitation by the usage of the counters over timesharing. * * This is an example demonstrating how to use PAPI_set_multiplex to * * convert a standard event set to a multiplexed event set. 
* ****************************************************************************/ #include #include #include #include "papi.h" #define ERROR_RETURN(retval) { fprintf(stderr, "Error %d %s:line %d: \n", retval,__FILE__,__LINE__); exit(retval); } #define NUM_ITERS 10000000 #define MAX_TO_ADD 6 double c = 0.11; void do_flops(int n) { int i; double a = 0.5; double b = 6.2; for (i=0; i < n; i++) c += a * b; return; } /* Tests that we can really multiplex a lot. */ int multiplex(void) { int retval, i, EventSet = PAPI_NULL, j = 0; long long *values; PAPI_event_info_t pset; int events[MAX_TO_ADD], number; /* Initialize the library */ retval = PAPI_library_init(PAPI_VER_CURRENT); if (retval != PAPI_VER_CURRENT) { printf("Library initialization error! \n"); exit(1); } /* initialize multiplex support */ retval = PAPI_multiplex_init(); if (retval != PAPI_OK) ERROR_RETURN(retval); retval = PAPI_create_eventset(&EventSet); if (retval != PAPI_OK) ERROR_RETURN(retval); /* convert the event set to a multiplex event set */ retval = PAPI_set_multiplex(EventSet); if (retval != PAPI_OK) ERROR_RETURN(retval); /* retval = PAPI_add_event(EventSet, PAPI_TOT_INS); if ((retval != PAPI_OK) && (retval != PAPI_ECNFLCT)) ERROR_RETURN(retval); printf("Adding %s\n", "PAPI_TOT_INS"); */ for (i = 0; i < PAPI_MAX_PRESET_EVENTS; i++) { retval = PAPI_get_event_info(i | PAPI_PRESET_MASK, &pset); if (retval != PAPI_OK) ERROR_RETURN(retval); if ((pset.count) && (pset.event_code != PAPI_TOT_CYC)) { printf("Adding %s\n", pset.symbol); retval = PAPI_add_event(EventSet, pset.event_code); if ((retval != PAPI_OK) && (retval != PAPI_ECNFLCT)) ERROR_RETURN(retval); if (retval == PAPI_OK) printf("Added %s\n", pset.symbol); else printf("Could not add %s due to resource limitation.\n", pset.symbol); if (retval == PAPI_OK) { if (++j >= MAX_TO_ADD) break; } } } values = (long long *) malloc(MAX_TO_ADD * sizeof(long long)); if (values == NULL) { printf("Not enough memory available. 
\n"); exit(1); } if ((retval=PAPI_start(EventSet)) != PAPI_OK) ERROR_RETURN(retval); do_flops(NUM_ITERS); retval = PAPI_stop(EventSet, values); if (retval != PAPI_OK) ERROR_RETURN(retval); /* get the number of events in the event set */ number=MAX_TO_ADD; if ( (retval = PAPI_list_events(EventSet, events, &number)) != PAPI_OK) ERROR_RETURN(retval); /* print the read result */ for (i = 0; i < MAX_TO_ADD; i++) { retval = PAPI_get_event_info(events[i], &pset); if (retval != PAPI_OK) ERROR_RETURN(retval); printf("Event name: %s value: %lld \n", pset.symbol, values[i]); } retval = PAPI_cleanup_eventset(EventSet); if (retval != PAPI_OK) ERROR_RETURN(retval); retval = PAPI_destroy_eventset(&EventSet); if (retval != PAPI_OK) ERROR_RETURN(retval); /* free the resources used by PAPI */ PAPI_shutdown(); return (0); } int main(int argc, char **argv) { printf("Using %d iterations\n\n", NUM_ITERS); printf("Does PAPI_multiplex_init() handle lots of events?\n"); multiplex(); exit(0); } papi-papi-7-2-0-t/src/examples/overflow_pthreads.c000066400000000000000000000114101502707512200221430ustar00rootroot00000000000000/* This file performs the following test: overflow dispatch with pthreads - This example tests the dispatch of overflow calls from PAPI. The event set is counted in the default counting domain and default granularity, depending on the platform. Usually this is the user domain (PAPI_DOM_USER) and thread context (PAPI_GRN_THR). The Eventset contains: + PAPI_TOT_INS (overflow monitor) + PAPI_TOT_CYC Each thread will do the followings : - enable overflow - Start eventset 1 - Do flops - Stop eventset 1 - disable overflow */ #include #include #include #include "papi.h" #define THRESHOLD 200000 #define OVER_FMT "handler(%d ) Overflow at %p! 
bit=%#llx \n" #define ERROR_RETURN(retval) { fprintf(stderr, "Error %d %s:line %d: \n", retval,__FILE__,__LINE__); exit(retval); } int total = 0; void do_flops(int n) { int i; double c = 0.11; double a = 0.5; double b = 6.2; for (i=0; i < n; i++) c += a * b; } /* overflow handler */ void handler(int EventSet, void *address, long long overflow_vector, void *context) { fprintf(stderr, OVER_FMT, EventSet, address, overflow_vector); total++; } void *Thread(void *arg) { int retval; int EventSet1=PAPI_NULL; long long values[2]; long long elapsed_us, elapsed_cyc; fprintf(stderr,"Thread %lx running PAPI\n",PAPI_thread_id()); /* create the event set */ if ( (retval = PAPI_create_eventset(&EventSet1))!=PAPI_OK) ERROR_RETURN(retval); /* query whether the event exists */ if ((retval=PAPI_query_event(PAPI_TOT_INS)) != PAPI_OK) ERROR_RETURN(retval); if ((retval=PAPI_query_event(PAPI_TOT_CYC)) != PAPI_OK) ERROR_RETURN(retval); /* add events to the event set */ if ( (retval = PAPI_add_event(EventSet1, PAPI_TOT_INS))!= PAPI_OK) ERROR_RETURN(retval); if ( (retval = PAPI_add_event(EventSet1, PAPI_TOT_CYC)) != PAPI_OK) ERROR_RETURN(retval); elapsed_us = PAPI_get_real_usec(); elapsed_cyc = PAPI_get_real_cyc(); retval = PAPI_overflow(EventSet1, PAPI_TOT_CYC, THRESHOLD, 0, handler); if(retval !=PAPI_OK) ERROR_RETURN(retval); /* start counting */ if((retval = PAPI_start(EventSet1))!=PAPI_OK) ERROR_RETURN(retval); do_flops(*(int *)arg); if ((retval = PAPI_stop(EventSet1, values))!=PAPI_OK) ERROR_RETURN(retval); elapsed_us = PAPI_get_real_usec() - elapsed_us; elapsed_cyc = PAPI_get_real_cyc() - elapsed_cyc; /* disable overflowing */ retval = PAPI_overflow(EventSet1, PAPI_TOT_CYC, 0, 0, handler); if(retval !=PAPI_OK) ERROR_RETURN(retval); /* remove the event from the eventset */ retval = PAPI_remove_event(EventSet1, PAPI_TOT_INS); if (retval != PAPI_OK) ERROR_RETURN(retval); retval = PAPI_remove_event(EventSet1, PAPI_TOT_CYC); if (retval != PAPI_OK) ERROR_RETURN(retval); printf("Thread %#x 
PAPI_TOT_INS : \t%lld\n",(int)PAPI_thread_id(), values[0]); printf(" PAPI_TOT_CYC: \t%lld\n", values[1]); printf(" Real usec : \t%lld\n", elapsed_us); printf(" Real cycles : \t%lld\n", elapsed_cyc); pthread_exit(NULL); } int main(int argc, char **argv) { pthread_t thread_one; pthread_t thread_two; int flops1, flops2; int rc,retval; pthread_attr_t attr; long long elapsed_us, elapsed_cyc; /* papi library initialization */ if ((retval=PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT) { printf("Library initialization error! \n"); exit(1); } /* thread initialization */ retval=PAPI_thread_init((unsigned long(*)(void))(pthread_self)); if (retval != PAPI_OK) ERROR_RETURN(retval); /* return the number of microseconds since some arbitrary starting point */ elapsed_us = PAPI_get_real_usec(); /* return the number of cycles since some arbitrary starting point */ elapsed_cyc = PAPI_get_real_cyc(); /* pthread attribution init */ pthread_attr_init(&attr); pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM); /* create the first thread */ flops1 = 1000000; rc = pthread_create(&thread_one, &attr, Thread, (void *)&flops1); if (rc) ERROR_RETURN(rc); /* create the second thread */ flops2 = 4000000; rc = pthread_create(&thread_two, &attr, Thread, (void *)&flops2); if (rc) ERROR_RETURN(rc); /* wait for the threads to finish */ pthread_attr_destroy(&attr); pthread_join(thread_one, NULL); pthread_join(thread_two, NULL); /* compute the elapsed cycles and microseconds */ elapsed_cyc = PAPI_get_real_cyc() - elapsed_cyc; elapsed_us = PAPI_get_real_usec() - elapsed_us; printf("Master real usec : \t%lld\n", elapsed_us); printf("Master real cycles : \t%lld\n", elapsed_cyc); /* clean up */ PAPI_shutdown(); exit(0); } papi-papi-7-2-0-t/src/examples/run_examples.sh000077500000000000000000000007101502707512200213040ustar00rootroot00000000000000#!/bin/sh # File: run_example.sh # CVS: $Id$ # Author: Min Zhou # min@cs.utk.edu CTESTS=`find . 
-perm -u+x -type f`; ALLTESTS="$CTESTS"; x=0; CWD=`pwd` echo "Platform:" uname -a echo "" echo "The following test cases will be run:"; echo $ALLTESTS; echo ""; echo "Running C Example Programs"; echo "" for i in $CTESTS; do if [ -x $i ]; then if [ "$i" != "./run_examples.sh" ]; then echo "Running $i: "; ./$i fi; fi; echo ""; done papi-papi-7-2-0-t/src/examples/sprofile.c000066400000000000000000000115441502707512200202410ustar00rootroot00000000000000/* This program shows how to use PAPI_sprofil */ #include #include #include #include #include #include #include #include "papi.h" /* This needs to be included every time you use PAPI */ #define NUM_FLOPS 20000000 #define NUM_ITERS 100000 #define THRESHOLD 100000 #define ERROR_RETURN(retval) { fprintf(stderr, "Error %d %s:line %d: \n", retval,__FILE__,__LINE__); exit(retval); } #if (defined(linux) && defined(__ia64__)) || (defined(_AIX)) #define DO_FLOPS1 (vptr_t)(*(void **)do_flops1) #define DO_FLOPS2 (vptr_t)(*(void **)do_flops2) #else #define DO_FLOPS1 (vptr_t)(do_flops1) #define DO_FLOPS2 (vptr_t)(do_flops2) #endif void do_flops2(int); volatile double t1 = 0.8, t2 = 0.9; void do_flops1(int n) { int i; double c = 22222.11; for (i = 0; i < n; i++) c -= t1 * t2; } void do_both(int n) { int i; const int flops2 = NUM_FLOPS / n; const int flops1 = NUM_FLOPS / n; for (i = 0; i < n; i++) { do_flops1(flops1); do_flops2(flops2); } } int main(int argc, char **argv) { int i , PAPI_event; int EventSet = PAPI_NULL; unsigned short *profbuf; unsigned short *profbuf2; unsigned short *profbuf3; unsigned long length; vptr_t start, end; long long values[2]; const PAPI_exe_info_t *prginfo = NULL; PAPI_sprofil_t sprof[3]; int retval; /* initializaion */ if ((retval = PAPI_library_init(PAPI_VER_CURRENT)) != PAPI_VER_CURRENT) { printf("Library initialization error! 
\n"); exit(1); } if ((prginfo = PAPI_get_executable_info()) == NULL) ERROR_RETURN(1); start = prginfo->address_info.text_start; end = prginfo->address_info.text_end; length = (end - start)/sizeof(unsigned short) * sizeof(unsigned short); printf("start= %p end =%p \n", start, end); profbuf = (unsigned short *) malloc(length); if (profbuf == NULL) ERROR_RETURN(PAPI_ESYS); memset(profbuf, 0x00, length ); profbuf2 = (unsigned short *) malloc(length); if (profbuf2 == NULL) ERROR_RETURN(PAPI_ESYS); memset(profbuf2, 0x00, length ); profbuf3 = (unsigned short *) malloc(1 * sizeof(unsigned short)); if (profbuf3 == NULL) ERROR_RETURN(PAPI_ESYS); memset(profbuf3, 0x00, 1 * sizeof(unsigned short)); /* First half */ sprof[0].pr_base = profbuf; sprof[0].pr_size = length / 2; sprof[0].pr_off = DO_FLOPS2; fprintf(stderr, "do_flops is at %p %lx\n", &do_flops2, strtoul(sprof[0].pr_off,NULL,0)); sprof[0].pr_scale = 65536; /* constant needed by PAPI_sprofil */ /* Second half */ sprof[1].pr_base = profbuf2; sprof[1].pr_size = length / 2; sprof[1].pr_off = DO_FLOPS1; fprintf(stderr, "do_flops1 is at %p %lx\n", &do_flops1, strtoul(sprof[1].pr_off,NULL,0)); sprof[1].pr_scale = 65536; /* constant needed by PAPI_sprofil */ /* Overflow bin */ sprof[2].pr_base = profbuf3; sprof[2].pr_size = 1; sprof[2].pr_off = 0; sprof[2].pr_scale = 0x2; /* constant needed by PAPI_sprofil */ /* Creating the eventset */ if ( (retval = PAPI_create_eventset(&EventSet)) != PAPI_OK) ERROR_RETURN(retval); PAPI_event = PAPI_TOT_CYC; /* Add Total Instructions Executed to our EventSet */ if ( (retval = PAPI_add_event(EventSet, PAPI_event)) != PAPI_OK) ERROR_RETURN(retval); /* Add Total Instructions Executed to our EventSet */ if ( (retval = PAPI_add_event(EventSet, PAPI_TOT_INS)) != PAPI_OK) ERROR_RETURN(retval); /* set profile flag */ if ((retval = PAPI_sprofil(sprof, 3, EventSet, PAPI_event, THRESHOLD, PAPI_PROFIL_POSIX)) != PAPI_OK) ERROR_RETURN(retval); if ((retval = PAPI_start(EventSet)) != PAPI_OK) 
ERROR_RETURN(retval); do_both(NUM_ITERS); if ((retval = PAPI_stop(EventSet, values)) != PAPI_OK) ERROR_RETURN(retval); /* to clear the profile flag before removing the events */ if ((retval = PAPI_sprofil(sprof, 3, EventSet, PAPI_event, 0, PAPI_PROFIL_POSIX)) != PAPI_OK) ERROR_RETURN(retval); /* free the resources hold by PAPI */ PAPI_shutdown(); printf("Test case: PAPI_sprofil()\n"); printf("---------Buffer 1--------\n"); for (i = 0; i < length / 2; i++) { if (profbuf[i]) printf("%#lx\t%d\n", strtoul(DO_FLOPS2,NULL,0) + 2 * i, profbuf[i]); } printf("---------Buffer 2--------\n"); for (i = 0; i < length / 2; i++) { if (profbuf2[i]) printf("%#lx\t%d\n", strtoul(DO_FLOPS1,NULL,0) + 2 * i, profbuf2[i]); } printf("-------------------------\n"); printf("%u samples fell outside the regions.\n", *profbuf3); exit(0); } /* Declare a and b to be volatile. This is to try to keep the compiler from optimizing the loop */ volatile double a = 0.5, b = 2.2; void do_flops2(int n) { int i; double c = 0.11; for (i = 0; i < n; i++) c += a * b; } papi-papi-7-2-0-t/src/extras.c000066400000000000000000000333261502707512200161100ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: extras.c * Author: Philip Mucci * mucci@cs.utk.edu * Mods: dan terpstra * terpstra@cs.utk.edu * Mods: Haihang You * you@cs.utk.edu * Mods: Kevin London * london@cs.utk.edu * Mods: Maynard Johnson * maynardj@us.ibm.com */ /* This file contains portable routines to do things that we wish the vendors did in the kernel extensions or performance libraries. 
*/ #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" #include "extras.h" #include "threads.h" #if (!defined(HAVE_FFSLL) || defined(__bgp__)) int ffsll( long long lli ); #else #include #endif /****************/ /* BEGIN LOCALS */ /****************/ static unsigned int _rnum = DEADBEEF; /**************/ /* END LOCALS */ /**************/ inline_static unsigned short random_ushort( void ) { return ( unsigned short ) ( _rnum = 1664525 * _rnum + 1013904223 ); } /* compute the amount by which to increment the bucket. value is the current value of the bucket this routine is used by all three profiling cases it is inlined for speed */ inline_static int profil_increment( long long value, int flags, long long excess, long long threshold ) { int increment = 1; if ( flags == PAPI_PROFIL_POSIX ) { return ( 1 ); } if ( flags & PAPI_PROFIL_RANDOM ) { if ( random_ushort( ) <= ( USHRT_MAX / 4 ) ) return ( 0 ); } if ( flags & PAPI_PROFIL_COMPRESS ) { /* We're likely to ignore the sample if buf[address] gets big. */ if ( random_ushort( ) < value ) { return ( 0 ); } } if ( flags & PAPI_PROFIL_WEIGHTED ) { /* Increment is between 1 and 255 */ if ( excess <= ( long long ) 1 ) increment = 1; else if ( excess > threshold ) increment = 255; else { threshold = threshold / ( long long ) 255; increment = ( int ) ( excess / threshold ); } } return ( increment ); } static void posix_profil( vptr_t address, PAPI_sprofil_t * prof, int flags, long long excess, long long threshold ) { unsigned short *buf16; unsigned int *buf32; unsigned long long *buf64; unsigned long indx; unsigned long long lloffset; /* SPECIAL CASE: if starting address is 0 and scale factor is 2 then all counts go into first bin. 
*/ if ( ( prof->pr_off == 0 ) && ( prof->pr_scale == 0x2 ) ) indx = 0; else { /* compute the profile buffer offset by: - subtracting the profiling base address from the pc address - multiplying by the scaling factor - dividing by max scale (65536, or 2^^16) - dividing by implicit 2 (2^^1 for a total of 2^^17), for even addresses NOTE: 131072 is a valid scale value. It produces byte resolution of addresses */ lloffset = ( unsigned long long ) ( ( address - prof->pr_off ) * prof->pr_scale ); indx = ( unsigned long ) ( lloffset >> 17 ); } /* confirm addresses within specified range */ if ( address >= prof->pr_off ) { /* test first for 16-bit buckets; this should be the fast case */ if ( flags & PAPI_PROFIL_BUCKET_16 ) { if ( ( indx * sizeof ( short ) ) < prof->pr_size ) { buf16 = (unsigned short *) prof->pr_base; buf16[indx] = ( unsigned short ) ( ( unsigned short ) buf16[indx] + profil_increment( buf16[indx], flags, excess, threshold ) ); PRFDBG( "posix_profil_16() bucket %lu = %u\n", indx, buf16[indx] ); } } /* next, look for the 32-bit case */ else if ( flags & PAPI_PROFIL_BUCKET_32 ) { if ( ( indx * sizeof ( int ) ) < prof->pr_size ) { buf32 = (unsigned int *) prof->pr_base; buf32[indx] = ( unsigned int ) buf32[indx] + ( unsigned int ) profil_increment( buf32[indx], flags, excess, threshold ); PRFDBG( "posix_profil_32() bucket %lu = %u\n", indx, buf32[indx] ); } } /* finally, fall through to the 64-bit case */ else { if ( ( indx * sizeof ( long long ) ) < prof->pr_size ) { buf64 = (unsigned long long *) prof->pr_base; buf64[indx] = ( unsigned long long ) buf64[indx] + ( unsigned long long ) profil_increment( ( long long ) buf64[indx], flags, excess, threshold ); PRFDBG( "posix_profil_64() bucket %lu = %lld\n", indx, buf64[indx] ); } } } } void _papi_hwi_dispatch_profile( EventSetInfo_t * ESI, vptr_t pc, long long over, int profile_index ) { EventSetProfileInfo_t *profile = &ESI->profile; PAPI_sprofil_t *sprof; vptr_t offset = 0; vptr_t best_offset = 0; int count; 
int best_index = -1; int i; PRFDBG( "handled IP %p\n", pc ); sprof = profile->prof[profile_index]; count = profile->count[profile_index]; for ( i = 0; i < count; i++ ) { offset = sprof[i].pr_off; if ( ( offset < pc ) && ( offset > best_offset ) ) { best_index = i; best_offset = offset; } } if ( best_index == -1 ) best_index = 0; posix_profil( pc, &sprof[best_index], profile->flags, over, profile->threshold[profile_index] ); } /* if isHardware is true, then the processor is using hardware overflow, else it is using software overflow. Use this parameter instead of _papi_hwi_system_info.supports_hw_overflow is in CRAY some processors may use hardware overflow, some may use software overflow. overflow_bit: if the component can get the overflow bit when overflow occurs, then this should be passed by the component; If both genOverflowBit and isHardwareSupport are true, that means the component doesn't know how to get the overflow bit from the kernel directly, so we generate the overflow bit in this function since this function can access the ESI->overflow struct; (The component can only set genOverflowBit parameter to true if the hardware doesn't support multiple hardware overflow. If the component supports multiple hardware overflow and you don't know how to get the overflow bit, then I don't know how to deal with this situation). 
*/ int _papi_hwi_dispatch_overflow_signal( void *papiContext, vptr_t address, int *isHardware, long long overflow_bit, int genOverflowBit, ThreadInfo_t ** t, int cidx ) { int retval, event_counter, i, overflow_flag, pos; int papi_index, j; int profile_index = 0; long long overflow_vector; long long temp[_papi_hwd[cidx]->cmp_info.num_cntrs], over; long long latest = 0; ThreadInfo_t *thread; EventSetInfo_t *ESI; _papi_hwi_context_t *ctx = ( _papi_hwi_context_t * ) papiContext; OVFDBG( "enter\n" ); if ( *t ) thread = *t; else *t = thread = _papi_hwi_lookup_thread( 0 ); if ( thread != NULL ) { ESI = thread->running_eventset[cidx]; if ( ( ESI == NULL ) || ( ( ESI->state & PAPI_OVERFLOWING ) == 0 ) ) { OVFDBG( "Either no eventset or eventset not set to overflow.\n" ); #ifdef ANY_THREAD_GETS_SIGNAL _papi_hwi_broadcast_signal( thread->tid ); #endif return ( PAPI_OK ); } if ( ESI->CmpIdx != cidx ) return ( PAPI_ENOCMP ); if ( ESI->master != thread ) { PAPIERROR ( "eventset->thread %p vs. current thread %p mismatch", ESI->master, thread ); return ( PAPI_EBUG ); } if ( isHardware ) { if ( ESI->overflow.flags & PAPI_OVERFLOW_HARDWARE ) { ESI->state |= PAPI_PAUSED; *isHardware = 1; } else *isHardware = 0; } /* Get the latest counter value */ event_counter = ESI->overflow.event_counter; overflow_flag = 0; overflow_vector = 0; if ( !( ESI->overflow.flags & PAPI_OVERFLOW_HARDWARE ) ) { retval = _papi_hwi_read( thread->context[cidx], ESI, ESI->sw_stop ); if ( retval < PAPI_OK ) return ( retval ); for ( i = 0; i < event_counter; i++ ) { papi_index = ESI->overflow.EventIndex[i]; latest = ESI->sw_stop[papi_index]; temp[i] = -1; if ( latest >= ( long long ) ESI->overflow.deadline[i] ) { OVFDBG ( "dispatch_overflow() latest %lld, deadline %lld, threshold %d\n", latest, ESI->overflow.deadline[i], ESI->overflow.threshold[i] ); pos = ESI->EventInfoArray[papi_index].pos[0]; overflow_vector ^= ( long long ) 1 << pos; temp[i] = latest - ESI->overflow.deadline[i]; overflow_flag = 1; /* adjust 
the deadline */ ESI->overflow.deadline[i] = latest + ESI->overflow.threshold[i]; } } } else if ( genOverflowBit ) { /* we had assumed the overflow event can't be derived event */ papi_index = ESI->overflow.EventIndex[0]; /* suppose the pos is the same as the counter number * (this is not true in Itanium, but itanium doesn't * need us to generate the overflow bit */ pos = ESI->EventInfoArray[papi_index].pos[0]; overflow_vector = ( long long ) 1 << pos; } else overflow_vector = overflow_bit; if ( ( ESI->overflow.flags & PAPI_OVERFLOW_HARDWARE ) || overflow_flag ) { if ( ESI->state & PAPI_PROFILING ) { int k = 0; while ( overflow_vector ) { i = ffsll( overflow_vector ) - 1; for ( j = 0; j < event_counter; j++ ) { papi_index = ESI->overflow.EventIndex[j]; /* This loop is here ONLY because Pentium 4 can have tagged * * events that contain more than one counter without being * * derived. You've gotta scan all terms to make sure you * * find the one to profile. */ for ( k = 0, pos = 0; k < PAPI_EVENTS_IN_DERIVED_EVENT && pos >= 0; k++ ) { pos = ESI->EventInfoArray[papi_index].pos[k]; if ( i == pos ) { profile_index = j; goto foundit; } } } if ( j == event_counter ) { PAPIERROR ( "BUG! 
overflow_vector is 0, dropping interrupt" ); return ( PAPI_EBUG ); } foundit: if ( ( ESI->overflow.flags & PAPI_OVERFLOW_HARDWARE ) ) over = 0; else over = temp[profile_index]; _papi_hwi_dispatch_profile( ESI, address, over, profile_index ); overflow_vector ^= ( long long ) 1 << i; } /* do not use overflow_vector after this place */ } else { ESI->overflow.handler( ESI->EventSetIndex, ( void * ) address, overflow_vector, ctx->ucontext ); } } ESI->state &= ~( PAPI_PAUSED ); } #ifdef ANY_THREAD_GETS_SIGNAL else { OVFDBG( "I haven't been noticed by PAPI before\n" ); _papi_hwi_broadcast_signal( ( *_papi_hwi_thread_id_fn ) ( ) ); } #endif return ( PAPI_OK ); } #include #include #include int _papi_hwi_using_signal[PAPI_NSIG]; int _papi_hwi_start_timer( int timer, int signal, int ns ) { struct itimerval value; int us = ns / 1000; if ( us == 0 ) us = 1; #ifdef ANY_THREAD_GETS_SIGNAL _papi_hwi_lock( INTERNAL_LOCK ); if ( ( _papi_hwi_using_signal[signal] - 1 ) ) { INTDBG( "itimer already installed\n" ); _papi_hwi_unlock( INTERNAL_LOCK ); return ( PAPI_OK ); } _papi_hwi_unlock( INTERNAL_LOCK ); #else ( void ) signal; /*unused */ #endif value.it_interval.tv_sec = 0; value.it_interval.tv_usec = us; value.it_value.tv_sec = 0; value.it_value.tv_usec = us; INTDBG( "Installing itimer %d, with %d us interval\n", timer, us ); if ( setitimer( timer, &value, NULL ) < 0 ) { PAPIERROR( "setitimer errno %d", errno ); return ( PAPI_ESYS ); } return ( PAPI_OK ); } int _papi_hwi_start_signal( int signal, int need_context, int cidx ) { struct sigaction action; _papi_hwi_lock( INTERNAL_LOCK ); _papi_hwi_using_signal[signal]++; if ( _papi_hwi_using_signal[signal] - 1 ) { INTDBG( "_papi_hwi_using_signal is now %d\n", _papi_hwi_using_signal[signal] ); _papi_hwi_unlock( INTERNAL_LOCK ); return ( PAPI_OK ); } memset( &action, 0x00, sizeof ( struct sigaction ) ); action.sa_flags = SA_RESTART; action.sa_sigaction = ( void ( * )( int, siginfo_t *, void * ) ) _papi_hwd[cidx]-> dispatch_timer; if ( 
need_context ) #if (defined(_BGL) /*|| defined (__bgp__)*/) action.sa_flags |= SIGPWR; #else action.sa_flags |= SA_SIGINFO; #endif INTDBG( "installing signal handler\n" ); if ( sigaction( signal, &action, NULL ) < 0 ) { PAPIERROR( "sigaction errno %d", errno ); _papi_hwi_unlock( INTERNAL_LOCK ); return ( PAPI_ESYS ); } INTDBG( "_papi_hwi_using_signal[%d] is now %d.\n", signal, _papi_hwi_using_signal[signal] ); _papi_hwi_unlock( INTERNAL_LOCK ); return ( PAPI_OK ); } int _papi_hwi_stop_signal( int signal ) { _papi_hwi_lock( INTERNAL_LOCK ); if ( --_papi_hwi_using_signal[signal] == 0 ) { INTDBG( "removing signal handler\n" ); if ( sigaction( signal, NULL, NULL ) == -1 ) { PAPIERROR( "sigaction errno %d", errno ); _papi_hwi_unlock( INTERNAL_LOCK ); return ( PAPI_ESYS ); } } INTDBG( "_papi_hwi_using_signal[%d] is now %d\n", signal, _papi_hwi_using_signal[signal] ); _papi_hwi_unlock( INTERNAL_LOCK ); return ( PAPI_OK ); } int _papi_hwi_stop_timer( int timer, int signal ) { struct itimerval value; #ifdef ANY_THREAD_GETS_SIGNAL _papi_hwi_lock( INTERNAL_LOCK ); if ( _papi_hwi_using_signal[signal] > 1 ) { INTDBG( "itimer in use by another thread\n" ); _papi_hwi_unlock( INTERNAL_LOCK ); return ( PAPI_OK ); } _papi_hwi_unlock( INTERNAL_LOCK ); #else ( void ) signal; /*unused */ #endif value.it_interval.tv_sec = 0; value.it_interval.tv_usec = 0; value.it_value.tv_sec = 0; value.it_value.tv_usec = 0; INTDBG( "turning off timer\n" ); if ( setitimer( timer, &value, NULL ) == -1 ) { PAPIERROR( "setitimer errno %d", errno ); return PAPI_ESYS; } return PAPI_OK; } #if (!defined(HAVE_FFSLL) || defined(__bgp__)) /* find the first set bit in long long */ int ffsll( long long lli ) { int i, num, t, tmpint, len; num = sizeof ( long long ) / sizeof ( int ); if ( num == 1 ) return ( ffs( ( int ) lli ) ); len = sizeof ( int ) * CHAR_BIT; for ( i = 0; i < num; i++ ) { tmpint = ( int ) ( ( ( lli >> len ) << len ) ^ lli ); t = ffs( tmpint ); if ( t ) { return ( t + i * len ); } lli = lli >> 
len; } return PAPI_OK; } #endif papi-papi-7-2-0-t/src/extras.h000066400000000000000000000011111502707512200161000ustar00rootroot00000000000000#ifndef EXTRAS_H #define EXTRAS_H int _papi_hwi_stop_timer( int timer, int signal ); int _papi_hwi_start_timer( int timer, int signal, int ms ); int _papi_hwi_stop_signal( int signal ); int _papi_hwi_start_signal( int signal, int need_context, int cidx ); int _papi_hwi_initialize( DynamicArray_t ** ); int _papi_hwi_dispatch_overflow_signal( void *papiContext, vptr_t address, int *, long long, int, ThreadInfo_t ** master, int cidx ); void _papi_hwi_dispatch_profile( EventSetInfo_t * ESI, vptr_t address, long long over, int profile_index ); #endif /* EXTRAS_H */ papi-papi-7-2-0-t/src/freebsd-context.h000066400000000000000000000002261502707512200176740ustar00rootroot00000000000000#ifndef _PAPI_FreeBSD_CONTEXT_H #define _PAPI_FreeBSD_CONTEXT_H #define GET_OVERFLOW_ADDRESS(ctx) (0x80000000) #endif /* _PAPI_FreeBSD_CONTEXT_H */ papi-papi-7-2-0-t/src/freebsd-lock.h000066400000000000000000000001031502707512200171320ustar00rootroot00000000000000 #define _papi_hwd_lock(a) { ; } #define _papi_hwd_unlock(a) { ; } papi-papi-7-2-0-t/src/freebsd-memory.c000066400000000000000000000020761502707512200175200ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: freebsd-memory.c * Author: Harald Servat * redcrash@gmail.com * Mod: James Ralph * ralph@cs.utk.edu */ #include "papi.h" #include "papi_internal.h" #include "x86_cpuid_info.h" #define UNREFERENCED(x) (void)x #if defined(__i386__)||defined(__x86_64__) static int x86_get_memory_info( PAPI_hw_info_t *hw_info ) { int retval = PAPI_OK; switch ( hw_info->vendor ) { case PAPI_VENDOR_AMD: case PAPI_VENDOR_INTEL: retval = _x86_cache_info( &hw_info->mem_hierarchy ); break; default: PAPIERROR( "Unknown vendor in memory information call for x86." 
); return PAPI_ENOIMPL; } return retval; } #endif int _freebsd_get_memory_info( PAPI_hw_info_t *hw_info, int id) { UNREFERENCED(id); UNREFERENCED(hw_info); #if defined(__i386__)||defined(__x86_64__) x86_get_memory_info( hw_info ); #endif return PAPI_ENOIMPL; } int _papi_freebsd_get_dmem_info(PAPI_dmem_info_t *d) { /* TODO */ d->pagesize = getpagesize(); return PAPI_OK; } papi-papi-7-2-0-t/src/freebsd-memory.h000066400000000000000000000001701502707512200175160ustar00rootroot00000000000000int _freebsd_get_memory_info( PAPI_hw_info_t *hw_info, int id); int _papi_freebsd_get_dmem_info(PAPI_dmem_info_t *d); papi-papi-7-2-0-t/src/freebsd.c000066400000000000000000000640041502707512200162110ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: freebsd.c * Author: Harald Servat * redcrash@gmail.com */ #include #include #include #include #include "papi.h" #include "papi_internal.h" #include "papi_lock.h" #include "freebsd.h" #include "papi_vector.h" #include "map.h" #include "freebsd-memory.h" #include "x86_cpuid_info.h" /* Global values referenced externally */ PAPI_os_info_t _papi_os_info; /* Advance Declarations */ papi_vector_t _papi_freebsd_vector; long long _papi_freebsd_get_real_cycles(void); int _papi_freebsd_ntv_code_to_name(unsigned int EventCode, char *ntv_name, int len); /* For debugging */ static void show_counter(char *string, int id, char *name, const char *function, char *file, int line) { #if defined(DEBUG) pmc_value_t tmp_value; int ret = pmc_read (id, &tmp_value); fprintf(stderr,"%s\n",string); if (ret < 0) { fprintf (stderr, "DEBUG: Unable to read counter %s (ID: %08x) " "on routine %s (file: %s, line: %d)\n", name, id, function,file,line); } else { fprintf (stderr, "DEBUG: Read counter %s (ID: %08x) - " "value %llu on routine %s (file: %s, line: %d)\n", name, id, (long long unsigned int)tmp_value, function, file, line); } #else (void) string; (void)name; (void)id; 
(void)function; (void)file; (void)line; #endif } static hwd_libpmc_context_t Context; /* * This function is an internal function and not exposed and thus * it can be called anything you want as long as the information * is setup in _papi_freebsd_init_component. Below is some, but not * all of the values that will need to be setup. For a complete * list check out papi_mdi_t, though some of the values are setup * and used above the component level. */ int init_mdi(void) { const struct pmc_cpuinfo *info; SUBDBG("Entering\n"); /* Initialize PMC library */ if (pmc_init() < 0) return PAPI_ESYS; if (pmc_cpuinfo (&info) != 0) return PAPI_ESYS; if (info != NULL) { /* Get CPU clock rate from HW.CLOCKRATE sysctl value, and MODEL from HW.MODEL */ int mib[5]; size_t len; int hw_clockrate; char hw_model[PAPI_MAX_STR_LEN]; #if !defined(__i386__) && !defined(__amd64__) Context.use_rdtsc = FALSE; #else /* Ok, I386s/AMD64s can use RDTSC. But be careful, if the cpufreq module is loaded, then CPU frequency can vary and this method does not work properly! 
We'll use use_rdtsc to know if this method is available */ len = 5; Context.use_rdtsc = sysctlnametomib ("dev.cpufreq.0.%driver", mib, &len) == -1; #endif len = 3; if (sysctlnametomib ("hw.clockrate", mib, &len) == -1) return PAPI_ESYS; len = sizeof(hw_clockrate); if (sysctl (mib, 2, &hw_clockrate, &len, NULL, 0) == -1) return PAPI_ESYS; len = 3; if (sysctlnametomib ("hw.model", mib, &len) == -1) return PAPI_ESYS; len = PAPI_MAX_STR_LEN; if (sysctl (mib, 2, &hw_model, &len, NULL, 0) == -1) return PAPI_ESYS; /*strcpy (_papi_hwi_system_info.hw_info.vendor_string, pmc_name_of_cputype(info->pm_cputype));*/ sprintf (_papi_hwi_system_info.hw_info.vendor_string, "%s (TSC:%c)", pmc_name_of_cputype(info->pm_cputype), Context.use_rdtsc?'Y':'N'); strcpy (_papi_hwi_system_info.hw_info.model_string, hw_model); _papi_hwi_system_info.hw_info.mhz = (float) hw_clockrate; _papi_hwi_system_info.hw_info.cpu_max_mhz = hw_clockrate; _papi_hwi_system_info.hw_info.cpu_min_mhz = hw_clockrate; _papi_hwi_system_info.hw_info.ncpu = info->pm_ncpu; _papi_hwi_system_info.hw_info.nnodes = 1; _papi_hwi_system_info.hw_info.totalcpus = info->pm_ncpu; /* Right now, PMC states that TSC is an additional counter. 
However it's only available as a system-wide counter and this requires root access */ _papi_freebsd_vector.cmp_info.num_cntrs = info->pm_npmc - 1; if ( strstr(pmc_name_of_cputype(info->pm_cputype), "INTEL")) _papi_hwi_system_info.hw_info.vendor = PAPI_VENDOR_INTEL; else if ( strstr(pmc_name_of_cputype(info->pm_cputype), "AMD")) _papi_hwi_system_info.hw_info.vendor = PAPI_VENDOR_AMD; else fprintf(stderr,"We didn't actually find a supported vendor...\n\n\n"); } else return PAPI_ESYS; return 1; } int init_presets(int cidx) { const struct pmc_cpuinfo *info; SUBDBG("Entering\n"); if (pmc_cpuinfo (&info) != 0) return PAPI_ESYS; init_freebsd_libpmc_mappings(); if (strcmp(pmc_name_of_cputype(info->pm_cputype), "INTEL_P6") == 0) Context.CPUtype = CPU_P6; else if (strcmp(pmc_name_of_cputype(info->pm_cputype), "INTEL_PII") == 0) Context.CPUtype = CPU_P6_2; else if (strcmp(pmc_name_of_cputype(info->pm_cputype), "INTEL_PIII") == 0) Context.CPUtype = CPU_P6_3; else if (strcmp(pmc_name_of_cputype(info->pm_cputype), "INTEL_CL") == 0) Context.CPUtype = CPU_P6_C; else if (strcmp(pmc_name_of_cputype(info->pm_cputype), "INTEL_PM") == 0) Context.CPUtype = CPU_P6_M; else if (strcmp(pmc_name_of_cputype(info->pm_cputype), "AMD_K7") == 0) Context.CPUtype = CPU_K7; else if (strcmp(pmc_name_of_cputype(info->pm_cputype), "AMD_K8") == 0) Context.CPUtype = CPU_K8; else if (strcmp(pmc_name_of_cputype(info->pm_cputype), "INTEL_PIV") == 0) Context.CPUtype = CPU_P4; else if (strcmp(pmc_name_of_cputype(info->pm_cputype), "INTEL_ATOM") == 0) Context.CPUtype = CPU_ATOM; else if (strcmp(pmc_name_of_cputype(info->pm_cputype), "INTEL_CORE") == 0) Context.CPUtype = CPU_CORE; else if (strcmp(pmc_name_of_cputype(info->pm_cputype), "INTEL_CORE2") == 0) Context.CPUtype = CPU_CORE2; else if (strcmp(pmc_name_of_cputype(info->pm_cputype), "INTEL_CORE2EXTREME") == 0) Context.CPUtype = CPU_CORE2EXTREME; else if (strcmp(pmc_name_of_cputype(info->pm_cputype), "INTEL_COREI7") == 0) Context.CPUtype = CPU_COREI7; else 
if (strcmp(pmc_name_of_cputype(info->pm_cputype), "INTEL_WESTMERE") == 0) Context.CPUtype = CPU_COREWESTMERE; else /* Unknown processor! */ Context.CPUtype = CPU_UNKNOWN; _papi_freebsd_vector.cmp_info.num_native_events = freebsd_number_of_events (Context.CPUtype); _papi_freebsd_vector.cmp_info.attach = 0; _papi_load_preset_table((char *)pmc_name_of_cputype(info->pm_cputype), 0,cidx); return 0; } /* * Component setup and shutdown */ /* Initialize hardware counters, setup the function vector table * and get hardware information, this routine is called when the * PAPI process is initialized (IE PAPI_library_init) */ int _papi_freebsd_init_component(int cidx) { (void)cidx; int retval; SUBDBG("Entering\n"); /* Internal function, doesn't necessarily need to be a function */ retval=init_presets(cidx); return retval; } /* * This is called whenever a thread is initialized */ int _papi_freebsd_init_thread(hwd_context_t *ctx) { (void)ctx; SUBDBG("Entering\n"); return PAPI_OK; } int _papi_freebsd_shutdown_thread(hwd_context_t *ctx) { (void)ctx; SUBDBG("Entering\n"); return PAPI_OK; } int _papi_freebsd_shutdown_component(void) { SUBDBG("Entering\n"); return PAPI_OK; } /* * Control of counters (Reading/Writing/Starting/Stopping/Setup) * functions */ int _papi_freebsd_init_control_state(hwd_control_state_t *ptr) { /* We will default to gather counters in USER|KERNEL mode */ SUBDBG("Entering\n"); ptr->hwc_domain = PAPI_DOM_USER|PAPI_DOM_KERNEL; ptr->pmcs = NULL; ptr->counters = NULL; ptr->n_counters = 0; return PAPI_OK; } int _papi_freebsd_update_control_state(hwd_control_state_t *ptr, NativeInfo_t *native, int count, hwd_context_t *ctx) { char name[1024]; int i; int res; (void)ctx; SUBDBG("Entering\n"); /* We're going to store which counters are being used in this EventSet. As this ptr structure can be reused within many PAPI_add_event calls, and domain can change we will reconstruct the table of counters (ptr->counters) everytime where here. 
*/ if (ptr->counters != NULL && ptr->n_counters > 0) { for (i = 0; i < ptr->n_counters; i++) if (ptr->counters[i] != NULL) free (ptr->counters[i]); free (ptr->counters); } if (ptr->pmcs != NULL) free (ptr->pmcs); if (ptr->values != NULL) free (ptr->values); if (ptr->caps != NULL) free (ptr->caps); ptr->n_counters = count; ptr->pmcs = (pmc_id_t*) malloc (sizeof(pmc_id_t)*count); ptr->caps = (uint32_t*) malloc (sizeof(uint32_t)*count); ptr->values = (pmc_value_t*) malloc (sizeof(pmc_value_t)*count); ptr->counters = (char **) malloc (sizeof(char*)*count); for (i = 0; i < count; i++) ptr->counters[i] = NULL; for (i = 0; i < count; i++) { res = _papi_freebsd_ntv_code_to_name (native[i].ni_event, name, sizeof(name)); if (res != PAPI_OK) return res; native[i].ni_position = i; /* Domains can be applied to canonical events in libpmc (not "generic") */ if (Context.CPUtype != CPU_UNKNOWN) { if (ptr->hwc_domain == (PAPI_DOM_USER|PAPI_DOM_KERNEL)) { /* PMC defaults domain to OS & User. So simply copy the name of the counter */ ptr->counters[i] = strdup (name); if (ptr->counters[i] == NULL) return PAPI_ESYS; } else if (ptr->hwc_domain == PAPI_DOM_USER) { /* This is user-domain case. Just add unitmask=usr */ ptr->counters[i] = malloc ((strlen(name)+strlen(",usr")+1)*sizeof(char)); if (ptr->counters[i] == NULL) return PAPI_ESYS; sprintf (ptr->counters[i], "%s,usr", name); } else /* if (ptr->hwc_domain == PAPI_DOM_KERNEL) */ { /* This is the last case. Just add unitmask=os */ ptr->counters[i] = malloc ((strlen(name)+strlen(",os")+1)*sizeof(char)); if (ptr->counters[i] == NULL) return PAPI_ESYS; sprintf (ptr->counters[i], "%s,os", name); } } else { /* PMC defaults domain to OS & User. 
So simply copy the name of the counter */
            ptr->counters[i] = strdup (name);
            if (ptr->counters[i] == NULL)
                return PAPI_ESYS;
        }
    }

    return PAPI_OK;
}

int _papi_freebsd_start(hwd_context_t *ctx, hwd_control_state_t *ctrl)
{
    int i, ret;
    (void)ctx;

    SUBDBG("Entering\n");

    for (i = 0; i < ctrl->n_counters; i++)
    {
        if ((ret = pmc_allocate (ctrl->counters[i], PMC_MODE_TC, 0, PMC_CPU_ANY, &(ctrl->pmcs[i]))) < 0)
        {
#if defined(DEBUG)
            /* This shouldn't happen, it's tested previously on _papi_freebsd_allocate_registers */
            fprintf (stderr, "DEBUG: %s FAILED to allocate '%s' [%d of %d] ERROR = %d\n", FUNC, ctrl->counters[i], i+1, ctrl->n_counters, ret);
#endif
            return PAPI_ESYS;
        }
        if ((ret = pmc_capabilities (ctrl->pmcs[i], &(ctrl->caps[i]))) < 0)
        {
#if defined(DEBUG)
            fprintf (stderr, "DEBUG: %s FAILED to get capabilities for '%s' [%d of %d] ERROR = %d\n", FUNC, ctrl->counters[i], i+1, ctrl->n_counters, ret);
#endif
            ctrl->caps[i] = 0;
        }
#if defined(DEBUG)
        fprintf (stderr, "DEBUG: %s got counter '%s' is %swritable! [%d of %d]\n", FUNC, ctrl->counters[i], (ctrl->caps[i]&PMC_CAP_WRITE)?"":"NOT ", i+1, ctrl->n_counters);
#endif
        if ((ret = pmc_start (ctrl->pmcs[i])) < 0)
        {
#if defined(DEBUG)
            fprintf (stderr, "DEBUG: %s FAILED to start '%s' [%d of %d] ERROR = %d\n", FUNC, ctrl->counters[i], i+1, ctrl->n_counters, ret);
#endif
            return PAPI_ESYS;
        }
    }
    return PAPI_OK;
}

int _papi_freebsd_read(hwd_context_t *ctx, hwd_control_state_t *ctrl, long long **events, int flags)
{
    int i, ret;
    (void)ctx;
    (void)flags;

    SUBDBG("Entering\n");

    for (i = 0; i < ctrl->n_counters; i++)
        if ((ret = pmc_read (ctrl->pmcs[i], &(ctrl->values[i]))) < 0)
        {
#if defined(DEBUG)
            fprintf (stderr, "DEBUG: %s FAILED to read '%s' [%d of %d] ERROR = %d\n", FUNC, ctrl->counters[i], i+1, ctrl->n_counters, ret);
#endif
            return PAPI_ESYS;
        }

    *events = (long long *)ctrl->values;

#if defined(DEBUG)
    for (i = 0; i < ctrl->n_counters; i++)
        fprintf (stderr, "DEBUG: %s counter '%s' has value %lld\n", FUNC, ctrl->counters[i], (long long)ctrl->values[i]);
#endif

    return PAPI_OK;
}

int _papi_freebsd_stop(hwd_context_t *ctx, hwd_control_state_t *ctrl)
{
    int i, ret;
    (void)ctx;

    SUBDBG("Entering\n");

    for (i = 0; i < ctrl->n_counters; i++)
    {
        if ((ret = pmc_stop (ctrl->pmcs[i])) < 0)
        {
#if defined(DEBUG)
            fprintf (stderr, "DEBUG: %s FAILED to stop '%s' [%d of %d] ERROR = %d\n", FUNC, ctrl->counters[i], i+1, ctrl->n_counters, ret);
#endif
            return PAPI_ESYS;
        }
        if ((ret = pmc_release (ctrl->pmcs[i])) < 0)
        {
#if defined(DEBUG)
            /* This shouldn't happen, it's tested previously on _papi_freebsd_allocate_registers */
            fprintf (stderr, "DEBUG: %s FAILED to release '%s' [%d of %d] ERROR = %d\n", FUNC, ctrl->counters[i], i+1, ctrl->n_counters, ret);
#endif
            return PAPI_ESYS;
        }
    }
    return PAPI_OK;
}

int _papi_freebsd_reset(hwd_context_t *ctx, hwd_control_state_t *ctrl)
{
    int i, ret;
    (void)ctx;

    SUBDBG("Entering\n");

    for (i = 0; i < ctrl->n_counters; i++)
    {
        /* Can we write on the counters?
*/ if (ctrl->caps[i] & PMC_CAP_WRITE) { show_counter("DEBUG: _papi_freebsd_reset is about " "to stop the counter i+1", ctrl->pmcs[i],ctrl->counters[i], __FUNCTION__,__FILE__,__LINE__); if ((ret = pmc_stop (ctrl->pmcs[i])) < 0) { #if defined(DEBUG) fprintf (stderr, "DEBUG: %s FAILED to stop '%s' [%d of %d] ERROR = %d\n", FUNC, ctrl->counters[i], i+1, ctrl->n_counters, ret); #endif return PAPI_ESYS; } show_counter( "DEBUG: _papi_freebsd_reset is about " "to write the counter i+1\n", ctrl->pmcs[i],ctrl->counters[i], __FUNCTION__,__FILE__,__LINE__); if ((ret = pmc_write (ctrl->pmcs[i], 0)) < 0) { #if defined(DEBUG) fprintf (stderr, "DEBUG: %s FAILED to write '%s' [%d of %d] ERROR = %d\n", FUNC, ctrl->counters[i], i+1, ctrl->n_counters, ret); #endif return PAPI_ESYS; } show_counter("DEBUG: _papi_freebsd_reset is about to " "start the counter %i+1", ctrl->pmcs[i],ctrl->counters[i], __FUNCTION__,__FILE__,__LINE__); if ((ret = pmc_start (ctrl->pmcs[i])) < 0) { #if defined(DEBUG) fprintf (stderr, "DEBUG: %s FAILED to start '%s' [%d of %d] ERROR = %d\n", FUNC, ctrl->counters[i], i+1, ctrl->n_counters, ret); #endif return PAPI_ESYS; } show_counter("DEBUG: _papi_freebsd_reset after " "starting the counter i+1", ctrl->pmcs[i],ctrl->counters[i], __FUNCTION__,__FILE__,__LINE__); } else return PAPI_ECMP; } return PAPI_OK; } int _papi_freebsd_write(hwd_context_t *ctx, hwd_control_state_t *ctrl, long long *from) { int i, ret; (void)ctx; SUBDBG("Entering\n"); for (i = 0; i < ctrl->n_counters; i++) { /* Can we write on the counters? 
*/
        if (ctrl->caps[i] & PMC_CAP_WRITE)
        {
            if ((ret = pmc_stop (ctrl->pmcs[i])) < 0)
            {
#if defined(DEBUG)
                fprintf (stderr, "DEBUG: %s FAILED to stop '%s' [%d of %d] ERROR = %d\n", FUNC, ctrl->counters[i], i+1, ctrl->n_counters, ret);
#endif
                return PAPI_ESYS;
            }
            if ((ret = pmc_write (ctrl->pmcs[i], from[i])) < 0)
            {
#if defined(DEBUG)
                fprintf (stderr, "DEBUG: %s FAILED to write '%s' [%d of %d] ERROR = %d\n", FUNC, ctrl->counters[i], i+1, ctrl->n_counters, ret);
#endif
                return PAPI_ESYS;
            }
            if ((ret = pmc_start (ctrl->pmcs[i])) < 0)
            {
#if defined(DEBUG)
                fprintf (stderr, "DEBUG: %s FAILED to start '%s' [%d of %d] ERROR = %d\n", FUNC, ctrl->counters[i], i+1, ctrl->n_counters, ret);
#endif
                return PAPI_ESYS;
            }
        }
        else
            return PAPI_ECMP;
    }
    return PAPI_OK;
}

/*
 * Overflow and profile functions
 */

void _papi_freebsd_dispatch_timer(int signal, hwd_siginfo_t * info, void *context)
{
    (void)signal;
    (void)info;
    (void)context;
    /* Real function would call the function below with the proper args
     * _papi_hwi_dispatch_overflow_signal(...); */
    SUBDBG("Entering\n");
    return;
}

int _papi_freebsd_stop_profiling(ThreadInfo_t *master, EventSetInfo_t *ESI)
{
    (void)master;
    (void)ESI;
    SUBDBG("Entering\n");
    return PAPI_OK;
}

int _papi_freebsd_set_overflow(EventSetInfo_t *ESI, int EventIndex, int threshold)
{
    (void)ESI;
    (void)EventIndex;
    (void)threshold;
    SUBDBG("Entering\n");
    return PAPI_OK;
}

int _papi_freebsd_set_profile(EventSetInfo_t *ESI, int EventIndex, int threshold)
{
    (void)ESI;
    (void)EventIndex;
    (void)threshold;
    SUBDBG("Entering\n");
    return PAPI_OK;
}

/*
 * Functions for setting up various options
 */

/*
 * This function has to set the bits needed to count different domains.
 * In particular: PAPI_DOM_USER, PAPI_DOM_KERNEL and PAPI_DOM_OTHER.
 * Return PAPI_EINVAL if none of those are specified, and PAPI_OK on success.
 * PAPI_DOM_USER means only the user context is counted.
 * PAPI_DOM_KERNEL means only the Kernel/OS context is counted.
 * PAPI_DOM_OTHER is Exception/transient mode (like user TLB misses).
 *
PAPI_DOM_ALL is all of the domains */ int _papi_freebsd_set_domain(hwd_control_state_t *cntrl, int domain) { int found = 0; SUBDBG("Entering\n"); /* libpmc supports USER/KERNEL mode only when counters are native */ if (Context.CPUtype != CPU_UNKNOWN) { if (domain & (PAPI_DOM_USER|PAPI_DOM_KERNEL)) { cntrl->hwc_domain = domain & (PAPI_DOM_USER|PAPI_DOM_KERNEL); found = 1; } return found?PAPI_OK:PAPI_EINVAL; } else return PAPI_ECMP; } /* This function sets various options in the component * The valid codes being passed in are PAPI_SET_DEFDOM, * PAPI_SET_DOMAIN, PAPI_SETDEFGRN, PAPI_SET_GRANUL * and PAPI_SET_INHERIT */ int _papi_freebsd_ctl (hwd_context_t *ctx, int code, _papi_int_option_t *option) { (void)ctx; SUBDBG("Entering\n"); switch (code) { case PAPI_DOMAIN: case PAPI_DEFDOM: /*return _papi_freebsd_set_domain(&option->domain.ESI->machdep, option->domain.domain);*/ return _papi_freebsd_set_domain(option->domain.ESI->ctl_state, option->domain.domain); case PAPI_GRANUL: case PAPI_DEFGRN: return PAPI_ECMP; default: return PAPI_EINVAL; } } /* * Timing Routines * These functions should return the highest resolution timers available. */ long long _papi_freebsd_get_real_usec(void) { /* Hey, I've seen somewhere a define called __x86_64__! Should I support it? */ #if !defined(__i386__) && !defined(__amd64__) /* This will surely work, but with low precision and high overhead */ struct rusage res; SUBDBG("Entering\n"); if ((getrusage(RUSAGE_SELF, &res) == -1)) return PAPI_ESYS; return (res.ru_utime.tv_sec * 1000000) + res.ru_utime.tv_usec; #else SUBDBG("Entering\n"); if (Context.use_rdtsc) { return _papi_freebsd_get_real_cycles() / _papi_hwi_system_info.hw_info.cpu_max_mhz; } else { struct rusage res; if ((getrusage(RUSAGE_SELF, &res) == -1)) return PAPI_ESYS; return (res.ru_utime.tv_sec * 1000000) + res.ru_utime.tv_usec; } #endif } long long _papi_freebsd_get_real_cycles(void) { /* Hey, I've seen somewhere a define called __x86_64__! Should I support it? 
*/ #if !defined(__i386__) && !defined(__amd64__) SUBDBG("Entering\n"); /* This will surely work, but with low precision and high overhead */ return ((long long) _papi_freebsd_get_real_usec() * _papi_hwi_system_info.hw_info.cpu_max_mhz); #else SUBDBG("Entering\n"); if (Context.use_rdtsc) { long long cycles; __asm __volatile(".byte 0x0f, 0x31" : "=A" (cycles)); return cycles; } else { return ((long long) _papi_freebsd_get_real_usec() * _papi_hwi_system_info.hw_info.cpu_max_mhz); } #endif } long long _papi_freebsd_get_virt_usec(void) { struct rusage res; SUBDBG("Entering\n"); if ((getrusage(RUSAGE_SELF, &res) == -1)) return PAPI_ESYS; return (res.ru_utime.tv_sec * 1000000) + res.ru_utime.tv_usec; } /* * Native Event functions */ int _papi_freebsd_ntv_enum_events(unsigned int *EventCode, int modifier) { int res; char name[1024]; unsigned int nextCode = 1 + *EventCode; SUBDBG("Entering\n"); if (modifier==PAPI_ENUM_FIRST) { *EventCode=0; return PAPI_OK; } if (modifier==PAPI_ENUM_EVENTS) { res = _papi_freebsd_ntv_code_to_name(nextCode, name, sizeof(name)); if (res != PAPI_OK) { return res; } else { *EventCode = nextCode; } return PAPI_OK; } return PAPI_ENOEVNT; } int _papi_freebsd_ntv_name_to_code(const char *name, unsigned int *event_code) { SUBDBG("Entering\n"); int i; for(i = 0; i < _papi_freebsd_vector.cmp_info.num_native_events; i++) { if (strcmp (name, _papi_hwd_native_info[Context.CPUtype].info[i].name) == 0) { *event_code = i; return PAPI_OK; } } return PAPI_ENOEVNT; } int _papi_freebsd_ntv_code_to_name(unsigned int EventCode, char *ntv_name, int len) { SUBDBG("Entering\n"); int nidx; nidx = EventCode & PAPI_NATIVE_AND_MASK; if (nidx >= _papi_freebsd_vector.cmp_info.num_native_events) { return PAPI_ENOEVNT; } strncpy (ntv_name, _papi_hwd_native_info[Context.CPUtype].info[nidx].name, len); if (strlen(_papi_hwd_native_info[Context.CPUtype].info[nidx].name) > (size_t)len-1) { return PAPI_EBUF; } return PAPI_OK; } int _papi_freebsd_ntv_code_to_descr(unsigned int 
EventCode, char *descr, int len) { SUBDBG("Entering\n"); int nidx; nidx = EventCode & PAPI_NATIVE_AND_MASK; if (nidx >= _papi_freebsd_vector.cmp_info.num_native_events) { return PAPI_ENOEVNT; } strncpy (descr, _papi_hwd_native_info[Context.CPUtype].info[nidx].description, len); if (strlen(_papi_hwd_native_info[Context.CPUtype].info[nidx].description) > (size_t)len-1) { return PAPI_EBUF; } return PAPI_OK; } /* * Counter Allocation Functions, only need to implement if * the component needs smart counter allocation. */ /* Here we'll check if PMC can provide all the counters the user want */ int _papi_freebsd_allocate_registers (EventSetInfo_t *ESI) { char name[1024]; int failed, allocated_counters, i, j, ret; pmc_id_t *pmcs; SUBDBG("Entering\n"); failed = 0; pmcs = (pmc_id_t*) malloc(sizeof(pmc_id_t)*ESI->NativeCount); if (pmcs != NULL) { allocated_counters = 0; /* Check if we can allocate all the counters needed */ for (i = 0; i < ESI->NativeCount; i++) { ret = _papi_freebsd_ntv_code_to_name (ESI->NativeInfoArray[i].ni_event, name, sizeof(name)); if (ret != PAPI_OK) return ret; if ( (ret = pmc_allocate (name, PMC_MODE_TC, 0, PMC_CPU_ANY, &pmcs[i])) < 0) { #if defined(DEBUG) fprintf (stderr, "DEBUG: %s FAILED to allocate '%s' (%#08x) [%d of %d] ERROR = %d\n", FUNC, name, ESI->NativeInfoArray[i].ni_event, i+1, ESI->NativeCount, ret); #endif failed = 1; break; } else { #if defined(DEBUG) fprintf (stderr, "DEBUG: %s SUCCEEDED allocating '%s' (%#08x) [%d of %d]\n", FUNC, name, ESI->NativeInfoArray[i].ni_event, i+1, ESI->NativeCount); #endif allocated_counters++; } } /* Free the counters */ for (j = 0; j < allocated_counters; j++) pmc_release (pmcs[j]); free (pmcs); } else failed = 1; return failed?PAPI_ECNFLCT:PAPI_OK; } /* * Shared Library Information and other Information Functions */ int _papi_freebsd_update_shlib_info(papi_mdi_t *mdi){ SUBDBG("Entering\n"); (void)mdi; return PAPI_OK; } int _papi_freebsd_detect_hypervisor(char *virtual_vendor_name) { int retval=0; #if 
defined(__i386__)||defined(__x86_64__) retval=_x86_detect_hypervisor(virtual_vendor_name); #else (void) virtual_vendor_name; #endif return retval; } int _papi_freebsd_get_system_info( papi_mdi_t *mdi ) { int retval; retval=_freebsd_get_memory_info(&mdi->hw_info, mdi->hw_info.model ); /* Get virtualization info */ mdi->hw_info.virtualized=_papi_freebsd_detect_hypervisor(mdi->hw_info.virtual_vendor_string); return PAPI_OK; } int _papi_hwi_init_os(void) { struct utsname uname_buffer; /* Internal function, doesn't necessarily need to be a function */ init_mdi(); uname(&uname_buffer); strncpy(_papi_os_info.name,uname_buffer.sysname,PAPI_MAX_STR_LEN); strncpy(_papi_os_info.version,uname_buffer.release,PAPI_MAX_STR_LEN); _papi_os_info.itimer_sig = PAPI_INT_MPX_SIGNAL; _papi_os_info.itimer_num = PAPI_INT_ITIMER; _papi_os_info.itimer_ns = PAPI_INT_MPX_DEF_US * 1000; /* Not actually supported */ _papi_os_info.itimer_res_ns = 1; _papi_freebsd_get_system_info(&_papi_hwi_system_info); return PAPI_OK; } papi_vector_t _papi_freebsd_vector = { .cmp_info = { /* default component information (unspecified values are initialized to 0) */ .name = "FreeBSD", .description = "FreeBSD CPU counters", .default_domain = PAPI_DOM_USER, .available_domains = PAPI_DOM_USER | PAPI_DOM_KERNEL, .default_granularity = PAPI_GRN_THR, .available_granularities = PAPI_GRN_THR, .hardware_intr = 1, .kernel_multiplex = 1, .kernel_profile = 1, .num_mpx_cntrs = HWPMC_NUM_COUNTERS, /* ?? 
*/
    .hardware_intr_sig = PAPI_INT_SIGNAL,

    /* component specific cmp_info initializations */
    .fast_real_timer = 1,
    .fast_virtual_timer = 0,
    .attach = 0,
    .attach_must_ptrace = 0,
    },
    .size = {
        .context = sizeof( hwd_context_t ),
        .control_state = sizeof( hwd_control_state_t ),
        .reg_value = sizeof( hwd_register_t ),
        .reg_alloc = sizeof( hwd_reg_alloc_t )
    },
    .dispatch_timer = _papi_freebsd_dispatch_timer,
    .start = _papi_freebsd_start,
    .stop = _papi_freebsd_stop,
    .read = _papi_freebsd_read,
    .reset = _papi_freebsd_reset,
    .write = _papi_freebsd_write,
    .stop_profiling = _papi_freebsd_stop_profiling,
    .init_component = _papi_freebsd_init_component,
    .init_thread = _papi_freebsd_init_thread,
    .init_control_state = _papi_freebsd_init_control_state,
    .update_control_state = _papi_freebsd_update_control_state,
    .ctl = _papi_freebsd_ctl,
    .set_overflow = _papi_freebsd_set_overflow,
    .set_profile = _papi_freebsd_set_profile,
    .set_domain = _papi_freebsd_set_domain,
    .ntv_enum_events = _papi_freebsd_ntv_enum_events,
    .ntv_name_to_code = _papi_freebsd_ntv_name_to_code,
    .ntv_code_to_name = _papi_freebsd_ntv_code_to_name,
    .ntv_code_to_descr = _papi_freebsd_ntv_code_to_descr,
    .allocate_registers = _papi_freebsd_allocate_registers,
    .shutdown_thread = _papi_freebsd_shutdown_thread,
    .shutdown_component = _papi_freebsd_shutdown_component,
};

papi_os_vector_t _papi_os_vector = {
    .get_dmem_info = _papi_freebsd_get_dmem_info,
    .get_real_cycles = _papi_freebsd_get_real_cycles,
    .get_real_usec = _papi_freebsd_get_real_usec,
    .get_virt_usec = _papi_freebsd_get_virt_usec,
    .update_shlib_info = _papi_freebsd_update_shlib_info,
    .get_system_info = _papi_freebsd_get_system_info,
};

/* ===== File: papi-papi-7-2-0-t/src/freebsd.h ===== */

/****************************/
/* THIS IS OPEN SOURCE CODE */
/****************************/

/*
 * File:   freebsd-libpmc.c
 * Author: Kevin London
 *         london@cs.utk.edu
 * Mods:   Harald Servat
 *         redcrash@gmail.com
 */

#ifndef
_PAPI_FreeBSD_H
#define _PAPI_FreeBSD_H

#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include "papi.h"
#include
#include "freebsd-config.h"

#define MAX_COUNTERS        HWPMC_NUM_COUNTERS
#define MAX_COUNTER_TERMS   MAX_COUNTERS

#undef hwd_siginfo_t
#undef hwd_register_t
#undef hwd_reg_alloc_t
#undef hwd_control_state_t
#undef hwd_context_t
#undef hwd_libpmc_context_t

typedef struct hwd_siginfo {
    int placeholder;
} hwd_siginfo_t;

typedef struct hwd_register {
    int placeholder;
} hwd_register_t;

typedef struct hwd_reg_alloc {
    int placeholder;
} hwd_reg_alloc_t;

typedef struct hwd_control_state {
    int n_counters;      /* Number of counters */
    int hwc_domain;      /* HWC domain {user|kernel} */
    unsigned *caps;      /* Capabilities for each counter */
    pmc_id_t *pmcs;      /* PMC identifiers */
    pmc_value_t *values; /* Stored values for each counter */
    char **counters;     /* Name of each counter (with mode) */
} hwd_control_state_t;

typedef struct hwd_context {
    int placeholder;
} hwd_context_t;

#include "freebsd-context.h"

typedef struct hwd_libpmc_context {
    int CPUtype;
    int use_rdtsc;
} hwd_libpmc_context_t;

#define _papi_hwd_lock_init() { ; }

#endif /* _PAPI_FreeBSD_H */

/* ===== File: papi-papi-7-2-0-t/src/freebsd/map-atom.c ===== */

/****************************/
/* THIS IS OPEN SOURCE CODE */
/****************************/

/*
 * File:   map-atom.c
 * Author: Harald Servat
 *         redcrash@gmail.com
 */

#include "freebsd.h"
#include "papiStdEventDefs.h"
#include "map.h"

/****************************************************************************
 ATOM SUBSTRATE  ATOM SUBSTRATE  ATOM SUBSTRATE  ATOM SUBSTRATE  ATOM SUBSTRATE
****************************************************************************/

/* NativeEvent_Value_AtomProcessor must match AtomProcessor_info */
Native_Event_LabelDescription_t
AtomProcessor_info[] = { {"BACLEARS", "The number of times the front end is resteered."}, {"BOGUS_BR", "The number of byte sequences mistakenly detected as taken branch instructions."}, {"BR_BAC_MISSP_EXEC", "The number of branch instructions that were mispredicted when decoded."}, {"BR_CALL_MISSP_EXEC", "The number of mispredicted CALL instructions that were executed."}, {"BR_CALL_EXEC", "The number of CALL instructions executed."}, {"BR_CND_EXEC", "The number of conditional branches executed, but not necessarily retired."}, {"BR_CND_MISSP_EXEC", "The number of mispredicted conditional branches executed."}, {"BR_IND_CALL_EXEC", "The number of indirect CALL instructions executed."}, {"BR_IND_EXEC", "The number of indirect branch instructions executed."}, {"BR_IND_MISSP_EXEC", "The number of mispredicted indirect branch instructions executed."}, {"BR_INST_DECODED", "The number of branch instructions decoded."}, {"BR_INST_EXEC", "The number of branches executed, but not necessarily retired."}, {"BR_INST_RETIRED.ANY", "The number of branch instructions retired. This is an architectural performance event."}, {"BR_INST_RETIRED.ANY1", "The number of branch instructions retired that were mispredicted."}, {"BR_INST_RETIRED.MISPRED", "The number of mispredicted branch instructions retired. 
This is an architectural performance event."},
{"BR_INST_RETIRED.MISPRED_NOT_TAKEN", "The number of not taken branch instructions retired that were mispredicted."},
{"BR_INST_RETIRED.MISPRED_TAKEN", "The number of taken branch instructions retired that were mispredicted."},
{"BR_INST_RETIRED.PRED_NOT_TAKEN", "The number of not taken branch instructions retired that were correctly predicted."},
{"BR_INST_RETIRED.PRED_TAKEN", "The number of taken branch instructions retired that were correctly predicted."},
{"BR_INST_RETIRED.TAKEN", "The number of taken branch instructions retired."},
{"BR_MISSP_EXEC", "The number of mispredicted branch instructions that were executed."},
{"BR_RET_MISSP_EXEC", "The number of mispredicted RET instructions executed."},
{"BR_RET_BAC_MISSP_EXEC", "The number of RET instructions executed that were mispredicted at decode time."},
{"BR_RET_EXEC", "The number of RET instructions executed."},
{"BR_TKN_BUBBLE_1", "The number of branch predicted taken with bubble 1."},
{"BR_TKN_BUBBLE_2", "The number of branch predicted taken with bubble 2."},
{"BUSQ_EMPTY", "The number of cycles during which the core did not have any pending transactions in the bus queue."},
{"BUS_BNR_DRV", "The number of Bus Not Ready signals asserted on the bus. This event is thread-independent."},
{"BUS_DATA_RCV", "The number of bus cycles during which the processor is receiving data. This event is thread-independent."},
{"BUS_DRDY_CLOCKS", "The number of bus cycles during which the Data Ready signal is asserted on the bus. This event is thread-independent."},
{"BUS_HIT_DRV", "The number of bus cycles during which the processor drives the HIT# pin. This event is thread-independent."},
{"BUS_HITM_DRV", "The number of bus cycles during which the processor drives the HITM# pin.
This event is thread-independent."},
{"BUS_IO_WAIT", "The number of core cycles during which I/O requests wait in the bus queue."},
{"BUS_LOCK_CLOCKS", "The number of bus cycles during which the LOCK signal was asserted on the bus. This event is thread independent."},
{"BUS_REQUEST_OUTSTANDING", "The number of pending full cache line read transactions on the bus occurring in each cycle. This event is thread independent."},
{"BUS_TRANS_P", "The number of partial bus transactions."},
{"BUS_TRANS_IFETCH", "The number of instruction fetch full cache line bus transactions."},
{"BUS_TRANS_INVAL", "The number of invalidate bus transactions."},
{"BUS_TRANS_PWR", "The number of partial write bus transactions."},
{"BUS_TRANS_DEF", "The number of deferred bus transactions."},
{"BUS_TRANS_BURST", "The number of burst transactions."},
{"BUS_TRANS_MEM", "The number of memory bus transactions."},
{"BUS_TRANS_ANY", "The number of bus transactions of any kind."},
{"BUS_TRANS_BRD", "The number of burst read transactions."},
{"BUS_TRANS_IO", "The number of completed I/O bus transactions due to IN and OUT instructions."},
{"BUS_TRANS_RFO", "The number of Read For Ownership bus transactions."},
{"BUS_TRANS_WB", "The number of explicit writeback bus transactions due to dirty line evictions."},
{"CMP_SNOOP", "The number of times the L1 data cache is snooped by the other core in the same processor."},
{"CPU_CLK_UNHALTED.BUS", "The number of bus cycles when the core is not in the halt state. This is an architectural performance event."},
{"CPU_CLK_UNHALTED.CORE_P", "The number of core cycles while the core is not in a halt state.
This is an architectural performance event."},
{"CPU_CLK_UNHALTED.NO_OTHER", "The number of bus cycles during which the core remains unhalted and the other core is halted."},
{"CYCLES_DIV_BUSY", "The number of cycles the divider is busy."},
{"CYCLES_INT_MASKED.CYCLES_INT_MASKED", "The number of cycles during which interrupts are disabled."},
{"CYCLES_INT_MASKED.CYCLES_INT_PENDING_AND_MASKED", "The number of cycles during which there were pending interrupts while interrupts were disabled."},
{"CYCLES_L1I_MEM_STALLED", "The number of cycles for which an instruction fetch stalls."},
{"DATA_TLB_MISSES.DTLB_MISS", "The number of memory accesses that missed the Data TLB."},
{"DATA_TLB_MISSES.DTLB_MISS_LD", "The number of loads that missed the Data TLB."},
{"DATA_TLB_MISSES.DTLB_MISS_ST", "The number of stores that missed the Data TLB."},
{"DATA_TLB_MISSES.UTLB_MISS_LD", "The number of loads that missed the UTLB."},
{"DELAYED_BYPASS.FP", "The number of floating point operations that used data immediately after the data was generated by a non floating point execution unit."},
{"DELAYED_BYPASS.LOAD", "The number of delayed bypass penalty cycles that a load operation incurred."},
{"DELAYED_BYPASS.SIMD", "The number of times SIMD operations use data immediately after the data was generated by a non-SIMD execution unit."},
{"DIV", "The number of divide operations executed.
This event is only available on PMC1."},
{"DIV.AR", "The number of divide operations retired."},
{"DIV.S", "The number of divide operations executed."},
{"DTLB_MISSES.ANY", "The number of Data TLB misses, including misses that result from speculative accesses."},
{"DTLB_MISSES.L0_MISS_LD", "The number of level 0 DTLB misses due to load operations."},
{"DTLB_MISSES.MISS_LD", "The number of Data TLB misses due to load operations."},
{"DTLB_MISSES.MISS_ST", "The number of Data TLB misses due to store operations."},
{"EIST_TRANS", "The number of Enhanced Intel SpeedStep Technology transitions."},
{"ESP.ADDITIONS", "The number of automatic additions to the esp register."},
{"ESP.SYNCH", "The number of times the esp register was explicitly used in an address expression after it is implicitly used by a PUSH or POP instruction."},
{"EXT_SNOOP", "The number of snoop responses to bus transactions."},
{"FP_ASSIST", "The number of floating point operations executed that needed a microcode assist, including speculatively executed instructions."},
{"FP_ASSIST.AR", "The number of floating point operations retired that needed a microcode assist."},
{"FP_COMP_OPS_EXE", "The number of floating point computational micro-ops executed. The event is available only on PMC0."},
{"FP_MMX_TRANS_TO_FP", "The number of transitions from MMX instructions to floating point instructions."},
{"FP_MMX_TRANS_TO_MMX", "The number of transitions from floating point instructions to MMX instructions."},
{"HW_INT_RCV", "The number of hardware interrupts received."},
{"ICACHE.ACCESSES", "The number of instruction fetches."},
{"ICACHE.MISSES", "The number of instruction fetches that miss the instruction cache."},
{"IDLE_DURING_DIV", "The number of cycles the divider is busy and no other execution unit or load operation was in progress.
This event is available only on PMC0."},
{"ILD_STALL", "The number of cycles the instruction length decoder stalled due to a length changing prefix."},
{"INST_QUEUE.FULL", "The number of cycles during which the instruction queue is full."},
{"INST_RETIRED.ANY_P", "The number of instructions retired. This is an architectural performance event."},
{"INST_RETIRED.LOADS", "The number of instructions retired that contained a load operation."},
{"INST_RETIRED.OTHER", "The number of instructions retired that did not contain a load or a store operation."},
{"INST_RETIRED.STORES", "The number of instructions retired that contained a store operation."},
{"ITLB.FLUSH", "The number of ITLB flushes."},
{"ITLB.LARGE_MISS", "The number of instruction fetches from large pages that miss the ITLB."},
{"ITLB.MISSES", "The number of instruction fetches from both large and small pages that miss the ITLB."},
{"ITLB.SMALL_MISS", "The number of instruction fetches from small pages that miss the ITLB."},
{"ITLB_MISS_RETIRED", "The number of retired instructions that missed the ITLB when they were fetched."},
{"L1D_ALL_REF", "The number of references to L1 data cache counting loads and stores to all memory types."},
{"L1D_ALL_CACHE_REF", "The number of data reads and writes to cacheable memory."},
{"L1D_CACHE_LOCK", "The number of locked reads from cacheable memory."},
{"L1D_CACHE_LOCK_DURATION", "The number of cycles during which any cache line is locked by any locking instruction."},
{"L1D_CACHE.LD", "The number of data reads from cacheable memory."},
{"L1D_CACHE.ST", "The number of data writes to cacheable memory."},
{"L1D_M_EVICT", "The number of modified cache lines evicted from L1 data cache."},
{"L1D_M_REPL", "The number of modified lines allocated in L1 data cache."},
{"L1D_PEND_MISS", "The total number of outstanding L1 data cache misses at any clock."},
{"L1D_PREFETCH.REQUESTS", "The number of times L1 data cache requested to prefetch a data cache line."},
{"L1D_REPL", "The
number of lines brought into L1 data cache."}, {"L1D_SPLIT.LOADS", "The number of load operations that span two cache lines."}, {"L1D_SPLIT.STORES", "The number of store operations that span two cache lines."}, {"L1I_MISSES", "The number of instruction fetch unit misses."}, {"L1I_READS", "The number of instruction fetches."}, {"L2_ADS", "The number of cycles that the L2 address bus is in use."}, {"L2_DBUS_BUSY_RD", "The number of core cycles during which the L2 data bus is busy transferring data to the core."}, {"L2_IFETCH", "The number of instruction cache line requests from the instruction fetch unit."}, {"L2_LD", "The number of L2 cache read requests from L1 cache and L2 prefetchers."}, {"L2_LINES_IN", "The number of cache lines allocated in L2 cache."}, {"L2_LINES_OUT", "The number of L2 cache lines evicted."}, {"L2_LOCK", "The number of locked accesses to cache lines that miss L1 data cache."}, {"L2_M_LINES_IN", "The number of L2 cache line modifications."}, {"L2_M_LINES_OUT", "The number of modified lines evicted from L2 cache."}, {"L2_NO_REQ", "The number of cycles during which no L2 cache requests were pending from a core."}, {"L2_REJECT_BUSQ", "The number of L2 cache requests that were rejected."}, {"L2_RQSTS", "The number of completed L2 cache requests."}, {"L2_RQSTS.SELF.DEMAND.I_STATE", "The number of completed L2 cache demand requests from this core that missed the L2 cache. 
This is an architectural performance event."},
{"L2_RQSTS.SELF.DEMAND.MESI", "The number of completed L2 cache demand requests from this core."},
{"L2_ST", "The number of store operations that miss the L1 cache and request data from the L2 cache."},
{"LOAD_BLOCK.L1D", "The number of loads blocked by the L1 data cache."},
{"LOAD_BLOCK.OVERLAP_STORE", "The number of loads that partially overlap an earlier store or are aliased with a previous store."},
{"LOAD_BLOCK.STA", "The number of loads blocked by preceding stores whose address is yet to be calculated."},
{"LOAD_BLOCK.STD", "The number of loads blocked by preceding stores to the same address whose data value is not known."},
{"LOAD_BLOCK.UNTIL_RETIRE", "The number of load operations that were blocked until retirement."},
{"LOAD_HIT_PRE", "The number of load operations that conflicted with a prefetch to the same cache line."},
{"MACHINE_CLEARS.SMC", "The number of times a program writes to a code section."},
{"MACHINE_NUKES.MEM_ORDER", "The number of times the execution pipeline was restarted due to a memory ordering conflict or memory disambiguation misprediction."},
{"MACRO_INSTS.ALL_DECODED", "The number of instructions decoded."},
{"MACRO_INSTS.CISC_DECODED", "The number of complex instructions decoded."},
{"MEMORY_DISAMBIGUATION.RESET", "The number of cycles during which memory disambiguation misprediction occurs."},
{"MEMORY_DISAMBIGUATION.SUCCESS", "The number of load operations that were successfully disambiguated."},
{"MEM_LOAD_RETIRED.DTLB_MISS", "The number of retired load operations that missed the DTLB."},
{"MEM_LOAD_RETIRED.L2_MISS", "The number of retired load operations that miss L2 cache."},
{"MEM_LOAD_RETIRED.L2_HIT", "The number of retired load operations that hit L2 cache."},
{"MEM_LOAD_RETIRED.L2_LINE_MISS", "The number of load operations that missed L2 cache and that caused a bus request."},
{"MUL", "The number of multiply operations executed.
This event is only available on PMC1."}, {"MUL.AR", "The number of multiply operations retired."}, {"MUL.S", "The number of multiply operations executed."}, {"PAGE_WALKS.WALKS", "The number of page walks executed due to an ITLB or DTLB miss."}, {"PAGE_WALKS.CYCLES", "The number of cycles spent in a page walk caused by an ITLB or DTLB miss."}, {"PREF_RQSTS_DN", "The number of downward prefetches issued from the Data Prefetch Logic unit to L2 cache."}, {"PREF_RQSTS_UP", "The number of upward prefetches issued from the Data Prefetch Logic unit to L2 cache."}, {"PREFETCH.PREFETCHNTA", "The number of PREFETCHNTA instructions executed."}, {"PREFETCH.PREFETCHT0", "The number of PREFETCHT0 instructions executed."}, {"PREFETCH.SW_L2", "The number of PREFETCHT1 and PREFETCHT2 instructions executed."}, {"RAT_STALLS.ANY", "The number of stall cycles due to any of RAT_STALLS.FLAGS RAT_STALLS.FPSW, RAT_STALLS.PARTIAL and RAT_STALLS.ROB_READ_PORT."}, {"RAT_STALLS.FLAGS", "The number of cycles execution stalled due to a flag register induced stall."}, {"RAT_STALLS.FPSW", "The number of times the floating point status word was written."}, {"RAT_STALLS.PARTIAL_CYCLES", "The number of cycles of added instruction execution latency due to the use of a register that was partially written by previous instructions."}, {"RAT_STALLS.ROB_READ_PORT", "The number of cycles when ROB read port stalls occurred."}, {"RESOURCE_STALLS.ANY", "The number of cycles during which any resource related stall occurred."}, {"RESOURCE_STALLS.BR_MISS_CLEAR", "The number of cycles stalled due to branch misprediction."}, {"RESOURCE_STALLS.FPCW", "The number of cycles stalled due to writing the floating point control word."}, {"RESOURCE_STALLS.LD_ST", "The number of cycles during which the number of loads and stores in the pipeline exceeded their limits."}, {"RESOURCE_STALLS.ROB_FULL", "The number of cycles when the reorder buffer was full."}, {"RESOURCE_STALLS.RS_FULL", "The number of cycles during which the RS 
was full."}, {"RS_UOPS_DISPATCHED", "The number of micro-ops dispatched for execution."}, {"RS_UOPS_DISPATCHED.PORT0", "The number of cycles micro-ops were dispatched for execution on port 0."}, {"RS_UOPS_DISPATCHED.PORT1", "The number of cycles micro-ops were dispatched for execution on port 1."}, {"RS_UOPS_DISPATCHED.PORT2", "The number of cycles micro-ops were dispatched for execution on port 2."}, {"RS_UOPS_DISPATCHED.PORT3", "The number of cycles micro-ops were dispatched for execution on port 3."}, {"RS_UOPS_DISPATCHED.PORT4", "The number of cycles micro-ops were dispatched for execution on port 4."}, {"RS_UOPS_DISPATCHED.PORT5", "The number of cycles micro-ops were dispatched for execution on port 5."}, {"SB_DRAIN_CYCLES", "The number of cycles while the store buffer is draining."}, {"SEGMENT_REG_LOADS.ANY", "The number of segment register loads."}, {"SEG_REG_RENAMES.ANY", "The number of times any segment register was renamed."}, {"SEG_REG_RENAMES.DS", "The number of times the ds register is renamed."}, {"SEG_REG_RENAMES.ES", "The number of times the es register is renamed."}, {"SEG_REG_RENAMES.FS", "The number of times the fs register is renamed."}, {"SEG_REG_RENAMES.GS", "The number of times the gs register is renamed."}, {"SEG_RENAME_STALLS.ANY", "The number of stalls due to lack of resources to rename any segment register."}, {"SEG_RENAME_STALLS.DS", "The number of stalls due to lack of renaming resources for the ds register."}, {"SEG_RENAME_STALLS.ES", "The number of stalls due to lack of renaming resources for the es register."}, {"SEG_RENAME_STALLS.FS", "The number of stalls due to lack of renaming resources for the fs register."}, {"SEG_RENAME_STALLS.GS", "The number of stalls due to lack of renaming resources for the gs register."}, {"SIMD_ASSIST", "The number of SIMD assists invoked."}, {"SIMD_COMP_INST_RETIRED.PACKED_DOUBLE", "The number of computational SSE2 packed double precision instructions retired."},
{"SIMD_COMP_INST_RETIRED.PACKED_SINGLE", "The number of computational SSE2 packed single precision instructions retired."}, {"SIMD_COMP_INST_RETIRED.SCALAR_DOUBLE", "The number of computational SSE2 scalar double precision instructions retired."}, {"SIMD_COMP_INST_RETIRED.SCALAR_SINGLE", "The number of computational SSE2 scalar single precision instructions retired."}, {"SIMD_INSTR_RETIRED", "The number of retired SIMD instructions that use MMX registers."}, {"SIMD_INST_RETIRED.ANY", "The number of streaming SIMD instructions retired."}, {"SIMD_INST_RETIRED.PACKED_DOUBLE", "The number of SSE2 packed double precision instructions retired."}, {"SIMD_INST_RETIRED.PACKED_SINGLE", "The number of SSE packed single precision instructions retired."}, {"SIMD_INST_RETIRED.SCALAR_DOUBLE", "The number of SSE2 scalar double precision instructions retired."}, {"SIMD_INST_RETIRED.SCALAR_SINGLE", "The number of SSE scalar single precision instructions retired."}, {"SIMD_INST_RETIRED.VECTOR", "The number of SSE2 vector instructions retired."}, {"SIMD_SAT_INSTR_RETIRED", "The number of saturated arithmetic SIMD instructions retired."}, {"SIMD_SAT_UOP_EXEC.AR", "The number of SIMD saturated arithmetic micro-ops retired."}, {"SIMD_SAT_UOP_EXEC.S", "The number of SIMD saturated arithmetic micro-ops executed."}, {"SIMD_UOPS_EXEC.AR", "The number of SIMD micro-ops retired."}, {"SIMD_UOPS_EXEC.S", "The number of SIMD micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.ARITHMETIC.AR", "The number of SIMD packed arithmetic micro-ops retired."}, {"SIMD_UOP_TYPE_EXEC.ARITHMETIC.S", "The number of SIMD packed arithmetic micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.LOGICAL.AR", "The number of SIMD packed logical micro-ops retired."}, {"SIMD_UOP_TYPE_EXEC.LOGICAL.S", "The number of SIMD packed logical micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.MUL.AR", "The number of SIMD packed multiply micro-ops retired."}, {"SIMD_UOP_TYPE_EXEC.MUL.S", "The number of SIMD packed multiply micro-ops executed."},
{"SIMD_UOP_TYPE_EXEC.PACK.AR", "The number of SIMD pack micro-ops retired."}, {"SIMD_UOP_TYPE_EXEC.PACK.S", "The number of SIMD pack micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.SHIFT.AR", "The number of SIMD packed shift micro-ops retired."}, {"SIMD_UOP_TYPE_EXEC.SHIFT.S", "The number of SIMD packed shift micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.UNPACK.AR", "The number of SIMD unpack micro-ops retired."}, {"SIMD_UOP_TYPE_EXEC.UNPACK.S", "The number of SIMD unpack micro-ops executed."}, {"SNOOP_STALL_DRV", "The number of times the bus stalled for snoops. This event is thread-independent."}, {"SSE_PRE_EXEC.L2", "The number of PREFETCHT1 instructions executed."}, {"SSE_PRE_EXEC.STORES", "The number of times SSE non-temporal store instructions were executed."}, {"SSE_PRE_MISS.L1", "The number of times the PREFETCHT0 instruction executed and missed all cache levels."}, {"SSE_PRE_MISS.L2", "The number of times the PREFETCHT1 instruction executed and missed all cache levels."}, {"SSE_PRE_MISS.NTA", "The number of times the PREFETCHNTA instruction executed and missed all cache levels."}, {"STORE_BLOCK.ORDER", "The number of cycles while a store was waiting for another store to be globally observed."}, {"STORE_BLOCK.SNOOP", "The number of cycles while a store was blocked due to a conflict with an internal or external snoop."}, {"STORE_FORWARDS.GOOD", "The number of times stored data was forwarded directly to a load."}, {"THERMAL_TRIP", "The number of thermal trips."}, {"UOPS_RETIRED.LD_IND_BR", "The number of micro-ops retired that fused a load with another operation."}, {"UOPS_RETIRED.STD_STA", "The number of store address calculations that fused into one micro-op."}, {"UOPS_RETIRED.MACRO_FUSION", "The number of times retired instruction pairs were fused into one micro-op."}, {"UOPS_RETIRED.FUSED", "The number of fused micro-ops retired."}, {"UOPS_RETIRED.NON_FUSED", "The number of non-fused micro-ops retired."}, {"UOPS_RETIRED.ANY", "The number of micro-ops
retired."}, {"X87_COMP_OPS_EXE.ANY.AR", "The number of x87 floating-point computational micro-ops retired."}, {"X87_COMP_OPS_EXE.ANY.S", "The number of x87 floating-point computational micro-ops executed."}, {"X87_OPS_RETIRED.ANY", "The number of floating point computational instructions retired."}, {"X87_OPS_RETIRED.FXCH", "The number of FXCH instructions retired."}, { NULL, NULL } }; papi-papi-7-2-0-t/src/freebsd/map-atom.h000066400000000000000000000162041502707512200177300ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-atom.h * CVS: $Id$ * Author: Harald Servat * redcrash@gmail.com */ #ifndef FreeBSD_MAP_ATOM #define FreeBSD_MAP_ATOM enum NativeEvent_Value_AtomProcessor { PNE_ATOM_BACLEARS = PAPI_NATIVE_MASK, PNE_ATOM_BOGUS_BR, PNE_ATOM_BR_BAC_MISSP_EXEC, PNE_ATOM_BR_CALL_MISSP_EXEC, PNE_ATOM_BR_CALL_EXEC, PNE_ATOM_BR_CND_EXEC, PNE_ATOM_BR_CND_MISSP_EXEC, PNE_ATOM_BR_IND_CALL_EXEC, PNE_ATOM_BR_IND_EXEC, PNE_ATOM_BR_IND_MISSP_EXEC, PNE_ATOM_BR_INST_DECODED, PNE_ATOM_BR_INST_EXEC, PNE_ATOM_BR_INST_RETIRED_ANY, PNE_ATOM_BR_INST_RETIRED_ANY1, PNE_ATOM_BR_INST_RETIRED_MISPRED, PNE_ATOM_BR_INST_RETIRED_MISPRED_NOT_TAKEN, PNE_ATOM_BR_INST_RETIRED_MISPRED_TAKEN, PNE_ATOM_BR_INST_RETIRED_PRED_NOT_TAKEN, PNE_ATOM_BR_INST_RETIRED_PRED_TAKEN, PNE_ATOM_BR_INST_RETIRED_TAKEN, PNE_ATOM_BR_MISSP_EXEC, PNE_ATOM_BR_RET_MISSP_EXEC, PNE_ATOM_BR_RET_BAC_MISSP_EXEC, PNE_ATOM_BR_RET_EXEC, PNE_ATOM_BR_TKN_BUBBLE_1, PNE_ATOM_BR_TKN_BUBBLE_2, PNE_ATOM_BUSQ_EMPTY, PNE_ATOM_BUS_BNR_DRV, PNE_ATOM_BUS_DATA_RCV, PNE_ATOM_BUS_DRDY_CLOCKS, PNE_ATOM_BUS_HIT_DRV, PNE_ATOM_BUS_HITM_DRV, PNE_ATOM_BUS_IO_WAIT, PNE_ATOM_BUS_LOCK_CLOCKS, PNE_ATOM_BUS_REQUEST_OUTSTANDING, PNE_ATOM_BUS_TRANS_P, PNE_ATOM_BUS_TRANS_IFETCH, PNE_ATOM_BUS_TRANS_INVAL, PNE_ATOM_BUS_TRANS_PWR, PNE_ATOM_BUS_TRANS_DEF, PNE_ATOM_BUS_TRANS_BURST, PNE_ATOM_BUS_TRANS_MEM, PNE_ATOM_BUS_TRANS_ANY, PNE_ATOM_BUS_TRANS_BRD, PNE_ATOM_BUS_TRANS_IO, 
PNE_ATOM_BUS_TRANS_RFO, PNE_ATOM_BUS_TRANS_WB, PNE_ATOM_CMP_SNOOP, PNE_ATOM_CPU_CLK_UNHALTED_BUS, PNE_ATOM_CPU_CLK_UNHALTED_CORE_P, PNE_ATOM_CPU_CLK_UNHALTED_NO_OTHER, PNE_ATOM_CYCLES_DIV_BUSY, PNE_ATOM_CYCLES_INT_MASKED_CYCLES_INT_MASKED, PNE_ATOM_CYCLES_INT_MASKED_CYCLES_INT_PENDING_AND_MASKED, PNE_ATOM_CYCLES_L1I_MEM_STALLED, PNE_ATOM_DATA_TLB_MISSES_DTLB_MISS, PNE_ATOM_DATA_TLB_MISSES_DTLB_MISS_LD, PNE_ATOM_DATA_TLB_MISSES_DTLB_MISS_ST, PNE_ATOM_DATA_TLB_MISSES_UTLB_MISS_LD, PNE_ATOM_DELAYED_BYPASS_FP, PNE_ATOM_DELAYED_BYPASS_LOAD, PNE_ATOM_DELAYED_BYPASS_SIMD, PNE_ATOM_DIV, PNE_ATOM_DIV_AR, PNE_ATOM_DIV_S, PNE_ATOM_DTLB_MISSES_ANY, PNE_ATOM_DTLB_MISSES_L0_MISS_LD, PNE_ATOM_DTLB_MISSES_MISS_LD, PNE_ATOM_DTLB_MISSES_MISS_ST, PNE_ATOM_EIST_TRANS, PNE_ATOM_ESP_ADDITIONS, PNE_ATOM_ESP_SYNCH, PNE_ATOM_EXT_SNOOP, PNE_ATOM_FP_ASSIST, PNE_ATOM_FP_ASSIST_AR, PNE_ATOM_FP_COMP_OPS_EXE, PNE_ATOM_FP_MMX_TRANS_TO_FP, PNE_ATOM_FP_MMX_TRANS_TO_MMX, PNE_ATOM_HW_INT_RCV, PNE_ATOM_ICACHE_ACCESSES, PNE_ATOM_ICACHE_MISSES, PNE_ATOM_IDLE_DURING_DIV, PNE_ATOM_ILD_STALL, PNE_ATOM_INST_QUEUE_FULL, PNE_ATOM_INST_RETIRED_ANY_P, PNE_ATOM_INST_RETIRED_LOADS, PNE_ATOM_INST_RETIRED_OTHER, PNE_ATOM_INST_RETIRED_STORES, PNE_ATOM_ITLB_FLUSH, PNE_ATOM_ITLB_LARGE_MISS, PNE_ATOM_ITLB_MISSES, PNE_ATOM_ITLB_SMALL_MISS, PNE_ATOM_ITLB_MISS_RETIRED, PNE_ATOM_L1D_ALL_REF, PNE_ATOM_L1D_ALL_CACHE_REF, PNE_ATOM_L1D_CACHE_LOCK, PNE_ATOM_L1D_CACHE_LOCK_DURATION, PNE_ATOM_L1D_CACHE_LD, PNE_ATOM_L1D_CACHE_ST, PNE_ATOM_L1D_M_EVICT, PNE_ATOM_L1D_M_REPL, PNE_ATOM_L1D_PEND_MISS, PNE_ATOM_L1D_PREFETCH_REQUESTS, PNE_ATOM_L1D_REPL, PNE_ATOM_L1D_SPLIT_LOADS, PNE_ATOM_L1D_SPLIT_STORES, PNE_ATOM_L1I_MISSES, PNE_ATOM_L1I_READS, PNE_ATOM_L2_ADS, PNE_ATOM_L2_DBUS_BUSY_RD, PNE_ATOM_L2_IFETCH, PNE_ATOM_L2_LD, PNE_ATOM_L2_LINES_IN, PNE_ATOM_L2_LINES_OUT, PNE_ATOM_L2_LOCK, PNE_ATOM_L2_M_LINES_IN, PNE_ATOM_L2_M_LINES_OUT, PNE_ATOM_L2_NO_REQ, PNE_ATOM_L2_REJECT_BUSQ, PNE_ATOM_L2_RQSTS, PNE_ATOM_L2_RQSTS_SELF_DEMAND_I_STATE, 
PNE_ATOM_L2_RQSTS_SELF_DEMAND_MESI, PNE_ATOM_L2_ST, PNE_ATOM_LOAD_BLOCK_L1D, PNE_ATOM_LOAD_BLOCK_OVERLAP_STORE, PNE_ATOM_LOAD_BLOCK_STA, PNE_ATOM_LOAD_BLOCK_STD, PNE_ATOM_LOAD_BLOCK_UNTIL_RETIRE, PNE_ATOM_LOAD_HIT_PRE, PNE_ATOM_MACHINE_CLEARS_SMC, PNE_ATOM_MACHINE_NUKES_MEM_ORDER, PNE_ATOM_MACRO_INSTS_ALL_DECODED, PNE_ATOM_MACRO_INSTS_CISC_DECODED, PNE_ATOM_MEMORY_DISAMBIGUATION_RESET, PNE_ATOM_MEMORY_DISAMBIGUATION_SUCCESS, PNE_ATOM_MEM_LOAD_RETIRED_DTLB_MISS, PNE_ATOM_MEM_LOAD_RETIRED_L2_MISS, PNE_ATOM_MEM_LOAD_RETIRED_L2_HIT, PNE_ATOM_MEM_LOAD_RETIRED_L2_LINE_MISS, PNE_ATOM_MUL, PNE_ATOM_MUL_AR, PNE_ATOM_MUL_S, PNE_ATOM_PAGE_WALKS_WALKS, PNE_ATOM_PAGE_WALKS_CYCLES, PNE_ATOM_PREF_RQSTS_DN, PNE_ATOM_PREF_RQSTS_UP, PNE_ATOM_PREFETCH_PREFETCHNTA, PNE_ATOM_PREFETCH_PREFETCHT0, PNE_ATOM_PREFETCH_SW_L2, PNE_ATOM_RAT_STALLS_ANY, PNE_ATOM_RAT_STALLS_FLAGS, PNE_ATOM_RAT_STALLS_FPSW, PNE_ATOM_RAT_STALLS_PARTIAL_CYCLES, PNE_ATOM_RAT_STALLS_ROB_READ_PORT, PNE_ATOM_RESOURCE_STALLS_ANY, PNE_ATOM_RESOURCE_STALLS_BR_MISS_CLEAR, PNE_ATOM_RESOURCE_STALLS_FPCW, PNE_ATOM_RESOURCE_STALLS_LD_ST, PNE_ATOM_RESOURCE_STALLS_ROB_FULL, PNE_ATOM_RESOURCE_STALLS_RS_FULL, PNE_ATOM_RS_UOPS_DISPATCHED, PNE_ATOM_RS_UOPS_DISPATCHED_PORT0, PNE_ATOM_RS_UOPS_DISPATCHED_PORT1, PNE_ATOM_RS_UOPS_DISPATCHED_PORT2, PNE_ATOM_RS_UOPS_DISPATCHED_PORT3, PNE_ATOM_RS_UOPS_DISPATCHED_PORT4, PNE_ATOM_RS_UOPS_DISPATCHED_PORT5, PNE_ATOM_SB_DRAIN_CYCLES, PNE_ATOM_SEGMENT_REG_LOADS_ANY, PNE_ATOM_SEG_REG_RENAMES_ANY, PNE_ATOM_SEG_REG_RENAMES_DS, PNE_ATOM_SEG_REG_RENAMES_ES, PNE_ATOM_SEG_REG_RENAMES_FS, PNE_ATOM_SEG_REG_RENAMES_GS, PNE_ATOM_SEG_RENAME_STALLS_ANY, PNE_ATOM_SEG_RENAME_STALLS_DS, PNE_ATOM_SEG_RENAME_STALLS_ES, PNE_ATOM_SEG_RENAME_STALLS_FS, PNE_ATOM_SEG_RENAME_STALLS_GS, PNE_ATOM_SIMD_ASSIST, PNE_ATOM_SIMD_COMP_INST_RETIRED_PACKED_DOUBLE, PNE_ATOM_SIMD_COMP_INST_RETIRED_PACKED_SINGLE, PNE_ATOM_SIMD_COMP_INST_RETIRED_SCALAR_DOUBLE, PNE_ATOM_SIMD_COMP_INST_RETIRED_SCALAR_SINGLE, PNE_ATOM_SIMD_INSTR_RETIRED, 
PNE_ATOM_SIMD_INST_RETIRED_ANY, PNE_ATOM_SIMD_INST_RETIRED_PACKED_DOUBLE, PNE_ATOM_SIMD_INST_RETIRED_PACKED_SINGLE, PNE_ATOM_SIMD_INST_RETIRED_SCALAR_DOUBLE, PNE_ATOM_SIMD_INST_RETIRED_SCALAR_SINGLE, PNE_ATOM_SIMD_INST_RETIRED_VECTOR, PNE_ATOM_SIMD_SAT_INSTR_RETIRED, PNE_ATOM_SIMD_SAT_UOP_EXEC_AR, PNE_ATOM_SIMD_SAT_UOP_EXEC_S, PNE_ATOM_SIMD_UOPS_EXEC_AR, PNE_ATOM_SIMD_UOPS_EXEC_S, PNE_ATOM_SIMD_UOP_TYPE_EXEC_ARITHMETIC_AR, PNE_ATOM_SIMD_UOP_TYPE_EXEC_ARITHMETIC_S, PNE_ATOM_SIMD_UOP_TYPE_EXEC_LOGICAL_AR, PNE_ATOM_SIMD_UOP_TYPE_EXEC_LOGICAL_S, PNE_ATOM_SIMD_UOP_TYPE_EXEC_MUL_AR, PNE_ATOM_SIMD_UOP_TYPE_EXEC_MUL_S, PNE_ATOM_SIMD_UOP_TYPE_EXEC_PACK_AR, PNE_ATOM_SIMD_UOP_TYPE_EXEC_PACK_S, PNE_ATOM_SIMD_UOP_TYPE_EXEC_SHIFT_AR, PNE_ATOM_SIMD_UOP_TYPE_EXEC_SHIFT_S, PNE_ATOM_SIMD_UOP_TYPE_EXEC_UNPACK_AR, PNE_ATOM_SIMD_UOP_TYPE_EXEC_UNPACK_S, PNE_ATOM_SNOOP_STALL_DRV, PNE_ATOM_SSE_PRE_EXEC_L2, PNE_ATOM_SSE_PRE_EXEC_STORES, PNE_ATOM_SSE_PRE_MISS_L1, PNE_ATOM_SSE_PRE_MISS_L2, PNE_ATOM_SSE_PRE_MISS_NTA, PNE_ATOM_STORE_BLOCK_ORDER, PNE_ATOM_STORE_BLOCK_SNOOP, PNE_ATOM_STORE_FORWARDS_GOOD, PNE_ATOM_THERMAL_TRIP, PNE_ATOM_UOPS_RETIRED_LD_IND_BR, PNE_ATOM_UOPS_RETIRED_STD_STA, PNE_ATOM_UOPS_RETIRED_MACRO_FUSION, PNE_ATOM_UOPS_RETIRED_FUSED, PNE_ATOM_UOPS_RETIRED_NON_FUSED, PNE_ATOM_UOPS_RETIRED_ANY, PNE_ATOM_X87_COMP_OPS_EXE_ANY_AR, PNE_ATOM_X87_COMP_OPS_EXE_ANY_S, PNE_ATOM_X87_OPS_RETIRED_ANY, PNE_ATOM_X87_OPS_RETIRED_FXCH, PNE_ATOM_NATNAME_GUARD }; extern Native_Event_LabelDescription_t AtomProcessor_info[]; extern hwi_search_t AtomProcessor_map[]; #endif papi-papi-7-2-0-t/src/freebsd/map-core.c000066400000000000000000000277511502707512200177240ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-core.c * Author: Harald Servat * redcrash@gmail.com */ #include "freebsd.h" #include "papiStdEventDefs.h" #include "map.h" /**************************************************************************** CORE 
SUBSTRATE CORE SUBSTRATE CORE SUBSTRATE CORE SUBSTRATE CORE SUBSTRATE ****************************************************************************/ /* NativeEvent_Value_CoreProcessor must match CoreProcessor_info */ Native_Event_LabelDescription_t CoreProcessor_info[] = { {"BAClears", "The number of BAClear conditions asserted."}, {"BTB_Misses", "The number of branches for which the branch table buffer did not produce a prediction."}, {"Br_BAC_Missp_Exec", "The number of branch instructions executed that were mispredicted at the front end."}, {"Br_Bogus", "The number of bogus branches."}, {"Br_Call_Exec", "The number of CALL instructions executed."}, {"Br_Call_Missp_Exec", "The number of CALL instructions executed that were mispredicted."}, {"Br_Cnd_Exec", "The number of conditional branch instructions executed."}, {"Br_Cnd_Missp_Exec", "The number of conditional branch instructions executed that were mispredicted."}, {"Br_Ind_Call_Exec", "The number of indirect CALL instructions executed."}, {"Br_Ind_Exec", "The number of indirect branches executed."}, {"Br_Ind_Missp_Exec", "The number of indirect branch instructions executed that were mispredicted."}, {"Br_Inst_Exec", "The number of branch instructions executed including speculative branches."}, {"Br_Instr_Decoded", "The number of branch instructions decoded."}, {"Br_Instr_Ret", "The number of branch instructions retired. This is an architectural performance event."}, {"Br_MisPred_Ret", "The number of mispredicted branch instructions retired. 
This is an architectural performance event."}, {"Br_MisPred_Taken_Ret", "The number of taken and mispredicted branches retired."}, {"Br_Missp_Exec", "The number of branch instructions executed and mispredicted at execution including branches that were not predicted."}, {"Br_Ret_BAC_Missp_Exec", "The number of return branch instructions that were mispredicted at the front end."}, {"Br_Ret_Exec", "The number of return branch instructions executed."}, {"Br_Ret_Missp_Exec", "The number of return branch instructions executed that were mispredicted."}, {"Br_Taken_Ret", "The number of taken branches retired."}, {"Bus_BNR_Clocks", "The number of external bus cycles while BNR was asserted."}, {"Bus_DRDY_Clocks", "The number of external bus cycles while DRDY was asserted."}, {"Bus_Data_Rcv", "The number of cycles during which the processor is busy receiving data."}, {"Bus_Locks_Clocks", "The number of external bus cycles while the bus lock signal was asserted."}, {"Bus_Not_In_Use", "The number of cycles when there is no transaction from the core."}, {"Bus_Req_Outstanding", "The weighted cycles of cacheable bus data read requests from the data cache unit or hardware prefetcher."}, {"Bus_Snoop_Stall", "The number of bus cycles while a bus snoop is stalled."}, {"Bus_Snoops", "The number of snoop responses to bus transactions."}, {"Bus_Trans_Any", "The number of completed bus transactions."}, {"Bus_Trans_Brd", "The number of read bus transactions."}, {"Bus_Trans_Burst", "The number of completed burst transactions.
Retried transactions may be counted more than once."}, {"Bus_Trans_Def", "The number of completed deferred transactions."}, {"Bus_Trans_IO", "The number of completed I/O transactions counting both reads and writes."}, {"Bus_Trans_Ifetch", "The number of completed instruction fetch transactions."}, {"Bus_Trans_Inval", "The number of completed invalidate transactions."}, {"Bus_Trans_Mem", "The number of completed memory transactions."}, {"Bus_Trans_P", "The number of completed partial transactions."}, {"Bus_Trans_Pwr", "The number of completed partial write transactions."}, {"Bus_Trans_RFO", "The number of completed read-for-ownership transactions."}, {"Bus_Trans_WB", "The number of completed writeback transactions from the data cache unit, excluding L2 writebacks."}, {"Cycles_Div_Busy", "The number of cycles the divider is busy. The event is only available on PMC0."}, {"Cycles_Int_Masked", "The number of cycles while interrupts were disabled."}, {"Cycles_Int_Pending_Masked", "The number of cycles while interrupts were disabled and interrupts were pending."}, {"DCU_Snoop_To_Share", "The number of data cache unit snoops to L1 cache lines in the shared state."}, {"DCache_Cache_Lock", "The number of cacheable locked read operations to invalid state."}, {"DCache_Cache_LD", "The number of cacheable L1 data read operations."}, {"DCache_Cache_ST", "The number of cacheable L1 data write operations."}, {"DCache_M_Evict", "The number of M state data cache lines that were evicted."}, {"DCache_M_Repl", "The number of M state data cache lines that were allocated."}, {"DCache_Pend_Miss", "The weighted cycles an L1 miss was outstanding."}, {"DCache_Repl", "The number of data cache line replacements."}, {"Data_Mem_Cache_Ref", "The number of cacheable read and write operations to L1 data cache."}, {"Data_Mem_Ref", "The number of L1 data reads and writes, both cacheable and uncacheable."}, {"Dbus_Busy", "The number of core cycles during which the data bus was busy."}, {"Dbus_Busy_Rd", "The number of
cycles during which the data bus was busy transferring data to a core."}, {"Div", "The number of divide operations including speculative operations for integer and floating point divides. This event can only be counted on PMC1."}, {"Dtlb_Miss", "The number of data references that missed the TLB."}, {"ESP_Uops", "The number of ESP folding instructions decoded."}, {"EST_Trans", "The number of Intel Enhanced SpeedStep transitions."}, {"FP_Assist", "The number of floating point operations that required microcode assists. The event is only available on PMC1."}, {"FP_Comp_Instr_Ret", "The number of X87 floating point compute instructions retired. The event is only available on PMC0."}, {"FP_Comps_Op_Exe", "The number of floating point computational instructions executed."}, {"FP_MMX_Trans", "The number of transitions from X87 to MMX."}, {"Fused_Ld_Uops_Ret", "The number of fused load uops retired."}, {"Fused_St_Uops_Ret", "The number of fused store uops retired."}, {"Fused_Uops_Ret", "The number of fused uops retired."}, {"HW_Int_Rx", "The number of hardware interrupts received."}, {"ICache_Misses", "The number of instruction fetch misses in the instruction cache and streaming buffers."}, {"ICache_Reads", "The number of instruction fetches from the instruction cache and streaming buffers counting both cacheable and uncacheable fetches."}, {"IFU_Mem_Stall", "The number of cycles the instruction fetch unit was stalled while waiting for data from memory."}, {"ILD_Stall", "The number of instruction length decoder stalls."}, {"ITLB_Misses", "The number of instruction TLB misses."}, {"Instr_Decoded", "The number of instructions decoded."}, {"Instr_Ret", "The number of instructions retired.
This is an architectural performance event."}, {"L1_Pref_Req", "The number of L1 prefetch requests due to data cache misses."}, {"L2_ADS", "The number of L2 address strobes."}, {"L2_IFetch", "The number of instruction fetches by the instruction fetch unit from L2 cache including speculative fetches."}, {"L2_LD", "The number of L2 cache reads."}, {"L2_Lines_In", "The number of L2 cache lines allocated."}, {"L2_Lines_Out", "The number of L2 cache lines evicted."}, {"L2_M_Lines_In", "The number of L2 M state cache lines allocated."}, {"L2_M_Lines_Out", "The number of L2 M state cache lines evicted."}, {"L2_No_Request_Cycles", "The number of cycles there was no request to access L2 cache."}, {"L2_Reject_Cycles", "The number of cycles the L2 cache was busy and rejecting new requests."}, {"L2_Rqsts", "The number of L2 cache requests."}, {"L2_ST", "The number of L2 cache writes including speculative writes."}, {"LD_Blocks", "The number of load operations delayed due to store buffer blocks."}, {"LLC_Misses", "The number of cache misses for references to the last level cache, excluding misses due to hardware prefetches. This is an architectural performance event."}, {"LLC_Reference", "The number of references to the last level cache, excluding those due to hardware prefetches. This is an architectural performance event."}, {"MMX_Assist", "The number of EMMX instructions executed."}, {"MMX_FP_Trans", "The number of transitions from MMX to X87."}, {"MMX_Instr_Exec", "The number of MMX instructions executed excluding MOVQ and MOVD stores."}, {"MMX_Instr_Ret", "The number of MMX instructions retired."}, {"Misalign_Mem_Ref", "The number of misaligned data memory references, counting loads and stores."}, {"Mul", "The number of multiply operations including speculative floating point and integer multiplies. This event is available on PMC1 only."}, {"NonHlt_Ref_Cycles", "The number of non-halted bus cycles.
This is an architectural performance event."}, {"Pref_Rqsts_Dn", "The number of hardware prefetch requests issued in backward streams."}, {"Pref_Rqsts_Up", "The number of hardware prefetch requests issued in forward streams."}, {"Resource_Stall", "The number of cycles where there is a resource related stall."}, {"SD_Drains", "The number of cycles while draining store buffers."}, {"SIMD_FP_DP_P_Ret", "The number of SSE/SSE2 packed double precision instructions retired."}, {"SIMD_FP_DP_P_Comp_Ret", "The number of SSE/SSE2 packed double precision compute instructions retired."}, {"SIMD_FP_DP_S_Ret", "The number of SSE/SSE2 scalar double precision instructions retired."}, {"SIMD_FP_DP_S_Comp_Ret", "The number of SSE/SSE2 scalar double precision compute instructions retired."}, {"SIMD_FP_SP_P_Comp_Ret", "The number of SSE/SSE2 packed single precision compute instructions retired."}, {"SIMD_FP_SP_Ret", "The number of SSE/SSE2 single precision instructions retired, both packed and scalar."}, {"SIMD_FP_SP_S_Ret", "The number of SSE/SSE2 scalar single precision instructions retired."}, {"SIMD_FP_SP_S_Comp_Ret", "The number of SSE/SSE2 scalar single precision compute instructions retired."}, {"SIMD_Int_128_Ret", "The number of SSE2 128-bit integer instructions retired."}, {"SIMD_Int_Pari_Exec", "The number of SIMD integer packed arithmetic instructions executed."}, {"SIMD_Int_Pck_Exec", "The number of SIMD integer pack instructions executed."}, {"SIMD_Int_Plog_Exec", "The number of SIMD integer packed logical instructions executed."}, {"SIMD_Int_Pmul_Exec", "The number of SIMD integer packed multiply instructions executed."}, {"SIMD_Int_Psft_Exec", "The number of SIMD integer packed shift instructions executed."}, {"SIMD_Int_Sat_Exec", "The number of SIMD integer saturating instructions executed."}, {"SIMD_Int_Upck_Exec", "The number of SIMD integer unpack instructions executed."}, {"SMC_Detected", "The number of times self-modifying code was detected."},
{"SSE_NTStores_Miss", "The number of times an SSE streaming store instruction missed all caches."}, {"SSE_NTStores_Ret", "The number of SSE streaming store instructions executed."}, {"SSE_PrefNta_Miss", "The number of times PREFETCHNTA missed all caches."}, {"SSE_PrefNta_Ret", "The number of PREFETCHNTA instructions retired."}, {"SSE_PrefT1_Miss", "The number of times PREFETCHT1 missed all caches."}, {"SSE_PrefT1_Ret", "The number of PREFETCHT1 instructions retired."}, {"SSE_PrefT2_Miss", "The number of times PREFETCHT2 missed all caches."}, {"SSE_PrefT2_Ret", "The number of PREFETCHT2 instructions retired."}, {"Seg_Reg_Loads", "The number of segment register loads."}, {"Serial_Execution_Cycles", "The number of non-halted bus cycles of this code while the other core was halted."}, {"Thermal_Trip", "The duration in a thermal trip based on the current core clock."}, {"Unfusion", "The number of unfusion events."}, {"Unhalted_Core_Cycles", "The number of core clock cycles when the clock signal on a specific core is not halted.
This is an architectural performance event."}, {"Uops_Ret", "The number of micro-ops retired."}, { NULL, NULL } }; papi-papi-7-2-0-t/src/freebsd/map-core.h000066400000000000000000000074001502707512200177160ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-core.h * CVS: $Id$ * Author: Harald Servat * redcrash@gmail.com */ #ifndef FreeBSD_MAP_CORE #define FreeBSD_MAP_CORE enum NativeEvent_Value_CoreProcessor { PNE_CORE_BACLEARS = PAPI_NATIVE_MASK, PNE_CORE_BTB_MISSES, PNE_CORE_BR_BAC_MISSP_EXEC, PNE_CORE_BR_BOGUS, PNE_CORE_BR_CALL_EXEC, PNE_CORE_BR_CALL_MISSP_EXEC, PNE_CORE_BR_CND_EXEC, PNE_CORE_BR_CND_MISSP_EXEC, PNE_CORE_BR_IND_CALL_EXEC, PNE_CORE_BR_IND_EXEC, PNE_CORE_BR_IND_MISSP_EXEC, PNE_CORE_BR_INST_EXEC, PNE_CORE_BR_INSTR_DECODED, PNE_CORE_BR_INSTR_RET, PNE_CORE_BR_MISPRED_RET, PNE_CORE_BR_MISPRED_TAKEN_RET, PNE_CORE_BR_MISSP_EXEC, PNE_CORE_BR_RET_BAC_MISSP_EXEC, PNE_CORE_BR_RET_EXEC, PNE_CORE_BR_RET_MISSP_EXEC, PNE_CORE_BR_TAKEN_RET, PNE_CORE_BUS_BNR_CLOCKS, PNE_CORE_BUS_DRDY_CLOCKS, PNE_CORE_BUS_DATA_RCV, PNE_CORE_BUS_LOCKS_CLOCKS, PNE_CORE_BUS_NOT_IN_USE, PNE_CORE_BUS_REQ_OUTSTANDING, PNE_CORE_BUS_SNOOP_STALL, PNE_CORE_BUS_SNOOPS, PNE_CORE_BUS_TRANS_ANY, PNE_CORE_BUS_TRANS_BRD, PNE_CORE_BUS_TRANS_BURST, PNE_CORE_BUS_TRANS_DEF, PNE_CORE_BUS_TRANS_IO, PNE_CORE_BUS_TRANS_IFETCH, PNE_CORE_BUS_TRANS_INVAL, PNE_CORE_BUS_TRANS_MEM, PNE_CORE_BUS_TRANS_P, PNE_CORE_BUS_TRANS_PWR, PNE_CORE_BUS_TRANS_RFO, PNE_CORE_BUS_TRANS_WB, PNE_CORE_CYCLES_DIV_BUSY, PNE_CORE_CYCLES_INT_MASKED, PNE_CORE_CYCLES_INT_PENDING_MASKED, PNE_CORE_DCU_SNOOP_TO_SHARE, PNE_CORE_DCACHE_CACHE_LOCK, PNE_CORE_DCACHE_CACHE_LD, PNE_CORE_DCACHE_CACHE_ST, PNE_CORE_DCACHE_M_EVICT, PNE_CORE_DCACHE_M_REPL, PNE_CORE_DCACHE_PEND_MISS, PNE_CORE_DCACHE_REPL, PNE_CORE_DATA_MEM_CACHE_REF, PNE_CORE_DATA_MEM_REF, PNE_CORE_DBUS_BUSY, PNE_CORE_DBUS_BUSY_RD, PNE_CORE_DIV, PNE_CORE_DTLB_MISS, PNE_CORE_ESP_UOPS, 
PNE_CORE_EST_TRANS, PNE_CORE_FP_ASSIST, PNE_CORE_FP_COMP_INSTR_RET, PNE_CORE_FP_COMPS_OP_EXE, PNE_CORE_FP_MMX_TRANS, PNE_CORE_FUSED_LD_UOPS_RET, PNE_CORE_FUSED_ST_UOPS_RET, PNE_CORE_FUSED_UOPS_RET, PNE_CORE_HW_INT_RX, PNE_CORE_ICACHE_MISSES, PNE_CORE_ICACHE_READS, PNE_CORE_IFU_MEM_STALL, PNE_CORE_ILD_STALL, PNE_CORE_ITLB_MISSES, PNE_CORE_INSTR_DECODED, PNE_CORE_INSTR_RET, PNE_CORE_L1_PREF_REQ, PNE_CORE_L2_ADS, PNE_CORE_L2_IFETCH, PNE_CORE_L2_LD, PNE_CORE_L2_LINES_IN, PNE_CORE_L2_LINES_OUT, PNE_CORE_L2_M_LINES_IN, PNE_CORE_L2_M_LINES_OUT, PNE_CORE_L2_NO_REQUEST_CYCLES, PNE_CORE_L2_REJECT_CYCLES, PNE_CORE_L2_RQSTS, PNE_CORE_L2_ST, PNE_CORE_LD_BLOCKS, PNE_CORE_LLC_MISSES, PNE_CORE_LLC_REFERENCE, PNE_CORE_MMX_ASSIST, PNE_CORE_MMX_FP_TRANS, PNE_CORE_MMX_INSTR_EXEC, PNE_CORE_MMX_INSTR_RET, PNE_CORE_MISALIGN_MEM_REF, PNE_CORE_MUL, PNE_CORE_NONHLT_REF_CYCLES, PNE_CORE_PREF_RQSTS_DN, PNE_CORE_PREF_RQSTS_UP, PNE_CORE_RESOURCE_STALL, PNE_CORE_SD_DRAINS, PNE_CORE_SIMD_FP_DP_P_RET, PNE_CORE_SIMD_FP_DP_P_COMP_RET, PNE_CORE_SIMD_FP_DP_S_RET, PNE_CORE_SIMD_FP_DP_S_COMP_RET, PNE_CORE_SIMD_FP_SP_P_COMP_RET, PNE_CORE_SIMD_FP_SP_RET, PNE_CORE_SIMD_FP_SP_S_RET, PNE_CORE_SIMD_FP_SP_S_COMP_RET, PNE_CORE_SIMD_INT_128_RET, PNE_CORE_SIMD_INT_PARI_EXEC, PNE_CORE_SIMD_INT_PCK_EXEC, PNE_CORE_SIMD_INT_PLOG_EXEC, PNE_CORE_SIMD_INT_PMUL_EXEC, PNE_CORE_SIMD_INT_PSFT_EXEC, PNE_CORE_SIMD_INT_SAT_EXEC, PNE_CORE_SIMD_INT_UPCK_EXEC, PNE_CORE_SMC_DETECTED, PNE_CORE_SSE_NTSTORES_MISS, PNE_CORE_SSE_NTSTORES_RET, PNE_CORE_SSE_PREFNTA_MISS, PNE_CORE_SSE_PREFNTA_RET, PNE_CORE_SSE_PREFT1_MISS, PNE_CORE_SSE_PREFT1_RET, PNE_CORE_SSE_PREFT2_MISS, PNE_CORE_SSE_PREFT2_RET, PNE_CORE_SEG_REG_LOADS, PNE_CORE_SERIAL_EXECUTION_CYCLES, PNE_CORE_THERMAL_TRIP, PNE_CORE_UNFUSION, PNE_CORE_UNHALTED_CORE_CYCLES, PNE_CORE_UOPS_RET, PNE_CORE_NATNAME_GUARD }; extern Native_Event_LabelDescription_t CoreProcessor_info[]; extern hwi_search_t CoreProcessor_map[]; #endif 
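The `map-*.h` enums and the `*_info` tables in `map-*.c` are kept in lockstep: each `PNE_*` enumerator starts at `PAPI_NATIVE_MASK` and indexes the matching row of the description table, which is terminated by a `{ NULL, NULL }` sentinel. A minimal standalone sketch of that lookup convention follows; `Demo_info`, `demo_event_description`, and the `DEMO_NATIVE_MASK` value are illustrative stand-ins mirroring the `Native_Event_LabelDescription_t` layout, not the real PAPI definitions from `freebsd.h`/`papiStdEventDefs.h`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for PAPI_NATIVE_MASK (illustrative value). */
#define DEMO_NATIVE_MASK 0x40000000

/* Mirrors the Native_Event_LabelDescription_t layout used by the tables. */
typedef struct {
	const char *name;        /* native event name, e.g. "BACLEARS" */
	const char *description; /* human-readable description */
} Demo_Event_LabelDescription_t;

/* A two-entry toy table in the same style as CoreProcessor_info:
 * rows ordered to match the enum, terminated by { NULL, NULL }. */
static const Demo_Event_LabelDescription_t Demo_info[] = {
	{"BACLEARS", "The number of BAClear conditions asserted."},
	{"BOGUS_BR", "The number of bogus branches."},
	{NULL, NULL}
};

/* The table index for a native event code is (code - DEMO_NATIVE_MASK),
 * because the enum's first member is assigned the mask value. */
static const char *demo_event_description(int code)
{
	int idx = code - DEMO_NATIVE_MASK;
	int n = 0;

	while (Demo_info[n].name != NULL)
		n++;
	if (idx < 0 || idx >= n)
		return NULL;
	return Demo_info[idx].description;
}
```

The `{ NULL, NULL }` sentinel lets callers walk a table of unknown length, which is why every `*_info` table in this directory ends with that row.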
papi-papi-7-2-0-t/src/freebsd/map-core2-extreme.c000066400000000000000000000503411502707512200214440ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-core2-extreme.c * Author: George Neville-Neil * gnn@freebsd.org * Harald Servat * redcrash@gmail.com */ #include "freebsd.h" #include "papiStdEventDefs.h" #include "map.h" /**************************************************************************** CORE2_EXTREME SUBSTRATE CORE2_EXTREME SUBSTRATE CORE2_EXTREME SUBSTRATE CORE2_EXTREME SUBSTRATE CORE2_EXTREME SUBSTRATE ****************************************************************************/ /* NativeEvent_Value_Core2ExtremeProcessor must match Core2ExtremeProcessor_info */ Native_Event_LabelDescription_t Core2ExtremeProcessor_info[] = { {"BACLEARS", "The number of times the front end is resteered."}, {"BOGUS_BR", "The number of byte sequences mistakenly detected as taken branch instructions."}, {"BR_BAC_MISSP_EXEC", "The number of branch instructions that were mispredicted when decoded."}, {"BR_CALL_MISSP_EXEC", "The number of mispredicted CALL instructions that were executed."}, {"BR_CALL_EXEC", "The number of CALL instructions executed."}, {"BR_CND_EXEC", "The number of conditional branches executed, but not necessarily retired."}, {"BR_CND_MISSP_EXEC", "The number of mispredicted conditional branches executed."}, {"BR_IND_CALL_EXEC", "The number of indirect CALL instructions executed."}, {"BR_IND_EXEC", "The number of indirect branch instructions executed."}, {"BR_IND_MISSP_EXEC", "The number of mispredicted indirect branch instructions executed."}, {"BR_INST_DECODED", "The number of branch instructions decoded."}, {"BR_INST_EXEC", "The number of branches executed, but not necessarily retired."}, {"BR_INST_RETIRED.ANY", "The number of branch instructions retired. 
This is an architectural performance event."}, {"BR_INST_RETIRED.MISPRED", "The number of mispredicted branch instructions retired. This is an architectural performance event."}, {"BR_INST_RETIRED.MISPRED_NOT_TAKEN", "The number of not taken branch instructions retired that were mispredicted."}, {"BR_INST_RETIRED.MISPRED_TAKEN", "The number of taken branch instructions retired that were mispredicted."}, {"BR_INST_RETIRED.PRED_NOT_TAKEN", "The number of not taken branch instructions retired that were correctly predicted."}, {"BR_INST_RETIRED.PRED_TAKEN", "The number of taken branch instructions retired that were correctly predicted."}, {"BR_INST_RETIRED.TAKEN", "The number of taken branch instructions retired."}, {"BR_MISSP_EXEC", "The number of mispredicted branch instructions that were executed."}, {"BR_RET_MISSP_EXEC", "The number of mispredicted RET instructions executed."}, {"BR_RET_BAC_MISSP_EXEC", "The number of RET instructions executed that were mispredicted at decode time."}, {"BR_RET_EXEC", "The number of RET instructions executed."}, {"BR_TKN_BUBBLE_1", "The number of branches predicted taken with bubble 1."}, {"BR_TKN_BUBBLE_2", "The number of branches predicted taken with bubble 2."}, {"BUSQ_EMPTY", "The number of cycles during which the core did not have any pending transactions in the bus queue."}, {"BUS_BNR_DRV", "The number of Bus Not Ready signals asserted on the bus."}, {"BUS_DATA_RCV", "The number of bus cycles during which the processor is receiving data."}, {"BUS_DRDY_CLOCKS", "The number of bus cycles during which the Data Ready signal is asserted on the bus."}, {"BUS_HIT_DRV", "The number of bus cycles during which the processor drives the HIT# pin."}, {"BUS_HITM_DRV", "The number of bus cycles during which the processor drives the HITM# pin."}, {"BUS_IO_WAIT", "The number of core cycles during which I/O requests wait in the bus queue."}, {"BUS_LOCK_CLOCKS", "The number of bus cycles during which the LOCK signal was asserted on the bus."},
{"BUS_REQUEST_OUTSTANDING", "The number of pending full cache line read transactions on the bus occurring in each cycle."}, {"BUS_TRANS_ANY", "The number of bus transactions of any kind."}, {"BUS_TRANS_BRD", "The number of burst read transactions."}, {"BUS_TRANS_BURST", "The number of burst transactions."}, {"BUS_TRANS_DEF", "The number of deferred bus transactions."}, {"BUS_TRANS_IFETCH", "The number of instruction fetch full cache line bus transactions."}, {"BUS_TRANS_INVAL", "The number of invalidate bus transactions."}, {"BUS_TRANS_IO", "The number of completed I/O bus transactions due to IN and OUT instructions."}, {"BUS_TRANS_MEM", "The number of memory bus transactions."}, {"BUS_TRANS_P", "The number of partial bus transactions."}, {"BUS_TRANS_PWR", "The number of partial write bus transactions."}, {"BUS_TRANS_RFO", "The number of Read For Ownership bus transactions."}, {"BUS_TRANS_WB", "The number of explicit writeback bus transactions due to dirty line evictions."}, {"CMP_SNOOP", "The number of times the L1 data cache is snooped by the other core in the same processor."}, {"CPU_CLK_UNHALTED.BUS", "The number of bus cycles when the core is not in the halt state. This is an architectural performance event."}, {"CPU_CLK_UNHALTED.CORE_P", "The number of core cycles while the core is not in a halt state. This is an architectural performance event."}, {"CPU_CLK_UNHALTED.NO_OTHER", "The number of bus cycles during which the core remains unhalted and the other core is halted."}, {"CYCLES_DIV_BUSY", "The number of cycles the divider is busy. 
This event is only available on PMC0."}, {"CYCLES_INT_MASKED", "The number of cycles during which interrupts are disabled."}, {"CYCLES_INT_PENDING_AND_MASKED", "The number of cycles during which there were pending interrupts while interrupts were disabled."}, {"CYCLES_L1I_MEM_STALLED", "The number of cycles for which an instruction fetch stalls."}, {"DELAYED_BYPASS.FP", "The number of floating point operations that used data immediately after the data was generated by a non floating point execution unit."}, {"DELAYED_BYPASS.LOAD", "The number of delayed bypass penalty cycles that a load operation incurred."}, {"DELAYED_BYPASS.SIMD", "The number of times SIMD operations use data immediately after the data was generated by a non-SIMD execution unit."}, {"DIV", "The number of divide operations executed."}, {"DTLB_MISSES.ANY", "The number of Data TLB misses, including misses that result from speculative accesses."}, {"DTLB_MISSES.L0_MISS_LD", "The number of level 0 DTLB misses due to load operations."}, {"DTLB_MISSES.MISS_LD", "The number of Data TLB misses due to load operations."}, {"DTLB_MISSES.MISS_ST", "The number of Data TLB misses due to store operations."}, {"EIST_TRANS", "The number of Enhanced Intel SpeedStep Technology transitions."}, {"ESP.ADDITIONS", "The number of automatic additions to the esp register."}, {"ESP.SYNCH", "The number of times the esp register was explicitly used in an address expression after it is implicitly used by a PUSH or POP instruction."}, {"EXT_SNOOP", "The number of snoop responses to bus transactions."}, {"FP_ASSIST", "The number of floating point operations executed that needed a microcode assist."}, {"FP_COMP_OPS_EXE", "The number of floating point computational micro-ops executed. 
The event is available only on PMC0."}, {"FP_MMX_TRANS_TO_FP", "The number of transitions from MMX instructions to floating point instructions."}, {"FP_MMX_TRANS_TO_MMX", "The number of transitions from floating point instructions to MMX instructions."}, {"HW_INT_RCV", "The number of hardware interrupts received."}, {"IDLE_DURING_DIV", "The number of cycles the divider is busy and no other execution unit or load operation was in progress. This event is available only on PMC0."}, {"ILD_STALL", "The number of cycles the instruction length decoder stalled due to a length changing prefix."}, {"INST_QUEUE.FULL", "The number of cycles during which the instruction queue is full."}, {"INST_RETIRED.ANY_P", "The number of instructions retired. This is an architectural performance event."}, {"INST_RETIRED.LOADS", "The number of instructions retired that contained a load operation."}, {"INST_RETIRED.OTHER", "The number of instructions retired that did not contain a load or a store operation."}, {"INST_RETIRED.STORES", "The number of instructions retired that contained a store operation."}, {"INST_RETIRED.VM_H", "The number of instructions retired while in VMX root operation."}, {"ITLB.FLUSH", "The number of ITLB flushes."}, {"ITLB.LARGE_MISS", "The number of instruction fetches from large pages that miss the ITLB."}, {"ITLB.MISSES", "The number of instruction fetches from both large and small pages that miss the ITLB."}, {"ITLB.SMALL_MISS", "The number of instruction fetches from small pages that miss the ITLB."}, {"ITLB_MISS_RETIRED", "The number of retired instructions that missed the ITLB when they were fetched."}, {"L1D_ALL_CACHE_REF", "The number of data reads and writes to cacheable memory."}, {"L1D_ALL_REF", "The number of references to L1 data cache counting loads and stores to all memory types."}, {"L1D_CACHE_LD", "Number of data reads from cacheable memory excluding locked reads."}, {"L1D_CACHE_LOCK", "Number of locked reads from cacheable memory."}, 
{"L1D_CACHE_LOCK_DURATION", "The number of cycles during which any cache line is locked by any locking instruction."}, {"L1D_CACHE_ST", "The number of data writes to cacheable memory excluding locked writes."}, {"L1D_M_EVICT", "The number of modified cache lines evicted from L1 data cache."}, {"L1D_M_REPL", "The number of modified lines allocated in L1 data cache."}, {"L1D_PEND_MISS", "The total number of outstanding L1 data cache misses at any clock."}, {"L1D_PREFETCH.REQUESTS", "The number of times L1 data cache requested to prefetch a data cache line."}, {"L1D_REPL", "The number of lines brought into L1 data cache."}, {"L1D_SPLIT.LOADS", "The number of load operations that span two cache lines."}, {"L1D_SPLIT.STORES", "The number of store operations that span two cache lines."}, {"L1I_MISSES", "The number of instruction fetch unit misses."}, {"L1I_READS", "The number of instruction fetches."}, {"L2_ADS", "The number of cycles that the L2 address bus is in use."}, {"L2_DBUS_BUSY_RD", "The number of cycles during which the L2 data bus is busy transferring data to the core."}, {"L2_IFETCH", "The number of instruction cache line requests from the instruction fetch unit."}, {"L2_LD", "The number of L2 cache read requests from L1 cache and L2 prefetchers."}, {"L2_LINES_IN", "The number of cache lines allocated in L2 cache."}, {"L2_LINES_OUT", "The number of L2 cache lines evicted."}, {"L2_LOCK", "The number of locked accesses to cache lines that miss L1 data cache."}, {"L2_M_LINES_IN", "The number of L2 cache line modifications."}, {"L2_M_LINES_OUT", "The number of modified lines evicted from L2 cache."}, {"L2_NO_REQ", "Number of cycles during which no L2 cache requests were pending from a core."}, {"L2_REJECT_BUSQ", "Number of L2 cache requests that were rejected."}, {"L2_RQSTS", "The number of completed L2 cache requests."}, {"L2_RQSTS.SELF.DEMAND.I_STATE", "The number of completed L2 cache demand requests from this core that missed the L2 cache. 
This is an architectural performance event."}, {"L2_RQSTS.SELF.DEMAND.MESI", "The number of completed L2 cache demand requests from this core. This is an architectural performance event."}, {"L2_ST", "The number of store operations that miss the L1 cache and request data from the L2 cache."}, {"LOAD_BLOCK.L1D", "The number of loads blocked by the L1 data cache."}, {"LOAD_BLOCK.OVERLAP_STORE", "The number of loads that partially overlap an earlier store or are aliased with a previous store."}, {"LOAD_BLOCK.STA", "The number of loads blocked by preceding stores whose address is yet to be calculated."}, {"LOAD_BLOCK.STD", "The number of loads blocked by preceding stores to the same address whose data value is not known."}, {"LOAD_BLOCK.UNTIL_RETIRE", "The number of load operations that were blocked until retirement."}, {"LOAD_HIT_PRE", "The number of load operations that conflicted with a prefetch to the same cache line."}, {"MACHINE_NUKES.MEM_ORDER", "The number of times the execution pipeline was restarted due to a memory ordering conflict or memory disambiguation misprediction."}, {"MACHINE_NUKES.SMC", "The number of times a program writes to a code section."}, {"MACRO_INSTS.CISC_DECODED", "The number of complex instructions decoded."}, {"MACRO_INSTS.DECODED", "The number of instructions decoded."}, {"MEMORY_DISAMBIGUATION.RESET", "The number of cycles during which memory disambiguation misprediction occurs."}, {"MEMORY_DISAMBIGUATION.SUCCESS", "The number of load operations that were successfully disambiguated."}, {"MEM_LOAD_RETIRED.DTLB_MISS", "The number of retired loads that missed the DTLB."}, {"MEM_LOAD_RETIRED.L1D_LINE_MISS", "The number of retired load operations that missed L1 data cache and that sent a request to L2 cache. This event is only available on PMC0."}, {"MEM_LOAD_RETIRED.L1D_MISS", "The number of retired load operations that missed L1 data cache. 
This event is only available on PMC0."}, {"MEM_LOAD_RETIRED.L2_LINE_MISS", "The number of load operations that missed L2 cache and that caused a bus request."}, {"MEM_LOAD_RETIRED.L2_MISS", "The number of load operations that missed L2 cache."}, {"MUL", "The number of multiply operations executed (only available on PMC1)."}, {"PAGE_WALKS.COUNT", "The number of page walks executed due to an ITLB or DTLB miss."}, {"PAGE_WALKS.CYCLES", "The number of cycles spent in a page walk caused by an ITLB or DTLB miss."}, {"PREF_RQSTS_DN", "The number of downward prefetches issued from the Data Prefetch Logic unit to L2 cache."}, {"PREF_RQSTS_UP", "The number of upward prefetches issued from the Data Prefetch Logic unit to L2 cache."}, {"RAT_STALLS.ANY", "The number of stall cycles due to any of RAT_STALLS.FLAGS, RAT_STALLS.FPSW, RAT_STALLS.PARTIAL and RAT_STALLS.ROB_READ_PORT."}, {"RAT_STALLS.FLAGS", "The number of cycles execution stalled due to a flag register induced stall."}, {"RAT_STALLS.FPSW", "The number of times the floating point status word was written."}, {"RAT_STALLS.OTHER_SERIALIZATION_STALLS", "The number of stalls due to other RAT resource serialization not counted by umask 0FH."}, {"RAT_STALLS.PARTIAL_CYCLES", "The number of cycles of added instruction execution latency due to the use of a register that was partially written by previous instructions."}, {"RAT_STALLS.ROB_READ_PORT", "The number of cycles when ROB read port stalls occurred."}, {"RESOURCE_STALLS.ANY", "The number of cycles during which any resource related stall occurred."}, {"RESOURCE_STALLS.BR_MISS_CLEAR", "The number of cycles stalled due to branch misprediction."}, {"RESOURCE_STALLS.FPCW", "The number of cycles stalled due to writing the floating point control word."}, {"RESOURCE_STALLS.LD_ST", "The number of cycles during which the number of loads and stores in the pipeline exceeded their limits."}, {"RESOURCE_STALLS.ROB_FULL", "The number of cycles when the reorder buffer was full."}, 
{"RESOURCE_STALLS.RS_FULL", "The number of cycles during which the RS was full."}, {"RS_UOPS_DISPATCHED", "The number of micro-ops dispatched for execution."}, {"RS_UOPS_DISPATCHED.PORT0", "The number of cycles micro-ops were dispatched for execution on port 0."}, {"RS_UOPS_DISPATCHED.PORT1", "The number of cycles micro-ops were dispatched for execution on port 1."}, {"RS_UOPS_DISPATCHED.PORT2", "The number of cycles micro-ops were dispatched for execution on port 2."}, {"RS_UOPS_DISPATCHED.PORT3", "The number of cycles micro-ops were dispatched for execution on port 3."}, {"RS_UOPS_DISPATCHED.PORT4", "The number of cycles micro-ops were dispatched for execution on port 4."}, {"RS_UOPS_DISPATCHED.PORT5", "The number of cycles micro-ops were dispatched for execution on port 5."}, {"SB_DRAIN_CYCLES", "The number of cycles while the store buffer is draining."}, {"SEGMENT_REG_LOADS", "The number of segment register loads."}, {"SEG_REG_RENAMES.ANY", "The number of times any segment register was renamed."}, {"SEG_REG_RENAMES.DS", "The number of times the ds register is renamed."}, {"SEG_REG_RENAMES.ES", "The number of times the es register is renamed."}, {"SEG_REG_RENAMES.FS", "The number of times the fs register is renamed."}, {"SEG_REG_RENAMES.GS", "The number of times the gs register is renamed."}, {"SEG_RENAME_STALLS.ANY", "The number of stalls due to lack of resources to rename any segment register."}, {"SEG_RENAME_STALLS.DS", "The number of stalls due to lack of renaming resources for the ds register."}, {"SEG_RENAME_STALLS.ES", "The number of stalls due to lack of renaming resources for the es register."}, {"SEG_RENAME_STALLS.FS", "The number of stalls due to lack of renaming resources for the fs register."}, {"SEG_RENAME_STALLS.GS", "The number of stalls due to lack of renaming resources for the gs register."}, {"SIMD_ASSIST", "The number of SIMD assists invoked."}, {"SIMD_COMP_INST_RETIRED.PACKED_DOUBLE", "The number of computational SSE2 packed double 
precision instructions retired."}, {"SIMD_COMP_INST_RETIRED.PACKED_SINGLE", "The number of computational SSE2 packed single precision instructions retired."}, {"SIMD_COMP_INST_RETIRED.SCALAR_DOUBLE", "The number of computational SSE2 scalar double precision instructions retired."}, {"SIMD_COMP_INST_RETIRED.SCALAR_SINGLE", "The number of computational SSE2 scalar single precision instructions retired."}, {"SIMD_INSTR_RETIRED", "The number of retired SIMD instructions that use MMX registers."}, {"SIMD_INST_RETIRED.ANY", "The number of streaming SIMD instructions retired."}, {"SIMD_INST_RETIRED.PACKED_DOUBLE", "The number of SSE2 packed double precision instructions retired."}, {"SIMD_INST_RETIRED.PACKED_SINGLE", "The number of SSE packed single precision instructions retired."}, {"SIMD_INST_RETIRED.SCALAR_DOUBLE", "The number of SSE2 scalar double precision instructions retired."}, {"SIMD_INST_RETIRED.SCALAR_SINGLE", "The number of SSE scalar single precision instructions retired."}, {"SIMD_INST_RETIRED.VECTOR", "The number of SSE2 vector instructions retired."}, {"SIMD_SAT_INSTR_RETIRED", "The number of saturated arithmetic SIMD instructions retired."}, {"SIMD_SAT_UOP_EXEC", "The number of SIMD saturated arithmetic micro-ops executed."}, {"SIMD_UOPS_EXEC", "The number of SIMD micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.ARITHMETIC", "The number of SIMD packed arithmetic micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.LOGICAL", "The number of SIMD packed logical micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.MUL", "The number of SIMD packed multiply micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.PACK", "The number of SIMD pack micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.SHIFT", "The number of SIMD packed shift micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.UNPACK", "The number of SIMD unpack micro-ops executed."}, {"SNOOP_STALL_DRV", "The number of times the bus stalled for snoops."}, {"SSE_PRE_EXEC.L1", "The number of PREFETCHT0 instructions executed."}, {"SSE_PRE_EXEC.L2", 
"The number of PREFETCHT1 instructions executed."}, {"SSE_PRE_EXEC.NTA", "The number of PREFETCHNTA instructions executed."}, {"SSE_PRE_EXEC.STORES", "The number of times SSE non-temporal store instructions were executed."}, {"SSE_PRE_MISS.L1", "The number of times the PREFETCHT0 instruction executed and missed all cache levels."}, {"SSE_PRE_MISS.L2", "The number of times the PREFETCHT1 instruction executed and missed all cache levels."}, {"SSE_PRE_MISS.NTA", "The number of times the PREFETCHNTA instruction executed and missed all cache levels."}, {"STORE_BLOCK.ORDER", "The number of cycles while a store was waiting for another store to be globally observed."}, {"STORE_BLOCK.SNOOP", "The number of cycles while a store was blocked due to a conflict with an internal or external snoop."}, {"THERMAL_TRIP", "The number of thermal trips."}, {"UOPS_RETIRED.ANY", "The number of micro-ops retired."}, {"UOPS_RETIRED.FUSED", "The number of fused micro-ops retired."}, {"UOPS_RETIRED.LD_IND_BR", "The number of micro-ops retired that fused a load with another operation."}, {"UOPS_RETIRED.MACRO_FUSION", "The number of times retired instruction pairs were fused into one micro-op."}, {"UOPS_RETIRED.NON_FUSED", "The number of non-fused micro-ops retired."}, {"UOPS_RETIRED.STD_STA", "The number of store address calculations that fused into one micro-op."}, {"X87_OPS_RETIRED.ANY", "The number of floating point computational instructions retired."}, {"X87_OPS_RETIRED.FXCH", "The number of FXCH instructions retired."}, { NULL, NULL } }; papi-papi-7-2-0-t/src/freebsd/map-core2-extreme.h000066400000000000000000000201751502707512200214530ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-core2-extreme.h * CVS: $Id$ * Author: George Neville-Neil * gnn@freebsd.org */ #ifndef FreeBSD_MAP_CORE2EXTREME_EXTREME #define FreeBSD_MAP_CORE2EXTREME_EXTREME enum NativeEvent_Value_Core2ExtremeProcessor { 
PNE_CORE2EXTREME_BACLEARS = PAPI_NATIVE_MASK , PNE_CORE2EXTREME_BOGUS_BR, PNE_CORE2EXTREME_BR_BAC_MISSP_EXEC, PNE_CORE2EXTREME_BR_CALL_MISSP_EXEC, PNE_CORE2EXTREME_BR_CALL_EXEC, PNE_CORE2EXTREME_BR_CND_EXEC, PNE_CORE2EXTREME_BR_CND_MISSP_EXEC, PNE_CORE2EXTREME_BR_IND_CALL_EXEC, PNE_CORE2EXTREME_BR_IND_EXEC, PNE_CORE2EXTREME_BR_IND_MISSP_EXEC, PNE_CORE2EXTREME_BR_INST_DECODED, PNE_CORE2EXTREME_BR_INST_EXEC, PNE_CORE2EXTREME_BR_INST_RETIRED_ANY, PNE_CORE2EXTREME_BR_INST_RETIRED_MISPRED, PNE_CORE2EXTREME_BR_INST_RETIRED_MISPRED_NOT_TAKEN, PNE_CORE2EXTREME_BR_INST_RETIRED_MISPRED_TAKEN, PNE_CORE2EXTREME_BR_INST_RETIRED_PRED_NOT_TAKEN, PNE_CORE2EXTREME_BR_INST_RETIRED_PRED_TAKEN, PNE_CORE2EXTREME_BR_INST_RETIRED_TAKEN, PNE_CORE2EXTREME_BR_MISSP_EXEC, PNE_CORE2EXTREME_BR_RET_MISSP_EXEC, PNE_CORE2EXTREME_BR_RET_BAC_MISSP_EXEC, PNE_CORE2EXTREME_BR_RET_EXEC, PNE_CORE2EXTREME_BR_TKN_BUBBLE_1, PNE_CORE2EXTREME_BR_TKN_BUBBLE_2, PNE_CORE2EXTREME_BUSQ_EMPTY, PNE_CORE2EXTREME_BUS_BNR_DRV, PNE_CORE2EXTREME_BUS_DATA_RCV, PNE_CORE2EXTREME_BUS_DRDY_CLOCKS, PNE_CORE2EXTREME_BUS_HIT_DRV, PNE_CORE2EXTREME_BUS_HITM_DRV, PNE_CORE2EXTREME_BUS_IO_WAIT, PNE_CORE2EXTREME_BUS_LOCK_CLOCKS, PNE_CORE2EXTREME_BUS_REQUEST_OUTSTANDING, PNE_CORE2EXTREME_BUS_TRANS_ANY, PNE_CORE2EXTREME_BUS_TRANS_BRD, PNE_CORE2EXTREME_BUS_TRANS_BURST, PNE_CORE2EXTREME_BUS_TRANS_DEF, PNE_CORE2EXTREME_BUS_TRANS_IFETCH, PNE_CORE2EXTREME_BUS_TRANS_INVAL, PNE_CORE2EXTREME_BUS_TRANS_IO, PNE_CORE2EXTREME_BUS_TRANS_MEM, PNE_CORE2EXTREME_BUS_TRANS_P, PNE_CORE2EXTREME_BUS_TRANS_PWR, PNE_CORE2EXTREME_BUS_TRANS_RFO, PNE_CORE2EXTREME_BUS_TRANS_WB, PNE_CORE2EXTREME_CMP_SNOOP, PNE_CORE2EXTREME_CPU_CLK_UNHALTED_BUS, PNE_CORE2EXTREME_CPU_CLK_UNHALTED_CORE_P, PNE_CORE2EXTREME_CPU_CLK_UNHALTED_NO_OTHER, PNE_CORE2EXTREME_CYCLES_DIV_BUSY, PNE_CORE2EXTREME_CYCLES_INT_MASKED, PNE_CORE2EXTREME_CYCLES_INT_PENDING_AND_MASKED, PNE_CORE2EXTREME_CYCLES_L1I_MEM_STALLED, PNE_CORE2EXTREME_DELAYED_BYPASS_FP, PNE_CORE2EXTREME_DELAYED_BYPASS_LOAD, 
PNE_CORE2EXTREME_DELAYED_BYPASS_SIMD, PNE_CORE2EXTREME_DIV, PNE_CORE2EXTREME_DTLB_MISSES_ANY, PNE_CORE2EXTREME_DTLB_MISSES_L0_MISS_LD, PNE_CORE2EXTREME_DTLB_MISSES_MISS_LD, PNE_CORE2EXTREME_DTLB_MISSES_MISS_ST, PNE_CORE2EXTREME_EIST_TRANS, PNE_CORE2EXTREME_ESP_ADDITIONS, PNE_CORE2EXTREME_ESP_SYNCH, PNE_CORE2EXTREME_EXT_SNOOP, PNE_CORE2EXTREME_FP_ASSIST, PNE_CORE2EXTREME_FP_COMP_OPS_EXE, PNE_CORE2EXTREME_FP_MMX_TRANS_TO_FP, PNE_CORE2EXTREME_FP_MMX_TRANS_TO_MMX, PNE_CORE2EXTREME_HW_INT_RCV, PNE_CORE2EXTREME_IDLE_DURING_DIV, PNE_CORE2EXTREME_ILD_STALL, PNE_CORE2EXTREME_INST_QUEUE_FULL, PNE_CORE2EXTREME_INST_RETIRED_ANY_P, PNE_CORE2EXTREME_INST_RETIRED_LOADS, PNE_CORE2EXTREME_INST_RETIRED_OTHER, PNE_CORE2EXTREME_INST_RETIRED_STORES, PNE_CORE2EXTREME_INST_RETIRED_VM_H, PNE_CORE2EXTREME_ITLB_FLUSH, PNE_CORE2EXTREME_ITLB_LARGE_MISS, PNE_CORE2EXTREME_ITLB_MISSES, PNE_CORE2EXTREME_ITLB_SMALL_MISS, PNE_CORE2EXTREME_ITLB_MISS_RETIRED, PNE_CORE2EXTREME_L1D_ALL_CACHE_REF, PNE_CORE2EXTREME_L1D_ALL_REF, PNE_CORE2EXTREME_L1D_CACHE_LD, PNE_CORE2EXTREME_L1D_CACHE_LOCK, PNE_CORE2EXTREME_L1D_CACHE_LOCK_DURATION, PNE_CORE2EXTREME_L1D_CACHE_ST, PNE_CORE2EXTREME_L1D_M_EVICT, PNE_CORE2EXTREME_L1D_M_REPL, PNE_CORE2EXTREME_L1D_PEND_MISS, PNE_CORE2EXTREME_L1D_PREFETCH_REQUESTS, PNE_CORE2EXTREME_L1D_REPL, PNE_CORE2EXTREME_L1D_SPLIT_LOADS, PNE_CORE2EXTREME_L1D_SPLIT_STORES, PNE_CORE2EXTREME_L1I_MISSES, PNE_CORE2EXTREME_L1I_READS, PNE_CORE2EXTREME_L2_ADS, PNE_CORE2EXTREME_L2_DBUS_BUSY_RD, PNE_CORE2EXTREME_L2_IFETCH, PNE_CORE2EXTREME_L2_LD, PNE_CORE2EXTREME_L2_LINES_IN, PNE_CORE2EXTREME_L2_LINES_OUT, PNE_CORE2EXTREME_L2_LOCK, PNE_CORE2EXTREME_L2_M_LINES_IN, PNE_CORE2EXTREME_L2_M_LINES_OUT, PNE_CORE2EXTREME_L2_NO_REQ, PNE_CORE2EXTREME_L2_REJECT_BUSQ, PNE_CORE2EXTREME_L2_RQSTS, PNE_CORE2EXTREME_L2_RQSTS_SELF_DEMAND_I_STATE, PNE_CORE2EXTREME_L2_RQSTS_SELF_DEMAND_MESI, PNE_CORE2EXTREME_L2_ST, PNE_CORE2EXTREME_LOAD_BLOCK_L1D, PNE_CORE2EXTREME_LOAD_BLOCK_OVERLAP_STORE, PNE_CORE2EXTREME_LOAD_BLOCK_STA, 
PNE_CORE2EXTREME_LOAD_BLOCK_STD, PNE_CORE2EXTREME_LOAD_BLOCK_UNTIL_RETIRE, PNE_CORE2EXTREME_LOAD_HIT_PRE, PNE_CORE2EXTREME_MACHINE_NUKES_MEM_ORDER, PNE_CORE2EXTREME_MACHINE_NUKES_SMC, PNE_CORE2EXTREME_MACRO_INSTS_CISC_DECODED, PNE_CORE2EXTREME_MACRO_INSTS_DECODED, PNE_CORE2EXTREME_MEMORY_DISAMBIGUATION_RESET, PNE_CORE2EXTREME_MEMORY_DISAMBIGUATION_SUCCESS, PNE_CORE2EXTREME_MEM_LOAD_RETIRED_DTLB_MISS, PNE_CORE2EXTREME_MEM_LOAD_RETIRED_L1D_LINE_MISS, PNE_CORE2EXTREME_MEM_LOAD_RETIRED_L1D_MISS, PNE_CORE2EXTREME_MEM_LOAD_RETIRED_L2_LINE_MISS, PNE_CORE2EXTREME_MEM_LOAD_RETIRED_L2_MISS, PNE_CORE2EXTREME_MUL, PNE_CORE2EXTREME_PAGE_WALKS_COUNT, PNE_CORE2EXTREME_PAGE_WALKS_CYCLES, PNE_CORE2EXTREME_PREF_RQSTS_DN, PNE_CORE2EXTREME_PREF_RQSTS_UP, PNE_CORE2EXTREME_RAT_STALLS_ANY, PNE_CORE2EXTREME_RAT_STALLS_FLAGS, PNE_CORE2EXTREME_RAT_STALLS_FPSW, PNE_CORE2EXTREME_RAT_STALLS_OTHER_SERIALIZATION_STALLS, PNE_CORE2EXTREME_RAT_STALLS_PARTIAL_CYCLES, PNE_CORE2EXTREME_RAT_STALLS_ROB_READ_PORT, PNE_CORE2EXTREME_RESOURCE_STALLS_ANY, PNE_CORE2EXTREME_RESOURCE_STALLS_BR_MISS_CLEAR, PNE_CORE2EXTREME_RESOURCE_STALLS_FPCW, PNE_CORE2EXTREME_RESOURCE_STALLS_LD_ST, PNE_CORE2EXTREME_RESOURCE_STALLS_ROB_FULL, PNE_CORE2EXTREME_RESOURCE_STALLS_RS_FULL, PNE_CORE2EXTREME_RS_UOPS_DISPATCHED, PNE_CORE2EXTREME_RS_UOPS_DISPATCHED_PORT0, PNE_CORE2EXTREME_RS_UOPS_DISPATCHED_PORT1, PNE_CORE2EXTREME_RS_UOPS_DISPATCHED_PORT2, PNE_CORE2EXTREME_RS_UOPS_DISPATCHED_PORT3, PNE_CORE2EXTREME_RS_UOPS_DISPATCHED_PORT4, PNE_CORE2EXTREME_RS_UOPS_DISPATCHED_PORT5, PNE_CORE2EXTREME_SB_DRAIN_CYCLES, PNE_CORE2EXTREME_SEGMENT_REG_LOADS, PNE_CORE2EXTREME_SEG_REG_RENAMES_ANY, PNE_CORE2EXTREME_SEG_REG_RENAMES_DS, PNE_CORE2EXTREME_SEG_REG_RENAMES_ES, PNE_CORE2EXTREME_SEG_REG_RENAMES_FS, PNE_CORE2EXTREME_SEG_REG_RENAMES_GS, PNE_CORE2EXTREME_SEG_RENAME_STALLS_ANY, PNE_CORE2EXTREME_SEG_RENAME_STALLS_DS, PNE_CORE2EXTREME_SEG_RENAME_STALLS_ES, PNE_CORE2EXTREME_SEG_RENAME_STALLS_FS, PNE_CORE2EXTREME_SEG_RENAME_STALLS_GS, 
PNE_CORE2EXTREME_SIMD_ASSIST, PNE_CORE2EXTREME_SIMD_COMP_INST_RETIRED_PACKED_DOUBLE, PNE_CORE2EXTREME_SIMD_COMP_INST_RETIRED_PACKED_SINGLE, PNE_CORE2EXTREME_SIMD_COMP_INST_RETIRED_SCALAR_DOUBLE, PNE_CORE2EXTREME_SIMD_COMP_INST_RETIRED_SCALAR_SINGLE, PNE_CORE2EXTREME_SIMD_INSTR_RETIRED, PNE_CORE2EXTREME_SIMD_INST_RETIRED_ANY, PNE_CORE2EXTREME_SIMD_INST_RETIRED_PACKED_DOUBLE, PNE_CORE2EXTREME_SIMD_INST_RETIRED_PACKED_SINGLE, PNE_CORE2EXTREME_SIMD_INST_RETIRED_SCALAR_DOUBLE, PNE_CORE2EXTREME_SIMD_INST_RETIRED_SCALAR_SINGLE, PNE_CORE2EXTREME_SIMD_INST_RETIRED_VECTOR, PNE_CORE2EXTREME_SIMD_SAT_INSTR_RETIRED, PNE_CORE2EXTREME_SIMD_SAT_UOP_EXEC, PNE_CORE2EXTREME_SIMD_UOPS_EXEC, PNE_CORE2EXTREME_SIMD_UOP_TYPE_EXEC_ARITHMETIC, PNE_CORE2EXTREME_SIMD_UOP_TYPE_EXEC_LOGICAL, PNE_CORE2EXTREME_SIMD_UOP_TYPE_EXEC_MUL, PNE_CORE2EXTREME_SIMD_UOP_TYPE_EXEC_PACK, PNE_CORE2EXTREME_SIMD_UOP_TYPE_EXEC_SHIFT, PNE_CORE2EXTREME_SIMD_UOP_TYPE_EXEC_UNPACK, PNE_CORE2EXTREME_SNOOP_STALL_DRV, PNE_CORE2EXTREME_SSE_PRE_EXEC_L1, PNE_CORE2EXTREME_SSE_PRE_EXEC_L2, PNE_CORE2EXTREME_SSE_PRE_EXEC_NTA, PNE_CORE2EXTREME_SSE_PRE_EXEC_STORES, PNE_CORE2EXTREME_SSE_PRE_MISS_L1, PNE_CORE2EXTREME_SSE_PRE_MISS_L2, PNE_CORE2EXTREME_SSE_PRE_MISS_NTA, PNE_CORE2EXTREME_STORE_BLOCK_ORDER, PNE_CORE2EXTREME_STORE_BLOCK_SNOOP, PNE_CORE2EXTREME_THERMAL_TRIP, PNE_CORE2EXTREME_UOPS_RETIRED_ANY, PNE_CORE2EXTREME_UOPS_RETIRED_FUSED, PNE_CORE2EXTREME_UOPS_RETIRED_LD_IND_BR, PNE_CORE2EXTREME_UOPS_RETIRED_MACRO_FUSION, PNE_CORE2EXTREME_UOPS_RETIRED_NON_FUSED, PNE_CORE2EXTREME_UOPS_RETIRED_STD_STA, PNE_CORE2EXTREME_X87_OPS_RETIRED_ANY, PNE_CORE2EXTREME_X87_OPS_RETIRED_FXCH, PNE_CORE2EXTREME_NATNAME_GUARD }; extern Native_Event_LabelDescription_t Core2ExtremeProcessor_info[]; extern hwi_search_t Core2ExtremeProcessor_map[]; #endif papi-papi-7-2-0-t/src/freebsd/map-core2.c000066400000000000000000000476731502707512200200130ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ 
/****************************/ /* * File: map-core2.c * Author: George Neville-Neil * gnn@freebsd.org * Harald Servat * redcrash@gmail.com */ #include "freebsd.h" #include "papiStdEventDefs.h" #include "map.h" /**************************************************************************** CORE2 SUBSTRATE CORE2 SUBSTRATE CORE2 SUBSTRATE CORE2 SUBSTRATE CORE2 SUBSTRATE ****************************************************************************/ /* NativeEvent_Value_Core2Processor must match Core2Processor_info */ Native_Event_LabelDescription_t Core2Processor_info[] = { {"BACLEARS", "The number of times the front end is resteered."}, {"BOGUS_BR", "The number of byte sequences mistakenly detected as taken branch instructions."}, {"BR_BAC_MISSP_EXEC", "The number of branch instructions that were mispredicted when decoded."}, {"BR_CALL_MISSP_EXEC", "The number of mispredicted CALL instructions that were executed."}, {"BR_CALL_EXEC", "The number of CALL instructions executed."}, {"BR_CND_EXEC", "The number of conditional branches executed, but not necessarily retired."}, {"BR_CND_MISSP_EXEC", "The number of mispredicted conditional branches executed."}, {"BR_IND_CALL_EXEC", "The number of indirect CALL instructions executed."}, {"BR_IND_EXEC", "The number of indirect branch instructions executed."}, {"BR_IND_MISSP_EXEC", "The number of mispredicted indirect branch instructions executed."}, {"BR_INST_DECODED", "The number of branch instructions decoded."}, {"BR_INST_EXEC", "The number of branches executed, but not necessarily retired."}, {"BR_INST_RETIRED.ANY", "The number of branch instructions retired. This is an architectural performance event."}, {"BR_INST_RETIRED.MISPRED", "The number of mispredicted branch instructions retired. 
This is an architectural performance event."}, {"BR_INST_RETIRED.MISPRED_NOT_TAKEN", "The number of not taken branch instructions retired that were mispredicted."}, {"BR_INST_RETIRED.MISPRED_TAKEN", "The number of taken branch instructions retired that were mispredicted."}, {"BR_INST_RETIRED.PRED_NOT_TAKEN", "The number of not taken branch instructions retired that were correctly predicted."}, {"BR_INST_RETIRED.PRED_TAKEN", "The number of taken branch instructions retired that were correctly predicted."}, {"BR_INST_RETIRED.TAKEN", "The number of taken branch instructions retired."}, {"BR_MISSP_EXEC", "The number of mispredicted branch instructions that were executed."}, {"BR_RET_MISSP_EXEC", "The number of mispredicted RET instructions executed."}, {"BR_RET_BAC_MISSP_EXEC", "The number of RET instructions executed that were mispredicted at decode time."}, {"BR_RET_EXEC", "The number of RET instructions executed."}, {"BR_TKN_BUBBLE_1", "The number of branches predicted taken with bubble 1."}, {"BR_TKN_BUBBLE_2", "The number of branches predicted taken with bubble 2."}, {"BUSQ_EMPTY", "The number of cycles during which the core did not have any pending transactions in the bus queue."}, {"BUS_BNR_DRV", "Number of Bus Not Ready signals asserted on the bus."}, {"BUS_DATA_RCV", "Number of bus cycles during which the processor is receiving data."}, {"BUS_DRDY_CLOCKS", "The number of bus cycles during which the Data Ready signal is asserted on the bus."}, {"BUS_HIT_DRV", "The number of bus cycles during which the processor drives the HIT# pin."}, {"BUS_HITM_DRV", "The number of bus cycles during which the processor drives the HITM# pin."}, {"BUS_IO_WAIT", "The number of core cycles during which I/O requests wait in the bus queue."}, {"BUS_LOCK_CLOCKS", "The number of bus cycles during which the LOCK signal was asserted on the bus."}, {"BUS_REQUEST_OUTSTANDING", "The number of pending full cache line read transactions on the bus occurring in each cycle."}, {"BUS_TRANS_ANY", "The 
number of bus transactions of any kind."}, {"BUS_TRANS_BRD", "The number of burst read transactions."}, {"BUS_TRANS_BURST", "The number of burst transactions."}, {"BUS_TRANS_DEF", "The number of deferred bus transactions."}, {"BUS_TRANS_IFETCH", "The number of instruction fetch full cache line bus transactions."}, {"BUS_TRANS_INVAL", "The number of invalidate bus transactions."}, {"BUS_TRANS_IO", "The number of completed I/O bus transactions due to IN and OUT instructions."}, {"BUS_TRANS_MEM", "The number of memory bus transactions."}, {"BUS_TRANS_P", "The number of partial bus transactions."}, {"BUS_TRANS_PWR", "The number of partial write bus transactions."}, {"BUS_TRANS_RFO", "The number of Read For Ownership bus transactions."}, {"BUS_TRANS_WB", "The number of explicit writeback bus transactions due to dirty line evictions."}, {"CMP_SNOOP", "The number of times the L1 data cache is snooped by the other core in the same processor."}, {"CPU_CLK_UNHALTED.BUS", "The number of bus cycles when the core is not in the halt state. This is an architectural performance event."}, {"CPU_CLK_UNHALTED.CORE_P", "The number of core cycles while the core is not in a halt state. This is an architectural performance event."}, {"CPU_CLK_UNHALTED.NO_OTHER", "The number of bus cycles during which the core remains unhalted and the other core is halted."}, {"CYCLES_DIV_BUSY", "The number of cycles the divider is busy. 
This event is only available on PMC0."}, {"CYCLES_INT_MASKED", "The number of cycles during which interrupts are disabled."}, {"CYCLES_INT_PENDING_AND_MASKED", "The number of cycles during which there were pending interrupts while interrupts were disabled."}, {"CYCLES_L1I_MEM_STALLED", "The number of cycles for which an instruction fetch stalls."}, {"DELAYED_BYPASS.FP", "The number of floating point operations that used data immediately after the data was generated by a non floating point execution unit."}, {"DELAYED_BYPASS.LOAD", "The number of delayed bypass penalty cycles that a load operation incurred."}, {"DELAYED_BYPASS.SIMD", "The number of times SIMD operations use data immediately after the data was generated by a non-SIMD execution unit."}, {"DIV", "The number of divide operations executed."}, {"DTLB_MISSES.ANY", "The number of Data TLB misses, including misses that result from speculative accesses."}, {"DTLB_MISSES.L0_MISS_LD", "The number of level 0 DTLB misses due to load operations."}, {"DTLB_MISSES.MISS_LD", "The number of Data TLB misses due to load operations."}, {"DTLB_MISSES.MISS_ST", "The number of Data TLB misses due to store operations."}, {"EIST_TRANS", "The number of Enhanced Intel SpeedStep Technology transitions."}, {"ESP.ADDITIONS", "The number of automatic additions to the esp register."}, {"ESP.SYNCH", "The number of times the esp register was explicitly used in an address expression after it is implicitly used by a PUSH or POP instruction."}, {"EXT_SNOOP", "The number of snoop responses to bus transactions."}, {"FP_ASSIST", "The number of floating point operations executed that needed a microcode assist."}, {"FP_COMP_OPS_EXE", "The number of floating point computational micro-ops executed. 
The event is available only on PMC0."}, {"FP_MMX_TRANS_TO_FP", "The number of transitions from MMX instructions to floating point instructions."}, {"FP_MMX_TRANS_TO_MMX", "The number of transitions from floating point instructions to MMX instructions."}, {"HW_INT_RCV", "The number of hardware interrupts received."}, {"IDLE_DURING_DIV", "The number of cycles the divider is busy and no other execution unit or load operation was in progress. This event is available only on PMC0."}, {"ILD_STALL", "The number of cycles the instruction length decoder stalled due to a length changing prefix."}, {"INST_QUEUE.FULL", "The number of cycles during which the instruction queue is full."}, {"INST_RETIRED.ANY_P", "The number of instructions retired. This is an architectural performance event."}, {"INST_RETIRED.LOADS", "The number of instructions retired that contained a load operation."}, {"INST_RETIRED.OTHER", "The number of instructions retired that did not contain a load or a store operation."}, {"INST_RETIRED.STORES", "The number of instructions retired that contained a store operation."}, {"ITLB.FLUSH", "The number of ITLB flushes."}, {"ITLB.LARGE_MISS", "The number of instruction fetches from large pages that miss the ITLB."}, {"ITLB.MISSES", "The number of instruction fetches from both large and small pages that miss the ITLB."}, {"ITLB.SMALL_MISS", "The number of instruction fetches from small pages that miss the ITLB."}, {"ITLB_MISS_RETIRED", "The number of retired instructions that missed the ITLB when they were fetched."}, {"L1D_ALL_CACHE_REF", "The number of data reads and writes to cacheable memory."}, {"L1D_ALL_REF", "The number of references to L1 data cache counting loads and stores to all memory types."}, {"L1D_CACHE_LD", "Number of data reads from cacheable memory excluding locked reads."}, {"L1D_CACHE_LOCK", "Number of locked reads from cacheable memory."}, {"L1D_CACHE_LOCK_DURATION", "The number of cycles during which any cache line is locked by any locking 
instruction."}, {"L1D_CACHE_ST", "The number of data writes to cacheable memory excluding locked writes."}, {"L1D_M_EVICT", "The number of modified cache lines evicted from L1 data cache."}, {"L1D_M_REPL", "The number of modified lines allocated in L1 data cache."}, {"L1D_PEND_MISS", "The total number of outstanding L1 data cache misses at any clock."}, {"L1D_PREFETCH.REQUESTS", "The number of times L1 data cache requested to prefetch a data cache line."}, {"L1D_REPL", "The number of lines brought into L1 data cache."}, {"L1D_SPLIT.LOADS", "The number of load operations that span two cache lines."}, {"L1D_SPLIT.STORES", "The number of store operations that span two cache lines."}, {"L1I_MISSES", "The number of instruction fetch unit misses."}, {"L1I_READS", "The number of instruction fetches."}, {"L2_ADS", "The number of cycles that the L2 address bus is in use."}, {"L2_DBUS_BUSY_RD", "The number of cycles during which the L2 data bus is busy transferring data to the core."}, {"L2_IFETCH", "The number of instruction cache line requests from the instruction fetch unit."}, {"L2_LD", "The number of L2 cache read requests from L1 cache and L2 prefetchers."}, {"L2_LINES_IN", "The number of cache lines allocated in L2 cache."}, {"L2_LINES_OUT", "The number of L2 cache lines evicted."}, {"L2_LOCK", "The number of locked accesses to cache lines that miss L1 data cache."}, {"L2_M_LINES_IN", "The number of L2 cache line modifications."}, {"L2_M_LINES_OUT", "The number of modified lines evicted from L2 cache."}, {"L2_NO_REQ", "Number of cycles during which no L2 cache requests were pending from a core."}, {"L2_REJECT_BUSQ", "Number of L2 cache requests that were rejected."}, {"L2_RQSTS", "The number of completed L2 cache requests."}, {"L2_RQSTS.SELF.DEMAND.I_STATE", "The number of completed L2 cache demand requests from this core that missed the L2 cache. 
This is an architectural performance event."}, {"L2_RQSTS.SELF.DEMAND.MESI", "The number of completed L2 cache demand requests from this core. This is an architectural performance event."}, {"L2_ST", "The number of store operations that miss the L1 cache and request data from the L2 cache."}, {"LOAD_BLOCK.L1D", "The number of loads blocked by the L1 data cache."}, {"LOAD_BLOCK.OVERLAP_STORE", "The number of loads that partially overlap an earlier store or are aliased with a previous store."}, {"LOAD_BLOCK.STA", "The number of loads blocked by preceding stores whose address is yet to be calculated."}, {"LOAD_BLOCK.STD", "The number of loads blocked by preceding stores to the same address whose data value is not known."}, {"LOAD_BLOCK.UNTIL_RETIRE", "The number of load operations that were blocked until retirement."}, {"LOAD_HIT_PRE", "The number of load operations that conflicted with a prefetch to the same cache line."}, {"MACHINE_NUKES.MEM_ORDER", "The number of times the execution pipeline was restarted due to a memory ordering conflict or memory disambiguation misprediction."}, {"MACHINE_NUKES.SMC", "The number of times a program writes to a code section."}, {"MACRO_INSTS.CISC_DECODED", "The number of complex instructions decoded."}, {"MACRO_INSTS.DECODED", "The number of instructions decoded."}, {"MEMORY_DISAMBIGUATION.RESET", "The number of cycles during which memory disambiguation misprediction occurs."}, {"MEMORY_DISAMBIGUATION.SUCCESS", "The number of load operations that were successfully disambiguated."}, {"MEM_LOAD_RETIRED.DTLB_MISS", "The number of retired loads that missed the DTLB."}, {"MEM_LOAD_RETIRED.L1D_LINE_MISS", "The number of retired load operations that missed L1 data cache and that sent a request to L2 cache. This event is only available on PMC0."}, {"MEM_LOAD_RETIRED.L1D_MISS", "The number of retired load operations that missed L1 data cache. 
This event is only available on PMC0."}, {"MEM_LOAD_RETIRED.L2_LINE_MISS", "The number of load operations that missed L2 cache and that caused a bus request."}, {"MEM_LOAD_RETIRED.L2_MISS", "The number of load operations that missed L2 cache."}, {"MUL", "The number of multiply operations executed (only available on PMC1)."}, {"PAGE_WALKS.COUNT", "The number of page walks executed due to an ITLB or DTLB miss."}, {"PAGE_WALKS.CYCLES", "The number of cycles spent in a page walk caused by an ITLB or DTLB miss."}, {"PREF_RQSTS_DN", "The number of downward prefetches issued from the Data Prefetch Logic unit to L2 cache."}, {"PREF_RQSTS_UP", "The number of upward prefetches issued from the Data Prefetch Logic unit to L2 cache."}, {"RAT_STALLS.ANY", "The number of stall cycles due to any of RAT_STALLS.FLAGS, RAT_STALLS.FPSW, RAT_STALLS.PARTIAL and RAT_STALLS.ROB_READ_PORT."}, {"RAT_STALLS.FLAGS", "The number of cycles execution stalled due to a flag register induced stall."}, {"RAT_STALLS.FPSW", "The number of times the floating point status word was written."}, {"RAT_STALLS.PARTIAL_CYCLES", "The number of cycles of added instruction execution latency due to the use of a register that was partially written by previous instructions."}, {"RAT_STALLS.ROB_READ_PORT", "The number of cycles when ROB read port stalls occurred."}, {"RESOURCE_STALLS.ANY", "The number of cycles during which any resource related stall occurred."}, {"RESOURCE_STALLS.BR_MISS_CLEAR", "The number of cycles stalled due to branch misprediction."}, {"RESOURCE_STALLS.FPCW", "The number of cycles stalled due to writing the floating point control word."}, {"RESOURCE_STALLS.LD_ST", "The number of cycles during which the number of loads and stores in the pipeline exceeded their limits."}, {"RESOURCE_STALLS.ROB_FULL", "The number of cycles when the reorder buffer was full."}, {"RESOURCE_STALLS.RS_FULL", "The number of cycles during which the RS was full."}, {"RS_UOPS_DISPATCHED", "The number of micro-ops dispatched 
for execution."}, {"RS_UOPS_DISPATCHED.PORT0", "The number of cycles micro-ops were dispatched for execution on port 0."}, {"RS_UOPS_DISPATCHED.PORT1", "The number of cycles micro-ops were dispatched for execution on port 1."}, {"RS_UOPS_DISPATCHED.PORT2", "The number of cycles micro-ops were dispatched for execution on port 2."}, {"RS_UOPS_DISPATCHED.PORT3", "The number of cycles micro-ops were dispatched for execution on port 3."}, {"RS_UOPS_DISPATCHED.PORT4", "The number of cycles micro-ops were dispatched for execution on port 4."}, {"RS_UOPS_DISPATCHED.PORT5", "The number of cycles micro-ops were dispatched for execution on port 5."}, {"SB_DRAIN_CYCLES", "The number of cycles while the store buffer is draining."}, {"SEGMENT_REG_LOADS", "The number of segment register loads."}, {"SEG_REG_RENAMES.ANY", "The number of times any segment register was renamed."}, {"SEG_REG_RENAMES.DS", "The number of times the ds register is renamed."}, {"SEG_REG_RENAMES.ES", "The number of times the es register is renamed."}, {"SEG_REG_RENAMES.FS", "The number of times the fs register is renamed."}, {"SEG_REG_RENAMES.GS", "The number of times the gs register is renamed."}, {"SEG_RENAME_STALLS.ANY", "The number of stalls due to lack of resources to rename any segment register."}, {"SEG_RENAME_STALLS.DS", "The number of stalls due to lack of renaming resources for the ds register."}, {"SEG_RENAME_STALLS.ES", "The number of stalls due to lack of renaming resources for the es register."}, {"SEG_RENAME_STALLS.FS", "The number of stalls due to lack of renaming resources for the fs register."}, {"SEG_RENAME_STALLS.GS", "The number of stalls due to lack of renaming resources for the gs register."}, {"SIMD_ASSIST", "The number of SIMD assists invoked."}, {"SIMD_COMP_INST_RETIRED.PACKED_DOUBLE", "The number of computational SSE2 packed double precision instructions retired."}, {"SIMD_COMP_INST_RETIRED.PACKED_SINGLE", "The number of computational SSE2 packed single precision instructions 
retired."}, {"SIMD_COMP_INST_RETIRED.SCALAR_DOUBLE", "The number of computational SSE2 scalar double precision instructions retired."}, {"SIMD_COMP_INST_RETIRED.SCALAR_SINGLE", "The number of computational SSE2 scalar single precision instructions retired."}, {"SIMD_INSTR_RETIRED", "The number of retired SIMD instructions that use MMX registers."}, {"SIMD_INST_RETIRED.ANY", "The number of streaming SIMD instructions retired."}, {"SIMD_INST_RETIRED.PACKED_DOUBLE", "The number of SSE2 packed double precision instructions retired."}, {"SIMD_INST_RETIRED.PACKED_SINGLE", "The number of SSE packed single precision instructions retired."}, {"SIMD_INST_RETIRED.SCALAR_DOUBLE", "The number of SSE2 scalar double precision instructions retired."}, {"SIMD_INST_RETIRED.SCALAR_SINGLE", "The number of SSE scalar single precision instructions retired."}, {"SIMD_INST_RETIRED.VECTOR", "The number of SSE2 vector instructions retired."}, {"SIMD_SAT_INSTR_RETIRED", "The number of saturated arithmetic SIMD instructions retired."}, {"SIMD_SAT_UOP_EXEC", "The number of SIMD saturated arithmetic micro-ops executed."}, {"SIMD_UOPS_EXEC", "The number of SIMD micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.ARITHMETIC", "The number of SIMD packed arithmetic micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.LOGICAL", "The number of SIMD packed logical micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.MUL", "The number of SIMD packed multiply micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.PACK", "The number of SIMD pack micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.SHIFT", "The number of SIMD packed shift micro-ops executed."}, {"SIMD_UOP_TYPE_EXEC.UNPACK", "The number of SIMD unpack micro-ops executed."}, {"SNOOP_STALL_DRV", "The number of times the bus stalled for snoops."}, {"SSE_PRE_EXEC.L1", "The number of PREFETCHT0 instructions executed."}, {"SSE_PRE_EXEC.L2", "The number of PREFETCHT1 instructions executed."}, {"SSE_PRE_EXEC.NTA", "The number of PREFETCHNTA instructions executed."}, {"SSE_PRE_EXEC.STORES", 
"The number of times SSE non-temporal store instructions were executed."}, {"SSE_PRE_MISS.L1", "The number of times the PREFETCHT0 instruction executed and missed all cache levels."}, {"SSE_PRE_MISS.L2", "The number of times the PREFETCHT1 instruction executed and missed all cache levels."}, {"SSE_PRE_MISS.NTA", "The number of times the PREFETCHNTA instruction executed and missed all cache levels."}, {"STORE_BLOCK.ORDER", "The number of cycles while a store was waiting for another store to be globally observed."}, {"STORE_BLOCK.SNOOP", "The number of cycles while a store was blocked due to a conflict with an internal or external snoop."}, {"THERMAL_TRIP", "The number of thermal trips."}, {"UOPS_RETIRED.ANY", "The number of micro-ops retired."}, {"UOPS_RETIRED.FUSED", "The number of fused micro-ops retired."}, {"UOPS_RETIRED.LD_IND_BR", "The number of micro-ops retired that fused a load with another operation."}, {"UOPS_RETIRED.MACRO_FUSION", "The number of times retired instruction pairs were fused into one micro-op."}, {"UOPS_RETIRED.NON_FUSED", "The number of non-fused micro-ops retired."}, {"UOPS_RETIRED.STD_STA", "The number of store address calculations that fused into one micro-op."}, {"X87_OPS_RETIRED.ANY", "The number of floating point computational instructions retired."}, {"X87_OPS_RETIRED.FXCH", "The number of FXCH instructions retired."}, { NULL, NULL } }; papi-papi-7-2-0-t/src/freebsd/map-core2.h000066400000000000000000000151121502707512200177770ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-core2.h * CVS: $Id$ * Author: George Neville-Neil * gnn@freebsd.org */ #ifndef FreeBSD_MAP_CORE2 #define FreeBSD_MAP_CORE2 enum NativeEvent_Value_Core2Processor { PNE_CORE2_BACLEARS = PAPI_NATIVE_MASK , PNE_CORE2_BOGUS_BR, PNE_CORE2_BR_BAC_MISSP_EXEC, PNE_CORE2_BR_CALL_MISSP_EXEC, PNE_CORE2_BR_CALL_EXEC, PNE_CORE2_BR_CND_EXEC, PNE_CORE2_BR_CND_MISSP_EXEC, 
PNE_CORE2_BR_IND_CALL_EXEC, PNE_CORE2_BR_IND_EXEC, PNE_CORE2_BR_IND_MISSP_EXEC, PNE_CORE2_BR_INST_DECODED, PNE_CORE2_BR_INST_EXEC, PNE_CORE2_BR_INST_RETIRED_ANY, PNE_CORE2_BR_INST_RETIRED_MISPRED, PNE_CORE2_BR_INST_RETIRED_MISPRED_NOT_TAKEN, PNE_CORE2_BR_INST_RETIRED_MISPRED_TAKEN, PNE_CORE2_BR_INST_RETIRED_PRED_NOT_TAKEN, PNE_CORE2_BR_INST_RETIRED_PRED_TAKEN, PNE_CORE2_BR_INST_RETIRED_TAKEN, PNE_CORE2_BR_MISSP_EXEC, PNE_CORE2_BR_RET_MISSP_EXEC, PNE_CORE2_BR_RET_BAC_MISSP_EXEC, PNE_CORE2_BR_RET_EXEC, PNE_CORE2_BR_TKN_BUBBLE_1, PNE_CORE2_BR_TKN_BUBBLE_2, PNE_CORE2_BUSQ_EMPTY, PNE_CORE2_BUS_BNR_DRV, PNE_CORE2_BUS_DATA_RCV, PNE_CORE2_BUS_DRDY_CLOCKS, PNE_CORE2_BUS_HIT_DRV, PNE_CORE2_BUS_HITM_DRV, PNE_CORE2_BUS_IO_WAIT, PNE_CORE2_BUS_LOCK_CLOCKS, PNE_CORE2_BUS_REQUEST_OUTSTANDING, PNE_CORE2_BUS_TRANS_ANY, PNE_CORE2_BUS_TRANS_BRD, PNE_CORE2_BUS_TRANS_BURST, PNE_CORE2_BUS_TRANS_DEF, PNE_CORE2_BUS_TRANS_IFETCH, PNE_CORE2_BUS_TRANS_INVAL, PNE_CORE2_BUS_TRANS_IO, PNE_CORE2_BUS_TRANS_MEM, PNE_CORE2_BUS_TRANS_P, PNE_CORE2_BUS_TRANS_PWR, PNE_CORE2_BUS_TRANS_RFO, PNE_CORE2_BUS_TRANS_WB, PNE_CORE2_CMP_SNOOP, PNE_CORE2_CPU_CLK_UNHALTED_BUS, PNE_CORE2_CPU_CLK_UNHALTED_CORE_P, PNE_CORE2_CPU_CLK_UNHALTED_NO_OTHER, PNE_CORE2_CYCLES_DIV_BUSY, PNE_CORE2_CYCLES_INT_MASKED, PNE_CORE2_CYCLES_INT_PENDING_AND_MASKED, PNE_CORE2_CYCLES_L1I_MEM_STALLED, PNE_CORE2_DELAYED_BYPASS_FP, PNE_CORE2_DELAYED_BYPASS_LOAD, PNE_CORE2_DELAYED_BYPASS_SIMD, PNE_CORE2_DIV, PNE_CORE2_DTLB_MISSES_ANY, PNE_CORE2_DTLB_MISSES_L0_MISS_LD, PNE_CORE2_DTLB_MISSES_MISS_LD, PNE_CORE2_DTLB_MISSES_MISS_ST, PNE_CORE2_EIST_TRANS, PNE_CORE2_ESP_ADDITIONS, PNE_CORE2_ESP_SYNCH, PNE_CORE2_EXT_SNOOP, PNE_CORE2_FP_ASSIST, PNE_CORE2_FP_COMP_OPS_EXE, PNE_CORE2_FP_MMX_TRANS_TO_FP, PNE_CORE2_FP_MMX_TRANS_TO_MMX, PNE_CORE2_HW_INT_RCV, PNE_CORE2_IDLE_DURING_DIV, PNE_CORE2_ILD_STALL, PNE_CORE2_INST_QUEUE_FULL, PNE_CORE2_INST_RETIRED_ANY_P, PNE_CORE2_INST_RETIRED_LOADS, PNE_CORE2_INST_RETIRED_OTHER, PNE_CORE2_INST_RETIRED_STORES, 
PNE_CORE2_ITLB_FLUSH, PNE_CORE2_ITLB_LARGE_MISS, PNE_CORE2_ITLB_MISSES, PNE_CORE2_ITLB_SMALL_MISS, PNE_CORE2_ITLB_MISS_RETIRED, PNE_CORE2_L1D_ALL_CACHE_REF, PNE_CORE2_L1D_ALL_REF, PNE_CORE2_L1D_CACHE_LD, PNE_CORE2_L1D_CACHE_LOCK, PNE_CORE2_L1D_CACHE_LOCK_DURATION, PNE_CORE2_L1D_CACHE_ST, PNE_CORE2_L1D_M_EVICT, PNE_CORE2_L1D_M_REPL, PNE_CORE2_L1D_PEND_MISS, PNE_CORE2_L1D_PREFETCH_REQUESTS, PNE_CORE2_L1D_REPL, PNE_CORE2_L1D_SPLIT_LOADS, PNE_CORE2_L1D_SPLIT_STORES, PNE_CORE2_L1I_MISSES, PNE_CORE2_L1I_READS, PNE_CORE2_L2_ADS, PNE_CORE2_L2_DBUS_BUSY_RD, PNE_CORE2_L2_IFETCH, PNE_CORE2_L2_LD, PNE_CORE2_L2_LINES_IN, PNE_CORE2_L2_LINES_OUT, PNE_CORE2_L2_LOCK, PNE_CORE2_L2_M_LINES_IN, PNE_CORE2_L2_M_LINES_OUT, PNE_CORE2_L2_NO_REQ, PNE_CORE2_L2_REJECT_BUSQ, PNE_CORE2_L2_RQSTS, PNE_CORE2_L2_RQSTS_SELF_DEMAND_I_STATE, PNE_CORE2_L2_RQSTS_SELF_DEMAND_MESI, PNE_CORE2_L2_ST, PNE_CORE2_LOAD_BLOCK_L1D, PNE_CORE2_LOAD_BLOCK_OVERLAP_STORE, PNE_CORE2_LOAD_BLOCK_STA, PNE_CORE2_LOAD_BLOCK_STD, PNE_CORE2_LOAD_BLOCK_UNTIL_RETIRE, PNE_CORE2_LOAD_HIT_PRE, PNE_CORE2_MACHINE_NUKES_MEM_ORDER, PNE_CORE2_MACHINE_NUKES_SMC, PNE_CORE2_MACRO_INSTS_CISC_DECODED, PNE_CORE2_MACRO_INSTS_DECODED, PNE_CORE2_MEMORY_DISAMBIGUATION_RESET, PNE_CORE2_MEMORY_DISAMBIGUATION_SUCCESS, PNE_CORE2_MEM_LOAD_RETIRED_DTLB_MISS, PNE_CORE2_MEM_LOAD_RETIRED_L1D_LINE_MISS, PNE_CORE2_MEM_LOAD_RETIRED_L1D_MISS, PNE_CORE2_MEM_LOAD_RETIRED_L2_LINE_MISS, PNE_CORE2_MEM_LOAD_RETIRED_L2_MISS, PNE_CORE2_MUL, PNE_CORE2_PAGE_WALKS_COUNT, PNE_CORE2_PAGE_WALKS_CYCLES, PNE_CORE2_PREF_RQSTS_DN, PNE_CORE2_PREF_RQSTS_UP, PNE_CORE2_RAT_STALLS_ANY, PNE_CORE2_RAT_STALLS_FLAGS, PNE_CORE2_RAT_STALLS_FPSW, PNE_CORE2_RAT_STALLS_PARTIAL_CYCLES, PNE_CORE2_RAT_STALLS_ROB_READ_PORT, PNE_CORE2_RESOURCE_STALLS_ANY, PNE_CORE2_RESOURCE_STALLS_BR_MISS_CLEAR, PNE_CORE2_RESOURCE_STALLS_FPCW, PNE_CORE2_RESOURCE_STALLS_LD_ST, PNE_CORE2_RESOURCE_STALLS_ROB_FULL, PNE_CORE2_RESOURCE_STALLS_RS_FULL, PNE_CORE2_RS_UOPS_DISPATCHED, PNE_CORE2_RS_UOPS_DISPATCHED_PORT0, 
PNE_CORE2_RS_UOPS_DISPATCHED_PORT1, PNE_CORE2_RS_UOPS_DISPATCHED_PORT2, PNE_CORE2_RS_UOPS_DISPATCHED_PORT3, PNE_CORE2_RS_UOPS_DISPATCHED_PORT4, PNE_CORE2_RS_UOPS_DISPATCHED_PORT5, PNE_CORE2_SB_DRAIN_CYCLES, PNE_CORE2_SEGMENT_REG_LOADS, PNE_CORE2_SEG_REG_RENAMES_ANY, PNE_CORE2_SEG_REG_RENAMES_DS, PNE_CORE2_SEG_REG_RENAMES_ES, PNE_CORE2_SEG_REG_RENAMES_FS, PNE_CORE2_SEG_REG_RENAMES_GS, PNE_CORE2_SEG_RENAME_STALLS_ANY, PNE_CORE2_SEG_RENAME_STALLS_DS, PNE_CORE2_SEG_RENAME_STALLS_ES, PNE_CORE2_SEG_RENAME_STALLS_FS, PNE_CORE2_SEG_RENAME_STALLS_GS, PNE_CORE2_SIMD_ASSIST, PNE_CORE2_SIMD_COMP_INST_RETIRED_PACKED_DOUBLE, PNE_CORE2_SIMD_COMP_INST_RETIRED_PACKED_SINGLE, PNE_CORE2_SIMD_COMP_INST_RETIRED_SCALAR_DOUBLE, PNE_CORE2_SIMD_COMP_INST_RETIRED_SCALAR_SINGLE, PNE_CORE2_SIMD_INSTR_RETIRED, PNE_CORE2_SIMD_INST_RETIRED_ANY, PNE_CORE2_SIMD_INST_RETIRED_PACKED_DOUBLE, PNE_CORE2_SIMD_INST_RETIRED_PACKED_SINGLE, PNE_CORE2_SIMD_INST_RETIRED_SCALAR_DOUBLE, PNE_CORE2_SIMD_INST_RETIRED_SCALAR_SINGLE, PNE_CORE2_SIMD_INST_RETIRED_VECTOR, PNE_CORE2_SIMD_SAT_INSTR_RETIRED, PNE_CORE2_SIMD_SAT_UOP_EXEC, PNE_CORE2_SIMD_UOPS_EXEC, PNE_CORE2_SIMD_UOP_TYPE_EXEC_ARITHMETIC, PNE_CORE2_SIMD_UOP_TYPE_EXEC_LOGICAL, PNE_CORE2_SIMD_UOP_TYPE_EXEC_MUL, PNE_CORE2_SIMD_UOP_TYPE_EXEC_PACK, PNE_CORE2_SIMD_UOP_TYPE_EXEC_SHIFT, PNE_CORE2_SIMD_UOP_TYPE_EXEC_UNPACK, PNE_CORE2_SNOOP_STALL_DRV, PNE_CORE2_SSE_PRE_EXEC_L1, PNE_CORE2_SSE_PRE_EXEC_L2, PNE_CORE2_SSE_PRE_EXEC_NTA, PNE_CORE2_SSE_PRE_EXEC_STORES, PNE_CORE2_SSE_PRE_MISS_L1, PNE_CORE2_SSE_PRE_MISS_L2, PNE_CORE2_SSE_PRE_MISS_NTA, PNE_CORE2_STORE_BLOCK_ORDER, PNE_CORE2_STORE_BLOCK_SNOOP, PNE_CORE2_THERMAL_TRIP, PNE_CORE2_UOPS_RETIRED_ANY, PNE_CORE2_UOPS_RETIRED_FUSED, PNE_CORE2_UOPS_RETIRED_LD_IND_BR, PNE_CORE2_UOPS_RETIRED_MACRO_FUSION, PNE_CORE2_UOPS_RETIRED_NON_FUSED, PNE_CORE2_UOPS_RETIRED_STD_STA, PNE_CORE2_X87_OPS_RETIRED_ANY, PNE_CORE2_X87_OPS_RETIRED_FXCH, PNE_CORE2_NATNAME_GUARD }; extern Native_Event_LabelDescription_t Core2Processor_info[]; 
extern hwi_search_t Core2Processor_map[]; #endif papi-papi-7-2-0-t/src/freebsd/map-i7.c000066400000000000000000002205351502707512200173060ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-i7.c * Author: George Neville-Neil * gnn@freebsd.org * Harald Servat * redcrash@gmail.com */ #include "freebsd.h" #include "papiStdEventDefs.h" #include "map.h" /**************************************************************************** i7 SUBSTRATE i7 SUBSTRATE i7 SUBSTRATE i7 SUBSTRATE i7 SUBSTRATE ****************************************************************************/ /* NativeEvent_Value_i7 must match i7_info */ Native_Event_LabelDescription_t i7Processor_info[] = { {"SB_FORWARD.ANY", "Counts the number of store forwards."}, {"LOAD_BLOCK.STD", "Counts the number of loads blocked by a preceding store with unknown data."}, {"LOAD_BLOCK.ADDRESS_OFFSET", "Counts the number of loads blocked by a preceding store address."}, {"SB_DRAIN.CYCLES", "Counts the cycles of store buffer drains."}, {"MISALIGN_MEM_REF.LOAD", "Counts the number of misaligned load references."}, {"MISALIGN_MEM_REF.STORE", "Counts the number of misaligned store references."}, {"MISALIGN_MEM_REF.ANY", "Counts the number of misaligned memory references."}, {"STORE_BLOCKS.NOT_STA", "This event counts the number of load operations delayed by preceding stores whose addresses are known but whose data is unknown, and preceding stores that conflict with the load but which incompletely overlap the load."}, {"STORE_BLOCKS.STA", "This event counts load operations delayed by preceding stores whose addresses are unknown (STA block)."}, {"STORE_BLOCKS.AT_RET", "Counts number of loads delayed with at-Retirement block code. 
The following loads need to be executed at retirement and wait for all senior stores on the same thread to be drained: load splitting across 4K boundary (page split), load accessing uncacheable (UC or USWC) memory, load lock, and load with page table in UC or USWC memory region."}, {"STORE_BLOCKS.L1D_BLOCK", "Cacheable loads delayed with L1D block code."}, {"STORE_BLOCKS.ANY", "All loads delayed due to store blocks."}, {"PARTIAL_ADDRESS_ALIAS", "Counts false dependency due to partial address aliasing."}, {"DTLB_LOAD_MISSES.ANY", "Counts all load misses that cause a page walk."}, {"DTLB_LOAD_MISSES.WALK_COMPLETED", "Counts number of completed page walks due to load miss in the STLB."}, {"DTLB_LOAD_MISSES.STLB_HIT", "Number of cache load STLB hits."}, {"DTLB_LOAD_MISSES.PDE_MISS", "Number of DTLB cache load misses where the low part of the linear to physical address translation was missed."}, {"DTLB_LOAD_MISSES.PDP_MISS", "Number of DTLB cache load misses where the high part of the linear to physical address translation was missed."}, {"DTLB_LOAD_MISSES.LARGE_WALK_COMPLETED", "Counts number of completed large page walks due to load miss in the STLB."}, {"MEMORY_DISAMBIGURATION.RESET", "Counts memory disambiguation reset cycles."}, {"MEMORY_DISAMBIGURATION.SUCCESS", "Counts the number of loads for which memory disambiguation succeeded."}, {"MEMORY_DISAMBIGURATION.WATCHDOG", "Counts the number of times the memory disambiguation watchdog kicked in."}, {"MEMORY_DISAMBIGURATION.WATCH_CYCLES", "Counts the cycles that the memory disambiguation watchdog is active."}, {"MEM_INST_RETIRED.LOADS", "Counts the number of instructions with an architecturally-visible load retired on the architected path."}, {"MEM_INST_RETIRED.STORES", "Counts the number of instructions with an architecturally-visible store retired on the architected path."}, {"MEM_STORE_RETIRED.DTLB_MISS", "The event counts the number of retired stores that missed the DTLB. 
The DTLB miss is not counted if the store operation causes a fault. Does not count prefetches."}, {"UOPS_ISSUED.ANY", "Counts the number of Uops issued by the Register Allocation Table to the Reservation Station, i.e. the UOPs issued from the front end to the back end."}, {"UOPS_ISSUED.FUSED", "Counts the number of fused Uops that were issued from the Register Allocation Table to the Reservation Station."}, {"MEM_UNCORE_RETIRED.OTHER_CORE_L2_HITM", "Counts number of memory load instructions retired where the memory reference hit modified data in a sibling core residing on the same socket."}, {"MEM_UNCORE_RETIRED.REMOTE_CACHE_LOCAL_HOME_HIT", "Counts number of memory load instructions retired where the memory reference missed the L1, L2 and L3 caches and HIT in a remote socket's cache. Only counts locally homed lines."}, {"MEM_UNCORE_RETIRED.REMOTE_DRAM", "Counts number of memory load instructions retired where the memory reference missed the L1, L2 and L3 caches and was remotely homed. This includes both DRAM access and HITM in a remote socket's cache for remotely homed lines."}, {"MEM_UNCORE_RETIRED.LOCAL_DRAM", "Counts number of memory load instructions retired where the memory reference missed the L1, L2 and L3 caches and required a local socket memory reference. This includes locally homed cachelines that were in a modified state in another socket."}, {"FP_COMP_OPS_EXE.X87", "Counts the number of FP Computational Uops Executed. The number of FADD, FSUB, FCOM, FMULs, integer MULs and IMULs, FDIVs, FPREMs, FSQRTS, integer DIVs, and IDIVs. 
This event does not distinguish an FADD used in the middle of a transcendental flow from a separate FADD instruction."}, {"FP_COMP_OPS_EXE.MMX", "Counts number of MMX Uops executed."}, {"FP_COMP_OPS_EXE.SSE_FP", "Counts number of SSE and SSE2 FP uops executed."}, {"FP_COMP_OPS_EXE.SSE2_INTEGER", "Counts number of SSE2 integer uops executed."}, {"FP_COMP_OPS_EXE.SSE_FP_PACKED", "Counts number of SSE FP packed uops executed."}, {"FP_COMP_OPS_EXE.SSE_FP_SCALAR", "Counts number of SSE FP scalar uops executed."}, {"FP_COMP_OPS_EXE.SSE_SINGLE_PRECISION", "Counts number of SSE* FP single precision uops executed."}, {"FP_COMP_OPS_EXE.SSE_DOUBLE_PRECISION", "Counts number of SSE* FP double precision uops executed."}, {"SIMD_INT_128.PACKED_MPY", "Counts number of 128 bit SIMD integer multiply operations."}, {"SIMD_INT_128.PACKED_SHIFT", "Counts number of 128 bit SIMD integer shift operations."}, {"SIMD_INT_128.PACK", "Counts number of 128 bit SIMD integer pack operations."}, {"SIMD_INT_128.UNPACK", "Counts number of 128 bit SIMD integer unpack operations."}, {"SIMD_INT_128.PACKED_LOGICAL", "Counts number of 128 bit SIMD integer logical operations."}, {"SIMD_INT_128.PACKED_ARITH", "Counts number of 128 bit SIMD integer arithmetic operations."}, {"SIMD_INT_128.SHUFFLE_MOVE", "Counts number of 128 bit SIMD integer shuffle and move operations."}, {"LOAD_DISPATCH.RS", "Counts number of loads dispatched from the Reservation Station that bypass the Memory Order Buffer."}, {"LOAD_DISPATCH.RS_DELAYED", "Counts the number of delayed RS dispatches at the stage latch. 
If an RS dispatch can not bypass to LB, it has another chance to dispatch from the one-cycle delayed staging latch before it is written into the LB."}, {"LOAD_DISPATCH.MOB", "Counts the number of loads dispatched from the Reservation Station to the Memory Order Buffer."}, {"LOAD_DISPATCH.ANY", "Counts all loads dispatched from the Reservation Station."}, {"ARITH.CYCLES_DIV_BUSY", "Counts the number of cycles the divider is busy executing divide or square root operations. The divide can be integer, X87 or Streaming SIMD Extensions (SSE). The square root operation can be either X87 or SSE."}, {"ARITH.MUL", "Counts the number of multiply operations executed. This includes integer as well as floating point multiply operations but excludes DPPS mul and MPSAD."}, {"INST_QUEUE_WRITES", "Counts the number of instructions written into the instruction queue every cycle."}, {"INST_DECODED.DEC0", "Counts number of instructions that require decoder 0 to be decoded. Usually, this means that the instruction maps to more than 1 uop."}, {"TWO_UOP_INSTS_DECODED", "An instruction that generates two uops was decoded."}, {"HW_INT.RCV", "Number of interrupts received."}, {"HW_INT.CYCLES_MASKED", "Number of cycles interrupts are masked."}, {"HW_INT.CYCLES_PENDING_AND_MASKED", "Number of cycles interrupts are pending and masked."}, {"INST_QUEUE_WRITE_CYCLES", "This event counts the number of cycles during which instructions are written to the instruction queue. Dividing this counter by the number of instructions written to the instruction queue (INST_QUEUE_WRITES) yields the average number of instructions decoded each cycle. If this number is less than four and the pipe stalls, this indicates that the decoder is failing to decode enough instructions per cycle to sustain the 4-wide pipeline. If SSE* instructions that are 6 bytes or longer arrive one after another, then front end throughput may limit execution speed."}, {"L2_RQSTS.LD_HIT", "Counts number of loads that hit the L2 cache. 
L2 loads include both L1D demand misses as well as L1D prefetches. L2 loads can be rejected for various reasons. Only non rejected loads are counted."}, {"L2_RQSTS.LD_MISS", "Counts the number of loads that miss the L2 cache. L2 loads include both L1D demand misses as well as L1D prefetches."}, {"L2_RQSTS.LOADS", "Counts all L2 load requests. L2 loads include both L1D demand misses as well as L1D prefetches."}, {"L2_RQSTS.RFO_HIT", "Counts the number of store RFO requests that hit the L2 cache. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches. Count includes WC memory requests, where the data is not fetched but the permission to write the line is required."}, {"L2_RQSTS.RFO_MISS", "Counts the number of store RFO requests that miss the L2 cache. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches."}, {"L2_RQSTS.RFOS", "Counts all L2 store RFO requests. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches."}, {"L2_RQSTS.IFETCH_HIT", "Counts number of instruction fetches that hit the L2 cache. L2 instruction fetches include both L1I demand misses as well as L1I instruction prefetches."}, {"L2_RQSTS.IFETCH_MISS", "Counts number of instruction fetches that miss the L2 cache. L2 instruction fetches include both L1I demand misses as well as L1I instruction prefetches."}, {"L2_RQSTS.IFETCHES", "Counts all instruction fetches. 
L2 instruction fetches include both L1I demand misses as well as L1I instruction prefetches."}, {"L2_RQSTS.PREFETCH_HIT", "Counts L2 prefetch hits for both code and data."}, {"L2_RQSTS.PREFETCH_MISS", "Counts L2 prefetch misses for both code and data."}, {"L2_RQSTS.PREFETCHES", "Counts all L2 prefetches for both code and data."}, {"L2_RQSTS.MISS", "Counts all L2 misses for both code and data."}, {"L2_RQSTS.REFERENCES", "Counts all L2 requests for both code and data."}, {"L2_DATA_RQSTS.DEMAND.I_STATE", "Counts number of L2 data demand loads where the cache line to be loaded is in the I (invalid) state, i.e. a cache miss. L2 demand loads are both L1D demand misses and L1D prefetches."}, {"L2_DATA_RQSTS.DEMAND.S_STATE", "Counts number of L2 data demand loads where the cache line to be loaded is in the S (shared) state. L2 demand loads are both L1D demand misses and L1D prefetches."}, {"L2_DATA_RQSTS.DEMAND.E_STATE", "Counts number of L2 data demand loads where the cache line to be loaded is in the E (exclusive) state. L2 demand loads are both L1D demand misses and L1D prefetches."}, {"L2_DATA_RQSTS.DEMAND.M_STATE", "Counts number of L2 data demand loads where the cache line to be loaded is in the M (modified) state. L2 demand loads are both L1D demand misses and L1D prefetches."}, {"L2_DATA_RQSTS.DEMAND.MESI", "Counts all L2 data demand requests. L2 demand loads are both L1D demand misses and L1D prefetches."}, {"L2_DATA_RQSTS.PREFETCH.I_STATE", "Counts number of L2 prefetch data loads where the cache line to be loaded is in the I (invalid) state, i.e. a cache miss."}, {"L2_DATA_RQSTS.PREFETCH.S_STATE", "Counts number of L2 prefetch data loads where the cache line to be loaded is in the S (shared) state. 
A prefetch RFO will miss on an S state line, while a prefetch read will hit on an S state line."}, {"L2_DATA_RQSTS.PREFETCH.E_STATE", "Counts number of L2 prefetch data loads where the cache line to be loaded is in the E (exclusive) state."}, {"L2_DATA_RQSTS.PREFETCH.M_STATE", "Counts number of L2 prefetch data loads where the cache line to be loaded is in the M (modified) state."}, {"L2_DATA_RQSTS.PREFETCH.MESI", "Counts all L2 prefetch requests."}, {"L2_DATA_RQSTS.ANY", "Counts all L2 data requests."}, {"L2_WRITE.RFO.I_STATE", "Counts number of L2 demand store RFO requests where the cache line to be loaded is in the I (invalid) state, i.e. a cache miss. The L1D prefetcher does not issue a RFO prefetch. This is a demand RFO request."}, {"L2_WRITE.RFO.S_STATE", "Counts number of L2 store RFO requests where the cache line to be loaded is in the S (shared) state. The L1D prefetcher does not issue a RFO prefetch. This is a demand RFO request."}, {"L2_WRITE.RFO.E_STATE", "Counts number of L2 store RFO requests where the cache line to be loaded is in the E (exclusive) state. The L1D prefetcher does not issue a RFO prefetch. This is a demand RFO request."}, {"L2_WRITE.RFO.M_STATE", "Counts number of L2 store RFO requests where the cache line to be loaded is in the M (modified) state. The L1D prefetcher does not issue a RFO prefetch. This is a demand RFO request."}, {"L2_WRITE.RFO.HIT", "Counts number of L2 store RFO requests where the cache line to be loaded is in either the S, E or M states. The L1D prefetcher does not issue a RFO prefetch. This is a demand RFO request."}, {"L2_WRITE.RFO.MESI", "Counts all L2 store RFO requests. The L1D prefetcher does not issue a RFO prefetch. This is a demand RFO request."}, {"L2_WRITE.LOCK.I_STATE", "Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the I (invalid) state, i.e. 
a cache miss."}, {"L2_WRITE.LOCK.S_STATE", "Counts number of L2 lock RFO requests where the cache line to be loaded is in the S (shared) state."}, {"L2_WRITE.LOCK.E_STATE", "Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the E (exclusive) state."}, {"L2_WRITE.LOCK.M_STATE", "Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the M (modified) state."}, {"L2_WRITE.LOCK.HIT", "Counts number of L2 demand lock RFO requests where the cache line to be loaded is in either the S, E, or M state."}, {"L2_WRITE.LOCK.MESI", "Counts all L2 demand lock RFO requests."}, {"L1D_WB_L2.I_STATE", "Counts number of L1 writebacks to the L2 where the cache line to be written is in the I (invalid) state, i.e. a cache miss."}, {"L1D_WB_L2.S_STATE", "Counts number of L1 writebacks to the L2 where the cache line to be written is in the S state."}, {"L1D_WB_L2.E_STATE", "Counts number of L1 writebacks to the L2 where the cache line to be written is in the E (exclusive) state."}, {"L1D_WB_L2.M_STATE", "Counts number of L1 writebacks to the L2 where the cache line to be written is in the M (modified) state."}, {"L1D_WB_L2.MESI", "Counts all L1 writebacks to the L2."}, {"L3_LAT_CACHE.REFERENCE", "This event counts requests originating from the core that reference a cache line in the last level cache. The event count includes speculative traffic but excludes cache line fills due to a L2 hardware-prefetch. Because cache hierarchy, cache sizes and other implementation-specific characteristics; value comparison to estimate performance differences is not recommended."}, {"L3_LAT_CACHE.MISS", "This event counts each cache miss condition for references to the last level cache. The event count may include speculative traffic but excludes cache line fills due to L2 hardware-prefetches. 
Because of cache hierarchy, cache sizes and other implementation-specific characteristics, value comparison to estimate performance differences is not recommended."}, {"CPU_CLK_UNHALTED.THREAD_P", "Counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling."}, {"CPU_CLK_UNHALTED.REF_P", "Increments at the frequency of TSC when not halted."}, {"UOPS_DECODED.DEC0", "Counts micro-ops decoded by decoder 0."}, {"L1D_CACHE_LD.I_STATE", "Counts L1 data cache read requests where the cache line to be loaded is in the I (invalid) state, i.e. the read request missed the cache. Counter 0, 1 only."}, {"L1D_CACHE_LD.S_STATE", "Counts L1 data cache read requests where the cache line to be loaded is in the S (shared) state. Counter 0, 1 only."}, {"L1D_CACHE_LD.E_STATE", "Counts L1 data cache read requests where the cache line to be loaded is in the E (exclusive) state. Counter 0, 1 only."}, {"L1D_CACHE_LD.M_STATE", "Counts L1 data cache read requests where the cache line to be loaded is in the M (modified) state. Counter 0, 1 only."}, {"L1D_CACHE_LD.MESI", "Counts L1 data cache read requests. Counter 0, 1 only."}, {"L1D_CACHE_ST.I_STATE", "Counts L1 data cache store RFO requests where the cache line to be loaded is in the I state. Counter 0, 1 only."}, {"L1D_CACHE_ST.S_STATE", "Counts L1 data cache store RFO requests where the cache line to be loaded is in the S (shared) state. Counter 0, 1 only."}, {"L1D_CACHE_ST.E_STATE", "Counts L1 data cache store RFO requests where the cache line to be loaded is in the E (exclusive) state. Counter 0, 1 only."}, {"L1D_CACHE_ST.M_STATE", "Counts L1 data cache store RFO requests where the cache line to be loaded is in the M (modified) state. Counter 0, 1 only."}, {"L1D_CACHE_ST.MESI", "Counts L1 data cache store RFO requests.
Counter 0, 1 only."}, {"L1D_CACHE_LOCK.HIT", "Counts retired load locks that hit in the L1 data cache or hit in an already allocated fill buffer. The lock portion of the load lock transaction must hit in the L1D. The initial load will pull the lock into the L1 data cache. Counter 0, 1 only."}, {"L1D_CACHE_LOCK.S_STATE", "Counts L1 data cache retired load locks that hit the target cache line in the shared state. Counter 0, 1 only."}, {"L1D_CACHE_LOCK.E_STATE", "Counts L1 data cache retired load locks that hit the target cache line in the exclusive state. Counter 0, 1 only."}, {"L1D_CACHE_LOCK.M_STATE", "Counts L1 data cache retired load locks that hit the target cache line in the modified state. Counter 0, 1 only."}, {"L1D_ALL_REF.ANY", "Counts all references (uncached, speculated and retired) to the L1 data cache, including all loads and stores with any memory types. The event counts memory accesses only when they are actually performed. For example, a load blocked by unknown store address and later performed is only counted once. The event does not include non- memory accesses, such as I/O accesses. Counter 0, 1 only."}, {"L1D_ALL_REF.CACHEABLE", "Counts all data reads and writes (speculated and retired) from cacheable memory, including locked operations. Counter 0, 1 only."}, {"L1D_PEND_MISS.LOAD_BUFFERS_FULL", "Counts cycles of L1 data cache load fill buffers full. Counter 0, 1 only."}, {"DTLB_MISSES.ANY", "Counts the number of misses in the STLB which causes a page walk."}, {"DTLB_MISSES.WALK_COMPLETED", "Counts number of misses in the STLB which resulted in a completed page walk."}, {"DTLB_MISSES.STLB_HIT", "Counts the number of DTLB first level misses that hit in the second level TLB. 
This event is only relevant if the core contains multiple DTLB levels."}, {"DTLB_MISSES.PDE_MISS", "Number of DTLB cache misses where the low part of the linear to physical address translation was missed."}, {"DTLB_MISSES.PDP_MISS", "Number of DTLB misses where the high part of the linear to physical address translation was missed."}, {"DTLB_MISSES.LARGE_WALK_COMPLETED", "Counts number of completed large page walks due to misses in the STLB."}, {"SSE_MEM_EXEC.NTA", "Counts number of SSE NTA prefetch/weakly-ordered instructions which missed the L1 data cache."}, {"SSE_MEM_EXEC.STREAMING_STORES", "Counts number of SSE non-temporal stores."}, {"LOAD_HIT_PRE", "Counts load operations sent to the L1 data cache while a previous SSE prefetch instruction to the same cache line has started prefetching but has not yet finished."}, {"SFENCE_CYCLES", "Counts store fence cycles."}, {"L1D_PREFETCH.REQUESTS", "Counts number of hardware prefetch requests dispatched out of the prefetch FIFO."}, {"L1D_PREFETCH.MISS", "Counts number of hardware prefetch requests that miss the L1D. There are two prefetchers in the L1D. A streamer, which predicts lines sequentially after this one should be fetched, and the IP prefetcher that remembers access patterns for the current instruction. The streamer prefetcher stops on an L1D hit, while the IP prefetcher does not."}, {"L1D_PREFETCH.TRIGGERS", "Counts number of prefetch requests triggered by the Finite State Machine and pushed into the prefetch FIFO. Some of the prefetch requests are dropped due to overwrites or competition between the IP index prefetcher and streamer prefetcher. The prefetch FIFO contains 4 entries."}, {"EPT.EPDE_MISS", "Counts Extended Page Directory Entry misses.
The Extended Page Directory cache is used by Virtual Machine operating systems while the guest operating systems use the standard TLB caches."}, {"EPT.EPDPE_HIT", "Counts Extended Page Directory Pointer Entry hits."}, {"EPT.EPDPE_MISS", "Counts Extended Page Directory Pointer Entry misses."}, {"L1D.REPL", "Counts the number of lines brought into the L1 data cache. Counter 0, 1 only."}, {"L1D.M_REPL", "Counts the number of modified lines brought into the L1 data cache. Counter 0, 1 only."}, {"L1D.M_EVICT", "Counts the number of modified lines evicted from the L1 data cache due to replacement. Counter 0, 1 only."}, {"L1D.M_SNOOP_EVICT", "Counts the number of modified lines evicted from the L1 data cache due to snoop HITM intervention. Counter 0, 1 only."}, {"L1D_CACHE_PREFETCH_LOCK_FB_HIT", "Counts the number of cacheable load lock speculated instructions accepted into the fill buffer."}, {"L1D_CACHE_LOCK_FB_HIT", "Counts the number of cacheable load lock speculated or retired instructions accepted into the fill buffer."}, {"OFFCORE_REQUESTS_OUTSTANDING.DEMAND.READ_DATA", "Counts weighted cycles of offcore demand data read requests. Does not include L2 prefetch requests."}, {"OFFCORE_REQUESTS_OUTSTANDING.DEMAND.READ_CODE", "Counts weighted cycles of offcore demand code read requests. Does not include L2 prefetch requests."}, {"OFFCORE_REQUESTS_OUTSTANDING.DEMAND.RFO", "Counts weighted cycles of offcore demand RFO requests. Does not include L2 prefetch requests."}, {"OFFCORE_REQUESTS_OUTSTANDING.ANY.READ", "Counts weighted cycles of offcore read requests of any kind. Includes L2 prefetch requests."}, {"CACHE_LOCK_CYCLES.L1D_L2", "Cycle count during which the L1D and L2 are locked. A lock is asserted when there is a locked memory access, due to uncacheable memory, a locked operation that spans two cache lines, or a page walk from an uncacheable page table.
Counter 0, 1 only. L1D and L2 locks have a very high performance penalty and it is highly recommended to avoid such accesses."}, {"CACHE_LOCK_CYCLES.L1D", "Counts the number of cycles that a cache line in the L1 data cache unit is locked. Counter 0, 1 only."}, {"IO_TRANSACTIONS", "Counts the number of completed I/O transactions."}, {"L1I.HITS", "Counts all instruction fetches that hit the L1 instruction cache."}, {"L1I.MISSES", "Counts all instruction fetches that miss the L1I cache. This includes instruction cache misses, streaming buffer misses, victim cache misses and uncacheable fetches. An instruction fetch miss is counted only once and not once for every cycle it is outstanding."}, {"L1I.READS", "Counts all instruction fetches, including uncacheable fetches that bypass the L1I."}, {"L1I.CYCLES_STALLED", "Cycle counts for which an instruction fetch stalls due to an L1I cache miss, ITLB miss or ITLB fault."}, {"IFU_IVC.FULL", "Instruction Fetch unit victim cache full."}, {"IFU_IVC.L1I_EVICTION", "L1 Instruction cache evictions."}, {"LARGE_ITLB.HIT", "Counts number of large ITLB hits."}, {"L1I_OPPORTUNISTIC_HITS", "Opportunistic hits in streaming."}, {"ITLB_MISSES.ANY", "Counts the number of misses in all levels of the ITLB which causes a page walk."}, {"ITLB_MISSES.WALK_COMPLETED", "Counts number of misses in all levels of the ITLB which resulted in a completed page walk."}, {"ITLB_MISSES.WALK_CYCLES", "Counts ITLB miss page walk cycles."}, {"ITLB_MISSES.STLB_HIT", "Counts the number of ITLB misses that hit in the second level TLB."}, {"ITLB_MISSES.PDE_MISS", "Number of ITLB misses where the low part of the linear to physical address translation was missed."}, {"ITLB_MISSES.PDP_MISS", "Number of ITLB misses where the high part of the linear to physical address translation was missed."}, {"ITLB_MISSES.LARGE_WALK_COMPLETED", "Counts number of completed large page walks due to misses in the STLB."}, {"ILD_STALL.ANY", ""}, {"ILD_STALL.IQ_FULL", ""}, {"ILD_STALL.LCP",
"Cycles Instruction Length Decoder stalls due to length changing prefixes: 66, 67 or REX.W (for EM64T) instructions which change the length of the decoded instruction."}, {"ILD_STALL.MRU", ""}, {"ILD_STALL.REGEN", ""}, {"BR_INST_EXEC.ANY", "Counts all near executed branches (not necessarily retired). This includes only instructions and not micro-op branches. Frequent branching is not necessarily a major performance issue. However frequent branch mispredictions may be a problem."}, {"BR_INST_EXEC.COND", ""}, {"BR_INST_EXEC.DIRECT", ""}, {"BR_INST_EXEC.DIRECT_NEAR_CALL", ""}, {"BR_INST_EXEC.INDIRECT_NEAR_CALL", ""}, {"BR_INST_EXEC.INDIRECT_NON_CALL", ""}, {"BR_INST_EXEC.NEAR_CALLS", ""}, {"BR_INST_EXEC.NON_CALLS", ""}, {"BR_INST_EXEC.RETURN_NEAR", ""}, {"BR_INST_EXEC.TAKEN", ""}, {"BR_MISP_EXEC.COND", "Counts the number of mispredicted conditional near branch instructions executed, but not necessarily retired."}, {"BR_MISP_EXEC.DIRECT", "Counts mispredicted macro unconditional near branch instructions, excluding calls and indirect branches (should always be 0)."}, {"BR_MISP_EXEC.INDIRECT_NON_CALL", "Counts the number of executed mispredicted indirect near branch instructions that are not calls."}, {"BR_MISP_EXEC.NON_CALLS", "Counts mispredicted non call near branches executed, but not necessarily retired."}, {"BR_MISP_EXEC.RETURN_NEAR", "Counts mispredicted indirect branches that have a near return mnemonic."}, {"BR_MISP_EXEC.DIRECT_NEAR_CALL", "Counts mispredicted non-indirect near calls executed (should always be 0)."}, {"BR_MISP_EXEC.INDIRECT_NEAR_CALL", "Counts mispredicted indirect near calls executed, including both register and memory indirect."}, {"BR_MISP_EXEC.NEAR_CALLS", "Counts all mispredicted near call branches executed, but not necessarily retired."}, {"BR_MISP_EXEC.TAKEN", "Counts executed mispredicted near branches that are taken, but not necessarily retired."}, {"BR_MISP_EXEC.ANY", "Counts the number of mispredicted near branch instructions that
were executed, but not necessarily retired."}, {"RESOURCE_STALLS.ANY", "Counts the number of Allocator resource related stalls. Includes register renaming buffer entries, memory buffer entries. In addition to resource related stalls, this event counts some other events. Includes stalls arising during branch misprediction recovery, such as if retirement of the mispredicted branch is delayed and stalls arising while store buffer is draining from synchronizing operations. Does not include stalls due to SuperQ (off core) queue full, too many cache misses, etc."}, {"RESOURCE_STALLS.LOAD", "Counts the cycles of stall due to lack of load buffer for load operation."}, {"RESOURCE_STALLS.RS_FULL", "This event counts the number of cycles when the number of instructions in the pipeline waiting for execution reaches the limit the processor can handle. A high count of this event indicates that there are long latency operations in the pipe (possibly load and store operations that miss the L2 cache, or instructions dependent upon instructions further down the pipeline that have yet to retire). When RS is full, new instructions can not enter the reservation station and start execution."}, {"RESOURCE_STALLS.STORE", "This event counts the number of cycles that a resource related stall will occur due to the number of store instructions reaching the limit of the pipeline (i.e. all store buffers are used). The stall ends when a store instruction commits its data to the cache or memory."}, {"RESOURCE_STALLS.ROB_FULL", "Counts the cycles of stall due to re-order buffer full."}, {"RESOURCE_STALLS.FPCW", "Counts the number of cycles while execution was stalled due to writing the floating-point unit (FPU) control word."}, {"RESOURCE_STALLS.MXCSR", "Stalls due to the MXCSR register rename occurring too close to a previous MXCSR rename.
The MXCSR provides control and status for the MMX registers."}, {"RESOURCE_STALLS.OTHER", "Counts the number of cycles while execution was stalled due to other resource issues."}, {"MACRO_INSTS.FUSIONS_DECODED", "Counts the number of instructions decoded that are macro-fused but not necessarily executed or retired."}, {"BACLEAR_FORCE_IQ", "Counts number of times a BACLEAR was forced by the Instruction Queue. The IQ is also responsible for providing conditional branch prediction direction based on a static scheme and dynamic data provided by the L2 Branch Prediction Unit. If the conditional branch target is not found in the Target Array and the IQ predicts that the branch is taken, then the IQ will force the Branch Address Calculator to issue a BACLEAR. Each BACLEAR asserted by the BAC generates approximately an 8 cycle bubble in the instruction fetch pipeline."}, {"LSD.UOPS", "Counts the number of micro-ops delivered by loop stream detector. Use cmask=1 and invert to count cycles."}, {"ITLB.FLUSH", "Counts the number of ITLB flushes."}, {"OFFCORE_REQUESTS.DEMAND.READ_DATA", "Counts number of offcore demand data read requests. Does not count L2 prefetch requests."}, {"OFFCORE_REQUESTS.DEMAND.READ_CODE", "Counts number of offcore demand code read requests. Does not count L2 prefetch requests."}, {"OFFCORE_REQUESTS.DEMAND.RFO", "Counts number of offcore demand RFO requests. Does not count L2 prefetch requests."}, {"OFFCORE_REQUESTS.ANY.READ", "Counts number of offcore read requests. Includes L2 prefetch requests."}, {"OFFCORE_REQUESTS.ANY.RFO", "Counts number of offcore RFO requests. Includes L2 prefetch requests."}, {"OFFCORE_REQUESTS.UNCACHED_MEM", "Counts number of offcore uncached memory requests."}, {"OFFCORE_REQUESTS.L1D_WRITEBACK", "Counts number of L1D writebacks to the uncore."}, {"OFFCORE_REQUESTS.ANY", "Counts all offcore requests."}, {"UOPS_EXECUTED.PORT0", "Counts number of Uops executed that were issued on port 0.
Port 0 handles integer arithmetic, SIMD and FP add Uops."}, {"UOPS_EXECUTED.PORT1", "Counts number of Uops executed that were issued on port 1. Port 1 handles integer arithmetic, SIMD, integer shift, FP multiply and FP divide Uops."}, {"UOPS_EXECUTED.PORT2_CORE", "Counts number of Uops executed that were issued on port 2. Port 2 handles the load Uops. This is a core count only and can not be collected per thread."}, {"UOPS_EXECUTED.PORT3_CORE", "Counts number of Uops executed that were issued on port 3. Port 3 handles store Uops. This is a core count only and can not be collected per thread."}, {"UOPS_EXECUTED.PORT4_CORE", "Counts number of Uops executed that were issued on port 4. Port 4 handles the value to be stored for the store Uops issued on port 3. This is a core count only and can not be collected per thread."}, {"UOPS_EXECUTED.PORT5", "Counts number of Uops executed that were issued on port 5."}, {"UOPS_EXECUTED.CORE_ACTIVE_CYCLES", "Counts cycles when the Uops are executing."}, {"UOPS_EXECUTED.PORT015", "Counts number of Uops executed that were issued on port 0, 1, or 5. Use cmask=1, invert=1 to count stall cycles."}, {"UOPS_EXECUTED.PORT234", "Counts number of Uops executed that were issued on port 2, 3, or 4."}, {"OFFCORE_REQUESTS_SQ_FULL", "Counts number of cycles the SQ is full to handle off-core requests."}, {"SNOOPQ_REQUESTS_OUTSTANDING.DATA", "Counts weighted cycles of snoopq requests for data. Counter 0 only. Use cmask=1 to count cycles not empty."}, {"SNOOPQ_REQUESTS_OUTSTANDING.INVALIDATE", "Counts weighted cycles of snoopq invalidate requests. Counter 0 only. Use cmask=1 to count cycles not empty."}, {"SNOOPQ_REQUESTS_OUTSTANDING.CODE", "Counts weighted cycles of snoopq requests for code.
Counter 0 only. Use cmask=1 to count cycles not empty."}, {"OFF_CORE_RESPONSE_0", "See Section 19.17.1.3, 'Off-core Response Performance Monitoring in the Processor Core'."}, {"SNOOP_RESPONSE.HIT", "Counts HIT snoop response sent by this thread in response to a snoop request."}, {"SNOOP_RESPONSE.HITE", "Counts HIT E snoop response sent by this thread in response to a snoop request."}, {"SNOOP_RESPONSE.HITM", "Counts HIT M snoop response sent by this thread in response to a snoop request."}, {"PIC_ACCESSES.TPR_READS", "Counts number of TPR reads."}, {"PIC_ACCESSES.TPR_WRITES", "Counts number of TPR writes."}, {"INST_RETIRED.ANY_P", "See Table A-1 Notes: INST_RETIRED.ANY is counted by a designated fixed counter. INST_RETIRED.ANY_P is counted by a programmable counter and is an architectural performance event. Event is supported if CPUID.A.EBX[1] = 0. Counting: Faulting executions of GETSEC/VM entry/VM Exit/MWait will not count as retired instructions."}, {"INST_RETIRED.X87", "Counts the number of floating point computational operations retired: floating point computational operations executed by the assist handler and sub-operations of complex floating point instructions like transcendental instructions."}, {"UOPS_RETIRED.ANY", "Counts the number of micro-ops retired (macro-fused=1, micro-fused=2, others=1; maximum count of 8 per cycle). Most instructions are composed of one or two micro-ops. Some instructions are decoded into longer sequences such as repeat instructions, floating point transcendental instructions, and assists.
Use cmask=1 and invert to count active cycles or stalled cycles."}, {"UOPS_RETIRED.RETIRE_SLOTS", "Counts the number of retirement slots used each cycle."}, {"UOPS_RETIRED.MACRO_FUSED", "Counts number of macro-fused uops retired."}, {"MACHINE_CLEARS.CYCLES", "Counts the cycles machine clear is asserted."}, {"MACHINE_CLEARS.MEM_ORDER", "Counts the number of machine clears due to memory order conflicts."}, {"MACHINE_CLEARS.SMC", "Counts the number of times that a program writes to a code section. Self-modifying code causes a severe penalty in all Intel 64 and IA-32 processors. The modified cache line is written back to the L2 and L3 caches."}, {"MACHINE_CLEARS.FUSION_ASSIST", "Counts the number of macro-fusion assists."}, {"BR_INST_RETIRED.ALL_BRANCHES", "See Table A-1."}, {"BR_INST_RETIRED.CONDITIONAL", "Counts the number of conditional branch instructions retired."}, {"BR_INST_RETIRED.NEAR_CALL", "Counts the number of direct & indirect near unconditional calls retired."}, {"BR_MISP_RETIRED.ALL_BRANCHES", "See Table A-1."}, {"BR_MISP_RETIRED.NEAR_CALL", "Counts mispredicted direct & indirect near unconditional retired calls."}, {"SSEX_UOPS_RETIRED.PACKED_SINGLE", "Counts SIMD packed single-precision floating point Uops retired."}, {"SSEX_UOPS_RETIRED.SCALAR_SINGLE", "Counts SIMD scalar single-precision floating point Uops retired."}, {"SSEX_UOPS_RETIRED.PACKED_DOUBLE", "Counts SIMD packed double-precision floating point Uops retired."}, {"SSEX_UOPS_RETIRED.SCALAR_DOUBLE", "Counts SIMD scalar double-precision floating point Uops retired."}, {"SSEX_UOPS_RETIRED.VECTOR_INTEGER", "Counts 128-bit SIMD vector integer Uops retired."}, {"ITLB_MISS_RETIRED", "Counts the number of retired instructions that missed the ITLB when the instruction was fetched."}, {"MEM_LOAD_RETIRED.L1D_HIT", "Counts number of retired loads that hit the L1 data cache."}, {"MEM_LOAD_RETIRED.L2_HIT", "Counts number of retired loads that hit the L2 data cache."},
{"MEM_LOAD_RETIRED.OTHER_CORE_L2_HIT_HITM", "Counts number of retired loads that hit in a sibling core's L2 (on die core). Since the L3 is inclusive of all cores on the package, this is an L3 hit. This counts both clean and modified hits."}, {"MEM_LOAD_RETIRED.HIT_LFB", "Counts number of retired loads that miss the L1D and the address is located in an allocated line fill buffer and will soon be committed to cache. This is counting secondary L1D misses."}, {"MEM_LOAD_RETIRED.DTLB_MISS", "Counts the number of retired loads that missed the DTLB. The DTLB miss is not counted if the load operation causes a fault. This event counts loads from cacheable memory only. The event does not count loads by software prefetches. Counts both primary and secondary misses to the TLB."}, {"MEM_LOAD_RETIRED.L3_MISS", "Counts number of retired loads that miss the L3 cache."}, {"MEM_LOAD_RETIRED.L3_UNSHARED_HIT", "Counts number of retired loads that hit their own, unshared lines in the L3 cache."}, {"FP_MMX_TRANS.TO_FP", "Counts the first floating-point instruction following any MMX instruction. You can use this event to estimate the penalties for the transitions between floating-point and MMX technology states."}, {"FP_MMX_TRANS.TO_MMX", "Counts the first MMX instruction following a floating-point instruction. You can use this event to estimate the penalties for the transitions between floating-point and MMX technology states."}, {"FP_MMX_TRANS.ANY", "Counts all transitions from floating point to MMX instructions and from MMX instructions to floating point instructions. You can use this event to estimate the penalties for the transitions between floating-point and MMX technology states."}, {"MACRO_INSTS.DECODED", "Counts the number of instructions decoded (but not necessarily executed or retired)."}, {"UOPS_DECODED.MS", "Counts the number of Uops decoded by the Microcode Sequencer, MS.
The MS delivers uops when the instruction is more than 4 uops long or a microcode assist is occurring."}, {"UOPS_DECODED.ESP_FOLDING", "Counts number of stack pointer (ESP) instructions decoded: push, pop, call, ret, etc. ESP instructions do not generate a Uop to increment or decrement ESP. Instead, they update an ESP_Offset register that keeps track of the delta to the current value of the ESP register."}, {"UOPS_DECODED.ESP_SYNC", "Counts number of stack pointer (ESP) sync operations where an ESP instruction is corrected by adding the ESP offset register to the current value of the ESP register."}, {"RAT_STALLS.FLAGS", "Counts the number of cycles during which execution stalled due to several reasons, one of which is a partial flag register stall. A partial register stall may occur when two conditions are met: 1) an instruction modifies some, but not all, of the flags in the flag register and 2) the next instruction, which depends on flags, depends on flags that were not modified by this instruction."}, {"RAT_STALLS.REGISTERS", "This event counts the number of cycles instruction execution latency became longer than the defined latency because the instruction used a register that was partially written by previous instruction."}, {"RAT_STALLS.ROB_READ_PORT", "Counts the number of cycles when ROB read port stalls occurred, which did not allow new micro-ops to enter the out-of-order pipeline. Note that, at this stage in the pipeline, additional stalls may occur at the same cycle and prevent the stalled micro-ops from entering the pipe. In such a case, micro-ops retry entering the execution pipe in the next cycle and the ROB-read port stall is counted again."}, {"RAT_STALLS.SCOREBOARD", "Counts the cycles where we stall due to microarchitecturally required serialization.
Microcode scoreboarding stalls."}, {"RAT_STALLS.ANY", "Counts all Register Allocation Table stall cycles due to: Cycles when ROB read port stalls occurred, which did not allow new micro-ops to enter the execution pipe. Cycles when partial register stalls occurred. Cycles when flag stalls occurred. Cycles when floating-point unit (FPU) status word stalls occurred. To count each of these conditions separately use the events: RAT_STALLS.ROB_READ_PORT, RAT_STALLS.PARTIAL, RAT_STALLS.FLAGS, and RAT_STALLS.FPSW."}, {"SEG_RENAME_STALLS", "Counts the number of stall cycles due to the lack of renaming resources for the ES, DS, FS, and GS segment registers. If a segment is renamed but not retired and a second update to the same segment occurs, a stall occurs in the front-end of the pipeline until the renamed segment retires."}, {"ES_REG_RENAMES", "Counts the number of times the ES segment register is renamed."}, {"UOP_UNFUSION", "Counts unfusion events due to floating point exception to a fused uop."}, {"BR_INST_DECODED", "Counts the number of branch instructions decoded."}, {"BOGUS_BR", "Counts the number of bogus branches."}, {"BPU_MISSED_CALL_RET", "Counts number of times the Branch Prediction Unit missed predicting a call or return branch."}, {"L2_HW_PREFETCH.DATA_TRIGGER", "Count L2 HW data prefetcher triggered."}, {"L2_HW_PREFETCH.CODE_TRIGGER", "Count L2 HW code prefetcher triggered."}, {"L2_HW_PREFETCH.DCA_TRIGGER", "Count L2 HW DCA prefetcher triggered."}, {"L2_HW_PREFETCH.KICK_START", "Count L2 HW prefetcher kick started."}, {"SQ_MISC.PROMOTION", "Counts the number of L2 secondary misses that hit the Super Queue."}, {"SQ_MISC.PROMOTION_POST_GO", "Counts the number of L2 secondary misses during the Super Queue filling L2."}, {"SQ_MISC.LRU_HINTS", "Counts number of Super Queue LRU hints sent to L3."}, {"SQ_MISC.FILL_DROPPED", "Counts the number of SQ L2 fills dropped due to L2 busy."}, {"SQ_MISC.SPLIT_LOCK", "Counts the number of SQ lock splits across a cache line."},
{"SQ_FULL_STALL_CYCLES", "Counts cycles the Super Queue is full. Neither of the threads on this core will be able to access the uncore."}, {"FP_ASSIST.ALL", "Counts the number of floating point operations executed that required micro-code assist intervention. Assists are required in the following cases: SSE instructions (denormal input when the DAZ flag is off or underflow result when the FTZ flag is off); x87 instructions (NaN or denormal are loaded to a register or used as input from memory, division by 0 or underflow output)."}, {"FP_ASSIST.OUTPUT", "Counts number of floating point micro-code assist when the output value (destination register) is invalid."}, {"FP_ASSIST.INPUT", "Counts number of floating point micro-code assist when the input value (one of the source operands to an FP instruction) is invalid."}, {"SEGMENT_REG_LOADS", "Counts number of segment register loads."}, {"SIMD_INT_64.PACKED_MPY", "Counts number of SIMD integer 64 bit packed multiply operations."}, {"SIMD_INT_64.PACKED_SHIFT", "Counts number of SIMD integer 64 bit packed shift operations."}, {"SIMD_INT_64.PACK", "Counts number of SIMD integer 64 bit pack operations."}, {"SIMD_INT_64.UNPACK", "Counts number of SIMD integer 64 bit unpack operations."}, {"SIMD_INT_64.PACKED_LOGICAL", "Counts number of SIMD integer 64 bit logical operations."}, {"SIMD_INT_64.PACKED_ARITH", "Counts number of SIMD integer 64 bit arithmetic operations."}, {"SIMD_INT_64.SHUFFLE_MOVE", "Counts number of SIMD integer 64 bit shuffle or move operations."}, {"INSTR_RETIRED_ANY", "Instructions retired (IAF)"}, {"CPU_CLK_UNHALTED_CORE", "Unhalted core cycles (IAF)"}, {"CPU_CLK_UNHALTED_REF", "Unhalted reference cycles (IAF)"}, {"GQ_CYCLES_FULL.READ_TRACKER", "Uncore cycles Global Queue read tracker is full."}, {"GQ_CYCLES_FULL.WRITE_TRACKER", "Uncore cycles Global Queue write tracker is full."}, {"GQ_CYCLES_FULL.PEER_PROBE_TRACKER", "Uncore cycles Global Queue peer probe tracker is full.
The peer probe tracker queue tracks snoops from the IOH and remote sockets."}, {"GQ_CYCLES_NOT_EMPTY.READ_TRACKER", "Uncore cycles where Global Queue read tracker has at least one valid entry."}, {"GQ_CYCLES_NOT_EMPTY.WRITE_TRACKER", "Uncore cycles where Global Queue write tracker has at least one valid entry."}, {"GQ_CYCLES_NOT_EMPTY.PEER_PROBE_TRACKER", "Uncore cycles where Global Queue peer probe tracker has at least one valid entry. The peer probe tracker queue tracks IOH and remote socket snoops."}, {"GQ_ALLOC.READ_TRACKER", "Counts the number of read tracker allocate to deallocate entries. The GQ read tracker allocate to deallocate occupancy count is divided by the count to obtain the average read tracker latency."}, {"GQ_ALLOC.RT_L3_MISS", "Counts the number of GQ read tracker entries for which a full cache line read has missed the L3. The GQ read tracker L3 miss to fill occupancy count is divided by this count to obtain the average cache line read L3 miss latency. The latency represents the time after which the L3 has determined that the cache line has missed. The time between a GQ read tracker allocation and the L3 determining that the cache line has missed is the average L3 hit latency. The total L3 cache line read miss latency is the hit latency + L3 miss latency."}, {"GQ_ALLOC.RT_TO_L3_RESP", "Counts the number of GQ read tracker entries that are allocated in the read tracker queue that hit or miss the L3. The GQ read tracker L3 hit occupancy count is divided by this count to obtain the average L3 hit latency."}, {"GQ_ALLOC.RT_TO_RTID_ACQUIRED", "Counts the number of GQ read tracker entries that are allocated in the read tracker, have missed in the L3 and have not acquired a Request Transaction ID.
The GQ read tracker L3 miss to RTID acquired occupancy count is divided by this count to obtain the average latency for a read L3 miss to acquire an RTID."}, {"GQ_ALLOC.WT_TO_RTID_ACQUIRED", "Counts the number of GQ write tracker entries that are allocated in the write tracker, have missed in the L3 and have not acquired a Request Transaction ID. The GQ write tracker L3 miss to RTID occupancy count is divided by this count to obtain the average latency for a write L3 miss to acquire an RTID."}, {"GQ_ALLOC.WRITE_TRACKER", "Counts the number of GQ write tracker entries that are allocated in the write tracker queue that miss the L3. The GQ write tracker occupancy count is divided by this count to obtain the average L3 write miss latency."}, {"GQ_ALLOC.PEER_PROBE_TRACKER", "Counts the number of GQ peer probe tracker (snoop) entries that are allocated in the peer probe tracker queue that miss the L3. The GQ peer probe occupancy count is divided by this count to obtain the average L3 peer probe miss latency."}, {"GQ_DATA.FROM_QPI", "Cycles Global Queue Quickpath Interface input data port is busy importing data from the Quickpath Interface. Each cycle the input port can transfer 8 or 16 bytes of data."}, {"GQ_DATA.FROM_QMC", "Cycles Global Queue Quickpath Memory Interface input data port is busy importing data from the Quickpath Memory Interface. Each cycle the input port can transfer 8 or 16 bytes of data."}, {"GQ_DATA.FROM_L3", "Cycles GQ L3 input data port is busy importing data from the Last Level Cache. Each cycle the input port can transfer 32 bytes of data."}, {"GQ_DATA.FROM_CORES_02", "Cycles GQ Core 0 and 2 input data port is busy importing data from processor cores 0 and 2. Each cycle the input port can transfer 32 bytes of data."}, {"GQ_DATA.FROM_CORES_13", "Cycles GQ Core 1 and 3 input data port is busy importing data from processor cores 1 and 3.
Each cycle the input port can transfer 32 bytes of data."}, {"GQ_DATA.TO_QPI_QMC", "Cycles GQ QPI and QMC output data port is busy sending data to the Quickpath Interface or Quickpath Memory Interface. Each cycle the output port can transfer 32 bytes of data."}, {"GQ_DATA.TO_L3", "Cycles GQ L3 output data port is busy sending data to the Last Level Cache. Each cycle the output port can transfer 32 bytes of data."}, {"GQ_DATA.TO_CORES", "Cycles GQ Core output data port is busy sending data to the Cores. Each cycle the output port can transfer 32 bytes of data."}, {"SNP_RESP_TO_LOCAL_HOME.I_STATE", "Number of snoop responses to the local home that L3 does not have the referenced cache line."}, {"SNP_RESP_TO_LOCAL_HOME.S_STATE", "Number of snoop responses to the local home that L3 has the referenced line cached in the S state."}, {"SNP_RESP_TO_LOCAL_HOME.FWD_S_STATE", "Number of responses to code or data read snoops to the local home that the L3 has the referenced cache line in the E state. The L3 cache line state is changed to the S state and the line is forwarded to the local home in the S state."}, {"SNP_RESP_TO_LOCAL_HOME.FWD_I_STATE", "Number of responses to read invalidate snoops to the local home that the L3 has the referenced cache line in the M state. 
The L3 cache line state is invalidated and the line is forwarded to the local home in the M state."}, {"SNP_RESP_TO_LOCAL_HOME.CONFLICT", "Number of conflict snoop responses sent to the local home."}, {"SNP_RESP_TO_LOCAL_HOME.WB", "Number of responses to code or data read snoops to the local home that the L3 has the referenced line cached in the M state."}, {"SNP_RESP_TO_REMOTE_HOME.I_STATE", "Number of snoop responses to a remote home that L3 does not have the referenced cache line."}, {"SNP_RESP_TO_REMOTE_HOME.S_STATE", "Number of snoop responses to a remote home that L3 has the referenced line cached in the S state."}, {"SNP_RESP_TO_REMOTE_HOME.FWD_S_STATE", "Number of responses to code or data read snoops to a remote home that the L3 has the referenced cache line in the E state. The L3 cache line state is changed to the S state and the line is forwarded to the remote home in the S state."}, {"SNP_RESP_TO_REMOTE_HOME.FWD_I_STATE", "Number of responses to read invalidate snoops to a remote home that the L3 has the referenced cache line in the M state. The L3 cache line state is invalidated and the line is forwarded to the remote home in the M state."}, {"SNP_RESP_TO_REMOTE_HOME.CONFLICT", "Number of conflict snoop responses sent to the remote home."}, {"SNP_RESP_TO_REMOTE_HOME.WB", "Number of responses to code or data read snoops to a remote home that the L3 has the referenced line cached in the M state."}, {"SNP_RESP_TO_REMOTE_HOME.HITM", "Number of HITM snoop responses to a remote home."}, {"L3_HITS.READ", "Number of code read, data read and RFO requests that hit in the L3."}, {"L3_HITS.WRITE", "Number of writeback requests that hit in the L3. 
Writebacks from the cores will always result in L3 hits due to the inclusive property of the L3."}, {"L3_HITS.PROBE", "Number of snoops from IOH or remote sockets that hit in the L3."}, {"L3_HITS.ANY", "Number of reads and writes that hit the L3."}, {"L3_MISS.READ", "Number of code read, data read and RFO requests that miss the L3."}, {"L3_MISS.WRITE", "Number of writeback requests that miss the L3. Should always be zero as writebacks from the cores will always result in L3 hits due to the inclusive property of the L3."}, {"L3_MISS.PROBE", "Number of snoops from IOH or remote sockets that miss the L3."}, {"L3_MISS.ANY", "Number of reads and writes that miss the L3."}, {"L3_LINES_IN.M_STATE", "Counts the number of L3 lines allocated in M state. The only time a cache line is allocated in the M state is when the line was forwarded in M state due to a Snoop Read Invalidate Own request."}, {"L3_LINES_IN.E_STATE", "Counts the number of L3 lines allocated in E state."}, {"L3_LINES_IN.S_STATE", "Counts the number of L3 lines allocated in S state."}, {"L3_LINES_IN.F_STATE", "Counts the number of L3 lines allocated in F state."}, {"L3_LINES_IN.ANY", "Counts the number of L3 lines allocated in any state."}, {"L3_LINES_OUT.M_STATE", "Counts the number of L3 lines victimized that were in the M state. 
When the victim cache line is in M state, the line is written to its home cache agent which can be either local or remote."}, {"L3_LINES_OUT.E_STATE", "Counts the number of L3 lines victimized that were in the E state."}, {"L3_LINES_OUT.S_STATE", "Counts the number of L3 lines victimized that were in the S state."}, {"L3_LINES_OUT.I_STATE", "Counts the number of L3 lines victimized that were in the I state."}, {"L3_LINES_OUT.F_STATE", "Counts the number of L3 lines victimized that were in the F state."}, {"L3_LINES_OUT.ANY", "Counts the number of L3 lines victimized in any state."}, {"QHL_REQUESTS.IOH_READS", "Counts number of Quickpath Home Logic read requests from the IOH."}, {"QHL_REQUESTS.IOH_WRITES", "Counts number of Quickpath Home Logic write requests from the IOH."}, {"QHL_REQUESTS.REMOTE_READS", "Counts number of Quickpath Home Logic read requests from a remote socket."}, {"QHL_REQUESTS.REMOTE_WRITES", "Counts number of Quickpath Home Logic write requests from a remote socket."}, {"QHL_REQUESTS.LOCAL_READS", "Counts number of Quickpath Home Logic read requests from the local socket."}, {"QHL_REQUESTS.LOCAL_WRITES", "Counts number of Quickpath Home Logic write requests from the local socket."}, {"QHL_CYCLES_FULL.IOH", "Counts uclk cycles all entries in the Quickpath Home Logic IOH are full."}, {"QHL_CYCLES_FULL.REMOTE", "Counts uclk cycles all entries in the Quickpath Home Logic remote tracker are full."}, {"QHL_CYCLES_FULL.LOCAL", "Counts uclk cycles all entries in the Quickpath Home Logic local tracker are full."}, {"QHL_CYCLES_NOT_EMPTY.IOH", "Counts uclk cycles the Quickpath Home Logic IOH tracker is busy."}, {"QHL_CYCLES_NOT_EMPTY.REMOTE", "Counts uclk cycles the Quickpath Home Logic remote tracker is busy."}, {"QHL_CYCLES_NOT_EMPTY.LOCAL", "Counts uclk cycles the Quickpath Home Logic local tracker is busy."}, {"QHL_OCCUPANCY.IOH", "QHL IOH tracker allocate to deallocate read occupancy."}, {"QHL_OCCUPANCY.REMOTE", 
"QHL remote tracker allocate to deallocate read occupancy."}, {"QHL_OCCUPANCY.LOCAL", "QHL local tracker allocate to deallocate read occupancy."}, {"QHL_ADDRESS_CONFLICTS.2WAY", "Counts number of QHL Active Address Table (AAT) entries that saw a max of 2 conflicts. The AAT is a structure that tracks requests that are in conflict. The requests themselves are in the home tracker entries. The count is reported when an AAT entry deallocates."}, {"QHL_ADDRESS_CONFLICTS.3WAY", "Counts number of QHL Active Address Table (AAT) entries that saw a max of 3 conflicts. The AAT is a structure that tracks requests that are in conflict. The requests themselves are in the home tracker entries. The count is reported when an AAT entry deallocates."}, {"QHL_CONFLICT_CYCLES.IOH", "Counts cycles the Quickpath Home Logic IOH Tracker contains two or more requests with an address conflict. A max of 3 requests can be in conflict."}, {"QHL_CONFLICT_CYCLES.REMOTE", "Counts cycles the Quickpath Home Logic Remote Tracker contains two or more requests with an address conflict. A max of 3 requests can be in conflict."}, {"QHL_CONFLICT_CYCLES.LOCAL", "Counts cycles the Quickpath Home Logic Local Tracker contains two or more requests with an address conflict. A max of 3 requests can be in conflict."}, {"QHL_TO_QMC_BYPASS", "Counts number of requests to the Quickpath Memory Controller that bypass the Quickpath Home Logic. All local accesses can be bypassed. 
For remote requests, only read requests can be bypassed."}, {"QMC_NORMAL_FULL.READ.CH0", "Uncore cycles all the entries in the DRAM channel 0 medium or low priority queue are occupied with read requests."}, {"QMC_NORMAL_FULL.READ.CH1", "Uncore cycles all the entries in the DRAM channel 1 medium or low priority queue are occupied with read requests."}, {"QMC_NORMAL_FULL.READ.CH2", "Uncore cycles all the entries in the DRAM channel 2 medium or low priority queue are occupied with read requests."}, {"QMC_NORMAL_FULL.WRITE.CH0", "Uncore cycles all the entries in the DRAM channel 0 medium or low priority queue are occupied with write requests."}, {"QMC_NORMAL_FULL.WRITE.CH1", "Counts cycles all the entries in the DRAM channel 1 medium or low priority queue are occupied with write requests."}, {"QMC_NORMAL_FULL.WRITE.CH2", "Uncore cycles all the entries in the DRAM channel 2 medium or low priority queue are occupied with write requests."}, {"QMC_ISOC_FULL.READ.CH0", "Counts cycles all the entries in the DRAM channel 0 high priority queue are occupied with isochronous read requests."}, {"QMC_ISOC_FULL.READ.CH1", "Counts cycles all the entries in the DRAM channel 1 high priority queue are occupied with isochronous read requests."}, {"QMC_ISOC_FULL.READ.CH2", "Counts cycles all the entries in the DRAM channel 2 high priority queue are occupied with isochronous read requests."}, {"QMC_ISOC_FULL.WRITE.CH0", "Counts cycles all the entries in the DRAM channel 0 high priority queue are occupied with isochronous write requests."}, {"QMC_ISOC_FULL.WRITE.CH1", "Counts cycles all the entries in the DRAM channel 1 high priority queue are occupied with isochronous write requests."}, {"QMC_ISOC_FULL.WRITE.CH2", "Counts cycles all the entries in the DRAM channel 2 high priority queue are occupied with isochronous write requests."}, {"QMC_BUSY.READ.CH0", "Counts cycles where Quickpath Memory Controller has at least 1 outstanding read request to DRAM channel 0."}, {"QMC_BUSY.READ.CH1", 
"Counts cycles where Quickpath Memory Controller has at least 1 outstanding read request to DRAM channel 1."}, {"QMC_BUSY.READ.CH2", "Counts cycles where Quickpath Memory Controller has at least 1 outstanding read request to DRAM channel 2."}, {"QMC_BUSY.WRITE.CH0", "Counts cycles where Quickpath Memory Controller has at least 1 outstanding write request to DRAM channel 0."}, {"QMC_BUSY.WRITE.CH1", "Counts cycles where Quickpath Memory Controller has at least 1 outstanding write request to DRAM channel 1."}, {"QMC_BUSY.WRITE.CH2", "Counts cycles where Quickpath Memory Controller has at least 1 outstanding write request to DRAM channel 2."}, {"QMC_OCCUPANCY.CH0", "IMC channel 0 normal read request occupancy."}, {"QMC_OCCUPANCY.CH1", "IMC channel 1 normal read request occupancy."}, {"QMC_OCCUPANCY.CH2", "IMC channel 2 normal read request occupancy."}, {"QMC_ISSOC_OCCUPANCY.CH0", "IMC channel 0 issoc read request occupancy."}, {"QMC_ISSOC_OCCUPANCY.CH1", "IMC channel 1 issoc read request occupancy."}, {"QMC_ISSOC_OCCUPANCY.CH2", "IMC channel 2 issoc read request occupancy."}, {"QMC_ISSOC_READS.ANY", "IMC issoc read request occupancy."}, {"QMC_NORMAL_READS.CH0", "Counts the number of Quickpath Memory Controller channel 0 medium and low priority read requests. The QMC channel 0 normal read occupancy divided by this count provides the average QMC channel 0 read latency."}, {"QMC_NORMAL_READS.CH1", "Counts the number of Quickpath Memory Controller channel 1 medium and low priority read requests. The QMC channel 1 normal read occupancy divided by this count provides the average QMC channel 1 read latency."}, {"QMC_NORMAL_READS.CH2", "Counts the number of Quickpath Memory Controller channel 2 medium and low priority read requests. The QMC channel 2 normal read occupancy divided by this count provides the average QMC channel 2 read latency."}, {"QMC_NORMAL_READS.ANY", "Counts the number of Quickpath Memory Controller medium and low priority read requests. 
The QMC normal read occupancy divided by this count provides the average QMC read latency."}, {"QMC_HIGH_PRIORITY_READS.CH0", "Counts the number of Quickpath Memory Controller channel 0 high priority isochronous read requests."}, {"QMC_HIGH_PRIORITY_READS.CH1", "Counts the number of Quickpath Memory Controller channel 1 high priority isochronous read requests."}, {"QMC_HIGH_PRIORITY_READS.CH2", "Counts the number of Quickpath Memory Controller channel 2 high priority isochronous read requests."}, {"QMC_HIGH_PRIORITY_READS.ANY", "Counts the number of Quickpath Memory Controller high priority isochronous read requests."}, {"QMC_CRITICAL_PRIORITY_READS.CH0", "Counts the number of Quickpath Memory Controller channel 0 critical priority isochronous read requests."}, {"QMC_CRITICAL_PRIORITY_READS.CH1", "Counts the number of Quickpath Memory Controller channel 1 critical priority isochronous read requests."}, {"QMC_CRITICAL_PRIORITY_READS.CH2", "Counts the number of Quickpath Memory Controller channel 2 critical priority isochronous read requests."}, {"QMC_CRITICAL_PRIORITY_READS.ANY", "Counts the number of Quickpath Memory Controller critical priority isochronous read requests."}, {"QMC_WRITES.FULL.CH0", "Counts number of full cache line writes to DRAM channel 0."}, {"QMC_WRITES.FULL.CH1", "Counts number of full cache line writes to DRAM channel 1."}, {"QMC_WRITES.FULL.CH2", "Counts number of full cache line writes to DRAM channel 2."}, {"QMC_WRITES.FULL.ANY", "Counts number of full cache line writes to DRAM."}, {"QMC_WRITES.PARTIAL.CH0", "Counts number of partial cache line writes to DRAM channel 0."}, {"QMC_WRITES.PARTIAL.CH1", "Counts number of partial cache line writes to DRAM channel 1."}, {"QMC_WRITES.PARTIAL.CH2", "Counts number of partial cache line writes to DRAM channel 2."}, {"QMC_WRITES.PARTIAL.ANY", "Counts number of partial cache line writes to DRAM."}, {"QMC_CANCEL.CH0", "Counts number of DRAM channel 0 cancel requests."}, 
{"QMC_CANCEL.CH1", "Counts number of DRAM channel 1 cancel requests."}, {"QMC_CANCEL.CH2", "Counts number of DRAM channel 2 cancel requests."}, {"QMC_CANCEL.ANY", "Counts number of DRAM cancel requests."}, {"QMC_PRIORITY_UPDATES.CH0", "Counts number of DRAM channel 0 priority updates. A priority update occurs when an ISOC high or critical request is received by the QHL and there is a matching request with normal priority that has already been issued to the QMC. In this instance, the QHL will send a priority update to QMC to expedite the request."}, {"QMC_PRIORITY_UPDATES.CH1", "Counts number of DRAM channel 1 priority updates. A priority update occurs when an ISOC high or critical request is received by the QHL and there is a matching request with normal priority that has already been issued to the QMC. In this instance, the QHL will send a priority update to QMC to expedite the request."}, {"QMC_PRIORITY_UPDATES.CH2", "Counts number of DRAM channel 2 priority updates. A priority update occurs when an ISOC high or critical request is received by the QHL and there is a matching request with normal priority that has already been issued to the QMC. In this instance, the QHL will send a priority update to QMC to expedite the request."}, {"QMC_PRIORITY_UPDATES.ANY", "Counts number of DRAM priority updates. A priority update occurs when an ISOC high or critical request is received by the QHL and there is a matching request with normal priority that has already been issued to the QMC. In this instance, the QHL will send a priority update to QMC to expedite the request."}, {"QHL_FRC_ACK_CNFLTS.LOCAL", "Counts number of Force Acknowledge Conflict messages sent by the Quickpath Home Logic to the local home."}, {"QPI_TX_STALLED_SINGLE_FLIT.HOME.LINK_0", "Counts cycles the Quickpath outbound link 0 HOME virtual channel is stalled due to lack of a VNA and VN0 credit. 
Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_SINGLE_FLIT.SNOOP.LINK_0", "Counts cycles the Quickpath outbound link 0 SNOOP virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_SINGLE_FLIT.NDR.LINK_0", "Counts cycles the Quickpath outbound link 0 non-data response virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_SINGLE_FLIT.HOME.LINK_1", "Counts cycles the Quickpath outbound link 1 HOME virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_SINGLE_FLIT.SNOOP.LINK_1", "Counts cycles the Quickpath outbound link 1 SNOOP virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_SINGLE_FLIT.NDR.LINK_1", "Counts cycles the Quickpath outbound link 1 non-data response virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_SINGLE_FLIT.LINK_0", "Counts cycles the Quickpath outbound link 0 virtual channels are stalled due to lack of a VNA and VN0 credit. 
Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_SINGLE_FLIT.LINK_1", "Counts cycles the Quickpath outbound link 1 virtual channels are stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_MULTI_FLIT.DRS.LINK_0", "Counts cycles the Quickpath outbound link 0 Data ResponSe virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_MULTI_FLIT.NCB.LINK_0", "Counts cycles the Quickpath outbound link 0 Non-Coherent Bypass virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_MULTI_FLIT.NCS.LINK_0", "Counts cycles the Quickpath outbound link 0 Non-Coherent Standard virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_MULTI_FLIT.DRS.LINK_1", "Counts cycles the Quickpath outbound link 1 Data ResponSe virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_MULTI_FLIT.NCB.LINK_1", "Counts cycles the Quickpath outbound link 1 Non-Coherent Bypass virtual channel is stalled due to lack of VNA and VN0 credits. 
Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_MULTI_FLIT.NCS.LINK_1", "Counts cycles the Quickpath outbound link 1 Non-Coherent Standard virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_MULTI_FLIT.LINK_0", "Counts cycles the Quickpath outbound link 0 virtual channels are stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_MULTI_FLIT.LINK_1", "Counts cycles the Quickpath outbound link 1 virtual channels are stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_HEADER.BUSY.LINK_0", "Number of cycles that the header buffer in the Quickpath Interface outbound link 0 is busy."}, {"QPI_TX_HEADER.BUSY.LINK_1", "Number of cycles that the header buffer in the Quickpath Interface outbound link 1 is busy."}, {"QPI_RX_NO_PPT_CREDIT.STALLS.LINK_0", "Number of cycles that snoop packets incoming to the Quickpath Interface link 0 are stalled and not sent to the GQ because the GQ Peer Probe Tracker (PPT) does not have any available entries."}, {"QPI_RX_NO_PPT_CREDIT.STALLS.LINK_1", "Number of cycles that snoop packets incoming to the Quickpath Interface link 1 are stalled and not sent to the GQ because the GQ Peer Probe Tracker (PPT) does not have any available entries."}, {"DRAM_OPEN.CH0", "Counts number of DRAM Channel 0 open commands issued either for read or write. 
To read or write data, the referenced DRAM page must first be opened."}, {"DRAM_OPEN.CH1", "Counts number of DRAM Channel 1 open commands issued either for read or write. To read or write data, the referenced DRAM page must first be opened."}, {"DRAM_OPEN.CH2", "Counts number of DRAM Channel 2 open commands issued either for read or write. To read or write data, the referenced DRAM page must first be opened."}, {"DRAM_PAGE_CLOSE.CH0", "DRAM channel 0 command issued to CLOSE a page due to page idle timer expiration. Closing a page is done by issuing a precharge."}, {"DRAM_PAGE_CLOSE.CH1", "DRAM channel 1 command issued to CLOSE a page due to page idle timer expiration. Closing a page is done by issuing a precharge."}, {"DRAM_PAGE_CLOSE.CH2", "DRAM channel 2 command issued to CLOSE a page due to page idle timer expiration. Closing a page is done by issuing a precharge."}, {"DRAM_PAGE_MISS.CH0", "Counts the number of precharges (PRE) that were issued to DRAM channel 0 because there was a page miss. A page miss refers to a situation in which a page is currently open and another page from the same bank needs to be opened. The new page experiences a page miss. Closing of the old page is done by issuing a precharge."}, {"DRAM_PAGE_MISS.CH1", "Counts the number of precharges (PRE) that were issued to DRAM channel 1 because there was a page miss. A page miss refers to a situation in which a page is currently open and another page from the same bank needs to be opened. The new page experiences a page miss. Closing of the old page is done by issuing a precharge."}, {"DRAM_PAGE_MISS.CH2", "Counts the number of precharges (PRE) that were issued to DRAM channel 2 because there was a page miss. A page miss refers to a situation in which a page is currently open and another page from the same bank needs to be opened. The new page experiences a page miss. 
Closing of the old page is done by issuing a precharge."}, {"DRAM_READ_CAS.CH0", "Counts the number of times a read CAS command was issued on DRAM channel 0."}, {"DRAM_READ_CAS.AUTOPRE_CH0", "Counts the number of times a read CAS command was issued on DRAM channel 0 where the command issued used the auto-precharge (auto page close) mode."}, {"DRAM_READ_CAS.CH1", "Counts the number of times a read CAS command was issued on DRAM channel 1."}, {"DRAM_READ_CAS.AUTOPRE_CH1", "Counts the number of times a read CAS command was issued on DRAM channel 1 where the command issued used the auto-precharge (auto page close) mode."}, {"DRAM_READ_CAS.CH2", "Counts the number of times a read CAS command was issued on DRAM channel 2."}, {"DRAM_READ_CAS.AUTOPRE_CH2", "Counts the number of times a read CAS command was issued on DRAM channel 2 where the command issued used the auto-precharge (auto page close) mode."}, {"DRAM_WRITE_CAS.CH0", "Counts the number of times a write CAS command was issued on DRAM channel 0."}, {"DRAM_WRITE_CAS.AUTOPRE_CH0", "Counts the number of times a write CAS command was issued on DRAM channel 0 where the command issued used the auto-precharge (auto page close) mode."}, {"DRAM_WRITE_CAS.CH1", "Counts the number of times a write CAS command was issued on DRAM channel 1."}, {"DRAM_WRITE_CAS.AUTOPRE_CH1", "Counts the number of times a write CAS command was issued on DRAM channel 1 where the command issued used the auto-precharge (auto page close) mode."}, {"DRAM_WRITE_CAS.CH2", "Counts the number of times a write CAS command was issued on DRAM channel 2."}, {"DRAM_WRITE_CAS.AUTOPRE_CH2", "Counts the number of times a write CAS command was issued on DRAM channel 2 where the command issued used the auto-precharge (auto page close) mode."}, {"DRAM_REFRESH.CH0", "Counts number of DRAM channel 0 refresh commands. DRAM loses data content over time. 
In order to keep correct data content, the data values have to be refreshed periodically."}, {"DRAM_REFRESH.CH1", "Counts number of DRAM channel 1 refresh commands. DRAM loses data content over time. In order to keep correct data content, the data values have to be refreshed periodically."}, {"DRAM_REFRESH.CH2", "Counts number of DRAM channel 2 refresh commands. DRAM loses data content over time. In order to keep correct data content, the data values have to be refreshed periodically."}, {"DRAM_PRE_ALL.CH0", "Counts number of DRAM Channel 0 precharge-all (PREALL) commands that close all open pages in a rank. PREALL is issued when the DRAM needs to be refreshed or needs to go into a power down mode."}, {"DRAM_PRE_ALL.CH1", "Counts number of DRAM Channel 1 precharge-all (PREALL) commands that close all open pages in a rank. PREALL is issued when the DRAM needs to be refreshed or needs to go into a power down mode."}, {"DRAM_PRE_ALL.CH2", "Counts number of DRAM Channel 2 precharge-all (PREALL) commands that close all open pages in a rank. 
PREALL is issued when the DRAM needs to be refreshed or needs to go into a power down mode."}, { NULL, NULL } };
papi-papi-7-2-0-t/src/freebsd/map-i7.h
/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-i7.h * CVS: $Id: map-i7.h,v 1.1.2.2 2010/03/06 16:12:08 servat Exp $ * Author: George Neville-Neil * gnn@freebsd.org */ #ifndef FreeBSD_MAP_I7 #define FreeBSD_MAP_I7 enum NativeEvent_Value_i7Processor { PNE_I7_SB_FORWARD_ANY= PAPI_NATIVE_MASK , PNE_I7_LOAD_BLOCK_STD, PNE_I7_LOAD_BLOCK_ADDRESS_OFFSET, PNE_I7_SB_DRAIN_CYCLES, PNE_I7_MISALIGN_MEM_REF_LOAD, PNE_I7_MISALIGN_MEM_REF_STORE, PNE_I7_MISALIGN_MEM_REF_ANY, PNE_I7_STORE_BLOCKS_NOT_STA, PNE_I7_STORE_BLOCKS_STA, PNE_I7_STORE_BLOCKS_AT_RET, PNE_I7_STORE_BLOCKS_L1D_BLOCK, PNE_I7_STORE_BLOCKS_ANY, PNE_I7_PARTIAL_ADDRESS_ALIAS, PNE_I7_DTLB_LOAD_MISSES_ANY, PNE_I7_DTLB_LOAD_MISSES_WALK_COMPLETED, PNE_I7_DTLB_LOAD_MISSES_STLB_HIT, PNE_I7_DTLB_LOAD_MISSES_PDE_MISS, PNE_I7_DTLB_LOAD_MISSES_PDP_MISS, PNE_I7_DTLB_LOAD_MISSES_LARGE_WALK_COMPLETED, PNE_I7_MEMORY_DISAMBIGURATION_RESET, PNE_I7_MEMORY_DISAMBIGURATION_SUCCESS, PNE_I7_MEMORY_DISAMBIGURATION_WATCHDOG, PNE_I7_MEMORY_DISAMBIGURATION_WATCH_CYCLES, PNE_I7_MEM_INST_RETIRED_LOADS, PNE_I7_MEM_INST_RETIRED_STORES, PNE_I7_MEM_STORE_RETIRED_DTLB_MISS, PNE_I7_UOPS_ISSUED_ANY, PNE_I7_UOPS_ISSUED_FUSED, PNE_I7_MEM_UNCORE_RETIRED_OTHER_CORE_L2_HITM, PNE_I7_MEM_UNCORE_RETIRED_REMOTE_CACHE_LOCAL_HOME_HIT, PNE_I7_MEM_UNCORE_RETIRED_REMOTE_DRAM, PNE_I7_MEM_UNCORE_RETIRED_LOCAL_DRAM, PNE_I7_FP_COMP_OPS_EXE_X87, PNE_I7_FP_COMP_OPS_EXE_MMX, PNE_I7_FP_COMP_OPS_EXE_SSE_FP, PNE_I7_FP_COMP_OPS_EXE_SSE2_INTEGER, PNE_I7_FP_COMP_OPS_EXE_SSE_FP_PACKED, PNE_I7_FP_COMP_OPS_EXE_SSE_FP_SCALAR, PNE_I7_FP_COMP_OPS_EXE_SSE_SINGLE_PRECISION, PNE_I7_FP_COMP_OPS_EXE_SSE_DOUBLE_PRECISION, PNE_I7_SIMD_INT_128_PACKED_MPY, PNE_I7_SIMD_INT_128_PACKED_SHIFT, 
PNE_I7_SIMD_INT_128_PACK, PNE_I7_SIMD_INT_128_UNPACK, PNE_I7_SIMD_INT_128_PACKED_LOGICAL, PNE_I7_SIMD_INT_128_PACKED_ARITH, PNE_I7_SIMD_INT_128_SHUFFLE_MOVE, PNE_I7_LOAD_DISPATCH_RS, PNE_I7_LOAD_DISPATCH_RS_DELAYED, PNE_I7_LOAD_DISPATCH_MOB, PNE_I7_LOAD_DISPATCH_ANY, PNE_I7_ARITH_CYCLES_DIV_BUSY, PNE_I7_ARITH_MUL, PNE_I7_INST_QUEUE_WRITES, PNE_I7_INST_DECODED_DEC0, PNE_I7_TWO_UOP_INSTS_DECODED, PNE_I7_HW_INT_RCV, PNE_I7_HW_INT_CYCLES_MASKED, PNE_I7_HW_INT_CYCLES_PENDING_AND_MASKED, PNE_I7_INST_QUEUE_WRITE_CYCLES, PNE_I7_L2_RQSTS_LD_HIT, PNE_I7_L2_RQSTS_LD_MISS, PNE_I7_L2_RQSTS_LOADS, PNE_I7_L2_RQSTS_RFO_HIT, PNE_I7_L2_RQSTS_RFO_MISS, PNE_I7_L2_RQSTS_RFOS, PNE_I7_L2_RQSTS_IFETCH_HIT, PNE_I7_L2_RQSTS_IFETCH_MISS, PNE_I7_L2_RQSTS_IFETCHES, PNE_I7_L2_RQSTS_PREFETCH_HIT, PNE_I7_L2_RQSTS_PREFETCH_MISS, PNE_I7_L2_RQSTS_PREFETCHES, PNE_I7_L2_RQSTS_MISS, PNE_I7_L2_RQSTS_REFERENCES, PNE_I7_L2_DATA_RQSTS_DEMAND_I_STATE, PNE_I7_L2_DATA_RQSTS_DEMAND_S_STATE, PNE_I7_L2_DATA_RQSTS_DEMAND_E_STATE, PNE_I7_L2_DATA_RQSTS_DEMAND_M_STATE, PNE_I7_L2_DATA_RQSTS_DEMAND_MESI, PNE_I7_L2_DATA_RQSTS_PREFETCH_I_STATE, PNE_I7_L2_DATA_RQSTS_PREFETCH_S_STATE, PNE_I7_L2_DATA_RQSTS_PREFETCH_E_STATE, PNE_I7_L2_DATA_RQSTS_PREFETCH_M_STATE, PNE_I7_L2_DATA_RQSTS_PREFETCH_MESI, PNE_I7_L2_DATA_RQSTS_ANY, PNE_I7_L2_WRITE_RFO_I_STATE, PNE_I7_L2_WRITE_RFO_S_STATE, PNE_I7_L2_WRITE_RFO_E_STATE, PNE_I7_L2_WRITE_RFO_M_STATE, PNE_I7_L2_WRITE_RFO_HIT, PNE_I7_L2_WRITE_RFO_MESI, PNE_I7_L2_WRITE_LOCK_I_STATE, PNE_I7_L2_WRITE_LOCK_S_STATE, PNE_I7_L2_WRITE_LOCK_E_STATE, PNE_I7_L2_WRITE_LOCK_M_STATE, PNE_I7_L2_WRITE_LOCK_HIT, PNE_I7_L2_WRITE_LOCK_MESI, PNE_I7_L1D_WB_L2_I_STATE, PNE_I7_L1D_WB_L2_S_STATE, PNE_I7_L1D_WB_L2_E_STATE, PNE_I7_L1D_WB_L2_M_STATE, PNE_I7_L1D_WB_L2_MESI, PNE_I7_L3_LAT_CACHE_REFERENCE, PNE_I7_L3_LAT_CACHE_MISS, PNE_I7_CPU_CLK_UNHALTED_THREAD_P, PNE_I7_CPU_CLK_UNHALTED_REF_P, PNE_I7_UOPS_DECODED_DEC0, PNE_I7_L1D_CACHE_LD_I_STATE, PNE_I7_L1D_CACHE_LD_S_STATE, PNE_I7_L1D_CACHE_LD_E_STATE, 
PNE_I7_L1D_CACHE_LD_M_STATE, PNE_I7_L1D_CACHE_LD_MESI, PNE_I7_L1D_CACHE_ST_I_STATE, PNE_I7_L1D_CACHE_ST_S_STATE, PNE_I7_L1D_CACHE_ST_E_STATE, PNE_I7_L1D_CACHE_ST_M_STATE, PNE_I7_L1D_CACHE_ST_MESI, PNE_I7_L1D_CACHE_LOCK_HIT, PNE_I7_L1D_CACHE_LOCK_S_STATE, PNE_I7_L1D_CACHE_LOCK_E_STATE, PNE_I7_L1D_CACHE_LOCK_M_STATE, PNE_I7_L1D_ALL_REF_ANY, PNE_I7_L1D_ALL_REF_CACHEABLE, PNE_I7_L1D_PEND_MISS_LOAD_BUFFERS_FULL, PNE_I7_DTLB_MISSES_ANY, PNE_I7_DTLB_MISSES_WALK_COMPLETED, PNE_I7_DTLB_MISSES_STLB_HIT, PNE_I7_DTLB_MISSES_PDE_MISS, PNE_I7_DTLB_MISSES_PDP_MISS, PNE_I7_DTLB_MISSES_LARGE_WALK_COMPLETED, PNE_I7_SSE_MEM_EXEC_NTA, PNE_I7_SSE_MEM_EXEC_STREAMING_STORES, PNE_I7_LOAD_HIT_PRE, PNE_I7_SFENCE_CYCLES, PNE_I7_L1D_PREFETCH_REQUESTS, PNE_I7_L1D_PREFETCH_MISS, PNE_I7_L1D_PREFETCH_TRIGGERS, PNE_I7_EPT_EPDE_MISS, PNE_I7_EPT_EPDPE_HIT, PNE_I7_EPT_EPDPE_MISS, PNE_I7_L1D_REPL, PNE_I7_L1D_M_REPL, PNE_I7_L1D_M_EVICT, PNE_I7_L1D_M_SNOOP_EVICT, PNE_I7_L1D_CACHE_PREFETCH_LOCK_FB_HIT, PNE_I7_L1D_CACHE_LOCK_FB_HIT, PNE_I7_OFFCORE_REQUESTS_OUTSTANDING_DEMAND_READ_DATA, PNE_I7_OFFCORE_REQUESTS_OUTSTANDING_DEMAND_READ_CODE, PNE_I7_OFFCORE_REQUESTS_OUTSTANDING_DEMAND_RFO, PNE_I7_OFFCORE_REQUESTS_OUTSTANDING_ANY_READ, PNE_I7_CACHE_LOCK_CYCLES_L1D_L2, PNE_I7_CACHE_LOCK_CYCLES_L1D, PNE_I7_IO_TRANSACTIONS, PNE_I7_L1I_HITS, PNE_I7_L1I_MISSES, PNE_I7_L1I_READS, PNE_I7_L1I_CYCLES_STALLED, PNE_I7_IFU_IVC_FULL, PNE_I7_IFU_IVC_L1I_EVICTION, PNE_I7_LARGE_ITLB_HIT, PNE_I7_L1I_OPPORTUNISTIC_HITS, PNE_I7_ITLB_MISSES_ANY, PNE_I7_ITLB_MISSES_WALK_COMPLETED, PNE_I7_ITLB_MISSES_WALK_CYCLES, PNE_I7_ITLB_MISSES_STLB_HIT, PNE_I7_ITLB_MISSES_PDE_MISS, PNE_I7_ITLB_MISSES_PDP_MISS, PNE_I7_ITLB_MISSES_LARGE_WALK_COMPLETED, PNE_I7_ILD_STALL_ANY, PNE_I7_ILD_STALL_IQ_FULL, PNE_I7_ILD_STALL_LCP, PNE_I7_ILD_STALL_MRU, PNE_I7_ILD_STALL_REGEN, PNE_I7_BR_INST_EXEC_ANY, PNE_I7_BR_INST_EXEC_COND, PNE_I7_BR_INST_EXEC_DIRECT, PNE_I7_BR_INST_EXEC_DIRECT_NEAR_CALL, PNE_I7_BR_INST_EXEC_INDIRECT_NEAR_CALL, 
PNE_I7_BR_INST_EXEC_INDIRECT_NON_CALL, PNE_I7_BR_INST_EXEC_NEAR_CALLS, PNE_I7_BR_INST_EXEC_NON_CALLS, PNE_I7_BR_INST_EXEC_RETURN_NEAR, PNE_I7_BR_INST_EXEC_TAKEN, PNE_I7_BR_MISP_EXEC_COND, PNE_I7_BR_MISP_EXEC_DIRECT, PNE_I7_BR_MISP_EXEC_INDIRECT_NON_CALL, PNE_I7_BR_MISP_EXEC_NON_CALLS, PNE_I7_BR_MISP_EXEC_RETURN_NEAR, PNE_I7_BR_MISP_EXEC_DIRECT_NEAR_CALL, PNE_I7_BR_MISP_EXEC_INDIRECT_NEAR_CALL, PNE_I7_BR_MISP_EXEC_NEAR_CALLS, PNE_I7_BR_MISP_EXEC_TAKEN, PNE_I7_BR_MISP_EXEC_ANY, PNE_I7_RESOURCE_STALLS_ANY, PNE_I7_RESOURCE_STALLS_LOAD, PNE_I7_RESOURCE_STALLS_RS_FULL, PNE_I7_RESOURCE_STALLS_STORE, PNE_I7_RESOURCE_STALLS_ROB_FULL, PNE_I7_RESOURCE_STALLS_FPCW, PNE_I7_RESOURCE_STALLS_MXCSR, PNE_I7_RESOURCE_STALLS_OTHER, PNE_I7_MACRO_INSTS_FUSIONS_DECODED, PNE_I7_BACLEAR_FORCE_IQ, PNE_I7_LSD_UOPS, PNE_I7_ITLB_FLUSH, PNE_I7_OFFCORE_REQUESTS_DEMAND_READ_DATA, PNE_I7_OFFCORE_REQUESTS_DEMAND_READ_CODE, PNE_I7_OFFCORE_REQUESTS_DEMAND_RFO, PNE_I7_OFFCORE_REQUESTS_ANY_READ, PNE_I7_OFFCORE_REQUESTS_ANY_RFO, PNE_I7_OFFCORE_REQUESTS_UNCACHED_MEM, PNE_I7_OFFCORE_REQUESTS_L1D_WRITEBACK, PNE_I7_OFFCORE_REQUESTS_ANY, PNE_I7_UOPS_EXECUTED_PORT0, PNE_I7_UOPS_EXECUTED_PORT1, PNE_I7_UOPS_EXECUTED_PORT2_CORE, PNE_I7_UOPS_EXECUTED_PORT3_CORE, PNE_I7_UOPS_EXECUTED_PORT4_CORE, PNE_I7_UOPS_EXECUTED_PORT5, PNE_I7_UOPS_EXECUTED_CORE_ACTIVE_CYCLES, PNE_I7_UOPS_EXECUTED_PORT015, PNE_I7_UOPS_EXECUTED_PORT234, PNE_I7_OFFCORE_REQUESTS_SQ_FULL, PNE_I7_SNOOPQ_REQUESTS_OUTSTANDING_DATA, PNE_I7_SNOOPQ_REQUESTS_OUTSTANDING_INVALIDATE, PNE_I7_SNOOPQ_REQUESTS_OUTSTANDING_CODE, PNE_I7_OFF_CORE_RESPONSE_0, PNE_I7_SNOOP_RESPONSE_HIT, PNE_I7_SNOOP_RESPONSE_HITE, PNE_I7_SNOOP_RESPONSE_HITM, PNE_I7_PIC_ACCESSES_TPR_READS, PNE_I7_PIC_ACCESSES_TPR_WRITES, PNE_I7_INST_RETIRED_ANY_P, PNE_I7_INST_RETIRED_X87, PNE_I7_UOPS_RETIRED_ANY, PNE_I7_UOPS_RETIRED_RETIRE_SLOTS, PNE_I7_UOPS_RETIRED_MACRO_FUSED, PNE_I7_MACHINE_CLEARS_CYCLES, PNE_I7_MACHINE_CLEARS_MEM_ORDER, PNE_I7_MACHINE_CLEARS_SMC, 
PNE_I7_MACHINE_CLEARS_FUSION_ASSIST, PNE_I7_BR_INST_RETIRED_ALL_BRANCHES, PNE_I7_BR_INST_RETIRED_CONDITIONAL, PNE_I7_BR_INST_RETIRED_NEAR_CALL, PNE_I7_BR_MISP_RETIRED_ALL_BRANCHES, PNE_I7_BR_MISP_RETIRED_NEAR_CALL, PNE_I7_SSEX_UOPS_RETIRED_PACKED_SINGLE, PNE_I7_SSEX_UOPS_RETIRED_SCALAR_SINGLE, PNE_I7_SSEX_UOPS_RETIRED_PACKED_DOUBLE, PNE_I7_SSEX_UOPS_RETIRED_SCALAR_DOUBLE, PNE_I7_SSEX_UOPS_RETIRED_VECTOR_INTEGER, PNE_I7_ITLB_MISS_RETIRED, PNE_I7_MEM_LOAD_RETIRED_L1D_HIT, PNE_I7_MEM_LOAD_RETIRED_L2_HIT, PNE_I7_MEM_LOAD_RETIRED_OTHER_CORE_L2_HIT_HITM, PNE_I7_MEM_LOAD_RETIRED_HIT_LFB, PNE_I7_MEM_LOAD_RETIRED_DTLB_MISS, PNE_I7_MEM_LOAD_RETIRED_L3_MISS, PNE_I7_MEM_LOAD_RETIRED_L3_UNSHARED_HIT, PNE_I7_FP_MMX_TRANS_TO_FP, PNE_I7_FP_MMX_TRANS_TO_MMX, PNE_I7_FP_MMX_TRANS_ANY, PNE_I7_MACRO_INSTS_DECODED, PNE_I7_UOPS_DECODED_MS, PNE_I7_UOPS_DECODED_ESP_FOLDING, PNE_I7_UOPS_DECODED_ESP_SYNC, PNE_I7_RAT_STALLS_FLAGS, PNE_I7_RAT_STALLS_REGISTERS, PNE_I7_RAT_STALLS_ROB_READ_PORT, PNE_I7_RAT_STALLS_SCOREBOARD, PNE_I7_RAT_STALLS_ANY, PNE_I7_SEG_RENAME_STALLS, PNE_I7_ES_REG_RENAMES, PNE_I7_UOP_UNFUSION, PNE_I7_BR_INST_DECODED, PNE_I7_BOGUS_BR, PNE_I7_BPU_MISSED_CALL_RET, PNE_I7_L2_HW_PREFETCH_DATA_TRIGGER, PNE_I7_L2_HW_PREFETCH_CODE_TRIGGER, PNE_I7_L2_HW_PREFETCH_DCA_TRIGGER, PNE_I7_L2_HW_PREFETCH_KICK_START, PNE_I7_SQ_MISC_PROMOTION, PNE_I7_SQ_MISC_PROMOTION_POST_GO, PNE_I7_SQ_MISC_LRU_HINTS, PNE_I7_SQ_MISC_FILL_DROPPED, PNE_I7_SQ_MISC_SPLIT_LOCK, PNE_I7_SQ_FULL_STALL_CYCLES, PNE_I7_FP_ASSIST_ALL, PNE_I7_FP_ASSIST_OUTPUT, PNE_I7_FP_ASSIST_INPUT, PNE_I7_SEGMENT_REG_LOADS, PNE_I7_SIMD_INT_64_PACKED_MPY, PNE_I7_SIMD_INT_64_PACKED_SHIFT, PNE_I7_SIMD_INT_64_PACK, PNE_I7_SIMD_INT_64_UNPACK, PNE_I7_SIMD_INT_64_PACKED_LOGICAL, PNE_I7_SIMD_INT_64_PACKED_ARITH, PNE_I7_SIMD_INT_64_SHUFFLE_MOVE, PNE_I7_INSTR_RETIRED_ANY, PNE_I7_CPU_CLK_UNHALTED_CORE, PNE_I7_CPU_CLK_UNHALTED_REF, PNE_I7_GQ_CYCLES_FULL_READ_TRACKER, PNE_I7_GQ_CYCLES_FULL_WRITE_TRACKER, PNE_I7_GQ_CYCLES_FULL_PEER_PROBE_TRACKER, 
PNE_I7_GQ_CYCLES_NOT_EMPTY_READ_TRACKER, PNE_I7_GQ_CYCLES_NOT_EMPTY_WRITE_TRACKER, PNE_I7_GQ_CYCLES_NOT_EMPTY_PEER_PROBE_TRACKER, PNE_I7_GQ_ALLOC_READ_TRACKER, PNE_I7_GQ_ALLOC_RT_L3_MISS, PNE_I7_GQ_ALLOC_RT_TO_L3_RESP, PNE_I7_GQ_ALLOC_RT_TO_RTID_ACQUIRED, PNE_I7_GQ_ALLOC_WT_TO_RTID_ACQUIRED, PNE_I7_GQ_ALLOC_WRITE_TRACKER, PNE_I7_GQ_ALLOC_PEER_PROBE_TRACKER, PNE_I7_GQ_DATA_FROM_QPI, PNE_I7_GQ_DATA_FROM_QMC, PNE_I7_GQ_DATA_FROM_L3, PNE_I7_GQ_DATA_FROM_CORES_02, PNE_I7_GQ_DATA_FROM_CORES_13, PNE_I7_GQ_DATA_TO_QPI_QMC, PNE_I7_GQ_DATA_TO_L3, PNE_I7_GQ_DATA_TO_CORES, PNE_I7_SNP_RESP_TO_LOCAL_HOME_I_STATE, PNE_I7_SNP_RESP_TO_LOCAL_HOME_S_STATE, PNE_I7_SNP_RESP_TO_LOCAL_HOME_FWD_S_STATE, PNE_I7_SNP_RESP_TO_LOCAL_HOME_FWD_I_STATE, PNE_I7_SNP_RESP_TO_LOCAL_HOME_CONFLICT, PNE_I7_SNP_RESP_TO_LOCAL_HOME_WB, PNE_I7_SNP_RESP_TO_REMOTE_HOME_I_STATE, PNE_I7_SNP_RESP_TO_REMOTE_HOME_S_STATE, PNE_I7_SNP_RESP_TO_REMOTE_HOME_FWD_S_STATE, PNE_I7_SNP_RESP_TO_REMOTE_HOME_FWD_I_STATE, PNE_I7_SNP_RESP_TO_REMOTE_HOME_CONFLICT, PNE_I7_SNP_RESP_TO_REMOTE_HOME_WB, PNE_I7_SNP_RESP_TO_REMOTE_HOME_HITM, PNE_I7_L3_HITS_READ, PNE_I7_L3_HITS_WRITE, PNE_I7_L3_HITS_PROBE, PNE_I7_L3_HITS_ANY, PNE_I7_L3_MISS_READ, PNE_I7_L3_MISS_WRITE, PNE_I7_L3_MISS_PROBE, PNE_I7_L3_MISS_ANY, PNE_I7_L3_LINES_IN_M_STATE, PNE_I7_L3_LINES_IN_E_STATE, PNE_I7_L3_LINES_IN_S_STATE, PNE_I7_L3_LINES_IN_F_STATE, PNE_I7_L3_LINES_IN_ANY, PNE_I7_L3_LINES_OUT_M_STATE, PNE_I7_L3_LINES_OUT_E_STATE, PNE_I7_L3_LINES_OUT_S_STATE, PNE_I7_L3_LINES_OUT_I_STATE, PNE_I7_L3_LINES_OUT_F_STATE, PNE_I7_L3_LINES_OUT_ANY, PNE_I7_QHL_REQUESTS_IOH_READS, PNE_I7_QHL_REQUESTS_IOH_WRITES, PNE_I7_QHL_REQUESTS_REMOTE_READS, PNE_I7_QHL_REQUESTS_REMOTE_WRITES, PNE_I7_QHL_REQUESTS_LOCAL_READS, PNE_I7_QHL_REQUESTS_LOCAL_WRITES, PNE_I7_QHL_CYCLES_FULL_IOH, PNE_I7_QHL_CYCLES_FULL_REMOTE, PNE_I7_QHL_CYCLES_FULL_LOCAL, PNE_I7_QHL_CYCLES_NOT_EMPTY_IOH, PNE_I7_QHL_CYCLES_NOT_EMPTY_REMOTE, PNE_I7_QHL_CYCLES_NOT_EMPTY_LOCAL, PNE_I7_QHL_OCCUPANCY_IOH, 
PNE_I7_QHL_OCCUPANCY_REMOTE, PNE_I7_QHL_OCCUPANCY_LOCAL, PNE_I7_QHL_ADDRESS_CONFLICTS_2WAY, PNE_I7_QHL_ADDRESS_CONFLICTS_3WAY, PNE_I7_QHL_CONFLICT_CYCLES_IOH, PNE_I7_QHL_CONFLICT_CYCLES_REMOTE, PNE_I7_QHL_CONFLICT_CYCLES_LOCAL, PNE_I7_QHL_TO_QMC_BYPASS, PNE_I7_QMC_NORMAL_FULL_READ_CH0, PNE_I7_QMC_NORMAL_FULL_READ_CH1, PNE_I7_QMC_NORMAL_FULL_READ_CH2, PNE_I7_QMC_NORMAL_FULL_WRITE_CH0, PNE_I7_QMC_NORMAL_FULL_WRITE_CH1, PNE_I7_QMC_NORMAL_FULL_WRITE_CH2, PNE_I7_QMC_ISOC_FULL_READ_CH0, PNE_I7_QMC_ISOC_FULL_READ_CH1, PNE_I7_QMC_ISOC_FULL_READ_CH2, PNE_I7_QMC_ISOC_FULL_WRITE_CH0, PNE_I7_QMC_ISOC_FULL_WRITE_CH1, PNE_I7_QMC_ISOC_FULL_WRITE_CH2, PNE_I7_QMC_BUSY_READ_CH0, PNE_I7_QMC_BUSY_READ_CH1, PNE_I7_QMC_BUSY_READ_CH2, PNE_I7_QMC_BUSY_WRITE_CH0, PNE_I7_QMC_BUSY_WRITE_CH1, PNE_I7_QMC_BUSY_WRITE_CH2, PNE_I7_QMC_OCCUPANCY_CH0, PNE_I7_QMC_OCCUPANCY_CH1, PNE_I7_QMC_OCCUPANCY_CH2, PNE_I7_QMC_ISSOC_OCCUPANCY_CH0, PNE_I7_QMC_ISSOC_OCCUPANCY_CH1, PNE_I7_QMC_ISSOC_OCCUPANCY_CH2, PNE_I7_QMC_ISSOC_READS_ANY, PNE_I7_QMC_NORMAL_READS_CH0, PNE_I7_QMC_NORMAL_READS_CH1, PNE_I7_QMC_NORMAL_READS_CH2, PNE_I7_QMC_NORMAL_READS_ANY, PNE_I7_QMC_HIGH_PRIORITY_READS_CH0, PNE_I7_QMC_HIGH_PRIORITY_READS_CH1, PNE_I7_QMC_HIGH_PRIORITY_READS_CH2, PNE_I7_QMC_HIGH_PRIORITY_READS_ANY, PNE_I7_QMC_CRITICAL_PRIORITY_READS_CH0, PNE_I7_QMC_CRITICAL_PRIORITY_READS_CH1, PNE_I7_QMC_CRITICAL_PRIORITY_READS_CH2, PNE_I7_QMC_CRITICAL_PRIORITY_READS_ANY, PNE_I7_QMC_WRITES_FULL_CH0, PNE_I7_QMC_WRITES_FULL_CH1, PNE_I7_QMC_WRITES_FULL_CH2, PNE_I7_QMC_WRITES_FULL_ANY, PNE_I7_QMC_WRITES_PARTIAL_CH0, PNE_I7_QMC_WRITES_PARTIAL_CH1, PNE_I7_QMC_WRITES_PARTIAL_CH2, PNE_I7_QMC_WRITES_PARTIAL_ANY, PNE_I7_QMC_CANCEL_CH0, PNE_I7_QMC_CANCEL_CH1, PNE_I7_QMC_CANCEL_CH2, PNE_I7_QMC_CANCEL_ANY, PNE_I7_QMC_PRIORITY_UPDATES_CH0, PNE_I7_QMC_PRIORITY_UPDATES_CH1, PNE_I7_QMC_PRIORITY_UPDATES_CH2, PNE_I7_QMC_PRIORITY_UPDATES_ANY, PNE_I7_QHL_FRC_ACK_CNFLTS_LOCAL, PNE_I7_QPI_TX_STALLED_SINGLE_FLIT_HOME_LINK_0, 
PNE_I7_QPI_TX_STALLED_SINGLE_FLIT_SNOOP_LINK_0, PNE_I7_QPI_TX_STALLED_SINGLE_FLIT_NDR_LINK_0, PNE_I7_QPI_TX_STALLED_SINGLE_FLIT_HOME_LINK_1, PNE_I7_QPI_TX_STALLED_SINGLE_FLIT_SNOOP_LINK_1, PNE_I7_QPI_TX_STALLED_SINGLE_FLIT_NDR_LINK_1, PNE_I7_QPI_TX_STALLED_SINGLE_FLIT_LINK_0, PNE_I7_QPI_TX_STALLED_SINGLE_FLIT_LINK_1, PNE_I7_QPI_TX_STALLED_MULTI_FLIT_DRS_LINK_0, PNE_I7_QPI_TX_STALLED_MULTI_FLIT_NCB_LINK_0, PNE_I7_QPI_TX_STALLED_MULTI_FLIT_NCS_LINK_0, PNE_I7_QPI_TX_STALLED_MULTI_FLIT_DRS_LINK_1, PNE_I7_QPI_TX_STALLED_MULTI_FLIT_NCB_LINK_1, PNE_I7_QPI_TX_STALLED_MULTI_FLIT_NCS_LINK_1, PNE_I7_QPI_TX_STALLED_MULTI_FLIT_LINK_0, PNE_I7_QPI_TX_STALLED_MULTI_FLIT_LINK_1, PNE_I7_QPI_TX_HEADER_BUSY_LINK_0, PNE_I7_QPI_TX_HEADER_BUSY_LINK_1, PNE_I7_QPI_RX_NO_PPT_CREDIT_STALLS_LINK_0, PNE_I7_QPI_RX_NO_PPT_CREDIT_STALLS_LINK_1, PNE_I7_DRAM_OPEN_CH0, PNE_I7_DRAM_OPEN_CH1, PNE_I7_DRAM_OPEN_CH2, PNE_I7_DRAM_PAGE_CLOSE_CH0, PNE_I7_DRAM_PAGE_CLOSE_CH1, PNE_I7_DRAM_PAGE_CLOSE_CH2, PNE_I7_DRAM_PAGE_MISS_CH0, PNE_I7_DRAM_PAGE_MISS_CH1, PNE_I7_DRAM_PAGE_MISS_CH2, PNE_I7_DRAM_READ_CAS_CH0, PNE_I7_DRAM_READ_CAS_AUTOPRE_CH0, PNE_I7_DRAM_READ_CAS_CH1, PNE_I7_DRAM_READ_CAS_AUTOPRE_CH1, PNE_I7_DRAM_READ_CAS_CH2, PNE_I7_DRAM_READ_CAS_AUTOPRE_CH2, PNE_I7_DRAM_WRITE_CAS_CH0, PNE_I7_DRAM_WRITE_CAS_AUTOPRE_CH0, PNE_I7_DRAM_WRITE_CAS_CH1, PNE_I7_DRAM_WRITE_CAS_AUTOPRE_CH1, PNE_I7_DRAM_WRITE_CAS_CH2, PNE_I7_DRAM_WRITE_CAS_AUTOPRE_CH2, PNE_I7_DRAM_REFRESH_CH0, PNE_I7_DRAM_REFRESH_CH1, PNE_I7_DRAM_REFRESH_CH2, PNE_I7_DRAM_PRE_ALL_CH0, PNE_I7_DRAM_PRE_ALL_CH1, PNE_I7_DRAM_PRE_ALL_CH2, PNE_I7_NATNAME_GUARD }; extern Native_Event_LabelDescription_t i7Processor_info[]; extern hwi_search_t i7Processor_map[]; #endif papi-papi-7-2-0-t/src/freebsd/map-k7.c000066400000000000000000000054671502707512200173150ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-k7.c * Author: Harald Servat * redcrash@gmail.com */ #include 
"freebsd.h" #include "papiStdEventDefs.h" #include "map.h" /**************************************************************************** K7 SUBSTRATE K7 SUBSTRATE K7 SUBSTRATE (aka Athlon) K7 SUBSTRATE K7 SUBSTRATE ****************************************************************************/ /* NativeEvent_Value_K7Processor must match K7Processor_info */ Native_Event_LabelDescription_t K7Processor_info[] = { { "k7-dc-accesses", "Count data cache accesses." }, { "k7-dc-misses", "Count data cache misses." }, { "k7-dc-refills-from-l2", "Count data cache refills from L2 cache." }, { "k7-dc-refills-from-system", "Count data cache refills from system memory." }, { "k7-dc-writebacks", "Count data cache writebacks." }, { "k7-l1-dtlb-miss-and-l2-dtlb-hits", "Count L1 DTLB misses and L2 DTLB hits." }, { "k7-l1-and-l2-dtlb-misses", "Count L1 and L2 DTLB misses." }, { "k7-misaligned-references", "Count misaligned data references." }, { "k7-ic-fetches", "Count instruction cache fetches." }, { "k7-ic-misses", "Count instruction cache misses." }, { "k7-l1-itlb-misses", "Count L1 ITLB misses that are L2 ITLB hits." }, { "k7-l1-l2-itlb-misses", "Count L1 (and L2) ITLB misses." }, { "k7-retired-instructions", "Count all retired instructions." }, { "k7-retired-ops", "Count retired ops." }, { "k7-retired-branches", "Count all retired branches (conditional, unconditional, exceptions and interrupts)."}, { "k7-retired-branches-mispredicted", "Count all misprediced retired branches." }, { "k7-retired-taken-branches", "Count retired taken branches." }, { "k7-retired-taken-branches-mispredicted", "Count mispredicted taken branches that were retired." }, { "k7-retired-far-control-transfers", "Count retired far control transfers." }, { "k7-retired-resync-branches", "Count retired resync branches (non control transfer branches)." }, { "k7-interrupts-masked-cycles", "Count the number of cycles when the processor's IF flag was zero." 
}, { "k7-interrupts-masked-while-pending-cycles", "Count the number of cycles interrupts were masked while pending due to the processor's IF flag being zero." }, { "k7-hardware-interrupts", "Count the number of taken hardware interrupts." }, /* Nearly special counters */ { "k7-dc-refills-from-l2,unitmask=+m", "Count data cache refills from L2 cache (in M state)." }, { "k7-dc-refills-from-l2,unitmask=+oes", "Count data cache refills from L2 cache (in OES state)." }, { "k7-dc-refills-from-system,unitmask=+m", "Count data cache refills from system memory (in M state)." }, { "k7-dc-refills-from-system,unitmask=+oes", "Count data cache refills from system memory (in OES state)." }, { NULL, NULL } }; papi-papi-7-2-0-t/src/freebsd/map-k7.h000066400000000000000000000024061502707512200173100ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-k7.h * CVS: $Id$ * Author: Harald Servat * redcrash@gmail.com */ #ifndef FreeBSD_MAP_K7 #define FreeBSD_MAP_K7 enum NativeEvent_Value_K7Processor { PNE_K7_DC_ACCESSES = PAPI_NATIVE_MASK, PNE_K7_DC_MISSES, PNE_K7_DC_REFILLS_FROM_L2, PNE_K7_DC_REFILLS_FROM_SYSTEM, PNE_K7_DC_WRITEBACKS, PNE_K7_L1_DTLB_MISS_AND_L2_DTLB_HITS, PNE_K7_L1_AND_L2_DTLB_MISSES, PNE_K7_MISALIGNED_REFERENCES, PNE_K7_IC_FETCHES, PNE_K7_IC_MISSES, PNE_K7_L1_ITLB_MISSES, PNE_K7_L1_AND_L2_ITLB_MISSES, PNE_K7_RETIRED_INSTRUCTIONS, PNE_K7_RETIRED_OPS, PNE_K7_RETIRED_BRANCHES, PNE_K7_RETIRED_BRANCHES_MISPREDICTED, PNE_K7_RETIRED_TAKEN_BRANCHES, PNE_K7_RETIRED_TAKEN_BRANCHES_MISPREDICTED, PNE_K7_RETIRED_FAR_CONTROL_TRANSFERS, PNE_K7_RETIRED_RESYNC_BRANCHES, PNE_K7_INTERRUPTS_MASKED_CYCLES, PNE_K7_INTERRUPTS_MASKED_WHILE_PENDING_CYCLES, PNE_K7_HARDWARE_INTERRUPTS, /* Nearly special counters */ PNE_K7_DC_REFILLS_FROM_L2_M, PNE_K7_DC_REFILLS_FROM_L2_OES, PNE_K7_DC_REFILLS_FROM_SYSTEM_M, PNE_K7_DC_REFILLS_FROM_SYSTEM_OES, PNE_K7_NATNAME_GUARD }; extern Native_Event_LabelDescription_t 
K7Processor_info[]; extern hwi_search_t K7Processor_map[]; #endif papi-papi-7-2-0-t/src/freebsd/map-k8.c000066400000000000000000000212011502707512200172760ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-k8.c * Author: Harald Servat * redcrash@gmail.com */ #include "freebsd.h" #include "papiStdEventDefs.h" #include "map.h" /**************************************************************************** K8 SUBSTRATE K8 SUBSTRATE K8 SUBSTRATE (aka Athlon64) K8 SUBSTRATE K8 SUBSTRATE ****************************************************************************/ /* NativeEvent_Value_K8Processor must match K8Processor_info */ Native_Event_LabelDescription_t K8Processor_info[] = { { "k8-bu-cpu-clk-unhalted", "Count the number of clock cycles when the CPU is not in the HLT or STPCLCK states" }, { "k8-bu-fill-request-l2-miss", "Count fill requests that missed in the L2 cache."}, { "k8-bu-internal-l2-request", "Count internally generated requests to the L2 cache." }, { "k8-dc-access", "Count data cache accesses including microcode scratchpad accesses."}, { "k8-dc-copyback", "Count data cache copyback operations."}, { "k8-dc-dcache-accesses-by-locks", "Count data cache accesses by lock instructions." }, { "k8-dc-dispatched-prefetch-instructions", "Count the number of dispatched prefetch instructions." }, { "k8-dc-l1-dtlb-miss-and-l2-dtlb-hit", "Count L1 DTLB misses that are L2 DTLB hits." }, { "k8-dc-l1-dtlb-miss-and-l2-dtlb-miss", "Count L1 DTLB misses that are also misses in the L2 DTLB." }, { "k8-dc-microarchitectural-early-cancel-of-an-access", "Count microarchitectural early cancels of data cache accesses." }, { "k8-dc-microarchitectural-late-cancel-of-an-access", "Count microarchitectural late cancels of data cache accesses." }, { "k8-dc-misaligned-data-reference", "Count misaligned data references." 
}, { "k8-dc-miss", "Count data cache misses."}, { "k8-dc-one-bit-ecc-error", "Count one bit ECC errors found by the scrubber." }, { "k8-dc-refill-from-l2", "Count data cache refills from L2 cache." }, { "k8-dc-refill-from-system", "Count data cache refills from system memory." }, { "k8-fp-dispatched-fpu-ops", "Count the number of dispatched FPU ops." }, { "k8-fp-cycles-with-no-fpu-ops-retired", "Count cycles when no FPU ops were retired." }, { "k8-fp-dispatched-fpu-fast-flag-ops", "Count dispatched FPU ops that use the fast flag interface." }, { "k8-fr-decoder-empty", "Count cycles when there was nothing to dispatch." }, { "k8-fr-dispatch-stalls", "Count all dispatch stalls." }, { "k8-fr-dispatch-stall-for-segment-load", "Count dispatch stalls for segment loads." }, { "k8-fr-dispatch-stall-for-serialization", "Count dispatch stalls for serialization." }, { "k8-fr-dispatch-stall-from-branch-abort-to-retire", "Count dispatch stalls from branch abort to retiral." }, { "k8-fr-dispatch-stall-when-fpu-is-full", "Count dispatch stalls when the FPU is full." }, { "k8-fr-dispatch-stall-when-ls-is-full", "Count dispatch stalls when the load/store unit is full." }, { "k8-fr-dispatch-stall-when-reorder-buffer-is-full", "Count dispatch stalls when the reorder buffer is full." }, { "k8-fr-dispatch-stall-when-reservation-stations-are-full", "Count dispatch stalls when reservation stations are full." }, { "k8-fr-dispatch-stall-when-waiting-for-all-to-be-quiet", "Count dispatch stalls when waiting for all to be quiet." }, { "k8-fr-dispatch-stall-when-waiting-far-xfer-or-resync-branch-pending", "Count dispatch stalls when a far control transfer or a resync branch is pending." }, { "k8-fr-fpu-exceptions", "Count FPU exceptions." }, { "k8-fr-interrupts-masked-cycles", "Count cycles when interrupts were masked." 
}, { "k8-fr-interrupts-masked-while-pending-cycles", "Count cycles while interrupts were masked while pending" }, { "k8-fr-number-of-breakpoints-for-dr0", "Count the number of breakpoints for DR0." }, { "k8-fr-number-of-breakpoints-for-dr1", "Count the number of breakpoints for DR1." }, { "k8-fr-number-of-breakpoints-for-dr2", "Count the number of breakpoints for DR2." }, { "k8-fr-number-of-breakpoints-for-dr3", "Count the number of breakpoints for DR3." }, { "k8-fr-retired-branches", "Count retired branches including exceptions and interrupts." }, { "k8-fr-retired-branches-mispredicted", "Count mispredicted retired branches." }, { "k8-fr-retired-far-control-transfers", "Count retired far control transfers" }, { "k8-fr-retired-fastpath-double-op-instructions", "Count retired fastpath double op instructions." }, { "k8-fr-retired-fpu-instructions", "Count retired FPU instructions." }, { "k8-fr-retired-near-returns", "Count retired near returns." }, { "k8-fr-retired-near-returns-mispredicted", "Count mispredicted near returns." }, { "k8-fr-retired-resyncs", "Count retired resyncs" }, { "k8-fr-retired-taken-hardware-interrupts", "Count retired taken hardware interrupts."}, { "k8-fr-retired-taken-branches", "Count retired taken branches." }, { "k8-fr-retired-taken-branches-mispredicted", "Count retired taken branches that were mispredicted." }, { "k8-fr-retired-taken-branches-mispredicted-by-addr-miscompare", "Count retired taken branches that were mispredicted only due to an address miscompare." }, { "k8-fr-retired-uops", "Count retired uops." }, { "k8-fr-retired-x86-instructions", "Count retired x86 instructions including exceptions and interrupts"}, { "k8-ic-fetch", "Count instruction cache fetches." }, { "k8-ic-instruction-fetch-stall", "Count cycles in stalls due to instruction fetch." }, { "k8-ic-l1-itlb-miss-and-l2-itlb-hit", "Count L1 ITLB misses that are L2 ITLB hits." 
}, { "k8-ic-l1-itlb-miss-and-l2-itlb-miss", "Count ITLB misses that miss in both L1 and L2 ITLBs." }, { "k8-ic-microarchitectural-resync-by-snoop", "Count microarchitectural resyncs caused by snoops." }, { "k8-ic-miss", "Count instruction cache misses." }, { "k8-ic-refill-from-l2", "Count instruction cache refills from L2 cache." }, { "k8-ic-refill-from-system", "Count instruction cache refills from system memory." }, { "k8-ic-return-stack-hits", "Count hits to the return stack." }, { "k8-ic-return-stack-overflow", "Count overflows of the return stack." }, { "k8-ls-buffer2-full", "Count load/store buffer2 full events." }, { "k8-ls-locked-operation", "Count locked operations." }, { "k8-ls-microarchitectural-late-cancel", "Count microarchitectural late cancels of operations in the load/store unit" }, { "k8-ls-microarchitectural-resync-by-self-modifying-code", "Count microarchitectural resyncs caused by self-modifying code." }, { "k8-ls-microarchitectural-resync-by-snoop", "Count microarchitectural resyncs caused by snoops." }, { "k8-ls-retired-cflush-instructions", "Count retired CFLUSH instructions." }, { "k8-ls-retired-cpuid-instructions", "Count retired CPUID instructions." }, { "k8-ls-segment-register-load", "Count segment register loads." }, { "k8-nb-memory-controller-bypass-saturation", "Count memory controller bypass counter saturation events." }, { "k8-nb-memory-controller-dram-slots-missed", "Count memory controller DRAM command slots missed (in MemClks)." }, { "k8-nb-memory-controller-page-access-event", "Count memory controller page access events." }, { "k8-nb-memory-controller-page-table-overflow", "Count memory control page table overflow events." }, { "k8-nb-probe-result", "Count probe events." }, { "k8-nb-sized-commands", "Count sized commands issued." }, { "k8-nb-memory-controller-turnaround", "Count memory control turnaround events." 
}, { "k8-nb-ht-bus0-bandwidth", "Count events on the HyperTransport(tm) bus #0" }, { "k8-nb-ht-bus1-bandwidth", "Count events on the HyperTransport(tm) bus #1" }, { "k8-nb-ht-bus2-bandwidth", "Count events on the HyperTransport(tm) bus #2" }, /* Special counters with some masks activated */ { "k8-dc-refill-from-l2,mask=+modified,+owner,+exclusive,+shared", "Count data cache refills from L2 cache (in MOES state)." }, { "k8-dc-refill-from-l2,mask=+owner,+exclusive,+shared", "Count data cache refills from L2 cache (in OES state)." }, { "k8-dc-refill-from-l2,mask=+modified", "Count data cache refills from L2 cache (in M state)." }, { "k8-dc-refill-from-system,mask=+modified,+owner,+exclusive,+shared", "Count data cache refills from system memory (in MOES state)." }, { "k8-dc-refill-from-system,mask=+owner,+exclusive,+shared", "Count data cache refills from system memory (in OES state)." }, { "k8-dc-refill-from-system,mask=+modified", "Count data cache refills from system memory (in M state)." }, { "k8-fp-dispatched-fpu-ops,mask=+multiply-pipe-junk-ops", "Count the number of dispatched FPU multiplies." }, { "k8-fp-dispatched-fpu-ops,mask=+add-pipe-junk-ops", "Count the number of dispatched FPU adds." }, { "k8-fp-dispatched-fpu-ops,mask=+multiply-pipe-junk-ops,+add-pipe-junk-ops", "Count the number of dispatched FPU adds and multiplies." 
}, { NULL, NULL } }; papi-papi-7-2-0-t/src/freebsd/map-k8.h000066400000000000000000000073541502707512200173200ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-k8.h * CVS: $Id$ * Author: Harald Servat * redcrash@gmail.com */ #ifndef FreeBSD_MAP_K8 #define FreeBSD_MAP_K8 enum NativeEvent_Value_K8Processor { PNE_K8_BU_CPU_CLK_UNHALTED = PAPI_NATIVE_MASK, PNE_K8_BU_FILL_REQUEST_L2_MISS, PNE_K8_BU_INTERNAL_L2_REQUEST, PNE_K8_DC_ACCESS, PNE_K8_DC_COPYBACK, PNE_K8_DC_DCACHE_ACCESSES_BY_LOCKS, PNE_K8_DC_DISPATCHED_PREFETCH_INSTRUCTIONS, PNE_K8_DC_L1_DTLB_MISS_AND_L2_DTLB_HIT, PNE_K8_DC_L1_DTLB_MISS_AND_L2_DTLB_MISS, PNE_K8_DC_MICROARCHITECTURAL_EARLY_CANCEL_OF_AN_ACCESS, PNE_K8_DC_MICROARCHITECTURAL_LATE_CANCEL_OF_AN_ACCESS, PNE_K8_DC_MISALIGNED_DATA_REFERENCE, PNE_K8_DC_MISS, PNE_K8_DC_ONE_BIT_ECC_ERROR, PNE_K8_DC_REFILL_FROM_L2, PNE_K8_DC_REFILL_FROM_SYSTEM, PNE_K8_FP_DISPATCHED_FPU_OPS, PNE_K8_FP_CYCLES_WITH_NO_FPU_OPS_RETIRED, PNE_K8_FP_DISPATCHED_FPU_FAST_FLAG_OPS, PNE_K8_FR_DECODER_EMPTY, PNE_K8_FR_DISPATCH_STALLS, PNE_K8_FR_DISPATCH_STALL_FOR_SEGMENT_LOAD, PNE_K8_FR_DISPATCH_STALL_FOR_SERIALIZATION, PNE_K8_FR_DISPATCH_STALL_FOR_BRANCH_ABORT_TO_RETIRE, PNE_K8_FR_DISPATCH_STALL_WHEN_FPU_IS_FULL, PNE_K8_FR_DISPATCH_STALL_WHEN_LS_IS_FULL, PNE_K8_FR_DISPATCH_STALL_WHEN_REORDER_BUFFER_IS_FULL, PNE_K8_FR_DISPATCH_STALL_WHEN_RESERVATION_STATIONS_ARE_FULL, PNE_K8_FR_DISPATCH_STALL_WHEN_WAITING_FOR_ALL_TO_BE_QUIET, PNE_K8_FR_DISPATCH_STALL_WHEN_WAITING_FAR_XFER_OR_RESYNC_BRANCH_PENDING, PNE_K8_FR_FPU_EXCEPTIONS, PNE_K8_FR_INTERRUPTS_MASKED_CYCLES, PNE_K8_FR_INTERRUPTS_MASKED_WHILE_PENDING_CYCLES, PNE_K8_FR_NUMBER_OF_BREAKPOINTS_FOR_DR0, PNE_K8_FR_NUMBER_OF_BREAKPOINTS_FOR_DR1, PNE_K8_FR_NUMBER_OF_BREAKPOINTS_FOR_DR2, PNE_K8_FR_NUMBER_OF_BREAKPOINTS_FOR_DR3, PNE_K8_FR_RETIRED_BRANCHES, PNE_K8_FR_RETIRED_BRANCHES_MISPREDICTED, PNE_K8_FR_RETIRED_FAR_CONTROL_TRANSFERS, 
PNE_K8_FR_RETIRED_FASTPATH_DOUBLE_OP_INSTRUCTIONS, PNE_K8_FR_RETIRED_FPU_INSTRUCTIONS, PNE_K8_FR_RETIRED_NEAR_RETURNS, PNE_K8_FR_RETIRED_NEAR_RETURNS_MISPREDICTED, PNE_K8_FR_RETIRED_RESYNCS, PNE_K8_FR_RETIRED_TAKEN_HARDWARE_INTERRUPTS, PNE_K8_FR_RETIRED_TAKEN_BRANCHES, PNE_K8_FR_RETIRED_TAKEN_BRANCHES_MISPREDICTED, PNE_K8_FR_RETIRED_TAKEN_BRANCHES_MISPREDICTED_BY_ADDR_MISCOMPARE, PNE_K8_FR_RETIRED_UOPS, PNE_K8_FR_RETIRED_X86_INSTRUCTIONS, PNE_K8_IC_FETCH, PNE_K8_IC_INSTRUCTION_FETCH_STALL, PNE_K8_IC_L1_ITLB_MISS_AND_L2_ITLB_HIT, PNE_K8_IC_L1_ITLB_MISS_AND_L2_ITLB_MISS, PNE_K8_IC_MICROARCHITECTURAL_RESYNC_BY_SNOOP, PNE_K8_IC_MISS, PNE_K8_IC_REFILL_FROM_L2, PNE_K8_IC_REFILL_FROM_SYSTEM, PNE_K8_RETURN_STACK_HITS, PNE_K8_RETURN_STACK_OVERFLOW, PNE_K8_LS_BUFFER2_FULL, PNE_K8_LS_LOCKED_OPERATION, PNE_K8_LS_MICROARCHITECTURAL_LATE_CANCEL, PNE_K8_LS_MICROARCHITECTURAL_RESYNC_BY_SELF_MODIFYING_CODE, PNE_K8_LS_MICROARCHITECTURAL_RESYNc_BY_SNOOP, PNE_K8_LS_RETIRED_CFLUSH_INSTRUCTIONS, PNE_K8_LS_RETIRED_CPUID_INSTRUCTIONS, PNE_K8_LS_SEGMENT_REGISTER_LOAD, PNE_K8_NB_MEMORY_CONTROLLER_BYPASS_SATURATION, PNE_K8_NB_MEMORY_CONTROLLER_DRAM_SLOTS_MISSED, PNE_K8_NB_MEMORY_CONTROLLER_PAGE_ACCESS_EVENT, PNE_K8_NB_MEMORY_CONTROLLER_PAGE_TABLE_OVERFLOW, PNE_K8_NB_PROBE_RESULT, PNE_K8_NB_SIZED_COMMANDS, PNE_K8_NB_MEMORY_CONTROLLER_TURNAROUND, PNE_K8_NB_HT_BUS0_BANDWIDTH, PNE_K8_NB_HT_BUS1_BANDWIDTH, PNE_K8_NB_HT_BUS2_BANDWIDTH, /* Special counters */ PNE_K8_DC_REFILL_FROM_L2_MOES, PNE_K8_DC_REFILL_FROM_L2_OES, PNE_K8_DC_REFILL_FROM_L2_M, PNE_K8_DC_REFILL_FROM_SYSTEM_MOES, PNE_K8_DC_REFILL_FROM_SYSTEM_OES, PNE_K8_DC_REFILL_FROM_SYSTEM_M, PNE_K8_FP_DISPATCHED_FPU_MULS, PNE_K8_FP_DISPATCHED_FPU_ADDS, PNE_K8_FP_DISPATCHED_FPU_ADDS_AND_MULS, PNE_K8_NATNAME_GUARD }; extern Native_Event_LabelDescription_t K8Processor_info[]; extern hwi_search_t K8Processor_map[]; #endif 
papi-papi-7-2-0-t/src/freebsd/map-p4.c000066400000000000000000000133421502707512200173060ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-p4.c * Author: Harald Servat * redcrash@gmail.com */ #include "freebsd.h" #include "papiStdEventDefs.h" #include "map.h" /**************************************************************************** P4 SUBSTRATE P4 SUBSTRATE P4 SUBSTRATE (aka Pentium IV) P4 SUBSTRATE P4 SUBSTRATE ****************************************************************************/ /* NativeEvent_Value_P4Processor must match P4Processor_info */ Native_Event_LabelDescription_t P4Processor_info[] = { { "p4-128bit-mmx-uop", "Count integer SIMD SSE2 instructions that operate on 128 bit SIMD operands." }, { "p4-64bit-mmx-uop", "Count MMX instructions that operate on 64 bit SIMD operands." }, { "p4-b2b-cycles", "Count back-to-back bus cycles." }, { "p4-bnr", "Count bus-not-ready conditions." }, { "p4-bpu-fetch-request", "Count instruction fetch requests." }, { "p4-branch-retired", "Count retired branches." }, { "p4-bsq-active-entries", "Count the number of entries (clipped at 15) currently active in the BSQ." }, { "p4-bsq-allocation", "Count allocations in the bus sequence unit." }, { "p4-bsq-cache-reference", "Count cache references as seen by the bus unit." }, { "p4-execution-event", "Count the retirement of uops through the execution mechanism." }, { "p4-front-end-event", "Count the retirement of uops through the frontend mechanism." }, { "p4-fsb-data-activity", "Count each DBSY or DRDY event." }, { "p4-global-power-events", "Count cycles during which the processor is not stopped." }, { "p4-instr-retired", "Count all kinds of instructions retired during a clock cycle." }, { "p4-ioq-active-entries", "Count the number of entries (clipped at 15) in the IOQ that are active." }, { "p4-ioq-allocation", "Count various types of transactions on the bus."
}, { "p4-itlb-reference", "Count translations using the intruction translation look-aside buffer." }, { "p4-load-port-replay", "Count replayed events at the load port." }, { "p4-mispred-branch-retired", "Count mispredicted IA-32 branch instructions." }, { "p4-machine-clear", "Count the number of pipeline clears seen by the processor." }, { "p4-memory-cancel", " Count the cancelling of various kinds of requests in the data cache address control unit of the CPU." }, { "p4-memory-complete", "Count the completion of load split, store split, uncacheable split and uncacheable load operations." }, { "p4-mob-load-replay", "Count load replays triggered by the memory order buffer." }, { "p4-packed-dp-uop", "Count packed double-precision uops." }, { "p4-packed-sp-uop", "Count packed single-precision uops." }, { "p4-page-walk-type", "Count page walks performed by the page miss handler." }, { "p4-replay-event", "Count the retirement of tagged uops" }, { "p4-resource-stall", "Count the occurrence or latency of stalls in the allocator." }, { "p4-response", "Count different types of responses." }, { "p4-retired-branch-type", "Count branches retired." }, { "p4-retired-mispred-branch-type", "Count mispredicted branches retired." }, { "p4-scalar-dp-uop", "Count the number of scalar double-precision uops." }, { "p4-scalar-sp-uop", "Count the number of scalar single-precision uops." }, { "p4-snoop", "Count snoop traffic." }, { "p4-sse-input-assist", "Count the number of times an assist is required to handle problems with the operands for SSE and SSE2 operations." }, { "p4-store-port-replay", "Count events replayed at the store port." }, { "p4-tc-deliver-mode", "Count the duration in cycles of operating modes of the trace cache and decode engine." }, { "p4-tc-ms-xfer", "Count the number of times uop delivery changed from the trace cache to MS ROM." }, { "p4-uop-queue-writes", "Count the number of valid uops written to the uop queue." 
}, { "p4-uop-type", "This event is used in conjunction with the front-end at-retirement mechanism to tag load and store uops." }, { "p4-uops-retired", "Count uops retired during a clock cycle." }, { "p4-wc-buffer", "Count write-combining buffer operations." }, { "p4-x87-assist", "Count the retirement of x87 instructions that required special handling." }, { "p4-x87-fp-uop", "Count x87 floating-point uops." }, { "p4-x87-simd-moves-uop", "Count each x87 FPU, MMX, SSE, or SSE2 uops that load data or store data or perform register-to-register moves." }, /* counters with some modifiers */ { "p4-uop-queue-writes,mask=+from-tc-build,+from-tc-deliver", "Count the number of valid uops written to the uop queue." }, { "p4-page-walk-type,mask=+dtmiss", "Count data page walks performed by the page miss handler." }, { "p4-page-walk-type,mask=+itmiss", "Count instruction page walks performed by the page miss handler." }, { "p4-instr-retired,mask=+nbogusntag,+nbogustag", "Count all non-bogus instructions retired during a clock cycle." }, { "p4-branch-retired,mask=+mmnp,+mmnm", "Count branches not-taken." }, { "p4-branch-retired,mask=+mmtm,+mmtp", "Count branches taken." }, { "p4-branch-retired,mask=+mmnp,+mmtp", "Count branches predicted." }, { "p4-branch-retired,mask=+mmnm,+mmtm", "Count branches mis-predicted." }, { "p4-bsq-cache-reference,mask=+rd-2ndl-miss", "Count 2nd level cache misses." }, { "p4-bsq-cache-reference,mask=+rd-2ndl-miss,+rd-2ndl-hits,+rd-2ndl-hite,+rd-2ndl-hitm", "Count 2nd level cache accesses." }, { "p4-bsq-cache-reference,mask=+rd-2ndl-hits,+rd-2ndl-hite,+rd-2ndl-hitm", "Count 2nd level cache hits." }, { "p4-bsq-cache-reference,mask=+rd-3rdl-miss", "Count 3rd level cache misses." }, { "p4-bsq-cache-reference,mask=+rd-3rdl-miss,+rd-3rdl-hits,+rd-3rdl-hite,+rd-3rdl-hitm", "Count 3rd level cache accesses." }, { "p4-bsq-cache-reference,mask=+rd-3rdl-hits,+rd-3rdl-hite,+rd-3rdl-hitm", "Count 3rd level cache hits." 
}, { NULL, NULL } }; papi-papi-7-2-0-t/src/freebsd/map-p4.h000066400000000000000000000037571502707512200173240ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-p4.h * CVS: $Id$ * Author: Harald Servat * redcrash@gmail.com */ #ifndef FreeBSD_MAP_P4 #define FreeBSD_MAP_P4 enum NativeEvent_Value_P4Processor { PNE_P4_128BIT_MMX_UOP = PAPI_NATIVE_MASK, PNE_P4_64BIT_MMX_UOP, PNE_P4_B2B_CYCLES, PNE_P4_BNR, PNE_P4_BPU_FETCH_REQUEST, PNE_P4_BRANH_RETIRED, PNE_P4_BSQ_ACTIVE_ENTRIES, PNE_P4_BSQ_ALLOCATION, PNE_P4_BSQ_CACHE_REFERENCE, PNE_P4_EXECUTION_EVENT, PNE_P4_FRONT_END_EVENT, PNE_P4_FSB_DATA_ACTIVITY, PNE_P4_GLOBAL_POWER_EVENTS, PNE_P4_INSTR_RETIRED, PNE_P4_IOQ_ACTIVE_ENTRIES, PNE_P4_IOQ_ALLOCATION, PNE_P4_ITLB_REFERENCE, PNE_P4_LOAD_PORT_REPLAY, PNE_P4_MISPRED_BRANCH_RETIRED, PNE_P4_MACHINE_CLEAR, PNE_P4_MEMORY_CANCEL, PNE_P4_MEMORY_COMPLETE, PNE_P4_MOB_LOAD_REPLAY, PNE_P4_PACKED_DP_UOP, PNE_P4_PACKED_SP_UOP, PNE_P4_PAGE_WALK_TYPE, PNE_P4_REPLAY_EVENT, PNE_P4_RESOURCE_STALL, PNE_P4_RESPONSE, PNE_P4_RETIRED_BRANCH_TYPE, PNE_P4_RETIRED_MISPRED_BRANCH_TYPE, PNE_P4_SCALAR_DP_UOP, PNE_P4_SCALAR_SP_UOP, PNE_P4_SNOOP, PNE_P4_SSE_INPUT_ASSIST, PNE_P4_STORE_PORT_REPLAY, PNE_P4_TC_DELIVER_MODE, PNE_P4_TC_MS_XFER, PNE_P4_UOP_QUEUE_WRITES, PNE_P4_UOP_TYPE, PNE_P4_UOPS_RETIRED, PNE_P4_WC_BUFFER, PNE_P4_X87_ASSIST, PNE_P4_X87_FP_UOP, PNE_P4_X87_SIMD_MOVES_UOP, /* Special counters */ PNE_P4_UOP_QUEUE_WRITES_TC_BUILD_DELIVER, PNE_P4_PAGE_WALK_TYPE_D, PNE_P4_PAGE_WALK_TYPE_I, PNE_P4_INSTR_RETIRED_NON_BOGUS, PNE_P4_BRANCH_RETIRED_NOT_TAKEN, PNE_P4_BRANCH_RETIRED_TAKEN, PNE_P4_BRANCH_RETIRED_PREDICTED, PNE_P4_BRANCH_RETIRED_MISPREDICTED, PNE_P4_BSQ_CACHE_REFERENCE_2L_MISSES, PNE_P4_BSQ_CACHE_REFERENCE_2L_ACCESSES, PNE_P4_BSQ_CACHE_REFERENCE_2L_HITS, PNE_P4_BSQ_CACHE_REFERENCE_3L_MISSES, PNE_P4_BSQ_CACHE_REFERENCE_3L_ACCESSES, PNE_P4_BSQ_CACHE_REFERENCE_3L_HITS, PNE_P4_NATNAME_GUARD }; extern 
Native_Event_LabelDescription_t P4Processor_info[]; extern hwi_search_t P4Processor_map[]; #endif papi-papi-7-2-0-t/src/freebsd/map-p6-2.c000066400000000000000000000166131502707512200174530ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-p6-2.c * Author: Harald Servat * redcrash@gmail.com */ #include "freebsd.h" #include "papiStdEventDefs.h" #include "map.h" /**************************************************************************** P6_2 SUBSTRATE P6_2 SUBSTRATE P6_2 SUBSTRATE (aka Pentium II) P6_2 SUBSTRATE P6_2 SUBSTRATE ****************************************************************************/ /* NativeEvent_Value_P6_2_Processor must match P6_2_Processor_info */ Native_Event_LabelDescription_t P6_2_Processor_info[] = { /* Common P6 counters */ { "p6-baclears", "Count the number of times a static branch prediction was made by the branch decoder because the BTB did not have a prediction." }, { "p6-br-bogus", "Count the number of bogus branches." }, { "p6-br-inst-decoded", "Count the number of branch instructions decoded." }, { "p6-br-inst-retired", "Count the number of branch instructions retired." }, { "p6-br-miss-pred-retired", "Count the number of mispredicted branch instructions retired." }, { "p6-br-miss-pred-taken-ret", "Count the number of taken mispredicted branches retired." }, { "p6-br-taken-retired", "Count the number of taken branches retired." }, { "p6-btb-misses", "Count the number of branches for which the BTB did not produce a prediction. "}, { "p6-bus-bnr-drv", "Count the number of bus clock cycles during which this processor is driving the BNR# pin." }, { "p6-bus-data-rcv", "Count the number of bus clock cycles during which this processor is receiving data." }, { "p6-bus-drdy-clocks", "Count the number of clocks during which DRDY# is asserted." 
}, { "p6-bus-hit-drv", "Count the number of bus clock cycles during which this processor is driving the HIT# pin." }, { "p6-bus-hitm-drv", "Count the number of bus clock cycles during which this processor is driving the HITM# pin." }, { "p6-bus-lock-clocks", "Count the number of clocks during with LOCK# is asserted on the external system bus." }, { "p6-bus-req-outstanding", "Count the number of bus requests outstanding in any given cycle." }, { "p6-bus-snoop-stall", "Count the number of clock cycles during which the bus is snoop stalled." }, { "p6-bus-tran-any", "Count the number of completed bus transactions of any kind." }, { "p6-bus-tran-brd", "Count the number of burst read transactions." }, { "p6-bus-tran-burst", "Count the number of completed burst transactions." }, { "p6-bus-tran-def", "Count the number of completed deferred transactions." }, { "p6-bus-tran-ifetch", "Count the number of completed instruction fetch transactions." }, { "p6-bus-tran-inval", "Count the number of completed invalidate transactions." }, { "p6-bus-tran-mem", "Count the number of completed memory transactions." }, { "p6-bus-tran-pwr", "Count the number of completed partial write transactions." }, { "p6-bus-tran-rfo", "Count the number of completed read-for-ownership transactions." }, { "p6-bus-trans-io", "Count the number of completed I/O transactions." }, { "p6-bus-trans-p", "Count the number of completed partial transactions." }, { "p6-bus-trans-wb", "Count the number of completed write-back transactions." }, { "p6-cpu-clk-unhalted", "Count the number of cycles during with the processor was not halted." }, { "p6-cycles-div-busy", "Count the number of cycles during which the divider is busy and cannot accept new divides." }, { "p6-cycles-in-pending-and-masked", "Count the number of processor cycles for which interrupts were disabled and interrupts were pending." }, { "p6-cycles-int-masked", "Count the number of processor cycles for which interrupts were disabled." 
}, { "p6-data-mem-refs", "Count all loads and all stores using any memory type, including internal retries." }, { "p6-dcu-lines-in", "Count the total lines allocated in the data cache unit." }, { "p6-dcu-m-lines-in", "Count the number of M state lines allocated in the data cache unit." }, { "p6-dcu-m-lines-out", "Count the number of M state lines evicted from the data cache unit." }, { "p6-dcu-miss-outstanding", "Count the weighted number of cycles while a data cache unit miss is outstanding, incremented by the number of outstanding cache misses at any time."}, { "p6-div", "Count the number of integer and floating-point divides including speculative divides." }, { "p6-flops", "Count the number of computational floating point operations retired." }, { "p6-fp-assist", "Count the number of floating point exceptions handled by microcode." }, { "p6-fp-comps-ops-exe", "Count the number of computation floating point operations executed." }, { "p6-hw-int-rx", "Count the number of hardware interrupts received." }, { "p6-ifu-fetch", "Count the number of instruction fetches, both cacheable and non-cacheable." }, { "p6-ifu-fetch-miss", "Count the number of instruction fetch misses" }, { "p6-ifu-mem-stall", "Count the number of cycles instruction fetch is stalled for any reason." }, { "p6-ild-stall", "Count the number of cycles the instruction length decoder is stalled." }, { "p6-inst-decoded", "Count the number of instructions decoded." }, { "p6-inst-retired", "Count the number of instructions retired." }, { "p6-itlb-miss", "Count the number of instruction TLB misses." }, { "p6-l2-ads", "Count the number of L2 address strobes." }, { "p6-l2-dbus-busy", "Count the number of cycles during which the L2 cache data bus was busy." }, { "p6-l2-dbus-busy-rd", "Count the number of cycles during which the L2 cache data bus was busy transferring read data from L2 to the processor." }, { "p6-l2-ifetch", "Count the number of L2 instruction fetches." 
}, { "p6-l2-ld", "Count the number of L2 data loads." }, { "p6-l2-lines-in", "Count the number of L2 lines allocated." }, { "p6-l2-lines-out", "Count the number of L2 lines evicted." }, { "p6-l2-m-lines-inm", "Count the number of modified lines allocated in L2 cache." }, { "p6-l2-m-lines-outm", "Count the number of L2 M-state lines evicted." }, { "p6-l2-rqsts", "Count the total number of L2 requests." }, { "p6-l2-st", "Count the number of L2 data stores." }, { "p6-ld-blocks", "Count the number of load operations delayed due to store buffer blocks." }, { "p6-misalign-mem-ref", "Count the number of misaligned data memory references (crossing a 64 bit boundary)." }, { "p6-mul", "Count the number of floating point multiplies." }, { "p6-partial-rat-stalls", "Count the number of cycles or events for partial stalls." }, { "p6-resource-stalls", "Count the number of cycles there was a resource related stall of any kind." }, { "p6-sb-drains", "Count the number of cycles the store buffer is draining." }, { "p6-segment-reg-loads", "Count the number of segment register loads." }, { "p6-uops-retired", "Count the number of micro-ops retired."}, /* Specific Pentium 2 counters */ { "p6-fp-mmx-trans", "Count the number of transitions between MMX and floating-point instructions." }, { "p6-mmx-assist", "Count the number of MMX assists executed" }, { "p6-mmx-instr-exec", "Count the number of MMX instructions executed" }, { "p6-mmx-instr-ret", "Count the number of MMX instructions retired." }, { "p6-mmx-sat-instr-exec", "Count the number of MMX saturating instructions executed" }, { "p6-mmx-uops-exec", "Count the number of MMX micro-ops executed" }, { "p6-ret-seg-renames", "Count the number of segment register rename events retired." 
}, { "p6-seg-rename-stalls", "Count the number of segment register renaming stalls" }, { NULL, NULL } }; papi-papi-7-2-0-t/src/freebsd/map-p6-2.h000066400000000000000000000045151502707512200174560ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-p6-2.h * CVS: $Id$ * Author: Harald Servat * redcrash@gmail.com */ #ifndef FreeBSD_MAP_P6_2 #define FreeBSD_MAP_P6_2 enum NativeEvent_Value_P6_2_Processor { /* P6 common events */ PNE_P6_2_BACLEARS = PAPI_NATIVE_MASK, PNE_P6_2_BR_BOGUS, PNE_P6_2_BR_INST_DECODED, PNE_P6_2_BR_INST_RETIRED, PNE_P6_2_BR_MISS_PRED_RETIRED, PNE_P6_2_BR_MISS_PRED_TAKEN_RET, PNE_P6_2_BR_TAKEN_RETIRED, PNE_P6_2_BTB_MISSES, PNE_P6_2_BUS_BNR_DRV, PNE_P6_2_BUS_DATA_RCV, PNE_P6_2_BUS_DRDY_CLOCKS, PNE_P6_2_BUS_HIT_DRV, PNE_P6_2_BUS_HITM_DRV, PNE_P6_2_BUS_LOCK_CLOCKS, PNE_P6_2_BUS_REQ_OUTSTANDING, PNE_P6_2_BUS_SNOOP_STALL, PNE_P6_2_BUS_TRAN_ANY, PNE_P6_2_BUS_TRAN_BRD, PNE_P6_2_BUS_TRAN_BURST, PNE_P6_2_BUS_TRAN_DEF, PNE_P6_2_BUS_TRAN_IFETCH, PNE_P6_2_BUS_TRAN_INVAL, PNE_P6_2_BUS_TRAN_MEM, PNE_P6_2_BUS_TRAN_POWER, PNE_P6_2_BUS_TRAN_RFO, PNE_P6_2_BUS_TRANS_IO, PNE_P6_2_BUS_TRANS_P, PNE_P6_2_BUS_TRANS_WB, PNE_P6_2_CPU_CLK_UNHALTED, PNE_P6_2_CYCLES_DIV_BUSY, PNE_P6_2_CYCLES_IN_PENDING_AND_MASKED, PNE_P6_2_CYCLES_INT_MASKED, PNE_P6_2_DATA_MEM_REFS, PNE_P6_2_DCU_LINES_IN, PNE_P6_2_DCU_M_LINES_IN, PNE_P6_2_DCU_M_LINES_OUT, PNE_P6_2_DCU_MISS_OUTSTANDING, PNE_P6_2_DIV, PNE_P6_2_FLOPS, PNE_P6_2_FP_ASSIST, PNE_P6_2_FTP_COMPS_OPS_EXE, PNE_P6_2_HW_INT_RX, PNE_P6_2_IFU_FETCH, PNE_P6_2_IFU_FETCH_MISS, PNE_P6_2_IFU_MEM_STALL, PNE_P6_2_ILD_STALL, PNE_P6_2_INST_DECODED, PNE_P6_2_INST_RETIRED, PNE_P6_2_ITLB_MISS, PNE_P6_2_L2_ADS, PNE_P6_2_L2_DBUS_BUSY, PNE_P6_2_L2_DBUS_BUSY_RD, PNE_P6_2_L2_IFETCH, PNE_P6_2_L2_LD, PNE_P6_2_L2_LINES_IN, PNE_P6_2_L2_LINES_OUT, PNE_P6_2_L2M_LINES_INM, PNE_P6_2_L2M_LINES_OUTM, PNE_P6_2_L2_RQSTS, PNE_P6_2_L2_ST, PNE_P6_2_LD_BLOCKS, PNE_P6_2_MISALIGN_MEM_REF, 
PNE_P6_2_MUL, PNE_P6_2_PARTIAL_RAT_STALLS, PNE_P6_2_RESOURCE_STALL, PNE_P6_2_SB_DRAINS, PNE_P6_2_SEGMENT_REG_LOADS, PNE_P6_2_UOPS_RETIRED, /* Pentium 2 specific events */ PNE_P6_2_FP_MMX_TRANS, PNE_P6_2_MMX_ASSIST, PNE_P6_2_MMX_INSTR_EXEC, PNE_P6_2_MMX_INSTR_RET, PNE_P6_2_MMX_SAT_INSTR_EXEC, PNE_P6_2_MMX_UOPS_EXEC, PNE_P6_2_RET_SEG_RENAMES, PNE_P6_2_SEG_RENAME_STALLS, PNE_P6_2_NATNAME_GUARD }; extern Native_Event_LabelDescription_t P6_2_Processor_info[]; extern hwi_search_t P6_2_Processor_map[]; #endif papi-papi-7-2-0-t/src/freebsd/map-p6-3.c000066400000000000000000000175051502707512200174550ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-p6-3.c * Author: Harald Servat * redcrash@gmail.com */ #include "freebsd.h" #include "papiStdEventDefs.h" #include "map.h" /**************************************************************************** P6_3 SUBSTRATE P6_3 SUBSTRATE P6_3 SUBSTRATE (aka Pentium III) P6_3 SUBSTRATE P6_3 SUBSTRATE ****************************************************************************/ /* NativeEvent_Value_P6_3_Processor must match P6_3_Processor_info */ Native_Event_LabelDescription_t P6_3_Processor_info[] = { /* Common P6 counters */ { "p6-baclears", "Count the number of times a static branch prediction was made by the branch decoder because the BTB did not have a prediction." }, { "p6-br-bogus", "Count the number of bogus branches." }, { "p6-br-inst-decoded", "Count the number of branch instructions decoded." }, { "p6-br-inst-retired", "Count the number of branch instructions retired." }, { "p6-br-miss-pred-retired", "Count the number of mispredicted branch instructions retired." }, { "p6-br-miss-pred-taken-ret", "Count the number of taken mispredicted branches retired." }, { "p6-br-taken-retired", "Count the number of taken branches retired." }, { "p6-btb-misses", "Count the number of branches for which the BTB did not produce a prediction. 
"}, { "p6-bus-bnr-drv", "Count the number of bus clock cycles during which this processor is driving the BNR# pin." }, { "p6-bus-data-rcv", "Count the number of bus clock cycles during which this processor is receiving data." }, { "p6-bus-drdy-clocks", "Count the number of clocks during which DRDY# is asserted." }, { "p6-bus-hit-drv", "Count the number of bus clock cycles during which this processor is driving the HIT# pin." }, { "p6-bus-hitm-drv", "Count the number of bus clock cycles during which this processor is driving the HITM# pin." }, { "p6-bus-lock-clocks", "Count the number of clocks during with LOCK# is asserted on the external system bus." }, { "p6-bus-req-outstanding", "Count the number of bus requests outstanding in any given cycle." }, { "p6-bus-snoop-stall", "Count the number of clock cycles during which the bus is snoop stalled." }, { "p6-bus-tran-any", "Count the number of completed bus transactions of any kind." }, { "p6-bus-tran-brd", "Count the number of burst read transactions." }, { "p6-bus-tran-burst", "Count the number of completed burst transactions." }, { "p6-bus-tran-def", "Count the number of completed deferred transactions." }, { "p6-bus-tran-ifetch", "Count the number of completed instruction fetch transactions." }, { "p6-bus-tran-inval", "Count the number of completed invalidate transactions." }, { "p6-bus-tran-mem", "Count the number of completed memory transactions." }, { "p6-bus-tran-pwr", "Count the number of completed partial write transactions." }, { "p6-bus-tran-rfo", "Count the number of completed read-for-ownership transactions." }, { "p6-bus-trans-io", "Count the number of completed I/O transactions." }, { "p6-bus-trans-p", "Count the number of completed partial transactions." }, { "p6-bus-trans-wb", "Count the number of completed write-back transactions." }, { "p6-cpu-clk-unhalted", "Count the number of cycles during with the processor was not halted." 
}, { "p6-cycles-div-busy", "Count the number of cycles during which the divider is busy and cannot accept new divides." }, { "p6-cycles-in-pending-and-masked", "Count the number of processor cycles for which interrupts were disabled and interrupts were pending." }, { "p6-cycles-int-masked", "Count the number of processor cycles for which interrupts were disabled." }, { "p6-data-mem-refs", "Count all loads and all stores using any memory type, including internal retries." }, { "p6-dcu-lines-in", "Count the total lines allocated in the data cache unit." }, { "p6-dcu-m-lines-in", "Count the number of M state lines allocated in the data cache unit." }, { "p6-dcu-m-lines-out", "Count the number of M state lines evicted from the data cache unit." }, { "p6-dcu-miss-outstanding", "Count the weighted number of cycles while a data cache unit miss is outstanding, incremented by the number of outstanding cache misses at any time."}, { "p6-div", "Count the number of integer and floating-point divides including speculative divides." }, { "p6-flops", "Count the number of computational floating point operations retired." }, { "p6-fp-assist", "Count the number of floating point exceptions handled by microcode." }, { "p6-fp-comps-ops-exe", "Count the number of computation floating point operations executed." }, { "p6-hw-int-rx", "Count the number of hardware interrupts received." }, { "p6-ifu-fetch", "Count the number of instruction fetches, both cacheable and non-cacheable." }, { "p6-ifu-fetch-miss", "Count the number of instruction fetch misses" }, { "p6-ifu-mem-stall", "Count the number of cycles instruction fetch is stalled for any reason." }, { "p6-ild-stall", "Count the number of cycles the instruction length decoder is stalled." }, { "p6-inst-decoded", "Count the number of instructions decoded." }, { "p6-inst-retired", "Count the number of instructions retired." }, { "p6-itlb-miss", "Count the number of instruction TLB misses." 
}, { "p6-l2-ads", "Count the number of L2 address strobes." }, { "p6-l2-dbus-busy", "Count the number of cycles during which the L2 cache data bus was busy." }, { "p6-l2-dbus-busy-rd", "Count the number of cycles during which the L2 cache data bus was busy transferring read data from L2 to the processor." }, { "p6-l2-ifetch", "Count the number of L2 instruction fetches." }, { "p6-l2-ld", "Count the number of L2 data loads." }, { "p6-l2-lines-in", "Count the number of L2 lines allocated." }, { "p6-l2-lines-out", "Count the number of L2 lines evicted." }, { "p6-l2-m-lines-inm", "Count the number of modified lines allocated in L2 cache." }, { "p6-l2-m-lines-outm", "Count the number of L2 M-state lines evicted." }, { "p6-l2-rqsts", "Count the total number of L2 requests." }, { "p6-l2-st", "Count the number of L2 data stores." }, { "p6-ld-blocks", "Count the number of load operations delayed due to store buffer blocks." }, { "p6-misalign-mem-ref", "Count the number of misaligned data memory references (crossing a 64 bit boundary)." }, { "p6-mul", "Count the number of floating point multiplies, including speculative multiplies" }, { "p6-partial-rat-stalls", "Count the number of cycles or events for partial stalls." }, { "p6-resource-stalls", "Count the number of cycles there was a resource related stall of any kind." }, { "p6-sb-drains", "Count the number of cycles the store buffer is draining." }, { "p6-segment-reg-loads", "Count the number of segment register loads." }, { "p6-uops-retired", "Count the number of micro-ops retired."}, /* Specific Pentium 3 counters */ { "p6-fp-mmx-trans", "Count the number of transitions between MMX and floating-point instructions." }, { "p6-mmx-assist", "Count the number of MMX assists executed" }, { "p6-mmx-instr-exec", "Count the number of MMX instructions executed" }, { "p6-mmx-instr-ret", "Count the number of MMX instructions retired." 
}, { "p6-mmx-sat-instr-exec", "Count the number of MMX saturating instructions executed" }, { "p6-mmx-uops-exec", "Count the number of MMX micro-ops executed" }, { "p6-ret-seg-renames", "Count the number of segment register rename events retired." }, { "p6-seg-rename-stalls", "Count the number of segment register renaming stalls" }, { "p6-emon-kni-comp-inst-ret", "Count the number of SSE computational instructions retired" }, { "p6-emon-kni-inst-retired", "Count the number of SSE instructions retired." }, { "p6-emon-kni-pref-dispatched", "Count the number of SSE prefetch or weakly ordered instructions dispatched." }, { "p6-emon-kni-pref-miss", "Count the number of prefetch or weakly ordered instructions that miss all caches." }, { NULL, NULL } }; papi-papi-7-2-0-t/src/freebsd/map-p6-3.h000066400000000000000000000047221502707512200174570ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-p6-3.h * CVS: $Id$ * Author: Harald Servat * redcrash@gmail.com */ #ifndef FreeBSD_MAP_P6_3 #define FreeBSD_MAP_P6_3 enum NativeEvent_Value_P6_3_Processor { /* P6 common events */ PNE_P6_3_BACLEARS = PAPI_NATIVE_MASK, PNE_P6_3_BR_BOGUS, PNE_P6_3_BR_INST_DECODED, PNE_P6_3_BR_INST_RETIRED, PNE_P6_3_BR_MISS_PRED_RETIRED, PNE_P6_3_BR_MISS_PRED_TAKEN_RET, PNE_P6_3_BR_TAKEN_RETIRED, PNE_P6_3_BTB_MISSES, PNE_P6_3_BUS_BNR_DRV, PNE_P6_3_BUS_DATA_RCV, PNE_P6_3_BUS_DRDY_CLOCKS, PNE_P6_3_BUS_HIT_DRV, PNE_P6_3_BUS_HITM_DRV, PNE_P6_3_BUS_LOCK_CLOCKS, PNE_P6_3_BUS_REQ_OUTSTANDING, PNE_P6_3_BUS_SNOOP_STALL, PNE_P6_3_BUS_TRAN_ANY, PNE_P6_3_BUS_TRAN_BRD, PNE_P6_3_BUS_TRAN_BURST, PNE_P6_3_BUS_TRAN_DEF, PNE_P6_3_BUS_TRAN_IFETCH, PNE_P6_3_BUS_TRAN_INVAL, PNE_P6_3_BUS_TRAN_MEM, PNE_P6_3_BUS_TRAN_POWER, PNE_P6_3_BUS_TRAN_RFO, PNE_P6_3_BUS_TRANS_IO, PNE_P6_3_BUS_TRANS_P, PNE_P6_3_BUS_TRANS_WB, PNE_P6_3_CPU_CLK_UNHALTED, PNE_P6_3_CYCLES_DIV_BUSY, PNE_P6_3_CYCLES_IN_PENDING_AND_MASKED, PNE_P6_3_CYCLES_INT_MASKED, 
PNE_P6_3_DATA_MEM_REFS, PNE_P6_3_DCU_LINES_IN, PNE_P6_3_DCU_M_LINES_IN, PNE_P6_3_DCU_M_LINES_OUT, PNE_P6_3_DCU_MISS_OUTSTANDING, PNE_P6_3_DIV, PNE_P6_3_FLOPS, PNE_P6_3_FP_ASSIST, PNE_P6_3_FTP_COMPS_OPS_EXE, PNE_P6_3_HW_INT_RX, PNE_P6_3_IFU_FETCH, PNE_P6_3_IFU_FETCH_MISS, PNE_P6_3_IFU_MEM_STALL, PNE_P6_3_ILD_STALL, PNE_P6_3_INST_DECODED, PNE_P6_3_INST_RETIRED, PNE_P6_3_ITLB_MISS, PNE_P6_3_L2_ADS, PNE_P6_3_L2_DBUS_BUSY, PNE_P6_3_L2_DBUS_BUSY_RD, PNE_P6_3_L2_IFETCH, PNE_P6_3_L2_LD, PNE_P6_3_L2_LINES_IN, PNE_P6_3_L2_LINES_OUT, PNE_P6_3_L2M_LINES_INM, PNE_P6_3_L2M_LINES_OUTM, PNE_P6_3_L2_RQSTS, PNE_P6_3_L2_ST, PNE_P6_3_LD_BLOCKS, PNE_P6_3_MISALIGN_MEM_REF, PNE_P6_3_MUL, PNE_P6_3_PARTIAL_RAT_STALLS, PNE_P6_3_RESOURCE_STALL, PNE_P6_3_SB_DRAINS, PNE_P6_3_SEGMENT_REG_LOADS, PNE_P6_3_UOPS_RETIRED, /* Pentium 3 specific events */ PNE_P6_3_FP_MMX_TRANS, PNE_P6_3_MMX_ASSIST, PNE_P6_3_MMX_INSTR_EXEC, PNE_P6_3_MMX_INSTR_RET, PNE_P6_3_MMX_SAT_INSTR_EXEC, PNE_P6_3_MMX_UOPS_EXEC, PNE_P6_3_RET_SEG_RENAMES, PNE_P6_3_SEG_RENAME_STALLS, PNE_P6_3_EMON_KNI_COMP_INST_RET, PNE_P6_3_EMON_KNI_INST_RETIRED, PNE_P6_3_EMON_KNI_PREF_DISPATCHED, PNE_P6_3_EMON_KNI_PREF_MISS, PNE_P6_3_NATNAME_GUARD }; extern Native_Event_LabelDescription_t P6_3_Processor_info[]; extern hwi_search_t P6_3_Processor_map[]; #endif papi-papi-7-2-0-t/src/freebsd/map-p6-c.c000066400000000000000000000155421502707512200175340ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-p6.c * Author: Harald Servat * redcrash@gmail.com */ #include "freebsd.h" #include "papiStdEventDefs.h" #include "map.h" /**************************************************************************** P6_C SUBSTRATE P6_C SUBSTRATE P6_C SUBSTRATE (aka Celeron) P6_C SUBSTRATE P6_C SUBSTRATE ****************************************************************************/ /* NativeEvent_Value_P6_C_Processor must match P6_C_Processor_info */ Native_Event_LabelDescription_t 
P6_C_Processor_info[] = { /* Common P6 counters */ { "p6-baclears", "Count the number of times a static branch prediction was made by the branch decoder because the BTB did not have a prediction." }, { "p6-br-bogus", "Count the number of bogus branches." }, { "p6-br-inst-decoded", "Count the number of branch instructions decoded." }, { "p6-br-inst-retired", "Count the number of branch instructions retired." }, { "p6-br-miss-pred-retired", "Count the number of mispredicted branch instructions retired." }, { "p6-br-miss-pred-taken-ret", "Count the number of taken mispredicted branches retired." }, { "p6-br-taken-retired", "Count the number of taken branches retired." }, { "p6-btb-misses", "Count the number of branches for which the BTB did not produce a prediction. "}, { "p6-bus-bnr-drv", "Count the number of bus clock cycles during which this processor is driving the BNR# pin." }, { "p6-bus-data-rcv", "Count the number of bus clock cycles during which this processor is receiving data." }, { "p6-bus-drdy-clocks", "Count the number of clocks during which DRDY# is asserted." }, { "p6-bus-hit-drv", "Count the number of bus clock cycles during which this processor is driving the HIT# pin." }, { "p6-bus-hitm-drv", "Count the number of bus clock cycles during which this processor is driving the HITM# pin." }, { "p6-bus-lock-clocks", "Count the number of clocks during with LOCK# is asserted on the external system bus." }, { "p6-bus-req-outstanding", "Count the number of bus requests outstanding in any given cycle." }, { "p6-bus-snoop-stall", "Count the number of clock cycles during which the bus is snoop stalled." }, { "p6-bus-tran-any", "Count the number of completed bus transactions of any kind." }, { "p6-bus-tran-brd", "Count the number of burst read transactions." }, { "p6-bus-tran-burst", "Count the number of completed burst transactions." }, { "p6-bus-tran-def", "Count the number of completed deferred transactions." 
}, { "p6-bus-tran-ifetch", "Count the number of completed instruction fetch transactions." }, { "p6-bus-tran-inval", "Count the number of completed invalidate transactions." }, { "p6-bus-tran-mem", "Count the number of completed memory transactions." }, { "p6-bus-tran-pwr", "Count the number of completed partial write transactions." }, { "p6-bus-tran-rfo", "Count the number of completed read-for-ownership transactions." }, { "p6-bus-trans-io", "Count the number of completed I/O transactions." }, { "p6-bus-trans-p", "Count the number of completed partial transactions." }, { "p6-bus-trans-wb", "Count the number of completed write-back transactions." }, { "p6-cpu-clk-unhalted", "Count the number of cycles during with the processor was not halted." }, { "p6-cycles-div-busy", "Count the number of cycles during which the divider is busy and cannot accept new divides." }, { "p6-cycles-in-pending-and-masked", "Count the number of processor cycles for which interrupts were disabled and interrupts were pending." }, { "p6-cycles-int-masked", "Count the number of processor cycles for which interrupts were disabled." }, { "p6-data-mem-refs", "Count all loads and all stores using any memory type, including internal retries." }, { "p6-dcu-lines-in", "Count the total lines allocated in the data cache unit." }, { "p6-dcu-m-lines-in", "Count the number of M state lines allocated in the data cache unit." }, { "p6-dcu-m-lines-out", "Count the number of M state lines evicted from the data cache unit." }, { "p6-dcu-miss-outstanding", "Count the weighted number of cycles while a data cache unit miss is outstanding, incremented by the number of outstanding cache misses at any time."}, { "p6-div", "Count the number of integer and floating-point divides including speculative divides." }, { "p6-flops", "Count the number of computational floating point operations retired." }, { "p6-fp-assist", "Count the number of floating point exceptions handled by microcode." 
}, { "p6-fp-comps-ops-exe", "Count the number of computation floating point operations executed." }, { "p6-hw-int-rx", "Count the number of hardware interrupts received." }, { "p6-ifu-fetch", "Count the number of instruction fetches, both cacheable and non-cacheable." }, { "p6-ifu-fetch-miss", "Count the number of instruction fetch misses" }, { "p6-ifu-mem-stall", "Count the number of cycles instruction fetch is stalled for any reason." }, { "p6-ild-stall", "Count the number of cycles the instruction length decoder is stalled." }, { "p6-inst-decoded", "Count the number of instructions decoded." }, { "p6-inst-retired", "Count the number of instructions retired." }, { "p6-itlb-miss", "Count the number of instruction TLB misses." }, { "p6-l2-ads", "Count the number of L2 address strobes." }, { "p6-l2-dbus-busy", "Count the number of cycles during which the L2 cache data bus was busy." }, { "p6-l2-dbus-busy-rd", "Count the number of cycles during which the L2 cache data bus was busy transferring read data from L2 to the processor." }, { "p6-l2-ifetch", "Count the number of L2 instruction fetches." }, { "p6-l2-ld", "Count the number of L2 data loads." }, { "p6-l2-lines-in", "Count the number of L2 lines allocated." }, { "p6-l2-lines-out", "Count the number of L2 lines evicted." }, { "p6-l2-m-lines-inm", "Count the number of modified lines allocated in L2 cache." }, { "p6-l2-m-lines-outm", "Count the number of L2 M-state lines evicted." }, { "p6-l2-rqsts", "Count the total number of L2 requests." }, { "p6-l2-st", "Count the number of L2 data stores." }, { "p6-ld-blocks", "Count the number of load operations delayed due to store buffer blocks." }, { "p6-misalign-mem-ref", "Count the number of misaligned data memory references (crossing a 64 bit boundary)." }, { "p6-mul", "Count the number of floating point multiplies, including speculative multiplies." }, { "p6-partial-rat-stalls", "Count the number of cycles or events for partial stalls." 
}, { "p6-resource-stalls", "Count the number of cycles there was a resource related stall of any kind." }, { "p6-sb-drains", "Count the number of cycles the store buffer is draining." }, { "p6-segment-reg-loads", "Count the number of segment register loads." }, { "p6-uops-retired", "Count the number of micro-ops retired."}, /* Specific Celeron counters */ { "p6-mmx-instr-exec", "Count the number of MMX instructions executed" }, { NULL, NULL } }; papi-papi-7-2-0-t/src/freebsd/map-p6-c.h000066400000000000000000000042231502707512200175330ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-p6.h * CVS: $Id$ * Author: Harald Servat * redcrash@gmail.com */ #ifndef FreeBSD_MAP_P6_C #define FreeBSD_MAP_P6_C enum NativeEvent_Value_P6_C_Processor { /* P6 common events */ PNE_P6_C_BACLEARS = PAPI_NATIVE_MASK, PNE_P6_C_BR_BOGUS, PNE_P6_C_BR_INST_DECODED, PNE_P6_C_BR_INST_RETIRED, PNE_P6_C_BR_MISS_PRED_RETIRED, PNE_P6_C_BR_MISS_PRED_TAKEN_RET, PNE_P6_C_BR_TAKEN_RETIRED, PNE_P6_C_BTB_MISSES, PNE_P6_C_BUS_BNR_DRV, PNE_P6_C_BUS_DATA_RCV, PNE_P6_C_BUS_DRDY_CLOCKS, PNE_P6_C_BUS_HIT_DRV, PNE_P6_C_BUS_HITM_DRV, PNE_P6_C_BUS_LOCK_CLOCKS, PNE_P6_C_BUS_REQ_OUTSTANDING, PNE_P6_C_BUS_SNOOP_STALL, PNE_P6_C_BUS_TRAN_ANY, PNE_P6_C_BUS_TRAN_BRD, PNE_P6_C_BUS_TRAN_BURST, PNE_P6_C_BUS_TRAN_DEF, PNE_P6_C_BUS_TRAN_IFETCH, PNE_P6_C_BUS_TRAN_INVAL, PNE_P6_C_BUS_TRAN_MEM, PNE_P6_C_BUS_TRAN_POWER, PNE_P6_C_BUS_TRAN_RFO, PNE_P6_C_BUS_TRANS_IO, PNE_P6_C_BUS_TRANS_P, PNE_P6_C_BUS_TRANS_WB, PNE_P6_C_CPU_CLK_UNHALTED, PNE_P6_C_CYCLES_DIV_BUSY, PNE_P6_C_CYCLES_IN_PENDING_AND_MASKED, PNE_P6_C_CYCLES_INT_MASKED, PNE_P6_C_DATA_MEM_REFS, PNE_P6_C_DCU_LINES_IN, PNE_P6_C_DCU_M_LINES_IN, PNE_P6_C_DCU_M_LINES_OUT, PNE_P6_C_DCU_MISS_OUTSTANDING, PNE_P6_C_DIV, PNE_P6_C_FLOPS, PNE_P6_C_FP_ASSIST, PNE_P6_C_FTP_COMPS_OPS_EXE, PNE_P6_C_HW_INT_RX, PNE_P6_C_IFU_FETCH, PNE_P6_C_IFU_FETCH_MISS, PNE_P6_C_IFU_MEM_STALL, PNE_P6_C_ILD_STALL, 
PNE_P6_C_INST_DECODED, PNE_P6_C_INST_RETIRED, PNE_P6_C_ITLB_MISS, PNE_P6_C_L2_ADS, PNE_P6_C_L2_DBUS_BUSY, PNE_P6_C_L2_DBUS_BUSY_RD, PNE_P6_C_L2_IFETCH, PNE_P6_C_L2_LD, PNE_P6_C_L2_LINES_IN, PNE_P6_C_L2_LINES_OUT, PNE_P6_C_L2M_LINES_INM, PNE_P6_C_L2M_LINES_OUTM, PNE_P6_C_L2_RQSTS, PNE_P6_C_L2_ST, PNE_P6_C_LD_BLOCKS, PNE_P6_C_MISALIGN_MEM_REF, PNE_P6_C_MUL, PNE_P6_C_PARTIAL_RAT_STALLS, PNE_P6_C_RESOURCE_STALL, PNE_P6_C_SB_DRAINS, PNE_P6_C_SEGMENT_REG_LOADS, PNE_P6_C_UOPS_RETIRED, /* Celeron specific events */ PNE_P6_C_MMX_INSTR_EXEC, PNE_P6_C_NATNAME_GUARD }; extern Native_Event_LabelDescription_t P6_C_Processor_info[]; extern hwi_search_t P6_C_Processor_map[]; #endif papi-papi-7-2-0-t/src/freebsd/map-p6-m.c000066400000000000000000000243041502707512200175420ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-p6-M.c * Author: Harald Servat * redcrash@gmail.com */ #include "freebsd.h" #include "papiStdEventDefs.h" #include "map.h" /**************************************************************************** P6_M SUBSTRATE P6_M SUBSTRATE P6_M SUBSTRATE (aka Pentium M) P6_M SUBSTRATE P6_M SUBSTRATE ****************************************************************************/ /* NativeEvent_Value_P6_M_Processor must match P6_M_Processor_info */ Native_Event_LabelDescription_t P6_M_Processor_info[] = { /* Common P6 counters */ { "p6-baclears", "Count the number of times a static branch prediction was made by the branch decoder because the BTB did not have a prediction." }, { "p6-br-bogus", "Count the number of bogus branches." }, { "p6-br-inst-decoded", "Count the number of branch instructions decoded." }, { "p6-br-inst-retired", "Count the number of branch instructions retired." }, { "p6-br-miss-pred-retired", "Count the number of mispredicted branch instructions retired." }, { "p6-br-miss-pred-taken-ret", "Count the number of taken mispredicted branches retired." 
}, { "p6-br-taken-retired", "Count the number of taken branches retired." }, { "p6-btb-misses", "Count the number of branches for which the BTB did not produce a prediction. "}, { "p6-bus-bnr-drv", "Count the number of bus clock cycles during which this processor is driving the BNR# pin." }, { "p6-bus-data-rcv", "Count the number of bus clock cycles during which this processor is receiving data." }, { "p6-bus-drdy-clocks", "Count the number of clocks during which DRDY# is asserted." }, { "p6-bus-hit-drv", "Count the number of bus clock cycles during which this processor is driving the HIT# pin." }, { "p6-bus-hitm-drv", "Count the number of bus clock cycles during which this processor is driving the HITM# pin." }, { "p6-bus-lock-clocks", "Count the number of clocks during with LOCK# is asserted on the external system bus." }, { "p6-bus-req-outstanding", "Count the number of bus requests outstanding in any given cycle." }, { "p6-bus-snoop-stall", "Count the number of clock cycles during which the bus is snoop stalled." }, { "p6-bus-tran-any", "Count the number of completed bus transactions of any kind." }, { "p6-bus-tran-brd", "Count the number of burst read transactions." }, { "p6-bus-tran-burst", "Count the number of completed burst transactions." }, { "p6-bus-tran-def", "Count the number of completed deferred transactions." }, { "p6-bus-tran-ifetch", "Count the number of completed instruction fetch transactions." }, { "p6-bus-tran-inval", "Count the number of completed invalidate transactions." }, { "p6-bus-tran-mem", "Count the number of completed memory transactions." }, { "p6-bus-tran-pwr", "Count the number of completed partial write transactions." }, { "p6-bus-tran-rfo", "Count the number of completed read-for-ownership transactions." }, { "p6-bus-trans-io", "Count the number of completed I/O transactions." }, { "p6-bus-trans-p", "Count the number of completed partial transactions." 
}, { "p6-bus-trans-wb", "Count the number of completed write-back transactions." }, /* { "p6-cpu-clk-unhalted", "Count the number of cycles during with the processor was not halted." }, THIS IS DIFFERENT IN PM */ { "p6-cpu-clk-unhalted", "Count the number of cycles during with the processor was not halted and not in a thermal trip." }, { "p6-cycles-div-busy", "Count the number of cycles during which the divider is busy and cannot accept new divides." }, { "p6-cycles-in-pending-and-masked", "Count the number of processor cycles for which interrupts were disabled and interrupts were pending." }, { "p6-cycles-int-masked", "Count the number of processor cycles for which interrupts were disabled." }, { "p6-data-mem-refs", "Count all loads and all stores using any memory type, including internal retries." }, { "p6-dcu-lines-in", "Count the total lines allocated in the data cache unit." }, { "p6-dcu-m-lines-in", "Count the number of M state lines allocated in the data cache unit." }, { "p6-dcu-m-lines-out", "Count the number of M state lines evicted from the data cache unit." }, { "p6-dcu-miss-outstanding", "Count the weighted number of cycles while a data cache unit miss is outstanding, incremented by the number of outstanding cache misses at any time."}, { "p6-div", "Count the number of integer and floating-point divides including speculative divides." }, { "p6-flops", "Count the number of computational floating point operations retired." }, { "p6-fp-assist", "Count the number of floating point exceptions handled by microcode." }, { "p6-fp-comps-ops-exe", "Count the number of computation floating point operations executed." }, { "p6-hw-int-rx", "Count the number of hardware interrupts received." }, { "p6-ifu-fetch", "Count the number of instruction fetches, both cacheable and non-cacheable." }, { "p6-ifu-fetch-miss", "Count the number of instruction fetch misses" }, { "p6-ifu-mem-stall", "Count the number of cycles instruction fetch is stalled for any reason." 
}, { "p6-ild-stall", "Count the number of cycles the instruction length decoder is stalled." }, { "p6-inst-decoded", "Count the number of instructions decoded." }, { "p6-inst-retired", "Count the number of instructions retired." }, { "p6-itlb-miss", "Count the number of instruction TLB misses." }, { "p6-l2-ads", "Count the number of L2 address strobes." }, { "p6-l2-dbus-busy", "Count the number of cycles during which the L2 cache data bus was busy." }, { "p6-l2-dbus-busy-rd", "Count the number of cycles during which the L2 cache data bus was busy transferring read data from L2 to the processor." }, { "p6-l2-ifetch", "Count the number of L2 instruction fetches." }, { "p6-l2-ld", "Count the number of L2 data loads." }, { "p6-l2-lines-in", "Count the number of L2 lines allocated." }, { "p6-l2-lines-out", "Count the number of L2 lines evicted." }, { "p6-l2-m-lines-inm", "Count the number of modified lines allocated in L2 cache." }, { "p6-l2-m-lines-outm", "Count the number of L2 M-state lines evicted." }, { "p6-l2-rqsts", "Count the total number of L2 requests." }, { "p6-l2-st", "Count the number of L2 data stores." }, { "p6-ld-blocks", "Count the number of load operations delayed due to store buffer blocks." }, { "p6-misalign-mem-ref", "Count the number of misaligned data memory references (crossing a 64 bit boundary)." }, { "p6-mul", "Count the number of floating point multiplies, including speculative multiplies." }, { "p6-partial-rat-stalls", "Count the number of cycles or events for partial stalls." }, { "p6-resource-stalls", "Count the number of cycles there was a resource related stall of any kind." }, { "p6-sb-drains", "Count the number of cycles the store buffer is draining." }, { "p6-segment-reg-loads", "Count the number of segment register loads." }, { "p6-uops-retired", "Count the number of micro-ops retired."}, /* Specific Pentium 3 counters */ { "p6-fp-mmx-trans", "Count the number of transitions between MMX and floating-point instructions." 
}, { "p6-mmx-assist", "Count the number of MMX assists executed" }, { "p6-mmx-instr-exec", "Count the number of MMX instructions executed" }, { "p6-mmx-instr-ret", "Count the number of MMX instructions retired." }, { "p6-mmx-sat-instr-exec", "Count the number of MMX saturating instructions executed" }, { "p6-mmx-uops-exec", "Count the number of MMX micro-ops executed" }, { "p6-ret-seg-renames", "Count the number of segment register rename events retired." }, { "p6-seg-rename-stalls", "Count the number of segment register renaming stalls" }, { "p6-emon-kni-comp-inst-ret", "Count the number of SSE computational instructions retired" }, { "p6-emon-kni-inst-retired", "Count the number of SSE instructions retired." }, { "p6-emon-kni-pref-dispatched", "Count the number of SSE prefetch or weakly ordered instructions dispatched." }, { "p6-emon-kni-pref-miss", "Count the number of prefetch or weakly ordered instructions that miss all caches." }, /* Specific Pentium M counters */ { "p6-br-bac-missp-exec", "Count the number of branch instructions executed that where mispredicted at the Front End (BAC)." }, { "p6-br-call-exec", "Count the number of call instructions executed." }, { "p6-br-call-missp-exec", "Count the number of call instructions executed that were mispredicted." }, { "p6-br-cnd-exec", "Count the number of conditional branch instructions excuted" }, { "p6-br-cnd-missp-exec", "Count the number of conditional branch instructions executed that were mispredicted." }, { "p6-br-ind-call-exec", "Count the number of indirect call instructions executed" }, { "p6-br-ind-exec", "Count the number of indirect branch instructions executed" }, { "p6-br-ind-missp-exec", "Count the number of indirect branch instructions executed that were mispredicted." }, { "p6-br-inst-exec", "Count the number of branch instructions executed but necessarily retired." }, { "p6-br-missp-exec", "Count the number of branch instructions executed that were mispredicted at execution." 
}, { "p6-br-ret-bac-missp-exec", "Count the number of return instructions executed that were mispredicted at the Front End (BAC)." }, { "p6-br-ret-exec", "Count the number of return instructions executed." }, { "p6-br-ret-missp-exec", "Count the number of return instructions executed that were mispredicted at execution." }, { "p6-emon-esp-uops", "Count the total number of micro-ops." }, { "p6-emon-est-trans", "Count the number of Enhanced Intel SpeedStep transitions" }, { "p6-emon-fused-uops-ret", "Count the number of retired fused micro-ops." }, { "p6-emon-pref-rqsts-dn", "Count the number of downward prefetches issued." }, { "p6-emon-pref-rqsts-up", "Count the number of upward prefetches issued." }, { "p6-emon-simd-instr-retired", "Count the number of retired MMX instructions." }, { "p6-emon-sse-sse2-comp-inst-retired", "Count the number of computational SSE instructions retired." }, { "p6-emon-sse-sse2-inst-retired", "Count the number of SSE instructions retired." }, { "p6-emon-synch-uops", "Count the number of sync micro-ops." }, { "p6-emon-thermal-trip", "Count the duration or occurrences of thermal trips." }, { "p6-emon-unfusion", "Count the number of unfusion events in the reorder buffer." 
}, { NULL, NULL } }; papi-papi-7-2-0-t/src/freebsd/map-p6-m.h000066400000000000000000000062421502707512200175500ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-p6-M.h * CVS: $Id$ * Author: Harald Servat * redcrash@gmail.com */ #ifndef FreeBSD_MAP_P6_M #define FreeBSD_MAP_P6_M enum NativeEvent_Value_P6_M_Processor { /* P6 common events */ PNE_P6_M_BACLEARS = PAPI_NATIVE_MASK, PNE_P6_M_BR_BOGUS, PNE_P6_M_BR_INST_DECODED, PNE_P6_M_BR_INST_RETIRED, PNE_P6_M_BR_MISS_PRED_RETIRED, PNE_P6_M_BR_MISS_PRED_TAKEN_RET, PNE_P6_M_BR_TAKEN_RETIRED, PNE_P6_M_BTB_MISSES, PNE_P6_M_BUS_BNR_DRV, PNE_P6_M_BUS_DATA_RCV, PNE_P6_M_BUS_DRDY_CLOCKS, PNE_P6_M_BUS_HIT_DRV, PNE_P6_M_BUS_HITM_DRV, PNE_P6_M_BUS_LOCK_CLOCKS, PNE_P6_M_BUS_REQ_OUTSTANDING, PNE_P6_M_BUS_SNOOP_STALL, PNE_P6_M_BUS_TRAN_ANY, PNE_P6_M_BUS_TRAN_BRD, PNE_P6_M_BUS_TRAN_BURST, PNE_P6_M_BUS_TRAN_DEF, PNE_P6_M_BUS_TRAN_IFETCH, PNE_P6_M_BUS_TRAN_INVAL, PNE_P6_M_BUS_TRAN_MEM, PNE_P6_M_BUS_TRAN_POWER, PNE_P6_M_BUS_TRAN_RFO, PNE_P6_M_BUS_TRANS_IO, PNE_P6_M_BUS_TRANS_P, PNE_P6_M_BUS_TRANS_WB, PNE_P6_M_CPU_CLK_UNHALTED, PNE_P6_M_CYCLES_DIV_BUSY, PNE_P6_M_CYCLES_IN_PENDING_AND_MASKED, PNE_P6_M_CYCLES_INT_MASKED, PNE_P6_M_DATA_MEM_REFS, PNE_P6_M_DCU_LINES_IN, PNE_P6_M_DCU_M_LINES_IN, PNE_P6_M_DCU_M_LINES_OUT, PNE_P6_M_DCU_MISS_OUTSTANDING, PNE_P6_M_DIV, PNE_P6_M_FLOPS, PNE_P6_M_FP_ASSIST, PNE_P6_M_FTP_COMPS_OPS_EXE, PNE_P6_M_HW_INT_RX, PNE_P6_M_IFU_FETCH, PNE_P6_M_IFU_FETCH_MISS, PNE_P6_M_IFU_MEM_STALL, PNE_P6_M_ILD_STALL, PNE_P6_M_INST_DECODED, PNE_P6_M_INST_RETIRED, PNE_P6_M_ITLB_MISS, PNE_P6_M_L2_ADS, PNE_P6_M_L2_DBUS_BUSY, PNE_P6_M_L2_DBUS_BUSY_RD, PNE_P6_M_L2_IFETCH, PNE_P6_M_L2_LD, PNE_P6_M_L2_LINES_IN, PNE_P6_M_L2_LINES_OUT, PNE_P6_M_L2M_LINES_INM, PNE_P6_M_L2M_LINES_OUTM, PNE_P6_M_L2_RQSTS, PNE_P6_M_L2_ST, PNE_P6_M_LD_BLOCKS, PNE_P6_M_MISALIGN_MEM_REF, PNE_P6_M_MUL, PNE_P6_M_PARTIAL_RAT_STALLS, PNE_P6_M_RESOURCE_STALL, PNE_P6_M_SB_DRAINS, 
PNE_P6_M_SEGMENT_REG_LOADS, PNE_P6_M_UOPS_RETIRED, /* Pentium 3 specific events */ PNE_P6_M_FP_MMX_TRANS, PNE_P6_M_MMX_ASSIST, PNE_P6_M_MMX_INSTR_EXEC, PNE_P6_M_MMX_INSTR_RET, PNE_P6_M_MMX_SAT_INSTR_EXEC, PNE_P6_M_MMX_UOPS_EXEC, PNE_P6_M_RET_SEG_RENAMES, PNE_P6_M_SEG_RENAME_STALLS, PNE_P6_M_EMON_KNI_COMP_INST_RET, PNE_P6_M_EMON_KNI_INST_RETIRED, PNE_P6_M_EMON_KNI_PREF_DISPATCHED, PNE_P6_M_EMON_KNI_PREF_MISS, /* Pentium M specific events */ PNE_P6_M_BR_BAC_MISSP_EXEC, PNE_P6_M_BR_CALL_EXEC, PNE_P6_M_BR_CALL_MISSP_EXEC, PNE_P6_M_BR_CND_EXEC, PNE_P6_M_BR_CND_MISSP_EXEC, PNE_P6_M_BR_IND_CALL_EXEC, PNE_P6_M_BR_IND_EXEC, PNE_P6_M_BR_IND_MISSP_EXEC, PNE_P6_M_BR_INST_EXEC, PNE_P6_M_BR_MISSP_EXEC, PNE_P6_M_BR_RET_BAC_MISSP_EXEC, PNE_P6_M_BR_RET_EXEC, PNE_P6_M_BR_RET_MISSP_EXEC, PNE_P6_M_EMON_ESP_UOPS, PNE_P6_M_EMON_EST_TRANS, PNE_P6_M_EMON_FUSED_UOPS_RET, PNE_P6_M_EMON_PREF_RQSTS_DN, PNE_P6_M_EMON_PREF_RQSTS_UP, PNE_P6_M_EMON_SIMD_INSTR_RETIRD, PNE_P6_M_EMON_SSE_SSE2_COMP_INST_RETIRED, PNE_P6_M_EMON_SSE_SSE2_INST_RETIRED, PNE_P6_M_EMON_SYNCH_UOPS, PNE_P6_M_EMON_THERMAL_TRIP, PNE_P6_M_EMON_UNFUSION, PNE_P6_M_NATNAME_GUARD }; extern Native_Event_LabelDescription_t P6_M_Processor_info[]; extern hwi_search_t P6_M_Processor_map[]; #endif papi-papi-7-2-0-t/src/freebsd/map-p6.c000066400000000000000000000153171502707512200173140ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-p6.c * Author: Harald Servat * redcrash@gmail.com */ #include "freebsd.h" #include "papiStdEventDefs.h" #include "map.h" /**************************************************************************** P6 SUBSTRATE P6 SUBSTRATE P6 SUBSTRATE (aka Pentium Pro) P6 SUBSTRATE P6 SUBSTRATE ****************************************************************************/ /* NativeEvent_Value_P6Processor must match P6Processor_info */ Native_Event_LabelDescription_t P6Processor_info[] = { { "p6-baclears", "Count the number of times a 
static branch prediction was made by the branch decoder because the BTB did not have a prediction." }, { "p6-br-bogus", "Count the number of bogus branches." }, { "p6-br-inst-decoded", "Count the number of branch instructions decoded." }, { "p6-br-inst-retired", "Count the number of branch instructions retired." }, { "p6-br-miss-pred-retired", "Count the number of mispredicted branch instructions retired." }, { "p6-br-miss-pred-taken-ret", "Count the number of taken mispredicted branches retired." }, { "p6-br-taken-retired", "Count the number of taken branches retired." }, { "p6-btb-misses", "Count the number of branches for which the BTB did not produce a prediction."}, { "p6-bus-bnr-drv", "Count the number of bus clock cycles during which this processor is driving the BNR# pin." }, { "p6-bus-data-rcv", "Count the number of bus clock cycles during which this processor is receiving data." }, { "p6-bus-drdy-clocks", "Count the number of clocks during which DRDY# is asserted." }, { "p6-bus-hit-drv", "Count the number of bus clock cycles during which this processor is driving the HIT# pin." }, { "p6-bus-hitm-drv", "Count the number of bus clock cycles during which this processor is driving the HITM# pin." }, { "p6-bus-lock-clocks", "Count the number of clocks during which LOCK# is asserted on the external system bus." }, { "p6-bus-req-outstanding", "Count the number of bus requests outstanding in any given cycle." }, { "p6-bus-snoop-stall", "Count the number of clock cycles during which the bus is snoop stalled." }, { "p6-bus-tran-any", "Count the number of completed bus transactions of any kind." }, { "p6-bus-tran-brd", "Count the number of burst read transactions." }, { "p6-bus-tran-burst", "Count the number of completed burst transactions." }, { "p6-bus-tran-def", "Count the number of completed deferred transactions." }, { "p6-bus-tran-ifetch", "Count the number of completed instruction fetch transactions."
}, { "p6-bus-tran-inval", "Count the number of completed invalidate transactions." }, { "p6-bus-tran-mem", "Count the number of completed memory transactions." }, { "p6-bus-tran-pwr", "Count the number of completed partial write transactions." }, { "p6-bus-tran-rfo", "Count the number of completed read-for-ownership transactions." }, { "p6-bus-trans-io", "Count the number of completed I/O transactions." }, { "p6-bus-trans-p", "Count the number of completed partial transactions." }, { "p6-bus-trans-wb", "Count the number of completed write-back transactions." }, { "p6-cpu-clk-unhalted", "Count the number of cycles during with the processor was not halted." }, { "p6-cycles-div-busy", "Count the number of cycles during which the divider is busy and cannot accept new divides." }, { "p6-cycles-in-pending-and-masked", "Count the number of processor cycles for which interrupts were disabled and interrupts were pending." }, { "p6-cycles-int-masked", "Count the number of processor cycles for which interrupts were disabled." }, { "p6-data-mem-refs", "Count all loads and all stores using any memory type, including internal retries." }, { "p6-dcu-lines-in", "Count the total lines allocated in the data cache unit." }, { "p6-dcu-m-lines-in", "Count the number of M state lines allocated in the data cache unit." }, { "p6-dcu-m-lines-out", "Count the number of M state lines evicted from the data cache unit." }, { "p6-dcu-miss-outstanding", "Count the weighted number of cycles while a data cache unit miss is outstanding, incremented by the number of outstanding cache misses at any time."}, { "p6-div", "Count the number of integer and floating-point divides including speculative divides." }, { "p6-flops", "Count the number of computational floating point operations retired." }, { "p6-fp-assist", "Count the number of floating point exceptions handled by microcode." }, { "p6-fp-comps-ops-exe", "Count the number of computation floating point operations executed." 
}, { "p6-hw-int-rx", "Count the number of hardware interrupts received." }, { "p6-ifu-fetch", "Count the number of instruction fetches, both cacheable and non-cacheable." }, { "p6-ifu-fetch-miss", "Count the number of instruction fetch misses" }, { "p6-ifu-mem-stall", "Count the number of cycles instruction fetch is stalled for any reason." }, { "p6-ild-stall", "Count the number of cycles the instruction length decoder is stalled." }, { "p6-inst-decoded", "Count the number of instructions decoded." }, { "p6-inst-retired", "Count the number of instructions retired." }, { "p6-itlb-miss", "Count the number of instruction TLB misses." }, { "p6-l2-ads", "Count the number of L2 address strobes." }, { "p6-l2-dbus-busy", "Count the number of cycles during which the L2 cache data bus was busy." }, { "p6-l2-dbus-busy-rd", "Count the number of cycles during which the L2 cache data bus was busy transferring read data from L2 to the processor." }, { "p6-l2-ifetch", "Count the number of L2 instruction fetches." }, { "p6-l2-ld", "Count the number of L2 data loads." }, { "p6-l2-lines-in", "Count the number of L2 lines allocated." }, { "p6-l2-lines-out", "Count the number of L2 lines evicted." }, { "p6-l2-m-lines-inm", "Count the number of modified lines allocated in L2 cache." }, { "p6-l2-m-lines-outm", "Count the number of L2 M-state lines evicted." }, { "p6-l2-rqsts", "Count the total number of L2 requests." }, { "p6-l2-st", "Count the number of L2 data stores." }, { "p6-ld-blocks", "Count the number of load operations delayed due to store buffer blocks." }, { "p6-misalign-mem-ref", "Count the number of misaligned data memory references (crossing a 64 bit boundary)." }, { "p6-mul", "Count the number of floating point multiplies, including speculative multiplies." }, { "p6-partial-rat-stalls", "Count the number of cycles or events for partial stalls." }, { "p6-resource-stalls", "Count the number of cycles there was a resource related stall of any kind." 
}, { "p6-sb-drains", "Count the number of cycles the store buffer is draining." }, { "p6-segment-reg-loads", "Count the number of segment register loads." }, { "p6-uops-retired", "Count the number of micro-ops retired." }, { NULL, NULL } }; papi-papi-7-2-0-t/src/freebsd/map-p6.h000066400000000000000000000036531502707512200173210ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-p6.h * CVS: $Id$ * Author: Harald Servat * redcrash@gmail.com */ #ifndef FreeBSD_MAP_P6 #define FreeBSD_MAP_P6 enum NativeEvent_Value_P6Processor { PNE_P6_BACLEARS = PAPI_NATIVE_MASK, PNE_P6_BR_BOGUS, PNE_P6_BR_INST_DECODED, PNE_P6_BR_INST_RETIRED, PNE_P6_BR_MISS_PRED_RETIRED, PNE_P6_BR_MISS_PRED_TAKEN_RET, PNE_P6_BR_TAKEN_RETIRED, PNE_P6_BTB_MISSES, PNE_P6_BUS_BNR_DRV, PNE_P6_BUS_DATA_RCV, PNE_P6_BUS_DRDY_CLOCKS, PNE_P6_BUS_HIT_DRV, PNE_P6_BUS_HITM_DRV, PNE_P6_BUS_LOCK_CLOCKS, PNE_P6_BUS_REQ_OUTSTANDING, PNE_P6_BUS_SNOOP_STALL, PNE_P6_BUS_TRAN_ANY, PNE_P6_BUS_TRAN_BRD, PNE_P6_BUS_TRAN_BURST, PNE_P6_BUS_TRAN_DEF, PNE_P6_BUS_TRAN_IFETCH, PNE_P6_BUS_TRAN_INVAL, PNE_P6_BUS_TRAN_MEM, PNE_P6_BUS_TRAN_POWER, PNE_P6_BUS_TRAN_RFO, PNE_P6_BUS_TRANS_IO, PNE_P6_BUS_TRANS_P, PNE_P6_BUS_TRANS_WB, PNE_P6_CPU_CLK_UNHALTED, PNE_P6_CYCLES_DIV_BUSY, PNE_P6_CYCLES_IN_PENDING_AND_MASKED, PNE_P6_CYCLES_INT_MASKED, PNE_P6_DATA_MEM_REFS, PNE_P6_DCU_LINES_IN, PNE_P6_DCU_M_LINES_IN, PNE_P6_DCU_M_LINES_OUT, PNE_P6_DCU_MISS_OUTSTANDING, PNE_P6_DIV, PNE_P6_FLOPS, PNE_P6_FP_ASSIST, PNE_P6_FTP_COMPS_OPS_EXE, PNE_P6_HW_INT_RX, PNE_P6_IFU_FETCH, PNE_P6_IFU_FETCH_MISS, PNE_P6_IFU_MEM_STALL, PNE_P6_ILD_STALL, PNE_P6_INST_DECODED, PNE_P6_INST_RETIRED, PNE_P6_ITLB_MISS, PNE_P6_L2_ADS, PNE_P6_L2_DBUS_BUSY, PNE_P6_L2_DBUS_BUSY_RD, PNE_P6_L2_IFETCH, PNE_P6_L2_LD, PNE_P6_L2_LINES_IN, PNE_P6_L2_LINES_OUT, PNE_P6_L2M_LINES_INM, PNE_P6_L2M_LINES_OUTM, PNE_P6_L2_RQSTS, PNE_P6_L2_ST, PNE_P6_LD_BLOCKS, PNE_P6_MISALIGN_MEM_REF, PNE_P6_MUL, 
PNE_P6_PARTIAL_RAT_STALLS, PNE_P6_RESOURCE_STALL, PNE_P6_SB_DRAINS, PNE_P6_SEGMENT_REG_LOADS, PNE_P6_UOPS_RETIRED, PNE_P6_NATNAME_GUARD }; extern Native_Event_LabelDescription_t P6Processor_info[]; extern hwi_search_t P6Processor_map[]; #endif papi-papi-7-2-0-t/src/freebsd/map-unknown.c000066400000000000000000000022601502707512200204570ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-unknown.c * Author: Harald Servat * redcrash@gmail.com */ #include "freebsd.h" #include "papiStdEventDefs.h" #include "map.h" /**************************************************************************** UNKNOWN SUBSTRATE UNKNOWN SUBSTRATE UNKNOWN SUBSTRATE UNKNOWN SUBSTRATE ****************************************************************************/ /* NativeEvent_Value_UnknownProcessor must match UnkProcessor_info */ Native_Event_LabelDescription_t UnkProcessor_info[] = { { "branches", "Measure the number of branches retired." }, { "branch-mispredicts", "Measure the number of retired branches that were mispredicted." }, /* { "cycles", "Measure processor cycles." }, */ { "dc-misses", "Measure the number of data cache misses." }, { "ic-misses", "Measure the number of instruction cache misses." }, { "instructions", "Measure the number of instructions retired." }, { "interrupts", "Measure the number of interrupts seen." }, { "unhalted-cycles", "Measure the number of cycles the processor is not in a halted or sleep state." 
}, { NULL, NULL } }; papi-papi-7-2-0-t/src/freebsd/map-unknown.h000066400000000000000000000013171502707512200204660ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-unknown.h * CVS: $Id$ * Author: Harald Servat * redcrash@gmail.com */ #ifndef FreeBSD_MAP_UNKNOWN #define FreeBSD_MAP_UNKNOWN enum NativeEvent_Value_UnknownProcessor { PNE_UNK_BRANCHES = PAPI_NATIVE_MASK, PNE_UNK_BRANCH_MISPREDICTS, /* PNE_UNK_CYCLES, -- libpmc only supports cycles in system wide mode and this requires root privileges */ PNE_UNK_DC_MISSES, PNE_UNK_IC_MISSES, PNE_UNK_INSTRUCTIONS, PNE_UNK_INTERRUPTS, PNE_UNK_UNHALTED_CYCLES, PNE_UNK_NATNAME_GUARD }; extern Native_Event_LabelDescription_t UnkProcessor_info[]; extern hwi_search_t UnkProcessor_map[]; #endif papi-papi-7-2-0-t/src/freebsd/map-westmere.c000066400000000000000000002306261502707512200206240ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-westmere.c * Author: George Neville-Neil * gnn@freebsd.org * Harald Servat * redcrash@gmail.com */ #include "freebsd.h" #include "papiStdEventDefs.h" #include "map.h" /**************************************************************************** Westmere SUBSTRATE Westmere SUBSTRATE Westmere SUBSTRATE Westmere SUBSTRATE Westmere SUBSTRATE ****************************************************************************/ /* NativeEvent_Value_Westmere must match Westmere_info */ Native_Event_LabelDescription_t WestmereProcessor_info[] = { {"LOAD_BLOCK.OVERLAP_STORE", "Loads that partially overlap an earlier store"}, {"SB_DRAIN.ANY", "All Store buffer stall cycles"}, {"MISALIGN_MEMORY.STORE", "All store referenced with misaligned address"}, {"STORE_BLOCKS.AT_RET", "Counts number of loads delayed with at-Retirement block code. 
The following loads need to be executed at retirement and wait for all senior stores on the same thread to be drained: load splitting across 4K boundary (page split), load accessing uncacheable (UC or USWC) memory, load lock, and load with page table in UC or USWC memory region."}, {"STORE_BLOCKS.L1D_BLOCK", "Cacheable loads delayed with L1D block code"}, {"PARTIAL_ADDRESS_ALIAS", "Counts false dependency due to partial address aliasing"}, {"DTLB_LOAD_MISSES.ANY", "Counts all load misses that cause a page walk"}, {"DTLB_LOAD_MISSES.WALK_COMPLETED", "Counts number of completed page walks due to load miss in the STLB."}, {"DTLB_LOAD_MISSES.WALK_CYCLES", "Cycles PMH is busy with a page walk due to a load miss in the STLB."}, {"DTLB_LOAD_MISSES.STLB_HIT", "Number of cache load STLB hits"}, {"DTLB_LOAD_MISSES.PDE_MISS", "Number of DTLB cache load misses where the low part of the linear to physical address translation was missed."}, {"MEM_INST_RETIRED.LOADS", "Counts the number of instructions with an architecturally-visible load retired on the architected path. In conjunction with ld_lat facility"}, {"MEM_INST_RETIRED.STORES", "Counts the number of instructions with an architecturally-visible store retired on the architected path. In conjunction with ld_lat facility"}, {"MEM_INST_RETIRED.LATENCY_ABOVE_THRESHOLD", "Counts the number of instructions exceeding the latency specified with ld_lat facility. In conjunction with ld_lat facility"}, {"MEM_STORE_RETIRED.DTLB_MISS", "The event counts the number of retired stores that missed the DTLB. The DTLB miss is not counted if the store operation causes a fault. Does not count prefetches. Counts both primary and secondary misses to the TLB"}, {"UOPS_ISSUED.ANY", "Counts the number of Uops issued by the Register Allocation Table to the Reservation Station, i.e.
the UOPs issued from the front end to the back end."}, {"UOPS_ISSUED.STALLED_CYCLES", "Counts the number of cycles no Uops were issued by the Register Allocation Table to the Reservation Station, i.e. the UOPs issued from the front end to the back end."}, {"UOPS_ISSUED.FUSED", "Counts the number of fused Uops that were issued from the Register Allocation Table to the Reservation Station."}, {"MEM_UNCORE_RETIRED.LOCAL_HITM", "Load instructions retired that HIT modified data in sibling core (Precise Event)"}, {"MEM_UNCORE_RETIRED.LOCAL_DRAM_AND_REMOTE_CACHE_HIT", "Load instructions retired local DRAM and remote cache HIT data sources (Precise Event)"}, {"MEM_UNCORE_RETIRED.LOCAL_DRAM", "Load instructions retired with a data source of local DRAM or locally homed remote cache HITM (Precise Event)"}, {"MEM_UNCORE_RETIRED.REMOTE_DRAM", "Load instructions retired remote DRAM and remote home-remote cache HITM (Precise Event)"}, {"MEM_UNCORE_RETIRED.UNCACHEABLE", "Load instructions retired I/O (Precise Event)"}, {"FP_COMP_OPS_EXE.X87", "Counts the number of FP Computational Uops Executed. The number of FADD, FSUB, FCOM, FMULs, integer MULs and IMULs, FDIVs, FPREMs, FSQRTS, integer DIVs, and IDIVs.
This event does not distinguish an FADD used in the middle of a transcendental flow from a separate FADD instruction."}, {"FP_COMP_OPS_EXE.MMX", "Counts number of MMX Uops executed."}, {"FP_COMP_OPS_EXE.SSE_FP", "Counts number of SSE and SSE2 FP uops executed."}, {"FP_COMP_OPS_EXE.SSE2_INTEGER", "Counts number of SSE2 integer uops executed."}, {"FP_COMP_OPS_EXE.SSE_FP_PACKED", "Counts number of SSE FP packed uops executed."}, {"FP_COMP_OPS_EXE.SSE_FP_SCALAR", "Counts number of SSE FP scalar uops executed."}, {"FP_COMP_OPS_EXE.SSE_SINGLE_PRECISION", "Counts number of SSE* FP single precision uops executed."}, {"FP_COMP_OPS_EXE.SSE_DOUBLE_PRECISION", "Counts number of SSE* FP double precision uops executed."}, {"SIMD_INT_128.PACKED_MPY", "Counts number of 128 bit SIMD integer multiply operations."}, {"SIMD_INT_128.PACKED_SHIFT", "Counts number of 128 bit SIMD integer shift operations."}, {"SIMD_INT_128.PACK", "Counts number of 128 bit SIMD integer pack operations."}, {"SIMD_INT_128.UNPACK", "Counts number of 128 bit SIMD integer unpack operations."}, {"SIMD_INT_128.PACKED_LOGICAL", "Counts number of 128 bit SIMD integer logical operations."}, {"SIMD_INT_128.PACKED_ARITH", "Counts number of 128 bit SIMD integer arithmetic operations."}, {"SIMD_INT_128.SHUFFLE_MOVE", "Counts number of 128 bit SIMD integer shuffle and move operations."}, {"LOAD_DISPATCH.RS", "Counts number of loads dispatched from the Reservation Station that bypass the Memory Order Buffer."}, {"LOAD_DISPATCH.RS_DELAYED", "Counts the number of delayed RS dispatches at the stage latch. 
If an RS dispatch can not bypass to LB, it has another chance to dispatch from the one-cycle delayed staging latch before it is written into the LB."}, {"LOAD_DISPATCH.MOB", "Counts the number of loads dispatched from the Reservation Station to the Memory Order Buffer."}, {"LOAD_DISPATCH.ANY", "Counts all loads dispatched from the Reservation Station."}, {"ARITH.CYCLES_DIV_BUSY", "Counts the number of cycles the divider is busy executing divide or square root operations. The divide can be integer, X87 or Streaming SIMD Extensions (SSE). The square root operation can be either X87 or SSE. Set 'edge =1, invert=1, cmask=1' to count the number of divides. Count may be incorrect When SMT is on."}, {"ARITH.MUL", "Counts the number of multiply operations executed. This includes integer as well as floating point multiply operations but excludes DPPS mul and MPSAD. Count may be incorrect When SMT is on."}, {"INST_QUEUE_WRITES", "Counts the number of instructions written into the instruction queue every cycle."}, {"INST_DECODED.DEC0", "Counts number of instructions that require decoder 0 to be decoded. Usually, this means that the instruction maps to more than 1 uop"}, {"TWO_UOP_INSTS_DECODED", "An instruction that generates two uops was decoded"}, {"INST_QUEUE_WRITE_CYCLES", "This event counts the number of cycles during which instructions are written to the instruction queue. Dividing this counter by the number of instructions written to the instruction queue (INST_QUEUE_WRITES) yields the average number of instructions decoded each cycle. If this number is less than four and the pipe stalls, this indicates that the decoder is failing to decode enough instructions per cycle to sustain the 4-wide pipeline. If SSE* instructions that are 6 bytes or longer arrive one after another, then front end throughput may limit execution speed. 
"}, {"LSD_OVERFLOW", "Number of loops that can not stream from the instruction queue."}, {"L2_RQSTS.LD_HIT", "Counts number of loads that hit the L2 cache. L2 loads include both L1D demand misses as well as L1D prefetches. L2 loads can be rejected for various reasons. Only non rejected loads are counted."}, {"L2_RQSTS.LD_MISS", "Counts the number of loads that miss the L2 cache. L2 loads include both L1D demand misses as well as L1D prefetches."}, {"L2_RQSTS.LOADS", "Counts all L2 load requests. L2 loads include both L1D demand misses as well as L1D prefetches."}, {"L2_RQSTS.RFO_HIT", "Counts the number of store RFO requests that hit the L2 cache. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches. Count includes WC memory requests, where the data is not fetched but the permission to write the line is required."}, {"L2_RQSTS.RFO_MISS", "Counts the number of store RFO requests that miss the L2 cache. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches."}, {"L2_RQSTS.RFOS", "Counts all L2 store RFO requests. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches."}, {"L2_RQSTS.IFETCH_HIT", "Counts number of instruction fetches that hit the L2 cache. L2 instruction fetches include both L1I demand misses as well as L1I instruction prefetches."}, {"L2_RQSTS.IFETCH_MISS", "Counts number of instruction fetches that miss the L2 cache. L2 instruction fetches include both L1I demand misses as well as L1I instruction prefetches."}, {"L2_RQSTS.IFETCHES", "Counts all instruction fetches. 
L2 instruction fetches include both L1I demand misses as well as L1I instruction prefetches."}, {"L2_RQSTS.PREFETCH_HIT", "Counts L2 prefetch hits for both code and data."}, {"L2_RQSTS.PREFETCH_MISS", "Counts L2 prefetch misses for both code and data."}, {"L2_RQSTS.PREFETCHES", "Counts all L2 prefetches for both code and data."}, {"L2_RQSTS.MISS", "Counts all L2 misses for both code and data."}, {"L2_RQSTS.REFERENCES", "Counts all L2 requests for both code and data."}, {"L2_DATA_RQSTS.DEMAND.I_STATE", "Counts number of L2 data demand loads where the cache line to be loaded is in the I (invalid) state, i.e. a cache miss. L2 demand loads are both L1D demand misses and L1D prefetches."}, {"L2_DATA_RQSTS.DEMAND.S_STATE", "Counts number of L2 data demand loads where the cache line to be loaded is in the S (shared) state. L2 demand loads are both L1D demand misses and L1D prefetches."}, {"L2_DATA_RQSTS.DEMAND.E_STATE", "Counts number of L2 data demand loads where the cache line to be loaded is in the E (exclusive) state. L2 demand loads are both L1D demand misses and L1D prefetches."}, {"L2_DATA_RQSTS.DEMAND.M_STATE", "Counts number of L2 data demand loads where the cache line to be loaded is in the M (modified) state. L2 demand loads are both L1D demand misses and L1D prefetches."}, {"L2_DATA_RQSTS.DEMAND.MESI", "Counts all L2 data demand requests. L2 demand loads are both L1D demand misses and L1D prefetches."}, {"L2_DATA_RQSTS.PREFETCH.I_STATE", "Counts number of L2 prefetch data loads where the cache line to be loaded is in the I (invalid) state, i.e. a cache miss."}, {"L2_DATA_RQSTS.PREFETCH.S_STATE", "Counts number of L2 prefetch data loads where the cache line to be loaded is in the S (shared) state. 
A prefetch RFO will miss on an S state line, while a prefetch read will hit on an S state line."}, {"L2_DATA_RQSTS.PREFETCH.E_STATE", "Counts number of L2 prefetch data loads where the cache line to be loaded is in the E (exclusive) state."}, {"L2_DATA_RQSTS.PREFETCH.M_STATE", "Counts number of L2 prefetch data loads where the cache line to be loaded is in the M (modified) state."}, {"L2_DATA_RQSTS.PREFETCH.MESI", "Counts all L2 prefetch requests."}, {"L2_DATA_RQSTS.ANY", "Counts all L2 data requests."}, {"L2_WRITE.RFO.I_STATE", "Counts number of L2 demand store RFO requests where the cache line to be loaded is in the I (invalid) state, i.e., a cache miss. The L1D prefetcher does not issue a RFO prefetch. This is a demand RFO request."}, {"L2_WRITE.RFO.S_STATE", "Counts number of L2 store RFO requests where the cache line to be loaded is in the S (shared) state. The L1D prefetcher does not issue a RFO prefetch. This is a demand RFO request."}, {"L2_WRITE.RFO.M_STATE", "Counts number of L2 store RFO requests where the cache line to be loaded is in the M (modified) state. The L1D prefetcher does not issue a RFO prefetch. This is a demand RFO request."}, {"L2_WRITE.RFO.HIT", "Counts number of L2 store RFO requests where the cache line to be loaded is in either the S, E or M states. The L1D prefetcher does not issue a RFO prefetch."}, {"L2_WRITE.RFO.MESI", "Counts all L2 store RFO requests. The L1D prefetcher does not issue a RFO prefetch. This is a demand RFO request."}, {"L2_WRITE.LOCK.I_STATE", "Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the I (invalid) state, i.e.
a cache miss."}, {"L2_WRITE.LOCK.S_STATE", "Counts number of L2 lock RFO requests where the cache line to be loaded is in the S (shared) state."}, {"L2_WRITE.LOCK.E_STATE", "Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the E (exclusive) state."}, {"L2_WRITE.LOCK.M_STATE", "Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the M (modified) state."}, {"L2_WRITE.LOCK.HIT", "Counts number of L2 demand lock RFO requests where the cache line to be loaded is in either the S, E, or M state."}, {"L2_WRITE.LOCK.MESI", "Counts all L2 demand lock RFO requests."}, {"L1D_WB_L2.I_STATE", "Counts number of L1 writebacks to the L2 where the cache line to be written is in the I (invalid) state, i.e. a cache miss."}, {"L1D_WB_L2.S_STATE", "Counts number of L1 writebacks to the L2 where the cache line to be written is in the S (shared) state."}, {"L1D_WB_L2.E_STATE", "Counts number of L1 writebacks to the L2 where the cache line to be written is in the E (exclusive) state."}, {"L1D_WB_L2.M_STATE", "Counts number of L1 writebacks to the L2 where the cache line to be written is in the M (modified) state."}, {"L1D_WB_L2.MESI", "Counts all L1 writebacks to the L2."}, {"L3_LAT_CACHE.REFERENCE", "Counts uncore Last Level Cache references. Because of cache hierarchy, cache sizes and other implementation-specific characteristics, value comparison to estimate performance differences is not recommended. See Table A-1."}, {"L3_LAT_CACHE.MISS", "Counts uncore Last Level Cache misses. Because of cache hierarchy, cache sizes and other implementation-specific characteristics, value comparison to estimate performance differences is not recommended. See Table A-1."}, {"CPU_CLK_UNHALTED.THREAD_P", "Counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling.
See Table A-1."}, {"CPU_CLK_UNHALTED.REF_P", "Increments at the frequency of TSC when not halted. See Table A-1."}, {"DTLB_MISSES.ANY", "Counts the number of misses in the STLB which causes a page walk."}, {"DTLB_MISSES.WALK_COMPLETED", "Counts number of misses in the STLB which resulted in a completed page walk."}, {"DTLB_MISSES.WALK_CYCLES", "Counts cycles of page walk due to misses in the STLB."}, {"DTLB_MISSES.STLB_HIT", "Counts the number of DTLB first level misses that hit in the second level TLB. This event is only relevant if the core contains multiple DTLB levels."}, {"DTLB_MISSES.LARGE_WALK_COMPLETED", "Counts number of completed large page walks due to misses in the STLB."}, {"LOAD_HIT_PRE", "Counts load operations sent to the L1 data cache while a previous SSE prefetch instruction to the same cache line has started prefetching but has not yet finished."}, {"L1D_PREFETCH.REQUESTS", "Counts number of hardware prefetch requests dispatched out of the prefetch FIFO."}, {"L1D_PREFETCH.MISS", "Counts number of hardware prefetch requests that miss the L1D. There are two prefetchers in the L1D: a streamer, which predicts that lines sequential to this one should be fetched, and the IP prefetcher, which remembers access patterns for the current instruction. The streamer prefetcher stops on an L1D hit, while the IP prefetcher does not."}, {"L1D_PREFETCH.TRIGGERS", "Counts number of prefetch requests triggered by the Finite State Machine and pushed into the prefetch FIFO. Some of the prefetch requests are dropped due to overwrites or competition between the IP index prefetcher and streamer prefetcher. The prefetch FIFO contains 4 entries."}, {"EPT.WALK_CYCLES", "Counts Extended Page walk cycles."}, {"L1D.REPL", "Counts the number of lines brought into the L1 data cache. Counter 0, 1 only."}, {"L1D.M_REPL", "Counts the number of modified lines brought into the L1 data cache.
Counter 0, 1 only."}, {"L1D.M_EVICT", "Counts the number of modified lines evicted from the L1 data cache due to replacement. Counter 0, 1 only."}, {"L1D.M_SNOOP_EVICT", "Counts the number of modified lines evicted from the L1 data cache due to snoop HITM intervention. Counter 0, 1 only."}, {"L1D_CACHE_PREFETCH_LOCK_FB_HIT", "Counts the number of cacheable load lock speculated instructions accepted into the fill buffer."}, {"L1D_CACHE_LOCK_FB_HIT", "Counts the number of cacheable load lock speculated or retired instructions accepted into the fill buffer."}, {"OFFCORE_REQUESTS_OUTSTANDING.DEMAND.READ_DATA", "Counts weighted cycles of offcore demand data read requests. Does not include L2 prefetch requests. Counter 0."}, {"OFFCORE_REQUESTS_OUTSTANDING.DEMAND.READ_CODE", "Counts weighted cycles of offcore demand code read requests. Does not include L2 prefetch requests. Counter 0."}, {"OFFCORE_REQUESTS_OUTSTANDING.DEMAND.RFO", "Counts weighted cycles of offcore demand RFO requests. Does not include L2 prefetch requests. Counter 0."}, {"OFFCORE_REQUESTS_OUTSTANDING.ANY.READ", "Counts weighted cycles of offcore read requests of any kind. Includes L2 prefetch requests. Counter 0."}, {"CACHE_LOCK_CYCLES.L1D_L2", "Cycle count during which the L1D and L2 are locked. A lock is asserted when there is a locked memory access, due to uncacheable memory, a locked operation that spans two cache lines, or a page walk from an uncacheable page table. Counter 0, 1 only. L1D and L2 locks have a very high performance penalty and it is highly recommended to avoid such accesses."}, {"CACHE_LOCK_CYCLES.L1D", "Counts the number of cycles that a cache line in the L1 data cache unit is locked. Counter 0, 1 only."}, {"IO_TRANSACTIONS", "Counts the number of completed I/O transactions."}, {"L1I.HITS", "Counts all instruction fetches that hit the L1 instruction cache."}, {"L1I.MISSES", "Counts all instruction fetches that miss the L1I cache.
This includes instruction cache misses, streaming buffer misses, victim cache misses and uncacheable fetches. An instruction fetch miss is counted only once and not once for every cycle it is outstanding."}, {"L1I.READS", "Counts all instruction fetches, including uncacheable fetches that bypass the L1I."}, {"L1I.CYCLES_STALLED", "Cycle counts for which an instruction fetch stalls due to a L1I cache miss, ITLB miss or ITLB fault."}, {"LARGE_ITLB.HIT", "Counts number of large ITLB hits."}, {"ITLB_MISSES.ANY", "Counts the number of misses in all levels of the ITLB which causes a page walk."}, {"ITLB_MISSES.WALK_COMPLETED", "Counts number of misses in all levels of the ITLB which resulted in a completed page walk."}, {"ITLB_MISSES.WALK_CYCLES", "Counts ITLB miss page walk cycles."}, {"ITLB_MISSES.LARGE_WALK_COMPLETED", "Counts number of completed large page walks due to misses in the STLB."}, {"ILD_STALL.LCP", "Cycles Instruction Length Decoder stalls due to length changing prefixes: 66, 67 or REX.W (for EM64T) instructions which change the length of the decoded instruction."}, {"ILD_STALL.MRU", "Instruction Length Decoder stall cycles due to Branch Prediction Unit (BPU).
Most Recently Used (MRU) bypass."}, {"ILD_STALL.IQ_FULL", "Stall cycles due to a full instruction queue."}, {"ILD_STALL.REGEN", "Counts the number of regen stalls."}, {"ILD_STALL.ANY", "Counts any cycles the Instruction Length Decoder is stalled."}, {"BR_INST_EXEC.COND", "Counts the number of conditional near branch instructions executed, but not necessarily retired."}, {"BR_INST_EXEC.DIRECT", "Counts all unconditional near branch instructions excluding calls and indirect branches."}, {"BR_INST_EXEC.INDIRECT_NON_CALL", "Counts the number of executed indirect near branch instructions that are not calls."}, {"BR_INST_EXEC.NON_CALLS", "Counts all non call near branch instructions executed, but not necessarily retired."}, {"BR_INST_EXEC.RETURN_NEAR", "Counts indirect near branches that have a return mnemonic."}, {"BR_INST_EXEC.DIRECT_NEAR_CALL", "Counts unconditional near call branch instructions, excluding non call branch, executed."}, {"BR_INST_EXEC.INDIRECT_NEAR_CALL", "Counts indirect near calls, including both register and memory indirect, executed."}, {"BR_INST_EXEC.NEAR_CALLS", "Counts all near call branches executed, but not necessarily retired."}, {"BR_INST_EXEC.TAKEN", "Counts taken near branches executed, but not necessarily retired."}, {"BR_INST_EXEC.ANY", "Counts all near executed branches (not necessarily retired). This includes only instructions and not micro-op branches. Frequent branching is not necessarily a major performance issue. 
However frequent branch mispredictions may be a problem."}, {"BR_MISP_EXEC.COND", "Counts the number of mispredicted conditional near branch instructions executed, but not necessarily retired."}, {"BR_MISP_EXEC.DIRECT", "Counts mispredicted macro unconditional near branch instructions, excluding calls and indirect branches (should always be 0)."}, {"BR_MISP_EXEC.INDIRECT_NON_CALL", "Counts the number of executed mispredicted indirect near branch instructions that are not calls."}, {"BR_MISP_EXEC.NON_CALLS", "Counts mispredicted non call near branches executed, but not necessarily retired."}, {"BR_MISP_EXEC.RETURN_NEAR", "Counts mispredicted indirect branches that have a near return mnemonic."}, {"BR_MISP_EXEC.DIRECT_NEAR_CALL", "Counts mispredicted non-indirect near calls executed (should always be 0)."}, {"BR_MISP_EXEC.INDIRECT_NEAR_CALL", "Counts mispredicted indirect near calls executed, including both register and memory indirect."}, {"BR_MISP_EXEC.NEAR_CALLS", "Counts all mispredicted near call branches executed, but not necessarily retired."}, {"BR_MISP_EXEC.TAKEN", "Counts executed mispredicted near branches that are taken, but not necessarily retired."}, {"BR_MISP_EXEC.ANY", "Counts the number of mispredicted near branch instructions that were executed, but not necessarily retired."}, {"RESOURCE_STALLS.ANY", "Counts the number of Allocator resource related stalls. Includes register renaming buffer entries, memory buffer entries. In addition to resource related stalls, this event counts some other events. Includes stalls arising during branch misprediction recovery, such as if retirement of the mispredicted branch is delayed and stalls arising while store buffer is draining from synchronizing operations.
Does not include stalls due to SuperQ (off core) queue full, too many cache misses, etc."}, {"RESOURCE_STALLS.LOAD", "Counts the cycles of stall due to lack of load buffer for load operation."}, {"RESOURCE_STALLS.RS_FULL", "This event counts the number of cycles when the number of instructions in the pipeline waiting for execution reaches the limit the processor can handle. A high count of this event indicates that there are long latency operations in the pipe (possibly load and store operations that miss the L2 cache, or instructions dependent upon instructions further down the pipeline that have yet to retire). When RS is full, new instructions can not enter the reservation station and start execution."}, {"RESOURCE_STALLS.STORE", "This event counts the number of cycles that a resource related stall will occur due to the number of store instructions reaching the limit of the pipeline (i.e. all store buffers are used). The stall ends when a store instruction commits its data to the cache or memory."}, {"RESOURCE_STALLS.ROB_FULL", "Counts the cycles of stall due to reorder buffer full."}, {"RESOURCE_STALLS.FPCW", "Counts the number of cycles while execution was stalled due to writing the floating-point unit (FPU) control word."}, {"RESOURCE_STALLS.MXCSR", "Stalls due to the MXCSR register rename occurring too close to a previous MXCSR rename. The MXCSR provides control and status for the MMX registers."}, {"RESOURCE_STALLS.OTHER", "Counts the number of cycles while execution was stalled due to other resource issues."}, {"MACRO_INSTS.FUSIONS_DECODED", "Counts the number of instructions decoded that are macro-fused but not necessarily executed or retired."}, {"BACLEAR_FORCE_IQ", "Counts number of times a BACLEAR was forced by the Instruction Queue. The IQ is also responsible for providing conditional branch prediction direction based on a static scheme and dynamic data provided by the L2 Branch Prediction Unit.
If the conditional branch target is not found in the Target Array and the IQ predicts that the branch is taken, then the IQ will force the Branch Address Calculator to issue a BACLEAR. Each BACLEAR asserted by the BAC generates approximately an 8 cycle bubble in the instruction fetch pipeline."}, {"LSD.UOPS", "Counts the number of micro-ops delivered by loop stream detector. Use cmask=1 and invert to count cycles."}, {"ITLB_FLUSH", "Counts the number of ITLB flushes."}, {"OFFCORE_REQUESTS.DEMAND.READ_DATA", "Counts number of offcore demand data read requests. Does not count L2 prefetch requests."}, {"OFFCORE_REQUESTS.DEMAND.READ_CODE", "Counts number of offcore demand code read requests. Does not count L2 prefetch requests."}, {"OFFCORE_REQUESTS.DEMAND.RFO", "Counts number of offcore demand RFO requests. Does not count L2 prefetch requests."}, {"OFFCORE_REQUESTS.ANY.READ", "Counts number of offcore read requests. Includes L2 prefetch requests."}, {"OFFCORE_REQUESTS.ANY.RFO", "Counts number of offcore RFO requests. Includes L2 prefetch requests."}, {"OFFCORE_REQUESTS.L1D_WRITEBACK", "Counts number of L1D writebacks to the uncore."}, {"OFFCORE_REQUESTS.ANY", "Counts all offcore requests."}, {"UOPS_EXECUTED.PORT0", "Counts number of Uops executed that were issued on port 0. Port 0 handles integer arithmetic, SIMD and FP add Uops."}, {"UOPS_EXECUTED.PORT1", "Counts number of Uops executed that were issued on port 1. Port 1 handles integer arithmetic, SIMD, integer shift, FP multiply and FP divide Uops."}, {"UOPS_EXECUTED.PORT2_CORE", "Counts number of Uops executed that were issued on port 2. Port 2 handles the load Uops. This is a core count only and can not be collected per thread."}, {"UOPS_EXECUTED.PORT3_CORE", "Counts number of Uops executed that were issued on port 3. Port 3 handles store Uops. This is a core count only and can not be collected per thread."}, {"UOPS_EXECUTED.PORT4_CORE", "Counts number of Uops executed that were issued on port 4.
Port 4 handles the value to be stored for the store Uops issued on port 3. This is a core count only and can not be collected per thread."}, {"UOPS_EXECUTED.CORE_ACTIVE_CYCLES_NO_PORT5", "Counts number of cycles there are one or more uops being executed and were issued on ports 0-4. This is a core count only and can not be collected per thread."}, {"UOPS_EXECUTED.PORT5", "Counts number of Uops executed that were issued on port 5."}, {"UOPS_EXECUTED.CORE_ACTIVE_CYCLES", "Counts number of cycles there are one or more uops being executed on any ports. This is a core count only and can not be collected per thread."}, {"UOPS_EXECUTED.PORT015", "Counts number of Uops executed that were issued on port 0, 1, or 5. Use cmask=1, invert=1 to count stall cycles."}, {"UOPS_EXECUTED.PORT234", "Counts number of Uops executed that were issued on port 2, 3, or 4."}, {"OFFCORE_REQUESTS_SQ_FULL", "Counts number of cycles the SQ is full to handle off-core requests."}, {"SNOOPQ_REQUESTS_OUTSTANDING.DATA", "Counts weighted cycles of snoopq requests for data. Counter 0 only. Use cmask=1 to count cycles not empty."}, {"SNOOPQ_REQUESTS_OUTSTANDING.INVALIDATE", "Counts weighted cycles of snoopq invalidate requests. Counter 0 only. Use cmask=1 to count cycles not empty."}, {"SNOOPQ_REQUESTS_OUTSTANDING.CODE", "Counts weighted cycles of snoopq requests for code. Counter 0 only. Use cmask=1 to count cycles not empty."}, {"SNOOPQ_REQUESTS.CODE", "Counts the number of snoop code requests."}, {"SNOOPQ_REQUESTS.DATA", "Counts the number of snoop data requests."}, {"SNOOPQ_REQUESTS.INVALIDATE", "Counts the number of snoop invalidate requests."}, {"OFF_CORE_RESPONSE_0", "See Section 30.6.1.3, Off-core Response Performance Monitoring in the Processor Core.
Requires programming MSR 01A6H."}, {"SNOOP_RESPONSE.HIT", "Counts HIT snoop response sent by this thread in response to a snoop request."}, {"SNOOP_RESPONSE.HITE", "Counts HIT E snoop response sent by this thread in response to a snoop request."}, {"SNOOP_RESPONSE.HITM", "Counts HIT M snoop response sent by this thread in response to a snoop request."}, {"OFF_CORE_RESPONSE_1", "See Section 30.6.1.3, Off-core Response Performance Monitoring in the Processor Core. Use MSR 01A7H."}, {"INST_RETIRED.ANY_P", "See Table A-1. Notes: INST_RETIRED.ANY is counted by a designated fixed counter. INST_RETIRED.ANY_P is counted by a programmable counter and is an architectural performance event. Event is supported if CPUID.A.EBX[1] = 0. Counting: Faulting executions of GETSEC/VM entry/VM Exit/MWait will not count as retired instructions."}, {"INST_RETIRED.X87", "Counts the number of floating point computational operations retired: floating point computational operations executed by the assist handler and sub-operations of complex floating point instructions like transcendental instructions."}, {"INST_RETIRED.MMX", "Counts the number of retired MMX instructions."}, {"UOPS_RETIRED.ANY", "Counts the number of micro-ops retired (macro-fused=1, micro-fused=2, others=1; maximum count of 8 per cycle). Most instructions are composed of one or two micro-ops. Some instructions are decoded into longer sequences such as repeat instructions, floating point transcendental instructions, and assists. Use cmask=1 and invert to count active cycles or stalled cycles."}, {"UOPS_RETIRED.RETIRE_SLOTS", "Counts the number of retirement slots used each cycle."}, {"UOPS_RETIRED.MACRO_FUSED", "Counts number of macro-fused uops retired."}, {"MACHINE_CLEARS.CYCLES", "Counts the cycles machine clear is asserted."}, {"MACHINE_CLEARS.MEM_ORDER", "Counts the number of machine clears due to memory order conflicts."}, {"MACHINE_CLEARS.SMC", "Counts the number of times that a program writes to a code section.
Self-modifying code causes a severe penalty in all Intel 64 and IA-32 processors. The modified cache line is written back to the L2 and L3 caches."}, {"BR_INST_RETIRED.ANY_P", "See Table A-1."}, {"BR_INST_RETIRED.CONDITIONAL", "Counts the number of conditional branch instructions retired."}, {"BR_INST_RETIRED.NEAR_CALL", "Counts the number of direct & indirect near unconditional calls retired."}, {"BR_INST_RETIRED.ALL_BRANCHES", "Counts the number of branch instructions retired."}, {"BR_MISP_RETIRED.ANY_P", "See Table A-1."}, {"BR_MISP_RETIRED.CONDITIONAL", "Counts mispredicted conditional retired calls."}, {"BR_MISP_RETIRED.NEAR_CALL", "Counts mispredicted direct & indirect near unconditional retired calls."}, {"BR_MISP_RETIRED.ALL_BRANCHES", "Counts all mispredicted retired calls."}, {"SSEX_UOPS_RETIRED.PACKED_SINGLE", "Counts SIMD packed single-precision floating point Uops retired."}, {"SSEX_UOPS_RETIRED.SCALAR_SINGLE", "Counts SIMD scalar single-precision floating point Uops retired."}, {"SSEX_UOPS_RETIRED.PACKED_DOUBLE", "Counts SIMD packed double-precision floating point Uops retired."}, {"SSEX_UOPS_RETIRED.SCALAR_DOUBLE", "Counts SIMD scalar double-precision floating point Uops retired."}, {"SSEX_UOPS_RETIRED.VECTOR_INTEGER", "Counts 128-bit SIMD vector integer Uops retired."}, {"ITLB_MISS_RETIRED", "Counts the number of retired instructions that missed the ITLB when the instruction was fetched."}, {"MEM_LOAD_RETIRED.L1D_HIT", "Counts number of retired loads that hit the L1 data cache."}, {"MEM_LOAD_RETIRED.L2_HIT", "Counts number of retired loads that hit the L2 data cache."}, {"MEM_LOAD_RETIRED.L3_UNSHARED_HIT", "Counts number of retired loads that hit their own, unshared lines in the L3 cache."}, {"MEM_LOAD_RETIRED.OTHER_CORE_L2_HIT_HITM", "Counts number of retired loads that hit in a sibling core's L2 (on die core). Since the L3 is inclusive of all cores on the package, this is an L3 hit.
This counts both clean or modified hits."}, {"MEM_LOAD_RETIRED.L3_MISS", "Counts number of retired loads that miss the L3 cache. The load was satisfied by a remote socket, local memory or an IOH."}, {"MEM_LOAD_RETIRED.HIT_LFB", "Counts number of retired loads that miss the L1D and the address is located in an allocated line fill buffer and will soon be committed to cache. This is counting secondary L1D misses."}, {"MEM_LOAD_RETIRED.DTLB_MISS", "Counts the number of retired loads that missed the DTLB. The DTLB miss is not counted if the load operation causes a fault. This event counts loads from cacheable memory only. The event does not count loads by software prefetches. Counts both primary and secondary misses to the TLB."}, {"FP_MMX_TRANS.TO_FP", "Counts the first floating-point instruction following any MMX instruction. You can use this event to estimate the penalties for the transitions between floating-point and MMX technology states."}, {"FP_MMX_TRANS.TO_MMX", "Counts the first MMX instruction following a floating-point instruction. You can use this event to estimate the penalties for the transitions between floating-point and MMX technology states."}, {"FP_MMX_TRANS.ANY", "Counts all transitions from floating point to MMX instructions and from MMX instructions to floating point instructions. You can use this event to estimate the penalties for the transitions between floating-point and MMX technology states."}, {"MACRO_INSTS.DECODED", "Counts the number of instructions decoded, (but not necessarily executed or retired)."}, {"UOPS_DECODED.STALL_CYCLES", "Counts the cycles of decoder stalls."}, {"UOPS_DECODED.MS", "Counts the number of Uops decoded by the Microcode Sequencer, MS. The MS delivers uops when the instruction is more than 4 uops long or a microcode assist is occurring."}, {"UOPS_DECODED.ESP_FOLDING", "Counts number of stack pointer (ESP) instructions decoded: push, pop, call, ret, etc. 
ESP instructions do not generate a Uop to increment or decrement ESP. Instead, they update an ESP_Offset register that keeps track of the delta to the current value of the ESP register."}, {"UOPS_DECODED.ESP_SYNC", "Counts number of stack pointer (ESP) sync operations where an ESP instruction is corrected by adding the ESP offset register to the current value of the ESP register."}, {"RAT_STALLS.FLAGS", "Counts the number of cycles during which execution stalled due to several reasons, one of which is a partial flag register stall. A partial register stall may occur when two conditions are met: 1) an instruction modifies some, but not all, of the flags in the flag register and 2) the next instruction, which depends on flags, depends on flags that were not modified by this instruction."}, {"RAT_STALLS.REGISTERS", "This event counts the number of cycles instruction execution latency became longer than the defined latency because the instruction used a register that was partially written by a previous instruction."}, {"RAT_STALLS.ROB_READ_PORT", "Counts the number of cycles when ROB read port stalls occurred, which did not allow new micro-ops to enter the out-of-order pipeline. Note that, at this stage in the pipeline, additional stalls may occur at the same cycle and prevent the stalled micro-ops from entering the pipe. In such a case, micro-ops retry entering the execution pipe in the next cycle and the ROB-read port stall is counted again."}, {"RAT_STALLS.SCOREBOARD", "Counts the cycles where we stall due to microarchitecturally required serialization. Microcode scoreboarding stalls."}, {"RAT_STALLS.ANY", "Counts all Register Allocation Table stall cycles due to: cycles when ROB read port stalls occurred, which did not allow new micro-ops to enter the execution pipe; cycles when partial register stalls occurred; cycles when flag stalls occurred; and cycles when floating-point unit (FPU) status word stalls occurred.
To count each of these conditions separately use the events: RAT_STALLS.ROB_READ_PORT, RAT_STALLS.PARTIAL, RAT_STALLS.FLAGS, and RAT_STALLS.FPSW."}, {"SEG_RENAME_STALLS", "Counts the number of stall cycles due to the lack of renaming resources for the ES, DS, FS, and GS segment registers. If a segment is renamed but not retired and a second update to the same segment occurs, a stall occurs in the front-end of the pipeline until the renamed segment retires."}, {"ES_REG_RENAMES", "Counts the number of times the ES segment register is renamed."}, {"UOP_UNFUSION", "Counts unfusion events due to floating point exception to a fused uop."}, {"BR_INST_DECODED", "Counts the number of branch instructions decoded."}, {"BPU_MISSED_CALL_RET", "Counts number of times the Branch Prediction Unit missed predicting a call or return branch."}, {"BACLEAR.CLEAR", "Counts the number of times the front end is resteered, mainly when the Branch Prediction Unit cannot provide a correct prediction and this is corrected by the Branch Address Calculator at the front end. This can occur if the code has many branches such that they cannot be consumed by the BPU. Each BACLEAR asserted by the BAC generates approximately an 8 cycle bubble in the instruction fetch pipeline. The effect on total execution time depends on the surrounding code."}, {"BACLEAR.BAD_TARGET", "Counts number of Branch Address Calculator clears (BACLEAR) asserted due to conditional branch instructions in which there was a target hit but the direction was wrong. Each BACLEAR asserted by the BAC generates approximately an 8 cycle bubble in the instruction fetch pipeline."}, {"BPU_CLEARS.EARLY", "Counts early (normal) Branch Prediction Unit clears: BPU predicted a taken branch after incorrectly assuming that it was not taken. The BPU clear leads to a 2 cycle bubble in the Front End."}, {"BPU_CLEARS.LATE", "Counts late Branch Prediction Unit clears due to Most Recently Used conflicts.
The BPU clear leads to a 3 cycle bubble in the Front End."}, {"THREAD_ACTIVE", "Counts cycles threads are active."}, {"L2_TRANSACTIONS.LOAD", "Counts L2 load operations due to HW prefetch or demand loads."}, {"L2_TRANSACTIONS.RFO", "Counts L2 RFO operations due to HW prefetch or demand RFOs."}, {"L2_TRANSACTIONS.IFETCH", "Counts L2 instruction fetch operations due to HW prefetch or demand ifetch."}, {"L2_TRANSACTIONS.PREFETCH", "Counts L2 prefetch operations."}, {"L2_TRANSACTIONS.L1D_WB", "Counts L1D writeback operations to the L2."}, {"L2_TRANSACTIONS.FILL", "Counts L2 cache line fill operations due to load, RFO, L1D writeback or prefetch."}, {"L2_TRANSACTIONS.WB", "Counts L2 writeback operations to the L3."}, {"L2_TRANSACTIONS.ANY", "Counts all L2 cache operations."}, {"L2_LINES_IN.S_STATE", "Counts the number of cache lines allocated in the L2 cache in the S (shared) state."}, {"L2_LINES_IN.E_STATE", "Counts the number of cache lines allocated in the L2 cache in the E (exclusive) state."}, {"L2_LINES_IN.ANY", "Counts the number of cache lines allocated in the L2 cache."}, {"L2_LINES_OUT.DEMAND_CLEAN", "Counts L2 clean cache lines evicted by a demand request."}, {"L2_LINES_OUT.DEMAND_DIRTY", "Counts L2 dirty (modified) cache lines evicted by a demand request."}, {"L2_LINES_OUT.PREFETCH_CLEAN", "Counts L2 clean cache lines evicted by a prefetch request."}, {"L2_LINES_OUT.PREFETCH_DIRTY", "Counts L2 modified cache lines evicted by a prefetch request."}, {"L2_LINES_OUT.ANY", "Counts all L2 cache lines evicted for any reason."}, {"SQ_MISC.LRU_HINTS", "Counts number of Super Queue LRU hints sent to L3."}, {"SQ_MISC.SPLIT_LOCK", "Counts the number of SQ lock splits across a cache line."}, {"SQ_FULL_STALL_CYCLES", "Counts cycles the Super Queue is full. Neither of the threads on this core will be able to access the uncore."}, {"FP_ASSIST.ALL", "Counts the number of floating point operations executed that required micro-code assist intervention.
Assists are required in the following cases: SSE instructions (denormal input when the DAZ flag is off, or underflow result when the FTZ flag is off); x87 instructions (NaN or denormal are loaded to a register or used as input from memory, division by 0 or underflow output)."}, {"FP_ASSIST.OUTPUT", "Counts number of floating point micro-code assist when the output value (destination register) is invalid."}, {"FP_ASSIST.INPUT", "Counts number of floating point micro-code assist when the input value (one of the source operands to an FP instruction) is invalid."}, {"SIMD_INT_64.PACKED_MPY", "Counts number of SIMD integer 64 bit packed multiply operations."}, {"SIMD_INT_64.PACKED_SHIFT", "Counts number of SIMD integer 64 bit packed shift operations."}, {"SIMD_INT_64.PACK", "Counts number of SIMD integer 64 bit pack operations."}, {"SIMD_INT_64.UNPACK", "Counts number of SIMD integer 64 bit unpack operations."}, {"SIMD_INT_64.PACKED_LOGICAL", "Counts number of SIMD integer 64 bit logical operations."}, {"SIMD_INT_64.PACKED_ARITH", "Counts number of SIMD integer 64 bit arithmetic operations."}, {"SIMD_INT_64.SHUFFLE_MOVE", "Counts number of SIMD integer 64 bit shuffle or move operations."}, {"INSTR_RETIRED_ANY", ""}, {"CPU_CLK_UNHALTED_CORE", ""}, {"CPU_CLK_UNHALTED_REF", ""}, {"GQ_CYCLES_FULL.READ_TRACKER", "Uncore cycles Global Queue read tracker is full."}, {"GQ_CYCLES_FULL.WRITE_TRACKER", "Uncore cycles Global Queue write tracker is full."}, {"GQ_CYCLES_FULL.PEER_PROBE_TRACKER", "Uncore cycles Global Queue peer probe tracker is full. The peer probe tracker queue tracks snoops from the IOH and remote sockets."}, {"GQ_CYCLES_NOT_EMPTY.READ_TRACKER", "Uncore cycles where Global Queue read tracker has at least one valid entry."}, {"GQ_CYCLES_NOT_EMPTY.WRITE_TRACKER", "Uncore cycles where Global Queue write tracker has at least one valid entry."}, {"GQ_CYCLES_NOT_EMPTY.PEER_PROBE_TRACKER", "Uncore cycles where Global Queue peer probe tracker has at least one valid entry.
The peer probe tracker queue tracks IOH and remote socket snoops."}, {"GQ_OCCUPANCY.READ_TRACKER", "Increments the number of queue entries (code read, data read, and RFOs) in the read tracker. The GQ read tracker allocate to deallocate occupancy count is divided by the count to obtain the average read tracker latency."}, {"GQ_ALLOC.READ_TRACKER", "Counts the number of read tracker allocate to deallocate entries. The GQ read tracker allocate to deallocate occupancy count is divided by the count to obtain the average read tracker latency."}, {"GQ_ALLOC.RT_L3_MISS", "Counts the number of GQ read tracker entries for which a full cache line read has missed the L3. The GQ read tracker L3 miss to fill occupancy count is divided by this count to obtain the average cache line read L3 miss latency. The latency represents the time after which the L3 has determined that the cache line has missed. The time between a GQ read tracker allocation and the L3 determining that the cache line has missed is the average L3 hit latency. The total L3 cache line read miss latency is the hit latency + L3 miss latency."}, {"GQ_ALLOC.RT_TO_L3_RESP", "Counts the number of GQ read tracker entries that are allocated in the read tracker queue that hit or miss the L3. The GQ read tracker L3 hit occupancy count is divided by this count to obtain the average L3 hit latency."}, {"GQ_ALLOC.RT_TO_RTID_ACQUIRED", "Counts the number of GQ read tracker entries that are allocated in the read tracker, have missed in the L3 and have not acquired a Request Transaction ID. The GQ read tracker L3 miss to RTID acquired occupancy count is divided by this count to obtain the average latency for a read L3 miss to acquire an RTID."}, {"GQ_ALLOC.WT_TO_RTID_ACQUIRED", "Counts the number of GQ write tracker entries that are allocated in the write tracker, have missed in the L3 and have not acquired a Request Transaction ID. 
The GQ write tracker L3 miss to RTID occupancy count is divided by this count to obtain the average latency for a write L3 miss to acquire an RTID."}, {"GQ_ALLOC.WRITE_TRACKER", "Counts the number of GQ write tracker entries that are allocated in the write tracker queue that miss the L3. The GQ write tracker occupancy count is divided by this count to obtain the average L3 write miss latency."}, {"GQ_ALLOC.PEER_PROBE_TRACKER", "Counts the number of GQ peer probe tracker (snoop) entries that are allocated in the peer probe tracker queue that miss the L3. The GQ peer probe occupancy count is divided by this count to obtain the average L3 peer probe miss latency."}, {"GQ_DATA.FROM_QPI", "Cycles Global Queue Quickpath Interface input data port is busy importing data from the Quickpath Interface. Each cycle the input port can transfer 8 or 16 bytes of data."}, {"GQ_DATA.FROM_QMC", "Cycles Global Queue Quickpath Memory Interface input data port is busy importing data from the Quickpath Memory Interface. Each cycle the input port can transfer 8 or 16 bytes of data."}, {"GQ_DATA.FROM_L3", "Cycles GQ L3 input data port is busy importing data from the Last Level Cache. Each cycle the input port can transfer 32 bytes of data."}, {"GQ_DATA.FROM_CORES_02", "Cycles GQ Core 0 and 2 input data port is busy importing data from processor cores 0 and 2. Each cycle the input port can transfer 32 bytes of data."}, {"GQ_DATA.FROM_CORES_13", "Cycles GQ Core 1 and 3 input data port is busy importing data from processor cores 1 and 3. Each cycle the input port can transfer 32 bytes of data."}, {"GQ_DATA.TO_QPI_QMC", "Cycles GQ QPI and QMC output data port is busy sending data to the Quickpath Interface or Quickpath Memory Interface. Each cycle the output port can transfer 32 bytes of data."}, {"GQ_DATA.TO_L3", "Cycles GQ L3 output data port is busy sending data to the Last Level Cache. 
Each cycle the output port can transfer 32 bytes of data."}, {"GQ_DATA.TO_CORES", "Cycles GQ Core output data port is busy sending data to the Cores. Each cycle the output port can transfer 32 bytes of data."}, {"SNP_RESP_TO_LOCAL_HOME.I_STATE", "Number of snoop responses to the local home that L3 does not have the referenced cache line."}, {"SNP_RESP_TO_LOCAL_HOME.S_STATE", "Number of snoop responses to the local home that L3 has the referenced line cached in the S state."}, {"SNP_RESP_TO_LOCAL_HOME.FWD_S_STATE", "Number of responses to code or data read snoops to the local home that the L3 has the referenced cache line in the E state. The L3 cache line state is changed to the S state and the line is forwarded to the local home in the S state."}, {"SNP_RESP_TO_LOCAL_HOME.FWD_I_STATE", "Number of responses to read invalidate snoops to the local home that the L3 has the referenced cache line in the M state. The L3 cache line state is invalidated and the line is forwarded to the local home in the M state."}, {"SNP_RESP_TO_LOCAL_HOME.CONFLICT", "Number of conflict snoop responses sent to the local home."}, {"SNP_RESP_TO_LOCAL_HOME.WB", "Number of responses to code or data read snoops to the local home that the L3 has the referenced line cached in the M state."}, {"SNP_RESP_TO_REMOTE_HOME.I_STATE", "Number of snoop responses to a remote home that L3 does not have the referenced cache line."}, {"SNP_RESP_TO_REMOTE_HOME.S_STATE", "Number of snoop responses to a remote home that L3 has the referenced line cached in the S state."}, {"SNP_RESP_TO_REMOTE_HOME.FWD_S_STATE", "Number of responses to code or data read snoops to a remote home that the L3 has the referenced cache line in the E state. The L3 cache line state is changed to the S state and the line is forwarded to the remote home in the S state."}, {"SNP_RESP_TO_REMOTE_HOME.FWD_I_STATE", "Number of responses to read invalidate snoops to a remote home that the L3 has the referenced cache line in the M state. 
The L3 cache line state is invalidated and the line is forwarded to the remote home in the M state."}, {"SNP_RESP_TO_REMOTE_HOME.CONFLICT", "Number of conflict snoop responses sent to the local home."}, {"SNP_RESP_TO_REMOTE_HOME.WB", "Number of responses to code or data read snoops to a remote home that the L3 has the referenced line cached in the M state."}, {"SNP_RESP_TO_REMOTE_HOME.HITM", "Number of HITM snoop responses to a remote home."}, {"L3_HITS.READ", "Number of code read, data read and RFO requests that hit in the L3."}, {"L3_HITS.WRITE", "Number of writeback requests that hit in the L3. Writebacks from the cores will always result in L3 hits due to the inclusive property of the L3."}, {"L3_HITS.PROBE", "Number of snoops from IOH or remote sockets that hit in the L3."}, {"L3_HITS.ANY", "Number of reads and writes that hit the L3."}, {"L3_MISS.READ", "Number of code read, data read and RFO requests that miss the L3."}, {"L3_MISS.WRITE", "Number of writeback requests that miss the L3. Should always be zero as writebacks from the cores will always result in L3 hits due to the inclusive property of the L3."}, {"L3_MISS.PROBE", "Number of snoops from IOH or remote sockets that miss the L3."}, {"L3_MISS.ANY", "Number of reads and writes that miss the L3."}, {"L3_LINES_IN.M_STATE", "Counts the number of L3 lines allocated in M state. The only time a cache line is allocated in the M state is when the line is forwarded in the M state due to a Snoop Read Invalidate Own request."}, {"L3_LINES_IN.E_STATE", "Counts the number of L3 lines allocated in E state."}, {"L3_LINES_IN.S_STATE", "Counts the number of L3 lines allocated in S state."}, {"L3_LINES_IN.F_STATE", "Counts the number of L3 lines allocated in F state."}, {"L3_LINES_IN.ANY", "Counts the number of L3 lines allocated in any state."}, {"L3_LINES_OUT.M_STATE", "Counts the number of L3 lines victimized that were in the M state. 
When the victim cache line is in M state, the line is written to its home cache agent which can be either local or remote."}, {"L3_LINES_OUT.E_STATE", "Counts the number of L3 lines victimized that were in the E state."}, {"L3_LINES_OUT.S_STATE", "Counts the number of L3 lines victimized that were in the S state."}, {"L3_LINES_OUT.I_STATE", "Counts the number of L3 lines victimized that were in the I state."}, {"L3_LINES_OUT.F_STATE", "Counts the number of L3 lines victimized that were in the F state."}, {"L3_LINES_OUT.ANY", "Counts the number of L3 lines victimized in any state."}, {"GQ_SNOOP.GOTO_S", "Counts the number of remote snoops that have requested a cache line be set to the S state."}, {"GQ_SNOOP.GOTO_I", "Counts the number of remote snoops that have requested a cache line be set to the I state."}, {"GQ_SNOOP.GOTO_S_HIT", "Counts the number of remote snoops that have requested a cache line be set to the S state from E state. Requires writing MSR 301H with mask = 2H"}, {"GQ_SNOOP.GOTO_I_HIT", "Counts the number of remote snoops that have requested a cache line be set to the I state from F (forward) state. 
Requires writing MSR 301H with mask = 8H"}, {"QHL_REQUESTS.IOH_READS", "Counts number of Quickpath Home Logic read requests from the IOH."}, {"QHL_REQUESTS.IOH_WRITES", "Counts number of Quickpath Home Logic write requests from the IOH."}, {"QHL_REQUESTS.REMOTE_READS", "Counts number of Quickpath Home Logic read requests from a remote socket."}, {"QHL_REQUESTS.REMOTE_WRITES", "Counts number of Quickpath Home Logic write requests from a remote socket."}, {"QHL_REQUESTS.LOCAL_READS", "Counts number of Quickpath Home Logic read requests from the local socket."}, {"QHL_REQUESTS.LOCAL_WRITES", "Counts number of Quickpath Home Logic write requests from the local socket."}, {"QHL_CYCLES_FULL.IOH", "Counts uclk cycles all entries in the Quickpath Home Logic IOH are full."}, {"QHL_CYCLES_FULL.REMOTE", "Counts uclk cycles all entries in the Quickpath Home Logic remote tracker are full."}, {"QHL_CYCLES_FULL.LOCAL", "Counts uclk cycles all entries in the Quickpath Home Logic local tracker are full."}, {"QHL_CYCLES_NOT_EMPTY.IOH", "Counts uclk cycles all entries in the Quickpath Home Logic IOH is busy."}, {"QHL_CYCLES_NOT_EMPTY.REMOTE", "Counts uclk cycles all entries in the Quickpath Home Logic remote tracker is busy."}, {"QHL_CYCLES_NOT_EMPTY.LOCAL", "Counts uclk cycles all entries in the Quickpath Home Logic local tracker is busy."}, {"QHL_OCCUPANCY.IOH", "QHL IOH tracker allocate to deallocate read occupancy."}, {"QHL_OCCUPANCY.REMOTE", "QHL remote tracker allocate to deallocate read occupancy."}, {"QHL_OCCUPANCY.LOCAL", "QHL local tracker allocate to deallocate read occupancy."}, {"QHL_ADDRESS_CONFLICTS.2WAY", "Counts number of QHL Active Address Table (AAT) entries that saw a max of 2 conflicts. The AAT is a structure that tracks requests that are in conflict. The requests themselves are in the home tracker entries. 
The count is reported when an AAT entry deallocates."}, {"QHL_ADDRESS_CONFLICTS.3WAY", "Counts number of QHL Active Address Table (AAT) entries that saw a max of 3 conflicts. The AAT is a structure that tracks requests that are in conflict. The requests themselves are in the home tracker entries. The count is reported when an AAT entry deallocates."}, {"QHL_CONFLICT_CYCLES.IOH", "Counts cycles the Quickpath Home Logic IOH Tracker contains two or more requests with an address conflict. A max of 3 requests can be in conflict."}, {"QHL_CONFLICT_CYCLES.REMOTE", "Counts cycles the Quickpath Home Logic Remote Tracker contains two or more requests with an address conflict. A max of 3 requests can be in conflict."}, {"QHL_CONFLICT_CYCLES.LOCAL", "Counts cycles the Quickpath Home Logic Local Tracker contains two or more requests with an address conflict. A max of 3 requests can be in conflict."}, {"QHL_TO_QMC_BYPASS", "Counts number of requests to the Quickpath Memory Controller that bypass the Quickpath Home Logic. All local accesses can be bypassed. 
For remote requests, only read requests can be bypassed."}, {"QMC_ISOC_FULL.READ.CH0", "Counts cycles all the entries in the DRAM channel 0 high priority queue are occupied with isochronous read requests."}, {"QMC_ISOC_FULL.READ.CH1", "Counts cycles all the entries in the DRAM channel 1 high priority queue are occupied with isochronous read requests."}, {"QMC_ISOC_FULL.READ.CH2", "Counts cycles all the entries in the DRAM channel 2 high priority queue are occupied with isochronous read requests."}, {"QMC_ISOC_FULL.WRITE.CH0", "Counts cycles all the entries in the DRAM channel 0 high priority queue are occupied with isochronous write requests."}, {"QMC_ISOC_FULL.WRITE.CH1", "Counts cycles all the entries in the DRAM channel 1 high priority queue are occupied with isochronous write requests."}, {"QMC_ISOC_FULL.WRITE.CH2", "Counts cycles all the entries in the DRAM channel 2 high priority queue are occupied with isochronous write requests."}, {"QMC_BUSY.READ.CH0", "Counts cycles where Quickpath Memory Controller has at least 1 outstanding read request to DRAM channel 0."}, {"QMC_BUSY.READ.CH1", "Counts cycles where Quickpath Memory Controller has at least 1 outstanding read request to DRAM channel 1."}, {"QMC_BUSY.READ.CH2", "Counts cycles where Quickpath Memory Controller has at least 1 outstanding read request to DRAM channel 2."}, {"QMC_BUSY.WRITE.CH0", "Counts cycles where Quickpath Memory Controller has at least 1 outstanding write request to DRAM channel 0."}, {"QMC_BUSY.WRITE.CH1", "Counts cycles where Quickpath Memory Controller has at least 1 outstanding write request to DRAM channel 1."}, {"QMC_BUSY.WRITE.CH2", "Counts cycles where Quickpath Memory Controller has at least 1 outstanding write request to DRAM channel 2."}, {"QMC_OCCUPANCY.CH0", "IMC channel 0 normal read request occupancy."}, {"QMC_OCCUPANCY.CH1", "IMC channel 1 normal read request occupancy."}, {"QMC_OCCUPANCY.CH2", "IMC channel 2 normal read request occupancy."}, {"QMC_OCCUPANCY.ANY", 
"Normal read request occupancy for any channel."}, {"QMC_ISSOC_OCCUPANCY.CH0", "IMC channel 0 issoc read request occupancy."}, {"QMC_ISSOC_OCCUPANCY.CH1", "IMC channel 1 issoc read request occupancy."}, {"QMC_ISSOC_OCCUPANCY.CH2", "IMC channel 2 issoc read request occupancy."}, {"QMC_ISSOC_READS.ANY", "IMC issoc read request occupancy."}, {"QMC_NORMAL_READS.CH0", "Counts the number of Quickpath Memory Controller channel 0 medium and low priority read requests. The QMC channel 0 normal read occupancy divided by this count provides the average QMC channel 0 read latency."}, {"QMC_NORMAL_READS.CH1", "Counts the number of Quickpath Memory Controller channel 1 medium and low priority read requests. The QMC channel 1 normal read occupancy divided by this count provides the average QMC channel 1 read latency."}, {"QMC_NORMAL_READS.CH2", "Counts the number of Quickpath Memory Controller channel 2 medium and low priority read requests. The QMC channel 2 normal read occupancy divided by this count provides the average QMC channel 2 read latency."}, {"QMC_NORMAL_READS.ANY", "Counts the number of Quickpath Memory Controller medium and low priority read requests. 
The QMC normal read occupancy divided by this count provides the average QMC read latency."}, {"QMC_HIGH_PRIORITY_READS.CH0", "Counts the number of Quickpath Memory Controller channel 0 high priority isochronous read requests."}, {"QMC_HIGH_PRIORITY_READS.CH1", "Counts the number of Quickpath Memory Controller channel 1 high priority isochronous read requests."}, {"QMC_HIGH_PRIORITY_READS.CH2", "Counts the number of Quickpath Memory Controller channel 2 high priority isochronous read requests."}, {"QMC_HIGH_PRIORITY_READS.ANY", "Counts the number of Quickpath Memory Controller high priority isochronous read requests."}, {"QMC_CRITICAL_PRIORITY_READS.CH0", "Counts the number of Quickpath Memory Controller channel 0 critical priority isochronous read requests."}, {"QMC_CRITICAL_PRIORITY_READS.CH1", "Counts the number of Quickpath Memory Controller channel 1 critical priority isochronous read requests."}, {"QMC_CRITICAL_PRIORITY_READS.CH2", "Counts the number of Quickpath Memory Controller channel 2 critical priority isochronous read requests."}, {"QMC_CRITICAL_PRIORITY_READS.ANY", "Counts the number of Quickpath Memory Controller critical priority isochronous read requests."}, {"QMC_WRITES.FULL.CH0", "Counts number of full cache line writes to DRAM channel 0."}, {"QMC_WRITES.FULL.CH1", "Counts number of full cache line writes to DRAM channel 1."}, {"QMC_WRITES.FULL.CH2", "Counts number of full cache line writes to DRAM channel 2."}, {"QMC_WRITES.FULL.ANY", "Counts number of full cache line writes to DRAM."}, {"QMC_WRITES.PARTIAL.CH0", "Counts number of partial cache line writes to DRAM channel 0."}, {"QMC_WRITES.PARTIAL.CH1", "Counts number of partial cache line writes to DRAM channel 1."}, {"QMC_WRITES.PARTIAL.CH2", "Counts number of partial cache line writes to DRAM channel 2."}, {"QMC_WRITES.PARTIAL.ANY", "Counts number of partial cache line writes to DRAM."}, {"QMC_CANCEL.CH0", "Counts number of DRAM channel 0 cancel requests."}, {"QMC_CANCEL.CH1", "Counts number 
of DRAM channel 1 cancel requests."}, {"QMC_CANCEL.CH2", "Counts number of DRAM channel 2 cancel requests."}, {"QMC_CANCEL.ANY", "Counts number of DRAM cancel requests."}, {"QMC_PRIORITY_UPDATES.CH0", "Counts number of DRAM channel 0 priority updates. A priority update occurs when an ISOC high or critical request is received by the QHL and there is a matching request with normal priority that has already been issued to the QMC. In this instance, the QHL will send a priority update to QMC to expedite the request."}, {"QMC_PRIORITY_UPDATES.CH1", "Counts number of DRAM channel 1 priority updates. A priority update occurs when an ISOC high or critical request is received by the QHL and there is a matching request with normal priority that has already been issued to the QMC. In this instance, the QHL will send a priority update to QMC to expedite the request."}, {"QMC_PRIORITY_UPDATES.CH2", "Counts number of DRAM channel 2 priority updates. A priority update occurs when an ISOC high or critical request is received by the QHL and there is a matching request with normal priority that has already been issued to the QMC. In this instance, the QHL will send a priority update to QMC to expedite the request."}, {"QMC_PRIORITY_UPDATES.ANY", "Counts number of DRAM priority updates. A priority update occurs when an ISOC high or critical request is received by the QHL and there is a matching request with normal priority that has already been issued to the QMC. In this instance, the QHL will send a priority update to QMC to expedite the request."}, {"IMC_RETRY.CH0", "Counts number of IMC DRAM channel 0 retries. DRAM retry only occurs when configured in RAS mode."}, {"IMC_RETRY.CH1", "Counts number of IMC DRAM channel 1 retries. DRAM retry only occurs when configured in RAS mode."}, {"IMC_RETRY.CH2", "Counts number of IMC DRAM channel 2 retries. DRAM retry only occurs when configured in RAS mode."}, {"IMC_RETRY.ANY", "Counts number of IMC DRAM retries from any channel. 
DRAM retry only occurs when configured in RAS mode."}, {"QHL_FRC_ACK_CNFLTS.IOH", "Counts number of Force Acknowledge Conflict messages sent by the Quickpath Home Logic to the IOH."}, {"QHL_FRC_ACK_CNFLTS.REMOTE", "Counts number of Force Acknowledge Conflict messages sent by the Quickpath Home Logic to the remote home."}, {"QHL_FRC_ACK_CNFLTS.LOCAL", "Counts number of Force Acknowledge Conflict messages sent by the Quickpath Home Logic to the local home."}, {"QHL_FRC_ACK_CNFLTS.ANY", "Counts number of Force Acknowledge Conflict messages sent by the Quickpath Home Logic."}, {"QHL_SLEEPS.IOH_ORDER", "Counts number of occurrences a request was put to sleep due to IOH ordering (write after read) conflicts. While in the sleep state, the request is not eligible to be scheduled to the QMC."}, {"QHL_SLEEPS.REMOTE_ORDER", "Counts number of occurrences a request was put to sleep due to remote socket ordering (write after read) conflicts. While in the sleep state, the request is not eligible to be scheduled to the QMC."}, {"QHL_SLEEPS.LOCAL_ORDER", "Counts number of occurrences a request was put to sleep due to local socket ordering (write after read) conflicts. While in the sleep state, the request is not eligible to be scheduled to the QMC."}, {"QHL_SLEEPS.IOH_CONFLICT", "Counts number of occurrences a request was put to sleep due to IOH address conflicts. While in the sleep state, the request is not eligible to be scheduled to the QMC."}, {"QHL_SLEEPS.REMOTE_CONFLICT", "Counts number of occurrences a request was put to sleep due to remote socket address conflicts. While in the sleep state, the request is not eligible to be scheduled to the QMC."}, {"QHL_SLEEPS.LOCAL_CONFLICT", "Counts number of occurrences a request was put to sleep due to local socket address conflicts. 
While in the sleep state, the request is not eligible to be scheduled to the QMC."}, {"ADDR_OPCODE_MATCH.IOH", "Counts number of requests from the IOH, address/opcode of request is qualified by mask value written to MSR 396H. The following mask values are supported: 0: NONE 40000000_00000000H:RSPFWDI 40001A00_00000000H:RSPFWDS 40001D00_00000000H:RSPIWB Match opcode/address by writing MSR 396H with a supported mask value."}, {"ADDR_OPCODE_MATCH.REMOTE", "Counts number of requests from the remote socket, address/opcode of request is qualified by mask value written to MSR 396H. The following mask values are supported: 0: NONE 40000000_00000000H:RSPFWDI 40001A00_00000000H:RSPFWDS 40001D00_00000000H:RSPIWB Match opcode/address by writing MSR 396H with a supported mask value."}, {"ADDR_OPCODE_MATCH.LOCAL", "Counts number of requests from the local socket, address/opcode of request is qualified by mask value written to MSR 396H. The following mask values are supported: 0: NONE 40000000_00000000H:RSPFWDI 40001A00_00000000H:RSPFWDS 40001D00_00000000H:RSPIWB Match opcode/address by writing MSR 396H with a supported mask value."}, {"QPI_TX_STALLED_SINGLE_FLIT.HOME.LINK_0", "Counts cycles the Quickpath outbound link 0 HOME virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_SINGLE_FLIT.SNOOP.LINK_0", "Counts cycles the Quickpath outbound link 0 SNOOP virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_SINGLE_FLIT.NDR.LINK_0", "Counts cycles the Quickpath outbound link 0 non-data response virtual channel is stalled due to lack of a VNA and VN0 credit. 
Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_SINGLE_FLIT.HOME.LINK_1", "Counts cycles the Quickpath outbound link 1 HOME virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_SINGLE_FLIT.SNOOP.LINK_1", "Counts cycles the Quickpath outbound link 1 SNOOP virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_SINGLE_FLIT.NDR.LINK_1", "Counts cycles the Quickpath outbound link 1 non-data response virtual channel is stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_SINGLE_FLIT.LINK_0", "Counts cycles the Quickpath outbound link 0 virtual channels are stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_SINGLE_FLIT.LINK_1", "Counts cycles the Quickpath outbound link 1 virtual channels are stalled due to lack of a VNA and VN0 credit. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_MULTI_FLIT.DRS.LINK_0", "Counts cycles the Quickpath outbound link 0 Data ResponSe virtual channel is stalled due to lack of VNA and VN0 credits. 
Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_MULTI_FLIT.NCB.LINK_0", "Counts cycles the Quickpath outbound link 0 Non-Coherent Bypass virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_MULTI_FLIT.NCS.LINK_0", "Counts cycles the Quickpath outbound link 0 Non-Coherent Standard virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_MULTI_FLIT.DRS.LINK_1", "Counts cycles the Quickpath outbound link 1 Data ResponSe virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_MULTI_FLIT.NCB.LINK_1", "Counts cycles the Quickpath outbound link 1 Non-Coherent Bypass virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_MULTI_FLIT.NCS.LINK_1", "Counts cycles the Quickpath outbound link 1 Non-Coherent Standard virtual channel is stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_MULTI_FLIT.LINK_0", "Counts cycles the Quickpath outbound link 0 virtual channels are stalled due to lack of VNA and VN0 credits. 
Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_STALLED_MULTI_FLIT.LINK_1", "Counts cycles the Quickpath outbound link 1 virtual channels are stalled due to lack of VNA and VN0 credits. Note that this event does not filter out when a flit would not have been selected for arbitration because another virtual channel is getting arbitrated."}, {"QPI_TX_HEADER.FULL.LINK_0", "Number of cycles that the header buffer in the Quickpath Interface outbound link 0 is full."}, {"QPI_TX_HEADER.BUSY.LINK_0", "Number of cycles that the header buffer in the Quickpath Interface outbound link 0 is busy."}, {"QPI_TX_HEADER.FULL.LINK_1", "Number of cycles that the header buffer in the Quickpath Interface outbound link 1 is full."}, {"QPI_TX_HEADER.BUSY.LINK_1", "Number of cycles that the header buffer in the Quickpath Interface outbound link 1 is busy."}, {"QPI_RX_NO_PPT_CREDIT.STALLS.LINK_0", "Number of cycles that snoop packets incoming to the Quickpath Interface link 0 are stalled and not sent to the GQ because the GQ Peer Probe Tracker (PPT) does not have any available entries."}, {"QPI_RX_NO_PPT_CREDIT.STALLS.LINK_1", "Number of cycles that snoop packets incoming to the Quickpath Interface link 1 are stalled and not sent to the GQ because the GQ Peer Probe Tracker (PPT) does not have any available entries."}, {"DRAM_OPEN.CH0", "Counts number of DRAM Channel 0 open commands issued either for read or write. To read or write data, the referenced DRAM page must first be opened."}, {"DRAM_OPEN.CH1", "Counts number of DRAM Channel 1 open commands issued either for read or write. To read or write data, the referenced DRAM page must first be opened."}, {"DRAM_OPEN.CH2", "Counts number of DRAM Channel 2 open commands issued either for read or write. 
To read or write data, the referenced DRAM page must first be opened."}, {"DRAM_PAGE_CLOSE.CH0", "DRAM channel 0 command issued to CLOSE a page due to page idle timer expiration. Closing a page is done by issuing a precharge."}, {"DRAM_PAGE_CLOSE.CH1", "DRAM channel 1 command issued to CLOSE a page due to page idle timer expiration. Closing a page is done by issuing a precharge."}, {"DRAM_PAGE_CLOSE.CH2", "DRAM channel 2 command issued to CLOSE a page due to page idle timer expiration. Closing a page is done by issuing a precharge."}, {"DRAM_PAGE_MISS.CH0", "Counts the number of precharges (PRE) that were issued to DRAM channel 0 because there was a page miss. A page miss refers to a situation in which a page is currently open and another page from the same bank needs to be opened. The new page experiences a page miss. Closing of the old page is done by issuing a precharge."}, {"DRAM_PAGE_MISS.CH1", "Counts the number of precharges (PRE) that were issued to DRAM channel 1 because there was a page miss. A page miss refers to a situation in which a page is currently open and another page from the same bank needs to be opened. The new page experiences a page miss. Closing of the old page is done by issuing a precharge."}, {"DRAM_PAGE_MISS.CH2", "Counts the number of precharges (PRE) that were issued to DRAM channel 2 because there was a page miss. A page miss refers to a situation in which a page is currently open and another page from the same bank needs to be opened. The new page experiences a page miss. 
Closing of the old page is done by issuing a precharge."}, {"DRAM_READ_CAS.CH0", "Counts the number of times a read CAS command was issued on DRAM channel 0."}, {"DRAM_READ_CAS.AUTOPRE_CH0", "Counts the number of times a read CAS command was issued on DRAM channel 0 where the command issued used the auto-precharge (auto page close) mode."}, {"DRAM_READ_CAS.CH1", "Counts the number of times a read CAS command was issued on DRAM channel 1."}, {"DRAM_READ_CAS.AUTOPRE_CH1", "Counts the number of times a read CAS command was issued on DRAM channel 1 where the command issued used the auto-precharge (auto page close) mode."}, {"DRAM_READ_CAS.CH2", "Counts the number of times a read CAS command was issued on DRAM channel 2."}, {"DRAM_READ_CAS.AUTOPRE_CH2", "Counts the number of times a read CAS command was issued on DRAM channel 2 where the command issued used the auto-precharge (auto page close) mode."}, {"DRAM_WRITE_CAS.CH0", "Counts the number of times a write CAS command was issued on DRAM channel 0."}, {"DRAM_WRITE_CAS.AUTOPRE_CH0", "Counts the number of times a write CAS command was issued on DRAM channel 0 where the command issued used the auto-precharge (auto page close) mode."}, {"DRAM_WRITE_CAS.CH1", "Counts the number of times a write CAS command was issued on DRAM channel 1."}, {"DRAM_WRITE_CAS.AUTOPRE_CH1", "Counts the number of times a write CAS command was issued on DRAM channel 1 where the command issued used the auto-precharge (auto page close) mode."}, {"DRAM_WRITE_CAS.CH2", "Counts the number of times a write CAS command was issued on DRAM channel 2."}, {"DRAM_WRITE_CAS.AUTOPRE_CH2", "Counts the number of times a write CAS command was issued on DRAM channel 2 where the command issued used the auto-precharge (auto page close) mode."}, {"DRAM_REFRESH.CH0", "Counts number of DRAM channel 0 refresh commands. DRAM loses data content over time. 
In order to keep correct data content, the data values have to be refreshed periodically."}, {"DRAM_REFRESH.CH1", "Counts number of DRAM channel 1 refresh commands. DRAM loses data content over time. In order to keep correct data content, the data values have to be refreshed periodically."}, {"DRAM_REFRESH.CH2", "Counts number of DRAM channel 2 refresh commands. DRAM loses data content over time. In order to keep correct data content, the data values have to be refreshed periodically."}, {"DRAM_PRE_ALL.CH0", "Counts number of DRAM Channel 0 precharge-all (PREALL) commands that close all open pages in a rank. PREALL is issued when the DRAM needs to be refreshed or needs to go into a power down mode."}, {"DRAM_PRE_ALL.CH1", "Counts number of DRAM Channel 1 precharge-all (PREALL) commands that close all open pages in a rank. PREALL is issued when the DRAM needs to be refreshed or needs to go into a power down mode."}, {"DRAM_PRE_ALL.CH2", "Counts number of DRAM Channel 2 precharge-all (PREALL) commands that close all open pages in a rank. 
PREALL is issued when the DRAM needs to be refreshed or needs to go into a power down mode."}, {"DRAM_THERMAL_THROTTLED", "Uncore cycles DRAM was throttled due to its temperature being above the thermal throttling threshold."}, {"THERMAL_THROTTLING_TEMP.CORE_0", "Cycles that the PCU records that core 0 is above the thermal throttling threshold temperature."}, {"THERMAL_THROTTLING_TEMP.CORE_1", "Cycles that the PCU records that core 1 is above the thermal throttling threshold temperature."}, {"THERMAL_THROTTLING_TEMP.CORE_2", "Cycles that the PCU records that core 2 is above the thermal throttling threshold temperature."}, {"THERMAL_THROTTLING_TEMP.CORE_3", "Cycles that the PCU records that core 3 is above the thermal throttling threshold temperature."}, {"THERMAL_THROTTLED_TEMP.CORE_0", "Cycles that the PCU records that core 0 is in the power throttled state due to the core's temperature being above the thermal throttling threshold."}, {"THERMAL_THROTTLED_TEMP.CORE_1", "Cycles that the PCU records that core 1 is in the power throttled state due to the core's temperature being above the thermal throttling threshold."}, {"THERMAL_THROTTLED_TEMP.CORE_2", "Cycles that the PCU records that core 2 is in the power throttled state due to the core's temperature being above the thermal throttling threshold."}, {"THERMAL_THROTTLED_TEMP.CORE_3", "Cycles that the PCU records that core 3 is in the power throttled state due to the core's temperature being above the thermal throttling threshold."}, {"PROCHOT_ASSERTION", "Number of system assertions of PROCHOT indicating the entire processor has exceeded the thermal limit."}, {"THERMAL_THROTTLING_PROCHOT.CORE_0", "Cycles that the PCU records that core 0 is in a low power state due to the system asserting PROCHOT, indicating the entire processor has exceeded the thermal limit."}, {"THERMAL_THROTTLING_PROCHOT.CORE_1", "Cycles that the PCU records that core 1 is in a low power state due to the system asserting PROCHOT, indicating the entire processor has exceeded the thermal
limit."}, {"THERMAL_THROTTLING_PROCHOT.CORE_2", "Cycles that the PCU records that core 2 is a low power state due to the system asserting PROCHOT the entire processor has exceeded the thermal limit."}, {"THERMAL_THROTTLING_PROCHOT.CORE_3", "Cycles that the PCU records that core 3 is a low power state due to the system asserting PROCHOT the entire processor has exceeded the thermal limit."}, {"TURBO_MODE.CORE_0", "Uncore cycles that core 0 is operating in turbo mode."}, {"TURBO_MODE.CORE_1", "Uncore cycles that core 1 is operating in turbo mode."}, {"TURBO_MODE.CORE_2", "Uncore cycles that core 2 is operating in turbo mode."}, {"TURBO_MODE.CORE_3", "Uncore cycles that core 3 is operating in turbo mode."}, {"CYCLES_UNHALTED_L3_FLL_ENABLE", "Uncore cycles that at least one core is unhalted and all L3 ways are enabled."}, {"CYCLES_UNHALTED_L3_FLL_DISABLE", "Uncore cycles that at least one core is unhalted and all L3 ways are disabled."}, { NULL, NULL } }; papi-papi-7-2-0-t/src/freebsd/map-westmere.h000066400000000000000000000455231502707512200206310ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: map-westmere.h * Author: George Neville-Neil * gnn@freebsd.org */ #ifndef FreeBSD_MAP_WESTMERE #define FreeBSD_MAP_WESTMERE enum NativeEvent_Value_WestmereProcessor { PNE_WESTMERE_LOAD_BLOCK_OVERLAP_STORE= PAPI_NATIVE_MASK , PNE_WESTMERE_SB_DRAIN_ANY, PNE_WESTMERE_MISALIGN_MEMORY_STORE, PNE_WESTMERE_STORE_BLOCKS_AT_RET, PNE_WESTMERE_STORE_BLOCKS_L1D_BLOCK, PNE_WESTMERE_PARTIAL_ADDRESS_ALIAS, PNE_WESTMERE_DTLB_LOAD_MISSES_ANY, PNE_WESTMERE_DTLB_LOAD_MISSES_WALK_COMPLETED, PNE_WESTMERE_DTLB_LOAD_MISSES_WALK_CYCLES, PNE_WESTMERE_DTLB_LOAD_MISSES_STLB_HIT, PNE_WESTMERE_DTLB_LOAD_MISSES_PDE_MISS, PNE_WESTMERE_MEM_INST_RETIRED_LOADS, PNE_WESTMERE_MEM_INST_RETIRED_STORES, PNE_WESTMERE_MEM_INST_RETIRED_LATENCY_ABOVE_THRESHOLD, PNE_WESTMERE_MEM_STORE_RETIRED_DTLB_MISS, PNE_WESTMERE_UOPS_ISSUED_ANY, 
PNE_WESTMERE_UOPS_ISSUED_STALLED_CYCLES, PNE_WESTMERE_UOPS_ISSUED_FUSED, PNE_WESTMERE_MEM_UNCORE_RETIRED_LOCAL_HITM, PNE_WESTMERE_MEM_UNCORE_RETIRED_LOCAL_DRAM_AND_REMOTE_CACHE_HIT, PNE_WESTMERE_MEM_UNCORE_RETIRED_LOCAL_DRAM, PNE_WESTMERE_MEM_UNCORE_RETIRED_REMOTE_DRAM, PNE_WESTMERE_MEM_UNCORE_RETIRED_UNCACHEABLE, PNE_WESTMERE_FP_COMP_OPS_EXE_X87, PNE_WESTMERE_FP_COMP_OPS_EXE_MMX, PNE_WESTMERE_FP_COMP_OPS_EXE_SSE_FP, PNE_WESTMERE_FP_COMP_OPS_EXE_SSE2_INTEGER, PNE_WESTMERE_FP_COMP_OPS_EXE_SSE_FP_PACKED, PNE_WESTMERE_FP_COMP_OPS_EXE_SSE_FP_SCALAR, PNE_WESTMERE_FP_COMP_OPS_EXE_SSE_SINGLE_PRECISION, PNE_WESTMERE_FP_COMP_OPS_EXE_SSE_DOUBLE_PRECISION, PNE_WESTMERE_SIMD_INT_128_PACKED_MPY, PNE_WESTMERE_SIMD_INT_128_PACKED_SHIFT, PNE_WESTMERE_SIMD_INT_128_PACK, PNE_WESTMERE_SIMD_INT_128_UNPACK, PNE_WESTMERE_SIMD_INT_128_PACKED_LOGICAL, PNE_WESTMERE_SIMD_INT_128_PACKED_ARITH, PNE_WESTMERE_SIMD_INT_128_SHUFFLE_MOVE, PNE_WESTMERE_LOAD_DISPATCH_RS, PNE_WESTMERE_LOAD_DISPATCH_RS_DELAYED, PNE_WESTMERE_LOAD_DISPATCH_MOB, PNE_WESTMERE_LOAD_DISPATCH_ANY, PNE_WESTMERE_ARITH_CYCLES_DIV_BUSY, PNE_WESTMERE_ARITH_MUL, PNE_WESTMERE_INST_QUEUE_WRITES, PNE_WESTMERE_INST_DECODED_DEC0, PNE_WESTMERE_TWO_UOP_INSTS_DECODED, PNE_WESTMERE_INST_QUEUE_WRITE_CYCLES, PNE_WESTMERE_LSD_OVERFLOW, PNE_WESTMERE_L2_RQSTS_LD_HIT, PNE_WESTMERE_L2_RQSTS_LD_MISS, PNE_WESTMERE_L2_RQSTS_LOADS, PNE_WESTMERE_L2_RQSTS_RFO_HIT, PNE_WESTMERE_L2_RQSTS_RFO_MISS, PNE_WESTMERE_L2_RQSTS_RFOS, PNE_WESTMERE_L2_RQSTS_IFETCH_HIT, PNE_WESTMERE_L2_RQSTS_IFETCH_MISS, PNE_WESTMERE_L2_RQSTS_IFETCHES, PNE_WESTMERE_L2_RQSTS_PREFETCH_HIT, PNE_WESTMERE_L2_RQSTS_PREFETCH_MISS, PNE_WESTMERE_L2_RQSTS_PREFETCHES, PNE_WESTMERE_L2_RQSTS_MISS, PNE_WESTMERE_L2_RQSTS_REFERENCES, PNE_WESTMERE_L2_DATA_RQSTS_DEMAND_I_STATE, PNE_WESTMERE_L2_DATA_RQSTS_DEMAND_S_STATE, PNE_WESTMERE_L2_DATA_RQSTS_DEMAND_E_STATE, PNE_WESTMERE_L2_DATA_RQSTS_DEMAND_M_STATE, PNE_WESTMERE_L2_DATA_RQSTS_DEMAND_MESI, PNE_WESTMERE_L2_DATA_RQSTS_PREFETCH_I_STATE, 
PNE_WESTMERE_L2_DATA_RQSTS_PREFETCH_S_STATE, PNE_WESTMERE_L2_DATA_RQSTS_PREFETCH_E_STATE, PNE_WESTMERE_L2_DATA_RQSTS_PREFETCH_M_STATE, PNE_WESTMERE_L2_DATA_RQSTS_PREFETCH_MESI, PNE_WESTMERE_L2_DATA_RQSTS_ANY, PNE_WESTMERE_L2_WRITE_RFO_I_STATE, PNE_WESTMERE_L2_WRITE_RFO_S_STATE, PNE_WESTMERE_L2_WRITE_RFO_M_STATE, PNE_WESTMERE_L2_WRITE_RFO_HIT, PNE_WESTMERE_L2_WRITE_RFO_MESI, PNE_WESTMERE_L2_WRITE_LOCK_I_STATE, PNE_WESTMERE_L2_WRITE_LOCK_S_STATE, PNE_WESTMERE_L2_WRITE_LOCK_E_STATE, PNE_WESTMERE_L2_WRITE_LOCK_M_STATE, PNE_WESTMERE_L2_WRITE_LOCK_HIT, PNE_WESTMERE_L2_WRITE_LOCK_MESI, PNE_WESTMERE_L1D_WB_L2_I_STATE, PNE_WESTMERE_L1D_WB_L2_S_STATE, PNE_WESTMERE_L1D_WB_L2_E_STATE, PNE_WESTMERE_L1D_WB_L2_M_STATE, PNE_WESTMERE_L1D_WB_L2_MESI, PNE_WESTMERE_L3_LAT_CACHE_REFERENCE, PNE_WESTMERE_L3_LAT_CACHE_MISS, PNE_WESTMERE_CPU_CLK_UNHALTED_THREAD_P, PNE_WESTMERE_CPU_CLK_UNHALTED_REF_P, PNE_WESTMERE_DTLB_MISSES_ANY, PNE_WESTMERE_DTLB_MISSES_WALK_COMPLETED, PNE_WESTMERE_DTLB_MISSES_WALK_CYCLES, PNE_WESTMERE_DTLB_MISSES_STLB_HIT, PNE_WESTMERE_DTLB_MISSES_LARGE_WALK_COMPLETED, PNE_WESTMERE_LOAD_HIT_PRE, PNE_WESTMERE_L1D_PREFETCH_REQUESTS, PNE_WESTMERE_L1D_PREFETCH_MISS, PNE_WESTMERE_L1D_PREFETCH_TRIGGERS, PNE_WESTMERE_EPT_WALK_CYCLES, PNE_WESTMERE_L1D_REPL, PNE_WESTMERE_L1D_M_REPL, PNE_WESTMERE_L1D_M_EVICT, PNE_WESTMERE_L1D_M_SNOOP_EVICT, PNE_WESTMERE_L1D_CACHE_PREFETCH_LOCK_FB_HIT, PNE_WESTMERE_L1D_CACHE_LOCK_FB_HIT, PNE_WESTMERE_OFFCORE_REQUESTS_OUTSTANDING_DEMAND_READ_DATA, PNE_WESTMERE_OFFCORE_REQUESTS_OUTSTANDING_DEMAND_READ_CODE, PNE_WESTMERE_OFFCORE_REQUESTS_OUTSTANDING_DEMAND_RFO, PNE_WESTMERE_OFFCORE_REQUESTS_OUTSTANDING_ANY_READ, PNE_WESTMERE_CACHE_LOCK_CYCLES_L1D_L2, PNE_WESTMERE_CACHE_LOCK_CYCLES_L1D, PNE_WESTMERE_IO_TRANSACTIONS, PNE_WESTMERE_L1I_HITS, PNE_WESTMERE_L1I_MISSES, PNE_WESTMERE_L1I_READS, PNE_WESTMERE_L1I_CYCLES_STALLED, PNE_WESTMERE_LARGE_ITLB_HIT, PNE_WESTMERE_ITLB_MISSES_ANY, PNE_WESTMERE_ITLB_MISSES_WALK_COMPLETED, 
PNE_WESTMERE_ITLB_MISSES_WALK_CYCLES, PNE_WESTMERE_ITLB_MISSES_LARGE_WALK_COMPLETED, PNE_WESTMERE_ILD_STALL_LCP, PNE_WESTMERE_ILD_STALL_MRU, PNE_WESTMERE_ILD_STALL_IQ_FULL, PNE_WESTMERE_ILD_STALL_REGEN, PNE_WESTMERE_ILD_STALL_ANY, PNE_WESTMERE_BR_INST_EXEC_COND, PNE_WESTMERE_BR_INST_EXEC_DIRECT, PNE_WESTMERE_BR_INST_EXEC_INDIRECT_NON_CALL, PNE_WESTMERE_BR_INST_EXEC_NON_CALLS, PNE_WESTMERE_BR_INST_EXEC_RETURN_NEAR, PNE_WESTMERE_BR_INST_EXEC_DIRECT_NEAR_CALL, PNE_WESTMERE_BR_INST_EXEC_INDIRECT_NEAR_CALL, PNE_WESTMERE_BR_INST_EXEC_NEAR_CALLS, PNE_WESTMERE_BR_INST_EXEC_TAKEN, PNE_WESTMERE_BR_INST_EXEC_ANY, PNE_WESTMERE_BR_MISP_EXEC_COND, PNE_WESTMERE_BR_MISP_EXEC_DIRECT, PNE_WESTMERE_BR_MISP_EXEC_INDIRECT_NON_CALL, PNE_WESTMERE_BR_MISP_EXEC_NON_CALLS, PNE_WESTMERE_BR_MISP_EXEC_RETURN_NEAR, PNE_WESTMERE_BR_MISP_EXEC_DIRECT_NEAR_CALL, PNE_WESTMERE_BR_MISP_EXEC_INDIRECT_NEAR_CALL, PNE_WESTMERE_BR_MISP_EXEC_NEAR_CALLS, PNE_WESTMERE_BR_MISP_EXEC_TAKEN, PNE_WESTMERE_BR_MISP_EXEC_ANY, PNE_WESTMERE_RESOURCE_STALLS_ANY, PNE_WESTMERE_RESOURCE_STALLS_LOAD, PNE_WESTMERE_RESOURCE_STALLS_RS_FULL, PNE_WESTMERE_RESOURCE_STALLS_STORE, PNE_WESTMERE_RESOURCE_STALLS_ROB_FULL, PNE_WESTMERE_RESOURCE_STALLS_FPCW, PNE_WESTMERE_RESOURCE_STALLS_MXCSR, PNE_WESTMERE_RESOURCE_STALLS_OTHER, PNE_WESTMERE_MACRO_INSTS_FUSIONS_DECODED, PNE_WESTMERE_BACLEAR_FORCE_IQ, PNE_WESTMERE_LSD_UOPS, PNE_WESTMERE_ITLB_FLUSH, PNE_WESTMERE_OFFCORE_REQUESTS_DEMAND_READ_DATA, PNE_WESTMERE_OFFCORE_REQUESTS_DEMAND_READ_CODE, PNE_WESTMERE_OFFCORE_REQUESTS_DEMAND_RFO, PNE_WESTMERE_OFFCORE_REQUESTS_ANY_READ, PNE_WESTMERE_OFFCORE_REQUESTS_ANY_RFO, PNE_WESTMERE_OFFCORE_REQUESTS_L1D_WRITEBACK, PNE_WESTMERE_OFFCORE_REQUESTS_ANY, PNE_WESTMERE_UOPS_EXECUTED_PORT0, PNE_WESTMERE_UOPS_EXECUTED_PORT1, PNE_WESTMERE_UOPS_EXECUTED_PORT2_CORE, PNE_WESTMERE_UOPS_EXECUTED_PORT3_CORE, PNE_WESTMERE_UOPS_EXECUTED_PORT4_CORE, PNE_WESTMERE_UOPS_EXECUTED_CORE_ACTIVE_CYCLES_NO_PORT5, PNE_WESTMERE_UOPS_EXECUTED_PORT5, 
PNE_WESTMERE_UOPS_EXECUTED_CORE_ACTIVE_CYCLES, PNE_WESTMERE_UOPS_EXECUTED_PORT015, PNE_WESTMERE_UOPS_EXECUTED_PORT234, PNE_WESTMERE_OFFCORE_REQUESTS_SQ_FULL, PNE_WESTMERE_SNOOPQ_REQUESTS_OUTSTANDING_DATA, PNE_WESTMERE_SNOOPQ_REQUESTS_OUTSTANDING_INVALIDATE, PNE_WESTMERE_SNOOPQ_REQUESTS_OUTSTANDING_CODE, PNE_WESTMERE_SNOOPQ_REQUESTS_CODE, PNE_WESTMERE_SNOOPQ_REQUESTS_DATA, PNE_WESTMERE_SNOOPQ_REQUESTS_INVALIDATE, PNE_WESTMERE_OFF_CORE_RESPONSE_0, PNE_WESTMERE_SNOOP_RESPONSE_HIT, PNE_WESTMERE_SNOOP_RESPONSE_HITE, PNE_WESTMERE_SNOOP_RESPONSE_HITM, PNE_WESTMERE_OFF_CORE_RESPONSE_1, PNE_WESTMERE_INST_RETIRED_ANY_P, PNE_WESTMERE_INST_RETIRED_X87, PNE_WESTMERE_INST_RETIRED_MMX, PNE_WESTMERE_UOPS_RETIRED_ANY, PNE_WESTMERE_UOPS_RETIRED_RETIRE_SLOTS, PNE_WESTMERE_UOPS_RETIRED_MACRO_FUSED, PNE_WESTMERE_MACHINE_CLEARS_CYCLES, PNE_WESTMERE_MACHINE_CLEARS_MEM_ORDER, PNE_WESTMERE_MACHINE_CLEARS_SMC, PNE_WESTMERE_BR_INST_RETIRED_ANY_P, PNE_WESTMERE_BR_INST_RETIRED_CONDITIONAL, PNE_WESTMERE_BR_INST_RETIRED_NEAR_CALL, PNE_WESTMERE_BR_INST_RETIRED_ALL_BRANCHES, PNE_WESTMERE_BR_MISP_RETIRED_ANY_P, PNE_WESTMERE_BR_MISP_RETIRED_CONDITIONAL, PNE_WESTMERE_BR_MISP_RETIRED_NEAR_CALL, PNE_WESTMERE_BR_MISP_RETIRED_ALL_BRANCHES, PNE_WESTMERE_SSEX_UOPS_RETIRED_PACKED_SINGLE, PNE_WESTMERE_SSEX_UOPS_RETIRED_SCALAR_SINGLE, PNE_WESTMERE_SSEX_UOPS_RETIRED_PACKED_DOUBLE, PNE_WESTMERE_SSEX_UOPS_RETIRED_SCALAR_DOUBLE, PNE_WESTMERE_SSEX_UOPS_RETIRED_VECTOR_INTEGER, PNE_WESTMERE_ITLB_MISS_RETIRED, PNE_WESTMERE_MEM_LOAD_RETIRED_L1D_HIT, PNE_WESTMERE_MEM_LOAD_RETIRED_L2_HIT, PNE_WESTMERE_MEM_LOAD_RETIRED_L3_UNSHARED_HIT, PNE_WESTMERE_MEM_LOAD_RETIRED_OTHER_CORE_L2_HIT_HITM, PNE_WESTMERE_MEM_LOAD_RETIRED_L3_MISS, PNE_WESTMERE_MEM_LOAD_RETIRED_HIT_LFB, PNE_WESTMERE_MEM_LOAD_RETIRED_DTLB_MISS, PNE_WESTMERE_FP_MMX_TRANS_TO_FP, PNE_WESTMERE_FP_MMX_TRANS_TO_MMX, PNE_WESTMERE_FP_MMX_TRANS_ANY, PNE_WESTMERE_MACRO_INSTS_DECODED, PNE_WESTMERE_UOPS_DECODED_STALL_CYCLES, PNE_WESTMERE_UOPS_DECODED_MS, 
PNE_WESTMERE_UOPS_DECODED_ESP_FOLDING, PNE_WESTMERE_UOPS_DECODED_ESP_SYNC, PNE_WESTMERE_RAT_STALLS_FLAGS, PNE_WESTMERE_RAT_STALLS_REGISTERS, PNE_WESTMERE_RAT_STALLS_ROB_READ_PORT, PNE_WESTMERE_RAT_STALLS_SCOREBOARD, PNE_WESTMERE_RAT_STALLS_ANY, PNE_WESTMERE_SEG_RENAME_STALLS, PNE_WESTMERE_ES_REG_RENAMES, PNE_WESTMERE_UOP_UNFUSION, PNE_WESTMERE_BR_INST_DECODED, PNE_WESTMERE_BPU_MISSED_CALL_RET, PNE_WESTMERE_BACLEAR_CLEAR, PNE_WESTMERE_BACLEAR_BAD_TARGET, PNE_WESTMERE_BPU_CLEARS_EARLY, PNE_WESTMERE_BPU_CLEARS_LATE, PNE_WESTMERE_THREAD_ACTIVE, PNE_WESTMERE_L2_TRANSACTIONS_LOAD, PNE_WESTMERE_L2_TRANSACTIONS_RFO, PNE_WESTMERE_L2_TRANSACTIONS_IFETCH, PNE_WESTMERE_L2_TRANSACTIONS_PREFETCH, PNE_WESTMERE_L2_TRANSACTIONS_L1D_WB, PNE_WESTMERE_L2_TRANSACTIONS_FILL, PNE_WESTMERE_L2_TRANSACTIONS_WB, PNE_WESTMERE_L2_TRANSACTIONS_ANY, PNE_WESTMERE_L2_LINES_IN_S_STATE, PNE_WESTMERE_L2_LINES_IN_E_STATE, PNE_WESTMERE_L2_LINES_IN_ANY, PNE_WESTMERE_L2_LINES_OUT_DEMAND_CLEAN, PNE_WESTMERE_L2_LINES_OUT_DEMAND_DIRTY, PNE_WESTMERE_L2_LINES_OUT_PREFETCH_CLEAN, PNE_WESTMERE_L2_LINES_OUT_PREFETCH_DIRTY, PNE_WESTMERE_L2_LINES_OUT_ANY, PNE_WESTMERE_SQ_MISC_LRU_HINTS, PNE_WESTMERE_SQ_MISC_SPLIT_LOCK, PNE_WESTMERE_SQ_FULL_STALL_CYCLES, PNE_WESTMERE_FP_ASSIST_ALL, PNE_WESTMERE_FP_ASSIST_OUTPUT, PNE_WESTMERE_FP_ASSIST_INPUT, PNE_WESTMERE_SIMD_INT_64_PACKED_MPY, PNE_WESTMERE_SIMD_INT_64_PACKED_SHIFT, PNE_WESTMERE_SIMD_INT_64_PACK, PNE_WESTMERE_SIMD_INT_64_UNPACK, PNE_WESTMERE_SIMD_INT_64_PACKED_LOGICAL, PNE_WESTMERE_SIMD_INT_64_PACKED_ARITH, PNE_WESTMERE_SIMD_INT_64_SHUFFLE_MOVE, PNE_WESTMERE_INSTR_RETIRED_ANY, PNE_WESTMERE_CPU_CLK_UNHALTED_CORE, PNE_WESTMERE_CPU_CLK_UNHALTED_REF, PNE_WESTMERE_GQ_CYCLES_FULL_READ_TRACKER, PNE_WESTMERE_GQ_CYCLES_FULL_WRITE_TRACKER, PNE_WESTMERE_GQ_CYCLES_FULL_PEER_PROBE_TRACKER, PNE_WESTMERE_GQ_CYCLES_NOT_EMPTY_READ_TRACKER, PNE_WESTMERE_GQ_CYCLES_NOT_EMPTY_WRITE_TRACKER, PNE_WESTMERE_GQ_CYCLES_NOT_EMPTY_PEER_PROBE_TRACKER, PNE_WESTMERE_GQ_OCCUPANCY_READ_TRACKER, 
PNE_WESTMERE_GQ_ALLOC_READ_TRACKER, PNE_WESTMERE_GQ_ALLOC_RT_L3_MISS, PNE_WESTMERE_GQ_ALLOC_RT_TO_L3_RESP, PNE_WESTMERE_GQ_ALLOC_RT_TO_RTID_ACQUIRED, PNE_WESTMERE_GQ_ALLOC_WT_TO_RTID_ACQUIRED, PNE_WESTMERE_GQ_ALLOC_WRITE_TRACKER, PNE_WESTMERE_GQ_ALLOC_PEER_PROBE_TRACKER, PNE_WESTMERE_GQ_DATA_FROM_QPI, PNE_WESTMERE_GQ_DATA_FROM_QMC, PNE_WESTMERE_GQ_DATA_FROM_L3, PNE_WESTMERE_GQ_DATA_FROM_CORES_02, PNE_WESTMERE_GQ_DATA_FROM_CORES_13, PNE_WESTMERE_GQ_DATA_TO_QPI_QMC, PNE_WESTMERE_GQ_DATA_TO_L3, PNE_WESTMERE_GQ_DATA_TO_CORES, PNE_WESTMERE_SNP_RESP_TO_LOCAL_HOME_I_STATE, PNE_WESTMERE_SNP_RESP_TO_LOCAL_HOME_S_STATE, PNE_WESTMERE_SNP_RESP_TO_LOCAL_HOME_FWD_S_STATE, PNE_WESTMERE_SNP_RESP_TO_LOCAL_HOME_FWD_I_STATE, PNE_WESTMERE_SNP_RESP_TO_LOCAL_HOME_CONFLICT, PNE_WESTMERE_SNP_RESP_TO_LOCAL_HOME_WB, PNE_WESTMERE_SNP_RESP_TO_REMOTE_HOME_I_STATE, PNE_WESTMERE_SNP_RESP_TO_REMOTE_HOME_S_STATE, PNE_WESTMERE_SNP_RESP_TO_REMOTE_HOME_FWD_S_STATE, PNE_WESTMERE_SNP_RESP_TO_REMOTE_HOME_FWD_I_STATE, PNE_WESTMERE_SNP_RESP_TO_REMOTE_HOME_CONFLICT, PNE_WESTMERE_SNP_RESP_TO_REMOTE_HOME_WB, PNE_WESTMERE_SNP_RESP_TO_REMOTE_HOME_HITM, PNE_WESTMERE_L3_HITS_READ, PNE_WESTMERE_L3_HITS_WRITE, PNE_WESTMERE_L3_HITS_PROBE, PNE_WESTMERE_L3_HITS_ANY, PNE_WESTMERE_L3_MISS_READ, PNE_WESTMERE_L3_MISS_WRITE, PNE_WESTMERE_L3_MISS_PROBE, PNE_WESTMERE_L3_MISS_ANY, PNE_WESTMERE_L3_LINES_IN_M_STATE, PNE_WESTMERE_L3_LINES_IN_E_STATE, PNE_WESTMERE_L3_LINES_IN_S_STATE, PNE_WESTMERE_L3_LINES_IN_F_STATE, PNE_WESTMERE_L3_LINES_IN_ANY, PNE_WESTMERE_L3_LINES_OUT_M_STATE, PNE_WESTMERE_L3_LINES_OUT_E_STATE, PNE_WESTMERE_L3_LINES_OUT_S_STATE, PNE_WESTMERE_L3_LINES_OUT_I_STATE, PNE_WESTMERE_L3_LINES_OUT_F_STATE, PNE_WESTMERE_L3_LINES_OUT_ANY, PNE_WESTMERE_GQ_SNOOP_GOTO_S, PNE_WESTMERE_GQ_SNOOP_GOTO_I, PNE_WESTMERE_GQ_SNOOP_GOTO_S_HIT, PNE_WESTMERE_GQ_SNOOP_GOTO_I_HIT, PNE_WESTMERE_QHL_REQUESTS_IOH_READS, PNE_WESTMERE_QHL_REQUESTS_IOH_WRITES, PNE_WESTMERE_QHL_REQUESTS_REMOTE_READS, PNE_WESTMERE_QHL_REQUESTS_REMOTE_WRITES, 
PNE_WESTMERE_QHL_REQUESTS_LOCAL_READS, PNE_WESTMERE_QHL_REQUESTS_LOCAL_WRITES, PNE_WESTMERE_QHL_CYCLES_FULL_IOH, PNE_WESTMERE_QHL_CYCLES_FULL_REMOTE, PNE_WESTMERE_QHL_CYCLES_FULL_LOCAL, PNE_WESTMERE_QHL_CYCLES_NOT_EMPTY_IOH, PNE_WESTMERE_QHL_CYCLES_NOT_EMPTY_REMOTE, PNE_WESTMERE_QHL_CYCLES_NOT_EMPTY_LOCAL, PNE_WESTMERE_QHL_OCCUPANCY_IOH, PNE_WESTMERE_QHL_OCCUPANCY_REMOTE, PNE_WESTMERE_QHL_OCCUPANCY_LOCAL, PNE_WESTMERE_QHL_ADDRESS_CONFLICTS_2WAY, PNE_WESTMERE_QHL_ADDRESS_CONFLICTS_3WAY, PNE_WESTMERE_QHL_CONFLICT_CYCLES_IOH, PNE_WESTMERE_QHL_CONFLICT_CYCLES_REMOTE, PNE_WESTMERE_QHL_CONFLICT_CYCLES_LOCAL, PNE_WESTMERE_QHL_TO_QMC_BYPASS, PNE_WESTMERE_QMC_ISOC_FULL_READ_CH0, PNE_WESTMERE_QMC_ISOC_FULL_READ_CH1, PNE_WESTMERE_QMC_ISOC_FULL_READ_CH2, PNE_WESTMERE_QMC_ISOC_FULL_WRITE_CH0, PNE_WESTMERE_QMC_ISOC_FULL_WRITE_CH1, PNE_WESTMERE_QMC_ISOC_FULL_WRITE_CH2, PNE_WESTMERE_QMC_BUSY_READ_CH0, PNE_WESTMERE_QMC_BUSY_READ_CH1, PNE_WESTMERE_QMC_BUSY_READ_CH2, PNE_WESTMERE_QMC_BUSY_WRITE_CH0, PNE_WESTMERE_QMC_BUSY_WRITE_CH1, PNE_WESTMERE_QMC_BUSY_WRITE_CH2, PNE_WESTMERE_QMC_OCCUPANCY_CH0, PNE_WESTMERE_QMC_OCCUPANCY_CH1, PNE_WESTMERE_QMC_OCCUPANCY_CH2, PNE_WESTMERE_QMC_OCCUPANCY_ANY, PNE_WESTMERE_QMC_ISSOC_OCCUPANCY_CH0, PNE_WESTMERE_QMC_ISSOC_OCCUPANCY_CH1, PNE_WESTMERE_QMC_ISSOC_OCCUPANCY_CH2, PNE_WESTMERE_QMC_ISSOC_READS_ANY, PNE_WESTMERE_QMC_NORMAL_READS_CH0, PNE_WESTMERE_QMC_NORMAL_READS_CH1, PNE_WESTMERE_QMC_NORMAL_READS_CH2, PNE_WESTMERE_QMC_NORMAL_READS_ANY, PNE_WESTMERE_QMC_HIGH_PRIORITY_READS_CH0, PNE_WESTMERE_QMC_HIGH_PRIORITY_READS_CH1, PNE_WESTMERE_QMC_HIGH_PRIORITY_READS_CH2, PNE_WESTMERE_QMC_HIGH_PRIORITY_READS_ANY, PNE_WESTMERE_QMC_CRITICAL_PRIORITY_READS_CH0, PNE_WESTMERE_QMC_CRITICAL_PRIORITY_READS_CH1, PNE_WESTMERE_QMC_CRITICAL_PRIORITY_READS_CH2, PNE_WESTMERE_QMC_CRITICAL_PRIORITY_READS_ANY, PNE_WESTMERE_QMC_WRITES_FULL_CH0, PNE_WESTMERE_QMC_WRITES_FULL_CH1, PNE_WESTMERE_QMC_WRITES_FULL_CH2, PNE_WESTMERE_QMC_WRITES_FULL_ANY, 
PNE_WESTMERE_QMC_WRITES_PARTIAL_CH0, PNE_WESTMERE_QMC_WRITES_PARTIAL_CH1, PNE_WESTMERE_QMC_WRITES_PARTIAL_CH2, PNE_WESTMERE_QMC_WRITES_PARTIAL_ANY, PNE_WESTMERE_QMC_CANCEL_CH0, PNE_WESTMERE_QMC_CANCEL_CH1, PNE_WESTMERE_QMC_CANCEL_CH2, PNE_WESTMERE_QMC_CANCEL_ANY, PNE_WESTMERE_QMC_PRIORITY_UPDATES_CH0, PNE_WESTMERE_QMC_PRIORITY_UPDATES_CH1, PNE_WESTMERE_QMC_PRIORITY_UPDATES_CH2, PNE_WESTMERE_QMC_PRIORITY_UPDATES_ANY, PNE_WESTMERE_IMC_RETRY_CH0, PNE_WESTMERE_IMC_RETRY_CH1, PNE_WESTMERE_IMC_RETRY_CH2, PNE_WESTMERE_IMC_RETRY_ANY, PNE_WESTMERE_QHL_FRC_ACK_CNFLTS_IOH, PNE_WESTMERE_QHL_FRC_ACK_CNFLTS_REMOTE, PNE_WESTMERE_QHL_FRC_ACK_CNFLTS_LOCAL, PNE_WESTMERE_QHL_FRC_ACK_CNFLTS_ANY, PNE_WESTMERE_QHL_SLEEPS_IOH_ORDER, PNE_WESTMERE_QHL_SLEEPS_REMOTE_ORDER, PNE_WESTMERE_QHL_SLEEPS_LOCAL_ORDER, PNE_WESTMERE_QHL_SLEEPS_IOH_CONFLICT, PNE_WESTMERE_QHL_SLEEPS_REMOTE_CONFLICT, PNE_WESTMERE_QHL_SLEEPS_LOCAL_CONFLICT, PNE_WESTMERE_ADDR_OPCODE_MATCH_IOH, PNE_WESTMERE_ADDR_OPCODE_MATCH_REMOTE, PNE_WESTMERE_ADDR_OPCODE_MATCH_LOCAL, PNE_WESTMERE_QPI_TX_STALLED_SINGLE_FLIT_HOME_LINK_0, PNE_WESTMERE_QPI_TX_STALLED_SINGLE_FLIT_SNOOP_LINK_0, PNE_WESTMERE_QPI_TX_STALLED_SINGLE_FLIT_NDR_LINK_0, PNE_WESTMERE_QPI_TX_STALLED_SINGLE_FLIT_HOME_LINK_1, PNE_WESTMERE_QPI_TX_STALLED_SINGLE_FLIT_SNOOP_LINK_1, PNE_WESTMERE_QPI_TX_STALLED_SINGLE_FLIT_NDR_LINK_1, PNE_WESTMERE_QPI_TX_STALLED_SINGLE_FLIT_LINK_0, PNE_WESTMERE_QPI_TX_STALLED_SINGLE_FLIT_LINK_1, PNE_WESTMERE_QPI_TX_STALLED_MULTI_FLIT_DRS_LINK_0, PNE_WESTMERE_QPI_TX_STALLED_MULTI_FLIT_NCB_LINK_0, PNE_WESTMERE_QPI_TX_STALLED_MULTI_FLIT_NCS_LINK_0, PNE_WESTMERE_QPI_TX_STALLED_MULTI_FLIT_DRS_LINK_1, PNE_WESTMERE_QPI_TX_STALLED_MULTI_FLIT_NCB_LINK_1, PNE_WESTMERE_QPI_TX_STALLED_MULTI_FLIT_NCS_LINK_1, PNE_WESTMERE_QPI_TX_STALLED_MULTI_FLIT_LINK_0, PNE_WESTMERE_QPI_TX_STALLED_MULTI_FLIT_LINK_1, PNE_WESTMERE_QPI_TX_HEADER_FULL_LINK_0, PNE_WESTMERE_QPI_TX_HEADER_BUSY_LINK_0, PNE_WESTMERE_QPI_TX_HEADER_FULL_LINK_1, 
PNE_WESTMERE_QPI_TX_HEADER_BUSY_LINK_1, PNE_WESTMERE_QPI_RX_NO_PPT_CREDIT_STALLS_LINK_0, PNE_WESTMERE_QPI_RX_NO_PPT_CREDIT_STALLS_LINK_1, PNE_WESTMERE_DRAM_OPEN_CH0, PNE_WESTMERE_DRAM_OPEN_CH1, PNE_WESTMERE_DRAM_OPEN_CH2, PNE_WESTMERE_DRAM_PAGE_CLOSE_CH0, PNE_WESTMERE_DRAM_PAGE_CLOSE_CH1, PNE_WESTMERE_DRAM_PAGE_CLOSE_CH2, PNE_WESTMERE_DRAM_PAGE_MISS_CH0, PNE_WESTMERE_DRAM_PAGE_MISS_CH1, PNE_WESTMERE_DRAM_PAGE_MISS_CH2, PNE_WESTMERE_DRAM_READ_CAS_CH0, PNE_WESTMERE_DRAM_READ_CAS_AUTOPRE_CH0, PNE_WESTMERE_DRAM_READ_CAS_CH1, PNE_WESTMERE_DRAM_READ_CAS_AUTOPRE_CH1, PNE_WESTMERE_DRAM_READ_CAS_CH2, PNE_WESTMERE_DRAM_READ_CAS_AUTOPRE_CH2, PNE_WESTMERE_DRAM_WRITE_CAS_CH0, PNE_WESTMERE_DRAM_WRITE_CAS_AUTOPRE_CH0, PNE_WESTMERE_DRAM_WRITE_CAS_CH1, PNE_WESTMERE_DRAM_WRITE_CAS_AUTOPRE_CH1, PNE_WESTMERE_DRAM_WRITE_CAS_CH2, PNE_WESTMERE_DRAM_WRITE_CAS_AUTOPRE_CH2, PNE_WESTMERE_DRAM_REFRESH_CH0, PNE_WESTMERE_DRAM_REFRESH_CH1, PNE_WESTMERE_DRAM_REFRESH_CH2, PNE_WESTMERE_DRAM_PRE_ALL_CH0, PNE_WESTMERE_DRAM_PRE_ALL_CH1, PNE_WESTMERE_DRAM_PRE_ALL_CH2, PNE_WESTMERE_DRAM_THERMAL_THROTTLED, PNE_WESTMERE_THERMAL_THROTTLING_TEMP_CORE_0, PNE_WESTMERE_THERMAL_THROTTLING_TEMP_CORE_1, PNE_WESTMERE_THERMAL_THROTTLING_TEMP_CORE_2, PNE_WESTMERE_THERMAL_THROTTLING_TEMP_CORE_3, PNE_WESTMERE_THERMAL_THROTTLED_TEMP_CORE_0, PNE_WESTMERE_THERMAL_THROTTLED_TEMP_CORE_1, PNE_WESTMERE_THERMAL_THROTTLED_TEMP_CORE_2, PNE_WESTMERE_THERMAL_THROTTLED_TEMP_CORE_3, PNE_WESTMERE_PROCHOT_ASSERTION, PNE_WESTMERE_THERMAL_THROTTLING_PROCHOT_CORE_0, PNE_WESTMERE_THERMAL_THROTTLING_PROCHOT_CORE_1, PNE_WESTMERE_THERMAL_THROTTLING_PROCHOT_CORE_2, PNE_WESTMERE_THERMAL_THROTTLING_PROCHOT_CORE_3, PNE_WESTMERE_TURBO_MODE_CORE_0, PNE_WESTMERE_TURBO_MODE_CORE_1, PNE_WESTMERE_TURBO_MODE_CORE_2, PNE_WESTMERE_TURBO_MODE_CORE_3, PNE_WESTMERE_CYCLES_UNHALTED_L3_FLL_ENABLE, PNE_WESTMERE_CYCLES_UNHALTED_L3_FLL_DISABLE, PNE_WESTMERE_PNE_WESTMERE_NATNAME_GUARD, }; extern Native_Event_LabelDescription_t WestmereProcessor_info[]; extern 
hwi_search_t WestmereProcessor_map[]; #endif papi-papi-7-2-0-t/src/freebsd/map.c000066400000000000000000000030141502707512200167600ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: freebsd-map.c * Author: Harald Servat * redcrash@gmail.com */ #include "freebsd.h" #include "papiStdEventDefs.h" #include "map.h" /** See other freebsd-map*.* for more details! **/ Native_Event_Info_t _papi_hwd_native_info[CPU_LAST+1]; void init_freebsd_libpmc_mappings (void) { _papi_hwd_native_info[CPU_UNKNOWN].info = UnkProcessor_info; _papi_hwd_native_info[CPU_P6].info = P6Processor_info; _papi_hwd_native_info[CPU_P6_C].info = P6_C_Processor_info; _papi_hwd_native_info[CPU_P6_2].info = P6_2_Processor_info; _papi_hwd_native_info[CPU_P6_3].info = P6_3_Processor_info; _papi_hwd_native_info[CPU_P6_M].info = P6_M_Processor_info; _papi_hwd_native_info[CPU_P4].info = P4Processor_info; _papi_hwd_native_info[CPU_K7].info = K7Processor_info; _papi_hwd_native_info[CPU_K8].info = K8Processor_info; _papi_hwd_native_info[CPU_ATOM].info = AtomProcessor_info; _papi_hwd_native_info[CPU_CORE].info = CoreProcessor_info; _papi_hwd_native_info[CPU_CORE2].info = Core2Processor_info; _papi_hwd_native_info[CPU_CORE2EXTREME].info = Core2ExtremeProcessor_info; _papi_hwd_native_info[CPU_COREI7].info = i7Processor_info; _papi_hwd_native_info[CPU_COREWESTMERE].info = WestmereProcessor_info; _papi_hwd_native_info[CPU_LAST].info = NULL; } int freebsd_number_of_events (int processortype) { int counter = 0; while (_papi_hwd_native_info[processortype].info[counter].name != NULL) counter++; return counter; } papi-papi-7-2-0-t/src/freebsd/map.h000066400000000000000000000024461502707512200167750ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: freebsd-map.h * Author: Harald Servat * redcrash@gmail.com */ #ifndef _FreeBSD_MAP_H_ #define _FreeBSD_MAP_H_ 
#include "../papi.h" #include "../papi_internal.h" #include "../papi_vector.h" enum { CPU_UNKNOWN = 0, CPU_P6, CPU_P6_C, CPU_P6_2, CPU_P6_3, CPU_P6_M, CPU_P4, CPU_K7, CPU_K8, CPU_ATOM, CPU_CORE, CPU_CORE2, CPU_CORE2EXTREME, CPU_COREI7, CPU_COREWESTMERE, CPU_LAST }; typedef struct Native_Event_LabelDescription { char *name; char *description; } Native_Event_LabelDescription_t; typedef struct Native_Event_Info { /* Name and description for all native events */ Native_Event_LabelDescription_t *info; } Native_Event_Info_t; extern Native_Event_Info_t _papi_hwd_native_info[CPU_LAST+1]; extern void init_freebsd_libpmc_mappings (void); extern int freebsd_number_of_events (int processortype); #include "map-unknown.h" #include "map-p6.h" #include "map-p6-c.h" #include "map-p6-2.h" #include "map-p6-3.h" #include "map-p6-m.h" #include "map-p4.h" #include "map-k7.h" #include "map-k8.h" #include "map-atom.h" #include "map-core.h" #include "map-core2.h" #include "map-core2-extreme.h" #include "map-i7.h" #include "map-westmere.h" #endif /* _FreeBSD_MAP_H_ */ papi-papi-7-2-0-t/src/freebsd_events.csv000066400000000000000000000321741502707512200201510ustar00rootroot00000000000000# # FreeBSD presets # these are needed as event names are different than those in libpfm4 # CPU,UNKNOWN PRESET,PAPI_TOT_CYC,NOT_DERIVED,CYCLES PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTRUCTIONS PRESET,PAPI_BR_INS,NOT_DERIVED,BRANCHES PRESET,PAPI_BR_INS,NOT_DERIVED,INTERRUPTS PRESET,PAPI_BR_MSP,NOT_DERIVED,BRANCH_MISPREDICTS PRESET,PAPI_L2_DCM,NOT_DERIVED,DC_MISSES PRESET,PAPI_L2_ICM,NOT_DERIVED,IC_MISSES PRESET,PAPI_L2_TCM,DERIVED_ADD, IC_MISSES,DC_MISSES CPU,INTEL_P6 CPU,INTEL_PII CPU,INTEL_PIII CPU,INTEL_CL CPU,INTEL_PM PRESET,PAPI_L1_DCM,NOT_DERIVED,DCU_LINES_IN # L2_IFETCH defaults to MESI PRESET,PAPI_L1_ICM,NOT_DERIVED,L2_IFETCH # BUS_TRAN_IFETCH defaults to SELF PRESET,PAPI_L2_DCM,DERIVED_SUB,L2_LINES_IN,BUS_TRAN_IFETCH # BUS_TRAN_IFETCH defaults to SELF PRESET,PAPI_L2_ICM,NOT_DERIVED,BUS_TRAN_IFETCH 
PRESET,PAPI_L1_TCM,NOT_DERIVED,L2_RQSTS PRESET,PAPI_L2_TCM,NOT_DERIVED,L2_LINES_IN PRESET,PAPI_CA_CLN,NOT_DERIVED,BUS_TRAN_RFO PRESET,PAPI_CA_ITV,NOT_DERIVED,BUS_TRAN_INVAL PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB.MISS PRESET,PAPI_L1_LDM,NOT_DERIVED,L2_LD PRESET,PAPI_L1_STM,NOT_DERIVED,L2_ST PRESET,PAPI_L2_LDM,DERIVED_SUB,L2_LINES_IN,L2M_LINES_INM PRESET,PAPI_L2_STM,NOT_DERIVED,L2M_LINES_INM PRESET,PAPI_BTAC_M,NOT_DERIVED,BTB_MISSES PRESET,PAPI_HW_INT,NOT_DERIVED,HW_INT_RX PRESET,PAPI_BR_CN,NOT_DERIVED,BR_INST_RETIRED PRESET,PAPI_BR_TKN,NOT_DERIVED,BR_TAKEN_RETIRED PRESET,PAPI_BR_NTK,DERIVED_SUB,BR_INST_RETIRED,BR_TAKEN_RETIRED PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MISS_PRED_RETIRED PRESET,PAPI_BR_PRC,DERIVED_SUB,BR_INST_RETIRED,BR_MISS_PRED_RETIRED PRESET,PAPI_TOT_IIS,NOT_DERIVED,INST_DECODED PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED PRESET,PAPI_FP_INS,NOT_DERIVED,FLOPS PRESET,PAPI_BR_INS,NOT_DERIVED,BR_INST_RETIRED PRESET,PAPI_RES_STL,NOT_DERIVED,RESOURCE_STALL PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED PRESET,PAPI_LST_INS,DERIVED_ADD,L2_LD,L2_ST PRESET,PAPI_L1_DCH,DERIVED_SUB,DATA_MEM_REFS, DCU_LINES_IN PRESET,PAPI_L1_DCA,NOT_DERIVED,DATA_MEM_REFS PRESET,PAPI_L2_DCA,DERIVED_ADD,L2_LD, L2_ST PRESET,PAPI_L2_DCR,NOT_DERIVED,L2_LD PRESET,PAPI_L2_DCW,NOT_DERIVED,L2_ST PRESET,PAPI_L1_ICH,DERIVED_SUB,IFU_FETCH, L2_IFETCH PRESET,PAPI_L2_ICH,DERIVED_SUB,L2_IFETCH, BUS_TRAN_IFETCH PRESET,PAPI_L1_ICA,NOT_DERIVED,IFU_FETCH PRESET,PAPI_L2_ICA,NOT_DERIVED,L2_IFETCH PRESET,PAPI_L1_ICR,NOT_DERIVED,IFU_FETCH PRESET,PAPI_L2_ICR,NOT_DERIVED,L2_IFETCH PRESET,PAPI_L2_TCH,DERIVED_SUB,L2_RQSTS, L2_LINES_IN PRESET,PAPI_L1_TCA,DERIVED_ADD,DATA_MEM_REFS, IFU_FETCH PRESET,PAPI_L2_TCA,NOT_DERIVED,L2_RQSTS PRESET,PAPI_L2_TCR,DERIVED_ADD,L2_LD, L2_IFETCH PRESET,PAPI_L2_TCW,NOT_DERIVED,L2_ST PRESET,PAPI_FML_INS,NOT_DERIVED,MUL PRESET,PAPI_FDV_INS,NOT_DERIVED,DIV PRESET,PAPI_FP_OPS,NOT_DERIVED,FLOPS CPU,INTEL_PM PRESET,PAPI_VEC_INS,DERIVED_ADD,MMX_INSTR_RET, EMON_SSE_SSE2_INST_RETIRED 
CPU,INTEL_PIII PRESET,PAPI_VEC_INS,DERIVED_ADD,MMX_INSTR_RET, EMON_KNI_INST_RETIRED CPU,INTEL_CL PRESET,PAPI_VEC_INS,NOT_DERIVED,MMX_INSTR_EXEC CPU,AMD_K7 PRESET,PAPI_L1_DCM,DERIVED_ADD,DC_REFILLS_FROM_SYSTEM, DC_REFILLS_FROM_L2 PRESET,PAPI_L1_ICM,NOT_DERIVED,IC_MISSES PRESET,PAPI_L2_DCM,NOT_DERIVED,DC_REFILLS_FROM_SYSTEM PRESET,PAPI_L1_TCM,DERIVED_ADD,DC_REFILLS_FROM_SYSTEM, DC_REFILLS_FROM_L2, IC_MISSES PRESET,PAPI_TLB_DM,NOT_DERIVED,L1_AND_L2_DTLB_MISSES PRESET,PAPI_TLB_IM,NOT_DERIVED,L1_AND_L2_ITLB_MISSES PRESET,PAPI_TLB_TL,DERIVED_ADD,L1_AND_L2_DTLB_MISSES, L1_AND_L2_ITLB_MISSES PRESET,PAPI_L1_LDM,NOT_DERIVED,DC_REFILLS_FROM_L2_OES PRESET,PAPI_L1_STM,NOT_DERIVED,DC_REFILLS_FROM_L2_M PRESET,PAPI_L2_LDM,NOT_DERIVED,DC_REFILLS_FROM_SYSTEM_OES PRESET,PAPI_L2_STM,NOT_DERIVED,DC_REFILLS_FROM_SYSTEM_M PRESET,PAPI_HW_INT,NOT_DERIVED,HARDWARE_INTERRUPTS PRESET,PAPI_BR_UCN,NOT_DERIVED,RETIRED_FAR_CONTROL_TRANSFERS PRESET,PAPI_BR_CN,NOT_DERIVED,RETIRED_BRANCHES PRESET,PAPI_BR_TKN,NOT_DERIVED,RETIRED_TAKEN_BRANCHES PRESET,PAPI_BR_NTK,DERIVED_SUB,RETIRED_BRANCHES, RETIRED_TAKEN_BRANCHES PRESET,PAPI_BR_MSP,NOT_DERIVED,RETIRED_BRANCHES_MISPREDICTED PRESET,PAPI_BR_PRC,DERIVED_SUB,RETIRED_BRANCHES, RETIRED_BRANCHES_MISPREDICTED PRESET,PAPI_TOT_INS,NOT_DERIVED,RETIRED_INSTRUCTIONS PRESET,PAPI_BR_INS,NOT_DERIVED,RETIRED_TAKEN_BRANCHES PRESET,PAPI_L1_DCA,NOT_DERIVED,DC_ACCESSES PRESET,PAPI_L2_DCA,DERIVED_ADD,DC_REFILLS_FROM_SYSTEM, DC_REFILLS_FROM_L2 PRESET,PAPI_L1_ICA,NOT_DERIVED,IC_FETCHES PRESET,PAPI_L2_ICA,NOT_DERIVED,IC_MISSES PRESET,PAPI_L1_ICR,NOT_DERIVED,IC_FETCHES PRESET,PAPI_L1_TCA,DERIVED_ADD,DC_ACCESSES, IC_FETCHES CPU,AMD_K8 PRESET,PAPI_BR_INS,NOT_DERIVED,FR_RETIRED_BRANCHES PRESET,PAPI_RES_STL,NOT_DERIVED,FR_DISPATCH_STALLS PRESET,PAPI_TOT_CYC,NOT_DERIVED,BU_CPU_CLK_UNHALTED PRESET,PAPI_TOT_INS,NOT_DERIVED,FR_RETIRED_X86_INSTRUCTIONS PRESET,PAPI_STL_ICY,NOT_DERIVED,FR_DECODER_EMPTY PRESET,PAPI_HW_INT,NOT_DERIVED,FR_RETIRED_TAKEN_HARDWARE_INTERRUPTS
PRESET,PAPI_BR_TKN,NOT_DERIVED,FR_RETIRED_TAKEN_BRANCHES PRESET,PAPI_BR_MSP,NOT_DERIVED,FR_RETIRED_TAKEN_BRANCHES_MISPREDICTED PRESET,PAPI_TLB_DM,NOT_DERIVED,DC_L1_DTLB_MISS_AND_L2_DTLB_MISS PRESET,PAPI_TLB_IM,NOT_DERIVED,IC_L1_ITLB_MISS_AND_L2_ITLB_MISS PRESET,PAPI_TLB_TL,DERIVED_ADD,DC_L1_DTLB_MISS_AND_L2_DTLB_MISS,IC_L1_ITLB_MISS_AND_L2_ITLB_MISS PRESET,PAPI_L1_DCA,NOT_DERIVED,DC_ACCESS PRESET,PAPI_L1_ICA,NOT_DERIVED,IC_FETCH PRESET,PAPI_L1_TCA,DERIVED_ADD,DC_ACCESS, IC_FETCH PRESET,PAPI_L1_ICR,NOT_DERIVED,IC_FETCH PRESET,PAPI_L2_ICH,NOT_DERIVED,IC_REFILL_FROM_L2 PRESET,PAPI_L2_DCH,NOT_DERIVED,DC_REFILL_FROM_L2 PRESET,PAPI_L2_DCM,NOT_DERIVED,DC_REFILL_FROM_SYSTEM_MOES PRESET,PAPI_L2_DCA,DERIVED_ADD,DC_REFILL_FROM_SYSTEM_MOES, DC_REFILL_FROM_L2_MOES PRESET,PAPI_L2_ICM,NOT_DERIVED,IC_REFILL_FROM_SYSTEM PRESET,PAPI_L2_DCR,NOT_DERIVED,DC_REFILL_FROM_L2_OES PRESET,PAPI_L2_DCW,NOT_DERIVED,DC_REFILL_FROM_L2_M PRESET,PAPI_L2_DCH,NOT_DERIVED,DC_REFILL_FROM_L2_MOES PRESET,PAPI_L1_LDM,NOT_DERIVED,DC_REFILL_FROM_L2_OES PRESET,PAPI_L1_STM,NOT_DERIVED,DC_REFILL_FROM_L2_M PRESET,PAPI_L2_LDM,NOT_DERIVED,DC_REFILL_FROM_SYSTEM_OES PRESET,PAPI_L2_STM,NOT_DERIVED,DC_REFILL_FROM_SYSTEM_M PRESET,PAPI_L1_DCM,DERIVED_ADD,DC_REFILL_FROM_SYSTEM_MOES, DC_REFILL_FROM_L2_MOES PRESET,PAPI_L1_ICM,DERIVED_ADD,IC_REFILL_FROM_L2, IC_REFILL_FROM_SYSTEM PRESET,PAPI_L1_TCM,DERIVED_ADD,DC_REFILL_FROM_SYSTEM_MOES,DC_REFILL_FROM_L2_MOES,IC_REFILL_FROM_SYSTEM,IC_REFILL_FROM_L2 PRESET,PAPI_L2_TCM,DERIVED_ADD,DC_REFILL_FROM_SYSTEM_MOES,IC_REFILL_FROM_SYSTEM PRESET,PAPI_L2_ICA,DERIVED_ADD,IC_REFILL_FROM_SYSTEM,IC_REFILL_FROM_L2 PRESET,PAPI_L2_TCH,DERIVED_ADD,IC_REFILL_FROM_L2,DC_REFILL_FROM_L2_MOES PRESET,PAPI_L2_TCA,DERIVED_ADD,IC_REFILL_FROM_L2,IC_REFILL_FROM_SYSTEM,DC_REFILL_FROM_L2_MOES,DC_REFILL_FROM_SYSTEM_MOES PRESET,PAPI_FML_INS,NOT_DERIVED,FP_DISPATCHED_FPU_MULS PRESET,PAPI_FAD_INS,NOT_DERIVED,FP_DISPATCHED_FPU_ADDS PRESET,PAPI_FP_OPS,NOT_DERIVED,FP_DISPATCHED_FPU_ADDS_AND_MULS 
PRESET,PAPI_FP_INS,NOT_DERIVED,FR_RETIRED_FPU_INSTRUCTIONS PRESET,PAPI_FPU_IDL,NOT_DERIVED,FP_CYCLES_WITH_NO_FPU_OPS_RETIRED CPU,INTEL_PIV PRESET,PAPI_RES_STL,NOT_DERIVED,RESOURCE_STALL PRESET,PAPI_TOT_CYC,NOT_DERIVED,GLOBAL_POWER_EVENTS PRESET,PAPI_L1_ICM,NOT_DERIVED,BPU_FETCH_REQUEST PRESET,PAPI_L1_ICA,NOT_DERIVED,UOP_QUEUE_WRITES_TC_BUILD_DELIVER PRESET,PAPI_TLB_DM,NOT_DERIVED,PAGE_WALK_TYPE_D PRESET,PAPI_TLB_IM,NOT_DERIVED,PAGE_WALK_TYPE_I PRESET,PAPI_TLB_TL,NOT_DERIVED,PAGE_WALK_TYPE PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTR_RETIRED_NON_BOGUS PRESET,PAPI_BR_INS,NOT_DERIVED,RETIRED_BRANCH_TYPE PRESET,PAPI_BR_TKN,NOT_DERIVED,BRANCH_RETIRED_TAKEN PRESET,PAPI_BR_NTK,NOT_DERIVED,BRANCH_RETIRED_NOT_TAKEN PRESET,PAPI_BR_MSP,NOT_DERIVED,BRANCH_RETIRED_MISPREDICTED PRESET,PAPI_BR_PRC,NOT_DERIVED,BRANCH_RETIRED_PREDICTED PRESET,PAPI_L2_TCH,NOT_DERIVED,BSQ_CACHE_REFERENCE_2L_HITS PRESET,PAPI_L2_TCM,NOT_DERIVED,BSQ_CACHE_REFERENCE_2L_MISSES PRESET,PAPI_L2_TCA,NOT_DERIVED,BSQ_CACHE_REFERENCE_2L_ACCESSES PRESET,PAPI_L3_TCH,NOT_DERIVED,BSQ_CACHE_REFERENCE_3L_HITS PRESET,PAPI_L3_TCM,NOT_DERIVED,BSQ_CACHE_REFERENCE_3L_MISSES PRESET,PAPI_L3_TCA,NOT_DERIVED,BSQ_CACHE_REFERENCE_3L_ACCESSES PRESET,PAPI_FP_INS,NOT_DERIVED,X87_FP_UOP CPU,INTEL_ATOM PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_MISSES PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_ALL_REF PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_READS PRESET,PAPI_L1_TCA,DERIVED_ADD,L1D_ALL_REF, L1I_READS PRESET,PAPI_L2_DCA,NOT_DERIVED,L2_RQSTS PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB.MISSES PRESET,PAPI_TLB_DM,NOT_DERIVED,DATA_TLB_MISSES.DTLB_MISS PRESET,PAPI_BR_INS,NOT_DERIVED,BR_INST_EXEC PRESET,PAPI_BR_TKN,NOT_DERIVED,BR_INST_RETIRED.TAKEN PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MISSP_EXEC PRESET,PAPI_RES_STL,NOT_DERIVED,RESOURCE_STALLS.ANY PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED.BUS PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED.ANY_P PRESET,PAPI_HW_INT,NOT_DERIVED,HW_INT_RCV PRESET,PAPI_L2_DCH,NOT_DERIVED,L2_LD PRESET,PAPI_FP_INS,NOT_DERIVED,X87_OPS_RETIRED.ANY
PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_MISSES
PRESET,PAPI_L2_DCM,NOT_DERIVED,MEM_LOAD_RETIRED_L2_MISS
PRESET,PAPI_TLB_TL,DERIVED_ADD,DTLB_MISSES.ANY, ITLB.MISSES
CPU,INTEL_CORE
PRESET,PAPI_BR_INS,NOT_DERIVED,BR_INSTR_RET
PRESET,PAPI_RES_STL,NOT_DERIVED,RESOURCE_STALL
PRESET,PAPI_TOT_CYC,NOT_DERIVED,UNHALTED_CORE_CYCLES
PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTR_RET
PRESET,PAPI_HW_INT,NOT_DERIVED,HW_INT_RX
PRESET,PAPI_BR_TKN,NOT_DERIVED,BR_INSTR_RET
PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MISSP_EXEC
PRESET,PAPI_TLB_DM,NOT_DERIVED,DTLB_MISS
PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB.MISSES
PRESET,PAPI_TLB_TL,DERIVED_ADD,DTLB_MISS, ITLB.MISSES
PRESET,PAPI_L2_DCH,NOT_DERIVED,L2_LD
CPU,INTEL_CORE2
PRESET,PAPI_BR_INS,NOT_DERIVED,BR_INST_RETIRED.ANY
PRESET,PAPI_RES_STL,NOT_DERIVED,RESOURCE_STALLS.ANY
PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED.BUS
PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED.ANY_P
PRESET,PAPI_HW_INT,NOT_DERIVED,HW_INT_RCV
PRESET,PAPI_BR_TKN,NOT_DERIVED,BR_INST_RETIRED.TAKEN
PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MISSP_EXEC
PRESET,PAPI_TLB_DM,NOT_DERIVED,DTLB_MISSES.ANY
PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB.MISSES
PRESET,PAPI_TLB_TL,DERIVED_ADD,DTLB_MISSES.ANY, ITLB.MISSES
PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_ALL_REF
PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_READS
PRESET,PAPI_L1_TCA,DERIVED_ADD,L1D_ALL_REF, L1I_READS
# PAPI_L2_ICH seems not to work
PRESET,PAPI_L2_ICH,NOT_DERIVED,L2_IFETCH
PRESET,PAPI_L2_DCH,NOT_DERIVED,L2_LD
PRESET,PAPI_FP_INS,NOT_DERIVED,X87_OPS_RETIRED.ANY
PRESET,PAPI_L1_DCM,NOT_DERIVED,MEM_LOAD_RETIRED_L1D_MISS
PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_MISSES
PRESET,PAPI_L1_TCM,DERIVED_ADD,MEM_LOAD_RETIRED_L1D_MISS, L1I_MISSES
PRESET,PAPI_L2_DCM,NOT_DERIVED,MEM_LOAD_RETIRED_L2_MISS
CPU,INTEL_CORE2EXTREME
PRESET,PAPI_BR_INS,NOT_DERIVED,BR_INST_RETIRED.ANY
PRESET,PAPI_RES_STL,NOT_DERIVED,RESOURCE_STALLS.ANY
PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED.BUS
PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED.ANY_P
PRESET,PAPI_HW_INT,NOT_DERIVED,HW_INT_RCV
PRESET,PAPI_BR_TKN,NOT_DERIVED,BR_INST_RETIRED.TAKEN
PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MISSP_EXEC
PRESET,PAPI_TLB_DM,NOT_DERIVED,DTLB_MISSES.ANY
PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB.MISSES
PRESET,PAPI_TLB_TL,DERIVED_ADD,DTLB_MISSES.ANY, ITLB.MISSES
PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_ALL_REF
PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_READS
PRESET,PAPI_L1_TCA, DERIVED_ADD, L1D_ALL_REF, L1I_READS
# PAPI_L2_ICH seems not to work
PRESET,PAPI_L2_ICH,NOT_DERIVED,L2_IFETCH
PRESET,PAPI_L2_DCH,NOT_DERIVED,L2_LD
PRESET,PAPI_FP_INS,NOT_DERIVED,X87_OPS_RETIRED.ANY
PRESET,PAPI_L1_DCM,NOT_DERIVED,MEM_LOAD_RETIRED.L1D_MISS
PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_MISSES
PRESET,PAPI_L1_TCM,DERIVED_ADD,MEM_LOAD_RETIRED.L1D_MISS, L1I_MISSES
PRESET,PAPI_L2_DCM,NOT_DERIVED,MEM_LOAD_RETIRED.L2_MISS
CPU,INTELCOREI7
PRESET,PAPI_BR_INS,NOT_DERIVED,BR_INST_RETIRED.ALL_BRANCHES
PRESET,PAPI_RES_STL,NOT_DERIVED,RESOURCE_STALLS.ANY
PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED.CORE
PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTR.RETIRED_ANY
PRESET,PAPI_HW_INT,NOT_DERIVED,HW_INT_RCV
PRESET,PAPI_BR_TKN,NOT_DERIVED,BR_MISP_EXEC_TAKEN
PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MISP_EXEC_ANY
PRESET,PAPI_TLB_DM,NOT_DERIVED,DTLB_MISSES.ANY
PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB_MISSES_ANY
PRESET,PAPI_TLB_TL,DERIVED_ADD,DTLB_MISSES.ANY, ITLB_MISSES_ANY
PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_ALL_REF_ANY
PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_READS
PRESET,PAPI_L1_TCA, DERIVED_ADD, L1D_ALL_REF_ANY, L1I_READS
# PAPI_L2_ICH seems not to work
PRESET,PAPI_L2_ICH,NOT_DERIVED,L2_IFETCH
PRESET,PAPI_L2_DCH,NOT_DERIVED,MEM_LOAD_RETIRED.L2_HIT
PRESET,PAPI_FP_INS,NOT_DERIVED,INST_RETIRED.X87
PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_PREFETCH_MISS
PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_MISSES
PRESET,PAPI_L1_TCM,DERIVED_ADD,L1D_PREFETCH_MISS, L1I_MISSES
PRESET,PAPI_L2_DCM,NOT_DERIVED,L2_RQSTS_MISS
CPU,INTEL_WESTMERE
PRESET,PAPI_BR_INS,NOT_DERIVED,BR_INST_RETIRED.ALL_BRANCHES
PRESET,PAPI_RES_STL,NOT_DERIVED,RESOURCE_STALLS.ANY
PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED.CORE
PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTR.RETIRED_ANY
PRESET,PAPI_BR_TKN,NOT_DERIVED,BR_MISP_EXEC_TAKEN
PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MISP_EXEC_ANY
PRESET,PAPI_TLB_DM,NOT_DERIVED,DTLB_MISSES.ANY
PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB_MISSES_ANY
PRESET,PAPI_TLB_TL,DERIVED_ADD,DTLB_MISSES.ANY, ITLB_MISSES_ANY
PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_READS
# PAPI_L2_ICH seems not to work
PRESET,PAPI_L2_ICH,NOT_DERIVED,L2_IFETCH
PRESET,PAPI_L2_DCH,NOT_DERIVED,MEM_LOAD_RETIRED.L2_HIT
PRESET,PAPI_FP_INS,NOT_DERIVED,INST_RETIRED.X87
PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_PREFETCH_MISS
PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_MISSES
PRESET,PAPI_L1_TCM, DERIVED_ADD, L1D_PREFETCH_MISS, L1I_MISSES
PRESET,PAPI_L2_DCM,NOT_DERIVED,L2_RQSTS_MISS

==> papi-papi-7-2-0-t/src/ftests/Makefile <==

# File: ftests/Makefile

include Makefile.target

INCLUDE = -I../testlib -I. -I..
FFLAGS = $(CFLAGS) -ffixed-line-length-132

testlibdir=../testlib
TESTLIB= $(testlibdir)/libtestlib.a
DOLOOPS= $(testlibdir)/do_loops.o

ifeq ($(ENABLE_FORTRAN_TESTS),yes)

include Makefile.recipies

install: default
	@echo "Fortran tests (DATADIR) being installed in: \"$(DATADIR)\"";
	-mkdir -p $(DATADIR)/ftests
	-chmod go+rx $(DATADIR)
	-chmod go+rx $(DATADIR)/ftests
	-find . -perm -100 -type f -exec cp {} $(DATADIR)/ftests \;
	-chmod go+rx $(DATADIR)/ftests/*
	-find . -name "*.[Ffh]" -type f -exec cp {} $(DATADIR)/ftests \;
	-cp Makefile.target $(DATADIR)/ftests/Makefile
	-cat Makefile.recipies >> $(DATADIR)/ftests/Makefile

else

all:
	@echo "Install Fortran compiler to build and run Fortran tests"

install:
	@echo "No Fortran tests to install."
clean:

distclean clobber:
	rm -f Makefile.target

endif

==> papi-papi-7-2-0-t/src/ftests/Makefile.recipies <==

ALL = strtest zero zeronamed first second tenth description fdmemtest accum cost \
	case1 case2 clockres eventname fmatrixlowpapi fmultiplex1 \
	johnmay2 fmultiplex2 avail openmp\
	serial_hl

.PHONY : all default ftests ftest clean install

all default ftests ftest: $(ALL)

serial_hl: serial_hl.F $(TESTLIB) $(DOLOOPS) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) serial_hl.F $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o serial_hl

clockres: clockres.F $(TESTLIB) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) clockres.F $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o clockres

avail: avail.F $(TESTLIB) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) avail.F $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o avail

eventname: eventname.F $(TESTLIB) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) eventname.F $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o eventname

case1: case1.F $(TESTLIB) $(DOLOOPS) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) case1.F $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o case1

case2: case2.F $(TESTLIB) $(DOLOOPS) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) case2.F $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o case2

fdmemtest: fdmemtest.F $(TESTLIB) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) fdmemtest.F $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o fdmemtest

fmatrixlowpapi: fmatrixlowpapi.F $(TESTLIB) $(DOLOOPS) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) fmatrixlowpapi.F $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o fmatrixlowpapi

strtest: strtest.F $(TESTLIB) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) strtest.F $(TESTLIB) $(PAPILIB) $(LDFLAGS) -o strtest

description: description.F $(TESTLIB) $(DOLOOPS) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) description.F $(TESTLIB) $(DOLOOPS) $(PAPILIB) $(LDFLAGS) -o description

accum: accum.F \
$(TESTLIB) $(DOLOOPS) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) accum.F $(TESTLIB) $(DOLOOPS) $(PAPILIB) -o accum $(LDFLAGS)

openmp: openmp.F $(TESTLIB) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) openmp.F $(TESTLIB) $(PAPILIB) -o openmp $(LDFLAGS) $(OMPCFLGS)

zero: zero.F $(TESTLIB) $(DOLOOPS) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) zero.F $(TESTLIB) $(DOLOOPS) $(PAPILIB) -o zero $(LDFLAGS)

zeronamed: zeronamed.F $(TESTLIB) $(DOLOOPS) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) zeronamed.F $(TESTLIB) $(DOLOOPS) $(PAPILIB) -o zeronamed $(LDFLAGS)

first: first.F $(TESTLIB) $(DOLOOPS) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) first.F $(TESTLIB) $(DOLOOPS) $(PAPILIB) -o first $(LDFLAGS)

second: second.F $(TESTLIB) $(DOLOOPS) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) second.F $(TESTLIB) $(DOLOOPS) $(PAPILIB) -o second $(LDFLAGS)

tenth: tenth.F $(TESTLIB) $(DOLOOPS) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) tenth.F $(TESTLIB) $(DOLOOPS) $(PAPILIB) -o tenth $(LDFLAGS)

cost: cost.F $(TESTLIB) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) cost.F $(TESTLIB) $(PAPILIB) -o cost $(LDFLAGS)

johnmay2: johnmay2.F $(TESTLIB) $(DOLOOPS) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) johnmay2.F $(TESTLIB) $(DOLOOPS) $(PAPILIB) -o johnmay2 $(LDFLAGS)

fmultiplex1: fmultiplex1.F $(TESTLIB) $(DOLOOPS) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) fmultiplex1.F $(TESTLIB) $(DOLOOPS) $(PAPILIB) -o fmultiplex1 $(LDFLAGS)

fmultiplex2: fmultiplex2.F $(TESTLIB) $(DOLOOPS) $(PAPILIB)
	$(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) fmultiplex2.F $(TESTLIB) $(DOLOOPS) $(PAPILIB) -o fmultiplex2 $(LDFLAGS)

clean:
	rm -f *.o *genmod.f90 *genmod.mod *.stderr *.stdout core *~ $(ALL)

distclean clobber: clean
	rm -f Makefile.target

==> papi-papi-7-2-0-t/src/ftests/Makefile.target.in <==

PACKAGE_TARNAME = @PACKAGE_TARNAME@
prefix = @prefix@
exec_prefix = @exec_prefix@
datarootdir = @datarootdir@
datadir = @datadir@/${PACKAGE_TARNAME}
testlibdir = $(datadir)/testlib
DATADIR = $(DESTDIR)$(datadir)
INCLUDE = -I. -I@includedir@ -I$(testlibdir)
LIBDIR = @libdir@
LIBRARY = @LIBRARY@
SHLIB = @SHLIB@
PAPILIB = ../@LINKLIB@
TESTLIB = $(testlibdir)/libtestlib.a
LDFLAGS = @LDFLAGS@ @LDL@
CC = @CC@
F77 = @F77@
CC_R = @CC_R@
CFLAGS = @CFLAGS@
OMPCFLGS = @OMPCFLGS@
FFLAGS = @FFLAGS@
TOPTFLAGS= @TOPTFLAGS@
FTOPTFLAGS= @TOPTFLAGS@
ENABLE_FORTRAN_TESTS=@ENABLE_FORTRAN_TESTS@

==> papi-papi-7-2-0-t/src/ftests/accum.F <==

#include "fpapi_test.h"

      program accum
      implicit integer (p)

      integer es1, number, i
      integer*8 values(10)
      integer events(2)
      character*PAPI_MAX_STR_LEN name
      integer retval
      integer tests_quiet, get_quiet
      external get_quiet
      integer last_char, n
      external last_char

      tests_quiet = get_quiet()
      es1 = PAPI_NULL

      retval = PAPI_VER_CURRENT
      call PAPIf_library_init(retval)
      if ( retval.NE.PAPI_VER_CURRENT) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPI_library_init', retval)
      end if

      call PAPIf_create_eventset(es1, retval)
      if ( retval.NE.PAPI_OK) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_create_eventset',
     *retval)
      end if

      number=2
      call PAPIf_query_event(PAPI_FP_INS, retval)
      if (retval .NE. PAPI_OK) then
      events(1) = PAPI_TOT_INS
      else
      events(1) = PAPI_FP_INS
      end if
      events(2) = PAPI_TOT_CYC

      call PAPIf_add_events( es1, events, number, retval )
      if ( retval.LT.PAPI_OK) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_add_events', retval)
      end if

      do i=1,10
      values(i)=0
      end do

      call PAPIf_start(es1, retval)
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_start', retval)
      end if

      call fdo_flops(NUM_FLOPS)

      call PAPIf_accum(es1, values(7), retval)
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_accum', retval)
      end if
      values(1)=values(7)
      values(2)=values(8)

      call PAPIf_stop(es1, values(3), retval)
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_stop', retval)
      end if

      call PAPIf_start(es1, retval)
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_start', retval)
      end if

      call fdo_flops(NUM_FLOPS)

      call PAPIf_accum(es1, values(7), retval)
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_accum', retval)
      end if
      values(5)=values(7)
      values(6)=values(8)

      call fdo_flops(NUM_FLOPS)

      call PAPIf_accum(es1, values(7), retval)
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_accum', retval)
      end if

      call fdo_flops(NUM_FLOPS)

      call PAPIf_stop(es1, values(9), retval)
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_stop', retval)
      end if

      call PAPIf_remove_events( es1, events, number, retval )
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_remove_events', retval)
      end if

      if (tests_quiet .EQ.
     . 0) then
      call PAPIf_event_code_to_name (events(1), name, retval)
      if ( retval.NE.PAPI_OK) then
      call ftest_fail(__FILE__, __LINE__,
     * 'PAPIf_event_code_to_name', retval)
      end if
      n=last_char(name)
      print *, "Test case accum: Test of PAPI_add_events, ",
     * "PAPI_remove_events, PAPI_accum"
      print *, "------------------------------------------",
     * "------------------------"
      write (*,100) "Test type", 1, 2, 3, 4, 5
      write (*,100) name(1:n), values(1), values(3),
     * values(5), values(7), values(9)
      write (*,100) "PAPI_TOT_CYC", values(2), values(4),
     * values(6), values(8), values(10)
      print *, "------------------------------------------",
     * "------------------------"
 100  format(a15, ":", i10, i10, i10, i10, i10)
      print *
      print *, "Verification:"
      print *, "Column 2 approximately equals to 0;"
      print *, "Column 3 approximately equals 2 * Column 1;"
      print *, "Column 4 approximately equals 3 * Column 1;"
      print *, "Column 5 approximately equals Column 1."
      end if

      call ftests_pass(__FILE__)
      end

==> papi-papi-7-2-0-t/src/ftests/avail.F <==

C This file performs the following tests:
C Hardware info

#include "fpapi_test.h"

      program avail
      IMPLICIT integer (p)

      INTEGER ncpu,nnodes,totalcpus,vendor,model, check, handle, n
      CHARACTER*(PAPI_MAX_STR_LEN) vstring, mstring
      REAL revision, mhz
      integer last_char
      external last_char
      integer i, avail_flag, flags,k,l
      CHARACTER*(PAPI_MAX_STR_LEN) event_name, event_descr,
     *event_label, event_note
      CHARACTER*(10) avail_str, flags_str
      integer tests_quiet, get_quiet
      external get_quiet

      tests_quiet = get_quiet()
      handle=0

      check = PAPI_VER_CURRENT
      call PAPIf_library_init(check)
      if ( check.NE.PAPI_VER_CURRENT) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPI_library_init', check)
      end if

      call PAPIf_get_hardware_info( ncpu,nnodes,totalcpus,vendor,
     . vstring, model, mstring, revision, mhz )

      if (tests_quiet .EQ.
     . 0) then
      print *, 'Hardware information and available events'
      print *, '--------------------------------------'//
     .'---------------------------------------'
      n=last_char(vstring)
      print *, 'Vendor string and code : ',vstring(1:n),
     &' (',vendor,')'
      n=last_char(mstring)
      print *, 'Model string and code : ',mstring(1:n),' (',model,')'
      print *, 'CPU revision : ',revision
      print *, 'CPU Megahertz : ',mhz
      print *, 'CPUs in an SMP node : ',ncpu
      print *, 'Nodes in the system : ',nnodes
      print *, 'Total CPUs in the system : ',totalcpus
      print *, '--------------------------------------'//
     .'---------------------------------------'
      write (*,200) 'Name', 'Code', 'Avail', 'Deriv',
     *'Description', '(note)'
 200  format(A8, A12, A9, A6, A25, A30)
      end if

      event_name=' '
      do i=0, PAPI_MAX_PRESET_EVENTS-1
C PAPI_L1_DCM is the first event in the list
      call papif_get_event_info(PAPI_L1_DCM+i, event_name,
     * event_descr, event_label, avail_flag, event_note, flags, check)
      if (avail_flag.EQ.1) then
      avail_str = 'Yes'
      else
      avail_str = 'No'
      end if
      if (flags.EQ.1) then
      flags_str = 'Yes'
      else
      flags_str = 'No'
      end if
      if (check.EQ.PAPI_OK .and. tests_quiet .EQ. 0) then
      l=1
      do k=len(event_note),1,-1
      if(l.EQ.1.AND.event_note(k:k).NE.' ') l=k
      end do
C PAPI_L1_DCM is the first event in the list
      write (6, 100) event_name, PAPI_L1_DCM+i, avail_str,
     * flags_str, event_descr, event_note(1:l)
 100  format(A12, '0x', z8, 2x, A5, 1x, A5, A45, 1x,'(', A, ')')
      end if
      end do

      if (tests_quiet .EQ. 0) then
      print *, '--------------------------------------'//
     .'---------------------------------------'
      end if

      call ftests_pass(__FILE__)
      end

==> papi-papi-7-2-0-t/src/ftests/case1.F <==

C From Dave McNamara at PSRV. Thanks!
C Ported to Fortran by Kevin London

C If you try to add an event that doesn't exist, you get the correct error
C message, yet you get subsequent Seg. Faults when you try to do PAPI_start
C and PAPI_stop.
C I would expect some bizarre behavior if I had no events
C added to the event set and then tried to PAPI_start but if I had
C successfully added one event, then the 2nd one get an error when I
C tried to add it, is it possible for PAPI_start to work but just
C count the first event?

#include "fpapi_test.h"

      program case1
      IMPLICIT integer (p)

      INTEGER EventSet
      INTEGER retval
      INTEGER i,j
      INTEGER*8 gl(2)
      INTEGER n
      REAL c,a,b
      INTEGER last_char
      EXTERNAL last_char
      integer tests_quiet, get_quiet
      external get_quiet

      tests_quiet = get_quiet()

      n = 1000
      a = 0.999
      b = 1.001
      j = 0
      i = 0
      EventSet = PAPI_NULL

      retval = PAPI_VER_CURRENT
      call PAPIf_library_init( retval )
      if ( retval.NE.PAPI_VER_CURRENT) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPI_library_init', retval)
      end if

      call PAPIf_create_eventset( EventSet, retval )
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_create_eventset',
     *retval)
      end if

      call PAPIf_query_event(PAPI_L2_TCM, retval)
      if (retval .EQ. PAPI_OK) then
      j = j + 1
      end if

      if (j .NE. 0) then
      call PAPIf_add_event( EventSet, PAPI_L2_TCM, retval )
      if (retval .NE. PAPI_OK) then
      if (retval .NE. PAPI_ECNFLCT) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_add_event',
     *retval)
      else
      j = j - 1
      end if
      end if
      end if

      i = j
      call PAPIf_query_event(PAPI_L2_DCM, retval)
      if (retval .EQ. PAPI_OK) then
      j = j + 1
      end if

      if (j .EQ. i+1) then
      call PAPIf_add_event( EventSet, PAPI_L2_DCM, retval )
      if (retval .NE. PAPI_OK) then
      if (retval .NE. PAPI_ECNFLCT) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_add_event',
     *retval)
      else
      j = j - 1
      end if
      end if
      end if

      if (J .GT. 0) then
      call PAPIf_start( EventSet, retval )
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_start', retval)
      end if
      end if

      do i=1, n
      c = a * b
      end do

      if (j .GT. 0) then
      call PAPIf_stop( EventSet, gl, retval )
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_stop', retval)
      end if
      end if

      call ftests_pass(__FILE__)
      end

==> papi-papi-7-2-0-t/src/ftests/case2.F <==

C From Dave McNamara at PSRV. Thanks!
C Ported to fortran by Kevin London

C If an event is countable but you've exhausted the counter resources
C and you try to add an event, it seems subsequent PAPI_start and/or
C PAPI_stop will causes a Seg. Violation.
C I got around this by calling PAPI to get the # of countable events,
C then making sure that I didn't try to add more than these number of
C events. I still have a problem if someone adds Level 2 cache misses
C and then adds FLOPS 'cause I didn't count FLOPS as actually requiring
C 2 counters.

#include "fpapi_test.h"

      program case2
      IMPLICIT integer (p)

      REAL c,a,b
      INTEGER n
      INTEGER EventSet
      INTEGER retval
      INTEGER I,j
      INTEGER*8 gl(3)
      INTEGER last_char
      EXTERNAL last_char
      integer tests_quiet, get_quiet
      external get_quiet

      tests_quiet = get_quiet()

      a=0.999
      b=1.001
      n=1000
      i=0
      j=0
      EventSet = PAPI_NULL

      retval = PAPI_VER_CURRENT
      call PAPIf_library_init( retval )
      if ( retval.NE.PAPI_VER_CURRENT) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPI_library_init', retval)
      end if

      call PAPIf_create_eventset( EventSet, retval)
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_create_eventset',
     *retval)
      end if

      call PAPIf_query_event(PAPI_BR_CN, retval)
      if (retval .EQ. PAPI_OK) then
      j = j + 1
      end if

      if (j .NE. 0) then
      call PAPIf_add_event( EventSet, PAPI_BR_CN, retval )
      if ( retval .NE. PAPI_OK ) then
      if (tests_quiet .EQ. 0) then
      call PAPIf_perror( 'PAPIf_add_event' )
      endif
      end if
      end if

      i = j
      call PAPIf_query_event(PAPI_TOT_CYC, retval)
      if (retval .EQ. PAPI_OK) then
      j = j + 1
      end if

      if (j .EQ. i+1) then
      call PAPIf_add_event( EventSet, PAPI_TOT_CYC, retval )
      if ( retval .NE. PAPI_OK )then
      if (tests_quiet .EQ.
     . 0) then
      call PAPIf_perror( 'PAPIf_add_event' )
      end if
      end if
      end if

      i = j
      call PAPIf_query_event(PAPI_FP_INS, retval)
      if (retval .EQ. PAPI_OK) then
      j = j + 1
      end if

      if (j .EQ. i+1) then
      call PAPIf_add_event(EventSet,PAPI_TOT_INS,retval)
      if ( retval .NE. PAPI_OK )then
      if ( retval .NE. PAPI_ECNFLCT ) then
      if (tests_quiet .EQ. 0) then
      call PAPIf_perror( 'PAPIf_add_event' )
      end if
      end if
      end if
      end if

      if (J .GT. 0) then
      call PAPIf_start(EventSet, retval )
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_start', retval)
      end if
      end if

      do i=1,n
      c = a * b
      end do

      if (J .GT. 0) then
      call PAPIf_stop( EventSet, gl, retval)
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_stop', retval)
      end if
      end if

      call ftests_pass(__FILE__)
      end

==> papi-papi-7-2-0-t/src/ftests/clockres.F <==

#include "fpapi_test.h"
#define ITERS 100000

      program clockres
      IMPLICIT integer (p)

      integer*8, allocatable, dimension(:) :: elapsed_usec, elapsed_cyc
      INTEGER*8 total_usec, total_cyc
      INTEGER i,handle
      INTEGER retval
      integer tests_quiet, get_quiet
      external get_quiet

      tests_quiet = get_quiet()

      total_usec=0
      total_cyc=0
      handle=0
      allocate(elapsed_usec(ITERS))
      allocate(elapsed_cyc(ITERS))

      retval = PAPI_VER_CURRENT
      call PAPIf_library_init( retval )
      if ( retval.NE.PAPI_VER_CURRENT) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPI_library_init', retval)
      end if

      if (tests_quiet .EQ. 0) then
      print *, 'Test case: Clock resolution.'
      print *,'-----------------------------------------------'
      end if

      do i=1,ITERS
      call PAPIf_get_real_cyc( elapsed_cyc(i) )
      end do

      do i=2,ITERS
      if ((elapsed_cyc(i)-elapsed_cyc(i-1)).LT.0 ) stop
      total_cyc =total_cyc+(elapsed_cyc(i) - elapsed_cyc(i-1))
      end do

      do i=1,ITERS
      call PAPIf_get_real_usec(elapsed_usec(i))
      end do

      do i=2,ITERS
      if ((elapsed_usec(i) - elapsed_usec(i-1)).LT.0) stop
      total_usec=total_usec+(elapsed_usec(i) - elapsed_usec(i-1))
      end do

      if (tests_quiet .EQ.
     . 0) then
      print *,'PAPIf_get_real_cyc : ',(total_cyc/(ITERS-1))
      print *,'PAPIf_get_real_usec: ',(total_usec/(ITERS-1))
      end if

      deallocate(elapsed_usec, elapsed_cyc)
      call ftests_pass(__FILE__)
      end

==> papi-papi-7-2-0-t/src/ftests/cost.F <==

#include "fpapi_test.h"

      program cost
      implicit integer (p)

      integer es
      integer*8 values(10)
      integer*8 ototcyc, ntotcyc
      integer*4 i
      integer retval
      Integer last_char
      External last_char
      integer tests_quiet, get_quiet
      external get_quiet

      tests_quiet = get_quiet()
      es = PAPI_NULL

      if (tests_quiet .EQ. 0) then
      print *, "Cost of execution for PAPI start/stop",
     *" and PAPI read."
      print *, "This test takes a while. Please be patient..."
      end if

      retval = PAPI_VER_CURRENT
      call PAPIf_library_init(retval)
      if ( retval.NE.PAPI_VER_CURRENT) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPI_library_init', retval)
      end if

      call PAPIf_query_event(PAPI_TOT_CYC, retval)
      if ( retval.NE.PAPI_OK) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_query_event', retval)
      end if

      call PAPIf_query_event(PAPI_TOT_INS, retval)
      if ( retval.NE.PAPI_OK) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_query_event', retval)
      end if

      call PAPIf_create_eventset(es, retval)
      if ( retval.NE.PAPI_OK) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_create_eventset',
     *retval)
      end if

      call PAPIf_add_event( es, PAPI_TOT_CYC, retval )
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_add_event', retval)
      end if

      call PAPIf_add_event( es, PAPI_TOT_INS, retval )
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_add_event', retval)
      end if

      if (tests_quiet .EQ. 0) then
      print *, "Performing start/stop test..."
      end if

      call PAPIf_start(es, retval)
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_start', retval)
      end if

      call PAPIf_stop(es, values(1), retval)
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_stop', retval)
      end if

      call PAPIf_get_real_cyc(ototcyc)
      do i=0, 50000
      call PAPIf_start(es, retval)
      call PAPIf_stop(es, values(1), retval)
      end do
      call PAPIf_get_real_cyc(ntotcyc)
      ntotcyc=ntotcyc-ototcyc

      if (tests_quiet .EQ. 0) then
      print *
      print *
      print *, "Total cost for PAPI_start/stop(2 counters) over",
     *" 50000 iterations:"
      write (*, 100) ntotcyc, "total cyc"
      write (*, 200) REAL(ntotcyc)/50001.0, "cyc/call pair"
      print *
      print *
C Start the read val
      print *, "Performing read test..."
      end if

      call PAPIf_start(es, retval)
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_start', retval)
      end if

      call PAPIf_get_real_cyc(ototcyc)
      do i=0, 50000
      call PAPIf_read(es, values(1), retval)
      end do

      call PAPIf_stop(es, values(1), retval)
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_stop', retval)
      end if
      call PAPIf_get_real_cyc(ntotcyc)
      ntotcyc=ntotcyc-ototcyc

      if (tests_quiet .EQ. 0) then
      print *
      print *, "User level cost for PAPI_read(2 counters) over",
     *" 50000 iterations:"
      print *
      print *, "Total cost for PAPI_read(2 counters) over ",
     *"50000 iterations:"
      write (*, 100) ntotcyc, "total cyc"
      write (*, 200) REAL(ntotcyc)/50001.0, "cyc/call"
      end if

 100  format (I15, A15)
 200  format (F15.6, A15)

      call ftests_pass(__FILE__)
      end

==> papi-papi-7-2-0-t/src/ftests/description.F <==

#include "fpapi_test.h"

      program description
      implicit integer (p)

      integer es1, number
      integer*8 values(10)
      integer events(2), eventlist(2)
      integer eventtotal
      integer i
      character*PAPI_MAX_STR_LEN name
      integer status
      integer retval
      Integer last_char
      External last_char
      integer tests_quiet, get_quiet
      external get_quiet

      tests_quiet = get_quiet()
      es1 = PAPI_NULL

      if (tests_quiet .EQ.
     . 0) then
      print *, "Test case descriptions: Test of functions:"
      print *, " PAPI_add_events, PAPI_remove_events,"
      print *, " PAPI_list_events, PAPI_describe_event,"
      print *, " PAPI_state"
      end if

      retval = PAPI_VER_CURRENT
      call PAPIf_library_init(retval)
      if ( retval.NE.PAPI_VER_CURRENT) then
      call ftest_fail(__FILE__, __LINE__,
     *'PAPI_library_init', retval)
      end if

      call PAPIf_create_eventset(es1, retval)
      if ( retval.NE.PAPI_OK) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_create_eventset',
     *retval)
      end if

      number=2
      call PAPIf_query_event(PAPI_FP_INS, retval)
      if (retval .NE. PAPI_OK) then
      events(1) = PAPI_TOT_INS
      else
      events(1) = PAPI_FP_INS
      end if
      events(2) = PAPI_TOT_CYC

      call PAPIf_add_events( es1, events, number, retval )
      if ( retval.LT.PAPI_OK) then
      call ftest_fail(__FILE__, __LINE__,
     *'PAPIf_add_event', retval)
      end if

      eventtotal=5
      call PAPIf_list_events(es1, eventlist, eventtotal, retval)
      if ( retval.NE.PAPI_OK) then
      call ftest_fail(__FILE__, __LINE__,
     *'PAPIf_list_events', retval)
      end if

      if (tests_quiet .EQ. 0) then
      print *, " "
      print *, "Event List:"
      print *, "---------------------------------------",
     * "---------------------------"
      print *, "Event Name Code"
      end if

      do i = 1, eventtotal
      call PAPIf_event_code_to_name (eventlist(i), name, retval)
      if ( retval.NE.PAPI_OK) then
      call ftest_fail(__FILE__, __LINE__,
     *'PAPIf_event_code_to_name', retval)
      end if
      if (tests_quiet .EQ. 0) then
      write (*, 100) name, eventlist(i)
      end if
 100  format(A12,O12)
      end do

      if (tests_quiet .EQ. 0) then
      print *, "---------------------------------------",
     *"---------------------------"
      end if

      call PAPIf_state(es1, status, retval)
      if ( retval.NE.PAPI_OK) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_state', retval)
      end if
      if (status .NE. PAPI_STOPPED) then
      print *, "PAPI_state Error"
      stop
      end if
      if (tests_quiet .EQ. 0) then
      print *, "PAPI_state: PAPI_STOPPED"
      end if

      call PAPIf_start(es1, retval)
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_start', retval)
      end if
      if (tests_quiet .EQ. 0) then
      print *, "PAPI_start"
      end if

      call PAPIf_state(es1, status, retval)
      if ( retval.NE.PAPI_OK) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_state', retval)
      end if
      if (status .NE. PAPI_RUNNING) then
      print *, "PAPI_state Error"
      stop
      end if
      if (tests_quiet .EQ. 0) then
      print *, "PAPI_state: PAPI_RUNNING"
      end if

      call fdo_flops(NUM_FLOPS)

      call PAPIf_stop(es1, values, retval)
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_stop', retval)
      end if
      if (tests_quiet .EQ. 0) then
      print *, "PAPI_stop"
      end if

      call PAPIf_state(es1, status, retval)
      if ( retval.NE.PAPI_OK) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_state', retval)
      end if
      if (status .NE. PAPI_STOPPED) then
      print *, "PAPI_state Error"
      stop
      end if
      if (tests_quiet .EQ. 0) then
      print *, "PAPI_state: PAPI_STOPPED"
      end if

      call PAPIf_remove_events( es1, events, number, retval )
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     *'PAPIf_remove_events', retval)
      end if

      if (tests_quiet .EQ. 0) then
      call PAPIf_event_code_to_name (eventlist(1), name, retval)
      if ( retval.NE.PAPI_OK) then
      call ftest_fail(__FILE__, __LINE__,
     * 'PAPIf_event_code_to_name', retval)
      end if
      print *, " "
      print *, "Results:"
      print *, "---------------------------------------",
     * "---------------------------"
      print *, "Test type : 1"
      print *, name, " : ", values(1)
      print *, "PAPI_TOT_CYC : ", values(2)
      print *, "---------------------------------------",
     * "---------------------------"
      print *, " "
      print *, "Verification:"
      print *, "1. The events listed by PAPI_describe_event",
     * "should be exactly the same events added by PAPI_add_events."
      print *, "2. The PAPI_state should be PAPI_RUNNING after ",
     * "PAPI_start and before PAPI_stop."
      print *, "It should be PAPI_STOPPED at other time."
      end if

      call ftests_pass(__FILE__)
      end

==> papi-papi-7-2-0-t/src/ftests/eventname.F <==

#include "fpapi_test.h"

      program eventname
      IMPLICIT integer (p)

      INTEGER retval, handle
      INTEGER preset
      integer tests_quiet, get_quiet
      external get_quiet

      tests_quiet = get_quiet()
      handle = 0

      retval = PAPI_VER_CURRENT
      call PAPIf_library_init( retval )
      if ( retval.NE.PAPI_VER_CURRENT) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPI_library_init', retval)
      end if

      call PAPIf_event_name_to_code( 'PAPI_FP_INS',preset,retval )
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_event_name_to_code',
     *retval)
      end if

      if (tests_quiet .EQ. 0) then
      write (*, 100) preset
 100  format ('PAPI_FP_INS code is', Z10)
      end if

      call ftests_pass(__FILE__)
      end

==> papi-papi-7-2-0-t/src/ftests/fdmemtest.F <==

#include "fpapi_test.h"

      program dmemtest
      IMPLICIT integer (p)

      INTEGER retval
      INTEGER*8 dmeminfo(PAPIF_DMEM_MAXVAL)
      integer tests_quiet, get_quiet
      external get_quiet
      real EventSet

      tests_quiet = get_quiet()
      EventSet = PAPI_NULL

      retval = PAPI_VER_CURRENT
      call PAPIf_library_init(retval)
      if ( retval.NE.PAPI_VER_CURRENT) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPI_library_init', retval)
      end if

      CALL PAPIf_get_dmem_info(dmeminfo, retval)
      if ( retval.NE.PAPI_OK) then
      stop
      end if

      if (tests_quiet .EQ.
     . 0) then
      print *, "Mem Size: ", dmeminfo(PAPIF_DMEM_VMSIZE)
      print *, "Mem Resident: ", dmeminfo(PAPIF_DMEM_RESIDENT)
      print *, "Mem High Water: ", dmeminfo(PAPIF_DMEM_HIGH_WATER)
      print *, "Mem Shared: ", dmeminfo(PAPIF_DMEM_SHARED)
      print *, "Mem Text: ", dmeminfo(PAPIF_DMEM_TEXT)
      print *, "Mem Library: ", dmeminfo(PAPIF_DMEM_LIBRARY)
      print *, "Mem Heap: ", dmeminfo(PAPIF_DMEM_HEAP)
      print *, "Mem Locked: ", dmeminfo(PAPIF_DMEM_LOCKED)
      print *, "Mem Stack: ", dmeminfo(PAPIF_DMEM_STACK)
      print *, "Mem Pagesize: ", dmeminfo(PAPIF_DMEM_PAGESIZE)
      end if

      call ftests_pass(__FILE__)
      end

==> papi-papi-7-2-0-t/src/ftests/first.F <==

#include "fpapi_test.h"

      program first
      IMPLICIT integer (p)

      integer event1
      INTEGER retval
      INTEGER*8 values(10)
      INTEGER*8 max, min
      INTEGER EventSet
      integer domain, granularity
      character*(PAPI_MAX_STR_LEN) domainstr, grnstr
      character*(PAPI_MAX_STR_LEN) name
      Integer last_char, n
      External last_char
      integer tests_quiet, get_quiet
      external get_quiet

      tests_quiet = get_quiet()
      EventSet = PAPI_NULL

      retval = PAPI_VER_CURRENT
      call PAPIf_library_init(retval)
      if ( retval.NE.PAPI_VER_CURRENT) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPI_library_init', retval)
      end if

      call PAPIf_query_event(PAPI_FP_INS, retval)
      if (retval .NE. PAPI_OK) then
      event1 = PAPI_TOT_INS
      else
      event1 = PAPI_FP_INS
      end if

      call PAPIf_create_eventset(EventSet, retval)
      if ( retval.NE.PAPI_OK) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_create_eventset',
     *retval)
      end if

      call PAPIf_add_event( EventSet, event1, retval )
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_add_event',
     *retval)
      end if

      call PAPIf_add_event( EventSet, PAPI_TOT_CYC, retval )
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
     . 'PAPIf_add_event',
     *retval)
      end if

      call PAPIf_start(EventSet, retval)
      if ( retval .NE. PAPI_OK ) then
      call ftest_fail(__FILE__, __LINE__,
'PAPIf_start', retval) end if call fdo_flops(NUM_FLOPS) call PAPIf_read(EventSet, values(1), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_read', retval) end if call PAPIf_reset(EventSet, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_reset', retval) end if call fdo_flops(NUM_FLOPS) call PAPIf_read(EventSet, values(3), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_read', retval) end if call fdo_flops(NUM_FLOPS) call PAPIf_read(EventSet, values(5), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_read', retval) end if call fdo_flops(NUM_FLOPS) call PAPIf_stop(EventSet, values(7), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_stop', retval) end if call PAPIf_read(EventSet, values(9), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_read', retval) end if if (tests_quiet .EQ. 0) then print *, 'TEST CASE 1: Non-overlapping start, stop, read.' print *, '--------------------------------------------------'// * '--------------------------------' end if call PAPIf_get_domain(EventSet, domain, PAPI_DEFDOM, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_get_domain', retval) end if call stringify_domain(domain, domainstr) if (tests_quiet .EQ. 0) then write (*,900) 'Default domain is:', domain, domainstr 900 format(a20, i3, ' ', a70) end if call PAPIf_get_granularity(eventset, granularity, PAPI_DEFGRN, *retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_get_granularity', *retval) end if call stringify_granularity(granularity, grnstr) if (tests_quiet .EQ. 
0) then call PAPIf_event_code_to_name (event1, name, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, * 'PAPIf_event_code_to_name', retval) end if n=last_char(name) write (*,800) 'Default granularity is:', granularity, grnstr 800 format(a25, i3, ' ', a20) print *, 'Using', NUM_FLOPS, ' iterations of c += b*c' print *, '-----------------------------------------------'// * '-----------------------------------' write (*,100) 'Test type', 1, 2, 3, 4, 5 write (*,100) name(1:n), values(1), values(3), * values(5), values(7), values(9) write (*,100) 'PAPI_TOT_CYC', values(2), values(4), * values(6), values(8), values(10) 100 format(a13, ': ', i11, i11, i11, i11, i11) print *, '-----------------------------------------------'// * '-----------------------------------' print *, 'Verification:' print *, 'Column 1 approximately equals column 2' print *, 'Column 3 approximately equals 2 * column 2' print *, 'Column 4 approximately equals 3 * column 2' print *, 'Column 4 exactly equals column 5' end if min = INT(REAL(values(3))*0.8) max = INT(REAL(values(3))*1.2) if ((values(1).gt.max) .OR. (values(1).lt.min) .OR. *(values(5).gt.(max*2)) .OR. (values(5).lt.(min*2)) .OR. *(values(7).gt.(max*3)) .OR. (values(7).lt.(min*3)) .OR. *(values(7).NE.values(9))) then call ftest_fail(__FILE__, __LINE__, . name, 1) end if min = INT(REAL(values(4))*0.65) max = INT(REAL(values(4))*1.35) if ((values(2).gt.max) .OR. (values(2).lt.min) .OR. *(values(6).gt.(max*2)) .OR. (values(6).lt.(min*2)) .OR. *(values(8).gt.(max*3)) .OR. (values(8).lt.(min*3)) .OR. *(values(8).NE.values(10))) then call ftest_fail(__FILE__, __LINE__, . 
'PAPI_TOT_CYC', 1) end if call ftests_pass(__FILE__) end papi-papi-7-2-0-t/src/ftests/fmatrixlowpapi.F000066400000000000000000000117711502707512200211230ustar00rootroot00000000000000C **************************************************************************** C C matrixpapi.f C An example of matrix-matrix multiplication and using PAPI low level to C look at the performance. written by Kevin London C March 2000 C **************************************************************************** #include "fpapi_test.h" program fmatrixlowpapi implicit integer (p) INTEGER ncols1,nrows1,ncols2,nrows2 PARAMETER(nrows1=175,ncols1=225,nrows2=ncols1,ncols2=150) INTEGER i,j,k,retval,nchr,numevents,EventSet CHARACTER*(PAPI_MAX_STR_LEN) vstring,mstring C PAPI values of the counters INTEGER event INTEGER*8 values(2) INTEGER*8 starttime,stoptime REAL*8 finaltime INTEGER ncpu,nnodes,totalcpus,vendor,model REAL revision, mhz real*8, allocatable, dimension(:,:) :: p, q, r integer tests_quiet, get_quiet external get_quiet tests_quiet = get_quiet() EventSet = PAPI_NULL C Setup default values numevents=0 starttime=0 stoptime=0 allocate(p(nrows1, ncols1)) allocate(q(nrows2, ncols2)) allocate(r(nrows1, ncols2)) retval = PAPI_VER_CURRENT call PAPIf_library_init( retval ) if ( retval.NE.PAPI_VER_CURRENT) then call ftest_fail(__FILE__, __LINE__, *'PAPI_library_init', retval) end if C Create the eventset call PAPIf_create_eventset(EventSet,retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, *'PAPIf_create_eventset', retval) end if C Total cycles call PAPIf_add_event(EventSet,PAPI_TOT_CYC,retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, *'PAPIf_add_event PAPI_TOT_CYC', retval) end if C Total [floating point] instructions call PAPIf_query_event(PAPI_FP_INS, retval) if (retval .NE. PAPI_OK) then event = PAPI_TOT_INS else event = PAPI_FP_INS end if call PAPIf_add_event(EventSet,event,retval) if ( retval .NE. 
PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, *'PAPIf_add_event PAPI_TOT_INS', retval) end if C Grab the hardware info call PAPIf_get_hardware_info( ncpu, nnodes, totalcpus, vendor, . vstring, model, mstring, revision, mhz ) do i=len(mstring),1,-1 if(mstring(i:i).NE.' ') goto 10 end do 10 if(i.LT.1)then nchr=1 else nchr=i end if if (tests_quiet .EQ. 0) then print * print 100, totalcpus,mstring(1:nchr), mhz print * print 101,'ncpu',ncpu, 'nnodes',nnodes, 'totalcpus',totalcpus print 102,'mhz',mhz,'revision',revision print 103,'vendor',vendor,'vstring',vstring print 104,'model',model,'mstring',mstring print * end if 100 format(i5,' CPU(s) ',a,' at ',f7.2,' MHz') 101 format(a9,' =',i6,7x,a9,' =',i5,5x,a9,'=',i5) 102 format(a9,' =',f7.2,6x,a9,' =',f15.5) 103 format(a9,' =',i6,7x,a9,' =',a40) 104 format(a9,' =',i6,7x,a9,' =',a40) C Open matrix file number 1 for reading C OPEN(UNIT=1,FILE='fmt1',STATUS='OLD') C Open matrix file number 2 for reading C OPEN(UNIT=2,FILE='fmt2',STATUS='OLD') C matrix 1: read in the matrix values do i=1, nrows1 do j=1,ncols1 p(i,j) = i*j*1.0 end do end do C matrix 2: read in the matrix values do i=1, nrows2 do j=1,ncols2 q(i,j) = i*j*1.0 end do end do C Initialize the result matrix do i=1,nrows1 do j=1, ncols2 r(i,j) = i*j*1.0 end do end do C Grab the beginning time call PAPIf_get_real_usec( starttime ) C Start the event counters call PAPIf_start( EventSet, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_start', retval) end if C Compute the matrix-matrix multiplication do i=1,nrows1 do j=1,ncols2 do k=1,ncols1 r(i,j)=r(i,j) + p(i,k)*q(k,j) end do end do end do C Stop the counters and put the results in the array values call PAPIf_stop(EventSet,values,retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 
'PAPIf_stop', retval) end if call PAPIf_get_real_usec( stoptime ) finaltime=(REAL(stoptime)/1000000.0)-(REAL(starttime)/1000000.0) C Make sure the compiler does not optimize away the multiplication call dummy(r) if (tests_quiet .EQ. 0) then print *, 'Time: ', finaltime, 'seconds' print *, 'Cycles: ', values(1) if (event .EQ. PAPI_TOT_INS) then print *, 'Total Instructions: ', values(2) else print *, 'FP Instructions: ', values(2) write(*,'(a,f9.6)') ' Efficiency (fp/cycle):', & real(values(2))/real(values(1)) end if end if deallocate(p, q, r) call ftests_pass(__FILE__) end papi-papi-7-2-0-t/src/ftests/fmultiplex1.F000066400000000000000000000304101502707512200203160ustar00rootroot00000000000000#include "fpapi_test.h" program multiplex1 IMPLICIT integer (p) integer retval integer tests_quiet, get_quiet external get_quiet tests_quiet = get_quiet() if (tests_quiet .EQ. 0) then write (*, 100) NUM_ITERS 100 FORMAT ("multiplex1: Using ", I3, " iterations") write (*,*) "case1: Does PAPI_multiplex_init() not break", *" regular operation?" end if call case1(retval, tests_quiet) if (tests_quiet .EQ. 0) then write (*,*) "case2: Does setmpx/add work?" end if call case2(retval, tests_quiet) if (tests_quiet .EQ. 0) then write (*,*) "case3: Does add/setmpx work?" end if call case3(retval, tests_quiet) if (tests_quiet .EQ. 0) then write (*,*) "case4: Does add/setmpx/add work?" end if call case4(retval, tests_quiet) retval = PAPI_VER_CURRENT call PAPIf_library_init(retval) if ( retval.NE.PAPI_VER_CURRENT) then call ftest_fail(__FILE__, __LINE__, & 'PAPI_library_init', retval) end if call ftests_pass(__FILE__) end subroutine init_papi(event) IMPLICIT integer (p) integer retval integer event retval = PAPI_VER_CURRENT call PAPIf_library_init(retval) if ( retval.NE.PAPI_VER_CURRENT) then call ftest_fail(__FILE__, __LINE__, & 'PAPI_library_init', retval) end if call PAPIf_query_event(PAPI_TOT_INS, retval) if (retval .NE. 
PAPI_OK) then event = PAPI_TOT_CYC else event = PAPI_TOT_INS end if end C Tests that PAPI_multiplex_init does not mess with normal operation. subroutine case1(ret, tests_quiet) IMPLICIT integer (p) integer ret, tests_quiet, event integer retval, EventSet INTEGER*8 values(4) integer fd EventSet = PAPI_NULL call init_papi(event) call init_multiplex() call PAPIf_create_eventset(EventSet, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_create_eventset', retval) end if call PAPIf_add_event( EventSet, event, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_add_event', retval) end if call PAPIf_add_event( EventSet, PAPI_TOT_CYC, retval ) if ( retval .NE. PAPI_OK ) then call PAPIf_add_event( EventSet, PAPI_TOT_IIS, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_add_event', retval) end if end if if(tests_quiet .EQ. 0) then write(*,*) 'Event set list' call PrintEventSet(EventSet) end if call do_stuff() call PAPIf_start(EventSet, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_start', retval) end if fd = 1 call do_stuff() call PAPIf_stop(EventSet, values(1), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_stop', retval) end if if (tests_quiet .EQ. 0) then print *, "case1: ", values(1), values(2) end if call PAPIf_cleanup_eventset(EventSet, retval) if (retval .NE. 
PAPI_OK) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_cleanup_eventset', retval) end if call PAPIF_shutdown() ret = SUCCESS end C Tests that PAPI_set_multiplex() works before adding events subroutine case2(ret, tests_quiet) IMPLICIT integer (p) integer ret, tests_quiet, event integer retval, EventSet INTEGER*8 values(4) integer fd EventSet = PAPI_NULL call init_papi(event) call init_multiplex() call PAPIf_create_eventset(EventSet, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_create_eventset', retval) end if call PAPIf_assign_eventset_component(EventSet, 0, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_assign_eventset_component', retval) end if call PAPIf_set_multiplex(EventSet, retval) if ( retval.EQ.PAPI_ENOSUPP) then call ftest_skip(__FILE__, __LINE__, & 'Multiplex not implemented', 1) end if if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, & 'papif_set_multiplex', retval) end if call PAPIf_add_event( EventSet, event, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_add_event', retval) end if call PAPIf_add_event( EventSet, PAPI_TOT_CYC, retval ) if ( retval .NE. PAPI_OK ) then call PAPIf_add_event( EventSet, PAPI_TOT_IIS, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_add_event', retval) end if end if C This print-out is disabled until PAPIf_list_event is working C for multiplexed event sets (change -4711 to 0 when it is working) if(tests_quiet .EQ. 0) then write(*,*) 'Event set list' call PrintEventSet(EventSet) endif call PAPIf_start(EventSet, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_start', retval) end if fd = 1 call do_stuff() call PAPIf_stop(EventSet, values(1), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_stop', retval) end if if (tests_quiet .EQ. 
0) then print *, "case2: ", values(1), values(2) end if call PAPIf_cleanup_eventset(EventSet, retval) if (retval .NE. PAPI_OK) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_cleanup_eventset', retval) end if call PAPIF_shutdown() ret = SUCCESS end C Tests that PAPI_set_multiplex() works after adding events subroutine case3(ret, tests_quiet) IMPLICIT integer (p) integer ret, tests_quiet, event integer retval, EventSet INTEGER*8 values(4) integer fd EventSet = PAPI_NULL call init_papi(event) call init_multiplex() call PAPIf_create_eventset(EventSet, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_create_eventset', retval) end if call PAPIf_add_event( EventSet, event, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_add_event', retval) end if call PAPIf_add_event( EventSet, PAPI_TOT_CYC, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_add_event', retval) end if if(tests_quiet .EQ. 0) then write(*,*) 'Event set before call to PAPIf_set_multiplex:' call PrintEventSet(EventSet) endif call PAPIf_set_multiplex(EventSet, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, & 'papif_set_multiplex', retval) end if if(tests_quiet .EQ. 0) then write(*,*) 'Event set after call to PAPIf_set_multiplex:' call PrintEventSet(EventSet) endif call PAPIf_start(EventSet, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_start', retval) end if fd = 1 call do_stuff() call PAPIf_stop(EventSet, values(1), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_stop', retval) end if if (tests_quiet .EQ. 0) then print *, "case3: ", values(1), values(2) end if call PAPIf_cleanup_eventset(EventSet, retval) if (retval .NE. 
PAPI_OK) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_cleanup_eventset', retval) end if call PAPIF_shutdown() ret = SUCCESS end C Tests that PAPI_set_multiplex() works before adding events C Tests that PAPI_add_event() works after C PAPI_add_event()/PAPI_set_multiplex() subroutine case4(ret, tests_quiet) IMPLICIT integer (p) integer ret, tests_quiet, event integer retval, EventSet INTEGER*8 values(4) integer fd EventSet = PAPI_NULL call init_papi(event) call init_multiplex() call PAPIf_create_eventset(EventSet, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_create_eventset', retval) end if call PAPIf_add_event( EventSet, event, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_add_event', retval) end if call PAPIf_add_event( EventSet, PAPI_TOT_CYC, retval ) if ( retval .NE. PAPI_OK ) then call PAPIf_add_event( EventSet, PAPI_TOT_IIS, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_add_event', retval) end if end if if(tests_quiet .EQ. 0) then write(*,*) 'Event set before call to PAPIf_set_multiplex:' call PrintEventSet(EventSet) endif call PAPIf_set_multiplex(EventSet, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, & 'papif_set_multiplex', retval) end if if(tests_quiet .EQ. 0) then write(*,*) 'Event set after call to PAPIf_set_multiplex:' call PrintEventSet(EventSet) endif #if (defined(i386)&&defined(linux))||defined(mips) || (defined(__ia64__) && defined(linux)) || (SUBSTR==aix-power) call PAPIf_add_event( EventSet, PAPI_L1_DCM, retval ) C Try alternative event if the above is not possible to use... if ( retval .EQ. PAPI_ECNFLCT .OR. retval .EQ. PAPI_ENOEVNT ) then call PAPIf_add_event( EventSet, PAPI_L2_DCM, retval ) end if if ( retval .EQ. PAPI_ECNFLCT .OR. retval .EQ. PAPI_ENOEVNT ) then call PAPIf_add_event( EventSet, PAPI_L2_TCM, retval ) end if if ( retval .NE. 
PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_add_event', retval) end if call PAPIf_add_event( EventSet, PAPI_L1_ICM, retval ) C Try alternative event if the above is not possible to use... if ( retval .EQ. PAPI_ECNFLCT .OR. retval .EQ. PAPI_ENOEVNT ) then call PAPIf_add_event( EventSet, PAPI_L1_LDM, retval ) end if if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_add_event', retval) end if #elif (defined(sparc) && defined(sun)) call PAPIf_add_event( EventSet, PAPI_LD_INS, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_add_event', retval) end if call PAPIf_add_event( EventSet, PAPI_SR_INS, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_add_event', retval) end if #elif (defined(__alpha)&&defined(__osf__)) call PAPIf_add_event( EventSet, PAPI_TLB_DM, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_add_event', retval) end if #else print *,'*** Did not match in event selection ***' #endif if(tests_quiet .EQ. 0) then write(*,*) 'Updated event set list:' call PrintEventSet(EventSet) endif call PAPIf_start(EventSet, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_start', retval) end if fd = 1 call do_stuff() call PAPIf_stop(EventSet, values(1), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_stop', retval) end if if (tests_quiet .EQ. 0) then write (*, *) "case4: ", values(1), values(2), values(3), * values(4) end if call PAPIf_cleanup_eventset(EventSet, retval) if (retval .NE. 
PAPI_OK) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_cleanup_eventset', retval) end if call PAPIF_shutdown() ret = SUCCESS end papi-papi-7-2-0-t/src/ftests/fmultiplex2.F000066400000000000000000000106031502707512200203210ustar00rootroot00000000000000#include "fpapi_test.h" #define MAX_TO_ADD 5 program multiplex2 IMPLICIT integer (p) integer retval integer tests_quiet, get_quiet external get_quiet tests_quiet = get_quiet() if (tests_quiet .EQ. 0) then write (*, 100) NUM_ITERS 100 FORMAT ("multiplex2: Using ", I3, " iterations") write (*,*) "case1: Does PAPI_multiplex_init() handle", * " lots of events?" end if call case1(tests_quiet, retval) call ftests_pass(__FILE__) end subroutine init_papi() IMPLICIT integer (p) integer retval retval = PAPI_VER_CURRENT call PAPIf_library_init(retval) if ( retval.NE.PAPI_VER_CURRENT) then call ftest_fail(__FILE__, __LINE__, . 'PAPI_library_init', retval) end if end subroutine case1(tests_quiet, ret) IMPLICIT integer (p) integer tests_quiet integer retval integer i, ret, fd integer EventCode character*(PAPI_MAX_STR_LEN) event_name, event_descr, * event_label, event_note integer avail_flag, flags, check integer EventSet,mask1 integer*8 values(MAX_TO_ADD*2) EventSet = PAPI_NULL call init_papi() call init_multiplex() call PAPIf_create_eventset(EventSet, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_create_eventset', * retval) end if call PAPIf_assign_eventset_component(EventSet, 0, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, & 'PAPIf_assign_eventset_component', retval) end if call PAPIf_set_multiplex(EventSet, retval) if ( retval.EQ.PAPI_ENOSUPP) then call ftest_skip(__FILE__, __LINE__, . 'Multiplex not implemented', retval) end if if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'papif_set_multiplex', retval) end if if (tests_quiet .EQ. 0) then print *, "Checking for available events..." end if EventCode = 0 i = 1 do while (i .LE. 
MAX_TO_ADD) avail_flag=0 do while ((avail_flag.EQ.0).AND. * (EventCode.LT.PAPI_MAX_PRESET_EVENTS)) mask1 = ((PAPI_L1_DCM)+EventCode) if (mask1.NE.PAPI_TOT_CYC) then call papif_get_event_info(mask1, * event_name, event_descr, event_label, avail_flag, * event_note, flags, check) end if EventCode = EventCode + 1 end do if ( EventCode.EQ.PAPI_MAX_PRESET_EVENTS .AND. * i .LT. MAX_TO_ADD ) then call ftest_fail(__FILE__, __LINE__, * 'PAPIf_add_event', retval) end if if (tests_quiet .EQ. 0) then write (*, 200) " Adding Event ", event_name 200 FORMAT(A22, A12) end if mask1 = ((PAPI_L1_DCM)+EventCode) mask1 = mask1 - 1 call PAPIf_add_event( EventSet, mask1, retval ) if ( retval .NE. PAPI_OK .AND. retval .NE. PAPI_ECNFLCT) then call ftest_fail(__FILE__, __LINE__, * 'PAPIf_add_event', retval) stop end if if (tests_quiet .EQ. 0) then if (retval .EQ. PAPI_OK) then write (*, 200) " Added Event ", event_name else write (*, 200) " Could not add Event ", event_name end if end if if (retval .EQ. PAPI_OK) then i = i + 1 end if end do call PAPIf_start(EventSet, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_start', retval) end if fd = 1 call do_stuff() call PAPIf_stop(EventSet, values(1), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_stop', retval) end if call PAPIf_cleanup_eventset(EventSet, retval) if (retval .NE. PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_cleanup_eventset', * retval) end if call PAPIf_destroy_eventset(EventSet, retval) if (retval .NE. PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 
'PAPIf_destroy_eventset', * retval) end if ret = SUCCESS end papi-papi-7-2-0-t/src/ftests/johnmay2.F000066400000000000000000000072441502707512200176040ustar00rootroot00000000000000#include "fpapi_test.h" program johnmay2 implicit integer (p) integer*8 values(10) integer es, event integer retval character*PAPI_MAX_STR_LEN name Integer last_char, n External last_char integer tests_quiet, get_quiet external get_quiet tests_quiet = get_quiet() es = PAPI_NULL retval = PAPI_VER_CURRENT call PAPIf_library_init(retval) if ( retval.NE.PAPI_VER_CURRENT) then call ftest_fail(__FILE__, __LINE__, . 'PAPI_library_init', retval) end if call PAPIf_query_event(PAPI_FP_INS, retval) if (retval.EQ.PAPI_OK) then event = PAPI_FP_INS else call PAPIf_query_event(PAPI_TOT_INS, retval) if ( retval.EQ.PAPI_OK) then event = PAPI_TOT_INS else call ftest_fail(__FILE__, __LINE__, . 'PAPIf_query_event', retval) end if end if call PAPIf_create_eventset(es, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_create_eventset', *retval) end if call PAPIf_add_event( es, event, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event', retval) end if call PAPIf_start(es, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_start', retval) end if call PAPIf_cleanup_eventset(es, retval) if (retval .NE. PAPI_EISRUN) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_cleanup_eventset', *retval) end if call PAPIf_destroy_eventset(es, retval) if (retval .NE. PAPI_EISRUN) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_destroy_eventset', *retval) end if call fdo_flops(NUM_FLOPS) call PAPIf_stop(es, values(1), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_stop', retval) end if call PAPIf_destroy_eventset(es, retval) if (retval .NE. PAPI_EINVAL) then call ftest_fail(__FILE__, __LINE__, . 
'PAPIf_destroy_eventset', *retval) end if call PAPIf_cleanup_eventset(es, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_cleanup_eventset', *retval) end if call PAPIf_destroy_eventset(es, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_destroy_eventset', *retval) end if if (es .NE. PAPI_NULL) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_destroy_eventset', *retval) end if if (tests_quiet .EQ. 0) then call PAPIf_event_code_to_name (event, name, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, * 'PAPIf_event_code_to_name', retval) end if n=last_char(name) print *, "Test case John May 2: cleanup / ", * "destroy eventset." print *, "--------------------------------", * "-----------------" print *, "Test run : 1" print *, name(1:n), " : ", values(1) print *, "----------------------------------", * "---------------" print *, "Verification:" print *, "These error messages:" print *, "PAPI Error Code -10: PAPI_EISRUN: ", * "EventSet is currently counting" print *, "PAPI Error Code -10: PAPI_EISRUN: ", * "EventSet is currently counting" print *, "PAPI Error Code -1: PAPI_EINVAL: ", * "Invalid argument" end if call ftests_pass(__FILE__) end papi-papi-7-2-0-t/src/ftests/nineth.F000066400000000000000000000120261502707512200173340ustar00rootroot00000000000000#include "fpapi_test.h" program nineth implicit integer (p) integer es1, es2 integer*8 values(10),tvalues(10) integer domain, granularity character*(PAPI_MAX_STR_LEN) domainstr, grnstr integer retval integer clockrate real*8 test_flops, min, max Integer last_char External last_char integer tests_quiet, get_quiet external get_quiet tests_quiet = get_quiet() retval = PAPI_VER_CURRENT call PAPIf_library_init(retval) if ( retval.NE.PAPI_VER_CURRENT) then call ftest_fail(__FILE__, __LINE__, . 
'PAPI_library_init', retval) end if call PAPIf_query_event(PAPI_FP_OPS, retval) if (retval.NE.PAPI_OK) then call ftest_skip(__FILE__, __LINE__, 'PAPI_FP_OPS', PAPI_ENOEVNT) end if call PAPIf_create_eventset(es1, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_create_eventset', *retval) end if call PAPIf_add_event( es1, PAPI_FP_OPS, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event', retval) end if call PAPIf_add_event( es1, PAPI_TOT_CYC, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event', retval) end if call PAPIf_create_eventset(es2, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_create_eventset', *retval) end if call PAPIf_add_event( es2, PAPI_FLOPS, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event', retval) end if call PAPIf_get_clockrate(clockrate) if (tests_quiet .EQ. 0) then print *, 'Clockrate:', clockrate end if call PAPIf_start(es1, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_start', retval) end if call do_flops(NUM_FLOPS) call PAPIf_stop(es1, tvalues(1), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_stop', retval) end if call PAPIf_start(es2, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_start', retval) end if call do_flops(NUM_FLOPS) call PAPIf_stop(es2, values(1), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_stop', retval) end if call PAPIf_remove_event( es1, PAPI_TOT_CYC, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_remove_event', retval) end if call PAPIf_remove_event( es1, PAPI_FP_OPS, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 
'PAPIf_remove_event', retval) end if call PAPIf_remove_event( es2, PAPI_FLOPS, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_remove_event', retval) end if test_flops = tvalues(1)*clockrate*1000000.0 if ( tvalues(2) .NE. 0) then test_flops = test_flops / tvalues(2) else test_flops = 0.0 end if if (tests_quiet .EQ. 0) then print *, "Test case 9: start, stop for derived event PAPI_FLOPS" print *, "---------------------------------------------" end if call PAPIf_get_domain(es1, domain, PAPI_DEFDOM, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_get_domain', retval) end if call stringify_domain(domain, domainstr) if (tests_quiet .EQ. 0) then write (*,900) "Default domain is:", domain, domainstr 900 format(a20, i3, " ", a70) end if call PAPIf_get_granularity(es1, granularity, PAPI_DEFGRN, *retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_get_granularity', *retval) end if call stringify_granularity(granularity, grnstr) if (tests_quiet .EQ. 0) then write (*,800) "Default granularity is:", granularity, grnstr 800 format(a25, i3, " ", a20) print *, " Using", NUM_FLOPS, " iterations of c += b*c" print *, "---------------------------------------------" write (*,810) "Test type :", 1, 2 write (*,810) "PAPI_FP_OPS :", tvalues(1), 0 write (*,810) "PAPI_TOT_CYC:", tvalues(2), 0 write (*,810) "PAPI_FLOPS :", 0, values(1) print *, "---------------------------------------------" 810 format(a15, i15, i15) print *, "Verification:" print *, "Last number in row 3 approximately equals", test_flops end if min = values(1) * 0.9 max = values(1) * 1.1 if ((test_flops.gt.max) .OR. (test_flops.lt.min)) then call ftest_fail(__FILE__, __LINE__, . 
'PAPI_FLOPS', 1) end if call ftests_pass(__FILE__) end papi-papi-7-2-0-t/src/ftests/openmp.F000066400000000000000000000035401502707512200173460ustar00rootroot00000000000000#include "fpapi_test.h" program openmp use omp_lib integer*8 values(10) integer es integer retval Integer last_char External last_char integer tests_quiet, get_quiet external get_quiet integer nthreads, tid tests_quiet = get_quiet() es = PAPI_NULL call PAPIF_thread_init(omp_get_thread_num, retval) retval = PAPI_VER_CURRENT call PAPIf_library_init(retval) if ( retval.NE.PAPI_VER_CURRENT) then call ftest_fail(__FILE__, __LINE__, . 'PAPI_library_init', retval) end if call PAPIf_query_event(PAPI_TOT_CYC, retval) if (retval .NE. PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_query_event', * retval) end if call PAPIf_create_eventset(es, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_create_eventset', * retval) end if call PAPIf_add_event( es, PAPI_TOT_CYC, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event', retval) end if call PAPIf_start(es, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_start', retval) end if !$OMP PARALLEL PRIVATE(NTHREADS, TID) tid = OMP_GET_THREAD_NUM() if (tests_quiet .EQ. 0) then PRINT *, 'Hello World from thread = ', TID end if !$OMP END PARALLEL call PAPIf_stop(es, values(1), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_stop', retval) end if if (tests_quiet .EQ. 
0) then write (*,*) "PAPI_TOT_CYC", values(1) end if call ftests_pass(__FILE__) end papi-papi-7-2-0-t/src/ftests/second.F000066400000000000000000000210531502707512200173220ustar00rootroot00000000000000#include "fpapi_test.h" program second implicit integer (p) integer domain, granularity character*(PAPI_MAX_STR_LEN) domainstr, grnstr integer*8 values(10), max, min integer es1, es2, es3 integer retval Integer last_char External last_char integer tests_quiet, get_quiet external get_quiet #if (defined(sgi) && defined(host_mips)) integer id integer*4 getuid #endif #if (defined(sgi) && defined(host_mips)) id = getuid() #endif tests_quiet = get_quiet() es1 = PAPI_NULL es2 = PAPI_NULL es3 = PAPI_NULL retval = PAPI_VER_CURRENT call PAPIf_library_init(retval) if ( retval.NE.PAPI_VER_CURRENT) then call ftest_fail(__FILE__, __LINE__, . 'PAPI_library_init', retval) end if call PAPIf_query_event(PAPI_TOT_INS, retval) if (retval.NE.PAPI_OK) then call ftest_skip(__FILE__, __LINE__, 'PAPI_FP_INS', PAPI_ENOEVNT) end if call PAPIf_create_eventset(es1, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_create_eventset', *retval) end if call PAPIf_add_event( es1, PAPI_TOT_INS, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event', retval) end if call PAPIf_add_event( es1, PAPI_TOT_CYC, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event', retval) end if call PAPIf_create_eventset(es2, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_create_eventset', *retval) end if call PAPIf_add_event( es2, PAPI_TOT_INS, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event', retval) end if call PAPIf_add_event( es2, PAPI_TOT_CYC, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 
'PAPIf_add_event', retval) end if call PAPIf_create_eventset(es3, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_create_eventset', *retval) end if call PAPIf_add_event( es3, PAPI_TOT_INS, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event', retval) end if call PAPIf_add_event( es3, PAPI_TOT_CYC, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event', retval) end if call PAPIf_set_event_domain(es1, PAPI_DOM_ALL, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_set_domain', retval) end if call PAPIf_set_event_domain(es2, PAPI_DOM_KERNEL, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_set_domain', retval) end if call PAPIf_set_event_domain(es3, PAPI_DOM_USER, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_set_domain', retval) end if call PAPIf_start(es1, retval) call fdo_flops(NUM_FLOPS) if (retval.eq.PAPI_OK) then call PAPIf_stop(es1, values(1), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_stop', retval) end if end if call PAPIf_start(es2, retval) call fdo_flops(NUM_FLOPS) if (retval.eq.PAPI_OK) then call PAPIf_stop(es2, values(3), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_stop', retval) end if end if call PAPIf_start(es3, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_start', retval) end if call fdo_flops(NUM_FLOPS) call PAPIf_stop(es3, values(5), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_stop', retval) end if call PAPIf_remove_event( es1, PAPI_TOT_INS, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_remove_event', retval) end if call PAPIf_remove_event( es1, PAPI_TOT_CYC, retval ) if ( retval .NE. 
PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_remove_event', retval) end if call PAPIf_remove_event( es2, PAPI_TOT_INS, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_remove_event', retval) end if call PAPIf_remove_event( es2, PAPI_TOT_CYC, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_remove_event', retval) end if call PAPIf_remove_event( es3, PAPI_TOT_INS, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_remove_event', retval) end if call PAPIf_remove_event( es3, PAPI_TOT_CYC, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_remove_event', retval) end if if (tests_quiet .EQ. 0) then print *, 'Test case 2: Non-overlapping start, stop, read', *' for all 3 domains.' print *, '-------------------------------------------------'// * '------------------------------' end if call PAPIf_get_domain(es1, domain, PAPI_DEFDOM, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_get_domain', retval) end if call stringify_domain(domain, domainstr) if (tests_quiet .EQ. 0) then write (*,900) 'Default domain is:', domain, domainstr end if 900 format(a20, i3, ' ', a70) call PAPIf_get_granularity(es1, granularity, PAPI_DEFGRN, *retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_get_granularity', *retval) end if call stringify_granularity(granularity, grnstr) if (tests_quiet .EQ. 0) then write (*,800) 'Default granularity is:', granularity, grnstr end if 800 format(a25, i3, ' ', a20) if (tests_quiet .EQ. 
0) then print *, 'Using', NUM_FLOPS, ' iterations of c += b*c' print *, '-------------------------------------------------'// * '------------------------------' print *, 'Test type : PAPI_DOM_ALL PAPI_DOM_KERNEL', *' PAPI_DOM_USER' write (*,200) 'PAPI_TOT_INS', values(1), values(3), values(5) write (*,200) 'PAPI_TOT_CYC', values(2), values(4), values(6) 200 format(A15, ': ', I15, I15, I15) print *, '-------------------------------------------------'// * '------------------------------' print *, 'Verification:' print *, 'Row 1 approximately equals N 0 N' print *, 'Column 1 approximately equals column 2 plus column 3' #if defined(sgi) && defined(host_mips) print * print *, '* IRIX requires root for PAPI_DOM_KERNEL', *' and PAPI_DOM_ALL.' print *, '* The first two columns will be invalid if not', *' run as root for IRIX.' #endif end if #if (defined(sgi) && defined(host_mips)) if (id.NE.0) then min = NUM_FLOPS*0.9 max = NUM_FLOPS*1.1 if ((values(5) .lt. min) .OR. (values(5) .gt. max)) then call ftest_fail(__FILE__, __LINE__, . 'PAPI_FP_INS', 1) end if else min = values(5)*0.9 max = values(5)*1.1 if ((values(1) .lt. min) .OR. (values(1) .gt. max)) then call ftest_fail(__FILE__, __LINE__, . 'PAPI_FP_INS', 1) end if min = values(2)*0.9 max = values(2)*1.1 if (((values(4)+values(6)) .lt. min) .OR. * ((values(4)+values(6)) .gt. max)) then call ftest_fail(__FILE__, __LINE__, 'PAPI_TOT_CYC', 1) end if endif #else min = INT(REAL(values(5))*0.9) max = INT(REAL(values(5))*1.1) if ((values(1) .lt. min) .OR. (values(1) .gt. max)) then call ftest_fail(__FILE__, __LINE__, . 'PAPI_FP_INS', 1) end if min = INT(REAL(values(2))*0.8) max = INT(REAL(values(2))*1.2) if (((values(4)+values(6)) .lt. min) .OR. * ((values(4)+values(6)) .gt. max)) then call ftest_fail(__FILE__, __LINE__, . 
'PAPI_TOT_CYC', 1) end if #endif call ftests_pass(__FILE__) end papi-papi-7-2-0-t/src/ftests/serial_hl.F000066400000000000000000000010651502707512200200120ustar00rootroot00000000000000#include "fpapi.h" program flops integer retval integer i do i = 1, 4 call PAPIf_hl_region_begin("main", retval) if ( retval .NE. PAPI_OK ) then write (*,*) "PAPIf_hl_region_begin failed!" end if write (*,*) 'Round', i call fdo_flops(NUM_FLOPS) call PAPIf_hl_region_end("main", retval) if ( retval .NE. PAPI_OK ) then write (*,*) "PAPIf_hl_region_end failed!" end if end do call ftests_hl_pass(__FILE__) end program flops papi-papi-7-2-0-t/src/ftests/strtest.F000066400000000000000000000244341502707512200175650ustar00rootroot00000000000000C Strtest - Perform some basic tests of the functionality of the C string passing to and from the PAPI Fortran interface. C C Test 1: Look up an event name from an event code. Use this name C to try and locate the event code using the name received. C Long, short and too short strings are used in the tests C C Test 2: Look up a PAPI error string. Use long, short and too C short strings to store the result. C C Test 3: Look up and display event descriptions C using PAPIf_get_event_info. C C Comments: C When using the Fortran interface it may not always be possible to C use the PAPI predefined constants as actual arguments. Due to the C way these values are defined, compilers might occasionally cast them into the C wrong type. In the code below the line code=MSGCODE is used to
C #include "fpapi_test.h" C Set MSGLEN to the number of characters in the named event in MSGCODE #define MSGLEN 11 #define MSGCODE PAPI_L1_DCM #define ERRCODE PAPI_EINVAL program strtest implicit integer (p) CHARACTER*(PAPI_MAX_STR_LEN) papistr CHARACTER*(PAPI_MAX_STR_LEN*2) papidblstr CHARACTER*(PAPI_MAX_STR_LEN) ckstr CHARACTER*(MSGLEN) invstr1 CHARACTER*(MSGLEN+1) invstr2 CHARACTER*(MSGLEN+2) invstr3 CHARACTER*(MSGLEN-1) invstr4 CHARACTER*(MSGLEN-2) invstr5 integer check,lastchar integer code,papicode integer getstrlen external getstrlen integer tests_quiet, get_quiet external get_quiet tests_quiet = get_quiet() check=PAPI_VER_CURRENT call PAPIF_library_init(check) if ( check.NE.PAPI_VER_CURRENT) then call PAPIF_perror( 'PAPI_library_init' ) call ftest_fail(__FILE__, __LINE__, . 'PAPI_library_init', check) end if code=MSGCODE if (tests_quiet .EQ. 0) then print *,'---------------------------------------------------' print *,' Testing PAPIF_name_to_code/PAPIF_code_to_name ' print *,'---------------------------------------------------' print *,' These tests look up an event name and event code' print *,' On no occasion should a NULL character be found(+)' print *,' When strings are too short, the lookup should fail' print * print *,' Tests use the event code ',code print * end if lastchar=PAPI_MAX_STR_LEN call checkstr(code,ckstr,check,lastchar,tests_quiet) lastchar=getstrlen(ckstr) call checkstr(code,invstr1,check,lastchar,tests_quiet) call checkstr(code,invstr2,check,lastchar,tests_quiet) call checkstr(code,invstr3,check,lastchar,tests_quiet) call checkstr(code,invstr4,check,lastchar,tests_quiet) call checkstr(code,invstr5,check,lastchar,tests_quiet) if (tests_quiet .EQ. 0) then print *,'---------------------------------------------------' print *,' Testing PAPIF_descr_event ' print *,'---------------------------------------------------' print *,' These tests should return a PAPI description for' print *,' various event names and argument shapes.' 
print *,' On no occasion should a NULL character be found(+)' print * print 200,'Test 1' end if papistr=" " papicode=PAPI_L1_DCM call test_papif_descr(papistr,papicode,papidblstr, . check,tests_quiet) call checkcode(papicode,PAPI_L1_DCM,tests_quiet) if (tests_quiet .EQ. 0) then print * print 200,'Test 2' end if papistr=" " papicode=PAPI_L2_DCM call test_papif_descr(papistr,papicode,papidblstr, . check,tests_quiet) call checkname(papistr,"PAPI_L2_DCM",tests_quiet) if (tests_quiet .EQ. 0) then print * print 200,'Test 3' end if invstr1=" " papicode=PAPI_L1_ICM call test_papif_descr(invstr1,papicode,papidblstr, . check,tests_quiet) call checkcode(papicode,PAPI_L1_ICM,tests_quiet) if (tests_quiet .EQ. 0) then print * print 200,'Test 4' end if invstr1=" " papicode=PAPI_L2_ICM call test_papif_descr(invstr1,papicode,papidblstr, . check,tests_quiet) call checkname(invstr1,"PAPI_L2_ICM",tests_quiet) if (tests_quiet .EQ. 0) then print * print 200,'Test 5 (This should get a truncated description)' end if invstr2=" " papicode=PAPI_L3_DCM call test_papif_descr(invstr2,papicode,invstr1, . check,tests_quiet) call checkcode(papicode,PAPI_L3_DCM,tests_quiet) if (tests_quiet .EQ. 0) then print * print 200,'Test 6 (This should get a truncated description)' end if invstr2=" " papicode=PAPI_L3_ICM call test_papif_descr(invstr2,papicode,invstr1, . check,tests_quiet) call checkname(invstr2,"PAPI_L3_ICM",tests_quiet) if (tests_quiet .EQ. 0) then print * print 200,'Test 7 (This should get a truncated name)' end if invstr4=" " papicode=PAPI_L1_DCM call test_papif_descr(invstr4,papicode,papistr, . check,tests_quiet) if (tests_quiet .EQ. 0) then call checkname(invstr4,"PAPI_L1_DCM",tests_quiet) end if 200 format(t1,a) if (tests_quiet .EQ. 0) then print *,'---------------------------------------------------' print *,'(+) Fortran implementations that do not provide the' print *,' string argument length might show NULL '// . 'characters.' print *,' This may or may not be OK depending on the '// . 
'Fortran' print *,' compiler. See papi_fwrappers.c and your Fortran' print *,' compiler reference manual.' end if call ftests_pass(__FILE__) end subroutine checkstr(incode,string,check,lastchar,quiet) implicit integer (P) integer incode integer check,lastchar, quiet character*(*) string integer code integer getstrlen external getstrlen 100 format(t1,a,i4) if (quiet .EQ. 0) then print 100,"Testing string length ",len(string) if(len(string).lt.lastchar)then print *,'This call should return an error code.' end if end if code=incode call PAPIF_event_code_to_name(code,string,check) if(check.ne.PAPI_OK)then if (len(string).ge.lastchar)then call ftest_fail(__FILE__, __LINE__, . 'PAPIF_event_code_to_name', check) else if (quiet .EQ. 0) then call PAPIF_perror( 'PAPIF_event_code_to_name' ) print *,'*ERROR* ' print *,'******* '//'Error in checkstr using '// $ 'PAPIF_event_code_to_name' end if end if end if 200 format(t1,a,'"',a,'"') if (quiet .EQ. 0) then print 200,'The event name is: ',string(1:getstrlen(string)) end if call PAPIF_event_name_to_code(string,code,check) if(check.ne.PAPI_OK)then if (len(string).ge.lastchar)then call ftest_fail(__FILE__, __LINE__, . 'PAPIF_event_name_to_code', check) else if (quiet .EQ. 0) then call PAPIF_perror( 'PAPIF_event_name_to_code' ) print *,'*ERROR* ' print *,'******* '//'Error in checkstr using '// $ 'PAPIF_event_name_to_code' end if end if end if call findnull(string,quiet) if (quiet .EQ. 0) then print * end if return end subroutine test_papif_descr(name,code,string,check,quiet) implicit integer (P) integer code,count,flags integer check,quiet character*(*) name,string character*(PAPI_MAX_STR_LEN) label,note integer getstrlen external getstrlen C This API was deprecated with PAPI 3 C call PAPIF_describe_event(name,code,string,check) call PAPIF_get_event_info(code,name,string,label,count, $ note,flags,check) 100 format(t1,a,'"',a,'"') if (quiet .EQ. 
0) then print 100,'The event description is: ', $ string(1:getstrlen(string)) end if if(check.ne.PAPI_OK)then if (quiet .EQ. 0) then call PAPIF_perror( 'PAPI_get_event_info' ) print *,'*ERROR* ' print *,'******* '//'Error in test_papif_descr using '// $ 'PAPIF_get_event_info' else call ftest_fail(__FILE__, __LINE__, . 'PAPIF_get_event_info', check) end if end if call findnull(string,quiet) call findnull(name,quiet) return end integer function getstrlen(string) implicit integer (P) character*(*) string integer i do i=len(string),1,-1 if(string(i:i).ne.' ') then goto 20 end if end do getstrlen=0 return 20 continue getstrlen=i return end subroutine findnull(string,quiet) implicit integer (P) integer quiet,i character*(*) string i=index(string,char(0)) if(i.gt.0)then if(quiet.EQ.0)then print *,'NULL character found in string!!!' else call ftest_fail(__FILE__, __LINE__, . 'NULL character found in string!!!', 0) end if end if return end subroutine checkcode(code,check,quiet) implicit integer (P) integer code integer check,quiet if(code.ne.check)then if(quiet.EQ.0)then print 100,'Code look up failed?' else call ftest_fail(__FILE__, __LINE__, . 'Code look up failed?', 0) end if end if 100 format(t2,a) return end subroutine checkname(name,check,quiet) implicit integer (P) character*(*) name character*(*) check integer i,quiet integer getstrlen i=getstrlen(name) if(name(1:i).ne.check)then if (quiet .eq. 0) then print 100,'PAPI name incorrect?' print 110,'Got: ',name(1:i) print 110,'Expected: ',check else call ftest_fail(__FILE__, __LINE__, . 
'PAPI name incorrect?', 0) end if end if 100 format(t2,a) 110 format(a12,'"',a,'"') return end papi-papi-7-2-0-t/src/ftests/tenth.F000066400000000000000000000156761502707512200172070ustar00rootroot00000000000000#include "fpapi_test.h" #define ITERS 100 #if defined(sun) && defined(sparc) #define CACHE_LEVEL "PAPI_L2_TCM" #define EVT1 PAPI_L2_TCM #define EVT2 PAPI_L2_TCA #define EVT3 PAPI_L2_TCH #define EVT1_STR "PAPI_L2_TCM" #define EVT2_STR "PAPI_L2_TCA" #define EVT3_STR "PAPI_L2_TCH" #else #if defined(__powerpc__) #define CACHE_LEVEL "PAPI_L1_DCA" #define EVT1 PAPI_L1_DCA #define EVT2 PAPI_L1_DCW #define EVT3 PAPI_L1_DCR #define EVT1_STR "PAPI_L1_DCA" #define EVT2_STR "PAPI_L1_DCW" #define EVT3_STR "PAPI_L1_DCR" #else #define CACHE_LEVEL "PAPI_L1_TCM" #define EVT1 PAPI_L1_TCM #define EVT2 PAPI_L1_ICM #define EVT3 PAPI_L1_DCM #define EVT1_STR "PAPI_L1_TCM" #define EVT2_STR "PAPI_L1_ICM" #define EVT3_STR "PAPI_L1_DCM" #endif #endif program tenth implicit integer (p) integer*8 values(10) integer es1, es2, es3 integer*4 mask1, mask2, mask3 integer domain, granularity character*(PAPI_MAX_STR_LEN) domainstr, grnstr integer retval Integer last_char External last_char integer tests_quiet, get_quiet external get_quiet tests_quiet = get_quiet() es1 = PAPI_NULL es2 = PAPI_NULL es3 = PAPI_NULL mask1 = EVT1 mask2 = EVT2 mask3 = EVT3 retval = PAPI_VER_CURRENT call PAPIf_library_init(retval) if ( retval.NE.PAPI_VER_CURRENT) then call ftest_fail(__FILE__, __LINE__, . 
'PAPI_library_init', retval) end if call PAPIf_query_event(mask1, retval) if ( retval.NE.PAPI_OK) then call ftest_skip(__FILE__, __LINE__, .'PAPIf_query_event', retval) end if call PAPIf_query_event(mask2, retval) if ( retval.NE.PAPI_OK) then call ftest_skip(__FILE__, __LINE__, .'PAPIf_query_event', retval) end if call PAPIf_query_event(mask3, retval) if ( retval.NE.PAPI_OK) then call ftest_skip(__FILE__, __LINE__, .'PAPIf_query_event', retval) end if call PAPIf_create_eventset(es1, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_create_eventset', *retval) end if call PAPIf_add_event( es1, mask1, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event', retval) end if call PAPIf_create_eventset(es2, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_create_eventset', *retval) end if call PAPIf_add_event( es2, mask2, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event', retval) end if call PAPIf_create_eventset(es3, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_create_eventset', * retval) end if call PAPIf_add_event( es3, mask3, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event', retval) end if call fdo_l1misses(ITERS) call PAPIf_start(es1, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_start', retval) end if call fdo_l1misses(ITERS) call PAPIf_stop(es1, values(1), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_stop', retval) end if call PAPIf_start(es2, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_start', retval) end if call fdo_l1misses(ITERS) call PAPIf_stop(es2, values(3), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 
'PAPIf_stop', retval) end if call PAPIf_start(es3, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_start', retval) end if call fdo_l1misses(ITERS) call PAPIf_stop(es3, values(5), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_stop', retval) end if call PAPIf_remove_event( es1, mask1, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_remove_event', retval) end if call PAPIf_remove_event( es2, mask2, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_remove_event', retval) end if call PAPIf_remove_event( es3, mask3, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_remove_event', retval) end if if (tests_quiet .EQ. 0) then #if (defined(sun) && defined(sparc)) print *, "Test case 10: start, stop for derived event ", *"PAPI_L2_TCM." #else print *, "Test case 10: start, stop for derived event ", *"PAPI_L1_TCM." #endif print *, "------------------------------------------------------" end if call PAPIf_get_domain(es1, domain, PAPI_DEFDOM, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_get_domain', retval) end if call stringify_domain(domain, domainstr) if (tests_quiet .EQ. 0) then write (*,900) "Default domain is:", domain, domainstr 900 format(a20, i3, " ", a70) end if call PAPIf_get_granularity(es1, granularity, PAPI_DEFGRN, *retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_get_granularity', *retval) end if call stringify_granularity(granularity, grnstr) if (tests_quiet .EQ. 
0) then write (*,800) "Default granularity is:", granularity, grnstr 800 format(a25, i3, " ", a20) print *, "Using", NUM_FLOPS, " iterations of c += b*c" print *, "------------------------------------------------------" write (*,500) "Test type", 1, 2, 3 #if (defined(sun) && defined(sparc)) write (*,500) EVT1_STR, values(1), 0, 0 write (*,500) EVT2_STR, 0, values(3), 0 write (*,500) EVT3_STR, 0, 0, values(5) print *, "------------------------------------------------", *"------" print *, "Verification:" print *, "First number row 1 approximately equals (2,2) - (3,3) ", *"or ",(values(3)-values(5)) #else write (*,500) EVT1_STR, values(1), 0, 0 write (*,500) EVT2_STR, 0, values(3), 0 write (*,500) EVT3_STR, 0, 0, values(5) print *, "------------------------------------------------", *"------" print *, "Verification:" print *, "First number row 1 approximately equals (2,2) + (3,3) ", *"or ", (values(3)+values(5)) #endif end if 500 format(A13, ": ", I10, I10, I10) call ftests_pass(__FILE__) end papi-papi-7-2-0-t/src/ftests/zero.F000066400000000000000000000074371502707512200170400ustar00rootroot00000000000000#include "fpapi_test.h" program zero integer*8 values(10) integer es, event integer*8 uso, usn, cyco, cycn integer domain, granularity character*(PAPI_MAX_STR_LEN) domainstr, grnstr character*(PAPI_MAX_STR_LEN) name integer retval Integer last_char, n External last_char integer tests_quiet, get_quiet external get_quiet tests_quiet = get_quiet() es = PAPI_NULL retval = PAPI_VER_CURRENT call PAPIf_library_init(retval) if ( retval.NE.PAPI_VER_CURRENT) then call ftest_fail(__FILE__, __LINE__, . 'PAPI_library_init', retval) end if call PAPIf_query_event(PAPI_FP_INS, retval) if (retval .NE. PAPI_OK) then event = PAPI_TOT_INS else event = PAPI_FP_INS end if call PAPIf_create_eventset(es, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_create_eventset', * retval) end if call PAPIf_add_event( es, event, retval ) if ( retval .NE. 
PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event', retval) end if call PAPIf_add_event( es, PAPI_TOT_CYC, retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event', retval) end if call PAPIf_get_real_usec(uso) call PAPIf_get_real_cyc(cyco) call PAPIf_start(es, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_start', retval) end if call fdo_flops(NUM_FLOPS) call PAPIf_stop(es, values(1), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_stop', retval) end if call PAPIf_get_real_usec(usn) call PAPIf_get_real_cyc(cycn) if (tests_quiet .EQ. 0) then print *, "Test case 0: start, stop." print *, "-----------------------------------------------", * "--------------------------" end if call PAPIf_get_domain(es, domain, PAPI_DEFDOM, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_get_domain', retval) end if call stringify_domain(domain, domainstr) if (tests_quiet .EQ. 0) then write (*,800) "Default domain is :", domain, domainstr end if call PAPIf_get_granularity(es, granularity, PAPI_DEFGRN, * retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_get_granularity', * retval) end if call stringify_granularity(granularity, grnstr) if (tests_quiet .EQ. 
0) then call PAPIf_event_code_to_name (event, name, retval) if ( retval.NE.PAPI_OK) then call ftest_fail(__FILE__, __LINE__, * 'PAPIf_event_code_to_name', retval) end if n=last_char(name) write (*,800) "Default granularity is:", granularity, grnstr 800 format(a25, i3, " ", a70) write (*,810) "Using", NUM_FLOPS, $ " iterations of c = c + a * b" 810 format(a7, i9, a) print *, "-----------------------------------------------", * "--------------------------" write (*,100) "Test type", 1 write (*,100) name(1:n), values(1) write (*,100) "PAPI_TOT_CYC", values(2) write (*,100) "Real usec", usn-uso write (*,100) "Real cycles", cycn-cyco 100 format(a13, ":", i12) print *, "-----------------------------------------------", * "--------------------------" print *, "Verification: none" endif call ftests_pass(__FILE__) end papi-papi-7-2-0-t/src/ftests/zeronamed.F000066400000000000000000000077531502707512200200460ustar00rootroot00000000000000#include "fpapi_test.h" program zero integer*8 values(10) integer es, event integer*8 uso, usn, cyco, cycn integer domain, granularity character*(PAPI_MAX_STR_LEN) domainstr, grnstr character*(PAPI_MAX_STR_LEN) name integer retval Integer last_char External last_char integer tests_quiet, get_quiet external get_quiet tests_quiet = get_quiet() es = PAPI_NULL retval = PAPI_VER_CURRENT call PAPIf_library_init(retval) if ( retval.NE.PAPI_VER_CURRENT) then call ftest_fail(__FILE__, __LINE__, . 'PAPI_library_init', retval) end if call PAPIf_query_named_event('PAPI_TOT_CYC', retval) if (retval .NE. PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_query_named_event: PAPI_TOT_CYC', retval) end if call PAPIf_query_named_event('PAPI_TOT_INS', retval) if (retval .NE. PAPI_OK) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_query_named_event: PAPI_TOT_INS', retval) end if call PAPIf_create_eventset(es, retval) if ( retval.NE.PAPI_OK) then call ftest_fail( __FILE__, __LINE__, . 
'PAPIf_create_eventset', retval ) end if call PAPIf_add_named_event( es, 'PAPI_TOT_CYC', retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event: PAPI_TOT_CYC', retval) end if call PAPIf_add_named_event( es, 'PAPI_TOT_INS', retval ) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_add_event: PAPI_TOT_INS', retval) end if call PAPIf_get_real_usec(uso) call PAPIf_get_real_cyc(cyco) call PAPIf_start(es, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_start', retval) end if call fdo_flops(NUM_FLOPS) call PAPIf_stop(es, values(1), retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_stop', retval) end if call PAPIf_get_real_usec(usn) call PAPIf_get_real_cyc(cycn) if (tests_quiet .EQ. 0) then print *, "PAPI_{query, add, remove}_named_event API test." print *, "-----------------------------------------------", * "--------------------------" end if call PAPIf_get_domain(es, domain, PAPI_DEFDOM, retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_get_domain', retval) end if call stringify_domain(domain, domainstr) if (tests_quiet .EQ. 0) then write (*,800) "Default domain is :", domain, domainstr end if call PAPIf_get_granularity(es, granularity, PAPI_DEFGRN, * retval) if ( retval .NE. PAPI_OK ) then call ftest_fail(__FILE__, __LINE__, . 'PAPIf_get_granularity', * retval) end if call stringify_granularity(granularity, grnstr) if (tests_quiet .EQ. 
0) then call PAPIf_event_code_to_name (event, name, retval) write (*,800) "Default granularity is:", granularity, grnstr 800 format(a25, i3, " ", a70) write (*,810) "Using", NUM_FLOPS, $ " iterations of c = c + a * b" 810 format(a7, i9, a) print *, "-----------------------------------------------", * "--------------------------" write (*,100) "Test type", 1 write (*,100) "PAPI_TOT_CYC", values(1) write (*,100) "PAPI_TOT_INS", values(2) write (*,100) "Real usec", usn-uso write (*,100) "Real cycles", cycn-cyco 100 format(a13, ":", i12) print *, "-----------------------------------------------", * "--------------------------" print *, "Verification: PAPI_TOT_CYC should be roughly ", * "real_cycles" endif call ftests_pass(__FILE__) end papi-papi-7-2-0-t/src/high-level/000077500000000000000000000000001502707512200164535ustar00rootroot00000000000000papi-papi-7-2-0-t/src/high-level/papi_hl.c000066400000000000000000002243011502707512200202350ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file papi_hl.c * @author Frank Winkler * frank.winkler@icl.utk.edu * @author Philip Mucci * mucci@cs.utk.edu * @brief This file contains the 'high level' interface to PAPI. * BASIC is a high level language. ;-) */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "papi.h" #include "papi_internal.h" /* For dynamic linking to libpapi */ /* Weak symbol for pthread_once to avoid additional linking * against libpthread when not used. 
*/ #pragma weak pthread_once #define verbose_fprintf \ if (verbosity == 1) fprintf /* defaults for number of components and events */ #define PAPIHL_NUM_OF_COMPONENTS 10 #define PAPIHL_NUM_OF_EVENTS_PER_COMPONENT 10 #define PAPIHL_ACTIVE 1 #define PAPIHL_DEACTIVATED 0 /* number of nested regions */ #define PAPIHL_MAX_STACK_SIZE 10 /* global components data begin *****************************************/ typedef struct components { int component_id; int num_of_events; int max_num_of_events; char **event_names; int *event_codes; short *event_types; int EventSet; //only for testing at initialization phase } components_t; components_t *components = NULL; int num_of_components = 0; int max_num_of_components = PAPIHL_NUM_OF_COMPONENTS; int total_num_events = 0; int num_of_cleaned_threads = 0; /* global components data end *******************************************/ /* thread local components data begin ***********************************/ typedef struct local_components { int EventSet; /** Return values for the eventsets */ long_long *values; } local_components_t; THREAD_LOCAL_STORAGE_KEYWORD local_components_t *_local_components = NULL; THREAD_LOCAL_STORAGE_KEYWORD long_long _local_cycles; THREAD_LOCAL_STORAGE_KEYWORD volatile bool _local_state = PAPIHL_ACTIVE; THREAD_LOCAL_STORAGE_KEYWORD unsigned int _local_region_begin_cnt = 0; /**< Count each PAPI_hl_region_begin call */ THREAD_LOCAL_STORAGE_KEYWORD unsigned int _local_region_end_cnt = 0; /**< Count each PAPI_hl_region_end call */ THREAD_LOCAL_STORAGE_KEYWORD unsigned int _local_region_id_stack[PAPIHL_MAX_STACK_SIZE]; THREAD_LOCAL_STORAGE_KEYWORD int _local_region_id_top = -1; /* thread local components data end *************************************/ /* global event storage data begin **************************************/ typedef struct reads { struct reads *next; struct reads *prev; long_long value; /**< Event value */ } reads_t; typedef struct { long_long begin; /**< Event value for region_begin */ long_long 
region_value; /**< Delta value for region_end - region_begin */ reads_t *read_values; /**< List of read event values inside a region */ } value_t; typedef struct regions { unsigned int region_id; /**< Unique region ID */ int parent_region_id; /**< Region ID of parent region */ char *region; /**< Region name */ struct regions *next; struct regions *prev; value_t values[]; /**< Array of event values based on current eventset */ } regions_t; typedef struct { unsigned long key; /**< Thread ID */ regions_t *value; /**< List of regions */ } threads_t; int compar(const void *l, const void *r) { const threads_t *lm = l; const threads_t *lr = r; return lm->key - lr->key; } typedef struct { void *root; /**< Root of binary tree */ threads_t *find_p; /**< Pointer that is used for finding a thread node */ } binary_tree_t; /**< Global binary tree that stores events from all threads */ binary_tree_t* binary_tree = NULL; /* global event storage data end ****************************************/ /* global auxiliary variables begin *************************************/ enum region_type { REGION_BEGIN, REGION_READ, REGION_END }; char **requested_event_names = NULL; /**< Events from user or default */ int num_of_requested_events = 0; bool hl_initiated = false; /**< Check PAPI-HL has been initiated */ bool hl_finalized = false; /**< Check PAPI-HL has been finalized */ bool events_determined = false; /**< Check if events are determined */ bool output_generated = false; /**< Check if output has been already generated */ static char *absolute_output_file_path = NULL; static int output_counter = 0; /**< Count each output generation.
Not used yet */ short verbosity = 0; /**< Verbose output is off by default */ bool state = PAPIHL_ACTIVE; /**< PAPIHL is active until first error or finalization */ static int region_begin_cnt = 0; /**< Count each PAPI_hl_region_begin call */ static int region_end_cnt = 0; /**< Count each PAPI_hl_region_end call */ unsigned long master_thread_id = -1; /**< Remember id of master thread */ /* global auxiliary variables end ***************************************/ static void _internal_hl_library_init(void); static void _internal_hl_onetime_library_init(void); /* functions for creating eventsets for different components */ static int _internal_hl_checkCounter ( char* counter ); static int _internal_hl_determine_rank(); static char *_internal_hl_remove_spaces( char *str, int mode ); static int _internal_hl_determine_default_events(); static int _internal_hl_read_user_events(const char *user_events); static int _internal_hl_new_component(int component_id, components_t *component); static int _internal_hl_add_event_to_component(char *event_name, int event, short event_type, components_t *component); static int _internal_hl_create_components(); static int _internal_hl_read_events(const char* events); static int _internal_hl_create_event_sets(); static int _internal_hl_start_counters(); /* functions for storing events */ static int _internal_hl_region_id_pop(); static int _internal_hl_region_id_push(); static int _internal_hl_region_id_stack_peak(); static inline reads_t* _internal_hl_insert_read_node( reads_t** head_node ); static inline int _internal_hl_add_values_to_region( regions_t *node, enum region_type reg_typ ); static inline regions_t* _internal_hl_insert_region_node( regions_t** head_node, const char *region ); static inline regions_t* _internal_hl_find_region_node( regions_t* head_node, const char *region ); static inline threads_t* _internal_hl_insert_thread_node( unsigned long tid ); static inline threads_t* _internal_hl_find_thread_node( unsigned long tid ); 
static int _internal_hl_store_counters( unsigned long tid, const char *region, enum region_type reg_typ ); static int _internal_hl_read_counters(); static int _internal_hl_read_and_store_counters( const char *region, enum region_type reg_typ ); static int _internal_hl_create_global_binary_tree(); /* functions for output generation */ static int _internal_hl_mkdir(const char *dir); static int _internal_hl_determine_output_path(); static void _internal_hl_json_line_break_and_indent(FILE* f, bool b, int width); static void _internal_hl_json_definitions(FILE* f, bool beautifier); static void _internal_hl_json_region_events(FILE* f, bool beautifier, regions_t *regions); static void _internal_hl_json_regions(FILE* f, bool beautifier, threads_t* thread_node); static void _internal_hl_json_threads(FILE* f, bool beautifier, unsigned long* tids, int threads_num); static int _internal_hl_cmpfunc(const void * a, const void * b); static int _internal_get_sorted_thread_list(unsigned long** tids, int* threads_num); static void _internal_hl_write_json_file(FILE* f, unsigned long* tids, int threads_num); static void _internal_hl_read_json_file(const char* path); static void _internal_hl_write_output(); /* functions for cleaning up heap memory */ static void _internal_hl_clean_up_local_data(); static void _internal_hl_clean_up_global_data(); static void _internal_hl_clean_up_all(bool deactivate); static int _internal_hl_check_for_clean_thread_states(); /* internal advanced functions */ int _internal_PAPI_hl_init(); /**< initialize high level library */ int _internal_PAPI_hl_cleanup_thread(); /**< clean local-thread event sets */ int _internal_PAPI_hl_finalize(); /**< shut down event sets and clean up everything */ int _internal_PAPI_hl_set_events(const char* events); /**< set specific events to be recorded */ void _internal_PAPI_hl_print_output(); /**< generate output */ static void _internal_hl_library_init(void) { /* This function is only called by one thread!
*/ int retval; /* check VERBOSE level */ if ( getenv("PAPI_HL_VERBOSE") != NULL ) { verbosity = 1; } if ( ( retval = PAPI_library_init(PAPI_VER_CURRENT) ) != PAPI_VER_CURRENT ) verbose_fprintf(stdout, "PAPI-HL Error: PAPI_library_init failed!\n"); /* PAPI_thread_init only succeeds if PAPI_library_init has succeeded */ char *multi_thread = getenv("PAPI_HL_THREAD_MULTIPLE"); if ( NULL == multi_thread || atoi(multi_thread) == 1 ) { retval = PAPI_thread_init(_papi_gettid); } else { retval = PAPI_thread_init(_papi_getpid); } if (retval == PAPI_OK) { /* determine output directory and output file */ if ( ( retval = _internal_hl_determine_output_path() ) != PAPI_OK ) { verbose_fprintf(stdout, "PAPI-HL Error: _internal_hl_determine_output_path failed!\n"); state = PAPIHL_DEACTIVATED; verbose_fprintf(stdout, "PAPI-HL Error: PAPI could not be initiated!\n"); } else { /* register the termination function for output */ atexit(_internal_PAPI_hl_print_output); verbose_fprintf(stdout, "PAPI-HL Info: PAPI has been initiated!\n"); /* remember thread id */ master_thread_id = PAPI_thread_id(); HLDBG("master_thread_id=%lu\n", master_thread_id); } /* Support multiplexing if user wants to */ if ( getenv("PAPI_MULTIPLEX") != NULL ) { retval = PAPI_multiplex_init(); if ( retval == PAPI_ENOSUPP) { verbose_fprintf(stdout, "PAPI-HL Info: Multiplex is not supported!\n"); } else if ( retval != PAPI_OK ) { verbose_fprintf(stdout, "PAPI-HL Error: PAPI_multiplex_init failed!\n"); } else { verbose_fprintf(stdout, "PAPI-HL Info: Multiplex has been initiated!\n"); } } } else { verbose_fprintf(stdout, "PAPI-HL Error: PAPI_thread_init failed!\n"); state = PAPIHL_DEACTIVATED; verbose_fprintf(stdout, "PAPI-HL Error: PAPI could not be initiated!\n"); } hl_initiated = true; } static void _internal_hl_onetime_library_init(void) { static pthread_once_t library_is_initialized = PTHREAD_ONCE_INIT; if ( pthread_once ) { /* we assume that PAPI_hl_init() is called from a parallel region */
pthread_once(&library_is_initialized, _internal_hl_library_init); /* wait until first thread has finished */ int i = 0; /* give it 5 seconds in case PAPI_thread_init crashes */ while ( !hl_initiated && (i++) < 500000 ) usleep(10); } else { /* we assume that PAPI_hl_init() is called from a serial application * that was not linked against libpthread */ _internal_hl_library_init(); } } static int _internal_hl_checkCounter ( char* counter ) { int EventSet = PAPI_NULL; int eventcode; int retval; HLDBG("Counter: %s\n", counter); if ( ( retval = PAPI_create_eventset( &EventSet ) ) != PAPI_OK ) return ( retval ); if ( ( retval = PAPI_event_name_to_code( counter, &eventcode ) ) != PAPI_OK ) { HLDBG("Counter %s does not exist\n", counter); /* destroy the EventSet before returning to avoid leaking it */ PAPI_destroy_eventset (&EventSet); return ( retval ); } if ( ( retval = PAPI_add_event (EventSet, eventcode) ) != PAPI_OK ) { HLDBG("Cannot add counter %s\n", counter); PAPI_cleanup_eventset (EventSet); PAPI_destroy_eventset (&EventSet); return ( retval ); } if ( ( retval = PAPI_cleanup_eventset (EventSet) ) != PAPI_OK ) return ( retval ); if ( ( retval = PAPI_destroy_eventset (&EventSet) ) != PAPI_OK ) return ( retval ); return ( PAPI_OK ); } static int _internal_hl_determine_rank() { int rank = -1; /* check environment variables for rank identification */ if ( getenv("OMPI_COMM_WORLD_RANK") != NULL ) rank = atoi(getenv("OMPI_COMM_WORLD_RANK")); else if ( getenv("ALPS_APP_PE") != NULL ) rank = atoi(getenv("ALPS_APP_PE")); else if ( getenv("PMI_RANK") != NULL ) rank = atoi(getenv("PMI_RANK")); else if ( getenv("SLURM_PROCID") != NULL ) rank = atoi(getenv("SLURM_PROCID")); return rank; } static char *_internal_hl_remove_spaces( char *str, int mode ) { char *out = str, *put = str; for(; *str != '\0'; ++str) { if ( mode == 0 ) { if(*str != ' ') *put++ = *str; } else { while (*str == ' ' && *(str + 1) == ' ') str++; *put++ = *str; } } *put = '\0'; return out; } static int _internal_hl_determine_default_events() { int i; HLDBG("Default events\n"); char *default_events[] = { "PAPI_TOT_CYC", }; int num_of_defaults = sizeof(default_events) /
sizeof(char*); /* allocate memory for requested events */ requested_event_names = (char**)malloc(num_of_defaults * sizeof(char*)); if ( requested_event_names == NULL ) return ( PAPI_ENOMEM ); /* check if default events are available on the current machine */ for ( i = 0; i < num_of_defaults; i++ ) { if ( _internal_hl_checkCounter( default_events[i] ) == PAPI_OK ) { requested_event_names[num_of_requested_events++] = strdup(default_events[i]); if ( requested_event_names[num_of_requested_events -1] == NULL ) return ( PAPI_ENOMEM ); } else { /* if PAPI_FP_OPS is not available try PAPI_SP_OPS or PAPI_DP_OPS */ if ( strcmp(default_events[i], "PAPI_FP_OPS") == 0 ) { if ( _internal_hl_checkCounter( "PAPI_SP_OPS" ) == PAPI_OK ) requested_event_names[num_of_requested_events++] = strdup("PAPI_SP_OPS"); else if ( _internal_hl_checkCounter( "PAPI_DP_OPS" ) == PAPI_OK ) requested_event_names[num_of_requested_events++] = strdup("PAPI_DP_OPS"); } /* if PAPI_FP_INS is not available try PAPI_VEC_SP or PAPI_VEC_DP */ if ( strcmp(default_events[i], "PAPI_FP_INS") == 0 ) { if ( _internal_hl_checkCounter( "PAPI_VEC_SP" ) == PAPI_OK ) requested_event_names[num_of_requested_events++] = strdup("PAPI_VEC_SP"); else if ( _internal_hl_checkCounter( "PAPI_VEC_DP" ) == PAPI_OK ) requested_event_names[num_of_requested_events++] = strdup("PAPI_VEC_DP"); } } } return ( PAPI_OK ); } static int _internal_hl_read_user_events(const char *user_events) { char* user_events_copy; const char *separator; //separator for events int num_of_req_events = 1; //number of events in string int req_event_index = 0; //index of event const char *position = NULL; //current position in processed string char *token; HLDBG("User events: %s\n", user_events); user_events_copy = strdup(user_events); if ( user_events_copy == NULL ) return ( PAPI_ENOMEM ); /* check if string is not empty */ if ( strlen( user_events_copy ) > 0 ) { /* count number of separator characters */ position = user_events_copy; separator=","; while ( 
*position ) { if ( strchr( separator, *position ) ) { num_of_req_events++; } position++; } /* allocate memory for requested events */ requested_event_names = (char**)malloc(num_of_req_events * sizeof(char*)); if ( requested_event_names == NULL ) { free(user_events_copy); return ( PAPI_ENOMEM ); } /* parse list of event names */ token = strtok( user_events_copy, separator ); while ( token ) { if ( req_event_index >= num_of_req_events ){ /* more entries as in the first run */ free(user_events_copy); return PAPI_EINVAL; } requested_event_names[req_event_index] = strdup(_internal_hl_remove_spaces(token, 0)); if ( requested_event_names[req_event_index] == NULL ) { free(user_events_copy); return ( PAPI_ENOMEM ); } token = strtok( NULL, separator ); req_event_index++; } } num_of_requested_events = req_event_index; free(user_events_copy); if ( num_of_requested_events == 0 ) return PAPI_EINVAL; HLDBG("Number of requested events: %d\n", num_of_requested_events); return ( PAPI_OK ); } static int _internal_hl_new_component(int component_id, components_t *component) { int retval; /* create new EventSet */ component->EventSet = PAPI_NULL; if ( ( retval = PAPI_create_eventset( &component->EventSet ) ) != PAPI_OK ) { verbose_fprintf(stdout, "PAPI-HL Error: Cannot create EventSet for component %d.\n", component_id); return ( retval ); } /* Support multiplexing if user wants to */ if ( getenv("PAPI_MULTIPLEX") != NULL ) { /* multiplex only for cpu core events */ if ( component_id == 0 ) { retval = PAPI_assign_eventset_component(component->EventSet, component_id); if ( retval != PAPI_OK ) { verbose_fprintf(stdout, "PAPI-HL Error: PAPI_assign_eventset_component failed.\n"); } else { if ( PAPI_get_multiplex(component->EventSet) == false ) { retval = PAPI_set_multiplex(component->EventSet); if ( retval != PAPI_OK ) { verbose_fprintf(stdout, "PAPI-HL Error: PAPI_set_multiplex failed.\n"); } } } } } component->component_id = component_id; component->num_of_events = 0; 
component->max_num_of_events = PAPIHL_NUM_OF_EVENTS_PER_COMPONENT; component->event_names = NULL; component->event_names = (char**)malloc(component->max_num_of_events * sizeof(char*)); if ( component->event_names == NULL ) return ( PAPI_ENOMEM ); component->event_codes = NULL; component->event_codes = (int*)malloc(component->max_num_of_events * sizeof(int)); if ( component->event_codes == NULL ) return ( PAPI_ENOMEM ); component->event_types = NULL; component->event_types = (short*)malloc(component->max_num_of_events * sizeof(short)); if ( component->event_types == NULL ) return ( PAPI_ENOMEM ); num_of_components += 1; return ( PAPI_OK ); } static int _internal_hl_add_event_to_component(char *event_name, int event, short event_type, components_t *component) { int i, retval; /* check if we need to reallocate memory for event_names, event_codes and event_types */ if ( component->num_of_events == component->max_num_of_events ) { component->max_num_of_events *= 2; component->event_names = (char**)realloc(component->event_names, component->max_num_of_events * sizeof(char*)); if ( component->event_names == NULL ) return ( PAPI_ENOMEM ); component->event_codes = (int*)realloc(component->event_codes, component->max_num_of_events * sizeof(int)); if ( component->event_codes == NULL ) return ( PAPI_ENOMEM ); component->event_types = (short*)realloc(component->event_types, component->max_num_of_events * sizeof(short)); if ( component->event_types == NULL ) return ( PAPI_ENOMEM ); } retval = PAPI_add_event( component->EventSet, event ); if ( retval != PAPI_OK ) { const PAPI_component_info_t* cmpinfo; cmpinfo = PAPI_get_component_info( component->component_id ); verbose_fprintf(stdout, "PAPI-HL Warning: Cannot add %s to component %s.\n", event_name, cmpinfo->name); verbose_fprintf(stdout, "The following event combination is not supported:\n"); for ( i = 0; i < component->num_of_events; i++ ) verbose_fprintf(stdout, " %s\n", component->event_names[i]); verbose_fprintf(stdout, " 
%s\n", event_name); verbose_fprintf(stdout, "Advice: Use papi_event_chooser to obtain an appropriate event set for this component or set PAPI_MULTIPLEX=1.\n"); return PAPI_EINVAL; } component->event_names[component->num_of_events] = event_name; component->event_codes[component->num_of_events] = event; component->event_types[component->num_of_events] = event_type; component->num_of_events += 1; total_num_events += 1; return PAPI_OK; } static int _internal_hl_create_components() { int i, j, retval, event; int component_id = -1; int comp_index = 0; bool component_exists = false; short event_type = 0; HLDBG("Create components\n"); components = (components_t*)malloc(max_num_of_components * sizeof(components_t)); if ( components == NULL ) return ( PAPI_ENOMEM ); for ( i = 0; i < num_of_requested_events; i++ ) { /* check if requested event contains event type (instant or delta) */ const char sep = '='; char *ret; int index; /* search for '=' in event name */ ret = strchr(requested_event_names[i], sep); if (ret) { if ( strcmp(ret, "=instant") == 0 ) event_type = 1; else event_type = 0; /* get index of '=' in event name */ index = (int)(ret - requested_event_names[i]); /* remove event type from string if '=instant' or '=delta' */ if ( (strcmp(ret, "=instant") == 0) || (strcmp(ret, "=delta") == 0) ) requested_event_names[i][index] = '\0'; } /* change event type to instantaneous for specific events */ /* we consider all nvml events as instantaneous values */ if( (strstr(requested_event_names[i], "nvml:::") != NULL) ) { event_type = 1; verbose_fprintf(stdout, "PAPI-HL Info: The event \"%s\" will be stored as instantaneous value.\n", requested_event_names[i]); } /* check if event is supported on current machine */ retval = _internal_hl_checkCounter(requested_event_names[i]); if ( retval != PAPI_OK ) { verbose_fprintf(stdout, "PAPI-HL Warning: \"%s\" does not exist or is not supported on this machine.\n", requested_event_names[i]); } else { /* determine event code and 
corresponding component id */ retval = PAPI_event_name_to_code( requested_event_names[i], &event ); if ( retval != PAPI_OK ) return ( retval ); component_id = PAPI_COMPONENT_INDEX( event ); /* check if component_id already exists in global components structure */ for ( j = 0; j < num_of_components; j++ ) { if ( components[j].component_id == component_id ) { component_exists = true; comp_index = j; break; } else { component_exists = false; } } /* create new component */ if ( false == component_exists ) { /* check if we need to reallocate memory for components */ if ( num_of_components == max_num_of_components ) { max_num_of_components *= 2; components = (components_t*)realloc(components, max_num_of_components * sizeof(components_t)); if ( components == NULL ) return ( PAPI_ENOMEM ); } comp_index = num_of_components; retval = _internal_hl_new_component(component_id, &components[comp_index]); if ( retval != PAPI_OK ) return ( retval ); } /* add event to current component */ retval = _internal_hl_add_event_to_component(requested_event_names[i], event, event_type, &components[comp_index]); if ( retval == PAPI_ENOMEM ) return ( retval ); } } HLDBG("Number of components %d\n", num_of_components); if ( num_of_components > 0 ) verbose_fprintf(stdout, "PAPI-HL Info: Using the following events:\n"); /* destroy all EventSets from global data */ for ( i = 0; i < num_of_components; i++ ) { if ( ( retval = PAPI_cleanup_eventset (components[i].EventSet) ) != PAPI_OK ) return ( retval ); if ( ( retval = PAPI_destroy_eventset (&components[i].EventSet) ) != PAPI_OK ) return ( retval ); components[i].EventSet = PAPI_NULL; HLDBG("component_id = %d\n", components[i].component_id); HLDBG("num_of_events = %d\n", components[i].num_of_events); for ( j = 0; j < components[i].num_of_events; j++ ) { HLDBG(" %s type=%d\n", components[i].event_names[j], components[i].event_types[j]); verbose_fprintf(stdout, " %s\n", components[i].event_names[j]); } } if ( num_of_components == 0 ) return 
PAPI_EINVAL; return PAPI_OK; } static int _internal_hl_read_events(const char* events) { int i, retval; HLDBG("Read events: %s\n", events); if ( events != NULL ) { if ( _internal_hl_read_user_events(events) != PAPI_OK ) if ( ( retval = _internal_hl_determine_default_events() ) != PAPI_OK ) return ( retval ); /* check if user specified events via environment variable */ } else if ( getenv("PAPI_EVENTS") != NULL ) { char *user_events_from_env = strdup( getenv("PAPI_EVENTS") ); if ( user_events_from_env == NULL ) return ( PAPI_ENOMEM ); /* if string is empty use default events */ if ( strlen( user_events_from_env ) == 0 ) { if ( ( retval = _internal_hl_determine_default_events() ) != PAPI_OK ) { free(user_events_from_env); return ( retval ); } } else if ( _internal_hl_read_user_events(user_events_from_env) != PAPI_OK ) if ( ( retval = _internal_hl_determine_default_events() ) != PAPI_OK ) { free(user_events_from_env); return ( retval ); } free(user_events_from_env); } else { if ( ( retval = _internal_hl_determine_default_events() ) != PAPI_OK ) return ( retval ); } /* create components based on requested events */ if ( _internal_hl_create_components() != PAPI_OK ) { /* requested events do not work at all, use default events */ verbose_fprintf(stdout, "PAPI-HL Warning: None of the requested events work; using default events.\n"); for ( i = 0; i < num_of_requested_events; i++ ) free(requested_event_names[i]); free(requested_event_names); num_of_requested_events = 0; if ( ( retval = _internal_hl_determine_default_events() ) != PAPI_OK ) return ( retval ); if ( ( retval = _internal_hl_create_components() ) != PAPI_OK ) return ( retval ); } events_determined = true; return ( PAPI_OK ); } static int _internal_hl_create_event_sets() { int i, j, retval; if ( state == PAPIHL_ACTIVE ) { /* allocate memory for local components */ _local_components = (local_components_t*)malloc(num_of_components * sizeof(local_components_t)); if ( _local_components == NULL ) return ( PAPI_ENOMEM ); for (
i = 0; i < num_of_components; i++ ) { /* create EventSet */ _local_components[i].EventSet = PAPI_NULL; if ( ( retval = PAPI_create_eventset( &_local_components[i].EventSet ) ) != PAPI_OK ) { return ( retval ); } /* Support multiplexing if user wants to */ if ( getenv("PAPI_MULTIPLEX") != NULL ) { /* multiplex only for cpu core events */ if ( components[i].component_id == 0 ) { retval = PAPI_assign_eventset_component(_local_components[i].EventSet, components[i].component_id ); if ( retval != PAPI_OK ) { verbose_fprintf(stdout, "PAPI-HL Error: PAPI_assign_eventset_component failed.\n"); } else { if ( PAPI_get_multiplex(_local_components[i].EventSet) == false ) { retval = PAPI_set_multiplex(_local_components[i].EventSet); if ( retval != PAPI_OK ) { verbose_fprintf(stdout, "PAPI-HL Error: PAPI_set_multiplex failed.\n"); } } } } } /* add event to current EventSet */ for ( j = 0; j < components[i].num_of_events; j++ ) { retval = PAPI_add_event( _local_components[i].EventSet, components[i].event_codes[j] ); if ( retval != PAPI_OK ) { return ( retval ); } } /* allocate memory for return values */ _local_components[i].values = (long_long*)malloc(components[i].num_of_events * sizeof(long_long)); if ( _local_components[i].values == NULL ) return ( PAPI_ENOMEM ); } return PAPI_OK; } return ( PAPI_EMISC ); } static int _internal_hl_start_counters() { int i, retval; long_long cycles; if ( state == PAPIHL_ACTIVE ) { for ( i = 0; i < num_of_components; i++ ) { if ( ( retval = PAPI_start( _local_components[i].EventSet ) ) != PAPI_OK ) return ( retval ); /* warm up PAPI code paths and data structures */ if ( ( retval = PAPI_read_ts( _local_components[i].EventSet, _local_components[i].values, &cycles ) ) != PAPI_OK ) { return ( retval ); } } _papi_hl_events_running = 1; return PAPI_OK; } return ( PAPI_EMISC ); } static int _internal_hl_region_id_pop() { if ( _local_region_id_top == -1 ) { return PAPI_ENOEVNT; } else { _local_region_id_top--; } return PAPI_OK; } static int
_internal_hl_region_id_push() { if ( _local_region_id_top == PAPIHL_MAX_STACK_SIZE ) { return PAPI_ENOMEM; } else { _local_region_id_top++; _local_region_id_stack[_local_region_id_top] = _local_region_begin_cnt; } return PAPI_OK; } static int _internal_hl_region_id_stack_peak() { if ( _local_region_id_top == -1 ) { return -1; } else { return _local_region_id_stack[_local_region_id_top]; } } static inline reads_t* _internal_hl_insert_read_node(reads_t** head_node) { reads_t *new_node; /* create new region node */ if ( ( new_node = malloc(sizeof(reads_t)) ) == NULL ) return ( NULL ); new_node->next = NULL; new_node->prev = NULL; /* insert node in list */ if ( *head_node == NULL ) { *head_node = new_node; return new_node; } (*head_node)->prev = new_node; new_node->next = *head_node; *head_node = new_node; return new_node; } static inline int _internal_hl_add_values_to_region( regions_t *node, enum region_type reg_typ ) { int i, j; long_long ts; int cmp_iter = 2; /* get timestamp */ ts = PAPI_get_real_nsec(); if ( reg_typ == REGION_BEGIN ) { /* set first fixed counters */ node->values[0].begin = _local_cycles; node->values[1].begin = ts; /* events from components */ for ( i = 0; i < num_of_components; i++ ) for ( j = 0; j < components[i].num_of_events; j++ ) node->values[cmp_iter++].begin = _local_components[i].values[j]; } else if ( reg_typ == REGION_READ ) { /* create a new read node and add values*/ reads_t* read_node; if ( ( read_node = _internal_hl_insert_read_node(&node->values[0].read_values) ) == NULL ) return ( PAPI_ENOMEM ); read_node->value = _local_cycles - node->values[0].begin; if ( ( read_node = _internal_hl_insert_read_node(&node->values[1].read_values) ) == NULL ) return ( PAPI_ENOMEM ); read_node->value = ts - node->values[1].begin; for ( i = 0; i < num_of_components; i++ ) { for ( j = 0; j < components[i].num_of_events; j++ ) { if ( ( read_node = _internal_hl_insert_read_node(&node->values[cmp_iter].read_values) ) == NULL ) return ( PAPI_ENOMEM ); if 
( components[i].event_types[j] == 1 ) read_node->value = _local_components[i].values[j]; else read_node->value = _local_components[i].values[j] - node->values[cmp_iter].begin; cmp_iter++; } } } else if ( reg_typ == REGION_END ) { /* determine difference of current value and begin */ node->values[0].region_value = _local_cycles - node->values[0].begin; node->values[1].region_value = ts - node->values[1].begin; /* events from components */ for ( i = 0; i < num_of_components; i++ ) for ( j = 0; j < components[i].num_of_events; j++ ) { /* if event type is instantaneous only save last value */ if ( components[i].event_types[j] == 1 ) { node->values[cmp_iter].region_value = _local_components[i].values[j]; } else { node->values[cmp_iter].region_value = _local_components[i].values[j] - node->values[cmp_iter].begin; } cmp_iter++; } } return ( PAPI_OK ); } static inline regions_t* _internal_hl_insert_region_node(regions_t** head_node, const char *region ) { regions_t *new_node; int i; int extended_total_num_events; /* number of all events including CPU cycles and real time */ extended_total_num_events = total_num_events + 2; /* create new region node */ new_node = malloc(sizeof(regions_t) + extended_total_num_events * sizeof(value_t)); if ( new_node == NULL ) return ( NULL ); new_node->region = (char *)malloc((strlen(region) + 1) * sizeof(char)); if ( new_node->region == NULL ) { free(new_node); return ( NULL ); } new_node->next = NULL; new_node->prev = NULL; new_node->region_id = _local_region_begin_cnt; new_node->parent_region_id = _internal_hl_region_id_stack_peak(); strcpy(new_node->region, region); for ( i = 0; i < extended_total_num_events; i++ ) { new_node->values[i].read_values = NULL; } /* insert node in list */ if ( *head_node == NULL ) { *head_node = new_node; return new_node; } (*head_node)->prev = new_node; new_node->next = *head_node; *head_node = new_node; return new_node; } static inline regions_t* _internal_hl_find_region_node(regions_t* head_node, const 
char *region ) { regions_t* find_node = head_node; while ( find_node != NULL ) { if ( ((int)find_node->region_id == _internal_hl_region_id_stack_peak()) && (strcmp(find_node->region, region) == 0) ) { return find_node; } find_node = find_node->next; } find_node = NULL; return find_node; } static inline threads_t* _internal_hl_insert_thread_node(unsigned long tid) { threads_t *new_node = (threads_t*)malloc(sizeof(threads_t)); if ( new_node == NULL ) return ( NULL ); new_node->key = tid; new_node->value = NULL; /* head node of region list */ tsearch(new_node, &binary_tree->root, compar); return new_node; } static inline threads_t* _internal_hl_find_thread_node(unsigned long tid) { threads_t *find_node = binary_tree->find_p; find_node->key = tid; void *found = tfind(find_node, &binary_tree->root, compar); if ( found != NULL ) { find_node = (*(threads_t**)found); return find_node; } return NULL; } static int _internal_hl_store_counters( unsigned long tid, const char *region, enum region_type reg_typ ) { int retval; _papi_hwi_lock( HIGHLEVEL_LOCK ); threads_t* current_thread_node; /* check if current thread is already stored in tree */ current_thread_node = _internal_hl_find_thread_node(tid); if ( current_thread_node == NULL ) { /* insert new node for current thread in tree if type is REGION_BEGIN */ if ( reg_typ == REGION_BEGIN ) { if ( ( current_thread_node = _internal_hl_insert_thread_node(tid) ) == NULL ) { _papi_hwi_unlock( HIGHLEVEL_LOCK ); return ( PAPI_ENOMEM ); } } else { _papi_hwi_unlock( HIGHLEVEL_LOCK ); return ( PAPI_EINVAL ); } } regions_t* current_region_node; if ( reg_typ == REGION_READ || reg_typ == REGION_END ) { current_region_node = _internal_hl_find_region_node(current_thread_node->value, region); if ( current_region_node == NULL ) { if ( reg_typ == REGION_READ ) { /* ignore no matching REGION_READ */ verbose_fprintf(stdout, "PAPI-HL Warning: Cannot find matching region for PAPI_hl_read(\"%s\") for thread id=%lu.\n", region, PAPI_thread_id()); 
retval = PAPI_OK; } else { verbose_fprintf(stdout, "PAPI-HL Warning: Cannot find matching region for PAPI_hl_region_end(\"%s\") for thread id=%lu.\n", region, PAPI_thread_id()); retval = PAPI_EINVAL; } _papi_hwi_unlock( HIGHLEVEL_LOCK ); return ( retval ); } } else { /* create new node for current region in list if type is REGION_BEGIN */ if ( ( current_region_node = _internal_hl_insert_region_node(&current_thread_node->value, region) ) == NULL ) { _papi_hwi_unlock( HIGHLEVEL_LOCK ); return ( PAPI_ENOMEM ); } } /* add recorded values to current region */ if ( ( retval = _internal_hl_add_values_to_region( current_region_node, reg_typ ) ) != PAPI_OK ) { _papi_hwi_unlock( HIGHLEVEL_LOCK ); return ( retval ); } /* count all REGION_BEGIN and REGION_END calls */ if ( reg_typ == REGION_BEGIN ) region_begin_cnt++; if ( reg_typ == REGION_END ) region_end_cnt++; _papi_hwi_unlock( HIGHLEVEL_LOCK ); return ( PAPI_OK ); } static int _internal_hl_read_counters() { int i, j, retval; for ( i = 0; i < num_of_components; i++ ) { if ( i < ( num_of_components - 1 ) ) { retval = PAPI_read( _local_components[i].EventSet, _local_components[i].values); } else { /* get cycles for last component */ retval = PAPI_read_ts( _local_components[i].EventSet, _local_components[i].values, &_local_cycles ); } HLDBG("Thread-ID:%lu, Component-ID:%d\n", PAPI_thread_id(), components[i].component_id); for ( j = 0; j < components[i].num_of_events; j++ ) { HLDBG("Thread-ID:%lu, %s:%lld\n", PAPI_thread_id(), components[i].event_names[j], _local_components[i].values[j]); } if ( retval != PAPI_OK ) return ( retval ); } return ( PAPI_OK ); } static int _internal_hl_read_and_store_counters( const char *region, enum region_type reg_typ ) { int retval; /* read all events */ if ( ( retval = _internal_hl_read_counters() ) != PAPI_OK ) { verbose_fprintf(stdout, "PAPI-HL Error: Could not read counters for thread %lu.\n", PAPI_thread_id()); _internal_hl_clean_up_all(true); return ( retval ); } /* store all events */ if ( (
retval = _internal_hl_store_counters( PAPI_thread_id(), region, reg_typ) ) != PAPI_OK ) { verbose_fprintf(stdout, "PAPI-HL Error: Could not store counters for thread %lu.\n", PAPI_thread_id()); verbose_fprintf(stdout, "PAPI-HL Advice: Check if your regions are matching.\n"); _internal_hl_clean_up_all(true); return ( retval ); } return ( PAPI_OK ); } static int _internal_hl_create_global_binary_tree() { if ( ( binary_tree = (binary_tree_t*)malloc(sizeof(binary_tree_t)) ) == NULL ) return ( PAPI_ENOMEM ); binary_tree->root = NULL; if ( ( binary_tree->find_p = (threads_t*)malloc(sizeof(threads_t)) ) == NULL ) return ( PAPI_ENOMEM ); return ( PAPI_OK ); } static int _internal_hl_mkdir(const char *dir) { int retval; /* errno comes from <errno.h>; a local declaration would shadow it */ char *tmp = NULL; char *p = NULL; size_t len; if ( ( tmp = strdup(dir) ) == NULL ) return ( PAPI_ENOMEM ); len = strlen(tmp); /* check if there is a file with the same name as the output directory */ struct stat buf; if ( stat(dir, &buf) == 0 && S_ISREG(buf.st_mode) ) { verbose_fprintf(stdout, "PAPI-HL Error: Name conflict with measurement directory and existing file.\n"); free(tmp); return ( PAPI_ESYS ); } if(tmp[len - 1] == '/') tmp[len - 1] = 0; for(p = tmp + 1; *p; p++) { if(*p == '/') { *p = 0; errno = 0; retval = mkdir(tmp, S_IRWXU); *p = '/'; if ( retval != 0 && errno != EEXIST ) { free(tmp); return ( PAPI_ESYS ); } } } retval = mkdir(tmp, S_IRWXU); free(tmp); if ( retval != 0 && errno != EEXIST ) return ( PAPI_ESYS ); return ( PAPI_OK ); } static int _internal_hl_determine_output_path() { /* check if PAPI_OUTPUT_DIRECTORY is set */ char *output_prefix = NULL; if ( getenv("PAPI_OUTPUT_DIRECTORY") != NULL ) { if ( ( output_prefix = strdup( getenv("PAPI_OUTPUT_DIRECTORY") ) ) == NULL ) return ( PAPI_ENOMEM ); } else { if ( ( output_prefix = strdup( getcwd(NULL,0) ) ) == NULL ) return ( PAPI_ENOMEM ); } /* generate absolute path for measurement directory */ if ( ( absolute_output_file_path = (char *)malloc((strlen(output_prefix) + 64) * sizeof(char)) )
== NULL ) { free(output_prefix); return ( PAPI_ENOMEM ); } if ( output_counter > 0 ) sprintf(absolute_output_file_path, "%s/papi_hl_output_%d", output_prefix, output_counter); else sprintf(absolute_output_file_path, "%s/papi_hl_output", output_prefix); /* check if directory already exists */ struct stat buf; if ( stat(absolute_output_file_path, &buf) == 0 && S_ISDIR(buf.st_mode) ) { /* rename old directory by adding a timestamp */ char *new_absolute_output_file_path = NULL; if ( ( new_absolute_output_file_path = (char *)malloc((strlen(absolute_output_file_path) + 64) * sizeof(char)) ) == NULL ) { free(output_prefix); free(absolute_output_file_path); return ( PAPI_ENOMEM ); } /* create timestamp */ time_t t = time(NULL); struct tm tm = *localtime(&t); char m_time[32]; sprintf(m_time, "%d%02d%02d-%02d%02d%02d", tm.tm_year+1900, tm.tm_mon + 1, tm.tm_mday, tm.tm_hour, tm.tm_min, tm.tm_sec); /* add timestamp to existing folder string */ sprintf(new_absolute_output_file_path, "%s-%s", absolute_output_file_path, m_time); uintmax_t current_unix_time = (uintmax_t)t; uintmax_t unix_time_from_old_directory = buf.st_mtime; /* This is a workaround for MPI applications!!! * Only rename existing measurement directory when it is older than * current timestamp. If it's not, we assume that another MPI process already created a * new measurement directory. 
*/ if ( unix_time_from_old_directory < current_unix_time ) { if ( rename(absolute_output_file_path, new_absolute_output_file_path) != 0 ) { verbose_fprintf(stdout, "PAPI-HL Warning: Cannot rename old measurement directory.\n"); verbose_fprintf(stdout, "If you use MPI, another process may have already renamed the directory.\n"); } } free(new_absolute_output_file_path); } free(output_prefix); output_counter++; return ( PAPI_OK ); } static void _internal_hl_json_line_break_and_indent( FILE* f, bool b, int width ) { int i; if ( b ) { fprintf(f, "\n"); for ( i = 0; i < width; ++i ) fprintf(f, " "); } } static void _internal_hl_json_definitions(FILE* f, bool beautifier) { int num_events, i, j; _internal_hl_json_line_break_and_indent(f, beautifier, 1); fprintf(f, "\"event_definitions\":{"); /* get all events + types */ num_events = 1; for ( i = 0; i < num_of_components; i++ ) { for ( j = 0; j < components[i].num_of_events; j++ ) { _internal_hl_json_line_break_and_indent(f, beautifier, 2); const char *event_type = "delta"; if ( components[i].event_types[j] == 1 ) event_type = "instant"; const PAPI_component_info_t* cmpinfo; cmpinfo = PAPI_get_component_info( components[i].component_id ); fprintf(f, "\"%s\":{", components[i].event_names[j]); _internal_hl_json_line_break_and_indent(f, beautifier, 3); fprintf(f, "\"component\":\"%s\",", cmpinfo->name); _internal_hl_json_line_break_and_indent(f, beautifier, 3); fprintf(f, "\"type\":\"%s\"", event_type); _internal_hl_json_line_break_and_indent(f, beautifier, 2); fprintf(f, "}"); if ( num_events < total_num_events ) fprintf(f, ","); num_events++; } } _internal_hl_json_line_break_and_indent(f, beautifier, 1); fprintf(f, "},"); } static void _internal_hl_json_region_events(FILE* f, bool beautifier, regions_t *regions) { char **all_event_names = NULL; int *all_event_types = NULL; int extended_total_num_events; int i, j, cmp_iter; /* generate array of all events including CPU cycles and real time for output */ 
extended_total_num_events = total_num_events + 2; all_event_names = (char**)malloc(extended_total_num_events * sizeof(char*)); all_event_names[0] = "cycles"; all_event_names[1] = "real_time_nsec"; all_event_types = (int*)malloc(extended_total_num_events * sizeof(int)); all_event_types[0] = 0; all_event_types[1] = 0; cmp_iter = 2; for ( i = 0; i < num_of_components; i++ ) { for ( j = 0; j < components[i].num_of_events; j++ ) { all_event_names[cmp_iter] = components[i].event_names[j]; if ( components[i].event_types[j] == 0 ) all_event_types[cmp_iter] = 0; else all_event_types[cmp_iter] = 1; cmp_iter++; } } for ( j = 0; j < extended_total_num_events; j++ ) { _internal_hl_json_line_break_and_indent(f, beautifier, 5); /* print read values if available */ if ( regions->values[j].read_values != NULL) { reads_t* read_node = regions->values[j].read_values; /* going to last node */ while ( read_node->next != NULL ) { read_node = read_node->next; } /* read values in reverse order */ int read_cnt = 1; fprintf(f, "\"%s\":{", all_event_names[j]); _internal_hl_json_line_break_and_indent(f, beautifier, 6); fprintf(f, "\"region_value\":\"%lld\",", regions->values[j].region_value); while ( read_node != NULL ) { _internal_hl_json_line_break_and_indent(f, beautifier, 6); fprintf(f, "\"read_%d\":\"%lld\"", read_cnt,read_node->value); read_node = read_node->prev; if ( read_node == NULL ) { _internal_hl_json_line_break_and_indent(f, beautifier, 5); fprintf(f, "}"); if ( j < extended_total_num_events - 1 ) fprintf(f, ","); } else { fprintf(f, ","); } read_cnt++; } } else { HLDBG(" %s:%lld\n", all_event_names[j], regions->values[j].region_value); fprintf(f, "\"%s\":\"%lld\"", all_event_names[j], regions->values[j].region_value); if ( j < ( extended_total_num_events - 1 ) ) fprintf(f, ","); } } free(all_event_names); free(all_event_types); } static void _internal_hl_json_regions(FILE* f, bool beautifier, threads_t* thread_node) { /* iterate over regions list */ regions_t *regions = 
thread_node->value; /* going to last node */ while ( regions->next != NULL ) { regions = regions->next; } /* read regions in reverse order */ while (regions != NULL) { HLDBG(" Region:%u\n", regions->region_id); _internal_hl_json_line_break_and_indent(f, beautifier, 4); fprintf(f, "\"%u\":{", regions->region_id); _internal_hl_json_line_break_and_indent(f, beautifier, 5); fprintf(f, "\"name\":\"%s\",", regions->region); _internal_hl_json_line_break_and_indent(f, beautifier, 5); fprintf(f, "\"parent_region_id\":\"%d\",", regions->parent_region_id); _internal_hl_json_region_events(f, beautifier, regions); regions = regions->prev; _internal_hl_json_line_break_and_indent(f, beautifier, 4); if (regions == NULL ) { fprintf(f, "}"); } else { fprintf(f, "},"); } } } static void _internal_hl_json_threads(FILE* f, bool beautifier, unsigned long* tids, int threads_num) { int i; _internal_hl_json_line_break_and_indent(f, beautifier, 1); fprintf(f, "\"threads\":{"); /* get regions of all threads */ for ( i = 0; i < threads_num; i++ ) { HLDBG("Thread ID:%lu\n", tids[i]); /* find values of current thread in global binary tree */ threads_t* thread_node = _internal_hl_find_thread_node(tids[i]); if ( thread_node != NULL ) { /* do we really need the exact thread id? 
*/ /* we only store iterator id as thread id, not tids[i] */ _internal_hl_json_line_break_and_indent(f, beautifier, 2); fprintf(f, "\"%d\":{", i); _internal_hl_json_line_break_and_indent(f, beautifier, 3); fprintf(f, "\"regions\":{"); _internal_hl_json_regions(f, beautifier, thread_node); _internal_hl_json_line_break_and_indent(f, beautifier, 3); fprintf(f, "}"); _internal_hl_json_line_break_and_indent(f, beautifier, 2); if ( i < threads_num - 1 ) { fprintf(f, "},"); } else { fprintf(f, "}"); } } } _internal_hl_json_line_break_and_indent(f, beautifier, 1); fprintf(f, "}"); } static int _internal_hl_cmpfunc(const void * a, const void * b) { return ( *(int*)a - *(int*)b ); } static int _internal_get_sorted_thread_list(unsigned long** tids, int* threads_num) { if ( PAPI_list_threads( *tids, threads_num ) != PAPI_OK ) { verbose_fprintf(stdout, "PAPI-HL Error: PAPI_list_threads call failed!\n"); return -1; } if ( ( *tids = malloc( *(threads_num) * sizeof(unsigned long) ) ) == NULL ) { verbose_fprintf(stdout, "PAPI-HL Error: OOM!\n"); return -1; } if ( PAPI_list_threads( *tids, threads_num ) != PAPI_OK ) { verbose_fprintf(stdout, "PAPI-HL Error: PAPI_list_threads call failed!\n"); return -1; } /* sort thread ids in ascending order */ qsort(*tids, *(threads_num), sizeof(unsigned long), _internal_hl_cmpfunc); return PAPI_OK; } static void _internal_hl_write_json_file(FILE* f, unsigned long* tids, int threads_num) { /* JSON beautifier (line break and indent) */ bool beautifier = true; /* start of JSON file */ fprintf(f, "{"); _internal_hl_json_line_break_and_indent(f, beautifier, 1); fprintf(f, "\"papi_version\":\"%d.%d.%d.%d\",", PAPI_VERSION_MAJOR( PAPI_VERSION ), PAPI_VERSION_MINOR( PAPI_VERSION ), PAPI_VERSION_REVISION( PAPI_VERSION ), PAPI_VERSION_INCREMENT( PAPI_VERSION ) ); /* add some hardware info */ const PAPI_hw_info_t *hwinfo; if ( ( hwinfo = PAPI_get_hardware_info( ) ) != NULL ) { _internal_hl_json_line_break_and_indent(f, beautifier, 1); char* cpu_info = 
_internal_hl_remove_spaces(strdup(hwinfo->model_string), 1); fprintf(f, "\"cpu_info\":\"%s\",", cpu_info); free(cpu_info); _internal_hl_json_line_break_and_indent(f, beautifier, 1); fprintf(f, "\"max_cpu_rate_mhz\":\"%d\",", hwinfo->cpu_max_mhz); _internal_hl_json_line_break_and_indent(f, beautifier, 1); fprintf(f, "\"min_cpu_rate_mhz\":\"%d\",", hwinfo->cpu_min_mhz); } /* write definitions */ _internal_hl_json_definitions(f, beautifier); /* write all regions with events per thread */ _internal_hl_json_threads(f, beautifier, tids, threads_num); /* end of JSON file */ _internal_hl_json_line_break_and_indent(f, beautifier, 0); fprintf(f, "}"); fprintf(f, "\n"); } static void _internal_hl_read_json_file(const char* path) { /* print output to stdout */ printf("\n\nPAPI-HL Output:\n"); FILE* output_file = fopen(path, "r"); int c = fgetc(output_file); while (c != EOF) { printf("%c", c); c = fgetc(output_file); } printf("\n"); fclose(output_file); } static void _internal_hl_write_output() { if ( output_generated == false ) { _papi_hwi_lock( HIGHLEVEL_LOCK ); if ( output_generated == false ) { /* check if events were recorded */ if ( binary_tree == NULL ) { verbose_fprintf(stdout, "PAPI-HL Info: No events were recorded.\n"); free(absolute_output_file_path); return; } if ( region_begin_cnt == region_end_cnt ) { verbose_fprintf(stdout, "PAPI-HL Info: Print results...\n"); } else { verbose_fprintf(stdout, "PAPI-HL Warning: Cannot generate output due to not matching regions.\n"); output_generated = true; HLDBG("region_begin_cnt=%d, region_end_cnt=%d\n", region_begin_cnt, region_end_cnt); _papi_hwi_unlock( HIGHLEVEL_LOCK ); free(absolute_output_file_path); return; } /* create new measurement directory */ if ( ( _internal_hl_mkdir(absolute_output_file_path) ) != PAPI_OK ) { verbose_fprintf(stdout, "PAPI-HL Error: Cannot create measurement directory %s.\n", absolute_output_file_path); free(absolute_output_file_path); return; } /* determine rank for output file */ int rank = 
_internal_hl_determine_rank(); /* if system does not provide rank id, create a random id */ if ( rank < 0 ) { srandom( time(NULL) + getpid() ); rank = random() % 1000000; } int unique_output_file_created = 0; char *final_absolute_output_file_path = NULL; int fd; int random_cnt = 0; /* allocate memory for final output file path */ if ( ( final_absolute_output_file_path = (char *)malloc((strlen(absolute_output_file_path) + 20) * sizeof(char)) ) == NULL ) { verbose_fprintf(stdout, "PAPI-HL Error: Cannot create output file.\n"); free(absolute_output_file_path); free(final_absolute_output_file_path); return; } /* create unique output file per process based on rank variable */ while ( unique_output_file_created == 0 ) { rank += random_cnt; sprintf(final_absolute_output_file_path, "%s/rank_%06d.json", absolute_output_file_path, rank); fd = open(final_absolute_output_file_path, O_WRONLY|O_APPEND|O_CREAT|O_NONBLOCK, S_IRUSR|S_IWUSR|S_IRGRP|S_IROTH); if ( fd == -1 ) { verbose_fprintf(stdout, "PAPI-HL Error: Cannot create output file.\n"); free(absolute_output_file_path); free(final_absolute_output_file_path); return; } struct flock filelock; filelock.l_type = F_WRLCK; /* Test for any lock on any part of file. 
*/ filelock.l_start = 0; filelock.l_whence = SEEK_SET; filelock.l_len = 0; if ( fcntl(fd, F_SETLK, &filelock) == 0 ) { unique_output_file_created = 1; free(absolute_output_file_path); /* write into file */ FILE *fp = fdopen(fd, "w"); if ( fp != NULL ) { /* list all threads */ unsigned long *tids = NULL; int threads_num; if ( _internal_get_sorted_thread_list(&tids, &threads_num) != PAPI_OK ) { fclose(fp); free(final_absolute_output_file_path); return; } /* start writing json output */ _internal_hl_write_json_file(fp, tids, threads_num); free(tids); fclose(fp); if ( getenv("PAPI_REPORT") != NULL ) { _internal_hl_read_json_file(final_absolute_output_file_path); } } else { verbose_fprintf(stdout, "PAPI-HL Error: Cannot create output file: %s\n", strerror( errno )); free(final_absolute_output_file_path); fcntl(fd, F_UNLCK, &filelock); return; } fcntl(fd, F_UNLCK, &filelock); } else { /* try another file name */ close(fd); random_cnt++; } } output_generated = true; free(final_absolute_output_file_path); } _papi_hwi_unlock( HIGHLEVEL_LOCK ); } } static void _internal_hl_clean_up_local_data() { int i, retval; /* destroy all EventSets from local data */ if ( _local_components != NULL ) { HLDBG("Thread-ID:%lu\n", PAPI_thread_id()); for ( i = 0; i < num_of_components; i++ ) { if ( ( retval = PAPI_stop( _local_components[i].EventSet, _local_components[i].values ) ) != PAPI_OK ) /* only print error when event set is running */ if ( retval != -9 ) verbose_fprintf(stdout, "PAPI-HL Error: PAPI_stop failed: %d.\n", retval); if ( ( retval = PAPI_cleanup_eventset (_local_components[i].EventSet) ) != PAPI_OK ) verbose_fprintf(stdout, "PAPI-HL Error: PAPI_cleanup_eventset failed: %d.\n", retval); if ( ( retval = PAPI_destroy_eventset (&_local_components[i].EventSet) ) != PAPI_OK ) verbose_fprintf(stdout, "PAPI-HL Error: PAPI_destroy_eventset failed: %d.\n", retval); free(_local_components[i].values); } free(_local_components); _local_components = NULL; /* count global thread variable 
*/ _papi_hwi_lock( HIGHLEVEL_LOCK ); num_of_cleaned_threads++; _papi_hwi_unlock( HIGHLEVEL_LOCK ); } _papi_hl_events_running = 0; _local_state = PAPIHL_DEACTIVATED; } static void _internal_hl_clean_up_global_data() { int i; int extended_total_num_events; /* clean up binary tree of recorded events */ threads_t *thread_node; if ( binary_tree != NULL ) { while ( binary_tree->root != NULL ) { thread_node = *(threads_t **)binary_tree->root; /* clean up double linked list of region data */ regions_t *region = thread_node->value; regions_t *tmp; while ( region != NULL ) { /* clean up read node list */ extended_total_num_events = total_num_events + 2; for ( i = 0; i < extended_total_num_events; i++ ) { reads_t *read_node = region->values[i].read_values; reads_t *read_node_tmp; while ( read_node != NULL ) { read_node_tmp = read_node; read_node = read_node->next; free(read_node_tmp); } } tmp = region; region = region->next; free(tmp->region); free(tmp); } free(region); tdelete(thread_node, &binary_tree->root, compar); free(thread_node); } } /* we cannot free components here since other threads could still use them */ /* clean up requested event names */ for ( i = 0; i < num_of_requested_events; i++ ) free(requested_event_names[i]); free(requested_event_names); free(absolute_output_file_path); } static void _internal_hl_clean_up_all(bool deactivate) { int i, num_of_threads; /* we assume that output has been already generated or * cannot be generated due to previous errors */ output_generated = true; /* clean up thread local data */ if ( _local_state == PAPIHL_ACTIVE ) { HLDBG("Clean up thread local data for thread %lu\n", PAPI_thread_id()); _internal_hl_clean_up_local_data(); } /* clean up global data */ if ( state == PAPIHL_ACTIVE ) { _papi_hwi_lock( HIGHLEVEL_LOCK ); if ( state == PAPIHL_ACTIVE ) { verbose_fprintf(stdout, "PAPI-HL Info: Output generation is deactivated!\n"); HLDBG("Clean up global data for thread %lu\n", PAPI_thread_id()); 
_internal_hl_clean_up_global_data(); /* check if all other registered threads have cleaned up */ PAPI_list_threads(NULL, &num_of_threads); HLDBG("Number of registered threads: %d.\n", num_of_threads); HLDBG("Number of cleaned threads: %d.\n", num_of_cleaned_threads); if ( _internal_hl_check_for_clean_thread_states() == PAPI_OK && num_of_threads == num_of_cleaned_threads ) { PAPI_shutdown(); /* clean up components */ for ( i = 0; i < num_of_components; i++ ) { free(components[i].event_names); free(components[i].event_codes); free(components[i].event_types); } free(components); HLDBG("PAPI-HL shutdown!\n"); } else { verbose_fprintf(stdout, "PAPI-HL Warning: Could not call PAPI_shutdown() since some threads still have running event sets.\n"); } /* deactivate PAPI-HL */ if ( deactivate ) state = PAPIHL_DEACTIVATED; } _papi_hwi_unlock( HIGHLEVEL_LOCK ); } } static int _internal_hl_check_for_clean_thread_states() { EventSetInfo_t *ESI; DynamicArray_t *map = &_papi_hwi_system_info.global_eventset_map; int i; for( i = 0; i < map->totalSlots; i++ ) { ESI = map->dataSlotArray[i]; if ( ESI ) { if ( ESI->state & PAPI_RUNNING ) return ( PAPI_EISRUN ); } } return ( PAPI_OK ); } int _internal_PAPI_hl_init() { if ( state == PAPIHL_ACTIVE ) { if ( hl_initiated == false && hl_finalized == false ) { _internal_hl_onetime_library_init(); /* check if the library has been initialized successfully */ if ( state == PAPIHL_DEACTIVATED ) return ( PAPI_EMISC ); return ( PAPI_OK ); } return ( PAPI_ENOINIT ); } return ( PAPI_EMISC ); } int _internal_PAPI_hl_cleanup_thread() { if ( state == PAPIHL_ACTIVE && hl_initiated == true && _local_state == PAPIHL_ACTIVE ) { /* do not clean local data from master thread */ if ( master_thread_id != PAPI_thread_id() ) _internal_hl_clean_up_local_data(); return ( PAPI_OK ); } return ( PAPI_EMISC ); } int _internal_PAPI_hl_finalize() { if ( state == PAPIHL_ACTIVE && hl_initiated == true ) { _internal_hl_clean_up_all(true); return ( PAPI_OK ); } return ( 
PAPI_EMISC ); } int _internal_PAPI_hl_set_events(const char* events) { int retval; if ( state == PAPIHL_ACTIVE ) { /* This may only be called once after the high-level API was successfully * initiated. Any second call just returns PAPI_OK without doing an * expensive lock. */ if ( hl_initiated == true ) { if ( events_determined == false ) { _papi_hwi_lock( HIGHLEVEL_LOCK ); if ( events_determined == false && state == PAPIHL_ACTIVE ) { HLDBG("Set events: %s\n", events); if ( ( retval = _internal_hl_read_events(events) ) != PAPI_OK ) { state = PAPIHL_DEACTIVATED; _internal_hl_clean_up_global_data(); _papi_hwi_unlock( HIGHLEVEL_LOCK ); return ( retval ); } if ( ( retval = _internal_hl_create_global_binary_tree() ) != PAPI_OK ) { state = PAPIHL_DEACTIVATED; _internal_hl_clean_up_global_data(); _papi_hwi_unlock( HIGHLEVEL_LOCK ); return ( retval ); } } _papi_hwi_unlock( HIGHLEVEL_LOCK ); } } /* in case the first locked thread ran into problems */ if ( state == PAPIHL_DEACTIVATED) return ( PAPI_EMISC ); return ( PAPI_OK ); } return ( PAPI_EMISC ); } void _internal_PAPI_hl_print_output() { if ( state == PAPIHL_ACTIVE && hl_initiated == true && output_generated == false ) { _internal_hl_write_output(); } } /** @class PAPI_hl_region_begin * @brief Read performance events at the beginning of a region. * * @par C Interface: * \#include @n * int PAPI_hl_region_begin( const char* region ); * * @param region * -- a unique region name * * @retval PAPI_OK * @retval PAPI_ENOTRUN * -- EventSet is currently not running or could not determined. * @retval PAPI_ESYS * -- A system or C library call failed inside PAPI, see the errno variable. * @retval PAPI_EMISC * -- PAPI has been deactivated due to previous errors. * @retval PAPI_ENOMEM * -- Insufficient memory. * * PAPI_hl_region_begin reads performance events and stores them internally at the beginning * of an instrumented code region. * If not specified via the environment variable PAPI_EVENTS, default events are used. 
* The first call sets all counters implicitly to zero and starts counting. * Note that if PAPI_EVENTS is not set or cannot be interpreted, default performance events are * recorded. * * @par Example: * * @code * export PAPI_EVENTS="PAPI_TOT_INS,PAPI_TOT_CYC" * * @endcode * * * @code * int retval; * * retval = PAPI_hl_region_begin("computation"); * if ( retval != PAPI_OK ) * handle_error(1); * * //Do some computation here * * retval = PAPI_hl_region_end("computation"); * if ( retval != PAPI_OK ) * handle_error(1); * * @endcode * * @see PAPI_hl_read * @see PAPI_hl_region_end * @see PAPI_hl_stop */ int PAPI_hl_region_begin( const char* region ) { int retval; /* if a rate event set is running stop it */ if ( _papi_rate_events_running == 1 ) { if ( ( retval = PAPI_rate_stop() ) != PAPI_OK ) return ( retval ); } if ( state == PAPIHL_DEACTIVATED ) { /* check if we have to clean up local stuff */ if ( _local_state == PAPIHL_ACTIVE ) _internal_hl_clean_up_local_data(); return ( PAPI_EMISC ); } if ( hl_finalized == true ) return ( PAPI_ENOTRUN ); if ( hl_initiated == false ) { if ( ( retval = _internal_PAPI_hl_init() ) != PAPI_OK ) return ( retval ); } if ( events_determined == false ) { if ( ( retval = _internal_PAPI_hl_set_events(NULL) ) != PAPI_OK ) return ( retval ); } if ( _local_components == NULL ) { if ( ( retval = _internal_hl_create_event_sets() ) != PAPI_OK ) { HLDBG("Could not create local events sets for thread %lu.\n", PAPI_thread_id()); _internal_hl_clean_up_all(true); return ( retval ); } } if ( _papi_hl_events_running == 0 ) { if ( ( retval = _internal_hl_start_counters() ) != PAPI_OK ) { HLDBG("Could not start counters for thread %lu.\n", PAPI_thread_id()); _internal_hl_clean_up_all(true); return ( retval ); } } /* read and store all events */ HLDBG("Thread ID:%lu, Region:%s\n", PAPI_thread_id(), region); if ( ( retval = _internal_hl_read_and_store_counters(region, REGION_BEGIN) ) != PAPI_OK ) return ( retval ); if ( ( retval = _internal_hl_region_id_push() 
) != PAPI_OK ) { verbose_fprintf(stdout, "PAPI-HL Warning: Number of nested regions exceeded for thread %lu.\n", PAPI_thread_id()); _internal_hl_clean_up_all(true); return ( retval ); } _local_region_begin_cnt++; return ( PAPI_OK ); } /** @class PAPI_hl_read * @brief Read performance events inside of a region and store the difference to the corresponding * beginning of the region. * * @par C Interface: * \#include @n * int PAPI_hl_read( const char* region ); * * @param region * -- a unique region name corresponding to PAPI_hl_region_begin * * @retval PAPI_OK * @retval PAPI_ENOTRUN * -- EventSet is currently not running or could not determined. * @retval PAPI_ESYS * -- A system or C library call failed inside PAPI, see the errno variable. * @retval PAPI_EMISC * -- PAPI has been deactivated due to previous errors. * @retval PAPI_ENOMEM * -- Insufficient memory. * * PAPI_hl_read reads performance events inside of a region and stores the difference to the * corresponding beginning of the region. * * Assumes that PAPI_hl_region_begin was called before. 
* * @par Example: * * @code * int retval; * * retval = PAPI_hl_region_begin("computation"); * if ( retval != PAPI_OK ) * handle_error(1); * * //Do some computation here * * retval = PAPI_hl_read("computation"); * if ( retval != PAPI_OK ) * handle_error(1); * * //Do some computation here * * retval = PAPI_hl_region_end("computation"); * if ( retval != PAPI_OK ) * handle_error(1); * * @endcode * * @see PAPI_hl_region_begin * @see PAPI_hl_region_end * @see PAPI_hl_stop */ int PAPI_hl_read(const char* region) { int retval; if ( state == PAPIHL_DEACTIVATED ) { /* check if we have to clean up local stuff */ if ( _local_state == PAPIHL_ACTIVE ) _internal_hl_clean_up_local_data(); return ( PAPI_EMISC ); } if ( _local_region_begin_cnt == 0 ) { verbose_fprintf(stdout, "PAPI-HL Warning: Cannot find matching region for PAPI_hl_read(\"%s\") for thread %lu.\n", region, PAPI_thread_id()); return ( PAPI_EMISC ); } if ( _local_components == NULL ) return ( PAPI_ENOTRUN ); /* read and store all events */ HLDBG("Thread ID:%lu, Region:%s\n", PAPI_thread_id(), region); if ( ( retval = _internal_hl_read_and_store_counters(region, REGION_READ) ) != PAPI_OK ) return ( retval ); return ( PAPI_OK ); } /** @class PAPI_hl_region_end * @brief Read performance events at the end of a region and store the difference to the * corresponding beginning of the region. * * @par C Interface: * \#include @n * int PAPI_hl_region_end( const char* region ); * * @param region * -- a unique region name corresponding to PAPI_hl_region_begin * * @retval PAPI_OK * @retval PAPI_ENOTRUN * -- EventSet is currently not running or could not determined. * @retval PAPI_ESYS * -- A system or C library call failed inside PAPI, see the errno variable. * @retval PAPI_EMISC * -- PAPI has been deactivated due to previous errors. * @retval PAPI_ENOMEM * -- Insufficient memory. * * PAPI_hl_region_end reads performance events at the end of a region and stores the * difference to the corresponding beginning of the region. 
* * Assumes that PAPI_hl_region_begin was called before. * * Note that PAPI_hl_region_end does not stop counting the performance events. Counting * continues until the application terminates. Therefore, the programmer can also create * nested regions if required. To stop a running high-level event set, the programmer must call * PAPI_hl_stop(). It should also be noted, that a marked region is thread-local and therefore * has to be in the same thread. * * An output of the measured events is created automatically after the application exits. * In the case of a serial, or a thread-parallel application there is only one output file. * MPI applications would be saved in multiple files, one per MPI rank. * The output is generated in the current directory by default. However, it is recommended to * specify an output directory for larger measurements, especially for MPI applications via * the environment variable PAPI_OUTPUT_DIRECTORY. In the case where measurements are performed, * while there are old measurements in the same directory, PAPI will not overwrite or delete the * old measurement directories. Instead, timestamps are added to the old directories. * * For more convenience, the output can also be printed to stdout by setting PAPI_REPORT=1. This * is not recommended for MPI applications as each MPI rank tries to print the output concurrently. * * The generated measurement output can also be converted in a better readable output. The python * script papi_hl_output_writer.py enhances the output by creating some derived metrics, like IPC, * MFlops/s, and MFlips/s as well as real and processor time in case the corresponding PAPI events * have been recorded. The python script can also summarize performance events over all threads and * MPI ranks when using the option "accumulate" as seen below. 
* * @par Example: * * @code * int retval; * * retval = PAPI_hl_region_begin("computation"); * if ( retval != PAPI_OK ) * handle_error(1); * * //Do some computation here * * retval = PAPI_hl_region_end("computation"); * if ( retval != PAPI_OK ) * handle_error(1); * * @endcode * * @code * python papi_hl_output_writer.py --type=accumulate * * { * "computation": { * "Region count": 1, * "Real time in s": 0.97 , * "CPU time in s": 0.98 , * "IPC": 1.41 , * "MFLIPS /s": 386.28 , * "MFLOPS /s": 386.28 , * "Number of ranks ": 1, * "Number of threads ": 1, * "Number of processes ": 1 * } * } * * @endcode * * @see PAPI_hl_region_begin * @see PAPI_hl_read * @see PAPI_hl_stop */ int PAPI_hl_region_end( const char* region ) { int retval; if ( state == PAPIHL_DEACTIVATED ) { /* check if we have to clean up local stuff */ if ( _local_state == PAPIHL_ACTIVE ) _internal_hl_clean_up_local_data(); return ( PAPI_EMISC ); } if ( _local_region_begin_cnt == 0 ) { verbose_fprintf(stdout, "PAPI-HL Warning: Cannot find matching region for PAPI_hl_region_end(\"%s\") for thread %lu.\n", region, PAPI_thread_id()); return ( PAPI_EMISC ); } if ( _local_components == NULL ) return ( PAPI_ENOTRUN ); /* read and store all events */ HLDBG("Thread ID:%lu, Region:%s\n", PAPI_thread_id(), region); if ( ( retval = _internal_hl_read_and_store_counters(region, REGION_END) ) != PAPI_OK ) return ( retval ); _internal_hl_region_id_pop(); _local_region_end_cnt++; return ( PAPI_OK ); } /** @class PAPI_hl_stop * @brief Stop a running high-level event set. * * @par C Interface: * \#include @n * int PAPI_hl_stop(); * * @retval PAPI_ENOEVNT * -- The EventSet is not started yet. * @retval PAPI_ENOMEM * -- Insufficient memory to complete the operation. * * PAPI_hl_stop stops a running high-level event set. * * This call is optional and only necessary if the programmer wants to use the low-level API in addition * to the high-level API. 
It should be noted that PAPI_hl_stop and low-level calls are not * allowed inside of a marked region. Furthermore, PAPI_hl_stop is thread-local and therefore * has to be called in the same thread as the corresponding marked region. * * @see PAPI_hl_region_begin * @see PAPI_hl_read * @see PAPI_hl_region_end */ int PAPI_hl_stop() { int retval, i; if ( _papi_hl_events_running == 1 ) { if ( _local_components != NULL ) { for ( i = 0; i < num_of_components; i++ ) { if ( ( retval = PAPI_stop( _local_components[i].EventSet, _local_components[i].values ) ) != PAPI_OK ) return ( retval ); } } _papi_hl_events_running = 0; return ( PAPI_OK ); } return ( PAPI_ENOEVNT ); }
papi-papi-7-2-0-t/src/high-level/scripts/papi_hl_output_writer.py
#!/usr/bin/env python3 ## # @file papi_hl_output_writer.py # @brief Converts HL output to be more comprehensible. # Output is enhanced by creating derived metrics like IPC, # MFlop/s, and MFlips/s. As well as real and processor time. from __future__ import division from collections import OrderedDict import argparse import os import json # Make it work for Python 2+3 and with Unicode import io ##\cond try: to_unicode = unicode except NameError: to_unicode = str ##\endcond event_definitions = {} process_num = {} derived_metric_names = { 'region_count':'Region count', 'cycles':'Total elapsed cycles', 'real_time_nsec':'Real time in s', 'perf::TASK-CLOCK':'CPU time in s' } event_rate_names = OrderedDict([ ('PAPI_FP_INS','MFLIPS/s'), ('PAPI_VEC_SP','Single precision vector/SIMD instructions rate in M/s'), ('PAPI_VEC_DP','Double precision vector/SIMD instructions rate in M/s'), ('PAPI_FP_OPS','MFLOPS/s'), ('PAPI_SP_OPS','Single precision MFLOPS/s'), ('PAPI_DP_OPS','Double precision MFLOPS/s') ]) def merge_json_files(source_dir): """!
Function definition for merge_json_files. Merge multiple .json files together into a single dictionary. @param source_dir A directory containing one or more .json files from PAPI HL function calls @returns An ordered dictionary containing measurements from recorded events for one or more .json files generated from PAPI HL function calls. """ json_object = {} events_stored = False #get measurement files file_list = os.listdir(source_dir) file_list.sort() rank_cnt = 0 json_rank = OrderedDict() for item in file_list: #determine mpi rank based on file name (rank_#) rank = item.split('_', 1)[1] rank = rank.rsplit('.', 1)[0] try: rank = int(rank) except: rank = rank_cnt #open measurement file file_name = str(source_dir) + "/" + str(item) try: with open(file_name) as json_file: #keep order of all objects data = json.load(json_file, object_pairs_hook=OrderedDict) except IOError as ioe: print("Cannot open file {} ({})".format(file_name, repr(ioe))) return #store global data if events_stored == False: global event_definitions event_definitions = data['event_definitions'] events_stored = True #get all threads json_rank[str(rank)] = OrderedDict() json_rank[str(rank)]['threads'] = data['threads'] rank_cnt = rank_cnt + 1 json_object['ranks'] = json_rank return json_object def parse_source_file(source_file): """! Function definition for parse_source_file. Parses a single user passed .json file generated from PAPI HL function calls. @param source_file .json file generated from PAPI HL function calls. @returns An ordered dictionary containing measurements from recorded events for a single .json file, generated from PAPI HL function calls. 
""" json_data = {} json_rank = OrderedDict() events_stored = False # determine mpi rank based on file name (rank_#) rank = source_file.split('_', 1)[1] rank = source_file.rsplit('.', 1)[0] # open json file provided by user f = open(source_file) # return ordered dictionary data = json.load(f, object_pairs_hook = OrderedDict) #store global data if events_stored == False: global event_definitions event_definitions = data['event_definitions'] events_stored = True # get all threads json_rank[str(rank)] = OrderedDict() json_rank[str(rank)]['threads'] = data['threads'] json_data['ranks'] = json_rank return json_data class Sum_Counter(object): """! Sum_Counter class defintion. Calculates the min, max, median or sum for the measurements of a recorded events. """ def __init__(self): """! Sum_Counter class initializer. """ self.min = None self.all_values = [] self.max = 0 def add_event(self, value): """! Method definition for add_event. Add a recorded event and measurement to summary output. @param value Measurement from a recorded event. E.g. PAPI_TOT_INS. """ if isinstance(value, dict): if self.min is None or self.min > int(value['min']): self.min = int(value['min']) self.all_values.append(int(value['avg'])) if self.max < int(value['max']): self.max = int(value['max']) else: val = int(value) if self.min is None or self.min > val: self.min = val self.all_values.append(val) if self.max < val: self.max = val def get_min(self): """! Method definition for get_min. Calculates the minimum for a set of measurements for a recorded event. @returns The minimum for a set of measurement values for a recorded event. E.g. PAPI_TOT_INS. """ return self.min def get_median(self): """! Method definition for get_median. Calculates the median for a set of measurements for a recorded event. @returns The median for a set of measurement values for a recorded event. E.g. PAPI_TOT_INS. 
""" n = len(self.all_values) s = sorted(self.all_values) return (sum(s[n//2-1:n//2+1])/2.0, s[n//2])[n % 2] if n else None def get_sum(self): """! Method definition for get_sum. Calculates the sum for a set of measurements for a recorded event. @returns The sum of measurement values for a recorded event. E.g. PAPI_TOT_INS. """ sum = 0 for value in self.all_values: sum += value return sum def get_max(self): """! Method definition for get_max. Calculate the maximum for a set of measurements for a recorded event. @returns The maximum for a set of measurement values for a recorded event. E.g. PAPI_TOT_INS. """ return self.max class Sum_Counters(object): """! Sum_Counters class defintion. Gathers summary output for a region (e.g. computation) and accompanying measurements for a recorded event (e.g. PAPI_TOT_INS). """ def __init__(self): """! Sum_Counters class initializer. """ self.regions = OrderedDict() self.regions_last_rank_id = {} self.regions_rank_num = {} self.regions_last_thread_id = {} self.regions_thread_num = {} self.clean_regions = OrderedDict() self.sum_counters = OrderedDict() def add_region(self, rank_id, thread_id, events=OrderedDict()): """! Method defintion for add_region. Adds the region (e.g. computation) and accompanying measurements for a recorded event (e.g. PAPI_TOT_INS) to summary output. @param rank_id MPI rank, if no MPI rank is present this value will be random. @param thread_id Thread identifier containing performance events. E.g. 0. @param events An ordered dictionary containing measurements for recorded events obtained through PAPI HL function calls. E.g. PAPI_TOT_INS. 
""" #remove all read values caused by PAPI_hl_read cleaned_events = OrderedDict() region_name = 'unknown' cleaned_events['region_count'] = 1 for key,value in events.items(): if 'name' in key: region_name = value continue if 'parent_region_id' in key: continue metric_value = value if isinstance(value, dict): if "region_value" in value: metric_value = float(value['region_value']) cleaned_events[key] = metric_value #create new Sum_Counter object for each new region if region_name not in self.regions: self.regions[region_name] = {} self.regions_last_rank_id[region_name] = rank_id self.regions_last_thread_id[region_name] = thread_id self.regions_rank_num[region_name] = 1 self.regions_thread_num[region_name] = 1 self.sum_counters[region_name] = OrderedDict() for key,value in cleaned_events.items(): self.sum_counters[region_name][key] = Sum_Counter() self.sum_counters[region_name][key].add_event(value) else: #increase number of ranks and threads when rank_id has changed if self.regions_last_rank_id[region_name] != rank_id: self.regions_last_rank_id[region_name] = rank_id self.regions_rank_num[region_name] += 1 self.regions_last_thread_id[region_name] = thread_id self.regions_thread_num[region_name] += 1 #increase number of threads when thread_id has changed if self.regions_last_thread_id[region_name] != thread_id: self.regions_last_thread_id[region_name] = thread_id self.regions_thread_num[region_name] += 1 for key,value in cleaned_events.items(): self.sum_counters[region_name][key].add_event(value) self.regions[region_name]['rank_num'] = self.regions_rank_num[region_name] self.regions[region_name]['thread_num'] = self.regions_thread_num[region_name] def get_json(self): """! Method definition for get_json. Calculates the min, max, median, or sum for a set of measurements for a recorded event. E.g. PAPI_TOT_INS. @returns An ordered dictionary containing summary measurements for recorded events. E.g. PAPI_TOT_INS. 
""" sum_json = OrderedDict() for name in self.regions: events = OrderedDict() for key,value in self.sum_counters.items(): if key == name: region_count = 1 for event_key,event_value in value.items(): if event_key == 'region_count': events[event_key] = int(event_value.get_sum()) region_count = events[event_key] else: global event_definitions if region_count > 1: events[event_key] = OrderedDict() if event_key == 'cycles' or event_key == 'real_time_nsec': events[event_key]['total'] = event_value.get_sum() else: if ( event_definitions[event_key]['type'] == 'delta' and event_definitions[event_key]['component'] == 'perf_event' ): events[event_key]['total'] = event_value.get_sum() events[event_key]['min'] = event_value.get_min() events[event_key]['median'] = event_value.get_median() events[event_key]['max'] = event_value.get_max() else: #sequential code if event_key == 'cycles' or event_key == 'real_time_nsec': events[event_key] = event_value.get_min() else: if event_definitions[event_key] == 'instant' and region_count > 1: events[event_key] = OrderedDict() events[event_key]['min'] = event_value.get_min() events[event_key]['median'] = event_value.get_median() events[event_key]['max'] = event_value.get_max() else: events[event_key] = event_value.get_min() break #add number of ranks and threads in case of a parallel code if self.regions[name]['rank_num'] > 1 or self.regions[name]['thread_num'] > 1: events['Number of ranks'] = self.regions[name]['rank_num'] events['Number of threads per rank'] = int(self.regions[name]['thread_num'] / self.regions[name]['rank_num']) sum_json[name] = events global process_num process_num[name] = self.regions[name]['rank_num'] * self.regions[name]['thread_num'] return sum_json def derive_sum_json_object(data): """! Function definition for derive_sum_json_object. Calculates the derived event measurements (IPC) from the recorded events obtained through PAPI HL function calls. 
    @param data An ordered dictionary containing the number of threads and
    measurements for the recorded events. E.g. PAPI_TOT_INS.

    @returns An ordered dictionary filled with formatted measurements for
    derived events.
    """
    json_object = OrderedDict()
    for region_key, region_value in data.items():
        derive_events = OrderedDict()
        events = region_value.copy()
        # remember runtime for other metrics like MFLOPS
        rt = {}
        # remember region count
        region_cnt = 1

        # Region Count
        if 'region_count' in events:
            derive_events[derived_metric_names['region_count']] = int(events['region_count'])
            region_cnt = int(events['region_count'])
            del events['region_count']

        # skip cycles
        if 'cycles' in events:
            del events['cycles']

        # Real Time
        if 'real_time_nsec' in events:
            event_name = derived_metric_names['real_time_nsec']
            if region_cnt > 1:
                for metric in ['total', 'min', 'median', 'max']:
                    rt[metric] = convert_value(events['real_time_nsec'][metric], 'Runtime')
                derive_events[event_name] = rt['max']
            else:
                rt['total'] = convert_value(events['real_time_nsec'], 'Runtime')
                derive_events[event_name] = rt['total']
            del events['real_time_nsec']

        # CPU Time
        if 'perf::TASK-CLOCK' in events:
            event_name = derived_metric_names['perf::TASK-CLOCK']
            if region_cnt > 1:
                derive_events[event_name] = convert_value(events['perf::TASK-CLOCK']['total'], 'CPUtime')
            else:
                derive_events[event_name] = convert_value(events['perf::TASK-CLOCK'], 'CPUtime')
            del events['perf::TASK-CLOCK']

        # PAPI_TOT_INS and PAPI_TOT_CYC to calculate IPC
        if 'PAPI_TOT_INS' in events and 'PAPI_TOT_CYC' in events:
            event_name = 'IPC'
            metric = 'total'
            try:
                if region_cnt > 1:
                    ipc = float(format(float(int(events['PAPI_TOT_INS'][metric]) /
                                             int(events['PAPI_TOT_CYC'][metric])), '.2f'))
                else:
                    ipc = float(format(float(int(events['PAPI_TOT_INS']) /
                                             int(events['PAPI_TOT_CYC'])), '.2f'))
            except:
                ipc = 'n/a'
            derive_events[event_name] = ipc
            del events['PAPI_TOT_INS']
            del events['PAPI_TOT_CYC']

        # Rates
        global event_rate_names
        for rate_event in event_rate_names:
            if rate_event in \
events:
                event_name = event_rate_names[rate_event]
                metric = 'total'
                try:
                    if region_cnt > 1:
                        rate = float(format(float(events[rate_event][metric]) / 1000000 / rt[metric], '.2f'))
                    else:
                        rate = float(format(float(events[rate_event]) / 1000000 / rt[metric], '.2f'))
                except:
                    rate = 'n/a'
                derive_events[event_name] = rate
                del events[rate_event]

        # read the rest
        for event_key, event_value in events.items():
            derive_events[event_key] = OrderedDict()
            derive_events[event_key] = event_value

        json_object[region_key] = derive_events.copy()
    return json_object


def sum_json_object(data, derived=False):
    """! Function definition for sum_json_object.
    Converts the user supplied .json file containing measurements from PAPI HL
    function calls to summary format.

    @param data A dictionary containing ranks, threads, and regions.
    @param derived Type of notation. If set to true then the notation is derived.

    @returns An ordered dictionary containing measurements for recorded events.
    E.g. PAPI_TOT_INS.
    """
    sum_cnt = Sum_Counters()
    for rank, rank_value in data['ranks'].items():
        for thread, thread_value in rank_value['threads'].items():
            for region_value in thread_value['regions'].values():
                sum_cnt.add_region(rank, thread, region_value)

    if derived == True:
        return derive_sum_json_object(sum_cnt.get_json())
    else:
        return sum_cnt.get_json()


def get_ipc_dict(inst, cyc):
    """! Function definition for get_ipc_dict.
    Calculates IPC.

    @param inst An ordered dictionary containing measurements for PAPI_TOT_INS.
    @param cyc An ordered dictionary containing measurements for PAPI_TOT_CYC.

    @returns An ordered dictionary containing IPC measurement(s).
    """
    ipc_dict = OrderedDict()
    for (inst_key, inst_value), (cyc_key, cyc_value) in zip(inst.items(), cyc.items()):
        try:
            ipc = float(int(inst_value) / int(cyc_value))
        except:
            ipc = 0
        ipc_dict[inst_key] = float(format(ipc, '.2f'))
    return ipc_dict


def get_ops_dict(ops, rt):
    """! Function definition for get_ops_dict.
    Calculates OPS.
    @param ops An ordered dictionary containing measurements for rate recorded
    events. E.g. PAPI_FP_INS.
    @param rt An ordered dictionary containing measurements for real time.
    E.g. real_time_nsec.

    @returns An ordered dictionary containing OPS measurement(s).
    """
    ops_dict = OrderedDict()
    for (ops_key, ops_value), (rt_key, rt_value) in zip(ops.items(), rt.items()):
        try:
            ops = float(ops_value) / 1000000 / rt_value
        except:
            ops = 0
        ops_dict[ops_key] = float(format(ops, '.2f'))
    return ops_dict


def convert_value(value, event_type='Other'):
    """! Function definition for convert_value.
    Converts current measurement precision from a recorded event to a new precision.

    @param value Measurement from a recorded event. E.g. PAPI_TOT_INS.
    @param event_type Type of event recorded. E.g. cycles or runtime.

    @returns New precision for the event value. Either an int or float.
    """
    if event_type == 'Other':
        result = float(value)
        result = float(format(result, '.2f'))
    elif event_type == 'Cycles':
        result = int(value)
    elif event_type == 'Runtime':
        result = float(value) / 1.e09
        result = float(format(result, '.2f'))
    elif event_type == 'CPUtime':
        result = float(value) / 1.e09
        result = float(format(result, '.2f'))
    return result


def derive_read_events(events, event_type='Other'):
    """! Function definition for derive_read_events.
    Format derived event values to a specific precision.

    @param events An ordered dictionary filled with measurements from recorded
    events. E.g. PAPI_TOT_INS.
    @param event_type Type of event recorded. E.g. cycles or runtime.

    @returns An ordered dictionary with values formatted to a specific
    precision (int or float).
    """
    format_read_dict = OrderedDict()
    for read_key, read_value in events.items():
        format_read_dict[read_key] = convert_value(read_value, event_type)
    return format_read_dict


def derive_events(events):
    """! Function definition for derive_events.
    Parses an ordered dictionary that contains derived events.
    @param events An ordered dictionary filled with measurements from recorded
    events. E.g. PAPI_TOT_INS.

    @returns An ordered dictionary filled with formatted measurements for
    derived events.
    """
    # keep order as declared
    derive_events = OrderedDict()
    # remember runtime for other metrics like MFLOPS
    rt = 1.0
    rt_dict = OrderedDict()

    # name
    if 'name' in events:
        derive_events['name'] = events['name']
        del events['name']

    # skip parent_region_id and cycles (delete each only if it is present)
    if 'parent_region_id' in events:
        del events['parent_region_id']
    if 'cycles' in events:
        del events['cycles']

    # Real Time
    if 'real_time_nsec' in events:
        if isinstance(events['real_time_nsec'], dict):
            for read_key, read_value in events['real_time_nsec'].items():
                rt_dict[read_key] = convert_value(read_value, 'Runtime')
            derive_events[derived_metric_names['real_time_nsec']] = \
                derive_read_events(events['real_time_nsec'], 'Runtime')
        else:
            rt = convert_value(events['real_time_nsec'], 'Runtime')
            derive_events[derived_metric_names['real_time_nsec']] = \
                convert_value(events['real_time_nsec'], 'Runtime')
        del events['real_time_nsec']

    # CPU Time
    if 'perf::TASK-CLOCK' in events:
        if isinstance(events['perf::TASK-CLOCK'], dict):
            derive_events[derived_metric_names['perf::TASK-CLOCK']] = \
                derive_read_events(events['perf::TASK-CLOCK'], 'CPUtime')
        else:
            derive_events[derived_metric_names['perf::TASK-CLOCK']] = \
                convert_value(events['perf::TASK-CLOCK'], 'CPUtime')
        del events['perf::TASK-CLOCK']

    # PAPI_TOT_INS and PAPI_TOT_CYC to calculate IPC
    if 'PAPI_TOT_INS' in events and 'PAPI_TOT_CYC' in events:
        if isinstance(events['PAPI_TOT_INS'], dict) and isinstance(events['PAPI_TOT_CYC'], dict):
            ipc_dict = get_ipc_dict(events['PAPI_TOT_INS'], events['PAPI_TOT_CYC'])
            derive_events['IPC'] = ipc_dict
        else:
            try:
                ipc = float(int(events['PAPI_TOT_INS']) / int(events['PAPI_TOT_CYC']))
            except:
                ipc = 0
            derive_events['IPC'] = float(format(ipc, '.2f'))
        del events['PAPI_TOT_INS']
        del events['PAPI_TOT_CYC']

    # Rates
    global event_rate_names
    for rate_event in event_rate_names:
        if \
rate_event in events:
            event_name = event_rate_names[rate_event]
            if isinstance(events[rate_event], dict):
                rate_dict = get_ops_dict(events[rate_event], rt_dict)
                derive_events[event_name] = rate_dict
            else:
                try:
                    rate = float(format(float(events[rate_event]) / 1000000 / rt, '.2f'))
                except:
                    rate = 0
                derive_events[event_name] = rate
            del events[rate_event]

    # read the rest
    for event_key, event_value in events.items():
        if isinstance(event_value, dict):
            derive_events[event_key] = derive_read_events(event_value)
        else:
            derive_events[event_key] = convert_value(event_value)

    return derive_events


def derive_json_object(data):
    """! Function definition for derive_json_object.
    Converts the user supplied .json file containing measurements from PAPI HL
    function calls to derived format.

    @param data Data obtained from PAPI HL function calls.

    @returns An ordered dictionary containing values for ranks, threads,
    regions, name, real time, and IPC in .json format.
    """
    for rank, rank_value in data['ranks'].items():
        for thread, thread_value in rank_value['threads'].items():
            for region, region_value in thread_value['regions'].items():
                data['ranks'][rank]['threads'][thread]['regions'][region] = derive_events(region_value)
    return data


def write_json_file(data, file_name):
    """! Function definition for write_json_file.
    Write enhanced output to output file.

    @param data Data obtained from PAPI HL function calls.
    @param file_name Output filename. Either papi.json or papi_sum.json.
    """
    with io.open(file_name, 'w', encoding='utf8') as outfile:
        str_ = json.dumps(data, indent=4, sort_keys=False,
                          separators=(',', ': '), ensure_ascii=False)
        outfile.write(to_unicode(str_))
        print(str_)


def main(format, type, notation, source_dir=None, source_file=None):
    """! Function definition for main.
    Contains code to run upon the Python interpreter executing the file.

    @param format User passed output format, e.g. json.
    @param type User passed output type. Either detailed or summary.
    @param notation User passed notation.
    Either raw or derived.
    @param source_dir Measurement directory of raw data.
    @param source_file Individual .json file containing measurements of raw data.
    """
    if format == "json":
        if source_dir != None:
            json = merge_json_files(source_dir)
        else:
            json = parse_source_file(source_file)

        if type == 'detail':
            if notation == 'derived':
                write_json_file(derive_json_object(json), 'papi.json')
            else:
                write_json_file(json, 'papi.json')

        # summarize data over regions with the same name, threads and ranks
        if type == 'summary':
            if notation == 'derived':
                write_json_file(sum_json_object(json, True), 'papi_sum.json')
            else:
                write_json_file(sum_json_object(json), 'papi_sum.json')
    else:
        print("Format not supported!")


def parse_args():
    """! Function definition for parse_args.
    Defines and parses command line arguments.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('--source_dir', type=str, required=False,
                        help='Measurement directory of raw data.')
    parser.add_argument('--source_file', type=str, required=False,
                        help='Individual file containing measurements of raw data.')
    parser.add_argument('--format', type=str, required=False, default='json',
                        help='Output format, e.g. json.')
    parser.add_argument('--type', type=str, required=False, default='summary',
                        help='Output type: detail or summary.')
    parser.add_argument('--notation', type=str, required=False, default='derived',
                        help='Output notation: raw or derived.')

    # check to make sure a value has not been passed for both filename and source
    if (parser.parse_args().source_dir != None and
            parser.parse_args().source_file != None):
        # executes if both conditions are true
        print("Cannot pass values to both source_dir and source_file."
" Value must be passed to either source_dir or source_file.") parser.print_help() parser.exit() # check to see if file exists if parser.parse_args().source_file != None: source_file = str(parser.parse_args().source_file) if not os.path.isfile(source_file): print("The file named '{}' does not exist!\n".format(source_file)) parser.print_help() parser.exit() # check if papi directory exists elif parser.parse_args().source_dir != None: source_dir = str(parser.parse_args().source_dir) if os.path.isdir(source_dir) == False: print("Measurement directory '{}' does not exist!\n".format(source_dir)) parser.print_help() parser.exit() # output if neither source_file or source_dir are supplied else: print("Path to either a JSON file (--source_file) or a" " dictionary (--source_dir) which contains a JSON file is required.") parser.print_help() parser.exit() # check format output_format = str(parser.parse_args().format) if output_format != "json": print("Output format '{}' is not supported!\n".format(output_format)) parser.print_help() parser.exit() # check type output_type = str(parser.parse_args().type) if output_type != "detail" and output_type != "summary": print("Output type '{}' is not supported!\n".format(output_type)) parser.print_help() parser.exit() # check notation output_notation = str(parser.parse_args().notation) if output_notation != "raw" and output_notation != "derived": print("Output notation '{}' is not supported!\n".format(output_notation)) parser.print_help() parser.exit() return parser.parse_args() if __name__ == '__main__': ##\cond args = parse_args() main(format=args.format, source_dir=args.source_dir, source_file=args.source_file, type=args.type, notation=args.notation) ##\endcond papi-papi-7-2-0-t/src/libpapi.exp000066400000000000000000000017611502707512200165720ustar00rootroot00000000000000PAPI_accum PAPI_add_event PAPI_add_events PAPI_cleanup_eventset PAPI_create_eventset PAPI_destroy_eventset PAPI_enum_event PAPI_event_code_to_name 
PAPI_event_name_to_code PAPI_get_event_info PAPI_get_executable_info PAPI_get_hardware_info PAPI_get_multiplex PAPI_get_opt PAPI_get_real_cyc PAPI_get_real_usec PAPI_get_shared_lib_info PAPI_get_thr_specific PAPI_get_overflow_event_index PAPI_get_virt_cyc PAPI_get_virt_usec PAPI_is_initialized PAPI_library_init PAPI_list_events PAPI_lock PAPI_multiplex_init PAPI_num_hwctrs PAPI_num_events PAPI_overflow PAPI_perror PAPI_profil PAPI_query_event PAPI_read PAPI_register_thread PAPI_remove_event PAPI_remove_events PAPI_reset PAPI_set_debug PAPI_set_domain PAPI_set_granularity PAPI_set_multiplex PAPI_set_opt PAPI_set_thr_specific PAPI_shutdown PAPI_sprofil PAPI_start PAPI_state PAPI_stop PAPI_strerror PAPI_thread_id PAPI_thread_init PAPI_unlock PAPI_write PAPI_flips_rate PAPI_flops_rate PAPI_ipc PAPI_epc PAPI_hl_region_begin PAPI_hl_read PAPI_hl_region_endpapi-papi-7-2-0-t/src/libperfnec/000077500000000000000000000000001502707512200165405ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libperfnec/COPYRIGHT000066400000000000000000000021021502707512200200260ustar00rootroot00000000000000Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. papi-papi-7-2-0-t/src/libperfnec/ChangeLog000066400000000000000000000334371502707512200203240ustar00rootroot000000000000002006-08-21 Stephane Eranian This file will not be updated anymore, Refer to SF.net CVS log for diff information 2006-07-10 Stephane Eranian * removed PFM_FL_X86_INSECURE because it is not needed anymore * removed perfmon_i386.h and perfmon_mips64.h because empty 2006-06-28 Stephane Eranian * added pfmsetup.c (Kevin Corry IBM) * fixed pfmsetup.c to correctly handle sampling format uuid 2006-06-28 Stephane Eranian * added libpfm_montecito.3 man page * updated libpfm_itanium2.3 man page * removed pfm_print_event_info() and related calls from library * removed unused pfmlib_mont_ipear_mode_t struct * remove etb_ds from Montecito ETB struct as it can only have one value * added showevtinfo.c example * added PFMLIB_ITA2_EVT_NO_SET to pfmlib_itanium2.h * added PFMLIB_MONT_EVT_NO_SET to pfmlib_montecito.h * replaced pfm_mont_get_event_caf() by pfm_mont_get_event_type() * added missing perfmon_compat.h from include install (Will Cohen) * fortify showreginfo.c for FC6 (Will Cohen) 2006-06-13 Stephane Eranian * added generic support or event umask (Kevin Corry from IBM) * changed detect_pmcs.c to use pfm-getinfo_evtsets() * updated all examples to use the new detect_unavailable_pmcs() * the examples require 2.6.17-rc6 to run 2006-05-22 Stephane Eranian * corrected architected IA-32 PMU detection code, e.g., PIC assembly * fixed counter width of IA-32 architected PMU to 32 * fixed definition of perfevtsel to 64-bit wide for IA-32 architected PMU 2006-05-11 Stephane Eranian * added support for IA-32 architected PMU as specified in the latest IA-32 architecure manuals. 
	There is enough to support minimal functionality on Core Duo/Solo
	processors
	* updated system call number to match those used with 2.6.17-rc4
	* enhanced i386_p6 model detection code

2006-04-25  Stephane Eranian
	* updated pfmlib_gen_mips64.c with latest code from Phil Mucci
	* introduced get_event_code_counter() internal method to handle the
	  fact that on some MPUs (MIPS) an event may have a different value
	  based on the counter it is assigned to. This is a superset of the
	  previous get_event_code(); added PFMLIB_CNT_FIRST to ask for the
	  first value (or don't care)

2006-04-05  Stephane Eranian
	* added support for install_prefix in makefile
	* fixed broken ETB_EVENT (not reported as an ETB event)
	* added BRANCH_EVENT as alias to ETB_EVENT for Montecito
	* added support for unavailable PMC registers to pfm_dispatch_events()
	* added detect_pmcs.c, detect_pmcs.h in examples
	* updated all generic examples to use detect_unavail_pmcs() helper function
	* updated pfm_dispatch_events() man pages
	* cleanup PFMLIB_REGMASK_*, change to pfm_regmask_*
	* created a separate set of man pages for all pfm_regmask_* functions

2006-04-04  Stephane Eranian
	* fixed makefile in include to install perfmon_i386.h for x86_64
	  install (Will Cohen from Redhat)
	* install pfmlib_montecito.h on IA64

2006-04-05  Stephane Eranian
	* updated system call numbers to 2.6.17-rc1
	* incorporated a type change for reg_value in pfmlib.h (Kevin Corry from IBM)

2006-03-22  Stephane Eranian
	* changed HT detection for PEBS examples

2006-03-07  Stephane Eranian
	* updated to 2.6.16-rc5 new perfmon code base support
	* added preliminary Montecito support
	* incorporated AMD provided event list for X86-64 (Ray Bryant)
	* renamed all GEN_X86_64 gen_x86_64 to amd_x86_64
	* removed PFM_32BIT_ABI_64BIT_OS, ABI now supports ILP32,LP64 without
	  special compilation

2006-01-16  Stephane Eranian
	* added PFM_32BIT_ABI_64BIT_OS to allow 32-bit compile (32-bit ABI)
	  for a 64-bit OS
	* added C++ support to perfmon header files
	* added MIPS64 (5K,20K) support
(provided by Phil Mucci) * restructured *_standalone.c examples * added pfm_get_event_code_counter() and man page * changed implementation of pfm_get_num_pm*() * remove non-sense example task_view.c * added support for MIPS in some examples 2006-01-09 Stephane Eranian * examples code cleanups * example support up to 2048 CPU (syst.c) * portable sampling examples support more than 64 PMDs 2005-12-15 Stephane Eranian * updated all examples to new pfm_create_context() prototype * fixed some type mismatch in pfmlib_itanium2.c * required for 2.6.15-rc5-git3 kernel patch 2005-10-18 Stephane Eranian * forced perfsel.en bit to 1 for X86-64 and i386/p6 * inverted reset mask to be more familiar in examples/showreginfo.c * updated P4 examples to force enable bit to 1 2005-09-28 Stephane Eranian * split p6/pentium M event tables. Pentium M adds a few more events and changes the semantic of some. * added smpl_standalone.c, notify_standalone.c and ia32/smpl_pebs.c * cleanup the examples some more * updated multiplex. to match structure of multiplex2.c * updated perfmon2 kernel headers to match 2.6.14-rc2-mm1 release * added man pages for libpfm_p6 and libpfm_x86_64 * fixed handling of edge field for P6 2005-08-01 Stephane Eranian * switch all examples in examples/dir to use the multi system call interface. * updated perfmon.h/perfmon_compat.h to latest kernel interface (multi syscall) 2004-06-24 Stephane Eranian * fixed Itanium2 events tables L2_FORCE_RECIRC_* and L2_L3ACCESS_* events can only be measured by PMC4 * fixed pfm_*_get_event_counters(). It would always return the counter mask for event index 0. 
2004-06-24 Stephane Eranian * fixed pfm_print_event_info_*() because it would not print the PMC/PMD mask correctly * updated pfm_dispatch_*ear() for Itanium2 * updated pfm_dispatch_irange() for Itanium2 * updated pfm_ita2_print_info() * updated pfm_ita2_num_pmcs() and pfm_ita2_num_pmds() 2004-02-12 Stephane Eranian * fixed a bug in pfmlib_itanium2.c which cause measurements using opcode matching with an event different from IA64_TAGGED_INST_RETIRED* to return wrong results, i.e., opcode filter was ignored. 2003-11-21 Stephane Eranian * changed interface to pfm_get_impl_*() to use a cleaner definition for bitmasks. pfmlib_regmask_t is now a struct and applications must use accesor macros PFMLIB_REGMASK_*() * added pfm_get_num_pmcs(), pfm_get_num_pmds(), pfm_get_num_counters() * updated man pages to reflect changes * cleanup all examples to reflect bitmask changes 2003-10-24 Stephane Eranian * added reserved fields to the key pfmlib structure for future extensions (recompilation from beta required). 2003-10-24 Stephane Eranian * released beta of version 3.0 * some of the changes not reported by older entries: * removed freesmpl.c example * added ita2_btb.c, ita2_dear.c, ita_dear.c, multiplex.c * added task_attach.c, task_attach_timeout.c, task_smpl.c * added missing itanium2 events, mostly subevent combinations for SYLL_NOT_DISPERSED, EXTERN_DP_PINS_0_TO_3, and EXTERN_DP_PINS_4_TO_5 * got rid of pfm_get_first_event(), pfm_get_next_event(). 
First valid index is always 0, use pfm_get_num_events() to find last event index * renamed pfm_stop() to pfm_self_stop(), pfm_start() to pfm_self_start() * updated all examples to perfmon2 interface * added notify_self2.c, notify_self3.c examples * updated perfmon.h/perfmon_default_smpl.h to reflect latest perfmon-2 changes (2.6.0-test8) 2003-08-25 Stephane Eranian * allowed mulitple EAR/BTB events * really implemented the 4 different ways of programming EAR/BTB 2003-07-30 Stephane Eranian * updated all man pages to reflect changes for 3.0 * more cleanups in the examples to make all package compile without warning with ecc 2003-07-29 Stephane Eranian * fixed a limitation in the iod_table[] used if dispatch_drange(). Pure Opc mode is possible using the IBR/Opc mode. Reported by Geoff Kent at UIUC. * cleaned up all functions using a bitmask as arguments 2003-06-30 Stephane Eranian * added pfm_get_max_event_name_len() * unsigned vs. int cleanups * introduced pfm_*_pmc_reg_t and pfm_*_pmd_reg_t * cleaned up calls using bitmasks * renamed PMU_MAX_* to PFMLIB_MAX_* * got rid of PMU_FIRST_COUNTER * introduced pfmlib_counter_t * internal interface changes, renaming: pmu_name vs name * got rid of char **name and replaced with char *name, int maxlen * added pfm_start(), pfm_stop() as real functions * changed interface of pfm_dispatch_events to make input vs. output parameters more explicit * model-specific input/output to pfm_dispatch_event() now arguments instead of being linked from the generic argument. 2003-06-27 Stephane Eranian * added missing const to char arguments for pfm_find_event, pfm_find_event_byname, pfm_print_event_info. Suggestion by Hans * renamed pfp_pc to pfp_pmc * renamed pfp_pc_count to pfp_pmc_count 2003-06-11 Stephane Eranian * updated manuals to reflect library changes * updated all examples to match the new Linux/ia64 kernel interface (perfmon2). 
2003-06-10 Stephane Eranian * fix pfmlib_itanium.c: dispatch_dear(), dispatch_iear() to setup EAR when there is an EAR event but no detailed setting in ita_param. * added pfm_ita_ear_mode_t to pfmlib_itanium.h * added pfm_ita_get_ear_mode() to pfmlib_itanium.h 2003-06-06 Stephane Eranian * add a generic call to return hardware counter width: pfm_get hw_counter_width() * updated perfmon.h to perfmon2 * added flag to itanium/itanium2 specific parameter to tell the library to ignore per-even qualifier constraints. see PFMLIB_ITA_FL_CNT_NO_QUALCHECK and PFMLIB_ITA2_FL_CNT_NO_QUALCHECK. 2003-05-06 Stephane Eranian * got rid of all connections to perfmon.h. the library is now fully self-contained. pfarg_reg_t has been replaced by pfmlib_reg_t. 2002-03-20 Stephane Eranian * fix %x vs. %lx for pmc8/9 in pfmlib_itanium.c and pfmlib_itanium2.c 2002-12-20 Stephane Eranian * added PFM_FL_EXCL_IDLE to perfmon.h 2002-12-18 Stephane Eranian * clear ig_ad, inv fields in PMC8,9 when no code range restriction is used. 2002-12-17 Stephane Eranian * update pfm_initialize.3 to clarify when this function needs to be called. 2002-12-10 Stephane Eranian * changed _SYS_PERFMON.h to _PERFMON_PERFMON.h 2002-12-06 Stephane Eranian * integrated Peter Chubb's Debian script fixes * fixed the Debian script to include the examples 2002-12-05 Stephane Eranian * added man pages for pfm_start() and pfm_stop() * release 2.0 beta for review 2002-12-04 Stephane Eranian * the pfmlib_param_t structure now contains the pmc array (pfp_pc[]) as well as a counter representing the number of valid entries written to pfp_pc[]. cleaned up all modules and headers to reflect changes. * added pfm_ita2_is_fine_mode() to test whether or not fine mode was used for code ranges. 
2002-12-03 Stephane Eranian * removed pfm_ita_ism from pfmlib_ita_param_t * removed pfm_ita2_ism from pfmlib_ita2_param_t * added libpfm.3, libpfm_itanium.3, libpfm_itanium2.3 * enabled per-range privilege level mask in pfmlib_itanium.c and pfmlib_itanium2.c 2002-11-21 Stephane Eranian * added pfmlib_generic.h to cleanup pfmlib.h * dropped retry argument to pfm_find_event() * got rid of the pfm_find_byvcode*() interface (internal only) * cleanup up interface code is int not unsigned long * added man pages in docs/man for the generic library interface * moved the PMU specific handy shortcuts for register struct to module specific file. Avoid possible conflicts in applications using different PMU models in one source file. 2002-11-20 Stephane Eranian * separated the library, headers, examples from the pfmon tool * changed license of library to MIT-style license * set version number to 2.0 * added support to generate a shared version of libpfm * fix pfm_dispatch_opcm() to check for effective use of IA64_TAGGED_INST_IBRPX_PMCY before setting the bits in PMC15 (spotted by UIUC Impact Team). * cleaned up error messages in the examples * fix bug in pfm_ita2_print_info() which caused extra umask bits to be displayed for EAR. 2002-11-19 Stephane Eranian * added pfm_get_impl_counters() to library interface and PMU models * added missing support for pfm_get_impl_pmds(), pfm_get_impl_pmcs() to pfmlib_generic.c * created pfmlib_compiler.h to encapsulate inline assembly differences between compilers. * created pfmlib_compiler_priv.h to encapsulate the inline assembly differences for library private code. 2002-11-13 Stephane Eranian * fixed definition of pmc10 in pfmlib_itanium2.h to account for a layout difference between cache and TLB mode (spotted by UIUC Impact Team). Was causing problems with some latency values in IEAR cache mode. * fixed initialization of pmc10 in pfmlib_itanium2.c to reflect above change. 
2002-10-14 Stephane Eranian * fixed impl_pmds[] in pfmlib_itanium.c and pfmlib_itanium2.c. PMD17 was missing. 2002-09-09 Stephane Eranian * updated include/perfmon/perfmon.h to include sampling period randomization. 2002-08-14 Stephane Eranian * fix bitfield length for pmc14_ita2_reg and pmd3_ita2_reg in pfmlib_itanium2.h (David Mosberger) papi-papi-7-2-0-t/src/libperfnec/Makefile000066400000000000000000000040621502707512200202020ustar00rootroot00000000000000# # Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. # Contributed by Stephane Eranian # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
# # # Look in config.mk for options # TOPDIR := $(shell if [ "$$PWD" != "" ]; then echo $$PWD; else pwd; fi) include config.mk DIRS=lib include docs ifeq ($(SYS),Linux) DIRS +=libpfms endif DIRS += $(EXAMPLES_DIRS) all: @echo Compiling for \'$(ARCH)\' target @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d $@ ; done lib: $(MAKE) -C lib clean: @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d $@ ; done distclean: clean depend: @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d $@ ; done tar: clean a=`basename $$PWD`; cd ..; tar zcf $$a.tar.gz $$a; echo generated ../$$a.tar.gz; tarcvs: clean a=`basename $$PWD`; cd ..; tar --exclude=CVS -zcf $$a.tar.gz $$a; echo generated ../$$a.tar.gz; install: @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d $@ ; done install_examples: @set -e ; for d in $(EXAMPLES_DIRS) ; do $(MAKE) -C $$d $@ ; done .PHONY: tar tarcvs lib # DO NOT DELETE papi-papi-7-2-0-t/src/libperfnec/README000066400000000000000000000067011502707512200174240ustar00rootroot00000000000000 ------------------------------------------------------ libpfm-3.10: a helper library to program the Performance Monitoring Unit (PMU) ------------------------------------------------------ Copyright (c) 2001-2007 Hewlett-Packard Development Company, L.P. Contributed by Stephane Eranian This package provides a library, called libpfm, which can be used to develop monitoring tools which use the Performance Monitoring Unit (PMU) of several modern processors. 
This version of libpfm supports: - For Intel IA-64: Itanium (Merced), Itanium 2 (McKinley, Madison, Deerfield), Itanium 2 9000/9100 (Montecito, Montvale) and Generic - For AMD X86: AMD64 (K8, family 10h) - For Intel X86: Intel P6 (Pentium II, Pentium Pro, Pentium III, Pentium M) Intel Yonah (Core Duo/Core Solo), Intel Netburst (Pentium 4, Xeon) Intel Core (Merom, Penryn, Dunnington) Core 2 and Quad Intel Atom Intel Nehalem (Nehalem, Westmere) Intel architectural perfmon v1, v2, v3 - For MIPS: 5K, 20K, 25KF, 34K, 5KC, 74K, R10000, R12000, RM7000, RM9000, SB1, VR5432, VR5500, SiCortex ICA9A/ICE9B - For Cray: XT3, XT4, XT5, XT5h, X2 - For IBM: IBM Cell processor POWER: PPC970, PPC970MP, POWER4+, POWER5, POWER5+, POWER6, POWER7 - For Sun: Sparc: Ultra12, Ultra3, Ultra3i, Ultra3Plus, Ultra4Plus, Sparc: Niagara1, Niagara2 The core library is generic and does not depend on the perfmon interface. It is possible to use it on other operating systems. WHAT'S THERE ------------- - the library source code including support for all processors listed above - a set of examples showing how the library can be used with the perfmon2 and perfmon3 kernel interface. - a set of older examples for IA-64 only using the legacy perfmon2 interface (v2.0). - a set of library header files and the perfmon2 and perfmon3 kernel interface headers - libpfms: a simple library to help setup SMP system-wide monitoring sessions. It comes with a simple example. This library is not part of libpfm. - man pages for all the library entry points - Python bindings for libpfm and the perfmon interface (experimental). INSTALLATION ------------ - edit config.mk to : - update some of the configuration variables - make your compiler options - type make - type make install - To compile and install the Python bindings, you need to go to the python sub-directory and type make. Python is not systematically built - to compile the library for another ABI (e.g. 
32-bit x86 on a 64-bit x86) system, you can pass the ABI flag to the compiler as follows (assuming you have the multilib version of gcc): $ make OPTION="-m32 -O2" REQUIREMENTS: ------------- - to run the programs in the examples subdir, you MUST be using a linux kernel with perfmon3. Perfmon3 is available as a branch of the perfmon kernel GIT tree on kernel.org. - to run the programs in the examples_v2x subdir, you MUST be using a linux kernel with perfmon2. Perfmon2 is available as the main branch of the perfmon kernel GIT tree on kernel.org. - On IA-64, the examples in old_interface_ia64_examples work with any 2.6.x kernels. - to compile the Python bindings, you need to have SWIG and the python development packages installed DOCUMENTATION ------------- - man pages for all entry points - More information can be found on library web site: http://perfmon2.sf.net papi-papi-7-2-0-t/src/libperfnec/TODO000066400000000000000000000004331502707512200172300ustar00rootroot00000000000000TODO list: ---------- - add Linux/ia64 perfmon support to GNU libc, this would avoid having the perfmon.h perfmon_default_smpl.h headers here. - add library interface to help setup system-wide mode SMP on Linux/ia64 - add support for cumulative calls to pfm_dispatch_events() papi-papi-7-2-0-t/src/libperfnec/config.mk000066400000000000000000000061561502707512200203460ustar00rootroot00000000000000# # Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. 
# Contributed by Stephane Eranian # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # # This file is part of libpfm, a performance monitoring support library for # applications on Linux. # # # This file defines the global compilation settings. # It is included by every Makefile # # SYS := $(shell uname -s) ARCH := nec # # CONFIG_PFMLIB_SHARED: y=compile static and shared versions, n=static only # CONFIG_PFMLIB_OLD_PFMV2: enable old ( 2.x, x <=4) perfmon2 (mutually exclusive with v3 support) CONFIG_PFMLIB_SHARED?=y CONFIG_PFMLIB_OLD_PFMV2?=n # # Library version # VERSION=3 REVISION=10 AGE=0 # # Where should things (lib, headers, man) go in the end. 
# install_prefix?=/usr/local PREFIX?=$(install_prefix) LIBDIR=$(PREFIX)/lib INCDIR=$(PREFIX)/include MANDIR=$(PREFIX)/share/man EXAMPLESDIR=$(PREFIX)/share/doc/libpfm-$(VERSION).$(REVISION).$(AGE)/examples CONFIG_PFMLIB_ARCH_IA64=y CONFIG_PFMLIB_SHARED=n CONFIG_PFMLIB_OLD_PFMV2=y # handle special cases for 64-bit builds ifeq ($(BITMODE),64) ifeq ($(ARCH),powerpc) CONFIG_PFMLIB_ARCH_POWERPC64=y endif endif # # you shouldn't have to touch anything beyond this point # # # The entire package can be compiled using # icc the Intel Itanium Compiler (7.x,8.x, 9.x) # or GNU C #CC=icc CC?=gcc LIBS= INSTALL=install LN?=ln -sf PFMINCDIR=$(TOPDIR)/include PFMLIBDIR=$(TOPDIR)/lib DBG?=-g -Wall -Werror # gcc/mips64 bug ifeq ($(CONFIG_PFMLIB_ARCH_SICORTEX),y) OPTIM?=-O else OPTIM?=-O2 endif CFLAGS+=$(OPTIM) $(DBG) -I$(PFMINCDIR) MKDEP=makedepend PFMLIB=$(PFMLIBDIR)/libpfm.a # Reset options for Cray XT ifeq ($(CONFIG_PFMLIB_ARCH_CRAYXT),y) LDFLAGS+=-static CONFIG_PFMLIB_OLD_PFMV2=y endif # Reset the compiler for Cray-X2 (load x2-gcc module) ifeq ($(CONFIG_PFMLIB_ARCH_CRAYX2),y) CC=craynv-cray-linux-gnu-gcc LDFLAGS+=-static CONFIG_PFMLIB_OLD_PFMV2=y endif ifeq ($(CONFIG_PFMLIB_ARCH_SICORTEX),y) CONFIG_PFMLIB_OLD_PFMV2=y endif ifeq ($(CONFIG_PFMLIB_ARCH_POWERPC64),y) CFLAGS+= -m64 LDFLAGS+= -m64 LIBDIR=$(PREFIX)/lib64 endif ifeq ($(CONFIG_PFMLIB_OLD_PFMV2),y) CFLAGS +=-DPFMLIB_OLD_PFMV2 endif papi-papi-7-2-0-t/src/libperfnec/docs/000077500000000000000000000000001502707512200174705ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libperfnec/docs/Makefile000066400000000000000000000061001502707512200211250ustar00rootroot00000000000000# # Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. 
# Contributed by Stephane Eranian # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # TOPDIR := $(shell if [ "$$PWD" != "" ]; then echo $$PWD; else pwd; fi)/.. 
include $(TOPDIR)/config.mk include $(TOPDIR)/rules.mk ifeq ($(CONFIG_PFMLIB_ARCH_IA64),y) ARCH_MAN=libpfm_itanium.3 libpfm_itanium2.3 libpfm_montecito.3 endif ifeq ($(CONFIG_PFMLIB_ARCH_I386),y) ARCH_MAN=libpfm_p6.3 libpfm_core.3 libpfm_amd64.3 libpfm_atom.3 libpfm_nehalem.3 endif ifeq ($(CONFIG_PFMLIB_ARCH_X86_64),y) ARCH_MAN=libpfm_amd64.3 libpfm_core.3 libpfm_atom.3 libpfm_nehalem.3 endif ifeq ($(CONFIG_PFMLIB_ARCH_MIPS64),y) endif ifeq ($(CONFIG_PFMLIB_ARCH_POWERPC),y) ARCH_MAN=libpfm_powerpc.3 endif ifeq ($(CONFIG_PFMLIB_ARCH_CRAYXT),y) endif ifeq ($(CONFIG_PFMLIB_CELL),y) endif GEN_MAN= libpfm.3 pfm_dispatch_events.3 pfm_find_event.3 pfm_find_event_bycode.3 \ pfm_find_event_bycode_next.3 pfm_find_event_mask.3 pfm_find_full_event.3 \ pfm_force_pmu.3 pfm_get_cycle_event.3 pfm_get_event_code.3 pfm_get_event_code_counter.3 \ pfm_get_event_counters.3 pfm_get_event_description.3 pfm_get_event_mask_code.3 \ pfm_get_event_mask_description.3 pfm_get_event_mask_name.3 pfm_get_event_name.3 \ pfm_get_full_event_name.3 pfm_get_hw_counter_width.3 pfm_get_impl_counters.3 \ pfm_get_impl_pmcs.3 pfm_get_impl_pmds.3 pfm_get_inst_retired.3 pfm_get_max_event_name_len.3 \ pfm_get_num_counters.3 pfm_get_num_events.3 pfm_get_num_pmcs.3 \ pfm_get_num_pmds.3 pfm_get_pmu_name.3 pfm_get_pmu_name_bytype.3 \ pfm_get_pmu_type.3 pfm_get_version.3 pfm_initialize.3 \ pfm_list_supported_pmus.3 pfm_pmu_is_supported.3 pfm_regmask_and.3 \ pfm_regmask_clr.3 pfm_regmask_copy.3 pfm_regmask_eq.3 pfm_regmask_isset.3 \ pfm_regmask_or.3 pfm_regmask_set.3 pfm_regmask_weight.3 pfm_set_options.3 \ pfm_strerror.3 MAN=$(GEN_MAN) $(ARCH_MAN) install: -mkdir -p $(DESTDIR)$(MANDIR)/man3 ( cd man3; $(INSTALL) -m 644 $(MAN) $(DESTDIR)$(MANDIR)/man3 ) papi-papi-7-2-0-t/src/libperfnec/docs/man3/000077500000000000000000000000001502707512200203265ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libperfnec/docs/man3/libpfm.3000066400000000000000000000114131502707512200216630ustar00rootroot00000000000000.TH LIBPFM 3 
"March, 2008" "" "Linux Programmer's Manual" .SH NAME libpfm \- a helper library to program Hardware Performance Units (PMUs) .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .SH DESCRIPTION The libpfm library is a helper library which is used by applications to help program the Performance Monitoring Unit (PMU), i.e., the hardware performance counters of modern processors. It provides a generic and portable programming interface to help set up the PMU configuration registers given a list of events to measure. A diversity of PMU hardware is supported; a list can be found below under \fBSUPPORTED HARDWARE\fR. The library is primarily designed to be used in conjunction with the Perfmon2 Linux kernel interface. However, at its core, it is totally independent of that interface and could as well be used on other operating systems. It is important to realize that the library does not make the actual kernel calls to program the PMU; it simply helps applications figure out which PMU registers to use to measure certain events or access certain advanced PMU features. The library logically divides PMU registers into two categories. The performance monitoring data registers (PMD) are used to collect results, e.g., counts. The performance monitoring configuration registers (PMC) are used to indicate what events to measure or what feature to enable. Programming the PMU consists of setting up the PMC registers and collecting the results in the PMD registers. The central piece of the library is the \fBpfm_dispatch_events\fR function. The number of PMC and PMD registers varies between architectures and CPU models. The association of PMC to PMD can also change. Moreover, the number and encodings of events can also widely change. Finally, the structure of a PMC register can also change. All these factors make it quite difficult to write monitoring tools. This library is designed to simplify the programming of the PMC registers by hiding the complexity behind a simple interface. 
The library does this without limiting accessibility to model-specific features by using a layered design. The library is structured in two layers. The common layer provides an interface that is shared across all PMU models. This layer is good enough to set up simple monitoring sessions which count occurrences of events. Then, there is a model-specific layer which gives access to the model-specific features. For instance, on Itanium, applications can use the library to set up the registers for the Branch Trace Buffer. Model-specific interfaces have the abbreviated PMU model name in their names. For instance, \fBpfm_ita2_get_event_umask()\fR is an Itanium2 (ita2) specific function. When the library is initialized, it automatically probes the host CPU and enables the right set of interfaces. The common interface is defined in the \fBpfmlib.h\fR header file. Model-specific interfaces are defined in model-specific header files. For instance, \fBpfmlib_amd64.h\fR provides the AMD64 interface. .SH ENVIRONMENT VARIABLES It is possible to enable certain debug output of the library using environment variables. The following variables are defined: .TP .B LIBPFM_VERBOSE Enable verbose output. Value must be 0 or 1. When not set, the verbosity level can be controlled with the \fBpfm_set_options\fR function. .TP .B LIBPFM_DEBUG Enable debug output. Value must be 0 or 1. When not set, the debug level can be controlled with the \fBpfm_set_options\fR function. .TP .B LIBPFM_DEBUG_STDOUT Redirect verbose and debug output to the standard output file descriptor (stdout). By default, the output is directed to the standard error file descriptor (stderr). .sp Alternatively, it is possible to control verbosity and debug output using the \fBpfm_set_options\fR function. 
.LP .SH SUPPORTED HARDWARE .nf libpfm_amd64(3) AMD64 processors K8 and Barcelona (families 0Fh and 10h) libpfm_core(3) Intel Core processor family libpfm_atom(3) Intel Atom processor family libpfm_itanium(3) Intel Itanium libpfm_itanium2(3) Intel Itanium 2 libpfm_montecito(3) Intel dual-core Itanium 2 9000 (Montecito) libpfm_p6(3) P6 processor family including the Pentium M processor libpfm_powerpc(3) IBM PowerPC and POWER processor families (PPC970(FX,GX), PPC970MP POWER4, POWER4+, POWER5, POWER5+, and POWER6) .fi .SH AUTHORS .nf Stephane Eranian Robert Richter .if .PP .SH SEE ALSO libpfm(3), libpfm_amd64(3), libpfm_core(3), libpfm_itanium2(3), libpfm_itanium(3), libpfm_montecito(3), libpfm_p6(3), libpfm_powerpc(3). .nf pfm_dispatch_events(3), pfm_find_event(3), pfm_set_options(3), pfm_get_cycle_event(3), pfm_get_event_name(3), pfm_get_impl_pmcs(3), pfm_get_pmu_name(3), pfm_get_version(3), pfm_initialize(3), pfm_regmask_set(3), pfm_set_options(3), pfm_strerror(3). .fi .sp Examples shipped with the library papi-papi-7-2-0-t/src/libperfnec/docs/man3/libpfm_amd64.3000066400000000000000000000130651502707512200226630ustar00rootroot00000000000000.TH LIBPFM 3 "April, 2008" "" "Linux Programmer's Manual" .SH NAME libpfm_amd64 - support for AMD64 processors .SH SYNOPSIS .nf .B #include .B #include .sp .SH DESCRIPTION The libpfm library provides full support for the AMD64 processor families 0Fh and 10H (K8, Barcelona, Phenom) when running in either 32-bit or 64-bit mode. The interface is defined in \fBpfmlib_amd64.h\fR. It consists of a set of functions and structures which describe and allow access to the AMD64 specific PMU features. Note that it only supports AMD processors. .sp When AMD64 processor-specific features are needed to support a measurement, their descriptions must be passed as model-specific input arguments to the \fBpfm_dispatch_events()\fR function. 
The AMD64 processor-specific input arguments are described in the \fBpfmlib_amd64_input_param_t\fR structure and the output parameters in \fBpfmlib_amd64_output_param_t\fR. They are defined as follows: .sp .nf typedef struct { uint32_t cnt_mask; uint32_t flags; } pfmlib_amd64_counter_t; typedef struct { unsigned int maxcnt; unsigned int options; } ibs_param_t; typedef struct { pfmlib_amd64_counter_t pfp_amd64_counters[PMU_AMD64_MAX_COUNTERS]; uint32_t flags; uint32_t reserved1; ibs_param_t ibsfetch; ibs_param_t ibsop; uint64_t reserved2; } pfmlib_amd64_input_param_t; typedef struct { uint32_t ibsfetch_base; uint32_t ibsop_base; uint64_t reserved[7]; } pfmlib_amd64_output_param_t; .fi .LP The \fBflags\fR field of \fBpfmlib_amd64_input_param_t\fR describes which features of the PMU to use. The following use flags exist: .TP .B PFMLIB_AMD64_USE_IBSFETCH Profile IBS fetch performance (see below under \fBINSTRUCTION BASED SAMPLING\fR) .TP .B PFMLIB_AMD64_USE_IBSOP Profile IBS execution performance (see below under \fBINSTRUCTION BASED SAMPLING\fR) .LP Multiple features can be selected. Note that there are no use flags needed for \fBADDITIONAL PER-EVENT FEATURES\fR. .LP Various typedefs for MSR encoding and decoding are available. See \fBpfmlib_amd64.h\fR for details. .SS ADDITIONAL PER-EVENT FEATURES AMD64 processors provide a few additional per-event features for counters: thresholding, inversion, edge detection, virtualization. They can be set using the \fBpfp_amd64_counters\fR data structure for each event. The \fBflags\fR field of \fBpfmlib_amd64_counter_t\fR can be initialized as follows: .TP .B PFMLIB_AMD64_SEL_INV Invert the result of the \fBcnt_mask\fR comparison when set .TP .B PFMLIB_AMD64_SEL_EDGE Enables edge detection of events. .TP .B PFMLIB_AMD64_SEL_GUEST On AMD64 Family 10h processors only. 
Event is only measured when processor is in host mode. .LP The \fBcnt_mask\fR field is used to set the event threshold. The value of the counter is incremented each time the number of occurrences per cycle of the event is greater or equal to the value of the field. When zero all occurrences are counted. .SS INSTRUCTION BASED SAMPLING (IBS) The libpfm_amd64 provides access to the model specific feature Instruction Based Sampling (IBS). IBS has been introduced with family 10h. .LP The IBS setup is using the structures \fBpfmlib_amd64_input_param_t\fR and \fBpfmlib_amd64_output_param_t\fR with its members \fBflags\fR, \fBibsfetch\fR, \fBibsop\fR, \fBibsfetch_base\fR, \fBibsop_base\fR. The input arguments \fBibsop\fR and \fBibsfetch\fR can be set in inp_mod (type \fBpfmlib_amd64_input_param_t\fR). The corresponding \fBflags\fR must be set to enable a feature. .LP Both, IBS execution profiling and IBS fetch profiling, require a maximum count value of the periodic counter (\fBmaxcnt\fR) as parameter. This is a 20 bit value, bits 3:0 are always set to zero. Additionally, there is an option (\fBoptions\fR) to enable randomization (\fBIBS_OPTIONS_RANDEN\fR) for IBS fetch profiling. .LP The IBS registers IbsFetchCtl (0xC0011030) and IbsOpCtl (0xC0011033) are available as PMC and PMD in Perfmon. The function \fBpfm_dispatch_events()\fR initializes these registers according to the input parameters in \fBpfmlib_amd64_input_param_t\fR. .LP Also, \fBpfm_dispatch_events()\fR passes back the index in pfp_pmds[] of the IbsOpCtl and IbsFetchCtl register. For this there are the entries \fBibsfetch_base\fR and \fBibsop_base\fR in \fBpfmlib_amd64_output_param_t\fR. The index may vary depending on other PMU settings, especially counter settings. If using the PMU with only one IBS feature and no counters, the index of the base register is 0. 
.LP Example code: .LP .nf /* initialize IBS */ inp_mod.ibsop.maxcnt = 0xFFFF0; inp_mod.flags |= PFMLIB_AMD64_USE_IBSOP; ret = pfm_dispatch_events(NULL, &inp_mod, &outp, &outp_mod); if (ret != PFMLIB_SUCCESS) { ... } /* setup PMU */ /* PMC_IBSOPCTL */ pc[0].reg_num = outp.pfp_pmcs[0].reg_num; pc[0].reg_value = outp.pfp_pmcs[0].reg_value; /* PMD_IBSOPCTL */ pd[0].reg_num = outp.pfp_pmds[0].reg_num; pd[0].reg_value = 0; /* setup sampling */ pd[0].reg_flags = PFM_REGFL_OVFL_NOTIFY; /* add range check here */ pd[0].reg_smpl_pmds[0] = ((1UL << PMD_IBSOP_NUM) - 1) << outp.pfp_pmds[0].reg_num; /* write pc and pd to PMU */ ... .fi .SH ERRORS Refer to the description of the \fBpfm_dispatch_events()\fR function for errors. .SH SEE ALSO pfm_dispatch_events(3) and set of examples shipped with the library .SH AUTHORS .nf Stephane Eranian Robert Richter .if .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/libpfm_atom.3000066400000000000000000000055521502707512200227120ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2006" "" "Linux Programmer's Manual" .SH NAME libpfm_core - support for Intel Atom processor family .SH SYNOPSIS .nf .B #include .B #include .sp .SH DESCRIPTION The libpfm library provides full support for the Intel Atom processor. This processor implements Intel architectural perfmon v3 with Precise Event-Based Sampling (PEBS) support. It also implements all architected events to which it adds lots of Atom specific events. .sp The libpfm interface is defined in \fBpfmlib_intel_atom.h\fR. It consists of a set of functions and structures which describe and allow access to the Intel Atom processor specific PMU features. .sp When Intel Atom processor specific features are needed to support a measurement, their descriptions must be passed as model-specific input arguments to the \fBpfm_dispatch_events()\fR function. The Intel Atom processors specific input arguments are described in the \fBpfmlib_intel_atom_input_param_t\fR structure. 
No output parameters are currently defined. The input parameters are defined as follows: .sp .nf typedef struct { unsigned int cnt_mask; unsigned int flags; } pfmlib_intel_atom_counter_t; typedef struct { pfmlib_intel_atom_counter_t pfp_intel_atom_counters[PMU_INTEL_ATOM_NUM_COUNTERS]; unsigned int pfp_intel_atom_pebs_used; uint64_t reserved[4]; } pfmlib_intel_atom_input_param_t; .fi .sp .sp The Intel Atom processor provides several additional per-event features for counters: thresholding, inversion, edge detection, monitoring both threads. They can be set using the \fBpfp_intel_atom_counters\fR data structure for each event. The \fBflags\fR field can be initialized with any combination of the following values: .TP .B PFMLIB_INTEL_ATOM_SEL_INV Invert the result of the \fBcnt_mask\fR comparison when set .TP .B PFMLIB_INTEL_ATOM_SEL_EDGE Enable edge detection of events. .TP .B PFMLIB_INTEL_ATOM_SEL_ANYTHR Enable measuring the event in any of the two threads. By default only the current thread is measured. .LP The \fBcnt_mask\fR field is used to set the event threshold. The value of the counter is incremented each time the number of occurrences per cycle of the event is greater than or equal to the value of the field. Thus the event is modified to actually measure the number of qualifying cycles. When zero all occurrences are counted (this is the default). .sp .SH Support for Precise-Event Based Sampling (PEBS) The library can be used to set up the PMC registers when using PEBS. In this case, the \fBpfp_intel_atom_pebs_used\fR field must be set to 1. When using PEBS, it is not possible to use more than one event. .LP .SH ERRORS Refer to the description of the \fBpfm_dispatch_events()\fR function for errors. 
.SH SEE ALSO pfm_dispatch_events(3) and set of examples shipped with the library .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/libpfm_core.3000066400000000000000000000056701502707512200227030ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2006" "" "Linux Programmer's Manual" .SH NAME libpfm_core - support for Intel Core processor family .SH SYNOPSIS .nf .B #include .B #include .sp .SH DESCRIPTION The libpfm library provides full support for the Intel Core processor family, including the Core 2 Duo and Quad series. The interface is defined in \fBpfmlib_core.h\fR. It consists of a set of functions and structures which describe and allow access to the Intel Core processors specific PMU features. .sp When Intel Core processor specific features are needed to support a measurement, their descriptions must be passed as model-specific input arguments to the \fBpfm_dispatch_events()\fR function. The Intel Core processors specific input arguments are described in the \fBpfmlib_core_input_param_t\fR structure. No output parameters are currently defined. The input parameters are defined as follows: .sp .nf typedef struct { unsigned int cnt_mask; unsigned int flags; } pfmlib_core_counter_t; typedef struct { unsigned int pebs_used; } pfmlib_core_pebs_t; typedef struct { pfmlib_core_counter_t pfp_core_counters[PMU_CORE_NUM_COUNTERS]; pfmlib_core_pebs_t pfp_core_pebs; uint64_t reserved[4]; } pfmlib_core_input_param_t; .fi .sp .sp The Intel Core processor provides a few additional per-event features for counters: thresholding, inversion, edge detection. They can be set using the \fBpfp_core_counters\fR data structure for each event. The \fBflags\fR field can be initialized with any combinations of the following values: .TP .B PFMLIB_CORE_SEL_INV Inverse the results of the \fBcnt_mask\fR comparison when set .TP .B PFMLIB_CORE_SEL_EDGE Enables edge detection of events. .LP The \fBcnt_mask\fR field is used to set the event threshold. 
The value of the counter is incremented each time the number of occurrences per cycle of the event is greater than or equal to the value of the field. Thus the event is modified to actually measure the number of qualifying cycles. When zero all occurrences are counted (this is the default). .sp .SH Support for Precise-Event Based Sampling (PEBS) The library can be used to set up the PMC registers when using PEBS. In this case, the \fBpfp_core_pebs\fR structure must be used and the \fBpebs_used\fR field must be set to 1. When using PEBS, it is not possible to use more than one event. .SH Support for Intel Core 2 Duo and Quad processors The Intel Core 2 Duo and Quad processors are based on the Intel Core micro-architecture. They implement the Intel architectural PMU and some extensions such as PEBS. They support all the architectural events and many more Core 2-specific events. The library auto-detects the processor and provides access to Core 2 events whenever possible. .LP .SH ERRORS Refer to the description of the \fBpfm_dispatch_events()\fR function for errors. 
.SH SEE ALSO pfm_dispatch_events(3) and set of examples shipped with the library .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/libpfm_itanium.3000066400000000000000000000461641502707512200234240ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2003" "" "Linux Programmer's Manual" .SH NAME libpfm_itanium - support for Itanium specific PMU features .SH SYNOPSIS .nf .B #include .B #include .sp .BI "int pfm_ita_is_ear(unsigned int " i ");" .BI "int pfm_ita_is_dear(unsigned int " i ");" .BI "int pfm_ita_is_dear_tlb(unsigned int " i ");" .BI "int pfm_ita_is_dear_cache(unsigned int " i ");" .BI "int pfm_ita_is_iear(unsigned int " i ");" .BI "int pfm_ita_is_iear_tlb(unsigned int " i ");" .BI "int pfm_ita_is_iear_cache(unsigned int " i ");" .BI "int pfm_ita_is_btb(unsigned int " i ");" .BI "int pfm_ita_support_opcm(unsigned int " i ");" .BI "int pfm_ita_support_iarr(unsigned int " i ");" .BI "int pfm_ita_support_darr(unsigned int " i ");" .BI "int pfm_ita_get_event_maxincr(unsigned int " i ", unsigned int *"maxincr ");" .BI "int pfm_ita_get_event_umask(unsigned int " i ", unsigned long *"umask ");" .sp .SH DESCRIPTION The libpfm library provides full support for all the Itanium specific features of the PMU. The interface is defined in \fBpfmlib_itanium.h\fR. It consists of a set of functions and structures which describe and allow access to the Itanium specific PMU features. .sp The Itanium specific functions presented here are mostly used to retrieve the characteristics of an event. Given a opaque event descriptor, obtained by the \fBpfm_find_event()\fR or its derivative functions, they return a boolean value indicating whether this event support this features or is of a particular kind. .sp The \fBpfm_ita_is_ear()\fR function returns 1 if the event designated by \fBi\fR corresponds to a EAR event, i.e., an Event Address Register type of events. Otherwise 0 is returned. 
For instance, \fBDATA_EAR_CACHE_LAT4\fR is an ear event, but \fBCPU_CYCLES\fR is not. It can be a data or instruction EAR event. .sp The \fBpfm_ita_is_dear()\fR function returns 1 if the event designated by \fBi\fR corresponds to an Data EAR event. Otherwise 0 is returned. It can be a cache or TLB EAR event. .sp The \fBpfm_ita_is_dear_tlb()\fR function returns 1 if the event designated by \fBi\fR corresponds to a Data EAR TLB event. Otherwise 0 is returned. .sp The \fBpfm_ita_is_dear_cache()\fR function returns 1 if the event designated by \fBi\fR corresponds to a Data EAR cache event. Otherwise 0 is returned. .sp The \fBpfm_ita_is_iear()\fR function returns 1 if the event designated by \fBi\fR corresponds to an instruction EAR event. Otherwise 0 is returned. It can be a cache or TLB instruction EAR event. .sp The \fBpfm_ita_is_iear_tlb()\fR function returns 1 if the event designated by \fBi\fR corresponds to an instruction EAR TLB event. Otherwise 0 is returned. .sp The \fBpfm_ita_is_iear_cache()\fR function returns 1 if the event designated by \fBi\fR corresponds to an instruction EAR cache event. Otherwise 0 is returned. .sp The \fBpfm_ita_support_opcm()\fR function returns 1 if the event designated by \fBi\fR supports opcode matching, i.e., can this event be measured accurately when opcode matching via PMC8/PMC9 is active. Not all events supports this feature. .sp The \fBpfm_ita_support_iarr()\fR function returns 1 if the event designated by \fBi\fR supports code address range restrictions, i.e., can this event be measured accurately when code range restriction is active. Otherwise 0 is returned. Not all events supports this feature. .sp The \fBpfm_ita_support_darr()\fR function returns 1 if the event designated by \fBi\fR supports data address range restrictions, i.e., can this event be measured accurately when data range restriction is active. Otherwise 0 is returned. Not all events supports this feature. 
.sp The \fBpfm_ita_get_event_maxincr()\fR function returns in \fBmaxincr\fR the maximum number of occurrences per cycle for the event designated by \fBi\fR. Certain Itanium events can occur more than once per cycle. When an event occurs more than once per cycle, the PMD counter will be incremented accordingly. It is possible to restrict measurement when event occur more than once per cycle. For instance, \fBNOPS_RETIRED\fR can happen up to 6 times/cycle which means that the threshold can be adjusted between 0 and 5, where 5 would mean that the PMD counter would be incremented by 1 only when the nop instruction is executed more than 5 times/cycle. This function returns the maximum number of occurrences of the event per cycle, and is the non-inclusive upper bound for the threshold to program in the PMC register. .sp The \fBpfm_ita_get_event_umask()\fR function returns in \fBumask\fR the umask for the event designated by \fBi\fR. .sp When the Itanium specific features are needed to support a measurement their descriptions must be passed as model-specific input arguments to the \fBpfm_dispatch_events()\fR function. The Itanium specific input arguments are described in the \fBpfmlib_ita_input_param_t\fR structure and the output parameters in \fBpfmlib_ita_output_param_t\fR. 
They are defined as follows: .sp .nf
typedef enum {
	PFMLIB_ITA_ISM_BOTH=0,
	PFMLIB_ITA_ISM_IA32=1,
	PFMLIB_ITA_ISM_IA64=2
} pfmlib_ita_ism_t;

typedef struct {
	unsigned int     flags;
	unsigned int     thres;
	pfmlib_ita_ism_t ism;
} pfmlib_ita_counter_t;

typedef struct {
	unsigned char opcm_used;
	unsigned long pmc_val;
} pfmlib_ita_opcm_t;

typedef struct {
	unsigned char btb_used;
	unsigned char btb_tar;
	unsigned char btb_tac;
	unsigned char btb_bac;
	unsigned char btb_tm;
	unsigned char btb_ptm;
	unsigned char btb_ppm;
	unsigned int  btb_plm;
} pfmlib_ita_btb_t;

typedef enum {
	PFMLIB_ITA_EAR_CACHE_MODE= 0,
	PFMLIB_ITA_EAR_TLB_MODE  = 1,
} pfmlib_ita_ear_mode_t;

typedef struct {
	unsigned char         ear_used;
	pfmlib_ita_ear_mode_t ear_mode;
	pfmlib_ita_ism_t      ear_ism;
	unsigned int          ear_plm;
	unsigned long         ear_umask;
} pfmlib_ita_ear_t;

typedef struct {
	unsigned int  rr_plm;
	unsigned long rr_start;
	unsigned long rr_end;
} pfmlib_ita_input_rr_desc_t;

typedef struct {
	unsigned long rr_soff;
	unsigned long rr_eoff;
} pfmlib_ita_output_rr_desc_t;

typedef struct {
	unsigned int               rr_flags;
	pfmlib_ita_input_rr_desc_t rr_limits[4];
	unsigned char              rr_used;
} pfmlib_ita_input_rr_t;

typedef struct {
	unsigned int                rr_nbr_used;
	pfmlib_ita_output_rr_desc_t rr_infos[4];
	pfmlib_reg_t                rr_br[8];
} pfmlib_ita_output_rr_t;

typedef struct {
	pfmlib_ita_counter_t  pfp_ita_counters[PMU_ITA_NUM_COUNTERS];
	unsigned long         pfp_ita_flags;
	pfmlib_ita_opcm_t     pfp_ita_pmc8;
	pfmlib_ita_opcm_t     pfp_ita_pmc9;
	pfmlib_ita_ear_t      pfp_ita_iear;
	pfmlib_ita_ear_t      pfp_ita_dear;
	pfmlib_ita_btb_t      pfp_ita_btb;
	pfmlib_ita_input_rr_t pfp_ita_drange;
	pfmlib_ita_input_rr_t pfp_ita_irange;
} pfmlib_ita_input_param_t;

typedef struct {
	pfmlib_ita_output_rr_t pfp_ita_drange;
	pfmlib_ita_output_rr_t pfp_ita_irange;
} pfmlib_ita_output_param_t;
.fi
.sp .SH INSTRUCTION SET .sp The Itanium processor provides two additional per-event features for counters: thresholding and instruction set selection.
They can be set using the \fBpfp_ita_counters\fR data structure for each event. The \fBism\fR field can be initialized as follows: .TP .B PFMLIB_ITA_ISM_BOTH The event will be monitored during IA-64 and IA-32 execution .TP .B PFMLIB_ITA_ISM_IA32 The event will only be monitored during IA-32 execution .TP .B PFMLIB_ITA_ISM_IA64 The event will only be monitored during IA-64 execution .sp .LP If \fBism\fR has a value of zero, it will default to PFMLIB_ITA_ISM_BOTH. .sp The \fBthres\fR field indicates the threshold for the event. A threshold of \fBn\fR means that the counter will be incremented by one only when the event occurs more than \fBn\fR times per cycle. The \fBflags\fR field contains event-specific flags. The currently defined flags are: .sp .TP PFMLIB_ITA_FL_EVT_NO_QUALCHECK When this flag is set it indicates that the library should ignore the qualifier constraints for this event. Qualifiers include opcode matching and code and data range restrictions. When an event is marked as not supporting a particular qualifier, it usually means that it is ignored, i.e., the extra level of filtering is ignored. For instance, the CPU_CYCLES event does not support code range restrictions and by default the library will refuse to program it if range restriction is also requested. Using the flag will override the check and the call to the \fBpfm_dispatch_events()\fR function will succeed. In this case, CPU_CYCLES will be measured for the entire program and not just for the code range requested. For certain measurements this is perfectly acceptable as the range restriction will only be applied to events which support it. Make sure you understand which events do not support certain qualifiers before using this flag. .LP .SH OPCODE MATCHING .sp The \fBpfp_ita_pmc8\fR and \fBpfp_ita_pmc9\fR fields of type \fBpfmlib_ita_opcm_t\fR contain the description of what to do with the opcode matchers. Itanium supports opcode matching via PMC8 and PMC9.
When this feature is used, the \fBopcm_used\fR field must be set to 1, otherwise it is ignored by the library. The \fBpmc_val\fR field simply contains the raw value to store in PMC8 or PMC9. The library does not modify the values for PMC8 and PMC9; they will be stored in the \fBpfp_pmcs\fR table of the generic output parameters. .SH EVENT ADDRESS REGISTERS .sp The \fBpfp_ita_iear\fR field of type \fBpfmlib_ita_ear_t\fR describes what to do with instruction Event Address Registers (I-EARs). Again, if this feature is used, the \fBear_used\fR field must be set to 1, otherwise it will be ignored by the library. The \fBear_mode\fR must be set to either \fBPFMLIB_ITA_EAR_TLB_MODE\fR or \fBPFMLIB_ITA_EAR_CACHE_MODE\fR to indicate the type of EAR to program. The umask to store into PMC10 must be in \fBear_umask\fR. The privilege level mask at which the I-EAR will be monitored must be set in \fBear_plm\fR which can be any combination of \fBPFM_PLM0\fR, \fBPFM_PLM1\fR, \fBPFM_PLM2\fR, \fBPFM_PLM3\fR. If \fBear_plm\fR is 0 then the default privilege level mask in \fBpfp_dfl_plm\fR is used. Finally, the instruction set to monitor is in \fBear_ism\fR and can be any one of \fBPFMLIB_ITA_ISM_BOTH\fR, \fBPFMLIB_ITA_ISM_IA32\fR, or \fBPFMLIB_ITA_ISM_IA64\fR. .sp The \fBpfp_ita_dear\fR field of type \fBpfmlib_ita_ear_t\fR describes what to do with data Event Address Registers (D-EARs). The description is identical to the I-EARs except that it applies to PMC11. In general, there are four different methods to program the EAR (data or instruction): .TP .B Method 1 There is an EAR event in the list of events to monitor and \fBear_used\fR is cleared. In this case the EAR will be programmed (PMC10 or PMC11) based on the information encoded in the event. A counting monitor (PMC4/PMD4-PMC7/PMD7) will be programmed to count DATA_EAR_EVENT or INSTRUCTION_EAR_EVENTS depending on the type of EAR. .TP .B Method 2 There is an EAR event in the list of events to monitor and \fBear_used\fR is set.
In this case the EAR will be programmed (PMC10 or PMC11) using the information in the \fBpfp_ita_iear\fR or \fBpfp_ita_dear\fR structure because it contains more detailed information, such as privilege level and instruction set. A counting monitor (PMC4/PMD4-PMC7/PMD7) will be programmed to count DATA_EAR_EVENT or INSTRUCTION_EAR_EVENTS depending on the type of EAR. .TP .B Method 3 There is no EAR event in the list of events to monitor and \fBear_used\fR is cleared. In this case no EAR is programmed. .TP .B Method 4 There is no EAR event in the list of events to monitor and \fBear_used\fR is set. In this case the EAR will be programmed (PMC10 or PMC11) using the information in the \fBpfp_ita_iear\fR or \fBpfp_ita_dear\fR structure. This is the free running mode for the EAR. .sp .SH BRANCH TRACE BUFFER The \fBpfp_ita_btb\fR field of type \fBpfmlib_ita_btb_t\fR is used to configure the Branch Trace Buffer (BTB). If the \fBbtb_used\fR field is set, then the library will take the configuration into account, otherwise any BTB configuration will be ignored. The various fields in this structure provide means to filter out the kind of branches that get recorded in the BTB. Each one represents an element of the branch architecture of the Itanium processor. Refer to the Itanium specific documentation for more details on the branch architecture. The fields are as follows: .TP .B btb_tar If the value of this field is 1, then branches predicted by the Target Address Register (TAR) are captured. If 0 no branch predicted by the TAR is included. .TP .B btb_tac If this field is 1, then branches predicted by the Target Address Cache (TAC) are captured. If 0 no branch predicted by the TAC is included. .TP .B btb_bac If this field is 1, then branches predicted by the Branch Address Corrector (BAC) are captured. If 0 no branch predicted by the BAC is included. .TP .B btb_tm If this field is 0, then no branch is captured.
If this field is 1, then non taken branches are captured. If this field is 2, then taken branches are captured. Finally if this field is 3 then all branches are captured. .TP .B btb_ptm If this field is 0, then no branch is captured. If this field is 1, then branches with a mispredicted target address are captured. If this field is 2, then branches with correctly predicted target address are captured. Finally if this field is 3 then all branches are captured regardless of target address prediction. .TP .B btb_ppm If this field is 0, then no branch is captured. If this field is 1, then branches with a mispredicted path (taken/non taken) are captured. If this field is 2, then branches with correctly predicted path are captured. Finally if this field is 3 then all branches are captured regardless of their path prediction. .TP .B btb_plm This is the privilege level mask at which the BTB captures branches. It can be any combination of \fBPFM_PLM0\fR, \fBPFM_PLM1\fR, \fBPFM_PLM2\fR, \fBPFM_PLM3\fR. If \fBbtb_plm\fR is 0 then the default privilege level mask in \fBpfp_dfl_plm\fR is used. .sp There are 4 methods to program the BTB and they are as follows: .sp .TP .B Method 1 The \fBBRANCH_EVENT\fR is in the list of events to monitor and \fBbtb_used\fR is cleared. In this case, the BTB will be configured (PMC12) to record ALL branches. A counting monitor (PMC4/PMD4-PMC7/PMD7) will be programmed to count \fBBRANCH_EVENT\fR. .TP .B Method 2 The \fBBRANCH_EVENT\fR is in the list of events to monitor and \fBbtb_used\fR is set. In this case, the BTB will be configured (PMC12) using the information in the \fBpfp_ita_btb\fR structure. A counting monitor (PMC4/PMD4-PMC7/PMD7) will be programmed to count \fBBRANCH_EVENT\fR. .TP .B Method 3 The \fBBRANCH_EVENT\fR is not in the list of events to monitor and \fBbtb_used\fR is set. In this case, the BTB will be configured (PMC12) using the information in the \fBpfp_ita_btb\fR structure. This is the free running mode for the BTB. 
.TP .B Method 4 The \fBBRANCH_EVENT\fR is not in the list of events to monitor and \fBbtb_used\fR is cleared. In this case, the BTB is not programmed. .sp .SH DATA AND CODE RANGE RESTRICTIONS The \fBpfp_ita_drange\fR and \fBpfp_ita_irange\fR fields control the range restrictions for the data and code respectively. The idea is that the application passes a set of ranges, each designated by a start and end address. Upon return from the \fBpfm_dispatch_events()\fR function, the application gets back the set of registers and their values that need to be programmed via a kernel interface. Range restriction is implemented using the debug registers. There is a limited number of debug registers and they go in pairs. With 8 data debug registers, a maximum of 4 distinct ranges can be specified. The same applies to code range restrictions. Moreover, there are some severe constraints on the alignment and size of the range. Given that the size of the range is specified using a bitmask, there can be situations where the actual range is larger than the requested range. The library will make the best effort to cover only what is requested. It will never cover less than what is requested. The algorithm uses more than one pair of debug registers to get a more precise range if necessary. Hence, up to 4 pairs can be used to describe a single range. The library returns the start and end offsets of the actual range compared to the requested range. If range restriction is to be used, the \fBrr_used\fR field must be set to one, otherwise settings will be ignored. The ranges are described by the \fBpfmlib_ita_input_rr_t\fR structure. Up to 4 ranges can be defined. Each range is described by an entry in \fBrr_limits\fR. The \fBpfmlib_ita_input_rr_desc_t\fR structure is defined as follows: .TP .B rr_plm The privilege level at which the range is active. It can be any combination of \fBPFM_PLM0\fR, \fBPFM_PLM1\fR, \fBPFM_PLM2\fR, \fBPFM_PLM3\fR.
If \fBrr_plm\fR is 0 then the default privilege level mask in \fBpfp_dfl_plm\fR is used. The privilege level is only relevant for code ranges; data ranges ignore the setting. .TP .B rr_start This is the start address of the range. Any address is supported but for code ranges it must be bundle aligned, i.e., 16-byte aligned. .TP .B rr_end This is the end address of the range. Any address is supported but for code ranges it must be bundle aligned, i.e., 16-byte aligned. .LP .sp The library will provide the values for the debug registers as well as some information about the actual ranges in the output parameters and more precisely in the \fBpfmlib_ita_output_rr_t\fR structure for each range. The structure is defined as follows: .TP .B rr_nbr_used Contains the number of debug registers used to cover the range. This is necessarily an even number as debug registers always go in pairs. The value of this field is between 0 and 8. .TP .B rr_br This table contains the list of debug registers necessary to cover the ranges. Each element is of type \fBpfmlib_reg_t\fR. The \fBreg_num\fR field contains the debug register index while \fBreg_value\fR contains the debug register value. Both the index and value must be copied into the kernel specific argument to program the debug registers. The library never programs them. .TP .B rr_infos Contains information about the ranges defined. Because of alignment restrictions, the actual range covered by the debug registers may be larger than the requested range. This table describes the differences between the requested and actual ranges expressed as offsets: .TP .B rr_soff Contains the start offset of the actual range described by the debug registers. If zero, it means the library was able to match exactly the beginning of the range. Otherwise it represents the number of bytes by which the actual range precedes the requested range. .TP .B rr_eoff Contains the end offset of the actual range described by the debug registers.
If zero, it means the library was able to match exactly the end of the range. Otherwise it represents the number of bytes by which the actual range exceeds the requested range. .sp .LP .SH ERRORS Refer to the description of the \fBpfm_dispatch_events()\fR function for errors when using the Itanium specific input and output arguments. .SH SEE ALSO pfm_dispatch_events(3) and the set of examples shipped with the library .SH AUTHOR Stephane Eranian .PP .TH LIBPFM 3 "November, 2003" "" "Linux Programmer's Manual" .SH NAME libpfm_itanium2 - support for Itanium 2 specific PMU features .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .B #include <perfmon/pfmlib_itanium2.h> .sp .BI "int pfm_ita2_is_ear(unsigned int " i ");" .BI "int pfm_ita2_is_dear(unsigned int " i ");" .BI "int pfm_ita2_is_dear_tlb(unsigned int " i ");" .BI "int pfm_ita2_is_dear_cache(unsigned int " i ");" .BI "int pfm_ita2_is_dear_alat(unsigned int " i ");" .BI "int pfm_ita2_is_iear(unsigned int " i ");" .BI "int pfm_ita2_is_iear_tlb(unsigned int " i ");" .BI "int pfm_ita2_is_iear_cache(unsigned int " i ");" .BI "int pfm_ita2_is_btb(unsigned int " i ");" .BI "int pfm_ita2_support_opcm(unsigned int " i ");" .BI "int pfm_ita2_support_iarr(unsigned int " i ");" .BI "int pfm_ita2_support_darr(unsigned int " i ");" .BI "int pfm_ita2_get_event_maxincr(unsigned int "i ", unsigned int *"maxincr ");" .BI "int pfm_ita2_get_event_umask(unsigned int "i ", unsigned long *"umask ");" .BI "int pfm_ita2_get_event_group(unsigned int "i ", int *"grp ");" .BI "int pfm_ita2_get_event_set(unsigned int "i ", int *"set ");" .BI "int pfm_ita2_get_ear_mode(unsigned int "i ", pfmlib_ita2_ear_mode_t *"mode ");" .BI "int pfm_ita2_irange_is_fine(pfmlib_output_param_t *"outp ", pfmlib_ita2_output_param_t *"mod_out ");" .sp .SH DESCRIPTION The libpfm library provides full support for all the Itanium 2 specific features of the PMU.
The interface is defined in \fBpfmlib_itanium2.h\fR. It consists of a set of functions and structures which describe and allow access to the Itanium 2 specific PMU features. .sp The Itanium 2 specific functions presented here are mostly used to retrieve the characteristics of an event. Given an opaque event descriptor, obtained from \fBpfm_find_event()\fR or its derivative functions, they return a boolean value indicating whether the event supports a given feature or is of a particular kind. .sp The \fBpfm_ita2_is_ear()\fR function returns 1 if the event designated by \fBi\fR corresponds to an EAR event, i.e., an Event Address Register type of event. Otherwise 0 is returned. For instance, \fBDATA_EAR_CACHE_LAT4\fR is an ear event, but \fBCPU_CYCLES\fR is not. It can be a data or instruction EAR event. .sp The \fBpfm_ita2_is_dear()\fR function returns 1 if the event designated by \fBi\fR corresponds to a Data EAR event. Otherwise 0 is returned. It can be a cache, TLB, or ALAT EAR event. .sp The \fBpfm_ita2_is_dear_tlb()\fR function returns 1 if the event designated by \fBi\fR corresponds to a Data EAR TLB event. Otherwise 0 is returned. .sp The \fBpfm_ita2_is_dear_cache()\fR function returns 1 if the event designated by \fBi\fR corresponds to a Data EAR cache event. Otherwise 0 is returned. .sp The \fBpfm_ita2_is_dear_alat()\fR function returns 1 if the event designated by \fBi\fR corresponds to a Data EAR ALAT event. Otherwise 0 is returned. .sp The \fBpfm_ita2_is_iear()\fR function returns 1 if the event designated by \fBi\fR corresponds to an instruction EAR event. Otherwise 0 is returned. It can be a cache or TLB instruction EAR event. .sp The \fBpfm_ita2_is_iear_tlb()\fR function returns 1 if the event designated by \fBi\fR corresponds to an instruction EAR TLB event. Otherwise 0 is returned. .sp The \fBpfm_ita2_is_iear_cache()\fR function returns 1 if the event designated by \fBi\fR corresponds to an instruction EAR cache event. Otherwise 0 is returned.
.sp The \fBpfm_ita2_support_opcm()\fR function returns 1 if the event designated by \fBi\fR supports opcode matching, i.e., can this event be measured accurately when opcode matching via PMC8/PMC9 is active. Not all events support this feature. .sp The \fBpfm_ita2_support_iarr()\fR function returns 1 if the event designated by \fBi\fR supports code address range restrictions, i.e., can this event be measured accurately when code range restriction is active. Otherwise 0 is returned. Not all events support this feature. .sp The \fBpfm_ita2_support_darr()\fR function returns 1 if the event designated by \fBi\fR supports data address range restrictions, i.e., can this event be measured accurately when data range restriction is active. Otherwise 0 is returned. Not all events support this feature. .sp The \fBpfm_ita2_get_event_maxincr()\fR function returns in \fBmaxincr\fR the maximum number of occurrences per cycle for the event designated by \fBi\fR. Certain Itanium 2 events can occur more than once per cycle. When an event occurs more than once per cycle, the PMD counter will be incremented accordingly. It is possible to restrict measurement when an event occurs more than once per cycle. For instance, \fBNOPS_RETIRED\fR can happen up to 6 times/cycle, which means that the threshold can be adjusted between 0 and 5, where 5 would mean that the PMD counter would be incremented by 1 only when the nop instruction is executed more than 5 times/cycle. This function returns the maximum number of occurrences of the event per cycle, and is the non-inclusive upper bound for the threshold to program in the PMC register. .sp The \fBpfm_ita2_get_event_umask()\fR function returns in \fBumask\fR the umask for the event designated by \fBi\fR. .sp The \fBpfm_ita2_get_event_group()\fR function returns in \fBgrp\fR the group to which the event designated by \fBi\fR belongs. The notion of group is used for L1 and L2 cache events only.
For all other events, a group is irrelevant and can be ignored. If the event is an L2 cache event then the value of \fBgrp\fR will be \fBPFMLIB_ITA2_EVT_L2_CACHE_GRP\fR. Similarly, if the event is an L1 cache event, the value of \fBgrp\fR will be \fBPFMLIB_ITA2_EVT_L1_CACHE_GRP\fR. In any other case, the value of \fBgrp\fR will be \fBPFMLIB_ITA2_EVT_NO_GRP\fR. .sp The \fBpfm_ita2_get_event_set()\fR function returns in \fBset\fR the set to which the event designated by \fBi\fR belongs. A set is a subdivision of a group and is therefore only relevant for L1 and L2 cache events. An event can only belong to one group and one set. This partitioning of the cache events is due to hardware limitations which impose some restrictions on events. For a given group, events from different sets cannot be measured at the same time. If the event does not belong to a group then the value of \fBset\fR is \fBPFMLIB_ITA2_EVT_NO_SET\fR. .sp The \fBpfm_ita2_irange_is_fine()\fR function returns 1 if the configuration description passed in \fBoutp\fR, the generic output parameters, and \fBmod_out\fR, the Itanium 2 specific output parameters, uses code range restriction in fine mode. Otherwise the function returns 0. This function can only be called after a call to the \fBpfm_dispatch_events()\fR function returns successfully with the data structures pointed to by \fBoutp\fR and \fBmod_out\fR as output parameters. .sp The \fBpfm_ita2_get_ear_mode()\fR function returns in \fBmode\fR the EAR mode of the event designated by \fBi\fR. If the event is not an EAR event, then \fBPFMLIB_ERR_INVAL\fR is returned and \fBmode\fR is not updated. Otherwise \fBmode\fR can have the following values: .TP .B PFMLIB_ITA2_EAR_TLB_MODE The event is a TLB EAR event. It can be either a data or instruction TLB EAR. .TP .B PFMLIB_ITA2_EAR_CACHE_MODE The event is a cache EAR. It can be either a data or instruction cache EAR. .TP .B PFMLIB_ITA2_EAR_ALAT_MODE The event is an ALAT EAR. It can only be a data EAR event.
.sp .LP When the Itanium 2 specific features are needed to support a measurement, their descriptions must be passed as model-specific input arguments to the \fBpfm_dispatch_events()\fR function. The Itanium 2 specific input arguments are described in the \fBpfmlib_ita2_input_param_t\fR structure and the output parameters in \fBpfmlib_ita2_output_param_t\fR. They are defined as follows: .sp .nf
typedef enum {
	PFMLIB_ITA2_ISM_BOTH=0,
	PFMLIB_ITA2_ISM_IA32=1,
	PFMLIB_ITA2_ISM_IA64=2
} pfmlib_ita2_ism_t;

typedef struct {
	unsigned int      flags;
	unsigned int      thres;
	pfmlib_ita2_ism_t ism;
} pfmlib_ita2_counter_t;

typedef struct {
	unsigned char opcm_used;
	unsigned long pmc_val;
} pfmlib_ita2_opcm_t;

typedef struct {
	unsigned char btb_used;
	unsigned char btb_ds;
	unsigned char btb_tm;
	unsigned char btb_ptm;
	unsigned char btb_ppm;
	unsigned char btb_brt;
	unsigned int  btb_plm;
} pfmlib_ita2_btb_t;

typedef enum {
	PFMLIB_ITA2_EAR_CACHE_MODE= 0,
	PFMLIB_ITA2_EAR_TLB_MODE  = 1,
	PFMLIB_ITA2_EAR_ALAT_MODE = 2
} pfmlib_ita2_ear_mode_t;

typedef struct {
	unsigned char          ear_used;
	pfmlib_ita2_ear_mode_t ear_mode;
	pfmlib_ita2_ism_t      ear_ism;
	unsigned int           ear_plm;
	unsigned long          ear_umask;
} pfmlib_ita2_ear_t;

typedef struct {
	unsigned int  rr_plm;
	unsigned long rr_start;
	unsigned long rr_end;
} pfmlib_ita2_input_rr_desc_t;

typedef struct {
	unsigned long rr_soff;
	unsigned long rr_eoff;
} pfmlib_ita2_output_rr_desc_t;

typedef struct {
	unsigned int                rr_flags;
	pfmlib_ita2_input_rr_desc_t rr_limits[4];
	unsigned char               rr_used;
} pfmlib_ita2_input_rr_t;

typedef struct {
	unsigned int                 rr_nbr_used;
	pfmlib_ita2_output_rr_desc_t rr_infos[4];
	pfmlib_reg_t                 rr_br[8];
} pfmlib_ita2_output_rr_t;

typedef struct {
	pfmlib_ita2_counter_t  pfp_ita2_counters[PMU_ITA2_NUM_COUNTERS];
	unsigned long          pfp_ita2_flags;
	pfmlib_ita2_opcm_t     pfp_ita2_pmc8;
	pfmlib_ita2_opcm_t     pfp_ita2_pmc9;
	pfmlib_ita2_ear_t      pfp_ita2_iear;
	pfmlib_ita2_ear_t      pfp_ita2_dear;
	pfmlib_ita2_btb_t      pfp_ita2_btb;
	pfmlib_ita2_input_rr_t pfp_ita2_drange;
	pfmlib_ita2_input_rr_t pfp_ita2_irange;
} pfmlib_ita2_input_param_t;

typedef struct {
	pfmlib_ita2_output_rr_t pfp_ita2_drange;
	pfmlib_ita2_output_rr_t pfp_ita2_irange;
} pfmlib_ita2_output_param_t;
.fi
.sp .SH PER-EVENT OPTIONS .sp The Itanium 2 processor provides two additional per-event features for counters: thresholding and instruction set selection. They can be set using the \fBpfp_ita2_counters\fR data structure for each event. The \fBism\fR field can be initialized as follows: .TP .B PFMLIB_ITA2_ISM_BOTH The event will be monitored during IA-64 and IA-32 execution .TP .B PFMLIB_ITA2_ISM_IA32 The event will only be monitored during IA-32 execution .TP .B PFMLIB_ITA2_ISM_IA64 The event will only be monitored during IA-64 execution .sp .LP If \fBism\fR has a value of zero, it will default to PFMLIB_ITA2_ISM_BOTH. The \fBthres\fR field indicates the threshold for the event. A threshold of \fBn\fR means that the counter will be incremented by one only when the event occurs more than \fBn\fR times per cycle. The \fBflags\fR field contains event-specific flags. The currently defined flags are: .sp .TP PFMLIB_ITA2_FL_EVT_NO_QUALCHECK When this flag is set it indicates that the library should ignore the qualifier constraints for this event. Qualifiers include opcode matching and code and data range restrictions. When an event is marked as not supporting a particular qualifier, it usually means that it is ignored, i.e., the extra level of filtering is ignored. For instance, the CPU_CYCLES event does not support code range restrictions and by default the library will refuse to program it if range restriction is also requested. Using the flag will override the check and the call to the \fBpfm_dispatch_events()\fR function will succeed. In this case, CPU_CYCLES will be measured for the entire program and not just for the code range requested. For certain measurements this is perfectly acceptable as the range restriction will only be applied to events which support it.
Make sure you understand which events do not support certain qualifiers before using this flag. .LP .SH OPCODE MATCHING .sp The \fBpfp_ita2_pmc8\fR and \fBpfp_ita2_pmc9\fR fields of type \fBpfmlib_ita2_opcm_t\fR contain the description of what to do with the opcode matchers. Itanium 2 supports opcode matching via PMC8 and PMC9. When this feature is used, the \fBopcm_used\fR field must be set to 1, otherwise it is ignored by the library. The \fBpmc_val\fR field simply contains the raw value to store in PMC8 or PMC9. The library may adjust the value to enable/disable some options depending on the set of features being used. The final value for PMC8 and PMC9 will be stored in the \fBpfp_pmcs\fR table of the generic output parameters. .SH EVENT ADDRESS REGISTERS .sp The \fBpfp_ita2_iear\fR field of type \fBpfmlib_ita2_ear_t\fR describes what to do with instruction Event Address Registers (I-EARs). Again, if this feature is used, the \fBear_used\fR field must be set to 1, otherwise it will be ignored by the library. The \fBear_mode\fR must be set to either \fBPFMLIB_ITA2_EAR_TLB_MODE\fR or \fBPFMLIB_ITA2_EAR_CACHE_MODE\fR to indicate the type of EAR to program. The umask to store into PMC10 must be in \fBear_umask\fR. The privilege level mask at which the I-EAR will be monitored must be set in \fBear_plm\fR which can be any combination of \fBPFM_PLM0\fR, \fBPFM_PLM1\fR, \fBPFM_PLM2\fR, \fBPFM_PLM3\fR. If \fBear_plm\fR is 0 then the default privilege level mask in \fBpfp_dfl_plm\fR is used. Finally, the instruction set to monitor is in \fBear_ism\fR and can be any one of \fBPFMLIB_ITA2_ISM_BOTH\fR, \fBPFMLIB_ITA2_ISM_IA32\fR, or \fBPFMLIB_ITA2_ISM_IA64\fR. .sp The \fBpfp_ita2_dear\fR field of type \fBpfmlib_ita2_ear_t\fR describes what to do with data Event Address Registers (D-EARs). The description is identical to the I-EARs except that it applies to PMC11 and that an \fBear_mode\fR of \fBPFMLIB_ITA2_EAR_ALAT_MODE\fR is possible.
In general, there are four different methods to program the EAR (data or instruction): .TP .B Method 1 There is an EAR event in the list of events to monitor and \fBear_used\fR is cleared. In this case the EAR will be programmed (PMC10 or PMC11) based on the information encoded in the event. A counting monitor (PMC4/PMD4-PMC7/PMD7) will be programmed to count \fBDATA_EAR_EVENT\fR or \fBL1I_EAR_EVENTS\fR depending on the type of EAR. .TP .B Method 2 There is an EAR event in the list of events to monitor and \fBear_used\fR is set. In this case the EAR will be programmed (PMC10 or PMC11) using the information in the \fBpfp_ita2_iear\fR or \fBpfp_ita2_dear\fR structure because it contains more detailed information, such as privilege level and instruction set. A counting monitor (PMC4/PMD4-PMC7/PMD7) will be programmed to count DATA_EAR_EVENT or L1I_EAR_EVENTS depending on the type of EAR. .TP .B Method 3 There is no EAR event in the list of events to monitor and \fBear_used\fR is cleared. In this case no EAR is programmed. .TP .B Method 4 There is no EAR event in the list of events to monitor and \fBear_used\fR is set. In this case the EAR will be programmed (PMC10 or PMC11) using the information in the \fBpfp_ita2_iear\fR or \fBpfp_ita2_dear\fR structure. This is the free running mode for the EAR. .sp .SH BRANCH TRACE BUFFER The \fBpfp_ita2_btb\fR field of type \fBpfmlib_ita2_btb_t\fR is used to configure the Branch Trace Buffer (BTB). If the \fBbtb_used\fR field is set, then the library will take the configuration into account, otherwise any BTB configuration will be ignored. The various fields in this structure provide means to filter out the kind of branches that get recorded in the BTB. Each one represents an element of the branch architecture of the Itanium 2 processor. Refer to the Itanium 2 specific documentation for more details on the branch architecture.
The fields are as follows: .TP .B btb_ds If the value of this field is 1, then detailed information about the branch prediction is recorded in place of information about the target address. If the value is 0, then information about the target address of the branch is recorded instead. .TP .B btb_tm If this field is 0, then no branch is captured. If this field is 1, then non taken branches are captured. If this field is 2, then taken branches are captured. Finally if this field is 3 then all branches are captured. .TP .B btb_ptm If this field is 0, then no branch is captured. If this field is 1, then branches with a mispredicted target address are captured. If this field is 2, then branches with a correctly predicted target address are captured. Finally if this field is 3 then all branches are captured regardless of target address prediction. .TP .B btb_ppm If this field is 0, then no branch is captured. If this field is 1, then branches with a mispredicted path (taken/non taken) are captured. If this field is 2, then branches with a correctly predicted path are captured. Finally if this field is 3 then all branches are captured regardless of their path prediction. .TP .B btb_brt If this field is 0, then all branches are captured. If this field is 1, then only IP-relative branches are captured. If this field is 2, then only return branches are captured. Finally if this field is 3 then only non-return indirect branches are captured. .TP .B btb_plm This is the privilege level mask at which the BTB captures branches. It can be any combination of \fBPFM_PLM0\fR, \fBPFM_PLM1\fR, \fBPFM_PLM2\fR, \fBPFM_PLM3\fR. If \fBbtb_plm\fR is 0 then the default privilege level mask in \fBpfp_dfl_plm\fR is used. .sp There are 4 methods to program the BTB and they are as follows: .sp .TP .B Method 1 The \fBBRANCH_EVENT\fR is in the list of events to monitor and \fBbtb_used\fR is cleared. In this case, the BTB will be configured (PMC12) to record ALL branches.
A counting monitor (PMC4/PMD4-PMC7/PMD7) will be programmed to count \fBBRANCH_EVENT\fR. .TP .B Method 2 The \fBBRANCH_EVENT\fR is in the list of events to monitor and \fBbtb_used\fR is set. In this case, the BTB will be configured (PMC12) using the information in the \fBpfp_ita2_btb\fR structure. A counting monitor (PMC4/PMD4-PMC7/PMD7) will be programmed to count \fBBRANCH_EVENT\fR. .TP .B Method 3 The \fBBRANCH_EVENT\fR is not in the list of events to monitor and \fBbtb_used\fR is set. In this case, the BTB will be configured (PMC12) using the information in the \fBpfp_ita2_btb\fR structure. This is the free running mode for the BTB. .TP .B Method 4 The \fBBRANCH_EVENT\fR is not in the list of events to monitor and \fBbtb_used\fR is cleared. In this case, the BTB is not programmed. .SH DATA AND CODE RANGE RESTRICTIONS The \fBpfp_ita2_drange\fR and \fBpfp_ita2_irange\fR fields control the range restrictions for the data and code respectively. The idea is that the application passes a set of ranges, each designated by a start and end address. Upon return from the \fBpfm_dispatch_events()\fR function, the application gets back the set of registers and their values that need to be programmed via a kernel interface. Range restriction is implemented using the debug registers. There is a limited number of debug registers and they go in pairs. With 8 data debug registers, a maximum of 4 distinct ranges can be specified. The same applies to code range restrictions. Moreover, there are some severe constraints on the alignment and size of the ranges. Given that the size of a range is specified using a bitmask, there can be situations where the actual range is larger than the requested range. For code ranges, the Itanium 2 processor can use what is called a fine mode, where a range is designated using two pairs of code debug registers. In this mode, the bitmask is not used; the start and end addresses are directly specified.
Not all code ranges qualify for fine mode: the size of the range must be 4KB or less and the range cannot cross a 4KB page boundary. The library will make a best effort in choosing the right mode for each range. For code ranges, it will try the fine mode first and will default to using the bitmask mode otherwise. Fine mode applies to all code debug registers or none, i.e., you cannot have a range using fine mode and another using the bitmask. The Itanium 2 processor somewhat limits the use of multiple pairs to accurately cover a code range. This can only be done for \fBIA64_INST_RETIRED\fR and even then, you need several events to collect the counts. For all other events, only one pair can be used, which leads to more inaccuracy due to approximation. Data ranges can use multiple debug register pairs to gain more accuracy. The library will never cover less than what is requested. The algorithm will use more than one pair of debug registers whenever possible to get a more precise range. Hence, up to 4 pairs can be used to describe a single range. If range restriction is to be used, the \fBrr_used\fR field must be set to one, otherwise the settings will be ignored. The ranges are described by the \fBpfmlib_ita2_input_rr_t\fR structure. Up to 4 ranges can be defined. Each range is described by an entry in \fBrr_limits\fR. Some flags for all ranges can be defined in \fBrr_flags\fR. Currently defined flags are: .sp .TP .B PFMLIB_ITA2_RR_INV Invert the code ranges. The qualifying events will be measured when executing outside the specified ranges. .TP .B PFMLIB_ITA2_RR_NO_FINE_MODE Force non-fine mode for all code ranges (mostly for debugging) .sp .LP The \fBpfmlib_ita2_input_rr_desc_t\fR structure is defined as follows: .TP .B rr_plm The privilege level at which the range is active. It can be any combination of \fBPFM_PLM0\fR, \fBPFM_PLM1\fR, \fBPFM_PLM2\fR, \fBPFM_PLM3\fR. If \fBrr_plm\fR is 0 then the default privilege level mask in \fBpfp_dfl_plm\fR is used.
The privilege level is only relevant for code ranges; data ranges ignore the setting. .TP .B rr_start This is the start address of the range. Any address is supported but for code ranges it must be bundle-aligned, i.e., 16-byte aligned. .TP .B rr_end This is the end address of the range. Any address is supported but for code ranges it must be bundle-aligned, i.e., 16-byte aligned. .sp .LP The library will provide the values for the debug registers as well as some information about the actual ranges in the output parameters and more precisely in the \fBpfmlib_ita2_output_rr_t\fR structure for each range. The structure is defined as follows: .TP .B rr_nbr_used Contains the number of debug registers used to cover the range. This is necessarily an even number as debug registers always go in pairs. The value of this field is between 0 and 7. .TP .B rr_br This table contains the list of debug registers necessary to cover the ranges. Each element is of type \fBpfmlib_reg_t\fR. The \fBreg_num\fR field contains the debug register index while \fBreg_value\fR contains the debug register value. Both the index and value must be copied into the kernel specific argument to program the debug registers. The library never programs them. .TP .B rr_infos Contains information about the ranges defined. Because of alignment restrictions, the actual range covered by the debug registers may be larger than the requested range. This table describes the differences between the requested and actual ranges expressed as offsets: .TP .B rr_soff Contains the start offset of the actual range described by the debug registers. If zero, it means the library was able to match exactly the beginning of the range. Otherwise it represents the number of bytes by which the actual range precedes the requested range. .TP .B rr_eoff Contains the end offset of the actual range described by the debug registers. If zero, it means the library was able to match exactly the end of the range.
Otherwise it represents the number of bytes by which the actual range exceeds the requested range. .sp .LP .SH ERRORS Refer to the description of the \fBpfm_dispatch_events()\fR function for errors when using the Itanium 2 specific input and output arguments. .SH SEE ALSO pfm_dispatch_events(3) and the set of examples shipped with the library .SH AUTHOR Stephane Eranian .PP
papi-papi-7-2-0-t/src/libperfnec/docs/man3/libpfm_montecito.3
.TH LIBPFM 3 "November, 2003" "" "Linux Programmer's Manual" .SH NAME libpfm_montecito - support for Itanium 2 9000 (Montecito) processor specific PMU features .SH SYNOPSIS .nf .B #include .B #include .sp .BI "int pfm_mont_is_ear(unsigned int " i ");" .BI "int pfm_mont_is_dear(unsigned int " i ");" .BI "int pfm_mont_is_dear_tlb(unsigned int " i ");" .BI "int pfm_mont_is_dear_cache(unsigned int " i ");" .BI "int pfm_mont_is_dear_alat(unsigned int " i ");" .BI "int pfm_mont_is_iear(unsigned int " i ");" .BI "int pfm_mont_is_iear_tlb(unsigned int " i ");" .BI "int pfm_mont_is_iear_cache(unsigned int " i ");" .BI "int pfm_mont_is_etb(unsigned int " i ");" .BI "int pfm_mont_support_opcm(unsigned int " i ");" .BI "int pfm_mont_support_iarr(unsigned int " i ");" .BI "int pfm_mont_support_darr(unsigned int " i ");" .BI "int pfm_mont_get_event_maxincr(unsigned int "i ", unsigned int *"maxincr ");" .BI "int pfm_mont_get_event_umask(unsigned int "i ", unsigned long *"umask ");" .BI "int pfm_mont_get_event_group(unsigned int "i ", int *"grp ");" .BI "int pfm_mont_get_event_set(unsigned int "i ", int *"set ");" .BI "int pfm_mont_get_event_type(unsigned int "i ", int *"type ");" .BI "int pfm_mont_get_ear_mode(unsigned int "i ", pfmlib_mont_ear_mode_t *"mode ");" .BI "int pfm_mont_irange_is_fine(pfmlib_output_param_t *"outp ", pfmlib_mont_output_param_t *"mod_out ");" .sp .SH DESCRIPTION The libpfm library provides full support for all the Itanium 2 9000 (Montecito) processor
specific features of the PMU. The interface is defined in \fBpfmlib_montecito.h\fR. It consists of a set of functions and structures which describe and allow access to the model specific PMU features. .sp The Itanium 2 9000 (Montecito) processor specific functions presented here are mostly used to retrieve the characteristics of an event. Given an opaque event descriptor, obtained from \fBpfm_find_event()\fR or its derivative functions, they return a boolean value indicating whether the event supports a given feature or is of a particular kind. .sp The \fBpfm_mont_is_ear()\fR function returns 1 if the event designated by \fBi\fR corresponds to an EAR event, i.e., an Event Address Register type of event. Otherwise 0 is returned. For instance, \fBDATA_EAR_CACHE_LAT4\fR is an EAR event, but \fBCPU_OP_CYCLES_ALL\fR is not. It can be a data or instruction EAR event. .sp The \fBpfm_mont_is_dear()\fR function returns 1 if the event designated by \fBi\fR corresponds to a Data EAR event. Otherwise 0 is returned. It can be a cache or TLB EAR event. .sp The \fBpfm_mont_is_dear_tlb()\fR function returns 1 if the event designated by \fBi\fR corresponds to a Data EAR TLB event. Otherwise 0 is returned. .sp The \fBpfm_mont_is_dear_cache()\fR function returns 1 if the event designated by \fBi\fR corresponds to a Data EAR cache event. Otherwise 0 is returned. .sp The \fBpfm_mont_is_dear_alat()\fR function returns 1 if the event designated by \fBi\fR corresponds to an ALAT EAR event. Otherwise 0 is returned. .sp The \fBpfm_mont_is_iear()\fR function returns 1 if the event designated by \fBi\fR corresponds to an instruction EAR event. Otherwise 0 is returned. It can be a cache or TLB instruction EAR event. .sp The \fBpfm_mont_is_iear_tlb()\fR function returns 1 if the event designated by \fBi\fR corresponds to an instruction EAR TLB event. Otherwise 0 is returned.
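As an illustration, the query functions above can be combined with \fBpfm_find_event()\fR as in the following sketch. This is a hedged example, not part of the manual: it assumes a libpfm 3.x installation with the Montecito support headers, and that the event name \fBDATA_EAR_CACHE_LAT4\fR (taken from the text above) is spelled this way in the installed event table.

```c
#include <stdio.h>
#include <perfmon/pfmlib.h>
#include <perfmon/pfmlib_montecito.h>

/* Sketch: look up an event by name, then query its Montecito
 * characteristics with the boolean helpers described above. */
int main(void)
{
    unsigned int ev;

    if (pfm_initialize() != PFMLIB_SUCCESS) {
        fprintf(stderr, "cannot initialize libpfm\n");
        return 1;
    }
    /* event name is an example from the text; spelling may differ
     * between libpfm versions */
    if (pfm_find_event("DATA_EAR_CACHE_LAT4", &ev) != PFMLIB_SUCCESS) {
        fprintf(stderr, "event not found\n");
        return 1;
    }
    printf("is EAR        : %d\n", pfm_mont_is_ear(ev));
    printf("is data EAR   : %d\n", pfm_mont_is_dear(ev));
    printf("is D-EAR TLB  : %d\n", pfm_mont_is_dear_tlb(ev));
    printf("is instr. EAR : %d\n", pfm_mont_is_iear(ev));
    return 0;
}
```

Each helper simply returns 1 or 0, so the output can be used directly to decide whether EAR-specific setup (see the EVENT ADDRESS REGISTERS section) is needed for the chosen event.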
.sp The \fBpfm_mont_is_iear_cache()\fR function returns 1 if the event designated by \fBi\fR corresponds to an instruction EAR cache event. Otherwise 0 is returned. .sp The \fBpfm_mont_support_opcm()\fR function returns 1 if the event designated by \fBi\fR supports opcode matching, i.e., the event can be measured accurately when opcode matching via PMC32/PMC34 is active. Otherwise 0 is returned. Not all events support this feature. .sp The \fBpfm_mont_support_iarr()\fR function returns 1 if the event designated by \fBi\fR supports code address range restrictions, i.e., the event can be measured accurately when code range restriction is active. Otherwise 0 is returned. Not all events support this feature. .sp The \fBpfm_mont_support_darr()\fR function returns 1 if the event designated by \fBi\fR supports data address range restrictions, i.e., the event can be measured accurately when data range restriction is active. Otherwise 0 is returned. Not all events support this feature. .sp The \fBpfm_mont_get_event_maxincr()\fR function returns in \fBmaxincr\fR the maximum number of occurrences per cycle for the event designated by \fBi\fR. Certain Itanium 2 9000 (Montecito) events can occur more than once per cycle. When an event occurs more than once per cycle, the PMD counter is incremented accordingly. It is possible to restrict counting to cycles in which the event occurs more than a given number of times. For instance, \fBNOPS_RETIRED\fR can happen up to 6 times/cycle, which means that the threshold can be adjusted between 0 and 5, where 5 would mean that the PMD counter would be incremented by 1 only when the nop instruction is executed more than 5 times/cycle. This function returns the maximum number of occurrences of the event per cycle, which is the non-inclusive upper bound for the threshold to program in the PMC register. .sp The \fBpfm_mont_get_event_umask()\fR function returns in \fBumask\fR the umask for the event designated by \fBi\fR.
.sp The \fBpfm_mont_get_event_group()\fR function returns in \fBgrp\fR the group to which the event designated by \fBi\fR belongs. The notion of group is used for L1D and L2D cache events only. For all other events, a group is irrelevant and can be ignored. If the event is an L2D cache event then the value of \fBgrp\fR will be \fBPFMLIB_MONT_EVT_L2D_CACHE_GRP\fR. Similarly, if the event is an L1D cache event, the value of \fBgrp\fR will be \fBPFMLIB_MONT_EVT_L1D_CACHE_GRP\fR. In all other cases, the value of \fBgrp\fR will be \fBPFMLIB_MONT_EVT_NO_GRP\fR. .sp The \fBpfm_mont_get_event_set()\fR function returns in \fBset\fR the set to which the event designated by \fBi\fR belongs. A set is a subdivision of a group and is therefore only relevant for L1 and L2 cache events. An event can only belong to one group and one set. This partitioning of the cache events is due to some hardware limitations which impose some restrictions on events. For a given group, events from different sets cannot be measured at the same time. If the event does not belong to a group then the value of \fBset\fR is \fBPFMLIB_MONT_EVT_NO_SET\fR. .sp The \fBpfm_mont_get_event_type()\fR function returns in \fBtype\fR the type of the event designated by \fBi\fR. The Itanium 2 9000 (Montecito) events can have any one of the following types: .sp .TP .B PFMLIB_MONT_EVT_ACTIVE The event can only occur when the processor thread that generated it is currently active .TP .B PFMLIB_MONT_EVT_FLOATING The event can be generated when the processor thread is inactive .TP .B PFMLIB_MONT_EVT_CAUSAL The event does not belong to a processor thread .TP .B PFMLIB_MONT_EVT_SELF_FLOATING Hybrid event. It is floating if measured with .me; it is causal otherwise.
.LP .sp The \fBpfm_mont_irange_is_fine()\fR function returns 1 if the configuration description passed in \fBoutp\fR, the generic output parameters, and \fBmod_out\fR, the Itanium 2 9000 (Montecito) specific output parameters, uses code range restriction in fine mode. Otherwise the function returns 0. This function can only be called after a call to the \fBpfm_dispatch_events()\fR function returns successfully and had the data structures pointed to by \fBoutp\fR and \fBmod_out\fR as output parameters. .sp The \fBpfm_mont_get_ear_mode()\fR function returns in \fBmode\fR the EAR mode of the event designated by \fBi\fR. If the event is not an EAR event, then \fBPFMLIB_ERR_INVAL\fR is returned and \fBmode\fR is not updated. Otherwise \fBmode\fR can have the following values: .TP .B PFMLIB_MONT_EAR_TLB_MODE The event is a TLB EAR event. It can be either a data or instruction TLB EAR. .TP .B PFMLIB_MONT_EAR_CACHE_MODE The event is a cache EAR. It can be either a data or instruction cache EAR. .TP .B PFMLIB_MONT_EAR_ALAT_MODE The event is an ALAT EAR. It can only be a data EAR event. .sp .LP When the Itanium 2 9000 (Montecito) specific features are needed to support a measurement, their descriptions must be passed as model-specific input arguments to the \fBpfm_dispatch_events()\fR function. The Itanium 2 9000 (Montecito) specific input arguments are described in the \fBpfmlib_mont_input_param_t\fR structure and the output parameters in \fBpfmlib_mont_output_param_t\fR.
They are defined as follows: .sp .nf
typedef struct {
   unsigned int flags;
   unsigned int thres;
} pfmlib_mont_counter_t;

typedef struct {
   unsigned char opcm_used;
   unsigned char opcm_m;
   unsigned char opcm_i;
   unsigned char opcm_f;
   unsigned char opcm_b;
   unsigned long opcm_match;
   unsigned long opcm_mask;
} pfmlib_mont_opcm_t;

typedef struct {
   unsigned char etb_used;
   unsigned int  etb_plm;
   unsigned char etb_ds;
   unsigned char etb_tm;
   unsigned char etb_ptm;
   unsigned char etb_ppm;
   unsigned char etb_brt;
} pfmlib_mont_etb_t;

typedef struct {
   unsigned char  ipear_used;
   unsigned int   ipear_plm;
   unsigned short ipear_delay;
} pfmlib_mont_ipear_t;

typedef enum {
   PFMLIB_MONT_EAR_CACHE_MODE = 0,
   PFMLIB_MONT_EAR_TLB_MODE   = 1,
   PFMLIB_MONT_EAR_ALAT_MODE  = 2
} pfmlib_mont_ear_mode_t;

typedef struct {
   unsigned char          ear_used;
   pfmlib_mont_ear_mode_t ear_mode;
   unsigned int           ear_plm;
   unsigned long          ear_umask;
} pfmlib_mont_ear_t;

typedef struct {
   unsigned int  rr_plm;
   unsigned long rr_start;
   unsigned long rr_end;
} pfmlib_mont_input_rr_desc_t;

typedef struct {
   unsigned long rr_soff;
   unsigned long rr_eoff;
} pfmlib_mont_output_rr_desc_t;

typedef struct {
   unsigned int                rr_flags;
   pfmlib_mont_input_rr_desc_t rr_limits[4];
   unsigned char               rr_used;
} pfmlib_mont_input_rr_t;

typedef struct {
   unsigned int                 rr_nbr_used;
   pfmlib_mont_output_rr_desc_t rr_infos[4];
   pfmlib_reg_t                 rr_br[8];
} pfmlib_mont_output_rr_t;

typedef struct {
   pfmlib_mont_counter_t  pfp_mont_counters[PMU_MONT_NUM_COUNTERS];
   unsigned long          pfp_mont_flags;
   pfmlib_mont_opcm_t     pfp_mont_opcm1;
   pfmlib_mont_opcm_t     pfp_mont_opcm2;
   pfmlib_mont_ear_t      pfp_mont_iear;
   pfmlib_mont_ear_t      pfp_mont_dear;
   pfmlib_mont_ipear_t    pfp_mont_ipear;
   pfmlib_mont_etb_t      pfp_mont_etb;
   pfmlib_mont_input_rr_t pfp_mont_drange;
   pfmlib_mont_input_rr_t pfp_mont_irange;
} pfmlib_mont_input_param_t;

typedef struct {
   pfmlib_mont_output_rr_t pfp_mont_drange;
   pfmlib_mont_output_rr_t pfp_mont_irange;
} pfmlib_mont_output_param_t;
.fi .sp .SH PER-EVENT OPTIONS .sp The Itanium 2 9000
(Montecito) processor provides one per-event feature for counters: thresholding. It can be set using the \fBpfp_mont_counters\fR data structure for each event. .sp The \fBthres\fR field indicates the threshold for the event. A threshold of \fBn\fR means that the counter will be incremented by one only when the event occurs more than \fBn\fR times per cycle. The \fBflags\fR field contains event-specific flags. The currently defined flags are: .sp .TP PFMLIB_MONT_FL_EVT_NO_QUALCHECK When this flag is set it indicates that the library should ignore the qualifier constraints for this event. Qualifiers include opcode matching, code and data range restrictions. When an event is marked as not supporting a particular qualifier, it usually means that the qualifier is ignored, i.e., the extra level of filtering is ignored. For instance, the FE_BUBBLE_ALL event does not support code range restrictions and by default the library will refuse to program it if range restriction is also requested. Using the flag will override the check and the call to the \fBpfm_dispatch_events()\fR function will succeed. In this case, FE_BUBBLE_ALL will be measured for the entire program and not just for the code range requested. For certain measurements this is perfectly acceptable as the range restriction will only be applied to events which support it. Make sure you understand which events do not support certain qualifiers before using this flag. .LP .SH OPCODE MATCHING .sp The \fBpfp_mont_opcm1\fR and \fBpfp_mont_opcm2\fR fields of type \fBpfmlib_mont_opcm_t\fR contain the description of what to do with the opcode matchers. The Itanium 2 9000 (Montecito) processor supports opcode matching via PMC32 and PMC34. When this feature is used the \fBopcm_used\fR field must be set to 1, otherwise it is ignored by the library. The Itanium 2 9000 (Montecito) processor implements two full 41-bit opcode matchers. As such, it is possible to match all instructions individually.
It is possible to match a single instruction or an instruction pattern based on opcode or slot type. The slots are specified in: .TP .B opcm_m Match when the instruction is in an M-slot (memory) .TP .B opcm_i Match when the instruction is in an I-slot (ALU) .TP .B opcm_f Match when the instruction is in an F-slot (FPU) .TP .B opcm_b Match when the instruction is in a B-slot (Branch) .sp .LP Any combination of slot settings is supported. To match all slot types, simply set all fields to 1. .sp The 41-bit opcode is specified in \fBopcm_match\fR and a 41-bit mask is passed in \fBopcm_mask\fR. When a bit is set in \fBopcm_mask\fR the corresponding bit is ignored in \fBopcm_match\fR. .SH EVENT ADDRESS REGISTERS .sp The \fBpfp_mont_iear\fR field of type \fBpfmlib_mont_ear_t\fR describes what to do with the instruction Event Address Registers (I-EARs). Again, if this feature is used the \fBear_used\fR field must be set to 1, otherwise it will be ignored by the library. The \fBear_mode\fR field must be set to one of \fBPFMLIB_MONT_EAR_TLB_MODE\fR or \fBPFMLIB_MONT_EAR_CACHE_MODE\fR to indicate the type of EAR to program. The umask to store into PMC10 must be in \fBear_umask\fR. The privilege level mask at which the I-EAR will be monitored must be set in \fBear_plm\fR, which can be any combination of \fBPFM_PLM0\fR, \fBPFM_PLM1\fR, \fBPFM_PLM2\fR, \fBPFM_PLM3\fR. If \fBear_plm\fR is 0 then the default privilege level mask in \fBpfp_dfl_plm\fR is used. .sp The \fBpfp_mont_dear\fR field of type \fBpfmlib_mont_ear_t\fR describes what to do with the data Event Address Registers (D-EARs). The description is identical to the I-EARs except that it applies to PMC11 and that an \fBear_mode\fR of \fBPFMLIB_MONT_EAR_ALAT_MODE\fR is possible. In general, there are four different methods to program the EAR (data or instruction): .TP .B Method 1 There is an EAR event in the list of events to monitor and \fBear_used\fR is cleared.
In this case the EAR will be programmed (PMC10 or PMC11) based on the information encoded in the event. A counting monitor (PMC4/PMD4-PMC7/PMD7) will be programmed to count \fBDATA_EAR_EVENT\fR or \fBL1I_EAR_EVENTS\fR depending on the type of EAR. .TP .B Method 2 There is an EAR event in the list of events to monitor and \fBear_used\fR is set. In this case the EAR will be programmed (PMC10 or PMC11) using the information in the \fBpfp_mont_iear\fR or \fBpfp_mont_dear\fR structure because it contains more detailed information, such as the privilege level. A counting monitor (PMC4/PMD4-PMC7/PMD7) will be programmed to count \fBDATA_EAR_EVENT\fR or \fBL1I_EAR_EVENTS\fR depending on the type of EAR. .TP .B Method 3 There is no EAR event in the list of events to monitor and \fBear_used\fR is cleared. In this case no EAR is programmed. .TP .B Method 4 There is no EAR event in the list of events to monitor and \fBear_used\fR is set. In this case the EAR will be programmed (PMC10 or PMC11) using the information in the \fBpfp_mont_iear\fR or \fBpfp_mont_dear\fR structure. This is the free running mode for the EAR. .sp .SH EXECUTION TRACE BUFFER The \fBpfp_mont_etb\fR field of type \fBpfmlib_mont_etb_t\fR is used to configure the Execution Trace Buffer (ETB). If \fBetb_used\fR is set, then the library will take the configuration into account, otherwise any ETB configuration will be ignored. The various fields in this structure provide means to filter out the kind of changes in the control flow (branches, traps, rfi, ...) that get recorded in the ETB. Each one represents an element of the branch architecture of the Itanium 2 9000 (Montecito) processor. Refer to the Itanium 2 9000 (Montecito) specific documentation for more details on the branch architecture. The fields are as follows: .TP .B etb_tm If this field is 0, then no branch is captured. If this field is 1, then not-taken branches are captured. If this field is 2, then taken branches are captured.
Finally, if this field is 3, then all branches are captured. .TP .B etb_ptm If this field is 0, then no branch is captured. If this field is 1, then branches with a mispredicted target address are captured. If this field is 2, then branches with a correctly predicted target address are captured. Finally, if this field is 3, then all branches are captured regardless of target address prediction. .TP .B etb_ppm If this field is 0, then no branch is captured. If this field is 1, then branches with a mispredicted path (taken/not taken) are captured. If this field is 2, then branches with a correctly predicted path are captured. Finally, if this field is 3, then all branches are captured regardless of their path prediction. .TP .B etb_brt If this field is 0, then no branch is captured. If this field is 1, then only IP-relative branches are captured. If this field is 2, then only return branches are captured. Finally, if this field is 3, then only non-return indirect branches are captured. .TP .B etb_plm This is the privilege level mask at which the ETB captures branches. It can be any combination of \fBPFM_PLM0\fR, \fBPFM_PLM1\fR, \fBPFM_PLM2\fR, \fBPFM_PLM3\fR. If \fBetb_plm\fR is 0 then the default privilege level mask in \fBpfp_dfl_plm\fR is used. .sp There are 4 methods to program the ETB and they are as follows: .sp .TP .B Method 1 The \fBETB_EVENT\fR is in the list of events to monitor and \fBetb_used\fR is cleared. In this case, the ETB will be configured (PMC39) to record ALL branches. A counting monitor will be programmed to count \fBETB_EVENT\fR. .TP .B Method 2 The \fBETB_EVENT\fR is in the list of events to monitor and \fBetb_used\fR is set. In this case, the ETB will be configured (PMC39) using the information in the \fBpfp_mont_etb\fR structure. A counting monitor (PMC4/PMD4-PMC7/PMD7) will be programmed to count \fBETB_EVENT\fR. .TP .B Method 3 The \fBETB_EVENT\fR is not in the list of events to monitor and \fBetb_used\fR is set.
In this case, the ETB will be configured (PMC39) using the information in the \fBpfp_mont_etb\fR structure. This is the free running mode for the ETB. .TP .B Method 4 The \fBETB_EVENT\fR is not in the list of events to monitor and \fBetb_used\fR is cleared. In this case, the ETB is not programmed. .SH DATA AND CODE RANGE RESTRICTIONS The \fBpfp_mont_drange\fR and \fBpfp_mont_irange\fR fields control the range restrictions for the data and code respectively. The idea is that the application passes a set of ranges, each designated by a start and end address. Upon return from the \fBpfm_dispatch_events()\fR function, the application gets back the set of registers and their values that need to be programmed via a kernel interface. Range restriction is implemented using the debug registers. There is a limited number of debug registers and they go in pairs. With 8 data debug registers, a maximum of 4 distinct ranges can be specified. The same applies to code range restrictions. Moreover, there are some severe constraints on the alignment and size of the ranges. Given that the size of a range is specified using a bitmask, there can be situations where the actual range is larger than the requested range. For code ranges, the Itanium 2 9000 (Montecito) processor can use what is called a fine mode, where a range is designated using two pairs of code debug registers. In this mode, the bitmask is not used; the start and end addresses are directly specified. Not all code ranges qualify for fine mode: the size of the range must be 64KB or less and the range cannot cross a 64KB page boundary. The library will make a best effort in choosing the right mode for each range. For code ranges, it will try the fine mode first and will default to using the bitmask mode otherwise. Fine mode applies to all code debug registers or none, i.e., you cannot have a range using fine mode and another using the bitmask.
The Itanium 2 9000 (Montecito) processor somewhat limits the use of multiple pairs to accurately cover a code range. This can only be done for \fBIA64_INST_RETIRED\fR and even then, you need several events to collect the counts. For all other events, only one pair can be used, which leads to more inaccuracy due to approximation. Data ranges can use multiple debug register pairs to gain more accuracy. The library will never cover less than what is requested. The algorithm will use more than one pair of debug registers whenever possible to get a more precise range. Hence, up to 4 pairs can be used to describe a single range. If range restriction is to be used, the \fBrr_used\fR field must be set to one, otherwise the settings will be ignored. The ranges are described by the \fBpfmlib_mont_input_rr_t\fR structure. Up to 4 ranges can be defined. Each range is described by an entry in \fBrr_limits\fR. Some flags for all ranges can be defined in \fBrr_flags\fR. Currently defined flags are: .sp .TP .B PFMLIB_MONT_RR_INV Invert the code ranges. The qualifying events will be measured when executing outside the specified ranges. .TP .B PFMLIB_MONT_RR_NO_FINE_MODE Force non-fine mode for all code ranges (mostly for debugging) .sp .LP The \fBpfmlib_mont_input_rr_desc_t\fR structure is defined as follows: .TP .B rr_plm The privilege level at which the range is active. It can be any combination of \fBPFM_PLM0\fR, \fBPFM_PLM1\fR, \fBPFM_PLM2\fR, \fBPFM_PLM3\fR. If \fBrr_plm\fR is 0 then the default privilege level mask in \fBpfp_dfl_plm\fR is used. The privilege level is only relevant for code ranges; data ranges ignore the setting. .TP .B rr_start This is the start address of the range. Any address is supported but for code ranges it must be bundle-aligned, i.e., 16-byte aligned. .TP .B rr_end This is the end address of the range. Any address is supported but for code ranges it must be bundle-aligned, i.e., 16-byte aligned.
.sp .LP The library will provide the values for the debug registers as well as some information about the actual ranges in the output parameters and more precisely in the \fBpfmlib_mont_output_rr_t\fR structure for each range. The structure is defined as follows: .TP .B rr_nbr_used Contains the number of debug registers used to cover the range. This is necessarily an even number as debug registers always go in pairs. The value of this field is between 0 and 7. .TP .B rr_br This table contains the list of debug registers necessary to cover the ranges. Each element is of type \fBpfmlib_reg_t\fR. The \fBreg_num\fR field contains the debug register index while \fBreg_value\fR contains the debug register value. Both the index and value must be copied into the kernel specific argument to program the debug registers. The library never programs them. .TP .B rr_infos Contains information about the ranges defined. Because of alignment restrictions, the actual range covered by the debug registers may be larger than the requested range. This table describes the differences between the requested and actual ranges expressed as offsets: .TP .B rr_soff Contains the start offset of the actual range described by the debug registers. If zero, it means the library was able to match exactly the beginning of the range. Otherwise it represents the number of bytes by which the actual range precedes the requested range. .TP .B rr_eoff Contains the end offset of the actual range described by the debug registers. If zero, it means the library was able to match exactly the end of the range. Otherwise it represents the number of bytes by which the actual range exceeds the requested range. .sp .LP .SH IP EVENT CAPTURE (IP-EAR) The Execution Trace Buffer (ETB) can be configured to record the addresses of consecutive retiring instructions. In this case the ETB contains IP addresses and not branch-related information.
This feature cannot be used in conjunction with regular branch captures as described above. To activate this feature the \fBipear_used\fR field of the \fBpfmlib_mont_ipear_t\fR structure must be set to 1. The other fields in this structure are used as follows: .sp .TP .B ipear_plm The privilege level of the instructions to capture. It can be any combination of \fBPFM_PLM0\fR, \fBPFM_PLM1\fR, \fBPFM_PLM2\fR, \fBPFM_PLM3\fR. If \fBipear_plm\fR is 0 then the default privilege level mask in \fBpfp_dfl_plm\fR is used. .TP .B ipear_delay The number of cycles by which to delay the freeze of the ETB after a PMU interrupt (which freezes the rest of the counters). .LP .sp .SH ERRORS Refer to the description of the \fBpfm_dispatch_events()\fR function for errors when using the Itanium 2 9000 (Montecito) specific input and output arguments. .SH SEE ALSO pfm_dispatch_events(3) and the set of examples shipped with the library .SH AUTHOR Stephane Eranian .PP
papi-papi-7-2-0-t/src/libperfnec/docs/man3/libpfm_nehalem.3
.TH LIBPFM 3 "January, 2009" "" "Linux Programmer's Manual" .SH NAME libpfm_nehalem - support for Intel Nehalem processor family .SH SYNOPSIS .nf .B #include .B #include .sp .SH DESCRIPTION The libpfm library provides full support for the Intel Nehalem processor family, such as the Intel Core i7. The interface is defined in \fBpfmlib_intel_nhm.h\fR. It consists of a set of functions and structures describing the Intel Nehalem processor specific PMU features. The Intel Nehalem processor is a quad-core, dual-thread processor. It includes two types of PMU: core and uncore. The latter measures events at the socket level and is therefore disconnected from any of the four cores. The core PMU implements Intel architectural perfmon version 3 with four generic counters and three fixed counters. The uncore has eight generic counters and one fixed counter.
Each Intel Nehalem core also implements a 16-deep branch trace buffer, called Last Branch Record (LBR), which can be used in combination with the core PMU. Intel Nehalem implements a newer version of the Precise Event-Based Sampling (PEBS) mechanism which has the ability to capture where cache misses occur. .sp When Intel Nehalem processor specific features are needed to support a measurement, their descriptions must be passed as model-specific input arguments to the \fBpfm_dispatch_events()\fR function. The Intel Nehalem processor specific input arguments are described in the \fBpfmlib_nhm_input_param_t\fR structure. No output parameters are currently defined. The input parameters are defined as follows: .sp .nf
typedef struct {
   unsigned long cnt_mask;
   unsigned int  flags;
} pfmlib_nhm_counter_t;

typedef struct {
   unsigned int lbr_used;
   unsigned int lbr_plm;
   unsigned int lbr_filter;
} pfmlib_nhm_lbr_t;

typedef struct {
   unsigned int pebs_used;
   unsigned int ld_lat_thres;
} pfmlib_nhm_pebs_t;

typedef struct {
   pfmlib_nhm_counter_t pfp_nhm_counters[PMU_NHM_NUM_COUNTERS];
   pfmlib_nhm_pebs_t    pfp_nhm_pebs;
   pfmlib_nhm_lbr_t     pfm_nhm_lbr;
   uint64_t             reserved[4];
} pfmlib_nhm_input_param_t;
.fi .sp .sp The Intel Nehalem processor provides a few additional per-event features for counters: thresholding, inversion, edge detection, monitoring of both threads, occupancy. They can be set using the \fBpfp_nhm_counters\fR data structure for each event. The \fBflags\fR field can be initialized with the following values, depending on the event: .TP .B PFMLIB_NHM_SEL_INV Invert the result of the \fBcnt_mask\fR comparison when set. This flag is supported for core and uncore PMU events. .TP .B PFMLIB_NHM_SEL_EDGE Enables edge detection of events. This flag is supported for core and uncore PMU events. .TP .B PFMLIB_NHM_SEL_ANYTHR Enable measuring the event in either of the two processor threads, assuming hyper-threading is enabled. By default, only the current thread is measured.
This flag is restricted to core PMU events. .TP .B PFMLIB_NHM_SEL_OCC_RST When set, the queue occupancy counter associated with the event is cleared. This flag is only available to uncore PMU events. .LP The \fBcnt_mask\fR field is used to set the event threshold. The value of the counter is incremented for each cycle in which the number of occurrences of the event is greater or equal to the value of the field. Thus, the event is modified to actually measure the number of qualifying cycles. When zero, all occurrences are counted (this is the default). This flag is supported for core and uncore PMU events. .sp .SH Support for Precise-Event Based Sampling (PEBS) The library can be used to set up the PMC registers associated with PEBS. In this case, the \fBpfmlib_nhm_pebs_t\fR structure must be used and the \fBpebs_used\fR field must be set to 1. .sp To enable the PEBS load latency filtering capability, it is necessary to program the \fBMEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD\fR event into one generic counter. The latency threshold must be passed to the library in the \fBld_lat_thres\fR field. It is expressed in core cycles and \fBmust\fR be greater than 3. Note that \fBpebs_used\fR must be set as well. .SH Support for Last Branch Record (LBR) The library can be used to set up LBR registers. On Intel Nehalem processors, the LBR is 16-entry deep and it is possible to filter branches, based on privilege level or type. To configure the LBR, the \fBpfmlib_nhm_lbr_t\fR structure must be used. .sp Like core PMU counters, LBR only distinguishes two privilege levels, 0 and the rest (1,2,3). When running Linux natively, the kernel is at privilege level 0, applications at level 3. It is possible to specify the privilege level of LBR using the \fBlbr_plm\fR field. Any attempt to pass \fBPFM_PLM1\fR or \fBPFM_PLM2\fR will be rejected. If \fBlbr_plm\fR is 0, then the default privilege level mask in \fBpfp_dfl_plm\fR of \fBpfmlib_input_param_t\fR is used. .sp By default, LBR captures all branches.
It is possible to filter out branches by passing a set of flags in \fBlbr_filter\fR. The flags are as follows: .TP .B PFMLIB_NHM_LBR_JCC When set, LBR does not capture conditional branches. Default: off. .TP .B PFM_NHM_LBR_NEAR_REL_CALL When set, LBR does not capture near calls. Default: off. .TP .B PFM_NHM_LBR_NEAR_IND_CALL When set, LBR does not capture indirect calls. Default: off. .TP .B PFM_NHM_LBR_NEAR_RET When set, LBR does not capture return branches. Default: off. .TP .B PFM_NHM_LBR_NEAR_IND_JMP When set, LBR does not capture indirect branches. Default: off. .TP .B PFM_NHM_LBR_NEAR_REL_JMP When set, LBR does not capture relative branches. Default: off. .TP .B PFM_NHM_LBR_FAR_BRANCH When set, LBR does not capture far branches. Default: off. .SH Support for uncore PMU By nature, the uncore PMU does not distinguish privilege levels, therefore it captures events at all privilege levels. To avoid any misinterpretation, the library enforces that uncore events be measured with both \fBPFM_PLM0\fR and \fBPFM_PLM3\fR set. Tools and operating system kernel interfaces may impose further restrictions on how the uncore PMU can be accessed. .SH SEE ALSO pfm_dispatch_events(3) and set of examples shipped with the library .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/libpfm_p6.3000066400000000000000000000045751502707512200223020ustar00rootroot00000000000000.TH LIBPFM 3 "September, 2005" "" "Linux Programmer's Manual" .SH NAME libpfm_i386_p6 - support for Intel P6 processor family .SH SYNOPSIS .nf .B #include .B #include .sp .SH DESCRIPTION The libpfm library provides full support for the P6 processor family, including the Pentium M processor. The interface is defined in \fBpfmlib_i386_p6.h\fR. It consists of a set of functions and structures which describe and allow access to the P6 processor-specific PMU features.
.sp When P6 processor specific features are needed to support a measurement, their descriptions must be passed as model-specific input arguments to the \fBpfm_dispatch_events()\fR function. The P6 processor-specific input arguments are described in the \fBpfmlib_i386_p6_input_param_t\fR structure and the output parameters in \fBpfmlib_i386_p6_output_param_t\fR. They are defined as follows: .sp .nf typedef struct { unsigned int cnt_mask; unsigned int flags; } pfmlib_i386_p6_counter_t; typedef struct { pfmlib_i386_p6_counter_t pfp_i386_p6_counters[PMU_I386_P6_NUM_COUNTERS]; uint64_t reserved[4]; } pfmlib_i386_p6_input_param_t; typedef struct { uint64_t reserved[8]; } pfmlib_i386_p6_output_param_t; .fi .sp .sp The P6 processor provides a few additional per-event features for counters: thresholding, inversion, edge detection. They can be set using the \fBpfp_i386_p6_counters\fR data structure for each event. The \fBflags\fR field can be initialized as follows: .TP .B PFMLIB_I386_P6_SEL_INV Invert the results of the \fBcnt_mask\fR comparison when set. .TP .B PFMLIB_I386_P6_SEL_EDGE Enables edge detection of events. .LP The \fBcnt_mask\fR field is used to set the event threshold. The value of the counter is incremented each time the number of occurrences per cycle of the event is greater or equal to the value of the field. When zero, all occurrences are counted. .sp .SH Handling of Pentium M The library provides full support for the Pentium M PMU. A Pentium M implements more events than a generic P6 processor. The library autodetects the host processor and can distinguish a generic P6 processor from a Pentium M. Thus no special call is needed. .sp .SH ERRORS Refer to the description of the \fBpfm_dispatch_events()\fR function for errors.
.SH SEE ALSO pfm_dispatch_events(3) and set of examples shipped with the library .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/libpfm_powerpc.3000066400000000000000000000033111502707512200234200ustar00rootroot00000000000000.TH LIBPFM 3 "October, 2007" "" "Linux Programmer's Manual" .SH NAME libpfm_powerpc - support for IBM PowerPC and POWER processor families .SH SYNOPSIS .nf .B #include .B #include .sp .SH DESCRIPTION The libpfm library provides support for the IBM PowerPC and POWER processor families. Specifically, it currently provides support for the following processors: PPC970(FX,GX), PPC970MP, POWER4, POWER4+, POWER5, POWER5+, and POWER6. .sp .SH MODEL-SPECIFIC PARAMETERS At present, the model_in and model_out model-specific input and output parameters are not used by the \fBpfm_dispatch_events()\fR function. For future compatibility, NULLs must be passed for these arguments. .sp .SH COMBINING EVENTS IN A SET As with many architectures' PMU hardware designs, events cannot be combined together arbitrarily in the same event set, even if there are a sufficient number of counters available. This implementation for IBM PowerPC/POWER bases the event compatibility on a set of previously-defined compatible event groups. If the events placed in an event set are all members of one of the predefined event groups, a call to the \fBpfm_dispatch_events()\fR function will be successful. With the current interface, there is no way to discover a priori which events are compatible, so application software that wishes to combine events must do so by trial and error, possibly using multiplexed event sets to count events that cannot otherwise be combined in the same set. .sp .SH ERRORS Refer to the description of the \fBpfm_dispatch_events()\fR function for errors.
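The trial-and-error approach described above can be sketched as follows. This is a hedged illustration, not part of the original man page: the helper name and the idea of probing compatibility by attempting the dispatch are my own, the event names are placeholders supplied by the caller, and it assumes the libpfm-2.x API documented in these pages (untested here, since it needs a host with libpfm and a supported PMU).

```c
#include <string.h>
#include <perfmon/pfmlib.h>

/*
 * Illustrative sketch only: probe whether two events may live in the same
 * event set on PowerPC/POWER by attempting the dispatch and checking the
 * return code, as the man page suggests.  Returns 1 if compatible.
 */
static int events_compatible(const char *n1, const char *n2)
{
    pfmlib_input_param_t inp;
    pfmlib_output_param_t outp;
    unsigned int d1, d2;

    memset(&inp, 0, sizeof(inp));
    memset(&outp, 0, sizeof(outp));

    /* look up the opaque event descriptors by name */
    if (pfm_find_event(n1, &d1) != PFMLIB_SUCCESS ||
        pfm_find_event(n2, &d2) != PFMLIB_SUCCESS)
        return 0;

    inp.pfp_events[0].event = d1;
    inp.pfp_events[1].event = d2;
    inp.pfp_dfl_plm = PFM_PLM3;
    inp.pfp_event_count = 2;

    /* model_in/model_out must be NULL on PowerPC, per the man page */
    return pfm_dispatch_events(&inp, NULL, &outp, NULL) == PFMLIB_SUCCESS;
}
```

A dispatch failure with \fBPFMLIB_ERR_NOASSIGN\fR is the signal that the two events do not belong to a common predefined group.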
.SH SEE ALSO pfm_dispatch_events(3) and set of examples shipped with the library .SH AUTHOR Corey Ashford .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/libpfm_westmere.3000066400000000000000000000141021502707512200235740ustar00rootroot00000000000000.TH LIBPFM 3 "January, 2009" "" "Linux Programmer's Manual" .SH NAME libpfm_nehalem - support for Intel Nehalem processor family .SH SYNOPSIS .nf .B #include .B #include .sp .SH DESCRIPTION The libpfm library provides full support for the Intel Nehalem processor family, such as Intel Core i7. The interface is defined in \fBpfmlib_intel_nhm.h\fR. It consists of a set of functions and structures describing the Intel Nehalem processor specific PMU features. The Intel Nehalem processor is a quad core, dual thread processor. It includes two types of PMU: core and uncore. The latter measures events at the socket level and is therefore disconnected from any of the four cores. The core PMU implements Intel architectural perfmon version 3 with four generic counters and three fixed counters. The uncore has eight generic counters and one fixed counter. Each Intel Nehalem core also implements a 16-deep branch trace buffer, called Last Branch Record (LBR), which can be used in combination with the core PMU. Intel Nehalem implements a newer version of the Precise Event-Based Sampling (PEBS) mechanism which has the ability to capture where cache misses occur. .sp When Intel Nehalem processor specific features are needed to support a measurement, their descriptions must be passed as model-specific input arguments to the \fBpfm_dispatch_events()\fR function. The Intel Nehalem processor-specific input arguments are described in the \fBpfmlib_nhm_input_param_t\fR structure. No output parameters are currently defined.
The input parameters are defined as follows: .sp .nf typedef struct { unsigned long cnt_mask; unsigned int flags; } pfmlib_nhm_counter_t; typedef struct { unsigned int lbr_used; unsigned int lbr_plm; unsigned int lbr_filter; } pfmlib_nhm_lbr_t; typedef struct { unsigned int pebs_used; unsigned int ld_lat_thres; } pfmlib_nhm_pebs_t; typedef struct { pfmlib_nhm_counter_t pfp_nhm_counters[PMU_NHM_NUM_COUNTERS]; pfmlib_nhm_pebs_t pfp_nhm_pebs; pfmlib_nhm_lbr_t pfm_nhm_lbr; uint64_t reserved[4]; } pfmlib_nhm_input_param_t; .fi .sp .sp The Intel Nehalem processor provides a few additional per-event features for counters: thresholding, inversion, edge detection, monitoring of both threads, occupancy. They can be set using the \fBpfp_nhm_counters\fR data structure for each event. The \fBflags\fR field can be initialized with the following values, depending on the event: .TP .B PFMLIB_NHM_SEL_INV Inverse the results of the \fBcnt_mask\fR comparison when set. This flag is supported for core and uncore PMU events. .TP .B PFMLIB_NHM_SEL_EDGE Enables edge detection of events. This flag is supported for core and uncore PMU events. .TP .B PFMLIB_NHM_SEL_ANYTHR Enable measuring the event in any of the two processor threads assuming hyper-threading is enabled. By default, only the current thread is measured. This flag is restricted to core PMU events. .TP .B PFMLIB_NHM_SEL_OCC_RST When set, the queue occupancy counter associated with the event is cleared. This flag is only available to uncore PMU events. .LP The \fBcnt_mask\fR field is used to set the event threshold. The value of the counter is incremented for each cycle in which the number of occurrences of the event is greater or equal to the value of the field. Thus, the event is modified to actually measure the number of qualifying cycles. When zero all occurrences are counted (this is the default). This flag is supported for core and uncore PMU events. 
.sp .SH Support for Precise-Event Based Sampling (PEBS) The library can be used to set up the PMC registers associated with PEBS. In this case, the \fBpfmlib_nhm_pebs_t\fR structure must be used and the \fBpebs_used\fR field must be set to 1. .sp To enable the PEBS load latency filtering capability, it is necessary to program the \fBMEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD\fR event into one generic counter. The latency threshold must be passed to the library in the \fBld_lat_thres\fR field. It is expressed in core cycles and \fBmust\fR be greater than 3. Note that \fBpebs_used\fR must be set as well. .SH Support for Last Branch Record (LBR) The library can be used to set up LBR registers. On Intel Nehalem processors, the LBR is 16-entry deep and it is possible to filter branches, based on privilege level or type. To configure the LBR, the \fBpfmlib_nhm_lbr_t\fR structure must be used. .sp Like core PMU counters, LBR only distinguishes two privilege levels, 0 and the rest (1,2,3). When running Linux natively, the kernel is at privilege level 0, applications at level 3. It is possible to specify the privilege level of LBR using the \fBlbr_plm\fR field. Any attempt to pass \fBPFM_PLM1\fR or \fBPFM_PLM2\fR will be rejected. If \fBlbr_plm\fR is 0, then the default privilege level mask in \fBpfp_dfl_plm\fR of \fBpfmlib_input_param_t\fR is used. .sp By default, LBR captures all branches. It is possible to filter out branches by passing a set of flags in \fBlbr_filter\fR. The flags are as follows: .TP .B PFMLIB_NHM_LBR_JCC When set, LBR does not capture conditional branches. Default: off. .TP .B PFM_NHM_LBR_NEAR_REL_CALL When set, LBR does not capture near calls. Default: off. .TP .B PFM_NHM_LBR_NEAR_IND_CALL When set, LBR does not capture indirect calls. Default: off. .TP .B PFM_NHM_LBR_NEAR_RET When set, LBR does not capture return branches. Default: off. .TP .B PFM_NHM_LBR_NEAR_IND_JMP When set, LBR does not capture indirect branches. Default: off.
.TP .B PFM_NHM_LBR_NEAR_REL_JMP When set, LBR does not capture relative branches. Default: off. .TP .B PFM_NHM_LBR_FAR_BRANCH When set, LBR does not capture far branches. Default: off. .SH Support for uncore PMU By nature, the uncore PMU does not distinguish privilege levels, therefore it captures events at all privilege levels. To avoid any misinterpretation, the library enforces that uncore events be measured with both \fBPFM_PLM0\fR and \fBPFM_PLM3\fR set. Tools and operating system kernel interfaces may impose further restrictions on how the uncore PMU can be accessed. .SH SEE ALSO pfm_dispatch_events(3) and set of examples shipped with the library .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_dispatch_events.3000066400000000000000000000251041502707512200244410ustar00rootroot00000000000000.TH LIBPFM 3 "July, 2003" "" "Linux Programmer's Manual" .SH NAME pfm_dispatch_events \- determine PMC register values for a set of events to measure .SH SYNOPSIS .nf .B #include .sp .BI "int pfm_dispatch_events(pfmlib_input_param_t *"p ", void *" mod_in ", pfmlib_output_param_t *" q, "void *" mod_out ");" .sp .SH DESCRIPTION This function is the central piece of the library. It is important to understand that the library does not actually program the PMU, i.e., it does not make the operating system calls. The PMU is never accessed by the library. Instead, the library helps applications prepare the arguments to pass to the kernel. In particular, it sets up the values to program into the PMU configuration registers (PMC). The list of used data registers (PMD) is also returned. .sp The input arguments are divided into two categories: the generic arguments in \fBp\fR and the optional PMU model specific arguments in \fBmod_in\fR. The same applies for the output arguments: \fBq\fR contains the generic output arguments and \fBmod_out\fR the optional PMU model specific arguments.
.sp An application describes what it wants to measure in the generic argument \fBp\fR and, if it uses some model-specific features, such as opcode matching on Itanium 2 processors, it must pass a pointer to the relevant model-specific input parameters in \fBmod_in\fR. The generic output parameters contain the register index and values for the PMC and PMD registers needed to make the measurement. The index mapping is guaranteed to match the mapping used by the Linux perfmon2 interface. In case the perfmon2 interface is not used on the system, the hardware register addresses or indexes can also be retrieved from the output structure. .sp The \fBpfmlib_input_param_t\fR structure is defined as follows: .sp .nf typedef struct { int event; unsigned int plm; unsigned long flags; unsigned int unit_masks[PFMLIB_MAX_MASKS_PER_EVENT]; unsigned int num_masks; } pfmlib_event_t; typedef struct { unsigned int pfp_event_count; unsigned int pfp_dfl_plm; unsigned int pfp_flags; pfmlib_event_t pfp_events[PFMLIB_MAX_PMCS]; pfmlib_regmask_t pfp_unavail_pmcs; } pfmlib_input_param_t; .fi .sp The structure mostly contains one table, called \fBpfp_events\fR, which describes the events to be measured. The number of submitted events is indicated by \fBpfp_event_count\fR. Each event is described in the \fBpfp_events\fR table by an opaque descriptor stored in the \fBevent\fR field. This descriptor is obtained with the \fBpfm_find_full_event()\fR or derivative functions. For some events, it may be necessary to specify at least one unit mask in the \fBunit_masks\fR table. A unit mask is yet another opaque descriptor obtained via the \fBpfm_find_event_mask()\fR or \fBpfm_find_full_event()\fR functions. Typically, if an event supports multiple unit masks, they can be combined, in which case more than one entry in \fBunit_masks\fR must be specified. The actual number of unit mask descriptors passed must be indicated in \fBnum_masks\fR. When no unit mask is used, this field must be set to 0.
A privilege level mask for the event can be provided in \fBplm\fR. This is a bitmask where each bit indicates a privilege level at which to monitor; more than one bit can be set. The library supports up to four levels, but depending on the PMU model, some levels may not be available. The levels are as follows: .TP .B PFM_PLM0 monitor at the privilege level 0. For many architectures, this means kernel level .TP .B PFM_PLM1 monitor at privilege level 1 .TP .B PFM_PLM2 monitor at privilege level 2 .TP .B PFM_PLM3 monitor at the privilege level 3. For many architectures, this means user level .LP .sp .sp Events with a \fBplm\fR value of 0 will use the default privilege level mask as indicated by \fBpfp_dfl_plm\fR, which must be set to any combination of the values described above. It is illegal to have a value of 0 for this field. .sp The \fBpfp_flags\fR field contains a set of flags that affect the whole set of events to be monitored. The currently defined flags are: .TP .B PFMLIB_PFP_SYSTEMWIDE indicates that the monitors are to be used in a system-wide monitoring session. This could influence the way the library sets up some register values. .sp .LP The \fBpfp_unavail_pmcs\fR bitmask can be used by applications to communicate to the library the list of PMC registers which are not available on the system. Some kernels may allocate certain PMC registers (and associated data registers) for other purposes. Those registers must not be used by the library, otherwise the assignment of events to PMC registers may be rejected by the kernel. Applications must figure out which registers are available using a kernel interface at their disposal; the library does not provide this service. The library expects the restrictions to be expressed using the Linux perfmon2 PMC register mapping. .LP Refer to the PMU specific manual for a description of the model-specific input parameters to be passed in \fBmod_in\fR.
The generic output parameters are contained in the \fBpfmlib_output_param_t\fR structure, which is defined as: .sp .nf typedef struct { unsigned long long reg_value; unsigned int reg_num; unsigned long reg_addr; } pfmlib_reg_t; typedef struct { unsigned int pfp_pmc_count; unsigned int pfp_pmd_count; pfmlib_reg_t pfp_pmcs[PFMLIB_MAX_PMCS]; pfmlib_reg_t pfp_pmds[PFMLIB_MAX_PMDS]; } pfmlib_output_param_t; .fi .sp The number of valid entries in the \fBpfp_pmcs\fR table is indicated by \fBpfp_pmc_count\fR. The number of valid entries in the \fBpfp_pmds\fR table is indicated by \fBpfp_pmd_count\fR. Each entry in both tables is of type \fBpfmlib_reg_t\fR. .sp In the \fBpfp_pmcs\fR table, the \fBreg_num\fR contains the PMC register index (perfmon2 mapping), and the \fBreg_value\fR contains a 64-bit value to be used to program the PMC register. The \fBreg_addr\fR indicates the hardware address or index for the PMC register. .sp In the \fBpfp_pmds\fR table, the \fBreg_num\fR contains the PMD register index (perfmon2 mapping), and the \fBreg_value\fR is ignored. The \fBreg_addr\fR indicates the hardware address or index for the PMD register. .sp Refer to the PMU specific manual for a description of the model-specific output parameters to be returned in \fBmod_out\fR. .sp The current implementation of the \fBpfm_dispatch_events()\fR function completely overwrites the \fBpfmlib_output_param_t\fR structure. In other words, results do not accumulate into the \fBpfp_pmcs\fR table across multiple calls. Unused fields are guaranteed to be zeroed upon successful return. .sp Depending on the PMU model, there may not always be a one-to-one mapping between a PMC register and a data register. Register dependencies may be more intricate. However, the \fBpfm_dispatch_events()\fR function guarantees certain ordering between the \fBpfp_pmcs\fR and \fBpfp_pmds\fR tables.
In particular, it guarantees that the \fBpfp_pmds\fR table always starts with the counters corresponding, in the same order, to the events as provided in the \fBpfp_events\fR table on input. There is always one counter per event. Additional PMD registers, if any, come after. .SH EXAMPLE Here is a typical sequence using the perfmon2 interface: .nf #include ... pfmlib_input_param_t inp; pfmlib_output_param_t outp; pfarg_ctx_t ctx; pfarg_pmd_t pd[1]; pfarg_pmc_t pc[1]; pfarg_load_t load_arg; int fd, i; int ret; if (pfm_initialize() != PFMLIB_SUCCESS) { fprintf(stderr, "can't initialize library\\n"); exit(1); } memset(&ctx,0, sizeof(ctx)); memset(&inp,0, sizeof(inp)); memset(&outp,0, sizeof(outp)); memset(pd, 0, sizeof(pd)); memset(pc, 0, sizeof(pc)); memset(&load_arg, 0, sizeof(load_arg)); ret = pfm_get_cycle_event(&inp.pfp_events[0]); if (ret != PFMLIB_SUCCESS) { fprintf(stderr, "cannot find cycle event\\n"); exit(1); } inp.pfp_dfl_plm = PFM_PLM3; inp.pfp_event_count = 1; ret = pfm_dispatch_events(&inp, NULL, &outp, NULL); if (ret != PFMLIB_SUCCESS) { fprintf(stderr, "cannot dispatch events: %s\\n", pfm_strerror(ret)); exit(1); } /* propagate pmc value to perfmon2 structures */ for(i=0; i < outp.pfp_pmc_count; i++) { pc[i].reg_num = outp.pfp_pmcs[i].reg_num; pc[i].reg_value = outp.pfp_pmcs[i].reg_value; } for(i=0; i < outp.pfp_pmd_count; i++) { pd[i].reg_num = outp.pfp_pmds[i].reg_num; pd[i].reg_value = 0; } ... if (pfm_create_context(&ctx, NULL, 0) == -1 ) { ... } fd = ctx.ctx_fd; if (pfm_write_pmcs(fd, pc, outp.pfp_pmc_count) == -1) { ... } if (pfm_write_pmds(fd, pd, outp.pfp_pmd_count) == -1) { ... } load_arg.load_pid = getpid(); if (pfm_load_context(fd, &load_arg) == -1) { ... } pfm_start(fd, NULL); /* code to monitor */ pfm_stop(fd); if (pfm_read_pmds(fd, pd, inp.pfp_event_count) == -1) { ... } printf("results: %llu\\n", pd[0].reg_value); ... close(fd); ... .fi .SH RETURN The function returns whether or not the call was successful.
A return value of \fBPFMLIB_SUCCESS\fR indicates success, otherwise the value is the error code. .SH ERRORS .TP .B PFMLIB_ERR_NOINIT The library has not been initialized properly. .TP .B PFMLIB_ERR_INVAL Some arguments were invalid. For instance, the number of events is zero. This can also be due to the content of the \fBpfmlib_input_param_t\fR structure. .TP .B PFMLIB_ERR_NOTFOUND No matching event was found. .TP .B PFMLIB_ERR_TOOMANY The number of events to monitor exceeds the number of implemented counters. .TP .B PFMLIB_ERR_NOASSIGN The events cannot be dispatched to the PMC because events have conflicting constraints. .TP .B PFMLIB_ERR_MAGIC The model specific extension does not have the right magic number. .TP .B PFMLIB_ERR_FEATCOMB The set of events and features cannot be combined. .TP .B PFMLIB_ERR_EVTMANY An event has been supplied more than once and is causing resource (PMC) conflicts. .TP .B PFMLIB_ERR_IRRINVAL Invalid code range restriction (Itanium, Itanium 2). .TP .B PFMLIB_ERR_IRRALIGN Code range has invalid alignment (Itanium, Itanium 2). .TP .B PFMLIB_ERR_IRRTOOMANY Cannot satisfy all the code ranges (Itanium, Itanium 2). .TP .B PFMLIB_ERR_DRRTOOMANY Cannot satisfy all the data ranges (Itanium, Itanium 2). .TP .B PFMLIB_ERR_DRRINVAL Invalid data range restriction (Itanium, Itanium 2). .TP .B PFMLIB_ERR_EVTSET Some events belong to incompatible sets (Itanium 2). .TP .B PFMLIB_ERR_EVTINCOMP Some events cannot be measured at the same time (Itanium 2). .TP .B PFMLIB_ERR_IRRTOOBIG Code range is too big (Itanium 2). .TP .B PFMLIB_ERR_UMASK Invalid or missing unit mask.
.SH SEE ALSO libpfm_itanium(3), libpfm_itanium2(3), pfm_regmask_set(3), pfm_regmask_clr(3), pfm_find_event_code_mask(3) .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_find_event.3000066400000000000000000000100711502707512200233740ustar00rootroot00000000000000.TH LIBPFM 3 "August, 2006" "" "Linux Programmer's Manual" .SH NAME pfm_find_event, pfm_find_full_event, pfm_find_event_bycode, pfm_find_event_bycode_next, pfm_find_event_mask \- search for events and unit masks .SH SYNOPSIS .nf .B #include .sp .BI "int pfm_find_event(const char *"str ", unsigned int *"desc ");" .BI "int pfm_find_full_event(const char *"str ", pfmlib_event_t *"e ");" .BI "int pfm_find_event_bycode(int "code ", unsigned int *"desc ");" .BI "int pfm_find_event_bycode_next(unsigned int "desc1 ", int "code ", unsigned int *"desc ");" .BI "int pfm_find_event_mask(unsigned int "idx ", const char *"str ", unsigned int *"mask_idx ");" .sp .SH DESCRIPTION The PMU counters can be programmed to count the number of occurrences of certain events. The number of events varies from one PMU model to the other. Each event has a name and a code which is used to program the actual PMU register. Some events may need to be further qualified with unit masks. .sp The library does not directly expose event codes or unit mask codes to user applications because it is not necessary. Instead, applications use names to query the library for particular information about events. Given an event name, the library returns an opaque descriptor. Each descriptor is unique and has no relationship to the event code. .sp The set of functions described here can be used to get an event descriptor given either the name of the event or its code. Several events may share the same code. An event name is a string structured as: event_name[:unit_mask1[:unit_mask2]]. .sp The \fBpfm_find_event()\fR function is a general-purpose search routine.
Given an event name in \fBstr\fR, it returns the descriptor for the corresponding event. If unit masks are provided, they are not taken into account. This function is being \fBdeprecated\fR in favor of the \fBpfm_find_full_event()\fR function. .sp The \fBpfm_find_full_event()\fR function is the general-purpose search routine. Given an event name in \fBstr\fR, it returns in \fBev\fR the full event descriptor that includes the event descriptor in \fBev->event\fR and the unit mask descriptors in \fBev->unit_masks\fR. The number of unit mask descriptors returned is indicated in \fBev->num_masks\fR. Unit masks are specified as a colon-separated list of unit mask names, exact values or value combinations. For instance, if event A supports unit masks M1 (0x1) and M2 (0x40), and both unit masks are to be measured, then the following values for \fBstr\fR are valid: "A:M1:M2", "A:M1:0x40", "A:M2:0x1", "A:0x1:0x40", "A:0x41". .sp The \fBpfm_find_event_bycode()\fR function searches for an event given its \fBcode\fR represented as an integer. It returns in \fBdesc\fR the descriptor of the first matching event. Unit masks are ignored. .sp Because there can be several events with the same code, the library provides the \fBpfm_find_event_bycode_next()\fR function to search for other events with the same code. Given an event \fBdesc1\fR and a \fBcode\fR, this function will look for the next event with the same code. If such an event exists, its descriptor will be stored into \fBdesc\fR. It is not necessary to have called the \fBpfm_find_event_bycode()\fR function prior to calling this function. This function is fully thread-safe as it does not maintain any state between calls. .sp The \fBpfm_find_event_mask()\fR function is used to find the unit mask descriptor based on its name or numerical value passed in \fBstr\fR for the event specified in \fBidx\fR. The numeric value must be an exact match of an existing unit mask value, i.e., all bits must match.
Some events do not have unit masks, in which case this function returns an error. .SH RETURN All functions return whether or not the call was successful. A return value of \fBPFMLIB_SUCCESS\fR indicates success, otherwise the value is the error code. .SH ERRORS .B PFMLIB_ERR_NOINIT the library has not been initialized properly. .TP .B PFMLIB_ERR_INVAL the event descriptor is invalid, or the pointer argument is NULL. .TP .B PFMLIB_ERR_NOTFOUND no matching event or unit mask was found. .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_find_event_bycode.3000066400000000000000000000000321502707512200247150ustar00rootroot00000000000000.so man3/pfm_find_event.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_find_event_bycode_next.3000066400000000000000000000000321502707512200257530ustar00rootroot00000000000000.so man3/pfm_find_event.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_find_event_mask.3000066400000000000000000000000321502707512200244030ustar00rootroot00000000000000.so man3/pfm_find_event.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_find_full_event.3000066400000000000000000000000321502707512200244120ustar00rootroot00000000000000.so man3/pfm_find_event.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_force_pmu.3000066400000000000000000000000341502707512200232300ustar00rootroot00000000000000.so man3/pfm_get_pmu_name.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_cycle_event.3000066400000000000000000000040731502707512200244170ustar00rootroot00000000000000.TH LIBPFM 3 "September, 2006" "" "Linux Programmer's Manual" .SH NAME pfm_get_cycle_event, pfm_get_inst_retired_event - get basic event descriptors .SH SYNOPSIS .nf .B #include .sp .BI "int pfm_get_cycle_event(pfmlib_event_t *"ev ");" .BI "int pfm_get_inst_retired_event(pfmlib_event_t *"ev ");" .sp .SH DESCRIPTION In order to build very simple generic examples that work across all PMU models, the library provides a way to retrieve information about two basic events that are 
present in most PMU models: cycles and instructions retired. The first event, cycles, counts the number of elapsed cycles. The second event, instructions retired, counts the number of instructions that have executed and retired from the processor pipeline. Depending on the PMU model, there may be variations in the exact definition of those events. The library provides this information on a best-effort basis. Users must refer to PMU model-specific documentation to validate the event definition. .sp The \fBpfm_get_cycle_event()\fR function returns in \fBev\fR the event and optional unit mask descriptors for the event that counts elapsed cycles. Depending on the PMU model, there may be unit mask(s) necessary to count cycles. Applications must check the value returned in \fBev->num_masks\fR. .sp The \fBpfm_get_inst_retired_event()\fR function returns in \fBev\fR the event and optional unit mask descriptors for the event that counts the number of retired instructions. Depending on the PMU model, there may be unit mask(s) necessary to count retired instructions. Applications must check the value returned in \fBev->num_masks\fR. .SH RETURN All functions return whether or not the call was successful. A return value of \fBPFMLIB_SUCCESS\fR indicates success, otherwise the value is the error code. .SH ERRORS .TP .B PFMLIB_ERR_NOINIT the library has not been initialized properly. .TP .B PFMLIB_ERR_INVAL the \fBev\fR parameter is NULL. .TP .B PFMLIB_ERR_NOTSUPP the host PMU does not define an event to count cycles or instructions retired.
.SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_event_code.3000066400000000000000000000000361502707512200242250ustar00rootroot00000000000000.so man3/pfm_get_event_name.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_event_code_counter.3000066400000000000000000000000361502707512200257640ustar00rootroot00000000000000.so man3/pfm_get_event_name.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_event_counters.3000066400000000000000000000000361502707512200251550ustar00rootroot00000000000000.so man3/pfm_get_event_name.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_event_description.3000066400000000000000000000000361502707512200256360ustar00rootroot00000000000000.so man3/pfm_get_event_name.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_event_mask_code.3000066400000000000000000000000361502707512200252400ustar00rootroot00000000000000.so man3/pfm_get_event_name.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_event_mask_description.3000066400000000000000000000000361502707512200266510ustar00rootroot00000000000000.so man3/pfm_get_event_name.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_event_mask_name.3000066400000000000000000000000361502707512200252460ustar00rootroot00000000000000.so man3/pfm_get_event_name.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_event_name.3000066400000000000000000000152101502707512200242330ustar00rootroot00000000000000.TH LIBPFM 3 "August, 2006" "" "Linux Programmer's Manual" .SH NAME pfm_get_event_name, pfm_get_full_event_name, pfm_get_event_mask_name, pfm_get_event_code, pfm_get_event_mask_code, pfm_get_event_counters, pfm_get_num_events, pfm_get_max_event_name_len, pfm_get_event_description, pfm_get_event_mask_description \- get event information .SH SYNOPSIS .nf .B #include .sp .BI "int pfm_get_event_name(unsigned int " e ", char *"name ", size_t " maxlen ");" .BI "int pfm_get_full_event_name(pfmlib_event_t *" ev ", char *"name ", size_t " maxlen ");" .BI "int
pfm_get_event_mask_name(unsigned int " e ", unsigned int "mask ", char *"name ", size_t " maxlen ");" .BI "int pfm_get_event_code(unsigned int " e ", int *"code ");" .BI "int pfm_get_event_mask_code(unsigned int " e ", unsigned int "mask ", int *"code ");" .BI "int pfm_get_event_code_counter(unsigned int " e ", unsigned int " cnt ", int *"code ");" .BI "int pfm_get_event_counters(int " e ", pfmlib_regmask_t "counters ");" .BI "int pfm_get_num_events(unsigned int *" count ");" .BI "int pfm_get_max_event_name_len(size_t *" len ");" .BI "int pfm_get_event_description(unsigned int " ev ", char **" str ");" .BI "int pfm_get_event_mask_description(unsigned int " ev ", unsigned int "mask ", char **" str ");" .sp .SH DESCRIPTION The \fBpfm_get_event_name()\fR function returns in \fBname\fR the event name given its opaque descriptor in \fBe\fR. The \fBmaxlen\fR argument indicates the maximum length of the buffer provided for \fBname\fR. Up to \fBmaxlen\fR-1 characters are stored in the buffer. The buffer size must be large enough to store the event name, otherwise an error is returned. This behavior is required to avoid returning partial names with no way for the caller to verify this is not the full name, except by failing other calls. The buffer can be appropriately sized using the \fBpfm_get_max_event_name_len()\fR function. The returned name is a null terminated string with all upper-case characters and no spaces. .sp The \fBpfm_get_full_event_name()\fR function returns in \fBname\fR the event name given the full event description in \fBev\fR. The description contains the event code in \fBev->event\fR and optional unit masks descriptors in \fBev->unit_masks\fR. The \fBmaxlen\fR argument indicates the maximum length of the buffer provided for \fBname\fR. If more than \fBmaxlen\fR-1 characters are needed to represent the event, an error is returned. Applications may use the \fBpfm_get_max_event_name_len()\fR function to size the buffer correctly. 
In case unit masks are provided, the final event name string is structured as: event_name:unit_masks1[:unit_masks2]. Event names and unit mask names are returned in all upper case. .sp The \fBpfm_get_event_code()\fR function returns the event code in \fBcode\fR given its opaque descriptor \fBe\fR. .sp On some PMU models, the code associated with an event is different based on the counter it is programmed into. The \fBpfm_get_event_code_counter()\fR function is used to retrieve the event code in \fBcode\fR when the event \fBe\fR is programmed into counter \fBcnt\fR. The counter index \fBcnt\fR must correspond to a counting PMD register. .sp Given an opaque event descriptor \fBe\fR, the \fBpfm_get_event_counters()\fR function returns in \fBcounters\fR a bitmask of type \fBpfmlib_regmask_t\fR where each bit set represents a counting PMD register which can be used to program this event. The bitmask must be accessed using accessor macros defined by the library. .sp The \fBpfm_get_num_events()\fR function returns in \fBcount\fR the total number of events available for the PMU model. On some PMU models, however, not all events in the table may be usable due to processor stepping changes. The library guarantees that no more than \fBcount\fR events are available. .sp It is possible to list all existing events for the detected host PMU using accessor functions, as the full table of events is not accessible to applications. The index of the first event is always zero, and the \fBpfm_get_num_events()\fR function returns the total number of events. On some PMU models, e.g., AMD64, not all events are necessarily supported by the host PMU, therefore the count returned by this call may not be the actual number of available events. Event descriptors are contiguous, therefore a simple loop allows complete scanning. 
The typical scan loop is constructed as follows: .sp .nf unsigned int i, count; char name[256]; int ret; pfm_get_num_events(&count); for (i = 0; i < count; i++) { ret = pfm_get_event_name(i, name, 256); if (ret != PFMLIB_SUCCESS) continue; printf("%s\\n", name); } .fi .sp The \fBpfm_get_max_event_name_len()\fR function returns in \fBlen\fR the maximum length in bytes for the name of an event or its unit masks, if any, available on one PMU implementation. The value excludes the string termination character ('\\0'). .sp The \fBpfm_get_event_description()\fR function returns in \fBstr\fR the description string associated with the event specified in \fBev\fR. The description is returned in a buffer that is allocated to hold the entire description text. It is the responsibility of the caller to free the buffer when it is no longer needed by calling the \fBfree(3)\fR function. .sp The \fBpfm_get_event_mask_code()\fR function must be used to retrieve the actual unit mask value given an event descriptor in \fBe\fR and a unit mask descriptor in \fBmask\fR. The value is returned in \fBcode\fR. .sp The \fBpfm_get_event_mask_name()\fR function must be used to retrieve the name associated with a unit mask specified in \fBmask\fR for event \fBe\fR. The name is returned in the buffer specified in \fBname\fR. The maximum size of the buffer must be specified in \fBmaxlen\fR. .sp The \fBpfm_get_event_mask_description()\fR function returns in \fBstr\fR the description string associated with the unit mask specified in \fBmask\fR for the event specified in \fBev\fR. The description is returned in a buffer that is allocated to hold the entire description text. It is the responsibility of the caller to free the buffer when it is no longer needed by calling the \fBfree(3)\fR function. .SH RETURN All functions return whether or not the call was successful. A return value of \fBPFMLIB_SUCCESS\fR indicates success, otherwise the value is the error code. 
.SH ERRORS .B PFMLIB_ERR_NOINIT the library has not been initialized properly. .TP .B PFMLIB_ERR_FULL the string buffer provided is too small .TP .B PFMLIB_ERR_INVAL the event or unit mask descriptor, or the \fBcnt\fR argument is invalid, or a pointer argument is NULL. .SH SEE ALSO pfm_get_impl_counters(3), pfm_get_max_event_name_len(3), free(3) .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_full_event_name.3000066400000000000000000000000361502707512200252550ustar00rootroot00000000000000.so man3/pfm_get_event_name.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_hw_counter_width.3000066400000000000000000000000351502707512200254650ustar00rootroot00000000000000.so man3/pfm_get_impl_pmcs.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_impl_counters.3000066400000000000000000000000351502707512200247740ustar00rootroot00000000000000.so man3/pfm_get_impl_pmcs.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_impl_pmcs.3000066400000000000000000000057551502707512200241120ustar00rootroot00000000000000.TH LIBPFM 3 "July, 2003" "" "Linux Programmer's Manual" .SH NAME pfm_get_impl_pmcs, pfm_get_impl_pmds, pfm_get_impl_counters, pfm_get_num_counters, pfm_get_num_pmcs, pfm_get_num_pmds, pfm_get_hw_counter_width \- return bitmask of implemented PMU registers or number of PMU registers .SH SYNOPSIS .nf .B #include .sp .BI "int pfm_get_impl_pmcs(pfmlib_regmask_t *" impl_pmcs ");" .BI "int pfm_get_impl_pmds(pfmlib_regmask_t *" impl_pmds ");" .BI "int pfm_get_impl_counters(pfmlib_regmask_t *" impl_counters ");" .BI "int pfm_get_num_counters(unsigned int *"num ");" .BI "int pfm_get_num_pmcs(unsigned int *"num ");" .BI "int pfm_get_num_pmds(unsigned int *"num ");" .BI "int pfm_get_num_counters(unsigned int *"num ");" .BI "int pfm_get_hw_counter_width(unsigned int *"width ");" .sp .SH DESCRIPTION The \fBpfm_get_impl_*()\fR functions can be used to figure out which PMU registers are implemented on the host CPU. 
Not all implemented registers are necessarily available to applications. Programs need to query the operating system kernel monitoring interface to figure out the list of available registers. .sp The \fBpfm_get_impl_*()\fR functions all return a bitmask of registers corresponding to the query. The bitmask pointer passed as argument is reset to zero by each function. The returned bitmask must be accessed using the set of functions provided by the library to ensure portability. See related man pages below. .sp The \fBpfm_get_num_*()\fR functions return the number of implemented PMC or PMD registers. Those numbers may be different from the actual number of registers available to applications. .sp The \fBpfm_get_impl_pmcs()\fR function returns in \fBimpl_pmcs\fR the bitmask of implemented PMCs. The \fBpfm_get_impl_pmds()\fR function returns in \fBimpl_pmds\fR the bitmask of implemented PMDs. The \fBpfm_get_impl_counters()\fR function returns in \fBimpl_counters\fR a bitmask of the PMD registers used as counters. Depending on the PMU model, not all PMD registers are necessarily used as counters. .sp The \fBpfm_get_num_counters()\fR function returns in \fBnum\fR the number of PMDs used as counters. A counter is a PMD which is used to accumulate the number of occurrences of an event. The \fBpfm_get_num_pmcs()\fR function returns in \fBnum\fR the number of PMCs implemented by the host PMU. The \fBpfm_get_num_pmds()\fR function returns in \fBnum\fR the number of PMDs implemented by the host PMU. The \fBpfm_get_hw_counter_width()\fR function returns the width in bits of the counters in \fBwidth\fR. PMU implementations can implement different numbers of bits. For instance, Itanium has 32-bit counters, while Itanium 2 has 47-bit counters. .SH RETURN The function returns whether or not it was successful. A return value of \fBPFMLIB_SUCCESS\fR indicates success, otherwise the value is the error code. .SH ERRORS .TP .B PFMLIB_ERR_NOINIT the library has not been initialized properly. 
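An application typically walks the bitmask returned by \fBpfm_get_impl_counters()\fR bit by bit through the library's accessor. The real \fBpfmlib_regmask_t\fR is opaque and must only be touched via \fBpfm_regmask_isset()\fR and friends, so the sketch below uses hypothetical mock_ stand-ins purely to show the scan pattern.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mock stand-in for the opaque pfmlib_regmask_t: 4 x 64 = 256 bits. */
typedef struct { uint64_t bits[4]; } mock_regmask_t;

/* Stand-ins for pfm_regmask_set()/pfm_regmask_isset() semantics. */
static void mock_regmask_set(mock_regmask_t *m, unsigned int b)
{
    m->bits[b / 64] |= (uint64_t)1 << (b % 64);
}

static int mock_regmask_isset(const mock_regmask_t *m, unsigned int b)
{
    return (int)((m->bits[b / 64] >> (b % 64)) & 1);
}

/* Walk the mask the way an application would scan the bitmask
 * returned by pfm_get_impl_counters(), counting usable counters. */
static unsigned int count_impl_counters(const mock_regmask_t *m,
                                        unsigned int max_regs)
{
    unsigned int i, n = 0;
    for (i = 0; i < max_regs; i++)
        if (mock_regmask_isset(m, i))
            n++;
    return n;
}
```

With the real library, the loop body would call \fBpfm_regmask_isset()\fR on the mask filled in by \fBpfm_get_impl_counters()\fR.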
.SH SEE ALSO pfm_regmask_set(3), pfm_regmask_isset(3) .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_impl_pmds.3000066400000000000000000000000351502707512200240750ustar00rootroot00000000000000.so man3/pfm_get_impl_pmcs.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_inst_retired.3000066400000000000000000000000371502707512200246060ustar00rootroot00000000000000.so man3/pfm_get_cycle_event.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_max_event_name_len.3000066400000000000000000000000361502707512200257360ustar00rootroot00000000000000.so man3/pfm_get_event_name.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_num_counters.3000066400000000000000000000000351502707512200246320ustar00rootroot00000000000000.so man3/pfm_get_impl_pmcs.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_num_events.3000066400000000000000000000000361502707512200242750ustar00rootroot00000000000000.so man3/pfm_get_event_name.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_num_pmcs.3000066400000000000000000000000351502707512200237320ustar00rootroot00000000000000.so man3/pfm_get_impl_pmcs.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_num_pmds.3000066400000000000000000000000351502707512200237330ustar00rootroot00000000000000.so man3/pfm_get_impl_pmcs.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_pmu_name.3000066400000000000000000000134531502707512200237220ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2003" "" "Linux Programmer's Manual" .SH NAME pfm_get_pmu_name, pfm_get_pmu_type, pfm_get_pmu_name_bytype, pfm_pmu_is_supported, pfm_force_pmu,pfm_list_supported_pmu \- query library about supported PMU models .SH SYNOPSIS .nf .B #include .sp .BI "int pfm_get_pmu_name(char *"name ", int " maxlen); .BI "int pfm_get_pmu_type(int *" type); .BI "int pfm_get_pmu_name_bytype(int " type ", char *" name ", int " maxlen); .BI "int pfm_pmu_is_supported(int " type); .BI "int pfm_force_pmu(int " type); .BI "int pfm_list_supported_pmus(int 
(*" pf ")(const char *"fmt ",...));" .sp .SH DESCRIPTION These functions retrieve information about the detected host PMU and the PMU models supported by the library. More than one model can be supported by the same library. Each PMU model is assigned a type and a name. The latter is just a string and the former is a unique identifier. The currently supported types are: .TP .B PFMLIB_GENERIC_PMU Intel Itanium default architected PMU model, i.e., the basic model. .TP .B PFMLIB_ITANIUM_PMU Intel Itanium processor PMU model. The model is found in the first implementation of the IA-64 architecture, code name Merced. .TP .B PFMLIB_ITANIUM2_PMU Intel Itanium 2 processor PMU model. This is the model provided by McKinley, Madison, and Deerfield processors. .TP .B PFMLIB_MONTECITO_PMU Intel Dual-core Itanium 2 processor PMU model. This is the model provided by Montecito, Montvale processors. .TP .B PFMLIB_AMD64_PMU AMD AMD64 processors (family 15 and 16) .TP .B PFMLIB_GEN_IA32_PMU Intel X86 architectural PMU v1, v2, v3 .TP .B PFMLIB_I386_P6_PMU Intel P6 processors. That includes Pentium Pro, Pentium II, Pentium III, but excludes Pentium M .TP .B PFMLIB_I386_PM_PMU Intel Pentium M processors. .TP .B PFMLIB_INTEL_PII_PMU Intel Pentium II processors. .TP .B PFMLIB_PENTIUM4_PMU Intel processors based on Netburst micro-architecture. That includes Pentium 4. .TP .B PFMLIB_COREDUO_PMU Intel processors based on Yonah micro-architecture. That includes Intel Core Duo/Core Solo processors .TP .B PFMLIB_I386_PM_PMU Intel Pentium M processors .TP .B PFMLIB_INTEL_CORE_PMU Intel processors based on the Core micro-architecture. That includes Intel Core 2 Duo/Quad processors .TP .B PFMLIB_INTEL_ATOM_PMU Intel processors based on the Atom micro-architecture. .TP .B PFMLIB_INTEL_NHM_PMU Intel processors based on the Nehalem micro-architectures. That includes Intel Core i7 processors. 
.TP .B PFMLIB_MIPS_20KC_PMU MIPS 20KC processors .TP .B PFMLIB_MIPS_24K_PMU MIPS 24K processors .TP .B PFMLIB_MIPS_25KF_PMU MIPS 25KF processors .TP .B PFMLIB_MIPS_34K_PMU MIPS 34K processors .TP .B PFMLIB_MIPS_5KC_PMU MIPS 5KC processors .TP .B PFMLIB_MIPS_74K_PMU MIPS 74K processors .TP .B PFMLIB_MIPS_R10000_PMU MIPS R10000 processors .TP .B PFMLIB_MIPS_R12000_PMU MIPS R12000 processors .TP .B PFMLIB_MIPS_RM7000_PMU MIPS RM7000 processors .TP .B PFMLIB_MIPS_RM9000_PMU MIPS RM9000 processors .TP .B PFMLIB_MIPS_SB1_PMU MIPS SB1/SB1A processors .TP .B PFMLIB_MIPS_VR5432_PMU MIPS VR5432 processors .TP .B PFMLIB_MIPS_VR5500_PMU MIPS VR5500 processors .TP .B PFMLIB_MIPS_ICE9A_PMU SiCortex ICE9A .TP .B PFMLIB_MIPS_ICE9B_PMU SiCortex ICE9B .TP .B PFMLIB_POWERPC_PMU IBM POWERPC processors .TP .B PFMLIB_CRAYX2_PMU Cray X2 processors .TP .B PFMLIB_CELL_PMU IBM Cell processors .TP .B PFMLIB_PPC970_PMU IBM PowerPC 970(FX,GX) processors .TP .B PFMLIB_PPC970MP_PMU IBM PowerPC 970MP processors .TP .B PFMLIB_POWER3_PMU IBM POWER3 processors .TP .B PFMLIB_POWER4_PMU IBM POWER4 processors .TP .B PFMLIB_POWER5_PMU IBM POWER5 processors .TP .B PFMLIB_POWER5p_PMU IBM POWER5+ processors .TP .B PFMLIB_POWER6_PMU IBM POWER6 processors .LP The \fBpfm_get_pmu_name()\fR function returns the name of the detected host PMU. The library must have been initialized properly before making this call. The name is returned in the \fBname\fR argument. The \fBmaxlen\fR argument indicates the maximum length of the buffer provided for \fBname\fR. Up to \fBmaxlen-1\fR characters will be returned, not including the termination character. .sp The \fBpfm_get_pmu_type()\fR function returns the type of the detected host PMU. The library must have been initialized properly before making this call. The type returned in \fBtype\fR can be any one of the types listed above. .sp The \fBpfm_get_pmu_name_bytype()\fR function returns the name of a PMU model in \fBname\fR given a type in the \fBtype\fR argument. 
The \fBmaxlen\fR argument indicates the maximum length of the buffer provided for \fBname\fR. Up to \fBmaxlen-1\fR characters will be returned, not including the termination character. .sp The \fBpfm_pmu_is_supported()\fR function returns \fBPFMLIB_SUCCESS\fR if the given PMU type is supported by the library, independently of what the host PMU model is. .sp The \fBpfm_force_pmu()\fR function is used to force the library to use a particular PMU model instead of the one it has detected. The library checks that the selected type can be supported by the host PMU. This is mostly useful to force the library to use the generic PMU model \fBPFMLIB_GENERIC_PMU\fR. This function can be called at any time and upon return the library is considered initialized. .sp The \fBpfm_list_supported_pmus()\fR function is used to print the list of PMU types that the library supports. The result is printed using the function provided in the \fBpf\fR argument, which must be a printf-style function. .SH RETURN The function returns whether or not it was successful. A return value of \fBPFMLIB_SUCCESS\fR indicates success, otherwise the value is the error code. .SH ERRORS .TP .B PFMLIB_ERR_NOINIT the library has not been initialized properly. .TP .B PFMLIB_ERR_INVAL an invalid argument was given, most likely an invalid pointer or an invalid PMU type. .TP .B PFMLIB_ERR_NOTSUPP the selected PMU type cannot be used on the host CPU. 
.SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_pmu_name_bytype.3000066400000000000000000000000341502707512200253050ustar00rootroot00000000000000.so man3/pfm_get_pmu_name.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_pmu_type.3000066400000000000000000000000341502707512200237520ustar00rootroot00000000000000.so man3/pfm_get_pmu_name.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_get_version.3000066400000000000000000000020221502707512200235740ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2003" "" "Linux Programmer's Manual" .SH NAME pfm_get_version \- get performance monitoring library version .SH SYNOPSIS .nf .B #include .sp .BI "int pfm_get_version(unsigned int *"version); .sp .SH DESCRIPTION This function can be called at any time to get the revision level of the library. The version is encoded into an unsigned integer and returned in the \fBversion\fR argument. A revision number is composed of two fields: a major number and a minor number. Both can be extracted from the returned argument using macros provided in the header file: .TP .B PFMLIB_MAJ_VERSION(v) returns the major number encoded in v. .TP .B PFMLIB_MIN_VERSION(v) returns the minor number encoded in v. .SH RETURN The function returns whether or not it was successful. A return value of \fBPFMLIB_SUCCESS\fR indicates success, otherwise the value is the error code. .SH ERRORS .TP .B PFMLIB_ERR_INVAL the argument is invalid, most likely a NULL pointer. .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_initialize.3000066400000000000000000000016321502707512200234170ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2003" "" "Linux Programmer's Manual" .SH NAME pfm_initialize \- initialize performance monitoring library .SH SYNOPSIS .nf .B #include .sp .BI "int pfm_initialize(void);" .sp .SH DESCRIPTION This is the first function that a program using the library \fBmust\fR call otherwise the library will not function at all. 
This function probes the host PMU and initialize the internal state of the library. In the case of a multi-threaded application, this function needs to be called only once, most likely by the initial thread. .SH RETURN The function returns whether or not it was successful, i.e., the host PMU has been correctly identified and is supported. A return value of \fBPFMLIB_SUCCESS\fR indicates success, otherwise the value is an error code. .SH ERRORS .TP .B PFMLIB_ERR_NOTSUPP the host PMU is not supported. .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_list_supported_pmus.3000066400000000000000000000000341502707512200253750ustar00rootroot00000000000000.so man3/pfm_get_pmu_name.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_pmu_is_supported.3000066400000000000000000000000341502707512200246520ustar00rootroot00000000000000.so man3/pfm_get_pmu_name.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_regmask_and.3000066400000000000000000000000331502707512200235230ustar00rootroot00000000000000.so man3/pfm_regmask_set.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_regmask_clr.3000066400000000000000000000000331502707512200235410ustar00rootroot00000000000000.so man3/pfm_regmask_set.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_regmask_copy.3000066400000000000000000000000331502707512200237330ustar00rootroot00000000000000.so man3/pfm_regmask_set.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_regmask_eq.3000066400000000000000000000000331502707512200233660ustar00rootroot00000000000000.so man3/pfm_regmask_set.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_regmask_isset.3000066400000000000000000000000331502707512200241100ustar00rootroot00000000000000.so man3/pfm_regmask_set.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_regmask_or.3000066400000000000000000000000331502707512200234010ustar00rootroot00000000000000.so man3/pfm_regmask_set.3 
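The ordering constraint stated in pfm_initialize(3) and pfm_get_pmu_name(3) above — initialize once, then query — can be sketched as follows. The mock_ names are hypothetical stand-ins for the real libpfm calls; only the control flow (check for the NOINIT error, size the buffer, check the return code) mirrors the documented behavior.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for libpfm return codes (the real ones are PFMLIB_*). */
#define MOCK_SUCCESS      0
#define MOCK_ERR_NOINIT (-1)
#define MOCK_ERR_FULL   (-2)

static int mock_initialized;

/* Stand-in for pfm_initialize(): probes the host PMU once. */
static int mock_initialize(void)
{
    mock_initialized = 1;
    return MOCK_SUCCESS;
}

/* Stand-in for pfm_get_pmu_name(): fails with NOINIT if the library
 * has not been initialized, and with FULL if the buffer is too small. */
static int mock_get_pmu_name(char *name, int maxlen)
{
    const char *pmu = "mock-pmu";
    if (!mock_initialized)
        return MOCK_ERR_NOINIT;
    if ((int)strlen(pmu) >= maxlen)
        return MOCK_ERR_FULL;
    strcpy(name, pmu);
    return MOCK_SUCCESS;
}
```

In a multi-threaded program the initialization call is made once, typically by the initial thread, before any other thread queries the library.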
papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_regmask_set.3000066400000000000000000000050701502707512200235620ustar00rootroot00000000000000.TH LIBPFM 3 "Apr, 2006" "" "Linux Programmer's Manual" .SH NAME pfm_regmask_set, pfm_regmask_isset, pfm_regmask_clr, pfm_regmask_weight, pfm_regmask_eq, pfm_regmask_and, pfm_regmask_or, pfm_regmask_copy \- operations on pfmlib_regmask_t bitmasks .SH SYNOPSIS .nf .B #include .sp .BI "int pfm_regmask_isset(pfmlib_regmask_t *"mask ", unsigned int "b ");" .BI "int pfm_regmask_set(pfmlib_regmask_t *"mask ", unsigned int "b ");" .BI "int pfm_regmask_clr(pfmlib_regmask_t *"mask ", unsigned int "b ");" .BI "int pfm_regmask_weight(pfmlib_regmask_t *"mask ", unsigned int *"w ");" .BI "int pfm_regmask_eq(pfmlib_regmask_t *"mask1 ", pfmlib_regmask_t *"mask2 ");" .BI "int pfm_regmask_and(pfmlib_regmask_t *"dest ", pfmlib_regmask_t *"m1 ", pfmlib_regmask_t *"m2 ");" .BI "int pfm_regmask_or(pfmlib_regmask_t *"dest ", pfmlib_regmask_t *"m1 ", pfmlib_regmask_t *"m2 ");" .BI "int pfm_regmask_copy(pfmlib_regmask_t *"dest ", pfmlib_regmask_t *"src ");" .sp .SH DESCRIPTION This set of functions is used to operate on the \fBpfmlib_regmask_t\fR bitmasks that are returned by certain functions or passed to the \fBpfm_dispatch_events()\fR function. To ensure portability, it is important that applications use \fBonly\fR the functions specified here to access the bitmasks. It is strongly discouraged to access the internal fields of the \fBpfmlib_regmask_t\fR structure. The \fBpfm_regmask_set()\fR function is used to set bit \fBb\fR in the bitmask \fBmask\fR. The \fBpfm_regmask_clr()\fR function is used to clear bit \fBb\fR in the bitmask \fBmask\fR. The \fBpfm_regmask_isset()\fR function returns a non-zero value if \fBb\fR is set in the bitmask \fBmask\fR. The \fBpfm_regmask_weight()\fR function returns in \fBw\fR the number of bits set in the bitmask \fBmask\fR. 
The \fBpfm_regmask_eq()\fR function returns a non-zero value if the bitmasks \fBmask1\fR and \fBmask2\fR are identical. The \fBpfm_regmask_and()\fR function returns in bitmask \fBdest\fR the result of the logical AND operation between bitmask \fBm1\fR and bitmask \fBm2\fR. The \fBpfm_regmask_or()\fR function returns in bitmask \fBdest\fR the result of the logical OR operation between bitmask \fBm1\fR and bitmask \fBm2\fR. The \fBpfm_regmask_copy()\fR function copies bitmask \fBsrc\fR into bitmask \fBdest\fR. .SH RETURN The function returns whether or not it was successful. A return value of \fBPFMLIB_SUCCESS\fR indicates success, otherwise the value is the error code. .SH ERRORS .TP .B PFMLIB_ERR_INVAL the bit \fBb\fR exceeds the limit supported by the library. .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_regmask_weight.3000066400000000000000000000000331502707512200242500ustar00rootroot00000000000000.so man3/pfm_regmask_set.3 papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_set_options.3000066400000000000000000000034541502707512200236300ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2003" "" "Linux Programmer's Manual" .SH NAME pfm_set_options \- set performance monitoring library debug options .SH SYNOPSIS .nf .B #include .sp .BI "int pfm_set_options(pfmlib_options_t *"opt); .sp .SH DESCRIPTION This function can be called at any time to adjust the debug and verbosity levels of the library. In either case, extra output will be generated on standard error when the library is called. This can be useful to figure out how the PMC registers are initialized, for instance. .sp The opt argument to this function is a pointer to a .B pfmlib_options_t structure which is defined as follows: .sp .nf typedef struct { unsigned int pfm_debug:1; unsigned int pfm_verbose:1; } pfmlib_options_t; .fi .sp .sp Setting \fBpfm_debug\fR to 1 will enable debug messages whereas setting \fBpfm_verbose\fR will enable verbose messages. 
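Filling out the \fBpfmlib_options_t\fR structure before the call can be sketched as follows. The structure definition is copied from the page above; the helper make_debug_options() is illustrative and the call to \fBpfm_set_options()\fR itself is omitted so the sketch stays self-contained.

```c
#include <assert.h>
#include <string.h>

/* Copied from the man page: one-bit fields for each option. */
typedef struct {
    unsigned int pfm_debug:1;
    unsigned int pfm_verbose:1;
} pfmlib_options_t;

/* Illustrative helper: zero the structure first so any bits beyond
 * the two documented fields stay clear, then set the requested ones.
 * A real program would pass the result to pfm_set_options(&opts). */
static pfmlib_options_t make_debug_options(int debug, int verbose)
{
    pfmlib_options_t opts;
    memset(&opts, 0, sizeof(opts));
    opts.pfm_debug   = debug   ? 1 : 0;
    opts.pfm_verbose = verbose ? 1 : 0;
    return opts;
}
```

Remember that the LIBPFM_DEBUG and LIBPFM_VERBOSE environment variables, when set, take precedence over anything configured this way.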
.SH ENVIRONMENT VARIABLES Setting library options with this function has lower priority than with environment variables. As such, the call to this function may not have any actual effects. A user can set the following environment variables to control verbosity and debug output: .TP .B LIBPFM_VERBOSE Enable verbose output. Value must be 0 or 1. When not set, verbosity level can be controlled with this function. .TP .B LIBPFM_DEBUG Enable debug output. Value must be 0 or 1. When not set, debug level can be controlled with this function. .LP .SH RETURN The function returns whether or not it was successful. A return value of \fBPFMLIB_SUCCESS\fR indicates success, otherwise the value is the error code. .sp When environment variables exist, they take precedence and this function returns \fBPFMLIB_SUCCESS\fR. .SH ERRORS .TP .B PFMLIB_ERR_INVAL the argument is invalid, most likely a NULL pointer. .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libperfnec/docs/man3/pfm_strerror.3000066400000000000000000000014521502707512200231400ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2003" "" "Linux Programmer's Manual" .SH NAME pfm_strerror \- return string describing error code .SH SYNOPSIS .nf .B #include .sp .BI "char *pfm_strerror(int "code); .sp .SH DESCRIPTION This function returns a string which describes the libpfm error value in \fBcode\fR. The string returned by the call must be considered as read only. The function must \fBonly\fR be used on libpfm calls. It is not designed to handle OS system call errors. .SH RETURN The function returns a pointer to the string describing the error code. If code is invalid then the default error message is returned. .SH ERRORS If the error code is invalid, then the function returns a pointer to a string which says "unknown error code". 
.SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libperfnec/include/000077500000000000000000000000001502707512200201635ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libperfnec/include/Makefile000066400000000000000000000105571502707512200216330ustar00rootroot00000000000000# # Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. # Contributed by Stephane Eranian # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # TOPDIR := $(shell if [ "$$PWD" != "" ]; then echo $$PWD; else pwd; fi)/.. 
include $(TOPDIR)/config.mk include $(TOPDIR)/rules.mk # perfmon/perfmon.h is installed below HEADERS=perfmon/perfmon_dfl_smpl.h \ perfmon/pfmlib.h \ perfmon/perfmon_v2.h \ perfmon/pfmlib_comp.h \ perfmon/pfmlib_os.h ifeq ($(CONFIG_PFMLIB_ARCH_IA64),y) HEADERS += perfmon/pfmlib_os_ia64.h \ perfmon/pfmlib_comp_ia64.h \ perfmon/perfmon_ia64.h \ perfmon/perfmon_compat.h \ perfmon/perfmon_default_smpl.h \ perfmon/pfmlib_itanium.h \ perfmon/pfmlib_itanium2.h \ perfmon/pfmlib_montecito.h \ perfmon/pfmlib_gen_ia64.h endif ifeq ($(CONFIG_PFMLIB_ARCH_X86_64),y) HEADERS += perfmon/pfmlib_os_x86_64.h \ perfmon/pfmlib_os_i386.h \ perfmon/pfmlib_comp_x86_64.h \ perfmon/pfmlib_comp_i386.h \ perfmon/perfmon_x86_64.h \ perfmon/perfmon_i386.h \ perfmon/pfmlib_i386_p6.h \ perfmon/perfmon_pebs_p4_smpl.h \ perfmon/perfmon_pebs_core_smpl.h \ perfmon/perfmon_pebs_smpl.h \ perfmon/pfmlib_amd64.h \ perfmon/pfmlib_pentium4.h \ perfmon/pfmlib_core.h \ perfmon/pfmlib_coreduo.h \ perfmon/pfmlib_intel_atom.h \ perfmon/pfmlib_intel_nhm.h \ perfmon/pfmlib_gen_ia32.h endif ifeq ($(CONFIG_PFMLIB_ARCH_I386),y) HEADERS += perfmon/pfmlib_os_i386.h \ perfmon/pfmlib_comp_i386.h \ perfmon/perfmon_i386.h \ perfmon/perfmon_pebs_p4_smpl.h \ perfmon/perfmon_pebs_core_smpl.h \ perfmon/perfmon_pebs_smpl.h \ perfmon/pfmlib_amd64.h \ perfmon/pfmlib_pentium4.h \ perfmon/pfmlib_core.h \ perfmon/pfmlib_coreduo.h \ perfmon/pfmlib_intel_atom.h \ perfmon/pfmlib_intel_nhm.h \ perfmon/pfmlib_i386_p6.h \ perfmon/pfmlib_gen_ia32.h endif ifeq ($(CONFIG_PFMLIB_ARCH_POWERPC),y) HEADERS += perfmon/pfmlib_cell.h \ perfmon/pfmlib_os_powerpc.h \ perfmon/pfmlib_comp_powerpc.h \ perfmon/perfmon_powerpc.h \ perfmon/pfmlib_powerpc.h endif ifeq ($(CONFIG_PFMLIB_ARCH_SPARC),y) HEADERS += perfmon/pfmlib_os_sparc.h \ perfmon/pfmlib_comp_sparc.h \ perfmon/perfmon_sparc.h \ perfmon/pfmlib_sparc.h endif ifeq ($(CONFIG_PFMLIB_ARCH_MIPS64),y) HEADERS += perfmon/pfmlib_os_mips64.h \ perfmon/pfmlib_comp_mips64.h \ perfmon/perfmon_mips64.h \ 
perfmon/pfmlib_gen_mips64.h \ perfmon/pfmlib_sicortex.h endif ifeq ($(CONFIG_PFMLIB_ARCH_CRAYX2),y) HEADERS += perfmon/pfmlib_os_crayx2.h \ perfmon/pfmlib_comp_crayx2.h \ perfmon/perfmon_crayx2.h \ perfmon/pfmlib_crayx2.h endif .PHONY: perfmon.h dir perfmon.h: dir perfmon.h: ifeq ($(CONFIG_PFMLIB_OLD_PFMV2),y) echo "#ifndef PFMLIB_OLD_PFMV2" > $(DESTDIR)$(INCDIR)/perfmon/perfmon.h echo "#define PFMLIB_OLD_PFMV2" >> $(DESTDIR)$(INCDIR)/perfmon/perfmon.h echo "#endif" >> $(DESTDIR)$(INCDIR)/perfmon/perfmon.h cat perfmon/perfmon.h >> $(DESTDIR)$(INCDIR)/perfmon/perfmon.h chmod 644 $(DESTDIR)$(INCDIR)/perfmon/perfmon.h else $(INSTALL) -m 644 perfmon/perfmon.h $(DESTDIR)$(INCDIR)/perfmon endif dir: mkdir -p $(DESTDIR)$(INCDIR)/perfmon install: dir perfmon.h $(HEADERS) install: $(INSTALL) -m 644 $(HEADERS) $(DESTDIR)$(INCDIR)/perfmon papi-papi-7-2-0-t/src/libperfnec/include/perfmon/000077500000000000000000000000001502707512200216315ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libperfnec/include/perfmon/perfmon.h000066400000000000000000000145471502707512200234630ustar00rootroot00000000000000/* * This file contains the user level interface description for * the perfmon3.x interface on Linux. * * It also includes perfmon2.x interface definitions. * * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian */ #ifndef __PERFMON_H__ #define __PERFMON_H__ #include #include #ifdef __cplusplus extern "C" { #endif #ifdef __x86_64__ #include #endif #define PFM_MAX_PMCS PFM_ARCH_MAX_PMCS #define PFM_MAX_PMDS PFM_ARCH_MAX_PMDS #ifndef SWIG /* * number of element for each type of bitvector */ #define PFM_BPL (sizeof(uint64_t)<<3) #define PFM_BVSIZE(x) (((x)+PFM_BPL-1) / PFM_BPL) #define PFM_PMD_BV PFM_BVSIZE(PFM_MAX_PMDS) #define PFM_PMC_BV PFM_BVSIZE(PFM_MAX_PMCS) #endif /* * special data type for syscall error value used to help * with Python support and in particular for SWIG. 
By using * a specific type we can detect syscalls and trap errors * in one SWIG statement as opposed to having to keep track of * each syscall individually. Programs can use 'int' safely for * the return value. */ typedef int os_err_t; /* error if -1 */ /* * passed to pfm_create * contains list of available register upon return */ #define PFM_ARCH_MAX_PMCS 32 #define PFM_ARCH_MAX_PMDS 32 typedef struct { uint64_t sif_avail_pmcs[PFM_PMC_BV]; /* out: available PMCs */ uint64_t sif_avail_pmds[PFM_PMD_BV]; /* out: available PMDs */ uint64_t sif_reserved[4]; } pfarg_sinfo_t; //os_err_t pfm_create(int flags, pfarg_sinfo_t *sif, // char *smpl_name, void *smpl_arg, size_t arg_size); extern os_err_t pfm_create(int flags, pfarg_sinfo_t *sif, ...); /* * pfm_create flags: * bits[00-15]: generic flags * bits[16-31]: arch-specific flags (see perfmon_const.h) */ #define PFM_FL_NOTIFY_BLOCK 0x01 /* block task on user notifications */ #define PFM_FL_SYSTEM_WIDE 0x02 /* create a system wide context */ #define PFM_FL_SMPL_FMT 0x04 /* session uses sampling format */ #define PFM_FL_OVFL_NO_MSG 0x80 /* no overflow msgs */ /* * PMC and PMD generic (simplified) register description */ typedef struct { uint16_t reg_num; /* which register */ uint16_t reg_set; /* which event set */ uint32_t reg_flags; /* REGFL flags */ uint64_t reg_value; /* 64-bit value */ } pfarg_pmr_t; /* * pfarg_pmr_t flags: * bit[00-15] : generic flags * bit[16-31] : arch-specific flags * * PFM_REGFL_NO_EMUL64: must be set on the PMC controlling the PMD */ #define PFM_REGFL_OVFL_NOTIFY 0x1 /* PMD: send notification on event */ #define PFM_REGFL_RANDOM 0x2 /* PMD: randomize value after event */ #define PFM_REGFL_NO_EMUL64 0x4 /* PMC: no 64-bit emulation */ /* * PMD extended description * to be used with pfm_writeand pfm_read * must be used with type = PFM_RW_PMD_ATTR */ typedef struct { uint16_t reg_num; /* which register */ uint16_t reg_set; /* which event set */ uint32_t reg_flags; /* REGFL flags */ uint64_t reg_value; 
/* 64-bit value */ uint64_t reg_long_reset; /* write: value to reload after notification */ uint64_t reg_short_reset; /* write: reset after counter overflow */ uint64_t reg_random_mask; /* write: bitmask used to limit random value */ uint64_t reg_smpl_pmds[PFM_PMD_BV]; /* write: record in sample */ uint64_t reg_reset_pmds[PFM_PMD_BV]; /* write: reset on overflow */ uint64_t reg_ovfl_swcnt; /* write: # overflows before switch */ uint64_t reg_smpl_eventid; /* write: opaque event identifier */ uint64_t reg_last_value; /* read: PMD last reset value */ uint64_t reg_reserved[8]; /* for future use */ } pfarg_pmd_attr_t; /* * pfm_write, pfm_read type: */ #define PFM_RW_PMD 1 /* simplified PMD (pfarg_pmr_t) */ #define PFM_RW_PMC 2 /* PMC registers (pfarg_pmr_t) */ #define PFM_RW_PMD_ATTR 3 /* extended PMD (pfarg_pmd_attr) */ /* * pfm_attach special target for detach */ #define PFM_NO_TARGET -1 /* no target, detach */ /* * pfm_set_state state: */ #define PFM_ST_START 0x1 /* start monitoring */ #define PFM_ST_STOP 0x2 /* stop monitoring */ #define PFM_ST_RESTART 0x3 /* resume after notify */ #ifndef PFMLIB_OLD_PFMV2 typedef struct { uint16_t set_id; /* which set */ uint16_t set_reserved1; /* for future use */ uint32_t set_flags; /* SETFL flags */ uint64_t set_timeout; /* requested/effective switch timeout in nsecs */ uint64_t reserved[6]; /* for future use */ } pfarg_set_desc_t; typedef struct { uint16_t set_id; /* which set */ uint16_t set_reserved1; /* for future use */ uint32_t set_reserved2; /* for future use */ uint64_t set_ovfl_pmds[PFM_PMD_BV]; /* out: last ovfl PMDs */ uint64_t set_runs; /* out: #times set was active */ uint64_t set_timeout; /* out: leftover switch timeout (nsecs) */ uint64_t set_duration; /* out: time set was active (nsecs) */ uint64_t set_reserved3[4]; /* for future use */ } pfarg_set_info_t; #endif /* * pfm_set_desc_t flags: */ #define PFM_SETFL_OVFL_SWITCH 0x01 /* enable switch on overflow (subject to individual switch_cnt */ #define 
PFM_SETFL_TIME_SWITCH 0x02 /* switch set on timeout */ #ifndef PFMLIB_OLD_PFMV2 typedef struct { uint32_t msg_type; /* PFM_MSG_OVFL */ uint32_t msg_ovfl_pid; /* process id */ uint16_t msg_active_set; /* active set at the time of overflow */ uint16_t msg_ovfl_cpu; /* cpu on which the overflow occurred */ uint32_t msg_ovfl_tid; /* thread id */ uint64_t msg_ovfl_ip; /* instruction pointer where overflow interrupt happened */ uint64_t msg_ovfl_pmds[PFM_PMD_BV];/* which PMDs overflowed */ } pfarg_ovfl_msg_t; extern os_err_t pfm_write(int fd, int flags, int type, void *reg, size_t n); extern os_err_t pfm_read(int fd, int flags, int type, void *reg, size_t n); extern os_err_t pfm_set_state(int fd, int flags, int state); extern os_err_t pfm_create_sets(int fd, int flags, pfarg_set_desc_t *s, size_t sz); extern os_err_t pfm_getinfo_sets(int fd, int flags, pfarg_set_info_t *s, size_t sz); extern os_err_t pfm_attach(int fd, int flags, int target); #endif #include "perfmon_v2.h" typedef union { uint32_t type; pfarg_ovfl_msg_t pfm_ovfl_msg; } pfarg_msg_t; #define PFM_MSG_OVFL 1 /* an overflow happened */ #define PFM_MSG_END 2 /* thread to which context was attached ended */ #define PFM_VERSION_MAJOR(x) (((x)>>16) & 0xffff) #define PFM_VERSION_MINOR(x) ((x) & 0xffff) #ifdef __cplusplus }; #endif #endif /* _PERFMON_H */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/perfmon_compat.h000066400000000000000000000127451502707512200250240ustar00rootroot00000000000000/* * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * This header file contains obsolete user-level perfmon interface * definitions for the Itanium Processor Family architecture. * * please use replacements as indicated below whenever possible. */ #ifndef _PERFMON_COMPAT_H_ #define _PERFMON_COMPAT_H_ #ifndef __ia64__ #error "you should not include this file on non Itanium platforms" #endif /* * old perfmon2 interface for backward compatibility. 
* Do not use in portable applications. */ extern int perfmonctl(int fd, int cmd, void *arg, int narg); typedef unsigned char pfm_uuid_t[16]; /* custom sampling buffer identifier type */ /* * obsolete perfmon commands supported on all CPU models */ #define PFM_WRITE_PMCS 0x01 #define PFM_WRITE_PMDS 0x02 #define PFM_READ_PMDS 0x03 #define PFM_STOP 0x04 #define PFM_START 0x05 #define PFM_ENABLE 0x06 /* obsolete */ #define PFM_DISABLE 0x07 /* obsolete */ #define PFM_CREATE_CONTEXT 0x08 #define PFM_DESTROY_CONTEXT 0x09 /* obsolete: use close() */ #define PFM_RESTART 0x0a #define PFM_PROTECT_CONTEXT 0x0b /* obsolete */ #define PFM_GET_FEATURES 0x0c /* obsolete: use /proc/sys/kernel/perfmon */ #define PFM_DEBUG 0x0d /* obsolete: use /proc/sys/kernel/perfmon/debug */ #define PFM_UNPROTECT_CONTEXT 0x0e /* obsolete */ #define PFM_GET_PMC_RESET_VAL 0x0f /* obsolete: use /proc/perfmon_mappings */ #define PFM_LOAD_CONTEXT 0x10 #define PFM_UNLOAD_CONTEXT 0x11 /* * PMU model specific commands (may not be supported on all PMU models) */ #define PFM_WRITE_IBRS 0x20 /* obsolete: use PFM_WRITE_PMCS[256-263] */ #define PFM_WRITE_DBRS 0x21 /* obsolete: use PFM_WRITE_PMCS[264-271] */ /* * argument to PFM_CREATE_CONTEXT */ typedef struct { pfm_uuid_t ctx_smpl_buf_id; /* which buffer format to use (if needed) */ unsigned long ctx_flags; /* noblock/block */ unsigned int ctx_reserved1; /* for future use */ int ctx_fd; /* return arg: unique identification for context */ void *ctx_smpl_vaddr; /* return arg: virtual address of sampling buffer, is used */ unsigned long ctx_reserved3[11];/* for future use */ } pfarg_context_t; /* * argument structure for PFM_WRITE_PMCS/PFM_WRITE_PMDS/PFM_READ_PMDS */ typedef struct { unsigned int reg_num; /* which register */ unsigned short reg_set; /* event set for this register */ unsigned short reg_reserved1; /* for future use */ unsigned long reg_value; /* initial pmc/pmd value */ unsigned long reg_flags; /* input: pmc/pmd flags, return: reg error */
unsigned long reg_long_reset; /* reset after buffer overflow notification */ unsigned long reg_short_reset; /* reset after counter overflow */ unsigned long reg_reset_pmds[4]; /* which other counters to reset on overflow */ unsigned long reg_random_seed; /* seed value when randomization is used */ unsigned long reg_random_mask; /* bitmask used to limit random value */ unsigned long reg_last_reset_val;/* return: PMD last reset value */ unsigned long reg_smpl_pmds[4]; /* which pmds are accessed when PMC overflows */ unsigned long reg_smpl_eventid; /* opaque sampling event identifier */ unsigned long reg_ovfl_switch_cnt; /* how many overflow before switch for next set */ unsigned long reg_reserved2[2]; /* for future use */ } pfarg_reg_t; /* * argument to PFM_WRITE_IBRS/PFM_WRITE_DBRS */ typedef struct { unsigned int dbreg_num; /* which debug register */ unsigned short dbreg_set; /* event set for this register */ unsigned short dbreg_reserved1; /* for future use */ unsigned long dbreg_value; /* value for debug register */ unsigned long dbreg_flags; /* return: dbreg error */ unsigned long dbreg_reserved2[1]; /* for future use */ } pfarg_dbreg_t; /* * argument to PFM_GET_FEATURES */ typedef struct { unsigned int ft_version; /* perfmon: major [16-31], minor [0-15] */ unsigned int ft_reserved; /* reserved for future use */ unsigned long reserved[4]; /* for future use */ } pfarg_features_t; typedef struct { int msg_type; /* generic message header */ int msg_ctx_fd; /* generic message header */ unsigned long msg_ovfl_pmds[4]; /* which PMDs overflowed */ unsigned short msg_active_set; /* active set at the time of overflow */ unsigned short msg_reserved1; /* for future use */ unsigned int msg_reserved2; /* for future use */ unsigned long msg_tstamp; /* for perf tuning/debug */ } pfm_ovfl_msg_t; typedef struct { int msg_type; /* generic message header */ int msg_ctx_fd; /* generic message header */ unsigned long msg_tstamp; /* for perf tuning */ } pfm_end_msg_t; typedef struct 
{ int msg_type; /* type of the message */ int msg_ctx_fd; /* unique identifier for the context */ unsigned long msg_tstamp; /* for perf tuning */ } pfm_gen_msg_t; typedef union { int type; pfm_ovfl_msg_t pfm_ovfl_msg; pfm_end_msg_t pfm_end_msg; pfm_gen_msg_t pfm_gen_msg; } pfm_msg_t; /* * PMD/PMC return flags in case of error (ignored on input) * * Those flags are used on output and must be checked in case EINVAL is returned * by a command accepting a vector of values and each has a flag field, such as * pfarg_pmc_t or pfarg_pmd_t. */ #define PFM_REG_RETFL_NOTAVAIL (1<<31) /* set if register is implemented but not available */ #define PFM_REG_RETFL_EINVAL (1<<30) /* set if register entry is invalid */ #define PFM_REG_RETFL_MASK (PFM_REG_RETFL_NOTAVAIL|PFM_REG_RETFL_EINVAL) #define PFM_REG_HAS_ERROR(flag) (((flag) & PFM_REG_RETFL_MASK) != 0) #endif /* _PERFMON_COMPAT_H_ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/perfmon_crayx2.h000066400000000000000000000010061502707512200247350ustar00rootroot00000000000000/* * Copyright (c) 2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * This file should never be included directly, use * instead. */ #ifndef _PERFMON_CRAY_H_ #define _PERFMON_CRAY_H_ #define PFM_ARCH_MAX_PMCS (12+8) /* 12 HW SW 8 */ #define PFM_ARCH_MAX_PMDS (512+8) /* 512 HW SW 8 */ /* * Cray specific register flags */ #define PFM_CRAY_REGFL_SMP_SCOPE 0x10000 /* PMD: shared state event counter */ #endif /* _PERFMON_CRAY_H_ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/perfmon_default_smpl.h000066400000000000000000000070131502707512200262100ustar00rootroot00000000000000/* * Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * This file implements the old default sampling buffer format * for the perfmon2 subsystem. It works ONLY with perfmon v2.0 * on IA-64 systems. 
*/ #ifndef __PERFMON_DEFAULT_SMPL_H__ #define __PERFMON_DEFAULT_SMPL_H__ 1 #ifndef __ia64__ #error "you should not be using this file on a non IA-64 platform" #endif #ifdef __cplusplus extern "C" { #endif #define PFM_DEFAULT_SMPL_UUID { \ 0x4d, 0x72, 0xbe, 0xc0, 0x06, 0x64, 0x41, 0x43, 0x82, 0xb4, 0xd3, 0xfd, 0x27, 0x24, 0x3c, 0x97} /* * format specific parameters (passed at context creation) */ typedef struct { unsigned long buf_size; /* size of the buffer in bytes */ unsigned int flags; /* buffer specific flags */ unsigned int res1; /* for future use */ unsigned long reserved[2]; /* for future use */ } pfm_default_smpl_arg_t; /* * combined context+format specific structure. Can be passed * to PFM_CONTEXT_CREATE */ typedef struct { pfarg_context_t ctx_arg; pfm_default_smpl_arg_t buf_arg; } pfm_default_smpl_ctx_arg_t; /* * This header is at the beginning of the sampling buffer returned to the user. * It is directly followed by the first record. */ typedef struct { uint64_t hdr_count; /* how many valid entries */ uint64_t hdr_cur_offs; /* current offset from top of buffer */ uint64_t dr_reserved2; /* reserved for future use */ uint64_t hdr_overflows; /* how many times the buffer overflowed */ uint64_t hdr_buf_size; /* how many bytes in the buffer */ uint32_t hdr_version; /* contains perfmon version (smpl format diffs) */ uint32_t hdr_reserved1; /* for future use */ uint64_t hdr_reserved[10]; /* for future use */ } pfm_default_smpl_hdr_t; /* * Entry header in the sampling buffer. The header is directly followed * with the values of the PMD registers of interest saved in increasing * index order: PMD4, PMD5, and so on. How many PMDs are present depends * on how the session was programmed. * * In the case where multiple counters overflow at the same time, multiple * entries are written consecutively. * * last_reset_value member indicates the initial value of the overflowed PMD. 
*/ typedef struct { pid_t pid; /* thread id (for NPTL, this is gettid()) */ uint8_t reserved1[3]; /* for future use */ uint8_t ovfl_pmd; /* index of pmd that overflowed for this sample */ uint64_t last_reset_val; /* initial value of overflowed PMD */ unsigned long ip; /* where the overflow interrupt happened */ uint64_t tstamp; /* overflow timestamp */ uint16_t cpu; /* cpu on which the overflow occurred */ uint16_t set; /* event set active when overflow occurred */ pid_t tgid; /* thread group id (for NPTL, this is getpid()) */ } pfm_default_smpl_entry_t; #define PFM_DEFAULT_MAX_PMDS 64 /* how many pmds supported by data structures (sizeof(unsigned long)*8) */ #define PFM_DEFAULT_MAX_ENTRY_SIZE (sizeof(pfm_default_smpl_entry_t)+(sizeof(unsigned long)*PFM_DEFAULT_MAX_PMDS)) #define PFM_DEFAULT_SMPL_MIN_BUF_SIZE (sizeof(pfm_default_smpl_hdr_t)+PFM_DEFAULT_MAX_ENTRY_SIZE) #define PFM_DEFAULT_SMPL_VERSION_MAJ 2U #define PFM_DEFAULT_SMPL_VERSION_MIN 0U #define PFM_DEFAULT_SMPL_VERSION (((PFM_DEFAULT_SMPL_VERSION_MAJ&0xffff)<<16)|(PFM_DEFAULT_SMPL_VERSION_MIN & 0xffff)) #ifdef __cplusplus }; #endif #endif /* __PERFMON_DEFAULT_SMPL_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/perfmon_dfl_smpl.h000066400000000000000000000062061502707512200253340ustar00rootroot00000000000000/* * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * This file implements the new dfl sampling buffer format * for the perfmon2 subsystem. * * This format is supported by all platforms.
For IA-64, older * applications using perfmon v2.0 MUST use the * perfmon_default_smpl.h */ #ifndef __PERFMON_DFL_SMPL_H__ #define __PERFMON_DFL_SMPL_H__ 1 #ifdef __cplusplus extern "C" { #endif #include #define PFM_DFL_SMPL_NAME "default" #ifdef PFMLIB_OLD_PFMV2 /* * UUID for compatibility with perfmon v2.2 (used by Cray) */ #define PFM_DFL_SMPL_UUID { \ 0xd1, 0x39, 0xb2, 0x9e, 0x62, 0xe8, 0x40, 0xe4,\ 0xb4, 0x02, 0x73, 0x07, 0x87, 0x92, 0xe9, 0x37 } #endif /* * format specific parameters (passed at context creation) */ typedef struct { uint64_t buf_size; /* size of the buffer in bytes */ uint32_t buf_flags; /* buffer specific flags */ uint32_t res1; /* for future use */ uint64_t reserved[6]; /* for future use */ } pfm_dfl_smpl_arg_t; /* * This header is at the beginning of the sampling buffer returned to the user. * It is directly followed by the first record. */ typedef struct { uint64_t hdr_count; /* how many valid entries */ uint64_t hdr_cur_offs; /* current offset from top of buffer */ uint64_t hdr_overflows; /* #overflows for buffer */ uint64_t hdr_buf_size; /* bytes in the buffer */ uint64_t hdr_min_buf_space; /* minimal buffer size (internal use) */ uint32_t hdr_version; /* smpl format version */ uint32_t hdr_buf_flags; /* copy of buf_flags */ uint64_t hdr_reserved[10]; /* for future use */ } pfm_dfl_smpl_hdr_t; /* * Entry header in the sampling buffer. The header is directly followed * with the values of the PMD registers of interest saved in increasing * index order: PMD4, PMD5, and so on. How many PMDs are present depends * on how the session was programmed. * * In the case where multiple counters overflow at the same time, multiple * entries are written consecutively. * * last_reset_value member indicates the initial value of the overflowed PMD. 
*/ typedef struct { uint32_t pid; /* thread id (for NPTL, this is gettid()) */ uint16_t ovfl_pmd; /* index of pmd that overflowed for this sample */ uint16_t reserved; /* for future use */ uint64_t last_reset_val; /* initial value of overflowed PMD */ uint64_t ip; /* where the overflow interrupt happened */ uint64_t tstamp; /* overflow timestamp */ uint16_t cpu; /* cpu on which the overflow occurred */ uint16_t set; /* event set active when overflow occurred */ uint32_t tgid; /* thread group id (for NPTL, this is getpid()) */ } pfm_dfl_smpl_entry_t; #define PFM_DFL_SMPL_VERSION_MAJ 1U #define PFM_DFL_SMPL_VERSION_MIN 0U #define PFM_DFL_SMPL_VERSION (((PFM_DFL_SMPL_VERSION_MAJ&0xffff)<<16)|(PFM_DFL_SMPL_VERSION_MIN & 0xffff)) #ifdef __cplusplus }; #endif #endif /* __PERFMON_DFL_SMPL_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/perfmon_i386.h000066400000000000000000000007531502707512200242260ustar00rootroot00000000000000/* * Copyright (c) 2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * This file should never be included directly, use * <perfmon/perfmon.h> instead. */ #ifndef _PERFMON_I386_H_ #define _PERFMON_I386_H_ /* * Both i386 and x86-64 must have the same limits to ensure ABI * compatibility */ #define PFM_ARCH_MAX_PMCS (256+64) /* 256 HW 64 SW */ #define PFM_ARCH_MAX_PMDS (256+64) /* 256 HW 64 SW */ #endif /* _PERFMON_I386_H_ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/perfmon_ia64.h000066400000000000000000000021701502707512200242730ustar00rootroot00000000000000/* * Copyright (c) 2001-2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * This file should never be included directly, use * <perfmon/perfmon.h> instead. */ #ifndef _PERFMON_IA64_H_ #define _PERFMON_IA64_H_ #define PFM_ARCH_MAX_PMCS (256+64) /* 256 HW 64 SW */ #define PFM_ARCH_MAX_PMDS (256+64) /* 256 HW 64 SW */ /* * privilege level mask usage for ia-64: * * PFM_PLM0 = most privileged (kernel, hypervisor, ..)
* PFM_PLM1 = privilege level 1 * PFM_PLM2 = privilege level 2 * PFM_PLM3 = least privileged (user level) */ /* * Itanium specific context flags */ #define PFM_ITA_FL_INSECURE 0x10000 /* force psr.sp=0 for non self-monitoring */ /* * Itanium specific event set flags */ #define PFM_ITA_SETFL_EXCL_INTR 0x10000 /* exclude interrupt triggered execution */ #define PFM_ITA_SETFL_INTR_ONLY 0x20000 /* include only interrupt triggered execution */ #define PFM_ITA_SETFL_IDLE_EXCL 0x40000 /* not stop monitoring in idle loop */ /* * compatibility for previous versions of the interface */ #include #endif /* _PERFMON_IA64_H_ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/perfmon_mips64.h000066400000000000000000000006331502707512200246540ustar00rootroot00000000000000/* * Copyright (c) 2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * This file should never be included directly, use * instead. */ #ifndef _PERFMON_MIPS64_H_ #define _PERFMON_MIPS64_H_ #define PFM_ARCH_MAX_PMCS (256+64) /* 256 HW 64 SW */ #define PFM_ARCH_MAX_PMDS (256+64) /* 256 HW 64 SW */ #endif /* _PERFMON_MIPS64_H_ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/perfmon_nec.h000066400000000000000000000151411502707512200242770ustar00rootroot00000000000000/* * This file contains the user level interface description for * the perfmon3.x interface on Linux. * * It also includes perfmon2.x interface definitions. * * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian */ #ifndef __PERFMON_H__ #define __PERFMON_H__ #include #include #ifdef __cplusplus extern "C" { #endif #ifdef __ia64__ #include #endif #ifdef __x86_64__ #include #endif #ifdef __i386__ #include #endif #if defined(__powerpc__) || defined(__cell__) #include #endif #ifdef __sparc__ #include #endif #ifdef __mips__ #include #endif #ifdef __crayx2 #include #endif #define PFM_MAX_PMCS 8 #define PFM_MAX_PMDS 8 #define PFM_PMC_BV 8 #define PFM_PMD_BV 8 #ifndef SWIG /* * number of element for each type of bitvector */ #define PFM_BPL (sizeof(uint64_t)<<3) #define PFM_BVSIZE(x) (((x)+PFM_BPL-1) / PFM_BPL) #endif /* * special data type for syscall error value used to help * with Python support and in particular for SWIG. By using * a specific type we can detect syscalls and trap errors * in one SWIG statement as opposed to having to keep track of * each syscall individually. Programs can use 'int' safely for * the return value. */ typedef int os_err_t; /* error if -1 */ /* * passed to pfm_create * contains list of available register upon return */ typedef struct { uint64_t sif_avail_pmcs[PFM_PMC_BV]; /* out: available PMCs */ uint64_t sif_avail_pmds[PFM_PMD_BV]; /* out: available PMDs */ uint64_t sif_reserved[4]; } pfarg_sinfo_t; //os_err_t pfm_create(int flags, pfarg_sinfo_t *sif, // char *smpl_name, void *smpl_arg, size_t arg_size); extern os_err_t pfm_create(int flags, pfarg_sinfo_t *sif, ...); /* * pfm_create flags: * bits[00-15]: generic flags * bits[16-31]: arch-specific flags (see perfmon_const.h) */ #define PFM_FL_NOTIFY_BLOCK 0x01 /* block task on user notifications */ #define PFM_FL_SYSTEM_WIDE 0x02 /* create a system wide context */ #define PFM_FL_SMPL_FMT 0x04 /* session uses sampling format */ #define PFM_FL_OVFL_NO_MSG 0x80 /* no overflow msgs */ /* * PMC and PMD generic (simplified) register description */ typedef struct { uint16_t reg_num; /* which register */ uint16_t reg_set; /* which event set */ uint32_t reg_flags; /* 
REGFL flags */ uint64_t reg_value; /* 64-bit value */ } pfarg_pmr_t; /* * pfarg_pmr_t flags: * bit[00-15] : generic flags * bit[16-31] : arch-specific flags * * PFM_REGFL_NO_EMUL64: must be set on the PMC controlling the PMD */ #define PFM_REGFL_OVFL_NOTIFY 0x1 /* PMD: send notification on event */ #define PFM_REGFL_RANDOM 0x2 /* PMD: randomize value after event */ #define PFM_REGFL_NO_EMUL64 0x4 /* PMC: no 64-bit emulation */ /* * PMD extended description * to be used with pfm_writeand pfm_read * must be used with type = PFM_RW_PMD_ATTR */ typedef struct { uint16_t reg_num; /* which register */ uint16_t reg_set; /* which event set */ uint32_t reg_flags; /* REGFL flags */ uint64_t reg_value; /* 64-bit value */ uint64_t reg_long_reset; /* write: value to reload after notification */ uint64_t reg_short_reset; /* write: reset after counter overflow */ uint64_t reg_random_mask; /* write: bitmask used to limit random value */ uint64_t reg_smpl_pmds[PFM_PMD_BV]; /* write: record in sample */ uint64_t reg_reset_pmds[PFM_PMD_BV]; /* write: reset on overflow */ uint64_t reg_ovfl_swcnt; /* write: # overflows before switch */ uint64_t reg_smpl_eventid; /* write: opaque event identifier */ uint64_t reg_last_value; /* read: PMD last reset value */ uint64_t reg_reserved[8]; /* for future use */ } pfarg_pmd_attr_t; /* * pfm_write, pfm_read type: */ #define PFM_RW_PMD 1 /* simplified PMD (pfarg_pmr_t) */ #define PFM_RW_PMC 2 /* PMC registers (pfarg_pmr_t) */ #define PFM_RW_PMD_ATTR 3 /* extended PMD (pfarg_pmd_attr) */ /* * pfm_attach special target for detach */ #define PFM_NO_TARGET -1 /* no target, detach */ /* * pfm_set_state state: */ #define PFM_ST_START 0x1 /* start monitoring */ #define PFM_ST_STOP 0x2 /* stop monitoring */ #define PFM_ST_RESTART 0x3 /* resume after notify */ #ifndef PFMLIB_OLD_PFMV2 typedef struct { uint16_t set_id; /* which set */ uint16_t set_reserved1; /* for future use */ uint32_t set_flags; /* SETFL flags */ uint64_t set_timeout; /* 
requested/effective switch timeout in nsecs */ uint64_t reserved[6]; /* for future use */ } pfarg_set_desc_t; typedef struct { uint16_t set_id; /* which set */ uint16_t set_reserved1; /* for future use */ uint32_t set_reserved2; /* for future use */ uint64_t set_ovfl_pmds[PFM_PMD_BV]; /* out: last ovfl PMDs */ uint64_t set_runs; /* out: #times set was active */ uint64_t set_timeout; /* out: leftover switch timeout (nsecs) */ uint64_t set_duration; /* out: time set was active (nsecs) */ uint64_t set_reserved3[4]; /* for future use */ } pfarg_set_info_t; #endif /* * pfm_set_desc_t flags: */ #define PFM_SETFL_OVFL_SWITCH 0x01 /* enable switch on overflow (subject to individual switch_cnt */ #define PFM_SETFL_TIME_SWITCH 0x02 /* switch set on timeout */ #ifndef PFMLIB_OLD_PFMV2 typedef struct { uint32_t msg_type; /* PFM_MSG_OVFL */ uint32_t msg_ovfl_pid; /* process id */ uint16_t msg_active_set; /* active set at the time of overflow */ uint16_t msg_ovfl_cpu; /* cpu on which the overflow occurred */ uint32_t msg_ovfl_tid; /* thread id */ uint64_t msg_ovfl_ip; /* instruction pointer where overflow interrupt happened */ uint64_t msg_ovfl_pmds[PFM_PMD_BV];/* which PMDs overflowed */ } pfarg_ovfl_msg_t; extern os_err_t pfm_write(int fd, int flags, int type, void *reg, size_t n); extern os_err_t pfm_read(int fd, int flags, int type, void *reg, size_t n); extern os_err_t pfm_set_state(int fd, int flags, int state); extern os_err_t pfm_create_sets(int fd, int flags, pfarg_set_desc_t *s, size_t sz); extern os_err_t pfm_getinfo_sets(int fd, int flags, pfarg_set_info_t *s, size_t sz); extern os_err_t pfm_attach(int fd, int flags, int target); #endif #include "perfmon_v2.h" typedef union { uint32_t type; pfarg_ovfl_msg_t pfm_ovfl_msg; } pfarg_msg_t; #define PFM_MSG_OVFL 1 /* an overflow happened */ #define PFM_MSG_END 2 /* thread to which context was attached ended */ #define PFM_VERSION_MAJOR(x) (((x)>>16) & 0xffff) #define PFM_VERSION_MINOR(x) ((x) & 0xffff) #ifdef __cplusplus 
}; #endif #endif /* _PERFMON_H */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/perfmon_pebs_core_smpl.h000066400000000000000000000112521502707512200265250ustar00rootroot00000000000000/* * Copyright (c) 2005-2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * This file implements the sampling format to support Intel * Precise Event Based Sampling (PEBS) feature of Intel * Core and Atom processors. * * What is PEBS? * ------------ * This is a hardware feature to enhance sampling by providing * better precision as to where a sample is taken. This avoids the * typical skew in the instruction one can observe with any * interrupt-based sampling technique. * * PEBS also lowers sampling overhead significantly by having the * processor store samples instead of the OS. PMU interrupts are only * generated after multiple samples are written. * * Another benefit of PEBS is that samples can be captured inside * critical sections where interrupts are masked. * * How does it work? * PEBS effectively implements a hardware buffer. The OS must pass a region * of memory where samples are to be stored. The region can have any * size. The OS must also specify the sampling period to reload. The PMU * will interrupt when it reaches the end of the buffer or a specified * threshold location inside the memory region. * * The description of the buffer is stored in the Data Save Area (DS). * The samples are stored sequentially in the buffer. The format of the * buffer is fixed and specified in the PEBS documentation. The sample * format does not change between 32-bit and 64-bit modes unlike on the * Pentium 4 version of PEBS. * * What does the format do? * It provides access to the PEBS feature for both 32-bit and 64-bit * processors that support it. * * The same code and data structures are used for both 32-bit and 64-bit * modes. A single format name is used for both modes. In 32-bit mode, * some of the extended registers are written to zero in each sample.
* * It is important to realize that the format provides a zero-copy * environment for the samples, i.e,, the OS never touches the * samples. Whatever the processor write is directly accessible to * the user. * * Parameters to the buffer can be passed via pfm_create_context() in * the pfm_pebs_smpl_arg structure. */ #ifndef __PERFMON_PEBS_CORE_SMPL_H__ #define __PERFMON_PEBS_CORE_SMPL_H__ 1 #ifdef __cplusplus extern "C" { #endif #include #define PFM_PEBS_CORE_SMPL_NAME "pebs_core" /* * format specific parameters (passed at context creation) */ typedef struct { uint64_t cnt_reset; /* counter reset value */ uint64_t buf_size; /* size of the buffer in bytes */ uint64_t intr_thres; /* index of interrupt threshold entry */ uint64_t reserved[6]; /* for future use */ } pfm_pebs_core_smpl_arg_t; /* * DS Save Area */ typedef struct { uint64_t bts_buf_base; uint64_t bts_index; uint64_t bts_abs_max; uint64_t bts_intr_thres; uint64_t pebs_buf_base; uint64_t pebs_index; uint64_t pebs_abs_max; uint64_t pebs_intr_thres; uint64_t pebs_cnt_reset; } pfm_ds_area_core_t; /* * This header is at the beginning of the sampling buffer returned to the user. * * Because of PEBS alignement constraints, the actual PEBS buffer area does * not necessarily begin right after the header. The hdr_start_offs must be * used to compute the first byte of the buffer. The offset is defined as * the number of bytes between the end of the header and the beginning of * the buffer. 
As such the formula is: * actual_buffer = (unsigned long)(hdr+1)+hdr->hdr_start_offs */ typedef struct { uint64_t overflows; /* #overflows for buffer */ size_t buf_size; /* bytes in the buffer */ size_t start_offs; /* actual buffer start offset */ uint32_t version; /* smpl format version */ uint32_t reserved1; /* for future use */ uint64_t reserved2[5]; /* for future use */ pfm_ds_area_core_t ds; /* DS management Area */ } pfm_pebs_core_smpl_hdr_t; /* * PEBS record format as for both 32-bit and 64-bit modes */ typedef struct { uint64_t eflags; uint64_t ip; uint64_t eax; uint64_t ebx; uint64_t ecx; uint64_t edx; uint64_t esi; uint64_t edi; uint64_t ebp; uint64_t esp; uint64_t r8; /* 0 in 32-bit mode */ uint64_t r9; /* 0 in 32-bit mode */ uint64_t r10; /* 0 in 32-bit mode */ uint64_t r11; /* 0 in 32-bit mode */ uint64_t r12; /* 0 in 32-bit mode */ uint64_t r13; /* 0 in 32-bit mode */ uint64_t r14; /* 0 in 32-bit mode */ uint64_t r15; /* 0 in 32-bit mode */ } pfm_pebs_core_smpl_entry_t; #define PFM_PEBS_CORE_SMPL_VERSION_MAJ 1U #define PFM_PEBS_CORE_SMPL_VERSION_MIN 0U #define PFM_PEBS_CORE_SMPL_VERSION (((PFM_PEBS_CORE_SMPL_VERSION_MAJ&0xffff)<<16)|\ (PFM_PEBS_CORE_SMPL_VERSION_MIN & 0xffff)) #ifdef __cplusplus }; #endif #endif /* __PERFMON_PEBS_CORE_SMPL_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/perfmon_pebs_p4_smpl.h000066400000000000000000000132371502707512200261250ustar00rootroot00000000000000/* * Copyright (c) 2005-2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * This program is free software; you can redistribute it and/or * modify it under the terms of version 2 of the GNU General Public * License as published by the Free Software Foundation. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU * General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA * 02111-1307 USA * * This file implements the sampling format to support Intel * Precise Event Based Sampling (PEBS) feature of Pentium 4 * and other Netburst-based processors. Not to be used for * Intel Core-based processors. * * What is PEBS? * ------------ * This is a hardware feature to enhance sampling by providing * better precision as to where a sample is taken. This avoids the * typical skew in the instruction one can observe with any * interrupt-based sampling technique. * * PEBS also lowers sampling overhead significantly by having the * processor store samples instead of the OS. PMU interrupts are only * generated after multiple samples are written. * * Another benefit of PEBS is that samples can be captured inside * critical sections where interrupts are masked. * * How does it work? * PEBS effectively implements a hardware buffer. The OS must pass a region * of memory where samples are to be stored. The region can have any * size. The OS must also specify the sampling period to reload. The PMU * will interrupt when it reaches the end of the buffer or a specified * threshold location inside the memory region. * * The description of the buffer is stored in the Data Save Area (DS). * The samples are stored sequentially in the buffer. The format of the * buffer is fixed and specified in the PEBS documentation. The sample * format changes between 32-bit and 64-bit modes due to the extended register * file. * * PEBS does not work when HyperThreading is enabled due to certain MSRs * being shared between two threads. * * What does the format do? * It provides access to the PEBS feature for both 32-bit and 64-bit * processors that support it.
* * The same code is used for both 32-bit and 64-bit modes, but different * format names are used because the two modes are not compatible due to * data model and register file differences. Similarly, the public data * structures describing the samples are different. * * It is important to realize that the format provides a zero-copy environment * for the samples, i.e., the OS never touches the samples. Whatever the * processor writes is directly accessible to the user. * * Parameters to the buffer can be passed via pfm_create_context() in * the pfm_pebs_smpl_arg structure. * * It is not possible to run a 32-bit PEBS application on top of a 64-bit * host kernel. */ #ifndef __PERFMON_PEBS_P4_SMPL_H__ #define __PERFMON_PEBS_P4_SMPL_H__ 1 #ifdef __cplusplus extern "C" { #endif #include #ifdef __i386__ #define PFM_PEBS_P4_SMPL_NAME "pebs32_p4" #else #define PFM_PEBS_P4_SMPL_NAME "pebs64_p4" #endif /* * format specific parameters (passed at context creation) */ typedef struct { uint64_t cnt_reset; /* counter reset value */ size_t buf_size; /* size of the buffer in bytes */ size_t intr_thres; /* index of interrupt threshold entry */ uint64_t reserved[6]; /* for future use */ } pfm_pebs_p4_smpl_arg_t; /* * DS Save Area as described in section 15.10.5 */ typedef struct { unsigned long bts_buf_base; unsigned long bts_index; unsigned long bts_abs_max; unsigned long bts_intr_thres; unsigned long pebs_buf_base; unsigned long pebs_index; unsigned long pebs_abs_max; unsigned long pebs_intr_thres; uint64_t pebs_cnt_reset; } pfm_ds_area_p4_t; /* * This header is at the beginning of the sampling buffer returned to the user. * * Because of PEBS alignment constraints, the actual PEBS buffer area does * not necessarily begin right after the header. The hdr_start_offs must be * used to compute the first byte of the buffer. The offset is defined as * the number of bytes between the end of the header and the beginning of * the buffer.
As such the formula is: * actual_buffer = (unsigned long)(hdr+1)+hdr->hdr_start_offs */ typedef struct { uint64_t overflows; /* #overflows for buffer */ size_t buf_size; /* bytes in the buffer */ size_t start_offs; /* actual buffer start offset */ uint32_t version; /* smpl format version */ uint32_t reserved1; /* for future use */ uint64_t reserved2[5]; /* for future use */ pfm_ds_area_p4_t ds; /* DS management Area */ } pfm_pebs_p4_smpl_hdr_t; /* * PEBS record format as for both 32-bit and 64-bit modes */ typedef struct { unsigned long eflags; unsigned long ip; unsigned long eax; unsigned long ebx; unsigned long ecx; unsigned long edx; unsigned long esi; unsigned long edi; unsigned long ebp; unsigned long esp; #ifdef __x86_64__ unsigned long r8; unsigned long r9; unsigned long r10; unsigned long r11; unsigned long r12; unsigned long r13; unsigned long r14; unsigned long r15; #endif } pfm_pebs_p4_smpl_entry_t; #define PFM_PEBS_P4_SMPL_VERSION_MAJ 1U #define PFM_PEBS_P4_SMPL_VERSION_MIN 0U #define PFM_PEBS_P4_SMPL_VERSION (((PFM_PEBS_P4_SMPL_VERSION_MAJ&0xffff)<<16)|\ (PFM_PEBS_P4_SMPL_VERSION_MIN & 0xffff)) #ifdef __cplusplus }; #endif #endif /* __PERFMON_PEBS_P4_SMPL_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/perfmon_pebs_smpl.h000066400000000000000000000110711502707512200255140ustar00rootroot00000000000000/* * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * This program is free software; you can redistribute it and/or * modify it under the terms of version 2 of the GNU General Public * License as published by the Free Software Foundation. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU * General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA * 02111-1307 USA * */ #ifndef __PERFMON_PEBS_SMPL_H__ #define __PERFMON_PEBS_SMPL_H__ 1 /* * The 32-bit and 64-bit formats are identical, thus we use only * one name for the format. */ #define PFM_PEBS_SMPL_NAME "pebs" #define PFM_PEBS_NUM_CNT_RESET 8 /* * format specific parameters (passed at context creation) * * intr_thres: index from start of buffer of entry where the * PMU interrupt must be triggered. It must be several samples * short of the end of the buffer. */ typedef struct { uint64_t buf_size; /* size of the PEBS buffer in bytes */ uint64_t cnt_reset[PFM_PEBS_NUM_CNT_RESET];/* counter reset values */ uint64_t reserved2[23]; /* for future use */ } pfm_pebs_smpl_arg_t; /* * This header is at the beginning of the sampling buffer returned to the user. * * Because of PEBS alignment constraints, the actual PEBS buffer area does * not necessarily begin right after the header. The hdr_start_offs must be * used to compute the first byte of the buffer. The offset is defined as * the number of bytes between the end of the header and the beginning of * the buffer. As such the formula is: * actual_buffer = (unsigned long)(hdr+1)+hdr->hdr_start_offs */ typedef struct { uint64_t overflows; /* #overflows for buffer */ uint64_t count; /* number of valid samples */ uint64_t buf_size; /* total buffer size */ uint64_t pebs_size; /* pebs buffer size */ uint32_t version; /* smpl format version */ uint32_t entry_size; /* pebs sample size */ uint64_t reserved2[11]; /* for future use */ } pfm_pebs_smpl_hdr_t; /* * Sample format as mandated by Intel documentation. * The same format is used in both 32 and 64 bit modes.
*/ typedef struct { uint64_t eflags; uint64_t ip; uint64_t eax; uint64_t ebx; uint64_t ecx; uint64_t edx; uint64_t esi; uint64_t edi; uint64_t ebp; uint64_t esp; uint64_t r8; /* 0 in 32-bit mode */ uint64_t r9; /* 0 in 32-bit mode */ uint64_t r10; /* 0 in 32-bit mode */ uint64_t r11; /* 0 in 32-bit mode */ uint64_t r12; /* 0 in 32-bit mode */ uint64_t r13; /* 0 in 32-bit mode */ uint64_t r14; /* 0 in 32-bit mode */ uint64_t r15; /* 0 in 32-bit mode */ } pfm_pebs_core_smpl_entry_t; /* * Sample format as mandated by Intel documentation. * The same format is used in both 32 and 64 bit modes. */ typedef struct { uint64_t eflags; uint64_t ip; uint64_t eax; uint64_t ebx; uint64_t ecx; uint64_t edx; uint64_t esi; uint64_t edi; uint64_t ebp; uint64_t esp; uint64_t r8; /* 0 in 32-bit mode */ uint64_t r9; /* 0 in 32-bit mode */ uint64_t r10; /* 0 in 32-bit mode */ uint64_t r11; /* 0 in 32-bit mode */ uint64_t r12; /* 0 in 32-bit mode */ uint64_t r13; /* 0 in 32-bit mode */ uint64_t r14; /* 0 in 32-bit mode */ uint64_t r15; /* 0 in 32-bit mode */ uint64_t ia32_perf_global_status; uint64_t daddr; uint64_t dsrc_enc; uint64_t latency; } pfm_pebs_nhm_smpl_entry_t; /* * 64-bit PEBS record format is described in * http://www.intel.com/technology/64bitextensions/30083502.pdf * * The format does not peek at samples. The sample structure is only * used to ensure that the buffer is large enough to accommodate one * sample.
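 *
 * Illustration only (not part of the original header): a user-level
 * consumer would typically walk the valid samples using the count and
 * entry_size fields of pfm_pebs_smpl_hdr_t, e.g.:
 *
 *   pfm_pebs_smpl_hdr_t *hdr = buf;  /* buf: start of the sampling buffer */
 *   char *pos = (char *)(hdr+1);     /* assuming samples follow the header */
 *   uint64_t i;
 *   for (i = 0; i < hdr->count; i++, pos += hdr->entry_size)
 *           handle_sample(pos);      /* handle_sample() is hypothetical */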
*/ #ifdef __i386__ typedef struct { uint32_t eflags; uint32_t ip; uint32_t eax; uint32_t ebx; uint32_t ecx; uint32_t edx; uint32_t esi; uint32_t edi; uint32_t ebp; uint32_t esp; } pfm_pebs_p4_smpl_entry_t; #else typedef struct { uint64_t eflags; uint64_t ip; uint64_t eax; uint64_t ebx; uint64_t ecx; uint64_t edx; uint64_t esi; uint64_t edi; uint64_t ebp; uint64_t esp; uint64_t r8; uint64_t r9; uint64_t r10; uint64_t r11; uint64_t r12; uint64_t r13; uint64_t r14; uint64_t r15; } pfm_pebs_p4_smpl_entry_t; #endif #define PFM_PEBS_SMPL_VERSION_MAJ 1U #define PFM_PEBS_SMPL_VERSION_MIN 0U #define PFM_PEBS_SMPL_VERSION (((PFM_PEBS_SMPL_VERSION_MAJ&0xffff)<<16)|\ (PFM_PEBS_SMPL_VERSION_MIN & 0xffff)) #endif /* __PERFMON_PEBS_SMPL_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/perfmon_powerpc.h000066400000000000000000000006351502707512200252130ustar00rootroot00000000000000/* * Copyright (c) 2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * This file should never be included directly, use * instead. */ #ifndef _PERFMON_POWERPC_H_ #define _PERFMON_POWERPC_H_ #define PFM_ARCH_MAX_PMCS (256+64) /* 256 HW 64 SW */ #define PFM_ARCH_MAX_PMDS (256+64) /* 256 HW 64 SW */ #endif /* _PERFMON_POWERPC_H_ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/perfmon_sparc.h000066400000000000000000000005431502707512200246420ustar00rootroot00000000000000/* * Copyright (c) 2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * This file should never be included directly, use * instead. */ #ifndef _PERFMON_SPARC_H_ #define _PERFMON_SPARC_H_ #define PFM_ARCH_MAX_PMCS 1 #define PFM_ARCH_MAX_PMDS 2 #endif /* _PERFMON_SPARC_H_ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/perfmon_v2.h000066400000000000000000000143751502707512200240710ustar00rootroot00000000000000/* * This file contains the user level interface description for * the perfmon-2.x interface on Linux. 
* * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian */ #ifndef __PERFMON_V2_H__ #define __PERFMON_V2_H__ #ifndef __PERFMON_H__ #error "this file should never be included directly, use perfmon.h instead" #endif /* * argument to v2.3 and onward pfm_create_context() */ typedef struct { uint32_t ctx_flags; /* noblock/block/syswide */ uint32_t ctx_reserved1; /* for future use */ uint64_t ctx_reserved3[7]; /* for future use */ } pfarg_ctx_t; /* * argument for pfm_write_pmcs() */ typedef struct { uint16_t reg_num; /* which register */ uint16_t reg_set; /* event set for this register */ uint32_t reg_flags; /* REGFL flags */ uint64_t reg_value; /* pmc value */ uint64_t reg_reserved2[4]; /* for future use */ } pfarg_pmc_t; /* * argument pfm_write_pmds() and pfm_read_pmds() */ typedef struct { uint16_t reg_num; /* which register */ uint16_t reg_set; /* event set for this register */ uint32_t reg_flags; /* REGFL flags */ uint64_t reg_value; /* initial pmc/pmd value */ uint64_t reg_long_reset; /* reset after buffer overflow notification */ uint64_t reg_short_reset; /* reset after counter overflow */ uint64_t reg_last_reset_val; /* return: PMD last reset value */ uint64_t reg_ovfl_switch_cnt; /* how many overflow before switch for next set */ uint64_t reg_reset_pmds[PFM_PMD_BV]; /* which other PMDS to reset on overflow */ uint64_t reg_smpl_pmds[PFM_PMD_BV]; /* which other PMDS to record when the associated PMD overflows */ uint64_t reg_smpl_eventid; /* opaque sampling event identifier */ uint64_t reg_random_mask; /* bitmask used to limit random value */ uint32_t reg_random_seed; /* seed for randomization (DEPRECATED) */ uint32_t reg_reserved2[7]; /* for future use */ } pfarg_pmd_t; /* * optional argument to pfm_start(), pass NULL if no arg needed */ typedef struct { uint16_t start_set; /* event set to start with */ uint16_t start_reserved1; /* for future use */ uint32_t start_reserved2; /* for future use */ uint64_t 
reserved3[3]; /* for future use */ } pfarg_start_t; /* * argument to pfm_load_context() */ typedef struct { uint32_t load_pid; /* thread or CPU to attach to */ uint16_t load_set; /* set to load first */ uint16_t load_reserved1; /* for future use */ uint64_t load_reserved2[3]; /* for future use */ } pfarg_load_t; #ifndef PFMLIB_OLD_PFMV2 typedef struct { uint16_t set_id; /* which set */ uint16_t set_reserved1; /* for future use */ uint32_t set_flags; /* SETFL flags */ uint64_t set_timeout; /* requested/effective switch timeout in nsecs */ uint64_t reserved[6]; /* for future use */ } pfarg_setdesc_t; typedef struct { uint16_t set_id; /* which set */ uint16_t set_reserved1; /* for future use */ uint32_t set_flags; /* for future use */ uint64_t set_ovfl_pmds[PFM_PMD_BV]; /* out: last ovfl PMDs */ uint64_t set_runs; /* out: #times set was active */ uint64_t set_timeout; /* out: leftover switch timeout (nsecs) */ uint64_t set_act_duration; /* out: time set was active (nsecs) */ uint64_t set_avail_pmcs[PFM_PMC_BV]; /* out: available PMCs */ uint64_t set_avail_pmds[PFM_PMD_BV]; /* out: available PMDs */ uint64_t set_reserved3[6]; /* for future use */ } pfarg_setinfo_t; #endif #ifdef PFMLIB_OLD_PFMV2 /* * argument to pfm_create_evtsets()/pfm_delete_evtsets() */ typedef struct { uint16_t set_id; /* which set */ uint16_t set_id_next; /* next set to go to (must use PFM_SETFL_EXPL_NEXT) */ uint32_t set_flags; /* SETFL flags */ uint64_t set_timeout; /* requested switch timeout in nsecs */ uint64_t set_mmap_offset; /* cookie to pass as mmap offset to access 64-bit virtual PMD */ uint64_t reserved[5]; /* for future use */ } pfarg_setdesc_t; /* * argument to pfm_getinfo_evtsets() */ typedef struct { uint16_t set_id; /* which set */ uint16_t set_id_next; /* output: next set to go to (must use PFM_SETFL_EXPL_NEXT) */ uint32_t set_flags; /* output: SETFL flags */ uint64_t set_ovfl_pmds[PFM_PMD_BV]; /* output: last ovfl PMDs which triggered a switch from set */ uint64_t set_runs; /* 
output: number of times the set was active */ uint64_t set_timeout; /* output:effective/leftover switch timeout in nsecs */ uint64_t set_act_duration; /* number of cycles set was active (syswide only) */ uint64_t set_mmap_offset; /* cookie to pass as mmap offset to access 64-bit virtual PMD */ uint64_t set_avail_pmcs[PFM_PMC_BV]; uint64_t set_avail_pmds[PFM_PMD_BV]; uint64_t reserved[4]; /* for future use */ } pfarg_setinfo_t; #ifdef __crayx2 #define PFM_MAX_HW_PMDS 512 #else #define PFM_MAX_HW_PMDS 256 #endif #define PFM_HW_PMD_BV PFM_BVSIZE(PFM_MAX_HW_PMDS) typedef struct { uint32_t msg_type; /* PFM_MSG_OVFL */ uint32_t msg_ovfl_pid; /* process id */ uint64_t msg_ovfl_pmds[PFM_HW_PMD_BV];/* which PMDs overflowed */ uint16_t msg_active_set; /* active set at the time of overflow */ uint16_t msg_ovfl_cpu; /* cpu on which the overflow occurred */ uint32_t msg_ovfl_tid; /* thread id */ uint64_t msg_ovfl_ip; /* instruction pointer where overflow interrupt happened */ } pfarg_ovfl_msg_t; #endif /* PFMLIB_OLD_PFMV2 */ extern os_err_t pfm_create_context(pfarg_ctx_t *ctx, char *smpl_name, void *smpl_arg, size_t smpl_size); extern os_err_t pfm_write_pmcs(int fd, pfarg_pmc_t *pmcs, int count); extern os_err_t pfm_write_pmds(int fd, pfarg_pmd_t *pmds, int count); extern os_err_t pfm_read_pmds(int fd, pfarg_pmd_t *pmds, int count); extern os_err_t pfm_load_context(int fd, pfarg_load_t *load); extern os_err_t pfm_start(int fd, pfarg_start_t *start); extern os_err_t pfm_stop(int fd); extern os_err_t pfm_restart(int fd); extern os_err_t pfm_create_evtsets(int fd, pfarg_setdesc_t *setd, int count); extern os_err_t pfm_getinfo_evtsets(int fd, pfarg_setinfo_t *info, int count); extern os_err_t pfm_delete_evtsets(int fd, pfarg_setdesc_t *setd, int count); extern os_err_t pfm_unload_context(int fd); #endif /* _PERFMON_V2_H */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/perfmon_x86_64.h000066400000000000000000000005211502707512200244640ustar00rootroot00000000000000/* * Copyright 
(c) 2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * This file should never be included directly, use * instead. */ #ifndef _PERFMON_X86_64_H_ #define _PERFMON_X86_64_H_ #include #endif /* _PERFMON_X86_64_H_ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib.h000066400000000000000000000400321502707512200232520ustar00rootroot00000000000000/* * Copyright (c) 2001-2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #ifndef __PFMLIB_H__ #define __PFMLIB_H__ #ifdef __cplusplus extern "C" { #endif #include #include #include #include #define PFMLIB_VERSION (3 << 16 | 10) #define PFMLIB_MAJ_VERSION(v) ((v)>>16) #define PFMLIB_MIN_VERSION(v) ((v) & 0xffff) /* * Maximum number of PMCs/PMDs supported by the library (especially bitmasks) */ #define PFMLIB_MAX_PMCS 512 /* maximum number of PMCS supported by the library */ #define PFMLIB_MAX_PMDS 512 /* maximum number of PMDS supported by the library */ /* * privilege level mask (mask can be combined) * The interpretation of the level is specific to each * architecture. Checkout the architecture specific header * file for more details. */ #define PFM_PLM0 0x1 /* priv level 0 */ #define PFM_PLM1 0x2 /* priv level 1 */ #define PFM_PLM2 0x4 /* priv level 2 */ #define PFM_PLM3 0x8 /* priv level 3 */ /* * type used to describe a set of bits in the mask (container type) */ typedef unsigned long pfmlib_regmask_bits_t; /* * how many elements do we need to represent all the PMCs and PMDs (rounded up) */ #if PFMLIB_MAX_PMCS > PFMLIB_MAX_PMDS #define PFMLIB_REG_MAX PFMLIB_MAX_PMCS #else #define PFMLIB_REG_MAX PFMLIB_MAX_PMDS #endif #ifndef SWIG #define __PFMLIB_REG_BV_BITS (sizeof(pfmlib_regmask_bits_t)<<3) #define PFMLIB_BVSIZE(x) (((x)+(__PFMLIB_REG_BV_BITS)-1) / __PFMLIB_REG_BV_BITS) #define PFMLIB_REG_BV PFMLIB_BVSIZE(PFMLIB_REG_MAX) #endif typedef struct { pfmlib_regmask_bits_t bits[PFMLIB_REG_BV]; } pfmlib_regmask_t; #define PFMLIB_MAX_MASKS_PER_EVENT 48 /* maximum number of unit masks per event */ /* * event definition for pfmlib_input_param_t */ typedef struct { unsigned int event; /* event descriptor */ unsigned int plm; /* event privilege level mask */ unsigned long flags; /* per-event flag */ unsigned int unit_masks[PFMLIB_MAX_MASKS_PER_EVENT]; /* unit-mask identifiers */ unsigned int num_masks; /* number of masks specified in 'unit_masks' */ unsigned long reserved[2]; /* for future use */ } pfmlib_event_t; /* * generic register 
definition */ typedef struct { unsigned long long reg_value; /* register value */ unsigned long long reg_addr; /* hardware register addr or index */ unsigned int reg_num; /* logical register index (perfmon2) */ unsigned int reg_reserved1; /* for future use */ unsigned long reg_alt_addr; /* alternate hw register addr of index */ } pfmlib_reg_t; /* * library generic input parameters for pfm_dispatch_event() */ typedef struct { unsigned int pfp_event_count; /* how many events specified (input) */ unsigned int pfp_dfl_plm; /* default priv level : used when event.plm==0 */ unsigned int pfp_flags; /* set of flags for all events used when event.flags==0*/ unsigned int reserved1; /* for future use */ pfmlib_event_t pfp_events[PFMLIB_MAX_PMCS]; /* event descriptions */ pfmlib_regmask_t pfp_unavail_pmcs; /* bitmask of unavailable PMC registers */ unsigned long reserved[6]; /* for future use */ } pfmlib_input_param_t; /* * pfp_flags possible values (apply to all events) */ #define PFMLIB_PFP_SYSTEMWIDE 0x1 /* indicate monitors will be used in a system-wide session */ /* * library generic output parameters for pfm_dispatch_event() */ typedef struct { unsigned int pfp_pmc_count; /* number of entries in pfp_pmcs */ unsigned int pfp_pmd_count; /* number of entries in pfp_pmds */ pfmlib_reg_t pfp_pmcs[PFMLIB_MAX_PMCS]; /* PMC registers number and values */ pfmlib_reg_t pfp_pmds[PFMLIB_MAX_PMDS]; /* PMD registers numbers */ unsigned long reserved[7]; /* for future use */ } pfmlib_output_param_t; /* * library configuration options */ typedef struct { unsigned int pfm_debug:1; /* set in debug mode */ unsigned int pfm_verbose:1; /* set in verbose mode */ unsigned int pfm_reserved:30;/* for future use */ } pfmlib_options_t; /* * special data type for libpfm error value used to help * with Python support and in particular for SWIG. 
By using * a specific type we can detect library calls and trap errors * in one SWIG statement as opposed to having to keep track of * each call individually. Programs can use 'int' safely for * the return value. */ typedef int pfm_err_t; /* error if !PFMLIB_SUCCESS */ extern pfm_err_t pfm_set_options(pfmlib_options_t *opt); extern pfm_err_t pfm_initialize(void); extern pfm_err_t pfm_list_supported_pmus(int (*pf)(const char *fmt,...)); extern pfm_err_t pfm_get_pmu_name(char *name, int maxlen); extern pfm_err_t pfm_get_pmu_type(int *type); extern pfm_err_t pfm_get_pmu_name_bytype(int type, char *name, size_t maxlen); extern pfm_err_t pfm_is_pmu_supported(int type); extern pfm_err_t pfm_force_pmu(int type); /* * pfm_find_event_byname() is obsolete, use pfm_find_event */ extern pfm_err_t pfm_find_event(const char *str, unsigned int *idx); extern pfm_err_t pfm_find_event_byname(const char *name, unsigned int *idx); extern pfm_err_t pfm_find_event_bycode(int code, unsigned int *idx); extern pfm_err_t pfm_find_event_bycode_next(int code, unsigned int start, unsigned int *next); extern pfm_err_t pfm_find_event_mask(unsigned int event_idx, const char *str, unsigned int *mask_idx); extern pfm_err_t pfm_find_full_event(const char *str, pfmlib_event_t *e); extern pfm_err_t pfm_get_max_event_name_len(size_t *len); extern pfm_err_t pfm_get_num_events(unsigned int *count); extern pfm_err_t pfm_get_num_event_masks(unsigned int event_idx, unsigned int *count); extern pfm_err_t pfm_get_event_name(unsigned int idx, char *name, size_t maxlen); extern pfm_err_t pfm_get_full_event_name(pfmlib_event_t *e, char *name, size_t maxlen); extern pfm_err_t pfm_get_event_code(unsigned int idx, int *code); extern pfm_err_t pfm_get_event_mask_code(unsigned int idx, unsigned int mask_idx, unsigned int *code); extern pfm_err_t pfm_get_event_counters(unsigned int idx, pfmlib_regmask_t *counters); extern pfm_err_t pfm_get_event_description(unsigned int idx, char **str); extern pfm_err_t 
pfm_get_event_code_counter(unsigned int idx, unsigned int cnt, int *code); extern pfm_err_t pfm_get_event_mask_name(unsigned int event_idx, unsigned int mask_idx, char *name, size_t maxlen); extern pfm_err_t pfm_get_event_mask_description(unsigned int event_idx, unsigned int mask_idx, char **desc); extern pfm_err_t pfm_dispatch_events(pfmlib_input_param_t *p, void *model_in, pfmlib_output_param_t *q, void *model_out); extern pfm_err_t pfm_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs); extern pfm_err_t pfm_get_impl_pmds(pfmlib_regmask_t *impl_pmds); extern pfm_err_t pfm_get_impl_counters(pfmlib_regmask_t *impl_counters); extern pfm_err_t pfm_get_num_pmds(unsigned int *num); extern pfm_err_t pfm_get_num_pmcs(unsigned int *num); extern pfm_err_t pfm_get_num_counters(unsigned int *num); extern pfm_err_t pfm_get_hw_counter_width(unsigned int *width); extern pfm_err_t pfm_get_version(unsigned int *version); extern char *pfm_strerror(int code); extern pfm_err_t pfm_get_cycle_event(pfmlib_event_t *e); extern pfm_err_t pfm_get_inst_retired_event(pfmlib_event_t *e); /* * Supported PMU family */ #define PFMLIB_NO_PMU -1 /* PMU unused (forced) */ #define PFMLIB_UNKNOWN_PMU 0 /* type not yet known (dynamic) */ #define PFMLIB_GEN_IA64_PMU 1 /* Intel IA-64 architected PMU */ #define PFMLIB_ITANIUM_PMU 2 /* Intel Itanium */ #define PFMLIB_ITANIUM2_PMU 3 /* Intel Itanium 2 */ #define PFMLIB_MONTECITO_PMU 4 /* Intel Dual-Core Itanium 2 9000 */ #define PFMLIB_AMD64_PMU 16 /* AMD AMD64 (K7, K8, Families 10h, 15h) */ #define PFMLIB_GEN_IA32_PMU 63 /* Intel architectural PMU for X86 */ #define PFMLIB_I386_P6_PMU 32 /* Intel PIII (P6 core) */ #define PFMLIB_PENTIUM4_PMU 33 /* Intel Pentium4/Xeon/EM64T */ #define PFMLIB_COREDUO_PMU 34 /* Intel Core Duo/Core Solo */ #define PFMLIB_I386_PM_PMU 35 /* Intel Pentium M */ #define PFMLIB_CORE_PMU 36 /* obsolete, use PFMLIB_INTEL_CORE_PMU */ #define PFMLIB_INTEL_CORE_PMU 36 /* Intel Core */ #define PFMLIB_INTEL_PPRO_PMU 37 /* Intel Pentium Pro */ 
#define PFMLIB_INTEL_PII_PMU 38 /* Intel Pentium II */ #define PFMLIB_INTEL_ATOM_PMU 39 /* Intel Atom */ #define PFMLIB_INTEL_NHM_PMU 40 /* Intel Nehalem */ #define PFMLIB_INTEL_WSM_PMU 41 /* Intel Westmere */ #define PFMLIB_MIPS_20KC_PMU 64 /* MIPS 20KC */ #define PFMLIB_MIPS_24K_PMU 65 /* MIPS 24K */ #define PFMLIB_MIPS_25KF_PMU 66 /* MIPS 25KF */ #define PFMLIB_MIPS_34K_PMU 67 /* MIPS 34K */ #define PFMLIB_MIPS_5KC_PMU 68 /* MIPS 5KC */ #define PFMLIB_MIPS_74K_PMU 69 /* MIPS 74K */ #define PFMLIB_MIPS_R10000_PMU 70 /* MIPS R10000 */ #define PFMLIB_MIPS_R12000_PMU 71 /* MIPS R12000 */ #define PFMLIB_MIPS_RM7000_PMU 72 /* MIPS RM7000 */ #define PFMLIB_MIPS_RM9000_PMU 73 /* MIPS RM9000 */ #define PFMLIB_MIPS_SB1_PMU 74 /* MIPS SB1/SB1A */ #define PFMLIB_MIPS_VR5432_PMU 75 /* MIPS VR5432 */ #define PFMLIB_MIPS_VR5500_PMU 76 /* MIPS VR5500 */ #define PFMLIB_MIPS_ICE9A_PMU 77 /* SiCortex ICE9A */ #define PFMLIB_MIPS_ICE9B_PMU 78 /* SiCortex ICE9B */ #define PFMLIB_POWERPC_PMU 90 /* POWERPC */ #define PFMLIB_CRAYX2_PMU 96 /* Cray X2 */ #define PFMLIB_CELL_PMU 100 /* CELL */ #define PFMLIB_PPC970_PMU 110 /* IBM PowerPC 970(FX,GX) */ #define PFMLIB_PPC970MP_PMU 111 /* IBM PowerPC 970MP */ #define PFMLIB_POWER3_PMU 112 /* IBM POWER3 */ #define PFMLIB_POWER4_PMU 113 /* IBM POWER4 */ #define PFMLIB_POWER5_PMU 114 /* IBM POWER5 */ #define PFMLIB_POWER5p_PMU 115 /* IBM POWER5+ */ #define PFMLIB_POWER6_PMU 116 /* IBM POWER6 */ #define PFMLIB_POWER7_PMU 117 /* IBM POWER7 */ #define PFMLIB_SPARC_ULTRA12_PMU 130 /* UltraSPARC I, II, IIi, and IIe */ #define PFMLIB_SPARC_ULTRA3_PMU 131 /* UltraSPARC III */ #define PFMLIB_SPARC_ULTRA3I_PMU 132 /* UltraSPARC IIIi and IIIi+ */ #define PFMLIB_SPARC_ULTRA3PLUS_PMU 133 /* UltraSPARC III+ and IV */ #define PFMLIB_SPARC_ULTRA4PLUS_PMU 134 /* UltraSPARC IV+ */ #define PFMLIB_SPARC_NIAGARA1_PMU 135 /* Niagara-1 */ #define PFMLIB_SPARC_NIAGARA2_PMU 136 /* Niagara-2 */ /* * pfmlib error codes */ #define PFMLIB_SUCCESS 0 /* success */ #define 
PFMLIB_ERR_NOTSUPP -1 /* function not supported */ #define PFMLIB_ERR_INVAL -2 /* invalid parameters */ #define PFMLIB_ERR_NOINIT -3 /* library was not initialized */ #define PFMLIB_ERR_NOTFOUND -4 /* event not found */ #define PFMLIB_ERR_NOASSIGN -5 /* cannot assign events to counters */ #define PFMLIB_ERR_FULL -6 /* buffer is full or too small */ #define PFMLIB_ERR_EVTMANY -7 /* event used more than once */ #define PFMLIB_ERR_MAGIC -8 /* invalid library magic number */ #define PFMLIB_ERR_FEATCOMB -9 /* invalid combination of features */ #define PFMLIB_ERR_EVTSET -10 /* incompatible event sets */ #define PFMLIB_ERR_EVTINCOMP -11 /* incompatible event combination */ #define PFMLIB_ERR_TOOMANY -12 /* too many events or unit masks */ #define PFMLIB_ERR_IRRTOOBIG -13 /* code range too big */ #define PFMLIB_ERR_IRREMPTY -14 /* empty code range */ #define PFMLIB_ERR_IRRINVAL -15 /* invalid code range */ #define PFMLIB_ERR_IRRTOOMANY -16 /* too many code ranges */ #define PFMLIB_ERR_DRRINVAL -17 /* invalid data range */ #define PFMLIB_ERR_DRRTOOMANY -18 /* too many data ranges */ #define PFMLIB_ERR_BADHOST -19 /* not supported by host CPU */ #define PFMLIB_ERR_IRRALIGN -20 /* bad alignment for code range */ #define PFMLIB_ERR_IRRFLAGS -21 /* code range missing flags */ #define PFMLIB_ERR_UMASK -22 /* invalid or missing unit mask */ #define PFMLIB_ERR_NOMEM -23 /* out of memory */ #define __PFMLIB_REGMASK_EL(g) ((g)/__PFMLIB_REG_BV_BITS) #define __PFMLIB_REGMASK_MASK(g) (((pfmlib_regmask_bits_t)1) << ((g) % __PFMLIB_REG_BV_BITS)) static inline int pfm_regmask_isset(pfmlib_regmask_t *h, unsigned int b) { if (b >= PFMLIB_REG_MAX) return 0; return (h->bits[__PFMLIB_REGMASK_EL(b)] & __PFMLIB_REGMASK_MASK(b)) != 0; } static inline int pfm_regmask_set(pfmlib_regmask_t *h, unsigned int b) { if (b >= PFMLIB_REG_MAX) return PFMLIB_ERR_INVAL; h->bits[__PFMLIB_REGMASK_EL(b)] |= __PFMLIB_REGMASK_MASK(b); return PFMLIB_SUCCESS; } static inline int pfm_regmask_clr(pfmlib_regmask_t *h, 
unsigned int b)
{
	if (h == NULL || b >= PFMLIB_REG_MAX)
		return PFMLIB_ERR_INVAL;
	h->bits[__PFMLIB_REGMASK_EL(b)] &= ~__PFMLIB_REGMASK_MASK(b);
	return PFMLIB_SUCCESS;
}

static inline int
pfm_regmask_weight(pfmlib_regmask_t *h, unsigned int *w)
{
	unsigned int pos;
	unsigned int weight = 0;

	if (h == NULL || w == NULL)
		return PFMLIB_ERR_INVAL;

	for (pos = 0; pos < PFMLIB_REG_BV; pos++) {
		weight += (unsigned int)pfmlib_popcnt(h->bits[pos]);
	}
	*w = weight;
	return PFMLIB_SUCCESS;
}

static inline int
pfm_regmask_eq(pfmlib_regmask_t *h1, pfmlib_regmask_t *h2)
{
	unsigned int pos;

	if (h1 == NULL || h2 == NULL)
		return 0;

	for (pos = 0; pos < PFMLIB_REG_BV; pos++) {
		if (h1->bits[pos] != h2->bits[pos])
			return 0;
	}
	return 1;
}

static inline int
pfm_regmask_and(pfmlib_regmask_t *dst, pfmlib_regmask_t *h1, pfmlib_regmask_t *h2)
{
	unsigned int pos;

	if (dst == NULL || h1 == NULL || h2 == NULL)
		return PFMLIB_ERR_INVAL;

	for (pos = 0; pos < PFMLIB_REG_BV; pos++) {
		dst->bits[pos] = h1->bits[pos] & h2->bits[pos];
	}
	return PFMLIB_SUCCESS;
}

static inline int
pfm_regmask_andnot(pfmlib_regmask_t *dst, pfmlib_regmask_t *h1, pfmlib_regmask_t *h2)
{
	unsigned int pos;

	if (dst == NULL || h1 == NULL || h2 == NULL)
		return PFMLIB_ERR_INVAL;

	for (pos = 0; pos < PFMLIB_REG_BV; pos++) {
		dst->bits[pos] = h1->bits[pos] & ~h2->bits[pos];
	}
	return PFMLIB_SUCCESS;
}

static inline int
pfm_regmask_or(pfmlib_regmask_t *dst, pfmlib_regmask_t *h1, pfmlib_regmask_t *h2)
{
	unsigned int pos;

	if (dst == NULL || h1 == NULL || h2 == NULL)
		return PFMLIB_ERR_INVAL;

	for (pos = 0; pos < PFMLIB_REG_BV; pos++) {
		dst->bits[pos] = h1->bits[pos] | h2->bits[pos];
	}
	return PFMLIB_SUCCESS;
}

static inline int
pfm_regmask_copy(pfmlib_regmask_t *dst, pfmlib_regmask_t *src)
{
	unsigned int pos;

	if (dst == NULL || src == NULL)
		return PFMLIB_ERR_INVAL;

	for (pos = 0; pos < PFMLIB_REG_BV; pos++) {
		dst->bits[pos] = src->bits[pos];
	}
	return PFMLIB_SUCCESS;
}

static inline int
pfm_regmask_not(pfmlib_regmask_t *dst)
{
	unsigned int pos;

	if (dst == NULL)
		return PFMLIB_ERR_INVAL;

	for (pos = 0; pos < PFMLIB_REG_BV; pos++) {
		dst->bits[pos] = ~dst->bits[pos];
	}
	return PFMLIB_SUCCESS;
}

#ifdef __cplusplus /* extern C */
}
#endif

#endif /* __PFMLIB_H__ */

/* ==== src/libperfnec/include/perfmon/pfmlib_amd64.h ==== */

/*
 * AMD64 PMU specific types and definitions (64 and 32 bit modes)
 *
 * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */
#ifndef __PFMLIB_AMD64_H__
#define __PFMLIB_AMD64_H__

#include <perfmon/pfmlib.h>

/*
 * privilege level mask usage for AMD64:
 *
 * PFM_PLM0 = OS (kernel, hypervisor, ..)
 * PFM_PLM1 = invalid parameters
 * PFM_PLM2 = invalid parameters
 * PFM_PLM3 = USR (user level)
 */

#ifdef __cplusplus
extern "C" {
#endif

#define PMU_AMD64_MAX_COUNTERS	6 /* total number of performance counters */

/*
 * AMD64 MSR definitions
 */
typedef union {
	uint64_t val;				/* complete register value */
	struct {
		uint64_t sel_event_mask:8;	/* event mask */
		uint64_t sel_unit_mask:8;	/* unit mask */
		uint64_t sel_usr:1;		/* user level */
		uint64_t sel_os:1;		/* system level */
		uint64_t sel_edge:1;		/* edge detect */
		uint64_t sel_pc:1;		/* pin control */
		uint64_t sel_int:1;		/* enable APIC intr */
		uint64_t sel_res1:1;		/* reserved */
		uint64_t sel_en:1;		/* enable */
		uint64_t sel_inv:1;		/* invert counter mask */
		uint64_t sel_cnt_mask:8;	/* counter mask */
		uint64_t sel_event_mask2:4;	/* from 10h: event mask [11:8] */
		uint64_t sel_res2:4;		/* reserved */
		uint64_t sel_guest:1;		/* from 10h: guest only counter */
		uint64_t sel_host:1;		/* from 10h: host only counter */
		uint64_t sel_res3:22;		/* reserved */
	} perfsel;
} pfm_amd64_sel_reg_t;	/* MSR 0xc0010000-0xc0010003 */

typedef union {
	uint64_t val;				/* complete register value */
	struct {
		uint64_t ctr_count:48;		/* 48-bit hardware counter */
		uint64_t ctr_res1:16;		/* reserved */
	} perfctr;
} pfm_amd64_ctr_reg_t;	/* MSR 0xc0010004-0xc0010007 */

typedef union {
	uint64_t val;				/* complete register value */
	struct {
		uint64_t ibsfetchmaxcnt:16;
		uint64_t ibsfetchcnt:16;
		uint64_t ibsfetchlat:16;
		uint64_t ibsfetchen:1;
		uint64_t ibsfetchval:1;
		uint64_t ibsfetchcomp:1;
		uint64_t ibsicmiss:1;
		uint64_t ibsphyaddrvalid:1;
		uint64_t ibsl1tlbpgsz:2;
		uint64_t ibsl1tlbmiss:1;
		uint64_t ibsl2tlbmiss:1;
		uint64_t ibsranden:1;
		uint64_t reserved:6;
	} reg;
} ibsfetchctl_t;	/* MSR 0xc0011030 */

typedef union {
	uint64_t val;				/* complete register value */
	struct {
		uint64_t ibsopmaxcnt:16;
		uint64_t reserved1:1;
		uint64_t ibsopen:1;
		uint64_t ibsopval:1;
		uint64_t ibsopcntl:1;
		uint64_t reserved2:44;
	} reg;
} ibsopctl_t;	/* MSR 0xc0011033 */

typedef union {
	uint64_t val;	/*
 * Contributed by Stephane Eranian
 *
 * (MIT permission and warranty notice identical to the one above.)
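 */

The `pfm_regmask_*` helpers earlier in this dump treat a register mask as an array of bit-vector words indexed by `__PFMLIB_REGMASK_EL()` and masked by `__PFMLIB_REGMASK_MASK()`. The following is a standalone sketch of that idea using stand-in types and an illustrative word count — it is not the libpfm API, just the same word-array/popcount pattern:

```c
#include <assert.h>

#define REG_BV 4 /* illustrative number of 64-bit words; libpfm derives this from PFMLIB_REG_MAX */

typedef struct {
    unsigned long long bits[REG_BV];
} regmask_t;

/* mirrors __PFMLIB_REGMASK_EL / __PFMLIB_REGMASK_MASK: word index and bit-within-word */
static void regmask_set(regmask_t *h, unsigned int b)
{
    h->bits[b / 64] |= 1ULL << (b % 64);
}

/* counts set bits one word at a time, like pfm_regmask_weight() over pfmlib_popcnt() */
static unsigned int regmask_weight(const regmask_t *h)
{
    unsigned int pos, w = 0;
    for (pos = 0; pos < REG_BV; pos++) {
        unsigned long long v = h->bits[pos];
        for (; v; v >>= 1)      /* same shift-and-test loop as the generic popcnt */
            w += (unsigned int)(v & 0x1);
    }
    return w;
}
```

Setting bits 0 and 65 touches two different words, so the weight comes back as 2 — the same answer `pfm_regmask_weight()` would give.

/*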
 */
#ifndef __PFMLIB_CELL_H__
#define __PFMLIB_CELL_H__

#include <perfmon/pfmlib.h>

#define PMU_CELL_NUM_COUNTERS	8 /* total number of EvtSel/EvtCtr */
#define PMU_CELL_NUM_PERFSEL	8 /* total number of EvtSel */
#define PMU_CELL_NUM_PERFCTR	8 /* total number of EvtCtr */

typedef struct {
	unsigned int pmX_control_num;	/* for pmX_control X=1(pm0_control)...X=8(pm7_control) */
	unsigned int spe_subunit;
	unsigned int polarity;
	unsigned int input_control;
	unsigned int cnt_mask;		/* threshold (reserved) */
	unsigned int flags;		/* counter specific flag (reserved) */
} pfmlib_cell_counter_t;

/*
 * Cell specific parameters for the library
 */
typedef struct {
	unsigned int triggers;
	unsigned int interval;
	unsigned int control;
	pfmlib_cell_counter_t pfp_cell_counters[PMU_CELL_NUM_COUNTERS]; /* extended counter features */
	uint64_t reserved[4];		/* for future use */
} pfmlib_cell_input_param_t;

typedef struct {
	uint64_t reserved[8];		/* for future use */
} pfmlib_cell_output_param_t;

int pfm_cell_spe_event(unsigned int event_index);

#endif /* __PFMLIB_CELL_H__ */

/* ==== src/libperfnec/include/perfmon/pfmlib_comp.h ==== */

/*
 * Copyright (c) 2004-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * (MIT permission and warranty notice identical to the one above.)
 */
#ifndef __PFMLIB_COMP_H__
#define __PFMLIB_COMP_H__

#ifdef __ia64__
#include <perfmon/pfmlib_comp_ia64.h>
#endif
#ifdef __x86_64__
#include <perfmon/pfmlib_comp_x86_64.h>
#endif
#ifdef __i386__
#include <perfmon/pfmlib_comp_i386.h>
#endif
#ifdef __mips__
#include <perfmon/pfmlib_comp_mips64.h>
#endif
#ifdef __powerpc__
#include <perfmon/pfmlib_comp_powerpc.h>
#endif
#ifdef __sparc__
#include <perfmon/pfmlib_comp_sparc.h>
#endif
#ifdef __cell__
#include
#endif
#ifdef __crayx2
#include <perfmon/pfmlib_comp_crayx2.h>
#endif

#endif /* __PFMLIB_COMP_H__ */

/* ==== src/libperfnec/include/perfmon/pfmlib_comp_crayx2.h ==== */

/*
 * Cray X2 compiler specific macros
 *
 * Copyright (c) 2007 Cray Inc.
 * Contributed by Steve Kaufmann based on code from
 * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * (MIT permission and warranty notice identical to the one above.)
 */
#ifndef __PFMLIB_COMP_CRAYX2_H__
#define __PFMLIB_COMP_CRAYX2_H__

#ifndef __PFMLIB_COMP_H__
#error "you should never include this file directly, use pfmlib_comp.h"
#endif

#include <intrinsics.h>

#define pfmlib_popcnt _pop

#endif /* __PFMLIB_COMP_CRAYX2_H__ */

/* ==== src/libperfnec/include/perfmon/pfmlib_comp_i386.h ==== */

/*
 * I386 P6/Pentium M compiler specific macros
 *
 * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * (MIT permission and warranty notice identical to the one above.)
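 */

The compiler-specific headers in this directory all provide one primitive, `pfmlib_popcnt`: Cray X2 maps it to the `_pop` intrinsic and Itanium to the `popcnt` instruction, while i386, MIPS, PowerPC, and SPARC fall back to the same portable shift-and-test loop. That fallback, reproduced standalone:

```c
#include <assert.h>

/* portable population count: identical to the loop in the i386/MIPS/PowerPC/SPARC headers */
static unsigned long popcnt_loop(unsigned long v)
{
    unsigned long sum = 0;

    for (; v; v >>= 1) {        /* consume one bit per iteration */
        if (v & 0x1)
            sum++;
    }
    return sum;
}
```

It runs in O(bits) per word, which is fine here: the library only calls it over a handful of register-mask words, so the hardware intrinsics are an optimization rather than a requirement.

/*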
 */
#ifndef __PFMLIB_COMP_I386_P6_H__
#define __PFMLIB_COMP_I386_P6_H__

#ifndef __PFMLIB_COMP_H__
#error "you should never include this file directly, use pfmlib_comp.h"
#endif

#ifndef __i386__
#error "you should not be including this file"
#endif

#ifdef __cplusplus
extern "C" {
#endif

static inline unsigned long
pfmlib_popcnt(unsigned long v)
{
	unsigned long sum = 0;

	for (; v; v >>= 1) {
		if (v & 0x1)
			sum++;
	}
	return sum;
}

#ifdef __cplusplus /* extern C */
}
#endif

#endif /* __PFMLIB_COMP_I386_P6_H__ */

/* ==== src/libperfnec/include/perfmon/pfmlib_comp_ia64.h ==== */

/*
 * IA-64 compiler specific macros
 *
 * Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * (MIT permission and warranty notice identical to the one above.)
 */
#ifndef __PFMLIB_COMP_IA64_H__
#define __PFMLIB_COMP_IA64_H__

#ifndef __PFMLIB_COMP_H__
#error "you should never include this file directly, use pfmlib_comp.h"
#endif

#ifndef __ia64__
#error "you should not be including this file"
#endif

#ifdef __cplusplus
extern "C" {
#endif

/*
 * this header file contains all the macros, inline assembly, intrinsics needed
 * by the library and which are compiler-specific
 */
#if defined(__ECC) && defined(__INTEL_COMPILER)
#define LIBPFM_USING_INTEL_ECC_COMPILER	1
/* if you do not have this file, your compiler is too old */
#include <ia64intrin.h>
#endif

#ifdef LIBPFM_USING_INTEL_ECC_COMPILER
#define ia64_sum(void)		__sum(1<<2)
#define ia64_rum(void)		__rum(1<<2)
#define ia64_get_pmd(regnum)	__getIndReg(_IA64_REG_INDR_PMD, (regnum))
#define pfmlib_popcnt(v)	_m64_popcnt(v)
#elif defined(__GNUC__)
static inline void
ia64_sum(void)
{
	__asm__ __volatile__("sum psr.up;;" ::: "memory");
}

static inline void
ia64_rum(void)
{
	__asm__ __volatile__("rum psr.up;;" ::: "memory");
}

static inline unsigned long
ia64_get_pmd(int regnum)
{
	unsigned long value;
	__asm__ __volatile__("mov %0=pmd[%1]" : "=r"(value) : "r"(regnum));
	return value;
}

static inline unsigned long
pfmlib_popcnt(unsigned long v)
{
	unsigned long ret;
	__asm__ __volatile__("popcnt %0=%1" : "=r"(ret) : "r"(v));
	return ret;
}
#else /* !GNUC nor INTEL_ECC */
#error "need to define a set of compiler-specific macros"
#endif

#ifdef __cplusplus /* extern C */
}
#endif

#endif /* __PFMLIB_COMP_IA64_H__ */

/* ==== src/libperfnec/include/perfmon/pfmlib_comp_mips64.h ==== */

/*
 * MIPS64 compiler specific macros
 *
 * Contributed by Philip Mucci based on code from
 * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * (MIT permission and warranty notice identical to the one above.)
 */
#ifndef __PFMLIB_COMP_MIPS64_H__
#define __PFMLIB_COMP_MIPS64_H__

#ifndef __PFMLIB_COMP_H__
#error "you should never include this file directly, use pfmlib_comp.h"
#endif

#if !defined(__mips__)
#error "you should not be including this file"
#endif

#ifdef __cplusplus
extern "C" {
#endif

static inline unsigned long
pfmlib_popcnt(unsigned long v)
{
	unsigned long sum = 0;

	for (; v; v >>= 1) {
		if (v & 0x1)
			sum++;
	}
	return sum;
}

#ifdef __cplusplus /* extern C */
}
#endif

#endif /* __PFMLIB_COMP_MIPS64_H__ */

/* ==== src/libperfnec/include/perfmon/pfmlib_comp_powerpc.h ==== */

/*
 * PowerPC compiler specific macros
 *
 * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * (MIT permission and warranty notice identical to the one above.)
 */
#ifndef __PFMLIB_COMP_POWERPC_H__
#define __PFMLIB_COMP_POWERPC_H__

#ifndef __PFMLIB_COMP_H__
#error "you should never include this file directly, use pfmlib_comp.h"
#endif

#ifndef __powerpc__
#error "you should not be including this file"
#endif

#ifdef __cplusplus
extern "C" {
#endif

static inline unsigned long
pfmlib_popcnt(unsigned long v)
{
	unsigned long sum = 0;

	for (; v; v >>= 1) {
		if (v & 0x1)
			sum++;
	}
	return sum;
}

#ifdef __cplusplus /* extern C */
}
#endif

#endif /* __PFMLIB_COMP_POWERPC_H__ */

/* ==== src/libperfnec/include/perfmon/pfmlib_comp_sparc.h ==== */

/*
 * Sparc compiler specific macros
 *
 * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * (MIT permission and warranty notice identical to the one above.)
 */
#ifndef __PFMLIB_COMP_SPARC_H__
#define __PFMLIB_COMP_SPARC_H__

#ifndef __PFMLIB_COMP_H__
#error "you should never include this file directly, use pfmlib_comp.h"
#endif

#ifndef __sparc__
#error "you should not be including this file"
#endif

#ifdef __cplusplus
extern "C" {
#endif

static inline unsigned long
pfmlib_popcnt(unsigned long v)
{
	unsigned long sum = 0;

	for (; v; v >>= 1) {
		if (v & 0x1)
			sum++;
	}
	return sum;
}

#ifdef __cplusplus /* extern C */
}
#endif

#endif /* __PFMLIB_COMP_SPARC_H__ */

/* ==== src/libperfnec/include/perfmon/pfmlib_comp_x86_64.h ==== */

/*
 * X86-64 compiler specific macros
 *
 * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * (MIT permission and warranty notice identical to the one above.)
 */
#ifndef __PFMLIB_COMP_X86_64_H__
#define __PFMLIB_COMP_X86_64_H__

#ifndef __PFMLIB_COMP_H__
#error "you should never include this file directly, use pfmlib_comp.h"
#endif

#ifndef __x86_64__
#error "you should not be including this file"
#endif

#ifdef __cplusplus
extern "C" {
#endif

static inline unsigned long
pfmlib_popcnt(unsigned long v)
{
	unsigned long sum = 0;

	for (; v; v >>= 1) {
		if (v & 0x1)
			sum++;
	}
	return sum;
}

#ifdef __cplusplus /* extern C */
}
#endif

#endif /* __PFMLIB_COMP_X86_64_H__ */

/* ==== src/libperfnec/include/perfmon/pfmlib_core.h ==== */

/*
 * Intel Core PMU
 *
 * Copyright (c) 2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * (MIT permission and warranty notice identical to the one above.)
 */
#ifndef __PFMLIB_CORE_H__
#define __PFMLIB_CORE_H__

#include <perfmon/pfmlib.h>

/*
 * privilege level mask usage for Intel Core
 *
 * PFM_PLM0 = OS (kernel, hypervisor, ..)
 * PFM_PLM1 = unused (ignored)
 * PFM_PLM2 = unused (ignored)
 * PFM_PLM3 = USR (user level)
 */

#ifdef __cplusplus
extern "C" {
#endif

#define PMU_CORE_NUM_FIXED_COUNTERS	3 /* number of fixed counters */
#define PMU_CORE_NUM_GEN_COUNTERS	2 /* number of generic counters */
#define PMU_CORE_NUM_COUNTERS		5 /* number of counters */

typedef union {
	unsigned long long val;			/* complete register value */
	struct {
		unsigned long sel_event_select:8; /* event mask */
		unsigned long sel_unit_mask:8;	/* unit mask */
		unsigned long sel_usr:1;	/* user level */
		unsigned long sel_os:1;		/* system level */
		unsigned long sel_edge:1;	/* edge detect */
		unsigned long sel_pc:1;		/* pin control */
		unsigned long sel_int:1;	/* enable APIC intr */
		unsigned long sel_res1:1;	/* reserved */
		unsigned long sel_en:1;		/* enable */
		unsigned long sel_inv:1;	/* invert counter mask */
		unsigned long sel_cnt_mask:8;	/* counter mask */
		unsigned long sel_res2:32;
	} perfevtsel;
} pfm_core_sel_reg_t;

typedef struct {
	unsigned long cnt_mask;	/* threshold (cnt_mask) */
	unsigned int flags;	/* counter specific flag */
} pfmlib_core_counter_t;

#define PFM_CORE_SEL_INV	0x1 /* inverse */
#define PFM_CORE_SEL_EDGE	0x2 /* edge detect */

/*
 * model-specific parameters for the library
 */
typedef struct {
	unsigned int pebs_used;	/* set to 1 if PEBS is used */
} pfmlib_core_pebs_t;

typedef struct {
	pfmlib_core_counter_t pfp_core_counters[PMU_CORE_NUM_COUNTERS];
	pfmlib_core_pebs_t pfp_core_pebs;
	uint64_t reserved[4];	/* for future use */
} pfmlib_core_input_param_t;

typedef struct {
	uint64_t reserved[8];	/* for future use */
} pfmlib_core_output_param_t;

#ifdef __cplusplus /* extern C */
}
#endif

/*
 * PMU-specific interface
 */
extern int pfm_core_is_pebs(pfmlib_event_t *e);

#endif /* __PFMLIB_CORE_H__ */

/* ==== src/libperfnec/include/perfmon/pfmlib_coreduo.h ==== */

/*
 * Intel Core Duo/Solo
 *
 * Copyright (c) 2009 Google, Inc
 * Contributed by Stephane Eranian
 *
 * (MIT permission and warranty notice identical to the one above.)
 */
#ifndef __PFMLIB_COREDUO_H__
#define __PFMLIB_COREDUO_H__

#include <perfmon/pfmlib.h>

/*
 * privilege level mask usage for architected PMU:
 *
 * PFM_PLM0 = OS (kernel, hypervisor, ..)
 * PFM_PLM1 = unused (ignored)
 * PFM_PLM2 = unused (ignored)
 * PFM_PLM3 = USR (user level)
 */

#ifdef __cplusplus
extern "C" {
#endif

#define PMU_COREDUO_NUM_COUNTERS	2

typedef union {
	unsigned long long val;			/* complete register value */
	struct {
		unsigned long sel_event_select:8; /* event mask */
		unsigned long sel_unit_mask:8;	/* unit mask */
		unsigned long sel_usr:1;	/* user level */
		unsigned long sel_os:1;		/* system level */
		unsigned long sel_edge:1;	/* edge detect */
		unsigned long sel_pc:1;		/* pin control */
		unsigned long sel_int:1;	/* enable APIC intr */
		unsigned long sel_res1:1;	/* reserved */
		unsigned long sel_en:1;		/* enable */
		unsigned long sel_inv:1;	/* invert counter mask */
		unsigned long sel_cnt_mask:8;	/* counter mask */
		unsigned long sel_res2:32;
	} perfevtsel;
} pfm_coreduo_sel_reg_t;

typedef struct {
	unsigned long cnt_mask;	/* threshold (cnt_mask) */
	unsigned int flags;	/* counter specific flag */
} pfm_coreduo_counter_t;

#define PFM_COREDUO_SEL_INV	0x1 /* inverse */
#define PFM_COREDUO_SEL_EDGE	0x2 /* edge detect */

/*
 * model-specific parameters for the library
 */
typedef struct {
	pfm_coreduo_counter_t pfp_coreduo_counters[PMU_COREDUO_NUM_COUNTERS];
	uint64_t reserved[4];	/* for future use */
} pfmlib_coreduo_input_param_t;

#ifdef __cplusplus /* extern C */
}
#endif

#endif /* __PFMLIB_COREDUO_H__ */

/* ==== src/libperfnec/include/perfmon/pfmlib_crayx2.h ==== */

/*
 * Copyright (c) 2007 Cray Inc.
 * Contributed by Steve Kaufmann based on code from
 * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P.
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_CRAYX2_H__ #define __PFMLIB_CRAYX2_H__ 1 /* * Allows to be included on its own. */ #define PFM_MAX_HW_PMCS 12 #define PFM_MAX_HW_PMDS 512 #include #include /* Priviledge level mask for Cray-X2: * * PFM_PLM0 = Kernel * PFM_PLM1 = Kernel * PFM_PLM2 = Exception * PFM_PLM3 = User */ /* The performance control (PMC) registers appear as follows: * PMC0 control for CPU chip * PMC1 events on CPU chip * PMC2 enable for CPU chip * PMC3 control for L2 Cache chip * PMC4 events on L2 Cache chip * PMC5 enable for L2 Cache chip * PMC6 control for Memory chip * PMC7 events on Memory chip * PMC8 enable for Memory chip * * The performance data (PMD) registers appear for * CPU (32), L2 Cache (16), and Memory (28*16) chips contiguously. * There are four events per chip. * * PMD0 P chip, counter 0 * ... * PMD31 P chip, counter 31 * PMD32 C chip, counter 0 * ... 
* PMD47 C chip, counter 15 * PMD48 M chip 0, counter 0 * ... * PMD495 M chip 15, counter 27 */ #ifdef __cplusplus extern "C" { #endif /* PMC counts */ #define PMU_CRAYX2_CPU_PMC_COUNT PFM_CPU_PMC_COUNT #define PMU_CRAYX2_CACHE_PMC_COUNT PFM_CACHE_PMC_COUNT #define PMU_CRAYX2_MEMORY_PMC_COUNT PFM_MEM_PMC_COUNT /* PMC bases */ #define PMU_CRAYX2_CPU_PMC_BASE PFM_CPU_PMC #define PMU_CRAYX2_CACHE_PMC_BASE PFM_CACHE_PMC #define PMU_CRAYX2_MEMORY_PMC_BASE PFM_MEM_PMC /* PMD counts */ #define PMU_CRAYX2_CPU_PMD_COUNT PFM_CPU_PMD_COUNT #define PMU_CRAYX2_CACHE_PMD_COUNT PFM_CACHE_PMD_COUNT #define PMU_CRAYX2_MEMORY_PMD_COUNT PFM_MEM_PMD_COUNT /* PMD bases */ #define PMU_CRAYX2_CPU_PMD_BASE PFM_CPU_PMD #define PMU_CRAYX2_CACHE_PMD_BASE PFM_CACHE_PMD #define PMU_CRAYX2_MEMORY_PMD_BASE PFM_MEM_PMD /* Total number of PMCs and PMDs */ #define PMU_CRAYX2_PMC_COUNT PFM_PMC_COUNT #define PMU_CRAYX2_PMD_COUNT PFM_PMD_COUNT #define PMU_CRAYX2_NUM_COUNTERS PFM_PMD_COUNT /* Counter width (can also be acquired via /sys/kernel/perfmon) */ #define PMU_CRAYX2_COUNTER_WIDTH 63 /* PMU name (can also be acquired via /sys/kernel/perfmon) */ #define PMU_CRAYX2_NAME "Cray X2" #ifdef __cplusplus } #endif /* extern C */ #endif /* __PFMLIB_CRAYX2_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_gen_ia32.h000066400000000000000000000065041502707512200247270ustar00rootroot00000000000000/* * Intel architectural PMU v1, v2, v3 * * Copyright (c) 2006-2007 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_GEN_IA32_H__ #define __PFMLIB_GEN_IA32_H__ #include /* * privilege level mask usage for architected PMU: * * PFM_PLM0 = OS (kernel, hypervisor, ..) * PFM_PLM1 = unused (ignored) * PFM_PLM2 = unused (ignored) * PFM_PLM3 = USR (user level) */ #ifdef __cplusplus extern "C" { #endif /* * upper limit, actual number determined dynamically */ #define PMU_GEN_IA32_MAX_COUNTERS PFMLIB_MAX_PMCS /* * Even though CPUID 0xa returns in eax the actual counter * width, the architecture specifies that writes are limited * to the lower 32 bits. As such, only the lower 32 bits have full * degree of freedom. That is the "usable" counter width.
*/ #define PMU_GEN_IA32_COUNTER_WIDTH 32 typedef union { unsigned long long val; /* complete register value */ struct { unsigned long sel_event_select:8; /* event mask */ unsigned long sel_unit_mask:8; /* unit mask */ unsigned long sel_usr:1; /* user level */ unsigned long sel_os:1; /* system level */ unsigned long sel_edge:1; /* edge detect */ unsigned long sel_pc:1; /* pin control */ unsigned long sel_int:1; /* enable APIC intr */ unsigned long sel_any:1; /* any thread (v3) */ unsigned long sel_en:1; /* enable */ unsigned long sel_inv:1; /* invert counter mask */ unsigned long sel_cnt_mask:8; /* counter mask */ unsigned long sel_res2:32; } perfevtsel; } pfm_gen_ia32_sel_reg_t; typedef struct { unsigned long cnt_mask; /* threshold (cnt_mask) */ unsigned int flags; /* counter specific flag */ } pfmlib_gen_ia32_counter_t; #define PFM_GEN_IA32_SEL_INV 0x1 /* inverse */ #define PFM_GEN_IA32_SEL_EDGE 0x2 /* edge detect */ #define PFM_GEN_IA32_SEL_ANYTHR 0x4 /* measure on any thread (v3 and up) */ /* * model-specific parameters for the library */ typedef struct { pfmlib_gen_ia32_counter_t pfp_gen_ia32_counters[PMU_GEN_IA32_MAX_COUNTERS]; uint64_t reserved[4]; /* for future use */ } pfmlib_gen_ia32_input_param_t; typedef struct { uint64_t reserved[8]; /* for future use */ } pfmlib_gen_ia32_output_param_t; #ifdef __cplusplus /* extern C */ } #endif #endif /* __PFMLIB_GEN_IA32_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_gen_ia64.h000066400000000000000000000047161502707512200247370ustar00rootroot00000000000000/* * Generic IA-64 PMU specific types and definitions * * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P.
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #ifndef __PFMLIB_GEN_IA64_H__ #define __PFMLIB_GEN_IA64_H__ #include #include #if BYTE_ORDER != LITTLE_ENDIAN #error "this file only supports little endian environments" #endif #ifdef __cplusplus extern "C" { #endif #define PMU_GEN_IA64_FIRST_COUNTER 4 /* index of first PMC/PMD counter */ #define PMU_GEN_IA64_NUM_COUNTERS 4 /* total numbers of PMC/PMD pairs used as counting monitors */ #define PMU_GEN_IA64_NUM_PMCS 8 /* total number of PMCS defined */ #define PMU_GEN_IA64_NUM_PMDS 4 /* total number of PMDS defined */ /* * architected PMC register structure */ typedef union { unsigned long pmc_val; /* generic PMC register */ struct { unsigned long pmc_plm:4; /* privilege level mask */ unsigned long pmc_ev:1; /* external visibility */ unsigned long pmc_oi:1; /* overflow interrupt */ unsigned long pmc_pm:1; /* privileged monitor */ unsigned long pmc_ig1:1; /* reserved */ unsigned long pmc_es:8; /* event select */ unsigned long pmc_ig2:48; /* reserved */ } pmc_gen_count_reg; } pfm_gen_ia64_pmc_reg_t; typedef struct { unsigned long pmd_val; /* generic counter value */ } pfm_gen_ia64_pmd_reg_t; #ifdef __cplusplus /* extern C */ } #endif #endif /* __PFMLIB_GEN_IA64_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_gen_mips64.h000066400000000000000000000076251502707512200253200ustar00rootroot00000000000000/* * Generic MIPS64 PMU specific types and definitions * * Contributed by Philip Mucci based on code from * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_GEN_MIPS64_H__ #define __PFMLIB_GEN_MIPS64_H__ #include /* MIPS are bi-endian */ #include /* * privilege level mask usage for MIPS: * * PFM_PLM0 = KERNEL * PFM_PLM1 = SUPERVISOR * PFM_PLM2 = INTERRUPT * PFM_PLM3 = USER */ #ifdef __cplusplus extern "C" { #endif #define PMU_GEN_MIPS64_NUM_COUNTERS 4 /* total numbers of EvtSel/EvtCtr */ #define PMU_GEN_MIPS64_COUNTER_WIDTH 32 /* hardware counter bit width */ /* * This structure provides a detailed way to setup a PMC register. * Once value is loaded, it must be copied (via pmu_reg) to the * perfmon_req_t and passed to the kernel via perfmonctl(). 
* * It needs to be adjusted based on endianness */ #if __BYTE_ORDER == __LITTLE_ENDIAN typedef union { uint64_t val; /* complete register value */ struct { unsigned long sel_exl:1; /* int level */ unsigned long sel_os:1; /* system level */ unsigned long sel_sup:1; /* supervisor level */ unsigned long sel_usr:1; /* user level */ unsigned long sel_int:1; /* enable intr */ unsigned long sel_event_mask:5; /* event mask */ unsigned long sel_res1:22; /* reserved */ unsigned long sel_res2:32; /* reserved */ } perfsel; } pfm_gen_mips64_sel_reg_t; #elif __BYTE_ORDER == __BIG_ENDIAN typedef union { uint64_t val; /* complete register value */ struct { unsigned long sel_res2:32; /* reserved */ unsigned long sel_res1:22; /* reserved */ unsigned long sel_event_mask:5; /* event mask */ unsigned long sel_int:1; /* enable intr */ unsigned long sel_usr:1; /* user level */ unsigned long sel_sup:1; /* supervisor level */ unsigned long sel_os:1; /* system level */ unsigned long sel_exl:1; /* int level */ } perfsel; } pfm_gen_mips64_sel_reg_t; #else #error "cannot determine endianness" #endif typedef union { uint64_t val; /* counter value */ /* counting perfctr register */ struct { unsigned long ctr_count:32; /* 32-bit hardware counter */ } perfctr; } pfm_gen_mips64_ctr_reg_t; typedef struct { unsigned int cnt_mask; /* threshold ([4-255] are reserved) */ unsigned int flags; /* counter specific flag */ } pfmlib_gen_mips64_counter_t; /* * MIPS64 specific parameters for the library */ typedef struct { pfmlib_gen_mips64_counter_t pfp_gen_mips64_counters[PMU_GEN_MIPS64_NUM_COUNTERS]; /* extended counter features */ uint64_t reserved[4]; /* for future use */ } pfmlib_gen_mips64_input_param_t; typedef struct { uint64_t reserved[8]; /* for future use */ } pfmlib_gen_mips64_output_param_t; #ifdef __cplusplus /* extern C */ } #endif #endif /* __PFMLIB_GEN_MIPS64_H__ */
papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_i386_p6.h000066400000000000000000000073341502707512200244400ustar00rootroot00000000000000/* * Intel Pentium II/Pentium Pro/Pentium III/Pentium M PMU specific types and definitions * * Copyright (c) 2005-2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_I386_P6_H__ #define __PFMLIB_I386_P6_H__ #include /* * privilege level mask usage for i386-p6: * * PFM_PLM0 = OS (kernel, hypervisor, ..) 
* PFM_PLM1 = unused (ignored) * PFM_PLM2 = unused (ignored) * PFM_PLM3 = USR (user level) */ #ifdef __cplusplus extern "C" { #endif #define PMU_I386_P6_NUM_COUNTERS 2 /* total number of EvtSel/EvtCtr */ #define PMU_I386_P6_NUM_PERFSEL 2 /* total number of EvtSel defined */ #define PMU_I386_P6_NUM_PERFCTR 2 /* total number of EvtCtr defined */ #define PMU_I386_P6_COUNTER_WIDTH 32 /* hardware counter bit width */ /* * This structure provides a detailed way to setup a PMC register. * Once the value is loaded, it must be copied (via pmu_reg) to the * perfmon_req_t and passed to the kernel via perfmonctl(). */ typedef union { unsigned long val; /* complete register value */ struct { unsigned long sel_event_mask:8; /* event mask */ unsigned long sel_unit_mask:8; /* unit mask */ unsigned long sel_usr:1; /* user level */ unsigned long sel_os:1; /* system level */ unsigned long sel_edge:1; /* edge detect */ unsigned long sel_pc:1; /* pin control */ unsigned long sel_int:1; /* enable APIC intr */ unsigned long sel_res1:1; /* reserved */ unsigned long sel_en:1; /* enable */ unsigned long sel_inv:1; /* invert counter mask */ unsigned long sel_cnt_mask:8; /* counter mask */ } perfsel; } pfm_i386_p6_sel_reg_t; typedef union { uint64_t val; /* counter value */ /* counting perfctr register */ struct { unsigned long ctr_count:32; /* 32-bit hardware counter */ unsigned long ctr_res1:32; /* reserved */ } perfctr; } pfm_i386_p6_ctr_reg_t; typedef enum { PFM_I386_P6_CNT_MASK_0, PFM_I386_P6_CNT_MASK_1, PFM_I386_P6_CNT_MASK_2, PFM_I386_P6_CNT_MASK_3 } pfm_i386_p6_cnt_mask_t; typedef struct { pfm_i386_p6_cnt_mask_t cnt_mask; /* threshold (cnt_mask) */ unsigned int flags; /* counter specific flag */ } pfmlib_i386_p6_counter_t; #define PFM_I386_P6_SEL_INV 0x1 /* inverse */ #define PFM_I386_P6_SEL_EDGE 0x2 /* edge detect */ /* * P6-specific parameters for the library */ typedef struct { pfmlib_i386_p6_counter_t pfp_i386_p6_counters[PMU_I386_P6_NUM_COUNTERS]; /* extended counter features */
uint64_t reserved[4]; /* for future use */ } pfmlib_i386_p6_input_param_t; typedef struct { uint64_t reserved[8]; /* for future use */ } pfmlib_i386_p6_output_param_t; #ifdef __cplusplus /* extern C */ } #endif #endif /* __PFMLIB_I386_P6_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_intel_atom.h000066400000000000000000000063151502707512200254730ustar00rootroot00000000000000/* * Intel Atom : architectural perfmon v3 + PEBS * * Copyright (c) 2008 Google, Inc * Contributed by Stephane Eranian * * Based on pfmlib_intel_atom.h with * Copyright (c) 2006-2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_INTEL_ATOM_H__ #define __PFMLIB_INTEL_ATOM_H__ #include /* * privilege level mask usage for architected PMU: * * PFM_PLM0 = OS (kernel, hypervisor, ..) 
* PFM_PLM1 = unused (ignored) * PFM_PLM2 = unused (ignored) * PFM_PLM3 = USR (user level) */ #ifdef __cplusplus extern "C" { #endif #define PMU_INTEL_ATOM_NUM_COUNTERS 5 /* 2 generic + 3 fixed */ typedef union { unsigned long long val; /* complete register value */ struct { unsigned long sel_event_select:8; /* event mask */ unsigned long sel_unit_mask:8; /* unit mask */ unsigned long sel_usr:1; /* user level */ unsigned long sel_os:1; /* system level */ unsigned long sel_edge:1; /* edge detect */ unsigned long sel_pc:1; /* pin control */ unsigned long sel_int:1; /* enable APIC intr */ unsigned long sel_any:1; /* any thread */ unsigned long sel_en:1; /* enable */ unsigned long sel_inv:1; /* invert counter mask */ unsigned long sel_cnt_mask:8; /* counter mask */ unsigned long sel_res2:32; } perfevtsel; } pfm_intel_atom_sel_reg_t; typedef struct { unsigned long cnt_mask; /* threshold (cnt_mask) */ unsigned int flags; /* counter specific flags */ } pfmlib_intel_atom_counter_t; #define PFM_INTEL_ATOM_SEL_INV 0x1 /* inverse */ #define PFM_INTEL_ATOM_SEL_EDGE 0x2 /* edge detect */ #define PFM_INTEL_ATOM_SEL_ANYTHR 0x4 /* measure on any of 2 threads */ /* * model-specific parameters for the library */ typedef struct { pfmlib_intel_atom_counter_t pfp_intel_atom_counters[PMU_INTEL_ATOM_NUM_COUNTERS]; unsigned int pfp_intel_atom_pebs_used; /* set to 1 to use PEBS */ uint64_t reserved[4]; /* for future use */ } pfmlib_intel_atom_input_param_t; #ifdef __cplusplus /* extern C */ } #endif /* * Atom-specific interface */ extern int pfm_intel_atom_has_pebs(pfmlib_event_t *e); #endif /* __PFMLIB_INTEL_ATOM_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_intel_nhm.h000066400000000000000000000131571502707512200253170ustar00rootroot00000000000000/* * Intel Nehalem PMU * * Copyright (c) 2008 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files
(the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_NHM_H__ #define __PFMLIB_NHM_H__ #include /* * privilege level mask usage for Intel Core * * PFM_PLM0 = OS (kernel, hypervisor, ..) 
* PFM_PLM1 = unused (ignored) * PFM_PLM2 = unused (ignored) * PFM_PLM3 = USR (user level) */ #ifdef __cplusplus extern "C" { #endif /* * total number of counters: * - 4 generic core * - 3 fixed core * - 1 uncore fixed * - 8 uncore generic */ #define PMU_NHM_NUM_COUNTERS 16 typedef union { unsigned long long val; /* complete register value */ struct { unsigned long sel_event:8; /* event mask */ unsigned long sel_umask:8; /* unit mask */ unsigned long sel_usr:1; /* user level */ unsigned long sel_os:1; /* system level */ unsigned long sel_edge:1; /* edge detect */ unsigned long sel_pc:1; /* pin control */ unsigned long sel_int:1; /* enable APIC intr */ unsigned long sel_anythr:1; /* measure any thread */ unsigned long sel_en:1; /* enable */ unsigned long sel_inv:1; /* invert counter mask */ unsigned long sel_cnt_mask:8; /* counter mask */ unsigned long sel_res2:32; } perfevtsel; struct { unsigned long usel_event:8; /* event select */ unsigned long usel_umask:8; /* event unit mask */ unsigned long usel_res1:1; /* reserved */ unsigned long usel_occ:1; /* occupancy reset */ unsigned long usel_edge:1; /* edge detection */ unsigned long usel_res2:1; /* reserved */ unsigned long usel_int:1; /* PMI enable */ unsigned long usel_res3:1; /* reserved */ unsigned long usel_en:1; /* enable */ unsigned long usel_inv:1; /* invert */ unsigned long usel_cnt_mask:8; /* counter mask */ unsigned long usel_res4:32; /* reserved */ } unc_perfevtsel; struct { unsigned long cpl_eq0:1; /* filter out branches at pl0 */ unsigned long cpl_neq0:1; /* filter out branches at pl1-pl3 */ unsigned long jcc:1; /* filter out conditional branches */ unsigned long near_rel_call:1; /* filter out near relative calls */ unsigned long near_ind_call:1; /* filter out near indirect calls */ unsigned long near_ret:1; /* filter out near returns */ unsigned long near_ind_jmp:1; /* filter out near unconditional jmp/calls */ unsigned long near_rel_jmp:1; /* filter out near unconditional relative jmp */ unsigned long
far_branch:1; /* filter out far branches */ unsigned long reserved1:23; /* reserved */ unsigned long reserved2:32; /* reserved */ } lbr_select; } pfm_nhm_sel_reg_t; typedef struct { unsigned long cnt_mask; /* counter mask (occurrences) */ unsigned int flags; /* counter specific flag */ } pfmlib_nhm_counter_t; /* * flags for pfmlib_nhm_counter_t */ #define PFM_NHM_SEL_INV 0x1 /* inverse */ #define PFM_NHM_SEL_EDGE 0x2 /* edge detect */ #define PFM_NHM_SEL_ANYTHR 0x4 /* any thread (core only) */ #define PFM_NHM_SEL_OCC_RST 0x8 /* reset occupancy (uncore only) */ typedef struct { unsigned int lbr_used; /* set to 1 if LBR is used */ unsigned int lbr_plm; /* priv level PLM0 or PLM3 */ unsigned int lbr_filter;/* filters */ } pfmlib_nhm_lbr_t; /* * lbr_filter: filter out branches * refer to IA32 SDM vol3b section 18.6.2 */ #define PFM_NHM_LBR_JCC 0x4 /* do not capture conditional branches */ #define PFM_NHM_LBR_NEAR_REL_CALL 0x8 /* do not capture near calls */ #define PFM_NHM_LBR_NEAR_IND_CALL 0x10 /* do not capture indirect calls */ #define PFM_NHM_LBR_NEAR_RET 0x20 /* do not capture near returns */ #define PFM_NHM_LBR_NEAR_IND_JMP 0x40 /* do not capture indirect jumps */ #define PFM_NHM_LBR_NEAR_REL_JMP 0x80 /* do not capture near relative jumps */ #define PFM_NHM_LBR_FAR_BRANCH 0x100 /* do not capture far branches */ #define PFM_NHM_LBR_ALL 0x1fc /* filter out all branches */ /* * PEBS input parameters */ typedef struct { unsigned int pebs_used; /* set to 1 if PEBS is used */ unsigned int ld_lat_thres; /* load latency threshold (cycles) */ } pfmlib_nhm_pebs_t; /* * model-specific input parameter to pfm_dispatch_events() */ typedef struct { pfmlib_nhm_counter_t pfp_nhm_counters[PMU_NHM_NUM_COUNTERS]; pfmlib_nhm_pebs_t pfp_nhm_pebs; /* PEBS settings */ pfmlib_nhm_lbr_t pfp_nhm_lbr; /* LBR settings */ uint64_t reserved[4]; /* for future use */ } pfmlib_nhm_input_param_t; /* * no pfmlib_nhm_output_param_t defined */ /* * Model-specific interface * can be called directly */
extern int pfm_nhm_is_pebs(pfmlib_event_t *e); extern int pfm_nhm_is_uncore(pfmlib_event_t *e); extern int pfm_nhm_data_src_desc(unsigned int val, char **desc); #ifdef __cplusplus /* extern C */ } #endif #endif /* __PFMLIB_NHM_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_itanium.h000066400000000000000000000316601502707512200250070ustar00rootroot00000000000000/* * Itanium PMU specific types and definitions * * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #ifndef __PFMLIB_ITANIUM_H__ #define __PFMLIB_ITANIUM_H__ #include #include #if BYTE_ORDER != LITTLE_ENDIAN #error "this file only supports little endian environments" #endif #ifdef __cplusplus extern "C" { #endif #define PMU_ITA_FIRST_COUNTER 4 /* index of first PMC/PMD counter */ #define PMU_ITA_NUM_COUNTERS 4 /* total numbers of PMC/PMD pairs used as counting monitors */ #define PMU_ITA_NUM_PMCS 14 /* total number of PMCS defined */ #define PMU_ITA_NUM_PMDS 18 /* total number of PMDS defined */ #define PMU_ITA_NUM_BTB 8 /* total number of PMDS in BTB */ #define PMU_ITA_COUNTER_WIDTH 32 /* hardware counter bit width */ /* * This structure provides a detailed way to setup a PMC register. */ typedef union { unsigned long pmc_val; /* complete register value */ /* This is the Itanium-specific PMC layout for counter config */ struct { unsigned long pmc_plm:4; /* privilege level mask */ unsigned long pmc_ev:1; /* external visibility */ unsigned long pmc_oi:1; /* overflow interrupt */ unsigned long pmc_pm:1; /* privileged monitor */ unsigned long pmc_ig1:1; /* reserved */ unsigned long pmc_es:7; /* event select */ unsigned long pmc_ig2:1; /* reserved */ unsigned long pmc_umask:4; /* unit mask */ unsigned long pmc_thres:3; /* threshold */ unsigned long pmc_ig3:1; /* reserved (missing from table on p6-17) */ unsigned long pmc_ism:2; /* instruction set mask */ unsigned long pmc_ig4:38; /* reserved */ } pmc_ita_count_reg; /* Opcode matcher */ struct { unsigned long ignored1:3; unsigned long mask:27; /* mask encoding bits {40:27}{12:0} */ unsigned long ignored2:3; unsigned long match:27; /* match encoding bits {40:27}{12:0} */ unsigned long b:1; /* B-syllable */ unsigned long f:1; /* F-syllable */ unsigned long i:1; /* I-syllable */ unsigned long m:1; /* M-syllable */ } pmc8_9_ita_reg; /* Instruction Event Address Registers */ struct { unsigned long iear_plm:4; /* privilege level mask */ unsigned long iear_ig1:2; /* reserved */ unsigned long iear_pm:1; /* privileged 
monitor */ unsigned long iear_tlb:1; /* cache/tlb mode */ unsigned long iear_ig2:8; /* reserved */ unsigned long iear_umask:4; /* unit mask */ unsigned long iear_ig3:4; /* reserved */ unsigned long iear_ism:2; /* instruction set */ unsigned long iear_ig4:38; /* reserved */ } pmc10_ita_reg; /* Data Event Address Registers */ struct { unsigned long dear_plm:4; /* privilege level mask */ unsigned long dear_ig1:2; /* reserved */ unsigned long dear_pm:1; /* privileged monitor */ unsigned long dear_tlb:1; /* cache/tlb mode */ unsigned long dear_ig2:8; /* reserved */ unsigned long dear_umask:4; /* unit mask */ unsigned long dear_ig3:4; /* reserved */ unsigned long dear_ism:2; /* instruction set */ unsigned long dear_ig4:2; /* reserved */ unsigned long dear_pt:1; /* pass tags */ unsigned long dear_ig5:35; /* reserved */ } pmc11_ita_reg; /* Branch Trace Buffer registers */ struct { unsigned long btbc_plm:4; /* privilege level */ unsigned long btbc_ig1:2; unsigned long btbc_pm:1; /* privileged monitor */ unsigned long btbc_tar:1; /* target address register */ unsigned long btbc_tm:2; /* taken mask */ unsigned long btbc_ptm:2; /* predicted taken address mask */ unsigned long btbc_ppm:2; /* predicted predicate mask */ unsigned long btbc_bpt:1; /* branch prediction table */ unsigned long btbc_bac:1; /* branch address calculator */ unsigned long btbc_ig2:48; } pmc12_ita_reg; struct { unsigned long irange_ta:1; /* tag all bit */ unsigned long irange_ig:63; } pmc13_ita_reg; } pfm_ita_pmc_reg_t; typedef union { unsigned long pmd_val; /* counter value */ /* counting pmd register */ struct { unsigned long pmd_count:32; /* 32-bit hardware counter */ unsigned long pmd_sxt32:32; /* sign extension of bit 32 */ } pmd_ita_counter_reg; struct { unsigned long iear_v:1; /* valid bit */ unsigned long iear_tlb:1; /* tlb miss bit */ unsigned long iear_ig1:3; /* reserved */ unsigned long iear_icla:59; /* instruction cache line address {60:51} sxt {50}*/ } pmd0_ita_reg; struct { unsigned long 
iear_lat:12; /* latency */ unsigned long iear_ig1:52; /* reserved */ } pmd1_ita_reg; struct { unsigned long dear_daddr; /* data address */ } pmd2_ita_reg; struct { unsigned long dear_latency:12; /* latency */ unsigned long dear_ig1:50; /* reserved */ unsigned long dear_level:2; /* level */ } pmd3_ita_reg; struct { unsigned long btb_b:1; /* branch bit */ unsigned long btb_mp:1; /* mispredict bit */ unsigned long btb_slot:2; /* which slot, 3=not taken branch */ unsigned long btb_addr:60; /* b=1, bundle address, b=0 target address */ } pmd8_15_ita_reg; struct { unsigned long btbi_bbi:3; /* branch buffer index */ unsigned long btbi_full:1; /* full bit (sticky) */ unsigned long btbi_ignored:60; } pmd16_ita_reg; struct { unsigned long dear_vl:1; /* valid bit */ unsigned long dear_ig1:1; /* reserved */ unsigned long dear_slot:2; /* slot number */ unsigned long dear_iaddr:60; /* instruction address */ } pmd17_ita_reg; } pfm_ita_pmd_reg_t; /* * type definition for Itanium instruction set support */ typedef enum { PFMLIB_ITA_ISM_BOTH=0, /* IA-32 and IA-64 (default) */ PFMLIB_ITA_ISM_IA32=1, /* IA-32 only */ PFMLIB_ITA_ISM_IA64=2 /* IA-64 only */ } pfmlib_ita_ism_t; typedef struct { unsigned int flags; /* counter specific flags */ unsigned int thres; /* per event threshold */ pfmlib_ita_ism_t ism; /* per event instruction set */ } pfmlib_ita_counter_t; /* * counter specific flags */ #define PFMLIB_ITA_FL_EVT_NO_QUALCHECK 0x1 /* don't check qualifier constraints */ typedef struct { unsigned char opcm_used; /* set to 1 if this opcode matcher is used */ unsigned long pmc_val; /* value of opcode matcher for PMC8 */ } pfmlib_ita_opcm_t; /* * * The BTB can be configured via 4 different methods: * * - BRANCH_EVENT is in the event list, pfp_ita_btb.btb_used == 0: * The BTB will be configured (PMC12) to record all branches AND a counting * monitor will be setup to count BRANCH_EVENT. 
* * - BRANCH_EVENT is in the event list, pfp_ita_btb.btb_used == 1: * The BTB will be configured (PMC12) according to information in pfp_ita_btb AND * a counter will be setup to count BRANCH_EVENT. * * - BRANCH_EVENT is NOT in the event list, pfp_ita_btb.btb_used == 0: * Nothing is programmed * * - BRANCH_EVENT is NOT in the event list, pfp_ita_btb.btb_used == 1: * The BTB will be configured (PMC12) according to information in pfp_ita_btb. * This is the free running BTB mode. */ typedef struct { unsigned char btb_used; /* set to 1 if the BTB is used */ unsigned char btb_tar; unsigned char btb_tac; unsigned char btb_bac; unsigned char btb_tm; unsigned char btb_ptm; unsigned char btb_ppm; unsigned int btb_plm; /* BTB privilege level mask */ } pfmlib_ita_btb_t; /* * There are four ways to configure EAR: * * - an EAR event is in the event list AND pfp_ita_ear.ear_used = 0: * The EAR will be programmed (PMC10 or PMC11) based on the information encoded in the * event (umask, cache, tlb). A counting monitor will be programmed to * count DATA_EAR_EVENTS or INSTRUCTION_EAR_EVENTS depending on the type of EAR. * * - an EAR event is in the event list AND pfp_ita_ear.ear_used = 1: * The EAR will be programmed (PMC10 or PMC11) according to the information in the * pfp_ita_ear structure because it contains more detailed information * (such as priv level and instruction set). A counting monitor will be programmed * to count DATA_EAR_EVENTS or INSTRUCTION_EAR_EVENTS depending on the type of EAR. * * - no EAR event is in the event list AND pfp_ita_ear.ear_used = 0: * Nothing is programmed. * * - no EAR event is in the event list AND pfp_ita_ear.ear_used = 1: * The EAR will be programmed (PMC10 or PMC11) according to the information in the * pfp_ita_ear structure. 
This is the free running mode for EAR */ typedef enum { PFMLIB_ITA_EAR_CACHE_MODE=0, /* Cache mode : I-EAR and D-EAR */ PFMLIB_ITA_EAR_TLB_MODE =1, /* TLB mode : I-EAR and D-EAR */ } pfmlib_ita_ear_mode_t; typedef struct { unsigned char ear_used; /* when set will force definition of PMC[10] */ pfmlib_ita_ear_mode_t ear_mode; /* EAR mode */ pfmlib_ita_ism_t ear_ism; /* instruction set */ unsigned int ear_plm; /* IEAR privilege level mask */ unsigned long ear_umask; /* umask value for PMC10 */ } pfmlib_ita_ear_t; /* * describes one range. rr_plm is ignored for data ranges * a range is interpreted as unused (not defined) when rr_start = rr_end = 0. * if rr_plm is not set it will use the default settings set in the generic * library param structure. */ typedef struct { unsigned int rr_flags; /* currently unused */ unsigned int rr_plm; /* privilege level (ignored for data ranges) */ unsigned long rr_start; /* start address */ unsigned long rr_end; /* end address (not included) */ } pfmlib_ita_input_rr_desc_t; typedef struct { unsigned long rr_soff; /* output: start offset from actual start */ unsigned long rr_eoff; /* output: end offset from actual end */ } pfmlib_ita_output_rr_desc_t; /* * rr_used must be set to true for the library to configure the debug registers. 
* If using less than 4 intervals, must mark the end with entry: rr_limits[x].rr_start = rr_limits[x].rr_end = 0 */ typedef struct { unsigned char rr_used; /* set if address range restriction is used */ unsigned int rr_flags; /* set of flags for all ranges */ unsigned int rr_nbr_used; /* how many registers were used (output) */ pfmlib_ita_input_rr_desc_t rr_limits[4]; /* at most 4 distinct intervals */ } pfmlib_ita_input_rr_t; typedef struct { unsigned int rr_nbr_used; /* how many registers were used (output) */ pfmlib_ita_output_rr_desc_t rr_infos[4]; /* at most 4 distinct intervals */ pfmlib_reg_t rr_br[8]; /* array of debug reg requests to configure */ } pfmlib_ita_output_rr_t; /* * Itanium specific parameters for the library */ typedef struct { pfmlib_ita_counter_t pfp_ita_counters[PMU_ITA_NUM_COUNTERS]; /* extended counter features */ unsigned long pfp_ita_flags; /* Itanium specific flags */ pfmlib_ita_opcm_t pfp_ita_pmc8; /* PMC8 (opcode matcher) configuration */ pfmlib_ita_opcm_t pfp_ita_pmc9; /* PMC9 (opcode matcher) configuration */ pfmlib_ita_ear_t pfp_ita_iear; /* IEAR configuration */ pfmlib_ita_ear_t pfp_ita_dear; /* DEAR configuration */ pfmlib_ita_btb_t pfp_ita_btb; /* BTB configuration */ pfmlib_ita_input_rr_t pfp_ita_drange; /* data range restrictions */ pfmlib_ita_input_rr_t pfp_ita_irange; /* code range restrictions */ unsigned long reserved[1]; /* for future use */ } pfmlib_ita_input_param_t; typedef struct { pfmlib_ita_output_rr_t pfp_ita_drange; /* data range restrictions */ pfmlib_ita_output_rr_t pfp_ita_irange; /* code range restrictions */ unsigned long reserved[6]; /* for future use */ } pfmlib_ita_output_param_t; extern int pfm_ita_is_ear(unsigned int i); extern int pfm_ita_is_dear(unsigned int i); extern int pfm_ita_is_dear_tlb(unsigned int i); extern int pfm_ita_is_dear_cache(unsigned int i); extern int pfm_ita_is_iear(unsigned int i); extern int pfm_ita_is_iear_tlb(unsigned int i); extern int pfm_ita_is_iear_cache(unsigned int i); 
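The four BTB programming cases described in the comment above pfmlib_ita_btb_t (the combinations of BRANCH_EVENT being in the event list and pfp_ita_btb.btb_used) can be sketched as a small decision table. This is a minimal illustrative sketch only — the enum and the helper name are hypothetical and not part of pfmlib:

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of pfmlib) summarizing the four BTB
 * configuration cases documented above for pfmlib_ita_btb_t.
 */
enum btb_action {
    BTB_NOTHING,            /* nothing is programmed */
    BTB_ALL_BRANCHES_COUNT, /* PMC12 records all branches + counter on BRANCH_EVENT */
    BTB_CUSTOM_COUNT,       /* PMC12 from pfp_ita_btb + counter on BRANCH_EVENT */
    BTB_FREE_RUNNING        /* PMC12 from pfp_ita_btb only (free running BTB mode) */
};

static enum btb_action
btb_configuration(int branch_event_in_list, unsigned char btb_used)
{
    if (branch_event_in_list)
        return btb_used ? BTB_CUSTOM_COUNT : BTB_ALL_BRANCHES_COUNT;
    return btb_used ? BTB_FREE_RUNNING : BTB_NOTHING;
}
```

The same two-flag decision structure applies to the EAR cases described above (EAR event in list vs. ear_used), with the free running mode likewise selected when only the `*_used` flag is set.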
extern int pfm_ita_is_btb(unsigned int i);
extern int pfm_ita_support_opcm(unsigned int i);
extern int pfm_ita_support_iarr(unsigned int i);
extern int pfm_ita_support_darr(unsigned int i);
extern int pfm_ita_get_ear_mode(unsigned int i, pfmlib_ita_ear_mode_t *m);
extern int pfm_ita_get_event_maxincr(unsigned int i, unsigned int *maxincr);
extern int pfm_ita_get_event_umask(unsigned int i, unsigned long *umask);

#ifdef __cplusplus /* extern C */
}
#endif

#endif /* __PFMLIB_ITANIUM_H__ */

papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_itanium2.h

/*
 * Itanium 2 PMU specific types and definitions
 *
 * Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/ #ifndef __PFMLIB_ITANIUM2_H__ #define __PFMLIB_ITANIUM2_H__ #include #include #if BYTE_ORDER != LITTLE_ENDIAN #error "this file only supports little endian environments" #endif #ifdef __cplusplus extern "C" { #endif #define PMU_ITA2_FIRST_COUNTER 4 /* index of first PMC/PMD counter */ #define PMU_ITA2_NUM_COUNTERS 4 /* total numbers of PMC/PMD pairs used as counting monitors */ #define PMU_ITA2_NUM_PMCS 16 /* total number of PMCS defined */ #define PMU_ITA2_NUM_PMDS 18 /* total number of PMDS defined */ #define PMU_ITA2_NUM_BTB 8 /* total number of PMDS in BTB */ #define PMU_ITA2_COUNTER_WIDTH 47 /* hardware counter bit width */ /* * This structure provides a detailed way to setup a PMC register. * Once value is loaded, it must be copied (via pmu_reg) to the * perfmon_req_t and passed to the kernel via perfmonctl(). */ typedef union { unsigned long pmc_val; /* complete register value */ /* This is the Itanium2-specific PMC layout for counter config */ struct { unsigned long pmc_plm:4; /* privilege level mask */ unsigned long pmc_ev:1; /* external visibility */ unsigned long pmc_oi:1; /* overflow interrupt */ unsigned long pmc_pm:1; /* privileged monitor */ unsigned long pmc_ig1:1; /* reserved */ unsigned long pmc_es:8; /* event select */ unsigned long pmc_umask:4; /* unit mask */ unsigned long pmc_thres:3; /* threshold */ unsigned long pmc_enable:1; /* pmc4 only: power enable bit */ unsigned long pmc_ism:2; /* instruction set mask */ unsigned long pmc_ig2:38; /* reserved */ } pmc_ita2_counter_reg; /* opcode matchers */ struct { unsigned long opcm_ig_ad:1; /* ignore instruction address range checking */ unsigned long opcm_inv:1; /* invert range check */ unsigned long opcm_bit2:1; /* must be 1 */ unsigned long opcm_mask:27; /* mask encoding bits {41:27}{12:0} */ unsigned long opcm_ig1:3; /* reserved */ unsigned long opcm_match:27; /* match encoding bits {41:27}{12:0} */ unsigned long opcm_b:1; /* B-syllable */ unsigned long opcm_f:1; /* F-syllable */ unsigned long 
opcm_i:1;	/* I-syllable */
		unsigned long opcm_m:1;		/* M-syllable */
	} pmc8_9_ita2_reg;

	/*
	 * instruction event address register configuration
	 *
	 * The register has two layouts depending on the value of the ct field.
	 * In cache mode (ct=1x):
	 *	- ct is 1 bit, umask is 8 bits
	 * In TLB mode (ct=00):
	 *	- ct is 2 bits, umask is 7 bits
	 * ct=11 <=> cache mode using a latency filter with the eighth bit set
	 * ct=01 => nothing monitored
	 *
	 * The ct=01 value is the only reason why we cannot fix the layout
	 * to ct 1 bit and umask 8 bits. Even though in TLB mode, only 6 bits
	 * are effectively used for the umask, if the user inadvertently uses
	 * a umask with the most significant bit set, it would be equivalent
	 * to no monitoring.
	 */
	struct {
		unsigned long iear_plm:4;	/* privilege level mask */
		unsigned long iear_pm:1;	/* privileged monitor */
		unsigned long iear_umask:8;	/* event unit mask: 7 bits in TLB mode, 8 bits in cache mode */
		unsigned long iear_ct:1;	/* cache tlb bit13: 0 for TLB mode, 1 for cache mode */
		unsigned long iear_ism:2;	/* instruction set */
		unsigned long iear_ig4:48;	/* reserved */
	} pmc10_ita2_cache_reg;

	struct {
		unsigned long iear_plm:4;	/* privilege level mask */
		unsigned long iear_pm:1;	/* privileged monitor */
		unsigned long iear_umask:7;	/* event unit mask: 7 bits in TLB mode, 8 bits in cache mode */
		unsigned long iear_ct:2;	/* cache tlb bit13: 0 for TLB mode, 1 for cache mode */
		unsigned long iear_ism:2;	/* instruction set */
		unsigned long iear_ig4:48;	/* reserved */
	} pmc10_ita2_tlb_reg;

	/* data event address register configuration */
	struct {
		unsigned long dear_plm:4;	/* privilege level mask */
		unsigned long dear_ig1:2;	/* reserved */
		unsigned long dear_pm:1;	/* privileged monitor */
		unsigned long dear_mode:2;	/* mode */
		unsigned long dear_ig2:7;	/* reserved */
		unsigned long dear_umask:4;	/* unit mask */
		unsigned long dear_ig3:4;	/* reserved */
		unsigned long dear_ism:2;	/* instruction set */
		unsigned long dear_ig4:38;	/* reserved */
	} pmc11_ita2_reg;

	/* branch trace
buffer configuration register */ struct { unsigned long btbc_plm:4; /* privilege level */ unsigned long btbc_ig1:2; unsigned long btbc_pm:1; /* privileged monitor */ unsigned long btbc_ds:1; /* data selector */ unsigned long btbc_tm:2; /* taken mask */ unsigned long btbc_ptm:2; /* predicted taken address mask */ unsigned long btbc_ppm:2; /* predicted predicate mask */ unsigned long btbc_brt:2; /* branch type mask */ unsigned long btbc_ig2:48; } pmc12_ita2_reg; /* data address range configuration register */ struct { unsigned long darc_ig1:3; unsigned long darc_cfg_dbrp0:2; /* constraint on dbr0 */ unsigned long darc_ig2:6; unsigned long darc_cfg_dbrp1:2; /* constraint on dbr1 */ unsigned long darc_ig3:6; unsigned long darc_cfg_dbrp2:2; /* constraint on dbr2 */ unsigned long darc_ig4:6; unsigned long darc_cfg_dbrp3:2; /* constraint on dbr3 */ unsigned long darc_ig5:16; unsigned long darc_ena_dbrp0:1; /* enable constraint dbr0 */ unsigned long darc_ena_dbrp1:1; /* enable constraint dbr1 */ unsigned long darc_ena_dbrp2:1; /* enable constraint dbr2 */ unsigned long darc_ena_dbrp3:1; /* enable constraint dbr3 */ unsigned long darc_ig6:15; } pmc13_ita2_reg; /* instruction address range configuration register */ struct { unsigned long iarc_ig1:1; unsigned long iarc_ibrp0:1; /* constrained by ibr0 */ unsigned long iarc_ig2:2; unsigned long iarc_ibrp1:1; /* constrained by ibr1 */ unsigned long iarc_ig3:2; unsigned long iarc_ibrp2:1; /* constrained by ibr2 */ unsigned long iarc_ig4:2; unsigned long iarc_ibrp3:1; /* constrained by ibr3 */ unsigned long iarc_ig5:2; unsigned long iarc_fine:1; /* fine mode */ unsigned long iarc_ig6:50; } pmc14_ita2_reg; /* opcode matcher configuration register */ struct { unsigned long opcmc_ibrp0_pmc8:1; unsigned long opcmc_ibrp1_pmc9:1; unsigned long opcmc_ibrp2_pmc8:1; unsigned long opcmc_ibrp3_pmc9:1; unsigned long opcmc_ig1:60; } pmc15_ita2_reg; } pfm_ita2_pmc_reg_t; typedef union { unsigned long pmd_val; /* counter value */ /* counting pmd 
register */ struct { unsigned long pmd_count:47; /* 47-bit hardware counter */ unsigned long pmd_sxt47:17; /* sign extension of bit 46 */ } pmd_ita2_counter_reg; /* instruction event address register: data address register */ struct { unsigned long iear_stat:2; /* status bit */ unsigned long iear_ig1:3; unsigned long iear_iaddr:59; /* instruction cache line address {60:51} sxt {50}*/ } pmd0_ita2_reg; /* instruction event address register: data address register */ struct { unsigned long iear_latency:12; /* latency */ unsigned long iear_overflow:1; /* latency overflow */ unsigned long iear_ig1:51; /* reserved */ } pmd1_ita2_reg; /* data event address register: data address register */ struct { unsigned long dear_daddr; /* data address */ } pmd2_ita2_reg; /* data event address register: data address register */ struct { unsigned long dear_latency:13; /* latency */ unsigned long dear_overflow:1; /* overflow */ unsigned long dear_stat:2; /* status */ unsigned long dear_ig1:48; /* ignored */ } pmd3_ita2_reg; /* branch trace buffer data register when pmc12.ds == 0 */ struct { unsigned long btb_b:1; /* branch bit */ unsigned long btb_mp:1; /* mispredict bit */ unsigned long btb_slot:2; /* which slot, 3=not taken branch */ unsigned long btb_addr:60; /* bundle address(b=1), target address(b=0) */ } pmd8_15_ita2_reg; /* branch trace buffer data register when pmc12.ds == 1 */ struct { unsigned long btb_b:1; /* branch bit */ unsigned long btb_mp:1; /* mispredict bit */ unsigned long btb_slot:2; /* which slot, 3=not taken branch */ unsigned long btb_loaddr:37; /* b=1, bundle address, b=0 target address */ unsigned long btb_pred:20; /* low 20bits of L1IBR */ unsigned long btb_hiaddr:3; /* hi 3bits of bundle address(b=1) or target address (b=0)*/ } pmd8_15_ds_ita2_reg; /* branch trace buffer index register */ struct { unsigned long btbi_bbi:3; /* next entry index */ unsigned long btbi_full:1; /* full bit (sticky) */ unsigned long btbi_pmd8ext_b1:1; /* pmd8 ext */ unsigned long 
btbi_pmd8ext_bruflush:1; /* pmd8 ext */ unsigned long btbi_pmd8ext_ig:2; /* pmd8 ext */ unsigned long btbi_pmd9ext_b1:1; /* pmd9 ext */ unsigned long btbi_pmd9ext_bruflush:1; /* pmd9 ext */ unsigned long btbi_pmd9ext_ig:2; /* pmd9 ext */ unsigned long btbi_pmd10ext_b1:1; /* pmd10 ext */ unsigned long btbi_pmd10ext_bruflush:1; /* pmd10 ext */ unsigned long btbi_pmd10ext_ig:2; /* pmd10 ext */ unsigned long btbi_pmd11ext_b1:1; /* pmd11 ext */ unsigned long btbi_pmd11ext_bruflush:1; /* pmd11 ext */ unsigned long btbi_pmd11ext_ig:2; /* pmd11 ext */ unsigned long btbi_pmd12ext_b1:1; /* pmd12 ext */ unsigned long btbi_pmd12ext_bruflush:1; /* pmd12 ext */ unsigned long btbi_pmd12ext_ig:2; /* pmd12 ext */ unsigned long btbi_pmd13ext_b1:1; /* pmd13 ext */ unsigned long btbi_pmd13ext_bruflush:1; /* pmd13 ext */ unsigned long btbi_pmd13ext_ig:2; /* pmd13 ext */ unsigned long btbi_pmd14ext_b1:1; /* pmd14 ext */ unsigned long btbi_pmd14ext_bruflush:1; /* pmd14 ext */ unsigned long btbi_pmd14ext_ig:2; /* pmd14 ext */ unsigned long btbi_pmd15ext_b1:1; /* pmd15 ext */ unsigned long btbi_pmd15ext_bruflush:1; /* pmd15 ext */ unsigned long btbi_pmd15ext_ig:2; /* pmd15 ext */ unsigned long btbi_ignored:28; } pmd16_ita2_reg; /* data event address register: data address register */ struct { unsigned long dear_slot:2; /* slot */ unsigned long dear_bn:1; /* bundle bit (if 1 add 16 to address) */ unsigned long dear_vl:1; /* valid */ unsigned long dear_iaddr:60; /* instruction address (2-bundle window)*/ } pmd17_ita2_reg; } pfm_ita2_pmd_reg_t; /* * type definition for Itanium 2 instruction set support */ typedef enum { PFMLIB_ITA2_ISM_BOTH=0, /* IA-32 and IA-64 (default) */ PFMLIB_ITA2_ISM_IA32=1, /* IA-32 only */ PFMLIB_ITA2_ISM_IA64=2 /* IA-64 only */ } pfmlib_ita2_ism_t; typedef struct { unsigned int flags; /* counter specific flags */ unsigned int thres; /* per event threshold */ pfmlib_ita2_ism_t ism; /* per event instruction set */ } pfmlib_ita2_counter_t; /* * counter specific flags 
*/ #define PFMLIB_ITA2_FL_EVT_NO_QUALCHECK 0x1 /* don't check qualifier constraints */ typedef struct { unsigned char opcm_used; /* set to 1 if this opcode matcher is used */ unsigned long pmc_val; /* full opcode mask (41bits) */ } pfmlib_ita2_opcm_t; /* * * The BTB can be configured via 4 different methods: * * - BRANCH_EVENT is in the event list, pfp_ita2_btb.btb_used == 0: * The BTB will be configured (PMC12) to record all branches AND a counting * monitor will be setup to count BRANCH_EVENT. * * - BRANCH_EVENT is in the event list, pfp_ita2_btb.btb_used == 1: * The BTB will be configured (PMC12) according to information in pfp_ita2_btb AND * a counter will be setup to count BRANCH_EVENT. * * - BRANCH_EVENT is NOT in the event list, pfp_ita2_btb.btb_used == 0: * Nothing is programmed * * - BRANCH_EVENT is NOT in the event list, pfp_ita2_btb.btb_used == 1: * The BTB will be configured (PMC12) according to information in pfp_ita2_btb. * This is the free running BTB mode. */ typedef struct { unsigned char btb_used; /* set to 1 if the BTB is used */ unsigned char btb_ds; /* data selector */ unsigned char btb_tm; /* taken mask */ unsigned char btb_ptm; /* predicted target mask */ unsigned char btb_ppm; /* predicted predicate mask */ unsigned char btb_brt; /* branch type mask */ unsigned int btb_plm; /* BTB privilege level mask */ } pfmlib_ita2_btb_t; /* * There are four ways to configure EAR: * * - an EAR event is in the event list AND pfp_ita2_?ear.ear_used = 0: * The EAR will be programmed (PMC10 or PMC11) based on the information encoded in the * event (umask, cache, tlb,alat). A counting monitor will be programmed to * count DATA_EAR_EVENTS or L1I_EAR_EVENTS depending on the type of EAR. * * - an EAR event is in the event list AND pfp_ita2_?ear.ear_used = 1: * The EAR will be programmed (PMC10 or PMC11) according to the information in the * pfp_ita2_?ear structure because it contains more detailed information * (such as priv level and instruction set). 
A counting monitor will be programmed * to count DATA_EAR_EVENTS or L1I_EAR_EVENTS depending on the type of EAR. * * - no EAR event is in the event list AND pfp_ita2_?ear.ear_used = 0: * Nothing is programmed. * * - no EAR event is in the event list AND pfp_ita2_?ear.ear_used = 1: * The EAR will be programmed (PMC10 or PMC11) according to the information in the * pfp_ita2_?ear structure. This is the free running mode for EAR */ typedef enum { PFMLIB_ITA2_EAR_CACHE_MODE= 0, /* Cache mode : I-EAR and D-EAR */ PFMLIB_ITA2_EAR_TLB_MODE = 1, /* TLB mode : I-EAR and D-EAR */ PFMLIB_ITA2_EAR_ALAT_MODE = 2 /* ALAT mode : D-EAR only */ } pfmlib_ita2_ear_mode_t; typedef struct { unsigned char ear_used; /* when set will force definition of PMC[10] */ pfmlib_ita2_ear_mode_t ear_mode; /* EAR mode */ pfmlib_ita2_ism_t ear_ism; /* instruction set */ unsigned int ear_plm; /* IEAR privilege level mask */ unsigned long ear_umask; /* umask value for PMC10 */ } pfmlib_ita2_ear_t; /* * describes one range. rr_plm is ignored for data ranges * a range is interpreted as unused (not defined) when rr_start = rr_end = 0. * if rr_plm is not set it will use the default settings set in the generic * library param structure. */ typedef struct { unsigned int rr_plm; /* privilege level (ignored for data ranges) */ unsigned long rr_start; /* start address */ unsigned long rr_end; /* end address (not included) */ } pfmlib_ita2_input_rr_desc_t; typedef struct { unsigned long rr_soff; /* start offset from actual start */ unsigned long rr_eoff; /* end offset from actual end */ } pfmlib_ita2_output_rr_desc_t; /* * rr_used must be set to true for the library to configure the debug registers. * rr_inv only applies when the rr_limits table contains ONLY 1 range. 
* * If using less than 4 intervals, must mark the end with entry: rr_start = rr_end = 0 */ typedef struct { unsigned int rr_flags; /* set of flags for all ranges */ pfmlib_ita2_input_rr_desc_t rr_limits[4]; /* at most 4 distinct intervals */ unsigned char rr_used; /* set if address range restriction is used */ } pfmlib_ita2_input_rr_t; typedef struct { unsigned int rr_nbr_used; /* how many registers were used */ pfmlib_ita2_output_rr_desc_t rr_infos[4]; /* at most 4 distinct intervals */ pfmlib_reg_t rr_br[8]; /* debug reg to configure */ } pfmlib_ita2_output_rr_t; #define PFMLIB_ITA2_RR_INV 0x1 /* inverse instruction ranges (iranges only) */ #define PFMLIB_ITA2_RR_NO_FINE_MODE 0x2 /* force non fine mode for instruction ranges */ /* * Itanium 2 specific parameters for the library */ typedef struct { pfmlib_ita2_counter_t pfp_ita2_counters[PMU_ITA2_NUM_COUNTERS]; /* extended counter features */ unsigned long pfp_ita2_flags; /* Itanium2 specific flags */ pfmlib_ita2_opcm_t pfp_ita2_pmc8; /* PMC8 (opcode matcher) configuration */ pfmlib_ita2_opcm_t pfp_ita2_pmc9; /* PMC9 (opcode matcher) configuration */ pfmlib_ita2_ear_t pfp_ita2_iear; /* IEAR configuration */ pfmlib_ita2_ear_t pfp_ita2_dear; /* DEAR configuration */ pfmlib_ita2_btb_t pfp_ita2_btb; /* BTB configuration */ pfmlib_ita2_input_rr_t pfp_ita2_drange; /* data range restrictions */ pfmlib_ita2_input_rr_t pfp_ita2_irange; /* code range restrictions */ unsigned long reserved[1]; /* for future use */ } pfmlib_ita2_input_param_t; typedef struct { pfmlib_ita2_output_rr_t pfp_ita2_drange; /* data range restrictions */ pfmlib_ita2_output_rr_t pfp_ita2_irange; /* code range restrictions */ unsigned long reserved[6]; /* for future use */ } pfmlib_ita2_output_param_t; extern int pfm_ita2_is_ear(unsigned int i); extern int pfm_ita2_is_dear(unsigned int i); extern int pfm_ita2_is_dear_tlb(unsigned int i); extern int pfm_ita2_is_dear_cache(unsigned int i); extern int pfm_ita2_is_dear_alat(unsigned int i); extern int 
pfm_ita2_is_iear(unsigned int i);
extern int pfm_ita2_is_iear_tlb(unsigned int i);
extern int pfm_ita2_is_iear_cache(unsigned int i);
extern int pfm_ita2_is_btb(unsigned int i);
extern int pfm_ita2_support_opcm(unsigned int i);
extern int pfm_ita2_support_iarr(unsigned int i);
extern int pfm_ita2_support_darr(unsigned int i);
extern int pfm_ita2_get_ear_mode(unsigned int i, pfmlib_ita2_ear_mode_t *m);
extern int pfm_ita2_irange_is_fine(pfmlib_output_param_t *outp, pfmlib_ita2_output_param_t *mod_out);
extern int pfm_ita2_get_event_maxincr(unsigned int i, unsigned int *maxincr);
extern int pfm_ita2_get_event_umask(unsigned int i, unsigned long *umask);
extern int pfm_ita2_get_event_group(unsigned int i, int *grp);
extern int pfm_ita2_get_event_set(unsigned int i, int *set);

/*
 * values of group (grp) returned by pfm_ita2_get_event_group()
 */
#define PFMLIB_ITA2_EVT_NO_GRP		0 /* event does not belong to a group */
#define PFMLIB_ITA2_EVT_L1_CACHE_GRP	1 /* event belongs to L1 Cache group */
#define PFMLIB_ITA2_EVT_L2_CACHE_GRP	2 /* event belongs to L2 Cache group */

/*
 * possible values returned in set by pfm_ita2_get_event_set()
 */
#define PFMLIB_ITA2_EVT_NO_SET		-1 /* event does not belong to a set */

#ifdef __cplusplus /* extern C */
}
#endif

#endif /* __PFMLIB_ITANIUM2_H__ */

papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_montecito.h

/*
 * Dual-Core Itanium 2 PMU specific types and definitions
 *
 * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P.
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_MONTECITO_H__ #define __PFMLIB_MONTECITO_H__ #include #include #if BYTE_ORDER != LITTLE_ENDIAN #error "this file only supports little endian environments" #endif #ifdef __cplusplus extern "C" { #endif #define PMU_MONT_FIRST_COUNTER 4 /* index of first PMC/PMD counter */ #define PMU_MONT_NUM_COUNTERS 12 /* total numbers of PMC/PMD pairs used as counting monitors */ #define PMU_MONT_NUM_PMCS 27 /* total number of PMCS defined */ #define PMU_MONT_NUM_PMDS 36 /* total number of PMDS defined */ #define PMU_MONT_NUM_ETB 16 /* total number of PMDS in ETB */ #define PMU_MONT_COUNTER_WIDTH 47 /* hardware counter bit width */ /* * This structure provides a detailed way to setup a PMC register. * Once value is loaded, it must be copied (via pmu_reg) to the * perfmon_req_t and passed to the kernel via perfmonctl(). 
*/ typedef union { unsigned long pmc_val; /* complete register value */ /* This is the Montecito-specific PMC layout for counters PMC4-PMC15 */ struct { unsigned long pmc_plm:4; /* privilege level mask */ unsigned long pmc_ev:1; /* external visibility */ unsigned long pmc_oi:1; /* overflow interrupt */ unsigned long pmc_pm:1; /* privileged monitor */ unsigned long pmc_ig1:1; /* ignored */ unsigned long pmc_es:8; /* event select */ unsigned long pmc_umask:4; /* unit mask */ unsigned long pmc_thres:3; /* threshold */ unsigned long pmc_ig2:1; /* ignored */ unsigned long pmc_ism:2; /* instruction set: must be 2 */ unsigned long pmc_all:1; /* 0=only self, 1=both threads */ unsigned long pmc_i:1; /* Invalidate */ unsigned long pmc_s:1; /* Shared */ unsigned long pmc_e:1; /* Exclusive */ unsigned long pmc_m:1; /* Modified */ unsigned long pmc_res3:33; /* reserved */ } pmc_mont_counter_reg; /* opcode matchers mask registers */ struct { unsigned long opcm_mask:41; /* opcode mask */ unsigned long opcm_ig1:7; /* ignored */ unsigned long opcm_b:1; /* B-syllable */ unsigned long opcm_f:1; /* F-syllable */ unsigned long opcm_i:1; /* I-syllable */ unsigned long opcm_m:1; /* M-syllable */ unsigned long opcm_ig2:4; /* ignored */ unsigned long opcm_inv:1; /* inverse range for ibrp0 */ unsigned long opcm_ig_ad:1; /* ignore address range restrictions */ unsigned long opcm_ig3:6; /* ignored */ } pmc32_34_mont_reg; /* opcode matchers match registers */ struct { unsigned long opcm_match:41; /* opcode match */ unsigned long opcm_ig1:23; /* ignored */ } pmc33_35_mont_reg; /* opcode matcher config register */ struct { unsigned long opcm_ch0_ig_opcm:1; /* chan0 opcode constraint */ unsigned long opcm_ch1_ig_opcm:1; /* chan1 opcode constraint */ unsigned long opcm_ch2_ig_opcm:1; /* chan2 opcode constraint */ unsigned long opcm_ch3_ig_opcm:1; /* chan3 opcode constraint */ unsigned long opcm_res:28; /* reserved */ unsigned long opcm_ig:32; /* ignored */ } pmc36_mont_reg; /* * instruction event 
address register configuration (I-EAR) * * The register has two layouts depending on the value of the ct field. * In cache mode(ct=1x): * - ct is 1 bit, umask is 8 bits * In TLB mode (ct=0x): * - ct is 2 bits, umask is 7 bits * ct=11 => cache mode using a latency filter with eighth bit set * ct=01 => nothing monitored * * The ct=01 value is the only reason why we cannot fix the layout * to ct 1 bit and umask 8 bits. Even though in TLB mode, only 6 bits * are effectively used for the umask, if the user inadvertently sets * a umask with the most significant bit set, it would be equivalent * to no monitoring. */ struct { unsigned long iear_plm:4; /* privilege level mask */ unsigned long iear_pm:1; /* privileged monitor */ unsigned long iear_umask:8; /* event unit mask */ unsigned long iear_ct:1; /* =1 for i-cache */ unsigned long iear_res:2; /* reserved */ unsigned long iear_ig:48; /* ignored */ } pmc37_mont_cache_reg; struct { unsigned long iear_plm:4; /* privilege level mask */ unsigned long iear_pm:1; /* privileged monitor */ unsigned long iear_umask:7; /* event unit mask */ unsigned long iear_ct:2; /* 00=i-tlb, 01=nothing 1x=illegal */ unsigned long iear_res:50; /* reserved */ } pmc37_mont_tlb_reg; /* data event address register configuration (D-EAR) */ struct { unsigned long dear_plm:4; /* privilege level mask */ unsigned long dear_ig1:2; /* ignored */ unsigned long dear_pm:1; /* privileged monitor */ unsigned long dear_mode:2; /* mode */ unsigned long dear_ig2:7; /* ignored */ unsigned long dear_umask:4; /* unit mask */ unsigned long dear_ig3:4; /* ignored */ unsigned long dear_ism:2; /* instruction set: must be 2 */ unsigned long dear_ig4:38; /* ignored */ } pmc40_mont_reg; /* IP event address register (IP-EAR) */ struct { unsigned long ipear_plm:4; /* privilege level mask */ unsigned long ipear_ig1:2; /* ignored */ unsigned long ipear_pm:1; /* privileged monitor */ unsigned long ipear_ig2:1; /* ignored */ unsigned long ipear_mode:3; /* mode */ unsigned long 
ipear_delay:8; /* delay */ unsigned long ipear_ig3:45; /* reserved */ } pmc42_mont_reg; /* execution trace buffer configuration register (ETB) */ struct { unsigned long etbc_plm:4; /* privilege level */ unsigned long etbc_res1:2; /* reserved */ unsigned long etbc_pm:1; /* privileged monitor */ unsigned long etbc_ds:1; /* data selector */ unsigned long etbc_tm:2; /* taken mask */ unsigned long etbc_ptm:2; /* predicted taken address mask */ unsigned long etbc_ppm:2; /* predicted predicate mask */ unsigned long etbc_brt:2; /* branch type mask */ unsigned long etbc_ig:48; /* ignored */ } pmc39_mont_reg; /* data address range configuration register */ struct { unsigned long darc_res1:3; /* reserved */ unsigned long darc_cfg_dtag0:2; /* constraints on dbrp0 */ unsigned long darc_res2:6; /* reserved */ unsigned long darc_cfg_dtag1:2; /* constraints on dbrp1 */ unsigned long darc_res3:6; /* reserved */ unsigned long darc_cfg_dtag2:2; /* constraints on dbrp2 */ unsigned long darc_res4:6; /* reserved */ unsigned long darc_cfg_dtag3:2; /* constraints on dbrp3 */ unsigned long darc_res5:16; /* reserved */ unsigned long darc_ena_dbrp0:1; /* enable constraints dbrp0 */ unsigned long darc_ena_dbrp1:1; /* enable constraints dbrp1 */ unsigned long darc_ena_dbrp2:1; /* enable constraints dbrp2 */ unsigned long darc_ena_dbrp3:1; /* enable constraint dbr3 */ unsigned long darc_res6:15; } pmc41_mont_reg; /* instruction address range configuration register */ struct { unsigned long iarc_res1:1; /* reserved */ unsigned long iarc_ig_ibrp0:1; /* constrained by ibrp0 */ unsigned long iarc_res2:2; /* reserved */ unsigned long iarc_ig_ibrp1:1; /* constrained by ibrp1 */ unsigned long iarc_res3:2; /* reserved */ unsigned long iarc_ig_ibrp2:1; /* constrained by ibrp2 */ unsigned long iarc_res4:2; /* reserved */ unsigned long iarc_ig_ibrp3:1; /* constrained by ibrp3 */ unsigned long iarc_res5:2; /* reserved */ unsigned long iarc_fine:1; /* fine mode */ unsigned long iarc_ig6:50; /* reserved */ } 
pmc38_mont_reg; } pfm_mont_pmc_reg_t; typedef union { unsigned long pmd_val; /* counter value */ /* counting pmd register */ struct { unsigned long pmd_count:47; /* 47-bit hardware counter */ unsigned long pmd_sxt47:17; /* sign extension of bit 46 */ } pmd_mont_counter_reg; /* data event address register */ struct { unsigned long dear_daddr; /* data address */ } pmd32_mont_reg; /* data event address register (D-EAR) */ struct { unsigned long dear_latency:13; /* latency */ unsigned long dear_ov:1; /* latency overflow */ unsigned long dear_stat:2; /* status */ unsigned long dear_ig:48; /* ignored */ } pmd33_mont_reg; /* instruction event address register (I-EAR) */ struct { unsigned long iear_stat:2; /* status bit */ unsigned long iear_ig:3; /* ignored */ unsigned long iear_iaddr:59; /* instruction cache line address {60:51} sxt {50}*/ } pmd34_mont_reg; /* instruction event address register (I-EAR) */ struct { unsigned long iear_latency:12; /* latency */ unsigned long iear_ov:1; /* latency overflow */ unsigned long iear_ig:51; /* ignored */ } pmd35_mont_reg; /* data event address register (D-EAR) */ struct { unsigned long dear_slot:2; /* slot */ unsigned long dear_bn:1; /* bundle bit (if 1 add 16 to iaddr) */ unsigned long dear_vl:1; /* valid */ unsigned long dear_iaddr:60; /* instruction address (2-bundle window)*/ } pmd36_mont_reg; /* execution trace buffer index register (ETB) */ struct { unsigned long etbi_ebi:4; /* next entry index */ unsigned long etbi_ig1:1; /* ignored */ unsigned long etbi_full:1; /* ETB overflowed at least once */ unsigned long etbi_ig2:58; /* ignored */ } pmd38_mont_reg; /* execution trace buffer extension register (ETB) */ struct { unsigned long etb_pmd48ext_b1:1; /* pmd48 ext */ unsigned long etb_pmd48ext_bruflush:1; /* pmd48 ext */ unsigned long etb_pmd48ext_res:2; /* reserved */ unsigned long etb_pmd56ext_b1:1; /* pmd56 ext */ unsigned long etb_pmd56ext_bruflush:1; /* pmd56 ext */ unsigned long etb_pmd56ext_res:2; /* reserved */ 
unsigned long etb_pmd49ext_b1:1; /* pmd49 ext */ unsigned long etb_pmd49ext_bruflush:1; /* pmd49 ext */ unsigned long etb_pmd49ext_res:2; /* reserved */ unsigned long etb_pmd57ext_b1:1; /* pmd57 ext */ unsigned long etb_pmd57ext_bruflush:1; /* pmd57 ext */ unsigned long etb_pmd57ext_res:2; /* reserved */ unsigned long etb_pmd50ext_b1:1; /* pmd50 ext */ unsigned long etb_pmd50ext_bruflush:1; /* pmd50 ext */ unsigned long etb_pmd50ext_res:2; /* reserved */ unsigned long etb_pmd58ext_b1:1; /* pmd58 ext */ unsigned long etb_pmd58ext_bruflush:1; /* pmd58 ext */ unsigned long etb_pmd58ext_res:2; /* reserved */ unsigned long etb_pmd51ext_b1:1; /* pmd51 ext */ unsigned long etb_pmd51ext_bruflush:1; /* pmd51 ext */ unsigned long etb_pmd51ext_res:2; /* reserved */ unsigned long etb_pmd59ext_b1:1; /* pmd59 ext */ unsigned long etb_pmd59ext_bruflush:1; /* pmd59 ext */ unsigned long etb_pmd59ext_res:2; /* reserved */ unsigned long etb_pmd52ext_b1:1; /* pmd52 ext */ unsigned long etb_pmd52ext_bruflush:1; /* pmd52 ext */ unsigned long etb_pmd52ext_res:2; /* reserved */ unsigned long etb_pmd60ext_b1:1; /* pmd60 ext */ unsigned long etb_pmd60ext_bruflush:1; /* pmd60 ext */ unsigned long etb_pmd60ext_res:2; /* reserved */ unsigned long etb_pmd53ext_b1:1; /* pmd53 ext */ unsigned long etb_pmd53ext_bruflush:1; /* pmd53 ext */ unsigned long etb_pmd53ext_res:2; /* reserved */ unsigned long etb_pmd61ext_b1:1; /* pmd61 ext */ unsigned long etb_pmd61ext_bruflush:1; /* pmd61 ext */ unsigned long etb_pmd61ext_res:2; /* reserved */ unsigned long etb_pmd54ext_b1:1; /* pmd54 ext */ unsigned long etb_pmd54ext_bruflush:1; /* pmd54 ext */ unsigned long etb_pmd54ext_res:2; /* reserved */ unsigned long etb_pmd62ext_b1:1; /* pmd62 ext */ unsigned long etb_pmd62ext_bruflush:1; /* pmd62 ext */ unsigned long etb_pmd62ext_res:2; /* reserved */ unsigned long etb_pmd55ext_b1:1; /* pmd55 ext */ unsigned long etb_pmd55ext_bruflush:1; /* pmd55 ext */ unsigned long etb_pmd55ext_res:2; /* reserved */ unsigned 
long etb_pmd63ext_b1:1; /* pmd63 ext */ unsigned long etb_pmd63ext_bruflush:1; /* pmd63 ext */ unsigned long etb_pmd63ext_res:2; /* reserved */ } pmd39_mont_reg; /* * execution trace buffer extension register when used with IP-EAR * * to be used in conjunction with pmd48_63_ipear_reg (see below) */ struct { unsigned long ipear_pmd48ext_cycles:2; /* pmd48 upper 2 bits of cycles */ unsigned long ipear_pmd48ext_f:1; /* pmd48 flush bit */ unsigned long ipear_pmd48ext_ef:1; /* pmd48 early freeze */ unsigned long ipear_pmd56ext_cycles:2; /* pmd56 upper 2 bits of cycles */ unsigned long ipear_pmd56ext_f:1; /* pmd56 flush bit */ unsigned long ipear_pmd56ext_ef:1; /* pmd56 early freeze */ unsigned long ipear_pmd49ext_cycles:2; /* pmd49 upper 2 bits of cycles */ unsigned long ipear_pmd49ext_f:1; /* pmd49 flush bit */ unsigned long ipear_pmd49ext_ef:1; /* pmd49 early freeze */ unsigned long ipear_pmd57ext_cycles:2; /* pmd57 upper 2 bits of cycles */ unsigned long ipear_pmd57ext_f:1; /* pmd57 flush bit */ unsigned long ipear_pmd57ext_ef:1; /* pmd57 early freeze */ unsigned long ipear_pmd50ext_cycles:2; /* pmd50 upper 2 bits of cycles */ unsigned long ipear_pmd50ext_f:1; /* pmd50 flush bit */ unsigned long ipear_pmd50ext_ef:1; /* pmd50 early freeze */ unsigned long ipear_pmd58ext_cycles:2; /* pmd58 upper 2 bits of cycles */ unsigned long ipear_pmd58ext_f:1; /* pmd58 flush bit */ unsigned long ipear_pmd58ext_ef:1; /* pmd58 early freeze */ unsigned long ipear_pmd51ext_cycles:2; /* pmd51 upper 2 bits of cycles */ unsigned long ipear_pmd51ext_f:1; /* pmd51 flush bit */ unsigned long ipear_pmd51ext_ef:1; /* pmd51 early freeze */ unsigned long ipear_pmd59ext_cycles:2; /* pmd59 upper 2 bits of cycles */ unsigned long ipear_pmd59ext_f:1; /* pmd59 flush bit */ unsigned long ipear_pmd59ext_ef:1; /* pmd59 early freeze */ unsigned long ipear_pmd52ext_cycles:2; /* pmd52 upper 2 bits of cycles */ unsigned long ipear_pmd52ext_f:1; /* pmd52 flush bit */ unsigned long ipear_pmd52ext_ef:1; /* 
pmd52 early freeze */ unsigned long ipear_pmd60ext_cycles:2; /* pmd60 upper 2 bits of cycles */ unsigned long ipear_pmd60ext_f:1; /* pmd60 flush bit */ unsigned long ipear_pmd60ext_ef:1; /* pmd60 early freeze */ unsigned long ipear_pmd53ext_cycles:2; /* pmd53 upper 2 bits of cycles */ unsigned long ipear_pmd53ext_f:1; /* pmd53 flush bit */ unsigned long ipear_pmd53ext_ef:1; /* pmd53 early freeze */ unsigned long ipear_pmd61ext_cycles:2; /* pmd61 upper 2 bits of cycles */ unsigned long ipear_pmd61ext_f:1; /* pmd61 flush bit */ unsigned long ipear_pmd61ext_ef:1; /* pmd61 early freeze */ unsigned long ipear_pmd54ext_cycles:2; /* pmd54 upper 2 bits of cycles */ unsigned long ipear_pmd54ext_f:1; /* pmd54 flush bit */ unsigned long ipear_pmd54ext_ef:1; /* pmd54 early freeze */ unsigned long ipear_pmd62ext_cycles:2; /* pmd62 upper 2 bits of cycles */ unsigned long ipear_pmd62ext_f:1; /* pmd62 flush bit */ unsigned long ipear_pmd62ext_ef:1; /* pmd62 early freeze */ unsigned long ipear_pmd55ext_cycles:2; /* pmd55 upper 2 bits of cycles */ unsigned long ipear_pmd55ext_f:1; /* pmd55 flush bit */ unsigned long ipear_pmd55ext_ef:1; /* pmd55 early freeze */ unsigned long ipear_pmd63ext_cycles:2; /* pmd63 upper 2 bits of cycles */ unsigned long ipear_pmd63ext_f:1; /* pmd63 flush bit */ unsigned long ipear_pmd63ext_ef:1; /* pmd63 early freeze */ } pmd39_ipear_mont_reg; /* * execution trace buffer data register (ETB) * * when pmc39.ds == 0: pmd48-63 contains branch targets * when pmc39.ds == 1: pmd48-63 content is undefined */ struct { unsigned long etb_s:1; /* source bit */ unsigned long etb_mp:1; /* mispredict bit */ unsigned long etb_slot:2; /* which slot, 3=not taken branch */ unsigned long etb_addr:60; /* bundle address(s=1), target address(s=0) */ } pmd48_63_etb_mont_reg; /* * execution trace buffer when used with IP-EAR with PMD48-63.ef=0 * * The cycles field straddles pmdXX and corresponding extension in * pmd39 (pmd39_ipear_mont_reg). 
For instance, cycles for pmd48: * * cycles = pmd39_ipear_mont_reg.ipear_pmd48ext_cycles << 4 * | pmd48_63_ipear_mont_reg.ipear_cycles */ struct { unsigned long ipear_addr:60; /* retired IP[63:4] */ unsigned long ipear_cycles:4; /* lower 4 bits of cycles */ } pmd48_63_ipear_mont_reg; /* * execution trace buffer when used with IP-EAR with PMD48-63.ef=1 * * The cycles field straddles pmdXX and corresponding extension in * pmd39 (pmd39_ipear_mont_reg). For instance, cycles for pmd48: * * cycles = pmd39_ipear_mont_reg.ipear_pmd48ext_cycles << 4 * | pmd48_63_ipear_ef_mont_reg.ipear_cycles */ struct { unsigned long ipear_delay:8; /* delay count */ unsigned long ipear_addr:52; /* retired IP[61:12] */ unsigned long ipear_cycles:4; /* lower 4 bits of cycles */ } pmd48_63_ipear_ef_mont_reg; } pfm_mont_pmd_reg_t; typedef struct { unsigned int flags; /* counter specific flags */ unsigned int thres; /* per event threshold */ } pfmlib_mont_counter_t; /* * counter specific flags */ #define PFMLIB_MONT_FL_EVT_NO_QUALCHECK 0x1 /* don't check qualifier constraints */ #define PFMLIB_MONT_FL_EVT_ALL_THRD 0x2 /* event measured for both threads */ #define PFMLIB_MONT_FL_EVT_ACTIVE_ONLY 0x4 /* measure the event only when the thread is active */ #define PFMLIB_MONT_FL_EVT_ALWAYS 0x8 /* measure the event at all times (active or inactive) */ /* * * The ETB can be configured via 4 different methods: * * - BRANCH_EVENT is in the event list, pfp_mont_etb.etb_used == 0: * The ETB will be configured (PMC12) to record all branches AND a counting * monitor will be set up to count BRANCH_EVENT. * * - BRANCH_EVENT is in the event list, pfp_mont_etb.etb_used == 1: * The ETB will be configured (PMC12) according to information in pfp_mont_etb AND * a counter will be set up to count BRANCH_EVENT.
* * - BRANCH_EVENT is NOT in the event list, pfp_mont_etb.etb_used == 0: * Nothing is programmed * * - BRANCH_EVENT is NOT in the event list, pfp_mont_etb.etb_used == 1: * The ETB will be configured (PMC12) according to information in pfp_mont_etb. * This is the free running ETB mode. */ typedef struct { unsigned char etb_used; /* set to 1 if the ETB is used */ unsigned int etb_plm; /* ETB privilege level mask */ unsigned char etb_tm; /* taken mask */ unsigned char etb_ptm; /* predicted target mask */ unsigned char etb_ppm; /* predicted predicate mask */ unsigned char etb_brt; /* branch type mask */ } pfmlib_mont_etb_t; /* * There are four ways to configure EAR: * * - an EAR event is in the event list AND pfp_mont_?ear.ear_used = 0: * The EAR will be programmed (PMC37 or PMC40) based on the information encoded in the * event (umask, cache, tlb,alat). A counting monitor will be programmed to * count DATA_EAR_EVENTS or L1I_EAR_EVENTS depending on the type of EAR. * * - an EAR event is in the event list AND pfp_mont_?ear.ear_used = 1: * The EAR will be programmed (PMC37 or PMC40) according to the information in the * pfp_mont_?ear structure because it contains more detailed information * (such as priv level and instruction set). A counting monitor will be programmed * to count DATA_EAR_EVENTS or L1I_EAR_EVENTS depending on the type of EAR. * * - no EAR event is in the event list AND pfp_mont_?ear.ear_used = 0: * Nothing is programmed. * * - no EAR event is in the event list AND pfp_mont_?ear.ear_used = 1: * The EAR will be programmed (PMC37 or PMC40) according to the information in the * pfp_mont_?ear structure. 
This is the free running mode for EAR. */ typedef enum { PFMLIB_MONT_EAR_CACHE_MODE = 0, /* Cache mode : I-EAR and D-EAR */ PFMLIB_MONT_EAR_TLB_MODE = 1, /* TLB mode : I-EAR and D-EAR */ PFMLIB_MONT_EAR_ALAT_MODE = 2 /* ALAT mode : D-EAR only */ } pfmlib_mont_ear_mode_t; typedef struct { unsigned char ear_used; /* when set will force definition of PMC37/PMC40 */ pfmlib_mont_ear_mode_t ear_mode; /* EAR mode */ unsigned int ear_plm; /* EAR privilege level mask */ unsigned long ear_umask; /* umask value for PMC37/PMC40 */ } pfmlib_mont_ear_t; /* * describes one range. rr_plm is ignored for data ranges * a range is interpreted as unused (not defined) when rr_start = rr_end = 0. * if rr_plm is not set it will use the default settings set in the generic * library param structure. */ typedef struct { unsigned int rr_plm; /* privilege level (ignored for data ranges) */ unsigned long rr_start; /* start address */ unsigned long rr_end; /* end address (not included) */ } pfmlib_mont_input_rr_desc_t; typedef struct { unsigned long rr_soff; /* start offset from actual start */ unsigned long rr_eoff; /* end offset from actual end */ } pfmlib_mont_output_rr_desc_t; /* * rr_used must be set to true for the library to configure the debug registers. * rr_inv only applies when the rr_limits table contains ONLY 1 range.
* * If using less than 4 intervals, must mark the end with entry: rr_start = rr_end = 0 */ typedef struct { unsigned int rr_flags; /* set of flags for all ranges */ pfmlib_mont_input_rr_desc_t rr_limits[4]; /* at most 4 distinct intervals */ unsigned char rr_used; /* set if address range restriction is used */ } pfmlib_mont_input_rr_t; /* * rr_flags values: * PFMLIB_MONT_IRR_DEMAND_FETCH, PFMLIB_MONT_IRR_PREFETCH_MATCH to be used * ONLY in conjunction with any of the following (dual) events: * * - ISB_BUNPAIRS_IN, L1I_FETCH_RAB_HIT, L1I_FETCH_ISB_HIT, L1I_FILLS * * PFMLIB_MONT_IRR_DEMAND_FETCH: declared interest in demand fetched cache * line (force use of IBRP0) * * PFMLIB_MONT_IRR_PREFETCH_MATCH: declared interest in regular prefetched cache * line (force use of IBRP1) */ #define PFMLIB_MONT_RR_INV 0x1 /* inverse instruction ranges (iranges only) */ #define PFMLIB_MONT_RR_NO_FINE_MODE 0x2 /* force non-fine mode for instruction ranges */ #define PFMLIB_MONT_IRR_DEMAND_FETCH 0x4 /* demand fetch only for dual events */ #define PFMLIB_MONT_IRR_PREFETCH_MATCH 0x8 /* regular prefetches for dual events */ typedef struct { unsigned int rr_nbr_used; /* how many registers were used */ pfmlib_mont_output_rr_desc_t rr_infos[4]; /* at most 4 distinct intervals */ pfmlib_reg_t rr_br[8]; /* debug reg to configure */ } pfmlib_mont_output_rr_t; typedef struct { unsigned char opcm_used; /* set when opcm is used */ unsigned char opcm_m; /* M slot */ unsigned char opcm_i; /* I slot */ unsigned char opcm_f; /* F slot */ unsigned char opcm_b; /* B slot */ unsigned long opcm_match; /* match field */ unsigned long opcm_mask; /* mask field */ } pfmlib_mont_opcm_t; typedef struct { unsigned char ipear_used; /* set when ipear is used */ unsigned int ipear_plm; /* IP-EAR privilege level mask */ unsigned short ipear_delay; /* delay in cycles */ } pfmlib_mont_ipear_t; /* * Montecito specific parameters for the library */ typedef struct { pfmlib_mont_counter_t
pfp_mont_counters[PMU_MONT_NUM_COUNTERS]; /* extended counter features */ unsigned long pfp_mont_flags; /* Montecito specific flags */ pfmlib_mont_opcm_t pfp_mont_opcm1; /* pmc32/pmc33 (opcode matcher) configuration */ pfmlib_mont_opcm_t pfp_mont_opcm2; /* pmc34/pmc35 (opcode matcher) configuration */ pfmlib_mont_ear_t pfp_mont_iear; /* IEAR configuration */ pfmlib_mont_ear_t pfp_mont_dear; /* DEAR configuration */ pfmlib_mont_etb_t pfp_mont_etb; /* ETB configuration */ pfmlib_mont_ipear_t pfp_mont_ipear; /* IP-EAR configuration */ pfmlib_mont_input_rr_t pfp_mont_drange; /* data range restrictions */ pfmlib_mont_input_rr_t pfp_mont_irange; /* code range restrictions */ unsigned long reserved[1]; /* for future use */ } pfmlib_mont_input_param_t; typedef struct { pfmlib_mont_output_rr_t pfp_mont_drange; /* data range restrictions */ pfmlib_mont_output_rr_t pfp_mont_irange; /* code range restrictions */ unsigned long reserved[6]; /* for future use */ } pfmlib_mont_output_param_t; extern int pfm_mont_is_ear(unsigned int i); extern int pfm_mont_is_dear(unsigned int i); extern int pfm_mont_is_dear_tlb(unsigned int i); extern int pfm_mont_is_dear_cache(unsigned int i); extern int pfm_mont_is_dear_alat(unsigned int i); extern int pfm_mont_is_iear(unsigned int i); extern int pfm_mont_is_iear_tlb(unsigned int i); extern int pfm_mont_is_iear_cache(unsigned int i); extern int pfm_mont_is_etb(unsigned int i); extern int pfm_mont_support_opcm(unsigned int i); extern int pfm_mont_support_iarr(unsigned int i); extern int pfm_mont_support_darr(unsigned int i); extern int pfm_mont_support_all(unsigned int i); extern int pfm_mont_get_ear_mode(unsigned int i, pfmlib_mont_ear_mode_t *m); extern int pfm_mont_irange_is_fine(pfmlib_output_param_t *outp, pfmlib_mont_output_param_t *mod_out); extern int pfm_mont_get_event_maxincr(unsigned int i, unsigned int *maxincr); extern int pfm_mont_get_event_umask(unsigned int i, unsigned long *umask); extern int pfm_mont_get_event_group(unsigned int 
i, int *grp); extern int pfm_mont_get_event_set(unsigned int i, int *set); extern int pfm_mont_get_event_type(unsigned int i, int *type); /* * values of group (grp) returned by pfm_mont_get_event_group() */ #define PFMLIB_MONT_EVT_NO_GRP 0 /* event does not belong to a group */ #define PFMLIB_MONT_EVT_L1D_CACHE_GRP 1 /* event belongs to L1D Cache group */ #define PFMLIB_MONT_EVT_L2D_CACHE_GRP 2 /* event belongs to L2D Cache group */ /* * possible values returned in set by pfm_mont_get_event_set() */ #define PFMLIB_MONT_EVT_NO_SET -1 /* event does not belong to a set */ /* * values of type returned by pfm_mont_get_event_type() */ #define PFMLIB_MONT_EVT_ACTIVE 0 /* event measures only when thread is active */ #define PFMLIB_MONT_EVT_FLOATING 1 #define PFMLIB_MONT_EVT_CAUSAL 2 #define PFMLIB_MONT_EVT_SELF_FLOATING 3 /* floating with .self, causal otherwise */ #ifdef __cplusplus /* extern C */ } #endif #endif /* __PFMLIB_MONTECITO_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_os.h000066400000000000000000000034421502707512200237570ustar00rootroot00000000000000/* * Copyright (c) 2003-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_OS_H__ #define __PFMLIB_OS_H__ #ifdef __linux__ #ifdef __ia64__ #include <perfmon/pfmlib_os_ia64.h> #endif #ifdef __x86_64__ #include <perfmon/pfmlib_os_x86_64.h> #endif #ifdef __i386__ #include <perfmon/pfmlib_os_i386.h> #endif #if defined(__mips__) #include <perfmon/pfmlib_os_mips64.h> #endif #ifdef __powerpc__ #include <perfmon/pfmlib_os_powerpc.h> #endif #ifdef __sparc__ #include <perfmon/pfmlib_os_sparc.h> #endif #ifdef __cell__ #include <perfmon/pfmlib_os_cell.h> #endif #ifdef __crayx2 #include <perfmon/pfmlib_os_crayx2.h> #endif #endif /* __linux__ */ #endif /* __PFMLIB_OS_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_os_crayx2.h000066400000000000000000000035171502707512200252520ustar00rootroot00000000000000/* * Copyright (c) 2007 Cray Inc. * Contributed by Steve Kaufmann based on code from * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_OS_CRAYX2_H__ #define __PFMLIB_OS_CRAYX2_H__ #ifndef __PFMLIB_OS_H__ #error "you should never include this file directly, use pfmlib_os.h" #endif #include /* * macros version of pfm_self_start/pfm_self_stop to be used in per-process self-monitoring sessions. * they are also defined as real functions. * * DO NOT USE on system-wide sessions. */ static inline int pfm_self_start(int fd) { return pfm_start(fd, NULL); } static inline int pfm_self_stop(int fd) { return pfm_stop(fd); } #endif /* __PFMLIB_OS_CRAYX2_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_os_i386.h000066400000000000000000000036261502707512200245340ustar00rootroot00000000000000/* * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_OS_I386_P6_H__ #define __PFMLIB_OS_I386_P6_H__ #ifndef __PFMLIB_OS_H__ #error "you should never include this file directly, use pfmlib_os.h" #endif #include #ifndef __i386__ #error "you should not be including this file" #endif #ifndef __PFMLIB_OS_COMPILE #include /* * macros version of pfm_self_start/pfm_self_stop to be used in per-process self-monitoring sessions. * they are also defined as real functions. * * DO NOT USE on system-wide sessions. */ static inline int pfm_self_start(int fd) { return pfm_start(fd, NULL); } static inline int pfm_self_stop(int fd) { return pfm_stop(fd); } #endif /* __PFMLIB_OS_COMPILE */ #endif /* __PFMLIB_OS_I386_P6_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_os_ia64.h000066400000000000000000000042111502707512200245750ustar00rootroot00000000000000/* * Copyright (c) 2003-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_OS_IA64_H__ #define __PFMLIB_OS_IA64_H__ #ifndef __PFMLIB_OS_H__ #error "you should never include this file directly, use pfmlib_os.h" #endif #include #ifndef __ia64__ #error "you should not be including this file" #endif /* * you should never include this file directly, it is included from pfmlib.h */ #ifdef __cplusplus extern "C" { #endif #ifdef __linux__ #ifndef __PFMLIB_OS_COMPILE /* * macros version of pfm_self_start/pfm_self_stop to be used in per-process self-monitoring sessions. * they are also defined as real functions. * * DO NOT USE on system-wide sessions. */ static inline int pfm_self_start(int fd) { fd = 0; /* avoid compiler warning */ ia64_sum(); return 0; } static inline int pfm_self_stop(int fd) { fd = 0; /* avoid compiler warning */ ia64_rum(); return 0; } #endif /* __PFMLIB_OS_COMPILE */ #endif /*__linux__ */ #ifdef __cplusplus /* extern C */ } #endif #endif /* __PFMLIB_OS_IA64_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_os_mips64.h000066400000000000000000000037351502707512200251660ustar00rootroot00000000000000/* * Contributed by Philip Mucci based on code from * Copyright (c) 2004-2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_OS_MIPS64_H__ #define __PFMLIB_OS_MIPS64_H__ #ifndef __PFMLIB_OS_H__ #error "you should never include this file directly, use pfmlib_os.h" #endif #include #if !defined(__mips__) #error "you should not be including this file" #endif #ifndef __PFMLIB_OS_COMPILE #include /* * macros version of pfm_self_start/pfm_self_stop to be used in per-process self-monitoring sessions. * they are also defined as real functions. * * DO NOT USE on system-wide sessions. */ static inline int pfm_self_start(int fd) { return pfm_start(fd, NULL); } static inline int pfm_self_stop(int fd) { return pfm_stop(fd); } #endif /* __PFMLIB_OS_COMPILE */ #endif /* __PFMLIB_OS_MIPS64_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_os_powerpc.h000066400000000000000000000036311502707512200255160ustar00rootroot00000000000000/* * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_OS_POWERPC_H__ #define __PFMLIB_OS_POWERPC_H__ #ifndef __PFMLIB_OS_H__ #error "you should never include this file directly, use pfmlib_os.h" #endif #include #ifndef __powerpc__ #error "you should not be including this file" #endif #ifndef __PFMLIB_OS_COMPILE #include /* * macros version of pfm_self_start/pfm_self_stop to be used in per-process self-monitoring sessions. * they are also defined as real functions. * * DO NOT USE on system-wide sessions. */ static inline int pfm_self_start(int fd) { return pfm_start(fd, NULL); } static inline int pfm_self_stop(int fd) { return pfm_stop(fd); } #endif /* __PFMLIB_OS_COMPILE */ #endif /* __PFMLIB_OS_POWERPC_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_os_sparc.h000066400000000000000000000036211502707512200251460ustar00rootroot00000000000000/* * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_OS_SPARC_H__ #define __PFMLIB_OS_SPARC_H__ #ifndef __PFMLIB_OS_H__ #error "you should never include this file directly, use pfmlib_os.h" #endif #include #ifndef __sparc__ #error "you should not be including this file" #endif #ifndef __PFMLIB_OS_COMPILE #include /* * macros version of pfm_self_start/pfm_self_stop to be used in per-process self-monitoring sessions. * they are also defined as real functions. * * DO NOT USE on system-wide sessions. */ static inline int pfm_self_start(int fd) { return pfm_start(fd, NULL); } static inline int pfm_self_stop(int fd) { return pfm_stop(fd); } #endif /* __PFMLIB_OS_COMPILE */ #endif /* __PFMLIB_OS_SPARC_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_os_x86_64.h000066400000000000000000000036251502707512200250000ustar00rootroot00000000000000/* * Copyright (c) 2004-2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_OS_X86_64_H__ #define __PFMLIB_OS_X86_64_H__ #ifndef __PFMLIB_OS_H__ #error "you should never include this file directly, use pfmlib_os.h" #endif #include #ifndef __x86_64__ #error "you should not be including this file" #endif #ifndef __PFMLIB_OS_COMPILE #include /* * macros version of pfm_self_start/pfm_self_stop to be used in per-process self-monitoring sessions. * they are also defined as real functions. * * DO NOT USE on system-wide sessions. 
*/ static inline int pfm_self_start(int fd) { return pfm_start(fd, NULL); } static inline int pfm_self_stop(int fd) { return pfm_stop(fd); } #endif /* __PFMLIB_OS_COMPILE */ #endif /* __PFMLIB_OS_X86_64_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_pentium4.h000066400000000000000000000110701502707512200250770ustar00rootroot00000000000000/* * Intel Pentium 4 PMU specific types and definitions (32 and 64 bit modes) * * Copyright (c) 2006 IBM Corp. * Contributed by Kevin Corry * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_PENTIUM4_H__ #define __PFMLIB_PENTIUM4_H__ #include /* ESCR: Event Selection Control Register * * These registers are used to select which event to count along with options * for that event. There are (up to) 45 ESCRs, but each data counter is * restricted to a specific set of ESCRs. */ /** * pentium4_escr_value_t * * Bit-wise breakdown of the ESCR registers. 
* * Bits Description * ------- ----------- * 63 - 31 Reserved * 30 - 25 Event Select * 24 - 9 Event Mask * 8 - 5 Tag Value * 4 Tag Enable * 3 T0 OS - Enable counting in kernel mode (thread 0) * 2 T0 USR - Enable counting in user mode (thread 0) * 1 T1 OS - Enable counting in kernel mode (thread 1) * 0 T1 USR - Enable counting in user mode (thread 1) **/ #define EVENT_MASK_BITS 16 #define EVENT_SELECT_BITS 6 typedef union { unsigned long val; struct { unsigned long t1_usr:1; unsigned long t1_os:1; unsigned long t0_usr:1; unsigned long t0_os:1; unsigned long tag_enable:1; unsigned long tag_value:4; unsigned long event_mask:EVENT_MASK_BITS; unsigned long event_select:EVENT_SELECT_BITS; unsigned long reserved:1; } bits; } pentium4_escr_value_t; /* CCCR: Counter Configuration Control Register * * These registers are used to configure the data counters. There are 18 * CCCRs, one for each data counter. */ /** * pentium4_cccr_value_t * * Bit-wise breakdown of the CCCR registers. * * Bits Description * ------- ----------- * 63 - 32 Reserved * 31 OVF - The data counter overflowed. * 30 Cascade - Enable cascading of data counter when alternate * counter overflows. * 29 - 28 Reserved * 27 OVF_PMI_T1 - Generate interrupt for LP1 on counter overflow * 26 OVF_PMI_T0 - Generate interrupt for LP0 on counter overflow * 25 FORCE_OVF - Force interrupt on every counter increment * 24 Edge - Enable rising edge detection of the threshold comparison * output for filtering event counts. * 23 - 20 Threshold Value - Select the threshold value for comparing to * incoming event counts. * 19 Complement - Select how incoming event count is compared with * the threshold value. * 18 Compare - Enable filtering of event counts. * 17 - 16 Active Thread - Only used with HT enabled. * 00 - None: Count when neither LP is active. * 01 - Single: Count when only one LP is active. * 10 - Both: Count when both LPs are active. * 11 - Any: Count when either LP is active. 
* 15 - 13 ESCR Select - Select which ESCR to use for selecting the * event to count. * 12 Enable - Turns the data counter on or off. * 11 - 0 Reserved **/ typedef union { unsigned long val; struct { unsigned long reserved1:12; unsigned long enable:1; unsigned long escr_select:3; unsigned long active_thread:2; unsigned long compare:1; unsigned long complement:1; unsigned long threshold:4; unsigned long edge:1; unsigned long force_ovf:1; unsigned long ovf_pmi_t0:1; unsigned long ovf_pmi_t1:1; unsigned long reserved2:2; unsigned long cascade:1; unsigned long overflow:1; } bits; } pentium4_cccr_value_t; #endif /* __PFMLIB_PENTIUM4_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_powerpc.h000066400000000000000000000027661502707512200250250ustar00rootroot00000000000000/* * PowerPC PMU specific types and definitions. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #ifndef __PFMLIB_POWERPC_H__ #define __PFMLIB_POWERPC_H__ #include /* This privilege level mapping derives from PAPI's perfmon.c's set_domain function: PFM_PLM0 = Kernel -> POWER supervisor state PFM_PLM1 = Supervisor -> POWER hypervisor state PFM_PLM2 = Other -> not supported PFM_PLM3 = User -> POWER problem state */ #endif /* __PFMLIB_POWERPC_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_sicortex.h000066400000000000000000000107201502707512200251730ustar00rootroot00000000000000/* * Generic MIPS64 PMU specific types and definitions * * Contributed by Philip Mucci based on code from * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/ #ifndef __PFMLIB_SICORTEX_H__ #define __PFMLIB_SICORTEX_H__ #include /* MIPS are bi-endian */ #include /* * privilege level mask usage for MIPS: * * PFM_PLM0 = KERNEL * PFM_PLM1 = SUPERVISOR * PFM_PLM2 = INTERRUPT * PFM_PLM3 = USER */ #ifdef __cplusplus extern "C" { #endif /* * SiCortex specific */ typedef union { uint64_t val; /* complete register value */ struct { unsigned long sel_exl:1; /* int level */ unsigned long sel_os:1; /* system level */ unsigned long sel_sup:1; /* supervisor level */ unsigned long sel_usr:1; /* user level */ unsigned long sel_int:1; /* enable intr */ unsigned long sel_event_mask:6; /* event mask */ unsigned long sel_res1:23; /* reserved */ unsigned long sel_res2:32; /* reserved */ } perfsel; } pfm_sicortex_sel_reg_t; #define PMU_SICORTEX_SCB_NUM_COUNTERS 256 typedef union { uint64_t val; struct { unsigned long Interval:4; unsigned long IntBit:5; unsigned long NoInc:1; unsigned long AddrAssert:1; unsigned long MagicEvent:2; unsigned long Reserved:19; } sicortex_ScbPerfCtl_reg; struct { unsigned long HistGte:20; unsigned long Reserved:12; } sicortex_ScbPerfHist_reg; struct { unsigned long Bucket:8; unsigned long Reserved:24; } sicortex_ScbPerfBuckNum_reg; struct { unsigned long ena:1; unsigned long Reserved:31; } sicortex_ScbPerfEna_reg; struct { unsigned long event:15; unsigned long hist:1; unsigned long ifOther:2; unsigned long Reserved:15; } sicortex_ScbPerfBucket_reg; } pmc_sicortex_scb_reg_t; typedef union { uint64_t val; struct { unsigned long Reserved:2; uint64_t VPCL:38; unsigned long VPCH:2; } sicortex_CpuPerfVPC_reg; struct { unsigned long Reserved:5; unsigned long PEA:31; unsigned long Reserved2:12; unsigned long ASID:8; unsigned long L2STOP:4; unsigned long L2STATE:3; unsigned long L2HIT:1; } sicortex_CpuPerfPEA_reg; } pmd_sicortex_cpu_reg_t; typedef struct { unsigned long NoInc:1; unsigned long Interval:4; unsigned long HistGte:20; unsigned long Bucket:8; } pfmlib_sicortex_scb_t; typedef struct { unsigned long ifOther:2; 
unsigned long hist:1; } pfmlib_sicortex_scb_counter_t; #define PFMLIB_SICORTEX_INPUT_SCB_NONE (unsigned long)0x0 #define PFMLIB_SICORTEX_INPUT_SCB_INTERVAL (unsigned long)0x1 #define PFMLIB_SICORTEX_INPUT_SCB_NOINC (unsigned long)0x2 #define PFMLIB_SICORTEX_INPUT_SCB_HISTGTE (unsigned long)0x4 #define PFMLIB_SICORTEX_INPUT_SCB_BUCKET (unsigned long)0x8 typedef struct { unsigned long flags; pfmlib_sicortex_scb_counter_t pfp_sicortex_scb_counters[PMU_SICORTEX_SCB_NUM_COUNTERS]; pfmlib_sicortex_scb_t pfp_sicortex_scb_global; } pfmlib_sicortex_input_param_t; typedef struct { unsigned long reserved; } pfmlib_sicortex_output_param_t; /* CPU counter */ int pfm_sicortex_is_cpu(unsigned int i); /* SCB counter */ int pfm_sicortex_is_scb(unsigned int i); /* Reg 25 domain support */ int pfm_sicortex_support_domain(unsigned int i); /* VPC/PEA sampling support */ int pfm_sicortex_support_vpc_pea(unsigned int i); #ifdef __cplusplus /* extern C */ } #endif #endif /* __PFMLIB_SICORTEX_H__ */ papi-papi-7-2-0-t/src/libperfnec/include/perfmon/pfmlib_sparc.h000066400000000000000000000025151502707512200244460ustar00rootroot00000000000000/* * Sparc PMU specific types and definitions. * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_SPARC_H__ #define __PFMLIB_SPARC_H__ #include /* PFM_PLM0 = OS (supervisor) * PFM_PLM1 = Hypervisor * PFM_PLM2 = unused (ignored) * PFM_PLM3 = User level */ #endif /* __PFMLIB_SPARC_H__ */ papi-papi-7-2-0-t/src/libperfnec/lib/000077500000000000000000000000001502707512200173065ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libperfnec/lib/Makefile000066400000000000000000000132021502707512200207440ustar00rootroot00000000000000# # Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. # Contributed by Stephane Eranian # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # TOPDIR := $(shell if [ "$$PWD" != "" ]; then echo $$PWD; else pwd; fi)/..
include $(TOPDIR)/config.mk include $(TOPDIR)/rules.mk # # Common files # SRCS=pfmlib_common.c pfmlib_priv.c ifeq ($(SYS),Linux) SRCS += pfmlib_os_linux.c pfmlib_os_linux_v2.c ifneq ($(CONFIG_PFMLIB_OLD_PFMV2),y) SRCS += pfmlib_os_linux_v3.c endif endif CFLAGS+=-D_REENTRANT # # list all library support modules # ifeq ($(CONFIG_PFMLIB_ARCH_X86_64),y) INCARCH = $(INC_X86_64) SRCS += pfmlib_pentium4.c pfmlib_amd64.c pfmlib_core.c pfmlib_gen_ia32.c pfmlib_intel_atom.c \ pfmlib_intel_nhm.c CFLAGS += -DCONFIG_PFMLIB_ARCH_X86_64 endif ifeq ($(SYS),Linux) SLDFLAGS=-shared -Wl,-soname -Wl,$(VLIBPFM) SLIBPFM=libpfm.so.$(VERSION).$(REVISION).$(AGE) VLIBPFM=libpfm.so.$(VERSION) SOLIBEXT=so endif CFLAGS+=-I. ALIBPFM=libpfm.a TARGETS=$(ALIBPFM) ifeq ($(CONFIG_PFMLIB_SHARED),y) TARGETS += $(SLIBPFM) endif OBJS=$(SRCS:.c=.o) SOBJS=$(OBJS:.o=.lo) INC_COMMON= $(PFMINCDIR)/perfmon/pfmlib.h \ $(PFMINCDIR)/perfmon/pfmlib_comp.h \ $(PFMINCDIR)/perfmon/pfmlib_os.h \ $(PFMINCDIR)/perfmon/perfmon.h \ $(PFMINCDIR)/perfmon/perfmon_dfl_smpl.h \ pfmlib_priv.h pfmlib_priv_comp.h \ INC_IA64= $(PFMINCDIR)/perfmon/pfmlib_itanium.h \ $(PFMINCDIR)/perfmon/pfmlib_itanium2.h \ $(PFMINCDIR)/perfmon/pfmlib_montecito.h \ $(PFMINCDIR)/perfmon/perfmon_compat.h \ $(PFMINCDIR)/perfmon/perfmon_default_smpl.h \ $(PFMINCDIR)/perfmon/perfmon_ia64.h \ $(PFMINCDIR)/perfmon/pfmlib_comp_ia64.h \ $(PFMINCDIR)/perfmon/pfmlib_gen_ia64.h \ $(PFMINCDIR)/perfmon/pfmlib_os_ia64.h \ itanium_events.h itanium2_events.h montecito_events.h INC_IA32=$(PFMINCDIR)/perfmon/perfmon_pebs_core_smpl.h \ $(PFMINCDIR)/perfmon/perfmon_pebs_p4_smpl.h \ $(PFMINCDIR)/perfmon/pfmlib_pentium4.h \ $(PFMINCDIR)/perfmon/pfmlib_amd64.h \ $(PFMINCDIR)/perfmon/pfmlib_core.h \ $(PFMINCDIR)/perfmon/pfmlib_intel_atom.h \ $(PFMINCDIR)/perfmon/pfmlib_intel_nhm.h \ $(PFMINCDIR)/perfmon/pfmlib_i386_p6.h \ $(PFMINCDIR)/perfmon/pfmlib_gen_ia32.h \ $(PFMINCDIR)/perfmon/pfmlib_comp_i386.h \ $(PFMINCDIR)/perfmon/pfmlib_os_i386.h \ amd64_events.h 
i386_p6_events.h \ pentium4_events.h gen_ia32_events.h coreduo_events.h core_events.h \ intel_atom_events.h intel_corei7_events.h intel_corei7_unc_events.h INC_X86_64= $(PFMINCDIR)/perfmon/perfmon_pebs_core_smpl.h \ $(PFMINCDIR)/perfmon/perfmon_pebs_p4_smpl.h \ $(PFMINCDIR)/perfmon/pfmlib_amd64.h \ $(PFMINCDIR)/perfmon/pfmlib_core.h \ $(PFMINCDIR)/perfmon/pfmlib_intel_atom.h \ $(PFMINCDIR)/perfmon/pfmlib_intel_nhm.h \ $(PFMINCDIR)/perfmon/pfmlib_gen_ia32.h \ $(PFMINCDIR)/perfmon/pfmlib_pentium4.h \ $(PFMINCDIR)/perfmon/pfmlib_comp_x86_64.h \ $(PFMINCDIR)/perfmon/pfmlib_os_x86_64.h \ amd64_events.h pentium4_events.h gen_ia32_events.h core_events.h \ intel_atom_events.h intel_corei7_events.h intel_corei7_unc_events.h INC_MIPS64= $(PFMINCDIR)/perfmon/pfmlib_gen_mips64.h \ $(PFMINCDIR)/perfmon/pfmlib_comp_mips64.h \ $(PFMINCDIR)/perfmon/pfmlib_os_mips64.h \ gen_mips64_events.h INC_SICORTEX= $(INC_MIPS64) $(PFMINCDIR)/perfmon/pfmlib_sicortex.h INC_POWERPC= $(PFMINCDIR)/perfmon/pfmlib_powerpc.h \ $(PFMINCDIR)/perfmon/pfmlib_comp_powerpc.h \ $(PFMINCDIR)/perfmon/pfmlib_os_powerpc.h \ ppc970_events.h ppc970mp_events.h power4_events.h \ power5_events.h power5+_events.h power6_events.h \ power7_events.h powerpc_reg.h INC_SPARC= $(PFMINCDIR)/perfmon/pfmlib_sparc.h \ $(PFMINCDIR)/perfmon/pfmlib_comp_sparc.h \ $(PFMINCDIR)/perfmon/pfmlib_os_sparc.h \ ultra12_events.h ultra3_events.h ultra3plus_events.h ultra3i_events.h \ ultra4plus_events.h niagara1_events.h niagara2_events.h INC_CRAYX2= $(PFMINCDIR)/perfmon/pfmlib_crayx2.h \ crayx2_events.h pfmlib_crayx2_priv.h INC_CELL= $(PFMINCDIR)/perfmon/pfmlib_cell.h \ cell_events.h INCDEP=$(INC_COMMON) $(INCARCH) all: $(TARGETS) $(OBJS) $(SOBJS): $(TOPDIR)/config.mk $(TOPDIR)/rules.mk Makefile $(INCDEP) libpfm.a: $(OBJS) $(RM) $@ $(AR) cru $@ $(OBJS) $(SLIBPFM): $(SOBJS) $(CC) $(CFLAGS) $(SLDFLAGS) -o $@ $(SOBJS) $(LN) $@ $(VLIBPFM) $(LN) $@ libpfm.$(SOLIBEXT) clean: $(RM) -f *.o *.lo *.a *.so* *~ *.$(SOLIBEXT) distclean: clean depend: 
$(MKDEP) $(CFLAGS) $(SRCS) install: $(TARGETS) install: @echo building: $(TARGETS) -mkdir -p $(DESTDIR)$(LIBDIR) $(INSTALL) -m 644 $(ALIBPFM) $(DESTDIR)$(LIBDIR) ifeq ($(CONFIG_PFMLIB_SHARED),y) $(INSTALL) $(SLIBPFM) $(DESTDIR)$(LIBDIR) cd $(DESTDIR)$(LIBDIR); $(LN) $(SLIBPFM) $(VLIBPFM) cd $(DESTDIR)$(LIBDIR); $(LN) $(SLIBPFM) libpfm.$(SOLIBEXT) endif papi-papi-7-2-0-t/src/libperfnec/lib/amd64_events.h000066400000000000000000000046631502707512200217670ustar00rootroot00000000000000/* * Copyright (c) 2006, 2007 Advanced Micro Devices, Inc. * Contributed by Ray Bryant * Contributed by Robert Richter * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
*/ #include "amd64_events_k7.h" #include "amd64_events_k8.h" #include "amd64_events_fam10h.h" #include "amd64_events_fam15h.h" struct pme_amd64_table { unsigned int num; pme_amd64_entry_t *events; unsigned int cpu_clks; unsigned int ret_inst; }; static struct pme_amd64_table amd64_k7_table = { .num = PME_AMD64_K7_EVENT_COUNT, .events = amd64_k7_pe, .cpu_clks = PME_AMD64_K7_CPU_CLK_UNHALTED, .ret_inst = PME_AMD64_K7_RETIRED_INSTRUCTIONS, }; static struct pme_amd64_table amd64_k8_table = { .num = PME_AMD64_K8_EVENT_COUNT, .events = amd64_k8_pe, .cpu_clks = PME_AMD64_K8_CPU_CLK_UNHALTED, .ret_inst = PME_AMD64_K8_RETIRED_INSTRUCTIONS, }; static struct pme_amd64_table amd64_fam10h_table = { .num = PME_AMD64_FAM10H_EVENT_COUNT, .events = amd64_fam10h_pe, .cpu_clks = PME_AMD64_FAM10H_CPU_CLK_UNHALTED, .ret_inst = PME_AMD64_FAM10H_RETIRED_INSTRUCTIONS, }; static struct pme_amd64_table amd64_fam15h_table = { .num = PME_AMD64_FAM15H_EVENT_COUNT, .events = amd64_fam15h_pe, .cpu_clks = PME_AMD64_FAM15H_CPU_CLK_UNHALTED, .ret_inst = PME_AMD64_FAM15H_RETIRED_INSTRUCTIONS, }; papi-papi-7-2-0-t/src/libperfnec/lib/amd64_events_fam10h.h000066400000000000000000002125211502707512200231150ustar00rootroot00000000000000/* * Copyright (c) 2007 Advanced Micro Devices, Inc. * Contributed by Robert Richter * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ /* History * * Feb 06 2009 -- Robert Richter, robert.richter@amd.com: * * Update for Family 10h RevD (Istanbul) from: BIOS and Kernel * Developer's Guide (BKDG) For AMD Family 10h Processors, 31116 Rev * 3.20 - February 04, 2009 * * Update for Family 10h RevC (Shanghai) from: BIOS and Kernel * Developer's Guide (BKDG) For AMD Family 10h Processors, 31116 Rev * 3.20 - February 04, 2009 * * * Dec 12 2007 -- Robert Richter, robert.richter@amd.com: * * Created from: BIOS and Kernel Developer's Guide (BKDG) For AMD * Family 10h Processors, 31116 Rev 3.00 - September 07, 2007 */ static pme_amd64_entry_t amd64_fam10h_pe[]={ /* Family 10h RevB, Barcelona */ /* 0 */{.pme_name = "DISPATCHED_FPU", .pme_code = 0x00, .pme_desc = "Dispatched FPU Operations", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 7, .pme_umasks = { { .pme_uname = "OPS_ADD", .pme_udesc = "Add pipe ops excluding load ops and SSE move ops", .pme_ucode = 0x01, }, { .pme_uname = "OPS_MULTIPLY", .pme_udesc = "Multiply pipe ops excluding load ops and SSE move ops", .pme_ucode = 0x02, }, { .pme_uname = "OPS_STORE", .pme_udesc = "Store pipe ops excluding load ops and SSE move ops", .pme_ucode = 0x04, }, { .pme_uname = "OPS_ADD_PIPE_LOAD_OPS", .pme_udesc = "Add pipe load ops and SSE move ops", .pme_ucode = 0x08, }, { .pme_uname = "OPS_MULTIPLY_PIPE_LOAD_OPS", .pme_udesc = "Multiply pipe load ops and SSE move ops", .pme_ucode = 0x10, }, { 
.pme_uname = "OPS_STORE_PIPE_LOAD_OPS", .pme_udesc = "Store pipe load ops and SSE move ops", .pme_ucode = 0x20, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x3F, }, }, }, /* 1 */{.pme_name = "CYCLES_NO_FPU_OPS_RETIRED", .pme_code = 0x01, .pme_desc = "Cycles in which the FPU is Empty", }, /* 2 */{.pme_name = "DISPATCHED_FPU_OPS_FAST_FLAG", .pme_code = 0x02, .pme_desc = "Dispatched Fast Flag FPU Operations", }, /* 3 */{.pme_name = "RETIRED_SSE_OPERATIONS", .pme_code = 0x03, .pme_desc = "Retired SSE Operations", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 8, .pme_umasks = { { .pme_uname = "SINGLE_ADD_SUB_OPS", .pme_udesc = "Single precision add/subtract ops", .pme_ucode = 0x01, }, { .pme_uname = "SINGLE_MUL_OPS", .pme_udesc = "Single precision multiply ops", .pme_ucode = 0x02, }, { .pme_uname = "SINGLE_DIV_OPS", .pme_udesc = "Single precision divide/square root ops", .pme_ucode = 0x04, }, { .pme_uname = "DOUBLE_ADD_SUB_OPS", .pme_udesc = "Double precision add/subtract ops", .pme_ucode = 0x08, }, { .pme_uname = "DOUBLE_MUL_OPS", .pme_udesc = "Double precision multiply ops", .pme_ucode = 0x10, }, { .pme_uname = "DOUBLE_DIV_OPS", .pme_udesc = "Double precision divide/square root ops", .pme_ucode = 0x20, }, { .pme_uname = "OP_TYPE", .pme_udesc = "Op type: 0=uops. 
1=FLOPS", .pme_ucode = 0x40, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x7F, }, }, }, /* 4 */{.pme_name = "RETIRED_MOVE_OPS", .pme_code = 0x04, .pme_desc = "Retired Move Ops", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "LOW_QW_MOVE_UOPS", .pme_udesc = "Merging low quadword move uops", .pme_ucode = 0x01, }, { .pme_uname = "HIGH_QW_MOVE_UOPS", .pme_udesc = "Merging high quadword move uops", .pme_ucode = 0x02, }, { .pme_uname = "ALL_OTHER_MERGING_MOVE_UOPS", .pme_udesc = "All other merging move uops", .pme_ucode = 0x04, }, { .pme_uname = "ALL_OTHER_MOVE_UOPS", .pme_udesc = "All other move uops", .pme_ucode = 0x08, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 5 */{.pme_name = "RETIRED_SERIALIZING_OPS", .pme_code = 0x05, .pme_desc = "Retired Serializing Ops", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "SSE_BOTTOM_EXECUTING_UOPS", .pme_udesc = "SSE bottom-executing uops retired", .pme_ucode = 0x01, }, { .pme_uname = "SSE_BOTTOM_SERIALIZING_UOPS", .pme_udesc = "SSE bottom-serializing uops retired", .pme_ucode = 0x02, }, { .pme_uname = "X87_BOTTOM_EXECUTING_UOPS", .pme_udesc = "x87 bottom-executing uops retired", .pme_ucode = 0x04, }, { .pme_uname = "X87_BOTTOM_SERIALIZING_UOPS", .pme_udesc = "x87 bottom-serializing uops retired", .pme_ucode = 0x08, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 6 */{.pme_name = "FP_SCHEDULER_CYCLES", .pme_code = 0x06, .pme_desc = "Number of Cycles that a Serializing uop is in the FP Scheduler", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "BOTTOM_EXECUTE_CYCLES", .pme_udesc = "Number of cycles a bottom-execute uop is in the FP scheduler", .pme_ucode = 0x01, }, { .pme_uname = "BOTTOM_SERIALIZING_CYCLES", .pme_udesc = "Number of cycles a bottom-serializing uop is in the FP 
scheduler", .pme_ucode = 0x02, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, }, }, }, /* 7 */{.pme_name = "SEGMENT_REGISTER_LOADS", .pme_code = 0x20, .pme_desc = "Segment Register Loads", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 8, .pme_umasks = { { .pme_uname = "ES", .pme_udesc = "ES", .pme_ucode = 0x01, }, { .pme_uname = "CS", .pme_udesc = "CS", .pme_ucode = 0x02, }, { .pme_uname = "SS", .pme_udesc = "SS", .pme_ucode = 0x04, }, { .pme_uname = "DS", .pme_udesc = "DS", .pme_ucode = 0x08, }, { .pme_uname = "FS", .pme_udesc = "FS", .pme_ucode = 0x10, }, { .pme_uname = "GS", .pme_udesc = "GS", .pme_ucode = 0x20, }, { .pme_uname = "HS", .pme_udesc = "HS", .pme_ucode = 0x40, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x7F, }, }, }, /* 8 */{.pme_name = "PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE", .pme_code = 0x21, .pme_desc = "Pipeline Restart Due to Self-Modifying Code", }, /* 9 */{.pme_name = "PIPELINE_RESTART_DUE_TO_PROBE_HIT", .pme_code = 0x22, .pme_desc = "Pipeline Restart Due to Probe Hit", }, /* 10 */{.pme_name = "LS_BUFFER_2_FULL_CYCLES", .pme_code = 0x23, .pme_desc = "LS Buffer 2 Full", }, /* 11 */{.pme_name = "LOCKED_OPS", .pme_code = 0x24, .pme_desc = "Locked Operations", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "EXECUTED", .pme_udesc = "The number of locked instructions executed", .pme_ucode = 0x01, }, { .pme_uname = "CYCLES_SPECULATIVE_PHASE", .pme_udesc = "The number of cycles spent in speculative phase", .pme_ucode = 0x02, }, { .pme_uname = "CYCLES_NON_SPECULATIVE_PHASE", .pme_udesc = "The number of cycles spent in non-speculative phase (including cache miss penalty)", .pme_ucode = 0x04, }, { .pme_uname = "CYCLES_WAITING", .pme_udesc = "The number of cycles waiting for a cache hit (cache miss penalty).", .pme_ucode = 0x08, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 12 
*/{.pme_name = "RETIRED_CLFLUSH_INSTRUCTIONS", .pme_code = 0x26, .pme_desc = "Retired CLFLUSH Instructions", }, /* 13 */{.pme_name = "RETIRED_CPUID_INSTRUCTIONS", .pme_code = 0x27, .pme_desc = "Retired CPUID Instructions", }, /* 14 */{.pme_name = "CANCELLED_STORE_TO_LOAD_FORWARD_OPERATIONS", .pme_code = 0x2A, .pme_desc = "Cancelled Store to Load Forward Operations", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "ADDRESS_MISMATCHES", .pme_udesc = "Address mismatches (starting byte not the same).", .pme_ucode = 0x01, }, { .pme_uname = "STORE_IS_SMALLER_THAN_LOAD", .pme_udesc = "Store is smaller than load.", .pme_ucode = 0x02, }, { .pme_uname = "MISALIGNED", .pme_udesc = "Misaligned.", .pme_ucode = 0x04, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 15 */{.pme_name = "SMIS_RECEIVED", .pme_code = 0x2B, .pme_desc = "SMIs Received", }, /* 16 */{.pme_name = "DATA_CACHE_ACCESSES", .pme_code = 0x40, .pme_desc = "Data Cache Accesses", }, /* 17 */{.pme_name = "DATA_CACHE_MISSES", .pme_code = 0x41, .pme_desc = "Data Cache Misses", }, /* 18 */{.pme_name = "DATA_CACHE_REFILLS", .pme_code = 0x42, .pme_desc = "Data Cache Refills from L2 or Northbridge", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "SYSTEM", .pme_udesc = "Refill from the Northbridge", .pme_ucode = 0x01, }, { .pme_uname = "L2_SHARED", .pme_udesc = "Shared-state line from L2", .pme_ucode = 0x02, }, { .pme_uname = "L2_EXCLUSIVE", .pme_udesc = "Exclusive-state line from L2", .pme_ucode = 0x04, }, { .pme_uname = "L2_OWNED", .pme_udesc = "Owned-state line from L2", .pme_ucode = 0x08, }, { .pme_uname = "L2_MODIFIED", .pme_udesc = "Modified-state line from L2", .pme_ucode = 0x10, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x1F, }, }, }, /* 19 */{.pme_name = "DATA_CACHE_REFILLS_FROM_SYSTEM", .pme_code = 0x43, .pme_desc = "Data Cache Refills from the 
Northbridge", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "INVALID", .pme_udesc = "Invalid", .pme_ucode = 0x01, }, { .pme_uname = "SHARED", .pme_udesc = "Shared", .pme_ucode = 0x02, }, { .pme_uname = "EXCLUSIVE", .pme_udesc = "Exclusive", .pme_ucode = 0x04, }, { .pme_uname = "OWNED", .pme_udesc = "Owned", .pme_ucode = 0x08, }, { .pme_uname = "MODIFIED", .pme_udesc = "Modified", .pme_ucode = 0x10, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x1F, }, }, }, /* 20 */{.pme_name = "DATA_CACHE_LINES_EVICTED", .pme_code = 0x44, .pme_desc = "Data Cache Lines Evicted", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 8, .pme_umasks = { { .pme_uname = "INVALID", .pme_udesc = "Invalid", .pme_ucode = 0x01, }, { .pme_uname = "SHARED", .pme_udesc = "Shared", .pme_ucode = 0x02, }, { .pme_uname = "EXCLUSIVE", .pme_udesc = "Exclusive", .pme_ucode = 0x04, }, { .pme_uname = "OWNED", .pme_udesc = "Owned", .pme_ucode = 0x08, }, { .pme_uname = "MODIFIED", .pme_udesc = "Modified", .pme_ucode = 0x10, }, { .pme_uname = "BY_PREFETCHNTA", .pme_udesc = "Cache line evicted was brought into the cache with by a PrefetchNTA instruction.", .pme_ucode = 0x20, }, { .pme_uname = "NOT_BY_PREFETCHNTA", .pme_udesc = "Cache line evicted was not brought into the cache with by a PrefetchNTA instruction.", .pme_ucode = 0x40, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x7F, }, }, }, /* 21 */{.pme_name = "L1_DTLB_MISS_AND_L2_DTLB_HIT", .pme_code = 0x45, .pme_desc = "L1 DTLB Miss and L2 DTLB Hit", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "L2_4K_TLB_HIT", .pme_udesc = "L2 4K TLB hit", .pme_ucode = 0x01, }, { .pme_uname = "L2_2M_TLB_HIT", .pme_udesc = "L2 2M TLB hit", .pme_ucode = 0x02, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, .pme_uflags = PFMLIB_AMD64_TILL_FAM10H_REV_B, }, { .pme_uname = "L2_1G_TLB_HIT", 
.pme_udesc = "L2 1G TLB hit", .pme_ucode = 0x04, .pme_uflags = PFMLIB_AMD64_FAM10H_REV_C, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, .pme_uflags = PFMLIB_AMD64_FAM10H_REV_C, }, }, }, /* 22 */{.pme_name = "L1_DTLB_AND_L2_DTLB_MISS", .pme_code = 0x46, .pme_desc = "L1 DTLB and L2 DTLB Miss", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "4K_TLB_RELOAD", .pme_udesc = "4K TLB reload", .pme_ucode = 0x01, }, { .pme_uname = "2M_TLB_RELOAD", .pme_udesc = "2M TLB reload", .pme_ucode = 0x02, }, { .pme_uname = "1G_TLB_RELOAD", .pme_udesc = "1G TLB reload", .pme_ucode = 0x04, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 23 */{.pme_name = "MISALIGNED_ACCESSES", .pme_code = 0x47, .pme_desc = "Misaligned Accesses", }, /* 24 */{.pme_name = "MICROARCHITECTURAL_LATE_CANCEL_OF_AN_ACCESS", .pme_code = 0x48, .pme_desc = "Microarchitectural Late Cancel of an Access", }, /* 25 */{.pme_name = "MICROARCHITECTURAL_EARLY_CANCEL_OF_AN_ACCESS", .pme_code = 0x49, .pme_desc = "Microarchitectural Early Cancel of an Access", }, /* 26 */{.pme_name = "SCRUBBER_SINGLE_BIT_ECC_ERRORS", .pme_code = 0x4A, .pme_desc = "Single-bit ECC Errors Recorded by Scrubber", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "SCRUBBER_ERROR", .pme_udesc = "Scrubber error", .pme_ucode = 0x01, }, { .pme_uname = "PIGGYBACK_ERROR", .pme_udesc = "Piggyback scrubber errors", .pme_ucode = 0x02, }, { .pme_uname = "LOAD_PIPE_ERROR", .pme_udesc = "Load pipe error", .pme_ucode = 0x04, }, { .pme_uname = "STORE_WRITE_PIPE_ERROR", .pme_udesc = "Store write pipe error", .pme_ucode = 0x08, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 27 */{.pme_name = "PREFETCH_INSTRUCTIONS_DISPATCHED", .pme_code = 0x4B, .pme_desc = "Prefetch Instructions Dispatched", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, 
.pme_umasks = { { .pme_uname = "LOAD", .pme_udesc = "Load (Prefetch, PrefetchT0/T1/T2)", .pme_ucode = 0x01, }, { .pme_uname = "STORE", .pme_udesc = "Store (PrefetchW)", .pme_ucode = 0x02, }, { .pme_uname = "NTA", .pme_udesc = "NTA (PrefetchNTA)", .pme_ucode = 0x04, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 28 */{.pme_name = "DCACHE_MISSES_BY_LOCKED_INSTRUCTIONS", .pme_code = 0x4C, .pme_desc = "DCACHE Misses by Locked Instructions", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 2, .pme_umasks = { { .pme_uname = "DATA_CACHE_MISSES_BY_LOCKED_INSTRUCTIONS", .pme_udesc = "Data cache misses by locked instructions", .pme_ucode = 0x02, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x02, }, }, }, /* 29 */{.pme_name = "L1_DTLB_HIT", .pme_code = 0x4D, .pme_desc = "L1 DTLB Hit", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "L1_4K_TLB_HIT", .pme_udesc = "L1 4K TLB hit", .pme_ucode = 0x01, }, { .pme_uname = "L1_2M_TLB_HIT", .pme_udesc = "L1 2M TLB hit", .pme_ucode = 0x02, }, { .pme_uname = "L1_1G_TLB_HIT", .pme_udesc = "L1 1G TLB hit", .pme_ucode = 0x04, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 30 */{.pme_name = "INEFFECTIVE_SW_PREFETCHES", .pme_code = 0x52, .pme_desc = "Ineffective Software Prefetches", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "SW_PREFETCH_HIT_IN_L1", .pme_udesc = "Software prefetch hit in the L1.", .pme_ucode = 0x01, }, { .pme_uname = "SW_PREFETCH_HIT_IN_L2", .pme_udesc = "Software prefetch hit in L2.", .pme_ucode = 0x08, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x09, }, }, }, /* 31 */{.pme_name = "GLOBAL_TLB_FLUSHES", .pme_code = 0x54, .pme_desc = "Global TLB Flushes", }, /* 32 */{.pme_name = "MEMORY_REQUESTS", .pme_code = 0x65, .pme_desc = "Memory Requests by Type", .pme_flags = 
PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "NON_CACHEABLE", .pme_udesc = "Requests to non-cacheable (UC) memory", .pme_ucode = 0x01, }, { .pme_uname = "WRITE_COMBINING", .pme_udesc = "Requests to write-combining (WC) memory or WC buffer flushes to WB memory", .pme_ucode = 0x02, }, { .pme_uname = "STREAMING_STORE", .pme_udesc = "Streaming store (SS) requests", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x83, }, }, }, /* 33 */{.pme_name = "DATA_PREFETCHES", .pme_code = 0x67, .pme_desc = "Data Prefetcher", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "CANCELLED", .pme_udesc = "Cancelled prefetches", .pme_ucode = 0x01, }, { .pme_uname = "ATTEMPTED", .pme_udesc = "Prefetch attempts", .pme_ucode = 0x02, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, }, }, }, /* 34 */{.pme_name = "SYSTEM_READ_RESPONSES", .pme_code = 0x6C, .pme_desc = "Northbridge Read Responses by Coherency State", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "EXCLUSIVE", .pme_udesc = "Exclusive", .pme_ucode = 0x01, }, { .pme_uname = "MODIFIED", .pme_udesc = "Modified", .pme_ucode = 0x02, }, { .pme_uname = "SHARED", .pme_udesc = "Shared", .pme_ucode = 0x04, }, { .pme_uname = "OWNED", .pme_udesc = "Owned", .pme_ucode = 0x08, }, { .pme_uname = "DATA_ERROR", .pme_udesc = "Data Error", .pme_ucode = 0x10, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x1F, }, }, }, /* 35 */{.pme_name = "QUADWORDS_WRITTEN_TO_SYSTEM", .pme_code = 0x6D, .pme_desc = "Octwords Written to System", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 2, .pme_umasks = { { .pme_uname = "QUADWORD_WRITE_TRANSFER", .pme_udesc = "Octword write transfer", .pme_ucode = 0x01, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x01, }, }, }, /* 36 */{.pme_name = 
"CPU_CLK_UNHALTED", .pme_code = 0x76, .pme_desc = "CPU Clocks not Halted", }, /* 37 */{.pme_name = "REQUESTS_TO_L2", .pme_code = 0x7D, .pme_desc = "Requests to L2 Cache", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 7, .pme_umasks = { { .pme_uname = "INSTRUCTIONS", .pme_udesc = "IC fill", .pme_ucode = 0x01, }, { .pme_uname = "DATA", .pme_udesc = "DC fill", .pme_ucode = 0x02, }, { .pme_uname = "TLB_WALK", .pme_udesc = "TLB fill (page table walks)", .pme_ucode = 0x04, }, { .pme_uname = "SNOOP", .pme_udesc = "Tag snoop request", .pme_ucode = 0x08, }, { .pme_uname = "CANCELLED", .pme_udesc = "Cancelled request", .pme_ucode = 0x10, }, { .pme_uname = "HW_PREFETCH_FROM_DC", .pme_udesc = "Hardware prefetch from DC", .pme_ucode = 0x20, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x3F, }, }, }, /* 38 */{.pme_name = "L2_CACHE_MISS", .pme_code = 0x7E, .pme_desc = "L2 Cache Misses", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "INSTRUCTIONS", .pme_udesc = "IC fill", .pme_ucode = 0x01, }, { .pme_uname = "DATA", .pme_udesc = "DC fill (includes possible replays, whereas EventSelect 041h does not)", .pme_ucode = 0x02, }, { .pme_uname = "TLB_WALK", .pme_udesc = "TLB page table walk", .pme_ucode = 0x04, }, { .pme_uname = "HW_PREFETCH_FROM_DC", .pme_udesc = "Hardware prefetch from DC", .pme_ucode = 0x08, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 39 */{.pme_name = "L2_FILL_WRITEBACK", .pme_code = 0x7F, .pme_desc = "L2 Fill/Writeback", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "L2_FILLS", .pme_udesc = "L2 fills (victims from L1 caches, TLB page table walks and data prefetches)", .pme_ucode = 0x01, }, { .pme_uname = "L2_WRITEBACKS", .pme_udesc = "L2 Writebacks to system.", .pme_ucode = 0x02, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, }, }, }, /* 40 */{.pme_name = 
"INSTRUCTION_CACHE_FETCHES", .pme_code = 0x80, .pme_desc = "Instruction Cache Fetches", }, /* 41 */{.pme_name = "INSTRUCTION_CACHE_MISSES", .pme_code = 0x81, .pme_desc = "Instruction Cache Misses", }, /* 42 */{.pme_name = "INSTRUCTION_CACHE_REFILLS_FROM_L2", .pme_code = 0x82, .pme_desc = "Instruction Cache Refills from L2", }, /* 43 */{.pme_name = "INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM", .pme_code = 0x83, .pme_desc = "Instruction Cache Refills from System", }, /* 44 */{.pme_name = "L1_ITLB_MISS_AND_L2_ITLB_HIT", .pme_code = 0x84, .pme_desc = "L1 ITLB Miss and L2 ITLB Hit", }, /* 45 */{.pme_name = "L1_ITLB_MISS_AND_L2_ITLB_MISS", .pme_code = 0x85, .pme_desc = "L1 ITLB Miss and L2 ITLB Miss", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "4K_PAGE_FETCHES", .pme_udesc = "Instruction fetches to a 4K page.", .pme_ucode = 0x01, }, { .pme_uname = "2M_PAGE_FETCHES", .pme_udesc = "Instruction fetches to a 2M page.", .pme_ucode = 0x02, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, }, }, }, /* 46 */{.pme_name = "PIPELINE_RESTART_DUE_TO_INSTRUCTION_STREAM_PROBE", .pme_code = 0x86, .pme_desc = "Pipeline Restart Due to Instruction Stream Probe", }, /* 47 */{.pme_name = "INSTRUCTION_FETCH_STALL", .pme_code = 0x87, .pme_desc = "Instruction Fetch Stall", }, /* 48 */{.pme_name = "RETURN_STACK_HITS", .pme_code = 0x88, .pme_desc = "Return Stack Hits", }, /* 49 */{.pme_name = "RETURN_STACK_OVERFLOWS", .pme_code = 0x89, .pme_desc = "Return Stack Overflows", }, /* 50 */{.pme_name = "INSTRUCTION_CACHE_VICTIMS", .pme_code = 0x8B, .pme_desc = "Instruction Cache Victims", }, /* 51 */{.pme_name = "INSTRUCTION_CACHE_LINES_INVALIDATED", .pme_code = 0x8C, .pme_desc = "Instruction Cache Lines Invalidated", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "INVALIDATING_PROBE_NO_IN_FLIGHT", .pme_udesc = "Invalidating probe that did not hit any in-flight instructions.", .pme_ucode = 
0x01, }, { .pme_uname = "INVALIDATING_PROBE_ONE_OR_MORE_IN_FLIGHT", .pme_udesc = "Invalidating probe that hit one or more in-flight instructions.", .pme_ucode = 0x02, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, }, }, }, /* 52 */{.pme_name = "ITLB_RELOADS", .pme_code = 0x99, .pme_desc = "ITLB Reloads", }, /* 53 */{.pme_name = "ITLB_RELOADS_ABORTED", .pme_code = 0x9A, .pme_desc = "ITLB Reloads Aborted", }, /* 54 */{.pme_name = "RETIRED_INSTRUCTIONS", .pme_code = 0xC0, .pme_desc = "Retired Instructions", }, /* 55 */{.pme_name = "RETIRED_UOPS", .pme_code = 0xC1, .pme_desc = "Retired uops", }, /* 56 */{.pme_name = "RETIRED_BRANCH_INSTRUCTIONS", .pme_code = 0xC2, .pme_desc = "Retired Branch Instructions", }, /* 57 */{.pme_name = "RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS", .pme_code = 0xC3, .pme_desc = "Retired Mispredicted Branch Instructions", }, /* 58 */{.pme_name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .pme_code = 0xC4, .pme_desc = "Retired Taken Branch Instructions", }, /* 59 */{.pme_name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .pme_code = 0xC5, .pme_desc = "Retired Taken Branch Instructions Mispredicted", }, /* 60 */{.pme_name = "RETIRED_FAR_CONTROL_TRANSFERS", .pme_code = 0xC6, .pme_desc = "Retired Far Control Transfers", }, /* 61 */{.pme_name = "RETIRED_BRANCH_RESYNCS", .pme_code = 0xC7, .pme_desc = "Retired Branch Resyncs", }, /* 62 */{.pme_name = "RETIRED_NEAR_RETURNS", .pme_code = 0xC8, .pme_desc = "Retired Near Returns", }, /* 63 */{.pme_name = "RETIRED_NEAR_RETURNS_MISPREDICTED", .pme_code = 0xC9, .pme_desc = "Retired Near Returns Mispredicted", }, /* 64 */{.pme_name = "RETIRED_INDIRECT_BRANCHES_MISPREDICTED", .pme_code = 0xCA, .pme_desc = "Retired Indirect Branches Mispredicted", }, /* 65 */{.pme_name = "RETIRED_MMX_AND_FP_INSTRUCTIONS", .pme_code = 0xCB, .pme_desc = "Retired MMX/FP Instructions", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "X87", .pme_udesc = "x87 
instructions", .pme_ucode = 0x01, }, { .pme_uname = "MMX_AND_3DNOW", .pme_udesc = "MMX and 3DNow! instructions", .pme_ucode = 0x02, }, { .pme_uname = "PACKED_SSE_AND_SSE2", .pme_udesc = "SSE instructions (SSE, SSE2, SSE3, and SSE4A)", .pme_ucode = 0x04, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 66 */{.pme_name = "RETIRED_FASTPATH_DOUBLE_OP_INSTRUCTIONS", .pme_code = 0xCC, .pme_desc = "Retired Fastpath Double Op Instructions", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "POSITION_0", .pme_udesc = "With low op in position 0", .pme_ucode = 0x01, }, { .pme_uname = "POSITION_1", .pme_udesc = "With low op in position 1", .pme_ucode = 0x02, }, { .pme_uname = "POSITION_2", .pme_udesc = "With low op in position 2", .pme_ucode = 0x04, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 67 */{.pme_name = "INTERRUPTS_MASKED_CYCLES", .pme_code = 0xCD, .pme_desc = "Interrupts-Masked Cycles", }, /* 68 */{.pme_name = "INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING", .pme_code = 0xCE, .pme_desc = "Interrupts-Masked Cycles with Interrupt Pending", }, /* 69 */{.pme_name = "INTERRUPTS_TAKEN", .pme_code = 0xCF, .pme_desc = "Interrupts Taken", }, /* 70 */{.pme_name = "DECODER_EMPTY", .pme_code = 0xD0, .pme_desc = "Decoder Empty", }, /* 71 */{.pme_name = "DISPATCH_STALLS", .pme_code = 0xD1, .pme_desc = "Dispatch Stalls", }, /* 72 */{.pme_name = "DISPATCH_STALL_FOR_BRANCH_ABORT", .pme_code = 0xD2, .pme_desc = "Dispatch Stall for Branch Abort to Retire", }, /* 73 */{.pme_name = "DISPATCH_STALL_FOR_SERIALIZATION", .pme_code = 0xD3, .pme_desc = "Dispatch Stall for Serialization", }, /* 74 */{.pme_name = "DISPATCH_STALL_FOR_SEGMENT_LOAD", .pme_code = 0xD4, .pme_desc = "Dispatch Stall for Segment Load", }, /* 75 */{.pme_name = "DISPATCH_STALL_FOR_REORDER_BUFFER_FULL", .pme_code = 0xD5, .pme_desc = "Dispatch Stall for Reorder Buffer Full", }, /* 76 
*/{.pme_name = "DISPATCH_STALL_FOR_RESERVATION_STATION_FULL", .pme_code = 0xD6, .pme_desc = "Dispatch Stall for Reservation Station Full", }, /* 77 */{.pme_name = "DISPATCH_STALL_FOR_FPU_FULL", .pme_code = 0xD7, .pme_desc = "Dispatch Stall for FPU Full", }, /* 78 */{.pme_name = "DISPATCH_STALL_FOR_LS_FULL", .pme_code = 0xD8, .pme_desc = "Dispatch Stall for LS Full", }, /* 79 */{.pme_name = "DISPATCH_STALL_WAITING_FOR_ALL_QUIET", .pme_code = 0xD9, .pme_desc = "Dispatch Stall Waiting for All Quiet", }, /* 80 */{.pme_name = "DISPATCH_STALL_FOR_FAR_TRANSFER_OR_RSYNC", .pme_code = 0xDA, .pme_desc = "Dispatch Stall for Far Transfer or Resync to Retire", }, /* 81 */{.pme_name = "FPU_EXCEPTIONS", .pme_code = 0xDB, .pme_desc = "FPU Exceptions", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "X87_RECLASS_MICROFAULTS", .pme_udesc = "x87 reclass microfaults", .pme_ucode = 0x01, }, { .pme_uname = "SSE_RETYPE_MICROFAULTS", .pme_udesc = "SSE retype microfaults", .pme_ucode = 0x02, }, { .pme_uname = "SSE_RECLASS_MICROFAULTS", .pme_udesc = "SSE reclass microfaults", .pme_ucode = 0x04, }, { .pme_uname = "SSE_AND_X87_MICROTRAPS", .pme_udesc = "SSE and x87 microtraps", .pme_ucode = 0x08, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 82 */{.pme_name = "DR0_BREAKPOINT_MATCHES", .pme_code = 0xDC, .pme_desc = "DR0 Breakpoint Matches", }, /* 83 */{.pme_name = "DR1_BREAKPOINT_MATCHES", .pme_code = 0xDD, .pme_desc = "DR1 Breakpoint Matches", }, /* 84 */{.pme_name = "DR2_BREAKPOINT_MATCHES", .pme_code = 0xDE, .pme_desc = "DR2 Breakpoint Matches", }, /* 85 */{.pme_name = "DR3_BREAKPOINT_MATCHES", .pme_code = 0xDF, .pme_desc = "DR3 Breakpoint Matches", }, /* 86 */{.pme_name = "DRAM_ACCESSES_PAGE", .pme_code = 0xE0, .pme_desc = "DRAM Accesses", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 7, .pme_umasks = { { .pme_uname = "HIT", .pme_udesc = "DCT0 Page hit", .pme_ucode = 0x01, }, { .pme_uname = 
"MISS", .pme_udesc = "DCT0 Page Miss", .pme_ucode = 0x02, }, { .pme_uname = "CONFLICT", .pme_udesc = "DCT0 Page Conflict", .pme_ucode = 0x04, }, { .pme_uname = "DCT1_PAGE_HIT", .pme_udesc = "DCT1 Page hit", .pme_ucode = 0x08, }, { .pme_uname = "DCT1_PAGE_MISS", .pme_udesc = "DCT1 Page Miss", .pme_ucode = 0x10, }, { .pme_uname = "DCT1_PAGE_CONFLICT", .pme_udesc = "DCT1 Page Conflict", .pme_ucode = 0x20, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x3F, }, }, }, /* 87 */{.pme_name = "MEMORY_CONTROLLER_PAGE_TABLE_OVERFLOWS", .pme_code = 0xE1, .pme_desc = "DRAM Controller Page Table Overflows", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "DCT0_PAGE_TABLE_OVERFLOW", .pme_udesc = "DCT0 Page Table Overflow", .pme_ucode = 0x01, }, { .pme_uname = "DCT1_PAGE_TABLE_OVERFLOW", .pme_udesc = "DCT1 Page Table Overflow", .pme_ucode = 0x02, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, }, }, }, /* 88 */{.pme_name = "MEMORY_CONTROLLER_SLOT_MISSES", .pme_code = 0xE2, .pme_desc = "Memory Controller DRAM Command Slots Missed", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "DCT0_COMMAND_SLOTS_MISSED", .pme_udesc = "DCT0 Command Slots Missed", .pme_ucode = 0x01, }, { .pme_uname = "DCT1_COMMAND_SLOTS_MISSED", .pme_udesc = "DCT1 Command Slots Missed", .pme_ucode = 0x02, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, }, }, }, /* 89 */{.pme_name = "MEMORY_CONTROLLER_TURNAROUNDS", .pme_code = 0xE3, .pme_desc = "Memory Controller Turnarounds", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 7, .pme_umasks = { { .pme_uname = "CHIP_SELECT", .pme_udesc = "DCT0 DIMM (chip select) turnaround", .pme_ucode = 0x01, }, { .pme_uname = "READ_TO_WRITE", .pme_udesc = "DCT0 Read to write turnaround", .pme_ucode = 0x02, }, { .pme_uname = "WRITE_TO_READ", .pme_udesc = "DCT0 Write to read turnaround", .pme_ucode 
= 0x04, }, { .pme_uname = "DCT1_DIMM", .pme_udesc = "DCT1 DIMM (chip select) turnaround", .pme_ucode = 0x08, }, { .pme_uname = "DCT1_READ_TO_WRITE_TURNAROUND", .pme_udesc = "DCT1 Read to write turnaround", .pme_ucode = 0x10, }, { .pme_uname = "DCT1_WRITE_TO_READ_TURNAROUND", .pme_udesc = "DCT1 Write to read turnaround", .pme_ucode = 0x20, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x3F, }, }, }, /* 90 */{.pme_name = "MEMORY_CONTROLLER_BYPASS", .pme_code = 0xE4, .pme_desc = "Memory Controller Bypass Counter Saturation", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "HIGH_PRIORITY", .pme_udesc = "Memory controller high priority bypass", .pme_ucode = 0x01, }, { .pme_uname = "LOW_PRIORITY", .pme_udesc = "Memory controller medium priority bypass", .pme_ucode = 0x02, }, { .pme_uname = "DRAM_INTERFACE", .pme_udesc = "DCT0 DCQ bypass", .pme_ucode = 0x04, }, { .pme_uname = "DRAM_QUEUE", .pme_udesc = "DCT1 DCQ bypass", .pme_ucode = 0x08, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 91 */{.pme_name = "THERMAL_STATUS_AND_ECC_ERRORS", .pme_code = 0xE8, .pme_desc = "Thermal Status", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "CLKS_DIE_TEMP_TOO_HIGH", .pme_udesc = "Number of times the HTC trip point is crossed", .pme_ucode = 0x04, }, { .pme_uname = "CLKS_TEMP_THRESHOLD_EXCEEDED", .pme_udesc = "Number of clocks when STC trip point active", .pme_ucode = 0x08, }, { .pme_uname = "STC_TRIP_POINTS_CROSSED", .pme_udesc = "Number of times the STC trip point is crossed", .pme_ucode = 0x10, }, { .pme_uname = "CLOCKS_HTC_P_STATE_INACTIVE", .pme_udesc = "Number of clocks HTC P-state is inactive.", .pme_ucode = 0x20, }, { .pme_uname = "CLOCKS_HTC_P_STATE_ACTIVE", .pme_udesc = "Number of clocks HTC P-state is active", .pme_ucode = 0x40, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x7C, }, 
}, }, /* 92 */{.pme_name = "CPU_IO_REQUESTS_TO_MEMORY_IO", .pme_code = 0xE9, .pme_desc = "CPU/IO Requests to Memory/IO", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "I_O_TO_I_O", .pme_udesc = "IO to IO", .pme_ucode = 0x01, }, { .pme_uname = "I_O_TO_MEM", .pme_udesc = "IO to Mem", .pme_ucode = 0x02, }, { .pme_uname = "CPU_TO_I_O", .pme_udesc = "CPU to IO", .pme_ucode = 0x04, }, { .pme_uname = "CPU_TO_MEM", .pme_udesc = "CPU to Mem", .pme_ucode = 0x08, }, { .pme_uname = "TO_REMOTE_NODE", .pme_udesc = "To remote node", .pme_ucode = 0x10, }, { .pme_uname = "TO_LOCAL_NODE", .pme_udesc = "To local node", .pme_ucode = 0x20, }, { .pme_uname = "FROM_REMOTE_NODE", .pme_udesc = "From remote node", .pme_ucode = 0x40, }, { .pme_uname = "FROM_LOCAL_NODE", .pme_udesc = "From local node", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 93 */{.pme_name = "CACHE_BLOCK", .pme_code = 0xEA, .pme_desc = "Cache Block Commands", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "VICTIM_WRITEBACK", .pme_udesc = "Victim Block (Writeback)", .pme_ucode = 0x01, }, { .pme_uname = "DCACHE_LOAD_MISS", .pme_udesc = "Read Block (Dcache load miss refill)", .pme_ucode = 0x04, }, { .pme_uname = "SHARED_ICACHE_REFILL", .pme_udesc = "Read Block Shared (Icache refill)", .pme_ucode = 0x08, }, { .pme_uname = "READ_BLOCK_MODIFIED", .pme_udesc = "Read Block Modified (Dcache store miss refill)", .pme_ucode = 0x10, }, { .pme_uname = "READ_TO_DIRTY", .pme_udesc = "Change-to-Dirty (first store to clean block already in cache)", .pme_ucode = 0x20, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x3D, }, }, }, /* 94 */{.pme_name = "SIZED_COMMANDS", .pme_code = 0xEB, .pme_desc = "Sized Commands", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 7, .pme_umasks = { { .pme_uname = "NON_POSTED_WRITE_BYTE", .pme_udesc = "Non-Posted SzWr 
Byte (1-32 bytes) Legacy or mapped IO, typically 1-4 bytes", .pme_ucode = 0x01, }, { .pme_uname = "NON_POSTED_WRITE_DWORD", .pme_udesc = "Non-Posted SzWr DW (1-16 dwords) Legacy or mapped IO, typically 1 DWORD", .pme_ucode = 0x02, }, { .pme_uname = "POSTED_WRITE_BYTE", .pme_udesc = "Posted SzWr Byte (1-32 bytes) Sub-cache-line DMA writes, size varies; also flushes of partially-filled Write Combining buffer", .pme_ucode = 0x04, }, { .pme_uname = "POSTED_WRITE_DWORD", .pme_udesc = "Posted SzWr DW (1-16 dwords) Block-oriented DMA writes, often cache-line sized; also processor Write Combining buffer flushes", .pme_ucode = 0x08, }, { .pme_uname = "READ_BYTE_4_BYTES", .pme_udesc = "SzRd Byte (4 bytes) Legacy or mapped IO", .pme_ucode = 0x10, }, { .pme_uname = "READ_DWORD_1_16_DWORDS", .pme_udesc = "SzRd DW (1-16 dwords) Block-oriented DMA reads, typically cache-line size", .pme_ucode = 0x20, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x3F, }, }, }, /* 95 */{.pme_name = "PROBE", .pme_code = 0xEC, .pme_desc = "Probe Responses and Upstream Requests", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "MISS", .pme_udesc = "Probe miss", .pme_ucode = 0x01, }, { .pme_uname = "HIT_CLEAN", .pme_udesc = "Probe hit clean", .pme_ucode = 0x02, }, { .pme_uname = "HIT_DIRTY_NO_MEMORY_CANCEL", .pme_udesc = "Probe hit dirty without memory cancel (probed by Sized Write or Change2Dirty)", .pme_ucode = 0x04, }, { .pme_uname = "HIT_DIRTY_WITH_MEMORY_CANCEL", .pme_udesc = "Probe hit dirty with memory cancel (probed by DMA read or cache refill request)", .pme_ucode = 0x08, }, { .pme_uname = "UPSTREAM_DISPLAY_REFRESH_READS", .pme_udesc = "Upstream display refresh/ISOC reads", .pme_ucode = 0x10, }, { .pme_uname = "UPSTREAM_NON_DISPLAY_REFRESH_READS", .pme_udesc = "Upstream non-display refresh reads", .pme_ucode = 0x20, }, { .pme_uname = "UPSTREAM_WRITES", .pme_udesc = "Upstream ISOC writes", .pme_ucode = 0x40, }, { 
.pme_uname = "UPSTREAM_NON_ISOC_WRITES", .pme_udesc = "Upstream non-ISOC writes", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 96 */{.pme_name = "GART", .pme_code = 0xEE, .pme_desc = "GART Events", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "APERTURE_HIT_FROM_CPU", .pme_udesc = "GART aperture hit on access from CPU", .pme_ucode = 0x01, }, { .pme_uname = "APERTURE_HIT_FROM_IO", .pme_udesc = "GART aperture hit on access from IO", .pme_ucode = 0x02, }, { .pme_uname = "MISS", .pme_udesc = "GART miss", .pme_ucode = 0x04, }, { .pme_uname = "REQUEST_HIT_TABLE_WALK", .pme_udesc = "GART/DEV Request hit table walk in progress", .pme_ucode = 0x08, }, { .pme_uname = "DEV_HIT", .pme_udesc = "DEV hit", .pme_ucode = 0x10, }, { .pme_uname = "DEV_MISS", .pme_udesc = "DEV miss", .pme_ucode = 0x20, }, { .pme_uname = "DEV_ERROR", .pme_udesc = "DEV error", .pme_ucode = 0x40, }, { .pme_uname = "MULTIPLE_TABLE_WALK", .pme_udesc = "GART/DEV multiple table walk in progress", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 97 */{.pme_name = "MEMORY_CONTROLLER_REQUESTS", .pme_code = 0x1F0, .pme_desc = "Memory Controller Requests", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "WRITE_REQUESTS", .pme_udesc = "Write requests sent to the DCT", .pme_ucode = 0x01, }, { .pme_uname = "READ_REQUESTS", .pme_udesc = "Read requests (including prefetch requests) sent to the DCT", .pme_ucode = 0x02, }, { .pme_uname = "PREFETCH_REQUESTS", .pme_udesc = "Prefetch requests sent to the DCT", .pme_ucode = 0x04, }, { .pme_uname = "32_BYTES_WRITES", .pme_udesc = "32 Bytes Sized Writes", .pme_ucode = 0x08, }, { .pme_uname = "64_BYTES_WRITES", .pme_udesc = "64 Bytes Sized Writes", .pme_ucode = 0x10, }, { .pme_uname = "32_BYTES_READS", .pme_udesc = "32 Bytes Sized Reads", .pme_ucode = 0x20, }, { 
.pme_uname = "64_BYTES_READS", .pme_udesc = "64 Bytes Sized Reads", .pme_ucode = 0x40, }, { .pme_uname = "READ_REQUESTS_WHILE_WRITES_REQUESTS", .pme_udesc = "Read requests sent to the DCT while write requests are pending in the DCT", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 98 */{.pme_name = "CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE", .pme_code = 0x1E0, .pme_desc = "CPU to DRAM Requests to Target Node", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "LOCAL_TO_0", .pme_udesc = "From Local node to Node 0", .pme_ucode = 0x01, }, { .pme_uname = "LOCAL_TO_1", .pme_udesc = "From Local node to Node 1", .pme_ucode = 0x02, }, { .pme_uname = "LOCAL_TO_2", .pme_udesc = "From Local node to Node 2", .pme_ucode = 0x04, }, { .pme_uname = "LOCAL_TO_3", .pme_udesc = "From Local node to Node 3", .pme_ucode = 0x08, }, { .pme_uname = "LOCAL_TO_4", .pme_udesc = "From Local node to Node 4", .pme_ucode = 0x10, }, { .pme_uname = "LOCAL_TO_5", .pme_udesc = "From Local node to Node 5", .pme_ucode = 0x20, }, { .pme_uname = "LOCAL_TO_6", .pme_udesc = "From Local node to Node 6", .pme_ucode = 0x40, }, { .pme_uname = "LOCAL_TO_7", .pme_udesc = "From Local node to Node 7", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 99 */{.pme_name = "IO_TO_DRAM_REQUESTS_TO_TARGET_NODE", .pme_code = 0x1E1, .pme_desc = "IO to DRAM Requests to Target Node", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "LOCAL_TO_0", .pme_udesc = "From Local node to Node 0", .pme_ucode = 0x01, }, { .pme_uname = "LOCAL_TO_1", .pme_udesc = "From Local node to Node 1", .pme_ucode = 0x02, }, { .pme_uname = "LOCAL_TO_2", .pme_udesc = "From Local node to Node 2", .pme_ucode = 0x04, }, { .pme_uname = "LOCAL_TO_3", .pme_udesc = "From Local node to Node 3", .pme_ucode = 0x08, }, { .pme_uname = "LOCAL_TO_4", .pme_udesc = 
"From Local node to Node 4", .pme_ucode = 0x10, }, { .pme_uname = "LOCAL_TO_5", .pme_udesc = "From Local node to Node 5", .pme_ucode = 0x20, }, { .pme_uname = "LOCAL_TO_6", .pme_udesc = "From Local node to Node 6", .pme_ucode = 0x40, }, { .pme_uname = "LOCAL_TO_7", .pme_udesc = "From Local node to Node 7", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 100 */{.pme_name = "CPU_READ_COMMAND_LATENCY_TO_TARGET_NODE_0_3", .pme_code = 0x1E2, .pme_desc = "CPU Read Command Latency to Target Node 0-3", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "READ_BLOCK", .pme_udesc = "Read block", .pme_ucode = 0x01, }, { .pme_uname = "READ_BLOCK_SHARED", .pme_udesc = "Read block shared", .pme_ucode = 0x02, }, { .pme_uname = "READ_BLOCK_MODIFIED", .pme_udesc = "Read block modified", .pme_ucode = 0x04, }, { .pme_uname = "CHANGE_TO_DIRTY", .pme_udesc = "Change-to-Dirty", .pme_ucode = 0x08, }, { .pme_uname = "LOCAL_TO_0", .pme_udesc = "From Local node to Node 0", .pme_ucode = 0x10, }, { .pme_uname = "LOCAL_TO_1", .pme_udesc = "From Local node to Node 1", .pme_ucode = 0x20, }, { .pme_uname = "LOCAL_TO_2", .pme_udesc = "From Local node to Node 2", .pme_ucode = 0x40, }, { .pme_uname = "LOCAL_TO_3", .pme_udesc = "From Local node to Node 3", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 101 */{.pme_name = "CPU_READ_COMMAND_REQUESTS_TO_TARGET_NODE_0_3", .pme_code = 0x1E3, .pme_desc = "CPU Read Command Requests to Target Node 0-3", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "READ_BLOCK", .pme_udesc = "Read block", .pme_ucode = 0x01, }, { .pme_uname = "READ_BLOCK_SHARED", .pme_udesc = "Read block shared", .pme_ucode = 0x02, }, { .pme_uname = "READ_BLOCK_MODIFIED", .pme_udesc = "Read block modified", .pme_ucode = 0x04, }, { .pme_uname = "CHANGE_TO_DIRTY", .pme_udesc = 
"Change-to-Dirty", .pme_ucode = 0x08, }, { .pme_uname = "LOCAL_TO_0", .pme_udesc = "From Local node to Node 0", .pme_ucode = 0x10, }, { .pme_uname = "LOCAL_TO_1", .pme_udesc = "From Local node to Node 1", .pme_ucode = 0x20, }, { .pme_uname = "LOCAL_TO_2", .pme_udesc = "From Local node to Node 2", .pme_ucode = 0x40, }, { .pme_uname = "LOCAL_TO_3", .pme_udesc = "From Local node to Node 3", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 102 */{.pme_name = "CPU_READ_COMMAND_LATENCY_TO_TARGET_NODE_4_7", .pme_code = 0x1E4, .pme_desc = "CPU Read Command Latency to Target Node 4-7", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "READ_BLOCK", .pme_udesc = "Read block", .pme_ucode = 0x01, }, { .pme_uname = "READ_BLOCK_SHARED", .pme_udesc = "Read block shared", .pme_ucode = 0x02, }, { .pme_uname = "READ_BLOCK_MODIFIED", .pme_udesc = "Read block modified", .pme_ucode = 0x04, }, { .pme_uname = "CHANGE_TO_DIRTY", .pme_udesc = "Change-to-Dirty", .pme_ucode = 0x08, }, { .pme_uname = "LOCAL_TO_4", .pme_udesc = "From Local node to Node 4", .pme_ucode = 0x10, }, { .pme_uname = "LOCAL_TO_5", .pme_udesc = "From Local node to Node 5", .pme_ucode = 0x20, }, { .pme_uname = "LOCAL_TO_6", .pme_udesc = "From Local node to Node 6", .pme_ucode = 0x40, }, { .pme_uname = "LOCAL_TO_7", .pme_udesc = "From Local node to Node 7", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 103 */{.pme_name = "CPU_READ_COMMAND_REQUESTS_TO_TARGET_NODE_4_7", .pme_code = 0x1E5, .pme_desc = "CPU Read Command Requests to Target Node 4-7", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "READ_BLOCK", .pme_udesc = "Read block", .pme_ucode = 0x01, }, { .pme_uname = "READ_BLOCK_SHARED", .pme_udesc = "Read block shared", .pme_ucode = 0x02, }, { .pme_uname = "READ_BLOCK_MODIFIED", .pme_udesc = "Read block 
modified", .pme_ucode = 0x04, }, { .pme_uname = "CHANGE_TO_DIRTY", .pme_udesc = "Change-to-Dirty", .pme_ucode = 0x08, }, { .pme_uname = "LOCAL_TO_4", .pme_udesc = "From Local node to Node 4", .pme_ucode = 0x10, }, { .pme_uname = "LOCAL_TO_5", .pme_udesc = "From Local node to Node 5", .pme_ucode = 0x20, }, { .pme_uname = "LOCAL_TO_6", .pme_udesc = "From Local node to Node 6", .pme_ucode = 0x40, }, { .pme_uname = "LOCAL_TO_7", .pme_udesc = "From Local node to Node 7", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 104 */{.pme_name = "CPU_COMMAND_LATENCY_TO_TARGET_NODE_0_3_4_7", .pme_code = 0x1E6, .pme_desc = "CPU Command Latency to Target Node 0-3/4-7", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "READ_SIZED", .pme_udesc = "Read Sized", .pme_ucode = 0x01, }, { .pme_uname = "WRITE_SIZED", .pme_udesc = "Write Sized", .pme_ucode = 0x02, }, { .pme_uname = "VICTIM_BLOCK", .pme_udesc = "Victim Block", .pme_ucode = 0x04, }, { .pme_uname = "NODE_GROUP_SELECT", .pme_udesc = "Node Group Select. 0=Nodes 0-3. 
1= Nodes 4-7.", .pme_ucode = 0x08, }, { .pme_uname = "LOCAL_TO_0_4", .pme_udesc = "From Local node to Node 0/4", .pme_ucode = 0x10, }, { .pme_uname = "LOCAL_TO_1_5", .pme_udesc = "From Local node to Node 1/5", .pme_ucode = 0x20, }, { .pme_uname = "LOCAL_TO_2_6", .pme_udesc = "From Local node to Node 2/6", .pme_ucode = 0x40, }, { .pme_uname = "LOCAL_TO_3_7", .pme_udesc = "From Local node to Node 3/7", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 105 */{.pme_name = "CPU_REQUESTS_TO_TARGET_NODE_0_3_4_7", .pme_code = 0x1E7, .pme_desc = "CPU Requests to Target Node 0-3/4-7", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "READ_SIZED", .pme_udesc = "Read Sized", .pme_ucode = 0x01, }, { .pme_uname = "WRITE_SIZED", .pme_udesc = "Write Sized", .pme_ucode = 0x02, }, { .pme_uname = "VICTIM_BLOCK", .pme_udesc = "Victim Block", .pme_ucode = 0x04, }, { .pme_uname = "NODE_GROUP_SELECT", .pme_udesc = "Node Group Select. 0=Nodes 0-3. 
1= Nodes 4-7.", .pme_ucode = 0x08, }, { .pme_uname = "LOCAL_TO_0_4", .pme_udesc = "From Local node to Node 0/4", .pme_ucode = 0x10, }, { .pme_uname = "LOCAL_TO_1_5", .pme_udesc = "From Local node to Node 1/5", .pme_ucode = 0x20, }, { .pme_uname = "LOCAL_TO_2_6", .pme_udesc = "From Local node to Node 2/6", .pme_ucode = 0x40, }, { .pme_uname = "LOCAL_TO_3_7", .pme_udesc = "From Local node to Node 3/7", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 106 */{.pme_name = "HYPERTRANSPORT_LINK0", .pme_code = 0xF6, .pme_desc = "HyperTransport Link 0 Transmit Bandwidth", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 8, .pme_umasks = { { .pme_uname = "COMMAND_DWORD_SENT", .pme_udesc = "Command DWORD sent", .pme_ucode = 0x01, }, { .pme_uname = "DATA_DWORD_SENT", .pme_udesc = "Data DWORD sent", .pme_ucode = 0x02, }, { .pme_uname = "BUFFER_RELEASE_DWORD_SENT", .pme_udesc = "Buffer release DWORD sent", .pme_ucode = 0x04, }, { .pme_uname = "NOP_DWORD_SENT", .pme_udesc = "Nop DW sent (idle)", .pme_ucode = 0x08, }, { .pme_uname = "ADDRESS_EXT_DWORD_SENT", .pme_udesc = "Address extension DWORD sent", .pme_ucode = 0x10, }, { .pme_uname = "PER_PACKET_CRC_SENT", .pme_udesc = "Per packet CRC sent", .pme_ucode = 0x20, }, { .pme_uname = "SUBLINK_MASK", .pme_udesc = "SubLink Mask", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xBF, }, }, }, /* 107 */{.pme_name = "HYPERTRANSPORT_LINK1", .pme_code = 0xF7, .pme_desc = "HyperTransport Link 1 Transmit Bandwidth", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 8, .pme_umasks = { { .pme_uname = "COMMAND_DWORD_SENT", .pme_udesc = "Command DWORD sent", .pme_ucode = 0x01, }, { .pme_uname = "DATA_DWORD_SENT", .pme_udesc = "Data DWORD sent", .pme_ucode = 0x02, }, { .pme_uname = "BUFFER_RELEASE_DWORD_SENT", .pme_udesc = "Buffer release DWORD sent", .pme_ucode = 0x04, }, { .pme_uname = "NOP_DWORD_SENT", .pme_udesc = "Nop 
DW sent (idle)", .pme_ucode = 0x08, }, { .pme_uname = "ADDRESS_EXT_DWORD_SENT", .pme_udesc = "Address extension DWORD sent", .pme_ucode = 0x10, }, { .pme_uname = "PER_PACKET_CRC_SENT", .pme_udesc = "Per packet CRC sent", .pme_ucode = 0x20, }, { .pme_uname = "SUBLINK_MASK", .pme_udesc = "SubLink Mask", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xBF, }, }, }, /* 108 */{.pme_name = "HYPERTRANSPORT_LINK2", .pme_code = 0xF8, .pme_desc = "HyperTransport Link 2 Transmit Bandwidth", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 8, .pme_umasks = { { .pme_uname = "COMMAND_DWORD_SENT", .pme_udesc = "Command DWORD sent", .pme_ucode = 0x01, }, { .pme_uname = "DATA_DWORD_SENT", .pme_udesc = "Data DWORD sent", .pme_ucode = 0x02, }, { .pme_uname = "BUFFER_RELEASE_DWORD_SENT", .pme_udesc = "Buffer release DWORD sent", .pme_ucode = 0x04, }, { .pme_uname = "NOP_DWORD_SENT", .pme_udesc = "Nop DW sent (idle)", .pme_ucode = 0x08, }, { .pme_uname = "ADDRESS_EXT_DWORD_SENT", .pme_udesc = "Address extension DWORD sent", .pme_ucode = 0x10, }, { .pme_uname = "PER_PACKET_CRC_SENT", .pme_udesc = "Per packet CRC sent", .pme_ucode = 0x20, }, { .pme_uname = "SUBLINK_MASK", .pme_udesc = "SubLink Mask", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xBF, }, }, }, /* 109 */{.pme_name = "HYPERTRANSPORT_LINK3", .pme_code = 0x1F9, .pme_desc = "HyperTransport Link 3 Transmit Bandwidth", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 8, .pme_umasks = { { .pme_uname = "COMMAND_DWORD_SENT", .pme_udesc = "Command DWORD sent", .pme_ucode = 0x01, }, { .pme_uname = "DATA_DWORD_SENT", .pme_udesc = "Data DWORD sent", .pme_ucode = 0x02, }, { .pme_uname = "BUFFER_RELEASE_DWORD_SENT", .pme_udesc = "Buffer release DWORD sent", .pme_ucode = 0x04, }, { .pme_uname = "NOP_DWORD_SENT", .pme_udesc = "Nop DW sent (idle)", .pme_ucode = 0x08, }, { .pme_uname = "ADDRESS_EXT_DWORD_SENT", .pme_udesc = 
"Address extension DWORD sent", .pme_ucode = 0x10, }, { .pme_uname = "PER_PACKET_CRC_SENT", .pme_udesc = "Per packet CRC sent", .pme_ucode = 0x20, }, { .pme_uname = "SUBLINK_MASK", .pme_udesc = "SubLink Mask", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xBF, }, }, }, /* 110 */{.pme_name = "READ_REQUEST_TO_L3_CACHE", .pme_code = 0x4E0, .pme_desc = "Read Request to L3 Cache", .pme_flags = PFMLIB_AMD64_UMASK_COMBO|PFMLIB_AMD64_TILL_FAM10H_REV_C, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "READ_BLOCK_EXCLUSIVE", .pme_udesc = "Read Block Exclusive (Data cache read)", .pme_ucode = 0x01, }, { .pme_uname = "READ_BLOCK_SHARED", .pme_udesc = "Read Block Shared (Instruction cache read)", .pme_ucode = 0x02, }, { .pme_uname = "READ_BLOCK_MODIFY", .pme_udesc = "Read Block Modify", .pme_ucode = 0x04, }, { .pme_uname = "ANY_READ", .pme_udesc = "any read modes (exclusive, shared, modify)", .pme_ucode = 0x07, }, #if 0 /* * http://support.amd.com/us/Processor_TechDocs/41322.pdf * * Issue number 437 on page 131. 
* */ { .pme_uname = "CORE_0_SELECT", .pme_udesc = "Core 0 Select", .pme_ucode = 0x10, }, { .pme_uname = "CORE_1_SELECT", .pme_udesc = "Core 1 Select", .pme_ucode = 0x20, }, { .pme_uname = "CORE_2_SELECT", .pme_udesc = "Core 2 Select", .pme_ucode = 0x40, }, { .pme_uname = "CORE_3_SELECT", .pme_udesc = "Core 3 Select", .pme_ucode = 0x80, }, #endif { .pme_uname = "ALL_CORES", .pme_udesc = "All cores", .pme_ucode = 0xF0, }, }, }, /* 111 */{.pme_name = "L3_CACHE_MISSES", .pme_code = 0x4E1, .pme_desc = "L3 Cache Misses", .pme_flags = PFMLIB_AMD64_UMASK_COMBO|PFMLIB_AMD64_TILL_FAM10H_REV_C, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "READ_BLOCK_EXCLUSIVE", .pme_udesc = "Read Block Exclusive (Data cache read)", .pme_ucode = 0x01, }, { .pme_uname = "READ_BLOCK_SHARED", .pme_udesc = "Read Block Shared (Instruction cache read)", .pme_ucode = 0x02, }, { .pme_uname = "READ_BLOCK_MODIFY", .pme_udesc = "Read Block Modify", .pme_ucode = 0x04, }, { .pme_uname = "ANY_READ", .pme_udesc = "any read modes (exclusive, shared, modify)", .pme_ucode = 0x07, }, #if 0 /* * http://support.amd.com/us/Processor_TechDocs/41322.pdf * * Issue number 437 on page 131. 
* */ { .pme_uname = "CORE_0_SELECT", .pme_udesc = "Core 0 Select", .pme_ucode = 0x10, }, { .pme_uname = "CORE_1_SELECT", .pme_udesc = "Core 1 Select", .pme_ucode = 0x20, }, { .pme_uname = "CORE_2_SELECT", .pme_udesc = "Core 2 Select", .pme_ucode = 0x40, }, { .pme_uname = "CORE_3_SELECT", .pme_udesc = "Core 3 Select", .pme_ucode = 0x80, }, #endif { .pme_uname = "ALL_CORES", .pme_udesc = "All cores", .pme_ucode = 0xF0, }, }, }, /* 112 */{.pme_name = "L3_FILLS_CAUSED_BY_L2_EVICTIONS", .pme_code = 0x4E2, .pme_desc = "L3 Fills caused by L2 Evictions", .pme_flags = PFMLIB_AMD64_UMASK_COMBO|PFMLIB_AMD64_TILL_FAM10H_REV_C, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "SHARED", .pme_udesc = "Shared", .pme_ucode = 0x01, }, { .pme_uname = "EXCLUSIVE", .pme_udesc = "Exclusive", .pme_ucode = 0x02, }, { .pme_uname = "OWNED", .pme_udesc = "Owned", .pme_ucode = 0x04, }, { .pme_uname = "MODIFIED", .pme_udesc = "Modified", .pme_ucode = 0x08, }, { .pme_uname = "ANY_STATE", .pme_udesc = "any line state (shared, owned, exclusive, modified)", .pme_ucode = 0x0F, }, #if 0 /* * http://support.amd.com/us/Processor_TechDocs/41322.pdf * * Issue number 437 on page 131. 
* */ { .pme_uname = "CORE_0_SELECT", .pme_udesc = "Core 0 Select", .pme_ucode = 0x10, }, { .pme_uname = "CORE_1_SELECT", .pme_udesc = "Core 1 Select", .pme_ucode = 0x20, }, { .pme_uname = "CORE_2_SELECT", .pme_udesc = "Core 2 Select", .pme_ucode = 0x40, }, { .pme_uname = "CORE_3_SELECT", .pme_udesc = "Core 3 Select", .pme_ucode = 0x80, }, #endif { .pme_uname = "ALL_CORES", .pme_udesc = "All cores", .pme_ucode = 0xF0, }, }, }, /* 113 */{.pme_name = "L3_EVICTIONS", .pme_code = 0x4E3, .pme_desc = "L3 Evictions", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "SHARED", .pme_udesc = "Shared", .pme_ucode = 0x01, }, { .pme_uname = "EXCLUSIVE", .pme_udesc = "Exclusive", .pme_ucode = 0x02, }, { .pme_uname = "OWNED", .pme_udesc = "Owned", .pme_ucode = 0x04, }, { .pme_uname = "MODIFIED", .pme_udesc = "Modified", .pme_ucode = 0x08, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* Family 10h RevC, Shanghai */ /* 114 */{.pme_name = "PAGE_SIZE_MISMATCHES", .pme_code = 0x165, .pme_desc = "Page Size Mismatches", .pme_flags = PFMLIB_AMD64_UMASK_COMBO|PFMLIB_AMD64_FAM10H_REV_C, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "GUEST_LARGER", .pme_udesc = "Guest page size is larger than the host page size.", .pme_ucode = 0x01, }, { .pme_uname = "MTRR_MISMATCH", .pme_udesc = "MTRR mismatch.", .pme_ucode = 0x02, }, { .pme_uname = "HOST_LARGER", .pme_udesc = "Host page size is larger than the guest page size.", .pme_ucode = 0x04, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 115 */{.pme_name = "RETIRED_X87_OPS", .pme_code = 0x1C0, .pme_desc = "Retired x87 Floating Point Operations", .pme_flags = PFMLIB_AMD64_UMASK_COMBO|PFMLIB_AMD64_FAM10H_REV_C, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "ADD_SUB_OPS", .pme_udesc = "Add/subtract ops", .pme_ucode = 0x01, }, { .pme_uname = "MUL_OPS", .pme_udesc = "Multiply ops", .pme_ucode = 0x02, }, { .pme_uname 
= "DIV_OPS", .pme_udesc = "Divide ops", .pme_ucode = 0x04, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 116 */{.pme_name = "IBS_OPS_TAGGED", .pme_code = 0x1CF, .pme_desc = "IBS Ops Tagged", .pme_flags = PFMLIB_AMD64_FAM10H_REV_C, }, /* 117 */{.pme_name = "LFENCE_INST_RETIRED", .pme_code = 0x1D3, .pme_desc = "LFENCE Instructions Retired", .pme_flags = PFMLIB_AMD64_FAM10H_REV_C, }, /* 118 */{.pme_name = "SFENCE_INST_RETIRED", .pme_code = 0x1D4, .pme_desc = "SFENCE Instructions Retired", .pme_flags = PFMLIB_AMD64_FAM10H_REV_C, }, /* 119 */{.pme_name = "MFENCE_INST_RETIRED", .pme_code = 0x1D5, .pme_desc = "MFENCE Instructions Retired", .pme_flags = PFMLIB_AMD64_FAM10H_REV_C, }, /* Family 10h RevD, Istanbul */ /* 120 */{.pme_name = "READ_REQUEST_TO_L3_CACHE", .pme_code = 0x4E0, .pme_desc = "Read Request to L3 Cache", .pme_flags = PFMLIB_AMD64_UMASK_COMBO|PFMLIB_AMD64_FAM10H_REV_D, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "READ_BLOCK_EXCLUSIVE", .pme_udesc = "Read Block Exclusive (Data cache read)", .pme_ucode = 0x01, }, { .pme_uname = "READ_BLOCK_SHARED", .pme_udesc = "Read Block Shared (Instruction cache read)", .pme_ucode = 0x02, }, { .pme_uname = "READ_BLOCK_MODIFY", .pme_udesc = "Read Block Modify", .pme_ucode = 0x04, }, { .pme_uname = "ANY_READ", .pme_udesc = "any read modes (exclusive, shared, modify)", .pme_ucode = 0x07, }, #if 0 /* * http://support.amd.com/us/Processor_TechDocs/41322.pdf * * Issue number 437 on page 131. 
* */ { .pme_uname = "CORE_0_SELECT", .pme_udesc = "Core 0 Select", .pme_ucode = 0x00, }, { .pme_uname = "CORE_1_SELECT", .pme_udesc = "Core 1 Select", .pme_ucode = 0x10, }, { .pme_uname = "CORE_2_SELECT", .pme_udesc = "Core 2 Select", .pme_ucode = 0x20, }, { .pme_uname = "CORE_3_SELECT", .pme_udesc = "Core 3 Select", .pme_ucode = 0x30, }, { .pme_uname = "CORE_4_SELECT", .pme_udesc = "Core 4 Select", .pme_ucode = 0x40, }, { .pme_uname = "CORE_5_SELECT", .pme_udesc = "Core 5 Select", .pme_ucode = 0x50, }, { .pme_uname = "ANY_CORE", .pme_udesc = "Any core", .pme_ucode = 0xF0, }, #endif { .pme_uname = "ALL_CORES", .pme_udesc = "All cores", .pme_ucode = 0xF0, }, }, }, /* 121 */{.pme_name = "L3_CACHE_MISSES", .pme_code = 0x4E1, .pme_desc = "L3 Cache Misses", .pme_flags = PFMLIB_AMD64_UMASK_COMBO|PFMLIB_AMD64_FAM10H_REV_D, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "READ_BLOCK_EXCLUSIVE", .pme_udesc = "Read Block Exclusive (Data cache read)", .pme_ucode = 0x01, }, { .pme_uname = "READ_BLOCK_SHARED", .pme_udesc = "Read Block Shared (Instruction cache read)", .pme_ucode = 0x02, }, { .pme_uname = "READ_BLOCK_MODIFY", .pme_udesc = "Read Block Modify", .pme_ucode = 0x04, }, { .pme_uname = "ANY_READ", .pme_udesc = "any read modes (exclusive, shared, modify)", .pme_ucode = 0x07, }, #if 0 /* * http://support.amd.com/us/Processor_TechDocs/41322.pdf * * Issue number 437 on page 131. 
* */ { .pme_uname = "CORE_0_SELECT", .pme_udesc = "Core 0 Select", .pme_ucode = 0x00, }, { .pme_uname = "CORE_1_SELECT", .pme_udesc = "Core 1 Select", .pme_ucode = 0x10, }, { .pme_uname = "CORE_2_SELECT", .pme_udesc = "Core 2 Select", .pme_ucode = 0x20, }, { .pme_uname = "CORE_3_SELECT", .pme_udesc = "Core 3 Select", .pme_ucode = 0x30, }, { .pme_uname = "CORE_4_SELECT", .pme_udesc = "Core 4 Select", .pme_ucode = 0x40, }, { .pme_uname = "CORE_5_SELECT", .pme_udesc = "Core 5 Select", .pme_ucode = 0x50, }, { .pme_uname = "ANY_CORE", .pme_udesc = "Any core", .pme_ucode = 0xF0, }, #endif { .pme_uname = "ALL_CORES", .pme_udesc = "All cores", .pme_ucode = 0xF0, }, }, }, /* 122 */{.pme_name = "L3_FILLS_CAUSED_BY_L2_EVICTIONS", .pme_code = 0x4E2, .pme_desc = "L3 Fills caused by L2 Evictions", .pme_flags = PFMLIB_AMD64_UMASK_COMBO|PFMLIB_AMD64_FAM10H_REV_D, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "SHARED", .pme_udesc = "Shared", .pme_ucode = 0x01, }, { .pme_uname = "EXCLUSIVE", .pme_udesc = "Exclusive", .pme_ucode = 0x02, }, { .pme_uname = "OWNED", .pme_udesc = "Owned", .pme_ucode = 0x04, }, { .pme_uname = "MODIFIED", .pme_udesc = "Modified", .pme_ucode = 0x08, }, { .pme_uname = "ANY_STATE", .pme_udesc = "any line state (shared, owned, exclusive, modified)", .pme_ucode = 0x0F, }, #if 0 /* * http://support.amd.com/us/Processor_TechDocs/41322.pdf * * Issue number 437 on page 131. 
* */ { .pme_uname = "CORE_0_SELECT", .pme_udesc = "Core 0 Select", .pme_ucode = 0x00, }, { .pme_uname = "CORE_1_SELECT", .pme_udesc = "Core 1 Select", .pme_ucode = 0x10, }, { .pme_uname = "CORE_2_SELECT", .pme_udesc = "Core 2 Select", .pme_ucode = 0x20, }, { .pme_uname = "CORE_3_SELECT", .pme_udesc = "Core 3 Select", .pme_ucode = 0x30, }, { .pme_uname = "CORE_4_SELECT", .pme_udesc = "Core 4 Select", .pme_ucode = 0x40, }, { .pme_uname = "CORE_5_SELECT", .pme_udesc = "Core 5 Select", .pme_ucode = 0x50, }, { .pme_uname = "ANY_CORE", .pme_udesc = "Any core", .pme_ucode = 0xF0, }, #endif { .pme_uname = "ALL_CORES", .pme_udesc = "All cores", .pme_ucode = 0xF0, }, }, }, /* 123 */{.pme_name = "IBSOP_EVENT", .pme_code = 0xFF, .pme_desc = "Enable IBS OP mode (pseudo event)", .pme_flags = 0, .pme_numasks = 2, .pme_umasks = { { .pme_uname = "CYCLES", .pme_udesc = "sample cycles", .pme_ucode = 0x01, }, { .pme_uname = "UOPS", .pme_udesc = "sample dispatched uops (Rev C and later)", .pme_ucode = 0x02, }, }, }, /* 124 */{.pme_name = "IBSFETCH_EVENT", .pme_code = 0xFF, .pme_desc = "Enable IBS Fetch mode (pseudo event)", .pme_flags = 0, .pme_numasks = 2, .pme_umasks = { { .pme_uname = "RANDOM", .pme_udesc = "randomize period", .pme_ucode = 0x01, }, { .pme_uname = "NO_RANDOM", .pme_udesc = "do not randomize period", .pme_ucode = 0x00, }, }, }, /* 125 */{.pme_name = "MAB_REQUESTS", .pme_code = 0x68, .pme_desc = "Average L1 refill latency for Icache and Dcache misses (request count for cache refills)", .pme_numasks = 10, .pme_umasks = { { .pme_uname = "BUFFER_0", .pme_udesc = "Buffer 0", .pme_ucode = 0x00, }, { .pme_uname = "BUFFER_1", .pme_udesc = "Buffer 1", .pme_ucode = 0x01, }, { .pme_uname = "BUFFER_2", .pme_udesc = "Buffer 2", .pme_ucode = 0x02, }, { .pme_uname = "BUFFER_3", .pme_udesc = "Buffer 3", .pme_ucode = 0x03, }, { .pme_uname = "BUFFER_4", .pme_udesc = "Buffer 4", .pme_ucode = 0x04, }, { .pme_uname = "BUFFER_5", .pme_udesc = "Buffer 5", .pme_ucode = 0x05, }, { .pme_uname 
= "BUFFER_6", .pme_udesc = "Buffer 6", .pme_ucode = 0x06, }, { .pme_uname = "BUFFER_7", .pme_udesc = "Buffer 7", .pme_ucode = 0x07, }, { .pme_uname = "BUFFER_8", .pme_udesc = "Buffer 8", .pme_ucode = 0x08, }, { .pme_uname = "BUFFER_9", .pme_udesc = "Buffer 9", .pme_ucode = 0x09, }, }, }, /* 126 */{.pme_name = "MAB_WAIT_CYCLES", .pme_code = 0x69, .pme_desc = "Average L1 refill latency for Icache and Dcache misses (cycles that requests spent waiting for the refills)", .pme_numasks = 10, .pme_umasks = { { .pme_uname = "BUFFER_0", .pme_udesc = "Buffer 0", .pme_ucode = 0x00, }, { .pme_uname = "BUFFER_1", .pme_udesc = "Buffer 1", .pme_ucode = 0x01, }, { .pme_uname = "BUFFER_2", .pme_udesc = "Buffer 2", .pme_ucode = 0x02, }, { .pme_uname = "BUFFER_3", .pme_udesc = "Buffer 3", .pme_ucode = 0x03, }, { .pme_uname = "BUFFER_4", .pme_udesc = "Buffer 4", .pme_ucode = 0x04, }, { .pme_uname = "BUFFER_5", .pme_udesc = "Buffer 5", .pme_ucode = 0x05, }, { .pme_uname = "BUFFER_6", .pme_udesc = "Buffer 6", .pme_ucode = 0x06, }, { .pme_uname = "BUFFER_7", .pme_udesc = "Buffer 7", .pme_ucode = 0x07, }, { .pme_uname = "BUFFER_8", .pme_udesc = "Buffer 8", .pme_ucode = 0x08, }, { .pme_uname = "BUFFER_9", .pme_udesc = "Buffer 9", .pme_ucode = 0x09, }, }, }, /* 127 */{.pme_name = "NON_CANCELLED_L3_READ_REQUESTS", .pme_code = 0x4ED, .pme_desc = "Non-cancelled L3 Read Requests", .pme_numasks = 5, .pme_umasks = { { .pme_uname = "READ_BLOCK_EXCLUSIVE", .pme_udesc = "Read Block Exclusive (Data cache read)", .pme_ucode = 0x01, }, { .pme_uname = "READ_BLOCK_SHARED", .pme_udesc = "Read Block Shared (Instruction cache read)", .pme_ucode = 0x02, }, { .pme_uname = "READ_BLOCK_MODIFY", .pme_udesc = "Read Block Modify", .pme_ucode = 0x04, }, { .pme_uname = "ANY_READ", .pme_udesc = "any read modes (exclusive, shared, modify)", .pme_ucode = 0x07, }, #if 0 /* * http://support.amd.com/us/Processor_TechDocs/41322.pdf * * Issue number 437 on page 131. 
* */ { .pme_uname = "CORE_0_SELECT", .pme_udesc = "Core 0 Select", .pme_ucode = 0x00, }, { .pme_uname = "CORE_1_SELECT", .pme_udesc = "Core 1 Select", .pme_ucode = 0x10, }, { .pme_uname = "CORE_2_SELECT", .pme_udesc = "Core 2 Select", .pme_ucode = 0x20, }, { .pme_uname = "CORE_3_SELECT", .pme_udesc = "Core 3 Select", .pme_ucode = 0x30, }, { .pme_uname = "CORE_4_SELECT", .pme_udesc = "Core 4 Select", .pme_ucode = 0x40, }, { .pme_uname = "CORE_5_SELECT", .pme_udesc = "Core 5 Select", .pme_ucode = 0x50, }, #endif { .pme_uname = "ALL_CORES", .pme_udesc = "All cores", .pme_ucode = 0xF0, }, }, }, }; #define PME_AMD64_FAM10H_EVENT_COUNT (sizeof(amd64_fam10h_pe)/sizeof(pme_amd64_entry_t)) #define PME_AMD64_FAM10H_CPU_CLK_UNHALTED 36 #define PME_AMD64_FAM10H_RETIRED_INSTRUCTIONS 54 #define PME_AMD64_IBSOP 123 #define PME_AMD64_IBSFETCH 124 
papi-papi-7-2-0-t/src/libperfnec/lib/amd64_events_fam15h.h
/* * Copyright (c) 2010 Advanced Micro Devices, Inc. * Contributed by Robert Richter * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ /* * Family 15h Microarchitecture performance monitor events * * History: * * Apr 29 2011 -- Robert Richter, robert.richter@amd.com: * Source: BKDG for AMD Family 15h Models 00h-0Fh Processors, * 42301, Rev 1.15, April 18, 2011 * * Dec 09 2010 -- Robert Richter, robert.richter@amd.com: * Source: BIOS and Kernel Developer's Guide for the AMD Family 15h * Processors, Rev 0.90, May 18, 2010 */ static pme_amd64_entry_t amd64_fam15h_pe[]={ /* Family 15h */ /* 0 */{.pme_name = "DISPATCHED_FPU_OPS", .pme_code = 0x00, .pme_desc = "FPU Pipe Assignment", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "OPS_PIPE0", .pme_udesc = "Total number uops assigned to Pipe 0", .pme_ucode = 1 << 0, }, { .pme_uname = "OPS_PIPE1", .pme_udesc = "Total number uops assigned to Pipe 1", .pme_ucode = 1 << 1, }, { .pme_uname = "OPS_PIPE2", .pme_udesc = "Total number uops assigned to Pipe 2", .pme_ucode = 1 << 2, }, { .pme_uname = "OPS_PIPE3", .pme_udesc = "Total number uops assigned to Pipe 3", .pme_ucode = 1 << 3, }, { .pme_uname = "OPS_DUAL_PIPE0", .pme_udesc = "Total number dual-pipe uops assigned to Pipe 0", .pme_ucode = 1 << 4, }, { .pme_uname = "OPS_DUAL_PIPE1", .pme_udesc = "Total number dual-pipe uops assigned to Pipe 1", .pme_ucode = 1 << 5, }, { .pme_uname = "OPS_DUAL_PIPE2", .pme_udesc = "Total number dual-pipe uops assigned to Pipe 2", .pme_ucode = 1 << 6, }, { .pme_uname = "OPS_DUAL_PIPE3", .pme_udesc = "Total number dual-pipe uops assigned to Pipe 3", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 1 
*/{.pme_name = "CYCLES_FPU_EMPTY", .pme_code = 0x01, .pme_desc = "FP Scheduler Empty", }, /* 2 */{.pme_name = "RETIRED_SSE_OPS", .pme_code = 0x03, .pme_desc = "Retired SSE/BNI Ops", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "SINGLE_ADD_SUB_OPS", .pme_udesc = "Single-precision add/subtract FLOPS", .pme_ucode = 1 << 0, }, { .pme_uname = "SINGLE_MUL_OPS", .pme_udesc = "Single-precision multiply FLOPS", .pme_ucode = 1 << 1, }, { .pme_uname = "SINGLE_DIV_OPS", .pme_udesc = "Single-precision divide/square root FLOPS", .pme_ucode = 1 << 2, }, { .pme_uname = "SINGLE_MUL_ADD_OPS", .pme_udesc = "Single precision multiply-add FLOPS. Multiply-add counts as 2 FLOPS", .pme_ucode = 1 << 3, }, { .pme_uname = "DOUBLE_ADD_SUB_OPS", .pme_udesc = "Double precision add/subtract FLOPS", .pme_ucode = 1 << 4, }, { .pme_uname = "DOUBLE_MUL_OPS", .pme_udesc = "Double precision multiply FLOPS", .pme_ucode = 1 << 5, }, { .pme_uname = "DOUBLE_DIV_OPS", .pme_udesc = "Double precision divide/square root FLOPS", .pme_ucode = 1 << 6, }, { .pme_uname = "DOUBLE_MUL_ADD_OPS", .pme_udesc = "Double precision multiply-add FLOPS. 
Multiply-add counts as 2 FLOPS", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 3 */{.pme_name = "MOVE_SCALAR_OPTIMIZATION", .pme_code = 0x04, .pme_desc = "Number of Move Elimination and Scalar Op Optimization", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "SSE_MOVE_OPS", .pme_udesc = "Number of SSE Move Ops", .pme_ucode = 1 << 0, }, { .pme_uname = "SSE_MOVE_OPS_ELIM", .pme_udesc = "Number of SSE Move Ops eliminated", .pme_ucode = 1 << 1, }, { .pme_uname = "OPT_CAND", .pme_udesc = "Number of Ops that are candidates for optimization (Z-bit set or pass)", .pme_ucode = 1 << 2, }, { .pme_uname = "SCALAR_OPS_OPTIMIZED", .pme_udesc = "Number of Scalar ops optimized", .pme_ucode = 1 << 3, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 4 */{.pme_name = "RETIRED_SERIALIZING_OPS", .pme_code = 0x05, .pme_desc = "Retired Serializing Ops", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "SSE_RETIRED", .pme_udesc = "SSE bottom-executing uops retired", .pme_ucode = 1 << 0, }, { .pme_uname = "SSE_MISPREDICTED", .pme_udesc = "SSE control word mispredict traps due to mispredictions", .pme_ucode = 1 << 1, }, { .pme_uname = "X87_RETIRED", .pme_udesc = "x87 bottom-executing uops retired", .pme_ucode = 1 << 2, }, { .pme_uname = "X87_MISPREDICTED", .pme_udesc = "x87 control word mispredict traps due to mispredictions", .pme_ucode = 1 << 3, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 5 */{.pme_name = "BOTTOM_EXECUTE_OP", .pme_code = 0x06, .pme_desc = "Number of Cycles that a Bottom-Execute uop is in the FP Scheduler", }, /* 6 */{.pme_name = "SEGMENT_REGISTER_LOADS", .pme_code = 0x20, .pme_desc = "Segment Register Loads", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 8, .pme_umasks = { { .pme_uname = "ES", .pme_udesc = "ES", 
.pme_ucode = 1 << 0, }, { .pme_uname = "CS", .pme_udesc = "CS", .pme_ucode = 1 << 1, }, { .pme_uname = "SS", .pme_udesc = "SS", .pme_ucode = 1 << 2, }, { .pme_uname = "DS", .pme_udesc = "DS", .pme_ucode = 1 << 3, }, { .pme_uname = "FS", .pme_udesc = "FS", .pme_ucode = 1 << 4, }, { .pme_uname = "GS", .pme_udesc = "GS", .pme_ucode = 1 << 5, }, { .pme_uname = "HS", .pme_udesc = "HS", .pme_ucode = 1 << 6, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x7F, }, }, }, /* 7 */{.pme_name = "PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE", .pme_code = 0x21, .pme_desc = "Pipeline Restart Due to Self-Modifying Code", }, /* 8 */{.pme_name = "PIPELINE_RESTART_DUE_TO_PROBE_HIT", .pme_code = 0x22, .pme_desc = "Pipeline Restart Due to Probe Hit", }, /* 9 */{.pme_name = "LOAD_Q_STORE_Q_FULL", .pme_code = 0x23, .pme_desc = "Load Queue/Store Queue Full", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "LOAD_QUEUE", .pme_udesc = "The number of cycles that the load buffer is full", .pme_ucode = 1 << 0, }, { .pme_uname = "STORE_QUEUE", .pme_udesc = "The number of cycles that the store buffer is full", .pme_ucode = 1 << 1, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, }, }, }, /* 10 */{.pme_name = "LOCKED_OPS", .pme_code = 0x24, .pme_desc = "Locked Operations", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "EXECUTED", .pme_udesc = "Number of locked instructions executed", .pme_ucode = 1 << 0, }, { .pme_uname = "CYCLES_NON_SPECULATIVE_PHASE", .pme_udesc = "Number of cycles spent in non-speculative phase, excluding cache miss penalty", .pme_ucode = 1 << 2, }, { .pme_uname = "CYCLES_WAITING", .pme_udesc = "Number of cycles spent in non-speculative phase, including the cache miss penalty", .pme_ucode = 1 << 3, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0D, }, }, }, /* 11 */{.pme_name = 
"RETIRED_CLFLUSH_INSTRUCTIONS", .pme_code = 0x26, .pme_desc = "Retired CLFLUSH Instructions", }, /* 12 */{.pme_name = "RETIRED_CPUID_INSTRUCTIONS", .pme_code = 0x27, .pme_desc = "Retired CPUID Instructions", }, /* 13 */{.pme_name = "CANCELLED_STORE_TO_LOAD", .pme_code = 0x2A, .pme_desc = "Canceled Store to Load Forward Operations", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 2, .pme_umasks = { { .pme_uname = "SIZE_ADDRESS_MISMATCHES", .pme_udesc = "Store is smaller than load or different starting byte but partial overlap", .pme_ucode = 1 << 0, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x01, }, }, }, /* 14 */{.pme_name = "SMIS_RECEIVED", .pme_code = 0x2B, .pme_desc = "SMIs Received", }, /* 15 */{.pme_name = "DATA_CACHE_ACCESSES", .pme_code = 0x40, .pme_desc = "Data Cache Accesses", }, /* 16 */{.pme_name = "DATA_CACHE_MISSES", .pme_code = 0x41, .pme_desc = "Data Cache Misses", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "DC_MISS_STREAMING_STORE", .pme_udesc = "First data cache miss or streaming store to a 64B cache line", .pme_ucode = 1 << 0, }, { .pme_uname = "STREAMING_STORE", .pme_udesc = "First streaming store to a 64B cache line", .pme_ucode = 1 << 1, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, }, }, }, /* 17 */{.pme_name = "DATA_CACHE_REFILLS_FROM_L2_OR_NORTHBRIDGE", .pme_code = 0x42, .pme_desc = "Data Cache Refills from L2 or System", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "GOOD", .pme_udesc = "Fill with good data. 
(Final valid status is valid)", .pme_ucode = 1 << 0, }, { .pme_uname = "INVALID", .pme_udesc = "Early valid status turned out to be invalid", .pme_ucode = 1 << 1, }, { .pme_uname = "POISON", .pme_udesc = "Fill with poison data", .pme_ucode = 1 << 2, }, { .pme_uname = "READ_ERROR", .pme_udesc = "Fill with read data error", .pme_ucode = 1 << 3, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 18 */{.pme_name = "DATA_CACHE_REFILLS_FROM_NORTHBRIDGE", .pme_code = 0x43, .pme_desc = "Data Cache Refills from System", }, /* 19 */{.pme_name = "UNIFIED_TLB_HIT", .pme_code = 0x45, .pme_desc = "Unified TLB Hit", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 7, .pme_umasks = { { .pme_uname = "4K_DATA", .pme_udesc = "4 KB unified TLB hit for data", .pme_ucode = 1 << 0, }, { .pme_uname = "2M_DATA", .pme_udesc = "2 MB unified TLB hit for data", .pme_ucode = 1 << 1, }, { .pme_uname = "1G_DATA", .pme_udesc = "1 GB unified TLB hit for data", .pme_ucode = 1 << 2, }, { .pme_uname = "4K_INST", .pme_udesc = "4 KB unified TLB hit for instruction", .pme_ucode = 1 << 4, }, { .pme_uname = "2M_INST", .pme_udesc = "2 MB unified TLB hit for instruction", .pme_ucode = 1 << 5, }, { .pme_uname = "1G_INST", .pme_udesc = "1 GB unified TLB hit for instruction", .pme_ucode = 1 << 6, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x77, }, }, }, /* 20 */{.pme_name = "UNIFIED_TLB_MISS", .pme_code = 0x46, .pme_desc = "Unified TLB Miss", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 7, .pme_umasks = { { .pme_uname = "4K_DATA", .pme_udesc = "4 KB unified TLB miss for data", .pme_ucode = 1 << 0, }, { .pme_uname = "2M_DATA", .pme_udesc = "2 MB unified TLB miss for data", .pme_ucode = 1 << 1, }, { .pme_uname = "1GB_DATA", .pme_udesc = "1 GB unified TLB miss for data", .pme_ucode = 1 << 2, }, { .pme_uname = "4K_INST", .pme_udesc = "4 KB unified TLB miss for instruction", .pme_ucode = 1 << 4, }, { .pme_uname = "2M_INST", 
.pme_udesc = "2 MB unified TLB miss for instruction", .pme_ucode = 1 << 5, }, { .pme_uname = "1G_INST", .pme_udesc = "1 GB unified TLB miss for instruction", .pme_ucode = 1 << 6, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x77, }, }, }, /* 21 */{.pme_name = "MISALIGNED_ACCESSES", .pme_code = 0x47, .pme_desc = "Misaligned Accesses", }, /* 22 */{.pme_name = "PREFETCH_INSTRUCTIONS_DISPATCHED", .pme_code = 0x4B, .pme_desc = "Prefetch Instructions Dispatched", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "LOAD", .pme_udesc = "Load (Prefetch, PrefetchT0/T1/T2)", .pme_ucode = 1 << 0, }, { .pme_uname = "STORE", .pme_udesc = "Store (PrefetchW)", .pme_ucode = 1 << 1, }, { .pme_uname = "NTA", .pme_udesc = "NTA (PrefetchNTA)", .pme_ucode = 1 << 2, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 23 */{.pme_name = "INEFFECTIVE_SW_PREFETCHES", .pme_code = 0x52, .pme_desc = "Ineffective Software Prefetches", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "SW_PREFETCH_HIT_IN_L1", .pme_udesc = "Software prefetch hit in the L1", .pme_ucode = 1 << 0, }, { .pme_uname = "SW_PREFETCH_HIT_IN_L2", .pme_udesc = "Software prefetch hit in the L2", .pme_ucode = 1 << 3, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x09, }, }, }, /* 24 */{.pme_name = "MEMORY_REQUESTS", .pme_code = 0x65, .pme_desc = "Memory Requests by Type", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "NON_CACHEABLE", .pme_udesc = "Requests to non-cacheable (UC) memory", .pme_ucode = 1 << 0, }, { .pme_uname = "WRITE_COMBINING", .pme_udesc = "Requests to non-cacheable (WC, but not WC+/SS) memory", .pme_ucode = 1 << 1, }, { .pme_uname = "STREAMING_STORE", .pme_udesc = "Requests to non-cacheable (WC+/SS, but not WC) memory", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All 
sub-events selected", .pme_ucode = 0x83, }, }, }, /* 25 */{.pme_name = "DATA_PREFETCHER", .pme_code = 0x67, .pme_desc = "Data Prefetcher", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 2, .pme_umasks = { { .pme_uname = "ATTEMPTED", .pme_udesc = "Prefetch attempts", .pme_ucode = 1 << 1, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x02, }, }, }, /* 26 */{.pme_name = "MAB_REQS", .pme_code = 0x68, .pme_desc = "MAB Requests", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "BUFFER_BIT_0", .pme_udesc = "Buffer entry index bit 0", .pme_ucode = 1 << 0, }, { .pme_uname = "BUFFER_BIT_1", .pme_udesc = "Buffer entry index bit 1", .pme_ucode = 1 << 1, }, { .pme_uname = "BUFFER_BIT_2", .pme_udesc = "Buffer entry index bit 2", .pme_ucode = 1 << 2, }, { .pme_uname = "BUFFER_BIT_3", .pme_udesc = "Buffer entry index bit 3", .pme_ucode = 1 << 3, }, { .pme_uname = "BUFFER_BIT_4", .pme_udesc = "Buffer entry index bit 4", .pme_ucode = 1 << 4, }, { .pme_uname = "BUFFER_BIT_5", .pme_udesc = "Buffer entry index bit 5", .pme_ucode = 1 << 5, }, { .pme_uname = "BUFFER_BIT_6", .pme_udesc = "Buffer entry index bit 6", .pme_ucode = 1 << 6, }, { .pme_uname = "BUFFER_BIT_7", .pme_udesc = "Buffer entry index bit 7", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 27 */{.pme_name = "MAB_WAIT", .pme_code = 0x69, .pme_desc = "MAB Wait Cycles", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "BUFFER_BIT_0", .pme_udesc = "Buffer entry index bit 0", .pme_ucode = 1 << 0, }, { .pme_uname = "BUFFER_BIT_1", .pme_udesc = "Buffer entry index bit 1", .pme_ucode = 1 << 1, }, { .pme_uname = "BUFFER_BIT_2", .pme_udesc = "Buffer entry index bit 2", .pme_ucode = 1 << 2, }, { .pme_uname = "BUFFER_BIT_3", .pme_udesc = "Buffer entry index bit 3", .pme_ucode = 1 << 3, }, { .pme_uname = "BUFFER_BIT_4", .pme_udesc = "Buffer entry 
index bit 4", .pme_ucode = 1 << 4, }, { .pme_uname = "BUFFER_BIT_5", .pme_udesc = "Buffer entry index bit 5", .pme_ucode = 1 << 5, }, { .pme_uname = "BUFFER_BIT_6", .pme_udesc = "Buffer entry index bit 6", .pme_ucode = 1 << 6, }, { .pme_uname = "BUFFER_BIT_7", .pme_udesc = "Buffer entry index bit 7", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 28 */{.pme_name = "SYSTEM_READ_RESPONSES", .pme_code = 0x6C, .pme_desc = "Response From System on Cache Refills", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 7, .pme_umasks = { { .pme_uname = "EXCLUSIVE", .pme_udesc = "Exclusive", .pme_ucode = 1 << 0, }, { .pme_uname = "MODIFIED", .pme_udesc = "Modified (D18F0x68[ATMModeEn]==0), Modified written (D18F0x68[ATMModeEn]==1)", .pme_ucode = 1 << 1, }, { .pme_uname = "SHARED", .pme_udesc = "Shared", .pme_ucode = 1 << 2, }, { .pme_uname = "OWNED", .pme_udesc = "Owned", .pme_ucode = 1 << 3, }, { .pme_uname = "DATA_ERROR", .pme_udesc = "Data Error", .pme_ucode = 1 << 4, }, { .pme_uname = "MODIFIED_UNWRITTEN", .pme_udesc = "Modified unwritten", .pme_ucode = 1 << 5, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x3F, }, }, }, /* 29 */{.pme_name = "OCTWORD_WRITE_TRANSFERS", .pme_code = 0x6D, .pme_desc = "Octwords Written to System", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 2, .pme_umasks = { { .pme_uname = "OCTWORD_WRITE_TRANSFER", .pme_udesc = "OW write transfer", .pme_ucode = 1 << 0, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x01, }, }, }, /* 30 */{.pme_name = "CPU_CLK_UNHALTED", .pme_code = 0x76, .pme_desc = "CPU Clocks not Halted", }, /* 31 */{.pme_name = "REQUESTS_TO_L2", .pme_code = 0x7D, .pme_desc = "Requests to L2 Cache", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 7, .pme_umasks = { { .pme_uname = "INSTRUCTIONS", .pme_udesc = "IC fill", .pme_ucode = 1 << 0, }, { .pme_uname = "DATA", .pme_udesc = "DC fill", 
.pme_ucode = 1 << 1, }, { .pme_uname = "TLB_WALK", .pme_udesc = "TLB fill (page table walks)", .pme_ucode = 1 << 2, }, { .pme_uname = "SNOOP", .pme_udesc = "NB probe request", .pme_ucode = 1 << 3, }, { .pme_uname = "CANCELLED", .pme_udesc = "Canceled request", .pme_ucode = 1 << 4, }, { .pme_uname = "PREFETCHER", .pme_udesc = "L2 cache prefetcher request", .pme_ucode = 1 << 6, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x5F, }, }, }, /* 32 */{.pme_name = "L2_CACHE_MISS", .pme_code = 0x7E, .pme_desc = "L2 Cache Misses", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "INSTRUCTIONS", .pme_udesc = "IC fill", .pme_ucode = 1 << 0, }, { .pme_uname = "DATA", .pme_udesc = "DC fill (includes possible replays, whereas PMCx041 does not)", .pme_ucode = 1 << 1, }, { .pme_uname = "TLB_WALK", .pme_udesc = "TLB page table walk", .pme_ucode = 1 << 2, }, { .pme_uname = "PREFETCHER", .pme_udesc = "L2 Cache Prefetcher request", .pme_ucode = 1 << 4, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x17, }, }, }, /* 33 */{.pme_name = "L2_CACHE_FILL_WRITEBACK", .pme_code = 0x7F, .pme_desc = "L2 Fill/Writeback", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "L2_FILLS", .pme_udesc = "L2 fills from system", .pme_ucode = 1 << 0, }, { .pme_uname = "L2_WRITEBACKS", .pme_udesc = "L2 Writebacks to system (Clean and Dirty)", .pme_ucode = 1 << 1, }, { .pme_uname = "L2_WRITEBACKS_CLEAN", .pme_udesc = "L2 Clean Writebacks to system", .pme_ucode = 1 << 2, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 34 */{.pme_name = "PAGE_SPLINTERING", .pme_code = 0x165, .pme_desc = "Page Splintering", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "GUEST_LARGER", .pme_udesc = "Guest page size is larger than host page size when nested paging is enabled", .pme_ucode = 1 << 0, }, { 
.pme_uname = "MTRR_MISMATCH", .pme_udesc = "Splintering due to MTRRs, IORRs, APIC, TOMs or other special address region", .pme_ucode = 1 << 1, }, { .pme_uname = "HOST_LARGER", .pme_udesc = "Host page size is larger than the guest page size", .pme_ucode = 1 << 2, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 35 */{.pme_name = "INSTRUCTION_CACHE_FETCHES", .pme_code = 0x80, .pme_desc = "Instruction Cache Fetches", }, /* 36 */{.pme_name = "INSTRUCTION_CACHE_MISSES", .pme_code = 0x81, .pme_desc = "Instruction Cache Misses", }, /* 37 */{.pme_name = "INSTRUCTION_CACHE_REFILLS_FROM_L2", .pme_code = 0x82, .pme_desc = "Instruction Cache Refills from L2", }, /* 38 */{.pme_name = "INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM", .pme_code = 0x83, .pme_desc = "Instruction Cache Refills from System", }, /* 39 */{.pme_name = "L1_ITLB_MISS_AND_L2_ITLB_HIT", .pme_code = 0x84, .pme_desc = "L1 ITLB Miss, L2 ITLB Hit", }, /* 40 */{.pme_name = "L1_ITLB_MISS_AND_L2_ITLB_MISS", .pme_code = 0x85, .pme_desc = "L1 ITLB Miss, L2 ITLB Miss", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "4K_PAGE_FETCHES", .pme_udesc = "Instruction fetches to a 4 KB page", .pme_ucode = 1 << 0, }, { .pme_uname = "2M_PAGE_FETCHES", .pme_udesc = "Instruction fetches to a 2 MB page", .pme_ucode = 1 << 1, }, { .pme_uname = "1G_PAGE_FETCHES", .pme_udesc = "Instruction fetches to a 1 GB page", .pme_ucode = 1 << 2, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 41 */{.pme_name = "PIPELINE_RESTART_DUE_TO_INSTRUCTION_STREAM_PROBE", .pme_code = 0x86, .pme_desc = "Pipeline Restart Due to Instruction Stream Probe", }, /* 42 */{.pme_name = "INSTRUCTION_FETCH_STALL", .pme_code = 0x87, .pme_desc = "Instruction Fetch Stall", }, /* 43 */{.pme_name = "RETURN_STACK_HITS", .pme_code = 0x88, .pme_desc = "Return Stack Hits", }, /* 44 */{.pme_name = "RETURN_STACK_OVERFLOWS", .pme_code = 0x89, .pme_desc = 
"Return Stack Overflows", }, /* 45 */{.pme_name = "INSTRUCTION_CACHE_VICTIMS", .pme_code = 0x8B, .pme_desc = "Instruction Cache Victims", }, /* 46 */{.pme_name = "INSTRUCTION_CACHE_INVALIDATED", .pme_code = 0x8C, .pme_desc = "Instruction Cache Lines Invalidated", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "NON_SMC_PROBE_MISS", .pme_udesc = "Non-SMC invalidating probe that missed on in-flight instructions", .pme_ucode = 1 << 0, }, { .pme_uname = "NON_SMC_PROBE_HIT", .pme_udesc = "Non-SMC invalidating probe that hit on in-flight instructions", .pme_ucode = 1 << 1, }, { .pme_uname = "SMC_PROBE_MISS", .pme_udesc = "SMC invalidating probe that missed on in-flight instructions", .pme_ucode = 1 << 2, }, { .pme_uname = "SMC_PROBE_HIT", .pme_udesc = "SMC invalidating probe that hit on in-flight instructions", .pme_ucode = 1 << 3, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 47 */{.pme_name = "ITLB_RELOADS", .pme_code = 0x99, .pme_desc = "ITLB Reloads", }, /* 48 */{.pme_name = "ITLB_RELOADS_ABORTED", .pme_code = 0x9A, .pme_desc = "ITLB Reloads Aborted", }, /* 49 */{.pme_name = "RETIRED_INSTRUCTIONS", .pme_code = 0xC0, .pme_desc = "Retired Instructions", }, /* 50 */{.pme_name = "RETIRED_UOPS", .pme_code = 0xC1, .pme_desc = "Retired uops", }, /* 51 */{.pme_name = "RETIRED_BRANCH_INSTRUCTIONS", .pme_code = 0xC2, .pme_desc = "Retired Branch Instructions", }, /* 52 */{.pme_name = "RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS", .pme_code = 0xC3, .pme_desc = "Retired Mispredicted Branch Instructions", }, /* 53 */{.pme_name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .pme_code = 0xC4, .pme_desc = "Retired Taken Branch Instructions", }, /* 54 */{.pme_name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .pme_code = 0xC5, .pme_desc = "Retired Taken Branch Instructions Mispredicted", }, /* 55 */{.pme_name = "RETIRED_FAR_CONTROL_TRANSFERS", .pme_code = 0xC6, .pme_desc = "Retired Far Control 
Transfers", }, /* 56 */{.pme_name = "RETIRED_BRANCH_RESYNCS", .pme_code = 0xC7, .pme_desc = "Retired Branch Resyncs", }, /* 57 */{.pme_name = "RETIRED_NEAR_RETURNS", .pme_code = 0xC8, .pme_desc = "Retired Near Returns", }, /* 58 */{.pme_name = "RETIRED_NEAR_RETURNS_MISPREDICTED", .pme_code = 0xC9, .pme_desc = "Retired Near Returns Mispredicted", }, /* 59 */{.pme_name = "RETIRED_INDIRECT_BRANCHES_MISPREDICTED", .pme_code = 0xCA, .pme_desc = "Retired Indirect Branches Mispredicted", }, /* 60 */{.pme_name = "RETIRED_MMX_FP_INSTRUCTIONS", .pme_code = 0xCB, .pme_desc = "Retired MMX/FP Instructions", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "X87", .pme_udesc = "x87 instructions", .pme_ucode = 1 << 0, }, { .pme_uname = "MMX", .pme_udesc = "MMX(tm) instructions", .pme_ucode = 1 << 1, }, { .pme_uname = "SSE", .pme_udesc = "SSE instructions (SSE,SSE2,SSE3,SSSE3,SSE4A,SSE4.1,SSE4.2,AVX,XOP,FMA4)", .pme_ucode = 1 << 2, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 61 */{.pme_name = "INTERRUPTS_MASKED_CYCLES", .pme_code = 0xCD, .pme_desc = "Interrupts-Masked Cycles", }, /* 62 */{.pme_name = "INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING", .pme_code = 0xCE, .pme_desc = "Interrupts-Masked Cycles with Interrupt Pending", }, /* 63 */{.pme_name = "INTERRUPTS_TAKEN", .pme_code = 0xCF, .pme_desc = "Interrupts Taken", }, /* 64 */{.pme_name = "DECODER_EMPTY", .pme_code = 0xD0, .pme_desc = "Decoder Empty", }, /* 65 */{.pme_name = "DISPATCH_STALLS", .pme_code = 0xD1, .pme_desc = "Dispatch Stalls", }, /* 66 */{.pme_name = "DISPATCH_STALL_FOR_SERIALIZATION", .pme_code = 0xD3, .pme_desc = "Microsequencer Stall due to Serialization", }, /* 67 */{.pme_name = "DISPATCH_STALL_FOR_RETIRE_QUEUE_FULL", .pme_code = 0xD5, .pme_desc = "Dispatch Stall for Instruction Retire Q Full", }, /* 68 */{.pme_name = "DISPATCH_STALL_FOR_INT_SCHED_QUEUE_FULL", .pme_code = 0xD6, .pme_desc = "Dispatch Stall for Integer 
Scheduler Queue Full", }, /* 69 */{.pme_name = "DISPATCH_STALL_FOR_FPU_FULL", .pme_code = 0xD7, .pme_desc = "Dispatch Stall for FP Scheduler Queue Full", }, /* 70 */{.pme_name = "DISPATCH_STALL_FOR_LDQ_FULL", .pme_code = 0xD8, .pme_desc = "Dispatch Stall for LDQ Full", }, /* 71 */{.pme_name = "MICROSEQ_STALL_WAITING_FOR_ALL_QUIET", .pme_code = 0xD9, .pme_desc = "Microsequencer Stall Waiting for All Quiet", }, /* 72 */{.pme_name = "FPU_EXCEPTIONS", .pme_code = 0xDB, .pme_desc = "FPU Exceptions", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "TOTAL_FAULTS", .pme_udesc = "Total microfaults", .pme_ucode = 1 << 0, }, { .pme_uname = "TOTAL_TRAPS", .pme_udesc = "Total microtraps", .pme_ucode = 1 << 1, }, { .pme_uname = "INT2EXT_FAULTS", .pme_udesc = "Int2Ext faults", .pme_ucode = 1 << 2, }, { .pme_uname = "EXT2INT_FAULTS", .pme_udesc = "Ext2Int faults", .pme_ucode = 1 << 3, }, { .pme_uname = "BYPASS_FAULTS", .pme_udesc = "Bypass faults", .pme_ucode = 1 << 4, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x1F, }, }, }, /* 73 */{.pme_name = "DR0_BREAKPOINTS", .pme_code = 0xDC, .pme_desc = "DR0 Breakpoint Match", }, /* 74 */{.pme_name = "DR1_BREAKPOINTS", .pme_code = 0xDD, .pme_desc = "DR1 Breakpoint Match", }, /* 75 */{.pme_name = "DR2_BREAKPOINTS", .pme_code = 0xDE, .pme_desc = "DR2 Breakpoint Match", }, /* 76 */{.pme_name = "DR3_BREAKPOINTS", .pme_code = 0xDF, .pme_desc = "DR3 Breakpoint Match", }, /* 77 */{.pme_name = "IBS_OPS_TAGGED", .pme_code = 0x1CF, .pme_desc = "Tagged IBS Ops", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "TAGGED", .pme_udesc = "Number of ops tagged by IBS", .pme_ucode = 1 << 0, }, { .pme_uname = "RETIRED", .pme_udesc = "Number of ops tagged by IBS that retired", .pme_ucode = 1 << 1, }, { .pme_uname = "IGNORED", .pme_udesc = "Number of times an op could not be tagged by IBS because of a previous tagged op that has not retired", 
.pme_ucode = 1 << 2, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* Northbridge events (.pme_code & 0x0E0) not yet supported by the kernel */ #if 0 /* 78 */{.pme_name = "DRAM_ACCESSES", .pme_code = 0xE0, .pme_desc = "DRAM Accesses", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 7, .pme_umasks = { { .pme_uname = "HIT", .pme_udesc = "DCT0 Page hit", .pme_ucode = 1 << 0, }, { .pme_uname = "MISS", .pme_udesc = "DCT0 Page Miss", .pme_ucode = 1 << 1, }, { .pme_uname = "CONFLICT", .pme_udesc = "DCT0 Page Conflict", .pme_ucode = 1 << 2, }, { .pme_uname = "DCT1_PAGE_HIT", .pme_udesc = "DCT1 Page hit", .pme_ucode = 1 << 3, }, { .pme_uname = "DCT1_PAGE_MISS", .pme_udesc = "DCT1 Page Miss", .pme_ucode = 1 << 4, }, { .pme_uname = "DCT1_PAGE_CONFLICT", .pme_udesc = "DCT1 Page Conflict", .pme_ucode = 1 << 5, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x3F, }, }, }, /* 79 */{.pme_name = "MEMORY_CONTROLLER_PAGE_TABLE_OVERFLOWS", .pme_code = 0xE1, .pme_desc = "DRAM Controller Page Table Overflows", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "DCT0_PAGE_TABLE_OVERFLOW", .pme_udesc = "DCT0 Page Table Overflow", .pme_ucode = 1 << 0, }, { .pme_uname = "DCT1_PAGE_TABLE_OVERFLOW", .pme_udesc = "DCT1 Page Table Overflow", .pme_ucode = 1 << 1, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, }, }, }, /* 80 */{.pme_name = "MEMORY_CONTROLLER_SLOT_MISSED", .pme_code = 0xE2, .pme_desc = "Memory Controller DRAM Command Slots Missed", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "DCT0_COMMAND_SLOTS_MISSED", .pme_udesc = "DCT0 Command Slots Missed (in MemClks)", .pme_ucode = 1 << 0, }, { .pme_uname = "DCT1_COMMAND_SLOTS_MISSED", .pme_udesc = "DCT1 Command Slots Missed (in MemClks)", .pme_ucode = 1 << 1, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, 
}, }, }, /* 81 */{.pme_name = "MEMORY_CONTROLLER_TURNAROUNDS", .pme_code = 0xE3, .pme_desc = "Memory Controller Turnarounds", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 7, .pme_umasks = { { .pme_uname = "CHIP_SELECT", .pme_udesc = "DCT0 DIMM (chip select) turnaround", .pme_ucode = 1 << 0, }, { .pme_uname = "READ_TO_WRITE", .pme_udesc = "DCT0 Read to write turnaround", .pme_ucode = 1 << 1, }, { .pme_uname = "WRITE_TO_READ", .pme_udesc = "DCT0 Write to read turnaround", .pme_ucode = 1 << 2, }, { .pme_uname = "DCT1_DIMM", .pme_udesc = "DCT1 DIMM (chip select) turnaround", .pme_ucode = 1 << 3, }, { .pme_uname = "DCT1_READ_TO_WRITE_TURNAROUND", .pme_udesc = "DCT1 Read to write turnaround", .pme_ucode = 1 << 4, }, { .pme_uname = "DCT1_WRITE_TO_READ_TURNAROUND", .pme_udesc = "DCT1 Write to read turnaround", .pme_ucode = 1 << 5, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x3F, }, }, }, /* 82 */{.pme_name = "MEMORY_CONTROLLER_BYPASS_COUNTER_SATURATION", .pme_code = 0xE4, .pme_desc = "Memory Controller Bypass Counter Saturation", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "HIGH_PRIORITY", .pme_udesc = "Memory controller high priority bypass", .pme_ucode = 1 << 0, }, { .pme_uname = "MEDIUM_PRIORITY", .pme_udesc = "Memory controller medium priority bypass", .pme_ucode = 1 << 1, }, { .pme_uname = "DCT0_DCQ", .pme_udesc = "DCT0 DCQ bypass", .pme_ucode = 1 << 2, }, { .pme_uname = "DCT1_DCQ", .pme_udesc = "DCT1 DCQ bypass", .pme_ucode = 1 << 3, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 83 */{.pme_name = "THERMAL_STATUS", .pme_code = 0xE8, .pme_desc = "Thermal Status", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "CLKS_DIE_TEMP_TOO_HIGH", .pme_udesc = "Number of times the HTC trip point is crossed", .pme_ucode = 1 << 2, }, { .pme_uname = "CLOCKS_HTC_P_STATE_INACTIVE", .pme_udesc = "Number of 
clocks HTC P-state is inactive", .pme_ucode = 1 << 5, }, { .pme_uname = "CLOCKS_HTC_P_STATE_ACTIVE", .pme_udesc = "Number of clocks HTC P-state is active", .pme_ucode = 1 << 6, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x64, }, }, }, /* 84 */{.pme_name = "CPU_IO_REQUESTS_TO_MEMORY_IO", .pme_code = 0xE9, .pme_desc = "CPU/IO Requests to Memory/IO", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "I_O_TO_I_O", .pme_udesc = "IO to IO", .pme_ucode = 1 << 0, }, { .pme_uname = "I_O_TO_MEM", .pme_udesc = "IO to Mem", .pme_ucode = 1 << 1, }, { .pme_uname = "CPU_TO_I_O", .pme_udesc = "CPU to IO", .pme_ucode = 1 << 2, }, { .pme_uname = "CPU_TO_MEM", .pme_udesc = "CPU to Mem", .pme_ucode = 1 << 3, }, { .pme_uname = "TO_REMOTE_NODE", .pme_udesc = "To remote node", .pme_ucode = 1 << 4, }, { .pme_uname = "TO_LOCAL_NODE", .pme_udesc = "To local node", .pme_ucode = 1 << 5, }, { .pme_uname = "FROM_REMOTE_NODE", .pme_udesc = "From remote node", .pme_ucode = 1 << 6, }, { .pme_uname = "FROM_LOCAL_NODE", .pme_udesc = "From local node", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 85 */{.pme_name = "CACHE_BLOCK_COMMANDS", .pme_code = 0xEA, .pme_desc = "Cache Block Commands", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "VICTIM_WRITEBACK", .pme_udesc = "Victim Block (Writeback)", .pme_ucode = 1 << 0, }, { .pme_uname = "DCACHE_LOAD_MISS", .pme_udesc = "Read Block (Dcache load miss refill)", .pme_ucode = 1 << 2, }, { .pme_uname = "SHARED_ICACHE_REFILL", .pme_udesc = "Read Block Shared (Icache refill)", .pme_ucode = 1 << 3, }, { .pme_uname = "READ_BLOCK_MODIFIED", .pme_udesc = "Read Block Modified (Dcache store miss refill)", .pme_ucode = 1 << 4, }, { .pme_uname = "CHANGE_TO_DIRTY", .pme_udesc = "Change-to-Dirty (first store to clean block already in cache)", .pme_ucode = 1 << 5, }, { .pme_uname = "ALL", 
.pme_udesc = "All sub-events selected", .pme_ucode = 0x3D, }, }, }, /* 86 */{.pme_name = "SIZED_COMMANDS", .pme_code = 0xEB, .pme_desc = "Sized Commands", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 7, .pme_umasks = { { .pme_uname = "NON_POSTED_WRITE_BYTE", .pme_udesc = "Non-Posted SzWr Byte (1-32 bytes)", .pme_ucode = 1 << 0, }, { .pme_uname = "NON_POSTED_WRITE_DWORD", .pme_udesc = "Non-Posted SzWr DW (1-16 dwords)", .pme_ucode = 1 << 1, }, { .pme_uname = "POSTED_WRITE_BYTE", .pme_udesc = "Posted SzWr Byte (1-32 bytes)", .pme_ucode = 1 << 2, }, { .pme_uname = "POSTED_WRITE_DWORD", .pme_udesc = "Posted SzWr DW (1-16 dwords)", .pme_ucode = 1 << 3, }, { .pme_uname = "READ_BYTE", .pme_udesc = "SzRd Byte (4 bytes)", .pme_ucode = 1 << 4, }, { .pme_uname = "READ_DWORD", .pme_udesc = "SzRd DW (1-16 dwords)", .pme_ucode = 1 << 5, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x3F, }, }, }, /* 87 */{.pme_name = "PROBE_RESPONSES_AND_UPSTREAM_REQUESTS", .pme_code = 0xEC, .pme_desc = "Probe Responses and Upstream Requests", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "MISS", .pme_udesc = "Probe miss", .pme_ucode = 1 << 0, }, { .pme_uname = "HIT_CLEAN", .pme_udesc = "Probe hit clean", .pme_ucode = 1 << 1, }, { .pme_uname = "HIT_DIRTY_NO_MEMORY_CANCEL", .pme_udesc = "Probe hit dirty without memory cancel (probed by Sized Write or Change2Dirty)", .pme_ucode = 1 << 2, }, { .pme_uname = "HIT_DIRTY_WITH_MEMORY_CANCEL", .pme_udesc = "Probe hit dirty with memory cancel (probed by DMA read/cache refill request)", .pme_ucode = 1 << 3, }, { .pme_uname = "UPSTREAM_DISPLAY_REFRESH_READS", .pme_udesc = "Upstream display refresh/ISOC reads", .pme_ucode = 1 << 4, }, { .pme_uname = "UPSTREAM_NON_DISPLAY_REFRESH_READS", .pme_udesc = "Upstream non-display refresh reads", .pme_ucode = 1 << 5, }, { .pme_uname = "UPSTREAM_WRITES", .pme_udesc = "Upstream ISOC writes", .pme_ucode = 1 << 6, }, { .pme_uname = 
"UPSTREAM_NON_ISOC_WRITES", .pme_udesc = "Upstream non-ISOC writes", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 88 */{.pme_name = "GART_EVENTS", .pme_code = 0xEE, .pme_desc = "GART Events", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "CPU_HIT", .pme_udesc = "GART aperture hit on access from CPU", .pme_ucode = 1 << 0, }, { .pme_uname = "IO_HIT", .pme_udesc = "GART aperture hit on access from IO", .pme_ucode = 1 << 1, }, { .pme_uname = "MISS", .pme_udesc = "GART miss", .pme_ucode = 1 << 2, }, { .pme_uname = "TABLE_WALK", .pme_udesc = "GART Request hit table walk in progress", .pme_ucode = 1 << 3, }, { .pme_uname = "MULTIPLE_TABLE_WALK", .pme_udesc = "GART multiple table walk in progress", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x8F, }, }, }, /* 89 */{.pme_name = "HYPERTRANSPORT_LINK0_TRANSMIT_BANDWIDTH", .pme_code = 0xF6, .pme_desc = "HyperTransport(tm) Link 0 Transmit Bandwidth", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 8, .pme_umasks = { { .pme_uname = "COMMAND_DWORD_SENT", .pme_udesc = "Command DWORD sent", .pme_ucode = 1 << 0, }, { .pme_uname = "DATA_DWORD_SENT", .pme_udesc = "Data DWORD sent", .pme_ucode = 1 << 1, }, { .pme_uname = "BUFFER_RELEASE_DWORD_SENT", .pme_udesc = "Buffer release DWORD sent", .pme_ucode = 1 << 2, }, { .pme_uname = "NOP_DWORD_SENT", .pme_udesc = "Nop DW sent (idle)", .pme_ucode = 1 << 3, }, { .pme_uname = "ADDRESS_EXT_DWORD_SENT", .pme_udesc = "Address DWORD sent", .pme_ucode = 1 << 4, }, { .pme_uname = "PER_PACKET_CRC_SENT", .pme_udesc = "Per packet CRC sent", .pme_ucode = 1 << 5, }, { .pme_uname = "SUBLINK_MASK", .pme_udesc = "SubLink Mask", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xBF, }, }, }, /* 90 */{.pme_name = "HYPERTRANSPORT_LINK1_TRANSMIT_BANDWIDTH", .pme_code = 0xF7, .pme_desc = 
"HyperTransport(tm) Link 1 Transmit Bandwidth", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 8, .pme_umasks = { { .pme_uname = "COMMAND_DWORD_SENT", .pme_udesc = "Command DWORD sent", .pme_ucode = 1 << 0, }, { .pme_uname = "DATA_DWORD_SENT", .pme_udesc = "Data DWORD sent", .pme_ucode = 1 << 1, }, { .pme_uname = "BUFFER_RELEASE_DWORD_SENT", .pme_udesc = "Buffer release DWORD sent", .pme_ucode = 1 << 2, }, { .pme_uname = "NOP_DWORD_SENT", .pme_udesc = "Nop DW sent (idle)", .pme_ucode = 1 << 3, }, { .pme_uname = "ADDRESS_EXT_DWORD_SENT", .pme_udesc = "Address DWORD sent", .pme_ucode = 1 << 4, }, { .pme_uname = "PER_PACKET_CRC_SENT", .pme_udesc = "Per packet CRC sent", .pme_ucode = 1 << 5, }, { .pme_uname = "SUBLINK_MASK", .pme_udesc = "SubLink Mask", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xBF, }, }, }, /* 91 */{.pme_name = "HYPERTRANSPORT_LINK2_TRANSMIT_BANDWIDTH", .pme_code = 0xF8, .pme_desc = "HyperTransport(tm) Link 2 Transmit Bandwidth", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 8, .pme_umasks = { { .pme_uname = "COMMAND_DWORD_SENT", .pme_udesc = "Command DWORD sent", .pme_ucode = 1 << 0, }, { .pme_uname = "DATA_DWORD_SENT", .pme_udesc = "Data DWORD sent", .pme_ucode = 1 << 1, }, { .pme_uname = "BUFFER_RELEASE_DWORD_SENT", .pme_udesc = "Buffer release DWORD sent", .pme_ucode = 1 << 2, }, { .pme_uname = "NOP_DWORD_SENT", .pme_udesc = "Nop DW sent (idle)", .pme_ucode = 1 << 3, }, { .pme_uname = "ADDRESS_EXT_DWORD_SENT", .pme_udesc = "Address DWORD sent", .pme_ucode = 1 << 4, }, { .pme_uname = "PER_PACKET_CRC_SENT", .pme_udesc = "Per packet CRC sent", .pme_ucode = 1 << 5, }, { .pme_uname = "SUBLINK_MASK", .pme_udesc = "SubLink Mask", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xBF, }, }, }, /* 92 */{.pme_name = "HYPERTRANSPORT_LINK3_TRANSMIT_BANDWIDTH", .pme_code = 0x1F9, .pme_desc = "HyperTransport(tm) Link 3 Transmit Bandwidth", 
.pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 8, .pme_umasks = { { .pme_uname = "COMMAND_DWORD_SENT", .pme_udesc = "Command DWORD sent", .pme_ucode = 1 << 0, }, { .pme_uname = "DATA_DWORD_SENT", .pme_udesc = "Data DWORD sent", .pme_ucode = 1 << 1, }, { .pme_uname = "BUFFER_RELEASE_DWORD_SENT", .pme_udesc = "Buffer release DWORD sent", .pme_ucode = 1 << 2, }, { .pme_uname = "NOP_DWORD_SENT", .pme_udesc = "Nop DW sent (idle)", .pme_ucode = 1 << 3, }, { .pme_uname = "ADDRESS_EXT_DWORD_SENT", .pme_udesc = "Address DWORD sent", .pme_ucode = 1 << 4, }, { .pme_uname = "PER_PACKET_CRC_SENT", .pme_udesc = "Per packet CRC sent", .pme_ucode = 1 << 5, }, { .pme_uname = "SUBLINK_MASK", .pme_udesc = "SubLink Mask", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xBF, }, }, }, /* 93 */{.pme_name = "CPU_DRAM_REQUEST_TO_NODE", .pme_code = 0x1E0, .pme_desc = "CPU to DRAM Requests to Target Node", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "LOCAL_TO_0", .pme_udesc = "From Local node to Node 0", .pme_ucode = 1 << 0, }, { .pme_uname = "LOCAL_TO_1", .pme_udesc = "From Local node to Node 1", .pme_ucode = 1 << 1, }, { .pme_uname = "LOCAL_TO_2", .pme_udesc = "From Local node to Node 2", .pme_ucode = 1 << 2, }, { .pme_uname = "LOCAL_TO_3", .pme_udesc = "From Local node to Node 3", .pme_ucode = 1 << 3, }, { .pme_uname = "LOCAL_TO_4", .pme_udesc = "From Local node to Node 4", .pme_ucode = 1 << 4, }, { .pme_uname = "LOCAL_TO_5", .pme_udesc = "From Local node to Node 5", .pme_ucode = 1 << 5, }, { .pme_uname = "LOCAL_TO_6", .pme_udesc = "From Local node to Node 6", .pme_ucode = 1 << 6, }, { .pme_uname = "LOCAL_TO_7", .pme_udesc = "From Local node to Node 7", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 94 */{.pme_name = "IO_DRAM_REQUEST_TO_NODE", .pme_code = 0x1E1, .pme_desc = "IO to DRAM Requests to Target Node", 
.pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "LOCAL_TO_0", .pme_udesc = "From Local node to Node 0", .pme_ucode = 1 << 0, }, { .pme_uname = "LOCAL_TO_1", .pme_udesc = "From Local node to Node 1", .pme_ucode = 1 << 1, }, { .pme_uname = "LOCAL_TO_2", .pme_udesc = "From Local node to Node 2", .pme_ucode = 1 << 2, }, { .pme_uname = "LOCAL_TO_3", .pme_udesc = "From Local node to Node 3", .pme_ucode = 1 << 3, }, { .pme_uname = "LOCAL_TO_4", .pme_udesc = "From Local node to Node 4", .pme_ucode = 1 << 4, }, { .pme_uname = "LOCAL_TO_5", .pme_udesc = "From Local node to Node 5", .pme_ucode = 1 << 5, }, { .pme_uname = "LOCAL_TO_6", .pme_udesc = "From Local node to Node 6", .pme_ucode = 1 << 6, }, { .pme_uname = "LOCAL_TO_7", .pme_udesc = "From Local node to Node 7", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 95 */{.pme_name = "CPU_READ_COMMAND_LATENCY_NODE_0_3", .pme_code = 0x1E2, .pme_desc = "CPU Read Command Latency to Target Node 0-3", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "READ_BLOCK", .pme_udesc = "Read block", .pme_ucode = 1 << 0, }, { .pme_uname = "READ_BLOCK_SHARED", .pme_udesc = "Read block shared", .pme_ucode = 1 << 1, }, { .pme_uname = "READ_BLOCK_MODIFIED", .pme_udesc = "Read block modified", .pme_ucode = 1 << 2, }, { .pme_uname = "CHANGE_TO_DIRTY", .pme_udesc = "Change-to-Dirty", .pme_ucode = 1 << 3, }, { .pme_uname = "LOCAL_TO_0", .pme_udesc = "From Local node to Node 0", .pme_ucode = 1 << 4, }, { .pme_uname = "LOCAL_TO_1", .pme_udesc = "From Local node to Node 1", .pme_ucode = 1 << 5, }, { .pme_uname = "LOCAL_TO_2", .pme_udesc = "From Local node to Node 2", .pme_ucode = 1 << 6, }, { .pme_uname = "LOCAL_TO_3", .pme_udesc = "From Local node to Node 3", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 96 */{.pme_name = 
"CPU_READ_COMMAND_REQUEST_NODE_0_3", .pme_code = 0x1E3, .pme_desc = "CPU Read Command Requests to Target Node 0-3", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "READ_BLOCK", .pme_udesc = "Read block", .pme_ucode = 1 << 0, }, { .pme_uname = "READ_BLOCK_SHARED", .pme_udesc = "Read block shared", .pme_ucode = 1 << 1, }, { .pme_uname = "READ_BLOCK_MODIFIED", .pme_udesc = "Read block modified", .pme_ucode = 1 << 2, }, { .pme_uname = "CHANGE_TO_DIRTY", .pme_udesc = "Change-to-Dirty", .pme_ucode = 1 << 3, }, { .pme_uname = "LOCAL_TO_0", .pme_udesc = "From Local node to Node 0", .pme_ucode = 1 << 4, }, { .pme_uname = "LOCAL_TO_1", .pme_udesc = "From Local node to Node 1", .pme_ucode = 1 << 5, }, { .pme_uname = "LOCAL_TO_2", .pme_udesc = "From Local node to Node 2", .pme_ucode = 1 << 6, }, { .pme_uname = "LOCAL_TO_3", .pme_udesc = "From Local node to Node 3", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 97 */{.pme_name = "CPU_READ_COMMAND_LATENCY_NODE_4_7", .pme_code = 0x1E4, .pme_desc = "CPU Read Command Latency to Target Node 4-7", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "READ_BLOCK", .pme_udesc = "Read block", .pme_ucode = 1 << 0, }, { .pme_uname = "READ_BLOCK_SHARED", .pme_udesc = "Read block shared", .pme_ucode = 1 << 1, }, { .pme_uname = "READ_BLOCK_MODIFIED", .pme_udesc = "Read block modified", .pme_ucode = 1 << 2, }, { .pme_uname = "CHANGE_TO_DIRTY", .pme_udesc = "Change-to-Dirty", .pme_ucode = 1 << 3, }, { .pme_uname = "LOCAL_TO_4", .pme_udesc = "From Local node to Node 4", .pme_ucode = 1 << 4, }, { .pme_uname = "LOCAL_TO_5", .pme_udesc = "From Local node to Node 5", .pme_ucode = 1 << 5, }, { .pme_uname = "LOCAL_TO_6", .pme_udesc = "From Local node to Node 6", .pme_ucode = 1 << 6, }, { .pme_uname = "LOCAL_TO_7", .pme_udesc = "From Local node to Node 7", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", 
.pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 98 */{.pme_name = "CPU_READ_COMMAND_REQUEST_NODE_4_7", .pme_code = 0x1E5, .pme_desc = "CPU Read Command Requests to Target Node 4-7", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "READ_BLOCK", .pme_udesc = "Read block", .pme_ucode = 1 << 0, }, { .pme_uname = "READ_BLOCK_SHARED", .pme_udesc = "Read block shared", .pme_ucode = 1 << 1, }, { .pme_uname = "READ_BLOCK_MODIFIED", .pme_udesc = "Read block modified", .pme_ucode = 1 << 2, }, { .pme_uname = "CHANGE_TO_DIRTY", .pme_udesc = "Change-to-Dirty", .pme_ucode = 1 << 3, }, { .pme_uname = "LOCAL_TO_4", .pme_udesc = "From Local node to Node 4", .pme_ucode = 1 << 4, }, { .pme_uname = "LOCAL_TO_5", .pme_udesc = "From Local node to Node 5", .pme_ucode = 1 << 5, }, { .pme_uname = "LOCAL_TO_6", .pme_udesc = "From Local node to Node 6", .pme_ucode = 1 << 6, }, { .pme_uname = "LOCAL_TO_7", .pme_udesc = "From Local node to Node 7", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 99 */{.pme_name = "CPU_COMMAND_LATENCY_TARGET", .pme_code = 0x1E6, .pme_desc = "CPU Command Latency to Target Node 0-3/4-7", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "READ_SIZED", .pme_udesc = "Read Sized", .pme_ucode = 1 << 0, }, { .pme_uname = "WRITE_SIZED", .pme_udesc = "Write Sized", .pme_ucode = 1 << 1, }, { .pme_uname = "VICTIM_BLOCK", .pme_udesc = "Victim Block", .pme_ucode = 1 << 2, }, { .pme_uname = "NODE_GROUP_SELECT", .pme_udesc = "Node Group Select: 0=Nodes 0-3, 1= Nodes 4-7", .pme_ucode = 1 << 3, }, { .pme_uname = "LOCAL_TO_0_4", .pme_udesc = "From Local node to Node 0/4", .pme_ucode = 1 << 4, }, { .pme_uname = "LOCAL_TO_1_5", .pme_udesc = "From Local node to Node 1/5", .pme_ucode = 1 << 5, }, { .pme_uname = "LOCAL_TO_2_6", .pme_udesc = "From Local node to Node 2/6", .pme_ucode = 1 << 6, }, { .pme_uname = 
"LOCAL_TO_3_7", .pme_udesc = "From Local node to Node 3/7", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 100 */{.pme_name = "CPU_REQUEST_TARGET", .pme_code = 0x1E7, .pme_desc = "CPU Requests to Target Node 0-3/4-7", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "READ_SIZED", .pme_udesc = "Read Sized", .pme_ucode = 1 << 0, }, { .pme_uname = "WRITE_SIZED", .pme_udesc = "Write Sized", .pme_ucode = 1 << 1, }, { .pme_uname = "VICTIM_BLOCK", .pme_udesc = "Victim Block", .pme_ucode = 1 << 2, }, { .pme_uname = "NODE_GROUP_SELECT", .pme_udesc = "Node Group Select: 0=Nodes 0-3, 1= Nodes 4-7", .pme_ucode = 1 << 3, }, { .pme_uname = "LOCAL_TO_0_4", .pme_udesc = "From Local node to Node 0/4", .pme_ucode = 1 << 4, }, { .pme_uname = "LOCAL_TO_1_5", .pme_udesc = "From Local node to Node 1/5", .pme_ucode = 1 << 5, }, { .pme_uname = "LOCAL_TO_2_6", .pme_udesc = "From Local node to Node 2/6", .pme_ucode = 1 << 6, }, { .pme_uname = "LOCAL_TO_3_7", .pme_udesc = "From Local node to Node 3/7", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 101 */{.pme_name = "MEMORY_CONTROLLER_REQUESTS", .pme_code = 0x1F0, .pme_desc = "Memory Controller Requests", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "WRITE_REQUESTS", .pme_udesc = "Write requests sent to the DCT", .pme_ucode = 1 << 0, }, { .pme_uname = "READ_REQUESTS", .pme_udesc = "Read requests (including prefetch requests) sent to the DCT", .pme_ucode = 1 << 1, }, { .pme_uname = "PREFETCH_REQUESTS", .pme_udesc = "Prefetch requests sent to the DCT", .pme_ucode = 1 << 2, }, { .pme_uname = "32_BYTES_WRITES", .pme_udesc = "32 Bytes Sized Writes", .pme_ucode = 1 << 3, }, { .pme_uname = "64_BYTES_WRITES", .pme_udesc = "64 Bytes Sized Writes", .pme_ucode = 1 << 4, }, { .pme_uname = "32_BYTES_READS", .pme_udesc = "32 Bytes Sized 
Reads", .pme_ucode = 1 << 5, }, { .pme_uname = "64_BYTES_READS", .pme_udesc = "64 Byte Sized Reads", .pme_ucode = 1 << 6, }, { .pme_uname = "READ_REQUESTS_WHILE_WRITES_REQUESTS", .pme_udesc = "Read requests sent to the DCT while writes requests are pending in the DCT", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 102 */{.pme_name = "READ_REQUEST_L3_CACHE", .pme_code = 0x4E0, .pme_desc = "Read Request to L3 Cache", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 13, .pme_umasks = { { .pme_uname = "READ_BLOCK_EXCLUSIVE", .pme_udesc = "Read Block Exclusive (Data cache read)", .pme_ucode = 1 << 0, }, { .pme_uname = "READ_BLOCK_SHARED", .pme_udesc = "Read Block Shared (Instruction cache read)", .pme_ucode = 1 << 1, }, { .pme_uname = "READ_BLOCK_MODIFY", .pme_udesc = "Read Block Modify", .pme_ucode = 1 << 2, }, { .pme_uname = "PREFETCH_ONLY", .pme_udesc = "1=Count prefetch only, 0=Count prefetch and non-prefetch", .pme_ucode = 1 << 3, }, { .pme_uname = "CORE_0_SELECT", .pme_udesc = "Core 0 Select", .pme_ucode = 0x00, }, { .pme_uname = "CORE_1_SELECT", .pme_udesc = "Core 1 Select", .pme_ucode = 0x10, }, { .pme_uname = "CORE_2_SELECT", .pme_udesc = "Core 2 Select", .pme_ucode = 0x20, }, { .pme_uname = "CORE_3_SELECT", .pme_udesc = "Core 3 Select", .pme_ucode = 0x30, }, { .pme_uname = "CORE_4_SELECT", .pme_udesc = "Core 4 Select", .pme_ucode = 0x40, }, { .pme_uname = "CORE_5_SELECT", .pme_udesc = "Core 5 Select", .pme_ucode = 0x50, }, { .pme_uname = "CORE_6_SELECT", .pme_udesc = "Core 6 Select", .pme_ucode = 0x60, }, { .pme_uname = "CORE_7_SELECT", .pme_udesc = "Core 7 Select", .pme_ucode = 0x70, }, { .pme_uname = "ALL_CORES", .pme_udesc = "All cores", .pme_ucode = 0xF0, }, }, }, /* 103 */{.pme_name = "L3_CACHE_MISSES", .pme_code = 0x4E1, .pme_desc = "L3 Cache Misses", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 13, .pme_umasks = { { .pme_uname = "READ_BLOCK_EXCLUSIVE", .pme_udesc = 
"Read Block Exclusive (Data cache read)", .pme_ucode = 1 << 0, }, { .pme_uname = "READ_BLOCK_SHARED", .pme_udesc = "Read Block Shared (Instruction cache read)", .pme_ucode = 1 << 1, }, { .pme_uname = "READ_BLOCK_MODIFY", .pme_udesc = "Read Block Modify", .pme_ucode = 1 << 2, }, { .pme_uname = "PREFETCH_ONLY", .pme_udesc = "1=Count prefetch only, 0=Count prefetch and non-prefetch", .pme_ucode = 1 << 3, }, { .pme_uname = "CORE_0_SELECT", .pme_udesc = "Core 0 Select", .pme_ucode = 0x00, }, { .pme_uname = "CORE_1_SELECT", .pme_udesc = "Core 1 Select", .pme_ucode = 0x10, }, { .pme_uname = "CORE_2_SELECT", .pme_udesc = "Core 2 Select", .pme_ucode = 0x20, }, { .pme_uname = "CORE_3_SELECT", .pme_udesc = "Core 3 Select", .pme_ucode = 0x30, }, { .pme_uname = "CORE_4_SELECT", .pme_udesc = "Core 4 Select", .pme_ucode = 0x40, }, { .pme_uname = "CORE_5_SELECT", .pme_udesc = "Core 5 Select", .pme_ucode = 0x50, }, { .pme_uname = "CORE_6_SELECT", .pme_udesc = "Core 6 Select", .pme_ucode = 0x60, }, { .pme_uname = "CORE_7_SELECT", .pme_udesc = "Core 7 Select", .pme_ucode = 0x70, }, { .pme_uname = "ALL_CORES", .pme_udesc = "All cores", .pme_ucode = 0xF0, }, }, }, /* 104 */{.pme_name = "L3_FILLS_CAUSED_BY_L2_EVICTIONS", .pme_code = 0x4E2, .pme_desc = "L3 Fills caused by L2 Evictions", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 13, .pme_umasks = { { .pme_uname = "SHARED", .pme_udesc = "Shared", .pme_ucode = 1 << 0, }, { .pme_uname = "EXCLUSIVE", .pme_udesc = "Exclusive", .pme_ucode = 1 << 1, }, { .pme_uname = "OWNED", .pme_udesc = "Owned", .pme_ucode = 1 << 2, }, { .pme_uname = "MODIFIED", .pme_udesc = "Modified", .pme_ucode = 1 << 3, }, { .pme_uname = "CORE_0_SELECT", .pme_udesc = "Core 0 Select", .pme_ucode = 0x00, }, { .pme_uname = "CORE_1_SELECT", .pme_udesc = "Core 1 Select", .pme_ucode = 0x10, }, { .pme_uname = "CORE_2_SELECT", .pme_udesc = "Core 2 Select", .pme_ucode = 0x20, }, { .pme_uname = "CORE_3_SELECT", .pme_udesc = "Core 3 Select", .pme_ucode = 0x30, }, { 
.pme_uname = "CORE_4_SELECT", .pme_udesc = "Core 4 Select", .pme_ucode = 0x40, }, { .pme_uname = "CORE_5_SELECT", .pme_udesc = "Core 5 Select", .pme_ucode = 0x50, }, { .pme_uname = "CORE_6_SELECT", .pme_udesc = "Core 6 Select", .pme_ucode = 0x60, }, { .pme_uname = "CORE_7_SELECT", .pme_udesc = "Core 7 Select", .pme_ucode = 0x70, }, { .pme_uname = "ALL_CORES", .pme_udesc = "All cores", .pme_ucode = 0xF0, }, }, }, /* 105 */{.pme_name = "L3_EVICTIONS", .pme_code = 0x4E3, .pme_desc = "L3 Evictions", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "SHARED", .pme_udesc = "Shared", .pme_ucode = 1 << 0, }, { .pme_uname = "EXCLUSIVE", .pme_udesc = "Exclusive", .pme_ucode = 1 << 1, }, { .pme_uname = "OWNED", .pme_udesc = "Owned", .pme_ucode = 1 << 2, }, { .pme_uname = "MODIFIED", .pme_udesc = "Modified", .pme_ucode = 1 << 3, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 106 */{.pme_name = "NON_CANCELLED_L3_READ_REQUESTS", .pme_code = 0x4ED, .pme_desc = "Non-canceled L3 Read Requests", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 13, .pme_umasks = { { .pme_uname = "RDBLK", .pme_udesc = "RdBlk", .pme_ucode = 1 << 0, }, { .pme_uname = "RDBLKS", .pme_udesc = "RdBlkS", .pme_ucode = 1 << 1, }, { .pme_uname = "RDBLKM", .pme_udesc = "RdBlkM", .pme_ucode = 1 << 2, }, { .pme_uname = "PREFETCH_ONLY", .pme_udesc = "1=Count prefetch only; 0=Count prefetch and non-prefetch", .pme_ucode = 1 << 3, }, { .pme_uname = "CORE_0_SELECT", .pme_udesc = "Core 0 Select", .pme_ucode = 0x00, }, { .pme_uname = "CORE_1_SELECT", .pme_udesc = "Core 1 Select", .pme_ucode = 0x10, }, { .pme_uname = "CORE_2_SELECT", .pme_udesc = "Core 2 Select", .pme_ucode = 0x20, }, { .pme_uname = "CORE_3_SELECT", .pme_udesc = "Core 3 Select", .pme_ucode = 0x30, }, { .pme_uname = "CORE_4_SELECT", .pme_udesc = "Core 4 Select", .pme_ucode = 0x40, }, { .pme_uname = "CORE_5_SELECT", .pme_udesc = "Core 5 Select", .pme_ucode = 0x50, 
}, { .pme_uname = "CORE_6_SELECT", .pme_udesc = "Core 6 Select", .pme_ucode = 0x60, }, { .pme_uname = "CORE_7_SELECT", .pme_udesc = "Core 7 Select", .pme_ucode = 0x70, }, { .pme_uname = "ALL_CORES", .pme_udesc = "All cores", .pme_ucode = 0xF0, }, }, }, #endif /* 107 */{.pme_name = "LS_DISPATCH", .pme_code = 0x29, .pme_desc = "LS Dispatch", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "LOADS", .pme_udesc = "Loads", .pme_ucode = 1 << 0, }, { .pme_uname = "STORES", .pme_udesc = "Stores", .pme_ucode = 1 << 1, }, { .pme_uname = "LOAD_OP_STORES", .pme_udesc = "Load-op-Stores", .pme_ucode = 1 << 2, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 108 */{.pme_name = "EXECUTED_CLFLUSH_INSTRUCTIONS", .pme_code = 0x30, .pme_desc = "Executed CLFLUSH Instructions", }, /* 109 */{.pme_name = "L2_PREFETCHER_TRIGGER_EVENTS", .pme_code = 0x16C, .pme_desc = "L2 Prefetcher Trigger Events", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "LOAD_L1_MISS_SEEN_BY_PREFETCHER", .pme_udesc = "Load L1 miss seen by prefetcher", .pme_ucode = 1 << 0, }, { .pme_uname = "STORE_L1_MISS_SEEN_BY_PREFETCHER", .pme_udesc = "Store L1 miss seen by prefetcher", .pme_ucode = 1 << 1, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, }, }, }, /* 110 */{.pme_name = "DISPATCH_STALL_FOR_STQ_FULL", .pme_code = 0x1D8, .pme_desc = "Dispatch Stall for STQ Full", }, /* Northbridge events (.pme_code & 0x0E0) not yet supported by the kernel */ #if 0 /* 111 */{.pme_name = "REQUEST_CACHE_STATUS_0", .pme_code = 0x1EA, .pme_desc = "Request Cache Status 0", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "PROBE_HIT_S", .pme_udesc = "Probe Hit S", .pme_ucode = 1 << 0, }, { .pme_uname = "PROBE_HIT_E", .pme_udesc = "Probe Hit E", .pme_ucode = 1 << 1, }, { .pme_uname = "PROBE_HIT_MUW_OR_O", .pme_udesc = "Probe Hit MuW or O", 
.pme_ucode = 1 << 2, }, { .pme_uname = "PROBE_HIT_M", .pme_udesc = "Probe Hit M", .pme_ucode = 1 << 3, }, { .pme_uname = "PROBE_MISS", .pme_udesc = "Probe Miss", .pme_ucode = 1 << 4, }, { .pme_uname = "DIRECTED_PROBE", .pme_udesc = "Directed Probe", .pme_ucode = 1 << 5, }, { .pme_uname = "TRACK_CACHE_STAT_FOR_RDBLK", .pme_udesc = "Track Cache Stat for RdBlk", .pme_ucode = 1 << 6, }, { .pme_uname = "TRACK_CACHE_STAT_FOR_RDBLKS", .pme_udesc = "Track Cache Stat for RdBlkS", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 112 */{.pme_name = "REQUEST_CACHE_STATUS_1", .pme_code = 0x1EB, .pme_desc = "Request Cache Status 1", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "PROBE_HIT_S", .pme_udesc = "Probe Hit S", .pme_ucode = 1 << 0, }, { .pme_uname = "PROBE_HIT_E", .pme_udesc = "Probe Hit E", .pme_ucode = 1 << 1, }, { .pme_uname = "PROBE_HIT_MUW_OR_O", .pme_udesc = "Probe Hit MuW or O", .pme_ucode = 1 << 2, }, { .pme_uname = "PROBE_HIT_M", .pme_udesc = "Probe Hit M", .pme_ucode = 1 << 3, }, { .pme_uname = "PROBE_MISS", .pme_udesc = "Probe Miss", .pme_ucode = 1 << 4, }, { .pme_uname = "DIRECTED_PROBE", .pme_udesc = "Directed Probe", .pme_ucode = 1 << 5, }, { .pme_uname = "TRACK_CACHE_STAT_FOR_CHGTODIRTY", .pme_udesc = "Track Cache Stat for ChgToDirty", .pme_ucode = 1 << 6, }, { .pme_uname = "TRACK_CACHE_STAT_FOR_RDBLKM", .pme_udesc = "Track Cache Stat for RdBlkM", .pme_ucode = 1 << 7, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 113 */{.pme_name = "L3_LATENCY", .pme_code = 0x4EF, .pme_desc = "L3 Latency", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "L3CYCCOUNT", .pme_udesc = "L3CycCount. L3 Request cycle count", .pme_ucode = 1 << 0, }, { .pme_uname = "L3REQCOUNT", .pme_udesc = "L3ReqCount. 
L3 request count", .pme_ucode = 1 << 1, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, }, }, },
#endif
};

#define PME_AMD64_FAM15H_EVENT_COUNT (sizeof(amd64_fam15h_pe)/sizeof(pme_amd64_entry_t))
#define PME_AMD64_FAM15H_CPU_CLK_UNHALTED 30
#define PME_AMD64_FAM15H_RETIRED_INSTRUCTIONS 49

papi-papi-7-2-0-t/src/libperfnec/lib/amd64_events_k7.h

/*
 * Copyright (c) 2006, 2007 Advanced Micro Devices, Inc.
 * Contributed by Ray Bryant
 * Contributed by Robert Richter
 * Modified for K7 by Vince Weaver
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
*/ /* * Definitions taken from "AMD Athlon Processor x86 Code Optimization Guide" * Table 11 February 2002 */ static pme_amd64_entry_t amd64_k7_pe[]={ /* 0 */{.pme_name = "DATA_CACHE_ACCESSES", .pme_code = 0x40, .pme_desc = "Data Cache Accesses", }, /* 1 */{.pme_name = "DATA_CACHE_MISSES", .pme_code = 0x41, .pme_desc = "Data Cache Misses", }, /* 2 */{.pme_name = "DATA_CACHE_REFILLS", .pme_code = 0x42, .pme_desc = "Data Cache Refills from L2", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "L2_INVALID", .pme_udesc = "Invalid line from L2", .pme_ucode = 0x01, }, { .pme_uname = "L2_SHARED", .pme_udesc = "Shared-state line from L2", .pme_ucode = 0x02, }, { .pme_uname = "L2_EXCLUSIVE", .pme_udesc = "Exclusive-state line from L2", .pme_ucode = 0x04, }, { .pme_uname = "L2_OWNED", .pme_udesc = "Owned-state line from L2", .pme_ucode = 0x08, }, { .pme_uname = "L2_MODIFIED", .pme_udesc = "Modified-state line from L2", .pme_ucode = 0x10, }, { .pme_uname = "ALL", .pme_udesc = "Shared, Exclusive, Owned, Modified State Refills", .pme_ucode = 0x1F, }, }, }, /* 3 */{.pme_name = "DATA_CACHE_REFILLS_FROM_SYSTEM", .pme_code = 0x43, .pme_desc = "Data Cache Refills from System", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "INVALID", .pme_udesc = "Invalid", .pme_ucode = 0x01, }, { .pme_uname = "SHARED", .pme_udesc = "Shared", .pme_ucode = 0x02, }, { .pme_uname = "EXCLUSIVE", .pme_udesc = "Exclusive", .pme_ucode = 0x04, }, { .pme_uname = "OWNED", .pme_udesc = "Owned", .pme_ucode = 0x08, }, { .pme_uname = "MODIFIED", .pme_udesc = "Modified", .pme_ucode = 0x10, }, { .pme_uname = "ALL", .pme_udesc = "Invalid, Shared, Exclusive, Owned, Modified", .pme_ucode = 0x1F, }, }, }, /* 4 */{.pme_name = "DATA_CACHE_LINES_EVICTED", .pme_code = 0x44, .pme_desc = "Data Cache Lines Evicted", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "INVALID", .pme_udesc = "Invalid", .pme_ucode 
= 0x01, }, { .pme_uname = "SHARED", .pme_udesc = "Shared", .pme_ucode = 0x02, }, { .pme_uname = "EXCLUSIVE", .pme_udesc = "Exclusive", .pme_ucode = 0x04, }, { .pme_uname = "OWNED", .pme_udesc = "Owned", .pme_ucode = 0x08, }, { .pme_uname = "MODIFIED", .pme_udesc = "Modified", .pme_ucode = 0x10, }, { .pme_uname = "ALL", .pme_udesc = "Invalid, Shared, Exclusive, Owned, Modified", .pme_ucode = 0x1F, }, }, }, /* 5 */{.pme_name = "L1_DTLB_MISS_AND_L2_DTLB_HIT", .pme_code = 0x45, .pme_desc = "L1 DTLB Miss and L2 DTLB Hit", }, /* 6 */{.pme_name = "L1_DTLB_AND_L2_DTLB_MISS", .pme_code = 0x46, .pme_desc = "L1 DTLB and L2 DTLB Miss", }, /* 7 */{.pme_name = "MISALIGNED_ACCESSES", .pme_code = 0x47, .pme_desc = "Misaligned Accesses", }, /* CPU_CLK_UNHALTED is undocumented in the Athlon Guide? */ /* 8 */{.pme_name = "CPU_CLK_UNHALTED", .pme_code = 0x76, .pme_desc = "CPU Clocks not Halted", }, /* 9 */{.pme_name = "INSTRUCTION_CACHE_FETCHES", .pme_code = 0x80, .pme_desc = "Instruction Cache Fetches", }, /* 10 */{.pme_name = "INSTRUCTION_CACHE_MISSES", .pme_code = 0x81, .pme_desc = "Instruction Cache Misses", }, /* 11 */{.pme_name = "L1_ITLB_MISS_AND_L2_ITLB_HIT", .pme_code = 0x84, .pme_desc = "L1 ITLB Miss and L2 ITLB Hit", }, /* 12 */{.pme_name = "L1_ITLB_MISS_AND_L2_ITLB_MISS", .pme_code = 0x85, .pme_desc = "L1 ITLB Miss and L2 ITLB Miss", }, /* 13 */{.pme_name = "RETIRED_INSTRUCTIONS", .pme_code = 0xC0, .pme_desc = "Retired Instructions (includes exceptions, interrupts, resyncs)", }, /* 14 */{.pme_name = "RETIRED_UOPS", .pme_code = 0xC1, .pme_desc = "Retired uops", }, /* 15 */{.pme_name = "RETIRED_BRANCH_INSTRUCTIONS", .pme_code = 0xC2, .pme_desc = "Retired Branch Instructions", }, /* 16 */{.pme_name = "RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS", .pme_code = 0xC3, .pme_desc = "Retired Mispredicted Branch Instructions", }, /* 17 */{.pme_name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .pme_code = 0xC4, .pme_desc = "Retired Taken Branch Instructions", }, /* 18 */{.pme_name = 
"RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .pme_code = 0xC5, .pme_desc = "Retired Taken Branch Instructions Mispredicted", }, /* 19 */{.pme_name = "RETIRED_FAR_CONTROL_TRANSFERS", .pme_code = 0xC6, .pme_desc = "Retired Far Control Transfers", }, /* 20 */{.pme_name = "RETIRED_BRANCH_RESYNCS", .pme_code = 0xC7, .pme_desc = "Retired Branch Resyncs (only non-control transfer branches)", }, /* 21 */{.pme_name = "INTERRUPTS_MASKED_CYCLES", .pme_code = 0xCD, .pme_desc = "Interrupts-Masked Cycles", }, /* 22 */{.pme_name = "INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING", .pme_code = 0xCE, .pme_desc = "Interrupts-Masked Cycles with Interrupt Pending", }, /* 23 */{.pme_name = "INTERRUPTS_TAKEN", .pme_code = 0xCF, .pme_desc = "Interrupts Taken", }, };

#define PME_AMD64_K7_EVENT_COUNT (sizeof(amd64_k7_pe)/sizeof(pme_amd64_entry_t))
#define PME_AMD64_K7_CPU_CLK_UNHALTED 8
#define PME_AMD64_K7_RETIRED_INSTRUCTIONS 13

papi-papi-7-2-0-t/src/libperfnec/lib/amd64_events_k8.h

/*
 * Copyright (c) 2006, 2007 Advanced Micro Devices, Inc.
 * Contributed by Ray Bryant
 * Contributed by Robert Richter
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ /* History * * Feb 10 2006 -- Ray Bryant, raybry@mpdtxmail.amd.com * * Brought event table up-to-date with the 3.85 (October 2005) version of the * "BIOS and Kernel Developer's Guide for the AMD Athlon[tm] 64 and * AMD Opteron[tm] Processors," AMD Publication # 26094. * * Dec 12 2007 -- Robert Richter, robert.richter@amd.com * * Updated to: BIOS and Kernel Developer's Guide for AMD NPT Family * 0Fh Processors, Publication # 32559, Revision: 3.08, Issue Date: * July 2007 * * Feb 26 2009 -- Robert Richter, robert.richter@amd.com * * Updates and fixes of some revision flags and descriptions according * to these documents: * BIOS and Kernel Developer's Guide, #26094, Revision: 3.30 * BIOS and Kernel Developer's Guide, #32559, Revision: 3.12 */ static pme_amd64_entry_t amd64_k8_pe[]={ /* 0 */{.pme_name = "DISPATCHED_FPU", .pme_code = 0x00, .pme_desc = "Dispatched FPU Operations", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 7, .pme_umasks = { { .pme_uname = "OPS_ADD", .pme_udesc = "Add pipe ops", .pme_ucode = 0x01, }, { .pme_uname = "OPS_MULTIPLY", .pme_udesc = "Multiply pipe ops", .pme_ucode = 0x02, }, { .pme_uname = "OPS_STORE", .pme_udesc = "Store pipe ops", .pme_ucode = 0x04, }, { .pme_uname = "OPS_ADD_PIPE_LOAD_OPS", .pme_udesc = "Add pipe load ops", .pme_ucode = 0x08, }, { .pme_uname = "OPS_MULTIPLY_PIPE_LOAD_OPS", .pme_udesc = "Multiply pipe load ops", .pme_ucode = 0x10, }, { .pme_uname = "OPS_STORE_PIPE_LOAD_OPS", .pme_udesc = "Store pipe load ops", .pme_ucode = 0x20, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x3F, }, }, }, /* 1 */{.pme_name = 
"CYCLES_NO_FPU_OPS_RETIRED", .pme_code = 0x01, .pme_desc = "Cycles with no FPU Ops Retired", }, /* 2 */{.pme_name = "DISPATCHED_FPU_OPS_FAST_FLAG", .pme_code = 0x02, .pme_desc = "Dispatched Fast Flag FPU Operations", }, /* 3 */{.pme_name = "SEGMENT_REGISTER_LOADS", .pme_code = 0x20, .pme_desc = "Segment Register Loads", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 8, .pme_umasks = { { .pme_uname = "ES", .pme_udesc = "ES", .pme_ucode = 0x01, }, { .pme_uname = "CS", .pme_udesc = "CS", .pme_ucode = 0x02, }, { .pme_uname = "SS", .pme_udesc = "SS", .pme_ucode = 0x04, }, { .pme_uname = "DS", .pme_udesc = "DS", .pme_ucode = 0x08, }, { .pme_uname = "FS", .pme_udesc = "FS", .pme_ucode = 0x10, }, { .pme_uname = "GS", .pme_udesc = "GS", .pme_ucode = 0x20, }, { .pme_uname = "HS", .pme_udesc = "HS", .pme_ucode = 0x40, }, { .pme_uname = "ALL", .pme_udesc = "All segments", .pme_ucode = 0x7F, }, }, }, /* 4 */{.pme_name = "PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE", .pme_code = 0x21, .pme_desc = "Pipeline restart due to self-modifying code", }, /* 5 */{.pme_name = "PIPELINE_RESTART_DUE_TO_PROBE_HIT", .pme_code = 0x22, .pme_desc = "Pipeline restart due to probe hit", }, /* 6 */{.pme_name = "LS_BUFFER_2_FULL_CYCLES", .pme_code = 0x23, .pme_desc = "LS Buffer 2 Full", }, /* 7 */{.pme_name = "LOCKED_OPS", .pme_code = 0x24, .pme_desc = "Locked Operations", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "EXECUTED", .pme_udesc = "The number of locked instructions executed", .pme_ucode = 0x01, }, { .pme_uname = "CYCLES_SPECULATIVE_PHASE", .pme_udesc = "The number of cycles spent in speculative phase", .pme_ucode = 0x02, }, { .pme_uname = "CYCLES_NON_SPECULATIVE_PHASE", .pme_udesc = "The number of cycles spent in non-speculative phase (including cache miss penalty)", .pme_ucode = 0x04, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 8 */{.pme_name = "MEMORY_REQUESTS", .pme_code = 0x65, 
.pme_desc = "Memory Requests by Type", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "NON_CACHEABLE", .pme_udesc = "Requests to non-cacheable (UC) memory", .pme_ucode = 0x01, }, { .pme_uname = "WRITE_COMBINING", .pme_udesc = "Requests to write-combining (WC) memory or WC buffer flushes to WB memory", .pme_ucode = 0x02, }, { .pme_uname = "STREAMING_STORE", .pme_udesc = "Streaming store (SS) requests", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x83, }, }, }, /* 9 */{.pme_name = "DATA_CACHE_ACCESSES", .pme_code = 0x40, .pme_desc = "Data Cache Accesses", }, /* 10 */{.pme_name = "DATA_CACHE_MISSES", .pme_code = 0x41, .pme_desc = "Data Cache Misses", }, /* 11 */{.pme_name = "DATA_CACHE_REFILLS", .pme_code = 0x42, .pme_desc = "Data Cache Refills from L2 or System", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "SYSTEM", .pme_udesc = "Refill from System", .pme_ucode = 0x01, }, { .pme_uname = "L2_SHARED", .pme_udesc = "Shared-state line from L2", .pme_ucode = 0x02, }, { .pme_uname = "L2_EXCLUSIVE", .pme_udesc = "Exclusive-state line from L2", .pme_ucode = 0x04, }, { .pme_uname = "L2_OWNED", .pme_udesc = "Owned-state line from L2", .pme_ucode = 0x08, }, { .pme_uname = "L2_MODIFIED", .pme_udesc = "Modified-state line from L2", .pme_ucode = 0x10, }, { .pme_uname = "ALL", .pme_udesc = "Shared, Exclusive, Owned, Modified State Refills", .pme_ucode = 0x1F, }, }, }, /* 12 */{.pme_name = "DATA_CACHE_REFILLS_FROM_SYSTEM", .pme_code = 0x43, .pme_desc = "Data Cache Refills from System", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "INVALID", .pme_udesc = "Invalid", .pme_ucode = 0x01, }, { .pme_uname = "SHARED", .pme_udesc = "Shared", .pme_ucode = 0x02, }, { .pme_uname = "EXCLUSIVE", .pme_udesc = "Exclusive", .pme_ucode = 0x04, }, { .pme_uname = "OWNED", .pme_udesc = "Owned", .pme_ucode = 0x08, }, { 
.pme_uname = "MODIFIED", .pme_udesc = "Modified", .pme_ucode = 0x10, }, { .pme_uname = "ALL", .pme_udesc = "Invalid, Shared, Exclusive, Owned, Modified", .pme_ucode = 0x1F, }, }, }, /* 13 */{.pme_name = "DATA_CACHE_LINES_EVICTED", .pme_code = 0x44, .pme_desc = "Data Cache Lines Evicted", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "INVALID", .pme_udesc = "Invalid", .pme_ucode = 0x01, }, { .pme_uname = "SHARED", .pme_udesc = "Shared", .pme_ucode = 0x02, }, { .pme_uname = "EXCLUSIVE", .pme_udesc = "Exclusive", .pme_ucode = 0x04, }, { .pme_uname = "OWNED", .pme_udesc = "Owned", .pme_ucode = 0x08, }, { .pme_uname = "MODIFIED", .pme_udesc = "Modified", .pme_ucode = 0x10, }, { .pme_uname = "ALL", .pme_udesc = "Invalid, Shared, Exclusive, Owned, Modified", .pme_ucode = 0x1F, }, }, }, /* 14 */{.pme_name = "L1_DTLB_MISS_AND_L2_DTLB_HIT", .pme_code = 0x45, .pme_desc = "L1 DTLB Miss and L2 DTLB Hit", }, /* 15 */{.pme_name = "L1_DTLB_AND_L2_DTLB_MISS", .pme_code = 0x46, .pme_desc = "L1 DTLB and L2 DTLB Miss", }, /* 16 */{.pme_name = "MISALIGNED_ACCESSES", .pme_code = 0x47, .pme_desc = "Misaligned Accesses", }, /* 17 */{.pme_name = "MICROARCHITECTURAL_LATE_CANCEL_OF_AN_ACCESS", .pme_code = 0x48, .pme_desc = "Microarchitectural Late Cancel of an Access", }, /* 18 */{.pme_name = "MICROARCHITECTURAL_EARLY_CANCEL_OF_AN_ACCESS", .pme_code = 0x49, .pme_desc = "Microarchitectural Early Cancel of an Access", }, /* 19 */{.pme_name = "SCRUBBER_SINGLE_BIT_ECC_ERRORS", .pme_code = 0x4A, .pme_desc = "Single-bit ECC Errors Recorded by Scrubber", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "SCRUBBER_ERROR", .pme_udesc = "Scrubber error", .pme_ucode = 0x01, }, { .pme_uname = "PIGGYBACK_ERROR", .pme_udesc = "Piggyback scrubber errors", .pme_ucode = 0x02, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, }, }, }, /* 20 */{.pme_name = "PREFETCH_INSTRUCTIONS_DISPATCHED", 
.pme_code = 0x4B, .pme_desc = "Prefetch Instructions Dispatched", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "LOAD", .pme_udesc = "Load (Prefetch, PrefetchT0/T1/T2)", .pme_ucode = 0x01, }, { .pme_uname = "STORE", .pme_udesc = "Store (PrefetchW)", .pme_ucode = 0x02, }, { .pme_uname = "NTA", .pme_udesc = "NTA (PrefetchNTA)", .pme_ucode = 0x04, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 21 */{.pme_name = "DCACHE_MISSES_BY_LOCKED_INSTRUCTIONS", .pme_code = 0x4C, .pme_desc = "DCACHE Misses by Locked Instructions", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 2, .pme_umasks = { { .pme_uname = "DATA_CACHE_MISSES_BY_LOCKED_INSTRUCTIONS", .pme_udesc = "Data cache misses by locked instructions", .pme_ucode = 0x02, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x02, }, }, }, /* 22 */{.pme_name = "DATA_PREFETCHES", .pme_code = 0x67, .pme_desc = "Data Prefetcher", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "CANCELLED", .pme_udesc = "Cancelled prefetches", .pme_ucode = 0x01, }, { .pme_uname = "ATTEMPTED", .pme_udesc = "Prefetch attempts", .pme_ucode = 0x02, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, }, }, }, /* 23 */{.pme_name = "SYSTEM_READ_RESPONSES", .pme_code = 0x6C, .pme_desc = "System Read Responses by Coherency State", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "EXCLUSIVE", .pme_udesc = "Exclusive", .pme_ucode = 0x01, }, { .pme_uname = "MODIFIED", .pme_udesc = "Modified", .pme_ucode = 0x02, }, { .pme_uname = "SHARED", .pme_udesc = "Shared", .pme_ucode = 0x04, }, { .pme_uname = "ALL", .pme_udesc = "Exclusive, Modified, Shared", .pme_ucode = 0x07, }, }, }, /* 24 */{.pme_name = "QUADWORDS_WRITTEN_TO_SYSTEM", .pme_code = 0x6D, .pme_desc = "Quadwords Written to System", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, 
.pme_numasks = 2, .pme_umasks = { { .pme_uname = "QUADWORD_WRITE_TRANSFER", .pme_udesc = "Quadword write transfer", .pme_ucode = 0x01, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x01, }, }, }, /* 25 */{.pme_name = "REQUESTS_TO_L2", .pme_code = 0x7D, .pme_desc = "Requests to L2 Cache", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "INSTRUCTIONS", .pme_udesc = "IC fill", .pme_ucode = 0x01, }, { .pme_uname = "DATA", .pme_udesc = "DC fill", .pme_ucode = 0x02, }, { .pme_uname = "TLB_WALK", .pme_udesc = "TLB fill (page table walks)", .pme_ucode = 0x04, }, { .pme_uname = "SNOOP", .pme_udesc = "Tag snoop request", .pme_ucode = 0x08, }, { .pme_uname = "CANCELLED", .pme_udesc = "Cancelled request", .pme_ucode = 0x10, }, { .pme_uname = "ALL", .pme_udesc = "All non-cancelled requests", .pme_ucode = 0x1F, }, }, }, /* 26 */{.pme_name = "L2_CACHE_MISS", .pme_code = 0x7E, .pme_desc = "L2 Cache Misses", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "INSTRUCTIONS", .pme_udesc = "IC fill", .pme_ucode = 0x01, }, { .pme_uname = "DATA", .pme_udesc = "DC fill (includes possible replays, whereas event 41h does not)", .pme_ucode = 0x02, }, { .pme_uname = "TLB_WALK", .pme_udesc = "TLB page table walk", .pme_ucode = 0x04, }, { .pme_uname = "ALL", .pme_udesc = "Instructions, Data, TLB walk", .pme_ucode = 0x07, }, }, }, /* 27 */{.pme_name = "L2_FILL_WRITEBACK", .pme_code = 0x7F, .pme_desc = "L2 Fill/Writeback", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "L2_FILLS", .pme_udesc = "L2 fills (victims from L1 caches, TLB page table walks and data prefetches)", .pme_ucode = 0x01, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x01, .pme_uflags = PFMLIB_AMD64_TILL_K8_REV_E, }, { .pme_uname = "L2_WRITEBACKS", .pme_udesc = "L2 Writebacks to system.", .pme_ucode = 0x02, .pme_uflags = PFMLIB_AMD64_K8_REV_F, }, { 
.pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x03, .pme_uflags = PFMLIB_AMD64_K8_REV_F, }, }, }, /* 28 */{.pme_name = "INSTRUCTION_CACHE_FETCHES", .pme_code = 0x80, .pme_desc = "Instruction Cache Fetches", }, /* 29 */{.pme_name = "INSTRUCTION_CACHE_MISSES", .pme_code = 0x81, .pme_desc = "Instruction Cache Misses", }, /* 30 */{.pme_name = "INSTRUCTION_CACHE_REFILLS_FROM_L2", .pme_code = 0x82, .pme_desc = "Instruction Cache Refills from L2", }, /* 31 */{.pme_name = "INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM", .pme_code = 0x83, .pme_desc = "Instruction Cache Refills from System", }, /* 32 */{.pme_name = "L1_ITLB_MISS_AND_L2_ITLB_HIT", .pme_code = 0x84, .pme_desc = "L1 ITLB Miss and L2 ITLB Hit", }, /* 33 */{.pme_name = "L1_ITLB_MISS_AND_L2_ITLB_MISS", .pme_code = 0x85, .pme_desc = "L1 ITLB Miss and L2 ITLB Miss", }, /* 34 */{.pme_name = "PIPELINE_RESTART_DUE_TO_INSTRUCTION_STREAM_PROBE", .pme_code = 0x86, .pme_desc = "Pipeline Restart Due to Instruction Stream Probe", }, /* 35 */{.pme_name = "INSTRUCTION_FETCH_STALL", .pme_code = 0x87, .pme_desc = "Instruction Fetch Stall", }, /* 36 */{.pme_name = "RETURN_STACK_HITS", .pme_code = 0x88, .pme_desc = "Return Stack Hits", }, /* 37 */{.pme_name = "RETURN_STACK_OVERFLOWS", .pme_code = 0x89, .pme_desc = "Return Stack Overflows", }, /* 38 */{.pme_name = "RETIRED_CLFLUSH_INSTRUCTIONS", .pme_code = 0x26, .pme_desc = "Retired CLFLUSH Instructions", }, /* 39 */{.pme_name = "RETIRED_CPUID_INSTRUCTIONS", .pme_code = 0x27, .pme_desc = "Retired CPUID Instructions", }, /* 40 */{.pme_name = "CPU_CLK_UNHALTED", .pme_code = 0x76, .pme_desc = "CPU Clocks not Halted", }, /* 41 */{.pme_name = "RETIRED_INSTRUCTIONS", .pme_code = 0xC0, .pme_desc = "Retired Instructions", }, /* 42 */{.pme_name = "RETIRED_UOPS", .pme_code = 0xC1, .pme_desc = "Retired uops", }, /* 43 */{.pme_name = "RETIRED_BRANCH_INSTRUCTIONS", .pme_code = 0xC2, .pme_desc = "Retired Branch Instructions", }, /* 44 */{.pme_name = 
"RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS", .pme_code = 0xC3, .pme_desc = "Retired Mispredicted Branch Instructions", }, /* 45 */{.pme_name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .pme_code = 0xC4, .pme_desc = "Retired Taken Branch Instructions", }, /* 46 */{.pme_name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .pme_code = 0xC5, .pme_desc = "Retired Taken Branch Instructions Mispredicted", }, /* 47 */{.pme_name = "RETIRED_FAR_CONTROL_TRANSFERS", .pme_code = 0xC6, .pme_desc = "Retired Far Control Transfers", }, /* 48 */{.pme_name = "RETIRED_BRANCH_RESYNCS", .pme_code = 0xC7, .pme_desc = "Retired Branch Resyncs", }, /* 49 */{.pme_name = "RETIRED_NEAR_RETURNS", .pme_code = 0xC8, .pme_desc = "Retired Near Returns", }, /* 50 */{.pme_name = "RETIRED_NEAR_RETURNS_MISPREDICTED", .pme_code = 0xC9, .pme_desc = "Retired Near Returns Mispredicted", }, /* 51 */{.pme_name = "RETIRED_INDIRECT_BRANCHES_MISPREDICTED", .pme_code = 0xCA, .pme_desc = "Retired Indirect Branches Mispredicted", }, /* 52 */{.pme_name = "RETIRED_MMX_AND_FP_INSTRUCTIONS", .pme_code = 0xCB, .pme_desc = "Retired MMX/FP Instructions", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "X87", .pme_udesc = "x87 instructions", .pme_ucode = 0x01, }, { .pme_uname = "MMX_AND_3DNOW", .pme_udesc = "MMX and 3DNow! 
instructions", .pme_ucode = 0x02, }, { .pme_uname = "PACKED_SSE_AND_SSE2", .pme_udesc = "Packed SSE and SSE2 instructions", .pme_ucode = 0x04, }, { .pme_uname = "SCALAR_SSE_AND_SSE2", .pme_udesc = "Scalar SSE and SSE2 instructions", .pme_ucode = 0x08, }, { .pme_uname = "ALL", .pme_udesc = "X87, MMX(TM), 3DNow!(TM), Scalar and Packed SSE and SSE2 instructions", .pme_ucode = 0x0F, }, }, }, /* 53 */{.pme_name = "RETIRED_FASTPATH_DOUBLE_OP_INSTRUCTIONS", .pme_code = 0xCC, .pme_desc = "Retired Fastpath Double Op Instructions", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "POSITION_0", .pme_udesc = "With low op in position 0", .pme_ucode = 0x01, }, { .pme_uname = "POSITION_1", .pme_udesc = "With low op in position 1", .pme_ucode = 0x02, }, { .pme_uname = "POSITION_2", .pme_udesc = "With low op in position 2", .pme_ucode = 0x04, }, { .pme_uname = "ALL", .pme_udesc = "With low op in position 0, 1, or 2", .pme_ucode = 0x07, }, }, }, /* 54 */{.pme_name = "INTERRUPTS_MASKED_CYCLES", .pme_code = 0xCD, .pme_desc = "Interrupts-Masked Cycles", }, /* 55 */{.pme_name = "INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING", .pme_code = 0xCE, .pme_desc = "Interrupts-Masked Cycles with Interrupt Pending", }, /* 56 */{.pme_name = "INTERRUPTS_TAKEN", .pme_code = 0xCF, .pme_desc = "Interrupts Taken", }, /* 57 */{.pme_name = "DECODER_EMPTY", .pme_code = 0xD0, .pme_desc = "Decoder Empty", }, /* 58 */{.pme_name = "DISPATCH_STALLS", .pme_code = 0xD1, .pme_desc = "Dispatch Stalls", }, /* 59 */{.pme_name = "DISPATCH_STALL_FOR_BRANCH_ABORT", .pme_code = 0xD2, .pme_desc = "Dispatch Stall for Branch Abort to Retire", }, /* 60 */{.pme_name = "DISPATCH_STALL_FOR_SERIALIZATION", .pme_code = 0xD3, .pme_desc = "Dispatch Stall for Serialization", }, /* 61 */{.pme_name = "DISPATCH_STALL_FOR_SEGMENT_LOAD", .pme_code = 0xD4, .pme_desc = "Dispatch Stall for Segment Load", }, /* 62 */{.pme_name = "DISPATCH_STALL_FOR_REORDER_BUFFER_FULL", .pme_code = 0xD5, .pme_desc = 
"Dispatch Stall for Reorder Buffer Full", }, /* 63 */{.pme_name = "DISPATCH_STALL_FOR_RESERVATION_STATION_FULL", .pme_code = 0xD6, .pme_desc = "Dispatch Stall for Reservation Station Full", }, /* 64 */{.pme_name = "DISPATCH_STALL_FOR_FPU_FULL", .pme_code = 0xD7, .pme_desc = "Dispatch Stall for FPU Full", }, /* 65 */{.pme_name = "DISPATCH_STALL_FOR_LS_FULL", .pme_code = 0xD8, .pme_desc = "Dispatch Stall for LS Full", }, /* 66 */{.pme_name = "DISPATCH_STALL_WAITING_FOR_ALL_QUIET", .pme_code = 0xD9, .pme_desc = "Dispatch Stall Waiting for All Quiet", }, /* 67 */{.pme_name = "DISPATCH_STALL_FOR_FAR_TRANSFER_OR_RSYNC", .pme_code = 0xDA, .pme_desc = "Dispatch Stall for Far Transfer or Resync to Retire", }, /* 68 */{.pme_name = "FPU_EXCEPTIONS", .pme_code = 0xDB, .pme_desc = "FPU Exceptions", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "X87_RECLASS_MICROFAULTS", .pme_udesc = "x87 reclass microfaults", .pme_ucode = 0x01, }, { .pme_uname = "SSE_RETYPE_MICROFAULTS", .pme_udesc = "SSE retype microfaults", .pme_ucode = 0x02, }, { .pme_uname = "SSE_RECLASS_MICROFAULTS", .pme_udesc = "SSE reclass microfaults", .pme_ucode = 0x04, }, { .pme_uname = "SSE_AND_X87_MICROTRAPS", .pme_udesc = "SSE and x87 microtraps", .pme_ucode = 0x08, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 69 */{.pme_name = "DR0_BREAKPOINT_MATCHES", .pme_code = 0xDC, .pme_desc = "DR0 Breakpoint Matches", }, /* 70 */{.pme_name = "DR1_BREAKPOINT_MATCHES", .pme_code = 0xDD, .pme_desc = "DR1 Breakpoint Matches", }, /* 71 */{.pme_name = "DR2_BREAKPOINT_MATCHES", .pme_code = 0xDE, .pme_desc = "DR2 Breakpoint Matches", }, /* 72 */{.pme_name = "DR3_BREAKPOINT_MATCHES", .pme_code = 0xDF, .pme_desc = "DR3 Breakpoint Matches", }, /* 73 */{.pme_name = "DRAM_ACCESSES_PAGE", .pme_code = 0xE0, .pme_desc = "DRAM Accesses", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "HIT", .pme_udesc = "Page 
hit", .pme_ucode = 0x01, }, { .pme_uname = "MISS", .pme_udesc = "Page Miss", .pme_ucode = 0x02, }, { .pme_uname = "CONFLICT", .pme_udesc = "Page Conflict", .pme_ucode = 0x04, }, { .pme_uname = "ALL", .pme_udesc = "Page Hit, Miss, or Conflict", .pme_ucode = 0x07, }, }, }, /* 74 */{.pme_name = "MEMORY_CONTROLLER_PAGE_TABLE_OVERFLOWS", .pme_code = 0xE1, .pme_desc = "Memory Controller Page Table Overflows", }, /* 75 */{.pme_name = "MEMORY_CONTROLLER_TURNAROUNDS", .pme_code = 0xE3, .pme_desc = "Memory Controller Turnarounds", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "CHIP_SELECT", .pme_udesc = "DIMM (chip select) turnaround", .pme_ucode = 0x01, }, { .pme_uname = "READ_TO_WRITE", .pme_udesc = "Read to write turnaround", .pme_ucode = 0x02, }, { .pme_uname = "WRITE_TO_READ", .pme_udesc = "Write to read turnaround", .pme_ucode = 0x04, }, { .pme_uname = "ALL", .pme_udesc = "All Memory Controller Turnarounds", .pme_ucode = 0x07, }, }, }, /* 76 */{.pme_name = "MEMORY_CONTROLLER_BYPASS", .pme_code = 0xE4, .pme_desc = "Memory Controller Bypass Counter Saturation", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "HIGH_PRIORITY", .pme_udesc = "Memory controller high priority bypass", .pme_ucode = 0x01, }, { .pme_uname = "LOW_PRIORITY", .pme_udesc = "Memory controller low priority bypass", .pme_ucode = 0x02, }, { .pme_uname = "DRAM_INTERFACE", .pme_udesc = "DRAM controller interface bypass", .pme_ucode = 0x04, }, { .pme_uname = "DRAM_QUEUE", .pme_udesc = "DRAM controller queue bypass", .pme_ucode = 0x08, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 77 */{.pme_name = "SIZED_BLOCKS", .pme_code = 0xE5, .pme_desc = "Sized Blocks", .pme_flags = PFMLIB_AMD64_UMASK_COMBO | PFMLIB_AMD64_K8_REV_D, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "32_BYTE_WRITES", .pme_udesc = "32-byte Sized Writes", .pme_ucode = 0x04, }, { .pme_uname = "64_BYTE_WRITES", 
.pme_udesc = "64-byte Sized Writes", .pme_ucode = 0x08, }, { .pme_uname = "32_BYTE_READS", .pme_udesc = "32-byte Sized Reads", .pme_ucode = 0x10, }, { .pme_uname = "64_BYTE_READS", .pme_udesc = "64-byte Sized Reads", .pme_ucode = 0x20, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x3C, }, }, }, /* 78 */{.pme_name = "THERMAL_STATUS_AND_ECC_ERRORS", .pme_code = 0xE8, .pme_desc = "Thermal Status and ECC Errors", .pme_flags = PFMLIB_AMD64_UMASK_COMBO | PFMLIB_AMD64_K8_REV_E, .pme_numasks = 7, .pme_umasks = { { .pme_uname = "CLKS_CPU_ACTIVE", .pme_udesc = "Number of clocks CPU is active when HTC is active", .pme_ucode = 0x01, .pme_uflags = PFMLIB_AMD64_K8_REV_F, }, { .pme_uname = "CLKS_CPU_INACTIVE", .pme_udesc = "Number of clocks CPU clock is inactive when HTC is active", .pme_ucode = 0x02, .pme_uflags = PFMLIB_AMD64_K8_REV_F, }, { .pme_uname = "CLKS_DIE_TEMP_TOO_HIGH", .pme_udesc = "Number of clocks when die temperature is higher than the software high temperature threshold", .pme_ucode = 0x04, .pme_uflags = PFMLIB_AMD64_K8_REV_F, }, { .pme_uname = "CLKS_TEMP_THRESHOLD_EXCEEDED", .pme_udesc = "Number of clocks when high temperature threshold was exceeded", .pme_ucode = 0x08, .pme_uflags = PFMLIB_AMD64_K8_REV_F, }, { .pme_uname = "DRAM_ECC_ERRORS", .pme_udesc = "Number of correctable and Uncorrectable DRAM ECC errors", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x80, .pme_uflags = PFMLIB_AMD64_TILL_K8_REV_E, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x8F, .pme_uflags = PFMLIB_AMD64_K8_REV_F, }, }, }, /* 79 */{.pme_name = "CPU_IO_REQUESTS_TO_MEMORY_IO", .pme_code = 0xE9, .pme_desc = "CPU/IO Requests to Memory/IO", .pme_flags = PFMLIB_AMD64_UMASK_COMBO | PFMLIB_AMD64_K8_REV_E, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "I_O_TO_I_O", .pme_udesc = "I/O to I/O", .pme_ucode = 0x01, }, { .pme_uname = "I_O_TO_MEM", .pme_udesc = "I/O to Mem", 
.pme_ucode = 0x02, }, { .pme_uname = "CPU_TO_I_O", .pme_udesc = "CPU to I/O", .pme_ucode = 0x04, }, { .pme_uname = "CPU_TO_MEM", .pme_udesc = "CPU to Mem", .pme_ucode = 0x08, }, { .pme_uname = "TO_REMOTE_NODE", .pme_udesc = "To remote node", .pme_ucode = 0x10, }, { .pme_uname = "TO_LOCAL_NODE", .pme_udesc = "To local node", .pme_ucode = 0x20, }, { .pme_uname = "FROM_REMOTE_NODE", .pme_udesc = "From remote node", .pme_ucode = 0x40, }, { .pme_uname = "FROM_LOCAL_NODE", .pme_udesc = "From local node", .pme_ucode = 0x80, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0xFF, }, }, }, /* 80 */{.pme_name = "CACHE_BLOCK", .pme_code = 0xEA, .pme_desc = "Cache Block Commands", .pme_flags = PFMLIB_AMD64_UMASK_COMBO | PFMLIB_AMD64_K8_REV_E, .pme_numasks = 6, .pme_umasks = { { .pme_uname = "VICTIM_WRITEBACK", .pme_udesc = "Victim Block (Writeback)", .pme_ucode = 0x01, }, { .pme_uname = "DCACHE_LOAD_MISS", .pme_udesc = "Read Block (Dcache load miss refill)", .pme_ucode = 0x04, }, { .pme_uname = "SHARED_ICACHE_REFILL", .pme_udesc = "Read Block Shared (Icache refill)", .pme_ucode = 0x08, }, { .pme_uname = "READ_BLOCK_MODIFIED", .pme_udesc = "Read Block Modified (Dcache store miss refill)", .pme_ucode = 0x10, }, { .pme_uname = "READ_TO_DIRTY", .pme_udesc = "Change to Dirty (first store to clean block already in cache)", .pme_ucode = 0x20, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x3D, }, }, }, /* 81 */{.pme_name = "SIZED_COMMANDS", .pme_code = 0xEB, .pme_desc = "Sized Commands", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 8, .pme_umasks = { { .pme_uname = "NON_POSTED_WRITE_BYTE", .pme_udesc = "NonPosted SzWr Byte (1-32 bytes) Legacy or mapped I/O, typically 1-4 bytes", .pme_ucode = 0x01, }, { .pme_uname = "NON_POSTED_WRITE_DWORD", .pme_udesc = "NonPosted SzWr Dword (1-16 dwords) Legacy or mapped I/O, typically 1 dword", .pme_ucode = 0x02, }, { .pme_uname = "POSTED_WRITE_BYTE", .pme_udesc = "Posted SzWr 
Byte (1-32 bytes) Sub-cache-line DMA writes, size varies; also flushes of partially-filled Write Combining buffer", .pme_ucode = 0x04, }, { .pme_uname = "POSTED_WRITE_DWORD", .pme_udesc = "Posted SzWr Dword (1-16 dwords) Block-oriented DMA writes, often cache-line sized; also processor Write Combining buffer flushes", .pme_ucode = 0x08, }, { .pme_uname = "READ_BYTE_4_BYTES", .pme_udesc = "SzRd Byte (4 bytes) Legacy or mapped I/O", .pme_ucode = 0x10, }, { .pme_uname = "READ_DWORD_1_16_DWORDS", .pme_udesc = "SzRd Dword (1-16 dwords) Block-oriented DMA reads, typically cache-line size", .pme_ucode = 0x20, }, { .pme_uname = "READ_MODIFY_WRITE", .pme_udesc = "RdModWr", .pme_ucode = 0x40, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x7F, }, }, }, /* 82 */{.pme_name = "PROBE", .pme_code = 0xEC, .pme_desc = "Probe Responses and Upstream Requests", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 9, .pme_umasks = { { .pme_uname = "MISS", .pme_udesc = "Probe miss", .pme_ucode = 0x01, }, { .pme_uname = "HIT_CLEAN", .pme_udesc = "Probe hit clean", .pme_ucode = 0x02, }, { .pme_uname = "HIT_DIRTY_NO_MEMORY_CANCEL", .pme_udesc = "Probe hit dirty without memory cancel (probed by Sized Write or Change2Dirty)", .pme_ucode = 0x04, }, { .pme_uname = "HIT_DIRTY_WITH_MEMORY_CANCEL", .pme_udesc = "Probe hit dirty with memory cancel (probed by DMA read or cache refill request)", .pme_ucode = 0x08, }, { .pme_uname = "UPSTREAM_DISPLAY_REFRESH_READS", .pme_udesc = "Upstream display refresh reads", .pme_ucode = 0x10, }, { .pme_uname = "UPSTREAM_NON_DISPLAY_REFRESH_READS", .pme_udesc = "Upstream non-display refresh reads", .pme_ucode = 0x20, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x3F, .pme_uflags = PFMLIB_AMD64_TILL_K8_REV_C, }, { .pme_uname = "UPSTREAM_WRITES", .pme_udesc = "Upstream writes", .pme_ucode = 0x40, .pme_uflags = PFMLIB_AMD64_K8_REV_D, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", 
.pme_ucode = 0x7F, .pme_uflags = PFMLIB_AMD64_K8_REV_D, }, }, }, /* 83 */{.pme_name = "GART", .pme_code = 0xEE, .pme_desc = "GART Events", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 4, .pme_umasks = { { .pme_uname = "APERTURE_HIT_FROM_CPU", .pme_udesc = "GART aperture hit on access from CPU", .pme_ucode = 0x01, }, { .pme_uname = "APERTURE_HIT_FROM_IO", .pme_udesc = "GART aperture hit on access from I/O", .pme_ucode = 0x02, }, { .pme_uname = "MISS", .pme_udesc = "GART miss", .pme_ucode = 0x04, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x07, }, }, }, /* 84 */{.pme_name = "HYPERTRANSPORT_LINK0", .pme_code = 0xF6, .pme_desc = "HyperTransport Link 0 Transmit Bandwidth", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "COMMAND_DWORD_SENT", .pme_udesc = "Command dword sent", .pme_ucode = 0x01, }, { .pme_uname = "DATA_DWORD_SENT", .pme_udesc = "Data dword sent", .pme_ucode = 0x02, }, { .pme_uname = "BUFFER_RELEASE_DWORD_SENT", .pme_udesc = "Buffer release dword sent", .pme_ucode = 0x04, }, { .pme_uname = "NOP_DWORD_SENT", .pme_udesc = "Nop dword sent (idle)", .pme_ucode = 0x08, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 85 */{.pme_name = "HYPERTRANSPORT_LINK1", .pme_code = 0xF7, .pme_desc = "HyperTransport Link 1 Transmit Bandwidth", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "COMMAND_DWORD_SENT", .pme_udesc = "Command dword sent", .pme_ucode = 0x01, }, { .pme_uname = "DATA_DWORD_SENT", .pme_udesc = "Data dword sent", .pme_ucode = 0x02, }, { .pme_uname = "BUFFER_RELEASE_DWORD_SENT", .pme_udesc = "Buffer release dword sent", .pme_ucode = 0x04, }, { .pme_uname = "NOP_DWORD_SENT", .pme_udesc = "Nop dword sent (idle)", .pme_ucode = 0x08, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, /* 86 */{.pme_name = "HYPERTRANSPORT_LINK2", .pme_code = 0xF8, 
.pme_desc = "HyperTransport Link 2 Transmit Bandwidth", .pme_flags = PFMLIB_AMD64_UMASK_COMBO, .pme_numasks = 5, .pme_umasks = { { .pme_uname = "COMMAND_DWORD_SENT", .pme_udesc = "Command dword sent", .pme_ucode = 0x01, }, { .pme_uname = "DATA_DWORD_SENT", .pme_udesc = "Data dword sent", .pme_ucode = 0x02, }, { .pme_uname = "BUFFER_RELEASE_DWORD_SENT", .pme_udesc = "Buffer release dword sent", .pme_ucode = 0x04, }, { .pme_uname = "NOP_DWORD_SENT", .pme_udesc = "Nop dword sent (idle)", .pme_ucode = 0x08, }, { .pme_uname = "ALL", .pme_udesc = "All sub-events selected", .pme_ucode = 0x0F, }, }, }, }; #define PME_AMD64_K8_EVENT_COUNT (sizeof(amd64_k8_pe)/sizeof(pme_amd64_entry_t)) #define PME_AMD64_K8_CPU_CLK_UNHALTED 40 #define PME_AMD64_K8_RETIRED_INSTRUCTIONS 41
papi-papi-7-2-0-t/src/libperfnec/lib/cell_events.h
/* * Copyright (c) 2007 TOSHIBA CORPORATION based on code from * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ static pme_cell_entry_t cell_pe[] = { {.pme_name = "CYCLES", .pme_desc = "CPU cycles", .pme_code = 0x0, /* 0 */ .pme_enable_word = WORD_NONE, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BRANCH_COMMIT_TH0", .pme_desc = "Branch instruction committed.", .pme_code = 0x834, /* 2100 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BRANCH_FLUSH_TH0", .pme_desc = "Branch instruction that caused a misprediction flush is committed. Branch misprediction includes", .pme_code = 0x835, /* 2101 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "INST_BUFF_EMPTY_TH0", .pme_desc = "Instruction buffer empty.", .pme_code = 0x836, /* 2102 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "INST_ERAT_MISS_TH0", .pme_desc = "Instruction effective-address-to-real-address translation (I-ERAT) miss.", .pme_code = 0x837, /* 2103 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L1_ICACHE_MISS_CYCLES_TH0", .pme_desc = "L1 Instruction cache miss cycles. 
Counts the cycles from the miss event until the returned instruction is dispatched or cancelled due to branch misprediction, completion restart, or exceptions.", .pme_code = 0x838, /* 2104 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "DISPATCH_BLOCKED_TH0", .pme_desc = "Valid instruction available for dispatch, but dispatch is blocked.", .pme_code = 0x83a, /* 2106 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "INST_FLUSH_TH0", .pme_desc = "Instruction in pipeline stage EX7 causes a flush.", .pme_code = 0x83d, /* 2109 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "PPC_INST_COMMIT_TH0", .pme_desc = "Two PowerPC instructions committed. For microcode sequences, only the last microcode operation is counted. Committed instructions are counted two at a time. If only one instruction has committed for a given cycle, this event will not be raised until another instruction has been committed in a future cycle.", .pme_code = 0x83f, /* 2111 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BRANCH_COMMIT_TH1", .pme_desc = "Branch instruction committed.", .pme_code = 0x847, /* 2119 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BRANCH_FLUSH_TH1", .pme_desc = "Branch instruction that caused a misprediction flush is committed. 
Branch misprediction includes", .pme_code = 0x848, /* 2120 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "INST_BUFF_EMPTY_TH1", .pme_desc = "Instruction buffer empty.", .pme_code = 0x849, /* 2121 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "INST_ERAT_MISS_TH1", .pme_desc = "Instruction effective-address-to-real-address translation (I-ERAT) miss.", .pme_code = 0x84a, /* 2122 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L1_ICACHE_MISS_CYCLES_TH1", .pme_desc = "L1 Instruction cache miss cycles. Counts the cycles from the miss event until the returned instruction is dispatched or cancelled due to branch misprediction, completion restart, or exceptions.", .pme_code = 0x84b, /* 2123 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "DISPATCH_BLOCKED_TH1", .pme_desc = "Valid instruction available for dispatch, but dispatch is blocked.", .pme_code = 0x84d, /* 2125 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "INST_FLUSH_TH1", .pme_desc = "Instruction in pipeline stage EX7 causes a flush.", .pme_code = 0x850, /* 2128 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "PPC_INST_COMMIT_TH1", .pme_desc = "Two PowerPC instructions committed. For microcode sequences, only the last microcode operation is counted. Committed instructions are counted two at a time. 
If only one instruction has committed for a given cycle, this event will not be raised until another instruction has been committed in a future cycle.", .pme_code = 0x852, /* 2130 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "DATA_ERAT_MISS_TH0", .pme_desc = "Data effective-address-to-real-address translation (D-ERAT) miss. Not speculative.", .pme_code = 0x89a, /* 2202 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "ST_REQ_TH0", .pme_desc = "Store request counted at the L2 interface. Counts microcoded PPE sequences more than once. (Thread 0 and 1)", .pme_code = 0x89b, /* 2203 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "LD_VALID_TH0", .pme_desc = "Load valid at a particular pipe stage. Speculative, since flushed operations are counted as well. Counts microcoded PPE sequences more than once. Misaligned flushes might be counted the first time as well. Load operations include all loads that read data from the cache, dcbt and dcbtst. Does not include load Vector/SIMD multimedia extension pattern instructions.", .pme_code = 0x89c, /* 2204 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "L1_DCACHE_MISS_TH0", .pme_desc = "L1 D-cache load miss. Pulsed when there is a miss request that has a tag miss but not an ERAT miss. Speculative, since flushed operations are counted as well.", .pme_code = 0x89d, /* 2205 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "DATA_ERAT_MISS_TH1", .pme_desc = "Data effective-address-to-real-address translation (D-ERAT) miss. 
Not speculative.", .pme_code = 0x8aa, /* 2218 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "LD_VALID_TH1", .pme_desc = "Load valid at a particular pipe stage. Speculative, since flushed operations are counted as well. Counts microcoded PPE sequences more than once. Misaligned flushes might be counted the first time as well. Load operations include all loads that read data from the cache, dcbt and dcbtst. Does not include load Vector/SIMD multimedia extension pattern instructions.", .pme_code = 0x8ac, /* 2220 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "L1_DCACHE_MISS_TH1", .pme_desc = "L1 D-cache load miss. Pulsed when there is a miss request that has a tag miss but not an ERAT miss. Speculative, since flushed operations are counted as well.", .pme_code = 0x8ad, /* 2221 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "LD_MFC_MMIO", .pme_desc = "Load from MFC memory-mapped I/O (MMIO) space.", .pme_code = 0xc1c, /* 3100 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "ST_MFC_MMIO", .pme_desc = "Stores to MFC MMIO space.", .pme_code = 0xc1d, /* 3101 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "REQ_TOKEN_TYPE", .pme_desc = "Request token for even memory bank numbers 0-14.", .pme_code = 0xc22, /* 3106 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "RCV_8BEAT_DATA", .pme_desc = "Receive 8-beat data from the Element Interconnect Bus (EIB).", .pme_code = 0xc2b, /* 3115 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name =
"SEND_8BEAT_DATA", .pme_desc = "Send 8-beat data to the EIB.", .pme_code = 0xc2c, /* 3116 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "SEND_CMD", .pme_desc = "Send a command to the EIB; includes retried commands.", .pme_code = 0xc2d, /* 3117 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "DATA_GRANT_CYCLES", .pme_desc = "Cycles between data request and data grant.", .pme_code = 0xc2e, /* 3118 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "NCU_ST_Q_NOT_EMPTY_CYCLES", .pme_desc = "The five-entry Non-Cacheable Unit (NCU) Store Command queue not empty.", .pme_code = 0xc33, /* 3123 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "L2_CACHE_HIT", .pme_desc = "Cache hit for core interface unit (CIU) loads and stores.", .pme_code = 0xc80, /* 3200 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_CACHE_MISS", .pme_desc = "Cache miss for CIU loads and stores.", .pme_code = 0xc81, /* 3201 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_LD_MISS", .pme_desc = "CIU load miss.", .pme_code = 0xc84, /* 3204 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_ST_MISS", .pme_desc = "CIU store to Invalid state (miss).", .pme_code = 0xc85, /* 3205 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_LWARX_LDARX_MISS_TH0", .pme_desc = "Load word and reserve indexed (lwarx/ldarx) for Thread 0 hits Invalid cache state", .pme_code = 0xc87, /* 3207 */ .pme_enable_word = WORD_0_AND_2, 
.pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_STWCX_STDCX_MISS_TH0", .pme_desc = "Store word conditional indexed (stwcx/stdcx) for Thread 0 hits Invalid cache state when reservation is set.", .pme_code = 0xc8e, /* 3214 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_ALL_SNOOP_SM_BUSY", .pme_desc = "All four snoop state machines busy.", .pme_code = 0xc99, /* 3225 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "L2_DCLAIM_GOOD", .pme_desc = "Data line claim (dclaim) that received good combined response; includes store/stcx/dcbz to Shared (S), Shared Last (SL), or Tagged (T) cache state; does not include dcbz to Invalid (I) cache state.", .pme_code = 0xce8, /* 3304 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_DCLAIM_TO_RWITM", .pme_desc = "Dclaim converted into rwitm; may still not get to the bus if stcx is aborted.", .pme_code = 0xcef, /* 3311 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_ST_TO_M_MU_E", .pme_desc = "Store to modified (M), modified unsolicited (MU), or exclusive (E) cache state.", .pme_code = 0xcf0, /* 3312 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_ST_Q_FULL", .pme_desc = "8-entry store queue (STQ) full.", .pme_code = 0xcf1, /* 3313 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "L2_ST_TO_RC_ACKED", .pme_desc = "Store dispatched to RC machine is acknowledged.", .pme_code = 0xcf2, /* 3314 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_GATHERABLE_ST",
.pme_desc = "Gatherable store (type = 00000) received from CIU.", .pme_code = 0xcf3, /* 3315 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_SNOOP_PUSH", .pme_desc = "Snoop push.", .pme_code = 0xcf6, /* 3318 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_INTERVENTION_FROM_SL_E_SAME_MODE", .pme_desc = "Send intervention from (SL | E) cache state to a destination within the same CBE chip.", .pme_code = 0xcf7, /* 3319 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_INTERVENTION_FROM_M_MU_SAME_MODE", .pme_desc = "Send intervention from (M | MU) cache state to a destination within the same CBE chip.", .pme_code = 0xcf8, /* 3320 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_SNOOP_RETRY_CONFLICTS", .pme_desc = "Respond with Retry to a snooped request due to one of the following conflicts", .pme_code = 0xcfd, /* 3325 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_SNOOP_RETRY_BUSY", .pme_desc = "Respond with Retry to a snooped request because all snoop machines are busy.", .pme_code = 0xcfe, /* 3326 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_SNOOP_RESP_MMU_TO_EST", .pme_desc = "Snooped response causes a cache state transition from (M | MU) to (E | S | T).", .pme_code = 0xcff, /* 3327 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_SNOOP_RESP_E_TO_S", .pme_desc = "Snooped response causes a cache state transition from E to S.", .pme_code = 0xd00, /* 3328 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, 
.pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_SNOOP_RESP_ESLST_TO_I", .pme_desc = "Snooped response causes a cache state transition from (E | SL | S | T) to Invalid (I).", .pme_code = 0xd01, /* 3329 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_SNOOP_RESP_MMU_TO_I", .pme_desc = "Snooped response causes a cache state transition from (M | MU) to I.", .pme_code = 0xd02, /* 3330 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_LWARX_LDARX_MISS_TH1", .pme_desc = "Load and reserve indexed (lwarx/ldarx) for Thread 1 hits Invalid cache state.", .pme_code = 0xd54, /* 3412 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_STWCX_STDCX_MISS_TH1", .pme_desc = "Store conditional indexed (stwcx/stdcx) for Thread 1 hits Invalid cache state.", .pme_code = 0xd5b, /* 3419 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "NCU_NON_CACHEABLE_ST_ALL", .pme_desc = "Non-cacheable store request received from CIU; includes all synchronization operations such as sync and eieio.", .pme_code = 0xdac, /* 3500 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "NCU_SYNC_REQ", .pme_desc = "sync received from CIU.", .pme_code = 0xdad, /* 3501 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "NCU_NON_CACHEABLE_ST", .pme_desc = "Non-cacheable store request received from CIU; includes only stores.", .pme_code = 0xdb0, /* 3504 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "NCU_EIEIO_REQ", .pme_desc = "eieio received from CIU.", .pme_code = 0xdb2, /* 3506 */ .pme_enable_word 
= WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "NCU_TLBIE_REQ", .pme_desc = "tlbie received from CIU.", .pme_code = 0xdb3, /* 3507 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "NCU_SYNC_WAIT", .pme_desc = "sync at the bottom of the store queue, while waiting on st_done signal from the Bus Interface Unit (BIU) and sync_done signal from L2.", .pme_code = 0xdb4, /* 3508 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "NCU_LWSYNC_WAIT", .pme_desc = "lwsync at the bottom of the store queue, while waiting for a sync_done signal from the L2.", .pme_code = 0xdb5, /* 3509 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "NCU_EIEIO_WAIT", .pme_desc = "eieio at the bottom of the store queue, while waiting for a st_done signal from the BIU and a sync_done signal from the L2.", .pme_code = 0xdb6, /* 3510 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "NCU_TLBIE_WAIT", .pme_desc = "tlbie at the bottom of the store queue, while waiting for a st_done signal from the BIU.", .pme_code = 0xdb7, /* 3511 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "NCU_COMBINED_NON_CACHEABLE_ST", .pme_desc = "Non-cacheable store combined with the previous non-cacheable store with a contiguous address.", .pme_code = 0xdb8, /* 3512 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "NCU_ALL_ST_GATHER_BUFFS_FULL", .pme_desc = "All four store-gather buffers full.", .pme_code = 0xdbb, /* 3515 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = 
COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "NCU_LD_REQ", .pme_desc = "Non-cacheable load request received from CIU; includes instruction and data fetches.", .pme_code = 0xdbc, /* 3516 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "NCU_ST_Q_NOT_EMPTY", .pme_desc = "The four-deep store queue not empty.", .pme_code = 0xdbd, /* 3517 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "NCU_ST_Q_FULL", .pme_desc = "The four-deep store queue full.", .pme_code = 0xdbe, /* 3518 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "NCU_AT_LEAST_ONE_ST_GATHER_BUFF_NOT_EMPTY", .pme_desc = "At least one store gather buffer not empty.", .pme_code = 0xdbf, /* 3519 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_DUAL_INST_COMMITTED", .pme_desc = "A dual instruction is committed.", .pme_code = 0x1004, /* 4100 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_SINGLE_INST_COMMITTED", .pme_desc = "A single instruction is committed.", .pme_code = 0x1005, /* 4101 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_PIPE0_INST_COMMITTED", .pme_desc = "A pipeline 0 instruction is committed.", .pme_code = 0x1006, /* 4102 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_PIPE1_INST_COMMITTED", .pme_desc = "A pipeline 1 instruction is committed.", .pme_code = 0x1007, /* 4103 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_LS_BUSY", .pme_desc = "Local storage is busy.", 
.pme_code = 0x1009, /* 4105 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_DMA_CONFLICT_LD_ST", .pme_desc = "A direct memory access (DMA) might conflict with a load or store.", .pme_code = 0x100a, /* 4106 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_LS_ST", .pme_desc = "A store instruction to local storage is issued.", .pme_code = 0x100b, /* 4107 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_LS_LD", .pme_desc = "A load instruction from local storage is issued.", .pme_code = 0x100c, /* 4108 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_FP_EXCEPTION", .pme_desc = "A floating-point unit exception occurred.", .pme_code = 0x100d, /* 4109 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_BRANCH_COMMIT", .pme_desc = "A branch instruction is committed.", .pme_code = 0x100e, /* 4110 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_NON_SEQ_PC", .pme_desc = "A nonsequential change of the SPU program counter has occurred. This can be caused by branch, asynchronous interrupt, stalled wait on channel, error-correction code (ECC) error, and so forth.", .pme_code = 0x100f, /* 4111 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_BRANCH_NOT_TAKEN", .pme_desc = "A branch was not taken.", .pme_code = 0x1010, /* 4112 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_BRANCH_MISS_PREDICTION", .pme_desc = "Branch miss prediction. 
This count is not exact. Certain other code sequences can cause additional pulses on this signal.", .pme_code = 0x1011, /* 4113 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_BRANCH_HINT_MISS_PREDICTION", .pme_desc = "Branch hint miss prediction. This count is not exact. Certain other code sequences can cause additional pulses on this signal.", .pme_code = 0x1012, /* 4114 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_INST_SEQ_ERROR", .pme_desc = "Instruction sequence error.", .pme_code = 0x1013, /* 4115 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_STALL_CH_WRITE", .pme_desc = "Stalled waiting on any blocking channel write.", .pme_code = 0x1015, /* 4117 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_STALL_EXTERNAL_EVENT_CH0", .pme_desc = "Stalled waiting on external event status (Channel 0).", .pme_code = 0x1016, /* 4118 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_STALL_SIGNAL_1_CH3", .pme_desc = "Stalled waiting on SPU Signal Notification 1 (Channel 3).", .pme_code = 0x1017, /* 4119 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_STALL_SIGNAL_2_CH4", .pme_desc = "Stalled waiting on SPU Signal Notification 2 (Channel 4).", .pme_code = 0x1018, /* 4120 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_STALL_DMA_CH21", .pme_desc = "Stalled waiting on DMA Command Opcode or ClassID Register (Channel 21).", .pme_code = 0x1019, /* 4121 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type
= COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_STALL_MFC_READ_CH24", .pme_desc = "Stalled waiting on memory flow control (MFC) Read Tag-Group Status (Channel 24).", .pme_code = 0x101a, /* 4122 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_STALL_MFC_READ_CH25", .pme_desc = "Stalled waiting on MFC Read List Stall-and-Notify Tag Status (Channel 25).", .pme_code = 0x101b, /* 4123 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_STALL_OUTBOUND_MAILBOX_WRITE_CH28", .pme_desc = "Stalled waiting on SPU Write Outbound Mailbox (Channel 28).", .pme_code = 0x101c, /* 4124 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_STALL_MAILBOX_CH29", .pme_desc = "Stalled waiting on SPU Mailbox (Channel 29).", .pme_code = 0x1022, /* 4130 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_TR_STALL_CH", .pme_desc = "Stalled waiting on a channel operation.", .pme_code = 0x10a1, /* 4257 */ .pme_enable_word = WORD_NONE, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_EV_INST_FETCH_STALL", .pme_desc = "Instruction fetch stall.", .pme_code = 0x1107, /* 4359 */ .pme_enable_word = WORD_NONE, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_EV_ADDR_TRACE", .pme_desc = "Serialized SPU address (program counter) trace.", .pme_code = 0x110b, /* 4363 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_LD", .pme_desc = "An atomic load was received from direct memory access controller (DMAC).", .pme_code = 0x13ed, /* 5101 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name =
"MFC_ATOMIC_DCLAIM", .pme_desc = "An atomic dclaim was sent to synergistic bus interface (SBI); includes retried requests.", .pme_code = 0x13ee, /* 5102 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_RWITM", .pme_desc = "An atomic rwitm was sent to SBI; includes retried requests.", .pme_code = 0x13ef, /* 5103 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_LD_CACHE_MISS_MU", .pme_desc = "An atomic load miss caused MU cache state.", .pme_code = 0x13f0, /* 5104 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_LD_CACHE_MISS_E", .pme_desc = "An atomic load miss caused E cache state.", .pme_code = 0x13f1, /* 5105 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_LD_CACHE_MISS_SL", .pme_desc = "An atomic load miss caused SL cache state.", .pme_code = 0x13f2, /* 5106 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_LD_CACHE_HIT", .pme_desc = "An atomic load hits cache.", .pme_code = 0x13f3, /* 5107 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_LD_CACHE_MISS_INTERVENTION", .pme_desc = "Atomic load misses cache with data intervention; sum of signals 4 and 6 in this group.", .pme_code = 0x13f4, /* 5108 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_PUTLLXC_CACHE_MISS_WO_INTERVENTION", .pme_desc = "putllc or putlluc misses cache without data intervention; for putllc, counts only when reservation is set for the address.", .pme_code = 0x13fa, /* 5114 */ .pme_enable_word
= WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_SNOOP_MACHINE_BUSY", .pme_desc = "Snoop machine busy.", .pme_code = 0x13fd, /* 5117 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MFC_SNOOP_MMU_TO_I", .pme_desc = "A snoop caused cache transition from [M | MU] to I.", .pme_code = 0x13ff, /* 5119 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_SNOOP_ESSL_TO_I", .pme_desc = "A snoop caused cache transition from [E | S | SL] to I.", .pme_code = 0x1401, /* 5121 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_SNOOP_MU_TO_T", .pme_desc = "A snoop caused cache transition from MU to T cache state.", .pme_code = 0x1403, /* 5123 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_SENT_INTERVENTION_LOCAL", .pme_desc = "Sent modified data intervention to a destination within the same CBE chip.", .pme_code = 0x1407, /* 5127 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ANY_DMA_GET", .pme_desc = "Any flavor of DMA get[] command issued to Synergistic Bus Interface (SBI); sum of signals 17-25 in this group.", .pme_code = 0x1450, /* 5200 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ANY_DMA_PUT", .pme_desc = "Any flavor of DMA put[] command issued to SBI; sum of signals 2-16 in this group.", .pme_code = 0x1451, /* 5201 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_DMA_PUT", .pme_desc = "DMA put (put) is issued to SBI.", .pme_code = 0x1452, /* 5202 */ .pme_enable_word = 
WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_DMA_GET", .pme_desc = "DMA get data from effective address to local storage (get) issued to SBI.", .pme_code = 0x1461, /* 5217 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_LD_REQ", .pme_desc = "Load request sent to element interconnect bus (EIB); includes read, read atomic, rwitm, rwitm atomic, and retried commands.", .pme_code = 0x14b8, /* 5304 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ST_REQ", .pme_desc = "Store request sent to EIB; includes wwf, wwc, wwk, dclaim, dclaim atomic, and retried commands.", .pme_code = 0x14b9, /* 5305 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_RECV_DATA", .pme_desc = "Received data from EIB, including partial cache line data.", .pme_code = 0x14ba, /* 5306 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_SENT_DATA", .pme_desc = "Sent data to EIB, both as a master and a snooper.", .pme_code = 0x14bb, /* 5307 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_SBI_Q_NOT_EMPTY", .pme_desc = "16-deep synergistic bus interface (SBI) queue with outgoing requests not empty; does not include atomic requests.", .pme_code = 0x14bc, /* 5308 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MFC_SBI_Q_FULL", .pme_desc = "16-deep SBI queue with outgoing requests full; does not include atomic requests.", .pme_code = 0x14bd, /* 5309 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = 
"MFC_SENT_REQ", .pme_desc = "Sent request to EIB.", .pme_code = 0x14be, /* 5310 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_RECV_DATA_BUS_GRANT", .pme_desc = "Received data bus grant; includes data sent for MMIO operations.", .pme_code = 0x14c0, /* 5312 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_WAIT_DATA_BUS_GRANT", .pme_desc = "Cycles between data bus request and data bus grant.", .pme_code = 0x14c1, /* 5313 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MFC_CMD_O_MEM", .pme_desc = "Command (read or write) for an odd-numbered memory bank; valid only when resource allocation is turned on.", .pme_code = 0x14c2, /* 5314 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_CMD_E_MEM", .pme_desc = "Command (read or write) for an even-numbered memory bank; valid only when resource allocation is turned on.", .pme_code = 0x14c3, /* 5315 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_RECV_RETRY_RESP", .pme_desc = "Request gets the Retry response; includes local and global requests.", .pme_code = 0x14c6, /* 5318 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_SENT_DATA_BUS_REQ", .pme_desc = "Sent data bus request to EIB.", .pme_code = 0x14c7, /* 5319 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_TLB_MISS", .pme_desc = "Translation Lookaside Buffer (TLB) miss without parity or protection errors.", .pme_code = 0x1518, /* 5400 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = 
COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_TLB_CYCLES", .pme_desc = "TLB miss (cycles).", .pme_code = 0x1519, /* 5401 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MFC_TLB_HIT", .pme_desc = "TLB hit.", .pme_code = 0x151a, /* 5402 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_READ_RWITM_1", .pme_desc = "Number of read and rwitm commands (including atomic) AC1 to AC0. (Group 1)", .pme_code = 0x17d4, /* 6100 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_DCLAIM_1", .pme_desc = "Number of dclaim commands (including atomic) AC1 to AC0. (Group 1)", .pme_code = 0x17d5, /* 6101 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_WWK_WWC_WWF_1", .pme_desc = "Number of wwk, wwc, and wwf commands from AC1 to AC0. (Group 1)", .pme_code = 0x17d6, /* 6102 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SYNC_TLBSYNC_EIEIO_1", .pme_desc = "Number of sync, tlbsync, and eieio commands from AC1 to AC0. (Group 1)", .pme_code = 0x17d7, /* 6103 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_TLBIE_1", .pme_desc = "Number of tlbie commands from AC1 to AC0. (Group 1)", .pme_code = 0x17d8, /* 6104 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_PAAM_CAM_HIT_1", .pme_desc = "Previous adjacent address match (PAAM) Content Addressable Memory (CAM) hit. 
(Group 1)", .pme_code = 0x17df, /* 6111 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_PAAM_CAM_MISS_1", .pme_desc = "PAAM CAM miss. (Group 1)", .pme_code = 0x17e0, /* 6112 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_CMD_REFLECTED_1", .pme_desc = "Command reflected. (Group 1)", .pme_code = 0x17e2, /* 6114 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_READ_RWITM_2", .pme_desc = "Number of read and rwitm commands (including atomic) AC1 to AC0. (Group 2)", .pme_code = 0x17e4, /* 6116 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_DCLAIM_2", .pme_desc = "Number of dclaim commands (including atomic) AC1 to AC0. (Group 2)", .pme_code = 0x17e5, /* 6117 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_WWK_WWC_WWF_2", .pme_desc = "Number of wwk, wwc, and wwf commands from AC1 to AC0. (Group 2)", .pme_code = 0x17e6, /* 6118 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SYNC_TLBSYNC_EIEIO_2", .pme_desc = "Number of sync, tlbsync, and eieio commands from AC1 to AC0. (Group 2)", .pme_code = 0x17e7, /* 6119 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_TLBIE_2", .pme_desc = "Number of tlbie commands from AC1 to AC0. (Group 2)", .pme_code = 0x17e8, /* 6120 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_PAAM_CAM_HIT_2", .pme_desc = "PAAM CAM hit. 
(Group 2)", .pme_code = 0x17ef, /* 6127 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_PAAM_CAM_MISS_2", .pme_desc = "PAAM CAM miss. (Group 2)", .pme_code = 0x17f0, /* 6128 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_CMD_REFLECTED_2", .pme_desc = "Command reflected. (Group 2)", .pme_code = 0x17f2, /* 6130 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_LOCAL_CMD_FROM_SPE6", .pme_desc = "Local command from SPE 6.", .pme_code = 0x1839, /* 6201 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_LOCAL_CMD_FROM_SPE4", .pme_desc = "Local command from SPE 4.", .pme_code = 0x183a, /* 6202 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_LOCAL_CMD_FROM_SPE2", .pme_desc = "Local command from SPE 2.", .pme_code = 0x183b, /* 6203 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_LOCAL_CMD_FROM_PPE", .pme_desc = "Local command from PPE.", .pme_code = 0x183d, /* 6205 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_LOCAL_CMD_FROM_SPE1", .pme_desc = "Local command from SPE 1.", .pme_code = 0x183e, /* 6206 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_LOCAL_CMD_FROM_SPE3", .pme_desc = "Local command from SPE 3.", .pme_code = 0x183f, /* 6207 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_LOCAL_CMD_FROM_SPE5", .pme_desc = "Local command from SPE 5.", .pme_code = 0x1840, /*
6208 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_LOCAL_CMD_FROM_SPE7", .pme_desc = "Local command from SPE 7.", .pme_code = 0x1841, /* 6209 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_SPE6", .pme_desc = "AC1-to-AC0 global command from SPE 6.", .pme_code = 0x1844, /* 6212 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_SPE4", .pme_desc = "AC1-to-AC0 global command from SPE 4.", .pme_code = 0x1845, /* 6213 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_SPE2", .pme_desc = "AC1-to-AC0 global command from SPE 2.", .pme_code = 0x1846, /* 6214 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_SPE0", .pme_desc = "AC1-to-AC0 global command from SPE 0.", .pme_code = 0x1847, /* 6215 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_PPE", .pme_desc = "AC1-to-AC0 global command from PPE.", .pme_code = 0x1848, /* 6216 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_SPE1", .pme_desc = "AC1-to-AC0 global command from SPE 1.", .pme_code = 0x1849, /* 6217 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_SPE3", .pme_desc = "AC1-to-AC0 global command from SPE 3.", .pme_code = 0x184a, /* 6218 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_SPE5", 
.pme_desc = "AC1-to-AC0 global command from SPE 5.", .pme_code = 0x184b, /* 6219 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_SPE7", .pme_desc = "AC1-to-AC0 global command from SPE 7.", .pme_code = 0x184c, /* 6220 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_AC1_REFLECTING_LOCAL_CMD", .pme_desc = "AC1 is reflecting any local command.", .pme_code = 0x184e, /* 6222 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_AC1_SEND_GLOBAL_CMD", .pme_desc = "AC1 sends a global command to AC0.", .pme_code = 0x184f, /* 6223 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_AC0_REFLECT_GLOBAL_CMD", .pme_desc = "AC0 reflects a global command back to AC1.", .pme_code = 0x1850, /* 6224 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_AC1_REFLECT_CMD_TO_BM", .pme_desc = "AC1 reflects a command back to the bus masters.", .pme_code = 0x1851, /* 6225 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GRANT_DATA_RING0_1", .pme_desc = "Grant on data ring 0.", .pme_code = 0x189c, /* 6300 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GRANT_DATA_RING1_1", .pme_desc = "Grant on data ring 1.", .pme_code = 0x189d, /* 6301 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GRANT_DATA_RING2_1", .pme_desc = "Grant on data ring 2.", .pme_code = 0x189e, /* 6302 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type =
COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GRANT_DATA_RING3_1", .pme_desc = "Grant on data ring 3.", .pme_code = 0x189f, /* 6303 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_DATA_RING0_INUSE_1", .pme_desc = "Data ring 0 is in use.", .pme_code = 0x18a0, /* 6304 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_DATA_RING1_INUSE_1", .pme_desc = "Data ring 1 is in use.", .pme_code = 0x18a1, /* 6305 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_DATA_RING2_INUSE_1", .pme_desc = "Data ring 2 is in use.", .pme_code = 0x18a2, /* 6306 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_DATA_RING3_INUSE_1", .pme_desc = "Data ring 3 is in use.", .pme_code = 0x18a3, /* 6307 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_ALL_DATA_RINGS_IDLE_1", .pme_desc = "All data rings are idle.", .pme_code = 0x18a4, /* 6308 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_ONE_DATA_RING_BUSY_1", .pme_desc = "One data ring is busy.", .pme_code = 0x18a5, /* 6309 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TWO_OR_THREE_DATA_RINGS_BUSY_1", .pme_desc = "Two or three data rings are busy.", .pme_code = 0x18a6, /* 6310 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_ALL_DATA_RINGS_BUSY_1", .pme_desc = "All data rings are busy.", .pme_code = 0x18a7, /* 6311 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = 
COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_IOIF0_DATA_REQ_PENDING_1", .pme_desc = "BIC(IOIF0) data request pending.", .pme_code = 0x18a8, /* 6312 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE6_DATA_REQ_PENDING_1", .pme_desc = "SPE 6 data request pending.", .pme_code = 0x18a9, /* 6313 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE4_DATA_REQ_PENDING_1", .pme_desc = "SPE 4 data request pending.", .pme_code = 0x18aa, /* 6314 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE2_DATA_REQ_PENDING_1", .pme_desc = "SPE 2 data request pending.", .pme_code = 0x18ab, /* 6315 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE0_DATA_REQ_PENDING_1", .pme_desc = "SPE 0 data request pending.", .pme_code = 0x18ac, /* 6316 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_MIC_DATA_REQ_PENDING_1", .pme_desc = "MIC data request pending.", .pme_code = 0x18ad, /* 6317 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_PPE_DATA_REQ_PENDING_1", .pme_desc = "PPE data request pending.", .pme_code = 0x18ae, /* 6318 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE1_DATA_REQ_PENDING_1", .pme_desc = "SPE 1 data request pending.", .pme_code = 0x18af, /* 6319 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE3_DATA_REQ_PENDING_1", .pme_desc = "SPE 3 data request pending.", .pme_code = 0x18b0, /* 6320 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = 
PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE5_DATA_REQ_PENDING_1", .pme_desc = "SPE 5 data request pending.", .pme_code = 0x18b1, /* 6321 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE7_DATA_REQ_PENDING_1", .pme_desc = "SPE 7 data request pending.", .pme_code = 0x18b2, /* 6322 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_IOIF0_DATA_DEST_1", .pme_desc = "IOIF0 is data destination.", .pme_code = 0x18b4, /* 6324 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE6_DATA_DEST_1", .pme_desc = "SPE 6 is data destination.", .pme_code = 0x18b5, /* 6325 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE4_DATA_DEST_1", .pme_desc = "SPE 4 is data destination.", .pme_code = 0x18b6, /* 6326 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE2_DATA_DEST_1", .pme_desc = "SPE 2 is data destination.", .pme_code = 0x18b7, /* 6327 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE0_DATA_DEST_1", .pme_desc = "SPE 0 is data destination.", .pme_code = 0x18b8, /* 6328 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_MIC_DATA_DEST_1", .pme_desc = "MIC is data destination.", .pme_code = 0x18b9, /* 6329 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_PPE_DATA_DEST_1", .pme_desc = "PPE is data destination.", .pme_code = 0x18ba, /* 6330 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = 
COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE1_DATA_DEST_1", .pme_desc = "SPE 1 is data destination.", .pme_code = 0x18bb, /* 6331 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_IOIF0_DATA_REQ_PENDING_2", .pme_desc = "BIC(IOIF0) data request pending.", .pme_code = 0x1900, /* 6400 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE6_DATA_REQ_PENDING_2", .pme_desc = "SPE 6 data request pending.", .pme_code = 0x1901, /* 6401 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE4_DATA_REQ_PENDING_2", .pme_desc = "SPE 4 data request pending.", .pme_code = 0x1902, /* 6402 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE2_DATA_REQ_PENDING_2", .pme_desc = "SPE 2 data request pending.", .pme_code = 0x1903, /* 6403 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE0_DATA_REQ_PENDING_2", .pme_desc = "SPE 0 data request pending.", .pme_code = 0x1904, /* 6404 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_MIC_DATA_REQ_PENDING_2", .pme_desc = "MIC data request pending.", .pme_code = 0x1905, /* 6405 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_PPE_DATA_REQ_PENDING_2", .pme_desc = "PPE data request pending.", .pme_code = 0x1906, /* 6406 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE1_DATA_REQ_PENDING_2", .pme_desc = "SPE 1 data request pending.", .pme_code = 0x1907, /* 6407 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, 
.pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE3_DATA_REQ_PENDING_2", .pme_desc = "SPE 3 data request pending.", .pme_code = 0x1908, /* 6408 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE5_DATA_REQ_PENDING_2", .pme_desc = "SPE 5 data request pending.", .pme_code = 0x1909, /* 6409 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE7_DATA_REQ_PENDING_2", .pme_desc = "SPE 7 data request pending.", .pme_code = 0x190a, /* 6410 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_IOIF1_DATA_REQ_PENDING_2", .pme_desc = "IOIF1 data request pending.", .pme_code = 0x190b, /* 6411 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_IOIF0_DATA_DEST_2", .pme_desc = "IOIF0 is data destination.", .pme_code = 0x190c, /* 6412 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE6_DATA_DEST_2", .pme_desc = "SPE 6 is data destination.", .pme_code = 0x190d, /* 6413 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE4_DATA_DEST_2", .pme_desc = "SPE 4 is data destination.", .pme_code = 0x190e, /* 6414 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE2_DATA_DEST_2", .pme_desc = "SPE 2 is data destination.", .pme_code = 0x190f, /* 6415 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE0_DATA_DEST_2", .pme_desc = "SPE 0 is data destination.", .pme_code = 0x1910, /* 6416 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = 
COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_MIC_DATA_DEST_2", .pme_desc = "MIC is data destination.", .pme_code = 0x1911, /* 6417 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_PPE_DATA_DEST_2", .pme_desc = "PPE is data destination.", .pme_code = 0x1912, /* 6418 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE1_DATA_DEST_2", .pme_desc = "SPE 1 is data destination.", .pme_code = 0x1913, /* 6419 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE3_DATA_DEST_2", .pme_desc = "SPE 3 is data destination.", .pme_code = 0x1914, /* 6420 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE5_DATA_DEST_2", .pme_desc = "SPE 5 is data destination.", .pme_code = 0x1915, /* 6421 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE7_DATA_DEST_2", .pme_desc = "SPE 7 is data destination.", .pme_code = 0x1916, /* 6422 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_IOIF1_DATA_DEST_2", .pme_desc = "IOIF1 is data destination.", .pme_code = 0x1917, /* 6423 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GRANT_DATA_RING0_2", .pme_desc = "Grant on data ring 0.", .pme_code = 0x1918, /* 6424 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GRANT_DATA_RING1_2", .pme_desc = "Grant on data ring 1.", .pme_code = 0x1919, /* 6425 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = 
"EIB_GRANT_DATA_RING2_2", .pme_desc = "Grant on data ring 2.", .pme_code = 0x191a, /* 6426 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GRANT_DATA_RING3_2", .pme_desc = "Grant on data ring 3.", .pme_code = 0x191b, /* 6427 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_ALL_DATA_RINGS_IDLE_2", .pme_desc = "All data rings are idle.", .pme_code = 0x191c, /* 6428 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_ONE_DATA_RING_BUSY_2", .pme_desc = "One data ring is busy.", .pme_code = 0x191d, /* 6429 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TWO_OR_THREE_DATA_RINGS_BUSY_2", .pme_desc = "Two or three data rings are busy.", .pme_code = 0x191e, /* 6430 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_ALL_DATA_RINGS_BUSY_2", .pme_desc = "All four data rings are busy.", .pme_code = 0x191f, /* 6431 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_XIO_UNUSED", .pme_desc = "Even XIO token unused by RAG 0.", .pme_code = 0xfe4c, /* 65100 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_XIO_UNUSED", .pme_desc = "Odd XIO token unused by RAG 0.", .pme_code = 0xfe4d, /* 65101 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_BANK_UNUSED", .pme_desc = "Even bank token unused by RAG 0.", .pme_code = 0xfe4e, /* 65102 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = 
COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_BANK_UNUSED", .pme_desc = "Odd bank token unused by RAG 0.", .pme_code = 0xfe4f, /* 65103 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_SPE0", .pme_desc = "Token granted for SPE 0.", .pme_code = 0xfe54, /* 65108 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_SPE1", .pme_desc = "Token granted for SPE 1.", .pme_code = 0xfe55, /* 65109 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_SPE2", .pme_desc = "Token granted for SPE 2.", .pme_code = 0xfe56, /* 65110 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_SPE3", .pme_desc = "Token granted for SPE 3.", .pme_code = 0xfe57, /* 65111 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_SPE4", .pme_desc = "Token granted for SPE 4.", .pme_code = 0xfe58, /* 65112 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_SPE5", .pme_desc = "Token granted for SPE 5.", .pme_code = 0xfe59, /* 65113 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_SPE6", .pme_desc = "Token granted for SPE 6.", .pme_code = 0xfe5a, /* 65114 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_SPE7", .pme_desc = "Token granted for SPE 7.", .pme_code = 0xfe5b, /* 65115 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = 
COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_XIO_WASTED", .pme_desc = "Even XIO token wasted by RAG 0; valid only when Unused Enable (UE) = 1 in TKM_CR register.", .pme_code = 0xfeb0, /* 65200 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_XIO_WASTED", .pme_desc = "Odd XIO token wasted by RAG 0; valid only when Unused Enable (UE) = 1 in TKM_CR register.", .pme_code = 0xfeb1, /* 65201 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_BANK_WASTED", .pme_desc = "Even bank token wasted by RAG 0; valid only when Unused Enable (UE) = 1 in TKM_CR register.", .pme_code = 0xfeb2, /* 65202 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_BANK_WASTED", .pme_desc = "Odd bank token wasted by RAG 0; valid only when Unused Enable (UE) = 1 in TKM_CR register.", .pme_code = 0xfeb3, /* 65203 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAGU_E_XIO_WASTED", .pme_desc = "Even XIO token wasted by RAG U.", .pme_code = 0xfebc, /* 65212 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAGU_O_XIO_WASTED", .pme_desc = "Odd XIO token wasted by RAG U.", .pme_code = 0xfebd, /* 65213 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAGU_E_BANK_WASTED", .pme_desc = "Even bank token wasted by RAG U.", .pme_code = 0xfebe, /* 65214 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAGU_O_BANK_WASTED", .pme_desc = "Odd bank token wasted by RAG U.", .pme_code = 0xfebf, /* 65215 */ .pme_enable_word = 
WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_XIO_RAG1", .pme_desc = "Even XIO token from RAG 0 shared with RAG 1", .pme_code = 0xff14, /* 65300 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_XIO_RAG2", .pme_desc = "Even XIO token from RAG 0 shared with RAG 2", .pme_code = 0xff15, /* 65301 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_XIO_RAG3", .pme_desc = "Even XIO token from RAG 0 shared with RAG 3", .pme_code = 0xff16, /* 65302 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_XIO_RAG1", .pme_desc = "Odd XIO token from RAG 0 shared with RAG 1", .pme_code = 0xff17, /* 65303 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_XIO_RAG2", .pme_desc = "Odd XIO token from RAG 0 shared with RAG 2", .pme_code = 0xff18, /* 65304 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_XIO_RAG3", .pme_desc = "Odd XIO token from RAG 0 shared with RAG 3", .pme_code = 0xff19, /* 65305 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_BANK_RAG1", .pme_desc = "Even bank token from RAG 0 shared with RAG 1", .pme_code = 0xff1a, /* 65306 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_BANK_RAG2", .pme_desc = "Even bank token from RAG 0 shared with RAG 2", .pme_code = 0xff1b, /* 65307 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name 
= "EIB_RAG0_E_BANK_RAG3", .pme_desc = "Even bank token from RAG 0 shared with RAG 3", .pme_code = 0xff1c, /* 65308 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_BANK_RAG1", .pme_desc = "Odd bank token from RAG 0 shared with RAG 1", .pme_code = 0xff1d, /* 65309 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_BANK_RAG2", .pme_desc = "Odd bank token from RAG 0 shared with RAG 2", .pme_code = 0xff1e, /* 65310 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_BANK_RAG3", .pme_desc = "Odd bank token from RAG 0 shared with RAG 3", .pme_code = 0xff1f, /* 65311 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_XIO_UNUSED", .pme_desc = "Even XIO token was unused by RAG 1.", .pme_code = 0xff88, /* 65416 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_XIO_UNUSED", .pme_desc = "Odd XIO token was unused by RAG 1.", .pme_code = 0xff89, /* 65417 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_BANK_UNUSED", .pme_desc = "Even bank token was unused by RAG 1.", .pme_code = 0xff8a, /* 65418 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_BANK_UNUSED", .pme_desc = "Odd bank token was unused by RAG 1.", .pme_code = 0xff8b, /* 65419 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_IOC0", .pme_desc = "Token was granted for IOC0.", .pme_code = 0xff91, /* 65425 */ .pme_enable_word 
= WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_IOC1", .pme_desc = "Token was granted for IOC1.", .pme_code = 0xff92, /* 65426 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_XIO_WASTED", .pme_desc = "Even XIO token was wasted by RAG 1. This is valid only when UE = 1 in TKM_CR.", .pme_code = 0xffec, /* 65516 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_XIO_WASTED", .pme_desc = "Odd XIO token was wasted by RAG 1. This is valid only when UE = 1 in TKM_CR.", .pme_code = 0xffed, /* 65517 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_BANK_WASTED", .pme_desc = "Even bank token was wasted by RAG 1. This is valid only when UE = 1 in TKM_CR.", .pme_code = 0xffee, /* 65518 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_BANK_WASTED", .pme_desc = "Odd bank token was wasted by RAG 1. This is valid only when UE = 1 in TKM_CR.",
.pme_code = 0xffef, /* 65519 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_XIO_RAG0", .pme_desc = "Even XIO token from RAG 1 shared with RAG 0", .pme_code = 0x10050, /* 65616 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_XIO_RAG2", .pme_desc = "Even XIO token from RAG 1 shared with RAG 2", .pme_code = 0x10051, /* 65617 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_XIO_RAG3", .pme_desc = "Even XIO token from RAG 1 shared with RAG 3", .pme_code = 0x10052, /* 65618 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_XIO_RAG0", .pme_desc = "Odd XIO token from RAG 1 shared with RAG 0", .pme_code = 0x10053, /* 65619 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_XIO_RAG2", .pme_desc = "Odd XIO token from RAG 1 shared with RAG 2", .pme_code = 0x10054, /* 65620 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_XIO_RAG3", .pme_desc = "Odd XIO token from RAG 1 shared with RAG 3", .pme_code = 0x10055, /* 65621 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_BANK_RAG0", .pme_desc = "Even bank token from RAG 1 shared with RAG 0", .pme_code = 0x10056, /* 65622 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_BANK_RAG2", .pme_desc = "Even bank token from RAG 1 shared with RAG 2", .pme_code = 0x10057, /* 65623 */ .pme_enable_word =
WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_BANK_RAG3", .pme_desc = "Even bank token from RAG 1 shared with RAG 3", .pme_code = 0x10058, /* 65624 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_BANK_RAG0", .pme_desc = "Odd bank token from RAG 1 shared with RAG 0", .pme_code = 0x10059, /* 65625 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_BANK_RAG2", .pme_desc = "Odd bank token from RAG 1 shared with RAG 2", .pme_code = 0x1005a, /* 65626 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_BANK_RAG3", .pme_desc = "Odd bank token from RAG 1 shared with RAG 3", .pme_code = 0x1005b, /* 65627 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAGU_E_XIO_RAG1", .pme_desc = "Even XIO token from RAG U shared with RAG 1", .pme_code = 0x1005c, /* 65628 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAGU_O_XIO_RAG1", .pme_desc = "Odd XIO token from RAG U shared with RAG 1", .pme_code = 0x1005d, /* 65629 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAGU_E_BANK_RAG1", .pme_desc = "Even bank token from RAG U shared with RAG 1", .pme_code = 0x1005e, /* 65630 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAGU_O_BANK_RAG1", .pme_desc = "Odd bank token from RAG U shared with RAG 1", .pme_code = 0x1005f, /* 65631 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, 
}, {.pme_name = "EIB_RAG2_E_XIO_UNUSED", .pme_desc = "Even XIO token unused by RAG 2", .pme_code = 0x100e4, /* 65764 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_XIO_UNUSED", .pme_desc = "Odd XIO token unused by RAG 2", .pme_code = 0x100e5, /* 65765 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_BANK_UNUSED", .pme_desc = "Even bank token unused by RAG 2", .pme_code = 0x100e6, /* 65766 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_BANK_UNUSED", .pme_desc = "Odd bank token unused by RAG 2", .pme_code = 0x100e7, /* 65767 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_IOIF0_IN_TOKEN_UNUSED", .pme_desc = "IOIF0 In token unused by RAG 0", .pme_code = 0x100e8, /* 65768 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_IOIF0_OUT_TOKEN_UNUSED", .pme_desc = "IOIF0 Out token unused by RAG 0", .pme_code = 0x100e9, /* 65769 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_IOIF1_IN_TOKEN_UNUSED", .pme_desc = "IOIF1 In token unused by RAG 0", .pme_code = 0x100ea, /* 65770 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_IOIF1_OUT_TOKEN_UNUSED", .pme_desc = "IOIF1 Out token unused by RAG 0", .pme_code = 0x100eb, /* 65771 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_XIO_WASTED", .pme_desc = "Even XIO token wasted by RAG 2", .pme_code = 0x10148, /* 65864 */ .pme_enable_word = 
WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_XIO_WASTED", .pme_desc = "Odd XIO token wasted by RAG 2", .pme_code = 0x10149, /* 65865 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_BANK_WASTED", .pme_desc = "Even bank token wasted by RAG 2", .pme_code = 0x1014a, /* 65866 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_BANK_WASTED", .pme_desc = "Odd bank token wasted by RAG 2", .pme_code = 0x1014b, /* 65867 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_XIO_RAG0", .pme_desc = "Even XIO token from RAG 2 shared with RAG 0", .pme_code = 0x101ac, /* 65964 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_XIO_RAG1", .pme_desc = "Even XIO token from RAG 2 shared with RAG 1", .pme_code = 0x101ad, /* 65965 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_XIO_RAG3", .pme_desc = "Even XIO token from RAG 2 shared with RAG 3", .pme_code = 0x101ae, /* 65966 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_XIO_RAG0", .pme_desc = "Odd XIO token from RAG 2 shared with RAG 0", .pme_code = 0x101af, /* 65967 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_XIO_RAG1", .pme_desc = "Odd XIO token from RAG 2 shared with RAG 1", .pme_code = 0x101b0, /* 65968 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_XIO_RAG3", 
.pme_desc = "Odd XIO token from RAG 2 shared with RAG 3", .pme_code = 0x101b1, /* 65969 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_BANK_RAG0", .pme_desc = "Even bank token from RAG 2 shared with RAG 0", .pme_code = 0x101b2, /* 65970 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_BANK_RAG1", .pme_desc = "Even bank token from RAG 2 shared with RAG 1", .pme_code = 0x101b3, /* 65971 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_BANK_RAG3", .pme_desc = "Even bank token from RAG 2 shared with RAG 3", .pme_code = 0x101b4, /* 65972 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_BANK_RAG0", .pme_desc = "Odd bank token from RAG 2 shared with RAG 0", .pme_code = 0x101b5, /* 65973 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_BANK_RAG1", .pme_desc = "Odd bank token from RAG 2 shared with RAG 1", .pme_code = 0x101b6, /* 65974 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_BANK_RAG3", .pme_desc = "Odd bank token from RAG 2 shared with RAG 3", .pme_code = 0x101b7, /* 65975 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_IOIF0_IN_TOKEN_WASTED", .pme_desc = "IOIF0 In token wasted by RAG 0", .pme_code = 0x9ef38, /* 651064 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_IOIF0_OUT_TOKEN_WASTED", .pme_desc = "IOIF0 Out token wasted by RAG 0", .pme_code = 0x9ef39, /* 
651065 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_IOIF1_IN_TOKEN_WASTED", .pme_desc = "IOIF1 In token wasted by RAG 0", .pme_code = 0x9ef3a, /* 651066 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_IOIF1_OUT_TOKEN_WASTED", .pme_desc = "IOIF1 Out token wasted by RAG 0", .pme_code = 0x9ef3b, /* 651067 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_XIO_UNUSED", .pme_desc = "Even XIO token was unused by RAG 3.", .pme_code = 0x9efac, /* 651180 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_XIO_UNUSED", .pme_desc = "Odd XIO token was unused by RAG 3.", .pme_code = 0x9efad, /* 651181 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_BANK_UNUSED", .pme_desc = "Even bank token was unused by RAG 3.", .pme_code = 0x9efae, /* 651182 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_BANK_UNUSED", .pme_desc = "Odd bank token was unused by RAG 3.", .pme_code = 0x9efaf, /* 651183 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_XIO_WASTED", .pme_desc = "Even XIO token wasted by RAG 3", .pme_code = 0x9f010, /* 651280 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_XIO_WASTED", .pme_desc = "Odd XIO token wasted by RAG 3", .pme_code = 0x9f011, /* 651281 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = 
"EIB_RAG3_E_BANK_WASTED", .pme_desc = "Even bank token wasted by RAG 3", .pme_code = 0x9f012, /* 651282 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_BANK_WASTED", .pme_desc = "Odd bank token wasted by RAG 3", .pme_code = 0x9f013, /* 651283 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_XIO_RAG0", .pme_desc = "Even XIO token from RAG 3 shared with RAG 0", .pme_code = 0x9f074, /* 651380 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_XIO_RAG1", .pme_desc = "Even XIO token from RAG 3 shared with RAG 1", .pme_code = 0x9f075, /* 651381 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_XIO_RAG2", .pme_desc = "Even XIO token from RAG 3 shared with RAG 2", .pme_code = 0x9f076, /* 651382 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_XIO_RAG0", .pme_desc = "Odd XIO token from RAG 3 shared with RAG 0", .pme_code = 0x9f077, /* 651383 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_XIO_RAG1", .pme_desc = "Odd XIO token from RAG 3 shared with RAG 1", .pme_code = 0x9f078, /* 651384 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_XIO_RAG2", .pme_desc = "Odd XIO token from RAG 3 shared with RAG 2", .pme_code = 0x9f079, /* 651385 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_BANK_RAG0", .pme_desc = "Even bank token from RAG 3 shared with RAG 0", .pme_code = 0x9f07a, /* 
651386 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_BANK_RAG1", .pme_desc = "Even bank token from RAG 3 shared with RAG 1", .pme_code = 0x9f07b, /* 651387 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_BANK_RAG2", .pme_desc = "Even bank token from RAG 3 shared with RAG 2", .pme_code = 0x9f07c, /* 651388 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_BANK_RAG0", .pme_desc = "Odd bank token from RAG 3 shared with RAG 0", .pme_code = 0x9f07d, /* 651389 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_BANK_RAG1", .pme_desc = "Odd bank token from RAG 3 shared with RAG 1", .pme_code = 0x9f07e, /* 651390 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_BANK_RAG2", .pme_desc = "Odd bank token from RAG 3 shared with RAG 2", .pme_code = 0x9f07f, /* 651391 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_READ_CMD_Q_EMPTY", .pme_desc = "XIO1 - Read command queue is empty.", .pme_code = 0x1bc5, /* 7109 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_WRITE_CMD_Q_EMPTY", .pme_desc = "XIO1 - Write command queue is empty.", .pme_code = 0x1bc6, /* 7110 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_READ_CMD_Q_FULL", .pme_desc = "XIO1 - Read command queue is full.", .pme_code = 0x1bc8, /* 7112 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, 
{.pme_name = "MIC_XIO1_RESPONDS_READ_RETRY", .pme_desc = "XIO1 - MIC responds with a Retry for a read command because the read command queue is full.", .pme_code = 0x1bc9, /* 7113 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_WRITE_CMD_Q_FULL", .pme_desc = "XIO1 - Write command queue is full.", .pme_code = 0x1bca, /* 7114 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_RESPONDS_WRITE_RETRY", .pme_desc = "XIO1 - MIC responds with a Retry for a write command because the write command queue is full.", .pme_code = 0x1bcb, /* 7115 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_READ_CMD_DISPATCHED", .pme_desc = "XIO1 - Read command dispatched; includes high-priority and fast-path reads.", .pme_code = 0x1bde, /* 7134 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_WRITE_CMD_DISPATCHED", .pme_desc = "XIO1 - Write command dispatched.", .pme_code = 0x1bdf, /* 7135 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_READ_MOD_WRITE_CMD_DISPATCHED", .pme_desc = "XIO1 - Read-Modify-Write command (data size < 16 bytes) dispatched.", .pme_code = 0x1be0, /* 7136 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_REFRESH_DISPATCHED", .pme_desc = "XIO1 - Refresh dispatched.", .pme_code = 0x1be1, /* 7137 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_BYTE_MSK_WRITE_CMD_DISPATCHED", .pme_desc = "XIO1 - Byte-masking write command (data size >= 16 bytes) dispatched.", .pme_code = 0x1be3, /* 7139 */ .pme_enable_word = 0xF, .pme_freq = 
PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_WRITE_CMD_DISPATCHED_AFTER_READ", .pme_desc = "XIO1 - Write command dispatched after a read command was previously dispatched.", .pme_code = 0x1be5, /* 7141 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_READ_CMD_DISPATCHED_AFTER_WRITE", .pme_desc = "XIO1 - Read command dispatched after a write command was previously dispatched.", .pme_code = 0x1be6, /* 7142 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_READ_CMD_Q_EMPTY", .pme_desc = "XIO0 - Read command queue is empty.", .pme_code = 0x1c29, /* 7209 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_WRITE_CMD_Q_EMPTY", .pme_desc = "XIO0 - Write command queue is empty.", .pme_code = 0x1c2a, /* 7210 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_READ_CMD_Q_FULL", .pme_desc = "XIO0 - Read command queue is full.", .pme_code = 0x1c2c, /* 7212 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_RESPONDS_READ_RETRY", .pme_desc = "XIO0 - MIC responds with a Retry for a read command because the read command queue is full.", .pme_code = 0x1c2d, /* 7213 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_WRITE_CMD_Q_FULL", .pme_desc = "XIO0 - Write command queue is full.", .pme_code = 0x1c2e, /* 7214 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_RESPONDS_WRITE_RETRY", .pme_desc = "XIO0 - MIC responds with a Retry for a write command because the write command queue is full.", .pme_code = 0x1c2f, 
/* 7215 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_READ_CMD_DISPATCHED", .pme_desc = "XIO0 - Read command dispatched; includes high-priority and fast-path reads.", .pme_code = 0x1c42, /* 7234 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_WRITE_CMD_DISPATCHED", .pme_desc = "XIO0 - Write command dispatched.", .pme_code = 0x1c43, /* 7235 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_READ_MOD_WRITE_CMD_DISPATCHED", .pme_desc = "XIO0 - Read-Modify-Write command (data size < 16 bytes) dispatched.", .pme_code = 0x1c44, /* 7236 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_REFRESH_DISPATCHED", .pme_desc = "XIO0 - Refresh dispatched.", .pme_code = 0x1c45, /* 7237 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_WRITE_CMD_DISPATCHED_AFTER_READ", .pme_desc = "XIO0 - Write command dispatched after a read command was previously dispatched.", .pme_code = 0x1c49, /* 7241 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_READ_CMD_DISPATCHED_AFTER_WRITE", .pme_desc = "XIO0 - Read command dispatched after a write command was previously dispatched.", .pme_code = 0x1c4a, /* 7242 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_WRITE_CMD_DISPATCHED_2", .pme_desc = "XIO0 - Write command dispatched.", .pme_code = 0x1ca7, /* 7335 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_READ_MOD_WRITE_CMD_DISPATCHED_2", .pme_desc = "XIO0 - Read-Modify-Write 
command (data size < 16 bytes) dispatched.", .pme_code = 0x1ca8, /* 7336 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_REFRESH_DISPATCHED_2", .pme_desc = "XIO0 - Refresh dispatched.", .pme_code = 0x1ca9, /* 7337 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_BYTE_MSK_WRITE_CMD_DISPATCHED", .pme_desc = "XIO0 - Byte-masking write command (data size >= 16 bytes) dispatched.", .pme_code = 0x1cab, /* 7339 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_TYPEA_DATA_PLG", .pme_desc = "Type A data physical layer group (PLG). Does not include header-only or credit-only data PLGs. In IOIF mode, counts I/O device read data; in BIF mode, counts all outbound data.", .pme_code = 0x1fb0, /* 8112 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_TYPEB_DATA_PLG", .pme_desc = "Type B data PLG. In IOIF mode, counts I/O device read data; in BIF mode, counts all outbound data.", .pme_code = 0x1fb1, /* 8113 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_IOIF_TYPEA_DATA_PLG", .pme_desc = "Type A data PLG. Does not include header-only or credit-only PLGs. In IOIF mode, counts CBE store data to I/O device. Does not apply in BIF mode.", .pme_code = 0x1fb2, /* 8114 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_IOIF_TYPEB_DATA_PLG", .pme_desc = "Type B data PLG. In IOIF mode, counts CBE store data to an I/O device. 
Does not apply in BIF mode.", .pme_code = 0x1fb3, /* 8115 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_DATA_PLG", .pme_desc = "Data PLG. Does not include header-only or credit-only PLGs.", .pme_code = 0x1fb4, /* 8116 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_CMD_PLG", .pme_desc = "Command PLG (no credit-only PLG). In IOIF mode, counts I/O command or reply PLGs. In BIF mode, counts command/reflected command or snoop/combined responses.", .pme_code = 0x1fb5, /* 8117 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_TYPEA_TRANSFER", .pme_desc = "Type A data transfer regardless of length. Can also be used to count Type A data header PLGs (but not credit-only PLGs).", .pme_code = 0x1fb6, /* 8118 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_TYPEB_TRANSFER", .pme_desc = "Type B data transfer.", .pme_code = 0x1fb7, /* 8119 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_CMD_GREDIT_ONLY_PLG", .pme_desc = "Command-credit-only command PLG in either IOIF or BIF mode.", .pme_code = 0x1fb8, /* 8120 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_DATA_CREDIT_ONLY_PLG", .pme_desc = "Data-credit-only data PLG sent in either IOIF or BIF mode.", .pme_code = 0x1fb9, /* 8121 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_NON_NULL_ENVLP_SENT", .pme_desc = "Non-null envelope sent (does not include long envelopes).", .pme_code = 0x1fba, /* 8122 */
.pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_NULL_ENVLP_SENT", .pme_desc = "Null envelope sent.", .pme_code = 0x1fbc, /* 8124 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF0_NO_VALID_DATA_SENT", .pme_desc = "No valid data sent this cycle.", .pme_code = 0x1fbd, /* 8125 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF0_NORMAL_ENVLP_SENT", .pme_desc = "Normal envelope sent.", .pme_code = 0x1fbe, /* 8126 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF0_LONG_ENVLP_SENT", .pme_desc = "Long envelope sent.", .pme_code = 0x1fbf, /* 8127 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF0_NULL_PLG_INSERTED", .pme_desc = "A Null PLG inserted in an outgoing envelope.", .pme_code = 0x1fc0, /* 8128 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF0_OUTBOUND_ENV_ARRAY_FULL", .pme_desc = "Outbound envelope array is full.", .pme_code = 0x1fc1, /* 8129 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_TYPEB_TRANSFER", .pme_desc = "Type B data transfer.", .pme_code = 0x201b, /* 8219 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_NULL_ENVLP_RECV", .pme_desc = "Null envelope received.", .pme_code = 0x206d, /* 8301 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF0_CMD_PLG_2", .pme_desc = "Command PLG, but not credit-only PLG. 
In IOIF mode, counts I/O command or reply PLGs. In BIF mode, counts command/reflected command or snoop/combined responses.", .pme_code = 0x207a, /* 8314 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_CMD_GREDIT_ONLY_PLG_2", .pme_desc = "Command-credit-only command PLG.", .pme_code = 0x207b, /* 8315 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_NORMAL_ENVLP_RECV", .pme_desc = "Normal envelope received is good.", .pme_code = 0x2080, /* 8320 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF0_LONG_ENVLP_RECV", .pme_desc = "Long envelope received is good.", .pme_code = 0x2081, /* 8321 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF0_DATA_GREDIT_ONLY_PLG_2", .pme_desc = "Data-credit-only data PLG in either IOIF or BIF mode; will count a maximum of one per envelope.", .pme_code = 0x2082, /* 8322 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_NON_NULL_ENVLP", .pme_desc = "Non-null envelope; does not include long envelopes; includes retried envelopes.", .pme_code = 0x2083, /* 8323 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_DATA_GRANT_RECV", .pme_desc = "Data grant received.", .pme_code = 0x2084, /* 8324 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_DATA_PLG_2", .pme_desc = "Data PLG. 
Does not include header-only or credit-only PLGs.", .pme_code = 0x2088, /* 8328 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_TYPEA_TRANSFER_2", .pme_desc = "Type A data transfer regardless of length. Can also be used to count Type A data header PLGs, but not credit-only PLGs.", .pme_code = 0x2089, /* 8329 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_TYPEB_TRANSFER_2", .pme_desc = "Type B data transfer.", .pme_code = 0x208a, /* 8330 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_NULL_ENVLP_RECV", .pme_desc = "Null envelope received.", .pme_code = 0x20d1, /* 8401 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF1_CMD_PLG_2", .pme_desc = "Command PLG (no credit-only PLG). 
Counts I/O command or reply PLGs.", .pme_code = 0x20de, /* 8414 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_CMD_GREDIT_ONLY_PLG_2", .pme_desc = "Command-credit-only command PLG.", .pme_code = 0x20df, /* 8415 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_NORMAL_ENVLP_RECV", .pme_desc = "Normal envelope received is good.", .pme_code = 0x20e4, /* 8420 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF1_LONG_ENVLP_RECV", .pme_desc = "Long envelope received is good.", .pme_code = 0x20e5, /* 8421 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF1_DATA_GREDIT_ONLY_PLG_2", .pme_desc = "Data-credit-only data PLG received; will count a maximum of one per envelope.", .pme_code = 0x20e6, /* 8422 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_NON_NULL_ENVLP", .pme_desc = "Non-Null envelope received; does not include long envelopes; includes retried envelopes.", .pme_code = 0x20e7, /* 8423 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_DATA_GRANT_RECV", .pme_desc = "Data grant received.", .pme_code = 0x20e8, /* 8424 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_DATA_PLG_2", .pme_desc = "Data PLG received. 
Does not include header-only or credit-only PLGs.", .pme_code = 0x20ec, /* 8428 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_TYPEA_TRANSFER_2", .pme_desc = "Type A data transfer regardless of length. Can also be used to count Type A data header PLGs (but not credit-only PLGs).", .pme_code = 0x20ed, /* 8429 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_TYPEB_TRANSFER_2", .pme_desc = "Type B data transfer received.", .pme_code = 0x20ee, /* 8430 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_MMIO_READ_IOIF1", .pme_desc = "Received MMIO read targeted to IOIF1.", .pme_code = 0x213c, /* 8508 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_MMIO_WRITE_IOIF1", .pme_desc = "Received MMIO write targeted to IOIF1.", .pme_code = 0x213d, /* 8509 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_MMIO_READ_IOIF0", .pme_desc = "Received MMIO read targeted to IOIF0.", .pme_code = 0x213e, /* 8510 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_MMIO_WRITE_IOIF0", .pme_desc = "Received MMIO write targeted to IOIF0.", .pme_code = 0x213f, /* 8511 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_CMD_TO_IOIF0", .pme_desc = "Sent command to IOIF0.", .pme_code = 0x2140, /* 8512 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_CMD_TO_IOIF1", .pme_desc = "Sent command to IOIF1.", .pme_code = 0x2141, /* 8513 */ .pme_enable_word =
WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOIF0_MATRIX3_OCCUPIED", .pme_desc = "IOIF0 Dependency Matrix 3 is occupied by a dependent command.", .pme_code = 0x219d, /* 8605 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "IOC_IOIF0_MATRIX4_OCCUPIED", .pme_desc = "IOIF0 Dependency Matrix 4 is occupied by a dependent command.", .pme_code = 0x219e, /* 8606 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "IOC_IOIF0_MATRIX5_OCCUPIED", .pme_desc = "IOIF0 Dependency Matrix 5 is occupied by a dependent command.", .pme_code = 0x219f, /* 8607 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "IOC_DMA_READ_IOIF0", .pme_desc = "Received read request from IOIF0.", .pme_code = 0x21a2, /* 8610 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_DMA_WRITE_IOIF0", .pme_desc = "Received write request from IOIF0.", .pme_code = 0x21a3, /* 8611 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_INTERRUPT_IOIF0", .pme_desc = "Received interrupt from the IOIF0.", .pme_code = 0x21a6, /* 8614 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_IOIF0_REQ_TOKEN_E_MEM", .pme_desc = "IOIF0 request for token for even memory banks 0-14.", .pme_code = 0x220c, /* 8716 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOIF0_REQ_TOKEN_O_MEM", .pme_desc = "IOIF0 request for token for odd memory banks 1-15.", .pme_code = 0x220d, /* 8717 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, 
.pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOIF0_REQ_TOKEN_1357", .pme_desc = "IOIF0 request for token type 1, 3, 5, or 7.", .pme_code = 0x220e, /* 8718 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOIF0_REQ_TOKEN_9111315", .pme_desc = "IOIF0 request for token type 9, 11, 13, or 15.", .pme_code = 0x220f, /* 8719 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOIF0_REQ_TOKEN_16", .pme_desc = "IOIF0 request for token type 16.", .pme_code = 0x2214, /* 8724 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOIF0_REQ_TOKEN_17", .pme_desc = "IOIF0 request for token type 17.", .pme_code = 0x2215, /* 8725 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOIF0_REQ_TOKEN_18", .pme_desc = "IOIF0 request for token type 18.", .pme_code = 0x2216, /* 8726 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOIF0_REQ_TOKEN_19", .pme_desc = "IOIF0 request for token type 19.", .pme_code = 0x2217, /* 8727 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOPT_CACHE_HIT", .pme_desc = "I/O page table cache hit for commands from IOIF.", .pme_code = 0x2260, /* 8800 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_IOPT_CACHE_MISS", .pme_desc = "I/O page table cache miss for commands from IOIF.", .pme_code = 0x2261, /* 8801 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_IOST_CACHE_HIT", .pme_desc = "I/O segment table cache 
hit.", .pme_code = 0x2263, /* 8803 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_IOST_CACHE_MISS", .pme_desc = "I/O segment table cache miss.", .pme_code = 0x2264, /* 8804 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_INTERRUPT_FROM_SPU", .pme_desc = "Interrupt received from any SPU (reflected cmd when IIC has sent ACK response).", .pme_code = 0x2278, /* 8824 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_IIC_INTERRUPT_TO_PPU_TH0", .pme_desc = "Internal interrupt controller (IIC) generated interrupt to PPU thread 0.", .pme_code = 0x2279, /* 8825 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_IIC_INTERRUPT_TO_PPU_TH1", .pme_desc = "IIC generated interrupt to PPU thread 1.", .pme_code = 0x227a, /* 8826 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_RECV_EXTERNAL_INTERRUPT_TO_TH0", .pme_desc = "Received external interrupt (using MMIO) from PPU to PPU thread 0.", .pme_code = 0x227b, /* 8827 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_RECV_EXTERNAL_INTERRUPT_TO_TH1", .pme_desc = "Received external interrupt (using MMIO) from PPU to PPU thread 1.", .pme_code = 0x227c, /* 8828 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, }; /*--- The number of events : 435 ---*/ #define PME_CELL_EVENT_COUNT (sizeof(cell_pe)/sizeof(pme_cell_entry_t))

papi-papi-7-2-0-t/src/libperfnec/lib/core_events.h

/* * Copyright (c) 2006 Hewlett-Packard Development Company, L.P.
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
*/ #define INTEL_CORE_MESI_UMASKS \ { .pme_uname = "MESI",\ .pme_udesc = "Any cacheline access (default)",\ .pme_ucode = 0xf\ },\ { .pme_uname = "I_STATE",\ .pme_udesc = "Invalid cacheline",\ .pme_ucode = 0x1\ },\ { .pme_uname = "S_STATE",\ .pme_udesc = "Shared cacheline",\ .pme_ucode = 0x2\ },\ { .pme_uname = "E_STATE",\ .pme_udesc = "Exclusive cacheline",\ .pme_ucode = 0x4\ },\ { .pme_uname = "M_STATE",\ .pme_udesc = "Modified cacheline",\ .pme_ucode = 0x8\ } #define INTEL_CORE_SPECIFICITY_UMASKS \ { .pme_uname = "SELF",\ .pme_udesc = "This core",\ .pme_ucode = 0x40\ },\ { .pme_uname = "BOTH_CORES",\ .pme_udesc = "Both cores",\ .pme_ucode = 0xc0\ } #define INTEL_CORE_HW_PREFETCH_UMASKS \ { .pme_uname = "ANY",\ .pme_udesc = "All inclusive",\ .pme_ucode = 0x30\ },\ { .pme_uname = "PREFETCH",\ .pme_udesc = "Hardware prefetch only",\ .pme_ucode = 0x10\ } #define INTEL_CORE_AGENT_UMASKS \ { .pme_uname = "THIS_AGENT",\ .pme_udesc = "This agent",\ .pme_ucode = 0x00\ },\ { .pme_uname = "ALL_AGENTS",\ .pme_udesc = "Any agent on the bus",\ .pme_ucode = 0x20\ } static pme_core_entry_t core_pe[]={ /* * BEGIN: architected Core events */ {.pme_name = "UNHALTED_CORE_CYCLES", .pme_code = 0x003c, .pme_flags = PFMLIB_CORE_FIXED1, .pme_desc = "count core clock cycles whenever the clock signal on the specific core is running (not halted). Alias to event CPU_CLK_UNHALTED:CORE_P" }, {.pme_name = "INSTRUCTIONS_RETIRED", .pme_code = 0x00c0, .pme_flags = PFMLIB_CORE_FIXED0, .pme_desc = "count the number of instructions at retirement. Alias to event INST_RETIRED:ANY_P", }, {.pme_name = "UNHALTED_REFERENCE_CYCLES", .pme_code = 0x013c, .pme_flags = PFMLIB_CORE_FIXED2_ONLY, .pme_desc = "Unhalted reference cycles. Alias to event CPU_CLK_UNHALTED:REF", }, {.pme_name = "LAST_LEVEL_CACHE_REFERENCES", .pme_code = 0x4f2e, .pme_desc = "count each request originating from the core to reference a cache line in the last level cache. 
The count may include speculation, but excludes cache line fills due to hardware prefetch. Alias to L2_RQSTS:SELF_DEMAND_MESI", }, {.pme_name = "LAST_LEVEL_CACHE_MISSES", .pme_code = 0x412e, .pme_desc = "count each cache miss condition for references to the last level cache. The event count may include speculation, but excludes cache line fills due to hardware prefetch. Alias to event L2_RQSTS:SELF_DEMAND_I_STATE", }, {.pme_name = "BRANCH_INSTRUCTIONS_RETIRED", .pme_code = 0x00c4, .pme_desc = "count branch instructions at retirement. Specifically, this event counts the retirement of the last micro-op of a branch instruction. Alias to event BR_INST_RETIRED:ANY", }, {.pme_name = "MISPREDICTED_BRANCH_RETIRED", .pme_code = 0x00c5, .pme_desc = "count mispredicted branch instructions at retirement. Specifically, this event counts at retirement the last micro-op of a branch instruction that was in the architectural path of execution and experienced misprediction in the branch prediction hardware.
Alias to BR_INST_RETIRED:MISPRED", }, /* * END: architected events */ /* * BEGIN: Core 2 Duo events */ { .pme_name = "RS_UOPS_DISPATCHED_CYCLES", .pme_code = 0xa1, .pme_flags = PFMLIB_CORE_PMC0, .pme_desc = "Cycles micro-ops dispatched for execution", .pme_umasks = { { .pme_uname = "PORT_0", .pme_udesc = "on port 0", .pme_ucode = 0x1 }, { .pme_uname = "PORT_1", .pme_udesc = "on port 1", .pme_ucode = 0x2 }, { .pme_uname = "PORT_2", .pme_udesc = "on port 2", .pme_ucode = 0x4 }, { .pme_uname = "PORT_3", .pme_udesc = "on port 3", .pme_ucode = 0x8 }, { .pme_uname = "PORT_4", .pme_udesc = "on port 4", .pme_ucode = 0x10 }, { .pme_uname = "PORT_5", .pme_udesc = "on port 5", .pme_ucode = 0x20 }, { .pme_uname = "ANY", .pme_udesc = "on any port", .pme_ucode = 0x3f }, }, .pme_numasks = 7 }, { .pme_name = "RS_UOPS_DISPATCHED", .pme_code = 0xa0, .pme_desc = "Number of micro-ops dispatched for execution", }, { .pme_name = "RS_UOPS_DISPATCHED_NONE", .pme_code = 0xa0 | (1 << 23 | 1 << 24), .pme_desc = "Number of cycles in which no micro-ops are dispatched for execution", }, { .pme_name = "LOAD_BLOCK", .pme_code = 0x3, .pme_flags = 0, .pme_desc = "Loads blocked", .pme_umasks = { { .pme_uname = "STA", .pme_udesc = "Loads blocked by a preceding store with unknown address", .pme_ucode = 0x2 }, { .pme_uname = "STD", .pme_udesc = "Loads blocked by a preceding store with unknown data", .pme_ucode = 0x4 }, { .pme_uname = "OVERLAP_STORE", .pme_udesc = "Loads that partially overlap an earlier store, or 4K aliased with a previous store", .pme_ucode = 0x8 }, { .pme_uname = "UNTIL_RETIRE", .pme_udesc = "Loads blocked until retirement", .pme_ucode = 0x10 }, { .pme_uname = "L1D", .pme_udesc = "Loads blocked by the L1 data cache", .pme_ucode = 0x20 } }, .pme_numasks = 5 }, { .pme_name = "SB_DRAIN_CYCLES", .pme_code = 0x104, .pme_flags = 0, .pme_desc = "Cycles while stores are blocked due to store buffer drain" }, { .pme_name = "STORE_BLOCK", .pme_code = 0x4, .pme_flags = 0, .pme_desc = "Cycles
while store is waiting", .pme_umasks = { { .pme_uname = "ORDER", .pme_udesc = "Cycles while store is waiting for a preceding store to be globally observed", .pme_ucode = 0x2 }, { .pme_uname = "SNOOP", .pme_udesc = "A store is blocked due to a conflict with an external or internal snoop", .pme_ucode = 0x8 } }, .pme_numasks = 2 }, { .pme_name = "SEGMENT_REG_LOADS", .pme_code = 0x6, .pme_flags = 0, .pme_desc = "Number of segment register loads" }, { .pme_name = "SSE_PRE_EXEC", .pme_code = 0x7, .pme_flags = 0, .pme_desc = "Streaming SIMD Extensions (SSE) Prefetch instructions executed", .pme_umasks = { { .pme_uname = "NTA", .pme_udesc = "Streaming SIMD Extensions (SSE) Prefetch NTA instructions executed", .pme_ucode = 0x0 }, { .pme_uname = "L1", .pme_udesc = "Streaming SIMD Extensions (SSE) PrefetchT0 instructions executed", .pme_ucode = 0x1 }, { .pme_uname = "L2", .pme_udesc = "Streaming SIMD Extensions (SSE) PrefetchT1 and PrefetchT2 instructions executed", .pme_ucode = 0x2 }, { .pme_uname = "STORES", .pme_udesc = "Streaming SIMD Extensions (SSE) Weakly-ordered store instructions executed", .pme_ucode = 0x3 } }, .pme_numasks = 4 }, { .pme_name = "DTLB_MISSES", .pme_code = 0x8, .pme_flags = 0, .pme_desc = "Memory accesses that missed the DTLB", .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "Any memory access that missed the DTLB", .pme_ucode = 0x1 }, { .pme_uname = "MISS_LD", .pme_udesc = "DTLB misses due to load operations", .pme_ucode = 0x2 }, { .pme_uname = "L0_MISS_LD", .pme_udesc = "L0 DTLB misses due to load operations", .pme_ucode = 0x4 }, { .pme_uname = "MISS_ST", .pme_udesc = "DTLB misses due to store operations", .pme_ucode = 0x8 } }, .pme_numasks = 4 }, { .pme_name = "MEMORY_DISAMBIGUATION", .pme_code = 0x9, .pme_flags = 0, .pme_desc = "Memory disambiguation", .pme_umasks = { { .pme_uname = "RESET", .pme_udesc = "Memory disambiguation reset cycles", .pme_ucode = 0x1 }, { .pme_uname = "SUCCESS", .pme_udesc = "Number of loads that were successfully 
disambiguated", .pme_ucode = 0x2 } }, .pme_numasks = 2 }, { .pme_name = "PAGE_WALKS", .pme_code = 0xc, .pme_flags = 0, .pme_desc = "Number of page-walks executed", .pme_umasks = { { .pme_uname = "COUNT", .pme_udesc = "Number of page-walks executed", .pme_ucode = 0x1 }, { .pme_uname = "CYCLES", .pme_udesc = "Duration of page-walks in core cycles", .pme_ucode = 0x2 } }, .pme_numasks = 2 }, { .pme_name = "FP_COMP_OPS_EXE", .pme_code = 0x10, .pme_flags = PFMLIB_CORE_PMC0, .pme_desc = "Floating point computational micro-ops executed" }, { .pme_name = "FP_ASSIST", .pme_code = 0x11, .pme_flags = PFMLIB_CORE_PMC1, .pme_desc = "Floating point assists" }, { .pme_name = "MUL", .pme_code = 0x12, .pme_flags = PFMLIB_CORE_PMC1, .pme_desc = "Multiply operations executed" }, { .pme_name = "DIV", .pme_code = 0x13, .pme_flags = PFMLIB_CORE_PMC1, .pme_desc = "Divide operations executed" }, { .pme_name = "CYCLES_DIV_BUSY", .pme_code = 0x14, .pme_flags = PFMLIB_CORE_PMC0, .pme_desc = "Cycles the divider is busy" }, { .pme_name = "IDLE_DURING_DIV", .pme_code = 0x18, .pme_flags = PFMLIB_CORE_PMC0, .pme_desc = "Cycles the divider is busy and all other execution units are idle" }, { .pme_name = "DELAYED_BYPASS", .pme_code = 0x19, .pme_flags = PFMLIB_CORE_PMC1, .pme_desc = "Delayed bypass", .pme_umasks = { { .pme_uname = "FP", .pme_udesc = "Delayed bypass to FP operation", .pme_ucode = 0x0 }, { .pme_uname = "SIMD", .pme_udesc = "Delayed bypass to SIMD operation", .pme_ucode = 0x1 }, { .pme_uname = "LOAD", .pme_udesc = "Delayed bypass to load operation", .pme_ucode = 0x2 } }, .pme_numasks = 3 }, { .pme_name = "L2_ADS", .pme_code = 0x21, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Cycles L2 address bus is in use", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS }, .pme_numasks = 2 }, { .pme_name = "L2_DBUS_BUSY_RD", .pme_code = 0x23, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Cycles the L2 transfers data to the core", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS }, .pme_numasks = 2 }, { 
.pme_name = "L2_LINES_IN", .pme_code = 0x24, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "L2 cache misses", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_HW_PREFETCH_UMASKS }, .pme_numasks = 4 }, { .pme_name = "L2_M_LINES_IN", .pme_code = 0x25, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "L2 cache line modifications", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS }, .pme_numasks = 2 }, { .pme_name = "L2_LINES_OUT", .pme_code = 0x26, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "L2 cache lines evicted", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_HW_PREFETCH_UMASKS }, .pme_numasks = 4 }, { .pme_name = "L2_M_LINES_OUT", .pme_code = 0x27, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Modified lines evicted from the L2 cache", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_HW_PREFETCH_UMASKS }, .pme_numasks = 4 }, { .pme_name = "L2_IFETCH", .pme_code = 0x28, .pme_flags = PFMLIB_CORE_CSPEC|PFMLIB_CORE_MESI, .pme_desc = "L2 cacheable instruction fetch requests", .pme_umasks = { INTEL_CORE_MESI_UMASKS, INTEL_CORE_SPECIFICITY_UMASKS }, .pme_numasks = 7 }, { .pme_name = "L2_LD", .pme_code = 0x29, .pme_flags = PFMLIB_CORE_CSPEC|PFMLIB_CORE_MESI, .pme_desc = "L2 cache reads", .pme_umasks = { INTEL_CORE_MESI_UMASKS, INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_HW_PREFETCH_UMASKS }, .pme_numasks = 9 }, { .pme_name = "L2_ST", .pme_code = 0x2a, .pme_flags = PFMLIB_CORE_CSPEC|PFMLIB_CORE_MESI, .pme_desc = "L2 store requests", .pme_umasks = { INTEL_CORE_MESI_UMASKS, INTEL_CORE_SPECIFICITY_UMASKS }, .pme_numasks = 7 }, { .pme_name = "L2_LOCK", .pme_code = 0x2b, .pme_flags = PFMLIB_CORE_CSPEC|PFMLIB_CORE_MESI, .pme_desc = "L2 locked accesses", .pme_umasks = { INTEL_CORE_MESI_UMASKS, INTEL_CORE_SPECIFICITY_UMASKS }, .pme_numasks = 7 }, { .pme_name = "L2_RQSTS", .pme_code = 0x2e, .pme_flags = PFMLIB_CORE_CSPEC|PFMLIB_CORE_MESI, .pme_desc = "L2 cache requests", .pme_umasks = { INTEL_CORE_MESI_UMASKS, INTEL_CORE_SPECIFICITY_UMASKS, 
INTEL_CORE_HW_PREFETCH_UMASKS }, .pme_numasks = 9 }, { .pme_name = "L2_REJECT_BUSQ", .pme_code = 0x30, .pme_flags = PFMLIB_CORE_CSPEC|PFMLIB_CORE_MESI, .pme_desc = "Rejected L2 cache requests", .pme_umasks = { INTEL_CORE_MESI_UMASKS, INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_HW_PREFETCH_UMASKS }, .pme_numasks = 9 }, { .pme_name = "L2_NO_REQ", .pme_code = 0x32, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Cycles no L2 cache requests are pending", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS }, .pme_numasks = 2 }, { .pme_name = "EIST_TRANS", .pme_code = 0x3a, .pme_flags = 0, .pme_desc = "Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions" }, { .pme_name = "THERMAL_TRIP", .pme_code = 0xc03b, .pme_flags = 0, .pme_desc = "Number of thermal trips" }, { .pme_name = "CPU_CLK_UNHALTED", .pme_code = 0x3c, .pme_flags = PFMLIB_CORE_UMASK_NCOMBO, .pme_desc = "Core cycles when core is not halted", .pme_umasks = { { .pme_uname = "CORE_P", .pme_udesc = "Core cycles when core is not halted", .pme_ucode = 0x0, }, { .pme_uname = "REF", .pme_udesc = "Reference cycles. This event is not affected by core changes such as P-states or TM2 transitions but counts at the same frequency as the time stamp counter. This event can approximate elapsed time. This event has a constant ratio with the CPU_CLK_UNHALTED:BUS event", .pme_ucode = 0x1, .pme_flags = PFMLIB_CORE_FIXED2_ONLY /* Can only be measured on FIXED_CTR2 */ }, { .pme_uname = "BUS", .pme_udesc = "Bus cycles when core is not halted. This event can give a measurement of the elapsed time. 
This event has a constant ratio with the CPU_CLK_UNHALTED:REF event, which is the maximum bus to processor frequency ratio", .pme_ucode = 0x1, }, { .pme_uname = "NO_OTHER", .pme_udesc = "Bus cycles when core is active and the other is halted", .pme_ucode = 0x2 } }, .pme_numasks = 4 }, { .pme_name = "L1D_CACHE_LD", .pme_code = 0x40, .pme_flags = PFMLIB_CORE_MESI, .pme_desc = "L1 cacheable data reads", .pme_umasks = { INTEL_CORE_MESI_UMASKS }, .pme_numasks = 5 }, { .pme_name = "L1D_CACHE_ST", .pme_code = 0x41, .pme_flags = PFMLIB_CORE_MESI, .pme_desc = "L1 cacheable data writes", .pme_umasks = { INTEL_CORE_MESI_UMASKS }, .pme_numasks = 5 }, { .pme_name = "L1D_CACHE_LOCK", .pme_code = 0x42, .pme_flags = PFMLIB_CORE_MESI, .pme_desc = "L1 data cacheable locked reads", .pme_umasks = { INTEL_CORE_MESI_UMASKS }, .pme_numasks = 5 }, { .pme_name = "L1D_ALL_REF", .pme_code = 0x143, .pme_flags = 0, .pme_desc = "All references to the L1 data cache" }, { .pme_name = "L1D_ALL_CACHE_REF", .pme_code = 0x243, .pme_flags = 0, .pme_desc = "L1 Data cacheable reads and writes" }, { .pme_name = "L1D_REPL", .pme_code = 0xf45, .pme_flags = 0, .pme_desc = "Cache lines allocated in the L1 data cache" }, { .pme_name = "L1D_M_REPL", .pme_code = 0x46, .pme_flags = 0, .pme_desc = "Modified cache lines allocated in the L1 data cache" }, { .pme_name = "L1D_M_EVICT", .pme_code = 0x47, .pme_flags = 0, .pme_desc = "Modified cache lines evicted from the L1 data cache" }, { .pme_name = "L1D_PEND_MISS", .pme_code = 0x48, .pme_flags = 0, .pme_desc = "Total number of outstanding L1 data cache misses at any cycle" }, { .pme_name = "L1D_SPLIT", .pme_code = 0x49, .pme_flags = 0, .pme_desc = "Cache line split from L1 data cache", .pme_umasks = { { .pme_uname = "LOADS", .pme_udesc = "Cache line split loads from the L1 data cache", .pme_ucode = 0x1 }, { .pme_uname = "STORES", .pme_udesc = "Cache line split stores to the L1 data cache", .pme_ucode = 0x2 } }, .pme_numasks = 2 }, { .pme_name = "SSE_PRE_MISS", 
.pme_code = 0x4b, .pme_flags = 0, .pme_desc = "Streaming SIMD Extensions (SSE) instructions missing all cache levels", .pme_umasks = { { .pme_uname = "NTA", .pme_udesc = "Streaming SIMD Extensions (SSE) Prefetch NTA instructions missing all cache levels", .pme_ucode = 0x0 }, { .pme_uname = "L1", .pme_udesc = "Streaming SIMD Extensions (SSE) PrefetchT0 instructions missing all cache levels", .pme_ucode = 0x1 }, { .pme_uname = "L2", .pme_udesc = "Streaming SIMD Extensions (SSE) PrefetchT1 and PrefetchT2 instructions missing all cache levels", .pme_ucode = 0x2 }, }, .pme_numasks = 3 }, { .pme_name = "LOAD_HIT_PRE", .pme_code = 0x4c, .pme_flags = 0, .pme_desc = "Load operations conflicting with a software prefetch to the same address" }, { .pme_name = "L1D_PREFETCH", .pme_code = 0x4e, .pme_flags = 0, .pme_desc = "L1 data cache prefetch", .pme_umasks = { { .pme_uname = "REQUESTS", .pme_udesc = "L1 data cache prefetch requests", .pme_ucode = 0x10 } }, .pme_numasks = 1 }, { .pme_name = "BUS_REQUEST_OUTSTANDING", .pme_code = 0x60, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Number of pending full cache line read transactions on the bus occurring in each cycle", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_name = "BUS_BNR_DRV", .pme_code = 0x61, .pme_flags = 0, .pme_desc = "Number of Bus Not Ready signals asserted", .pme_umasks = { INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 2 }, { .pme_name = "BUS_DRDY_CLOCKS", .pme_code = 0x62, .pme_flags = 0, .pme_desc = "Bus cycles when data is sent on the bus", .pme_umasks = { INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 2 }, { .pme_name = "BUS_LOCK_CLOCKS", .pme_code = 0x63, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Bus cycles when a LOCK signal is asserted", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_name = "BUS_DATA_RCV", .pme_code = 0x64, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Bus cycles while processor receives 
data", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS }, .pme_numasks = 2 }, { .pme_name = "BUS_TRANS_BRD", .pme_code = 0x65, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Burst read bus transactions", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_name = "BUS_TRANS_RFO", .pme_code = 0x66, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "RFO bus transactions", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_name = "BUS_TRANS_WB", .pme_code = 0x67, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Explicit writeback bus transactions", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_name = "BUS_TRANS_IFETCH", .pme_code = 0x68, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Instruction-fetch bus transactions", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_name = "BUS_TRANS_INVAL", .pme_code = 0x69, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Invalidate bus transactions", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_name = "BUS_TRANS_PWR", .pme_code = 0x6a, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Partial write bus transaction", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_name = "BUS_TRANS_P", .pme_code = 0x6b, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Partial bus transactions", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_name = "BUS_TRANS_IO", .pme_code = 0x6c, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "IO bus transactions", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_name = "BUS_TRANS_DEF", .pme_code = 0x6d, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Deferred bus transactions", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, 
INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_name = "BUS_TRANS_BURST", .pme_code = 0x6e, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Burst (full cache-line) bus transactions", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_name = "BUS_TRANS_MEM", .pme_code = 0x6f, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Memory bus transactions", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_name = "BUS_TRANS_ANY", .pme_code = 0x70, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "All bus transactions", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_name = "EXT_SNOOP", .pme_code = 0x77, .pme_flags = 0, .pme_desc = "External snoops responses", .pme_umasks = { INTEL_CORE_AGENT_UMASKS, { .pme_uname = "ANY", .pme_udesc = "Any external snoop response", .pme_ucode = 0xb }, { .pme_uname = "CLEAN", .pme_udesc = "External snoop CLEAN response", .pme_ucode = 0x1 }, { .pme_uname = "HIT", .pme_udesc = "External snoop HIT response", .pme_ucode = 0x2 }, { .pme_uname = "HITM", .pme_udesc = "External snoop HITM response", .pme_ucode = 0x8 } }, .pme_numasks = 6 }, { .pme_name = "CMP_SNOOP", .pme_code = 0x78, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "L1 data cache is snooped by other core", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, { .pme_uname = "ANY", .pme_udesc = "L1 data cache is snooped by other core", .pme_ucode = 0x03 }, { .pme_uname = "SHARE", .pme_udesc = "L1 data cache is snooped for sharing by other core", .pme_ucode = 0x01 }, { .pme_uname = "INVALIDATE", .pme_udesc = "L1 data cache is snooped for Invalidation by other core", .pme_ucode = 0x02 } }, .pme_numasks = 5 }, { .pme_name = "BUS_HIT_DRV", .pme_code = 0x7a, .pme_flags = 0, .pme_desc = "HIT signal asserted", .pme_umasks = { INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 2 }, { .pme_name = "BUS_HITM_DRV", .pme_code = 0x7b, .pme_flags = 0, .pme_desc = "HITM 
signal asserted", .pme_umasks = { INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 2 }, { .pme_name = "BUSQ_EMPTY", .pme_code = 0x7d, .pme_flags = 0, .pme_desc = "Bus queue is empty", .pme_umasks = { INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 2 }, { .pme_name = "SNOOP_STALL_DRV", .pme_code = 0x7e, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "Bus stalled for snoops", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS, INTEL_CORE_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_name = "BUS_IO_WAIT", .pme_code = 0x7f, .pme_flags = PFMLIB_CORE_CSPEC, .pme_desc = "IO requests waiting in the bus queue", .pme_umasks = { INTEL_CORE_SPECIFICITY_UMASKS }, .pme_numasks = 2 }, { .pme_name = "L1I_READS", .pme_code = 0x80, .pme_flags = 0, .pme_desc = "Instruction fetches" }, { .pme_name = "L1I_MISSES", .pme_code = 0x81, .pme_flags = 0, .pme_desc = "Instruction Fetch Unit misses" }, { .pme_name = "ITLB", .pme_code = 0x82, .pme_flags = 0, .pme_desc = "ITLB small page misses", .pme_umasks = { { .pme_uname = "SMALL_MISS", .pme_udesc = "ITLB small page misses", .pme_ucode = 0x2 }, { .pme_uname = "LARGE_MISS", .pme_udesc = "ITLB large page misses", .pme_ucode = 0x10 }, { .pme_uname = "FLUSH", .pme_udesc = "ITLB flushes", .pme_ucode = 0x40 }, { .pme_uname = "MISSES", .pme_udesc = "ITLB misses", .pme_ucode = 0x12 } }, .pme_numasks = 4 }, { .pme_name = "INST_QUEUE", .pme_code = 0x83, .pme_flags = 0, .pme_desc = "Cycles during which the instruction queue is full", .pme_umasks = { { .pme_uname = "FULL", .pme_udesc = "Cycles during which the instruction queue is full", .pme_ucode = 0x2 } }, .pme_numasks = 1 }, { .pme_name = "CYCLES_L1I_MEM_STALLED", .pme_code = 0x86, .pme_flags = 0, .pme_desc = "Cycles during which instruction fetches are stalled" }, { .pme_name = "ILD_STALL", .pme_code = 0x87, .pme_flags = 0, .pme_desc = "Instruction Length Decoder stall cycles due to a length changing prefix" }, { .pme_name = "BR_INST_EXEC", .pme_code = 0x88, .pme_flags = 0, .pme_desc = "Branch instructions executed" }, { 
.pme_name = "BR_MISSP_EXEC", .pme_code = 0x89, .pme_flags = 0, .pme_desc = "Mispredicted branch instructions executed" }, { .pme_name = "BR_BAC_MISSP_EXEC", .pme_code = 0x8a, .pme_flags = 0, .pme_desc = "Branch instructions mispredicted at decoding" }, { .pme_name = "BR_CND_EXEC", .pme_code = 0x8b, .pme_flags = 0, .pme_desc = "Conditional branch instructions executed" }, { .pme_name = "BR_CND_MISSP_EXEC", .pme_code = 0x8c, .pme_flags = 0, .pme_desc = "Mispredicted conditional branch instructions executed" }, { .pme_name = "BR_IND_EXEC", .pme_code = 0x8d, .pme_flags = 0, .pme_desc = "Indirect branch instructions executed" }, { .pme_name = "BR_IND_MISSP_EXEC", .pme_code = 0x8e, .pme_flags = 0, .pme_desc = "Mispredicted indirect branch instructions executed" }, { .pme_name = "BR_RET_EXEC", .pme_code = 0x8f, .pme_flags = 0, .pme_desc = "RET instructions executed" }, { .pme_name = "BR_RET_MISSP_EXEC", .pme_code = 0x90, .pme_flags = 0, .pme_desc = "Mispredicted RET instructions executed" }, { .pme_name = "BR_RET_BAC_MISSP_EXEC", .pme_code = 0x91, .pme_flags = 0, .pme_desc = "RET instructions executed mispredicted at decoding" }, { .pme_name = "BR_CALL_EXEC", .pme_code = 0x92, .pme_flags = 0, .pme_desc = "CALL instructions executed" }, { .pme_name = "BR_CALL_MISSP_EXEC", .pme_code = 0x93, .pme_flags = 0, .pme_desc = "Mispredicted CALL instructions executed" }, { .pme_name = "BR_IND_CALL_EXEC", .pme_code = 0x94, .pme_flags = 0, .pme_desc = "Indirect CALL instructions executed" }, { .pme_name = "BR_TKN_BUBBLE_1", .pme_code = 0x97, .pme_flags = 0, .pme_desc = "Branch predicted taken with bubble I" }, { .pme_name = "BR_TKN_BUBBLE_2", .pme_code = 0x98, .pme_flags = 0, .pme_desc = "Branch predicted taken with bubble II" }, #if 0 /* * Looks like event 0xa1 supersedes this one */ { .pme_name = "RS_UOPS_DISPATCHED", .pme_code = 0xa0, .pme_flags = 0, .pme_desc = "Micro-ops dispatched for execution" }, #endif { .pme_name = "MACRO_INSTS", .pme_code = 0xaa, .pme_flags = 0, .pme_desc = 
"Instructions decoded", .pme_umasks = { { .pme_uname = "DECODED", .pme_udesc = "Instructions decoded", .pme_ucode = 0x1 }, { .pme_uname = "CISC_DECODED", .pme_udesc = "CISC instructions decoded", .pme_ucode = 0x8 } }, .pme_numasks = 2 }, { .pme_name = "ESP", .pme_code = 0xab, .pme_flags = 0, .pme_desc = "ESP register content synchronization", .pme_umasks = { { .pme_uname = "SYNCH", .pme_udesc = "ESP register content synchronization", .pme_ucode = 0x1 }, { .pme_uname = "ADDITIONS", .pme_udesc = "ESP register automatic additions", .pme_ucode = 0x2 } }, .pme_numasks = 2 }, { .pme_name = "SIMD_UOPS_EXEC", .pme_code = 0xb0, .pme_flags = 0, .pme_desc = "SIMD micro-ops executed (excluding stores)" }, { .pme_name = "SIMD_SAT_UOP_EXEC", .pme_code = 0xb1, .pme_flags = 0, .pme_desc = "SIMD saturated arithmetic micro-ops executed" }, { .pme_name = "SIMD_UOP_TYPE_EXEC", .pme_code = 0xb3, .pme_flags = 0, .pme_desc = "SIMD packed multiply micro-ops executed", .pme_umasks = { { .pme_uname = "MUL", .pme_udesc = "SIMD packed multiply micro-ops executed", .pme_ucode = 0x1 }, { .pme_uname = "SHIFT", .pme_udesc = "SIMD packed shift micro-ops executed", .pme_ucode = 0x2 }, { .pme_uname = "PACK", .pme_udesc = "SIMD pack micro-ops executed", .pme_ucode = 0x4 }, { .pme_uname = "UNPACK", .pme_udesc = "SIMD unpack micro-ops executed", .pme_ucode = 0x8 }, { .pme_uname = "LOGICAL", .pme_udesc = "SIMD packed logical micro-ops executed", .pme_ucode = 0x10 }, { .pme_uname = "ARITHMETIC", .pme_udesc = "SIMD packed arithmetic micro-ops executed", .pme_ucode = 0x20 } }, .pme_numasks = 6 }, { .pme_name = "INST_RETIRED", .pme_code = 0xc0, .pme_desc = "Instructions retired", .pme_umasks = { { .pme_uname = "ANY_P", .pme_udesc = "Instructions retired (precise event)", .pme_ucode = 0x0, .pme_flags = PFMLIB_CORE_PEBS }, { .pme_uname = "LOADS", .pme_udesc = "Instructions retired, which contain a load", .pme_ucode = 0x1 }, { .pme_uname = "STORES", .pme_udesc = "Instructions retired, which contain a store", 
.pme_ucode = 0x2 }, { .pme_uname = "OTHER", .pme_udesc = "Instructions retired, with no load or store operation", .pme_ucode = 0x4 } }, .pme_numasks = 4 }, { .pme_name = "X87_OPS_RETIRED", .pme_code = 0xc1, .pme_flags = 0, .pme_desc = "FXCH instructions retired", .pme_umasks = { { .pme_uname = "FXCH", .pme_udesc = "FXCH instructions retired", .pme_ucode = 0x1 }, { .pme_uname = "ANY", .pme_udesc = "Retired floating-point computational operations (precise event)", .pme_ucode = 0xfe, .pme_flags = PFMLIB_CORE_PEBS } }, .pme_numasks = 2 }, { .pme_name = "UOPS_RETIRED", .pme_code = 0xc2, .pme_flags = 0, .pme_desc = "Fused load+op or load+indirect branch retired", .pme_umasks = { { .pme_uname = "LD_IND_BR", .pme_udesc = "Fused load+op or load+indirect branch retired", .pme_ucode = 0x1 }, { .pme_uname = "STD_STA", .pme_udesc = "Fused store address + data retired", .pme_ucode = 0x2 }, { .pme_uname = "MACRO_FUSION", .pme_udesc = "Retired instruction pairs fused into one micro-op", .pme_ucode = 0x4 }, { .pme_uname = "NON_FUSED", .pme_udesc = "Non-fused micro-ops retired", .pme_ucode = 0x8 }, { .pme_uname = "FUSED", .pme_udesc = "Fused micro-ops retired", .pme_ucode = 0x7 }, { .pme_uname = "ANY", .pme_udesc = "Micro-ops retired", .pme_ucode = 0xf } }, .pme_numasks = 6 }, { .pme_name = "MACHINE_NUKES", .pme_code = 0xc3, .pme_flags = 0, .pme_desc = "Self-Modifying Code detected", .pme_umasks = { { .pme_uname = "SMC", .pme_udesc = "Self-Modifying Code detected", .pme_ucode = 0x1 }, { .pme_uname = "MEM_ORDER", .pme_udesc = "Execution pipeline restart due to memory ordering conflict or memory disambiguation misprediction", .pme_ucode = 0x4 } }, .pme_numasks = 2 }, { .pme_name = "BR_INST_RETIRED", .pme_code = 0xc4, .pme_flags = 0, .pme_desc = "Retired branch instructions", .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "Retired branch instructions", .pme_ucode = 0x0 }, { .pme_uname = "PRED_NOT_TAKEN", .pme_udesc = "Retired branch instructions that were predicted not-taken", 
.pme_ucode = 0x1 }, { .pme_uname = "MISPRED_NOT_TAKEN", .pme_udesc = "Retired branch instructions that were mispredicted not-taken", .pme_ucode = 0x2 }, { .pme_uname = "PRED_TAKEN", .pme_udesc = "Retired branch instructions that were predicted taken", .pme_ucode = 0x4 }, { .pme_uname = "MISPRED_TAKEN", .pme_udesc = "Retired branch instructions that were mispredicted taken", .pme_ucode = 0x8 }, { .pme_uname = "TAKEN", .pme_udesc = "Retired taken branch instructions", .pme_ucode = 0xc } }, .pme_numasks = 6 }, { .pme_name = "BR_INST_RETIRED_MISPRED", .pme_code = 0x00c5, .pme_desc = "Retired mispredicted branch instructions (precise event)", .pme_flags = PFMLIB_CORE_PEBS }, { .pme_name = "CYCLES_INT_MASKED", .pme_code = 0x1c6, .pme_flags = 0, .pme_desc = "Cycles during which interrupts are disabled" }, { .pme_name = "CYCLES_INT_PENDING_AND_MASKED", .pme_code = 0x2c6, .pme_flags = 0, .pme_desc = "Cycles during which interrupts are pending and disabled" }, { .pme_name = "SIMD_INST_RETIRED", .pme_code = 0xc7, .pme_flags = 0, .pme_desc = "Retired Streaming SIMD Extensions (SSE) packed-single instructions", .pme_umasks = { { .pme_uname = "PACKED_SINGLE", .pme_udesc = "Retired Streaming SIMD Extensions (SSE) packed-single instructions", .pme_ucode = 0x1 }, { .pme_uname = "SCALAR_SINGLE", .pme_udesc = "Retired Streaming SIMD Extensions (SSE) scalar-single instructions", .pme_ucode = 0x2 }, { .pme_uname = "PACKED_DOUBLE", .pme_udesc = "Retired Streaming SIMD Extensions 2 (SSE2) packed-double instructions", .pme_ucode = 0x4 }, { .pme_uname = "SCALAR_DOUBLE", .pme_udesc = "Retired Streaming SIMD Extensions 2 (SSE2) scalar-double instructions", .pme_ucode = 0x8 }, { .pme_uname = "VECTOR", .pme_udesc = "Retired Streaming SIMD Extensions 2 (SSE2) vector integer instructions", .pme_ucode = 0x10 }, { .pme_uname = "ANY", .pme_udesc = "Retired Streaming SIMD instructions (precise event)", .pme_ucode = 0x1f, .pme_flags = PFMLIB_CORE_PEBS } }, .pme_numasks = 6 }, { .pme_name = 
"HW_INT_RCV", .pme_code = 0xc8, .pme_desc = "Hardware interrupts received" }, { .pme_name = "ITLB_MISS_RETIRED", .pme_code = 0xc9, .pme_flags = 0, .pme_desc = "Retired instructions that missed the ITLB" }, { .pme_name = "SIMD_COMP_INST_RETIRED", .pme_code = 0xca, .pme_flags = 0, .pme_desc = "Retired computational Streaming SIMD Extensions (SSE) packed-single instructions", .pme_umasks = { { .pme_uname = "PACKED_SINGLE", .pme_udesc = "Retired computational Streaming SIMD Extensions (SSE) packed-single instructions", .pme_ucode = 0x1 }, { .pme_uname = "SCALAR_SINGLE", .pme_udesc = "Retired computational Streaming SIMD Extensions (SSE) scalar-single instructions", .pme_ucode = 0x2 }, { .pme_uname = "PACKED_DOUBLE", .pme_udesc = "Retired computational Streaming SIMD Extensions 2 (SSE2) packed-double instructions", .pme_ucode = 0x4 }, { .pme_uname = "SCALAR_DOUBLE", .pme_udesc = "Retired computational Streaming SIMD Extensions 2 (SSE2) scalar-double instructions", .pme_ucode = 0x8 } }, .pme_numasks = 4 }, { .pme_name = "MEM_LOAD_RETIRED", .pme_code = 0xcb, .pme_desc = "Retired loads that miss the L1 data cache", .pme_flags = PFMLIB_CORE_PMC0, .pme_umasks = { { .pme_uname = "L1D_MISS", .pme_udesc = "Retired loads that miss the L1 data cache (precise event)", .pme_ucode = 0x1, .pme_flags = PFMLIB_CORE_PEBS }, { .pme_uname = "L1D_LINE_MISS", .pme_udesc = "L1 data cache line missed by retired loads (precise event)", .pme_ucode = 0x2, .pme_flags = PFMLIB_CORE_PEBS }, { .pme_uname = "L2_MISS", .pme_udesc = "Retired loads that miss the L2 cache (precise event)", .pme_ucode = 0x4, .pme_flags = PFMLIB_CORE_PEBS }, { .pme_uname = "L2_LINE_MISS", .pme_udesc = "L2 cache line missed by retired loads (precise event)", .pme_ucode = 0x8, .pme_flags = PFMLIB_CORE_PEBS }, { .pme_uname = "DTLB_MISS", .pme_udesc = "Retired loads that miss the DTLB (precise event)", .pme_ucode = 0x10, .pme_flags = PFMLIB_CORE_PEBS } }, .pme_numasks = 5 }, { .pme_name = "FP_MMX_TRANS", .pme_code = 0xcc, 
.pme_flags = PFMLIB_CORE_PEBS, .pme_desc = "Transitions from MMX (TM) Instructions to Floating Point Instructions", .pme_umasks = { { .pme_uname = "TO_FP", .pme_udesc = "Transitions from MMX (TM) Instructions to Floating Point Instructions", .pme_ucode = 0x2 }, { .pme_uname = "TO_MMX", .pme_udesc = "Transitions from Floating Point to MMX (TM) Instructions", .pme_ucode = 0x1 } }, .pme_numasks = 2 }, { .pme_name = "SIMD_ASSIST", .pme_code = 0xcd, .pme_flags = 0, .pme_desc = "SIMD assists invoked" }, { .pme_name = "SIMD_INSTR_RETIRED", .pme_code = 0xce, .pme_flags = 0, .pme_desc = "SIMD Instructions retired" }, { .pme_name = "SIMD_SAT_INSTR_RETIRED", .pme_code = 0xcf, .pme_flags = 0, .pme_desc = "Saturated arithmetic instructions retired" }, { .pme_name = "RAT_STALLS", .pme_code = 0xd2, .pme_flags = 0, .pme_desc = "ROB read port stalls cycles", .pme_umasks = { { .pme_uname = "ROB_READ_PORT", .pme_udesc = "ROB read port stalls cycles", .pme_ucode = 0x1 }, { .pme_uname = "PARTIAL_CYCLES", .pme_udesc = "Partial register stall cycles", .pme_ucode = 0x2 }, { .pme_uname = "FLAGS", .pme_udesc = "Flag stall cycles", .pme_ucode = 0x4 }, { .pme_uname = "FPSW", .pme_udesc = "FPU status word stall", .pme_ucode = 0x8 }, { .pme_uname = "ANY", .pme_udesc = "All RAT stall cycles", .pme_ucode = 0xf } }, .pme_numasks = 5 }, { .pme_name = "SEG_RENAME_STALLS", .pme_code = 0xd4, .pme_flags = 0, .pme_desc = "Segment rename stalls - ES ", .pme_umasks = { { .pme_uname = "ES", .pme_udesc = "Segment rename stalls - ES ", .pme_ucode = 0x1 }, { .pme_uname = "DS", .pme_udesc = "Segment rename stalls - DS", .pme_ucode = 0x2 }, { .pme_uname = "FS", .pme_udesc = "Segment rename stalls - FS", .pme_ucode = 0x4 }, { .pme_uname = "GS", .pme_udesc = "Segment rename stalls - GS", .pme_ucode = 0x8 }, { .pme_uname = "ANY", .pme_udesc = "Any (ES/DS/FS/GS) segment rename stall", .pme_ucode = 0xf } }, .pme_numasks = 5 }, { .pme_name = "SEG_REG_RENAMES", .pme_code = 0xd5, .pme_flags = 0, .pme_desc = "Segment 
renames - ES", .pme_umasks = { { .pme_uname = "ES", .pme_udesc = "Segment renames - ES", .pme_ucode = 0x1 }, { .pme_uname = "DS", .pme_udesc = "Segment renames - DS", .pme_ucode = 0x2 }, { .pme_uname = "FS", .pme_udesc = "Segment renames - FS", .pme_ucode = 0x4 }, { .pme_uname = "GS", .pme_udesc = "Segment renames - GS", .pme_ucode = 0x8 }, { .pme_uname = "ANY", .pme_udesc = "Any (ES/DS/FS/GS) segment rename", .pme_ucode = 0xf } }, .pme_numasks = 5 }, { .pme_name = "RESOURCE_STALLS", .pme_code = 0xdc, .pme_flags = 0, .pme_desc = "Cycles during which the ROB is full", .pme_umasks = { { .pme_uname = "ROB_FULL", .pme_udesc = "Cycles during which the ROB is full", .pme_ucode = 0x1 }, { .pme_uname = "RS_FULL", .pme_udesc = "Cycles during which the RS is full", .pme_ucode = 0x2 }, { .pme_uname = "LD_ST", .pme_udesc = "Cycles during which the pipeline has exceeded load or store limit or waiting to commit all stores", .pme_ucode = 0x4 }, { .pme_uname = "FPCW", .pme_udesc = "Cycles stalled due to FPU control word write", .pme_ucode = 0x8 }, { .pme_uname = "BR_MISS_CLEAR", .pme_udesc = "Cycles stalled due to branch misprediction", .pme_ucode = 0x10 }, { .pme_uname = "ANY", .pme_udesc = "Resource related stalls", .pme_ucode = 0x1f } }, .pme_numasks = 6 }, { .pme_name = "BR_INST_DECODED", .pme_code = 0xe0, .pme_flags = 0, .pme_desc = "Branch instructions decoded" }, { .pme_name = "BOGUS_BR", .pme_code = 0xe4, .pme_flags = 0, .pme_desc = "Bogus branches" }, { .pme_name = "BACLEARS", .pme_code = 0xe6, .pme_flags = 0, .pme_desc = "BACLEARS asserted" }, { .pme_name = "PREF_RQSTS_UP", .pme_code = 0xf0, .pme_flags = 0, .pme_desc = "Upward prefetches issued from the DPL" }, { .pme_name = "PREF_RQSTS_DN", .pme_code = 0xf8, .pme_flags = 0, .pme_desc = "Downward prefetches issued from the DPL" } }; #define PME_CORE_UNHALTED_CORE_CYCLES 0 #define PME_CORE_INSTRUCTIONS_RETIRED 1 #define PME_CORE_EVENT_COUNT (sizeof(core_pe)/sizeof(pme_core_entry_t)) 
papi-papi-7-2-0-t/src/libperfnec/lib/coreduo_events.h000066400000000000000000000641251502707512200225130ustar00rootroot00000000000000/* * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * Contributions by James Ralph * * Based on: * Copyright (c) 2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
*/ #define INTEL_COREDUO_MESI_UMASKS \ { .pme_uname = "MESI",\ .pme_udesc = "Any cacheline access",\ .pme_ucode = 0xf\ },\ { .pme_uname = "I_STATE",\ .pme_udesc = "Invalid cacheline",\ .pme_ucode = 0x1\ },\ { .pme_uname = "S_STATE",\ .pme_udesc = "Shared cacheline",\ .pme_ucode = 0x2\ },\ { .pme_uname = "E_STATE",\ .pme_udesc = "Exclusive cacheline",\ .pme_ucode = 0x4\ },\ { .pme_uname = "M_STATE",\ .pme_udesc = "Modified cacheline",\ .pme_ucode = 0x8\ } #define INTEL_COREDUO_SPECIFICITY_UMASKS \ { .pme_uname = "SELF",\ .pme_udesc = "This core",\ .pme_ucode = 0x40\ },\ { .pme_uname = "BOTH_CORES",\ .pme_udesc = "Both cores",\ .pme_ucode = 0xc0\ } #define INTEL_COREDUO_HW_PREFETCH_UMASKS \ { .pme_uname = "ANY",\ .pme_udesc = "All inclusive",\ .pme_ucode = 0x30\ },\ { .pme_uname = "PREFETCH",\ .pme_udesc = "Hardware prefetch only",\ .pme_ucode = 0x10\ } #define INTEL_COREDUO_AGENT_UMASKS \ { .pme_uname = "THIS_AGENT",\ .pme_udesc = "This agent",\ .pme_ucode = 0x00\ },\ { .pme_uname = "ALL_AGENTS",\ .pme_udesc = "Any agent on the bus",\ .pme_ucode = 0x20\ } static pme_coreduo_entry_t coreduo_pe[]={ /* * BEGIN architectural perfmon events */ /* 0 */{ .pme_name = "UNHALTED_CORE_CYCLES", .pme_code = 0x003c, .pme_desc = "Unhalted core cycles", }, /* 1 */{ .pme_name = "UNHALTED_REFERENCE_CYCLES", .pme_code = 0x013c, .pme_desc = "Unhalted reference cycles. 
Measures bus cycles" }, /* 2 */{ .pme_name = "INSTRUCTIONS_RETIRED", .pme_code = 0xc0, .pme_desc = "Instructions retired" }, /* 3 */{ .pme_name = "LAST_LEVEL_CACHE_REFERENCES", .pme_code = 0x4f2e, .pme_desc = "Last level of cache references" }, /* 4 */{ .pme_name = "LAST_LEVEL_CACHE_MISSES", .pme_code = 0x412e, .pme_desc = "Last level of cache misses", }, /* 5 */{ .pme_name = "BRANCH_INSTRUCTIONS_RETIRED", .pme_code = 0xc4, .pme_desc = "Branch instructions retired" }, /* 6 */{ .pme_name = "MISPREDICTED_BRANCH_RETIRED", .pme_code = 0xc5, .pme_desc = "Mispredicted branch instruction retired" }, /* * BEGIN non architectural events */ { .pme_code = 0x3, .pme_name = "LD_BLOCKS", .pme_desc = "Load operations delayed due to store buffer blocks. The preceding store may be blocked due to unknown address, unknown data, or conflict due to partial overlap between the load and store.", }, { .pme_code = 0x4, .pme_name = "SD_DRAINS", .pme_desc = "Cycles while draining store buffers", }, { .pme_code = 0x5, .pme_name = "MISALIGN_MEM_REF", .pme_desc = "Misaligned data memory references (MOB splits of loads and stores).", }, { .pme_code = 0x6, .pme_name = "SEG_REG_LOADS", .pme_desc = "Segment register loads", }, { .pme_code = 0x7, .pme_name = "SSE_PREFETCH", .pme_flags = 0, .pme_desc = "Streaming SIMD Extensions (SSE) Prefetch instructions executed", .pme_umasks = { { .pme_uname = "NTA", .pme_udesc = "Streaming SIMD Extensions (SSE) Prefetch NTA instructions executed", .pme_ucode = 0x0 }, { .pme_uname = "T1", .pme_udesc = "SSE software prefetch instruction PREFETCHT1 retired", .pme_ucode = 0x01 }, { .pme_uname = "T2", .pme_udesc = "SSE software prefetch instruction PREFETCHT2 retired", .pme_ucode = 0x02 }, }, .pme_numasks = 3 }, { .pme_name = "SSE_NTSTORES_RET", .pme_desc = "SSE streaming store instruction retired", .pme_code = 0x0307 }, { .pme_code = 0x10, .pme_name = "FP_COMPS_OP_EXE", .pme_desc = "FP computational Instruction executed.
FADD, FSUB, FCOM, FMULs, MUL, IMUL, FDIVs, DIV, IDIV, FPREMs, FSQRT are included; but exclude FADD or FMUL used in the middle of a transcendental instruction.", }, { .pme_code = 0x11, .pme_name = "FP_ASSIST", .pme_desc = "FP exceptions experienced microcode assists", .pme_flags = PFMLIB_COREDUO_PMC1 }, { .pme_code = 0x12, .pme_name = "MUL", .pme_desc = "Multiply operations (a speculative count, including FP and integer multiplies).", .pme_flags = PFMLIB_COREDUO_PMC1 }, { .pme_code = 0x13, .pme_name = "DIV", .pme_desc = "Divide operations (a speculative count, including FP and integer divides). ", .pme_flags = PFMLIB_COREDUO_PMC1 }, { .pme_code = 0x14, .pme_name = "CYCLES_DIV_BUSY", .pme_desc = "Cycles the divider is busy ", .pme_flags = PFMLIB_COREDUO_PMC0 }, { .pme_code = 0x21, .pme_name = "L2_ADS", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "L2 Address strobes ", .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS }, .pme_numasks = 2 }, { .pme_code = 0x22, .pme_name = "DBUS_BUSY", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "Core cycles during which the data bus was busy (increments by 4)", .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS }, .pme_numasks = 2 }, { .pme_code = 0x23, .pme_name = "DBUS_BUSY_RD", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "Cycles data bus is busy transferring data to a core (increments by 4) ", .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS }, .pme_numasks = 2 }, { .pme_code = 0x24, .pme_name = "L2_LINES_IN", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "L2 cache lines allocated", .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS, INTEL_COREDUO_HW_PREFETCH_UMASKS }, .pme_numasks = 4 }, { .pme_code = 0x25, .pme_name = "L2_M_LINES_IN", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "L2 Modified-state cache lines allocated", .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS }, .pme_numasks = 2 }, { .pme_code = 0x26, .pme_name = "L2_LINES_OUT", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "L2 cache lines evicted ", .pme_umasks =
{ INTEL_COREDUO_SPECIFICITY_UMASKS, INTEL_COREDUO_HW_PREFETCH_UMASKS }, .pme_numasks = 4 }, { .pme_code = 0x27, .pme_name = "L2_M_LINES_OUT", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "L2 Modified-state cache lines evicted ", .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS, INTEL_COREDUO_HW_PREFETCH_UMASKS }, .pme_numasks = 4 }, { .pme_code = 0x28, .pme_name = "L2_IFETCH", .pme_flags = PFMLIB_COREDUO_CSPEC|PFMLIB_COREDUO_MESI, .pme_desc = "L2 instruction fetches from the instruction fetch unit (includes speculative fetches) ", .pme_umasks = { INTEL_COREDUO_MESI_UMASKS, INTEL_COREDUO_SPECIFICITY_UMASKS }, .pme_numasks = 7 }, { .pme_code = 0x29, .pme_name = "L2_LD", .pme_desc = "L2 cache reads (includes speculation) ", .pme_flags = PFMLIB_COREDUO_CSPEC|PFMLIB_COREDUO_MESI, .pme_umasks = { INTEL_COREDUO_MESI_UMASKS, INTEL_COREDUO_SPECIFICITY_UMASKS }, .pme_numasks = 7 }, { .pme_code = 0x2A, .pme_name = "L2_ST", .pme_flags = PFMLIB_COREDUO_CSPEC|PFMLIB_COREDUO_MESI, .pme_desc = "L2 cache writes (includes speculation)", .pme_umasks = { INTEL_COREDUO_MESI_UMASKS, INTEL_COREDUO_SPECIFICITY_UMASKS }, .pme_numasks = 7 }, { .pme_code = 0x2E, .pme_name = "L2_RQSTS", .pme_flags = PFMLIB_COREDUO_CSPEC|PFMLIB_COREDUO_MESI, .pme_desc = "L2 cache reference requests ", .pme_umasks = { INTEL_COREDUO_MESI_UMASKS, INTEL_COREDUO_SPECIFICITY_UMASKS, INTEL_COREDUO_HW_PREFETCH_UMASKS }, .pme_numasks = 9 }, { .pme_code = 0x30, .pme_name = "L2_REJECT_CYCLES", .pme_flags = PFMLIB_COREDUO_CSPEC|PFMLIB_COREDUO_MESI, .pme_desc = "Cycles L2 is busy and rejecting new requests.", .pme_umasks = { INTEL_COREDUO_MESI_UMASKS, INTEL_COREDUO_SPECIFICITY_UMASKS, INTEL_COREDUO_HW_PREFETCH_UMASKS }, .pme_numasks = 9 }, { .pme_code = 0x32, .pme_name = "L2_NO_REQUEST_CYCLES", .pme_flags = PFMLIB_COREDUO_CSPEC|PFMLIB_COREDUO_MESI, .pme_desc = "Cycles there is no request to access L2.", .pme_umasks = { INTEL_COREDUO_MESI_UMASKS, INTEL_COREDUO_SPECIFICITY_UMASKS, INTEL_COREDUO_HW_PREFETCH_UMASKS },
.pme_numasks = 9 }, { .pme_code = 0x3A, .pme_name = "EST_TRANS_ALL", .pme_desc = "Any Intel Enhanced SpeedStep(R) Technology transitions", }, { .pme_code = 0x103A, .pme_name = "EST_TRANS_FREQ", .pme_desc = "Intel Enhanced SpeedStep Technology frequency transitions", }, { .pme_code = 0x3B, .pme_name = "THERMAL_TRIP", .pme_desc = "Duration in a thermal trip based on the current core clock ", .pme_umasks = { { .pme_uname = "CYCLES", .pme_udesc = "Duration in a thermal trip based on the current core clock", .pme_ucode = 0xC0 }, { .pme_uname = "TRIPS", .pme_udesc = "Number of thermal trips", .pme_ucode = 0xC0 | (1<<10) /* Edge detect pin (Figure 18-13) */ } }, .pme_numasks = 2 }, { .pme_name = "CPU_CLK_UNHALTED", .pme_code = 0x3c, .pme_desc = "Core cycles when core is not halted", .pme_umasks = { { .pme_uname = "NONHLT_REF_CYCLES", .pme_udesc = "Non-halted bus cycles", .pme_ucode = 0x01 }, { .pme_uname = "SERIAL_EXECUTION_CYCLES", .pme_udesc = "Non-halted bus cycles of this core executing code while the other core is halted", .pme_ucode = 0x02 } }, .pme_numasks = 2 }, { .pme_code = 0x40, .pme_name = "DCACHE_CACHE_LD", .pme_desc = "L1 cacheable data read operations", .pme_umasks = { INTEL_COREDUO_MESI_UMASKS }, .pme_numasks = 5 }, { .pme_code = 0x41, .pme_name = "DCACHE_CACHE_ST", .pme_desc = "L1 cacheable data write operations", .pme_umasks = { INTEL_COREDUO_MESI_UMASKS }, .pme_numasks = 5 }, { .pme_code = 0x42, .pme_name = "DCACHE_CACHE_LOCK", .pme_desc = "L1 cacheable lock read operations to invalid state", .pme_umasks = { INTEL_COREDUO_MESI_UMASKS }, .pme_numasks = 5 }, { .pme_code = 0x0143, .pme_name = "DATA_MEM_REF", .pme_desc = "L1 data read and writes of cacheable and non-cacheable types", }, { .pme_code = 0x0244, .pme_name = "DATA_MEM_CACHE_REF", .pme_desc = "L1 data cacheable read and write operations.", }, { .pme_code = 0x0f45, .pme_name = "DCACHE_REPL", .pme_desc = "L1 data cache line replacements", }, { .pme_code = 0x46, .pme_name = "DCACHE_M_REPL", .pme_desc
= "L1 data M-state cache line allocated", }, { .pme_code = 0x47, .pme_name = "DCACHE_M_EVICT", .pme_desc = "L1 data M-state cache line evicted", }, { .pme_code = 0x48, .pme_name = "DCACHE_PEND_MISS", .pme_desc = "Weighted cycles of L1 miss outstanding", }, { .pme_code = 0x49, .pme_name = "DTLB_MISS", .pme_desc = "Data references that missed TLB", }, { .pme_code = 0x4B, .pme_name = "SSE_PRE_MISS", .pme_flags = 0, .pme_desc = "Streaming SIMD Extensions (SSE) instructions missing all cache levels", .pme_umasks = { { .pme_uname = "NTA_MISS", .pme_udesc = "PREFETCHNTA missed all caches", .pme_ucode = 0x00 }, { .pme_uname = "T1_MISS", .pme_udesc = "PREFETCHT1 missed all caches", .pme_ucode = 0x01 }, { .pme_uname = "T2_MISS", .pme_udesc = "PREFETCHT2 missed all caches", .pme_ucode = 0x02 }, { .pme_uname = "STORES_MISS", .pme_udesc = "SSE streaming store instruction missed all caches", .pme_ucode = 0x03 } }, .pme_numasks = 4 }, { .pme_code = 0x4F, .pme_name = "L1_PREF_REQ", .pme_desc = "L1 prefetch requests due to DCU cache misses", }, { .pme_code = 0x60, .pme_name = "BUS_REQ_OUTSTANDING", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "Weighted cycles of cacheable bus data read requests. This event counts full-line read request from DCU or HW prefetcher, but not RFO, write, instruction fetches, or others.", .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS, INTEL_COREDUO_AGENT_UMASKS }, .pme_numasks = 4 /* TODO: umasks bit 12 to include HWP or exclude HWP separately. 
*/, }, { .pme_code = 0x61, .pme_name = "BUS_BNR_CLOCKS", .pme_desc = "External bus cycles while BNR asserted", }, { .pme_code = 0x62, .pme_name = "BUS_DRDY_CLOCKS", .pme_desc = "External bus cycles while DRDY asserted", .pme_umasks = { INTEL_COREDUO_AGENT_UMASKS }, .pme_numasks = 2 }, { .pme_code = 0x63, .pme_name = "BUS_LOCKS_CLOCKS", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "External bus cycles while bus lock signal asserted", .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS, }, .pme_numasks = 2 }, { .pme_code = 0x4064, .pme_name = "BUS_DATA_RCV", .pme_desc = "External bus cycles while bus lock signal asserted", }, { .pme_code = 0x65, .pme_name = "BUS_TRANS_BRD", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "Burst read bus transactions (data or code)", .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS, }, .pme_numasks = 2 }, { .pme_code = 0x66, .pme_name = "BUS_TRANS_RFO", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "Completed read for ownership ", .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS, INTEL_COREDUO_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_code = 0x68, .pme_name = "BUS_TRANS_IFETCH", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "Completed instruction fetch transactions", .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS, INTEL_COREDUO_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_code = 0x69, .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_name = "BUS_TRANS_INVAL", .pme_desc = "Completed invalidate transactions", .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS, INTEL_COREDUO_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_code = 0x6A, .pme_name = "BUS_TRANS_PWR", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "Completed partial write transactions", .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS, INTEL_COREDUO_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_code = 0x6B, .pme_name = "BUS_TRANS_P", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "Completed partial transactions (include partial read + partial write + line write)", .pme_umasks = { 
INTEL_COREDUO_SPECIFICITY_UMASKS, INTEL_COREDUO_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_code = 0x6C, .pme_name = "BUS_TRANS_IO", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "Completed I/O transactions (read and write)", .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS, INTEL_COREDUO_AGENT_UMASKS }, .pme_numasks = 4 }, { .pme_code = 0x206D, .pme_name = "BUS_TRANS_DEF", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "Completed defer transactions ", .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS }, .pme_numasks = 2 }, { .pme_code = 0xc067, .pme_name = "BUS_TRANS_WB", .pme_desc = "Completed writeback transactions from DCU (does not include L2 writebacks)", .pme_umasks = { INTEL_COREDUO_AGENT_UMASKS }, .pme_numasks = 2 }, { .pme_code = 0xc06E, .pme_name = "BUS_TRANS_BURST", .pme_desc = "Completed burst transactions (full line transactions include reads, write, RFO, and writebacks) ", /* TODO .pme_umasks = 0xC0, */ .pme_umasks = { INTEL_COREDUO_AGENT_UMASKS }, .pme_numasks = 2 }, { .pme_code = 0xc06F, .pme_name = "BUS_TRANS_MEM", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "Completed memory transactions. 
This includes Bus_Trans_Burst + Bus_Trans_P + Bus_Trans_Inval.", .pme_umasks = { INTEL_COREDUO_AGENT_UMASKS }, .pme_numasks = 2 }, { .pme_code = 0xc070, .pme_name = "BUS_TRANS_ANY", .pme_desc = "Any completed bus transactions", .pme_umasks = { INTEL_COREDUO_AGENT_UMASKS }, .pme_numasks = 2 }, { .pme_code = 0x77, .pme_name = "BUS_SNOOPS", .pme_desc = "External bus cycles while bus lock signal asserted", .pme_flags = PFMLIB_COREDUO_MESI, .pme_umasks = { INTEL_COREDUO_MESI_UMASKS, INTEL_COREDUO_AGENT_UMASKS }, .pme_numasks = 7 }, { .pme_code = 0x0178, .pme_name = "DCU_SNOOP_TO_SHARE", .pme_desc = "DCU snoops to share-state L1 cache line due to L1 misses ", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS }, .pme_numasks = 2 }, { .pme_code = 0x7D, .pme_name = "BUS_NOT_IN_USE", .pme_flags = PFMLIB_COREDUO_CSPEC, .pme_desc = "Number of cycles there is no transaction from the core", .pme_umasks = { INTEL_COREDUO_SPECIFICITY_UMASKS }, .pme_numasks = 2 }, { .pme_code = 0x7E, .pme_name = "BUS_SNOOP_STALL", .pme_desc = "Number of bus cycles while bus snoop is stalled" }, { .pme_code = 0x80, .pme_name = "ICACHE_READS", .pme_desc = "Number of instruction fetches from ICache, streaming buffers (both cacheable and uncacheable fetches)" }, { .pme_code = 0x81, .pme_name = "ICACHE_MISSES", .pme_desc = "Number of instruction fetch misses from ICache, streaming buffers." }, { .pme_code = 0x85, .pme_name = "ITLB_MISSES", .pme_desc = "Number of ITLB misses" }, { .pme_code = 0x86, .pme_name = "IFU_MEM_STALL", .pme_desc = "Cycles IFU is stalled while waiting for data from memory" }, { .pme_code = 0x87, .pme_name = "ILD_STALL", .pme_desc = "Number of instruction length decoder stalls (Counts number of LCP stalls)" }, { .pme_code = 0x88, .pme_name = "BR_INST_EXEC", .pme_desc = "Branch instruction executed (includes speculation)."
}, { .pme_code = 0x89, .pme_name = "BR_MISSP_EXEC", .pme_desc = "Branch instructions executed and mispredicted at execution (includes branches that do not have prediction or mispredicted)" }, { .pme_code = 0x8A, .pme_name = "BR_BAC_MISSP_EXEC", .pme_desc = "Branch instructions executed that were mispredicted at front end" }, { .pme_code = 0x8B, .pme_name = "BR_CND_EXEC", .pme_desc = "Conditional branch instructions executed" }, { .pme_code = 0x8C, .pme_name = "BR_CND_MISSP_EXEC", .pme_desc = "Conditional branch instructions executed that were mispredicted" }, { .pme_code = 0x8D, .pme_name = "BR_IND_EXEC", .pme_desc = "Indirect branch instructions executed" }, { .pme_code = 0x8E, .pme_name = "BR_IND_MISSP_EXEC", .pme_desc = "Indirect branch instructions executed that were mispredicted" }, { .pme_code = 0x8F, .pme_name = "BR_RET_EXEC", .pme_desc = "Return branch instructions executed" }, { .pme_code = 0x90, .pme_name = "BR_RET_MISSP_EXEC", .pme_desc = "Return branch instructions executed that were mispredicted" }, { .pme_code = 0x91, .pme_name = "BR_RET_BAC_MISSP_EXEC", .pme_desc = "Return branch instructions executed that were mispredicted at the front end" }, { .pme_code = 0x92, .pme_name = "BR_CALL_EXEC", .pme_desc = "Return call instructions executed" }, { .pme_code = 0x93, .pme_name = "BR_CALL_MISSP_EXEC", .pme_desc = "Return call instructions executed that were mispredicted" }, { .pme_code = 0x94, .pme_name = "BR_IND_CALL_EXEC", .pme_desc = "Indirect call branch instructions executed" }, { .pme_code = 0xA2, .pme_name = "RESOURCE_STALL", .pme_desc = "Cycles while there is a resource related stall (renaming, buffer entries) as seen by allocator" }, { .pme_code = 0xB0, .pme_name = "MMX_INSTR_EXEC", .pme_desc = "Number of MMX instructions executed (does not include MOVQ and MOVD stores)" }, { .pme_code = 0xB1, .pme_name = "SIMD_INT_SAT_EXEC", .pme_desc = "Number of SIMD Integer saturating instructions executed" }, { .pme_code = 0xB3, .pme_name = 
"SIMD_INT_INSTRUCTIONS", .pme_desc = "Number of SIMD Integer instructions executed", .pme_umasks = { { .pme_uname = "MUL", .pme_udesc = "Number of SIMD Integer packed multiply instructions executed", .pme_ucode = 0x01 }, { .pme_uname = "SHIFT", .pme_udesc = "Number of SIMD Integer packed shift instructions executed", .pme_ucode = 0x02 }, { .pme_uname = "PACK", .pme_udesc = "Number of SIMD Integer pack operations instruction executed", .pme_ucode = 0x04 }, { .pme_uname = "UNPACK", .pme_udesc = "Number of SIMD Integer unpack instructions executed", .pme_ucode = 0x08 }, { .pme_uname = "LOGICAL", .pme_udesc = "Number of SIMD Integer packed logical instructions executed", .pme_ucode = 0x10 }, { .pme_uname = "ARITHMETIC", .pme_udesc = "Number of SIMD Integer packed arithmetic instructions executed", .pme_ucode = 0x20 } }, .pme_numasks = 6 }, { .pme_code = 0xC0, .pme_name = "INSTR_RET", .pme_desc = "Number of instruction retired (Macro fused instruction count as 2)" }, { .pme_code = 0xC1, .pme_name = "FP_COMP_INSTR_RET", .pme_desc = "Number of FP compute instructions retired (X87 instruction or instruction that contain X87 operations)", .pme_flags = PFMLIB_COREDUO_PMC0 }, { .pme_code = 0xC2, .pme_name = "UOPS_RET", .pme_desc = "Number of micro-ops retired (include fused uops)" }, { .pme_code = 0xC3, .pme_name = "SMC_DETECTED", .pme_desc = "Number of times self-modifying code condition detected" }, { .pme_code = 0xC4, .pme_name = "BR_INSTR_RET", .pme_desc = "Number of branch instructions retired" }, { .pme_code = 0xC5, .pme_name = "BR_MISPRED_RET", .pme_desc = "Number of mispredicted branch instructions retired" }, { .pme_code = 0xC6, .pme_name = "CYCLES_INT_MASKED", .pme_desc = "Cycles while interrupt is disabled" }, { .pme_code = 0xC7, .pme_name = "CYCLES_INT_PEDNING_MASKED", .pme_desc = "Cycles while interrupt is disabled and interrupts are pending" }, { .pme_code = 0xC8, .pme_name = "HW_INT_RX", .pme_desc = "Number of hardware interrupts received" }, { .pme_code = 
0xC9, .pme_name = "BR_TAKEN_RET", .pme_desc = "Number of taken branch instruction retired" }, { .pme_code = 0xCA, .pme_name = "BR_MISPRED_TAKEN_RET", .pme_desc = "Number of taken and mispredicted branch instructions retired" }, { .pme_code = 0xCC, .pme_name = "MMX_FP_TRANS", .pme_desc = "Transitions from MMX (TM) Instructions to Floating Point Instructions", .pme_umasks = { { .pme_uname = "TO_FP", .pme_udesc = "Number of transitions from MMX to X87", .pme_ucode = 0x00 }, { .pme_uname = "TO_MMX", .pme_udesc = "Number of transitions from X87 to MMX", .pme_ucode = 0x01 } }, .pme_numasks = 2 }, { .pme_code = 0xCD, .pme_name = "MMX_ASSIST", .pme_desc = "Number of EMMS executed" }, { .pme_code = 0xCE, .pme_name = "MMX_INSTR_RET", .pme_desc = "Number of MMX instruction retired" }, { .pme_code = 0xD0, .pme_name = "INSTR_DECODED", .pme_desc = "Number of instruction decoded" }, { .pme_code = 0xD7, .pme_name = "ESP_UOPS", .pme_desc = "Number of ESP folding instruction decoded" }, { .pme_code = 0xD8, .pme_name = "SSE_INSTRUCTIONS_RETIRED", .pme_desc = "Number of SSE/SSE2 instructions retired (packed and scalar)", .pme_umasks = { { .pme_uname = "SINGLE", .pme_udesc = "Number of SSE/SSE2 single precision instructions retired (packed and scalar)", .pme_ucode = 0x00 }, { .pme_uname = "SCALAR_SINGLE", .pme_udesc = "Number of SSE/SSE2 scalar single precision instructions retired", .pme_ucode = 0x01, }, { .pme_uname = "PACKED_DOUBLE", .pme_udesc = "Number of SSE/SSE2 packed double precision instructions retired", .pme_ucode = 0x02, }, { .pme_uname = "DOUBLE", .pme_udesc = "Number of SSE/SSE2 scalar double precision instructions retired", .pme_ucode = 0x03, }, { .pme_uname = "INT_128", .pme_udesc = "Number of SSE2 128 bit integer instructions retired", .pme_ucode = 0x04, }, }, .pme_numasks = 5 }, { .pme_code = 0xD9, .pme_name = "SSE_COMP_INSTRUCTIONS_RETIRED", .pme_desc = "Number of computational SSE/SSE2 instructions retired (does not include AND, OR,
XOR)", .pme_umasks = { { .pme_uname = "PACKED_SINGLE", .pme_udesc = "Number of SSE/SSE2 packed single precision compute instructions retired (does not include AND, OR, XOR)", .pme_ucode = 0x00 }, { .pme_uname = "SCALAR_SINGLE", .pme_udesc = "Number of SSE/SSE2 scalar single precision compute instructions retired (does not include AND, OR, XOR)", .pme_ucode = 0x01 }, { .pme_uname = "PACKED_DOUBLE", .pme_udesc = "Number of SSE/SSE2 packed double precision compute instructions retired (does not include AND, OR, XOR)", .pme_ucode = 0x02 }, { .pme_uname = "SCALAR_DOUBLE", .pme_udesc = "Number of SSE/SSE2 scalar double precision compute instructions retired (does not include AND, OR, XOR)", .pme_ucode = 0x03 } }, .pme_numasks = 4 }, { .pme_code = 0xDA, .pme_name = "FUSED_UOPS", .pme_desc = "fused uops retired", .pme_umasks = { { .pme_uname = "ALL", .pme_udesc = "All fused uops retired", .pme_ucode = 0x00 }, { .pme_uname = "LOADS", .pme_udesc = "Fused load uops retired", .pme_ucode = 0x01 }, { .pme_uname = "STORES", .pme_udesc = "Fused store uops retired", .pme_ucode = 0x02 }, }, .pme_numasks = 3 }, { .pme_code = 0xDB, .pme_name = "UNFUSION", .pme_desc = "Number of unfusion events in the ROB (due to exception)" }, { .pme_code = 0xE0, .pme_name = "BR_INSTR_DECODED", .pme_desc = "Branch instructions decoded" }, { .pme_code = 0xE2, .pme_name = "BTB_MISSES", .pme_desc = "Number of branches the BTB did not produce a prediction" }, { .pme_code = 0xE4, .pme_name = "BR_BOGUS", .pme_desc = "Number of bogus branches" }, { .pme_code = 0xE6, .pme_name = "BACLEARS", .pme_desc = "Number of BAClears asserted" }, { .pme_code = 0xF0, .pme_name = "PREF_RQSTS_UP", .pme_desc = "Number of hardware prefetch requests issued in forward streams" }, { .pme_code = 0xF8, .pme_name = "PREF_RQSTS_DN", .pme_desc = "Number of hardware prefetch requests issued in backward streams" } }; #define PME_COREDUO_UNHALTED_CORE_CYCLES 0 #define PME_COREDUO_INSTRUCTIONS_RETIRED 2 #define PME_COREDUO_EVENT_COUNT
(sizeof(coreduo_pe)/sizeof(pme_coreduo_entry_t))
papi-papi-7-2-0-t/src/libperfnec/lib/crayx2_events.h
/* * Copyright (c) 2007 Cray Inc. * Contributed by Steve Kaufmann based on code from * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/ #ifndef __CRAYX2_EVENTS_H__ #define __CRAYX2_EVENTS_H__ 1 #include "pfmlib_crayx2_priv.h" /* ***************************************************************** ******* THIS TABLE IS GENERATED AUTOMATICALLY ******* MODIFICATIONS REQUIRED FOR THE EVENT NAMES ******* OR EVENT DESCRIPTIONS SHOULD BE MADE TO ******* THE TEXT FILE AND THE TABLE REGENERATED ******* Sat Nov 10 14:40:30 CST 2007 ***************************************************************** */ static pme_crayx2_entry_t crayx2_pe[ ] = { /* P Counter 0 Event 0 */ { .pme_name = "CYCLES", .pme_desc = "Cycles.", .pme_code = 0, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 0 Event 1 */ { .pme_name = "CYCLES", .pme_desc = "Cycles.", .pme_code = 1, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 0 Event 2 */ { .pme_name = "CYCLES", .pme_desc = "Cycles.", .pme_code = 2, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 0 Event 3 */ { .pme_name = "CYCLES", .pme_desc = "Cycles.", .pme_code = 3, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 1 Event 0 */ { .pme_name = "INST_GRAD", .pme_desc = "Number of instructions graduated.", .pme_code = 4, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 1, 
.pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 1 Event 1 */ { .pme_name = "INST_GRAD", .pme_desc = "Number of instructions graduated.", .pme_code = 5, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 1 Event 2 */ { .pme_name = "INST_GRAD", .pme_desc = "Number of instructions graduated.", .pme_code = 6, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 1 Event 3 */ { .pme_name = "INST_GRAD", .pme_desc = "Number of instructions graduated.", .pme_code = 7, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 2 Event 0 */ { .pme_name = "INST_DISPATCH", .pme_desc = "Number of instructions dispatched.", .pme_code = 8, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 2 Event 1 */ { .pme_name = "ITLB_MISS", .pme_desc = "Number of Instruction TLB misses.", .pme_code = 9, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 2 Event 2 */ { .pme_name = "JB_CORRECT", .pme_desc = "Number of jumps and 
branches predicted correctly.", .pme_code = 10, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 2 Event 3 */ { .pme_name = "STALL_VU_FUG1", .pme_desc = "CPs VU stalled waiting for FUG 1.", .pme_code = 11, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 3 Event 0 */ { .pme_name = "INST_SYNCS", .pme_desc = "Number of synchronization instructions graduated g=02.", .pme_code = 12, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 3 Event 1 */ { .pme_name = "INST_GSYNCS", .pme_desc = "Number of Gsync instructions graduated g=02 & f=0-3.", .pme_code = 13, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 3 Event 2 */ { .pme_name = "STALL_DU_ICACHE", .pme_desc = "CPs dispatch stalled waiting for instruction from Icache.", .pme_code = 14, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 3 Event 3 */ { .pme_name = "STALL_VU_FUG2", .pme_desc = "CPs VU stalled waiting for FUG 2.", .pme_code = 15, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 3, .pme_event = 3, .pme_chipno 
= 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 4 Event 0 */ { .pme_name = "INST_AMO", .pme_desc = "Number of AMO instructions graduated g=04.", .pme_code = 16, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 4 Event 1 */ { .pme_name = "ICACHE_FETCH", .pme_desc = "Number of instruction fetch requests to memory.", .pme_code = 17, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 4 Event 2 */ { .pme_name = "STALL_DU_BRANCH_PRED", .pme_desc = "CPs Dispatch stalled waiting for branch prediction register.", .pme_code = 18, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 4 Event 3 */ { .pme_name = "STALL_VU_FUG3", .pme_desc = "CPs VU stalled waiting for FUG 3.", .pme_code = 19, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 5 Event 0 */ { .pme_name = "INST_A", .pme_desc = "Number of A register instructions graduated g=05,40,42,43.", .pme_code = 20, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 5 Event 1 */ { 
.pme_name = "ICACHE_HIT", .pme_desc = "Number of Icache hits.", .pme_code = 21, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 5 Event 2 */ { .pme_name = "STALL_DU_AREG", .pme_desc = "CPs instruction dispatch stalled waiting for free A register.", .pme_code = 22, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 5 Event 3 */ { .pme_name = "STALL_VU", .pme_desc = "CPs VU is stalled with a valid instruction.", .pme_code = 23, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 6 Event 0 */ { .pme_name = "INST_S_INT", .pme_desc = "Number of S register integer instructions graduated g=60,62 & t1=1,63.", .pme_code = 24, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 6 Event 1 */ { .pme_name = "INST_MSYNCS", .pme_desc = "Number of Msync instructions graduated g=02 & f=20-22.", .pme_code = 25, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 6 Event 2 */ { .pme_name = "STALL_DU_ACT_LIST_FULL", .pme_desc = "CPs dispatch stalled waiting for active list entry.", .pme_code = 26, .pme_flags = 0x0, 
.pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 6 Event 3 */ { .pme_name = "STALL_VU_NO_INST", .pme_desc = "CPs VU has no valid instruction.", .pme_code = 27, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 7 Event 0 */ { .pme_name = "INST_S_FP", .pme_desc = "Number of S register FP instructions graduated g=62 & t1=0.", .pme_code = 28, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 7 Event 1 */ { .pme_name = "STLB_MISS", .pme_desc = "Number of Scalar TLB misses.", .pme_code = 29, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 7 Event 2 */ { .pme_name = "STALL_DU_SREG", .pme_desc = "CPs instruction dispatch stalled waiting for free S register.", .pme_code = 30, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 7 Event 3 */ { .pme_name = "STALL_VU_VR", .pme_desc = "CPs VU is stalled waiting for busy V Reg.", .pme_code = 31, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = 
PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 8 Event 0 */ { .pme_name = "INST_MISC", .pme_desc = "Number of Misc. scalar instructions graduated g=00, 01, 03, 06, 34.", .pme_code = 32, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 8 Event 1 */ { .pme_name = "VTLB_MISS", .pme_desc = "Number of vector TLB misses.", .pme_code = 33, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 8 Event 2 */ { .pme_name = "STALL_DU_INST", .pme_desc = "CPs dispatch stalled due to an instruction such as a Gsync or Lsync FP that stops dispatch until it executes.", .pme_code = 34, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 8 Event 3 */ { .pme_name = "STALL_VLSU_NO_INST", .pme_desc = "CPs VLSU has no valid instruction.", .pme_code = 35, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 9 Event 0 */ { .pme_name = "INST_JB", .pme_desc = "Number of Jump and Branch instructions graduated g=50-57, 70-76.", .pme_code = 36, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 9 Event 1 
*/ { .pme_name = "ICACHE_MISS", .pme_desc = "Number of Icache misses.", .pme_code = 37, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 9 Event 2 */ { .pme_name = "STALL_GRAD", .pme_desc = "CPs no instructions graduate for any reason.", .pme_code = 38, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 9 Event 3 */ { .pme_name = "STALL_VLSU_LB", .pme_desc = "CPs VLSU stalled waiting for load buffers (LB).", .pme_code = 39, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 10 Event 0 */ { .pme_name = "INST_MEM", .pme_desc = "Number of A and S register load and store instructions graduated g=41, 44-47, 61, 64-67.", .pme_code = 40, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 10 Event 1 */ { .pme_name = "ICACHE_HIT_PEND", .pme_desc = "Number of Icache hits to blocks with allocations pending.", .pme_code = 41, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 10 Event 2 */ { .pme_name = "STALL_GRAD_NO_INST", .pme_desc = "CPs no instructions graduated due to empty active list.", .pme_code = 42, 
.pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 10 Event 3 */ { .pme_name = "STALL_VLSU_SB", .pme_desc = "CPs VLSU stalled waiting for store buffer (SB).", .pme_code = 43, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 11 Event 0 */ { .pme_name = "INST_VFUG1", .pme_desc = "Number of vector FUG 1 instructions graduated g=20-27, f=0-7,60-77 Add, sub, compare.", .pme_code = 44, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 11 Event 1 */ { .pme_name = "TLB_MISS", .pme_desc = "Total number of TLB misses including ITLB, STLB, and VTLB.", .pme_code = 45, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 11 Event 2 */ { .pme_name = "STALL_GRAD_AX_INST", .pme_desc = "CPs no instructions graduate and an A FUG instruction is at the head of the active list g=5, 40, 42, 43.", .pme_code = 46, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 11 Event 3 */ { .pme_name = "STALL_VLSU_RB", .pme_desc = "CPs VLSU stalled waiting for request buffer (RB).", .pme_code = 47, .pme_flags = 0x0, .pme_numasks = 
0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 12 Event 0 */ { .pme_name = "INST_VFUG2", .pme_desc = "Number of vector FUG 2 instructions graduated g=20-27, f=30-37 (multiply, shift).", .pme_code = 48, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 12 Event 1 */ { .pme_name = "DCACHE_HIT", .pme_desc = "Number of A or S loads that hit in the Dcache.", .pme_code = 49, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 12 Event 2 */ { .pme_name = "STALL_GRAD_SX_INST", .pme_desc = "CPs no instructions graduate and an S FUG instruction is at the head of the active list g=60, 62, 63.", .pme_code = 50, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 12 Event 3 */ { .pme_name = "STALL_VLSU_VM", .pme_desc = "CPs VLSU stalled waiting for VU vector mask (VM).", .pme_code = 51, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 13 Event 0 */ { .pme_name = "INST_VFUG3", .pme_desc = "Number of vector FUG 3 instructions graduated g=20-27, f=10-27, 40-57, 77 div, sqrt, abs, cpsign, compress, merge, logical, bmm.", .pme_code = 52, 
.pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 13 Event 1 */ { .pme_name = "DCACHE_MISS", .pme_desc = "Number of A or S loads that miss in the Dcache.", .pme_code = 53, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 13 Event 2 */ { .pme_name = "STALL_GRAD_FP_INST", .pme_desc = "CPs no instructions graduate and an S FP instruction is at the head of the active list g=62, t1=0.", .pme_code = 54, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 13 Event 3 */ { .pme_name = "STALL_VLSU_SREF", .pme_desc = "CPs VLSU stalled waiting for prior scalar instruction reference sent.", .pme_code = 55, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 14 Event 0 */ { .pme_name = "VOPS_EXT_FUG3", .pme_desc = "Number of vector FUG 3 external operations g=20-27 f=25,57,77 compress, merge, bmm.", .pme_code = 56, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 14 Event 1 */ { .pme_name = "DCACHE_HIT_PEND", .pme_desc = "Number of scalar loads that hit in the Dcache and in the FOQ and the load is merged with 
a pending allocation.", .pme_code = 57, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 14 Event 2 */ { .pme_name = "STALL_GRAD_LOAD_INST", .pme_desc = "CPs no instructions graduate and a scalar load is at the head of the active list.", .pme_code = 58, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 14 Event 3 */ { .pme_name = "STALL_VLSU_INDEX", .pme_desc = "CPs VLSU stalled waiting for busy scatter or gather index register.", .pme_code = 59, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 15 Event 0 */ { .pme_name = "VOPS_LOG_FUG3", .pme_desc = "Number of vector FUG 3 logical operations.", .pme_code = 60, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 15 Event 1 */ { .pme_name = "DCACHE_HIT_WORD", .pme_desc = "Number of scalar loads that hit in the Dcache and hit in the FOQ and were not merged with a pending allocation.", .pme_code = 61, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 15 Event 2 */ { .pme_name = "STALL_GRAD_STORE_INST", .pme_desc = "CPs no instructions graduate and 
a scalar store is at the head of the active list.", .pme_code = 62, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 15 Event 3 */ { .pme_name = "STALL_VLSU_FOM", .pme_desc = "CPs VLSU stalled in forced order mode.", .pme_code = 63, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 16 Event 0 */ { .pme_name = "INST_V", .pme_desc = "Number of elemental vector instructions graduated g=20-27, 30-33.", .pme_code = 64, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 16 Event 1 */ { .pme_name = "INST_V_INT", .pme_desc = "Number of elemental vector integer instructions graduated g=20-27 & t1=", .pme_code = 65, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 16 Event 2 */ { .pme_name = "INST_V_FP", .pme_desc = "Number of elemental vector FP instructions graduated g=20-27 & t1=0.", .pme_code = 66, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 16 Event 3 */ { .pme_name = "INST_V_MEM", .pme_desc = "Number of elemental vector memory instructions graduated g=30-33.", .pme_code = 67, .pme_flags = 
0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 17 Event 0 */ { .pme_name = "VOPS_VL", .pme_desc = "Inst_V * Current VL.", .pme_code = 68, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 17 Event 1 */ { .pme_name = "DCACHE_INVAL_V", .pme_desc = "Number of Dcache invalidates due to vector stores.", .pme_code = 69, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 17 Event 2 */ { .pme_name = "VOPS_VL_32-BIT", .pme_desc = "Inst_V * Current VL for 32-bit operations only.", .pme_code = 70, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 17 Event 3 */ { .pme_name = "STALL_VLSU", .pme_desc = "Stall vector load store for any reason.", .pme_code = 71, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 18 Event 0 */ { .pme_name = "VOPS_INT_ADD", .pme_desc = "Number of selected vector integer add operations g=20-27 & f=0-3 & t1=", .pme_code = 72, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = 
PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 18 Event 1 */ { .pme_name = "DCACHE_INVAL_L2", .pme_desc = "Number of Dcache invalidates from L2 cache.", .pme_code = 73, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 18 Event 2 */ { .pme_name = "STALL_GRAD_XFER_INST", .pme_desc = "Number of CPs no instruction graduates and an A to S or S to A move is at the head of the active list.", .pme_code = 74, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 18 Event 3 */ { .pme_name = "STALL_VU_VM", .pme_desc = "CPs VU stalled waiting for vector mask.", .pme_code = 75, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 19 Event 0 */ { .pme_name = "VOPS_FP_ADD", .pme_desc = "Number of selected vector FP add operations g=20-27 & f=0-3 & t1=0.", .pme_code = 76, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 19 Event 1 */ { .pme_name = "DCACHE_INVALIDATE", .pme_desc = "Total Number of Dcache invalidates.", .pme_code = 77, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P 
Counter 19 Event 2 */ { .pme_name = "STALL_GRAD_VXFER_INST", .pme_desc = "CPs no instruction graduates and a V to A or V to S move is at the head of the active list.", .pme_code = 78, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 19 Event 3 */ { .pme_name = "STALL_VU_VR_MEM", .pme_desc = "CPs VU is stalled waiting on a busy vector register being loaded from memory.", .pme_code = 79, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 20 Event 0 */ { .pme_name = "VOPS_INT_LOG", .pme_desc = "Number of selected vector integer logical operations g=20-27 & f=10-27 & t1=1.", .pme_code = 80, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 20 Event 1 */ { .pme_name = "BRANCH_PRED", .pme_desc = "Number of branches predicted.", .pme_code = 81, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 20 Event 2 */ { .pme_name = "STALL_GRAD_VLSU_INST", .pme_desc = "Number of CPs no instruction graduates and a vector load, store, or AMO instruction is at the head of the active list.", .pme_code = 82, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, 
.pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 20 Event 3 */ { .pme_name = "STALL_VU_TLB", .pme_desc = "CPs VU stalled waiting for a memory translation.", .pme_code = 83, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 21 Event 0 */ { .pme_name = "VOPS_FP_DIV", .pme_desc = "Number of selected vector FP divide and sqrt operations g=20-27 & f=10-11 & t1=0.", .pme_code = 84, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 21 Event 1 */ { .pme_name = "BRANCH_CORRECT", .pme_desc = "Number of branches predicted correctly.", .pme_code = 85, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 21 Event 2 */ { .pme_name = "STALL_SLSQ_DEST", .pme_desc = "SLS issue stall for FOQ, PARB, ORB full or Lsync vs active.", .pme_code = 86, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 21 Event 3 */ { .pme_name = "STALL_VLSU_VK_PORT", .pme_desc = "CPs VLSU stalled waiting for scatter or gather index register read port.", .pme_code = 87, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 22 Event 0 */ { 
.pme_name = "VOPS_INT_SHIFT", .pme_desc = "Number of selected vector integer shift operations g=20-27 & f=30-37 & t1=", .pme_code = 88, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 22 Event 1 */ { .pme_name = "JTB_PRED", .pme_desc = "Number of jumps predicted g=57 & f=0,20.", .pme_code = 89, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 22 Event 2 */ { .pme_name = "STALL_GRAD_ARQ_DEST", .pme_desc = "Stall arq issue due to vdispatch, control unit, or A to S full.", .pme_code = 90, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 22 Event 3 */ { .pme_name = "STALL_VLSU_ADR_PORT", .pme_desc = "CPs VLSU stalled waiting for address read port.", .pme_code = 91, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 23 Event 0 */ { .pme_name = "VOPS_FP_MULT", .pme_desc = "Number of selected vector FP multiply operations g=20-27 & f=30-37 & t1=0.", .pme_code = 92, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 23 Event 1 */ { .pme_name = "JTB_CORRECT", .pme_desc = "Number of jumps predicted 
correctly g=57 & f=0,20.", .pme_code = 93, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 23 Event 2 */ { .pme_name = "STALL_SRQ_DEST", .pme_desc = "Stall srq issue due to vdispatch or S to A full.", .pme_code = 94, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 23 Event 3 */ { .pme_name = "STALL_VLSU_MISC", .pme_desc = "CPs VLSU stalled due to miscellaneous instructions.", .pme_code = 95, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 24 Event 0 */ { .pme_name = "VOPS_LOAD_INDEX", .pme_desc = "Number of selected vector load indexed references g=30-33 & f2=1 & f0=0.", .pme_code = 96, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 24 Event 1 */ { .pme_name = "VOPS_INT_MISC", .pme_desc = "Number of selected vector integer misc. 
operations g=20-27 & f=40-77 & t1=", .pme_code = 97, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 24 Event 2 */ { .pme_name = "INST_LSYNCVS", .pme_desc = "Number of LsyncVS instructions graduated.", .pme_code = 98, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 24 Event 3 */ { .pme_name = "VOPS_VL_64-BIT", .pme_desc = "Inst_V * Current VL for 64-bit operations only.", .pme_code = 99, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 24, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 25 Event 0 */ { .pme_name = "VOPS_STORE_INDEX", .pme_desc = "Number of selected vector store indexed references g=30-33 & f2=1 & f0=1", .pme_code = 100, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 25 Event 1 */ { .pme_name = "JRS_PRED", .pme_desc = "Number of return jumps predicted g=57, f=40.", .pme_code = 101, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 25, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 25 Event 2 */ { .pme_name = "STALL_SLSQ_PARB", .pme_desc = "Number of CPs SLS issue stalled due to PARB full.", .pme_code = 102, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = 
PME_CRAYX2_CHIP_CPU, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 25 Event 3 */ { .pme_name = "", .pme_desc = "", .pme_code = 103, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 26 Event 0 */ { .pme_name = "VOPS_LOADS", .pme_desc = "Number of selected vector load references g=30-33 & f0=0.", .pme_code = 104, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 26 Event 1 */ { .pme_name = "JRS_CORRECT", .pme_desc = "Number of return jumps predicted correctly g=57, f=40.", .pme_code = 105, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 26 Event 2 */ { .pme_name = "STALL_SLSQ_ORB", .pme_desc = "Number of CPs SLS issue stalled due to all ORB entries in use.", .pme_code = 106, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 26 Event 3 */ { .pme_name = "STALL_VU_MISC", .pme_desc = "CPs VU stalled due to miscellaneous instructions.", .pme_code = 107, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 26, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = 
PME_CRAYX2_CPU_CHIPS }, /* P Counter 27 Event 0 */ { .pme_name = "VOPS_STORE", .pme_desc = "Number of selected vector store references g=30-33 & f0=", .pme_code = 108, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 27 Event 1 */ { .pme_name = "INST_MEM_ALLOC", .pme_desc = "Number of A and S register memory instructions that allocate.", .pme_code = 109, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 27, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 27 Event 2 */ { .pme_name = "STALL_SLSQ_FOQ", .pme_desc = "Number of CPs SLS issue stalled due to full FOQ.", .pme_code = 110, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 27 Event 3 */ { .pme_name = "STALL_VDU_NO_INST_VU", .pme_desc = "CPs VDU and VU have no valid instructions.", .pme_code = 111, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 28 Event 0 */ { .pme_name = "VOPS_LOAD_STRIDE", .pme_desc = "Number of selected vector load references that were stride >2 or <-2.", .pme_code = 112, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 28, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 28 Event 1 */ { .pme_name = "INST_SYSCALL", 
.pme_desc = "Number of syscall instructions graduated g=01.", .pme_code = 113, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 28, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 28 Event 2 */ { .pme_name = "STALL_SLSQ_LSYNC_VS", .pme_desc = "Number of CPs SLS issue is stalled due to active Lsync vs instruction.", .pme_code = 114, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 28, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 28 Event 3 */ { .pme_name = "STALL_VDU_SOP_VU", .pme_desc = "Number of CPs vector issue has no instructions and the next instruction is waiting on an S reg operand.", .pme_code = 115, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 28, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 29 Event 0 */ { .pme_name = "VOPS_STORE_STRIDE", .pme_desc = "Number of selected vector store references that were stride >2 or <-2.", .pme_code = 116, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 29, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 29 Event 1 */ { .pme_name = "", .pme_desc = "", .pme_code = 117, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 29, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 29 Event 2 */ { .pme_name = "", .pme_desc = "", .pme_code = 118, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = 
PME_CRAYX2_CHIP_CPU, .pme_ctr = 29, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 29 Event 3 */ { .pme_name = "STALL_VDU_NO_INST_VLSU", .pme_desc = "CPs VDU and VLSU have no valid instructions.", .pme_code = 119, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 29, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 30 Event 0 */ { .pme_name = "VOPS_LOAD_ALLOC", .pme_desc = "Number of selected vector load references that were marked allocate (cache line requests count as 1).", .pme_code = 120, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 30, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 30 Event 1 */ { .pme_name = "INST_LOAD", .pme_desc = "Number of A or S memory loads g=44, 45, 41 & f0=0, 64, 65, 61 & f0=0.", .pme_code = 121, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 30, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 30 Event 2 */ { .pme_name = "EXCEPTIONS_TAKEN", .pme_desc = "Taken exception count.", .pme_code = 122, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 30, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 30 Event 3 */ { .pme_name = "STALL_VDU_SCM_VLSU", .pme_desc = "CPs VDU stalled waiting for scalar commit and VLSU has no valid instruction.", .pme_code = 123, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 30, 
.pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 31 Event 0 */ { .pme_name = "VOPS_STORE_ALLOC", .pme_desc = "Number of selected vector store references that were marked allocate (cache line requests count as 1).", .pme_code = 124, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 31, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 31 Event 1 */ { .pme_name = "BRANCH_TAKEN", .pme_desc = "Number of taken branches.", .pme_code = 125, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 31, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 31 Event 2 */ { .pme_name = "INST_LSYNCSV", .pme_desc = "Number of graduated Lsync SV instructions.", .pme_code = 126, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 31, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* P Counter 31 Event 3 */ { .pme_name = "STALL_VDU_SCM_VU", .pme_desc = "CPs VDU stalled waiting for scalar commit and VU has no valid instruction.", .pme_code = 127, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CPU, .pme_ctr = 31, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CPU_PMD_BASE, .pme_nctrs = PME_CRAYX2_CPU_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CPU_CHIPS }, /* C Counter 0 Event 0 */ { .pme_name = "REQUESTS", .pme_desc = "Processor requests processed.", .pme_code = 128, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs =
PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 0 Event 1 */ { .pme_name = "L2_MISSES", .pme_desc = "Cache line allocations.", .pme_code = 129, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 0 Event 2 */ { .pme_name = "M_OUT_BUSY", .pme_desc = "Cycles W chip output port busy.", .pme_code = 130, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 0 Event 3 */ { .pme_name = "REPLAYED", .pme_desc = "Requests sent to replay queue.", .pme_code = 131, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 1 Event 0 */ { .pme_name = "ALLOC_REQUESTS", .pme_desc = "Allocating requests (Read, ReadUC, ReadShared, ReadUCShared, ReadMod, SWrite, VWrite).", .pme_code = 132, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 1 Event 1 */ { .pme_name = "", .pme_desc = "", .pme_code = 133, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 1 Event 2 */ { .pme_name = "M_OUT_BLOCK", .pme_desc = "Cycles W chip output port blocked (something to send
but no flow control credits).", .pme_code = 134, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 1 Event 3 */ { .pme_name = "LS/VS", .pme_desc = "Replayed Ls or Vs Requests sent to the replay queue.", .pme_code = 135, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 2 Event 0 */ { .pme_name = "DWORDS_ALLOCATED", .pme_desc = "Dwords written into L2 from L3 (excluding updates).", .pme_code = 136, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 2 Event 1 */ { .pme_name = "", .pme_desc = "", .pme_code = 137, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 2 Event 2 */ { .pme_name = "NW_OUT_BUSY", .pme_desc = "Cycles NIF output port busy.", .pme_code = 138, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 2 Event 3 */ { .pme_name = "REPLAY_PENDING", .pme_desc = "Requests sent to replay queue because the line was in PendingReq state.", .pme_code = 139, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 2, .pme_event = 3, 
.pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 3 Event 0 */ { .pme_name = "DWORDS_EVICTED", .pme_desc = "Dwords written back to L3.", .pme_code = 140, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 3 Event 1 */ { .pme_name = "CACHE_LINE_EVICTIONS", .pme_desc = "Cache lines evicted due to new allocations.", .pme_code = 141, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 3 Event 2 */ { .pme_name = "NW_OUT_BLOCK", .pme_desc = "Cycles NIF output port blocked (something to send but no flow control credits).", .pme_code = 142, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 3 Event 3 */ { .pme_name = "REPLAY_ALLOC", .pme_desc = "Requests sent to replay queue because a line could not be allocated due to all ways pending.", .pme_code = 143, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 4 Event 0 */ { .pme_name = "ALLOC_WRITE_TO_L2", .pme_desc = "Dwords written to L2 by local allocating write requests.", .pme_code = 144, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 0, .pme_base = 
PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 4 Event 1 */ { .pme_name = "DROPS", .pme_desc = "Drops sent to directory.", .pme_code = 145, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 4 Event 2 */ { .pme_name = "", .pme_desc = "", .pme_code = 146, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 4 Event 3 */ { .pme_name = "REPLAY_WAKEUPS", .pme_desc = "Replay queue wakeups.", .pme_code = 147, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 5 Event 0 */ { .pme_name = "NON_ALLOC_WRITE_TO_L2", .pme_desc = "Dwords written to L2 by local non-allocating write requests.", .pme_code = 148, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 5 Event 1 */ { .pme_name = "WRITE_BACKS", .pme_desc = "WriteBacks sent to directory.", .pme_code = 149, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 5 Event 2 */ { .pme_name = "", .pme_desc = "", .pme_code = 150, .pme_flags = 0x0, .pme_numasks 
= 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 5 Event 3 */ { .pme_name = "REPLAY_MATCHES", .pme_desc = "Requests matched during replay wakeups (Replay_Matches/Replay_Wakeups=avg. number of matches per wakeup).", .pme_code = 151, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 6 Event 0 */ { .pme_name = "NON_ALLOC_WRITE_TO_L3", .pme_desc = "Dwords written to L3 by local non-allocating write requests.", .pme_code = 152, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 6 Event 1 */ { .pme_name = "FWD_REQ", .pme_desc = "Forwarded requests received (FlushReq, FwdRead, FwdReadShared, FwdGet).", .pme_code = 153, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 6 Event 2 */ { .pme_name = "", .pme_desc = "", .pme_code = 154, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 6 Event 3 */ { .pme_name = "", .pme_desc = "", .pme_code = 155, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, 
.pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 7 Event 0 */ { .pme_name = "ALLOC_READ_FROM_L2", .pme_desc = "Dwords read from L2 by local allocating read requests.", .pme_code = 156, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 7 Event 1 */ { .pme_name = "FWD_READ_ALL", .pme_desc = "FwdReads and FwdReadShared received.", .pme_code = 157, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 7 Event 2 */ { .pme_name = "STALL_RP_FULL_NW", .pme_desc = "Cycles NW request queue stalled due to replay queue full.", .pme_code = 158, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 7 Event 3 */ { .pme_name = "ALLOC_NO_FILL", .pme_desc = "ReadMods sent to directory when the entire line is dirty.", .pme_code = 159, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 8 Event 0 */ { .pme_name = "NON_ALLOC_READ_FROM_L2", .pme_desc = "Dwords read from L2 by local non-allocating read requests.", .pme_code = 160, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = 
PME_CRAYX2_CACHE_CHIPS }, /* C Counter 8 Event 1 */ { .pme_name = "FWD_READ_SHARED_RECV", .pme_desc = "FwdReadShareds received.", .pme_code = 161, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 8 Event 2 */ { .pme_name = "STALL_RP_FULL_PROC", .pme_desc = "Cycles Ls/Vs request queue stalled due to replay queue full.", .pme_code = 162, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 8 Event 3 */ { .pme_name = "UPGRADES", .pme_desc = "ReadMods sent to directory when the line was currently in ShClean state.", .pme_code = 163, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 9 Event 0 */ { .pme_name = "NON_ALLOC_READ_FROM_L3", .pme_desc = "Dwords read from L3 by local non-allocating read requests.", .pme_code = 164, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 9 Event 1 */ { .pme_name = "FWD_GET_RECV", .pme_desc = "FwdGets received.", .pme_code = 165, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 9 Event 2 */ { .pme_name = "STALL_TB_FULL", .pme_desc 
= "Cycles bank request queue stalled due to transient buffer full.", .pme_code = 166, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 9 Event 3 */ { .pme_name = "", .pme_desc = "", .pme_code = 167, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 10 Event 0 */ { .pme_name = "NETWORK_WRITE_TO_L2", .pme_desc = "Dwords written to L2 by remote write requests.", .pme_code = 168, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 10 Event 1 */ { .pme_name = "FLUSH_REQ", .pme_desc = "FlushReqs received.", .pme_code = 169, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 10 Event 2 */ { .pme_name = "STALL_VWRITENA", .pme_desc = "Cycles bank request queue stalled due to VWriteNA bit being set.", .pme_code = 170, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 10 Event 3 */ { .pme_name = "", .pme_desc = "", .pme_code = 171, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 0, .pme_base = 
PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 11 Event 0 */ { .pme_name = "NETWORK_WRITE_TO_L3", .pme_desc = "Dwords written to L3 by remote write requests.", .pme_code = 172, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 11 Event 1 */ { .pme_name = "UPDATES_RECV", .pme_desc = "Updates received.", .pme_code = 173, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 11 Event 2 */ { .pme_name = "PROT_ENGINE_IDLE_NO_REQUEST", .pme_desc = "Cycles protocol engine idle due to no new requests to process.", .pme_code = 174, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 11 Event 3 */ { .pme_name = "READ_DATA_TO_VECTOR_UNIT_PIPE_0_3", .pme_desc = "Swords delivered to vector unit via pipes 0 - 3.", .pme_code = 175, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 12 Event 0 */ { .pme_name = "NETWORK_READ_FROM_L2", .pme_desc = "Dwords read from L2 by remote read requests.", .pme_code = 176, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = 
PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 12 Event 1 */ { .pme_name = "", .pme_desc = "", .pme_code = 177, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 12 Event 2 */ { .pme_name = "UPDATE_NACK_SENT", .pme_desc = "UpdateNacks sent.", .pme_code = 178, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 12 Event 3 */ { .pme_name = "READ_DATA_TO_VECTOR_UNIT_PIPE_4_7", .pme_desc = "Swords delivered to vector unit via pipes 4 - 7.", .pme_code = 179, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 13 Event 0 */ { .pme_name = "NETWORK_READ_FROM_L3", .pme_desc = "Dwords read from L3 by remote read requests.", .pme_code = 180, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 13 Event 1 */ { .pme_name = "NACKS_SENT", .pme_desc = "FlushAcks and UpdateNacks sent (these happen when there is a race between a forwarded request and an eviction by the current owner).", .pme_code = 181, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C
Counter 13 Event 2 */ { .pme_name = "INVAL_RECV", .pme_desc = "Inval packets received from the directory.", .pme_code = 182, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 13 Event 3 */ { .pme_name = "READ_DATA_TO_SCALAR_UNIT", .pme_desc = "Dwords delivered to scalar unit.", .pme_code = 183, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 14 Event 0 */ { .pme_name = "REMOTE_READS", .pme_desc = "Dwords read from remote nodes.", .pme_code = 184, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 14 Event 1 */ { .pme_name = "LOCAL_INVAL", .pme_desc = "Local writes that cause invals of other Dcaches.", .pme_code = 185, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 14 Event 2 */ { .pme_name = "MARKED_REQS", .pme_desc = "Memory requests sent with TID 0.", .pme_code = 186, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 14 Event 3 */ { .pme_name = "READ_DATA_TO_ICACHE", .pme_desc = "Dwords delivered to Icache.", .pme_code = 187, .pme_flags = 0x0, 
.pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 15 Event 0 */ { .pme_name = "REMOTE_WRITES", .pme_desc = "Dwords written to remote nodes.", .pme_code = 188, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 15 Event 1 */ { .pme_name = "DCACHE_INVAL_EVENTS", .pme_desc = "State transitions (evictions, directory Invals or forwards, processor writes) requiring Dcache invals.", .pme_code = 189, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 15 Event 2 */ { .pme_name = "MARKED_CYCLES", .pme_desc = "Cycles with a TID 0 request outstanding.", .pme_code = 190, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* C Counter 15 Event 3 */ { .pme_name = "READ_DATA_TO_NIF", .pme_desc = "Dwords delivered to NIF.", .pme_code = 191, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_CACHE, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_CACHE_PMD_BASE, .pme_nctrs = PME_CRAYX2_CACHE_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_CACHE_CHIPS }, /* M Counter 0 Event 0 */ { .pme_name = "W_IN_IDLE_0@0", .pme_desc = "Wclk cycles BW2MD input port 0 is idle (no flits in either VC0 or VC2). 
(M chip 0)", .pme_code = 192, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_0@1", .pme_desc = "Wclk cycles BW2MD input port 0 is idle (no flits in either VC0 or VC2). (M chip 1)", .pme_code = 193, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_0@2", .pme_desc = "Wclk cycles BW2MD input port 0 is idle (no flits in either VC0 or VC2). (M chip 2)", .pme_code = 194, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_0@3", .pme_desc = "Wclk cycles BW2MD input port 0 is idle (no flits in either VC0 or VC2). (M chip 3)", .pme_code = 195, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_0@4", .pme_desc = "Wclk cycles BW2MD input port 0 is idle (no flits in either VC0 or VC2). (M chip 4)", .pme_code = 196, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_0@5", .pme_desc = "Wclk cycles BW2MD input port 0 is idle (no flits in either VC0 or VC2). 
(M chip 5)", .pme_code = 197, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_0@6", .pme_desc = "Wclk cycles BW2MD input port 0 is idle (no flits in either VC0 or VC2). (M chip 6)", .pme_code = 198, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_0@7", .pme_desc = "Wclk cycles BW2MD input port 0 is idle (no flits in either VC0 or VC2). (M chip 7)", .pme_code = 199, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_0@8", .pme_desc = "Wclk cycles BW2MD input port 0 is idle (no flits in either VC0 or VC2). (M chip 8)", .pme_code = 200, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_0@9", .pme_desc = "Wclk cycles BW2MD input port 0 is idle (no flits in either VC0 or VC2). (M chip 9)", .pme_code = 201, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_0@10", .pme_desc = "Wclk cycles BW2MD input port 0 is idle (no flits in either VC0 or VC2). 
(M chip 10)", .pme_code = 202, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_0@11", .pme_desc = "Wclk cycles BW2MD input port 0 is idle (no flits in either VC0 or VC2). (M chip 11)", .pme_code = 203, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_0@12", .pme_desc = "Wclk cycles BW2MD input port 0 is idle (no flits in either VC0 or VC2). (M chip 12)", .pme_code = 204, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_0@13", .pme_desc = "Wclk cycles BW2MD input port 0 is idle (no flits in either VC0 or VC2). (M chip 13)", .pme_code = 205, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_0@14", .pme_desc = "Wclk cycles BW2MD input port 0 is idle (no flits in either VC0 or VC2). (M chip 14)", .pme_code = 206, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_0@15", .pme_desc = "Wclk cycles BW2MD input port 0 is idle (no flits in either VC0 or VC2). 
(M chip 15)", .pme_code = 207, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 0 Event 1 */ { .pme_name = "STALL_REPLAY_FULL@0", .pme_desc = "Wclk cycles protocol engine request queue stalled due to replay queue full (sum of 4 engines). (M chip 0)", .pme_code = 208, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_REPLAY_FULL@1", .pme_desc = "Wclk cycles protocol engine request queue stalled due to replay queue full (sum of 4 engines). (M chip 1)", .pme_code = 209, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_REPLAY_FULL@2", .pme_desc = "Wclk cycles protocol engine request queue stalled due to replay queue full (sum of 4 engines). (M chip 2)", .pme_code = 210, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_REPLAY_FULL@3", .pme_desc = "Wclk cycles protocol engine request queue stalled due to replay queue full (sum of 4 engines). 
(M chip 3)", .pme_code = 211, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_REPLAY_FULL@4", .pme_desc = "Wclk cycles protocol engine request queue stalled due to replay queue full (sum of 4 engines). (M chip 4)", .pme_code = 212, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_REPLAY_FULL@5", .pme_desc = "Wclk cycles protocol engine request queue stalled due to replay queue full (sum of 4 engines). (M chip 5)", .pme_code = 213, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_REPLAY_FULL@6", .pme_desc = "Wclk cycles protocol engine request queue stalled due to replay queue full (sum of 4 engines). (M chip 6)", .pme_code = 214, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_REPLAY_FULL@7", .pme_desc = "Wclk cycles protocol engine request queue stalled due to replay queue full (sum of 4 engines). 
(M chip 7)", .pme_code = 215, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_REPLAY_FULL@8", .pme_desc = "Wclk cycles protocol engine request queue stalled due to replay queue full (sum of 4 engines). (M chip 8)", .pme_code = 216, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_REPLAY_FULL@9", .pme_desc = "Wclk cycles protocol engine request queue stalled due to replay queue full (sum of 4 engines). (M chip 9)", .pme_code = 217, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_REPLAY_FULL@10", .pme_desc = "Wclk cycles protocol engine request queue stalled due to replay queue full (sum of 4 engines). (M chip 10)", .pme_code = 218, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_REPLAY_FULL@11", .pme_desc = "Wclk cycles protocol engine request queue stalled due to replay queue full (sum of 4 engines). 
(M chip 11)", .pme_code = 219, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_REPLAY_FULL@12", .pme_desc = "Wclk cycles protocol engine request queue stalled due to replay queue full (sum of 4 engines). (M chip 12)", .pme_code = 220, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_REPLAY_FULL@13", .pme_desc = "Wclk cycles protocol engine request queue stalled due to replay queue full (sum of 4 engines). (M chip 13)", .pme_code = 221, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_REPLAY_FULL@14", .pme_desc = "Wclk cycles protocol engine request queue stalled due to replay queue full (sum of 4 engines). (M chip 14)", .pme_code = 222, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_REPLAY_FULL@15", .pme_desc = "Wclk cycles protocol engine request queue stalled due to replay queue full (sum of 4 engines). 
(M chip 15)", .pme_code = 223, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 0 Event 2 */ { .pme_name = "W_OUT_IDLE_0@0", .pme_desc = "Wclk cycles MD2BW output port 0 is idle (no flits flowing). (M chip 0)", .pme_code = 224, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_0@1", .pme_desc = "Wclk cycles MD2BW output port 0 is idle (no flits flowing). (M chip 1)", .pme_code = 225, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_0@2", .pme_desc = "Wclk cycles MD2BW output port 0 is idle (no flits flowing). (M chip 2)", .pme_code = 226, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_0@3", .pme_desc = "Wclk cycles MD2BW output port 0 is idle (no flits flowing). (M chip 3)", .pme_code = 227, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_0@4", .pme_desc = "Wclk cycles MD2BW output port 0 is idle (no flits flowing). 
(M chip 4)", .pme_code = 228, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_0@5", .pme_desc = "Wclk cycles MD2BW output port 0 is idle (no flits flowing). (M chip 5)", .pme_code = 229, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_0@6", .pme_desc = "Wclk cycles MD2BW output port 0 is idle (no flits flowing). (M chip 6)", .pme_code = 230, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_0@7", .pme_desc = "Wclk cycles MD2BW output port 0 is idle (no flits flowing). (M chip 7)", .pme_code = 231, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_0@8", .pme_desc = "Wclk cycles MD2BW output port 0 is idle (no flits flowing). (M chip 8)", .pme_code = 232, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_0@9", .pme_desc = "Wclk cycles MD2BW output port 0 is idle (no flits flowing). 
(M chip 9)", .pme_code = 233, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_0@10", .pme_desc = "Wclk cycles MD2BW output port 0 is idle (no flits flowing). (M chip 10)", .pme_code = 234, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_0@11", .pme_desc = "Wclk cycles MD2BW output port 0 is idle (no flits flowing). (M chip 11)", .pme_code = 235, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_0@12", .pme_desc = "Wclk cycles MD2BW output port 0 is idle (no flits flowing). (M chip 12)", .pme_code = 236, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_0@13", .pme_desc = "Wclk cycles MD2BW output port 0 is idle (no flits flowing). (M chip 13)", .pme_code = 237, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_0@14", .pme_desc = "Wclk cycles MD2BW output port 0 is idle (no flits flowing). 
(M chip 14)", .pme_code = 238, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_0@15", .pme_desc = "Wclk cycles MD2BW output port 0 is idle (no flits flowing). (M chip 15)", .pme_code = 239, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 0 Event 3 */ { .pme_name = "@0", .pme_desc = "", .pme_code = 240, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@1", .pme_desc = "", .pme_code = 241, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@2", .pme_desc = "", .pme_code = 242, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@3", .pme_desc = "", .pme_code = 243, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@4", .pme_desc = "", .pme_code = 244, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = 
PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@5", .pme_desc = "", .pme_code = 245, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@6", .pme_desc = "", .pme_code = 246, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@7", .pme_desc = "", .pme_code = 247, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@8", .pme_desc = "", .pme_code = 248, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@9", .pme_desc = "", .pme_code = 249, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@10", .pme_desc = "", .pme_code = 250, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { 
.pme_name = "@11", .pme_desc = "", .pme_code = 251, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@12", .pme_desc = "", .pme_code = 252, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@13", .pme_desc = "", .pme_code = 253, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@14", .pme_desc = "", .pme_code = 254, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@15", .pme_desc = "", .pme_code = 255, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 0, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 1 Event 0 */ { .pme_name = "W_IN_IDLE_1@0", .pme_desc = "Wclk cycles BW2MD input port 1 is idle (no flits in either VC0 or VC2). 
(M chip 0)", .pme_code = 256, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_1@1", .pme_desc = "Wclk cycles BW2MD input port 1 is idle (no flits in either VC0 or VC2). (M chip 1)", .pme_code = 257, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_1@2", .pme_desc = "Wclk cycles BW2MD input port 1 is idle (no flits in either VC0 or VC2). (M chip 2)", .pme_code = 258, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_1@3", .pme_desc = "Wclk cycles BW2MD input port 1 is idle (no flits in either VC0 or VC2). (M chip 3)", .pme_code = 259, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_1@4", .pme_desc = "Wclk cycles BW2MD input port 1 is idle (no flits in either VC0 or VC2). (M chip 4)", .pme_code = 260, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_1@5", .pme_desc = "Wclk cycles BW2MD input port 1 is idle (no flits in either VC0 or VC2). 
(M chip 5)", .pme_code = 261, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_1@6", .pme_desc = "Wclk cycles BW2MD input port 1 is idle (no flits in either VC0 or VC2). (M chip 6)", .pme_code = 262, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_1@7", .pme_desc = "Wclk cycles BW2MD input port 1 is idle (no flits in either VC0 or VC2). (M chip 7)", .pme_code = 263, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_1@8", .pme_desc = "Wclk cycles BW2MD input port 1 is idle (no flits in either VC0 or VC2). (M chip 8)", .pme_code = 264, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_1@9", .pme_desc = "Wclk cycles BW2MD input port 1 is idle (no flits in either VC0 or VC2). (M chip 9)", .pme_code = 265, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_1@10", .pme_desc = "Wclk cycles BW2MD input port 1 is idle (no flits in either VC0 or VC2). 
(M chip 10)", .pme_code = 266, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_1@11", .pme_desc = "Wclk cycles BW2MD input port 1 is idle (no flits in either VC0 or VC2). (M chip 11)", .pme_code = 267, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_1@12", .pme_desc = "Wclk cycles BW2MD input port 1 is idle (no flits in either VC0 or VC2). (M chip 12)", .pme_code = 268, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_1@13", .pme_desc = "Wclk cycles BW2MD input port 1 is idle (no flits in either VC0 or VC2). (M chip 13)", .pme_code = 269, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_1@14", .pme_desc = "Wclk cycles BW2MD input port 1 is idle (no flits in either VC0 or VC2). (M chip 14)", .pme_code = 270, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_1@15", .pme_desc = "Wclk cycles BW2MD input port 1 is idle (no flits in either VC0 or VC2). 
(M chip 15)", .pme_code = 271, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 1 Event 1 */ { .pme_name = "STALL_TDB_FULL@0", .pme_desc = "Wclk cycles protocol engine request queue stalled due to transient directory buffer full (sum of 4 engines). (M chip 0)", .pme_code = 272, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_TDB_FULL@1", .pme_desc = "Wclk cycles protocol engine request queue stalled due to transient directory buffer full (sum of 4 engines). (M chip 1)", .pme_code = 273, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_TDB_FULL@2", .pme_desc = "Wclk cycles protocol engine request queue stalled due to transient directory buffer full (sum of 4 engines). (M chip 2)", .pme_code = 274, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_TDB_FULL@3", .pme_desc = "Wclk cycles protocol engine request queue stalled due to transient directory buffer full (sum of 4 engines). 
(M chip 3)", .pme_code = 275, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_TDB_FULL@4", .pme_desc = "Wclk cycles protocol engine request queue stalled due to transient directory buffer full (sum of 4 engines). (M chip 4)", .pme_code = 276, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_TDB_FULL@5", .pme_desc = "Wclk cycles protocol engine request queue stalled due to transient directory buffer full (sum of 4 engines). (M chip 5)", .pme_code = 277, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_TDB_FULL@6", .pme_desc = "Wclk cycles protocol engine request queue stalled due to transient directory buffer full (sum of 4 engines). (M chip 6)", .pme_code = 278, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_TDB_FULL@7", .pme_desc = "Wclk cycles protocol engine request queue stalled due to transient directory buffer full (sum of 4 engines). 
(M chip 7)", .pme_code = 279, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_TDB_FULL@8", .pme_desc = "Wclk cycles protocol engine request queue stalled due to transient directory buffer full (sum of 4 engines). (M chip 8)", .pme_code = 280, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_TDB_FULL@9", .pme_desc = "Wclk cycles protocol engine request queue stalled due to transient directory buffer full (sum of 4 engines). (M chip 9)", .pme_code = 281, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_TDB_FULL@10", .pme_desc = "Wclk cycles protocol engine request queue stalled due to transient directory buffer full (sum of 4 engines). (M chip 10)", .pme_code = 282, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_TDB_FULL@11", .pme_desc = "Wclk cycles protocol engine request queue stalled due to transient directory buffer full (sum of 4 engines). 
(M chip 11)", .pme_code = 283, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_TDB_FULL@12", .pme_desc = "Wclk cycles protocol engine request queue stalled due to transient directory buffer full (sum of 4 engines). (M chip 12)", .pme_code = 284, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_TDB_FULL@13", .pme_desc = "Wclk cycles protocol engine request queue stalled due to transient directory buffer full (sum of 4 engines). (M chip 13)", .pme_code = 285, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_TDB_FULL@14", .pme_desc = "Wclk cycles protocol engine request queue stalled due to transient directory buffer full (sum of 4 engines). (M chip 14)", .pme_code = 286, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_TDB_FULL@15", .pme_desc = "Wclk cycles protocol engine request queue stalled due to transient directory buffer full (sum of 4 engines). 
(M chip 15)", .pme_code = 287, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 1 Event 2 */ { .pme_name = "W_OUT_IDLE_1@0", .pme_desc = "Wclk cycles MD2BW output port 1 is idle (no flits flowing). (M chip 0)", .pme_code = 288, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_1@1", .pme_desc = "Wclk cycles MD2BW output port 1 is idle (no flits flowing). (M chip 1)", .pme_code = 289, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_1@2", .pme_desc = "Wclk cycles MD2BW output port 1 is idle (no flits flowing). (M chip 2)", .pme_code = 290, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_1@3", .pme_desc = "Wclk cycles MD2BW output port 1 is idle (no flits flowing). (M chip 3)", .pme_code = 291, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_1@4", .pme_desc = "Wclk cycles MD2BW output port 1 is idle (no flits flowing). 
(M chip 4)", .pme_code = 292, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_1@5", .pme_desc = "Wclk cycles MD2BW output port 1 is idle (no flits flowing). (M chip 5)", .pme_code = 293, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_1@6", .pme_desc = "Wclk cycles MD2BW output port 1 is idle (no flits flowing). (M chip 6)", .pme_code = 294, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_1@7", .pme_desc = "Wclk cycles MD2BW output port 1 is idle (no flits flowing). (M chip 7)", .pme_code = 295, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_1@8", .pme_desc = "Wclk cycles MD2BW output port 1 is idle (no flits flowing). (M chip 8)", .pme_code = 296, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_1@9", .pme_desc = "Wclk cycles MD2BW output port 1 is idle (no flits flowing). 
(M chip 9)", .pme_code = 297, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_1@10", .pme_desc = "Wclk cycles MD2BW output port 1 is idle (no flits flowing). (M chip 10)", .pme_code = 298, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_1@11", .pme_desc = "Wclk cycles MD2BW output port 1 is idle (no flits flowing). (M chip 11)", .pme_code = 299, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_1@12", .pme_desc = "Wclk cycles MD2BW output port 1 is idle (no flits flowing). (M chip 12)", .pme_code = 300, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_1@13", .pme_desc = "Wclk cycles MD2BW output port 1 is idle (no flits flowing). (M chip 13)", .pme_code = 301, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_1@14", .pme_desc = "Wclk cycles MD2BW output port 1 is idle (no flits flowing). 
(M chip 14)", .pme_code = 302, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_1@15", .pme_desc = "Wclk cycles MD2BW output port 1 is idle (no flits flowing). (M chip 15)", .pme_code = 303, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 1 Event 3 */ { .pme_name = "FWD_READ_SHARED_SENT@0", .pme_desc = "FwdReadShared packets sent (Exclusive -> PendFwd transition). (M chip 0)", .pme_code = 304, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ_SHARED_SENT@1", .pme_desc = "FwdReadShared packets sent (Exclusive -> PendFwd transition). (M chip 1)", .pme_code = 305, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ_SHARED_SENT@2", .pme_desc = "FwdReadShared packets sent (Exclusive -> PendFwd transition). (M chip 2)", .pme_code = 306, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ_SHARED_SENT@3", .pme_desc = "FwdReadShared packets sent (Exclusive -> PendFwd transition). 
(M chip 3)", .pme_code = 307, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ_SHARED_SENT@4", .pme_desc = "FwdReadShared packets sent (Exclusive -> PendFwd transition). (M chip 4)", .pme_code = 308, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ_SHARED_SENT@5", .pme_desc = "FwdReadShared packets sent (Exclusive -> PendFwd transition). (M chip 5)", .pme_code = 309, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ_SHARED_SENT@6", .pme_desc = "FwdReadShared packets sent (Exclusive -> PendFwd transition). (M chip 6)", .pme_code = 310, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ_SHARED_SENT@7", .pme_desc = "FwdReadShared packets sent (Exclusive -> PendFwd transition). (M chip 7)", .pme_code = 311, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ_SHARED_SENT@8", .pme_desc = "FwdReadShared packets sent (Exclusive -> PendFwd transition). 
(M chip 8)", .pme_code = 312, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ_SHARED_SENT@9", .pme_desc = "FwdReadShared packets sent (Exclusive -> PendFwd transition). (M chip 9)", .pme_code = 313, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ_SHARED_SENT@10", .pme_desc = "FwdReadShared packets sent (Exclusive -> PendFwd transition). (M chip 10)", .pme_code = 314, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ_SHARED_SENT@11", .pme_desc = "FwdReadShared packets sent (Exclusive -> PendFwd transition). (M chip 11)", .pme_code = 315, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ_SHARED_SENT@12", .pme_desc = "FwdReadShared packets sent (Exclusive -> PendFwd transition). (M chip 12)", .pme_code = 316, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ_SHARED_SENT@13", .pme_desc = "FwdReadShared packets sent (Exclusive -> PendFwd transition). 
(M chip 13)", .pme_code = 317, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ_SHARED_SENT@14", .pme_desc = "FwdReadShared packets sent (Exclusive -> PendFwd transition). (M chip 14)", .pme_code = 318, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ_SHARED_SENT@15", .pme_desc = "FwdReadShared packets sent (Exclusive -> PendFwd transition). (M chip 15)", .pme_code = 319, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 1, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 2 Event 0 */ { .pme_name = "UPDATES_SENT@0", .pme_desc = "Puts that cause an Update to be sent to owner. (M chip 0)", .pme_code = 320, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATES_SENT@1", .pme_desc = "Puts that cause an Update to be sent to owner. (M chip 1)", .pme_code = 321, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATES_SENT@2", .pme_desc = "Puts that cause an Update to be sent to owner. 
(M chip 2)", .pme_code = 322, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATES_SENT@3", .pme_desc = "Puts that cause an Update to be sent to owner. (M chip 3)", .pme_code = 323, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATES_SENT@4", .pme_desc = "Puts that cause an Update to be sent to owner. (M chip 4)", .pme_code = 324, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATES_SENT@5", .pme_desc = "Puts that cause an Update to be sent to owner. (M chip 5)", .pme_code = 325, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATES_SENT@6", .pme_desc = "Puts that cause an Update to be sent to owner. (M chip 6)", .pme_code = 326, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATES_SENT@7", .pme_desc = "Puts that cause an Update to be sent to owner. 
(M chip 7)", .pme_code = 327, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATES_SENT@8", .pme_desc = "Puts that cause an Update to be sent to owner. (M chip 8)", .pme_code = 328, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATES_SENT@9", .pme_desc = "Puts that cause an Update to be sent to owner. (M chip 9)", .pme_code = 329, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATES_SENT@10", .pme_desc = "Puts that cause an Update to be sent to owner. (M chip 10)", .pme_code = 330, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATES_SENT@11", .pme_desc = "Puts that cause an Update to be sent to owner. (M chip 11)", .pme_code = 331, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATES_SENT@12", .pme_desc = "Puts that cause an Update to be sent to owner. 
(M chip 12)", .pme_code = 332, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATES_SENT@13", .pme_desc = "Puts that cause an Update to be sent to owner. (M chip 13)", .pme_code = 333, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATES_SENT@14", .pme_desc = "Puts that cause an Update to be sent to owner. (M chip 14)", .pme_code = 334, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATES_SENT@15", .pme_desc = "Puts that cause an Update to be sent to owner. (M chip 15)", .pme_code = 335, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 2 Event 1 */ { .pme_name = "STALL_MM_RESPQ@0", .pme_desc = "Wclk cycles protocol engine request queue stalled due to MM VN1 response queue full (sum of 4 engines). (M chip 0)", .pme_code = 336, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM_RESPQ@1", .pme_desc = "Wclk cycles protocol engine request queue stalled due to MM VN1 response queue full (sum of 4 engines). 
(M chip 1)", .pme_code = 337, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM_RESPQ@2", .pme_desc = "Wclk cycles protocol engine request queue stalled due to MM VN1 response queue full (sum of 4 engines). (M chip 2)", .pme_code = 338, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM_RESPQ@3", .pme_desc = "Wclk cycles protocol engine request queue stalled due to MM VN1 response queue full (sum of 4 engines). (M chip 3)", .pme_code = 339, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM_RESPQ@4", .pme_desc = "Wclk cycles protocol engine request queue stalled due to MM VN1 response queue full (sum of 4 engines). (M chip 4)", .pme_code = 340, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM_RESPQ@5", .pme_desc = "Wclk cycles protocol engine request queue stalled due to MM VN1 response queue full (sum of 4 engines). 
(M chip 5)", .pme_code = 341, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM_RESPQ@6", .pme_desc = "Wclk cycles protocol engine request queue stalled due to MM VN1 response queue full (sum of 4 engines). (M chip 6)", .pme_code = 342, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM_RESPQ@7", .pme_desc = "Wclk cycles protocol engine request queue stalled due to MM VN1 response queue full (sum of 4 engines). (M chip 7)", .pme_code = 343, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM_RESPQ@8", .pme_desc = "Wclk cycles protocol engine request queue stalled due to MM VN1 response queue full (sum of 4 engines). (M chip 8)", .pme_code = 344, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM_RESPQ@9", .pme_desc = "Wclk cycles protocol engine request queue stalled due to MM VN1 response queue full (sum of 4 engines). 
(M chip 9)", .pme_code = 345, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM_RESPQ@10", .pme_desc = "Wclk cycles protocol engine request queue stalled due to MM VN1 response queue full (sum of 4 engines). (M chip 10)", .pme_code = 346, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM_RESPQ@11", .pme_desc = "Wclk cycles protocol engine request queue stalled due to MM VN1 response queue full (sum of 4 engines). (M chip 11)", .pme_code = 347, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM_RESPQ@12", .pme_desc = "Wclk cycles protocol engine request queue stalled due to MM VN1 response queue full (sum of 4 engines). (M chip 12)", .pme_code = 348, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM_RESPQ@13", .pme_desc = "Wclk cycles protocol engine request queue stalled due to MM VN1 response queue full (sum of 4 engines). 
(M chip 13)", .pme_code = 349, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM_RESPQ@14", .pme_desc = "Wclk cycles protocol engine request queue stalled due to MM VN1 response queue full (sum of 4 engines). (M chip 14)", .pme_code = 350, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM_RESPQ@15", .pme_desc = "Wclk cycles protocol engine request queue stalled due to MM VN1 response queue full (sum of 4 engines). (M chip 15)", .pme_code = 351, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 2 Event 2 */ { .pme_name = "W_OUT_IDLE_2@0", .pme_desc = "Wclk cycles MD2BW output port 2 is idle (no flits flowing). (M chip 0)", .pme_code = 352, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_2@1", .pme_desc = "Wclk cycles MD2BW output port 2 is idle (no flits flowing). 
(M chip 1)", .pme_code = 353, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_2@2", .pme_desc = "Wclk cycles MD2BW output port 2 is idle (no flits flowing). (M chip 2)", .pme_code = 354, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_2@3", .pme_desc = "Wclk cycles MD2BW output port 2 is idle (no flits flowing). (M chip 3)", .pme_code = 355, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_2@4", .pme_desc = "Wclk cycles MD2BW output port 2 is idle (no flits flowing). (M chip 4)", .pme_code = 356, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_2@5", .pme_desc = "Wclk cycles MD2BW output port 2 is idle (no flits flowing). (M chip 5)", .pme_code = 357, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_2@6", .pme_desc = "Wclk cycles MD2BW output port 2 is idle (no flits flowing). 
(M chip 6)", .pme_code = 358, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_2@7", .pme_desc = "Wclk cycles MD2BW output port 2 is idle (no flits flowing). (M chip 7)", .pme_code = 359, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_2@8", .pme_desc = "Wclk cycles MD2BW output port 2 is idle (no flits flowing). (M chip 8)", .pme_code = 360, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_2@9", .pme_desc = "Wclk cycles MD2BW output port 2 is idle (no flits flowing). (M chip 9)", .pme_code = 361, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_2@10", .pme_desc = "Wclk cycles MD2BW output port 2 is idle (no flits flowing). (M chip 10)", .pme_code = 362, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_2@11", .pme_desc = "Wclk cycles MD2BW output port 2 is idle (no flits flowing). 
(M chip 11)", .pme_code = 363, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_2@12", .pme_desc = "Wclk cycles MD2BW output port 2 is idle (no flits flowing). (M chip 12)", .pme_code = 364, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_2@13", .pme_desc = "Wclk cycles MD2BW output port 2 is idle (no flits flowing). (M chip 13)", .pme_code = 365, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_2@14", .pme_desc = "Wclk cycles MD2BW output port 2 is idle (no flits flowing). (M chip 14)", .pme_code = 366, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_2@15", .pme_desc = "Wclk cycles MD2BW output port 2 is idle (no flits flowing). (M chip 15)", .pme_code = 367, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 2 Event 3 */ { .pme_name = "W_IN_IDLE_2@0", .pme_desc = "Wclk cycles BW2MD input port 2 is idle (no flits in either VC0 or VC2). 
(M chip 0)", .pme_code = 368, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_2@1", .pme_desc = "Wclk cycles BW2MD input port 2 is idle (no flits in either VC0 or VC2). (M chip 1)", .pme_code = 369, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_2@2", .pme_desc = "Wclk cycles BW2MD input port 2 is idle (no flits in either VC0 or VC2). (M chip 2)", .pme_code = 370, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_2@3", .pme_desc = "Wclk cycles BW2MD input port 2 is idle (no flits in either VC0 or VC2). (M chip 3)", .pme_code = 371, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_2@4", .pme_desc = "Wclk cycles BW2MD input port 2 is idle (no flits in either VC0 or VC2). (M chip 4)", .pme_code = 372, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_2@5", .pme_desc = "Wclk cycles BW2MD input port 2 is idle (no flits in either VC0 or VC2). 
(M chip 5)", .pme_code = 373, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_2@6", .pme_desc = "Wclk cycles BW2MD input port 2 is idle (no flits in either VC0 or VC2). (M chip 6)", .pme_code = 374, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_2@7", .pme_desc = "Wclk cycles BW2MD input port 2 is idle (no flits in either VC0 or VC2). (M chip 7)", .pme_code = 375, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_2@8", .pme_desc = "Wclk cycles BW2MD input port 2 is idle (no flits in either VC0 or VC2). (M chip 8)", .pme_code = 376, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_2@9", .pme_desc = "Wclk cycles BW2MD input port 2 is idle (no flits in either VC0 or VC2). (M chip 9)", .pme_code = 377, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_2@10", .pme_desc = "Wclk cycles BW2MD input port 2 is idle (no flits in either VC0 or VC2). 
(M chip 10)", .pme_code = 378, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_2@11", .pme_desc = "Wclk cycles BW2MD input port 2 is idle (no flits in either VC0 or VC2). (M chip 11)", .pme_code = 379, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_2@12", .pme_desc = "Wclk cycles BW2MD input port 2 is idle (no flits in either VC0 or VC2). (M chip 12)", .pme_code = 380, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_2@13", .pme_desc = "Wclk cycles BW2MD input port 2 is idle (no flits in either VC0 or VC2). (M chip 13)", .pme_code = 381, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_2@14", .pme_desc = "Wclk cycles BW2MD input port 2 is idle (no flits in either VC0 or VC2). (M chip 14)", .pme_code = 382, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_2@15", .pme_desc = "Wclk cycles BW2MD input port 2 is idle (no flits in either VC0 or VC2). 
(M chip 15)", .pme_code = 383, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 2, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 3 Event 0 */ { .pme_name = "NON_CACHED@0", .pme_desc = "Read requests satisfied from non-cached state. (M chip 0)", .pme_code = 384, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NON_CACHED@1", .pme_desc = "Read requests satisfied from non-cached state. (M chip 1)", .pme_code = 385, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NON_CACHED@2", .pme_desc = "Read requests satisfied from non-cached state. (M chip 2)", .pme_code = 386, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NON_CACHED@3", .pme_desc = "Read requests satisfied from non-cached state. (M chip 3)", .pme_code = 387, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NON_CACHED@4", .pme_desc = "Read requests satisfied from non-cached state. 
(M chip 4)", .pme_code = 388, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NON_CACHED@5", .pme_desc = "Read requests satisfied from non-cached state. (M chip 5)", .pme_code = 389, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NON_CACHED@6", .pme_desc = "Read requests satisfied from non-cached state. (M chip 6)", .pme_code = 390, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NON_CACHED@7", .pme_desc = "Read requests satisfied from non-cached state. (M chip 7)", .pme_code = 391, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NON_CACHED@8", .pme_desc = "Read requests satisfied from non-cached state. (M chip 8)", .pme_code = 392, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NON_CACHED@9", .pme_desc = "Read requests satisfied from non-cached state. 
(M chip 9)", .pme_code = 393, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NON_CACHED@10", .pme_desc = "Read requests satisfied from non-cached state. (M chip 10)", .pme_code = 394, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NON_CACHED@11", .pme_desc = "Read requests satisfied from non-cached state. (M chip 11)", .pme_code = 395, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NON_CACHED@12", .pme_desc = "Read requests satisfied from non-cached state. (M chip 12)", .pme_code = 396, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NON_CACHED@13", .pme_desc = "Read requests satisfied from non-cached state. (M chip 13)", .pme_code = 397, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NON_CACHED@14", .pme_desc = "Read requests satisfied from non-cached state. 
(M chip 14)", .pme_code = 398, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NON_CACHED@15", .pme_desc = "Read requests satisfied from non-cached state. (M chip 15)", .pme_code = 399, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 3 Event 1 */ { .pme_name = "STALL_ASSOC@0", .pme_desc = "Wclk cycles protocol engine request queue stalled due to temporary over-subscription of directory ways. (M chip 0)", .pme_code = 400, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_ASSOC@1", .pme_desc = "Wclk cycles protocol engine request queue stalled due to temporary over-subscription of directory ways. (M chip 1)", .pme_code = 401, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_ASSOC@2", .pme_desc = "Wclk cycles protocol engine request queue stalled due to temporary over-subscription of directory ways. 
(M chip 2)", .pme_code = 402, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_ASSOC@3", .pme_desc = "Wclk cycles protocol engine request queue stalled due to temporary over-subscription of directory ways. (M chip 3)", .pme_code = 403, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_ASSOC@4", .pme_desc = "Wclk cycles protocol engine request queue stalled due to temporary over-subscription of directory ways. (M chip 4)", .pme_code = 404, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_ASSOC@5", .pme_desc = "Wclk cycles protocol engine request queue stalled due to temporary over-subscription of directory ways. (M chip 5)", .pme_code = 405, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_ASSOC@6", .pme_desc = "Wclk cycles protocol engine request queue stalled due to temporary over-subscription of directory ways. 
(M chip 6)", .pme_code = 406, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_ASSOC@7", .pme_desc = "Wclk cycles protocol engine request queue stalled due to temporary over-subscription of directory ways. (M chip 7)", .pme_code = 407, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_ASSOC@8", .pme_desc = "Wclk cycles protocol engine request queue stalled due to temporary over-subscription of directory ways. (M chip 8)", .pme_code = 408, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_ASSOC@9", .pme_desc = "Wclk cycles protocol engine request queue stalled due to temporary over-subscription of directory ways. (M chip 9)", .pme_code = 409, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_ASSOC@10", .pme_desc = "Wclk cycles protocol engine request queue stalled due to temporary over-subscription of directory ways. 
(M chip 10)", .pme_code = 410, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_ASSOC@11", .pme_desc = "Wclk cycles protocol engine request queue stalled due to temporary over-subscription of directory ways. (M chip 11)", .pme_code = 411, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_ASSOC@12", .pme_desc = "Wclk cycles protocol engine request queue stalled due to temporary over-subscription of directory ways. (M chip 12)", .pme_code = 412, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_ASSOC@13", .pme_desc = "Wclk cycles protocol engine request queue stalled due to temporary over-subscription of directory ways. (M chip 13)", .pme_code = 413, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_ASSOC@14", .pme_desc = "Wclk cycles protocol engine request queue stalled due to temporary over-subscription of directory ways. 
(M chip 14)", .pme_code = 414, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_ASSOC@15", .pme_desc = "Wclk cycles protocol engine request queue stalled due to temporary over-subscription of directory ways. (M chip 15)", .pme_code = 415, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 3 Event 2 */ { .pme_name = "W_OUT_IDLE_3@0", .pme_desc = "Wclk cycles MD2BW output port 3 is idle (no flits flowing). (M chip 0)", .pme_code = 416, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_3@1", .pme_desc = "Wclk cycles MD2BW output port 3 is idle (no flits flowing). (M chip 1)", .pme_code = 417, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_3@2", .pme_desc = "Wclk cycles MD2BW output port 3 is idle (no flits flowing). (M chip 2)", .pme_code = 418, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_3@3", .pme_desc = "Wclk cycles MD2BW output port 3 is idle (no flits flowing). 
(M chip 3)", .pme_code = 419, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_3@4", .pme_desc = "Wclk cycles MD2BW output port 3 is idle (no flits flowing). (M chip 4)", .pme_code = 420, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_3@5", .pme_desc = "Wclk cycles MD2BW output port 3 is idle (no flits flowing). (M chip 5)", .pme_code = 421, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_3@6", .pme_desc = "Wclk cycles MD2BW output port 3 is idle (no flits flowing). (M chip 6)", .pme_code = 422, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_3@7", .pme_desc = "Wclk cycles MD2BW output port 3 is idle (no flits flowing). (M chip 7)", .pme_code = 423, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_3@8", .pme_desc = "Wclk cycles MD2BW output port 3 is idle (no flits flowing). 
(M chip 8)", .pme_code = 424, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_3@9", .pme_desc = "Wclk cycles MD2BW output port 3 is idle (no flits flowing). (M chip 9)", .pme_code = 425, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_3@10", .pme_desc = "Wclk cycles MD2BW output port 3 is idle (no flits flowing). (M chip 10)", .pme_code = 426, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_3@11", .pme_desc = "Wclk cycles MD2BW output port 3 is idle (no flits flowing). (M chip 11)", .pme_code = 427, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_3@12", .pme_desc = "Wclk cycles MD2BW output port 3 is idle (no flits flowing). (M chip 12)", .pme_code = 428, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_3@13", .pme_desc = "Wclk cycles MD2BW output port 3 is idle (no flits flowing). 
(M chip 13)", .pme_code = 429, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_3@14", .pme_desc = "Wclk cycles MD2BW output port 3 is idle (no flits flowing). (M chip 14)", .pme_code = 430, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_IDLE_3@15", .pme_desc = "Wclk cycles MD2BW output port 3 is idle (no flits flowing). (M chip 15)", .pme_code = 431, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 3 Event 3 */ { .pme_name = "W_IN_IDLE_3@0", .pme_desc = "Wclk cycles BW2MD input port 3 is idle (no flits in either VC0 or VC2). (M chip 0)", .pme_code = 432, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_3@1", .pme_desc = "Wclk cycles BW2MD input port 3 is idle (no flits in either VC0 or VC2). (M chip 1)", .pme_code = 433, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_3@2", .pme_desc = "Wclk cycles BW2MD input port 3 is idle (no flits in either VC0 or VC2). 
(M chip 2)", .pme_code = 434, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_3@3", .pme_desc = "Wclk cycles BW2MD input port 3 is idle (no flits in either VC0 or VC2). (M chip 3)", .pme_code = 435, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_3@4", .pme_desc = "Wclk cycles BW2MD input port 3 is idle (no flits in either VC0 or VC2). (M chip 4)", .pme_code = 436, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_3@5", .pme_desc = "Wclk cycles BW2MD input port 3 is idle (no flits in either VC0 or VC2). (M chip 5)", .pme_code = 437, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_3@6", .pme_desc = "Wclk cycles BW2MD input port 3 is idle (no flits in either VC0 or VC2). (M chip 6)", .pme_code = 438, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_3@7", .pme_desc = "Wclk cycles BW2MD input port 3 is idle (no flits in either VC0 or VC2). 
(M chip 7)", .pme_code = 439, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_3@8", .pme_desc = "Wclk cycles BW2MD input port 3 is idle (no flits in either VC0 or VC2). (M chip 8)", .pme_code = 440, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_3@9", .pme_desc = "Wclk cycles BW2MD input port 3 is idle (no flits in either VC0 or VC2). (M chip 9)", .pme_code = 441, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_3@10", .pme_desc = "Wclk cycles BW2MD input port 3 is idle (no flits in either VC0 or VC2). (M chip 10)", .pme_code = 442, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_3@11", .pme_desc = "Wclk cycles BW2MD input port 3 is idle (no flits in either VC0 or VC2). (M chip 11)", .pme_code = 443, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_3@12", .pme_desc = "Wclk cycles BW2MD input port 3 is idle (no flits in either VC0 or VC2). 
(M chip 12)", .pme_code = 444, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_3@13", .pme_desc = "Wclk cycles BW2MD input port 3 is idle (no flits in either VC0 or VC2). (M chip 13)", .pme_code = 445, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_3@14", .pme_desc = "Wclk cycles BW2MD input port 3 is idle (no flits in either VC0 or VC2). (M chip 14)", .pme_code = 446, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_IDLE_3@15", .pme_desc = "Wclk cycles BW2MD input port 3 is idle (no flits in either VC0 or VC2). (M chip 15)", .pme_code = 447, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 3, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 4 Event 0 */ { .pme_name = "READ_REQ_SHARED@0", .pme_desc = "Read requests satisfied from the Shared state. (M chip 0)", .pme_code = 448, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "READ_REQ_SHARED@1", .pme_desc = "Read requests satisfied from the Shared state. 
(M chip 1)", .pme_code = 449, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "READ_REQ_SHARED@2", .pme_desc = "Read requests satisfied from the Shared state. (M chip 2)", .pme_code = 450, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "READ_REQ_SHARED@3", .pme_desc = "Read requests satisfied from the Shared state. (M chip 3)", .pme_code = 451, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "READ_REQ_SHARED@4", .pme_desc = "Read requests satisfied from the Shared state. (M chip 4)", .pme_code = 452, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "READ_REQ_SHARED@5", .pme_desc = "Read requests satisfied from the Shared state. (M chip 5)", .pme_code = 453, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "READ_REQ_SHARED@6", .pme_desc = "Read requests satisfied from the Shared state. 
(M chip 6)", .pme_code = 454, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "READ_REQ_SHARED@7", .pme_desc = "Read requests satisfied from the Shared state. (M chip 7)", .pme_code = 455, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "READ_REQ_SHARED@8", .pme_desc = "Read requests satisfied from the Shared state. (M chip 8)", .pme_code = 456, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "READ_REQ_SHARED@9", .pme_desc = "Read requests satisfied from the Shared state. (M chip 9)", .pme_code = 457, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "READ_REQ_SHARED@10", .pme_desc = "Read requests satisfied from the Shared state. (M chip 10)", .pme_code = 458, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "READ_REQ_SHARED@11", .pme_desc = "Read requests satisfied from the Shared state. 
(M chip 11)", .pme_code = 459, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "READ_REQ_SHARED@12", .pme_desc = "Read requests satisfied from the Shared state. (M chip 12)", .pme_code = 460, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "READ_REQ_SHARED@13", .pme_desc = "Read requests satisfied from the Shared state. (M chip 13)", .pme_code = 461, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "READ_REQ_SHARED@14", .pme_desc = "Read requests satisfied from the Shared state. (M chip 14)", .pme_code = 462, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "READ_REQ_SHARED@15", .pme_desc = "Read requests satisfied from the Shared state. (M chip 15)", .pme_code = 463, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 4 Event 1 */ { .pme_name = "STALL_VN1_BLOCKED@0", .pme_desc = "Wclk cycles protocol engine request queue stalled due to virtual network 1 output blocked. 
(M chip 0)", .pme_code = 464, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_VN1_BLOCKED@1", .pme_desc = "Wclk cycles protocol engine request queue stalled due to virtual network 1 output blocked. (M chip 1)", .pme_code = 465, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_VN1_BLOCKED@2", .pme_desc = "Wclk cycles protocol engine request queue stalled due to virtual network 1 output blocked. (M chip 2)", .pme_code = 466, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_VN1_BLOCKED@3", .pme_desc = "Wclk cycles protocol engine request queue stalled due to virtual network 1 output blocked. (M chip 3)", .pme_code = 467, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_VN1_BLOCKED@4", .pme_desc = "Wclk cycles protocol engine request queue stalled due to virtual network 1 output blocked. 
(M chip 4)", .pme_code = 468, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_VN1_BLOCKED@5", .pme_desc = "Wclk cycles protocol engine request queue stalled due to virtual network 1 output blocked. (M chip 5)", .pme_code = 469, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_VN1_BLOCKED@6", .pme_desc = "Wclk cycles protocol engine request queue stalled due to virtual network 1 output blocked. (M chip 6)", .pme_code = 470, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_VN1_BLOCKED@7", .pme_desc = "Wclk cycles protocol engine request queue stalled due to virtual network 1 output blocked. (M chip 7)", .pme_code = 471, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_VN1_BLOCKED@8", .pme_desc = "Wclk cycles protocol engine request queue stalled due to virtual network 1 output blocked. 
(M chip 8)", .pme_code = 472, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_VN1_BLOCKED@9", .pme_desc = "Wclk cycles protocol engine request queue stalled due to virtual network 1 output blocked. (M chip 9)", .pme_code = 473, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_VN1_BLOCKED@10", .pme_desc = "Wclk cycles protocol engine request queue stalled due to virtual network 1 output blocked. (M chip 10)", .pme_code = 474, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_VN1_BLOCKED@11", .pme_desc = "Wclk cycles protocol engine request queue stalled due to virtual network 1 output blocked. (M chip 11)", .pme_code = 475, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_VN1_BLOCKED@12", .pme_desc = "Wclk cycles protocol engine request queue stalled due to virtual network 1 output blocked. 
(M chip 12)", .pme_code = 476, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_VN1_BLOCKED@13", .pme_desc = "Wclk cycles protocol engine request queue stalled due to virtual network 1 output blocked. (M chip 13)", .pme_code = 477, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_VN1_BLOCKED@14", .pme_desc = "Wclk cycles protocol engine request queue stalled due to virtual network 1 output blocked. (M chip 14)", .pme_code = 478, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_VN1_BLOCKED@15", .pme_desc = "Wclk cycles protocol engine request queue stalled due to virtual network 1 output blocked. (M chip 15)", .pme_code = 479, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 4 Event 2 */ { .pme_name = "W_IN_FLOWING_0@0", .pme_desc = "Wclk cycles BW2MD input port 0 has a flit flowing (on either VC0 or VC2). 
(M chip 0)", .pme_code = 480, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_0@1", .pme_desc = "Wclk cycles BW2MD input port 0 has a flit flowing (on either VC0 or VC2). (M chip 1)", .pme_code = 481, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_0@2", .pme_desc = "Wclk cycles BW2MD input port 0 has a flit flowing (on either VC0 or VC2). (M chip 2)", .pme_code = 482, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_0@3", .pme_desc = "Wclk cycles BW2MD input port 0 has a flit flowing (on either VC0 or VC2). (M chip 3)", .pme_code = 483, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_0@4", .pme_desc = "Wclk cycles BW2MD input port 0 has a flit flowing (on either VC0 or VC2). (M chip 4)", .pme_code = 484, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_0@5", .pme_desc = "Wclk cycles BW2MD input port 0 has a flit flowing (on either VC0 or VC2). 
(M chip 5)", .pme_code = 485, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_0@6", .pme_desc = "Wclk cycles BW2MD input port 0 has a flit flowing (on either VC0 or VC2). (M chip 6)", .pme_code = 486, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_0@7", .pme_desc = "Wclk cycles BW2MD input port 0 has a flit flowing (on either VC0 or VC2). (M chip 7)", .pme_code = 487, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_0@8", .pme_desc = "Wclk cycles BW2MD input port 0 has a flit flowing (on either VC0 or VC2). (M chip 8)", .pme_code = 488, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_0@9", .pme_desc = "Wclk cycles BW2MD input port 0 has a flit flowing (on either VC0 or VC2). (M chip 9)", .pme_code = 489, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_0@10", .pme_desc = "Wclk cycles BW2MD input port 0 has a flit flowing (on either VC0 or VC2). 
(M chip 10)", .pme_code = 490, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_0@11", .pme_desc = "Wclk cycles BW2MD input port 0 has a flit flowing (on either VC0 or VC2). (M chip 11)", .pme_code = 491, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_0@12", .pme_desc = "Wclk cycles BW2MD input port 0 has a flit flowing (on either VC0 or VC2). (M chip 12)", .pme_code = 492, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_0@13", .pme_desc = "Wclk cycles BW2MD input port 0 has a flit flowing (on either VC0 or VC2). (M chip 13)", .pme_code = 493, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_0@14", .pme_desc = "Wclk cycles BW2MD input port 0 has a flit flowing (on either VC0 or VC2). (M chip 14)", .pme_code = 494, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_0@15", .pme_desc = "Wclk cycles BW2MD input port 0 has a flit flowing (on either VC0 or VC2). 
(M chip 15)", .pme_code = 495, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 4 Event 3 */ { .pme_name = "W_OUT_FLOWING_0@0", .pme_desc = "Wclk cycles MD2BW output port 0 has a flit flowing. (M chip 0)", .pme_code = 496, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_0@1", .pme_desc = "Wclk cycles MD2BW output port 0 has a flit flowing. (M chip 1)", .pme_code = 497, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_0@2", .pme_desc = "Wclk cycles MD2BW output port 0 has a flit flowing. (M chip 2)", .pme_code = 498, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_0@3", .pme_desc = "Wclk cycles MD2BW output port 0 has a flit flowing. (M chip 3)", .pme_code = 499, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_0@4", .pme_desc = "Wclk cycles MD2BW output port 0 has a flit flowing. 
(M chip 4)", .pme_code = 500, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_0@5", .pme_desc = "Wclk cycles MD2BW output port 0 has a flit flowing. (M chip 5)", .pme_code = 501, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_0@6", .pme_desc = "Wclk cycles MD2BW output port 0 has a flit flowing. (M chip 6)", .pme_code = 502, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_0@7", .pme_desc = "Wclk cycles MD2BW output port 0 has a flit flowing. (M chip 7)", .pme_code = 503, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_0@8", .pme_desc = "Wclk cycles MD2BW output port 0 has a flit flowing. (M chip 8)", .pme_code = 504, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_0@9", .pme_desc = "Wclk cycles MD2BW output port 0 has a flit flowing. 
(M chip 9)", .pme_code = 505, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_0@10", .pme_desc = "Wclk cycles MD2BW output port 0 has a flit flowing. (M chip 10)", .pme_code = 506, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_0@11", .pme_desc = "Wclk cycles MD2BW output port 0 has a flit flowing. (M chip 11)", .pme_code = 507, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_0@12", .pme_desc = "Wclk cycles MD2BW output port 0 has a flit flowing. (M chip 12)", .pme_code = 508, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_0@13", .pme_desc = "Wclk cycles MD2BW output port 0 has a flit flowing. (M chip 13)", .pme_code = 509, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_0@14", .pme_desc = "Wclk cycles MD2BW output port 0 has a flit flowing. 
(M chip 14)", .pme_code = 510, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_0@15", .pme_desc = "Wclk cycles MD2BW output port 0 has a flit flowing. (M chip 15)", .pme_code = 511, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 4, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 5 Event 0 */ { .pme_name = "FWD_REQ_TO_OWNER@0", .pme_desc = "Requests forwarded to current owner (FwdRead, FwdReadShared, FlushReq, FwdGet, Update). (M chip 0)", .pme_code = 512, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_REQ_TO_OWNER@1", .pme_desc = "Requests forwarded to current owner (FwdRead, FwdReadShared, FlushReq, FwdGet, Update). (M chip 1)", .pme_code = 513, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_REQ_TO_OWNER@2", .pme_desc = "Requests forwarded to current owner (FwdRead, FwdReadShared, FlushReq, FwdGet, Update). 
(M chip 2)", .pme_code = 514, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_REQ_TO_OWNER@3", .pme_desc = "Requests forwarded to current owner (FwdRead, FwdReadShared, FlushReq, FwdGet, Update). (M chip 3)", .pme_code = 515, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_REQ_TO_OWNER@4", .pme_desc = "Requests forwarded to current owner (FwdRead, FwdReadShared, FlushReq, FwdGet, Update). (M chip 4)", .pme_code = 516, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_REQ_TO_OWNER@5", .pme_desc = "Requests forwarded to current owner (FwdRead, FwdReadShared, FlushReq, FwdGet, Update). (M chip 5)", .pme_code = 517, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_REQ_TO_OWNER@6", .pme_desc = "Requests forwarded to current owner (FwdRead, FwdReadShared, FlushReq, FwdGet, Update). 
(M chip 6)", .pme_code = 518, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_REQ_TO_OWNER@7", .pme_desc = "Requests forwarded to current owner (FwdRead, FwdReadShared, FlushReq, FwdGet, Update). (M chip 7)", .pme_code = 519, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_REQ_TO_OWNER@8", .pme_desc = "Requests forwarded to current owner (FwdRead, FwdReadShared, FlushReq, FwdGet, Update). (M chip 8)", .pme_code = 520, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_REQ_TO_OWNER@9", .pme_desc = "Requests forwarded to current owner (FwdRead, FwdReadShared, FlushReq, FwdGet, Update). (M chip 9)", .pme_code = 521, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_REQ_TO_OWNER@10", .pme_desc = "Requests forwarded to current owner (FwdRead, FwdReadShared, FlushReq, FwdGet, Update). 
(M chip 10)", .pme_code = 522, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_REQ_TO_OWNER@11", .pme_desc = "Requests forwarded to current owner (FwdRead, FwdReadShared, FlushReq, FwdGet, Update). (M chip 11)", .pme_code = 523, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_REQ_TO_OWNER@12", .pme_desc = "Requests forwarded to current owner (FwdRead, FwdReadShared, FlushReq, FwdGet, Update). (M chip 12)", .pme_code = 524, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_REQ_TO_OWNER@13", .pme_desc = "Requests forwarded to current owner (FwdRead, FwdReadShared, FlushReq, FwdGet, Update). (M chip 13)", .pme_code = 525, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_REQ_TO_OWNER@14", .pme_desc = "Requests forwarded to current owner (FwdRead, FwdReadShared, FlushReq, FwdGet, Update). 
(M chip 14)", .pme_code = 526, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_REQ_TO_OWNER@15", .pme_desc = "Requests forwarded to current owner (FwdRead, FwdReadShared, FlushReq, FwdGet, Update). (M chip 15)", .pme_code = 527, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 5 Event 1 */ { .pme_name = "PROT_ENGINE_IDLE_NO_PACKETS@0", .pme_desc = "Wclk cycles protocol engine idle due to no new packets to process. Note: The maximum packet acceptance rate into the MD is 1 packet every 2 Wclk periods. (M chip 0)", .pme_code = 528, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PROT_ENGINE_IDLE_NO_PACKETS@1", .pme_desc = "Wclk cycles protocol engine idle due to no new packets to process. Note: The maximum packet acceptance rate into the MD is 1 packet every 2 Wclk periods. (M chip 1)", .pme_code = 529, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PROT_ENGINE_IDLE_NO_PACKETS@2", .pme_desc = "Wclk cycles protocol engine idle due to no new packets to process. Note: The maximum packet acceptance rate into the MD is 1 packet every 2 Wclk periods. 
(M chip 2)", .pme_code = 530, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PROT_ENGINE_IDLE_NO_PACKETS@3", .pme_desc = "Wclk cycles protocol engine idle due to no new packets to process. Note: The maximum packet acceptance rate into the MD is 1 packet every 2 Wclk periods. (M chip 3)", .pme_code = 531, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PROT_ENGINE_IDLE_NO_PACKETS@4", .pme_desc = "Wclk cycles protocol engine idle due to no new packets to process. Note: The maximum packet acceptance rate into the MD is 1 packet every 2 Wclk periods. (M chip 4)", .pme_code = 532, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PROT_ENGINE_IDLE_NO_PACKETS@5", .pme_desc = "Wclk cycles protocol engine idle due to no new packets to process. Note: The maximum packet acceptance rate into the MD is 1 packet every 2 Wclk periods. (M chip 5)", .pme_code = 533, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PROT_ENGINE_IDLE_NO_PACKETS@6", .pme_desc = "Wclk cycles protocol engine idle due to no new packets to process. Note: The maximum packet acceptance rate into the MD is 1 packet every 2 Wclk periods. 
(M chip 6)", .pme_code = 534, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PROT_ENGINE_IDLE_NO_PACKETS@7", .pme_desc = "Wclk cycles protocol engine idle due to no new packets to process. Note: The maximum packet acceptance rate into the MD is 1 packet every 2 Wclk periods. (M chip 7)", .pme_code = 535, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PROT_ENGINE_IDLE_NO_PACKETS@8", .pme_desc = "Wclk cycles protocol engine idle due to no new packets to process. Note: The maximum packet acceptance rate into the MD is 1 packet every 2 Wclk periods. (M chip 8)", .pme_code = 536, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PROT_ENGINE_IDLE_NO_PACKETS@9", .pme_desc = "Wclk cycles protocol engine idle due to no new packets to process. Note: The maximum packet acceptance rate into the MD is 1 packet every 2 Wclk periods. (M chip 9)", .pme_code = 537, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PROT_ENGINE_IDLE_NO_PACKETS@10", .pme_desc = "Wclk cycles protocol engine idle due to no new packets to process. Note: The maximum packet acceptance rate into the MD is 1 packet every 2 Wclk periods. 
(M chip 10)", .pme_code = 538, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PROT_ENGINE_IDLE_NO_PACKETS@11", .pme_desc = "Wclk cycles protocol engine idle due to no new packets to process. Note: The maximum packet acceptance rate into the MD is 1 packet every 2 Wclk periods. (M chip 11)", .pme_code = 539, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PROT_ENGINE_IDLE_NO_PACKETS@12", .pme_desc = "Wclk cycles protocol engine idle due to no new packets to process. Note: The maximum packet acceptance rate into the MD is 1 packet every 2 Wclk periods. (M chip 12)", .pme_code = 540, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PROT_ENGINE_IDLE_NO_PACKETS@13", .pme_desc = "Wclk cycles protocol engine idle due to no new packets to process. Note: The maximum packet acceptance rate into the MD is 1 packet every 2 Wclk periods. (M chip 13)", .pme_code = 541, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PROT_ENGINE_IDLE_NO_PACKETS@14", .pme_desc = "Wclk cycles protocol engine idle due to no new packets to process. Note: The maximum packet acceptance rate into the MD is 1 packet every 2 Wclk periods. 
(M chip 14)", .pme_code = 542, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PROT_ENGINE_IDLE_NO_PACKETS@15", .pme_desc = "Wclk cycles protocol engine idle due to no new packets to process. Note: The maximum packet acceptance rate into the MD is 1 packet every 2 Wclk periods. (M chip 15)", .pme_code = 543, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 5 Event 2 */ { .pme_name = "W_IN_FLOWING_1@0", .pme_desc = "Wclk cycles BW2MD input port 1 has a flit flowing (on either VC0 or VC2). (M chip 0)", .pme_code = 544, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_1@1", .pme_desc = "Wclk cycles BW2MD input port 1 has a flit flowing (on either VC0 or VC2). (M chip 1)", .pme_code = 545, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_1@2", .pme_desc = "Wclk cycles BW2MD input port 1 has a flit flowing (on either VC0 or VC2). 
(M chip 2)", .pme_code = 546, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_1@3", .pme_desc = "Wclk cycles BW2MD input port 1 has a flit flowing (on either VC0 or VC2). (M chip 3)", .pme_code = 547, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_1@4", .pme_desc = "Wclk cycles BW2MD input port 1 has a flit flowing (on either VC0 or VC2). (M chip 4)", .pme_code = 548, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_1@5", .pme_desc = "Wclk cycles BW2MD input port 1 has a flit flowing (on either VC0 or VC2). (M chip 5)", .pme_code = 549, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_1@6", .pme_desc = "Wclk cycles BW2MD input port 1 has a flit flowing (on either VC0 or VC2). (M chip 6)", .pme_code = 550, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_1@7", .pme_desc = "Wclk cycles BW2MD input port 1 has a flit flowing (on either VC0 or VC2). 
(M chip 7)", .pme_code = 551, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_1@8", .pme_desc = "Wclk cycles BW2MD input port 1 has a flit flowing (on either VC0 or VC2). (M chip 8)", .pme_code = 552, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_1@9", .pme_desc = "Wclk cycles BW2MD input port 1 has a flit flowing (on either VC0 or VC2). (M chip 9)", .pme_code = 553, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_1@10", .pme_desc = "Wclk cycles BW2MD input port 1 has a flit flowing (on either VC0 or VC2). (M chip 10)", .pme_code = 554, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_1@11", .pme_desc = "Wclk cycles BW2MD input port 1 has a flit flowing (on either VC0 or VC2). (M chip 11)", .pme_code = 555, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_1@12", .pme_desc = "Wclk cycles BW2MD input port 1 has a flit flowing (on either VC0 or VC2). 
(M chip 12)", .pme_code = 556, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_1@13", .pme_desc = "Wclk cycles BW2MD input port 1 has a flit flowing (on either VC0 or VC2). (M chip 13)", .pme_code = 557, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_1@14", .pme_desc = "Wclk cycles BW2MD input port 1 has a flit flowing (on either VC0 or VC2). (M chip 14)", .pme_code = 558, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_1@15", .pme_desc = "Wclk cycles BW2MD input port 1 has a flit flowing (on either VC0 or VC2). (M chip 15)", .pme_code = 559, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 5 Event 3 */ { .pme_name = "FWD_READ@0", .pme_desc = "FwdRead packets sent (Exclusive -> PendFwd transition). (M chip 0)", .pme_code = 560, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ@1", .pme_desc = "FwdRead packets sent (Exclusive -> PendFwd transition). 
(M chip 1)", .pme_code = 561, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ@2", .pme_desc = "FwdRead packets sent (Exclusive -> PendFwd transition). (M chip 2)", .pme_code = 562, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ@3", .pme_desc = "FwdRead packets sent (Exclusive -> PendFwd transition). (M chip 3)", .pme_code = 563, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ@4", .pme_desc = "FwdRead packets sent (Exclusive -> PendFwd transition). (M chip 4)", .pme_code = 564, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ@5", .pme_desc = "FwdRead packets sent (Exclusive -> PendFwd transition). (M chip 5)", .pme_code = 565, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ@6", .pme_desc = "FwdRead packets sent (Exclusive -> PendFwd transition). 
(M chip 6)", .pme_code = 566, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ@7", .pme_desc = "FwdRead packets sent (Exclusive -> PendFwd transition). (M chip 7)", .pme_code = 567, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ@8", .pme_desc = "FwdRead packets sent (Exclusive -> PendFwd transition). (M chip 8)", .pme_code = 568, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ@9", .pme_desc = "FwdRead packets sent (Exclusive -> PendFwd transition). (M chip 9)", .pme_code = 569, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ@10", .pme_desc = "FwdRead packets sent (Exclusive -> PendFwd transition). (M chip 10)", .pme_code = 570, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ@11", .pme_desc = "FwdRead packets sent (Exclusive -> PendFwd transition). 
(M chip 11)", .pme_code = 571, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ@12", .pme_desc = "FwdRead packets sent (Exclusive -> PendFwd transition). (M chip 12)", .pme_code = 572, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ@13", .pme_desc = "FwdRead packets sent (Exclusive -> PendFwd transition). (M chip 13)", .pme_code = 573, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ@14", .pme_desc = "FwdRead packets sent (Exclusive -> PendFwd transition). (M chip 14)", .pme_code = 574, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_READ@15", .pme_desc = "FwdRead packets sent (Exclusive -> PendFwd transition). (M chip 15)", .pme_code = 575, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 5, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 6 Event 0 */ { .pme_name = "SUPPLY_INV@0", .pme_desc = "SupplyInv packets received. 
(M chip 0)", .pme_code = 576, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_INV@1", .pme_desc = "SupplyInv packets received. (M chip 1)", .pme_code = 577, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_INV@2", .pme_desc = "SupplyInv packets received. (M chip 2)", .pme_code = 578, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_INV@3", .pme_desc = "SupplyInv packets received. (M chip 3)", .pme_code = 579, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_INV@4", .pme_desc = "SupplyInv packets received. (M chip 4)", .pme_code = 580, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_INV@5", .pme_desc = "SupplyInv packets received. 
(M chip 5)", .pme_code = 581, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_INV@6", .pme_desc = "SupplyInv packets received. (M chip 6)", .pme_code = 582, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_INV@7", .pme_desc = "SupplyInv packets received. (M chip 7)", .pme_code = 583, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_INV@8", .pme_desc = "SupplyInv packets received. (M chip 8)", .pme_code = 584, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_INV@9", .pme_desc = "SupplyInv packets received. (M chip 9)", .pme_code = 585, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_INV@10", .pme_desc = "SupplyInv packets received. 
(M chip 10)", .pme_code = 586, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_INV@11", .pme_desc = "SupplyInv packets received. (M chip 11)", .pme_code = 587, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_INV@12", .pme_desc = "SupplyInv packets received. (M chip 12)", .pme_code = 588, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_INV@13", .pme_desc = "SupplyInv packets received. (M chip 13)", .pme_code = 589, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_INV@14", .pme_desc = "SupplyInv packets received. (M chip 14)", .pme_code = 590, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_INV@15", .pme_desc = "SupplyInv packets received. 
(M chip 15)", .pme_code = 591, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 6 Event 1 */ { .pme_name = "NUM_REPLAY@0", .pme_desc = "Requests sent through replay queue. (M chip 0)", .pme_code = 592, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NUM_REPLAY@1", .pme_desc = "Requests sent through replay queue. (M chip 1)", .pme_code = 593, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NUM_REPLAY@2", .pme_desc = "Requests sent through replay queue. (M chip 2)", .pme_code = 594, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NUM_REPLAY@3", .pme_desc = "Requests sent through replay queue. (M chip 3)", .pme_code = 595, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NUM_REPLAY@4", .pme_desc = "Requests sent through replay queue. 
(M chip 4)", .pme_code = 596, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NUM_REPLAY@5", .pme_desc = "Requests sent through replay queue. (M chip 5)", .pme_code = 597, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NUM_REPLAY@6", .pme_desc = "Requests sent through replay queue. (M chip 6)", .pme_code = 598, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NUM_REPLAY@7", .pme_desc = "Requests sent through replay queue. (M chip 7)", .pme_code = 599, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NUM_REPLAY@8", .pme_desc = "Requests sent through replay queue. (M chip 8)", .pme_code = 600, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NUM_REPLAY@9", .pme_desc = "Requests sent through replay queue. 
(M chip 9)", .pme_code = 601, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NUM_REPLAY@10", .pme_desc = "Requests sent through replay queue. (M chip 10)", .pme_code = 602, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NUM_REPLAY@11", .pme_desc = "Requests sent through replay queue. (M chip 11)", .pme_code = 603, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NUM_REPLAY@12", .pme_desc = "Requests sent through replay queue. (M chip 12)", .pme_code = 604, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NUM_REPLAY@13", .pme_desc = "Requests sent through replay queue. (M chip 13)", .pme_code = 605, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NUM_REPLAY@14", .pme_desc = "Requests sent through replay queue. 
(M chip 14)", .pme_code = 606, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NUM_REPLAY@15", .pme_desc = "Requests sent through replay queue. (M chip 15)", .pme_code = 607, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 6 Event 2 */ { .pme_name = "W_IN_FLOWING_2@0", .pme_desc = "Wclk cycles BW2MD input port 2 has a flit flowing (on either VC0 or VC2). (M chip 0)", .pme_code = 608, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_2@1", .pme_desc = "Wclk cycles BW2MD input port 2 has a flit flowing (on either VC0 or VC2). (M chip 1)", .pme_code = 609, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_2@2", .pme_desc = "Wclk cycles BW2MD input port 2 has a flit flowing (on either VC0 or VC2). (M chip 2)", .pme_code = 610, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_2@3", .pme_desc = "Wclk cycles BW2MD input port 2 has a flit flowing (on either VC0 or VC2). 
(M chip 3)", .pme_code = 611, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_2@4", .pme_desc = "Wclk cycles BW2MD input port 2 has a flit flowing (on either VC0 or VC2). (M chip 4)", .pme_code = 612, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_2@5", .pme_desc = "Wclk cycles BW2MD input port 2 has a flit flowing (on either VC0 or VC2). (M chip 5)", .pme_code = 613, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_2@6", .pme_desc = "Wclk cycles BW2MD input port 2 has a flit flowing (on either VC0 or VC2). (M chip 6)", .pme_code = 614, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_2@7", .pme_desc = "Wclk cycles BW2MD input port 2 has a flit flowing (on either VC0 or VC2). (M chip 7)", .pme_code = 615, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_2@8", .pme_desc = "Wclk cycles BW2MD input port 2 has a flit flowing (on either VC0 or VC2). 
(M chip 8)", .pme_code = 616, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_2@9", .pme_desc = "Wclk cycles BW2MD input port 2 has a flit flowing (on either VC0 or VC2). (M chip 9)", .pme_code = 617, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_2@10", .pme_desc = "Wclk cycles BW2MD input port 2 has a flit flowing (on either VC0 or VC2). (M chip 10)", .pme_code = 618, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_2@11", .pme_desc = "Wclk cycles BW2MD input port 2 has a flit flowing (on either VC0 or VC2). (M chip 11)", .pme_code = 619, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_2@12", .pme_desc = "Wclk cycles BW2MD input port 2 has a flit flowing (on either VC0 or VC2). (M chip 12)", .pme_code = 620, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_2@13", .pme_desc = "Wclk cycles BW2MD input port 2 has a flit flowing (on either VC0 or VC2). 
(M chip 13)", .pme_code = 621, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_2@14", .pme_desc = "Wclk cycles BW2MD input port 2 has a flit flowing (on either VC0 or VC2). (M chip 14)", .pme_code = 622, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_2@15", .pme_desc = "Wclk cycles BW2MD input port 2 has a flit flowing (on either VC0 or VC2). (M chip 15)", .pme_code = 623, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 6 Event 3 */ { .pme_name = "INVAL_1@0", .pme_desc = "Invalidations sent to a single BW. (M chip 0)", .pme_code = 624, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_1@1", .pme_desc = "Invalidations sent to a single BW. (M chip 1)", .pme_code = 625, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_1@2", .pme_desc = "Invalidations sent to a single BW. 
(M chip 2)", .pme_code = 626, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_1@3", .pme_desc = "Invalidations sent to a single BW. (M chip 3)", .pme_code = 627, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_1@4", .pme_desc = "Invalidations sent to a single BW. (M chip 4)", .pme_code = 628, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_1@5", .pme_desc = "Invalidations sent to a single BW. (M chip 5)", .pme_code = 629, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_1@6", .pme_desc = "Invalidations sent to a single BW. (M chip 6)", .pme_code = 630, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_1@7", .pme_desc = "Invalidations sent to a single BW. 
(M chip 7)", .pme_code = 631, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_1@8", .pme_desc = "Invalidations sent to a single BW. (M chip 8)", .pme_code = 632, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_1@9", .pme_desc = "Invalidations sent to a single BW. (M chip 9)", .pme_code = 633, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_1@10", .pme_desc = "Invalidations sent to a single BW. (M chip 10)", .pme_code = 634, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_1@11", .pme_desc = "Invalidations sent to a single BW. (M chip 11)", .pme_code = 635, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_1@12", .pme_desc = "Invalidations sent to a single BW. 
(M chip 12)", .pme_code = 636, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_1@13", .pme_desc = "Invalidations sent to a single BW. (M chip 13)", .pme_code = 637, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_1@14", .pme_desc = "Invalidations sent to a single BW. (M chip 14)", .pme_code = 638, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_1@15", .pme_desc = "Invalidations sent to a single BW. (M chip 15)", .pme_code = 639, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 6, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 7 Event 0 */ { .pme_name = "REQUEST_GETS_4DWORDS_L3_HIT@0", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 hit. (M chip 0)", .pme_code = 640, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_HIT@1", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 hit. 
(M chip 1)", .pme_code = 641, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_HIT@2", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 hit. (M chip 2)", .pme_code = 642, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_HIT@3", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 hit. (M chip 3)", .pme_code = 643, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_HIT@4", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 hit. (M chip 4)", .pme_code = 644, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_HIT@5", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 hit. (M chip 5)", .pme_code = 645, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_HIT@6", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 hit. 
(M chip 6)", .pme_code = 646, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_HIT@7", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 hit. (M chip 7)", .pme_code = 647, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_HIT@8", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 hit. (M chip 8)", .pme_code = 648, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_HIT@9", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 hit. (M chip 9)", .pme_code = 649, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_HIT@10", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 hit. (M chip 10)", .pme_code = 650, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_HIT@11", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 hit. 
(M chip 11)", .pme_code = 651, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_HIT@12", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 hit. (M chip 12)", .pme_code = 652, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_HIT@13", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 hit. (M chip 13)", .pme_code = 653, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_HIT@14", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 hit. (M chip 14)", .pme_code = 654, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_HIT@15", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 hit. 
(M chip 15)", .pme_code = 655, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 7 Event 1 */ { .pme_name = "@0", .pme_desc = "", .pme_code = 656, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@1", .pme_desc = "", .pme_code = 657, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@2", .pme_desc = "", .pme_code = 658, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@3", .pme_desc = "", .pme_code = 659, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@4", .pme_desc = "", .pme_code = 660, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@5", .pme_desc = "", .pme_code = 661, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 5, .pme_base = 
PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@6", .pme_desc = "", .pme_code = 662, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@7", .pme_desc = "", .pme_code = 663, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@8", .pme_desc = "", .pme_code = 664, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@9", .pme_desc = "", .pme_code = 665, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@10", .pme_desc = "", .pme_code = 666, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@11", .pme_desc = "", .pme_code = 667, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@12", .pme_desc = "", .pme_code = 668, .pme_flags = 0x0, .pme_numasks = 
0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@13", .pme_desc = "", .pme_code = 669, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@14", .pme_desc = "", .pme_code = 670, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@15", .pme_desc = "", .pme_code = 671, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 7 Event 2 */ { .pme_name = "W_IN_FLOWING_3@0", .pme_desc = "Wclk cycles BW2MD input port 3 has a flit flowing (on either VC0 or VC2). (M chip 0)", .pme_code = 672, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_3@1", .pme_desc = "Wclk cycles BW2MD input port 3 has a flit flowing (on either VC0 or VC2). 
(M chip 1)", .pme_code = 673, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_3@2", .pme_desc = "Wclk cycles BW2MD input port 3 has a flit flowing (on either VC0 or VC2). (M chip 2)", .pme_code = 674, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_3@3", .pme_desc = "Wclk cycles BW2MD input port 3 has a flit flowing (on either VC0 or VC2). (M chip 3)", .pme_code = 675, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_3@4", .pme_desc = "Wclk cycles BW2MD input port 3 has a flit flowing (on either VC0 or VC2). (M chip 4)", .pme_code = 676, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_3@5", .pme_desc = "Wclk cycles BW2MD input port 3 has a flit flowing (on either VC0 or VC2). (M chip 5)", .pme_code = 677, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_3@6", .pme_desc = "Wclk cycles BW2MD input port 3 has a flit flowing (on either VC0 or VC2). 
(M chip 6)", .pme_code = 678, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_3@7", .pme_desc = "Wclk cycles BW2MD input port 3 has a flit flowing (on either VC0 or VC2). (M chip 7)", .pme_code = 679, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_3@8", .pme_desc = "Wclk cycles BW2MD input port 3 has a flit flowing (on either VC0 or VC2). (M chip 8)", .pme_code = 680, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_3@9", .pme_desc = "Wclk cycles BW2MD input port 3 has a flit flowing (on either VC0 or VC2). (M chip 9)", .pme_code = 681, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_3@10", .pme_desc = "Wclk cycles BW2MD input port 3 has a flit flowing (on either VC0 or VC2). (M chip 10)", .pme_code = 682, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_3@11", .pme_desc = "Wclk cycles BW2MD input port 3 has a flit flowing (on either VC0 or VC2). 
(M chip 11)", .pme_code = 683, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_3@12", .pme_desc = "Wclk cycles BW2MD input port 3 has a flit flowing (on either VC0 or VC2). (M chip 12)", .pme_code = 684, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_3@13", .pme_desc = "Wclk cycles BW2MD input port 3 has a flit flowing (on either VC0 or VC2). (M chip 13)", .pme_code = 685, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_3@14", .pme_desc = "Wclk cycles BW2MD input port 3 has a flit flowing (on either VC0 or VC2). (M chip 14)", .pme_code = 686, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_FLOWING_3@15", .pme_desc = "Wclk cycles BW2MD input port 3 has a flit flowing (on either VC0 or VC2). (M chip 15)", .pme_code = 687, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 7 Event 3 */ { .pme_name = "INVAL_2@0", .pme_desc = "Invalidations sent to two BWs. 
(M chip 0)", .pme_code = 688, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_2@1", .pme_desc = "Invalidations sent to two BWs. (M chip 1)", .pme_code = 689, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_2@2", .pme_desc = "Invalidations sent to two BWs. (M chip 2)", .pme_code = 690, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_2@3", .pme_desc = "Invalidations sent to two BWs. (M chip 3)", .pme_code = 691, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_2@4", .pme_desc = "Invalidations sent to two BWs. (M chip 4)", .pme_code = 692, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_2@5", .pme_desc = "Invalidations sent to two BWs. 
(M chip 5)", .pme_code = 693, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_2@6", .pme_desc = "Invalidations sent to two BWs. (M chip 6)", .pme_code = 694, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_2@7", .pme_desc = "Invalidations sent to two BWs. (M chip 7)", .pme_code = 695, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_2@8", .pme_desc = "Invalidations sent to two BWs. (M chip 8)", .pme_code = 696, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_2@9", .pme_desc = "Invalidations sent to two BWs. (M chip 9)", .pme_code = 697, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_2@10", .pme_desc = "Invalidations sent to two BWs. 
(M chip 10)", .pme_code = 698, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_2@11", .pme_desc = "Invalidations sent to two BWs. (M chip 11)", .pme_code = 699, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_2@12", .pme_desc = "Invalidations sent to two BWs. (M chip 12)", .pme_code = 700, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_2@13", .pme_desc = "Invalidations sent to two BWs. (M chip 13)", .pme_code = 701, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_2@14", .pme_desc = "Invalidations sent to two BWs. (M chip 14)", .pme_code = 702, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_2@15", .pme_desc = "Invalidations sent to two BWs. 
(M chip 15)", .pme_code = 703, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 7, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 8 Event 0 */ { .pme_name = "SUPPLY_SH@0", .pme_desc = "SupplySh packets received. (M chip 0)", .pme_code = 704, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_SH@1", .pme_desc = "SupplySh packets received. (M chip 1)", .pme_code = 705, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_SH@2", .pme_desc = "SupplySh packets received. (M chip 2)", .pme_code = 706, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_SH@3", .pme_desc = "SupplySh packets received. (M chip 3)", .pme_code = 707, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_SH@4", .pme_desc = "SupplySh packets received. 
(M chip 4)", .pme_code = 708, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_SH@5", .pme_desc = "SupplySh packets received. (M chip 5)", .pme_code = 709, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_SH@6", .pme_desc = "SupplySh packets received. (M chip 6)", .pme_code = 710, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_SH@7", .pme_desc = "SupplySh packets received. (M chip 7)", .pme_code = 711, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_SH@8", .pme_desc = "SupplySh packets received. (M chip 8)", .pme_code = 712, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_SH@9", .pme_desc = "SupplySh packets received. 
(M chip 9)", .pme_code = 713, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_SH@10", .pme_desc = "SupplySh packets received. (M chip 10)", .pme_code = 714, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_SH@11", .pme_desc = "SupplySh packets received. (M chip 11)", .pme_code = 715, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_SH@12", .pme_desc = "SupplySh packets received. (M chip 12)", .pme_code = 716, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_SH@13", .pme_desc = "SupplySh packets received. (M chip 13)", .pme_code = 717, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_SH@14", .pme_desc = "SupplySh packets received. 
(M chip 14)", .pme_code = 718, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_SH@15", .pme_desc = "SupplySh packets received. (M chip 15)", .pme_code = 719, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 8 Event 1 */ { .pme_name = "STALL_MM@0", .pme_desc = "Wclk cycles protocol engine request queue stalled due to back-pressure from memory manager. (M chip 0)", .pme_code = 720, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM@1", .pme_desc = "Wclk cycles protocol engine request queue stalled due to back-pressure from memory manager. (M chip 1)", .pme_code = 721, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM@2", .pme_desc = "Wclk cycles protocol engine request queue stalled due to back-pressure from memory manager. 
(M chip 2)", .pme_code = 722, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM@3", .pme_desc = "Wclk cycles protocol engine request queue stalled due to back-pressure from memory manager. (M chip 3)", .pme_code = 723, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM@4", .pme_desc = "Wclk cycles protocol engine request queue stalled due to back-pressure from memory manager. (M chip 4)", .pme_code = 724, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM@5", .pme_desc = "Wclk cycles protocol engine request queue stalled due to back-pressure from memory manager. (M chip 5)", .pme_code = 725, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM@6", .pme_desc = "Wclk cycles protocol engine request queue stalled due to back-pressure from memory manager. 
(M chip 6)", .pme_code = 726, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM@7", .pme_desc = "Wclk cycles protocol engine request queue stalled due to back-pressure from memory manager. (M chip 7)", .pme_code = 727, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM@8", .pme_desc = "Wclk cycles protocol engine request queue stalled due to back-pressure from memory manager. (M chip 8)", .pme_code = 728, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM@9", .pme_desc = "Wclk cycles protocol engine request queue stalled due to back-pressure from memory manager. (M chip 9)", .pme_code = 729, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM@10", .pme_desc = "Wclk cycles protocol engine request queue stalled due to back-pressure from memory manager. 
(M chip 10)", .pme_code = 730, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM@11", .pme_desc = "Wclk cycles protocol engine request queue stalled due to back-pressure from memory manager. (M chip 11)", .pme_code = 731, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM@12", .pme_desc = "Wclk cycles protocol engine request queue stalled due to back-pressure from memory manager. (M chip 12)", .pme_code = 732, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM@13", .pme_desc = "Wclk cycles protocol engine request queue stalled due to back-pressure from memory manager. (M chip 13)", .pme_code = 733, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM@14", .pme_desc = "Wclk cycles protocol engine request queue stalled due to back-pressure from memory manager. 
(M chip 14)", .pme_code = 734, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "STALL_MM@15", .pme_desc = "Wclk cycles protocol engine request queue stalled due to back-pressure from memory manager. (M chip 15)", .pme_code = 735, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 8 Event 2 */ { .pme_name = "W_IN_WAITING_0@0", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 0)", .pme_code = 736, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_0@1", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 1)", .pme_code = 737, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_0@2", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that failed to win arbitration (on either VC0 or VC2). 
(M chip 2)", .pme_code = 738, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_0@3", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 3)", .pme_code = 739, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_0@4", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 4)", .pme_code = 740, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_0@5", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 5)", .pme_code = 741, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_0@6", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that failed to win arbitration (on either VC0 or VC2). 
(M chip 6)", .pme_code = 742, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_0@7", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 7)", .pme_code = 743, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_0@8", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 8)", .pme_code = 744, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_0@9", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 9)", .pme_code = 745, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_0@10", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that failed to win arbitration (on either VC0 or VC2). 
(M chip 10)", .pme_code = 746, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_0@11", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 11)", .pme_code = 747, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_0@12", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 12)", .pme_code = 748, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_0@13", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 13)", .pme_code = 749, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_0@14", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that failed to win arbitration (on either VC0 or VC2). 
(M chip 14)", .pme_code = 750, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_0@15", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 15)", .pme_code = 751, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 8 Event 3 */ { .pme_name = "W_OUT_FLOWING_1@0", .pme_desc = "Wclk cycles MD2BW output port 1 has a flit flowing. (M chip 0)", .pme_code = 752, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_1@1", .pme_desc = "Wclk cycles MD2BW output port 1 has a flit flowing. (M chip 1)", .pme_code = 753, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_1@2", .pme_desc = "Wclk cycles MD2BW output port 1 has a flit flowing. (M chip 2)", .pme_code = 754, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_1@3", .pme_desc = "Wclk cycles MD2BW output port 1 has a flit flowing. 
(M chip 3)", .pme_code = 755, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_1@4", .pme_desc = "Wclk cycles MD2BW output port 1 has a flit flowing. (M chip 4)", .pme_code = 756, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_1@5", .pme_desc = "Wclk cycles MD2BW output port 1 has a flit flowing. (M chip 5)", .pme_code = 757, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_1@6", .pme_desc = "Wclk cycles MD2BW output port 1 has a flit flowing. (M chip 6)", .pme_code = 758, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_1@7", .pme_desc = "Wclk cycles MD2BW output port 1 has a flit flowing. (M chip 7)", .pme_code = 759, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_1@8", .pme_desc = "Wclk cycles MD2BW output port 1 has a flit flowing. 
(M chip 8)", .pme_code = 760, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_1@9", .pme_desc = "Wclk cycles MD2BW output port 1 has a flit flowing. (M chip 9)", .pme_code = 761, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_1@10", .pme_desc = "Wclk cycles MD2BW output port 1 has a flit flowing. (M chip 10)", .pme_code = 762, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_1@11", .pme_desc = "Wclk cycles MD2BW output port 1 has a flit flowing. (M chip 11)", .pme_code = 763, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_1@12", .pme_desc = "Wclk cycles MD2BW output port 1 has a flit flowing. (M chip 12)", .pme_code = 764, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_1@13", .pme_desc = "Wclk cycles MD2BW output port 1 has a flit flowing. 
(M chip 13)", .pme_code = 765, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_1@14", .pme_desc = "Wclk cycles MD2BW output port 1 has a flit flowing. (M chip 14)", .pme_code = 766, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_1@15", .pme_desc = "Wclk cycles MD2BW output port 1 has a flit flowing. (M chip 15)", .pme_code = 767, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 8, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 9 Event 0 */ { .pme_name = "REQUEST_GETS_4DWORDS_L3_MISS@0", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 miss. (M chip 0)", .pme_code = 768, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_MISS@1", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 miss. (M chip 1)", .pme_code = 769, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_MISS@2", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 miss. 
(M chip 2)", .pme_code = 770, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_MISS@3", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 miss. (M chip 3)", .pme_code = 771, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_MISS@4", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 miss. (M chip 4)", .pme_code = 772, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_MISS@5", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 miss. (M chip 5)", .pme_code = 773, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_MISS@6", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 miss. (M chip 6)", .pme_code = 774, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_MISS@7", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 miss. 
(M chip 7)", .pme_code = 775, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_MISS@8", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 miss. (M chip 8)", .pme_code = 776, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_MISS@9", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 miss. (M chip 9)", .pme_code = 777, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_MISS@10", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 miss. (M chip 10)", .pme_code = 778, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_MISS@11", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 miss. (M chip 11)", .pme_code = 779, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_MISS@12", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 miss. 
(M chip 12)", .pme_code = 780, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_MISS@13", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 miss. (M chip 13)", .pme_code = 781, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_MISS@14", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 miss. (M chip 14)", .pme_code = 782, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_GETS_4DWORDS_L3_MISS@15", .pme_desc = "NGet or Get Full cache line requests to MDs - L3 miss. (M chip 15)", .pme_code = 783, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 9 Event 1 */ { .pme_name = "SECTION_BUSY@0", .pme_desc = "Wclk cycles MD pipeline busy. (M chip 0)", .pme_code = 784, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SECTION_BUSY@1", .pme_desc = "Wclk cycles MD pipeline busy. 
(M chip 1)", .pme_code = 785, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SECTION_BUSY@2", .pme_desc = "Wclk cycles MD pipeline busy. (M chip 2)", .pme_code = 786, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SECTION_BUSY@3", .pme_desc = "Wclk cycles MD pipeline busy. (M chip 3)", .pme_code = 787, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SECTION_BUSY@4", .pme_desc = "Wclk cycles MD pipeline busy. (M chip 4)", .pme_code = 788, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SECTION_BUSY@5", .pme_desc = "Wclk cycles MD pipeline busy. (M chip 5)", .pme_code = 789, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SECTION_BUSY@6", .pme_desc = "Wclk cycles MD pipeline busy. 
(M chip 6)", .pme_code = 790, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SECTION_BUSY@7", .pme_desc = "Wclk cycles MD pipeline busy. (M chip 7)", .pme_code = 791, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SECTION_BUSY@8", .pme_desc = "Wclk cycles MD pipeline busy. (M chip 8)", .pme_code = 792, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SECTION_BUSY@9", .pme_desc = "Wclk cycles MD pipeline busy. (M chip 9)", .pme_code = 793, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SECTION_BUSY@10", .pme_desc = "Wclk cycles MD pipeline busy. (M chip 10)", .pme_code = 794, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SECTION_BUSY@11", .pme_desc = "Wclk cycles MD pipeline busy. 
(M chip 11)", .pme_code = 795, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SECTION_BUSY@12", .pme_desc = "Wclk cycles MD pipeline busy. (M chip 12)", .pme_code = 796, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SECTION_BUSY@13", .pme_desc = "Wclk cycles MD pipeline busy. (M chip 13)", .pme_code = 797, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SECTION_BUSY@14", .pme_desc = "Wclk cycles MD pipeline busy. (M chip 14)", .pme_code = 798, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SECTION_BUSY@15", .pme_desc = "Wclk cycles MD pipeline busy. (M chip 15)", .pme_code = 799, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 9 Event 2 */ { .pme_name = "W_IN_WAITING_1@0", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that failed to win arbitration (on either VC0 or VC2). 
(M chip 0)", .pme_code = 800, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_1@1", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 1)", .pme_code = 801, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_1@2", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 2)", .pme_code = 802, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_1@3", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 3)", .pme_code = 803, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_1@4", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that failed to win arbitration (on either VC0 or VC2). 
(M chip 4)", .pme_code = 804, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_1@5", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 5)", .pme_code = 805, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_1@6", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 6)", .pme_code = 806, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_1@7", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 7)", .pme_code = 807, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_1@8", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that failed to win arbitration (on either VC0 or VC2). 
(M chip 8)", .pme_code = 808, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_1@9", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 9)", .pme_code = 809, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_1@10", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 10)", .pme_code = 810, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_1@11", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 11)", .pme_code = 811, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_1@12", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that failed to win arbitration (on either VC0 or VC2). 
(M chip 12)", .pme_code = 812, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_1@13", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 13)", .pme_code = 813, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_1@14", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 14)", .pme_code = 814, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_1@15", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 15)", .pme_code = 815, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 9 Event 3 */ { .pme_name = "W_OUT_FLOWING_2@0", .pme_desc = "Wclk cycles MD2BW output port 2 has a flit flowing. 
(M chip 0)", .pme_code = 816, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_2@1", .pme_desc = "Wclk cycles MD2BW output port 2 has a flit flowing. (M chip 1)", .pme_code = 817, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_2@2", .pme_desc = "Wclk cycles MD2BW output port 2 has a flit flowing. (M chip 2)", .pme_code = 818, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_2@3", .pme_desc = "Wclk cycles MD2BW output port 2 has a flit flowing. (M chip 3)", .pme_code = 819, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_2@4", .pme_desc = "Wclk cycles MD2BW output port 2 has a flit flowing. (M chip 4)", .pme_code = 820, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_2@5", .pme_desc = "Wclk cycles MD2BW output port 2 has a flit flowing. 
(M chip 5)", .pme_code = 821, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_2@6", .pme_desc = "Wclk cycles MD2BW output port 2 has a flit flowing. (M chip 6)", .pme_code = 822, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_2@7", .pme_desc = "Wclk cycles MD2BW output port 2 has a flit flowing. (M chip 7)", .pme_code = 823, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_2@8", .pme_desc = "Wclk cycles MD2BW output port 2 has a flit flowing. (M chip 8)", .pme_code = 824, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_2@9", .pme_desc = "Wclk cycles MD2BW output port 2 has a flit flowing. (M chip 9)", .pme_code = 825, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_2@10", .pme_desc = "Wclk cycles MD2BW output port 2 has a flit flowing. 
(M chip 10)", .pme_code = 826, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_2@11", .pme_desc = "Wclk cycles MD2BW output port 2 has a flit flowing. (M chip 11)", .pme_code = 827, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_2@12", .pme_desc = "Wclk cycles MD2BW output port 2 has a flit flowing. (M chip 12)", .pme_code = 828, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_2@13", .pme_desc = "Wclk cycles MD2BW output port 2 has a flit flowing. (M chip 13)", .pme_code = 829, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_2@14", .pme_desc = "Wclk cycles MD2BW output port 2 has a flit flowing. (M chip 14)", .pme_code = 830, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_2@15", .pme_desc = "Wclk cycles MD2BW output port 2 has a flit flowing. 
(M chip 15)", .pme_code = 831, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 9, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 10 Event 0 */ { .pme_name = "SUPPLY_EXCL@0", .pme_desc = "SupplyExcl packets received. (M chip 0)", .pme_code = 832, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_EXCL@1", .pme_desc = "SupplyExcl packets received. (M chip 1)", .pme_code = 833, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_EXCL@2", .pme_desc = "SupplyExcl packets received. (M chip 2)", .pme_code = 834, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_EXCL@3", .pme_desc = "SupplyExcl packets received. (M chip 3)", .pme_code = 835, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_EXCL@4", .pme_desc = "SupplyExcl packets received. 
(M chip 4)", .pme_code = 836, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_EXCL@5", .pme_desc = "SupplyExcl packets received. (M chip 5)", .pme_code = 837, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_EXCL@6", .pme_desc = "SupplyExcl packets received. (M chip 6)", .pme_code = 838, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_EXCL@7", .pme_desc = "SupplyExcl packets received. (M chip 7)", .pme_code = 839, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_EXCL@8", .pme_desc = "SupplyExcl packets received. (M chip 8)", .pme_code = 840, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_EXCL@9", .pme_desc = "SupplyExcl packets received. 
(M chip 9)", .pme_code = 841, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_EXCL@10", .pme_desc = "SupplyExcl packets received. (M chip 10)", .pme_code = 842, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_EXCL@11", .pme_desc = "SupplyExcl packets received. (M chip 11)", .pme_code = 843, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_EXCL@12", .pme_desc = "SupplyExcl packets received. (M chip 12)", .pme_code = 844, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_EXCL@13", .pme_desc = "SupplyExcl packets received. (M chip 13)", .pme_code = 845, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_EXCL@14", .pme_desc = "SupplyExcl packets received. 
(M chip 14)", .pme_code = 846, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "SUPPLY_EXCL@15", .pme_desc = "SupplyExcl packets received. (M chip 15)", .pme_code = 847, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 10 Event 1 */ { .pme_name = "W_OUT_FLOWING_3@0", .pme_desc = "Wclk cycles MD2BW output port 3 has a flit flowing. (M chip 0)", .pme_code = 848, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_3@1", .pme_desc = "Wclk cycles MD2BW output port 3 has a flit flowing. (M chip 1)", .pme_code = 849, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_3@2", .pme_desc = "Wclk cycles MD2BW output port 3 has a flit flowing. (M chip 2)", .pme_code = 850, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_3@3", .pme_desc = "Wclk cycles MD2BW output port 3 has a flit flowing. 
(M chip 3)", .pme_code = 851, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_3@4", .pme_desc = "Wclk cycles MD2BW output port 3 has a flit flowing. (M chip 4)", .pme_code = 852, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_3@5", .pme_desc = "Wclk cycles MD2BW output port 3 has a flit flowing. (M chip 5)", .pme_code = 853, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_3@6", .pme_desc = "Wclk cycles MD2BW output port 3 has a flit flowing. (M chip 6)", .pme_code = 854, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_3@7", .pme_desc = "Wclk cycles MD2BW output port 3 has a flit flowing. (M chip 7)", .pme_code = 855, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_3@8", .pme_desc = "Wclk cycles MD2BW output port 3 has a flit flowing. 
(M chip 8)", .pme_code = 856, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_3@9", .pme_desc = "Wclk cycles MD2BW output port 3 has a flit flowing. (M chip 9)", .pme_code = 857, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_3@10", .pme_desc = "Wclk cycles MD2BW output port 3 has a flit flowing. (M chip 10)", .pme_code = 858, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_3@11", .pme_desc = "Wclk cycles MD2BW output port 3 has a flit flowing. (M chip 11)", .pme_code = 859, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_3@12", .pme_desc = "Wclk cycles MD2BW output port 3 has a flit flowing. (M chip 12)", .pme_code = 860, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_3@13", .pme_desc = "Wclk cycles MD2BW output port 3 has a flit flowing. 
(M chip 13)", .pme_code = 861, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_3@14", .pme_desc = "Wclk cycles MD2BW output port 3 has a flit flowing. (M chip 14)", .pme_code = 862, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_FLOWING_3@15", .pme_desc = "Wclk cycles MD2BW output port 3 has a flit flowing. (M chip 15)", .pme_code = 863, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 10 Event 2 */ { .pme_name = "W_IN_WAITING_2@0", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 0)", .pme_code = 864, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_2@1", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that failed to win arbitration (on either VC0 or VC2). 
(M chip 1)", .pme_code = 865, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_2@2", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 2)", .pme_code = 866, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_2@3", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 3)", .pme_code = 867, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_2@4", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 4)", .pme_code = 868, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_2@5", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that failed to win arbitration (on either VC0 or VC2). 
(M chip 5)", .pme_code = 869, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_2@6", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 6)", .pme_code = 870, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_2@7", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 7)", .pme_code = 871, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_2@8", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 8)", .pme_code = 872, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_2@9", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that failed to win arbitration (on either VC0 or VC2). 
(M chip 9)", .pme_code = 873, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_2@10", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 10)", .pme_code = 874, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_2@11", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 11)", .pme_code = 875, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_2@12", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 12)", .pme_code = 876, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_2@13", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that failed to win arbitration (on either VC0 or VC2). 
(M chip 13)", .pme_code = 877, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_2@14", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 14)", .pme_code = 878, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_2@15", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 15)", .pme_code = 879, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 10 Event 3 */ { .pme_name = "INVAL_3@0", .pme_desc = "Invalidations sent to three BWs. (M chip 0)", .pme_code = 880, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_3@1", .pme_desc = "Invalidations sent to three BWs. (M chip 1)", .pme_code = 881, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_3@2", .pme_desc = "Invalidations sent to three BWs. 
(M chip 2)", .pme_code = 882, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_3@3", .pme_desc = "Invalidations sent to three BWs. (M chip 3)", .pme_code = 883, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_3@4", .pme_desc = "Invalidations sent to three BWs. (M chip 4)", .pme_code = 884, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_3@5", .pme_desc = "Invalidations sent to three BWs. (M chip 5)", .pme_code = 885, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_3@6", .pme_desc = "Invalidations sent to three BWs. (M chip 6)", .pme_code = 886, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_3@7", .pme_desc = "Invalidations sent to three BWs. 
(M chip 7)", .pme_code = 887, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_3@8", .pme_desc = "Invalidations sent to three BWs. (M chip 8)", .pme_code = 888, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_3@9", .pme_desc = "Invalidations sent to three BWs. (M chip 9)", .pme_code = 889, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_3@10", .pme_desc = "Invalidations sent to three BWs. (M chip 10)", .pme_code = 890, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_3@11", .pme_desc = "Invalidations sent to three BWs. (M chip 11)", .pme_code = 891, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_3@12", .pme_desc = "Invalidations sent to three BWs. 
(M chip 12)", .pme_code = 892, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_3@13", .pme_desc = "Invalidations sent to three BWs. (M chip 13)", .pme_code = 893, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_3@14", .pme_desc = "Invalidations sent to three BWs. (M chip 14)", .pme_code = 894, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_3@15", .pme_desc = "Invalidations sent to three BWs. (M chip 15)", .pme_code = 895, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 10, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 11 Event 0 */ { .pme_name = "NACKS_RECV@0", .pme_desc = "FlushAck and Update Nack packets received (race between forwarded request and eviction by owner). (M chip 0)", .pme_code = 896, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NACKS_RECV@1", .pme_desc = "FlushAck and Update Nack packets received (race between forwarded request and eviction by owner). 
(M chip 1)", .pme_code = 897, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NACKS_RECV@2", .pme_desc = "FlushAck and Update Nack packets received (race between forwarded request and eviction by owner). (M chip 2)", .pme_code = 898, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NACKS_RECV@3", .pme_desc = "FlushAck and Update Nack packets received (race between forwarded request and eviction by owner). (M chip 3)", .pme_code = 899, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NACKS_RECV@4", .pme_desc = "FlushAck and Update Nack packets received (race between forwarded request and eviction by owner). (M chip 4)", .pme_code = 900, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NACKS_RECV@5", .pme_desc = "FlushAck and Update Nack packets received (race between forwarded request and eviction by owner). 
(M chip 5)", .pme_code = 901, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NACKS_RECV@6", .pme_desc = "FlushAck and Update Nack packets received (race between forwarded request and eviction by owner). (M chip 6)", .pme_code = 902, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NACKS_RECV@7", .pme_desc = "FlushAck and Update Nack packets received (race between forwarded request and eviction by owner). (M chip 7)", .pme_code = 903, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NACKS_RECV@8", .pme_desc = "FlushAck and Update Nack packets received (race between forwarded request and eviction by owner). (M chip 8)", .pme_code = 904, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NACKS_RECV@9", .pme_desc = "FlushAck and Update Nack packets received (race between forwarded request and eviction by owner). 
(M chip 9)", .pme_code = 905, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NACKS_RECV@10", .pme_desc = "FlushAck and Update Nack packets received (race between forwarded request and eviction by owner). (M chip 10)", .pme_code = 906, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NACKS_RECV@11", .pme_desc = "FlushAck and Update Nack packets received (race between forwarded request and eviction by owner). (M chip 11)", .pme_code = 907, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NACKS_RECV@12", .pme_desc = "FlushAck and Update Nack packets received (race between forwarded request and eviction by owner). (M chip 12)", .pme_code = 908, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NACKS_RECV@13", .pme_desc = "FlushAck and Update Nack packets received (race between forwarded request and eviction by owner). 
(M chip 13)", .pme_code = 909, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NACKS_RECV@14", .pme_desc = "FlushAck and Update Nack packets received (race between forwarded request and eviction by owner). (M chip 14)", .pme_code = 910, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "NACKS_RECV@15", .pme_desc = "FlushAck and Update Nack packets received (race between forwarded request and eviction by owner). (M chip 15)", .pme_code = 911, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 11 Event 1 */ { .pme_name = "W_OUT_BLOCK_CRED_0@0", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to lack of credits. (M chip 0)", .pme_code = 912, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_0@1", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to lack of credits. 
(M chip 1)", .pme_code = 913, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_0@2", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to lack of credits. (M chip 2)", .pme_code = 914, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_0@3", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to lack of credits. (M chip 3)", .pme_code = 915, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_0@4", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to lack of credits. (M chip 4)", .pme_code = 916, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_0@5", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to lack of credits. (M chip 5)", .pme_code = 917, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_0@6", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to lack of credits. 
(M chip 6)", .pme_code = 918, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_0@7", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to lack of credits. (M chip 7)", .pme_code = 919, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_0@8", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to lack of credits. (M chip 8)", .pme_code = 920, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_0@9", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to lack of credits. (M chip 9)", .pme_code = 921, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_0@10", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to lack of credits. (M chip 10)", .pme_code = 922, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_0@11", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to lack of credits. 
(M chip 11)", .pme_code = 923, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_0@12", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to lack of credits. (M chip 12)", .pme_code = 924, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_0@13", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to lack of credits. (M chip 13)", .pme_code = 925, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_0@14", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to lack of credits. (M chip 14)", .pme_code = 926, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_0@15", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to lack of credits. 
(M chip 15)", .pme_code = 927, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 11 Event 2 */ { .pme_name = "W_IN_WAITING_3@0", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 0)", .pme_code = 928, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_3@1", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 1)", .pme_code = 929, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_3@2", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 2)", .pme_code = 930, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_3@3", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that failed to win arbitration (on either VC0 or VC2). 
(M chip 3)", .pme_code = 931, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_3@4", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 4)", .pme_code = 932, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_3@5", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 5)", .pme_code = 933, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_3@6", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 6)", .pme_code = 934, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_3@7", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that failed to win arbitration (on either VC0 or VC2). 
(M chip 7)", .pme_code = 935, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_3@8", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 8)", .pme_code = 936, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_3@9", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 9)", .pme_code = 937, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_3@10", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 10)", .pme_code = 938, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_3@11", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that failed to win arbitration (on either VC0 or VC2). 
(M chip 11)", .pme_code = 939, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_3@12", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 12)", .pme_code = 940, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_3@13", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 13)", .pme_code = 941, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_3@14", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that failed to win arbitration (on either VC0 or VC2). (M chip 14)", .pme_code = 942, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_WAITING_3@15", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that failed to win arbitration (on either VC0 or VC2). 
(M chip 15)", .pme_code = 943, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 11 Event 3 */ { .pme_name = "INVAL_4@0", .pme_desc = "Invalidations sent to four BWs. (M chip 0)", .pme_code = 944, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_4@1", .pme_desc = "Invalidations sent to four BWs. (M chip 1)", .pme_code = 945, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_4@2", .pme_desc = "Invalidations sent to four BWs. (M chip 2)", .pme_code = 946, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_4@3", .pme_desc = "Invalidations sent to four BWs. (M chip 3)", .pme_code = 947, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_4@4", .pme_desc = "Invalidations sent to four BWs. 
(M chip 4)", .pme_code = 948, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_4@5", .pme_desc = "Invalidations sent to four BWs. (M chip 5)", .pme_code = 949, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_4@6", .pme_desc = "Invalidations sent to four BWs. (M chip 6)", .pme_code = 950, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_4@7", .pme_desc = "Invalidations sent to four BWs. (M chip 7)", .pme_code = 951, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_4@8", .pme_desc = "Invalidations sent to four BWs. (M chip 8)", .pme_code = 952, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_4@9", .pme_desc = "Invalidations sent to four BWs. 
(M chip 9)", .pme_code = 953, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_4@10", .pme_desc = "Invalidations sent to four BWs. (M chip 10)", .pme_code = 954, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_4@11", .pme_desc = "Invalidations sent to four BWs. (M chip 11)", .pme_code = 955, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_4@12", .pme_desc = "Invalidations sent to four BWs. (M chip 12)", .pme_code = 956, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_4@13", .pme_desc = "Invalidations sent to four BWs. (M chip 13)", .pme_code = 957, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_4@14", .pme_desc = "Invalidations sent to four BWs. 
(M chip 14)", .pme_code = 958, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_4@15", .pme_desc = "Invalidations sent to four BWs. (M chip 15)", .pme_code = 959, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 11, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 12 Event 0 */ { .pme_name = "UPDATE_NACK_RECV@0", .pme_desc = "UpdateNacks received. (M chip 0)", .pme_code = 960, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATE_NACK_RECV@1", .pme_desc = "UpdateNacks received. (M chip 1)", .pme_code = 961, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATE_NACK_RECV@2", .pme_desc = "UpdateNacks received. (M chip 2)", .pme_code = 962, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATE_NACK_RECV@3", .pme_desc = "UpdateNacks received. 
(M chip 3)", .pme_code = 963, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATE_NACK_RECV@4", .pme_desc = "UpdateNacks received. (M chip 4)", .pme_code = 964, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATE_NACK_RECV@5", .pme_desc = "UpdateNacks received. (M chip 5)", .pme_code = 965, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATE_NACK_RECV@6", .pme_desc = "UpdateNacks received. (M chip 6)", .pme_code = 966, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATE_NACK_RECV@7", .pme_desc = "UpdateNacks received. (M chip 7)", .pme_code = 967, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATE_NACK_RECV@8", .pme_desc = "UpdateNacks received. 
(M chip 8)", .pme_code = 968, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATE_NACK_RECV@9", .pme_desc = "UpdateNacks received. (M chip 9)", .pme_code = 969, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATE_NACK_RECV@10", .pme_desc = "UpdateNacks received. (M chip 10)", .pme_code = 970, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATE_NACK_RECV@11", .pme_desc = "UpdateNacks received. (M chip 11)", .pme_code = 971, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATE_NACK_RECV@12", .pme_desc = "UpdateNacks received. (M chip 12)", .pme_code = 972, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATE_NACK_RECV@13", .pme_desc = "UpdateNacks received. 
(M chip 13)", .pme_code = 973, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATE_NACK_RECV@14", .pme_desc = "UpdateNacks received. (M chip 14)", .pme_code = 974, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "UPDATE_NACK_RECV@15", .pme_desc = "UpdateNacks received. (M chip 15)", .pme_code = 975, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 12 Event 1 */ { .pme_name = "W_OUT_BLOCK_CRED_1@0", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to lack of credits. (M chip 0)", .pme_code = 976, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_1@1", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to lack of credits. (M chip 1)", .pme_code = 977, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_1@2", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to lack of credits. 
(M chip 2)", .pme_code = 978, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_1@3", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to lack of credits. (M chip 3)", .pme_code = 979, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_1@4", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to lack of credits. (M chip 4)", .pme_code = 980, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_1@5", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to lack of credits. (M chip 5)", .pme_code = 981, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_1@6", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to lack of credits. (M chip 6)", .pme_code = 982, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_1@7", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to lack of credits. 
(M chip 7)", .pme_code = 983, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_1@8", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to lack of credits. (M chip 8)", .pme_code = 984, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_1@9", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to lack of credits. (M chip 9)", .pme_code = 985, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_1@10", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to lack of credits. (M chip 10)", .pme_code = 986, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_1@11", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to lack of credits. (M chip 11)", .pme_code = 987, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_1@12", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to lack of credits. 
(M chip 12)", .pme_code = 988, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_1@13", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to lack of credits. (M chip 13)", .pme_code = 989, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_1@14", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to lack of credits. (M chip 14)", .pme_code = 990, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_1@15", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to lack of credits. (M chip 15)", .pme_code = 991, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 12 Event 2 */ { .pme_name = "W_IN_BLOCKED_0@0", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that is blocked due to MD full. 
(M chip 0)", .pme_code = 992, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_0@1", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that is blocked due to MD full. (M chip 1)", .pme_code = 993, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_0@2", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that is blocked due to MD full. (M chip 2)", .pme_code = 994, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_0@3", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that is blocked due to MD full. (M chip 3)", .pme_code = 995, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_0@4", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that is blocked due to MD full. 
(M chip 4)", .pme_code = 996, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_0@5", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that is blocked due to MD full. (M chip 5)", .pme_code = 997, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_0@6", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that is blocked due to MD full. (M chip 6)", .pme_code = 998, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_0@7", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that is blocked due to MD full. (M chip 7)", .pme_code = 999, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_0@8", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that is blocked due to MD full. 
(M chip 8)", .pme_code = 1000, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_0@9", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that is blocked due to MD full. (M chip 9)", .pme_code = 1001, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_0@10", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that is blocked due to MD full. (M chip 10)", .pme_code = 1002, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_0@11", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that is blocked due to MD full. (M chip 11)", .pme_code = 1003, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_0@12", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that is blocked due to MD full. 
(M chip 12)", .pme_code = 1004, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_0@13", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that is blocked due to MD full. (M chip 13)", .pme_code = 1005, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_0@14", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that is blocked due to MD full. (M chip 14)", .pme_code = 1006, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_0@15", .pme_desc = "Wclk cycles BW2MD input port 0 has a packet waiting that is blocked due to MD full. (M chip 15)", .pme_code = 1007, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 12 Event 3 */ { .pme_name = "FWD_GET_SENT@0", .pme_desc = "FwdGet packets sent (Exclusive -> PendFwd transition). 
(M chip 0)", .pme_code = 1008, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_GET_SENT@1", .pme_desc = "FwdGet packets sent (Exclusive -> PendFwd transition). (M chip 1)", .pme_code = 1009, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_GET_SENT@2", .pme_desc = "FwdGet packets sent (Exclusive -> PendFwd transition). (M chip 2)", .pme_code = 1010, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_GET_SENT@3", .pme_desc = "FwdGet packets sent (Exclusive -> PendFwd transition). (M chip 3)", .pme_code = 1011, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_GET_SENT@4", .pme_desc = "FwdGet packets sent (Exclusive -> PendFwd transition). (M chip 4)", .pme_code = 1012, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_GET_SENT@5", .pme_desc = "FwdGet packets sent (Exclusive -> PendFwd transition). 
(M chip 5)", .pme_code = 1013, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_GET_SENT@6", .pme_desc = "FwdGet packets sent (Exclusive -> PendFwd transition). (M chip 6)", .pme_code = 1014, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_GET_SENT@7", .pme_desc = "FwdGet packets sent (Exclusive -> PendFwd transition). (M chip 7)", .pme_code = 1015, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_GET_SENT@8", .pme_desc = "FwdGet packets sent (Exclusive -> PendFwd transition). (M chip 8)", .pme_code = 1016, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_GET_SENT@9", .pme_desc = "FwdGet packets sent (Exclusive -> PendFwd transition). (M chip 9)", .pme_code = 1017, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_GET_SENT@10", .pme_desc = "FwdGet packets sent (Exclusive -> PendFwd transition). 
(M chip 10)", .pme_code = 1018, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_GET_SENT@11", .pme_desc = "FwdGet packets sent (Exclusive -> PendFwd transition). (M chip 11)", .pme_code = 1019, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_GET_SENT@12", .pme_desc = "FwdGet packets sent (Exclusive -> PendFwd transition). (M chip 12)", .pme_code = 1020, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_GET_SENT@13", .pme_desc = "FwdGet packets sent (Exclusive -> PendFwd transition). (M chip 13)", .pme_code = 1021, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_GET_SENT@14", .pme_desc = "FwdGet packets sent (Exclusive -> PendFwd transition). (M chip 14)", .pme_code = 1022, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FWD_GET_SENT@15", .pme_desc = "FwdGet packets sent (Exclusive -> PendFwd transition). 
(M chip 15)", .pme_code = 1023, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 12, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 13 Event 0 */ { .pme_name = "PEND_DROP@0", .pme_desc = "Times entering PendDrop state (from Shared). (M chip 0)", .pme_code = 1024, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PEND_DROP@1", .pme_desc = "Times entering PendDrop state (from Shared). (M chip 1)", .pme_code = 1025, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PEND_DROP@2", .pme_desc = "Times entering PendDrop state (from Shared). (M chip 2)", .pme_code = 1026, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PEND_DROP@3", .pme_desc = "Times entering PendDrop state (from Shared). (M chip 3)", .pme_code = 1027, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PEND_DROP@4", .pme_desc = "Times entering PendDrop state (from Shared). 
(M chip 4)", .pme_code = 1028, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PEND_DROP@5", .pme_desc = "Times entering PendDrop state (from Shared). (M chip 5)", .pme_code = 1029, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PEND_DROP@6", .pme_desc = "Times entering PendDrop state (from Shared). (M chip 6)", .pme_code = 1030, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PEND_DROP@7", .pme_desc = "Times entering PendDrop state (from Shared). (M chip 7)", .pme_code = 1031, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PEND_DROP@8", .pme_desc = "Times entering PendDrop state (from Shared). (M chip 8)", .pme_code = 1032, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PEND_DROP@9", .pme_desc = "Times entering PendDrop state (from Shared). 
(M chip 9)", .pme_code = 1033, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PEND_DROP@10", .pme_desc = "Times entering PendDrop state (from Shared). (M chip 10)", .pme_code = 1034, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PEND_DROP@11", .pme_desc = "Times entering PendDrop state (from Shared). (M chip 11)", .pme_code = 1035, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PEND_DROP@12", .pme_desc = "Times entering PendDrop state (from Shared). (M chip 12)", .pme_code = 1036, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PEND_DROP@13", .pme_desc = "Times entering PendDrop state (from Shared). (M chip 13)", .pme_code = 1037, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PEND_DROP@14", .pme_desc = "Times entering PendDrop state (from Shared). 
(M chip 14)", .pme_code = 1038, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "PEND_DROP@15", .pme_desc = "Times entering PendDrop state (from Shared). (M chip 15)", .pme_code = 1039, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 13 Event 1 */ { .pme_name = "LINE_EVICTIONS@0", .pme_desc = "Counts lines that are evicted. Note: doesn't count AMO forced evictions. Also note that the counter will increment if the line is not dirty and it is evicted. (M chip 0)", .pme_code = 1040, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "LINE_EVICTIONS@1", .pme_desc = "Counts lines that are evicted. Note: doesn't count AMO forced evictions. Also note that the counter will increment if the line is not dirty and it is evicted. (M chip 1)", .pme_code = 1041, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "LINE_EVICTIONS@2", .pme_desc = "Counts lines that are evicted. Note: doesn't count AMO forced evictions. Also note that the counter will increment if the line is not dirty and it is evicted. 
(M chip 2)", .pme_code = 1042, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "LINE_EVICTIONS@3", .pme_desc = "Counts lines that are evicted. Note: doesn't count AMO forced evictions. Also note that the counter will increment if the line is not dirty and it is evicted. (M chip 3)", .pme_code = 1043, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "LINE_EVICTIONS@4", .pme_desc = "Counts lines that are evicted. Note: doesn't count AMO forced evictions. Also note that the counter will increment if the line is not dirty and it is evicted. (M chip 4)", .pme_code = 1044, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "LINE_EVICTIONS@5", .pme_desc = "Counts lines that are evicted. Note: doesn't count AMO forced evictions. Also note that the counter will increment if the line is not dirty and it is evicted. (M chip 5)", .pme_code = 1045, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "LINE_EVICTIONS@6", .pme_desc = "Counts lines that are evicted. Note: doesn't count AMO forced evictions. Also note that the counter will increment if the line is not dirty and it is evicted. 
(M chip 6)", .pme_code = 1046, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "LINE_EVICTIONS@7", .pme_desc = "Counts lines that are evicted. Note: doesn't count AMO forced evictions. Also note that the counter will increment if the line is not dirty and it is evicted. (M chip 7)", .pme_code = 1047, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "LINE_EVICTIONS@8", .pme_desc = "Counts lines that are evicted. Note: doesn't count AMO forced evictions. Also note that the counter will increment if the line is not dirty and it is evicted. (M chip 8)", .pme_code = 1048, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "LINE_EVICTIONS@9", .pme_desc = "Counts lines that are evicted. Note: doesn't count AMO forced evictions. Also note that the counter will increment if the line is not dirty and it is evicted. (M chip 9)", .pme_code = 1049, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "LINE_EVICTIONS@10", .pme_desc = "Counts lines that are evicted. Note: doesn't count AMO forced evictions. Also note that the counter will increment if the line is not dirty and it is evicted. 
(M chip 10)", .pme_code = 1050, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "LINE_EVICTIONS@11", .pme_desc = "Counts lines that are evicted. Note: doesn't count AMO forced evictions. Also note that the counter will increment if the line is not dirty and it is evicted. (M chip 11)", .pme_code = 1051, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "LINE_EVICTIONS@12", .pme_desc = "Counts lines that are evicted. Note: doesn't count AMO forced evictions. Also note that the counter will increment if the line is not dirty and it is evicted. (M chip 12)", .pme_code = 1052, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "LINE_EVICTIONS@13", .pme_desc = "Counts lines that are evicted. Note: doesn't count AMO forced evictions. Also note that the counter will increment if the line is not dirty and it is evicted. (M chip 13)", .pme_code = 1053, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "LINE_EVICTIONS@14", .pme_desc = "Counts lines that are evicted. Note: doesn't count AMO forced evictions. Also note that the counter will increment if the line is not dirty and it is evicted. 
(M chip 14)", .pme_code = 1054, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "LINE_EVICTIONS@15", .pme_desc = "Counts lines that are evicted. Note: doesn't count AMO forced evictions. Also note that the counter will increment if the line is not dirty and it is evicted. (M chip 15)", .pme_code = 1055, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 13 Event 2 */ { .pme_name = "W_IN_BLOCKED_1@0", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that is blocked due to MD full. (M chip 0)", .pme_code = 1056, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_1@1", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that is blocked due to MD full. (M chip 1)", .pme_code = 1057, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_1@2", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that is blocked due to MD full. 
(M chip 2)", .pme_code = 1058, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_1@3", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that is blocked due to MD full. (M chip 3)", .pme_code = 1059, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_1@4", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that is blocked due to MD full. (M chip 4)", .pme_code = 1060, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_1@5", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that is blocked due to MD full. (M chip 5)", .pme_code = 1061, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_1@6", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that is blocked due to MD full. 
(M chip 6)", .pme_code = 1062, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_1@7", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that is blocked due to MD full. (M chip 7)", .pme_code = 1063, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_1@8", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that is blocked due to MD full. (M chip 8)", .pme_code = 1064, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_1@9", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that is blocked due to MD full. (M chip 9)", .pme_code = 1065, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_1@10", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that is blocked due to MD full. 
(M chip 10)", .pme_code = 1066, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_1@11", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that is blocked due to MD full. (M chip 11)", .pme_code = 1067, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_1@12", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that is blocked due to MD full. (M chip 12)", .pme_code = 1068, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_1@13", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that is blocked due to MD full. (M chip 13)", .pme_code = 1069, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_1@14", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that is blocked due to MD full. 
(M chip 14)", .pme_code = 1070, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_1@15", .pme_desc = "Wclk cycles BW2MD input port 1 has a packet waiting that is blocked due to MD full. (M chip 15)", .pme_code = 1071, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 13 Event 3 */ { .pme_name = "FLUSH_REQ_PACKETS@0", .pme_desc = "FlushReq packets sent (Exclusive -> PendFwd transition). (M chip 0)", .pme_code = 1072, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FLUSH_REQ_PACKETS@1", .pme_desc = "FlushReq packets sent (Exclusive -> PendFwd transition). (M chip 1)", .pme_code = 1073, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FLUSH_REQ_PACKETS@2", .pme_desc = "FlushReq packets sent (Exclusive -> PendFwd transition). (M chip 2)", .pme_code = 1074, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FLUSH_REQ_PACKETS@3", .pme_desc = "FlushReq packets sent (Exclusive -> PendFwd transition). 
(M chip 3)", .pme_code = 1075, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FLUSH_REQ_PACKETS@4", .pme_desc = "FlushReq packets sent (Exclusive -> PendFwd transition). (M chip 4)", .pme_code = 1076, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FLUSH_REQ_PACKETS@5", .pme_desc = "FlushReq packets sent (Exclusive -> PendFwd transition). (M chip 5)", .pme_code = 1077, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FLUSH_REQ_PACKETS@6", .pme_desc = "FlushReq packets sent (Exclusive -> PendFwd transition). (M chip 6)", .pme_code = 1078, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FLUSH_REQ_PACKETS@7", .pme_desc = "FlushReq packets sent (Exclusive -> PendFwd transition). (M chip 7)", .pme_code = 1079, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FLUSH_REQ_PACKETS@8", .pme_desc = "FlushReq packets sent (Exclusive -> PendFwd transition). 
(M chip 8)", .pme_code = 1080, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FLUSH_REQ_PACKETS@9", .pme_desc = "FlushReq packets sent (Exclusive -> PendFwd transition). (M chip 9)", .pme_code = 1081, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FLUSH_REQ_PACKETS@10", .pme_desc = "FlushReq packets sent (Exclusive -> PendFwd transition). (M chip 10)", .pme_code = 1082, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FLUSH_REQ_PACKETS@11", .pme_desc = "FlushReq packets sent (Exclusive -> PendFwd transition). (M chip 11)", .pme_code = 1083, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FLUSH_REQ_PACKETS@12", .pme_desc = "FlushReq packets sent (Exclusive -> PendFwd transition). (M chip 12)", .pme_code = 1084, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FLUSH_REQ_PACKETS@13", .pme_desc = "FlushReq packets sent (Exclusive -> PendFwd transition). 
(M chip 13)", .pme_code = 1085, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FLUSH_REQ_PACKETS@14", .pme_desc = "FlushReq packets sent (Exclusive -> PendFwd transition). (M chip 14)", .pme_code = 1086, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "FLUSH_REQ_PACKETS@15", .pme_desc = "FlushReq packets sent (Exclusive -> PendFwd transition). (M chip 15)", .pme_code = 1087, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 13, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 14 Event 0 */ { .pme_name = "INVAL_EVENTS@0", .pme_desc = "Invalidation events (any number of sharers). (M chip 0)", .pme_code = 1088, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_EVENTS@1", .pme_desc = "Invalidation events (any number of sharers). (M chip 1)", .pme_code = 1089, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_EVENTS@2", .pme_desc = "Invalidation events (any number of sharers). 
(M chip 2)", .pme_code = 1090, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_EVENTS@3", .pme_desc = "Invalidation events (any number of sharers). (M chip 3)", .pme_code = 1091, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_EVENTS@4", .pme_desc = "Invalidation events (any number of sharers). (M chip 4)", .pme_code = 1092, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_EVENTS@5", .pme_desc = "Invalidation events (any number of sharers). (M chip 5)", .pme_code = 1093, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_EVENTS@6", .pme_desc = "Invalidation events (any number of sharers). (M chip 6)", .pme_code = 1094, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_EVENTS@7", .pme_desc = "Invalidation events (any number of sharers). 
(M chip 7)", .pme_code = 1095, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_EVENTS@8", .pme_desc = "Invalidation events (any number of sharers). (M chip 8)", .pme_code = 1096, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_EVENTS@9", .pme_desc = "Invalidation events (any number of sharers). (M chip 9)", .pme_code = 1097, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_EVENTS@10", .pme_desc = "Invalidation events (any number of sharers). (M chip 10)", .pme_code = 1098, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_EVENTS@11", .pme_desc = "Invalidation events (any number of sharers). (M chip 11)", .pme_code = 1099, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_EVENTS@12", .pme_desc = "Invalidation events (any number of sharers). 
(M chip 12)", .pme_code = 1100, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_EVENTS@13", .pme_desc = "Invalidation events (any number of sharers). (M chip 13)", .pme_code = 1101, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_EVENTS@14", .pme_desc = "Invalidation events (any number of sharers). (M chip 14)", .pme_code = 1102, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "INVAL_EVENTS@15", .pme_desc = "Invalidation events (any number of sharers). (M chip 15)", .pme_code = 1103, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 14 Event 1 */ { .pme_name = "L3_LINE_HIT_GLOBAL@0", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was global. (M chip 0)", .pme_code = 1104, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_GLOBAL@1", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was global. 
(M chip 1)", .pme_code = 1105, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_GLOBAL@2", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was global. (M chip 2)", .pme_code = 1106, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_GLOBAL@3", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was global. (M chip 3)", .pme_code = 1107, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_GLOBAL@4", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was global. (M chip 4)", .pme_code = 1108, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_GLOBAL@5", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was global. 
(M chip 5)", .pme_code = 1109, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_GLOBAL@6", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was global. (M chip 6)", .pme_code = 1110, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_GLOBAL@7", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was global. (M chip 7)", .pme_code = 1111, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_GLOBAL@8", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was global. (M chip 8)", .pme_code = 1112, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_GLOBAL@9", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was global. 
(M chip 9)", .pme_code = 1113, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_GLOBAL@10", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was global. (M chip 10)", .pme_code = 1114, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_GLOBAL@11", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was global. (M chip 11)", .pme_code = 1115, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_GLOBAL@12", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was global. (M chip 12)", .pme_code = 1116, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_GLOBAL@13", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was global. 
(M chip 13)", .pme_code = 1117, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_GLOBAL@14", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was global. (M chip 14)", .pme_code = 1118, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_GLOBAL@15", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was global. (M chip 15)", .pme_code = 1119, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 14 Event 2 */ { .pme_name = "W_IN_BLOCKED_2@0", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that is blocked due to MD full. (M chip 0)", .pme_code = 1120, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_2@1", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that is blocked due to MD full. 
(M chip 1)", .pme_code = 1121, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_2@2", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that is blocked due to MD full. (M chip 2)", .pme_code = 1122, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_2@3", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that is blocked due to MD full. (M chip 3)", .pme_code = 1123, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_2@4", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that is blocked due to MD full. (M chip 4)", .pme_code = 1124, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_2@5", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that is blocked due to MD full. 
(M chip 5)", .pme_code = 1125, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_2@6", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that is blocked due to MD full. (M chip 6)", .pme_code = 1126, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_2@7", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that is blocked due to MD full. (M chip 7)", .pme_code = 1127, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_2@8", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that is blocked due to MD full. (M chip 8)", .pme_code = 1128, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_2@9", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that is blocked due to MD full. 
(M chip 9)", .pme_code = 1129, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_2@10", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that is blocked due to MD full. (M chip 10)", .pme_code = 1130, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_2@11", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that is blocked due to MD full. (M chip 11)", .pme_code = 1131, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_2@12", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that is blocked due to MD full. (M chip 12)", .pme_code = 1132, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_2@13", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that is blocked due to MD full. 
(M chip 13)", .pme_code = 1133, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_2@14", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that is blocked due to MD full. (M chip 14)", .pme_code = 1134, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_2@15", .pme_desc = "Wclk cycles BW2MD input port 2 has a packet waiting that is blocked due to MD full. (M chip 15)", .pme_code = 1135, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 14 Event 3 */ { .pme_name = "W_OUT_BLOCK_CRED_2@0", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to lack of credits. (M chip 0)", .pme_code = 1136, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_2@1", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to lack of credits. 
(M chip 1)", .pme_code = 1137, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_2@2", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to lack of credits. (M chip 2)", .pme_code = 1138, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_2@3", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to lack of credits. (M chip 3)", .pme_code = 1139, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_2@4", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to lack of credits. (M chip 4)", .pme_code = 1140, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_2@5", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to lack of credits. (M chip 5)", .pme_code = 1141, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_2@6", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to lack of credits. 
(M chip 6)", .pme_code = 1142, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_2@7", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to lack of credits. (M chip 7)", .pme_code = 1143, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_2@8", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to lack of credits. (M chip 8)", .pme_code = 1144, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_2@9", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to lack of credits. (M chip 9)", .pme_code = 1145, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_2@10", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to lack of credits. (M chip 10)", .pme_code = 1146, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_2@11", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to lack of credits. 
(M chip 11)", .pme_code = 1147, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_2@12", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to lack of credits. (M chip 12)", .pme_code = 1148, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_2@13", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to lack of credits. (M chip 13)", .pme_code = 1149, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_2@14", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to lack of credits. (M chip 14)", .pme_code = 1150, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_2@15", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to lack of credits. (M chip 15)", .pme_code = 1151, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 14, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 15 Event 0 */ { .pme_name = "REQUEST_ALLOC_NO_FILL@0", .pme_desc = "Allocating no fill requests. 
(M chip 0)", .pme_code = 1152, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_ALLOC_NO_FILL@1", .pme_desc = "Allocating no fill requests. (M chip 1)", .pme_code = 1153, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_ALLOC_NO_FILL@2", .pme_desc = "Allocating no fill requests. (M chip 2)", .pme_code = 1154, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_ALLOC_NO_FILL@3", .pme_desc = "Allocating no fill requests. (M chip 3)", .pme_code = 1155, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_ALLOC_NO_FILL@4", .pme_desc = "Allocating no fill requests. (M chip 4)", .pme_code = 1156, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_ALLOC_NO_FILL@5", .pme_desc = "Allocating no fill requests. 
(M chip 5)", .pme_code = 1157, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_ALLOC_NO_FILL@6", .pme_desc = "Allocating no fill requests. (M chip 6)", .pme_code = 1158, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_ALLOC_NO_FILL@7", .pme_desc = "Allocating no fill requests. (M chip 7)", .pme_code = 1159, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_ALLOC_NO_FILL@8", .pme_desc = "Allocating no fill requests. (M chip 8)", .pme_code = 1160, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_ALLOC_NO_FILL@9", .pme_desc = "Allocating no fill requests. (M chip 9)", .pme_code = 1161, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_ALLOC_NO_FILL@10", .pme_desc = "Allocating no fill requests. 
(M chip 10)", .pme_code = 1162, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_ALLOC_NO_FILL@11", .pme_desc = "Allocating no fill requests. (M chip 11)", .pme_code = 1163, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_ALLOC_NO_FILL@12", .pme_desc = "Allocating no fill requests. (M chip 12)", .pme_code = 1164, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_ALLOC_NO_FILL@13", .pme_desc = "Allocating no fill requests. (M chip 13)", .pme_code = 1165, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_ALLOC_NO_FILL@14", .pme_desc = "Allocating no fill requests. (M chip 14)", .pme_code = 1166, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_ALLOC_NO_FILL@15", .pme_desc = "Allocating no fill requests. 
(M chip 15)", .pme_code = 1167, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 15 Event 1 */ { .pme_name = "L3_LINE_HIT_SHARED@0", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was shared. (M chip 0)", .pme_code = 1168, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_SHARED@1", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was shared. (M chip 1)", .pme_code = 1169, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_SHARED@2", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was shared. (M chip 2)", .pme_code = 1170, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_SHARED@3", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was shared. 
(M chip 3)", .pme_code = 1171, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_SHARED@4", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was shared. (M chip 4)", .pme_code = 1172, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_SHARED@5", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was shared. (M chip 5)", .pme_code = 1173, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_SHARED@6", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was shared. (M chip 6)", .pme_code = 1174, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_SHARED@7", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was shared. 
(M chip 7)", .pme_code = 1175, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_SHARED@8", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was shared. (M chip 8)", .pme_code = 1176, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_SHARED@9", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was shared. (M chip 9)", .pme_code = 1177, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_SHARED@10", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was shared. (M chip 10)", .pme_code = 1178, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_SHARED@11", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was shared. 
(M chip 11)", .pme_code = 1179, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_SHARED@12", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was shared. (M chip 12)", .pme_code = 1180, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_SHARED@13", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was shared. (M chip 13)", .pme_code = 1181, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_SHARED@14", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was shared. (M chip 14)", .pme_code = 1182, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "L3_LINE_HIT_SHARED@15", .pme_desc = "Allocating read requests that hit out of L3 cached data and state was shared. 
(M chip 15)", .pme_code = 1183, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 15 Event 2 */ { .pme_name = "W_IN_BLOCKED_3@0", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that is blocked due to MD full. (M chip 0)", .pme_code = 1184, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_3@1", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that is blocked due to MD full. (M chip 1)", .pme_code = 1185, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_3@2", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that is blocked due to MD full. (M chip 2)", .pme_code = 1186, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_3@3", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that is blocked due to MD full. 
(M chip 3)", .pme_code = 1187, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_3@4", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that is blocked due to MD full. (M chip 4)", .pme_code = 1188, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_3@5", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that is blocked due to MD full. (M chip 5)", .pme_code = 1189, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_3@6", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that is blocked due to MD full. (M chip 6)", .pme_code = 1190, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_3@7", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that is blocked due to MD full. 
(M chip 7)", .pme_code = 1191, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_3@8", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that is blocked due to MD full. (M chip 8)", .pme_code = 1192, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_3@9", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that is blocked due to MD full. (M chip 9)", .pme_code = 1193, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_3@10", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that is blocked due to MD full. (M chip 10)", .pme_code = 1194, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_3@11", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that is blocked due to MD full. 
(M chip 11)", .pme_code = 1195, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_3@12", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that is blocked due to MD full. (M chip 12)", .pme_code = 1196, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_3@13", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that is blocked due to MD full. (M chip 13)", .pme_code = 1197, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_3@14", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that is blocked due to MD full. (M chip 14)", .pme_code = 1198, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_IN_BLOCKED_3@15", .pme_desc = "Wclk cycles BW2MD input port 3 has a packet waiting that is blocked due to MD full. 
(M chip 15)", .pme_code = 1199, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 15 Event 3 */ { .pme_name = "W_OUT_BLOCK_CRED_3@0", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to lack of credits. (M chip 0)", .pme_code = 1200, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_3@1", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to lack of credits. (M chip 1)", .pme_code = 1201, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_3@2", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to lack of credits. (M chip 2)", .pme_code = 1202, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_3@3", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to lack of credits. 
(M chip 3)", .pme_code = 1203, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_3@4", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to lack of credits. (M chip 4)", .pme_code = 1204, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_3@5", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to lack of credits. (M chip 5)", .pme_code = 1205, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_3@6", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to lack of credits. (M chip 6)", .pme_code = 1206, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_3@7", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to lack of credits. (M chip 7)", .pme_code = 1207, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_3@8", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to lack of credits. 
(M chip 8)", .pme_code = 1208, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_3@9", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to lack of credits. (M chip 9)", .pme_code = 1209, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_3@10", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to lack of credits. (M chip 10)", .pme_code = 1210, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_3@11", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to lack of credits. (M chip 11)", .pme_code = 1211, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_3@12", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to lack of credits. (M chip 12)", .pme_code = 1212, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_3@13", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to lack of credits. 
(M chip 13)", .pme_code = 1213, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_3@14", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to lack of credits. (M chip 14)", .pme_code = 1214, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CRED_3@15", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to lack of credits. (M chip 15)", .pme_code = 1215, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 15, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 16 Event 0 */ { .pme_name = "REQUEST_1DWORD_L3_HIT@0", .pme_desc = "Single DWord Get and NGet requests to MDs - L3 hit. (M chip 0)", .pme_code = 1216, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_HIT@1", .pme_desc = "Single DWord Get and NGet requests to MDs - L3 hit. (M chip 1)", .pme_code = 1217, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_HIT@2", .pme_desc = "Single DWord Get and NGet requests to MDs - L3 hit. 
(M chip 2)", .pme_code = 1218, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_HIT@3", .pme_desc = "Single DWord Get and NGet requests to MDs - L3 hit. (M chip 3)", .pme_code = 1219, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_HIT@4", .pme_desc = "Single DWord Get and NGet requests to MDs - L3 hit. (M chip 4)", .pme_code = 1220, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_HIT@5", .pme_desc = "Single DWord Get and NGet requests to MDs - L3 hit. (M chip 5)", .pme_code = 1221, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_HIT@6", .pme_desc = "Single DWord Get and NGet requests to MDs - L3 hit. (M chip 6)", .pme_code = 1222, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_HIT@7", .pme_desc = "Single DWord Get and NGet requests to MDs - L3 hit. 
(M chip 7)", .pme_code = 1223, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_HIT@8", .pme_desc = "Single DWord Get and NGet requests to MDs - L3 hit. (M chip 8)", .pme_code = 1224, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_HIT@9", .pme_desc = "Single DWord Get and NGet requests to MDs - L3 hit. (M chip 9)", .pme_code = 1225, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_HIT@10", .pme_desc = "Single DWord Get and NGet requests to MDs - L3 hit. (M chip 10)", .pme_code = 1226, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_HIT@11", .pme_desc = "Single DWord Get and NGet requests to MDs - L3 hit. (M chip 11)", .pme_code = 1227, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_HIT@12", .pme_desc = "Single DWord Get and NGet requests to MDs - L3 hit. 
(M chip 12)", .pme_code = 1228, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_HIT@13", .pme_desc = "Single DWord Get and NGet requests to MDs - L3 hit. (M chip 13)", .pme_code = 1229, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_HIT@14", .pme_desc = "Single DWord Get and NGet requests to MDs - L3 hit. (M chip 14)", .pme_code = 1230, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_HIT@15", .pme_desc = "Single DWord Get and NGet requests to MDs - L3 hit. (M chip 15)", .pme_code = 1231, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 16 Event 1 */ { .pme_name = "AMOS@0", .pme_desc = "AMOs to local memory (memory manager). (M chip 0)", .pme_code = 1232, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMOS@1", .pme_desc = "AMOs to local memory (memory manager). 
(M chip 1)", .pme_code = 1233, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMOS@2", .pme_desc = "AMOs to local memory (memory manager). (M chip 2)", .pme_code = 1234, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMOS@3", .pme_desc = "AMOs to local memory (memory manager). (M chip 3)", .pme_code = 1235, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMOS@4", .pme_desc = "AMOs to local memory (memory manager). (M chip 4)", .pme_code = 1236, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMOS@5", .pme_desc = "AMOs to local memory (memory manager). (M chip 5)", .pme_code = 1237, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMOS@6", .pme_desc = "AMOs to local memory (memory manager). 
(M chip 6)", .pme_code = 1238, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMOS@7", .pme_desc = "AMOs to local memory (memory manager). (M chip 7)", .pme_code = 1239, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMOS@8", .pme_desc = "AMOs to local memory (memory manager). (M chip 8)", .pme_code = 1240, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMOS@9", .pme_desc = "AMOs to local memory (memory manager). (M chip 9)", .pme_code = 1241, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMOS@10", .pme_desc = "AMOs to local memory (memory manager). (M chip 10)", .pme_code = 1242, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMOS@11", .pme_desc = "AMOs to local memory (memory manager). 
(M chip 11)", .pme_code = 1243, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMOS@12", .pme_desc = "AMOs to local memory (memory manager). (M chip 12)", .pme_code = 1244, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMOS@13", .pme_desc = "AMOs to local memory (memory manager). (M chip 13)", .pme_code = 1245, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMOS@14", .pme_desc = "AMOs to local memory (memory manager). (M chip 14)", .pme_code = 1246, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMOS@15", .pme_desc = "AMOs to local memory (memory manager). (M chip 15)", .pme_code = 1247, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 16 Event 2 */ { .pme_name = "MM0_ANY_BANK_BUSY@0", .pme_desc = "Wclk cycles that any bank is busy in MM0. 
(M chip 0)", .pme_code = 1248, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ANY_BANK_BUSY@1", .pme_desc = "Wclk cycles that any bank is busy in MM0. (M chip 1)", .pme_code = 1249, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ANY_BANK_BUSY@2", .pme_desc = "Wclk cycles that any bank is busy in MM0. (M chip 2)", .pme_code = 1250, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ANY_BANK_BUSY@3", .pme_desc = "Wclk cycles that any bank is busy in MM0. (M chip 3)", .pme_code = 1251, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ANY_BANK_BUSY@4", .pme_desc = "Wclk cycles that any bank is busy in MM0. (M chip 4)", .pme_code = 1252, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ANY_BANK_BUSY@5", .pme_desc = "Wclk cycles that any bank is busy in MM0. 
(M chip 5)", .pme_code = 1253, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ANY_BANK_BUSY@6", .pme_desc = "Wclk cycles that any bank is busy in MM0. (M chip 6)", .pme_code = 1254, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ANY_BANK_BUSY@7", .pme_desc = "Wclk cycles that any bank is busy in MM0. (M chip 7)", .pme_code = 1255, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ANY_BANK_BUSY@8", .pme_desc = "Wclk cycles that any bank is busy in MM0. (M chip 8)", .pme_code = 1256, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ANY_BANK_BUSY@9", .pme_desc = "Wclk cycles that any bank is busy in MM0. (M chip 9)", .pme_code = 1257, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ANY_BANK_BUSY@10", .pme_desc = "Wclk cycles that any bank is busy in MM0. 
(M chip 10)", .pme_code = 1258, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ANY_BANK_BUSY@11", .pme_desc = "Wclk cycles that any bank is busy in MM0. (M chip 11)", .pme_code = 1259, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ANY_BANK_BUSY@12", .pme_desc = "Wclk cycles that any bank is busy in MM0. (M chip 12)", .pme_code = 1260, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ANY_BANK_BUSY@13", .pme_desc = "Wclk cycles that any bank is busy in MM0. (M chip 13)", .pme_code = 1261, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ANY_BANK_BUSY@14", .pme_desc = "Wclk cycles that any bank is busy in MM0. (M chip 14)", .pme_code = 1262, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ANY_BANK_BUSY@15", .pme_desc = "Wclk cycles that any bank is busy in MM0. 
(M chip 15)", .pme_code = 1263, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 16 Event 3 */ { .pme_name = "W_OUT_BLOCK_CHN_0@0", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to channel back-pressure. (M chip 0)", .pme_code = 1264, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_0@1", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to channel back-pressure. (M chip 1)", .pme_code = 1265, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_0@2", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to channel back-pressure. (M chip 2)", .pme_code = 1266, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_0@3", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to channel back-pressure. 
(M chip 3)", .pme_code = 1267, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_0@4", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to channel back-pressure. (M chip 4)", .pme_code = 1268, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_0@5", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to channel back-pressure. (M chip 5)", .pme_code = 1269, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_0@6", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to channel back-pressure. (M chip 6)", .pme_code = 1270, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_0@7", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to channel back-pressure. 
(M chip 7)", .pme_code = 1271, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_0@8", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to channel back-pressure. (M chip 8)", .pme_code = 1272, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_0@9", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to channel back-pressure. (M chip 9)", .pme_code = 1273, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_0@10", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to channel back-pressure. (M chip 10)", .pme_code = 1274, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_0@11", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to channel back-pressure. 
(M chip 11)", .pme_code = 1275, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_0@12", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to channel back-pressure. (M chip 12)", .pme_code = 1276, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_0@13", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to channel back-pressure. (M chip 13)", .pme_code = 1277, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_0@14", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to channel back-pressure. (M chip 14)", .pme_code = 1278, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_0@15", .pme_desc = "Wclk cycles MD2BW output port 0 is blocked due to channel back-pressure. 
(M chip 15)", .pme_code = 1279, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 16, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 17 Event 0 */ { .pme_name = "REQUEST_4DWORDS_L3_HIT@0", .pme_desc = "Allocating read requests to MDs - L3 hit. (M chip 0)", .pme_code = 1280, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_HIT@1", .pme_desc = "Allocating read requests to MDs - L3 hit. (M chip 1)", .pme_code = 1281, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_HIT@2", .pme_desc = "Allocating read requests to MDs - L3 hit. (M chip 2)", .pme_code = 1282, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_HIT@3", .pme_desc = "Allocating read requests to MDs - L3 hit. (M chip 3)", .pme_code = 1283, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_HIT@4", .pme_desc = "Allocating read requests to MDs - L3 hit. 
(M chip 4)", .pme_code = 1284, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_HIT@5", .pme_desc = "Allocating read requests to MDs - L3 hit. (M chip 5)", .pme_code = 1285, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_HIT@6", .pme_desc = "Allocating read requests to MDs - L3 hit. (M chip 6)", .pme_code = 1286, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_HIT@7", .pme_desc = "Allocating read requests to MDs - L3 hit. (M chip 7)", .pme_code = 1287, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_HIT@8", .pme_desc = "Allocating read requests to MDs - L3 hit. (M chip 8)", .pme_code = 1288, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_HIT@9", .pme_desc = "Allocating read requests to MDs - L3 hit. 
(M chip 9)", .pme_code = 1289, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_HIT@10", .pme_desc = "Allocating read requests to MDs - L3 hit. (M chip 10)", .pme_code = 1290, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_HIT@11", .pme_desc = "Allocating read requests to MDs - L3 hit. (M chip 11)", .pme_code = 1291, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_HIT@12", .pme_desc = "Allocating read requests to MDs - L3 hit. (M chip 12)", .pme_code = 1292, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_HIT@13", .pme_desc = "Allocating read requests to MDs - L3 hit. (M chip 13)", .pme_code = 1293, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_HIT@14", .pme_desc = "Allocating read requests to MDs - L3 hit. 
(M chip 14)", .pme_code = 1294, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_HIT@15", .pme_desc = "Allocating read requests to MDs - L3 hit. (M chip 15)", .pme_code = 1295, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 17 Event 1 */ { .pme_name = "AMO_MISSES@0", .pme_desc = "Misses in AMO cache (memory manager). (M chip 0)", .pme_code = 1296, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMO_MISSES@1", .pme_desc = "Misses in AMO cache (memory manager). (M chip 1)", .pme_code = 1297, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMO_MISSES@2", .pme_desc = "Misses in AMO cache (memory manager). (M chip 2)", .pme_code = 1298, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMO_MISSES@3", .pme_desc = "Misses in AMO cache (memory manager). 
(M chip 3)", .pme_code = 1299, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMO_MISSES@4", .pme_desc = "Misses in AMO cache (memory manager). (M chip 4)", .pme_code = 1300, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMO_MISSES@5", .pme_desc = "Misses in AMO cache (memory manager). (M chip 5)", .pme_code = 1301, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMO_MISSES@6", .pme_desc = "Misses in AMO cache (memory manager). (M chip 6)", .pme_code = 1302, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMO_MISSES@7", .pme_desc = "Misses in AMO cache (memory manager). (M chip 7)", .pme_code = 1303, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMO_MISSES@8", .pme_desc = "Misses in AMO cache (memory manager). 
(M chip 8)", .pme_code = 1304, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMO_MISSES@9", .pme_desc = "Misses in AMO cache (memory manager). (M chip 9)", .pme_code = 1305, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMO_MISSES@10", .pme_desc = "Misses in AMO cache (memory manager). (M chip 10)", .pme_code = 1306, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMO_MISSES@11", .pme_desc = "Misses in AMO cache (memory manager). (M chip 11)", .pme_code = 1307, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMO_MISSES@12", .pme_desc = "Misses in AMO cache (memory manager). (M chip 12)", .pme_code = 1308, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMO_MISSES@13", .pme_desc = "Misses in AMO cache (memory manager). 
(M chip 13)", .pme_code = 1309, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMO_MISSES@14", .pme_desc = "Misses in AMO cache (memory manager). (M chip 14)", .pme_code = 1310, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "AMO_MISSES@15", .pme_desc = "Misses in AMO cache (memory manager). (M chip 15)", .pme_code = 1311, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 17 Event 2 */ { .pme_name = "MM0_ACCUM_BANK_BUSY@0", .pme_desc = "Accumulation of the MM0 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 0)", .pme_code = 1312, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ACCUM_BANK_BUSY@1", .pme_desc = "Accumulation of the MM0 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. 
(M chip 1)", .pme_code = 1313, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ACCUM_BANK_BUSY@2", .pme_desc = "Accumulation of the MM0 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 2)", .pme_code = 1314, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ACCUM_BANK_BUSY@3", .pme_desc = "Accumulation of the MM0 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 3)", .pme_code = 1315, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ACCUM_BANK_BUSY@4", .pme_desc = "Accumulation of the MM0 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 4)", .pme_code = 1316, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ACCUM_BANK_BUSY@5", .pme_desc = "Accumulation of the MM0 memory banks are busy in Mclks. 
There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 5)", .pme_code = 1317, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ACCUM_BANK_BUSY@6", .pme_desc = "Accumulation of the MM0 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 6)", .pme_code = 1318, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ACCUM_BANK_BUSY@7", .pme_desc = "Accumulation of the MM0 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 7)", .pme_code = 1319, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ACCUM_BANK_BUSY@8", .pme_desc = "Accumulation of the MM0 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. 
(M chip 8)", .pme_code = 1320, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ACCUM_BANK_BUSY@9", .pme_desc = "Accumulation of the MM0 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 9)", .pme_code = 1321, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ACCUM_BANK_BUSY@10", .pme_desc = "Accumulation of the MM0 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 10)", .pme_code = 1322, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ACCUM_BANK_BUSY@11", .pme_desc = "Accumulation of the MM0 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 11)", .pme_code = 1323, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ACCUM_BANK_BUSY@12", .pme_desc = "Accumulation of the MM0 memory banks are busy in Mclks. 
There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 12)", .pme_code = 1324, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ACCUM_BANK_BUSY@13", .pme_desc = "Accumulation of the MM0 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 13)", .pme_code = 1325, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ACCUM_BANK_BUSY@14", .pme_desc = "Accumulation of the MM0 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 14)", .pme_code = 1326, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM0_ACCUM_BANK_BUSY@15", .pme_desc = "Accumulation of the MM0 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. 
(M chip 15)", .pme_code = 1327, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 17 Event 3 */ { .pme_name = "W_OUT_BLOCK_CHN_1@0", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to channel back-pressure. (M chip 0)", .pme_code = 1328, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_1@1", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to channel back-pressure. (M chip 1)", .pme_code = 1329, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_1@2", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to channel back-pressure. (M chip 2)", .pme_code = 1330, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_1@3", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to channel back-pressure. 
(M chip 3)", .pme_code = 1331, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_1@4", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to channel back-pressure. (M chip 4)", .pme_code = 1332, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_1@5", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to channel back-pressure. (M chip 5)", .pme_code = 1333, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_1@6", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to channel back-pressure. (M chip 6)", .pme_code = 1334, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_1@7", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to channel back-pressure. 
(M chip 7)", .pme_code = 1335, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_1@8", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to channel back-pressure. (M chip 8)", .pme_code = 1336, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_1@9", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to channel back-pressure. (M chip 9)", .pme_code = 1337, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_1@10", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to channel back-pressure. (M chip 10)", .pme_code = 1338, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_1@11", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to channel back-pressure. 
(M chip 11)", .pme_code = 1339, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_1@12", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to channel back-pressure. (M chip 12)", .pme_code = 1340, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_1@13", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to channel back-pressure. (M chip 13)", .pme_code = 1341, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_1@14", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to channel back-pressure. (M chip 14)", .pme_code = 1342, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_1@15", .pme_desc = "Wclk cycles MD2BW output port 1 is blocked due to channel back-pressure. 
(M chip 15)", .pme_code = 1343, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 17, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 18 Event 0 */ { .pme_name = "REQUEST_1DWORD@0", .pme_desc = "Single DWord Get and NGet requests to MDs. (M chip 0)", .pme_code = 1344, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD@1", .pme_desc = "Single DWord Get and NGet requests to MDs. (M chip 1)", .pme_code = 1345, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD@2", .pme_desc = "Single DWord Get and NGet requests to MDs. (M chip 2)", .pme_code = 1346, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD@3", .pme_desc = "Single DWord Get and NGet requests to MDs. (M chip 3)", .pme_code = 1347, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD@4", .pme_desc = "Single DWord Get and NGet requests to MDs. 
(M chip 4)", .pme_code = 1348, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD@5", .pme_desc = "Single DWord Get and NGet requests to MDs. (M chip 5)", .pme_code = 1349, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD@6", .pme_desc = "Single DWord Get and NGet requests to MDs. (M chip 6)", .pme_code = 1350, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD@7", .pme_desc = "Single DWord Get and NGet requests to MDs. (M chip 7)", .pme_code = 1351, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD@8", .pme_desc = "Single DWord Get and NGet requests to MDs. (M chip 8)", .pme_code = 1352, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD@9", .pme_desc = "Single DWord Get and NGet requests to MDs. 
(M chip 9)", .pme_code = 1353, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD@10", .pme_desc = "Single DWord Get and NGet requests to MDs. (M chip 10)", .pme_code = 1354, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD@11", .pme_desc = "Single DWord Get and NGet requests to MDs. (M chip 11)", .pme_code = 1355, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD@12", .pme_desc = "Single DWord Get and NGet requests to MDs. (M chip 12)", .pme_code = 1356, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD@13", .pme_desc = "Single DWord Get and NGet requests to MDs. (M chip 13)", .pme_code = 1357, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD@14", .pme_desc = "Single DWord Get and NGet requests to MDs. 
(M chip 14)", .pme_code = 1358, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD@15", .pme_desc = "Single DWord Get and NGet requests to MDs. (M chip 15)", .pme_code = 1359, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 18 Event 1 */ { .pme_name = "RETRIES_MM@0", .pme_desc = "Memory Manager retries. (M chip 0)", .pme_code = 1360, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "RETRIES_MM@1", .pme_desc = "Memory Manager retries. (M chip 1)", .pme_code = 1361, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "RETRIES_MM@2", .pme_desc = "Memory Manager retries. (M chip 2)", .pme_code = 1362, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "RETRIES_MM@3", .pme_desc = "Memory Manager retries. 
(M chip 3)", .pme_code = 1363, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "RETRIES_MM@4", .pme_desc = "Memory Manager retries. (M chip 4)", .pme_code = 1364, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "RETRIES_MM@5", .pme_desc = "Memory Manager retries. (M chip 5)", .pme_code = 1365, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "RETRIES_MM@6", .pme_desc = "Memory Manager retries. (M chip 6)", .pme_code = 1366, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "RETRIES_MM@7", .pme_desc = "Memory Manager retries. (M chip 7)", .pme_code = 1367, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "RETRIES_MM@8", .pme_desc = "Memory Manager retries. 
(M chip 8)", .pme_code = 1368, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "RETRIES_MM@9", .pme_desc = "Memory Manager retries. (M chip 9)", .pme_code = 1369, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "RETRIES_MM@10", .pme_desc = "Memory Manager retries. (M chip 10)", .pme_code = 1370, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "RETRIES_MM@11", .pme_desc = "Memory Manager retries. (M chip 11)", .pme_code = 1371, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "RETRIES_MM@12", .pme_desc = "Memory Manager retries. (M chip 12)", .pme_code = 1372, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "RETRIES_MM@13", .pme_desc = "Memory Manager retries. 
(M chip 13)", .pme_code = 1373, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "RETRIES_MM@14", .pme_desc = "Memory Manager retries. (M chip 14)", .pme_code = 1374, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "RETRIES_MM@15", .pme_desc = "Memory Manager retries. (M chip 15)", .pme_code = 1375, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 18 Event 2 */ { .pme_name = "MM1_ANY_BANK_BUSY@0", .pme_desc = "Wclk cycles that any bank is busy in MM1. (M chip 0)", .pme_code = 1376, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ANY_BANK_BUSY@1", .pme_desc = "Wclk cycles that any bank is busy in MM1. (M chip 1)", .pme_code = 1377, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ANY_BANK_BUSY@2", .pme_desc = "Wclk cycles that any bank is busy in MM1. 
(M chip 2)", .pme_code = 1378, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ANY_BANK_BUSY@3", .pme_desc = "Wclk cycles that any bank is busy in MM1. (M chip 3)", .pme_code = 1379, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ANY_BANK_BUSY@4", .pme_desc = "Wclk cycles that any bank is busy in MM1. (M chip 4)", .pme_code = 1380, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ANY_BANK_BUSY@5", .pme_desc = "Wclk cycles that any bank is busy in MM1. (M chip 5)", .pme_code = 1381, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ANY_BANK_BUSY@6", .pme_desc = "Wclk cycles that any bank is busy in MM1. (M chip 6)", .pme_code = 1382, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ANY_BANK_BUSY@7", .pme_desc = "Wclk cycles that any bank is busy in MM1. 
(M chip 7)", .pme_code = 1383, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ANY_BANK_BUSY@8", .pme_desc = "Wclk cycles that any bank is busy in MM1. (M chip 8)", .pme_code = 1384, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ANY_BANK_BUSY@9", .pme_desc = "Wclk cycles that any bank is busy in MM1. (M chip 9)", .pme_code = 1385, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ANY_BANK_BUSY@10", .pme_desc = "Wclk cycles that any bank is busy in MM1. (M chip 10)", .pme_code = 1386, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ANY_BANK_BUSY@11", .pme_desc = "Wclk cycles that any bank is busy in MM1. (M chip 11)", .pme_code = 1387, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ANY_BANK_BUSY@12", .pme_desc = "Wclk cycles that any bank is busy in MM1. 
(M chip 12)", .pme_code = 1388, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ANY_BANK_BUSY@13", .pme_desc = "Wclk cycles that any bank is busy in MM1. (M chip 13)", .pme_code = 1389, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ANY_BANK_BUSY@14", .pme_desc = "Wclk cycles that any bank is busy in MM1. (M chip 14)", .pme_code = 1390, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ANY_BANK_BUSY@15", .pme_desc = "Wclk cycles that any bank is busy in MM1. (M chip 15)", .pme_code = 1391, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 18 Event 3 */ { .pme_name = "W_OUT_BLOCK_CHN_2@0", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to channel back-pressure. (M chip 0)", .pme_code = 1392, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_2@1", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to channel back-pressure. 
(M chip 1)", .pme_code = 1393, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_2@2", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to channel back-pressure. (M chip 2)", .pme_code = 1394, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_2@3", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to channel back-pressure. (M chip 3)", .pme_code = 1395, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_2@4", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to channel back-pressure. (M chip 4)", .pme_code = 1396, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_2@5", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to channel back-pressure. 
(M chip 5)", .pme_code = 1397, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_2@6", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to channel back-pressure. (M chip 6)", .pme_code = 1398, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_2@7", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to channel back-pressure. (M chip 7)", .pme_code = 1399, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_2@8", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to channel back-pressure. (M chip 8)", .pme_code = 1400, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_2@9", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to channel back-pressure. 
(M chip 9)", .pme_code = 1401, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_2@10", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to channel back-pressure. (M chip 10)", .pme_code = 1402, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_2@11", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to channel back-pressure. (M chip 11)", .pme_code = 1403, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_2@12", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to channel back-pressure. (M chip 12)", .pme_code = 1404, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_2@13", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to channel back-pressure. 
(M chip 13)", .pme_code = 1405, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_2@14", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to channel back-pressure. (M chip 14)", .pme_code = 1406, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_2@15", .pme_desc = "Wclk cycles MD2BW output port 2 is blocked due to channel back-pressure. (M chip 15)", .pme_code = 1407, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 18, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 19 Event 0 */ { .pme_name = "REQUEST_4DWORDS@0", .pme_desc = "Allocating read, Get and NGet full cache line requests to MDs. (M chip 0)", .pme_code = 1408, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS@1", .pme_desc = "Allocating read, Get and NGet full cache line requests to MDs. 
(M chip 1)", .pme_code = 1409, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS@2", .pme_desc = "Allocating read, Get and NGet full cache line requests to MDs. (M chip 2)", .pme_code = 1410, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS@3", .pme_desc = "Allocating read, Get and NGet full cache line requests to MDs. (M chip 3)", .pme_code = 1411, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS@4", .pme_desc = "Allocating read, Get and NGet full cache line requests to MDs. (M chip 4)", .pme_code = 1412, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS@5", .pme_desc = "Allocating read, Get and NGet full cache line requests to MDs. (M chip 5)", .pme_code = 1413, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS@6", .pme_desc = "Allocating read, Get and NGet full cache line requests to MDs. 
(M chip 6)", .pme_code = 1414, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS@7", .pme_desc = "Allocating read, Get and NGet full cache line requests to MDs. (M chip 7)", .pme_code = 1415, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS@8", .pme_desc = "Allocating read, Get and NGet full cache line requests to MDs. (M chip 8)", .pme_code = 1416, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS@9", .pme_desc = "Allocating read, Get and NGet full cache line requests to MDs. (M chip 9)", .pme_code = 1417, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS@10", .pme_desc = "Allocating read, Get and NGet full cache line requests to MDs. (M chip 10)", .pme_code = 1418, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS@11", .pme_desc = "Allocating read, Get and NGet full cache line requests to MDs. 
(M chip 11)", .pme_code = 1419, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS@12", .pme_desc = "Allocating read, Get and NGet full cache line requests to MDs. (M chip 12)", .pme_code = 1420, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS@13", .pme_desc = "Allocating read, Get and NGet full cache line requests to MDs. (M chip 13)", .pme_code = 1421, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS@14", .pme_desc = "Allocating read, Get and NGet full cache line requests to MDs. (M chip 14)", .pme_code = 1422, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS@15", .pme_desc = "Allocating read, Get and NGet full cache line requests to MDs. 
(M chip 15)", .pme_code = 1423, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 19 Event 1 */ { .pme_name = "@0", .pme_desc = "", .pme_code = 1424, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@1", .pme_desc = "", .pme_code = 1425, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@2", .pme_desc = "", .pme_code = 1426, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@3", .pme_desc = "", .pme_code = 1427, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@4", .pme_desc = "", .pme_code = 1428, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@5", .pme_desc = "", .pme_code = 1429, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 5, .pme_base = 
PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@6", .pme_desc = "", .pme_code = 1430, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@7", .pme_desc = "", .pme_code = 1431, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@8", .pme_desc = "", .pme_code = 1432, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@9", .pme_desc = "", .pme_code = 1433, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@10", .pme_desc = "", .pme_code = 1434, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@11", .pme_desc = "", .pme_code = 1435, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@12", .pme_desc = "", .pme_code = 1436, .pme_flags = 0x0, 
.pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@13", .pme_desc = "", .pme_code = 1437, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@14", .pme_desc = "", .pme_code = 1438, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@15", .pme_desc = "", .pme_code = 1439, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 19 Event 2 */ { .pme_name = "MM1_ACCUM_BANK_BUSY@0", .pme_desc = "Accumulation of the MM1 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 0)", .pme_code = 1440, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ACCUM_BANK_BUSY@1", .pme_desc = "Accumulation of the MM1 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. 
(M chip 1)", .pme_code = 1441, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ACCUM_BANK_BUSY@2", .pme_desc = "Accumulation of the MM1 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 2)", .pme_code = 1442, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ACCUM_BANK_BUSY@3", .pme_desc = "Accumulation of the MM1 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 3)", .pme_code = 1443, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ACCUM_BANK_BUSY@4", .pme_desc = "Accumulation of the MM1 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 4)", .pme_code = 1444, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ACCUM_BANK_BUSY@5", .pme_desc = "Accumulation of the MM1 memory banks are busy in Mclks. 
There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 5)", .pme_code = 1445, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ACCUM_BANK_BUSY@6", .pme_desc = "Accumulation of the MM1 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 6)", .pme_code = 1446, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ACCUM_BANK_BUSY@7", .pme_desc = "Accumulation of the MM1 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 7)", .pme_code = 1447, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ACCUM_BANK_BUSY@8", .pme_desc = "Accumulation of the MM1 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. 
(M chip 8)", .pme_code = 1448, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ACCUM_BANK_BUSY@9", .pme_desc = "Accumulation of the MM1 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 9)", .pme_code = 1449, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ACCUM_BANK_BUSY@10", .pme_desc = "Accumulation of the MM1 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 10)", .pme_code = 1450, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ACCUM_BANK_BUSY@11", .pme_desc = "Accumulation of the MM1 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 11)", .pme_code = 1451, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ACCUM_BANK_BUSY@12", .pme_desc = "Accumulation of the MM1 memory banks are busy in Mclks. 
There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 12)", .pme_code = 1452, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ACCUM_BANK_BUSY@13", .pme_desc = "Accumulation of the MM1 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 13)", .pme_code = 1453, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ACCUM_BANK_BUSY@14", .pme_desc = "Accumulation of the MM1 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 14)", .pme_code = 1454, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM1_ACCUM_BANK_BUSY@15", .pme_desc = "Accumulation of the MM1 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. 
(M chip 15)", .pme_code = 1455, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 19 Event 3 */ { .pme_name = "W_OUT_BLOCK_CHN_3@0", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to channel back-pressure. (M chip 0)", .pme_code = 1456, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_3@1", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to channel back-pressure. (M chip 1)", .pme_code = 1457, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_3@2", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to channel back-pressure. (M chip 2)", .pme_code = 1458, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_3@3", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to channel back-pressure. 
(M chip 3)", .pme_code = 1459, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_3@4", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to channel back-pressure. (M chip 4)", .pme_code = 1460, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_3@5", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to channel back-pressure. (M chip 5)", .pme_code = 1461, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_3@6", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to channel back-pressure. (M chip 6)", .pme_code = 1462, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_3@7", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to channel back-pressure. 
(M chip 7)", .pme_code = 1463, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_3@8", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to channel back-pressure. (M chip 8)", .pme_code = 1464, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_3@9", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to channel back-pressure. (M chip 9)", .pme_code = 1465, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_3@10", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to channel back-pressure. (M chip 10)", .pme_code = 1466, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_3@11", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to channel back-pressure. 
(M chip 11)", .pme_code = 1467, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_3@12", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to channel back-pressure. (M chip 12)", .pme_code = 1468, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_3@13", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to channel back-pressure. (M chip 13)", .pme_code = 1469, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_3@14", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to channel back-pressure. (M chip 14)", .pme_code = 1470, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_BLOCK_CHN_3@15", .pme_desc = "Wclk cycles MD2BW output port 3 is blocked due to channel back-pressure. 
(M chip 15)", .pme_code = 1471, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 19, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 20 Event 0 */ { .pme_name = "REQUESTS_0@0", .pme_desc = "Read or write requests from port 0 to MDs. (M chip 0)", .pme_code = 1472, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_0@1", .pme_desc = "Read or write requests from port 0 to MDs. (M chip 1)", .pme_code = 1473, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_0@2", .pme_desc = "Read or write requests from port 0 to MDs. (M chip 2)", .pme_code = 1474, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_0@3", .pme_desc = "Read or write requests from port 0 to MDs. (M chip 3)", .pme_code = 1475, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_0@4", .pme_desc = "Read or write requests from port 0 to MDs. 
(M chip 4)", .pme_code = 1476, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_0@5", .pme_desc = "Read or write requests from port 0 to MDs. (M chip 5)", .pme_code = 1477, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_0@6", .pme_desc = "Read or write requests from port 0 to MDs. (M chip 6)", .pme_code = 1478, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_0@7", .pme_desc = "Read or write requests from port 0 to MDs. (M chip 7)", .pme_code = 1479, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_0@8", .pme_desc = "Read or write requests from port 0 to MDs. (M chip 8)", .pme_code = 1480, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_0@9", .pme_desc = "Read or write requests from port 0 to MDs. 
(M chip 9)", .pme_code = 1481, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_0@10", .pme_desc = "Read or write requests from port 0 to MDs. (M chip 10)", .pme_code = 1482, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_0@11", .pme_desc = "Read or write requests from port 0 to MDs. (M chip 11)", .pme_code = 1483, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_0@12", .pme_desc = "Read or write requests from port 0 to MDs. (M chip 12)", .pme_code = 1484, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_0@13", .pme_desc = "Read or write requests from port 0 to MDs. (M chip 13)", .pme_code = 1485, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_0@14", .pme_desc = "Read or write requests from port 0 to MDs. 
(M chip 14)", .pme_code = 1486, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_0@15", .pme_desc = "Read or write requests from port 0 to MDs. (M chip 15)", .pme_code = 1487, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 20 Event 1 */ { .pme_name = "REQUEST_1DWORD_L3_MISS@0", .pme_desc = "Single DWord get requests to MDs - L3 miss. (M chip 0)", .pme_code = 1488, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_MISS@1", .pme_desc = "Single DWord get requests to MDs - L3 miss. (M chip 1)", .pme_code = 1489, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_MISS@2", .pme_desc = "Single DWord get requests to MDs - L3 miss. (M chip 2)", .pme_code = 1490, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_MISS@3", .pme_desc = "Single DWord get requests to MDs - L3 miss. 
(M chip 3)", .pme_code = 1491, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_MISS@4", .pme_desc = "Single DWord get requests to MDs - L3 miss. (M chip 4)", .pme_code = 1492, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_MISS@5", .pme_desc = "Single DWord get requests to MDs - L3 miss. (M chip 5)", .pme_code = 1493, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_MISS@6", .pme_desc = "Single DWord get requests to MDs - L3 miss. (M chip 6)", .pme_code = 1494, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_MISS@7", .pme_desc = "Single DWord get requests to MDs - L3 miss. (M chip 7)", .pme_code = 1495, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_MISS@8", .pme_desc = "Single DWord get requests to MDs - L3 miss. 
(M chip 8)", .pme_code = 1496, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_MISS@9", .pme_desc = "Single DWord get requests to MDs - L3 miss. (M chip 9)", .pme_code = 1497, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_MISS@10", .pme_desc = "Single DWord get requests to MDs - L3 miss. (M chip 10)", .pme_code = 1498, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_MISS@11", .pme_desc = "Single DWord get requests to MDs - L3 miss. (M chip 11)", .pme_code = 1499, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_MISS@12", .pme_desc = "Single DWord get requests to MDs - L3 miss. (M chip 12)", .pme_code = 1500, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_MISS@13", .pme_desc = "Single DWord get requests to MDs - L3 miss. 
(M chip 13)", .pme_code = 1501, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_MISS@14", .pme_desc = "Single DWord get requests to MDs - L3 miss. (M chip 14)", .pme_code = 1502, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1DWORD_L3_MISS@15", .pme_desc = "Single DWord get requests to MDs - L3 miss. (M chip 15)", .pme_code = 1503, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 20 Event 2 */ { .pme_name = "MM2_ANY_BANK_BUSY@0", .pme_desc = "Wclk cycles that any bank is busy in MM2. (M chip 0)", .pme_code = 1504, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ANY_BANK_BUSY@1", .pme_desc = "Wclk cycles that any bank is busy in MM2. (M chip 1)", .pme_code = 1505, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ANY_BANK_BUSY@2", .pme_desc = "Wclk cycles that any bank is busy in MM2. 
(M chip 2)", .pme_code = 1506, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ANY_BANK_BUSY@3", .pme_desc = "Wclk cycles that any bank is busy in MM2. (M chip 3)", .pme_code = 1507, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ANY_BANK_BUSY@4", .pme_desc = "Wclk cycles that any bank is busy in MM2. (M chip 4)", .pme_code = 1508, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ANY_BANK_BUSY@5", .pme_desc = "Wclk cycles that any bank is busy in MM2. (M chip 5)", .pme_code = 1509, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ANY_BANK_BUSY@6", .pme_desc = "Wclk cycles that any bank is busy in MM2. (M chip 6)", .pme_code = 1510, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ANY_BANK_BUSY@7", .pme_desc = "Wclk cycles that any bank is busy in MM2. 
(M chip 7)", .pme_code = 1511, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ANY_BANK_BUSY@8", .pme_desc = "Wclk cycles that any bank is busy in MM2. (M chip 8)", .pme_code = 1512, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ANY_BANK_BUSY@9", .pme_desc = "Wclk cycles that any bank is busy in MM2. (M chip 9)", .pme_code = 1513, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ANY_BANK_BUSY@10", .pme_desc = "Wclk cycles that any bank is busy in MM2. (M chip 10)", .pme_code = 1514, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ANY_BANK_BUSY@11", .pme_desc = "Wclk cycles that any bank is busy in MM2. (M chip 11)", .pme_code = 1515, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ANY_BANK_BUSY@12", .pme_desc = "Wclk cycles that any bank is busy in MM2. 
(M chip 12)", .pme_code = 1516, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ANY_BANK_BUSY@13", .pme_desc = "Wclk cycles that any bank is busy in MM2. (M chip 13)", .pme_code = 1517, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ANY_BANK_BUSY@14", .pme_desc = "Wclk cycles that any bank is busy in MM2. (M chip 14)", .pme_code = 1518, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ANY_BANK_BUSY@15", .pme_desc = "Wclk cycles that any bank is busy in MM2. (M chip 15)", .pme_code = 1519, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 20 Event 3 */ { .pme_name = "W_OUT_QUEUE_BP_0@0", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 0 is full and asserting back-pressure to the MD (Wclk cycles). 
(M chip 0)", .pme_code = 1520, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_0@1", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 0 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 1)", .pme_code = 1521, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_0@2", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 0 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 2)", .pme_code = 1522, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_0@3", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 0 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 3)", .pme_code = 1523, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_0@4", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 0 is full and asserting back-pressure to the MD (Wclk cycles). 
(M chip 4)", .pme_code = 1524, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_0@5", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 0 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 5)", .pme_code = 1525, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_0@6", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 0 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 6)", .pme_code = 1526, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_0@7", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 0 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 7)", .pme_code = 1527, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_0@8", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 0 is full and asserting back-pressure to the MD (Wclk cycles). 
(M chip 8)", .pme_code = 1528, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_0@9", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 0 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 9)", .pme_code = 1529, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_0@10", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 0 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 10)", .pme_code = 1530, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_0@11", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 0 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 11)", .pme_code = 1531, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_0@12", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 0 is full and asserting back-pressure to the MD (Wclk cycles). 
(M chip 12)", .pme_code = 1532, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_0@13", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 0 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 13)", .pme_code = 1533, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_0@14", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 0 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 14)", .pme_code = 1534, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_0@15", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 0 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 15)", .pme_code = 1535, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 20, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 21 Event 0 */ { .pme_name = "REQUESTS_1@0", .pme_desc = "Read or write requests from port 1 to MDs. 
(M chip 0)", .pme_code = 1536, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_1@1", .pme_desc = "Read or write requests from port 1 to MDs. (M chip 1)", .pme_code = 1537, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_1@2", .pme_desc = "Read or write requests from port 1 to MDs. (M chip 2)", .pme_code = 1538, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_1@3", .pme_desc = "Read or write requests from port 1 to MDs. (M chip 3)", .pme_code = 1539, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_1@4", .pme_desc = "Read or write requests from port 1 to MDs. (M chip 4)", .pme_code = 1540, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_1@5", .pme_desc = "Read or write requests from port 1 to MDs. 
(M chip 5)", .pme_code = 1541, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_1@6", .pme_desc = "Read or write requests from port 1 to MDs. (M chip 6)", .pme_code = 1542, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_1@7", .pme_desc = "Read or write requests from port 1 to MDs. (M chip 7)", .pme_code = 1543, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_1@8", .pme_desc = "Read or write requests from port 1 to MDs. (M chip 8)", .pme_code = 1544, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_1@9", .pme_desc = "Read or write requests from port 1 to MDs. (M chip 9)", .pme_code = 1545, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_1@10", .pme_desc = "Read or write requests from port 1 to MDs. 
(M chip 10)", .pme_code = 1546, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_1@11", .pme_desc = "Read or write requests from port 1 to MDs. (M chip 11)", .pme_code = 1547, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_1@12", .pme_desc = "Read or write requests from port 1 to MDs. (M chip 12)", .pme_code = 1548, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_1@13", .pme_desc = "Read or write requests from port 1 to MDs. (M chip 13)", .pme_code = 1549, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_1@14", .pme_desc = "Read or write requests from port 1 to MDs. (M chip 14)", .pme_code = 1550, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_1@15", .pme_desc = "Read or write requests from port 1 to MDs. 
(M chip 15)", .pme_code = 1551, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 21 Event 1 */ { .pme_name = "REQUEST_4DWORDS_L3_MISS@0", .pme_desc = "Allocating read requests to MDs - L3 miss. (M chip 0)", .pme_code = 1552, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_MISS@1", .pme_desc = "Allocating read requests to MDs - L3 miss. (M chip 1)", .pme_code = 1553, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_MISS@2", .pme_desc = "Allocating read requests to MDs - L3 miss. (M chip 2)", .pme_code = 1554, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_MISS@3", .pme_desc = "Allocating read requests to MDs - L3 miss. (M chip 3)", .pme_code = 1555, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_MISS@4", .pme_desc = "Allocating read requests to MDs - L3 miss. 
(M chip 4)", .pme_code = 1556, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_MISS@5", .pme_desc = "Allocating read requests to MDs - L3 miss. (M chip 5)", .pme_code = 1557, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_MISS@6", .pme_desc = "Allocating read requests to MDs - L3 miss. (M chip 6)", .pme_code = 1558, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_MISS@7", .pme_desc = "Allocating read requests to MDs - L3 miss. (M chip 7)", .pme_code = 1559, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_MISS@8", .pme_desc = "Allocating read requests to MDs - L3 miss. (M chip 8)", .pme_code = 1560, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_MISS@9", .pme_desc = "Allocating read requests to MDs - L3 miss. 
(M chip 9)", .pme_code = 1561, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_MISS@10", .pme_desc = "Allocating read requests to MDs - L3 miss. (M chip 10)", .pme_code = 1562, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_MISS@11", .pme_desc = "Allocating read requests to MDs - L3 miss. (M chip 11)", .pme_code = 1563, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_MISS@12", .pme_desc = "Allocating read requests to MDs - L3 miss. (M chip 12)", .pme_code = 1564, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_MISS@13", .pme_desc = "Allocating read requests to MDs - L3 miss. (M chip 13)", .pme_code = 1565, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_MISS@14", .pme_desc = "Allocating read requests to MDs - L3 miss. 
(M chip 14)", .pme_code = 1566, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_4DWORDS_L3_MISS@15", .pme_desc = "Allocating read requests to MDs - L3 miss. (M chip 15)", .pme_code = 1567, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 21 Event 2 */ { .pme_name = "MM2_ACCUM_BANK_BUSY@0", .pme_desc = "Accumulation of the MM2 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 0)", .pme_code = 1568, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ACCUM_BANK_BUSY@1", .pme_desc = "Accumulation of the MM2 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 1)", .pme_code = 1569, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ACCUM_BANK_BUSY@2", .pme_desc = "Accumulation of the MM2 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. 
(M chip 2)", .pme_code = 1570, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ACCUM_BANK_BUSY@3", .pme_desc = "Accumulation of the MM2 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 3)", .pme_code = 1571, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ACCUM_BANK_BUSY@4", .pme_desc = "Accumulation of the MM2 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 4)", .pme_code = 1572, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ACCUM_BANK_BUSY@5", .pme_desc = "Accumulation of the MM2 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 5)", .pme_code = 1573, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ACCUM_BANK_BUSY@6", .pme_desc = "Accumulation of the MM2 memory banks are busy in Mclks. 
There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 6)", .pme_code = 1574, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ACCUM_BANK_BUSY@7", .pme_desc = "Accumulation of the MM2 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 7)", .pme_code = 1575, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ACCUM_BANK_BUSY@8", .pme_desc = "Accumulation of the MM2 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 8)", .pme_code = 1576, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ACCUM_BANK_BUSY@9", .pme_desc = "Accumulation of the MM2 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. 
(M chip 9)", .pme_code = 1577, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ACCUM_BANK_BUSY@10", .pme_desc = "Accumulation of the MM2 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 10)", .pme_code = 1578, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ACCUM_BANK_BUSY@11", .pme_desc = "Accumulation of the MM2 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 11)", .pme_code = 1579, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ACCUM_BANK_BUSY@12", .pme_desc = "Accumulation of the MM2 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 12)", .pme_code = 1580, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ACCUM_BANK_BUSY@13", .pme_desc = "Accumulation of the MM2 memory banks are busy in Mclks. 
There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 13)", .pme_code = 1581, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ACCUM_BANK_BUSY@14", .pme_desc = "Accumulation of the MM2 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 14)", .pme_code = 1582, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM2_ACCUM_BANK_BUSY@15", .pme_desc = "Accumulation of the MM2 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 15)", .pme_code = 1583, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 21 Event 3 */ { .pme_name = "W_OUT_QUEUE_BP_1@0", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 1 is full and asserting back-pressure to the MD (Wclk cycles). 
(M chip 0)", .pme_code = 1584, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_1@1", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 1 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 1)", .pme_code = 1585, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_1@2", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 1 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 2)", .pme_code = 1586, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_1@3", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 1 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 3)", .pme_code = 1587, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_1@4", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 1 is full and asserting back-pressure to the MD (Wclk cycles). 
(M chip 4)", .pme_code = 1588, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_1@5", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 1 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 5)", .pme_code = 1589, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_1@6", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 1 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 6)", .pme_code = 1590, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_1@7", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 1 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 7)", .pme_code = 1591, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_1@8", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 1 is full and asserting back-pressure to the MD (Wclk cycles). 
(M chip 8)", .pme_code = 1592, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_1@9", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 1 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 9)", .pme_code = 1593, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_1@10", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 1 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 10)", .pme_code = 1594, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_1@11", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 1 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 11)", .pme_code = 1595, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_1@12", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 1 is full and asserting back-pressure to the MD (Wclk cycles). 
(M chip 12)", .pme_code = 1596, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_1@13", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 1 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 13)", .pme_code = 1597, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_1@14", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 1 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 14)", .pme_code = 1598, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_1@15", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 1 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 15)", .pme_code = 1599, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 21, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 22 Event 0 */ { .pme_name = "REQUESTS_2@0", .pme_desc = "Read or write requests from port 2 to MDs. 
(M chip 0)", .pme_code = 1600, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_2@1", .pme_desc = "Read or write requests from port 2 to MDs. (M chip 1)", .pme_code = 1601, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_2@2", .pme_desc = "Read or write requests from port 2 to MDs. (M chip 2)", .pme_code = 1602, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_2@3", .pme_desc = "Read or write requests from port 2 to MDs. (M chip 3)", .pme_code = 1603, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_2@4", .pme_desc = "Read or write requests from port 2 to MDs. (M chip 4)", .pme_code = 1604, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_2@5", .pme_desc = "Read or write requests from port 2 to MDs. 
(M chip 5)", .pme_code = 1605, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_2@6", .pme_desc = "Read or write requests from port 2 to MDs. (M chip 6)", .pme_code = 1606, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_2@7", .pme_desc = "Read or write requests from port 2 to MDs. (M chip 7)", .pme_code = 1607, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_2@8", .pme_desc = "Read or write requests from port 2 to MDs. (M chip 8)", .pme_code = 1608, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_2@9", .pme_desc = "Read or write requests from port 2 to MDs. (M chip 9)", .pme_code = 1609, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_2@10", .pme_desc = "Read or write requests from port 2 to MDs. 
(M chip 10)", .pme_code = 1610, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_2@11", .pme_desc = "Read or write requests from port 2 to MDs. (M chip 11)", .pme_code = 1611, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_2@12", .pme_desc = "Read or write requests from port 2 to MDs. (M chip 12)", .pme_code = 1612, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_2@13", .pme_desc = "Read or write requests from port 2 to MDs. (M chip 13)", .pme_code = 1613, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_2@14", .pme_desc = "Read or write requests from port 2 to MDs. (M chip 14)", .pme_code = 1614, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_2@15", .pme_desc = "Read or write requests from port 2 to MDs. 
(M chip 15)", .pme_code = 1615, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 22 Event 1 */ { .pme_name = "REQUEST_1SWORD@0", .pme_desc = "Single SWord requests to MDs. (M chip 0)", .pme_code = 1616, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1SWORD@1", .pme_desc = "Single SWord requests to MDs. (M chip 1)", .pme_code = 1617, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1SWORD@2", .pme_desc = "Single SWord requests to MDs. (M chip 2)", .pme_code = 1618, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1SWORD@3", .pme_desc = "Single SWord requests to MDs. (M chip 3)", .pme_code = 1619, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1SWORD@4", .pme_desc = "Single SWord requests to MDs. 
(M chip 4)", .pme_code = 1620, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1SWORD@5", .pme_desc = "Single SWord requests to MDs. (M chip 5)", .pme_code = 1621, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1SWORD@6", .pme_desc = "Single SWord requests to MDs. (M chip 6)", .pme_code = 1622, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1SWORD@7", .pme_desc = "Single SWord requests to MDs. (M chip 7)", .pme_code = 1623, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1SWORD@8", .pme_desc = "Single SWord requests to MDs. (M chip 8)", .pme_code = 1624, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1SWORD@9", .pme_desc = "Single SWord requests to MDs. 
(M chip 9)", .pme_code = 1625, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1SWORD@10", .pme_desc = "Single SWord requests to MDs. (M chip 10)", .pme_code = 1626, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1SWORD@11", .pme_desc = "Single SWord requests to MDs. (M chip 11)", .pme_code = 1627, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1SWORD@12", .pme_desc = "Single SWord requests to MDs. (M chip 12)", .pme_code = 1628, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1SWORD@13", .pme_desc = "Single SWord requests to MDs. (M chip 13)", .pme_code = 1629, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1SWORD@14", .pme_desc = "Single SWord requests to MDs. 
(M chip 14)", .pme_code = 1630, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUEST_1SWORD@15", .pme_desc = "Single SWord requests to MDs. (M chip 15)", .pme_code = 1631, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 22 Event 2 */ { .pme_name = "MM3_ANY_BANK_BUSY@0", .pme_desc = "Wclk cycles that any bank is busy in MM3. (M chip 0)", .pme_code = 1632, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ANY_BANK_BUSY@1", .pme_desc = "Wclk cycles that any bank is busy in MM3. (M chip 1)", .pme_code = 1633, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ANY_BANK_BUSY@2", .pme_desc = "Wclk cycles that any bank is busy in MM3. (M chip 2)", .pme_code = 1634, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ANY_BANK_BUSY@3", .pme_desc = "Wclk cycles that any bank is busy in MM3. 
(M chip 3)", .pme_code = 1635, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ANY_BANK_BUSY@4", .pme_desc = "Wclk cycles that any bank is busy in MM3. (M chip 4)", .pme_code = 1636, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ANY_BANK_BUSY@5", .pme_desc = "Wclk cycles that any bank is busy in MM3. (M chip 5)", .pme_code = 1637, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ANY_BANK_BUSY@6", .pme_desc = "Wclk cycles that any bank is busy in MM3. (M chip 6)", .pme_code = 1638, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ANY_BANK_BUSY@7", .pme_desc = "Wclk cycles that any bank is busy in MM3. (M chip 7)", .pme_code = 1639, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ANY_BANK_BUSY@8", .pme_desc = "Wclk cycles that any bank is busy in MM3. 
(M chip 8)", .pme_code = 1640, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ANY_BANK_BUSY@9", .pme_desc = "Wclk cycles that any bank is busy in MM3. (M chip 9)", .pme_code = 1641, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ANY_BANK_BUSY@10", .pme_desc = "Wclk cycles that any bank is busy in MM3. (M chip 10)", .pme_code = 1642, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ANY_BANK_BUSY@11", .pme_desc = "Wclk cycles that any bank is busy in MM3. (M chip 11)", .pme_code = 1643, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ANY_BANK_BUSY@12", .pme_desc = "Wclk cycles that any bank is busy in MM3. (M chip 12)", .pme_code = 1644, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ANY_BANK_BUSY@13", .pme_desc = "Wclk cycles that any bank is busy in MM3. 
(M chip 13)", .pme_code = 1645, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ANY_BANK_BUSY@14", .pme_desc = "Wclk cycles that any bank is busy in MM3. (M chip 14)", .pme_code = 1646, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ANY_BANK_BUSY@15", .pme_desc = "Wclk cycles that any bank is busy in MM3. (M chip 15)", .pme_code = 1647, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 22 Event 3 */ { .pme_name = "W_OUT_QUEUE_BP_2@0", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 2 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 0)", .pme_code = 1648, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_2@1", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 2 is full and asserting back-pressure to the MD (Wclk cycles). 
(M chip 1)", .pme_code = 1649, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_2@2", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 2 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 2)", .pme_code = 1650, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_2@3", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 2 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 3)", .pme_code = 1651, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_2@4", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 2 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 4)", .pme_code = 1652, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_2@5", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 2 is full and asserting back-pressure to the MD (Wclk cycles). 
(M chip 5)", .pme_code = 1653, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_2@6", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 2 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 6)", .pme_code = 1654, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_2@7", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 2 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 7)", .pme_code = 1655, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_2@8", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 2 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 8)", .pme_code = 1656, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_2@9", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 2 is full and asserting back-pressure to the MD (Wclk cycles). 
(M chip 9)", .pme_code = 1657, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_2@10", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 2 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 10)", .pme_code = 1658, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_2@11", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 2 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 11)", .pme_code = 1659, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_2@12", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 2 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 12)", .pme_code = 1660, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_2@13", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 2 is full and asserting back-pressure to the MD (Wclk cycles). 
(M chip 13)", .pme_code = 1661, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_2@14", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 2 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 14)", .pme_code = 1662, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_2@15", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 2 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 15)", .pme_code = 1663, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 22, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 23 Event 0 */ { .pme_name = "REQUESTS_3@0", .pme_desc = "Read or write requests from port 3 to MDs. (M chip 0)", .pme_code = 1664, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_3@1", .pme_desc = "Read or write requests from port 3 to MDs. 
(M chip 1)", .pme_code = 1665, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_3@2", .pme_desc = "Read or write requests from port 3 to MDs. (M chip 2)", .pme_code = 1666, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_3@3", .pme_desc = "Read or write requests from port 3 to MDs. (M chip 3)", .pme_code = 1667, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_3@4", .pme_desc = "Read or write requests from port 3 to MDs. (M chip 4)", .pme_code = 1668, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_3@5", .pme_desc = "Read or write requests from port 3 to MDs. (M chip 5)", .pme_code = 1669, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_3@6", .pme_desc = "Read or write requests from port 3 to MDs. 
(M chip 6)", .pme_code = 1670, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_3@7", .pme_desc = "Read or write requests from port 3 to MDs. (M chip 7)", .pme_code = 1671, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_3@8", .pme_desc = "Read or write requests from port 3 to MDs. (M chip 8)", .pme_code = 1672, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_3@9", .pme_desc = "Read or write requests from port 3 to MDs. (M chip 9)", .pme_code = 1673, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_3@10", .pme_desc = "Read or write requests from port 3 to MDs. (M chip 10)", .pme_code = 1674, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_3@11", .pme_desc = "Read or write requests from port 3 to MDs. 
(M chip 11)", .pme_code = 1675, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_3@12", .pme_desc = "Read or write requests from port 3 to MDs. (M chip 12)", .pme_code = 1676, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_3@13", .pme_desc = "Read or write requests from port 3 to MDs. (M chip 13)", .pme_code = 1677, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_3@14", .pme_desc = "Read or write requests from port 3 to MDs. (M chip 14)", .pme_code = 1678, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "REQUESTS_3@15", .pme_desc = "Read or write requests from port 3 to MDs. 
(M chip 15)", .pme_code = 1679, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 23 Event 1 */ { .pme_name = "@0", .pme_desc = "", .pme_code = 1680, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@1", .pme_desc = "", .pme_code = 1681, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@2", .pme_desc = "", .pme_code = 1682, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@3", .pme_desc = "", .pme_code = 1683, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@4", .pme_desc = "", .pme_code = 1684, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@5", .pme_desc = "", .pme_code = 1685, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 5, .pme_base = 
PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@6", .pme_desc = "", .pme_code = 1686, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@7", .pme_desc = "", .pme_code = 1687, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@8", .pme_desc = "", .pme_code = 1688, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@9", .pme_desc = "", .pme_code = 1689, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@10", .pme_desc = "", .pme_code = 1690, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@11", .pme_desc = "", .pme_code = 1691, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@12", .pme_desc = "", .pme_code = 1692, .pme_flags = 0x0, 
.pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@13", .pme_desc = "", .pme_code = 1693, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@14", .pme_desc = "", .pme_code = 1694, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@15", .pme_desc = "", .pme_code = 1695, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 23 Event 2 */ { .pme_name = "MM3_ACCUM_BANK_BUSY@0", .pme_desc = "Accumulation of the MM3 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 0)", .pme_code = 1696, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ACCUM_BANK_BUSY@1", .pme_desc = "Accumulation of the MM3 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. 
(M chip 1)", .pme_code = 1697, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ACCUM_BANK_BUSY@2", .pme_desc = "Accumulation of the MM3 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 2)", .pme_code = 1698, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ACCUM_BANK_BUSY@3", .pme_desc = "Accumulation of the MM3 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 3)", .pme_code = 1699, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ACCUM_BANK_BUSY@4", .pme_desc = "Accumulation of the MM3 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 4)", .pme_code = 1700, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ACCUM_BANK_BUSY@5", .pme_desc = "Accumulation of the MM3 memory banks are busy in Mclks. 
There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 5)", .pme_code = 1701, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ACCUM_BANK_BUSY@6", .pme_desc = "Accumulation of the MM3 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 6)", .pme_code = 1702, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ACCUM_BANK_BUSY@7", .pme_desc = "Accumulation of the MM3 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 7)", .pme_code = 1703, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ACCUM_BANK_BUSY@8", .pme_desc = "Accumulation of the MM3 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. 
(M chip 8)", .pme_code = 1704, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ACCUM_BANK_BUSY@9", .pme_desc = "Accumulation of the MM3 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 9)", .pme_code = 1705, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ACCUM_BANK_BUSY@10", .pme_desc = "Accumulation of the MM3 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 10)", .pme_code = 1706, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ACCUM_BANK_BUSY@11", .pme_desc = "Accumulation of the MM3 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 11)", .pme_code = 1707, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ACCUM_BANK_BUSY@12", .pme_desc = "Accumulation of the MM3 memory banks are busy in Mclks. 
There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 12)", .pme_code = 1708, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ACCUM_BANK_BUSY@13", .pme_desc = "Accumulation of the MM3 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 13)", .pme_code = 1709, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ACCUM_BANK_BUSY@14", .pme_desc = "Accumulation of the MM3 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. (M chip 14)", .pme_code = 1710, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "MM3_ACCUM_BANK_BUSY@15", .pme_desc = "Accumulation of the MM3 memory banks are busy in Mclks. There are 8 banks per MM and this counter will be +1 every Mclk that 1 bank is busy, +2 every Mclk that 2 banks are busy, etc. 
(M chip 15)", .pme_code = 1711, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 23 Event 3 */ { .pme_name = "W_OUT_QUEUE_BP_3@0", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 3 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 0)", .pme_code = 1712, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_3@1", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 3 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 1)", .pme_code = 1713, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_3@2", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 3 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 2)", .pme_code = 1714, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_3@3", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 3 is full and asserting back-pressure to the MD (Wclk cycles). 
(M chip 3)", .pme_code = 1715, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_3@4", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 3 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 4)", .pme_code = 1716, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_3@5", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 3 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 5)", .pme_code = 1717, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_3@6", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 3 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 6)", .pme_code = 1718, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_3@7", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 3 is full and asserting back-pressure to the MD (Wclk cycles). 
(M chip 7)", .pme_code = 1719, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_3@8", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 3 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 8)", .pme_code = 1720, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_3@9", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 3 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 9)", .pme_code = 1721, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_3@10", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 3 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 10)", .pme_code = 1722, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_3@11", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 3 is full and asserting back-pressure to the MD (Wclk cycles). 
(M chip 11)", .pme_code = 1723, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_3@12", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 3 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 12)", .pme_code = 1724, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_3@13", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 3 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 13)", .pme_code = 1725, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_3@14", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 3 is full and asserting back-pressure to the MD (Wclk cycles). (M chip 14)", .pme_code = 1726, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_OUT_QUEUE_BP_3@15", .pme_desc = "One of the input FIFOs that is destined for MD2BW output port 3 is full and asserting back-pressure to the MD (Wclk cycles). 
(M chip 15)", .pme_code = 1727, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 23, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 24 Event 0 */ { .pme_name = "W_SWORD_PUTS@0", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with Put commands. Counts up to 2 SWords per memory directory per clock period. (M chip 0)", .pme_code = 1728, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_PUTS@1", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with Put commands. Counts up to 2 SWords per memory directory per clock period. (M chip 1)", .pme_code = 1729, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_PUTS@2", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with Put commands. Counts up to 2 SWords per memory directory per clock period. (M chip 2)", .pme_code = 1730, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_PUTS@3", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with Put commands. Counts up to 2 SWords per memory directory per clock period. 
(M chip 3)", .pme_code = 1731, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_PUTS@4", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with Put commands. Counts up to 2 SWords per memory directory per clock period. (M chip 4)", .pme_code = 1732, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_PUTS@5", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with Put commands. Counts up to 2 SWords per memory directory per clock period. (M chip 5)", .pme_code = 1733, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_PUTS@6", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with Put commands. Counts up to 2 SWords per memory directory per clock period. (M chip 6)", .pme_code = 1734, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_PUTS@7", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with Put commands. Counts up to 2 SWords per memory directory per clock period. 
(M chip 7)", .pme_code = 1735, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_PUTS@8", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with Put commands. Counts up to 2 SWords per memory directory per clock period. (M chip 8)", .pme_code = 1736, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_PUTS@9", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with Put commands. Counts up to 2 SWords per memory directory per clock period. (M chip 9)", .pme_code = 1737, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_PUTS@10", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with Put commands. Counts up to 2 SWords per memory directory per clock period. (M chip 10)", .pme_code = 1738, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_PUTS@11", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with Put commands. Counts up to 2 SWords per memory directory per clock period. 
(M chip 11)", .pme_code = 1739, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_PUTS@12", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with Put commands. Counts up to 2 SWords per memory directory per clock period. (M chip 12)", .pme_code = 1740, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_PUTS@13", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with Put commands. Counts up to 2 SWords per memory directory per clock period. (M chip 13)", .pme_code = 1741, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_PUTS@14", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with Put commands. Counts up to 2 SWords per memory directory per clock period. (M chip 14)", .pme_code = 1742, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_PUTS@15", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with Put commands. Counts up to 2 SWords per memory directory per clock period. 
(M chip 15)", .pme_code = 1743, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 24 Event 1 */ { .pme_name = "@0", .pme_desc = "", .pme_code = 1744, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@1", .pme_desc = "", .pme_code = 1745, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@2", .pme_desc = "", .pme_code = 1746, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@3", .pme_desc = "", .pme_code = 1747, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@4", .pme_desc = "", .pme_code = 1748, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@5", .pme_desc = "", .pme_code = 1749, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 5, .pme_base = 
PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@6", .pme_desc = "", .pme_code = 1750, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@7", .pme_desc = "", .pme_code = 1751, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@8", .pme_desc = "", .pme_code = 1752, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@9", .pme_desc = "", .pme_code = 1753, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@10", .pme_desc = "", .pme_code = 1754, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@11", .pme_desc = "", .pme_code = 1755, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@12", .pme_desc = "", .pme_code = 1756, .pme_flags = 0x0, 
.pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@13", .pme_desc = "", .pme_code = 1757, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@14", .pme_desc = "", .pme_code = 1758, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@15", .pme_desc = "", .pme_code = 1759, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 24 Event 2 */ { .pme_name = "@0", .pme_desc = "", .pme_code = 1760, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@1", .pme_desc = "", .pme_code = 1761, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@2", .pme_desc = "", .pme_code = 1762, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = 
PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@3", .pme_desc = "", .pme_code = 1763, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@4", .pme_desc = "", .pme_code = 1764, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@5", .pme_desc = "", .pme_code = 1765, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@6", .pme_desc = "", .pme_code = 1766, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@7", .pme_desc = "", .pme_code = 1767, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@8", .pme_desc = "", .pme_code = 1768, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@9", .pme_desc = "", .pme_code = 1769, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = 
PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@10", .pme_desc = "", .pme_code = 1770, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@11", .pme_desc = "", .pme_code = 1771, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@12", .pme_desc = "", .pme_code = 1772, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@13", .pme_desc = "", .pme_code = 1773, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@14", .pme_desc = "", .pme_code = 1774, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@15", .pme_desc = "", .pme_code = 1775, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = 
PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 24 Event 3 */ { .pme_name = "@0", .pme_desc = "", .pme_code = 1776, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@1", .pme_desc = "", .pme_code = 1777, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@2", .pme_desc = "", .pme_code = 1778, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@3", .pme_desc = "", .pme_code = 1779, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@4", .pme_desc = "", .pme_code = 1780, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@5", .pme_desc = "", .pme_code = 1781, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@6", .pme_desc = "", .pme_code = 1782, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 
24, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@7", .pme_desc = "", .pme_code = 1783, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@8", .pme_desc = "", .pme_code = 1784, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@9", .pme_desc = "", .pme_code = 1785, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@10", .pme_desc = "", .pme_code = 1786, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@11", .pme_desc = "", .pme_code = 1787, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@12", .pme_desc = "", .pme_code = 1788, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@13", 
.pme_desc = "", .pme_code = 1789, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@14", .pme_desc = "", .pme_code = 1790, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@15", .pme_desc = "", .pme_code = 1791, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 24, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 25 Event 0 */ { .pme_name = "W_SWORD_NPUTS@0", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with NPut commands. Counts up to 2 SWords per memory directory per clock period. (M chip 0)", .pme_code = 1792, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NPUTS@1", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with NPut commands. Counts up to 2 SWords per memory directory per clock period. 
(M chip 1)", .pme_code = 1793, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NPUTS@2", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with NPut commands. Counts up to 2 SWords per memory directory per clock period. (M chip 2)", .pme_code = 1794, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NPUTS@3", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with NPut commands. Counts up to 2 SWords per memory directory per clock period. (M chip 3)", .pme_code = 1795, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NPUTS@4", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with NPut commands. Counts up to 2 SWords per memory directory per clock period. (M chip 4)", .pme_code = 1796, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NPUTS@5", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with NPut commands. Counts up to 2 SWords per memory directory per clock period. 
(M chip 5)", .pme_code = 1797, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NPUTS@6", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with NPut commands. Counts up to 2 SWords per memory directory per clock period. (M chip 6)", .pme_code = 1798, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NPUTS@7", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with NPut commands. Counts up to 2 SWords per memory directory per clock period. (M chip 7)", .pme_code = 1799, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NPUTS@8", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with NPut commands. Counts up to 2 SWords per memory directory per clock period. (M chip 8)", .pme_code = 1800, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NPUTS@9", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with NPut commands. Counts up to 2 SWords per memory directory per clock period. 
(M chip 9)", .pme_code = 1801, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NPUTS@10", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with NPut commands. Counts up to 2 SWords per memory directory per clock period. (M chip 10)", .pme_code = 1802, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NPUTS@11", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with NPut commands. Counts up to 2 SWords per memory directory per clock period. (M chip 11)", .pme_code = 1803, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NPUTS@12", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with NPut commands. Counts up to 2 SWords per memory directory per clock period. (M chip 12)", .pme_code = 1804, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NPUTS@13", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with NPut commands. Counts up to 2 SWords per memory directory per clock period. 
(M chip 13)", .pme_code = 1805, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NPUTS@14", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with NPut commands. Counts up to 2 SWords per memory directory per clock period. (M chip 14)", .pme_code = 1806, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NPUTS@15", .pme_desc = "Count of the total number of SWords that are written to memory or the L3 cache with NPut commands. Counts up to 2 SWords per memory directory per clock period. (M chip 15)", .pme_code = 1807, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 25 Event 1 */ { .pme_name = "@0", .pme_desc = "", .pme_code = 1808, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@1", .pme_desc = "", .pme_code = 1809, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@2", .pme_desc = "", .pme_code = 1810, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = 
PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@3", .pme_desc = "", .pme_code = 1811, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@4", .pme_desc = "", .pme_code = 1812, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@5", .pme_desc = "", .pme_code = 1813, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@6", .pme_desc = "", .pme_code = 1814, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@7", .pme_desc = "", .pme_code = 1815, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@8", .pme_desc = "", .pme_code = 1816, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = 
PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@9", .pme_desc = "", .pme_code = 1817, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@10", .pme_desc = "", .pme_code = 1818, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@11", .pme_desc = "", .pme_code = 1819, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@12", .pme_desc = "", .pme_code = 1820, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@13", .pme_desc = "", .pme_code = 1821, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@14", .pme_desc = "", .pme_code = 1822, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@15", .pme_desc = "", .pme_code = 1823, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 
1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 25 Event 2 */ { .pme_name = "@0", .pme_desc = "", .pme_code = 1824, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@1", .pme_desc = "", .pme_code = 1825, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@2", .pme_desc = "", .pme_code = 1826, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@3", .pme_desc = "", .pme_code = 1827, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@4", .pme_desc = "", .pme_code = 1828, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@5", .pme_desc = "", .pme_code = 1829, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = 
"@6", .pme_desc = "", .pme_code = 1830, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@7", .pme_desc = "", .pme_code = 1831, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@8", .pme_desc = "", .pme_code = 1832, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@9", .pme_desc = "", .pme_code = 1833, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@10", .pme_desc = "", .pme_code = 1834, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@11", .pme_desc = "", .pme_code = 1835, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@12", .pme_desc = "", .pme_code = 1836, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 12, .pme_base = 
PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@13", .pme_desc = "", .pme_code = 1837, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@14", .pme_desc = "", .pme_code = 1838, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@15", .pme_desc = "", .pme_code = 1839, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 25 Event 3 */ { .pme_name = "@0", .pme_desc = "", .pme_code = 1840, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@1", .pme_desc = "", .pme_code = 1841, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@2", .pme_desc = "", .pme_code = 1842, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@3", .pme_desc = "", .pme_code 
= 1843, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@4", .pme_desc = "", .pme_code = 1844, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@5", .pme_desc = "", .pme_code = 1845, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@6", .pme_desc = "", .pme_code = 1846, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@7", .pme_desc = "", .pme_code = 1847, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@8", .pme_desc = "", .pme_code = 1848, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@9", .pme_desc = "", .pme_code = 1849, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = 
PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@10", .pme_desc = "", .pme_code = 1850, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@11", .pme_desc = "", .pme_code = 1851, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@12", .pme_desc = "", .pme_code = 1852, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@13", .pme_desc = "", .pme_code = 1853, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@14", .pme_desc = "", .pme_code = 1854, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@15", .pme_desc = "", .pme_code = 1855, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 25, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 26 Event 0 */ { .pme_name = "W_SWORD_GETS@0", .pme_desc = "Count of the total number of SWords 
that are read from memory or the L3 cache with Get commands. Counts up to 2 SWords per memory directory per clock period. (M chip 0)", .pme_code = 1856, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_GETS@1", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with Get commands. Counts up to 2 SWords per memory directory per clock period. (M chip 1)", .pme_code = 1857, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_GETS@2", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with Get commands. Counts up to 2 SWords per memory directory per clock period. (M chip 2)", .pme_code = 1858, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_GETS@3", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with Get commands. Counts up to 2 SWords per memory directory per clock period. (M chip 3)", .pme_code = 1859, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_GETS@4", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with Get commands. 
Counts up to 2 SWords per memory directory per clock period. (M chip 4)", .pme_code = 1860, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_GETS@5", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with Get commands. Counts up to 2 SWords per memory directory per clock period. (M chip 5)", .pme_code = 1861, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_GETS@6", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with Get commands. Counts up to 2 SWords per memory directory per clock period. (M chip 6)", .pme_code = 1862, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_GETS@7", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with Get commands. Counts up to 2 SWords per memory directory per clock period. (M chip 7)", .pme_code = 1863, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_GETS@8", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with Get commands. Counts up to 2 SWords per memory directory per clock period. 
(M chip 8)", .pme_code = 1864, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_GETS@9", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with Get commands. Counts up to 2 SWords per memory directory per clock period. (M chip 9)", .pme_code = 1865, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_GETS@10", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with Get commands. Counts up to 2 SWords per memory directory per clock period. (M chip 10)", .pme_code = 1866, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_GETS@11", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with Get commands. Counts up to 2 SWords per memory directory per clock period. (M chip 11)", .pme_code = 1867, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_GETS@12", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with Get commands. Counts up to 2 SWords per memory directory per clock period. 
(M chip 12)", .pme_code = 1868, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_GETS@13", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with Get commands. Counts up to 2 SWords per memory directory per clock period. (M chip 13)", .pme_code = 1869, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_GETS@14", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with Get commands. Counts up to 2 SWords per memory directory per clock period. (M chip 14)", .pme_code = 1870, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_GETS@15", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with Get commands. Counts up to 2 SWords per memory directory per clock period. 
(M chip 15)", .pme_code = 1871, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 26 Event 1 */ { .pme_name = "@0", .pme_desc = "", .pme_code = 1872, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@1", .pme_desc = "", .pme_code = 1873, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@2", .pme_desc = "", .pme_code = 1874, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@3", .pme_desc = "", .pme_code = 1875, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@4", .pme_desc = "", .pme_code = 1876, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@5", .pme_desc = "", .pme_code = 1877, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 5, .pme_base = 
PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@6", .pme_desc = "", .pme_code = 1878, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@7", .pme_desc = "", .pme_code = 1879, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@8", .pme_desc = "", .pme_code = 1880, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@9", .pme_desc = "", .pme_code = 1881, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@10", .pme_desc = "", .pme_code = 1882, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@11", .pme_desc = "", .pme_code = 1883, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@12", .pme_desc = "", .pme_code = 1884, .pme_flags = 0x0, 
.pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@13", .pme_desc = "", .pme_code = 1885, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@14", .pme_desc = "", .pme_code = 1886, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@15", .pme_desc = "", .pme_code = 1887, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 26 Event 2 */ { .pme_name = "@0", .pme_desc = "", .pme_code = 1888, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@1", .pme_desc = "", .pme_code = 1889, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@2", .pme_desc = "", .pme_code = 1890, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = 
PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@3", .pme_desc = "", .pme_code = 1891, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@4", .pme_desc = "", .pme_code = 1892, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@5", .pme_desc = "", .pme_code = 1893, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@6", .pme_desc = "", .pme_code = 1894, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@7", .pme_desc = "", .pme_code = 1895, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@8", .pme_desc = "", .pme_code = 1896, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@9", .pme_desc = "", .pme_code = 1897, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = 
PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@10", .pme_desc = "", .pme_code = 1898, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@11", .pme_desc = "", .pme_code = 1899, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@12", .pme_desc = "", .pme_code = 1900, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@13", .pme_desc = "", .pme_code = 1901, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@14", .pme_desc = "", .pme_code = 1902, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@15", .pme_desc = "", .pme_code = 1903, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = 
PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 26 Event 3 */ { .pme_name = "@0", .pme_desc = "", .pme_code = 1904, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@1", .pme_desc = "", .pme_code = 1905, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@2", .pme_desc = "", .pme_code = 1906, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@3", .pme_desc = "", .pme_code = 1907, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@4", .pme_desc = "", .pme_code = 1908, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@5", .pme_desc = "", .pme_code = 1909, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@6", .pme_desc = "", .pme_code = 1910, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 
26, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@7", .pme_desc = "", .pme_code = 1911, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@8", .pme_desc = "", .pme_code = 1912, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@9", .pme_desc = "", .pme_code = 1913, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@10", .pme_desc = "", .pme_code = 1914, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@11", .pme_desc = "", .pme_code = 1915, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@12", .pme_desc = "", .pme_code = 1916, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@13", 
.pme_desc = "", .pme_code = 1917, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@14", .pme_desc = "", .pme_code = 1918, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@15", .pme_desc = "", .pme_code = 1919, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 26, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 27 Event 0 */ { .pme_name = "W_SWORD_NGETS@0", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with NGet commands. Counts up to 2 SWords per memory directory per clock period. (M chip 0)", .pme_code = 1920, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NGETS@1", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with NGet commands. Counts up to 2 SWords per memory directory per clock period. 
(M chip 1)", .pme_code = 1921, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NGETS@2", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with NGet commands. Counts up to 2 SWords per memory directory per clock period. (M chip 2)", .pme_code = 1922, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NGETS@3", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with NGet commands. Counts up to 2 SWords per memory directory per clock period. (M chip 3)", .pme_code = 1923, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NGETS@4", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with NGet commands. Counts up to 2 SWords per memory directory per clock period. (M chip 4)", .pme_code = 1924, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NGETS@5", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with NGet commands. Counts up to 2 SWords per memory directory per clock period. 
(M chip 5)", .pme_code = 1925, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NGETS@6", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with NGet commands. Counts up to 2 SWords per memory directory per clock period. (M chip 6)", .pme_code = 1926, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NGETS@7", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with NGet commands. Counts up to 2 SWords per memory directory per clock period. (M chip 7)", .pme_code = 1927, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NGETS@8", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with NGet commands. Counts up to 2 SWords per memory directory per clock period. (M chip 8)", .pme_code = 1928, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NGETS@9", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with NGet commands. Counts up to 2 SWords per memory directory per clock period. 
(M chip 9)", .pme_code = 1929, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NGETS@10", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with NGet commands. Counts up to 2 SWords per memory directory per clock period. (M chip 10)", .pme_code = 1930, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NGETS@11", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with NGet commands. Counts up to 2 SWords per memory directory per clock period. (M chip 11)", .pme_code = 1931, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NGETS@12", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with NGet commands. Counts up to 2 SWords per memory directory per clock period. (M chip 12)", .pme_code = 1932, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NGETS@13", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with NGet commands. Counts up to 2 SWords per memory directory per clock period. 
(M chip 13)", .pme_code = 1933, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NGETS@14", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with NGet commands. Counts up to 2 SWords per memory directory per clock period. (M chip 14)", .pme_code = 1934, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "W_SWORD_NGETS@15", .pme_desc = "Count of the total number of SWords that are read from memory or the L3 cache with NGet commands. Counts up to 2 SWords per memory directory per clock period. (M chip 15)", .pme_code = 1935, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 0, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 27 Event 1 */ { .pme_name = "@0", .pme_desc = "", .pme_code = 1936, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 1, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@1", .pme_desc = "", .pme_code = 1937, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 1, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@2", .pme_desc = "", .pme_code = 1938, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = 
PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 1, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@3", .pme_desc = "", .pme_code = 1939, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 1, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@4", .pme_desc = "", .pme_code = 1940, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 1, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@5", .pme_desc = "", .pme_code = 1941, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 1, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@6", .pme_desc = "", .pme_code = 1942, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 1, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@7", .pme_desc = "", .pme_code = 1943, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 1, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@8", .pme_desc = "", .pme_code = 1944, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 1, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = 
PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@9", .pme_desc = "", .pme_code = 1945, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 1, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@10", .pme_desc = "", .pme_code = 1946, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 1, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@11", .pme_desc = "", .pme_code = 1947, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 1, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@12", .pme_desc = "", .pme_code = 1948, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 1, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@13", .pme_desc = "", .pme_code = 1949, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 1, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@14", .pme_desc = "", .pme_code = 1950, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 1, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@15", .pme_desc = "", .pme_code = 1951, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 
1, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 27 Event 2 */ { .pme_name = "@0", .pme_desc = "", .pme_code = 1952, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@1", .pme_desc = "", .pme_code = 1953, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@2", .pme_desc = "", .pme_code = 1954, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@3", .pme_desc = "", .pme_code = 1955, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@4", .pme_desc = "", .pme_code = 1956, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@5", .pme_desc = "", .pme_code = 1957, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = 
"@6", .pme_desc = "", .pme_code = 1958, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@7", .pme_desc = "", .pme_code = 1959, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@8", .pme_desc = "", .pme_code = 1960, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@9", .pme_desc = "", .pme_code = 1961, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@10", .pme_desc = "", .pme_code = 1962, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@11", .pme_desc = "", .pme_code = 1963, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@12", .pme_desc = "", .pme_code = 1964, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 12, .pme_base = 
PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@13", .pme_desc = "", .pme_code = 1965, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@14", .pme_desc = "", .pme_code = 1966, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@15", .pme_desc = "", .pme_code = 1967, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 2, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, /* M Counter 27 Event 3 */ { .pme_name = "@0", .pme_desc = "", .pme_code = 1968, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 0, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@1", .pme_desc = "", .pme_code = 1969, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 1, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@2", .pme_desc = "", .pme_code = 1970, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 2, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@3", .pme_desc = "", .pme_code 
= 1971, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 3, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@4", .pme_desc = "", .pme_code = 1972, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 4, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@5", .pme_desc = "", .pme_code = 1973, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 5, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@6", .pme_desc = "", .pme_code = 1974, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 6, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@7", .pme_desc = "", .pme_code = 1975, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 7, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@8", .pme_desc = "", .pme_code = 1976, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 8, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@9", .pme_desc = "", .pme_code = 1977, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 9, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = 
PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@10", .pme_desc = "", .pme_code = 1978, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 10, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@11", .pme_desc = "", .pme_code = 1979, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 11, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@12", .pme_desc = "", .pme_code = 1980, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 12, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@13", .pme_desc = "", .pme_code = 1981, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 13, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@14", .pme_desc = "", .pme_code = 1982, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 14, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, { .pme_name = "@15", .pme_desc = "", .pme_code = 1983, .pme_flags = 0x0, .pme_numasks = 0, .pme_chip = PME_CRAYX2_CHIP_MEMORY, .pme_ctr = 27, .pme_event = 3, .pme_chipno = 15, .pme_base = PMU_CRAYX2_MEMORY_PMD_BASE, .pme_nctrs = PME_CRAYX2_MEMORY_CTRS_PER_CHIP, .pme_nchips = PME_CRAYX2_MEMORY_CHIPS }, }; #define PME_CRAYX2_CYCLES 0 #define PME_CRAYX2_INSTR_GRADUATED 4 #define PME_CRAYX2_EVENT_COUNT 
(sizeof(crayx2_pe)/sizeof(pme_crayx2_entry_t)) #endif /* __CRAYX2_EVENTS_H__ */ papi-papi-7-2-0-t/src/libperfnec/lib/gen_ia32_events.h000066400000000000000000000067541502707512200224460ustar00rootroot00000000000000/* * Copyright (c) 2006-2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ /* * architected events for architectural perfmon v1 and v2 as defined by the IA-32 developer's manual * Vol 3B, table 18-6 (May 2007) */ static pme_gen_ia32_entry_t gen_ia32_all_pe[]={ {.pme_name = "UNHALTED_CORE_CYCLES", .pme_code = 0x003c, .pme_fixed = 17, .pme_desc = "count core clock cycles whenever the clock signal on the specific core is running (not halted)" }, {.pme_name = "INSTRUCTIONS_RETIRED", .pme_code = 0x00c0, .pme_fixed = 16, .pme_desc = "count the number of instructions at retirement. 
For instructions that consist of multiple micro-ops, this event counts the retirement of the last micro-op of the instruction", }, {.pme_name = "UNHALTED_REFERENCE_CYCLES", .pme_code = 0x013c, .pme_fixed = 18, .pme_desc = "count reference clock cycles while the clock signal on the specific core is running. The reference clock operates at a fixed frequency, irrespective of core frequency changes due to performance state transitions", }, {.pme_name = "LAST_LEVEL_CACHE_REFERENCES", .pme_code = 0x4f2e, .pme_desc = "count each request originating from the core to reference a cache line in the last level cache. The count may include speculation, but excludes cache line fills due to hardware prefetch", }, {.pme_name = "LAST_LEVEL_CACHE_MISSES", .pme_code = 0x412e, .pme_desc = "count each cache miss condition for references to the last level cache. The event count may include speculation, but excludes cache line fills due to hardware prefetch", }, {.pme_name = "BRANCH_INSTRUCTIONS_RETIRED", .pme_code = 0x00c4, .pme_desc = "count branch instructions at retirement. Specifically, this event counts the retirement of the last micro-op of a branch instruction", }, {.pme_name = "MISPREDICTED_BRANCH_RETIRED", .pme_code = 0x00c5, .pme_desc = "count mispredicted branch instructions at retirement.
Specifically, this event counts at retirement of the last micro-op of a branch instruction in the architectural path of the execution and experienced misprediction in the branch prediction hardware", } }; #define PME_GEN_IA32_UNHALTED_CORE_CYCLES 0 #define PME_GEN_IA32_INSTRUCTIONS_RETIRED 1 #define PFMLIB_GEN_IA32_EVENT_COUNT (sizeof(gen_ia32_all_pe)/sizeof(pme_gen_ia32_entry_t)) papi-papi-7-2-0-t/src/libperfnec/lib/gen_mips64_events.h000066400000000000000000001525131502707512200230250ustar00rootroot00000000000000static pme_gen_mips64_entry_t gen_mips64_20K_pe[] = { {.pme_name="INSN_REQ_FROM_IFU_TO_BIU", .pme_code = 0x00000009, .pme_counters = 0x1, .pme_desc = "Instruction requests from the IFU to the BIU" }, {.pme_name="BRANCHES_MISSPREDICTED", .pme_code = 0x00000005, .pme_counters = 0x1, .pme_desc = "Branches that mispredicted before completing execution" }, {.pme_name="REPLAYS", .pme_code = 0x0000000b, .pme_counters = 0x1, .pme_desc = "Total number of LSU requested replays, Load-dependent speculative dispatch or FPU exception prediction replays." 
}, {.pme_name="JR_INSNS_COMPLETED", .pme_code = 0x0000000d, .pme_counters = 0x1, .pme_desc = "JR instruction that completed execution" }, {.pme_name="CYCLES", .pme_code = 0x00000000, .pme_counters = 0x1, .pme_desc = "CPU cycles" }, {.pme_name="REPLAY_DUE_TO_LOAD_DEPENDENT_SPEC_DISPATCH", .pme_code = 0x00000008, .pme_counters = 0x1, .pme_desc = "Replays due to load-dependent speculative dispatch" }, {.pme_name="LSU_REPLAYS", .pme_code = 0x0000000e, .pme_counters = 0x1, .pme_desc = "LSU requested replays" }, {.pme_name="FP_INSNS_COMPLETED", .pme_code = 0x00000003, .pme_counters = 0x1, .pme_desc = "Instructions completed in FPU datapath (computational event)" }, {.pme_name="FPU_EXCEPTIONS_TAKEN", .pme_code = 0x0000000a, .pme_counters = 0x1, .pme_desc = "Taken FPU exceptions" }, {.pme_name="TLB_REFILLS_TAKEN", .pme_code = 0x00000004, .pme_counters = 0x1, .pme_desc = "Taken TLB refill exceptions" }, {.pme_name="RPS_MISSPREDICTS", .pme_code = 0x0000000c, .pme_counters = 0x1, .pme_desc = "JR instructions that mispredicted using the Return Prediction Stack (RPS)" }, {.pme_name="INSN_ISSUED", .pme_code = 0x00000001, .pme_counters = 0x1, .pme_desc = "Dispatched/issued instructions" }, {.pme_name="INSNS_COMPLETED", .pme_code = 0x0000000f, .pme_counters = 0x1, .pme_desc = "Instruction that completed execution (with or without exception)" }, {.pme_name="BRANCHES_COMPLETED", .pme_code = 0x00000006, .pme_counters = 0x1, .pme_desc = "Branches that completed execution" }, {.pme_name="JTLB_EXCEPTIONS", .pme_code = 0x00000007, .pme_counters = 0x1, .pme_desc = "Taken Joint-TLB exceptions" }, {.pme_name="FETCH_GROUPS", .pme_code = 0x00000002, .pme_counters = 0x1, .pme_desc = "Fetch groups entering CPU execution pipes" }, }; static pme_gen_mips64_entry_t gen_mips64_24K_pe[] = { {.pme_name="DCACHE_MISS", .pme_code = 0x00000b0b, .pme_counters = 0x3, .pme_desc = "Data cache misses" }, {.pme_name="REPLAY_TRAPS_NOT_UTLB", .pme_code = 0x00001200, .pme_counters = 0x2, .pme_desc = "``replay 
traps'' (other than micro-TLB related)" }, {.pme_name="ITLB_ACCESSES", .pme_code = 0x00000005, .pme_counters = 0x1, .pme_desc = "Instruction micro-TLB accesses" }, {.pme_name="INSTRUCTIONS", .pme_code = 0x00000101, .pme_counters = 0x3, .pme_desc = "Instructions completed" }, {.pme_name="LOADS_COMPLETED", .pme_code = 0x0000000f, .pme_counters = 0x1, .pme_desc = "Loads completed (including FP)" }, {.pme_name="SC_COMPLETE_BUT_FAILED", .pme_code = 0x00001300, .pme_counters = 0x2, .pme_desc = "sc instructions completed, but store failed (because the link bit had been cleared)." }, {.pme_name="JTLB_DATA_MISSES", .pme_code = 0x00000800, .pme_counters = 0x2, .pme_desc = "Joint TLB data (non-instruction) misses" }, {.pme_name="L2_MISSES", .pme_code = 0x00001616, .pme_counters = 0x3, .pme_desc = "L2 cache misses" }, {.pme_name="SC_COMPLETED", .pme_code = 0x00000013, .pme_counters = 0x1, .pme_desc = "sc instructions completed" }, {.pme_name="SUPERFLUOUS_INSTRUCTIONS", .pme_code = 0x00001400, .pme_counters = 0x2, .pme_desc = "``superfluous'' prefetch instructions (data was already in cache)." 
}, {.pme_name="DCACHE_WRITEBACKS", .pme_code = 0x00000a00, .pme_counters = 0x2, .pme_desc = "Data cache writebacks" }, {.pme_name="JR_31_MISSPREDICTS", .pme_code = 0x00000300, .pme_counters = 0x2, .pme_desc = "jr r31 (return) mispredictions" }, {.pme_name="JTLB_DATA_ACCESSES", .pme_code = 0x00000007, .pme_counters = 0x1, .pme_desc = "Joint TLB instruction accesses" }, {.pme_name="ICACHE_MISSES", .pme_code = 0x00000900, .pme_counters = 0x2, .pme_desc = "Instruction cache misses" }, {.pme_name="STALLS", .pme_code = 0x00000012, .pme_counters = 0x1, .pme_desc = "Stalls" }, {.pme_name="INTEGER_INSNS_COMPLETED", .pme_code = 0x0000000e, .pme_counters = 0x1, .pme_desc = "Integer instructions completed" }, {.pme_name="INTEGER_MUL_DIV_COMPLETED", .pme_code = 0x00001100, .pme_counters = 0x2, .pme_desc = "integer multiply/divide unit instructions completed" }, {.pme_name="STORES_COMPLETED", .pme_code = 0x00000f00, .pme_counters = 0x2, .pme_desc = "Stores completed (including FP)" }, {.pme_name="MIPS16_INSTRUCTIONS_COMPLETED", .pme_code = 0x00001000, .pme_counters = 0x2, .pme_desc = "MIPS16 instructions completed" }, {.pme_name="BRANCHES_LAUNCHED", .pme_code = 0x00000002, .pme_counters = 0x1, .pme_desc = "Branch instructions launched (whether completed or mispredicted)" }, {.pme_name="SCACHE_ACCESSES", .pme_code = 0x00001500, .pme_counters = 0x2, .pme_desc = "L2 cache accesses" }, {.pme_name="JR_31_LAUNCHED", .pme_code = 0x00000003, .pme_counters = 0x1, .pme_desc = "jr r31 (return) instructions launched (whether completed or mispredicted)" }, {.pme_name="PREFETCH_COMPLETED", .pme_code = 0x00000014, .pme_counters = 0x1, .pme_desc = "Prefetch instructions completed" }, {.pme_name="EXCEPTIONS_TAKEN", .pme_code = 0x00000017, .pme_counters = 0x1, .pme_desc = "Exceptions taken" }, {.pme_name="JR_NON_31_LAUNCHED", .pme_code = 0x00000004, .pme_counters = 0x1, .pme_desc = "jr (not r31) issues, which cost the same as a mispredict." 
}, {.pme_name="DTLB_ACCESSES", .pme_code = 0x00000006, .pme_counters = 0x1, .pme_desc = "Data micro-TLB accesses" }, {.pme_name="JTLB_INSTRUCTION_ACCESSES", .pme_code = 0x00000008, .pme_counters = 0x1, .pme_desc = "Joint TLB data (non-instruction) accesses" }, {.pme_name="CACHE_FIXUPS", .pme_code = 0x00000018, .pme_counters = 0x1, .pme_desc = "``cache fixup'' events (specific to the 24K family microarchitecture)." }, {.pme_name="INSTRUCTION_CACHE_ACCESSES", .pme_code = 0x00000009, .pme_counters = 0x1, .pme_desc = "Instruction cache accesses" }, {.pme_name="DTLB_MISSES", .pme_code = 0x00000600, .pme_counters = 0x2, .pme_desc = "Data micro-TLB misses" }, {.pme_name="J_JAL_INSNS_COMPLETED", .pme_code = 0x00000010, .pme_counters = 0x1, .pme_desc = "j/jal instructions completed" }, {.pme_name="DCACHE_ACCESSES", .pme_code = 0x0000000a, .pme_counters = 0x1, .pme_desc = "Data cache accesses" }, {.pme_name="BRANCH_MISSPREDICTS", .pme_code = 0x00000200, .pme_counters = 0x2, .pme_desc = "Branch mispredictions" }, {.pme_name="SCACHE_WRITEBACKS", .pme_code = 0x00000015, .pme_counters = 0x1, .pme_desc = "L2 cache writebacks" }, {.pme_name="CYCLES", .pme_code = 0x00000000, .pme_counters = 0x3, .pme_desc = "Cycles" }, {.pme_name="JTLB_INSN_MISSES", .pme_code = 0x00000700, .pme_counters = 0x2, .pme_desc = "Joint TLB instruction misses" }, {.pme_name="FPU_INSNS_NON_LOAD_STORE_COMPLETED", .pme_code = 0x00000e00, .pme_counters = 0x2, .pme_desc = "FPU instructions completed (not including loads/stores)" }, {.pme_name="NOPS_COMPLETED", .pme_code = 0x00000011, .pme_counters = 0x1, .pme_desc = "no-ops completed, ie instructions writing $0" }, {.pme_name="ITLB_MISSES", .pme_code = 0x00000500, .pme_counters = 0x2, .pme_desc = "Instruction micro-TLB misses" }, }; static pme_gen_mips64_entry_t gen_mips64_25K_pe[] = { {.pme_name="INSNS_FETCHED_FROM_ICACHE", .pme_code = 0x00001818, .pme_counters = 0x3, .pme_desc = "Total number of instructions fetched from the I-Cache" }, 
{.pme_name="FP_EXCEPTIONS_TAKEN", .pme_code = 0x00000b0b, .pme_counters = 0x3, .pme_desc = "Taken FPU exceptions" }, {.pme_name="INSN_ISSUED", .pme_code = 0x00000101, .pme_counters = 0x3, .pme_desc = "Dispatched/issued instructions" }, {.pme_name="STORE_INSNS_ISSUED", .pme_code = 0x00000505, .pme_counters = 0x3, .pme_desc = "Store instructions issued" }, {.pme_name="L2_MISSES", .pme_code = 0x00001e1e, .pme_counters = 0x3, .pme_desc = "L2 Cache miss" }, {.pme_name="REPLAYS_LOAD_DEP_DISPATCH", .pme_code = 0x00002323, .pme_counters = 0x3, .pme_desc = "replays due to load-dependent speculative dispatch" }, {.pme_name="BRANCHES_JUMPS_ISSUED", .pme_code = 0x00000606, .pme_counters = 0x3, .pme_desc = "Branch/Jump instructions issued" }, {.pme_name="REPLAYS_LSU_LOAD_DEP_FPU", .pme_code = 0x00002121, .pme_counters = 0x3, .pme_desc = "LSU requested replays, load-dependent speculative dispatch, FPU exception prediction" }, {.pme_name="INSNS_COMPLETE", .pme_code = 0x00000808, .pme_counters = 0x3, .pme_desc = "Instruction that completed execution (with or without exception)" }, {.pme_name="JTLB_MISSES_LOADS_STORES", .pme_code = 0x00001313, .pme_counters = 0x3, .pme_desc = "Raw count of Joint-TLB misses for loads/stores" }, {.pme_name="CACHEABLE_DCACHE_REQUEST", .pme_code = 0x00001d1d, .pme_counters = 0x3, .pme_desc = "number of cacheable requests to D-Cache" }, {.pme_name="DCACHE_WRITEBACKS", .pme_code = 0x00001c1c, .pme_counters = 0x3, .pme_desc = "D-Cache number of write-backs" }, {.pme_name="ICACHE_MISSES", .pme_code = 0x00001a1a, .pme_counters = 0x3, .pme_desc = "I-Cache miss" }, {.pme_name="ICACHE_PSEUDO_HITS", .pme_code = 0x00002626, .pme_counters = 0x3, .pme_desc = "I-Cache pseudo-hits" }, {.pme_name="FP_EXCEPTION_PREDICTED", .pme_code = 0x00000c0c, .pme_counters = 0x3, .pme_desc = "Predicted FPU exceptions" }, {.pme_name="LOAD_STORE_ISSUED", .pme_code = 0x00002727, .pme_counters = 0x3, .pme_desc = "Load/store instructions issued" }, {.pme_name="REPLAYS_WBB_FULL", 
.pme_code = 0x00002424, .pme_counters = 0x3, .pme_desc = "replays due to WBB full" }, {.pme_name="L2_WBACKS", .pme_code = 0x00001f1f, .pme_counters = 0x3, .pme_desc = "L2 Cache number of write-backs" }, {.pme_name="JR_COMPLETED", .pme_code = 0x00001010, .pme_counters = 0x3, .pme_desc = "JR instruction that completed execution" }, {.pme_name="JR_RPD_MISSPREDICTED", .pme_code = 0x00000f0f, .pme_counters = 0x3, .pme_desc = "JR instructions that mispredicted using the Return Prediction Stack" }, {.pme_name="JTLB_IFETCH_REFILL_EXCEPTIONS", .pme_code = 0x00001515, .pme_counters = 0x3, .pme_desc = "Joint-TLB refill exceptions due to instruction fetch" }, {.pme_name="DUAL_ISSUED_PAIRS", .pme_code = 0x00000707, .pme_counters = 0x3, .pme_desc = "Dual-issued pairs" }, {.pme_name="FSB_FULL_REPLAYS", .pme_code = 0x00002525, .pme_counters = 0x3, .pme_desc = "replays due to FSB full" }, {.pme_name="JTLB_REFILL_EXCEPTIONS", .pme_code = 0x00001717, .pme_counters = 0x3, .pme_desc = "total Joint-TLB Instruction exceptions (refill)" }, {.pme_name="INT_INSNS_ISSUED", .pme_code = 0x00000303, .pme_counters = 0x3, .pme_desc = "Integer instructions issued" }, {.pme_name="FP_INSNS_ISSUED", .pme_code = 0x00000202, .pme_counters = 0x3, .pme_desc = "FPU instructions issued" }, {.pme_name="BRANCHES_MISSPREDICTED", .pme_code = 0x00000d0d, .pme_counters = 0x3, .pme_desc = "Branches that mispredicted before completing execution" }, {.pme_name="FETCH_GROUPS_IN_PIPE", .pme_code = 0x00000909, .pme_counters = 0x3, .pme_desc = "Fetch groups entering CPU execution pipes" }, {.pme_name="CACHEABLE_L2_REQS", .pme_code = 0x00002020, .pme_counters = 0x3, .pme_desc = "Number of cacheable requests to L2" }, {.pme_name="JTLB_DATA_ACCESS_REFILL_EXCEPTIONS", .pme_code = 0x00001616, .pme_counters = 0x3, .pme_desc = "Joint-TLB refill exceptions due to data access" }, {.pme_name="UTLB_MISSES", .pme_code = 0x00001111, .pme_counters = 0x3, .pme_desc = "U-TLB misses" }, {.pme_name="LOAD_INSNS_ISSUED", .pme_code = 
0x00000404, .pme_counters = 0x3, .pme_desc = "Load instructions issued" }, {.pme_name="JTLB_MISSES_IFETCH", .pme_code = 0x00001212, .pme_counters = 0x3, .pme_desc = "Raw count of Joint-TLB misses for instruction fetch" }, {.pme_name="CYCLES", .pme_code = 0x00000000, .pme_counters = 0x3, .pme_desc = "CPU cycles" }, {.pme_name="LSU_REQ_REPLAYS", .pme_code = 0x00002222, .pme_counters = 0x3, .pme_desc = "LSU requested replays" }, {.pme_name="INSN_REQ_FROM_IFU_BIU", .pme_code = 0x00001919, .pme_counters = 0x3, .pme_desc = "instruction requests from the IFU to the BIU" }, {.pme_name="JTLB_EXCEPTIONS", .pme_code = 0x00001414, .pme_counters = 0x3, .pme_desc = "Refill, Invalid and Modified TLB exceptions" }, {.pme_name="BRANCHES_COMPLETED", .pme_code = 0x00000e0e, .pme_counters = 0x3, .pme_desc = "Branches that completed execution" }, {.pme_name="INSN_FP_DATAPATH_COMPLETED", .pme_code = 0x00000a0a, .pme_counters = 0x3, .pme_desc = "Instructions completed in FPU datapath (computational instructions only)" }, {.pme_name="DCACHE_MISSES", .pme_code = 0x00001b1b, .pme_counters = 0x3, .pme_desc = "D-Cache miss" }, }; static pme_gen_mips64_entry_t gen_mips64_34K_pe[] = { {.pme_name="YIELD_INSNS", .pme_code = 0x00220022, .pme_counters = 0x5, .pme_desc = "yield instructions." }, {.pme_name="BRANCH_MISPREDICT_STALLS", .pme_code = 0x002e002e, .pme_counters = 0x5, .pme_desc = "Branch mispredict stalls" }, {.pme_name="SC_FAILED_INSNS", .pme_code = 0x00130013, .pme_counters = 0x5, .pme_desc = "sc instructions completed, but store failed (because the link bit had been cleared)." 
}, {.pme_name="ITC_LOAD_STORE_STALLS", .pme_code = 0x00280028, .pme_counters = 0x5, .pme_desc = "ITC load/store stalls" }, {.pme_name="ITC_LOADS", .pme_code = 0x00200020, .pme_counters = 0x5, .pme_desc = "ITC Loads" }, {.pme_name="LOADS_COMPLETED", .pme_code = 0x000f000f, .pme_counters = 0x5, .pme_desc = "Loads completed (including FP)" }, {.pme_name="BRANCH_INSNS_LAUNCHED", .pme_code = 0x00020002, .pme_counters = 0x5, .pme_desc = "Branch instructions launched (whether completed or mispredicted)" }, {.pme_name="DATA_SIDE_SCRATCHPAD_ACCESS_STALLS", .pme_code = 0x002b002b, .pme_counters = 0x5, .pme_desc = "Data-side scratchpad access stalls" }, {.pme_name="FB_ENTRY_ALLOCATED", .pme_code = 0x00300030, .pme_counters = 0x5, .pme_desc = "FB entry allocated" }, {.pme_name="CP2_STALLS", .pme_code = 0x002a002a, .pme_counters = 0x5, .pme_desc = "CP2 stalls" }, {.pme_name="FSB_25_50_FULL", .pme_code = 0x00320032, .pme_counters = 0x5, .pme_desc = "FSB 25-50% full" }, {.pme_name="CACHE_FIXUP_EVENTS", .pme_code = 0x00180018, .pme_counters = 0x5, .pme_desc = "cache fixup events (specific to the 34K family microarchitecture)" }, {.pme_name="IFU_FB_FULL_REFETCHES", .pme_code = 0x00300030, .pme_counters = 0x5, .pme_desc = "IFU FB full re-fetches" }, {.pme_name="L1_DCACHE_MISS_STALLS", .pme_code = 0x00250025, .pme_counters = 0x5, .pme_desc = "L1 D-cache miss stalls" }, {.pme_name="INT_MUL_DIV_UNIT_INSNS_COMPLETED", .pme_code = 0x00110011, .pme_counters = 0x5, .pme_desc = "integer multiply/divide unit instructions completed" }, {.pme_name="JTLB_INSN_ACCESSES", .pme_code = 0x00070007, .pme_counters = 0x5, .pme_desc = "Joint TLB instruction accesses" }, {.pme_name="ALU_STALLS", .pme_code = 0x00190019, .pme_counters = 0x5, .pme_desc = "ALU stalls" }, {.pme_name="FPU_STALLS", .pme_code = 0x00290029, .pme_counters = 0x5, .pme_desc = "FPU stalls" }, {.pme_name="JTLB_DATA_ACCESSES", .pme_code = 0x00080008, .pme_counters = 0x5, .pme_desc = "Joint TLB data (non-instruction) accesses" }, 
{.pme_name="INTEGER_INSNS_COMPLETED", .pme_code = 0x000e000e, .pme_counters = 0x5, .pme_desc = "Integer instructions completed" }, {.pme_name="MFC2_MTC2_INSNS", .pme_code = 0x00230023, .pme_counters = 0x5, .pme_desc = "CP2 move to/from instructions." }, {.pme_name="STORES_COMPLETED", .pme_code = 0x000f000f, .pme_counters = 0x5, .pme_desc = "Stores completed (including FP)" }, {.pme_name="JR_NON_31_INSN_EXECED", .pme_code = 0x00040004, .pme_counters = 0x5, .pme_desc = "jr $xx (not $31), which cost the same as a mispredict." }, {.pme_name="EXCEPTIONS_TAKEN", .pme_code = 0x00170017, .pme_counters = 0x5, .pme_desc = "Exceptions taken" }, {.pme_name="L2_MISS_PENDING_CYCLES", .pme_code = 0x00270027, .pme_counters = 0x5, .pme_desc = "Cycles where L2 miss is pending" }, {.pme_name="LDQ_FULL_PIPE_STALLS", .pme_code = 0x00350035, .pme_counters = 0x5, .pme_desc = "LDQ full pipeline stalls" }, {.pme_name="DTLB_ACCESSES", .pme_code = 0x00060006, .pme_counters = 0x5, .pme_desc = "Data micro-TLB accesses" }, {.pme_name="SUPERFLUOUS_PREFETCHES", .pme_code = 0x00140014, .pme_counters = 0x5, .pme_desc = "``superfluous'' prefetch instructions (data was already in cache)." 
}, {.pme_name="LDQ_LESS_25_FULL", .pme_code = 0x00340034, .pme_counters = 0x5, .pme_desc = "LDQ < 25% full" }, {.pme_name="FORK_INSTRUCTIONS", .pme_code = 0x00220022, .pme_counters = 0x5, .pme_desc = "fork instructions" }, {.pme_name="UNCACHED_LOAD_STALLS", .pme_code = 0x00280028, .pme_counters = 0x5, .pme_desc = "Uncached load stalls" }, {.pme_name="FSB_FULL_PIPE_STALLS", .pme_code = 0x00330033, .pme_counters = 0x5, .pme_desc = "FSB full pipeline stalls" }, {.pme_name="MDU_STALLS", .pme_code = 0x00290029, .pme_counters = 0x5, .pme_desc = "MDU stalls" }, {.pme_name="FSB_LESS_25_FULL", .pme_code = 0x00320032, .pme_counters = 0x5, .pme_desc = "FSB < 25% full" }, {.pme_name="UNCACHED_LOADS", .pme_code = 0x00210021, .pme_counters = 0x5, .pme_desc = "Uncached Loads" }, {.pme_name="NO_OPS_COMPLETED", .pme_code = 0x00110011, .pme_counters = 0x5, .pme_desc = "no-ops completed, ie instructions writing $0" }, {.pme_name="DATA_SIDE_SCRATCHPAD_RAM_LOGIC", .pme_code = 0x001d001d, .pme_counters = 0x5, .pme_desc = "Data-side scratchpad RAM logic" }, {.pme_name="CYCLES_INSN_NOT_IN_SKID_BUFFER", .pme_code = 0x00180018, .pme_counters = 0x5, .pme_desc = "Cycles lost when an unblocked thread's instruction isn't in the skid buffer, and must be re-fetched from I-cache." 
}, {.pme_name="ITC_LOGIC", .pme_code = 0x001f001f, .pme_counters = 0x5, .pme_desc = "ITC logic" }, {.pme_name="L2_IMISS_STALLS", .pme_code = 0x00260026, .pme_counters = 0x5, .pme_desc = "L2 I-miss stalls" }, {.pme_name="DSP_RESULT_SATURATED", .pme_code = 0x00240024, .pme_counters = 0x5, .pme_desc = "DSP result saturated" }, {.pme_name="INSTRUCTIONS", .pme_code = 0x01010101, .pme_counters = 0xf, .pme_desc = "Instructions completed" }, {.pme_name="ITLB_ACCESSES", .pme_code = 0x00050005, .pme_counters = 0x5, .pme_desc = "Instruction micro-TLB accesses" }, {.pme_name="CP2_REG_TO_REG_INSNS", .pme_code = 0x00230023, .pme_counters = 0x5, .pme_desc = "CP2 register-to-register instructions" }, {.pme_name="SC_INSNS_COMPLETED", .pme_code = 0x00130013, .pme_counters = 0x5, .pme_desc = "sc instructions completed" }, {.pme_name="COREEXTEND_STALLS", .pme_code = 0x002a002a, .pme_counters = 0x5, .pme_desc = "CorExtend stalls" }, {.pme_name="LOAD_USE_STALLS", .pme_code = 0x002d002d, .pme_counters = 0x5, .pme_desc = "Load to Use stalls" }, {.pme_name="JR_31_INSN_EXECED", .pme_code = 0x00030003, .pme_counters = 0x5, .pme_desc = "jr $31 (return) instructions executed." }, {.pme_name="JR_31_MISPREDICTS", .pme_code = 0x00030003, .pme_counters = 0x5, .pme_desc = "jr $31 mispredictions." }, {.pme_name="REPLAY_CYCLES", .pme_code = 0x00120012, .pme_counters = 0x5, .pme_desc = "Cycles lost due to ``replays'' - when a thread blocks, its instructions in the pipeline are discarded to allow other threads to advance." 
}, {.pme_name="L2_MISSES", .pme_code = 0x16161616, .pme_counters = 0xf, .pme_desc = "L2 cache misses" }, {.pme_name="JTLB_DATA_MISSES", .pme_code = 0x00080008, .pme_counters = 0x5, .pme_desc = "Joint TLB data (non-instruction) misses" }, {.pme_name="SYSTEM_INTERFACE", .pme_code = 0x001e001e, .pme_counters = 0x5, .pme_desc = "System interface" }, {.pme_name="BRANCH_MISPREDICTS", .pme_code = 0x00020002, .pme_counters = 0x5, .pme_desc = "Branch mispredictions" }, {.pme_name="ITC_STORES", .pme_code = 0x00200020, .pme_counters = 0x5, .pme_desc = "ITC Stores" }, {.pme_name="LDQ_OVER_50_FULL", .pme_code = 0x00350035, .pme_counters = 0x5, .pme_desc = "LDQ > 50% full" }, {.pme_name="FSB_OVER_50_FULL", .pme_code = 0x00330033, .pme_counters = 0x5, .pme_desc = "FSB > 50% full" }, {.pme_name="STALLS_NO_ROOM_PENDING_WRITE", .pme_code = 0x002c002c, .pme_counters = 0x5, .pme_desc = "Stalls when no more room to store pending write." }, {.pme_name="JR_31_NOT_PREDICTED", .pme_code = 0x00040004, .pme_counters = 0x5, .pme_desc = "jr $31 not predicted (stack mismatch)." }, {.pme_name="EXTERNAL_YIELD_MANAGER_LOGIC", .pme_code = 0x001f001f, .pme_counters = 0x5, .pme_desc = "External Yield Manager logic" }, {.pme_name="DCACHE_WRITEBACKS", .pme_code = 0x000a000a, .pme_counters = 0x5, .pme_desc = "Data cache writebacks" }, {.pme_name="RELAX_BUBBLES", .pme_code = 0x002f002f, .pme_counters = 0x5, .pme_desc = "``Relax bubbles'' - when thread scheduler chooses to schedule nothing to reduce power consumption." 
}, {.pme_name="ICACHE_MISSES", .pme_code = 0x00090009, .pme_counters = 0x5, .pme_desc = "Instruction cache misses" }, {.pme_name="MIPS16_INSNS_COMPLETED", .pme_code = 0x00100010, .pme_counters = 0x5, .pme_desc = "MIPS16 instructions completed" }, {.pme_name="OTHER_INTERLOCK_STALLS", .pme_code = 0x002e002e, .pme_counters = 0x5, .pme_desc = "Other interlock stalls" }, {.pme_name="L2_CACHE_WRITEBACKS", .pme_code = 0x00150015, .pme_counters = 0x5, .pme_desc = "L2 cache writebacks" }, {.pme_name="WBB_LESS_25_FULL", .pme_code = 0x00360036, .pme_counters = 0x5, .pme_desc = "WBB < 25% full" }, {.pme_name="L2_DCACHE_MISS_STALLS", .pme_code = 0x00260026, .pme_counters = 0x5, .pme_desc = "L2 D-miss stalls" }, {.pme_name="CACHE_INSTRUCTION_STALLS", .pme_code = 0x002c002c, .pme_counters = 0x5, .pme_desc = "Stalls due to cache instructions" }, {.pme_name="L1_DCACHE_MISS_PENDING_CYCLES", .pme_code = 0x00270027, .pme_counters = 0x5, .pme_desc = "Cycles where L1 D-cache miss pending" }, {.pme_name="ALU_TO_AGEN_STALLS", .pme_code = 0x002d002d, .pme_counters = 0x5, .pme_desc = "ALU to AGEN stalls" }, {.pme_name="L2_ACCESSES", .pme_code = 0x00150015, .pme_counters = 0x5, .pme_desc = "L2 cache accesses" }, {.pme_name="J_JAL_INSN_COMPLETED", .pme_code = 0x00100010, .pme_counters = 0x5, .pme_desc = "j/jal instructions completed" }, {.pme_name="ALL_STALLS", .pme_code = 0x00120012, .pme_counters = 0x5, .pme_desc = "All stalls (no action in RF pipe stage)" }, {.pme_name="DSP_INSTRUCTIONS", .pme_code = 0x00240024, .pme_counters = 0x5, .pme_desc = "DSP instructions" }, {.pme_name="UNCACHED_STORES", .pme_code = 0x00210021, .pme_counters = 0x5, .pme_desc = "Uncached Stores" }, {.pme_name="WBB_FULL_PIPE_STALLS", .pme_code = 0x00370037, .pme_counters = 0x5, .pme_desc = "WBB full pipeline stalls" }, {.pme_name="INSN_CACHE_ACCESSES", .pme_code = 0x00090009, .pme_counters = 0x5, .pme_desc = "Instruction cache accesses" }, {.pme_name="EXT_POLICY_MANAGER", .pme_code = 0x001c001c, .pme_counters = 0x5, 
.pme_desc = "External policy manager" }, {.pme_name="WBB_OVER_50_FULL", .pme_code = 0x00370037, .pme_counters = 0x5, .pme_desc = "WBB > 50% full" }, {.pme_name="DTLB_MISSES", .pme_code = 0x00060006, .pme_counters = 0x5, .pme_desc = "Data micro-TLB misses" }, {.pme_name="DCACHE_ACCESSES", .pme_code = 0x000a000a, .pme_counters = 0x5, .pme_desc = "Data cache accesses" }, {.pme_name="COREEXTEND_LOGIC", .pme_code = 0x001e001e, .pme_counters = 0x5, .pme_desc = "CorExtend logic" }, {.pme_name="LDQ_25_50_FULL", .pme_code = 0x00340034, .pme_counters = 0x5, .pme_desc = "LDQ 25-50% full" }, {.pme_name="PREFETCH_INSNS_COMPLETED", .pme_code = 0x00140014, .pme_counters = 0x5, .pme_desc = "Prefetch instructions completed" }, {.pme_name="CYCLES", .pme_code = 0x00000000, .pme_counters = 0xf, .pme_desc = "Cycles" }, {.pme_name="L1_ICACHE_MISS_STALLS", .pme_code = 0x00250025, .pme_counters = 0x5, .pme_desc = "L1 I-cache miss stalls" }, {.pme_name="JTLB_INSN_MISSES", .pme_code = 0x00070007, .pme_counters = 0x5, .pme_desc = "Joint TLB instruction misses" }, {.pme_name="COP2", .pme_code = 0x001c001c, .pme_counters = 0x5, .pme_desc = "Co-Processor 2" }, {.pme_name="FPU_INSNS_COMPLETED", .pme_code = 0x000e000e, .pme_counters = 0x5, .pme_desc = "FPU instructions completed (not including loads/stores)" }, {.pme_name="ITLB_MISSES", .pme_code = 0x00050005, .pme_counters = 0x5, .pme_desc = "Instruction micro-TLB misses" }, {.pme_name="IFU_STALLS", .pme_code = 0x00190019, .pme_counters = 0x5, .pme_desc = "IFU stalls (when no instruction offered) ALU stalls" }, {.pme_name="WBB_25_50_FULL", .pme_code = 0x00360036, .pme_counters = 0x5, .pme_desc = "WBB 25-50% full" }, {.pme_name="DCACHE_MISSES", .pme_code = 0x0b0b0b0b, .pme_counters = 0xf, .pme_desc = "Data cache misses" }, }; static pme_gen_mips64_entry_t gen_mips64_5K_pe[] = { {.pme_name="DCACHE_LINE_EVICTED", .pme_code = 0x00000600, .pme_counters = 0x2, .pme_desc = "Data cache line evicted" }, {.pme_name="LOADS_EXECED", .pme_code = 0x00000202, 
.pme_counters = 0x3, .pme_desc = "Load/pref(x)/sync/cache-ops executed" }, {.pme_name="INSN_SCHEDULED", .pme_code = 0x0000000a, .pme_counters = 0x1, .pme_desc = "Instruction scheduled" }, {.pme_name="DUAL_ISSUED_INSNS", .pme_code = 0x0000000e, .pme_counters = 0x1, .pme_desc = "Dual issued instructions executed" }, {.pme_name="BRANCHES_MISSPREDICTED", .pme_code = 0x00000800, .pme_counters = 0x2, .pme_desc = "Branch mispredicted" }, {.pme_name="CONFLICT_STALL_M_STAGE", .pme_code = 0x00000a00, .pme_counters = 0x2, .pme_desc = "Instruction stall in M stage due to scheduling conflicts" }, {.pme_name="STORES_EXECED", .pme_code = 0x00000303, .pme_counters = 0x3, .pme_desc = "Stores (including conditional stores) executed" }, {.pme_name="DCACHE_MISS", .pme_code = 0x00000900, .pme_counters = 0x2, .pme_desc = "Data cache miss" }, {.pme_name="INSN_FETCHED", .pme_code = 0x00000001, .pme_counters = 0x1, .pme_desc = "Instructions fetched" }, {.pme_name="TLB_MISS_EXCEPTIONS", .pme_code = 0x00000700, .pme_counters = 0x2, .pme_desc = "TLB miss exceptions" }, {.pme_name="COP2_INSNS_EXECED", .pme_code = 0x00000f00, .pme_counters = 0x2, .pme_desc = "COP2 instructions executed" }, {.pme_name="FAILED_COND_STORES", .pme_code = 0x00000005, .pme_counters = 0x1, .pme_desc = "Failed conditional stores" }, {.pme_name="INSNS_EXECED", .pme_code = 0x0000010f, .pme_counters = 0x3, .pme_desc = "Instructions executed" }, {.pme_name="ICACHE_MISS", .pme_code = 0x00000009, .pme_counters = 0x1, .pme_desc = "Instruction cache miss" }, {.pme_name="COND_STORES_EXECED", .pme_code = 0x00000404, .pme_counters = 0x3, .pme_desc = "Conditional stores executed" }, {.pme_name="FP_INSNS_EXECED", .pme_code = 0x00000500, .pme_counters = 0x2, .pme_desc = "Floating-point instructions executed" }, {.pme_name="DTLB_MISSES", .pme_code = 0x00000008, .pme_counters = 0x1, .pme_desc = "DTLB miss" }, {.pme_name="BRANCHES_EXECED", .pme_code = 0x00000006, .pme_counters = 0x1, .pme_desc = "Branches executed" }, 
{.pme_name="CYCLES", .pme_code = 0x00000000, .pme_counters = 0x3, .pme_desc = "Cycles" }, {.pme_name="ITLB_MISSES", .pme_code = 0x00000007, .pme_counters = 0x1, .pme_desc = "ITLB miss" }, }; static pme_gen_mips64_entry_t gen_mips64_r10000_pe[] = { {.pme_name="BRANCHES_RESOLVED", .pme_code = 0x00000006, .pme_counters = 0x1, .pme_desc = "Branches resolved" }, {.pme_name="TLB_REFILL_EXCEPTIONS", .pme_code = 0x00000700, .pme_counters = 0x2, .pme_desc = "TLB refill exceptions" }, {.pme_name="EXTERNAL_INTERVENTION_RQ", .pme_code = 0x0000000c, .pme_counters = 0x1, .pme_desc = "External intervention requests" }, {.pme_name="STORES_GRADUATED", .pme_code = 0x00000300, .pme_counters = 0x2, .pme_desc = "Stores graduated" }, {.pme_name="SCACHE_WAY_MISPREDICTED_INSN", .pme_code = 0x0000000b, .pme_counters = 0x1, .pme_desc = "Secondary cache way mispredicted (instruction)" }, {.pme_name="INSTRUCTION_CACHE_MISSES", .pme_code = 0x00000009, .pme_counters = 0x1, .pme_desc = "Instruction cache misses" }, {.pme_name="SCACHE_MISSES_DATA", .pme_code = 0x00000a00, .pme_counters = 0x2, .pme_desc = "Secondary cache misses (data)" }, {.pme_name="QUADWORDS_WB_FROM_PRIMARY_DCACHE", .pme_code = 0x00000600, .pme_counters = 0x2, .pme_desc = "Quadwords written back from primary data cache" }, {.pme_name="EXTERNAL_INVALIDATE_RQ_HITS_SCACHE", .pme_code = 0x00000d00, .pme_counters = 0x2, .pme_desc = "External invalidate request is determined to have hit in secondary cache" }, {.pme_name="LOAD_PREFETC_SYNC_CACHEOP_ISSUED", .pme_code = 0x00000002, .pme_counters = 0x1, .pme_desc = "Load / prefetch / sync / CacheOp issued" }, {.pme_name="STORES_OR_STORE_PREF_TO_SHD_SCACHE_BLOCKS", .pme_code = 0x00000f00, .pme_counters = 0x2, .pme_desc = "Stores or prefetches with store hint to Shared secondary cache blocks" }, {.pme_name="STORE_COND_ISSUED", .pme_code = 0x00000004, .pme_counters = 0x1, .pme_desc = "Store conditional issued" }, {.pme_name="BRANCHES_MISPREDICTED", .pme_code = 0x00000800, .pme_counters = 
0x2, .pme_desc = "Branches mispredicted" }, {.pme_name="EXTERNAL_INVALIDATE_RQ", .pme_code = 0x0000000d, .pme_counters = 0x1, .pme_desc = "External invalidate requests" }, {.pme_name="LOAD_PREFETC_SYNC_CACHEOP_GRADUATED", .pme_code = 0x00000200, .pme_counters = 0x2, .pme_desc = "Load / prefetch / sync / CacheOp graduated" }, {.pme_name="INSTRUCTIONS_ISSUED", .pme_code = 0x00000001, .pme_counters = 0x1, .pme_desc = "Instructions issued" }, {.pme_name="INSTRUCTION_GRADUATED", .pme_code = 0x0000000f, .pme_counters = 0x1, .pme_desc = "Instructions graduated" }, {.pme_name="EXTERNAL_INTERVENTION_RQ_HITS_SCACHE", .pme_code = 0x00000c00, .pme_counters = 0x2, .pme_desc = "External intervention request is determined to have hit in secondary cache" }, {.pme_name="SCACHE_MISSES_INSTRUCTION", .pme_code = 0x0000000a, .pme_counters = 0x1, .pme_desc = "Secondary cache misses (instruction)" }, {.pme_name="SCACHE_LOAD_STORE_CACHEOP_OPERATIONS", .pme_code = 0x00000900, .pme_counters = 0x2, .pme_desc = "Secondary cache load / store and cache-ops operations" }, {.pme_name="STORES_OR_STORE_PREF_TO_CLEANEXCLUSIVE_SCACHE_BLOCKS", .pme_code = 0x00000e00, .pme_counters = 0x2, .pme_desc = "Stores or prefetches with store hint to CleanExclusive secondary cache blocks" }, {.pme_name="INSTRUCTIONS_GRADUATED", .pme_code = 0x00000100, .pme_counters = 0x2, .pme_desc = "Instructions graduated" }, {.pme_name="FP_INSTRUCTON_GRADUATED", .pme_code = 0x00000500, .pme_counters = 0x2, .pme_desc = "Floating-point instructions graduated" }, {.pme_name="STORES_ISSUED", .pme_code = 0x00000003, .pme_counters = 0x1, .pme_desc = "Stores issued" }, {.pme_name="CYCLES", .pme_code = 0x00000000, .pme_counters = 0x3, .pme_desc = "Cycles" }, {.pme_name="CORRECTABLE_ECC_ERRORS_SCACHE", .pme_code = 0x00000008, .pme_counters = 0x1, .pme_desc = "Correctable ECC errors on secondary cache data" }, {.pme_name="QUADWORDS_WB_FROM_SCACHE", .pme_code = 0x00000007, .pme_counters = 0x1, .pme_desc = "Quadwords written back from 
secondary cache" }, {.pme_name="STORE_COND_GRADUATED", .pme_code = 0x00000400, .pme_counters = 0x2, .pme_desc = "Store conditional graduated" }, {.pme_name="FUNCTIONAL_UNIT_COMPLETION_CYCLES", .pme_code = 0x0000000e, .pme_counters = 0x1, .pme_desc = "Functional unit completion cycles" }, {.pme_name="FAILED_STORE_CONDITIONAL", .pme_code = 0x00000005, .pme_counters = 0x1, .pme_desc = "Failed store conditional" }, {.pme_name="SCACHE_WAY_MISPREDICTED_DATA", .pme_code = 0x00000b00, .pme_counters = 0x2, .pme_desc = "Secondary cache way mispredicted (data)" }, }; static pme_gen_mips64_entry_t gen_mips64_r12000_pe[] = { {.pme_name="INTERVENTION_REQUESTS", .pme_code = 0x0c0c0c0c, .pme_counters = 0xf, .pme_desc = "External intervention requests" }, {.pme_name="QUADWORDS", .pme_code = 0x16161616, .pme_counters = 0xf, .pme_desc = "Quadwords written back from primary data cache" }, {.pme_name="MISPREDICTED_BRANCHES", .pme_code = 0x18181818, .pme_counters = 0xf, .pme_desc = "Mispredicted branches" }, {.pme_name="DECODED_STORES", .pme_code = 0x03030303, .pme_counters = 0xf, .pme_desc = "Decoded stores" }, {.pme_name="TLB_MISSES", .pme_code = 0x17171717, .pme_counters = 0xf, .pme_desc = "TLB misses" }, {.pme_name="GRADUATED_FP_INSTRUCTIONS", .pme_code = 0x15151515, .pme_counters = 0xf, .pme_desc = "Graduated floating point instructions" }, {.pme_name="EXTERNAL_REQUESTS", .pme_code = 0x0d0d0d0d, .pme_counters = 0xf, .pme_desc = "External invalidate requests" }, {.pme_name="GRADUATED_STORES", .pme_code = 0x13131313, .pme_counters = 0xf, .pme_desc = "Graduated stores" }, {.pme_name="PREFETCH_MISSES_IN_DCACHE", .pme_code = 0x11111111, .pme_counters = 0xf, .pme_desc = "Primary data cache misses by prefetch instructions" }, {.pme_name="STORE_PREFETCH_EXCLUSIVE_SHARED_SC_BLOCK", .pme_code = 0x1f1f1f1f, .pme_counters = 0xf, .pme_desc = "Store/prefetch exclusive to shared block in secondary" }, {.pme_name="DECODED_LOADS", .pme_code = 0x02020202, .pme_counters = 0xf, .pme_desc = "Decoded 
loads" }, {.pme_name="GRADUATED_STORE_CONDITIONALS", .pme_code = 0x14141414, .pme_counters = 0xf, .pme_desc = "Graduated store conditionals" }, {.pme_name="INSTRUCTION_SECONDARY_CACHE_MISSES", .pme_code = 0x0a0a0a0a, .pme_counters = 0xf, .pme_desc = "Secondary cache misses (instruction)" }, {.pme_name="STATE_OF_EXTERNAL_INVALIDATION_HIT", .pme_code = 0x1d1d1d1d, .pme_counters = 0xf, .pme_desc = "State of external invalidation hits in secondary cache" }, {.pme_name="SECONDARY_CACHE_WAY_MISSPREDICTED", .pme_code = 0x0b0b0b0b, .pme_counters = 0xf, .pme_desc = "Secondary cache way mispredicted (instruction)" }, {.pme_name="DECODED_INSTRUCTIONS", .pme_code = 0x01010101, .pme_counters = 0xf, .pme_desc = "Decoded instructions" }, {.pme_name="SCACHE_MISSES", .pme_code = 0x1a1a1a1a, .pme_counters = 0xf, .pme_desc = "Secondary cache misses (data)" }, {.pme_name="ICACHE_MISSES", .pme_code = 0x09090909, .pme_counters = 0xf, .pme_desc = "Instruction cache misses" }, {.pme_name="SCACHE_WAY_MISPREDICTION", .pme_code = 0x1b1b1b1b, .pme_counters = 0xf, .pme_desc = "Misprediction from scache way prediction table (data)" }, {.pme_name="STATE_OF_SCACHE_INTERVENTION_HIT", .pme_code = 0x1c1c1c1c, .pme_counters = 0xf, .pme_desc = "State of external intervention hit in secondary cache" }, {.pme_name="GRADUATED_LOADS", .pme_code = 0x12121212, .pme_counters = 0xf, .pme_desc = "Graduated loads" }, {.pme_name="PREFETCH_INSTRUCTIONS_EXECUTED", .pme_code = 0x10101010, .pme_counters = 0xf, .pme_desc = "Executed prefetch instructions" }, {.pme_name="MISS_TABLE_OCCUPANCY", .pme_code = 0x04040404, .pme_counters = 0xf, .pme_desc = "Miss Handling Table Occupancy" }, {.pme_name="INSTRUCTIONS_GRADUATED", .pme_code = 0x0f0f0f0f, .pme_counters = 0xf, .pme_desc = "Instructions graduated" }, {.pme_name="QUADWORDS_WRITEBACK_FROM_SC", .pme_code = 0x07070707, .pme_counters = 0xf, .pme_desc = "Quadwords written back from secondary cache" }, {.pme_name="CORRECTABLE_ECC_ERRORS", .pme_code = 0x08080808, 
.pme_counters = 0xf, .pme_desc = "Correctable ECC errors on secondary cache data" }, {.pme_name="CYCLES", .pme_code = 0x00000000, .pme_counters = 0xf, .pme_desc = "Cycles" }, {.pme_name="RESOLVED_BRANCH_CONDITIONAL", .pme_code = 0x06060606, .pme_counters = 0xf, .pme_desc = "Resolved conditional branches" }, {.pme_name="STORE_PREFETCH_EXCLUSIVE_TO_CLEAN_SC_BLOCK", .pme_code = 0x1e1e1e1e, .pme_counters = 0xf, .pme_desc = "Store/prefetch exclusive to clean block in secondary cache" }, {.pme_name="FAILED_STORE_CONDITIONAL", .pme_code = 0x05050505, .pme_counters = 0xf, .pme_desc = "Failed store conditional" }, {.pme_name="DCACHE_MISSES", .pme_code = 0x19191919, .pme_counters = 0xf, .pme_desc = "Primary data cache misses" }, }; static pme_gen_mips64_entry_t gen_mips64_rm7000_pe[] = { {.pme_name="SLIP_CYCLES_PENDING_NON_BLKING_LOAD", .pme_code = 0x00001a1a, .pme_counters = 0x3, .pme_desc = "Slip cycles due to pending non-blocking loads" }, {.pme_name="STORE_INSTRUCTIONS_ISSUED", .pme_code = 0x00000505, .pme_counters = 0x3, .pme_desc = "Store instructions issued" }, {.pme_name="BRANCH_PREFETCHES", .pme_code = 0x00000707, .pme_counters = 0x3, .pme_desc = "Branch prefetches" }, {.pme_name="PCACHE_WRITEBACKS", .pme_code = 0x00001414, .pme_counters = 0x3, .pme_desc = "Primary cache writebacks" }, {.pme_name="STALL_CYCLES_PENDING_NON_BLKING_LOAD", .pme_code = 0x00001f1f, .pme_counters = 0x3, .pme_desc = "Stall cycles due to pending non-blocking loads - stall start of exception" }, {.pme_name="STALL_CYCLES", .pme_code = 0x00000909, .pme_counters = 0x3, .pme_desc = "Stall cycles" }, {.pme_name="CACHE_MISSES", .pme_code = 0x00001616, .pme_counters = 0x3, .pme_desc = "Cache misses" }, {.pme_name="DUAL_ISSUED_PAIRS", .pme_code = 0x00000606, .pme_counters = 0x3, .pme_desc = "Dual issued pairs" }, {.pme_name="SLIP_CYCLES_DUE_MULTIPLIER_BUSY", .pme_code = 0x00001818, .pme_counters = 0x3, .pme_desc = "Slip Cycles due to multiplier busy" }, {.pme_name="INTEGER_INSTRUCTIONS_ISSUED", 
.pme_code = 0x00000303, .pme_counters = 0x3, .pme_desc = "Integer instructions issued" }, {.pme_name="SCACHE_WRITEBACKS", .pme_code = 0x00001313, .pme_counters = 0x3, .pme_desc = "Secondary cache writebacks" }, {.pme_name="DCACHE_MISS_STALL_CYCLES", .pme_code = 0x00001515, .pme_counters = 0x3, .pme_desc = "Dcache miss stall cycles (cycles where both cache miss tokens taken and a third try is requested)" }, {.pme_name="MULTIPLIER_STALL_CYCLES", .pme_code = 0x00001e1e, .pme_counters = 0x3, .pme_desc = "Multiplier stall cycles" }, {.pme_name="WRITE_BUFFER_FULL_STALL_CYCLES", .pme_code = 0x00001c1c, .pme_counters = 0x3, .pme_desc = "Write buffer full stall cycles" }, {.pme_name="FP_INSTRUCTIONS_ISSUED", .pme_code = 0x00000202, .pme_counters = 0x3, .pme_desc = "Floating-point instructions issued" }, {.pme_name="JTLB_DATA_MISSES", .pme_code = 0x00001010, .pme_counters = 0x3, .pme_desc = "Joint TLB data misses" }, {.pme_name="FP_EXCEPTION_STALL_CYCLES", .pme_code = 0x00001717, .pme_counters = 0x3, .pme_desc = "FP possible exception cycles" }, {.pme_name="SCACHE_MISSES", .pme_code = 0x00000a0a, .pme_counters = 0x3, .pme_desc = "Secondary cache misses" }, {.pme_name="BRANCHES_ISSUED", .pme_code = 0x00001212, .pme_counters = 0x3, .pme_desc = "Branches issued" }, {.pme_name="ICACHE_MISSES", .pme_code = 0x00000b0b, .pme_counters = 0x3, .pme_desc = "Instruction cache misses" }, {.pme_name="INSTRUCTIONS_ISSUED", .pme_code = 0x00000101, .pme_counters = 0x3, .pme_desc = "Total instructions issued" }, {.pme_name="JTLB_INSTRUCTION_MISSES", .pme_code = 0x00000f0f, .pme_counters = 0x3, .pme_desc = "Joint TLB instruction misses" }, {.pme_name="LOAD_INSTRUCTIONS_ISSUED", .pme_code = 0x00000404, .pme_counters = 0x3, .pme_desc = "Load instructions issued" }, {.pme_name="EXTERNAL_CACHE_MISSES", .pme_code = 0x00000808, .pme_counters = 0x3, .pme_desc = "External Cache Misses" }, {.pme_name="BRANCHES_TAKEN", .pme_code = 0x00001111, .pme_counters = 0x3, .pme_desc = "Branches taken" }, 
{.pme_name="DTLB_MISSES", .pme_code = 0x00000d0d, .pme_counters = 0x3, .pme_desc = "Data TLB misses" }, {.pme_name="CACHE_INSTRUCTION_STALL_CYCLES", .pme_code = 0x00001d1d, .pme_counters = 0x3, .pme_desc = "Cache instruction stall cycles" }, {.pme_name="CYCLES", .pme_code = 0x00000000, .pme_counters = 0x3, .pme_desc = "Clock cycles" }, {.pme_name="COP0_SLIP_CYCLES", .pme_code = 0x00001919, .pme_counters = 0x3, .pme_desc = "Coprocessor 0 slip cycles" }, {.pme_name="ITLB_MISSES", .pme_code = 0x00000e0e, .pme_counters = 0x3, .pme_desc = "Instruction TLB misses" }, {.pme_name="DCACHE_MISSES", .pme_code = 0x00000c0c, .pme_counters = 0x3, .pme_desc = "Data cache misses" }, }; static pme_gen_mips64_entry_t gen_mips64_rm9000_pe[] = { {.pme_name="FP_POSSIBLE_EXCEPTION_CYCLES", .pme_code = 0x00001717, .pme_counters = 0x3, .pme_desc = "Floating-point possible exception cycles" }, {.pme_name="STORE_INSTRUCTIONS_ISSUED", .pme_code = 0x00000505, .pme_counters = 0x3, .pme_desc = "Store instructions issued" }, {.pme_name="STALL_CYCLES", .pme_code = 0x00000909, .pme_counters = 0x3, .pme_desc = "Stall cycles" }, {.pme_name="L2_WRITEBACKS", .pme_code = 0x00001313, .pme_counters = 0x3, .pme_desc = "L2 cache writebacks" }, {.pme_name="NONBLOCKING_LOAD_SLIP_CYCLES", .pme_code = 0x00001a1a, .pme_counters = 0x3, .pme_desc = "Slip cycles due to pending non-blocking loads" }, {.pme_name="NONBLOCKING_LOAD_PENDING_EXCEPTION_STALL_CYCLES", .pme_code = 0x00001e1e, .pme_counters = 0x3, .pme_desc = "Stall cycles due to pending non-blocking loads - stall start of exception" }, {.pme_name="BRANCH_MISSPREDICTS", .pme_code = 0x00000707, .pme_counters = 0x3, .pme_desc = "Branch mispredictions" }, {.pme_name="DCACHE_MISS_STALL_CYCLES", .pme_code = 0x00001515, .pme_counters = 0x3, .pme_desc = "Dcache-miss stall cycles" }, {.pme_name="WRITE_BUFFER_FULL_STALL_CYCLES", .pme_code = 0x00001b1b, .pme_counters = 0x3, .pme_desc = "Stall cycles due to a full write buffer" }, {.pme_name="INT_INSTRUCTIONS_ISSUED", 
.pme_code = 0x00000303, .pme_counters = 0x3, .pme_desc = "Integer instructions issued" }, {.pme_name="FP_INSTRUCTIONS_ISSUED", .pme_code = 0x00000202, .pme_counters = 0x3, .pme_desc = "Floating-point instructions issued" }, {.pme_name="JTLB_DATA_MISSES", .pme_code = 0x00001010, .pme_counters = 0x3, .pme_desc = "Joint TLB data misses" }, {.pme_name="L2_CACHE_MISSES", .pme_code = 0x00000a0a, .pme_counters = 0x3, .pme_desc = "L2 cache misses" }, {.pme_name="DCACHE_WRITEBACKS", .pme_code = 0x00001414, .pme_counters = 0x3, .pme_desc = "Dcache writebacks" }, {.pme_name="BRANCHES_ISSUED", .pme_code = 0x00001212, .pme_counters = 0x3, .pme_desc = "Branch instructions issued" }, {.pme_name="ICACHE_MISSES", .pme_code = 0x00000b0b, .pme_counters = 0x3, .pme_desc = "Icache misses" }, {.pme_name="INSTRUCTIONS_ISSUED", .pme_code = 0x00000101, .pme_counters = 0x3, .pme_desc = "Instructions issued" }, {.pme_name="MULTIPLIER_BUSY_SLIP_CYCLES", .pme_code = 0x00001818, .pme_counters = 0x3, .pme_desc = "Slip cycles due to busy multiplier" }, {.pme_name="INSTRUCTIONS_DUAL_ISSUED", .pme_code = 0x00000606, .pme_counters = 0x3, .pme_desc = "Dual-issued instruction pairs" }, {.pme_name="CACHE_INSN_STALL_CYCLES", .pme_code = 0x00001c1c, .pme_counters = 0x3, .pme_desc = "Stall cycles due to cache instructions" }, {.pme_name="JTLB_INSTRUCTION_MISSES", .pme_code = 0x00000f0f, .pme_counters = 0x3, .pme_desc = "Joint TLB instruction misses" }, {.pme_name="LOAD_INSTRUCTIONS_ISSUED", .pme_code = 0x00000404, .pme_counters = 0x3, .pme_desc = "Load instructions issued" }, {.pme_name="CACHE_REMISSES", .pme_code = 0x00001616, .pme_counters = 0x3, .pme_desc = "Cache remisses" }, {.pme_name="BRANCHES_TAKEN", .pme_code = 0x00001111, .pme_counters = 0x3, .pme_desc = "Branches taken" }, {.pme_name="DTLB_MISSES", .pme_code = 0x00000d0d, .pme_counters = 0x3, .pme_desc = "Data TLB misses" }, {.pme_name="CYCLES", .pme_code = 0x00000000, .pme_counters = 0x3, .pme_desc = "Processor clock cycles" }, 
{.pme_name="COP0_SLIP_CYCLES", .pme_code = 0x00001919, .pme_counters = 0x3, .pme_desc = "Co-processor 0 slip cycles" }, {.pme_name="ITLB_MISSES", .pme_code = 0x00000e0e, .pme_counters = 0x3, .pme_desc = "Instruction TLB misses" }, {.pme_name="DCACHE_MISSES", .pme_code = 0x00000c0c, .pme_counters = 0x3, .pme_desc = "Dcache misses" }, }; static pme_gen_mips64_entry_t gen_mips64_sb1_pe[] = { {.pme_name="DATA_DEPENDENCY_REPLAY", .pme_code = 0x1e1e1e1e, .pme_counters = 0xf, .pme_desc = "Data dependency replay" }, {.pme_name="DCACHE_READ_MISS", .pme_code = 0x0f0f0f00, .pme_counters = 0xe, .pme_desc = "Dcache read results in a miss" }, {.pme_name="R_RESP_OTHER_CORE_D_MOD", .pme_code = 0x19191900, .pme_counters = 0xe, .pme_desc = "Read response comes from the other core with D_MOD set" }, {.pme_name="RQ_LENGTH", .pme_code = 0x01010100, .pme_counters = 0xe, .pme_desc = "Read queue length" }, {.pme_name="READ_RQ_NOPS_SENT_TO_ABUS", .pme_code = 0x14141400, .pme_counters = 0xe, .pme_desc = "Read requests and NOPs sent to ZB Abus" }, {.pme_name="R_RESP_OTHER_CORE", .pme_code = 0x18181800, .pme_counters = 0xe, .pme_desc = "Read response comes from the other core" }, {.pme_name="SNOOP_RQ_HITS", .pme_code = 0x16161600, .pme_counters = 0xe, .pme_desc = "Snoop request hits anywhere" }, {.pme_name="LOAD_SURVIVED_STAGE4", .pme_code = 0x08080800, .pme_counters = 0xe, .pme_desc = "Load survived stage 4" }, {.pme_name="BRANCH_PREDICTED_TAKEN", .pme_code = 0x2e2e2e00, .pme_counters = 0xe, .pme_desc = "Predicted taken conditional branch" }, {.pme_name="ISSUE_L1", .pme_code = 0x29292900, .pme_counters = 0xe, .pme_desc = "Issue to L1" }, {.pme_name="ANY_REPLAY", .pme_code = 0x1f1f1f1f, .pme_counters = 0xf, .pme_desc = "Any replay except mispredict" }, {.pme_name="LD_ST_HITS_PREFETCH_IN_QUEUE", .pme_code = 0x06060600, .pme_counters = 0xe, .pme_desc = "Load/store hits prefetch in read queue" }, {.pme_name="NOT_DATA_READY", .pme_code = 0x23232300, .pme_counters = 0xe, .pme_desc = "Not data 
ready" }, {.pme_name="DCFIFO", .pme_code = 0x1c1c1c1c, .pme_counters = 0xf, .pme_desc = "DCFIFO" }, {.pme_name="ISSUE_E1", .pme_code = 0x2b2b2b00, .pme_counters = 0xe, .pme_desc = "Issue to E1" }, {.pme_name="PREFETCH_HITS_CACHE_OR_READ_Q", .pme_code = 0x05050500, .pme_counters = 0xe, .pme_desc = "Prefetch hits in cache or read queue" }, {.pme_name="BRANCH_STAGE4", .pme_code = 0x2c2c2c00, .pme_counters = 0xe, .pme_desc = "Branch survived stage 4" }, {.pme_name="SNOOP_ADDR_Q_FULL", .pme_code = 0x17171700, .pme_counters = 0xe, .pme_desc = "Snoop address queue is full" }, {.pme_name="CONSUMER_WAITING_FOR_LOAD", .pme_code = 0x22222200, .pme_counters = 0xe, .pme_desc = "load consumer waiting for dfill" }, {.pme_name="VICTIM_WRITEBACK", .pme_code = 0x0d0d0d00, .pme_counters = 0xe, .pme_desc = "A writeback occurs due to replacement" }, {.pme_name="BRANCH_MISSPREDICTS", .pme_code = 0x2f2f2f00, .pme_counters = 0xe, .pme_desc = "Branch mispredicts" }, {.pme_name="UPGRADE_SHARED_TO_EXCLUSIVE", .pme_code = 0x07070700, .pme_counters = 0xe, .pme_desc = "A line is upgraded from shared to exclusive" }, {.pme_name="READ_HITS_READ_Q", .pme_code = 0x04040400, .pme_counters = 0xe, .pme_desc = "Read hits in read queue" }, {.pme_name="INSN_STAGE4", .pme_code = 0x27272700, .pme_counters = 0xe, .pme_desc = "One or more instructions survives stage 4" }, {.pme_name="UNCACHED_RQ_LENGTH", .pme_code = 0x02020200, .pme_counters = 0xe, .pme_desc = "Number of valid uncached entries in read queue" }, {.pme_name="READ_RQ_SENT_TO_ABUS", .pme_code = 0x17171700, .pme_counters = 0xe, .pme_desc = "Read requests sent to ZB Abus" }, {.pme_name="DCACHE_FILL_SHARED_LINE", .pme_code = 0x0b0b0b00, .pme_counters = 0xe, .pme_desc = "Dcache is filled with shared line" }, {.pme_name="ISSUE_CONFLICT_DUE_IMISS", .pme_code = 0x25252500, .pme_counters = 0xe, .pme_desc = "issue conflict due to imiss using LS0" }, {.pme_name="NO_VALID_INSN", .pme_code = 0x21212100, .pme_counters = 0xe, .pme_desc = "No valid instr to 
issue" }, {.pme_name="ISSUE_E0", .pme_code = 0x2a2a2a00, .pme_counters = 0xe, .pme_desc = "Issue to E0" }, {.pme_name="INSN_SURVIVED_STAGE7", .pme_code = 0x00000000, .pme_counters = 0xe, .pme_desc = "Instruction survived stage 7" }, {.pme_name="BRANCH_REALLY_TAKEN", .pme_code = 0x2d2d2d00, .pme_counters = 0xe, .pme_desc = "Conditional branch was really taken" }, {.pme_name="STORE_COND_FAILED", .pme_code = 0x1a1a1a00, .pme_counters = 0xe, .pme_desc = "Failed store conditional" }, {.pme_name="MAX_ISSUE", .pme_code = 0x20202000, .pme_counters = 0xe, .pme_desc = "Max issue" }, {.pme_name="BIU_STALLS_ON_ZB_ADDR_BUS", .pme_code = 0x11111100, .pme_counters = 0xe, .pme_desc = "BIU stalls on ZB addr bus" }, {.pme_name="STORE_SURVIVED_STAGE4", .pme_code = 0x09090900, .pme_counters = 0xe, .pme_desc = "Store survived stage 4" }, {.pme_name="RESOURCE_CONSTRAINT", .pme_code = 0x24242400, .pme_counters = 0xe, .pme_desc = "Resource (L0/1 E0/1) constraint" }, {.pme_name="DCACHE_FILL_REPLAY", .pme_code = 0x1b1b1b1b, .pme_counters = 0xf, .pme_desc = "Dcache fill replay" }, {.pme_name="BIU_STALLS_ON_ZB_DATA_BUS", .pme_code = 0x12121200, .pme_counters = 0xe, .pme_desc = "BIU stalls on ZB data bus" }, {.pme_name="ISSUE_CONFLICT_DUE_DFILL", .pme_code = 0x26262600, .pme_counters = 0xe, .pme_desc = "issue conflict due to dfill using LS0/1" }, {.pme_name="WRITEBACK_RETURNS", .pme_code = 0x0f0f0f00, .pme_counters = 0xe, .pme_desc = "Number of instruction returns" }, {.pme_name="DCACHE_FILLED_SHD_NONC_EXC", .pme_code = 0x0a0a0a00, .pme_counters = 0xe, .pme_desc = "Dcache is filled (shared, nonc, exclusive)" }, {.pme_name="ISSUE_L0", .pme_code = 0x28282800, .pme_counters = 0xe, .pme_desc = "Issue to L0" }, {.pme_name="CYCLES", .pme_code = 0x10101010, .pme_counters = 0xf, .pme_desc = "Elapsed cycles" }, {.pme_name="MBOX_RQ_WHEN_BIU_BUSY", .pme_code = 0x0e0e0e00, .pme_counters = 0xe, .pme_desc = "MBOX requests to BIU when BIU busy" }, {.pme_name="MBOX_REPLAY", .pme_code = 0x1d1d1d1d, 
.pme_counters = 0xf, .pme_desc = "MBOX replay" }, }; static pme_gen_mips64_entry_t gen_mips64_vr5432_pe[] = { {.pme_name="INSTRUCTIONS_EXECUTED", .pme_code = 0x00000101, .pme_counters = 0x3, .pme_desc = "(Instructions executed)/2 and truncated" }, {.pme_name="JTLB_REFILLS", .pme_code = 0x00000707, .pme_counters = 0x3, .pme_desc = "JTLB refills" }, {.pme_name="BRANCHES", .pme_code = 0x00000404, .pme_counters = 0x3, .pme_desc = "Branch execution (no jumps or jump registers)" }, {.pme_name="FP_INSTRUCTIONS", .pme_code = 0x00000505, .pme_counters = 0x3, .pme_desc = "(FP instruction execution) / 2 and truncated excluding cp1 loads and stores" }, {.pme_name="BRANCHES_MISPREDICTED", .pme_code = 0x00000a0a, .pme_counters = 0x3, .pme_desc = "Branches mispredicted" }, {.pme_name="DOUBLEWORDS_FLUSHED", .pme_code = 0x00000606, .pme_counters = 0x3, .pme_desc = "Doublewords flushed to main memory (no uncached stores)" }, {.pme_name="ICACHE_MISSES", .pme_code = 0x00000909, .pme_counters = 0x3, .pme_desc = "Instruction cache misses (no D-cache misses)" }, {.pme_name="LOAD_PREF_CACHE_INSTRUCTIONS", .pme_code = 0x00000202, .pme_counters = 0x3, .pme_desc = "Load, prefetch/CacheOps execution (no sync)" }, {.pme_name="CYCLES", .pme_code = 0x00000000, .pme_counters = 0x3, .pme_desc = "Processor cycles (PClock)" }, {.pme_name="DCACHE_MISSES", .pme_code = 0x00000808, .pme_counters = 0x3, .pme_desc = "Data cache misses (no I-cache misses)" }, {.pme_name="STORES", .pme_code = 0x00000303, .pme_counters = 0x3, .pme_desc = "Store execution" }, }; static pme_gen_mips64_entry_t gen_mips64_vr5500_pe[] = { {.pme_name="INSTRUCTIONS_EXECUTED", .pme_code = 0x00000101, .pme_counters = 0x3, .pme_desc = "Instructions executed" }, {.pme_name="JTLB_REFILLS", .pme_code = 0x00000707, .pme_counters = 0x3, .pme_desc = "TLB refill" }, {.pme_name="BRANCHES", .pme_code = 0x00000404, .pme_counters = 0x3, .pme_desc = "Execution of branch instruction" }, {.pme_name="FP_INSTRUCTIONS", .pme_code = 0x00000505, 
.pme_counters = 0x3, .pme_desc = "Execution of floating-point instruction" }, {.pme_name="BRANCHES_MISPREDICTED", .pme_code = 0x00000a0a, .pme_counters = 0x3, .pme_desc = "Branch prediction miss" }, {.pme_name="DOUBLEWORDS_FLUSHED", .pme_code = 0x00000606, .pme_counters = 0x3, .pme_desc = "Doubleword flush to main memory" }, {.pme_name="ICACHE_MISSES", .pme_code = 0x00000909, .pme_counters = 0x3, .pme_desc = "Instruction cache miss" }, {.pme_name="LOAD_PREF_CACHE_INSTRUCTIONS", .pme_code = 0x00000202, .pme_counters = 0x3, .pme_desc = "Execution of load/prefetch/cache instruction" }, {.pme_name="CYCLES", .pme_code = 0x00000000, .pme_counters = 0x3, .pme_desc = "Processor clock cycles" }, {.pme_name="DCACHE_MISSES", .pme_code = 0x00000808, .pme_counters = 0x3, .pme_desc = "Data cache miss" }, {.pme_name="STORES", .pme_code = 0x00000303, .pme_counters = 0x3, .pme_desc = "Execution of store instruction" }, }; papi-papi-7-2-0-t/src/libperfnec/lib/i386_p6_events.h000066400000000000000000001006411502707512200221430ustar00rootroot00000000000000/* * Copyright (c) 2005-2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ #define I386_P6_MESI_UMASKS \ .pme_flags = PFMLIB_I386_P6_UMASK_COMBO, \ .pme_numasks = 4, \ .pme_umasks = { \ { .pme_uname = "I", \ .pme_udesc = "invalid state", \ .pme_ucode = 0x1 \ }, \ { .pme_uname = "S", \ .pme_udesc = "shared state", \ .pme_ucode = 0x2 \ }, \ { .pme_uname = "E", \ .pme_udesc = "exclusive state", \ .pme_ucode = 0x4 \ }, \ { .pme_uname = "M", \ .pme_udesc = "modified state", \ .pme_ucode = 0x8 \ }} #define I386_PM_MESI_PREFETCH_UMASKS \ .pme_flags = PFMLIB_I386_P6_UMASK_COMBO, \ .pme_numasks = 7, \ .pme_umasks = { \ { .pme_uname = "I", \ .pme_udesc = "invalid state", \ .pme_ucode = 0x1 \ }, \ { .pme_uname = "S", \ .pme_udesc = "shared state", \ .pme_ucode = 0x2 \ }, \ { .pme_uname = "E", \ .pme_udesc = "exclusive state", \ .pme_ucode = 0x4 \ }, \ { .pme_uname = "M", \ .pme_udesc = "modified state", \ .pme_ucode = 0x8 \ }, \ { .pme_uname = "EXCL_HW_PREFETCH", \ .pme_udesc = "exclude hardware prefetched lines", \ .pme_ucode = 0x0 \ }, \ { .pme_uname = "ONLY_HW_PREFETCH", \ .pme_udesc = "only hardware prefetched lines", \ .pme_ucode = 0x1 << 4 \ }, \ { .pme_uname = "NON_HW_PREFETCH", \ .pme_udesc = "non hardware prefetched lines", \ .pme_ucode = 0x2 << 4 \ }} #define I386_P6_PII_ONLY_PME \ {.pme_name = "MMX_INSTR_EXEC",\ .pme_code = 0xb0,\ .pme_desc = "Number of MMX instructions executed"\ },\ {.pme_name = "MMX_INSTR_RET",\ .pme_code = 0xce,\ .pme_desc = "Number of MMX instructions retired"\ }\ #define I386_P6_PII_PIII_PME \ {.pme_name = "MMX_SAT_INSTR_EXEC",\ .pme_code = 0xb1,\ .pme_desc = "Number of MMX saturating instructions executed"\ },\ {.pme_name = "MMX_UOPS_EXEC",\ .pme_code 
= 0xb2,\ .pme_desc = "Number of MMX micro-ops executed"\ },\ {.pme_name = "MMX_INSTR_TYPE_EXEC",\ .pme_code = 0xb3,\ .pme_desc = "Number of MMX instructions executed by type",\ .pme_flags = PFMLIB_I386_P6_UMASK_COMBO, \ .pme_numasks = 6, \ .pme_umasks = { \ { .pme_uname = "MUL", \ .pme_udesc = "MMX packed multiply instructions executed", \ .pme_ucode = 0x1 \ }, \ { .pme_uname = "SHIFT", \ .pme_udesc = "MMX packed shift instructions executed", \ .pme_ucode = 0x2 \ }, \ { .pme_uname = "PACK", \ .pme_udesc = "MMX pack operation instructions executed", \ .pme_ucode = 0x4 \ }, \ { .pme_uname = "UNPACK", \ .pme_udesc = "MMX unpack operation instructions executed", \ .pme_ucode = 0x8 \ }, \ { .pme_uname = "LOGICAL", \ .pme_udesc = "MMX packed logical instructions executed", \ .pme_ucode = 0x10 \ }, \ { .pme_uname = "ARITH", \ .pme_udesc = "MMX packed arithmetic instructions executed", \ .pme_ucode = 0x20 \ } \ }\ },\ {.pme_name = "FP_MMX_TRANS",\ .pme_code = 0xcc,\ .pme_desc = "Number of MMX transitions",\ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "TO_FP", \ .pme_udesc = "from MMX instructions to floating-point instructions", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "TO_MMX", \ .pme_udesc = "from floating-point instructions to MMX instructions", \ .pme_ucode = 0x01 \ }\ }\ },\ {.pme_name = "MMX_ASSIST",\ .pme_code = 0xcd,\ .pme_desc = "Number of MMX micro-ops executed"\ },\ {.pme_name = "SEG_RENAME_STALLS",\ .pme_code = 0xd4,\ .pme_desc = "Number of Segment Register Renaming Stalls", \ .pme_flags = PFMLIB_I386_P6_UMASK_COMBO, \ .pme_numasks = 4, \ .pme_umasks = { \ { .pme_uname = "ES", \ .pme_udesc = "Segment register ES", \ .pme_ucode = 0x1 \ }, \ { .pme_uname = "DS", \ .pme_udesc = "Segment register DS", \ .pme_ucode = 0x2 \ }, \ { .pme_uname = "FS", \ .pme_udesc = "Segment register FS", \ .pme_ucode = 0x4 \ }, \ { .pme_uname = "GS", \ .pme_udesc = "Segment register GS", \ .pme_ucode = 0x8 \ } \ }\ },\ {.pme_name = "SEG_REG_RENAMES",\ .pme_code = 0xd5,\ 
.pme_desc = "Number of Segment Register Renames", \ .pme_flags = PFMLIB_I386_P6_UMASK_COMBO, \ .pme_numasks = 4, \ .pme_umasks = { \ { .pme_uname = "ES", \ .pme_udesc = "Segment register ES", \ .pme_ucode = 0x1 \ }, \ { .pme_uname = "DS", \ .pme_udesc = "Segment register DS", \ .pme_ucode = 0x2 \ }, \ { .pme_uname = "FS", \ .pme_udesc = "Segment register FS", \ .pme_ucode = 0x4 \ }, \ { .pme_uname = "GS", \ .pme_udesc = "Segment register GS", \ .pme_ucode = 0x8 \ } \ }\ },\ {.pme_name = "RET_SEG_RENAMES",\ .pme_code = 0xd6,\ .pme_desc = "Number of segment register rename events retired"\ } \ #define I386_P6_PIII_PME \ {.pme_name = "EMON_KNI_PREF_DISPATCHED",\ .pme_code = 0x07,\ .pme_desc = "Number of Streaming SIMD extensions prefetch/weakly-ordered instructions dispatched " \ "(speculative prefetches are included in counting). Pentium III and later",\ .pme_numasks = 4, \ .pme_umasks = { \ { .pme_uname = "NTA", \ .pme_udesc = "prefetch NTA", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "T1", \ .pme_udesc = "prefetch T1", \ .pme_ucode = 0x01 \ }, \ { .pme_uname = "T2", \ .pme_udesc = "prefetch T2", \ .pme_ucode = 0x02 \ }, \ { .pme_uname = "WEAK", \ .pme_udesc = "weakly ordered stores", \ .pme_ucode = 0x03 \ } \ } \ },\ {.pme_name = "EMON_KNI_PREF_MISS",\ .pme_code = 0x4b,\ .pme_desc = "Number of prefetch/weakly-ordered instructions that miss all caches. 
Pentium III and later",\ .pme_numasks = 4, \ .pme_umasks = { \ { .pme_uname = "NTA", \ .pme_udesc = "prefetch NTA", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "T1", \ .pme_udesc = "prefetch T1", \ .pme_ucode = 0x01 \ }, \ { .pme_uname = "T2", \ .pme_udesc = "prefetch T2", \ .pme_ucode = 0x02 \ }, \ { .pme_uname = "WEAK", \ .pme_udesc = "weakly ordered stores", \ .pme_ucode = 0x03 \ } \ } \ } \ #define I386_P6_CPU_CLK_UNHALTED \ {.pme_name = "CPU_CLK_UNHALTED",\ .pme_code = 0x79,\ .pme_desc = "Number of cycles during which the processor is not halted"\ }\ #define I386_P6_NOT_PM_PME \ {.pme_name = "L2_LD",\ .pme_code = 0x29,\ .pme_desc = "Number of L2 data loads. This event indicates that a normal, unlocked, load memory access "\ "was received by the L2. It includes only L2 cacheable memory accesses; it does not include I/O "\ "accesses, other non-memory accesses, or memory accesses such as UC/WT memory accesses. It does include "\ "L2 cacheable TLB miss memory accesses",\ I386_P6_MESI_UMASKS\ },\ {.pme_name = "L2_LINES_IN",\ .pme_code = 0x24,\ .pme_desc = "Number of lines allocated in the L2"\ },\ {.pme_name = "L2_LINES_OUT",\ .pme_code = 0x26,\ .pme_desc = "Number of lines removed from the L2 for any reason"\ },\ {.pme_name = "L2_M_LINES_OUTM",\ .pme_code = 0x27,\ .pme_desc = "Number of modified lines removed from the L2 for any reason"\ }\ #define I386_P6_PIII_NOT_PM_PME \ {.pme_name = "EMON_KNI_INST_RETIRED",\ .pme_code = 0xd8,\ .pme_desc = "Number of SSE instructions retired. Pentium III and later",\ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "PACKED_SCALAR", \ .pme_udesc = "packed and scalar instructions", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "SCALAR", \ .pme_udesc = "scalar only", \ .pme_ucode = 0x01 \ } \ } \ },\ {.pme_name = "EMON_KNI_COMP_INST_RET",\ .pme_code = 0xd9,\ .pme_desc = "Number of SSE computation instructions retired. 
Pentium III and later",\ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "PACKED_SCALAR", \ .pme_udesc = "packed and scalar instructions", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "SCALAR", \ .pme_udesc = "scalar only", \ .pme_ucode = 0x01 \ } \ } \ }\ #define I386_P6_COMMON_PME \ {.pme_name = "INST_RETIRED",\ .pme_code = 0xc0,\ .pme_desc = "Number of instructions retired"\ },\ {.pme_name = "DATA_MEM_REFS",\ .pme_code = 0x43,\ .pme_desc = "All loads from any memory type. All stores to any memory type. "\ "Each part of a split is counted separately. The internal logic counts not only memory loads and stores"\ " but also internal retries. 80-bit floating point accesses are double counted, since they are decomposed"\ " into a 16-bit exponent load and a 64-bit mantissa load. Memory accesses are only counted when they are "\ " actually performed (such as a load that gets squashed because a previous cache miss is outstanding to the"\ " same address, and which finally gets performed, is only counted once). Does not include I/O accesses or other"\ " non-memory accesses"\ },\ {.pme_name = "DCU_LINES_IN",\ .pme_code = 0x45,\ .pme_desc = "Total lines allocated in the DCU"\ },\ {.pme_name = "DCU_M_LINES_IN",\ .pme_code = 0x46,\ .pme_desc = "Number of M state lines allocated in the DCU"\ },\ {.pme_name = "DCU_M_LINES_OUT",\ .pme_code = 0x47,\ .pme_desc = "Number of M state lines evicted from the DCU. This includes evictions via snoop HITM, intervention"\ " or replacement"\ },\ {.pme_name = "DCU_MISS_OUTSTANDING",\ .pme_code = 0x48,\ .pme_desc = "Weighted number of cycles while a DCU miss is outstanding, incremented by the number of cache misses"\ " at any particular time. Cacheable read requests only are considered. 
Uncacheable requests are excluded."\ " Read-for-ownerships are counted, as well as line fills, invalidates, and stores"\ },\ {.pme_name = "IFU_IFETCH",\ .pme_code = 0x80,\ .pme_desc = "Number of instruction fetches, both cacheable and noncacheable including UC fetches"\ },\ {.pme_name = "IFU_IFETCH_MISS",\ .pme_code = 0x81,\ .pme_desc = "Number of instruction fetch misses. All instruction fetches that do not hit the IFU (i.e., that"\ " produce memory requests). Includes UC accesses"\ },\ {.pme_name = "ITLB_MISS",\ .pme_code = 0x85,\ .pme_desc = "Number of ITLB misses"\ },\ {.pme_name = "IFU_MEM_STALL",\ .pme_code = 0x86,\ .pme_desc = "Number of cycles instruction fetch is stalled for any reason. Includes IFU cache misses, ITLB misses,"\ " ITLB faults, and other minor stalls"\ },\ {.pme_name = "ILD_STALL",\ .pme_code = 0x87,\ .pme_desc = "Number of cycles that the instruction length decoder is stalled"\ },\ {.pme_name = "L2_IFETCH",\ .pme_code = 0x28,\ .pme_desc = "Number of L2 instruction fetches. This event indicates that a normal instruction fetch was received by"\ " the L2. The count includes only L2 cacheable instruction fetches: it does not include UC instruction fetches."\ " It does not include ITLB miss accesses",\ I386_P6_MESI_UMASKS \ }, \ {.pme_name = "L2_ST",\ .pme_code = 0x2a,\ .pme_desc = "Number of L2 data stores. This event indicates that a normal, unlocked, store memory access "\ "was received by the L2. Specifically, it indicates that the DCU sent a read-for-ownership request to " \ "the L2. It also includes Invalid to Modified requests sent by the DCU to the L2. " \ "It includes only L2 cacheable memory accesses; it does not include I/O " \ "accesses, other non-memory accesses, or memory accesses such as UC/WT memory accesses. 
It does include " \ "L2 cacheable TLB miss memory accesses", \ I386_P6_MESI_UMASKS \ },\ {.pme_name = "L2_M_LINES_INM",\ .pme_code = 0x25,\ .pme_desc = "Number of modified lines allocated in the L2"\ },\ {.pme_name = "L2_RQSTS",\ .pme_code = 0x2e,\ .pme_desc = "Total number of L2 requests",\ I386_P6_MESI_UMASKS \ },\ {.pme_name = "L2_ADS",\ .pme_code = 0x21,\ .pme_desc = "Number of L2 address strobes"\ },\ {.pme_name = "L2_DBUS_BUSY",\ .pme_code = 0x22,\ .pme_desc = "Number of cycles during which the L2 cache data bus was busy"\ },\ {.pme_name = "L2_DBUS_BUSY_RD",\ .pme_code = 0x23,\ .pme_desc = "Number of cycles during which the data bus was busy transferring read data from L2 to the processor"\ },\ {.pme_name = "BUS_DRDY_CLOCKS",\ .pme_code = 0x62,\ .pme_desc = "Number of clocks during which DRDY# is asserted. " \ "Utilization of the external system data bus during data transfers", \ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "SELF", \ .pme_udesc = "clocks when processor is driving bus", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "ANY", \ .pme_udesc = "clocks when any agent is driving bus", \ .pme_ucode = 0x20 \ } \ } \ },\ {.pme_name = "BUS_LOCK_CLOCKS",\ .pme_code = 0x63,\ .pme_desc = "Number of clocks during which LOCK# is asserted on the external system bus", \ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "SELF", \ .pme_udesc = "clocks when processor is driving bus", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "ANY", \ .pme_udesc = "clocks when any agent is driving bus", \ .pme_ucode = 0x20 \ } \ } \ },\ {.pme_name = "BUS_REQ_OUTSTANDING",\ .pme_code = 0x60,\ .pme_desc = "Number of bus requests outstanding. 
This counter is incremented " \ "by the number of cacheable read bus requests outstanding in any given cycle", \ },\ {.pme_name = "BUS_TRANS_BRD",\ .pme_code = 0x65,\ .pme_desc = "Number of burst read transactions", \ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "SELF", \ .pme_udesc = "clocks when processor is driving bus", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "ANY", \ .pme_udesc = "clocks when any agent is driving bus", \ .pme_ucode = 0x20 \ } \ } \ },\ {.pme_name = "BUS_TRANS_RFO",\ .pme_code = 0x66,\ .pme_desc = "Number of completed read for ownership transactions",\ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "SELF", \ .pme_udesc = "clocks when processor is driving bus", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "ANY", \ .pme_udesc = "clocks when any agent is driving bus", \ .pme_ucode = 0x20 \ } \ } \ },\ {.pme_name = "BUS_TRANS_WB",\ .pme_code = 0x67,\ .pme_desc = "Number of completed write back transactions",\ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "SELF", \ .pme_udesc = "clocks when processor is driving bus", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "ANY", \ .pme_udesc = "clocks when any agent is driving bus", \ .pme_ucode = 0x20 \ } \ } \ },\ {.pme_name = "BUS_TRAN_IFETCH",\ .pme_code = 0x68,\ .pme_desc = "Number of completed instruction fetch transactions",\ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "SELF", \ .pme_udesc = "clocks when processor is driving bus", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "ANY", \ .pme_udesc = "clocks when any agent is driving bus", \ .pme_ucode = 0x20 \ } \ } \ },\ {.pme_name = "BUS_TRAN_INVAL",\ .pme_code = 0x69,\ .pme_desc = "Number of completed invalidate transactions",\ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "SELF", \ .pme_udesc = "clocks when processor is driving bus", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "ANY", \ .pme_udesc = "clocks when any agent is driving bus", \ .pme_ucode = 0x20 \ } \ } \ },\ {.pme_name = "BUS_TRAN_PWR",\ .pme_code = 0x6a,\ 
.pme_desc = "Number of completed partial write transactions",\ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "SELF", \ .pme_udesc = "clocks when processor is driving bus", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "ANY", \ .pme_udesc = "clocks when any agent is driving bus", \ .pme_ucode = 0x20 \ } \ } \ },\ {.pme_name = "BUS_TRANS_P",\ .pme_code = 0x6b,\ .pme_desc = "Number of completed partial transactions",\ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "SELF", \ .pme_udesc = "clocks when processor is driving bus", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "ANY", \ .pme_udesc = "clocks when any agent is driving bus", \ .pme_ucode = 0x20 \ } \ } \ },\ {.pme_name = "BUS_TRANS_IO",\ .pme_code = 0x6c,\ .pme_desc = "Number of completed I/O transactions",\ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "SELF", \ .pme_udesc = "clocks when processor is driving bus", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "ANY", \ .pme_udesc = "clocks when any agent is driving bus", \ .pme_ucode = 0x20 \ } \ } \ },\ {.pme_name = "BUS_TRAN_DEF",\ .pme_code = 0x6d,\ .pme_desc = "Number of completed deferred transactions",\ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "SELF", \ .pme_udesc = "clocks when processor is driving bus", \ .pme_ucode = 0x1 \ }, \ { .pme_uname = "ANY", \ .pme_udesc = "clocks when any agent is driving bus", \ .pme_ucode = 0x2 \ } \ } \ },\ {.pme_name = "BUS_TRAN_BURST",\ .pme_code = 0x6e,\ .pme_desc = "Number of completed burst transactions",\ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "SELF", \ .pme_udesc = "clocks when processor is driving bus", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "ANY", \ .pme_udesc = "clocks when any agent is driving bus", \ .pme_ucode = 0x20 \ } \ } \ },\ {.pme_name = "BUS_TRAN_ANY",\ .pme_code = 0x70,\ .pme_desc = "Number of all completed bus transactions. Address bus utilization " \ "can be calculated knowing the minimum address bus occupancy. 
Includes special cycles, etc.",\ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "SELF", \ .pme_udesc = "clocks when processor is driving bus", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "ANY", \ .pme_udesc = "clocks when any agent is driving bus", \ .pme_ucode = 0x20 \ } \ } \ },\ {.pme_name = "BUS_TRAN_MEM",\ .pme_code = 0x6f,\ .pme_desc = "Number of completed memory transactions",\ .pme_numasks = 2, \ .pme_umasks = { \ { .pme_uname = "SELF", \ .pme_udesc = "clocks when processor is driving bus", \ .pme_ucode = 0x00 \ }, \ { .pme_uname = "ANY", \ .pme_udesc = "clocks when any agent is driving bus", \ .pme_ucode = 0x20 \ } \ } \ },\ {.pme_name = "BUS_DATA_RECV",\ .pme_code = 0x64,\ .pme_desc = "Number of bus clock cycles during which this processor is receiving data"\ },\ {.pme_name = "BUS_BNR_DRV",\ .pme_code = 0x61,\ .pme_desc = "Number of bus clock cycles during which this processor is driving the BNR# pin"\ },\ {.pme_name = "BUS_HIT_DRV",\ .pme_code = 0x7a,\ .pme_desc = "Number of bus clock cycles during which this processor is driving the HIT# pin"\ },\ {.pme_name = "BUS_HITM_DRV",\ .pme_code = 0x7b,\ .pme_desc = "Number of bus clock cycles during which this processor is driving the HITM# pin"\ },\ {.pme_name = "BUS_SNOOP_STALL",\ .pme_code = 0x7e,\ .pme_desc = "Number of clock cycles during which the bus is snoop stalled"\ },\ {.pme_name = "FLOPS",\ .pme_code = 0xc1,\ .pme_desc = "Number of computational floating-point operations retired. " \ "Excludes floating-point computational operations that cause traps or assists. " \ "Includes internal sub-operations for complex floating-point instructions like transcendentals. " \ "Excludes floating point loads and stores", \ .pme_flags = PFMLIB_I386_P6_CTR0_ONLY \ },\ {.pme_name = "FP_COMP_OPS_EXE",\ .pme_code = 0x10,\ .pme_desc = "Number of computational floating-point operations executed. The number of FADD, FSUB, " \ "FCOM, FMULs, integer MULs and IMULs, FDIVs, FPREMs, FSQRTS, integer DIVs, and IDIVs. 
" \ "This number does not include the number of cycles, but the number of operations. " \ "This event does not distinguish an FADD used in the middle of a transcendental flow " \ "from a separate FADD instruction", \ .pme_flags = PFMLIB_I386_P6_CTR0_ONLY \ },\ {.pme_name = "FP_ASSIST",\ .pme_code = 0x11,\ .pme_desc = "Number of floating-point exception cases handled by microcode.", \ .pme_flags = PFMLIB_I386_P6_CTR1_ONLY \ },\ {.pme_name = "MUL",\ .pme_code = 0x12,\ .pme_desc = "Number of multiplies." \ "This count includes integer as well as FP multiplies and is speculative", \ .pme_flags = PFMLIB_I386_P6_CTR1_ONLY \ },\ {.pme_name = "DIV",\ .pme_code = 0x13,\ .pme_desc = "Number of divides." \ "This count includes integer as well as FP divides and is speculative", \ .pme_flags = PFMLIB_I386_P6_CTR1_ONLY \ },\ {.pme_name = "CYCLES_DIV_BUSY",\ .pme_code = 0x14,\ .pme_desc = "Number of cycles during which the divider is busy, and cannot accept new divides. " \ "This includes integer and FP divides, FPREM, FPSQRT, etc. and is speculative", \ .pme_flags = PFMLIB_I386_P6_CTR0_ONLY \ },\ {.pme_name = "LD_BLOCKS",\ .pme_code = 0x03,\ .pme_desc = "Number of load operations delayed due to store buffer blocks. Includes counts " \ "caused by preceding stores whose addresses are unknown, preceding stores whose addresses " \ "are known but whose data is unknown, and preceding stores that conflicts with the load " \ "but which incompletely overlap the load" \ },\ {.pme_name = "SB_DRAINS",\ .pme_code = 0x04,\ .pme_desc = "Number of store buffer drain cycles. Incremented every cycle the store buffer is draining. " \ "Draining is caused by serializing operations like CPUID, synchronizing operations " \ "like XCHG, interrupt acknowledgment, as well as other conditions (such as cache flushing)."\ },\ {.pme_name = "MISALIGN_MEM_REF",\ .pme_code = 0x05,\ .pme_desc = "Number of misaligned data memory references. 
Incremented by 1 every cycle during "\ "which either the processor's load or store pipeline dispatches a misaligned micro-op. "\ "Counting is performed if it is the first or second half or if it is blocked, squashed, "\ "or missed. In this context, misaligned means crossing a 64-bit boundary"\ },\ {.pme_name = "UOPS_RETIRED",\ .pme_code = 0xc2,\ .pme_desc = "Number of micro-ops retired"\ },\ {.pme_name = "INST_DECODED",\ .pme_code = 0xd0,\ .pme_desc = "Number of instructions decoded"\ },\ {.pme_name = "HW_INT_RX",\ .pme_code = 0xc8,\ .pme_desc = "Number of hardware interrupts received"\ },\ {.pme_name = "CYCLES_INT_MASKED",\ .pme_code = 0xc6,\ .pme_desc = "Number of processor cycles for which interrupts are disabled"\ },\ {.pme_name = "CYCLES_INT_PENDING_AND_MASKED",\ .pme_code = 0xc7,\ .pme_desc = "Number of processor cycles for which interrupts are disabled and interrupts are pending."\ },\ {.pme_name = "BR_INST_RETIRED",\ .pme_code = 0xc4,\ .pme_desc = "Number of branch instructions retired"\ },\ {.pme_name = "BR_MISS_PRED_RETIRED",\ .pme_code = 0xc5,\ .pme_desc = "Number of mispredicted branches retired"\ },\ {.pme_name = "BR_TAKEN_RETIRED",\ .pme_code = 0xc9,\ .pme_desc = "Number of taken branches retired"\ },\ {.pme_name = "BR_MISS_PRED_TAKEN_RET",\ .pme_code = 0xca,\ .pme_desc = "Number of taken mispredicted branches retired"\ },\ {.pme_name = "BR_INST_DECODED",\ .pme_code = 0xe0,\ .pme_desc = "Number of branch instructions decoded"\ },\ {.pme_name = "BTB_MISSES",\ .pme_code = 0xe2,\ .pme_desc = "Number of branches for which the BTB did not produce a prediction"\ },\ {.pme_name = "BR_BOGUS",\ .pme_code = 0xe4,\ .pme_desc = "Number of bogus branches"\ },\ {.pme_name = "BACLEARS",\ .pme_code = 0xe6,\ .pme_desc = "Number of times BACLEAR is asserted. 
This is the number of times that " \ "a static branch prediction was made, in which the branch decoder decided " \ "to make a branch prediction because the BTB did not" \ },\ {.pme_name = "RESOURCE_STALLS",\ .pme_code = 0xa2,\ .pme_desc = "Incremented by 1 during every cycle for which there is a resource related stall. " \ "Includes register renaming buffer entries, memory buffer entries. Does not include " \ "stalls due to bus queue full, too many cache misses, etc. In addition to resource " \ "related stalls, this event counts some other events. Includes stalls arising during " \ "branch misprediction recovery, such as if retirement of the mispredicted branch is " \ "delayed and stalls arising while store buffer is draining from synchronizing operations" \ },\ {.pme_name = "PARTIAL_RAT_STALLS",\ .pme_code = 0xd2,\ .pme_desc = "Number of cycles or events for partial stalls. This includes flag partial stalls"\ },\ {.pme_name = "SEGMENT_REG_LOADS",\ .pme_code = 0x06,\ .pme_desc = "Number of segment register loads."\ }\ /* * Pentium Pro Processor Event Table */ static pme_i386_p6_entry_t i386_ppro_pe []={ I386_P6_CPU_CLK_UNHALTED, /* should be first */ I386_P6_COMMON_PME, /* generic p6 */ I386_P6_NOT_PM_PME, /* generic p6 that conflict with Pentium M */ }; #define PME_I386_PPRO_CPU_CLK_UNHALTED 0 #define PME_I386_PPRO_INST_RETIRED 1 #define PME_I386_PPRO_EVENT_COUNT (sizeof(i386_ppro_pe)/sizeof(pme_i386_p6_entry_t)) /* * Pentium II Processor Event Table */ static pme_i386_p6_entry_t i386_pII_pe []={ I386_P6_CPU_CLK_UNHALTED, /* should be first */ I386_P6_COMMON_PME, /* generic p6 */ I386_P6_PII_ONLY_PME, /* pentium II only */ I386_P6_PII_PIII_PME, /* pentium II and later */ I386_P6_NOT_PM_PME, /* generic p6 that conflict with Pentium M */ }; #define PME_I386_PII_CPU_CLK_UNHALTED 0 #define PME_I386_PII_INST_RETIRED 1 #define PME_I386_PII_EVENT_COUNT (sizeof(i386_pII_pe)/sizeof(pme_i386_p6_entry_t)) /* * Pentium III Processor Event Table */ static pme_i386_p6_entry_t 
i386_pIII_pe []={ I386_P6_CPU_CLK_UNHALTED, /* should be first */ I386_P6_COMMON_PME, /* generic p6 */ I386_P6_PII_PIII_PME, /* pentium II and later */ I386_P6_PIII_PME, /* pentium III and later */ I386_P6_NOT_PM_PME, /* generic p6 that conflict with Pentium M */ I386_P6_PIII_NOT_PM_PME /* pentium III that conflict with Pentium M */ }; #define PME_I386_PIII_CPU_CLK_UNHALTED 0 #define PME_I386_PIII_INST_RETIRED 1 #define PME_I386_PIII_EVENT_COUNT (sizeof(i386_pIII_pe)/sizeof(pme_i386_p6_entry_t)) /* * Pentium M event table * It is different from regular P6 because it supports additional events * and also because the semantics of some events are slightly different * * The library autodetects which table to use during pfmlib_initialize() */ static pme_i386_p6_entry_t i386_pm_pe []={ {.pme_name = "CPU_CLK_UNHALTED", .pme_code = 0x79, .pme_desc = "Number of cycles during which the processor is not halted and not in a thermal trip" }, I386_P6_COMMON_PME, /* generic p6 */ I386_P6_PII_PIII_PME, /* pentium II and later */ I386_P6_PIII_PME, /* pentium III and later */ {.pme_name = "EMON_EST_TRANS", .pme_code = 0x58, .pme_desc = "Number of Enhanced Intel SpeedStep technology transitions", .pme_numasks = 2, .pme_umasks = { { .pme_uname = "ALL", .pme_udesc = "All transitions", .pme_ucode = 0x0 }, { .pme_uname = "FREQ", .pme_udesc = "Only frequency transitions", .pme_ucode = 0x2 }, } }, {.pme_name = "EMON_THERMAL_TRIP", .pme_code = 0x59, .pme_desc = "Duration/occurrences in thermal trip; to count the number of thermal trips, edge detect must be used" }, {.pme_name = "BR_INST_EXEC", .pme_code = 0x088, .pme_desc = "Branch instructions executed (not necessarily retired)" }, {.pme_name = "BR_MISSP_EXEC", .pme_code = 0x89, .pme_desc = "Branch instructions executed that were mispredicted at execution" }, {.pme_name = "BR_BAC_MISSP_EXEC", .pme_code = 0x8a, .pme_desc = "Branch instructions executed that were mispredicted at Front End (BAC)" }, {.pme_name = "BR_CND_EXEC", .pme_code = 0x8b, 
.pme_desc = "Conditional branch instructions executed" }, {.pme_name = "BR_CND_MISSP_EXEC", .pme_code = 0x8c, .pme_desc = "Conditional branch instructions executed that were mispredicted" }, {.pme_name = "BR_IND_EXEC", .pme_code = 0x8d, .pme_desc = "Indirect branch instructions executed" }, {.pme_name = "BR_IND_MISSP_EXEC", .pme_code = 0x8e, .pme_desc = "Indirect branch instructions executed that were mispredicted" }, {.pme_name = "BR_RET_EXEC", .pme_code = 0x8f, .pme_desc = "Return branch instructions executed" }, {.pme_name = "BR_RET_MISSP_EXEC", .pme_code = 0x90, .pme_desc = "Return branch instructions executed that were mispredicted at Execution" }, {.pme_name = "BR_RET_BAC_MISSP_EXEC", .pme_code = 0x91, .pme_desc = "Return branch instructions executed that were mispredicted at Front End (BAC)" }, {.pme_name = "BR_CALL_EXEC", .pme_code = 0x92, .pme_desc = "CALL instructions executed" }, {.pme_name = "BR_CALL_MISSP_EXEC", .pme_code = 0x93, .pme_desc = "CALL instructions executed that were mispredicted" }, {.pme_name = "BR_IND_CALL_EXEC", .pme_code = 0x94, .pme_desc = "Indirect CALL instructions executed" }, {.pme_name = "EMON_SIMD_INSTR_RETIRED", .pme_code = 0xce, .pme_desc = "Number of retired MMX instructions" }, {.pme_name = "EMON_SYNCH_UOPS", .pme_code = 0xd3, .pme_desc = "Sync micro-ops" }, {.pme_name = "EMON_ESP_UOPS", .pme_code = 0xd7, .pme_desc = "Total number of micro-ops" }, {.pme_name = "EMON_FUSED_UOPS_RET", .pme_code = 0xda, .pme_desc = "Total number of micro-ops", .pme_flags = PFMLIB_I386_P6_UMASK_COMBO, .pme_numasks = 3, .pme_umasks = { { .pme_uname = "ALL", .pme_udesc = "All fused micro-ops", .pme_ucode = 0x0 }, { .pme_uname = "LD_OP", .pme_udesc = "Only load+Op micro-ops", .pme_ucode = 0x1 }, { .pme_uname = "STD_STA", .pme_udesc = "Only std+sta micro-ops", .pme_ucode = 0x2 } } }, {.pme_name = "EMON_UNFUSION", .pme_code = 0xdb, .pme_desc = "Number of unfusion events in the ROB, happened on a FP exception to a fused micro-op" }, {.pme_name = 
"EMON_PREF_RQSTS_UP", .pme_code = 0xf0, .pme_desc = "Number of upward prefetches issued" }, {.pme_name = "EMON_PREF_RQSTS_DN", .pme_code = 0xf8, .pme_desc = "Number of downward prefetches issued" }, {.pme_name = "EMON_SSE_SSE2_INST_RETIRED", .pme_code = 0xd8, .pme_desc = "Streaming SIMD extensions instructions retired", .pme_numasks = 4, .pme_umasks = { { .pme_uname = "SSE_PACKED_SCALAR_SINGLE", .pme_udesc = "SSE Packed Single and Scalar Single", .pme_ucode = 0x0 }, { .pme_uname = "SSE_SCALAR_SINGLE", .pme_udesc = "SSE Scalar Single", .pme_ucode = 0x1 }, { .pme_uname = "SSE2_PACKED_DOUBLE", .pme_udesc = "SSE2 Packed Double", .pme_ucode = 0x2 }, { .pme_uname = "SSE2_SCALAR_DOUBLE", .pme_udesc = "SSE2 Scalar Double", .pme_ucode = 0x3 } } }, {.pme_name = "EMON_SSE_SSE2_COMP_INST_RETIRED", .pme_code = 0xd9, .pme_desc = "Computational SSE instructions retired", .pme_numasks = 4, .pme_umasks = { { .pme_uname = "SSE_PACKED_SINGLE", .pme_udesc = "SSE Packed Single", .pme_ucode = 0x0 }, { .pme_uname = "SSE_SCALAR_SINGLE", .pme_udesc = "SSE Scalar Single", .pme_ucode = 0x1 }, { .pme_uname = "SSE2_PACKED_DOUBLE", .pme_udesc = "SSE2 Packed Double", .pme_ucode = 0x2 }, { .pme_uname = "SSE2_SCALAR_DOUBLE", .pme_udesc = "SSE2 Scalar Double", .pme_ucode = 0x3 } } }, {.pme_name = "L2_LD", .pme_code = 0x29, .pme_desc = "Number of L2 data loads", I386_PM_MESI_PREFETCH_UMASKS }, {.pme_name = "L2_LINES_IN", .pme_code = 0x24, .pme_desc = "Number of L2 lines allocated", I386_PM_MESI_PREFETCH_UMASKS }, {.pme_name = "L2_LINES_OUT", .pme_code = 0x26, .pme_desc = "Number of L2 lines evicted", I386_PM_MESI_PREFETCH_UMASKS }, {.pme_name = "L2_M_LINES_OUT", .pme_code = 0x27, .pme_desc = "Number of L2 M-state lines evicted", I386_PM_MESI_PREFETCH_UMASKS } }; #define PME_I386_PM_CPU_CLK_UNHALTED 0 #define PME_I386_PM_INST_RETIRED 1 #define PME_I386_PM_EVENT_COUNT (sizeof(i386_pm_pe)/sizeof(pme_i386_p6_entry_t)) 
papi-papi-7-2-0-t/src/libperfnec/lib/intel_atom_events.h
/* * Copyright (c) 2008 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
*/ /* table 18.11 */ #define INTEL_ATOM_MESI \ { .pme_uname = "MESI",\ .pme_udesc = "Any cacheline access",\ .pme_ucode = 0xf\ },\ { .pme_uname = "I_STATE",\ .pme_udesc = "Invalid cacheline",\ .pme_ucode = 0x1\ },\ { .pme_uname = "S_STATE",\ .pme_udesc = "Shared cacheline",\ .pme_ucode = 0x2\ },\ { .pme_uname = "E_STATE",\ .pme_udesc = "Exclusive cacheline",\ .pme_ucode = 0x4\ },\ { .pme_uname = "M_STATE",\ .pme_udesc = "Modified cacheline",\ .pme_ucode = 0x8\ } /* table 18.9 */ #define INTEL_ATOM_AGENT \ { .pme_uname = "THIS_AGENT",\ .pme_udesc = "This agent",\ .pme_ucode = 0x00\ },\ { .pme_uname = "ALL_AGENTS",\ .pme_udesc = "Any agent on the bus",\ .pme_ucode = 0x20\ } /* table 18.8 */ #define INTEL_ATOM_CORE \ { .pme_uname = "SELF",\ .pme_udesc = "This core",\ .pme_ucode = 0x40\ },\ { .pme_uname = "BOTH_CORES",\ .pme_udesc = "Both cores",\ .pme_ucode = 0xc0\ } /* table 18.10 */ #define INTEL_ATOM_PREFETCH \ { .pme_uname = "ANY",\ .pme_udesc = "All inclusive",\ .pme_ucode = 0x30\ },\ { .pme_uname = "PREFETCH",\ .pme_udesc = "Hardware prefetch only",\ .pme_ucode = 0x10\ } static pme_intel_atom_entry_t intel_atom_pe[]={ /* * BEGIN architectural perfmon events */ /* 0 */{.pme_name = "UNHALTED_CORE_CYCLES", .pme_code = 0x003c, .pme_flags = PFMLIB_INTEL_ATOM_FIXED1, .pme_desc = "Unhalted core cycles", }, /* 1 */{.pme_name = "UNHALTED_REFERENCE_CYCLES", .pme_code = 0x013c, .pme_flags = PFMLIB_INTEL_ATOM_FIXED2_ONLY, .pme_desc = "Unhalted reference cycles. 
Measures bus cycles" }, /* 2 */{.pme_name = "INSTRUCTIONS_RETIRED", .pme_code = 0xc0, .pme_flags = PFMLIB_INTEL_ATOM_FIXED0|PFMLIB_INTEL_ATOM_PEBS, .pme_desc = "Instructions retired" }, /* 3 */{.pme_name = "LAST_LEVEL_CACHE_REFERENCES", .pme_code = 0x4f2e, .pme_desc = "Last level of cache references" }, /* 4 */{.pme_name = "LAST_LEVEL_CACHE_MISSES", .pme_code = 0x412e, .pme_desc = "Last level of cache misses", }, /* 5 */{.pme_name = "BRANCH_INSTRUCTIONS_RETIRED", .pme_code = 0xc4, .pme_desc = "Branch instructions retired" }, /* 6 */{.pme_name = "MISPREDICTED_BRANCH_RETIRED", .pme_code = 0xc5, .pme_flags = PFMLIB_INTEL_ATOM_PEBS, .pme_desc = "Mispredicted branch instruction retired" }, /* * BEGIN non architectural events */ { .pme_name = "SIMD_INSTR_RETIRED", .pme_desc = "SIMD Instructions retired", .pme_code = 0xCE, .pme_flags = 0, }, { .pme_name = "L2_REJECT_BUSQ", .pme_desc = "Rejected L2 cache requests", .pme_code = 0x30, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_MESI, INTEL_ATOM_CORE, INTEL_ATOM_PREFETCH }, .pme_numasks = 9, }, { .pme_name = "SIMD_SAT_INSTR_RETIRED", .pme_desc = "Saturated arithmetic instructions retired", .pme_code = 0xCF, .pme_flags = 0, }, { .pme_name = "ICACHE", .pme_desc = "Instruction fetches", .pme_code = 0x80, .pme_flags = PFMLIB_INTEL_ATOM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ACCESSES", .pme_udesc = "Instruction fetches, including uncacheable fetches", .pme_ucode = 0x3, }, { .pme_uname = "MISSES", .pme_udesc = "Count all instruction fetches that miss the icache or produce memory requests. This includes uncacheable fetches. 
Any instruction fetch miss is counted only once and not once for every cycle it is outstanding", .pme_ucode = 0x2, }, }, .pme_numasks = 2 }, { .pme_name = "L2_LOCK", .pme_desc = "L2 locked accesses", .pme_code = 0x2B, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_MESI, INTEL_ATOM_CORE }, .pme_numasks = 7 }, { .pme_name = "UOPS_RETIRED", .pme_desc = "Micro-ops retired", .pme_code = 0xC2, .pme_flags = PFMLIB_INTEL_ATOM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "Micro-ops retired", .pme_ucode = 0x10, }, { .pme_uname = "STALLED_CYCLES", .pme_udesc = "Cycles no micro-ops retired", .pme_ucode = 0x1d010, /* inv=1 cnt_mask=1 */ }, { .pme_uname = "STALLS", .pme_udesc = "Periods no micro-ops retired", .pme_ucode = 0x1d410, /* inv=1 edge=1, cnt_mask=1 */ }, }, .pme_numasks = 3 }, { .pme_name = "L2_M_LINES_OUT", .pme_desc = "Modified lines evicted from the L2 cache", .pme_code = 0x27, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE, INTEL_ATOM_PREFETCH }, .pme_numasks = 4 }, { .pme_name = "SIMD_COMP_INST_RETIRED", .pme_desc = "Retired computational Streaming SIMD Extensions (SSE) instructions", .pme_code = 0xCA, .pme_flags = 0, .pme_umasks = { { .pme_uname = "PACKED_SINGLE", .pme_udesc = "Retired computational Streaming SIMD Extensions (SSE) packed-single instructions", .pme_ucode = 0x1, }, { .pme_uname = "SCALAR_SINGLE", .pme_udesc = "Retired computational Streaming SIMD Extensions (SSE) scalar-single instructions", .pme_ucode = 0x2, }, { .pme_uname = "PACKED_DOUBLE", .pme_udesc = "Retired computational Streaming SIMD Extensions 2 (SSE2) packed-double instructions", .pme_ucode = 0x4, }, { .pme_uname = "SCALAR_DOUBLE", .pme_udesc = "Retired computational Streaming SIMD Extensions 2 (SSE2) scalar-double instructions", .pme_ucode = 0x8, }, }, .pme_numasks = 4 }, { .pme_name = "SNOOP_STALL_DRV", .pme_desc = "Bus stalled for snoops", .pme_code = 0x7E, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE, INTEL_ATOM_AGENT, }, .pme_numasks = 4 }, { .pme_name = 
"BUS_TRANS_BURST", .pme_desc = "Burst (full cache-line) bus transactions", .pme_code = 0x6E, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE, INTEL_ATOM_AGENT, }, .pme_numasks = 4 }, { .pme_name = "SIMD_SAT_UOP_EXEC", .pme_desc = "SIMD saturated arithmetic micro-ops executed", .pme_code = 0xB1, .pme_flags = PFMLIB_INTEL_ATOM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "S", .pme_udesc = "SIMD saturated arithmetic micro-ops executed", .pme_ucode = 0x0, }, { .pme_uname = "AR", .pme_udesc = "SIMD saturated arithmetic micro-ops retired", .pme_ucode = 0x80, }, }, .pme_numasks = 2 }, { .pme_name = "BUS_TRANS_IO", .pme_desc = "IO bus transactions", .pme_code = 0x6C, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE, INTEL_ATOM_AGENT }, .pme_numasks = 4 }, { .pme_name = "BUS_TRANS_RFO", .pme_desc = "RFO bus transactions", .pme_code = 0x66, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE, INTEL_ATOM_AGENT }, .pme_numasks = 4 }, { .pme_name = "SIMD_ASSIST", .pme_desc = "SIMD assists invoked", .pme_code = 0xCD, .pme_flags = 0, }, { .pme_name = "INST_RETIRED", .pme_desc = "Instructions retired", .pme_code = 0xC0, .pme_flags = 0, .pme_umasks = { { .pme_uname = "ANY_P", .pme_udesc = "Instructions retired using generic counter (precise event)", .pme_ucode = 0x0, .pme_flags = PFMLIB_INTEL_ATOM_PEBS }, }, .pme_numasks = 1 }, { .pme_name = "L1D_CACHE", .pme_desc = "L1 Cacheable Data Reads", .pme_code = 0x40, .pme_flags = PFMLIB_INTEL_ATOM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "LD", .pme_udesc = "L1 Cacheable Data Reads", .pme_ucode = 0x21, }, { .pme_uname = "ST", .pme_udesc = "L1 Cacheable Data Writes", .pme_ucode = 0x22, }, }, .pme_numasks = 2 }, { .pme_name = "MUL", .pme_desc = "Multiply operations executed", .pme_code = 0x12, .pme_flags = PFMLIB_INTEL_ATOM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "S", .pme_udesc = "Multiply operations executed", .pme_ucode = 0x1, }, { .pme_uname = "AR", .pme_udesc = "Multiply operations retired", .pme_ucode = 0x81, }, }, .pme_numasks = 2 
}, { .pme_name = "DIV", .pme_desc = "Divide operations executed", .pme_code = 0x13, .pme_flags = PFMLIB_INTEL_ATOM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "S", .pme_udesc = "Divide operations executed", .pme_ucode = 0x1, }, { .pme_uname = "AR", .pme_udesc = "Divide operations retired", .pme_ucode = 0x81, }, }, .pme_numasks = 2 }, { .pme_name = "BUS_TRANS_P", .pme_desc = "Partial bus transactions", .pme_code = 0x6b, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_AGENT, INTEL_ATOM_CORE, }, .pme_numasks = 4 }, { .pme_name = "BUS_IO_WAIT", .pme_desc = "IO requests waiting in the bus queue", .pme_code = 0x7F, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE }, .pme_numasks = 2 }, { .pme_name = "L2_M_LINES_IN", .pme_desc = "L2 cache line modifications", .pme_code = 0x25, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE }, .pme_numasks = 2 }, { .pme_name = "L2_LINES_IN", .pme_desc = "L2 cache misses", .pme_code = 0x24, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE, INTEL_ATOM_PREFETCH }, .pme_numasks = 4 }, { .pme_name = "BUSQ_EMPTY", .pme_desc = "Bus queue is empty", .pme_code = 0x7D, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE }, .pme_numasks = 2 }, { .pme_name = "L2_IFETCH", .pme_desc = "L2 cacheable instruction fetch requests", .pme_code = 0x28, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_MESI, INTEL_ATOM_CORE }, .pme_numasks = 7 }, { .pme_name = "BUS_HITM_DRV", .pme_desc = "HITM signal asserted", .pme_code = 0x7B, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_AGENT }, .pme_numasks = 2 }, { .pme_name = "ITLB", .pme_desc = "ITLB hits", .pme_code = 0x82, .pme_flags = 0, .pme_umasks = { { .pme_uname = "FLUSH", .pme_udesc = "ITLB flushes", .pme_ucode = 0x4, }, { .pme_uname = "MISSES", .pme_udesc = "ITLB misses", .pme_ucode = 0x2, }, }, .pme_numasks = 2 }, { .pme_name = "BUS_TRANS_MEM", .pme_desc = "Memory bus transactions", .pme_code = 0x6F, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE, INTEL_ATOM_AGENT, }, .pme_numasks = 4 }, { .pme_name = "BUS_TRANS_PWR", .pme_desc 
= "Partial write bus transaction", .pme_code = 0x6A, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE, INTEL_ATOM_AGENT, }, .pme_numasks = 4 }, { .pme_name = "BR_INST_DECODED", .pme_desc = "Branch instructions decoded", .pme_code = 0x1E0, .pme_flags = 0, }, { .pme_name = "BUS_TRANS_INVAL", .pme_desc = "Invalidate bus transactions", .pme_code = 0x69, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE, INTEL_ATOM_AGENT }, .pme_numasks = 4 }, { .pme_name = "SIMD_UOP_TYPE_EXEC", .pme_desc = "SIMD micro-ops executed", .pme_code = 0xB3, .pme_flags = PFMLIB_INTEL_ATOM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "MUL_S", .pme_udesc = "SIMD packed multiply micro-ops executed", .pme_ucode = 0x1, }, { .pme_uname = "MUL_AR", .pme_udesc = "SIMD packed multiply micro-ops retired", .pme_ucode = 0x81, }, { .pme_uname = "SHIFT_S", .pme_udesc = "SIMD packed shift micro-ops executed", .pme_ucode = 0x2, }, { .pme_uname = "SHIFT_AR", .pme_udesc = "SIMD packed shift micro-ops retired", .pme_ucode = 0x82, }, { .pme_uname = "PACK_S", .pme_udesc = "SIMD packed micro-ops executed", .pme_ucode = 0x4, }, { .pme_uname = "PACK_AR", .pme_udesc = "SIMD packed micro-ops retired", .pme_ucode = 0x84, }, { .pme_uname = "UNPACK_S", .pme_udesc = "SIMD unpacked micro-ops executed", .pme_ucode = 0x8, }, { .pme_uname = "UNPACK_AR", .pme_udesc = "SIMD unpacked micro-ops retired", .pme_ucode = 0x88, }, { .pme_uname = "LOGICAL_S", .pme_udesc = "SIMD packed logical micro-ops executed", .pme_ucode = 0x10, }, { .pme_uname = "LOGICAL_AR", .pme_udesc = "SIMD packed logical micro-ops retired", .pme_ucode = 0x90, }, { .pme_uname = "ARITHMETIC_S", .pme_udesc = "SIMD packed arithmetic micro-ops executed", .pme_ucode = 0x20, }, { .pme_uname = "ARITHMETIC_AR", .pme_udesc = "SIMD packed arithmetic micro-ops retired", .pme_ucode = 0xA0, }, }, .pme_numasks = 12 }, { .pme_name = "SIMD_INST_RETIRED", .pme_desc = "Retired Streaming SIMD Extensions (SSE)", .pme_code = 0xC7, .pme_flags = 0, .pme_umasks = { { .pme_uname = 
"PACKED_SINGLE", .pme_udesc = "Retired Streaming SIMD Extensions (SSE) packed-single instructions", .pme_ucode = 0x1, }, { .pme_uname = "SCALAR_SINGLE", .pme_udesc = "Retired Streaming SIMD Extensions (SSE) scalar-single instructions", .pme_ucode = 0x2, }, { .pme_uname = "PACKED_DOUBLE", .pme_udesc = "Retired Streaming SIMD Extensions 2 (SSE2) packed-double instructions", .pme_ucode = 0x4, }, { .pme_uname = "SCALAR_DOUBLE", .pme_udesc = "Retired Streaming SIMD Extensions 2 (SSE2) scalar-double instructions", .pme_ucode = 0x8, }, { .pme_uname = "VECTOR", .pme_udesc = "Retired Streaming SIMD Extensions 2 (SSE2) vector instructions", .pme_ucode = 0x10, }, { .pme_uname = "ANY", .pme_udesc = "Retired Streaming SIMD instructions", .pme_ucode = 0x1F, }, }, .pme_numasks = 6 }, { .pme_name = "CYCLES_DIV_BUSY", .pme_desc = "Cycles the divider is busy", .pme_code = 0x14, .pme_flags = 0, }, { .pme_name = "PREFETCH", .pme_desc = "Streaming SIMD Extensions (SSE) PrefetchT0 instructions executed", .pme_code = 0x7, .pme_flags = 0, .pme_umasks = { { .pme_uname = "PREFETCHT0", .pme_udesc = "Streaming SIMD Extensions (SSE) PrefetchT0 instructions executed", .pme_ucode = 0x01, }, { .pme_uname = "SW_L2", .pme_udesc = "Streaming SIMD Extensions (SSE) PrefetchT1 and PrefetchT2 instructions executed", .pme_ucode = 0x06, }, { .pme_uname = "PREFETCHNTA", .pme_udesc = "Streaming SIMD Extensions (SSE) Prefetch NTA instructions executed", .pme_ucode = 0x08, }, }, .pme_numasks = 3 }, { .pme_name = "L2_RQSTS", .pme_desc = "L2 cache requests", .pme_code = 0x2E, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE, INTEL_ATOM_PREFETCH, INTEL_ATOM_MESI }, .pme_numasks = 9 }, { .pme_name = "SIMD_UOPS_EXEC", .pme_desc = "SIMD micro-ops executed (excluding stores)", .pme_code = 0xB0, .pme_flags = PFMLIB_INTEL_ATOM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "S", .pme_udesc = "number of SMD saturated arithmetic micro-ops executed", .pme_ucode = 0x0, }, { .pme_uname = "AR", .pme_udesc = "number of SIMD 
saturated arithmetic micro-ops retired", .pme_ucode = 0x80, }, }, .pme_numasks = 2 }, { .pme_name = "HW_INT_RCV", .pme_desc = "Hardware interrupts received", .pme_code = 0xC8, .pme_flags = 0, }, { .pme_name = "BUS_TRANS_BRD", .pme_desc = "Burst read bus transactions", .pme_code = 0x65, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_AGENT, INTEL_ATOM_CORE }, .pme_numasks = 4 }, { .pme_name = "BOGUS_BR", .pme_desc = "Bogus branches", .pme_code = 0xE4, .pme_flags = 0, }, { .pme_name = "BUS_DATA_RCV", .pme_desc = "Bus cycles while processor receives data", .pme_code = 0x64, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE, }, .pme_numasks = 2 }, { .pme_name = "MACHINE_CLEARS", .pme_desc = "Self-Modifying Code detected", .pme_code = 0xC3, .pme_flags = 0, .pme_umasks = { { .pme_uname = "SMC", .pme_udesc = "Self-Modifying Code detected", .pme_ucode = 0x1, }, }, .pme_numasks = 1 }, { .pme_name = "BR_INST_RETIRED", .pme_desc = "Retired branch instructions", .pme_code = 0xC4, .pme_flags = 0, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "Retired branch instructions", .pme_ucode = 0x0, }, { .pme_uname = "PRED_NOT_TAKEN", .pme_udesc = "Retired branch instructions that were predicted not-taken", .pme_ucode = 0x1, }, { .pme_uname = "MISPRED_NOT_TAKEN", .pme_udesc = "Retired branch instructions that were mispredicted not-taken", .pme_ucode = 0x2, }, { .pme_uname = "PRED_TAKEN", .pme_udesc = "Retired branch instructions that were predicted taken", .pme_ucode = 0x4, }, { .pme_uname = "MISPRED_TAKEN", .pme_udesc = "Retired branch instructions that were mispredicted taken", .pme_ucode = 0x8, }, { .pme_uname = "MISPRED", .pme_udesc = "Retired mispredicted branch instructions (precise event)", .pme_flags = PFMLIB_INTEL_ATOM_PEBS, .pme_ucode = 0xA, }, { .pme_uname = "TAKEN", .pme_udesc = "Retired taken branch instructions", .pme_ucode = 0xC, }, { .pme_uname = "ANY1", .pme_udesc = "Retired branch instructions", .pme_ucode = 0xF, }, }, .pme_numasks = 8 }, { .pme_name = "L2_ADS", .pme_desc 
= "Cycles L2 address bus is in use", .pme_code = 0x21, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE }, .pme_numasks = 2 }, { .pme_name = "EIST_TRANS", .pme_desc = "Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions", .pme_code = 0x3A, .pme_flags = 0, }, { .pme_name = "BUS_TRANS_WB", .pme_desc = "Explicit writeback bus transactions", .pme_code = 0x67, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE, INTEL_ATOM_AGENT }, .pme_numasks = 4 }, { .pme_name = "MACRO_INSTS", .pme_desc = "Macro instructions decoded", .pme_code = 0xAA, .pme_flags = PFMLIB_INTEL_ATOM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "NON_CISC_DECODED", .pme_udesc = "Non-CISC macro instructions decoded", .pme_ucode = 0x1, }, { .pme_uname = "ALL_DECODED", .pme_udesc = "All Instructions decoded", .pme_ucode = 0x3, }, }, .pme_numasks = 2 }, { .pme_name = "L2_LINES_OUT", .pme_desc = "L2 cache lines evicted", .pme_code = 0x26, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE, INTEL_ATOM_PREFETCH }, .pme_numasks = 4 }, { .pme_name = "L2_LD", .pme_desc = "L2 cache reads", .pme_code = 0x29, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE, INTEL_ATOM_PREFETCH, INTEL_ATOM_MESI }, .pme_numasks = 9 }, { .pme_name = "SEGMENT_REG_LOADS", .pme_desc = "Number of segment register loads", .pme_code = 0x6, .pme_flags = 0, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "Number of segment register loads", .pme_ucode = 0x80, }, }, .pme_numasks = 1 }, { .pme_name = "L2_NO_REQ", .pme_desc = "Cycles no L2 cache requests are pending", .pme_code = 0x32, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE }, .pme_numasks = 2 }, { .pme_name = "THERMAL_TRIP", .pme_desc = "Number of thermal trips", .pme_code = 0xC03B, .pme_flags = 0, }, { .pme_name = "EXT_SNOOP", .pme_desc = "External snoops", .pme_code = 0x77, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_MESI, INTEL_ATOM_CORE }, .pme_numasks = 7 }, { .pme_name = "BACLEARS", .pme_desc = "BACLEARS asserted", .pme_code = 0xE6, .pme_flags = 0, .pme_umasks = { { 
.pme_uname = "ANY", .pme_udesc = "BACLEARS asserted", .pme_ucode = 0x1, }, }, .pme_numasks = 1 }, { .pme_name = "CYCLES_INT_MASKED", .pme_desc = "Cycles during which interrupts are disabled", .pme_code = 0xC6, .pme_flags = PFMLIB_INTEL_ATOM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "CYCLES_INT_MASKED", .pme_udesc = "Cycles during which interrupts are disabled", .pme_ucode = 0x1, }, { .pme_uname = "CYCLES_INT_PENDING_AND_MASKED", .pme_udesc = "Cycles during which interrupts are pending and disabled", .pme_ucode = 0x2, }, }, .pme_numasks = 2 }, { .pme_name = "FP_ASSIST", .pme_desc = "Floating point assists", .pme_code = 0x11, .pme_flags = PFMLIB_INTEL_ATOM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "S", .pme_udesc = "Floating point assists for executed instructions", .pme_ucode = 0x1, }, { .pme_uname = "AR", .pme_udesc = "Floating point assists for retired instructions", .pme_ucode = 0x81, }, }, .pme_numasks = 2 }, { .pme_name = "L2_ST", .pme_desc = "L2 store requests", .pme_code = 0x2A, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_MESI, INTEL_ATOM_CORE }, .pme_numasks = 7 }, { .pme_name = "BUS_TRANS_DEF", .pme_desc = "Deferred bus transactions", .pme_code = 0x6D, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE, INTEL_ATOM_AGENT }, .pme_numasks = 4 }, { .pme_name = "DATA_TLB_MISSES", .pme_desc = "Memory accesses that missed the DTLB", .pme_code = 0x8, .pme_flags = PFMLIB_INTEL_ATOM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "DTLB_MISS", .pme_udesc = "Memory accesses that missed the DTLB", .pme_ucode = 0x7, }, { .pme_uname = "DTLB_MISS_LD", .pme_udesc = "DTLB misses due to load operations", .pme_ucode = 0x5, }, { .pme_uname = "L0_DTLB_MISS_LD", .pme_udesc = "L0 (micro-TLB) misses due to load operations", .pme_ucode = 0x9, }, { .pme_uname = "DTLB_MISS_ST", .pme_udesc = "DTLB misses due to store operations", .pme_ucode = 0x6, }, }, .pme_numasks = 4 }, { .pme_name = "BUS_BNR_DRV", .pme_desc = "Number of Bus Not Ready signals asserted", .pme_code = 0x61, .pme_flags = 
0, .pme_umasks = { INTEL_ATOM_AGENT }, .pme_numasks = 2 }, { .pme_name = "STORE_FORWARDS", .pme_desc = "All store forwards", .pme_code = 0x2, .pme_flags = 0, .pme_umasks = { { .pme_uname = "GOOD", .pme_udesc = "Good store forwards", .pme_ucode = 0x81, }, }, .pme_numasks = 1 }, { .pme_name = "CPU_CLK_UNHALTED", .pme_code = 0x3c, .pme_desc = "Core cycles when core is not halted", .pme_flags = PFMLIB_INTEL_ATOM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "CORE_P", .pme_udesc = "Core cycles when core is not halted", .pme_ucode = 0x0, }, { .pme_uname = "BUS", .pme_udesc = "Bus cycles when core is not halted. This event can give a measurement of the elapsed time. This event has a constant ratio with the CPU_CLK_UNHALTED:REF event, which is the maximum bus to processor frequency ratio", .pme_ucode = 0x1, }, { .pme_uname = "NO_OTHER", .pme_udesc = "Bus cycles when core is active and other is halted", .pme_ucode = 0x2, }, }, .pme_numasks = 3 }, { .pme_name = "BUS_TRANS_ANY", .pme_desc = "All bus transactions", .pme_code = 0x70, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE, INTEL_ATOM_AGENT }, .pme_numasks = 4 }, { .pme_name = "MEM_LOAD_RETIRED", .pme_desc = "Retired loads that hit the L2 cache (precise event)", .pme_code = 0xCB, .pme_flags = 0, .pme_umasks = { { .pme_uname = "L2_HIT", .pme_udesc = "Retired loads that hit the L2 cache (precise event)", .pme_ucode = 0x1, .pme_flags = PFMLIB_INTEL_ATOM_PEBS }, { .pme_uname = "L2_MISS", .pme_udesc = "Retired loads that miss the L2 cache (precise event)", .pme_ucode = 0x2, .pme_flags = PFMLIB_INTEL_ATOM_PEBS }, { .pme_uname = "DTLB_MISS", .pme_udesc = "Retired loads that miss the DTLB (precise event)", .pme_ucode = 0x4, .pme_flags = PFMLIB_INTEL_ATOM_PEBS }, }, .pme_numasks = 3 }, { .pme_name = "X87_COMP_OPS_EXE", .pme_desc = "Floating point computational micro-ops executed", .pme_code = 0x10, .pme_flags = PFMLIB_INTEL_ATOM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY_S", .pme_udesc = "Floating point computational
micro-ops executed", .pme_ucode = 0x1, }, { .pme_uname = "ANY_AR", .pme_udesc = "Floating point computational micro-ops retired", .pme_ucode = 0x81, }, }, .pme_numasks = 2 }, { .pme_name = "PAGE_WALKS", .pme_desc = "Number of page-walks executed", .pme_code = 0xC, .pme_flags = PFMLIB_INTEL_ATOM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "WALKS", .pme_udesc = "Number of page-walks executed", .pme_ucode = 0x3 | 1ul << 10, }, { .pme_uname = "CYCLES", .pme_udesc = "Duration of page-walks in core cycles", .pme_ucode = 0x3, }, }, .pme_numasks = 2 }, { .pme_name = "BUS_LOCK_CLOCKS", .pme_desc = "Bus cycles when a LOCK signal is asserted", .pme_code = 0x63, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_AGENT, INTEL_ATOM_CORE }, .pme_numasks = 4 }, { .pme_name = "BUS_REQUEST_OUTSTANDING", .pme_desc = "Outstanding cacheable data read bus requests duration", .pme_code = 0x60, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_AGENT, INTEL_ATOM_CORE }, .pme_numasks = 4 }, { .pme_name = "BUS_TRANS_IFETCH", .pme_desc = "Instruction-fetch bus transactions", .pme_code = 0x68, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_AGENT, INTEL_ATOM_CORE }, .pme_numasks = 4 }, { .pme_name = "BUS_HIT_DRV", .pme_desc = "HIT signal asserted", .pme_code = 0x7A, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_AGENT }, .pme_numasks = 2 }, { .pme_name = "BUS_DRDY_CLOCKS", .pme_desc = "Bus cycles when data is sent on the bus", .pme_code = 0x62, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_AGENT }, .pme_numasks = 2 }, { .pme_name = "L2_DBUS_BUSY", .pme_desc = "Cycles the L2 cache data bus is busy", .pme_code = 0x22, .pme_flags = 0, .pme_umasks = { INTEL_ATOM_CORE }, .pme_numasks = 2 }, }; #define PME_INTEL_ATOM_UNHALTED_CORE_CYCLES 0 #define PME_INTEL_ATOM_INSTRUCTIONS_RETIRED 2 #define PME_INTEL_ATOM_EVENT_COUNT (sizeof(intel_atom_pe)/sizeof(pme_intel_atom_entry_t)) papi-papi-7-2-0-t/src/libperfnec/lib/intel_corei7_events.h /* * Copyright
(c) 2008 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ static pme_nhm_entry_t corei7_pe[]={ /* * BEGIN architected events */ {.pme_name = "UNHALTED_CORE_CYCLES", .pme_code = 0x003c, .pme_cntmsk = 0x2000f, .pme_flags = PFMLIB_NHM_FIXED1, .pme_desc = "count core clock cycles whenever the clock signal on the specific core is running (not halted). Alias to event CPU_CLK_UNHALTED:THREAD" }, {.pme_name = "INSTRUCTION_RETIRED", .pme_code = 0x00c0, .pme_cntmsk = 0x1000f, .pme_flags = PFMLIB_NHM_FIXED0|PFMLIB_NHM_PEBS, .pme_desc = "count the number of instructions at retirement. 
Alias to event INST_RETIRED:ANY_P", }, {.pme_name = "INSTRUCTIONS_RETIRED", .pme_code = 0x00c0, .pme_cntmsk = 0x1000f, .pme_flags = PFMLIB_NHM_FIXED0|PFMLIB_NHM_PEBS, .pme_desc = "This is an alias for INSTRUCTION_RETIRED", }, {.pme_name = "UNHALTED_REFERENCE_CYCLES", .pme_code = 0x013c, .pme_cntmsk = 0x40000, .pme_flags = PFMLIB_NHM_FIXED2_ONLY, .pme_desc = "Unhalted reference cycles", }, {.pme_name = "LLC_REFERENCES", .pme_code = 0x4f2e, .pme_cntmsk = 0xf, .pme_desc = "count each request originating from the core to reference a cache line in the last level cache. The count may include speculation, but excludes cache line fills due to hardware prefetch. Alias to L2_RQSTS:SELF_DEMAND_MESI", }, {.pme_name = "LAST_LEVEL_CACHE_REFERENCES", .pme_code = 0x4f2e, .pme_cntmsk = 0xf, .pme_desc = "This is an alias for LLC_REFERENCES", }, {.pme_name = "LLC_MISSES", .pme_code = 0x412e, .pme_cntmsk = 0xf, .pme_desc = "count each cache miss condition for references to the last level cache. The event count may include speculation, but excludes cache line fills due to hardware prefetch. Alias to event L2_RQSTS:SELF_DEMAND_I_STATE", }, {.pme_name = "LAST_LEVEL_CACHE_MISSES", .pme_code = 0x412e, .pme_cntmsk = 0xf, .pme_desc = "This is an alias for LLC_MISSES", }, {.pme_name = "BRANCH_INSTRUCTIONS_RETIRED", .pme_code = 0x00c4, .pme_cntmsk = 0xf, .pme_desc = "count branch instructions at retirement. Specifically, this event counts the retirement of the last micro-op of a branch instruction. Alias to event BR_INST_RETIRED:ANY", }, /* * BEGIN core specific events */ { .pme_name = "ARITH", .pme_desc = "Counts arithmetic multiply and divide operations", .pme_code = 0x14, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "CYCLES_DIV_BUSY", .pme_udesc = "Counts the number of cycles the divider is busy executing divide or square root operations. The divide can be integer, X87 or Streaming SIMD Extensions (SSE). 
The square root operation can be either X87 or SSE.", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "DIV", .pme_udesc = "Counts the number of divide or square root operations. The divide can be integer, X87 or Streaming SIMD Extensions (SSE). The square root operation can be either X87 or SSE.", .pme_ucode = 0x01 | (1<<16) | (1<<15) | (1<<10), /* cmask=1  invert=1  edge=1 */ .pme_uflags = 0, }, { .pme_uname = "MUL", .pme_udesc = "Counts the number of multiply operations executed. This includes integer as well as floating point multiply operations but excludes DPPS mul and MPSAD.", .pme_ucode = 0x02, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "BACLEAR", .pme_desc = "Branch address calculator", .pme_code = 0xE6, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "BAD_TARGET", .pme_udesc = "BACLEAR asserted with bad target address", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CLEAR", .pme_udesc = "BACLEAR asserted, regardless of cause", .pme_ucode = 0x01, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "BACLEAR_FORCE_IQ", .pme_desc = "Instruction queue forced BACLEAR", .pme_code = 0x01A7, .pme_flags = 0, }, { .pme_name = "BOGUS_BR", .pme_desc = "Counts the number of bogus branches.", .pme_code = 0x01E4, .pme_flags = 0, }, { .pme_name = "BPU_CLEARS", .pme_desc = "Branch Prediction Unit clears", .pme_code = 0xE8, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "EARLY", .pme_udesc = "Early Branch Prediction Unit clears", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "LATE", .pme_udesc = "Late Branch Prediction Unit clears", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "ANY", .pme_udesc = "count any Branch Prediction Unit clears", .pme_ucode = 0x03, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "BPU_MISSED_CALL_RET", .pme_desc = "Branch prediction unit missed call or return", .pme_code = 0x01E5, .pme_flags = 0, }, { .pme_name = "BR_INST_DECODED", .pme_desc =
"Branch instructions decoded", .pme_code = 0x01E0, .pme_flags = 0, }, { .pme_name = "BR_INST_EXEC", .pme_desc = "Branch instructions executed", .pme_code = 0x88, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "Branch instructions executed", .pme_ucode = 0x7F, .pme_uflags = 0, }, { .pme_uname = "COND", .pme_udesc = "Conditional branch instructions executed", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "DIRECT", .pme_udesc = "Unconditional branches executed", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "DIRECT_NEAR_CALL", .pme_udesc = "Unconditional call branches executed", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "INDIRECT_NEAR_CALL", .pme_udesc = "Indirect call branches executed", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "INDIRECT_NON_CALL", .pme_udesc = "Indirect non call branches executed", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "NEAR_CALLS", .pme_udesc = "Call branches executed", .pme_ucode = 0x30, .pme_uflags = 0, }, { .pme_uname = "NON_CALLS", .pme_udesc = "All non call branches executed", .pme_ucode = 0x07, .pme_uflags = 0, }, { .pme_uname = "RETURN_NEAR", .pme_udesc = "Indirect return branches executed", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "TAKEN", .pme_udesc = "Taken branches executed", .pme_ucode = 0x40, .pme_uflags = 0, }, }, .pme_numasks = 10 }, { .pme_name = "BR_INST_RETIRED", .pme_desc = "Retired branch instructions", .pme_code = 0xC4, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ALL_BRANCHES", .pme_udesc = "Retired branch instructions (Precise Event)", .pme_ucode = 0x04, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "CONDITIONAL", .pme_udesc = "Retired conditional branch instructions (Precise Event)", .pme_ucode = 0x01, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "NEAR_CALL", .pme_udesc = "Retired near call instructions (Precise Event)", .pme_ucode = 0x02, .pme_uflags = PFMLIB_NHM_PEBS, }, }, 
.pme_numasks = 3 }, { .pme_name = "BR_MISP_EXEC", .pme_desc = "Mispredicted branches executed", .pme_code = 0x89, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "Mispredicted branches executed", .pme_ucode = 0x7F, .pme_uflags = 0, }, { .pme_uname = "COND", .pme_udesc = "Mispredicted conditional branches executed", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "DIRECT", .pme_udesc = "Mispredicted unconditional branches executed", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "DIRECT_NEAR_CALL", .pme_udesc = "Mispredicted unconditional call branches executed", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "INDIRECT_NEAR_CALL", .pme_udesc = "Mispredicted indirect call branches executed", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "INDIRECT_NON_CALL", .pme_udesc = "Mispredicted indirect non call branches executed", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "NEAR_CALLS", .pme_udesc = "Mispredicted call branches executed", .pme_ucode = 0x30, .pme_uflags = 0, }, { .pme_uname = "NON_CALLS", .pme_udesc = "Mispredicted non call branches executed", .pme_ucode = 0x07, .pme_uflags = 0, }, { .pme_uname = "RETURN_NEAR", .pme_udesc = "Mispredicted return branches executed", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "TAKEN", .pme_udesc = "Mispredicted taken branches executed", .pme_ucode = 0x40, .pme_uflags = 0, }, }, .pme_numasks = 10 }, { .pme_name = "BR_MISP_RETIRED", .pme_desc = "Count Mispredicted Branch Activity", .pme_code = 0xC5, .pme_flags = 0, .pme_umasks = { { .pme_uname = "NEAR_CALL", .pme_udesc = "Counts mispredicted direct and indirect near unconditional retired calls", .pme_ucode = 0x02, .pme_uflags = 0, }, }, .pme_numasks = 1 }, { .pme_name = "CACHE_LOCK_CYCLES", .pme_desc = "Cache lock cycles", .pme_code = 0x63, .pme_flags = PFMLIB_NHM_PMC01, .pme_umasks = { { .pme_uname = "L1D", .pme_udesc = "Cycles L1D locked", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "L1D_L2",
.pme_udesc = "Cycles L1D and L2 locked", .pme_ucode = 0x01, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "CPU_CLK_UNHALTED", .pme_desc = "Cycles when processor is not in halted state", .pme_code = 0x3C, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "THREAD_P", .pme_udesc = "Cycles when thread is not halted (programmable counter)", .pme_ucode = 0x00, .pme_uflags = 0, }, { .pme_uname = "REF_P", .pme_udesc = "Reference base clock (133 MHz) cycles when thread is not halted", .pme_ucode = 0x01, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "DTLB_LOAD_MISSES", .pme_desc = "Data TLB load misses", .pme_code = 0x08, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "DTLB load misses", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "PDE_MISS", .pme_udesc = "DTLB load miss caused by low part of address", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "WALK_COMPLETED", .pme_udesc = "DTLB load miss page walks complete", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "STLB_HIT", .pme_udesc = "DTLB second level hit", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "PDP_MISS", .pme_udesc = "Number of DTLB cache load misses where the high part of the linear to physical address translation was missed", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "LARGE_WALK_COMPLETED", .pme_udesc = "Counts number of completed large page walks due to load miss in the STLB", .pme_ucode = 0x80, .pme_uflags = 0, }, }, .pme_numasks = 6 }, { .pme_name = "DTLB_MISSES", .pme_desc = "Data TLB misses", .pme_code = 0x49, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "DTLB misses", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "STLB_HIT", .pme_udesc = "DTLB first level misses but second level hit", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "WALK_COMPLETED", .pme_udesc = "DTLB miss page walks", .pme_ucode = 0x02, .pme_uflags = 0,
}, { .pme_uname = "PDE_MISS", .pme_udesc = "Number of DTLB cache misses where the low part of the linear to physical address translation was missed", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "PDP_MISS", .pme_udesc = "Number of DTLB misses where the high part of the linear to physical address translation was missed", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "LARGE_WALK_COMPLETED", .pme_udesc = "Counts number of completed large page walks due to misses in the STLB", .pme_ucode = 0x80, .pme_uflags = 0, }, }, .pme_numasks = 6 }, { .pme_name = "EPT", .pme_desc = "Extended Page Directory", .pme_code = 0x4F, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "EPDE_MISS", .pme_udesc = "Extended Page Directory Entry miss", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "EPDPE_MISS", .pme_udesc = "Extended Page Directory Pointer miss", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "EPDPE_HIT", .pme_udesc = "Extended Page Directory Pointer hit", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "ES_REG_RENAMES", .pme_desc = "ES segment renames", .pme_code = 0x01D5, .pme_flags = 0, }, { .pme_name = "FP_ASSIST", .pme_desc = "Floating point assists", .pme_code = 0xF7, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ALL", .pme_udesc = "Floating point assists (Precise Event)", .pme_ucode = 0x01, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "INPUT", .pme_udesc = "Floating point assists for invalid input value (Precise Event)", .pme_ucode = 0x04, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "OUTPUT", .pme_udesc = "Floating point assists for invalid output value (Precise Event)", .pme_ucode = 0x02, .pme_uflags = PFMLIB_NHM_PEBS, }, }, .pme_numasks = 3 }, { .pme_name = "FP_COMP_OPS_EXE", .pme_desc = "Floating point computational micro-ops", .pme_code = 0x10, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "MMX", .pme_udesc = "MMX Uops", .pme_ucode =
0x02, .pme_uflags = 0, }, { .pme_uname = "SSE_DOUBLE_PRECISION", .pme_udesc = "SSE* FP double precision Uops", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "SSE_FP", .pme_udesc = "SSE and SSE2 FP Uops", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "SSE_FP_PACKED", .pme_udesc = "SSE FP packed Uops", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "SSE_FP_SCALAR", .pme_udesc = "SSE FP scalar Uops", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "SSE_SINGLE_PRECISION", .pme_udesc = "SSE* FP single precision Uops", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "SSE2_INTEGER", .pme_udesc = "SSE2 integer Uops", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "X87", .pme_udesc = "Computational floating-point operations executed", .pme_ucode = 0x01, .pme_uflags = 0, }, }, .pme_numasks = 8 }, { .pme_name = "FP_MMX_TRANS", .pme_desc = "Floating Point to and from MMX transitions", .pme_code = 0xCC, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "All Floating Point to and from MMX transitions", .pme_ucode = 0x03, .pme_uflags = 0, }, { .pme_uname = "TO_FP", .pme_udesc = "Transitions from MMX to Floating Point instructions", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "TO_MMX", .pme_udesc = "Transitions from Floating Point to MMX instructions", .pme_ucode = 0x02, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "HW_INT", .pme_desc = "Hardware interrupts", .pme_code = 0x1D, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "RCV", .pme_udesc = "Number of interrupts received", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "CYCLES_MASKED", .pme_udesc = "Number of cycles interrupts are masked", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CYCLES_PENDING_AND_MASKED", .pme_udesc = "Number of cycles interrupts are pending and masked", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "IFU_IVC", .pme_desc = "Instruction
Fetch unit victim cache", .pme_code = 0x81, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "FULL", .pme_udesc = "Instruction Fetch unit victim cache full", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "L1I_EVICTION", .pme_udesc = "L1 Instruction cache evictions", .pme_ucode = 0x02, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "ILD_STALL", .pme_desc = "Instruction Length Decoder stalls", .pme_code = 0x87, .pme_flags = 0, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "Any Instruction Length Decoder stall cycles", .pme_ucode = 0x0F, .pme_uflags = 0, }, { .pme_uname = "IQ_FULL", .pme_udesc = "Instruction Queue full stall cycles", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "LCP", .pme_udesc = "Length Change Prefix stall cycles", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "MRU", .pme_udesc = "Stall cycles due to BPU MRU bypass", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "REGEN", .pme_udesc = "Regen stall cycles", .pme_ucode = 0x08, .pme_uflags = 0, }, }, .pme_numasks = 5 }, { .pme_name = "INST_DECODED", .pme_desc = "Instructions decoded", .pme_code = 0x18, .pme_flags = 0, .pme_umasks = { { .pme_uname = "DEC0", .pme_udesc = "Instructions that must be decoded by decoder 0", .pme_ucode = 0x01, .pme_uflags = 0, }, }, .pme_numasks = 1 }, { .pme_name = "INST_QUEUE_WRITES", .pme_desc = "Instructions written to instruction queue.", .pme_code = 0x0117, .pme_flags = 0, }, { .pme_name = "INST_QUEUE_WRITE_CYCLES", .pme_desc = "Cycles instructions are written to the instruction queue", .pme_code = 0x011E, .pme_flags = 0, }, { .pme_name = "INST_RETIRED", .pme_desc = "Instructions retired", .pme_code = 0xC0, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY_P", .pme_udesc = "Instructions Retired (Precise Event)", .pme_ucode = 0x00, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "X87", .pme_udesc = "Retired floating-point operations (Precise Event)", .pme_ucode = 0x02,
.pme_uflags = PFMLIB_NHM_PEBS, }, }, .pme_numasks = 2 }, { .pme_name = "IO_TRANSACTIONS", .pme_desc = "I/O transactions", .pme_code = 0x016C, .pme_flags = 0, }, { .pme_name = "ITLB_FLUSH", .pme_desc = "Counts the number of ITLB flushes", .pme_code = 0x01AE, .pme_flags = 0, }, { .pme_name = "ITLB_MISSES", .pme_desc = "Instruction TLB misses", .pme_code = 0x85, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "ITLB miss", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "WALK_COMPLETED", .pme_udesc = "ITLB miss page walks", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "STLB_HIT", .pme_udesc = "Counts the number of ITLB misses that hit in the second level TLB", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "PDE_MISS", .pme_udesc = "Number of ITLB misses where the low part of the linear to physical address translation was missed", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "PDP_MISS", .pme_udesc = "Number of ITLB misses where the high part of the linear to physical address translation was missed", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "LARGE_WALK_COMPLETED", .pme_udesc = "Counts number of completed large page walks due to misses in the STLB", .pme_ucode = 0x80, .pme_uflags = 0, }, }, .pme_numasks = 6 }, { .pme_name = "ITLB_MISS_RETIRED", .pme_desc = "Retired instructions that missed the ITLB (Precise Event)", .pme_code = 0x20C8, .pme_flags = PFMLIB_NHM_PEBS, }, { .pme_name = "L1D", .pme_desc = "L1D cache", .pme_code = 0x51, .pme_flags = PFMLIB_NHM_PMC01|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "M_EVICT", .pme_udesc = "L1D cache lines replaced in M state", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "M_REPL", .pme_udesc = "L1D cache lines allocated in the M state", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "M_SNOOP_EVICT", .pme_udesc = "L1D snoop eviction of cache lines in M state", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "REPL", 
.pme_udesc = "L1 data cache lines allocated", .pme_ucode = 0x01, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "L1D_ALL_REF", .pme_desc = "L1D references", .pme_code = 0x43, .pme_flags = PFMLIB_NHM_PMC01|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "All references to the L1 data cache", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "CACHEABLE", .pme_udesc = "L1 data cacheable reads and writes", .pme_ucode = 0x02, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "L1D_CACHE_LD", .pme_desc = "L1D cacheable loads. WARNING: event may overcount loads", .pme_code = 0x40, .pme_flags = PFMLIB_NHM_PMC01|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "E_STATE", .pme_udesc = "L1 data cache read in E state", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "I_STATE", .pme_udesc = "L1 data cache read in I state (misses)", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "M_STATE", .pme_udesc = "L1 data cache read in M state", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "MESI", .pme_udesc = "L1 data cache reads", .pme_ucode = 0x0F, .pme_uflags = 0, }, { .pme_uname = "S_STATE", .pme_udesc = "L1 data cache read in S state", .pme_ucode = 0x02, .pme_uflags = 0, }, }, .pme_numasks = 5 }, { .pme_name = "L1D_CACHE_LOCK", .pme_desc = "L1 data cache load lock", .pme_code = 0x42, .pme_flags = PFMLIB_NHM_PMC01|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "E_STATE", .pme_udesc = "L1 data cache load locks in E state", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "HIT", .pme_udesc = "L1 data cache load lock hits. WARNING: overcounts by 3x", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "M_STATE", .pme_udesc = "L1 data cache load locks in M state. 
WARNING: overcounts by 3x", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "S_STATE", .pme_udesc = "L1 data cache load locks in S state", .pme_ucode = 0x02, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "L1D_CACHE_LOCK_FB_HIT", .pme_desc = "L1D load lock accepted in fill buffer", .pme_code = 0x0153, .pme_flags = PFMLIB_NHM_PMC01, }, { .pme_name = "L1D_CACHE_PREFETCH_LOCK_FB_HIT", .pme_desc = "L1D prefetch load lock accepted in fill buffer", .pme_code = 0x0152, .pme_flags = PFMLIB_NHM_PMC01, }, { .pme_name = "L1D_CACHE_ST", .pme_desc = "L1 data cache stores", .pme_code = 0x41, .pme_flags = PFMLIB_NHM_PMC01|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "E_STATE", .pme_udesc = "L1 data cache stores in E state", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "I_STATE", .pme_udesc = "L1 data cache store in the I state", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "M_STATE", .pme_udesc = "L1 data cache stores in M state", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "S_STATE", .pme_udesc = "L1 data cache stores in S state", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "MESI", .pme_udesc = "L1 data cache store in all states", .pme_ucode = 0x0F, .pme_uflags = 0, }, }, .pme_numasks = 5 }, { .pme_name = "L1D_PREFETCH", .pme_desc = "L1D hardware prefetch", .pme_code = 0x4E, .pme_flags = PFMLIB_NHM_PMC01|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "MISS", .pme_udesc = "L1D hardware prefetch misses", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "REQUESTS", .pme_udesc = "L1D hardware prefetch requests", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "TRIGGERS", .pme_udesc = "L1D hardware prefetch requests triggered", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "L1D_WB_L2", .pme_desc = "L1 writebacks to L2", .pme_code = 0x28, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "E_STATE", .pme_udesc = "L1 writebacks to L2 in E state", 
.pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "I_STATE", .pme_udesc = "L1 writebacks to L2 in I state (misses)", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "M_STATE", .pme_udesc = "L1 writebacks to L2 in M state", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "S_STATE", .pme_udesc = "L1 writebacks to L2 in S state", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "MESI", .pme_udesc = "All L1 writebacks to L2", .pme_ucode = 0x0F, .pme_uflags = 0, }, }, .pme_numasks = 5 }, { .pme_name = "L1I", .pme_desc = "L1I instruction fetches", .pme_code = 0x80, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "CYCLES_STALLED", .pme_udesc = "L1I instruction fetch stall cycles", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "HITS", .pme_udesc = "L1I instruction fetch hits", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "MISSES", .pme_udesc = "L1I instruction fetch misses", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "READS", .pme_udesc = "L1I Instruction fetches", .pme_ucode = 0x03, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "L1I_OPPORTUNISTIC_HITS", .pme_desc = "Opportunistic hits in streaming", .pme_code = 0x0183, .pme_flags = 0, }, { .pme_name = "L2_DATA_RQSTS", .pme_desc = "L2 data requests", .pme_code = 0x26, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "All L2 data requests", .pme_ucode = 0xFF, .pme_uflags = 0, }, { .pme_uname = "DEMAND_E_STATE", .pme_udesc = "L2 data demand loads in E state", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "DEMAND_I_STATE", .pme_udesc = "L2 data demand loads in I state (misses)", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "DEMAND_M_STATE", .pme_udesc = "L2 data demand loads in M state", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "DEMAND_MESI", .pme_udesc = "L2 data demand requests", .pme_ucode = 0x0F, .pme_uflags = 0, }, { .pme_uname = "DEMAND_S_STATE", .pme_udesc = "L2 data 
demand loads in S state", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_E_STATE", .pme_udesc = "L2 data prefetches in E state", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_I_STATE", .pme_udesc = "L2 data prefetches in the I state (misses)", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_M_STATE", .pme_udesc = "L2 data prefetches in M state", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_MESI", .pme_udesc = "All L2 data prefetches", .pme_ucode = 0xF0, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_S_STATE", .pme_udesc = "L2 data prefetches in the S state", .pme_ucode = 0x20, .pme_uflags = 0, }, }, .pme_numasks = 11 }, { .pme_name = "L2_HW_PREFETCH", .pme_desc = "L2 HW prefetches", .pme_code = 0xF3, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "HIT", .pme_udesc = "Count L2 HW prefetcher detector hits", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "ALLOC", .pme_udesc = "Count L2 HW prefetcher allocations", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "DATA_TRIGGER", .pme_udesc = "Count L2 HW data prefetcher triggered", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "CODE_TRIGGER", .pme_udesc = "Count L2 HW code prefetcher triggered", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "DCA_TRIGGER", .pme_udesc = "Count L2 HW DCA prefetcher triggered", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "KICK_START", .pme_udesc = "Count L2 HW prefetcher kick started", .pme_ucode = 0x20, .pme_uflags = 0, }, }, .pme_numasks = 6 }, { .pme_name = "L2_LINES_IN", .pme_desc = "L2 lines allocated", .pme_code = 0xF1, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "any L2 lines allocated", .pme_ucode = 0x07, .pme_uflags = 0, }, { .pme_uname = "E_STATE", .pme_udesc = "L2 lines allocated in the E state", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "S_STATE", .pme_udesc = "L2 lines allocated in the S state", 
.pme_ucode = 0x02, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "L2_LINES_OUT", .pme_desc = "L2 lines evicted", .pme_code = 0xF2, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "L2 lines evicted", .pme_ucode = 0x0F, .pme_uflags = 0, }, { .pme_uname = "DEMAND_CLEAN", .pme_udesc = "L2 lines evicted by a demand request", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "DEMAND_DIRTY", .pme_udesc = "L2 modified lines evicted by a demand request", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_CLEAN", .pme_udesc = "L2 lines evicted by a prefetch request", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_DIRTY", .pme_udesc = "L2 modified lines evicted by a prefetch request", .pme_ucode = 0x08, .pme_uflags = 0, }, }, .pme_numasks = 5 }, { .pme_name = "L2_RQSTS", .pme_desc = "L2 requests", .pme_code = 0x24, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "MISS", .pme_udesc = "All L2 misses", .pme_ucode = 0xAA, .pme_uflags = 0, }, { .pme_uname = "REFERENCES", .pme_udesc = "All L2 requests", .pme_ucode = 0xFF, .pme_uflags = 0, }, { .pme_uname = "IFETCH_HIT", .pme_udesc = "L2 instruction fetch hits", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "IFETCH_MISS", .pme_udesc = "L2 instruction fetch misses", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "IFETCHES", .pme_udesc = "L2 instruction fetches", .pme_ucode = 0x30, .pme_uflags = 0, }, { .pme_uname = "LD_HIT", .pme_udesc = "L2 load hits", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "LD_MISS", .pme_udesc = "L2 load misses", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "LOADS", .pme_udesc = "L2 requests", .pme_ucode = 0x03, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_HIT", .pme_udesc = "L2 prefetch hits", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_MISS", .pme_udesc = "L2 prefetch misses", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "PREFETCHES", 
.pme_udesc = "All L2 prefetches", .pme_ucode = 0xC0, .pme_uflags = 0, }, { .pme_uname = "RFO_HIT", .pme_udesc = "L2 RFO hits", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "RFO_MISS", .pme_udesc = "L2 RFO misses", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "RFOS", .pme_udesc = "L2 RFO requests", .pme_ucode = 0x0C, .pme_uflags = 0, }, }, .pme_numasks = 14 }, { .pme_name = "L2_TRANSACTIONS", .pme_desc = "L2 transactions", .pme_code = 0xF0, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "All L2 transactions", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "FILL", .pme_udesc = "L2 fill transactions", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "IFETCH", .pme_udesc = "L2 instruction fetch transactions", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "L1D_WB", .pme_udesc = "L1D writeback to L2 transactions", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "LOAD", .pme_udesc = "L2 Load transactions", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "PREFETCH", .pme_udesc = "L2 prefetch transactions", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "RFO", .pme_udesc = "L2 RFO transactions", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "WB", .pme_udesc = "L2 writeback to LLC transactions", .pme_ucode = 0x40, .pme_uflags = 0, }, }, .pme_numasks = 8 }, { .pme_name = "L2_WRITE", .pme_desc = "L2 demand lock/store RFO", .pme_code = 0x27, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "LOCK_E_STATE", .pme_udesc = "L2 demand lock RFOs in E state", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "LOCK_I_STATE", .pme_udesc = "L2 demand lock RFOs in I state (misses)", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "LOCK_S_STATE", .pme_udesc = "L2 demand lock RFOs in S state", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "LOCK_HIT", .pme_udesc = "All demand L2 lock RFOs that hit the cache", .pme_ucode = 0xE0, .pme_uflags = 0, }, { 
.pme_uname = "LOCK_M_STATE", .pme_udesc = "L2 demand lock RFOs in M state", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "LOCK_MESI", .pme_udesc = "All demand L2 lock RFOs", .pme_ucode = 0xF0, .pme_uflags = 0, }, { .pme_uname = "RFO_HIT", .pme_udesc = "All L2 demand store RFOs that hit the cache", .pme_ucode = 0x0E, .pme_uflags = 0, }, { .pme_uname = "RFO_E_STATE", .pme_udesc = "L2 demand store RFOs in the E state (exclusive)", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "RFO_I_STATE", .pme_udesc = "L2 demand store RFOs in I state (misses)", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "RFO_M_STATE", .pme_udesc = "L2 demand store RFOs in M state", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "RFO_MESI", .pme_udesc = "All L2 demand store RFOs", .pme_ucode = 0x0F, .pme_uflags = 0, }, { .pme_uname = "RFO_S_STATE", .pme_udesc = "L2 demand store RFOs in S state", .pme_ucode = 0x02, .pme_uflags = 0, }, }, .pme_numasks = 12 }, { .pme_name = "LARGE_ITLB", .pme_desc = "Large instruction TLB", .pme_code = 0x82, .pme_flags = 0, .pme_umasks = { { .pme_uname = "HIT", .pme_udesc = "Large ITLB hit", .pme_ucode = 0x01, .pme_uflags = 0, }, }, .pme_numasks = 1 }, { .pme_name = "LOAD_DISPATCH", .pme_desc = "Loads dispatched", .pme_code = 0x13, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "All loads dispatched", .pme_ucode = 0x07, .pme_uflags = 0, }, { .pme_uname = "MOB", .pme_udesc = "Loads dispatched from the MOB", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "RS", .pme_udesc = "Loads dispatched that bypass the MOB", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "RS_DELAYED", .pme_udesc = "Loads dispatched from stage 305", .pme_ucode = 0x02, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "LOAD_HIT_PRE", .pme_desc = "Load operations conflicting with software prefetches", .pme_code = 0x014C, .pme_flags = PFMLIB_NHM_PMC01, }, { .pme_name = "LONGEST_LAT_CACHE", .pme_desc = 
"Longest latency cache reference", .pme_code = 0x2E, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "REFERENCE", .pme_udesc = "Longest latency cache reference", .pme_ucode = 0x4F, .pme_uflags = 0, }, { .pme_uname = "MISS", .pme_udesc = "Longest latency cache miss", .pme_ucode = 0x41, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "LSD", .pme_desc = "Loop stream detector", .pme_code = 0xA8, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ACTIVE", .pme_udesc = "Cycles when uops were delivered by the LSD", .pme_ucode = 0x01 | (1<<16), .pme_uflags = 0, }, { .pme_uname = "INACTIVE", .pme_udesc = "Cycles no uops were delivered by the LSD", .pme_ucode = 0x01 | (1<<16)|(1<<15), .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "MACHINE_CLEARS", .pme_desc = "Machine Clear", .pme_code = 0xC3, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "SMC", .pme_udesc = "Self-Modifying Code detected", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "CYCLES", .pme_udesc = "Cycles machine clear asserted", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "MEM_ORDER", .pme_udesc = "Execution pipeline restart due to Memory ordering conflicts", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "FUSION_ASSIST", .pme_udesc = "Counts the number of macro-fusion assists", .pme_ucode = 0x10, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "MACRO_INSTS", .pme_desc = "Macro-fused instructions", .pme_code = 0xD0, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "DECODED", .pme_udesc = "Instructions decoded", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "FUSIONS_DECODED", .pme_udesc = "Macro-fused instructions decoded", .pme_ucode = 0x01, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "MEMORY_DISAMBIGUATION", .pme_desc = "Memory Disambiguation Activity", .pme_code = 0x09, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "RESET", .pme_udesc 
= "Counts memory disambiguation reset cycles", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "WATCHDOG", .pme_udesc = "Counts the number of times the memory disambiguation watchdog kicked in", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "WATCH_CYCLES", .pme_udesc = "Counts the cycles that the memory disambiguation watchdog is active", .pme_ucode = 0x08, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "MEM_INST_RETIRED", .pme_desc = "Memory instructions retired", .pme_code = 0x0B, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "LATENCY_ABOVE_THRESHOLD", .pme_udesc = "Memory instructions retired above programmed clocks, minimum value threshold is 4, requires PEBS", .pme_ucode = 0x10, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "LOADS", .pme_udesc = "Instructions retired which contain a load (Precise Event)", .pme_ucode = 0x01, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "STORES", .pme_udesc = "Instructions retired which contain a store (Precise Event)", .pme_ucode = 0x02, .pme_uflags = PFMLIB_NHM_PEBS, }, }, .pme_numasks = 3 }, { .pme_name = "MEM_LOAD_RETIRED", .pme_desc = "Retired loads", .pme_code = 0xCB, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "DTLB_MISS", .pme_udesc = "Retired loads that miss the DTLB (Precise Event)", .pme_ucode = 0x80, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "HIT_LFB", .pme_udesc = "Retired loads that miss L1D and hit a previously allocated LFB (Precise Event)", .pme_ucode = 0x40, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "L1D_HIT", .pme_udesc = "Retired loads that hit the L1 data cache (Precise Event)", .pme_ucode = 0x01, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "L2_HIT", .pme_udesc = "Retired loads that hit the L2 cache (Precise Event)", .pme_ucode = 0x02, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "L3_MISS", .pme_udesc = "Retired loads that miss the LLC cache (Precise Event)", .pme_ucode = 0x10, .pme_uflags =
PFMLIB_NHM_PEBS, }, { .pme_uname = "LLC_MISS", .pme_udesc = "This is an alias for L3_MISS", .pme_ucode = 0x10, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "L3_UNSHARED_HIT", .pme_udesc = "Retired loads that hit valid versions in the LLC cache (Precise Event)", .pme_ucode = 0x04, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "LLC_UNSHARED_HIT", .pme_udesc = "This is an alias for L3_UNSHARED_HIT", .pme_ucode = 0x04, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "OTHER_CORE_L2_HIT_HITM", .pme_udesc = "Retired loads that hit sibling core's L2 in modified or unmodified states (Precise Event)", .pme_ucode = 0x08, .pme_uflags = PFMLIB_NHM_PEBS, }, }, .pme_numasks = 9 }, { .pme_name = "MEM_STORE_RETIRED", .pme_desc = "Retired stores", .pme_code = 0x0C, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "DTLB_MISS", .pme_udesc = "Retired stores that miss the DTLB (Precise Event)", .pme_ucode = 0x01, .pme_uflags = PFMLIB_NHM_PEBS, }, }, .pme_numasks = 1 }, { .pme_name = "MEM_UNCORE_RETIRED", .pme_desc = "Load instructions retired which hit offcore", .pme_code = 0x0F, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "OTHER_CORE_L2_HITM", .pme_udesc = "Load instructions retired that HIT modified data in sibling core (Precise Event)", .pme_ucode = 0x02, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "REMOTE_CACHE_LOCAL_HOME_HIT", .pme_udesc = "Load instructions retired remote cache HIT data source (Precise Event)", .pme_ucode = 0x08, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "REMOTE_DRAM", .pme_udesc = "Load instructions retired remote DRAM and remote home-remote cache HITM (Precise Event)", .pme_ucode = 0x10, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "LOCAL_DRAM", .pme_udesc = "Load instructions retired with a data source of local DRAM or locally homed remote hitm (Precise Event)", .pme_ucode = 0x20, .pme_uflags = PFMLIB_NHM_PEBS, }, /* Model 46 only (must be after common umasks) */ { .pme_uname = 
"L3_DATA_MISS_UNKNOWN", .pme_udesc = "Load instructions retired where the memory reference missed L3 and data source is unknown (Model 46 only, Precise Event)", .pme_ucode = 0x01, .pme_umodel = 46, .pme_uflags = PFMLIB_NHM_PEBS, }, /* Model 46 only (must be after common umasks) */ { .pme_uname = "UNCACHEABLE", .pme_udesc = "Load instructions retired where the memory reference missed L1, L2, L3 caches and to perform I/O (Model 46 only, Precise Event)", .pme_ucode = 0x80, .pme_umodel = 46, .pme_uflags = PFMLIB_NHM_PEBS, }, }, .pme_numasks = 6 /* patched at runtime for model 46 */ }, { .pme_name = "OFFCORE_REQUESTS", .pme_desc = "Offcore memory requests", .pme_code = 0xB0, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "All offcore requests", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "ANY_READ", .pme_udesc = "Offcore read requests", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "ANY_RFO", .pme_udesc = "Offcore RFO requests", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "DEMAND_READ_CODE", .pme_udesc = "Counts number of offcore demand code read requests. 
Does not count L2 prefetch requests.", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "DEMAND_READ_DATA", .pme_udesc = "Offcore demand data read requests", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "DEMAND_RFO", .pme_udesc = "Offcore demand RFO requests", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "L1D_WRITEBACK", .pme_udesc = "Offcore L1 data cache writebacks", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "UNCACHED_MEM", .pme_udesc = "Counts number of offcore uncached memory requests", .pme_ucode = 0x20, .pme_uflags = 0, }, }, .pme_numasks = 8 }, { .pme_name = "OFFCORE_REQUESTS_SQ_FULL", .pme_desc = "Counts cycles the Offcore Request buffer or Super Queue is full.", .pme_code = 0x01B2, .pme_flags = 0, }, { .pme_name = "PARTIAL_ADDRESS_ALIAS", .pme_desc = "False dependencies due to partial address aliasing", .pme_code = 0x0107, .pme_flags = 0, }, { .pme_name = "PIC_ACCESSES", .pme_desc = "Programmable interrupt controller", .pme_code = 0xBA, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "TPR_READS", .pme_udesc = "Counts number of TPR reads", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "TPR_WRITES", .pme_udesc = "Counts number of TPR writes", .pme_ucode = 0x02, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "RAT_STALLS", .pme_desc = "Register allocation table stalls", .pme_code = 0xD2, .pme_flags = 0, .pme_umasks = { { .pme_uname = "FLAGS", .pme_udesc = "Flag stall cycles", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "REGISTERS", .pme_udesc = "Partial register stall cycles", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "ROB_READ_PORT", .pme_udesc = "ROB read port stalls cycles", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "SCOREBOARD", .pme_udesc = "Scoreboard stall cycles", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "ANY", .pme_udesc = "All RAT stall cycles", .pme_ucode = 0x0F, .pme_uflags = 0, }, }, .pme_numasks = 5 }, { .pme_name = 
"RESOURCE_STALLS", .pme_desc = "Processor stalls", .pme_code = 0xA2, .pme_flags = 0, .pme_umasks = { { .pme_uname = "FPCW", .pme_udesc = "FPU control word write stall cycles", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "LOAD", .pme_udesc = "Load buffer stall cycles", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "MXCSR", .pme_udesc = "MXCSR rename stall cycles", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "RS_FULL", .pme_udesc = "Reservation Station full stall cycles", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "STORE", .pme_udesc = "Store buffer stall cycles", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "OTHER", .pme_udesc = "Other Resource related stall cycles", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "ROB_FULL", .pme_udesc = "ROB full stall cycles", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "ANY", .pme_udesc = "Resource related stall cycles", .pme_ucode = 0x01, .pme_uflags = 0, }, }, .pme_numasks = 8 }, { .pme_name = "SEG_RENAME_STALLS", .pme_desc = "Segment rename stall cycles", .pme_code = 0x01D4, .pme_flags = 0, }, { .pme_name = "SEGMENT_REG_LOADS", .pme_desc = "Counts number of segment register loads", .pme_code = 0x01F8, .pme_flags = 0, }, { .pme_name = "SIMD_INT_128", .pme_desc = "128 bit SIMD integer operations", .pme_code = 0x12, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "PACK", .pme_udesc = "128 bit SIMD integer pack operations", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "PACKED_ARITH", .pme_udesc = "128 bit SIMD integer arithmetic operations", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "PACKED_LOGICAL", .pme_udesc = "128 bit SIMD integer logical operations", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "PACKED_MPY", .pme_udesc = "128 bit SIMD integer multiply operations", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "PACKED_SHIFT", .pme_udesc = "128 bit SIMD integer shift operations", .pme_ucode = 0x02, 
.pme_uflags = 0, }, { .pme_uname = "SHUFFLE_MOVE", .pme_udesc = "128 bit SIMD integer shuffle/move operations", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "UNPACK", .pme_udesc = "128 bit SIMD integer unpack operations", .pme_ucode = 0x08, .pme_uflags = 0, }, }, .pme_numasks = 7 }, { .pme_name = "SIMD_INT_64", .pme_desc = "64 bit SIMD integer operations", .pme_code = 0xFD, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "PACK", .pme_udesc = "SIMD integer 64 bit pack operations", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "PACKED_ARITH", .pme_udesc = "SIMD integer 64 bit arithmetic operations", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "PACKED_LOGICAL", .pme_udesc = "SIMD integer 64 bit logical operations", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "PACKED_MPY", .pme_udesc = "SIMD integer 64 bit packed multiply operations", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "PACKED_SHIFT", .pme_udesc = "SIMD integer 64 bit shift operations", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "SHUFFLE_MOVE", .pme_udesc = "SIMD integer 64 bit shuffle/move operations", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "UNPACK", .pme_udesc = "SIMD integer 64 bit unpack operations", .pme_ucode = 0x08, .pme_uflags = 0, }, }, .pme_numasks = 7 }, { .pme_name = "SNOOP_RESPONSE", .pme_desc = "Snoop", .pme_code = 0xB8, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "HIT", .pme_udesc = "Thread responded HIT to snoop", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "HITE", .pme_udesc = "Thread responded HITE to snoop", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "HITM", .pme_udesc = "Thread responded HITM to snoop", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "SQ_FULL_STALL_CYCLES", .pme_desc = "Counts cycles the Offcore Request buffer or Super Queue is full and request(s) are outstanding.", .pme_code = 0x01F6, .pme_flags = 0, }, { 
.pme_name = "SQ_MISC", .pme_desc = "Super Queue Activity Related to L2 Cache Access", .pme_code = 0xF4, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "PROMOTION", .pme_udesc = "Counts the number of L2 secondary misses that hit the Super Queue", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "PROMOTION_POST_GO", .pme_udesc = "Counts the number of L2 secondary misses during the Super Queue filling L2", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "LRU_HINTS", .pme_udesc = "Counts number of Super Queue LRU hints sent to L3", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "FILL_DROPPED", .pme_udesc = "Counts the number of SQ L2 fills dropped due to L2 busy", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "SPLIT_LOCK", .pme_udesc = "Super Queue lock splits across a cache line", .pme_ucode = 0x10, .pme_uflags = 0, }, }, .pme_numasks = 5 }, { .pme_name = "SSE_MEM_EXEC", .pme_desc = "Streaming SIMD executed", .pme_code = 0x4B, .pme_flags = 0, .pme_umasks = { { .pme_uname = "NTA", .pme_udesc = "Streaming SIMD L1D NTA prefetch miss", .pme_ucode = 0x01, .pme_uflags = 0, }, }, .pme_numasks = 1 }, { .pme_name = "SSEX_UOPS_RETIRED", .pme_desc = "SIMD micro-ops retired", .pme_code = 0xC7, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "PACKED_DOUBLE", .pme_udesc = "SIMD Packed-Double Uops retired (Precise Event)", .pme_ucode = 0x04, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "PACKED_SINGLE", .pme_udesc = "SIMD Packed-Single Uops retired (Precise Event)", .pme_ucode = 0x01, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "SCALAR_DOUBLE", .pme_udesc = "SIMD Scalar-Double Uops retired (Precise Event)", .pme_ucode = 0x08, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "SCALAR_SINGLE", .pme_udesc = "SIMD Scalar-Single Uops retired (Precise Event)", .pme_ucode = 0x02, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "VECTOR_INTEGER", .pme_udesc = "SIMD Vector Integer Uops retired (Precise Event)", 
.pme_ucode = 0x10, .pme_uflags = PFMLIB_NHM_PEBS, }, }, .pme_numasks = 5 }, { .pme_name = "STORE_BLOCKS", .pme_desc = "Delayed loads", .pme_code = 0x06, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "AT_RET", .pme_udesc = "Loads delayed with at-Retirement block code", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "L1D_BLOCK", .pme_udesc = "Cacheable loads delayed with L1D block code", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "NOT_STA", .pme_udesc = "Loads delayed due to a store blocked for unknown data", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "STA", .pme_udesc = "Loads delayed due to a store blocked for an unknown address", .pme_ucode = 0x02, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "TWO_UOP_INSTS_DECODED", .pme_desc = "Two micro-ops instructions decoded", .pme_code = 0x0119, .pme_flags = 0, }, { .pme_name = "UOPS_DECODED_DEC0", .pme_desc = "Micro-ops decoded by decoder 0", .pme_code = 0x013D, .pme_flags = 0, }, { .pme_name = "UOPS_DECODED", .pme_desc = "Micro-ops decoded", .pme_code = 0xD1, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ESP_FOLDING", .pme_udesc = "Stack pointer instructions decoded", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "ESP_SYNC", .pme_udesc = "Stack pointer sync operations", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "MS", .pme_udesc = "Uops decoded by Microcode Sequencer", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "MS_CYCLES_ACTIVE", .pme_udesc = "cycles in which at least one uop is decoded by Microcode Sequencer", .pme_ucode = 0x2 | (1<< 16), /* counter-mask = 1 */ }, }, .pme_numasks = 4 }, { .pme_name = "UOPS_EXECUTED", .pme_desc = "Micro-ops executed", .pme_code = 0xB1, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "PORT0", .pme_udesc = "Uops executed on port 0", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "PORT1", .pme_udesc = "Uops executed on port 1", .pme_ucode = 0x02, 
.pme_uflags = 0, }, { .pme_uname = "PORT2_CORE", .pme_udesc = "Uops executed on port 2 (core count only)", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "PORT3_CORE", .pme_udesc = "Uops executed on port 3 (core count only)", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "PORT4_CORE", .pme_udesc = "Uops executed on port 4 (core count only)", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "PORT5", .pme_udesc = "Uops executed on port 5", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "PORT015", .pme_udesc = "Uops issued on ports 0, 1 or 5", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "PORT234_CORE", .pme_udesc = "Uops issued on ports 2, 3 or 4 (core count only)", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "PORT015_STALL_CYCLES", .pme_udesc = "Cycles no Uops issued on ports 0, 1 or 5", .pme_ucode = 0x40 | (1<<16) | (1<<15), /* counter-mask=1, inv=1 */ .pme_uflags = 0, }, }, .pme_numasks = 9 }, { .pme_name = "UOPS_ISSUED", .pme_desc = "Micro-ops issued", .pme_code = 0x0E, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "Uops issued", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "STALLED_CYCLES", .pme_udesc = "Cycles stalled no issued uops", .pme_ucode = 0x01 | (1<<16) | (1<<15), /* counter-mask=1, inv=1 */ .pme_uflags = 0, }, { .pme_uname = "FUSED", .pme_udesc = "Fused Uops issued", .pme_ucode = 0x02, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "UOPS_RETIRED", .pme_desc = "Micro-ops retired", .pme_code = 0xC2, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "Uops retired (Precise Event)", .pme_ucode = 0x01, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "RETIRE_SLOTS", .pme_udesc = "Retirement slots used (Precise Event)", .pme_ucode = 0x02, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "ACTIVE_CYCLES", .pme_udesc = "Cycles Uops are being retired (Precise Event)", .pme_ucode = 0x01 | (1<< 16), /* counter mask = 1 
*/ .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "STALL_CYCLES", .pme_udesc = "Cycles No Uops retired (Precise Event)", .pme_ucode = 0x01 | (1<<16) | (1<<15), /* counter-mask=1, inv=1 */ .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "MACRO_FUSED", .pme_udesc = "Macro-fused Uops retired (Precise Event)", .pme_ucode = 0x04, .pme_uflags = PFMLIB_NHM_PEBS, }, }, .pme_numasks = 5 }, { .pme_name = "UOP_UNFUSION", .pme_desc = "Micro-ops unfusions due to FP exceptions", .pme_code = 0x01DB, .pme_flags = 0, }, /* * BEGIN OFFCORE_RESPONSE */ { .pme_name = "OFFCORE_RESPONSE_0", .pme_desc = "Offcore response", .pme_code = 0x01B7, .pme_flags = PFMLIB_NHM_OFFCORE_RSP0, .pme_umasks = { { .pme_uname = "DMND_DATA_RD", .pme_udesc = "Request. Counts the number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "DMND_RFO", .pme_udesc = "Request. Counts the number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "DMND_IFETCH", .pme_udesc = "Request. Counts the number of demand and DCU prefetch instruction cacheline reads. Does not count L2 code read prefetches", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "WB", .pme_udesc = "Request. Counts the number of writeback (modified to exclusive) transactions", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "PF_DATA_RD", .pme_udesc = "Request. Counts the number of data cacheline reads generated by L2 prefetchers", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "PF_RFO", .pme_udesc = "Request. Counts the number of RFO requests generated by L2 prefetchers", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "PF_IFETCH", .pme_udesc = "Request. 
Counts the number of code reads generated by L2 prefetchers", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "OTHER", .pme_udesc = "Request. Counts one of the following transaction types, including L3 invalidate, I/O, full or partial writes, WC or non-temporal stores, CLFLUSH, Fences, lock, unlock, split lock", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "ANY_REQUEST", .pme_udesc = "Request. Counts any request type", .pme_ucode = 0xff, .pme_uflags = 0, }, { .pme_uname = "UNCORE_HIT", .pme_udesc = "Response. Counts L3 Hit: local or remote home requests that hit L3 cache in the uncore with no coherency actions required (snooping)", .pme_ucode = 0x100, .pme_uflags = 0, }, { .pme_uname = "OTHER_CORE_HIT_SNP", .pme_udesc = "Response. Counts L3 Hit: local or remote home requests that hit L3 cache in the uncore and were serviced by another core with a cross core snoop where no modified copies were found (clean)", .pme_ucode = 0x200, .pme_uflags = 0, }, { .pme_uname = "OTHER_CORE_HITM", .pme_udesc = "Response. Counts L3 Hit: local or remote home requests that hit L3 cache in the uncore and were serviced by another core with a cross core snoop where modified copies were found (HITM)", .pme_ucode = 0x400, .pme_uflags = 0, }, { .pme_uname = "REMOTE_CACHE_FWD", .pme_udesc = "Response. Counts L3 Miss: local homed requests that missed the L3 cache and were serviced by forwarded data following a cross package snoop where no modified copies were found. (Remote home requests are not counted)", .pme_ucode = 0x1000, .pme_uflags = 0, }, { .pme_uname = "REMOTE_DRAM", .pme_udesc = "Response. Counts L3 Miss: remote home requests that missed the L3 cache and were serviced by remote DRAM", .pme_ucode = 0x2000, .pme_uflags = 0, }, { .pme_uname = "LOCAL_DRAM", .pme_udesc = "Response. Counts L3 Miss: local home requests that missed the L3 cache and were serviced by local DRAM", .pme_ucode = 0x4000, .pme_uflags = 0, }, { .pme_uname = "NON_DRAM", .pme_udesc = "Response.
Non-DRAM requests that were serviced by IOH", .pme_ucode = 0x8000, .pme_uflags = 0, }, { .pme_uname = "ANY_RESPONSE", .pme_udesc = "Response. Counts any response type", .pme_ucode = 0xf700, .pme_uflags = 0, }, }, .pme_numasks = 17 } }; #define PME_COREI7_UNHALTED_CORE_CYCLES 0 #define PME_COREI7_INSTRUCTIONS_RETIRED 1 #define PME_COREI7_EVENT_COUNT (sizeof(corei7_pe)/sizeof(pme_nhm_entry_t)) papi-papi-7-2-0-t/src/libperfnec/lib/intel_corei7_unc_events.h /* * Copyright (c) 2008 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux.
*/ static pme_nhm_entry_t corei7_unc_pe[]={ /* * BEGIN uncore events */ { .pme_name = "UNC_CLK_UNHALTED", .pme_desc = "Uncore clockticks.", .pme_code = 0x0000, .pme_flags = PFMLIB_NHM_UNC_FIXED, }, { .pme_name = "UNC_DRAM_OPEN", .pme_desc = "DRAM open commands issued for read or write", .pme_code = 0x60, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "DRAM Channel 0 open commands issued for read or write", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "CH1", .pme_udesc = "DRAM Channel 1 open commands issued for read or write", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CH2", .pme_udesc = "DRAM Channel 2 open commands issued for read or write", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_DRAM_PAGE_CLOSE", .pme_desc = "DRAM page close due to idle timer expiration", .pme_code = 0x61, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "DRAM Channel 0 page close", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "CH1", .pme_udesc = "DRAM Channel 1 page close", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CH2", .pme_udesc = "DRAM Channel 2 page close", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_DRAM_PAGE_MISS", .pme_desc = "DRAM Channel 0 page miss", .pme_code = 0x62, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "DRAM Channel 0 page miss", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "CH1", .pme_udesc = "DRAM Channel 1 page miss", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CH2", .pme_udesc = "DRAM Channel 2 page miss", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_DRAM_PRE_ALL", .pme_desc = "DRAM Channel 0 precharge all commands", .pme_code = 0x66, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "DRAM Channel 0 precharge all commands", .pme_ucode = 0x01, .pme_uflags = 0, }, { 
.pme_uname = "CH1", .pme_udesc = "DRAM Channel 1 precharge all commands", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CH2", .pme_udesc = "DRAM Channel 2 precharge all commands", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_DRAM_READ_CAS", .pme_desc = "DRAM Channel 0 read CAS commands", .pme_code = 0x63, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "DRAM Channel 0 read CAS commands", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "AUTOPRE_CH0", .pme_udesc = "DRAM Channel 0 read CAS auto page close commands", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CH1", .pme_udesc = "DRAM Channel 1 read CAS commands", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "AUTOPRE_CH1", .pme_udesc = "DRAM Channel 1 read CAS auto page close commands", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "CH2", .pme_udesc = "DRAM Channel 2 read CAS commands", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "AUTOPRE_CH2", .pme_udesc = "DRAM Channel 2 read CAS auto page close commands", .pme_ucode = 0x20, .pme_uflags = 0, }, }, .pme_numasks = 6 }, { .pme_name = "UNC_DRAM_REFRESH", .pme_desc = "DRAM Channel 0 refresh commands", .pme_code = 0x65, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "DRAM Channel 0 refresh commands", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "CH1", .pme_udesc = "DRAM Channel 1 refresh commands", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CH2", .pme_udesc = "DRAM Channel 2 refresh commands", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_DRAM_WRITE_CAS", .pme_desc = "DRAM Channel 0 write CAS commands", .pme_code = 0x64, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "DRAM Channel 0 write CAS commands", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "AUTOPRE_CH0", .pme_udesc = "DRAM Channel 0 write CAS auto page close 
commands", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CH1", .pme_udesc = "DRAM Channel 1 write CAS commands", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "AUTOPRE_CH1", .pme_udesc = "DRAM Channel 1 write CAS auto page close commands", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "CH2", .pme_udesc = "DRAM Channel 2 write CAS commands", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "AUTOPRE_CH2", .pme_udesc = "DRAM Channel 2 write CAS auto page close commands", .pme_ucode = 0x20, .pme_uflags = 0, }, }, .pme_numasks = 6 }, { .pme_name = "UNC_GQ_ALLOC", .pme_desc = "GQ read tracker requests", .pme_code = 0x03, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "READ_TRACKER", .pme_udesc = "GQ read tracker requests", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "RT_LLC_MISS", .pme_udesc = "GQ read tracker LLC misses", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "RT_TO_LLC_RESP", .pme_udesc = "GQ read tracker LLC requests", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "RT_TO_RTID_ACQUIRED", .pme_udesc = "GQ read tracker LLC miss to RTID acquired", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "WT_TO_RTID_ACQUIRED", .pme_udesc = "GQ write tracker LLC miss to RTID acquired", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "WRITE_TRACKER", .pme_udesc = "GQ write tracker LLC misses", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "PEER_PROBE_TRACKER", .pme_udesc = "GQ peer probe tracker requests", .pme_ucode = 0x40, .pme_uflags = 0, }, }, .pme_numasks = 7 }, { .pme_name = "UNC_GQ_CYCLES_FULL", .pme_desc = "Cycles GQ read tracker is full.", .pme_code = 0x00, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "READ_TRACKER", .pme_udesc = "Cycles GQ read tracker is full.", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "WRITE_TRACKER", .pme_udesc = "Cycles GQ write tracker is full.", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "PEER_PROBE_TRACKER", 
.pme_udesc = "Cycles GQ peer probe tracker is full.", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_GQ_CYCLES_NOT_EMPTY", .pme_desc = "Cycles GQ read tracker is busy", .pme_code = 0x01, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "READ_TRACKER", .pme_udesc = "Cycles GQ read tracker is busy", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "WRITE_TRACKER", .pme_udesc = "Cycles GQ write tracker is busy", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "PEER_PROBE_TRACKER", .pme_udesc = "Cycles GQ peer probe tracker is busy", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_GQ_DATA", .pme_desc = "Cycles GQ data is imported from Quickpath interface", .pme_code = 0x04, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "FROM_QPI", .pme_udesc = "Cycles GQ data is imported from Quickpath interface", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "FROM_QMC", .pme_udesc = "Cycles GQ data is imported from Quickpath memory interface", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "FROM_LLC", .pme_udesc = "Cycles GQ data is imported from LLC", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "FROM_CORES_02", .pme_udesc = "Cycles GQ data is imported from Cores 0 and 2", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "FROM_CORES_13", .pme_udesc = "Cycles GQ data is imported from Cores 1 and 3", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "TO_QPI_QMC", .pme_udesc = "Cycles GQ data sent to the QPI or QMC", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "TO_LLC", .pme_udesc = "Cycles GQ data sent to LLC", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "TO_CORES", .pme_udesc = "Cycles GQ data sent to cores", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 8 }, { .pme_name = "UNC_LLC_HITS", .pme_desc = "Number of LLC read hits", .pme_code = 0x08, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "READ", 
.pme_udesc = "Number of LLC read hits", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "WRITE", .pme_udesc = "Number of LLC write hits", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "PROBE", .pme_udesc = "Number of LLC peer probe hits", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "ANY", .pme_udesc = "Number of LLC hits", .pme_ucode = 0x03, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_LLC_LINES_IN", .pme_desc = "LLC lines allocated in M state", .pme_code = 0x0A, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "M_STATE", .pme_udesc = "LLC lines allocated in M state", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "E_STATE", .pme_udesc = "LLC lines allocated in E state", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "S_STATE", .pme_udesc = "LLC lines allocated in S state", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "F_STATE", .pme_udesc = "LLC lines allocated in F state", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "ANY", .pme_udesc = "LLC lines allocated", .pme_ucode = 0x0F, .pme_uflags = 0, }, }, .pme_numasks = 5 }, { .pme_name = "UNC_LLC_LINES_OUT", .pme_desc = "LLC lines victimized in M state", .pme_code = 0x0B, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "M_STATE", .pme_udesc = "LLC lines victimized in M state", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "E_STATE", .pme_udesc = "LLC lines victimized in E state", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "S_STATE", .pme_udesc = "LLC lines victimized in S state", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "I_STATE", .pme_udesc = "LLC lines victimized in I state", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "F_STATE", .pme_udesc = "LLC lines victimized in F state", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "ANY", .pme_udesc = "LLC lines victimized", .pme_ucode = 0x1F, .pme_uflags = 0, }, }, .pme_numasks = 6 }, { .pme_name = "UNC_LLC_MISS", .pme_desc = 
"Number of LLC read misses", .pme_code = 0x09, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "READ", .pme_udesc = "Number of LLC read misses", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "WRITE", .pme_udesc = "Number of LLC write misses", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "PROBE", .pme_udesc = "Number of LLC peer probe misses", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "ANY", .pme_udesc = "Number of LLC misses", .pme_ucode = 0x03, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_QHL_ADDRESS_CONFLICTS", .pme_desc = "QHL 2 way address conflicts", .pme_code = 0x24, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "2WAY", .pme_udesc = "QHL 2 way address conflicts", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "3WAY", .pme_udesc = "QHL 3 way address conflicts", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "UNC_QHL_CONFLICT_CYCLES", .pme_desc = "QHL IOH Tracker conflict cycles", .pme_code = 0x25, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "IOH", .pme_udesc = "QHL IOH Tracker conflict cycles", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "REMOTE", .pme_udesc = "QHL Remote Tracker conflict cycles", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "LOCAL", .pme_udesc = "QHL Local Tracker conflict cycles", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_QHL_CYCLES_FULL", .pme_desc = "Cycles QHL Remote Tracker is full", .pme_code = 0x21, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "REMOTE", .pme_udesc = "Cycles QHL Remote Tracker is full", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "LOCAL", .pme_udesc = "Cycles QHL Local Tracker is full", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "IOH", .pme_udesc = "Cycles QHL IOH Tracker is full", .pme_ucode = 0x01, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_QHL_CYCLES_NOT_EMPTY", .pme_desc = "Cycles 
QHL Tracker is not empty", .pme_code = 0x22, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "IOH", .pme_udesc = "Cycles QHL IOH is busy", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "REMOTE", .pme_udesc = "Cycles QHL Remote Tracker is busy", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "LOCAL", .pme_udesc = "Cycles QHL Local Tracker is busy", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_QHL_FRC_ACK_CNFLTS", .pme_desc = "QHL FrcAckCnflts sent to local home", .pme_code = 0x33, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "LOCAL", .pme_udesc = "QHL FrcAckCnflts sent to local home", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 1 }, { .pme_name = "UNC_QHL_OCCUPANCY", .pme_desc = "Cycles QHL Tracker Allocate to Deallocate Read Occupancy", .pme_code = 0x23, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "IOH", .pme_udesc = "Cycles QHL IOH Tracker Allocate to Deallocate Read Occupancy", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "REMOTE", .pme_udesc = "Cycles QHL Remote Tracker Allocate to Deallocate Read Occupancy", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "LOCAL", .pme_udesc = "Cycles QHL Local Tracker Allocate to Deallocate Read Occupancy", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_QHL_REQUESTS", .pme_desc = "Quickpath Home Logic local read requests", .pme_code = 0x20, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "LOCAL_READS", .pme_udesc = "Quickpath Home Logic local read requests", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "LOCAL_WRITES", .pme_udesc = "Quickpath Home Logic local write requests", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "REMOTE_READS", .pme_udesc = "Quickpath Home Logic remote read requests", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "IOH_READS", .pme_udesc = "Quickpath Home Logic IOH read requests", .pme_ucode = 0x01, .pme_uflags = 
0, }, { .pme_uname = "IOH_WRITES", .pme_udesc = "Quickpath Home Logic IOH write requests", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "REMOTE_WRITES", .pme_udesc = "Quickpath Home Logic remote write requests", .pme_ucode = 0x08, .pme_uflags = 0, }, }, .pme_numasks = 6 }, { .pme_name = "UNC_QHL_TO_QMC_BYPASS", .pme_desc = "Number of requests to QMC that bypass QHL", .pme_code = 0x0126, .pme_flags = PFMLIB_NHM_UNC, }, { .pme_name = "UNC_QMC_BUSY", .pme_desc = "Cycles QMC busy with a read request", .pme_code = 0x29, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "READ_CH0", .pme_udesc = "Cycles QMC channel 0 busy with a read request", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "READ_CH1", .pme_udesc = "Cycles QMC channel 1 busy with a read request", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "READ_CH2", .pme_udesc = "Cycles QMC channel 2 busy with a read request", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "WRITE_CH0", .pme_udesc = "Cycles QMC channel 0 busy with a write request", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "WRITE_CH1", .pme_udesc = "Cycles QMC channel 1 busy with a write request", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "WRITE_CH2", .pme_udesc = "Cycles QMC channel 2 busy with a write request", .pme_ucode = 0x20, .pme_uflags = 0, }, }, .pme_numasks = 6 }, { .pme_name = "UNC_QMC_CANCEL", .pme_desc = "QMC cancels", .pme_code = 0x30, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "QMC channel 0 cancels", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "CH1", .pme_udesc = "QMC channel 1 cancels", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CH2", .pme_udesc = "QMC channel 2 cancels", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "ANY", .pme_udesc = "QMC cancels", .pme_ucode = 0x07, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_QMC_CRITICAL_PRIORITY_READS", .pme_desc = "QMC critical priority read 
requests", .pme_code = 0x2E, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "QMC channel 0 critical priority read requests", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "CH1", .pme_udesc = "QMC channel 1 critical priority read requests", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CH2", .pme_udesc = "QMC channel 2 critical priority read requests", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "ANY", .pme_udesc = "QMC critical priority read requests", .pme_ucode = 0x07, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_QMC_HIGH_PRIORITY_READS", .pme_desc = "QMC high priority read requests", .pme_code = 0x2D, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "QMC channel 0 high priority read requests", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "CH1", .pme_udesc = "QMC channel 1 high priority read requests", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CH2", .pme_udesc = "QMC channel 2 high priority read requests", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "ANY", .pme_udesc = "QMC high priority read requests", .pme_ucode = 0x07, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_QMC_ISOC_FULL", .pme_desc = "Cycles DRAM full with isochronous read requests", .pme_code = 0x28, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "READ_CH0", .pme_udesc = "Cycles DRAM channel 0 full with isochronous read requests", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "READ_CH1", .pme_udesc = "Cycles DRAM channel 1 full with isochronous read requests", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "READ_CH2", .pme_udesc = "Cycles DRAM channel 2 full with ISOC read requests", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "WRITE_CH0", .pme_udesc = "Cycles DRAM channel 0 full with ISOC write requests", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "WRITE_CH1", .pme_udesc = "Cycles DRAM channel 1 
full with ISOC write requests", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "WRITE_CH2", .pme_udesc = "Cycles DRAM channel 2 full with ISOC write requests", .pme_ucode = 0x20, .pme_uflags = 0, }, }, .pme_numasks = 6 }, { .pme_name = "UNC_IMC_ISOC_OCCUPANCY", .pme_desc = "IMC isochronous (ISOC) Read Occupancy", .pme_code = 0x2B, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "IMC channel 0 isochronous read request occupancy", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "CH1", .pme_udesc = "IMC channel 1 isochronous read request occupancy", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CH2", .pme_udesc = "IMC channel 2 isochronous read request occupancy", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "ANY", .pme_udesc = "IMC any channel isochronous read request occupancy", .pme_ucode = 0x07, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_QMC_NORMAL_FULL", .pme_desc = "Cycles DRAM full with normal read requests", .pme_code = 0x27, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "READ_CH0", .pme_udesc = "Cycles DRAM channel 0 full with normal read requests", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "READ_CH1", .pme_udesc = "Cycles DRAM channel 1 full with normal read requests", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "READ_CH2", .pme_udesc = "Cycles DRAM channel 2 full with normal read requests", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "WRITE_CH0", .pme_udesc = "Cycles DRAM channel 0 full with normal write requests", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "WRITE_CH1", .pme_udesc = "Cycles DRAM channel 1 full with normal write requests", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "WRITE_CH2", .pme_udesc = "Cycles DRAM channel 2 full with normal write requests", .pme_ucode = 0x20, .pme_uflags = 0, }, }, .pme_numasks = 6 }, { .pme_name = "UNC_QMC_NORMAL_READS", .pme_desc = "QMC normal read requests", .pme_code 
= 0x2C, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "QMC channel 0 normal read requests", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "CH1", .pme_udesc = "QMC channel 1 normal read requests", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CH2", .pme_udesc = "QMC channel 2 normal read requests", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "ANY", .pme_udesc = "QMC normal read requests", .pme_ucode = 0x07, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_QMC_OCCUPANCY", .pme_desc = "QMC Occupancy", .pme_code = 0x2A, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "IMC channel 0 normal read request occupancy", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "CH1", .pme_udesc = "IMC channel 1 normal read request occupancy", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CH2", .pme_udesc = "IMC channel 2 normal read request occupancy", .pme_ucode = 0x04, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_QMC_PRIORITY_UPDATES", .pme_desc = "QMC priority updates", .pme_code = 0x31, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "QMC channel 0 priority updates", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "CH1", .pme_udesc = "QMC channel 1 priority updates", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CH2", .pme_udesc = "QMC channel 2 priority updates", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "ANY", .pme_udesc = "QMC priority updates", .pme_ucode = 0x07, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_QMC_WRITES", .pme_desc = "QMC full cache line writes", .pme_code = 0x2F, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "FULL_CH0", .pme_udesc = "QMC channel 0 full cache line writes", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "FULL_CH1", .pme_udesc = "QMC channel 1 full cache line writes", .pme_ucode = 0x02, .pme_uflags = 0, }, { 
.pme_uname = "FULL_CH2", .pme_udesc = "QMC channel 2 full cache line writes", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "FULL_ANY", .pme_udesc = "QMC full cache line writes", .pme_ucode = 0x07, .pme_uflags = 0, }, { .pme_uname = "PARTIAL_CH0", .pme_udesc = "QMC channel 0 partial cache line writes", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "PARTIAL_CH1", .pme_udesc = "QMC channel 1 partial cache line writes", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "PARTIAL_CH2", .pme_udesc = "QMC channel 2 partial cache line writes", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "PARTIAL_ANY", .pme_udesc = "QMC partial cache line writes", .pme_ucode = 0x38, .pme_uflags = 0, }, }, .pme_numasks = 8 }, { .pme_name = "UNC_QPI_RX_NO_PPT_CREDIT", .pme_desc = "Link 0 snoop stalls due to no PPT entry", .pme_code = 0x43, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "STALLS_LINK_0", .pme_udesc = "Link 0 snoop stalls due to no PPT entry", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "STALLS_LINK_1", .pme_udesc = "Link 1 snoop stalls due to no PPT entry", .pme_ucode = 0x02, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "UNC_QPI_TX_HEADER", .pme_desc = "Cycles link 0 outbound header busy", .pme_code = 0x42, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "BUSY_LINK_0", .pme_udesc = "Cycles link 0 outbound header busy", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "BUSY_LINK_1", .pme_udesc = "Cycles link 1 outbound header busy", .pme_ucode = 0x08, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "UNC_QPI_TX_STALLED_MULTI_FLIT", .pme_desc = "Cycles QPI outbound link 0 DRS stalled", .pme_code = 0x41, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "DRS_LINK_0", .pme_udesc = "Cycles QPI outbound link 0 DRS stalled", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "NCB_LINK_0", .pme_udesc = "Cycles QPI outbound link 0 NCB stalled", .pme_ucode = 0x02, .pme_uflags = 0, }, { 
.pme_uname = "NCS_LINK_0", .pme_udesc = "Cycles QPI outbound link 0 NCS stalled", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "DRS_LINK_1", .pme_udesc = "Cycles QPI outbound link 1 DRS stalled", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "NCB_LINK_1", .pme_udesc = "Cycles QPI outbound link 1 NCB stalled", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "NCS_LINK_1", .pme_udesc = "Cycles QPI outbound link 1 NCS stalled", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "LINK_0", .pme_udesc = "Cycles QPI outbound link 0 multi flit stalled", .pme_ucode = 0x07, .pme_uflags = 0, }, { .pme_uname = "LINK_1", .pme_udesc = "Cycles QPI outbound link 1 multi flit stalled", .pme_ucode = 0x38, .pme_uflags = 0, }, }, .pme_numasks = 8 }, { .pme_name = "UNC_QPI_TX_STALLED_SINGLE_FLIT", .pme_desc = "Cycles QPI outbound link 0 HOME stalled", .pme_code = 0x40, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "HOME_LINK_0", .pme_udesc = "Cycles QPI outbound link 0 HOME stalled", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "SNOOP_LINK_0", .pme_udesc = "Cycles QPI outbound link 0 SNOOP stalled", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "NDR_LINK_0", .pme_udesc = "Cycles QPI outbound link 0 NDR stalled", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "HOME_LINK_1", .pme_udesc = "Cycles QPI outbound link 1 HOME stalled", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "SNOOP_LINK_1", .pme_udesc = "Cycles QPI outbound link 1 SNOOP stalled", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "NDR_LINK_1", .pme_udesc = "Cycles QPI outbound link 1 NDR stalled", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "LINK_0", .pme_udesc = "Cycles QPI outbound link 0 single flit stalled", .pme_ucode = 0x07, .pme_uflags = 0, }, { .pme_uname = "LINK_1", .pme_udesc = "Cycles QPI outbound link 1 single flit stalled", .pme_ucode = 0x38, .pme_uflags = 0, }, }, .pme_numasks = 8 }, { .pme_name = 
"UNC_SNP_RESP_TO_LOCAL_HOME", .pme_desc = "Local home snoop response - LLC does not have cache line", .pme_code = 0x06, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "I_STATE", .pme_udesc = "Local home snoop response - LLC does not have cache line", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "S_STATE", .pme_udesc = "Local home snoop response - LLC has cache line in S state", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "FWD_S_STATE", .pme_udesc = "Local home snoop response - LLC forwarding cache line in S state.", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "FWD_I_STATE", .pme_udesc = "Local home snoop response - LLC has forwarded a modified cache line", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "CONFLICT", .pme_udesc = "Local home conflict snoop response", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "WB", .pme_udesc = "Local home snoop response - LLC has cache line in the M state", .pme_ucode = 0x20, .pme_uflags = 0, }, }, .pme_numasks = 6 }, { .pme_name = "UNC_SNP_RESP_TO_REMOTE_HOME", .pme_desc = "Remote home snoop response - LLC does not have cache line", .pme_code = 0x07, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "I_STATE", .pme_udesc = "Remote home snoop response - LLC does not have cache line", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "S_STATE", .pme_udesc = "Remote home snoop response - LLC has cache line in S state", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "FWD_S_STATE", .pme_udesc = "Remote home snoop response - LLC forwarding cache line in S state.", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "FWD_I_STATE", .pme_udesc = "Remote home snoop response - LLC has forwarded a modified cache line", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "CONFLICT", .pme_udesc = "Remote home conflict snoop response", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "WB", .pme_udesc = "Remote home snoop response - LLC has cache line in the M 
state", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "HITM", .pme_udesc = "Remote home snoop response - LLC HITM", .pme_ucode = 0x24, .pme_uflags = 0, }, }, .pme_numasks = 7 }, }; #define PME_COREI7_UNC_EVENT_COUNT (sizeof(corei7_unc_pe)/sizeof(pme_nhm_entry_t)) papi-papi-7-2-0-t/src/libperfnec/lib/intel_wsm_events.h /* * Copyright (c) 2010 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ static pme_nhm_entry_t wsm_pe[]={ /* * BEGIN architected events */ {.pme_name = "UNHALTED_CORE_CYCLES", .pme_code = 0x003c, .pme_cntmsk = 0x2000f, .pme_flags = PFMLIB_NHM_FIXED1, .pme_desc = "count core clock cycles whenever the clock signal on the specific core is running (not halted). 
Alias to event CPU_CLK_UNHALTED:THREAD" }, {.pme_name = "INSTRUCTION_RETIRED", .pme_code = 0x00c0, .pme_cntmsk = 0x1000f, .pme_flags = PFMLIB_NHM_FIXED0|PFMLIB_NHM_PEBS, .pme_desc = "count the number of instructions at retirement. Alias to event INST_RETIRED:ANY_P", }, {.pme_name = "INSTRUCTIONS_RETIRED", .pme_code = 0x00c0, .pme_cntmsk = 0x1000f, .pme_flags = PFMLIB_NHM_FIXED0|PFMLIB_NHM_PEBS, .pme_desc = "This is an alias for INSTRUCTION_RETIRED", }, {.pme_name = "UNHALTED_REFERENCE_CYCLES", .pme_code = 0x013c, .pme_cntmsk = 0x40000, .pme_flags = PFMLIB_NHM_FIXED2_ONLY, .pme_desc = "Unhalted reference cycles", }, {.pme_name = "LLC_REFERENCES", .pme_code = 0x4f2e, .pme_cntmsk = 0xf, .pme_desc = "count each request originating from the core to reference a cache line in the last level cache. The count may include speculation, but excludes cache line fills due to hardware prefetch. Alias to L2_RQSTS:SELF_DEMAND_MESI", }, {.pme_name = "LAST_LEVEL_CACHE_REFERENCES", .pme_code = 0x4f2e, .pme_cntmsk = 0xf, .pme_desc = "This is an alias for LLC_REFERENCES", }, {.pme_name = "LLC_MISSES", .pme_code = 0x412e, .pme_cntmsk = 0xf, .pme_desc = "count each cache miss condition for references to the last level cache. The event count may include speculation, but excludes cache line fills due to hardware prefetch. Alias to event L2_RQSTS:SELF_DEMAND_I_STATE", }, {.pme_name = "LAST_LEVEL_CACHE_MISSES", .pme_code = 0x412e, .pme_cntmsk = 0xf, .pme_desc = "This is an alias for LLC_MISSES", }, {.pme_name = "BRANCH_INSTRUCTIONS_RETIRED", .pme_code = 0x00c4, .pme_cntmsk = 0xf, .pme_desc = "count branch instructions at retirement. Specifically, this event counts the retirement of the last micro-op of a branch instruction. 
Alias to event BR_INST_RETIRED:ANY", }, /* * BEGIN core specific events */ { .pme_name = "UOPS_DECODED", .pme_desc = "micro-ops decoded", .pme_code = 0xD1, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ESP_FOLDING", .pme_udesc = "Stack pointer instructions decoded", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "ESP_SYNC", .pme_udesc = "Stack pointer sync operations", .pme_ucode = 0x8, .pme_uflags = 0, }, { .pme_uname = "MS_CYCLES_ACTIVE", .pme_udesc = "cycles in which at least one uop is decoded by Microcode Sequencer", .pme_ucode = 0x2 | (1<< 16), /* counter-mask = 1 */ .pme_uflags = 0, }, { .pme_uname = "STALL_CYCLES", .pme_udesc = "Cycles no Uops are decoded", .pme_ucode = 0x1 | (1<<16) | (1<<15), /* inv=1, counter-mask=1 */ .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "L1D_CACHE_LOCK_FB_HIT", .pme_desc = "L1D cacheable load lock speculated or retired accepted into the fill buffer", .pme_code = 0x0152, .pme_flags = 0, }, { .pme_name = "BPU_CLEARS", .pme_desc = "Branch Prediction Unit clears", .pme_code = 0xE8, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "EARLY", .pme_udesc = "Early Branch Prediction Unit clears", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "LATE", .pme_udesc = "Late Branch Prediction Unit clears", .pme_ucode = 0x2, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "UOPS_RETIRED", .pme_desc = "Cycles Uops are being retired", .pme_code = 0xC2, .pme_flags = PFMLIB_NHM_PEBS|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "Uops retired (Precise Event)", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "MACRO_FUSED", .pme_udesc = "Macro-fused Uops retired (Precise Event)", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "RETIRE_SLOTS", .pme_udesc = "Retirement slots used (Precise Event)", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "STALL_CYCLES", .pme_udesc = "Cycles Uops are not retiring (Precise Event)", .pme_ucode = 
0x01 | (1<<16) | (1<<15), /* counter-mask=1, inv=1 */ .pme_uflags = 0, }, { .pme_uname = "TOTAL_CYCLES", .pme_udesc = "Total cycles using precise uop retired event (Precise Event)", .pme_ucode = 0x01 | (1<< 16), /* counter mask = 1 */ .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "ACTIVE_CYCLES", .pme_udesc = "Alias for TOTAL_CYCLES", .pme_ucode = 0x01 | (1<< 16), /* counter mask = 1 */ .pme_uflags = PFMLIB_NHM_PEBS, }, }, .pme_numasks = 6 }, { .pme_name = "BR_MISP_RETIRED", .pme_desc = "Mispredicted retired branches", .pme_code = 0xC5, .pme_flags = PFMLIB_NHM_PEBS|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ALL_BRANCHES", .pme_udesc = "Mispredicted retired branch instructions", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "NEAR_CALL", .pme_udesc = "Mispredicted near retired calls", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "CONDITIONAL", .pme_udesc = "Mispredicted conditional branches retired", .pme_ucode = 0x1, .pme_uflags = 0, } }, .pme_numasks = 3 }, { .pme_name = "EPT", .pme_desc = "Extended Page Table", .pme_code = 0x4F, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "WALK_CYCLES", .pme_udesc = "Extended Page Table walk cycles", .pme_ucode = 0x10, .pme_uflags = 0, }, }, .pme_numasks = 1 }, { .pme_name = "UOPS_EXECUTED", .pme_desc = "micro-ops executed", .pme_code = 0xB1, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "PORT0", .pme_udesc = "Uops executed on port 0 (integer arithmetic, SIMD and FP add uops)", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "PORT1", .pme_udesc = "Uops executed on port 1 (integer arithmetic, SIMD, integer shift, FP multiply, FP divide uops)", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "PORT2_CORE", .pme_udesc = "Uops executed on port 2 from any thread (load uops) (core count only)", .pme_ucode = 0x04 | (1<< 13), /* any=1 */ .pme_uflags = 0, }, { .pme_uname = "PORT3_CORE", .pme_udesc = "Uops executed on port 3 from any thread (store 
uops) (core count only)", .pme_ucode = 0x08 | (1<<13), /* any=1 */ .pme_uflags = 0, }, { .pme_uname = "PORT4_CORE", .pme_udesc = "Uops executed on port 4 from any thread (handle store values for stores on port 3) (core count only)", .pme_ucode = 0x10 | (1<<13), /* any=1 */ .pme_uflags = 0, }, { .pme_uname = "PORT5", .pme_udesc = "Uops executed on port 5", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "PORT015", .pme_udesc = "Uops issued on ports 0, 1 or 5", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "PORT234_CORE", .pme_udesc = "Uops issued on ports 2, 3 or 4 from any thread (core count only)", .pme_ucode = 0x80 | (1<<13), /* any=1 */ .pme_uflags = 0, }, { .pme_uname = "PORT015_STALL_CYCLES", .pme_udesc = "Cycles no Uops issued on ports 0, 1 or 5", .pme_ucode = 0x40 | (1<<16) | (1<<15), /* counter-mask=1, inv=1 */ .pme_uflags = 0, }, { .pme_uname = "CORE_ACTIVE_CYCLES_NO_PORT5", .pme_udesc = "Cycles in which uops are executed only on port0-4 on any thread (core count only)", .pme_ucode = 0x1f | (1<<13) | (1<<16), /* counter-mask = 1, any=1 */ }, { .pme_uname = "CORE_ACTIVE_CYCLES", .pme_udesc = "Cycles in which uops are executed on any port any thread (core count only)", .pme_ucode = 0x3f | (1<<13) | (1<<16), /* counter-mask = 1, any=1 */ }, { .pme_uname = "CORE_STALL_CYCLES", .pme_udesc = "Cycles in which no uops are executed on any port any thread (core count only)", .pme_ucode = 0x3f | (1<<13) | (1<<15) | (1<<16), /* counter-mask = 1, inv = 1,any=1 */ }, { .pme_uname = "CORE_STALL_CYCLES_NO_PORT5", .pme_udesc = "Cycles in which no uops are executed on any port0-4 on any thread (core count only)", .pme_ucode = 0x1f | (1<<13) | (1<<15) | (1<<16), /* counter-mask = 1, inv = 1,any=1 */ }, { .pme_uname = "CORE_STALL_COUNT", .pme_udesc = "number of transitions from stalled to uops to execute on any port any thread (core count only)", .pme_ucode = 0x3f | (1<<13) | (1<<15) | (1<<16) | (1<<10), /* counter-mask = 1, inv = 1, any=1, edge=1 */ }, { 
.pme_uname = "CORE_STALL_COUNT_NO_PORT5", .pme_udesc = "number of transitions from stalled to uops to execute on port0-4 on any thread (core count only)", .pme_ucode = 0x1f | (1<<13) | (1<<15) | (1<<16) | (1<<10), /* counter-mask = 1, inv = 1, any=1, edge=1 */ }, }, .pme_numasks = 15 }, { .pme_name = "IO_TRANSACTIONS", .pme_desc = "I/O transactions", .pme_code = 0x016C, .pme_flags = 0, }, { .pme_name = "ES_REG_RENAMES", .pme_desc = "ES segment renames", .pme_code = 0x01D5, .pme_flags = 0, }, { .pme_name = "INST_RETIRED", .pme_desc = "Instructions retired", .pme_code = 0xC0, .pme_flags = PFMLIB_NHM_PEBS|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY_P", .pme_udesc = "Instructions Retired (Precise Event)", .pme_ucode = 0x00, .pme_uflags = 0, }, { .pme_uname = "X87", .pme_udesc = "Retired floating-point operations (Precise Event)", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "MMX", .pme_udesc = "Retired MMX instructions (Precise Event)", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "TOTAL_CYCLES", .pme_udesc = "Total cycles (Precise Event)", .pme_ucode = 0x1 | (16 << 16) | (1 <<15), /* inv=1, cmask = 16 */ .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "ILD_STALL", .pme_desc = "Instruction Length Decoder stalls", .pme_code = 0x87, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "Any Instruction Length Decoder stall cycles", .pme_ucode = 0xF, .pme_uflags = 0, }, { .pme_uname = "IQ_FULL", .pme_udesc = "Instruction Queue full stall cycles", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "LCP", .pme_udesc = "Length Change Prefix stall cycles", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "MRU", .pme_udesc = "Stall cycles due to BPU MRU bypass", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "REGEN", .pme_udesc = "Regen stall cycles", .pme_ucode = 0x8, .pme_uflags = 0, }, }, .pme_numasks = 5 }, { .pme_name = "DTLB_LOAD_MISSES", .pme_desc = "DTLB load misses", .pme_code = 
0x8, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "DTLB load misses", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "PDE_MISS", .pme_udesc = "DTLB load miss caused by low part of address", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "STLB_HIT", .pme_udesc = "DTLB second level hit", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "WALK_COMPLETED", .pme_udesc = "DTLB load miss page walks complete", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "WALK_CYCLES", .pme_udesc = "DTLB load miss page walk cycles", .pme_ucode = 0x4, .pme_uflags = 0, }, }, .pme_numasks = 5 }, { .pme_name = "L2_LINES_IN", .pme_desc = "L2 lines allocated", .pme_code = 0xF1, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "L2 lines allocated", .pme_ucode = 0x7, .pme_uflags = 0, }, { .pme_uname = "E_STATE", .pme_udesc = "L2 lines allocated in the E state", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "S_STATE", .pme_udesc = "L2 lines allocated in the S state", .pme_ucode = 0x2, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "SSEX_UOPS_RETIRED", .pme_desc = "SIMD micro-ops retired (Precise Event)", .pme_code = 0xC7, .pme_flags = PFMLIB_NHM_PEBS|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "PACKED_DOUBLE", .pme_udesc = "SIMD Packed-Double Uops retired (Precise Event)", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "PACKED_SINGLE", .pme_udesc = "SIMD Packed-Single Uops retired (Precise Event)", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "SCALAR_DOUBLE", .pme_udesc = "SIMD Scalar-Double Uops retired (Precise Event)", .pme_ucode = 0x8, .pme_uflags = 0, }, { .pme_uname = "SCALAR_SINGLE", .pme_udesc = "SIMD Scalar-Single Uops retired (Precise Event)", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "VECTOR_INTEGER", .pme_udesc = "SIMD Vector Integer Uops retired (Precise Event)", .pme_ucode = 0x10, .pme_uflags = 0, }, }, .pme_numasks = 5 }, 
{ .pme_name = "STORE_BLOCKS", .pme_desc = "Load delayed by block code", .pme_code = 0x6, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "AT_RET", .pme_udesc = "Loads delayed with at-Retirement block code", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "L1D_BLOCK", .pme_udesc = "Cacheable loads delayed with L1D block code", .pme_ucode = 0x8, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "FP_MMX_TRANS", .pme_desc = "Floating Point to and from MMX transitions", .pme_code = 0xCC, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "All Floating Point to and from MMX transitions", .pme_ucode = 0x3, .pme_uflags = 0, }, { .pme_uname = "TO_FP", .pme_udesc = "Transitions from MMX to Floating Point instructions", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "TO_MMX", .pme_udesc = "Transitions from Floating Point to MMX instructions", .pme_ucode = 0x2, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "CACHE_LOCK_CYCLES", .pme_desc = "Cache locked", .pme_code = 0x63, .pme_flags = PFMLIB_NHM_PMC01, .pme_umasks = { { .pme_uname = "L1D", .pme_udesc = "Cycles L1D locked", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "L1D_L2", .pme_udesc = "Cycles L1D and L2 locked", .pme_ucode = 0x1, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "OFFCORE_REQUESTS_SQ_FULL", .pme_desc = "Offcore requests blocked due to Super Queue full", .pme_code = 0x01B2, .pme_flags = 0, }, { .pme_name = "L3_LAT_CACHE", .pme_desc = "Last level cache accesses", .pme_code = 0x2E, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "MISS", .pme_udesc = "Last level cache miss", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "REFERENCE", .pme_udesc = "Last level cache reference", .pme_ucode = 0x2, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "SIMD_INT_64", .pme_desc = "SIMD 64-bit integer operations", .pme_code = 0xFD, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { 
.pme_uname = "PACK", .pme_udesc = "SIMD integer 64 bit pack operations", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "PACKED_ARITH", .pme_udesc = "SIMD integer 64 bit arithmetic operations", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "PACKED_LOGICAL", .pme_udesc = "SIMD integer 64 bit logical operations", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "PACKED_MPY", .pme_udesc = "SIMD integer 64 bit packed multiply operations", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "PACKED_SHIFT", .pme_udesc = "SIMD integer 64 bit shift operations", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "SHUFFLE_MOVE", .pme_udesc = "SIMD integer 64 bit shuffle/move operations", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "UNPACK", .pme_udesc = "SIMD integer 64 bit unpack operations", .pme_ucode = 0x8, .pme_uflags = 0, }, }, .pme_numasks = 7 }, { .pme_name = "BR_INST_DECODED", .pme_desc = "Branch instructions decoded", .pme_code = 0x01E0, .pme_flags = 0, }, { .pme_name = "BR_MISP_EXEC", .pme_desc = "Mispredicted branches executed", .pme_code = 0x89, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "Mispredicted branches executed", .pme_ucode = 0x7F, .pme_uflags = 0, }, { .pme_uname = "COND", .pme_udesc = "Mispredicted conditional branches executed", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "DIRECT", .pme_udesc = "Mispredicted unconditional branches executed", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "DIRECT_NEAR_CALL", .pme_udesc = "Mispredicted non call branches executed", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "INDIRECT_NEAR_CALL", .pme_udesc = "Mispredicted indirect call branches executed", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "INDIRECT_NON_CALL", .pme_udesc = "Mispredicted indirect non call branches executed", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "NEAR_CALLS", .pme_udesc = "Mispredicted call branches executed", 
.pme_ucode = 0x30, .pme_uflags = 0, }, { .pme_uname = "NON_CALLS", .pme_udesc = "Mispredicted non call branches executed", .pme_ucode = 0x7, .pme_uflags = 0, }, { .pme_uname = "RETURN_NEAR", .pme_udesc = "Mispredicted return branches executed", .pme_ucode = 0x8, .pme_uflags = 0, }, { .pme_uname = "TAKEN", .pme_udesc = "Mispredicted taken branches executed", .pme_ucode = 0x40, .pme_uflags = 0, }, }, .pme_numasks = 10 }, { .pme_name = "SQ_FULL_STALL_CYCLES", .pme_desc = "Super Queue full stall cycles", .pme_code = 0x01F6, .pme_flags = 0, }, /* * BEGIN OFFCORE_RESPONSE */ { .pme_name = "OFFCORE_RESPONSE_0", .pme_desc = "Offcore response 0", .pme_code = 0x01B7, .pme_flags = PFMLIB_NHM_OFFCORE_RSP0, .pme_umasks = { { .pme_uname = "DMND_DATA_RD", .pme_udesc = "Request. Counts the number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "DMND_RFO", .pme_udesc = "Request. Counts the number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "DMND_IFETCH", .pme_udesc = "Request. Counts the number of demand and DCU prefetch instruction cacheline reads. Does not count L2 code read prefetches", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "WB", .pme_udesc = "Request. Counts the number of writeback (modified to exclusive) transactions", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "PF_DATA_RD", .pme_udesc = "Request. Counts the number of data cacheline reads generated by L2 prefetchers", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "PF_RFO", .pme_udesc = "Request. Counts the number of RFO requests generated by L2 prefetchers", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "PF_IFETCH", .pme_udesc = "Request. 
Counts the number of code reads generated by L2 prefetchers", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "OTHER", .pme_udesc = "Request. Counts one of the following transaction types, including L3 invalidate, I/O, full or partial writes, WC or non-temporal stores, CLFLUSH, Fences, lock, unlock, split lock", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "ANY_REQUEST", .pme_udesc = "Request. Counts any request type", .pme_ucode = 0xff, .pme_uflags = 0, }, { .pme_uname = "UNCORE_HIT", .pme_udesc = "Response. Counts L3 Hit: local or remote home requests that hit L3 cache in the uncore with no coherency actions required (snooping)", .pme_ucode = 0x100, .pme_uflags = 0, }, { .pme_uname = "OTHER_CORE_HIT_SNP", .pme_udesc = "Response. Counts L3 Hit: local or remote home requests that hit L3 cache in the uncore and was serviced by another core with a cross core snoop where no modified copies were found (clean)", .pme_ucode = 0x200, .pme_uflags = 0, }, { .pme_uname = "OTHER_CORE_HITM", .pme_udesc = "Response. Counts L3 Hit: local or remote home requests that hit L3 cache in the uncore and was serviced by another core with a cross core snoop where modified copies were found (HITM)", .pme_ucode = 0x400, .pme_uflags = 0, }, { .pme_uname = "REMOTE_CACHE_FWD", .pme_udesc = "Response. Counts L3 Miss: local homed requests that missed the L3 cache and was serviced by forwarded data following a cross package snoop where no modified copies found. (Remote home requests are not counted)", .pme_ucode = 0x1000, .pme_uflags = 0, }, { .pme_uname = "REMOTE_DRAM", .pme_udesc = "Response. Counts L3 Miss: remote home requests that missed the L3 cache and were serviced by remote DRAM", .pme_ucode = 0x2000, .pme_uflags = 0, }, { .pme_uname = "LOCAL_DRAM", .pme_udesc = "Response. Counts L3 Miss: local home requests that missed the L3 cache and were serviced by local DRAM", .pme_ucode = 0x4000, .pme_uflags = 0, }, { .pme_uname = "NON_DRAM", .pme_udesc = "Response. 
Non-DRAM requests that were serviced by IOH", .pme_ucode = 0x8000, .pme_uflags = 0, }, { .pme_uname = "ANY_RESPONSE", .pme_udesc = "Response. Counts any response type", .pme_ucode = 0xf700, .pme_uflags = 0, }, }, .pme_numasks = 17 }, { .pme_name = "OFFCORE_RESPONSE_1", .pme_desc = "Offcore response 1", .pme_code = 0x01BB, .pme_flags = PFMLIB_NHM_OFFCORE_RSP1, .pme_umasks = { { .pme_uname = "DMND_DATA_RD", .pme_udesc = "Request. Counts the number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "DMND_RFO", .pme_udesc = "Request. Counts the number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "DMND_IFETCH", .pme_udesc = "Request. Counts the number of demand and DCU prefetch instruction cacheline reads. Does not count L2 code read prefetches", .pme_ucode = 0x04, .pme_uflags = 0, }, { .pme_uname = "WB", .pme_udesc = "Request. Counts the number of writeback (modified to exclusive) transactions", .pme_ucode = 0x08, .pme_uflags = 0, }, { .pme_uname = "PF_DATA_RD", .pme_udesc = "Request. Counts the number of data cacheline reads generated by L2 prefetchers", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "PF_RFO", .pme_udesc = "Request. Counts the number of RFO requests generated by L2 prefetchers", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "PF_IFETCH", .pme_udesc = "Request. Counts the number of code reads generated by L2 prefetchers", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "OTHER", .pme_udesc = "Request. 
Counts one of the following transaction types, including L3 invalidate, I/O, full or partial writes, WC or non-temporal stores, CLFLUSH, Fences, lock, unlock, split lock", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "ANY_REQUEST", .pme_udesc = "Request. Counts any request type", .pme_ucode = 0xff, .pme_uflags = 0, }, { .pme_uname = "UNCORE_HIT", .pme_udesc = "Response. Counts L3 Hit: local or remote home requests that hit L3 cache in the uncore with no coherency actions required (snooping)", .pme_ucode = 0x100, .pme_uflags = 0, }, { .pme_uname = "OTHER_CORE_HIT_SNP", .pme_udesc = "Response. Counts L3 Hit: local or remote home requests that hit L3 cache in the uncore and was serviced by another core with a cross core snoop where no modified copies were found (clean)", .pme_ucode = 0x200, .pme_uflags = 0, }, { .pme_uname = "OTHER_CORE_HITM", .pme_udesc = "Response. Counts L3 Hit: local or remote home requests that hit L3 cache in the uncore and was serviced by another core with a cross core snoop where modified copies were found (HITM)", .pme_ucode = 0x400, .pme_uflags = 0, }, { .pme_uname = "REMOTE_CACHE_FWD", .pme_udesc = "Response. Counts L3 Miss: local homed requests that missed the L3 cache and was serviced by forwarded data following a cross package snoop where no modified copies found. (Remote home requests are not counted)", .pme_ucode = 0x1000, .pme_uflags = 0, }, { .pme_uname = "REMOTE_DRAM", .pme_udesc = "Response. Counts L3 Miss: remote home requests that missed the L3 cache and were serviced by remote DRAM", .pme_ucode = 0x2000, .pme_uflags = 0, }, { .pme_uname = "LOCAL_DRAM", .pme_udesc = "Response. Counts L3 Miss: local home requests that missed the L3 cache and were serviced by local DRAM", .pme_ucode = 0x4000, .pme_uflags = 0, }, { .pme_uname = "NON_DRAM", .pme_udesc = "Response. Non-DRAM requests that were serviced by IOH", .pme_ucode = 0x8000, .pme_uflags = 0, }, { .pme_uname = "ANY_RESPONSE", .pme_udesc = "Response. 
Counts any response type", .pme_ucode = 0xf700, .pme_uflags = 0, }, }, .pme_numasks = 17 }, /* * END OFFCORE_RESPONSE */ { .pme_name = "BACLEAR", .pme_desc = "Branch address calculator clears", .pme_code = 0xE6, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "BAD_TARGET", .pme_udesc = "BACLEAR asserted with bad target address", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "CLEAR", .pme_udesc = "BACLEAR asserted, regardless of cause", .pme_ucode = 0x1, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "DTLB_MISSES", .pme_desc = "Data TLB misses", .pme_code = 0x49, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "DTLB misses", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "LARGE_WALK_COMPLETED", .pme_udesc = "DTLB miss large page walks", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "STLB_HIT", .pme_udesc = "DTLB first level misses but second level hit", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "WALK_COMPLETED", .pme_udesc = "DTLB miss page walks", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "WALK_CYCLES", .pme_udesc = "DTLB miss page walk cycles", .pme_ucode = 0x4, .pme_uflags = 0, }, }, .pme_numasks = 5 }, { .pme_name = "MEM_INST_RETIRED", .pme_desc = "Memory instructions retired", .pme_code = 0x0B, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "LATENCY_ABOVE_THRESHOLD", .pme_udesc = "Memory instructions retired above programmed clocks, minimum value threshold is 4, requires PEBS", .pme_ucode = 0x10, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "LOADS", .pme_udesc = "Instructions retired which contain a load (Precise Event)", .pme_ucode = 0x01, .pme_uflags = PFMLIB_NHM_PEBS, }, { .pme_uname = "STORES", .pme_udesc = "Instructions retired which contain a store (Precise Event)", .pme_ucode = 0x02, .pme_uflags = PFMLIB_NHM_PEBS, }, }, .pme_numasks = 3 }, { .pme_name = "UOPS_ISSUED", .pme_desc = "Uops issued", .pme_code = 0x0E, 
.pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "Uops issued", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "STALL_CYCLES", .pme_udesc = "Cycles stalled no issued uops", .pme_ucode = 0x01 | (1<<16) | (1<<15), /* counter-mask=1, inv=1 */ .pme_uflags = 0, }, { .pme_uname = "FUSED", .pme_udesc = "Fused Uops issued", .pme_ucode = 0x02, .pme_uflags = 0, }, { .pme_uname = "CYCLES_ALL_THREADS", .pme_udesc = "Cycles uops issued on either threads (core count)", .pme_ucode = 0x01 | (1<<16) | (1<<13), /* counter-mask=1, any=1 */ .pme_uflags = 0, }, { .pme_uname = "CORE_STALL_CYCLES", .pme_udesc = "Cycles no uops issued on any threads (core count)", .pme_ucode = 0x01 | (1<<16) | (1<<15) | (1<<13), /* counter-mask=1, any=1, inv=1 */ .pme_uflags = 0, }, }, .pme_numasks = 5 }, { .pme_name = "L2_RQSTS", .pme_desc = "L2 requests", .pme_code = 0x24, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "IFETCH_HIT", .pme_udesc = "L2 instruction fetch hits", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "IFETCH_MISS", .pme_udesc = "L2 instruction fetch misses", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "IFETCHES", .pme_udesc = "L2 instruction fetches", .pme_ucode = 0x30, .pme_uflags = 0, }, { .pme_uname = "LD_HIT", .pme_udesc = "L2 load hits", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "LD_MISS", .pme_udesc = "L2 load misses", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "LOADS", .pme_udesc = "L2 requests", .pme_ucode = 0x3, .pme_uflags = 0, }, { .pme_uname = "MISS", .pme_udesc = "All L2 misses", .pme_ucode = 0xAA, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_HIT", .pme_udesc = "L2 prefetch hits", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_MISS", .pme_udesc = "L2 prefetch misses", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "PREFETCHES", .pme_udesc = "All L2 prefetches", .pme_ucode = 0xC0, .pme_uflags = 0, }, { .pme_uname = "REFERENCES", .pme_udesc = 
"All L2 requests", .pme_ucode = 0xFF, .pme_uflags = 0, }, { .pme_uname = "RFO_HIT", .pme_udesc = "L2 RFO hits", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "RFO_MISS", .pme_udesc = "L2 RFO misses", .pme_ucode = 0x8, .pme_uflags = 0, }, { .pme_uname = "RFOS", .pme_udesc = "L2 RFO requests", .pme_ucode = 0xC, .pme_uflags = 0, }, }, .pme_numasks = 14 }, { .pme_name = "TWO_UOP_INSTS_DECODED", .pme_desc = "Two Uop instructions decoded", .pme_code = 0x0119, .pme_flags = 0, }, { .pme_name = "LOAD_DISPATCH", .pme_desc = "Loads dispatched", .pme_code = 0x13, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "All loads dispatched", .pme_ucode = 0x7, .pme_uflags = 0, }, { .pme_uname = "RS", .pme_udesc = "Number of loads dispatched from the Reservation Station (RS) that bypass the Memory Order Buffer", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "RS_DELAYED", .pme_udesc = "Number of delayed RS dispatches at the stage latch", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "MOB", .pme_udesc = "Number of loads dispatched from Reservation Station (RS)", .pme_ucode = 0x4, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "BACLEAR_FORCE_IQ", .pme_desc = "BACLEAR forced by Instruction queue", .pme_code = 0x01A7, .pme_flags = 0, }, { .pme_name = "SNOOPQ_REQUESTS", .pme_desc = "Snoopq requests", .pme_code = 0xB4, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "CODE", .pme_udesc = "Snoop code requests", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "DATA", .pme_udesc = "Snoop data requests", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "INVALIDATE", .pme_udesc = "Snoop invalidate requests", .pme_ucode = 0x2, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "OFFCORE_REQUESTS", .pme_desc = "offcore requests", .pme_code = 0xB0, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "All offcore requests", .pme_ucode = 0x80, .pme_uflags = 0, }, 
{ .pme_uname = "ANY_READ", .pme_udesc = "Offcore read requests", .pme_ucode = 0x8, .pme_uflags = 0, }, { .pme_uname = "ANY_RFO", .pme_udesc = "Offcore RFO requests", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "DEMAND_READ_CODE", .pme_udesc = "Offcore demand code read requests", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "DEMAND_READ_DATA", .pme_udesc = "Offcore demand data read requests", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "DEMAND_RFO", .pme_udesc = "Offcore demand RFO requests", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "L1D_WRITEBACK", .pme_udesc = "Offcore L1 data cache writebacks", .pme_ucode = 0x40, .pme_uflags = 0, }, }, .pme_numasks = 7 }, { .pme_name = "LOAD_BLOCK", .pme_desc = "Loads blocked", .pme_code = 0x3, .pme_flags = 0, .pme_umasks = { { .pme_uname = "OVERLAP_STORE", .pme_udesc = "loads that partially overlap an earlier store", .pme_ucode = 0x2, .pme_uflags = 0, }, }, .pme_numasks = 1 }, { .pme_name = "MISALIGN_MEMORY", .pme_desc = "Misaligned accesses", .pme_code = 0x5, .pme_flags = 0, .pme_umasks = { { .pme_uname = "STORE", .pme_udesc = "store referenced with misaligned address", .pme_ucode = 0x2, .pme_uflags = 0, }, }, .pme_numasks = 1 }, { .pme_name = "INST_QUEUE_WRITE_CYCLES", .pme_desc = "Cycles instructions are written to the instruction queue", .pme_code = 0x011E, .pme_flags = 0, }, { .pme_name = "MACHINE_CLEARS", .pme_desc = "Machine clear asserted", .pme_code = 0xC3, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "MEM_ORDER", .pme_udesc = "Execution pipeline restart due to Memory ordering conflicts", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "CYCLES", .pme_udesc = "cycles machine clear is asserted", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "SMC", .pme_udesc = "Self-modifying code detected", .pme_ucode = 0x4, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "FP_COMP_OPS_EXE", .pme_desc = "SSE/MMX micro-ops", .pme_code = 0x10, .pme_flags = 
PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "MMX", .pme_udesc = "MMX Uops", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "SSE_DOUBLE_PRECISION", .pme_udesc = "SSE FP double precision Uops", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "SSE_FP", .pme_udesc = "SSE and SSE2 FP Uops", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "SSE_FP_PACKED", .pme_udesc = "SSE FP packed Uops", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "SSE_FP_SCALAR", .pme_udesc = "SSE FP scalar Uops", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "SSE_SINGLE_PRECISION", .pme_udesc = "SSE FP single precision Uops", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "SSE2_INTEGER", .pme_udesc = "SSE2 integer Uops", .pme_ucode = 0x8, .pme_uflags = 0, }, { .pme_uname = "X87", .pme_udesc = "Computational floating-point operations executed", .pme_ucode = 0x1, .pme_uflags = 0, }, }, .pme_numasks = 8 }, { .pme_name = "ITLB_FLUSH", .pme_desc = "ITLB flushes", .pme_code = 0x01AE, .pme_flags = 0, }, { .pme_name = "BR_INST_RETIRED", .pme_desc = "Retired branch instructions (Precise Event)", .pme_code = 0xC4, .pme_flags = PFMLIB_NHM_PEBS|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ALL_BRANCHES", .pme_udesc = "Retired branch instructions (Precise Event)", .pme_ucode = 0x0, .pme_uflags = 0, }, { .pme_uname = "CONDITIONAL", .pme_udesc = "Retired conditional branch instructions (Precise Event)", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "NEAR_CALL", .pme_udesc = "Retired near call instructions (Precise Event)", .pme_ucode = 0x2, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "L1D_CACHE_PREFETCH_LOCK_FB_HIT", .pme_desc = "L1D prefetch load lock accepted in fill buffer", .pme_code = 0x0152, .pme_flags = 0, }, { .pme_name = "LARGE_ITLB", .pme_desc = "Large ITLB accesses", .pme_code = 0x82, .pme_flags = 0, .pme_umasks = { { .pme_uname = "HIT", .pme_udesc = "Large ITLB hit", .pme_ucode = 0x1, .pme_uflags = 0, }, }, .pme_numasks 
= 1 }, { .pme_name = "LSD", .pme_desc = "Loop stream detector", .pme_code = 0xA8, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "UOPS", .pme_udesc = "counts the number of micro-ops delivered by LSD", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "ACTIVE", .pme_udesc = "Cycles in which at least one micro-op is delivered by LSD", .pme_ucode = 0x01 | (1<<16), .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "L2_LINES_OUT", .pme_desc = "L2 lines evicted", .pme_code = 0xF2, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "L2 lines evicted", .pme_ucode = 0xF, .pme_uflags = 0, }, { .pme_uname = "DEMAND_CLEAN", .pme_udesc = "L2 lines evicted by a demand request", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "DEMAND_DIRTY", .pme_udesc = "L2 modified lines evicted by a demand request", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_CLEAN", .pme_udesc = "L2 lines evicted by a prefetch request", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_DIRTY", .pme_udesc = "L2 modified lines evicted by a prefetch request", .pme_ucode = 0x8, .pme_uflags = 0, }, }, .pme_numasks = 5 }, { .pme_name = "ITLB_MISSES", .pme_desc = "ITLB miss", .pme_code = 0x85, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "ITLB miss", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "WALK_COMPLETED", .pme_udesc = "ITLB miss page walks", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "WALK_CYCLES", .pme_udesc = "ITLB miss page walk cycles", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "LARGE_WALK_COMPLETED", .pme_udesc = "Number of completed large page walks due to misses in the STLB", .pme_ucode = 0x80, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "L1D_PREFETCH", .pme_desc = "L1D hardware prefetch", .pme_code = 0x4E, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "MISS", .pme_udesc = "L1D hardware 
prefetch misses", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "REQUESTS", .pme_udesc = "L1D hardware prefetch requests", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "TRIGGERS", .pme_udesc = "L1D hardware prefetch requests triggered", .pme_ucode = 0x4, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "SQ_MISC", .pme_desc = "Super Queue miscellaneous", .pme_code = 0xF4, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "LRU_HINTS", .pme_udesc = "Super Queue LRU hints sent to LLC", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "SPLIT_LOCK", .pme_udesc = "Super Queue lock splits across a cache line", .pme_ucode = 0x10, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "SEG_RENAME_STALLS", .pme_desc = "Segment rename stall cycles", .pme_code = 0x01D4, .pme_flags = 0, }, { .pme_name = "FP_ASSIST", .pme_desc = "X87 Floating point assists (Precise Event)", .pme_code = 0xF7, .pme_flags = PFMLIB_NHM_PEBS|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ALL", .pme_udesc = "All X87 Floating point assists (Precise Event)", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "INPUT", .pme_udesc = "X87 Floating point assists for invalid input value (Precise Event)", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "OUTPUT", .pme_udesc = "X87 Floating point assists for invalid output value (Precise Event)", .pme_ucode = 0x2, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "SIMD_INT_128", .pme_desc = "128 bit SIMD operations", .pme_code = 0x12, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "PACK", .pme_udesc = "128 bit SIMD integer pack operations", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "PACKED_ARITH", .pme_udesc = "128 bit SIMD integer arithmetic operations", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "PACKED_LOGICAL", .pme_udesc = "128 bit SIMD integer logical operations", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "PACKED_MPY", .pme_udesc 
= "128 bit SIMD integer multiply operations", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "PACKED_SHIFT", .pme_udesc = "128 bit SIMD integer shift operations", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "SHUFFLE_MOVE", .pme_udesc = "128 bit SIMD integer shuffle/move operations", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "UNPACK", .pme_udesc = "128 bit SIMD integer unpack operations", .pme_ucode = 0x8, .pme_uflags = 0, }, }, .pme_numasks = 7 }, { .pme_name = "OFFCORE_REQUESTS_OUTSTANDING", .pme_desc = "Outstanding offcore requests", .pme_code = 0x60, .pme_flags = PFMLIB_NHM_PMC0|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY_READ", .pme_udesc = "Outstanding offcore reads", .pme_ucode = 0x8, .pme_uflags = 0, }, { .pme_uname = "DEMAND_READ_CODE", .pme_udesc = "Outstanding offcore demand code reads", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "DEMAND_READ_DATA", .pme_udesc = "Outstanding offcore demand data reads", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "DEMAND_RFO", .pme_udesc = "Outstanding offcore demand RFOs", .pme_ucode = 0x4, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "MEM_STORE_RETIRED", .pme_desc = "Retired stores", .pme_code = 0xC, .pme_flags = 0, .pme_umasks = { { .pme_uname = "DTLB_MISS", .pme_udesc = "Retired stores that miss the DTLB (Precise Event)", .pme_ucode = 0x1, .pme_uflags = 0, }, }, .pme_numasks = 1 }, { .pme_name = "INST_DECODED", .pme_desc = "Instructions decoded", .pme_code = 0x18, .pme_flags = 0, .pme_umasks = { { .pme_uname = "DEC0", .pme_udesc = "Instructions that must be decoded by decoder 0", .pme_ucode = 0x1, .pme_uflags = 0, }, }, .pme_numasks = 1 }, { .pme_name = "MACRO_INSTS_FUSIONS_DECODED", .pme_desc = "Count the number of instructions decoded that are macro-fused but not necessarily executed or retired", .pme_code = 0x01A6, .pme_flags = 0, }, { .pme_name = "MACRO_INSTS", .pme_desc = "macro-instructions", .pme_code = 0xD0, .pme_flags = 0, .pme_umasks 
= { { .pme_uname = "DECODED", .pme_udesc = "Instructions decoded", .pme_ucode = 0x1, .pme_uflags = 0, }, }, .pme_numasks = 1 }, { .pme_name = "PARTIAL_ADDRESS_ALIAS", .pme_desc = "False dependencies due to partial address aliasing", .pme_code = 0x0107, .pme_flags = 0, }, { .pme_name = "ARITH", .pme_desc = "Counts arithmetic multiply and divide operations", .pme_code = 0x14, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "CYCLES_DIV_BUSY", .pme_udesc = "Counts the number of cycles the divider is busy executing divide or square root operations. The divide can be integer, X87 or Streaming SIMD Extensions (SSE). The square root operation can be either X87 or SSE. Count may be incorrect when HT is on", .pme_ucode = 0x01, .pme_uflags = 0, }, { .pme_uname = "DIV", .pme_udesc = "Counts the number of divide or square root operations. The divide can be integer, X87 or Streaming SIMD Extensions (SSE). The square root operation can be either X87 or SSE. Count may be incorrect when HT is on", .pme_ucode = 0x01 | (1<<16) | (1<<15) | (1<<10), /* cmask=1  invert=1  edge=1 */ .pme_uflags = 0, }, { .pme_uname = "MUL", .pme_udesc = "Counts the number of multiply operations executed. This includes integer as well as floating point multiply operations but excludes DPPS mul and MPSAD. 
Count may be incorrect when HT is on", .pme_ucode = 0x02, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "L2_TRANSACTIONS", .pme_desc = "All L2 transactions", .pme_code = 0xF0, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "All L2 transactions", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "FILL", .pme_udesc = "L2 fill transactions", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "IFETCH", .pme_udesc = "L2 instruction fetch transactions", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "L1D_WB", .pme_udesc = "L1D writeback to L2 transactions", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "LOAD", .pme_udesc = "L2 Load transactions", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "PREFETCH", .pme_udesc = "L2 prefetch transactions", .pme_ucode = 0x8, .pme_uflags = 0, }, { .pme_uname = "RFO", .pme_udesc = "L2 RFO transactions", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "WB", .pme_udesc = "L2 writeback to LLC transactions", .pme_ucode = 0x40, .pme_uflags = 0, }, }, .pme_numasks = 8 }, { .pme_name = "INST_QUEUE_WRITES", .pme_desc = "Instructions written to instruction queue.", .pme_code = 0x0117, .pme_flags = 0, }, { .pme_name = "LSD_OVERFLOW", .pme_desc = "Number of loops that cannot stream from the instruction queue.", .pme_code = 0x0120, .pme_flags = 0, }, { .pme_name = "SB_DRAIN", .pme_desc = "store buffer", .pme_code = 0x4, .pme_flags = 0, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "All Store buffer stall cycles", .pme_ucode = 0x7, .pme_uflags = 0, }, }, .pme_numasks = 1 }, { .pme_name = "LOAD_HIT_PRE", .pme_desc = "Load operations conflicting with software prefetches", .pme_code = 0x014C, .pme_flags = PFMLIB_NHM_PMC01, }, { .pme_name = "MEM_UNCORE_RETIRED", .pme_desc = "Load instructions retired (Precise Event)", .pme_code = 0xF, .pme_flags = PFMLIB_NHM_PEBS|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "LOCAL_HITM", .pme_udesc = "Load instructions 
retired that HIT modified data in sibling core (Precise Event) (Model 44 only)", .pme_ucode = 0x2, .pme_umodel = 44, }, { .pme_uname = "LOCAL_DRAM_AND_REMOTE_CACHE_HIT", .pme_udesc = "Load instructions retired local DRAM and remote cache HIT data sources (Precise Event) (Model 44 only)", .pme_ucode = 0x8, .pme_umodel = 44, }, { .pme_uname = "REMOTE_DRAM", .pme_udesc = "Load instructions retired remote DRAM and remote home-remote cache HITM (Precise Event) (Model 44 only)", .pme_ucode = 0x10, .pme_umodel = 44, }, { .pme_uname = "UNCACHEABLE", .pme_udesc = "Load instructions retired IO (Precise Event)", .pme_ucode = 0x80, }, { .pme_uname = "REMOTE_HITM", .pme_udesc = "Retired loads that hit remote socket in modified state (Precise Event) (Model 44 only)", .pme_ucode = 0x4, .pme_umodel = 44, }, { .pme_uname = "OTHER_LLC_MISS", .pme_udesc = "Load instructions retired other LLC miss (Precise Event) (Model 44 only)", .pme_ucode = 0x20, .pme_umodel = 44, }, { .pme_uname = "UNKNOWN_SOURCE", .pme_udesc = "Load instructions retired unknown LLC miss (Precise Event) (Model 44 only)", .pme_ucode = 0x1, .pme_umodel = 44, }, { .pme_uname = "LOCAL_DRAM", .pme_udesc = "Retired loads with a data source of local DRAM or locally homed remote cache HITM (Precise Event) (Model 37 only)", .pme_ucode = 0x10, .pme_umodel = 37, }, { .pme_uname = "OTHER_CORE_L2_HITM", .pme_udesc = "Retired load instructions that hit modified data in sibling core (Precise Event) (Model 37 only)", .pme_ucode = 0x2, .pme_umodel = 37, }, { .pme_uname = "REMOTE_CACHE_LOCAL_HOME_HIT", .pme_udesc = "Retired load instructions that hit remote cache HIT data source (Precise Event) (Model 37 only)", .pme_ucode = 0x8, .pme_umodel = 37, }, { .pme_uname = "REMOTE_DRAM", .pme_udesc = "Retired load instructions with remote DRAM and remote home-remote cache HITM data sources (Precise Event) (Model 37 only)", .pme_ucode = 0x20, .pme_umodel = 37, }, }, .pme_numasks = 11, }, { .pme_name = "L2_DATA_RQSTS", .pme_desc = "All L2 data requests", 
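/*
 * Editor's note (illustrative comment, not part of the original event table):
 * in the MESI-style entries below, each cache-line state gets a single umask
 * bit and the combined masks are their bitwise OR. For example, in
 * L2_DATA_RQSTS the combined DEMAND_MESI code 0xF equals
 * DEMAND_M_STATE (0x8) | DEMAND_E_STATE (0x4) | DEMAND_S_STATE (0x2) |
 * DEMAND_I_STATE (0x1), and PREFETCH_MESI (0xF0) is the same pattern
 * shifted into the prefetch bits.
 */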
.pme_code = 0x26, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "All L2 data requests", .pme_ucode = 0xFF, .pme_uflags = 0, }, { .pme_uname = "DEMAND_E_STATE", .pme_udesc = "L2 data demand loads in E state", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "DEMAND_I_STATE", .pme_udesc = "L2 data demand loads in I state (misses)", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "DEMAND_M_STATE", .pme_udesc = "L2 data demand loads in M state", .pme_ucode = 0x8, .pme_uflags = 0, }, { .pme_uname = "DEMAND_MESI", .pme_udesc = "L2 data demand requests", .pme_ucode = 0xF, .pme_uflags = 0, }, { .pme_uname = "DEMAND_S_STATE", .pme_udesc = "L2 data demand loads in S state", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_E_STATE", .pme_udesc = "L2 data prefetches in E state", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_I_STATE", .pme_udesc = "L2 data prefetches in the I state (misses)", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_M_STATE", .pme_udesc = "L2 data prefetches in M state", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_MESI", .pme_udesc = "All L2 data prefetches", .pme_ucode = 0xF0, .pme_uflags = 0, }, { .pme_uname = "PREFETCH_S_STATE", .pme_udesc = "L2 data prefetches in the S state", .pme_ucode = 0x20, .pme_uflags = 0, }, }, .pme_numasks = 11 }, { .pme_name = "BR_INST_EXEC", .pme_desc = "Branch instructions executed", .pme_code = 0x88, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "Branch instructions executed", .pme_ucode = 0x7F, .pme_uflags = 0, }, { .pme_uname = "COND", .pme_udesc = "Conditional branch instructions executed", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "DIRECT", .pme_udesc = "Unconditional branches executed", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "DIRECT_NEAR_CALL", .pme_udesc = "Unconditional call branches executed", .pme_ucode = 0x10, .pme_uflags = 0, 
}, { .pme_uname = "INDIRECT_NEAR_CALL", .pme_udesc = "Indirect call branches executed", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "INDIRECT_NON_CALL", .pme_udesc = "Indirect non call branches executed", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "NEAR_CALLS", .pme_udesc = "Call branches executed", .pme_ucode = 0x30, .pme_uflags = 0, }, { .pme_uname = "NON_CALLS", .pme_udesc = "All non call branches executed", .pme_ucode = 0x7, .pme_uflags = 0, }, { .pme_uname = "RETURN_NEAR", .pme_udesc = "Indirect return branches executed", .pme_ucode = 0x8, .pme_uflags = 0, }, { .pme_uname = "TAKEN", .pme_udesc = "Taken branches executed", .pme_ucode = 0x40, .pme_uflags = 0, }, }, .pme_numasks = 10 }, { .pme_name = "ITLB_MISS_RETIRED", .pme_desc = "Retired instructions that missed the ITLB (Precise Event)", .pme_code = 0x20C8, .pme_flags = 0, }, { .pme_name = "BPU_MISSED_CALL_RET", .pme_desc = "Branch prediction unit missed call or return", .pme_code = 0x01E5, .pme_flags = 0, }, { .pme_name = "SNOOPQ_REQUESTS_OUTSTANDING", .pme_desc = "Outstanding snoop requests", .pme_code = 0xB3, .pme_flags = PFMLIB_NHM_PMC0|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "CODE", .pme_udesc = "Outstanding snoop code requests", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "CODE_NOT_EMPTY", .pme_udesc = "Cycles snoop code requests queue not empty", .pme_ucode = 0x4 | (1 << 16), /* cmask=1 */ .pme_uflags = 0, }, { .pme_uname = "DATA", .pme_udesc = "Outstanding snoop data requests", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "DATA_NOT_EMPTY", .pme_udesc = "Cycles snoop data requests queue not empty", .pme_ucode = 0x1 | (1 << 16), /* cmask=1 */ .pme_uflags = 0, }, { .pme_uname = "INVALIDATE", .pme_udesc = "Outstanding snoop invalidate requests", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "INVALIDATE_NOT_EMPTY", .pme_udesc = "Cycles snoop invalidate requests queue not empty", .pme_ucode = 0x2 | (1 << 16), /* cmask=1 */ .pme_uflags = 0, }, }, 
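/*
 * Editor's note (illustrative comment, not part of the original event table):
 * as the inline annotations in SNOOPQ_REQUESTS_OUTSTANDING and ARITH
 * indicate, umask codes in this table can carry modifier bits above the
 * low umask byte: OR-ing in (1 << 16) encodes cmask=1 (count cycles with
 * at least one outstanding request), and ARITH:DIV additionally sets
 * (1 << 15) for invert and (1 << 10) for edge detection.
 */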
.pme_numasks = 6 }, { .pme_name = "MEM_LOAD_RETIRED", .pme_desc = "Memory loads retired (Precise Event)", .pme_code = 0xCB, .pme_flags = PFMLIB_NHM_PEBS|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "DTLB_MISS", .pme_udesc = "Retired loads that miss the DTLB (Precise Event)", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "HIT_LFB", .pme_udesc = "Retired loads that miss L1D and hit a previously allocated LFB (Precise Event)", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "L1D_HIT", .pme_udesc = "Retired loads that hit the L1 data cache (Precise Event)", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "L2_HIT", .pme_udesc = "Retired loads that hit the L2 cache (Precise Event)", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "L3_MISS", .pme_udesc = "Retired loads that miss the LLC cache (Precise Event)", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "L3_UNSHARED_HIT", .pme_udesc = "Retired loads that hit valid versions in the LLC cache (Precise Event)", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "OTHER_CORE_L2_HIT_HITM", .pme_udesc = "Retired loads that hit sibling core's L2 in modified or unmodified states (Precise Event)", .pme_ucode = 0x8, .pme_uflags = 0, }, }, .pme_numasks = 7 }, { .pme_name = "L1I", .pme_desc = "L1I instruction fetch", .pme_code = 0x80, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "CYCLES_STALLED", .pme_udesc = "L1I instruction fetch stall cycles", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "HITS", .pme_udesc = "L1I instruction fetch hits", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "MISSES", .pme_udesc = "L1I instruction fetch misses", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "READS", .pme_udesc = "L1I instruction fetches", .pme_ucode = 0x3, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "L2_WRITE", .pme_desc = "L2 demand lock/store RFO", .pme_code = 0x27, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = 
"LOCK_E_STATE", .pme_udesc = "L2 demand lock RFOs in E state", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "LOCK_HIT", .pme_udesc = "All demand L2 lock RFOs that hit the cache", .pme_ucode = 0xE0, .pme_uflags = 0, }, { .pme_uname = "LOCK_I_STATE", .pme_udesc = "L2 demand lock RFOs in I state (misses)", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "LOCK_M_STATE", .pme_udesc = "L2 demand lock RFOs in M state", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "LOCK_MESI", .pme_udesc = "All demand L2 lock RFOs", .pme_ucode = 0xF0, .pme_uflags = 0, }, { .pme_uname = "LOCK_S_STATE", .pme_udesc = "L2 demand lock RFOs in S state", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "RFO_HIT", .pme_udesc = "All L2 demand store RFOs that hit the cache", .pme_ucode = 0xE, .pme_uflags = 0, }, { .pme_uname = "RFO_I_STATE", .pme_udesc = "L2 demand store RFOs in I state (misses)", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "RFO_M_STATE", .pme_udesc = "L2 demand store RFOs in M state", .pme_ucode = 0x8, .pme_uflags = 0, }, { .pme_uname = "RFO_MESI", .pme_udesc = "All L2 demand store RFOs", .pme_ucode = 0xF, .pme_uflags = 0, }, { .pme_uname = "RFO_S_STATE", .pme_udesc = "L2 demand store RFOs in S state", .pme_ucode = 0x2, .pme_uflags = 0, }, }, .pme_numasks = 11 }, { .pme_name = "SNOOP_RESPONSE", .pme_desc = "Snoop", .pme_code = 0xB8, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "HIT", .pme_udesc = "Thread responded HIT to snoop", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "HITE", .pme_udesc = "Thread responded HITE to snoop", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "HITM", .pme_udesc = "Thread responded HITM to snoop", .pme_ucode = 0x4, .pme_uflags = 0, }, }, .pme_numasks = 3 }, { .pme_name = "L1D", .pme_desc = "L1D cache", .pme_code = 0x51, .pme_flags = PFMLIB_NHM_PMC01|PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "M_EVICT", .pme_udesc = "L1D cache lines replaced in M state ", .pme_ucode 
= 0x4, .pme_uflags = 0, }, { .pme_uname = "M_REPL", .pme_udesc = "L1D cache lines allocated in the M state", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "M_SNOOP_EVICT", .pme_udesc = "L1D snoop eviction of cache lines in M state", .pme_ucode = 0x8, .pme_uflags = 0, }, { .pme_uname = "REPL", .pme_udesc = "L1 data cache lines allocated", .pme_ucode = 0x1, .pme_uflags = 0, }, }, .pme_numasks = 4 }, { .pme_name = "RESOURCE_STALLS", .pme_desc = "Resource related stall cycles", .pme_code = 0xA2, .pme_flags = 0, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "Resource related stall cycles", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "FPCW", .pme_udesc = "FPU control word write stall cycles", .pme_ucode = 0x20, .pme_uflags = 0, }, { .pme_uname = "LOAD", .pme_udesc = "Load buffer stall cycles", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "MXCSR", .pme_udesc = "MXCSR rename stall cycles", .pme_ucode = 0x40, .pme_uflags = 0, }, { .pme_uname = "OTHER", .pme_udesc = "Other Resource related stall cycles", .pme_ucode = 0x80, .pme_uflags = 0, }, { .pme_uname = "ROB_FULL", .pme_udesc = "ROB full stall cycles", .pme_ucode = 0x10, .pme_uflags = 0, }, { .pme_uname = "RS_FULL", .pme_udesc = "Reservation Station full stall cycles", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "STORE", .pme_udesc = "Store buffer stall cycles", .pme_ucode = 0x8, .pme_uflags = 0, }, }, .pme_numasks = 8 }, { .pme_name = "RAT_STALLS", .pme_desc = "All RAT stall cycles", .pme_code = 0xD2, .pme_flags = 0, .pme_umasks = { { .pme_uname = "ANY", .pme_udesc = "All RAT stall cycles", .pme_ucode = 0xF, .pme_uflags = 0, }, { .pme_uname = "FLAGS", .pme_udesc = "Flag stall cycles", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "REGISTERS", .pme_udesc = "Partial register stall cycles", .pme_ucode = 0x2, .pme_uflags = 0, }, { .pme_uname = "ROB_READ_PORT", .pme_udesc = "ROB read port stalls cycles", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "SCOREBOARD", .pme_udesc 
= "Scoreboard stall cycles", .pme_ucode = 0x8, .pme_uflags = 0, }, }, .pme_numasks = 5 }, { .pme_name = "CPU_CLK_UNHALTED", .pme_desc = "Cycles when processor is not in halted state", .pme_code = 0x3C, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "THREAD_P", .pme_udesc = "Cycles when thread is not halted (programmable counter)", .pme_ucode = 0x00, .pme_uflags = 0, }, { .pme_uname = "REF_P", .pme_udesc = "Reference base clock (133 MHz) cycles when thread is not halted", .pme_ucode = 0x01, .pme_uflags = 0, }, }, .pme_numasks = 2 }, { .pme_name = "L1D_WB_L2", .pme_desc = "L1D writebacks to L2", .pme_code = 0x28, .pme_flags = PFMLIB_NHM_UMASK_NCOMBO, .pme_umasks = { { .pme_uname = "E_STATE", .pme_udesc = "L1 writebacks to L2 in E state", .pme_ucode = 0x4, .pme_uflags = 0, }, { .pme_uname = "I_STATE", .pme_udesc = "L1 writebacks to L2 in I state (misses)", .pme_ucode = 0x1, .pme_uflags = 0, }, { .pme_uname = "M_STATE", .pme_udesc = "L1 writebacks to L2 in M state", .pme_ucode = 0x8, .pme_uflags = 0, }, { .pme_uname = "MESI", .pme_udesc = "All L1 writebacks to L2", .pme_ucode = 0xF, .pme_uflags = 0, }, { .pme_uname = "S_STATE", .pme_udesc = "L1 writebacks to L2 in S state", .pme_ucode = 0x2, .pme_uflags = 0, }, }, .pme_numasks = 5 }, {.pme_name = "MISPREDICTED_BRANCH_RETIRED", .pme_code = 0x00c5, .pme_desc = "Count mispredicted branch instructions at retirement. 
Specifically, this event counts at retirement of the last micro-op of a branch instruction in the architectural path of execution that experienced misprediction in the branch prediction hardware", }, {.pme_name = "THREAD_ACTIVE", .pme_code= 0x01ec, .pme_desc = "Cycles thread is active", }, {.pme_name = "UOP_UNFUSION", .pme_code= 0x01db, .pme_desc = "Counts unfusion events due to a floating point exception to a fused uop", } }; #define PME_WSM_UNHALTED_CORE_CYCLES 0 #define PME_WSM_INSTRUCTIONS_RETIRED 1 #define PME_WSM_EVENT_COUNT (sizeof(wsm_pe)/sizeof(pme_nhm_entry_t))

/* ==== File: src/libperfnec/lib/intel_wsm_unc_events.h ==== */
/*
 * Copyright (c) 2010 Google, Inc
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
*/ static pme_nhm_entry_t intel_wsm_unc_pe[]={ /* * BEGIN uncore events */ { .pme_name = "UNC_CLK_UNHALTED", .pme_desc = "Uncore clockticks.", .pme_code = 0x0000, .pme_flags = PFMLIB_NHM_UNC_FIXED, }, { .pme_name = "UNC_DRAM_OPEN", .pme_desc = "DRAM open commands issued for read or write", .pme_code = 0x60, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "DRAM Channel 0 open commands issued for read or write", .pme_ucode = 0x01, }, { .pme_uname = "CH1", .pme_udesc = "DRAM Channel 1 open commands issued for read or write", .pme_ucode = 0x02, }, { .pme_uname = "CH2", .pme_udesc = "DRAM Channel 2 open commands issued for read or write", .pme_ucode = 0x04, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_GC_OCCUPANCY", .pme_desc = "Number of queue entries", .pme_code = 0x02, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "READ_TRACKER", .pme_udesc = "in the read tracker", .pme_ucode = 0x01, }, }, .pme_numasks = 1 }, { .pme_name = "UNC_DRAM_PAGE_CLOSE", .pme_desc = "DRAM page close due to idle timer expiration", .pme_code = 0x61, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "DRAM Channel 0 page close", .pme_ucode = 0x01, }, { .pme_uname = "CH1", .pme_udesc = "DRAM Channel 1 page close", .pme_ucode = 0x02, }, { .pme_uname = "CH2", .pme_udesc = "DRAM Channel 2 page close", .pme_ucode = 0x04, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_DRAM_PAGE_MISS", .pme_desc = "DRAM Channel 0 page miss", .pme_code = 0x62, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "DRAM Channel 0 page miss", .pme_ucode = 0x01, }, { .pme_uname = "CH1", .pme_udesc = "DRAM Channel 1 page miss", .pme_ucode = 0x02, }, { .pme_uname = "CH2", .pme_udesc = "DRAM Channel 2 page miss", .pme_ucode = 0x04, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_DRAM_PRE_ALL", .pme_desc = "DRAM Channel 0 precharge all commands", .pme_code = 0x66, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", 
.pme_udesc = "DRAM Channel 0 precharge all commands", .pme_ucode = 0x01, }, { .pme_uname = "CH1", .pme_udesc = "DRAM Channel 1 precharge all commands", .pme_ucode = 0x02, }, { .pme_uname = "CH2", .pme_udesc = "DRAM Channel 2 precharge all commands", .pme_ucode = 0x04, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_DRAM_THERMAL_THROTTLED", .pme_desc = "uncore cycles DRAM was throttled due to its temperature being above thermal throttling threshold", .pme_code = 0x0167, .pme_flags = PFMLIB_NHM_UNC, }, { .pme_name = "UNC_DRAM_READ_CAS", .pme_desc = "DRAM Channel 0 read CAS commands", .pme_code = 0x63, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "DRAM Channel 0 read CAS commands", .pme_ucode = 0x01, }, { .pme_uname = "AUTOPRE_CH0", .pme_udesc = "DRAM Channel 0 read CAS auto page close commands", .pme_ucode = 0x02, }, { .pme_uname = "CH1", .pme_udesc = "DRAM Channel 1 read CAS commands", .pme_ucode = 0x04, }, { .pme_uname = "AUTOPRE_CH1", .pme_udesc = "DRAM Channel 1 read CAS auto page close commands", .pme_ucode = 0x08, }, { .pme_uname = "CH2", .pme_udesc = "DRAM Channel 2 read CAS commands", .pme_ucode = 0x10, }, { .pme_uname = "AUTOPRE_CH2", .pme_udesc = "DRAM Channel 2 read CAS auto page close commands", .pme_ucode = 0x20, }, }, .pme_numasks = 6 }, { .pme_name = "UNC_DRAM_REFRESH", .pme_desc = "DRAM Channel 0 refresh commands", .pme_code = 0x65, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "DRAM Channel 0 refresh commands", .pme_ucode = 0x01, }, { .pme_uname = "CH1", .pme_udesc = "DRAM Channel 1 refresh commands", .pme_ucode = 0x02, }, { .pme_uname = "CH2", .pme_udesc = "DRAM Channel 2 refresh commands", .pme_ucode = 0x04, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_DRAM_WRITE_CAS", .pme_desc = "DRAM Channel 0 write CAS commands", .pme_code = 0x64, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "DRAM Channel 0 write CAS commands", .pme_ucode = 0x01, }, { .pme_uname 
= "AUTOPRE_CH0", .pme_udesc = "DRAM Channel 0 write CAS auto page close commands", .pme_ucode = 0x02, }, { .pme_uname = "CH1", .pme_udesc = "DRAM Channel 1 write CAS commands", .pme_ucode = 0x04, }, { .pme_uname = "AUTOPRE_CH1", .pme_udesc = "DRAM Channel 1 write CAS auto page close commands", .pme_ucode = 0x08, }, { .pme_uname = "CH2", .pme_udesc = "DRAM Channel 2 write CAS commands", .pme_ucode = 0x10, }, { .pme_uname = "AUTOPRE_CH2", .pme_udesc = "DRAM Channel 2 write CAS auto page close commands", .pme_ucode = 0x20, }, }, .pme_numasks = 6 }, { .pme_name = "UNC_GQ_ALLOC", .pme_desc = "GQ read tracker requests", .pme_code = 0x03, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "READ_TRACKER", .pme_udesc = "GQ read tracker requests", .pme_ucode = 0x01, }, { .pme_uname = "RT_LLC_MISS", .pme_udesc = "GQ read tracker LLC misses", .pme_ucode = 0x02, }, { .pme_uname = "RT_TO_LLC_RESP", .pme_udesc = "GQ read tracker LLC requests", .pme_ucode = 0x04, }, { .pme_uname = "RT_TO_RTID_ACQUIRED", .pme_udesc = "GQ read tracker LLC miss to RTID acquired", .pme_ucode = 0x08, }, { .pme_uname = "WT_TO_RTID_ACQUIRED", .pme_udesc = "GQ write tracker LLC miss to RTID acquired", .pme_ucode = 0x10, }, { .pme_uname = "WRITE_TRACKER", .pme_udesc = "GQ write tracker LLC misses", .pme_ucode = 0x20, }, { .pme_uname = "PEER_PROBE_TRACKER", .pme_udesc = "GQ peer probe tracker requests", .pme_ucode = 0x40, }, }, .pme_numasks = 7 }, { .pme_name = "UNC_GQ_CYCLES_FULL", .pme_desc = "Cycles GQ read tracker is full.", .pme_code = 0x00, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "READ_TRACKER", .pme_udesc = "Cycles GQ read tracker is full.", .pme_ucode = 0x01, }, { .pme_uname = "WRITE_TRACKER", .pme_udesc = "Cycles GQ write tracker is full.", .pme_ucode = 0x02, }, { .pme_uname = "PEER_PROBE_TRACKER", .pme_udesc = "Cycles GQ peer probe tracker is full.", .pme_ucode = 0x04, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_GQ_CYCLES_NOT_EMPTY", .pme_desc = "Cycles GQ read 
tracker is busy", .pme_code = 0x01, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "READ_TRACKER", .pme_udesc = "Cycles GQ read tracker is busy", .pme_ucode = 0x01, }, { .pme_uname = "WRITE_TRACKER", .pme_udesc = "Cycles GQ write tracker is busy", .pme_ucode = 0x02, }, { .pme_uname = "PEER_PROBE_TRACKER", .pme_udesc = "Cycles GQ peer probe tracker is busy", .pme_ucode = 0x04, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_GQ_DATA_FROM", .pme_desc = "Cycles GQ data is imported", .pme_code = 0x04, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "QPI", .pme_udesc = "Cycles GQ data is imported from Quickpath interface", .pme_ucode = 0x01, }, { .pme_uname = "QMC", .pme_udesc = "Cycles GQ data is imported from Quickpath memory interface", .pme_ucode = 0x02, }, { .pme_uname = "LLC", .pme_udesc = "Cycles GQ data is imported from LLC", .pme_ucode = 0x04, }, { .pme_uname = "CORES_02", .pme_udesc = "Cycles GQ data is imported from Cores 0 and 2", .pme_ucode = 0x08, }, { .pme_uname = "CORES_13", .pme_udesc = "Cycles GQ data is imported from Cores 1 and 3", .pme_ucode = 0x10, }, }, .pme_numasks = 5 }, { .pme_name = "UNC_GQ_DATA_TO", .pme_desc = "Cycles GQ data is exported", .pme_code = 0x05, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "QPI_QMC", .pme_udesc = "Cycles GQ data sent to the QPI or QMC", .pme_ucode = 0x01, }, { .pme_uname = "LLC", .pme_udesc = "Cycles GQ data sent to LLC", .pme_ucode = 0x02, }, { .pme_uname = "CORES", .pme_udesc = "Cycles GQ data sent to cores", .pme_ucode = 0x04, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_LLC_HITS", .pme_desc = "Number of LLC read hits", .pme_code = 0x08, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "READ", .pme_udesc = "Number of LLC read hits", .pme_ucode = 0x01, }, { .pme_uname = "WRITE", .pme_udesc = "Number of LLC write hits", .pme_ucode = 0x02, }, { .pme_uname = "PROBE", .pme_udesc = "Number of LLC peer probe hits", .pme_ucode = 0x04, }, { .pme_uname = "ANY", .pme_udesc = 
"Number of LLC hits", .pme_ucode = 0x03, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_LLC_LINES_IN", .pme_desc = "LLC lines allocated in M state", .pme_code = 0x0A, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "M_STATE", .pme_udesc = "LLC lines allocated in M state", .pme_ucode = 0x01, }, { .pme_uname = "E_STATE", .pme_udesc = "LLC lines allocated in E state", .pme_ucode = 0x02, }, { .pme_uname = "S_STATE", .pme_udesc = "LLC lines allocated in S state", .pme_ucode = 0x04, }, { .pme_uname = "F_STATE", .pme_udesc = "LLC lines allocated in F state", .pme_ucode = 0x08, }, { .pme_uname = "ANY", .pme_udesc = "LLC lines allocated", .pme_ucode = 0x0F, }, }, .pme_numasks = 5 }, { .pme_name = "UNC_LLC_LINES_OUT", .pme_desc = "LLC lines victimized in M state", .pme_code = 0x0B, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "M_STATE", .pme_udesc = "LLC lines victimized in M state", .pme_ucode = 0x01, }, { .pme_uname = "E_STATE", .pme_udesc = "LLC lines victimized in E state", .pme_ucode = 0x02, }, { .pme_uname = "S_STATE", .pme_udesc = "LLC lines victimized in S state", .pme_ucode = 0x04, }, { .pme_uname = "I_STATE", .pme_udesc = "LLC lines victimized in I state", .pme_ucode = 0x08, }, { .pme_uname = "F_STATE", .pme_udesc = "LLC lines victimized in F state", .pme_ucode = 0x10, }, { .pme_uname = "ANY", .pme_udesc = "LLC lines victimized", .pme_ucode = 0x1F, }, }, .pme_numasks = 6 }, { .pme_name = "UNC_LLC_MISS", .pme_desc = "Number of LLC read misses", .pme_code = 0x09, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "READ", .pme_udesc = "Number of LLC read misses", .pme_ucode = 0x01, }, { .pme_uname = "WRITE", .pme_udesc = "Number of LLC write misses", .pme_ucode = 0x02, }, { .pme_uname = "PROBE", .pme_udesc = "Number of LLC peer probe misses", .pme_ucode = 0x04, }, { .pme_uname = "ANY", .pme_udesc = "Number of LLC misses", .pme_ucode = 0x03, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_QHL_ADDRESS_CONFLICTS", .pme_desc = "QHL 2 way 
address conflicts", .pme_code = 0x24, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "2WAY", .pme_udesc = "QHL 2 way address conflicts", .pme_ucode = 0x02, }, { .pme_uname = "3WAY", .pme_udesc = "QHL 3 way address conflicts", .pme_ucode = 0x04, }, }, .pme_numasks = 2 }, { .pme_name = "UNC_QHL_CONFLICT_CYCLES", .pme_desc = "QHL IOH Tracker conflict cycles", .pme_code = 0x25, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "IOH", .pme_udesc = "QHL IOH Tracker conflict cycles", .pme_ucode = 0x01, }, { .pme_uname = "REMOTE", .pme_udesc = "QHL Remote Tracker conflict cycles", .pme_ucode = 0x02, }, { .pme_uname = "LOCAL", .pme_udesc = "QHL Local Tracker conflict cycles", .pme_ucode = 0x04, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_QHL_CYCLES_FULL", .pme_desc = "Cycles QHL Remote Tracker is full", .pme_code = 0x21, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "REMOTE", .pme_udesc = "Cycles QHL Remote Tracker is full", .pme_ucode = 0x02, }, { .pme_uname = "LOCAL", .pme_udesc = "Cycles QHL Local Tracker is full", .pme_ucode = 0x04, }, { .pme_uname = "IOH", .pme_udesc = "Cycles QHL IOH Tracker is full", .pme_ucode = 0x01, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_QHL_CYCLES_NOT_EMPTY", .pme_desc = "Cycles QHL Tracker is not empty", .pme_code = 0x22, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "IOH", .pme_udesc = "Cycles QHL IOH is busy", .pme_ucode = 0x01, }, { .pme_uname = "REMOTE", .pme_udesc = "Cycles QHL Remote Tracker is busy", .pme_ucode = 0x02, }, { .pme_uname = "LOCAL", .pme_udesc = "Cycles QHL Local Tracker is busy", .pme_ucode = 0x04, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_QHL_FRC_ACK_CNFLTS", .pme_desc = "QHL FrcAckCnflts sent to local home", .pme_code = 0x33, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "LOCAL", .pme_udesc = "QHL FrcAckCnflts sent to local home", .pme_ucode = 0x04, }, }, .pme_numasks = 1 }, { .pme_name = "UNC_QHL_SLEEPS", .pme_desc = "number of occurrences a 
request was put to sleep", .pme_code = 0x34, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "IOH_ORDER", .pme_udesc = "due to IOH ordering (write after read) conflicts", .pme_ucode = 0x01, }, { .pme_uname = "REMOTE_ORDER", .pme_udesc = "due to remote socket ordering (write after read) conflicts", .pme_ucode = 0x02, }, { .pme_uname = "LOCAL_ORDER", .pme_udesc = "due to local socket ordering (write after read) conflicts", .pme_ucode = 0x04, }, { .pme_uname = "IOH_CONFLICT", .pme_udesc = "due to IOH address conflicts", .pme_ucode = 0x08, }, { .pme_uname = "REMOTE_CONFLICT", .pme_udesc = "due to remote socket address conflicts", .pme_ucode = 0x10, }, { .pme_uname = "LOCAL_CONFLICT", .pme_udesc = "due to local socket address conflicts", .pme_ucode = 0x20, }, }, .pme_numasks = 6 }, { .pme_name = "UNC_QHL_OCCUPANCY", .pme_desc = "Cycles QHL Tracker Allocate to Deallocate Read Occupancy", .pme_code = 0x23, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "IOH", .pme_udesc = "Cycles QHL IOH Tracker Allocate to Deallocate Read Occupancy", .pme_ucode = 0x01, }, { .pme_uname = "REMOTE", .pme_udesc = "Cycles QHL Remote Tracker Allocate to Deallocate Read Occupancy", .pme_ucode = 0x02, }, { .pme_uname = "LOCAL", .pme_udesc = "Cycles QHL Local Tracker Allocate to Deallocate Read Occupancy", .pme_ucode = 0x04, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_QHL_REQUESTS", .pme_desc = "Quickpath Home Logic local read requests", .pme_code = 0x20, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "LOCAL_READS", .pme_udesc = "Quickpath Home Logic local read requests", .pme_ucode = 0x10, }, { .pme_uname = "LOCAL_WRITES", .pme_udesc = "Quickpath Home Logic local write requests", .pme_ucode = 0x20, }, { .pme_uname = "REMOTE_READS", .pme_udesc = "Quickpath Home Logic remote read requests", .pme_ucode = 0x04, }, { .pme_uname = "IOH_READS", .pme_udesc = "Quickpath Home Logic IOH read requests", .pme_ucode = 0x01, }, { .pme_uname = "IOH_WRITES", .pme_udesc = 
"Quickpath Home Logic IOH write requests", .pme_ucode = 0x02, }, { .pme_uname = "REMOTE_WRITES", .pme_udesc = "Quickpath Home Logic remote write requests", .pme_ucode = 0x08, }, }, .pme_numasks = 6 }, { .pme_name = "UNC_QHL_TO_QMC_BYPASS", .pme_desc = "Number of requests to QMC that bypass QHL", .pme_code = 0x0126, .pme_flags = PFMLIB_NHM_UNC, }, { .pme_name = "UNC_QMC_BUSY", .pme_desc = "Cycles QMC busy with a read request", .pme_code = 0x29, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "READ_CH0", .pme_udesc = "Cycles QMC channel 0 busy with a read request", .pme_ucode = 0x01, }, { .pme_uname = "READ_CH1", .pme_udesc = "Cycles QMC channel 1 busy with a read request", .pme_ucode = 0x02, }, { .pme_uname = "READ_CH2", .pme_udesc = "Cycles QMC channel 2 busy with a read request", .pme_ucode = 0x04, }, { .pme_uname = "WRITE_CH0", .pme_udesc = "Cycles QMC channel 0 busy with a write request", .pme_ucode = 0x08, }, { .pme_uname = "WRITE_CH1", .pme_udesc = "Cycles QMC channel 1 busy with a write request", .pme_ucode = 0x10, }, { .pme_uname = "WRITE_CH2", .pme_udesc = "Cycles QMC channel 2 busy with a write request", .pme_ucode = 0x20, }, }, .pme_numasks = 6 }, { .pme_name = "UNC_QMC_CANCEL", .pme_desc = "QMC cancels", .pme_code = 0x30, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "QMC channel 0 cancels", .pme_ucode = 0x01, }, { .pme_uname = "CH1", .pme_udesc = "QMC channel 1 cancels", .pme_ucode = 0x02, }, { .pme_uname = "CH2", .pme_udesc = "QMC channel 2 cancels", .pme_ucode = 0x04, }, { .pme_uname = "ANY", .pme_udesc = "QMC cancels", .pme_ucode = 0x07, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_QMC_CRITICAL_PRIORITY_READS", .pme_desc = "QMC critical priority read requests", .pme_code = 0x2E, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "QMC channel 0 critical priority read requests", .pme_ucode = 0x01, }, { .pme_uname = "CH1", .pme_udesc = "QMC channel 1 critical priority read 
requests", .pme_ucode = 0x02, }, { .pme_uname = "CH2", .pme_udesc = "QMC channel 2 critical priority read requests", .pme_ucode = 0x04, }, { .pme_uname = "ANY", .pme_udesc = "QMC critical priority read requests", .pme_ucode = 0x07, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_QMC_HIGH_PRIORITY_READS", .pme_desc = "QMC high priority read requests", .pme_code = 0x2D, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "QMC channel 0 high priority read requests", .pme_ucode = 0x01, }, { .pme_uname = "CH1", .pme_udesc = "QMC channel 1 high priority read requests", .pme_ucode = 0x02, }, { .pme_uname = "CH2", .pme_udesc = "QMC channel 2 high priority read requests", .pme_ucode = 0x04, }, { .pme_uname = "ANY", .pme_udesc = "QMC high priority read requests", .pme_ucode = 0x07, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_QMC_ISOC_FULL", .pme_desc = "Cycles DRAM full with isochronous (ISOC) read requests", .pme_code = 0x28, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "READ_CH0", .pme_udesc = "Cycles DRAM channel 0 full with isochronous read requests", .pme_ucode = 0x01, }, { .pme_uname = "READ_CH1", .pme_udesc = "Cycles DRAM channel 1 full with isochronous read requests", .pme_ucode = 0x02, }, { .pme_uname = "READ_CH2", .pme_udesc = "Cycles DRAM channel 2 full with isochronous read requests", .pme_ucode = 0x04, }, { .pme_uname = "WRITE_CH0", .pme_udesc = "Cycles DRAM channel 0 full with isochronous write requests", .pme_ucode = 0x08, }, { .pme_uname = "WRITE_CH1", .pme_udesc = "Cycles DRAM channel 1 full with isochronous write requests", .pme_ucode = 0x10, }, { .pme_uname = "WRITE_CH2", .pme_udesc = "Cycles DRAM channel 2 full with isochronous write requests", .pme_ucode = 0x20, }, }, .pme_numasks = 6 }, { .pme_name = "UNC_IMC_ISOC_OCCUPANCY", .pme_desc = "IMC isochronous (ISOC) Read Occupancy", .pme_code = 0x2B, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "IMC channel 0 isochronous read request 
occupancy", .pme_ucode = 0x01, }, { .pme_uname = "CH1", .pme_udesc = "IMC channel 1 isochronous read request occupancy", .pme_ucode = 0x02, }, { .pme_uname = "CH2", .pme_udesc = "IMC channel 2 isochronous read request occupancy", .pme_ucode = 0x04, }, { .pme_uname = "ANY", .pme_udesc = "IMC isochronous read request occupancy", .pme_ucode = 0x07, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_QMC_NORMAL_READS", .pme_desc = "QMC normal read requests", .pme_code = 0x2C, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "QMC channel 0 normal read requests", .pme_ucode = 0x01, }, { .pme_uname = "CH1", .pme_udesc = "QMC channel 1 normal read requests", .pme_ucode = 0x02, }, { .pme_uname = "CH2", .pme_udesc = "QMC channel 2 normal read requests", .pme_ucode = 0x04, }, { .pme_uname = "ANY", .pme_udesc = "QMC normal read requests", .pme_ucode = 0x07, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_QMC_OCCUPANCY", .pme_desc = "QMC Occupancy", .pme_code = 0x2A, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "IMC channel 0 normal read request occupancy", .pme_ucode = 0x01, }, { .pme_uname = "CH1", .pme_udesc = "IMC channel 1 normal read request occupancy", .pme_ucode = 0x02, }, { .pme_uname = "CH2", .pme_udesc = "IMC channel 2 normal read request occupancy", .pme_ucode = 0x04, }, }, .pme_numasks = 3 }, { .pme_name = "UNC_QMC_PRIORITY_UPDATES", .pme_desc = "QMC priority updates", .pme_code = 0x31, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "QMC channel 0 priority updates", .pme_ucode = 0x01, }, { .pme_uname = "CH1", .pme_udesc = "QMC channel 1 priority updates", .pme_ucode = 0x02, }, { .pme_uname = "CH2", .pme_udesc = "QMC channel 2 priority updates", .pme_ucode = 0x04, }, { .pme_uname = "ANY", .pme_udesc = "QMC priority updates", .pme_ucode = 0x07, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_IMC_RETRY", .pme_desc = "Number of IMC DRAM channel retries (retries occur in RAS mode 
only)", .pme_code = 0x32, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CH0", .pme_udesc = "channel 0", .pme_ucode = 0x01, }, { .pme_uname = "CH1", .pme_udesc = "channel 1", .pme_ucode = 0x02, }, { .pme_uname = "CH2", .pme_udesc = "channel 2", .pme_ucode = 0x04, }, { .pme_uname = "ANY", .pme_udesc = "any channel", .pme_ucode = 0x07, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_QMC_WRITES", .pme_desc = "QMC cache line writes", .pme_code = 0x2F, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "FULL_CH0", .pme_udesc = "QMC channel 0 full cache line writes", .pme_ucode = 0x01, }, { .pme_uname = "FULL_CH1", .pme_udesc = "QMC channel 1 full cache line writes", .pme_ucode = 0x02, }, { .pme_uname = "FULL_CH2", .pme_udesc = "QMC channel 2 full cache line writes", .pme_ucode = 0x04, }, { .pme_uname = "FULL_ANY", .pme_udesc = "QMC full cache line writes", .pme_ucode = 0x07, }, { .pme_uname = "PARTIAL_CH0", .pme_udesc = "QMC channel 0 partial cache line writes", .pme_ucode = 0x08, }, { .pme_uname = "PARTIAL_CH1", .pme_udesc = "QMC channel 1 partial cache line writes", .pme_ucode = 0x10, }, { .pme_uname = "PARTIAL_CH2", .pme_udesc = "QMC channel 2 partial cache line writes", .pme_ucode = 0x20, }, { .pme_uname = "PARTIAL_ANY", .pme_udesc = "QMC partial cache line writes", .pme_ucode = 0x38, }, }, .pme_numasks = 8 }, { .pme_name = "UNC_QPI_RX_NO_PPT_CREDIT", .pme_desc = "Link 0 snoop stalls due to no PPT entry", .pme_code = 0x43, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "STALLS_LINK_0", .pme_udesc = "Link 0 snoop stalls due to no PPT entry", .pme_ucode = 0x01, }, { .pme_uname = "STALLS_LINK_1", .pme_udesc = "Link 1 snoop stalls due to no PPT entry", .pme_ucode = 0x02, }, }, .pme_numasks = 2 }, { .pme_name = "UNC_QPI_TX_HEADER", .pme_desc = "Cycles link 0 outbound header busy", .pme_code = 0x42, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "BUSY_LINK_0", .pme_udesc = "Cycles link 0 outbound header busy", .pme_ucode = 
0x02, }, { .pme_uname = "BUSY_LINK_1", .pme_udesc = "Cycles link 1 outbound header busy", .pme_ucode = 0x08, }, }, .pme_numasks = 2 }, { .pme_name = "UNC_QPI_TX_STALLED_MULTI_FLIT", .pme_desc = "Cycles QPI outbound stalls", .pme_code = 0x41, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "DRS_LINK_0", .pme_udesc = "Cycles QPI outbound link 0 DRS stalled", .pme_ucode = 0x01, }, { .pme_uname = "NCB_LINK_0", .pme_udesc = "Cycles QPI outbound link 0 NCB stalled", .pme_ucode = 0x02, }, { .pme_uname = "NCS_LINK_0", .pme_udesc = "Cycles QPI outbound link 0 NCS stalled", .pme_ucode = 0x04, }, { .pme_uname = "DRS_LINK_1", .pme_udesc = "Cycles QPI outbound link 1 DRS stalled", .pme_ucode = 0x08, }, { .pme_uname = "NCB_LINK_1", .pme_udesc = "Cycles QPI outbound link 1 NCB stalled", .pme_ucode = 0x10, }, { .pme_uname = "NCS_LINK_1", .pme_udesc = "Cycles QPI outbound link 1 NCS stalled", .pme_ucode = 0x20, }, { .pme_uname = "LINK_0", .pme_udesc = "Cycles QPI outbound link 0 multi flit stalled", .pme_ucode = 0x07, }, { .pme_uname = "LINK_1", .pme_udesc = "Cycles QPI outbound link 1 multi flit stalled", .pme_ucode = 0x38, }, }, .pme_numasks = 8 }, { .pme_name = "UNC_QPI_TX_STALLED_SINGLE_FLIT", .pme_desc = "Cycles QPI outbound link stalls", .pme_code = 0x40, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "HOME_LINK_0", .pme_udesc = "Cycles QPI outbound link 0 HOME stalled", .pme_ucode = 0x01, }, { .pme_uname = "SNOOP_LINK_0", .pme_udesc = "Cycles QPI outbound link 0 SNOOP stalled", .pme_ucode = 0x02, }, { .pme_uname = "NDR_LINK_0", .pme_udesc = "Cycles QPI outbound link 0 NDR stalled", .pme_ucode = 0x04, }, { .pme_uname = "HOME_LINK_1", .pme_udesc = "Cycles QPI outbound link 1 HOME stalled", .pme_ucode = 0x08, }, { .pme_uname = "SNOOP_LINK_1", .pme_udesc = "Cycles QPI outbound link 1 SNOOP stalled", .pme_ucode = 0x10, }, { .pme_uname = "NDR_LINK_1", .pme_udesc = "Cycles QPI outbound link 1 NDR stalled", .pme_ucode = 0x20, }, { .pme_uname = "LINK_0", 
.pme_udesc = "Cycles QPI outbound link 0 single flit stalled", .pme_ucode = 0x07, }, { .pme_uname = "LINK_1", .pme_udesc = "Cycles QPI outbound link 1 single flit stalled", .pme_ucode = 0x38, }, }, .pme_numasks = 8 }, { .pme_name = "UNC_SNP_RESP_TO_LOCAL_HOME", .pme_desc = "Local home snoop response", .pme_code = 0x06, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "I_STATE", .pme_udesc = "Local home snoop response - LLC does not have cache line", .pme_ucode = 0x01, }, { .pme_uname = "S_STATE", .pme_udesc = "Local home snoop response - LLC has cache line in S state", .pme_ucode = 0x02, }, { .pme_uname = "FWD_S_STATE", .pme_udesc = "Local home snoop response - LLC forwarding cache line in S state.", .pme_ucode = 0x04, }, { .pme_uname = "FWD_I_STATE", .pme_udesc = "Local home snoop response - LLC has forwarded a modified cache line", .pme_ucode = 0x08, }, { .pme_uname = "CONFLICT", .pme_udesc = "Local home conflict snoop response", .pme_ucode = 0x10, }, { .pme_uname = "WB", .pme_udesc = "Local home snoop response - LLC has cache line in the M state", .pme_ucode = 0x20, }, }, .pme_numasks = 6 }, { .pme_name = "UNC_SNP_RESP_TO_REMOTE_HOME", .pme_desc = "Remote home snoop response", .pme_code = 0x07, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "I_STATE", .pme_udesc = "Remote home snoop response - LLC does not have cache line", .pme_ucode = 0x01, }, { .pme_uname = "S_STATE", .pme_udesc = "Remote home snoop response - LLC has cache line in S state", .pme_ucode = 0x02, }, { .pme_uname = "FWD_S_STATE", .pme_udesc = "Remote home snoop response - LLC forwarding cache line in S state.", .pme_ucode = 0x04, }, { .pme_uname = "FWD_I_STATE", .pme_udesc = "Remote home snoop response - LLC has forwarded a modified cache line", .pme_ucode = 0x08, }, { .pme_uname = "CONFLICT", .pme_udesc = "Remote home conflict snoop response", .pme_ucode = 0x10, }, { .pme_uname = "WB", .pme_udesc = "Remote home snoop response - LLC has cache line in the M state", 
.pme_ucode = 0x20, }, { .pme_uname = "HITM", .pme_udesc = "Remote home snoop response - LLC HITM", .pme_ucode = 0x24, }, }, .pme_numasks = 7 }, { .pme_name = "UNC_THERMAL_THROTTLING_TEMP", .pme_desc = "uncore cycles that the PCU records core temperature above threshold", .pme_code = 0x80, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CORE_0", .pme_udesc = "Core 0", .pme_ucode = 0x01, }, { .pme_uname = "CORE_1", .pme_udesc = "Core 1", .pme_ucode = 0x02, }, { .pme_uname = "CORE_2", .pme_udesc = "Core 2", .pme_ucode = 0x04, }, { .pme_uname = "CORE_3", .pme_udesc = "Core 3", .pme_ucode = 0x08, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_THERMAL_THROTTLED_TEMP", .pme_desc = "uncore cycles that the PCU records that core is in power throttled state due to temperature being above threshold", .pme_code = 0x81, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CORE_0", .pme_udesc = "Core 0", .pme_ucode = 0x01, }, { .pme_uname = "CORE_1", .pme_udesc = "Core 1", .pme_ucode = 0x02, }, { .pme_uname = "CORE_2", .pme_udesc = "Core 2", .pme_ucode = 0x04, }, { .pme_uname = "CORE_3", .pme_udesc = "Core 3", .pme_ucode = 0x08, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_PROCHOT_ASSERTION", .pme_desc = "Number of system assertions of PROCHOT indicating the entire processor has exceeded the thermal limit", .pme_code = 0x0182, }, { .pme_name = "UNC_THERMAL_THROTTLING_PROCHOT", .pme_desc = "uncore cycles that the PCU records that core is in power throttled state due to PROCHOT assertions", .pme_code = 0x83, .pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CORE_0", .pme_udesc = "Core 0", .pme_ucode = 0x01, }, { .pme_uname = "CORE_1", .pme_udesc = "Core 1", .pme_ucode = 0x02, }, { .pme_uname = "CORE_2", .pme_udesc = "Core 2", .pme_ucode = 0x04, }, { .pme_uname = "CORE_3", .pme_udesc = "Core 3", .pme_ucode = 0x08, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_TURBO_MODE", .pme_desc = "uncore cycles that a core is operating in turbo mode", .pme_code = 0x84,
.pme_flags = PFMLIB_NHM_UNC, .pme_umasks = { { .pme_uname = "CORE_0", .pme_udesc = "Core 0", .pme_ucode = 0x01, }, { .pme_uname = "CORE_1", .pme_udesc = "Core 1", .pme_ucode = 0x02, }, { .pme_uname = "CORE_2", .pme_udesc = "Core 2", .pme_ucode = 0x04, }, { .pme_uname = "CORE_3", .pme_udesc = "Core 3", .pme_ucode = 0x08, }, }, .pme_numasks = 4 }, { .pme_name = "UNC_CYCLES_UNHALTED_L3_FLL_ENABLE", .pme_desc = "uncore cycles where at least one core is unhalted and all L3 ways are enabled", .pme_code = 0x0285, }, { .pme_name = "UNC_CYCLES_UNHALTED_L3_FLL_DISABLE", .pme_desc = "uncore cycles where at least one core is unhalted and all L3 ways are disabled", .pme_code = 0x0186, }, }; #define PME_INTEL_WSM_UNC_CYCLE 0 #define PME_WSM_UNC_EVENT_COUNT (sizeof(intel_wsm_unc_pe)/sizeof(pme_nhm_entry_t)) papi-papi-7-2-0-t/src/libperfnec/lib/itanium2_events.h000066400000000000000000002746701502707512200226130ustar00rootroot00000000000000/* * Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ /* * This file is generated automatically * !! DO NOT CHANGE !! */ static pme_ita2_entry_t itanium2_pe []={ #define PME_ITA2_ALAT_CAPACITY_MISS_ALL 0 { "ALAT_CAPACITY_MISS_ALL", {0x30058}, 0xf0, 2, {0xf00007}, "ALAT Entry Replaced -- both integer and floating point instructions"}, #define PME_ITA2_ALAT_CAPACITY_MISS_FP 1 { "ALAT_CAPACITY_MISS_FP", {0x20058}, 0xf0, 2, {0xf00007}, "ALAT Entry Replaced -- only floating point instructions"}, #define PME_ITA2_ALAT_CAPACITY_MISS_INT 2 { "ALAT_CAPACITY_MISS_INT", {0x10058}, 0xf0, 2, {0xf00007}, "ALAT Entry Replaced -- only integer instructions"}, #define PME_ITA2_BACK_END_BUBBLE_ALL 3 { "BACK_END_BUBBLE_ALL", {0x0}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe -- Front-end, RSE, EXE, FPU/L1D stall or a pipeline flush due to an exception/branch misprediction"}, #define PME_ITA2_BACK_END_BUBBLE_FE 4 { "BACK_END_BUBBLE_FE", {0x10000}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe -- front-end"}, #define PME_ITA2_BACK_END_BUBBLE_L1D_FPU_RSE 5 { "BACK_END_BUBBLE_L1D_FPU_RSE", {0x20000}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe -- L1D_FPU or RSE."}, #define PME_ITA2_BE_BR_MISPRED_DETAIL_ANY 6 { "BE_BR_MISPRED_DETAIL_ANY", {0x61}, 0xf0, 1, {0xf00003}, "BE Branch Misprediction Detail -- any back-end (be) mispredictions"}, #define PME_ITA2_BE_BR_MISPRED_DETAIL_PFS 7 { "BE_BR_MISPRED_DETAIL_PFS", {0x30061}, 0xf0, 1, {0xf00003}, "BE Branch Misprediction Detail -- only back-end pfs mispredictions for taken branches"}, #define PME_ITA2_BE_BR_MISPRED_DETAIL_ROT 8 { "BE_BR_MISPRED_DETAIL_ROT", {0x20061}, 0xf0, 1, {0xf00003}, "BE Branch 
Misprediction Detail -- only back-end rotate mispredictions"}, #define PME_ITA2_BE_BR_MISPRED_DETAIL_STG 9 { "BE_BR_MISPRED_DETAIL_STG", {0x10061}, 0xf0, 1, {0xf00003}, "BE Branch Misprediction Detail -- only back-end stage mispredictions"}, #define PME_ITA2_BE_EXE_BUBBLE_ALL 10 { "BE_EXE_BUBBLE_ALL", {0x2}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe"}, #define PME_ITA2_BE_EXE_BUBBLE_ARCR 11 { "BE_EXE_BUBBLE_ARCR", {0x40002}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to AR or CR dependency"}, #define PME_ITA2_BE_EXE_BUBBLE_ARCR_PR_CANCEL_BANK 12 { "BE_EXE_BUBBLE_ARCR_PR_CANCEL_BANK", {0x80002}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- ARCR, PR, CANCEL or BANK_SWITCH"}, #define PME_ITA2_BE_EXE_BUBBLE_BANK_SWITCH 13 { "BE_EXE_BUBBLE_BANK_SWITCH", {0x70002}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to bank switching."}, #define PME_ITA2_BE_EXE_BUBBLE_CANCEL 14 { "BE_EXE_BUBBLE_CANCEL", {0x60002}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to a canceled load"}, #define PME_ITA2_BE_EXE_BUBBLE_FRALL 15 { "BE_EXE_BUBBLE_FRALL", {0x20002}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to FR/FR or FR/load dependency"}, #define PME_ITA2_BE_EXE_BUBBLE_GRALL 16 { "BE_EXE_BUBBLE_GRALL", {0x10002}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to GR/GR or GR/load dependency"}, #define PME_ITA2_BE_EXE_BUBBLE_GRGR 17 { "BE_EXE_BUBBLE_GRGR", {0x50002}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to GR/GR dependency"}, #define 
PME_ITA2_BE_EXE_BUBBLE_PR 18 { "BE_EXE_BUBBLE_PR", {0x30002}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to PR dependency"}, #define PME_ITA2_BE_FLUSH_BUBBLE_ALL 19 { "BE_FLUSH_BUBBLE_ALL", {0x4}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Flushes. -- Back-end was stalled due to either an exception/interruption or branch misprediction flush"}, #define PME_ITA2_BE_FLUSH_BUBBLE_BRU 20 { "BE_FLUSH_BUBBLE_BRU", {0x10004}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Flushes. -- Back-end was stalled due to a branch misprediction flush"}, #define PME_ITA2_BE_FLUSH_BUBBLE_XPN 21 { "BE_FLUSH_BUBBLE_XPN", {0x20004}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Flushes. -- Back-end was stalled due to an exception/interruption flush"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_ALL 22 { "BE_L1D_FPU_BUBBLE_ALL", {0xca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D or FPU"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_FPU 23 { "BE_L1D_FPU_BUBBLE_FPU", {0x100ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by FPU."}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D 24 { "BE_L1D_FPU_BUBBLE_L1D", {0x200ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D. 
This includes all stalls caused by the L1 pipeline (created in the L1D stage of the L1 pipeline which corresponds to the DET stage of the main pipe)."}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_DCS 25 { "BE_L1D_FPU_BUBBLE_L1D_DCS", {0x800ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to DCS requiring a stall"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_DCURECIR 26 { "BE_L1D_FPU_BUBBLE_L1D_DCURECIR", {0x400ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to DCU recirculating"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_FILLCONF 27 { "BE_L1D_FPU_BUBBLE_L1D_FILLCONF", {0x700ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due a store in conflict with a returning fill."}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_FULLSTBUF 28 { "BE_L1D_FPU_BUBBLE_L1D_FULLSTBUF", {0x300ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to store buffer being full"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_HPW 29 { "BE_L1D_FPU_BUBBLE_L1D_HPW", {0x500ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to Hardware Page Walker"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_L2BPRESS 30 { "BE_L1D_FPU_BUBBLE_L1D_L2BPRESS", {0x900ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to L2 Back Pressure"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_LDCHK 31 { "BE_L1D_FPU_BUBBLE_L1D_LDCHK", {0xc00ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to architectural ordering conflict"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_LDCONF 32 { "BE_L1D_FPU_BUBBLE_L1D_LDCONF", {0xb00ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- 
Back-end was stalled by L1D due to architectural ordering conflict"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_NAT 33 { "BE_L1D_FPU_BUBBLE_L1D_NAT", {0xd00ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to L1D data return needing recirculated NaT generation."}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_NATCONF 34 { "BE_L1D_FPU_BUBBLE_L1D_NATCONF", {0xf00ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to ld8.fill conflict with st8.spill not written to unat."}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_STBUFRECIR 35 { "BE_L1D_FPU_BUBBLE_L1D_STBUFRECIR", {0xe00ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to store buffer cancel needing recirculate."}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_TLB 36 { "BE_L1D_FPU_BUBBLE_L1D_TLB", {0xa00ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to L2DTLB to L1DTLB transfer"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_ALL 37 { "BE_LOST_BW_DUE_TO_FE_ALL", {0x72}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- count regardless of cause"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_BI 38 { "BE_LOST_BW_DUE_TO_FE_BI", {0x90072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by branch initialization stall"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_BRQ 39 { "BE_LOST_BW_DUE_TO_FE_BRQ", {0xa0072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by branch retirement queue stall"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_BR_ILOCK 40 { "BE_LOST_BW_DUE_TO_FE_BR_ILOCK", {0xc0072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. 
-- only if caused by branch interlock stall"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_BUBBLE 41 { "BE_LOST_BW_DUE_TO_FE_BUBBLE", {0xd0072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by branch resteer bubble stall"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_FEFLUSH 42 { "BE_LOST_BW_DUE_TO_FE_FEFLUSH", {0x10072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by a front-end flush"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_FILL_RECIRC 43 { "BE_LOST_BW_DUE_TO_FE_FILL_RECIRC", {0x80072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by a recirculate for a cache line fill operation"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_IBFULL 44 { "BE_LOST_BW_DUE_TO_FE_IBFULL", {0x50072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- (* meaningless for this event *)"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_IMISS 45 { "BE_LOST_BW_DUE_TO_FE_IMISS", {0x60072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by instruction cache miss stall"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_PLP 46 { "BE_LOST_BW_DUE_TO_FE_PLP", {0xb0072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by perfect loop prediction stall"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_TLBMISS 47 { "BE_LOST_BW_DUE_TO_FE_TLBMISS", {0x70072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by TLB stall"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_UNREACHED 48 { "BE_LOST_BW_DUE_TO_FE_UNREACHED", {0x40072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. 
-- only if caused by unreachable bundle"}, #define PME_ITA2_BE_RSE_BUBBLE_ALL 49 { "BE_RSE_BUBBLE_ALL", {0x1}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE"}, #define PME_ITA2_BE_RSE_BUBBLE_AR_DEP 50 { "BE_RSE_BUBBLE_AR_DEP", {0x20001}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to AR dependencies"}, #define PME_ITA2_BE_RSE_BUBBLE_BANK_SWITCH 51 { "BE_RSE_BUBBLE_BANK_SWITCH", {0x10001}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to bank switching"}, #define PME_ITA2_BE_RSE_BUBBLE_LOADRS 52 { "BE_RSE_BUBBLE_LOADRS", {0x50001}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to loadrs calculations"}, #define PME_ITA2_BE_RSE_BUBBLE_OVERFLOW 53 { "BE_RSE_BUBBLE_OVERFLOW", {0x30001}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to need to spill"}, #define PME_ITA2_BE_RSE_BUBBLE_UNDERFLOW 54 { "BE_RSE_BUBBLE_UNDERFLOW", {0x40001}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to need to fill"}, #define PME_ITA2_BRANCH_EVENT 55 { "BRANCH_EVENT", {0x111}, 0xf0, 1, {0xf00003}, "Branch Event Captured"}, #define PME_ITA2_BR_MISPRED_DETAIL_ALL_ALL_PRED 56 { "BR_MISPRED_DETAIL_ALL_ALL_PRED", {0x5b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- All branch types regardless of prediction result"}, #define PME_ITA2_BR_MISPRED_DETAIL_ALL_CORRECT_PRED 57 { "BR_MISPRED_DETAIL_ALL_CORRECT_PRED", {0x1005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- All branch types, correctly predicted branches (outcome and target)"}, #define PME_ITA2_BR_MISPRED_DETAIL_ALL_WRONG_PATH 58 { "BR_MISPRED_DETAIL_ALL_WRONG_PATH", {0x2005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- All branch types, mispredicted branches due 
to wrong branch direction"}, #define PME_ITA2_BR_MISPRED_DETAIL_ALL_WRONG_TARGET 59 { "BR_MISPRED_DETAIL_ALL_WRONG_TARGET", {0x3005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- All branch types, mispredicted branches due to wrong target for taken branches"}, #define PME_ITA2_BR_MISPRED_DETAIL_IPREL_ALL_PRED 60 { "BR_MISPRED_DETAIL_IPREL_ALL_PRED", {0x4005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only IP relative branches, regardless of prediction result"}, #define PME_ITA2_BR_MISPRED_DETAIL_IPREL_CORRECT_PRED 61 { "BR_MISPRED_DETAIL_IPREL_CORRECT_PRED", {0x5005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only IP relative branches, correctly predicted branches (outcome and target)"}, #define PME_ITA2_BR_MISPRED_DETAIL_IPREL_WRONG_PATH 62 { "BR_MISPRED_DETAIL_IPREL_WRONG_PATH", {0x6005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only IP relative branches, mispredicted branches due to wrong branch direction"}, #define PME_ITA2_BR_MISPRED_DETAIL_IPREL_WRONG_TARGET 63 { "BR_MISPRED_DETAIL_IPREL_WRONG_TARGET", {0x7005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only IP relative branches, mispredicted branches due to wrong target for taken branches"}, #define PME_ITA2_BR_MISPRED_DETAIL_NTRETIND_ALL_PRED 64 { "BR_MISPRED_DETAIL_NTRETIND_ALL_PRED", {0xc005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only non-return indirect branches, regardless of prediction result"}, #define PME_ITA2_BR_MISPRED_DETAIL_NTRETIND_CORRECT_PRED 65 { "BR_MISPRED_DETAIL_NTRETIND_CORRECT_PRED", {0xd005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only non-return indirect branches, correctly predicted branches (outcome and target)"}, #define PME_ITA2_BR_MISPRED_DETAIL_NTRETIND_WRONG_PATH 66 { "BR_MISPRED_DETAIL_NTRETIND_WRONG_PATH", {0xe005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only non-return indirect branches, mispredicted branches due to wrong branch direction"}, #define 
PME_ITA2_BR_MISPRED_DETAIL_NTRETIND_WRONG_TARGET 67 { "BR_MISPRED_DETAIL_NTRETIND_WRONG_TARGET", {0xf005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only non-return indirect branches, mispredicted branches due to wrong target for taken branches"}, #define PME_ITA2_BR_MISPRED_DETAIL_RETURN_ALL_PRED 68 { "BR_MISPRED_DETAIL_RETURN_ALL_PRED", {0x8005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only return type branches, regardless of prediction result"}, #define PME_ITA2_BR_MISPRED_DETAIL_RETURN_CORRECT_PRED 69 { "BR_MISPRED_DETAIL_RETURN_CORRECT_PRED", {0x9005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only return type branches, correctly predicted branches (outcome and target)"}, #define PME_ITA2_BR_MISPRED_DETAIL_RETURN_WRONG_PATH 70 { "BR_MISPRED_DETAIL_RETURN_WRONG_PATH", {0xa005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only return type branches, mispredicted branches due to wrong branch direction"}, #define PME_ITA2_BR_MISPRED_DETAIL_RETURN_WRONG_TARGET 71 { "BR_MISPRED_DETAIL_RETURN_WRONG_TARGET", {0xb005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only return type branches, mispredicted branches due to wrong target for taken branches"}, #define PME_ITA2_BR_MISPRED_DETAIL2_ALL_ALL_UNKNOWN_PRED 72 { "BR_MISPRED_DETAIL2_ALL_ALL_UNKNOWN_PRED", {0x68}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- All branch types, branches with unknown path prediction"}, #define PME_ITA2_BR_MISPRED_DETAIL2_ALL_UNKNOWN_PATH_CORRECT_PRED 73 { "BR_MISPRED_DETAIL2_ALL_UNKNOWN_PATH_CORRECT_PRED", {0x10068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- All branch types, branches with unknown path prediction and correctly predicted branch (outcome & target)"}, #define PME_ITA2_BR_MISPRED_DETAIL2_ALL_UNKNOWN_PATH_WRONG_PATH 74 { "BR_MISPRED_DETAIL2_ALL_UNKNOWN_PATH_WRONG_PATH", {0x20068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) 
-- All branch types, branches with unknown path prediction and wrong branch direction"}, #define PME_ITA2_BR_MISPRED_DETAIL2_IPREL_ALL_UNKNOWN_PRED 75 { "BR_MISPRED_DETAIL2_IPREL_ALL_UNKNOWN_PRED", {0x40068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only IP relative branches, branches with unknown path prediction"}, #define PME_ITA2_BR_MISPRED_DETAIL2_IPREL_UNKNOWN_PATH_CORRECT_PRED 76 { "BR_MISPRED_DETAIL2_IPREL_UNKNOWN_PATH_CORRECT_PRED", {0x50068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only IP relative branches, branches with unknown path prediction and correct predicted branch (outcome & target)"}, #define PME_ITA2_BR_MISPRED_DETAIL2_IPREL_UNKNOWN_PATH_WRONG_PATH 77 { "BR_MISPRED_DETAIL2_IPREL_UNKNOWN_PATH_WRONG_PATH", {0x60068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only IP relative branches, branches with unknown path prediction and wrong branch direction"}, #define PME_ITA2_BR_MISPRED_DETAIL2_NRETIND_ALL_UNKNOWN_PRED 78 { "BR_MISPRED_DETAIL2_NRETIND_ALL_UNKNOWN_PRED", {0xc0068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only non-return indirect branches, branches with unknown path prediction"}, #define PME_ITA2_BR_MISPRED_DETAIL2_NRETIND_UNKNOWN_PATH_CORRECT_PRED 79 { "BR_MISPRED_DETAIL2_NRETIND_UNKNOWN_PATH_CORRECT_PRED", {0xd0068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only non-return indirect branches, branches with unknown path prediction and correct predicted branch (outcome & target)"}, #define PME_ITA2_BR_MISPRED_DETAIL2_NRETIND_UNKNOWN_PATH_WRONG_PATH 80 { "BR_MISPRED_DETAIL2_NRETIND_UNKNOWN_PATH_WRONG_PATH", {0xe0068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only non-return indirect branches, branches with unknown path prediction and wrong branch direction"}, #define PME_ITA2_BR_MISPRED_DETAIL2_RETURN_ALL_UNKNOWN_PRED 81 { 
"BR_MISPRED_DETAIL2_RETURN_ALL_UNKNOWN_PRED", {0x80068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only return type branches, branches with unknown path prediction"}, #define PME_ITA2_BR_MISPRED_DETAIL2_RETURN_UNKNOWN_PATH_CORRECT_PRED 82 { "BR_MISPRED_DETAIL2_RETURN_UNKNOWN_PATH_CORRECT_PRED", {0x90068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only return type branches, branches with unknown path prediction and correct predicted branch (outcome & target)"}, #define PME_ITA2_BR_MISPRED_DETAIL2_RETURN_UNKNOWN_PATH_WRONG_PATH 83 { "BR_MISPRED_DETAIL2_RETURN_UNKNOWN_PATH_WRONG_PATH", {0xa0068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only return type branches, branches with unknown path prediction and wrong branch direction"}, #define PME_ITA2_BR_PATH_PRED_ALL_MISPRED_NOTTAKEN 84 { "BR_PATH_PRED_ALL_MISPRED_NOTTAKEN", {0x54}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- All branch types, incorrectly predicted path and not taken branch"}, #define PME_ITA2_BR_PATH_PRED_ALL_MISPRED_TAKEN 85 { "BR_PATH_PRED_ALL_MISPRED_TAKEN", {0x10054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- All branch types, incorrectly predicted path and taken branch"}, #define PME_ITA2_BR_PATH_PRED_ALL_OKPRED_NOTTAKEN 86 { "BR_PATH_PRED_ALL_OKPRED_NOTTAKEN", {0x20054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- All branch types, correctly predicted path and not taken branch"}, #define PME_ITA2_BR_PATH_PRED_ALL_OKPRED_TAKEN 87 { "BR_PATH_PRED_ALL_OKPRED_TAKEN", {0x30054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- All branch types, correctly predicted path and taken branch"}, #define PME_ITA2_BR_PATH_PRED_IPREL_MISPRED_NOTTAKEN 88 { "BR_PATH_PRED_IPREL_MISPRED_NOTTAKEN", {0x40054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only IP relative branches, incorrectly predicted path and not taken branch"}, #define 
PME_ITA2_BR_PATH_PRED_IPREL_MISPRED_TAKEN 89 { "BR_PATH_PRED_IPREL_MISPRED_TAKEN", {0x50054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only IP relative branches, incorrectly predicted path and taken branch"}, #define PME_ITA2_BR_PATH_PRED_IPREL_OKPRED_NOTTAKEN 90 { "BR_PATH_PRED_IPREL_OKPRED_NOTTAKEN", {0x60054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only IP relative branches, correctly predicted path and not taken branch"}, #define PME_ITA2_BR_PATH_PRED_IPREL_OKPRED_TAKEN 91 { "BR_PATH_PRED_IPREL_OKPRED_TAKEN", {0x70054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only IP relative branches, correctly predicted path and taken branch"}, #define PME_ITA2_BR_PATH_PRED_NRETIND_MISPRED_NOTTAKEN 92 { "BR_PATH_PRED_NRETIND_MISPRED_NOTTAKEN", {0xc0054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only non-return indirect branches, incorrectly predicted path and not taken branch"}, #define PME_ITA2_BR_PATH_PRED_NRETIND_MISPRED_TAKEN 93 { "BR_PATH_PRED_NRETIND_MISPRED_TAKEN", {0xd0054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only non-return indirect branches, incorrectly predicted path and taken branch"}, #define PME_ITA2_BR_PATH_PRED_NRETIND_OKPRED_NOTTAKEN 94 { "BR_PATH_PRED_NRETIND_OKPRED_NOTTAKEN", {0xe0054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only non-return indirect branches, correctly predicted path and not taken branch"}, #define PME_ITA2_BR_PATH_PRED_NRETIND_OKPRED_TAKEN 95 { "BR_PATH_PRED_NRETIND_OKPRED_TAKEN", {0xf0054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only non-return indirect branches, correctly predicted path and taken branch"}, #define PME_ITA2_BR_PATH_PRED_RETURN_MISPRED_NOTTAKEN 96 { "BR_PATH_PRED_RETURN_MISPRED_NOTTAKEN", {0x80054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only return type branches, incorrectly predicted path and not taken branch"}, #define PME_ITA2_BR_PATH_PRED_RETURN_MISPRED_TAKEN 97 { 
"BR_PATH_PRED_RETURN_MISPRED_TAKEN", {0x90054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only return type branches, incorrectly predicted path and taken branch"}, #define PME_ITA2_BR_PATH_PRED_RETURN_OKPRED_NOTTAKEN 98 { "BR_PATH_PRED_RETURN_OKPRED_NOTTAKEN", {0xa0054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only return type branches, correctly predicted path and not taken branch"}, #define PME_ITA2_BR_PATH_PRED_RETURN_OKPRED_TAKEN 99 { "BR_PATH_PRED_RETURN_OKPRED_TAKEN", {0xb0054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only return type branches, correctly predicted path and taken branch"}, #define PME_ITA2_BR_PATH_PRED2_ALL_UNKNOWNPRED_NOTTAKEN 100 { "BR_PATH_PRED2_ALL_UNKNOWNPRED_NOTTAKEN", {0x6a}, 0xf0, 2, {0xf00003}, "FE Branch Path Prediction Detail (Unknown pred component) -- All branch types, unknown predicted path and not taken branch (which impacts OKPRED_NOTTAKEN)"}, #define PME_ITA2_BR_PATH_PRED2_ALL_UNKNOWNPRED_TAKEN 101 { "BR_PATH_PRED2_ALL_UNKNOWNPRED_TAKEN", {0x1006a}, 0xf0, 2, {0xf00003}, "FE Branch Path Prediction Detail (Unknown pred component) -- All branch types, unknown predicted path and taken branch (which impacts MISPRED_TAKEN)"}, #define PME_ITA2_BR_PATH_PRED2_IPREL_UNKNOWNPRED_NOTTAKEN 102 { "BR_PATH_PRED2_IPREL_UNKNOWNPRED_NOTTAKEN", {0x4006a}, 0xf0, 2, {0xf00003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only IP relative branches, unknown predicted path and not taken branch (which impacts OKPRED_NOTTAKEN)"}, #define PME_ITA2_BR_PATH_PRED2_IPREL_UNKNOWNPRED_TAKEN 103 { "BR_PATH_PRED2_IPREL_UNKNOWNPRED_TAKEN", {0x5006a}, 0xf0, 2, {0xf00003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only IP relative branches, unknown predicted path and taken branch (which impacts MISPRED_TAKEN)"}, #define PME_ITA2_BR_PATH_PRED2_NRETIND_UNKNOWNPRED_NOTTAKEN 104 { "BR_PATH_PRED2_NRETIND_UNKNOWNPRED_NOTTAKEN", {0xc006a}, 0xf0, 2, {0xf00003}, "FE Branch Path 
Prediction Detail (Unknown pred component) -- Only non-return indirect branches, unknown predicted path and not taken branch (which impacts OKPRED_NOTTAKEN)"}, #define PME_ITA2_BR_PATH_PRED2_NRETIND_UNKNOWNPRED_TAKEN 105 { "BR_PATH_PRED2_NRETIND_UNKNOWNPRED_TAKEN", {0xd006a}, 0xf0, 2, {0xf00003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only non-return indirect branches, unknown predicted path and taken branch (which impacts MISPRED_TAKEN)"}, #define PME_ITA2_BR_PATH_PRED2_RETURN_UNKNOWNPRED_NOTTAKEN 106 { "BR_PATH_PRED2_RETURN_UNKNOWNPRED_NOTTAKEN", {0x8006a}, 0xf0, 2, {0xf00003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only return type branches, unknown predicted path and not taken branch (which impacts OKPRED_NOTTAKEN)"}, #define PME_ITA2_BR_PATH_PRED2_RETURN_UNKNOWNPRED_TAKEN 107 { "BR_PATH_PRED2_RETURN_UNKNOWNPRED_TAKEN", {0x9006a}, 0xf0, 2, {0xf00003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only return type branches, unknown predicted path and taken branch (which impacts MISPRED_TAKEN)"}, #define PME_ITA2_BUS_ALL_ANY 108 { "BUS_ALL_ANY", {0x30087}, 0xf0, 1, {0xf00000}, "Bus Transactions -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_ALL_IO 109 { "BUS_ALL_IO", {0x10087}, 0xf0, 1, {0xf00000}, "Bus Transactions -- non-CPU priority agents"}, #define PME_ITA2_BUS_ALL_SELF 110 { "BUS_ALL_SELF", {0x20087}, 0xf0, 1, {0xf00000}, "Bus Transactions -- local processor"}, #define PME_ITA2_BUS_BACKSNP_REQ_THIS 111 { "BUS_BACKSNP_REQ_THIS", {0x1008e}, 0xf0, 1, {0xf00000}, "Bus Back Snoop Requests -- Counts the number of bus back snoop me requests"}, #define PME_ITA2_BUS_BRQ_LIVE_REQ_HI 112 { "BUS_BRQ_LIVE_REQ_HI", {0x9c}, 0xf0, 2, {0xf00000}, "BRQ Live Requests (upper 2 bits)"}, #define PME_ITA2_BUS_BRQ_LIVE_REQ_LO 113 { "BUS_BRQ_LIVE_REQ_LO", {0x9b}, 0xf0, 7, {0xf00000}, "BRQ Live Requests (lower 3 bits)"}, #define PME_ITA2_BUS_BRQ_REQ_INSERTED 114 { "BUS_BRQ_REQ_INSERTED", {0x9d}, 0xf0, 1, 
{0xf00000}, "BRQ Requests Inserted"}, #define PME_ITA2_BUS_DATA_CYCLE 115 { "BUS_DATA_CYCLE", {0x88}, 0xf0, 1, {0xf00000}, "Valid Data Cycle on the Bus"}, #define PME_ITA2_BUS_HITM 116 { "BUS_HITM", {0x84}, 0xf0, 1, {0xf00000}, "Bus Hit Modified Line Transactions"}, #define PME_ITA2_BUS_IO_ANY 117 { "BUS_IO_ANY", {0x30090}, 0xf0, 1, {0xf00000}, "IA-32 Compatible IO Bus Transactions -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_IO_IO 118 { "BUS_IO_IO", {0x10090}, 0xf0, 1, {0xf00000}, "IA-32 Compatible IO Bus Transactions -- non-CPU priority agents"}, #define PME_ITA2_BUS_IO_SELF 119 { "BUS_IO_SELF", {0x20090}, 0xf0, 1, {0xf00000}, "IA-32 Compatible IO Bus Transactions -- local processor"}, #define PME_ITA2_BUS_IOQ_LIVE_REQ_HI 120 { "BUS_IOQ_LIVE_REQ_HI", {0x98}, 0xf0, 2, {0xf00000}, "Inorder Bus Queue Requests (upper 2 bits)"}, #define PME_ITA2_BUS_IOQ_LIVE_REQ_LO 121 { "BUS_IOQ_LIVE_REQ_LO", {0x97}, 0xf0, 3, {0xf00000}, "Inorder Bus Queue Requests (lower 2 bits)"}, #define PME_ITA2_BUS_LOCK_ANY 122 { "BUS_LOCK_ANY", {0x30093}, 0xf0, 1, {0xf00000}, "IA-32 Compatible Bus Lock Transactions -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_LOCK_SELF 123 { "BUS_LOCK_SELF", {0x20093}, 0xf0, 1, {0xf00000}, "IA-32 Compatible Bus Lock Transactions -- local processor"}, #define PME_ITA2_BUS_MEMORY_ALL_ANY 124 { "BUS_MEMORY_ALL_ANY", {0xf008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- All bus transactions from CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_MEMORY_ALL_IO 125 { "BUS_MEMORY_ALL_IO", {0xd008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- All bus transactions from non-CPU priority agents"}, #define PME_ITA2_BUS_MEMORY_ALL_SELF 126 { "BUS_MEMORY_ALL_SELF", {0xe008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- All bus transactions from local processor"}, #define PME_ITA2_BUS_MEMORY_EQ_128BYTE_ANY 127 { "BUS_MEMORY_EQ_128BYTE_ANY", {0x7008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- number of full
cache line transactions (BRL, BRIL, BWL) from CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_MEMORY_EQ_128BYTE_IO 128 { "BUS_MEMORY_EQ_128BYTE_IO", {0x5008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- number of full cache line transactions (BRL, BRIL, BWL) from non-CPU priority agents"}, #define PME_ITA2_BUS_MEMORY_EQ_128BYTE_SELF 129 { "BUS_MEMORY_EQ_128BYTE_SELF", {0x6008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- number of full cache line transactions (BRL, BRIL, BWL) from local processor"}, #define PME_ITA2_BUS_MEMORY_LT_128BYTE_ANY 130 { "BUS_MEMORY_LT_128BYTE_ANY", {0xb008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- number of less than full cache line transactions (BRP, BWP) CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_MEMORY_LT_128BYTE_IO 131 { "BUS_MEMORY_LT_128BYTE_IO", {0x9008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- number of less than full cache line transactions (BRP, BWP) from non-CPU priority agents"}, #define PME_ITA2_BUS_MEMORY_LT_128BYTE_SELF 132 { "BUS_MEMORY_LT_128BYTE_SELF", {0xa008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- number of less than full cache line transactions (BRP, BWP) local processor"}, #define PME_ITA2_BUS_MEM_READ_ALL_ANY 133 { "BUS_MEM_READ_ALL_ANY", {0xf008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- All memory read transactions from CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_MEM_READ_ALL_IO 134 { "BUS_MEM_READ_ALL_IO", {0xd008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- All memory read transactions from non-CPU priority agents"}, #define PME_ITA2_BUS_MEM_READ_ALL_SELF 135 { "BUS_MEM_READ_ALL_SELF", {0xe008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- All memory read transactions from local processor"}, #define PME_ITA2_BUS_MEM_READ_BIL_ANY 136 { "BUS_MEM_READ_BIL_ANY", {0x3008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I 
Memory RD, RD Invalidate, and BRIL -- Number of BIL 0-byte memory read invalidate transactions from CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_MEM_READ_BIL_IO 137 { "BUS_MEM_READ_BIL_IO", {0x1008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of BIL 0-byte memory read invalidate transactions from non-CPU priority agents"}, #define PME_ITA2_BUS_MEM_READ_BIL_SELF 138 { "BUS_MEM_READ_BIL_SELF", {0x2008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of BIL 0-byte memory read invalidate transactions from local processor"}, #define PME_ITA2_BUS_MEM_READ_BRIL_ANY 139 { "BUS_MEM_READ_BRIL_ANY", {0xb008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read invalidate transactions from CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_MEM_READ_BRIL_IO 140 { "BUS_MEM_READ_BRIL_IO", {0x9008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read invalidate transactions from non-CPU priority agents"}, #define PME_ITA2_BUS_MEM_READ_BRIL_SELF 141 { "BUS_MEM_READ_BRIL_SELF", {0xa008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read invalidate transactions from local processor"}, #define PME_ITA2_BUS_MEM_READ_BRL_ANY 142 { "BUS_MEM_READ_BRL_ANY", {0x7008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read transactions from CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_MEM_READ_BRL_IO 143 { "BUS_MEM_READ_BRL_IO", {0x5008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read transactions from non-CPU priority agents"}, #define PME_ITA2_BUS_MEM_READ_BRL_SELF 144 { "BUS_MEM_READ_BRL_SELF", {0x6008b}, 0xf0, 1, {0xf00000}, "Full 
Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read transactions from local processor"}, #define PME_ITA2_BUS_MEM_READ_OUT_HI 145 { "BUS_MEM_READ_OUT_HI", {0x94}, 0xf0, 2, {0xf00000}, "Outstanding Memory Read Transactions (upper 2 bits)"}, #define PME_ITA2_BUS_MEM_READ_OUT_LO 146 { "BUS_MEM_READ_OUT_LO", {0x95}, 0xf0, 7, {0xf00000}, "Outstanding Memory Read Transactions (lower 3 bits)"}, #define PME_ITA2_BUS_OOQ_LIVE_REQ_HI 147 { "BUS_OOQ_LIVE_REQ_HI", {0x9a}, 0xf0, 2, {0xf00000}, "Out-of-order Bus Queue Requests (upper 2 bits)"}, #define PME_ITA2_BUS_OOQ_LIVE_REQ_LO 148 { "BUS_OOQ_LIVE_REQ_LO", {0x99}, 0xf0, 7, {0xf00000}, "Out-of-order Bus Queue Requests (lower 3 bits)"}, #define PME_ITA2_BUS_RD_DATA_ANY 149 { "BUS_RD_DATA_ANY", {0x3008c}, 0xf0, 1, {0xf00000}, "Bus Read Data Transactions -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_RD_DATA_IO 150 { "BUS_RD_DATA_IO", {0x1008c}, 0xf0, 1, {0xf00000}, "Bus Read Data Transactions -- non-CPU priority agents"}, #define PME_ITA2_BUS_RD_DATA_SELF 151 { "BUS_RD_DATA_SELF", {0x2008c}, 0xf0, 1, {0xf00000}, "Bus Read Data Transactions -- local processor"}, #define PME_ITA2_BUS_RD_HIT 152 { "BUS_RD_HIT", {0x80}, 0xf0, 1, {0xf00000}, "Bus Read Hit Clean Non-local Cache Transactions"}, #define PME_ITA2_BUS_RD_HITM 153 { "BUS_RD_HITM", {0x81}, 0xf0, 1, {0xf00000}, "Bus Read Hit Modified Non-local Cache Transactions"}, #define PME_ITA2_BUS_RD_INVAL_ALL_HITM 154 { "BUS_RD_INVAL_ALL_HITM", {0x83}, 0xf0, 1, {0xf00000}, "Bus BRIL Burst Transaction Results in HITM"}, #define PME_ITA2_BUS_RD_INVAL_HITM 155 { "BUS_RD_INVAL_HITM", {0x82}, 0xf0, 1, {0xf00000}, "Bus BIL Transaction Results in HITM"}, #define PME_ITA2_BUS_RD_IO_ANY 156 { "BUS_RD_IO_ANY", {0x30091}, 0xf0, 1, {0xf00000}, "IA-32 Compatible IO Read Transactions -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_RD_IO_IO 157 { "BUS_RD_IO_IO", {0x10091}, 0xf0, 1, {0xf00000}, "IA-32 Compatible IO Read Transactions 
-- non-CPU priority agents"}, #define PME_ITA2_BUS_RD_IO_SELF 158 { "BUS_RD_IO_SELF", {0x20091}, 0xf0, 1, {0xf00000}, "IA-32 Compatible IO Read Transactions -- local processor"}, #define PME_ITA2_BUS_RD_PRTL_ANY 159 { "BUS_RD_PRTL_ANY", {0x3008d}, 0xf0, 1, {0xf00000}, "Bus Read Partial Transactions -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_RD_PRTL_IO 160 { "BUS_RD_PRTL_IO", {0x1008d}, 0xf0, 1, {0xf00000}, "Bus Read Partial Transactions -- non-CPU priority agents"}, #define PME_ITA2_BUS_RD_PRTL_SELF 161 { "BUS_RD_PRTL_SELF", {0x2008d}, 0xf0, 1, {0xf00000}, "Bus Read Partial Transactions -- local processor"}, #define PME_ITA2_BUS_SNOOPQ_REQ 162 { "BUS_SNOOPQ_REQ", {0x96}, 0xf0, 7, {0xf00000}, "Bus Snoop Queue Requests"}, #define PME_ITA2_BUS_SNOOPS_ANY 163 { "BUS_SNOOPS_ANY", {0x30086}, 0xf0, 1, {0xf00000}, "Bus Snoops Total -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_SNOOPS_IO 164 { "BUS_SNOOPS_IO", {0x10086}, 0xf0, 1, {0xf00000}, "Bus Snoops Total -- non-CPU priority agents"}, #define PME_ITA2_BUS_SNOOPS_SELF 165 { "BUS_SNOOPS_SELF", {0x20086}, 0xf0, 1, {0xf00000}, "Bus Snoops Total -- local processor"}, #define PME_ITA2_BUS_SNOOPS_HITM_ANY 166 { "BUS_SNOOPS_HITM_ANY", {0x30085}, 0xf0, 1, {0xf00000}, "Bus Snoops HIT Modified Cache Line -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_SNOOPS_HITM_SELF 167 { "BUS_SNOOPS_HITM_SELF", {0x20085}, 0xf0, 1, {0xf00000}, "Bus Snoops HIT Modified Cache Line -- local processor"}, #define PME_ITA2_BUS_SNOOP_STALL_CYCLES_ANY 168 { "BUS_SNOOP_STALL_CYCLES_ANY", {0x3008f}, 0xf0, 1, {0xf00000}, "Bus Snoop Stall Cycles (from any agent) -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_SNOOP_STALL_CYCLES_SELF 169 { "BUS_SNOOP_STALL_CYCLES_SELF", {0x2008f}, 0xf0, 1, {0xf00000}, "Bus Snoop Stall Cycles (from any agent) -- local processor"}, #define PME_ITA2_BUS_WR_WB_ALL_ANY 170 { "BUS_WR_WB_ALL_ANY", {0xf0092}, 0xf0, 1, {0xf00000}, "Bus Write Back Transactions -- CPU or 
non-CPU (all transactions)."}, #define PME_ITA2_BUS_WR_WB_ALL_IO 171 { "BUS_WR_WB_ALL_IO", {0xd0092}, 0xf0, 1, {0xf00000}, "Bus Write Back Transactions -- non-CPU priority agents"}, #define PME_ITA2_BUS_WR_WB_ALL_SELF 172 { "BUS_WR_WB_ALL_SELF", {0xe0092}, 0xf0, 1, {0xf00000}, "Bus Write Back Transactions -- local processor"}, #define PME_ITA2_BUS_WR_WB_CCASTOUT_ANY 173 { "BUS_WR_WB_CCASTOUT_ANY", {0xb0092}, 0xf0, 1, {0xf00000}, "Bus Write Back Transactions -- CPU or non-CPU (all transactions)/Only 0-byte transactions with write back attribute (clean cast outs) will be counted"}, #define PME_ITA2_BUS_WR_WB_CCASTOUT_SELF 174 { "BUS_WR_WB_CCASTOUT_SELF", {0xa0092}, 0xf0, 1, {0xf00000}, "Bus Write Back Transactions -- local processor/Only 0-byte transactions with write back attribute (clean cast outs) will be counted"}, #define PME_ITA2_BUS_WR_WB_EQ_128BYTE_ANY 175 { "BUS_WR_WB_EQ_128BYTE_ANY", {0x70092}, 0xf0, 1, {0xf00000}, "Bus Write Back Transactions -- CPU or non-CPU (all transactions)./Only cache line transactions with write back or write coalesce attributes will be counted."}, #define PME_ITA2_BUS_WR_WB_EQ_128BYTE_IO 176 { "BUS_WR_WB_EQ_128BYTE_IO", {0x50092}, 0xf0, 1, {0xf00000}, "Bus Write Back Transactions -- non-CPU priority agents/Only cache line transactions with write back or write coalesce attributes will be counted."}, #define PME_ITA2_BUS_WR_WB_EQ_128BYTE_SELF 177 { "BUS_WR_WB_EQ_128BYTE_SELF", {0x60092}, 0xf0, 1, {0xf00000}, "Bus Write Back Transactions -- local processor/Only cache line transactions with write back or write coalesce attributes will be counted."}, #define PME_ITA2_CPU_CPL_CHANGES 178 { "CPU_CPL_CHANGES", {0x13}, 0xf0, 1, {0xf00000}, "Privilege Level Changes"}, #define PME_ITA2_CPU_CYCLES 179 { "CPU_CYCLES", {0x12}, 0xf0, 1, {0xf00000}, "CPU Cycles"}, #define PME_ITA2_DATA_DEBUG_REGISTER_FAULT 180 { "DATA_DEBUG_REGISTER_FAULT", {0x52}, 0xf0, 1, {0xf00000}, "Fault Due to Data Debug Reg. 
Match to Load/Store Instruction"}, #define PME_ITA2_DATA_DEBUG_REGISTER_MATCHES 181 { "DATA_DEBUG_REGISTER_MATCHES", {0xc6}, 0xf0, 1, {0xf00007}, "Data Debug Register Matches Data Address of Memory Reference."}, #define PME_ITA2_DATA_EAR_ALAT 182 { "DATA_EAR_ALAT", {0x6c8}, 0xf0, 1, {0xf00007}, "Data EAR ALAT"}, #define PME_ITA2_DATA_EAR_CACHE_LAT1024 183 { "DATA_EAR_CACHE_LAT1024", {0x805c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 1024 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT128 184 { "DATA_EAR_CACHE_LAT128", {0x505c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 128 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT16 185 { "DATA_EAR_CACHE_LAT16", {0x205c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 16 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT2048 186 { "DATA_EAR_CACHE_LAT2048", {0x905c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 2048 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT256 187 { "DATA_EAR_CACHE_LAT256", {0x605c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 256 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT32 188 { "DATA_EAR_CACHE_LAT32", {0x305c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 32 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT4 189 { "DATA_EAR_CACHE_LAT4", {0x5c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 4 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT4096 190 { "DATA_EAR_CACHE_LAT4096", {0xa05c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 4096 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT512 191 { "DATA_EAR_CACHE_LAT512", {0x705c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 512 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT64 192 { "DATA_EAR_CACHE_LAT64", {0x405c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 64 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT8 193 { "DATA_EAR_CACHE_LAT8", {0x105c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 8 Cycles"}, #define PME_ITA2_DATA_EAR_EVENTS 194 { "DATA_EAR_EVENTS", {0xc8}, 0xf0, 1, {0xf00007}, "L1 Data Cache EAR Events"}, #define PME_ITA2_DATA_EAR_TLB_ALL 195 { "DATA_EAR_TLB_ALL", 
{0xe04c8}, 0xf0, 1, {0xf00007}, "Data EAR TLB -- All L1 DTLB Misses"}, #define PME_ITA2_DATA_EAR_TLB_FAULT 196 { "DATA_EAR_TLB_FAULT", {0x804c8}, 0xf0, 1, {0xf00007}, "Data EAR TLB -- DTLB Misses which produce a software fault"}, #define PME_ITA2_DATA_EAR_TLB_L2DTLB 197 { "DATA_EAR_TLB_L2DTLB", {0x204c8}, 0xf0, 1, {0xf00007}, "Data EAR TLB -- L1 DTLB Misses which hit L2 DTLB"}, #define PME_ITA2_DATA_EAR_TLB_L2DTLB_OR_FAULT 198 { "DATA_EAR_TLB_L2DTLB_OR_FAULT", {0xa04c8}, 0xf0, 1, {0xf00007}, "Data EAR TLB -- L1 DTLB Misses which hit L2 DTLB or produce a software fault"}, #define PME_ITA2_DATA_EAR_TLB_L2DTLB_OR_VHPT 199 { "DATA_EAR_TLB_L2DTLB_OR_VHPT", {0x604c8}, 0xf0, 1, {0xf00007}, "Data EAR TLB -- L1 DTLB Misses which hit L2 DTLB or VHPT"}, #define PME_ITA2_DATA_EAR_TLB_VHPT 200 { "DATA_EAR_TLB_VHPT", {0x404c8}, 0xf0, 1, {0xf00007}, "Data EAR TLB -- L1 DTLB Misses which hit VHPT"}, #define PME_ITA2_DATA_EAR_TLB_VHPT_OR_FAULT 201 { "DATA_EAR_TLB_VHPT_OR_FAULT", {0xc04c8}, 0xf0, 1, {0xf00007}, "Data EAR TLB -- L1 DTLB Misses which hit VHPT or produce a software fault"}, #define PME_ITA2_DATA_REFERENCES_SET0 202 { "DATA_REFERENCES_SET0", {0xc3}, 0xf0, 4, {0x5010007}, "Data Memory References Issued to Memory Pipeline"}, #define PME_ITA2_DATA_REFERENCES_SET1 203 { "DATA_REFERENCES_SET1", {0xc5}, 0xf0, 4, {0x5110007}, "Data Memory References Issued to Memory Pipeline"}, #define PME_ITA2_DISP_STALLED 204 { "DISP_STALLED", {0x49}, 0xf0, 1, {0xf00000}, "Number of Cycles Dispersal Stalled"}, #define PME_ITA2_DTLB_INSERTS_HPW 205 { "DTLB_INSERTS_HPW", {0xc9}, 0xf0, 4, {0xf00007}, "Hardware Page Walker Installs to DTLB"}, #define PME_ITA2_DTLB_INSERTS_HPW_RETIRED 206 { "DTLB_INSERTS_HPW_RETIRED", {0x2c}, 0xf0, 4, {0xf00007}, "VHPT Entries Inserted into DTLB by the Hardware Page Walker"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_ALL_ALL_PRED 207 { "ENCBR_MISPRED_DETAIL_ALL_ALL_PRED", {0x63}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- All encoded branches 
regardless of prediction result"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_ALL_CORRECT_PRED 208 { "ENCBR_MISPRED_DETAIL_ALL_CORRECT_PRED", {0x10063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- All encoded branches, correctly predicted branches (outcome and target)"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_ALL_WRONG_PATH 209 { "ENCBR_MISPRED_DETAIL_ALL_WRONG_PATH", {0x20063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- All encoded branches, mispredicted branches due to wrong branch direction"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_ALL_WRONG_TARGET 210 { "ENCBR_MISPRED_DETAIL_ALL_WRONG_TARGET", {0x30063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- All encoded branches, mispredicted branches due to wrong target for taken branches"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_ALL2_ALL_PRED 211 { "ENCBR_MISPRED_DETAIL_ALL2_ALL_PRED", {0xc0063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- Only non-return indirect branches, regardless of prediction result"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_ALL2_CORRECT_PRED 212 { "ENCBR_MISPRED_DETAIL_ALL2_CORRECT_PRED", {0xd0063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- Only non-return indirect branches, correctly predicted branches (outcome and target)"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_ALL2_WRONG_PATH 213 { "ENCBR_MISPRED_DETAIL_ALL2_WRONG_PATH", {0xe0063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- Only non-return indirect branches, mispredicted branches due to wrong branch direction"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_ALL2_WRONG_TARGET 214 { "ENCBR_MISPRED_DETAIL_ALL2_WRONG_TARGET", {0xf0063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- Only non-return indirect branches, mispredicted branches due to wrong target for taken branches"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_OVERSUB_ALL_PRED 215 { "ENCBR_MISPRED_DETAIL_OVERSUB_ALL_PRED", {0x80063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- 
Only return type branches, regardless of prediction result"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_OVERSUB_CORRECT_PRED 216 { "ENCBR_MISPRED_DETAIL_OVERSUB_CORRECT_PRED", {0x90063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- Only return type branches, correctly predicted branches (outcome and target)"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_OVERSUB_WRONG_PATH 217 { "ENCBR_MISPRED_DETAIL_OVERSUB_WRONG_PATH", {0xa0063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- Only return type branches, mispredicted branches due to wrong branch direction"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_OVERSUB_WRONG_TARGET 218 { "ENCBR_MISPRED_DETAIL_OVERSUB_WRONG_TARGET", {0xb0063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- Only return type branches, mispredicted branches due to wrong target for taken branches"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_ALL 219 { "EXTERN_DP_PINS_0_TO_3_ALL", {0xf009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin0 or pin1 or pin2 or pin3 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN0 220 { "EXTERN_DP_PINS_0_TO_3_PIN0", {0x1009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin0 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN1 221 { "EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN1", {0x3009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin0 or pin1 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN1_OR_PIN2 222 { "EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN1_OR_PIN2", {0x7009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin0 or pin1 or pin2 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN1_OR_PIN3 223 { "EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN1_OR_PIN3", {0xb009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin0 or pin1 or pin3 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN2 224 { "EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN2", {0x5009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin0 or pin2 assertion"}, #define
PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN2_OR_PIN3 225 { "EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN2_OR_PIN3", {0xd009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin0 or pin2 or pin3 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN3 226 { "EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN3", {0x9009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin0 or pin3 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN1 227 { "EXTERN_DP_PINS_0_TO_3_PIN1", {0x2009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin1 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN1_OR_PIN2 228 { "EXTERN_DP_PINS_0_TO_3_PIN1_OR_PIN2", {0x6009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin1 or pin2 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN1_OR_PIN2_OR_PIN3 229 { "EXTERN_DP_PINS_0_TO_3_PIN1_OR_PIN2_OR_PIN3", {0xe009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin1 or pin2 or pin3 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN1_OR_PIN3 230 { "EXTERN_DP_PINS_0_TO_3_PIN1_OR_PIN3", {0xa009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin1 or pin3 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN2 231 { "EXTERN_DP_PINS_0_TO_3_PIN2", {0x4009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin2 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN2_OR_PIN3 232 { "EXTERN_DP_PINS_0_TO_3_PIN2_OR_PIN3", {0xc009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin2 or pin3 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN3 233 { "EXTERN_DP_PINS_0_TO_3_PIN3", {0x8009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin3 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_4_TO_5_ALL 234 { "EXTERN_DP_PINS_4_TO_5_ALL", {0x3009f}, 0xf0, 1, {0xf00000}, "DP Pins 4-5 Asserted -- include pin4 or pin5 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_4_TO_5_PIN4 235 { "EXTERN_DP_PINS_4_TO_5_PIN4", {0x1009f}, 0xf0, 1, {0xf00000}, "DP Pins 4-5 Asserted -- include pin4 assertion"}, #define
PME_ITA2_EXTERN_DP_PINS_4_TO_5_PIN5 236 { "EXTERN_DP_PINS_4_TO_5_PIN5", {0x2009f}, 0xf0, 1, {0xf00000}, "DP Pins 4-5 Asserted -- include pin5 assertion"}, #define PME_ITA2_FE_BUBBLE_ALL 237 { "FE_BUBBLE_ALL", {0x71}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- count regardless of cause"}, #define PME_ITA2_FE_BUBBLE_ALLBUT_FEFLUSH_BUBBLE 238 { "FE_BUBBLE_ALLBUT_FEFLUSH_BUBBLE", {0xb0071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- ALL except FEFLUSH and BUBBLE"}, #define PME_ITA2_FE_BUBBLE_ALLBUT_IBFULL 239 { "FE_BUBBLE_ALLBUT_IBFULL", {0xc0071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- ALL except IBFULL"}, #define PME_ITA2_FE_BUBBLE_BRANCH 240 { "FE_BUBBLE_BRANCH", {0x90071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- only if caused by any of 4 branch recirculates"}, #define PME_ITA2_FE_BUBBLE_BUBBLE 241 { "FE_BUBBLE_BUBBLE", {0xd0071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- only if caused by branch bubble stall"}, #define PME_ITA2_FE_BUBBLE_FEFLUSH 242 { "FE_BUBBLE_FEFLUSH", {0x10071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- only if caused by a front-end flush"}, #define PME_ITA2_FE_BUBBLE_FILL_RECIRC 243 { "FE_BUBBLE_FILL_RECIRC", {0x80071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- only if caused by a recirculate for a cache line fill operation"}, #define PME_ITA2_FE_BUBBLE_GROUP1 244 { "FE_BUBBLE_GROUP1", {0x30071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- BUBBLE or BRANCH"}, #define PME_ITA2_FE_BUBBLE_GROUP2 245 { "FE_BUBBLE_GROUP2", {0x40071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- IMISS or TLBMISS"}, #define PME_ITA2_FE_BUBBLE_GROUP3 246 { "FE_BUBBLE_GROUP3", {0xa0071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- FILL_RECIRC or BRANCH"}, #define PME_ITA2_FE_BUBBLE_IBFULL 247 { "FE_BUBBLE_IBFULL", {0x50071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- only if caused by instruction buffer full stall"}, #define PME_ITA2_FE_BUBBLE_IMISS 248 { "FE_BUBBLE_IMISS", {0x60071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- only if caused by 
instruction cache miss stall"}, #define PME_ITA2_FE_BUBBLE_TLBMISS 249 { "FE_BUBBLE_TLBMISS", {0x70071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- only if caused by TLB stall"}, #define PME_ITA2_FE_LOST_BW_ALL 250 { "FE_LOST_BW_ALL", {0x70}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- count regardless of cause"}, #define PME_ITA2_FE_LOST_BW_BI 251 { "FE_LOST_BW_BI", {0x90070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by branch initialization stall"}, #define PME_ITA2_FE_LOST_BW_BRQ 252 { "FE_LOST_BW_BRQ", {0xa0070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by branch retirement queue stall"}, #define PME_ITA2_FE_LOST_BW_BR_ILOCK 253 { "FE_LOST_BW_BR_ILOCK", {0xc0070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by branch interlock stall"}, #define PME_ITA2_FE_LOST_BW_BUBBLE 254 { "FE_LOST_BW_BUBBLE", {0xd0070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by branch resteer bubble stall"}, #define PME_ITA2_FE_LOST_BW_FEFLUSH 255 { "FE_LOST_BW_FEFLUSH", {0x10070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by a front-end flush"}, #define PME_ITA2_FE_LOST_BW_FILL_RECIRC 256 { "FE_LOST_BW_FILL_RECIRC", {0x80070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by a recirculate for a cache line fill operation"}, #define PME_ITA2_FE_LOST_BW_IBFULL 257 { "FE_LOST_BW_IBFULL", {0x50070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by instruction buffer full stall"}, #define PME_ITA2_FE_LOST_BW_IMISS 258 { "FE_LOST_BW_IMISS", {0x60070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by instruction cache miss stall"}, #define PME_ITA2_FE_LOST_BW_PLP 259 { "FE_LOST_BW_PLP", {0xb0070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by perfect loop prediction 
stall"}, #define PME_ITA2_FE_LOST_BW_TLBMISS 260 { "FE_LOST_BW_TLBMISS", {0x70070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by TLB stall"}, #define PME_ITA2_FE_LOST_BW_UNREACHED 261 { "FE_LOST_BW_UNREACHED", {0x40070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by unreachable bundle"}, #define PME_ITA2_FP_FAILED_FCHKF 262 { "FP_FAILED_FCHKF", {0x6}, 0xf0, 1, {0xf00001}, "Failed fchkf"}, #define PME_ITA2_FP_FALSE_SIRSTALL 263 { "FP_FALSE_SIRSTALL", {0x5}, 0xf0, 1, {0xf00001}, "SIR Stall Without a Trap"}, #define PME_ITA2_FP_FLUSH_TO_ZERO 264 { "FP_FLUSH_TO_ZERO", {0xb}, 0xf0, 2, {0xf00001}, "FP Result Flushed to Zero"}, #define PME_ITA2_FP_OPS_RETIRED 265 { "FP_OPS_RETIRED", {0x9}, 0xf0, 4, {0xf00001}, "Retired FP Operations"}, #define PME_ITA2_FP_TRUE_SIRSTALL 266 { "FP_TRUE_SIRSTALL", {0x3}, 0xf0, 1, {0xf00001}, "SIR stall asserted and leads to a trap"}, #define PME_ITA2_HPW_DATA_REFERENCES 267 { "HPW_DATA_REFERENCES", {0x2d}, 0xf0, 4, {0xf00007}, "Data Memory References to VHPT"}, #define PME_ITA2_IA32_INST_RETIRED 268 { "IA32_INST_RETIRED", {0x59}, 0xf0, 2, {0xf00000}, "IA-32 Instructions Retired"}, #define PME_ITA2_IA32_ISA_TRANSITIONS 269 { "IA32_ISA_TRANSITIONS", {0x7}, 0xf0, 1, {0xf00000}, "IA-64 to/from IA-32 ISA Transitions"}, #define PME_ITA2_IA64_INST_RETIRED 270 { "IA64_INST_RETIRED", {0x8}, 0xf0, 6, {0xf00003}, "Retired IA-64 Instructions, alias to IA64_INST_RETIRED_THIS"}, #define PME_ITA2_IA64_INST_RETIRED_THIS 271 { "IA64_INST_RETIRED_THIS", {0x8}, 0xf0, 6, {0xf00003}, "Retired IA-64 Instructions -- Retired IA-64 Instructions"}, #define PME_ITA2_IA64_TAGGED_INST_RETIRED_IBRP0_PMC8 272 { "IA64_TAGGED_INST_RETIRED_IBRP0_PMC8", {0x8}, 0xf0, 6, {0xf00003}, "Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 0 and opcode matcher PMC8. 
Code executed with PSR.is=1 is included."}, #define PME_ITA2_IA64_TAGGED_INST_RETIRED_IBRP1_PMC9 273 { "IA64_TAGGED_INST_RETIRED_IBRP1_PMC9", {0x10008}, 0xf0, 6, {0xf00003}, "Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 1 and opcode matcher PMC9. Code executed with PSR.is=1 is included."}, #define PME_ITA2_IA64_TAGGED_INST_RETIRED_IBRP2_PMC8 274 { "IA64_TAGGED_INST_RETIRED_IBRP2_PMC8", {0x20008}, 0xf0, 6, {0xf00003}, "Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 2 and opcode matcher PMC8. Code executed with PSR.is=1 is not included."}, #define PME_ITA2_IA64_TAGGED_INST_RETIRED_IBRP3_PMC9 275 { "IA64_TAGGED_INST_RETIRED_IBRP3_PMC9", {0x30008}, 0xf0, 6, {0xf00003}, "Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 3 and opcode matcher PMC9. Code executed with PSR.is=1 is not included."}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_ALL 276 { "IDEAL_BE_LOST_BW_DUE_TO_FE_ALL", {0x73}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- count regardless of cause"}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_BI 277 { "IDEAL_BE_LOST_BW_DUE_TO_FE_BI", {0x90073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by branch initialization stall"}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_BRQ 278 { "IDEAL_BE_LOST_BW_DUE_TO_FE_BRQ", {0xa0073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by branch retirement queue stall"}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_BR_ILOCK 279 { "IDEAL_BE_LOST_BW_DUE_TO_FE_BR_ILOCK", {0xc0073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by branch interlock stall"}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_BUBBLE 280 { "IDEAL_BE_LOST_BW_DUE_TO_FE_BUBBLE", {0xd0073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by branch resteer bubble stall"}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_FEFLUSH 281 { 
"IDEAL_BE_LOST_BW_DUE_TO_FE_FEFLUSH", {0x10073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by a front-end flush"}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_FILL_RECIRC 282 { "IDEAL_BE_LOST_BW_DUE_TO_FE_FILL_RECIRC", {0x80073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by a recirculate for a cache line fill operation"}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_IBFULL 283 { "IDEAL_BE_LOST_BW_DUE_TO_FE_IBFULL", {0x50073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- (* meaningless for this event *)"}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_IMISS 284 { "IDEAL_BE_LOST_BW_DUE_TO_FE_IMISS", {0x60073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by instruction cache miss stall"}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_PLP 285 { "IDEAL_BE_LOST_BW_DUE_TO_FE_PLP", {0xb0073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by perfect loop prediction stall"}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_TLBMISS 286 { "IDEAL_BE_LOST_BW_DUE_TO_FE_TLBMISS", {0x70073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by TLB stall"}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_UNREACHED 287 { "IDEAL_BE_LOST_BW_DUE_TO_FE_UNREACHED", {0x40073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by unreachable bundle"}, #define PME_ITA2_INST_CHKA_LDC_ALAT_ALL 288 { "INST_CHKA_LDC_ALAT_ALL", {0x30056}, 0xf0, 2, {0xf00007}, "Retired chk.a and ld.c Instructions -- both integer and floating point instructions"}, #define PME_ITA2_INST_CHKA_LDC_ALAT_FP 289 { "INST_CHKA_LDC_ALAT_FP", {0x20056}, 0xf0, 2, {0xf00007}, "Retired chk.a and ld.c Instructions -- only floating point instructions"}, #define PME_ITA2_INST_CHKA_LDC_ALAT_INT 290 { "INST_CHKA_LDC_ALAT_INT", {0x10056}, 0xf0, 2, {0xf00007}, "Retired chk.a and ld.c Instructions -- only integer instructions"}, #define 
PME_ITA2_INST_DISPERSED 291 { "INST_DISPERSED", {0x4d}, 0xf0, 6, {0xf00001}, "Syllables Dispersed from REN to REG stage"}, #define PME_ITA2_INST_FAILED_CHKA_LDC_ALAT_ALL 292 { "INST_FAILED_CHKA_LDC_ALAT_ALL", {0x30057}, 0xf0, 1, {0xf00007}, "Failed chk.a and ld.c Instructions -- both integer and floating point instructions"}, #define PME_ITA2_INST_FAILED_CHKA_LDC_ALAT_FP 293 { "INST_FAILED_CHKA_LDC_ALAT_FP", {0x20057}, 0xf0, 1, {0xf00007}, "Failed chk.a and ld.c Instructions -- only floating point instructions"}, #define PME_ITA2_INST_FAILED_CHKA_LDC_ALAT_INT 294 { "INST_FAILED_CHKA_LDC_ALAT_INT", {0x10057}, 0xf0, 1, {0xf00007}, "Failed chk.a and ld.c Instructions -- only integer instructions"}, #define PME_ITA2_INST_FAILED_CHKS_RETIRED_ALL 295 { "INST_FAILED_CHKS_RETIRED_ALL", {0x30055}, 0xf0, 1, {0xf00000}, "Failed chk.s Instructions -- both integer and floating point instructions"}, #define PME_ITA2_INST_FAILED_CHKS_RETIRED_FP 296 { "INST_FAILED_CHKS_RETIRED_FP", {0x20055}, 0xf0, 1, {0xf00000}, "Failed chk.s Instructions -- only floating point instructions"}, #define PME_ITA2_INST_FAILED_CHKS_RETIRED_INT 297 { "INST_FAILED_CHKS_RETIRED_INT", {0x10055}, 0xf0, 1, {0xf00000}, "Failed chk.s Instructions -- only integer instructions"}, #define PME_ITA2_ISB_BUNPAIRS_IN 298 { "ISB_BUNPAIRS_IN", {0x46}, 0xf0, 1, {0xf00001}, "Bundle Pairs Written from L2 into FE"}, #define PME_ITA2_ITLB_MISSES_FETCH_ALL 299 { "ITLB_MISSES_FETCH_ALL", {0x30047}, 0xf0, 1, {0xf00001}, "ITLB Misses Demand Fetch -- All tlb misses will be counted. Note that this is not equal to sum of the L1ITLB and L2ITLB umasks because any access could be a miss in L1ITLB and L2ITLB."}, #define PME_ITA2_ITLB_MISSES_FETCH_L1ITLB 300 { "ITLB_MISSES_FETCH_L1ITLB", {0x10047}, 0xf0, 1, {0xf00001}, "ITLB Misses Demand Fetch -- All misses in L1ITLB will be counted. 
Even if L1ITLB is not updated for an access (Uncacheable/nat page/not present page/faulting/some flushed), it will be counted here."}, #define PME_ITA2_ITLB_MISSES_FETCH_L2ITLB 301 { "ITLB_MISSES_FETCH_L2ITLB", {0x20047}, 0xf0, 1, {0xf00001}, "ITLB Misses Demand Fetch -- All misses in L1ITLB which also missed in L2ITLB will be counted."}, #define PME_ITA2_L1DTLB_TRANSFER 302 { "L1DTLB_TRANSFER", {0xc0}, 0xf0, 1, {0x5010007}, "L1DTLB Misses That Hit in the L2DTLB for Accesses Counted in L1D_READS"}, #define PME_ITA2_L1D_READS_SET0 303 { "L1D_READS_SET0", {0xc2}, 0xf0, 2, {0x5010007}, "L1 Data Cache Reads"}, #define PME_ITA2_L1D_READS_SET1 304 { "L1D_READS_SET1", {0xc4}, 0xf0, 2, {0x5110007}, "L1 Data Cache Reads"}, #define PME_ITA2_L1D_READ_MISSES_ALL 305 { "L1D_READ_MISSES_ALL", {0xc7}, 0xf0, 2, {0x5110007}, "L1 Data Cache Read Misses -- all L1D read misses will be counted."}, #define PME_ITA2_L1D_READ_MISSES_RSE_FILL 306 { "L1D_READ_MISSES_RSE_FILL", {0x100c7}, 0xf0, 2, {0x5110007}, "L1 Data Cache Read Misses -- only L1D read misses caused by RSE fills will be counted"}, #define PME_ITA2_L1ITLB_INSERTS_HPW 307 { "L1ITLB_INSERTS_HPW", {0x48}, 0xf0, 1, {0xf00001}, "L1ITLB Hardware Page Walker Inserts"}, #define PME_ITA2_L1I_EAR_CACHE_LAT0 308 { "L1I_EAR_CACHE_LAT0", {0x400343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- > 0 Cycles (All L1 Misses)"}, #define PME_ITA2_L1I_EAR_CACHE_LAT1024 309 { "L1I_EAR_CACHE_LAT1024", {0xc00343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- >= 1024 Cycles"}, #define PME_ITA2_L1I_EAR_CACHE_LAT128 310 { "L1I_EAR_CACHE_LAT128", {0xf00343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- >= 128 Cycles"}, #define PME_ITA2_L1I_EAR_CACHE_LAT16 311 { "L1I_EAR_CACHE_LAT16", {0xfc0343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- >= 16 Cycles"}, #define PME_ITA2_L1I_EAR_CACHE_LAT256 312 { "L1I_EAR_CACHE_LAT256", {0xe00343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- >= 256 Cycles"}, #define PME_ITA2_L1I_EAR_CACHE_LAT32 313 { "L1I_EAR_CACHE_LAT32", {0xf80343}, 0xf0, 1, 
{0xf00001}, "L1I EAR Cache -- >= 32 Cycles"}, #define PME_ITA2_L1I_EAR_CACHE_LAT4 314 { "L1I_EAR_CACHE_LAT4", {0xff0343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- >= 4 Cycles"}, #define PME_ITA2_L1I_EAR_CACHE_LAT4096 315 { "L1I_EAR_CACHE_LAT4096", {0x800343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- >= 4096 Cycles"}, #define PME_ITA2_L1I_EAR_CACHE_LAT8 316 { "L1I_EAR_CACHE_LAT8", {0xfe0343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- >= 8 Cycles"}, #define PME_ITA2_L1I_EAR_CACHE_RAB 317 { "L1I_EAR_CACHE_RAB", {0x343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- RAB HIT"}, #define PME_ITA2_L1I_EAR_EVENTS 318 { "L1I_EAR_EVENTS", {0x43}, 0xf0, 1, {0xf00001}, "Instruction EAR Events"}, #define PME_ITA2_L1I_EAR_TLB_ALL 319 { "L1I_EAR_TLB_ALL", {0x70243}, 0xf0, 1, {0xf00001}, "L1I EAR TLB -- All L1 ITLB Misses"}, #define PME_ITA2_L1I_EAR_TLB_FAULT 320 { "L1I_EAR_TLB_FAULT", {0x40243}, 0xf0, 1, {0xf00001}, "L1I EAR TLB -- ITLB Misses which produced a fault"}, #define PME_ITA2_L1I_EAR_TLB_L2TLB 321 { "L1I_EAR_TLB_L2TLB", {0x10243}, 0xf0, 1, {0xf00001}, "L1I EAR TLB -- L1 ITLB Misses which hit L2 ITLB"}, #define PME_ITA2_L1I_EAR_TLB_L2TLB_OR_FAULT 322 { "L1I_EAR_TLB_L2TLB_OR_FAULT", {0x50243}, 0xf0, 1, {0xf00001}, "L1I EAR TLB -- L1 ITLB Misses which hit L2 ITLB or produce a software fault"}, #define PME_ITA2_L1I_EAR_TLB_L2TLB_OR_VHPT 323 { "L1I_EAR_TLB_L2TLB_OR_VHPT", {0x30243}, 0xf0, 1, {0xf00001}, "L1I EAR TLB -- L1 ITLB Misses which hit L2 ITLB or VHPT"}, #define PME_ITA2_L1I_EAR_TLB_VHPT 324 { "L1I_EAR_TLB_VHPT", {0x20243}, 0xf0, 1, {0xf00001}, "L1I EAR TLB -- L1 ITLB Misses which hit VHPT"}, #define PME_ITA2_L1I_EAR_TLB_VHPT_OR_FAULT 325 { "L1I_EAR_TLB_VHPT_OR_FAULT", {0x60243}, 0xf0, 1, {0xf00001}, "L1I EAR TLB -- L1 ITLB Misses which hit VHPT or produce a software fault"}, #define PME_ITA2_L1I_FETCH_ISB_HIT 326 { "L1I_FETCH_ISB_HIT", {0x66}, 0xf0, 1, {0xf00001}, "\"Just-In-Time\" Instruction Fetch Hitting in and Being Bypassed from ISB"}, #define PME_ITA2_L1I_FETCH_RAB_HIT 327 { 
"L1I_FETCH_RAB_HIT", {0x65}, 0xf0, 1, {0xf00001}, "Instruction Fetch Hitting in RAB"}, #define PME_ITA2_L1I_FILLS 328 { "L1I_FILLS", {0x41}, 0xf0, 1, {0xf00001}, "L1 Instruction Cache Fills"}, #define PME_ITA2_L1I_PREFETCHES 329 { "L1I_PREFETCHES", {0x44}, 0xf0, 1, {0xf00001}, "L1 Instruction Prefetch Requests"}, #define PME_ITA2_L1I_PREFETCH_STALL_ALL 330 { "L1I_PREFETCH_STALL_ALL", {0x30067}, 0xf0, 1, {0xf00000}, "Prefetch Pipeline Stalls -- Number of clocks prefetch pipeline is stalled"}, #define PME_ITA2_L1I_PREFETCH_STALL_FLOW 331 { "L1I_PREFETCH_STALL_FLOW", {0x20067}, 0xf0, 1, {0xf00000}, "Prefetch Pipeline Stalls -- Number of clocks flow is not asserted"}, #define PME_ITA2_L1I_PURGE 332 { "L1I_PURGE", {0x4b}, 0xf0, 1, {0xf00001}, "L1ITLB Purges Handled by L1I"}, #define PME_ITA2_L1I_PVAB_OVERFLOW 333 { "L1I_PVAB_OVERFLOW", {0x69}, 0xf0, 1, {0xf00000}, "PVAB Overflow"}, #define PME_ITA2_L1I_RAB_ALMOST_FULL 334 { "L1I_RAB_ALMOST_FULL", {0x64}, 0xf0, 1, {0xf00000}, "Is RAB Almost Full?"}, #define PME_ITA2_L1I_RAB_FULL 335 { "L1I_RAB_FULL", {0x60}, 0xf0, 1, {0xf00000}, "Is RAB Full?"}, #define PME_ITA2_L1I_READS 336 { "L1I_READS", {0x40}, 0xf0, 1, {0xf00001}, "L1 Instruction Cache Reads"}, #define PME_ITA2_L1I_SNOOP 337 { "L1I_SNOOP", {0x4a}, 0xf0, 1, {0xf00007}, "Snoop Requests Handled by L1I"}, #define PME_ITA2_L1I_STRM_PREFETCHES 338 { "L1I_STRM_PREFETCHES", {0x5f}, 0xf0, 1, {0xf00001}, "L1 Instruction Cache Line Prefetch Requests"}, #define PME_ITA2_L2DTLB_MISSES 339 { "L2DTLB_MISSES", {0xc1}, 0xf0, 4, {0x5010007}, "L2DTLB Misses"}, #define PME_ITA2_L2_BAD_LINES_SELECTED_ANY 340 { "L2_BAD_LINES_SELECTED_ANY", {0xb9}, 0xf0, 4, {0x4320007}, "Valid Line Replaced When Invalid Line Is Available -- Valid line replaced when invalid line is available"}, #define PME_ITA2_L2_BYPASS_L2_DATA1 341 { "L2_BYPASS_L2_DATA1", {0xb8}, 0xf0, 1, {0x4320007}, "Count L2 Bypasses -- Count only L2 data bypasses (L1D to L2A)"}, #define PME_ITA2_L2_BYPASS_L2_DATA2 342 { 
"L2_BYPASS_L2_DATA2", {0x100b8}, 0xf0, 1, {0x4320007}, "Count L2 Bypasses -- Count only L2 data bypasses (L1W to L2I)"}, #define PME_ITA2_L2_BYPASS_L2_INST1 343 { "L2_BYPASS_L2_INST1", {0x400b8}, 0xf0, 1, {0x4320007}, "Count L2 Bypasses -- Count only L2 instruction bypasses (L1D to L2A)"}, #define PME_ITA2_L2_BYPASS_L2_INST2 344 { "L2_BYPASS_L2_INST2", {0x500b8}, 0xf0, 1, {0x4320007}, "Count L2 Bypasses -- Count only L2 instruction bypasses (L1W to L2I)"}, #define PME_ITA2_L2_BYPASS_L3_DATA1 345 { "L2_BYPASS_L3_DATA1", {0x200b8}, 0xf0, 1, {0x4320007}, "Count L2 Bypasses -- Count only L3 data bypasses (L1D to L2A)"}, #define PME_ITA2_L2_BYPASS_L3_INST1 346 { "L2_BYPASS_L3_INST1", {0x600b8}, 0xf0, 1, {0x4320007}, "Count L2 Bypasses -- Count only L3 instruction bypasses (L1D to L2A)"}, #define PME_ITA2_L2_DATA_REFERENCES_L2_ALL 347 { "L2_DATA_REFERENCES_L2_ALL", {0x300b2}, 0xf0, 4, {0x4120007}, "Data Read/Write Access to L2 -- count both read and write operations (semaphores will count as 2)"}, #define PME_ITA2_L2_DATA_REFERENCES_L2_DATA_READS 348 { "L2_DATA_REFERENCES_L2_DATA_READS", {0x100b2}, 0xf0, 4, {0x4120007}, "Data Read/Write Access to L2 -- count only data read and semaphore operations."}, #define PME_ITA2_L2_DATA_REFERENCES_L2_DATA_WRITES 349 { "L2_DATA_REFERENCES_L2_DATA_WRITES", {0x200b2}, 0xf0, 4, {0x4120007}, "Data Read/Write Access to L2 -- count only data write and semaphore operations"}, #define PME_ITA2_L2_FILLB_FULL_THIS 350 { "L2_FILLB_FULL_THIS", {0xbf}, 0xf0, 1, {0x4520000}, "L2D Fill Buffer Is Full -- L2 Fill buffer is full"}, #define PME_ITA2_L2_FORCE_RECIRC_ANY 351 { "L2_FORCE_RECIRC_ANY", {0xb4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count forced recirculates regardless of cause. 
SMC_HIT, TRAN_PREF & SNP_OR_L3 will not be included here."}, #define PME_ITA2_L2_FORCE_RECIRC_FILL_HIT 352 { "L2_FORCE_RECIRC_FILL_HIT", {0x900b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count only those caused by an L2 miss which hit in the fill buffer."}, #define PME_ITA2_L2_FORCE_RECIRC_FRC_RECIRC 353 { "L2_FORCE_RECIRC_FRC_RECIRC", {0xe00b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- caused by an L2 miss when a force recirculate already existed"}, #define PME_ITA2_L2_FORCE_RECIRC_IPF_MISS 354 { "L2_FORCE_RECIRC_IPF_MISS", {0xa00b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- caused by L2 miss when instruction prefetch buffer miss already existed"}, #define PME_ITA2_L2_FORCE_RECIRC_L1W 355 { "L2_FORCE_RECIRC_L1W", {0x200b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count only those caused by forced limbo"}, #define PME_ITA2_L2_FORCE_RECIRC_OZQ_MISS 356 { "L2_FORCE_RECIRC_OZQ_MISS", {0xc00b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- caused by an L2 miss when an OZQ miss already existed"}, #define PME_ITA2_L2_FORCE_RECIRC_SAME_INDEX 357 { "L2_FORCE_RECIRC_SAME_INDEX", {0xd00b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- caused by an L2 miss when a miss to the same index already existed"}, #define PME_ITA2_L2_FORCE_RECIRC_SMC_HIT 358 { "L2_FORCE_RECIRC_SMC_HIT", {0x100b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count only those caused by SMC hits due to an ifetch and load to same cache line or a pending WT store"}, #define PME_ITA2_L2_FORCE_RECIRC_SNP_OR_L3 359 { "L2_FORCE_RECIRC_SNP_OR_L3", {0x600b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count only those caused by a snoop or L3 issue"}, #define PME_ITA2_L2_FORCE_RECIRC_TAG_NOTOK 360 { "L2_FORCE_RECIRC_TAG_NOTOK", {0x400b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count only those caused by L2 hits caused by in flight snoops, stores with a sibling miss to the same index, sibling probe to the same line or pending sync.ia instructions."}, #define 
PME_ITA2_L2_FORCE_RECIRC_TRAN_PREF 361 { "L2_FORCE_RECIRC_TRAN_PREF", {0x500b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count only those caused by transforms to prefetches"}, #define PME_ITA2_L2_FORCE_RECIRC_VIC_BUF_FULL 362 { "L2_FORCE_RECIRC_VIC_BUF_FULL", {0xb00b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count only those caused by an L2 miss with victim buffer full"}, #define PME_ITA2_L2_FORCE_RECIRC_VIC_PEND 363 { "L2_FORCE_RECIRC_VIC_PEND", {0x800b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count only those caused by an L2 miss with pending victim"}, #define PME_ITA2_L2_GOT_RECIRC_IFETCH_ANY 364 { "L2_GOT_RECIRC_IFETCH_ANY", {0x800ba}, 0xf0, 1, {0x4420007}, "Instruction Fetch Recirculates Received by L2D -- Instruction fetch recirculates received by L2"}, #define PME_ITA2_L2_GOT_RECIRC_OZQ_ACC 365 { "L2_GOT_RECIRC_OZQ_ACC", {0xb6}, 0xf0, 1, {0x4220007}, "Counts Number of OZQ Accesses Recirculated to L1D"}, #define PME_ITA2_L2_IFET_CANCELS_ANY 366 { "L2_IFET_CANCELS_ANY", {0xa1}, 0xf0, 1, {0x4020007}, "Instruction Fetch Cancels by the L2 -- total instruction fetch cancels by L2"}, #define PME_ITA2_L2_IFET_CANCELS_BYPASS 367 { "L2_IFET_CANCELS_BYPASS", {0x200a1}, 0xf0, 1, {0x4020007}, "Instruction Fetch Cancels by the L2 -- ifetch cancels due to bypassing"}, #define PME_ITA2_L2_IFET_CANCELS_CHG_PRIO 368 { "L2_IFET_CANCELS_CHG_PRIO", {0xc00a1}, 0xf0, 1, {0x4020007}, "Instruction Fetch Cancels by the L2 -- ifetch cancels due to change priority"}, #define PME_ITA2_L2_IFET_CANCELS_DATA_RD 369 { "L2_IFET_CANCELS_DATA_RD", {0x700a1}, 0xf0, 1, {0x4020007}, "Instruction Fetch Cancels by the L2 -- ifetch/prefetch cancels due to a data read"}, #define PME_ITA2_L2_IFET_CANCELS_DIDNT_RECIR 370 { "L2_IFET_CANCELS_DIDNT_RECIR", {0x400a1}, 0xf0, 1, {0x4020007}, "Instruction Fetch Cancels by the L2 -- ifetch cancels because it did not recirculate"}, #define PME_ITA2_L2_IFET_CANCELS_IFETCH_BYP 371 { "L2_IFET_CANCELS_IFETCH_BYP", {0xd00a1}, 0xf0, 1, 
{0x4020007}, "Instruction Fetch Cancels by the L2 -- due to ifetch bypass during last clock"}, #define PME_ITA2_L2_IFET_CANCELS_PREEMPT 372 { "L2_IFET_CANCELS_PREEMPT", {0x800a1}, 0xf0, 1, {0x4020007}, "Instruction Fetch Cancels by the L2 -- ifetch cancels due to preempts"}, #define PME_ITA2_L2_IFET_CANCELS_RECIR_OVER_SUB 373 { "L2_IFET_CANCELS_RECIR_OVER_SUB", {0x500a1}, 0xf0, 1, {0x4020007}, "Instruction Fetch Cancels by the L2 -- ifetch cancels because of recirculate oversubscription"}, #define PME_ITA2_L2_IFET_CANCELS_ST_FILL_WB 374 { "L2_IFET_CANCELS_ST_FILL_WB", {0x600a1}, 0xf0, 1, {0x4020007}, "Instruction Fetch Cancels by the L2 -- ifetch cancels due to a store or fill or write back"}, #define PME_ITA2_L2_INST_DEMAND_READS 375 { "L2_INST_DEMAND_READS", {0x42}, 0xf0, 1, {0xf00001}, "L2 Instruction Demand Fetch Requests"}, #define PME_ITA2_L2_INST_PREFETCHES 376 { "L2_INST_PREFETCHES", {0x45}, 0xf0, 1, {0xf00001}, "L2 Instruction Prefetch Requests"}, #define PME_ITA2_L2_ISSUED_RECIRC_IFETCH_ANY 377 { "L2_ISSUED_RECIRC_IFETCH_ANY", {0x800b9}, 0xf0, 1, {0x4420007}, "Instruction Fetch Recirculates Issued by L2 -- Instruction fetch recirculates issued by L2"}, #define PME_ITA2_L2_ISSUED_RECIRC_OZQ_ACC 378 { "L2_ISSUED_RECIRC_OZQ_ACC", {0xb5}, 0xf0, 1, {0x4220007}, "Count Number of Times a Recirculate Issue Was Attempted and Not Preempted"}, #define PME_ITA2_L2_L3ACCESS_CANCEL_ANY 379 { "L2_L3ACCESS_CANCEL_ANY", {0x900b0}, 0x10, 1, {0x4120007}, "Canceled L3 Accesses -- count cancels due to any reason. This umask will count more than the sum of all the other umasks. It will count things that weren't committed accesses when they reached L1w, but the L2 attempted to bypass them to the L3 anyway (speculatively). This will include accesses made repeatedly while the main pipeline is stalled and the L1d is attempting to recirculate an access down the L1d pipeline. Thus, an access could get counted many times before it really does get bypassed to the L3. 
It is a measure of how many times we asserted a request to the L3 but didn't confirm it."}, #define PME_ITA2_L2_L3ACCESS_CANCEL_DFETCH 380 { "L2_L3ACCESS_CANCEL_DFETCH", {0xa00b0}, 0x10, 1, {0x4120007}, "Canceled L3 Accesses -- data fetches"}, #define PME_ITA2_L2_L3ACCESS_CANCEL_EBL_REJECT 381 { "L2_L3ACCESS_CANCEL_EBL_REJECT", {0x800b0}, 0x10, 1, {0x4120007}, "Canceled L3 Accesses -- ebl rejects"}, #define PME_ITA2_L2_L3ACCESS_CANCEL_FILLD_FULL 382 { "L2_L3ACCESS_CANCEL_FILLD_FULL", {0x200b0}, 0x10, 1, {0x4120007}, "Canceled L3 Accesses -- filld being full"}, #define PME_ITA2_L2_L3ACCESS_CANCEL_IFETCH 383 { "L2_L3ACCESS_CANCEL_IFETCH", {0xb00b0}, 0xf0, 1, {0x4120007}, "Canceled L3 Accesses -- instruction fetches"}, #define PME_ITA2_L2_L3ACCESS_CANCEL_INV_L3_BYP 384 { "L2_L3ACCESS_CANCEL_INV_L3_BYP", {0x600b0}, 0x10, 1, {0x4120007}, "Canceled L3 Accesses -- invalid L3 bypasses"}, #define PME_ITA2_L2_L3ACCESS_CANCEL_SPEC_L3_BYP 385 { "L2_L3ACCESS_CANCEL_SPEC_L3_BYP", {0x100b0}, 0x10, 1, {0x4120007}, "Canceled L3 Accesses -- speculative L3 bypasses"}, #define PME_ITA2_L2_L3ACCESS_CANCEL_UC_BLOCKED 386 { "L2_L3ACCESS_CANCEL_UC_BLOCKED", {0x500b0}, 0x10, 1, {0x4120007}, "Canceled L3 Accesses -- Uncacheable blocked L3 Accesses"}, #define PME_ITA2_L2_MISSES 387 { "L2_MISSES", {0xcb}, 0xf0, 1, {0xf00007}, "L2 Misses"}, #define PME_ITA2_L2_OPS_ISSUED_FP_LOAD 388 { "L2_OPS_ISSUED_FP_LOAD", {0x900b8}, 0xf0, 4, {0x4420007}, "Different Operations Issued by L2D -- Count only valid floating point loads"}, #define PME_ITA2_L2_OPS_ISSUED_INT_LOAD 389 { "L2_OPS_ISSUED_INT_LOAD", {0x800b8}, 0xf0, 4, {0x4420007}, "Different Operations Issued by L2D -- Count only valid integer loads"}, #define PME_ITA2_L2_OPS_ISSUED_NST_NLD 390 { "L2_OPS_ISSUED_NST_NLD", {0xc00b8}, 0xf0, 4, {0x4420007}, "Different Operations Issued by L2D -- Count only valid non-load, no-store accesses"}, #define PME_ITA2_L2_OPS_ISSUED_RMW 391 { "L2_OPS_ISSUED_RMW", {0xa00b8}, 0xf0, 4, {0x4420007}, "Different 
Operations Issued by L2D -- Count only valid read_modify_write stores"}, #define PME_ITA2_L2_OPS_ISSUED_STORE 392 { "L2_OPS_ISSUED_STORE", {0xb00b8}, 0xf0, 4, {0x4420007}, "Different Operations Issued by L2D -- Count only valid non-read_modify_write stores"}, #define PME_ITA2_L2_OZDB_FULL_THIS 393 { "L2_OZDB_FULL_THIS", {0xbd}, 0xf0, 1, {0x4520000}, "L2 OZ Data Buffer Is Full -- L2 OZ Data Buffer is full"}, #define PME_ITA2_L2_OZQ_ACQUIRE 394 { "L2_OZQ_ACQUIRE", {0xa2}, 0xf0, 1, {0x4020000}, "Clocks With Acquire Ordering Attribute Existed in L2 OZQ"}, #define PME_ITA2_L2_OZQ_CANCELS0_ANY 395 { "L2_OZQ_CANCELS0_ANY", {0xa0}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Late or Any) -- counts the total OZ Queue cancels"}, #define PME_ITA2_L2_OZQ_CANCELS0_LATE_ACQUIRE 396 { "L2_OZQ_CANCELS0_LATE_ACQUIRE", {0x300a0}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Late or Any) -- counts the late cancels caused by acquires"}, #define PME_ITA2_L2_OZQ_CANCELS0_LATE_BYP_EFFRELEASE 397 { "L2_OZQ_CANCELS0_LATE_BYP_EFFRELEASE", {0x400a0}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Late or Any) -- counts the late cancels caused by L1D to L2A bypass effective releases"}, #define PME_ITA2_L2_OZQ_CANCELS0_LATE_RELEASE 398 { "L2_OZQ_CANCELS0_LATE_RELEASE", {0x200a0}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Late or Any) -- counts the late cancels caused by releases"}, #define PME_ITA2_L2_OZQ_CANCELS0_LATE_SPEC_BYP 399 { "L2_OZQ_CANCELS0_LATE_SPEC_BYP", {0x100a0}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Late or Any) -- counts the late cancels caused by speculative bypasses"}, #define PME_ITA2_L2_OZQ_CANCELS1_BANK_CONF 400 { "L2_OZQ_CANCELS1_BANK_CONF", {0x100ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- bank conflicts"}, #define PME_ITA2_L2_OZQ_CANCELS1_CANC_L2M_ST 401 { "L2_OZQ_CANCELS1_CANC_L2M_ST", {0x600ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- caused by a canceled store in L2M"}, #define PME_ITA2_L2_OZQ_CANCELS1_CCV 402 { "L2_OZQ_CANCELS1_CCV", 
{0x900ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- a ccv"}, #define PME_ITA2_L2_OZQ_CANCELS1_ECC 403 { "L2_OZQ_CANCELS1_ECC", {0xf00ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- ECC hardware detecting a problem"}, #define PME_ITA2_L2_OZQ_CANCELS1_HPW_IFETCH_CONF 404 { "L2_OZQ_CANCELS1_HPW_IFETCH_CONF", {0x500ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- a ifetch conflict (canceling HPW?)"}, #define PME_ITA2_L2_OZQ_CANCELS1_L1DF_L2M 405 { "L2_OZQ_CANCELS1_L1DF_L2M", {0xe00ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- L1D fill in L2M"}, #define PME_ITA2_L2_OZQ_CANCELS1_L1_FILL_CONF 406 { "L2_OZQ_CANCELS1_L1_FILL_CONF", {0x700ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- an L1 fill conflict"}, #define PME_ITA2_L2_OZQ_CANCELS1_L2A_ST_MAT 407 { "L2_OZQ_CANCELS1_L2A_ST_MAT", {0xd00ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- a store match in L2A"}, #define PME_ITA2_L2_OZQ_CANCELS1_L2D_ST_MAT 408 { "L2_OZQ_CANCELS1_L2D_ST_MAT", {0x200ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- a store match in L2D"}, #define PME_ITA2_L2_OZQ_CANCELS1_L2M_ST_MAT 409 { "L2_OZQ_CANCELS1_L2M_ST_MAT", {0xb00ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- a store match in L2M"}, #define PME_ITA2_L2_OZQ_CANCELS1_MFA 410 { "L2_OZQ_CANCELS1_MFA", {0xc00ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- a memory fence instruction"}, #define PME_ITA2_L2_OZQ_CANCELS1_REL 411 { "L2_OZQ_CANCELS1_REL", {0xac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- caused by release"}, #define PME_ITA2_L2_OZQ_CANCELS1_SEM 412 { "L2_OZQ_CANCELS1_SEM", {0xa00ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- a semaphore"}, #define PME_ITA2_L2_OZQ_CANCELS1_ST_FILL_CONF 413 { "L2_OZQ_CANCELS1_ST_FILL_CONF", {0x800ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason 
Set 1) -- a store fill conflict"}, #define PME_ITA2_L2_OZQ_CANCELS1_SYNC 414 { "L2_OZQ_CANCELS1_SYNC", {0x400ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- caused by sync.i"}, #define PME_ITA2_L2_OZQ_CANCELS2_ACQ 415 { "L2_OZQ_CANCELS2_ACQ", {0x400a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- caused by an acquire"}, #define PME_ITA2_L2_OZQ_CANCELS2_CANC_L2C_ST 416 { "L2_OZQ_CANCELS2_CANC_L2C_ST", {0x100a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- caused by a canceled store in L2C"}, #define PME_ITA2_L2_OZQ_CANCELS2_CANC_L2D_ST 417 { "L2_OZQ_CANCELS2_CANC_L2D_ST", {0xd00a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- caused by a canceled store in L2D"}, #define PME_ITA2_L2_OZQ_CANCELS2_DIDNT_RECIRC 418 { "L2_OZQ_CANCELS2_DIDNT_RECIRC", {0x900a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- caused because it did not recirculate"}, #define PME_ITA2_L2_OZQ_CANCELS2_D_IFET 419 { "L2_OZQ_CANCELS2_D_IFET", {0xf00a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- a demand ifetch"}, #define PME_ITA2_L2_OZQ_CANCELS2_L2C_ST_MAT 420 { "L2_OZQ_CANCELS2_L2C_ST_MAT", {0x200a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- a store match in L2C"}, #define PME_ITA2_L2_OZQ_CANCELS2_L2FILL_ST_CONF 421 { "L2_OZQ_CANCELS2_L2FILL_ST_CONF", {0x800a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- a L2fill and store conflict in L2C"}, #define PME_ITA2_L2_OZQ_CANCELS2_OVER_SUB 422 { "L2_OZQ_CANCELS2_OVER_SUB", {0xc00a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- oversubscription"}, #define PME_ITA2_L2_OZQ_CANCELS2_OZ_DATA_CONF 423 { "L2_OZQ_CANCELS2_OZ_DATA_CONF", {0x600a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- an OZ data conflict"}, #define PME_ITA2_L2_OZQ_CANCELS2_READ_WB_CONF 424 { "L2_OZQ_CANCELS2_READ_WB_CONF", {0x500a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels 
(Specific Reason Set 2) -- a write back conflict (canceling read?)"}, #define PME_ITA2_L2_OZQ_CANCELS2_RECIRC_OVER_SUB 425 { "L2_OZQ_CANCELS2_RECIRC_OVER_SUB", {0xa8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- caused by a recirculate oversubscription"}, #define PME_ITA2_L2_OZQ_CANCELS2_SCRUB 426 { "L2_OZQ_CANCELS2_SCRUB", {0x300a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- 32/64 byte HPW/L2D fill which needs scrub"}, #define PME_ITA2_L2_OZQ_CANCELS2_WEIRD 427 { "L2_OZQ_CANCELS2_WEIRD", {0xa00a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- counts the cancels caused by attempted 5-cycle bypasses for non-aligned accesses and bypasses blocking recirculates for too long"}, #define PME_ITA2_L2_OZQ_FULL_THIS 428 { "L2_OZQ_FULL_THIS", {0xbc}, 0xf0, 1, {0x4520000}, "L2D OZQ Is Full -- L2D OZQ is full"}, #define PME_ITA2_L2_OZQ_RELEASE 429 { "L2_OZQ_RELEASE", {0xa3}, 0xf0, 1, {0x4020000}, "Clocks With Release Ordering Attribute Existed in L2 OZQ"}, #define PME_ITA2_L2_REFERENCES 430 { "L2_REFERENCES", {0xb1}, 0xf0, 4, {0x4120007}, "Requests Made To L2"}, #define PME_ITA2_L2_STORE_HIT_SHARED_ANY 431 { "L2_STORE_HIT_SHARED_ANY", {0xba}, 0xf0, 2, {0x4320007}, "Store Hit a Shared Line -- Store hit a shared line"}, #define PME_ITA2_L2_SYNTH_PROBE 432 { "L2_SYNTH_PROBE", {0xb7}, 0xf0, 1, {0x4220007}, "Synthesized Probe"}, #define PME_ITA2_L2_VICTIMB_FULL_THIS 433 { "L2_VICTIMB_FULL_THIS", {0xbe}, 0xf0, 1, {0x4520000}, "L2D Victim Buffer Is Full -- L2D victim buffer is full"}, #define PME_ITA2_L3_LINES_REPLACED 434 { "L3_LINES_REPLACED", {0xdf}, 0xf0, 1, {0xf00000}, "L3 Cache Lines Replaced"}, #define PME_ITA2_L3_MISSES 435 { "L3_MISSES", {0xdc}, 0xf0, 1, {0xf00007}, "L3 Misses"}, #define PME_ITA2_L3_READS_ALL_ALL 436 { "L3_READS_ALL_ALL", {0xf00dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Read References"}, #define PME_ITA2_L3_READS_ALL_HIT 437 { "L3_READS_ALL_HIT", {0xd00dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 
Read Hits"}, #define PME_ITA2_L3_READS_ALL_MISS 438 { "L3_READS_ALL_MISS", {0xe00dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Read Misses"}, #define PME_ITA2_L3_READS_DATA_READ_ALL 439 { "L3_READS_DATA_READ_ALL", {0xb00dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Load References (excludes reads for ownership used to satisfy stores)"}, #define PME_ITA2_L3_READS_DATA_READ_HIT 440 { "L3_READS_DATA_READ_HIT", {0x900dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Load Hits (excludes reads for ownership used to satisfy stores)"}, #define PME_ITA2_L3_READS_DATA_READ_MISS 441 { "L3_READS_DATA_READ_MISS", {0xa00dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Load Misses (excludes reads for ownership used to satisfy stores)"}, #define PME_ITA2_L3_READS_DINST_FETCH_ALL 442 { "L3_READS_DINST_FETCH_ALL", {0x300dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Demand Instruction References"}, #define PME_ITA2_L3_READS_DINST_FETCH_HIT 443 { "L3_READS_DINST_FETCH_HIT", {0x100dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Demand Instruction Fetch Hits"}, #define PME_ITA2_L3_READS_DINST_FETCH_MISS 444 { "L3_READS_DINST_FETCH_MISS", {0x200dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Demand Instruction Fetch Misses"}, #define PME_ITA2_L3_READS_INST_FETCH_ALL 445 { "L3_READS_INST_FETCH_ALL", {0x700dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Instruction Fetch and Prefetch References"}, #define PME_ITA2_L3_READS_INST_FETCH_HIT 446 { "L3_READS_INST_FETCH_HIT", {0x500dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Instruction Fetch and Prefetch Hits"}, #define PME_ITA2_L3_READS_INST_FETCH_MISS 447 { "L3_READS_INST_FETCH_MISS", {0x600dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Instruction Fetch and Prefetch Misses"}, #define PME_ITA2_L3_REFERENCES 448 { "L3_REFERENCES", {0xdb}, 0xf0, 1, {0xf00007}, "L3 References"}, #define PME_ITA2_L3_WRITES_ALL_ALL 449 { "L3_WRITES_ALL_ALL", {0xf00de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L3 Write References"}, #define PME_ITA2_L3_WRITES_ALL_HIT 450 { "L3_WRITES_ALL_HIT", {0xd00de}, 0xf0, 1, 
{0xf00007}, "L3 Writes -- L3 Write Hits"}, #define PME_ITA2_L3_WRITES_ALL_MISS 451 { "L3_WRITES_ALL_MISS", {0xe00de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L3 Write Misses"}, #define PME_ITA2_L3_WRITES_DATA_WRITE_ALL 452 { "L3_WRITES_DATA_WRITE_ALL", {0x700de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L3 Store References (excludes L2 write backs, includes L3 read for ownership requests that satisfy stores)"}, #define PME_ITA2_L3_WRITES_DATA_WRITE_HIT 453 { "L3_WRITES_DATA_WRITE_HIT", {0x500de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L3 Store Hits (excludes L2 write backs, includes L3 read for ownership requests that satisfy stores)"}, #define PME_ITA2_L3_WRITES_DATA_WRITE_MISS 454 { "L3_WRITES_DATA_WRITE_MISS", {0x600de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L3 Store Misses (excludes L2 write backs, includes L3 read for ownership requests that satisfy stores)"}, #define PME_ITA2_L3_WRITES_L2_WB_ALL 455 { "L3_WRITES_L2_WB_ALL", {0xb00de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L2 Write Back References"}, #define PME_ITA2_L3_WRITES_L2_WB_HIT 456 { "L3_WRITES_L2_WB_HIT", {0x900de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L2 Write Back Hits"}, #define PME_ITA2_L3_WRITES_L2_WB_MISS 457 { "L3_WRITES_L2_WB_MISS", {0xa00de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L2 Write Back Misses"}, #define PME_ITA2_LOADS_RETIRED 458 { "LOADS_RETIRED", {0xcd}, 0xf0, 4, {0x5310007}, "Retired Loads"}, #define PME_ITA2_MEM_READ_CURRENT_ANY 459 { "MEM_READ_CURRENT_ANY", {0x30089}, 0xf0, 1, {0xf00000}, "Current Mem Read Transactions On Bus -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_MEM_READ_CURRENT_IO 460 { "MEM_READ_CURRENT_IO", {0x10089}, 0xf0, 1, {0xf00000}, "Current Mem Read Transactions On Bus -- non-CPU priority agents"}, #define PME_ITA2_MISALIGNED_LOADS_RETIRED 461 { "MISALIGNED_LOADS_RETIRED", {0xce}, 0xf0, 4, {0x5310007}, "Retired Misaligned Load Instructions"}, #define PME_ITA2_MISALIGNED_STORES_RETIRED 462 { "MISALIGNED_STORES_RETIRED", {0xd2}, 0xf0, 2, {0x5410007}, "Retired Misaligned Store 
Instructions"}, #define PME_ITA2_NOPS_RETIRED 463 { "NOPS_RETIRED", {0x50}, 0xf0, 6, {0xf00003}, "Retired NOP Instructions"}, #define PME_ITA2_PREDICATE_SQUASHED_RETIRED 464 { "PREDICATE_SQUASHED_RETIRED", {0x51}, 0xf0, 6, {0xf00003}, "Instructions Squashed Due to Predicate Off"}, #define PME_ITA2_RSE_CURRENT_REGS_2_TO_0 465 { "RSE_CURRENT_REGS_2_TO_0", {0x2b}, 0xf0, 7, {0xf00000}, "Current RSE Registers (Bits 2:0)"}, #define PME_ITA2_RSE_CURRENT_REGS_5_TO_3 466 { "RSE_CURRENT_REGS_5_TO_3", {0x2a}, 0xf0, 7, {0xf00000}, "Current RSE Registers (Bits 5:3)"}, #define PME_ITA2_RSE_CURRENT_REGS_6 467 { "RSE_CURRENT_REGS_6", {0x26}, 0xf0, 1, {0xf00000}, "Current RSE Registers (Bit 6)"}, #define PME_ITA2_RSE_DIRTY_REGS_2_TO_0 468 { "RSE_DIRTY_REGS_2_TO_0", {0x29}, 0xf0, 7, {0xf00000}, "Dirty RSE Registers (Bits 2:0)"}, #define PME_ITA2_RSE_DIRTY_REGS_5_TO_3 469 { "RSE_DIRTY_REGS_5_TO_3", {0x28}, 0xf0, 7, {0xf00000}, "Dirty RSE Registers (Bits 5:3)"}, #define PME_ITA2_RSE_DIRTY_REGS_6 470 { "RSE_DIRTY_REGS_6", {0x24}, 0xf0, 1, {0xf00000}, "Dirty RSE Registers (Bit 6)"}, #define PME_ITA2_RSE_EVENT_RETIRED 471 { "RSE_EVENT_RETIRED", {0x32}, 0xf0, 1, {0xf00000}, "Retired RSE operations"}, #define PME_ITA2_RSE_REFERENCES_RETIRED_ALL 472 { "RSE_REFERENCES_RETIRED_ALL", {0x30020}, 0xf0, 2, {0xf00007}, "RSE Accesses -- Both RSE loads and stores will be counted."}, #define PME_ITA2_RSE_REFERENCES_RETIRED_LOAD 473 { "RSE_REFERENCES_RETIRED_LOAD", {0x10020}, 0xf0, 2, {0xf00007}, "RSE Accesses -- Only RSE loads will be counted."}, #define PME_ITA2_RSE_REFERENCES_RETIRED_STORE 474 { "RSE_REFERENCES_RETIRED_STORE", {0x20020}, 0xf0, 2, {0xf00007}, "RSE Accesses -- Only RSE stores will be counted."}, #define PME_ITA2_SERIALIZATION_EVENTS 475 { "SERIALIZATION_EVENTS", {0x53}, 0xf0, 1, {0xf00000}, "Number of srlz.i Instructions"}, #define PME_ITA2_STORES_RETIRED 476 { "STORES_RETIRED", {0xd1}, 0xf0, 2, {0x5410007}, "Retired Stores"}, #define PME_ITA2_SYLL_NOT_DISPERSED_ALL 477 { 
"SYLL_NOT_DISPERSED_ALL", {0xf004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Counts all syllables not dispersed. NOTE: Any combination of b0000-b1111 is valid."}, #define PME_ITA2_SYLL_NOT_DISPERSED_EXPL 478 { "SYLL_NOT_DISPERSED_EXPL", {0x1004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit stop bits. These consist of programmer specified architected S-bit and templates 1 and 5. Dispersal takes a 6-syllable (3-syllable) hit for every template 1/5 in bundle 0(1). Dispersal takes a 3-syllable (0 syllable) hit for every S-bit in bundle 0(1)"}, #define PME_ITA2_SYLL_NOT_DISPERSED_EXPL_OR_FE 479 { "SYLL_NOT_DISPERSED_EXPL_OR_FE", {0x5004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit stop bits or front-end not providing valid bundles or providing valid illegal templates."}, #define PME_ITA2_SYLL_NOT_DISPERSED_EXPL_OR_FE_OR_MLI 480 { "SYLL_NOT_DISPERSED_EXPL_OR_FE_OR_MLI", {0xd004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit stop bits or due to front-end not providing valid bundles or providing valid illegal templates or due to MLI bundle and resteers to non-0 syllable."}, #define PME_ITA2_SYLL_NOT_DISPERSED_EXPL_OR_IMPL 481 { "SYLL_NOT_DISPERSED_EXPL_OR_IMPL", {0x3004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit/implicit stop bits."}, #define PME_ITA2_SYLL_NOT_DISPERSED_EXPL_OR_IMPL_OR_FE 482 { "SYLL_NOT_DISPERSED_EXPL_OR_IMPL_OR_FE", {0x7004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit or implicit stop bits or due to front-end not providing valid bundles or providing valid illegal template."}, #define PME_ITA2_SYLL_NOT_DISPERSED_EXPL_OR_IMPL_OR_MLI 483 { "SYLL_NOT_DISPERSED_EXPL_OR_IMPL_OR_MLI", {0xb004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit or 
implicit stop bits or due to MLI bundle and resteers to non-0 syllable."}, #define PME_ITA2_SYLL_NOT_DISPERSED_EXPL_OR_MLI 484 { "SYLL_NOT_DISPERSED_EXPL_OR_MLI", {0x9004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit stop bits or to MLI bundle and resteers to non-0 syllable."}, #define PME_ITA2_SYLL_NOT_DISPERSED_FE 485 { "SYLL_NOT_DISPERSED_FE", {0x4004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to front-end not providing valid bundles or providing valid illegal templates. Dispersal takes a 3-syllable hit for every invalid bundle or valid illegal template from front-end. Bundle 1 with front-end fault, is counted here (3-syllable hit).."}, #define PME_ITA2_SYLL_NOT_DISPERSED_FE_OR_MLI 486 { "SYLL_NOT_DISPERSED_FE_OR_MLI", {0xc004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to MLI bundle and resteers to non-0 syllable or due to front-end not providing valid bundles or providing valid illegal templates."}, #define PME_ITA2_SYLL_NOT_DISPERSED_IMPL 487 { "SYLL_NOT_DISPERSED_IMPL", {0x2004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to implicit stop bits. These consist of all of the non-architected stop bits (asymmetry, oversubscription, implicit). 
Dispersal takes a 6-syllable(3-syllable) hit for every implicit stop bits in bundle 0(1)."}, #define PME_ITA2_SYLL_NOT_DISPERSED_IMPL_OR_FE 488 { "SYLL_NOT_DISPERSED_IMPL_OR_FE", {0x6004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to implicit stop bits or to front-end not providing valid bundles or providing valid illegal templates."}, #define PME_ITA2_SYLL_NOT_DISPERSED_IMPL_OR_FE_OR_MLI 489 { "SYLL_NOT_DISPERSED_IMPL_OR_FE_OR_MLI", {0xe004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to implicit stop bits or due to front-end not providing valid bundles or providing valid illegal templates or due to MLI bundle and resteers to non-0 syllable."}, #define PME_ITA2_SYLL_NOT_DISPERSED_IMPL_OR_MLI 490 { "SYLL_NOT_DISPERSED_IMPL_OR_MLI", {0xa004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to implicit stop bits or to MLI bundle and resteers to non-0 syllable."}, #define PME_ITA2_SYLL_NOT_DISPERSED_MLI 491 { "SYLL_NOT_DISPERSED_MLI", {0x8004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to MLI bundle and resteers to non-0 syllable. Dispersal takes a 1 syllable hit for each MLI bundle . Dispersal could take 0-2 syllable hit depending on which syllable we resteer to. 
Bundle 1 with front-end fault which is split, is counted here (0-2 syllable hit)."}, #define PME_ITA2_SYLL_OVERCOUNT_ALL 492 { "SYLL_OVERCOUNT_ALL", {0x3004f}, 0xf0, 2, {0xf00001}, "Syllables Overcounted -- syllables overcounted in implicit & explicit bucket"}, #define PME_ITA2_SYLL_OVERCOUNT_EXPL 493 { "SYLL_OVERCOUNT_EXPL", {0x1004f}, 0xf0, 2, {0xf00001}, "Syllables Overcounted -- Only syllables overcounted in the explicit bucket"}, #define PME_ITA2_SYLL_OVERCOUNT_IMPL 494 { "SYLL_OVERCOUNT_IMPL", {0x2004f}, 0xf0, 2, {0xf00001}, "Syllables Overcounted -- Only syllables overcounted in the implicit bucket"}, #define PME_ITA2_UC_LOADS_RETIRED 495 { "UC_LOADS_RETIRED", {0xcf}, 0xf0, 4, {0x5310007}, "Retired Uncacheable Loads"}, #define PME_ITA2_UC_STORES_RETIRED 496 { "UC_STORES_RETIRED", {0xd0}, 0xf0, 2, {0x5410007}, "Retired Uncacheable Stores"}, }; #define PME_ITA2_EVENT_COUNT 497 papi-papi-7-2-0-t/src/libperfnec/lib/itanium_events.h000066400000000000000000000667411502707512200225270ustar00rootroot00000000000000/* * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux/ia64. */ /* * This file is generated automatically * !! DO NOT CHANGE !! */ /* * Events table for the Itanium PMU family */ static pme_ita_entry_t itanium_pe []={ #define PME_ITA_ALAT_INST_CHKA_LDC_ALL 0 { "ALAT_INST_CHKA_LDC_ALL", {0x30036} , 0xf0, 2, {0xffff0003}, NULL}, #define PME_ITA_ALAT_INST_CHKA_LDC_FP 1 { "ALAT_INST_CHKA_LDC_FP", {0x10036} , 0xf0, 2, {0xffff0003}, NULL}, #define PME_ITA_ALAT_INST_CHKA_LDC_INT 2 { "ALAT_INST_CHKA_LDC_INT", {0x20036} , 0xf0, 2, {0xffff0003}, NULL}, #define PME_ITA_ALAT_INST_FAILED_CHKA_LDC_ALL 3 { "ALAT_INST_FAILED_CHKA_LDC_ALL", {0x30037} , 0xf0, 2, {0xffff0003}, NULL}, #define PME_ITA_ALAT_INST_FAILED_CHKA_LDC_FP 4 { "ALAT_INST_FAILED_CHKA_LDC_FP", {0x10037} , 0xf0, 2, {0xffff0003}, NULL}, #define PME_ITA_ALAT_INST_FAILED_CHKA_LDC_INT 5 { "ALAT_INST_FAILED_CHKA_LDC_INT", {0x20037} , 0xf0, 2, {0xffff0003}, NULL}, #define PME_ITA_ALAT_REPLACEMENT_ALL 6 { "ALAT_REPLACEMENT_ALL", {0x30038} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_ALAT_REPLACEMENT_FP 7 { "ALAT_REPLACEMENT_FP", {0x10038} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_ALAT_REPLACEMENT_INT 8 { "ALAT_REPLACEMENT_INT", {0x20038} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_ALL_STOPS_DISPERSED 9 { "ALL_STOPS_DISPERSED", {0x2f} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_BRANCH_EVENT 10 { "BRANCH_EVENT", {0x811} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_ALL_PATHS_ALL_PREDICTIONS 11 { "BRANCH_MULTIWAY_ALL_PATHS_ALL_PREDICTIONS", {0xe} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_ALL_PATHS_CORRECT_PREDICTIONS 12 { 
"BRANCH_MULTIWAY_ALL_PATHS_CORRECT_PREDICTIONS", {0x1000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_ALL_PATHS_WRONG_PATH 13 { "BRANCH_MULTIWAY_ALL_PATHS_WRONG_PATH", {0x2000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_ALL_PATHS_WRONG_TARGET 14 { "BRANCH_MULTIWAY_ALL_PATHS_WRONG_TARGET", {0x3000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_NOT_TAKEN_ALL_PREDICTIONS 15 { "BRANCH_MULTIWAY_NOT_TAKEN_ALL_PREDICTIONS", {0x8000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_NOT_TAKEN_CORRECT_PREDICTIONS 16 { "BRANCH_MULTIWAY_NOT_TAKEN_CORRECT_PREDICTIONS", {0x9000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_NOT_TAKEN_WRONG_PATH 17 { "BRANCH_MULTIWAY_NOT_TAKEN_WRONG_PATH", {0xa000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_NOT_TAKEN_WRONG_TARGET 18 { "BRANCH_MULTIWAY_NOT_TAKEN_WRONG_TARGET", {0xb000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_TAKEN_ALL_PREDICTIONS 19 { "BRANCH_MULTIWAY_TAKEN_ALL_PREDICTIONS", {0xc000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_TAKEN_CORRECT_PREDICTIONS 20 { "BRANCH_MULTIWAY_TAKEN_CORRECT_PREDICTIONS", {0xd000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_TAKEN_WRONG_PATH 21 { "BRANCH_MULTIWAY_TAKEN_WRONG_PATH", {0xe000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_TAKEN_WRONG_TARGET 22 { "BRANCH_MULTIWAY_TAKEN_WRONG_TARGET", {0xf000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_NOT_TAKEN 23 { "BRANCH_NOT_TAKEN", {0x8000d} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_1ST_STAGE_NT_OUTCOMES_CORRECTLY_PREDICTED 24 { "BRANCH_PATH_1ST_STAGE_NT_OUTCOMES_CORRECTLY_PREDICTED", {0x6000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_1ST_STAGE_NT_OUTCOMES_INCORRECTLY_PREDICTED 25 { "BRANCH_PATH_1ST_STAGE_NT_OUTCOMES_INCORRECTLY_PREDICTED", {0x4000f} , 0xf0, 1, {0xffff0003}, NULL}, #define 
PME_ITA_BRANCH_PATH_1ST_STAGE_TK_OUTCOMES_CORRECTLY_PREDICTED 26 { "BRANCH_PATH_1ST_STAGE_TK_OUTCOMES_CORRECTLY_PREDICTED", {0x7000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_1ST_STAGE_TK_OUTCOMES_INCORRECTLY_PREDICTED 27 { "BRANCH_PATH_1ST_STAGE_TK_OUTCOMES_INCORRECTLY_PREDICTED", {0x5000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_2ND_STAGE_NT_OUTCOMES_CORRECTLY_PREDICTED 28 { "BRANCH_PATH_2ND_STAGE_NT_OUTCOMES_CORRECTLY_PREDICTED", {0xa000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_2ND_STAGE_NT_OUTCOMES_INCORRECTLY_PREDICTED 29 { "BRANCH_PATH_2ND_STAGE_NT_OUTCOMES_INCORRECTLY_PREDICTED", {0x8000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_2ND_STAGE_TK_OUTCOMES_CORRECTLY_PREDICTED 30 { "BRANCH_PATH_2ND_STAGE_TK_OUTCOMES_CORRECTLY_PREDICTED", {0xb000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_2ND_STAGE_TK_OUTCOMES_INCORRECTLY_PREDICTED 31 { "BRANCH_PATH_2ND_STAGE_TK_OUTCOMES_INCORRECTLY_PREDICTED", {0x9000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_3RD_STAGE_NT_OUTCOMES_CORRECTLY_PREDICTED 32 { "BRANCH_PATH_3RD_STAGE_NT_OUTCOMES_CORRECTLY_PREDICTED", {0xe000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_3RD_STAGE_NT_OUTCOMES_INCORRECTLY_PREDICTED 33 { "BRANCH_PATH_3RD_STAGE_NT_OUTCOMES_INCORRECTLY_PREDICTED", {0xc000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_3RD_STAGE_TK_OUTCOMES_CORRECTLY_PREDICTED 34 { "BRANCH_PATH_3RD_STAGE_TK_OUTCOMES_CORRECTLY_PREDICTED", {0xf000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_3RD_STAGE_TK_OUTCOMES_INCORRECTLY_PREDICTED 35 { "BRANCH_PATH_3RD_STAGE_TK_OUTCOMES_INCORRECTLY_PREDICTED", {0xd000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_ALL_NT_OUTCOMES_CORRECTLY_PREDICTED 36 { "BRANCH_PATH_ALL_NT_OUTCOMES_CORRECTLY_PREDICTED", {0x2000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_ALL_NT_OUTCOMES_INCORRECTLY_PREDICTED 37 { 
"BRANCH_PATH_ALL_NT_OUTCOMES_INCORRECTLY_PREDICTED", {0xf} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_ALL_TK_OUTCOMES_CORRECTLY_PREDICTED 38 { "BRANCH_PATH_ALL_TK_OUTCOMES_CORRECTLY_PREDICTED", {0x3000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_ALL_TK_OUTCOMES_INCORRECTLY_PREDICTED 39 { "BRANCH_PATH_ALL_TK_OUTCOMES_INCORRECTLY_PREDICTED", {0x1000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_1ST_STAGE_ALL_PREDICTIONS 40 { "BRANCH_PREDICTOR_1ST_STAGE_ALL_PREDICTIONS", {0x40010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_1ST_STAGE_CORRECT_PREDICTIONS 41 { "BRANCH_PREDICTOR_1ST_STAGE_CORRECT_PREDICTIONS", {0x50010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_1ST_STAGE_WRONG_PATH 42 { "BRANCH_PREDICTOR_1ST_STAGE_WRONG_PATH", {0x60010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_1ST_STAGE_WRONG_TARGET 43 { "BRANCH_PREDICTOR_1ST_STAGE_WRONG_TARGET", {0x70010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_2ND_STAGE_ALL_PREDICTIONS 44 { "BRANCH_PREDICTOR_2ND_STAGE_ALL_PREDICTIONS", {0x80010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_2ND_STAGE_CORRECT_PREDICTIONS 45 { "BRANCH_PREDICTOR_2ND_STAGE_CORRECT_PREDICTIONS", {0x90010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_2ND_STAGE_WRONG_PATH 46 { "BRANCH_PREDICTOR_2ND_STAGE_WRONG_PATH", {0xa0010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_2ND_STAGE_WRONG_TARGET 47 { "BRANCH_PREDICTOR_2ND_STAGE_WRONG_TARGET", {0xb0010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_3RD_STAGE_ALL_PREDICTIONS 48 { "BRANCH_PREDICTOR_3RD_STAGE_ALL_PREDICTIONS", {0xc0010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_3RD_STAGE_CORRECT_PREDICTIONS 49 { "BRANCH_PREDICTOR_3RD_STAGE_CORRECT_PREDICTIONS", {0xd0010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_3RD_STAGE_WRONG_PATH 50 { 
"BRANCH_PREDICTOR_3RD_STAGE_WRONG_PATH", {0xe0010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_3RD_STAGE_WRONG_TARGET 51 { "BRANCH_PREDICTOR_3RD_STAGE_WRONG_TARGET", {0xf0010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_ALL_ALL_PREDICTIONS 52 { "BRANCH_PREDICTOR_ALL_ALL_PREDICTIONS", {0x10} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_ALL_CORRECT_PREDICTIONS 53 { "BRANCH_PREDICTOR_ALL_CORRECT_PREDICTIONS", {0x10010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_ALL_WRONG_PATH 54 { "BRANCH_PREDICTOR_ALL_WRONG_PATH", {0x20010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_ALL_WRONG_TARGET 55 { "BRANCH_PREDICTOR_ALL_WRONG_TARGET", {0x30010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_TAKEN_SLOT_0 56 { "BRANCH_TAKEN_SLOT_0", {0x1000d} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_TAKEN_SLOT_1 57 { "BRANCH_TAKEN_SLOT_1", {0x2000d} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_TAKEN_SLOT_2 58 { "BRANCH_TAKEN_SLOT_2", {0x4000d} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BUS_ALL_ANY 59 { "BUS_ALL_ANY", {0x10047} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_ALL_IO 60 { "BUS_ALL_IO", {0x40047} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_ALL_SELF 61 { "BUS_ALL_SELF", {0x20047} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_BRQ_LIVE_REQ_HI 62 { "BUS_BRQ_LIVE_REQ_HI", {0x5c} , 0xf0, 2, {0xffff0000}, NULL}, #define PME_ITA_BUS_BRQ_LIVE_REQ_LO 63 { "BUS_BRQ_LIVE_REQ_LO", {0x5b} , 0xf0, 2, {0xffff0000}, NULL}, #define PME_ITA_BUS_BRQ_REQ_INSERTED 64 { "BUS_BRQ_REQ_INSERTED", {0x5d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_BURST_ANY 65 { "BUS_BURST_ANY", {0x10049} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_BURST_IO 66 { "BUS_BURST_IO", {0x40049} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_BURST_SELF 67 { "BUS_BURST_SELF", {0x20049} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_HITM 68 { 
"BUS_HITM", {0x44} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_IO_ANY 69 { "BUS_IO_ANY", {0x10050} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_IOQ_LIVE_REQ_HI 70 { "BUS_IOQ_LIVE_REQ_HI", {0x58} , 0xf0, 3, {0xffff0000}, NULL}, #define PME_ITA_BUS_IOQ_LIVE_REQ_LO 71 { "BUS_IOQ_LIVE_REQ_LO", {0x57} , 0xf0, 3, {0xffff0000}, NULL}, #define PME_ITA_BUS_IO_SELF 72 { "BUS_IO_SELF", {0x20050} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_LOCK_ANY 73 { "BUS_LOCK_ANY", {0x10053} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_LOCK_CYCLES_ANY 74 { "BUS_LOCK_CYCLES_ANY", {0x10054} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_LOCK_CYCLES_SELF 75 { "BUS_LOCK_CYCLES_SELF", {0x20054} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_LOCK_SELF 76 { "BUS_LOCK_SELF", {0x20053} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_MEMORY_ANY 77 { "BUS_MEMORY_ANY", {0x1004a} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_MEMORY_IO 78 { "BUS_MEMORY_IO", {0x4004a} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_MEMORY_SELF 79 { "BUS_MEMORY_SELF", {0x2004a} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_PARTIAL_ANY 80 { "BUS_PARTIAL_ANY", {0x10048} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_PARTIAL_IO 81 { "BUS_PARTIAL_IO", {0x40048} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_PARTIAL_SELF 82 { "BUS_PARTIAL_SELF", {0x20048} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_ALL_ANY 83 { "BUS_RD_ALL_ANY", {0x1004b} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_ALL_IO 84 { "BUS_RD_ALL_IO", {0x4004b} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_ALL_SELF 85 { "BUS_RD_ALL_SELF", {0x2004b} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_DATA_ANY 86 { "BUS_RD_DATA_ANY", {0x1004c} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_DATA_IO 87 { "BUS_RD_DATA_IO", {0x4004c} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_DATA_SELF 88 { "BUS_RD_DATA_SELF", {0x2004c} , 0xf0, 1, {0xffff0000}, NULL}, 
#define PME_ITA_BUS_RD_HIT 89 { "BUS_RD_HIT", {0x40} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_HITM 90 { "BUS_RD_HITM", {0x41} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_INVAL_ANY 91 { "BUS_RD_INVAL_ANY", {0x1004e} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_INVAL_BST_ANY 92 { "BUS_RD_INVAL_BST_ANY", {0x1004f} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_INVAL_BST_HITM 93 { "BUS_RD_INVAL_BST_HITM", {0x43} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_INVAL_BST_IO 94 { "BUS_RD_INVAL_BST_IO", {0x4004f} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_INVAL_BST_SELF 95 { "BUS_RD_INVAL_BST_SELF", {0x2004f} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_INVAL_HITM 96 { "BUS_RD_INVAL_HITM", {0x42} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_INVAL_IO 97 { "BUS_RD_INVAL_IO", {0x4004e} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_INVAL_SELF 98 { "BUS_RD_INVAL_SELF", {0x2004e} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_IO_ANY 99 { "BUS_RD_IO_ANY", {0x10051} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_IO_SELF 100 { "BUS_RD_IO_SELF", {0x20051} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_PRTL_ANY 101 { "BUS_RD_PRTL_ANY", {0x1004d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_PRTL_IO 102 { "BUS_RD_PRTL_IO", {0x4004d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_PRTL_SELF 103 { "BUS_RD_PRTL_SELF", {0x2004d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_SNOOPQ_REQ 104 { "BUS_SNOOPQ_REQ", {0x56} , 0x30, 3, {0xffff0000}, NULL}, #define PME_ITA_BUS_SNOOPS_ANY 105 { "BUS_SNOOPS_ANY", {0x10046} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_SNOOPS_HITM_ANY 106 { "BUS_SNOOPS_HITM_ANY", {0x10045} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_SNOOP_STALL_CYCLES_ANY 107 { "BUS_SNOOP_STALL_CYCLES_ANY", {0x10055} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_SNOOP_STALL_CYCLES_SELF 108 { "BUS_SNOOP_STALL_CYCLES_SELF", {0x20055} 
, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_BUS_WR_WB_ANY 109
	{ "BUS_WR_WB_ANY", {0x10052}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_BUS_WR_WB_IO 110
	{ "BUS_WR_WB_IO", {0x40052}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_BUS_WR_WB_SELF 111
	{ "BUS_WR_WB_SELF", {0x20052}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_CPU_CPL_CHANGES 112
	{ "CPU_CPL_CHANGES", {0x34}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_CPU_CYCLES 113
	{ "CPU_CYCLES", {0x12}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_DATA_ACCESS_CYCLE 114
	{ "DATA_ACCESS_CYCLE", {0x3}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_DATA_EAR_CACHE_LAT1024 115
	{ "DATA_EAR_CACHE_LAT1024", {0x90367}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_DATA_EAR_CACHE_LAT128 116
	{ "DATA_EAR_CACHE_LAT128", {0x50367}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_DATA_EAR_CACHE_LAT16 117
	{ "DATA_EAR_CACHE_LAT16", {0x20367}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_DATA_EAR_CACHE_LAT2048 118
	{ "DATA_EAR_CACHE_LAT2048", {0xa0367}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_DATA_EAR_CACHE_LAT256 119
	{ "DATA_EAR_CACHE_LAT256", {0x60367}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_DATA_EAR_CACHE_LAT32 120
	{ "DATA_EAR_CACHE_LAT32", {0x30367}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_DATA_EAR_CACHE_LAT4 121
	{ "DATA_EAR_CACHE_LAT4", {0x367}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_DATA_EAR_CACHE_LAT512 122
	{ "DATA_EAR_CACHE_LAT512", {0x80367}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_DATA_EAR_CACHE_LAT64 123
	{ "DATA_EAR_CACHE_LAT64", {0x40367}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_DATA_EAR_CACHE_LAT8 124
	{ "DATA_EAR_CACHE_LAT8", {0x10367}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_DATA_EAR_CACHE_LAT_NONE 125
	{ "DATA_EAR_CACHE_LAT_NONE", {0xf0367}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_DATA_EAR_EVENTS 126
	{ "DATA_EAR_EVENTS", {0x67}, 0xf0, 1, {0xffff0007}, NULL},
#define PME_ITA_DATA_EAR_TLB_L2 127
	{ "DATA_EAR_TLB_L2", {0x20767}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_DATA_EAR_TLB_SW 128
	{ "DATA_EAR_TLB_SW", {0x80767}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_DATA_EAR_TLB_VHPT 129
	{ "DATA_EAR_TLB_VHPT", {0x40767}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_DATA_REFERENCES_RETIRED 130
	{ "DATA_REFERENCES_RETIRED", {0x63}, 0xf0, 2, {0xffff0007}, NULL},
#define PME_ITA_DEPENDENCY_ALL_CYCLE 131
	{ "DEPENDENCY_ALL_CYCLE", {0x6}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_DEPENDENCY_SCOREBOARD_CYCLE 132
	{ "DEPENDENCY_SCOREBOARD_CYCLE", {0x2}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_DTC_MISSES 133
	{ "DTC_MISSES", {0x60}, 0xf0, 1, {0xffff0007}, NULL},
#define PME_ITA_DTLB_INSERTS_HPW 134
	{ "DTLB_INSERTS_HPW", {0x62}, 0xf0, 1, {0xffff0007}, NULL},
#define PME_ITA_DTLB_MISSES 135
	{ "DTLB_MISSES", {0x61}, 0xf0, 1, {0xffff0007}, NULL},
#define PME_ITA_EXPL_STOPBITS 136
	{ "EXPL_STOPBITS", {0x2e}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_FP_FLUSH_TO_ZERO 137
	{ "FP_FLUSH_TO_ZERO", {0xb}, 0xf0, 2, {0xffff0003}, NULL},
#define PME_ITA_FP_OPS_RETIRED_HI 138
	{ "FP_OPS_RETIRED_HI", {0xa}, 0xf0, 3, {0xffff0003}, NULL},
#define PME_ITA_FP_OPS_RETIRED_LO 139
	{ "FP_OPS_RETIRED_LO", {0x9}, 0xf0, 3, {0xffff0003}, NULL},
#define PME_ITA_FP_SIR_FLUSH 140
	{ "FP_SIR_FLUSH", {0xc}, 0xf0, 2, {0xffff0003}, NULL},
#define PME_ITA_IA32_INST_RETIRED 141
	{ "IA32_INST_RETIRED", {0x15}, 0xf0, 2, {0xffff0000}, NULL},
#define PME_ITA_IA64_INST_RETIRED 142
	{ "IA64_INST_RETIRED", {0x8}, 0x30, 6, {0xffff0003}, NULL},
#define PME_ITA_IA64_TAGGED_INST_RETIRED_PMC8 143
	{ "IA64_TAGGED_INST_RETIRED_PMC8", {0x30008}, 0x30, 6, {0xffff0003}, NULL},
#define PME_ITA_IA64_TAGGED_INST_RETIRED_PMC9 144
	{ "IA64_TAGGED_INST_RETIRED_PMC9", {0x20008}, 0x30, 6, {0xffff0003}, NULL},
#define PME_ITA_INST_ACCESS_CYCLE 145
	{ "INST_ACCESS_CYCLE", {0x1}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_INST_DISPERSED 146
	{ "INST_DISPERSED", {0x2d}, 0x30, 6, {0xffff0001}, NULL},
#define PME_ITA_INST_FAILED_CHKS_RETIRED_ALL 147
	{ "INST_FAILED_CHKS_RETIRED_ALL", {0x30035}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_INST_FAILED_CHKS_RETIRED_FP 148
	{ "INST_FAILED_CHKS_RETIRED_FP", {0x20035}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_INST_FAILED_CHKS_RETIRED_INT 149
	{ "INST_FAILED_CHKS_RETIRED_INT", {0x10035}, 0xf0, 1, {0xffff0003}, NULL},
#define PME_ITA_INSTRUCTION_EAR_CACHE_LAT1024 150
	{ "INSTRUCTION_EAR_CACHE_LAT1024", {0x80123}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_INSTRUCTION_EAR_CACHE_LAT128 151
	{ "INSTRUCTION_EAR_CACHE_LAT128", {0x50123}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_INSTRUCTION_EAR_CACHE_LAT16 152
	{ "INSTRUCTION_EAR_CACHE_LAT16", {0x20123}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_INSTRUCTION_EAR_CACHE_LAT2048 153
	{ "INSTRUCTION_EAR_CACHE_LAT2048", {0x90123}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_INSTRUCTION_EAR_CACHE_LAT256 154
	{ "INSTRUCTION_EAR_CACHE_LAT256", {0x60123}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_INSTRUCTION_EAR_CACHE_LAT32 155
	{ "INSTRUCTION_EAR_CACHE_LAT32", {0x30123}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_INSTRUCTION_EAR_CACHE_LAT4096 156
	{ "INSTRUCTION_EAR_CACHE_LAT4096", {0xa0123}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_INSTRUCTION_EAR_CACHE_LAT4 157
	{ "INSTRUCTION_EAR_CACHE_LAT4", {0x123}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_INSTRUCTION_EAR_CACHE_LAT512 158
	{ "INSTRUCTION_EAR_CACHE_LAT512", {0x70123}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_INSTRUCTION_EAR_CACHE_LAT64 159
	{ "INSTRUCTION_EAR_CACHE_LAT64", {0x40123}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_INSTRUCTION_EAR_CACHE_LAT8 160
	{ "INSTRUCTION_EAR_CACHE_LAT8", {0x10123}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_INSTRUCTION_EAR_CACHE_LAT_NONE 161
	{ "INSTRUCTION_EAR_CACHE_LAT_NONE", {0xf0123}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_INSTRUCTION_EAR_EVENTS 162
	{ "INSTRUCTION_EAR_EVENTS", {0x23}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_INSTRUCTION_EAR_TLB_SW 163
	{ "INSTRUCTION_EAR_TLB_SW", {0x80523}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_INSTRUCTION_EAR_TLB_VHPT 164
	{ "INSTRUCTION_EAR_TLB_VHPT", {0x40523}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_ISA_TRANSITIONS 165
	{ "ISA_TRANSITIONS", {0x14}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_ISB_LINES_IN 166
	{ "ISB_LINES_IN", {0x26}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_ITLB_INSERTS_HPW 167
	{ "ITLB_INSERTS_HPW", {0x28}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_ITLB_MISSES_FETCH 168
	{ "ITLB_MISSES_FETCH", {0x27}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_L1D_READ_FORCED_MISSES_RETIRED 169
	{ "L1D_READ_FORCED_MISSES_RETIRED", {0x6b}, 0xf0, 2, {0xffff0007}, NULL},
#define PME_ITA_L1D_READ_MISSES_RETIRED 170
	{ "L1D_READ_MISSES_RETIRED", {0x66}, 0xf0, 2, {0xffff0007}, NULL},
#define PME_ITA_L1D_READS_RETIRED 171
	{ "L1D_READS_RETIRED", {0x64}, 0xf0, 2, {0xffff0007}, NULL},
#define PME_ITA_L1I_DEMAND_READS 172
	{ "L1I_DEMAND_READS", {0x20}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_L1I_FILLS 173
	{ "L1I_FILLS", {0x21}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L1I_PREFETCH_READS 174
	{ "L1I_PREFETCH_READS", {0x24}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_L1_OUTSTANDING_REQ_HI 175
	{ "L1_OUTSTANDING_REQ_HI", {0x79}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L1_OUTSTANDING_REQ_LO 176
	{ "L1_OUTSTANDING_REQ_LO", {0x78}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L2_DATA_REFERENCES_ALL 177
	{ "L2_DATA_REFERENCES_ALL", {0x30069}, 0xf0, 2, {0xffff0007}, NULL},
#define PME_ITA_L2_DATA_REFERENCES_READS 178
	{ "L2_DATA_REFERENCES_READS", {0x10069}, 0xf0, 2, {0xffff0007}, NULL},
#define PME_ITA_L2_DATA_REFERENCES_WRITES 179
	{ "L2_DATA_REFERENCES_WRITES", {0x20069}, 0xf0, 2, {0xffff0007}, NULL},
#define PME_ITA_L2_FLUSH_DETAILS_ADDR_CONFLICT 180
	{ "L2_FLUSH_DETAILS_ADDR_CONFLICT", {0x20077}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L2_FLUSH_DETAILS_ALL 181
	{ "L2_FLUSH_DETAILS_ALL", {0xf0077}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L2_FLUSH_DETAILS_BUS_REJECT 182
	{ "L2_FLUSH_DETAILS_BUS_REJECT", {0x40077}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L2_FLUSH_DETAILS_FULL_FLUSH 183
	{ "L2_FLUSH_DETAILS_FULL_FLUSH", {0x80077}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L2_FLUSH_DETAILS_ST_BUFFER 184
	{ "L2_FLUSH_DETAILS_ST_BUFFER", {0x10077}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L2_FLUSHES 185
	{ "L2_FLUSHES", {0x76}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L2_INST_DEMAND_READS 186
	{ "L2_INST_DEMAND_READS", {0x22}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_L2_INST_PREFETCH_READS 187
	{ "L2_INST_PREFETCH_READS", {0x25}, 0xf0, 1, {0xffff0001}, NULL},
#define PME_ITA_L2_MISSES 188
	{ "L2_MISSES", {0x6a}, 0xf0, 2, {0xffff0007}, NULL},
#define PME_ITA_L2_REFERENCES 189
	{ "L2_REFERENCES", {0x68}, 0xf0, 3, {0xffff0007}, NULL},
#define PME_ITA_L3_LINES_REPLACED 190
	{ "L3_LINES_REPLACED", {0x7f}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_MISSES 191
	{ "L3_MISSES", {0x7c}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_READS_ALL_READS_ALL 192
	{ "L3_READS_ALL_READS_ALL", {0xf007d}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_READS_ALL_READS_HIT 193
	{ "L3_READS_ALL_READS_HIT", {0xd007d}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_READS_ALL_READS_MISS 194
	{ "L3_READS_ALL_READS_MISS", {0xe007d}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_READS_DATA_READS_ALL 195
	{ "L3_READS_DATA_READS_ALL", {0xb007d}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_READS_DATA_READS_HIT 196
	{ "L3_READS_DATA_READS_HIT", {0x9007d}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_READS_DATA_READS_MISS 197
	{ "L3_READS_DATA_READS_MISS", {0xa007d}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_READS_INST_READS_ALL 198
	{ "L3_READS_INST_READS_ALL", {0x7007d}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_READS_INST_READS_HIT 199
	{ "L3_READS_INST_READS_HIT", {0x5007d}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_READS_INST_READS_MISS 200
	{ "L3_READS_INST_READS_MISS", {0x6007d}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_REFERENCES 201
	{ "L3_REFERENCES", {0x7b}, 0xf0, 1, {0xffff0007}, NULL},
#define PME_ITA_L3_WRITES_ALL_WRITES_ALL 202
	{ "L3_WRITES_ALL_WRITES_ALL", {0xf007e}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_WRITES_ALL_WRITES_HIT 203
	{ "L3_WRITES_ALL_WRITES_HIT", {0xd007e}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_WRITES_ALL_WRITES_MISS 204
	{ "L3_WRITES_ALL_WRITES_MISS", {0xe007e}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_WRITES_DATA_WRITES_ALL 205
	{ "L3_WRITES_DATA_WRITES_ALL", {0x7007e}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_WRITES_DATA_WRITES_HIT 206
	{ "L3_WRITES_DATA_WRITES_HIT", {0x5007e}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_WRITES_DATA_WRITES_MISS 207
	{ "L3_WRITES_DATA_WRITES_MISS", {0x6007e}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_WRITES_L2_WRITEBACK_ALL 208
	{ "L3_WRITES_L2_WRITEBACK_ALL", {0xb007e}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_WRITES_L2_WRITEBACK_HIT 209
	{ "L3_WRITES_L2_WRITEBACK_HIT", {0x9007e}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_L3_WRITES_L2_WRITEBACK_MISS 210
	{ "L3_WRITES_L2_WRITEBACK_MISS", {0xa007e}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_LOADS_RETIRED 211
	{ "LOADS_RETIRED", {0x6c}, 0xf0, 2, {0xffff0007}, NULL},
#define PME_ITA_MEMORY_CYCLE 212
	{ "MEMORY_CYCLE", {0x7}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_MISALIGNED_LOADS_RETIRED 213
	{ "MISALIGNED_LOADS_RETIRED", {0x70}, 0xf0, 2, {0xffff0007}, NULL},
#define PME_ITA_MISALIGNED_STORES_RETIRED 214
	{ "MISALIGNED_STORES_RETIRED", {0x71}, 0xf0, 2, {0xffff0007}, NULL},
#define PME_ITA_NOPS_RETIRED 215
	{ "NOPS_RETIRED", {0x30}, 0x30, 6, {0xffff0003}, NULL},
#define PME_ITA_PIPELINE_ALL_FLUSH_CYCLE 216
	{ "PIPELINE_ALL_FLUSH_CYCLE", {0x4}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_PIPELINE_BACKEND_FLUSH_CYCLE 217
	{ "PIPELINE_BACKEND_FLUSH_CYCLE", {0x0}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_PIPELINE_FLUSH_ALL 218
	{ "PIPELINE_FLUSH_ALL", {0xf0033}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_PIPELINE_FLUSH_DTC_FLUSH 219
	{ "PIPELINE_FLUSH_DTC_FLUSH", {0x40033}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_PIPELINE_FLUSH_IEU_FLUSH 220
	{ "PIPELINE_FLUSH_IEU_FLUSH", {0x80033}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_PIPELINE_FLUSH_L1D_WAYMP_FLUSH 221
	{ "PIPELINE_FLUSH_L1D_WAYMP_FLUSH", {0x20033}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_PIPELINE_FLUSH_OTHER_FLUSH 222
	{ "PIPELINE_FLUSH_OTHER_FLUSH", {0x10033}, 0xf0, 1, {0xffff0000}, NULL},
#define PME_ITA_PREDICATE_SQUASHED_RETIRED 223
	{ "PREDICATE_SQUASHED_RETIRED", {0x31}, 0x30, 6, {0xffff0003}, NULL},
#define PME_ITA_RSE_LOADS_RETIRED 224
	{ "RSE_LOADS_RETIRED", {0x72}, 0xf0, 2, {0xffff0007}, NULL},
#define PME_ITA_RSE_REFERENCES_RETIRED 225
	{ "RSE_REFERENCES_RETIRED", {0x65}, 0xf0, 2, {0xffff0007}, NULL},
#define PME_ITA_STORES_RETIRED 226
	{ "STORES_RETIRED", {0x6d}, 0xf0, 2, {0xffff0007}, NULL},
#define PME_ITA_UC_LOADS_RETIRED 227
	{ "UC_LOADS_RETIRED", {0x6e}, 0xf0, 2, {0xffff0007}, NULL},
#define PME_ITA_UC_STORES_RETIRED 228
	{ "UC_STORES_RETIRED", {0x6f}, 0xf0, 2, {0xffff0007}, NULL},
#define PME_ITA_UNSTALLED_BACKEND_CYCLE 229
	{ "UNSTALLED_BACKEND_CYCLE", {0x5}, 0xf0, 1, {0xffff0000}, NULL}
};
#define PME_ITA_EVENT_COUNT 230
papi-papi-7-2-0-t/src/libperfnec/lib/libpfm.a
`Œ DŒŒŒ  ?E>E€?¾EX‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?€½¹=x EE—E€E€E™E `Œ DŒŒŒ `˜D˜˜€E˜ EŒ €—E `Œ DŒŒŒ ˜ EŒ €™EX‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¶EØ₫ÿÿ¿¾@¼¿¾¿0?Y½>½=YÈÿÿÿ¼‚°ÿÿÿ·Yº¼¿0@>¸@‘`ÿÿÿ¹…Đưÿÿ˜Z?.˜7Z>E€¿·=x·9E½0;n€E;?½?1½1½ 1?¾E?‚>E¿¾?I¨@EE—E€E€E™E `Œ DŒŒŒ `˜D˜˜€E˜ EŒ €—E `Œ DŒŒŒ ˜ EŒ €™EX‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?EØ₫ÿÿ¿¾¶6J5½>¿¾p4½>¿¾3½>¿>p2½>¿>½>¿>½>(¿>8½>¼»>º¹>¸»>»²;Y·¹>@¿>¹³9Y@½>H¿>¿´?Y ½>½µ=Yÿÿÿ¶đ₫ÿÿ˜Z€(:Y˜6E8E€?¾EX‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¿E‚E€EE¾E `Œ DŒŒŒ  ÿÿÿ€ÿÿÿEEE¿E `Œ DŒŒŒ X‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?‹ ‹ ‹ ‹‹ EX‰`‰h‰p‰x‰ ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?‚>E¿¾?IPEEE¿E `Œ DŒŒŒ X‰`‰h‰p‰x‰‰ E‹ ‹‹ ‹ ? `Œ DŒŒŒ  ?E>E€?¾EX‰`‰h‰p‰x‰‰ E‹ ‹‹ ‹ ?€½¹=xà›E¿E€½¹=x° `Œ DŒŒŒ — EŒ €™EX‰`‰h‰p‰x‰‰ E‹ ‹‹ ‹ ?­E`ÿÿÿ?¿¾?Y8½¹9Z½¿?¿¸@?¿¼?Yp8?¿¸@‘?¿¾?Y87·¼7Yp65µ¾5Y43³¼3Yp2H1±¾1Y0 /¯¼/Yp.>¿¸@8?¿¾?Y8»¾>Y>·¶@‘(7·¼7Yp6º¼E¿¾p:¿º?Y½>½»=Y¸ÿÿÿ¼‚˜ÿÿÿµY7·¼7Y6º¼E€E;?½?1½1½ 1?¾EhE `Œ DŒŒŒ ‰ E‹ ‹‹ ‹ ? `Œ DŒŒŒ ?E>E€?¾E‰ E‹ ‹‹ ‹ ?‹ ‹ ‹ ‹‹ E0ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾EhEE `Œ DŒŒŒ ‰ E‹ ‹‹ ‹ ?`€D€€°‹ `Œ DŒŒŒ  `Œ DŒŒŒ ?E>E€?¾E‰ E‹ ‹‹ ‹ ?‹ ‹ ‹ ‹‹ E0ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾EEE `Œ DŒŒŒ ‰ E‹ ‹‹ ‹ ?‹ ‹ ‹ ‹‹ E0ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾EEE `Œ DŒŒŒ ‰ E‹ ‹‹ ‹ ?‹ ‹ ‹ ‹‹ E0ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E@?‚>E¿¾I?EE¿E `Œ DŒŒŒ ‰ E‹ ‹‹ ‹ ?‹ ‹ ‹ ‹‹ E@ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E`€D€€°‹ `Œ DŒŒŒ  `Œ DŒŒŒ ?E>E€?¾E‰ E‹ ‹‹ ‹ ?‹ ‹ ‹ ‹‹ EX‰`‰h‰p‰x‰€‰ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E8 `Œ DŒŒŒ ?E>E€?¾EX‰`‰h‰p‰x‰€‰‰ E‹ ‹‹ ‹ ? 
`Œ DŒŒŒ  ?E>E€?¾EX‰`‰h‰p‰x‰€‰‰ E‹ ‹‹ ‹ ?—EE›Eh>´4J¿>½¼>h3»º>¹¼>h2¿>¹¼>H1¿> ¹¼>¼±¸>¿²?Y·º>¶>µº>º³:YXÿÿÿ´0ÿÿÿZĐÿÿÿ>4E¾‰8YØÿÿÿ>¾‰6Yœ;Y—=Yœ(7Yœ05Y:E½»=Yh>¿¾?YÈÿÿÿ¼°ÿÿÿ>Z€=E¾E;?½?1½1½ 1?¾EEE `Œ DŒŒŒ ‰ E‹ ‹‹ ‹ ?‹ ‹ ‹ ‹‹ EX‰`‰h‰0ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?€¿ `Œ DŒŒŒ X‰`‰h‰‰ E‹ ‹‹ ‹ ?¿?E¿?E¿?J°‹?¸‹À‹€E¿E `Œ DŒŒŒ €JX‰`‰h‰‰ E‹ ‹‹ ‹ ?¿E€EE `Œ DŒŒŒ —?E˜E™Eø₫ÿÿ?`¿?D¿¿¿>E¾>€ÿÿÿ¾„À₫ÿÿ‹ ‹ ‹ ‹‹ EX‰`‰h‰0ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?€¿ `Œ DŒŒŒ X‰`‰h‰‰ E‹ ‹‹ ‹ ?¿?E¿?E¿?J°‹?¸‹À‹€E¿E `Œ DŒŒŒ €JX‰`‰h‰‰ E‹ ‹‹ ‹ ?¿E€EE `Œ DŒŒŒ —?E˜E™Eø₫ÿÿ?`¿?D¿¿¿>E¾>€ÿÿÿ¾„À₫ÿÿ‹ ‹ ‹ ‹‹ EX‰`‰@ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?P¿ `Œ DŒŒŒ X‰`‰‰ E‹ ‹‹ ‹ ?¿?E¿?€E¿?J°‹?¸‹¿E `Œ DŒŒŒ €JX‰`‰‰ E‹ ‹‹ ‹ ?¿E€E `Œ DŒŒŒ —?E˜E ÿÿÿ?`¿?D¿¿¿>E¾>ÿÿÿ¾„è₫ÿÿ‹ ‹ ‹ ‹‹ EX‰`‰@ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?P¿ `Œ DŒŒŒ X‰`‰‰ E‹ ‹‹ ‹ ?¿?E¿?€E}¿?J°‹?¸‹¿E `Œ DŒŒŒ €JX‰`‰‰ E‹ ‹‹ ‹ ?¿E€E `Œ DŒŒŒ —?E˜E ÿÿÿ?`¿?D¿¿¿>E¾>ÿÿÿ¾„è₫ÿÿ‹ ‹ ‹ ‹‹ EX‰`‰h‰p‰0ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?°¿ `Œ DŒŒŒ X‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¿?E¿?‚E¿?J°‹?¸‹À‹È‹E€E¿E `Œ DŒŒŒ €JX‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¿E€EE‚E `Œ DŒŒŒ —?E˜E™EEĐ₫ÿÿ?`¿?D¿¿¿>E¾>pÿÿÿ¾„˜₫ÿÿ‹ ‹ ‹ ‹‹ EX‰`‰h‰p‰0ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?°¿ `Œ DŒŒŒ X‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¿?E¿?‚E¿ ?J°‹?¸‹À‹È‹E€E¿E `Œ DŒŒŒ €JX‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¿E€EE‚E `Œ DŒŒŒ —?E˜E™EEĐ₫ÿÿ?`¿?D¿¿¿>E¾>pÿÿÿ¾„˜₫ÿÿ‹ ‹ ‹ ‹‹ EX‰`‰h‰p‰0ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?°¿ `Œ DŒŒŒ X‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¿?E¿?‚E¿ ?J°‹?¸‹À‹È‹E€E¿E `Œ DŒŒŒ €JX‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¿E€EE‚E `Œ DŒŒŒ —?E˜E™EEĐ₫ÿÿ?`¿?D¿¿¿>E¾>pÿÿÿ¾„˜₫ÿÿ‹ ‹ ‹ ‹‹ EX‰`‰@ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?P¿ `Œ DŒŒŒ X‰`‰‰ E‹ ‹‹ ‹ ?¿?E¿?€E¿ ?J°‹?¸‹¿E `Œ DŒŒŒ €JX‰`‰‰ E‹ ‹‹ ‹ ?¿E€E `Œ DŒŒŒ —?E˜E ÿÿÿ?`¿?D¿¿¿>E¾>ÿÿÿ¾„è₫ÿÿ‹ ‹ ‹ ‹‹ EX‰`‰h‰p‰x‰ ₫ÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?¿ `Œ DŒŒŒ X‰`‰h‰p‰x‰‰ E‹ ‹‹ ‹ ?¿?E¿?ƒE°‹?¸‹À‹È‹Đ‹‚EE€E¿E `Œ DŒŒŒ €JX‰`‰h‰p‰x‰‰ E‹ ‹‹ ‹ ?¿E€EE‚EƒE `Œ DŒŒŒ —?E˜E™EE›E°₫ÿÿ?`¿?D¿¿¿>E¾>`ÿÿÿ¾„x₫ÿÿ”ÿÿÿ‰X‰`‰h‰p‰x‰‰ E‹ ‹‹ ‹ ?—EX‰`‰h‰p‰x‰‰ E‹ ‹‹ ‹ ?¿?E¿9‰Y˜E°‹¸‹À‹È‹™E `Œ DŒŒŒ  ÿÿÿ?¿‰Yÿÿÿ‰¾?Y@€J>¿E `Œ DŒŒŒ Ø₫ÿÿ—p₫ÿÿ¿E `Œ DŒŒŒ —?Eø₫ÿÿ¿Y ÿÿÿ>¾‰Yÿÿÿ‰?@ `Œ DŒŒŒ 
?`¿?D¿¿¿>E¾>Pÿÿÿ¾„p₫ÿÿ›?›Y—>Y—?›¾Hÿÿÿ9‰Y?EƒE‚E€€EE—E¿E `Œ DŒŒŒ `ÿÿÿ›Đ₫ÿÿ?`¿?D¿¿¿?hÿÿÿ¿˜üÿÿ‹ ‹ ‹ ‹‹ EX‰`‰h‰p‰0ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?°¿ `Œ DŒŒŒ X‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¿?E¿?‚E¿?J°‹?¸‹À‹È‹E€E¿E `Œ DŒŒŒ €JX‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¿E€EE‚E `Œ DŒŒŒ —?E˜E™EEĐ₫ÿÿ?`¿?D¿¿¿>E¾>pÿÿÿ¾„˜₫ÿÿ‹ ‹ ‹ ‹‹ EX‰`‰h‰p‰0ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?°¿ `Œ DŒŒŒ X‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¿?E¿?‚E¿?J°‹?¸‹À‹È‹E€E¿E `Œ DŒŒŒ €JX‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¿E€EE‚E `Œ DŒŒŒ —?E˜E™EEĐ₫ÿÿ?`¿?D¿¿¿>E¾>pÿÿÿ¾„˜₫ÿÿ‹ ‹ ‹ ‹‹ EX‰`‰h‰p‰0ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?°¿ `Œ DŒŒŒ X‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¿?E¿?‚E~¿?J°‹?¸‹À‹È‹E€E¿E `Œ DŒŒŒ €JX‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¿E€EE‚E `Œ DŒŒŒ —?E˜E™EEĐ₫ÿÿ?`¿?D¿¿¿>E¾>pÿÿÿ¾„˜₫ÿÿncc 3.0.8 (Build 08:14:57 Aug 26 2020)pfarg_start_t not supported in v3.x pfm_delete_evtsets not supported in v3.x ” '<`Cjx‡”intX¦« /B² %_¹ÄIÁ14Ê3;Ó7B0;7ÎÜ;8Îë;9Îú;:̃ –̃ ; –î ;;;¡;f;g€;h€%;i‹/;j–9;k–H;l–X;m– h;nÎ(v;oÎ0…;p–8”;q–@¥;r–H´;sP –® ;Á;tù – ;@<̣̉<‹Ü<‹ê<̣ – ;ø<Å0<R<€<€%<‹/<–< ̃b –r ;•uŸ?–ê@̣ -̣ ; – ; ±A ¿HIˆœs ctxHsÖHRÛHPäHjîJî‘PflK‹ ôaIàœ» fdaIa» aIR ‚Ipèœ: fd‚I%‚: ‚I*„@sz…j/†Ii‡Iret‡I® :±I` hœˆ fd±I%±: ±I LáIĐPœ¿ fdáIaá¿¶ fëI œü fdëItëür zơI°Øœ. fdơI ‡ûIØœZ fdûI—Ipøœ fdI®  I³Ip(œ́fdI®  IÊI  œ2fdIâ2 I’çQI@!ØœffdQI₫¡I "hœ fd¡Ia¡¿©I$hœÚfd©It©ü±I'0œ fd±I"¹I0)0œ6 fd¹I.ÁI`+ œ| fdÁI®Á  ÁIAÉI. 
œÂ fdÉI®É  ÉITÑI 0 œ fdÑIâÑ2 ÑIhÚI@30œ6 fdÚI{ăIp5œ¤ ctxăsÖăRÛăPäăjŕIí‘€”I€; œê fdI» I£I > œ0 fdI%: I²IÀ@ œv fdI%: IÀB†IÑB‡IçBˆI% $ > $ >   I: ; I : ;  : ; I8 I !I/ I .: ; 'I@–B : ; I: ; I4: ; I4: ; I4: ; I4: ; I 4: ; I .: ;'I@–B: ;I: ;I.?: ;'I@–B4: ;I 4: ;I4: ; I?<€ªû /opt/nec/ve/ncc/3.0.8/include/opt/nec/ve/include/sys/opt/nec/ve/include/opt/nec/ve/include/bits/opt/nec/ve/include/gnu/opt/nec/ve/include/linux/opt/nec/ve/include/asm/opt/nec/ve/include/asm-generic/storage/users/dgenet/papi/src/libperfnec/lib/../include/perfmonpfmlib_os_linux_v2.cstdc-predef.htypes.hfeatures.hcdefs.hwordsize.hstubs.htypes.htypesizes.htime.hstddef.hnecvals.hstdarg.hendian.hendian.hbyteswap.hbyteswap-16.hselect.hselect.hsigset.htime.hsysmacros.hpthreadtypes.hstdint.hstdint.hwchar.hstdio.hlibio.h_G_config.hwchar.hstdio_lim.hsys_errlist.hstring.hstring.hxlocale.hstdlib.hstdlib.hwaitflags.hwaitstatus.halloca.hstdlib-float.hunistd.hposix_opt.henvironments.hconfname.hgetopt.herrno.herrno.herrno.herrno.herrno.herrno.herrno-base.hsyscall.hsyscall.hunistd.hunistd_64.hsyscall.hperfmon.h perfmon_v2.h pfmlib.h inttypes.hinttypes.hpfmlib_os.h pfmlib_comp.h pfmlib_priv.hpfmlib_priv_comp.h Ȱp˜@xt(ó@ ơ(óPùP@ååv`ƒƒ€„€80ñƒîxàzph˜(P@åæh`󃀄ƒ…ƒ~†„óơôós‚ ‚„q‚‚ƒp‚‚p‚ äusä0(óp@&˜ ç˜(óX ( ̣åårhsƒ0000†yt…†yt†óôyäpPx`óƒ€ô€8q…0p†p†àz %H€`(ó0€`@(ó0xhxhxˆx@(ó8°(ó`(ó ˆƒ„…†y‚‰ƒx‚ƒw‚ ‚„u‚ ‚†q̣ äxṭú†q‚Xååe€`XózhHx(p-PxÏh(pÈ(pȈ(h˜ˆ(h˜˜(xè˜(xè˜(xèˆ(h˜ (€ˆzÈ`ñƒî…ó80 …q…(y88x 8˜(xè˜(xè˜(xèncc 3.0.8 (Build 08:14:57 Aug 26 2020)pfmlib_os_linux_v2.c/storage/users/dgenet/papi/src/libperfnec/libunsigned charunsigned shortunsigned intlong unsigned 
intchar_Sizetsize_tint32_tuint16_tuint32_tuint64_tsif_avail_pmcssif_avail_pmdssif_reservedpfarg_sinfo_treg_numreg_setreg_flagsreg_valuereg_long_resetreg_short_resetreg_random_maskreg_smpl_pmdsreg_reset_pmdsreg_ovfl_swcntreg_smpl_eventidreg_last_valuereg_reservedpfarg_pmd_attr_tctx_flagsctx_reserved1ctx_reserved3pfarg_ctx_treg_reserved2pfarg_pmc_treg_last_reset_valreg_ovfl_switch_cntreg_random_seedpfarg_pmd_tstart_setstart_reserved1start_reserved2reserved3pfarg_start_tload_pidload_setload_reserved1load_reserved2pfarg_load_tset_idset_reserved1set_flagsset_timeoutreservedpfarg_setdesc_tset_ovfl_pmdsset_runsset_act_durationset_avail_pmcsset_avail_pmdsset_reserved3pfarg_setinfo_tctx_smpl_buf_idctx_fdctx_smpl_buf_sizepfarg_ctx22_tpfm_create_context_2v3namesmpl_argsmpl_sizecinfopfm_write_pmcs_2v3pmcscountpfm_write_pmds_2v3pmdspmaserrno_savepfm_read_pmds_2v3pfm_load_context_2v3loadpfm_start_2v3startpfm_stop_2v3pfm_restart_2v3pfm_create_evtsets_2v3setdpfm_delete_evtsets_2v3pfm_getinfo_evtsets_2v3infopfm_unload_context_2v3pfm_load_contextpfm_startpfm_stoppfm_restartpfm_create_evtsetspfm_delete_evtsetspfm_getinfo_evtsetspfm_unload_contextpfm_create_contextctx22pfm_write_pmcspfm_write_pmdspfm_read_pmds_pfmlib_sys_base_pfmlib_major_version_pfmlib_minor_versionzRx   ˆH‰` h Đ (@àH‰` đ  0 (lpèH‰`   đ (˜` hH‰`  ˜ Đ  ÄĐPH‰`  p  è H‰`  °  °ØH‰` ˆ ,ØH‰` ˆ LpøH‰` ¨ lp(H‰` Ø (Œ  H‰`      ¸@!ØH‰` ˆ  Ø "hH‰` Đ ¸  ü$hH‰` Đ ¸  '0H‰` À    D0)0H‰` À    h`+ H‰` à Đ  Œ. H‰` à Đ  ° 0 H‰` à Đ  Ô@30H‰` À   ,øp5H‰` đ à  `  (€; H‰` à Đ  L > H‰` à Đ  pÀ@ H‰` à Đ ñÿˆ-à@pèS` heĐPz ˆ°Ø•Ø¥pø¼p(Ó  ë@!Ø     (-4=HVeu†Œ™ "hªÀÈÚë$hơ'0₫0)0 `+ . 
0 0 D@30Wp5jw€; œ > «À@ pfmlib_os_linux_v2.cpfm_create_context_2v3pfm_write_pmcs_2v3pfm_write_pmds_2v3pfm_read_pmds_2v3pfm_load_context_2v3pfm_start_2v3pfm_stop_2v3pfm_restart_2v3pfm_create_evtsets_2v3pfm_delete_evtsets_2v3pfm_getinfo_evtsets_2v3pfm_unload_context_2v3pfm_create__errno_locationpfm_writefreecallocpfm_readpfm_attachpfm_set_state__pfm_vbprintfpfm_create_setspfm_getinfo_setsclose__vec_memsetpfm_load_context_pfmlib_major_versionsyscallpfm_init_syscalls_pfmlib_sys_basepfm_startpfm_stoppfm_restartpfm_create_evtsetspfm_delete_evtsetspfm_getinfo_evtsetspfm_unload_contextpfm_create_context__vec_memcpy_pfmlib_minor_versionpfm_write_pmcspfm_write_pmdspfm_read_pmds`$p#$ #`$p#Đ$à# $°#ÀĐ$#@$P#`$p#€À$Đ#Ø $è #˜ $¨ #è $ø #@ $P #¸ $È #ˆ$˜#@$P#`p˜$¨#`$p#°$À#°$À#  $ 0# @$P#8$H#$(#$!(#!è 0ø 0$ # ($8#P$`#đ$#$ #$" #"0@ $0 #X $#h ##Đ $$à #$!$ !#È!$Ø!#°"&À"&Ø"Đè"Đˆ#$'˜##'$$( $#(P$)`$) %&0%&H% X% ø%$'&#'€&$(&#(À&)Đ&)ˆ'&˜'&°'°À'°H($'X(#'À($(Đ(#(ø()))¸)&È)&à)đ)x*$'ˆ*#'đ*$(+#((+)8+)ø+&,& ,p0,pè,$'ø,#'€-$(-#(È-)Ø-)˜.&¨.&À.pĐ.pˆ/$'˜/#' 0$(00#(h0)x0)81&H1&`1 p1 (2$'82#'À2$(Đ2#(3)3)È3&Ø3&đ3@!4@!ˆ4$'˜4#'5$(5#(85)H5)6& 6&86H67$' 7#'¸7$(È7#(8)8)H9$'X9#'¨9$2¸9#2à9$(đ9#(@:$2P:#2`:)p:);$$(;#$H;3X;3<&(<&@<P<=$'=#' =$(°=#(è=)ø=)¸>&È>&à>pđ>p¨?$'¸?#'@@$(P@#(ˆ@)˜@)XA&hA&€A` A` HB$'XB#'àB$(đB#((C)8C) '<)0j7x>‡E”[¦`«k²v¹ÁŒÊ—ÓªܶëÂúï%&/29>HJXVhbvn…z”†¥’´¯ÁÎ̉ÚÜæêø".%:/FSfr~%/–9¢H®º1Æv̉h̃”êXöEU2a>kJ{V‹s•†£’¬µªÄ·ÓÊàÖçâơîÿú à&ç2ơ>$J2Vÿb;nLz[†j“x¦ˆ²̉¾˜ÊŸÖ걿:ÖEÛPä[îzô„¥° Ầpí%ø */G:Q` r%} ‰L“Đ´aÆfĐ ñtz °/‡9[—fpˆ®” §³²pÔ®à íÊø â& 9çD@!g₫r "”a¡¬$ÎtÛæ' " 0)7 .B `+d ®p  } Aˆ .ª ®¶  Ă TÎ  0đ âü   h @37 {B p5e Öq Û} ä” ¥ ”° €;̉ ̃  ë £ö  > %$  1 ²< À@^ %j  w À‚ Ñ ç· D@plœ˜ÈÄ́è 0,PLplŒ¼¸ÜØü$ HDlhŒ´°ØÔüø,(PLtp.symtab.strtab.shstrtab.rela.text.data.bss.comment.rodata.rela.debug_info.debug_abbrev.rela.debug_line.debug_str.rela.eh_frame @`C@8jp& C, C10 C(:ĐCZG*D˜ B@¨~SÂOvf8Q„a@¸ r0¼Xư‚À^}@Đ@ ’ŒPb( xg¹/41 1604947638 15504 3101 100644 14040 ` ELFû˜2@@‹ ‹ ‹ ‹‹ EX‰`‰h‰p‰ ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E°‰¸‰?E>E=E<`¼Ø‹= `Œ DŒŒŒ 
€JX‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¼E¿E¾E½E `Œ DŒŒŒ —E=EÀ₫ÿÿ<`¼»=‚? ÿÿÿ°‰<;»¼;D¸ÿÿÿ»ƒxÿÿÿ¸<¼‰E€?¾EX‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?‹ ‹ ‹ ‹‹ EX‰`‰h‰p‰x‰€‰ ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?Đ¿¿?E¿?„E¿?J°‹?¸‹À‹È‹Đ‹Ø‹ƒE‚EE€E¿E `Œ DŒŒŒ €JX‰`‰h‰p‰x‰€‰‰ E‹ ‹‹ ‹ ?¿E€EE‚EƒE„E `Œ DŒŒŒ —?E˜E™EE›EœE€₫ÿÿ?`¿?D¿¿¿>E¾>Pÿÿÿ¾„H₫ÿÿZEX‰`‰h‰p‰x‰€‰‰ E‹ ‹‹ ‹ ?‹ ‹ ‹ ‹‹ EX‰`‰h‰p‰x‰€‰ ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?Đ¿¿?E¿?„E¿?J°‹?¸‹À‹È‹Đ‹Ø‹ƒE‚EE€E¿E `Œ DŒŒŒ €JX‰`‰h‰p‰x‰€‰‰ E‹ ‹‹ ‹ ?¿E€EE‚EƒE„E `Œ DŒŒŒ —?E˜E™EE›EœE€₫ÿÿ?`¿?D¿¿¿>E¾>Pÿÿÿ¾„H₫ÿÿZEX‰`‰h‰p‰x‰€‰‰ E‹ ‹‹ ‹ ?‹ ‹ ‹ ‹‹ EX‰`‰h‰p‰x‰ ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?¨¿h¿?E¿?ƒE¿?J°‹?¸‹À‹È‹Đ‹‚EE€E¿E `Œ DŒŒŒ €JX‰`‰h‰p‰x‰‰ E‹ ‹‹ ‹ ?¿E€EE‚EƒE `Œ DŒŒŒ —?E˜E™EE›E¨₫ÿÿ?`¿?D¿¿¿>E¾>`ÿÿÿ¾„p₫ÿÿZEX‰`‰h‰p‰x‰‰ E‹ ‹‹ ‹ ?‹ ‹ ‹ ‹‹ EX‰`‰h‰p‰x‰ ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?¨¿h¿?E¿?ƒE¿?J°‹?¸‹À‹È‹Đ‹‚EE€E¿E `Œ DŒŒŒ €JX‰`‰h‰p‰x‰‰ E‹ ‹‹ ‹ ?¿E€EE‚EƒE `Œ DŒŒŒ —?E˜E™EE›E¨₫ÿÿ?`¿?D¿¿¿>E¾>`ÿÿÿ¾„p₫ÿÿZEX‰`‰h‰p‰x‰‰ E‹ ‹‹ ‹ ?‹ ‹ ‹ ‹‹ EX‰`‰h‰p‰0ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?€¿@¿?E¿?‚E~¿?J°‹?¸‹À‹È‹E€E¿E `Œ DŒŒŒ €JX‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¿E€EE‚E `Œ DŒŒŒ —?E˜E™EEĐ₫ÿÿ?`¿?D¿¿¿>E¾>pÿÿÿ¾„˜₫ÿÿZEX‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?‹ ‹ ‹ ‹‹ EX‰`‰h‰p‰0ÿÿÿ‰ Hˆ‹5=€>E;?½?1½1½ 1?¾E?`¿?D¿¿¿?€¿@¿?E¿?‚E¿?J°‹?¸‹À‹È‹E€E¿E `Œ DŒŒŒ €JX‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?¿E€EE‚E `Œ DŒŒŒ —?E˜E™EEĐ₫ÿÿ?`¿?D¿¿¿>E¾>pÿÿÿ¾„˜₫ÿÿZEX‰`‰h‰p‰‰ E‹ ‹‹ ‹ ?ncc 3.0.8 (Build 08:14:57 Aug 26 2020)-'<`jy†intQ˜ /;¤ K¬ %X³1-¼34Å7;037ÇÎ38ÇƯ39Ḉ3:× × 4 ç 4ù3;@3Œ73y3y3„&323‘7 G 4;3’̣H3”»3•y3–yL3—„Z3˜Çh3™&3q3› ~3œ×(Œ3R 7B˜œ> ¨7B‘° sif7>‘¸ap9c®:K³;I¼fdfB¨fBîf>szfnG ónBp ÈœfdnB¨nBnsznn» vB@œØfdvB¨vBvB ~BĐœfd~B¨~B)~B/@†B@@‡B% $ > $ >   I: ; I : ;  : ; I8 I !I/ .?: ; 'I@–B : ; I : ; I4: ; I4: ; I4: ; I : ; I: ; I4: ; I?<'’û 
/opt/nec/ve/ncc/3.0.8/include/opt/nec/ve/include/sys/opt/nec/ve/include/opt/nec/ve/include/bits/opt/nec/ve/include/gnu/opt/nec/ve/include/linux/opt/nec/ve/include/asm/opt/nec/ve/include/asm-generic/storage/users/dgenet/papi/src/libperfnec/lib/../include/perfmonpfmlib_os_linux_v3.cstdc-predef.htypes.hfeatures.hcdefs.hwordsize.hstubs.htypes.htypesizes.htime.hstddef.hnecvals.hstdarg.hendian.hendian.hbyteswap.hbyteswap-16.hselect.hselect.hsigset.htime.hsysmacros.hpthreadtypes.hstdint.hstdint.hwchar.hstdlib.hstdlib.hwaitflags.hwaitstatus.hxlocale.halloca.hstdlib-float.hunistd.hposix_opt.henvironments.hconfname.hgetopt.herrno.herrno.herrno.herrno.herrno.herrno.herrno-base.hsyscall.hsyscall.hunistd.hunistd_64.hsyscall.hperfmon.h perfmon_v2.h pfmlib.h stdio.hlibio.h_G_config.hwchar.hstdio_lim.hsys_errlist.hinttypes.hinttypes.hpfmlib_os.h pfmlib_comp.h pfmlib_priv.hpfmlib_priv_comp.h 7¨ó…8ˆp € ƒó€ñ(ăà(óX¨8¸h¨8¸h 8h 8h˜8èX˜8èXncc 3.0.8 (Build 08:14:57 Aug 26 2020)pfmlib_os_linux_v3.c/storage/users/dgenet/papi/src/libperfnec/libunsigned shortunsigned intlong unsigned intchar_Sizetva_listsize_tuint16_tuint32_tuint64_tsif_avail_pmcssif_avail_pmdssif_reservedpfarg_sinfo_tset_idset_reserved1set_flagsset_timeoutreservedpfarg_set_desc_tset_reserved2set_ovfl_pmdsset_runsset_durationset_reserved3pfarg_set_info_tpfm_createflagsnamesmpl_argsmpl_sizepfm_writetypepfm_readpfm_create_setssetdpfm_getinfo_setsinfopfm_attachtargetpfm_set_statestate_pfmlib_sys_base_pfmlib_major_versionzRx  $˜H‰` € È $D H‰` ˆ ( $l H‰` ˆ ( $”  ÈH‰` h  $¼p ÈH‰` h   ä@H‰` H ø  ĐH‰` H ø ñÿ   ˜!7?Qbs } †  È–p ȧ@²Đpfmlib_os_linux_v3.cpfm_create_pfmlib_major_versionsyscallpfm_init_syscalls_pfmlib_sys_base__errno_locationpfm_writepfm_readpfm_create_setspfm_getinfo_setspfm_attachpfm_set_stateÀ Đ X$ h# đ$#8H$#H X đ$ # ¨$¸#H X đ$ # ¨$¸#  @  P  Ø $ è # € $ #Đ à    ¨ $ ¸ # P$`# °Ø è `$ p# ø$#@Ph x đ$ # ˆ$˜#Đà '<)0j7y>†T˜Yd¤o¬z³…¼Å£ίƯ»́èùû&+2H;[gsLZ‹h—&£q¯~¼ŒÇÑ訮³'¼EÆO p¨{ĐœƠ¦ Ç¨̉Đó̃ư  ¨)îEóOp p¨{— ¡@¨ÍÙăШ)/&@Ÿ HDpl˜”À¼èä 
papi-papi-7-2-0-t/src/libperfnec/lib/montecito_events.h000066400000000000000000003721371502707512200230560ustar00rootroot00000000000000/*
 * Copyright (c) 2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
 */

/*
 * This file is generated automatically
 * !! DO NOT CHANGE !!
*/ static pme_mont_entry_t montecito_pe []={ #define PME_MONT_ALAT_CAPACITY_MISS_ALL 0 { "ALAT_CAPACITY_MISS_ALL", {0x30058}, 0xfff0, 2, {0xffff0007}, "ALAT Entry Replaced -- both integer and floating point instructions"}, #define PME_MONT_ALAT_CAPACITY_MISS_FP 1 { "ALAT_CAPACITY_MISS_FP", {0x20058}, 0xfff0, 2, {0xffff0007}, "ALAT Entry Replaced -- only floating point instructions"}, #define PME_MONT_ALAT_CAPACITY_MISS_INT 2 { "ALAT_CAPACITY_MISS_INT", {0x10058}, 0xfff0, 2, {0xffff0007}, "ALAT Entry Replaced -- only integer instructions"}, #define PME_MONT_BACK_END_BUBBLE_ALL 3 { "BACK_END_BUBBLE_ALL", {0x0}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe -- Front-end, RSE, EXE, FPU/L1D stall or a pipeline flush due to an exception/branch misprediction"}, #define PME_MONT_BACK_END_BUBBLE_FE 4 { "BACK_END_BUBBLE_FE", {0x10000}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe -- front-end"}, #define PME_MONT_BACK_END_BUBBLE_L1D_FPU_RSE 5 { "BACK_END_BUBBLE_L1D_FPU_RSE", {0x20000}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe -- L1D_FPU or RSE."}, #define PME_MONT_BE_BR_MISPRED_DETAIL_ANY 6 { "BE_BR_MISPRED_DETAIL_ANY", {0x61}, 0xfff0, 1, {0xffff0003}, "BE Branch Misprediction Detail -- any back-end (be) mispredictions"}, #define PME_MONT_BE_BR_MISPRED_DETAIL_PFS 7 { "BE_BR_MISPRED_DETAIL_PFS", {0x30061}, 0xfff0, 1, {0xffff0003}, "BE Branch Misprediction Detail -- only back-end pfs mispredictions for taken branches"}, #define PME_MONT_BE_BR_MISPRED_DETAIL_ROT 8 { "BE_BR_MISPRED_DETAIL_ROT", {0x20061}, 0xfff0, 1, {0xffff0003}, "BE Branch Misprediction Detail -- only back-end rotate mispredictions"}, #define PME_MONT_BE_BR_MISPRED_DETAIL_STG 9 { "BE_BR_MISPRED_DETAIL_STG", {0x10061}, 0xfff0, 1, {0xffff0003}, "BE Branch Misprediction Detail -- only back-end stage mispredictions"}, #define PME_MONT_BE_EXE_BUBBLE_ALL 10 { "BE_EXE_BUBBLE_ALL", {0x2}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- 
Back-end was stalled by exe"}, #define PME_MONT_BE_EXE_BUBBLE_ARCR 11 { "BE_EXE_BUBBLE_ARCR", {0x40002}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to AR or CR dependency"}, #define PME_MONT_BE_EXE_BUBBLE_ARCR_PR_CANCEL_BANK 12 { "BE_EXE_BUBBLE_ARCR_PR_CANCEL_BANK", {0x80002}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- ARCR, PR, CANCEL or BANK_SWITCH"}, #define PME_MONT_BE_EXE_BUBBLE_BANK_SWITCH 13 { "BE_EXE_BUBBLE_BANK_SWITCH", {0x70002}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to bank switching."}, #define PME_MONT_BE_EXE_BUBBLE_CANCEL 14 { "BE_EXE_BUBBLE_CANCEL", {0x60002}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to a canceled load"}, #define PME_MONT_BE_EXE_BUBBLE_FRALL 15 { "BE_EXE_BUBBLE_FRALL", {0x20002}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to FR/FR or FR/load dependency"}, #define PME_MONT_BE_EXE_BUBBLE_GRALL 16 { "BE_EXE_BUBBLE_GRALL", {0x10002}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to GR/GR or GR/load dependency"}, #define PME_MONT_BE_EXE_BUBBLE_GRGR 17 { "BE_EXE_BUBBLE_GRGR", {0x50002}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to GR/GR dependency"}, #define PME_MONT_BE_EXE_BUBBLE_PR 18 { "BE_EXE_BUBBLE_PR", {0x30002}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to PR dependency"}, #define PME_MONT_BE_FLUSH_BUBBLE_ALL 19 { "BE_FLUSH_BUBBLE_ALL", {0x4}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Flushes. 
-- Back-end was stalled due to either an exception/interruption or branch misprediction flush"}, #define PME_MONT_BE_FLUSH_BUBBLE_BRU 20 { "BE_FLUSH_BUBBLE_BRU", {0x10004}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Flushes. -- Back-end was stalled due to a branch misprediction flush"}, #define PME_MONT_BE_FLUSH_BUBBLE_XPN 21 { "BE_FLUSH_BUBBLE_XPN", {0x20004}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Flushes. -- Back-end was stalled due to an exception/interruption flush"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_ALL 22 { "BE_L1D_FPU_BUBBLE_ALL", {0xca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D or FPU"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_FPU 23 { "BE_L1D_FPU_BUBBLE_FPU", {0x100ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by FPU."}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D 24 { "BE_L1D_FPU_BUBBLE_L1D", {0x200ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D. 
This includes all stalls caused by the L1 pipeline (created in the L1D stage of the L1 pipeline which corresponds to the DET stage of the main pipe)."}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_AR_CR 25 { "BE_L1D_FPU_BUBBLE_L1D_AR_CR", {0x800ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to ar/cr requiring a stall"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_FILLCONF 26 { "BE_L1D_FPU_BUBBLE_L1D_FILLCONF", {0x700ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due a store in conflict with a returning fill."}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_FULLSTBUF 27 { "BE_L1D_FPU_BUBBLE_L1D_FULLSTBUF", {0x300ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to store buffer being full"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_HPW 28 { "BE_L1D_FPU_BUBBLE_L1D_HPW", {0x500ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to Hardware Page Walker"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_L2BPRESS 29 { "BE_L1D_FPU_BUBBLE_L1D_L2BPRESS", {0x900ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to L2 Back Pressure"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_LDCHK 30 { "BE_L1D_FPU_BUBBLE_L1D_LDCHK", {0xc00ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to architectural ordering conflict"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_LDCONF 31 { "BE_L1D_FPU_BUBBLE_L1D_LDCONF", {0xb00ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to architectural ordering conflict"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_NAT 32 { "BE_L1D_FPU_BUBBLE_L1D_NAT", {0xd00ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due 
to FPU or L1D Cache -- Back-end was stalled by L1D due to L1D data return needing recirculated NaT generation."}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_NATCONF 33 { "BE_L1D_FPU_BUBBLE_L1D_NATCONF", {0xf00ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to ld8.fill conflict with st8.spill not written to unat."}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_PIPE_RECIRC 34 { "BE_L1D_FPU_BUBBLE_L1D_PIPE_RECIRC", {0x400ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to recirculate"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_STBUFRECIR 35 { "BE_L1D_FPU_BUBBLE_L1D_STBUFRECIR", {0xe00ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to store buffer cancel needing recirculate."}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_TLB 36 { "BE_L1D_FPU_BUBBLE_L1D_TLB", {0xa00ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to L2DTLB to L1DTLB transfer"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_ALL 37 { "BE_LOST_BW_DUE_TO_FE_ALL", {0x72}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- count regardless of cause"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_BI 38 { "BE_LOST_BW_DUE_TO_FE_BI", {0x90072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by branch initialization stall"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_BRQ 39 { "BE_LOST_BW_DUE_TO_FE_BRQ", {0xa0072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by branch retirement queue stall"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_BR_ILOCK 40 { "BE_LOST_BW_DUE_TO_FE_BR_ILOCK", {0xc0072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. 
-- only if caused by branch interlock stall"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_BUBBLE 41 { "BE_LOST_BW_DUE_TO_FE_BUBBLE", {0xd0072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by branch resteer bubble stall"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_FEFLUSH 42 { "BE_LOST_BW_DUE_TO_FE_FEFLUSH", {0x10072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by a front-end flush"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_FILL_RECIRC 43 { "BE_LOST_BW_DUE_TO_FE_FILL_RECIRC", {0x80072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by a recirculate for a cache line fill operation"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_IBFULL 44 { "BE_LOST_BW_DUE_TO_FE_IBFULL", {0x50072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- (* meaningless for this event *)"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_IMISS 45 { "BE_LOST_BW_DUE_TO_FE_IMISS", {0x60072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by instruction cache miss stall"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_PLP 46 { "BE_LOST_BW_DUE_TO_FE_PLP", {0xb0072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by perfect loop prediction stall"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_TLBMISS 47 { "BE_LOST_BW_DUE_TO_FE_TLBMISS", {0x70072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by TLB stall"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_UNREACHED 48 { "BE_LOST_BW_DUE_TO_FE_UNREACHED", {0x40072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. 
-- only if caused by unreachable bundle"}, #define PME_MONT_BE_RSE_BUBBLE_ALL 49 { "BE_RSE_BUBBLE_ALL", {0x1}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE"}, #define PME_MONT_BE_RSE_BUBBLE_AR_DEP 50 { "BE_RSE_BUBBLE_AR_DEP", {0x20001}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to AR dependencies"}, #define PME_MONT_BE_RSE_BUBBLE_BANK_SWITCH 51 { "BE_RSE_BUBBLE_BANK_SWITCH", {0x10001}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to bank switching"}, #define PME_MONT_BE_RSE_BUBBLE_LOADRS 52 { "BE_RSE_BUBBLE_LOADRS", {0x50001}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to loadrs calculations"}, #define PME_MONT_BE_RSE_BUBBLE_OVERFLOW 53 { "BE_RSE_BUBBLE_OVERFLOW", {0x30001}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to need to spill"}, #define PME_MONT_BE_RSE_BUBBLE_UNDERFLOW 54 { "BE_RSE_BUBBLE_UNDERFLOW", {0x40001}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to need to fill"}, #define PME_MONT_BR_MISPRED_DETAIL_ALL_ALL_PRED 55 { "BR_MISPRED_DETAIL_ALL_ALL_PRED", {0x5b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- All branch types regardless of prediction result"}, #define PME_MONT_BR_MISPRED_DETAIL_ALL_CORRECT_PRED 56 { "BR_MISPRED_DETAIL_ALL_CORRECT_PRED", {0x1005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- All branch types, correctly predicted branches (outcome and target)"}, #define PME_MONT_BR_MISPRED_DETAIL_ALL_WRONG_PATH 57 { "BR_MISPRED_DETAIL_ALL_WRONG_PATH", {0x2005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- All branch types, mispredicted branches due to wrong branch direction"}, #define 
PME_MONT_BR_MISPRED_DETAIL_ALL_WRONG_TARGET 58 { "BR_MISPRED_DETAIL_ALL_WRONG_TARGET", {0x3005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- All branch types, mispredicted branches due to wrong target for taken branches"}, #define PME_MONT_BR_MISPRED_DETAIL_IPREL_ALL_PRED 59 { "BR_MISPRED_DETAIL_IPREL_ALL_PRED", {0x4005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only IP relative branches, regardless of prediction result"}, #define PME_MONT_BR_MISPRED_DETAIL_IPREL_CORRECT_PRED 60 { "BR_MISPRED_DETAIL_IPREL_CORRECT_PRED", {0x5005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only IP relative branches, correctly predicted branches (outcome and target)"}, #define PME_MONT_BR_MISPRED_DETAIL_IPREL_WRONG_PATH 61 { "BR_MISPRED_DETAIL_IPREL_WRONG_PATH", {0x6005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only IP relative branches, mispredicted branches due to wrong branch direction"}, #define PME_MONT_BR_MISPRED_DETAIL_IPREL_WRONG_TARGET 62 { "BR_MISPRED_DETAIL_IPREL_WRONG_TARGET", {0x7005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only IP relative branches, mispredicted branches due to wrong target for taken branches"}, #define PME_MONT_BR_MISPRED_DETAIL_NRETIND_ALL_PRED 63 { "BR_MISPRED_DETAIL_NRETIND_ALL_PRED", {0xc005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only non-return indirect branches, regardless of prediction result"}, #define PME_MONT_BR_MISPRED_DETAIL_NRETIND_CORRECT_PRED 64 { "BR_MISPRED_DETAIL_NRETIND_CORRECT_PRED", {0xd005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only non-return indirect branches, correctly predicted branches (outcome and target)"}, #define PME_MONT_BR_MISPRED_DETAIL_NRETIND_WRONG_PATH 65 { "BR_MISPRED_DETAIL_NRETIND_WRONG_PATH", {0xe005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only non-return indirect branches, mispredicted branches due to wrong branch direction"}, #define 
PME_MONT_BR_MISPRED_DETAIL_NRETIND_WRONG_TARGET 66 { "BR_MISPRED_DETAIL_NRETIND_WRONG_TARGET", {0xf005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only non-return indirect branches, mispredicted branches due to wrong target for taken branches"}, #define PME_MONT_BR_MISPRED_DETAIL_RETURN_ALL_PRED 67 { "BR_MISPRED_DETAIL_RETURN_ALL_PRED", {0x8005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only return type branches, regardless of prediction result"}, #define PME_MONT_BR_MISPRED_DETAIL_RETURN_CORRECT_PRED 68 { "BR_MISPRED_DETAIL_RETURN_CORRECT_PRED", {0x9005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only return type branches, correctly predicted branches (outcome and target)"}, #define PME_MONT_BR_MISPRED_DETAIL_RETURN_WRONG_PATH 69 { "BR_MISPRED_DETAIL_RETURN_WRONG_PATH", {0xa005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only return type branches, mispredicted branches due to wrong branch direction"}, #define PME_MONT_BR_MISPRED_DETAIL_RETURN_WRONG_TARGET 70 { "BR_MISPRED_DETAIL_RETURN_WRONG_TARGET", {0xb005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only return type branches, mispredicted branches due to wrong target for taken branches"}, #define PME_MONT_BR_MISPRED_DETAIL2_ALL_ALL_UNKNOWN_PRED 71 { "BR_MISPRED_DETAIL2_ALL_ALL_UNKNOWN_PRED", {0x68}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- All branch types, branches with unknown path prediction"}, #define PME_MONT_BR_MISPRED_DETAIL2_ALL_UNKNOWN_PATH_CORRECT_PRED 72 { "BR_MISPRED_DETAIL2_ALL_UNKNOWN_PATH_CORRECT_PRED", {0x10068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- All branch types, branches with unknown path prediction and correctly predicted branch (outcome & target)"}, #define PME_MONT_BR_MISPRED_DETAIL2_ALL_UNKNOWN_PATH_WRONG_PATH 73 { "BR_MISPRED_DETAIL2_ALL_UNKNOWN_PATH_WRONG_PATH", {0x20068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict 
Detail (Unknown Path Component) -- All branch types, branches with unknown path prediction and wrong branch direction"}, #define PME_MONT_BR_MISPRED_DETAIL2_IPREL_ALL_UNKNOWN_PRED 74 { "BR_MISPRED_DETAIL2_IPREL_ALL_UNKNOWN_PRED", {0x40068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only IP relative branches, branches with unknown path prediction"}, #define PME_MONT_BR_MISPRED_DETAIL2_IPREL_UNKNOWN_PATH_CORRECT_PRED 75 { "BR_MISPRED_DETAIL2_IPREL_UNKNOWN_PATH_CORRECT_PRED", {0x50068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only IP relative branches, branches with unknown path prediction and correctly predicted branch (outcome & target)"}, #define PME_MONT_BR_MISPRED_DETAIL2_IPREL_UNKNOWN_PATH_WRONG_PATH 76 { "BR_MISPRED_DETAIL2_IPREL_UNKNOWN_PATH_WRONG_PATH", {0x60068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only IP relative branches, branches with unknown path prediction and wrong branch direction"}, #define PME_MONT_BR_MISPRED_DETAIL2_NRETIND_ALL_UNKNOWN_PRED 77 { "BR_MISPRED_DETAIL2_NRETIND_ALL_UNKNOWN_PRED", {0xc0068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only non-return indirect branches, branches with unknown path prediction"}, #define PME_MONT_BR_MISPRED_DETAIL2_NRETIND_UNKNOWN_PATH_CORRECT_PRED 78 { "BR_MISPRED_DETAIL2_NRETIND_UNKNOWN_PATH_CORRECT_PRED", {0xd0068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only non-return indirect branches, branches with unknown path prediction and correctly predicted branch (outcome & target)"}, #define PME_MONT_BR_MISPRED_DETAIL2_NRETIND_UNKNOWN_PATH_WRONG_PATH 79 { "BR_MISPRED_DETAIL2_NRETIND_UNKNOWN_PATH_WRONG_PATH", {0xe0068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only non-return indirect branches, branches with unknown path prediction and wrong branch direction"}, #define 
PME_MONT_BR_MISPRED_DETAIL2_RETURN_ALL_UNKNOWN_PRED 80 { "BR_MISPRED_DETAIL2_RETURN_ALL_UNKNOWN_PRED", {0x80068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only return type branches, branches with unknown path prediction"}, #define PME_MONT_BR_MISPRED_DETAIL2_RETURN_UNKNOWN_PATH_CORRECT_PRED 81 { "BR_MISPRED_DETAIL2_RETURN_UNKNOWN_PATH_CORRECT_PRED", {0x90068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only return type branches, branches with unknown path prediction and correctly predicted branch (outcome & target)"}, #define PME_MONT_BR_MISPRED_DETAIL2_RETURN_UNKNOWN_PATH_WRONG_PATH 82 { "BR_MISPRED_DETAIL2_RETURN_UNKNOWN_PATH_WRONG_PATH", {0xa0068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only return type branches, branches with unknown path prediction and wrong branch direction"}, #define PME_MONT_BR_PATH_PRED_ALL_MISPRED_NOTTAKEN 83 { "BR_PATH_PRED_ALL_MISPRED_NOTTAKEN", {0x54}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- All branch types, incorrectly predicted path and not taken branch"}, #define PME_MONT_BR_PATH_PRED_ALL_MISPRED_TAKEN 84 { "BR_PATH_PRED_ALL_MISPRED_TAKEN", {0x10054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- All branch types, incorrectly predicted path and taken branch"}, #define PME_MONT_BR_PATH_PRED_ALL_OKPRED_NOTTAKEN 85 { "BR_PATH_PRED_ALL_OKPRED_NOTTAKEN", {0x20054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- All branch types, correctly predicted path and not taken branch"}, #define PME_MONT_BR_PATH_PRED_ALL_OKPRED_TAKEN 86 { "BR_PATH_PRED_ALL_OKPRED_TAKEN", {0x30054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- All branch types, correctly predicted path and taken branch"}, #define PME_MONT_BR_PATH_PRED_IPREL_MISPRED_NOTTAKEN 87 { "BR_PATH_PRED_IPREL_MISPRED_NOTTAKEN", {0x40054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only 
IP relative branches, incorrectly predicted path and not taken branch"}, #define PME_MONT_BR_PATH_PRED_IPREL_MISPRED_TAKEN 88 { "BR_PATH_PRED_IPREL_MISPRED_TAKEN", {0x50054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only IP relative branches, incorrectly predicted path and taken branch"}, #define PME_MONT_BR_PATH_PRED_IPREL_OKPRED_NOTTAKEN 89 { "BR_PATH_PRED_IPREL_OKPRED_NOTTAKEN", {0x60054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only IP relative branches, correctly predicted path and not taken branch"}, #define PME_MONT_BR_PATH_PRED_IPREL_OKPRED_TAKEN 90 { "BR_PATH_PRED_IPREL_OKPRED_TAKEN", {0x70054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only IP relative branches, correctly predicted path and taken branch"}, #define PME_MONT_BR_PATH_PRED_NRETIND_MISPRED_NOTTAKEN 91 { "BR_PATH_PRED_NRETIND_MISPRED_NOTTAKEN", {0xc0054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only non-return indirect branches, incorrectly predicted path and not taken branch"}, #define PME_MONT_BR_PATH_PRED_NRETIND_MISPRED_TAKEN 92 { "BR_PATH_PRED_NRETIND_MISPRED_TAKEN", {0xd0054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only non-return indirect branches, incorrectly predicted path and taken branch"}, #define PME_MONT_BR_PATH_PRED_NRETIND_OKPRED_NOTTAKEN 93 { "BR_PATH_PRED_NRETIND_OKPRED_NOTTAKEN", {0xe0054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only non-return indirect branches, correctly predicted path and not taken branch"}, #define PME_MONT_BR_PATH_PRED_NRETIND_OKPRED_TAKEN 94 { "BR_PATH_PRED_NRETIND_OKPRED_TAKEN", {0xf0054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only non-return indirect branches, correctly predicted path and taken branch"}, #define PME_MONT_BR_PATH_PRED_RETURN_MISPRED_NOTTAKEN 95 { "BR_PATH_PRED_RETURN_MISPRED_NOTTAKEN", {0x80054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only return type branches, 
incorrectly predicted path and not taken branch"}, #define PME_MONT_BR_PATH_PRED_RETURN_MISPRED_TAKEN 96 { "BR_PATH_PRED_RETURN_MISPRED_TAKEN", {0x90054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only return type branches, incorrectly predicted path and taken branch"}, #define PME_MONT_BR_PATH_PRED_RETURN_OKPRED_NOTTAKEN 97 { "BR_PATH_PRED_RETURN_OKPRED_NOTTAKEN", {0xa0054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only return type branches, correctly predicted path and not taken branch"}, #define PME_MONT_BR_PATH_PRED_RETURN_OKPRED_TAKEN 98 { "BR_PATH_PRED_RETURN_OKPRED_TAKEN", {0xb0054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only return type branches, correctly predicted path and taken branch"}, #define PME_MONT_BR_PATH_PRED2_ALL_UNKNOWNPRED_NOTTAKEN 99 { "BR_PATH_PRED2_ALL_UNKNOWNPRED_NOTTAKEN", {0x6a}, 0xfff0, 2, {0xffff0003}, "FE Branch Path Prediction Detail (Unknown pred component) -- All branch types, unknown predicted path and not taken branch (which impacts OKPRED_NOTTAKEN)"}, #define PME_MONT_BR_PATH_PRED2_ALL_UNKNOWNPRED_TAKEN 100 { "BR_PATH_PRED2_ALL_UNKNOWNPRED_TAKEN", {0x1006a}, 0xfff0, 2, {0xffff0003}, "FE Branch Path Prediction Detail (Unknown pred component) -- All branch types, unknown predicted path and taken branch (which impacts MISPRED_TAKEN)"}, #define PME_MONT_BR_PATH_PRED2_IPREL_UNKNOWNPRED_NOTTAKEN 101 { "BR_PATH_PRED2_IPREL_UNKNOWNPRED_NOTTAKEN", {0x4006a}, 0xfff0, 2, {0xffff0003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only IP relative branches, unknown predicted path and not taken branch (which impacts OKPRED_NOTTAKEN)"}, #define PME_MONT_BR_PATH_PRED2_IPREL_UNKNOWNPRED_TAKEN 102 { "BR_PATH_PRED2_IPREL_UNKNOWNPRED_TAKEN", {0x5006a}, 0xfff0, 2, {0xffff0003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only IP relative branches, unknown predicted path and taken branch (which impacts MISPRED_TAKEN)"}, #define 
PME_MONT_BR_PATH_PRED2_NRETIND_UNKNOWNPRED_NOTTAKEN 103 { "BR_PATH_PRED2_NRETIND_UNKNOWNPRED_NOTTAKEN", {0xc006a}, 0xfff0, 2, {0xffff0003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only non-return indirect branches, unknown predicted path and not taken branch (which impacts OKPRED_NOTTAKEN)"}, #define PME_MONT_BR_PATH_PRED2_NRETIND_UNKNOWNPRED_TAKEN 104 { "BR_PATH_PRED2_NRETIND_UNKNOWNPRED_TAKEN", {0xd006a}, 0xfff0, 2, {0xffff0003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only non-return indirect branches, unknown predicted path and taken branch (which impacts MISPRED_TAKEN)"}, #define PME_MONT_BR_PATH_PRED2_RETURN_UNKNOWNPRED_NOTTAKEN 105 { "BR_PATH_PRED2_RETURN_UNKNOWNPRED_NOTTAKEN", {0x8006a}, 0xfff0, 2, {0xffff0003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only return type branches, unknown predicted path and not taken branch (which impacts OKPRED_NOTTAKEN)"}, #define PME_MONT_BR_PATH_PRED2_RETURN_UNKNOWNPRED_TAKEN 106 { "BR_PATH_PRED2_RETURN_UNKNOWNPRED_TAKEN", {0x9006a}, 0xfff0, 2, {0xffff0003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only return type branches, unknown predicted path and taken branch (which impacts MISPRED_TAKEN)"}, #define PME_MONT_BUS_ALL_ANY 107 { "BUS_ALL_ANY", {0x31887}, 0x03f0, 1, {0xffff0000}, "Bus Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_ALL_EITHER 108 { "BUS_ALL_EITHER", {0x1887}, 0x03f0, 1, {0xffff0000}, "Bus Transactions -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_ALL_IO 109 { "BUS_ALL_IO", {0x11887}, 0x03f0, 1, {0xffff0000}, "Bus Transactions -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_ALL_SELF 110 { "BUS_ALL_SELF", {0x21887}, 0x03f0, 1, {0xffff0000}, "Bus Transactions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_B2B_DATA_CYCLES_ANY 111 { "BUS_B2B_DATA_CYCLES_ANY", {0x31093}, 0x03f0, 1, {0xffff0000}, "Back to Back Data Cycles on 
the Bus -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_B2B_DATA_CYCLES_EITHER 112 { "BUS_B2B_DATA_CYCLES_EITHER", {0x1093}, 0x03f0, 1, {0xffff0000}, "Back to Back Data Cycles on the Bus -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_B2B_DATA_CYCLES_IO 113 { "BUS_B2B_DATA_CYCLES_IO", {0x11093}, 0x03f0, 1, {0xffff0000}, "Back to Back Data Cycles on the Bus -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_B2B_DATA_CYCLES_SELF 114 { "BUS_B2B_DATA_CYCLES_SELF", {0x21093}, 0x03f0, 1, {0xffff0000}, "Back to Back Data Cycles on the Bus -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_DATA_CYCLE_ANY 115 { "BUS_DATA_CYCLE_ANY", {0x31088}, 0x03f0, 1, {0xffff0000}, "Valid Data Cycle on the Bus -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_DATA_CYCLE_EITHER 116 { "BUS_DATA_CYCLE_EITHER", {0x1088}, 0x03f0, 1, {0xffff0000}, "Valid Data Cycle on the Bus -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_DATA_CYCLE_IO 117 { "BUS_DATA_CYCLE_IO", {0x11088}, 0x03f0, 1, {0xffff0000}, "Valid Data Cycle on the Bus -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_DATA_CYCLE_SELF 118 { "BUS_DATA_CYCLE_SELF", {0x21088}, 0x03f0, 1, {0xffff0000}, "Valid Data Cycle on the Bus -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_HITM_ANY 119 { "BUS_HITM_ANY", {0x31884}, 0x03f0, 1, {0xffff0000}, "Bus Hit Modified Line Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_HITM_EITHER 120 { "BUS_HITM_EITHER", {0x1884}, 0x03f0, 1, {0xffff0000}, "Bus Hit Modified Line Transactions -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_HITM_IO 121 { "BUS_HITM_IO", {0x11884}, 0x03f0, 1, {0xffff0000}, "Bus Hit Modified Line Transactions -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_HITM_SELF 122 { "BUS_HITM_SELF", {0x21884}, 0x03f0, 1, {0xffff0000}, "Bus Hit Modified Line 
Transactions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_IO_ANY 123 { "BUS_IO_ANY", {0x31890}, 0x03f0, 1, {0xffff0000}, "IA-32 Compatible IO Bus Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_IO_EITHER 124 { "BUS_IO_EITHER", {0x1890}, 0x03f0, 1, {0xffff0000}, "IA-32 Compatible IO Bus Transactions -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_IO_IO 125 { "BUS_IO_IO", {0x11890}, 0x03f0, 1, {0xffff0000}, "IA-32 Compatible IO Bus Transactions -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_IO_SELF 126 { "BUS_IO_SELF", {0x21890}, 0x03f0, 1, {0xffff0000}, "IA-32 Compatible IO Bus Transactions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_MEMORY_ALL_ANY 127 { "BUS_MEMORY_ALL_ANY", {0xf188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- All bus transactions from CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_MEMORY_ALL_EITHER 128 { "BUS_MEMORY_ALL_EITHER", {0xc188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- All bus transactions from non-CPU priority agents"}, #define PME_MONT_BUS_MEMORY_ALL_IO 129 { "BUS_MEMORY_ALL_IO", {0xd188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- All bus transactions from 'this' local processor"}, #define PME_MONT_BUS_MEMORY_ALL_SELF 130 { "BUS_MEMORY_ALL_SELF", {0xe188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- All bus transactions from CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_MEMORY_EQ_128BYTE_ANY 131 { "BUS_MEMORY_EQ_128BYTE_ANY", {0x7188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- number of less than full cache line transactions (BRP, BWP, BIL) from either local processor"}, #define PME_MONT_BUS_MEMORY_EQ_128BYTE_EITHER 132 { "BUS_MEMORY_EQ_128BYTE_EITHER", {0x4188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- number of full cache line transactions (BRL, BRIL, BWL, BRC, BCR, BCCL) from non-CPU priority agents"}, #define 
PME_MONT_BUS_MEMORY_EQ_128BYTE_IO 133 { "BUS_MEMORY_EQ_128BYTE_IO", {0x5188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- number of full cache line transactions (BRL, BRIL, BWL, BRC, BCR, BCCL) from 'this' processor"}, #define PME_MONT_BUS_MEMORY_EQ_128BYTE_SELF 134 { "BUS_MEMORY_EQ_128BYTE_SELF", {0x6188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- number of full cache line transactions (BRL, BRIL, BWL, BRC, BCR, BCCL) from CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_MEMORY_LT_128BYTE_ANY 135 { "BUS_MEMORY_LT_128BYTE_ANY", {0xb188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- All bus transactions from either local processor"}, #define PME_MONT_BUS_MEMORY_LT_128BYTE_EITHER 136 { "BUS_MEMORY_LT_128BYTE_EITHER", {0x8188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- number of less than full cache line transactions (BRP, BWP, BIL) from non-CPU priority agents"}, #define PME_MONT_BUS_MEMORY_LT_128BYTE_IO 137 { "BUS_MEMORY_LT_128BYTE_IO", {0x9188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- number of less than full cache line transactions (BRP, BWP, BIL) from 'this' processor"}, #define PME_MONT_BUS_MEMORY_LT_128BYTE_SELF 138 { "BUS_MEMORY_LT_128BYTE_SELF", {0xa188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- number of less than full cache line transactions (BRP, BWP, BIL) CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_MEM_READ_ALL_ANY 139 { "BUS_MEM_READ_ALL_ANY", {0xf188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- All memory read transactions from CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_MEM_READ_ALL_EITHER 140 { "BUS_MEM_READ_ALL_EITHER", {0xc188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- All memory read transactions from either local processor"}, #define PME_MONT_BUS_MEM_READ_ALL_IO 141 { "BUS_MEM_READ_ALL_IO", {0xd188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory 
RD, RD Invalidate, and BRIL -- All memory read transactions from non-CPU priority agents"}, #define PME_MONT_BUS_MEM_READ_ALL_SELF 142 { "BUS_MEM_READ_ALL_SELF", {0xe188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- All memory read transactions from local processor"}, #define PME_MONT_BUS_MEM_READ_BIL_ANY 143 { "BUS_MEM_READ_BIL_ANY", {0x3188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of BIL 0-byte memory read invalidate transactions from CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_MEM_READ_BIL_EITHER 144 { "BUS_MEM_READ_BIL_EITHER", {0x188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of BIL 0-byte memory read invalidate transactions from either local processor"}, #define PME_MONT_BUS_MEM_READ_BIL_IO 145 { "BUS_MEM_READ_BIL_IO", {0x1188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of BIL 0-byte memory read invalidate transactions from non-CPU priority agents"}, #define PME_MONT_BUS_MEM_READ_BIL_SELF 146 { "BUS_MEM_READ_BIL_SELF", {0x2188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of BIL 0-byte memory read invalidate transactions from local processor"}, #define PME_MONT_BUS_MEM_READ_BRIL_ANY 147 { "BUS_MEM_READ_BRIL_ANY", {0xb188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read invalidate transactions from CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_MEM_READ_BRIL_EITHER 148 { "BUS_MEM_READ_BRIL_EITHER", {0x8188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read invalidate transactions from either local processor"}, #define PME_MONT_BUS_MEM_READ_BRIL_IO 149 { "BUS_MEM_READ_BRIL_IO", {0x9188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD 
Invalidate, and BRIL -- Number of full cache line memory read invalidate transactions from non-CPU priority agents"}, #define PME_MONT_BUS_MEM_READ_BRIL_SELF 150 { "BUS_MEM_READ_BRIL_SELF", {0xa188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read invalidate transactions from local processor"}, #define PME_MONT_BUS_MEM_READ_BRL_ANY 151 { "BUS_MEM_READ_BRL_ANY", {0x7188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read transactions from CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_MEM_READ_BRL_EITHER 152 { "BUS_MEM_READ_BRL_EITHER", {0x4188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read transactions from either local processor"}, #define PME_MONT_BUS_MEM_READ_BRL_IO 153 { "BUS_MEM_READ_BRL_IO", {0x5188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read transactions from non-CPU priority agents"}, #define PME_MONT_BUS_MEM_READ_BRL_SELF 154 { "BUS_MEM_READ_BRL_SELF", {0x6188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read transactions from local processor"}, #define PME_MONT_BUS_RD_DATA_ANY 155 { "BUS_RD_DATA_ANY", {0x3188c}, 0x03f0, 1, {0xffff0000}, "Bus Read Data Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_RD_DATA_EITHER 156 { "BUS_RD_DATA_EITHER", {0x188c}, 0x03f0, 1, {0xffff0000}, "Bus Read Data Transactions -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_RD_DATA_IO 157 { "BUS_RD_DATA_IO", {0x1188c}, 0x03f0, 1, {0xffff0000}, "Bus Read Data Transactions -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_RD_DATA_SELF 158 { "BUS_RD_DATA_SELF", {0x2188c}, 0x03f0, 1, {0xffff0000}, "Bus Read Data 
Transactions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_RD_HIT_ANY 159 { "BUS_RD_HIT_ANY", {0x31880}, 0x03f0, 1, {0xffff0000}, "Bus Read Hit Clean Non-local Cache Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_RD_HIT_EITHER 160 { "BUS_RD_HIT_EITHER", {0x1880}, 0x03f0, 1, {0xffff0000}, "Bus Read Hit Clean Non-local Cache Transactions -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_RD_HIT_IO 161 { "BUS_RD_HIT_IO", {0x11880}, 0x03f0, 1, {0xffff0000}, "Bus Read Hit Clean Non-local Cache Transactions -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_RD_HIT_SELF 162 { "BUS_RD_HIT_SELF", {0x21880}, 0x03f0, 1, {0xffff0000}, "Bus Read Hit Clean Non-local Cache Transactions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_RD_HITM_ANY 163 { "BUS_RD_HITM_ANY", {0x31881}, 0x03f0, 1, {0xffff0000}, "Bus Read Hit Modified Non-local Cache Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_RD_HITM_EITHER 164 { "BUS_RD_HITM_EITHER", {0x1881}, 0x03f0, 1, {0xffff0000}, "Bus Read Hit Modified Non-local Cache Transactions -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_RD_HITM_IO 165 { "BUS_RD_HITM_IO", {0x11881}, 0x03f0, 1, {0xffff0000}, "Bus Read Hit Modified Non-local Cache Transactions -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_RD_HITM_SELF 166 { "BUS_RD_HITM_SELF", {0x21881}, 0x03f0, 1, {0xffff0000}, "Bus Read Hit Modified Non-local Cache Transactions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_RD_INVAL_BST_HITM_ANY 167 { "BUS_RD_INVAL_BST_HITM_ANY", {0x31883}, 0x03f0, 1, {0xffff0000}, "Bus BRIL Transaction Results in HITM -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_RD_INVAL_BST_HITM_EITHER 168 { "BUS_RD_INVAL_BST_HITM_EITHER", {0x1883}, 0x03f0, 1, {0xffff0000}, "Bus BRIL Transaction Results in HITM -- transactions initiated by either cpu 
core"}, #define PME_MONT_BUS_RD_INVAL_BST_HITM_IO 169 { "BUS_RD_INVAL_BST_HITM_IO", {0x11883}, 0x03f0, 1, {0xffff0000}, "Bus BRIL Transaction Results in HITM -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_RD_INVAL_BST_HITM_SELF 170 { "BUS_RD_INVAL_BST_HITM_SELF", {0x21883}, 0x03f0, 1, {0xffff0000}, "Bus BRIL Transaction Results in HITM -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_RD_INVAL_HITM_ANY 171 { "BUS_RD_INVAL_HITM_ANY", {0x31882}, 0x03f0, 1, {0xffff0000}, "Bus BIL Transaction Results in HITM -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_RD_INVAL_HITM_EITHER 172 { "BUS_RD_INVAL_HITM_EITHER", {0x1882}, 0x03f0, 1, {0xffff0000}, "Bus BIL Transaction Results in HITM -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_RD_INVAL_HITM_IO 173 { "BUS_RD_INVAL_HITM_IO", {0x11882}, 0x03f0, 1, {0xffff0000}, "Bus BIL Transaction Results in HITM -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_RD_INVAL_HITM_SELF 174 { "BUS_RD_INVAL_HITM_SELF", {0x21882}, 0x03f0, 1, {0xffff0000}, "Bus BIL Transaction Results in HITM -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_RD_IO_ANY 175 { "BUS_RD_IO_ANY", {0x31891}, 0x03f0, 1, {0xffff0000}, "IA-32 Compatible IO Read Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_RD_IO_EITHER 176 { "BUS_RD_IO_EITHER", {0x1891}, 0x03f0, 1, {0xffff0000}, "IA-32 Compatible IO Read Transactions -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_RD_IO_IO 177 { "BUS_RD_IO_IO", {0x11891}, 0x03f0, 1, {0xffff0000}, "IA-32 Compatible IO Read Transactions -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_RD_IO_SELF 178 { "BUS_RD_IO_SELF", {0x21891}, 0x03f0, 1, {0xffff0000}, "IA-32 Compatible IO Read Transactions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_RD_PRTL_ANY 179 { "BUS_RD_PRTL_ANY", {0x3188d}, 0x03f0, 1, {0xffff0000}, "Bus 
Read Partial Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_RD_PRTL_EITHER 180 { "BUS_RD_PRTL_EITHER", {0x188d}, 0x03f0, 1, {0xffff0000}, "Bus Read Partial Transactions -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_RD_PRTL_IO 181 { "BUS_RD_PRTL_IO", {0x1188d}, 0x03f0, 1, {0xffff0000}, "Bus Read Partial Transactions -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_RD_PRTL_SELF 182 { "BUS_RD_PRTL_SELF", {0x2188d}, 0x03f0, 1, {0xffff0000}, "Bus Read Partial Transactions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_SNOOP_STALL_CYCLES_ANY 183 { "BUS_SNOOP_STALL_CYCLES_ANY", {0x3188f}, 0x03f0, 1, {0xffff0000}, "Bus Snoop Stall Cycles (from any agent) -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_SNOOP_STALL_CYCLES_EITHER 184 { "BUS_SNOOP_STALL_CYCLES_EITHER", {0x188f}, 0x03f0, 1, {0xffff0000}, "Bus Snoop Stall Cycles (from any agent) -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_SNOOP_STALL_CYCLES_SELF 185 { "BUS_SNOOP_STALL_CYCLES_SELF", {0x2188f}, 0x03f0, 1, {0xffff0000}, "Bus Snoop Stall Cycles (from any agent) -- local processor"}, #define PME_MONT_BUS_WR_WB_ALL_ANY 186 { "BUS_WR_WB_ALL_ANY", {0xf1892}, 0x03f0, 1, {0xffff0000}, "Bus Write Back Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_WR_WB_ALL_IO 187 { "BUS_WR_WB_ALL_IO", {0xd1892}, 0x03f0, 1, {0xffff0000}, "Bus Write Back Transactions -- non-CPU priority agents"}, #define PME_MONT_BUS_WR_WB_ALL_SELF 188 { "BUS_WR_WB_ALL_SELF", {0xe1892}, 0x03f0, 1, {0xffff0000}, "Bus Write Back Transactions -- 'this' processor"}, #define PME_MONT_BUS_WR_WB_CCASTOUT_ANY 189 { "BUS_WR_WB_CCASTOUT_ANY", {0xb1892}, 0x03f0, 1, {0xffff0000}, "Bus Write Back Transactions -- CPU or non-CPU (all transactions)/Only 0-byte transactions with write back attribute (clean cast outs) will be counted"}, #define PME_MONT_BUS_WR_WB_CCASTOUT_SELF 190 { "BUS_WR_WB_CCASTOUT_SELF", 
{0xa1892}, 0x03f0, 1, {0xffff0000}, "Bus Write Back Transactions -- 'this' processor/Only 0-byte transactions with write back attribute (clean cast outs) will be counted"}, #define PME_MONT_BUS_WR_WB_EQ_128BYTE_ANY 191 { "BUS_WR_WB_EQ_128BYTE_ANY", {0x71892}, 0x03f0, 1, {0xffff0000}, "Bus Write Back Transactions -- CPU or non-CPU (all transactions)./Only cache line transactions with write back or write coalesce attributes will be counted."}, #define PME_MONT_BUS_WR_WB_EQ_128BYTE_IO 192 { "BUS_WR_WB_EQ_128BYTE_IO", {0x51892}, 0x03f0, 1, {0xffff0000}, "Bus Write Back Transactions -- non-CPU priority agents/Only cache line transactions with write back or write coalesce attributes will be counted."}, #define PME_MONT_BUS_WR_WB_EQ_128BYTE_SELF 193 { "BUS_WR_WB_EQ_128BYTE_SELF", {0x61892}, 0x03f0, 1, {0xffff0000}, "Bus Write Back Transactions -- 'this' processor/Only cache line transactions with write back or write coalesce attributes will be counted."}, #define PME_MONT_CPU_CPL_CHANGES_ALL 194 { "CPU_CPL_CHANGES_ALL", {0xf0013}, 0xfff0, 1, {0xffff0000}, "Privilege Level Changes -- All changes in cpl counted"}, #define PME_MONT_CPU_CPL_CHANGES_LVL0 195 { "CPU_CPL_CHANGES_LVL0", {0x10013}, 0xfff0, 1, {0xffff0000}, "Privilege Level Changes -- All changes to/from privilege level0 are counted"}, #define PME_MONT_CPU_CPL_CHANGES_LVL1 196 { "CPU_CPL_CHANGES_LVL1", {0x20013}, 0xfff0, 1, {0xffff0000}, "Privilege Level Changes -- All changes to/from privilege level1 are counted"}, #define PME_MONT_CPU_CPL_CHANGES_LVL2 197 { "CPU_CPL_CHANGES_LVL2", {0x40013}, 0xfff0, 1, {0xffff0000}, "Privilege Level Changes -- All changes to/from privilege level2 are counted"}, #define PME_MONT_CPU_CPL_CHANGES_LVL3 198 { "CPU_CPL_CHANGES_LVL3", {0x80013}, 0xfff0, 1, {0xffff0000}, "Privilege Level Changes -- All changes to/from privilege level3 are counted"}, #define PME_MONT_CPU_OP_CYCLES_ALL 199 { "CPU_OP_CYCLES_ALL", {0x1012}, 0xfff0, 1, {0xffff0000}, "CPU Operating Cycles -- All CPU cycles 
counted"}, #define PME_MONT_CPU_OP_CYCLES_QUAL 200 { "CPU_OP_CYCLES_QUAL", {0x11012}, 0xfff0, 1, {0xffff0003}, "CPU Operating Cycles -- Qualified cycles only"}, #define PME_MONT_CPU_OP_CYCLES_HALTED 201 { "CPU_OP_CYCLES_HALTED", {0x1018}, 0x0400, 7, {0xffff0000}, "CPU Operating Cycles Halted"}, #define PME_MONT_DATA_DEBUG_REGISTER_FAULT 202 { "DATA_DEBUG_REGISTER_FAULT", {0x52}, 0xfff0, 1, {0xffff0000}, "Fault Due to Data Debug Reg. Match to Load/Store Instruction"}, #define PME_MONT_DATA_DEBUG_REGISTER_MATCHES 203 { "DATA_DEBUG_REGISTER_MATCHES", {0xc6}, 0xfff0, 1, {0xffff0007}, "Data Debug Register Matches Data Address of Memory Reference."}, #define PME_MONT_DATA_EAR_ALAT 204 { "DATA_EAR_ALAT", {0xec8}, 0xfff0, 1, {0xffff0007}, "Data EAR ALAT"}, #define PME_MONT_DATA_EAR_CACHE_LAT1024 205 { "DATA_EAR_CACHE_LAT1024", {0x80dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 1024 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT128 206 { "DATA_EAR_CACHE_LAT128", {0x50dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 128 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT16 207 { "DATA_EAR_CACHE_LAT16", {0x20dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 16 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT2048 208 { "DATA_EAR_CACHE_LAT2048", {0x90dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 2048 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT256 209 { "DATA_EAR_CACHE_LAT256", {0x60dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 256 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT32 210 { "DATA_EAR_CACHE_LAT32", {0x30dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 32 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT4 211 { "DATA_EAR_CACHE_LAT4", {0xdc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 4 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT4096 212 { "DATA_EAR_CACHE_LAT4096", {0xa0dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 4096 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT512 213 { "DATA_EAR_CACHE_LAT512", {0x70dc8}, 0xfff0, 1, {0xffff0007}, 
"Data EAR Cache -- >= 512 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT64 214 { "DATA_EAR_CACHE_LAT64", {0x40dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 64 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT8 215 { "DATA_EAR_CACHE_LAT8", {0x10dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 8 Cycles"}, #define PME_MONT_DATA_EAR_EVENTS 216 { "DATA_EAR_EVENTS", {0x8c8}, 0xfff0, 1, {0xffff0007}, "L1 Data Cache EAR Events"}, #define PME_MONT_DATA_EAR_TLB_ALL 217 { "DATA_EAR_TLB_ALL", {0xe0cc8}, 0xfff0, 1, {0xffff0007}, "Data EAR TLB -- All L1 DTLB Misses"}, #define PME_MONT_DATA_EAR_TLB_FAULT 218 { "DATA_EAR_TLB_FAULT", {0x80cc8}, 0xfff0, 1, {0xffff0007}, "Data EAR TLB -- DTLB Misses which produce a software fault"}, #define PME_MONT_DATA_EAR_TLB_L2DTLB 219 { "DATA_EAR_TLB_L2DTLB", {0x20cc8}, 0xfff0, 1, {0xffff0007}, "Data EAR TLB -- L1 DTLB Misses which hit L2 DTLB"}, #define PME_MONT_DATA_EAR_TLB_L2DTLB_OR_FAULT 220 { "DATA_EAR_TLB_L2DTLB_OR_FAULT", {0xa0cc8}, 0xfff0, 1, {0xffff0007}, "Data EAR TLB -- L1 DTLB Misses which hit L2 DTLB or produce a software fault"}, #define PME_MONT_DATA_EAR_TLB_L2DTLB_OR_VHPT 221 { "DATA_EAR_TLB_L2DTLB_OR_VHPT", {0x60cc8}, 0xfff0, 1, {0xffff0007}, "Data EAR TLB -- L1 DTLB Misses which hit L2 DTLB or VHPT"}, #define PME_MONT_DATA_EAR_TLB_VHPT 222 { "DATA_EAR_TLB_VHPT", {0x40cc8}, 0xfff0, 1, {0xffff0007}, "Data EAR TLB -- L1 DTLB Misses which hit VHPT"}, #define PME_MONT_DATA_EAR_TLB_VHPT_OR_FAULT 223 { "DATA_EAR_TLB_VHPT_OR_FAULT", {0xc0cc8}, 0xfff0, 1, {0xffff0007}, "Data EAR TLB -- L1 DTLB Misses which hit VHPT or produce a software fault"}, #define PME_MONT_DATA_REFERENCES_SET0 224 { "DATA_REFERENCES_SET0", {0xc3}, 0xfff0, 4, {0x5010007}, "Data Memory References Issued to Memory Pipeline"}, #define PME_MONT_DATA_REFERENCES_SET1 225 { "DATA_REFERENCES_SET1", {0xc5}, 0xfff0, 4, {0x5110007}, "Data Memory References Issued to Memory Pipeline"}, #define PME_MONT_DISP_STALLED 226 { "DISP_STALLED", {0x49}, 0xfff0, 1, {0xffff0000}, 
"Number of Cycles Dispersal Stalled"}, #define PME_MONT_DTLB_INSERTS_HPW 227 { "DTLB_INSERTS_HPW", {0x8c9}, 0xfff0, 4, {0xffff0000}, "Hardware Page Walker Installs to DTLB"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_ALL_ALL_PRED 228 { "ENCBR_MISPRED_DETAIL_ALL_ALL_PRED", {0x63}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- All encoded branches regardless of prediction result"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_ALL_CORRECT_PRED 229 { "ENCBR_MISPRED_DETAIL_ALL_CORRECT_PRED", {0x10063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- All encoded branches, correctly predicted branches (outcome and target)"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_ALL_WRONG_PATH 230 { "ENCBR_MISPRED_DETAIL_ALL_WRONG_PATH", {0x20063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- All encoded branches, mispredicted branches due to wrong branch direction"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_ALL_WRONG_TARGET 231 { "ENCBR_MISPRED_DETAIL_ALL_WRONG_TARGET", {0x30063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- All encoded branches, mispredicted branches due to wrong target for taken branches"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_ALL2_ALL_PRED 232 { "ENCBR_MISPRED_DETAIL_ALL2_ALL_PRED", {0xc0063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- Only non-return indirect branches, regardless of prediction result"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_ALL2_CORRECT_PRED 233 { "ENCBR_MISPRED_DETAIL_ALL2_CORRECT_PRED", {0xd0063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- Only non-return indirect branches, correctly predicted branches (outcome and target)"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_ALL2_WRONG_PATH 234 { "ENCBR_MISPRED_DETAIL_ALL2_WRONG_PATH", {0xe0063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- Only non-return indirect branches, mispredicted branches due to wrong branch direction"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_ALL2_WRONG_TARGET 235 { 
"ENCBR_MISPRED_DETAIL_ALL2_WRONG_TARGET", {0xf0063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- Only non-return indirect branches, mispredicted branches due to wrong target for taken branches"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_OVERSUB_ALL_PRED 236 { "ENCBR_MISPRED_DETAIL_OVERSUB_ALL_PRED", {0x80063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- Only return type branches, regardless of prediction result"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_OVERSUB_CORRECT_PRED 237 { "ENCBR_MISPRED_DETAIL_OVERSUB_CORRECT_PRED", {0x90063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- Only return type branches, correctly predicted branches (outcome and target)"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_OVERSUB_WRONG_PATH 238 { "ENCBR_MISPRED_DETAIL_OVERSUB_WRONG_PATH", {0xa0063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- Only return type branches, mispredicted branches due to wrong branch direction"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_OVERSUB_WRONG_TARGET 239 { "ENCBR_MISPRED_DETAIL_OVERSUB_WRONG_TARGET", {0xb0063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- Only return type branches, mispredicted branches due to wrong target for taken branches"}, #define PME_MONT_ER_BKSNP_ME_ACCEPTED 240 { "ER_BKSNP_ME_ACCEPTED", {0x10bb}, 0x03f0, 2, {0xffff0000}, "Backsnoop Me Accepted"}, #define PME_MONT_ER_BRQ_LIVE_REQ_HI 241 { "ER_BRQ_LIVE_REQ_HI", {0x10b8}, 0x03f0, 2, {0xffff0000}, "BRQ Live Requests (upper 2 bits)"}, #define PME_MONT_ER_BRQ_LIVE_REQ_LO 242 { "ER_BRQ_LIVE_REQ_LO", {0x10b9}, 0x03f0, 7, {0xffff0000}, "BRQ Live Requests (lower 3 bits)"}, #define PME_MONT_ER_BRQ_REQ_INSERTED 243 { "ER_BRQ_REQ_INSERTED", {0x8ba}, 0x03f0, 1, {0xffff0000}, "BRQ Requests Inserted"}, #define PME_MONT_ER_MEM_READ_OUT_HI 244 { "ER_MEM_READ_OUT_HI", {0x8b4}, 0x03f0, 2, {0xffff0000}, "Outstanding Memory Read Transactions (upper 2 bits)"}, #define PME_MONT_ER_MEM_READ_OUT_LO 245 { 
"ER_MEM_READ_OUT_LO", {0x8b5}, 0x03f0, 7, {0xffff0000}, "Outstanding Memory Read Transactions (lower 3 bits)"}, #define PME_MONT_ER_REJECT_ALL_L1D_REQ 246 { "ER_REJECT_ALL_L1D_REQ", {0x10bd}, 0x03f0, 1, {0xffff0000}, "Reject All L1D Requests"}, #define PME_MONT_ER_REJECT_ALL_L1I_REQ 247 { "ER_REJECT_ALL_L1I_REQ", {0x10be}, 0x03f0, 1, {0xffff0000}, "Reject All L1I Requests"}, #define PME_MONT_ER_REJECT_ALL_L1_REQ 248 { "ER_REJECT_ALL_L1_REQ", {0x10bc}, 0x03f0, 1, {0xffff0000}, "Reject All L1 Requests"}, #define PME_MONT_ER_SNOOPQ_REQ_HI 249 { "ER_SNOOPQ_REQ_HI", {0x10b6}, 0x03f0, 2, {0xffff0000}, "Outstanding Snoops (upper bit)"}, #define PME_MONT_ER_SNOOPQ_REQ_LO 250 { "ER_SNOOPQ_REQ_LO", {0x10b7}, 0x03f0, 7, {0xffff0000}, "Outstanding Snoops (lower 3 bits)"}, #define PME_MONT_ETB_EVENT 251 { "ETB_EVENT", {0x111}, 0xfff0, 1, {0xffff0003}, "Execution Trace Buffer Event Captured"}, #define PME_MONT_FE_BUBBLE_ALL 252 { "FE_BUBBLE_ALL", {0x71}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- count regardless of cause"}, #define PME_MONT_FE_BUBBLE_ALLBUT_FEFLUSH_BUBBLE 253 { "FE_BUBBLE_ALLBUT_FEFLUSH_BUBBLE", {0xb0071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- ALL except FEFLUSH and BUBBLE"}, #define PME_MONT_FE_BUBBLE_ALLBUT_IBFULL 254 { "FE_BUBBLE_ALLBUT_IBFULL", {0xc0071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- ALL except IBFULL"}, #define PME_MONT_FE_BUBBLE_BRANCH 255 { "FE_BUBBLE_BRANCH", {0x90071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- only if caused by any of 4 branch recirculates"}, #define PME_MONT_FE_BUBBLE_BUBBLE 256 { "FE_BUBBLE_BUBBLE", {0xd0071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- only if caused by branch bubble stall"}, #define PME_MONT_FE_BUBBLE_FEFLUSH 257 { "FE_BUBBLE_FEFLUSH", {0x10071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- only if caused by a front-end flush"}, #define PME_MONT_FE_BUBBLE_FILL_RECIRC 258 { "FE_BUBBLE_FILL_RECIRC", {0x80071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- only if
caused by a recirculate for a cache line fill operation"}, #define PME_MONT_FE_BUBBLE_GROUP1 259 { "FE_BUBBLE_GROUP1", {0x30071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- BUBBLE or BRANCH"}, #define PME_MONT_FE_BUBBLE_GROUP2 260 { "FE_BUBBLE_GROUP2", {0x40071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- IMISS or TLBMISS"}, #define PME_MONT_FE_BUBBLE_GROUP3 261 { "FE_BUBBLE_GROUP3", {0xa0071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- FILL_RECIRC or BRANCH"}, #define PME_MONT_FE_BUBBLE_IBFULL 262 { "FE_BUBBLE_IBFULL", {0x50071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- only if caused by instruction buffer full stall"}, #define PME_MONT_FE_BUBBLE_IMISS 263 { "FE_BUBBLE_IMISS", {0x60071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- only if caused by instruction cache miss stall"}, #define PME_MONT_FE_BUBBLE_TLBMISS 264 { "FE_BUBBLE_TLBMISS", {0x70071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- only if caused by TLB stall"}, #define PME_MONT_FE_LOST_BW_ALL 265 { "FE_LOST_BW_ALL", {0x70}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- count regardless of cause"}, #define PME_MONT_FE_LOST_BW_BI 266 { "FE_LOST_BW_BI", {0x90070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by branch initialization stall"}, #define PME_MONT_FE_LOST_BW_BRQ 267 { "FE_LOST_BW_BRQ", {0xa0070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by branch retirement queue stall"}, #define PME_MONT_FE_LOST_BW_BR_ILOCK 268 { "FE_LOST_BW_BR_ILOCK", {0xc0070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by branch interlock stall"}, #define PME_MONT_FE_LOST_BW_BUBBLE 269 { "FE_LOST_BW_BUBBLE", {0xd0070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by branch resteer bubble stall"}, #define PME_MONT_FE_LOST_BW_FEFLUSH 270 { "FE_LOST_BW_FEFLUSH", {0x10070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the 
Entrance to IB -- only if caused by a front-end flush"}, #define PME_MONT_FE_LOST_BW_FILL_RECIRC 271 { "FE_LOST_BW_FILL_RECIRC", {0x80070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by a recirculate for a cache line fill operation"}, #define PME_MONT_FE_LOST_BW_IBFULL 272 { "FE_LOST_BW_IBFULL", {0x50070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by instruction buffer full stall"}, #define PME_MONT_FE_LOST_BW_IMISS 273 { "FE_LOST_BW_IMISS", {0x60070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by instruction cache miss stall"}, #define PME_MONT_FE_LOST_BW_PLP 274 { "FE_LOST_BW_PLP", {0xb0070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by perfect loop prediction stall"}, #define PME_MONT_FE_LOST_BW_TLBMISS 275 { "FE_LOST_BW_TLBMISS", {0x70070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by TLB stall"}, #define PME_MONT_FE_LOST_BW_UNREACHED 276 { "FE_LOST_BW_UNREACHED", {0x40070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by unreachable bundle"}, #define PME_MONT_FP_FAILED_FCHKF 277 { "FP_FAILED_FCHKF", {0x6}, 0xfff0, 1, {0xffff0001}, "Failed fchkf"}, #define PME_MONT_FP_FALSE_SIRSTALL 278 { "FP_FALSE_SIRSTALL", {0x5}, 0xfff0, 1, {0xffff0001}, "SIR Stall Without a Trap"}, #define PME_MONT_FP_FLUSH_TO_ZERO_FTZ_POSS 279 { "FP_FLUSH_TO_ZERO_FTZ_POSS", {0x1000b}, 0xfff0, 2, {0xffff0001}, "FP Result Flushed to Zero -- "}, #define PME_MONT_FP_FLUSH_TO_ZERO_FTZ_REAL 280 { "FP_FLUSH_TO_ZERO_FTZ_REAL", {0xb}, 0xfff0, 2, {0xffff0001}, "FP Result Flushed to Zero -- Times FTZ"}, #define PME_MONT_FP_OPS_RETIRED 281 { "FP_OPS_RETIRED", {0x9}, 0xfff0, 6, {0xffff0001}, "Retired FP Operations"}, #define PME_MONT_FP_TRUE_SIRSTALL 282 { "FP_TRUE_SIRSTALL", {0x3}, 0xfff0, 1, {0xffff0001}, "SIR stall asserted and leads to a trap"}, #define 
PME_MONT_HPW_DATA_REFERENCES 283 { "HPW_DATA_REFERENCES", {0x2d}, 0xfff0, 4, {0xffff0000}, "Data Memory References to VHPT"}, #define PME_MONT_IA64_INST_RETIRED_THIS 284 { "IA64_INST_RETIRED_THIS", {0x8}, 0xfff0, 6, {0xffff0003}, "Retired IA-64 Instructions -- Retired IA-64 Instructions"}, #define PME_MONT_IA64_TAGGED_INST_RETIRED_IBRP0_PMC32_33 285 { "IA64_TAGGED_INST_RETIRED_IBRP0_PMC32_33", {0x8}, 0xfff0, 6, {0xffff0003}, "Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 0 and the opcode matcher pair PMC32 and PMC33."}, #define PME_MONT_IA64_TAGGED_INST_RETIRED_IBRP1_PMC34_35 286 { "IA64_TAGGED_INST_RETIRED_IBRP1_PMC34_35", {0x10008}, 0xfff0, 6, {0xffff0003}, "Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 1 and the opcode matcher pair PMC34 and PMC35."}, #define PME_MONT_IA64_TAGGED_INST_RETIRED_IBRP2_PMC32_33 287 { "IA64_TAGGED_INST_RETIRED_IBRP2_PMC32_33", {0x20008}, 0xfff0, 6, {0xffff0003}, "Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 2 and the opcode matcher pair PMC32 and PMC33."}, #define PME_MONT_IA64_TAGGED_INST_RETIRED_IBRP3_PMC34_35 288 { "IA64_TAGGED_INST_RETIRED_IBRP3_PMC34_35", {0x30008}, 0xfff0, 6, {0xffff0003}, "Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 3 and the opcode matcher pair PMC34 and PMC35."}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_ALL 289 { "IDEAL_BE_LOST_BW_DUE_TO_FE_ALL", {0x73}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- count regardless of cause"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_BI 290 { "IDEAL_BE_LOST_BW_DUE_TO_FE_BI", {0x90073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by branch initialization stall"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_BRQ 291 { "IDEAL_BE_LOST_BW_DUE_TO_FE_BRQ", {0xa0073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by branch retirement queue stall"}, 
#define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_BR_ILOCK 292 { "IDEAL_BE_LOST_BW_DUE_TO_FE_BR_ILOCK", {0xc0073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by branch interlock stall"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_BUBBLE 293 { "IDEAL_BE_LOST_BW_DUE_TO_FE_BUBBLE", {0xd0073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by branch resteer bubble stall"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_FEFLUSH 294 { "IDEAL_BE_LOST_BW_DUE_TO_FE_FEFLUSH", {0x10073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by a front-end flush"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_FILL_RECIRC 295 { "IDEAL_BE_LOST_BW_DUE_TO_FE_FILL_RECIRC", {0x80073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by a recirculate for a cache line fill operation"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_IBFULL 296 { "IDEAL_BE_LOST_BW_DUE_TO_FE_IBFULL", {0x50073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- (* meaningless for this event *)"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_IMISS 297 { "IDEAL_BE_LOST_BW_DUE_TO_FE_IMISS", {0x60073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by instruction cache miss stall"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_PLP 298 { "IDEAL_BE_LOST_BW_DUE_TO_FE_PLP", {0xb0073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by perfect loop prediction stall"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_TLBMISS 299 { "IDEAL_BE_LOST_BW_DUE_TO_FE_TLBMISS", {0x70073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by TLB stall"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_UNREACHED 300 { "IDEAL_BE_LOST_BW_DUE_TO_FE_UNREACHED", {0x40073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by unreachable bundle"}, #define PME_MONT_INST_CHKA_LDC_ALAT_ALL 301 { 
"INST_CHKA_LDC_ALAT_ALL", {0x30056}, 0xfff0, 2, {0xffff0007}, "Retired chk.a and ld.c Instructions -- both integer and floating point instructions"}, #define PME_MONT_INST_CHKA_LDC_ALAT_FP 302 { "INST_CHKA_LDC_ALAT_FP", {0x20056}, 0xfff0, 2, {0xffff0007}, "Retired chk.a and ld.c Instructions -- only floating point instructions"}, #define PME_MONT_INST_CHKA_LDC_ALAT_INT 303 { "INST_CHKA_LDC_ALAT_INT", {0x10056}, 0xfff0, 2, {0xffff0007}, "Retired chk.a and ld.c Instructions -- only integer instructions"}, #define PME_MONT_INST_DISPERSED 304 { "INST_DISPERSED", {0x4d}, 0xfff0, 6, {0xffff0001}, "Syllables Dispersed from REN to REG stage"}, #define PME_MONT_INST_FAILED_CHKA_LDC_ALAT_ALL 305 { "INST_FAILED_CHKA_LDC_ALAT_ALL", {0x30057}, 0xfff0, 1, {0xffff0007}, "Failed chk.a and ld.c Instructions -- both integer and floating point instructions"}, #define PME_MONT_INST_FAILED_CHKA_LDC_ALAT_FP 306 { "INST_FAILED_CHKA_LDC_ALAT_FP", {0x20057}, 0xfff0, 1, {0xffff0007}, "Failed chk.a and ld.c Instructions -- only floating point instructions"}, #define PME_MONT_INST_FAILED_CHKA_LDC_ALAT_INT 307 { "INST_FAILED_CHKA_LDC_ALAT_INT", {0x10057}, 0xfff0, 1, {0xffff0007}, "Failed chk.a and ld.c Instructions -- only integer instructions"}, #define PME_MONT_INST_FAILED_CHKS_RETIRED_ALL 308 { "INST_FAILED_CHKS_RETIRED_ALL", {0x30055}, 0xfff0, 1, {0xffff0000}, "Failed chk.s Instructions -- both integer and floating point instructions"}, #define PME_MONT_INST_FAILED_CHKS_RETIRED_FP 309 { "INST_FAILED_CHKS_RETIRED_FP", {0x20055}, 0xfff0, 1, {0xffff0000}, "Failed chk.s Instructions -- only floating point instructions"}, #define PME_MONT_INST_FAILED_CHKS_RETIRED_INT 310 { "INST_FAILED_CHKS_RETIRED_INT", {0x10055}, 0xfff0, 1, {0xffff0000}, "Failed chk.s Instructions -- only integer instructions"}, #define PME_MONT_ISB_BUNPAIRS_IN 311 { "ISB_BUNPAIRS_IN", {0x46}, 0xfff0, 1, {0xffff0001}, "Bundle Pairs Written from L2I into FE"}, #define PME_MONT_ITLB_MISSES_FETCH_ALL 312 { 
"ITLB_MISSES_FETCH_ALL", {0x30047}, 0xfff0, 1, {0xffff0001}, "ITLB Misses Demand Fetch -- All tlb misses will be counted. Note that this is not equal to sum of the L1ITLB and L2ITLB umasks because any access could be a miss in L1ITLB and L2ITLB."}, #define PME_MONT_ITLB_MISSES_FETCH_L1ITLB 313 { "ITLB_MISSES_FETCH_L1ITLB", {0x10047}, 0xfff0, 1, {0xffff0001}, "ITLB Misses Demand Fetch -- All misses in L1ITLB will be counted. Even if L1ITLB is not updated for an access (Uncacheable/nat page/not present page/faulting/some flushed), it will be counted here."}, #define PME_MONT_ITLB_MISSES_FETCH_L2ITLB 314 { "ITLB_MISSES_FETCH_L2ITLB", {0x20047}, 0xfff0, 1, {0xffff0001}, "ITLB Misses Demand Fetch -- All misses in L1ITLB which also missed in L2ITLB will be counted."}, #define PME_MONT_L1DTLB_TRANSFER 315 { "L1DTLB_TRANSFER", {0xc0}, 0xfff0, 1, {0x5010007}, "L1DTLB Misses That Hit in the L2DTLB for Accesses Counted in L1D_READS"}, #define PME_MONT_L1D_READS_SET0 316 { "L1D_READS_SET0", {0xc2}, 0xfff0, 2, {0x5010007}, "L1 Data Cache Reads"}, #define PME_MONT_L1D_READS_SET1 317 { "L1D_READS_SET1", {0xc4}, 0xfff0, 2, {0x5110007}, "L1 Data Cache Reads"}, #define PME_MONT_L1D_READ_MISSES_ALL 318 { "L1D_READ_MISSES_ALL", {0xc7}, 0xfff0, 2, {0x5110007}, "L1 Data Cache Read Misses -- all L1D read misses will be counted."}, #define PME_MONT_L1D_READ_MISSES_RSE_FILL 319 { "L1D_READ_MISSES_RSE_FILL", {0x100c7}, 0xfff0, 2, {0x5110007}, "L1 Data Cache Read Misses -- only L1D read misses caused by RSE fills will be counted"}, #define PME_MONT_L1ITLB_INSERTS_HPW 320 { "L1ITLB_INSERTS_HPW", {0x48}, 0xfff0, 1, {0xffff0001}, "L1ITLB Hardware Page Walker Inserts"}, #define PME_MONT_L1I_EAR_CACHE_LAT0 321 { "L1I_EAR_CACHE_LAT0", {0x400b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- > 0 Cycles (All L1 Misses)"}, #define PME_MONT_L1I_EAR_CACHE_LAT1024 322 { "L1I_EAR_CACHE_LAT1024", {0xc00b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- >= 1024 Cycles"}, #define PME_MONT_L1I_EAR_CACHE_LAT128
323 { "L1I_EAR_CACHE_LAT128", {0xf00b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- >= 128 Cycles"}, #define PME_MONT_L1I_EAR_CACHE_LAT16 324 { "L1I_EAR_CACHE_LAT16", {0xfc0b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- >= 16 Cycles"}, #define PME_MONT_L1I_EAR_CACHE_LAT256 325 { "L1I_EAR_CACHE_LAT256", {0xe00b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- >= 256 Cycles"}, #define PME_MONT_L1I_EAR_CACHE_LAT32 326 { "L1I_EAR_CACHE_LAT32", {0xf80b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- >= 32 Cycles"}, #define PME_MONT_L1I_EAR_CACHE_LAT4 327 { "L1I_EAR_CACHE_LAT4", {0xff0b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- >= 4 Cycles"}, #define PME_MONT_L1I_EAR_CACHE_LAT4096 328 { "L1I_EAR_CACHE_LAT4096", {0x800b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- >= 4096 Cycles"}, #define PME_MONT_L1I_EAR_CACHE_LAT8 329 { "L1I_EAR_CACHE_LAT8", {0xfe0b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- >= 8 Cycles"}, #define PME_MONT_L1I_EAR_CACHE_RAB 330 { "L1I_EAR_CACHE_RAB", {0xb43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- RAB HIT"}, #define PME_MONT_L1I_EAR_EVENTS 331 { "L1I_EAR_EVENTS", {0x843}, 0xfff0, 1, {0xffff0001}, "Instruction EAR Events"}, #define PME_MONT_L1I_EAR_TLB_ALL 332 { "L1I_EAR_TLB_ALL", {0x70a43}, 0xfff0, 1, {0xffff0001}, "L1I EAR TLB -- All L1 ITLB Misses"}, #define PME_MONT_L1I_EAR_TLB_FAULT 333 { "L1I_EAR_TLB_FAULT", {0x40a43}, 0xfff0, 1, {0xffff0001}, "L1I EAR TLB -- ITLB Misses which produced a fault"}, #define PME_MONT_L1I_EAR_TLB_L2TLB 334 { "L1I_EAR_TLB_L2TLB", {0x10a43}, 0xfff0, 1, {0xffff0001}, "L1I EAR TLB -- L1 ITLB Misses which hit L2 ITLB"}, #define PME_MONT_L1I_EAR_TLB_L2TLB_OR_FAULT 335 { "L1I_EAR_TLB_L2TLB_OR_FAULT", {0x50a43}, 0xfff0, 1, {0xffff0001}, "L1I EAR TLB -- L1 ITLB Misses which hit L2 ITLB or produce a software fault"}, #define PME_MONT_L1I_EAR_TLB_L2TLB_OR_VHPT 336 { "L1I_EAR_TLB_L2TLB_OR_VHPT", {0x30a43}, 0xfff0, 1, {0xffff0001}, "L1I EAR TLB -- L1 ITLB Misses which hit L2 ITLB or VHPT"}, #define 
PME_MONT_L1I_EAR_TLB_VHPT 337 { "L1I_EAR_TLB_VHPT", {0x20a43}, 0xfff0, 1, {0xffff0001}, "L1I EAR TLB -- L1 ITLB Misses which hit VHPT"}, #define PME_MONT_L1I_EAR_TLB_VHPT_OR_FAULT 338 { "L1I_EAR_TLB_VHPT_OR_FAULT", {0x60a43}, 0xfff0, 1, {0xffff0001}, "L1I EAR TLB -- L1 ITLB Misses which hit VHPT or produce a software fault"}, #define PME_MONT_L1I_FETCH_ISB_HIT 339 { "L1I_FETCH_ISB_HIT", {0x66}, 0xfff0, 1, {0xffff0001}, "\"Just-In-Time\" Instruction Fetch Hitting in and Being Bypassed from ISB"}, #define PME_MONT_L1I_FETCH_RAB_HIT 340 { "L1I_FETCH_RAB_HIT", {0x65}, 0xfff0, 1, {0xffff0001}, "Instruction Fetch Hitting in RAB"}, #define PME_MONT_L1I_FILLS 341 { "L1I_FILLS", {0x841}, 0xfff0, 1, {0xffff0001}, "L1 Instruction Cache Fills"}, #define PME_MONT_L1I_PREFETCHES 342 { "L1I_PREFETCHES", {0x44}, 0xfff0, 1, {0xffff0001}, "L1 Instruction Prefetch Requests"}, #define PME_MONT_L1I_PREFETCH_STALL_ALL 343 { "L1I_PREFETCH_STALL_ALL", {0x30067}, 0xfff0, 1, {0xffff0000}, "Prefetch Pipeline Stalls -- Number of clocks prefetch pipeline is stalled"}, #define PME_MONT_L1I_PREFETCH_STALL_FLOW 344 { "L1I_PREFETCH_STALL_FLOW", {0x20067}, 0xfff0, 1, {0xffff0000}, "Prefetch Pipeline Stalls -- Asserted when the streaming prefetcher is working close to the instructions being fetched for demand reads, and is not asserted when the streaming prefetcher is ranging way ahead of the demand reads."}, #define PME_MONT_L1I_PURGE 345 { "L1I_PURGE", {0x104b}, 0xfff0, 1, {0xffff0001}, "L1ITLB Purges Handled by L1I"}, #define PME_MONT_L1I_PVAB_OVERFLOW 346 { "L1I_PVAB_OVERFLOW", {0x69}, 0xfff0, 1, {0xffff0000}, "PVAB Overflow"}, #define PME_MONT_L1I_RAB_ALMOST_FULL 347 { "L1I_RAB_ALMOST_FULL", {0x1064}, 0xfff0, 1, {0xffff0000}, "Is RAB Almost Full?"}, #define PME_MONT_L1I_RAB_FULL 348 { "L1I_RAB_FULL", {0x1060}, 0xfff0, 1, {0xffff0000}, "Is RAB Full?"}, #define PME_MONT_L1I_READS 349 { "L1I_READS", {0x40}, 0xfff0, 1, {0xffff0001}, "L1 Instruction Cache Reads"}, #define PME_MONT_L1I_SNOOP 350 { 
"L1I_SNOOP", {0x104a}, 0xfff0, 1, {0xffff0007}, "Snoop Requests Handled by L1I"}, #define PME_MONT_L1I_STRM_PREFETCHES 351 { "L1I_STRM_PREFETCHES", {0x5f}, 0xfff0, 1, {0xffff0001}, "L1 Instruction Cache Line Prefetch Requests"}, #define PME_MONT_L2DTLB_MISSES 352 { "L2DTLB_MISSES", {0xc1}, 0xfff0, 4, {0x5010007}, "L2DTLB Misses"}, #define PME_MONT_L2D_BAD_LINES_SELECTED_ANY 353 { "L2D_BAD_LINES_SELECTED_ANY", {0x8ec}, 0xfff0, 4, {0x4520007}, "Valid Line Replaced When Invalid Line Is Available -- Valid line replaced when invalid line is available"}, #define PME_MONT_L2D_BYPASS_L2_DATA1 354 { "L2D_BYPASS_L2_DATA1", {0x8e4}, 0xfff0, 1, {0x4120007}, "Count L2D Bypasses -- Count only L2D data bypasses (L1D to L2A)"}, #define PME_MONT_L2D_BYPASS_L2_DATA2 355 { "L2D_BYPASS_L2_DATA2", {0x108e4}, 0xfff0, 1, {0x4120007}, "Count L2D Bypasses -- Count only L2D data bypasses (L1W to L2I)"}, #define PME_MONT_L2D_BYPASS_L3_DATA1 356 { "L2D_BYPASS_L3_DATA1", {0x208e4}, 0xfff0, 1, {0x4120007}, "Count L2D Bypasses -- Count only L3 data bypasses (L1D to L2A)"}, #define PME_MONT_L2D_FILLB_FULL_THIS 357 { "L2D_FILLB_FULL_THIS", {0x8f1}, 0xfff0, 1, {0x4720000}, "L2D Fill Buffer Is Full -- L2D Fill buffer is full"}, #define PME_MONT_L2D_FILL_MESI_STATE_E 358 { "L2D_FILL_MESI_STATE_E", {0x108f2}, 0xfff0, 1, {0x4820000}, "L2D Cache Fills with MESI state -- "}, #define PME_MONT_L2D_FILL_MESI_STATE_I 359 { "L2D_FILL_MESI_STATE_I", {0x308f2}, 0xfff0, 1, {0x4820000}, "L2D Cache Fills with MESI state -- "}, #define PME_MONT_L2D_FILL_MESI_STATE_M 360 { "L2D_FILL_MESI_STATE_M", {0x8f2}, 0xfff0, 1, {0x4820000}, "L2D Cache Fills with MESI state -- "}, #define PME_MONT_L2D_FILL_MESI_STATE_P 361 { "L2D_FILL_MESI_STATE_P", {0x408f2}, 0xfff0, 1, {0x4820000}, "L2D Cache Fills with MESI state -- "}, #define PME_MONT_L2D_FILL_MESI_STATE_S 362 { "L2D_FILL_MESI_STATE_S", {0x208f2}, 0xfff0, 1, {0x4820000}, "L2D Cache Fills with MESI state -- "}, #define PME_MONT_L2D_FORCE_RECIRC_FILL_HIT 363 { 
"L2D_FORCE_RECIRC_FILL_HIT", {0x808ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Count only those caused by an L2D miss which hit in the fill buffer."}, #define PME_MONT_L2D_FORCE_RECIRC_FRC_RECIRC 364 { "L2D_FORCE_RECIRC_FRC_RECIRC", {0x908ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Caused by an L2D miss when a force recirculate already existed in the Ozq."}, #define PME_MONT_L2D_FORCE_RECIRC_L1W 365 { "L2D_FORCE_RECIRC_L1W", {0xc08ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Count only those caused by an L2D miss one cycle ahead of the current op."}, #define PME_MONT_L2D_FORCE_RECIRC_LIMBO 366 { "L2D_FORCE_RECIRC_LIMBO", {0x108ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Count operations that went into the LIMBO Ozq state. This state is entered when the op sees a FILL_HIT or OZQ_MISS event."}, #define PME_MONT_L2D_FORCE_RECIRC_OZQ_MISS 367 { "L2D_FORCE_RECIRC_OZQ_MISS", {0xb08ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Caused by an L2D miss when an L2D miss was already in the OZQ."}, #define PME_MONT_L2D_FORCE_RECIRC_RECIRC 368 { "L2D_FORCE_RECIRC_RECIRC", {0x8ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Counts inserts into OzQ due to a recirculate.
The recirculate is due to secondary misses or various other conflicts"}, #define PME_MONT_L2D_FORCE_RECIRC_SAME_INDEX 369 { "L2D_FORCE_RECIRC_SAME_INDEX", {0xa08ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Caused by an L2D miss when a miss to the same index was in the same issue group."}, #define PME_MONT_L2D_FORCE_RECIRC_SECONDARY_ALL 370 { "L2D_FORCE_RECIRC_SECONDARY_ALL", {0xf08ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Caused by any L2D op that saw a miss to the same address in OZQ, L2 fill buffer, or one cycle ahead in the main pipeline."}, #define PME_MONT_L2D_FORCE_RECIRC_SECONDARY_READ 371 { "L2D_FORCE_RECIRC_SECONDARY_READ", {0xd08ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Caused by L2D read op that saw a miss to the same address in OZQ, L2 fill buffer, or one cycle ahead in the main pipeline."}, #define PME_MONT_L2D_FORCE_RECIRC_SECONDARY_WRITE 372 { "L2D_FORCE_RECIRC_SECONDARY_WRITE", {0xe08ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Caused by L2D write op that saw a miss to the same address in OZQ, L2 fill buffer, or one cycle ahead in the main pipeline."}, #define PME_MONT_L2D_FORCE_RECIRC_SNP_OR_L3 373 { "L2D_FORCE_RECIRC_SNP_OR_L3", {0x608ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Count only those caused by a snoop or L3 issue."}, #define PME_MONT_L2D_FORCE_RECIRC_TAG_NOTOK 374 { "L2D_FORCE_RECIRC_TAG_NOTOK", {0x408ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Count only those caused by L2D hits caused by in flight snoops, stores with a sibling miss to the same index, sibling probe to the same line or a pending mf.a instruction. This count can usually be ignored since its events are rare, unpredictable, and/or show up in one of the other events."}, #define PME_MONT_L2D_FORCE_RECIRC_TAG_OK 375 { "L2D_FORCE_RECIRC_TAG_OK", {0x708ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Count operations that inserted to Ozq as a hit. Thus it was NOT forced to recirculate.
Likely identical to L2D_INSERT_HITS."}, #define PME_MONT_L2D_FORCE_RECIRC_TRAN_PREF 376 { "L2D_FORCE_RECIRC_TRAN_PREF", {0x508ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Count only those caused by L2D miss requests that transformed to prefetches"}, #define PME_MONT_L2D_INSERT_HITS 377 { "L2D_INSERT_HITS", {0x8b1}, 0xfff0, 4, {0xffff0007}, "Count Number of Times an Inserting Data Request Hit in the L2D."}, #define PME_MONT_L2D_INSERT_MISSES 378 { "L2D_INSERT_MISSES", {0x8b0}, 0xfff0, 4, {0xffff0007}, "Count Number of Times an Inserting Data Request Missed the L2D."}, #define PME_MONT_L2D_ISSUED_RECIRC_OZQ_ACC 379 { "L2D_ISSUED_RECIRC_OZQ_ACC", {0x8eb}, 0xfff0, 1, {0x4420007}, "Count Number of Times a Recirculate Issue Was Attempted and Not Preempted"}, #define PME_MONT_L2D_L3ACCESS_CANCEL_ANY 380 { "L2D_L3ACCESS_CANCEL_ANY", {0x208e8}, 0xfff0, 1, {0x4320007}, "L2D Access Cancelled by L2D -- count cancels due to any reason. This umask will count more than the sum of all the other umasks. It will count things that weren't committed accesses when they reached L1w, but the L2D attempted to bypass them to the L3 anyway (speculatively). This will include accesses made repeatedly while the main pipeline is stalled and the L1D is attempting to recirculate an access down the L1D pipeline. Thus, an access could get counted many times before it really does get bypassed to the L3. 
It is a measure of how many times we asserted a request to the L3 but didn't confirm it."}, #define PME_MONT_L2D_L3ACCESS_CANCEL_ER_REJECT 381 { "L2D_L3ACCESS_CANCEL_ER_REJECT", {0x308e8}, 0xfff0, 1, {0x4320007}, "L2D Access Cancelled by L2D -- Count only requests that were rejected by ER"}, #define PME_MONT_L2D_L3ACCESS_CANCEL_INV_L3_BYP 382 { "L2D_L3ACCESS_CANCEL_INV_L3_BYP", {0x8e8}, 0xfff0, 1, {0x4320007}, "L2D Access Cancelled by L2D -- L2D cancelled a bypass because it did not commit, or was not a valid opcode to bypass, or was not a true miss of L2D (either hit, recirc, or limbo)."}, #define PME_MONT_L2D_L3ACCESS_CANCEL_P2_COV_SNP_FILL_NOSNP 383 { "L2D_L3ACCESS_CANCEL_P2_COV_SNP_FILL_NOSNP", {0x608e8}, 0xfff0, 1, {0x4320007}, "L2D Access Cancelled by L2D -- A snoop and a fill to the same address reached the L2D within a 3 cycle window of each other or a snoop hit a nosnoops entry in Ozq."}, #define PME_MONT_L2D_L3ACCESS_CANCEL_P2_COV_SNP_TEM 384 { "L2D_L3ACCESS_CANCEL_P2_COV_SNP_TEM", {0x408e8}, 0xfff0, 1, {0x4320007}, "L2D Access Cancelled by L2D -- A snoop saw an L2D tag error and missed."}, #define PME_MONT_L2D_L3ACCESS_CANCEL_P2_COV_SNP_VIC 385 { "L2D_L3ACCESS_CANCEL_P2_COV_SNP_VIC", {0x508e8}, 0xfff0, 1, {0x4320007}, "L2D Access Cancelled by L2D -- A snoop hit in the L1D victim buffer"}, #define PME_MONT_L2D_L3ACCESS_CANCEL_SPEC_L3_BYP 386 { "L2D_L3ACCESS_CANCEL_SPEC_L3_BYP", {0x108e8}, 0xfff0, 1, {0x4320007}, "L2D Access Cancelled by L2D -- L2D cancelled speculative L3 bypasses because it was not a WB memory attribute or it was an effective release."}, #define PME_MONT_L2D_L3ACCESS_CANCEL_TAIL_TRANS_DIS 387 { "L2D_L3ACCESS_CANCEL_TAIL_TRANS_DIS", {0x708e8}, 0xfff0, 1, {0x4320007}, "L2D Access Cancelled by L2D -- Count the number of cycles that either transform to prefetches or Ozq tail collapse have been dynamically disabled.
This would indicate that memory contention has led the L2D to throttle requests to prevent livelock scenarios."}, #define PME_MONT_L2D_MISSES 388 { "L2D_MISSES", {0x8cb}, 0xfff0, 1, {0xffff0007}, "L2 Misses"}, #define PME_MONT_L2D_OPS_ISSUED_FP_LOAD 389 { "L2D_OPS_ISSUED_FP_LOAD", {0x108f0}, 0xfff0, 4, {0xffff0007}, "Operations Issued By L2D -- Count only valid floating-point loads"}, #define PME_MONT_L2D_OPS_ISSUED_INT_LOAD 390 { "L2D_OPS_ISSUED_INT_LOAD", {0x8f0}, 0xfff0, 4, {0xffff0007}, "Operations Issued By L2D -- Count only valid integer loads, including ld16."}, #define PME_MONT_L2D_OPS_ISSUED_LFETCH 391 { "L2D_OPS_ISSUED_LFETCH", {0x408f0}, 0xfff0, 4, {0xffff0007}, "Operations Issued By L2D -- Count only lfetch operations."}, #define PME_MONT_L2D_OPS_ISSUED_OTHER 392 { "L2D_OPS_ISSUED_OTHER", {0x508f0}, 0xfff0, 4, {0xffff0007}, "Operations Issued By L2D -- Count only valid non-load, no-store accesses that are not in any of the above sections."}, #define PME_MONT_L2D_OPS_ISSUED_RMW 393 { "L2D_OPS_ISSUED_RMW", {0x208f0}, 0xfff0, 4, {0xffff0007}, "Operations Issued By L2D -- Count only valid read_modify_write stores and semaphores including cmp8xchg16."}, #define PME_MONT_L2D_OPS_ISSUED_STORE 394 { "L2D_OPS_ISSUED_STORE", {0x308f0}, 0xfff0, 4, {0xffff0007}, "Operations Issued By L2D -- Count only valid non-read_modify_write stores, including st16."}, #define PME_MONT_L2D_OZDB_FULL_THIS 395 { "L2D_OZDB_FULL_THIS", {0x8e9}, 0xfff0, 1, {0x4320000}, "L2D OZ Data Buffer Is Full -- L2 OZ Data Buffer is full"}, #define PME_MONT_L2D_OZQ_ACQUIRE 396 { "L2D_OZQ_ACQUIRE", {0x8ef}, 0xfff0, 1, {0x4620000}, "Acquire Ordering Attribute Exists in L2D OZQ"}, #define PME_MONT_L2D_OZQ_CANCELS0_ACQ 397 { "L2D_OZQ_CANCELS0_ACQ", {0x608e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- caused by an acquire somewhere in Ozq or ER."}, #define PME_MONT_L2D_OZQ_CANCELS0_BANK_CONF 398 { "L2D_OZQ_CANCELS0_BANK_CONF", {0x808e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ
Cancels (Specific Reason Set 0) -- a bypassed L2D hit operation had a bank conflict with an older sibling bypass or an older operation in the L2D pipeline."}, #define PME_MONT_L2D_OZQ_CANCELS0_CANC_L2M_TO_L2C_ST 399 { "L2D_OZQ_CANCELS0_CANC_L2M_TO_L2C_ST", {0x108e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- caused by a canceled store in L2M,L2D or L2C. This is the combination of following subevents that were available separately in Itanium2: CANC_L2M_ST=caused by canceled store in L2M, CANC_L2D_ST=caused by canceled store in L2D, CANC_L2C_ST=caused by canceled store in L2C"}, #define PME_MONT_L2D_OZQ_CANCELS0_FILL_ST_CONF 400 { "L2D_OZQ_CANCELS0_FILL_ST_CONF", {0xe08e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- an OZQ store conflicted with a returning L2D fill"}, #define PME_MONT_L2D_OZQ_CANCELS0_L2A_ST_MAT 401 { "L2D_OZQ_CANCELS0_L2A_ST_MAT", {0x208e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- canceled due to an uncanceled store match in L2A"}, #define PME_MONT_L2D_OZQ_CANCELS0_L2C_ST_MAT 402 { "L2D_OZQ_CANCELS0_L2C_ST_MAT", {0x508e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- canceled due to an uncanceled store match in L2C"}, #define PME_MONT_L2D_OZQ_CANCELS0_L2D_ST_MAT 403 { "L2D_OZQ_CANCELS0_L2D_ST_MAT", {0x408e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- canceled due to an uncanceled store match in L2D"}, #define PME_MONT_L2D_OZQ_CANCELS0_L2M_ST_MAT 404 { "L2D_OZQ_CANCELS0_L2M_ST_MAT", {0x308e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- canceled due to an uncanceled store match in L2M"}, #define PME_MONT_L2D_OZQ_CANCELS0_MISC_ORDER 405 { "L2D_OZQ_CANCELS0_MISC_ORDER", {0xd08e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- a sync.i or mf.a . 
This is the combination of following subevents that were available separately in Itanium2: SYNC=caused by sync.i, MFA=a memory fence instruction"}, #define PME_MONT_L2D_OZQ_CANCELS0_OVER_SUB 406 { "L2D_OZQ_CANCELS0_OVER_SUB", {0xa08e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- a high Ozq issue rate resulted in the L2D having to cancel due to hardware restrictions. This is the combination of following subevents that were available separately in Itanium2: OVER_SUB=oversubscription, L1DF_L2M=L1D fill in L2M"}, #define PME_MONT_L2D_OZQ_CANCELS0_OZDATA_CONF 407 { "L2D_OZQ_CANCELS0_OZDATA_CONF", {0xf08e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- an OZQ operation that needed to read the OZQ data buffer conflicted with a fill return that needed to do the same."}, #define PME_MONT_L2D_OZQ_CANCELS0_OZQ_PREEMPT 408 { "L2D_OZQ_CANCELS0_OZQ_PREEMPT", {0xb08e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- an L2D fill return conflicted with, and cancelled, an ozq request for various reasons. Formerly known as L1_FILL_CONF."}, #define PME_MONT_L2D_OZQ_CANCELS0_RECIRC 409 { "L2D_OZQ_CANCELS0_RECIRC", {0x8e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- a recirculate was cancelled due to h/w limitations on recirculate issue rate. 
This is the combination of following subevents that were available separately in Itanium2: RECIRC_OVER_SUB=caused by a recirculate oversubscription, DIDNT_RECIRC=caused because it did not recirculate, WEIRD=counts the cancels caused by attempted 5-cycle bypasses for non-aligned accesses and bypasses blocking recirculates for too long"}, #define PME_MONT_L2D_OZQ_CANCELS0_REL 410 { "L2D_OZQ_CANCELS0_REL", {0x708e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- a release was cancelled due to some other operation"}, #define PME_MONT_L2D_OZQ_CANCELS0_SEMA 411 { "L2D_OZQ_CANCELS0_SEMA", {0x908e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- a semaphore op was cancelled for various ordering or h/w restriction reasons. This is the combination of following subevents that were available separately in Itanium 2: SEM=a semaphore, CCV=a CCV"}, #define PME_MONT_L2D_OZQ_CANCELS0_WB_CONF 412 { "L2D_OZQ_CANCELS0_WB_CONF", {0xc08e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- an OZQ request conflicted with an L2D data array read for a writeback. 
This is the combination of following subevents that were available separately in Itanium2: READ_WB_CONF=a write back conflict, ST_FILL_CONF=a store fill conflict"}, #define PME_MONT_L2D_OZQ_CANCELS1_ANY 413 { "L2D_OZQ_CANCELS1_ANY", {0x8e2}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Late or Any) -- counts the total OZ Queue cancels"}, #define PME_MONT_L2D_OZQ_CANCELS1_LATE_BYP_EFFRELEASE 414 { "L2D_OZQ_CANCELS1_LATE_BYP_EFFRELEASE", {0x308e2}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Late or Any) -- counts the late cancels caused by L1D to L2A bypass effective releases"}, #define PME_MONT_L2D_OZQ_CANCELS1_LATE_SPEC_BYP 415 { "L2D_OZQ_CANCELS1_LATE_SPEC_BYP", {0x108e2}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Late or Any) -- counts the late cancels caused by speculative bypasses"}, #define PME_MONT_L2D_OZQ_CANCELS1_SIBLING_ACQ_REL 416 { "L2D_OZQ_CANCELS1_SIBLING_ACQ_REL", {0x208e2}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Late or Any) -- counts the late cancels caused by releases and acquires in the same issue group. 
This is the combination of following subevents that were available separately in Itanium2: LATE_ACQUIRE=late cancels caused by acquires, LATE_RELEASE=late cancels caused by releases"}, #define PME_MONT_L2D_OZQ_FULL_THIS 417 { "L2D_OZQ_FULL_THIS", {0x8bc}, 0xfff0, 1, {0x4520000}, "L2D OZQ Is Full -- L2D OZQ is full"}, #define PME_MONT_L2D_OZQ_RELEASE 418 { "L2D_OZQ_RELEASE", {0x8e5}, 0xfff0, 1, {0x4120000}, "Release Ordering Attribute Exists in L2D OZQ"}, #define PME_MONT_L2D_REFERENCES_ALL 419 { "L2D_REFERENCES_ALL", {0x308e6}, 0xfff0, 4, {0x4220007}, "Data Read/Write Access to L2D -- count both read and write operations (semaphores will count as 2)"}, #define PME_MONT_L2D_REFERENCES_READS 420 { "L2D_REFERENCES_READS", {0x108e6}, 0xfff0, 4, {0x4220007}, "Data Read/Write Access to L2D -- count only data read and semaphore operations."}, #define PME_MONT_L2D_REFERENCES_WRITES 421 { "L2D_REFERENCES_WRITES", {0x208e6}, 0xfff0, 4, {0x4220007}, "Data Read/Write Access to L2D -- count only data write and semaphore operations"}, #define PME_MONT_L2D_STORE_HIT_SHARED_ANY 422 { "L2D_STORE_HIT_SHARED_ANY", {0x8ed}, 0xfff0, 2, {0x4520007}, "Store Hit a Shared Line -- Store hit a shared line"}, #define PME_MONT_L2D_VICTIMB_FULL_THIS 423 { "L2D_VICTIMB_FULL_THIS", {0x8f3}, 0xfff0, 1, {0x4820000}, "L2D Victim Buffer Is Full -- L2D victim buffer is full"}, #define PME_MONT_L2I_DEMAND_READS 424 { "L2I_DEMAND_READS", {0x42}, 0xfff0, 1, {0xffff0001}, "L2 Instruction Demand Fetch Requests"}, #define PME_MONT_L2I_HIT_CONFLICTS_ALL_ALL 425 { "L2I_HIT_CONFLICTS_ALL_ALL", {0xf087d}, 0xfff0, 1, {0xffff0001}, "L2I hit conflicts -- All fetches that reference L2I are counted"}, #define PME_MONT_L2I_HIT_CONFLICTS_ALL_DMND 426 { "L2I_HIT_CONFLICTS_ALL_DMND", {0xd087d}, 0xfff0, 1, {0xffff0001}, "L2I hit conflicts -- Only demand fetches that reference L2I are counted"}, #define PME_MONT_L2I_HIT_CONFLICTS_ALL_PFTCH 427 { "L2I_HIT_CONFLICTS_ALL_PFTCH", {0xe087d}, 0xfff0, 1, {0xffff0001}, "L2I hit 
conflicts -- Only prefetches that reference L2I are counted"}, #define PME_MONT_L2I_HIT_CONFLICTS_HIT_ALL 428 { "L2I_HIT_CONFLICTS_HIT_ALL", {0x7087d}, 0xfff0, 1, {0xffff0001}, "L2I hit conflicts -- All fetches that hit in L2I are counted"}, #define PME_MONT_L2I_HIT_CONFLICTS_HIT_DMND 429 { "L2I_HIT_CONFLICTS_HIT_DMND", {0x5087d}, 0xfff0, 1, {0xffff0001}, "L2I hit conflicts -- Only demand fetches that hit in L2I are counted"}, #define PME_MONT_L2I_HIT_CONFLICTS_HIT_PFTCH 430 { "L2I_HIT_CONFLICTS_HIT_PFTCH", {0x6087d}, 0xfff0, 1, {0xffff0001}, "L2I hit conflicts -- Only prefetches that hit in L2I are counted"}, #define PME_MONT_L2I_HIT_CONFLICTS_MISS_ALL 431 { "L2I_HIT_CONFLICTS_MISS_ALL", {0xb087d}, 0xfff0, 1, {0xffff0001}, "L2I hit conflicts -- All fetches that miss in L2I are counted"}, #define PME_MONT_L2I_HIT_CONFLICTS_MISS_DMND 432 { "L2I_HIT_CONFLICTS_MISS_DMND", {0x9087d}, 0xfff0, 1, {0xffff0001}, "L2I hit conflicts -- Only demand fetches that miss in L2I are counted"}, #define PME_MONT_L2I_HIT_CONFLICTS_MISS_PFTCH 433 { "L2I_HIT_CONFLICTS_MISS_PFTCH", {0xa087d}, 0xfff0, 1, {0xffff0001}, "L2I hit conflicts -- Only prefetches that miss in L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_ALL_ALL 434 { "L2I_L3_REJECTS_ALL_ALL", {0xf087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- All fetches that reference L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_ALL_DMND 435 { "L2I_L3_REJECTS_ALL_DMND", {0xd087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- Only demand fetches that reference L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_ALL_PFTCH 436 { "L2I_L3_REJECTS_ALL_PFTCH", {0xe087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- Only prefetches that reference L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_HIT_ALL 437 { "L2I_L3_REJECTS_HIT_ALL", {0x7087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- All fetches that hit in L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_HIT_DMND 438 { "L2I_L3_REJECTS_HIT_DMND", {0x5087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- Only 
demand fetches that hit in L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_HIT_PFTCH 439 { "L2I_L3_REJECTS_HIT_PFTCH", {0x6087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- Only prefetches that hit in L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_MISS_ALL 440 { "L2I_L3_REJECTS_MISS_ALL", {0xb087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- All fetches that miss in L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_MISS_DMND 441 { "L2I_L3_REJECTS_MISS_DMND", {0x9087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- Only demand fetches that miss in L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_MISS_PFTCH 442 { "L2I_L3_REJECTS_MISS_PFTCH", {0xa087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- Only prefetches that miss in L2I are counted"}, #define PME_MONT_L2I_PREFETCHES 443 { "L2I_PREFETCHES", {0x45}, 0xfff0, 1, {0xffff0001}, "L2 Instruction Prefetch Requests"}, #define PME_MONT_L2I_READS_ALL_ALL 444 { "L2I_READS_ALL_ALL", {0xf0878}, 0xfff0, 1, {0xffff0001}, "L2I Cacheable Reads -- All fetches that reference L2I are counted"}, #define PME_MONT_L2I_READS_ALL_DMND 445 { "L2I_READS_ALL_DMND", {0xd0878}, 0xfff0, 1, {0xffff0001}, "L2I Cacheable Reads -- Only demand fetches that reference L2I are counted"}, #define PME_MONT_L2I_READS_ALL_PFTCH 446 { "L2I_READS_ALL_PFTCH", {0xe0878}, 0xfff0, 1, {0xffff0001}, "L2I Cacheable Reads -- Only prefetches that reference L2I are counted"}, #define PME_MONT_L2I_READS_HIT_ALL 447 { "L2I_READS_HIT_ALL", {0x70878}, 0xfff0, 1, {0xffff0001}, "L2I Cacheable Reads -- All fetches that hit in L2I are counted"}, #define PME_MONT_L2I_READS_HIT_DMND 448 { "L2I_READS_HIT_DMND", {0x50878}, 0xfff0, 1, {0xffff0001}, "L2I Cacheable Reads -- Only demand fetches that hit in L2I are counted"}, #define PME_MONT_L2I_READS_HIT_PFTCH 449 { "L2I_READS_HIT_PFTCH", {0x60878}, 0xfff0, 1, {0xffff0001}, "L2I Cacheable Reads -- Only prefetches that hit in L2I are counted"}, #define PME_MONT_L2I_READS_MISS_ALL 450 { "L2I_READS_MISS_ALL", {0xb0878}, 0xfff0, 1, 
{0xffff0001}, "L2I Cacheable Reads -- All fetches that miss in L2I are counted"}, #define PME_MONT_L2I_READS_MISS_DMND 451 { "L2I_READS_MISS_DMND", {0x90878}, 0xfff0, 1, {0xffff0001}, "L2I Cacheable Reads -- Only demand fetches that miss in L2I are counted"}, #define PME_MONT_L2I_READS_MISS_PFTCH 452 { "L2I_READS_MISS_PFTCH", {0xa0878}, 0xfff0, 1, {0xffff0001}, "L2I Cacheable Reads -- Only prefetches that miss in L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_ALL_ALL 453 { "L2I_RECIRCULATES_ALL_ALL", {0xf087b}, 0xfff0, 1, {0xffff0001}, "L2I recirculates -- All fetches that reference L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_ALL_DMND 454 { "L2I_RECIRCULATES_ALL_DMND", {0xd087b}, 0xfff0, 1, {0xffff0001}, "L2I recirculates -- Only demand fetches that reference L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_ALL_PFTCH 455 { "L2I_RECIRCULATES_ALL_PFTCH", {0xe087b}, 0xfff0, 1, {0xffff0001}, "L2I recirculates -- Only prefetches that reference L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_HIT_ALL 456 { "L2I_RECIRCULATES_HIT_ALL", {0x7087b}, 0xfff0, 1, {0xffff0001}, "L2I recirculates -- All fetches that hit in L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_HIT_DMND 457 { "L2I_RECIRCULATES_HIT_DMND", {0x5087b}, 0xfff0, 1, {0xffff0001}, "L2I recirculates -- Only demand fetches that hit in L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_HIT_PFTCH 458 { "L2I_RECIRCULATES_HIT_PFTCH", {0x6087b}, 0xfff0, 1, {0xffff0001}, "L2I recirculates -- Only prefetches that hit in L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_MISS_ALL 459 { "L2I_RECIRCULATES_MISS_ALL", {0xb087b}, 0xfff0, 1, {0xffff0001}, "L2I recirculates -- All fetches that miss in L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_MISS_DMND 460 { "L2I_RECIRCULATES_MISS_DMND", {0x9087b}, 0xfff0, 1, {0xffff0001}, "L2I recirculates -- Only demand fetches that miss in L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_MISS_PFTCH 461 { "L2I_RECIRCULATES_MISS_PFTCH", {0xa087b}, 
0xfff0, 1, {0xffff0001}, "L2I recirculates -- Only prefetches that miss in L2I are counted"}, #define PME_MONT_L2I_SNOOP_HITS 462 { "L2I_SNOOP_HITS", {0x107f}, 0xfff0, 1, {0xffff0000}, "L2I snoop hits"}, #define PME_MONT_L2I_SPEC_ABORTS 463 { "L2I_SPEC_ABORTS", {0x87e}, 0xfff0, 1, {0xffff0001}, "L2I speculative aborts"}, #define PME_MONT_L2I_UC_READS_ALL_ALL 464 { "L2I_UC_READS_ALL_ALL", {0xf0879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- All fetches that reference L2I are counted"}, #define PME_MONT_L2I_UC_READS_ALL_DMND 465 { "L2I_UC_READS_ALL_DMND", {0xd0879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- Only demand fetches that reference L2I are counted"}, #define PME_MONT_L2I_UC_READS_ALL_PFTCH 466 { "L2I_UC_READS_ALL_PFTCH", {0xe0879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- Only prefetches that reference L2I are counted"}, #define PME_MONT_L2I_UC_READS_HIT_ALL 467 { "L2I_UC_READS_HIT_ALL", {0x70879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- All fetches that hit in L2I are counted"}, #define PME_MONT_L2I_UC_READS_HIT_DMND 468 { "L2I_UC_READS_HIT_DMND", {0x50879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- Only demand fetches that hit in L2I are counted"}, #define PME_MONT_L2I_UC_READS_HIT_PFTCH 469 { "L2I_UC_READS_HIT_PFTCH", {0x60879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- Only prefetches that hit in L2I are counted"}, #define PME_MONT_L2I_UC_READS_MISS_ALL 470 { "L2I_UC_READS_MISS_ALL", {0xb0879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- All fetches that miss in L2I are counted"}, #define PME_MONT_L2I_UC_READS_MISS_DMND 471 { "L2I_UC_READS_MISS_DMND", {0x90879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- Only demand fetches that miss in L2I are counted"}, #define PME_MONT_L2I_UC_READS_MISS_PFTCH 472 { "L2I_UC_READS_MISS_PFTCH", {0xa0879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- Only prefetches that miss in L2I are counted"}, #define PME_MONT_L2I_VICTIMIZATION 473 { 
"L2I_VICTIMIZATION", {0x87a}, 0xfff0, 1, {0xffff0001}, "L2I victimizations"}, #define PME_MONT_L3_INSERTS 474 { "L3_INSERTS", {0x8da}, 0xfff0, 1, {0xffff0017}, "L3 Cache Lines inserts"}, #define PME_MONT_L3_LINES_REPLACED 475 { "L3_LINES_REPLACED", {0x8df}, 0xfff0, 1, {0xffff0010}, "L3 Cache Lines Replaced"}, #define PME_MONT_L3_MISSES 476 { "L3_MISSES", {0x8dc}, 0xfff0, 1, {0xffff0007}, "L3 Misses"}, #define PME_MONT_L3_READS_ALL_ALL 477 { "L3_READS_ALL_ALL", {0xf08dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Read References"}, #define PME_MONT_L3_READS_ALL_HIT 478 { "L3_READS_ALL_HIT", {0xd08dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Read Hits"}, #define PME_MONT_L3_READS_ALL_MISS 479 { "L3_READS_ALL_MISS", {0xe08dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Read Misses"}, #define PME_MONT_L3_READS_DATA_READ_ALL 480 { "L3_READS_DATA_READ_ALL", {0xb08dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Load References (excludes reads for ownership used to satisfy stores)"}, #define PME_MONT_L3_READS_DATA_READ_HIT 481 { "L3_READS_DATA_READ_HIT", {0x908dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Load Hits (excludes reads for ownership used to satisfy stores)"}, #define PME_MONT_L3_READS_DATA_READ_MISS 482 { "L3_READS_DATA_READ_MISS", {0xa08dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Load Misses (excludes reads for ownership used to satisfy stores)"}, #define PME_MONT_L3_READS_DINST_FETCH_ALL 483 { "L3_READS_DINST_FETCH_ALL", {0x308dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Demand Instruction References"}, #define PME_MONT_L3_READS_DINST_FETCH_HIT 484 { "L3_READS_DINST_FETCH_HIT", {0x108dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Demand Instruction Fetch Hits"}, #define PME_MONT_L3_READS_DINST_FETCH_MISS 485 { "L3_READS_DINST_FETCH_MISS", {0x208dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Demand Instruction Fetch Misses"}, #define PME_MONT_L3_READS_INST_FETCH_ALL 486 { "L3_READS_INST_FETCH_ALL", {0x708dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Instruction 
Fetch and Prefetch References"}, #define PME_MONT_L3_READS_INST_FETCH_HIT 487 { "L3_READS_INST_FETCH_HIT", {0x508dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Instruction Fetch and Prefetch Hits"}, #define PME_MONT_L3_READS_INST_FETCH_MISS 488 { "L3_READS_INST_FETCH_MISS", {0x608dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Instruction Fetch and Prefetch Misses"}, #define PME_MONT_L3_REFERENCES 489 { "L3_REFERENCES", {0x8db}, 0xfff0, 1, {0xffff0007}, "L3 References"}, #define PME_MONT_L3_WRITES_ALL_ALL 490 { "L3_WRITES_ALL_ALL", {0xf08de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L3 Write References"}, #define PME_MONT_L3_WRITES_ALL_HIT 491 { "L3_WRITES_ALL_HIT", {0xd08de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L3 Write Hits"}, #define PME_MONT_L3_WRITES_ALL_MISS 492 { "L3_WRITES_ALL_MISS", {0xe08de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L3 Write Misses"}, #define PME_MONT_L3_WRITES_DATA_WRITE_ALL 493 { "L3_WRITES_DATA_WRITE_ALL", {0x708de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L3 Store References (excludes L2 write backs, includes L3 read for ownership requests that satisfy stores)"}, #define PME_MONT_L3_WRITES_DATA_WRITE_HIT 494 { "L3_WRITES_DATA_WRITE_HIT", {0x508de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L3 Store Hits (excludes L2 write backs, includes L3 read for ownership requests that satisfy stores)"}, #define PME_MONT_L3_WRITES_DATA_WRITE_MISS 495 { "L3_WRITES_DATA_WRITE_MISS", {0x608de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L3 Store Misses (excludes L2 write backs, includes L3 read for ownership requests that satisfy stores)"}, #define PME_MONT_L3_WRITES_L2_WB_ALL 496 { "L3_WRITES_L2_WB_ALL", {0xb08de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L2 Write Back References"}, #define PME_MONT_L3_WRITES_L2_WB_HIT 497 { "L3_WRITES_L2_WB_HIT", {0x908de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L2 Write Back Hits"}, #define PME_MONT_L3_WRITES_L2_WB_MISS 498 { "L3_WRITES_L2_WB_MISS", {0xa08de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L2 Write Back Misses"}, 
#define PME_MONT_LOADS_RETIRED 499 { "LOADS_RETIRED", {0xcd}, 0xfff0, 4, {0x5310007}, "Retired Loads"}, #define PME_MONT_LOADS_RETIRED_INTG 500 { "LOADS_RETIRED_INTG", {0xd8}, 0xfff0, 2, {0x5610007}, "Integer loads retired"}, #define PME_MONT_MEM_READ_CURRENT_ANY 501 { "MEM_READ_CURRENT_ANY", {0x31089}, 0xfff0, 1, {0xffff0000}, "Current Mem Read Transactions On Bus -- CPU or non-CPU (all transactions)."}, #define PME_MONT_MEM_READ_CURRENT_IO 502 { "MEM_READ_CURRENT_IO", {0x11089}, 0xfff0, 1, {0xffff0000}, "Current Mem Read Transactions On Bus -- non-CPU priority agents"}, #define PME_MONT_MISALIGNED_LOADS_RETIRED 503 { "MISALIGNED_LOADS_RETIRED", {0xce}, 0xfff0, 4, {0x5310007}, "Retired Misaligned Load Instructions"}, #define PME_MONT_MISALIGNED_STORES_RETIRED 504 { "MISALIGNED_STORES_RETIRED", {0xd2}, 0xfff0, 2, {0x5410007}, "Retired Misaligned Store Instructions"}, #define PME_MONT_NOPS_RETIRED 505 { "NOPS_RETIRED", {0x50}, 0xfff0, 6, {0xffff0003}, "Retired NOP Instructions"}, #define PME_MONT_PREDICATE_SQUASHED_RETIRED 506 { "PREDICATE_SQUASHED_RETIRED", {0x51}, 0xfff0, 6, {0xffff0003}, "Instructions Squashed Due to Predicate Off"}, #define PME_MONT_RSE_CURRENT_REGS_2_TO_0 507 { "RSE_CURRENT_REGS_2_TO_0", {0x2b}, 0xfff0, 7, {0xffff0000}, "Current RSE Registers (Bits 2:0)"}, #define PME_MONT_RSE_CURRENT_REGS_5_TO_3 508 { "RSE_CURRENT_REGS_5_TO_3", {0x2a}, 0xfff0, 7, {0xffff0000}, "Current RSE Registers (Bits 5:3)"}, #define PME_MONT_RSE_CURRENT_REGS_6 509 { "RSE_CURRENT_REGS_6", {0x26}, 0xfff0, 1, {0xffff0000}, "Current RSE Registers (Bit 6)"}, #define PME_MONT_RSE_DIRTY_REGS_2_TO_0 510 { "RSE_DIRTY_REGS_2_TO_0", {0x29}, 0xfff0, 7, {0xffff0000}, "Dirty RSE Registers (Bits 2:0)"}, #define PME_MONT_RSE_DIRTY_REGS_5_TO_3 511 { "RSE_DIRTY_REGS_5_TO_3", {0x28}, 0xfff0, 7, {0xffff0000}, "Dirty RSE Registers (Bits 5:3)"}, #define PME_MONT_RSE_DIRTY_REGS_6 512 { "RSE_DIRTY_REGS_6", {0x24}, 0xfff0, 1, {0xffff0000}, "Dirty RSE Registers (Bit 6)"}, #define 
PME_MONT_RSE_EVENT_RETIRED 513 { "RSE_EVENT_RETIRED", {0x32}, 0xfff0, 1, {0xffff0000}, "Retired RSE operations"}, #define PME_MONT_RSE_REFERENCES_RETIRED_ALL 514 { "RSE_REFERENCES_RETIRED_ALL", {0x30020}, 0xfff0, 2, {0xffff0007}, "RSE Accesses -- Both RSE loads and stores will be counted."}, #define PME_MONT_RSE_REFERENCES_RETIRED_LOAD 515 { "RSE_REFERENCES_RETIRED_LOAD", {0x10020}, 0xfff0, 2, {0xffff0007}, "RSE Accesses -- Only RSE loads will be counted."}, #define PME_MONT_RSE_REFERENCES_RETIRED_STORE 516 { "RSE_REFERENCES_RETIRED_STORE", {0x20020}, 0xfff0, 2, {0xffff0007}, "RSE Accesses -- Only RSE stores will be counted."}, #define PME_MONT_SERIALIZATION_EVENTS 517 { "SERIALIZATION_EVENTS", {0x53}, 0xfff0, 1, {0xffff0000}, "Number of srlz.i Instructions"}, #define PME_MONT_SI_CCQ_COLLISIONS_EITHER 518 { "SI_CCQ_COLLISIONS_EITHER", {0x10a8}, 0xfff0, 2, {0xffff0000}, "Clean Castout Queue Collisions -- transactions initiated by either cpu core"}, #define PME_MONT_SI_CCQ_COLLISIONS_SELF 519 { "SI_CCQ_COLLISIONS_SELF", {0x110a8}, 0xfff0, 2, {0xffff0000}, "Clean Castout Queue Collisions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_CCQ_INSERTS_EITHER 520 { "SI_CCQ_INSERTS_EITHER", {0x18a5}, 0xfff0, 2, {0xffff0000}, "Clean Castout Queue Insertions -- transactions initiated by either cpu core"}, #define PME_MONT_SI_CCQ_INSERTS_SELF 521 { "SI_CCQ_INSERTS_SELF", {0x118a5}, 0xfff0, 2, {0xffff0000}, "Clean Castout Queue Insertions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_CCQ_LIVE_REQ_HI_EITHER 522 { "SI_CCQ_LIVE_REQ_HI_EITHER", {0x10a7}, 0xfff0, 1, {0xffff0000}, "Clean Castout Queue Requests (upper bit) -- transactions initiated by either cpu core"}, #define PME_MONT_SI_CCQ_LIVE_REQ_HI_SELF 523 { "SI_CCQ_LIVE_REQ_HI_SELF", {0x110a7}, 0xfff0, 1, {0xffff0000}, "Clean Castout Queue Requests (upper bit) -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_CCQ_LIVE_REQ_LO_EITHER 524 { "SI_CCQ_LIVE_REQ_LO_EITHER", 
{0x10a6}, 0xfff0, 7, {0xffff0000}, "Clean Castout Queue Requests (lower three bits) -- transactions initiated by either cpu core"}, #define PME_MONT_SI_CCQ_LIVE_REQ_LO_SELF 525 { "SI_CCQ_LIVE_REQ_LO_SELF", {0x110a6}, 0xfff0, 7, {0xffff0000}, "Clean Castout Queue Requests (lower three bits) -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_CYCLES 526 { "SI_CYCLES", {0x108e}, 0xfff0, 1, {0xffff0000}, "SI Cycles"}, #define PME_MONT_SI_IOQ_COLLISIONS 527 { "SI_IOQ_COLLISIONS", {0x10aa}, 0xfff0, 2, {0xffff0000}, "In Order Queue Collisions"}, #define PME_MONT_SI_IOQ_LIVE_REQ_HI 528 { "SI_IOQ_LIVE_REQ_HI", {0x1098}, 0xfff0, 2, {0xffff0000}, "Inorder Bus Queue Requests (upper bit)"}, #define PME_MONT_SI_IOQ_LIVE_REQ_LO 529 { "SI_IOQ_LIVE_REQ_LO", {0x1097}, 0xfff0, 3, {0xffff0000}, "Inorder Bus Queue Requests (lower three bits)"}, #define PME_MONT_SI_RQ_INSERTS_EITHER 530 { "SI_RQ_INSERTS_EITHER", {0x189e}, 0xfff0, 2, {0xffff0000}, "Request Queue Insertions -- transactions initiated by either cpu core"}, #define PME_MONT_SI_RQ_INSERTS_SELF 531 { "SI_RQ_INSERTS_SELF", {0x1189e}, 0xfff0, 2, {0xffff0000}, "Request Queue Insertions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_RQ_LIVE_REQ_HI_EITHER 532 { "SI_RQ_LIVE_REQ_HI_EITHER", {0x10a0}, 0xfff0, 1, {0xffff0000}, "Request Queue Requests (upper bit) -- transactions initiated by either cpu core"}, #define PME_MONT_SI_RQ_LIVE_REQ_HI_SELF 533 { "SI_RQ_LIVE_REQ_HI_SELF", {0x110a0}, 0xfff0, 1, {0xffff0000}, "Request Queue Requests (upper bit) -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_RQ_LIVE_REQ_LO_EITHER 534 { "SI_RQ_LIVE_REQ_LO_EITHER", {0x109f}, 0xfff0, 7, {0xffff0000}, "Request Queue Requests (lower three bits) -- transactions initiated by either cpu core"}, #define PME_MONT_SI_RQ_LIVE_REQ_LO_SELF 535 { "SI_RQ_LIVE_REQ_LO_SELF", {0x1109f}, 0xfff0, 7, {0xffff0000}, "Request Queue Requests (lower three bits) -- transactions initiated by 'this' cpu core"}, #define 
PME_MONT_SI_SCB_INSERTS_ALL_EITHER 536 { "SI_SCB_INSERTS_ALL_EITHER", {0xc10ab}, 0xfff0, 4, {0xffff0000}, "Snoop Coalescing Buffer Insertions -- count all snoop signoffs (plus backsnoop inserts) from either cpu core"}, #define PME_MONT_SI_SCB_INSERTS_ALL_SELF 537 { "SI_SCB_INSERTS_ALL_SELF", {0xd10ab}, 0xfff0, 4, {0xffff0000}, "Snoop Coalescing Buffer Insertions -- count all snoop signoffs (plus backsnoop inserts) from 'this' cpu core"}, #define PME_MONT_SI_SCB_INSERTS_HIT_EITHER 538 { "SI_SCB_INSERTS_HIT_EITHER", {0x410ab}, 0xfff0, 4, {0xffff0000}, "Snoop Coalescing Buffer Insertions -- count HIT snoop signoffs from either cpu core"}, #define PME_MONT_SI_SCB_INSERTS_HIT_SELF 539 { "SI_SCB_INSERTS_HIT_SELF", {0x510ab}, 0xfff0, 4, {0xffff0000}, "Snoop Coalescing Buffer Insertions -- count HIT snoop signoffs from 'this' cpu core"}, #define PME_MONT_SI_SCB_INSERTS_HITM_EITHER 540 { "SI_SCB_INSERTS_HITM_EITHER", {0x810ab}, 0xfff0, 4, {0xffff0000}, "Snoop Coalescing Buffer Insertions -- count HITM snoop signoffs from either cpu core"}, #define PME_MONT_SI_SCB_INSERTS_HITM_SELF 541 { "SI_SCB_INSERTS_HITM_SELF", {0x910ab}, 0xfff0, 4, {0xffff0000}, "Snoop Coalescing Buffer Insertions -- count HITM snoop signoffs from 'this' cpu core"}, #define PME_MONT_SI_SCB_INSERTS_MISS_EITHER 542 { "SI_SCB_INSERTS_MISS_EITHER", {0x10ab}, 0xfff0, 4, {0xffff0000}, "Snoop Coalescing Buffer Insertions -- count MISS snoop signoffs (plus backsnoop inserts) from either cpu core"}, #define PME_MONT_SI_SCB_INSERTS_MISS_SELF 543 { "SI_SCB_INSERTS_MISS_SELF", {0x110ab}, 0xfff0, 4, {0xffff0000}, "Snoop Coalescing Buffer Insertions -- count MISS snoop signoffs (plus backsnoop inserts) from 'this' cpu core"}, #define PME_MONT_SI_SCB_LIVE_REQ_HI_EITHER 544 { "SI_SCB_LIVE_REQ_HI_EITHER", {0x10ad}, 0xfff0, 1, {0xffff0000}, "Snoop Coalescing Buffer Requests (upper bit) -- transactions initiated by either cpu core"}, #define PME_MONT_SI_SCB_LIVE_REQ_HI_SELF 545 { "SI_SCB_LIVE_REQ_HI_SELF", {0x110ad}, 
0xfff0, 1, {0xffff0000}, "Snoop Coalescing Buffer Requests (upper bit) -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_SCB_LIVE_REQ_LO_EITHER 546 { "SI_SCB_LIVE_REQ_LO_EITHER", {0x10ac}, 0xfff0, 7, {0xffff0000}, "Snoop Coalescing Buffer Requests (lower three bits) -- transactions initiated by either cpu core"}, #define PME_MONT_SI_SCB_LIVE_REQ_LO_SELF 547 { "SI_SCB_LIVE_REQ_LO_SELF", {0x110ac}, 0xfff0, 7, {0xffff0000}, "Snoop Coalescing Buffer Requests (lower three bits) -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_SCB_SIGNOFFS_ALL 548 { "SI_SCB_SIGNOFFS_ALL", {0xc10ae}, 0xfff0, 1, {0xffff0000}, "Snoop Coalescing Buffer Coherency Signoffs -- count all snoop signoffs"}, #define PME_MONT_SI_SCB_SIGNOFFS_HIT 549 { "SI_SCB_SIGNOFFS_HIT", {0x410ae}, 0xfff0, 1, {0xffff0000}, "Snoop Coalescing Buffer Coherency Signoffs -- count HIT snoop signoffs"}, #define PME_MONT_SI_SCB_SIGNOFFS_HITM 550 { "SI_SCB_SIGNOFFS_HITM", {0x810ae}, 0xfff0, 1, {0xffff0000}, "Snoop Coalescing Buffer Coherency Signoffs -- count HITM snoop signoffs"}, #define PME_MONT_SI_SCB_SIGNOFFS_MISS 551 { "SI_SCB_SIGNOFFS_MISS", {0x10ae}, 0xfff0, 1, {0xffff0000}, "Snoop Coalescing Buffer Coherency Signoffs -- count MISS snoop signoffs"}, #define PME_MONT_SI_WAQ_COLLISIONS_EITHER 552 { "SI_WAQ_COLLISIONS_EITHER", {0x10a4}, 0xfff0, 1, {0xffff0000}, "Write Address Queue Collisions -- transactions initiated by either cpu core"}, #define PME_MONT_SI_WAQ_COLLISIONS_SELF 553 { "SI_WAQ_COLLISIONS_SELF", {0x110a4}, 0xfff0, 1, {0xffff0000}, "Write Address Queue Collisions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_WDQ_ECC_ERRORS_ALL_EITHER 554 { "SI_WDQ_ECC_ERRORS_ALL_EITHER", {0x810af}, 0xfff0, 2, {0xffff0000}, "Write Data Queue ECC Errors -- count all ECC errors from either cpu core"}, #define PME_MONT_SI_WDQ_ECC_ERRORS_ALL_SELF 555 { "SI_WDQ_ECC_ERRORS_ALL_SELF", {0x910af}, 0xfff0, 2, {0xffff0000}, "Write Data Queue ECC Errors -- count all ECC 
errors from 'this' cpu core"}, #define PME_MONT_SI_WDQ_ECC_ERRORS_DBL_EITHER 556 { "SI_WDQ_ECC_ERRORS_DBL_EITHER", {0x410af}, 0xfff0, 2, {0xffff0000}, "Write Data Queue ECC Errors -- count double-bit ECC errors from either cpu core"}, #define PME_MONT_SI_WDQ_ECC_ERRORS_DBL_SELF 557 { "SI_WDQ_ECC_ERRORS_DBL_SELF", {0x510af}, 0xfff0, 2, {0xffff0000}, "Write Data Queue ECC Errors -- count double-bit ECC errors from 'this' cpu core"}, #define PME_MONT_SI_WDQ_ECC_ERRORS_SGL_EITHER 558 { "SI_WDQ_ECC_ERRORS_SGL_EITHER", {0x10af}, 0xfff0, 2, {0xffff0000}, "Write Data Queue ECC Errors -- count single-bit ECC errors from either cpu core"}, #define PME_MONT_SI_WDQ_ECC_ERRORS_SGL_SELF 559 { "SI_WDQ_ECC_ERRORS_SGL_SELF", {0x110af}, 0xfff0, 2, {0xffff0000}, "Write Data Queue ECC Errors -- count single-bit ECC errors from 'this' cpu core"}, #define PME_MONT_SI_WRITEQ_INSERTS_ALL_EITHER 560 { "SI_WRITEQ_INSERTS_ALL_EITHER", {0x18a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_ALL_SELF 561 { "SI_WRITEQ_INSERTS_ALL_SELF", {0x118a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_EWB_EITHER 562 { "SI_WRITEQ_INSERTS_EWB_EITHER", {0x418a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_EWB_SELF 563 { "SI_WRITEQ_INSERTS_EWB_SELF", {0x518a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_IWB_EITHER 564 { "SI_WRITEQ_INSERTS_IWB_EITHER", {0x218a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_IWB_SELF 565 { "SI_WRITEQ_INSERTS_IWB_SELF", {0x318a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_NEWB_EITHER 566 { "SI_WRITEQ_INSERTS_NEWB_EITHER", {0xc18a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_NEWB_SELF 567 { "SI_WRITEQ_INSERTS_NEWB_SELF", {0xd18a1}, 0xfff0, 2, 
{0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_WC16_EITHER 568 { "SI_WRITEQ_INSERTS_WC16_EITHER", {0x818a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_WC16_SELF 569 { "SI_WRITEQ_INSERTS_WC16_SELF", {0x918a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_WC1_8A_EITHER 570 { "SI_WRITEQ_INSERTS_WC1_8A_EITHER", {0x618a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_WC1_8A_SELF 571 { "SI_WRITEQ_INSERTS_WC1_8A_SELF", {0x718a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_WC1_8B_EITHER 572 { "SI_WRITEQ_INSERTS_WC1_8B_EITHER", {0xe18a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_WC1_8B_SELF 573 { "SI_WRITEQ_INSERTS_WC1_8B_SELF", {0xf18a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_WC32_EITHER 574 { "SI_WRITEQ_INSERTS_WC32_EITHER", {0xa18a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_WC32_SELF 575 { "SI_WRITEQ_INSERTS_WC32_SELF", {0xb18a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_LIVE_REQ_HI_EITHER 576 { "SI_WRITEQ_LIVE_REQ_HI_EITHER", {0x10a3}, 0xfff0, 1, {0xffff0000}, "Write Queue Requests (upper bit) -- transactions initiated by either cpu core"}, #define PME_MONT_SI_WRITEQ_LIVE_REQ_HI_SELF 577 { "SI_WRITEQ_LIVE_REQ_HI_SELF", {0x110a3}, 0xfff0, 1, {0xffff0000}, "Write Queue Requests (upper bit) -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_WRITEQ_LIVE_REQ_LO_EITHER 578 { "SI_WRITEQ_LIVE_REQ_LO_EITHER", {0x10a2}, 0xfff0, 7, {0xffff0000}, "Write Queue Requests (lower three bits) -- transactions initiated by either cpu core"}, #define PME_MONT_SI_WRITEQ_LIVE_REQ_LO_SELF 579 { "SI_WRITEQ_LIVE_REQ_LO_SELF", {0x110a2}, 0xfff0, 7, {0xffff0000}, "Write 
Queue Requests (lower three bits) -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SPEC_LOADS_NATTED_ALL 580 { "SPEC_LOADS_NATTED_ALL", {0xd9}, 0xfff0, 2, {0xffff0005}, "Number of speculative integer loads that are NaTd -- Count all NaT'd loads"}, #define PME_MONT_SPEC_LOADS_NATTED_DEF_PSR_ED 581 { "SPEC_LOADS_NATTED_DEF_PSR_ED", {0x500d9}, 0xfff0, 2, {0xffff0005}, "Number of speculative integer loads that are NaTd -- Only loads NaT'd due to effect of PSR.ed"}, #define PME_MONT_SPEC_LOADS_NATTED_DEF_TLB_FAULT 582 { "SPEC_LOADS_NATTED_DEF_TLB_FAULT", {0x300d9}, 0xfff0, 2, {0xffff0005}, "Number of speculative integer loads that are NaTd -- Only loads NaT'd due to deferred TLB faults"}, #define PME_MONT_SPEC_LOADS_NATTED_DEF_TLB_MISS 583 { "SPEC_LOADS_NATTED_DEF_TLB_MISS", {0x200d9}, 0xfff0, 2, {0xffff0005}, "Number of speculative integer loads that are NaTd -- Only loads NaT'd due to deferred TLB misses"}, #define PME_MONT_SPEC_LOADS_NATTED_NAT_CNSM 584 { "SPEC_LOADS_NATTED_NAT_CNSM", {0x400d9}, 0xfff0, 2, {0xffff0005}, "Number of speculative integer loads that are NaTd -- Only loads NaT'd due to NaT consumption"}, #define PME_MONT_SPEC_LOADS_NATTED_VHPT_MISS 585 { "SPEC_LOADS_NATTED_VHPT_MISS", {0x100d9}, 0xfff0, 2, {0xffff0005}, "Number of speculative integer loads that are NaTd -- Only loads NaT'd due to VHPT miss"}, #define PME_MONT_STORES_RETIRED 586 { "STORES_RETIRED", {0xd1}, 0xfff0, 2, {0x5410007}, "Retired Stores"}, #define PME_MONT_SYLL_NOT_DISPERSED_ALL 587 { "SYLL_NOT_DISPERSED_ALL", {0xf004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Counts all syllables not dispersed. NOTE: Any combination of b0000-b1111 is valid."}, #define PME_MONT_SYLL_NOT_DISPERSED_EXPL 588 { "SYLL_NOT_DISPERSED_EXPL", {0x1004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit stop bits. These consist of programmer specified architected S-bit and templates 1 and 5. 
Dispersal takes a 6-syllable (3-syllable) hit for every template 1/5 in bundle 0(1). Dispersal takes a 3-syllable (0 syllable) hit for every S-bit in bundle 0(1)"}, #define PME_MONT_SYLL_NOT_DISPERSED_EXPL_OR_FE 589 { "SYLL_NOT_DISPERSED_EXPL_OR_FE", {0x5004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit stop bits or front-end not providing valid bundles or providing valid illegal templates."}, #define PME_MONT_SYLL_NOT_DISPERSED_EXPL_OR_FE_OR_MLX 590 { "SYLL_NOT_DISPERSED_EXPL_OR_FE_OR_MLX", {0xd004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit stop bits or due to front-end not providing valid bundles or providing valid illegal templates or due to MLX bundle and resteers to non-0 syllable."}, #define PME_MONT_SYLL_NOT_DISPERSED_EXPL_OR_IMPL 591 { "SYLL_NOT_DISPERSED_EXPL_OR_IMPL", {0x3004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit/implicit stop bits."}, #define PME_MONT_SYLL_NOT_DISPERSED_EXPL_OR_IMPL_OR_FE 592 { "SYLL_NOT_DISPERSED_EXPL_OR_IMPL_OR_FE", {0x7004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit or implicit stop bits or due to front-end not providing valid bundles or providing valid illegal template."}, #define PME_MONT_SYLL_NOT_DISPERSED_EXPL_OR_IMPL_OR_MLX 593 { "SYLL_NOT_DISPERSED_EXPL_OR_IMPL_OR_MLX", {0xb004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit or implicit stop bits or due to MLX bundle and resteers to non-0 syllable."}, #define PME_MONT_SYLL_NOT_DISPERSED_EXPL_OR_MLX 594 { "SYLL_NOT_DISPERSED_EXPL_OR_MLX", {0x9004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit stop bits or to MLX bundle and resteers to non-0 syllable."}, #define PME_MONT_SYLL_NOT_DISPERSED_FE 595 { "SYLL_NOT_DISPERSED_FE", {0x4004e}, 0xfff0, 
5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to front-end not providing valid bundles or providing valid illegal templates. Dispersal takes a 3-syllable hit for every invalid bundle or valid illegal template from front-end. Bundle 1 with front-end fault, is counted here (3-syllable hit)."}, #define PME_MONT_SYLL_NOT_DISPERSED_FE_OR_MLX 596 { "SYLL_NOT_DISPERSED_FE_OR_MLX", {0xc004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to MLX bundle and resteers to non-0 syllable or due to front-end not providing valid bundles or providing valid illegal templates."}, #define PME_MONT_SYLL_NOT_DISPERSED_IMPL 597 { "SYLL_NOT_DISPERSED_IMPL", {0x2004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to implicit stop bits. These consist of all of the non-architected stop bits (asymmetry, oversubscription, implicit). Dispersal takes a 6-syllable (3-syllable) hit for every implicit stop bit in bundle 0(1)."}, #define PME_MONT_SYLL_NOT_DISPERSED_IMPL_OR_FE 598 { "SYLL_NOT_DISPERSED_IMPL_OR_FE", {0x6004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to implicit stop bits or to front-end not providing valid bundles or providing valid illegal templates."}, #define PME_MONT_SYLL_NOT_DISPERSED_IMPL_OR_FE_OR_MLX 599 { "SYLL_NOT_DISPERSED_IMPL_OR_FE_OR_MLX", {0xe004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to implicit stop bits or due to front-end not providing valid bundles or providing valid illegal templates or due to MLX bundle and resteers to non-0 syllable."}, #define PME_MONT_SYLL_NOT_DISPERSED_IMPL_OR_MLX 600 { "SYLL_NOT_DISPERSED_IMPL_OR_MLX", {0xa004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to implicit stop bits or to MLX bundle and resteers to non-0 syllable."}, #define PME_MONT_SYLL_NOT_DISPERSED_MLX 601 { 
"SYLL_NOT_DISPERSED_MLX", {0x8004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to MLX bundle and resteers to non-0 syllable. Dispersal takes a 1 syllable hit for each MLX bundle. Dispersal could take 0-2 syllable hit depending on which syllable we resteer to. Bundle 1 with front-end fault which is split, is counted here (0-2 syllable hit)."}, #define PME_MONT_SYLL_OVERCOUNT_ALL 602 { "SYLL_OVERCOUNT_ALL", {0x3004f}, 0xfff0, 2, {0xffff0001}, "Syllables Overcounted -- syllables overcounted in implicit & explicit bucket"}, #define PME_MONT_SYLL_OVERCOUNT_EXPL 603 { "SYLL_OVERCOUNT_EXPL", {0x1004f}, 0xfff0, 2, {0xffff0001}, "Syllables Overcounted -- Only syllables overcounted in the explicit bucket"}, #define PME_MONT_SYLL_OVERCOUNT_IMPL 604 { "SYLL_OVERCOUNT_IMPL", {0x2004f}, 0xfff0, 2, {0xffff0001}, "Syllables Overcounted -- Only syllables overcounted in the implicit bucket"}, #define PME_MONT_THREAD_SWITCH_CYCLE_ALL_GATED 605 { "THREAD_SWITCH_CYCLE_ALL_GATED", {0x6000e}, 0xfff0, 1, {0xffff0000}, "Thread switch overhead cycles. -- Cycles TSs are gated due to any reason"}, #define PME_MONT_THREAD_SWITCH_CYCLE_ANYSTALL 606 { "THREAD_SWITCH_CYCLE_ANYSTALL", {0x3000e}, 0xfff0, 1, {0xffff0000}, "Thread switch overhead cycles. -- Cycles TSs are stalled due to any reason"}, #define PME_MONT_THREAD_SWITCH_CYCLE_CRAB 607 { "THREAD_SWITCH_CYCLE_CRAB", {0x1000e}, 0xfff0, 1, {0xffff0000}, "Thread switch overhead cycles. -- Cycles TSs are stalled due to CRAB operation"}, #define PME_MONT_THREAD_SWITCH_CYCLE_L2D 608 { "THREAD_SWITCH_CYCLE_L2D", {0x2000e}, 0xfff0, 1, {0xffff0000}, "Thread switch overhead cycles. -- Cycles TSs are stalled due to L2D return operation"}, #define PME_MONT_THREAD_SWITCH_CYCLE_PCR 609 { "THREAD_SWITCH_CYCLE_PCR", {0x4000e}, 0xfff0, 1, {0xffff0000}, "Thread switch overhead cycles. 
-- Cycles we run with PCR.sd set"}, #define PME_MONT_THREAD_SWITCH_CYCLE_TOTAL 610 { "THREAD_SWITCH_CYCLE_TOTAL", {0x7000e}, 0xfff0, 1, {0xffff0000}, "Thread switch overhead cycles. -- Total time from TS opportunity is seized to TS happens."}, #define PME_MONT_THREAD_SWITCH_EVENTS_ALL 611 { "THREAD_SWITCH_EVENTS_ALL", {0x7000c}, 0xfff0, 1, {0xffff0000}, "Thread switch events. -- All taken TSs"}, #define PME_MONT_THREAD_SWITCH_EVENTS_DBG 612 { "THREAD_SWITCH_EVENTS_DBG", {0x5000c}, 0xfff0, 1, {0xffff0000}, "Thread switch events. -- TSs due to debug operations"}, #define PME_MONT_THREAD_SWITCH_EVENTS_HINT 613 { "THREAD_SWITCH_EVENTS_HINT", {0x3000c}, 0xfff0, 1, {0xffff0000}, "Thread switch events. -- TSs due to hint instruction"}, #define PME_MONT_THREAD_SWITCH_EVENTS_L3MISS 614 { "THREAD_SWITCH_EVENTS_L3MISS", {0x1000c}, 0xfff0, 1, {0xffff0000}, "Thread switch events. -- TSs due to L3 miss"}, #define PME_MONT_THREAD_SWITCH_EVENTS_LP 615 { "THREAD_SWITCH_EVENTS_LP", {0x4000c}, 0xfff0, 1, {0xffff0000}, "Thread switch events. -- TSs due to low power operation"}, #define PME_MONT_THREAD_SWITCH_EVENTS_MISSED 616 { "THREAD_SWITCH_EVENTS_MISSED", {0xc}, 0xfff0, 1, {0xffff0000}, "Thread switch events. -- TS opportunities missed"}, #define PME_MONT_THREAD_SWITCH_EVENTS_TIMER 617 { "THREAD_SWITCH_EVENTS_TIMER", {0x2000c}, 0xfff0, 1, {0xffff0000}, "Thread switch events. 
-- TSs due to time out"}, #define PME_MONT_THREAD_SWITCH_GATED_ALL 618 { "THREAD_SWITCH_GATED_ALL", {0x7000d}, 0xfff0, 1, {0xffff0000}, "Thread switches gated -- TSs gated for any reason"}, #define PME_MONT_THREAD_SWITCH_GATED_FWDPRO 619 { "THREAD_SWITCH_GATED_FWDPRO", {0x5000d}, 0xfff0, 1, {0xffff0000}, "Thread switches gated -- Gated due to forward progress reasons"}, #define PME_MONT_THREAD_SWITCH_GATED_LP 620 { "THREAD_SWITCH_GATED_LP", {0x1000d}, 0xfff0, 1, {0xffff0000}, "Thread switches gated -- TSs gated due to LP"}, #define PME_MONT_THREAD_SWITCH_GATED_PIPE 621 { "THREAD_SWITCH_GATED_PIPE", {0x4000d}, 0xfff0, 1, {0xffff0000}, "Thread switches gated -- Gated due to pipeline operations"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_1024 622 { "THREAD_SWITCH_STALL_GTE_1024", {0x8000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 1024 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_128 623 { "THREAD_SWITCH_STALL_GTE_128", {0x5000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 128 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_16 624 { "THREAD_SWITCH_STALL_GTE_16", {0x2000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 16 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_2048 625 { "THREAD_SWITCH_STALL_GTE_2048", {0x9000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 2048 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_256 626 { "THREAD_SWITCH_STALL_GTE_256", {0x6000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 256 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_32 627 { "THREAD_SWITCH_STALL_GTE_32", {0x3000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 32 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_4 628 { "THREAD_SWITCH_STALL_GTE_4", {0xf}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 4 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_4096 629 { 
"THREAD_SWITCH_STALL_GTE_4096", {0xa000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 4096 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_512 630 { "THREAD_SWITCH_STALL_GTE_512", {0x7000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 512 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_64 631 { "THREAD_SWITCH_STALL_GTE_64", {0x4000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 64 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_8 632 { "THREAD_SWITCH_STALL_GTE_8", {0x1000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 8 cycles"}, #define PME_MONT_UC_LOADS_RETIRED 633 { "UC_LOADS_RETIRED", {0xcf}, 0xfff0, 4, {0x5310007}, "Retired Uncacheable Loads"}, #define PME_MONT_UC_STORES_RETIRED 634 { "UC_STORES_RETIRED", {0xd0}, 0xfff0, 2, {0x5410007}, "Retired Uncacheable Stores"}, #define PME_MONT_IA64_INST_RETIRED 635 { "IA64_INST_RETIRED", {0x8}, 0xfff0, 6, {0xffff0003}, "Retired IA-64 Instructions -- Retired IA-64 Instructions -- Alias to IA64_INST_RETIRED_THIS"}, #define PME_MONT_BRANCH_EVENT 636 { "BRANCH_EVENT", {0x111}, 0xfff0, 1, {0xffff0003}, "Execution Trace Buffer Event Captured. 
Alias to ETB_EVENT"}, }; #define PME_MONT_EVENT_COUNT (sizeof(montecito_pe)/sizeof(pme_mont_entry_t)) papi-papi-7-2-0-t/src/libperfnec/lib/niagara1_events.h000066400000000000000000000022641502707512200225320ustar00rootroot00000000000000static pme_sparc_entry_t niagara1_pe[] = { /* PIC1 Niagara-1 events */ { .pme_name = "Instr_cnt", .pme_desc = "Number of instructions completed", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x0, }, /* PIC0 Niagara-1 events */ { .pme_name = "SB_full", .pme_desc = "Store-buffer full", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x0, }, { .pme_name = "FP_instr_cnt", .pme_desc = "FPU instructions", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x1, }, { .pme_name = "IC_miss", .pme_desc = "I-cache miss", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x2, }, { .pme_name = "DC_miss", .pme_desc = "D-cache miss", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x3, }, { .pme_name = "ITLB_miss", .pme_desc = "I-TLB miss", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x4, }, { .pme_name = "DTLB_miss", .pme_desc = "D-TLB miss", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x5, }, { .pme_name = "L2_imiss", .pme_desc = "E-cache instruction fetch miss", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x6, }, { .pme_name = "L2_dmiss_ld", .pme_desc = "E-cache data load miss", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x7, }, }; #define PME_NIAGARA1_EVENT_COUNT (sizeof(niagara1_pe)/sizeof(pme_sparc_entry_t)) papi-papi-7-2-0-t/src/libperfnec/lib/niagara2_events.h000066400000000000000000000202131502707512200225250ustar00rootroot00000000000000static pme_sparc_mask_entry_t niagara2_pe[] = { /* PIC0 Niagara-2 events */ { .pme_name = "All_strands_idle", .pme_desc = "Cycles when no strand can be picked for the physical core on which the monitoring strand resides.", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x0, .pme_masks = { { .mask_name = "ignored0", .mask_desc = "Ignored", }, { .mask_name = "ignored1", .mask_desc = "Ignored", }, { .mask_name = "ignored2", .mask_desc = "Ignored", }, { .mask_name = "ignored3", .mask_desc = 
"Ignored", }, { .mask_name = "ignored4", .mask_desc = "Ignored", }, { .mask_name = "ignored5", .mask_desc = "Ignored", }, { .mask_name = "ignored6", .mask_desc = "Ignored", }, { .mask_name = "ignored7", .mask_desc = "Ignored", }, }, }, { .pme_name = "Instr_cnt", .pme_desc = "Number of instructions completed", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x2, .pme_masks = { { .mask_name = "branches", .mask_desc = "Completed branches", }, { .mask_name = "taken_branches", .mask_desc = "Taken branches, which are always mispredicted", }, { .mask_name = "FGU_arith", .mask_desc = "All FADD, FSUB, FCMP, convert, FMUL, FDIV, FNEG, FABS, FSQRT, FMOV, FPADD, FPSUB, FPACK, FEXPAND, FPMERGE, FMUL8, FMULD8, FALIGNDATA, BSHUFFLE, FZERO, FONE, FSRC, FNOT1, FNOT2, FOR, FNOR, FAND, FNAND, FXOR, FXNOR, FORNOT1, FORNOT2, FANDNOT1, FANDNOT2, PDIST, SIAM", }, { .mask_name = "Loads", .mask_desc = "Load instructions", }, { .mask_name = "Stores", .mask_desc = "Store instructions", }, { .mask_name = "SW_count", .mask_desc = "Software count 'sethi %hi(fc00), %g0' instructions", }, { .mask_name = "other", .mask_desc = "Instructions not covered by other mask bits", }, { .mask_name = "atomics", .mask_desc = "Atomics are LDSTUB/A, CASA/XA, SWAP/A", }, }, }, { .pme_name = "cache", .pme_desc = "Cache events", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x3, .pme_masks = { { .mask_name = "IC_miss", .mask_desc = "I-cache misses. This counts only primary instruction cache misses, and does not count duplicate instruction cache misses. Also, only 'true' misses are counted. If a thread encounters an I$ miss, but the thread is redirected (due to a branch misprediction or trap, for example) before the line returns from L2 and is loaded into the I$, then the miss is not counted.", }, { .mask_name = "DC_miss", .mask_desc = "D-cache misses. 
This counts both primary and duplicate data cache misses.", }, { .mask_name = "ignored0", .mask_desc = "Ignored", }, { .mask_name = "ignored1", .mask_desc = "Ignored", }, { .mask_name = "L2IC_miss", .mask_desc = "L2 cache instruction misses", }, { .mask_name = "L2LD_miss", .mask_desc = "L2 cache load misses. Block loads are treated as one L2 miss event. In reality, each individual load can hit or miss in the L2 since the block load is not atomic.", }, { .mask_name = "ignored2", .mask_desc = "Ignored", }, { .mask_name = "ignored3", .mask_desc = "Ignored", }, }, }, { .pme_name = "TLB", .pme_desc = "TLB events", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x4, .pme_masks = { { .mask_name = "ignored0", .mask_desc = "Ignored", }, { .mask_name = "ignored1", .mask_desc = "Ignored", }, { .mask_name = "ITLB_L2ref", .mask_desc = "ITLB references to L2. For each ITLB miss with hardware tablewalk enabled, count each access the ITLB hardware tablewalk makes to L2.", }, { .mask_name = "DTLB_L2ref", .mask_desc = "DTLB references to L2. For each DTLB miss with hardware tablewalk enabled, count each access the DTLB hardware tablewalk makes to L2.", }, { .mask_name = "ITLB_L2miss", .mask_desc = "For each ITLB miss with hardware tablewalk enabled, count each access the ITLB hardware tablewalk makes to L2 which misses in L2. Note: Depending upon the hardware table walk configuration, each ITLB miss may issue from 1 to 4 requests to L2 to search TSBs.", }, { .mask_name = "DTLB_L2miss", .mask_desc = "For each DTLB miss with hardware tablewalk enabled, count each access the DTLB hardware tablewalk makes to L2 which misses in L2. 
Note: Depending upon the hardware table walk configuration, each DTLB miss may issue from 1 to 4 requests to L2 to search TSBs.", }, { .mask_name = "ignored2", .mask_desc = "Ignored", }, { .mask_name = "ignored3", .mask_desc = "Ignored", }, }, }, { .pme_name = "mem", .pme_desc = "Memory operations", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x5, .pme_masks = { { .mask_name = "stream_load", .mask_desc = "Stream Unit load operations to L2", }, { .mask_name = "stream_store", .mask_desc = "Stream Unit store operations to L2", }, { .mask_name = "cpu_load", .mask_desc = "CPU loads to L2", }, { .mask_name = "cpu_ifetch", .mask_desc = "CPU instruction fetches to L2", }, { .mask_name = "ignored0", .mask_desc = "Ignored", }, { .mask_name = "ignored1", .mask_desc = "Ignored", }, { .mask_name = "cpu_store", .mask_desc = "CPU stores to L2", }, { .mask_name = "mmu_load", .mask_desc = "MMU loads to L2", }, }, }, { .pme_name = "spu_ops", .pme_desc = "Stream Unit operations. User, supervisor, and hypervisor counting must all be enabled to properly count these events.", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x6, .pme_masks = { { .mask_name = "DES", .mask_desc = "Increment for each CWQ or ASI operation that uses DES/3DES unit", }, { .mask_name = "AES", .mask_desc = "Increment for each CWQ or ASI operation that uses AES unit", }, { .mask_name = "RC4", .mask_desc = "Increment for each CWQ or ASI operation that uses RC4 unit", }, { .mask_name = "HASH", .mask_desc = "Increment for each CWQ or ASI operation that uses MD5/SHA-1/SHA-256 unit", }, { .mask_name = "MA", .mask_desc = "Increment for each CWQ or ASI modular arithmetic operation", }, { .mask_name = "CSUM", .mask_desc = "Increment for each iSCSI CRC or TCP/IP checksum operation", }, { .mask_name = "ignored0", .mask_desc = "Ignored", }, { .mask_name = "ignored1", .mask_desc = "Ignored", }, }, }, { .pme_name = "spu_busy", .pme_desc = "Stream Unit busy cycles. 
User, supervisor, and hypervisor counting must all be enabled to properly count these events.", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x07, .pme_masks = { { .mask_name = "DES", .mask_desc = "Cycles the DES/3DES unit is busy", }, { .mask_name = "AES", .mask_desc = "Cycles the AES unit is busy", }, { .mask_name = "RC4", .mask_desc = "Cycles the RC4 unit is busy", }, { .mask_name = "HASH", .mask_desc = "Cycles the MD5/SHA-1/SHA-256 unit is busy", }, { .mask_name = "MA", .mask_desc = "Cycles the modular arithmetic unit is busy", }, { .mask_name = "CSUM", .mask_desc = "Cycles the CRC/MPA/checksum unit is busy", }, { .mask_name = "ignored0", .mask_desc = "Ignored", }, { .mask_name = "ignored1", .mask_desc = "Ignored", }, }, }, { .pme_name = "tlb_miss", .pme_desc = "TLB misses", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0xb, .pme_masks = { { .mask_name = "ignored0", .mask_desc = "Ignored", }, { .mask_name = "ignored1", .mask_desc = "Ignored", }, { .mask_name = "ITLB", .mask_desc = "I-TLB misses", }, { .mask_name = "DTLB", .mask_desc = "D-TLB misses", }, { .mask_name = "ignored2", .mask_desc = "Ignored", }, { .mask_name = "ignored3", .mask_desc = "Ignored", }, { .mask_name = "ignored4", .mask_desc = "Ignored", }, { .mask_name = "ignored5", .mask_desc = "Ignored", }, }, }, }; #define PME_NIAGARA2_EVENT_COUNT (sizeof(niagara2_pe)/sizeof(pme_sparc_mask_entry_t)) papi-papi-7-2-0-t/src/libperfnec/lib/pentium4_events.h000066400000000000000000001336071502707512200226220ustar00rootroot00000000000000/* * Copyright (c) 2006 IBM Corp. 
* Contributed by Kevin Corry * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS * IN THE SOFTWARE. * * pentium4_events.h * * This header contains arrays to describe the Event-Selection-Control * Registers (ESCRs), Counter-Configuration-Control Registers (CCCRs), * and countable events on Pentium4/Xeon/EM64T systems. * * For more details, see: * - IA-32 Intel Architecture Software Developer's Manual, * Volume 3B: System Programming Guide, Part 2 * (available at: http://www.intel.com/design/Pentium4/manuals/253669.htm) * - Chapter 18.10: Performance Monitoring Overview * - Chapter 18.13: Performance Monitoring - Pentium4 and Xeon Processors * - Chapter 18.14: Performance Monitoring and Hyper-Threading Technology * - Appendix A.1: Pentium4 and Xeon Processor Performance-Monitoring Events * * This header also contains an array to describe how the Perfmon PMCs map to * the ESCRs and CCCRs. 
*/ #ifndef _PENTIUM4_EVENTS_H_ #define _PENTIUM4_EVENTS_H_ /** * pentium4_escrs * * Array of event-selection-control registers that are available * on Pentium4. **/ pentium4_escr_reg_t pentium4_escrs[] = { {.name = "BPU_ESCR0", .pmc = 0, .allowed_cccrs = { 0, 9, -1, }, }, {.name = "IS_ESCR0", .pmc = 1, .allowed_cccrs = { 0, 9, -1, }, }, {.name = "MOB_ESCR0", .pmc = 2, .allowed_cccrs = { 0, 9, -1, }, }, {.name = "ITLB_ESCR0", .pmc = 3, .allowed_cccrs = { 0, 9, -1, }, }, {.name = "PMH_ESCR0", .pmc = 4, .allowed_cccrs = { 0, 9, -1, }, }, {.name = "IX_ESCR0", .pmc = 5, .allowed_cccrs = { 0, 9, -1, }, }, {.name = "FSB_ESCR0", .pmc = 6, .allowed_cccrs = { 0, 9, -1, }, }, {.name = "BSU_ESCR0", .pmc = 7, .allowed_cccrs = { 0, 9, -1, }, }, {.name = "MS_ESCR0", .pmc = 8, .allowed_cccrs = { 2, 11, -1, }, }, {.name = "TC_ESCR0", .pmc = 9, .allowed_cccrs = { 2, 11, -1, }, }, {.name = "TBPU_ESCR0", .pmc = 10, .allowed_cccrs = { 2, 11, -1, }, }, {.name = "FLAME_ESCR0", .pmc = 11, .allowed_cccrs = { 4, 13, -1, }, }, {.name = "FIRM_ESCR0", .pmc = 12, .allowed_cccrs = { 4, 13, -1, }, }, {.name = "SAAT_ESCR0", .pmc = 13, .allowed_cccrs = { 4, 13, -1, }, }, {.name = "U2L_ESCR0", .pmc = 14, .allowed_cccrs = { 4, 13, -1, }, }, {.name = "DAC_ESCR0", .pmc = 15, .allowed_cccrs = { 4, 13, -1, }, }, {.name = "IQ_ESCR0", .pmc = 16, .allowed_cccrs = { 6, 8, 15, }, }, {.name = "ALF_ESCR0", .pmc = 17, .allowed_cccrs = { 6, 8, 15, }, }, {.name = "RAT_ESCR0", .pmc = 18, .allowed_cccrs = { 6, 8, 15, }, }, {.name = "SSU_ESCR0", .pmc = 19, .allowed_cccrs = { 6, 8, 15, }, }, {.name = "CRU_ESCR0", .pmc = 20, .allowed_cccrs = { 6, 8, 15, }, }, {.name = "CRU_ESCR2", .pmc = 21, .allowed_cccrs = { 6, 8, 15, }, }, {.name = "CRU_ESCR4", .pmc = 22, .allowed_cccrs = { 6, 8, 15, }, }, {.name = "BPU_ESCR1", .pmc = 32, .allowed_cccrs = { 1, 10, -1, }, }, {.name = "IS_ESCR1", .pmc = 33, .allowed_cccrs = { 1, 10, -1, }, }, {.name = "MOB_ESCR1", .pmc = 34, .allowed_cccrs = { 1, 10, -1, }, }, {.name = "ITLB_ESCR1", 
.pmc = 35, .allowed_cccrs = { 1, 10, -1, }, }, {.name = "PMH_ESCR1", .pmc = 36, .allowed_cccrs = { 1, 10, -1, }, }, {.name = "IX_ESCR1", .pmc = 37, .allowed_cccrs = { 1, 10, -1, }, }, {.name = "FSB_ESCR1", .pmc = 38, .allowed_cccrs = { 1, 10, -1, }, }, {.name = "BSU_ESCR1", .pmc = 39, .allowed_cccrs = { 1, 10, -1, }, }, {.name = "MS_ESCR1", .pmc = 40, .allowed_cccrs = { 3, 12, -1, }, }, {.name = "TC_ESCR1", .pmc = 41, .allowed_cccrs = { 3, 12, -1, }, }, {.name = "TBPU_ESCR1", .pmc = 42, .allowed_cccrs = { 3, 12, -1, }, }, {.name = "FLAME_ESCR1", .pmc = 43, .allowed_cccrs = { 5, 14, -1, }, }, {.name = "FIRM_ESCR1", .pmc = 44, .allowed_cccrs = { 5, 14, -1, }, }, {.name = "SAAT_ESCR1", .pmc = 45, .allowed_cccrs = { 5, 14, -1, }, }, {.name = "U2L_ESCR1", .pmc = 46, .allowed_cccrs = { 5, 14, -1, }, }, {.name = "DAC_ESCR1", .pmc = 47, .allowed_cccrs = { 5, 14, -1, }, }, {.name = "IQ_ESCR1", .pmc = 48, .allowed_cccrs = { 7, 16, 17, }, }, {.name = "ALF_ESCR1", .pmc = 49, .allowed_cccrs = { 7, 16, 17, }, }, {.name = "RAT_ESCR1", .pmc = 50, .allowed_cccrs = { 7, 16, 17, }, }, {.name = "CRU_ESCR1", .pmc = 51, .allowed_cccrs = { 7, 16, 17, }, }, {.name = "CRU_ESCR3", .pmc = 52, .allowed_cccrs = { 7, 16, 17, }, }, {.name = "CRU_ESCR5", .pmc = 53, .allowed_cccrs = { 7, 16, 17, }, }, }; #define PENTIUM4_NUM_ESCRS (sizeof(pentium4_escrs)/sizeof(pentium4_escrs[0])) /** * pentium4_cccrs * * Array of counter-configuration-control registers that are available * on Pentium4. 
**/ pentium4_cccr_reg_t pentium4_cccrs[] = { {.name = "BPU_CCCR0", .pmc = 23, .pmd = 0, .allowed_escrs = { 0, 1, 2, 3, 4, 5, 6, 7 }, }, {.name = "BPU_CCCR2", .pmc = 54, .pmd = 9, .allowed_escrs = { 23, 24, 25, 26, 27, 28, 29, 30 }, }, {.name = "MS_CCCR0", .pmc = 25, .pmd = 2, .allowed_escrs = { 8, 9, 10, -1, -1, -1, -1, -1, }, }, {.name = "MS_CCCR2", .pmc = 56, .pmd = 11, .allowed_escrs = { 31, 32, 33, -1, -1, -1, -1, -1, }, }, {.name = "FLAME_CCCR0", .pmc = 27, .pmd = 4, .allowed_escrs = { 11, 12, 13, 14, -1, 15, -1, -1 }, }, {.name = "FLAME_CCCR2", .pmc = 58, .pmd = 13, .allowed_escrs = { 34, 35, 36, 37, -1, 38, -1, -1 }, }, {.name = "IQ_CCCR0", .pmc = 29, .pmd = 6, .allowed_escrs = { 16, 17, 18, 19, 20, 21, 22, -1 }, }, {.name = "IQ_CCCR2", .pmc = 60, .pmd = 15, .allowed_escrs = { 39, 40, 41, -1, 42, 43, 44, -1 }, }, {.name = "IQ_CCCR4", .pmc = 31, .pmd = 8, .allowed_escrs = { 16, 17, 18, 19, 20, 21, 22, -1 }, }, {.name = "BPU_CCCR1", .pmc = 24, .pmd = 1, .allowed_escrs = { 0, 1, 2, 3, 4, 5, 6, 7 }, }, {.name = "BPU_CCCR3", .pmc = 55, .pmd = 10, .allowed_escrs = { 23, 24, 25, 26, 27, 28, 29, 30 }, }, {.name = "MS_CCCR1", .pmc = 26, .pmd = 3, .allowed_escrs = { 8, 9, 10, -1, -1, -1, -1, -1, }, }, {.name = "MS_CCCR3", .pmc = 57, .pmd = 12, .allowed_escrs = { 31, 32, 33, -1, -1, -1, -1, -1, }, }, {.name = "FLAME_CCCR1", .pmc = 28, .pmd = 5, .allowed_escrs = { 11, 12, 13, 14, -1, 15, -1, -1 }, }, {.name = "FLAME_CCCR3", .pmc = 59, .pmd = 14, .allowed_escrs = { 34, 35, 36, 37, -1, 38, -1, -1 }, }, {.name = "IQ_CCCR1", .pmc = 30, .pmd = 7, .allowed_escrs = { 16, 17, 18, 19, 20, 21, 22, -1 }, }, {.name = "IQ_CCCR3", .pmc = 61, .pmd = 16, .allowed_escrs = { 39, 40, 41, -1, 42, 43, 44, -1 }, }, {.name = "IQ_CCCR5", .pmc = 62, .pmd = 17, .allowed_escrs = { 39, 40, 41, -1, 42, 43, 44, -1 }, }, }; #define PENTIUM4_NUM_CCCRS (sizeof(pentium4_cccrs)/sizeof(pentium4_cccrs[0])) #define PENTIUM4_NUM_PMCS (PENTIUM4_NUM_CCCRS + PENTIUM4_NUM_ESCRS) #define PENTIUM4_NUM_PMDS 
PENTIUM4_NUM_CCCRS #define PENTIUM4_COUNTER_WIDTH 40 /** * pentium4_pmcs * * Array of PMCs on the Pentium4, showing how they map to the ESCRs and CCCRs. **/ pentium4_pmc_t pentium4_pmcs[PENTIUM4_NUM_PMCS] = { {.name = "BPU_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 0, }, {.name = "IS_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 1, }, {.name = "MOB_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 2, }, {.name = "ITLB_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 3, }, {.name = "PMH_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 4, }, {.name = "IX_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 5, }, {.name = "FSB_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 6, }, {.name = "BSU_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 7, }, {.name = "MS_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 8, }, {.name = "TC_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 9, }, {.name = "TBPU_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 10, }, {.name = "FLAME_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 11, }, {.name = "FIRM_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 12, }, {.name = "SAAT_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 13, }, {.name = "U2L_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 14, }, {.name = "DAC_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 15, }, {.name = "IQ_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 16, }, {.name = "ALF_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 17, }, {.name = "RAT_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 18, }, {.name = "SSU_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 19, }, {.name = "CRU_ESCR0", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 20, }, {.name = "CRU_ESCR2", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 21, }, {.name = "CRU_ESCR4", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 22, }, {.name = "BPU_CCCR0", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 0, }, {.name = "BPU_CCCR2", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 1, }, {.name = "MS_CCCR0", .type = 
PENTIUM4_PMC_TYPE_CCCR, .index = 2, }, {.name = "MS_CCCR2", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 3, }, {.name = "FLAME_CCCR0", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 4, }, {.name = "FLAME_CCCR2", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 5, }, {.name = "IQ_CCCR0", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 6, }, {.name = "IQ_CCCR2", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 7, }, {.name = "IQ_CCCR4", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 8, }, {.name = "BPU_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 23, }, {.name = "IS_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 24, }, {.name = "MOB_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 25, }, {.name = "ITLB_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 26, }, {.name = "PMH_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 27, }, {.name = "IX_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 28, }, {.name = "FSB_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 29, }, {.name = "BSU_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 30, }, {.name = "MS_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 31, }, {.name = "TC_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 32, }, {.name = "TBPU_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 33, }, {.name = "FLAME_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 34, }, {.name = "FIRM_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 35, }, {.name = "SAAT_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 36, }, {.name = "U2L_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 37, }, {.name = "DAC_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 38, }, {.name = "IQ_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 39, }, {.name = "ALF_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 40, }, {.name = "RAT_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 41, }, {.name = "CRU_ESCR1", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 42, }, {.name = "CRU_ESCR3", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 43, }, {.name = "CRU_ESCR5", .type = PENTIUM4_PMC_TYPE_ESCR, .index = 44, }, 
{.name = "BPU_CCCR1", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 9, }, {.name = "BPU_CCCR3", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 10, }, {.name = "MS_CCCR1", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 11, }, {.name = "MS_CCCR3", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 12, }, {.name = "FLAME_CCCR1", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 13, }, {.name = "FLAME_CCCR3", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 14, }, {.name = "IQ_CCCR1", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 15, }, {.name = "IQ_CCCR3", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 16, }, {.name = "IQ_CCCR5", .type = PENTIUM4_PMC_TYPE_CCCR, .index = 17, }, }; /** * pentium4_events * * Array of events that can be counted on Pentium4. **/ pentium4_event_t pentium4_events[] = { /* 0 */ {.name = "TC_deliver_mode", .desc = "The duration (in clock cycles) of the operating modes of " "the trace cache and decode engine in the processor package.", .event_select = 0x1, .escr_select = 0x1, .allowed_escrs = { 9, 32 }, .event_masks = { {.name = "DD", .desc = "Both logical CPUs in deliver mode.", .bit = 0, }, {.name = "DB", .desc = "Logical CPU 0 in deliver mode and " "logical CPU 1 in build mode.", .bit = 1, }, {.name = "DI", .desc = "Logical CPU 0 in deliver mode and logical CPU 1 " "either halted, under machine clear condition, or " "transitioning to a long microcode flow.", .bit = 2, }, {.name = "BD", .desc = "Logical CPU 0 in build mode and " "logical CPU 1 is in deliver mode.", .bit = 3, }, {.name = "BB", .desc = "Both logical CPUs in build mode.", .bit = 4, }, {.name = "BI", .desc = "Logical CPU 0 in build mode and logical CPU 1 " "either halted, under machine clear condition, or " "transitioning to a long microcode flow.", .bit = 5, }, {.name = "ID", .desc = "Logical CPU 0 either halted, under machine clear " "condition, or transitioning to a long microcode " "flow, and logical CPU 1 in deliver mode.", .bit = 6, }, {.name = "IB", .desc = "Logical CPU 0 either halted, under machine clear " "condition, or 
transitioning to a long microcode " "flow, and logical CPU 1 in build mode.", .bit = 7, }, }, }, /* 1 */ {.name = "BPU_fetch_request", .desc = "Instruction fetch requests by the Branch Prediction Unit.", .event_select = 0x3, .escr_select = 0x0, .allowed_escrs = { 0, 23 }, .event_masks = { {.name = "TCMISS", .desc = "Trace cache lookup miss.", .bit = 0, }, }, }, /* 2 */ {.name = "ITLB_reference", .desc = "Translations using the Instruction " "Translation Look-Aside Buffer.", .event_select = 0x18, .escr_select = 0x3, .allowed_escrs = { 3, 26 }, .event_masks = { {.name = "HIT", .desc = "ITLB hit.", .bit = 0, }, {.name = "MISS", .desc = "ITLB miss.", .bit = 1, }, {.name = "HIT_UC", .desc = "Uncacheable ITLB hit.", .bit = 2, }, }, }, /* 3 */ {.name = "memory_cancel", .desc = "Canceling of various types of requests in the " "Data cache Address Control unit (DAC).", .event_select = 0x2, .escr_select = 0x5, .allowed_escrs = { 15, 38 }, .event_masks = { {.name = "ST_RB_FULL", .desc = "Replayed because no store request " "buffer is available.", .bit = 2, }, {.name = "64K_CONF", .desc = "Conflicts due to 64K aliasing.", .bit = 3, }, }, }, /* 4 */ {.name = "memory_complete", .desc = "Completions of a load split, store split, " "uncacheable (UC) split, or UC load.", .event_select = 0x8, .escr_select = 0x2, .allowed_escrs = { 13, 36 }, .event_masks = { {.name = "LSC", .desc = "Load split completed, excluding UC/WC loads.", .bit = 0, }, {.name = "SSC", .desc = "Any split stores completed.", .bit = 1, }, }, }, /* 5 */ {.name = "load_port_replay", .desc = "Replayed events at the load port.", .event_select = 0x4, .escr_select = 0x2, .allowed_escrs = { 13, 36 }, .event_masks = { {.name = "SPLIT_LD", .desc = "Split load.", .bit = 1, }, }, }, /* 6 */ {.name = "store_port_replay", .desc = "Replayed events at the store port.", .event_select = 0x5, .escr_select = 0x2, .allowed_escrs = { 13, 36 }, .event_masks = { {.name = "SPLIT_ST", .desc = "Split store.", .bit = 1, }, }, }, /* 7 */ 
{.name = "MOB_load_replay", .desc = "Count of times the memory order buffer (MOB) " "caused a load operation to be replayed.", .event_select = 0x3, .escr_select = 0x2, .allowed_escrs = { 2, 25 }, .event_masks = { {.name = "NO_STA", .desc = "Replayed because of unknown store address.", .bit = 1, }, {.name = "NO_STD", .desc = "Replayed because of unknown store data.", .bit = 3, }, {.name = "PARTIAL_DATA", .desc = "Replayed because of partially overlapped data " "access between the load and store operations.", .bit = 4, }, {.name = "UNALGN_ADDR", .desc = "Replayed because the lower 4 bits of the " "linear address do not match between the " "load and store operations.", .bit = 5, }, }, }, /* 8 */ {.name = "page_walk_type", .desc = "Page walks that the page miss handler (PMH) performs.", .event_select = 0x1, .escr_select = 0x4, .allowed_escrs = { 4, 27 }, .event_masks = { {.name = "DTMISS", .desc = "Page walk for a data TLB miss (load or store)", .bit = 0, }, {.name = "ITMISS", .desc = "Page walk for an instruction TLB miss", .bit = 1, }, }, }, /* 9 */ {.name = "BSQ_cache_reference", .desc = "Cache references (2nd or 3rd level caches) as seen by the " "bus unit. 
Read types include both load and RFO, and write " "types include writebacks and evictions.", .event_select = 0xC, .escr_select = 0x7, .allowed_escrs = { 7, 30 }, .event_masks = { {.name = "RD_2ndL_HITS", .desc = "Read 2nd level cache hit Shared.", .bit = 0, }, {.name = "RD_2ndL_HITE", .desc = "Read 2nd level cache hit Exclusive.", .bit = 1, }, {.name = "RD_2ndL_HITM", .desc = "Read 2nd level cache hit Modified.", .bit = 2, }, {.name = "RD_3rdL_HITS", .desc = "Read 3rd level cache hit Shared.", .bit = 3, }, {.name = "RD_3rdL_HITE", .desc = "Read 3rd level cache hit Exclusive.", .bit = 4, }, {.name = "RD_3rdL_HITM", .desc = "Read 3rd level cache hit Modified.", .bit = 5, }, {.name = "RD_2ndL_MISS", .desc = "Read 2nd level cache miss.", .bit = 8, }, {.name = "RD_3rdL_MISS", .desc = "Read 3rd level cache miss.", .bit = 9, }, {.name = "WR_2ndL_MISS", .desc = "A writeback lookup from DAC misses the 2nd " "level cache (unlikely to happen)", .bit = 10, }, }, }, /* 10 */ {.name = "IOQ_allocation", .desc = "Count of various types of transactions on the bus. A count " "is generated each time a transaction is allocated into the " "IOQ that matches the specified mask bits. An allocated entry " "can be a sector (64 bytes) or a chunk of 8 bytes. Requests " "are counted once per retry. 
All 'TYPE_BIT*' event-masks " "together are treated as a single 5-bit value.", .event_select = 0x3, .escr_select = 0x6, .allowed_escrs = { 6, 29 }, .event_masks = { {.name = "TYPE_BIT0", .desc = "Bus request type (bit 0).", .bit = 0, }, {.name = "TYPE_BIT1", .desc = "Bus request type (bit 1).", .bit = 1, }, {.name = "TYPE_BIT2", .desc = "Bus request type (bit 2).", .bit = 2, }, {.name = "TYPE_BIT3", .desc = "Bus request type (bit 3).", .bit = 3, }, {.name = "TYPE_BIT4", .desc = "Bus request type (bit 4).", .bit = 4, }, {.name = "ALL_READ", .desc = "Count read entries.", .bit = 5, }, {.name = "ALL_WRITE", .desc = "Count write entries.", .bit = 6, }, {.name = "MEM_UC", .desc = "Count UC memory access entries.", .bit = 7, }, {.name = "MEM_WC", .desc = "Count WC memory access entries.", .bit = 8, }, {.name = "MEM_WT", .desc = "Count write-through (WT) memory access entries.", .bit = 9, }, {.name = "MEM_WP", .desc = "Count write-protected (WP) memory access entries.", .bit = 10, }, {.name = "MEM_WB", .desc = "Count WB memory access entries.", .bit = 11, }, {.name = "OWN", .desc = "Count all store requests driven by processor, as " "opposed to other processor or DMA.", .bit = 13, }, {.name = "OTHER", .desc = "Count all requests driven by other " "processors or DMA.", .bit = 14, }, {.name = "PREFETCH", .desc = "Include HW and SW prefetch requests in the count.", .bit = 15, }, }, }, /* 11 */ {.name = "IOQ_active_entries", .desc = "Number of entries (clipped at 15) in the IOQ that are " "active. An allocated entry can be a sector (64 bytes) " "or a chunk of 8 bytes. This event must be programmed in " "conjunction with IOQ_allocation. 
All 'TYPE_BIT*' event-masks " "together are treated as a single 5-bit value.", .event_select = 0x1A, .escr_select = 0x6, .allowed_escrs = { 29, -1 }, .event_masks = { {.name = "TYPE_BIT0", .desc = "Bus request type (bit 0).", .bit = 0, }, {.name = "TYPE_BIT1", .desc = "Bus request type (bit 1).", .bit = 1, }, {.name = "TYPE_BIT2", .desc = "Bus request type (bit 2).", .bit = 2, }, {.name = "TYPE_BIT3", .desc = "Bus request type (bit 3).", .bit = 3, }, {.name = "TYPE_BIT4", .desc = "Bus request type (bit 4).", .bit = 4, }, {.name = "ALL_READ", .desc = "Count read entries.", .bit = 5, }, {.name = "ALL_WRITE", .desc = "Count write entries.", .bit = 6, }, {.name = "MEM_UC", .desc = "Count UC memory access entries.", .bit = 7, }, {.name = "MEM_WC", .desc = "Count WC memory access entries.", .bit = 8, }, {.name = "MEM_WT", .desc = "Count write-through (WT) memory access entries.", .bit = 9, }, {.name = "MEM_WP", .desc = "Count write-protected (WP) memory access entries.", .bit = 10, }, {.name = "MEM_WB", .desc = "Count WB memory access entries.", .bit = 11, }, {.name = "OWN", .desc = "Count all store requests driven by processor, as " "opposed to other processor or DMA.", .bit = 13, }, {.name = "OTHER", .desc = "Count all requests driven by other " "processors or DMA.", .bit = 14, }, {.name = "PREFETCH", .desc = "Include HW and SW prefetch requests in the count.", .bit = 15, }, }, }, /* 12 */ {.name = "FSB_data_activity", .desc = "Count of DRDY or DBSY events that " "occur on the front side bus.", .event_select = 0x17, .escr_select = 0x6, .allowed_escrs = { 6, 29 }, .event_masks = { {.name = "DRDY_DRV", .desc = "Count when this processor drives data onto the bus. " "Includes writes and implicit writebacks.", .bit = 0, }, {.name = "DRDY_OWN", .desc = "Count when this processor reads data from the bus. " "Includes loads and some PIC transactions. Count " "DRDY events that we drive. 
Count DRDY events sampled " "that we own.", .bit = 1, }, {.name = "DRDY_OTHER", .desc = "Count when data is on the bus but not being sampled " "by the processor. It may or may not be driven by " "this processor.", .bit = 2, }, {.name = "DBSY_DRV", .desc = "Count when this processor reserves the bus for use " "in the next bus cycle in order to drive data.", .bit = 3, }, {.name = "DBSY_OWN", .desc = "Count when some agent reserves the bus for use in " "the next bus cycle to drive data that this processor " "will sample.", .bit = 4, }, {.name = "DBSY_OTHER", .desc = "Count when some agent reserves the bus for use in " "the next bus cycle to drive data that this processor " "will NOT sample. It may or may not be being driven " "by this processor.", .bit = 5, }, }, }, /* 13 */ {.name = "BSQ_allocation", .desc = "Allocations in the Bus Sequence Unit (BSQ). The event mask " "bits consist of four sub-groups: request type, request " "length, memory type, and a sub-group consisting mostly of " "independent bits (5 through 10). 
Must specify a mask for " "each sub-group.", .event_select = 0x5, .escr_select = 0x7, .allowed_escrs = { 7, -1 }, .event_masks = { {.name = "REQ_TYPE0", .desc = "Along with REQ_TYPE1, request type encodings are: " "0 - Read (excludes read invalidate), 1 - Read " "invalidate, 2 - Write (other than writebacks), 3 - " "Writeback (evicted from cache).", .bit = 0, }, {.name = "REQ_TYPE1", .desc = "Along with REQ_TYPE0, request type encodings are: " "0 - Read (excludes read invalidate), 1 - Read " "invalidate, 2 - Write (other than writebacks), 3 - " "Writeback (evicted from cache).", .bit = 1, }, {.name = "REQ_LEN0", .desc = "Along with REQ_LEN1, request length encodings are: " "0 - zero chunks, 1 - one chunk, 3 - eight chunks.", .bit = 2, }, {.name = "REQ_LEN1", .desc = "Along with REQ_LEN0, request length encodings are: " "0 - zero chunks, 1 - one chunk, 3 - eight chunks.", .bit = 3, }, {.name = "REQ_IO_TYPE", .desc = "Request type is input or output.", .bit = 5, }, {.name = "REQ_LOCK_TYPE", .desc = "Request type is bus lock.", .bit = 6, }, {.name = "REQ_CACHE_TYPE", .desc = "Request type is cacheable.", .bit = 7, }, {.name = "REQ_SPLIT_TYPE", .desc = "Request type is a bus 8-byte chunk split across " "an 8-byte boundary.", .bit = 8, }, {.name = "REQ_DEM_TYPE", .desc = "0: Request type is HW.SW prefetch. 
" "1: Request type is a demand.", .bit = 9, }, {.name = "REQ_ORD_TYPE", .desc = "Request is an ordered type.", .bit = 10, }, {.name = "MEM_TYPE0", .desc = "Along with MEM_TYPE1 and MEM_TYPE2, " "memory type encodings are: 0 - UC, " "1 - USWC, 4- WT, 5 - WP, 6 - WB", .bit = 11, }, {.name = "MEM_TYPE1", .desc = "Along with MEM_TYPE0 and MEM_TYPE2, " "memory type encodings are: 0 - UC, " "1 - USWC, 4- WT, 5 - WP, 6 - WB", .bit = 12, }, {.name = "MEM_TYPE2", .desc = "Along with MEM_TYPE0 and MEM_TYPE1, " "memory type encodings are: 0 - UC, " "1 - USWC, 4- WT, 5 - WP, 6 - WB", .bit = 13, }, }, }, /* 14 */ {.name = "BSQ_active_entries", .desc = "Number of BSQ entries (clipped at 15) currently active " "(valid) which meet the subevent mask criteria during " "allocation in the BSQ. Active request entries are allocated " "on the BSQ until de-allocated. De-allocation of an entry " "does not necessarily imply the request is filled. This " "event must be programmed in conjunction with BSQ_allocation.", .event_select = 0x6, .escr_select = 0x7, .allowed_escrs = { 30, -1 }, .event_masks = { {.name = "REQ_TYPE0", .desc = "Along with REQ_TYPE1, request type encodings are: " "0 - Read (excludes read invalidate), 1 - Read " "invalidate, 2 - Write (other than writebacks), 3 - " "Writeback (evicted from cache).", .bit = 0, }, {.name = "REQ_TYPE1", .desc = "Along with REQ_TYPE0, request type encodings are: " "0 - Read (excludes read invalidate), 1 - Read " "invalidate, 2 - Write (other than writebacks), 3 - " "Writeback (evicted from cache).", .bit = 1, }, {.name = "REQ_LEN0", .desc = "Along with REQ_LEN1, request length encodings are: " "0 - zero chunks, 1 - one chunk, 3 - eight chunks.", .bit = 2, }, {.name = "REQ_LEN1", .desc = "Along with REQ_LEN0, request length encodings are: " "0 - zero chunks, 1 - one chunk, 3 - eight chunks.", .bit = 3, }, {.name = "REQ_IO_TYPE", .desc = "Request type is input or output.", .bit = 5, }, {.name = "REQ_LOCK_TYPE", .desc = "Request type is bus 
lock.", .bit = 6, }, {.name = "REQ_CACHE_TYPE", .desc = "Request type is cacheable.", .bit = 7, }, {.name = "REQ_SPLIT_TYPE", .desc = "Request type is a bus 8-byte chunk split across " "an 8-byte boundary.", .bit = 8, }, {.name = "REQ_DEM_TYPE", .desc = "0: Request type is HW.SW prefetch. " "1: Request type is a demand.", .bit = 9, }, {.name = "REQ_ORD_TYPE", .desc = "Request is an ordered type.", .bit = 10, }, {.name = "MEM_TYPE0", .desc = "Along with MEM_TYPE1 and MEM_TYPE2, " "memory type encodings are: 0 - UC, " "1 - USWC, 4- WT, 5 - WP, 6 - WB", .bit = 11, }, {.name = "MEM_TYPE1", .desc = "Along with MEM_TYPE0 and MEM_TYPE2, " "memory type encodings are: 0 - UC, " "1 - USWC, 4- WT, 5 - WP, 6 - WB", .bit = 12, }, {.name = "MEM_TYPE2", .desc = "Along with MEM_TYPE0 and MEM_TYPE1, " "memory type encodings are: 0 - UC, " "1 - USWC, 4- WT, 5 - WP, 6 - WB", .bit = 13, }, }, }, /* 15 */ {.name = "SSE_input_assist", .desc = "Number of times an assist is requested to handle problems " "with input operands for SSE/SSE2/SSE3 operations; most " "notably denormal source operands when the DAZ bit isn't set.", .event_select = 0x34, .escr_select = 0x1, .allowed_escrs = { 12, 35 }, .event_masks = { {.name = "ALL", .desc = "Count assists for SSE/SSE2/SSE3 uops.", .bit = 15, }, }, }, /* 16 */ {.name = "packed_SP_uop", .desc = "Number of packed single-precision uops.", .event_select = 0x8, .escr_select = 0x1, .allowed_escrs = { 12, 35 }, .event_masks = { {.name = "ALL", .desc = "Count all uops operating on packed " "single-precision operands.", .bit = 15, }, {.name = "TAG0", .desc = "Tag this event with tag bit 0 " "for retirement counting with execution_event.", .bit = 16, }, {.name = "TAG1", .desc = "Tag this event with tag bit 1 " "for retirement counting with execution_event.", .bit = 17, }, {.name = "TAG2", .desc = "Tag this event with tag bit 2 " "for retirement counting with execution_event.", .bit = 18, }, {.name = "TAG3", .desc = "Tag this event with tag bit 3 " "for 
retirement counting with execution_event.", .bit = 19, }, }, }, /* 17 */ {.name = "packed_DP_uop", .desc = "Number of packed double-precision uops.", .event_select = 0xC, .escr_select = 0x1, .allowed_escrs = { 12, 35 }, .event_masks = { {.name = "ALL", .desc = "Count all uops operating on packed " "double-precision operands.", .bit = 15, }, {.name = "TAG0", .desc = "Tag this event with tag bit 0 " "for retirement counting with execution_event.", .bit = 16, }, {.name = "TAG1", .desc = "Tag this event with tag bit 1 " "for retirement counting with execution_event.", .bit = 17, }, {.name = "TAG2", .desc = "Tag this event with tag bit 2 " "for retirement counting with execution_event.", .bit = 18, }, {.name = "TAG3", .desc = "Tag this event with tag bit 3 " "for retirement counting with execution_event.", .bit = 19, }, }, }, /* 18 */ {.name = "scalar_SP_uop", .desc = "Number of scalar single-precision uops.", .event_select = 0xA, .escr_select = 0x1, .allowed_escrs = { 12, 35 }, .event_masks = { {.name = "ALL", .desc = "Count all uops operating on scalar " "single-precision operands.", .bit = 15, }, {.name = "TAG0", .desc = "Tag this event with tag bit 0 " "for retirement counting with execution_event.", .bit = 16, }, {.name = "TAG1", .desc = "Tag this event with tag bit 1 " "for retirement counting with execution_event.", .bit = 17, }, {.name = "TAG2", .desc = "Tag this event with tag bit 2 " "for retirement counting with execution_event.", .bit = 18, }, {.name = "TAG3", .desc = "Tag this event with tag bit 3 " "for retirement counting with execution_event.", .bit = 19, }, }, }, /* 19 */ {.name = "scalar_DP_uop", .desc = "Number of scalar double-precision uops.", .event_select = 0xE, .escr_select = 0x1, .allowed_escrs = { 12, 35 }, .event_masks = { {.name = "ALL", .desc = "Count all uops operating on scalar " "double-precision operands.", .bit = 15, }, {.name = "TAG0", .desc = "Tag this event with tag bit 0 " "for retirement counting with execution_event.", .bit = 
16, }, {.name = "TAG1", .desc = "Tag this event with tag bit 1 " "for retirement counting with execution_event.", .bit = 17, }, {.name = "TAG2", .desc = "Tag this event with tag bit 2 " "for retirement counting with execution_event.", .bit = 18, }, {.name = "TAG3", .desc = "Tag this event with tag bit 3 " "for retirement counting with execution_event.", .bit = 19, }, }, }, /* 20 */ {.name = "64bit_MMX_uop", .desc = "Number of MMX instructions which " "operate on 64-bit SIMD operands.", .event_select = 0x2, .escr_select = 0x1, .allowed_escrs = { 12, 35 }, .event_masks = { {.name = "ALL", .desc = "Count all uops operating on 64-bit SIMD integer " "operands in memory or MMX registers.", .bit = 15, }, {.name = "TAG0", .desc = "Tag this event with tag bit 0 " "for retirement counting with execution_event.", .bit = 16, }, {.name = "TAG1", .desc = "Tag this event with tag bit 1 " "for retirement counting with execution_event.", .bit = 17, }, {.name = "TAG2", .desc = "Tag this event with tag bit 2 " "for retirement counting with execution_event.", .bit = 18, }, {.name = "TAG3", .desc = "Tag this event with tag bit 3 " "for retirement counting with execution_event.", .bit = 19, }, }, }, /* 21 */ {.name = "128bit_MMX_uop", .desc = "Number of MMX instructions which " "operate on 128-bit SIMD operands.", .event_select = 0x1A, .escr_select = 0x1, .allowed_escrs = { 12, 35 }, .event_masks = { {.name = "ALL", .desc = "Count all uops operating on 128-bit SIMD integer " "operands in memory or MMX registers.", .bit = 15, }, {.name = "TAG0", .desc = "Tag this event with tag bit 0 " "for retirement counting with execution_event.", .bit = 16, }, {.name = "TAG1", .desc = "Tag this event with tag bit 1 " "for retirement counting with execution_event.", .bit = 17, }, {.name = "TAG2", .desc = "Tag this event with tag bit 2 " "for retirement counting with execution_event.", .bit = 18, }, {.name = "TAG3", .desc = "Tag this event with tag bit 3 " "for retirement counting with 
execution_event.", .bit = 19, }, }, }, /* 22 */ {.name = "x87_FP_uop", .desc = "Number of x87 floating-point uops.", .event_select = 0x4, .escr_select = 0x1, .allowed_escrs = { 12, 35 }, .event_masks = { {.name = "ALL", .desc = "Count all x87 FP uops.", .bit = 15, }, {.name = "TAG0", .desc = "Tag this event with tag bit 0 " "for retirement counting with execution_event.", .bit = 16, }, {.name = "TAG1", .desc = "Tag this event with tag bit 1 " "for retirement counting with execution_event.", .bit = 17, }, {.name = "TAG2", .desc = "Tag this event with tag bit 2 " "for retirement counting with execution_event.", .bit = 18, }, {.name = "TAG3", .desc = "Tag this event with tag bit 3 " "for retirement counting with execution_event.", .bit = 19, }, }, }, /* 23 */ {.name = "TC_misc", .desc = "Miscellaneous events detected by the TC. The counter will " "count twice for each occurrence.", .event_select = 0x6, .escr_select = 0x1, .allowed_escrs = { 9, 32 }, .event_masks = { {.name = "FLUSH", .desc = "Number of flushes", .bit = 4, }, }, }, /* 24 */ {.name = "global_power_events", .desc = "Counts the time during which a processor is not stopped.", .event_select = 0x13, .escr_select = 0x6, .allowed_escrs = { 6, 29 }, .event_masks = { {.name = "RUNNING", .desc = "The processor is active (includes the " "handling of HLT, STPCLK, and throttling).", .bit = 0, }, }, }, /* 25 */ {.name = "tc_ms_xfer", .desc = "Number of times that uop delivery changed from TC to MS ROM.", .event_select = 0x5, .escr_select = 0x0, .allowed_escrs = { 8, 31 }, .event_masks = { {.name = "CISC", .desc = "A TC to MS transfer occurred.", .bit = 0, }, }, }, /* 26 */ {.name = "uop_queue_writes", .desc = "Number of valid uops written to the uop queue.", .event_select = 0x9, .escr_select = 0x0, .allowed_escrs = { 8, 31 }, .event_masks = { {.name = "FROM_TC_BUILD", .desc = "The uops being written are from TC build mode.", .bit = 0, }, {.name = "FROM_TC_DELIVER", .desc = "The uops being written are from TC deliver 
mode.", .bit = 1, }, {.name = "FROM_ROM", .desc = "The uops being written are from microcode ROM.", .bit = 2, }, }, }, /* 27 */ {.name = "retired_mispred_branch_type", .desc = "Number of retiring mispredicted branches by type.", .event_select = 0x5, .escr_select = 0x2, .allowed_escrs = { 10, 33 }, .event_masks = { {.name = "CONDITIONAL", .desc = "Conditional jumps.", .bit = 1, }, {.name = "CALL", .desc = "Indirect call branches.", .bit = 2, }, {.name = "RETURN", .desc = "Return branches.", .bit = 3, }, {.name = "INDIRECT", .desc = "Returns, indirect calls, or indirect jumps.", .bit = 4, }, }, }, /* 28 */ {.name = "retired_branch_type", .desc = "Number of retiring branches by type.", .event_select = 0x4, .escr_select = 0x2, .allowed_escrs = { 10, 33 }, .event_masks = { {.name = "CONDITIONAL", .desc = "Conditional jumps.", .bit = 1, }, {.name = "CALL", .desc = "Indirect call branches.", .bit = 2, }, {.name = "RETURN", .desc = "Return branches.", .bit = 3, }, {.name = "INDIRECT", .desc = "Returns, indirect calls, or indirect jumps.", .bit = 4, }, }, }, /* 29 */ {.name = "resource_stall", .desc = "Occurrences of latency or stalls in the Allocator.", .event_select = 0x1, .escr_select = 0x1, .allowed_escrs = { 17, 40 }, .event_masks = { {.name = "SBFULL", .desc = "A stall due to lack of store buffers.", .bit = 5, }, }, }, /* 30 */ {.name = "WC_Buffer", .desc = "Number of Write Combining Buffer operations.", .event_select = 0x5, .escr_select = 0x5, .allowed_escrs = { 15, 38 }, .event_masks = { {.name = "WCB_EVICTS", .desc = "WC Buffer evictions of all causes.", .bit = 0, }, {.name = "WCB_FULL_EVICT", .desc = "WC Buffer eviction; no WC buffer is available.", .bit = 1, }, }, }, /* 31 */ {.name = "b2b_cycles", .desc = "Number of back-to-back bus cycles", .event_select = 0x16, .escr_select = 0x3, .allowed_escrs = { 6, 29 }, /* FIXME: Appendix A is missing event-mask info. 
.event_masks = { {.name = .desc = .bit = }, }, */ }, /* 32 */ {.name = "bnr", .desc = "Number of bus-not-ready conditions.", .event_select = 0x8, .escr_select = 0x3, .allowed_escrs = { 6, 29 }, /* FIXME: Appendix A is missing event-mask info. .event_masks = { {.name = .desc = .bit = }, }, */ }, /* 33 */ {.name = "snoop", .desc = "Number of snoop hit modified bus traffic.", .event_select = 0x6, .escr_select = 0x3, .allowed_escrs = { 6, 29 }, /* FIXME: Appendix A is missing event-mask info. .event_masks = { {.name = .desc = .bit = }, }, */ }, /* 34 */ {.name = "response", .desc = "Count of different types of responses.", .event_select = 0x4, .escr_select = 0x3, .allowed_escrs = { 6, 29 }, /* FIXME: Appendix A is missing event-mask info. .event_masks = { {.name = .desc = .bit = }, }, */ }, /* 35 */ {.name = "front_end_event", .desc = "Number of retirements of tagged uops which are specified " "through the front-end tagging mechanism.", .event_select = 0x8, .escr_select = 0x5, .allowed_escrs = { 21, 43 }, .event_masks = { {.name = "NBOGUS", .desc = "The marked uops are not bogus.", .bit = 0, }, {.name = "BOGUS", .desc = "The marked uops are bogus.", .bit = 1, }, }, }, /* 36 */ {.name = "execution_event", .desc = "Number of retirements of tagged uops which are specified " "through the execution tagging mechanism. 
The event-mask " "allows from one to four types of uops to be tagged.", .event_select = 0xC, .escr_select = 0x5, .allowed_escrs = { 21, 43 }, .event_masks = { {.name = "NBOGUS0", .desc = "The marked uops are not bogus.", .bit = 0, }, {.name = "NBOGUS1", .desc = "The marked uops are not bogus.", .bit = 1, }, {.name = "NBOGUS2", .desc = "The marked uops are not bogus.", .bit = 2, }, {.name = "NBOGUS3", .desc = "The marked uops are not bogus.", .bit = 3, }, {.name = "BOGUS0", .desc = "The marked uops are bogus.", .bit = 4, }, {.name = "BOGUS1", .desc = "The marked uops are bogus.", .bit = 5, }, {.name = "BOGUS2", .desc = "The marked uops are bogus.", .bit = 6, }, {.name = "BOGUS3", .desc = "The marked uops are bogus.", .bit = 7, }, }, }, /* 37 */ {.name = "replay_event", .desc = "Number of retirements of tagged uops which are specified " "through the replay tagging mechanism.", .event_select = 0x9, .escr_select = 0x5, .allowed_escrs = { 21, 43 }, .event_masks = { {.name = "NBOGUS", .desc = "The marked uops are not bogus.", .bit = 0, }, {.name = "BOGUS", .desc = "The marked uops are bogus.", .bit = 1, }, {.name = "L1_LD_MISS", .desc = "Virtual mask for L1 cache load miss replays.", .bit = 2, }, {.name = "L2_LD_MISS", .desc = "Virtual mask for L2 cache load miss replays.", .bit = 3, }, {.name = "DTLB_LD_MISS", .desc = "Virtual mask for DTLB load miss replays.", .bit = 4, }, {.name = "DTLB_ST_MISS", .desc = "Virtual mask for DTLB store miss replays.", .bit = 5, }, {.name = "DTLB_ALL_MISS", .desc = "Virtual mask for all DTLB miss replays.", .bit = 6, }, {.name = "BR_MSP", .desc = "Virtual mask for tagged mispredicted branch replays.", .bit = 7, }, {.name = "MOB_LD_REPLAY", .desc = "Virtual mask for MOB load replays.", .bit = 8, }, {.name = "SP_LD_RET", .desc = "Virtual mask for split load replays. Use with load_port_replay event.", .bit = 9, }, {.name = "SP_ST_RET", .desc = "Virtual mask for split store replays. 
Use with store_port_replay event.", .bit = 10, }, }, }, /* 38 */ {.name = "instr_retired", .desc = "Number of instructions retired during a clock cycle.", .event_select = 0x2, .escr_select = 0x4, .allowed_escrs = { 20, 42 }, .event_masks = { {.name = "NBOGUSNTAG", .desc = "Non-bogus instructions that are not tagged.", .bit = 0, }, {.name = "NBOGUSTAG", .desc = "Non-bogus instructions that are tagged.", .bit = 1, }, {.name = "BOGUSNTAG", .desc = "Bogus instructions that are not tagged.", .bit = 2, }, {.name = "BOGUSTAG", .desc = "Bogus instructions that are tagged.", .bit = 3, }, }, }, /* 39 */ {.name = "uops_retired", .desc = "Number of uops retired during a clock cycle.", .event_select = 0x1, .escr_select = 0x4, .allowed_escrs = { 20, 42 }, .event_masks = { {.name = "NBOGUS", .desc = "The marked uops are not bogus.", .bit = 0, }, {.name = "BOGUS", .desc = "The marked uops are bogus.", .bit = 1, }, }, }, /* 40 */ {.name = "uops_type", .desc = "This event is used in conjunction with the front-end " "mechanism to tag load and store uops.", .event_select = 0x2, .escr_select = 0x2, .allowed_escrs = { 18, 41 }, .event_masks = { {.name = "TAGLOADS", .desc = "The uop is a load operation.", .bit = 1, }, {.name = "TAGSTORES", .desc = "The uop is a store operation.", .bit = 2, }, }, }, /* 41 */ {.name = "branch_retired", .desc = "Number of retirements of a branch.", .event_select = 0x6, .escr_select = 0x5, .allowed_escrs = { 21, 43 }, .event_masks = { {.name = "MMNP", .desc = "Branch not-taken predicted.", .bit = 0, }, {.name = "MMNM", .desc = "Branch not-taken mispredicted.", .bit = 1, }, {.name = "MMTP", .desc = "Branch taken predicted.", .bit = 2, }, {.name = "MMTM", .desc = "Branch taken mispredicted.", .bit = 3, }, }, }, /* 42 */ {.name = "mispred_branch_retired", .desc = "Number of retirements of mispredicted " "IA-32 branch instructions", .event_select = 0x3, .escr_select = 0x4, .allowed_escrs = { 20, 42 }, .event_masks = { {.name = "BOGUS", .desc = "The retired 
instruction is not bogus.", .bit = 0, }, }, }, /* 43 */ {.name = "x87_assist", .desc = "Number of retirements of x87 instructions that required " "special handling.", .event_select = 0x3, .escr_select = 0x5, .allowed_escrs = { 21, 43 }, .event_masks = { {.name = "FPSU", .desc = "Handle FP stack underflow.", .bit = 0, }, {.name = "FPSO", .desc = "Handle FP stack overflow.", .bit = 1, }, {.name = "POAO", .desc = "Handle x87 output overflow.", .bit = 2, }, {.name = "POAU", .desc = "Handle x87 output underflow.", .bit = 3, }, {.name = "PREA", .desc = "Handle x87 input assist.", .bit = 4, }, }, }, /* 44 */ {.name = "machine_clear", .desc = "Number of occurrences when the entire " "pipeline of the machine is cleared.", .event_select = 0x2, .escr_select = 0x5, .allowed_escrs = { 21, 43 }, .event_masks = { {.name = "CLEAR", .desc = "Counts for a portion of the many cycles while the " "machine is cleared for any cause. Use edge-" "triggering for this bit only to get a count of " "occurrences versus a duration.", .bit = 0, }, {.name = "MOCLEAR", .desc = "Increments each time the machine is cleared due to " "memory ordering issues.", .bit = 2, }, {.name = "SMCLEAR", .desc = "Increments each time the machine is cleared due to " "self-modifying code issues.", .bit = 6, }, }, }, /* 45 */ {.name = "instr_completed", .desc = "Instructions that have completed and " "retired during a clock cycle. Supported on models 3, 4, 6 only", .event_select = 0x7, .escr_select = 0x5, .allowed_escrs = { 21, 42 }, .event_masks = { {.name = "NBOGUS", .desc = "Non-bogus instructions.", .bit = 0, }, {.name = "BOGUS", .desc = "Bogus instructions.", .bit = 1, }, }, }, }; #define PME_INSTR_COMPLETED 45 #define PME_REPLAY_EVENT 37 #define PENTIUM4_EVENT_COUNT (sizeof(pentium4_events)/sizeof(pentium4_events[0])) /* CPU_CLK_UNHALTED uses the global_power_events event. * INST_RETIRED uses the instr_retired event. 
*/ #define PENTIUM4_CPU_CLK_UNHALTED 24 #define PENTIUM4_INST_RETIRED 38 #endif papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_amd64.c000066400000000000000000000552621502707512200217300ustar00rootroot00000000000000/* * pfmlib_amd64.c : support for the AMD64 architected PMU * (for both 64 and 32 bit modes) * * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include /* public headers */ #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_amd64_priv.h" /* architecture private */ #include "amd64_events.h" /* PMU private */ /* let's define some handy shortcuts! 
*/ #define sel_event_mask perfsel.sel_event_mask #define sel_unit_mask perfsel.sel_unit_mask #define sel_usr perfsel.sel_usr #define sel_os perfsel.sel_os #define sel_edge perfsel.sel_edge #define sel_pc perfsel.sel_pc #define sel_int perfsel.sel_int #define sel_en perfsel.sel_en #define sel_inv perfsel.sel_inv #define sel_cnt_mask perfsel.sel_cnt_mask #define sel_event_mask2 perfsel.sel_event_mask2 #define sel_guest perfsel.sel_guest #define sel_host perfsel.sel_host #define CHECK_AMD_ARCH(reg) \ ((reg).sel_event_mask2 || (reg).sel_guest || (reg).sel_host) #define PFMLIB_AMD64_HAS_COMBO(_e) \ ((pfm_amd64_get_event_entry(_e)->pme_flags & PFMLIB_AMD64_UMASK_COMBO) != 0) #define PFMLIB_AMD64_ALL_FLAGS \ (PFM_AMD64_SEL_INV|PFM_AMD64_SEL_EDGE|PFM_AMD64_SEL_GUEST|PFM_AMD64_SEL_HOST) /* * Description of the PMC register mappings use by * this module: * pfp_pmcs[].reg_num: * 0 -> PMC0 -> PERFEVTSEL0 -> MSR @ 0xc0010000 * 1 -> PMC1 -> PERFEVTSEL1 -> MSR @ 0xc0010001 * ... * pfp_pmds[].reg_num: * 0 -> PMD0 -> PERCTR0 -> MSR @ 0xc0010004 * 1 -> PMD1 -> PERCTR1 -> MSR @ 0xc0010005 * ... 
*/ #define AMD64_SEL_BASE 0xc0010000 #define AMD64_CTR_BASE 0xc0010004 #define AMD64_SEL_BASE_F15H 0xc0010200 #define AMD64_CTR_BASE_F15H 0xc0010201 static struct { amd64_rev_t revision; char *name; unsigned int cpu_clks; unsigned int ret_inst; int family; int model; int stepping; pme_amd64_entry_t *events; } amd64_pmu; pme_amd64_entry_t unsupported_event = { .pme_name = "", .pme_desc = "This event is not supported by this cpu revision.", .pme_code = ~0, .pme_flags = PFMLIB_AMD64_NOT_SUPP, }; pfm_pmu_support_t amd64_support; #define amd64_revision amd64_pmu.revision #define amd64_event_count amd64_support.pme_count #define amd64_cpu_clks amd64_pmu.cpu_clks #define amd64_ret_inst amd64_pmu.ret_inst #define amd64_events amd64_pmu.events #define amd64_family amd64_pmu.family #define amd64_model amd64_pmu.model #define amd64_stepping amd64_pmu.stepping /* AMD architectural pmu features start with family 10h */ #define IS_AMD_ARCH() (amd64_pmu.family >= 0x10) static amd64_rev_t amd64_get_revision(int family, int model, int stepping) { switch (family) { case 6: return AMD64_K7; case 0x0f: switch (model >> 4) { case 0: if (model == 5 && stepping < 2) return AMD64_K8_REV_B; if (model == 4 && stepping == 0) return AMD64_K8_REV_B; return AMD64_K8_REV_C; case 1: return AMD64_K8_REV_D; case 2: case 3: return AMD64_K8_REV_E; case 4: case 5: case 0xc: return AMD64_K8_REV_F; case 6: case 7: case 8: return AMD64_K8_REV_G; } return AMD64_K8_REV_B; case 0x10: switch (model) { case 4: case 5: case 6: return AMD64_FAM10H_REV_C; case 8: case 9: return AMD64_FAM10H_REV_D; case 10: return AMD64_FAM10H_REV_E; } return AMD64_FAM10H_REV_B; case 0x15: return AMD64_FAM15H_REV_B; } return AMD64_CPU_UN; } /* * .byte 0x53 == push ebx. it's universal for 32 and 64 bit * .byte 0x5b == pop ebx. * Some gcc's (4.1.2 on Core2) object to pairing push/pop and ebx in 64 bit mode. * Using the opcode directly avoids this problem. 
*/ static inline void cpuid(unsigned int op, unsigned int *a, unsigned int *b, unsigned int *c, unsigned int *d) { __asm__ __volatile__ (".byte 0x53\n\tcpuid\n\tmovl %%ebx, %%esi\n\t.byte 0x5b" : "=a" (*a), "=S" (*b), "=c" (*c), "=d" (*d) : "a" (op)); } static void pfm_amd64_setup(amd64_rev_t revision) { amd64_pmu.revision = revision; amd64_pmu.name = (char *)amd64_cpu_strs[revision]; amd64_support.pmu_name = amd64_pmu.name; /* K8 (default) */ amd64_pmu.events = amd64_k8_table.events; amd64_support.pme_count = amd64_k8_table.num; amd64_pmu.cpu_clks = amd64_k8_table.cpu_clks; amd64_pmu.ret_inst = amd64_k8_table.ret_inst; amd64_support.pmu_type = PFMLIB_AMD64_PMU; amd64_support.num_cnt = PMU_AMD64_NUM_COUNTERS; amd64_support.pmc_count = PMU_AMD64_NUM_COUNTERS; amd64_support.pmd_count = PMU_AMD64_NUM_COUNTERS; switch (amd64_pmu.family) { case 6: /* K7 */ amd64_pmu.events = amd64_k7_table.events; amd64_support.pme_count = amd64_k7_table.num; amd64_pmu.cpu_clks = amd64_k7_table.cpu_clks; amd64_pmu.ret_inst = amd64_k7_table.ret_inst; return; case 0x10: /* Family 10h */ amd64_pmu.events = amd64_fam10h_table.events; amd64_support.pme_count = amd64_fam10h_table.num; amd64_pmu.cpu_clks = amd64_fam10h_table.cpu_clks; amd64_pmu.ret_inst = amd64_fam10h_table.ret_inst; amd64_support.pmc_count = PMU_AMD64_NUM_PERFSEL; amd64_support.pmd_count = PMU_AMD64_NUM_PERFCTR; return; case 0x15: /* Family 15h */ amd64_pmu.events = amd64_fam15h_table.events; amd64_support.pme_count = amd64_fam15h_table.num; amd64_pmu.cpu_clks = amd64_fam15h_table.cpu_clks; amd64_pmu.ret_inst = amd64_fam15h_table.ret_inst; amd64_support.num_cnt = PMU_AMD64_NUM_COUNTERS_F15H; amd64_support.pmc_count = PMU_AMD64_NUM_PERFSEL; amd64_support.pmd_count = PMU_AMD64_NUM_PERFCTR; return; } } static int pfm_amd64_detect(void) { unsigned int a, b, c, d; char buffer[128]; cpuid(0, &a, &b, &c, &d); strncpy(&buffer[0], (char *)(&b), 4); strncpy(&buffer[4], (char *)(&d), 4); strncpy(&buffer[8], (char *)(&c), 4); buffer[12] 
= '\0'; if (strcmp(buffer, "AuthenticAMD")) return PFMLIB_ERR_NOTSUPP; cpuid(1, &a, &b, &c, &d); amd64_family = (a >> 8) & 0x0000000f; // bits 11 - 8 amd64_model = (a >> 4) & 0x0000000f; // Bits 7 - 4 if (amd64_family == 0xf) { amd64_family += (a >> 20) & 0x000000ff; // Extended family amd64_model |= (a >> 12) & 0x000000f0; // Extended model } amd64_stepping = a & 0x0000000f; // bits 3 - 0 amd64_revision = amd64_get_revision(amd64_family, amd64_model, amd64_stepping); if (amd64_revision == AMD64_CPU_UN) return PFMLIB_ERR_NOTSUPP; return PFMLIB_SUCCESS; } static void pfm_amd64_force(void) { char *str; int pmu_type; /* parses LIBPFM_FORCE_PMU=16,,, */ str = getenv("LIBPFM_FORCE_PMU"); if (!str) goto failed; pmu_type = strtol(str, &str, 10); if (pmu_type != PFMLIB_AMD64_PMU) goto failed; if (!*str || *str++ != ',') goto failed; amd64_family = strtol(str, &str, 10); if (!*str || *str++ != ',') goto failed; amd64_model = strtol(str, &str, 10); if (!*str || *str++ != ',') goto failed; amd64_stepping = strtol(str, &str, 10); if (!*str) goto done; failed: DPRINT("force failed at: %s\n", str ? 
str : ""); /* force AMD64 = force to Barcelona */ amd64_family = 16; amd64_model = 2; amd64_stepping = 2; done: amd64_revision = amd64_get_revision(amd64_family, amd64_model, amd64_stepping); } static int pfm_amd64_init(void) { if (forced_pmu != PFMLIB_NO_PMU) pfm_amd64_force(); __pfm_vbprintf("AMD family=%d model=0x%x stepping=0x%x rev=%s, %s\n", amd64_family, amd64_model, amd64_stepping, amd64_rev_strs[amd64_revision], amd64_cpu_strs[amd64_revision]); pfm_amd64_setup(amd64_revision); return PFMLIB_SUCCESS; } static int is_valid_rev(unsigned int flags, int revision) { if (revision < from_revision(flags)) return 0; if (revision > till_revision(flags)) return 0; /* no restrictions or matches restrictions */ return 1; } static inline pme_amd64_entry_t *pfm_amd64_get_event_entry(unsigned int index) { /* * Since there are no NULL pointer checks for the return * value, &unsupported_event is returned instead. Function * is_valid_index() may be used to validate the index. */ pme_amd64_entry_t *event; if (index >= amd64_event_count) return &unsupported_event; event = &amd64_events[index]; if (!is_valid_rev(event->pme_flags, amd64_revision)) return &unsupported_event; return event; } static inline int is_valid_index(unsigned int index) { return (pfm_amd64_get_event_entry(index) != &unsupported_event); } /* * Automatically dispatch events to corresponding counters following constraints. */ static int pfm_amd64_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_amd64_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfmlib_amd64_input_param_t *param = mod_in; pfmlib_amd64_counter_t *cntrs; pfm_amd64_sel_reg_t reg; pfmlib_event_t *e; pfmlib_reg_t *pc, *pd; pfmlib_regmask_t *r_pmcs; unsigned long plm; unsigned int i, j, k, cnt, umask; unsigned int assign[PMU_AMD64_MAX_COUNTERS]; e = inp->pfp_events; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; cnt = inp->pfp_event_count; r_pmcs = &inp->pfp_unavail_pmcs; cntrs = param ? 
param->pfp_amd64_counters : NULL; /* privilege levels 1 and 2 are not supported */ if (inp->pfp_dfl_plm & (PFM_PLM1|PFM_PLM2)) { DPRINT("invalid plm=%x\n", inp->pfp_dfl_plm); return PFMLIB_ERR_INVAL; } if (PFMLIB_DEBUG()) { for (j=0; j < cnt; j++) { DPRINT("ev[%d]=%s\n", j, pfm_amd64_get_event_entry(e[j].event)->pme_name); } } if (cnt > amd64_support.num_cnt) return PFMLIB_ERR_TOOMANY; for(i=0, j=0; j < cnt; j++, i++) { /* * AMD64 only supports two priv levels for perf counters */ if (e[j].plm & (PFM_PLM1|PFM_PLM2)) { DPRINT("event=%d invalid plm=%d\n", e[j].event, e[j].plm); return PFMLIB_ERR_INVAL; } /* * check illegal unit mask combinations */ if (e[j].num_masks > 1 && PFMLIB_AMD64_HAS_COMBO(e[j].event) == 0) { DPRINT("event does not support unit mask combination\n"); return PFMLIB_ERR_FEATCOMB; } /* * check revision restrictions at the event level * (check at the umask level later) */ if (!is_valid_rev(pfm_amd64_get_event_entry(e[i].event)->pme_flags, amd64_revision)) { DPRINT("CPU does not have correct revision level\n"); return PFMLIB_ERR_BADHOST; } if (cntrs && (cntrs[j].flags & ~PFMLIB_AMD64_ALL_FLAGS)) { DPRINT("invalid AMD64 flags\n"); return PFMLIB_ERR_INVAL; } if (cntrs && (cntrs[j].cnt_mask >= PMU_AMD64_CNT_MASK_MAX)) { DPRINT("event=%d invalid cnt_mask=%d: must be < %u\n", e[j].event, cntrs[j].cnt_mask, PMU_AMD64_CNT_MASK_MAX); return PFMLIB_ERR_INVAL; } /* * exclude unavailable registers from assignment */ while(i < amd64_support.num_cnt && pfm_regmask_isset(r_pmcs, i)) i++; if (i == amd64_support.num_cnt) return PFMLIB_ERR_NOASSIGN; assign[j] = i; } for (j=0; j < cnt ; j++ ) { reg.val = 0; /* assume reserved bits are zeroed */ /* if plm is 0, then assume not specified per-event and use default */ plm = e[j].plm ? 
e[j].plm : inp->pfp_dfl_plm; if (!is_valid_rev(pfm_amd64_get_event_entry(e[j].event)->pme_flags, amd64_revision)) return PFMLIB_ERR_BADHOST; reg.sel_event_mask = pfm_amd64_get_event_entry(e[j].event)->pme_code; reg.sel_event_mask2 = pfm_amd64_get_event_entry(e[j].event)->pme_code >> 8; umask = 0; for(k=0; k < e[j].num_masks; k++) { /* check unit mask revision restrictions */ if (!is_valid_rev(pfm_amd64_get_event_entry(e[j].event)->pme_umasks[e[j].unit_masks[k]].pme_uflags, amd64_revision)) return PFMLIB_ERR_BADHOST; umask |= pfm_amd64_get_event_entry(e[j].event)->pme_umasks[e[j].unit_masks[k]].pme_ucode; } if (e[j].event == PME_AMD64_IBSOP) { ibsopctl_t ibsopctl; ibsopctl.val = 0; ibsopctl.reg.ibsopen = 1; if (umask == 2 && amd64_revision < from_revision(PFMLIB_AMD64_FAM10H_REV_C)) { DPRINT("IBSOP:UOPS available on Rev C and later processors\n"); return PFMLIB_ERR_BADHOST; } /* * 1: cycles * 2: uops */ ibsopctl.reg.ibsopcntl = umask == 0x1 ? 0 : 1; pc[j].reg_value = ibsopctl.val; pc[j].reg_num = PMU_AMD64_IBSOPCTL_PMC; pc[j].reg_addr = 0xc0011033; __pfm_vbprintf("[IBSOPCTL(pmc%u)=0x%llx en=%d uops=%d maxcnt=0x%x]\n", PMU_AMD64_IBSOPCTL_PMD, ibsopctl.val, ibsopctl.reg.ibsopen, ibsopctl.reg.ibsopcntl, ibsopctl.reg.ibsopmaxcnt); pd[j].reg_num = PMU_AMD64_IBSOPCTL_PMD; pd[j].reg_addr = 0xc0011033; __pfm_vbprintf("[IBSOPCTL(pmd%u)]\n", PMU_AMD64_IBSOPCTL_PMD); } else if (e[j].event == PME_AMD64_IBSFETCH) { ibsfetchctl_t ibsfetchctl; ibsfetchctl.val = 0; ibsfetchctl.reg.ibsfetchen = 1; ibsfetchctl.reg.ibsranden = umask == 0x1 ? 
1 : 0; pc[j].reg_value = ibsfetchctl.val; pc[j].reg_num = PMU_AMD64_IBSFETCHCTL_PMC; pc[j].reg_addr = 0xc0011031; pd[j].reg_num = PMU_AMD64_IBSFETCHCTL_PMD; pd[j].reg_addr = 0xc0011031; __pfm_vbprintf("[IBSFETCHCTL(pmc%u)=0x%llx en=%d maxcnt=0x%x rand=%u]\n", PMU_AMD64_IBSFETCHCTL_PMD, ibsfetchctl.val, ibsfetchctl.reg.ibsfetchen, ibsfetchctl.reg.ibsfetchmaxcnt, ibsfetchctl.reg.ibsranden); __pfm_vbprintf("[IBSOPFETCH(pmd%u)]\n", PMU_AMD64_IBSFETCHCTL_PMD); } else { reg.sel_unit_mask = umask; reg.sel_usr = plm & PFM_PLM3 ? 1 : 0; reg.sel_os = plm & PFM_PLM0 ? 1 : 0; reg.sel_en = 1; /* force enable bit to 1 */ reg.sel_int = 1; /* force APIC int to 1 */ if (cntrs) { reg.sel_cnt_mask = cntrs[j].cnt_mask; reg.sel_edge = cntrs[j].flags & PFM_AMD64_SEL_EDGE ? 1 : 0; reg.sel_inv = cntrs[j].flags & PFM_AMD64_SEL_INV ? 1 : 0; reg.sel_guest = cntrs[j].flags & PFM_AMD64_SEL_GUEST ? 1 : 0; reg.sel_host = cntrs[j].flags & PFM_AMD64_SEL_HOST ? 1 : 0; } pc[j].reg_num = assign[j]; if ((CHECK_AMD_ARCH(reg)) && !IS_AMD_ARCH()) return PFMLIB_ERR_BADHOST; if (amd64_support.num_cnt == PMU_AMD64_NUM_COUNTERS_F15H) { pc[j].reg_addr = AMD64_SEL_BASE_F15H + (assign[j] << 1); pd[j].reg_addr = AMD64_CTR_BASE_F15H + (assign[j] << 1); } else { pc[j].reg_addr = AMD64_SEL_BASE + assign[j]; pd[j].reg_addr = AMD64_CTR_BASE + assign[j]; } pc[j].reg_value = reg.val; pc[j].reg_alt_addr = pc[j].reg_addr; pd[j].reg_num = assign[j]; pd[j].reg_alt_addr = assign[j]; /* index to use with RDPMC */ __pfm_vbprintf("[PERFSEL%u(pmc%u)=0x%llx emask=0x%x umask=0x%x os=%d usr=%d inv=%d en=%d int=%d edge=%d cnt_mask=%d] %s\n", assign[j], assign[j], reg.val, reg.sel_event_mask, reg.sel_unit_mask, reg.sel_os, reg.sel_usr, reg.sel_inv, reg.sel_en, reg.sel_int, reg.sel_edge, reg.sel_cnt_mask, pfm_amd64_get_event_entry(e[j].event)->pme_name); __pfm_vbprintf("[PERFCTR%u(pmd%u)]\n", pd[j].reg_num, pd[j].reg_num); } } /* number of evtsel/ctr registers programmed */ outp->pfp_pmc_count = cnt; outp->pfp_pmd_count = cnt; return 
PFMLIB_SUCCESS; } static int pfm_amd64_dispatch_ibs(pfmlib_input_param_t *inp, pfmlib_amd64_input_param_t *inp_mod, pfmlib_output_param_t *outp, pfmlib_amd64_output_param_t *outp_mod) { unsigned int pmc_base, pmd_base; ibsfetchctl_t ibsfetchctl; ibsopctl_t ibsopctl; if (!inp_mod || !outp || !outp_mod) return PFMLIB_ERR_INVAL; if (!IS_AMD_ARCH()) return PFMLIB_ERR_BADHOST; /* IBS fetch profiling */ if (inp_mod->flags & PFMLIB_AMD64_USE_IBSFETCH) { /* check availability of a PMC and PMD */ if (outp->pfp_pmc_count >= PFMLIB_MAX_PMCS) return PFMLIB_ERR_NOASSIGN; if (outp->pfp_pmd_count >= PFMLIB_MAX_PMDS) return PFMLIB_ERR_NOASSIGN; pmc_base = outp->pfp_pmc_count; pmd_base = outp->pfp_pmd_count; outp->pfp_pmcs[pmc_base].reg_num = PMU_AMD64_IBSFETCHCTL_PMC; ibsfetchctl.val = 0; ibsfetchctl.reg.ibsfetchen = 1; ibsfetchctl.reg.ibsfetchmaxcnt = inp_mod->ibsfetch.maxcnt >> 4; if (inp_mod->ibsfetch.options & IBS_OPTIONS_RANDEN) ibsfetchctl.reg.ibsranden = 1; outp->pfp_pmcs[pmc_base].reg_value = ibsfetchctl.val; outp->pfp_pmds[pmd_base].reg_num = PMU_AMD64_IBSFETCHCTL_PMD; outp_mod->ibsfetch_base = pmd_base; ++outp->pfp_pmc_count; ++outp->pfp_pmd_count; } /* IBS execution profiling */ if (inp_mod->flags & PFMLIB_AMD64_USE_IBSOP) { /* check availability of a PMC and PMD */ if (outp->pfp_pmc_count >= PFMLIB_MAX_PMCS) return PFMLIB_ERR_NOASSIGN; if (outp->pfp_pmd_count >= PFMLIB_MAX_PMDS) return PFMLIB_ERR_NOASSIGN; pmc_base = outp->pfp_pmc_count; pmd_base = outp->pfp_pmd_count; outp->pfp_pmcs[pmc_base].reg_num = PMU_AMD64_IBSOPCTL_PMC; ibsopctl.val = 0; ibsopctl.reg.ibsopen = 1; ibsopctl.reg.ibsopmaxcnt = inp_mod->ibsop.maxcnt >> 4; if (inp_mod->ibsop.options & IBS_OPTIONS_UOPS) { if (amd64_revision < from_revision(PFMLIB_AMD64_FAM10H_REV_C)) { DPRINT("IBSOP:UOPS available on Rev C and later processors\n"); return PFMLIB_ERR_BADHOST; } ibsopctl.reg.ibsopcntl = 1; } outp->pfp_pmcs[pmc_base].reg_value = ibsopctl.val; outp->pfp_pmds[pmd_base].reg_num = PMU_AMD64_IBSOPCTL_PMD; 
outp_mod->ibsop_base = pmd_base; ++outp->pfp_pmc_count; ++outp->pfp_pmd_count; } return PFMLIB_SUCCESS; } static int pfm_amd64_dispatch_events( pfmlib_input_param_t *inp, void *_inp_mod, pfmlib_output_param_t *outp, void *outp_mod) { pfmlib_amd64_input_param_t *inp_mod = _inp_mod; int ret = PFMLIB_ERR_INVAL; if (!outp) return PFMLIB_ERR_INVAL; /* * At least one of the dispatch function calls must return * PFMLIB_SUCCESS */ if (inp && inp->pfp_event_count) { ret = pfm_amd64_dispatch_counters(inp, inp_mod, outp); if (ret != PFMLIB_SUCCESS) return ret; } if (inp_mod && inp_mod->flags & (PFMLIB_AMD64_USE_IBSOP | PFMLIB_AMD64_USE_IBSFETCH)) ret = pfm_amd64_dispatch_ibs(inp, inp_mod, outp, outp_mod); return ret; } static int pfm_amd64_get_event_code(unsigned int i, unsigned int cnt, int *code) { if (cnt != PFMLIB_CNT_FIRST && cnt >= amd64_support.num_cnt) return PFMLIB_ERR_INVAL; *code = pfm_amd64_get_event_entry(i)->pme_code; return PFMLIB_SUCCESS; } /* * This function is accessible directly to the user */ int pfm_amd64_get_event_umask(unsigned int i, unsigned long *umask) { if (i >= amd64_event_count || umask == NULL) return PFMLIB_ERR_INVAL; *umask = 0; //evt_umask(i); return PFMLIB_SUCCESS; } static void pfm_amd64_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { unsigned int i; memset(counters, 0, sizeof(*counters)); for(i=0; i < amd64_support.num_cnt; i++) pfm_regmask_set(counters, i); } static void pfm_amd64_get_impl_perfsel(pfmlib_regmask_t *impl_pmcs) { unsigned int i = 0; /* all pmcs are contiguous */ for(i=0; i < amd64_support.pmc_count; i++) pfm_regmask_set(impl_pmcs, i); } static void pfm_amd64_get_impl_perfctr(pfmlib_regmask_t *impl_pmds) { unsigned int i = 0; /* all pmds are contiguous */ for(i=0; i < amd64_support.pmd_count; i++) pfm_regmask_set(impl_pmds, i); } static void pfm_amd64_get_impl_counters(pfmlib_regmask_t *impl_counters) { unsigned int i = 0; /* counting pmds are contiguous */ for(i=0; i < amd64_support.num_cnt; i++) 
pfm_regmask_set(impl_counters, i); } static void pfm_amd64_get_hw_counter_width(unsigned int *width) { *width = PMU_AMD64_COUNTER_WIDTH; } static char * pfm_amd64_get_event_name(unsigned int i) { if (!is_valid_index(i)) return NULL; return pfm_amd64_get_event_entry(i)->pme_name; } static int pfm_amd64_get_event_desc(unsigned int ev, char **str) { char *s; s = pfm_amd64_get_event_entry(ev)->pme_desc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static char * pfm_amd64_get_event_mask_name(unsigned int ev, unsigned int midx) { pme_amd64_umask_t *umask; umask = &pfm_amd64_get_event_entry(ev)->pme_umasks[midx]; if (!is_valid_rev(umask->pme_uflags, amd64_revision)) return NULL; return umask->pme_uname; } static int pfm_amd64_get_event_mask_desc(unsigned int ev, unsigned int midx, char **str) { char *s; s = pfm_amd64_get_event_entry(ev)->pme_umasks[midx].pme_udesc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static unsigned int pfm_amd64_get_num_event_masks(unsigned int ev) { return pfm_amd64_get_event_entry(ev)->pme_numasks; } static int pfm_amd64_get_event_mask_code(unsigned int ev, unsigned int midx, unsigned int *code) { *code = pfm_amd64_get_event_entry(ev)->pme_umasks[midx].pme_ucode; return PFMLIB_SUCCESS; } static int pfm_amd64_get_cycle_event(pfmlib_event_t *e) { e->event = amd64_cpu_clks; return PFMLIB_SUCCESS; } static int pfm_amd64_get_inst_retired(pfmlib_event_t *e) { e->event = amd64_ret_inst; return PFMLIB_SUCCESS; } pfm_pmu_support_t amd64_support = { .pmu_name = "AMD64", .pmu_type = PFMLIB_AMD64_PMU, .pme_count = 0, .pmc_count = PMU_AMD64_NUM_COUNTERS, .pmd_count = PMU_AMD64_NUM_COUNTERS, .num_cnt = PMU_AMD64_NUM_COUNTERS, .get_event_code = pfm_amd64_get_event_code, .get_event_name = pfm_amd64_get_event_name, .get_event_counters = pfm_amd64_get_event_counters, .dispatch_events = pfm_amd64_dispatch_events, .pmu_detect = pfm_amd64_detect, .pmu_init = pfm_amd64_init, .get_impl_pmcs = 
pfm_amd64_get_impl_perfsel, .get_impl_pmds = pfm_amd64_get_impl_perfctr, .get_impl_counters = pfm_amd64_get_impl_counters, .get_hw_counter_width = pfm_amd64_get_hw_counter_width, .get_event_desc = pfm_amd64_get_event_desc, .get_num_event_masks = pfm_amd64_get_num_event_masks, .get_event_mask_name = pfm_amd64_get_event_mask_name, .get_event_mask_code = pfm_amd64_get_event_mask_code, .get_event_mask_desc = pfm_amd64_get_event_mask_desc, .get_cycle_event = pfm_amd64_get_cycle_event, .get_inst_retired_event = pfm_amd64_get_inst_retired }; papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_amd64_priv.h000066400000000000000000000110721502707512200227640ustar00rootroot00000000000000/* * Copyright (c) 2004-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
*/ #ifndef __PFMLIB_AMD64_PRIV_H__ #define __PFMLIB_AMD64_PRIV_H__ /* * PERFSEL/PERFCTR include IBS registers: * * PMCs PMDs * * PERFCTRS 6 6 * IBS FETCH 1 3 * IBS OP 1 7 * * total 8 16 */ #define PMU_AMD64_NUM_PERFSEL 8 /* number of PMCs defined */ #define PMU_AMD64_NUM_PERFCTR 16 /* number of PMDs defined */ #define PMU_AMD64_NUM_COUNTERS 4 /* number of EvtSel/EvtCtr */ #define PMU_AMD64_NUM_COUNTERS_F15H 6 /* number of EvtSel/EvtCtr */ #define PMU_AMD64_COUNTER_WIDTH 48 /* hw counter bit width */ #define PMU_AMD64_CNT_MASK_MAX 4 /* max cnt_mask value */ #define PMU_AMD64_IBSFETCHCTL_PMC 6 /* IBS: fetch PMC base */ #define PMU_AMD64_IBSFETCHCTL_PMD 6 /* IBS: fetch PMD base */ #define PMU_AMD64_IBSOPCTL_PMC 7 /* IBS: op PMC base */ #define PMU_AMD64_IBSOPCTL_PMD 9 /* IBS: op PMD base */ #define PFMLIB_AMD64_MAX_UMASK 13 typedef struct { char *pme_uname; /* unit mask name */ char *pme_udesc; /* event/umask description */ unsigned int pme_ucode; /* unit mask code */ unsigned int pme_uflags; /* unit mask flags */ } pme_amd64_umask_t; typedef struct { char *pme_name; /* event name */ char *pme_desc; /* event description */ pme_amd64_umask_t pme_umasks[PFMLIB_AMD64_MAX_UMASK]; /* umask desc */ unsigned int pme_code; /* event code */ unsigned int pme_numasks; /* number of umasks */ unsigned int pme_flags; /* flags */ } pme_amd64_entry_t; typedef enum { AMD64_CPU_UN, AMD64_K7, AMD64_K8_REV_B, AMD64_K8_REV_C, AMD64_K8_REV_D, AMD64_K8_REV_E, AMD64_K8_REV_F, AMD64_K8_REV_G, AMD64_FAM10H_REV_B, AMD64_FAM10H_REV_C, AMD64_FAM10H_REV_D, AMD64_FAM10H_REV_E, AMD64_FAM15H_REV_B, } amd64_rev_t; static const char *amd64_rev_strs[]= { "?", "?", /* K8 */ "B", "C", "D", "E", "F", "G", /* Family 10h */ "B", "C", "D", "E", /* Family 15h */ "B", }; static const char *amd64_cpu_strs[] = { "AMD64 (unknown model)", "AMD64 (K7)", "AMD64 (K8 RevB)", "AMD64 (K8 RevC)", "AMD64 (K8 RevD)", "AMD64 (K8 RevE)", "AMD64 (K8 RevF)", "AMD64 (K8 RevG)", "AMD64 (Family 10h RevB, Barcelona)", "AMD64 
(Family 10h RevC, Shanghai)", "AMD64 (Family 10h RevD, Istanbul)", "AMD64 (Family 10h RevE)", "AMD64 (Family 15h RevB)", }; /* * pme_flags values */ #define PFMLIB_AMD64_UMASK_COMBO 0x1 /* unit mask can be combined */ #define PFMLIB_AMD64_FROM_REV(rev) ((rev)<<8) #define PFMLIB_AMD64_TILL_REV(rev) ((rev)<<16) #define PFMLIB_AMD64_NOT_SUPP 0x1ff00 #define PFMLIB_AMD64_TILL_K8_REV_C PFMLIB_AMD64_TILL_REV(AMD64_K8_REV_C) #define PFMLIB_AMD64_K8_REV_D PFMLIB_AMD64_FROM_REV(AMD64_K8_REV_D) #define PFMLIB_AMD64_K8_REV_E PFMLIB_AMD64_FROM_REV(AMD64_K8_REV_E) #define PFMLIB_AMD64_TILL_K8_REV_E PFMLIB_AMD64_TILL_REV(AMD64_K8_REV_E) #define PFMLIB_AMD64_K8_REV_F PFMLIB_AMD64_FROM_REV(AMD64_K8_REV_F) #define PFMLIB_AMD64_TILL_FAM10H_REV_B PFMLIB_AMD64_TILL_REV(AMD64_FAM10H_REV_B) #define PFMLIB_AMD64_FAM10H_REV_C PFMLIB_AMD64_FROM_REV(AMD64_FAM10H_REV_C) #define PFMLIB_AMD64_TILL_FAM10H_REV_C PFMLIB_AMD64_TILL_REV(AMD64_FAM10H_REV_C) #define PFMLIB_AMD64_FAM10H_REV_D PFMLIB_AMD64_FROM_REV(AMD64_FAM10H_REV_D) static inline int from_revision(unsigned int flags) { return ((flags) >> 8) & 0xff; } static inline int till_revision(unsigned int flags) { int till = (((flags)>>16) & 0xff); if (!till) return 0xff; return till; } #endif /* __PFMLIB_AMD64_PRIV_H__ */ papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_cell.c000066400000000000000000000435451502707512200217350ustar00rootroot00000000000000/* * pfmlib_cell.c : support for the Cell PMU family * * Copyright (c) 2007 TOSHIBA CORPORATION based on code from * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* public headers */ #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_cell_priv.h" /* architecture private */ #include "cell_events.h" /* PMU private */ #define SIGNAL_TYPE_CYCLES 0 #define PM_COUNTER_CTRL_CYLES 0x42C00000U #define PFM_CELL_NUM_PMCS 24 #define PFM_CELL_EVENT_MIN 1 #define PFM_CELL_EVENT_MAX 8 #define PMX_MIN_NUM 1 #define PMX_MAX_NUM 8 #define PFM_CELL_16BIT_CNTR_EVENT_MAX 8 #define PFM_CELL_32BIT_CNTR_EVENT_MAX 4 #define COMMON_REG_NUMS 8 #define ENABLE_WORD0 0 #define ENABLE_WORD1 1 #define ENABLE_WORD2 2 #define PFM_CELL_GRP_CONTROL_REG_GRP0_BIT 30 #define PFM_CELL_GRP_CONTROL_REG_GRP1_BIT 28 #define PFM_CELL_BASE_WORD_UNIT_FIELD_BIT 24 #define PFM_CELL_WORD_UNIT_FIELD_WIDTH 2 #define PFM_CELL_MAX_WORD_NUMBER 3 #define PFM_CELL_COUNTER_CONTROL_GRP1 0x80000000U #define PFM_CELL_DEFAULT_TRIGGER_EVENT_UNIT 0x00555500U #define PFM_CELL_PM_CONTROL_16BIT_CNTR_MASK 0x01E00000U #define PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_PROBLEM 0x00080000U #define PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_SUPERVISOR 0x00000000U #define PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_HYPERVISOR 0x00040000U #define PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_ALL 0x000C0000U #define PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_MASK 0x000C0000U #define ONLY_WORD(x) \ ((x == WORD_0_ONLY)||(x == WORD_2_ONLY)) ? 
x : 0 struct pfm_cell_signal_group_desc { unsigned int signal_type; unsigned int word_type; unsigned long long word; unsigned long long freq; unsigned int subunit; }; #define swap_int(num1, num2) do { \ int tmp = num1; \ num1 = num2; \ num2 = tmp; \ } while(0) static int pfm_cell_detect(void) { int ret; char buffer[128]; ret = __pfm_getcpuinfo_attr("cpu", buffer, sizeof(buffer)); if (ret == -1) { return PFMLIB_ERR_NOTSUPP; } if (strcmp(buffer, "Cell Broadband Engine, altivec supported")) { return PFMLIB_ERR_NOTSUPP; } return PFMLIB_SUCCESS; } static int get_pmx_offset(int pmx_num, unsigned int *pmx_ctrl_bits) { /* pmx_num==0 -> not specified * pmx_num==1 -> pm0 * : * pmx_num==8 -> pm7 */ int i = 0; int offset; if ((pmx_num >= PMX_MIN_NUM) && (pmx_num <= PMX_MAX_NUM)) { /* offset is specified */ offset = (pmx_num - 1); if ((~*pmx_ctrl_bits >> offset) & 0x1) { *pmx_ctrl_bits |= (0x1 << offset); return offset; } else { /* offset is used */ return PFMLIB_ERR_INVAL; } } else if (pmx_num == 0){ /* offset is not specified */ while (((*pmx_ctrl_bits >> i) & 0x1) && (i < PMX_MAX_NUM)) { i++; } *pmx_ctrl_bits |= (0x1 << i); return i; } /* pmx_num is invalid */ return PFMLIB_ERR_INVAL; } static unsigned long long search_enable_word(int word) { unsigned long long count = 0; while ((~word) & 0x1) { count++; word >>= 1; } return count; } static int get_count_bit(unsigned int type) { int count = 0; while(type) { if (type & 1) { count++; } type >>= 1; } return count; } static int get_debug_bus_word(struct pfm_cell_signal_group_desc *group0, struct pfm_cell_signal_group_desc *group1) { unsigned int word_type0, word_type1; /* search enable word */ word_type0 = group0->word_type; word_type1 = group1->word_type; if (group1->signal_type == NONE_SIGNAL) { group0->word = search_enable_word(word_type0); goto found; } /* swap */ if ((get_count_bit(word_type0) > get_count_bit(word_type1)) || (group0->freq == PFM_CELL_PME_FREQ_SPU)) { swap_int(group0->signal_type, group1->signal_type); 
swap_int(group0->freq, group1->freq); swap_int(group0->word_type, group1->word_type); swap_int(group0->subunit, group1->subunit); swap_int(word_type0, word_type1); } if ((ONLY_WORD(word_type0) != 0) && (word_type0 == word_type1)) { return PFMLIB_ERR_INVAL; } if (ONLY_WORD(word_type0)) { group0->word = search_enable_word(ONLY_WORD(word_type0)); word_type1 &= ~(1UL << (group0->word)); group1->word = search_enable_word(word_type1); } else if (ONLY_WORD(word_type1)) { group1->word = search_enable_word(ONLY_WORD(word_type1)); word_type0 &= ~(1UL << (group1->word)); group0->word = search_enable_word(word_type0); } else { group0->word = ENABLE_WORD0; if (word_type1 == WORD_0_AND_1) { group1->word = ENABLE_WORD1; } else if(word_type1 == WORD_0_AND_2) { group1->word = ENABLE_WORD2; } else { return PFMLIB_ERR_INVAL; } } found: return PFMLIB_SUCCESS; } static unsigned int get_signal_type(unsigned long long event_code) { return (event_code & 0x00000000FFFFFFFFULL) / 100; } static unsigned int get_signal_bit(unsigned long long event_code) { return (event_code & 0x00000000FFFFFFFFULL) % 100; } static int is_spe_signal_group(unsigned int signal_type) { if (41 <= signal_type && signal_type <= 56) { return 1; } else { return 0; } } static int check_signal_type(pfmlib_input_param_t *inp, pfmlib_cell_input_param_t *mod_in, struct pfm_cell_signal_group_desc *group0, struct pfm_cell_signal_group_desc *group1) { pfmlib_event_t *e; unsigned int event_cnt; int signal_cnt = 0; int i; int cycles_signal_cnt = 0; unsigned int signal_type, subunit; e = inp->pfp_events; event_cnt = inp->pfp_event_count; for(i = 0; i < event_cnt; i++) { signal_type = get_signal_type(cell_pe[e[i].event].pme_code); if ((signal_type == SIGNAL_SPU_TRIGGER) || (signal_type == SIGNAL_SPU_EVENT)) { continue; } if (signal_type == SIGNAL_TYPE_CYCLES) { cycles_signal_cnt = 1; continue; } subunit = 0; if (is_spe_signal_group(signal_type)) { subunit = mod_in->pfp_cell_counters[i].spe_subunit; } switch(signal_cnt) { case 0: 
group0->signal_type = signal_type; group0->word_type = cell_pe[e[i].event].pme_enable_word; group0->freq = cell_pe[e[i].event].pme_freq; group0->subunit = subunit; signal_cnt++; break; case 1: if ((group0->signal_type != signal_type) || (is_spe_signal_group(signal_type) && group0->subunit != subunit)) { group1->signal_type = signal_type; group1->word_type = cell_pe[e[i].event].pme_enable_word; group1->freq = cell_pe[e[i].event].pme_freq; group1->subunit = subunit; signal_cnt++; } break; case 2: if ((group0->signal_type != signal_type) && (group1->signal_type != signal_type)) { DPRINT("signal count is invalid\n"); return PFMLIB_ERR_INVAL; } break; default: DPRINT("signal count is invalid\n"); return PFMLIB_ERR_INVAL; } } return (signal_cnt + cycles_signal_cnt); } /* * The assignment between the privilege level options * and ppu-count-mode field in pm_control register. * * option ppu count mode(pm_control) * --------------------------------- * -u(-3) 0b10 : Problem mode * -k(-0) 0b00 : Supervisor mode * -1 0b00 : Supervisor mode * -2 0b01 : Hypervisor mode * two options 0b11 : Any mode * * Note : Hypervisor-mode and Any-mode don't work on PS3. 
* */ static unsigned int get_ppu_count_mode(unsigned int plm) { unsigned int ppu_count_mode = 0; switch (plm) { case PFM_PLM0: case PFM_PLM1: ppu_count_mode = PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_SUPERVISOR; break; case PFM_PLM2: ppu_count_mode = PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_HYPERVISOR; break; case PFM_PLM3: ppu_count_mode = PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_PROBLEM; break; default : ppu_count_mode = PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_ALL; break; } return ppu_count_mode; } static int pfm_cell_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_cell_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfmlib_event_t *e; pfmlib_reg_t *pc, *pd; unsigned int event_cnt; unsigned int signal_cnt = 0, pmcs_cnt = 0; unsigned int signal_type; unsigned long long signal_bit; struct pfm_cell_signal_group_desc group[2]; int pmx_offset = 0; int i, ret; int input_control, polarity, count_cycle, count_enable; unsigned long long subunit; int shift0, shift1; unsigned int pmx_ctrl_bits; int max_event_cnt = PFM_CELL_32BIT_CNTR_EVENT_MAX; count_enable = 1; group[0].signal_type = group[1].signal_type = NONE_SIGNAL; group[0].word = group[1].word = 0L; group[0].freq = group[1].freq = 0L; group[0].subunit = group[1].subunit = 0; group[0].word_type = group[1].word_type = WORD_NONE; event_cnt = inp->pfp_event_count; e = inp->pfp_events; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; /* check event_cnt */ if (mod_in->control & PFM_CELL_PM_CONTROL_16BIT_CNTR_MASK) max_event_cnt = PFM_CELL_16BIT_CNTR_EVENT_MAX; if (event_cnt < PFM_CELL_EVENT_MIN) return PFMLIB_ERR_NOTFOUND; if (event_cnt > max_event_cnt) return PFMLIB_ERR_TOOMANY; /* check signal type */ signal_cnt = check_signal_type(inp, mod_in, &group[0], &group[1]); if (signal_cnt == PFMLIB_ERR_INVAL) return PFMLIB_ERR_NOASSIGN; /* decide debug_bus word */ if (signal_cnt != 0 && group[0].signal_type != NONE_SIGNAL) { ret = get_debug_bus_word(&group[0], &group[1]); if (ret != PFMLIB_SUCCESS) return PFMLIB_ERR_NOASSIGN; } /* common register 
setting */ pc[pmcs_cnt].reg_num = REG_GROUP_CONTROL; if (signal_cnt == 1) { pc[pmcs_cnt].reg_value = group[0].word << PFM_CELL_GRP_CONTROL_REG_GRP0_BIT; } else if (signal_cnt == 2) { pc[pmcs_cnt].reg_value = (group[0].word << PFM_CELL_GRP_CONTROL_REG_GRP0_BIT) | (group[1].word << PFM_CELL_GRP_CONTROL_REG_GRP1_BIT); } pmcs_cnt++; pc[pmcs_cnt].reg_num = REG_DEBUG_BUS_CONTROL; if (signal_cnt == 1) { shift0 = PFM_CELL_BASE_WORD_UNIT_FIELD_BIT + ((PFM_CELL_MAX_WORD_NUMBER - group[0].word) * PFM_CELL_WORD_UNIT_FIELD_WIDTH); pc[pmcs_cnt].reg_value = group[0].freq << shift0; } else if (signal_cnt == 2) { shift0 = PFM_CELL_BASE_WORD_UNIT_FIELD_BIT + ((PFM_CELL_MAX_WORD_NUMBER - group[0].word) * PFM_CELL_WORD_UNIT_FIELD_WIDTH); shift1 = PFM_CELL_BASE_WORD_UNIT_FIELD_BIT + ((PFM_CELL_MAX_WORD_NUMBER - group[1].word) * PFM_CELL_WORD_UNIT_FIELD_WIDTH); pc[pmcs_cnt].reg_value = (group[0].freq << shift0) | (group[1].freq << shift1); } pc[pmcs_cnt].reg_value |= PFM_CELL_DEFAULT_TRIGGER_EVENT_UNIT; pmcs_cnt++; pc[pmcs_cnt].reg_num = REG_TRACE_ADDRESS; pc[pmcs_cnt].reg_value = 0; pmcs_cnt++; pc[pmcs_cnt].reg_num = REG_EXT_TRACE_TIMER; pc[pmcs_cnt].reg_value = 0; pmcs_cnt++; pc[pmcs_cnt].reg_num = REG_PM_STATUS; pc[pmcs_cnt].reg_value = 0; pmcs_cnt++; pc[pmcs_cnt].reg_num = REG_PM_CONTROL; pc[pmcs_cnt].reg_value = (mod_in->control & ~PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_MASK) | get_ppu_count_mode(inp->pfp_dfl_plm); pmcs_cnt++; pc[pmcs_cnt].reg_num = REG_PM_INTERVAL; pc[pmcs_cnt].reg_value = mod_in->interval; pmcs_cnt++; pc[pmcs_cnt].reg_num = REG_PM_START_STOP; pc[pmcs_cnt].reg_value = mod_in->triggers; pmcs_cnt++; pmx_ctrl_bits = 0; /* pmX register setting */ for(i = 0; i < event_cnt; i++) { /* PMX_CONTROL */ pmx_offset = get_pmx_offset(mod_in->pfp_cell_counters[i].pmX_control_num, &pmx_ctrl_bits); if (pmx_offset == PFMLIB_ERR_INVAL) { DPRINT("pmX already used\n"); return PFMLIB_ERR_INVAL; } signal_type = get_signal_type(cell_pe[e[i].event].pme_code); if (signal_type == 
SIGNAL_TYPE_CYCLES) { pc[pmcs_cnt].reg_value = PM_COUNTER_CTRL_CYLES; pc[pmcs_cnt].reg_num = REG_PM0_CONTROL + pmx_offset; pmcs_cnt++; pc[pmcs_cnt].reg_value = cell_pe[e[i].event].pme_code; pc[pmcs_cnt].reg_num = REG_PM0_EVENT + pmx_offset; pmcs_cnt++; pd[i].reg_num = pmx_offset; pd[i].reg_value = 0; continue; } switch(cell_pe[e[i].event].pme_type) { case COUNT_TYPE_BOTH_TYPE: case COUNT_TYPE_CUMULATIVE_LEN: case COUNT_TYPE_MULTI_CYCLE: case COUNT_TYPE_SINGLE_CYCLE: count_cycle = 1; break; case COUNT_TYPE_OCCURRENCE: count_cycle = 0; break; default: return PFMLIB_ERR_INVAL; } signal_bit = get_signal_bit(cell_pe[e[i].event].pme_code); polarity = mod_in->pfp_cell_counters[i].polarity; input_control = mod_in->pfp_cell_counters[i].input_control; subunit = 0; if (is_spe_signal_group(signal_type)) { subunit = mod_in->pfp_cell_counters[i].spe_subunit; } pc[pmcs_cnt].reg_value = ( (signal_bit << (31 - 5)) | (input_control << (31 - 6)) | (polarity << (31 - 7)) | (count_cycle << (31 - 8)) | (count_enable << (31 - 9)) ); pc[pmcs_cnt].reg_num = REG_PM0_CONTROL + pmx_offset; if (signal_type == group[1].signal_type && subunit == group[1].subunit) { pc[pmcs_cnt].reg_value |= PFM_CELL_COUNTER_CONTROL_GRP1; } pmcs_cnt++; /* PMX_EVENT */ pc[pmcs_cnt].reg_num = REG_PM0_EVENT + pmx_offset; /* debug bus word setting */ if (signal_type == group[0].signal_type && subunit == group[0].subunit) { pc[pmcs_cnt].reg_value = (cell_pe[e[i].event].pme_code | (group[0].word << 48) | (subunit << 32)); } else if (signal_type == group[1].signal_type && subunit == group[1].subunit) { pc[pmcs_cnt].reg_value = (cell_pe[e[i].event].pme_code | (group[1].word << 48) | (subunit << 32)); } else if ((signal_type == SIGNAL_SPU_TRIGGER) || (signal_type == SIGNAL_SPU_EVENT)) { pc[pmcs_cnt].reg_value = cell_pe[e[i].event].pme_code | (subunit << 32); } else { return PFMLIB_ERR_INVAL; } pmcs_cnt++; /* pmd setting */ pd[i].reg_num = pmx_offset; pd[i].reg_value = 0; } outp->pfp_pmc_count = pmcs_cnt; 
outp->pfp_pmd_count = event_cnt; return PFMLIB_SUCCESS; } static int pfm_cell_dispatch_events(pfmlib_input_param_t *inp, void *model_in, pfmlib_output_param_t *outp, void *model_out) { pfmlib_cell_input_param_t *mod_in = (pfmlib_cell_input_param_t *)model_in; pfmlib_cell_input_param_t default_model_in; int i; if (model_in) { mod_in = (pfmlib_cell_input_param_t *)model_in; } else { mod_in = &default_model_in; mod_in->control = 0x80000000; mod_in->interval = 0; mod_in->triggers = 0; for (i = 0; i < PMU_CELL_NUM_COUNTERS; i++) { mod_in->pfp_cell_counters[i].pmX_control_num = 0; mod_in->pfp_cell_counters[i].spe_subunit = 0; mod_in->pfp_cell_counters[i].polarity = 1; mod_in->pfp_cell_counters[i].input_control = 0; mod_in->pfp_cell_counters[i].cnt_mask = 0; mod_in->pfp_cell_counters[i].flags = 0; } } return pfm_cell_dispatch_counters(inp, mod_in, outp); } static int pfm_cell_get_event_code(unsigned int i, unsigned int cnt, int *code) { // if (cnt != PFMLIB_CNT_FIRST && cnt > 2) { if (cnt != PFMLIB_CNT_FIRST && cnt > cell_support.num_cnt) { return PFMLIB_ERR_INVAL; } *code = cell_pe[i].pme_code; return PFMLIB_SUCCESS; } static void pfm_cell_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { unsigned int i; memset(counters, 0, sizeof(*counters)); for(i=0; i < PMU_CELL_NUM_COUNTERS; i++) { pfm_regmask_set(counters, i); } } static void pfm_cell_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs) { unsigned int i; memset(impl_pmcs, 0, sizeof(*impl_pmcs)); for(i=0; i < PFM_CELL_NUM_PMCS; i++) { pfm_regmask_set(impl_pmcs, i); } } static void pfm_cell_get_impl_pmds(pfmlib_regmask_t *impl_pmds) { unsigned int i; memset(impl_pmds, 0, sizeof(*impl_pmds)); for(i=0; i < PMU_CELL_NUM_PERFCTR; i++) { pfm_regmask_set(impl_pmds, i); } } static void pfm_cell_get_impl_counters(pfmlib_regmask_t *impl_counters) { unsigned int i; for(i=0; i < PMU_CELL_NUM_COUNTERS; i++) { pfm_regmask_set(impl_counters, i); } } static char* pfm_cell_get_event_name(unsigned int i) { return 
cell_pe[i].pme_name;
}

static int
pfm_cell_get_event_desc(unsigned int ev, char **str)
{
	char *s;

	s = cell_pe[ev].pme_desc;
	if (s) {
		*str = strdup(s);
	} else {
		*str = NULL;
	}
	return PFMLIB_SUCCESS;
}

static int
pfm_cell_get_cycle_event(pfmlib_event_t *e)
{
	int i;

	for (i = 0; i < PME_CELL_EVENT_COUNT; i++) {
		if (!strcmp(cell_pe[i].pme_name, "CYCLES")) {
			e->event = i;
			return PFMLIB_SUCCESS;
		}
	}
	return PFMLIB_ERR_NOTFOUND;
}

int
pfm_cell_spe_event(unsigned int event_index)
{
	if (event_index >= PME_CELL_EVENT_COUNT)
		return 0;
	return is_spe_signal_group(get_signal_type(cell_pe[event_index].pme_code));
}

pfm_pmu_support_t cell_support = {
	.pmu_name		= "CELL",
	.pmu_type		= PFMLIB_CELL_PMU,
	.pme_count		= PME_CELL_EVENT_COUNT,
	.pmc_count		= PFM_CELL_NUM_PMCS,
	.pmd_count		= PMU_CELL_NUM_PERFCTR,
	.num_cnt		= PMU_CELL_NUM_COUNTERS,
	.get_event_code		= pfm_cell_get_event_code,
	.get_event_name		= pfm_cell_get_event_name,
	.get_event_counters	= pfm_cell_get_event_counters,
	.dispatch_events	= pfm_cell_dispatch_events,
	.pmu_detect		= pfm_cell_detect,
	.get_impl_pmcs		= pfm_cell_get_impl_pmcs,
	.get_impl_pmds		= pfm_cell_get_impl_pmds,
	.get_impl_counters	= pfm_cell_get_impl_counters,
	.get_event_desc		= pfm_cell_get_event_desc,
	.get_cycle_event	= pfm_cell_get_cycle_event
};

papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_cell_priv.h

/*
 * Copyright (c) 2007 TOSHIBA CORPORATION based on code from
 * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P.
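The privilege-level mapping implemented by get_ppu_count_mode() above can be sketched in isolation. The `PLM*` flag values and the `MODE_*` names below are illustrative stand-ins for libpfm's `PFM_PLM*` and `PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_*` constants; only the two-bit mode encodings (0b00 supervisor, 0b01 hypervisor, 0b10 problem, 0b11 any) come from the comment table in pfmlib_cell.c.

```c
#include <assert.h>

/* Hypothetical stand-ins for libpfm's single-bit privilege-level flags. */
enum { PLM0 = 1, PLM1 = 2, PLM2 = 4, PLM3 = 8 };

/* Two-bit ppu-count-mode encodings taken from the comment table. */
enum { MODE_SUPERVISOR = 0, MODE_HYPERVISOR = 1, MODE_PROBLEM = 2, MODE_ALL = 3 };

/* Mirrors get_ppu_count_mode(): a single privilege level maps directly
 * to one mode; any combination of levels falls through to "any mode". */
static unsigned int ppu_count_mode(unsigned int plm)
{
	switch (plm) {
	case PLM0:
	case PLM1: return MODE_SUPERVISOR;
	case PLM2: return MODE_HYPERVISOR;
	case PLM3: return MODE_PROBLEM;
	default:   return MODE_ALL;
	}
}
```

Note how the switch has no way to express "kernel and user" precisely: as in the original, requesting more than one level degrades to counting in all modes.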
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
 */
#ifndef __PFMLIB_CELL_PRIV_H__
#define __PFMLIB_CELL_PRIV_H__

#define PFM_CELL_PME_FREQ_PPU_MFC	0
#define PFM_CELL_PME_FREQ_SPU		1
#define PFM_CELL_PME_FREQ_HALF		2

typedef struct {
	char *pme_name;			/* event name */
	char *pme_desc;			/* event description */
	unsigned long long pme_code;	/* event code */
	unsigned int pme_type;		/* count type */
	unsigned int pme_freq;		/* debug_bus_control's frequency value */
	unsigned int pme_enable_word;
} pme_cell_entry_t;

/* PMC register */
#define REG_PM0_CONTROL		0x0000
#define REG_PM1_CONTROL		0x0001
#define REG_PM2_CONTROL		0x0002
#define REG_PM3_CONTROL		0x0003
#define REG_PM4_CONTROL		0x0004
#define REG_PM5_CONTROL		0x0005
#define REG_PM6_CONTROL		0x0006
#define REG_PM7_CONTROL		0x0007
#define REG_PM0_EVENT		0x0008
#define REG_PM1_EVENT		0x0009
#define REG_PM2_EVENT		0x000A
#define REG_PM3_EVENT		0x000B
#define REG_PM4_EVENT		0x000C
#define REG_PM5_EVENT		0x000D
#define REG_PM6_EVENT		0x000E
#define REG_PM7_EVENT		0x000F
#define REG_GROUP_CONTROL	0x0010
#define REG_DEBUG_BUS_CONTROL	0x0011
#define REG_TRACE_ADDRESS	0x0012
#define REG_EXT_TRACE_TIMER	0x0013
#define REG_PM_STATUS		0x0014
#define REG_PM_CONTROL		0x0015
#define REG_PM_INTERVAL		0x0016
#define REG_PM_START_STOP	0x0017

#define NONE_SIGNAL		0x0000
#define SIGNAL_SPU		41
#define SIGNAL_SPU_TRIGGER	42
#define SIGNAL_SPU_EVENT	43

#define COUNT_TYPE_BOTH_TYPE		1
#define COUNT_TYPE_CUMULATIVE_LEN	2
#define COUNT_TYPE_OCCURRENCE		3
#define COUNT_TYPE_MULTI_CYCLE		4
#define COUNT_TYPE_SINGLE_CYCLE		5

#define WORD_0_ONLY	1	/* 0001 */
#define WORD_2_ONLY	4	/* 0100 */
#define WORD_0_AND_1	3	/* 0011 */
#define WORD_0_AND_2	5	/* 0101 */
#define WORD_NONE	0

#endif /* __PFMLIB_CELL_PRIV_H__ */

papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_common.c

/*
 * pfmlib_common.c: set of functions common to all PMU models
 *
 * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P.
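The event codes in the Cell event table pack a debug-bus signal as a decimal pair: the low 32 bits hold signal_type * 100 + signal_bit, which get_signal_type() and get_signal_bit() in pfmlib_cell.c split back apart, and signal types 41..56 are the SPE groups that additionally need a subunit. A standalone sketch of that decoding (the function names here are local to the sketch):

```c
#include <assert.h>

/* Decimal packing used by the Cell event table: the low 32 bits of an
 * event code hold signal_type * 100 + signal_bit. */
static unsigned int signal_type(unsigned long long code)
{
	return (unsigned int)((code & 0xFFFFFFFFULL) / 100);
}

static unsigned int signal_bit(unsigned long long code)
{
	return (unsigned int)((code & 0xFFFFFFFFULL) % 100);
}

/* SPE signal groups occupy types 41..56 and also carry a subunit. */
static int is_spe_group(unsigned int type)
{
	return 41 <= type && type <= 56;
}
```

For example, an event code of 4207 names bit 7 of signal group 42 (SIGNAL_SPU_TRIGGER in the header above).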
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
 */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for getline */
#endif
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <string.h>
#include <unistd.h>
#include <limits.h>
#include <ctype.h>

#include "pfmlib_priv.h"

static pfm_pmu_support_t *supported_pmus[] = {
#ifdef CONFIG_PFMLIB_ARCH_IA64
	&montecito_support,
	&itanium2_support,
	&itanium_support,
	&generic_ia64_support,	/* must always be last for IA-64 */
#endif

#ifdef CONFIG_PFMLIB_ARCH_X86_64
	&amd64_support,
	&pentium4_support,
	&core_support,
	&intel_atom_support,
	&intel_nhm_support,
	&intel_wsm_support,
	&gen_ia32_support,	/* must always be last for x86-64 */
#endif

#ifdef CONFIG_PFMLIB_ARCH_I386
	&i386_pii_support,
	&i386_ppro_support,
	&i386_p6_support,
	&i386_pm_support,
	&coreduo_support,
	&amd64_support,
	&pentium4_support,
	&core_support,
	&intel_atom_support,
	&intel_nhm_support,
	&intel_wsm_support,
	&gen_ia32_support,	/* must always be last for i386 */
#endif

#ifdef CONFIG_PFMLIB_ARCH_MIPS64
	&generic_mips64_support,
#endif

#ifdef CONFIG_PFMLIB_ARCH_SICORTEX
	&sicortex_support,
#endif

#ifdef CONFIG_PFMLIB_ARCH_POWERPC
	&gen_powerpc_support,
#endif

#ifdef CONFIG_PFMLIB_ARCH_SPARC
	&sparc_support,
#endif

#ifdef CONFIG_PFMLIB_ARCH_CRAYX2
	&crayx2_support,
#endif

#ifdef CONFIG_PFMLIB_CELL
	&cell_support,
#endif
	NULL
};

/*
 * contains runtime configuration options for the library.
 * mostly for debug purposes.
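pfm_check_debug_env() below only honors LIBPFM_VERBOSE and LIBPFM_DEBUG when the variable's first character is a digit, and it uses that single leading digit as the level. The parsing rule can be sketched as a pure helper (parse_level is a hypothetical name, separated from getenv() so it is easy to exercise):

```c
#include <assert.h>

/* Same rule as pfm_check_debug_env(): accept only a leading digit
 * '0'..'9' and return it as the level; anything else means "unset",
 * leaving the corresponding option untouched. */
static int parse_level(const char *s)
{
	if (s && *s >= '0' && *s <= '9')
		return *s - '0';
	return -1; /* not set / not numeric */
}
```

In the library itself the helper's input would come from getenv("LIBPFM_VERBOSE"); note that a value like "42" yields level 4, since only the first character is read.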
*/ pfm_config_t pfm_config = { .current = NULL }; int forced_pmu = PFMLIB_NO_PMU; /* * check environment variables for: * LIBPFM_VERBOSE : enable verbose output (must be 1) * LIBPFM_DEBUG : enable debug output (must be 1) */ static void pfm_check_debug_env(void) { char *str; libpfm_fp = stderr; str = getenv("LIBPFM_VERBOSE"); if (str && *str >= '0' && *str <= '9') { pfm_config.options.pfm_verbose = *str - '0'; pfm_config.options_env_set = 1; } str = getenv("LIBPFM_DEBUG"); if (str && *str >= '0' && *str <= '9') { pfm_config.options.pfm_debug = *str - '0'; pfm_config.options_env_set = 1; } str = getenv("LIBPFM_DEBUG_STDOUT"); if (str) libpfm_fp = stdout; str = getenv("LIBPFM_FORCE_PMU"); if (str) forced_pmu = atoi(str); } int pfm_initialize(void) { pfm_pmu_support_t **p = supported_pmus; int ret; pfm_check_debug_env(); /* * syscall mapping, no failure on error */ pfm_init_syscalls(); while(*p) { DPRINT("trying %s\n", (*p)->pmu_name); /* * check for forced_pmu * pmu_type can never be zero */ if ((*p)->pmu_type == forced_pmu) { __pfm_vbprintf("PMU forced to %s\n", (*p)->pmu_name); goto found; } if (forced_pmu == PFMLIB_NO_PMU && (*p)->pmu_detect() == PFMLIB_SUCCESS) goto found; p++; } return PFMLIB_ERR_NOTSUPP; found: DPRINT("found %s\n", (*p)->pmu_name); /* * run a few sanity checks */ if ((*p)->pmc_count >= PFMLIB_MAX_PMCS) return PFMLIB_ERR_NOTSUPP; if ((*p)->pmd_count >= PFMLIB_MAX_PMDS) return PFMLIB_ERR_NOTSUPP; if ((*p)->pmu_init) { ret = (*p)->pmu_init(); if (ret != PFMLIB_SUCCESS) return ret; } pfm_current = *p; return PFMLIB_SUCCESS; } int pfm_set_options(pfmlib_options_t *opt) { if (opt == NULL) return PFMLIB_ERR_INVAL; /* * environment variables override program presets */ if (pfm_config.options_env_set == 0) pfm_config.options = *opt; return PFMLIB_SUCCESS; } /* * return the name corresponding to the pmu type. Only names * of PMU actually compiled in the library will be returned. 
*/ int pfm_get_pmu_name_bytype(int type, char *name, size_t maxlen) { pfm_pmu_support_t **p = supported_pmus; if (name == NULL || maxlen < 1) return PFMLIB_ERR_INVAL; while (*p) { if ((*p)->pmu_type == type) goto found; p++; } return PFMLIB_ERR_INVAL; found: strncpy(name, (*p)->pmu_name, maxlen-1); /* make sure the string is null terminated */ name[maxlen-1] = '\0'; return PFMLIB_SUCCESS; } int pfm_list_supported_pmus(int (*pf)(const char *fmt,...)) { pfm_pmu_support_t **p; if (pf == NULL) return PFMLIB_ERR_INVAL; (*pf)("supported PMU models: "); for (p = supported_pmus; *p; p++) { (*pf)("[%s] ", (*p)->pmu_name);; } (*pf)("\ndetected host PMU: %s\n", pfm_current ? pfm_current->pmu_name : "not detected yet"); return PFMLIB_SUCCESS; } int pfm_get_pmu_name(char *name, int maxlen) { if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (name == NULL || maxlen < 1) return PFMLIB_ERR_INVAL; strncpy(name, pfm_current->pmu_name, maxlen-1); name[maxlen-1] = '\0'; return PFMLIB_SUCCESS; } int pfm_get_pmu_type(int *type) { if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (type == NULL) return PFMLIB_ERR_INVAL; *type = pfm_current->pmu_type; return PFMLIB_SUCCESS; } /* * boolean return value */ int pfm_is_pmu_supported(int type) { pfm_pmu_support_t **p = supported_pmus; if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; while (*p) { if ((*p)->pmu_type == type) return PFMLIB_SUCCESS; p++; } return PFMLIB_ERR_NOTSUPP; } int pfm_force_pmu(int type) { pfm_pmu_support_t **p = supported_pmus; while (*p) { if ((*p)->pmu_type == type) goto found; p++; } return PFMLIB_ERR_NOTSUPP; found: pfm_current = *p; return PFMLIB_SUCCESS; } int pfm_find_event_byname(const char *n, unsigned int *idx) { char *p, *e; unsigned int i; size_t len; if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (n == NULL || idx == NULL) return PFMLIB_ERR_INVAL; /* * this function ignores any ':' separator */ p = strchr(n, ':'); if (!p) len = strlen(n); else len = p - n; /* * we do 
case insensitive comparisons * * event names must match completely */ for(i=0; i < pfm_current->pme_count; i++) { e = pfm_current->get_event_name(i); if (!e) continue; if (!strncasecmp(e, n, len) && len == strlen(e)) goto found; } return PFMLIB_ERR_NOTFOUND; found: *idx = i; return PFMLIB_SUCCESS; } int pfm_find_event_bycode(int code, unsigned int *idx) { pfmlib_regmask_t impl_cnt; unsigned int i, j, num_cnt; int code2; if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (idx == NULL) return PFMLIB_ERR_INVAL; if (pfm_current->flags & PFMLIB_MULT_CODE_EVENT) { pfm_current->get_impl_counters(&impl_cnt); num_cnt = pfm_current->num_cnt; for(i=0; i < pfm_current->pme_count; i++) { for(j=0; num_cnt; j++) { if (pfm_regmask_isset(&impl_cnt, j)) { pfm_current->get_event_code(i, j, &code2); if (code2 == code) goto found; num_cnt--; } } } } else { for(i=0; i < pfm_current->pme_count; i++) { pfm_current->get_event_code(i, PFMLIB_CNT_FIRST, &code2); if (code2 == code) goto found; } } return PFMLIB_ERR_NOTFOUND; found: *idx = i; return PFMLIB_SUCCESS; } int pfm_find_event(const char *v, unsigned int *ev) { unsigned long number; char *endptr = NULL; int ret = PFMLIB_ERR_INVAL; if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (v == NULL || ev == NULL) return PFMLIB_ERR_INVAL; if (isdigit((int)*v)) { number = strtoul(v,&endptr, 0); /* check for errors */ if (*endptr!='\0') return PFMLIB_ERR_INVAL; if (number <= INT_MAX) { int the_int_number = (int)number; ret = pfm_find_event_bycode(the_int_number, ev); } } else ret = pfm_find_event_byname(v, ev); return ret; } int pfm_find_event_bycode_next(int code, unsigned int i, unsigned int *next) { int code2; if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (!next) return PFMLIB_ERR_INVAL; for(++i; i < pfm_current->pme_count; i++) { pfm_current->get_event_code(i, PFMLIB_CNT_FIRST, &code2); if (code2 == code) goto found; } return PFMLIB_ERR_NOTFOUND; found: *next = i; return PFMLIB_SUCCESS; } static int 
pfm_do_find_event_mask(unsigned int ev, const char *str, unsigned int *mask_idx) { unsigned int i, c, num_masks = 0; unsigned long mask_val = -1; char *endptr = NULL; char *mask_name; /* empty mask name */ if (*str == '\0') return PFMLIB_ERR_UMASK; num_masks = pfm_num_masks(ev); for (i = 0; i < num_masks; i++) { mask_name = pfm_current->get_event_mask_name(ev, i); if (!mask_name) continue; if (strcasecmp(mask_name, str)) continue; *mask_idx = i; return PFMLIB_SUCCESS; } /* don't give up yet; check for a exact numerical value */ mask_val = strtoul(str, &endptr, 0); if (mask_val != ULONG_MAX && endptr && *endptr == '\0') { for (i = 0; i < num_masks; i++) { pfm_current->get_event_mask_code(ev, i, &c); if (mask_val == c) { *mask_idx = i; return PFMLIB_SUCCESS; } } } return PFMLIB_ERR_UMASK; } int pfm_find_event_mask(unsigned int ev, const char *str, unsigned int *mask_idx) { if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (str == NULL || mask_idx == NULL || ev >= pfm_current->pme_count) return PFMLIB_ERR_INVAL; return pfm_do_find_event_mask(ev, str, mask_idx); } /* * check if unit mask is not already present */ static inline int pfm_check_duplicates(pfmlib_event_t *e, unsigned int u) { unsigned int j; for(j=0; j < e->num_masks; j++) { if (e->unit_masks[j] == u) return PFMLIB_ERR_UMASK; } return PFMLIB_SUCCESS; } static int pfm_add_numeric_masks(pfmlib_event_t *e, const char *str) { unsigned int i, j, c; unsigned int num_masks = 0; unsigned long mask_val = -1, m = 0; char *endptr = NULL; int ret = PFMLIB_ERR_UMASK; /* empty mask name */ if (*str == '\0') return PFMLIB_ERR_UMASK; num_masks = pfm_num_masks(e->event); /* * add to the existing list of unit masks */ j = e->num_masks; /* * use unsigned long to benefit from radix wildcard * and error checking of strtoul() */ mask_val = strtoul(str, &endptr, 0); if (endptr && *endptr != '\0') return PFMLIB_ERR_UMASK; /* * look for a numerical match */ for (i = 0; i < num_masks; i++) { 
		pfm_current->get_event_mask_code(e->event, i, &c);
		if ((mask_val & c) == (unsigned long)c) {
			/* ignore duplicates */
			if (pfm_check_duplicates(e, i) == PFMLIB_SUCCESS) {
				if (j == PFMLIB_MAX_MASKS_PER_EVENT) {
					ret = PFMLIB_ERR_TOOMANY;
					break;
				}
				e->unit_masks[j++] = i;
			}
			m |= c;
		}
	}

	/*
	 * all bits accounted for
	 */
	if (mask_val == m) {
		e->num_masks = j;
		return PFMLIB_SUCCESS;
	}

	/*
	 * extra bits left over;
	 * reset and flag error
	 */
	for (i = e->num_masks; i < j; i++)
		e->unit_masks[i] = 0;
	return ret;
}

int
pfm_get_event_name(unsigned int i, char *name, size_t maxlen)
{
	size_t l, j;
	char *str;

	if (PFMLIB_INITIALIZED() == 0)
		return PFMLIB_ERR_NOINIT;
	if (i >= pfm_current->pme_count || name == NULL || maxlen < 1)
		return PFMLIB_ERR_INVAL;

	str = pfm_current->get_event_name(i);
	if (!str)
		return PFMLIB_ERR_BADHOST;
	l = strlen(str);

	/*
	 * we fail if the buffer is too small, simply because otherwise we
	 * get partial names which are useless for subsequent calls;
	 * users must invoke pfm_get_max_event_name_len() to correctly size
	 * the buffer for this call
	 */
	if ((maxlen-1) < l)
		return PFMLIB_ERR_INVAL;

	for (j = 0; j < l; j++)
		name[j] = (char)toupper(str[j]);
	name[l] = '\0';
	return PFMLIB_SUCCESS;
}

int
pfm_get_event_code(unsigned int i, int *code)
{
	if (PFMLIB_INITIALIZED() == 0)
		return PFMLIB_ERR_NOINIT;
	if (i >= pfm_current->pme_count || code == NULL)
		return PFMLIB_ERR_INVAL;
	return pfm_current->get_event_code(i, PFMLIB_CNT_FIRST, code);
}

int
pfm_get_event_code_counter(unsigned int i, unsigned int cnt, int *code)
{
	if (PFMLIB_INITIALIZED() == 0)
		return PFMLIB_ERR_NOINIT;
	if (i >= pfm_current->pme_count || code == NULL)
		return PFMLIB_ERR_INVAL;
	return pfm_current->get_event_code(i, cnt, code);
}

int
pfm_get_event_counters(unsigned int i, pfmlib_regmask_t *counters)
{
	if (PFMLIB_INITIALIZED() == 0)
		return PFMLIB_ERR_NOINIT;
	if (i >= pfm_current->pme_count)
		return PFMLIB_ERR_INVAL;
	pfm_current->get_event_counters(i, counters);
	return PFMLIB_SUCCESS;
}

int
pfm_get_event_mask_name(unsigned int
ev, unsigned int mask, char *name, size_t maxlen) { char *str; unsigned int num; size_t l, j; if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (ev >= pfm_current->pme_count || name == NULL || maxlen < 1) return PFMLIB_ERR_INVAL; num = pfm_num_masks(ev); if (num == 0) return PFMLIB_ERR_NOTSUPP; if (mask >= num) return PFMLIB_ERR_INVAL; str = pfm_current->get_event_mask_name(ev, mask); if (!str) return PFMLIB_ERR_BADHOST; l = strlen(str); if (l >= (maxlen-1)) return PFMLIB_ERR_FULL; strcpy(name, str); /* * present nice uniform names */ l = strlen(name); for(j=0; j < l; j++) if (islower(name[j])) name[j] = (char)toupper(name[j]); return PFMLIB_SUCCESS; } int pfm_get_num_events(unsigned int *count) { if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (count == NULL) return PFMLIB_ERR_INVAL; *count = pfm_current->pme_count; return PFMLIB_SUCCESS; } int pfm_get_num_event_masks(unsigned int ev, unsigned int *count) { if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (ev >= pfm_current->pme_count || count == NULL) return PFMLIB_ERR_INVAL; *count = pfm_num_masks(ev); return PFMLIB_SUCCESS; } #if 0 /* * check that the unavailable PMCs registers correspond * to implemented PMC registers */ static int pfm_check_unavail_pmcs(pfmlib_regmask_t *pmcs) { pfmlib_regmask_t impl_pmcs; pfm_current->get_impl_pmcs(&impl_pmcs); unsigned int i; for (i=0; i < PFMLIB_REG_BV; i++) { if ((pmcs->bits[i] & impl_pmcs.bits[i]) != pmcs->bits[i]) return PFMLIB_ERR_INVAL; } return PFMLIB_SUCCESS; } #endif /* * we do not check if pfp_unavail_pmcs contains only implemented PMC * registers. 
 * In other words, invalid registers are ignored
 */
int
pfm_dispatch_events(pfmlib_input_param_t *inp, void *model_in,
		    pfmlib_output_param_t *outp, void *model_out)
{
	unsigned count;
	unsigned int i;
	int ret;

	if (PFMLIB_INITIALIZED() == 0)
		return PFMLIB_ERR_NOINIT;

	/* at least one input and one output set must exist */
	if (!inp && !model_in)
		return PFMLIB_ERR_INVAL;
	if (!outp && !model_out)
		return PFMLIB_ERR_INVAL;

	if (!inp)
		count = 0;
	else if (inp->pfp_dfl_plm == 0)
		/* the default priv level must be set to something */
		return PFMLIB_ERR_INVAL;
	else if (inp->pfp_event_count >= PFMLIB_MAX_PMCS)
		return PFMLIB_ERR_INVAL;
	else if (inp->pfp_event_count > pfm_current->num_cnt)
		return PFMLIB_ERR_NOASSIGN;
	else
		count = inp->pfp_event_count;

	/*
	 * check that event and unit mask descriptors are correct
	 */
	for (i = 0; i < count; i++) {
		ret = __pfm_check_event(inp->pfp_events + i);
		if (ret != PFMLIB_SUCCESS)
			return ret;
	}

	/* reset output data structure */
	if (outp)
		memset(outp, 0, sizeof(*outp));

	return pfm_current->dispatch_events(inp, model_in, outp, model_out);
}

/*
 * more or less obsoleted by pfm_get_impl_counters()
 */
int
pfm_get_num_counters(unsigned int *num)
{
	if (PFMLIB_INITIALIZED() == 0)
		return PFMLIB_ERR_NOINIT;
	if (num == NULL)
		return PFMLIB_ERR_INVAL;
	*num = pfm_current->num_cnt;
	return PFMLIB_SUCCESS;
}

int
pfm_get_num_pmcs(unsigned int *num)
{
	if (PFMLIB_INITIALIZED() == 0)
		return PFMLIB_ERR_NOINIT;
	if (num == NULL)
		return PFMLIB_ERR_INVAL;
	*num = pfm_current->pmc_count;
	return PFMLIB_SUCCESS;
}

int
pfm_get_num_pmds(unsigned int *num)
{
	if (PFMLIB_INITIALIZED() == 0)
		return PFMLIB_ERR_NOINIT;
	if (num == NULL)
		return PFMLIB_ERR_INVAL;
	*num = pfm_current->pmd_count;
	return PFMLIB_SUCCESS;
}

int
pfm_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs)
{
	if (PFMLIB_INITIALIZED() == 0)
		return PFMLIB_ERR_NOINIT;
	if (impl_pmcs == NULL)
		return PFMLIB_ERR_INVAL;
	memset(impl_pmcs, 0, sizeof(*impl_pmcs));
	pfm_current->get_impl_pmcs(impl_pmcs);
	return PFMLIB_SUCCESS;
}

int
pfm_get_impl_pmds(pfmlib_regmask_t *impl_pmds) { if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (impl_pmds == NULL) return PFMLIB_ERR_INVAL; memset(impl_pmds, 0, sizeof(*impl_pmds)); pfm_current->get_impl_pmds(impl_pmds); return PFMLIB_SUCCESS; } int pfm_get_impl_counters(pfmlib_regmask_t *impl_counters) { if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (impl_counters == NULL) return PFMLIB_ERR_INVAL; memset(impl_counters, 0, sizeof(*impl_counters)); pfm_current->get_impl_counters(impl_counters); return PFMLIB_SUCCESS; } int pfm_get_hw_counter_width(unsigned int *width) { if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (width == NULL) return PFMLIB_ERR_INVAL; pfm_current->get_hw_counter_width(width); return PFMLIB_SUCCESS; } /* sorry, only English supported at this point! */ static char *pfmlib_err_list[]= { "success", "not supported", "invalid parameters", "pfmlib not initialized", "event not found", "cannot assign events to counters", "buffer is full or too small", "event used more than once", "invalid model specific magic number", "invalid combination of model specific features", "incompatible event sets", "incompatible events combination", "too many events or unit masks", "code range too big", "empty code range", "invalid code range", "too many code ranges", "invalid data range", "too many data ranges", "not supported by host cpu", "code range is not bundle-aligned", "code range requires some flags in rr_flags", "invalid or missing unit mask", "out of memory" }; static size_t pfmlib_err_count = sizeof(pfmlib_err_list)/sizeof(char *); char * pfm_strerror(int code) { code = -code; if (code <0 || code >= pfmlib_err_count) return "unknown error code"; return pfmlib_err_list[code]; } int pfm_get_version(unsigned int *version) { if (version == NULL) return PFMLIB_ERR_INVAL; *version = PFMLIB_VERSION; return 0; } int pfm_get_max_event_name_len(size_t *len) { unsigned int i, j, num_masks; size_t max = 0, l; char *str; if 
(PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (len == NULL) return PFMLIB_ERR_INVAL; for(i=0; i < pfm_current->pme_count; i++) { str = pfm_current->get_event_name(i); if (!str) continue; l = strlen(str); if (l > max) max = l; num_masks = pfm_num_masks(i); /* * we need to add up all length because unit masks can * be combined typically. We add 1 to account for ':' * which is inserted as the unit mask separator */ for (j = 0; j < num_masks; j++) { str = pfm_current->get_event_mask_name(i, j); if (!str) continue; l += 1 + strlen(str); } if (l > max) max = l; } *len = max; return PFMLIB_SUCCESS; } /* * return the index of the event that counts elapsed cycles */ int pfm_get_cycle_event(pfmlib_event_t *e) { if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (e == NULL) return PFMLIB_ERR_INVAL; if (!pfm_current->get_cycle_event) return PFMLIB_ERR_NOTSUPP; memset(e, 0, sizeof(*e)); return pfm_current->get_cycle_event(e); } /* * return the index of the event that retired instructions */ int pfm_get_inst_retired_event(pfmlib_event_t *e) { if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (e == NULL) return PFMLIB_ERR_INVAL; if (!pfm_current->get_inst_retired_event) return PFMLIB_ERR_NOTSUPP; memset(e, 0, sizeof(*e)); return pfm_current->get_inst_retired_event(e); } int pfm_get_event_description(unsigned int i, char **str) { if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (i >= pfm_current->pme_count || str == NULL) return PFMLIB_ERR_INVAL; if (pfm_current->get_event_desc == NULL) { *str = strdup("no description available"); return PFMLIB_SUCCESS; } return pfm_current->get_event_desc(i, str); } int pfm_get_event_mask_description(unsigned int event_idx, unsigned int mask_idx, char **desc) { if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (event_idx >= pfm_current->pme_count || desc == NULL) return PFMLIB_ERR_INVAL; if (pfm_current->get_event_mask_desc == NULL) { *desc = strdup("no description available"); return 
PFMLIB_SUCCESS; } if (mask_idx >= pfm_current->get_num_event_masks(event_idx)) return PFMLIB_ERR_INVAL; return pfm_current->get_event_mask_desc(event_idx, mask_idx, desc); } int pfm_get_event_mask_code(unsigned int event_idx, unsigned int mask_idx, unsigned int *code) { if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (event_idx >= pfm_current->pme_count || code == NULL) return PFMLIB_ERR_INVAL; if (pfm_current->get_event_mask_code == NULL) { *code = 0; return PFMLIB_SUCCESS; } if (mask_idx >= pfm_current->get_num_event_masks(event_idx)) return PFMLIB_ERR_INVAL; return pfm_current->get_event_mask_code(event_idx, mask_idx, code); } int pfm_get_full_event_name(pfmlib_event_t *e, char *name, size_t maxlen) { char *str; size_t l, j; int ret; if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (e == NULL || name == NULL || maxlen < 1) return PFMLIB_ERR_INVAL; ret = __pfm_check_event(e); if (ret != PFMLIB_SUCCESS) return ret; /* * make sure the string is at least empty * important for programs that do not check return value * from this function! 
*/ *name = '\0'; str = pfm_current->get_event_name(e->event); if (!str) return PFMLIB_ERR_BADHOST; l = strlen(str); if (l > (maxlen-1)) return PFMLIB_ERR_FULL; strcpy(name, str); maxlen -= l + 1; for(j=0; j < e->num_masks; j++) { str = pfm_current->get_event_mask_name(e->event, e->unit_masks[j]); if (!str) continue; l = strlen(str); if (l > (maxlen-1)) return PFMLIB_ERR_FULL; strcat(name, ":"); strcat(name, str); maxlen -= l + 1; } /* * present nice uniform names */ l = strlen(name); for(j=0; j < l; j++) if (islower(name[j])) name[j] = (char)toupper(name[j]); return PFMLIB_SUCCESS; } int pfm_find_full_event(const char *v, pfmlib_event_t *e) { char *str, *p, *q; unsigned int j, mask; int ret = PFMLIB_SUCCESS; if (PFMLIB_INITIALIZED() == 0) return PFMLIB_ERR_NOINIT; if (v == NULL || e == NULL) return PFMLIB_ERR_INVAL; memset(e, 0, sizeof(*e)); /* * must copy string because we modify it when parsing */ str = strdup(v); if (!str) return PFMLIB_ERR_NOMEM; /* * find event. this function ignores ':' separator */ ret = pfm_find_event_byname(str, &e->event); if (ret) goto error; /* * get number of unit masks for event */ j = pfm_num_masks(e->event); /* * look for colon (unit mask separator) */ p = strchr(str, ':'); /* If no unit masks available and none specified, we're done */ if ((j == 0) && (p == NULL)) { free(str); return PFMLIB_SUCCESS; } ret = PFMLIB_ERR_UMASK; /* * error if: * - event has no unit mask and at least one is passed */ if (p && !j) goto error; /* * error if: * - event has unit masks, no default unit mask, and none is passed */ if (j && !p) { if (pfm_current->has_umask_default && pfm_current->has_umask_default(e->event)) { free(str); return PFMLIB_SUCCESS; } goto error; } /* skip : */ p++; /* * separator is passed but there is nothing behind it */ if (!*p) goto error; /* parse unit masks */ for( q = p; q ; p = q) { q = strchr(p,':'); if (q) *q++ = '\0'; /* * text or exact unit mask value match */ ret = pfm_do_find_event_mask(e->event, p, &mask); if (ret == 
PFMLIB_ERR_UMASK) { ret = pfm_add_numeric_masks(e, p); if (ret != PFMLIB_SUCCESS) break; } else if (ret == PFMLIB_SUCCESS) { /* * ignore duplicates */ ret = pfm_check_duplicates(e, mask); if (ret != PFMLIB_SUCCESS) { ret = PFMLIB_SUCCESS; continue; } if (e->num_masks == PFMLIB_MAX_MASKS_PER_EVENT) { ret = PFMLIB_ERR_TOOMANY; break; } e->unit_masks[e->num_masks] = mask; e->num_masks++; } } error: free(str); return ret; } papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_core.c000066400000000000000000000565031502707512200217440ustar00rootroot00000000000000/* * pfmlib_core.c : Intel Core PMU * * Copyright (c) 2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
 *
 *
 * This file implements support for Intel Core PMU as specified in the following document:
 * "IA-32 Intel Architecture Software Developer's Manual - Volume 3B: System
 * Programming Guide"
 *
 * Core PMU = architectural perfmon v2 + PEBS
 */

#include <sys/types.h>
#include <ctype.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>

/* public headers */
#include <perfmon/pfmlib.h>

/* private headers */
#include "pfmlib_priv.h"
#include "pfmlib_core_priv.h"
#include "core_events.h"

/* let's define some handy shortcuts! */
#define sel_event_select perfevtsel.sel_event_select
#define sel_unit_mask	perfevtsel.sel_unit_mask
#define sel_usr		perfevtsel.sel_usr
#define sel_os		perfevtsel.sel_os
#define sel_edge	perfevtsel.sel_edge
#define sel_pc		perfevtsel.sel_pc
#define sel_int		perfevtsel.sel_int
#define sel_en		perfevtsel.sel_en
#define sel_inv		perfevtsel.sel_inv
#define sel_cnt_mask	perfevtsel.sel_cnt_mask

#define is_pebs(i) (core_pe[i].pme_flags & PFMLIB_CORE_PEBS)

/*
 * Description of the PMC register mappings:
 *
 * 0  -> PMC0  -> PERFEVTSEL0
 * 1  -> PMC1  -> PERFEVTSEL1
 * 16 -> PMC16 -> FIXED_CTR_CTRL
 * 17 -> PMC17 -> PEBS_ENABLED
 *
 * Description of the PMD register mapping:
 *
 * 0  -> PMD0  -> PMC0
 * 1  -> PMD1  -> PMC1
 * 16 -> PMD16 -> FIXED_CTR0
 * 17 -> PMD17 -> FIXED_CTR1
 * 18 -> PMD18 -> FIXED_CTR2
 */
#define CORE_SEL_BASE	0x186
#define CORE_CTR_BASE	0xc1
#define FIXED_CTR_BASE	0x309

#define PFMLIB_CORE_ALL_FLAGS \
	(PFM_CORE_SEL_INV|PFM_CORE_SEL_EDGE)

static pfmlib_regmask_t core_impl_pmcs, core_impl_pmds;
static int highest_counter;

static int
pfm_core_detect(void)
{
	int ret;
	int family, model;
	char buffer[128];

	ret = __pfm_getcpuinfo_attr("vendor_id", buffer, sizeof(buffer));
	if (ret == -1)
		return PFMLIB_ERR_NOTSUPP;

	if (strcmp(buffer, "GenuineIntel"))
		return PFMLIB_ERR_NOTSUPP;

	ret = __pfm_getcpuinfo_attr("cpu family", buffer, sizeof(buffer));
	if (ret == -1)
		return PFMLIB_ERR_NOTSUPP;

	family = atoi(buffer);

	ret = __pfm_getcpuinfo_attr("model", buffer, sizeof(buffer));
	if (ret == -1)
		return PFMLIB_ERR_NOTSUPP;

	if (family != 6)
		return
PFMLIB_ERR_NOTSUPP; model = atoi(buffer); switch(model) { case 15: /* Merom */ case 23: /* Penryn */ case 29: /* Dunnington */ break; default: return PFMLIB_ERR_NOTSUPP; } return PFMLIB_SUCCESS; } static int pfm_core_init(void) { int i; pfm_regmask_set(&core_impl_pmcs, 0); pfm_regmask_set(&core_impl_pmcs, 1); pfm_regmask_set(&core_impl_pmcs, 16); pfm_regmask_set(&core_impl_pmcs, 17); pfm_regmask_set(&core_impl_pmds, 0); pfm_regmask_set(&core_impl_pmds, 1); pfm_regmask_set(&core_impl_pmds, 16); pfm_regmask_set(&core_impl_pmds, 17); pfm_regmask_set(&core_impl_pmds, 18); /* lbr */ pfm_regmask_set(&core_impl_pmds, 19); for(i=0; i < 8; i++) pfm_regmask_set(&core_impl_pmds, i); highest_counter = 18; return PFMLIB_SUCCESS; } static int pfm_core_is_fixed(pfmlib_event_t *e, unsigned int f) { unsigned int fl, flc, i; unsigned int mask = 0; fl = core_pe[e->event].pme_flags; /* * first pass: check if event as a whole supports fixed counters */ switch(f) { case 0: mask = PFMLIB_CORE_FIXED0; break; case 1: mask = PFMLIB_CORE_FIXED1; break; case 2: mask = PFMLIB_CORE_FIXED2_ONLY; break; default: return 0; } if (fl & mask) return 1; /* * second pass: check if unit mask support fixed counter * * reject if mask not found OR if not all unit masks have * same fixed counter mask */ flc = 0; for(i=0; i < e->num_masks; i++) { fl = core_pe[e->event].pme_umasks[e->unit_masks[i]].pme_flags; if (fl & mask) flc++; } return flc > 0 && flc == e->num_masks ? 1 : 0; } /* * IMPORTANT: the interface guarantees that pfp_pmds[] elements are returned in the order the events * were submitted. 
 */
static int
pfm_core_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_core_input_param_t *param, pfmlib_output_param_t *outp)
{
#define HAS_OPTIONS(x)	(cntrs && (cntrs[x].flags || cntrs[x].cnt_mask))
#define is_fixed_pmc(a)	(a == 16 || a == 17 || a == 18)

	pfmlib_core_counter_t *cntrs;
	pfm_core_sel_reg_t reg;
	pfmlib_event_t *e;
	pfmlib_reg_t *pc, *pd;
	pfmlib_regmask_t *r_pmcs;
	uint64_t val;
	unsigned long plm;
	unsigned long long fixed_ctr;
	unsigned int npc, npmc0, npmc1, nf2;
	unsigned int i, j, n, k, ucode, use_pebs = 0, done_pebs;
	unsigned int assign_pc[PMU_CORE_NUM_COUNTERS];
	unsigned int next_gen, last_gen;

	npc = npmc0 = npmc1 = nf2 = 0;
	e      = inp->pfp_events;
	pc     = outp->pfp_pmcs;
	pd     = outp->pfp_pmds;
	n      = inp->pfp_event_count;
	r_pmcs = &inp->pfp_unavail_pmcs;
	cntrs  = param ? param->pfp_core_counters : NULL;
	use_pebs = param ? param->pfp_core_pebs.pebs_used : 0;

	if (n > PMU_CORE_NUM_COUNTERS)
		return PFMLIB_ERR_TOOMANY;

	/*
	 * initialize to empty
	 */
	for(i=0; i < PMU_CORE_NUM_COUNTERS; i++)
		assign_pc[i] = -1;

	/*
	 * error checking
	 */
	for(i=0; i < n; i++) {
		/*
		 * only supports two priv levels for perf counters
		 */
		if (e[i].plm & (PFM_PLM1|PFM_PLM2))
			return PFMLIB_ERR_INVAL;

		/*
		 * check for valid flags
		 */
		if (cntrs && cntrs[i].flags & ~PFMLIB_CORE_ALL_FLAGS)
			return PFMLIB_ERR_INVAL;

		if (core_pe[e[i].event].pme_flags & PFMLIB_CORE_UMASK_NCOMBO
		    && e[i].num_masks > 1) {
			DPRINT("event does not support unit mask combination\n");
			return PFMLIB_ERR_NOASSIGN;
		}

		/*
		 * check event-level single register constraint (PMC0, PMC1, FIXED_CTR2)
		 * fail if more than one event requested for the same counter
		 */
		if (core_pe[e[i].event].pme_flags & PFMLIB_CORE_PMC0) {
			if (++npmc0 > 1) {
				DPRINT("two events compete for a PMC0\n");
				return PFMLIB_ERR_NOASSIGN;
			}
		}
		/*
		 * check if PMC1 is available and if only one event is dependent on it
		 */
		if (core_pe[e[i].event].pme_flags & PFMLIB_CORE_PMC1) {
			if (++npmc1 > 1) {
				DPRINT("two events compete for a PMC1\n");
				return PFMLIB_ERR_NOASSIGN;
			}
		}
		/*
		 *
UNHALTED_REFERENCE_CYCLES can only be measured on FIXED_CTR2
		 */
		if (core_pe[e[i].event].pme_flags & PFMLIB_CORE_FIXED2_ONLY) {
			if (++nf2 > 1) {
				DPRINT("two events compete for FIXED_CTR2\n");
				return PFMLIB_ERR_NOASSIGN;
			}
			if (HAS_OPTIONS(i)) {
				DPRINT("fixed counters do not support inversion/counter-mask\n");
				return PFMLIB_ERR_NOASSIGN;
			}
		}
		/*
		 * unit-mask level constraint checking (PMC0, PMC1, FIXED_CTR2)
		 */
		for(j=0; j < e[i].num_masks; j++) {
			unsigned int flags;

			flags = core_pe[e[i].event].pme_umasks[e[i].unit_masks[j]].pme_flags;
			if (flags & PFMLIB_CORE_FIXED2_ONLY) {
				if (++nf2 > 1) {
					DPRINT("two events compete for FIXED_CTR2\n");
					return PFMLIB_ERR_NOASSIGN;
				}
				if (HAS_OPTIONS(i)) {
					DPRINT("fixed counters do not support inversion/counter-mask\n");
					return PFMLIB_ERR_NOASSIGN;
				}
			}
		}
	}

	next_gen = 0; /* first generic counter */
	last_gen = 1; /* last generic counter */

	/*
	 * strongest constraint first: works only in IA32_PMC0, IA32_PMC1, FIXED_CTR2
	 *
	 * When PEBS is used, we pick the first PEBS event and
	 * place it into PMC0. Subsequent PEBS events will go
	 * in the other counters.
	 */
	done_pebs = 0;
	for(i=0; i < n; i++) {
		if ((core_pe[e[i].event].pme_flags & PFMLIB_CORE_PMC0)
		    || (use_pebs && pfm_core_is_pebs(e+i) && done_pebs == 0)) {
			if (pfm_regmask_isset(r_pmcs, 0))
				return PFMLIB_ERR_NOASSIGN;
			assign_pc[i] = 0;
			next_gen = 1;
			done_pebs = 1;
		}
		if (core_pe[e[i].event].pme_flags & PFMLIB_CORE_PMC1) {
			if (pfm_regmask_isset(r_pmcs, 1))
				return PFMLIB_ERR_NOASSIGN;
			assign_pc[i] = 1;
			if (next_gen == 1)
				next_gen = 2;
			else
				next_gen = 0;
		}
	}

	/*
	 * next constraint: fixed counters
	 *
	 * We abuse the mapping here for assign_pc to make it easier
	 * to provide the correct values for pd[].
	 * We use:
	 * 	- 16 : fixed counter 0 (pmc16, pmd16)
	 * 	- 17 : fixed counter 1 (pmc16, pmd17)
	 * 	- 18 : fixed counter 2 (pmc16, pmd18)
	 */
	fixed_ctr = pfm_regmask_isset(r_pmcs, 16) ?
0 : 0x7; if (fixed_ctr) { for(i=0; i < n; i++) { /* fixed counters do not support event options (filters) */ if (HAS_OPTIONS(i) || (use_pebs && pfm_core_is_pebs(e+i))) continue; if ((fixed_ctr & 0x1) && pfm_core_is_fixed(e+i, 0)) { assign_pc[i] = 16; fixed_ctr &= ~1; } if ((fixed_ctr & 0x2) && pfm_core_is_fixed(e+i, 1)) { assign_pc[i] = 17; fixed_ctr &= ~2; } if ((fixed_ctr & 0x4) && pfm_core_is_fixed(e+i, 2)) { assign_pc[i] = 18; fixed_ctr &= ~4; } } } /* * assign what is left */ for(i=0; i < n; i++) { if (assign_pc[i] == -1) { for(; next_gen <= last_gen; next_gen++) { DPRINT("i=%d next_gen=%d last=%d isset=%d\n", i, next_gen, last_gen, pfm_regmask_isset(r_pmcs, next_gen)); if (!pfm_regmask_isset(r_pmcs, next_gen)) break; } if (next_gen <= last_gen) assign_pc[i] = next_gen++; else { DPRINT("cannot assign generic counters\n"); return PFMLIB_ERR_NOASSIGN; } } } j = 0; /* setup fixed counters */ reg.val = 0; k = 0; for (i=0; i < n ; i++ ) { if (!is_fixed_pmc(assign_pc[i])) continue; val = 0; /* if plm is 0, then assume not specified per-event and use default */ plm = e[i].plm ? 
e[i].plm : inp->pfp_dfl_plm;
		if (plm & PFM_PLM0)
			val |= 1ULL;
		if (plm & PFM_PLM3)
			val |= 2ULL;
		val |= 1ULL << 3; /* force APIC int (kernel may force it anyway) */

		reg.val |= val << ((assign_pc[i]-16)<<2);
	}

	if (reg.val) {
		pc[npc].reg_num   = 16;
		pc[npc].reg_value = reg.val;
		pc[npc].reg_addr  = 0x38D;
		pc[npc].reg_alt_addr  = 0x38D;

		__pfm_vbprintf("[FIXED_CTRL(pmc%u)=0x%"PRIx64" pmi0=1 en0=0x%"PRIx64" pmi1=1 en1=0x%"PRIx64" pmi2=1 en2=0x%"PRIx64"] ",
			pc[npc].reg_num,
			reg.val,
			reg.val & 0x3ULL,
			(reg.val>>4) & 0x3ULL,
			(reg.val>>8) & 0x3ULL);

		if ((fixed_ctr & 0x1) == 0)
			__pfm_vbprintf("INSTRUCTIONS_RETIRED ");
		if ((fixed_ctr & 0x2) == 0)
			__pfm_vbprintf("UNHALTED_CORE_CYCLES ");
		if ((fixed_ctr & 0x4) == 0)
			__pfm_vbprintf("UNHALTED_REFERENCE_CYCLES ");
		__pfm_vbprintf("\n");

		npc++;

		if ((fixed_ctr & 0x1) == 0)
			__pfm_vbprintf("[FIXED_CTR0(pmd16)]\n");
		if ((fixed_ctr & 0x2) == 0)
			__pfm_vbprintf("[FIXED_CTR1(pmd17)]\n");
		if ((fixed_ctr & 0x4) == 0)
			__pfm_vbprintf("[FIXED_CTR2(pmd18)]\n");
	}

	for (i=0; i < n ; i++ ) {
		/* skip fixed counters */
		if (is_fixed_pmc(assign_pc[i]))
			continue;

		reg.val = 0; /* assume reserved bits are zeroed */

		/* if plm is 0, then assume not specified per-event and use default */
		plm = e[i].plm ? e[i].plm : inp->pfp_dfl_plm;

		val = core_pe[e[i].event].pme_code;

		reg.sel_event_select = val & 0xff;

		ucode = (val >> 8) & 0xff;

		for(k=0; k < e[i].num_masks; k++) {
			ucode |= core_pe[e[i].event].pme_umasks[e[i].unit_masks[k]].pme_ucode;
		}

		/*
		 * for events supporting Core specificity (self, both), a value
		 * of 0 for bits 15:14 (7:6 in our umask) is reserved, therefore we
		 * force to SELF if user did not specify anything
		 */
		if ((core_pe[e[i].event].pme_flags & PFMLIB_CORE_CSPEC)
		    && ((ucode & (0x3 << 6)) == 0)) {
			ucode |= 1 << 6;
		}

		/*
		 * for events supporting MESI, a value
		 * of 0 for bits 11:8 (0-3 in our umask) means nothing will be
		 * counted. Therefore, we force a default of 0xf (M,E,S,I).
*/ if ((core_pe[e[i].event].pme_flags & PFMLIB_CORE_MESI) && ((ucode & 0xf) == 0)) { ucode |= 0xf; } val |= ucode << 8; reg.sel_unit_mask = ucode; reg.sel_usr = plm & PFM_PLM3 ? 1 : 0; reg.sel_os = plm & PFM_PLM0 ? 1 : 0; reg.sel_en = 1; /* force enable bit to 1 */ reg.sel_int = 1; /* force APIC int to 1 */ reg.sel_cnt_mask = val >>24; reg.sel_inv = val >> 23; reg.sel_edge = val >> 18; if (cntrs) { if (!reg.sel_cnt_mask) { /* * counter mask is 8-bit wide, do not silently * wrap-around */ if (cntrs[i].cnt_mask > 255) return PFMLIB_ERR_INVAL; reg.sel_cnt_mask = cntrs[i].cnt_mask; } if (!reg.sel_edge) reg.sel_edge = cntrs[i].flags & PFM_CORE_SEL_EDGE ? 1 : 0; if (!reg.sel_inv) reg.sel_inv = cntrs[i].flags & PFM_CORE_SEL_INV ? 1 : 0; } pc[npc].reg_num = assign_pc[i]; pc[npc].reg_value = reg.val; pc[npc].reg_addr = CORE_SEL_BASE+assign_pc[i]; pc[npc].reg_alt_addr= CORE_SEL_BASE+assign_pc[i]; __pfm_vbprintf("[PERFEVTSEL%u(pmc%u)=0x%"PRIx64" event_sel=0x%x umask=0x%x os=%d usr=%d en=%d int=%d inv=%d edge=%d cnt_mask=%d] %s\n", pc[npc].reg_num, pc[npc].reg_num, reg.val, reg.sel_event_select, reg.sel_unit_mask, reg.sel_os, reg.sel_usr, reg.sel_en, reg.sel_int, reg.sel_inv, reg.sel_edge, reg.sel_cnt_mask, core_pe[e[i].event].pme_name); __pfm_vbprintf("[PMC%u(pmd%u)]\n", pc[npc].reg_num, pc[npc].reg_num); npc++; } /* * setup pmds: must be in the same order as the events */ for (i=0; i < n ; i++) { if (is_fixed_pmc(assign_pc[i])) { /* setup pd array */ pd[i].reg_num = assign_pc[i]; pd[i].reg_addr = FIXED_CTR_BASE+assign_pc[i]-16; pd[i].reg_alt_addr = 0x40000000+assign_pc[i]-16; } else { pd[i].reg_num = assign_pc[i]; pd[i].reg_addr = CORE_CTR_BASE+assign_pc[i]; /* index to use with RDPMC */ pd[i].reg_alt_addr = assign_pc[i]; } } outp->pfp_pmd_count = i; /* * setup PEBS_ENABLE */ if (use_pebs && done_pebs) { /* * check that PEBS_ENABLE is available */ if (pfm_regmask_isset(r_pmcs, 17)) return PFMLIB_ERR_NOASSIGN; pc[npc].reg_num = 17; pc[npc].reg_value = 1ULL; pc[npc].reg_addr = 
0x3f1; /* IA32_PEBS_ENABLE */ pc[npc].reg_alt_addr = 0x3f1; /* IA32_PEBS_ENABLE */ __pfm_vbprintf("[PEBS_ENABLE(pmc%u)=0x%"PRIx64" ena=%d]\n", pc[npc].reg_num, pc[npc].reg_value, pc[npc].reg_value & 0x1ull); npc++; } outp->pfp_pmc_count = npc; return PFMLIB_SUCCESS; } #if 0 static int pfm_core_dispatch_pebs(pfmlib_input_param_t *inp, pfmlib_core_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfmlib_event_t *e; pfm_core_sel_reg_t reg; unsigned int umask, npc, npd, k, plm; pfmlib_regmask_t *r_pmcs; pfmlib_reg_t *pc, *pd; int event; npc = outp->pfp_pmc_count; npd = outp->pfp_pmd_count; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; r_pmcs = &inp->pfp_unavail_pmcs; e = inp->pfp_events; /* * check for valid flags */ if (e[0].flags & ~PFMLIB_CORE_ALL_FLAGS) return PFMLIB_ERR_INVAL; /* * check event supports PEBS */ if (pfm_core_is_pebs(e) == 0) return PFMLIB_ERR_FEATCOMB; /* * check that PMC0 is available * PEBS works only on PMC0 * Some PEBS at-retirement events do require PMC0 anyway */ if (pfm_regmask_isset(r_pmcs, 0)) return PFMLIB_ERR_NOASSIGN; /* * check that PEBS_ENABLE is available */ if (pfm_regmask_isset(r_pmcs, 17)) return PFMLIB_ERR_NOASSIGN; reg.val = 0; /* assume reserved bits are zerooed */ event = e[0].event; /* if plm is 0, then assume not specified per-event and use default */ plm = e[0].plm ? e[0].plm : inp->pfp_dfl_plm; reg.sel_event_select = core_pe[event].pme_code & 0xff; umask = (core_pe[event].pme_code >> 8) & 0xff; for(k=0; k < e[0].num_masks; k++) { umask |= core_pe[event].pme_umasks[e[0].unit_masks[k]].pme_ucode; } reg.sel_unit_mask = umask; reg.sel_usr = plm & PFM_PLM3 ? 1 : 0; reg.sel_os = plm & PFM_PLM0 ? 1 : 0; reg.sel_en = 1; /* force enable bit to 1 */ reg.sel_int = 0; /* not INT for PEBS counter */ reg.sel_cnt_mask = mod_in->pfp_core_counters[0].cnt_mask; reg.sel_edge = mod_in->pfp_core_counters[0].flags & PFM_CORE_SEL_EDGE ? 1 : 0; reg.sel_inv = mod_in->pfp_core_counters[0].flags & PFM_CORE_SEL_INV ? 
1 : 0; pc[npc].reg_num = 0; pc[npc].reg_value = reg.val; pc[npc].reg_addr = CORE_SEL_BASE; pc[npc].reg_alt_addr= CORE_SEL_BASE; pd[npd].reg_num = 0; pd[npd].reg_addr = CORE_CTR_BASE; pd[npd].reg_alt_addr = 0; __pfm_vbprintf("[PERFEVTSEL%u(pmc%u)=0x%"PRIx64" event_sel=0x%x umask=0x%x os=%d usr=%d en=%d int=%d inv=%d edge=%d cnt_mask=%d] %s\n", pc[npc].reg_num, pc[npc].reg_num, reg.val, reg.sel_event_select, reg.sel_unit_mask, reg.sel_os, reg.sel_usr, reg.sel_en, reg.sel_int, reg.sel_inv, reg.sel_edge, reg.sel_cnt_mask, core_pe[e[0].event].pme_name); __pfm_vbprintf("[PMC%u(pmd%u)]\n", pd[npd].reg_num, pd[npd].reg_num); npc++; npd++; /* * setup PEBS_ENABLE */ pc[npc].reg_num = 17; pc[npc].reg_value = 1ULL; pc[npc].reg_addr = 0x3f1; /* IA32_PEBS_ENABLE */ pc[npc].reg_alt_addr = 0x3f1; /* IA32_PEBS_ENABLE */ __pfm_vbprintf("[PEBS_ENABLE(pmc%u)=0x%"PRIx64" ena=%d]\n", pc[npc].reg_num, pc[npc].reg_value, pc[npc].reg_value & 0x1ull); npc++; /* number of evtsel/ctr registers programmed */ outp->pfp_pmc_count = npc; outp->pfp_pmd_count = npd; return PFMLIB_SUCCESS; } #endif static int pfm_core_dispatch_events(pfmlib_input_param_t *inp, void *model_in, pfmlib_output_param_t *outp, void *model_out) { pfmlib_core_input_param_t *mod_in = (pfmlib_core_input_param_t *)model_in; if (inp->pfp_dfl_plm & (PFM_PLM1|PFM_PLM2)) { DPRINT("invalid plm=%x\n", inp->pfp_dfl_plm); return PFMLIB_ERR_INVAL; } return pfm_core_dispatch_counters(inp, mod_in, outp); } static int pfm_core_get_event_code(unsigned int i, unsigned int cnt, int *code) { if (cnt != PFMLIB_CNT_FIRST && (cnt > highest_counter || !pfm_regmask_isset(&core_impl_pmds, cnt))) return PFMLIB_ERR_INVAL; *code = core_pe[i].pme_code; return PFMLIB_SUCCESS; } static void pfm_core_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { unsigned int n, i; unsigned int has_f0, has_f1, has_f2; memset(counters, 0, sizeof(*counters)); n = core_pe[j].pme_numasks; has_f0 = has_f1 = has_f2 = 0; for (i=0; i < n; i++) { if 
(core_pe[j].pme_umasks[i].pme_flags & PFMLIB_CORE_FIXED0)
			has_f0 = 1;
		if (core_pe[j].pme_umasks[i].pme_flags & PFMLIB_CORE_FIXED1)
			has_f1 = 1;
		if (core_pe[j].pme_umasks[i].pme_flags & PFMLIB_CORE_FIXED2_ONLY)
			has_f2 = 1;
	}

	if (has_f0 == 0)
		has_f0 = core_pe[j].pme_flags & PFMLIB_CORE_FIXED0;
	if (has_f1 == 0)
		has_f1 = core_pe[j].pme_flags & PFMLIB_CORE_FIXED1;
	if (has_f2 == 0)
		has_f2 = core_pe[j].pme_flags & PFMLIB_CORE_FIXED2_ONLY;

	if (has_f0)
		pfm_regmask_set(counters, 16);
	if (has_f1)
		pfm_regmask_set(counters, 17);
	if (has_f2)
		pfm_regmask_set(counters, 18);

	/* the event on FIXED_CTR2 is exclusive CPU_CLK_UNHALTED:REF */
	if (!has_f2) {
		pfm_regmask_set(counters, 0);
		pfm_regmask_set(counters, 1);
		if (core_pe[j].pme_flags & PFMLIB_CORE_PMC0)
			pfm_regmask_clr(counters, 1);
		if (core_pe[j].pme_flags & PFMLIB_CORE_PMC1)
			pfm_regmask_clr(counters, 0);
	}
}

static void
pfm_core_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs)
{
	*impl_pmcs = core_impl_pmcs;
}

static void
pfm_core_get_impl_pmds(pfmlib_regmask_t *impl_pmds)
{
	*impl_pmds = core_impl_pmds;
}

static void
pfm_core_get_impl_counters(pfmlib_regmask_t *impl_counters)
{
	pfm_regmask_set(impl_counters, 0);
	pfm_regmask_set(impl_counters, 1);
	pfm_regmask_set(impl_counters, 16);
	pfm_regmask_set(impl_counters, 17);
	pfm_regmask_set(impl_counters, 18);
}

/*
 * Even though CPUID 0xa returns in eax the actual counter
 * width, the architecture specifies that writes are limited
 * to lower 32-bits. As such, only the lower 32 bits have full
 * degree of freedom. That is the "useable" counter width.
 */
#define PMU_CORE_COUNTER_WIDTH	32

static void
pfm_core_get_hw_counter_width(unsigned int *width)
{
	/*
	 * Even though CPUID 0xa returns in eax the actual counter
	 * width, the architecture specifies that writes are limited
	 * to lower 32-bits. As such, only the lower 32 bits have full
	 * degree of freedom. That is the "useable" counter width.
*/ *width = PMU_CORE_COUNTER_WIDTH; } static char * pfm_core_get_event_name(unsigned int i) { return core_pe[i].pme_name; } static int pfm_core_get_event_description(unsigned int ev, char **str) { char *s; s = core_pe[ev].pme_desc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static char * pfm_core_get_event_mask_name(unsigned int ev, unsigned int midx) { return core_pe[ev].pme_umasks[midx].pme_uname; } static int pfm_core_get_event_mask_desc(unsigned int ev, unsigned int midx, char **str) { char *s; s = core_pe[ev].pme_umasks[midx].pme_udesc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static unsigned int pfm_core_get_num_event_masks(unsigned int ev) { return core_pe[ev].pme_numasks; } static int pfm_core_get_event_mask_code(unsigned int ev, unsigned int midx, unsigned int *code) { *code =core_pe[ev].pme_umasks[midx].pme_ucode; return PFMLIB_SUCCESS; } static int pfm_core_get_cycle_event(pfmlib_event_t *e) { e->event = PME_CORE_UNHALTED_CORE_CYCLES; return PFMLIB_SUCCESS; } static int pfm_core_get_inst_retired(pfmlib_event_t *e) { e->event = PME_CORE_INSTRUCTIONS_RETIRED; return PFMLIB_SUCCESS; } int pfm_core_is_pebs(pfmlib_event_t *e) { unsigned int i, n=0; if (e == NULL || e->event >= PME_CORE_EVENT_COUNT) return 0; if (core_pe[e->event].pme_flags & PFMLIB_CORE_PEBS) return 1; /* * ALL unit mask must support PEBS for this test to return true */ for(i=0; i < e->num_masks; i++) { /* check for valid unit mask */ if (e->unit_masks[i] >= core_pe[e->event].pme_numasks) return 0; if (core_pe[e->event].pme_umasks[e->unit_masks[i]].pme_flags & PFMLIB_CORE_PEBS) n++; } return n > 0 && n == e->num_masks; } pfm_pmu_support_t core_support={ .pmu_name = "Intel Core", .pmu_type = PFMLIB_CORE_PMU, .pme_count = PME_CORE_EVENT_COUNT, .pmc_count = 4, .pmd_count = 14, .num_cnt = 5, .get_event_code = pfm_core_get_event_code, .get_event_name = pfm_core_get_event_name, .get_event_counters = pfm_core_get_event_counters, 
.dispatch_events = pfm_core_dispatch_events, .pmu_detect = pfm_core_detect, .pmu_init = pfm_core_init, .get_impl_pmcs = pfm_core_get_impl_pmcs, .get_impl_pmds = pfm_core_get_impl_pmds, .get_impl_counters = pfm_core_get_impl_counters, .get_hw_counter_width = pfm_core_get_hw_counter_width, .get_event_desc = pfm_core_get_event_description, .get_num_event_masks = pfm_core_get_num_event_masks, .get_event_mask_name = pfm_core_get_event_mask_name, .get_event_mask_code = pfm_core_get_event_mask_code, .get_event_mask_desc = pfm_core_get_event_mask_desc, .get_cycle_event = pfm_core_get_cycle_event, .get_inst_retired_event = pfm_core_get_inst_retired }; papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_core_priv.h000066400000000000000000000053441502707512200230060ustar00rootroot00000000000000/* * Copyright (c) 2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* * This file is part of libpfm, a performance monitoring support library for * applications on Linux/ia64. */ #ifndef __PFMLIB_CORE_PRIV_H__ #define __PFMLIB_CORE_PRIV_H__ #define PFMLIB_CORE_MAX_UMASK 32 typedef struct { char *pme_uname; /* unit mask name */ char *pme_udesc; /* event/umask description */ unsigned int pme_ucode; /* unit mask code */ unsigned int pme_flags; /* unit mask flags */ } pme_core_umask_t; typedef struct { char *pme_name; /* event name */ char *pme_desc; /* event description */ unsigned int pme_code; /* event code */ unsigned int pme_numasks; /* number of umasks */ unsigned int pme_flags; /* flags */ pme_core_umask_t pme_umasks[PFMLIB_CORE_MAX_UMASK]; /* umask desc */ } pme_core_entry_t; /* * pme_flags value (event and unit mask) */ /* event or unit-mask level constraints */ #define PFMLIB_CORE_FIXED0 0x02 /* event supported by FIXED_CTR0, can work on generic counters */ #define PFMLIB_CORE_FIXED1 0x04 /* event supported by FIXED_CTR1, can work on generic counters */ #define PFMLIB_CORE_FIXED2_ONLY 0x08 /* works only on FIXED_CTR2 */ /* event-level constraints */ #define PFMLIB_CORE_UMASK_NCOMBO 0x01 /* unit mask cannot be combined (default: combination ok) */ #define PFMLIB_CORE_CSPEC 0x40 /* requires a core specification */ #define PFMLIB_CORE_PEBS 0x20 /* support PEBS (precise event) */ #define PFMLIB_CORE_PMC0 0x10 /* works only on IA32_PMC0 */ #define PFMLIB_CORE_PMC1 0x80 /* works only on IA32_PMC1 */ #define PFMLIB_CORE_MESI 0x100 /* requires MESI */ #endif /* __PFMLIB_CORE_PRIV_H__ */ papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_coreduo.c000066400000000000000000000327371502707512200224570ustar00rootroot00000000000000/* * pfmlib_coreduo.c : Intel Core Duo/Solo * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including 
without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 *
 * This file implements support for Intel Core Duo/Solo PMU as specified in the
 * following document:
 * "IA-32 Intel Architecture Software Developer's Manual - Volume 3B: System
 * Programming Guide"
 *
 * Core Duo/Solo PMU = architectural perfmon v1 + model specific events
 */

#include <sys/types.h>
#include <ctype.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>

/* public headers */
#include <perfmon/pfmlib.h>

/* private headers */
#include "pfmlib_priv.h"
#include "pfmlib_coreduo_priv.h"
#include "coreduo_events.h"

/* let's define some handy shortcuts!
*/ #define sel_event_select perfevtsel.sel_event_select #define sel_unit_mask perfevtsel.sel_unit_mask #define sel_usr perfevtsel.sel_usr #define sel_os perfevtsel.sel_os #define sel_edge perfevtsel.sel_edge #define sel_pc perfevtsel.sel_pc #define sel_int perfevtsel.sel_int #define sel_en perfevtsel.sel_en #define sel_inv perfevtsel.sel_inv #define sel_cnt_mask perfevtsel.sel_cnt_mask /* * Description of the PMC register mappings: * * 0 -> PMC0 -> PERFEVTSEL0 * 1 -> PMC1 -> PERFEVTSEL1 * 16 -> PMC16 -> FIXED_CTR_CTRL * 17 -> PMC17 -> PEBS_ENABLED * * Description of the PMD register mapping: * * 0 -> PMD0 -> PMC0 * 1 -> PMD1 -> PMC1 * 16 -> PMD2 -> FIXED_CTR0 * 17 -> PMD3 -> FIXED_CTR1 * 18 -> PMD4 -> FIXED_CTR2 */ #define COREDUO_SEL_BASE 0x186 #define COREDUO_CTR_BASE 0xc1 #define PFMLIB_COREDUO_ALL_FLAGS \ (PFM_COREDUO_SEL_INV|PFM_COREDUO_SEL_EDGE) static pfmlib_regmask_t coreduo_impl_pmcs, coreduo_impl_pmds; static int highest_counter; static int pfm_coreduo_detect(void) { char buffer[128]; int family, model; int ret; ret = __pfm_getcpuinfo_attr("vendor_id", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; if (strcmp(buffer, "GenuineIntel")) return PFMLIB_ERR_NOTSUPP; ret = __pfm_getcpuinfo_attr("cpu family", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; family = atoi(buffer); ret = __pfm_getcpuinfo_attr("model", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; model = atoi(buffer); return family == 6 && model == 14 ? PFMLIB_SUCCESS : PFMLIB_ERR_NOTSUPP; } static int pfm_coreduo_init(void) { pfm_regmask_set(&coreduo_impl_pmcs, 0); pfm_regmask_set(&coreduo_impl_pmcs, 1); pfm_regmask_set(&coreduo_impl_pmds, 0); pfm_regmask_set(&coreduo_impl_pmds, 1); highest_counter = 1; return PFMLIB_SUCCESS; } /* * IMPORTANT: the interface guarantees that pfp_pmds[] elements are returned in the order the events * were submitted. 
*/ static int pfm_coreduo_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_coreduo_input_param_t *param, pfmlib_output_param_t *outp) { #define HAS_OPTIONS(x) (cntrs && (cntrs[x].flags || cntrs[x].cnt_mask)) pfm_coreduo_counter_t *cntrs; pfm_coreduo_sel_reg_t reg; pfmlib_event_t *e; pfmlib_reg_t *pc, *pd; pfmlib_regmask_t *r_pmcs; uint64_t val; unsigned long plm; unsigned int npc, npmc0, npmc1, nf2; unsigned int i, n, k, ucode; unsigned int assign_pc[PMU_COREDUO_NUM_COUNTERS]; unsigned int next_gen, last_gen; npc = npmc0 = npmc1 = nf2 = 0; e = inp->pfp_events; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; n = inp->pfp_event_count; r_pmcs = &inp->pfp_unavail_pmcs; cntrs = param ? param->pfp_coreduo_counters : NULL; if (n > PMU_COREDUO_NUM_COUNTERS) return PFMLIB_ERR_TOOMANY; /* * initialize to empty */ for(i=0; i < PMU_COREDUO_NUM_COUNTERS; i++) assign_pc[i] = -1; /* * error checking */ for(i=0; i < n; i++) { /* * only supports two priv levels for perf counters */ if (e[i].plm & (PFM_PLM1|PFM_PLM2)) return PFMLIB_ERR_INVAL; /* * check for valid flags */ if (cntrs && cntrs[i].flags & ~PFMLIB_COREDUO_ALL_FLAGS) return PFMLIB_ERR_INVAL; /* * check event-level single register constraint (PMC0, PMC1, FIXED_CTR2) * fail if more than one event requested for the same counter */ if (coreduo_pe[e[i].event].pme_flags & PFMLIB_COREDUO_PMC0) { if (++npmc0 > 1) { DPRINT("two events compete for a PMC0\n"); return PFMLIB_ERR_NOASSIGN; } } /* * check if PMC1 is available and if only one event is dependent on it */ if (coreduo_pe[e[i].event].pme_flags & PFMLIB_COREDUO_PMC1) { if (++npmc1 > 1) { DPRINT("two events compete for a PMC1\n"); return PFMLIB_ERR_NOASSIGN; } } } next_gen = 0; /* first generic counter */ last_gen = 1; /* last generic counter */ /* * strongest constraint first: works only in IA32_PMC0, IA32_PMC1 */ for(i=0; i < n; i++) { if ((coreduo_pe[e[i].event].pme_flags & PFMLIB_COREDUO_PMC0)) { if (pfm_regmask_isset(r_pmcs, 0)) return PFMLIB_ERR_NOASSIGN; assign_pc[i] = 0; 
next_gen++; } if (coreduo_pe[e[i].event].pme_flags & PFMLIB_COREDUO_PMC1) { if (pfm_regmask_isset(r_pmcs, 1)) return PFMLIB_ERR_NOASSIGN; assign_pc[i] = 1; next_gen = (next_gen+1) % PMU_COREDUO_NUM_COUNTERS; } } /* * assign what is left */ for(i=0; i < n; i++) { if (assign_pc[i] == -1) { for(; next_gen <= last_gen; next_gen++) { DPRINT("i=%d next_gen=%d last=%d isset=%d\n", i, next_gen, last_gen, pfm_regmask_isset(r_pmcs, next_gen)); if (!pfm_regmask_isset(r_pmcs, next_gen)) break; } if (next_gen <= last_gen) assign_pc[i] = next_gen++; else { DPRINT("cannot assign generic counters\n"); return PFMLIB_ERR_NOASSIGN; } } } for (i=0; i < n ; i++ ) { reg.val = 0; /* assume reserved bits are zeroed */ /* if plm is 0, then assume not specified per-event and use default */ plm = e[i].plm ? e[i].plm : inp->pfp_dfl_plm; val = coreduo_pe[e[i].event].pme_code; reg.sel_event_select = val & 0xff; ucode = (val >> 8) & 0xff; for(k=0; k < e[i].num_masks; k++) { ucode |= coreduo_pe[e[i].event].pme_umasks[e[i].unit_masks[k]].pme_ucode; } /* * for events supporting Core specificity (self, both), a value * of 0 for bits 15:14 (7:6 in our umask) is reserved, therefore we * force to SELF if user did not specify anything */ if ((coreduo_pe[e[i].event].pme_flags & PFMLIB_COREDUO_CSPEC) && ((ucode & (0x3 << 6)) == 0)) { ucode |= 1 << 6; } /* * for events supporting MESI, a value * of 0 for bits 11:8 (0-3 in our umask) means nothing will be * counted. Therefore, we force a default of 0xf (M,E,S,I). */ if ((coreduo_pe[e[i].event].pme_flags & PFMLIB_COREDUO_MESI) && ((ucode & 0xf) == 0)) { ucode |= 0xf; } val |= ucode << 8; reg.sel_unit_mask = ucode; reg.sel_usr = plm & PFM_PLM3 ? 1 : 0; reg.sel_os = plm & PFM_PLM0 ? 
1 : 0; reg.sel_en = 1; /* force enable bit to 1 */ reg.sel_int = 1; /* force APIC int to 1 */ reg.sel_cnt_mask = val >>24; reg.sel_inv = val >> 23; reg.sel_edge = val >> 18; if (cntrs) { if (!reg.sel_cnt_mask) { /* * counter mask is 8-bit wide, do not silently * wrap-around */ if (cntrs[i].cnt_mask > 255) return PFMLIB_ERR_INVAL; reg.sel_cnt_mask = cntrs[i].cnt_mask; } if (!reg.sel_edge) reg.sel_edge = cntrs[i].flags & PFM_COREDUO_SEL_EDGE ? 1 : 0; if (!reg.sel_inv) reg.sel_inv = cntrs[i].flags & PFM_COREDUO_SEL_INV ? 1 : 0; } pc[npc].reg_num = assign_pc[i]; pc[npc].reg_value = reg.val; pc[npc].reg_addr = COREDUO_SEL_BASE+assign_pc[i]; pc[npc].reg_alt_addr= COREDUO_SEL_BASE+assign_pc[i]; __pfm_vbprintf("[PERFEVTSEL%u(pmc%u)=0x%"PRIx64" event_sel=0x%x umask=0x%x os=%d usr=%d en=%d int=%d inv=%d edge=%d cnt_mask=%d] %s\n", pc[npc].reg_num, pc[npc].reg_num, reg.val, reg.sel_event_select, reg.sel_unit_mask, reg.sel_os, reg.sel_usr, reg.sel_en, reg.sel_int, reg.sel_inv, reg.sel_edge, reg.sel_cnt_mask, coreduo_pe[e[i].event].pme_name); __pfm_vbprintf("[PMC%u(pmd%u)]\n", pc[npc].reg_num, pc[npc].reg_num); npc++; } /* * setup pmds: must be in the same order as the events */ for (i=0; i < n ; i++) { pd[i].reg_num = assign_pc[i]; pd[i].reg_addr = COREDUO_CTR_BASE+assign_pc[i]; /* index to use with RDPMC */ pd[i].reg_alt_addr = assign_pc[i]; } outp->pfp_pmd_count = i; outp->pfp_pmc_count = npc; return PFMLIB_SUCCESS; } static int pfm_coreduo_dispatch_events(pfmlib_input_param_t *inp, void *model_in, pfmlib_output_param_t *outp, void *model_out) { pfmlib_coreduo_input_param_t *mod_in = (pfmlib_coreduo_input_param_t *)model_in; if (inp->pfp_dfl_plm & (PFM_PLM1|PFM_PLM2)) { DPRINT("invalid plm=%x\n", inp->pfp_dfl_plm); return PFMLIB_ERR_INVAL; } return pfm_coreduo_dispatch_counters(inp, mod_in, outp); } static int pfm_coreduo_get_event_code(unsigned int i, unsigned int cnt, int *code) { if (cnt != PFMLIB_CNT_FIRST && (cnt > highest_counter || 
!pfm_regmask_isset(&coreduo_impl_pmds, cnt))) return PFMLIB_ERR_INVAL; *code = coreduo_pe[i].pme_code; return PFMLIB_SUCCESS; } static void pfm_coreduo_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { memset(counters, 0, sizeof(*counters)); pfm_regmask_set(counters, 0); pfm_regmask_set(counters, 1); if (coreduo_pe[j].pme_flags & PFMLIB_COREDUO_PMC0) pfm_regmask_clr(counters, 1); if (coreduo_pe[j].pme_flags & PFMLIB_COREDUO_PMC1) pfm_regmask_clr(counters, 0); } static void pfm_coreduo_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs) { *impl_pmcs = coreduo_impl_pmcs; } static void pfm_coreduo_get_impl_pmds(pfmlib_regmask_t *impl_pmds) { *impl_pmds = coreduo_impl_pmds; } static void pfm_coreduo_get_impl_counters(pfmlib_regmask_t *impl_counters) { /* all pmds are counters */ *impl_counters = coreduo_impl_pmds; } /* * Even though, CPUID 0xa returns in eax the actual counter * width, the architecture specifies that writes are limited * to lower 32-bits. As such, only the lower 32-bit have full * degree of freedom. That is the "useable" counter width. */ static void pfm_coreduo_get_hw_counter_width(unsigned int *width) { /* * Even though, CPUID 0xa returns in eax the actual counter * width, the architecture specifies that writes are limited * to lower 32-bits. As such, only the lower 31 bits have full * degree of freedom. That is the "useable" counter width. 
*/ *width = 32; } static char * pfm_coreduo_get_event_name(unsigned int i) { return coreduo_pe[i].pme_name; } static int pfm_coreduo_get_event_description(unsigned int ev, char **str) { char *s; s = coreduo_pe[ev].pme_desc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static char * pfm_coreduo_get_event_mask_name(unsigned int ev, unsigned int midx) { return coreduo_pe[ev].pme_umasks[midx].pme_uname; } static int pfm_coreduo_get_event_mask_desc(unsigned int ev, unsigned int midx, char **str) { char *s; s = coreduo_pe[ev].pme_umasks[midx].pme_udesc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static unsigned int pfm_coreduo_get_num_event_masks(unsigned int ev) { return coreduo_pe[ev].pme_numasks; } static int pfm_coreduo_get_event_mask_code(unsigned int ev, unsigned int midx, unsigned int *code) { *code = coreduo_pe[ev].pme_umasks[midx].pme_ucode; return PFMLIB_SUCCESS; } static int pfm_coreduo_get_cycle_event(pfmlib_event_t *e) { e->event = PME_COREDUO_UNHALTED_CORE_CYCLES; return PFMLIB_SUCCESS; } static int pfm_coreduo_get_inst_retired(pfmlib_event_t *e) { e->event = PME_COREDUO_INSTRUCTIONS_RETIRED; return PFMLIB_SUCCESS; } pfm_pmu_support_t coreduo_support={ .pmu_name = "Intel Core Duo/Solo", .pmu_type = PFMLIB_COREDUO_PMU, .pme_count = PME_COREDUO_EVENT_COUNT, .pmc_count = 2, .pmd_count = 2, .num_cnt = 2, .get_event_code = pfm_coreduo_get_event_code, .get_event_name = pfm_coreduo_get_event_name, .get_event_counters = pfm_coreduo_get_event_counters, .dispatch_events = pfm_coreduo_dispatch_events, .pmu_detect = pfm_coreduo_detect, .pmu_init = pfm_coreduo_init, .get_impl_pmcs = pfm_coreduo_get_impl_pmcs, .get_impl_pmds = pfm_coreduo_get_impl_pmds, .get_impl_counters = pfm_coreduo_get_impl_counters, .get_hw_counter_width = pfm_coreduo_get_hw_counter_width, .get_event_desc = pfm_coreduo_get_event_description, .get_num_event_masks = pfm_coreduo_get_num_event_masks, .get_event_mask_name = 
pfm_coreduo_get_event_mask_name, .get_event_mask_code = pfm_coreduo_get_event_mask_code, .get_event_mask_desc = pfm_coreduo_get_event_mask_desc, .get_cycle_event = pfm_coreduo_get_cycle_event, .get_inst_retired_event = pfm_coreduo_get_inst_retired }; papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_coreduo_priv.h000066400000000000000000000044441502707512200235160ustar00rootroot00000000000000/* * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Based on: * Copyright (c) 2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #ifndef __PFMLIB_COREDUO_PRIV_H__ #define __PFMLIB_COREDUO_PRIV_H__ #define PFMLIB_COREDUO_MAX_UMASK 16 typedef struct { char *pme_uname; /* unit mask name */ char *pme_udesc; /* event/umask description */ unsigned int pme_ucode; /* unit mask code */ unsigned int pme_flags; /* unit mask flags */ } pme_coreduo_umask_t; typedef struct { char *pme_name; /* event name */ char *pme_desc; /* event description */ unsigned int pme_code; /* event code */ unsigned int pme_numasks; /* number of umasks */ unsigned int pme_flags; /* flags */ pme_coreduo_umask_t pme_umasks[PFMLIB_COREDUO_MAX_UMASK]; /* umask desc */ } pme_coreduo_entry_t; /* * pme_flags value (event and unit mask) */ /* event-level constraints */ #define PFMLIB_COREDUO_CSPEC 0x02 /* requires a core specification */ #define PFMLIB_COREDUO_PMC0 0x04 /* works only on IA32_PMC0 */ #define PFMLIB_COREDUO_PMC1 0x08 /* works only on IA32_PMC1 */ #define PFMLIB_COREDUO_MESI 0x10 /* requires MESI */ #endif /* __PFMLIB_COREDUO_PRIV_H__ */ papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_crayx2.c000066400000000000000000000350041502707512200222150ustar00rootroot00000000000000/* * Copyright (c) 2007 Cray Inc. * Contributed by Steve Kaufmann based on code from * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include "pfmlib_priv.h" #include "pfmlib_crayx2_priv.h" #include "crayx2_events.h" #define CRAYX2_NO_REDUNDANT 0 /* if>0 an error if chip:ctr:ev repeated */ typedef enum { CTR_REDUNDANT = -2, /* event on counter repeated */ CTR_CONFLICT = -1, /* event on counter not the same as previous */ CTR_OK = 0 /* event on counter open */ } counter_use_t; static int pfm_crayx2_get_event_code (unsigned int i, unsigned int cnt, int *code) { if (cnt != PFMLIB_CNT_FIRST && cnt > crayx2_support.num_cnt) { DPRINT ("return: count %d exceeded #counters\n", cnt); return PFMLIB_ERR_INVAL; } else if (i >= crayx2_support.pme_count) { DPRINT ("return: event index %d exceeded #events\n", i); return PFMLIB_ERR_INVAL; } *code = crayx2_pe[i].pme_code; DPRINT ("return: event code is %#x\n", *code); return PFMLIB_SUCCESS; } static char * pfm_crayx2_get_event_name (unsigned int i) { if (i >= crayx2_support.pme_count) { DPRINT ("return: event index %d exceeded #events\n", i); return NULL; } DPRINT ("return: event name '%s'\n", crayx2_pe[i].pme_name); return (char *) crayx2_pe[i].pme_name; } static void pfm_crayx2_get_event_counters (unsigned int j, pfmlib_regmask_t *counters) { unsigned int i; memset (counters, 0, sizeof (*counters)); DPRINT ("event counters for %d counters\n", PMU_CRAYX2_NUM_COUNTERS); for (i=0; ipfp_event_count, inp->pfp_dfl_plm, inp->pfp_flags); for (i=0; ipfp_event_count; i++) { DPRINT (" %3d: event %3d plm %#3x flags %#8lx num_masks %d\n", i, inp->pfp_events[i].event, 
inp->pfp_events[i].plm, inp->pfp_events[i].flags, inp->pfp_events[i].num_masks); for (j=0; j < inp->pfp_events[i].num_masks; j++) { DPRINT (" unit-mask-%2d: %d\n", j, inp->pfp_events[i].unit_masks[j]); } } } /* Better have at least one event specified and not exceed limit. */ if (inp->pfp_event_count == 0) { DPRINT ("return: event count is 0\n"); return PFMLIB_ERR_INVAL; } else if (inp->pfp_event_count > PMU_CRAYX2_NUM_COUNTERS) { DPRINT ("return: event count exceeds max %d\n", PMU_CRAYX2_NUM_COUNTERS); return PFMLIB_ERR_TOOMANY; } memset (Pused, 0, sizeof(Pused)); memset (Cused, 0, sizeof(Cused)); memset (Mused, 0, sizeof(Mused)); /* Loop through the input parameters describing the events. */ for (i=0; i < inp->pfp_event_count; i++) { unsigned int code, chip, ctr, ev, chipno; counter_use_t ret; /* Acquire details describing this event code: * o which substrate/chip it is on * o which counter on the chip * o which event on the counter */ code = inp->pfp_events[i].event; chip = crayx2_pe[code].pme_chip; ctr = crayx2_pe[code].pme_ctr; ev = crayx2_pe[code].pme_event; chipno = crayx2_pe[code].pme_chipno; DPRINT ("%3d: code %3d chip %1d ctr %2d ev %1d chipno %2d\n", code, i, chip, ctr, ev, chipno); /* These privilege levels are not recognized. */ if (inp->pfp_events[i].plm != 0) { DPRINT ("%3d: privilege level %#x per event not allowed\n", i, inp->pfp_events[i].plm); return PFMLIB_ERR_INVAL; } /* No masks exist. */ if (inp->pfp_events[i].num_masks > 0) { DPRINT ("too many masks for event\n"); return PFMLIB_ERR_TOOMANY; } /* The event code. Set-up the event selection mask for * the PMC of the respective chip. Check if more than * one event on the same counter is selected. 
*/ if (chip == PME_CRAYX2_CHIP_CPU) { ret = pfm_crayx2_counter_use (ctr, ev, &Pused[chipno], &Pevents); } else if (chip == PME_CRAYX2_CHIP_CACHE) { ret = pfm_crayx2_counter_use (ctr, ev, &Cused[chipno], &Cevents); } else if (chip == PME_CRAYX2_CHIP_MEMORY) { ret = pfm_crayx2_counter_use (ctr, ev, &Mused[chipno], &Mevents); } else { DPRINT ("return: invalid chip\n"); return PFMLIB_ERR_INVAL; } /* Each chip's counter can only count one event. */ if (ret == CTR_CONFLICT) { DPRINT ("return: ctr conflict\n"); return PFMLIB_ERR_EVTINCOMP; } else if (ret == CTR_REDUNDANT) { #if (CRAYX2_NO_REDUNDANT != 0) DPRINT ("return: ctr redundant\n"); return PFMLIB_ERR_EVTMANY; #else DPRINT ("warning: ctr redundant\n"); #endif /* CRAYX2_NO_REDUNDANT */ } /* Set up the output PMDs. */ outp->pfp_pmds[npmds].reg_num = crayx2_pe[code].pme_base + ctr + chipno*crayx2_pe[code].pme_nctrs; outp->pfp_pmds[npmds].reg_addr = 0; outp->pfp_pmds[npmds].reg_alt_addr = 0; outp->pfp_pmds[npmds].reg_value = 0; npmds++; } outp->pfp_pmd_count = npmds; if (PFMLIB_DEBUG ( )) { DPRINT ("P event mask %#16lx\n", Pevents); DPRINT ("C event mask %#16lx\n", Cevents); DPRINT ("M event mask %#16lx\n", Mevents); DPRINT ("PMDs: pmd_count %d\n", outp->pfp_pmd_count); for (i=0; i < outp->pfp_pmd_count; i++) { DPRINT (" %3d: reg_value %3lld reg_num %3d reg_addr %#16llx\n", i, outp->pfp_pmds[i].reg_value, outp->pfp_pmds[i].reg_num, outp->pfp_pmds[i].reg_addr); } } /* Set up the PMC basics for the chips that will be doing * some counting. */ if (pfm_crayx2_chip_use (Pused, PME_CRAYX2_CPU_CHIPS) > 0) { uint64_t Pctrl = PFM_CPU_START; uint64_t Pen = PFM_ENABLE_RW; if (inp->pfp_dfl_plm & (PFM_PLM0 | PFM_PLM1)) { Pen |= PFM_ENABLE_KERNEL; } if (inp->pfp_dfl_plm & PFM_PLM2) { Pen |= PFM_ENABLE_EXL; } if (inp->pfp_dfl_plm & PFM_PLM3) { Pen |= PFM_ENABLE_USER; } /* First of three CPU PMC registers. 
*/ base_pmc = PMU_CRAYX2_CPU_PMC_BASE; outp->pfp_pmcs[npmcs].reg_value = Pctrl; outp->pfp_pmcs[npmcs].reg_num = base_pmc + PMC_CONTROL; outp->pfp_pmcs[npmcs].reg_addr = 0; outp->pfp_pmcs[npmcs].reg_alt_addr = 0; npmcs++; outp->pfp_pmcs[npmcs].reg_value = Pevents; outp->pfp_pmcs[npmcs].reg_num = base_pmc + PMC_EVENTS; outp->pfp_pmcs[npmcs].reg_addr = 0; outp->pfp_pmcs[npmcs].reg_alt_addr = 0; npmcs++; outp->pfp_pmcs[npmcs].reg_value = Pen; outp->pfp_pmcs[npmcs].reg_num = base_pmc + PMC_ENABLE; outp->pfp_pmcs[npmcs].reg_addr = 0; outp->pfp_pmcs[npmcs].reg_alt_addr = 0; npmcs++; } if (pfm_crayx2_chip_use (Cused, PME_CRAYX2_CACHE_CHIPS) > 0) { uint64_t Cctrl = PFM_CACHE_START; uint64_t Cen = PFM_ENABLE_RW; /* domains N/A */ /* Second of three Cache PMC registers. */ base_pmc = PMU_CRAYX2_CACHE_PMC_BASE; outp->pfp_pmcs[npmcs].reg_value = Cctrl; outp->pfp_pmcs[npmcs].reg_num = base_pmc + PMC_CONTROL; outp->pfp_pmcs[npmcs].reg_addr = 0; outp->pfp_pmcs[npmcs].reg_alt_addr = 0; npmcs++; outp->pfp_pmcs[npmcs].reg_value = Cevents; outp->pfp_pmcs[npmcs].reg_num = base_pmc + PMC_EVENTS; outp->pfp_pmcs[npmcs].reg_addr = 0; outp->pfp_pmcs[npmcs].reg_alt_addr = 0; npmcs++; outp->pfp_pmcs[npmcs].reg_value = Cen; outp->pfp_pmcs[npmcs].reg_num = base_pmc + PMC_ENABLE; outp->pfp_pmcs[npmcs].reg_addr = 0; outp->pfp_pmcs[npmcs].reg_alt_addr = 0; npmcs++; } if (pfm_crayx2_chip_use (Mused, PME_CRAYX2_MEMORY_CHIPS) > 0) { uint64_t Mctrl = PFM_MEM_START; uint64_t Men = PFM_ENABLE_RW; /* domains N/A */ /* Third of three Memory PMC registers. 
*/ base_pmc = PMU_CRAYX2_MEMORY_PMC_BASE; outp->pfp_pmcs[npmcs].reg_value = Mctrl; outp->pfp_pmcs[npmcs].reg_num = base_pmc + PMC_CONTROL; outp->pfp_pmcs[npmcs].reg_addr = 0; outp->pfp_pmcs[npmcs].reg_alt_addr = 0; npmcs++; outp->pfp_pmcs[npmcs].reg_value = Mevents; outp->pfp_pmcs[npmcs].reg_num = base_pmc + PMC_EVENTS; outp->pfp_pmcs[npmcs].reg_addr = 0; outp->pfp_pmcs[npmcs].reg_alt_addr = 0; npmcs++; outp->pfp_pmcs[npmcs].reg_value = Men; outp->pfp_pmcs[npmcs].reg_num = base_pmc + PMC_ENABLE; outp->pfp_pmcs[npmcs].reg_addr = 0; outp->pfp_pmcs[npmcs].reg_alt_addr = 0; npmcs++; } outp->pfp_pmc_count = npmcs; if (PFMLIB_DEBUG ( )) { DPRINT ("PMCs: pmc_count %d\n", outp->pfp_pmc_count); for (i=0; i < outp->pfp_pmc_count; i++) { DPRINT (" %3d: reg_value %#16llx reg_num %3d reg_addr %#16llx\n", i, outp->pfp_pmcs[i].reg_value, outp->pfp_pmcs[i].reg_num, outp->pfp_pmcs[i].reg_addr); } } return PFMLIB_SUCCESS; } static int pfm_crayx2_pmu_detect (void) { char buffer[128]; int ret; DPRINT ("detect the PMU attributes\n"); ret = __pfm_getcpuinfo_attr ("vendor_id", buffer, sizeof(buffer)); if (ret != 0 || strcasecmp (buffer, "Cray") != 0) { DPRINT ("return: no 'Cray' vendor_id\n"); return PFMLIB_ERR_NOTSUPP; } ret = __pfm_getcpuinfo_attr ("type", buffer, sizeof(buffer)); if (ret != 0 || strcasecmp (buffer, "craynv2") != 0) { DPRINT ("return: no 'craynv2' type\n"); return PFMLIB_ERR_NOTSUPP; } DPRINT ("Cray X2 nv2 found\n"); return PFMLIB_SUCCESS; } static void pfm_crayx2_get_impl_pmcs (pfmlib_regmask_t *impl_pmcs) { unsigned int i; DPRINT ("entered with PMC_COUNT %d\n", PMU_CRAYX2_PMC_COUNT); for (i=0; ievent = PME_CRAYX2_CYCLES; DPRINT ("return: event code for cycles %#x\n", e->event); return PFMLIB_SUCCESS; } static int pfm_crayx2_get_inst_retired (pfmlib_event_t *e) { e->event = PME_CRAYX2_INSTR_GRADUATED; DPRINT ("return: event code for retired instr %#x\n", e->event); return PFMLIB_SUCCESS; } /* Register the constants and the access functions. 
*/ pfm_pmu_support_t crayx2_support = { .pmu_name = PMU_CRAYX2_NAME, .pmu_type = PFMLIB_CRAYX2_PMU, .pme_count = PME_CRAYX2_EVENT_COUNT, .pmc_count = PMU_CRAYX2_PMC_COUNT, .pmd_count = PMU_CRAYX2_PMD_COUNT, .num_cnt = PMU_CRAYX2_NUM_COUNTERS, .get_event_code = pfm_crayx2_get_event_code, .get_event_name = pfm_crayx2_get_event_name, .get_event_counters = pfm_crayx2_get_event_counters, .dispatch_events = pfm_crayx2_dispatch_events, .pmu_detect = pfm_crayx2_pmu_detect, .get_impl_pmcs = pfm_crayx2_get_impl_pmcs, .get_impl_pmds = pfm_crayx2_get_impl_pmds, .get_impl_counters = pfm_crayx2_get_impl_counters, .get_hw_counter_width = pfm_crayx2_get_hw_counter_width, .get_event_desc = pfm_crayx2_get_event_desc, .get_num_event_masks = pfm_crayx2_get_num_event_masks, .get_event_mask_name = pfm_crayx2_get_event_mask_name, .get_event_mask_code = pfm_crayx2_get_event_mask_code, .get_event_mask_desc = pfm_crayx2_get_event_mask_desc, .get_cycle_event = pfm_crayx2_get_cycle_event, .get_inst_retired_event = pfm_crayx2_get_inst_retired }; papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_crayx2_priv.h000066400000000000000000000070701502707512200232640ustar00rootroot00000000000000/* * Copyright (c) 2007 Cray Inc. * Contributed by Steve Kaufmann based on code from * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PMLIB_CRAYX2_PRIV_H__ #define __PMLIB_CRAYX2_PRIV_H__ 1 #include /* Chips (substrates) that contain performance counters. */ #define PME_CRAYX2_CHIP_CPU 1 #define PME_CRAYX2_CHIP_CACHE 2 #define PME_CRAYX2_CHIP_MEMORY 3 /* Number of chips monitored per single process. */ #define PME_CRAYX2_CPU_CHIPS 1 #define PME_CRAYX2_CACHE_CHIPS 1 #define PME_CRAYX2_MEMORY_CHIPS 16 /* Number of events per physical counter. */ #define PME_CRAYX2_EVENTS_PER_COUNTER 4 /* Number of counters per chip (CPU, L2 Cache, Memory) */ #define PME_CRAYX2_CPU_CTRS_PER_CHIP PFM_CPU_PMD_COUNT #define PME_CRAYX2_CACHE_CTRS_PER_CHIP PFM_CACHE_PMD_PER_CHIP #define PME_CRAYX2_MEMORY_CTRS_PER_CHIP PFM_MEM_PMD_PER_CHIP /* Number of events per chip (CPU, L2 Cache, Memory) */ #define PME_CRAYX2_CPU_EVENTS \ (PME_CRAYX2_CPU_CHIPS*PME_CRAYX2_CPU_CTRS_PER_CHIP*PME_CRAYX2_EVENTS_PER_COUNTER) #define PME_CRAYX2_CACHE_EVENTS \ (PME_CRAYX2_CACHE_CHIPS*PME_CRAYX2_CACHE_CTRS_PER_CHIP*PME_CRAYX2_EVENTS_PER_COUNTER) #define PME_CRAYX2_MEMORY_EVENTS \ (PME_CRAYX2_MEMORY_CHIPS*PME_CRAYX2_MEMORY_CTRS_PER_CHIP*PME_CRAYX2_EVENTS_PER_COUNTER) /* No unit masks are (currently) used. */ #define PFMLIB_CRAYX2_MAX_UMASK 1 typedef struct { const char *pme_uname; /* unit mask name */ const char *pme_udesc; /* event/umask description */ unsigned int pme_ucode; /* unit mask code */ } pme_crayx2_umask_t; /* Description of each performance counter event available on all * substrates. Listed contiguously for all substrates. 
*/ typedef struct { const char *pme_name; /* event name */ const char *pme_desc; /* event description */ unsigned int pme_code; /* event code */ unsigned int pme_flags; /* flags */ unsigned int pme_numasks; /* number of unit masks */ pme_crayx2_umask_t pme_umasks[PFMLIB_CRAYX2_MAX_UMASK]; /* unit masks (chip numbers) */ unsigned int pme_chip; /* substrate/chip containing counter */ unsigned int pme_ctr; /* counter on chip */ unsigned int pme_event; /* event number on counter */ unsigned int pme_chipno; /* chip# upon which the event lies */ unsigned int pme_base; /* PMD base reg_num for this chip */ unsigned int pme_nctrs; /* PMDs/counters per chip */ unsigned int pme_nchips; /* number of chips per process */ } pme_crayx2_entry_t; #endif /* __PMLIB_CRAYX2_PRIV_H__ */ papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_gen_ia32.c000066400000000000000000000566261502707512200224110ustar00rootroot00000000000000/* * pfmlib_gen_ia32.c : Intel architectural PMU v1, v2, v3 * * The file provides support for the Intel architectural PMU v1 and v2. * * Copyright (c) 2005-2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * * This file implements support for the IA-32 architectural PMU as specified * in the following document: * "IA-32 Intel Architecture Software Developer's Manual - Volume 3B: System * Programming Guide" */ #include #include #include #include #include /* public headers */ #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_gen_ia32_priv.h" /* architecture private */ #include "gen_ia32_events.h" /* architected event table */ /* let's define some handy shortcuts! */ #define sel_event_select perfevtsel.sel_event_select #define sel_unit_mask perfevtsel.sel_unit_mask #define sel_usr perfevtsel.sel_usr #define sel_os perfevtsel.sel_os #define sel_edge perfevtsel.sel_edge #define sel_pc perfevtsel.sel_pc #define sel_int perfevtsel.sel_int #define sel_any perfevtsel.sel_any #define sel_en perfevtsel.sel_en #define sel_inv perfevtsel.sel_inv #define sel_cnt_mask perfevtsel.sel_cnt_mask pfm_pmu_support_t *gen_support; /* * Description of the PMC/PMD register mappings used by * this module (as reported in pfmlib_reg_t.reg_num) * * For V1 (up to 16 generic counters 0-15): * * 0 -> PMC0 -> PERFEVTSEL0 -> MSR @ 0x186 * 1 -> PMC1 -> PERFEVTSEL1 -> MSR @ 0x187 * ... * n -> PMCn -> PERFEVTSELn -> MSR @ 0x186+n * * 0 -> PMD0 -> IA32_PMC0 -> MSR @ 0xc1 * 1 -> PMD1 -> IA32_PMC1 -> MSR @ 0xc2 * ... * n -> PMDn -> IA32_PMCn -> MSR @ 0xc1+n * * For V2 (up to 16 generic and 16 fixed counters): * * 0 -> PMC0 -> PERFEVTSEL0 -> MSR @ 0x186 * 1 -> PMC1 -> PERFEVTSEL1 -> MSR @ 0x187 * ... * 15 -> PMC15 -> PERFEVTSEL15 -> MSR @ 0x186+15 * * 16 -> PMC16 -> IA32_FIXED_CTR_CTRL -> MSR @ 0x38d * * 0 -> PMD0 -> IA32_PMC0 -> MSR @ 0xc1 * 1 -> PMD1 -> IA32_PMC1 -> MSR @ 0xc2 * ... 
* 15 -> PMD15 -> IA32_PMC15 -> MSR @ 0xc1+15 * * 16 -> PMD16 -> IA32_FIXED_CTR0 -> MSR @ 0x309 * 17 -> PMD17 -> IA32_FIXED_CTR1 -> MSR @ 0x30a * ... * n -> PMDn -> IA32_FIXED_CTRn -> MSR @ 0x309+n */ #define GEN_IA32_SEL_BASE 0x186 #define GEN_IA32_CTR_BASE 0xc1 #define GEN_IA32_FIXED_CTR_BASE 0x309 #define FIXED_PMD_BASE 16 #define PFMLIB_GEN_IA32_ALL_FLAGS \ (PFM_GEN_IA32_SEL_INV|PFM_GEN_IA32_SEL_EDGE|PFM_GEN_IA32_SEL_ANYTHR) static char * pfm_gen_ia32_get_event_name(unsigned int i); static pme_gen_ia32_entry_t *gen_ia32_pe; static int gen_ia32_cycle_event, gen_ia32_inst_retired_event; static unsigned int num_fixed_cnt, num_gen_cnt, pmu_version; #ifdef __i386__ static inline void cpuid(unsigned int op, unsigned int *eax, unsigned int *ebx, unsigned int *ecx, unsigned int *edx) { /* * because ebx is used in Pic mode, we need to save/restore because * cpuid clobbers it. I could not figure out a way to get ebx out in * one cpuid instruction. To extract ebx, we need to move it to another * register (here eax) */ __asm__("pushl %%ebx;cpuid; popl %%ebx" :"=a" (*eax) : "a" (op) : "ecx", "edx"); __asm__("pushl %%ebx;cpuid; movl %%ebx, %%eax;popl %%ebx" :"=a" (*ebx) : "a" (op) : "ecx", "edx"); } #else static inline void cpuid(unsigned int op, unsigned int *eax, unsigned int *ebx, unsigned int *ecx, unsigned int *edx) { __asm__("cpuid" : "=a" (*eax), "=b" (*ebx), "=c" (*ecx), "=d" (*edx) : "0" (op), "c"(0)); } #endif static pfmlib_regmask_t gen_ia32_impl_pmcs, gen_ia32_impl_pmds; /* * create architected event table */ static int create_arch_event_table(unsigned int mask) { pme_gen_ia32_entry_t *pe; unsigned int i, num_events = 0; unsigned int m; /* * first pass: count the number of supported events */ m = mask; for(i=0; i < 7; i++, m>>=1) { if ((m & 0x1) == 0) num_events++; } gen_ia32_support.pme_count = num_events; gen_ia32_pe = calloc(num_events, sizeof(pme_gen_ia32_entry_t)); if (gen_ia32_pe == NULL) return PFMLIB_ERR_NOTSUPP; /* * second pass: populate the table */ 
gen_ia32_cycle_event = gen_ia32_inst_retired_event = -1; m = mask; for(i=0, pe = gen_ia32_pe; i < 7; i++, m>>=1) { if ((m & 0x1) == 0) { *pe = gen_ia32_all_pe[i]; /* * setup default event: cycles and inst_retired */ if (i == PME_GEN_IA32_UNHALTED_CORE_CYCLES) gen_ia32_cycle_event = pe - gen_ia32_pe; if (i == PME_GEN_IA32_INSTRUCTIONS_RETIRED) gen_ia32_inst_retired_event = pe - gen_ia32_pe; pe++; } } return PFMLIB_SUCCESS; } static int check_arch_pmu(int family) { union { unsigned int val; pmu_eax_t eax; pmu_edx_t edx; } eax, ecx, edx, ebx; /* * check family number to reject for processors * older than Pentium (family=5). Those processors * did not have the CPUID instruction */ if (family < 5) return PFMLIB_ERR_NOTSUPP; /* * check if CPU supports 0xa function of CPUID * 0xa started with Core Duo. Needed to detect if * architected PMU is present */ cpuid(0x0, &eax.val, &ebx.val, &ecx.val, &edx.val); if (eax.val < 0xa) return PFMLIB_ERR_NOTSUPP; /* * extract architected PMU information */ cpuid(0xa, &eax.val, &ebx.val, &ecx.val, &edx.val); /* * version must be greater than zero */ return eax.eax.version < 1 ? 
PFMLIB_ERR_NOTSUPP : PFMLIB_SUCCESS; } static int pfm_gen_ia32_detect(void) { int ret, family; char buffer[128]; ret = __pfm_getcpuinfo_attr("vendor_id", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; if (strcmp(buffer, "GenuineIntel")) return PFMLIB_ERR_NOTSUPP; ret = __pfm_getcpuinfo_attr("cpu family", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; family = atoi(buffer); return check_arch_pmu(family); } static int pfm_gen_ia32_init(void) { union { unsigned int val; pmu_eax_t eax; pmu_edx_t edx; } eax, ecx, edx, ebx; unsigned int num_cnt, i; int ret; /* * extract architected PMU information */ if (forced_pmu == PFMLIB_NO_PMU) { cpuid(0xa, &eax.val, &ebx.val, &ecx.val, &edx.val); } else { /* * when forced, simulate v2 * with 2 generic and 3 fixed counters */ eax.eax.version = 3; eax.eax.num_cnt = 2; eax.eax.cnt_width = 40; eax.eax.ebx_length = 0; /* unused */ ebx.val = 0; edx.edx.num_cnt = 3; edx.edx.cnt_width = 40; } num_cnt = eax.eax.num_cnt; pmu_version = eax.eax.version; /* * populate impl_pm* bitmasks for generic counters */ for(i=0; i < num_cnt; i++) { pfm_regmask_set(&gen_ia32_impl_pmcs, i); pfm_regmask_set(&gen_ia32_impl_pmds, i); } /* check for fixed counters */ if (pmu_version >= 2) { /* * As described in IA-32 Developer's manual vol 3b * in section 18.12.2.1, early processors supporting * V2 may report invalid information concerning the fixed * counters. So we compensate for this here by forcing * num_cnt to 3. 
*/ if (edx.edx.num_cnt == 0) edx.edx.num_cnt = 3; for(i=0; i < edx.edx.num_cnt; i++) pfm_regmask_set(&gen_ia32_impl_pmds, FIXED_PMD_BASE+i); if (i) pfm_regmask_set(&gen_ia32_impl_pmcs, 16); } num_gen_cnt = eax.eax.num_cnt; num_fixed_cnt = edx.edx.num_cnt; gen_ia32_support.pmc_count = num_gen_cnt + (num_fixed_cnt > 0); gen_ia32_support.pmd_count = num_gen_cnt + num_fixed_cnt; gen_ia32_support.num_cnt = num_gen_cnt + num_fixed_cnt; __pfm_vbprintf("Intel architected PMU: version=%d num_gen=%u num_fixed=%u pmc=%u pmd=%d\n", pmu_version, num_gen_cnt,num_fixed_cnt, gen_ia32_support.pmc_count, gen_ia32_support.pmd_count); ret = create_arch_event_table(ebx.val); if (ret != PFMLIB_SUCCESS) return ret; gen_support = &gen_ia32_support; return PFMLIB_SUCCESS; } static int pfm_gen_ia32_dispatch_counters_v1(pfmlib_input_param_t *inp, pfmlib_gen_ia32_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfmlib_gen_ia32_input_param_t *param = mod_in; pfmlib_gen_ia32_counter_t *cntrs; pfm_gen_ia32_sel_reg_t reg; pfmlib_event_t *e; pfmlib_reg_t *pc, *pd; pfmlib_regmask_t *r_pmcs; unsigned long plm; unsigned int i, j, cnt, k, ucode, val; unsigned int assign[PMU_GEN_IA32_MAX_COUNTERS]; e = inp->pfp_events; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; cnt = inp->pfp_event_count; r_pmcs = &inp->pfp_unavail_pmcs; cntrs = param ? 
param->pfp_gen_ia32_counters : NULL; if (PFMLIB_DEBUG()) { for (j=0; j < cnt; j++) { DPRINT("ev[%d]=%s\n", j, gen_ia32_pe[e[j].event].pme_name); } } if (cnt > gen_support->pmd_count) return PFMLIB_ERR_TOOMANY; for(i=0, j=0; j < cnt; j++) { if (e[j].plm & (PFM_PLM1|PFM_PLM2)) { DPRINT("event=%d invalid plm=%d\n", e[j].event, e[j].plm); return PFMLIB_ERR_INVAL; } if (e[j].flags & ~PFMLIB_GEN_IA32_ALL_FLAGS) { DPRINT("event=%d invalid flags=0x%lx\n", e[j].event, e[j].flags); return PFMLIB_ERR_INVAL; } if (cntrs && pmu_version != 3 && (cntrs[j].flags & PFM_GEN_IA32_SEL_ANYTHR)) { DPRINT("event=%d anythread requires architectural perfmon v3", e[j].event); return PFMLIB_ERR_INVAL; } /* * exclude restricted registers from assignment */ while(i < gen_support->pmc_count && pfm_regmask_isset(r_pmcs, i)) i++; if (i == gen_support->pmc_count) return PFMLIB_ERR_TOOMANY; /* * events can be assigned to any counter */ assign[j] = i++; } for (j=0; j < cnt ; j++ ) { reg.val = 0; /* assume reserved bits are zeroed */ /* if plm is 0, then assume not specified per-event and use default */ plm = e[j].plm ? e[j].plm : inp->pfp_dfl_plm; val = gen_ia32_pe[e[j].event].pme_code; reg.sel_event_select = val & 0xff; ucode = (val >> 8) & 0xff; for(k=0; k < e[j].num_masks; k++) ucode |= gen_ia32_pe[e[j].event].pme_umasks[e[j].unit_masks[k]].pme_ucode; val |= ucode << 8; reg.sel_unit_mask = ucode; /* use 8 least significant bits */ reg.sel_usr = plm & PFM_PLM3 ? 1 : 0; reg.sel_os = plm & PFM_PLM0 ? 1 : 0; reg.sel_en = 1; /* force enable bit to 1 */ reg.sel_int = 1; /* force APIC int to 1 */ reg.sel_cnt_mask = val >> 24; reg.sel_inv = val >> 23; reg.sel_any = val >> 21; reg.sel_edge = val >> 18; if (cntrs) { if (!reg.sel_cnt_mask) { /* * counter mask is 8-bit wide, do not silently * wrap-around */ if (cntrs[j].cnt_mask > 255) return PFMLIB_ERR_INVAL; reg.sel_cnt_mask = cntrs[j].cnt_mask; } if (!reg.sel_edge) reg.sel_edge = cntrs[j].flags & PFM_GEN_IA32_SEL_EDGE ?
1 : 0; if (!reg.sel_inv) reg.sel_inv = cntrs[j].flags & PFM_GEN_IA32_SEL_INV ? 1 : 0; } pc[j].reg_num = assign[j]; pc[j].reg_addr = GEN_IA32_SEL_BASE+assign[j]; pc[j].reg_value = reg.val; pd[j].reg_num = assign[j]; pd[j].reg_addr = GEN_IA32_CTR_BASE+assign[j]; __pfm_vbprintf("[PERFEVTSEL%u(pmc%u)=0x%llx event_sel=0x%x umask=0x%x os=%d usr=%d en=%d int=%d inv=%d edge=%d cnt_mask=%d] %s\n", assign[j], assign[j], reg.val, reg.sel_event_select, reg.sel_unit_mask, reg.sel_os, reg.sel_usr, reg.sel_en, reg.sel_int, reg.sel_inv, reg.sel_edge, reg.sel_cnt_mask, gen_ia32_pe[e[j].event].pme_name); __pfm_vbprintf("[PMC%u(pmd%u)]\n", pd[j].reg_num, pd[j].reg_num); } /* number of evtsel registers programmed */ outp->pfp_pmc_count = cnt; outp->pfp_pmd_count = cnt; return PFMLIB_SUCCESS; } static const char *fixed_event_names[]={ "INSTRUCTIONS_RETIRED", "UNHALTED_CORE_CYCLES ", "UNHALTED_REFERENCE_CYCLES " }; #define MAX_EVENT_NAMES (sizeof(fixed_event_names)/sizeof(char *)) static int pfm_gen_ia32_dispatch_counters_v23(pfmlib_input_param_t *inp, pfmlib_gen_ia32_input_param_t *param, pfmlib_output_param_t *outp) { #define HAS_OPTIONS(x) (cntrs && (cntrs[i].flags || cntrs[i].cnt_mask)) #define is_fixed_pmc(a) (a > 15) pfmlib_gen_ia32_counter_t *cntrs; pfm_gen_ia32_sel_reg_t reg; pfmlib_event_t *e; pfmlib_reg_t *pc, *pd; pfmlib_regmask_t *r_pmcs; uint64_t val; unsigned long plm; unsigned int fixed_ctr_mask; unsigned int npc = 0; unsigned int i, j, n, k, ucode; unsigned int assign[PMU_GEN_IA32_MAX_COUNTERS]; unsigned int next_gen, last_gen; e = inp->pfp_events; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; n = inp->pfp_event_count; r_pmcs = &inp->pfp_unavail_pmcs; cntrs = param ? 
param->pfp_gen_ia32_counters : NULL; if (n > gen_support->pmd_count) return PFMLIB_ERR_TOOMANY; /* * initialize to empty */ for(i=0; i < n; i++) assign[i] = -1; /* * error checking */ for(j=0; j < n; j++) { /* * only supports two priv levels for perf counters */ if (e[j].plm & (PFM_PLM1|PFM_PLM2)) return PFMLIB_ERR_INVAL; /* * check for valid flags */ if (cntrs && cntrs[j].flags & ~PFMLIB_GEN_IA32_ALL_FLAGS) return PFMLIB_ERR_INVAL; if (cntrs && pmu_version != 3 && (cntrs[j].flags & PFM_GEN_IA32_SEL_ANYTHR)) { DPRINT("event=%d anythread requires architectural perfmon v3", e[j].event); return PFMLIB_ERR_INVAL; } } next_gen = 0; /* first generic counter */ last_gen = num_gen_cnt - 1; /* last generic counter */ fixed_ctr_mask = (1 << num_fixed_cnt) - 1; /* * first constraint: fixed counters (try using them first) */ if (fixed_ctr_mask) { for(i=0; i < n; i++) { /* fixed counters do not support event options (filters) */ if (HAS_OPTIONS(i)) { if (pmu_version != 3) continue; if (cntrs[i].flags != PFM_GEN_IA32_SEL_ANYTHR) continue; /* ok for ANYTHR */ } for(j=0; j < num_fixed_cnt; j++) { if ((fixed_ctr_mask & (1<<j)) && gen_ia32_pe[e[i].event].pme_fixed == (FIXED_PMD_BASE+j)) { assign[i] = FIXED_PMD_BASE+j; fixed_ctr_mask &= ~(1<<j); break; } } } } /* * second constraint: generic counters */ for(i=0; i < n; i++) { if (assign[i] != -1) continue; for(; next_gen <= last_gen; next_gen++) { if (!pfm_regmask_isset(r_pmcs, next_gen)) break; } if (next_gen > last_gen) return PFMLIB_ERR_NOASSIGN; assign[i] = next_gen++; } /* * setup fixed counters */ reg.val = 0; for (i=0; i < n ; i++ ) { if (!is_fixed_pmc(assign[i])) continue; val = 0; /* if plm is 0, then assume not specified per-event and use default */ plm = e[i].plm ? e[i].plm : inp->pfp_dfl_plm; if (plm & PFM_PLM0) val |= 1ULL; if (plm & PFM_PLM3) val |= 2ULL; /* only possible for v3 */ if (cntrs && cntrs[i].flags & PFM_GEN_IA32_SEL_ANYTHR) val |= 4ULL; val |= 1ULL << 3; /* force APIC int (kernel may force it anyway) */ reg.val |= val << ((assign[i]-FIXED_PMD_BASE)<<2); /* setup pd array */ pd[i].reg_num = assign[i]; pd[i].reg_addr = GEN_IA32_FIXED_CTR_BASE+assign[i]-FIXED_PMD_BASE; } if (reg.val) { pc[npc].reg_num = 16; pc[npc].reg_value = reg.val; pc[npc].reg_addr = 0x38D; __pfm_vbprintf("[FIXED_CTRL(pmc%u)=0x%"PRIx64, pc[npc].reg_num, reg.val); for(i=0; i < num_fixed_cnt; i++) { if (pmu_version != 3) __pfm_vbprintf(" pmi%d=1 en%d=0x%"PRIx64, i, i, (reg.val >> (i*4)) & 0x3ULL); else __pfm_vbprintf(" pmi%d=1 en%d=0x%"PRIx64 " any%d=%"PRId64, i, i, (reg.val >> (i*4)) & 0x3ULL, i, !!((reg.val >> (i*4)) & 0x4ULL)); } __pfm_vbprintf("] ");
for(i=0; i < num_fixed_cnt; i++) { if ((fixed_ctr_mask & (0x1 << i)) == 0) { if (i < MAX_EVENT_NAMES) __pfm_vbprintf("%s ", fixed_event_names[i]); else __pfm_vbprintf("??? "); } } __pfm_vbprintf("\n"); npc++; for (i=0; i < n ; i++ ) { if (!is_fixed_pmc(assign[i])) continue; __pfm_vbprintf("[FIXED_CTR%u(pmd%u)]\n", pd[i].reg_num, pd[i].reg_num); } } for (i=0; i < n ; i++ ) { /* skip fixed counters */ if (is_fixed_pmc(assign[i])) continue; reg.val = 0; /* assume reserved bits are zeroed */ /* if plm is 0, then assume not specified per-event and use default */ plm = e[i].plm ? e[i].plm : inp->pfp_dfl_plm; val = gen_ia32_pe[e[i].event].pme_code; reg.sel_event_select = val & 0xff; ucode = (val >> 8) & 0xff; for(k=0; k < e[i].num_masks; k++) ucode |= gen_ia32_pe[e[i].event].pme_umasks[e[i].unit_masks[k]].pme_ucode; val |= ucode << 8; reg.sel_unit_mask = ucode; reg.sel_usr = plm & PFM_PLM3 ? 1 : 0; reg.sel_os = plm & PFM_PLM0 ? 1 : 0; reg.sel_en = 1; /* force enable bit to 1 */ reg.sel_int = 1; /* force APIC int to 1 */ reg.sel_cnt_mask = val >> 24; reg.sel_inv = val >> 23; reg.sel_any = val >> 21; reg.sel_edge = val >> 18; if (cntrs) { if (!reg.sel_cnt_mask) { /* * counter mask is 8-bit wide, do not silently * wrap-around */ if (cntrs[i].cnt_mask > 255) return PFMLIB_ERR_INVAL; reg.sel_cnt_mask = cntrs[i].cnt_mask; } if (!reg.sel_edge) reg.sel_edge = cntrs[i].flags & PFM_GEN_IA32_SEL_EDGE ? 1 : 0; if (!reg.sel_inv) reg.sel_inv = cntrs[i].flags & PFM_GEN_IA32_SEL_INV ? 1 : 0; if (!reg.sel_any) reg.sel_any = cntrs[i].flags & PFM_GEN_IA32_SEL_ANYTHR ?
1 : 0; } pc[npc].reg_num = assign[i]; pc[npc].reg_value = reg.val; pc[npc].reg_addr = GEN_IA32_SEL_BASE+assign[i]; pd[i].reg_num = assign[i]; pd[i].reg_addr = GEN_IA32_CTR_BASE+assign[i]; if (pmu_version < 3) __pfm_vbprintf("[PERFEVTSEL%u(pmc%u)=0x%"PRIx64" event_sel=0x%x umask=0x%x os=%d usr=%d en=%d int=%d inv=%d edge=%d cnt_mask=%d] %s\n", pc[npc].reg_num, pc[npc].reg_num, reg.val, reg.sel_event_select, reg.sel_unit_mask, reg.sel_os, reg.sel_usr, reg.sel_en, reg.sel_int, reg.sel_inv, reg.sel_edge, reg.sel_cnt_mask, gen_ia32_pe[e[i].event].pme_name); else __pfm_vbprintf("[PERFEVTSEL%u(pmc%u)=0x%"PRIx64" event_sel=0x%x umask=0x%x os=%d usr=%d en=%d int=%d inv=%d edge=%d cnt_mask=%d anythr=%d] %s\n", pc[npc].reg_num, pc[npc].reg_num, reg.val, reg.sel_event_select, reg.sel_unit_mask, reg.sel_os, reg.sel_usr, reg.sel_en, reg.sel_int, reg.sel_inv, reg.sel_edge, reg.sel_cnt_mask, reg.sel_any, gen_ia32_pe[e[i].event].pme_name); __pfm_vbprintf("[PMC%u(pmd%u)]\n", pd[i].reg_num, pd[i].reg_num); npc++; } /* number of evtsel/ctr registers programmed */ outp->pfp_pmc_count = npc; outp->pfp_pmd_count = n; return PFMLIB_SUCCESS; } static int pfm_gen_ia32_dispatch_events(pfmlib_input_param_t *inp, void *model_in, pfmlib_output_param_t *outp, void *model_out) { pfmlib_gen_ia32_input_param_t *mod_in = model_in; if (inp->pfp_dfl_plm & (PFM_PLM1|PFM_PLM2)) { DPRINT("invalid plm=%x\n", inp->pfp_dfl_plm); return PFMLIB_ERR_INVAL; } /* simplified v1 (no fixed counters) */ if (pmu_version == 1) return pfm_gen_ia32_dispatch_counters_v1(inp, mod_in, outp); /* v2 or above */ return pfm_gen_ia32_dispatch_counters_v23(inp, mod_in, outp); } static int pfm_gen_ia32_get_event_code(unsigned int i, unsigned int cnt, int *code) { if (cnt != PFMLIB_CNT_FIRST && cnt > gen_support->pmc_count) return PFMLIB_ERR_INVAL; *code = gen_ia32_pe[i].pme_code; return PFMLIB_SUCCESS; } static void pfm_gen_ia32_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { unsigned int i; memset(counters, 0,
sizeof(*counters)); for(i=0; i < num_gen_cnt; i++) pfm_regmask_set(counters, i); for(i=0; i < num_fixed_cnt; i++) { if (gen_ia32_pe[j].pme_fixed == (FIXED_PMD_BASE+i)) pfm_regmask_set(counters, FIXED_PMD_BASE+i); } } static void pfm_gen_ia32_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs) { *impl_pmcs = gen_ia32_impl_pmcs; } static void pfm_gen_ia32_get_impl_pmds(pfmlib_regmask_t *impl_pmds) { *impl_pmds = gen_ia32_impl_pmds; } static void pfm_gen_ia32_get_impl_counters(pfmlib_regmask_t *impl_counters) { /* all pmds are counters */ *impl_counters = gen_ia32_impl_pmds; } static void pfm_gen_ia32_get_hw_counter_width(unsigned int *width) { /* * Even though, CPUID 0xa returns in eax the actual counter * width, the architecture specifies that writes are limited * to lower 32-bits. As such, only the lower 31 bits have full * degree of freedom. That is the "useable" counter width. */ *width = PMU_GEN_IA32_COUNTER_WIDTH; } static char * pfm_gen_ia32_get_event_name(unsigned int i) { return gen_ia32_pe[i].pme_name; } static int pfm_gen_ia32_get_event_description(unsigned int ev, char **str) { char *s; s = gen_ia32_pe[ev].pme_desc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static char * pfm_gen_ia32_get_event_mask_name(unsigned int ev, unsigned int midx) { return gen_ia32_pe[ev].pme_umasks[midx].pme_uname; } static int pfm_gen_ia32_get_event_mask_desc(unsigned int ev, unsigned int midx, char **str) { char *s; s = gen_ia32_pe[ev].pme_umasks[midx].pme_udesc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static unsigned int pfm_gen_ia32_get_num_event_masks(unsigned int ev) { return gen_ia32_pe[ev].pme_numasks; } static int pfm_gen_ia32_get_event_mask_code(unsigned int ev, unsigned int midx, unsigned int *code) { *code =gen_ia32_pe[ev].pme_umasks[midx].pme_ucode; return PFMLIB_SUCCESS; } static int pfm_gen_ia32_get_cycle_event(pfmlib_event_t *e) { if (gen_ia32_cycle_event == -1) return PFMLIB_ERR_NOTSUPP; e->event = 
gen_ia32_cycle_event; return PFMLIB_SUCCESS; } static int pfm_gen_ia32_get_inst_retired(pfmlib_event_t *e) { if (gen_ia32_inst_retired_event == -1) return PFMLIB_ERR_NOTSUPP; e->event = gen_ia32_inst_retired_event; return PFMLIB_SUCCESS; } /* architected PMU */ pfm_pmu_support_t gen_ia32_support={ .pmu_name = "Intel architectural PMU", .pmu_type = PFMLIB_GEN_IA32_PMU, .pme_count = 0, .pmc_count = 0, .pmd_count = 0, .num_cnt = 0, .get_event_code = pfm_gen_ia32_get_event_code, .get_event_name = pfm_gen_ia32_get_event_name, .get_event_counters = pfm_gen_ia32_get_event_counters, .dispatch_events = pfm_gen_ia32_dispatch_events, .pmu_detect = pfm_gen_ia32_detect, .pmu_init = pfm_gen_ia32_init, .get_impl_pmcs = pfm_gen_ia32_get_impl_pmcs, .get_impl_pmds = pfm_gen_ia32_get_impl_pmds, .get_impl_counters = pfm_gen_ia32_get_impl_counters, .get_hw_counter_width = pfm_gen_ia32_get_hw_counter_width, .get_event_desc = pfm_gen_ia32_get_event_description, .get_cycle_event = pfm_gen_ia32_get_cycle_event, .get_inst_retired_event = pfm_gen_ia32_get_inst_retired, .get_num_event_masks = pfm_gen_ia32_get_num_event_masks, .get_event_mask_name = pfm_gen_ia32_get_event_mask_name, .get_event_mask_code = pfm_gen_ia32_get_event_mask_code, .get_event_mask_desc = pfm_gen_ia32_get_event_mask_desc }; papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_gen_ia32_priv.h /* * Copyright (c) 2006-2007 Hewlett-Packard Development Company, L.P.
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux/ia64. 
*/ #ifndef __PFMLIB_GEN_IA32_PRIV_H__ #define __PFMLIB_GEN_IA32_PRIV_H__ #define PFMLIB_GEN_IA32_MAX_UMASK 16 typedef struct { char *pme_uname; /* unit mask name */ char *pme_udesc; /* event/umask description */ unsigned int pme_ucode; /* unit mask code */ } pme_gen_ia32_umask_t; typedef struct { char *pme_name; /* event name */ char *pme_desc; /* event description */ unsigned int pme_code; /* event code */ unsigned int pme_numasks; /* number of umasks */ unsigned int pme_flags; /* flags */ unsigned int pme_fixed; /* fixed counter index, < FIXED_CTR0 if unsupported */ pme_gen_ia32_umask_t pme_umasks[PFMLIB_GEN_IA32_MAX_UMASK]; /* umask desc */ } pme_gen_ia32_entry_t; /* * pme_flags value */ #define PFMLIB_GEN_IA32_UMASK_COMBO 0x01 /* unit mask can be combined (default exclusive) */ typedef struct { unsigned int version:8; unsigned int num_cnt:8; unsigned int cnt_width:8; unsigned int ebx_length:8; } pmu_eax_t; typedef struct { unsigned int num_cnt:6; unsigned int cnt_width:6; unsigned int reserved:20; } pmu_edx_t; typedef struct { unsigned int no_core_cycle:1; unsigned int no_inst_retired:1; unsigned int no_ref_cycle:1; unsigned int no_llc_ref:1; unsigned int no_llc_miss:1; unsigned int no_br_retired:1; unsigned int no_br_mispred_retired:1; unsigned int reserved:25; } pmu_ebx_t; #endif /* __PFMLIB_GEN_IA32_PRIV_H__ */ papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_gen_ia64.c /* * pfmlib_gen_ia64.c : support default architected IA-64 PMU features * * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P.
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include #include #include "pfmlib_priv.h" /* library private */ #include "pfmlib_priv_ia64.h" /* architecture private */ #define PMU_GEN_IA64_MAX_COUNTERS 4 /* * number of architected events */ #define PME_GEN_COUNT 2 /* * Description of the PMC register mappings use by * this module (as reported in pfmlib_reg_t.reg_num): * * 0 -> PMC0 * 1 -> PMC1 * n -> PMCn */ #define PFMLIB_GEN_IA64_PMC_BASE 0 /* * generic event as described by architecture */ typedef struct { unsigned long pme_code:8; /* major event code */ unsigned long pme_ig:56; /* ignored */ } pme_gen_ia64_code_t; /* * union of all possible entry codes. 
All encodings must fit in 64bit */ typedef union { unsigned long pme_vcode; pme_gen_ia64_code_t pme_gen_code; } pme_gen_ia64_entry_code_t; /* * entry in the event table (one table per implementation) */ typedef struct pme_entry { char *pme_name; pme_gen_ia64_entry_code_t pme_entry_code; /* event code */ pfmlib_regmask_t pme_counters; /* counter bitmask */ } pme_gen_ia64_entry_t; /* let's define some handy shortcuts ! */ #define pmc_plm pmc_gen_count_reg.pmc_plm #define pmc_ev pmc_gen_count_reg.pmc_ev #define pmc_oi pmc_gen_count_reg.pmc_oi #define pmc_pm pmc_gen_count_reg.pmc_pm #define pmc_es pmc_gen_count_reg.pmc_es /* * this table is patched by initialization code */ static pme_gen_ia64_entry_t generic_pe[PME_GEN_COUNT]={ #define PME_IA64_GEN_CPU_CYCLES 0 { "CPU_CYCLES", }, #define PME_IA64_GEN_INST_RETIRED 1 { "IA64_INST_RETIRED", }, }; static int pfm_gen_ia64_counter_width; static int pfm_gen_ia64_counters; static pfmlib_regmask_t pfm_gen_ia64_impl_pmcs; static pfmlib_regmask_t pfm_gen_ia64_impl_pmds; /* * Description of the PMC register mappings use by * this module (as reported in pfmlib_reg_t.reg_num): * * 0 -> PMC0 * 1 -> PMC1 * n -> PMCn * We do not use a mapping table, instead we make up the * values on the fly given the base. */ #define PFMLIB_GEN_IA64_PMC_BASE 0 /* * convert text range (e.g. 
4-15 18 12-26) into actual bitmask * range argument is modified */ static int parse_counter_range(char *range, pfmlib_regmask_t *b) { char *p, c; int start, end; if (range[strlen(range)-1] == '\n') range[strlen(range)-1] = '\0'; while(range) { p = range; while (*p && *p != ' ' && *p != '-') p++; if (*p == '\0') break; c = *p; *p = '\0'; start = atoi(range); range = p+1; if (c == '-') { p++; while (*p && *p != ' ' && *p != '-') p++; if (*p) *p++ = '\0'; end = atoi(range); range = p; } else { end = start; } if (end >= PFMLIB_REG_MAX|| start >= PFMLIB_REG_MAX) goto invalid; for (; start <= end; start++) pfm_regmask_set(b, start); } return 0; invalid: fprintf(stderr, "%s.%s : bitmask too small need %d bits\n", __FILE__, __FUNCTION__, start); return -1; } static int pfm_gen_ia64_initialize(void) { FILE *fp; char *p; char buffer[64]; int matches = 0; fp = fopen("/proc/pal/cpu0/perfmon_info", "r"); if (fp == NULL) return PFMLIB_ERR_NOTSUPP; for (;;) { p = fgets(buffer, sizeof(buffer)-1, fp); if (p == NULL) break; if ((p = strchr(buffer, ':')) == NULL) break; *p = '\0'; if (!strncmp("Counter width", buffer, 13)) { pfm_gen_ia64_counter_width = atoi(p+2); matches++; continue; } if (!strncmp("PMC/PMD pairs", buffer, 13)) { pfm_gen_ia64_counters = atoi(p+2); matches++; continue; } if (!strncmp("Cycle event number", buffer, 18)) { generic_pe[0].pme_entry_code.pme_vcode = atoi(p+2); matches++; continue; } if (!strncmp("Retired event number", buffer, 20)) { generic_pe[1].pme_entry_code.pme_vcode = atoi(p+2); matches++; continue; } if (!strncmp("Cycles count capable", buffer, 20)) { if (parse_counter_range(p+2, &generic_pe[0].pme_counters) == -1) return -1; matches++; continue; } if (!strncmp("Retired bundles count capable", buffer, 29)) { if (parse_counter_range(p+2, &generic_pe[1].pme_counters) == -1) return -1; matches++; continue; } if (!strncmp("Implemented PMC", buffer, 15)) { if (parse_counter_range(p+2, &pfm_gen_ia64_impl_pmcs) == -1) return -1; matches++; continue; } if 
(!strncmp("Implemented PMD", buffer, 15)) { if (parse_counter_range(p+2, &pfm_gen_ia64_impl_pmds) == -1) return -1; matches++; continue; } } pfm_regmask_weight(&pfm_gen_ia64_impl_pmcs, &generic_ia64_support.pmc_count); pfm_regmask_weight(&pfm_gen_ia64_impl_pmds, &generic_ia64_support.pmd_count); fclose(fp); return matches == 8 ? PFMLIB_SUCCESS : PFMLIB_ERR_NOTSUPP; } static void pfm_gen_ia64_forced_initialize(void) { unsigned int i; pfm_gen_ia64_counter_width = 47; pfm_gen_ia64_counters = 4; generic_pe[0].pme_entry_code.pme_vcode = 18; generic_pe[1].pme_entry_code.pme_vcode = 8; memset(&pfm_gen_ia64_impl_pmcs, 0, sizeof(pfmlib_regmask_t)); memset(&pfm_gen_ia64_impl_pmds, 0, sizeof(pfmlib_regmask_t)); for(i=0; i < 8; i++) pfm_regmask_set(&pfm_gen_ia64_impl_pmcs, i); for(i=4; i < 8; i++) pfm_regmask_set(&pfm_gen_ia64_impl_pmds, i); memset(&generic_pe[0].pme_counters, 0, sizeof(pfmlib_regmask_t)); memset(&generic_pe[1].pme_counters, 0, sizeof(pfmlib_regmask_t)); for(i=4; i < 8; i++) { pfm_regmask_set(&generic_pe[0].pme_counters, i); pfm_regmask_set(&generic_pe[1].pme_counters, i); } generic_ia64_support.pmc_count = 8; generic_ia64_support.pmd_count = 4; generic_ia64_support.num_cnt = 4; } static int pfm_gen_ia64_detect(void) { /* PMU is architected, so guaranteed to be present */ return PFMLIB_SUCCESS; } static int pfm_gen_ia64_init(void) { if (forced_pmu != PFMLIB_NO_PMU) { pfm_gen_ia64_forced_initialize(); } else if (pfm_gen_ia64_initialize() == -1) return PFMLIB_ERR_NOTSUPP; return PFMLIB_SUCCESS; } static int valid_assign(unsigned int *as, pfmlib_regmask_t *r_pmcs, unsigned int cnt) { unsigned int i; for(i=0; i < cnt; i++) { if (as[i]==0) return 0; /* * take care of restricted PMC registers */ if (pfm_regmask_isset(r_pmcs, as[i])) return 0; } return 1; } /* * Automatically dispatch events to corresponding counters following constraints. 
* Upon return the pfarg_reg_t structure is ready to be submitted to kernel */ static int pfm_gen_ia64_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_output_param_t *outp) { #define has_counter(e,b) (pfm_regmask_isset(&generic_pe[e].pme_counters, b) ? b : 0) unsigned int max_l0, max_l1, max_l2, max_l3; unsigned int assign[PMU_GEN_IA64_MAX_COUNTERS]; pfm_gen_ia64_pmc_reg_t reg; pfmlib_event_t *e; pfmlib_reg_t *pc, *pd; pfmlib_regmask_t *r_pmcs; unsigned int i,j,k,l; unsigned int cnt; e = inp->pfp_events; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; cnt = inp->pfp_event_count; r_pmcs = &inp->pfp_unavail_pmcs; if (cnt > PMU_GEN_IA64_MAX_COUNTERS) return PFMLIB_ERR_TOOMANY; max_l0 = PMU_GEN_IA64_FIRST_COUNTER + PMU_GEN_IA64_MAX_COUNTERS; max_l1 = PMU_GEN_IA64_FIRST_COUNTER + PMU_GEN_IA64_MAX_COUNTERS*(cnt>1); max_l2 = PMU_GEN_IA64_FIRST_COUNTER + PMU_GEN_IA64_MAX_COUNTERS*(cnt>2); max_l3 = PMU_GEN_IA64_FIRST_COUNTER + PMU_GEN_IA64_MAX_COUNTERS*(cnt>3); if (PFMLIB_DEBUG()) { DPRINT("max_l0=%u max_l1=%u max_l2=%u max_l3=%u\n", max_l0, max_l1, max_l2, max_l3); } /* * This code needs fixing. It is not very pretty and * won't handle more than 4 counters if more become * available ! * For now, worst case in the loop nest: 4! 
(factorial) */ for (i=PMU_GEN_IA64_FIRST_COUNTER; i < max_l0; i++) { assign[0]= has_counter(e[0].event,i); if (max_l1 == PMU_GEN_IA64_FIRST_COUNTER && valid_assign(assign, r_pmcs, cnt)) goto done; for (j=PMU_GEN_IA64_FIRST_COUNTER; j < max_l1; j++) { if (j == i) continue; assign[1] = has_counter(e[1].event,j); if (max_l2 == PMU_GEN_IA64_FIRST_COUNTER && valid_assign(assign, r_pmcs, cnt)) goto done; for (k=PMU_GEN_IA64_FIRST_COUNTER; k < max_l2; k++) { if(k == i || k == j) continue; assign[2] = has_counter(e[2].event,k); if (max_l3 == PMU_GEN_IA64_FIRST_COUNTER && valid_assign(assign, r_pmcs, cnt)) goto done; for (l=PMU_GEN_IA64_FIRST_COUNTER; l < max_l3; l++) { if(l == i || l == j || l == k) continue; assign[3] = has_counter(e[3].event,l); if (valid_assign(assign, r_pmcs, cnt)) goto done; } } } } /* we cannot satisfy the constraints */ return PFMLIB_ERR_NOASSIGN; done: memset(pc, 0, cnt*sizeof(pfmlib_reg_t)); memset(pd, 0, cnt*sizeof(pfmlib_reg_t)); for (j=0; j < cnt ; j++ ) { reg.pmc_val = 0; /* clear all */ /* if not specified per event, then use default (could be zero: measure nothing) */ reg.pmc_plm = e[j].plm ? e[j].plm: inp->pfp_dfl_plm; reg.pmc_oi = 1; /* overflow interrupt */ reg.pmc_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE? 
1 : 0; reg.pmc_es = generic_pe[e[j].event].pme_entry_code.pme_gen_code.pme_code; pc[j].reg_num = assign[j]; pc[j].reg_value = reg.pmc_val; pc[j].reg_addr = PFMLIB_GEN_IA64_PMC_BASE+j; pd[j].reg_num = assign[j]; pd[j].reg_addr = assign[j]; __pfm_vbprintf("[PMC%u(pmc%u)=0x%lx,es=0x%02x,plm=%d pm=%d] %s\n", assign[j], assign[j], reg.pmc_val, reg.pmc_es,reg.pmc_plm, reg.pmc_pm, generic_pe[e[j].event].pme_name); __pfm_vbprintf("[PMD%u(pmd%u)]\n", pd[j].reg_num, pd[j].reg_num); } /* number of PMC programmed */ outp->pfp_pmc_count = cnt; outp->pfp_pmd_count = cnt; return PFMLIB_SUCCESS; } static int pfm_gen_ia64_dispatch_events(pfmlib_input_param_t *inp, void *dummy1, pfmlib_output_param_t *outp, void *dummy2) { return pfm_gen_ia64_dispatch_counters(inp, outp); } static int pfm_gen_ia64_get_event_code(unsigned int i, unsigned int cnt, int *code) { if (cnt != PFMLIB_CNT_FIRST && (cnt < 4 || cnt > 7)) return PFMLIB_ERR_INVAL; *code = (int)generic_pe[i].pme_entry_code.pme_gen_code.pme_code; return PFMLIB_SUCCESS; } static char * pfm_gen_ia64_get_event_name(unsigned int i) { return generic_pe[i].pme_name; } static void pfm_gen_ia64_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { unsigned int i; memset(counters, 0, sizeof(*counters)); for(i=0; i < pfm_gen_ia64_counters; i++) { if (pfm_regmask_isset(&generic_pe[j].pme_counters, i)) pfm_regmask_set(counters, i); } } static void pfm_gen_ia64_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs) { *impl_pmcs = pfm_gen_ia64_impl_pmcs; } static void pfm_gen_ia64_get_impl_pmds(pfmlib_regmask_t *impl_pmds) { *impl_pmds = pfm_gen_ia64_impl_pmds; } static void pfm_gen_ia64_get_impl_counters(pfmlib_regmask_t *impl_counters) { unsigned int i = 0; /* pmd4-pmd7 */ for(i=4; i < 8; i++) pfm_regmask_set(impl_counters, i); } static void pfm_gen_ia64_get_hw_counter_width(unsigned int *width) { *width = pfm_gen_ia64_counter_width; } static int pfm_gen_ia64_get_event_desc(unsigned int ev, char **str) { switch(ev) { case 
PME_IA64_GEN_CPU_CYCLES: *str = strdup("CPU cycles"); break; case PME_IA64_GEN_INST_RETIRED: *str = strdup("IA-64 instructions retired"); break; default: *str = NULL; } return PFMLIB_SUCCESS; } static int pfm_gen_ia64_get_cycle_event(pfmlib_event_t *e) { e->event = PME_IA64_GEN_CPU_CYCLES; return PFMLIB_SUCCESS; } static int pfm_gen_ia64_get_inst_retired(pfmlib_event_t *e) { e->event = PME_IA64_GEN_INST_RETIRED; return PFMLIB_SUCCESS; } pfm_pmu_support_t generic_ia64_support={ .pmu_name ="IA-64", .pmu_type = PFMLIB_GEN_IA64_PMU, .pme_count = PME_GEN_COUNT, .pmc_count = 4+4, .pmd_count = PMU_GEN_IA64_MAX_COUNTERS, .num_cnt = PMU_GEN_IA64_MAX_COUNTERS, .get_event_code = pfm_gen_ia64_get_event_code, .get_event_name = pfm_gen_ia64_get_event_name, .get_event_counters = pfm_gen_ia64_get_event_counters, .dispatch_events = pfm_gen_ia64_dispatch_events, .pmu_detect = pfm_gen_ia64_detect, .pmu_init = pfm_gen_ia64_init, .get_impl_pmcs = pfm_gen_ia64_get_impl_pmcs, .get_impl_pmds = pfm_gen_ia64_get_impl_pmds, .get_impl_counters = pfm_gen_ia64_get_impl_counters, .get_hw_counter_width = pfm_gen_ia64_get_hw_counter_width, .get_event_desc = pfm_gen_ia64_get_event_desc, .get_cycle_event = pfm_gen_ia64_get_cycle_event, .get_inst_retired_event = pfm_gen_ia64_get_inst_retired }; papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_gen_mips64.c /* * pfmlib_gen_mips64.c : support for the generic MIPS64 PMU family * * Contributed by Philip Mucci based on code from * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P.
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include #include #include #include /* public headers */ #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_gen_mips64_priv.h" /* architecture private */ #include "gen_mips64_events.h" /* PMU private */ /* let's define some handy shortcuts! 
*/ #define sel_event_mask perfsel.sel_event_mask #define sel_exl perfsel.sel_exl #define sel_os perfsel.sel_os #define sel_usr perfsel.sel_usr #define sel_sup perfsel.sel_sup #define sel_int perfsel.sel_int static pme_gen_mips64_entry_t *gen_mips64_pe = NULL; pfm_pmu_support_t generic_mips64_support; static int pfm_gen_mips64_detect(void) { static char mips_name[64] = ""; int ret; char buffer[128]; ret = __pfm_getcpuinfo_attr("cpu model", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; generic_mips64_support.pmu_name = mips_name; generic_mips64_support.num_cnt = 0; if (strstr(buffer,"MIPS 20Kc")) { gen_mips64_pe = gen_mips64_20K_pe; strcpy(generic_mips64_support.pmu_name,"MIPS20KC"), generic_mips64_support.pme_count = (sizeof(gen_mips64_20K_pe)/sizeof(pme_gen_mips64_entry_t)); generic_mips64_support.pmc_count = 1; generic_mips64_support.pmd_count = 1; generic_mips64_support.pmu_type = PFMLIB_MIPS_20KC_PMU; } else if (strstr(buffer,"MIPS 24K")) { gen_mips64_pe = gen_mips64_24K_pe; strcpy(generic_mips64_support.pmu_name,"MIPS24K"), generic_mips64_support.pme_count = (sizeof(gen_mips64_24K_pe)/sizeof(pme_gen_mips64_entry_t)); generic_mips64_support.pmc_count = 2; generic_mips64_support.pmd_count = 2; generic_mips64_support.pmu_type = PFMLIB_MIPS_24K_PMU; } else if (strstr(buffer,"MIPS 25Kf")) { gen_mips64_pe = gen_mips64_25K_pe; strcpy(generic_mips64_support.pmu_name,"MIPS25KF"), generic_mips64_support.pme_count = (sizeof(gen_mips64_25K_pe)/sizeof(pme_gen_mips64_entry_t)); generic_mips64_support.pmc_count = 2; generic_mips64_support.pmd_count = 2; generic_mips64_support.pmu_type = PFMLIB_MIPS_25KF_PMU; } else if (strstr(buffer,"MIPS 34K")) { gen_mips64_pe = gen_mips64_34K_pe; strcpy(generic_mips64_support.pmu_name,"MIPS34K"), generic_mips64_support.pme_count = (sizeof(gen_mips64_34K_pe)/sizeof(pme_gen_mips64_entry_t)); generic_mips64_support.pmc_count = 4; generic_mips64_support.pmd_count = 4; generic_mips64_support.pmu_type = PFMLIB_MIPS_34K_PMU; } 
else if (strstr(buffer,"MIPS 5Kc")) { gen_mips64_pe = gen_mips64_5K_pe; strcpy(generic_mips64_support.pmu_name,"MIPS5KC"), generic_mips64_support.pme_count = (sizeof(gen_mips64_5K_pe)/sizeof(pme_gen_mips64_entry_t)); generic_mips64_support.pmc_count = 2; generic_mips64_support.pmd_count = 2; generic_mips64_support.pmu_type = PFMLIB_MIPS_5KC_PMU; } #if 0 else if (strstr(buffer,"MIPS 74K")) { gen_mips64_pe = gen_mips64_74K_pe; strcpy(generic_mips64_support.pmu_name,"MIPS74K"), generic_mips64_support.pme_count = (sizeof(gen_mips64_74K_pe)/sizeof(pme_gen_mips64_entry_t)); generic_mips64_support.pmc_count = 4; generic_mips64_support.pmd_count = 4; generic_mips64_support.pmu_type = PFMLIB_MIPS_74K_PMU; } #endif else if (strstr(buffer,"R10000")) { gen_mips64_pe = gen_mips64_r10000_pe; strcpy(generic_mips64_support.pmu_name,"MIPSR10000"), generic_mips64_support.pme_count = (sizeof(gen_mips64_r10000_pe)/sizeof(pme_gen_mips64_entry_t)); generic_mips64_support.pmc_count = 2; generic_mips64_support.pmd_count = 2; generic_mips64_support.pmu_type = PFMLIB_MIPS_R10000_PMU; } else if (strstr(buffer,"R12000")) { gen_mips64_pe = gen_mips64_r12000_pe; strcpy(generic_mips64_support.pmu_name,"MIPSR12000"), generic_mips64_support.pme_count = (sizeof(gen_mips64_r12000_pe)/sizeof(pme_gen_mips64_entry_t)); generic_mips64_support.pmc_count = 4; generic_mips64_support.pmd_count = 4; generic_mips64_support.pmu_type = PFMLIB_MIPS_R12000_PMU; } else if (strstr(buffer,"RM7000")) { gen_mips64_pe = gen_mips64_rm7000_pe; strcpy(generic_mips64_support.pmu_name,"MIPSRM7000"), generic_mips64_support.pme_count = (sizeof(gen_mips64_rm7000_pe)/sizeof(pme_gen_mips64_entry_t)); generic_mips64_support.pmc_count = 2; generic_mips64_support.pmd_count = 2; generic_mips64_support.pmu_type = PFMLIB_MIPS_RM7000_PMU; } else if (strstr(buffer,"RM9000")) { gen_mips64_pe = gen_mips64_rm9000_pe; strcpy(generic_mips64_support.pmu_name,"MIPSRM9000"), generic_mips64_support.pme_count = 
(sizeof(gen_mips64_rm9000_pe)/sizeof(pme_gen_mips64_entry_t)); generic_mips64_support.pmc_count = 2; generic_mips64_support.pmd_count = 2; generic_mips64_support.pmu_type = PFMLIB_MIPS_RM9000_PMU; } else if (strstr(buffer,"SB1")) { gen_mips64_pe = gen_mips64_sb1_pe; strcpy(generic_mips64_support.pmu_name,"MIPSSB1"), generic_mips64_support.pme_count = (sizeof(gen_mips64_sb1_pe)/sizeof(pme_gen_mips64_entry_t)); generic_mips64_support.pmc_count = 4; generic_mips64_support.pmd_count = 4; generic_mips64_support.pmu_type = PFMLIB_MIPS_SB1_PMU; } else if (strstr(buffer,"VR5432")) { gen_mips64_pe = gen_mips64_vr5432_pe; generic_mips64_support.pme_count = (sizeof(gen_mips64_vr5432_pe)/sizeof(pme_gen_mips64_entry_t)); strcpy(generic_mips64_support.pmu_name,"MIPSVR5432"), generic_mips64_support.pmc_count = 2; generic_mips64_support.pmd_count = 2; generic_mips64_support.pmu_type = PFMLIB_MIPS_VR5432_PMU; } else if (strstr(buffer,"VR5500")) { gen_mips64_pe = gen_mips64_vr5500_pe; generic_mips64_support.pme_count = (sizeof(gen_mips64_vr5500_pe)/sizeof(pme_gen_mips64_entry_t)); strcpy(generic_mips64_support.pmu_name,"MIPSVR5500"), generic_mips64_support.pmc_count = 2; generic_mips64_support.pmd_count = 2; generic_mips64_support.pmu_type = PFMLIB_MIPS_VR5500_PMU; } else return PFMLIB_ERR_NOTSUPP; if (generic_mips64_support.num_cnt == 0) generic_mips64_support.num_cnt = generic_mips64_support.pmd_count; return PFMLIB_SUCCESS; } static void stuff_regs(pfmlib_event_t *e, int plm, pfmlib_reg_t *pc, pfmlib_reg_t *pd, int cntr, int j, pfmlib_gen_mips64_input_param_t *mod_in) { pfm_gen_mips64_sel_reg_t reg; reg.val = 0; /* assume reserved bits are zeroed */ /* if plm is 0, then assume not specified per-event and use default */ plm = e[j].plm ? e[j].plm : plm; reg.sel_usr = plm & PFM_PLM3 ? 1 : 0; reg.sel_os = plm & PFM_PLM0 ? 1 : 0; reg.sel_sup = plm & PFM_PLM1 ? 1 : 0; reg.sel_exl = plm & PFM_PLM2 ?
1 : 0; reg.sel_int = 1; /* force int to 1 */ reg.sel_event_mask = (gen_mips64_pe[e[j].event].pme_code >> (cntr*8)) & 0xff; pc[j].reg_value = reg.val; pc[j].reg_addr = cntr*2; pc[j].reg_num = cntr; __pfm_vbprintf("[CP0_25_%"PRIx64"(pmc%u)=0x%"PRIx64" event_mask=0x%x usr=%d os=%d sup=%d exl=%d int=1] %s\n", pc[j].reg_addr, pc[j].reg_num, pc[j].reg_value, reg.sel_event_mask, reg.sel_usr, reg.sel_os, reg.sel_sup, reg.sel_exl, gen_mips64_pe[e[j].event].pme_name); pd[j].reg_num = cntr; pd[j].reg_addr = cntr*2 + 1; __pfm_vbprintf("[CP0_25_%u(pmd%u)]\n", pd[j].reg_addr, pd[j].reg_num); } /* * Automatically dispatch events to corresponding counters following constraints. * Upon return the pfarg_reg_t structure is ready to be submitted to the kernel */ static int pfm_gen_mips64_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_gen_mips64_input_param_t *mod_in, pfmlib_output_param_t *outp) { /* pfmlib_gen_mips64_input_param_t *param = mod_in; */ pfmlib_event_t *e = inp->pfp_events; pfmlib_reg_t *pc, *pd; unsigned int i, j, cnt = inp->pfp_event_count; unsigned int used = 0; extern pfm_pmu_support_t generic_mips64_support; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; /* Degree 2 rank based allocation */ if (cnt > generic_mips64_support.pmc_count) return PFMLIB_ERR_TOOMANY; if (PFMLIB_DEBUG()) { for (j=0; j < cnt; j++) { DPRINT("ev[%d]=%s, counters=0x%x\n", j, gen_mips64_pe[e[j].event].pme_name,gen_mips64_pe[e[j].event].pme_counters); } } /* Do rank based allocation, counters that live on 1 reg before counters that live on 2 regs etc.
*/ for (i=1;i<=PMU_GEN_MIPS64_NUM_COUNTERS;i++) { for (j=0; j < cnt;j++) { unsigned int cntr, avail; if (pfmlib_popcnt(gen_mips64_pe[e[j].event].pme_counters) == i) { /* These counters can be used for this event */ avail = ~used & gen_mips64_pe[e[j].event].pme_counters; DPRINT("Rank %d: Counters available 0x%x\n",i,avail); if (avail == 0x0) return PFMLIB_ERR_NOASSIGN; /* Pick one, mark as used*/ cntr = ffs(avail) - 1; DPRINT("Rank %d: Chose counter %d\n",i,cntr); /* Update registers */ stuff_regs(e,inp->pfp_dfl_plm,pc,pd,cntr,j,mod_in); used |= (1 << cntr); DPRINT("%d: Used counters 0x%x\n",i, used); } } } /* number of evtsel registers programmed */ outp->pfp_pmc_count = cnt; outp->pfp_pmd_count = cnt; return PFMLIB_SUCCESS; } static int pfm_gen_mips64_dispatch_events(pfmlib_input_param_t *inp, void *model_in, pfmlib_output_param_t *outp, void *model_out) { pfmlib_gen_mips64_input_param_t *mod_in = (pfmlib_gen_mips64_input_param_t *)model_in; return pfm_gen_mips64_dispatch_counters(inp, mod_in, outp); } static int pfm_gen_mips64_get_event_code(unsigned int i, unsigned int cnt, int *code) { extern pfm_pmu_support_t generic_mips64_support; /* check validity of counter index */ if (cnt != PFMLIB_CNT_FIRST) { if (cnt < 0 || cnt >= generic_mips64_support.pmc_count) return PFMLIB_ERR_INVAL; } else { cnt = ffs(gen_mips64_pe[i].pme_counters)-1; if (cnt == -1) return(PFMLIB_ERR_INVAL); } /* if cnt == 0, shift right by 0; if cnt == 1, shift right by 8 */ /* Works on both the 5K and 20K */ if (gen_mips64_pe[i].pme_counters & (1<< cnt)) *code = 0xff & (gen_mips64_pe[i].pme_code >> (cnt*8)); else return PFMLIB_ERR_INVAL; return PFMLIB_SUCCESS; } static void pfm_gen_mips64_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { extern pfm_pmu_support_t generic_mips64_support; unsigned int tmp; memset(counters, 0, sizeof(*counters)); tmp = gen_mips64_pe[j].pme_counters; while (tmp) { int t = ffs(tmp) - 1; pfm_regmask_set(counters, t); tmp = tmp ^ (1 << t); } } static void
pfm_gen_mips64_get_impl_perfsel(pfmlib_regmask_t *impl_pmcs) { unsigned int i = 0; extern pfm_pmu_support_t generic_mips64_support; /* all pmcs are contiguous */ for(i=0; i < generic_mips64_support.pmc_count; i++) pfm_regmask_set(impl_pmcs, i); } static void pfm_gen_mips64_get_impl_perfctr(pfmlib_regmask_t *impl_pmds) { unsigned int i = 0; extern pfm_pmu_support_t generic_mips64_support; /* all pmds are contiguous */ for(i=0; i < generic_mips64_support.pmd_count; i++) pfm_regmask_set(impl_pmds, i); } static void pfm_gen_mips64_get_impl_counters(pfmlib_regmask_t *impl_counters) { unsigned int i = 0; extern pfm_pmu_support_t generic_mips64_support; for(i=0; i < generic_mips64_support.pmc_count; i++) pfm_regmask_set(impl_counters, i); } static void pfm_gen_mips64_get_hw_counter_width(unsigned int *width) { *width = PMU_GEN_MIPS64_COUNTER_WIDTH; } static char * pfm_gen_mips64_get_event_name(unsigned int i) { return gen_mips64_pe[i].pme_name; } static int pfm_gen_mips64_get_event_description(unsigned int ev, char **str) { char *s; s = gen_mips64_pe[ev].pme_desc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static int pfm_gen_mips64_get_cycle_event(pfmlib_event_t *e) { return pfm_find_full_event("CYCLES",e); } static int pfm_gen_mips64_get_inst_retired(pfmlib_event_t *e) { if (pfm_current == NULL) return(PFMLIB_ERR_NOINIT); switch (pfm_current->pmu_type) { case PFMLIB_MIPS_20KC_PMU: return pfm_find_full_event("INSNS_COMPLETED",e); case PFMLIB_MIPS_24K_PMU: return pfm_find_full_event("INSTRUCTIONS",e); case PFMLIB_MIPS_25KF_PMU: return pfm_find_full_event("INSNS_COMPLETE",e); case PFMLIB_MIPS_34K_PMU: return pfm_find_full_event("INSTRUCTIONS",e); case PFMLIB_MIPS_5KC_PMU: return pfm_find_full_event("INSNS_EXECD",e); case PFMLIB_MIPS_R10000_PMU: case PFMLIB_MIPS_R12000_PMU: return pfm_find_full_event("INSTRUCTIONS_GRADUATED",e); case PFMLIB_MIPS_RM7000_PMU: case PFMLIB_MIPS_RM9000_PMU: return pfm_find_full_event("INSTRUCTIONS_ISSUED",e); case 
PFMLIB_MIPS_VR5432_PMU: case PFMLIB_MIPS_VR5500_PMU: return pfm_find_full_event("INSTRUCTIONS_EXECUTED",e); case PFMLIB_MIPS_SB1_PMU: return pfm_find_full_event("INSN_SURVIVED_STAGE7",e); default: return(PFMLIB_ERR_NOTFOUND); } } /* SiCortex specific functions */ pfm_pmu_support_t generic_mips64_support = { .pmu_name = NULL, .pmu_type = PFMLIB_UNKNOWN_PMU, .pme_count = 0, .pmc_count = 0, .pmd_count = 0, .num_cnt = 0, .flags = PFMLIB_MULT_CODE_EVENT, .get_event_code = pfm_gen_mips64_get_event_code, .get_event_name = pfm_gen_mips64_get_event_name, .get_event_counters = pfm_gen_mips64_get_event_counters, .dispatch_events = pfm_gen_mips64_dispatch_events, .pmu_detect = pfm_gen_mips64_detect, .get_impl_pmcs = pfm_gen_mips64_get_impl_perfsel, .get_impl_pmds = pfm_gen_mips64_get_impl_perfctr, .get_impl_counters = pfm_gen_mips64_get_impl_counters, .get_hw_counter_width = pfm_gen_mips64_get_hw_counter_width, .get_event_desc = pfm_gen_mips64_get_event_description, .get_cycle_event = pfm_gen_mips64_get_cycle_event, .get_inst_retired_event = pfm_gen_mips64_get_inst_retired }; papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_gen_mips64_priv.h000066400000000000000000000033411502707512200240240ustar00rootroot00000000000000/* * Contributed by Philip Mucci based on code from * Copyright (c) 2004-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux/ia64. */ #ifndef __PFMLIB_GEN_MIPS64_PRIV_H__ #define __PFMLIB_GEN_MIPS64_PRIV_H__ typedef struct { char *pme_name; char *pme_desc; /* text description of the event */ unsigned int pme_code; /* event mask, holds room for four events, low 8 bits cntr0, ... high 8 bits cntr3 */ unsigned int pme_counters; } pme_gen_mips64_entry_t; #endif /* __PFMLIB_GEN_MIPS64_PRIV_H__ */ papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_gen_powerpc.c000066400000000000000000000604221502707512200233170ustar00rootroot00000000000000/* * Copyright (C) IBM Corporation, 2007. All rights reserved. * Contributed by Corey Ashford (cjashfor@us.ibm.com) * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS * IN THE SOFTWARE. * * pfmlib_gen_powerpc.c * * Support for libpfm for the PowerPC970, POWER4,4+,5,5+,6 processors. */ #ifndef _GNU_SOURCE #define _GNU_SOURCE /* for getline */ #endif #include #include #include #include /* private headers */ #include "powerpc_reg.h" #include "pfmlib_priv.h" #include "pfmlib_power_priv.h" #include "pfmlib_ppc970_priv.h" #include "pfmlib_ppc970mp_priv.h" #include "pfmlib_power4_priv.h" #include "pfmlib_power5_priv.h" #include "pfmlib_power5+_priv.h" #include "pfmlib_power6_priv.h" #include "pfmlib_power7_priv.h" #include "ppc970_events.h" #include "ppc970mp_events.h" #include "power4_events.h" #include "power5_events.h" #include "power5+_events.h" #include "power6_events.h" #include "power7_events.h" #define FIRST_POWER_PMU PFMLIB_PPC970_PMU static const int num_group_vec[] = { [PFMLIB_PPC970_PMU - FIRST_POWER_PMU] = PPC970_NUM_GROUP_VEC, [PFMLIB_PPC970MP_PMU - FIRST_POWER_PMU] = PPC970MP_NUM_GROUP_VEC, [PFMLIB_POWER4_PMU - FIRST_POWER_PMU] = POWER4_NUM_GROUP_VEC, [PFMLIB_POWER5_PMU - FIRST_POWER_PMU] = POWER5_NUM_GROUP_VEC, [PFMLIB_POWER5p_PMU - FIRST_POWER_PMU] = POWER5p_NUM_GROUP_VEC, [PFMLIB_POWER6_PMU - FIRST_POWER_PMU] = POWER6_NUM_GROUP_VEC, [PFMLIB_POWER7_PMU - FIRST_POWER_PMU] = POWER7_NUM_GROUP_VEC }; static const int event_count[] = { [PFMLIB_PPC970_PMU - FIRST_POWER_PMU] = PPC970_PME_EVENT_COUNT, [PFMLIB_PPC970MP_PMU - FIRST_POWER_PMU] = PPC970MP_PME_EVENT_COUNT, [PFMLIB_POWER5_PMU - FIRST_POWER_PMU] = POWER5_PME_EVENT_COUNT, [PFMLIB_POWER5p_PMU - FIRST_POWER_PMU] = POWER5p_PME_EVENT_COUNT, [PFMLIB_POWER6_PMU - FIRST_POWER_PMU] = POWER6_PME_EVENT_COUNT, [PFMLIB_POWER7_PMU - FIRST_POWER_PMU] = POWER7_PME_EVENT_COUNT }; unsigned *pmd_priv_vec; static unsigned long long 
mmcr0_fc5_6_mask; static unsigned long long *mmcr0_counter_mask; static unsigned long long *mmcr1_counter_mask; static unsigned long long *mmcr0_counter_off_val; static unsigned long long *mmcr1_counter_off_val; static const pme_power_entry_t *pe; static const pmg_power_group_t *groups; static inline int get_num_event_counters() { return gen_powerpc_support.pmd_count; } static inline int get_num_control_regs() { return gen_powerpc_support.pmc_count; } static inline const unsigned long long *get_group_vector(int event) { return pe[event].pme_group_vector; } static inline int get_event_id(int event, int counter) { return pe[event].pme_event_ids[counter]; } static inline char *get_event_name(int event) { return pe[event].pme_name; } static inline char *get_long_desc(int event) { return pe[event].pme_long_desc; } static inline int get_group_event_id(int group, int counter) { return groups[group].pmg_event_ids[counter]; } static inline unsigned long long get_mmcr0(int group) { return groups[group].pmg_mmcr0; } static inline unsigned long long get_mmcr1(int group) { return groups[group].pmg_mmcr1; } static inline unsigned long long get_mmcra(int group) { return groups[group].pmg_mmcra; } /** * pfm_gen_powerpc_get_event_code * * Return the event-select value for the specified event as * needed for the specified PMD counter. **/ static int pfm_gen_powerpc_get_event_code(unsigned int event, unsigned int pmd, int *code) { if (event < event_count[gen_powerpc_support.pmu_type - FIRST_POWER_PMU]) { *code = pe[event].pme_code; return PFMLIB_SUCCESS; } else return PFMLIB_ERR_INVAL; } /** * pfm_gen_powerpc_get_event_name * * Return the name of the specified event. **/ static char *pfm_gen_powerpc_get_event_name(unsigned int event) { return get_event_name(event); } /** * pfm_gen_powerpc_get_event_mask_name * * Return the name of the specified event-mask. 
**/ static char *pfm_gen_powerpc_get_event_mask_name(unsigned int event, unsigned int mask) { return ""; } /** * pfm_gen_powerpc_get_event_counters * * Fill in the 'counters' bitmask with all possible PMDs that could be * used to count the specified event. **/ static void pfm_gen_powerpc_get_event_counters(unsigned int event, pfmlib_regmask_t *counters) { int i; counters->bits[0] = 0; for (i = 0; i < get_num_event_counters(); i++) { if (get_event_id(event, i) != -1) { counters->bits[0] |= (1 << i); } } } /** * pfm_gen_powerpc_get_num_event_masks * * Count the number of available event-masks for the specified event. **/ static unsigned int pfm_gen_powerpc_get_num_event_masks(unsigned int event) { /* POWER arch doesn't use event masks */ return 0; } static void remove_group(unsigned long long *group_vec, int group) { group_vec[group / 64] &= ~(1ULL << (group % 64)); } static void intersect_groups(unsigned long long *result, const unsigned long long *operand) { int i; for (i = 0; i < num_group_vec[gen_powerpc_support.pmu_type - FIRST_POWER_PMU]; i++) { result[i] &= operand[i]; } } static int first_group(unsigned long long *group_vec) { int i, bit; for (i = 0; i < num_group_vec[gen_powerpc_support.pmu_type - FIRST_POWER_PMU]; i++) { bit = ffsll(group_vec[i]); if (bit) { return (bit - 1) + (i * 64); } } /* There were no groups */ return -1; } static unsigned gq_pmd_priv_vec[8] = { 0x0f0e, 0x0f0e, 0x0f0e, 0x0f0e, 0x0f0e, 0x0f0e, 0x0f0e, 0x0f0e }; static unsigned gr_pmd_priv_vec[6] = { 0x0f0e, 0x0f0e, 0x0f0e, 0x0f0e, 0x0f0e, 0x0f0e, }; static unsigned gs_pmd_priv_vec[6] = { 0x0f0e, 0x0f0e, 0x0f0e, 0x0f0e, 0x0800, 0x0800, }; /* These masks are used on the PPC970*, and POWER4,4+ chips */ static unsigned long long power4_mmcr0_counter_mask[POWER4_NUM_EVENT_COUNTERS] = { 0x1fUL << (63 - 55), /* PMC1 */ 0x1fUL << (63 - 62), /* PMC2 */ 0, 0, 0, 0, 0, 0 }; static unsigned long long power4_mmcr1_counter_mask[POWER4_NUM_EVENT_COUNTERS] = { 0, 0, 0x1fUL << (63 - 36), /* PMC3 */ 
0x1fUL << (63 - 41), /* PMC4 */ 0x1fUL << (63 - 46), /* PMC5 */ 0x1fUL << (63 - 51), /* PMC6 */ 0x1fUL << (63 - 56), /* PMC7 */ 0x1fUL << (63 - 61) /* PMC8 */ }; static unsigned long long power4_mmcr0_counter_off_val[POWER4_NUM_EVENT_COUNTERS] = { 0, /* PMC1 */ 0, /* PMC2 */ 0, 0, 0, 0, 0, 0 }; static unsigned long long power4_mmcr1_counter_off_val[POWER4_NUM_EVENT_COUNTERS] = { 0, 0, 0, /* PMC3 */ 0, /* PMC4 */ 0, /* PMC5 */ 0, /* PMC6 */ 0, /* PMC7 */ 0 /* PMC8 */ }; static unsigned long long ppc970_mmcr0_counter_off_val[POWER4_NUM_EVENT_COUNTERS] = { 0x8UL << (63 - 55), /* PMC1 */ 0x8UL << (63 - 62), /* PMC2 */ 0, 0, 0, 0, 0, 0 }; static unsigned long long ppc970_mmcr1_counter_off_val[POWER4_NUM_EVENT_COUNTERS] = { 0, 0, 0x8UL << (63 - 36), /* PMC3 */ 0x8UL << (63 - 41), /* PMC4 */ 0x8UL << (63 - 46), /* PMC5 */ 0x8UL << (63 - 51), /* PMC6 */ 0x8UL << (63 - 56), /* PMC7 */ 0x8UL << (63 - 61) /* PMC8 */ }; /* These masks are used on POWER5,5+,5++,6,7 */ static unsigned long long power5_mmcr0_counter_mask[POWER5_NUM_EVENT_COUNTERS] = { 0, 0, 0, 0, 0, 0 }; static unsigned long long power5_mmcr1_counter_mask[POWER5_NUM_EVENT_COUNTERS] = { 0xffUL << (63 - 39), /* PMC1 */ 0xffUL << (63 - 47), /* PMC2 */ 0xffUL << (63 - 55), /* PMC3 */ 0xffUL << (63 - 63), /* PMC4 */ 0, 0 }; static unsigned long long power5_mmcr0_counter_off_val[POWER5_NUM_EVENT_COUNTERS] = { 0, 0, 0, 0, 0, 0 }; static unsigned long long power5_mmcr1_counter_off_val[POWER5_NUM_EVENT_COUNTERS] = { 0, /* PMC1 */ 0, /* PMC2 */ 0, /* PMC3 */ 0, /* PMC4 */ 0, 0 }; /** * pfm_gen_powerpc_dispatch_events * * Examine each desired event specified in "input" and find an appropriate * set of PMCs and PMDs to count them. 
**/ static int pfm_gen_powerpc_dispatch_events(pfmlib_input_param_t *input, void *model_input, pfmlib_output_param_t *output, void *model_output) { /* model_input and model_output are unused on POWER */ int i, j, group; int counters_used = 0; unsigned long long mmcr0_val = 0, mmcr1_val = 0; unsigned long long group_vector[num_group_vec[gen_powerpc_support.pmu_type - FIRST_POWER_PMU]]; unsigned int plm; plm = (input->pfp_events[0].plm != 0) ? input->pfp_events[0].plm : input->pfp_dfl_plm; /* * Verify that all of the privilege level masks are identical, as * we cannot have mixed levels on POWER */ for (i = 1; i < input->pfp_event_count; i++) { if (input->pfp_events[i].plm == 0) { /* it's ok if the default is the same as plm */ if (plm != input->pfp_dfl_plm) return PFMLIB_ERR_NOASSIGN; } else { if (plm != input->pfp_events[i].plm) return PFMLIB_ERR_NOASSIGN; } } /* start by setting all of the groups as available */ memset(group_vector, 0xff, sizeof(unsigned long long) * num_group_vec[gen_powerpc_support.pmu_type - FIRST_POWER_PMU]); for (i = 0; i < input->pfp_event_count; i++) { mmcr0_val |= mmcr0_counter_off_val[i]; intersect_groups(group_vector, get_group_vector(input->pfp_events[i].event)); mmcr1_val |= mmcr1_counter_off_val[i]; } group = first_group(group_vector); while (group != -1) { /* find out if the privilege levels are compatible with each counter */ for (i = 0; i < input->pfp_event_count; i++) { /* find event counter in group */ for (j = 0; j < get_num_event_counters(); j++) { if (get_event_id(input->pfp_events[i].event,j) == get_group_event_id(group, j)) { /* found counter */ if (input->pfp_events[i].plm != 0) { if (! (pmd_priv_vec[j] & (1 << input->pfp_events[0].plm))) { remove_group(group_vector, group); group = first_group(group_vector); goto try_next_group; } } else { if (!
(pmd_priv_vec[j] & (1 << input->pfp_dfl_plm))) { remove_group(group_vector, group); group = first_group(group_vector); goto try_next_group; } } /* We located this counter and its privilege checks out ok. */ counters_used |= (1 << j); output->pfp_pmds[i].reg_value = 0; output->pfp_pmds[i].reg_addr = 0; output->pfp_pmds[i].reg_alt_addr = 0; output->pfp_pmds[i].reg_num = j + 1; output->pfp_pmds[i].reg_reserved1 = 0; output->pfp_pmd_count = i + 1; /* Find the next counter */ break; } } if (j == get_num_event_counters()) { printf ("libpfm: Internal error. Unable to find counter in group.\n"); } } /* * Success! We found a group (group) that meets the * privilege constraints */ break; try_next_group: ; } if (group == -1) /* We did not find a group that meets the constraints */ return PFMLIB_ERR_NOASSIGN; /* We now have a group that meets the constraints */ mmcr0_val = get_mmcr0(group); mmcr1_val = get_mmcr1(group); for (i = 0; i < get_num_event_counters(); i++) { if (! (counters_used & (1 << i))) { /* * This counter is not used, so set that * selector to its off value. */ mmcr0_val &= ~mmcr0_counter_mask[i]; mmcr0_val |= mmcr0_counter_off_val[i]; mmcr1_val &= ~mmcr1_counter_mask[i]; mmcr1_val |= mmcr1_counter_off_val[i]; } } /* * As a special case for PMC5 and PMC6 on POWER5/5+, freeze these * two counters if neither are used. 
Note that the * mmcr0_fc5_6_mask is zero for all processors except POWER5/5+ */ if ((counters_used & ((1 << (5 - 1)) | (1 << (6 - 1)))) == 0) mmcr0_val |= mmcr0_fc5_6_mask; /* * Enable counter "exception on negative" and performance monitor * exceptions */ mmcr0_val |= MMCR0_PMXE | MMCR0_PMC1CE | MMCR0_PMCjCE; /* Start with the counters frozen in every state, then selectively enable them */ mmcr0_val |= MMCR0_FCP | MMCR0_FCS | MMCR0_FCHV; if (plm & PFM_PLM3) { /* user */ mmcr0_val &= ~MMCR0_FCP; } if (plm & PFM_PLM0) { /* kernel */ mmcr0_val &= ~MMCR0_FCS; } if (plm & PFM_PLM1) { /* hypervisor */ mmcr0_val &= ~MMCR0_FCHV; } /* PFM_PLM2 is not supported */ output->pfp_pmcs[0].reg_value = mmcr0_val; output->pfp_pmcs[0].reg_addr = 0; output->pfp_pmcs[0].reg_alt_addr = 0; output->pfp_pmcs[0].reg_num = 0; output->pfp_pmcs[0].reg_reserved1 = 0; output->pfp_pmcs[1].reg_value = mmcr1_val; output->pfp_pmcs[1].reg_addr = 0; output->pfp_pmcs[1].reg_alt_addr = 0; output->pfp_pmcs[1].reg_num = 1; output->pfp_pmcs[1].reg_reserved1 = 0; output->pfp_pmcs[2].reg_value = get_mmcra(group); output->pfp_pmcs[2].reg_addr = 0; output->pfp_pmcs[2].reg_alt_addr = 0; output->pfp_pmcs[2].reg_num = 2; output->pfp_pmcs[2].reg_reserved1 = 0; /* We always use the same number of control regs */ output->pfp_pmc_count = get_num_control_regs(); return PFMLIB_SUCCESS; } /** * pfm_gen_powerpc_pmu_detect * * Determine which POWER processor, if any, we are running on. * **/ /** * These should be defined in more recent versions of * /usr/include/asm-ppc64/reg.h. It isn't pretty to have these here, but * maybe we can remove them someday. 
**/ static int pfm_gen_powerpc_pmu_detect(void) { if (__is_processor(PV_970) || __is_processor(PV_970FX) || __is_processor(PV_970GX)) { gen_powerpc_support.pmu_type = PFMLIB_PPC970_PMU; gen_powerpc_support.pmu_name = "PPC970"; gen_powerpc_support.pme_count = PPC970_PME_EVENT_COUNT; gen_powerpc_support.pmd_count = PPC970_NUM_EVENT_COUNTERS; gen_powerpc_support.pmc_count = PPC970_NUM_CONTROL_REGS; gen_powerpc_support.num_cnt = PPC970_NUM_EVENT_COUNTERS; mmcr0_fc5_6_mask = 0; mmcr0_counter_mask = power4_mmcr0_counter_mask; mmcr1_counter_mask = power4_mmcr1_counter_mask; mmcr0_counter_off_val = ppc970_mmcr0_counter_off_val; mmcr1_counter_off_val = ppc970_mmcr1_counter_off_val; pmd_priv_vec = gq_pmd_priv_vec; pe = ppc970_pe; groups = ppc970_groups; return PFMLIB_SUCCESS; } if (__is_processor(PV_970MP)) { gen_powerpc_support.pmu_type = PFMLIB_PPC970MP_PMU; gen_powerpc_support.pmu_name = "PPC970MP"; gen_powerpc_support.pme_count = PPC970MP_PME_EVENT_COUNT; gen_powerpc_support.pmd_count = PPC970MP_NUM_EVENT_COUNTERS; gen_powerpc_support.pmc_count = PPC970MP_NUM_CONTROL_REGS; gen_powerpc_support.num_cnt = PPC970MP_NUM_EVENT_COUNTERS; mmcr0_fc5_6_mask = 0; mmcr0_counter_mask = power4_mmcr0_counter_mask; mmcr1_counter_mask = power4_mmcr1_counter_mask; mmcr0_counter_off_val = ppc970_mmcr0_counter_off_val; mmcr1_counter_off_val = ppc970_mmcr1_counter_off_val; pmd_priv_vec = gq_pmd_priv_vec; pe = ppc970mp_pe; groups = ppc970mp_groups; return PFMLIB_SUCCESS; } if (__is_processor(PV_POWER4) || __is_processor(PV_POWER4p)) { gen_powerpc_support.pmu_type = PFMLIB_PPC970_PMU; gen_powerpc_support.pmu_name = "POWER4"; gen_powerpc_support.pme_count = POWER4_PME_EVENT_COUNT; gen_powerpc_support.pmd_count = POWER4_NUM_EVENT_COUNTERS; gen_powerpc_support.pmc_count = POWER4_NUM_CONTROL_REGS; gen_powerpc_support.num_cnt = POWER4_NUM_EVENT_COUNTERS; mmcr0_fc5_6_mask = 0; mmcr0_counter_mask = power4_mmcr0_counter_mask; mmcr1_counter_mask = power4_mmcr1_counter_mask; mmcr0_counter_off_val = 
ppc970_mmcr0_counter_off_val; mmcr1_counter_off_val = ppc970_mmcr1_counter_off_val; mmcr0_counter_off_val = power4_mmcr0_counter_off_val; mmcr1_counter_off_val = power4_mmcr1_counter_off_val; pmd_priv_vec = gq_pmd_priv_vec; pe = power4_pe; groups = power4_groups; return PFMLIB_SUCCESS; } if (__is_processor(PV_POWER5)) { gen_powerpc_support.pmu_type = PFMLIB_POWER5_PMU; gen_powerpc_support.pmu_name = "POWER5"; gen_powerpc_support.pme_count = POWER5_PME_EVENT_COUNT; gen_powerpc_support.pmd_count = POWER5_NUM_EVENT_COUNTERS; gen_powerpc_support.pmc_count = POWER5_NUM_CONTROL_REGS; gen_powerpc_support.num_cnt = POWER5_NUM_EVENT_COUNTERS; mmcr0_fc5_6_mask = MMCR0_FC5_6; mmcr0_counter_off_val = ppc970_mmcr0_counter_off_val; mmcr1_counter_off_val = ppc970_mmcr1_counter_off_val; mmcr0_counter_mask = power5_mmcr0_counter_mask; mmcr1_counter_mask = power5_mmcr1_counter_mask; mmcr0_counter_off_val = power5_mmcr0_counter_off_val; mmcr1_counter_off_val = power5_mmcr1_counter_off_val; pmd_priv_vec = gr_pmd_priv_vec; pe = power5_pe; groups = power5_groups; return PFMLIB_SUCCESS; } if (__is_processor(PV_POWER5p)) { gen_powerpc_support.pmu_type = PFMLIB_POWER5p_PMU; gen_powerpc_support.pmu_name = "POWER5+"; gen_powerpc_support.pme_count = POWER5p_PME_EVENT_COUNT; gen_powerpc_support.pmd_count = POWER5p_NUM_EVENT_COUNTERS; gen_powerpc_support.pmc_count = POWER5p_NUM_CONTROL_REGS; mmcr0_counter_off_val = power4_mmcr0_counter_off_val; mmcr1_counter_off_val = power4_mmcr1_counter_off_val; gen_powerpc_support.num_cnt = POWER5p_NUM_EVENT_COUNTERS; mmcr0_counter_mask = power5_mmcr0_counter_mask; mmcr1_counter_mask = power5_mmcr1_counter_mask; mmcr0_counter_off_val = power5_mmcr0_counter_off_val; mmcr1_counter_off_val = power5_mmcr1_counter_off_val; if (PVR_VER(mfspr(SPRN_PVR)) >= 0x300) { /* this is a newer, GS model POWER5+ */ mmcr0_fc5_6_mask = 0; pmd_priv_vec = gs_pmd_priv_vec; } else { mmcr0_fc5_6_mask = MMCR0_FC5_6; pmd_priv_vec = gr_pmd_priv_vec; } mmcr0_counter_off_val = 
power5_mmcr0_counter_off_val; mmcr1_counter_off_val = power5_mmcr1_counter_off_val; pe = power5p_pe; groups = power5p_groups; return PFMLIB_SUCCESS; } if (__is_processor(PV_POWER6)) { gen_powerpc_support.pmu_type = PFMLIB_POWER6_PMU; gen_powerpc_support.pmu_name = "POWER6"; gen_powerpc_support.pme_count = POWER6_PME_EVENT_COUNT; gen_powerpc_support.pmd_count = POWER6_NUM_EVENT_COUNTERS; gen_powerpc_support.pmc_count = POWER6_NUM_CONTROL_REGS; gen_powerpc_support.num_cnt = POWER6_NUM_EVENT_COUNTERS; mmcr0_fc5_6_mask = 0; mmcr0_counter_mask = power5_mmcr0_counter_mask; mmcr1_counter_mask = power5_mmcr1_counter_mask; mmcr0_counter_off_val = power5_mmcr0_counter_off_val; mmcr1_counter_off_val = power5_mmcr1_counter_off_val; mmcr0_counter_off_val = power5_mmcr0_counter_off_val; mmcr1_counter_off_val = power5_mmcr1_counter_off_val; pmd_priv_vec = gs_pmd_priv_vec; pe = power6_pe; groups = power6_groups; return PFMLIB_SUCCESS; } if (__is_processor(PV_POWER7)) { gen_powerpc_support.pmu_type = PFMLIB_POWER7_PMU; gen_powerpc_support.pmu_name = "POWER7"; gen_powerpc_support.pme_count = POWER7_PME_EVENT_COUNT; gen_powerpc_support.pmd_count = POWER7_NUM_EVENT_COUNTERS; gen_powerpc_support.pmc_count = POWER7_NUM_CONTROL_REGS; gen_powerpc_support.num_cnt = POWER7_NUM_EVENT_COUNTERS; mmcr0_fc5_6_mask = 0; mmcr0_counter_mask = power5_mmcr0_counter_mask; mmcr1_counter_mask = power5_mmcr1_counter_mask; mmcr0_counter_off_val = power5_mmcr0_counter_off_val; mmcr1_counter_off_val = power5_mmcr1_counter_off_val; mmcr0_counter_off_val = power5_mmcr0_counter_off_val; mmcr1_counter_off_val = power5_mmcr1_counter_off_val; pmd_priv_vec = gr_pmd_priv_vec; pe = power7_pe; groups = power7_groups; return PFMLIB_SUCCESS; } return PFMLIB_ERR_NOTSUPP; } /** * pfm_gen_powerpc_get_impl_pmcs * * Set the appropriate bit in the impl_pmcs bitmask for each PMC that's * available on power4. 
**/ static void pfm_gen_powerpc_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs) { impl_pmcs->bits[0] = (0xffffffff >> (32 - get_num_control_regs())); } /** * pfm_gen_powerpc_get_impl_pmds * * Set the appropriate bit in the impl_pmds bitmask for each PMD that's * available. **/ static void pfm_gen_powerpc_get_impl_pmds(pfmlib_regmask_t *impl_pmds) { impl_pmds->bits[0] = (0xffffffff >> (32 - get_num_event_counters())); } /** * pfm_gen_powerpc_get_impl_counters * * Set the appropriate bit in the impl_counters bitmask for each counter * that's available on power4. * * For now, all PMDs are counters, so just call get_impl_pmds(). **/ static void pfm_gen_powerpc_get_impl_counters(pfmlib_regmask_t *impl_counters) { pfm_gen_powerpc_get_impl_pmds(impl_counters); } /** * pfm_gen_powerpc_get_hw_counter_width * * Return the number of usable bits in the PMD counters. **/ static void pfm_gen_powerpc_get_hw_counter_width(unsigned int *width) { *width = 64; } /** * pfm_gen_powerpc_get_event_desc * * Return the description for the specified event (if it has one). **/ static int pfm_gen_powerpc_get_event_desc(unsigned int event, char **desc) { *desc = strdup(get_long_desc(event)); return 0; } /** * pfm_gen_powerpc_get_event_mask_desc * * Return the description for the specified event-mask (if it has one).
**/ static int pfm_gen_powerpc_get_event_mask_desc(unsigned int event, unsigned int mask, char **desc) { *desc = strdup(""); return 0; } static int pfm_gen_powerpc_get_event_mask_code(unsigned int event, unsigned int mask, unsigned int *code) { *code = 0; return 0; } static int pfm_gen_powerpc_get_cycle_event(pfmlib_event_t *e) { switch (gen_powerpc_support.pmu_type) { case PFMLIB_PPC970_PMU: e->event = PPC970_PME_PM_CYC; break; case PFMLIB_PPC970MP_PMU: e->event = PPC970MP_PME_PM_CYC; break; case PFMLIB_POWER4_PMU: e->event = POWER4_PME_PM_CYC; break; case PFMLIB_POWER5_PMU: e->event = POWER5_PME_PM_CYC; break; case PFMLIB_POWER5p_PMU: e->event = POWER5p_PME_PM_RUN_CYC; break; case PFMLIB_POWER6_PMU: e->event = POWER6_PME_PM_RUN_CYC; break; case PFMLIB_POWER7_PMU: e->event = POWER7_PME_PM_RUN_CYC; break; default: /* perhaps gen_powerpc_support.pmu_type wasn't initialized? */ return PFMLIB_ERR_NOINIT; } e->num_masks = 0; e->unit_masks[0] = 0; return PFMLIB_SUCCESS; } static int pfm_gen_powerpc_get_inst_retired(pfmlib_event_t *e) { switch (gen_powerpc_support.pmu_type) { case PFMLIB_PPC970_PMU: e->event = PPC970_PME_PM_INST_CMPL; break; case PFMLIB_PPC970MP_PMU: e->event = PPC970MP_PME_PM_INST_CMPL; break; case PFMLIB_POWER4_PMU: e->event = POWER4_PME_PM_INST_CMPL; break; case PFMLIB_POWER5_PMU: e->event = POWER5_PME_PM_INST_CMPL; break; case PFMLIB_POWER5p_PMU: e->event = POWER5p_PME_PM_INST_CMPL; break; case PFMLIB_POWER6_PMU: e->event = POWER6_PME_PM_INST_CMPL; break; case PFMLIB_POWER7_PMU: e->event = POWER7_PME_PM_INST_CMPL; break; default: /* perhaps gen_powerpc_support.pmu_type wasn't initialized?
*/ return PFMLIB_ERR_NOINIT; } e->num_masks = 0; e->unit_masks[0] = 0; return 0; } /** * gen_powerpc_support **/ pfm_pmu_support_t gen_powerpc_support = { /* the next 6 fields are initialized in pfm_gen_powerpc_pmu_detect */ .pmu_name = NULL, .pmu_type = PFMLIB_UNKNOWN_PMU, .pme_count = 0, .pmd_count = 0, .pmc_count = 0, .num_cnt = 0, .get_event_code = pfm_gen_powerpc_get_event_code, .get_event_name = pfm_gen_powerpc_get_event_name, .get_event_mask_name = pfm_gen_powerpc_get_event_mask_name, .get_event_counters = pfm_gen_powerpc_get_event_counters, .get_num_event_masks = pfm_gen_powerpc_get_num_event_masks, .dispatch_events = pfm_gen_powerpc_dispatch_events, .pmu_detect = pfm_gen_powerpc_pmu_detect, .get_impl_pmcs = pfm_gen_powerpc_get_impl_pmcs, .get_impl_pmds = pfm_gen_powerpc_get_impl_pmds, .get_impl_counters = pfm_gen_powerpc_get_impl_counters, .get_hw_counter_width = pfm_gen_powerpc_get_hw_counter_width, .get_event_desc = pfm_gen_powerpc_get_event_desc, .get_event_mask_desc = pfm_gen_powerpc_get_event_mask_desc, .get_event_mask_code = pfm_gen_powerpc_get_event_mask_code, .get_cycle_event = pfm_gen_powerpc_get_cycle_event, .get_inst_retired_event = pfm_gen_powerpc_get_inst_retired }; papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_i386_p6.c000066400000000000000000000447761502707512200221230ustar00rootroot00000000000000/* * pfmlib_i386_pm.c : support for the P6 processor family (family=6) * incl. Pentium II, Pentium III, Pentium Pro, Pentium M * * Copyright (c) 2005-2007 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdio.h> #include <stdlib.h> /* public headers */ #include <perfmon/pfmlib.h> /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_i386_p6_priv.h" /* architecture private */ #include "i386_p6_events.h" /* event tables */ /* let's define some handy shortcuts!
*/ #define sel_event_mask perfsel.sel_event_mask #define sel_unit_mask perfsel.sel_unit_mask #define sel_usr perfsel.sel_usr #define sel_os perfsel.sel_os #define sel_edge perfsel.sel_edge #define sel_pc perfsel.sel_pc #define sel_int perfsel.sel_int #define sel_en perfsel.sel_en #define sel_inv perfsel.sel_inv #define sel_cnt_mask perfsel.sel_cnt_mask static char * pfm_i386_p6_get_event_name(unsigned int i); static pme_i386_p6_entry_t *i386_pe; static int i386_p6_cycle_event, i386_p6_inst_retired_event; #define PFMLIB_I386_P6_HAS_COMBO(_e) ((i386_pe[_e].pme_flags & PFMLIB_I386_P6_UMASK_COMBO) != 0) #define PFMLIB_I386_P6_ALL_FLAGS \ (PFM_I386_P6_SEL_INV|PFM_I386_P6_SEL_EDGE) /* * Description of the PMC register mappings used by * this module. * pfp_pmcs[].reg_num: * 0 -> PMC0 -> PERFEVTSEL0 -> MSR @ 0x186 * 1 -> PMC1 -> PERFEVTSEL1 -> MSR @ 0x187 * pfp_pmds[].reg_num: * 0 -> PMD0 -> PERFCTR0 -> MSR @ 0xc1 * 1 -> PMD1 -> PERFCTR1 -> MSR @ 0xc2 */ #define I386_P6_SEL_BASE 0x186 #define I386_P6_CTR_BASE 0xc1 static void pfm_i386_p6_get_impl_counters(pfmlib_regmask_t *impl_counters); static int pfm_i386_detect_common(void) { int ret, family; char buffer[128]; ret = __pfm_getcpuinfo_attr("vendor_id", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; if (strcmp(buffer, "GenuineIntel")) return PFMLIB_ERR_NOTSUPP; ret = __pfm_getcpuinfo_attr("cpu family", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; family = atoi(buffer); return family != 6 ?
PFMLIB_ERR_NOTSUPP : PFMLIB_SUCCESS; } /* * detect Pentium Pro */ static int pfm_i386_p6_detect_ppro(void) { int ret, model; char buffer[128]; ret = pfm_i386_detect_common(); if (ret != PFMLIB_SUCCESS) return ret; ret = __pfm_getcpuinfo_attr("model", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; model = atoi(buffer); if (model != 1) return PFMLIB_ERR_NOTSUPP; return PFMLIB_SUCCESS; } static int pfm_i386_p6_init_ppro(void) { i386_pe = i386_ppro_pe; i386_p6_cycle_event = PME_I386_PPRO_CPU_CLK_UNHALTED; i386_p6_inst_retired_event = PME_I386_PPRO_INST_RETIRED; return PFMLIB_SUCCESS; } /* * detect Pentium II */ static int pfm_i386_p6_detect_pii(void) { int ret, model; char buffer[128]; ret = pfm_i386_detect_common(); if (ret != PFMLIB_SUCCESS) return ret; ret = __pfm_getcpuinfo_attr("model", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; model = atoi(buffer); switch(model) { case 3: /* Pentium II */ case 5: /* Pentium II Deschutes */ case 6: /* Pentium II Mendocino */ break; default: return PFMLIB_ERR_NOTSUPP; } return PFMLIB_SUCCESS; } static int pfm_i386_p6_init_pii(void) { i386_pe = i386_pII_pe; i386_p6_cycle_event = PME_I386_PII_CPU_CLK_UNHALTED; i386_p6_inst_retired_event = PME_I386_PII_INST_RETIRED; return PFMLIB_SUCCESS; } /* * detect Pentium III */ static int pfm_i386_p6_detect_piii(void) { int ret, model; char buffer[128]; ret = pfm_i386_detect_common(); if (ret != PFMLIB_SUCCESS) return ret; ret = __pfm_getcpuinfo_attr("model", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; model = atoi(buffer); switch(model) { case 7: /* Pentium III Katmai */ case 8: /* Pentium III Coppermine */ case 10:/* Pentium III Cascades */ case 11:/* Pentium III Tualatin */ break; default: return PFMLIB_ERR_NOTSUPP; } return PFMLIB_SUCCESS; } static int pfm_i386_p6_init_piii(void) { i386_pe = i386_pIII_pe; i386_p6_cycle_event = PME_I386_PIII_CPU_CLK_UNHALTED; i386_p6_inst_retired_event = PME_I386_PIII_INST_RETIRED; return 
PFMLIB_SUCCESS; } /* * detect Pentium M */ static int pfm_i386_p6_detect_pm(void) { int ret, model; char buffer[128]; ret = pfm_i386_detect_common(); if (ret != PFMLIB_SUCCESS) return ret; ret = __pfm_getcpuinfo_attr("model", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; model = atoi(buffer); switch (model) { case 9: case 13: break; default: return PFMLIB_ERR_NOTSUPP; } return PFMLIB_SUCCESS; } static int pfm_i386_p6_init_pm(void) { i386_pe = i386_pm_pe; i386_p6_cycle_event = PME_I386_PM_CPU_CLK_UNHALTED; i386_p6_inst_retired_event = PME_I386_PM_INST_RETIRED; return PFMLIB_SUCCESS; } /* * Automatically dispatch events to corresponding counters following constraints. * Upon return the pfarg_regt structure is ready to be submitted to kernel */ static int pfm_i386_p6_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_i386_p6_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfmlib_i386_p6_input_param_t *param = mod_in; pfmlib_i386_p6_counter_t *cntrs; pfm_i386_p6_sel_reg_t reg; pfmlib_event_t *e; pfmlib_reg_t *pc, *pd; pfmlib_regmask_t impl_cntrs, avail_cntrs; unsigned long plm; unsigned int i, j, cnt, k, umask; unsigned int assign[PMU_I386_P6_NUM_COUNTERS]; e = inp->pfp_events; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; cnt = inp->pfp_event_count; cntrs = param ? 
param->pfp_i386_p6_counters : NULL; if (PFMLIB_DEBUG()) { for (j=0; j < cnt; j++) { DPRINT("ev[%d]=%s\n", j, i386_pe[e[j].event].pme_name); } } if (cnt > PMU_I386_P6_NUM_COUNTERS) return PFMLIB_ERR_TOOMANY; pfm_i386_p6_get_impl_counters(&impl_cntrs); pfm_regmask_andnot(&avail_cntrs, &impl_cntrs, &inp->pfp_unavail_pmcs); DPRINT("impl=0x%lx avail=0x%lx unavail=0x%lx\n", impl_cntrs.bits[0], avail_cntrs.bits[0], inp->pfp_unavail_pmcs.bits[0]); for(j=0; j < cnt; j++) { /* * P6 only supports two priv levels for perf counters */ if (e[j].plm & (PFM_PLM1|PFM_PLM2)) { DPRINT("event=%d invalid plm=%d\n", e[j].event, e[j].plm); return PFMLIB_ERR_INVAL; } if (cntrs && cntrs[j].flags & ~PFMLIB_I386_P6_ALL_FLAGS) { DPRINT("event=%d invalid flags=0x%lx\n", e[j].event, cntrs[j].flags); return PFMLIB_ERR_INVAL; } /* * check illegal unit masks combination */ if (e[j].num_masks > 1 && PFMLIB_I386_P6_HAS_COMBO(e[j].event) == 0) { DPRINT("event does not support unit mask combination\n"); return PFMLIB_ERR_FEATCOMB; } } /* * first pass: events for fixed counters */ for(j=0; j < cnt; j++) { if (i386_pe[e[j].event].pme_flags & PFMLIB_I386_P6_CTR0_ONLY) { if (!pfm_regmask_isset(&avail_cntrs, 0)) return PFMLIB_ERR_NOASSIGN; assign[j] = 0; pfm_regmask_clr(&avail_cntrs, 0); } else if (i386_pe[e[j].event].pme_flags & PFMLIB_I386_P6_CTR1_ONLY) { if (!pfm_regmask_isset(&avail_cntrs, 1)) return PFMLIB_ERR_NOASSIGN; assign[j] = 1; pfm_regmask_clr(&avail_cntrs, 1); } } /* * second pass: events with no constraints */ for (j=0, i=0; j < cnt ; j++ ) { if (i386_pe[e[j].event].pme_flags & (PFMLIB_I386_P6_CTR0_ONLY|PFMLIB_I386_P6_CTR1_ONLY)) continue; while (i < PMU_I386_P6_NUM_COUNTERS && !pfm_regmask_isset(&avail_cntrs, i)) i++; if (i == PMU_I386_P6_NUM_COUNTERS) return PFMLIB_ERR_NOASSIGN; pfm_regmask_clr(&avail_cntrs, i); assign[j] = i++; } /* * final pass: assign value to registers */ for (j=0; j < cnt ; j++) { reg.val = 0; /* assume reserved bits are zeroed */ /* if plm is 0, then assume not specified
per-event and use default */ plm = e[j].plm ? e[j].plm : inp->pfp_dfl_plm; reg.sel_event_mask = i386_pe[e[j].event].pme_code; /* * some events have only a single umask. We do not create * specific umask entry in this case. The umask code is taken * out of the (extended) event code (2nd byte) */ umask = (i386_pe[e[j].event].pme_code >> 8) & 0xff; for(k=0; k < e[j].num_masks; k++) { umask |= i386_pe[e[j].event].pme_umasks[e[j].unit_masks[k]].pme_ucode; } reg.sel_unit_mask = umask; reg.sel_usr = plm & PFM_PLM3 ? 1 : 0; reg.sel_os = plm & PFM_PLM0 ? 1 : 0; reg.sel_int = 1; /* force APIC int to 1 */ /* * only perfevtsel0 has an enable bit (allows atomic start/stop) */ if (assign[j] == 0) reg.sel_en = 1; /* force enable bit to 1 */ if (cntrs) { reg.sel_cnt_mask = cntrs[j].cnt_mask; reg.sel_edge = cntrs[j].flags & PFM_I386_P6_SEL_EDGE ? 1 : 0; reg.sel_inv = cntrs[j].flags & PFM_I386_P6_SEL_INV ? 1 : 0; } pc[j].reg_num = assign[j]; pc[j].reg_value = reg.val; pc[j].reg_addr = I386_P6_SEL_BASE+assign[j]; pc[j].reg_alt_addr= I386_P6_SEL_BASE+assign[j]; pd[j].reg_num = assign[j]; pd[j].reg_addr = I386_P6_CTR_BASE+assign[j]; /* index to use with RDPMC */ pd[j].reg_alt_addr = assign[j]; __pfm_vbprintf("[PERFEVTSEL%u(pmc%u)=0x%lx emask=0x%x umask=0x%x os=%d usr=%d en=%d int=%d inv=%d edge=%d cnt_mask=%d] %s\n", assign[j], assign[j], reg.val, reg.sel_event_mask, reg.sel_unit_mask, reg.sel_os, reg.sel_usr, reg.sel_en, reg.sel_int, reg.sel_inv, reg.sel_edge, reg.sel_cnt_mask, i386_pe[e[j].event].pme_name); __pfm_vbprintf("[PMC%u(pmd%u)]\n", pd[j].reg_num, pd[j].reg_num); } /* * add perfsel0 if not used. 
This is required as it holds * the enable bit for all counters */ if (pfm_regmask_isset(&avail_cntrs, 0)) { reg.val = 0; reg.sel_en = 1; /* force enable bit to 1 */ pc[j].reg_num = 0; pc[j].reg_value = reg.val; pc[j].reg_addr = I386_P6_SEL_BASE; pc[j].reg_alt_addr = I386_P6_SEL_BASE; j++; __pfm_vbprintf("[PERFEVTSEL0(pmc0)=0x%lx] required for enabling counters\n", reg.val); } /* number of evtsel registers programmed */ outp->pfp_pmc_count = j; outp->pfp_pmd_count = cnt; return PFMLIB_SUCCESS; } static int pfm_i386_p6_dispatch_events(pfmlib_input_param_t *inp, void *model_in, pfmlib_output_param_t *outp, void *model_out) { pfmlib_i386_p6_input_param_t *mod_in = (pfmlib_i386_p6_input_param_t *)model_in; if (inp->pfp_dfl_plm & (PFM_PLM1|PFM_PLM2)) { DPRINT("invalid plm=%x\n", inp->pfp_dfl_plm); return PFMLIB_ERR_INVAL; } return pfm_i386_p6_dispatch_counters(inp, mod_in, outp); } static int pfm_i386_p6_get_event_code(unsigned int i, unsigned int cnt, int *code) { if (cnt != PFMLIB_CNT_FIRST && cnt > 2) return PFMLIB_ERR_INVAL; *code = i386_pe[i].pme_code; return PFMLIB_SUCCESS; } static void pfm_i386_p6_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { unsigned int i; memset(counters, 0, sizeof(*counters)); if (i386_pe[j].pme_flags & PFMLIB_I386_P6_CTR0_ONLY) { pfm_regmask_set(counters, 0); } else if (i386_pe[j].pme_flags & PFMLIB_I386_P6_CTR1_ONLY) { pfm_regmask_set(counters, 1); } else { for(i=0; i < PMU_I386_P6_NUM_COUNTERS; i++) pfm_regmask_set(counters, i); } } static void pfm_i386_p6_get_impl_perfsel(pfmlib_regmask_t *impl_pmcs) { unsigned int i = 0; /* all pmcs are contiguous */ for(i=0; i < PMU_I386_P6_NUM_PERFSEL; i++) pfm_regmask_set(impl_pmcs, i); } static void pfm_i386_p6_get_impl_perfctr(pfmlib_regmask_t *impl_pmds) { unsigned int i = 0; /* all pmds are contiguous */ for(i=0; i < PMU_I386_P6_NUM_PERFCTR; i++) pfm_regmask_set(impl_pmds, i); } static void pfm_i386_p6_get_impl_counters(pfmlib_regmask_t *impl_counters) { unsigned int i = 0; /* 
counting pmds are contiguous */ for(i=0; i < PMU_I386_P6_NUM_COUNTERS; i++) pfm_regmask_set(impl_counters, i); } static void pfm_i386_p6_get_hw_counter_width(unsigned int *width) { *width = PMU_I386_P6_COUNTER_WIDTH; } static char * pfm_i386_p6_get_event_name(unsigned int i) { return i386_pe[i].pme_name; } static int pfm_i386_p6_get_event_description(unsigned int ev, char **str) { char *s; s = i386_pe[ev].pme_desc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static char * pfm_i386_p6_get_event_mask_name(unsigned int ev, unsigned int midx) { return i386_pe[ev].pme_umasks[midx].pme_uname; } static int pfm_i386_p6_get_event_mask_desc(unsigned int ev, unsigned int midx, char **str) { char *s; s = i386_pe[ev].pme_umasks[midx].pme_udesc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static unsigned int pfm_i386_p6_get_num_event_masks(unsigned int ev) { return i386_pe[ev].pme_numasks; } static int pfm_i386_p6_get_event_mask_code(unsigned int ev, unsigned int midx, unsigned int *code) { *code = i386_pe[ev].pme_umasks[midx].pme_ucode; return PFMLIB_SUCCESS; } static int pfm_i386_p6_get_cycle_event(pfmlib_event_t *e) { e->event = i386_p6_cycle_event; return PFMLIB_SUCCESS; } static int pfm_i386_p6_get_inst_retired(pfmlib_event_t *e) { e->event = i386_p6_inst_retired_event; return PFMLIB_SUCCESS; } /* Pentium II support */ pfm_pmu_support_t i386_pii_support={ .pmu_name = "Intel Pentium II", .pmu_type = PFMLIB_INTEL_PII_PMU, .pme_count = PME_I386_PII_EVENT_COUNT, .pmc_count = PMU_I386_P6_NUM_PERFSEL, .pmd_count = PMU_I386_P6_NUM_PERFCTR, .num_cnt = PMU_I386_P6_NUM_COUNTERS, .get_event_code = pfm_i386_p6_get_event_code, .get_event_name = pfm_i386_p6_get_event_name, .get_event_counters = pfm_i386_p6_get_event_counters, .dispatch_events = pfm_i386_p6_dispatch_events, .pmu_detect = pfm_i386_p6_detect_pii, .pmu_init = pfm_i386_p6_init_pii, .get_impl_pmcs = pfm_i386_p6_get_impl_perfsel, .get_impl_pmds = 
pfm_i386_p6_get_impl_perfctr, .get_impl_counters = pfm_i386_p6_get_impl_counters, .get_hw_counter_width = pfm_i386_p6_get_hw_counter_width, .get_event_desc = pfm_i386_p6_get_event_description, .get_num_event_masks = pfm_i386_p6_get_num_event_masks, .get_event_mask_name = pfm_i386_p6_get_event_mask_name, .get_event_mask_code = pfm_i386_p6_get_event_mask_code, .get_event_mask_desc = pfm_i386_p6_get_event_mask_desc, .get_cycle_event = pfm_i386_p6_get_cycle_event, .get_inst_retired_event = pfm_i386_p6_get_inst_retired }; /* Generic P6 processor support (not incl. Pentium M) */ pfm_pmu_support_t i386_p6_support={ .pmu_name = "Intel P6 Processor Family", .pmu_type = PFMLIB_I386_P6_PMU, .pme_count = PME_I386_PIII_EVENT_COUNT, .pmc_count = PMU_I386_P6_NUM_PERFSEL, .pmd_count = PMU_I386_P6_NUM_PERFCTR, .num_cnt = PMU_I386_P6_NUM_COUNTERS, .get_event_code = pfm_i386_p6_get_event_code, .get_event_name = pfm_i386_p6_get_event_name, .get_event_counters = pfm_i386_p6_get_event_counters, .dispatch_events = pfm_i386_p6_dispatch_events, .pmu_detect = pfm_i386_p6_detect_piii, .pmu_init = pfm_i386_p6_init_piii, .get_impl_pmcs = pfm_i386_p6_get_impl_perfsel, .get_impl_pmds = pfm_i386_p6_get_impl_perfctr, .get_impl_counters = pfm_i386_p6_get_impl_counters, .get_hw_counter_width = pfm_i386_p6_get_hw_counter_width, .get_event_desc = pfm_i386_p6_get_event_description, .get_num_event_masks = pfm_i386_p6_get_num_event_masks, .get_event_mask_name = pfm_i386_p6_get_event_mask_name, .get_event_mask_code = pfm_i386_p6_get_event_mask_code, .get_event_mask_desc = pfm_i386_p6_get_event_mask_desc, .get_cycle_event = pfm_i386_p6_get_cycle_event, .get_inst_retired_event = pfm_i386_p6_get_inst_retired }; pfm_pmu_support_t i386_ppro_support={ .pmu_name = "Intel Pentium Pro", .pmu_type = PFMLIB_INTEL_PPRO_PMU, .pme_count = PME_I386_PPRO_EVENT_COUNT, .pmc_count = PMU_I386_P6_NUM_PERFSEL, .pmd_count = PMU_I386_P6_NUM_PERFCTR, .num_cnt = PMU_I386_P6_NUM_COUNTERS, .get_event_code = 
pfm_i386_p6_get_event_code, .get_event_name = pfm_i386_p6_get_event_name, .get_event_counters = pfm_i386_p6_get_event_counters, .dispatch_events = pfm_i386_p6_dispatch_events, .pmu_detect = pfm_i386_p6_detect_ppro, .pmu_init = pfm_i386_p6_init_ppro, .get_impl_pmcs = pfm_i386_p6_get_impl_perfsel, .get_impl_pmds = pfm_i386_p6_get_impl_perfctr, .get_impl_counters = pfm_i386_p6_get_impl_counters, .get_hw_counter_width = pfm_i386_p6_get_hw_counter_width, .get_event_desc = pfm_i386_p6_get_event_description, .get_num_event_masks = pfm_i386_p6_get_num_event_masks, .get_event_mask_name = pfm_i386_p6_get_event_mask_name, .get_event_mask_code = pfm_i386_p6_get_event_mask_code, .get_event_mask_desc = pfm_i386_p6_get_event_mask_desc, .get_cycle_event = pfm_i386_p6_get_cycle_event, .get_inst_retired_event = pfm_i386_p6_get_inst_retired }; /* Pentium M support */ pfm_pmu_support_t i386_pm_support={ .pmu_name = "Intel Pentium M", .pmu_type = PFMLIB_I386_PM_PMU, .pme_count = PME_I386_PM_EVENT_COUNT, .pmc_count = PMU_I386_P6_NUM_PERFSEL, .pmd_count = PMU_I386_P6_NUM_PERFCTR, .num_cnt = PMU_I386_P6_NUM_COUNTERS, .get_event_code = pfm_i386_p6_get_event_code, .get_event_name = pfm_i386_p6_get_event_name, .get_event_counters = pfm_i386_p6_get_event_counters, .dispatch_events = pfm_i386_p6_dispatch_events, .pmu_detect = pfm_i386_p6_detect_pm, .pmu_init = pfm_i386_p6_init_pm, .get_impl_pmcs = pfm_i386_p6_get_impl_perfsel, .get_impl_pmds = pfm_i386_p6_get_impl_perfctr, .get_impl_counters = pfm_i386_p6_get_impl_counters, .get_hw_counter_width = pfm_i386_p6_get_hw_counter_width, .get_event_desc = pfm_i386_p6_get_event_description, .get_num_event_masks = pfm_i386_p6_get_num_event_masks, .get_event_mask_name = pfm_i386_p6_get_event_mask_name, .get_event_mask_code = pfm_i386_p6_get_event_mask_code, .get_event_mask_desc = pfm_i386_p6_get_event_mask_desc, .get_cycle_event = pfm_i386_p6_get_cycle_event, .get_inst_retired_event = pfm_i386_p6_get_inst_retired }; 
papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_i386_p6_priv.h000066400000000000000000000042651502707512200231550ustar00rootroot00000000000000/* * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
*/ #ifndef __PFMLIB_I386_P6_PRIV_H__ #define __PFMLIB_I386_P6_PRIV_H__ #define PFMLIB_I386_P6_MAX_UMASK 16 typedef struct { char *pme_uname; /* unit mask name */ char *pme_udesc; /* event/umask description */ unsigned int pme_ucode; /* unit mask code */ } pme_i386_p6_umask_t; typedef struct { char *pme_name; /* event name */ char *pme_desc; /* event description */ pme_i386_p6_umask_t pme_umasks[PFMLIB_I386_P6_MAX_UMASK]; /* umask desc */ unsigned int pme_code; /* event code */ unsigned int pme_numasks; /* number of umasks */ unsigned int pme_flags; /* flags */ } pme_i386_p6_entry_t; /* * pme_flags values */ #define PFMLIB_I386_P6_UMASK_COMBO 0x01 /* unit mask can be combined */ #define PFMLIB_I386_P6_CTR0_ONLY 0x02 /* event can only be counted on counter 0 */ #define PFMLIB_I386_P6_CTR1_ONLY 0x04 /* event can only be counted on counter 1 */ #endif /* __PFMLIB_I386_P6_PRIV_H__ */ papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_intel_atom.c000066400000000000000000000524221502707512200231430ustar00rootroot00000000000000/* * pfmlib_intel_atom.c : Intel Atom PMU * * Copyright (c) 2008 Google, Inc * Contributed by Stephane Eranian * * Based on work: * Copyright (c) 2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * * This file implements support for Intel Core PMU as specified in the following document: * "IA-32 Intel Architecture Software Developer's Manual - Volume 3B: System * Programming Guide" * * Intel Atom = architectural v3 + PEBS */ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdio.h> #include <stdlib.h> /* public headers */ #include <perfmon/pfmlib.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_atom_priv.h" #include "intel_atom_events.h" /* let's define some handy shortcuts! */ #define sel_event_select perfevtsel.sel_event_select #define sel_unit_mask perfevtsel.sel_unit_mask #define sel_usr perfevtsel.sel_usr #define sel_os perfevtsel.sel_os #define sel_edge perfevtsel.sel_edge #define sel_pc perfevtsel.sel_pc #define sel_int perfevtsel.sel_int #define sel_en perfevtsel.sel_en #define sel_inv perfevtsel.sel_inv #define sel_cnt_mask perfevtsel.sel_cnt_mask #define sel_any perfevtsel.sel_any #define has_pebs(i) (intel_atom_pe[i].pme_flags & PFMLIB_INTEL_ATOM_PEBS) /* * Description of the PMC register mappings: * * 0 -> PMC0 -> PERFEVTSEL0 * 1 -> PMC1 -> PERFEVTSEL1 * 16 -> PMC16 -> FIXED_CTR_CTRL * 17 -> PMC17 -> PEBS_ENABLED * * Description of the PMD register mapping: * * 0 -> PMD0 -> PMC0 * 1 -> PMD1 -> PMC1 * 16 -> PMD2 -> FIXED_CTR0 * 17 -> PMD3 -> FIXED_CTR1 * 18 -> PMD4 -> FIXED_CTR2 */ #define INTEL_ATOM_SEL_BASE 0x186 #define INTEL_ATOM_CTR_BASE 0xc1 #define FIXED_CTR_BASE 0x309 #define PFMLIB_INTEL_ATOM_ALL_FLAGS \ (PFM_INTEL_ATOM_SEL_INV|PFM_INTEL_ATOM_SEL_EDGE|PFM_INTEL_ATOM_SEL_ANYTHR) static
pfmlib_regmask_t intel_atom_impl_pmcs, intel_atom_impl_pmds; static int highest_counter; static int pfm_intel_atom_detect(void) { int ret, family, model; char buffer[128]; ret = __pfm_getcpuinfo_attr("vendor_id", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; if (strcmp(buffer, "GenuineIntel")) return PFMLIB_ERR_NOTSUPP; ret = __pfm_getcpuinfo_attr("cpu family", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; family = atoi(buffer); ret = __pfm_getcpuinfo_attr("model", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; model = atoi(buffer); /* * Atom : family 6 model 28 */ return family == 6 && model == 28 ? PFMLIB_SUCCESS : PFMLIB_ERR_NOTSUPP; } static int pfm_intel_atom_init(void) { int i; /* generic counters */ pfm_regmask_set(&intel_atom_impl_pmcs, 0); pfm_regmask_set(&intel_atom_impl_pmds, 0); pfm_regmask_set(&intel_atom_impl_pmcs, 1); pfm_regmask_set(&intel_atom_impl_pmds, 1); /* fixed counters */ pfm_regmask_set(&intel_atom_impl_pmcs, 16); pfm_regmask_set(&intel_atom_impl_pmds, 16); pfm_regmask_set(&intel_atom_impl_pmds, 17); pfm_regmask_set(&intel_atom_impl_pmds, 18); /* lbr */ pfm_regmask_set(&intel_atom_impl_pmds, 19); for(i=0; i < 16; i++) pfm_regmask_set(&intel_atom_impl_pmds, i); highest_counter = 18; /* PEBS */ pfm_regmask_set(&intel_atom_impl_pmcs, 17); return PFMLIB_SUCCESS; } static int pfm_intel_atom_is_fixed(pfmlib_event_t *e, unsigned int f) { unsigned int fl, flc, i; unsigned int mask = 0; fl = intel_atom_pe[e->event].pme_flags; /* * first pass: check if event as a whole supports fixed counters */ switch(f) { case 0: mask = PFMLIB_INTEL_ATOM_FIXED0; break; case 1: mask = PFMLIB_INTEL_ATOM_FIXED1; break; case 2: mask = PFMLIB_INTEL_ATOM_FIXED2_ONLY; break; default: return 0; } if (fl & mask) return 1; /* * second pass: check if unit mask supports fixed counter * * reject if mask not found OR if not all unit masks have * same fixed counter mask */ flc = 0; for(i=0; i < e->num_masks; i++) { fl 
= intel_atom_pe[e->event].pme_umasks[e->unit_masks[i]].pme_flags; if (fl & mask) flc++; } return flc > 0 && flc == e->num_masks ? 1 : 0; } /* * IMPORTANT: the interface guarantees that pfp_pmds[] elements are returned in the order the events * were submitted. */ static int pfm_intel_atom_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_intel_atom_input_param_t *param, pfmlib_output_param_t *outp) { #define HAS_OPTIONS(x) (cntrs && (cntrs[x].flags || cntrs[x].cnt_mask)) #define is_fixed_pmc(a) (a == 16 || a == 17 || a == 18) pfmlib_intel_atom_counter_t *cntrs; pfm_intel_atom_sel_reg_t reg; pfmlib_event_t *e; pfmlib_reg_t *pc, *pd; pfmlib_regmask_t *r_pmcs; uint64_t val; unsigned long plm; unsigned long long fixed_ctr; unsigned int npc, npmc0, npmc1, nf2; unsigned int i, j, n, k, ucode, use_pebs = 0, done_pebs; unsigned int assign_pc[PMU_INTEL_ATOM_NUM_COUNTERS]; unsigned int next_gen, last_gen; npc = npmc0 = npmc1 = nf2 = 0; e = inp->pfp_events; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; n = inp->pfp_event_count; r_pmcs = &inp->pfp_unavail_pmcs; cntrs = param ? param->pfp_intel_atom_counters : NULL; use_pebs = param ? 
param->pfp_intel_atom_pebs_used : 0; if (n > PMU_INTEL_ATOM_NUM_COUNTERS) return PFMLIB_ERR_TOOMANY; /* * initialize to empty */ for(i=0; i < PMU_INTEL_ATOM_NUM_COUNTERS; i++) assign_pc[i] = -1; /* * error checking */ for(i=0; i < n; i++) { /* * only supports two priv levels for perf counters */ if (e[i].plm & (PFM_PLM1|PFM_PLM2)) return PFMLIB_ERR_INVAL; /* * check for valid flags */ if (cntrs && cntrs[i].flags & ~PFMLIB_INTEL_ATOM_ALL_FLAGS) return PFMLIB_ERR_INVAL; if (intel_atom_pe[e[i].event].pme_flags & PFMLIB_INTEL_ATOM_UMASK_NCOMBO && e[i].num_masks > 1) { DPRINT("event does not support unit mask combination\n"); return PFMLIB_ERR_NOASSIGN; } /* * check event-level single register constraint (PMC0, PMC1, FIXED_CTR2) * fail if more than one event requires the same counter */ if (intel_atom_pe[e[i].event].pme_flags & PFMLIB_INTEL_ATOM_PMC0) { if (++npmc0 > 1) { DPRINT("two events compete for a PMC0\n"); return PFMLIB_ERR_NOASSIGN; } } /* * check if PMC1 is available and if only one event is dependent on it */ if (intel_atom_pe[e[i].event].pme_flags & PFMLIB_INTEL_ATOM_PMC1) { if (++npmc1 > 1) { DPRINT("two events compete for a PMC1\n"); return PFMLIB_ERR_NOASSIGN; } } /* * UNHALTED_REFERENCE_CYCLES can only be measured on FIXED_CTR2 */ if (intel_atom_pe[e[i].event].pme_flags & PFMLIB_INTEL_ATOM_FIXED2_ONLY) { if (++nf2 > 1) { DPRINT("two events compete for FIXED_CTR2\n"); return PFMLIB_ERR_NOASSIGN; } if (cntrs && ((cntrs[i].flags & (PFM_INTEL_ATOM_SEL_EDGE|PFM_INTEL_ATOM_SEL_INV)) || cntrs[i].cnt_mask)) { DPRINT("UNHALTED_REFERENCE_CYCLES only accepts anythr filter\n"); return PFMLIB_ERR_NOASSIGN; } } /* * unit-mask level constraint checking (PMC0, PMC1, FIXED_CTR2) */ for(j=0; j < e[i].num_masks; j++) { unsigned int flags; flags = intel_atom_pe[e[i].event].pme_umasks[e[i].unit_masks[j]].pme_flags; if (flags & PFMLIB_INTEL_ATOM_FIXED2_ONLY) { if (++nf2 > 1) { DPRINT("two events compete for FIXED_CTR2\n"); return PFMLIB_ERR_NOASSIGN; } if
(HAS_OPTIONS(i)) { DPRINT("fixed counters do not support inversion/counter-mask\n"); return PFMLIB_ERR_NOASSIGN; } } } } next_gen = 0; /* first generic counter */ last_gen = 1; /* last generic counter */ /* * strongest constraint first: works only in IA32_PMC0, IA32_PMC1, FIXED_CTR2 * * When PEBS is used, we pick the first PEBS event and * place it into PMC0. Subsequent PEBS events will go * in the other counters. */ done_pebs = 0; for(i=0; i < n; i++) { if ((intel_atom_pe[e[i].event].pme_flags & PFMLIB_INTEL_ATOM_PMC0) || (use_pebs && pfm_intel_atom_has_pebs(e+i) && done_pebs == 0)) { if (pfm_regmask_isset(r_pmcs, 0)) return PFMLIB_ERR_NOASSIGN; assign_pc[i] = 0; next_gen = 1; done_pebs = 1; } if (intel_atom_pe[e[i].event].pme_flags & PFMLIB_INTEL_ATOM_PMC1) { if (pfm_regmask_isset(r_pmcs, 1)) return PFMLIB_ERR_NOASSIGN; assign_pc[i] = 1; if (next_gen == 1) next_gen = 2; else next_gen = 0; } } /* * next constraint: fixed counters * * We abuse the mapping here for assign_pc to make it easier * to provide the correct values for pd[]. * We use: * - 16 : fixed counter 0 (pmc16, pmd16) * - 17 : fixed counter 1 (pmc16, pmd17) * - 18 : fixed counter 2 (pmc16, pmd18) */ fixed_ctr = pfm_regmask_isset(r_pmcs, 16) ?
0 : 0x7; if (fixed_ctr) { for(i=0; i < n; i++) { /* fixed counters do not support event options (filters) */ if (HAS_OPTIONS(i)) { if (use_pebs && pfm_intel_atom_has_pebs(e+i)) continue; if (cntrs[i].flags != PFM_INTEL_ATOM_SEL_ANYTHR) continue; } if ((fixed_ctr & 0x1) && pfm_intel_atom_is_fixed(e+i, 0)) { assign_pc[i] = 16; fixed_ctr &= ~1; } if ((fixed_ctr & 0x2) && pfm_intel_atom_is_fixed(e+i, 1)) { assign_pc[i] = 17; fixed_ctr &= ~2; } if ((fixed_ctr & 0x4) && pfm_intel_atom_is_fixed(e+i, 2)) { assign_pc[i] = 18; fixed_ctr &= ~4; } } } /* * assign what is left */ for(i=0; i < n; i++) { if (assign_pc[i] == -1) { for(; next_gen <= last_gen; next_gen++) { if (!pfm_regmask_isset(r_pmcs, next_gen)) break; } if (next_gen <= last_gen) assign_pc[i] = next_gen++; else { DPRINT("cannot assign generic counters\n"); return PFMLIB_ERR_NOASSIGN; } } } j = 0; /* setup fixed counters */ reg.val = 0; k = 0; for (i=0; i < n ; i++ ) { if (!is_fixed_pmc(assign_pc[i])) continue; val = 0; /* if plm is 0, then assume not specified per-event and use default */ plm = e[i].plm ? 
e[i].plm : inp->pfp_dfl_plm; if (plm & PFM_PLM0) val |= 1ULL; if (plm & PFM_PLM3) val |= 2ULL; if (cntrs && cntrs[i].flags & PFM_INTEL_ATOM_SEL_ANYTHR) val |= 4ULL; val |= 1ULL << 3; /* force APIC int (kernel may force it anyway) */ reg.val |= val << ((assign_pc[i]-16)<<2); } if (reg.val) { pc[npc].reg_num = 16; pc[npc].reg_value = reg.val; pc[npc].reg_addr = 0x38D; pc[npc].reg_alt_addr = 0x38D; __pfm_vbprintf("[FIXED_CTRL(pmc%u)=0x%"PRIx64" pmi0=1 en0=0x%"PRIx64" any0=%d pmi1=1 en1=0x%"PRIx64" any1=%d pmi2=1 en2=0x%"PRIx64" any2=%d] ", pc[npc].reg_num, reg.val, reg.val & 0x3ULL, !!(reg.val & 0x4ULL), (reg.val>>4) & 0x3ULL, !!((reg.val>>4) & 0x4ULL), (reg.val>>8) & 0x3ULL, !!((reg.val>>8) & 0x4ULL)); if ((fixed_ctr & 0x1) == 0) __pfm_vbprintf("INSTRUCTIONS_RETIRED "); if ((fixed_ctr & 0x2) == 0) __pfm_vbprintf("UNHALTED_CORE_CYCLES "); if ((fixed_ctr & 0x4) == 0) __pfm_vbprintf("UNHALTED_REFERENCE_CYCLES "); __pfm_vbprintf("\n"); npc++; if ((fixed_ctr & 0x1) == 0) __pfm_vbprintf("[FIXED_CTR0(pmd16)]\n"); if ((fixed_ctr & 0x2) == 0) __pfm_vbprintf("[FIXED_CTR1(pmd17)]\n"); if ((fixed_ctr & 0x4) == 0) __pfm_vbprintf("[FIXED_CTR2(pmd18)]\n"); } for (i=0; i < n ; i++ ) { /* skip fixed counters */ if (is_fixed_pmc(assign_pc[i])) continue; reg.val = 0; /* assume reserved bits are zeroed */ /* if plm is 0, then assume not specified per-event and use default */ plm = e[i].plm ? e[i].plm : inp->pfp_dfl_plm; val = intel_atom_pe[e[i].event].pme_code; reg.sel_event_select = val & 0xff; ucode = (val >> 8) & 0xff; for(k=0; k < e[i].num_masks; k++) ucode |= intel_atom_pe[e[i].event].pme_umasks[e[i].unit_masks[k]].pme_ucode; val |= ucode << 8; reg.sel_unit_mask = ucode; reg.sel_usr = plm & PFM_PLM3 ? 1 : 0; reg.sel_os = plm & PFM_PLM0 ?
1 : 0; reg.sel_en = 1; /* force enable bit to 1 */ reg.sel_int = 1; /* force APIC int to 1 */ reg.sel_cnt_mask = val >> 24; reg.sel_inv = val >> 23; reg.sel_edge = val >> 18; reg.sel_any = val >> 21; if (cntrs) { if (!reg.sel_cnt_mask) { /* * counter mask is 8-bit wide, do not silently * wrap-around */ if (cntrs[i].cnt_mask > 255) return PFMLIB_ERR_INVAL; reg.sel_cnt_mask = cntrs[i].cnt_mask; } if (!reg.sel_edge) reg.sel_edge = cntrs[i].flags & PFM_INTEL_ATOM_SEL_EDGE ? 1 : 0; if (!reg.sel_inv) reg.sel_inv = cntrs[i].flags & PFM_INTEL_ATOM_SEL_INV ? 1 : 0; if (!reg.sel_any) reg.sel_any = cntrs[i].flags & PFM_INTEL_ATOM_SEL_ANYTHR ? 1 : 0; } pc[npc].reg_num = assign_pc[i]; pc[npc].reg_value = reg.val; pc[npc].reg_addr = INTEL_ATOM_SEL_BASE+assign_pc[i]; pc[npc].reg_alt_addr = INTEL_ATOM_SEL_BASE+assign_pc[i]; __pfm_vbprintf("[PERFEVTSEL%u(pmc%u)=0x%"PRIx64" event_sel=0x%x umask=0x%x os=%d usr=%d en=%d int=%d inv=%d edge=%d cnt_mask=%d anythr=%d] %s\n", pc[npc].reg_num, pc[npc].reg_num, reg.val, reg.sel_event_select, reg.sel_unit_mask, reg.sel_os, reg.sel_usr, reg.sel_en, reg.sel_int, reg.sel_inv, reg.sel_edge, reg.sel_cnt_mask, reg.sel_any, intel_atom_pe[e[i].event].pme_name); __pfm_vbprintf("[PMC%u(pmd%u)]\n", pc[npc].reg_num, pc[npc].reg_num); npc++; } /* * setup pmds: must be in the same order as the events */ for (i=0; i < n ; i++) { if (is_fixed_pmc(assign_pc[i])) { /* setup pd array */ pd[i].reg_num = assign_pc[i]; pd[i].reg_addr = FIXED_CTR_BASE+assign_pc[i]-16; pd[i].reg_alt_addr = 0x40000000+assign_pc[i]-16; } else { pd[i].reg_num = assign_pc[i]; pd[i].reg_addr = INTEL_ATOM_CTR_BASE+assign_pc[i]; /* index to use with RDPMC */ pd[i].reg_alt_addr = assign_pc[i]; } } outp->pfp_pmd_count = i; /* * setup PEBS_ENABLE */ if (use_pebs && done_pebs) { /* * check that PEBS_ENABLE is available */ if (pfm_regmask_isset(r_pmcs, 17)) return PFMLIB_ERR_NOASSIGN; pc[npc].reg_num = 17; pc[npc].reg_value = 1ULL; pc[npc].reg_addr = 0x3f1; /* IA32_PEBS_ENABLE */
pc[npc].reg_alt_addr = 0x3f1; /* IA32_PEBS_ENABLE */ __pfm_vbprintf("[PEBS_ENABLE(pmc%u)=0x%"PRIx64" ena=%d]\n", pc[npc].reg_num, pc[npc].reg_value, pc[npc].reg_value & 0x1ull); npc++; } outp->pfp_pmc_count = npc; return PFMLIB_SUCCESS; } static int pfm_intel_atom_dispatch_events(pfmlib_input_param_t *inp, void *model_in, pfmlib_output_param_t *outp, void *model_out) { pfmlib_intel_atom_input_param_t *mod_in = (pfmlib_intel_atom_input_param_t *)model_in; if (inp->pfp_dfl_plm & (PFM_PLM1|PFM_PLM2)) { DPRINT("invalid plm=%x\n", inp->pfp_dfl_plm); return PFMLIB_ERR_INVAL; } return pfm_intel_atom_dispatch_counters(inp, mod_in, outp); } static int pfm_intel_atom_get_event_code(unsigned int i, unsigned int cnt, int *code) { if (cnt != PFMLIB_CNT_FIRST && (cnt > highest_counter || !pfm_regmask_isset(&intel_atom_impl_pmds, cnt))) return PFMLIB_ERR_INVAL; *code = intel_atom_pe[i].pme_code; return PFMLIB_SUCCESS; } static void pfm_intel_atom_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { unsigned int n, i; unsigned int has_f0, has_f1, has_f2; memset(counters, 0, sizeof(*counters)); n = intel_atom_pe[j].pme_numasks; has_f0 = has_f1 = has_f2 = 0; for (i=0; i < n; i++) { if (intel_atom_pe[j].pme_umasks[i].pme_flags & PFMLIB_INTEL_ATOM_FIXED0) has_f0 = 1; if (intel_atom_pe[j].pme_umasks[i].pme_flags & PFMLIB_INTEL_ATOM_FIXED1) has_f1 = 1; if (intel_atom_pe[j].pme_umasks[i].pme_flags & PFMLIB_INTEL_ATOM_FIXED2_ONLY) has_f2 = 1; } if (has_f0 == 0) has_f0 = intel_atom_pe[j].pme_flags & PFMLIB_INTEL_ATOM_FIXED0; if (has_f1 == 0) has_f1 = intel_atom_pe[j].pme_flags & PFMLIB_INTEL_ATOM_FIXED1; if (has_f2 == 0) has_f2 = intel_atom_pe[j].pme_flags & PFMLIB_INTEL_ATOM_FIXED2_ONLY; if (has_f0) pfm_regmask_set(counters, 16); if (has_f1) pfm_regmask_set(counters, 17); if (has_f2) pfm_regmask_set(counters, 18); /* the event on FIXED_CTR2 is exclusive CPU_CLK_UNHALTED:REF */ if (!has_f2) { pfm_regmask_set(counters, 0); pfm_regmask_set(counters, 1); if 
(intel_atom_pe[j].pme_flags & PFMLIB_INTEL_ATOM_PMC0) pfm_regmask_clr(counters, 1); if (intel_atom_pe[j].pme_flags & PFMLIB_INTEL_ATOM_PMC1) pfm_regmask_clr(counters, 0); } } static void pfm_intel_atom_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs) { *impl_pmcs = intel_atom_impl_pmcs; } static void pfm_intel_atom_get_impl_pmds(pfmlib_regmask_t *impl_pmds) { *impl_pmds = intel_atom_impl_pmds; } static void pfm_intel_atom_get_impl_counters(pfmlib_regmask_t *impl_counters) { pfm_regmask_set(impl_counters, 0); pfm_regmask_set(impl_counters, 1); pfm_regmask_set(impl_counters, 16); pfm_regmask_set(impl_counters, 17); pfm_regmask_set(impl_counters, 18); } /* * Even though CPUID 0xa returns in eax the actual counter * width, the architecture specifies that writes are limited * to the lower 32 bits. As such, only the lower 32 bits have full * degree of freedom. That is the "useable" counter width. */ #define PMU_INTEL_ATOM_COUNTER_WIDTH 32 static void pfm_intel_atom_get_hw_counter_width(unsigned int *width) { /* * Even though CPUID 0xa returns in eax the actual counter * width, the architecture specifies that writes are limited * to the lower 32 bits. As such, only the lower 32 bits have full * degree of freedom. That is the "useable" counter width.
*/ *width = PMU_INTEL_ATOM_COUNTER_WIDTH; } static char * pfm_intel_atom_get_event_name(unsigned int i) { return intel_atom_pe[i].pme_name; } static int pfm_intel_atom_get_event_description(unsigned int ev, char **str) { char *s; s = intel_atom_pe[ev].pme_desc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static char * pfm_intel_atom_get_event_mask_name(unsigned int ev, unsigned int midx) { return intel_atom_pe[ev].pme_umasks[midx].pme_uname; } static int pfm_intel_atom_get_event_mask_desc(unsigned int ev, unsigned int midx, char **str) { char *s; s = intel_atom_pe[ev].pme_umasks[midx].pme_udesc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static unsigned int pfm_intel_atom_get_num_event_masks(unsigned int ev) { return intel_atom_pe[ev].pme_numasks; } static int pfm_intel_atom_get_event_mask_code(unsigned int ev, unsigned int midx, unsigned int *code) { *code =intel_atom_pe[ev].pme_umasks[midx].pme_ucode; return PFMLIB_SUCCESS; } static int pfm_intel_atom_get_cycle_event(pfmlib_event_t *e) { e->event = PME_INTEL_ATOM_UNHALTED_CORE_CYCLES; return PFMLIB_SUCCESS; } static int pfm_intel_atom_get_inst_retired(pfmlib_event_t *e) { e->event = PME_INTEL_ATOM_INSTRUCTIONS_RETIRED; return PFMLIB_SUCCESS; } /* * this function is directly accessible by external caller * library initialization is not required, though recommended */ int pfm_intel_atom_has_pebs(pfmlib_event_t *e) { unsigned int i, n=0; if (e == NULL || e->event >= PME_INTEL_ATOM_EVENT_COUNT) return 0; if (intel_atom_pe[e->event].pme_flags & PFMLIB_INTEL_ATOM_PEBS) return 1; /* * ALL unit mask must support PEBS for this test to return true */ for(i=0; i < e->num_masks; i++) { /* check for valid unit mask */ if (e->unit_masks[i] >= intel_atom_pe[e->event].pme_numasks) return 0; if (intel_atom_pe[e->event].pme_umasks[e->unit_masks[i]].pme_flags & PFMLIB_INTEL_ATOM_PEBS) n++; } return n > 0 && n == e->num_masks; } pfm_pmu_support_t intel_atom_support={ 
.pmu_name = "Intel Atom", .pmu_type = PFMLIB_INTEL_ATOM_PMU, .pme_count = PME_INTEL_ATOM_EVENT_COUNT, .pmc_count = 4, .pmd_count = 22, .num_cnt = 5, .get_event_code = pfm_intel_atom_get_event_code, .get_event_name = pfm_intel_atom_get_event_name, .get_event_counters = pfm_intel_atom_get_event_counters, .dispatch_events = pfm_intel_atom_dispatch_events, .pmu_detect = pfm_intel_atom_detect, .pmu_init = pfm_intel_atom_init, .get_impl_pmcs = pfm_intel_atom_get_impl_pmcs, .get_impl_pmds = pfm_intel_atom_get_impl_pmds, .get_impl_counters = pfm_intel_atom_get_impl_counters, .get_hw_counter_width = pfm_intel_atom_get_hw_counter_width, .get_event_desc = pfm_intel_atom_get_event_description, .get_num_event_masks = pfm_intel_atom_get_num_event_masks, .get_event_mask_name = pfm_intel_atom_get_event_mask_name, .get_event_mask_code = pfm_intel_atom_get_event_mask_code, .get_event_mask_desc = pfm_intel_atom_get_event_mask_desc, .get_cycle_event = pfm_intel_atom_get_cycle_event, .get_inst_retired_event = pfm_intel_atom_get_inst_retired }; papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_intel_atom_priv.h000066400000000000000000000065301502707512200242070ustar00rootroot00000000000000/* * Copyright (c) 2008 Google, Inc * Contributed by Stephane Eranian * * Based on: * Copyright (c) 2006-2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux/ia64. */ #ifndef __PFMLIB_INTEL_ATOM_PRIV_H__ #define __PFMLIB_INTEL_ATOM_PRIV_H__ #define PFMLIB_INTEL_ATOM_MAX_UMASK 16 typedef struct { char *pme_uname; /* unit mask name */ char *pme_udesc; /* event/umask description */ unsigned int pme_ucode; /* unit mask code */ unsigned int pme_flags; /* unit mask flags */ } pme_intel_atom_umask_t; typedef struct { char *pme_name; /* event name */ char *pme_desc; /* event description */ unsigned int pme_code; /* event code */ unsigned int pme_numasks; /* number of umasks */ unsigned int pme_flags; /* flags */ unsigned int pme_fixed; /* fixed counter index, < FIXED_CTR0 if unsupported */ pme_intel_atom_umask_t pme_umasks[PFMLIB_INTEL_ATOM_MAX_UMASK]; /* umask desc */ } pme_intel_atom_entry_t; /* * pme_flags value */ /* * pme_flags value (event and unit mask) */ #define PFMLIB_INTEL_ATOM_UMASK_NCOMBO 0x01 /* unit mask cannot be combined (default exclusive) */ #define PFMLIB_INTEL_ATOM_FIXED0 0x02 /* event supported by FIXED_CTR0, can work on generic counters */ #define PFMLIB_INTEL_ATOM_FIXED1 0x04 /* event supported by FIXED_CTR1, can work on generic counters */ #define PFMLIB_INTEL_ATOM_FIXED2_ONLY 0x08 /* works only on FIXED_CTR2 */ #define PFMLIB_INTEL_ATOM_PEBS 0x10 /* support PEBS (precise event) */ #define PFMLIB_INTEL_ATOM_PMC0 0x20 /* works only on IA32_PMC0 */ #define PFMLIB_INTEL_ATOM_PMC1 0x40 /* works only on IA32_PMC1 */ typedef struct { unsigned 
int version:8; unsigned int num_cnt:8; unsigned int cnt_width:8; unsigned int ebx_length:8; } pmu_eax_t; typedef struct { unsigned int num_cnt:6; unsigned int cnt_width:6; unsigned int reserved:20; } pmu_edx_t; typedef struct { unsigned int no_core_cycle:1; unsigned int no_inst_retired:1; unsigned int no_ref_cycle:1; unsigned int no_llc_ref:1; unsigned int no_llc_miss:1; unsigned int no_br_retired:1; unsigned int no_br_mispred_retired:1; unsigned int reserved:25; } pmu_ebx_t; #endif /* __PFMLIB_INTEL_ATOM_PRIV_H__ */ papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_intel_nhm.c000066400000000000000000001273401502707512200227670ustar00rootroot00000000000000/* * pfmlib_intel_nhm.c : Intel Nehalem PMU * * Copyright (c) 2008 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* * Nehalem PMU = architectural perfmon v3 + OFFCORE + PEBS v2 + uncore PMU + LBR */ #include #include #include #include #include /* public headers */ #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_nhm_priv.h" /* Intel Westmere event tables */ #include "intel_wsm_events.h" #include "intel_wsm_unc_events.h" /* Intel Core i7 event tables */ #include "intel_corei7_events.h" #include "intel_corei7_unc_events.h" /* let's define some handy shortcuts! */ #define usel_event unc_perfevtsel.usel_event #define usel_umask unc_perfevtsel.usel_umask #define usel_occ unc_perfevtsel.usel_occ #define usel_edge unc_perfevtsel.usel_edge #define usel_int unc_perfevtsel.usel_int #define usel_en unc_perfevtsel.usel_en #define usel_inv unc_perfevtsel.usel_inv #define usel_cnt_mask unc_perfevtsel.usel_cnt_mask #define sel_event perfevtsel.sel_event #define sel_umask perfevtsel.sel_umask #define sel_usr perfevtsel.sel_usr #define sel_os perfevtsel.sel_os #define sel_edge perfevtsel.sel_edge #define sel_pc perfevtsel.sel_pc #define sel_int perfevtsel.sel_int #define sel_en perfevtsel.sel_en #define sel_inv perfevtsel.sel_inv #define sel_anythr perfevtsel.sel_anythr #define sel_cnt_mask perfevtsel.sel_cnt_mask /* * Description of the PMC registers mappings: * * 0 -> PMC0 -> PERFEVTSEL0 * 1 -> PMC1 -> PERFEVTSEL1 * 2 -> PMC2 -> PERFEVTSEL2 * 3 -> PMC3 -> PERFEVTSEL3 * 16 -> PMC16 -> FIXED_CTR_CTRL * 17 -> PMC17 -> PEBS_ENABLED * 18 -> PMC18 -> PEBS_LD_LATENCY_THRESHOLD * 19 -> PMC19 -> OFFCORE_RSP0 * 20 -> PMC20 -> UNCORE_FIXED_CTRL * 21 -> PMC21 -> UNCORE_EVNTSEL0 * 22 -> PMC22 -> UNCORE_EVNTSEL1 * 23 -> PMC23 -> UNCORE_EVNTSEL2 * 24 -> PMC24 -> UNCORE_EVNTSEL3 * 25 -> PMC25 -> UNCORE_EVNTSEL4 * 26 -> PMC26 -> UNCORE_EVNTSEL5 * 27 -> PMC27 -> UNCORE_EVNTSEL6 * 28 -> PMC28 -> UNCORE_EVNTSEL7 * 29 -> PMC31 -> UNCORE_ADDROP_MATCH * 30 -> PMC32 -> LBR_SELECT * * Description of the PMD registers mapping: * * 0 -> PMD0 -> PMC0 * 1 -> PMD1 -> PMC1 * 2 -> PMD2 -> PMC2 
* 3 -> PMD3 -> PMC3 * 16 -> PMD16 -> FIXED_CTR0 * 17 -> PMD17 -> FIXED_CTR1 * 18 -> PMD18 -> FIXED_CTR2 * 19 not used * 20 -> PMD20 -> UNCORE_FIXED_CTR0 * 21 -> PMD21 -> UNCORE_PMC0 * 22 -> PMD22 -> UNCORE_PMC1 * 23 -> PMD23 -> UNCORE_PMC2 * 24 -> PMD24 -> UNCORE_PMC3 * 25 -> PMD25 -> UNCORE_PMC4 * 26 -> PMD26 -> UNCORE_PMC5 * 27 -> PMD27 -> UNCORE_PMC6 * 28 -> PMD28 -> UNCORE_PMC7 * * 31 -> PMD31 -> LBR_TOS * 32-63 -> PMD32-PMD63 -> LBR_FROM_0/LBR_TO_0 - LBR_FROM15/LBR_TO_15 */ #define NHM_SEL_BASE 0x186 #define NHM_CTR_BASE 0xc1 #define NHM_FIXED_CTR_BASE 0x309 #define UNC_NHM_SEL_BASE 0x3c0 #define UNC_NHM_CTR_BASE 0x3b0 #define UNC_NHM_FIXED_CTR_BASE 0x394 #define MAX_COUNTERS 28 /* highest implemented counter */ #define PFMLIB_NHM_ALL_FLAGS \ (PFM_NHM_SEL_INV|PFM_NHM_SEL_EDGE|PFM_NHM_SEL_ANYTHR) #define NHM_NUM_GEN_COUNTERS 4 #define NHM_NUM_FIXED_COUNTERS 3 pfm_pmu_support_t intel_nhm_support; pfm_pmu_support_t intel_wsm_support; static pfmlib_regmask_t nhm_impl_pmcs, nhm_impl_pmds; static pfmlib_regmask_t nhm_impl_unc_pmcs, nhm_impl_unc_pmds; static pme_nhm_entry_t *pe, *unc_pe; static unsigned int num_pe, num_unc_pe; static int cpu_model, aaj80; static int pme_cycles, pme_instr; #ifdef __i386__ static inline void cpuid(unsigned int op, unsigned int *eax, unsigned int *ebx, unsigned int *ecx, unsigned int *edx) { /* * because ebx is used in Pic mode, we need to save/restore because * cpuid clobbers it. I could not figure out a way to get ebx out in * one cpuid instruction. 
To extract ebx, we need to move it to another * register (here eax) */ __asm__("pushl %%ebx;cpuid; popl %%ebx" :"=a" (*eax) : "a" (op) : "ecx", "edx"); __asm__("pushl %%ebx;cpuid; movl %%ebx, %%eax;popl %%ebx" :"=a" (*ebx) : "a" (op) : "ecx", "edx"); } #else static inline void cpuid(unsigned int op, unsigned int *eax, unsigned int *ebx, unsigned int *ecx, unsigned int *edx) { __asm__("cpuid" : "=a" (*eax), "=b" (*ebx), "=c" (*ecx), "=d" (*edx) : "0" (op), "c"(0)); } #endif static inline pme_nhm_entry_t * get_nhm_entry(unsigned int i) { return i < num_pe ? pe+i : unc_pe+(i-num_pe); } static int pfm_nhm_midx2uidx(unsigned int ev, unsigned int midx) { int i, num = 0; pme_nhm_entry_t *ne; int model; ne = get_nhm_entry(ev); for (i=0; i < ne->pme_numasks; i++) { model = ne->pme_umasks[i].pme_umodel; if (!model || model == cpu_model) { if (midx == num) return i; num++; } } DPRINT("cannot find umask %d for event %s\n", midx, ne->pme_name); return -1; } static int pfm_nhm_detect_common(void) { int ret; int family; char buffer[128]; ret = __pfm_getcpuinfo_attr("vendor_id", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; if (strcmp(buffer, "GenuineIntel")) return PFMLIB_ERR_NOTSUPP; ret = __pfm_getcpuinfo_attr("cpu family", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; family = atoi(buffer); ret = __pfm_getcpuinfo_attr("model", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; cpu_model = atoi(buffer); if (family != 6) return PFMLIB_ERR_NOTSUPP; return PFMLIB_SUCCESS; } static int pfm_nhm_detect(void) { #define INTEL_ARCH_MISP_BR_RETIRED (1 << 6) unsigned int eax, ebx, ecx, edx; int ret; ret = pfm_nhm_detect_common(); if (ret != PFMLIB_SUCCESS) return ret; switch(cpu_model) { case 26: /* Nehalem */ case 30: case 31: case 46: /* * check for erratum AAJ80 * * MISPREDICTED_BRANCH_RETIRED may be broken * in which case it appears in the list of * unavailable architected events */ cpuid(0xa, &eax, &ebx, &ecx, &edx); if (ebx & 
INTEL_ARCH_MISP_BR_RETIRED) aaj80 = 1; break; default: return PFMLIB_ERR_NOTSUPP; } return PFMLIB_SUCCESS; } static int pfm_wsm_detect(void) { switch(cpu_model) { case 37: /* Westmere */ case 44: break; default: return PFMLIB_ERR_NOTSUPP; } return PFMLIB_SUCCESS; } static inline void setup_nhm_impl_unc_regs(void) { pfm_regmask_set(&nhm_impl_unc_pmds, 20); pfm_regmask_set(&nhm_impl_unc_pmds, 21); pfm_regmask_set(&nhm_impl_unc_pmds, 22); pfm_regmask_set(&nhm_impl_unc_pmds, 23); pfm_regmask_set(&nhm_impl_unc_pmds, 24); pfm_regmask_set(&nhm_impl_unc_pmds, 25); pfm_regmask_set(&nhm_impl_unc_pmds, 26); pfm_regmask_set(&nhm_impl_unc_pmds, 27); pfm_regmask_set(&nhm_impl_unc_pmds, 28); /* uncore */ pfm_regmask_set(&nhm_impl_unc_pmcs, 20); pfm_regmask_set(&nhm_impl_unc_pmcs, 21); pfm_regmask_set(&nhm_impl_unc_pmcs, 22); pfm_regmask_set(&nhm_impl_unc_pmcs, 23); pfm_regmask_set(&nhm_impl_unc_pmcs, 24); pfm_regmask_set(&nhm_impl_unc_pmcs, 25); pfm_regmask_set(&nhm_impl_unc_pmcs, 26); pfm_regmask_set(&nhm_impl_unc_pmcs, 27); pfm_regmask_set(&nhm_impl_unc_pmcs, 28); /* unnhm_addrop_match */ pfm_regmask_set(&nhm_impl_unc_pmcs, 29); } static void fixup_mem_uncore_retired(void) { size_t i; for(i=0; i < PME_COREI7_EVENT_COUNT; i++) { if (corei7_pe[i].pme_code != 0xf) continue; /* * assume model46 umasks are at the end */ corei7_pe[i].pme_numasks = 6; break; } } static int pfm_nhm_init(void) { pfm_pmu_support_t *supp; int i; int num_unc_cnt = 0; if (forced_pmu != PFMLIB_NO_PMU) { if (forced_pmu == PFMLIB_INTEL_NHM_PMU) cpu_model = 26; else cpu_model = 37; } /* core */ pfm_regmask_set(&nhm_impl_pmcs, 0); pfm_regmask_set(&nhm_impl_pmcs, 1); pfm_regmask_set(&nhm_impl_pmcs, 2); pfm_regmask_set(&nhm_impl_pmcs, 3); pfm_regmask_set(&nhm_impl_pmcs, 16); pfm_regmask_set(&nhm_impl_pmcs, 17); pfm_regmask_set(&nhm_impl_pmcs, 18); pfm_regmask_set(&nhm_impl_pmcs, 19); pfm_regmask_set(&nhm_impl_pmds, 0); pfm_regmask_set(&nhm_impl_pmds, 1); pfm_regmask_set(&nhm_impl_pmds, 2); 
pfm_regmask_set(&nhm_impl_pmds, 3); pfm_regmask_set(&nhm_impl_pmds, 16); pfm_regmask_set(&nhm_impl_pmds, 17); pfm_regmask_set(&nhm_impl_pmds, 18); /* lbr */ pfm_regmask_set(&nhm_impl_pmcs, 30); for(i=31; i < 64; i++) pfm_regmask_set(&nhm_impl_pmds, i); switch(cpu_model) { case 46: num_pe = PME_COREI7_EVENT_COUNT; num_unc_pe = 0; pe = corei7_pe; unc_pe = NULL; pme_cycles = PME_COREI7_UNHALTED_CORE_CYCLES; pme_instr = PME_COREI7_INSTRUCTIONS_RETIRED; num_unc_cnt = 0; fixup_mem_uncore_retired(); supp = &intel_nhm_support; break; case 26: /* Nehalem */ case 30: /* Lynnfield */ num_pe = PME_COREI7_EVENT_COUNT; num_unc_pe = PME_COREI7_UNC_EVENT_COUNT; pe = corei7_pe; unc_pe = corei7_unc_pe; pme_cycles = PME_COREI7_UNHALTED_CORE_CYCLES; pme_instr = PME_COREI7_INSTRUCTIONS_RETIRED; setup_nhm_impl_unc_regs(); num_unc_cnt = 9; /* one fixed + 8 generic */ supp = &intel_nhm_support; break; case 37: /* Westmere */ case 44: num_pe = PME_WSM_EVENT_COUNT; num_unc_pe = PME_WSM_UNC_EVENT_COUNT; pe = wsm_pe; unc_pe = intel_wsm_unc_pe; pme_cycles = PME_WSM_UNHALTED_CORE_CYCLES; pme_instr = PME_WSM_INSTRUCTIONS_RETIRED; setup_nhm_impl_unc_regs(); num_unc_cnt = 9; /* one fixed + 8 generic */ /* OFFCORE_RESPONSE_1 */ pfm_regmask_set(&nhm_impl_pmcs, 31); supp = &intel_wsm_support; break; default: return PFMLIB_ERR_NOTSUPP; } supp->pme_count = num_pe + num_unc_pe; supp->num_cnt = NHM_NUM_GEN_COUNTERS + NHM_NUM_FIXED_COUNTERS + num_unc_cnt; /* * propagate uncore registers to impl bitmaps */ pfm_regmask_or(&nhm_impl_pmds, &nhm_impl_pmds, &nhm_impl_unc_pmds); pfm_regmask_or(&nhm_impl_pmcs, &nhm_impl_pmcs, &nhm_impl_unc_pmcs); /* * compute number of registers available * not all CPUs may have uncore */ pfm_regmask_weight(&nhm_impl_pmds, &supp->pmd_count); pfm_regmask_weight(&nhm_impl_pmcs, &supp->pmc_count); return PFMLIB_SUCCESS; } static int pfm_nhm_is_fixed(pfmlib_event_t *e, unsigned int f) { pme_nhm_entry_t *ne; unsigned int fl, flc, i; unsigned int mask = 0; ne = get_nhm_entry(e->event); 
fl = ne->pme_flags; /* * first pass: check if event as a whole supports fixed counters */ switch(f) { case 0: mask = PFMLIB_NHM_FIXED0; break; case 1: mask = PFMLIB_NHM_FIXED1; break; case 2: mask = PFMLIB_NHM_FIXED2_ONLY; break; default: return 0; } if (fl & mask) return 1; /* * second pass: check if unit mask supports fixed counter * * reject if mask not found OR if not all unit masks have * same fixed counter mask */ flc = 0; for(i=0; i < e->num_masks; i++) { int midx = pfm_nhm_midx2uidx(e->event, e->unit_masks[i]); fl = ne->pme_umasks[midx].pme_uflags; if (fl & mask) flc++; } return flc > 0 && flc == e->num_masks ? 1 : 0; } /* * Allow combination of events when cnt_mask > 0 AND unit mask codes do * not overlap (otherwise, we do not know what is actually measured) */ static int pfm_nhm_check_cmask(pfmlib_event_t *e, pme_nhm_entry_t *ne, pfmlib_nhm_counter_t *cntr) { unsigned int ref, ucode; int i, j; if (!cntr) return -1; if (cntr->cnt_mask == 0) return -1; for(i=0; i < e->num_masks; i++) { int midx = pfm_nhm_midx2uidx(e->event, e->unit_masks[i]); ref = ne->pme_umasks[midx].pme_ucode; for(j=i+1; j < e->num_masks; j++) { midx = pfm_nhm_midx2uidx(e->event, e->unit_masks[j]); ucode = ne->pme_umasks[midx].pme_ucode; if (ref & ucode) return -1; } } return 0; } /* * IMPORTANT: the interface guarantees that pfp_pmds[] elements are returned in the order the events * were submitted. 
*/ static int pfm_nhm_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_nhm_input_param_t *param, pfmlib_output_param_t *outp) { #define HAS_OPTIONS(x) (cntrs && (cntrs[x].flags || cntrs[x].cnt_mask)) #define is_fixed_pmc(a) (a == 16 || a == 17 || a == 18) #define is_uncore(a) (a > 19) pme_nhm_entry_t *ne; pfmlib_nhm_counter_t *cntrs; pfm_nhm_sel_reg_t reg; pfmlib_event_t *e; pfmlib_reg_t *pc, *pd; pfmlib_regmask_t *r_pmcs; uint64_t val, unc_global_ctrl; uint64_t pebs_mask, ld_mask; unsigned long long fixed_ctr; unsigned int plm; unsigned int npc, npmc0, npmc01, nf2, nuf; unsigned int i, n, k, j, umask, use_pebs = 0; unsigned int assign_pc[PMU_NHM_NUM_COUNTERS]; unsigned int next_gen, last_gen, u_flags; unsigned int next_unc_gen, last_unc_gen, lat; unsigned int offcore_rsp0_value = 0; unsigned int offcore_rsp1_value = 0; npc = npmc01 = npmc0 = nf2 = nuf = 0; unc_global_ctrl = 0; e = inp->pfp_events; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; n = inp->pfp_event_count; r_pmcs = &inp->pfp_unavail_pmcs; cntrs = param ? param->pfp_nhm_counters : NULL; pebs_mask = ld_mask = 0; use_pebs = param ? param->pfp_nhm_pebs.pebs_used : 0; lat = param ? param->pfp_nhm_pebs.ld_lat_thres : 0; if (n > PMU_NHM_NUM_COUNTERS) return PFMLIB_ERR_TOOMANY; /* * error checking */ for(i=0; i < n; i++) { /* * only supports two priv levels for perf counters */ if (e[i].plm & (PFM_PLM1|PFM_PLM2)) return PFMLIB_ERR_INVAL; ne = get_nhm_entry(e[i].event); /* check for erratum AAJ80 */ if (aaj80 && (ne->pme_code & 0xff) == 0xc5) { DPRINT("MISPREDICTED_BRANCH_RETIRED broken on this Nehalem processor, see erratum AAJ80\n"); return PFMLIB_ERR_NOTSUPP; } /* * check for valid flags */ if (e[i].flags & ~PFMLIB_NHM_ALL_FLAGS) return PFMLIB_ERR_INVAL; if (ne->pme_flags & PFMLIB_NHM_UMASK_NCOMBO && e[i].num_masks > 1 && pfm_nhm_check_cmask(e, ne, cntrs ?
cntrs+i : NULL)) { DPRINT("event does not support unit mask combination\n"); return PFMLIB_ERR_NOASSIGN; } /* * check event-level single register constraint for uncore fixed */ if (ne->pme_flags & PFMLIB_NHM_UNC_FIXED) { if (++nuf > 1) { DPRINT("two events compete for a UNCORE_FIXED_CTR0\n"); return PFMLIB_ERR_NOASSIGN; } if (HAS_OPTIONS(i)) { DPRINT("uncore fixed counter does not support options\n"); return PFMLIB_ERR_NOASSIGN; } } if (ne->pme_flags & PFMLIB_NHM_PMC0) { if (++npmc0 > 1) { DPRINT("two events compete for a PMC0\n"); return PFMLIB_ERR_NOASSIGN; } } /* * check event-level single register constraint (PMC0/1 only) * fail if more than two events are requested for the same counter pair */ if (ne->pme_flags & PFMLIB_NHM_PMC01) { if (++npmc01 > 2) { DPRINT("more than two events compete for PMC0/PMC1\n"); return PFMLIB_ERR_NOASSIGN; } } /* * UNHALTED_REFERENCE_CYCLES (CPU_CLK_UNHALTED:BUS) * can only be measured on FIXED_CTR2 */ if (ne->pme_flags & PFMLIB_NHM_FIXED2_ONLY) { if (++nf2 > 1) { DPRINT("two events compete for FIXED_CTR2\n"); return PFMLIB_ERR_NOASSIGN; } if (cntrs && ((cntrs[i].flags & (PFM_NHM_SEL_INV|PFM_NHM_SEL_EDGE)) || cntrs[i].cnt_mask)) { DPRINT("UNHALTED_REFERENCE_CYCLES only accepts anythr filter\n"); return PFMLIB_ERR_NOASSIGN; } } /* * OFFCORE_RSP0 is shared, unit masks for all offcore_response events * must be identical */ umask = 0; for(j=0; j < e[i].num_masks; j++) { int midx = pfm_nhm_midx2uidx(e[i].event, e[i].unit_masks[j]); umask |= ne->pme_umasks[midx].pme_ucode; } if (ne->pme_flags & PFMLIB_NHM_OFFCORE_RSP0) { if (offcore_rsp0_value && offcore_rsp0_value != umask) { DPRINT("all OFFCORE_RSP0 events must have the same unit mask\n"); return PFMLIB_ERR_NOASSIGN; } if (pfm_regmask_isset(r_pmcs, 19)) { DPRINT("OFFCORE_RSP0 register not available\n"); return PFMLIB_ERR_NOASSIGN; } if (!((umask & 0xff) && (umask & 0xff00))) { DPRINT("OFFCORE_RSP0 must have at least one request and response unit mask set\n"); return PFMLIB_ERR_INVAL; } /* lock-in
offcore_value */ offcore_rsp0_value = umask; } if (ne->pme_flags & PFMLIB_NHM_OFFCORE_RSP1) { if (offcore_rsp1_value && offcore_rsp1_value != umask) { DPRINT("all OFFCORE_RSP1 events must have the same unit mask\n"); return PFMLIB_ERR_NOASSIGN; } if (pfm_regmask_isset(r_pmcs, 31)) { DPRINT("OFFCORE_RSP1 register not available\n"); return PFMLIB_ERR_NOASSIGN; } if (!((umask & 0xff) && (umask & 0xff00))) { DPRINT("OFFCORE_RSP1 must have at least one request and response unit mask set\n"); return PFMLIB_ERR_INVAL; } /* lock-in offcore_value */ offcore_rsp1_value = umask; } /* * enforce PLM0|PLM3 for uncore events given they have no * priv level filter. This is to ensure users understand what * they are doing */ if (ne->pme_flags & (PFMLIB_NHM_UNC|PFMLIB_NHM_UNC_FIXED)) { if (inp->pfp_dfl_plm != (PFM_PLM0|PFM_PLM3) && e[i].plm != (PFM_PLM0|PFM_PLM3)) { DPRINT("uncore events must have PLM0|PLM3\n"); return PFMLIB_ERR_NOASSIGN; } } } /* * initialize to empty */ for(i=0; i < PMU_NHM_NUM_COUNTERS; i++) assign_pc[i] = -1; next_gen = 0; /* first generic counter */ last_gen = 3; /* last generic counter */ /* * strongest constraint: uncore_fixed_ctr0 or PMC0 only */ if (nuf || npmc0) { for(i=0; i < n; i++) { ne = get_nhm_entry(e[i].event); if (ne->pme_flags & PFMLIB_NHM_PMC0) { if (pfm_regmask_isset(r_pmcs, 0)) return PFMLIB_ERR_NOASSIGN; assign_pc[i] = 0; next_gen = 1; } if (ne->pme_flags & PFMLIB_NHM_UNC_FIXED) { if (pfm_regmask_isset(r_pmcs, 20)) return PFMLIB_ERR_NOASSIGN; assign_pc[i] = 20; } } }
				if (next_gen == 2)
					return PFMLIB_ERR_NOASSIGN;
				assign_pc[i] = next_gen++;
			}
		}
	}
	/*
	 * next constraint: fixed counters
	 *
	 * We abuse the mapping here for assign_pc to make it easier
	 * to provide the correct values for pd[].
	 * We use:
	 * 	- 16 : fixed counter 0 (pmc16, pmd16)
	 * 	- 17 : fixed counter 1 (pmc16, pmd17)
	 * 	- 18 : fixed counter 2 (pmc16, pmd18)
	 */
	fixed_ctr = pfm_regmask_isset(r_pmcs, 16) ? 0 : 0x7;
	if (fixed_ctr) {
		for(i=0; i < n; i++) {
			/*
			 * Nehalem fixed counters (as for architected perfmon v3)
			 * do support the any-thread filter
			 */
			if (HAS_OPTIONS(i)) {
				if (use_pebs && pfm_nhm_is_pebs(e+i))
					continue;
				if (cntrs[i].flags != PFM_NHM_SEL_ANYTHR)
					continue;
			}
			if ((fixed_ctr & 0x1) && pfm_nhm_is_fixed(e+i, 0)) {
				assign_pc[i] = 16;
				fixed_ctr &= ~1;
			}
			if ((fixed_ctr & 0x2) && pfm_nhm_is_fixed(e+i, 1)) {
				assign_pc[i] = 17;
				fixed_ctr &= ~2;
			}
			if ((fixed_ctr & 0x4) && pfm_nhm_is_fixed(e+i, 2)) {
				assign_pc[i] = 18;
				fixed_ctr &= ~4;
			}
		}
	}
	/*
	 * uncore events on any of the 8 counters
	 */
	next_unc_gen = 21; /* first generic uncore counter config */
	last_unc_gen = 28; /* last generic uncore counter config */

	for(i=0; i < n; i++) {
		ne = get_nhm_entry(e[i].event);
		if (ne->pme_flags & PFMLIB_NHM_UNC) {
			for(; next_unc_gen <= last_unc_gen; next_unc_gen++) {
				if (!pfm_regmask_isset(r_pmcs, next_unc_gen))
					break;
			}
			if (next_unc_gen <= last_unc_gen)
				assign_pc[i] = next_unc_gen++;
			else {
				DPRINT("cannot assign generic uncore event\n");
				return PFMLIB_ERR_NOASSIGN;
			}
		}
	}
	/*
	 * assign what is left of the generic events
	 */
	for(i=0; i < n; i++) {
		if (assign_pc[i] == -1) {
			for(; next_gen <= last_gen; next_gen++) {
				DPRINT("i=%d next_gen=%d last=%d isset=%d\n",
					i, next_gen, last_gen,
					pfm_regmask_isset(r_pmcs, next_gen));
				if (!pfm_regmask_isset(r_pmcs, next_gen))
					break;
			}
			if (next_gen <= last_gen) {
				assign_pc[i] = next_gen++;
			} else {
				DPRINT("cannot assign generic event\n");
				return PFMLIB_ERR_NOASSIGN;
			}
		}
	}
	/*
	 * setup core fixed counters
	 */
	reg.val = 0;
	for (i=0; i < n ; i++ ) {
		if
(!is_fixed_pmc(assign_pc[i])) continue; val = 0; /* if plm is 0, then assume not specified per-event and use default */ plm = e[i].plm ? e[i].plm : inp->pfp_dfl_plm; if (plm & PFM_PLM0) val |= 1ULL; if (plm & PFM_PLM3) val |= 2ULL; if (cntrs && cntrs[i].flags & PFM_NHM_SEL_ANYTHR) val |= 4ULL; val |= 1ULL << 3; /* force APIC int (kernel may force it anyway) */ reg.val |= val << ((assign_pc[i]-16)<<2); } if (reg.val) { pc[npc].reg_num = 16; pc[npc].reg_value = reg.val; pc[npc].reg_addr = 0x38D; pc[npc].reg_alt_addr = 0x38D; __pfm_vbprintf("[FIXED_CTRL(pmc%u)=0x%"PRIx64" pmi0=1 en0=0x%"PRIx64" any0=%d pmi1=1 en1=0x%"PRIx64" any1=%d pmi2=1 en2=0x%"PRIx64" any2=%d] ", pc[npc].reg_num, reg.val, reg.val & 0x3ULL, !!(reg.val & 0x4ULL), (reg.val>>4) & 0x3ULL, !!((reg.val>>4) & 0x4ULL), (reg.val>>8) & 0x3ULL, !!((reg.val>>8) & 0x4ULL)); if ((fixed_ctr & 0x1) == 0) __pfm_vbprintf("INSTRUCTIONS_RETIRED "); if ((fixed_ctr & 0x2) == 0) __pfm_vbprintf("UNHALTED_CORE_CYCLES "); if ((fixed_ctr & 0x4) == 0) __pfm_vbprintf("UNHALTED_REFERENCE_CYCLES "); __pfm_vbprintf("\n"); npc++; if ((fixed_ctr & 0x1) == 0) __pfm_vbprintf("[FIXED_CTR0(pmd16)]\n"); if ((fixed_ctr & 0x2) == 0) __pfm_vbprintf("[FIXED_CTR1(pmd17)]\n"); if ((fixed_ctr & 0x4) == 0) __pfm_vbprintf("[FIXED_CTR2(pmd18)]\n"); } /* * setup core counter config */ for (i=0; i < n ; i++ ) { /* skip fixed counters */ if (is_fixed_pmc(assign_pc[i]) || is_uncore(assign_pc[i])) continue; reg.val = 0; /* assume reserved bits are zeroed */ /* if plm is 0, then assume not specified per-event and use default */ plm = e[i].plm ? 
e[i].plm : inp->pfp_dfl_plm;

		ne = get_nhm_entry(e[i].event);
		val = ne->pme_code;

		reg.sel_event = val & 0xff;
		umask = (val >> 8) & 0xff;
		u_flags = 0;

		/*
		 * for OFFCORE_RSP, the unit masks are all in the
		 * dedicated OFFCORE_RSP MSRs and the event unit mask must be
		 * 0x1 (extracted from pme_code)
		 */
		if (!(ne->pme_flags & (PFMLIB_NHM_OFFCORE_RSP0|PFMLIB_NHM_OFFCORE_RSP1)))
			for(k=0; k < e[i].num_masks; k++) {
				int midx = pfm_nhm_midx2uidx(e[i].event, e[i].unit_masks[k]);
				umask |= ne->pme_umasks[midx].pme_ucode;
				u_flags |= ne->pme_umasks[midx].pme_uflags;
			}

		val |= umask << 8;

		reg.sel_umask = umask;
		reg.sel_usr = plm & PFM_PLM3 ? 1 : 0;
		reg.sel_os  = plm & PFM_PLM0 ? 1 : 0;
		reg.sel_en  = 1; /* force enable bit to 1 */
		reg.sel_int = 1; /* force APIC int to 1 */

		reg.sel_cnt_mask = val >> 24;
		reg.sel_inv      = val >> 23;
		reg.sel_anythr   = val >> 21;
		reg.sel_edge     = val >> 18;

		if (cntrs) {
			/*
			 * occupancy reset flag is for uncore counters only
			 */
			if (cntrs[i].flags & PFM_NHM_SEL_OCC_RST)
				return PFMLIB_ERR_INVAL;

			if (!reg.sel_cnt_mask) {
				/*
				 * counter mask is 8-bit wide, do not silently
				 * wrap-around
				 */
				if (cntrs[i].cnt_mask > 255)
					return PFMLIB_ERR_INVAL;
				reg.sel_cnt_mask = cntrs[i].cnt_mask;
			}
			if (!reg.sel_edge)
				reg.sel_edge = cntrs[i].flags & PFM_NHM_SEL_EDGE ? 1 : 0;
			if (!reg.sel_inv)
				reg.sel_inv = cntrs[i].flags & PFM_NHM_SEL_INV ? 1 : 0;
			if (!reg.sel_anythr)
				reg.sel_anythr = cntrs[i].flags & PFM_NHM_SEL_ANYTHR ? 1 : 0;
		}

		if (u_flags || (ne->pme_flags & PFMLIB_NHM_PEBS))
			pebs_mask |= 1ULL << assign_pc[i];

		/*
		 * check for MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD_0 to enable
		 * load latency filtering when PEBS is used. There is only one
		 * threshold possible, yet multiple counters may be programmed with
		 * this event/umask. That means they all share the same threshold.
		 */
		if (reg.sel_event == 0xb && (umask & 0x10))
			ld_mask |= 1ULL << assign_pc[i];

		pc[npc].reg_num     = assign_pc[i];
		pc[npc].reg_value   = reg.val;
		pc[npc].reg_addr    = NHM_SEL_BASE+assign_pc[i];
		pc[npc].reg_alt_addr= NHM_SEL_BASE+assign_pc[i];

		__pfm_vbprintf("[PERFEVTSEL%u(pmc%u)=0x%"PRIx64" event_sel=0x%x umask=0x%x os=%d usr=%d anythr=%d en=%d int=%d inv=%d edge=%d cnt_mask=%d] %s\n",
			pc[npc].reg_num, pc[npc].reg_num,
			reg.val,
			reg.sel_event, reg.sel_umask,
			reg.sel_os, reg.sel_usr,
			reg.sel_anythr,
			reg.sel_en, reg.sel_int,
			reg.sel_inv, reg.sel_edge,
			reg.sel_cnt_mask,
			ne->pme_name);
		__pfm_vbprintf("[PMC%u(pmd%u)]\n", pc[npc].reg_num, pc[npc].reg_num);
		npc++;
	}
	/*
	 * setup uncore fixed counter config
	 */
	if (nuf) {
		pc[npc].reg_num      = 20;
		pc[npc].reg_value    = 0x5ULL; /* ena=1, PMI=determined by kernel */
		pc[npc].reg_addr     = 0x395;
		pc[npc].reg_alt_addr = 0x395;

		__pfm_vbprintf("[UNC_FIXED_CTRL(pmc20)=0x%"PRIx64" pmi=1 ena=1] UNC_CLK_UNHALTED\n", pc[npc].reg_value);
		__pfm_vbprintf("[UNC_FIXED_CTR0(pmd20)]\n");

		unc_global_ctrl |= 1ULL << 32;
		npc++;
	}
	/*
	 * setup uncore counter config
	 */
	for (i=0; i < n ; i++ ) {
		/* skip core counters, uncore fixed */
		if (!is_uncore(assign_pc[i]) || assign_pc[i] == 20)
			continue;

		reg.val = 0; /* assume reserved bits are zeroed */

		ne = get_nhm_entry(e[i].event);
		val = ne->pme_code;

		reg.usel_event = val & 0xff;
		umask = (val >> 8) & 0xff;

		for(k=0; k < e[i].num_masks; k++) {
			int midx = pfm_nhm_midx2uidx(e[i].event, e[i].unit_masks[k]);
			umask |= ne->pme_umasks[midx].pme_ucode;
		}

		val |= umask << 8;

		reg.usel_umask = umask;
		reg.usel_en  = 1; /* force enable bit to 1 */
		reg.usel_int = 1; /* force APIC int to 1 */

		/*
		 * allow hardcoded filters in event table
		 */
		reg.usel_cnt_mask = val >> 24;
		reg.usel_inv      = val >> 23;
		reg.usel_edge     = val >> 18;
		reg.usel_occ      = val >> 17;

		if (cntrs) {
			/*
			 * anythread is for core counters only
			 */
			if (cntrs[i].flags & PFM_NHM_SEL_ANYTHR)
				return PFMLIB_ERR_INVAL;

			if (!reg.usel_cnt_mask) {
				/*
				 * counter mask is 8-bit wide, do not silently
				 *
wrap-around */ if (cntrs[i].cnt_mask > 255) return PFMLIB_ERR_INVAL; reg.usel_cnt_mask = cntrs[i].cnt_mask; } if (!reg.usel_edge) reg.usel_edge = cntrs[i].flags & PFM_NHM_SEL_EDGE ? 1 : 0; if (!reg.usel_inv) reg.usel_inv = cntrs[i].flags & PFM_NHM_SEL_INV ? 1 : 0; if (!reg.usel_occ) reg.usel_occ = cntrs[i].flags & PFM_NHM_SEL_OCC_RST ? 1 : 0; } unc_global_ctrl |= 1ULL<< (assign_pc[i] - 21); pc[npc].reg_num = assign_pc[i]; pc[npc].reg_value = reg.val; pc[npc].reg_addr = UNC_NHM_SEL_BASE+assign_pc[i] - 21; pc[npc].reg_alt_addr= UNC_NHM_SEL_BASE+assign_pc[i] - 21; __pfm_vbprintf("[UNC_PERFEVTSEL%u(pmc%u)=0x%"PRIx64" event=0x%x umask=0x%x en=%d int=%d inv=%d edge=%d occ=%d cnt_msk=%d] %s\n", pc[npc].reg_num - 21, pc[npc].reg_num, reg.val, reg.usel_event, reg.usel_umask, reg.usel_en, reg.usel_int, reg.usel_inv, reg.usel_edge, reg.usel_occ, reg.usel_cnt_mask, ne->pme_name); __pfm_vbprintf("[UNC_PMC%u(pmd%u)]\n", pc[npc].reg_num - 21, pc[npc].reg_num); npc++; } /* * setup pmds: must be in the same order as the events */ for (i=0; i < n ; i++) { switch (assign_pc[i]) { case 0 ... 4: pd[i].reg_num = assign_pc[i]; pd[i].reg_addr = NHM_CTR_BASE+assign_pc[i]; /* index to use with RDPMC */ pd[i].reg_alt_addr = assign_pc[i]; break; case 16 ... 18: /* setup pd array */ pd[i].reg_num = assign_pc[i]; pd[i].reg_addr = NHM_FIXED_CTR_BASE+assign_pc[i]-16; pd[i].reg_alt_addr = 0x40000000+assign_pc[i]-16; break; case 20: pd[i].reg_num = 20; pd[i].reg_addr = UNC_NHM_FIXED_CTR_BASE; pd[i].reg_alt_addr = UNC_NHM_FIXED_CTR_BASE; break; case 21 ... 
28: pd[i].reg_num = assign_pc[i]; pd[i].reg_addr = UNC_NHM_CTR_BASE + assign_pc[i] - 21; pd[i].reg_alt_addr = UNC_NHM_CTR_BASE + assign_pc[i] - 21; break; } } outp->pfp_pmd_count = i; /* * setup PEBS_ENABLE */ if (use_pebs && pebs_mask) { if (!lat) ld_mask = 0; /* * check that PEBS_ENABLE is available */ if (pfm_regmask_isset(r_pmcs, 17)) return PFMLIB_ERR_NOASSIGN; pc[npc].reg_num = 17; pc[npc].reg_value = pebs_mask | (ld_mask <<32); pc[npc].reg_addr = 0x3f1; /* IA32_PEBS_ENABLE */ pc[npc].reg_alt_addr = 0x3f1; /* IA32_PEBS_ENABLE */ __pfm_vbprintf("[PEBS_ENABLE(pmc%u)=0x%"PRIx64" ena0=%d ena1=%d ena2=%d ena3=%d ll0=%d ll1=%d ll2=%d ll3=%d]\n", pc[npc].reg_num, pc[npc].reg_value, pc[npc].reg_value & 0x1, (pc[npc].reg_value >> 1) & 0x1, (pc[npc].reg_value >> 2) & 0x1, (pc[npc].reg_value >> 3) & 0x1, (pc[npc].reg_value >> 32) & 0x1, (pc[npc].reg_value >> 33) & 0x1, (pc[npc].reg_value >> 34) & 0x1, (pc[npc].reg_value >> 35) & 0x1); npc++; if (ld_mask) { if (lat < 3 || lat > 0xffff) { DPRINT("invalid load latency threshold %u (must be in [3:0xffff])\n", lat); return PFMLIB_ERR_INVAL; } if (pfm_regmask_isset(r_pmcs, 18)) return PFMLIB_ERR_NOASSIGN; pc[npc].reg_num = 18; pc[npc].reg_value = lat; pc[npc].reg_addr = 0x3f1; /* IA32_PEBS_ENABLE */ pc[npc].reg_alt_addr = 0x3f1; /* IA32_PEBS_ENABLE */ __pfm_vbprintf("[LOAD_LATENCY_THRESHOLD(pmc%u)=0x%"PRIx64"]\n", pc[npc].reg_num, pc[npc].reg_value); npc++; } } /* * setup OFFCORE_RSP0 */ if (offcore_rsp0_value) { pc[npc].reg_num = 19; pc[npc].reg_value = offcore_rsp0_value; pc[npc].reg_addr = 0x1a6; pc[npc].reg_alt_addr = 0x1a6; __pfm_vbprintf("[OFFCORE_RSP0(pmc%u)=0x%"PRIx64"]\n", pc[npc].reg_num, pc[npc].reg_value); npc++; } /* * setup OFFCORE_RSP1 */ if (offcore_rsp1_value) { pc[npc].reg_num = 31; pc[npc].reg_value = offcore_rsp1_value; pc[npc].reg_addr = 0x1a7; pc[npc].reg_alt_addr = 0x1a7; __pfm_vbprintf("[OFFCORE_RSP1(pmc%u)=0x%"PRIx64"]\n", pc[npc].reg_num, pc[npc].reg_value); npc++; } outp->pfp_pmc_count = npc; return 
PFMLIB_SUCCESS; } static int pfm_nhm_dispatch_lbr(pfmlib_input_param_t *inp, pfmlib_nhm_input_param_t *param, pfmlib_output_param_t *outp) { static int lbr_plm_map[4]={ 0x3, /* PLM0=0 PLM3=0 neq0=1 eq0=1 */ 0x1, /* PLM0=0 PLM3=1 neq0=0 eq0=1 */ 0x2, /* PLM0=1 PLM3=0 neq0=1 eq0=0 */ 0x0 /* PLM0=1 PLM3=1 neq0=0 eq0=0 */ }; pfm_nhm_sel_reg_t reg; unsigned int filter, i, c; unsigned int plm; /* * check LBR_SELECT is available */ if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 30)) return PFMLIB_ERR_NOASSIGN; reg.val = 0; /* capture everything */ plm = param->pfp_nhm_lbr.lbr_plm; if (!plm) plm = inp->pfp_dfl_plm; /* * LBR does not distinguish PLM1, PLM2 from PLM3 */ i = plm & PFM_PLM0 ? 0x2 : 0; i |= plm & PFM_PLM3 ? 0x1 : 0; if (lbr_plm_map[i] & 0x1) reg.lbr_select.cpl_eq0 = 1; if (lbr_plm_map[i] & 0x2) reg.lbr_select.cpl_neq0 = 1; filter = param->pfp_nhm_lbr.lbr_filter; if (filter & PFM_NHM_LBR_JCC) reg.lbr_select.jcc = 1; if (filter & PFM_NHM_LBR_NEAR_REL_CALL) reg.lbr_select.near_rel_call = 1; if (filter & PFM_NHM_LBR_NEAR_IND_CALL) reg.lbr_select.near_ind_call = 1; if (filter & PFM_NHM_LBR_NEAR_RET) reg.lbr_select.near_ret = 1; if (filter & PFM_NHM_LBR_NEAR_IND_JMP) reg.lbr_select.near_ind_jmp = 1; if (filter & PFM_NHM_LBR_NEAR_REL_JMP) reg.lbr_select.near_rel_jmp = 1; if (filter & PFM_NHM_LBR_FAR_BRANCH) reg.lbr_select.far_branch = 1; __pfm_vbprintf("[LBR_SELECT(PMC30)=0x%"PRIx64" eq0=%d neq0=%d jcc=%d rel=%d ind=%d ret=%d ind_jmp=%d rel_jmp=%d far=%d ]\n", reg.val, reg.lbr_select.cpl_eq0, reg.lbr_select.cpl_neq0, reg.lbr_select.jcc, reg.lbr_select.near_rel_call, reg.lbr_select.near_ind_call, reg.lbr_select.near_ret, reg.lbr_select.near_ind_jmp, reg.lbr_select.near_rel_jmp, reg.lbr_select.far_branch); __pfm_vbprintf("[LBR_TOS(PMD31)]\n"); __pfm_vbprintf("[LBR_FROM-LBR_TO(PMD32..PMD63)]\n"); c = outp->pfp_pmc_count; outp->pfp_pmcs[c].reg_num = 30; outp->pfp_pmcs[c].reg_value = reg.val; outp->pfp_pmcs[c].reg_addr = 0x1c8; outp->pfp_pmcs[c].reg_alt_addr = 0x1c8; c++; 
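	/*
	 * Illustration (comment added by the editor, not part of the original
	 * libpfm source): the PMD numbering loop below interleaves the 16
	 * LBR_FROM/LBR_TO MSR pairs from the two base addresses used here
	 * (0x680 for FROM, 0x6c0 for TO), via (i>>1) + ((i & 0x1) ? 0x6c0 : 0x680):
	 *   i=0 -> MSR 0x680 (LBR_FROM_0),  i=1 -> MSR 0x6c0 (LBR_TO_0),
	 *   i=2 -> MSR 0x681 (LBR_FROM_1),  i=3 -> MSR 0x6c1 (LBR_TO_1), ...
	 */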
outp->pfp_pmc_count = c; c = outp->pfp_pmd_count; outp->pfp_pmds[c].reg_num = 31; outp->pfp_pmds[c].reg_value = 0; outp->pfp_pmds[c].reg_addr = 0x1c9; outp->pfp_pmds[c].reg_alt_addr = 0x1c9; c++; for(i=0; i < 32; i++, c++) { outp->pfp_pmds[c].reg_num = 32 + i; outp->pfp_pmds[c].reg_value = 0; outp->pfp_pmds[c].reg_addr = (i>>1) + ((i & 0x1) ? 0x6c0 : 0x680); outp->pfp_pmds[c].reg_alt_addr = (i>>1) + ((i & 0x1) ? 0x6c0 : 0x680); } outp->pfp_pmd_count = c; return PFMLIB_SUCCESS; } static int pfm_nhm_dispatch_events(pfmlib_input_param_t *inp, void *model_in, pfmlib_output_param_t *outp, void *model_out) { pfmlib_nhm_input_param_t *mod_in = (pfmlib_nhm_input_param_t *)model_in; int ret; if (inp->pfp_dfl_plm & (PFM_PLM1|PFM_PLM2)) { DPRINT("invalid plm=%x\n", inp->pfp_dfl_plm); return PFMLIB_ERR_INVAL; } ret = pfm_nhm_dispatch_counters(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) return ret; if (mod_in && mod_in->pfp_nhm_lbr.lbr_used) ret = pfm_nhm_dispatch_lbr(inp, mod_in, outp); return ret; } static int pfm_nhm_get_event_code(unsigned int i, unsigned int cnt, int *code) { pfmlib_regmask_t cnts; pfm_get_impl_counters(&cnts); if (cnt != PFMLIB_CNT_FIRST && (cnt > MAX_COUNTERS || !pfm_regmask_isset(&cnts, cnt))) return PFMLIB_ERR_INVAL; *code = get_nhm_entry(i)->pme_code; return PFMLIB_SUCCESS; } static void pfm_nhm_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { pme_nhm_entry_t *ne; unsigned int i; memset(counters, 0, sizeof(*counters)); ne = get_nhm_entry(j); if (ne->pme_flags & PFMLIB_NHM_UNC_FIXED) { pfm_regmask_set(counters, 20); return; } if (ne->pme_flags & PFMLIB_NHM_UNC) { pfm_regmask_set(counters, 20); pfm_regmask_set(counters, 21); pfm_regmask_set(counters, 22); pfm_regmask_set(counters, 23); pfm_regmask_set(counters, 24); pfm_regmask_set(counters, 25); pfm_regmask_set(counters, 26); pfm_regmask_set(counters, 27); return; } /* * fixed counter events have no unit mask */ if (ne->pme_flags & PFMLIB_NHM_FIXED0) pfm_regmask_set(counters, 16); if 
(ne->pme_flags & PFMLIB_NHM_FIXED1) pfm_regmask_set(counters, 17); if (ne->pme_flags & PFMLIB_NHM_FIXED2_ONLY) pfm_regmask_set(counters, 18); /* * extract from unit mask level */ for (i=0; i < ne->pme_numasks; i++) { if (ne->pme_umasks[i].pme_uflags & PFMLIB_NHM_FIXED0) pfm_regmask_set(counters, 16); if (ne->pme_umasks[i].pme_uflags & PFMLIB_NHM_FIXED1) pfm_regmask_set(counters, 17); if (ne->pme_umasks[i].pme_uflags & PFMLIB_NHM_FIXED2_ONLY) pfm_regmask_set(counters, 18); } /* * event on FIXED_CTR2 is exclusive CPU_CLK_UNHALTED:REF * PMC0|PMC1 only on 0,1, constraint at event-level */ if (!pfm_regmask_isset(counters, 18)) { pfm_regmask_set(counters, 0); if (!(ne->pme_flags & PFMLIB_NHM_PMC0)) pfm_regmask_set(counters, 1); if (!(ne->pme_flags & (PFMLIB_NHM_PMC01|PFMLIB_NHM_PMC0))) { pfm_regmask_set(counters, 2); pfm_regmask_set(counters, 3); } } } static void pfm_nhm_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs) { *impl_pmcs = nhm_impl_pmcs; } static void pfm_nhm_get_impl_pmds(pfmlib_regmask_t *impl_pmds) { *impl_pmds = nhm_impl_pmds; } static void pfm_nhm_get_impl_counters(pfmlib_regmask_t *impl_counters) { /* core generic */ pfm_regmask_set(impl_counters, 0); pfm_regmask_set(impl_counters, 1); pfm_regmask_set(impl_counters, 2); pfm_regmask_set(impl_counters, 3); /* core fixed */ pfm_regmask_set(impl_counters, 16); pfm_regmask_set(impl_counters, 17); pfm_regmask_set(impl_counters, 18); /* uncore pmd registers all counters */ pfm_regmask_or(impl_counters, impl_counters, &nhm_impl_unc_pmds); } /* * Even though, CPUID 0xa returns in eax the actual counter * width, the architecture specifies that writes are limited * to lower 32-bits. As such, only the lower 32-bit have full * degree of freedom. That is the "useable" counter width. */ #define PMU_NHM_COUNTER_WIDTH 32 static void pfm_nhm_get_hw_counter_width(unsigned int *width) { /* * Even though, CPUID 0xa returns in eax the actual counter * width, the architecture specifies that writes are limited * to lower 32-bits. 
As such, only the lower 31 bits have full * degree of freedom. That is the "useable" counter width. */ *width = PMU_NHM_COUNTER_WIDTH; } static char * pfm_nhm_get_event_name(unsigned int i) { return get_nhm_entry(i)->pme_name; } static int pfm_nhm_get_event_description(unsigned int ev, char **str) { char *s; s = get_nhm_entry(ev)->pme_desc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static char * pfm_nhm_get_event_mask_name(unsigned int ev, unsigned int midx) { midx = pfm_nhm_midx2uidx(ev, midx); return get_nhm_entry(ev)->pme_umasks[midx].pme_uname; } static int pfm_nhm_get_event_mask_desc(unsigned int ev, unsigned int midx, char **str) { char *s; midx = pfm_nhm_midx2uidx(ev, midx); s = get_nhm_entry(ev)->pme_umasks[midx].pme_udesc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static unsigned int pfm_nhm_get_num_event_masks(unsigned int ev) { int i, num = 0; pme_nhm_entry_t *ne; int model; ne = get_nhm_entry(ev); for (i=0; i < ne->pme_numasks; i++) { model = ne->pme_umasks[i].pme_umodel; if (!model || model == cpu_model) num++; } DPRINT("event %s numasks=%d\n", ne->pme_name, num); return num; } static int pfm_nhm_get_event_mask_code(unsigned int ev, unsigned int midx, unsigned int *code) { midx = pfm_nhm_midx2uidx(ev, midx); *code =get_nhm_entry(ev)->pme_umasks[midx].pme_ucode; return PFMLIB_SUCCESS; } static int pfm_nhm_get_cycle_event(pfmlib_event_t *e) { e->event = pme_cycles; return PFMLIB_SUCCESS; } static int pfm_nhm_get_inst_retired(pfmlib_event_t *e) { e->event = pme_instr;; return PFMLIB_SUCCESS; } /* * the following function implement the model * specific API directly available to user */ /* * Check if event and all provided unit masks support PEBS * * return: * PFMLIB_ERR_INVAL: invalid event e * 1 event supports PEBS * 0 event does not support PEBS * */ int pfm_nhm_is_pebs(pfmlib_event_t *e) { pme_nhm_entry_t *ne; unsigned int i, n=0; if (e == NULL || e->event >= 
intel_nhm_support.pme_count) return PFMLIB_ERR_INVAL; ne = get_nhm_entry(e->event); if (ne->pme_flags & PFMLIB_NHM_PEBS) return 1; /* * ALL unit mask must support PEBS for this test to return true */ for(i=0; i < e->num_masks; i++) { int midx; /* check for valid unit mask */ if (e->unit_masks[i] >= ne->pme_numasks) return PFMLIB_ERR_INVAL; midx = pfm_nhm_midx2uidx(e->event, e->unit_masks[i]); if (ne->pme_umasks[midx].pme_uflags & PFMLIB_NHM_PEBS) n++; } return n > 0 && n == e->num_masks; } /* * Check if event is uncore * return: * PFMLIB_ERR_INVAL: invalid event e * 1 event is uncore * 0 event is not uncore */ int pfm_nhm_is_uncore(pfmlib_event_t *e) { if (PFMLIB_INITIALIZED() == 0) return 0; if (e == NULL || e->event >= num_pe) return PFMLIB_ERR_INVAL; return !!(get_nhm_entry(e->event)->pme_flags & (PFMLIB_NHM_UNC|PFMLIB_NHM_UNC_FIXED)); } static const char *data_src_encodings[]={ /* 0 */ "unknown L3 cache miss", /* 1 */ "minimal latency core cache hit. Request was satisfied by L1 data cache", /* 2 */ "pending core cache HIT. Outstanding core cache miss to same cacheline address already underway", /* 3 */ "data request satisfied by the L2", /* 4 */ "L3 HIT. Local or remote home request that hit L3 in the uncore with no coherency actions required (snooping)", /* 5 */ "L3 HIT. Local or remote home request that hit L3 and was serviced by another core with a cross core snoop where no modified copy was found (clean)", /* 6 */ "L3 HIT. Local or remote home request that hit L3 and was serviced by another core with a cross core snoop where modified copies were found (HITM)", /* 7 */ "reserved", /* 8 */ "L3 MISS. Local homed request that missed L3 and was serviced by forwarded data following a cross package snoop where no modified copy was found (remote home requests are not counted)", /* 9 */ "reserved", /* 10 */ "L3 MISS. Local homed request that missed L3 and was serviced by local DRAM (go to shared state)", /* 11 */ "L3 MISS. 
Remote homed request that missed L3 and was serviced by remote DRAM (go to shared state)", /* 12 */ "L3 MISS. Local homed request that missed L3 and was serviced by local DRAM (go to exclusive state)", /* 13 */ "L3 MISS. Remote homed request that missed L3 and was serviced by remote DRAM (go to exclusive state)", /* 14 */ "reserved", /* 15 */ "request to uncacheable memory" }; /* * return data source encoding based on index in val * To be used with PEBS load latency filtering to decode * source of the load miss */ int pfm_nhm_data_src_desc(unsigned int val, char **desc) { if (val > 15 || !desc) return PFMLIB_ERR_INVAL; *desc = strdup(data_src_encodings[val]); if (!*desc) return PFMLIB_ERR_NOMEM; return PFMLIB_SUCCESS; } pfm_pmu_support_t intel_nhm_support={ .pmu_name = "Intel Nehalem", .pmu_type = PFMLIB_INTEL_NHM_PMU, .pme_count = 0,/* patched at runtime */ .pmc_count = 0,/* patched at runtime */ .pmd_count = 0,/* patched at runtime */ .num_cnt = 0,/* patched at runtime */ .get_event_code = pfm_nhm_get_event_code, .get_event_name = pfm_nhm_get_event_name, .get_event_counters = pfm_nhm_get_event_counters, .dispatch_events = pfm_nhm_dispatch_events, .pmu_detect = pfm_nhm_detect, .pmu_init = pfm_nhm_init, .get_impl_pmcs = pfm_nhm_get_impl_pmcs, .get_impl_pmds = pfm_nhm_get_impl_pmds, .get_impl_counters = pfm_nhm_get_impl_counters, .get_hw_counter_width = pfm_nhm_get_hw_counter_width, .get_event_desc = pfm_nhm_get_event_description, .get_num_event_masks = pfm_nhm_get_num_event_masks, .get_event_mask_name = pfm_nhm_get_event_mask_name, .get_event_mask_code = pfm_nhm_get_event_mask_code, .get_event_mask_desc = pfm_nhm_get_event_mask_desc, .get_cycle_event = pfm_nhm_get_cycle_event, .get_inst_retired_event = pfm_nhm_get_inst_retired }; pfm_pmu_support_t intel_wsm_support={ .pmu_name = "Intel Westmere", .pmu_type = PFMLIB_INTEL_WSM_PMU, .pme_count = 0,/* patched at runtime */ .pmc_count = 0,/* patched at runtime */ .pmd_count = 0,/* patched at runtime */ .num_cnt = 0,/* 
patched at runtime */ .get_event_code = pfm_nhm_get_event_code, .get_event_name = pfm_nhm_get_event_name, .get_event_counters = pfm_nhm_get_event_counters, .dispatch_events = pfm_nhm_dispatch_events, .pmu_detect = pfm_wsm_detect, .pmu_init = pfm_nhm_init, .get_impl_pmcs = pfm_nhm_get_impl_pmcs, .get_impl_pmds = pfm_nhm_get_impl_pmds, .get_impl_counters = pfm_nhm_get_impl_counters, .get_hw_counter_width = pfm_nhm_get_hw_counter_width, .get_event_desc = pfm_nhm_get_event_description, .get_num_event_masks = pfm_nhm_get_num_event_masks, .get_event_mask_name = pfm_nhm_get_event_mask_name, .get_event_mask_code = pfm_nhm_get_event_mask_code, .get_event_mask_desc = pfm_nhm_get_event_mask_desc, .get_cycle_event = pfm_nhm_get_cycle_event, .get_inst_retired_event = pfm_nhm_get_inst_retired }; papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_intel_nhm_priv.h000066400000000000000000000060261502707512200240310ustar00rootroot00000000000000/* * Copyright (c) 2008 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ #ifndef __PFMLIB_NHM_PRIV_H__ #define __PFMLIB_NHM_PRIV_H__ #define PFMLIB_NHM_MAX_UMASK 32 typedef struct { char *pme_uname; /* unit mask name */ char *pme_udesc; /* event/umask description */ unsigned int pme_cntmsk; /* counter mask */ unsigned int pme_ucode; /* unit mask code */ unsigned int pme_uflags; /* unit mask flags */ unsigned int pme_umodel; /* CPU model for this umask */ } pme_nhm_umask_t; typedef struct { char *pme_name; /* event name */ char *pme_desc; /* event description */ unsigned int pme_code; /* event code */ unsigned int pme_cntmsk; /* counter mask */ unsigned int pme_numasks; /* number of umasks */ unsigned int pme_flags; /* flags */ pme_nhm_umask_t pme_umasks[PFMLIB_NHM_MAX_UMASK]; /* umask desc */ } pme_nhm_entry_t; /* * pme_flags value (event and unit mask) */ /* event or unit-mask level constraints */ #define PFMLIB_NHM_UMASK_NCOMBO 0x001 /* unit mask cannot be combined (default: combination ok) */ #define PFMLIB_NHM_FIXED0 0x002 /* event supported by FIXED_CTR0, can work on generic counters */ #define PFMLIB_NHM_FIXED1 0x004 /* event supported by FIXED_CTR1, can work on generic counters */ #define PFMLIB_NHM_FIXED2_ONLY 0x008 /* only works in FIXED_CTR2 */ #define PFMLIB_NHM_OFFCORE_RSP0 0x010 /* requires OFFCORE_RSP0 register */ #define PFMLIB_NHM_PMC01 0x020 /* works only on IA32_PMC0 or IA32_PMC1 */ #define PFMLIB_NHM_PEBS 0x040 /* support PEBS (precise event) */ #define PFMLIB_NHM_UNC 0x080 /* uncore event */ #define PFMLIB_NHM_UNC_FIXED 0x100 /* uncore fixed event */ #define PFMLIB_NHM_OFFCORE_RSP1 0x200 /* requires OFFCORE_RSP1 register */ #define PFMLIB_NHM_PMC0 0x400 
/* works only on IA32_PMC0 */ #define PFMLIB_NHM_EX 0x800 /* has Nehalem-EX specific unit masks */ #endif /* __PFMLIB_NHM_PRIV_H__ */ papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_itanium.c000066400000000000000000001016131502707512200224530ustar00rootroot00000000000000/* * pfmlib_itanium.c : support for Itanium-family PMU * * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include /* public headers */ #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_priv_ia64.h" /* architecture private */ #include "pfmlib_itanium_priv.h" /* PMU private */ #include "itanium_events.h" /* PMU private */ #define is_ear(i) event_is_ear(itanium_pe+(i)) #define is_ear_tlb(i) event_is_tlb_ear(itanium_pe+(i)) #define is_iear(i) event_is_iear(itanium_pe+(i)) #define is_dear(i) event_is_dear(itanium_pe+(i)) #define is_btb(i) event_is_btb(itanium_pe+(i)) #define has_opcm(i) event_opcm_ok(itanium_pe+(i)) #define has_iarr(i) event_iarr_ok(itanium_pe+(i)) #define has_darr(i) event_darr_ok(itanium_pe+(i)) #define evt_use_opcm(e) ((e)->pfp_ita_pmc8.opcm_used != 0 || (e)->pfp_ita_pmc9.opcm_used !=0) #define evt_use_irange(e) ((e)->pfp_ita_irange.rr_used) #define evt_use_drange(e) ((e)->pfp_ita_drange.rr_used) #define evt_umask(e) itanium_pe[(e)].pme_umask /* let's define some handy shortcuts! */ #define pmc_plm pmc_ita_count_reg.pmc_plm #define pmc_ev pmc_ita_count_reg.pmc_ev #define pmc_oi pmc_ita_count_reg.pmc_oi #define pmc_pm pmc_ita_count_reg.pmc_pm #define pmc_es pmc_ita_count_reg.pmc_es #define pmc_umask pmc_ita_count_reg.pmc_umask #define pmc_thres pmc_ita_count_reg.pmc_thres #define pmc_ism pmc_ita_count_reg.pmc_ism /* * Description of the PMC register mappings use by * this module (as reported in pfmlib_reg_t.reg_num): * * 0 -> PMC0 * 1 -> PMC1 * n -> PMCn * * The following are in the model specific rr_br[]: * IBR0 -> 0 * IBR1 -> 1 * ... * IBR7 -> 7 * DBR0 -> 0 * DBR1 -> 1 * ... * DBR7 -> 7 * * We do not use a mapping table, instead we make up the * values on the fly given the base. */ #define PFMLIB_ITA_PMC_BASE 0 static int pfm_ita_detect(void) { int ret = PFMLIB_ERR_NOTSUPP; /* * we support all chips (there is only one!) 
in the Itanium family */ if (pfm_ia64_get_cpu_family() == 0x07) ret = PFMLIB_SUCCESS; return ret; } /* * Part of the following code will eventually go into a perfmon library */ static int valid_assign(unsigned int *as, pfmlib_regmask_t *r_pmcs, unsigned int cnt) { unsigned int i; for(i=0; i < cnt; i++) { if (as[i]==0) return PFMLIB_ERR_NOASSIGN; /* * take care of restricted PMC registers */ if (pfm_regmask_isset(r_pmcs, as[i])) return PFMLIB_ERR_NOASSIGN; } return PFMLIB_SUCCESS; } /* * Automatically dispatch events to corresponding counters following constraints. */ static int pfm_ita_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in, pfmlib_output_param_t *outp) { #define has_counter(e,b) (itanium_pe[e].pme_counters & (1 << (b)) ? (b) : 0) pfmlib_ita_input_param_t *param = mod_in; pfm_ita_pmc_reg_t reg; pfmlib_event_t *e; pfmlib_reg_t *pc, *pd; pfmlib_regmask_t *r_pmcs; unsigned int i,j,k,l, m; unsigned int max_l0, max_l1, max_l2, max_l3; unsigned int assign[PMU_ITA_NUM_COUNTERS]; unsigned int cnt; e = inp->pfp_events; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; cnt = inp->pfp_event_count; r_pmcs = &inp->pfp_unavail_pmcs; if (PFMLIB_DEBUG()) { for (m=0; m < cnt; m++) { DPRINT("ev[%d]=%s counters=0x%lx\n", m, itanium_pe[e[m].event].pme_name, itanium_pe[e[m].event].pme_counters); } } if (cnt > PMU_ITA_NUM_COUNTERS) return PFMLIB_ERR_TOOMANY; max_l0 = PMU_ITA_FIRST_COUNTER + PMU_ITA_NUM_COUNTERS; max_l1 = PMU_ITA_FIRST_COUNTER + PMU_ITA_NUM_COUNTERS*(cnt>1); max_l2 = PMU_ITA_FIRST_COUNTER + PMU_ITA_NUM_COUNTERS*(cnt>2); max_l3 = PMU_ITA_FIRST_COUNTER + PMU_ITA_NUM_COUNTERS*(cnt>3); DPRINT("max_l0=%u max_l1=%u max_l2=%u max_l3=%u\n", max_l0, max_l1, max_l2, max_l3); /* * This code needs fixing. It is not very pretty and * won't handle more than 4 counters if more become * available ! * For now, worst case in the loop nest: 4! 
(factorial) */ for (i=PMU_ITA_FIRST_COUNTER; i < max_l0; i++) { assign[0]= has_counter(e[0].event,i); if (max_l1 == PMU_ITA_FIRST_COUNTER && valid_assign(assign, r_pmcs, cnt) == PFMLIB_SUCCESS) goto done; for (j=PMU_ITA_FIRST_COUNTER; j < max_l1; j++) { if (j == i) continue; assign[1] = has_counter(e[1].event,j); if (max_l2 == PMU_ITA_FIRST_COUNTER && valid_assign(assign, r_pmcs, cnt) == PFMLIB_SUCCESS) goto done; for (k=PMU_ITA_FIRST_COUNTER; k < max_l2; k++) { if(k == i || k == j) continue; assign[2] = has_counter(e[2].event,k); if (max_l3 == PMU_ITA_FIRST_COUNTER && valid_assign(assign, r_pmcs, cnt) == PFMLIB_SUCCESS) goto done; for (l=PMU_ITA_FIRST_COUNTER; l < max_l3; l++) { if(l == i || l == j || l == k) continue; assign[3] = has_counter(e[3].event,l); if (valid_assign(assign, r_pmcs, cnt) == PFMLIB_SUCCESS) goto done; } } } } /* we cannot satisfy the constraints */ return PFMLIB_ERR_NOASSIGN; done: for (j=0; j < cnt ; j++ ) { reg.pmc_val = 0; /* clear all */ /* if plm is 0, then assume not specified per-event and use default */ reg.pmc_plm = e[j].plm ? e[j].plm : inp->pfp_dfl_plm; reg.pmc_oi = 1; /* overflow interrupt */ reg.pmc_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc_thres = param ? param->pfp_ita_counters[j].thres: 0; reg.pmc_ism = param ? param->pfp_ita_counters[j].ism : PFMLIB_ITA_ISM_BOTH; reg.pmc_umask = is_ear(e[j].event) ? 
0x0 : evt_umask(e[j].event); reg.pmc_es = itanium_pe[e[j].event].pme_code; pc[j].reg_num = assign[j]; pc[j].reg_value = reg.pmc_val; pc[j].reg_addr = assign[j]; pc[j].reg_alt_addr= assign[j]; pd[j].reg_num = assign[j]; pd[j].reg_addr = assign[j]; pd[j].reg_alt_addr = assign[j]; __pfm_vbprintf("[PMC%u(pmc%u)=0x%06lx thres=%d es=0x%02x plm=%d umask=0x%x pm=%d ism=0x%x oi=%d] %s\n", assign[j], assign[j], reg.pmc_val, reg.pmc_thres, reg.pmc_es, reg.pmc_plm, reg.pmc_umask, reg.pmc_pm, reg.pmc_ism, reg.pmc_oi, itanium_pe[e[j].event].pme_name); __pfm_vbprintf("[PMD%u(pmd%u)]\n", pd[j].reg_num, pd[j].reg_num); } /* number of PMC registers programmed */ outp->pfp_pmc_count = cnt; outp->pfp_pmd_count = cnt; return PFMLIB_SUCCESS; } static int pfm_dispatch_iear(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfm_ita_pmc_reg_t reg; pfmlib_ita_input_param_t *param = mod_in; pfmlib_ita_input_param_t fake_param; pfmlib_reg_t *pc, *pd; unsigned int pos1, pos2; int iear_idx = -1; unsigned int i, count; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; pos1 = outp->pfp_pmc_count; pos2 = outp->pfp_pmd_count; count = inp->pfp_event_count; for (i=0; i < count; i++) { if (is_iear(inp->pfp_events[i].event)) iear_idx = i; } if (param == NULL || param->pfp_ita_iear.ear_used == 0) { /* * case 3: no I-EAR event, no (or nothing) in param->pfp_ita_iear.ear_used */ if (iear_idx == -1) return PFMLIB_SUCCESS; memset(&fake_param, 0, sizeof(fake_param)); param = &fake_param; pfm_ita_get_ear_mode(inp->pfp_events[iear_idx].event, &param->pfp_ita_iear.ear_mode); param->pfp_ita_iear.ear_umask = evt_umask(inp->pfp_events[iear_idx].event); param->pfp_ita_iear.ear_ism = PFMLIB_ITA_ISM_BOTH; /* force both instruction sets */ DPRINT("I-EAR event with no info\n"); } /* sanity check on the mode */ if (param->pfp_ita_iear.ear_mode < 0 || param->pfp_ita_iear.ear_mode > 2) return PFMLIB_ERR_INVAL; /* * case 2: ear_used=1, event is defined, we use the param info as it is more
precise * case 4: ear_used=1, no event (free running I-EAR), use param info */ reg.pmc_val = 0; /* if plm is 0, then assume not specified per-event and use default */ reg.pmc10_ita_reg.iear_plm = param->pfp_ita_iear.ear_plm ? param->pfp_ita_iear.ear_plm : inp->pfp_dfl_plm; reg.pmc10_ita_reg.iear_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc10_ita_reg.iear_tlb = param->pfp_ita_iear.ear_mode; reg.pmc10_ita_reg.iear_umask = param->pfp_ita_iear.ear_umask; reg.pmc10_ita_reg.iear_ism = param->pfp_ita_iear.ear_ism; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 10)) return PFMLIB_ERR_NOASSIGN; pc[pos1].reg_num = 10; /* PMC10 is I-EAR config register */ pc[pos1].reg_value = reg.pmc_val; pc[pos1].reg_addr = 10; pc[pos1].reg_alt_addr= 10; pos1++; pd[pos2].reg_num = 0; pd[pos2].reg_addr = 0; pd[pos2].reg_alt_addr = 0; pos2++; pd[pos2].reg_num = 1; pd[pos2].reg_addr = 1; pd[pos2].reg_alt_addr = 1; pos2++; __pfm_vbprintf("[PMC10(pmc10)=0x%lx tlb=%s plm=%d pm=%d ism=0x%x umask=0x%x]\n", reg.pmc_val, reg.pmc10_ita_reg.iear_tlb ? 
"Yes" : "No", reg.pmc10_ita_reg.iear_plm, reg.pmc10_ita_reg.iear_pm, reg.pmc10_ita_reg.iear_ism, reg.pmc10_ita_reg.iear_umask); __pfm_vbprintf("[PMD0(pmd0)]\n[PMD1(pmd1)\n"); /* update final number of entries used */ outp->pfp_pmc_count = pos1; outp->pfp_pmd_count = pos2; return PFMLIB_SUCCESS; } static int pfm_dispatch_dear(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfm_ita_pmc_reg_t reg; pfmlib_ita_input_param_t *param = mod_in; pfmlib_ita_input_param_t fake_param; pfmlib_reg_t *pc, *pd; unsigned int pos1, pos2; int dear_idx = -1; unsigned int i, count; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; pos1 = outp->pfp_pmc_count; pos2 = outp->pfp_pmd_count; count = inp->pfp_event_count; for (i=0; i < count; i++) { if (is_dear(inp->pfp_events[i].event)) dear_idx = i; } if (param == NULL || param->pfp_ita_dear.ear_used == 0) { /* * case 3: no D-EAR event, no (or nothing) in param->pfp_ita2_dear.ear_used */ if (dear_idx == -1) return PFMLIB_SUCCESS; memset(&fake_param, 0, sizeof(fake_param)); param = &fake_param; pfm_ita_get_ear_mode(inp->pfp_events[dear_idx].event, ¶m->pfp_ita_dear.ear_mode); param->pfp_ita_dear.ear_umask = evt_umask(inp->pfp_events[dear_idx].event); param->pfp_ita_dear.ear_ism = PFMLIB_ITA_ISM_BOTH; /* force both instruction sets */ DPRINT("D-EAR event with no info\n"); } /* sanity check on the mode */ if (param->pfp_ita_dear.ear_mode > 2) return PFMLIB_ERR_INVAL; /* * case 2: ear_used=1, event is defined, we use the param info as it is more precise * case 4: ear_used=1, no event (free running D-EAR), use param info */ reg.pmc_val = 0; /* if plm is 0, then assume not specified per-event and use default */ reg.pmc11_ita_reg.dear_plm = param->pfp_ita_dear.ear_plm ? param->pfp_ita_dear.ear_plm : inp->pfp_dfl_plm; reg.pmc11_ita_reg.dear_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 
1 : 0; reg.pmc11_ita_reg.dear_tlb = param->pfp_ita_dear.ear_mode; reg.pmc11_ita_reg.dear_ism = param->pfp_ita_dear.ear_ism; reg.pmc11_ita_reg.dear_umask = param->pfp_ita_dear.ear_umask; reg.pmc11_ita_reg.dear_pt = param->pfp_ita_drange.rr_used ? 0 : 1; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 11)) return PFMLIB_ERR_NOASSIGN; pc[pos1].reg_num = 11; /* PMC11 is D-EAR config register */ pc[pos1].reg_value = reg.pmc_val; pc[pos1].reg_addr = 11; pc[pos1].reg_alt_addr= 11; pos1++; pd[pos2].reg_num = 2; pd[pos2].reg_addr = 2; pd[pos2].reg_alt_addr = 2; pos2++; pd[pos2].reg_num = 3; pd[pos2].reg_addr = 3; pd[pos2].reg_alt_addr = 3; pos2++; pd[pos2].reg_num = 17; pd[pos2].reg_addr = 17; pd[pos2].reg_alt_addr = 17; pos2++; __pfm_vbprintf("[PMC11(pmc11)=0x%lx tlb=%s plm=%d pm=%d ism=0x%x umask=0x%x pt=%d]\n", reg.pmc_val, reg.pmc11_ita_reg.dear_tlb ? "Yes" : "No", reg.pmc11_ita_reg.dear_plm, reg.pmc11_ita_reg.dear_pm, reg.pmc11_ita_reg.dear_ism, reg.pmc11_ita_reg.dear_umask, reg.pmc11_ita_reg.dear_pt); __pfm_vbprintf("[PMD2(pmd2)]\n[PMD3(pmd3)]\n[PMD17(pmd17)]\n"); /* update final number of entries used */ outp->pfp_pmc_count = pos1; outp->pfp_pmd_count = pos2; return PFMLIB_SUCCESS; } static int pfm_dispatch_opcm(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfmlib_ita_input_param_t *param = mod_in; pfm_ita_pmc_reg_t reg; pfmlib_reg_t *pc = outp->pfp_pmcs; int pos = outp->pfp_pmc_count; if (param == NULL) return PFMLIB_SUCCESS; if (param->pfp_ita_pmc8.opcm_used) { reg.pmc_val = param->pfp_ita_pmc8.pmc_val; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 8)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 8; pc[pos].reg_value = reg.pmc_val; pc[pos].reg_addr = 8; pc[pos].reg_alt_addr = 8; pos++; __pfm_vbprintf("[PMC8(pmc8)=0x%lx m=%d i=%d f=%d b=%d match=0x%x mask=0x%x]\n", reg.pmc_val, reg.pmc8_9_ita_reg.m, reg.pmc8_9_ita_reg.i, reg.pmc8_9_ita_reg.f, reg.pmc8_9_ita_reg.b, reg.pmc8_9_ita_reg.match, reg.pmc8_9_ita_reg.mask); } if (param->pfp_ita_pmc9.opcm_used)
{ reg.pmc_val = param->pfp_ita_pmc9.pmc_val; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 9)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 9; pc[pos].reg_value = reg.pmc_val; pc[pos].reg_addr = 9; pc[pos].reg_alt_addr = 9; pos++; __pfm_vbprintf("[PMC9(pmc9)=0x%lx m=%d i=%d f=%d b=%d match=0x%x mask=0x%x]\n", reg.pmc_val, reg.pmc8_9_ita_reg.m, reg.pmc8_9_ita_reg.i, reg.pmc8_9_ita_reg.f, reg.pmc8_9_ita_reg.b, reg.pmc8_9_ita_reg.match, reg.pmc8_9_ita_reg.mask); } outp->pfp_pmc_count = pos; return PFMLIB_SUCCESS; } static int pfm_dispatch_btb(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfm_ita_pmc_reg_t reg; pfmlib_ita_input_param_t *param = mod_in; pfmlib_ita_input_param_t fake_param; pfmlib_reg_t *pc, *pd; int found_btb=0; unsigned int i, count; unsigned int pos1, pos2; reg.pmc_val = 0; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; pos1 = outp->pfp_pmc_count; pos2 = outp->pfp_pmd_count; count = inp->pfp_event_count; for (i=0; i < count; i++) { if (is_btb(inp->pfp_events[i].event)) found_btb = 1; } if (param == NULL || param->pfp_ita_btb.btb_used == 0) { /* * case 3: no BTB event, no param */ if (found_btb == 0) return PFMLIB_SUCCESS; /* * case 1: BTB event, no param, capture all branches */ memset(&fake_param, 0, sizeof(fake_param)); param = &fake_param; param->pfp_ita_btb.btb_tar = 0x1; /* capture TAR */ param->pfp_ita_btb.btb_tm = 0x3; /* all branches */ param->pfp_ita_btb.btb_ptm = 0x3; /* all branches */ param->pfp_ita_btb.btb_ppm = 0x3; /* all branches */ param->pfp_ita_btb.btb_tac = 0x1; /* capture TAC */ param->pfp_ita_btb.btb_bac = 0x1; /* capture BAC */ DPRINT("BTB event with no info\n"); } /* * case 2: BTB event, param * case 4: no BTB event, param (free running mode) */ /* if plm is 0, then assume not specified per-event and use default */ reg.pmc12_ita_reg.btbc_plm = param->pfp_ita_btb.btb_plm ? param->pfp_ita_btb.btb_plm : inp->pfp_dfl_plm; reg.pmc12_ita_reg.btbc_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 
1 : 0; reg.pmc12_ita_reg.btbc_tar = param->pfp_ita_btb.btb_tar & 0x1; reg.pmc12_ita_reg.btbc_tm = param->pfp_ita_btb.btb_tm & 0x3; reg.pmc12_ita_reg.btbc_ptm = param->pfp_ita_btb.btb_ptm & 0x3; reg.pmc12_ita_reg.btbc_ppm = param->pfp_ita_btb.btb_ppm & 0x3; reg.pmc12_ita_reg.btbc_bpt = param->pfp_ita_btb.btb_tac & 0x1; reg.pmc12_ita_reg.btbc_bac = param->pfp_ita_btb.btb_bac & 0x1; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 12)) return PFMLIB_ERR_NOASSIGN; pc[pos1].reg_num = 12; pc[pos1].reg_value = reg.pmc_val; pc[pos1].reg_addr = 12; pc[pos1].reg_alt_addr= 12; pos1++; __pfm_vbprintf("[PMC12(pmc12)=0x%lx plm=%d pm=%d tar=%d tm=%d ptm=%d ppm=%d bpt=%d bac=%d]\n", reg.pmc_val, reg.pmc12_ita_reg.btbc_plm, reg.pmc12_ita_reg.btbc_pm, reg.pmc12_ita_reg.btbc_tar, reg.pmc12_ita_reg.btbc_tm, reg.pmc12_ita_reg.btbc_ptm, reg.pmc12_ita_reg.btbc_ppm, reg.pmc12_ita_reg.btbc_bpt, reg.pmc12_ita_reg.btbc_bac); /* * PMD16 is included in list of used PMD */ for(i=8; i < 17; i++, pos2++) { pd[pos2].reg_num = i; pd[pos2].reg_addr = i; pd[pos2].reg_alt_addr = i; __pfm_vbprintf("[PMD%u(pmd%u)]\n", pd[pos2].reg_num, pd[pos2].reg_num); } /* update final number of entries used */ outp->pfp_pmc_count = pos1; outp->pfp_pmd_count = pos2; return PFMLIB_SUCCESS; } /* * mode = 0 -> check code (enforce bundle alignment) * mode = 1 -> check data */ static int check_intervals(pfmlib_ita_input_rr_t *irr, int mode, int *n_intervals) { int i; pfmlib_ita_input_rr_desc_t *lim = irr->rr_limits; for(i=0; i < 4; i++) { /* end marker */ if (lim[i].rr_start == 0 && lim[i].rr_end == 0) break; /* invalid entry */ if (lim[i].rr_start >= lim[i].rr_end) return PFMLIB_ERR_IRRINVAL; if (mode == 0 && (lim[i].rr_start & 0xf || lim[i].rr_end & 0xf)) return PFMLIB_ERR_IRRALIGN; } *n_intervals = i; return PFMLIB_SUCCESS; } static void do_normal_rr(unsigned long start, unsigned long end, pfmlib_reg_t *br, int nbr, int dir, int *idx, int *reg_idx, int plm) { unsigned long size, l_addr, c; unsigned long l_offs = 0, r_offs = 0; unsigned long
l_size, r_size; dbreg_t db; int p2; if (nbr < 1 || end <= start) return; size = end - start; DPRINT("start=0x%016lx end=0x%016lx size=0x%lx bytes (%lu bundles) nbr=%d dir=%d\n", start, end, size, size >> 4, nbr, dir); p2 = pfm_ia64_fls(size); c = ALIGN_DOWN(end, p2); DPRINT("largest power of two possible: 2^%d=0x%lx, crossing=0x%016lx\n", p2, 1UL << p2, c); if ((c - (1UL<= start) { l_addr = c - (1UL << p2); } else { p2--; if ((c + (1UL<>l_offs: 0x%lx\n", l_offs); } } else if (dir == 1 && r_size != 0 && nbr == 1) { p2++; l_addr = start; if (PFMLIB_DEBUG()) { r_offs = l_addr+(1UL<>r_offs: 0x%lx\n", r_offs); } } l_size = l_addr - start; r_size = end - l_addr-(1UL<>largest chunk: 2^%d @0x%016lx-0x%016lx\n", p2, l_addr, l_addr+(1UL<>before: 0x%016lx-0x%016lx\n", start, l_addr); if (r_size && !r_offs) DPRINT(">>after : 0x%016lx-0x%016lx\n", l_addr+(1UL<>1; if (nbr & 0x1) { /* * our simple heuristic is: * we assign the largest number of registers to the largest * of the two chunks */ if (l_size > r_size) { l_nbr++; } else { r_nbr++; } } do_normal_rr(start, l_addr, br, l_nbr, 0, idx, reg_idx, plm); do_normal_rr(l_addr+(1UL<rr_start, in_rr->rr_end, n_pairs); __pfm_vbprintf("start offset: -0x%lx end_offset: +0x%lx\n", out_rr->rr_soff, out_rr->rr_eoff); for (j=0; j < n_pairs; j++, base_idx += 2) { d.val = dbr[base_idx+1].reg_value; r_end = dbr[base_idx].reg_value+((~(d.db.db_mask)) & ~(0xffUL << 56)); __pfm_vbprintf("brp%u: db%u: 0x%016lx db%u: plm=0x%x mask=0x%016lx end=0x%016lx\n", dbr[base_idx].reg_num>>1, dbr[base_idx].reg_num, dbr[base_idx].reg_value, dbr[base_idx+1].reg_num, d.db.db_plm, (unsigned long) d.db.db_mask, r_end); } } static int compute_normal_rr(pfmlib_ita_input_rr_t *irr, int dfl_plm, int n, int *base_idx, pfmlib_ita_output_rr_t *orr) { pfmlib_ita_input_rr_desc_t *in_rr; pfmlib_ita_output_rr_desc_t *out_rr; unsigned long r_end; pfmlib_reg_t *br; dbreg_t d; int i, j, br_index, reg_idx, prev_index; in_rr = irr->rr_limits; out_rr = orr->rr_infos; br = 
orr->rr_br; reg_idx = *base_idx; br_index = 0; for (i=0; i < n; i++, in_rr++, out_rr++) { /* * running out of registers */ if (br_index == 8) break; prev_index = br_index; do_normal_rr( in_rr->rr_start, in_rr->rr_end, br, 4 - (reg_idx>>1), /* how many pairs available */ 0, &br_index, &reg_idx, in_rr->rr_plm ? in_rr->rr_plm : dfl_plm); DPRINT("br_index=%d reg_idx=%d\n", br_index, reg_idx); /* * compute offsets */ out_rr->rr_soff = out_rr->rr_eoff = 0; for(j=prev_index; j < br_index; j+=2) { d.val = br[j+1].reg_value; r_end = br[j].reg_value+((~(d.db.db_mask)+1) & ~(0xffUL << 56)); if (br[j].reg_value <= in_rr->rr_start) out_rr->rr_soff = in_rr->rr_start - br[j].reg_value; if (r_end >= in_rr->rr_end) out_rr->rr_eoff = r_end - in_rr->rr_end; } if (PFMLIB_VERBOSE()) print_one_range(in_rr, out_rr, br, prev_index, (br_index-prev_index)>>1); } /* do not have enough registers to cover all the ranges */ if (br_index == 8 && i < n) return PFMLIB_ERR_TOOMANY; orr->rr_nbr_used = br_index; return PFMLIB_SUCCESS; } static int pfm_dispatch_irange(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in, pfmlib_output_param_t *outp, pfmlib_ita_output_param_t *mod_out) { pfm_ita_pmc_reg_t reg; pfmlib_ita_input_param_t *param = mod_in; pfmlib_reg_t *pc = outp->pfp_pmcs; pfmlib_ita_input_rr_t *irr; pfmlib_ita_output_rr_t *orr; int pos = outp->pfp_pmc_count; int ret, base_idx = 0; int n_intervals; if (param == NULL || param->pfp_ita_irange.rr_used == 0) return PFMLIB_SUCCESS; if (mod_out == NULL) return PFMLIB_ERR_INVAL; irr = &param->pfp_ita_irange; orr = &mod_out->pfp_ita_irange; ret = check_intervals(irr, 0, &n_intervals); if (ret != PFMLIB_SUCCESS) return ret; if (n_intervals < 1) return PFMLIB_ERR_IRRINVAL; DPRINT("n_intervals=%d\n", n_intervals); ret = compute_normal_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr); if (ret != PFMLIB_SUCCESS) { return ret == PFMLIB_ERR_TOOMANY ?
PFMLIB_ERR_IRRTOOMANY : ret; } reg.pmc_val = 0; reg.pmc13_ita_reg.irange_ta = 0x0; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 13)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 13; pc[pos].reg_value = reg.pmc_val; pc[pos].reg_addr = 13; pc[pos].reg_alt_addr= 13; pos++; __pfm_vbprintf("[PMC13(pmc13)=0x%lx ta=%d]\n", reg.pmc_val, reg.pmc13_ita_reg.irange_ta); outp->pfp_pmc_count = pos; return PFMLIB_SUCCESS; } static int pfm_dispatch_drange(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in, pfmlib_output_param_t *outp, pfmlib_ita_output_param_t *mod_out) { pfmlib_ita_input_param_t *param = mod_in; pfmlib_event_t *e = inp->pfp_events; pfmlib_reg_t *pc = outp->pfp_pmcs; pfmlib_ita_input_rr_t *irr; pfmlib_ita_output_rr_t *orr; pfm_ita_pmc_reg_t reg; unsigned int i, count; int pos = outp->pfp_pmc_count; int ret, base_idx = 0; int n_intervals; if (param == NULL || param->pfp_ita_drange.rr_used == 0) return PFMLIB_SUCCESS; if (mod_out == NULL) return PFMLIB_ERR_INVAL; irr = &param->pfp_ita_drange; orr = &mod_out->pfp_ita_drange; ret = check_intervals(irr, 1, &n_intervals); if (ret != PFMLIB_SUCCESS) return ret; if (n_intervals < 1) return PFMLIB_ERR_DRRINVAL; DPRINT("n_intervals=%d\n", n_intervals); ret = compute_normal_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr); if (ret != PFMLIB_SUCCESS) { return ret == PFMLIB_ERR_TOOMANY ?
PFMLIB_ERR_DRRTOOMANY : ret; } count = inp->pfp_event_count; for (i=0; i < count; i++) { if (is_dear(e[i].event)) return PFMLIB_SUCCESS; /* will be done there */ } reg.pmc_val = 0UL; /* * here we have no other choice but to use the default priv level as there is no * specific D-EAR event provided */ reg.pmc11_ita_reg.dear_plm = inp->pfp_dfl_plm; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 11)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 11; pc[pos].reg_value = reg.pmc_val; pc[pos].reg_addr = 11; pc[pos].reg_alt_addr= 11; pos++; __pfm_vbprintf("[PMC11(pmc11)=0x%lx tlb=%s plm=%d pm=%d ism=0x%x umask=0x%x pt=%d]\n", reg.pmc_val, reg.pmc11_ita_reg.dear_tlb ? "Yes" : "No", reg.pmc11_ita_reg.dear_plm, reg.pmc11_ita_reg.dear_pm, reg.pmc11_ita_reg.dear_ism, reg.pmc11_ita_reg.dear_umask, reg.pmc11_ita_reg.dear_pt); outp->pfp_pmc_count = pos; return PFMLIB_SUCCESS; } static int check_qualifier_constraints(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in) { pfmlib_event_t *e = inp->pfp_events; unsigned int i, count; count = inp->pfp_event_count; for(i=0; i < count; i++) { /* * skip the check for counters which requested it. Use at your own risk. * Not all counters have necessarily been validated for use with * qualifiers. Typically the event is counted as if no constraint * existed.
*/ if (mod_in->pfp_ita_counters[i].flags & PFMLIB_ITA_FL_EVT_NO_QUALCHECK) continue; if (evt_use_irange(mod_in) && has_iarr(e[i].event) == 0) return PFMLIB_ERR_FEATCOMB; if (evt_use_drange(mod_in) && has_darr(e[i].event) == 0) return PFMLIB_ERR_FEATCOMB; if (evt_use_opcm(mod_in) && has_opcm(e[i].event) == 0) return PFMLIB_ERR_FEATCOMB; } return PFMLIB_SUCCESS; } static int check_range_plm(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in) { unsigned int i, count; if (mod_in->pfp_ita_drange.rr_used == 0 && mod_in->pfp_ita_irange.rr_used == 0) return PFMLIB_SUCCESS; /* * range restriction applies to all events, therefore we must have a consistent * set of plm and they must match the pfp_dfl_plm which is used to set up the debug * registers */ count = inp->pfp_event_count; for(i=0; i < count; i++) { if (inp->pfp_events[i].plm && inp->pfp_events[i].plm != inp->pfp_dfl_plm) return PFMLIB_ERR_FEATCOMB; } return PFMLIB_SUCCESS; } static int pfm_ita_dispatch_events(pfmlib_input_param_t *inp, void *model_in, pfmlib_output_param_t *outp, void *model_out) { int ret; pfmlib_ita_input_param_t *mod_in = (pfmlib_ita_input_param_t *)model_in; pfmlib_ita_output_param_t *mod_out = (pfmlib_ita_output_param_t *)model_out; /* * nothing will come out of this combination */ if (mod_out && mod_in == NULL) return PFMLIB_ERR_INVAL; /* check opcode match, range restriction qualifiers */ if (mod_in && check_qualifier_constraints(inp, mod_in) != PFMLIB_SUCCESS) return PFMLIB_ERR_FEATCOMB; /* check for problems with range restriction and per-event plm */ if (mod_in && check_range_plm(inp, mod_in) != PFMLIB_SUCCESS) return PFMLIB_ERR_FEATCOMB; ret = pfm_ita_dispatch_counters(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) return ret; /* now check for I-EAR */ ret = pfm_dispatch_iear(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) return ret; /* now check for D-EAR */ ret = pfm_dispatch_dear(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) return ret; /* now check for Opcode matchers */ ret
= pfm_dispatch_opcm(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) return ret; ret = pfm_dispatch_btb(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) return ret; ret = pfm_dispatch_irange(inp, mod_in, outp, mod_out); if (ret != PFMLIB_SUCCESS) return ret; ret = pfm_dispatch_drange(inp, mod_in, outp, mod_out); return ret; } /* XXX: return value is also error code */ int pfm_ita_get_event_maxincr(unsigned int i, unsigned int *maxincr) { if (i >= PME_ITA_EVENT_COUNT || maxincr == NULL) return PFMLIB_ERR_INVAL; *maxincr = itanium_pe[i].pme_maxincr; return PFMLIB_SUCCESS; } int pfm_ita_is_ear(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || ! is_ear(i) ? 0 : 1; } int pfm_ita_is_dear(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || ! is_dear(i) ? 0 : 1; } int pfm_ita_is_dear_tlb(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || ! (is_dear(i) && is_ear_tlb(i)) ? 0 : 1; } int pfm_ita_is_dear_cache(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || ! (is_dear(i) && !is_ear_tlb(i)) ? 0 : 1; } int pfm_ita_is_iear(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || ! is_iear(i) ? 0 : 1; } int pfm_ita_is_iear_tlb(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || ! (is_iear(i) && is_ear_tlb(i)) ? 0 : 1; } int pfm_ita_is_iear_cache(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || ! (is_iear(i) && !is_ear_tlb(i)) ? 0 : 1; } int pfm_ita_is_btb(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || ! is_btb(i) ? 0 : 1; } int pfm_ita_support_iarr(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || ! has_iarr(i) ? 0 : 1; } int pfm_ita_support_darr(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || ! has_darr(i) ? 0 : 1; } int pfm_ita_support_opcm(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || ! has_opcm(i) ? 0 : 1; } int pfm_ita_get_ear_mode(unsigned int i, pfmlib_ita_ear_mode_t *m) { if (!is_ear(i) || m == NULL) return PFMLIB_ERR_INVAL; *m = is_ear_tlb(i) ?
PFMLIB_ITA_EAR_TLB_MODE : PFMLIB_ITA_EAR_CACHE_MODE; return PFMLIB_SUCCESS; } static int pfm_ita_get_event_code(unsigned int i, unsigned int cnt, int *code) { if (cnt != PFMLIB_CNT_FIRST && (cnt < 4 || cnt > 7)) return PFMLIB_ERR_INVAL; *code = (int)itanium_pe[i].pme_code; return PFMLIB_SUCCESS; } /* * This function is accessible directly to the user */ int pfm_ita_get_event_umask(unsigned int i, unsigned long *umask) { if (i >= PME_ITA_EVENT_COUNT || umask == NULL) return PFMLIB_ERR_INVAL; *umask = evt_umask(i); return PFMLIB_SUCCESS; } static char * pfm_ita_get_event_name(unsigned int i) { return itanium_pe[i].pme_name; } static void pfm_ita_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { unsigned int i; unsigned long m; memset(counters, 0, sizeof(*counters)); m =itanium_pe[j].pme_counters; for(i=0; m ; i++, m>>=1) { if (m & 0x1) pfm_regmask_set(counters, i); } } static void pfm_ita_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs) { unsigned int i = 0; /* all pmcs are contiguous */ for(i=0; i < PMU_ITA_NUM_PMCS; i++) pfm_regmask_set(impl_pmcs, i); } static void pfm_ita_get_impl_pmds(pfmlib_regmask_t *impl_pmds) { unsigned int i = 0; /* all pmds are contiguous */ for(i=0; i < PMU_ITA_NUM_PMDS; i++) pfm_regmask_set(impl_pmds, i); } static void pfm_ita_get_impl_counters(pfmlib_regmask_t *impl_counters) { unsigned int i = 0; /* counting pmds are contiguous */ for(i=4; i < 8; i++) pfm_regmask_set(impl_counters, i); } static void pfm_ita_get_hw_counter_width(unsigned int *width) { *width = PMU_ITA_COUNTER_WIDTH; } static int pfm_ita_get_cycle_event(pfmlib_event_t *e) { e->event = PME_ITA_CPU_CYCLES; return PFMLIB_SUCCESS; } static int pfm_ita_get_inst_retired(pfmlib_event_t *e) { e->event = PME_ITA_IA64_INST_RETIRED; return PFMLIB_SUCCESS; } pfm_pmu_support_t itanium_support={ .pmu_name = "itanium", .pmu_type = PFMLIB_ITANIUM_PMU, .pme_count = PME_ITA_EVENT_COUNT, .pmc_count = PMU_ITA_NUM_PMCS, .pmd_count = PMU_ITA_NUM_PMDS, .num_cnt = PMU_ITA_NUM_COUNTERS, 
.get_event_code = pfm_ita_get_event_code, .get_event_name = pfm_ita_get_event_name, .get_event_counters = pfm_ita_get_event_counters, .dispatch_events = pfm_ita_dispatch_events, .pmu_detect = pfm_ita_detect, .get_impl_pmcs = pfm_ita_get_impl_pmcs, .get_impl_pmds = pfm_ita_get_impl_pmds, .get_impl_counters = pfm_ita_get_impl_counters, .get_hw_counter_width = pfm_ita_get_hw_counter_width, .get_cycle_event = pfm_ita_get_cycle_event, .get_inst_retired_event = pfm_ita_get_inst_retired /* no event description available for Itanium */ };
papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_itanium2.c
/* * pfmlib_itanium2.c : support for the Itanium2 PMU family * * Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/
#include <sys/types.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>
/* public headers */
#include <perfmon/pfmlib_itanium2.h>
/* private headers */
#include "pfmlib_priv.h" /* library private */
#include "pfmlib_priv_ia64.h" /* architecture private */
#include "pfmlib_itanium2_priv.h" /* PMU private */
#include "itanium2_events.h" /* PMU private */
#define is_ear(i) event_is_ear(itanium2_pe+(i))
#define is_ear_tlb(i) event_is_ear_tlb(itanium2_pe+(i))
#define is_ear_alat(i) event_is_ear_alat(itanium2_pe+(i))
#define is_ear_cache(i) event_is_ear_cache(itanium2_pe+(i))
#define is_iear(i) event_is_iear(itanium2_pe+(i))
#define is_dear(i) event_is_dear(itanium2_pe+(i))
#define is_btb(i) event_is_btb(itanium2_pe+(i))
#define has_opcm(i) event_opcm_ok(itanium2_pe+(i))
#define has_iarr(i) event_iarr_ok(itanium2_pe+(i))
#define has_darr(i) event_darr_ok(itanium2_pe+(i))
#define evt_use_opcm(e) ((e)->pfp_ita2_pmc8.opcm_used != 0 || (e)->pfp_ita2_pmc9.opcm_used != 0)
#define evt_use_irange(e) ((e)->pfp_ita2_irange.rr_used)
#define evt_use_drange(e) ((e)->pfp_ita2_drange.rr_used)
#define evt_grp(e) (int)itanium2_pe[e].pme_qualifiers.pme_qual.pme_group
#define evt_set(e) (int)itanium2_pe[e].pme_qualifiers.pme_qual.pme_set
#define evt_umask(e) itanium2_pe[e].pme_umask
#define FINE_MODE_BOUNDARY_BITS 12
#define FINE_MODE_MASK ~((1U<<FINE_MODE_BOUNDARY_BITS)-1)
/* let's define some handy shortcuts! */
#define pmc_plm pmc_ita2_counter_reg.pmc_plm
#define pmc_ev pmc_ita2_counter_reg.pmc_ev
#define pmc_oi pmc_ita2_counter_reg.pmc_oi
#define pmc_pm pmc_ita2_counter_reg.pmc_pm
#define pmc_es pmc_ita2_counter_reg.pmc_es
#define pmc_umask pmc_ita2_counter_reg.pmc_umask
#define pmc_thres pmc_ita2_counter_reg.pmc_thres
#define pmc_ism pmc_ita2_counter_reg.pmc_ism
static char * pfm_ita2_get_event_name(unsigned int i); /* * Description of the PMC register mappings used by * this module (as reported in pfmlib_reg_t.reg_num): * * 0 -> PMC0 * 1 -> PMC1 * n -> PMCn * * The following are in the model specific rr_br[]: * IBR0 -> 0 * IBR1 -> 1 * ...
* IBR7 -> 7 * DBR0 -> 0 * DBR1 -> 1 * ... * DBR7 -> 7 * * We do not use a mapping table, instead we make up the * values on the fly given the base. */ /* * The Itanium2 PMU has a bug in the fine mode implementation. * It only sees ranges with a granularity of two bundles. * So we prepare for the day they fix it. */ static int has_fine_mode_bug; static int pfm_ita2_detect(void) { int tmp; int ret = PFMLIB_ERR_NOTSUPP; tmp = pfm_ia64_get_cpu_family(); if (tmp == 0x1f) { has_fine_mode_bug = 1; ret = PFMLIB_SUCCESS; } return ret; } /* * Check the event for incompatibilities. This is useful * for L1 and L2 related events. Due to wire limitations, * some cache events are separated into sets. There * are 5 sets for the L1D cache group and 6 sets for L2 group. * It is NOT possible to simultaneously measure events from * different sets within a group. For instance, you cannot * measure events from set0 and set1 in L1D cache group. However * it is possible to measure set0 in L1D and set1 in L2 at the same * time. * * This function verifies that the set constraints are respected. */ static int check_cross_groups_and_umasks(pfmlib_input_param_t *inp) { unsigned long ref_umask, umask; int g, s; unsigned int cnt = inp->pfp_event_count; pfmlib_event_t *e = inp->pfp_events; unsigned int i, j; /* * XXX: could possibly be optimized */ for (i=0; i < cnt; i++) { g = evt_grp(e[i].event); s = evt_set(e[i].event); if (g == PFMLIB_ITA2_EVT_NO_GRP) continue; ref_umask = evt_umask(e[i].event); for (j=i+1; j < cnt; j++) { if (evt_grp(e[j].event) != g) continue; if (evt_set(e[j].event) != s) return PFMLIB_ERR_EVTSET; /* only care about L2 cache group */ if (g != PFMLIB_ITA2_EVT_L2_CACHE_GRP || (s == 1 || s == 2)) continue; umask = evt_umask(e[j].event); /* * there is no assignment possible if the event in PMC4 * has a umask (ref_umask) and an event (from the same * set) also has a umask AND it is different.
For some * sets, the umasks are shared, therefore the value * programmed into PMC4 determines the umask for all * the other events (with umask) from the set. */ if (umask && ref_umask != umask) return PFMLIB_ERR_NOASSIGN; } } return PFMLIB_SUCCESS; } /* * Certain prefetch events must be treated specially when instruction range restriction * is in use because they can only be constrained by IBRP1 in fine-mode. Other events * will use IBRP0 if tagged as a demand fetch OR IBRP1 if tagged as a prefetch match. * From the library's point of view there is no way of distinguishing this, so we leave * it up to the user to interpret the results. * * Events which can be qualified by the two pairs depending on their tag: * - IBP_BUNPAIRS_IN * - L1I_FETCH_RAB_HIT * - L1I_FETCH_ISB_HIT * - L1I_FILLS * * This function returns the number of qualifying prefetch events found. * * XXX: not clear which events do qualify as prefetch events. */ static int prefetch_events[]={ PME_ITA2_L1I_PREFETCHES, PME_ITA2_L1I_STRM_PREFETCHES, PME_ITA2_L2_INST_PREFETCHES }; #define NPREFETCH_EVENTS sizeof(prefetch_events)/sizeof(int) static int check_prefetch_events(pfmlib_input_param_t *inp) { int code; int prefetch_codes[NPREFETCH_EVENTS]; unsigned int i, j, count; int c; int found = 0; for(i=0; i < NPREFETCH_EVENTS; i++) { pfm_get_event_code(prefetch_events[i], &code); prefetch_codes[i] = code; } count = inp->pfp_event_count; for(i=0; i < count; i++) { pfm_get_event_code(inp->pfp_events[i].event, &c); for(j=0; j < NPREFETCH_EVENTS; j++) { if (c == prefetch_codes[j]) found++; } } return found; } /* * IA64_INST_RETIRED (and subevents) is the only event which can be measured on all * 4 IBR when non-fine mode is not possible.
* * This function returns: * - the number of events matching the IA64_INST_RETIRED code * - in retired_mask the bottom 4 bits indicates which of the 4 INST_RETIRED event * is present */ static unsigned int check_inst_retired_events(pfmlib_input_param_t *inp, unsigned long *retired_mask) { int code; int c, ret; unsigned int i, count, found = 0; unsigned long umask, mask; pfm_get_event_code(PME_ITA2_IA64_INST_RETIRED_THIS, &code); count = inp->pfp_event_count; mask = 0; for(i=0; i < count; i++) { ret = pfm_get_event_code(inp->pfp_events[i].event, &c); if (c == code) { ret = pfm_ita2_get_event_umask(inp->pfp_events[i].event, &umask); if (ret != PFMLIB_SUCCESS) break; switch(umask) { case 0: mask |= 1; break; case 1: mask |= 2; break; case 2: mask |= 4; break; case 3: mask |= 8; break; } found++; } } if (retired_mask) *retired_mask = mask; return found; } static int check_fine_mode_possible(pfmlib_ita2_input_rr_t *rr, int n) { pfmlib_ita2_input_rr_desc_t *lim = rr->rr_limits; int i; for(i=0; i < n; i++) { if ((lim[i].rr_start & FINE_MODE_MASK) != (lim[i].rr_end & FINE_MODE_MASK)) return 0; } return 1; } /* * mode = 0 -> check code (enforce bundle alignment) * mode = 1 -> check data */ static int check_intervals(pfmlib_ita2_input_rr_t *irr, int mode, unsigned int *n_intervals) { unsigned int i; pfmlib_ita2_input_rr_desc_t *lim = irr->rr_limits; for(i=0; i < 4; i++) { /* end marker */ if (lim[i].rr_start == 0 && lim[i].rr_end == 0) break; /* invalid entry */ if (lim[i].rr_start >= lim[i].rr_end) return PFMLIB_ERR_IRRINVAL; if (mode == 0 && (lim[i].rr_start & 0xf || lim[i].rr_end & 0xf)) return PFMLIB_ERR_IRRALIGN; } *n_intervals = i; return PFMLIB_SUCCESS; } static int valid_assign(pfmlib_event_t *e, unsigned int *as, pfmlib_regmask_t *r_pmcs, unsigned int cnt) { unsigned long pmc4_umask = 0, umask; char *name; int l1_grp_present = 0, l2_grp_present = 0; unsigned int i; int c, failure; int need_pmc5, need_pmc4; int pmc5_evt = -1, pmc4_evt = -1; if (PFMLIB_DEBUG()) { 
unsigned int j; for(j=0;jpfp_event_count; for(i=0; i < count; i++) { for (j=0; j < NCANCEL_EVENTS; j++) { pfm_get_event_code(inp->pfp_events[i].event, &code); if (code == cancel_codes[j]) { if (idx != -1) { return PFMLIB_ERR_INVAL; } idx = inp->pfp_events[i].event; } } } return PFMLIB_SUCCESS; } /* * Automatically dispatch events to corresponding counters following constraints. * Upon return the pfarg_regt structure is ready to be submitted to kernel */ static int pfm_ita2_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in, pfmlib_output_param_t *outp) { #define has_counter(e,b) (itanium2_pe[e].pme_counters & (1 << (b)) ? (b) : 0) pfmlib_ita2_input_param_t *param = mod_in; pfm_ita2_pmc_reg_t reg; pfmlib_event_t *e; pfmlib_reg_t *pc, *pd; pfmlib_regmask_t *r_pmcs; unsigned int i,j,k,l; int ret; unsigned int max_l0, max_l1, max_l2, max_l3; unsigned int assign[PMU_ITA2_NUM_COUNTERS]; unsigned int m, cnt; e = inp->pfp_events; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; cnt = inp->pfp_event_count; r_pmcs = &inp->pfp_unavail_pmcs; if (PFMLIB_DEBUG()) for (m=0; m < cnt; m++) { DPRINT("ev[%d]=%s counters=0x%lx\n", m, itanium2_pe[e[m].event].pme_name, itanium2_pe[e[m].event].pme_counters); } if (cnt > PMU_ITA2_NUM_COUNTERS) return PFMLIB_ERR_TOOMANY; ret = check_cross_groups_and_umasks(inp); if (ret != PFMLIB_SUCCESS) return ret; ret = check_cancel_events(inp); if (ret != PFMLIB_SUCCESS) return ret; max_l0 = PMU_ITA2_FIRST_COUNTER + PMU_ITA2_NUM_COUNTERS; max_l1 = PMU_ITA2_FIRST_COUNTER + PMU_ITA2_NUM_COUNTERS*(cnt>1); max_l2 = PMU_ITA2_FIRST_COUNTER + PMU_ITA2_NUM_COUNTERS*(cnt>2); max_l3 = PMU_ITA2_FIRST_COUNTER + PMU_ITA2_NUM_COUNTERS*(cnt>3); DPRINT("max_l0=%u max_l1=%u max_l2=%u max_l3=%u\n", max_l0, max_l1, max_l2, max_l3); /* * For now, worst case in the loop nest: 4! 
(factorial) */ for (i=PMU_ITA2_FIRST_COUNTER; i < max_l0; i++) { assign[0] = has_counter(e[0].event,i); if (max_l1 == PMU_ITA2_FIRST_COUNTER && valid_assign(e, assign, r_pmcs, cnt) == PFMLIB_SUCCESS) goto done; for (j=PMU_ITA2_FIRST_COUNTER; j < max_l1; j++) { if (j == i) continue; assign[1] = has_counter(e[1].event,j); if (max_l2 == PMU_ITA2_FIRST_COUNTER && valid_assign(e, assign, r_pmcs, cnt) == PFMLIB_SUCCESS) goto done; for (k=PMU_ITA2_FIRST_COUNTER; k < max_l2; k++) { if(k == i || k == j) continue; assign[2] = has_counter(e[2].event,k); if (max_l3 == PMU_ITA2_FIRST_COUNTER && valid_assign(e, assign, r_pmcs, cnt) == PFMLIB_SUCCESS) goto done; for (l=PMU_ITA2_FIRST_COUNTER; l < max_l3; l++) { if(l == i || l == j || l == k) continue; assign[3] = has_counter(e[3].event,l); if (valid_assign(e, assign, r_pmcs, cnt) == PFMLIB_SUCCESS) goto done; } } } } /* we cannot satisfy the constraints */ return PFMLIB_ERR_NOASSIGN; done: for (j=0; j < cnt ; j++ ) { reg.pmc_val = 0; /* clear all, bits 26-27 must be zero for proper operations */ /* if plm is 0, then assume not specified per-event and use default */ reg.pmc_plm = inp->pfp_events[j].plm ? inp->pfp_events[j].plm : inp->pfp_dfl_plm; reg.pmc_oi = 1; /* overflow interrupt */ reg.pmc_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc_thres = param ? param->pfp_ita2_counters[j].thres: 0; reg.pmc_ism = param ? param->pfp_ita2_counters[j].ism : PFMLIB_ITA2_ISM_BOTH; reg.pmc_umask = is_ear(e[j].event) ? 0x0 : itanium2_pe[e[j].event].pme_umask; reg.pmc_es = itanium2_pe[e[j].event].pme_code; /* * Note that we don't force PMC4.pmc_ena = 1 because the kernel takes care of this for us. 
 * This way we don't have to program something in PMC4 even when we don't use it
 */
		pc[j].reg_num   = assign[j];
		pc[j].reg_value = reg.pmc_val;
		pc[j].reg_addr  = pc[j].reg_alt_addr = assign[j];

		pd[j].reg_num  = assign[j];
		pd[j].reg_addr = pd[j].reg_alt_addr = assign[j];

		__pfm_vbprintf("[PMC%u(pmc%u)=0x%06lx thres=%d es=0x%02x plm=%d umask=0x%x pm=%d ism=0x%x oi=%d] %s\n",
			assign[j], assign[j],
			reg.pmc_val,
			reg.pmc_thres,
			reg.pmc_es, reg.pmc_plm,
			reg.pmc_umask,
			reg.pmc_pm,
			reg.pmc_ism,
			reg.pmc_oi,
			itanium2_pe[e[j].event].pme_name);

		__pfm_vbprintf("[PMD%u(pmd%u)]\n", pd[j].reg_num, pd[j].reg_num);
	}
	/* number of PMC registers programmed */
	outp->pfp_pmc_count = cnt;
	outp->pfp_pmd_count = cnt;

	return PFMLIB_SUCCESS;
}

static int
pfm_dispatch_iear(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in, pfmlib_output_param_t *outp)
{
	pfm_ita2_pmc_reg_t reg;
	pfmlib_ita2_input_param_t *param = mod_in;
	pfmlib_reg_t *pc, *pd;
	pfmlib_ita2_input_param_t fake_param;
	unsigned int pos1, pos2;
	unsigned int i, count;

	pc   = outp->pfp_pmcs;
	pd   = outp->pfp_pmds;
	pos1 = outp->pfp_pmc_count;
	pos2 = outp->pfp_pmd_count;

	count = inp->pfp_event_count;
	for (i=0; i < count; i++) {
		if (is_iear(inp->pfp_events[i].event)) break;
	}

	if (param == NULL || param->pfp_ita2_iear.ear_used == 0) {
		/*
		 * case 3: no I-EAR event, no (or nothing) in param->pfp_ita2_iear.ear_used
		 */
		if (i == count) return PFMLIB_SUCCESS;

		memset(&fake_param, 0, sizeof(fake_param));
		param = &fake_param;

		/*
		 * case 1: extract all information for event (name)
		 */
		pfm_ita2_get_ear_mode(inp->pfp_events[i].event, &param->pfp_ita2_iear.ear_mode);
		param->pfp_ita2_iear.ear_umask = evt_umask(inp->pfp_events[i].event);
		param->pfp_ita2_iear.ear_ism   = PFMLIB_ITA2_ISM_BOTH; /* force both instruction sets */
		DPRINT("I-EAR event with no info\n");
	}

	/*
	 * case 2: ear_used=1, event is defined, we use the param info as it is more precise
	 * case 4: ear_used=1, no event (free running I-EAR), use param info
	 */
	reg.pmc_val = 0;

	if
(param->pfp_ita2_iear.ear_mode == PFMLIB_ITA2_EAR_TLB_MODE) { /* if plm is 0, then assume not specified per-event and use default */ reg.pmc10_ita2_tlb_reg.iear_plm = param->pfp_ita2_iear.ear_plm ? param->pfp_ita2_iear.ear_plm : inp->pfp_dfl_plm; reg.pmc10_ita2_tlb_reg.iear_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc10_ita2_tlb_reg.iear_ct = 0x0; reg.pmc10_ita2_tlb_reg.iear_umask = param->pfp_ita2_iear.ear_umask; reg.pmc10_ita2_tlb_reg.iear_ism = param->pfp_ita2_iear.ear_ism; } else if (param->pfp_ita2_iear.ear_mode == PFMLIB_ITA2_EAR_CACHE_MODE) { /* if plm is 0, then assume not specified per-event and use default */ reg.pmc10_ita2_cache_reg.iear_plm = param->pfp_ita2_iear.ear_plm ? param->pfp_ita2_iear.ear_plm : inp->pfp_dfl_plm; reg.pmc10_ita2_cache_reg.iear_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc10_ita2_cache_reg.iear_ct = 0x1; reg.pmc10_ita2_cache_reg.iear_umask = param->pfp_ita2_iear.ear_umask; reg.pmc10_ita2_cache_reg.iear_ism = param->pfp_ita2_iear.ear_ism; } else { DPRINT("ALAT mode not supported in I-EAR mode\n"); return PFMLIB_ERR_INVAL; } if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 10)) return PFMLIB_ERR_NOASSIGN; pc[pos1].reg_num = 10; /* PMC10 is I-EAR config register */ pc[pos1].reg_value = reg.pmc_val; pc[pos1].reg_addr = pc[pos1].reg_alt_addr = 10; pos1++; pd[pos2].reg_num = 0; pd[pos2].reg_addr = pd[pos2].reg_alt_addr= 0; pos2++; pd[pos2].reg_num = 1; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 1; pos2++; if (param->pfp_ita2_iear.ear_mode == PFMLIB_ITA2_EAR_TLB_MODE) { __pfm_vbprintf("[PMC10(pmc10)=0x%lx ctb=tlb plm=%d pm=%d ism=0x%x umask=0x%x]\n", reg.pmc_val, reg.pmc10_ita2_tlb_reg.iear_plm, reg.pmc10_ita2_tlb_reg.iear_pm, reg.pmc10_ita2_tlb_reg.iear_ism, reg.pmc10_ita2_tlb_reg.iear_umask); } else { __pfm_vbprintf("[PMC10(pmc10)=0x%lx ctb=cache plm=%d pm=%d ism=0x%x umask=0x%x]\n", reg.pmc_val, reg.pmc10_ita2_cache_reg.iear_plm, reg.pmc10_ita2_cache_reg.iear_pm, reg.pmc10_ita2_cache_reg.iear_ism, 
reg.pmc10_ita2_cache_reg.iear_umask);
	}

	__pfm_vbprintf("[PMD0(pmd0)]\n[PMD1(pmd1)]\n");

	/* update final number of entries used */
	outp->pfp_pmc_count = pos1;
	outp->pfp_pmd_count = pos2;

	return PFMLIB_SUCCESS;
}

static int
pfm_dispatch_dear(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in, pfmlib_output_param_t *outp)
{
	pfm_ita2_pmc_reg_t reg;
	pfmlib_ita2_input_param_t *param = mod_in;
	pfmlib_reg_t *pc, *pd;
	pfmlib_ita2_input_param_t fake_param;
	unsigned int pos1, pos2;
	unsigned int i, count;

	pc   = outp->pfp_pmcs;
	pd   = outp->pfp_pmds;
	pos1 = outp->pfp_pmc_count;
	pos2 = outp->pfp_pmd_count;

	count = inp->pfp_event_count;
	for (i=0; i < count; i++) {
		if (is_dear(inp->pfp_events[i].event)) break;
	}

	if (param == NULL || param->pfp_ita2_dear.ear_used == 0) {
		/*
		 * case 3: no D-EAR event, no (or nothing) in param->pfp_ita2_dear.ear_used
		 */
		if (i == count) return PFMLIB_SUCCESS;

		memset(&fake_param, 0, sizeof(fake_param));
		param = &fake_param;

		/*
		 * case 1: extract all information for event (name)
		 */
		pfm_ita2_get_ear_mode(inp->pfp_events[i].event, &param->pfp_ita2_dear.ear_mode);
		param->pfp_ita2_dear.ear_umask = evt_umask(inp->pfp_events[i].event);
		param->pfp_ita2_dear.ear_ism   = PFMLIB_ITA2_ISM_BOTH; /* force both instruction sets */
		DPRINT("D-EAR event with no info\n");
	}

	/* sanity check on the mode */
	if (   param->pfp_ita2_dear.ear_mode != PFMLIB_ITA2_EAR_CACHE_MODE
	    && param->pfp_ita2_dear.ear_mode != PFMLIB_ITA2_EAR_TLB_MODE
	    && param->pfp_ita2_dear.ear_mode != PFMLIB_ITA2_EAR_ALAT_MODE)
		return PFMLIB_ERR_INVAL;

	/*
	 * case 2: ear_used=1, event is defined, we use the param info as it is more precise
	 * case 4: ear_used=1, no event (free running D-EAR), use param info
	 */
	reg.pmc_val = 0;

	/* if plm is 0, then assume not specified per-event and use default */
	reg.pmc11_ita2_reg.dear_plm = param->pfp_ita2_dear.ear_plm ? param->pfp_ita2_dear.ear_plm : inp->pfp_dfl_plm;
	reg.pmc11_ita2_reg.dear_pm  = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ?
1 : 0;
	reg.pmc11_ita2_reg.dear_mode  = param->pfp_ita2_dear.ear_mode;
	reg.pmc11_ita2_reg.dear_umask = param->pfp_ita2_dear.ear_umask;
	reg.pmc11_ita2_reg.dear_ism   = param->pfp_ita2_dear.ear_ism;

	if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 11))
		return PFMLIB_ERR_NOASSIGN;

	pc[pos1].reg_num   = 11; /* PMC11 is D-EAR config register */
	pc[pos1].reg_value = reg.pmc_val;
	pc[pos1].reg_addr  = pc[pos1].reg_alt_addr = 11;
	pos1++;

	pd[pos2].reg_num  = 2;
	pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 2;
	pos2++;

	pd[pos2].reg_num  = 3;
	pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 3;
	pos2++;

	pd[pos2].reg_num  = 17;
	pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 17;
	pos2++;

	__pfm_vbprintf("[PMC11(pmc11)=0x%lx mode=%s plm=%d pm=%d ism=0x%x umask=0x%x]\n",
			reg.pmc_val,
			reg.pmc11_ita2_reg.dear_mode == 0 ? "L1D" :
			(reg.pmc11_ita2_reg.dear_mode == 1 ? "L1DTLB" : "ALAT"),
			reg.pmc11_ita2_reg.dear_plm,
			reg.pmc11_ita2_reg.dear_pm,
			reg.pmc11_ita2_reg.dear_ism,
			reg.pmc11_ita2_reg.dear_umask);

	__pfm_vbprintf("[PMD2(pmd2)]\n[PMD3(pmd3)]\n[PMD17(pmd17)]\n");

	/* update final number of entries used */
	outp->pfp_pmc_count = pos1;
	outp->pfp_pmd_count = pos2;

	return PFMLIB_SUCCESS;
}

static int
pfm_dispatch_opcm(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in, pfmlib_output_param_t *outp, pfmlib_ita2_output_param_t *mod_out)
{
	pfmlib_ita2_input_param_t *param = mod_in;
	pfmlib_reg_t *pc = outp->pfp_pmcs;
	pfm_ita2_pmc_reg_t reg, pmc15;
	unsigned int i, has_1st_pair, has_2nd_pair, count;
	unsigned int pos = outp->pfp_pmc_count;

	if (param == NULL) return PFMLIB_SUCCESS;

	/* not constrained by PMC8 nor PMC9 */
	pmc15.pmc_val = 0xffffffff; /* XXX: use PAL instead. PAL value is 0xfffffff0 */

	if (param->pfp_ita2_irange.rr_used && mod_out == NULL) return PFMLIB_ERR_INVAL;

	if (param->pfp_ita2_pmc8.opcm_used
	    || (param->pfp_ita2_irange.rr_used && mod_out->pfp_ita2_irange.rr_nbr_used != 0)) {

		reg.pmc_val = param->pfp_ita2_pmc8.opcm_used ?
param->pfp_ita2_pmc8.pmc_val : 0xffffffff3fffffff; if (param->pfp_ita2_irange.rr_used) { reg.pmc8_9_ita2_reg.opcm_ig_ad = 0; reg.pmc8_9_ita2_reg.opcm_inv = param->pfp_ita2_irange.rr_flags & PFMLIB_ITA2_RR_INV ? 1 : 0; } else { /* clear range restriction fields when none is used */ reg.pmc8_9_ita2_reg.opcm_ig_ad = 1; reg.pmc8_9_ita2_reg.opcm_inv = 0; } /* force bit 2 to 1 */ reg.pmc8_9_ita2_reg.opcm_bit2 = 1; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 8)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 8; pc[pos].reg_value = reg.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 8; pos++; /* * will be constrained by PMC8 */ if (param->pfp_ita2_pmc8.opcm_used) { has_1st_pair = has_2nd_pair = 0; count = inp->pfp_event_count; for(i=0; i < count; i++) { if (inp->pfp_events[i].event == PME_ITA2_IA64_TAGGED_INST_RETIRED_IBRP0_PMC8) has_1st_pair=1; if (inp->pfp_events[i].event == PME_ITA2_IA64_TAGGED_INST_RETIRED_IBRP2_PMC8) has_2nd_pair=1; } if (has_1st_pair || has_2nd_pair == 0) pmc15.pmc15_ita2_reg.opcmc_ibrp0_pmc8 = 0; if (has_2nd_pair || has_1st_pair == 0) pmc15.pmc15_ita2_reg.opcmc_ibrp2_pmc8 = 0; } __pfm_vbprintf("[PMC8(pmc8)=0x%lx m=%d i=%d f=%d b=%d match=0x%x mask=0x%x inv=%d ig_ad=%d]\n", reg.pmc_val, reg.pmc8_9_ita2_reg.opcm_m, reg.pmc8_9_ita2_reg.opcm_i, reg.pmc8_9_ita2_reg.opcm_f, reg.pmc8_9_ita2_reg.opcm_b, reg.pmc8_9_ita2_reg.opcm_match, reg.pmc8_9_ita2_reg.opcm_mask, reg.pmc8_9_ita2_reg.opcm_inv, reg.pmc8_9_ita2_reg.opcm_ig_ad); } if (param->pfp_ita2_pmc9.opcm_used) { /* * PMC9 can only be used to qualify IA64_INST_RETIRED_* events */ if (check_inst_retired_events(inp, NULL) != inp->pfp_event_count) return PFMLIB_ERR_FEATCOMB; reg.pmc_val = param->pfp_ita2_pmc9.pmc_val; /* ig_ad, inv are ignored for PMC9, to avoid confusion we force default values */ reg.pmc8_9_ita2_reg.opcm_ig_ad = 1; reg.pmc8_9_ita2_reg.opcm_inv = 0; /* force bit 2 to 1 */ reg.pmc8_9_ita2_reg.opcm_bit2 = 1; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 9)) return PFMLIB_ERR_NOASSIGN; 
pc[pos].reg_num = 9; pc[pos].reg_value = reg.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 9; pos++; /* * will be constrained by PMC9 */ has_1st_pair = has_2nd_pair = 0; count = inp->pfp_event_count; for(i=0; i < count; i++) { if (inp->pfp_events[i].event == PME_ITA2_IA64_TAGGED_INST_RETIRED_IBRP1_PMC9) has_1st_pair=1; if (inp->pfp_events[i].event == PME_ITA2_IA64_TAGGED_INST_RETIRED_IBRP3_PMC9) has_2nd_pair=1; } if (has_1st_pair || has_2nd_pair == 0) pmc15.pmc15_ita2_reg.opcmc_ibrp1_pmc9 = 0; if (has_2nd_pair || has_1st_pair == 0) pmc15.pmc15_ita2_reg.opcmc_ibrp3_pmc9 = 0; __pfm_vbprintf("[PMC9(pmc9)=0x%lx m=%d i=%d f=%d b=%d match=0x%x mask=0x%x]\n", reg.pmc_val, reg.pmc8_9_ita2_reg.opcm_m, reg.pmc8_9_ita2_reg.opcm_i, reg.pmc8_9_ita2_reg.opcm_f, reg.pmc8_9_ita2_reg.opcm_b, reg.pmc8_9_ita2_reg.opcm_match, reg.pmc8_9_ita2_reg.opcm_mask); } if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 15)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 15; pc[pos].reg_value = pmc15.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 15; pos++; __pfm_vbprintf("[PMC15(pmc15)=0x%lx ibrp0_pmc8=%d ibrp1_pmc9=%d ibrp2_pmc8=%d ibrp3_pmc9=%d]\n", pmc15.pmc_val, pmc15.pmc15_ita2_reg.opcmc_ibrp0_pmc8, pmc15.pmc15_ita2_reg.opcmc_ibrp1_pmc9, pmc15.pmc15_ita2_reg.opcmc_ibrp2_pmc8, pmc15.pmc15_ita2_reg.opcmc_ibrp3_pmc9); outp->pfp_pmc_count = pos; return PFMLIB_SUCCESS; } static int pfm_dispatch_btb(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfmlib_event_t *e= inp->pfp_events; pfm_ita2_pmc_reg_t reg; pfmlib_ita2_input_param_t *param = mod_in; pfmlib_reg_t *pc, *pd; pfmlib_ita2_input_param_t fake_param; int found_btb = 0, found_bad_dear = 0; int has_btb_param; unsigned int i, pos1, pos2; unsigned int count; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; pos1 = outp->pfp_pmc_count; pos2 = outp->pfp_pmd_count; /* * explicit BTB settings */ has_btb_param = param && param->pfp_ita2_btb.btb_used; reg.pmc_val = 0UL; /* * we need to scan all events looking 
for DEAR ALAT/TLB due to incompatibility
	 */
	count = inp->pfp_event_count;
	for (i=0; i < count; i++) {
		/* keep track of the presence of a BTB event */
		if (is_btb(e[i].event)) found_btb = 1;

		/* look for D-EAR TLB or D-EAR ALAT */
		if (is_dear(e[i].event) && (is_ear_tlb(e[i].event) || is_ear_alat(e[i].event))) {
			found_bad_dear = 1;
		}
	}

	DPRINT("found_btb=%d found_bad_dear=%d\n", found_btb, found_bad_dear);

	/*
	 * did not find D-EAR TLB/ALAT event, need to check param structure
	 */
	if (found_bad_dear == 0 && param && param->pfp_ita2_dear.ear_used == 1) {
		if (   param->pfp_ita2_dear.ear_mode == PFMLIB_ITA2_EAR_TLB_MODE
		    || param->pfp_ita2_dear.ear_mode == PFMLIB_ITA2_EAR_ALAT_MODE)
			found_bad_dear = 1;
	}

	/*
	 * no explicit BTB event and no special case to deal with (cover part of case 3)
	 */
	if (found_btb == 0 && has_btb_param == 0 && found_bad_dear == 0) return PFMLIB_SUCCESS;

	if (has_btb_param == 0) {
		/*
		 * case 3: no BTB event, btb_used=0 but found_bad_dear=1, need to cleanup PMC12
		 */
		if (found_btb == 0) goto assign_zero;

		/*
		 * case 1: we have a BTB event but no param, default setting is to capture
		 * all branches.
		 */
		memset(&fake_param, 0, sizeof(fake_param));
		param = &fake_param;

		param->pfp_ita2_btb.btb_ds  = 0;   /* capture branch targets */
		param->pfp_ita2_btb.btb_tm  = 0x3; /* all branches */
		param->pfp_ita2_btb.btb_ptm = 0x3; /* all branches */
		param->pfp_ita2_btb.btb_ppm = 0x3; /* all branches */
		param->pfp_ita2_btb.btb_brt = 0x0; /* all branches */
		DPRINT("BTB event with no info\n");
	}

	/*
	 * case 2: BTB event in the list, param provided
	 * case 4: no BTB event, param provided (free running mode)
	 */
	reg.pmc12_ita2_reg.btbc_plm = param->pfp_ita2_btb.btb_plm ? param->pfp_ita2_btb.btb_plm : inp->pfp_dfl_plm;
	reg.pmc12_ita2_reg.btbc_pm  = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ?
1 : 0;
	reg.pmc12_ita2_reg.btbc_ds  = param->pfp_ita2_btb.btb_ds & 0x1;
	reg.pmc12_ita2_reg.btbc_tm  = param->pfp_ita2_btb.btb_tm & 0x3;
	reg.pmc12_ita2_reg.btbc_ptm = param->pfp_ita2_btb.btb_ptm & 0x3;
	reg.pmc12_ita2_reg.btbc_ppm = param->pfp_ita2_btb.btb_ppm & 0x3;
	reg.pmc12_ita2_reg.btbc_brt = param->pfp_ita2_btb.btb_brt & 0x3;

	/*
	 * if D-EAR ALAT or D-EAR TLB is set then PMC12 must be set to zero (see documentation p. 87):
	 * D-EAR ALAT/TLB and BTB cannot be used at the same time.
	 * From the documentation: PMC12 must be zero in this mode; else the wrong IP is recorded
	 * for misses coming right after a mispredicted branch.
	 *
	 * D-EAR cache is fine.
	 */
assign_zero:
	if (found_bad_dear && reg.pmc_val != 0UL) return PFMLIB_ERR_EVTINCOMP;

	if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 12))
		return PFMLIB_ERR_NOASSIGN;

	memset(pc+pos1, 0, sizeof(pfmlib_reg_t));

	pc[pos1].reg_num   = 12;
	pc[pos1].reg_value = reg.pmc_val;
	pc[pos1].reg_addr  = pc[pos1].reg_alt_addr = 12;
	pos1++;

	__pfm_vbprintf("[PMC12(pmc12)=0x%lx plm=%d pm=%d ds=%d tm=%d ptm=%d ppm=%d brt=%d]\n",
			reg.pmc_val,
			reg.pmc12_ita2_reg.btbc_plm,
			reg.pmc12_ita2_reg.btbc_pm,
			reg.pmc12_ita2_reg.btbc_ds,
			reg.pmc12_ita2_reg.btbc_tm,
			reg.pmc12_ita2_reg.btbc_ptm,
			reg.pmc12_ita2_reg.btbc_ppm,
			reg.pmc12_ita2_reg.btbc_brt);

	/*
	 * only add BTB PMD when actually using BTB.
 * Not needed when dealing with D-EAR TLB and D-EAR ALAT
 * PMC12 restriction
 */
	if (found_btb || has_btb_param) {
		/*
		 * PMD16 is included in list of used PMD
		 */
		for(i=8; i < 17; i++, pos2++) {
			pd[pos2].reg_num  = i;
			pd[pos2].reg_addr = pd[pos2].reg_alt_addr = i;
			__pfm_vbprintf("[PMD%u(pmd%u)]\n", pd[pos2].reg_num, pd[pos2].reg_num);
		}
	}

	/* update final number of entries used */
	outp->pfp_pmc_count = pos1;
	outp->pfp_pmd_count = pos2;

	return PFMLIB_SUCCESS;
}

static void
do_normal_rr(unsigned long start, unsigned long end,
	     pfmlib_reg_t *br, int nbr, int dir, int *idx, int *reg_idx, int plm)
{
	unsigned long size, l_addr, c;
	unsigned long l_offs = 0, r_offs = 0;
	unsigned long l_size, r_size;
	dbreg_t db;
	int p2;

	if (nbr < 1 || end <= start) return;

	size = end - start;

	DPRINT("start=0x%016lx end=0x%016lx size=0x%lx bytes (%lu bundles) nbr=%d dir=%d\n", start, end, size, size >> 4, nbr, dir);

	p2 = pfm_ia64_fls(size);

	c = ALIGN_DOWN(end, p2);

	DPRINT("largest power of two possible: 2^%d=0x%lx, crossing=0x%016lx\n", p2, 1UL << p2, c);

	/*
	 * [reconstructed span] pick the largest power-of-two chunk that fits in the range
	 */
	if ((c - (1UL<<p2)) >= start) {
		l_addr = c - (1UL << p2);
	} else {
		p2--;
		if ((c + (1UL<<p2)) <= end) {
			l_addr = c;
		} else {
			l_addr = c - (1UL << p2);
		}
	}

	l_size = l_addr - start;
	r_size = end - l_addr - (1UL<<p2);

	/*
	 * [reconstructed span] with only one register pair left, double the chunk size
	 * and accept overshooting the requested range in the allowed direction
	 */
	if (dir == 0 && l_size != 0 && nbr == 1) {
		p2++;
		l_addr = end - (1UL << p2);
		if (PFMLIB_DEBUG()) {
			l_offs = start - l_addr;
			printf(">>l_offs: 0x%lx\n", l_offs);
		}
	} else if (dir == 1 && r_size != 0 && nbr == 1) {
		p2++;
		l_addr = start;
		if (PFMLIB_DEBUG()) {
			r_offs = l_addr+(1UL<<p2) - end;
			printf(">>r_offs: 0x%lx\n", r_offs);
		}
	}

	l_size = l_addr - start;
	r_size = end - l_addr-(1UL<<p2);

	if (PFMLIB_DEBUG()) {
		printf(">>largest chunk: 2^%d @0x%016lx-0x%016lx\n", p2, l_addr, l_addr+(1UL<<p2));
		if (l_size && !l_offs) printf(">>before: 0x%016lx-0x%016lx\n", start, l_addr);
		if (r_size && !r_offs) printf(">>after : 0x%016lx-0x%016lx\n", l_addr+(1UL<<p2), end);
	}

	/*
	 * [reconstructed span] program one register pair for the chunk
	 */
	db.val        = 0;
	db.db.db_mask = ~((1UL << p2)-1);
	db.db.db_plm  = plm;

	br[*idx].reg_num     = *reg_idx;
	br[*idx].reg_value   = l_addr;
	br[*idx].reg_addr    = br[*idx].reg_alt_addr = *reg_idx;

	br[*idx+1].reg_num   = *reg_idx+1;
	br[*idx+1].reg_value = db.val;
	br[*idx+1].reg_addr  = br[*idx+1].reg_alt_addr = *reg_idx+1;

	*idx     += 2;
	*reg_idx += 2;

	nbr--;
	if (nbr) {
		int l_nbr, r_nbr;

		l_nbr = r_nbr = nbr>>1;
		if (nbr & 0x1) {
			/*
			 * our simple heuristic is:
			 * we assign the largest number of registers to the largest
			 * of the two chunks
			 */
			if (l_size > r_size) {
				l_nbr++;
			} else {
				r_nbr++;
			}
		}
		do_normal_rr(start, l_addr, br, l_nbr, 0, idx, reg_idx, plm);
		do_normal_rr(l_addr+(1UL<<p2), end, br, r_nbr, 1, idx, reg_idx, plm);
	}
}

static void
print_one_range(pfmlib_ita2_input_rr_desc_t *in_rr, pfmlib_ita2_output_rr_desc_t *out_rr,
		pfmlib_reg_t *dbr, int base_idx, int n_pairs, int fine_mode, unsigned int rr_flags)
{
	dbreg_t d;
	unsigned long r_end;
	int j;

	__pfm_vbprintf("range [0x%016lx-0x%016lx): %d pair(s)%s%s\n",
			in_rr->rr_start, in_rr->rr_end,
			n_pairs,
			fine_mode ? ", fine_mode" : "",
			rr_flags & PFMLIB_ITA2_RR_INV ?
", inversed" : ""); __pfm_vbprintf("start offset: -0x%lx end_offset: +0x%lx\n", out_rr->rr_soff, out_rr->rr_eoff); for (j=0; j < n_pairs; j++, base_idx+=2) { d.val = dbr[base_idx+1].reg_value; r_end = dbr[base_idx].reg_value+((~(d.db.db_mask)) & ~(0xffUL << 56)); if (fine_mode) __pfm_vbprintf("brp%u: db%u: 0x%016lx db%u: plm=0x%x mask=0x%016lx\n", dbr[base_idx].reg_num>>1, dbr[base_idx].reg_num, dbr[base_idx].reg_value, dbr[base_idx+1].reg_num, d.db.db_plm, d.db.db_mask); else __pfm_vbprintf("brp%u: db%u: 0x%016lx db%u: plm=0x%x mask=0x%016lx end=0x%016lx\n", dbr[base_idx].reg_num>>1, dbr[base_idx].reg_num, dbr[base_idx].reg_value, dbr[base_idx+1].reg_num, d.db.db_plm, d.db.db_mask, r_end); } } /* * base_idx = base register index to use (for IBRP1, base_idx = 2) */ static int compute_fine_rr(pfmlib_ita2_input_rr_t *irr, int dfl_plm, int n, int *base_idx, pfmlib_ita2_output_rr_t *orr) { int i; pfmlib_reg_t *br; pfmlib_ita2_input_rr_desc_t *in_rr; pfmlib_ita2_output_rr_desc_t *out_rr; unsigned long addr; int reg_idx; dbreg_t db; in_rr = irr->rr_limits; out_rr = orr->rr_infos; br = orr->rr_br+orr->rr_nbr_used; reg_idx = *base_idx; db.val = 0; db.db.db_mask = FINE_MODE_MASK; if (n > 2) return PFMLIB_ERR_IRRTOOMANY; for (i=0; i < n; i++, reg_idx += 2, in_rr++, br+= 4) { /* * setup lower limit pair * * because of the PMU bug, we must align down to the closest bundle-pair * aligned address. 5 => 32-byte aligned address */ addr = has_fine_mode_bug ? ALIGN_DOWN(in_rr->rr_start, 5) : in_rr->rr_start; out_rr->rr_soff = in_rr->rr_start - addr; /* * adjust plm for each range */ db.db.db_plm = in_rr->rr_plm ? 
in_rr->rr_plm : (unsigned long)dfl_plm;

		br[0].reg_num   = reg_idx;
		br[0].reg_value = addr;
		br[0].reg_addr  = br[0].reg_alt_addr = reg_idx;

		br[1].reg_num   = reg_idx+1;
		br[1].reg_value = db.val;
		br[1].reg_addr  = br[1].reg_alt_addr = reg_idx+1;

		/*
		 * setup upper limit pair
		 *
		 * In fine mode, the bundle address stored in the upper limit debug
		 * registers is included in the count, so we subtract 0x10 to exclude it.
		 *
		 * because of the PMU bug, we align the (corrected) end to the nearest
		 * 32-byte aligned address + 0x10. With this correction and depending
		 * on the correction, we may count one extra bundle.
		 */
		addr = in_rr->rr_end - 0x10;

		if (has_fine_mode_bug && (addr & 0x1f) == 0) addr += 0x10;

		out_rr->rr_eoff = addr - in_rr->rr_end + 0x10;

		br[2].reg_num   = reg_idx+4;
		br[2].reg_value = addr;
		br[2].reg_addr  = br[2].reg_alt_addr = reg_idx+4;

		br[3].reg_num   = reg_idx+5;
		br[3].reg_value = db.val;
		br[3].reg_addr  = br[3].reg_alt_addr = reg_idx+5;

		if (PFMLIB_VERBOSE()) print_one_range(in_rr, out_rr, br, 0, 2, 1, irr->rr_flags);
	}
	orr->rr_nbr_used += i<<2;

	/* update base_idx, for subsequent calls */
	*base_idx = reg_idx;

	return PFMLIB_SUCCESS;
}

/*
 * base_idx = base register index to use (for IBRP1, base_idx = 2)
 */
static int
compute_single_rr(pfmlib_ita2_input_rr_t *irr, int dfl_plm, int *base_idx, pfmlib_ita2_output_rr_t *orr)
{
	unsigned long size, end, start;
	unsigned long p_start, p_end;
	pfmlib_ita2_input_rr_desc_t *in_rr;
	pfmlib_ita2_output_rr_desc_t *out_rr;
	pfmlib_reg_t *br;
	dbreg_t db;
	int reg_idx;
	int l, m;

	in_rr   = irr->rr_limits;
	out_rr  = orr->rr_infos;
	br      = orr->rr_br+orr->rr_nbr_used;
	start   = in_rr->rr_start;
	end     = in_rr->rr_end;
	size    = end - start;
	reg_idx = *base_idx;

	l = pfm_ia64_fls(size);

	m = l;
	if (size & ((1UL << l)-1)) {
		if (l>62) {
			printf("range: [0x%lx-0x%lx] too big\n", start, end);
			return PFMLIB_ERR_IRRTOOBIG;
		}
		m++;
	}

	DPRINT("size=%ld, l=%d m=%d, internal: 0x%lx full: 0x%lx\n", size, l, m, 1UL << l, 1UL << m);

	for (; m < 64; m++) {
		p_start = ALIGN_DOWN(start, m);
		p_end
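/*
 * Worked example (illustration only, not part of the original source; it assumes
 * pfm_ia64_fls() returns the index of the highest set bit and ALIGN_DOWN(a, p)
 * clears the low p bits of a):
 *
 *   start=0x1010 end=0x1830 -> size=0x820, l=11, m=12
 *   m=12: p_start=ALIGN_DOWN(0x1010,12)=0x1000, p_end=0x1000+(1UL<<12)=0x2000 >= end
 *   => single covering range [0x1000-0x2000) with rr_soff=0x10 and rr_eoff=0x7d0
 */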
= p_start+(1UL<<m);
		if (p_end >= end) goto found;
	}
	return PFMLIB_ERR_IRRINVAL;
found:
	DPRINT("m=%d p_start=0x%lx p_end=0x%lx\n", m, p_start, p_end);

	/* when the event is not IA64_INST_RETIRED, then we MUST use ibrp0 */
	br[0].reg_num   = reg_idx;
	br[0].reg_value = p_start;
	br[0].reg_addr  = br[0].reg_alt_addr = reg_idx;

	db.val        = 0;
	db.db.db_mask = ~((1UL << m)-1);
	db.db.db_plm  = in_rr->rr_plm ? in_rr->rr_plm : (unsigned long)dfl_plm;

	br[1].reg_num   = reg_idx + 1;
	br[1].reg_value = db.val;
	br[1].reg_addr  = br[1].reg_alt_addr = reg_idx + 1;

	out_rr->rr_soff = start - p_start;
	out_rr->rr_eoff = p_end - end;

	if (PFMLIB_VERBOSE()) print_one_range(in_rr, out_rr, br, 0, 1, 0, irr->rr_flags);

	orr->rr_nbr_used += 2;

	/* update base_idx, for subsequent calls */
	*base_idx = reg_idx;

	return PFMLIB_SUCCESS;
}

static int
compute_normal_rr(pfmlib_ita2_input_rr_t *irr, int dfl_plm, int n, int *base_idx, pfmlib_ita2_output_rr_t *orr)
{
	pfmlib_ita2_input_rr_desc_t *in_rr;
	pfmlib_ita2_output_rr_desc_t *out_rr;
	unsigned long r_end;
	pfmlib_reg_t *br;
	dbreg_t d;
	int i, j;
	int br_index, reg_idx, prev_index;

	in_rr    = irr->rr_limits;
	out_rr   = orr->rr_infos;
	br       = orr->rr_br+orr->rr_nbr_used;
	reg_idx  = *base_idx;
	br_index = 0;

	for (i=0; i < n; i++, in_rr++, out_rr++) {
		/*
		 * running out of registers
		 */
		if (br_index == 8) break;

		prev_index = br_index;

		do_normal_rr(in_rr->rr_start, in_rr->rr_end,
			     br,
			     4 - (reg_idx>>1), /* how many pairs available */
			     0,
			     &br_index,
			     &reg_idx,
			     in_rr->rr_plm ?
in_rr->rr_plm : dfl_plm);

		DPRINT("br_index=%d reg_idx=%d\n", br_index, reg_idx);

		/*
		 * compute offsets
		 */
		out_rr->rr_soff = out_rr->rr_eoff = 0;

		for(j=prev_index; j < br_index; j+=2) {
			d.val = br[j+1].reg_value;
			r_end = br[j].reg_value+((~(d.db.db_mask)+1) & ~(0xffUL << 56));

			if (br[j].reg_value <= in_rr->rr_start)
				out_rr->rr_soff = in_rr->rr_start - br[j].reg_value;
			if (r_end >= in_rr->rr_end)
				out_rr->rr_eoff = r_end - in_rr->rr_end;
		}
		if (PFMLIB_VERBOSE())
			print_one_range(in_rr, out_rr, br, prev_index, (br_index-prev_index)>>1, 0, irr->rr_flags);
	}
	/* do not have enough registers to cover all the ranges */
	if (br_index == 8 && i < n) return PFMLIB_ERR_TOOMANY;

	orr->rr_nbr_used += br_index;

	/* update base_idx, for subsequent calls */
	*base_idx = reg_idx;

	return PFMLIB_SUCCESS;
}

static int
pfm_dispatch_irange(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in, pfmlib_output_param_t *outp, pfmlib_ita2_output_param_t *mod_out)
{
	pfm_ita2_pmc_reg_t reg;
	pfmlib_ita2_input_param_t *param = mod_in;
	pfmlib_ita2_input_rr_t *irr;
	pfmlib_ita2_output_rr_t *orr;
	pfmlib_reg_t *pc = outp->pfp_pmcs;
	unsigned int i, pos = outp->pfp_pmc_count, count;
	int ret;
	unsigned int retired_only, retired_count, fine_mode, prefetch_count;
	unsigned int n_intervals;
	int base_idx = 0;
	unsigned long retired_mask;

	if (param == NULL) return PFMLIB_SUCCESS;
	if (param->pfp_ita2_irange.rr_used == 0) return PFMLIB_SUCCESS;
	if (mod_out == NULL) return PFMLIB_ERR_INVAL;

	irr = &param->pfp_ita2_irange;
	orr = &mod_out->pfp_ita2_irange;

	ret = check_intervals(irr, 0, &n_intervals);
	if (ret != PFMLIB_SUCCESS) return ret;

	if (n_intervals < 1) return PFMLIB_ERR_IRRINVAL;

	retired_count  = check_inst_retired_events(inp, &retired_mask);
	retired_only   = retired_count == inp->pfp_event_count;
	prefetch_count = check_prefetch_events(inp);
	fine_mode      = irr->rr_flags & PFMLIB_ITA2_RR_NO_FINE_MODE ?
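/*
 * Illustration (not from the original source): the rr_soff/rr_eoff recovery in
 * compute_normal_rr() above relies on the DBR/IBR mask encoding. do_normal_rr()
 * programs db_mask = ~((1UL<<p2)-1) for a chunk of 2^p2 bytes, so (~db_mask)+1
 * recovers the chunk size, e.g. db_mask = ~0xfffUL (p2=12) means the pair covers
 * [reg_value, reg_value+0x1000). The top byte is stripped with ~(0xffUL << 56)
 * because it holds the plm and control bits, not mask bits.
 */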
0 : check_fine_mode_possible(irr, n_intervals);

	DPRINT("n_intervals=%d retired_only=%d retired_count=%d prefetch_count=%d fine_mode=%d\n",
		n_intervals, retired_only, retired_count, prefetch_count, fine_mode);

	/*
	 * On Itanium2, there are more constraints on what can be measured with irange.
	 *
	 * - The fine mode is the best because you directly set the lower and upper limits of
	 *   the range. This uses 2 ibr pairs per range (ibrp0/ibrp2 and ibrp1/ibrp3). Therefore
	 *   at most 2 fine mode ranges can be defined. There is a limit on the size and alignment
	 *   of the range to allow fine mode: the range must be less than 4KB in size AND the lower
	 *   and upper limits must NOT cross a 4KB page boundary. The fine mode works with all events.
	 *
	 * - If the fine mode fails, then for all events, except IA64_TAGGED_INST_RETIRED_*, only
	 *   the first pair of ibr is available: ibrp0. This imposes some severe restrictions on the
	 *   size and alignment of the range. It can be bigger than 4KB and must be properly aligned
	 *   on its size. The library relaxes these constraints by allowing the covered areas to be
	 *   larger than the expected range. It may start before and end after. You can determine how
	 *   far off the range is in either direction for each range by looking at the rr_soff (start
	 *   offset) and rr_eoff (end offset).
	 *
	 * - If the events include certain prefetch events then only IBRP1 can be used in fine mode.
	 *   See 10.3.5.1 Exception 1.
	 *
	 * - Finally, when the events are ONLY IA64_TAGGED_INST_RETIRED_* then all IBR pairs can be used
	 *   to cover the range giving us more flexibility to approximate the range when it is not
	 *   properly aligned on its size (see 10.3.5.2 Exception 2).
*/ if (fine_mode == 0 && retired_only == 0 && n_intervals > 1) return PFMLIB_ERR_IRRTOOMANY; /* we do not default to non-fine mode to support more ranges */ if (n_intervals > 2 && fine_mode == 1) return PFMLIB_ERR_IRRTOOMANY; if (fine_mode == 0) { if (retired_only) { ret = compute_normal_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr); } else { /* unless we have only prefetch and instruction retired events, * we cannot satisfy the request because the other events cannot * be measured on anything but IBRP0. */ if (prefetch_count && (prefetch_count+retired_count) != inp->pfp_event_count) return PFMLIB_ERR_FEATCOMB; base_idx = prefetch_count ? 2 : 0; ret = compute_single_rr(irr, inp->pfp_dfl_plm, &base_idx, orr); } } else { if (prefetch_count && n_intervals != 1) return PFMLIB_ERR_IRRTOOMANY; base_idx = prefetch_count ? 2 : 0; ret = compute_fine_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr); } if (ret != PFMLIB_SUCCESS) { return ret == PFMLIB_ERR_TOOMANY ? PFMLIB_ERR_IRRTOOMANY : ret; } reg.pmc_val = 0xdb6; /* default value */ count = orr->rr_nbr_used; for (i=0; i < count; i++) { switch(orr->rr_br[i].reg_num) { case 0: reg.pmc14_ita2_reg.iarc_ibrp0 = 0; break; case 2: reg.pmc14_ita2_reg.iarc_ibrp1 = 0; break; case 4: reg.pmc14_ita2_reg.iarc_ibrp2 = 0; break; case 6: reg.pmc14_ita2_reg.iarc_ibrp3 = 0; break; } } if (retired_only && (param->pfp_ita2_pmc8.opcm_used ||param->pfp_ita2_pmc9.opcm_used)) { /* * PMC8 + IA64_INST_RETIRED only works if irange on IBRP0 and/or IBRP2 * PMC9 + IA64_INST_RETIRED only works if irange on IBRP1 and/or IBRP3 */ count = orr->rr_nbr_used; for (i=0; i < count; i++) { if (orr->rr_br[i].reg_num == 0 && param->pfp_ita2_pmc9.opcm_used) return PFMLIB_ERR_FEATCOMB; if (orr->rr_br[i].reg_num == 2 && param->pfp_ita2_pmc8.opcm_used) return PFMLIB_ERR_FEATCOMB; if (orr->rr_br[i].reg_num == 4 && param->pfp_ita2_pmc9.opcm_used) return PFMLIB_ERR_FEATCOMB; if (orr->rr_br[i].reg_num == 6 && param->pfp_ita2_pmc8.opcm_used) return 
PFMLIB_ERR_FEATCOMB;
		}
	}

	if (fine_mode) {
		reg.pmc14_ita2_reg.iarc_fine = 1;
	} else if (retired_only) {
		/*
		 * we need to check that the user provided all the events needed to cover
		 * all the ibr pairs used to cover the range
		 */
		if ((retired_mask & 0x1) == 0 && reg.pmc14_ita2_reg.iarc_ibrp0 == 0)
			return PFMLIB_ERR_IRRINVAL;
		if ((retired_mask & 0x2) == 0 && reg.pmc14_ita2_reg.iarc_ibrp1 == 0)
			return PFMLIB_ERR_IRRINVAL;
		if ((retired_mask & 0x4) == 0 && reg.pmc14_ita2_reg.iarc_ibrp2 == 0)
			return PFMLIB_ERR_IRRINVAL;
		if ((retired_mask & 0x8) == 0 && reg.pmc14_ita2_reg.iarc_ibrp3 == 0)
			return PFMLIB_ERR_IRRINVAL;
	}

	/* initialize pmc request slot */
	memset(pc+pos, 0, sizeof(pfmlib_reg_t));

	if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 14))
		return PFMLIB_ERR_NOASSIGN;

	pc[pos].reg_num   = 14;
	pc[pos].reg_value = reg.pmc_val;
	pc[pos].reg_addr  = pc[pos].reg_alt_addr = 14;
	pos++;

	__pfm_vbprintf("[PMC14(pmc14)=0x%lx ibrp0=%d ibrp1=%d ibrp2=%d ibrp3=%d fine=%d]\n",
			reg.pmc_val,
			reg.pmc14_ita2_reg.iarc_ibrp0,
			reg.pmc14_ita2_reg.iarc_ibrp1,
			reg.pmc14_ita2_reg.iarc_ibrp2,
			reg.pmc14_ita2_reg.iarc_ibrp3,
			reg.pmc14_ita2_reg.iarc_fine);

	outp->pfp_pmc_count = pos;

	return PFMLIB_SUCCESS;
}

static const unsigned long iod_tab[8]={
	/* --- */ 3,
	/* --D */ 2,
	/* -O- */ 3, /* should not be used */
	/* -OD */ 0, /* =IOD safe because default IBR is harmless */
	/* I-- */ 1, /* =IO safe because by default OPC is turned off */
	/* I-D */ 0, /* =IOD safe because by default OPC is turned off */
	/* IO- */ 1,
	/* IOD */ 0
};

/*
 * IMPORTANT: MUST BE CALLED *AFTER* pfm_dispatch_irange() to make sure we see
 * the irange programming to adjust pmc13.
*/ static int pfm_dispatch_drange(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in, pfmlib_output_param_t *outp, pfmlib_ita2_output_param_t *mod_out) { pfmlib_ita2_input_param_t *param = mod_in; pfmlib_reg_t *pc = outp->pfp_pmcs; pfmlib_ita2_input_rr_t *irr; pfmlib_ita2_output_rr_t *orr, *orr2; pfm_ita2_pmc_reg_t pmc13; pfm_ita2_pmc_reg_t pmc14; unsigned int i, pos = outp->pfp_pmc_count; int iod_codes[4], dfl_val_pmc8, dfl_val_pmc9; unsigned int n_intervals; int ret; int base_idx = 0; int fine_mode = 0; #define DR_USED 0x1 /* data range is used */ #define OP_USED 0x2 /* opcode matching is used */ #define IR_USED 0x4 /* code range is used */ if (param == NULL) return PFMLIB_SUCCESS; /* * if only pmc8/pmc9 opcode matching is used, we do not need to change * the default value of pmc13 regardless of the events being measured. */ if ( param->pfp_ita2_drange.rr_used == 0 && param->pfp_ita2_irange.rr_used == 0) return PFMLIB_SUCCESS; /* * it seems like the ignored bits need to have special values * otherwise this does not work. */ pmc13.pmc_val = 0x2078fefefefe; /* * initialize iod codes */ iod_codes[0] = iod_codes[1] = iod_codes[2] = iod_codes[3] = 0; /* * setup default iod value, we need to separate because * if drange is used we do not know in advance which DBR will be used * therefore we need to apply dfl_val later */ dfl_val_pmc8 = param->pfp_ita2_pmc8.opcm_used ? OP_USED : 0; dfl_val_pmc9 = param->pfp_ita2_pmc9.opcm_used ? OP_USED : 0; if (param->pfp_ita2_drange.rr_used == 1) { if (mod_out == NULL) return PFMLIB_ERR_INVAL; irr = ¶m->pfp_ita2_drange; orr = &mod_out->pfp_ita2_drange; ret = check_intervals(irr, 1, &n_intervals); if (ret != PFMLIB_SUCCESS) return ret; if (n_intervals < 1) return PFMLIB_ERR_DRRINVAL; ret = compute_normal_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr); if (ret != PFMLIB_SUCCESS) { return ret == PFMLIB_ERR_TOOMANY ? PFMLIB_ERR_DRRTOOMANY : ret; } /* * Update iod_codes to reflect the use of the DBR constraint. 
*/ for (i=0; i < orr->rr_nbr_used; i++) { if (orr->rr_br[i].reg_num == 0) iod_codes[0] |= DR_USED | dfl_val_pmc8; if (orr->rr_br[i].reg_num == 2) iod_codes[1] |= DR_USED | dfl_val_pmc9; if (orr->rr_br[i].reg_num == 4) iod_codes[2] |= DR_USED | dfl_val_pmc8; if (orr->rr_br[i].reg_num == 6) iod_codes[3] |= DR_USED | dfl_val_pmc9; } } /* * XXX: assume dispatch_irange executed before calling this function */ if (param->pfp_ita2_irange.rr_used == 1) { if (mod_out == NULL) return PFMLIB_ERR_INVAL; orr2 = &mod_out->pfp_ita2_irange; /* * we need to find out whether or not the irange is using * fine mode. If this is the case, then we only need to * program pmc13 for the ibr pairs which designate the lower * bounds of a range. For instance, if IBRP0/IBRP2 are used, * then we only need to program pmc13.cfg_dbrp0 and pmc13.ena_dbrp0, * the PMU will automatically use IBRP2, even though pmc13.ena_dbrp2=0. */ for(i=0; i < pos; i++) { if (pc[i].reg_num == 14) { pmc14.pmc_val = pc[i].reg_value; if (pmc14.pmc14_ita2_reg.iarc_fine == 1) fine_mode = 1; break; } } /* * Update to reflect the use of the IBR constraint */ for (i=0; i < orr2->rr_nbr_used; i++) { if (orr2->rr_br[i].reg_num == 0) iod_codes[0] |= IR_USED | dfl_val_pmc8; if (orr2->rr_br[i].reg_num == 2) iod_codes[1] |= IR_USED | dfl_val_pmc9; if (fine_mode == 0 && orr2->rr_br[i].reg_num == 4) iod_codes[2] |= IR_USED | dfl_val_pmc8; if (fine_mode == 0 && orr2->rr_br[i].reg_num == 6) iod_codes[3] |= IR_USED | dfl_val_pmc9; } } if (param->pfp_ita2_irange.rr_used == 0 && param->pfp_ita2_drange.rr_used == 0) { iod_codes[0] = iod_codes[2] = dfl_val_pmc8; iod_codes[1] = iod_codes[3] = dfl_val_pmc9; } /* * update the cfg dbrpX field. If we put a constraint on a cfg dbrp, then * we must enable it in the corresponding ena_dbrpX */ pmc13.pmc13_ita2_reg.darc_ena_dbrp0 = iod_codes[0] ? 1 : 0; pmc13.pmc13_ita2_reg.darc_cfg_dbrp0 = iod_tab[iod_codes[0]]; pmc13.pmc13_ita2_reg.darc_ena_dbrp1 = iod_codes[1] ?
1 : 0; pmc13.pmc13_ita2_reg.darc_cfg_dbrp1 = iod_tab[iod_codes[1]]; pmc13.pmc13_ita2_reg.darc_ena_dbrp2 = iod_codes[2] ? 1 : 0; pmc13.pmc13_ita2_reg.darc_cfg_dbrp2 = iod_tab[iod_codes[2]]; pmc13.pmc13_ita2_reg.darc_ena_dbrp3 = iod_codes[3] ? 1 : 0; pmc13.pmc13_ita2_reg.darc_cfg_dbrp3 = iod_tab[iod_codes[3]]; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 13)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 13; pc[pos].reg_value = pmc13.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 13; pos++; __pfm_vbprintf("[PMC13(pmc13)=0x%lx cfg_dbrp0=%d cfg_dbrp1=%d cfg_dbrp2=%d cfg_dbrp3=%d ena_dbrp0=%d ena_dbrp1=%d ena_dbrp2=%d ena_dbrp3=%d]\n", pmc13.pmc_val, pmc13.pmc13_ita2_reg.darc_cfg_dbrp0, pmc13.pmc13_ita2_reg.darc_cfg_dbrp1, pmc13.pmc13_ita2_reg.darc_cfg_dbrp2, pmc13.pmc13_ita2_reg.darc_cfg_dbrp3, pmc13.pmc13_ita2_reg.darc_ena_dbrp0, pmc13.pmc13_ita2_reg.darc_ena_dbrp1, pmc13.pmc13_ita2_reg.darc_ena_dbrp2, pmc13.pmc13_ita2_reg.darc_ena_dbrp3); outp->pfp_pmc_count = pos; return PFMLIB_SUCCESS; } static int check_qualifier_constraints(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in) { pfmlib_ita2_input_param_t *param = mod_in; pfmlib_event_t *e = inp->pfp_events; unsigned int i, count; count = inp->pfp_event_count; for(i=0; i < count; i++) { /* * skip check for counter which requested it. Use at your own risk. * Not all counters have necessarily been validated for use with * qualifiers. Typically the event is counted as if no constraint * existed.
*/ if (param->pfp_ita2_counters[i].flags & PFMLIB_ITA2_FL_EVT_NO_QUALCHECK) continue; if (evt_use_irange(param) && has_iarr(e[i].event) == 0) return PFMLIB_ERR_FEATCOMB; if (evt_use_drange(param) && has_darr(e[i].event) == 0) return PFMLIB_ERR_FEATCOMB; if (evt_use_opcm(param) && has_opcm(e[i].event) == 0) return PFMLIB_ERR_FEATCOMB; } return PFMLIB_SUCCESS; } static int check_range_plm(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in) { pfmlib_ita2_input_param_t *param = mod_in; unsigned int i, count; if (param->pfp_ita2_drange.rr_used == 0 && param->pfp_ita2_irange.rr_used == 0) return PFMLIB_SUCCESS; /* * range restriction applies to all events, therefore we must have a consistent * set of plm and they must match the pfp_dfl_plm which is used to setup the debug * registers */ count = inp->pfp_event_count; for(i=0; i < count; i++) { if (inp->pfp_events[i].plm && inp->pfp_events[i].plm != inp->pfp_dfl_plm) return PFMLIB_ERR_FEATCOMB; } return PFMLIB_SUCCESS; } static int pfm_ita2_dispatch_events(pfmlib_input_param_t *inp, void *model_in, pfmlib_output_param_t *outp, void *model_out) { int ret; pfmlib_ita2_input_param_t *mod_in = (pfmlib_ita2_input_param_t *)model_in; pfmlib_ita2_output_param_t *mod_out = (pfmlib_ita2_output_param_t *)model_out; /* * nothing will come out of this combination */ if (mod_out && mod_in == NULL) return PFMLIB_ERR_INVAL; /* check opcode match, range restriction qualifiers */ if (mod_in && check_qualifier_constraints(inp, mod_in) != PFMLIB_SUCCESS) return PFMLIB_ERR_FEATCOMB; /* check for problems with range restriction and per-event plm */ if (mod_in && check_range_plm(inp, mod_in) != PFMLIB_SUCCESS) return PFMLIB_ERR_FEATCOMB; ret = pfm_ita2_dispatch_counters(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) return ret; /* now check for I-EAR */ ret = pfm_dispatch_iear(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) return ret; /* now check for D-EAR */ ret = pfm_dispatch_dear(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS)
return ret; /* XXX: must be done before dispatch_opcm() and dispatch_drange() */ ret = pfm_dispatch_irange(inp, mod_in, outp, mod_out); if (ret != PFMLIB_SUCCESS) return ret; ret = pfm_dispatch_drange(inp, mod_in, outp, mod_out); if (ret != PFMLIB_SUCCESS) return ret; /* now check for Opcode matchers */ ret = pfm_dispatch_opcm(inp, mod_in, outp, mod_out); if (ret != PFMLIB_SUCCESS) return ret; ret = pfm_dispatch_btb(inp, mod_in, outp); return ret; } /* XXX: return value is also error code */ int pfm_ita2_get_event_maxincr(unsigned int i, unsigned int *maxincr) { if (i >= PME_ITA2_EVENT_COUNT || maxincr == NULL) return PFMLIB_ERR_INVAL; *maxincr = itanium2_pe[i].pme_maxincr; return PFMLIB_SUCCESS; } int pfm_ita2_is_ear(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_ear(i); } int pfm_ita2_is_dear(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_dear(i); } int pfm_ita2_is_dear_tlb(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_dear(i) && is_ear_tlb(i); } int pfm_ita2_is_dear_cache(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_dear(i) && is_ear_cache(i); } int pfm_ita2_is_dear_alat(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_ear_alat(i); } int pfm_ita2_is_iear(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_iear(i); } int pfm_ita2_is_iear_tlb(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_iear(i) && is_ear_tlb(i); } int pfm_ita2_is_iear_cache(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_iear(i) && is_ear_cache(i); } int pfm_ita2_is_btb(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_btb(i); } int pfm_ita2_support_iarr(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && has_iarr(i); } int pfm_ita2_support_darr(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && has_darr(i); } int pfm_ita2_support_opcm(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && has_opcm(i); } int pfm_ita2_get_ear_mode(unsigned int i, pfmlib_ita2_ear_mode_t *m) { pfmlib_ita2_ear_mode_t r; if (!is_ear(i) || m
== NULL) return PFMLIB_ERR_INVAL; r = PFMLIB_ITA2_EAR_TLB_MODE; if (is_ear_tlb(i)) goto done; r = PFMLIB_ITA2_EAR_CACHE_MODE; if (is_ear_cache(i)) goto done; r = PFMLIB_ITA2_EAR_ALAT_MODE; if (is_ear_alat(i)) goto done; return PFMLIB_ERR_INVAL; done: *m = r; return PFMLIB_SUCCESS; } static int pfm_ita2_get_event_code(unsigned int i, unsigned int cnt, int *code) { if (cnt != PFMLIB_CNT_FIRST && (cnt < 4 || cnt > 7)) return PFMLIB_ERR_INVAL; *code = (int)itanium2_pe[i].pme_code; return PFMLIB_SUCCESS; } /* * This function is accessible directly to the user */ int pfm_ita2_get_event_umask(unsigned int i, unsigned long *umask) { if (i >= PME_ITA2_EVENT_COUNT || umask == NULL) return PFMLIB_ERR_INVAL; *umask = evt_umask(i); return PFMLIB_SUCCESS; } int pfm_ita2_get_event_group(unsigned int i, int *grp) { if (i >= PME_ITA2_EVENT_COUNT || grp == NULL) return PFMLIB_ERR_INVAL; *grp = evt_grp(i); return PFMLIB_SUCCESS; } int pfm_ita2_get_event_set(unsigned int i, int *set) { if (i >= PME_ITA2_EVENT_COUNT || set == NULL) return PFMLIB_ERR_INVAL; *set = evt_set(i) == 0xf ? PFMLIB_ITA2_EVT_NO_SET : evt_set(i); return PFMLIB_SUCCESS; } /* external interface */ int pfm_ita2_irange_is_fine(pfmlib_output_param_t *outp, pfmlib_ita2_output_param_t *mod_out) { pfmlib_ita2_output_param_t *param = mod_out; pfm_ita2_pmc_reg_t reg; unsigned int i, count; /* some sanity checks */ if (outp == NULL || param == NULL) return 0; if (outp->pfp_pmc_count >= PFMLIB_MAX_PMCS) return 0; if (param->pfp_ita2_irange.rr_nbr_used == 0) return 0; /* * we look for pmc14 as it contains the bit indicating if fine mode is used */ count = outp->pfp_pmc_count; for(i=0; i < count; i++) { if (outp->pfp_pmcs[i].reg_num == 14) goto found; } return 0; found: reg.pmc_val = outp->pfp_pmcs[i].reg_value; return reg.pmc14_ita2_reg.iarc_fine ? 
1 : 0; } static char * pfm_ita2_get_event_name(unsigned int i) { return itanium2_pe[i].pme_name; } static void pfm_ita2_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { unsigned int i; unsigned long m; memset(counters, 0, sizeof(*counters)); m =itanium2_pe[j].pme_counters; for(i=0; m ; i++, m>>=1) { if (m & 0x1) pfm_regmask_set(counters, i); } } static void pfm_ita2_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs) { unsigned int i = 0; /* all pmcs are contiguous */ for(i=0; i < PMU_ITA2_NUM_PMCS; i++) pfm_regmask_set(impl_pmcs, i); } static void pfm_ita2_get_impl_pmds(pfmlib_regmask_t *impl_pmds) { unsigned int i = 0; /* all pmds are contiguous */ for(i=0; i < PMU_ITA2_NUM_PMDS; i++) pfm_regmask_set(impl_pmds, i); } static void pfm_ita2_get_impl_counters(pfmlib_regmask_t *impl_counters) { unsigned int i = 0; /* counting pmds are contiguous */ for(i=4; i < 8; i++) pfm_regmask_set(impl_counters, i); } static void pfm_ita2_get_hw_counter_width(unsigned int *width) { *width = PMU_ITA2_COUNTER_WIDTH; } static int pfm_ita2_get_event_description(unsigned int ev, char **str) { char *s; s = itanium2_pe[ev].pme_desc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static int pfm_ita2_get_cycle_event(pfmlib_event_t *e) { e->event = PME_ITA2_CPU_CYCLES; return PFMLIB_SUCCESS; } static int pfm_ita2_get_inst_retired(pfmlib_event_t *e) { e->event = PME_ITA2_IA64_INST_RETIRED; return PFMLIB_SUCCESS; } pfm_pmu_support_t itanium2_support={ .pmu_name = "itanium2", .pmu_type = PFMLIB_ITANIUM2_PMU, .pme_count = PME_ITA2_EVENT_COUNT, .pmc_count = PMU_ITA2_NUM_PMCS, .pmd_count = PMU_ITA2_NUM_PMDS, .num_cnt = PMU_ITA2_NUM_COUNTERS, .get_event_code = pfm_ita2_get_event_code, .get_event_name = pfm_ita2_get_event_name, .get_event_counters = pfm_ita2_get_event_counters, .dispatch_events = pfm_ita2_dispatch_events, .pmu_detect = pfm_ita2_detect, .get_impl_pmcs = pfm_ita2_get_impl_pmcs, .get_impl_pmds = pfm_ita2_get_impl_pmds, .get_impl_counters = 
pfm_ita2_get_impl_counters, .get_hw_counter_width = pfm_ita2_get_hw_counter_width, .get_event_desc = pfm_ita2_get_event_description, .get_cycle_event = pfm_ita2_get_cycle_event, .get_inst_retired_event = pfm_ita2_get_inst_retired }; papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_itanium2_priv.h000066400000000000000000000121711502707512200236020ustar00rootroot00000000000000/* * Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux/ia64. */ #ifndef __PFMLIB_ITANIUM2_PRIV_H__ #define __PFMLIB_ITANIUM2_PRIV_H__ /* * Event type definitions * * The virtual events are not really defined in the specs but are an artifact used * to quickly and easily setup EAR and/or BTB. The event type encodes the exact feature * which must be configured in combination with a counting monitor. 
* For instance, DATA_EAR_CACHE_LAT4 is a virtual D-EAR cache event. If the user * requests this event, this will configure a counting monitor to count DATA_EAR_EVENTS * and PMC11 will be configured for cache mode. The latency is encoded in the umask, here * it would correspond to 4 cycles. * */ #define PFMLIB_ITA2_EVENT_NORMAL 0x0 /* standard counter */ #define PFMLIB_ITA2_EVENT_BTB 0x1 /* virtual event used with BTB configuration */ #define PFMLIB_ITA2_EVENT_IEAR_TLB 0x2 /* virtual event used for I-EAR TLB configuration */ #define PFMLIB_ITA2_EVENT_IEAR_CACHE 0x3 /* virtual event used for I-EAR cache configuration */ #define PFMLIB_ITA2_EVENT_DEAR_TLB 0x4 /* virtual event used for D-EAR TLB configuration */ #define PFMLIB_ITA2_EVENT_DEAR_CACHE 0x5 /* virtual event used for D-EAR cache configuration */ #define PFMLIB_ITA2_EVENT_DEAR_ALAT 0x6 /* virtual event used for D-EAR ALAT configuration */ #define event_is_ear(e) ((e)->pme_type >= PFMLIB_ITA2_EVENT_IEAR_TLB &&(e)->pme_type <= PFMLIB_ITA2_EVENT_DEAR_ALAT) #define event_is_iear(e) ((e)->pme_type == PFMLIB_ITA2_EVENT_IEAR_TLB || (e)->pme_type == PFMLIB_ITA2_EVENT_IEAR_CACHE) #define event_is_dear(e) ((e)->pme_type >= PFMLIB_ITA2_EVENT_DEAR_TLB && (e)->pme_type <= PFMLIB_ITA2_EVENT_DEAR_ALAT) #define event_is_ear_cache(e) ((e)->pme_type == PFMLIB_ITA2_EVENT_DEAR_CACHE || (e)->pme_type == PFMLIB_ITA2_EVENT_IEAR_CACHE) #define event_is_ear_tlb(e) ((e)->pme_type == PFMLIB_ITA2_EVENT_IEAR_TLB || (e)->pme_type == PFMLIB_ITA2_EVENT_DEAR_TLB) #define event_is_ear_alat(e) ((e)->pme_type == PFMLIB_ITA2_EVENT_DEAR_ALAT) #define event_is_btb(e) ((e)->pme_type == PFMLIB_ITA2_EVENT_BTB) /* * Itanium encoding structure * (code must be first 8 bits) */ typedef struct { unsigned long pme_code:8; /* major event code */ unsigned long pme_type:3; /* see definitions above */ unsigned long pme_ig1:5; /* ignored */ unsigned long pme_umask:16; /* unit mask*/ unsigned long pme_ig:32; /* ignored */ } pme_ita2_entry_code_t; typedef union { 
unsigned long pme_vcode; pme_ita2_entry_code_t pme_ita2_code; /* must not be larger than vcode */ } pme_ita2_code_t; typedef union { unsigned long qual; /* generic qualifier */ struct { unsigned long pme_iar:1; /* instruction address range supported */ unsigned long pme_opm:1; /* opcode match supported */ unsigned long pme_dar:1; /* data address range supported */ unsigned long pme_res1:13; /* reserved */ unsigned long pme_group:4; /* event group */ unsigned long pme_set:4; /* event feature set*/ unsigned long pme_res2:40; /* reserved */ } pme_qual; } pme_ita2_qualifiers_t; typedef struct { char *pme_name; pme_ita2_code_t pme_entry_code; unsigned long pme_counters; /* supported counters */ unsigned int pme_maxincr; pme_ita2_qualifiers_t pme_qualifiers; char *pme_desc; /* text description of the event */ } pme_ita2_entry_t; /* * We embed the umask value into the event code. Because it really is * like a subevent. * pme_code: * - lower 16 bits: major event code * - upper 16 bits: unit mask */ #define pme_code pme_entry_code.pme_ita2_code.pme_code #define pme_umask pme_entry_code.pme_ita2_code.pme_umask #define pme_used pme_qualifiers.pme_qual_struct.pme_used #define pme_type pme_entry_code.pme_ita2_code.pme_type #define event_opcm_ok(e) ((e)->pme_qualifiers.pme_qual.pme_opm==1) #define event_iarr_ok(e) ((e)->pme_qualifiers.pme_qual.pme_iar==1) #define event_darr_ok(e) ((e)->pme_qualifiers.pme_qual.pme_dar==1) #endif /* __PFMLIB_ITANIUM2_PRIV_H__ */ papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_itanium_priv.h000066400000000000000000000071471502707512200235270ustar00rootroot00000000000000/* * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux/ia64. 
*/ #ifndef __PFMLIB_ITANIUM_PRIV_H__ #define __PFMLIB_ITANIUM_PRIV_H__ /* * Itanium encoding structure * (code must be first 8 bits) */ typedef struct { unsigned long pme_code:8; /* major event code */ unsigned long pme_ear:1; /* is EAR event */ unsigned long pme_dear:1; /* 1=Data 0=Instr */ unsigned long pme_tlb:1; /* 1=TLB 0=Cache */ unsigned long pme_btb:1; /* 1=BTB */ unsigned long pme_ig1:4; /* ignored */ unsigned long pme_umask:16; /* unit mask*/ unsigned long pme_ig:32; /* ignored */ } pme_ita_entry_code_t; #define PME_UMASK_NONE 0x0 typedef union { unsigned long pme_vcode; pme_ita_entry_code_t pme_ita_code; /* must not be larger than vcode */ } pme_ita_code_t; typedef union { unsigned long qual; /* generic qualifier */ struct { unsigned long pme_iar:1; /* instruction address range supported */ unsigned long pme_opm:1; /* opcode match supported */ unsigned long pme_dar:1; /* data address range supported */ unsigned long pme_reserved:61; /* not used */ } pme_qual; } pme_ita_qualifiers_t; typedef struct { char *pme_name; pme_ita_code_t pme_entry_code; unsigned long pme_counters; /* supported counters */ unsigned int pme_maxincr; pme_ita_qualifiers_t pme_qualifiers; char *pme_desc; } pme_ita_entry_t; /* * We embed the umask value into the event code. Because it really is * like a subevent. 
* pme_code: * - lower 16 bits: major event code * - upper 16 bits: unit mask */ #define pme_code pme_entry_code.pme_ita_code.pme_code #define pme_ear pme_entry_code.pme_ita_code.pme_ear #define pme_dear pme_entry_code.pme_ita_code.pme_dear #define pme_tlb pme_entry_code.pme_ita_code.pme_tlb #define pme_btb pme_entry_code.pme_ita_code.pme_btb #define pme_umask pme_entry_code.pme_ita_code.pme_umask #define pme_used pme_qualifiers.pme_qual_struct.pme_used #define event_is_ear(e) ((e)->pme_ear == 1) #define event_is_iear(e) ((e)->pme_ear == 1 && (e)->pme_dear==0) #define event_is_dear(e) ((e)->pme_ear == 1 && (e)->pme_dear==1) #define event_is_tlb_ear(e) ((e)->pme_ear == 1 && (e)->pme_tlb==1) #define event_is_btb(e) ((e)->pme_btb) #define event_opcm_ok(e) ((e)->pme_qualifiers.pme_qual.pme_opm==1) #define event_iarr_ok(e) ((e)->pme_qualifiers.pme_qual.pme_iar==1) #define event_darr_ok(e) ((e)->pme_qualifiers.pme_qual.pme_dar==1) #endif /* __PFMLIB_ITANIUM_PRIV_H__ */ papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_montecito.c000066400000000000000000002117321502707512200230120ustar00rootroot00000000000000/* * pfmlib_montecito.c : support for the Dual-Core Itanium2 processor * * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include /* public headers */ #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_priv_ia64.h" /* architecture private */ #include "pfmlib_montecito_priv.h" /* PMU private */ #include "montecito_events.h" /* PMU private */ #define is_ear(i) event_is_ear(montecito_pe+(i)) #define is_ear_tlb(i) event_is_ear_tlb(montecito_pe+(i)) #define is_ear_alat(i) event_is_ear_alat(montecito_pe+(i)) #define is_ear_cache(i) event_is_ear_cache(montecito_pe+(i)) #define is_iear(i) event_is_iear(montecito_pe+(i)) #define is_dear(i) event_is_dear(montecito_pe+(i)) #define is_etb(i) event_is_etb(montecito_pe+(i)) #define has_opcm(i) event_opcm_ok(montecito_pe+(i)) #define has_iarr(i) event_iarr_ok(montecito_pe+(i)) #define has_darr(i) event_darr_ok(montecito_pe+(i)) #define has_all(i) event_all_ok(montecito_pe+(i)) #define has_mesi(i) event_mesi_ok(montecito_pe+(i)) #define evt_use_opcm(e) ((e)->pfp_mont_opcm1.opcm_used != 0 || (e)->pfp_mont_opcm2.opcm_used !=0) #define evt_use_irange(e) ((e)->pfp_mont_irange.rr_used) #define evt_use_drange(e) ((e)->pfp_mont_drange.rr_used) #define evt_grp(e) (int)montecito_pe[e].pme_qualifiers.pme_qual.pme_group #define evt_set(e) (int)montecito_pe[e].pme_qualifiers.pme_qual.pme_set #define evt_umask(e) montecito_pe[e].pme_umask #define evt_type(e) (int)montecito_pe[e].pme_type #define evt_caf(e) (int)montecito_pe[e].pme_caf #define FINE_MODE_BOUNDARY_BITS 16 #define FINE_MODE_MASK ~((1U<<FINE_MODE_BOUNDARY_BITS)-1) /* * The following are in the generic pfp_pmcs[] table: * 0 -> PMC0 * 1 -> PMC1
* n -> PMCn * * The following are in the model specific rr_br[]: * IBR0 -> 0 * IBR1 -> 1 * ... * IBR7 -> 7 * DBR0 -> 0 * DBR1 -> 1 * ... * DBR7 -> 7 * * We do not use a mapping table, instead we make up the * values on the fly given the base. */ static int pfm_mont_detect(void) { int tmp; int ret = PFMLIB_ERR_NOTSUPP; tmp = pfm_ia64_get_cpu_family(); if (tmp == 0x20) { ret = PFMLIB_SUCCESS; } return ret; } /* * Check the event for incompatibilities. This is useful * for L1D and L2D related events. Due to wire limitations, * some cache events are separated into sets. There * are 6 sets for the L1D cache group and 8 sets for L2D group. * It is NOT possible to simultaneously measure events from * different sets for L1D. For instance, you cannot * measure events from set0 and set1 in L1D cache group. The L2D * group allows up to two different sets to be active at the same * time. The first set is selected by the event in PMC4 and the second * set by the event in PMC6. Once the set is selected for PMC4, * the same set is locked for PMC5 and PMC8. Similarly, once the * set is selected for PMC6, the same set is locked for PMC7 and * PMC9.
* * This function verifies that only one set of L1D is selected * and that no more than 2 sets are selected for L2D */ static int check_cross_groups(pfmlib_input_param_t *inp, unsigned int *l1d_event, unsigned long *l2d_set1_mask, unsigned long *l2d_set2_mask) { int g, s, s1, s2; unsigned int cnt = inp->pfp_event_count; pfmlib_event_t *e = inp->pfp_events; unsigned int i, j; unsigned long l2d_mask1 = 0, l2d_mask2 = 0; unsigned int l1d_event_idx = UNEXISTING_SET; /* * Let's check the L1D constraint first * * There is no umask restriction for this group */ for (i=0; i < cnt; i++) { g = evt_grp(e[i].event); s = evt_set(e[i].event); if (g != PFMLIB_MONT_EVT_L1D_CACHE_GRP) continue; DPRINT("i=%u g=%d s=%d\n", i, g, s); l1d_event_idx = i; for (j=i+1; j < cnt; j++) { if (evt_grp(e[j].event) != g) continue; /* * if there is another event from the same group * but with a different set, then we return an error */ if (evt_set(e[j].event) != s) return PFMLIB_ERR_EVTSET; } } /* * Check that we have only up to two distinct * sets for L2D */ s1 = s2 = -1; for (i=0; i < cnt; i++) { g = evt_grp(e[i].event); if (g != PFMLIB_MONT_EVT_L2D_CACHE_GRP) continue; s = evt_set(e[i].event); /* * we have seen this set before, continue */ if (s1 == s) { l2d_mask1 |= 1UL << i; continue; } if (s2 == s) { l2d_mask2 |= 1UL << i; continue; } /* * record first or second set seen */ if (s1 == -1) { s1 = s; l2d_mask1 |= 1UL << i; } else if (s2 == -1) { s2 = s; l2d_mask2 |= 1UL << i; } else { /* * found a third set, that's not possible */ return PFMLIB_ERR_EVTSET; } } *l1d_event = l1d_event_idx; *l2d_set1_mask = l2d_mask1; *l2d_set2_mask = l2d_mask2; return PFMLIB_SUCCESS; } /* * Certain prefetch events must be treated specially when instruction range restriction * is used because they can only be constrained by IBRP1 in fine-mode. Other events * will use IBRP0 if tagged as a demand fetch OR IBRP1 if tagged as a prefetch match.
* * Events which can be qualified by the two pairs depending on their tag: * - ISB_BUNPAIRS_IN * - L1I_FETCH_RAB_HIT * - L1I_FETCH_ISB_HIT * - L1I_FILLS * * This function returns the number of qualifying prefetch events found */ static int prefetch_events[]={ PME_MONT_L1I_PREFETCHES, PME_MONT_L1I_STRM_PREFETCHES, PME_MONT_L2I_PREFETCHES }; #define NPREFETCH_EVENTS sizeof(prefetch_events)/sizeof(int) static int prefetch_dual_events[]= { PME_MONT_ISB_BUNPAIRS_IN, PME_MONT_L1I_FETCH_RAB_HIT, PME_MONT_L1I_FETCH_ISB_HIT, PME_MONT_L1I_FILLS }; #define NPREFETCH_DUAL_EVENTS sizeof(prefetch_dual_events)/sizeof(int) /* * prefetch events must use IBRP1, unless they are dual and the user specified * PFMLIB_MONT_IRR_DEMAND_FETCH in rr_flags */ static int check_prefetch_events(pfmlib_input_param_t *inp, pfmlib_mont_input_rr_t *irr, unsigned int *count, int *base_idx, int *dup) { int code; int prefetch_codes[NPREFETCH_EVENTS]; int prefetch_dual_codes[NPREFETCH_DUAL_EVENTS]; unsigned int i, j; int c, flags; int found = 0, found_ibrp0 = 0, found_ibrp1 = 0; flags = irr->rr_flags & (PFMLIB_MONT_IRR_DEMAND_FETCH|PFMLIB_MONT_IRR_PREFETCH_MATCH); for(i=0; i < NPREFETCH_EVENTS; i++) { pfm_get_event_code(prefetch_events[i], &code); prefetch_codes[i] = code; } for(i=0; i < NPREFETCH_DUAL_EVENTS; i++) { pfm_get_event_code(prefetch_dual_events[i], &code); prefetch_dual_codes[i] = code; } for(i=0; i < inp->pfp_event_count; i++) { pfm_get_event_code(inp->pfp_events[i].event, &c); for(j=0; j < NPREFETCH_EVENTS; j++) { if (c == prefetch_codes[j]) { found++; found_ibrp1++; } } /* * for the dual events, users must specify one or both of the * PFMLIB_MONT_IRR_DEMAND_FETCH or PFMLIB_MONT_IRR_PREFETCH_MATCH */ for(j=0; j < NPREFETCH_DUAL_EVENTS; j++) { if (c == prefetch_dual_codes[j]) { found++; if (flags == 0) return PFMLIB_ERR_IRRFLAGS; if (flags & PFMLIB_MONT_IRR_DEMAND_FETCH) found_ibrp0++; if (flags & PFMLIB_MONT_IRR_PREFETCH_MATCH) found_ibrp1++; } } } *count = found; *dup = 0; /* * if both 
found_ibrp0 and found_ibrp1 > 0, then we need to duplicate * the range in ibrp0 to ibrp1. */ if (found) { *base_idx = found_ibrp0 ? 0 : 2; if (found_ibrp1 && found_ibrp0) *dup = 1; } return 0; } /* * look for CPU_OP_CYCLES_QUAL * Return: * 1 if found * 0 otherwise */ static int has_cpu_cycles_qual(pfmlib_input_param_t *inp) { unsigned int i; int code, c; pfm_get_event_code(PME_MONT_CPU_OP_CYCLES_QUAL, &code); for(i=0; i < inp->pfp_event_count; i++) { pfm_get_event_code(inp->pfp_events[i].event, &c); if (c == code) return 1; } return 0; } /* * IA64_INST_RETIRED (and subevents) is the only event which can be measured on all * 4 IBR when non-fine mode is not possible. * * This function returns: * - the number of events that match the IA64_INST_RETIRED code * - in retired_mask the bottom 4 bits indicate which of the 4 INST_RETIRED events * are present */ static unsigned int check_inst_retired_events(pfmlib_input_param_t *inp, unsigned long *retired_mask) { int code; int c; unsigned int i, count, found = 0; unsigned long umask, mask; pfm_get_event_code(PME_MONT_IA64_INST_RETIRED, &code); count = inp->pfp_event_count; mask = 0; for(i=0; i < count; i++) { pfm_get_event_code(inp->pfp_events[i].event, &c); if (c == code) { pfm_mont_get_event_umask(inp->pfp_events[i].event, &umask); switch(umask) { case 0: mask |= 1; break; case 1: mask |= 2; break; case 2: mask |= 4; break; case 3: mask |= 8; break; } found++; } } if (retired_mask) *retired_mask = mask; return found; } static int check_fine_mode_possible(pfmlib_mont_input_rr_t *rr, int n) { pfmlib_mont_input_rr_desc_t *lim = rr->rr_limits; int i; for(i=0; i < n; i++) { if ((lim[i].rr_start & FINE_MODE_MASK) != (lim[i].rr_end & FINE_MODE_MASK)) return 0; } return 1; } /* * mode = 0 -> check code (enforce bundle alignment) * mode = 1 -> check data */ static int check_intervals(pfmlib_mont_input_rr_t *irr, int mode, unsigned int *n_intervals) { unsigned int i; pfmlib_mont_input_rr_desc_t *lim = irr->rr_limits;
/* end marker */ if (lim[i].rr_start == 0 && lim[i].rr_end == 0) break; /* invalid entry */ if (lim[i].rr_start >= lim[i].rr_end) return PFMLIB_ERR_IRRINVAL; if (mode == 0 && (lim[i].rr_start & 0xf || lim[i].rr_end & 0xf)) return PFMLIB_ERR_IRRALIGN; } *n_intervals = i; return PFMLIB_SUCCESS; } /* * It is not possible to measure more than one of the * L2D_OZQ_CANCELS0, L2D_OZQ_CANCELS1 at the same time. */ static int cancel_events[]= { PME_MONT_L2D_OZQ_CANCELS0_ACQ, PME_MONT_L2D_OZQ_CANCELS1_ANY }; #define NCANCEL_EVENTS sizeof(cancel_events)/sizeof(int) static int check_cancel_events(pfmlib_input_param_t *inp) { unsigned int i, j, count; int code; int cancel_codes[NCANCEL_EVENTS]; int idx = -1; for(i=0; i < NCANCEL_EVENTS; i++) { pfm_get_event_code(cancel_events[i], &code); cancel_codes[i] = code; } count = inp->pfp_event_count; for(i=0; i < count; i++) { for (j=0; j < NCANCEL_EVENTS; j++) { pfm_get_event_code(inp->pfp_events[i].event, &code); if (code == cancel_codes[j]) { if (idx != -1) { return PFMLIB_ERR_INVAL; } idx = inp->pfp_events[i].event; } } } return PFMLIB_SUCCESS; } /* * Automatically dispatch events to corresponding counters following constraints. 
*/ static unsigned int l2d_set1_cnts[]={ 4, 5, 8 }; static unsigned int l2d_set2_cnts[]={ 6, 7, 9 }; static int pfm_mont_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfmlib_mont_input_param_t *param = mod_in; pfm_mont_pmc_reg_t reg; pfmlib_event_t *e; pfmlib_reg_t *pc, *pd; pfmlib_regmask_t avail_cntrs, impl_cntrs; unsigned int i,j, k, max_cnt; unsigned int assign[PMU_MONT_NUM_COUNTERS]; unsigned int m, cnt; unsigned int l1d_set; unsigned long l2d_set1_mask, l2d_set2_mask, evt_mask, mesi; unsigned long not_assigned_events, cnt_mask; int l2d_set1_p, l2d_set2_p; int ret; e = inp->pfp_events; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; cnt = inp->pfp_event_count; if (PFMLIB_DEBUG()) for (m=0; m < cnt; m++) { DPRINT("ev[%d]=%s counters=0x%lx\n", m, montecito_pe[e[m].event].pme_name, montecito_pe[e[m].event].pme_counters); } if (cnt > PMU_MONT_NUM_COUNTERS) return PFMLIB_ERR_TOOMANY; l1d_set = UNEXISTING_SET; ret = check_cross_groups(inp, &l1d_set, &l2d_set1_mask, &l2d_set2_mask); if (ret != PFMLIB_SUCCESS) return ret; ret = check_cancel_events(inp); if (ret != PFMLIB_SUCCESS) return ret; /* * at this point, we know that: * - we have at most 1 L1D set * - we have at most 2 L2D sets * - cancel events are compatible */ DPRINT("l1d_set=%u l2d_set1_mask=0x%lx l2d_set2_mask=0x%lx\n", l1d_set, l2d_set1_mask, l2d_set2_mask); /* * first, place L1D cache event in PMC5 * * this is the strongest constraint */ pfm_get_impl_counters(&impl_cntrs); pfm_regmask_andnot(&avail_cntrs, &impl_cntrs, &inp->pfp_unavail_pmcs); not_assigned_events = 0; DPRINT("avail_cntrs=0x%lx\n", avail_cntrs.bits[0]); /* * we do not check ALL_THRD here because at least * one event has to be in PMC5 for this group */ if (l1d_set != UNEXISTING_SET) { if (!pfm_regmask_isset(&avail_cntrs, 5)) return PFMLIB_ERR_NOASSIGN; assign[l1d_set] = 5; pfm_regmask_clr(&avail_cntrs, 5); } l2d_set1_p = l2d_set2_p = 0; /* * assign L2D set1 and set2 counters */ for 
(i=0; i < cnt ; i++) { evt_mask = 1UL << i; /* * place l2d set1 events. First 3 go to designated * counters, the rest is placed elsewhere in the final * pass */ if (l2d_set1_p < 3 && (l2d_set1_mask & evt_mask)) { assign[i] = l2d_set1_cnts[l2d_set1_p]; if (!pfm_regmask_isset(&avail_cntrs, assign[i])) return PFMLIB_ERR_NOASSIGN; pfm_regmask_clr(&avail_cntrs, assign[i]); l2d_set1_p++; continue; } /* * same as above but for l2d set2 */ if (l2d_set2_p < 3 && (l2d_set2_mask & evt_mask)) { assign[i] = l2d_set2_cnts[l2d_set2_p]; if (!pfm_regmask_isset(&avail_cntrs, assign[i])) return PFMLIB_ERR_NOASSIGN; pfm_regmask_clr(&avail_cntrs, assign[i]); l2d_set2_p++; continue; } /* * if not l2d nor l1d, then defer placement until final pass */ if (i != l1d_set) not_assigned_events |= evt_mask; DPRINT("phase 1: i=%u avail_cntrs=0x%lx l2d_set1_p=%d l2d_set2_p=%d not_assigned=0x%lx\n", i, avail_cntrs.bits[0], l2d_set1_p, l2d_set2_p, not_assigned_events); } /* * assign BUS_* ER_* events (work only in PMC4-PMC9) */ evt_mask = not_assigned_events; for (i=0; evt_mask ; i++, evt_mask >>=1) { if ((evt_mask & 0x1) == 0) continue; cnt_mask = montecito_pe[e[i].event].pme_counters; /* * only interested in events with restricted set of counters */ if (cnt_mask == 0xfff0) continue; for(j=0; cnt_mask; j++, cnt_mask >>=1) { if ((cnt_mask & 0x1) == 0) continue; DPRINT("phase 2: i=%d j=%d cnt_mask=0x%lx avail_cntrs=0x%lx not_assigned_evnts=0x%lx\n", i, j, cnt_mask, avail_cntrs.bits[0], not_assigned_events); if (!pfm_regmask_isset(&avail_cntrs, j)) continue; assign[i] = j; not_assigned_events &= ~(1UL << i); pfm_regmask_clr(&avail_cntrs, j); break; } if (cnt_mask == 0) return PFMLIB_ERR_NOASSIGN; } /* * assign the rest of the events (no constraints) */ evt_mask = not_assigned_events; max_cnt = PMU_MONT_FIRST_COUNTER + PMU_MONT_NUM_COUNTERS; for (i=0, j=0; evt_mask ; i++, evt_mask >>=1) { DPRINT("phase 3a: i=%d j=%d evt_mask=0x%lx avail_cntrs=0x%lx not_assigned_evnts=0x%lx\n", i, j, evt_mask, 
avail_cntrs.bits[0], not_assigned_events); if ((evt_mask & 0x1) == 0) continue; while(j < max_cnt && !pfm_regmask_isset(&avail_cntrs, j)) { DPRINT("phase 3: i=%d j=%d evt_mask=0x%lx avail_cntrs=0x%lx not_assigned_evnts=0x%lx\n", i, j, evt_mask, avail_cntrs.bits[0], not_assigned_events); j++; } if (j == max_cnt) return PFMLIB_ERR_NOASSIGN; assign[i] = j; j++; } for (j=0; j < cnt ; j++ ) { mesi = 0; /* * XXX: we do not support .all placement just yet */ if (param && param->pfp_mont_counters[j].flags & PFMLIB_MONT_FL_EVT_ALL_THRD) { DPRINT(".all mode is not yet supported by libpfm\n"); return PFMLIB_ERR_NOTSUPP; } if (has_mesi(e[j].event)) { for(k=0;k< e[j].num_masks; k++) { mesi |= 1UL << e[j].unit_masks[k]; } /* by default we capture everything */ if (mesi == 0) mesi = 0xf; } reg.pmc_val = 0; /* clear all, bits 26-27 must be zero for proper operations */ /* if plm is 0, then assume not specified per-event and use default */ reg.pmc_plm = inp->pfp_events[j].plm ? inp->pfp_events[j].plm : inp->pfp_dfl_plm; reg.pmc_oi = 0; /* let the user/OS deal with this field */ reg.pmc_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc_thres = param ? param->pfp_mont_counters[j].thres: 0; reg.pmc_ism = 0x2; /* force IA-64 mode */ reg.pmc_umask = is_ear(e[j].event) ? 0x0 : montecito_pe[e[j].event].pme_umask; reg.pmc_es = montecito_pe[e[j].event].pme_code; reg.pmc_all = 0; /* XXX force self for now */ reg.pmc_m = (mesi>>3) & 0x1; reg.pmc_e = (mesi>>2) & 0x1; reg.pmc_s = (mesi>>1) & 0x1; reg.pmc_i = mesi & 0x1; /* * Note that we don't force PMC4.pmc_ena = 1 because the kernel takes care of this for us. 
* This way we don't have to program something in PMC4 even when we don't use it */ pc[j].reg_num = assign[j]; pc[j].reg_value = reg.pmc_val; pc[j].reg_addr = pc[j].reg_alt_addr = assign[j]; pd[j].reg_num = assign[j]; pd[j].reg_addr = pd[j].reg_alt_addr = assign[j]; __pfm_vbprintf("[PMC%u(pmc%u)=0x%06lx m=%d e=%d s=%d i=%d thres=%d all=%d es=0x%02x plm=%d umask=0x%x pm=%d ism=0x%x oi=%d] %s\n", assign[j], assign[j], reg.pmc_val, reg.pmc_m, reg.pmc_e, reg.pmc_s, reg.pmc_i, reg.pmc_thres, reg.pmc_all, reg.pmc_es,reg.pmc_plm, reg.pmc_umask, reg.pmc_pm, reg.pmc_ism, reg.pmc_oi, montecito_pe[e[j].event].pme_name); __pfm_vbprintf("[PMD%u(pmd%u)]\n", pd[j].reg_num, pd[j].reg_num); } /* number of PMC registers programmed */ outp->pfp_pmc_count = cnt; outp->pfp_pmd_count = cnt; return PFMLIB_SUCCESS; } static int pfm_dispatch_iear(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfm_mont_pmc_reg_t reg; pfmlib_mont_input_param_t *param = mod_in; pfmlib_reg_t *pc, *pd; pfmlib_mont_input_param_t fake_param; unsigned int pos1, pos2; unsigned int i, count; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; pos1 = outp->pfp_pmc_count; pos2 = outp->pfp_pmd_count; count = inp->pfp_event_count; for (i=0; i < count; i++) { if (is_iear(inp->pfp_events[i].event)) break; } if (param == NULL || param->pfp_mont_iear.ear_used == 0) { /* * case 3: no I-EAR event, no (or nothing) in param->pfp_mont_iear.ear_used */ if (i == count) return PFMLIB_SUCCESS; memset(&fake_param, 0, sizeof(fake_param)); param = &fake_param; /* * case 1: extract all information for event (name) */ pfm_mont_get_ear_mode(inp->pfp_events[i].event, &param->pfp_mont_iear.ear_mode); param->pfp_mont_iear.ear_umask = evt_umask(inp->pfp_events[i].event); DPRINT("I-EAR event with no info\n"); } /* * case 2: ear_used=1, event is defined, we use the param info as it is more precise * case 4: ear_used=1, no event (free running I-EAR), use param info */ reg.pmc_val = 0; if
(param->pfp_mont_iear.ear_mode == PFMLIB_MONT_EAR_TLB_MODE) { /* if plm is 0, then assume not specified per-event and use default */ reg.pmc37_mont_tlb_reg.iear_plm = param->pfp_mont_iear.ear_plm ? param->pfp_mont_iear.ear_plm : inp->pfp_dfl_plm; reg.pmc37_mont_tlb_reg.iear_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc37_mont_tlb_reg.iear_ct = 0x0; reg.pmc37_mont_tlb_reg.iear_umask = param->pfp_mont_iear.ear_umask; } else if (param->pfp_mont_iear.ear_mode == PFMLIB_MONT_EAR_CACHE_MODE) { /* if plm is 0, then assume not specified per-event and use default */ reg.pmc37_mont_cache_reg.iear_plm = param->pfp_mont_iear.ear_plm ? param->pfp_mont_iear.ear_plm : inp->pfp_dfl_plm; reg.pmc37_mont_cache_reg.iear_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc37_mont_cache_reg.iear_ct = 0x1; reg.pmc37_mont_cache_reg.iear_umask = param->pfp_mont_iear.ear_umask; } else { DPRINT("ALAT mode not supported in I-EAR mode\n"); return PFMLIB_ERR_INVAL; } if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 37)) return PFMLIB_ERR_NOASSIGN; pc[pos1].reg_num = 37; /* PMC37 is I-EAR config register */ pc[pos1].reg_value = reg.pmc_val; pc[pos1].reg_addr = pc[pos1].reg_alt_addr = 37; pos1++; pd[pos2].reg_num = 34; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 34; pos2++; pd[pos2].reg_num = 35; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 35; pos2++; if (param->pfp_mont_iear.ear_mode == PFMLIB_MONT_EAR_TLB_MODE) { __pfm_vbprintf("[PMC37(pmc37)=0x%lx ctb=tlb plm=%d pm=%d umask=0x%x]\n", reg.pmc_val, reg.pmc37_mont_tlb_reg.iear_plm, reg.pmc37_mont_tlb_reg.iear_pm, reg.pmc37_mont_tlb_reg.iear_umask); } else { __pfm_vbprintf("[PMC37(pmc37)=0x%lx ctb=cache plm=%d pm=%d umask=0x%x]\n", reg.pmc_val, reg.pmc37_mont_cache_reg.iear_plm, reg.pmc37_mont_cache_reg.iear_pm, reg.pmc37_mont_cache_reg.iear_umask); } __pfm_vbprintf("[PMD34(pmd34)]\n[PMD35(pmd35)]\n"); /* update final number of entries used */ outp->pfp_pmc_count = pos1; outp->pfp_pmd_count = pos2; return PFMLIB_SUCCESS; }
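/*
 * Usage sketch (illustrative only, not part of the dispatch code; the field
 * values chosen below are arbitrary examples, but the structures, fields and
 * constants all appear in this file). A caller would typically request a
 * free-running I-EAR in cache mode by filling the Montecito-specific input
 * parameters before calling pfm_dispatch_events():
 *
 *	pfmlib_mont_input_param_t mod_in;
 *
 *	memset(&mod_in, 0, sizeof(mod_in));
 *	mod_in.pfp_mont_iear.ear_used  = 1;
 *	mod_in.pfp_mont_iear.ear_mode  = PFMLIB_MONT_EAR_CACHE_MODE;
 *	mod_in.pfp_mont_iear.ear_umask = 0x5;	// example latency umask
 *
 * pfm_dispatch_iear() above then takes case 4 (ear_used=1, no I-EAR event in
 * the event list), programs PMC37 from these values and selects PMD34/PMD35.
 */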
static int pfm_dispatch_dear(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfm_mont_pmc_reg_t reg; pfmlib_mont_input_param_t *param = mod_in; pfmlib_reg_t *pc, *pd; pfmlib_mont_input_param_t fake_param; unsigned int pos1, pos2; unsigned int i, count; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; pos1 = outp->pfp_pmc_count; pos2 = outp->pfp_pmd_count; count = inp->pfp_event_count; for (i=0; i < count; i++) { if (is_dear(inp->pfp_events[i].event)) break; } if (param == NULL || param->pfp_mont_dear.ear_used == 0) { /* * case 3: no D-EAR event, no (or nothing) in param->pfp_mont_dear.ear_used */ if (i == count) return PFMLIB_SUCCESS; memset(&fake_param, 0, sizeof(fake_param)); param = &fake_param; /* * case 1: extract all information for event (name) */ pfm_mont_get_ear_mode(inp->pfp_events[i].event, &param->pfp_mont_dear.ear_mode); param->pfp_mont_dear.ear_umask = evt_umask(inp->pfp_events[i].event); DPRINT("D-EAR event with no info\n"); } /* sanity check on the mode */ if ( param->pfp_mont_dear.ear_mode != PFMLIB_MONT_EAR_CACHE_MODE && param->pfp_mont_dear.ear_mode != PFMLIB_MONT_EAR_TLB_MODE && param->pfp_mont_dear.ear_mode != PFMLIB_MONT_EAR_ALAT_MODE) return PFMLIB_ERR_INVAL; /* * case 2: ear_used=1, event is defined, we use the param info as it is more precise * case 4: ear_used=1, no event (free running D-EAR), use param info */ reg.pmc_val = 0; /* if plm is 0, then assume not specified per-event and use default */ reg.pmc40_mont_reg.dear_plm = param->pfp_mont_dear.ear_plm ? param->pfp_mont_dear.ear_plm : inp->pfp_dfl_plm; reg.pmc40_mont_reg.dear_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ?
1 : 0; reg.pmc40_mont_reg.dear_mode = param->pfp_mont_dear.ear_mode; reg.pmc40_mont_reg.dear_umask = param->pfp_mont_dear.ear_umask; reg.pmc40_mont_reg.dear_ism = 0x2; /* force IA-64 mode */ if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 40)) return PFMLIB_ERR_NOASSIGN; pc[pos1].reg_num = 40; /* PMC40 is D-EAR config register */ pc[pos1].reg_value = reg.pmc_val; pc[pos1].reg_addr = pc[pos1].reg_alt_addr = 40; pos1++; pd[pos2].reg_num = 32; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 32; pos2++; pd[pos2].reg_num = 33; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 33; pos2++; pd[pos2].reg_num = 36; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 36; pos2++; __pfm_vbprintf("[PMC40(pmc40)=0x%lx mode=%s plm=%d pm=%d ism=0x%x umask=0x%x]\n", reg.pmc_val, reg.pmc40_mont_reg.dear_mode == 0 ? "L1D" : (reg.pmc40_mont_reg.dear_mode == 1 ? "L1DTLB" : "ALAT"), reg.pmc40_mont_reg.dear_plm, reg.pmc40_mont_reg.dear_pm, reg.pmc40_mont_reg.dear_ism, reg.pmc40_mont_reg.dear_umask); __pfm_vbprintf("[PMD32(pmd32)]\n[PMD33(pmd33)]\n[PMD36(pmd36)]\n"); /* update final number of entries used */ outp->pfp_pmc_count = pos1; outp->pfp_pmd_count = pos2; return PFMLIB_SUCCESS; } static int pfm_dispatch_opcm(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in, pfmlib_output_param_t *outp, pfmlib_mont_output_param_t *mod_out) { pfmlib_mont_input_param_t *param = mod_in; pfmlib_reg_t *pc = outp->pfp_pmcs; pfm_mont_pmc_reg_t reg1, reg2, pmc36; unsigned int i, has_1st_pair, has_2nd_pair, count; unsigned int pos = outp->pfp_pmc_count; int used_pmc32, used_pmc34; if (param == NULL) return PFMLIB_SUCCESS; #define PMC36_DFL_VAL 0xfffffff0 /* * mandatory default value for PMC36 as described in the documentation: * all monitoring is opcode constrained, so better make sure the match/mask * is set to match everything! It looks weird as a default value, but it is correct!
*/ pmc36.pmc_val = PMC36_DFL_VAL; reg1.pmc_val = 0x030f01ffffffffff; reg2.pmc_val = 0; used_pmc32 = param->pfp_mont_opcm1.opcm_used; used_pmc34 = param->pfp_mont_opcm2.opcm_used; /* * check if any feature is used. * PMC36 must be set up when opcode matching is used OR when code range restriction is used */ if (used_pmc32 == 0 && used_pmc34 == 0 && param->pfp_mont_irange.rr_used == 0) return 0; /* * check for rr_nbr_used to make sure that the range request produced something on output */ if (used_pmc32 || (param->pfp_mont_irange.rr_used && mod_out->pfp_mont_irange.rr_nbr_used) ) { /* * if not used, ignore all bits */ if (used_pmc32) { reg1.pmc32_34_mont_reg.opcm_mask = param->pfp_mont_opcm1.opcm_mask; reg1.pmc32_34_mont_reg.opcm_b = param->pfp_mont_opcm1.opcm_b; reg1.pmc32_34_mont_reg.opcm_f = param->pfp_mont_opcm1.opcm_f; reg1.pmc32_34_mont_reg.opcm_i = param->pfp_mont_opcm1.opcm_i; reg1.pmc32_34_mont_reg.opcm_m = param->pfp_mont_opcm1.opcm_m; reg2.pmc33_35_mont_reg.opcm_match = param->pfp_mont_opcm1.opcm_match; } if (param->pfp_mont_irange.rr_used) { reg1.pmc32_34_mont_reg.opcm_ig_ad = 0; reg1.pmc32_34_mont_reg.opcm_inv = param->pfp_mont_irange.rr_flags & PFMLIB_MONT_RR_INV ?
1 : 0; } else { /* clear range restriction fields when none is used */ reg1.pmc32_34_mont_reg.opcm_ig_ad = 1; reg1.pmc32_34_mont_reg.opcm_inv = 0; } if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 32)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 32; pc[pos].reg_value = reg1.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 32; pos++; /* * will be constrained by PMC32 */ if (used_pmc32) { if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 33)) return PFMLIB_ERR_NOASSIGN; /* * used pmc33 only when we have active opcode matching */ pc[pos].reg_num = 33; pc[pos].reg_value = reg2.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 33; pos++; has_1st_pair = has_2nd_pair = 0; count = inp->pfp_event_count; for(i=0; i < count; i++) { if (inp->pfp_events[i].event == PME_MONT_IA64_TAGGED_INST_RETIRED_IBRP0_PMC32_33) has_1st_pair=1; if (inp->pfp_events[i].event == PME_MONT_IA64_TAGGED_INST_RETIRED_IBRP2_PMC32_33) has_2nd_pair=1; } if (has_1st_pair || has_2nd_pair == 0) pmc36.pmc36_mont_reg.opcm_ch0_ig_opcm = 0; if (has_2nd_pair || has_1st_pair == 0) pmc36.pmc36_mont_reg.opcm_ch2_ig_opcm = 0; } __pfm_vbprintf("[PMC32(pmc32)=0x%lx m=%d i=%d f=%d b=%d mask=0x%lx inv=%d ig_ad=%d]\n", reg1.pmc_val, reg1.pmc32_34_mont_reg.opcm_m, reg1.pmc32_34_mont_reg.opcm_i, reg1.pmc32_34_mont_reg.opcm_f, reg1.pmc32_34_mont_reg.opcm_b, reg1.pmc32_34_mont_reg.opcm_mask, reg1.pmc32_34_mont_reg.opcm_inv, reg1.pmc32_34_mont_reg.opcm_ig_ad); if (used_pmc32) __pfm_vbprintf("[PMC33(pmc33)=0x%lx match=0x%lx]\n", reg2.pmc_val, reg2.pmc33_35_mont_reg.opcm_match); } /* * will be constrained by PMC34 */ if (used_pmc34) { reg1.pmc_val = 0x01ffffffffff; /* pmc34 default value */ reg2.pmc_val = 0; reg1.pmc32_34_mont_reg.opcm_mask = param->pfp_mont_opcm2.opcm_mask; reg1.pmc32_34_mont_reg.opcm_b = param->pfp_mont_opcm2.opcm_b; reg1.pmc32_34_mont_reg.opcm_f = param->pfp_mont_opcm2.opcm_f; reg1.pmc32_34_mont_reg.opcm_i = param->pfp_mont_opcm2.opcm_i; reg1.pmc32_34_mont_reg.opcm_m = param->pfp_mont_opcm2.opcm_m; 
reg2.pmc33_35_mont_reg.opcm_match = param->pfp_mont_opcm2.opcm_match; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 34)) return PFMLIB_ERR_NOASSIGN; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 35)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 34; pc[pos].reg_value = reg1.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 34; pos++; pc[pos].reg_num = 35; pc[pos].reg_value = reg2.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 35; pos++; has_1st_pair = has_2nd_pair = 0; count = inp->pfp_event_count; for(i=0; i < count; i++) { if (inp->pfp_events[i].event == PME_MONT_IA64_TAGGED_INST_RETIRED_IBRP1_PMC34_35) has_1st_pair=1; if (inp->pfp_events[i].event == PME_MONT_IA64_TAGGED_INST_RETIRED_IBRP3_PMC34_35) has_2nd_pair=1; } if (has_1st_pair || has_2nd_pair == 0) pmc36.pmc36_mont_reg.opcm_ch1_ig_opcm = 0; if (has_2nd_pair || has_1st_pair == 0) pmc36.pmc36_mont_reg.opcm_ch3_ig_opcm = 0; __pfm_vbprintf("[PMC34(pmc34)=0x%lx m=%d i=%d f=%d b=%d mask=0x%lx]\n", reg1.pmc_val, reg1.pmc32_34_mont_reg.opcm_m, reg1.pmc32_34_mont_reg.opcm_i, reg1.pmc32_34_mont_reg.opcm_f, reg1.pmc32_34_mont_reg.opcm_b, reg1.pmc32_34_mont_reg.opcm_mask); __pfm_vbprintf("[PMC35(pmc35)=0x%lx match=0x%lx]\n", reg2.pmc_val, reg2.pmc33_35_mont_reg.opcm_match); } if (pmc36.pmc_val != PMC36_DFL_VAL) { if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 36)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 36; pc[pos].reg_value = pmc36.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 36; pos++; __pfm_vbprintf("[PMC36(pmc36)=0x%lx ch0_ig_op=%d ch1_ig_op=%d ch2_ig_op=%d ch3_ig_op=%d]\n", pmc36.pmc_val, pmc36.pmc36_mont_reg.opcm_ch0_ig_opcm, pmc36.pmc36_mont_reg.opcm_ch1_ig_opcm, pmc36.pmc36_mont_reg.opcm_ch2_ig_opcm, pmc36.pmc36_mont_reg.opcm_ch3_ig_opcm); } outp->pfp_pmc_count = pos; return PFMLIB_SUCCESS; } static int pfm_dispatch_etb(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfmlib_event_t *e= inp->pfp_events; pfm_mont_pmc_reg_t reg; 
pfmlib_mont_input_param_t *param = mod_in; pfmlib_reg_t *pc, *pd; pfmlib_mont_input_param_t fake_param; int found_etb = 0, found_bad_dear = 0; int has_etb_param; unsigned int i, pos1, pos2; unsigned int count; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; pos1 = outp->pfp_pmc_count; pos2 = outp->pfp_pmd_count; /* * explicit ETB settings */ has_etb_param = param && param->pfp_mont_etb.etb_used; reg.pmc_val = 0UL; /* * we need to scan all events looking for DEAR ALAT/TLB due to incompatibility. * In this case PMC39 must be forced to zero */ count = inp->pfp_event_count; for (i=0; i < count; i++) { if (is_etb(e[i].event)) found_etb = 1; /* * keep track of the first ETB event */ /* look for D-EAR TLB or ALAT */ if (is_dear(e[i].event) && (is_ear_tlb(e[i].event) || is_ear_alat(e[i].event))) { found_bad_dear = 1; } } DPRINT("found_etb=%d found_bad_dear=%d\n", found_etb, found_bad_dear); /* * did not find D-EAR TLB/ALAT event, need to check param structure */ if (found_bad_dear == 0 && param && param->pfp_mont_dear.ear_used == 1) { if ( param->pfp_mont_dear.ear_mode == PFMLIB_MONT_EAR_TLB_MODE || param->pfp_mont_dear.ear_mode == PFMLIB_MONT_EAR_ALAT_MODE) found_bad_dear = 1; } /* * no explicit ETB event and no special case to deal with (cover part of case 3) */ if (found_etb == 0 && has_etb_param == 0 && found_bad_dear == 0) return PFMLIB_SUCCESS; if (has_etb_param == 0) { /* * case 3: no ETB event, etb_used=0 but found_bad_dear=1, need to clean up PMC39 */ if (found_etb == 0) goto assign_zero; /* * case 1: we have an ETB event but no param, default setting is to capture * all branches.
*/ memset(&fake_param, 0, sizeof(fake_param)); param = &fake_param; param->pfp_mont_etb.etb_tm = 0x3; /* all branches */ param->pfp_mont_etb.etb_ptm = 0x3; /* all branches */ param->pfp_mont_etb.etb_ppm = 0x3; /* all branches */ param->pfp_mont_etb.etb_brt = 0x0; /* all branches */ DPRINT("ETB event with no info\n"); } /* * case 2: ETB event in the list, param provided * case 4: no ETB event, param provided (free running mode) */ reg.pmc39_mont_reg.etbc_plm = param->pfp_mont_etb.etb_plm ? param->pfp_mont_etb.etb_plm : inp->pfp_dfl_plm; reg.pmc39_mont_reg.etbc_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc39_mont_reg.etbc_ds = 0; /* 1 is reserved */ reg.pmc39_mont_reg.etbc_tm = param->pfp_mont_etb.etb_tm & 0x3; reg.pmc39_mont_reg.etbc_ptm = param->pfp_mont_etb.etb_ptm & 0x3; reg.pmc39_mont_reg.etbc_ppm = param->pfp_mont_etb.etb_ppm & 0x3; reg.pmc39_mont_reg.etbc_brt = param->pfp_mont_etb.etb_brt & 0x3; /* * if DEAR-ALAT or DEAR-TLB is set then PMC39 must be set to zero (see documentation p. 87) * * D-EAR ALAT/TLB and ETB cannot be used at the same time. * From the documentation: PMC39 must be zero in this mode; otherwise the wrong IP is recorded for misses * coming right after a mispredicted branch. * * D-EAR cache is fine. */ assign_zero: if (found_bad_dear && reg.pmc_val != 0UL) return PFMLIB_ERR_EVTINCOMP; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 39)) return PFMLIB_ERR_NOASSIGN; pc[pos1].reg_num = 39; pc[pos1].reg_value = reg.pmc_val; pc[pos1].reg_addr = pc[pos1].reg_alt_addr = 39; pos1++; __pfm_vbprintf("[PMC39(pmc39)=0x%lx plm=%d pm=%d ds=%d tm=%d ptm=%d ppm=%d brt=%d]\n", reg.pmc_val, reg.pmc39_mont_reg.etbc_plm, reg.pmc39_mont_reg.etbc_pm, reg.pmc39_mont_reg.etbc_ds, reg.pmc39_mont_reg.etbc_tm, reg.pmc39_mont_reg.etbc_ptm, reg.pmc39_mont_reg.etbc_ppm, reg.pmc39_mont_reg.etbc_brt); /* * only add ETB PMDs when actually using the ETB.
* Not needed when dealing with D-EAR TLB and DEAR-ALAT * PMC39 restriction */ if (found_etb || has_etb_param) { pd[pos2].reg_num = 38; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 38; pos2++; pd[pos2].reg_num = 39; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 39; pos2++; __pfm_vbprintf("[PMD38(pmd38)]\n[PMD39(pmd39)]\n"); for(i=48; i < 64; i++, pos2++) { pd[pos2].reg_num = i; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = i; __pfm_vbprintf("[PMD%u(pmd%u)]\n", pd[pos2].reg_num, pd[pos2].reg_num); } } /* update final number of entries used */ outp->pfp_pmc_count = pos1; outp->pfp_pmd_count = pos2; return PFMLIB_SUCCESS; } static void do_normal_rr(unsigned long start, unsigned long end, pfmlib_reg_t *br, int nbr, int dir, int *idx, int *reg_idx, int plm) { unsigned long size, l_addr, c; unsigned long l_offs = 0, r_offs = 0; unsigned long l_size, r_size; dbreg_t db; int p2, l_nbr, r_nbr; if (nbr < 1 || end <= start) return; size = end - start; DPRINT("start=0x%016lx end=0x%016lx size=0x%lx bytes (%lu bundles) nbr=%d dir=%d\n", start, end, size, size >> 4, nbr, dir); p2 = pfm_ia64_fls(size); c = ALIGN_DOWN(end, p2); DPRINT("largest power of two possible: 2^%d=0x%lx, crossing=0x%016lx\n", p2, 1UL << p2, c); if ((c - (1UL<<p2)) >= start) { l_addr = c - (1UL << p2); } else { p2--; if ((c + (1UL<<p2)) <= end) { l_addr = c; } else { l_addr = c - (1UL << p2); } } l_size = l_addr - start; r_size = end - l_addr-(1UL<<p2); if (dir == 0 && l_size != 0 && nbr == 1) { p2++; l_addr = end - (1UL<<p2); if (PFMLIB_DEBUG()) { l_offs = start - l_addr; DPRINT(">>l_offs: 0x%lx\n", l_offs); } } else if (dir == 1 && r_size != 0 && nbr == 1) { p2++; l_addr = start; if (PFMLIB_DEBUG()) { r_offs = l_addr+(1UL<<p2) - end; DPRINT(">>r_offs: 0x%lx\n", r_offs); } } l_size = l_addr - start; r_size = end - l_addr-(1UL<<p2); if (PFMLIB_DEBUG()) { printf(">>largest chunk: 2^%d @0x%016lx-0x%016lx\n", p2, l_addr, l_addr+(1UL<<p2)); if (l_size && !l_offs) printf(">>before: 0x%016lx-0x%016lx\n", start, l_addr); if (r_size && !r_offs) printf(">>after : 0x%016lx-0x%016lx\n", l_addr+(1UL<<p2), end); } /* * program the largest chunk on the next available debug register pair */ db.val = 0; db.db.db_mask = ~((1UL<<p2)-1); db.db.db_plm = plm; br[*idx].reg_num = *reg_idx; br[*idx].reg_value = l_addr; br[*idx].reg_addr = br[*idx].reg_alt_addr = 1 + *reg_idx; br[*idx+1].reg_num = *reg_idx+1; br[*idx+1].reg_value = db.val; br[*idx+1].reg_addr = br[*idx+1].reg_alt_addr = 1 + *reg_idx + 1; *idx += 2; *reg_idx += 2; nbr--; r_nbr = l_nbr = nbr >>1; if (nbr & 0x1) { /* * our simple heuristic is: * we assign the largest number of registers to the largest * of the two chunks */ if (l_size > r_size) { l_nbr++; } else { r_nbr++; } } do_normal_rr(start, l_addr, br, l_nbr, 0, idx, reg_idx, plm); do_normal_rr(l_addr+(1UL<<p2), end, br, r_nbr, 1, idx, reg_idx, plm); } static void print_one_range(pfmlib_mont_input_rr_desc_t *in_rr, pfmlib_mont_output_rr_desc_t *out_rr, pfmlib_reg_t *dbr, int base_idx, int n_pairs, int fine_mode, unsigned int rr_flags) { dbreg_t d; unsigned long r_end; int j; __pfm_vbprintf("[0x%lx-0x%lx): %d pair(s)%s%s\n", in_rr->rr_start, in_rr->rr_end,
n_pairs, fine_mode ? ", fine_mode" : "", rr_flags & PFMLIB_MONT_RR_INV ? ", inversed" : ""); __pfm_vbprintf("start offset: -0x%lx end_offset: +0x%lx\n", out_rr->rr_soff, out_rr->rr_eoff); for (j=0; j < n_pairs; j++, base_idx+=2) { d.val = dbr[base_idx+1].reg_value; r_end = dbr[base_idx].reg_value+((~(d.db.db_mask)) & ~(0xffUL << 56)); if (fine_mode) __pfm_vbprintf("brp%u: db%u: 0x%016lx db%u: plm=0x%x mask=0x%016lx\n", dbr[base_idx].reg_num>>1, dbr[base_idx].reg_num, dbr[base_idx].reg_value, dbr[base_idx+1].reg_num, d.db.db_plm, d.db.db_mask); else __pfm_vbprintf("brp%u: db%u: 0x%016lx db%u: plm=0x%x mask=0x%016lx end=0x%016lx\n", dbr[base_idx].reg_num>>1, dbr[base_idx].reg_num, dbr[base_idx].reg_value, dbr[base_idx+1].reg_num, d.db.db_plm, d.db.db_mask, r_end); } } /* * base_idx = base register index to use (for IBRP1, base_idx = 2) */ static int compute_fine_rr(pfmlib_mont_input_rr_t *irr, int dfl_plm, int n, int *base_idx, pfmlib_mont_output_rr_t *orr) { int i; pfmlib_reg_t *br; pfmlib_mont_input_rr_desc_t *in_rr; pfmlib_mont_output_rr_desc_t *out_rr; unsigned long addr; int reg_idx; dbreg_t db; in_rr = irr->rr_limits; out_rr = orr->rr_infos; br = orr->rr_br+orr->rr_nbr_used; reg_idx = *base_idx; db.val = 0; db.db.db_mask = FINE_MODE_MASK; if (n > 2) return PFMLIB_ERR_IRRTOOMANY; for (i=0; i < n; i++, reg_idx += 2, in_rr++, br+= 4) { /* * setup lower limit pair * * because of the PMU can only see addresses on a 2-bundle boundary, we must align * down to the closest bundle-pair aligned address. 5 => 32-byte aligned address */ addr = ALIGN_DOWN(in_rr->rr_start, 5); out_rr->rr_soff = in_rr->rr_start - addr; /* * adjust plm for each range */ db.db.db_plm = in_rr->rr_plm ? 
in_rr->rr_plm : (unsigned long)dfl_plm; br[0].reg_num = reg_idx; br[0].reg_value = addr; br[0].reg_addr = br[0].reg_alt_addr = 1+reg_idx; br[1].reg_num = reg_idx+1; br[1].reg_value = db.val; br[1].reg_addr = br[1].reg_alt_addr = 1+reg_idx+1; /* * setup upper limit pair * * In fine mode, the bundle address stored in the upper limit debug * registers is included in the count, so we subtract 0x10 to exclude it. * * because of the PMU bug, we align the (corrected) end to the nearest * 32-byte aligned address + 0x10. With this correction and depending * on the correction, we may count one extra bundle pair. */ addr = in_rr->rr_end - 0x10; if ((addr & 0x1f) == 0) addr += 0x10; out_rr->rr_eoff = addr - in_rr->rr_end + 0x10; br[2].reg_num = reg_idx+4; br[2].reg_value = addr; br[2].reg_addr = br[2].reg_alt_addr = 1+reg_idx+4; br[3].reg_num = reg_idx+5; br[3].reg_value = db.val; br[3].reg_addr = br[3].reg_alt_addr = 1+reg_idx+5; if (PFMLIB_VERBOSE()) print_one_range(in_rr, out_rr, br, 0, 2, 1, irr->rr_flags); } orr->rr_nbr_used += i<<2; /* update base_idx, for subsequent calls */ *base_idx = reg_idx; return PFMLIB_SUCCESS; } /* * base_idx = base register index to use (for IBRP1, base_idx = 2) */ static int compute_single_rr(pfmlib_mont_input_rr_t *irr, int dfl_plm, int *base_idx, pfmlib_mont_output_rr_t *orr) { unsigned long size, end, start; unsigned long p_start, p_end; pfmlib_mont_input_rr_desc_t *in_rr; pfmlib_mont_output_rr_desc_t *out_rr; pfmlib_reg_t *br; dbreg_t db; int reg_idx; int l, m; in_rr = irr->rr_limits; out_rr = orr->rr_infos; br = orr->rr_br+orr->rr_nbr_used; start = in_rr->rr_start; end = in_rr->rr_end; size = end - start; reg_idx = *base_idx; l = pfm_ia64_fls(size); m = l; if (size & ((1UL << l)-1)) { if (l>62) { printf("range: [0x%lx-0x%lx] too big\n", start, end); return PFMLIB_ERR_IRRTOOBIG; } m++; } DPRINT("size=%ld, l=%d m=%d, internal: 0x%lx full: 0x%lx\n", size, l, m, 1UL << l, 1UL << m); for (; m < 64; m++) { p_start = ALIGN_DOWN(start, m); p_end = 
p_start+(1UL<<m); if (p_end >= end) goto found; } return PFMLIB_ERR_IRRINVAL; found: DPRINT("m=%d p_start=0x%lx p_end=0x%lx\n", m, p_start,p_end); /* when the event is not IA64_INST_RETIRED, then we MUST use ibrp0 */ br[0].reg_num = reg_idx; br[0].reg_value = p_start; br[0].reg_addr = br[0].reg_alt_addr = 1+reg_idx; db.val = 0; db.db.db_mask = ~((1UL << m)-1); db.db.db_plm = in_rr->rr_plm ? in_rr->rr_plm : (unsigned long)dfl_plm; br[1].reg_num = reg_idx + 1; br[1].reg_value = db.val; br[1].reg_addr = br[1].reg_alt_addr = 1+reg_idx+1; out_rr->rr_soff = start - p_start; out_rr->rr_eoff = p_end - end; if (PFMLIB_VERBOSE()) print_one_range(in_rr, out_rr, br, 0, 1, 0, irr->rr_flags); orr->rr_nbr_used += 2; /* update base_idx, for subsequent calls */ *base_idx = reg_idx; return PFMLIB_SUCCESS; } static int compute_normal_rr(pfmlib_mont_input_rr_t *irr, int dfl_plm, int n, int *base_idx, pfmlib_mont_output_rr_t *orr) { pfmlib_mont_input_rr_desc_t *in_rr; pfmlib_mont_output_rr_desc_t *out_rr; unsigned long r_end; pfmlib_reg_t *br; dbreg_t d; int i, j; int br_index, reg_idx, prev_index; in_rr = irr->rr_limits; out_rr = orr->rr_infos; br = orr->rr_br+orr->rr_nbr_used; reg_idx = *base_idx; br_index = 0; for (i=0; i < n; i++, in_rr++, out_rr++) { /* * running out of registers */ if (br_index == 8) break; prev_index = br_index; do_normal_rr( in_rr->rr_start, in_rr->rr_end, br, 4 - (reg_idx>>1), /* how many pairs available */ 0, &br_index, &reg_idx, in_rr->rr_plm ?
in_rr->rr_plm : dfl_plm); DPRINT("br_index=%d reg_idx=%d\n", br_index, reg_idx); /* * compute offsets */ out_rr->rr_soff = out_rr->rr_eoff = 0; for(j=prev_index; j < br_index; j+=2) { d.val = br[j+1].reg_value; r_end = br[j].reg_value+((~(d.db.db_mask)+1) & ~(0xffUL << 56)); if (br[j].reg_value <= in_rr->rr_start) out_rr->rr_soff = in_rr->rr_start - br[j].reg_value; if (r_end >= in_rr->rr_end) out_rr->rr_eoff = r_end - in_rr->rr_end; } if (PFMLIB_VERBOSE()) print_one_range(in_rr, out_rr, br, prev_index, (br_index-prev_index)>>1, 0, irr->rr_flags); } /* do not have enough registers to cover all the ranges */ if (br_index == 8 && i < n) return PFMLIB_ERR_TOOMANY; orr->rr_nbr_used += br_index; /* update base_idx, for subsequent calls */ *base_idx = reg_idx; return PFMLIB_SUCCESS; } static int pfm_dispatch_irange(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in, pfmlib_output_param_t *outp, pfmlib_mont_output_param_t *mod_out) { pfm_mont_pmc_reg_t reg; pfmlib_mont_input_param_t *param = mod_in; pfmlib_mont_input_rr_t *irr; pfmlib_mont_output_rr_t *orr; pfmlib_reg_t *pc = outp->pfp_pmcs; unsigned long retired_mask; unsigned int i, pos = outp->pfp_pmc_count, count; unsigned int retired_only, retired_count, fine_mode, prefetch_count; unsigned int n_intervals; int base_idx = 0, dup = 0; int ret; if (param == NULL) return PFMLIB_SUCCESS; if (param->pfp_mont_irange.rr_used == 0) return PFMLIB_SUCCESS; if (mod_out == NULL) return PFMLIB_ERR_INVAL; irr = ¶m->pfp_mont_irange; orr = &mod_out->pfp_mont_irange; ret = check_intervals(irr, 0, &n_intervals); if (ret != PFMLIB_SUCCESS) return ret; if (n_intervals < 1) return PFMLIB_ERR_IRRINVAL; retired_count = check_inst_retired_events(inp, &retired_mask); retired_only = retired_count == inp->pfp_event_count; fine_mode = irr->rr_flags & PFMLIB_MONT_RR_NO_FINE_MODE ? 
0 : check_fine_mode_possible(irr, n_intervals); DPRINT("n_intervals=%d retired_only=%d retired_count=%d fine_mode=%d\n", n_intervals, retired_only, retired_count, fine_mode); /* * On montecito, there are more constraints on what can be measured with irange. * * - The fine mode is the best because you directly set the lower and upper limits of * the range. This uses 2 ibr pairs per range (ibrp0/ibrp2 and ibrp1/ibrp3). Therefore * at most 2 fine mode ranges can be defined. The boundaries of the range must be in the * same 64KB page. The fine mode works with all events. * * - if the fine mode fails, then for all events, except IA64_TAGGED_INST_RETIRED_*, only * the first pair of ibr is available: ibrp0. This imposes some severe restrictions on the * size and alignment of the range. It can be bigger than 64KB and must be properly aligned * on its size. The library relaxes these constraints by allowing the covered areas to be * larger than the expected range. It may start before and end after the requested range. * You can determine the amount of overrun in either direction for each range by looking at * the rr_soff (start offset) and rr_eoff (end offset). * * - if the events include certain prefetch events then only IBRP1 can be used. * See 3.3.5.2 Exception 1. * * - Finally, when the events are ONLY IA64_TAGGED_INST_RETIRED_* then all IBR pairs can be used * to cover the range, giving us more flexibility to approximate the range when it is not * properly aligned on its size (see 10.3.5.2 Exception 2). But the corresponding * IA64_TAGGED_INST_RETIRED_* must be present.
*/ if (fine_mode == 0 && retired_only == 0 && n_intervals > 1) return PFMLIB_ERR_IRRTOOMANY; /* we do not default to non-fine mode to support more ranges */ if (n_intervals > 2 && fine_mode == 1) return PFMLIB_ERR_IRRTOOMANY; ret = check_prefetch_events(inp, irr, &prefetch_count, &base_idx, &dup); if (ret) return ret; DPRINT("prefetch_count=%u base_idx=%d dup=%d\n", prefetch_count, base_idx, dup); /* * CPU_OP_CYCLES.QUAL supports code range restrictions but it returns * meaningful values (fine/coarse mode) only when IBRP1 is not used. */ if ((base_idx > 0 || dup) && has_cpu_cycles_qual(inp)) return PFMLIB_ERR_FEATCOMB; if (fine_mode == 0) { if (retired_only) { /* can take multiple intervals */ ret = compute_normal_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr); } else { /* unless we have only prefetch and instruction retired events, * we cannot satisfy the request because the other events cannot * be measured on anything but IBRP0. */ if ((prefetch_count+retired_count) != inp->pfp_event_count) return PFMLIB_ERR_FEATCOMB; ret = compute_single_rr(irr, inp->pfp_dfl_plm, &base_idx, orr); if (ret == PFMLIB_SUCCESS && dup) ret = compute_single_rr(irr, inp->pfp_dfl_plm, &base_idx, orr); } } else { if (prefetch_count && n_intervals != 1) return PFMLIB_ERR_IRRTOOMANY; /* except if retired_only, can take only one interval */ ret = compute_fine_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr); if (ret == PFMLIB_SUCCESS && dup) ret = compute_fine_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr); } if (ret != PFMLIB_SUCCESS) return ret == PFMLIB_ERR_TOOMANY ? 
PFMLIB_ERR_IRRTOOMANY : ret; reg.pmc_val = 0xdb6; /* default value */ count = orr->rr_nbr_used; for (i=0; i < count; i++) { switch(orr->rr_br[i].reg_num) { case 0: reg.pmc38_mont_reg.iarc_ig_ibrp0 = 0; break; case 2: reg.pmc38_mont_reg.iarc_ig_ibrp1 = 0; break; case 4: reg.pmc38_mont_reg.iarc_ig_ibrp2 = 0; break; case 6: reg.pmc38_mont_reg.iarc_ig_ibrp3 = 0; break; } } if (fine_mode) { reg.pmc38_mont_reg.iarc_fine = 1; } else if (retired_only) { /* * we need to check that the user provided all the events needed to cover * all the ibr pairs used to cover the range */ if ((retired_mask & 0x1) == 0 && reg.pmc38_mont_reg.iarc_ig_ibrp0 == 0) return PFMLIB_ERR_IRRINVAL; if ((retired_mask & 0x2) == 0 && reg.pmc38_mont_reg.iarc_ig_ibrp1 == 0) return PFMLIB_ERR_IRRINVAL; if ((retired_mask & 0x4) == 0 && reg.pmc38_mont_reg.iarc_ig_ibrp2 == 0) return PFMLIB_ERR_IRRINVAL; if ((retired_mask & 0x8) == 0 && reg.pmc38_mont_reg.iarc_ig_ibrp3 == 0) return PFMLIB_ERR_IRRINVAL; } if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 38)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 38; pc[pos].reg_value = reg.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 38; pos++; __pfm_vbprintf("[PMC38(pmc38)=0x%lx ig_ibrp0=%d ig_ibrp1=%d ig_ibrp2=%d ig_ibrp3=%d fine=%d]\n", reg.pmc_val, reg.pmc38_mont_reg.iarc_ig_ibrp0, reg.pmc38_mont_reg.iarc_ig_ibrp1, reg.pmc38_mont_reg.iarc_ig_ibrp2, reg.pmc38_mont_reg.iarc_ig_ibrp3, reg.pmc38_mont_reg.iarc_fine); outp->pfp_pmc_count = pos; return PFMLIB_SUCCESS; } static const unsigned long iod_tab[8]={ /* --- */ 3, /* --D */ 2, /* -O- */ 3, /* should not be used */ /* -OD */ 0, /* =IOD safe because default IBR is harmless */ /* I-- */ 1, /* =IO safe because by default OPC is turned off */ /* I-D */ 0, /* =IOD safe because by default OPC is turned off */ /* IO- */ 1, /* IOD */ 0 }; /* * IMPORTANT: MUST BE CALLED *AFTER* pfm_dispatch_irange() to make sure we see * the irange programming to adjust pmc41. 
*/ static int pfm_dispatch_drange(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in, pfmlib_output_param_t *outp, pfmlib_mont_output_param_t *mod_out) { pfmlib_mont_input_param_t *param = mod_in; pfmlib_reg_t *pc = outp->pfp_pmcs; pfmlib_mont_input_rr_t *irr; pfmlib_mont_output_rr_t *orr, *orr2; pfm_mont_pmc_reg_t pmc38; pfm_mont_pmc_reg_t reg; unsigned int i, pos = outp->pfp_pmc_count; int iod_codes[4], dfl_val_pmc32, dfl_val_pmc34; unsigned int n_intervals; int ret; int base_idx = 0; int fine_mode = 0; #define DR_USED 0x1 /* data range is used */ #define OP_USED 0x2 /* opcode matching is used */ #define IR_USED 0x4 /* code range is used */ if (param == NULL) return PFMLIB_SUCCESS; /* * if only pmc32/pmc33 opcode matching is used, we do not need to change * the default value of pmc41 regardless of the events being measured. */ if ( param->pfp_mont_drange.rr_used == 0 && param->pfp_mont_irange.rr_used == 0) return PFMLIB_SUCCESS; /* * it seems like the ignored bits need to have special values * otherwise this does not work. */ reg.pmc_val = 0x2078fefefefe; /* * initialize iod codes */ iod_codes[0] = iod_codes[1] = iod_codes[2] = iod_codes[3] = 0; /* * setup default iod value, we need to separate because * if drange is used we do not know in advance which DBR will be used * therefore we need to apply dfl_val later */ dfl_val_pmc32 = param->pfp_mont_opcm1.opcm_used ? OP_USED : 0; dfl_val_pmc34 = param->pfp_mont_opcm2.opcm_used ? OP_USED : 0; if (param->pfp_mont_drange.rr_used == 1) { if (mod_out == NULL) return PFMLIB_ERR_INVAL; irr = &param->pfp_mont_drange; orr = &mod_out->pfp_mont_drange; ret = check_intervals(irr, 1, &n_intervals); if (ret != PFMLIB_SUCCESS) return ret; if (n_intervals < 1) return PFMLIB_ERR_DRRINVAL; ret = compute_normal_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr); if (ret != PFMLIB_SUCCESS) { return ret == PFMLIB_ERR_TOOMANY ? PFMLIB_ERR_DRRTOOMANY : ret; } /* * Update iod_codes to reflect the use of the DBR constraint. 
*/ for (i=0; i < orr->rr_nbr_used; i++) { if (orr->rr_br[i].reg_num == 0) iod_codes[0] |= DR_USED | dfl_val_pmc32; if (orr->rr_br[i].reg_num == 2) iod_codes[1] |= DR_USED | dfl_val_pmc34; if (orr->rr_br[i].reg_num == 4) iod_codes[2] |= DR_USED | dfl_val_pmc32; if (orr->rr_br[i].reg_num == 6) iod_codes[3] |= DR_USED | dfl_val_pmc34; } } /* * XXX: assume dispatch_irange executed before calling this function */ if (param->pfp_mont_irange.rr_used == 1) { if (mod_out == NULL) return PFMLIB_ERR_INVAL; orr2 = &mod_out->pfp_mont_irange; /* * we need to find out whether or not the irange is using * fine mode. If this is the case, then we only need to * program pmc41 for the ibr pairs which designate the lower * bounds of a range. For instance, if IBRP0/IBRP2 are used, * then we only need to program pmc13.cfg_dbrp0 and pmc13.ena_dbrp0, * the PMU will automatically use IBRP2, even though pmc13.ena_dbrp2=0. */ for(i=0; i < pos; i++) { if (pc[i].reg_num == 38) { pmc38.pmc_val = pc[i].reg_value; if (pmc38.pmc38_mont_reg.iarc_fine == 1) fine_mode = 1; break; } } /* * Update to reflect the use of the IBR constraint */ for (i=0; i < orr2->rr_nbr_used; i++) { if (orr2->rr_br[i].reg_num == 0) iod_codes[0] |= IR_USED | dfl_val_pmc32; if (orr2->rr_br[i].reg_num == 2) iod_codes[1] |= IR_USED | dfl_val_pmc34; if (fine_mode == 0 && orr2->rr_br[i].reg_num == 4) iod_codes[2] |= IR_USED | dfl_val_pmc32; if (fine_mode == 0 && orr2->rr_br[i].reg_num == 6) iod_codes[3] |= IR_USED | dfl_val_pmc34; } } if (param->pfp_mont_irange.rr_used == 0 && param->pfp_mont_drange.rr_used == 0) { iod_codes[0] = iod_codes[2] = dfl_val_pmc32; iod_codes[1] = iod_codes[3] = dfl_val_pmc34; } /* * update the cfg dbrpX field. If we put a constraint on a cfg dbrp, then * we must enable it in the corresponding ena_dbrpX */ reg.pmc41_mont_reg.darc_ena_dbrp0 = iod_codes[0] ? 1 : 0; reg.pmc41_mont_reg.darc_cfg_dtag0 = iod_tab[iod_codes[0]]; reg.pmc41_mont_reg.darc_ena_dbrp1 = iod_codes[1] ? 
1 : 0; reg.pmc41_mont_reg.darc_cfg_dtag1 = iod_tab[iod_codes[1]]; reg.pmc41_mont_reg.darc_ena_dbrp2 = iod_codes[2] ? 1 : 0; reg.pmc41_mont_reg.darc_cfg_dtag2 = iod_tab[iod_codes[2]]; reg.pmc41_mont_reg.darc_ena_dbrp3 = iod_codes[3] ? 1 : 0; reg.pmc41_mont_reg.darc_cfg_dtag3 = iod_tab[iod_codes[3]]; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 41)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 41; pc[pos].reg_value = reg.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 41; pos++; __pfm_vbprintf("[PMC41(pmc41)=0x%lx cfg_dtag0=%d cfg_dtag1=%d cfg_dtag2=%d cfg_dtag3=%d ena_dbrp0=%d ena_dbrp1=%d ena_dbrp2=%d ena_dbrp3=%d]\n", reg.pmc_val, reg.pmc41_mont_reg.darc_cfg_dtag0, reg.pmc41_mont_reg.darc_cfg_dtag1, reg.pmc41_mont_reg.darc_cfg_dtag2, reg.pmc41_mont_reg.darc_cfg_dtag3, reg.pmc41_mont_reg.darc_ena_dbrp0, reg.pmc41_mont_reg.darc_ena_dbrp1, reg.pmc41_mont_reg.darc_ena_dbrp2, reg.pmc41_mont_reg.darc_ena_dbrp3); outp->pfp_pmc_count = pos; return PFMLIB_SUCCESS; } static int check_qualifier_constraints(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in) { pfmlib_mont_input_param_t *param = mod_in; pfmlib_event_t *e = inp->pfp_events; unsigned int i, count; count = inp->pfp_event_count; for(i=0; i < count; i++) { /* * skip check for counters which requested it. Use at your own risk. * Not all counters have necessarily been validated for use with * qualifiers. Typically the event is counted as if no constraint * existed. 
*/ if (param->pfp_mont_counters[i].flags & PFMLIB_MONT_FL_EVT_NO_QUALCHECK) continue; if (evt_use_irange(param) && has_iarr(e[i].event) == 0) return PFMLIB_ERR_FEATCOMB; if (evt_use_drange(param) && has_darr(e[i].event) == 0) return PFMLIB_ERR_FEATCOMB; if (evt_use_opcm(param) && has_opcm(e[i].event) == 0) return PFMLIB_ERR_FEATCOMB; } return PFMLIB_SUCCESS; } static int check_range_plm(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in) { pfmlib_mont_input_param_t *param = mod_in; unsigned int i, count; if (param->pfp_mont_drange.rr_used == 0 && param->pfp_mont_irange.rr_used == 0) return PFMLIB_SUCCESS; /* * range restriction applies to all events, therefore we must have a consistent * set of plm and they must match the pfp_dfl_plm which is used to setup the debug * registers */ count = inp->pfp_event_count; for(i=0; i < count; i++) { if (inp->pfp_events[i].plm && inp->pfp_events[i].plm != inp->pfp_dfl_plm) return PFMLIB_ERR_FEATCOMB; } return PFMLIB_SUCCESS; } static int pfm_dispatch_ipear(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfm_mont_pmc_reg_t reg; pfmlib_mont_input_param_t *param = mod_in; pfmlib_event_t *e = inp->pfp_events; pfmlib_reg_t *pc, *pd; unsigned int pos1, pos2; unsigned int i, count; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; pos1 = outp->pfp_pmc_count; pos2 = outp->pfp_pmd_count; /* * check if there is something to do */ if (param == NULL || param->pfp_mont_ipear.ipear_used == 0) return PFMLIB_SUCCESS; /* * we need to look for use of ETB, because IP-EAR and ETB cannot be used at the * same time */ if (param->pfp_mont_etb.etb_used) return PFMLIB_ERR_FEATCOMB; /* * look for implicit ETB used because of BRANCH_EVENT */ count = inp->pfp_event_count; for (i=0; i < count; i++) { if (is_etb(e[i].event)) return PFMLIB_ERR_FEATCOMB; } reg.pmc_val = 0; reg.pmc42_mont_reg.ipear_plm = param->pfp_mont_ipear.ipear_plm ? 
param->pfp_mont_ipear.ipear_plm : inp->pfp_dfl_plm; reg.pmc42_mont_reg.ipear_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc42_mont_reg.ipear_mode = 4; reg.pmc42_mont_reg.ipear_delay = param->pfp_mont_ipear.ipear_delay; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 42)) return PFMLIB_ERR_NOASSIGN; pc[pos1].reg_num = 42; pc[pos1].reg_value = reg.pmc_val; pc[pos1].reg_addr = pc[pos1].reg_alt_addr = 42; pos1++; __pfm_vbprintf("[PMC42(pmc42)=0x%lx plm=%d pm=%d mode=%d delay=%d]\n", reg.pmc_val, reg.pmc42_mont_reg.ipear_plm, reg.pmc42_mont_reg.ipear_pm, reg.pmc42_mont_reg.ipear_mode, reg.pmc42_mont_reg.ipear_delay); pd[pos2].reg_num = 38; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 38; pos2++; pd[pos2].reg_num = 39; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 39; pos2++; __pfm_vbprintf("[PMD38(pmd38)]\n[PMD39(pmd39)]\n"); for(i=48; i < 64; i++, pos2++) { pd[pos2].reg_num = i; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = i; __pfm_vbprintf("[PMD%u(pmd%u)]\n", pd[pos2].reg_num, pd[pos2].reg_num); } outp->pfp_pmc_count = pos1; outp->pfp_pmd_count = pos2; return PFMLIB_SUCCESS; } static int pfm_mont_dispatch_events(pfmlib_input_param_t *inp, void *model_in, pfmlib_output_param_t *outp, void *model_out) { int ret; pfmlib_mont_input_param_t *mod_in = (pfmlib_mont_input_param_t *)model_in; pfmlib_mont_output_param_t *mod_out = (pfmlib_mont_output_param_t *)model_out; /* * nothing will come out of this combination */ if (mod_out && mod_in == NULL) return PFMLIB_ERR_INVAL; /* check opcode match, range restriction qualifiers */ if (mod_in && check_qualifier_constraints(inp, mod_in) != PFMLIB_SUCCESS) return PFMLIB_ERR_FEATCOMB; /* check for problems with range restriction and per-event plm */ if (mod_in && check_range_plm(inp, mod_in) != PFMLIB_SUCCESS) return PFMLIB_ERR_FEATCOMB; ret = pfm_mont_dispatch_counters(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) return ret; /* now check for I-EAR */ ret = pfm_dispatch_iear(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) 
return ret; /* now check for D-EAR */ ret = pfm_dispatch_dear(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) return ret; /* XXX: must be done before dispatch_opcm() and dispatch_drange() */ ret = pfm_dispatch_irange(inp, mod_in, outp, mod_out); if (ret != PFMLIB_SUCCESS) return ret; ret = pfm_dispatch_drange(inp, mod_in, outp, mod_out); if (ret != PFMLIB_SUCCESS) return ret; /* now check for Opcode matchers */ ret = pfm_dispatch_opcm(inp, mod_in, outp, mod_out); if (ret != PFMLIB_SUCCESS) return ret; /* now check for ETB */ ret = pfm_dispatch_etb(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) return ret; /* now check for IP-EAR */ ret = pfm_dispatch_ipear(inp, mod_in, outp); return ret; } /* XXX: return value is also error code */ int pfm_mont_get_event_maxincr(unsigned int i, unsigned int *maxincr) { if (i >= PME_MONT_EVENT_COUNT || maxincr == NULL) return PFMLIB_ERR_INVAL; *maxincr = montecito_pe[i].pme_maxincr; return PFMLIB_SUCCESS; } int pfm_mont_is_ear(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_ear(i); } int pfm_mont_is_dear(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_dear(i); } int pfm_mont_is_dear_tlb(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_dear(i) && is_ear_tlb(i); } int pfm_mont_is_dear_cache(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_dear(i) && is_ear_cache(i); } int pfm_mont_is_dear_alat(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_ear_alat(i); } int pfm_mont_is_iear(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_iear(i); } int pfm_mont_is_iear_tlb(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_iear(i) && is_ear_tlb(i); } int pfm_mont_is_iear_cache(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_iear(i) && is_ear_cache(i); } int pfm_mont_is_etb(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_etb(i); } int pfm_mont_support_iarr(unsigned int i) { return i < PME_MONT_EVENT_COUNT && has_iarr(i); } int pfm_mont_support_darr(unsigned int i) { return i < 
PME_MONT_EVENT_COUNT && has_darr(i); } int pfm_mont_support_opcm(unsigned int i) { return i < PME_MONT_EVENT_COUNT && has_opcm(i); } int pfm_mont_support_all(unsigned int i) { return i < PME_MONT_EVENT_COUNT && has_all(i); } int pfm_mont_get_ear_mode(unsigned int i, pfmlib_mont_ear_mode_t *m) { pfmlib_mont_ear_mode_t r; if (!is_ear(i) || m == NULL) return PFMLIB_ERR_INVAL; r = PFMLIB_MONT_EAR_TLB_MODE; if (is_ear_tlb(i)) goto done; r = PFMLIB_MONT_EAR_CACHE_MODE; if (is_ear_cache(i)) goto done; r = PFMLIB_MONT_EAR_ALAT_MODE; if (is_ear_alat(i)) goto done; return PFMLIB_ERR_INVAL; done: *m = r; return PFMLIB_SUCCESS; } static int pfm_mont_get_event_code(unsigned int i, unsigned int cnt, int *code) { if (cnt != PFMLIB_CNT_FIRST && (cnt < 4 || cnt > 15)) return PFMLIB_ERR_INVAL; *code = (int)montecito_pe[i].pme_code; return PFMLIB_SUCCESS; } /* * This function is accessible directly to the user */ int pfm_mont_get_event_umask(unsigned int i, unsigned long *umask) { if (i >= PME_MONT_EVENT_COUNT || umask == NULL) return PFMLIB_ERR_INVAL; *umask = evt_umask(i); return PFMLIB_SUCCESS; } int pfm_mont_get_event_group(unsigned int i, int *grp) { if (i >= PME_MONT_EVENT_COUNT || grp == NULL) return PFMLIB_ERR_INVAL; *grp = evt_grp(i); return PFMLIB_SUCCESS; } int pfm_mont_get_event_set(unsigned int i, int *set) { if (i >= PME_MONT_EVENT_COUNT || set == NULL) return PFMLIB_ERR_INVAL; *set = evt_set(i) == 0xf ? 
PFMLIB_MONT_EVT_NO_SET : evt_set(i); return PFMLIB_SUCCESS; } int pfm_mont_get_event_type(unsigned int i, int *type) { if (i >= PME_MONT_EVENT_COUNT || type == NULL) return PFMLIB_ERR_INVAL; *type = evt_caf(i); return PFMLIB_SUCCESS; } /* external interface */ int pfm_mont_irange_is_fine(pfmlib_output_param_t *outp, pfmlib_mont_output_param_t *mod_out) { pfmlib_mont_output_param_t *param = mod_out; pfm_mont_pmc_reg_t reg; unsigned int i, count; /* some sanity checks */ if (outp == NULL || param == NULL) return 0; if (outp->pfp_pmc_count >= PFMLIB_MAX_PMCS) return 0; if (param->pfp_mont_irange.rr_nbr_used == 0) return 0; /* * we look for pmc38 as it contains the bit indicating if fine mode is used */ count = outp->pfp_pmc_count; for(i=0; i < count; i++) { if (outp->pfp_pmcs[i].reg_num == 38) goto found; } return 0; found: reg.pmc_val = outp->pfp_pmcs[i].reg_value; return reg.pmc38_mont_reg.iarc_fine ? 1 : 0; } static char * pfm_mont_get_event_name(unsigned int i) { return montecito_pe[i].pme_name; } static void pfm_mont_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { unsigned int i; unsigned long m; memset(counters, 0, sizeof(*counters)); m =montecito_pe[j].pme_counters; for(i=0; m ; i++, m>>=1) { if (m & 0x1) pfm_regmask_set(counters, i); } } static void pfm_mont_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs) { unsigned int i = 0; for(i=0; i < 16; i++) pfm_regmask_set(impl_pmcs, i); for(i=32; i < 43; i++) pfm_regmask_set(impl_pmcs, i); } static void pfm_mont_get_impl_pmds(pfmlib_regmask_t *impl_pmds) { unsigned int i = 0; for(i=4; i < 16; i++) pfm_regmask_set(impl_pmds, i); for(i=32; i < 40; i++) pfm_regmask_set(impl_pmds, i); for(i=48; i < 64; i++) pfm_regmask_set(impl_pmds, i); } static void pfm_mont_get_impl_counters(pfmlib_regmask_t *impl_counters) { unsigned int i = 0; /* counter pmds are contiguous */ for(i=4; i < 16; i++) pfm_regmask_set(impl_counters, i); } static void pfm_mont_get_hw_counter_width(unsigned int *width) { *width = 
PMU_MONT_COUNTER_WIDTH; } static int pfm_mont_get_event_description(unsigned int ev, char **str) { char *s; s = montecito_pe[ev].pme_desc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static int pfm_mont_get_cycle_event(pfmlib_event_t *e) { e->event = PME_MONT_CPU_OP_CYCLES_ALL; return PFMLIB_SUCCESS; } static int pfm_mont_get_inst_retired(pfmlib_event_t *e) { e->event = PME_MONT_IA64_INST_RETIRED; return PFMLIB_SUCCESS; } static unsigned int pfm_mont_get_num_event_masks(unsigned int event) { return has_mesi(event) ? 4 : 0; } static char * pfm_mont_get_event_mask_name(unsigned int event, unsigned int mask) { switch(mask) { case 0: return "I"; case 1: return "S"; case 2: return "E"; case 3: return "M"; } return NULL; } static int pfm_mont_get_event_mask_desc(unsigned int event, unsigned int mask, char **desc) { switch(mask) { case 0: *desc = strdup("invalid"); break; case 1: *desc = strdup("shared"); break; case 2: *desc = strdup("exclusive"); break; case 3: *desc = strdup("modified"); break; default: return PFMLIB_ERR_INVAL; } return PFMLIB_SUCCESS; } static int pfm_mont_get_event_mask_code(unsigned int event, unsigned int mask, unsigned int *code) { *code = mask; return PFMLIB_SUCCESS; } pfm_pmu_support_t montecito_support={ .pmu_name = "dual-core Itanium 2", .pmu_type = PFMLIB_MONTECITO_PMU, .pme_count = PME_MONT_EVENT_COUNT, .pmc_count = PMU_MONT_NUM_PMCS, .pmd_count = PMU_MONT_NUM_PMDS, .num_cnt = PMU_MONT_NUM_COUNTERS, .get_event_code = pfm_mont_get_event_code, .get_event_name = pfm_mont_get_event_name, .get_event_counters = pfm_mont_get_event_counters, .dispatch_events = pfm_mont_dispatch_events, .pmu_detect = pfm_mont_detect, .get_impl_pmcs = pfm_mont_get_impl_pmcs, .get_impl_pmds = pfm_mont_get_impl_pmds, .get_impl_counters = pfm_mont_get_impl_counters, .get_hw_counter_width = pfm_mont_get_hw_counter_width, .get_event_desc = pfm_mont_get_event_description, .get_cycle_event = pfm_mont_get_cycle_event, .get_inst_retired_event = 
pfm_mont_get_inst_retired, .get_num_event_masks = pfm_mont_get_num_event_masks, .get_event_mask_name = pfm_mont_get_event_mask_name, .get_event_mask_desc = pfm_mont_get_event_mask_desc, .get_event_mask_code = pfm_mont_get_event_mask_code }; papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_montecito_priv.h000066400000000000000000000127401502707512200240550ustar00rootroot00000000000000/* * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ #ifndef __PFMLIB_MONTECITO_PRIV_H__ #define __PFMLIB_MONTECITO_PRIV_H__ /* * Event type definitions * * The virtual events are not really defined in the specs but are an artifact used * to quickly and easily setup EAR and/or BTB. The event type encodes the exact feature * which must be configured in combination with a counting monitor. 
* For instance, DATA_EAR_CACHE_LAT4 is a virtual D-EAR cache event. If the user * requests this event, this will configure a counting monitor to count DATA_EAR_EVENTS * and PMC11 will be configured for cache mode. The latency is encoded in the umask, here * it would correspond to 4 cycles. * */ #define PFMLIB_MONT_EVENT_NORMAL 0x0 /* standard counter */ #define PFMLIB_MONT_EVENT_ETB 0x1 /* virtual event used with ETB configuration */ #define PFMLIB_MONT_EVENT_IEAR_TLB 0x2 /* virtual event used for I-EAR TLB configuration */ #define PFMLIB_MONT_EVENT_IEAR_CACHE 0x3 /* virtual event used for I-EAR cache configuration */ #define PFMLIB_MONT_EVENT_DEAR_TLB 0x4 /* virtual event used for D-EAR TLB configuration */ #define PFMLIB_MONT_EVENT_DEAR_CACHE 0x5 /* virtual event used for D-EAR cache configuration */ #define PFMLIB_MONT_EVENT_DEAR_ALAT 0x6 /* virtual event used for D-EAR ALAT configuration */ #define event_is_ear(e) ((e)->pme_type >= PFMLIB_MONT_EVENT_IEAR_TLB &&(e)->pme_type <= PFMLIB_MONT_EVENT_DEAR_ALAT) #define event_is_iear(e) ((e)->pme_type == PFMLIB_MONT_EVENT_IEAR_TLB || (e)->pme_type == PFMLIB_MONT_EVENT_IEAR_CACHE) #define event_is_dear(e) ((e)->pme_type >= PFMLIB_MONT_EVENT_DEAR_TLB && (e)->pme_type <= PFMLIB_MONT_EVENT_DEAR_ALAT) #define event_is_ear_cache(e) ((e)->pme_type == PFMLIB_MONT_EVENT_DEAR_CACHE || (e)->pme_type == PFMLIB_MONT_EVENT_IEAR_CACHE) #define event_is_ear_tlb(e) ((e)->pme_type == PFMLIB_MONT_EVENT_IEAR_TLB || (e)->pme_type == PFMLIB_MONT_EVENT_DEAR_TLB) #define event_is_ear_alat(e) ((e)->pme_type == PFMLIB_MONT_EVENT_DEAR_ALAT) #define event_is_etb(e) ((e)->pme_type == PFMLIB_MONT_EVENT_ETB) /* * Itanium encoding structure * (code must be first 8 bits) */ typedef struct { unsigned long pme_code:8; /* major event code */ unsigned long pme_type:3; /* see definitions above */ unsigned long pme_caf:2; /* Active, Floating, Causal, Self-Floating */ unsigned long pme_ig1:3; /* ignored */ unsigned long pme_umask:16; /* unit mask*/ unsigned 
long pme_ig:32; /* ignored */ } pme_mont_entry_code_t; typedef union { unsigned long pme_vcode; pme_mont_entry_code_t pme_mont_code; /* must not be larger than vcode */ } pme_mont_code_t; typedef union { unsigned long qual; /* generic qualifier */ struct { unsigned long pme_iar:1; /* instruction address range supported */ unsigned long pme_opm:1; /* opcode match supported */ unsigned long pme_dar:1; /* data address range supported */ unsigned long pme_all:1; /* supports all_thrd=1 */ unsigned long pme_mesi:1; /* event supports MESI */ unsigned long pme_res1:11; /* reserved */ unsigned long pme_group:3; /* event group */ unsigned long pme_set:4; /* event set*/ unsigned long pme_res2:41; /* reserved */ } pme_qual; } pme_mont_qualifiers_t; typedef struct { char *pme_name; pme_mont_code_t pme_entry_code; unsigned long pme_counters; /* supported counters */ unsigned int pme_maxincr; pme_mont_qualifiers_t pme_qualifiers; char *pme_desc; /* text description of the event */ } pme_mont_entry_t; /* * We embed the umask value into the event code. Because it really is * like a subevent. 
* pme_code: * - lower 16 bits: major event code * - upper 16 bits: unit mask */ #define pme_code pme_entry_code.pme_mont_code.pme_code #define pme_umask pme_entry_code.pme_mont_code.pme_umask #define pme_used pme_qualifiers.pme_qual_struct.pme_used #define pme_type pme_entry_code.pme_mont_code.pme_type #define pme_caf pme_entry_code.pme_mont_code.pme_caf #define event_opcm_ok(e) ((e)->pme_qualifiers.pme_qual.pme_opm==1) #define event_iarr_ok(e) ((e)->pme_qualifiers.pme_qual.pme_iar==1) #define event_darr_ok(e) ((e)->pme_qualifiers.pme_qual.pme_dar==1) #define event_all_ok(e) ((e)->pme_qualifiers.pme_qual.pme_all==1) #define event_mesi_ok(e) ((e)->pme_qualifiers.pme_qual.pme_mesi==1) #endif /* __PFMLIB_MONTECITO_PRIV_H__ */ papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_os_linux.c000066400000000000000000000046621502707512200226530ustar00rootroot00000000000000/* * pfmlib_os.c: set of functions OS dependent functions * * Copyright (c) 2003-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef _GNU_SOURCE #define _GNU_SOURCE /* for getline */ #endif #include #include #include #include #include #include #include #include #include #include #include #include "pfmlib_priv.h" int _pfmlib_sys_base; /* syscall base */ int _pfmlib_major_version; /* kernel perfmon major version */ int _pfmlib_minor_version; /* kernel perfmon minor version */ /* * helper function to retrieve one value from /proc/cpuinfo * for internal libpfm use only * attr: the attribute (line) to look for * ret_buf: a buffer to store the value of the attribute (as a string) * maxlen : number of bytes of capacity in ret_buf * * ret_buf is null terminated. * * Return: * 0 : attribute found, ret_buf populated * -1: attribute not found */ int __pfm_getcpuinfo_attr(const char *attr, char *ret_buf, size_t maxlen) { int ret = 0; return ret; } static void adjust__pfmlib_sys_base(int version) { } static void pfm_init_syscalls_hardcoded(void) { } static int pfm_init_syscalls_sysfs(void) { } static int pfm_init_version_sysfs(void) { } void pfm_init_syscalls(void) { } papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_os_linux_v2.c000066400000000000000000000312531502707512200232560ustar00rootroot00000000000000/* * pfmlib_os_linux_v2.c: Perfmon2 syscall API * * Copyright (c) 2003-2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include #include #include #include "pfmlib_priv.h" /* * v2.x interface */ #define PFM_pfm_create_context (_pfmlib_get_sys_base()+0) #define PFM_pfm_write_pmcs (_pfmlib_get_sys_base()+1) #define PFM_pfm_write_pmds (_pfmlib_get_sys_base()+2) #define PFM_pfm_read_pmds (_pfmlib_get_sys_base()+3) #define PFM_pfm_load_context (_pfmlib_get_sys_base()+4) #define PFM_pfm_start (_pfmlib_get_sys_base()+5) #define PFM_pfm_stop (_pfmlib_get_sys_base()+6) #define PFM_pfm_restart (_pfmlib_get_sys_base()+7) #define PFM_pfm_create_evtsets (_pfmlib_get_sys_base()+8) #define PFM_pfm_getinfo_evtsets (_pfmlib_get_sys_base()+9) #define PFM_pfm_delete_evtsets (_pfmlib_get_sys_base()+10) #define PFM_pfm_unload_context (_pfmlib_get_sys_base()+11) /* * argument to v2.2 pfm_create_context() * ALWAYS use pfarg_ctx_t in programs, libpfm * does convert this structure on the fly if v2.2 is detected */ typedef struct { unsigned char ctx_smpl_buf_id[16]; /* which buffer format to use */ uint32_t ctx_flags; /* noblock/block/syswide */ int32_t ctx_fd; /* ret arg: fd for context */ uint64_t ctx_smpl_buf_size; /* ret arg: actual buffer sz */ uint64_t ctx_reserved3[12]; /* for future use */ } pfarg_ctx22_t; /* * perfmon2 compatibility layer with perfmon3 */ #ifndef PFMLIB_OLD_PFMV2 static int pfm_create_context_2v3(pfarg_ctx_t *ctx, char *name, void *smpl_arg, size_t smpl_size) { pfarg_sinfo_t cinfo; uint32_t fl; /* * simulate kernel returning error on NULL ctx */ if (!ctx) { errno = EINVAL; return -1; } /* * if sampling format is used, then force SMPL_FMT * and PFM_FL_SINFO because it comes first */ fl = ctx->ctx_flags; if (name || smpl_arg || smpl_size) fl |= PFM_FL_SMPL_FMT; return pfm_create(fl, &cinfo, name, smpl_arg, smpl_size); } static int pfm_write_pmcs_2v3(int fd, pfarg_pmc_t *pmcs, int count) { pfarg_pmr_t *pmrs; int errno_save; int i, ret; size_t sz; sz = count * sizeof(pfarg_pmr_t); if (!pmcs) return pfm_write(fd, 
0, PFM_RW_PMC, NULL, sz); pmrs = calloc(count, sizeof(*pmrs)); if (!pmrs) { errno = ENOMEM; return -1; } for (i=0 ; i < count; i++) { pmrs[i].reg_num = pmcs[i].reg_num; pmrs[i].reg_set = pmcs[i].reg_set; pmrs[i].reg_flags = pmcs[i].reg_flags; pmrs[i].reg_value = pmcs[i].reg_value; } ret = pfm_write(fd, 0, PFM_RW_PMC, pmrs, sz); errno_save = errno; free(pmrs); errno = errno_save; return ret; } static int pfm_write_pmds_2v3(int fd, pfarg_pmd_t *pmds, int count) { pfarg_pmd_attr_t *pmas; size_t sz; int errno_save; int i, ret; sz = count * sizeof(*pmas); if (!pmds) return pfm_write(fd, 0, PFM_RW_PMD, NULL, sz); pmas = calloc(count, sizeof(*pmas)); if (!pmas) { errno = ENOMEM; return -1; } for (i=0 ; i < count; i++) { pmas[i].reg_num = pmds[i].reg_num; pmas[i].reg_set = pmds[i].reg_set; pmas[i].reg_flags = pmds[i].reg_flags; pmas[i].reg_value = pmds[i].reg_value; pmas[i].reg_long_reset = pmds[i].reg_long_reset; pmas[i].reg_short_reset = pmds[i].reg_short_reset; /* skip last_value not used on write */ pmas[i].reg_ovfl_swcnt = pmds[i].reg_ovfl_switch_cnt; memcpy(pmas[i].reg_smpl_pmds, pmds[i].reg_smpl_pmds, sizeof(pmds[i].reg_smpl_pmds)); memcpy(pmas[i].reg_reset_pmds, pmds[i].reg_reset_pmds, sizeof(pmds[i].reg_reset_pmds)); pmas[i].reg_smpl_eventid = pmds[i].reg_smpl_eventid; pmas[i].reg_random_mask = pmds[i].reg_random_mask; } ret = pfm_write(fd, 0, PFM_RW_PMD_ATTR, pmas, sz); errno_save = errno; free(pmas); errno = errno_save; return ret; } static int pfm_read_pmds_2v3(int fd, pfarg_pmd_t *pmds, int count) { pfarg_pmd_attr_t *pmas; int errno_save; int i, ret; size_t sz; sz = count * sizeof(*pmas); if (!pmds) return pfm_write(fd, 0, PFM_RW_PMD, NULL, sz); pmas = calloc(count, sizeof(*pmas)); if (!pmas) { errno = ENOMEM; return -1; } for (i=0 ; i < count; i++) { pmas[i].reg_num = pmds[i].reg_num; pmas[i].reg_set = pmds[i].reg_set; pmas[i].reg_flags = pmds[i].reg_flags; pmas[i].reg_value = pmds[i].reg_value; } ret = pfm_read(fd, 0, PFM_RW_PMD_ATTR, pmas, sz); errno_save = 
errno; for (i=0 ; i < count; i++) { pmds[i].reg_value = pmas[i].reg_value; pmds[i].reg_long_reset = pmas[i].reg_long_reset; pmds[i].reg_short_reset = pmas[i].reg_short_reset; pmds[i].reg_last_reset_val = pmas[i].reg_last_value; pmds[i].reg_ovfl_switch_cnt = pmas[i].reg_ovfl_swcnt; /* skip reg_smpl_pmds */ /* skip reg_reset_pmds */ /* skip reg_smpl_eventid */ /* skip reg_random_mask */ } free(pmas); errno = errno_save; return ret; } static int pfm_load_context_2v3(int fd, pfarg_load_t *load) { if (!load) { errno = EINVAL; return -1; } return pfm_attach(fd, 0, load->load_pid); } static int pfm_start_2v3(int fd, pfarg_start_t *start) { if (start) { __pfm_vbprintf("pfarg_start_t not supported in v3.x\n"); errno = EINVAL; return -1; } return pfm_set_state(fd, 0, PFM_ST_START); } static int pfm_stop_2v3(int fd) { return pfm_set_state(fd, 0, PFM_ST_STOP); } static int pfm_restart_2v3(int fd) { return pfm_set_state(fd, 0, PFM_ST_RESTART); } static int pfm_create_evtsets_2v3(int fd, pfarg_setdesc_t *setd, int count) { /* set_desc and setdesc are identical so we can cast */ return pfm_create_sets(fd, 0, (pfarg_set_desc_t *)setd, count * sizeof(pfarg_setdesc_t)); } static int pfm_delete_evtsets_2v3(int fd, pfarg_setdesc_t *setd, int count) { __pfm_vbprintf("pfm_delete_evtsets not supported in v3.x\n"); errno = EINVAL; return -1; } static int pfm_getinfo_evtsets_2v3(int fd, pfarg_setinfo_t *info, int count) { pfarg_sinfo_t cinfo; pfarg_set_info_t *sif; int fdx, i, ret, errno_save; if (!info) { errno = EFAULT; return -1; } /* * initialize bitmask to all available and defer checking * until kernel.
That means libpfm must be misled but we * have no other way of fixing this */ memset(&cinfo, -1, sizeof(cinfo)); /* * XXX: relies on the fact that cinfo is independent * of the session type (which is wrong in the future) */ fdx = pfm_create(0, &cinfo); if (fdx > -1) close(fdx); sif = calloc(count, sizeof(*sif)); if (!sif) { errno = ENOMEM; return -1; } for (i=0 ; i < count; i++) sif[i].set_id = info[i].set_id; ret = pfm_getinfo_sets(fd, 0, sif, count * sizeof(pfarg_set_info_t)); errno_save = errno; if (ret) goto skip; for (i=0 ; i < count; i++) { info[i].set_flags = 0; memcpy(info[i].set_ovfl_pmds, sif[i].set_ovfl_pmds, sizeof(info[i].set_ovfl_pmds)); info[i].set_runs = sif[i].set_runs; info[i].set_timeout = sif[i].set_timeout; info[i].set_act_duration = sif[i].set_duration; memcpy(info[i].set_avail_pmcs, cinfo.sif_avail_pmcs, sizeof(info[i].set_avail_pmcs)); memcpy(info[i].set_avail_pmds, cinfo.sif_avail_pmds, sizeof(info[i].set_avail_pmds)); } skip: free(sif); errno = errno_save; return ret; } static int pfm_unload_context_2v3(int fd) { return pfm_attach(fd, 0, PFM_NO_TARGET); } #else /* PFMLIB_OLD_PFMV2 */ static int pfm_create_context_2v3(pfarg_ctx_t *ctx, char *name, void *smpl_arg, size_t smpl_size) { return -1; } static int pfm_write_pmcs_2v3(int fd, pfarg_pmc_t *pmcs, int count) { return -1; } static int pfm_write_pmds_2v3(int fd, pfarg_pmd_t *pmds, int count) { return -1; } static int pfm_read_pmds_2v3(int fd, pfarg_pmd_t *pmds, int count) { return -1; } static int pfm_load_context_2v3(int fd, pfarg_load_t *load) { return -1; } static int pfm_start_2v3(int fd, pfarg_start_t *start) { return -1; } static int pfm_stop_2v3(int fd) { return -1; } static int pfm_restart_2v3(int fd) { return -1; } static int pfm_create_evtsets_2v3(int fd, pfarg_setdesc_t *setd, int count) { return -1; } static int pfm_delete_evtsets_2v3(int fd, pfarg_setdesc_t *setd, int count) { return -1; } static int pfm_getinfo_evtsets_2v3(int fd, pfarg_setinfo_t *info, int count) { return 
-1; } static int pfm_unload_context_2v3(int fd) { return -1; } #endif /* PFMLIB_OLD_PFMV2 */ int pfm_load_context(int fd, pfarg_load_t *load) { if (_pfmlib_major_version < 3) return (int)syscall(PFM_pfm_load_context, fd, load); return pfm_load_context_2v3(fd, load); } int pfm_start(int fd, pfarg_start_t *start) { if (_pfmlib_major_version < 3) return (int)syscall(PFM_pfm_start, fd, start); return pfm_start_2v3(fd, start); } int pfm_stop(int fd) { if (_pfmlib_major_version < 3) return (int)syscall(PFM_pfm_stop, fd); return pfm_stop_2v3(fd); } int pfm_restart(int fd) { if (_pfmlib_major_version < 3) return (int)syscall(PFM_pfm_restart, fd); return pfm_restart_2v3(fd); } int pfm_create_evtsets(int fd, pfarg_setdesc_t *setd, int count) { if (_pfmlib_major_version < 3) return (int)syscall(PFM_pfm_create_evtsets, fd, setd, count); return pfm_create_evtsets_2v3(fd, setd, count); } int pfm_delete_evtsets(int fd, pfarg_setdesc_t *setd, int count) { if (_pfmlib_major_version < 3) return (int)syscall(PFM_pfm_delete_evtsets, fd, setd, count); return pfm_delete_evtsets_2v3(fd, setd, count); } int pfm_getinfo_evtsets(int fd, pfarg_setinfo_t *info, int count) { if (_pfmlib_major_version < 3) return (int)syscall(PFM_pfm_getinfo_evtsets, fd, info, count); return pfm_getinfo_evtsets_2v3(fd, info, count); } int pfm_unload_context(int fd) { if (_pfmlib_major_version < 3) return (int)syscall(PFM_pfm_unload_context, fd); return pfm_unload_context_2v3(fd); } int pfm_create_context(pfarg_ctx_t *ctx, char *name, void *smpl_arg, size_t smpl_size) { if (_pfmlib_major_version < 3) { /* * In perfmon v2.2, the pfm_create_context() call had a * different return value. It used to return errno, in v2.3 * it returns the file descriptor. 
*/ if (_pfmlib_minor_version < 3) { int r; pfarg_ctx22_t ctx22; /* transfer the v2.3 contents to v2.2 for sys call */ memset (&ctx22, 0, sizeof(ctx22)); if (name != NULL) { memcpy (ctx22.ctx_smpl_buf_id, name, 16); } ctx22.ctx_flags = ctx->ctx_flags; /* ctx22.ctx_fd returned */ /* ctx22.ctx_smpl_buf_size returned */ memcpy (ctx22.ctx_reserved3, &ctx->ctx_reserved1, 64); r = syscall (PFM_pfm_create_context, &ctx22, smpl_arg, smpl_size); /* transfer the v2.2 contents back to v2.3 */ ctx->ctx_flags = ctx22.ctx_flags; memcpy (&ctx->ctx_reserved1, ctx22.ctx_reserved3, 64); return (r < 0 ? r : ctx22.ctx_fd); } else { return (int)syscall(PFM_pfm_create_context, ctx, name, smpl_arg, smpl_size); } } return pfm_create_context_2v3(ctx, name, smpl_arg, smpl_size); } int pfm_write_pmcs(int fd, pfarg_pmc_t *pmcs, int count) { if (_pfmlib_major_version < 3) return (int)syscall(PFM_pfm_write_pmcs, fd, pmcs, count); return pfm_write_pmcs_2v3(fd, pmcs, count); } int pfm_write_pmds(int fd, pfarg_pmd_t *pmds, int count) { if (_pfmlib_major_version < 3) return (int)syscall(PFM_pfm_write_pmds, fd, pmds, count); return pfm_write_pmds_2v3(fd, pmds, count); } int pfm_read_pmds(int fd, pfarg_pmd_t *pmds, int count) { if (_pfmlib_major_version < 3) return (int)syscall(PFM_pfm_read_pmds, fd, pmds, count); return pfm_read_pmds_2v3(fd, pmds, count); } #ifdef __ia64__ #define __PFMLIB_OS_COMPILE #include /* * this is the old perfmon2 interface, maintained for backward * compatibility reasons with older applications. This is for IA-64 ONLY. 
*/ int perfmonctl(int fd, int cmd, void *arg, int narg) { return syscall(__NR_perfmonctl, fd, cmd, arg, narg); } #endif /* __ia64__ */ papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_os_linux_v3.c000066400000000000000000000070641502707512200232620ustar00rootroot00000000000000/* * pfmlib_os_linux_v3.c: Perfmon3 API syscalls * * Copyright (c) 2008 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #ifndef _GNU_SOURCE #define _GNU_SOURCE /* for getline */ #endif #include #include #include #include #include #include #include #include #include #include "pfmlib_priv.h" /* * v3.x interface */ #define PFM_pfm_create (_pfmlib_get_sys_base()+0) #define PFM_pfm_write (_pfmlib_get_sys_base()+1) #define PFM_pfm_read (_pfmlib_get_sys_base()+2) #define PFM_pfm_attach (_pfmlib_get_sys_base()+3) #define PFM_pfm_set_state (_pfmlib_get_sys_base()+4) #define PFM_pfm_create_sets (_pfmlib_get_sys_base()+5) #define PFM_pfm_getinfo_sets (_pfmlib_get_sys_base()+6) /* * perfmon v3 interface */ int //pfm_create(int flags, pfarg_sinfo_t *sif, char *name, void *smpl_arg, size_t smpl_size) pfm_create(int flags, pfarg_sinfo_t *sif, ...) { va_list ap; char *name = NULL; void *smpl_arg = NULL; size_t smpl_size = 0; int ret; if (_pfmlib_major_version < 3) { errno = ENOSYS; return -1; } if (flags & PFM_FL_SMPL_FMT) va_start(ap, sif); if (flags & PFM_FL_SMPL_FMT) { name = va_arg(ap, char *); smpl_arg = va_arg(ap, void *); smpl_size = va_arg(ap, size_t); } ret = (int)syscall(PFM_pfm_create, flags, sif, name, smpl_arg, smpl_size); if (flags & PFM_FL_SMPL_FMT) va_end(ap); return ret; } int pfm_write(int fd, int flags, int type, void *pms, size_t sz) { if (_pfmlib_major_version < 3) return -ENOSYS; return (int)syscall(PFM_pfm_write, fd, flags, type, pms, sz); } int pfm_read(int fd, int flags, int type, void *pms, size_t sz) { if (_pfmlib_major_version < 3) return -ENOSYS; return (int)syscall(PFM_pfm_read, fd, flags, type, pms, sz); } int pfm_create_sets(int fd, int flags, pfarg_set_desc_t *setd, size_t sz) { if (_pfmlib_major_version < 3) return -ENOSYS; return (int)syscall(PFM_pfm_create_sets, fd, flags, setd, sz); } int pfm_getinfo_sets(int fd, int flags, pfarg_set_info_t *info, size_t sz) { if (_pfmlib_major_version < 3) return -ENOSYS; return (int)syscall(PFM_pfm_getinfo_sets, fd, flags, info, sz); } int pfm_attach(int fd, int flags, int target) { if (_pfmlib_major_version < 3) return 
-ENOSYS; return (int)syscall(PFM_pfm_attach, fd, flags, target); } int pfm_set_state(int fd, int flags, int state) { if (_pfmlib_major_version < 3) return -ENOSYS; return (int)syscall(PFM_pfm_set_state, fd, flags, state); } papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_os_macos.c000066400000000000000000000055411502707512200226130ustar00rootroot00000000000000/* * pfmlib_os_macos.c: set of functions for MacOS (Tiger) * * Copyright (c) 2008 Stephane Eranian * Contributed by Stephane Eranian * As a sign of friendship to my friend Eric, big fan of MacOS * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include #include #include "pfmlib_priv.h" typedef enum { TYPE_NONE, TYPE_STR, TYPE_INT } mib_name_t; /* * helper function to retrieve one value from /proc/cpuinfo * for internal libpfm use only * attr: the attribute (line) to look for * ret_buf: a buffer to store the value of the attribute (as a string) * maxlen : number of bytes of capacity in ret_buf * * ret_buf is null terminated. * * Return: * 0 : attribute found, ret_buf populated * -1: attribute not found */ int __pfm_getcpuinfo_attr(const char *attr, char *ret_buf, size_t maxlen) { mib_name_t type = TYPE_NONE; union { char str[32]; int val; } value; char *name = NULL; int mib[16]; int ret = -1; size_t len, mib_len; if (attr == NULL || ret_buf == NULL || maxlen < 1) return -1; *ret_buf = '\0'; if (!strcmp(attr, "vendor_id")) { name = "machdep.cpu.vendor"; type = TYPE_STR; } else if (!strcmp(attr, "model")) { name = "machdep.cpu.model"; type = TYPE_INT; } else if (!strcmp(attr, "cpu family")) { name = "machdep.cpu.family"; type = TYPE_INT; } mib_len = 16; ret = sysctlnametomib(name, mib, &mib_len); if (ret) return -1; len = sizeof(value); ret = sysctl(mib, mib_len, &value, &len, NULL, 0); if (ret) return ret; if (type == TYPE_STR) strncpy(ret_buf, value.str, maxlen); else if (type == TYPE_INT) snprintf(ret_buf, maxlen, "%d", value.val); __pfm_vbprintf("attr=%s ret=%d ret_buf=%s\n", attr, ret, ret_buf); return ret; } void pfm_init_syscalls(void) { } papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_pentium4.c000066400000000000000000000537271502707512200225660ustar00rootroot00000000000000/* * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P. * Copyright (c) 2006 IBM Corp. 
* Contributed by Kevin Corry * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS * IN THE SOFTWARE. * * pfmlib_pentium4.c * * Support for libpfm for the Pentium4/Xeon/EM64T processor family (family=15). 
*/ #ifndef _GNU_SOURCE #define _GNU_SOURCE /* for getline */ #endif #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_pentium4_priv.h" #include "pentium4_events.h" typedef struct { unsigned long addr; char *name; } p4_regmap_t; #define P4_REGMAP(a, n) { .addr = a, .name = n } static p4_regmap_t p4_pmc_regmap[]={ /* 0 */ P4_REGMAP(0x3b2, "BPU_ESCR0"), /* 1 */ P4_REGMAP(0x3ba, "IS_ESCR0"), /* 2 */ P4_REGMAP(0x3aa, "MOB_ESCR0"), /* 3 */ P4_REGMAP(0x3b6, "ITLB_ESCR0"), /* 4 */ P4_REGMAP(0x3ac, "PMH_ESCR0"), /* 5 */ P4_REGMAP(0x3c8, "IX_ESCR0"), /* 6 */ P4_REGMAP(0x3a2, "FSB_ESCR0"), /* 7 */ P4_REGMAP(0x3a0, "BSU_ESCR0"), /* 8 */ P4_REGMAP(0x3c0, "MS_ESCR0"), /* 9 */ P4_REGMAP(0x3c4, "TC_ESCR0"), /* 10 */ P4_REGMAP(0x3c2, "TBPU_ESCR0"), /* 11 */ P4_REGMAP(0x3a6, "FLAME_ESCR0"), /* 12 */ P4_REGMAP(0x3a4, "FIRM_ESCR0"), /* 13 */ P4_REGMAP(0x3ae, "SAAT_ESCR0"), /* 14 */ P4_REGMAP(0x3b0, "U2L_ESCR0"), /* 15 */ P4_REGMAP(0x3a8, "DAC_ESCR0"), /* 16 */ P4_REGMAP(0x3ba, "IQ_ESCR0"), /* 17 */ P4_REGMAP(0x3ca, "ALF_ESCR0"), /* 18 */ P4_REGMAP(0x3bc, "RAT_ESCR0"), /* 19 */ P4_REGMAP(0x3be, "SSU_ESCR0"), /* 20 */ P4_REGMAP(0x3b8, "CRU_ESCR0"), /* 21 */ P4_REGMAP(0x3cc, "CRU_ESCR2"), /* 22 */ P4_REGMAP(0x3e0, "CRU_ESCR4"), /* 23 */ P4_REGMAP(0x360, "BPU_CCCR0"), /* 24 */ P4_REGMAP(0x361, "BPU_CCCR1"), /* 25 */ P4_REGMAP(0x364, "MS_CCCR0"), /* 26 */ P4_REGMAP(0x365, "MS_CCCR1"), /* 27 */ P4_REGMAP(0x368, "FLAME_CCCR0"), /* 28 */ P4_REGMAP(0x369, "FLAME_CCCR1"), /* 29 */ P4_REGMAP(0x36c, "IQ_CCCR0"), /* 30 */ P4_REGMAP(0x36d, "IQ_CCCR1"), /* 31 */ P4_REGMAP(0x370, "IQ_CCCR4"), /* 32 */ P4_REGMAP(0x3b3, "BPU_ESCR1"), /* 33 */ P4_REGMAP(0x3b5, "IS_ESCR1"), /* 34 */ P4_REGMAP(0x3ab, "MOB_ESCR1"), /* 35 */ P4_REGMAP(0x3b7, "ITLB_ESCR1"), /* 36 */ P4_REGMAP(0x3ad, "PMH_ESCR1"), /* 37 */ P4_REGMAP(0x3c9, "IX_ESCR1"), /* 38 */ P4_REGMAP(0x3a3, "FSB_ESCR1"), /* 39 */ P4_REGMAP(0x3a1, "BSU_ESCR1"), /* 40 */ P4_REGMAP(0x3c1, 
"MS_ESCR1"), /* 41 */ P4_REGMAP(0x3c5, "TC_ESCR1"), /* 42 */ P4_REGMAP(0x3c3, "TBPU_ESCR1"), /* 43 */ P4_REGMAP(0x3a7, "FLAME_ESCR1"), /* 44 */ P4_REGMAP(0x3a5, "FIRM_ESCR1"), /* 45 */ P4_REGMAP(0x3af, "SAAT_ESCR1"), /* 46 */ P4_REGMAP(0x3b1, "U2L_ESCR1"), /* 47 */ P4_REGMAP(0x3a9, "DAC_ESCR1"), /* 48 */ P4_REGMAP(0x3bb, "IQ_ESCR1"), /* 49 */ P4_REGMAP(0x3cb, "ALF_ESCR1"), /* 50 */ P4_REGMAP(0x3bd, "RAT_ESCR1"), /* 51 */ P4_REGMAP(0x3b9, "CRU_ESCR1"), /* 52 */ P4_REGMAP(0x3cd, "CRU_ESCR3"), /* 53 */ P4_REGMAP(0x3e1, "CRU_ESCR5"), /* 54 */ P4_REGMAP(0x362, "BPU_CCCR2"), /* 55 */ P4_REGMAP(0x363, "BPU_CCCR3"), /* 56 */ P4_REGMAP(0x366, "MS_CCCR2"), /* 57 */ P4_REGMAP(0x367, "MS_CCCR3"), /* 58 */ P4_REGMAP(0x36a, "FLAME_CCCR2"), /* 59 */ P4_REGMAP(0x36b, "FLAME_CCCR3"), /* 60 */ P4_REGMAP(0x36e, "IQ_CCCR2"), /* 61 */ P4_REGMAP(0x36f, "IQ_CCCR3"), /* 62 */ P4_REGMAP(0x371, "IQ_CCCR5"), /* 63 */ P4_REGMAP(0x3f2, "PEBS_MATRIX_VERT"), /* 64 */ P4_REGMAP(0x3f1, "PEBS_ENABLE"), }; #define PMC_PEBS_MATRIX_VERT 63 #define PMC_PEBS_ENABLE 64 static p4_regmap_t p4_pmd_regmap[]={ /* 0 */ P4_REGMAP(0x300, "BPU_CTR0"), /* 1 */ P4_REGMAP(0x301, "BPU_CTR1"), /* 2 */ P4_REGMAP(0x304, "MS_CTR0"), /* 3 */ P4_REGMAP(0x305, "MS_CTR1"), /* 4 */ P4_REGMAP(0x308, "FLAME_CTR0"), /* 5 */ P4_REGMAP(0x309, "FLAME_CTR1"), /* 6 */ P4_REGMAP(0x30c, "IQ_CTR0"), /* 7 */ P4_REGMAP(0x30d, "IQ_CTR1"), /* 8 */ P4_REGMAP(0x310, "IQ_CTR4"), /* 9 */ P4_REGMAP(0x302, "BPU_CTR2"), /* 10 */ P4_REGMAP(0x303, "BPU_CTR3"), /* 11 */ P4_REGMAP(0x306, "MS_CTR2"), /* 12 */ P4_REGMAP(0x307, "MS_CTR3"), /* 13 */ P4_REGMAP(0x30a, "FLAME_CTR2"), /* 14 */ P4_REGMAP(0x30b, "FLAME_CTR3"), /* 15 */ P4_REGMAP(0x30e, "IQ_CTR2"), /* 16 */ P4_REGMAP(0x30f, "IQ_CTR3"), /* 17 */ P4_REGMAP(0x311, "IQ_CTR5"), }; /* This array provides values for the PEBS_ENABLE and PEBS_MATRIX_VERT registers to support a series of metrics for replay_event.
The first two entries are dummies; the remaining 9 correspond to virtual bit masks in the replay_event definition and map onto Intel documentation. */ #define P4_REPLAY_REAL_MASK 0x00000003 #define P4_REPLAY_VIRT_MASK 0x00000FFC static pentium4_replay_regs_t p4_replay_regs[]={ /* 0 */ {.enb = 0, /* dummy */ .mat_vert = 0, }, /* 1 */ {.enb = 0, /* dummy */ .mat_vert = 0, }, /* 2 */ {.enb = 0x01000001, /* 1stL_cache_load_miss_retired */ .mat_vert = 0x00000001, }, /* 3 */ {.enb = 0x01000002, /* 2ndL_cache_load_miss_retired */ .mat_vert = 0x00000001, }, /* 4 */ {.enb = 0x01000004, /* DTLB_load_miss_retired */ .mat_vert = 0x00000001, }, /* 5 */ {.enb = 0x01000004, /* DTLB_store_miss_retired */ .mat_vert = 0x00000002, }, /* 6 */ {.enb = 0x01000004, /* DTLB_all_miss_retired */ .mat_vert = 0x00000003, }, /* 7 */ {.enb = 0x01018001, /* Tagged_mispred_branch */ .mat_vert = 0x00000010, }, /* 8 */ {.enb = 0x01000200, /* MOB_load_replay_retired */ .mat_vert = 0x00000001, }, /* 9 */ {.enb = 0x01000400, /* split_load_retired */ .mat_vert = 0x00000001, }, /* 10 */ {.enb = 0x01000400, /* split_store_retired */ .mat_vert = 0x00000002, }, }; static int p4_model; /** * pentium4_get_event_code * * Return the event-select value for the specified event as * needed for the specified PMD counter. **/ static int pentium4_get_event_code(unsigned int event, unsigned int pmd, int *code) { int i, j, escr, cccr; int rc = PFMLIB_ERR_INVAL; if (pmd >= PENTIUM4_NUM_PMDS && pmd != PFMLIB_CNT_FIRST) { goto out; } /* Check that the specified event is allowed for the specified PMD. * Each event has a specific set of ESCRs it can use, which implies * a specific set of CCCRs (and thus PMDs). A specified PMD of -1 * means assume any allowable PMD. 
*/ if (pmd == PFMLIB_CNT_FIRST) { *code = pentium4_events[event].event_select; rc = PFMLIB_SUCCESS; goto out; } for (i = 0; i < MAX_ESCRS_PER_EVENT; i++) { escr = pentium4_events[event].allowed_escrs[i]; if (escr < 0) { continue; } for (j = 0; j < MAX_CCCRS_PER_ESCR; j++) { cccr = pentium4_escrs[escr].allowed_cccrs[j]; if (cccr < 0) { continue; } if (pmd == pentium4_cccrs[cccr].pmd) { *code = pentium4_events[event].event_select; rc = PFMLIB_SUCCESS; goto out; } } } out: return rc; } /** * pentium4_get_event_name * * Return the name of the specified event. **/ static char *pentium4_get_event_name(unsigned int event) { return pentium4_events[event].name; } /** * pentium4_get_event_mask_name * * Return the name of the specified event-mask. **/ static char *pentium4_get_event_mask_name(unsigned int event, unsigned int mask) { if (mask >= EVENT_MASK_BITS || pentium4_events[event].event_masks[mask].name == NULL) return NULL; return pentium4_events[event].event_masks[mask].name; } /** * pentium4_get_event_counters * * Fill in the 'counters' bitmask with all possible PMDs that could be * used to count the specified event. **/ static void pentium4_get_event_counters(unsigned int event, pfmlib_regmask_t *counters) { int i, j, escr, cccr; memset(counters, 0, sizeof(*counters)); for (i = 0; i < MAX_ESCRS_PER_EVENT; i++) { escr = pentium4_events[event].allowed_escrs[i]; if (escr < 0) { continue; } for (j = 0; j < MAX_CCCRS_PER_ESCR; j++) { cccr = pentium4_escrs[escr].allowed_cccrs[j]; if (cccr < 0) { continue; } pfm_regmask_set(counters, pentium4_cccrs[cccr].pmd); } } } /** * pentium4_get_num_event_masks * * Count the number of available event-masks for the specified event. All * valid masks in pentium4_events[].event_masks are contiguous in the array * and have a non-NULL name. 
**/ static unsigned int pentium4_get_num_event_masks(unsigned int event) { unsigned int i = 0; while (pentium4_events[event].event_masks[i].name) { i++; } return i; } /** * pentium4_dispatch_events * * Examine each desired event specified in "input" and find an appropriate * ESCR/CCCR pair that can be used to count them. **/ static int pentium4_dispatch_events(pfmlib_input_param_t *input, void *model_input, pfmlib_output_param_t *output, void *model_output) { unsigned int assigned_pmcs[PENTIUM4_NUM_PMCS] = {0}; unsigned int event, event_mask, mask; unsigned int bit, tag_value, tag_enable; unsigned int plm; unsigned int i, j, k, m, n; int escr, escr_pmc; int cccr, cccr_pmc, cccr_pmd; int assigned; pentium4_escr_value_t escr_value; pentium4_cccr_value_t cccr_value; if (input->pfp_event_count > PENTIUM4_NUM_PMDS) { /* Can't specify more events than we have counters. */ return PFMLIB_ERR_TOOMANY; } if (input->pfp_dfl_plm & (PFM_PLM1|PFM_PLM2)) { /* Can't specify privilege levels 1 or 2. */ return PFMLIB_ERR_INVAL; } /* Examine each event specified in input->pfp_events. i counts * through the input->pfp_events array, and j counts through the * PMCs in output->pfp_pmcs as they are set up. */ for (i = 0, j = 0; i < input->pfp_event_count; i++) { if (input->pfp_events[i].plm & (PFM_PLM1|PFM_PLM2)) { /* Can't specify privilege levels 1 or 2. */ return PFMLIB_ERR_INVAL; } /* * INSTR_COMPLETED event only exist for model 3, 4, 6 (Prescott) */ if (input->pfp_events[i].event == PME_INSTR_COMPLETED && p4_model != 3 && p4_model != 4 && p4_model != 6) return PFMLIB_ERR_EVTINCOMP; event = input->pfp_events[i].event; assigned = 0; /* Use the event-specific privilege mask if set. * Otherwise use the default privilege mask. */ plm = input->pfp_events[i].plm ? input->pfp_events[i].plm : input->pfp_dfl_plm; /* Examine each ESCR that this event could be assigned to. 
*/ for (k = 0; k < MAX_ESCRS_PER_EVENT && !assigned; k++) { escr = pentium4_events[event].allowed_escrs[k]; if (escr < 0) continue; /* Make sure this ESCR isn't already assigned * and isn't on the "unavailable" list. */ escr_pmc = pentium4_escrs[escr].pmc; if (assigned_pmcs[escr_pmc] || pfm_regmask_isset(&input->pfp_unavail_pmcs, escr_pmc)) { continue; } /* Examine each CCCR that can be used with this ESCR. */ for (m = 0; m < MAX_CCCRS_PER_ESCR && !assigned; m++) { cccr = pentium4_escrs[escr].allowed_cccrs[m]; if (cccr < 0) { continue; } /* Make sure this CCCR isn't already assigned * and isn't on the "unavailable" list. */ cccr_pmc = pentium4_cccrs[cccr].pmc; cccr_pmd = pentium4_cccrs[cccr].pmd; if (assigned_pmcs[cccr_pmc] || pfm_regmask_isset(&input->pfp_unavail_pmcs, cccr_pmc)) { continue; } /* Found an available ESCR/CCCR pair. */ assigned = 1; assigned_pmcs[escr_pmc] = 1; assigned_pmcs[cccr_pmc] = 1; /* Calculate the event-mask value. Invalid masks * specified by the caller are ignored. */ event_mask = 0; tag_value = 0; tag_enable = 0; for (n = 0; n < input->pfp_events[i].num_masks; n++) { mask = input->pfp_events[i].unit_masks[n]; bit = pentium4_events[event].event_masks[mask].bit; if (bit < EVENT_MASK_BITS && pentium4_events[event].event_masks[mask].name) { event_mask |= (1 << bit); } if (bit >= EVENT_MASK_BITS && pentium4_events[event].event_masks[mask].name) { tag_value |= (1 << (bit - EVENT_MASK_BITS)); tag_enable = 1; } } /* Set up the ESCR and CCCR register values. */ escr_value.val = 0; escr_value.bits.t1_usr = 0; /* controlled by kernel */ escr_value.bits.t1_os = 0; /* controlled by kernel */ escr_value.bits.t0_usr = (plm & PFM_PLM3) ? 1 : 0; escr_value.bits.t0_os = (plm & PFM_PLM0) ? 
1 : 0; escr_value.bits.tag_enable = tag_enable; escr_value.bits.tag_value = tag_value; escr_value.bits.event_mask = event_mask; escr_value.bits.event_select = pentium4_events[event].event_select; escr_value.bits.reserved = 0; cccr_value.val = 0; cccr_value.bits.reserved1 = 0; cccr_value.bits.enable = 1; cccr_value.bits.escr_select = pentium4_events[event].escr_select; cccr_value.bits.active_thread = 3; /* FIXME: This is set to count when either logical * CPU is active. Need a way to distinguish * between logical CPUs when HT is enabled. */ cccr_value.bits.compare = 0; /* FIXME: What do we do with "threshold" settings? */ cccr_value.bits.complement = 0; /* FIXME: What do we do with "threshold" settings? */ cccr_value.bits.threshold = 0; /* FIXME: What do we do with "threshold" settings? */ cccr_value.bits.force_ovf = 0; /* FIXME: Do we want to allow "forcing" overflow * interrupts on all counter increments? */ cccr_value.bits.ovf_pmi_t0 = 1; cccr_value.bits.ovf_pmi_t1 = 0; /* PMI taken care of by kernel typically */ cccr_value.bits.reserved2 = 0; cccr_value.bits.cascade = 0; /* FIXME: How do we handle "cascading" counters? 
*/ cccr_value.bits.overflow = 0; /* Special processing for the replay event: Remove virtual mask bits from actual mask; scan mask bit list and OR bit values for each virtual mask into the PEBS ENABLE and PEBS MATRIX VERT registers */ if (event == PME_REPLAY_EVENT) { escr_value.bits.event_mask &= P4_REPLAY_REAL_MASK; /* remove virtual mask bits */ if (event_mask & P4_REPLAY_VIRT_MASK) { /* find a valid virtual mask */ output->pfp_pmcs[j].reg_value = 0; output->pfp_pmcs[j].reg_num = PMC_PEBS_ENABLE; output->pfp_pmcs[j].reg_addr = p4_pmc_regmap[PMC_PEBS_ENABLE].addr; output->pfp_pmcs[j+1].reg_value = 0; output->pfp_pmcs[j+1].reg_num = PMC_PEBS_MATRIX_VERT; output->pfp_pmcs[j+1].reg_addr = p4_pmc_regmap[PMC_PEBS_MATRIX_VERT].addr; for (n = 0; n < input->pfp_events[i].num_masks; n++) { mask = input->pfp_events[i].unit_masks[n]; if (mask > 1 && mask < 11) { /* process each valid mask we find */ output->pfp_pmcs[j].reg_value |= p4_replay_regs[mask].enb; output->pfp_pmcs[j+1].reg_value |= p4_replay_regs[mask].mat_vert; } } j += 2; output->pfp_pmc_count += 2; } } /* Set up the PMCs in the * output->pfp_pmcs array. 
*/ output->pfp_pmcs[j].reg_num = escr_pmc; output->pfp_pmcs[j].reg_value = escr_value.val; output->pfp_pmcs[j].reg_addr = p4_pmc_regmap[escr_pmc].addr; j++; __pfm_vbprintf("[%s(pmc%u)=0x%lx os=%u usr=%u tag=%u tagval=0x%x mask=%u sel=0x%x] %s\n", p4_pmc_regmap[escr_pmc].name, escr_pmc, escr_value.val, escr_value.bits.t0_os, escr_value.bits.t0_usr, escr_value.bits.tag_enable, escr_value.bits.tag_value, escr_value.bits.event_mask, escr_value.bits.event_select, pentium4_events[event].name); output->pfp_pmcs[j].reg_num = cccr_pmc; output->pfp_pmcs[j].reg_value = cccr_value.val; output->pfp_pmcs[j].reg_addr = p4_pmc_regmap[cccr_pmc].addr; output->pfp_pmds[i].reg_num = cccr_pmd; output->pfp_pmds[i].reg_addr = p4_pmd_regmap[cccr_pmd].addr; __pfm_vbprintf("[%s(pmc%u)=0x%lx ena=1 sel=0x%x cmp=%u cmpl=%u thres=%u edg=%u cas=%u] %s\n", p4_pmc_regmap[cccr_pmc].name, cccr_pmc, cccr_value.val, cccr_value.bits.escr_select, cccr_value.bits.compare, cccr_value.bits.complement, cccr_value.bits.threshold, cccr_value.bits.edge, cccr_value.bits.cascade, pentium4_events[event].name); __pfm_vbprintf("[%s(pmd%u)]\n", p4_pmd_regmap[output->pfp_pmds[i].reg_num].name, output->pfp_pmds[i].reg_num); j++; output->pfp_pmc_count += 2; } } if (k == MAX_ESCRS_PER_EVENT && !assigned) { /* Couldn't find an available ESCR and/or CCCR. */ return PFMLIB_ERR_NOASSIGN; } } output->pfp_pmd_count = input->pfp_event_count; return PFMLIB_SUCCESS; } /** * pentium4_pmu_detect * * Determine whether the system we're running on is a Pentium4 * (or other CPU that uses the same PMU). 
**/ static int pentium4_pmu_detect(void) { int ret, family; char buffer[128]; ret = __pfm_getcpuinfo_attr("vendor_id", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; if (strcmp(buffer, "GenuineIntel")) return PFMLIB_ERR_NOTSUPP; ret = __pfm_getcpuinfo_attr("cpu family", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; family = atoi(buffer); ret = __pfm_getcpuinfo_attr("model", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; /* * we use model to detect model 2 which has one more counter IQ_ESCR1 */ p4_model = atoi(buffer); if (family != 15) return PFMLIB_ERR_NOTSUPP; /* * IQ_ESCR0, IQ_ESCR1 only for model 1 and 2 */ if (p4_model >2) pentium4_support.pmc_count -= 2; return family == 15 ? PFMLIB_SUCCESS : PFMLIB_ERR_NOTSUPP; } /** * pentium4_get_impl_pmcs * * Set the appropriate bit in the impl_pmcs bitmask for each PMC that's * available on Pentium4. * * FIXME: How can we detect when HyperThreading is enabled? **/ static void pentium4_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs) { unsigned int i; for(i = 0; i < PENTIUM4_NUM_PMCS; i++) { pfm_regmask_set(impl_pmcs, i); } /* * IQ_ESCR0, IQ_ESCR1 only available on model 1 and 2 */ if (p4_model > 2) { pfm_regmask_clr(impl_pmcs, 16); pfm_regmask_clr(impl_pmcs, 48); } } /** * pentium4_get_impl_pmds * * Set the appropriate bit in the impl_pmcs bitmask for each PMD that's * available on Pentium4. * * FIXME: How can we detect when HyperThreading is enabled? **/ static void pentium4_get_impl_pmds(pfmlib_regmask_t *impl_pmds) { unsigned int i; for(i = 0; i < PENTIUM4_NUM_PMDS; i++) { pfm_regmask_set(impl_pmds, i); } } /** * pentium4_get_impl_counters * * Set the appropriate bit in the impl_counters bitmask for each counter * that's available on Pentium4. * * For now, all PMDs are counters, so just call get_impl_pmds(). 
**/
static void pentium4_get_impl_counters(pfmlib_regmask_t *impl_counters)
{
    pentium4_get_impl_pmds(impl_counters);
}

/**
 * pentium4_get_hw_counter_width
 *
 * Return the number of usable bits in the PMD counters.
 **/
static void pentium4_get_hw_counter_width(unsigned int *width)
{
    *width = PENTIUM4_COUNTER_WIDTH;
}

/**
 * pentium4_get_event_desc
 *
 * Return the description for the specified event (if it has one).
 *
 * FIXME: In this routine, we make a copy of the description string to
 *        return. But in get_event_name(), we just return the string
 *        directly. Why the difference?
 **/
static int pentium4_get_event_desc(unsigned int event, char **desc)
{
    if (pentium4_events[event].desc) {
        *desc = strdup(pentium4_events[event].desc);
    } else {
        *desc = NULL;
    }
    return PFMLIB_SUCCESS;
}

/**
 * pentium4_get_event_mask_desc
 *
 * Return the description for the specified event-mask (if it has one).
 **/
static int pentium4_get_event_mask_desc(unsigned int event,
                                        unsigned int mask, char **desc)
{
    if (mask >= EVENT_MASK_BITS ||
        pentium4_events[event].event_masks[mask].desc == NULL)
        return PFMLIB_ERR_INVAL;

    *desc = strdup(pentium4_events[event].event_masks[mask].desc);
    return PFMLIB_SUCCESS;
}

static int pentium4_get_event_mask_code(unsigned int event,
                                        unsigned int mask, unsigned int *code)
{
    *code = 1U << pentium4_events[event].event_masks[mask].bit;
    return PFMLIB_SUCCESS;
}

static int pentium4_get_cycle_event(pfmlib_event_t *e)
{
    e->event = PENTIUM4_CPU_CLK_UNHALTED;
    e->num_masks = 1;
    e->unit_masks[0] = 0;
    return PFMLIB_SUCCESS;
}

static int pentium4_get_inst_retired(pfmlib_event_t *e)
{
    /*
     * some models do not implement INSTR_COMPLETED
     */
    if (p4_model != 3 && p4_model != 4 && p4_model != 6) {
        e->event = PENTIUM4_INST_RETIRED;
        e->num_masks = 2;
        e->unit_masks[0] = 0;
        e->unit_masks[1] = 1;
    } else {
        e->event = PME_INSTR_COMPLETED;
        e->num_masks = 1;
        e->unit_masks[0] = 0;
    }
    return PFMLIB_SUCCESS;
}

/**
 * pentium4_support
 **/
pfm_pmu_support_t pentium4_support = {
    .pmu_name               = "Pentium4/Xeon/EM64T",
    .pmu_type               = PFMLIB_PENTIUM4_PMU,
    .pme_count              = PENTIUM4_EVENT_COUNT,
    .pmd_count              = PENTIUM4_NUM_PMDS,
    .pmc_count              = PENTIUM4_NUM_PMCS,
    .num_cnt                = PENTIUM4_NUM_PMDS,
    .get_event_code         = pentium4_get_event_code,
    .get_event_name         = pentium4_get_event_name,
    .get_event_mask_name    = pentium4_get_event_mask_name,
    .get_event_counters     = pentium4_get_event_counters,
    .get_num_event_masks    = pentium4_get_num_event_masks,
    .dispatch_events        = pentium4_dispatch_events,
    .pmu_detect             = pentium4_pmu_detect,
    .get_impl_pmcs          = pentium4_get_impl_pmcs,
    .get_impl_pmds          = pentium4_get_impl_pmds,
    .get_impl_counters      = pentium4_get_impl_counters,
    .get_hw_counter_width   = pentium4_get_hw_counter_width,
    .get_event_desc         = pentium4_get_event_desc,
    .get_event_mask_desc    = pentium4_get_event_mask_desc,
    .get_event_mask_code    = pentium4_get_event_mask_code,
    .get_cycle_event        = pentium4_get_cycle_event,
    .get_inst_retired_event = pentium4_get_inst_retired
};

/* ===== File: papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_pentium4_priv.h ===== */

/*
 * Copyright (c) 2006 IBM Corp.
 * Contributed by Kevin Corry
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
 * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
 * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
 * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * pfmlib_pentium4_priv.h
 *
 * Structures and definitions for use in the Pentium4/Xeon/EM64T libpfm code.
 */

#ifndef _PFMLIB_PENTIUM4_PRIV_H_
#define _PFMLIB_PENTIUM4_PRIV_H_

/**
 * pentium4_escr_reg_t
 *
 * Describe one ESCR register.
 *
 * "pentium4_escrs" is a flat array of these structures
 * that defines all the ESCRs.
 *
 * @name: ESCR's name
 * @pmc:  Perfmon's PMC number for this ESCR.
 * @allowed_cccrs: Array of CCCR numbers that can be used with this ESCR. A
 *                 positive value is an index into the pentium4_cccrs array.
 *                 A value of -1 indicates that slot is unused.
 **/
#define MAX_CCCRS_PER_ESCR 3

typedef struct {
    char *name;
    int pmc;
    int allowed_cccrs[MAX_CCCRS_PER_ESCR];
} pentium4_escr_reg_t;

/* CCCR: Counter Configuration Control Register
 *
 * These registers are used to configure the data counters. There are 18
 * CCCRs, one for each data counter.
 */

/**
 * pentium4_cccr_reg_t
 *
 * Describe one CCCR register.
 *
 * "pentium4_cccrs" is a flat array of these structures
 * that defines all the CCCRs.
 *
 * @name: CCCR's name
 * @pmc:  Perfmon's PMC number for this CCCR
 * @pmd:  Perfmon's PMD number for the associated data counter. Every CCCR has
 *        exactly one counter.
 * @allowed_escrs: Array of ESCR numbers that can be used with this CCCR. A
 *                 positive value is an index into the pentium4_escrs array.
 *                 A value of -1 indicates that slot is unused. The index into
 *                 this array is the value to use in the escr_select portion
 *                 of the CCCR value.
 **/
#define MAX_ESCRS_PER_CCCR 8

typedef struct {
    char *name;
    int pmc;
    int pmd;
    int allowed_escrs[MAX_ESCRS_PER_CCCR];
} pentium4_cccr_reg_t;

/**
 * pentium4_replay_regs_t
 *
 * Describe one pair of PEBS registers for use with the replay_event event.
 *
 * "p4_replay_regs" is a flat array of these structures
 * that defines all the PEBS pairs per Table A-10 of
 * the Intel System Programming Guide Vol 3B.
 *
 * @enb:      value for the PEBS_ENABLE register for a given replay metric.
 * @mat_vert: value for the PEBS_MATRIX_VERT register for a given metric.
 *
 * The replay_event event defines a series of virtual mask bits
 * that serve as indexes into this array. The values at that index
 * provide information programmed into the PEBS registers to count
 * specific metrics available to the replay_event event.
 **/
typedef struct {
    int enb;
    int mat_vert;
} pentium4_replay_regs_t;

/**
 * pentium4_pmc_t
 *
 * Provide a mapping from PMC number to the type of control register and
 * its index within the appropriate array.
 *
 * @name:  Name
 * @type:  PENTIUM4_PMC_TYPE_ESCR or PENTIUM4_PMC_TYPE_CCCR
 * @index: Index into the pentium4_escrs array or the pentium4_cccrs array.
 **/
typedef struct {
    char *name;
    int type;
    int index;
} pentium4_pmc_t;

#define PENTIUM4_PMC_TYPE_ESCR 1
#define PENTIUM4_PMC_TYPE_CCCR 2

/**
 * pentium4_event_mask_t
 *
 * Defines one bit of the event-mask for one Pentium4 event.
 *
 * @name: Event mask name
 * @desc: Event mask description
 * @bit:  The bit position within the event_mask field.
 **/
typedef struct {
    char *name;
    char *desc;
    unsigned int bit;
} pentium4_event_mask_t;

/**
 * pentium4_event_t
 *
 * Describe one event that can be counted on Pentium4/EM64T.
 *
 * "pentium4_events" is a flat array of these structures that defines
 * all possible events.
 *
 * @name: Event name
 * @desc: Event description
 * @event_select: Value for the 'event_select' field in the ESCR (bits [31:25]).
 * @escr_select:  Value for the 'escr_select' field in the CCCR (bits [15:13]).
 * @allowed_escrs: Numbers for ESCRs that can be used to count this event. A
 *                 positive value is an index into the pentium4_escrs array.
 *                 A value of -1 means that slot is not used.
 * @event_masks: Array of descriptions of available masks for this event.
* Array elements with a NULL 'name' field are unused. **/ #define MAX_ESCRS_PER_EVENT 2 typedef struct { char *name; char *desc; unsigned int event_select; unsigned int escr_select; int allowed_escrs[MAX_ESCRS_PER_EVENT]; pentium4_event_mask_t event_masks[EVENT_MASK_BITS]; } pentium4_event_t; #endif papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_power4_priv.h000066400000000000000000000011731502707512200232720ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __PFMLIB_POWER4_PRIV_H__ #define __PFMLIB_POWER4_PRIV_H__ /* * File: pfmlib_power4_priv.h * CVS: * Author: Corey Ashford * cjashfor@us.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2007. All Rights Reserved. * Contributed by Corey Ashford * * Note: This code was automatically generated and should not be modified by * hand. * */ #define POWER4_NUM_EVENT_COUNTERS 8 #define POWER4_NUM_GROUP_VEC 1 #define POWER4_NUM_CONTROL_REGS 3 #endif papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_power5+_priv.h000066400000000000000000000012011502707512200233360ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __PFMLIB_POWER5p_PRIV_H__ #define __PFMLIB_POWER5p_PRIV_H__ /* * File: pfmlib_power5+_priv.h * CVS: * Author: Corey Ashford * cjashfor@us.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2007. All Rights Reserved. * Contributed by Corey Ashford * * Note: This code was automatically generated and should not be modified by * hand. 
* */ #define POWER5p_NUM_EVENT_COUNTERS 6 #define POWER5p_NUM_GROUP_VEC 3 #define POWER5p_NUM_CONTROL_REGS 3 #endif papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_power5_priv.h000066400000000000000000000011731502707512200232730ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __PFMLIB_POWER5_PRIV_H__ #define __PFMLIB_POWER5_PRIV_H__ /* * File: pfmlib_power5_priv.h * CVS: * Author: Corey Ashford * cjashfor@us.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2007. All Rights Reserved. * Contributed by Corey Ashford * * Note: This code was automatically generated and should not be modified by * hand. * */ #define POWER5_NUM_EVENT_COUNTERS 6 #define POWER5_NUM_GROUP_VEC 3 #define POWER5_NUM_CONTROL_REGS 3 #endif papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_power6_priv.h000066400000000000000000000011731502707512200232740ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __PFMLIB_POWER6_PRIV_H__ #define __PFMLIB_POWER6_PRIV_H__ /* * File: pfmlib_power6_priv.h * CVS: * Author: Corey Ashford * cjashfor@us.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2007. All Rights Reserved. * Contributed by Corey Ashford * * Note: This code was automatically generated and should not be modified by * hand. * */ #define POWER6_NUM_EVENT_COUNTERS 6 #define POWER6_NUM_GROUP_VEC 4 #define POWER6_NUM_CONTROL_REGS 3 #endif papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_power7_priv.h000066400000000000000000000011731502707512200232750ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __PFMLIB_POWER7_PRIV_H__ #define __PFMLIB_POWER7_PRIV_H__ /* * File: pfmlib_power7_priv.h * CVS: * Author: Corey Ashford * cjashfor@us.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2007. All Rights Reserved. 
* Contributed by Corey Ashford * * Note: This code was automatically generated and should not be modified by * hand. * */ #define POWER7_NUM_EVENT_COUNTERS 6 #define POWER7_NUM_GROUP_VEC 4 #define POWER7_NUM_CONTROL_REGS 3 #endif papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_power_priv.h000066400000000000000000000016531502707512200232110ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __PFMLIB_POWER_PRIV_H__ #define __PFMLIB_POWER_PRIV_H__ /* * File: pfmlib_power_priv.h * CVS: * Author: Corey Ashford * cjashfor@us.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2007. All Rights Reserved. * Contributed by Corey Ashford * * Note: This code was automatically generated and should not be modified by * hand. * */ typedef struct { char *pme_name; unsigned pme_code; char *pme_short_desc; char *pme_long_desc; const int *pme_event_ids; const unsigned long long *pme_group_vector; } pme_power_entry_t; typedef struct { char *pmg_name; char *pmg_desc; const int *pmg_event_ids; unsigned long long pmg_mmcr0; unsigned long long pmg_mmcr1; unsigned long long pmg_mmcra; } pmg_power_group_t; #endif papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_powerpc_priv.h000066400000000000000000000023571502707512200235360ustar00rootroot00000000000000/* * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. 
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
 * IN THE SOFTWARE.
 *
 * pfmlib_powerpc_priv.h
 *
 * Structures and definitions for use in the PowerPC libpfm code.
 */

#ifndef _PFMLIB_POWERPC_PRIV_H_
#define _PFMLIB_POWERPC_PRIV_H_
#endif

/* ===== File: papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_ppc970_priv.h ===== */

/****************************/
/* THIS IS OPEN SOURCE CODE */
/****************************/

#ifndef __PFMLIB_PPC970_PRIV_H__
#define __PFMLIB_PPC970_PRIV_H__

/*
 * File:   pfmlib_ppc970_priv.h
 * CVS:
 * Author: Corey Ashford
 *         cjashfor@us.ibm.com
 * Mods:
 *
 * (C) Copyright IBM Corporation, 2007. All Rights Reserved.
 * Contributed by Corey Ashford
 *
 * Note: This code was automatically generated and should not be modified by
 * hand.
 */

#define PPC970_NUM_EVENT_COUNTERS 8
#define PPC970_NUM_GROUP_VEC 1
#define PPC970_NUM_CONTROL_REGS 3

#endif

/* ===== File: papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_ppc970mp_priv.h ===== */

/****************************/
/* THIS IS OPEN SOURCE CODE */
/****************************/

#ifndef __PFMLIB_PPC970MP_PRIV_H__
#define __PFMLIB_PPC970MP_PRIV_H__

/*
 * File:   pfmlib_ppc970mp_priv.h
 * CVS:
 * Author: Corey Ashford
 *         cjashfor@us.ibm.com
 * Mods:
 *
 * (C) Copyright IBM Corporation, 2007. All Rights Reserved.
 * Contributed by Corey Ashford
 *
 * Note: This code was automatically generated and should not be modified by
 * hand.
 *
 */

#define PPC970MP_NUM_EVENT_COUNTERS 8
#define PPC970MP_NUM_GROUP_VEC 1
#define PPC970MP_NUM_CONTROL_REGS 3

#endif

/* ===== File: papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_priv.c ===== */

/*
 * pfmlib_priv.c: set of internal utility functions for all architectures
 *
 * Copyright (c) 2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
 * IN THE SOFTWARE.
 */

#include
#include
#include
#include
#include
#include
#include
#include

#include "pfmlib_priv.h"

/*
 * file for all libpfm verbose and debug output
 *
 * By default, it is set to stderr, unless the
 * PFMLIB_DEBUG_STDOUT environment variable is set
 */
FILE *libpfm_fp;

/*
 * by convention all internal utility functions must be prefixed by __
 */

/*
 * debug printf
 */
void __pfm_vbprintf(const char *fmt, ...)
{
    va_list ap;

    if (pfm_config.options.pfm_verbose == 0)
        return;

    va_start(ap, fmt);
    vfprintf(libpfm_fp, fmt, ap);
    va_end(ap);
}

int __pfm_check_event(pfmlib_event_t *e)
{
    unsigned int n, j;

    if (e->event >= pfm_current->pme_count)
        return PFMLIB_ERR_INVAL;

    n = pfm_num_masks(e->event);
    if (n == 0 && e->num_masks)
        return PFMLIB_ERR_UMASK;

    for (j = 0; j < e->num_masks; j++) {
        if (e->unit_masks[j] >= n)
            return PFMLIB_ERR_UMASK;
    }

    /*
     * if the event has umasks but none was specified by the user, then
     * return:
     *  - an error if no default umask is defined
     *  - success if a default umask exists for the event
     */
    if (n && j == 0) {
        if (pfm_current->has_umask_default &&
            pfm_current->has_umask_default(e->event))
            return PFMLIB_SUCCESS;
        return PFMLIB_ERR_UMASK;
    }
    return PFMLIB_SUCCESS;
}

/* ===== File: papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_priv.h ===== */

/*
 * Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
 * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
 * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
 * USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux/ia64.
 */

#ifndef __PFMLIB_PRIV_H__
#define __PFMLIB_PRIV_H__

#include
#include "pfmlib_priv_comp.h"

typedef struct {
    char *pmu_name;
    int pmu_type;            /* must remain int, using -1 */
    unsigned int pme_count;  /* number of events */
    unsigned int pmd_count;  /* number of PMD registers */
    unsigned int pmc_count;  /* number of PMC registers */
    unsigned int num_cnt;    /* number of counters (counting PMD registers) */
    unsigned int flags;
    int (*get_event_code)(unsigned int i, unsigned int cnt, int *code);
    int (*get_event_mask_code)(unsigned int i, unsigned int mask_idx, unsigned int *code);
    char *(*get_event_name)(unsigned int i);
    char *(*get_event_mask_name)(unsigned int event_idx, unsigned int mask_idx);
    void (*get_event_counters)(unsigned int i, pfmlib_regmask_t *counters);
    unsigned int (*get_num_event_masks)(unsigned int event_idx);
    int (*dispatch_events)(pfmlib_input_param_t *p, void *model_in, pfmlib_output_param_t *q, void *model_out);
    int (*pmu_detect)(void);
    int (*pmu_init)(void);
    void (*get_impl_pmcs)(pfmlib_regmask_t *impl_pmcs);
    void (*get_impl_pmds)(pfmlib_regmask_t *impl_pmds);
    void (*get_impl_counters)(pfmlib_regmask_t *impl_counters);
    void (*get_hw_counter_width)(unsigned int *width);
    int (*get_event_desc)(unsigned int i, char **buf);
    int (*get_event_mask_desc)(unsigned int event_idx, unsigned int mask_idx, char **buf);
    int (*get_cycle_event)(pfmlib_event_t *e);
    int (*get_inst_retired_event)(pfmlib_event_t *e);
    int (*has_umask_default)(unsigned int i);  /* optional */
} pfm_pmu_support_t;

#define PFMLIB_MULT_CODE_EVENT 0x1 /* more than one code per event (depending on counter) */

#define PFMLIB_CNT_FIRST -1
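The `pfm_pmu_support_t` table declared just above is how libpfm decouples the generic library core from each PMU backend: a backend fills one of these structures with callbacks, and initialization probes each backend's `pmu_detect()` until one claims the hardware. Below is a minimal standalone sketch of that detect-then-dispatch pattern; the `pmu_support_t` type, the "pmu-a"/"pmu-b" backends, and `pmu_probe()` are hypothetical illustrations, not libpfm API.

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>
#include <assert.h>

/* Simplified stand-in for libpfm's pfm_pmu_support_t: each PMU backend
 * exposes a detect probe plus accessors behind function pointers. */
typedef struct {
    const char *pmu_name;
    unsigned int pme_count;
    int (*pmu_detect)(void);                   /* 0 on success, -1 if not this PMU */
    const char *(*get_event_name)(unsigned int idx);
} pmu_support_t;

/* Hypothetical backend "pmu-a": pretends the hardware is present. */
static int a_detect(void) { return 0; }
static const char *a_event(unsigned int idx) {
    static const char *names[] = { "CYCLES", "INSTRUCTIONS" };
    return idx < 2 ? names[idx] : NULL;
}

/* Hypothetical backend "pmu-b": pretends the hardware is absent. */
static int b_detect(void) { return -1; }
static const char *b_event(unsigned int idx) { (void)idx; return NULL; }

static pmu_support_t pmu_a = { "pmu-a", 2, a_detect, a_event };
static pmu_support_t pmu_b = { "pmu-b", 0, b_detect, b_event };

/* Probe each registered backend in order and keep the first one whose
 * detect callback succeeds; NULL if no backend matches. */
static pmu_support_t *pmu_probe(pmu_support_t **table, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (table[i]->pmu_detect() == 0)
            return table[i];
    return NULL;
}
```

In libpfm itself, `pfm_initialize()` plays a role similar to `pmu_probe()` here: it walks a static list of `pfm_pmu_support_t` pointers (the `extern` declarations later in this header) and records the first PMU whose detect callback succeeds as the active one, after which all generic entry points dispatch through that structure's function pointers.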
                     /* return code for event on first counter */
#define PFMLIB_NO_EVT (~0U) /* no event index associated with event */

typedef struct {
    pfmlib_options_t options;
    pfm_pmu_support_t *current;
    int options_env_set; /* 1 if options set by env variables */
} pfm_config_t;

#define PFMLIB_INITIALIZED() (pfm_config.current != NULL)

extern pfm_config_t pfm_config;

#define PFMLIB_DEBUG()   pfm_config.options.pfm_debug
#define PFMLIB_VERBOSE() pfm_config.options.pfm_verbose
#define pfm_current      pfm_config.current

extern void __pfm_vbprintf(const char *fmt, ...);
extern int __pfm_check_event(pfmlib_event_t *e);

/*
 * provided by OS-specific module
 */
extern int __pfm_getcpuinfo_attr(const char *attr, char *ret_buf, size_t maxlen);
extern void pfm_init_syscalls(void);

#ifdef PFMLIB_DEBUG
#define DPRINT(fmt, a...) \
    do { \
        if (pfm_config.options.pfm_debug) { \
            fprintf(libpfm_fp, "%s (%s.%d): " fmt, __FILE__, __func__, __LINE__, ## a); } \
    } while (0)
#else
/* must accept the same variadic arguments as the debug version */
#define DPRINT(fmt, a...)
#endif

#define ALIGN_DOWN(a,p) ((a) & ~((1UL<<(p))-1))
#define ALIGN_UP(a,p)   ((((a) + ((1UL<<(p))-1))) & ~((1UL<<(p))-1))

extern pfm_pmu_support_t crayx2_support;
extern pfm_pmu_support_t montecito_support;
extern pfm_pmu_support_t itanium2_support;
extern pfm_pmu_support_t itanium_support;
extern pfm_pmu_support_t generic_ia64_support;
extern pfm_pmu_support_t amd64_support;
extern pfm_pmu_support_t i386_p6_support;
extern pfm_pmu_support_t i386_ppro_support;
extern pfm_pmu_support_t i386_pii_support;
extern pfm_pmu_support_t i386_pm_support;
extern pfm_pmu_support_t gen_ia32_support;
extern pfm_pmu_support_t generic_mips64_support;
extern pfm_pmu_support_t sicortex_support;
extern pfm_pmu_support_t pentium4_support;
extern pfm_pmu_support_t coreduo_support;
extern pfm_pmu_support_t core_support;
extern pfm_pmu_support_t gen_powerpc_support;
extern pfm_pmu_support_t sparc_support;
extern pfm_pmu_support_t cell_support;
extern pfm_pmu_support_t intel_atom_support;
extern pfm_pmu_support_t intel_nhm_support;
extern pfm_pmu_support_t intel_wsm_support;

static inline unsigned int pfm_num_masks(int e)
{
    if (pfm_current->get_num_event_masks == NULL)
        return 0;
    return pfm_current->get_num_event_masks(e);
}

extern FILE *libpfm_fp;
extern int forced_pmu;

extern int _pfmlib_sys_base;      /* syscall base */
extern int _pfmlib_major_version; /* kernel perfmon major version */
extern int _pfmlib_minor_version; /* kernel perfmon minor version */

static inline int _pfmlib_get_sys_base(void)
{
    if (!_pfmlib_sys_base)
        pfm_init_syscalls();
    return _pfmlib_sys_base;
}

#endif /* __PFMLIB_PRIV_H__ */

/* ===== File: papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_priv_comp.h ===== */

/*
 * Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
* * This file is part of libpfm, a performance monitoring support library for * applications on Linux/ia64. */ #ifndef __PFMLIB_PRIV_COMP_H__ #define __PFMLIB_PRIV_COMP_H__ #include /* * this header file contains all the macros, inline assembly, instrinsics needed * by the library and which are compiler-specific */ #ifdef __ia64__ #include "pfmlib_priv_comp_ia64.h" #endif #endif papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_priv_comp_ia64.h000066400000000000000000000037541502707512200236420ustar00rootroot00000000000000/* * Copyright (c) 2003-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux/ia64. 
*/ #ifndef __PFMLIB_PRIV_COMP_IA64_H__ #define __PFMLIB_PRIV_COMP_IA64_H__ #include #ifdef LIBPFM_USING_INTEL_ECC_COMPILER #define ia64_get_cpuid(regnum) __getIndReg(_IA64_REG_INDR_CPUID, (regnum)) #define ia64_getf(d) __getf_exp(d) #elif defined(__GNUC__) static inline unsigned long ia64_get_cpuid (unsigned long regnum) { unsigned long r; asm ("mov %0=cpuid[%r1]" : "=r"(r) : "rO"(regnum)); return r; } static inline unsigned long ia64_getf(double d) { unsigned long exp; __asm__ ("getf.exp %0=%1" : "=r"(exp) : "f"(d)); return exp; } #else /* !GNUC nor INTEL_ECC */ #error "need to define a set of compiler-specific macros" #endif #endif /* __PFMLIB_PRIV_COMP_IA64_H__ */ papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_priv_ia64.h000066400000000000000000000037141502707512200226200ustar00rootroot00000000000000/* * Copyright (c) 2003-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux/ia64.
 */

#ifndef __PFMLIB_PRIV_IA64_H__
#define __PFMLIB_PRIV_IA64_H__

typedef struct {
    unsigned long db_mask:56;
    unsigned long db_plm:4;
    unsigned long db_ig:2;
    unsigned long db_w:1;
    unsigned long db_rx:1;
} br_mask_reg_t;

typedef union {
    unsigned long val;
    br_mask_reg_t db;
} dbreg_t;

static inline int pfm_ia64_get_cpu_family(void)
{
    return (int)((ia64_get_cpuid(3) >> 24) & 0xff);
}

static inline int pfm_ia64_get_cpu_model(void)
{
    return (int)((ia64_get_cpuid(3) >> 16) & 0xff);
}

/*
 * find last bit set
 */
static inline int pfm_ia64_fls(unsigned long x)
{
    double d = x;
    long exp;

    exp = ia64_getf(d);
    return exp - 0xffff;
}

#endif /* __PFMLIB_PRIV_IA64_H__ */

/* ===== File: papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_sicortex.c ===== */

/*
 * pfmlib_sicortex.c : support for the generic MIPS64 PMU family
 *
 * Contributed by Philip Mucci based on code from
 * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include #include #include #include #include /* public headers */ #include #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_sicortex_priv.h" /* architecture private */ #include "sicortex/ice9a/ice9a_all_spec_pme.h" #include "sicortex/ice9b/ice9b_all_spec_pme.h" #include "sicortex/ice9/ice9_scb_spec_sw.h" /* let's define some handy shortcuts! */ #define sel_event_mask perfsel.sel_event_mask #define sel_exl perfsel.sel_exl #define sel_os perfsel.sel_os #define sel_usr perfsel.sel_usr #define sel_sup perfsel.sel_sup #define sel_int perfsel.sel_int static pme_sicortex_entry_t *sicortex_pe = NULL; // CHANGE FOR ICET #define core_counters 2 #define MAX_ICE9_PMCS 2+4+256 #define MAX_ICE9_PMDS 2+4+256 static int compute_ice9_counters(int type) { int i; int bound = 0; pme_gen_mips64_entry_t *gen_mips64_pe = NULL; sicortex_support.pmd_count = 0; sicortex_support.pmc_count = 0; for (i=0;i 2) { /* Account for 4 sampling PMD registers */ sicortex_support.num_cnt = sicortex_support.pmd_count - 4; sicortex_support.pme_count = bound; } else { sicortex_support.pme_count = 0; /* Count up CPU only events */ for (i=0;i> (cntr*8)) & 0xff; pc[j].reg_addr = cntr*2; pc[j].reg_value = reg.val; pc[j].reg_num = cntr; __pfm_vbprintf("[CP0_25_%u(pmc%u)=0x%"PRIx64" event_mask=0x%x usr=%d os=%d sup=%d exl=%d int=1] %s\n", pc[j].reg_addr, pc[j].reg_num, pc[j].reg_value, reg.sel_event_mask, reg.sel_usr, reg.sel_os, reg.sel_sup, reg.sel_exl, sicortex_pe[e[j].event].pme_name); pd[j].reg_num = cntr; pd[j].reg_addr = cntr*2 + 1; __pfm_vbprintf("[CP0_25_%u(pmd%u)]\n", pc[j].reg_addr, pc[j].reg_num); } /* SCB event */ else { pmc_sicortex_scb_reg_t scbreg; int 
k; scbreg.val = 0; scbreg.sicortex_ScbPerfBucket_reg.event = sicortex_pe[e[j].event].pme_code >> 16; for (k=0;kflags & PFMLIB_SICORTEX_INPUT_SCB_INTERVAL)) { two.sicortex_ScbPerfCtl_reg.Interval = mod_in->pfp_sicortex_scb_global.Interval; } else { two.sicortex_ScbPerfCtl_reg.Interval = 6; /* 2048 cycles */ } if (mod_in && (mod_in->flags & PFMLIB_SICORTEX_INPUT_SCB_NOINC)) { two.sicortex_ScbPerfCtl_reg.NoInc = mod_in->pfp_sicortex_scb_global.NoInc; } else { two.sicortex_ScbPerfCtl_reg.NoInc = 0; } two.sicortex_ScbPerfCtl_reg.IntBit = 31; /* Interrupt on last bit */ two.sicortex_ScbPerfCtl_reg.MagicEvent = 0; two.sicortex_ScbPerfCtl_reg.AddrAssert = 1; __pfm_vbprintf("[Scb%s(pmc%u)=0x%"PRIx64" Interval=0x%x IntBit=0x%x NoInc=%d AddrAssert=%d MagicEvent=0x%x]\n","PerfCtl", pc[num].reg_num, two.val, two.sicortex_ScbPerfCtl_reg.Interval, two.sicortex_ScbPerfCtl_reg.IntBit, two.sicortex_ScbPerfCtl_reg.NoInc, two.sicortex_ScbPerfCtl_reg.AddrAssert, two.sicortex_ScbPerfCtl_reg.MagicEvent); pc[num].reg_value = two.val; /*ScbPerfHist */ pc[++num].reg_num = 3; pc[num].reg_addr = 3; three.val = 0; if (mod_in && (mod_in->flags & PFMLIB_SICORTEX_INPUT_SCB_HISTGTE)) three.sicortex_ScbPerfHist_reg.HistGte = mod_in->pfp_sicortex_scb_global.HistGte; else three.sicortex_ScbPerfHist_reg.HistGte = 1; __pfm_vbprintf("[Scb%s(pmc%u)=0x%"PRIx64" HistGte=0x%x]\n","PerfHist", pc[num].reg_num, three.val, three.sicortex_ScbPerfHist_reg.HistGte); pc[num].reg_value = three.val; /*ScbPerfBuckNum */ pc[++num].reg_num = 4; pc[num].reg_addr = 4; four.val = 0; if (mod_in && (mod_in->flags & PFMLIB_SICORTEX_INPUT_SCB_BUCKET)) four.sicortex_ScbPerfBuckNum_reg.Bucket = mod_in->pfp_sicortex_scb_global.Bucket; else four.sicortex_ScbPerfBuckNum_reg.Bucket = 0; __pfm_vbprintf("[Scb%s(pmc%u)=0x%"PRIx64" Bucket=0x%x]\n","PerfBuckNum", pc[num].reg_num, four.val, four.sicortex_ScbPerfBuckNum_reg.Bucket); pc[num].reg_value = four.val; /*ScbPerfEna */ pc[++num].reg_num = 5; pc[num].reg_addr = 5; five.val = 0; 
five.sicortex_ScbPerfEna_reg.ena = 1; __pfm_vbprintf("[Scb%s(pmc%u)=0x%"PRIx64" ena=%d]\n","PerfEna", pc[num].reg_num, five.val, five.sicortex_ScbPerfEna_reg.ena); pc[num].reg_value = five.val; ++num; return(num); } /* * Automatically dispatch events to corresponding counters following constraints. * Upon return the pfarg_regt structure is ready to be submitted to kernel */ static int pfm_sicortex_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_sicortex_input_param_t *mod_in, pfmlib_output_param_t *outp) { /* pfmlib_sicortex_input_param_t *param = mod_in; */ pfmlib_event_t *e = inp->pfp_events; pfmlib_reg_t *pc, *pd; unsigned int i, j, cnt = inp->pfp_event_count; unsigned int used = 0; extern pfm_pmu_support_t sicortex_support; unsigned int cntr, avail; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; /* Degree N rank based allocation */ if (cnt > sicortex_support.pmc_count) return PFMLIB_ERR_TOOMANY; if (PFMLIB_DEBUG()) { for (j=0; j < cnt; j++) { DPRINT("ev[%d]=%s, counters=0x%x\n", j, sicortex_pe[e[j].event].pme_name,sicortex_pe[e[j].event].pme_counters); } } /* Do rank based allocation, counters that live on 1 reg before counters that live on 2 regs etc. 
*/ /* CPU counters first */ for (i=1;i<=core_counters;i++) { for (j=0; j < cnt;j++) { /* CPU counters first */ if ((sicortex_pe[e[j].event].pme_counters & ((1<pfp_dfl_plm,pc,pd,cntr,j,mod_in); used |= (1 << cntr); DPRINT("Rank %d: Used counters 0x%x\n",i, used); } } } /* SCB counters can live anywhere */ used = 0; for (j=0; j < cnt;j++) { unsigned int cntr; /* CPU counters first */ if (sicortex_pe[e[j].event].pme_counters & (1<pfp_dfl_plm,pc,pd,cntr,j,mod_in); used++; DPRINT("SCB(%d): Used counters %d\n",j,used); } } if (used) { outp->pfp_pmc_count = stuff_sicortex_scb_control_regs(pc,pd,cnt,mod_in); outp->pfp_pmd_count = cnt; return PFMLIB_SUCCESS; } /* number of evtsel registers programmed */ outp->pfp_pmc_count = cnt; outp->pfp_pmd_count = cnt; return PFMLIB_SUCCESS; } static int pfm_sicortex_dispatch_events(pfmlib_input_param_t *inp, void *model_in, pfmlib_output_param_t *outp, void *model_out) { pfmlib_sicortex_input_param_t *mod_sicortex_in = (pfmlib_sicortex_input_param_t *)model_in; return pfm_sicortex_dispatch_counters(inp, mod_sicortex_in, outp); } static int pfm_sicortex_get_event_code(unsigned int i, unsigned int cnt, int *code) { extern pfm_pmu_support_t sicortex_support; /* check validity of counter index */ if (cnt != PFMLIB_CNT_FIRST) { if (cnt < 0 || cnt >= sicortex_support.pmc_count) return PFMLIB_ERR_INVAL; } else { cnt = ffs(sicortex_pe[i].pme_counters)-1; if (cnt == -1) return(PFMLIB_ERR_INVAL); } /* if cnt == 1, shift right by 0, if cnt == 2, shift right by 8 */ /* Works on both 5k anf 20K */ unsigned int tmp = sicortex_pe[i].pme_counters; /* CPU event */ if (tmp & ((1<> (cnt*8)); else return PFMLIB_ERR_INVAL; } /* SCB event */ else { if ((cnt < 6) || (cnt >= sicortex_support.pmc_count)) return PFMLIB_ERR_INVAL; *code = 0xffff & (sicortex_pe[i].pme_code >> 16); } return PFMLIB_SUCCESS; } /* * This function is accessible directly to the user */ int pfm_sicortex_get_event_umask(unsigned int i, unsigned long *umask) { extern pfm_pmu_support_t 
sicortex_support; if (i >= sicortex_support.pme_count || umask == NULL) return PFMLIB_ERR_INVAL; *umask = 0; //evt_umask(i); return PFMLIB_SUCCESS; } static void pfm_sicortex_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { extern pfm_pmu_support_t sicortex_support; unsigned int tmp; memset(counters, 0, sizeof(*counters)); tmp = sicortex_pe[j].pme_counters; /* CPU counter */ if (tmp & ((1< core_counters) { /* counting pmds are not contiguous on ICE9*/ for(i=6; i < sicortex_support.pmd_count; i++) pfm_regmask_set(impl_counters, i); } } static void pfm_sicortex_get_hw_counter_width(unsigned int *width) { *width = PMU_GEN_MIPS64_COUNTER_WIDTH; } static char * pfm_sicortex_get_event_name(unsigned int i) { return sicortex_pe[i].pme_name; } static int pfm_sicortex_get_event_description(unsigned int ev, char **str) { char *s; s = sicortex_pe[ev].pme_desc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static int pfm_sicortex_get_cycle_event(pfmlib_event_t *e) { return pfm_find_full_event("CPU_CYCLES",e); } static int pfm_sicortex_get_inst_retired(pfmlib_event_t *e) { return pfm_find_full_event("CPU_INSEXEC",e); } /* SiCortex specific functions */ /* CPU counter */ int pfm_sicortex_is_cpu(unsigned int i) { if (i < sicortex_support.pme_count) { unsigned int tmp = sicortex_pe[i].pme_counters; return !(tmp & (1< based on code from * Copyright (c) 2004-2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux/ia64. */ #ifndef __PFMLIB_SICORTEX_PRIV_H__ #define __PFMLIB_SICORTEX_PRIV_H__ #include "pfmlib_gen_mips64_priv.h" #define PFMLIB_SICORTEX_MAX_UMASK 5 typedef struct { char *pme_uname; /* unit mask name */ char *pme_udesc; /* event/umask description */ unsigned int pme_ucode; /* unit mask code */ } pme_sicortex_umask_t; typedef struct { char *pme_name; char *pme_desc; /* text description of the event */ unsigned int pme_code; /* event mask, holds room for four events, low 8 bits cntr0, ... 
high 8 bits cntr3 */
	unsigned int pme_counters;	/* Which counter event lives on */
	unsigned int pme_numasks;	/* number of umasks */
	pme_sicortex_umask_t pme_umasks[PFMLIB_SICORTEX_MAX_UMASK]; /* umask desc */
} pme_sicortex_entry_t;

static pme_sicortex_umask_t sicortex_scb_umasks[PFMLIB_SICORTEX_MAX_UMASK] = {
	{ "IFOTHER_NONE","Both buckets count independently",0x00 },
	{ "IFOTHER_AND","Increment where this event counts and the opposite bucket counts",0x02 },
	{ "IFOTHER_ANDNOT","Increment where this event counts and the opposite bucket does not",0x04 },
	{ "HIST_NONE","Count cycles where the event is asserted",0x0 },
	{ "HIST_EDGE","Histogram on edges of the specified event",0x1 }
};

#endif /* __PFMLIB_SICORTEX_PRIV_H__ */
papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_sparc.c000066400000000000000000000333031502707512200221150ustar00rootroot00000000000000/*
 * Copyright (C) 2007 David S. Miller (davem@davemloft.net)
 *
 * Based upon gen_powerpc code which is:
 * Copyright (C) IBM Corporation, 2007. All rights reserved.
 * Contributed by Corey Ashford (cjashfor@us.ibm.com)
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS * IN THE SOFTWARE. * * pfmlib_sparc.c * * Support for libpfm for Sparc processors. */ #ifndef _GNU_SOURCE #define _GNU_SOURCE /* for getline */ #endif #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_sparc_priv.h" #include "ultra12_events.h" #include "ultra3_events.h" #include "ultra3i_events.h" #include "ultra3plus_events.h" #include "ultra4plus_events.h" #include "niagara1_events.h" #include "niagara2_events.h" static char *get_event_name(int event) { switch (sparc_support.pmu_type) { case PFMLIB_SPARC_ULTRA12_PMU: return ultra12_pe[event].pme_name; case PFMLIB_SPARC_ULTRA3_PMU: return ultra3_pe[event].pme_name; case PFMLIB_SPARC_ULTRA3I_PMU: return ultra3i_pe[event].pme_name; case PFMLIB_SPARC_ULTRA3PLUS_PMU: return ultra3plus_pe[event].pme_name; case PFMLIB_SPARC_ULTRA4PLUS_PMU: return ultra4plus_pe[event].pme_name; case PFMLIB_SPARC_NIAGARA1_PMU: return niagara1_pe[event].pme_name; case PFMLIB_SPARC_NIAGARA2_PMU: return niagara2_pe[event].pme_name; } return (char *)-1; } static char *get_event_desc(int event) { switch (sparc_support.pmu_type) { case PFMLIB_SPARC_ULTRA12_PMU: return ultra12_pe[event].pme_desc; case PFMLIB_SPARC_ULTRA3_PMU: return ultra3_pe[event].pme_desc; case PFMLIB_SPARC_ULTRA3I_PMU: return ultra3i_pe[event].pme_desc; case PFMLIB_SPARC_ULTRA3PLUS_PMU: return ultra3plus_pe[event].pme_desc; case PFMLIB_SPARC_ULTRA4PLUS_PMU: return ultra4plus_pe[event].pme_desc; case PFMLIB_SPARC_NIAGARA1_PMU: return niagara1_pe[event].pme_desc; case PFMLIB_SPARC_NIAGARA2_PMU: return niagara2_pe[event].pme_desc; } return (char *)-1; } static char get_ctrl(int event) { switch (sparc_support.pmu_type) { case PFMLIB_SPARC_ULTRA12_PMU: return ultra12_pe[event].pme_ctrl; 
case PFMLIB_SPARC_ULTRA3_PMU: return ultra3_pe[event].pme_ctrl; case PFMLIB_SPARC_ULTRA3I_PMU: return ultra3i_pe[event].pme_ctrl; case PFMLIB_SPARC_ULTRA3PLUS_PMU: return ultra3plus_pe[event].pme_ctrl; case PFMLIB_SPARC_ULTRA4PLUS_PMU: return ultra4plus_pe[event].pme_ctrl; case PFMLIB_SPARC_NIAGARA1_PMU: return niagara1_pe[event].pme_ctrl; case PFMLIB_SPARC_NIAGARA2_PMU: return niagara2_pe[event].pme_ctrl; } return 0xff; } static int get_val(int event) { switch (sparc_support.pmu_type) { case PFMLIB_SPARC_ULTRA12_PMU: return ultra12_pe[event].pme_val; case PFMLIB_SPARC_ULTRA3_PMU: return ultra3_pe[event].pme_val; case PFMLIB_SPARC_ULTRA3I_PMU: return ultra3i_pe[event].pme_val; case PFMLIB_SPARC_ULTRA3PLUS_PMU: return ultra3plus_pe[event].pme_val; case PFMLIB_SPARC_ULTRA4PLUS_PMU: return ultra4plus_pe[event].pme_val; case PFMLIB_SPARC_NIAGARA1_PMU: return niagara1_pe[event].pme_val; case PFMLIB_SPARC_NIAGARA2_PMU: return niagara2_pe[event].pme_val; } return -1; } static int pfm_sparc_get_event_code(unsigned int event, unsigned int pmd, int *code) { *code = get_val(event); return 0; } static char *pfm_sparc_get_event_name(unsigned int event) { return get_event_name(event); } static char *pfm_sparc_get_event_mask_name(unsigned int event, unsigned int mask) { pme_sparc_mask_entry_t *e; if (sparc_support.pmu_type != PFMLIB_SPARC_NIAGARA2_PMU) return ""; e = &niagara2_pe[event]; return e->pme_masks[mask].mask_name; } static void pfm_sparc_get_event_counters(unsigned int event, pfmlib_regmask_t *counters) { if (sparc_support.pmu_type == PFMLIB_SPARC_NIAGARA2_PMU) { counters->bits[0] = (1 << 0) | (1 << 1); } else { char ctrl = get_ctrl(event); counters->bits[0] = 0; if (ctrl & PME_CTRL_S0) counters->bits[0] |= (1 << 0); if (ctrl & PME_CTRL_S1) counters->bits[0] |= (1 << 1); } } static unsigned int pfm_sparc_get_num_event_masks(unsigned int event) { if (sparc_support.pmu_type != PFMLIB_SPARC_NIAGARA2_PMU) return 0; return (event == 0 ? 
	   0 : EVENT_MASK_BITS);
}

/* Bits common to all PCR implementations */
#define PCR_PRIV		(0x1UL << 0)
#define PCR_SYS_TRACE		(0x1UL << 1)
#define PCR_USER_TRACE		(0x1UL << 2)

/* The S0 and S1 fields determine which events are monitored in
 * the associated PIC (PIC0 vs. PIC1 respectively).  For ultra12
 * these fields are 4 bits, on ultra3/3i/3+/4+ they are 6 bits.
 * For Niagara-1 there is only S0 and it is 3 bits in size.
 * Niagara-1's PIC1 is hard-coded to record retired instructions.
 */
#define PCR_S0_SHIFT		4
#define PCR_S0			(0x1fUL << PCR_S0_SHIFT)
#define PCR_S1_SHIFT		11
#define PCR_S1			(0x1fUL << PCR_S1_SHIFT)

/* Niagara-2 specific PCR bits.  It supports event masking. */
#define PCR_N2_HYP_TRACE	(0x1UL << 3)
#define PCR_N2_TOE0		(0x1UL << 4)
#define PCR_N2_TOE1		(0x1UL << 5)
#define PCR_N2_SL0_SHIFT	14
#define PCR_N2_SL0		(0xf << PCR_N2_SL0_SHIFT)
#define PCR_N2_MASK0_SHIFT	6
#define PCR_N2_MASK0		(0xff << PCR_N2_MASK0_SHIFT)
#define PCR_N2_SL1_SHIFT	27
#define PCR_N2_SL1		(0xf << PCR_N2_SL1_SHIFT)
#define PCR_N2_MASK1_SHIFT	19
#define PCR_N2_MASK1		(0xff << PCR_N2_MASK1_SHIFT)

static int pfm_sparc_dispatch_events(pfmlib_input_param_t *input, void *model_input,
				     pfmlib_output_param_t *output, void *model_output)
{
	unsigned long long pcr, vals[2];
	unsigned int plm, i;
	int niagara2;
	char ctrls[2];

	if (input->pfp_event_count > 2)
		return PFMLIB_ERR_TOOMANY;

	plm = ((input->pfp_events[0].plm != 0) ?
input->pfp_events[0].plm : input->pfp_dfl_plm); for (i = 1; i < input->pfp_event_count; i++) { if (input->pfp_events[i].plm == 0) { /* it's ok if the default is the same as plm */ if (plm != input->pfp_dfl_plm) return PFMLIB_ERR_NOASSIGN; } else { if (plm != input->pfp_events[i].plm) return PFMLIB_ERR_NOASSIGN; } } niagara2 = 0; if (sparc_support.pmu_type == PFMLIB_SPARC_NIAGARA2_PMU) niagara2 = 1; pcr = 0; if (plm & PFM_PLM3) pcr |= PCR_USER_TRACE; if (plm & PFM_PLM0) pcr |= PCR_SYS_TRACE; if (niagara2 && (plm & PFM_PLM1)) pcr |= PCR_N2_HYP_TRACE; for (i = 0; i < input->pfp_event_count; i++) { pfmlib_event_t *e = &input->pfp_events[i]; ctrls[i] = get_ctrl(e->event); vals[i] = get_val(e->event); if (i == 1) { if ((ctrls[0] & ctrls[1]) == 0) continue; if (ctrls[0] == (PME_CTRL_S0|PME_CTRL_S1)) { if (ctrls[1] == (PME_CTRL_S0|PME_CTRL_S1)) { ctrls[0] = PME_CTRL_S0; ctrls[1] = PME_CTRL_S1; } else { ctrls[0] &= ~ctrls[1]; } } else if (ctrls[1] == (PME_CTRL_S0|PME_CTRL_S1)) { ctrls[1] &= ~ctrls[0]; } else return PFMLIB_ERR_INVAL; } } if (input->pfp_event_count == 1) { if (ctrls[0] == (PME_CTRL_S0|PME_CTRL_S1)) ctrls[0] = PME_CTRL_S0; } for (i = 0; i < input->pfp_event_count; i++) { unsigned long long val = vals[i]; char ctrl = ctrls[i]; switch (ctrl) { case PME_CTRL_S0: output->pfp_pmds[i].reg_num = 0; pcr |= (val << (niagara2 ? PCR_N2_SL0_SHIFT : PCR_S0_SHIFT)); break; case PME_CTRL_S1: output->pfp_pmds[i].reg_num = 1; pcr |= (val << (niagara2 ? 
PCR_N2_SL1_SHIFT : PCR_S1_SHIFT)); break; default: return PFMLIB_ERR_INVAL; } if (niagara2) { pfmlib_event_t *e = &input->pfp_events[i]; unsigned int j, shift; if (ctrl == PME_CTRL_S0) { pcr |= PCR_N2_TOE0; shift = PCR_N2_MASK0_SHIFT; } else { pcr |= PCR_N2_TOE1; shift = PCR_N2_MASK1_SHIFT; } for (j = 0; j < e->num_masks; j++) { unsigned int mask; mask = e->unit_masks[j]; if (mask >= EVENT_MASK_BITS) return PFMLIB_ERR_INVAL; pcr |= (1ULL << (shift + mask)); } } output->pfp_pmds[i].reg_value = 0; output->pfp_pmds[i].reg_addr = 0; output->pfp_pmds[i].reg_alt_addr = 0; output->pfp_pmds[i].reg_reserved1 = 0; output->pfp_pmd_count = i + 1; } output->pfp_pmcs[0].reg_value = pcr; output->pfp_pmcs[0].reg_addr = 0; output->pfp_pmcs[0].reg_num = 0; output->pfp_pmcs[0].reg_reserved1 = 0; output->pfp_pmc_count = 1; return PFMLIB_SUCCESS; } static int pmu_name_to_pmu_type(char *name) { if (!strcmp(name, "ultra12")) return PFMLIB_SPARC_ULTRA12_PMU; if (!strcmp(name, "ultra3")) return PFMLIB_SPARC_ULTRA3_PMU; if (!strcmp(name, "ultra3i")) return PFMLIB_SPARC_ULTRA3I_PMU; if (!strcmp(name, "ultra3+")) return PFMLIB_SPARC_ULTRA3PLUS_PMU; if (!strcmp(name, "ultra4+")) return PFMLIB_SPARC_ULTRA4PLUS_PMU; if (!strcmp(name, "niagara2")) return PFMLIB_SPARC_NIAGARA2_PMU; if (!strcmp(name, "niagara")) return PFMLIB_SPARC_NIAGARA1_PMU; return -1; } static int pfm_sparc_pmu_detect(void) { int ret, pmu_type, pme_count; char buffer[32]; ret = __pfm_getcpuinfo_attr("pmu", buffer, sizeof(buffer)); if (ret == -1) return PFMLIB_ERR_NOTSUPP; pmu_type = pmu_name_to_pmu_type(buffer); if (pmu_type == -1) return PFMLIB_ERR_NOTSUPP; switch (pmu_type) { default: return PFMLIB_ERR_NOTSUPP; case PFMLIB_SPARC_ULTRA12_PMU: pme_count = PME_ULTRA12_EVENT_COUNT; break; case PFMLIB_SPARC_ULTRA3_PMU: pme_count = PME_ULTRA3_EVENT_COUNT; break; case PFMLIB_SPARC_ULTRA3I_PMU: pme_count = PME_ULTRA3I_EVENT_COUNT; break; case PFMLIB_SPARC_ULTRA3PLUS_PMU: pme_count = PME_ULTRA3PLUS_EVENT_COUNT; break; case 
PFMLIB_SPARC_ULTRA4PLUS_PMU: pme_count = PME_ULTRA4PLUS_EVENT_COUNT; break; case PFMLIB_SPARC_NIAGARA1_PMU: pme_count = PME_NIAGARA1_EVENT_COUNT; break; case PFMLIB_SPARC_NIAGARA2_PMU: pme_count = PME_NIAGARA2_EVENT_COUNT; break; } sparc_support.pmu_type = pmu_type; sparc_support.pmu_name = strdup(buffer); sparc_support.pme_count = pme_count; return PFMLIB_SUCCESS; } static void pfm_sparc_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs) { impl_pmcs->bits[0] = 0x1; } static void pfm_sparc_get_impl_pmds(pfmlib_regmask_t *impl_pmds) { impl_pmds->bits[0] = 0x3; } static void pfm_sparc_get_impl_counters(pfmlib_regmask_t *impl_counters) { pfm_sparc_get_impl_pmds(impl_counters); } static void pfm_sparc_get_hw_counter_width(unsigned int *width) { *width = 32; } static int pfm_sparc_get_event_desc(unsigned int event, char **desc) { *desc = strdup(get_event_desc(event)); return 0; } static int pfm_sparc_get_event_mask_desc(unsigned int event, unsigned int mask, char **desc) { if (sparc_support.pmu_type != PFMLIB_SPARC_NIAGARA2_PMU) { *desc = strdup(""); } else { pme_sparc_mask_entry_t *e; e = &niagara2_pe[event]; *desc = strdup(e->pme_masks[mask].mask_desc); } return 0; } static int pfm_sparc_get_event_mask_code(unsigned int event, unsigned int mask, unsigned int *code) { if (sparc_support.pmu_type != PFMLIB_SPARC_NIAGARA2_PMU) *code = 0; else *code = mask; return 0; } static int pfm_sparc_get_cycle_event(pfmlib_event_t *e) { switch (sparc_support.pmu_type) { case PFMLIB_SPARC_ULTRA12_PMU: case PFMLIB_SPARC_ULTRA3_PMU: case PFMLIB_SPARC_ULTRA3I_PMU: case PFMLIB_SPARC_ULTRA3PLUS_PMU: case PFMLIB_SPARC_ULTRA4PLUS_PMU: e->event = 0; break; case PFMLIB_SPARC_NIAGARA1_PMU: case PFMLIB_SPARC_NIAGARA2_PMU: default: return PFMLIB_ERR_NOTSUPP; } return PFMLIB_SUCCESS; } static int pfm_sparc_get_inst_retired(pfmlib_event_t *e) { unsigned int i; switch (sparc_support.pmu_type) { case PFMLIB_SPARC_ULTRA12_PMU: case PFMLIB_SPARC_ULTRA3_PMU: case PFMLIB_SPARC_ULTRA3I_PMU: case 
PFMLIB_SPARC_ULTRA3PLUS_PMU: case PFMLIB_SPARC_ULTRA4PLUS_PMU: e->event = 1; break; case PFMLIB_SPARC_NIAGARA1_PMU: e->event = 0; break; case PFMLIB_SPARC_NIAGARA2_PMU: e->event = 1; e->num_masks = EVENT_MASK_BITS; for (i = 0; i < e->num_masks; i++) e->unit_masks[i] = i; break; default: return PFMLIB_ERR_NOTSUPP; } return PFMLIB_SUCCESS; } /** * sparc_support **/ pfm_pmu_support_t sparc_support = { /* the next 3 fields are initialized in pfm_sparc_pmu_detect */ .pmu_name = NULL, .pmu_type = PFMLIB_UNKNOWN_PMU, .pme_count = 0, .pmd_count = 2, .pmc_count = 1, .num_cnt = 2, .get_event_code = pfm_sparc_get_event_code, .get_event_name = pfm_sparc_get_event_name, .get_event_mask_name = pfm_sparc_get_event_mask_name, .get_event_counters = pfm_sparc_get_event_counters, .get_num_event_masks = pfm_sparc_get_num_event_masks, .dispatch_events = pfm_sparc_dispatch_events, .pmu_detect = pfm_sparc_pmu_detect, .get_impl_pmcs = pfm_sparc_get_impl_pmcs, .get_impl_pmds = pfm_sparc_get_impl_pmds, .get_impl_counters = pfm_sparc_get_impl_counters, .get_hw_counter_width = pfm_sparc_get_hw_counter_width, .get_event_desc = pfm_sparc_get_event_desc, .get_event_mask_desc = pfm_sparc_get_event_mask_desc, .get_event_mask_code = pfm_sparc_get_event_mask_code, .get_cycle_event = pfm_sparc_get_cycle_event, .get_inst_retired_event = pfm_sparc_get_inst_retired }; papi-papi-7-2-0-t/src/libperfnec/lib/pfmlib_sparc_priv.h000066400000000000000000000012071502707512200231600ustar00rootroot00000000000000typedef struct { char *pme_name; /* event name */ char *pme_desc; /* event description */ char pme_ctrl; /* S0 or S1 */ char __pad; int pme_val; /* S0/S1 encoding */ } pme_sparc_entry_t; typedef struct { char *mask_name; /* mask name */ char *mask_desc; /* mask description */ } pme_sparc_mask_t; #define EVENT_MASK_BITS 8 typedef struct { char *pme_name; /* event name */ char *pme_desc; /* event description */ char pme_ctrl; /* S0 or S1 */ char __pad; int pme_val; /* S0/S1 encoding */ pme_sparc_mask_t 
		      pme_masks[EVENT_MASK_BITS];
} pme_sparc_mask_entry_t;

#define PME_CTRL_S0 1
#define PME_CTRL_S1 2
papi-papi-7-2-0-t/src/libperfnec/lib/power4_events.h000066400000000000000000005055551502707512200223010ustar00rootroot00000000000000/****************************/
/* THIS IS OPEN SOURCE CODE */
/****************************/

#ifndef __POWER4_EVENTS_H__
#define __POWER4_EVENTS_H__

/*
 * File:    power4_events.h
 * CVS:
 * Author:  Corey Ashford
 *          cjashfor@us.ibm.com
 * Mods:
 *
 *
 * (C) Copyright IBM Corporation, 2007.  All Rights Reserved.
 * Contributed by Corey Ashford
 *
 * Note: This code was automatically generated and should not be modified by
 * hand.
 *
 */

#define POWER4_PME_PM_MRK_LSU_SRQ_INST_VALID 0
#define POWER4_PME_PM_FPU1_SINGLE 1
#define POWER4_PME_PM_DC_PREF_OUT_STREAMS 2
#define POWER4_PME_PM_FPU0_STALL3 3
#define POWER4_PME_PM_TB_BIT_TRANS 4
#define POWER4_PME_PM_GPR_MAP_FULL_CYC 5
#define POWER4_PME_PM_MRK_ST_CMPL 6
#define POWER4_PME_PM_MRK_LSU_FLUSH_LRQ 7
#define POWER4_PME_PM_FPU0_STF 8
#define POWER4_PME_PM_FPU1_FMA 9
#define POWER4_PME_PM_L2SA_MOD_TAG 10
#define POWER4_PME_PM_MRK_DATA_FROM_L275_SHR 11
#define POWER4_PME_PM_1INST_CLB_CYC 12
#define POWER4_PME_PM_LSU1_FLUSH_ULD 13
#define POWER4_PME_PM_MRK_INST_FIN 14
#define POWER4_PME_PM_MRK_LSU0_FLUSH_UST 15
#define POWER4_PME_PM_FPU_FDIV 16
#define POWER4_PME_PM_LSU_LRQ_S0_ALLOC 17
#define POWER4_PME_PM_FPU0_FULL_CYC 18
#define POWER4_PME_PM_FPU_SINGLE 19
#define POWER4_PME_PM_FPU0_FMA 20
#define POWER4_PME_PM_MRK_LSU1_FLUSH_ULD 21
#define POWER4_PME_PM_LSU1_FLUSH_LRQ 22
#define POWER4_PME_PM_L2SA_ST_HIT 23
#define POWER4_PME_PM_L2SB_SHR_INV 24
#define POWER4_PME_PM_DTLB_MISS 25
#define POWER4_PME_PM_MRK_ST_MISS_L1 26
#define POWER4_PME_PM_EXT_INT 27
#define POWER4_PME_PM_MRK_LSU1_FLUSH_LRQ 28
#define POWER4_PME_PM_MRK_ST_GPS 29
#define POWER4_PME_PM_GRP_DISP_SUCCESS 30
#define POWER4_PME_PM_LSU1_LDF 31
#define POWER4_PME_PM_FAB_CMD_ISSUED 32
#define POWER4_PME_PM_LSU0_SRQ_STFWD 33
#define
POWER4_PME_PM_CR_MAP_FULL_CYC 34 #define POWER4_PME_PM_MRK_LSU0_FLUSH_ULD 35 #define POWER4_PME_PM_LSU_DERAT_MISS 36 #define POWER4_PME_PM_FPU0_SINGLE 37 #define POWER4_PME_PM_FPU1_FDIV 38 #define POWER4_PME_PM_FPU1_FEST 39 #define POWER4_PME_PM_FPU0_FRSP_FCONV 40 #define POWER4_PME_PM_MRK_ST_CMPL_INT 41 #define POWER4_PME_PM_FXU_FIN 42 #define POWER4_PME_PM_FPU_STF 43 #define POWER4_PME_PM_DSLB_MISS 44 #define POWER4_PME_PM_DATA_FROM_L275_SHR 45 #define POWER4_PME_PM_FXLS1_FULL_CYC 46 #define POWER4_PME_PM_L3B0_DIR_MIS 47 #define POWER4_PME_PM_2INST_CLB_CYC 48 #define POWER4_PME_PM_MRK_STCX_FAIL 49 #define POWER4_PME_PM_LSU_LMQ_LHR_MERGE 50 #define POWER4_PME_PM_FXU0_BUSY_FXU1_IDLE 51 #define POWER4_PME_PM_L3B1_DIR_REF 52 #define POWER4_PME_PM_MRK_LSU_FLUSH_UST 53 #define POWER4_PME_PM_MRK_DATA_FROM_L25_SHR 54 #define POWER4_PME_PM_LSU_FLUSH_ULD 55 #define POWER4_PME_PM_MRK_BRU_FIN 56 #define POWER4_PME_PM_IERAT_XLATE_WR 57 #define POWER4_PME_PM_LSU0_BUSY 58 #define POWER4_PME_PM_L2SA_ST_REQ 59 #define POWER4_PME_PM_DATA_FROM_MEM 60 #define POWER4_PME_PM_FPR_MAP_FULL_CYC 61 #define POWER4_PME_PM_FPU1_FULL_CYC 62 #define POWER4_PME_PM_FPU0_FIN 63 #define POWER4_PME_PM_3INST_CLB_CYC 64 #define POWER4_PME_PM_DATA_FROM_L35 65 #define POWER4_PME_PM_L2SA_SHR_INV 66 #define POWER4_PME_PM_MRK_LSU_FLUSH_SRQ 67 #define POWER4_PME_PM_THRESH_TIMEO 68 #define POWER4_PME_PM_FPU_FSQRT 69 #define POWER4_PME_PM_MRK_LSU0_FLUSH_LRQ 70 #define POWER4_PME_PM_FXLS0_FULL_CYC 71 #define POWER4_PME_PM_DATA_TABLEWALK_CYC 72 #define POWER4_PME_PM_FPU0_ALL 73 #define POWER4_PME_PM_FPU0_FEST 74 #define POWER4_PME_PM_DATA_FROM_L25_MOD 75 #define POWER4_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 76 #define POWER4_PME_PM_FPU_FEST 77 #define POWER4_PME_PM_0INST_FETCH 78 #define POWER4_PME_PM_LARX_LSU1 79 #define POWER4_PME_PM_LD_MISS_L1_LSU0 80 #define POWER4_PME_PM_L1_PREF 81 #define POWER4_PME_PM_FPU1_STALL3 82 #define POWER4_PME_PM_BRQ_FULL_CYC 83 #define POWER4_PME_PM_LARX 84 #define 
POWER4_PME_PM_MRK_DATA_FROM_L35 85 #define POWER4_PME_PM_WORK_HELD 86 #define POWER4_PME_PM_MRK_LD_MISS_L1_LSU0 87 #define POWER4_PME_PM_FXU_IDLE 88 #define POWER4_PME_PM_INST_CMPL 89 #define POWER4_PME_PM_LSU1_FLUSH_UST 90 #define POWER4_PME_PM_LSU0_FLUSH_ULD 91 #define POWER4_PME_PM_INST_FROM_L2 92 #define POWER4_PME_PM_DATA_FROM_L3 93 #define POWER4_PME_PM_FPU0_DENORM 94 #define POWER4_PME_PM_FPU1_FMOV_FEST 95 #define POWER4_PME_PM_GRP_DISP_REJECT 96 #define POWER4_PME_PM_INST_FETCH_CYC 97 #define POWER4_PME_PM_LSU_LDF 98 #define POWER4_PME_PM_INST_DISP 99 #define POWER4_PME_PM_L2SA_MOD_INV 100 #define POWER4_PME_PM_DATA_FROM_L25_SHR 101 #define POWER4_PME_PM_FAB_CMD_RETRIED 102 #define POWER4_PME_PM_L1_DCACHE_RELOAD_VALID 103 #define POWER4_PME_PM_MRK_GRP_ISSUED 104 #define POWER4_PME_PM_FPU_FULL_CYC 105 #define POWER4_PME_PM_FPU_FMA 106 #define POWER4_PME_PM_MRK_CRU_FIN 107 #define POWER4_PME_PM_MRK_LSU1_FLUSH_UST 108 #define POWER4_PME_PM_MRK_FXU_FIN 109 #define POWER4_PME_PM_BR_ISSUED 110 #define POWER4_PME_PM_EE_OFF 111 #define POWER4_PME_PM_INST_FROM_L3 112 #define POWER4_PME_PM_ITLB_MISS 113 #define POWER4_PME_PM_FXLS_FULL_CYC 114 #define POWER4_PME_PM_FXU1_BUSY_FXU0_IDLE 115 #define POWER4_PME_PM_GRP_DISP_VALID 116 #define POWER4_PME_PM_L2SC_ST_HIT 117 #define POWER4_PME_PM_MRK_GRP_DISP 118 #define POWER4_PME_PM_L2SB_MOD_TAG 119 #define POWER4_PME_PM_INST_FROM_L25_L275 120 #define POWER4_PME_PM_LSU_FLUSH_UST 121 #define POWER4_PME_PM_L2SB_ST_HIT 122 #define POWER4_PME_PM_FXU1_FIN 123 #define POWER4_PME_PM_L3B1_DIR_MIS 124 #define POWER4_PME_PM_4INST_CLB_CYC 125 #define POWER4_PME_PM_GRP_CMPL 126 #define POWER4_PME_PM_DC_PREF_L2_CLONE_L3 127 #define POWER4_PME_PM_FPU_FRSP_FCONV 128 #define POWER4_PME_PM_5INST_CLB_CYC 129 #define POWER4_PME_PM_MRK_LSU0_FLUSH_SRQ 130 #define POWER4_PME_PM_MRK_LSU_FLUSH_ULD 131 #define POWER4_PME_PM_8INST_CLB_CYC 132 #define POWER4_PME_PM_LSU_LMQ_FULL_CYC 133 #define POWER4_PME_PM_ST_REF_L1_LSU0 134 #define 
POWER4_PME_PM_LSU0_DERAT_MISS 135 #define POWER4_PME_PM_LSU_SRQ_SYNC_CYC 136 #define POWER4_PME_PM_FPU_STALL3 137 #define POWER4_PME_PM_MRK_DATA_FROM_L2 138 #define POWER4_PME_PM_FPU0_FMOV_FEST 139 #define POWER4_PME_PM_LSU0_FLUSH_SRQ 140 #define POWER4_PME_PM_LD_REF_L1_LSU0 141 #define POWER4_PME_PM_L2SC_SHR_INV 142 #define POWER4_PME_PM_LSU1_FLUSH_SRQ 143 #define POWER4_PME_PM_LSU_LMQ_S0_ALLOC 144 #define POWER4_PME_PM_ST_REF_L1 145 #define POWER4_PME_PM_LSU_SRQ_EMPTY_CYC 146 #define POWER4_PME_PM_FPU1_STF 147 #define POWER4_PME_PM_L3B0_DIR_REF 148 #define POWER4_PME_PM_RUN_CYC 149 #define POWER4_PME_PM_LSU_LMQ_S0_VALID 150 #define POWER4_PME_PM_LSU_LRQ_S0_VALID 151 #define POWER4_PME_PM_LSU0_LDF 152 #define POWER4_PME_PM_MRK_IMR_RELOAD 153 #define POWER4_PME_PM_7INST_CLB_CYC 154 #define POWER4_PME_PM_MRK_GRP_TIMEO 155 #define POWER4_PME_PM_FPU_FMOV_FEST 156 #define POWER4_PME_PM_GRP_DISP_BLK_SB_CYC 157 #define POWER4_PME_PM_XER_MAP_FULL_CYC 158 #define POWER4_PME_PM_ST_MISS_L1 159 #define POWER4_PME_PM_STOP_COMPLETION 160 #define POWER4_PME_PM_MRK_GRP_CMPL 161 #define POWER4_PME_PM_ISLB_MISS 162 #define POWER4_PME_PM_CYC 163 #define POWER4_PME_PM_LD_MISS_L1_LSU1 164 #define POWER4_PME_PM_STCX_FAIL 165 #define POWER4_PME_PM_LSU1_SRQ_STFWD 166 #define POWER4_PME_PM_GRP_DISP 167 #define POWER4_PME_PM_DATA_FROM_L2 168 #define POWER4_PME_PM_L2_PREF 169 #define POWER4_PME_PM_FPU0_FPSCR 170 #define POWER4_PME_PM_FPU1_DENORM 171 #define POWER4_PME_PM_MRK_DATA_FROM_L25_MOD 172 #define POWER4_PME_PM_L2SB_ST_REQ 173 #define POWER4_PME_PM_L2SB_MOD_INV 174 #define POWER4_PME_PM_FPU0_FSQRT 175 #define POWER4_PME_PM_LD_REF_L1 176 #define POWER4_PME_PM_MRK_L1_RELOAD_VALID 177 #define POWER4_PME_PM_L2SB_SHR_MOD 178 #define POWER4_PME_PM_INST_FROM_L1 179 #define POWER4_PME_PM_1PLUS_PPC_CMPL 180 #define POWER4_PME_PM_EE_OFF_EXT_INT 181 #define POWER4_PME_PM_L2SC_SHR_MOD 182 #define POWER4_PME_PM_LSU_LRQ_FULL_CYC 183 #define POWER4_PME_PM_IC_PREF_INSTALL 184 #define 
POWER4_PME_PM_MRK_LSU1_FLUSH_SRQ 185 #define POWER4_PME_PM_GCT_FULL_CYC 186 #define POWER4_PME_PM_INST_FROM_MEM 187 #define POWER4_PME_PM_FXU_BUSY 188 #define POWER4_PME_PM_ST_REF_L1_LSU1 189 #define POWER4_PME_PM_MRK_LD_MISS_L1 190 #define POWER4_PME_PM_MRK_LSU1_INST_FIN 191 #define POWER4_PME_PM_L1_WRITE_CYC 192 #define POWER4_PME_PM_BIQ_IDU_FULL_CYC 193 #define POWER4_PME_PM_MRK_LSU0_INST_FIN 194 #define POWER4_PME_PM_L2SC_ST_REQ 195 #define POWER4_PME_PM_LSU1_BUSY 196 #define POWER4_PME_PM_FPU_ALL 197 #define POWER4_PME_PM_LSU_SRQ_S0_ALLOC 198 #define POWER4_PME_PM_GRP_MRK 199 #define POWER4_PME_PM_FPU1_FIN 200 #define POWER4_PME_PM_DC_PREF_STREAM_ALLOC 201 #define POWER4_PME_PM_BR_MPRED_CR 202 #define POWER4_PME_PM_BR_MPRED_TA 203 #define POWER4_PME_PM_CRQ_FULL_CYC 204 #define POWER4_PME_PM_INST_FROM_PREF 205 #define POWER4_PME_PM_LD_MISS_L1 206 #define POWER4_PME_PM_STCX_PASS 207 #define POWER4_PME_PM_DC_INV_L2 208 #define POWER4_PME_PM_LSU_SRQ_FULL_CYC 209 #define POWER4_PME_PM_LSU0_FLUSH_LRQ 210 #define POWER4_PME_PM_LSU_SRQ_S0_VALID 211 #define POWER4_PME_PM_LARX_LSU0 212 #define POWER4_PME_PM_GCT_EMPTY_CYC 213 #define POWER4_PME_PM_FPU1_ALL 214 #define POWER4_PME_PM_FPU1_FSQRT 215 #define POWER4_PME_PM_FPU_FIN 216 #define POWER4_PME_PM_L2SA_SHR_MOD 217 #define POWER4_PME_PM_MRK_LD_MISS_L1_LSU1 218 #define POWER4_PME_PM_LSU_SRQ_STFWD 219 #define POWER4_PME_PM_FXU0_FIN 220 #define POWER4_PME_PM_MRK_FPU_FIN 221 #define POWER4_PME_PM_LSU_BUSY 222 #define POWER4_PME_PM_INST_FROM_L35 223 #define POWER4_PME_PM_FPU1_FRSP_FCONV 224 #define POWER4_PME_PM_SNOOP_TLBIE 225 #define POWER4_PME_PM_FPU0_FDIV 226 #define POWER4_PME_PM_LD_REF_L1_LSU1 227 #define POWER4_PME_PM_MRK_DATA_FROM_L275_MOD 228 #define POWER4_PME_PM_HV_CYC 229 #define POWER4_PME_PM_6INST_CLB_CYC 230 #define POWER4_PME_PM_LR_CTR_MAP_FULL_CYC 231 #define POWER4_PME_PM_L2SC_MOD_INV 232 #define POWER4_PME_PM_FPU_DENORM 233 #define POWER4_PME_PM_DATA_FROM_L275_MOD 234 #define 
POWER4_PME_PM_LSU1_DERAT_MISS 235 #define POWER4_PME_PM_IC_PREF_REQ 236 #define POWER4_PME_PM_MRK_LSU_FIN 237 #define POWER4_PME_PM_MRK_DATA_FROM_L3 238 #define POWER4_PME_PM_MRK_DATA_FROM_MEM 239 #define POWER4_PME_PM_LSU0_FLUSH_UST 240 #define POWER4_PME_PM_LSU_FLUSH_LRQ 241 #define POWER4_PME_PM_LSU_FLUSH_SRQ 242 #define POWER4_PME_PM_L2SC_MOD_TAG 243 static const int power4_event_ids[][POWER4_NUM_EVENT_COUNTERS] = { [ POWER4_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { -1, -1, 68, 68, -1, -1, 68, 68 }, [ POWER4_PME_PM_FPU1_SINGLE ] = { 23, 23, -1, -1, 23, 23, -1, -1 }, [ POWER4_PME_PM_DC_PREF_OUT_STREAMS ] = { -1, -1, 14, 14, -1, -1, 14, 14 }, [ POWER4_PME_PM_FPU0_STALL3 ] = { 15, 15, -1, -1, 15, 15, -1, -1 }, [ POWER4_PME_PM_TB_BIT_TRANS ] = { -1, -1, -1, -1, -1, -1, -1, 86 }, [ POWER4_PME_PM_GPR_MAP_FULL_CYC ] = { -1, -1, 33, 33, -1, -1, 33, 33 }, [ POWER4_PME_PM_MRK_ST_CMPL ] = { 93, -1, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_MRK_LSU_FLUSH_LRQ ] = { -1, -1, 81, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_FPU0_STF ] = { 16, 16, -1, -1, 16, 16, -1, -1 }, [ POWER4_PME_PM_FPU1_FMA ] = { 20, 20, -1, -1, 20, 20, -1, -1 }, [ POWER4_PME_PM_L2SA_MOD_TAG ] = { 38, 38, -1, -1, 38, 38, -1, -1 }, [ POWER4_PME_PM_MRK_DATA_FROM_L275_SHR ] = { -1, -1, -1, -1, -1, 90, -1, -1 }, [ POWER4_PME_PM_1INST_CLB_CYC ] = { -1, -1, 0, 0, -1, -1, 0, 0 }, [ POWER4_PME_PM_LSU1_FLUSH_ULD ] = { 63, 63, -1, -1, 63, 63, -1, -1 }, [ POWER4_PME_PM_MRK_INST_FIN ] = { -1, -1, -1, -1, -1, -1, 82, -1 }, [ POWER4_PME_PM_MRK_LSU0_FLUSH_UST ] = { -1, -1, 61, 61, -1, -1, 61, 61 }, [ POWER4_PME_PM_FPU_FDIV ] = { 84, -1, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_LSU_LRQ_S0_ALLOC ] = { 68, 68, -1, -1, 68, 68, -1, -1 }, [ POWER4_PME_PM_FPU0_FULL_CYC ] = { 13, 13, -1, -1, 13, 13, -1, -1 }, [ POWER4_PME_PM_FPU_SINGLE ] = { -1, -1, -1, -1, 87, -1, -1, -1 }, [ POWER4_PME_PM_FPU0_FMA ] = { 11, 11, -1, -1, 11, 11, -1, -1 }, [ POWER4_PME_PM_MRK_LSU1_FLUSH_ULD ] = { -1, -1, 65, 65, -1, -1, 65, 65 }, [ 
POWER4_PME_PM_LSU1_FLUSH_LRQ ] = { 61, 61, -1, -1, 61, 61, -1, -1 }, [ POWER4_PME_PM_L2SA_ST_HIT ] = { -1, -1, 37, 37, -1, -1, 37, 37 }, [ POWER4_PME_PM_L2SB_SHR_INV ] = { 43, 43, -1, -1, 43, 43, -1, -1 }, [ POWER4_PME_PM_DTLB_MISS ] = { 6, 6, -1, -1, 6, 6, -1, -1 }, [ POWER4_PME_PM_MRK_ST_MISS_L1 ] = { 76, 76, -1, -1, 76, 76, -1, -1 }, [ POWER4_PME_PM_EXT_INT ] = { -1, -1, -1, -1, -1, -1, -1, 76 }, [ POWER4_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { -1, -1, 63, 63, -1, -1, 63, 63 }, [ POWER4_PME_PM_MRK_ST_GPS ] = { -1, -1, -1, -1, -1, 93, -1, -1 }, [ POWER4_PME_PM_GRP_DISP_SUCCESS ] = { -1, -1, -1, -1, 89, -1, -1, -1 }, [ POWER4_PME_PM_LSU1_LDF ] = { -1, -1, 20, 20, -1, -1, 20, 20 }, [ POWER4_PME_PM_FAB_CMD_ISSUED ] = { -1, -1, 17, 17, -1, -1, 17, 17 }, [ POWER4_PME_PM_LSU0_SRQ_STFWD ] = { 59, 59, -1, -1, 59, 59, -1, -1 }, [ POWER4_PME_PM_CR_MAP_FULL_CYC ] = { 2, 2, -1, -1, 2, 2, -1, -1 }, [ POWER4_PME_PM_MRK_LSU0_FLUSH_ULD ] = { -1, -1, 60, 60, -1, -1, 60, 60 }, [ POWER4_PME_PM_LSU_DERAT_MISS ] = { -1, -1, -1, -1, -1, 88, -1, -1 }, [ POWER4_PME_PM_FPU0_SINGLE ] = { 14, 14, -1, -1, 14, 14, -1, -1 }, [ POWER4_PME_PM_FPU1_FDIV ] = { 19, 19, -1, -1, 19, 19, -1, -1 }, [ POWER4_PME_PM_FPU1_FEST ] = { -1, -1, 26, 26, -1, -1, 26, 26 }, [ POWER4_PME_PM_FPU0_FRSP_FCONV ] = { -1, -1, 25, 25, -1, -1, 25, 25 }, [ POWER4_PME_PM_MRK_ST_CMPL_INT ] = { -1, -1, 82, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_FXU_FIN ] = { -1, -1, 77, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_FPU_STF ] = { -1, -1, -1, -1, -1, 84, -1, -1 }, [ POWER4_PME_PM_DSLB_MISS ] = { 5, 5, -1, -1, 5, 5, -1, -1 }, [ POWER4_PME_PM_DATA_FROM_L275_SHR ] = { -1, -1, -1, -1, -1, 82, -1, -1 }, [ POWER4_PME_PM_FXLS1_FULL_CYC ] = { -1, -1, 85, 86, -1, -1, 85, 87 }, [ POWER4_PME_PM_L3B0_DIR_MIS ] = { 49, 49, -1, -1, 49, 49, -1, -1 }, [ POWER4_PME_PM_2INST_CLB_CYC ] = { -1, -1, 1, 1, -1, -1, 1, 1 }, [ POWER4_PME_PM_MRK_STCX_FAIL ] = { 75, 75, -1, -1, 75, 75, -1, -1 }, [ POWER4_PME_PM_LSU_LMQ_LHR_MERGE ] = { 67, 67, -1, -1, 67, 67, -1, -1 }, [ 
POWER4_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { -1, -1, -1, -1, -1, -1, 76, -1 }, [ POWER4_PME_PM_L3B1_DIR_REF ] = { 52, 52, -1, -1, 52, 52, -1, -1 }, [ POWER4_PME_PM_MRK_LSU_FLUSH_UST ] = { -1, -1, -1, -1, -1, -1, 83, -1 }, [ POWER4_PME_PM_MRK_DATA_FROM_L25_SHR ] = { -1, -1, -1, -1, 93, -1, -1, -1 }, [ POWER4_PME_PM_LSU_FLUSH_ULD ] = { 88, -1, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_MRK_BRU_FIN ] = { -1, 89, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_IERAT_XLATE_WR ] = { 31, 31, -1, -1, 31, 31, -1, -1 }, [ POWER4_PME_PM_LSU0_BUSY ] = { -1, -1, 50, 50, -1, -1, 50, 50 }, [ POWER4_PME_PM_L2SA_ST_REQ ] = { -1, -1, 38, 38, -1, -1, 38, 38 }, [ POWER4_PME_PM_DATA_FROM_MEM ] = { -1, 82, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_FPR_MAP_FULL_CYC ] = { 7, 7, -1, -1, 7, 7, -1, -1 }, [ POWER4_PME_PM_FPU1_FULL_CYC ] = { 22, 22, -1, -1, 22, 22, -1, -1 }, [ POWER4_PME_PM_FPU0_FIN ] = { -1, -1, 22, 22, -1, -1, 22, 22 }, [ POWER4_PME_PM_3INST_CLB_CYC ] = { -1, -1, 2, 2, -1, -1, 2, 2 }, [ POWER4_PME_PM_DATA_FROM_L35 ] = { -1, -1, 74, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_L2SA_SHR_INV ] = { 39, 39, -1, -1, 39, 39, -1, -1 }, [ POWER4_PME_PM_MRK_LSU_FLUSH_SRQ ] = { -1, -1, -1, 85, -1, -1, -1, -1 }, [ POWER4_PME_PM_THRESH_TIMEO ] = { -1, 91, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_FPU_FSQRT ] = { -1, -1, -1, -1, -1, 83, -1, -1 }, [ POWER4_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { -1, -1, 58, 58, -1, -1, 58, 58 }, [ POWER4_PME_PM_FXLS0_FULL_CYC ] = { -1, -1, 30, 30, -1, -1, 30, 30 }, [ POWER4_PME_PM_DATA_TABLEWALK_CYC ] = { -1, -1, 12, 12, -1, -1, 12, 12 }, [ POWER4_PME_PM_FPU0_ALL ] = { 8, 8, -1, -1, 8, 8, -1, -1 }, [ POWER4_PME_PM_FPU0_FEST ] = { -1, -1, 21, 21, -1, -1, 21, 21 }, [ POWER4_PME_PM_DATA_FROM_L25_MOD ] = { -1, -1, -1, -1, -1, -1, -1, 75 }, [ POWER4_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { -1, 88, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_FPU_FEST ] = { -1, -1, 75, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_0INST_FETCH ] = { -1, -1, -1, -1, -1, -1, -1, 73 }, [ POWER4_PME_PM_LARX_LSU1 ] = { -1, -1, 
45, 45, -1, -1, 45, 45 }, [ POWER4_PME_PM_LD_MISS_L1_LSU0 ] = { -1, -1, 46, 46, -1, -1, 46, 46 }, [ POWER4_PME_PM_L1_PREF ] = { -1, -1, 35, 35, -1, -1, 35, 35 }, [ POWER4_PME_PM_FPU1_STALL3 ] = { 24, 24, -1, -1, 24, 24, -1, -1 }, [ POWER4_PME_PM_BRQ_FULL_CYC ] = { 1, 1, -1, -1, 1, 1, -1, -1 }, [ POWER4_PME_PM_LARX ] = { -1, -1, -1, 79, -1, -1, -1, -1 }, [ POWER4_PME_PM_MRK_DATA_FROM_L35 ] = { -1, -1, 80, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_WORK_HELD ] = { -1, 92, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { 73, 73, -1, -1, 73, 73, -1, -1 }, [ POWER4_PME_PM_FXU_IDLE ] = { -1, -1, -1, -1, 88, -1, -1, -1 }, [ POWER4_PME_PM_INST_CMPL ] = { 86, -1, -1, 77, -1, 86, 78, 81 }, [ POWER4_PME_PM_LSU1_FLUSH_UST ] = { 64, 64, -1, -1, 64, 64, -1, -1 }, [ POWER4_PME_PM_LSU0_FLUSH_ULD ] = { 57, 57, -1, -1, 57, 57, -1, -1 }, [ POWER4_PME_PM_INST_FROM_L2 ] = { -1, -1, 78, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_DATA_FROM_L3 ] = { 82, -1, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_FPU0_DENORM ] = { 9, 9, -1, -1, 9, 9, -1, -1 }, [ POWER4_PME_PM_FPU1_FMOV_FEST ] = { -1, -1, 28, 28, -1, -1, 28, 28 }, [ POWER4_PME_PM_GRP_DISP_REJECT ] = { 27, 27, -1, -1, 27, 27, -1, 80 }, [ POWER4_PME_PM_INST_FETCH_CYC ] = { 33, 33, -1, -1, 33, 33, -1, -1 }, [ POWER4_PME_PM_LSU_LDF ] = { -1, -1, -1, -1, -1, -1, -1, 78 }, [ POWER4_PME_PM_INST_DISP ] = { 32, 32, -1, -1, 32, 32, -1, -1 }, [ POWER4_PME_PM_L2SA_MOD_INV ] = { 37, 37, -1, -1, 37, 37, -1, -1 }, [ POWER4_PME_PM_DATA_FROM_L25_SHR ] = { -1, -1, -1, -1, 83, -1, -1, -1 }, [ POWER4_PME_PM_FAB_CMD_RETRIED ] = { -1, -1, 18, 18, -1, -1, 18, 18 }, [ POWER4_PME_PM_L1_DCACHE_RELOAD_VALID ] = { 36, 36, -1, -1, 36, 36, -1, -1 }, [ POWER4_PME_PM_MRK_GRP_ISSUED ] = { -1, -1, -1, -1, -1, 92, -1, -1 }, [ POWER4_PME_PM_FPU_FULL_CYC ] = { -1, -1, -1, -1, 86, -1, -1, -1 }, [ POWER4_PME_PM_FPU_FMA ] = { -1, 83, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_MRK_CRU_FIN ] = { -1, -1, -1, 82, -1, -1, -1, -1 }, [ POWER4_PME_PM_MRK_LSU1_FLUSH_UST ] = { 
-1, -1, 66, 66, -1, -1, 66, 66 }, [ POWER4_PME_PM_MRK_FXU_FIN ] = { -1, -1, -1, -1, -1, 91, -1, -1 }, [ POWER4_PME_PM_BR_ISSUED ] = { -1, -1, 8, 8, -1, -1, 8, 8 }, [ POWER4_PME_PM_EE_OFF ] = { -1, -1, 15, 15, -1, -1, 15, 15 }, [ POWER4_PME_PM_INST_FROM_L3 ] = { -1, -1, -1, -1, 91, -1, -1, -1 }, [ POWER4_PME_PM_ITLB_MISS ] = { 35, 35, -1, -1, 35, 35, -1, -1 }, [ POWER4_PME_PM_FXLS_FULL_CYC ] = { -1, -1, -1, -1, -1, -1, -1, 79 }, [ POWER4_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { -1, -1, -1, 76, -1, -1, -1, -1 }, [ POWER4_PME_PM_GRP_DISP_VALID ] = { 28, 28, -1, -1, 28, 28, -1, -1 }, [ POWER4_PME_PM_L2SC_ST_HIT ] = { -1, -1, 41, 41, -1, -1, 41, 41 }, [ POWER4_PME_PM_MRK_GRP_DISP ] = { 91, -1, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_L2SB_MOD_TAG ] = { 42, 42, -1, -1, 42, 42, -1, -1 }, [ POWER4_PME_PM_INST_FROM_L25_L275 ] = { -1, 86, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_LSU_FLUSH_UST ] = { -1, 87, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_L2SB_ST_HIT ] = { -1, -1, 39, 39, -1, -1, 39, 39 }, [ POWER4_PME_PM_FXU1_FIN ] = { -1, -1, 32, 32, -1, -1, 32, 32 }, [ POWER4_PME_PM_L3B1_DIR_MIS ] = { 51, 51, -1, -1, 51, 51, -1, -1 }, [ POWER4_PME_PM_4INST_CLB_CYC ] = { -1, -1, 3, 3, -1, -1, 3, 3 }, [ POWER4_PME_PM_GRP_CMPL ] = { -1, -1, -1, -1, -1, -1, 77, -1 }, [ POWER4_PME_PM_DC_PREF_L2_CLONE_L3 ] = { 3, 3, -1, -1, 3, 3, -1, -1 }, [ POWER4_PME_PM_FPU_FRSP_FCONV ] = { -1, -1, -1, -1, -1, -1, 75, -1 }, [ POWER4_PME_PM_5INST_CLB_CYC ] = { -1, -1, 4, 4, -1, -1, 4, 4 }, [ POWER4_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { -1, -1, 59, 59, -1, -1, 59, 59 }, [ POWER4_PME_PM_MRK_LSU_FLUSH_ULD ] = { -1, -1, -1, -1, -1, -1, -1, 85 }, [ POWER4_PME_PM_8INST_CLB_CYC ] = { -1, -1, 7, 7, -1, -1, 7, 7 }, [ POWER4_PME_PM_LSU_LMQ_FULL_CYC ] = { 66, 66, -1, -1, 66, 66, -1, -1 }, [ POWER4_PME_PM_ST_REF_L1_LSU0 ] = { -1, -1, 71, 71, -1, -1, 71, 71 }, [ POWER4_PME_PM_LSU0_DERAT_MISS ] = { 54, 54, -1, -1, 54, 54, -1, -1 }, [ POWER4_PME_PM_LSU_SRQ_SYNC_CYC ] = { -1, -1, 56, 56, -1, -1, 56, 56 }, [ 
POWER4_PME_PM_FPU_STALL3 ] = { -1, 84, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_MRK_DATA_FROM_L2 ] = { -1, -1, -1, 83, -1, -1, -1, -1 }, [ POWER4_PME_PM_FPU0_FMOV_FEST ] = { -1, -1, 23, 23, -1, -1, 23, 23 }, [ POWER4_PME_PM_LSU0_FLUSH_SRQ ] = { 56, 56, -1, -1, 56, 56, -1, -1 }, [ POWER4_PME_PM_LD_REF_L1_LSU0 ] = { -1, -1, 48, 48, -1, -1, 48, 48 }, [ POWER4_PME_PM_L2SC_SHR_INV ] = { 47, 47, -1, -1, 47, 47, -1, -1 }, [ POWER4_PME_PM_LSU1_FLUSH_SRQ ] = { 62, 62, -1, -1, 62, 62, -1, -1 }, [ POWER4_PME_PM_LSU_LMQ_S0_ALLOC ] = { -1, -1, 52, 52, -1, -1, 52, 52 }, [ POWER4_PME_PM_ST_REF_L1 ] = { -1, -1, -1, -1, -1, -1, 84, -1 }, [ POWER4_PME_PM_LSU_SRQ_EMPTY_CYC ] = { -1, -1, -1, 81, -1, -1, -1, -1 }, [ POWER4_PME_PM_FPU1_STF ] = { 25, 25, -1, -1, 25, 25, -1, -1 }, [ POWER4_PME_PM_L3B0_DIR_REF ] = { 50, 50, -1, -1, 50, 50, -1, -1 }, [ POWER4_PME_PM_RUN_CYC ] = { 94, -1, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_LSU_LMQ_S0_VALID ] = { -1, -1, 53, 53, -1, -1, 53, 53 }, [ POWER4_PME_PM_LSU_LRQ_S0_VALID ] = { 69, 69, -1, -1, 69, 69, -1, -1 }, [ POWER4_PME_PM_LSU0_LDF ] = { -1, -1, 19, 19, -1, -1, 19, 19 }, [ POWER4_PME_PM_MRK_IMR_RELOAD ] = { 72, 72, -1, -1, 72, 72, -1, -1 }, [ POWER4_PME_PM_7INST_CLB_CYC ] = { -1, -1, 6, 6, -1, -1, 6, 6 }, [ POWER4_PME_PM_MRK_GRP_TIMEO ] = { -1, -1, -1, -1, 94, -1, -1, -1 }, [ POWER4_PME_PM_FPU_FMOV_FEST ] = { -1, -1, -1, -1, -1, -1, -1, 77 }, [ POWER4_PME_PM_GRP_DISP_BLK_SB_CYC ] = { -1, -1, 34, 34, -1, -1, 34, 34 }, [ POWER4_PME_PM_XER_MAP_FULL_CYC ] = { 80, 80, -1, -1, 80, 80, -1, -1 }, [ POWER4_PME_PM_ST_MISS_L1 ] = { 79, 79, 70, 70, 79, 79, 70, 70 }, [ POWER4_PME_PM_STOP_COMPLETION ] = { -1, -1, 83, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_MRK_GRP_CMPL ] = { -1, -1, -1, 84, -1, -1, -1, -1 }, [ POWER4_PME_PM_ISLB_MISS ] = { 34, 34, -1, -1, 34, 34, -1, -1 }, [ POWER4_PME_PM_CYC ] = { 81, 81, 73, 73, 82, 81, 73, 74 }, [ POWER4_PME_PM_LD_MISS_L1_LSU1 ] = { -1, -1, 47, 47, -1, -1, 47, 47 }, [ POWER4_PME_PM_STCX_FAIL ] = { 78, 78, -1, -1, 78, 
78, -1, -1 }, [ POWER4_PME_PM_LSU1_SRQ_STFWD ] = { 65, 65, -1, -1, 65, 65, -1, -1 }, [ POWER4_PME_PM_GRP_DISP ] = { -1, 85, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_DATA_FROM_L2 ] = { -1, -1, -1, 74, -1, -1, -1, -1 }, [ POWER4_PME_PM_L2_PREF ] = { -1, -1, 43, 43, -1, -1, 43, 43 }, [ POWER4_PME_PM_FPU0_FPSCR ] = { -1, -1, 24, 24, -1, -1, 24, 24 }, [ POWER4_PME_PM_FPU1_DENORM ] = { 18, 18, -1, -1, 18, 18, -1, -1 }, [ POWER4_PME_PM_MRK_DATA_FROM_L25_MOD ] = { -1, -1, -1, -1, -1, -1, -1, 83 }, [ POWER4_PME_PM_L2SB_ST_REQ ] = { -1, -1, 40, 40, -1, -1, 40, 40 }, [ POWER4_PME_PM_L2SB_MOD_INV ] = { 41, 41, -1, -1, 41, 41, -1, -1 }, [ POWER4_PME_PM_FPU0_FSQRT ] = { 12, 12, -1, -1, 12, 12, -1, -1 }, [ POWER4_PME_PM_LD_REF_L1 ] = { -1, -1, -1, -1, -1, -1, -1, 82 }, [ POWER4_PME_PM_MRK_L1_RELOAD_VALID ] = { -1, -1, 57, 57, -1, -1, 57, 57 }, [ POWER4_PME_PM_L2SB_SHR_MOD ] = { 44, 44, -1, -1, 44, 44, -1, -1 }, [ POWER4_PME_PM_INST_FROM_L1 ] = { -1, -1, -1, -1, -1, 87, -1, -1 }, [ POWER4_PME_PM_1PLUS_PPC_CMPL ] = { -1, -1, -1, -1, 81, -1, -1, -1 }, [ POWER4_PME_PM_EE_OFF_EXT_INT ] = { -1, -1, 16, 16, -1, -1, 16, 16 }, [ POWER4_PME_PM_L2SC_SHR_MOD ] = { 48, 48, -1, -1, 48, 48, -1, -1 }, [ POWER4_PME_PM_LSU_LRQ_FULL_CYC ] = { -1, -1, 54, 54, -1, -1, 54, 54 }, [ POWER4_PME_PM_IC_PREF_INSTALL ] = { 29, 29, -1, -1, 29, 29, -1, -1 }, [ POWER4_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { -1, -1, 64, 64, -1, -1, 64, 64 }, [ POWER4_PME_PM_GCT_FULL_CYC ] = { 26, 26, -1, -1, 26, 26, -1, -1 }, [ POWER4_PME_PM_INST_FROM_MEM ] = { 87, -1, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_FXU_BUSY ] = { -1, -1, -1, -1, -1, 85, -1, -1 }, [ POWER4_PME_PM_ST_REF_L1_LSU1 ] = { -1, -1, 72, 72, -1, -1, 72, 72 }, [ POWER4_PME_PM_MRK_LD_MISS_L1 ] = { 92, -1, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_MRK_LSU1_INST_FIN ] = { -1, -1, 67, 67, -1, -1, 67, 67 }, [ POWER4_PME_PM_L1_WRITE_CYC ] = { -1, -1, 36, 36, -1, -1, 36, 36 }, [ POWER4_PME_PM_BIQ_IDU_FULL_CYC ] = { 0, 0, -1, -1, 0, 0, -1, -1 }, [ 
POWER4_PME_PM_MRK_LSU0_INST_FIN ] = { -1, -1, 62, 62, -1, -1, 62, 62 }, [ POWER4_PME_PM_L2SC_ST_REQ ] = { -1, -1, 42, 42, -1, -1, 42, 42 }, [ POWER4_PME_PM_LSU1_BUSY ] = { -1, -1, 51, 51, -1, -1, 51, 51 }, [ POWER4_PME_PM_FPU_ALL ] = { -1, -1, -1, -1, 84, -1, -1, -1 }, [ POWER4_PME_PM_LSU_SRQ_S0_ALLOC ] = { 70, 70, -1, -1, 70, 70, -1, -1 }, [ POWER4_PME_PM_GRP_MRK ] = { -1, -1, -1, -1, 90, -1, -1, -1 }, [ POWER4_PME_PM_FPU1_FIN ] = { -1, -1, 27, 27, -1, -1, 27, 27 }, [ POWER4_PME_PM_DC_PREF_STREAM_ALLOC ] = { 4, 4, -1, -1, 4, 4, -1, -1 }, [ POWER4_PME_PM_BR_MPRED_CR ] = { -1, -1, 9, 9, -1, -1, 9, 9 }, [ POWER4_PME_PM_BR_MPRED_TA ] = { -1, -1, 10, 10, -1, -1, 10, 10 }, [ POWER4_PME_PM_CRQ_FULL_CYC ] = { -1, -1, 11, 11, -1, -1, 11, 11 }, [ POWER4_PME_PM_INST_FROM_PREF ] = { -1, -1, -1, -1, -1, -1, 79, -1 }, [ POWER4_PME_PM_LD_MISS_L1 ] = { -1, -1, 79, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_STCX_PASS ] = { -1, -1, 69, 69, -1, -1, 69, 69 }, [ POWER4_PME_PM_DC_INV_L2 ] = { -1, -1, 13, 13, -1, -1, 13, 13 }, [ POWER4_PME_PM_LSU_SRQ_FULL_CYC ] = { -1, -1, 55, 55, -1, -1, 55, 55 }, [ POWER4_PME_PM_LSU0_FLUSH_LRQ ] = { 55, 55, -1, -1, 55, 55, -1, -1 }, [ POWER4_PME_PM_LSU_SRQ_S0_VALID ] = { 71, 71, -1, -1, 71, 71, -1, -1 }, [ POWER4_PME_PM_LARX_LSU0 ] = { -1, -1, 44, 44, -1, -1, 44, 44 }, [ POWER4_PME_PM_GCT_EMPTY_CYC ] = { 85, -1, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_FPU1_ALL ] = { 17, 17, -1, -1, 17, 17, -1, -1 }, [ POWER4_PME_PM_FPU1_FSQRT ] = { 21, 21, -1, -1, 21, 21, -1, -1 }, [ POWER4_PME_PM_FPU_FIN ] = { -1, -1, -1, 75, -1, -1, -1, -1 }, [ POWER4_PME_PM_L2SA_SHR_MOD ] = { 40, 40, -1, -1, 40, 40, -1, -1 }, [ POWER4_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { 74, 74, -1, -1, 74, 74, -1, -1 }, [ POWER4_PME_PM_LSU_SRQ_STFWD ] = { 89, -1, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_FXU0_FIN ] = { -1, -1, 31, 31, -1, -1, 31, 31 }, [ POWER4_PME_PM_MRK_FPU_FIN ] = { -1, -1, -1, -1, -1, -1, 81, -1 }, [ POWER4_PME_PM_LSU_BUSY ] = { -1, -1, -1, 80, -1, -1, -1, -1 }, [ 
POWER4_PME_PM_INST_FROM_L35 ] = { -1, -1, -1, 78, -1, -1, -1, -1 }, [ POWER4_PME_PM_FPU1_FRSP_FCONV ] = { -1, -1, 29, 29, -1, -1, 29, 29 }, [ POWER4_PME_PM_SNOOP_TLBIE ] = { 77, 77, -1, -1, 77, 77, -1, -1 }, [ POWER4_PME_PM_FPU0_FDIV ] = { 10, 10, -1, -1, 10, 10, -1, -1 }, [ POWER4_PME_PM_LD_REF_L1_LSU1 ] = { -1, -1, 49, 49, -1, -1, 49, 49 }, [ POWER4_PME_PM_MRK_DATA_FROM_L275_MOD ] = { -1, -1, -1, -1, -1, -1, 80, -1 }, [ POWER4_PME_PM_HV_CYC ] = { -1, -1, 84, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_6INST_CLB_CYC ] = { -1, -1, 5, 5, -1, -1, 5, 5 }, [ POWER4_PME_PM_LR_CTR_MAP_FULL_CYC ] = { 53, 53, -1, -1, 53, 53, -1, -1 }, [ POWER4_PME_PM_L2SC_MOD_INV ] = { 45, 45, -1, -1, 45, 45, -1, -1 }, [ POWER4_PME_PM_FPU_DENORM ] = { 83, -1, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_DATA_FROM_L275_MOD ] = { -1, -1, -1, -1, -1, -1, 74, -1 }, [ POWER4_PME_PM_LSU1_DERAT_MISS ] = { 60, 60, -1, -1, 60, 60, -1, -1 }, [ POWER4_PME_PM_IC_PREF_REQ ] = { 30, 30, -1, -1, 30, 30, -1, -1 }, [ POWER4_PME_PM_MRK_LSU_FIN ] = { -1, -1, -1, -1, -1, -1, -1, 84 }, [ POWER4_PME_PM_MRK_DATA_FROM_L3 ] = { 90, -1, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_MRK_DATA_FROM_MEM ] = { -1, 90, -1, -1, -1, -1, -1, -1 }, [ POWER4_PME_PM_LSU0_FLUSH_UST ] = { 58, 58, -1, -1, 58, 58, -1, -1 }, [ POWER4_PME_PM_LSU_FLUSH_LRQ ] = { -1, -1, -1, -1, -1, 89, -1, -1 }, [ POWER4_PME_PM_LSU_FLUSH_SRQ ] = { -1, -1, -1, -1, 92, -1, -1, -1 }, [ POWER4_PME_PM_L2SC_MOD_TAG ] = { 46, 46, -1, -1, 46, 46, -1, -1 } }; static const unsigned long long power4_group_vecs[][POWER4_NUM_GROUP_VEC] = { [ POWER4_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { 0x0000100000000000ULL }, [ POWER4_PME_PM_FPU1_SINGLE ] = { 0x0000000080000000ULL }, [ POWER4_PME_PM_DC_PREF_OUT_STREAMS ] = { 0x0000010000000000ULL }, [ POWER4_PME_PM_FPU0_STALL3 ] = { 0x0000000100000000ULL }, [ POWER4_PME_PM_TB_BIT_TRANS ] = { 0x0000020000000000ULL }, [ POWER4_PME_PM_GPR_MAP_FULL_CYC ] = { 0x0000000000000010ULL }, [ POWER4_PME_PM_MRK_ST_CMPL ] = { 0x0000100000000000ULL }, [ 
POWER4_PME_PM_MRK_LSU_FLUSH_LRQ ] = { 0x0000200000000000ULL }, [ POWER4_PME_PM_FPU0_STF ] = { 0x0000000080000000ULL }, [ POWER4_PME_PM_FPU1_FMA ] = { 0x0000000010000000ULL }, [ POWER4_PME_PM_L2SA_MOD_TAG ] = { 0x0000000000000800ULL }, [ POWER4_PME_PM_MRK_DATA_FROM_L275_SHR ] = { 0x0000c00000000000ULL }, [ POWER4_PME_PM_1INST_CLB_CYC ] = { 0x0000000000010000ULL }, [ POWER4_PME_PM_LSU1_FLUSH_ULD ] = { 0x0000001000000000ULL }, [ POWER4_PME_PM_MRK_INST_FIN ] = { 0x0008040000000000ULL }, [ POWER4_PME_PM_MRK_LSU0_FLUSH_UST ] = { 0x0002000000000000ULL }, [ POWER4_PME_PM_FPU_FDIV ] = { 0x1020000000004000ULL }, [ POWER4_PME_PM_LSU_LRQ_S0_ALLOC ] = { 0x0000000000800000ULL }, [ POWER4_PME_PM_FPU0_FULL_CYC ] = { 0x0000000000080000ULL }, [ POWER4_PME_PM_FPU_SINGLE ] = { 0x0000000000000000ULL }, [ POWER4_PME_PM_FPU0_FMA ] = { 0x0000000010000000ULL }, [ POWER4_PME_PM_MRK_LSU1_FLUSH_ULD ] = { 0x0002000000000000ULL }, [ POWER4_PME_PM_LSU1_FLUSH_LRQ ] = { 0x0000000800000000ULL }, [ POWER4_PME_PM_L2SA_ST_HIT ] = { 0x0000000000000800ULL }, [ POWER4_PME_PM_L2SB_SHR_INV ] = { 0x0000000000001000ULL }, [ POWER4_PME_PM_DTLB_MISS ] = { 0x0900000000000100ULL }, [ POWER4_PME_PM_MRK_ST_MISS_L1 ] = { 0x0002000000000000ULL }, [ POWER4_PME_PM_EXT_INT ] = { 0x0000000000200000ULL }, [ POWER4_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { 0x0004000000000000ULL }, [ POWER4_PME_PM_MRK_ST_GPS ] = { 0x0000100000000000ULL }, [ POWER4_PME_PM_GRP_DISP_SUCCESS ] = { 0x0000000000020000ULL }, [ POWER4_PME_PM_LSU1_LDF ] = { 0x0000000080000000ULL }, [ POWER4_PME_PM_FAB_CMD_ISSUED ] = { 0x0000000000000400ULL }, [ POWER4_PME_PM_LSU0_SRQ_STFWD ] = { 0x0000004000000000ULL }, [ POWER4_PME_PM_CR_MAP_FULL_CYC ] = { 0x0000000000040000ULL }, [ POWER4_PME_PM_MRK_LSU0_FLUSH_ULD ] = { 0x0002000000000000ULL }, [ POWER4_PME_PM_LSU_DERAT_MISS ] = { 0x0000000000000300ULL }, [ POWER4_PME_PM_FPU0_SINGLE ] = { 0x0000000080000000ULL }, [ POWER4_PME_PM_FPU1_FDIV ] = { 0x0000000010000000ULL }, [ POWER4_PME_PM_FPU1_FEST ] = { 0x0000000040000000ULL 
}, [ POWER4_PME_PM_FPU0_FRSP_FCONV ] = { 0x0000000010000000ULL }, [ POWER4_PME_PM_MRK_ST_CMPL_INT ] = { 0x0000100000000000ULL }, [ POWER4_PME_PM_FXU_FIN ] = { 0x0020000000000000ULL }, [ POWER4_PME_PM_FPU_STF ] = { 0x1040000000008000ULL }, [ POWER4_PME_PM_DSLB_MISS ] = { 0x0000000000000200ULL }, [ POWER4_PME_PM_DATA_FROM_L275_SHR ] = { 0x0200000001000020ULL }, [ POWER4_PME_PM_FXLS1_FULL_CYC ] = { 0x0000000000000000ULL }, [ POWER4_PME_PM_L3B0_DIR_MIS ] = { 0x0000000000000400ULL }, [ POWER4_PME_PM_2INST_CLB_CYC ] = { 0x0000000000010000ULL }, [ POWER4_PME_PM_MRK_STCX_FAIL ] = { 0x0008000000000000ULL }, [ POWER4_PME_PM_LSU_LMQ_LHR_MERGE ] = { 0x0010000400000000ULL }, [ POWER4_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { 0x0000000200000000ULL }, [ POWER4_PME_PM_L3B1_DIR_REF ] = { 0x0000000000000400ULL }, [ POWER4_PME_PM_MRK_LSU_FLUSH_UST ] = { 0x0000000000000000ULL }, [ POWER4_PME_PM_MRK_DATA_FROM_L25_SHR ] = { 0x0000c00000000000ULL }, [ POWER4_PME_PM_LSU_FLUSH_ULD ] = { 0x0000000000000080ULL }, [ POWER4_PME_PM_MRK_BRU_FIN ] = { 0x0000080000000000ULL }, [ POWER4_PME_PM_IERAT_XLATE_WR ] = { 0x0000000000000300ULL }, [ POWER4_PME_PM_LSU0_BUSY ] = { 0x0000000000800000ULL }, [ POWER4_PME_PM_L2SA_ST_REQ ] = { 0x0000000000000800ULL }, [ POWER4_PME_PM_DATA_FROM_MEM ] = { 0x0400000002000020ULL }, [ POWER4_PME_PM_FPR_MAP_FULL_CYC ] = { 0x0000000000000010ULL }, [ POWER4_PME_PM_FPU1_FULL_CYC ] = { 0x0000000000080000ULL }, [ POWER4_PME_PM_FPU0_FIN ] = { 0x1040000120000000ULL }, [ POWER4_PME_PM_3INST_CLB_CYC ] = { 0x0000000000010000ULL }, [ POWER4_PME_PM_DATA_FROM_L35 ] = { 0x0600000002000020ULL }, [ POWER4_PME_PM_L2SA_SHR_INV ] = { 0x0000000000000800ULL }, [ POWER4_PME_PM_MRK_LSU_FLUSH_SRQ ] = { 0x0000200000000000ULL }, [ POWER4_PME_PM_THRESH_TIMEO ] = { 0x0010040000000000ULL }, [ POWER4_PME_PM_FPU_FSQRT ] = { 0x0020000000004000ULL }, [ POWER4_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { 0x0004000000000000ULL }, [ POWER4_PME_PM_FXLS0_FULL_CYC ] = { 0x0000000000080000ULL }, [ POWER4_PME_PM_DATA_TABLEWALK_CYC 
] = { 0x0000000400000100ULL }, [ POWER4_PME_PM_FPU0_ALL ] = { 0x0000000020000000ULL }, [ POWER4_PME_PM_FPU0_FEST ] = { 0x0000000040000000ULL }, [ POWER4_PME_PM_DATA_FROM_L25_MOD ] = { 0x0600000001000020ULL }, [ POWER4_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { 0x0800020000000000ULL }, [ POWER4_PME_PM_FPU_FEST ] = { 0x0000000000004000ULL }, [ POWER4_PME_PM_0INST_FETCH ] = { 0x0000000004000040ULL }, [ POWER4_PME_PM_LARX_LSU1 ] = { 0x0000000000400000ULL }, [ POWER4_PME_PM_LD_MISS_L1_LSU0 ] = { 0x0000001000000000ULL }, [ POWER4_PME_PM_L1_PREF ] = { 0x0000010000000000ULL }, [ POWER4_PME_PM_FPU1_STALL3 ] = { 0x0000000100000000ULL }, [ POWER4_PME_PM_BRQ_FULL_CYC ] = { 0x0080000000000010ULL }, [ POWER4_PME_PM_LARX ] = { 0x0000000000000000ULL }, [ POWER4_PME_PM_MRK_DATA_FROM_L35 ] = { 0x0001400000000000ULL }, [ POWER4_PME_PM_WORK_HELD ] = { 0x0000000000200000ULL }, [ POWER4_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { 0x0004000000000000ULL }, [ POWER4_PME_PM_FXU_IDLE ] = { 0x0000000200000000ULL }, [ POWER4_PME_PM_INST_CMPL ] = { 0x7fffb7ffffffff9fULL }, [ POWER4_PME_PM_LSU1_FLUSH_UST ] = { 0x0000002000000000ULL }, [ POWER4_PME_PM_LSU0_FLUSH_ULD ] = { 0x0000001000000000ULL }, [ POWER4_PME_PM_INST_FROM_L2 ] = { 0x000000000c000040ULL }, [ POWER4_PME_PM_DATA_FROM_L3 ] = { 0x0400000002000020ULL }, [ POWER4_PME_PM_FPU0_DENORM ] = { 0x0000000040000000ULL }, [ POWER4_PME_PM_FPU1_FMOV_FEST ] = { 0x0000000040000000ULL }, [ POWER4_PME_PM_GRP_DISP_REJECT ] = { 0x0000000000100001ULL }, [ POWER4_PME_PM_INST_FETCH_CYC ] = { 0x0000000000000008ULL }, [ POWER4_PME_PM_LSU_LDF ] = { 0x1040000000008000ULL }, [ POWER4_PME_PM_INST_DISP ] = { 0x0000000000140006ULL }, [ POWER4_PME_PM_L2SA_MOD_INV ] = { 0x0000000000000800ULL }, [ POWER4_PME_PM_DATA_FROM_L25_SHR ] = { 0x0600000001000020ULL }, [ POWER4_PME_PM_FAB_CMD_RETRIED ] = { 0x0000000000000400ULL }, [ POWER4_PME_PM_L1_DCACHE_RELOAD_VALID ] = { 0x0000008003000000ULL }, [ POWER4_PME_PM_MRK_GRP_ISSUED ] = { 0x0018240000000000ULL }, [ POWER4_PME_PM_FPU_FULL_CYC ] = { 
0x0040000000000010ULL }, [ POWER4_PME_PM_FPU_FMA ] = { 0x1020000000004000ULL }, [ POWER4_PME_PM_MRK_CRU_FIN ] = { 0x0000080000000000ULL }, [ POWER4_PME_PM_MRK_LSU1_FLUSH_UST ] = { 0x0002000000000000ULL }, [ POWER4_PME_PM_MRK_FXU_FIN ] = { 0x0000080000000000ULL }, [ POWER4_PME_PM_BR_ISSUED ] = { 0x6080000000000008ULL }, [ POWER4_PME_PM_EE_OFF ] = { 0x0000000000200000ULL }, [ POWER4_PME_PM_INST_FROM_L3 ] = { 0x000000000c000040ULL }, [ POWER4_PME_PM_ITLB_MISS ] = { 0x0100000000000100ULL }, [ POWER4_PME_PM_FXLS_FULL_CYC ] = { 0x0000000200000010ULL }, [ POWER4_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { 0x0000000200000000ULL }, [ POWER4_PME_PM_GRP_DISP_VALID ] = { 0x0000000000100000ULL }, [ POWER4_PME_PM_L2SC_ST_HIT ] = { 0x0000000000002000ULL }, [ POWER4_PME_PM_MRK_GRP_DISP ] = { 0x0000080000000000ULL }, [ POWER4_PME_PM_L2SB_MOD_TAG ] = { 0x0000000000001000ULL }, [ POWER4_PME_PM_INST_FROM_L25_L275 ] = { 0x0000000008000040ULL }, [ POWER4_PME_PM_LSU_FLUSH_UST ] = { 0x0000000000000080ULL }, [ POWER4_PME_PM_L2SB_ST_HIT ] = { 0x0000000000001000ULL }, [ POWER4_PME_PM_FXU1_FIN ] = { 0x0000000000100000ULL }, [ POWER4_PME_PM_L3B1_DIR_MIS ] = { 0x0000000000000400ULL }, [ POWER4_PME_PM_4INST_CLB_CYC ] = { 0x0000000000010000ULL }, [ POWER4_PME_PM_GRP_CMPL ] = { 0x0010020000000001ULL }, [ POWER4_PME_PM_DC_PREF_L2_CLONE_L3 ] = { 0x0000010000000000ULL }, [ POWER4_PME_PM_FPU_FRSP_FCONV ] = { 0x0000000000008000ULL }, [ POWER4_PME_PM_5INST_CLB_CYC ] = { 0x0000000000020000ULL }, [ POWER4_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { 0x0004000000000000ULL }, [ POWER4_PME_PM_MRK_LSU_FLUSH_ULD ] = { 0x0000200000000000ULL }, [ POWER4_PME_PM_8INST_CLB_CYC ] = { 0x0000000000020000ULL }, [ POWER4_PME_PM_LSU_LMQ_FULL_CYC ] = { 0x0000000400000000ULL }, [ POWER4_PME_PM_ST_REF_L1_LSU0 ] = { 0x0000006000000000ULL }, [ POWER4_PME_PM_LSU0_DERAT_MISS ] = { 0x0000008000000000ULL }, [ POWER4_PME_PM_LSU_SRQ_SYNC_CYC ] = { 0x0000000400000200ULL }, [ POWER4_PME_PM_FPU_STALL3 ] = { 0x0040000000008000ULL }, [ 
POWER4_PME_PM_MRK_DATA_FROM_L2 ] = { 0x0001c00000000000ULL }, [ POWER4_PME_PM_FPU0_FMOV_FEST ] = { 0x0000000040000000ULL }, [ POWER4_PME_PM_LSU0_FLUSH_SRQ ] = { 0x0000000800000000ULL }, [ POWER4_PME_PM_LD_REF_L1_LSU0 ] = { 0x0000001000000000ULL }, [ POWER4_PME_PM_L2SC_SHR_INV ] = { 0x0000000000002000ULL }, [ POWER4_PME_PM_LSU1_FLUSH_SRQ ] = { 0x0000000800000000ULL }, [ POWER4_PME_PM_LSU_LMQ_S0_ALLOC ] = { 0x0010000400000200ULL }, [ POWER4_PME_PM_ST_REF_L1 ] = { 0x4900000000000086ULL }, [ POWER4_PME_PM_LSU_SRQ_EMPTY_CYC ] = { 0x0000000000000000ULL }, [ POWER4_PME_PM_FPU1_STF ] = { 0x0000000080000000ULL }, [ POWER4_PME_PM_L3B0_DIR_REF ] = { 0x0000000000000400ULL }, [ POWER4_PME_PM_RUN_CYC ] = { 0x0000000000000001ULL }, [ POWER4_PME_PM_LSU_LMQ_S0_VALID ] = { 0x0010000400000100ULL }, [ POWER4_PME_PM_LSU_LRQ_S0_VALID ] = { 0x0000000000800000ULL }, [ POWER4_PME_PM_LSU0_LDF ] = { 0x0000000080000000ULL }, [ POWER4_PME_PM_MRK_IMR_RELOAD ] = { 0x0002000000000000ULL }, [ POWER4_PME_PM_7INST_CLB_CYC ] = { 0x0000000000020000ULL }, [ POWER4_PME_PM_MRK_GRP_TIMEO ] = { 0x0000300000000000ULL }, [ POWER4_PME_PM_FPU_FMOV_FEST ] = { 0x0020000000004000ULL }, [ POWER4_PME_PM_GRP_DISP_BLK_SB_CYC ] = { 0x0000000000040000ULL }, [ POWER4_PME_PM_XER_MAP_FULL_CYC ] = { 0x0000000000040000ULL }, [ POWER4_PME_PM_ST_MISS_L1 ] = { 0x6900006000000000ULL }, [ POWER4_PME_PM_STOP_COMPLETION ] = { 0x0000000000200001ULL }, [ POWER4_PME_PM_MRK_GRP_CMPL ] = { 0x0000140000000000ULL }, [ POWER4_PME_PM_ISLB_MISS ] = { 0x0000000000000200ULL }, [ POWER4_PME_PM_CYC ] = { 0x7fffbfffffffff9fULL }, [ POWER4_PME_PM_LD_MISS_L1_LSU1 ] = { 0x0000001000000000ULL }, [ POWER4_PME_PM_STCX_FAIL ] = { 0x0000000000400000ULL }, [ POWER4_PME_PM_LSU1_SRQ_STFWD ] = { 0x0000004000000000ULL }, [ POWER4_PME_PM_GRP_DISP ] = { 0x0000000000000000ULL }, [ POWER4_PME_PM_DATA_FROM_L2 ] = { 0x0600000003000020ULL }, [ POWER4_PME_PM_L2_PREF ] = { 0x0000010000000000ULL }, [ POWER4_PME_PM_FPU0_FPSCR ] = { 0x0000000100000000ULL }, [ 
POWER4_PME_PM_FPU1_DENORM ] = { 0x0000000040000000ULL }, [ POWER4_PME_PM_MRK_DATA_FROM_L25_MOD ] = { 0x0000c00000000000ULL }, [ POWER4_PME_PM_L2SB_ST_REQ ] = { 0x0000000000001000ULL }, [ POWER4_PME_PM_L2SB_MOD_INV ] = { 0x0000000000001000ULL }, [ POWER4_PME_PM_FPU0_FSQRT ] = { 0x0000000020000000ULL }, [ POWER4_PME_PM_LD_REF_L1 ] = { 0x4900000000000086ULL }, [ POWER4_PME_PM_MRK_L1_RELOAD_VALID ] = { 0x0001800000000000ULL }, [ POWER4_PME_PM_L2SB_SHR_MOD ] = { 0x0000000000001000ULL }, [ POWER4_PME_PM_INST_FROM_L1 ] = { 0x000000000c000040ULL }, [ POWER4_PME_PM_1PLUS_PPC_CMPL ] = { 0x0000020000410001ULL }, [ POWER4_PME_PM_EE_OFF_EXT_INT ] = { 0x0000000000200000ULL }, [ POWER4_PME_PM_L2SC_SHR_MOD ] = { 0x0000000000002000ULL }, [ POWER4_PME_PM_LSU_LRQ_FULL_CYC ] = { 0x0000000000080000ULL }, [ POWER4_PME_PM_IC_PREF_INSTALL ] = { 0x0000000000000000ULL }, [ POWER4_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { 0x0004000000000000ULL }, [ POWER4_PME_PM_GCT_FULL_CYC ] = { 0x0000000000000010ULL }, [ POWER4_PME_PM_INST_FROM_MEM ] = { 0x0000000008000040ULL }, [ POWER4_PME_PM_FXU_BUSY ] = { 0x0000000200000000ULL }, [ POWER4_PME_PM_ST_REF_L1_LSU1 ] = { 0x0000006000000000ULL }, [ POWER4_PME_PM_MRK_LD_MISS_L1 ] = { 0x0000240000000000ULL }, [ POWER4_PME_PM_MRK_LSU1_INST_FIN ] = { 0x0008000000000000ULL }, [ POWER4_PME_PM_L1_WRITE_CYC ] = { 0x0080000000000008ULL }, [ POWER4_PME_PM_BIQ_IDU_FULL_CYC ] = { 0x0080000000000008ULL }, [ POWER4_PME_PM_MRK_LSU0_INST_FIN ] = { 0x0008000000000000ULL }, [ POWER4_PME_PM_L2SC_ST_REQ ] = { 0x0000000000002000ULL }, [ POWER4_PME_PM_LSU1_BUSY ] = { 0x0000000000800000ULL }, [ POWER4_PME_PM_FPU_ALL ] = { 0x0000000000008000ULL }, [ POWER4_PME_PM_LSU_SRQ_S0_ALLOC ] = { 0x0000000000800000ULL }, [ POWER4_PME_PM_GRP_MRK ] = { 0x00000c0000000000ULL }, [ POWER4_PME_PM_FPU1_FIN ] = { 0x1040000120000000ULL }, [ POWER4_PME_PM_DC_PREF_STREAM_ALLOC ] = { 0x0000010000000000ULL }, [ POWER4_PME_PM_BR_MPRED_CR ] = { 0x2080000000000008ULL }, [ POWER4_PME_PM_BR_MPRED_TA ] = { 
0x2080000000000008ULL }, [ POWER4_PME_PM_CRQ_FULL_CYC ] = { 0x0000000000040000ULL }, [ POWER4_PME_PM_INST_FROM_PREF ] = { 0x0000000004000040ULL }, [ POWER4_PME_PM_LD_MISS_L1 ] = { 0x6900000000000006ULL }, [ POWER4_PME_PM_STCX_PASS ] = { 0x0000000000400000ULL }, [ POWER4_PME_PM_DC_INV_L2 ] = { 0x0000002000000006ULL }, [ POWER4_PME_PM_LSU_SRQ_FULL_CYC ] = { 0x0000000000080000ULL }, [ POWER4_PME_PM_LSU0_FLUSH_LRQ ] = { 0x0000000800000000ULL }, [ POWER4_PME_PM_LSU_SRQ_S0_VALID ] = { 0x0000000000800000ULL }, [ POWER4_PME_PM_LARX_LSU0 ] = { 0x0000000000400000ULL }, [ POWER4_PME_PM_GCT_EMPTY_CYC ] = { 0x0000020000200000ULL }, [ POWER4_PME_PM_FPU1_ALL ] = { 0x0000000020000000ULL }, [ POWER4_PME_PM_FPU1_FSQRT ] = { 0x0000000020000000ULL }, [ POWER4_PME_PM_FPU_FIN ] = { 0x0020000000004000ULL }, [ POWER4_PME_PM_L2SA_SHR_MOD ] = { 0x0000000000000800ULL }, [ POWER4_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { 0x0004000000000000ULL }, [ POWER4_PME_PM_LSU_SRQ_STFWD ] = { 0x0000000000000000ULL }, [ POWER4_PME_PM_FXU0_FIN ] = { 0x0000000000100000ULL }, [ POWER4_PME_PM_MRK_FPU_FIN ] = { 0x0000080000000000ULL }, [ POWER4_PME_PM_LSU_BUSY ] = { 0x0000000000000000ULL }, [ POWER4_PME_PM_INST_FROM_L35 ] = { 0x000000000c000040ULL }, [ POWER4_PME_PM_FPU1_FRSP_FCONV ] = { 0x0000000010000000ULL }, [ POWER4_PME_PM_SNOOP_TLBIE ] = { 0x0000000000400000ULL }, [ POWER4_PME_PM_FPU0_FDIV ] = { 0x0000000010000000ULL }, [ POWER4_PME_PM_LD_REF_L1_LSU1 ] = { 0x0000001000000000ULL }, [ POWER4_PME_PM_MRK_DATA_FROM_L275_MOD ] = { 0x0001c00000000000ULL }, [ POWER4_PME_PM_HV_CYC ] = { 0x0000020000000000ULL }, [ POWER4_PME_PM_6INST_CLB_CYC ] = { 0x0000000000020000ULL }, [ POWER4_PME_PM_LR_CTR_MAP_FULL_CYC ] = { 0x0000000000040000ULL }, [ POWER4_PME_PM_L2SC_MOD_INV ] = { 0x0000000000002000ULL }, [ POWER4_PME_PM_FPU_DENORM ] = { 0x0000000000008000ULL }, [ POWER4_PME_PM_DATA_FROM_L275_MOD ] = { 0x0200000003000020ULL }, [ POWER4_PME_PM_LSU1_DERAT_MISS ] = { 0x0000008000000000ULL }, [ POWER4_PME_PM_IC_PREF_REQ ] = { 
0x0000000000000000ULL }, [ POWER4_PME_PM_MRK_LSU_FIN ] = { 0x0000080000000000ULL }, [ POWER4_PME_PM_MRK_DATA_FROM_L3 ] = { 0x0001400000000000ULL }, [ POWER4_PME_PM_MRK_DATA_FROM_MEM ] = { 0x0001400000000000ULL }, [ POWER4_PME_PM_LSU0_FLUSH_UST ] = { 0x0000002000000000ULL }, [ POWER4_PME_PM_LSU_FLUSH_LRQ ] = { 0x0000000000000080ULL }, [ POWER4_PME_PM_LSU_FLUSH_SRQ ] = { 0x0000000000000080ULL }, [ POWER4_PME_PM_L2SC_MOD_TAG ] = { 0x0000000000002000ULL } }; static const pme_power_entry_t power4_pe[] = { [ POWER4_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { .pme_name = "PM_MRK_LSU_SRQ_INST_VALID", .pme_code = 0x933, .pme_short_desc = "Marked instruction valid in SRQ", .pme_long_desc = "This signal is asserted every cycle when a marked request is resident in the Store Request Queue", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LSU_SRQ_INST_VALID], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LSU_SRQ_INST_VALID] }, [ POWER4_PME_PM_FPU1_SINGLE ] = { .pme_name = "PM_FPU1_SINGLE", .pme_code = 0x127, .pme_short_desc = "FPU1 executed single precision instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing single precision instruction.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU1_SINGLE], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU1_SINGLE] }, [ POWER4_PME_PM_DC_PREF_OUT_STREAMS ] = { .pme_name = "PM_DC_PREF_OUT_STREAMS", .pme_code = 0xc36, .pme_short_desc = "Out of prefetch streams", .pme_long_desc = "A new prefetch stream was detected, but no more stream entries were available", .pme_event_ids = power4_event_ids[POWER4_PME_PM_DC_PREF_OUT_STREAMS], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_DC_PREF_OUT_STREAMS] }, [ POWER4_PME_PM_FPU0_STALL3 ] = { .pme_name = "PM_FPU0_STALL3", .pme_code = 0x121, .pme_short_desc = "FPU0 stalled in pipe3", .pme_long_desc = "This signal indicates that fp0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert 
from integer (always). This signal is active during the entire duration of the stall. ", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU0_STALL3], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU0_STALL3] }, [ POWER4_PME_PM_TB_BIT_TRANS ] = { .pme_name = "PM_TB_BIT_TRANS", .pme_code = 0x8005, .pme_short_desc = "Time Base bit transition", .pme_long_desc = "When the selected time base bit (as specified in MMCR0[TBSEL]) transitions from 0 to 1 ", .pme_event_ids = power4_event_ids[POWER4_PME_PM_TB_BIT_TRANS], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_TB_BIT_TRANS] }, [ POWER4_PME_PM_GPR_MAP_FULL_CYC ] = { .pme_name = "PM_GPR_MAP_FULL_CYC", .pme_code = 0x235, .pme_short_desc = "Cycles GPR mapper full", .pme_long_desc = "The ISU sends a signal indicating that the gpr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mappers is full but the entire mapper may not be.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_GPR_MAP_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_GPR_MAP_FULL_CYC] }, [ POWER4_PME_PM_MRK_ST_CMPL ] = { .pme_name = "PM_MRK_ST_CMPL", .pme_code = 0x1003, .pme_short_desc = "Marked store instruction completed", .pme_long_desc = "A sampled store has completed (data home)", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_ST_CMPL], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_ST_CMPL] }, [ POWER4_PME_PM_MRK_LSU_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_LRQ", .pme_code = 0x3910, .pme_short_desc = "Marked LRQ flushes", .pme_long_desc = "A marked load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LSU_FLUSH_LRQ], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LSU_FLUSH_LRQ] }, [ POWER4_PME_PM_FPU0_STF ] = { .pme_name = 
"PM_FPU0_STF", .pme_code = 0x122, .pme_short_desc = "FPU0 executed store instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing a store instruction.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU0_STF], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU0_STF] }, [ POWER4_PME_PM_FPU1_FMA ] = { .pme_name = "PM_FPU1_FMA", .pme_code = 0x105, .pme_short_desc = "FPU1 executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU1_FMA], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU1_FMA] }, [ POWER4_PME_PM_L2SA_MOD_TAG ] = { .pme_name = "PM_L2SA_MOD_TAG", .pme_code = 0xf06, .pme_short_desc = "L2 slice A transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SA_MOD_TAG], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SA_MOD_TAG] }, [ POWER4_PME_PM_MRK_DATA_FROM_L275_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L275_SHR", .pme_code = 0x6c76, .pme_short_desc = "Marked data loaded from L2.75 shared", .pme_long_desc = "DL1 was reloaded with shared (T) data from the L2 of another MCM due to a marked demand load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_DATA_FROM_L275_SHR], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_DATA_FROM_L275_SHR] }, [ POWER4_PME_PM_1INST_CLB_CYC ] = { .pme_name = "PM_1INST_CLB_CYC", .pme_code = 0x450, .pme_short_desc = "Cycles 1 instruction in CLB", .pme_long_desc = "The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. 
Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_1INST_CLB_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_1INST_CLB_CYC] }, [ POWER4_PME_PM_LSU1_FLUSH_ULD ] = { .pme_name = "PM_LSU1_FLUSH_ULD", .pme_code = 0xc04, .pme_short_desc = "LSU1 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU1_FLUSH_ULD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU1_FLUSH_ULD] }, [ POWER4_PME_PM_MRK_INST_FIN ] = { .pme_name = "PM_MRK_INST_FIN", .pme_code = 0x7005, .pme_short_desc = "Marked instruction finished", .pme_long_desc = "One of the execution units finished a marked instruction. Instructions that finish may not necessarily complete", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_INST_FIN], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_INST_FIN] }, [ POWER4_PME_PM_MRK_LSU0_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU0_FLUSH_UST", .pme_code = 0x911, .pme_short_desc = "LSU0 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 0 because it was unaligned", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LSU0_FLUSH_UST], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LSU0_FLUSH_UST] }, [ POWER4_PME_PM_FPU_FDIV ] = { .pme_name = "PM_FPU_FDIV", .pme_code = 0x1100, .pme_short_desc = "FPU executed FDIV instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when FPU is executing a divide instruction. This could be fdiv, fdivs, fdiv., fdivs.
Combined Unit 0 + Unit 1", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU_FDIV], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU_FDIV] }, [ POWER4_PME_PM_LSU_LRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LRQ_S0_ALLOC", .pme_code = 0xc26, .pme_short_desc = "LRQ slot 0 allocated", .pme_long_desc = "LRQ slot zero was allocated", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_LRQ_S0_ALLOC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_LRQ_S0_ALLOC] }, [ POWER4_PME_PM_FPU0_FULL_CYC ] = { .pme_name = "PM_FPU0_FULL_CYC", .pme_code = 0x203, .pme_short_desc = "Cycles FPU0 issue queue full", .pme_long_desc = "The issue queue for FPU unit 0 cannot accept any more instructions. Issue is stopped", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU0_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU0_FULL_CYC] }, [ POWER4_PME_PM_FPU_SINGLE ] = { .pme_name = "PM_FPU_SINGLE", .pme_code = 0x5120, .pme_short_desc = "FPU executed single precision instruction", .pme_long_desc = "FPU is executing single precision instruction. Combined Unit 0 + Unit 1", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU_SINGLE], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU_SINGLE] }, [ POWER4_PME_PM_FPU0_FMA ] = { .pme_name = "PM_FPU0_FMA", .pme_code = 0x101, .pme_short_desc = "FPU0 executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. 
This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU0_FMA], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU0_FMA] }, [ POWER4_PME_PM_MRK_LSU1_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU1_FLUSH_ULD", .pme_code = 0x914, .pme_short_desc = "LSU1 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LSU1_FLUSH_ULD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LSU1_FLUSH_ULD] }, [ POWER4_PME_PM_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_LSU1_FLUSH_LRQ", .pme_code = 0xc06, .pme_short_desc = "LSU1 LRQ flushes", .pme_long_desc = "A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU1_FLUSH_LRQ], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU1_FLUSH_LRQ] }, [ POWER4_PME_PM_L2SA_ST_HIT ] = { .pme_name = "PM_L2SA_ST_HIT", .pme_code = 0xf11, .pme_short_desc = "L2 slice A store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SA_ST_HIT], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SA_ST_HIT] }, [ POWER4_PME_PM_L2SB_SHR_INV ] = { .pme_name = "PM_L2SB_SHR_INV", .pme_code = 0xf21, .pme_short_desc = "L2 slice B transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. 
NOTE: For this event to be useful the tablewalk duration event should also be counted.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SB_SHR_INV], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SB_SHR_INV] }, [ POWER4_PME_PM_DTLB_MISS ] = { .pme_name = "PM_DTLB_MISS", .pme_code = 0x904, .pme_short_desc = "Data TLB misses", .pme_long_desc = "A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_DTLB_MISS], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_DTLB_MISS] }, [ POWER4_PME_PM_MRK_ST_MISS_L1 ] = { .pme_name = "PM_MRK_ST_MISS_L1", .pme_code = 0x923, .pme_short_desc = "Marked L1 D cache store misses", .pme_long_desc = "A marked store missed the dcache", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_ST_MISS_L1], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_ST_MISS_L1] }, [ POWER4_PME_PM_EXT_INT ] = { .pme_name = "PM_EXT_INT", .pme_code = 0x8002, .pme_short_desc = "External interrupts", .pme_long_desc = "An external interrupt occurred", .pme_event_ids = power4_event_ids[POWER4_PME_PM_EXT_INT], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_EXT_INT] }, [ POWER4_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_LRQ", .pme_code = 0x916, .pme_short_desc = "LSU1 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LSU1_FLUSH_LRQ], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LSU1_FLUSH_LRQ] }, [ POWER4_PME_PM_MRK_ST_GPS ] = { .pme_name = "PM_MRK_ST_GPS", .pme_code = 0x6003, 
.pme_short_desc = "Marked store sent to GPS", .pme_long_desc = "A sampled store has been sent to the memory subsystem", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_ST_GPS], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_ST_GPS] }, [ POWER4_PME_PM_GRP_DISP_SUCCESS ] = { .pme_name = "PM_GRP_DISP_SUCCESS", .pme_code = 0x5001, .pme_short_desc = "Group dispatch success", .pme_long_desc = "Number of groups successfully dispatched (not rejected)", .pme_event_ids = power4_event_ids[POWER4_PME_PM_GRP_DISP_SUCCESS], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_GRP_DISP_SUCCESS] }, [ POWER4_PME_PM_LSU1_LDF ] = { .pme_name = "PM_LSU1_LDF", .pme_code = 0x934, .pme_short_desc = "LSU1 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed from LSU unit 1", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU1_LDF], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU1_LDF] }, [ POWER4_PME_PM_FAB_CMD_ISSUED ] = { .pme_name = "PM_FAB_CMD_ISSUED", .pme_code = 0xf16, .pme_short_desc = "Fabric command issued", .pme_long_desc = "A bus command was issued on the MCM to MCM fabric from the local (this chip's) Fabric Bus Controller. This event is scaled to the fabric frequency and must be adjusted for a true count. i.e.
if the fabric is running 2:1, divide the count by 2.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FAB_CMD_ISSUED], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FAB_CMD_ISSUED] }, [ POWER4_PME_PM_LSU0_SRQ_STFWD ] = { .pme_name = "PM_LSU0_SRQ_STFWD", .pme_code = 0xc20, .pme_short_desc = "LSU0 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 0", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU0_SRQ_STFWD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU0_SRQ_STFWD] }, [ POWER4_PME_PM_CR_MAP_FULL_CYC ] = { .pme_name = "PM_CR_MAP_FULL_CYC", .pme_code = 0x204, .pme_short_desc = "Cycles CR logical operation mapper full", .pme_long_desc = "The ISU sends a signal indicating that the cr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_CR_MAP_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_CR_MAP_FULL_CYC] }, [ POWER4_PME_PM_MRK_LSU0_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU0_FLUSH_ULD", .pme_code = 0x910, .pme_short_desc = "LSU0 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LSU0_FLUSH_ULD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LSU0_FLUSH_ULD] }, [ POWER4_PME_PM_LSU_DERAT_MISS ] = { .pme_name = "PM_LSU_DERAT_MISS", .pme_code = 0x6900, .pme_short_desc = "DERAT misses", .pme_long_desc = "Total D-ERAT Misses (Unit 0 + Unit 1). Requests that miss the Derat are rejected and retried until the request hits in the Erat. 
This may result in multiple ERAT misses for the same instruction.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_DERAT_MISS], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_DERAT_MISS] }, [ POWER4_PME_PM_FPU0_SINGLE ] = { .pme_name = "PM_FPU0_SINGLE", .pme_code = 0x123, .pme_short_desc = "FPU0 executed single precision instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing single precision instruction.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU0_SINGLE], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU0_SINGLE] }, [ POWER4_PME_PM_FPU1_FDIV ] = { .pme_name = "PM_FPU1_FDIV", .pme_code = 0x104, .pme_short_desc = "FPU1 executed FDIV instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp1 is executing a divide instruction. This could be fdiv, fdivs, fdiv., fdivs.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU1_FDIV], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU1_FDIV] }, [ POWER4_PME_PM_FPU1_FEST ] = { .pme_name = "PM_FPU1_FEST", .pme_code = 0x116, .pme_short_desc = "FPU1 executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. ", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU1_FEST], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU1_FEST] }, [ POWER4_PME_PM_FPU0_FRSP_FCONV ] = { .pme_name = "PM_FPU0_FRSP_FCONV", .pme_code = 0x111, .pme_short_desc = "FPU0 executed FRSP or FCONV instructions", .pme_long_desc = "This signal is active for one cycle when fp0 is executing frsp or convert kind of instruction.
This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU0_FRSP_FCONV], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU0_FRSP_FCONV] }, [ POWER4_PME_PM_MRK_ST_CMPL_INT ] = { .pme_name = "PM_MRK_ST_CMPL_INT", .pme_code = 0x3003, .pme_short_desc = "Marked store completed with intervention", .pme_long_desc = "A marked store previously sent to the memory subsystem completed (data home) after requiring intervention", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_ST_CMPL_INT], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_ST_CMPL_INT] }, [ POWER4_PME_PM_FXU_FIN ] = { .pme_name = "PM_FXU_FIN", .pme_code = 0x3230, .pme_short_desc = "FXU produced a result", .pme_long_desc = "The fixed point unit (Unit 0 + Unit 1) finished a marked instruction. Instructions that finish may not necessarily complete.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FXU_FIN], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FXU_FIN] }, [ POWER4_PME_PM_FPU_STF ] = { .pme_name = "PM_FPU_STF", .pme_code = 0x6120, .pme_short_desc = "FPU executed store instruction", .pme_long_desc = "FPU is executing a store instruction. Combined Unit 0 + Unit 1", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU_STF], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU_STF] }, [ POWER4_PME_PM_DSLB_MISS ] = { .pme_name = "PM_DSLB_MISS", .pme_code = 0x905, .pme_short_desc = "Data SLB misses", .pme_long_desc = "An SLB miss for a data request occurred.
SLB misses trap to the operating system to resolve", .pme_event_ids = power4_event_ids[POWER4_PME_PM_DSLB_MISS], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_DSLB_MISS] }, [ POWER4_PME_PM_DATA_FROM_L275_SHR ] = { .pme_name = "PM_DATA_FROM_L275_SHR", .pme_code = 0x6c66, .pme_short_desc = "Data loaded from L2.75 shared", .pme_long_desc = "DL1 was reloaded with shared (T) data from the L2 of another MCM due to a demand load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_DATA_FROM_L275_SHR], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_DATA_FROM_L275_SHR] }, [ POWER4_PME_PM_FXLS1_FULL_CYC ] = { .pme_name = "PM_FXLS1_FULL_CYC", .pme_code = 0x214, .pme_short_desc = "Cycles FXU1/LS1 queue full", .pme_long_desc = "The issue queue for FXU/LSU unit 1 cannot accept any more instructions. Issue is stopped", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FXLS1_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FXLS1_FULL_CYC] }, [ POWER4_PME_PM_L3B0_DIR_MIS ] = { .pme_name = "PM_L3B0_DIR_MIS", .pme_code = 0xf01, .pme_short_desc = "L3 bank 0 directory misses", .pme_long_desc = "A reference was made to the local L3 directory by a local CPU and it missed in the L3. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L3B0_DIR_MIS], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L3B0_DIR_MIS] }, [ POWER4_PME_PM_2INST_CLB_CYC ] = { .pme_name = "PM_2INST_CLB_CYC", .pme_code = 0x451, .pme_short_desc = "Cycles 2 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue.
This signal gives a real time history of the number of instruction quads valid in the instruction queue.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_2INST_CLB_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_2INST_CLB_CYC] }, [ POWER4_PME_PM_MRK_STCX_FAIL ] = { .pme_name = "PM_MRK_STCX_FAIL", .pme_code = 0x925, .pme_short_desc = "Marked STCX failed", .pme_long_desc = "A marked stcx (stwcx or stdcx) failed", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_STCX_FAIL], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_STCX_FAIL] }, [ POWER4_PME_PM_LSU_LMQ_LHR_MERGE ] = { .pme_name = "PM_LSU_LMQ_LHR_MERGE", .pme_code = 0x926, .pme_short_desc = "LMQ LHR merges", .pme_long_desc = "A dcache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_LMQ_LHR_MERGE], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_LMQ_LHR_MERGE] }, [ POWER4_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { .pme_name = "PM_FXU0_BUSY_FXU1_IDLE", .pme_code = 0x7002, .pme_short_desc = "FXU0 busy FXU1 idle", .pme_long_desc = "FXU0 is busy while FXU1 is idle", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FXU0_BUSY_FXU1_IDLE], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FXU0_BUSY_FXU1_IDLE] }, [ POWER4_PME_PM_L3B1_DIR_REF ] = { .pme_name = "PM_L3B1_DIR_REF", .pme_code = 0xf02, .pme_short_desc = "L3 bank 1 directory references", .pme_long_desc = "A reference was made to the local L3 directory by a local CPU. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e.
if the L3 is running 3:1, divide the count by 3", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L3B1_DIR_REF], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L3B1_DIR_REF] }, [ POWER4_PME_PM_MRK_LSU_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU_FLUSH_UST", .pme_code = 0x7910, .pme_short_desc = "Marked unaligned store flushes", .pme_long_desc = "A marked store was flushed because it was unaligned", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LSU_FLUSH_UST], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LSU_FLUSH_UST] }, [ POWER4_PME_PM_MRK_DATA_FROM_L25_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L25_SHR", .pme_code = 0x5c76, .pme_short_desc = "Marked data loaded from L2.5 shared", .pme_long_desc = "DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a marked demand load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_DATA_FROM_L25_SHR], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_DATA_FROM_L25_SHR] }, [ POWER4_PME_PM_LSU_FLUSH_ULD ] = { .pme_name = "PM_LSU_FLUSH_ULD", .pme_code = 0x1c00, .pme_short_desc = "LRQ unaligned load flushes", .pme_long_desc = "A load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_FLUSH_ULD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_FLUSH_ULD] }, [ POWER4_PME_PM_MRK_BRU_FIN ] = { .pme_name = "PM_MRK_BRU_FIN", .pme_code = 0x2005, .pme_short_desc = "Marked instruction BRU processing finished", .pme_long_desc = "The branch unit finished a marked instruction. Instructions that finish may not necessarily complete", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_BRU_FIN], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_BRU_FIN] }, [ POWER4_PME_PM_IERAT_XLATE_WR ] = { .pme_name = "PM_IERAT_XLATE_WR", .pme_code = 0x327, .pme_short_desc = "Translation written to ierat", .pme_long_desc = "This signal will be asserted each time the I-ERAT is written.
This indicates that an ERAT miss has been serviced. ERAT misses will initiate a sequence resulting in the ERAT being written. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed. This should be a fairly accurate count of ERAT misses (best available).", .pme_event_ids = power4_event_ids[POWER4_PME_PM_IERAT_XLATE_WR], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_IERAT_XLATE_WR] }, [ POWER4_PME_PM_LSU0_BUSY ] = { .pme_name = "PM_LSU0_BUSY", .pme_code = 0xc33, .pme_short_desc = "LSU0 busy", .pme_long_desc = "LSU unit 0 is busy rejecting instructions", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU0_BUSY], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU0_BUSY] }, [ POWER4_PME_PM_L2SA_ST_REQ ] = { .pme_name = "PM_L2SA_ST_REQ", .pme_code = 0xf10, .pme_short_desc = "L2 slice A store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SA_ST_REQ], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SA_ST_REQ] }, [ POWER4_PME_PM_DATA_FROM_MEM ] = { .pme_name = "PM_DATA_FROM_MEM", .pme_code = 0x2c66, .pme_short_desc = "Data loaded from memory", .pme_long_desc = "DL1 was reloaded from memory due to a demand load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_DATA_FROM_MEM], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_DATA_FROM_MEM] }, [ POWER4_PME_PM_FPR_MAP_FULL_CYC ] = { .pme_name = "PM_FPR_MAP_FULL_CYC", .pme_code = 0x201, .pme_short_desc = "Cycles FPR mapper full", .pme_long_desc = "The ISU sends a signal indicating that the FPR mapper cannot accept any more groups. Dispatch is stopped.
Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPR_MAP_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPR_MAP_FULL_CYC] }, [ POWER4_PME_PM_FPU1_FULL_CYC ] = { .pme_name = "PM_FPU1_FULL_CYC", .pme_code = 0x207, .pme_short_desc = "Cycles FPU1 issue queue full", .pme_long_desc = "The issue queue for FPU unit 1 cannot accept any more instructions. Issue is stopped", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU1_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU1_FULL_CYC] }, [ POWER4_PME_PM_FPU0_FIN ] = { .pme_name = "PM_FPU0_FIN", .pme_code = 0x113, .pme_short_desc = "FPU0 produced a result", .pme_long_desc = "fp0 finished, produced a result. This only indicates finish, not completion. ", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU0_FIN], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU0_FIN] }, [ POWER4_PME_PM_3INST_CLB_CYC ] = { .pme_name = "PM_3INST_CLB_CYC", .pme_code = 0x452, .pme_short_desc = "Cycles 3 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue.
This signal gives a real time history of the number of instruction quads valid in the instruction queue.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_3INST_CLB_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_3INST_CLB_CYC] }, [ POWER4_PME_PM_DATA_FROM_L35 ] = { .pme_name = "PM_DATA_FROM_L35", .pme_code = 0x3c66, .pme_short_desc = "Data loaded from L3.5", .pme_long_desc = "DL1 was reloaded from the L3 of another MCM due to a demand load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_DATA_FROM_L35], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_DATA_FROM_L35] }, [ POWER4_PME_PM_L2SA_SHR_INV ] = { .pme_name = "PM_L2SA_SHR_INV", .pme_code = 0xf05, .pme_short_desc = "L2 slice A transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. 
NOTE: For this event to be useful the tablewalk duration event should also be counted.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SA_SHR_INV], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SA_SHR_INV] }, [ POWER4_PME_PM_MRK_LSU_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_SRQ", .pme_code = 0x4910, .pme_short_desc = "Marked SRQ flushes", .pme_long_desc = "A marked store was flushed because a younger load hit an older store that was already in the SRQ or in the same group.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LSU_FLUSH_SRQ], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LSU_FLUSH_SRQ] }, [ POWER4_PME_PM_THRESH_TIMEO ] = { .pme_name = "PM_THRESH_TIMEO", .pme_code = 0x2003, .pme_short_desc = "Threshold timeout", .pme_long_desc = "The threshold timer expired", .pme_event_ids = power4_event_ids[POWER4_PME_PM_THRESH_TIMEO], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_THRESH_TIMEO] }, [ POWER4_PME_PM_FPU_FSQRT ] = { .pme_name = "PM_FPU_FSQRT", .pme_code = 0x6100, .pme_short_desc = "FPU executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when FPU is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.
Combined Unit 0 + Unit 1", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU_FSQRT], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU_FSQRT] }, [ POWER4_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_LRQ", .pme_code = 0x912, .pme_short_desc = "LSU0 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LSU0_FLUSH_LRQ], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LSU0_FLUSH_LRQ] }, [ POWER4_PME_PM_FXLS0_FULL_CYC ] = { .pme_name = "PM_FXLS0_FULL_CYC", .pme_code = 0x210, .pme_short_desc = "Cycles FXU0/LS0 queue full", .pme_long_desc = "The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FXLS0_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FXLS0_FULL_CYC] }, [ POWER4_PME_PM_DATA_TABLEWALK_CYC ] = { .pme_name = "PM_DATA_TABLEWALK_CYC", .pme_code = 0x936, .pme_short_desc = "Cycles doing data tablewalks", .pme_long_desc = "This signal is asserted every cycle when a tablewalk is active. 
While a tablewalk is active any request attempting to access the TLB will be rejected and retried.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_DATA_TABLEWALK_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_DATA_TABLEWALK_CYC] }, [ POWER4_PME_PM_FPU0_ALL ] = { .pme_name = "PM_FPU0_ALL", .pme_code = 0x103, .pme_short_desc = "FPU0 executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing an add, mult, sub, cmp or sel kind of instruction.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU0_ALL], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU0_ALL] }, [ POWER4_PME_PM_FPU0_FEST ] = { .pme_name = "PM_FPU0_FEST", .pme_code = 0x112, .pme_short_desc = "FPU0 executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. ", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU0_FEST], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU0_FEST] }, [ POWER4_PME_PM_DATA_FROM_L25_MOD ] = { .pme_name = "PM_DATA_FROM_L25_MOD", .pme_code = 0x8c66, .pme_short_desc = "Data loaded from L2.5 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a demand load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_DATA_FROM_L25_MOD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_DATA_FROM_L25_MOD] }, [ POWER4_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", .pme_code = 0x2002, .pme_short_desc = "Cycles LMQ and SRQ empty", .pme_long_desc = "Cycles when both the LMQ and SRQ are empty (LSU is idle)", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC] }, [ POWER4_PME_PM_FPU_FEST ] = { .pme_name = "PM_FPU_FEST", .pme_code = 0x3110, .pme_short_desc = "FPU executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when executing one of the estimate instructions.
This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU_FEST], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU_FEST] }, [ POWER4_PME_PM_0INST_FETCH ] = { .pme_name = "PM_0INST_FETCH", .pme_code = 0x8327, .pme_short_desc = "No instructions fetched", .pme_long_desc = "No instructions were fetched this cycle (due to IFU hold, redirect, or icache miss)", .pme_event_ids = power4_event_ids[POWER4_PME_PM_0INST_FETCH], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_0INST_FETCH] }, [ POWER4_PME_PM_LARX_LSU1 ] = { .pme_name = "PM_LARX_LSU1", .pme_code = 0xc77, .pme_short_desc = "Larx executed on LSU1", .pme_long_desc = "Invalid event, larx instructions are never executed on unit 1", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LARX_LSU1], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LARX_LSU1] }, [ POWER4_PME_PM_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_LD_MISS_L1_LSU0", .pme_code = 0xc12, .pme_short_desc = "LSU0 L1 D cache load misses", .pme_long_desc = "A load, executing on unit 0, missed the dcache", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LD_MISS_L1_LSU0], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LD_MISS_L1_LSU0] }, [ POWER4_PME_PM_L1_PREF ] = { .pme_name = "PM_L1_PREF", .pme_code = 0xc35, .pme_short_desc = "L1 cache data prefetches", .pme_long_desc = "A request to prefetch data into the L1 was made", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L1_PREF], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L1_PREF] }, [ POWER4_PME_PM_FPU1_STALL3 ] = { .pme_name = "PM_FPU1_STALL3", .pme_code = 0x125, .pme_short_desc = "FPU1 stalled in pipe3", .pme_long_desc = "This signal indicates that fp1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. 
", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU1_STALL3], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU1_STALL3] }, [ POWER4_PME_PM_BRQ_FULL_CYC ] = { .pme_name = "PM_BRQ_FULL_CYC", .pme_code = 0x205, .pme_short_desc = "Cycles branch queue full", .pme_long_desc = "The ISU sends a signal indicating that the issue queue that feeds the ifu br unit cannot accept any more groups (queue is full of groups).", .pme_event_ids = power4_event_ids[POWER4_PME_PM_BRQ_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_BRQ_FULL_CYC] }, [ POWER4_PME_PM_LARX ] = { .pme_name = "PM_LARX", .pme_code = 0x4c70, .pme_short_desc = "Larx executed", .pme_long_desc = "A Larx (lwarx or ldarx) was executed. This is the combined count from LSU0 + LSU1, but these instructions only execute on LSU0", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LARX], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LARX] }, [ POWER4_PME_PM_MRK_DATA_FROM_L35 ] = { .pme_name = "PM_MRK_DATA_FROM_L35", .pme_code = 0x3c76, .pme_short_desc = "Marked data loaded from L3.5", .pme_long_desc = "DL1 was reloaded from the L3 of another MCM due to a marked demand load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_DATA_FROM_L35], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_DATA_FROM_L35] }, [ POWER4_PME_PM_WORK_HELD ] = { .pme_name = "PM_WORK_HELD", .pme_code = 0x2001, .pme_short_desc = "Work held", .pme_long_desc = "RAS Unit has signaled completion to stop and there are groups waiting to complete", .pme_event_ids = power4_event_ids[POWER4_PME_PM_WORK_HELD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_WORK_HELD] }, [ POWER4_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU0", .pme_code = 0x920, .pme_short_desc = "LSU0 L1 D cache load misses", .pme_long_desc = "A marked load, executing on unit 0, missed the dcache", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LD_MISS_L1_LSU0], .pme_group_vector = 
power4_group_vecs[POWER4_PME_PM_MRK_LD_MISS_L1_LSU0] }, [ POWER4_PME_PM_FXU_IDLE ] = { .pme_name = "PM_FXU_IDLE", .pme_code = 0x5002, .pme_short_desc = "FXU idle", .pme_long_desc = "FXU0 and FXU1 are both idle", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FXU_IDLE], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FXU_IDLE] }, [ POWER4_PME_PM_INST_CMPL ] = { .pme_name = "PM_INST_CMPL", .pme_code = 0x8001, .pme_short_desc = "Instructions completed", .pme_long_desc = "Number of Eligible Instructions that completed. ", .pme_event_ids = power4_event_ids[POWER4_PME_PM_INST_CMPL], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_INST_CMPL] }, [ POWER4_PME_PM_LSU1_FLUSH_UST ] = { .pme_name = "PM_LSU1_FLUSH_UST", .pme_code = 0xc05, .pme_short_desc = "LSU1 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 1 because it was unaligned (crossed a 4k boundary)", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU1_FLUSH_UST], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU1_FLUSH_UST] }, [ POWER4_PME_PM_LSU0_FLUSH_ULD ] = { .pme_name = "PM_LSU0_FLUSH_ULD", .pme_code = 0xc00, .pme_short_desc = "LSU0 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU0_FLUSH_ULD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU0_FLUSH_ULD] }, [ POWER4_PME_PM_INST_FROM_L2 ] = { .pme_name = "PM_INST_FROM_L2", .pme_code = 0x3327, .pme_short_desc = "Instructions fetched from L2", .pme_long_desc = "An instruction fetch group was fetched from L2. 
Fetch Groups can contain up to 8 instructions", .pme_event_ids = power4_event_ids[POWER4_PME_PM_INST_FROM_L2], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_INST_FROM_L2] }, [ POWER4_PME_PM_DATA_FROM_L3 ] = { .pme_name = "PM_DATA_FROM_L3", .pme_code = 0x1c66, .pme_short_desc = "Data loaded from L3", .pme_long_desc = "DL1 was reloaded from the local L3 due to a demand load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_DATA_FROM_L3], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_DATA_FROM_L3] }, [ POWER4_PME_PM_FPU0_DENORM ] = { .pme_name = "PM_FPU0_DENORM", .pme_code = 0x120, .pme_short_desc = "FPU0 received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU0_DENORM], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU0_DENORM] }, [ POWER4_PME_PM_FPU1_FMOV_FEST ] = { .pme_name = "PM_FPU1_FMOV_FEST", .pme_code = 0x114, .pme_short_desc = "FPU1 executing FMOV or FEST instructions", .pme_long_desc = "This signal is active for one cycle when fp1 is executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU1_FMOV_FEST], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU1_FMOV_FEST] }, [ POWER4_PME_PM_GRP_DISP_REJECT ] = { .pme_name = "PM_GRP_DISP_REJECT", .pme_code = 0x8003, .pme_short_desc = "Group dispatch rejected", .pme_long_desc = "A group that previously attempted dispatch was rejected.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_GRP_DISP_REJECT], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_GRP_DISP_REJECT] }, [ POWER4_PME_PM_INST_FETCH_CYC ] = { .pme_name = "PM_INST_FETCH_CYC", .pme_code = 0x323, .pme_short_desc = "Cycles at least 1 instruction fetched", .pme_long_desc = "Asserted each cycle when the IFU sends at least one instruction to the IDU. 
", .pme_event_ids = power4_event_ids[POWER4_PME_PM_INST_FETCH_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_INST_FETCH_CYC] }, [ POWER4_PME_PM_LSU_LDF ] = { .pme_name = "PM_LSU_LDF", .pme_code = 0x8930, .pme_short_desc = "LSU executed Floating Point load instruction", .pme_long_desc = "LSU executed Floating Point load instruction", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_LDF], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_LDF] }, [ POWER4_PME_PM_INST_DISP ] = { .pme_name = "PM_INST_DISP", .pme_code = 0x221, .pme_short_desc = "Instructions dispatched", .pme_long_desc = "The ISU sends the number of instructions dispatched.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_INST_DISP], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_INST_DISP] }, [ POWER4_PME_PM_L2SA_MOD_INV ] = { .pme_name = "PM_L2SA_MOD_INV", .pme_code = 0xf07, .pme_short_desc = "L2 slice A transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SA_MOD_INV], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SA_MOD_INV] }, [ POWER4_PME_PM_DATA_FROM_L25_SHR ] = { .pme_name = "PM_DATA_FROM_L25_SHR", .pme_code = 0x5c66, .pme_short_desc = "Data loaded from L2.5 shared", .pme_long_desc = "DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a demand load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_DATA_FROM_L25_SHR], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_DATA_FROM_L25_SHR] }, [ POWER4_PME_PM_FAB_CMD_RETRIED ] = { .pme_name = "PM_FAB_CMD_RETRIED", .pme_code = 0xf17, .pme_short_desc = "Fabric command retried", .pme_long_desc = "A bus command on the MCM to MCM fabric was retried. 
This event is the total count of all retried fabric commands for the local MCM (all four chips report the same value). This event is scaled to the fabric frequency and must be adjusted for a true count. i.e. if the fabric is running 2:1, divide the count by 2.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FAB_CMD_RETRIED], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FAB_CMD_RETRIED] }, [ POWER4_PME_PM_L1_DCACHE_RELOAD_VALID ] = { .pme_name = "PM_L1_DCACHE_RELOAD_VALID", .pme_code = 0xc64, .pme_short_desc = "L1 reload data source valid", .pme_long_desc = "The data source information is valid", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L1_DCACHE_RELOAD_VALID], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L1_DCACHE_RELOAD_VALID] }, [ POWER4_PME_PM_MRK_GRP_ISSUED ] = { .pme_name = "PM_MRK_GRP_ISSUED", .pme_code = 0x6005, .pme_short_desc = "Marked group issued", .pme_long_desc = "A sampled instruction was issued", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_GRP_ISSUED], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_GRP_ISSUED] }, [ POWER4_PME_PM_FPU_FULL_CYC ] = { .pme_name = "PM_FPU_FULL_CYC", .pme_code = 0x5200, .pme_short_desc = "Cycles FPU issue queue full", .pme_long_desc = "Cycles when one or both FPU issue queues are full", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU_FULL_CYC] }, [ POWER4_PME_PM_FPU_FMA ] = { .pme_name = "PM_FPU_FMA", .pme_code = 0x2100, .pme_short_desc = "FPU executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when FPU is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
Combined Unit 0 + Unit 1", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU_FMA], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU_FMA] }, [ POWER4_PME_PM_MRK_CRU_FIN ] = { .pme_name = "PM_MRK_CRU_FIN", .pme_code = 0x4005, .pme_short_desc = "Marked instruction CRU processing finished", .pme_long_desc = "The Condition Register Unit finished a marked instruction. Instructions that finish may not necessarily complete", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_CRU_FIN], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_CRU_FIN] }, [ POWER4_PME_PM_MRK_LSU1_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU1_FLUSH_UST", .pme_code = 0x915, .pme_short_desc = "LSU1 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary)", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LSU1_FLUSH_UST], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LSU1_FLUSH_UST] }, [ POWER4_PME_PM_MRK_FXU_FIN ] = { .pme_name = "PM_MRK_FXU_FIN", .pme_code = 0x6004, .pme_short_desc = "Marked instruction FXU processing finished", .pme_long_desc = "One of the Fixed Point Units finished a marked instruction. Instructions that finish may not necessarily complete", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_FXU_FIN], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_FXU_FIN] }, [ POWER4_PME_PM_BR_ISSUED ] = { .pme_name = "PM_BR_ISSUED", .pme_code = 0x330, .pme_short_desc = "Branches issued", .pme_long_desc = "This signal will be asserted each time the ISU issues a branch instruction. 
", .pme_event_ids = power4_event_ids[POWER4_PME_PM_BR_ISSUED], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_BR_ISSUED] }, [ POWER4_PME_PM_EE_OFF ] = { .pme_name = "PM_EE_OFF", .pme_code = 0x233, .pme_short_desc = "Cycles MSR(EE) bit off", .pme_long_desc = "The number of cycles the MSR(EE) bit was off.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_EE_OFF], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_EE_OFF] }, [ POWER4_PME_PM_INST_FROM_L3 ] = { .pme_name = "PM_INST_FROM_L3", .pme_code = 0x5327, .pme_short_desc = "Instruction fetched from L3", .pme_long_desc = "An instruction fetch group was fetched from L3. Fetch Groups can contain up to 8 instructions", .pme_event_ids = power4_event_ids[POWER4_PME_PM_INST_FROM_L3], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_INST_FROM_L3] }, [ POWER4_PME_PM_ITLB_MISS ] = { .pme_name = "PM_ITLB_MISS", .pme_code = 0x900, .pme_short_desc = "Instruction TLB misses", .pme_long_desc = "A TLB miss for an Instruction Fetch has occurred", .pme_event_ids = power4_event_ids[POWER4_PME_PM_ITLB_MISS], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_ITLB_MISS] }, [ POWER4_PME_PM_FXLS_FULL_CYC ] = { .pme_name = "PM_FXLS_FULL_CYC", .pme_code = 0x8210, .pme_short_desc = "Cycles FXLS queue is full", .pme_long_desc = "Cycles when one or both FXU/LSU issue queues are full", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FXLS_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FXLS_FULL_CYC] }, [ POWER4_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { .pme_name = "PM_FXU1_BUSY_FXU0_IDLE", .pme_code = 0x4002, .pme_short_desc = "FXU1 busy FXU0 idle", .pme_long_desc = "FXU0 was idle while FXU1 was busy", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FXU1_BUSY_FXU0_IDLE], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FXU1_BUSY_FXU0_IDLE] }, [ POWER4_PME_PM_GRP_DISP_VALID ] = { .pme_name = "PM_GRP_DISP_VALID", .pme_code = 0x223, .pme_short_desc 
= "Group dispatch valid", .pme_long_desc = "Dispatch has been attempted for a valid group. Some groups may be rejected. The total number of successful dispatches is the number of dispatch valid minus dispatch reject.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_GRP_DISP_VALID], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_GRP_DISP_VALID] }, [ POWER4_PME_PM_L2SC_ST_HIT ] = { .pme_name = "PM_L2SC_ST_HIT", .pme_code = 0xf15, .pme_short_desc = "L2 slice C store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SC_ST_HIT], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SC_ST_HIT] }, [ POWER4_PME_PM_MRK_GRP_DISP ] = { .pme_name = "PM_MRK_GRP_DISP", .pme_code = 0x1002, .pme_short_desc = "Marked group dispatched", .pme_long_desc = "A group containing a sampled instruction was dispatched", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_GRP_DISP], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_GRP_DISP] }, [ POWER4_PME_PM_L2SB_MOD_TAG ] = { .pme_name = "PM_L2SB_MOD_TAG", .pme_code = 0xf22, .pme_short_desc = "L2 slice B transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SB_MOD_TAG], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SB_MOD_TAG] }, [ POWER4_PME_PM_INST_FROM_L25_L275 ] = { .pme_name = "PM_INST_FROM_L25_L275", .pme_code = 0x2327, .pme_short_desc = "Instruction fetched from L2.5/L2.75", .pme_long_desc = "An instruction fetch group was fetched from the L2 of another chip. 
Fetch Groups can contain up to 8 instructions", .pme_event_ids = power4_event_ids[POWER4_PME_PM_INST_FROM_L25_L275], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_INST_FROM_L25_L275] }, [ POWER4_PME_PM_LSU_FLUSH_UST ] = { .pme_name = "PM_LSU_FLUSH_UST", .pme_code = 0x2c00, .pme_short_desc = "SRQ unaligned store flushes", .pme_long_desc = "A store was flushed because it was unaligned", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_FLUSH_UST], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_FLUSH_UST] }, [ POWER4_PME_PM_L2SB_ST_HIT ] = { .pme_name = "PM_L2SB_ST_HIT", .pme_code = 0xf13, .pme_short_desc = "L2 slice B store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SB_ST_HIT], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SB_ST_HIT] }, [ POWER4_PME_PM_FXU1_FIN ] = { .pme_name = "PM_FXU1_FIN", .pme_code = 0x236, .pme_short_desc = "FXU1 produced a result", .pme_long_desc = "The Fixed Point unit 1 finished an instruction and produced a result", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FXU1_FIN], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FXU1_FIN] }, [ POWER4_PME_PM_L3B1_DIR_MIS ] = { .pme_name = "PM_L3B1_DIR_MIS", .pme_code = 0xf03, .pme_short_desc = "L3 bank 1 directory misses", .pme_long_desc = "A reference was made to the local L3 directory by a local CPU and it missed in the L3. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. 
if the L3 is running 3:1, divide the count by 3", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L3B1_DIR_MIS], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L3B1_DIR_MIS] }, [ POWER4_PME_PM_4INST_CLB_CYC ] = { .pme_name = "PM_4INST_CLB_CYC", .pme_code = 0x453, .pme_short_desc = "Cycles 4 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_4INST_CLB_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_4INST_CLB_CYC] }, [ POWER4_PME_PM_GRP_CMPL ] = { .pme_name = "PM_GRP_CMPL", .pme_code = 0x7003, .pme_short_desc = "Group completed", .pme_long_desc = "A group completed. Microcoded instructions that span multiple groups will generate this event once per group.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_GRP_CMPL], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_GRP_CMPL] }, [ POWER4_PME_PM_DC_PREF_L2_CLONE_L3 ] = { .pme_name = "PM_DC_PREF_L2_CLONE_L3", .pme_code = 0xc27, .pme_short_desc = "L2 prefetch cloned with L3", .pme_long_desc = "A prefetch request was made to the L2 with a cloned request sent to the L3", .pme_event_ids = power4_event_ids[POWER4_PME_PM_DC_PREF_L2_CLONE_L3], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_DC_PREF_L2_CLONE_L3] }, [ POWER4_PME_PM_FPU_FRSP_FCONV ] = { .pme_name = "PM_FPU_FRSP_FCONV", .pme_code = 0x7110, .pme_short_desc = "FPU executed FRSP or FCONV instructions", .pme_long_desc = "This signal is active for one cycle when executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
Combined Unit 0 + Unit 1", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU_FRSP_FCONV], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU_FRSP_FCONV] }, [ POWER4_PME_PM_5INST_CLB_CYC ] = { .pme_name = "PM_5INST_CLB_CYC", .pme_code = 0x454, .pme_short_desc = "Cycles 5 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_5INST_CLB_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_5INST_CLB_CYC] }, [ POWER4_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_SRQ", .pme_code = 0x913, .pme_short_desc = "LSU0 marked SRQ flushes", .pme_long_desc = "A marked store was flushed because a younger load hits an older store that is already in the SRQ or in the same group.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LSU0_FLUSH_SRQ], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LSU0_FLUSH_SRQ] }, [ POWER4_PME_PM_MRK_LSU_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU_FLUSH_ULD", .pme_code = 0x8910, .pme_short_desc = "Marked unaligned load flushes", .pme_long_desc = "A marked load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LSU_FLUSH_ULD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LSU_FLUSH_ULD] }, [ POWER4_PME_PM_8INST_CLB_CYC ] = { .pme_name = "PM_8INST_CLB_CYC", .pme_code = 0x457, .pme_short_desc = "Cycles 8 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. 
Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_8INST_CLB_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_8INST_CLB_CYC] }, [ POWER4_PME_PM_LSU_LMQ_FULL_CYC ] = { .pme_name = "PM_LSU_LMQ_FULL_CYC", .pme_code = 0x927, .pme_short_desc = "Cycles LMQ full", .pme_long_desc = "The LMQ was full", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_LMQ_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_LMQ_FULL_CYC] }, [ POWER4_PME_PM_ST_REF_L1_LSU0 ] = { .pme_name = "PM_ST_REF_L1_LSU0", .pme_code = 0xc11, .pme_short_desc = "LSU0 L1 D cache store references", .pme_long_desc = "A store executed on unit 0", .pme_event_ids = power4_event_ids[POWER4_PME_PM_ST_REF_L1_LSU0], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_ST_REF_L1_LSU0] }, [ POWER4_PME_PM_LSU0_DERAT_MISS ] = { .pme_name = "PM_LSU0_DERAT_MISS", .pme_code = 0x902, .pme_short_desc = "LSU0 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. 
Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU0_DERAT_MISS], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU0_DERAT_MISS] }, [ POWER4_PME_PM_LSU_SRQ_SYNC_CYC ] = { .pme_name = "PM_LSU_SRQ_SYNC_CYC", .pme_code = 0x932, .pme_short_desc = "SRQ sync duration", .pme_long_desc = "This signal is asserted every cycle when a sync is in the SRQ.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_SRQ_SYNC_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_SRQ_SYNC_CYC] }, [ POWER4_PME_PM_FPU_STALL3 ] = { .pme_name = "PM_FPU_STALL3", .pme_code = 0x2120, .pme_short_desc = "FPU stalled in pipe3", .pme_long_desc = "FPU has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. Combined Unit 0 + Unit 1", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU_STALL3], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU_STALL3] }, [ POWER4_PME_PM_MRK_DATA_FROM_L2 ] = { .pme_name = "PM_MRK_DATA_FROM_L2", .pme_code = 0x4c76, .pme_short_desc = "Marked data loaded from L2", .pme_long_desc = "DL1 was reloaded from the local L2 due to a marked demand load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_DATA_FROM_L2], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_DATA_FROM_L2] }, [ POWER4_PME_PM_FPU0_FMOV_FEST ] = { .pme_name = "PM_FPU0_FMOV_FEST", .pme_code = 0x110, .pme_short_desc = "FPU0 executed FMOV or FEST instructions", .pme_long_desc = "This signal is active for one cycle when fp0 is executing a move kind of instruction or one of the estimate instructions. 
This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU0_FMOV_FEST], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU0_FMOV_FEST] }, [ POWER4_PME_PM_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_LSU0_FLUSH_SRQ", .pme_code = 0xc03, .pme_short_desc = "LSU0 SRQ flushes", .pme_long_desc = "A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU0_FLUSH_SRQ], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU0_FLUSH_SRQ] }, [ POWER4_PME_PM_LD_REF_L1_LSU0 ] = { .pme_name = "PM_LD_REF_L1_LSU0", .pme_code = 0xc10, .pme_short_desc = "LSU0 L1 D cache load references", .pme_long_desc = "A load executed on unit 0", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LD_REF_L1_LSU0], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LD_REF_L1_LSU0] }, [ POWER4_PME_PM_L2SC_SHR_INV ] = { .pme_name = "PM_L2SC_SHR_INV", .pme_code = 0xf25, .pme_short_desc = "L2 slice C transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SC_SHR_INV], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SC_SHR_INV] }, [ POWER4_PME_PM_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_LSU1_FLUSH_SRQ", .pme_code = 0xc07, .pme_short_desc = "LSU1 SRQ flushes", .pme_long_desc = "A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. 
", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU1_FLUSH_SRQ], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU1_FLUSH_SRQ] }, [ POWER4_PME_PM_LSU_LMQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LMQ_S0_ALLOC", .pme_code = 0x935, .pme_short_desc = "LMQ slot 0 allocated", .pme_long_desc = "The first entry in the LMQ was allocated.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_LMQ_S0_ALLOC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_LMQ_S0_ALLOC] }, [ POWER4_PME_PM_ST_REF_L1 ] = { .pme_name = "PM_ST_REF_L1", .pme_code = 0x7c10, .pme_short_desc = "L1 D cache store references", .pme_long_desc = "Total DL1 Store references", .pme_event_ids = power4_event_ids[POWER4_PME_PM_ST_REF_L1], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_ST_REF_L1] }, [ POWER4_PME_PM_LSU_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_SRQ_EMPTY_CYC", .pme_code = 0x4003, .pme_short_desc = "Cycles SRQ empty", .pme_long_desc = "The Store Request Queue is empty", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_SRQ_EMPTY_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_SRQ_EMPTY_CYC] }, [ POWER4_PME_PM_FPU1_STF ] = { .pme_name = "PM_FPU1_STF", .pme_code = 0x126, .pme_short_desc = "FPU1 executed store instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing a store instruction.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU1_STF], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU1_STF] }, [ POWER4_PME_PM_L3B0_DIR_REF ] = { .pme_name = "PM_L3B0_DIR_REF", .pme_code = 0xf00, .pme_short_desc = "L3 bank 0 directory references", .pme_long_desc = "A reference was made to the local L3 directory by a local CPU. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. 
if the L3 is running 3:1, divide the count by 3", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L3B0_DIR_REF], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L3B0_DIR_REF] }, [ POWER4_PME_PM_RUN_CYC ] = { .pme_name = "PM_RUN_CYC", .pme_code = 0x1005, .pme_short_desc = "Run cycles", .pme_long_desc = "Processor Cycles gated by the run latch", .pme_event_ids = power4_event_ids[POWER4_PME_PM_RUN_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_RUN_CYC] }, [ POWER4_PME_PM_LSU_LMQ_S0_VALID ] = { .pme_name = "PM_LSU_LMQ_S0_VALID", .pme_code = 0x931, .pme_short_desc = "LMQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ has eight entries that are allocated FIFO", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_LMQ_S0_VALID], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_LMQ_S0_VALID] }, [ POWER4_PME_PM_LSU_LRQ_S0_VALID ] = { .pme_name = "PM_LSU_LRQ_S0_VALID", .pme_code = 0xc22, .pme_short_desc = "LRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Load Request Queue slot zero is valid. 
The SRQ is 32 entries long and is allocated round-robin.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_LRQ_S0_VALID], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_LRQ_S0_VALID] }, [ POWER4_PME_PM_LSU0_LDF ] = { .pme_name = "PM_LSU0_LDF", .pme_code = 0x930, .pme_short_desc = "LSU0 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed from LSU unit 0", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU0_LDF], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU0_LDF] }, [ POWER4_PME_PM_MRK_IMR_RELOAD ] = { .pme_name = "PM_MRK_IMR_RELOAD", .pme_code = 0x922, .pme_short_desc = "Marked IMR reloaded", .pme_long_desc = "A DL1 reload occurred due to a marked load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_IMR_RELOAD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_IMR_RELOAD] }, [ POWER4_PME_PM_7INST_CLB_CYC ] = { .pme_name = "PM_7INST_CLB_CYC", .pme_code = 0x456, .pme_short_desc = "Cycles 7 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. 
This signal gives a real time history of the number of instruction quads valid in the instruction queue.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_7INST_CLB_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_7INST_CLB_CYC] }, [ POWER4_PME_PM_MRK_GRP_TIMEO ] = { .pme_name = "PM_MRK_GRP_TIMEO", .pme_code = 0x5005, .pme_short_desc = "Marked group completion timeout", .pme_long_desc = "The sampling timeout expired indicating that the previously sampled instruction is no longer in the processor", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_GRP_TIMEO], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_GRP_TIMEO] }, [ POWER4_PME_PM_FPU_FMOV_FEST ] = { .pme_name = "PM_FPU_FMOV_FEST", .pme_code = 0x8110, .pme_short_desc = "FPU executing FMOV or FEST instructions", .pme_long_desc = "This signal is active for one cycle when executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU_FMOV_FEST], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU_FMOV_FEST] }, [ POWER4_PME_PM_GRP_DISP_BLK_SB_CYC ] = { .pme_name = "PM_GRP_DISP_BLK_SB_CYC", .pme_code = 0x231, .pme_short_desc = "Cycles group dispatch blocked by scoreboard", .pme_long_desc = "The ISU sends a signal indicating that dispatch is blocked by scoreboard.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_GRP_DISP_BLK_SB_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_GRP_DISP_BLK_SB_CYC] }, [ POWER4_PME_PM_XER_MAP_FULL_CYC ] = { .pme_name = "PM_XER_MAP_FULL_CYC", .pme_code = 0x202, .pme_short_desc = "Cycles XER mapper full", .pme_long_desc = "The ISU sends a signal indicating that the xer mapper cannot accept any more groups. Dispatch is stopped. 
Note: this condition indicates that a pool of mappers is full but the entire mapper may not be.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_XER_MAP_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_XER_MAP_FULL_CYC] }, [ POWER4_PME_PM_ST_MISS_L1 ] = { .pme_name = "PM_ST_MISS_L1", .pme_code = 0xc23, .pme_short_desc = "L1 D cache store misses", .pme_long_desc = "A store missed the dcache", .pme_event_ids = power4_event_ids[POWER4_PME_PM_ST_MISS_L1], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_ST_MISS_L1] }, [ POWER4_PME_PM_STOP_COMPLETION ] = { .pme_name = "PM_STOP_COMPLETION", .pme_code = 0x3001, .pme_short_desc = "Completion stopped", .pme_long_desc = "RAS Unit has signaled completion to stop", .pme_event_ids = power4_event_ids[POWER4_PME_PM_STOP_COMPLETION], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_STOP_COMPLETION] }, [ POWER4_PME_PM_MRK_GRP_CMPL ] = { .pme_name = "PM_MRK_GRP_CMPL", .pme_code = 0x4004, .pme_short_desc = "Marked group completed", .pme_long_desc = "A group containing a sampled instruction completed. 
Microcoded instructions that span multiple groups will generate this event once per group.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_GRP_CMPL], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_GRP_CMPL] }, [ POWER4_PME_PM_ISLB_MISS ] = { .pme_name = "PM_ISLB_MISS", .pme_code = 0x901, .pme_short_desc = "Instruction SLB misses", .pme_long_desc = "An SLB miss for an instruction fetch has occurred", .pme_event_ids = power4_event_ids[POWER4_PME_PM_ISLB_MISS], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_ISLB_MISS] }, [ POWER4_PME_PM_CYC ] = { .pme_name = "PM_CYC", .pme_code = 0x7, .pme_short_desc = "Processor cycles", .pme_long_desc = "Processor cycles", .pme_event_ids = power4_event_ids[POWER4_PME_PM_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_CYC] }, [ POWER4_PME_PM_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_LD_MISS_L1_LSU1", .pme_code = 0xc16, .pme_short_desc = "LSU1 L1 D cache load misses", .pme_long_desc = "A load, executing on unit 1, missed the dcache", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LD_MISS_L1_LSU1], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LD_MISS_L1_LSU1] }, [ POWER4_PME_PM_STCX_FAIL ] = { .pme_name = "PM_STCX_FAIL", .pme_code = 0x921, .pme_short_desc = "STCX failed", .pme_long_desc = "A stcx (stwcx or stdcx) failed", .pme_event_ids = power4_event_ids[POWER4_PME_PM_STCX_FAIL], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_STCX_FAIL] }, [ POWER4_PME_PM_LSU1_SRQ_STFWD ] = { .pme_name = "PM_LSU1_SRQ_STFWD", .pme_code = 0xc24, .pme_short_desc = "LSU1 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 1", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU1_SRQ_STFWD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU1_SRQ_STFWD] }, [ POWER4_PME_PM_GRP_DISP ] = { .pme_name = "PM_GRP_DISP", .pme_code = 0x2004, .pme_short_desc = "Group dispatches", .pme_long_desc = "A group was dispatched", .pme_event_ids = 
power4_event_ids[POWER4_PME_PM_GRP_DISP], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_GRP_DISP] }, [ POWER4_PME_PM_DATA_FROM_L2 ] = { .pme_name = "PM_DATA_FROM_L2", .pme_code = 0x4c66, .pme_short_desc = "Data loaded from L2", .pme_long_desc = "DL1 was reloaded from the local L2 due to a demand load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_DATA_FROM_L2], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_DATA_FROM_L2] }, [ POWER4_PME_PM_L2_PREF ] = { .pme_name = "PM_L2_PREF", .pme_code = 0xc34, .pme_short_desc = "L2 cache prefetches", .pme_long_desc = "A request to prefetch data into L2 was made", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2_PREF], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2_PREF] }, [ POWER4_PME_PM_FPU0_FPSCR ] = { .pme_name = "PM_FPU0_FPSCR", .pme_code = 0x130, .pme_short_desc = "FPU0 executed FPSCR instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*. 
mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU0_FPSCR], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU0_FPSCR] }, [ POWER4_PME_PM_FPU1_DENORM ] = { .pme_name = "PM_FPU1_DENORM", .pme_code = 0x124, .pme_short_desc = "FPU1 received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU1_DENORM], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU1_DENORM] }, [ POWER4_PME_PM_MRK_DATA_FROM_L25_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L25_MOD", .pme_code = 0x8c76, .pme_short_desc = "Marked data loaded from L2.5 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a marked demand load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_DATA_FROM_L25_MOD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_DATA_FROM_L25_MOD] }, [ POWER4_PME_PM_L2SB_ST_REQ ] = { .pme_name = "PM_L2SB_ST_REQ", .pme_code = 0xf12, .pme_short_desc = "L2 slice B store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SB_ST_REQ], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SB_ST_REQ] }, [ POWER4_PME_PM_L2SB_MOD_INV ] = { .pme_name = "PM_L2SB_MOD_INV", .pme_code = 0xf23, .pme_short_desc = "L2 slice B transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. 
The event is provided on each of the three slices A,B,and C.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SB_MOD_INV], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SB_MOD_INV] }, [ POWER4_PME_PM_FPU0_FSQRT ] = { .pme_name = "PM_FPU0_FSQRT", .pme_code = 0x102, .pme_short_desc = "FPU0 executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp0 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU0_FSQRT], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU0_FSQRT] }, [ POWER4_PME_PM_LD_REF_L1 ] = { .pme_name = "PM_LD_REF_L1", .pme_code = 0x8c10, .pme_short_desc = "L1 D cache load references", .pme_long_desc = "Total DL1 Load references", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LD_REF_L1], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LD_REF_L1] }, [ POWER4_PME_PM_MRK_L1_RELOAD_VALID ] = { .pme_name = "PM_MRK_L1_RELOAD_VALID", .pme_code = 0xc74, .pme_short_desc = "Marked L1 reload data source valid", .pme_long_desc = "The source information is valid and is for a marked load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_L1_RELOAD_VALID], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_L1_RELOAD_VALID] }, [ POWER4_PME_PM_L2SB_SHR_MOD ] = { .pme_name = "PM_L2SB_SHR_MOD", .pme_code = 0xf20, .pme_short_desc = "L2 slice B transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. 
", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SB_SHR_MOD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SB_SHR_MOD] }, [ POWER4_PME_PM_INST_FROM_L1 ] = { .pme_name = "PM_INST_FROM_L1", .pme_code = 0x6327, .pme_short_desc = "Instruction fetched from L1", .pme_long_desc = "An instruction fetch group was fetched from L1. Fetch Groups can contain up to 8 instructions", .pme_event_ids = power4_event_ids[POWER4_PME_PM_INST_FROM_L1], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_INST_FROM_L1] }, [ POWER4_PME_PM_1PLUS_PPC_CMPL ] = { .pme_name = "PM_1PLUS_PPC_CMPL", .pme_code = 0x5003, .pme_short_desc = "One or more PPC instruction completed", .pme_long_desc = "A group containing at least one PPC instruction completed. For microcoded instructions that span multiple groups, this will only occur once.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_1PLUS_PPC_CMPL], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_1PLUS_PPC_CMPL] }, [ POWER4_PME_PM_EE_OFF_EXT_INT ] = { .pme_name = "PM_EE_OFF_EXT_INT", .pme_code = 0x237, .pme_short_desc = "Cycles MSR(EE) bit off and external interrupt pending", .pme_long_desc = "Cycles MSR(EE) bit off and external interrupt pending", .pme_event_ids = power4_event_ids[POWER4_PME_PM_EE_OFF_EXT_INT], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_EE_OFF_EXT_INT] }, [ POWER4_PME_PM_L2SC_SHR_MOD ] = { .pme_name = "PM_L2SC_SHR_MOD", .pme_code = 0xf24, .pme_short_desc = "L2 slice C transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. 
", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SC_SHR_MOD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SC_SHR_MOD] }, [ POWER4_PME_PM_LSU_LRQ_FULL_CYC ] = { .pme_name = "PM_LSU_LRQ_FULL_CYC", .pme_code = 0x212, .pme_short_desc = "Cycles LRQ full", .pme_long_desc = "The isu sends this signal when the lrq is full.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_LRQ_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_LRQ_FULL_CYC] }, [ POWER4_PME_PM_IC_PREF_INSTALL ] = { .pme_name = "PM_IC_PREF_INSTALL", .pme_code = 0x325, .pme_short_desc = "Instruction prefetched installed in prefetch buffer", .pme_long_desc = "This signal is asserted when a prefetch buffer entry (line) is allocated but the request is not a demand fetch.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_IC_PREF_INSTALL], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_IC_PREF_INSTALL] }, [ POWER4_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_SRQ", .pme_code = 0x917, .pme_short_desc = "LSU1 marked SRQ flushes", .pme_long_desc = "A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LSU1_FLUSH_SRQ], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LSU1_FLUSH_SRQ] }, [ POWER4_PME_PM_GCT_FULL_CYC ] = { .pme_name = "PM_GCT_FULL_CYC", .pme_code = 0x200, .pme_short_desc = "Cycles GCT full", .pme_long_desc = "The ISU sends a signal indicating the gct is full. ", .pme_event_ids = power4_event_ids[POWER4_PME_PM_GCT_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_GCT_FULL_CYC] }, [ POWER4_PME_PM_INST_FROM_MEM ] = { .pme_name = "PM_INST_FROM_MEM", .pme_code = 0x1327, .pme_short_desc = "Instruction fetched from memory", .pme_long_desc = "An instruction fetch group was fetched from memory. 
Fetch Groups can contain up to 8 instructions", .pme_event_ids = power4_event_ids[POWER4_PME_PM_INST_FROM_MEM], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_INST_FROM_MEM] }, [ POWER4_PME_PM_FXU_BUSY ] = { .pme_name = "PM_FXU_BUSY", .pme_code = 0x6002, .pme_short_desc = "FXU busy", .pme_long_desc = "FXU0 and FXU1 are both busy", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FXU_BUSY], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FXU_BUSY] }, [ POWER4_PME_PM_ST_REF_L1_LSU1 ] = { .pme_name = "PM_ST_REF_L1_LSU1", .pme_code = 0xc15, .pme_short_desc = "LSU1 L1 D cache store references", .pme_long_desc = "A store executed on unit 1", .pme_event_ids = power4_event_ids[POWER4_PME_PM_ST_REF_L1_LSU1], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_ST_REF_L1_LSU1] }, [ POWER4_PME_PM_MRK_LD_MISS_L1 ] = { .pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 0x1920, .pme_short_desc = "Marked L1 D cache load misses", .pme_long_desc = "Marked L1 D cache load misses", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LD_MISS_L1], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LD_MISS_L1] }, [ POWER4_PME_PM_MRK_LSU1_INST_FIN ] = { .pme_name = "PM_MRK_LSU1_INST_FIN", .pme_code = 0xc32, .pme_short_desc = "LSU1 finished a marked instruction", .pme_long_desc = "LSU unit 1 finished a marked instruction", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LSU1_INST_FIN], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LSU1_INST_FIN] }, [ POWER4_PME_PM_L1_WRITE_CYC ] = { .pme_name = "PM_L1_WRITE_CYC", .pme_code = 0x333, .pme_short_desc = "Cycles writing to instruction L1", .pme_long_desc = "This signal is asserted each cycle a cache write is active.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L1_WRITE_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L1_WRITE_CYC] }, [ POWER4_PME_PM_BIQ_IDU_FULL_CYC ] = { .pme_name = "PM_BIQ_IDU_FULL_CYC", .pme_code = 0x324, .pme_short_desc = "Cycles BIQ or IDU full", .pme_long_desc = "This signal will be 
asserted each time either the IDU is full or the BIQ is full.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_BIQ_IDU_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_BIQ_IDU_FULL_CYC] }, [ POWER4_PME_PM_MRK_LSU0_INST_FIN ] = { .pme_name = "PM_MRK_LSU0_INST_FIN", .pme_code = 0xc31, .pme_short_desc = "LSU0 finished a marked instruction", .pme_long_desc = "LSU unit 0 finished a marked instruction", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LSU0_INST_FIN], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LSU0_INST_FIN] }, [ POWER4_PME_PM_L2SC_ST_REQ ] = { .pme_name = "PM_L2SC_ST_REQ", .pme_code = 0xf14, .pme_short_desc = "L2 slice C store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SC_ST_REQ], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SC_ST_REQ] }, [ POWER4_PME_PM_LSU1_BUSY ] = { .pme_name = "PM_LSU1_BUSY", .pme_code = 0xc37, .pme_short_desc = "LSU1 busy", .pme_long_desc = "LSU unit 1 is busy rejecting instructions ", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU1_BUSY], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU1_BUSY] }, [ POWER4_PME_PM_FPU_ALL ] = { .pme_name = "PM_FPU_ALL", .pme_code = 0x5100, .pme_short_desc = "FPU executed add", .pme_long_desc = " mult", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU_ALL], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU_ALL] }, [ POWER4_PME_PM_LSU_SRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_SRQ_S0_ALLOC", .pme_code = 0xc25, .pme_short_desc = "SRQ slot 0 allocated", .pme_long_desc = "SRQ Slot zero was allocated", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_SRQ_S0_ALLOC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_SRQ_S0_ALLOC] }, [ POWER4_PME_PM_GRP_MRK ] = { .pme_name = "PM_GRP_MRK", .pme_code = 0x5004, 
.pme_short_desc = "Group marked in IDU", .pme_long_desc = "A group was sampled (marked)", .pme_event_ids = power4_event_ids[POWER4_PME_PM_GRP_MRK], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_GRP_MRK] }, [ POWER4_PME_PM_FPU1_FIN ] = { .pme_name = "PM_FPU1_FIN", .pme_code = 0x117, .pme_short_desc = "FPU1 produced a result", .pme_long_desc = "fp1 finished, produced a result. This only indicates finish, not completion. ", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU1_FIN], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU1_FIN] }, [ POWER4_PME_PM_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_DC_PREF_STREAM_ALLOC", .pme_code = 0x907, .pme_short_desc = "D cache new prefetch stream allocated", .pme_long_desc = "A new Prefetch Stream was allocated", .pme_event_ids = power4_event_ids[POWER4_PME_PM_DC_PREF_STREAM_ALLOC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_DC_PREF_STREAM_ALLOC] }, [ POWER4_PME_PM_BR_MPRED_CR ] = { .pme_name = "PM_BR_MPRED_CR", .pme_code = 0x331, .pme_short_desc = "Branch mispredictions due to CR bit setting", .pme_long_desc = "This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_BR_MPRED_CR], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_BR_MPRED_CR] }, [ POWER4_PME_PM_BR_MPRED_TA ] = { .pme_name = "PM_BR_MPRED_TA", .pme_code = 0x332, .pme_short_desc = "Branch mispredictions due to target address", .pme_long_desc = "Branch mispredict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. 
This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_BR_MPRED_TA], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_BR_MPRED_TA] }, [ POWER4_PME_PM_CRQ_FULL_CYC ] = { .pme_name = "PM_CRQ_FULL_CYC", .pme_code = 0x211, .pme_short_desc = "Cycles CR issue queue full", .pme_long_desc = "The ISU sends a signal indicating that the issue queue that feeds the ifu cr unit cannot accept any more group (queue is full of groups).", .pme_event_ids = power4_event_ids[POWER4_PME_PM_CRQ_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_CRQ_FULL_CYC] }, [ POWER4_PME_PM_INST_FROM_PREF ] = { .pme_name = "PM_INST_FROM_PREF", .pme_code = 0x7327, .pme_short_desc = "Instructions fetched from prefetch", .pme_long_desc = "An instruction fetch group was fetched from the prefetch buffer. Fetch Groups can contain up to 8 instructions", .pme_event_ids = power4_event_ids[POWER4_PME_PM_INST_FROM_PREF], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_INST_FROM_PREF] }, [ POWER4_PME_PM_LD_MISS_L1 ] = { .pme_name = "PM_LD_MISS_L1", .pme_code = 0x3c10, .pme_short_desc = "L1 D cache load misses", .pme_long_desc = "Total DL1 Load references that miss the DL1", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LD_MISS_L1], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LD_MISS_L1] }, [ POWER4_PME_PM_STCX_PASS ] = { .pme_name = "PM_STCX_PASS", .pme_code = 0xc75, .pme_short_desc = "Stcx passes", .pme_long_desc = "A stcx (stwcx or stdcx) instruction was successful", .pme_event_ids = power4_event_ids[POWER4_PME_PM_STCX_PASS], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_STCX_PASS] }, [ POWER4_PME_PM_DC_INV_L2 ] = { .pme_name = "PM_DC_INV_L2", .pme_code = 0xc17, .pme_short_desc = "L1 D cache entries invalidated from L2", .pme_long_desc = "A dcache invalidated was received from the L2 because a line in L2 was castout.", 
.pme_event_ids = power4_event_ids[POWER4_PME_PM_DC_INV_L2], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_DC_INV_L2] }, [ POWER4_PME_PM_LSU_SRQ_FULL_CYC ] = { .pme_name = "PM_LSU_SRQ_FULL_CYC", .pme_code = 0x213, .pme_short_desc = "Cycles SRQ full", .pme_long_desc = "The isu sends this signal when the srq is full.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_SRQ_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_SRQ_FULL_CYC] }, [ POWER4_PME_PM_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_LSU0_FLUSH_LRQ", .pme_code = 0xc02, .pme_short_desc = "LSU0 LRQ flushes", .pme_long_desc = "A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU0_FLUSH_LRQ], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU0_FLUSH_LRQ] }, [ POWER4_PME_PM_LSU_SRQ_S0_VALID ] = { .pme_name = "PM_LSU_SRQ_S0_VALID", .pme_code = 0xc21, .pme_short_desc = "SRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Store Request Queue slot zero is valid. 
The SRQ is 32 entries long and is allocated round-robin.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_SRQ_S0_VALID], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_SRQ_S0_VALID] }, [ POWER4_PME_PM_LARX_LSU0 ] = { .pme_name = "PM_LARX_LSU0", .pme_code = 0xc73, .pme_short_desc = "Larx executed on LSU0", .pme_long_desc = "A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0)", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LARX_LSU0], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LARX_LSU0] }, [ POWER4_PME_PM_GCT_EMPTY_CYC ] = { .pme_name = "PM_GCT_EMPTY_CYC", .pme_code = 0x1004, .pme_short_desc = "Cycles GCT empty", .pme_long_desc = "The Global Completion Table is completely empty", .pme_event_ids = power4_event_ids[POWER4_PME_PM_GCT_EMPTY_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_GCT_EMPTY_CYC] }, [ POWER4_PME_PM_FPU1_ALL ] = { .pme_name = "PM_FPU1_ALL", .pme_code = 0x107, .pme_short_desc = "FPU1 executed add", .pme_long_desc = " mult", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU1_ALL], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU1_ALL] }, [ POWER4_PME_PM_FPU1_FSQRT ] = { .pme_name = "PM_FPU1_FSQRT", .pme_code = 0x106, .pme_short_desc = "FPU1 executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp1 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU1_FSQRT], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU1_FSQRT] }, [ POWER4_PME_PM_FPU_FIN ] = { .pme_name = "PM_FPU_FIN", .pme_code = 0x4110, .pme_short_desc = "FPU produced a result", .pme_long_desc = "FPU finished, produced a result. This only indicates finish, not completion. 
Combined Unit 0 + Unit 1", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU_FIN], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU_FIN] }, [ POWER4_PME_PM_L2SA_SHR_MOD ] = { .pme_name = "PM_L2SA_SHR_MOD", .pme_code = 0xf04, .pme_short_desc = "L2 slice A transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. ", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SA_SHR_MOD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SA_SHR_MOD] }, [ POWER4_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU1", .pme_code = 0x924, .pme_short_desc = "LSU1 L1 D cache load misses", .pme_long_desc = "A marked load, executing on unit 1, missed the dcache", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LD_MISS_L1_LSU1], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LD_MISS_L1_LSU1] }, [ POWER4_PME_PM_LSU_SRQ_STFWD ] = { .pme_name = "PM_LSU_SRQ_STFWD", .pme_code = 0x1c20, .pme_short_desc = "SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_SRQ_STFWD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_SRQ_STFWD] }, [ POWER4_PME_PM_FXU0_FIN ] = { .pme_name = "PM_FXU0_FIN", .pme_code = 0x232, .pme_short_desc = "FXU0 produced a result", .pme_long_desc = "The Fixed Point unit 0 finished an instruction and produced a result", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FXU0_FIN], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FXU0_FIN] }, [ POWER4_PME_PM_MRK_FPU_FIN ] = { .pme_name = "PM_MRK_FPU_FIN", .pme_code = 0x7004, .pme_short_desc = "Marked instruction FPU processing finished", .pme_long_desc = "One of the Floating Point Units 
finished a marked instruction. Instructions that finish may not necessarily complete", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_FPU_FIN], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_FPU_FIN] }, [ POWER4_PME_PM_LSU_BUSY ] = { .pme_name = "PM_LSU_BUSY", .pme_code = 0x4c30, .pme_short_desc = "LSU busy", .pme_long_desc = "LSU (unit 0 + unit 1) is busy rejecting instructions ", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_BUSY], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_BUSY] }, [ POWER4_PME_PM_INST_FROM_L35 ] = { .pme_name = "PM_INST_FROM_L35", .pme_code = 0x4327, .pme_short_desc = "Instructions fetched from L3.5", .pme_long_desc = "An instruction fetch group was fetched from the L3 of another module. Fetch Groups can contain up to 8 instructions", .pme_event_ids = power4_event_ids[POWER4_PME_PM_INST_FROM_L35], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_INST_FROM_L35] }, [ POWER4_PME_PM_FPU1_FRSP_FCONV ] = { .pme_name = "PM_FPU1_FRSP_FCONV", .pme_code = 0x115, .pme_short_desc = "FPU1 executed FRSP or FCONV instructions", .pme_long_desc = "This signal is active for one cycle when fp1 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU1_FRSP_FCONV], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU1_FRSP_FCONV] }, [ POWER4_PME_PM_SNOOP_TLBIE ] = { .pme_name = "PM_SNOOP_TLBIE", .pme_code = 0x903, .pme_short_desc = "Snoop TLBIE", .pme_long_desc = "A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). 
This may result in multiple TLB misses for the same instruction.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_SNOOP_TLBIE], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_SNOOP_TLBIE] }, [ POWER4_PME_PM_FPU0_FDIV ] = { .pme_name = "PM_FPU0_FDIV", .pme_code = 0x100, .pme_short_desc = "FPU0 executed FDIV instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp0 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU0_FDIV], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU0_FDIV] }, [ POWER4_PME_PM_LD_REF_L1_LSU1 ] = { .pme_name = "PM_LD_REF_L1_LSU1", .pme_code = 0xc14, .pme_short_desc = "LSU1 L1 D cache load references", .pme_long_desc = "A load executed on unit 1", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LD_REF_L1_LSU1], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LD_REF_L1_LSU1] }, [ POWER4_PME_PM_MRK_DATA_FROM_L275_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L275_MOD", .pme_code = 0x7c76, .pme_short_desc = "Marked data loaded from L2.75 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of another MCM due to a marked demand load. ", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_DATA_FROM_L275_MOD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_DATA_FROM_L275_MOD] }, [ POWER4_PME_PM_HV_CYC ] = { .pme_name = "PM_HV_CYC", .pme_code = 0x3004, .pme_short_desc = "Hypervisor Cycles", .pme_long_desc = "Cycles when the processor is executing in Hypervisor (MSR[HV] = 0 and MSR[PR]=0)", .pme_event_ids = power4_event_ids[POWER4_PME_PM_HV_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_HV_CYC] }, [ POWER4_PME_PM_6INST_CLB_CYC ] = { .pme_name = "PM_6INST_CLB_CYC", .pme_code = 0x455, .pme_short_desc = "Cycles 6 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. 
Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_6INST_CLB_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_6INST_CLB_CYC] }, [ POWER4_PME_PM_LR_CTR_MAP_FULL_CYC ] = { .pme_name = "PM_LR_CTR_MAP_FULL_CYC", .pme_code = 0x206, .pme_short_desc = "Cycles LR/CTR mapper full", .pme_long_desc = "The ISU sends a signal indicating that the lr/ctr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LR_CTR_MAP_FULL_CYC], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LR_CTR_MAP_FULL_CYC] }, [ POWER4_PME_PM_L2SC_MOD_INV ] = { .pme_name = "PM_L2SC_MOD_INV", .pme_code = 0xf27, .pme_short_desc = "L2 slice C transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SC_MOD_INV], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SC_MOD_INV] }, [ POWER4_PME_PM_FPU_DENORM ] = { .pme_name = "PM_FPU_DENORM", .pme_code = 0x1120, .pme_short_desc = "FPU received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized. 
Combined Unit 0 + Unit 1", .pme_event_ids = power4_event_ids[POWER4_PME_PM_FPU_DENORM], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_FPU_DENORM] }, [ POWER4_PME_PM_DATA_FROM_L275_MOD ] = { .pme_name = "PM_DATA_FROM_L275_MOD", .pme_code = 0x7c66, .pme_short_desc = "Data loaded from L2.75 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of another MCM due to a demand load. ", .pme_event_ids = power4_event_ids[POWER4_PME_PM_DATA_FROM_L275_MOD], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_DATA_FROM_L275_MOD] }, [ POWER4_PME_PM_LSU1_DERAT_MISS ] = { .pme_name = "PM_LSU1_DERAT_MISS", .pme_code = 0x906, .pme_short_desc = "LSU1 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU1_DERAT_MISS], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU1_DERAT_MISS] }, [ POWER4_PME_PM_IC_PREF_REQ ] = { .pme_name = "PM_IC_PREF_REQ", .pme_code = 0x326, .pme_short_desc = "Instruction prefetch requests", .pme_long_desc = "Asserted when a non-canceled prefetch is made to the cache interface unit (CIU).", .pme_event_ids = power4_event_ids[POWER4_PME_PM_IC_PREF_REQ], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_IC_PREF_REQ] }, [ POWER4_PME_PM_MRK_LSU_FIN ] = { .pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x8004, .pme_short_desc = "Marked instruction LSU processing finished", .pme_long_desc = "One of the Load/Store Units finished a marked instruction. 
Instructions that finish may not necessarily complete", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_LSU_FIN], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_LSU_FIN] }, [ POWER4_PME_PM_MRK_DATA_FROM_L3 ] = { .pme_name = "PM_MRK_DATA_FROM_L3", .pme_code = 0x1c76, .pme_short_desc = "Marked data loaded from L3", .pme_long_desc = "DL1 was reloaded from the local L3 due to a marked demand load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_DATA_FROM_L3], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_DATA_FROM_L3] }, [ POWER4_PME_PM_MRK_DATA_FROM_MEM ] = { .pme_name = "PM_MRK_DATA_FROM_MEM", .pme_code = 0x2c76, .pme_short_desc = "Marked data loaded from memory", .pme_long_desc = "DL1 was reloaded from memory due to a marked demand load", .pme_event_ids = power4_event_ids[POWER4_PME_PM_MRK_DATA_FROM_MEM], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_MRK_DATA_FROM_MEM] }, [ POWER4_PME_PM_LSU0_FLUSH_UST ] = { .pme_name = "PM_LSU0_FLUSH_UST", .pme_code = 0xc01, .pme_short_desc = "LSU0 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 0 because it was unaligned (crossed a 4k boundary)", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU0_FLUSH_UST], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU0_FLUSH_UST] }, [ POWER4_PME_PM_LSU_FLUSH_LRQ ] = { .pme_name = "PM_LSU_FLUSH_LRQ", .pme_code = 0x6c00, .pme_short_desc = "LRQ flushes", .pme_long_desc = "A load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_FLUSH_LRQ], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_FLUSH_LRQ] }, [ POWER4_PME_PM_LSU_FLUSH_SRQ ] = { .pme_name = "PM_LSU_FLUSH_SRQ", .pme_code = 0x5c00, .pme_short_desc = "SRQ flushes", .pme_long_desc = "A store was flushed because younger load hits and older store 
that is already in the SRQ or in the same group.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_LSU_FLUSH_SRQ], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_LSU_FLUSH_SRQ] }, [ POWER4_PME_PM_L2SC_MOD_TAG ] = { .pme_name = "PM_L2SC_MOD_TAG", .pme_code = 0xf26, .pme_short_desc = "L2 slice C transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C.", .pme_event_ids = power4_event_ids[POWER4_PME_PM_L2SC_MOD_TAG], .pme_group_vector = power4_group_vecs[POWER4_PME_PM_L2SC_MOD_TAG] } }; #define POWER4_PME_EVENT_COUNT 244 static const int power4_group_event_ids[][POWER4_NUM_EVENT_COUNTERS] = { [ 0 ] = { 94, 81, 83, 77, 81, 81, 77, 80 }, [ 1 ] = { 81, 81, 79, 13, 32, 86, 84, 82 }, [ 2 ] = { 86, 81, 79, 13, 32, 86, 84, 82 }, [ 3 ] = { 86, 0, 8, 9, 33, 81, 10, 36 }, [ 4 ] = { 7, 1, 33, 77, 86, 26, 73, 79 }, [ 5 ] = { 82, 82, 74, 74, 83, 82, 74, 75 }, [ 6 ] = { 87, 86, 78, 78, 91, 87, 79, 73 }, [ 7 ] = { 88, 87, 73, 77, 92, 89, 84, 82 }, [ 8 ] = { 35, 6, 12, 53, 31, 88, 78, 74 }, [ 9 ] = { 34, 5, 56, 52, 31, 88, 78, 74 }, [ 10 ] = { 50, 49, 17, 18, 52, 51, 78, 74 }, [ 11 ] = { 38, 39, 38, 37, 40, 37, 78, 74 }, [ 12 ] = { 42, 43, 40, 39, 44, 41, 78, 74 }, [ 13 ] = { 46, 47, 42, 41, 48, 45, 78, 74 }, [ 14 ] = { 84, 83, 75, 75, 82, 83, 78, 77 }, [ 15 ] = { 83, 84, 73, 77, 84, 84, 75, 78 }, [ 16 ] = { 86, 81, 0, 1, 81, 81, 2, 3 }, [ 17 ] = { 86, 81, 4, 5, 89, 81, 6, 7 }, [ 18 ] = { 80, 2, 11, 34, 53, 32, 78, 74 }, [ 19 ] = { 13, 22, 30, 30, 82, 86, 54, 55 }, [ 20 ] = { 32, 81, 31, 32, 28, 27, 78, 74 }, [ 21 ] = { 85, 92, 83, 16, 82, 86, 15, 76 }, [ 22 ] = { 77, 78, 69, 73, 81, 86, 44, 45 }, [ 23 ] = { 71, 70, 50, 51, 69, 68, 78, 74 }, [ 24 ] = { 86, 36, 73, 74, 83, 82, 74, 75 }, [ 25 ] = { 82, 82, 74, 74, 36, 
81, 74, 81 }, [ 26 ] = { 86, 81, 78, 78, 91, 87, 79, 73 }, [ 27 ] = { 87, 86, 78, 78, 91, 87, 73, 81 }, [ 28 ] = { 10, 19, 25, 29, 11, 20, 78, 74 }, [ 29 ] = { 12, 21, 22, 27, 8, 17, 78, 74 }, [ 30 ] = { 9, 18, 23, 28, 82, 86, 21, 26 }, [ 31 ] = { 14, 23, 19, 20, 16, 25, 73, 81 }, [ 32 ] = { 15, 24, 22, 27, 82, 86, 73, 24 }, [ 33 ] = { 86, 81, 76, 76, 88, 85, 76, 79 }, [ 34 ] = { 67, 66, 52, 53, 82, 86, 56, 12 }, [ 35 ] = { 55, 61, 73, 73, 56, 62, 78, 74 }, [ 36 ] = { 57, 63, 48, 49, 82, 86, 46, 47 }, [ 37 ] = { 58, 64, 71, 72, 82, 86, 70, 13 }, [ 38 ] = { 59, 65, 71, 72, 79, 81, 78, 74 }, [ 39 ] = { 54, 60, 73, 73, 36, 81, 78, 74 }, [ 40 ] = { 4, 3, 43, 35, 82, 86, 73, 14 }, [ 41 ] = { 85, 88, 84, 73, 81, 86, 77, 86 }, [ 42 ] = { 92, 91, 73, 84, 90, 92, 82, 81 }, [ 43 ] = { 91, 89, 73, 82, 90, 91, 81, 84 }, [ 44 ] = { 93, 81, 82, 84, 94, 93, 68, 81 }, [ 45 ] = { 92, 81, 81, 85, 94, 92, 78, 85 }, [ 46 ] = { 90, 90, 80, 83, 93, 90, 80, 83 }, [ 47 ] = { 86, 81, 57, 83, 93, 90, 80, 83 }, [ 48 ] = { 90, 90, 80, 83, 82, 86, 80, 57 }, [ 49 ] = { 76, 72, 60, 65, 82, 86, 61, 66 }, [ 50 ] = { 73, 74, 58, 63, 82, 86, 59, 64 }, [ 51 ] = { 75, 81, 62, 67, 82, 92, 82, 81 }, [ 52 ] = { 67, 91, 53, 77, 82, 92, 77, 52 }, [ 53 ] = { 84, 83, 77, 75, 82, 83, 78, 77 }, [ 54 ] = { 81, 84, 22, 77, 86, 84, 27, 78 }, [ 55 ] = { 86, 0, 8, 9, 1, 81, 10, 36 }, [ 56 ] = { 6, 35, 79, 70, 82, 86, 84, 82 }, [ 57 ] = { 86, 81, 74, 74, 83, 82, 74, 75 }, [ 58 ] = { 82, 82, 74, 74, 83, 81, 78, 75 }, [ 59 ] = { 6, 88, 79, 70, 82, 86, 84, 82 }, [ 60 ] = { 84, 83, 22, 27, 82, 84, 78, 78 }, [ 61 ] = { 86, 81, 79, 8, 79, 81, 9, 10 }, [ 62 ] = { 86, 81, 79, 8, 82, 79, 84, 82 } }; static const pmg_power_group_t power4_groups[] = { [ 0 ] = { .pmg_name = "pm_slice0", .pmg_desc = "Time Slice 0", .pmg_event_ids = power4_group_event_ids[0], .pmg_mmcr0 = 0x0000000000000d0eULL, .pmg_mmcr1 = 0x000000004a5675acULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 1 ] = { .pmg_name = "pm_eprof", .pmg_desc = "Group for use 
with eprof", .pmg_event_ids = power4_group_event_ids[1], .pmg_mmcr0 = 0x000000000000070eULL, .pmg_mmcr1 = 0x1003400045f29420ULL, .pmg_mmcra = 0x0000000000002001ULL }, [ 2 ] = { .pmg_name = "pm_basic", .pmg_desc = "Basic performance indicators", .pmg_event_ids = power4_group_event_ids[2], .pmg_mmcr0 = 0x000000000000090eULL, .pmg_mmcr1 = 0x1003400045f29420ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 3 ] = { .pmg_name = "pm_ifu", .pmg_desc = "IFU events", .pmg_event_ids = power4_group_event_ids[3], .pmg_mmcr0 = 0x0000000000000938ULL, .pmg_mmcr1 = 0x80000000c6767d6cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 4 ] = { .pmg_name = "pm_isu", .pmg_desc = "ISU Queue full events", .pmg_event_ids = power4_group_event_ids[4], .pmg_mmcr0 = 0x000000000000112aULL, .pmg_mmcr1 = 0x50041000ea5103a0ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 5 ] = { .pmg_name = "pm_lsource", .pmg_desc = "Information on data source", .pmg_event_ids = power4_group_event_ids[5], .pmg_mmcr0 = 0x0000000000000e1cULL, .pmg_mmcr1 = 0x0010c000739ce738ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 6 ] = { .pmg_name = "pm_isource", .pmg_desc = "Instruction Source information", .pmg_event_ids = power4_group_event_ids[6], .pmg_mmcr0 = 0x0000000000000f1eULL, .pmg_mmcr1 = 0x800000007bdef7bcULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 7 ] = { .pmg_name = "pm_lsu", .pmg_desc = "Information on the Load Store Unit", .pmg_event_ids = power4_group_event_ids[7], .pmg_mmcr0 = 0x0000000000000810ULL, .pmg_mmcr1 = 0x000f00003a508420ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 8 ] = { .pmg_name = "pm_xlate1", .pmg_desc = "Translation Events", .pmg_event_ids = power4_group_event_ids[8], .pmg_mmcr0 = 0x0000000000001028ULL, .pmg_mmcr1 = 0x81082000f67e849cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 9 ] = { .pmg_name = "pm_xlate2", .pmg_desc = "Translation Events", .pmg_event_ids = power4_group_event_ids[9], .pmg_mmcr0 = 0x000000000000112aULL, .pmg_mmcr1 = 0x81082000d77e849cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 10 ] = { 
.pmg_name = "pm_gps1", .pmg_desc = "L3 Events", .pmg_event_ids = power4_group_event_ids[10], .pmg_mmcr0 = 0x0000000000001022ULL, .pmg_mmcr1 = 0x00000c00b5e5349cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 11 ] = { .pmg_name = "pm_l2a", .pmg_desc = "L2 SliceA events", .pmg_event_ids = power4_group_event_ids[11], .pmg_mmcr0 = 0x000000000000162aULL, .pmg_mmcr1 = 0x00000c008469749cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 12 ] = { .pmg_name = "pm_l2b", .pmg_desc = "L2 SliceB events", .pmg_event_ids = power4_group_event_ids[12], .pmg_mmcr0 = 0x0000000000001a32ULL, .pmg_mmcr1 = 0x0000060094f1b49cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 13 ] = { .pmg_name = "pm_l2c", .pmg_desc = "L2 SliceC events", .pmg_event_ids = power4_group_event_ids[13], .pmg_mmcr0 = 0x0000000000001e3aULL, .pmg_mmcr1 = 0x00000600a579f49cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 14 ] = { .pmg_name = "pm_fpu1", .pmg_desc = "Floating Point events", .pmg_event_ids = power4_group_event_ids[14], .pmg_mmcr0 = 0x0000000000000810ULL, .pmg_mmcr1 = 0x00000000420e84a0ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 15 ] = { .pmg_name = "pm_fpu2", .pmg_desc = "Floating Point events", .pmg_event_ids = power4_group_event_ids[15], .pmg_mmcr0 = 0x0000000000000810ULL, .pmg_mmcr1 = 0x010020e83a508420ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 16 ] = { .pmg_name = "pm_idu1", .pmg_desc = "Instruction Decode Unit events", .pmg_event_ids = power4_group_event_ids[16], .pmg_mmcr0 = 0x000000000000090eULL, .pmg_mmcr1 = 0x040100008456794cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 17 ] = { .pmg_name = "pm_idu2", .pmg_desc = "Instruction Decode Unit events", .pmg_event_ids = power4_group_event_ids[17], .pmg_mmcr0 = 0x000000000000090eULL, .pmg_mmcr1 = 0x04010000a5527b5cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 18 ] = { .pmg_name = "pm_isu_rename", .pmg_desc = "ISU Rename Pool Events", .pmg_event_ids = power4_group_event_ids[18], .pmg_mmcr0 = 0x0000000000001228ULL, .pmg_mmcr1 = 0x100550008e6d949cULL, .pmg_mmcra = 
0x0000000000022000ULL }, [ 19 ] = { .pmg_name = "pm_isu_queues1", .pmg_desc = "ISU Queue Full Events", .pmg_event_ids = power4_group_event_ids[19], .pmg_mmcr0 = 0x000000000000132eULL, .pmg_mmcr1 = 0x10050000850e994cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 20 ] = { .pmg_name = "pm_isu_flow", .pmg_desc = "ISU Instruction Flow Events", .pmg_event_ids = power4_group_event_ids[20], .pmg_mmcr0 = 0x000000000000190eULL, .pmg_mmcr1 = 0x10005000d7b7c49cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 21 ] = { .pmg_name = "pm_isu_work", .pmg_desc = "ISU Indicators of Work Blockage", .pmg_event_ids = power4_group_event_ids[21], .pmg_mmcr0 = 0x0000000000000c12ULL, .pmg_mmcr1 = 0x100010004fce9da8ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 22 ] = { .pmg_name = "pm_serialize", .pmg_desc = "LSU Serializing Events", .pmg_event_ids = power4_group_event_ids[22], .pmg_mmcr0 = 0x0000000000001332ULL, .pmg_mmcr1 = 0x0118b000e9d69dfcULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 23 ] = { .pmg_name = "pm_lsubusy", .pmg_desc = "LSU Busy Events", .pmg_event_ids = power4_group_event_ids[23], .pmg_mmcr0 = 0x000000000000193aULL, .pmg_mmcr1 = 0x0000f000dff5e49cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 24 ] = { .pmg_name = "pm_lsource2", .pmg_desc = "Information on data source", .pmg_event_ids = power4_group_event_ids[24], .pmg_mmcr0 = 0x0000000000000938ULL, .pmg_mmcr1 = 0x0010c0003b9ce738ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 25 ] = { .pmg_name = "pm_lsource3", .pmg_desc = "Information on data source", .pmg_event_ids = power4_group_event_ids[25], .pmg_mmcr0 = 0x0000000000000e1cULL, .pmg_mmcr1 = 0x0010c00073b87724ULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 26 ] = { .pmg_name = "pm_isource2", .pmg_desc = "Instruction Source information", .pmg_event_ids = power4_group_event_ids[26], .pmg_mmcr0 = 0x000000000000090eULL, .pmg_mmcr1 = 0x800000007bdef7bcULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 27 ] = { .pmg_name = "pm_isource3", .pmg_desc = "Instruction Source information", .pmg_event_ids = 
power4_group_event_ids[27], .pmg_mmcr0 = 0x0000000000000f1eULL, .pmg_mmcr1 = 0x800000007bdef3a4ULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 28 ] = { .pmg_name = "pm_fpu3", .pmg_desc = "Floating Point events by unit", .pmg_event_ids = power4_group_event_ids[28], .pmg_mmcr0 = 0x0000000000001028ULL, .pmg_mmcr1 = 0x000000008d63549cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 29 ] = { .pmg_name = "pm_fpu4", .pmg_desc = "Floating Point events by unit", .pmg_event_ids = power4_group_event_ids[29], .pmg_mmcr0 = 0x000000000000122cULL, .pmg_mmcr1 = 0x000000009de7749cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 30 ] = { .pmg_name = "pm_fpu5", .pmg_desc = "Floating Point events by unit", .pmg_event_ids = power4_group_event_ids[30], .pmg_mmcr0 = 0x0000000000001838ULL, .pmg_mmcr1 = 0x00000000850e9958ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 31 ] = { .pmg_name = "pm_fpu6", .pmg_desc = "Floating Point events by unit", .pmg_event_ids = power4_group_event_ids[31], .pmg_mmcr0 = 0x0000000000001b3eULL, .pmg_mmcr1 = 0x01002000c735e3a4ULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 32 ] = { .pmg_name = "pm_fpu7", .pmg_desc = "Floating Point events by unit", .pmg_event_ids = power4_group_event_ids[32], .pmg_mmcr0 = 0x000000000000193aULL, .pmg_mmcr1 = 0x000000009dce93e0ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 33 ] = { .pmg_name = "pm_fxu", .pmg_desc = "Fix Point Unit events", .pmg_event_ids = power4_group_event_ids[33], .pmg_mmcr0 = 0x000000000000090eULL, .pmg_mmcr1 = 0x400000024294a520ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 34 ] = { .pmg_name = "pm_lsu_lmq", .pmg_desc = "LSU Load Miss Queue Events", .pmg_event_ids = power4_group_event_ids[34], .pmg_mmcr0 = 0x0000000000001e3eULL, .pmg_mmcr1 = 0x0100a000ee4e9d78ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 35 ] = { .pmg_name = "pm_lsu_flush", .pmg_desc = "LSU Flush Events", .pmg_event_ids = power4_group_event_ids[35], .pmg_mmcr0 = 0x000000000000122cULL, .pmg_mmcr1 = 0x000c000039e7749cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 36 ] 
= { .pmg_name = "pm_lsu_load1", .pmg_desc = "LSU Load Events", .pmg_event_ids = power4_group_event_ids[36], .pmg_mmcr0 = 0x0000000000001028ULL, .pmg_mmcr1 = 0x000f0000850e9958ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 37 ] = { .pmg_name = "pm_lsu_store1", .pmg_desc = "LSU Store Events", .pmg_event_ids = power4_group_event_ids[37], .pmg_mmcr0 = 0x000000000000112aULL, .pmg_mmcr1 = 0x000f00008d4e99dcULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 38 ] = { .pmg_name = "pm_lsu_store2", .pmg_desc = "LSU Store Events", .pmg_event_ids = power4_group_event_ids[38], .pmg_mmcr0 = 0x0000000000001838ULL, .pmg_mmcr1 = 0x0003c0008d76749cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 39 ] = { .pmg_name = "pm_lsu7", .pmg_desc = "Information on the Load Store Unit", .pmg_event_ids = power4_group_event_ids[39], .pmg_mmcr0 = 0x000000000000122cULL, .pmg_mmcr1 = 0x0118c00039f8749cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 40 ] = { .pmg_name = "pm_dpfetch", .pmg_desc = "Data Prefetch Events", .pmg_event_ids = power4_group_event_ids[40], .pmg_mmcr0 = 0x000000000000173eULL, .pmg_mmcr1 = 0x0108f000e74e93f8ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 41 ] = { .pmg_name = "pm_misc", .pmg_desc = "Misc Events for testing", .pmg_event_ids = power4_group_event_ids[41], .pmg_mmcr0 = 0x0000000000000c14ULL, .pmg_mmcr1 = 0x0000000061d695b4ULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 42 ] = { .pmg_name = "pm_mark1", .pmg_desc = "Information on marked instructions", .pmg_event_ids = power4_group_event_ids[42], .pmg_mmcr0 = 0x0000000000000816ULL, .pmg_mmcr1 = 0x010080803b18d6a4ULL, .pmg_mmcra = 0x0000000000722001ULL }, [ 43 ] = { .pmg_name = "pm_mark2", .pmg_desc = "Marked Instructions Processing Flow", .pmg_event_ids = power4_group_event_ids[43], .pmg_mmcr0 = 0x0000000000000a1aULL, .pmg_mmcr1 = 0x000000003b58c630ULL, .pmg_mmcra = 0x0000000000002001ULL }, [ 44 ] = { .pmg_name = "pm_mark3", .pmg_desc = "Marked Stores Processing Flow", .pmg_event_ids = power4_group_event_ids[44], .pmg_mmcr0 = 
0x0000000000000b0eULL, .pmg_mmcr1 = 0x010020005b1abda4ULL, .pmg_mmcra = 0x0000000000022001ULL }, [ 45 ] = { .pmg_name = "pm_mark4", .pmg_desc = "Marked Loads Processing Flow", .pmg_event_ids = power4_group_event_ids[45], .pmg_mmcr0 = 0x000000000000080eULL, .pmg_mmcr1 = 0x01028080421ad4a0ULL, .pmg_mmcra = 0x0000000000002001ULL }, [ 46 ] = { .pmg_name = "pm_mark_lsource", .pmg_desc = "Information on marked data source", .pmg_event_ids = power4_group_event_ids[46], .pmg_mmcr0 = 0x0000000000000e1cULL, .pmg_mmcr1 = 0x00103000739ce738ULL, .pmg_mmcra = 0x0000000000002001ULL }, [ 47 ] = { .pmg_name = "pm_mark_lsource2", .pmg_desc = "Information on marked data source", .pmg_event_ids = power4_group_event_ids[47], .pmg_mmcr0 = 0x000000000000090eULL, .pmg_mmcr1 = 0x00103000e39ce738ULL, .pmg_mmcra = 0x0000000000002001ULL }, [ 48 ] = { .pmg_name = "pm_mark_lsource3", .pmg_desc = "Information on marked data source", .pmg_event_ids = power4_group_event_ids[48], .pmg_mmcr0 = 0x0000000000000e1cULL, .pmg_mmcr1 = 0x00103000738e9770ULL, .pmg_mmcra = 0x0000000000002001ULL }, [ 49 ] = { .pmg_name = "pm_lsu_mark1", .pmg_desc = "Load Store Unit Marked Events", .pmg_event_ids = power4_group_event_ids[49], .pmg_mmcr0 = 0x0000000000001b34ULL, .pmg_mmcr1 = 0x01028000850e98d4ULL, .pmg_mmcra = 0x0000000000022001ULL }, [ 50 ] = { .pmg_name = "pm_lsu_mark2", .pmg_desc = "Load Store Unit Marked Events", .pmg_event_ids = power4_group_event_ids[50], .pmg_mmcr0 = 0x0000000000001838ULL, .pmg_mmcr1 = 0x01028000958e99dcULL, .pmg_mmcra = 0x0000000000022001ULL }, [ 51 ] = { .pmg_name = "pm_lsu_mark3", .pmg_desc = "Load Store Unit Marked Events", .pmg_event_ids = power4_group_event_ids[51], .pmg_mmcr0 = 0x0000000000001d0eULL, .pmg_mmcr1 = 0x0100b000ce8ed6a4ULL, .pmg_mmcra = 0x0000000000022001ULL }, [ 52 ] = { .pmg_name = "pm_threshold", .pmg_desc = "Group for pipeline threshold studies", .pmg_event_ids = power4_group_event_ids[52], .pmg_mmcr0 = 0x0000000000001e16ULL, .pmg_mmcr1 = 0x0100a000ca4ed5f4ULL, 
.pmg_mmcra = 0x0000000000722001ULL }, [ 53 ] = { .pmg_name = "pm_pe_bench1", .pmg_desc = "PE Benchmarker group for FP analysis", .pmg_event_ids = power4_group_event_ids[53], .pmg_mmcr0 = 0x0000000000000810ULL, .pmg_mmcr1 = 0x10001002420e84a0ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 54 ] = { .pmg_name = "pm_pe_bench2", .pmg_desc = "PE Benchmarker group for FP stalls analysis", .pmg_event_ids = power4_group_event_ids[54], .pmg_mmcr0 = 0x0000000000000710ULL, .pmg_mmcr1 = 0x110420689a508ba0ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 55 ] = { .pmg_name = "pm_pe_bench3", .pmg_desc = "PE Benchmarker group for branch analysis", .pmg_event_ids = power4_group_event_ids[55], .pmg_mmcr0 = 0x0000000000000938ULL, .pmg_mmcr1 = 0x90040000c66a7d6cULL, .pmg_mmcra = 0x0000000000022000ULL }, [ 56 ] = { .pmg_name = "pm_pe_bench4", .pmg_desc = "PE Benchmarker group for L1 and TLB analysis", .pmg_event_ids = power4_group_event_ids[56], .pmg_mmcr0 = 0x0000000000001420ULL, .pmg_mmcr1 = 0x010b000044ce9420ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 57 ] = { .pmg_name = "pm_pe_bench5", .pmg_desc = "PE Benchmarker group for L2 analysis", .pmg_event_ids = power4_group_event_ids[57], .pmg_mmcr0 = 0x000000000000090eULL, .pmg_mmcr1 = 0x0010c000739ce738ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 58 ] = { .pmg_name = "pm_pe_bench6", .pmg_desc = "PE Benchmarker group for L3 analysis", .pmg_event_ids = power4_group_event_ids[58], .pmg_mmcr0 = 0x0000000000000e1cULL, .pmg_mmcr1 = 0x0010c000739c74b8ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 59 ] = { .pmg_name = "pm_hpmcount1", .pmg_desc = "Hpmcount group for L1 and TLB behavior analysis", .pmg_event_ids = power4_group_event_ids[59], .pmg_mmcr0 = 0x0000000000001414ULL, .pmg_mmcr1 = 0x010b000044ce9420ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 60 ] = { .pmg_name = "pm_hpmcount2", .pmg_desc = "Hpmcount group for computation intensity analysis", .pmg_event_ids = power4_group_event_ids[60], .pmg_mmcr0 = 0x0000000000000810ULL, .pmg_mmcr1 = 
0x010020289dce84a0ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 61 ] = { .pmg_name = "pm_l1andbr", .pmg_desc = "L1 misses and branch mispredict analysis", .pmg_event_ids = power4_group_event_ids[61], .pmg_mmcr0 = 0x000000000000090eULL, .pmg_mmcr1 = 0x8003c00046367ce8ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 62 ] = { .pmg_name = "Instruction mix: loads", .pmg_desc = " stores and branches", .pmg_event_ids = power4_group_event_ids[62], .pmg_mmcr0 = 0x000000000000090eULL, .pmg_mmcr1 = 0x8003c000460fb420ULL, .pmg_mmcra = 0x0000000000002000ULL } }; #endif papi-papi-7-2-0-t/src/libperfnec/lib/power5+_events.h /****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __POWER5p_EVENTS_H__ #define __POWER5p_EVENTS_H__ /* * File: power5+_events.h * CVS: * Author: Corey Ashford * cjashfor@us.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2007. All Rights Reserved. * Contributed by Corey Ashford * * Note: This code was automatically generated and should not be modified by * hand. 
* */ #define POWER5p_PME_PM_LSU_REJECT_RELOAD_CDF 0 #define POWER5p_PME_PM_FPU1_SINGLE 1 #define POWER5p_PME_PM_L3SB_REF 2 #define POWER5p_PME_PM_THRD_PRIO_DIFF_3or4_CYC 3 #define POWER5p_PME_PM_INST_FROM_L275_SHR 4 #define POWER5p_PME_PM_MRK_DATA_FROM_L375_MOD 5 #define POWER5p_PME_PM_DTLB_MISS_4K 6 #define POWER5p_PME_PM_CLB_FULL_CYC 7 #define POWER5p_PME_PM_MRK_ST_CMPL 8 #define POWER5p_PME_PM_LSU_FLUSH_LRQ_FULL 9 #define POWER5p_PME_PM_MRK_DATA_FROM_L275_SHR 10 #define POWER5p_PME_PM_1INST_CLB_CYC 11 #define POWER5p_PME_PM_MEM_SPEC_RD_CANCEL 12 #define POWER5p_PME_PM_MRK_DTLB_MISS_16M 13 #define POWER5p_PME_PM_FPU_FDIV 14 #define POWER5p_PME_PM_FPU_SINGLE 15 #define POWER5p_PME_PM_FPU0_FMA 16 #define POWER5p_PME_PM_SLB_MISS 17 #define POWER5p_PME_PM_LSU1_FLUSH_LRQ 18 #define POWER5p_PME_PM_L2SA_ST_HIT 19 #define POWER5p_PME_PM_DTLB_MISS 20 #define POWER5p_PME_PM_BR_PRED_TA 21 #define POWER5p_PME_PM_MRK_DATA_FROM_L375_MOD_CYC 22 #define POWER5p_PME_PM_CMPLU_STALL_FXU 23 #define POWER5p_PME_PM_EXT_INT 24 #define POWER5p_PME_PM_MRK_LSU1_FLUSH_LRQ 25 #define POWER5p_PME_PM_MRK_ST_GPS 26 #define POWER5p_PME_PM_LSU1_LDF 27 #define POWER5p_PME_PM_FAB_CMD_ISSUED 28 #define POWER5p_PME_PM_LSU0_SRQ_STFWD 29 #define POWER5p_PME_PM_CR_MAP_FULL_CYC 30 #define POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_RC_FULL 31 #define POWER5p_PME_PM_MRK_LSU0_FLUSH_ULD 32 #define POWER5p_PME_PM_LSU_FLUSH_SRQ_FULL 33 #define POWER5p_PME_PM_MEM_RQ_DISP_Q16to19 34 #define POWER5p_PME_PM_FLUSH_IMBAL 35 #define POWER5p_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC 36 #define POWER5p_PME_PM_DATA_FROM_L35_MOD 37 #define POWER5p_PME_PM_MEM_HI_PRIO_WR_CMPL 38 #define POWER5p_PME_PM_FPU1_FDIV 39 #define POWER5p_PME_PM_MEM_RQ_DISP 40 #define POWER5p_PME_PM_FPU0_FRSP_FCONV 41 #define POWER5p_PME_PM_LWSYNC_HELD 42 #define POWER5p_PME_PM_FXU_FIN 43 #define POWER5p_PME_PM_DSLB_MISS 44 #define POWER5p_PME_PM_DATA_FROM_L275_SHR 45 #define POWER5p_PME_PM_FXLS1_FULL_CYC 46 #define POWER5p_PME_PM_THRD_SEL_T0 47 #define 
POWER5p_PME_PM_PTEG_RELOAD_VALID 48 #define POWER5p_PME_PM_MRK_STCX_FAIL 49 #define POWER5p_PME_PM_LSU_LMQ_LHR_MERGE 50 #define POWER5p_PME_PM_2INST_CLB_CYC 51 #define POWER5p_PME_PM_FAB_PNtoVN_DIRECT 52 #define POWER5p_PME_PM_PTEG_FROM_L2MISS 53 #define POWER5p_PME_PM_CMPLU_STALL_LSU 54 #define POWER5p_PME_PM_MRK_DSLB_MISS 55 #define POWER5p_PME_PM_LSU_FLUSH_ULD 56 #define POWER5p_PME_PM_PTEG_FROM_LMEM 57 #define POWER5p_PME_PM_MRK_BRU_FIN 58 #define POWER5p_PME_PM_MEM_WQ_DISP_WRITE 59 #define POWER5p_PME_PM_MRK_DATA_FROM_L275_MOD_CYC 60 #define POWER5p_PME_PM_LSU1_NCLD 61 #define POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_OTHER 62 #define POWER5p_PME_PM_SNOOP_PW_RETRY_WQ_PWQ 63 #define POWER5p_PME_PM_FPU1_FULL_CYC 64 #define POWER5p_PME_PM_FPR_MAP_FULL_CYC 65 #define POWER5p_PME_PM_L3SA_ALL_BUSY 66 #define POWER5p_PME_PM_3INST_CLB_CYC 67 #define POWER5p_PME_PM_MEM_PWQ_DISP_Q2or3 68 #define POWER5p_PME_PM_L2SA_SHR_INV 69 #define POWER5p_PME_PM_THRESH_TIMEO 70 #define POWER5p_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL 71 #define POWER5p_PME_PM_THRD_SEL_OVER_GCT_IMBAL 72 #define POWER5p_PME_PM_FPU_FSQRT 73 #define POWER5p_PME_PM_PMC1_OVERFLOW 74 #define POWER5p_PME_PM_MRK_LSU0_FLUSH_LRQ 75 #define POWER5p_PME_PM_L3SC_SNOOP_RETRY 76 #define POWER5p_PME_PM_DATA_TABLEWALK_CYC 77 #define POWER5p_PME_PM_THRD_PRIO_6_CYC 78 #define POWER5p_PME_PM_FPU_FEST 79 #define POWER5p_PME_PM_FAB_M1toP1_SIDECAR_EMPTY 80 #define POWER5p_PME_PM_MRK_DATA_FROM_RMEM 81 #define POWER5p_PME_PM_MRK_DATA_FROM_L35_MOD_CYC 82 #define POWER5p_PME_PM_MEM_PWQ_DISP 83 #define POWER5p_PME_PM_FAB_P1toM1_SIDECAR_EMPTY 84 #define POWER5p_PME_PM_LD_MISS_L1_LSU0 85 #define POWER5p_PME_PM_SNOOP_PARTIAL_RTRY_QFULL 86 #define POWER5p_PME_PM_FPU1_STALL3 87 #define POWER5p_PME_PM_GCT_USAGE_80to99_CYC 88 #define POWER5p_PME_PM_WORK_HELD 89 #define POWER5p_PME_PM_INST_CMPL 90 #define POWER5p_PME_PM_LSU1_FLUSH_UST 91 #define POWER5p_PME_PM_FXU_IDLE 92 #define POWER5p_PME_PM_LSU0_FLUSH_ULD 93 #define 
POWER5p_PME_PM_LSU1_REJECT_LMQ_FULL 94 #define POWER5p_PME_PM_GRP_DISP_REJECT 95 #define POWER5p_PME_PM_PTEG_FROM_L25_SHR 96 #define POWER5p_PME_PM_L2SA_MOD_INV 97 #define POWER5p_PME_PM_FAB_CMD_RETRIED 98 #define POWER5p_PME_PM_L3SA_SHR_INV 99 #define POWER5p_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL 100 #define POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_ADDR 101 #define POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_RC_FULL 102 #define POWER5p_PME_PM_PTEG_FROM_L375_MOD 103 #define POWER5p_PME_PM_MRK_LSU1_FLUSH_UST 104 #define POWER5p_PME_PM_BR_ISSUED 105 #define POWER5p_PME_PM_MRK_GRP_BR_REDIR 106 #define POWER5p_PME_PM_EE_OFF 107 #define POWER5p_PME_PM_IERAT_XLATE_WR_LP 108 #define POWER5p_PME_PM_DTLB_REF_64K 109 #define POWER5p_PME_PM_MEM_RQ_DISP_Q4to7 110 #define POWER5p_PME_PM_MEM_FAST_PATH_RD_DISP 111 #define POWER5p_PME_PM_INST_FROM_L3 112 #define POWER5p_PME_PM_ITLB_MISS 113 #define POWER5p_PME_PM_FXU1_BUSY_FXU0_IDLE 114 #define POWER5p_PME_PM_DTLB_REF_4K 115 #define POWER5p_PME_PM_FXLS_FULL_CYC 116 #define POWER5p_PME_PM_GRP_DISP_VALID 117 #define POWER5p_PME_PM_LSU_FLUSH_UST 118 #define POWER5p_PME_PM_FXU1_FIN 119 #define POWER5p_PME_PM_THRD_PRIO_4_CYC 120 #define POWER5p_PME_PM_MRK_DATA_FROM_L35_MOD 121 #define POWER5p_PME_PM_4INST_CLB_CYC 122 #define POWER5p_PME_PM_MRK_DTLB_REF_16M 123 #define POWER5p_PME_PM_INST_FROM_L375_MOD 124 #define POWER5p_PME_PM_GRP_CMPL 125 #define POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_ADDR 126 #define POWER5p_PME_PM_FPU1_1FLOP 127 #define POWER5p_PME_PM_FPU_FRSP_FCONV 128 #define POWER5p_PME_PM_L3SC_REF 129 #define POWER5p_PME_PM_5INST_CLB_CYC 130 #define POWER5p_PME_PM_THRD_L2MISS_BOTH_CYC 131 #define POWER5p_PME_PM_MEM_PW_GATH 132 #define POWER5p_PME_PM_DTLB_REF_16G 133 #define POWER5p_PME_PM_FAB_DCLAIM_ISSUED 134 #define POWER5p_PME_PM_FAB_PNtoNN_SIDECAR 135 #define POWER5p_PME_PM_GRP_IC_MISS 136 #define POWER5p_PME_PM_INST_FROM_L35_SHR 137 #define POWER5p_PME_PM_LSU_LMQ_FULL_CYC 138 #define POWER5p_PME_PM_MRK_DATA_FROM_L2_CYC 139 #define 
POWER5p_PME_PM_LSU_SRQ_SYNC_CYC 140 #define POWER5p_PME_PM_LSU0_BUSY_REJECT 141 #define POWER5p_PME_PM_LSU_REJECT_ERAT_MISS 142 #define POWER5p_PME_PM_MRK_DATA_FROM_RMEM_CYC 143 #define POWER5p_PME_PM_DATA_FROM_L375_SHR 144 #define POWER5p_PME_PM_PTEG_FROM_L25_MOD 145 #define POWER5p_PME_PM_FPU0_FMOV_FEST 146 #define POWER5p_PME_PM_THRD_PRIO_7_CYC 147 #define POWER5p_PME_PM_LSU1_FLUSH_SRQ 148 #define POWER5p_PME_PM_LD_REF_L1_LSU0 149 #define POWER5p_PME_PM_L2SC_RCST_DISP 150 #define POWER5p_PME_PM_CMPLU_STALL_DIV 151 #define POWER5p_PME_PM_MEM_RQ_DISP_Q12to15 152 #define POWER5p_PME_PM_INST_FROM_L375_SHR 153 #define POWER5p_PME_PM_ST_REF_L1 154 #define POWER5p_PME_PM_L3SB_ALL_BUSY 155 #define POWER5p_PME_PM_FAB_P1toVNorNN_SIDECAR_EMPTY 156 #define POWER5p_PME_PM_MRK_DATA_FROM_L275_SHR_CYC 157 #define POWER5p_PME_PM_FAB_HOLDtoNN_EMPTY 158 #define POWER5p_PME_PM_DATA_FROM_LMEM 159 #define POWER5p_PME_PM_RUN_CYC 160 #define POWER5p_PME_PM_PTEG_FROM_RMEM 161 #define POWER5p_PME_PM_L2SC_RCLD_DISP 162 #define POWER5p_PME_PM_LSU_LRQ_S0_VALID 163 #define POWER5p_PME_PM_LSU0_LDF 164 #define POWER5p_PME_PM_PMC3_OVERFLOW 165 #define POWER5p_PME_PM_MRK_IMR_RELOAD 166 #define POWER5p_PME_PM_MRK_GRP_TIMEO 167 #define POWER5p_PME_PM_ST_MISS_L1 168 #define POWER5p_PME_PM_STOP_COMPLETION 169 #define POWER5p_PME_PM_LSU_BUSY_REJECT 170 #define POWER5p_PME_PM_ISLB_MISS 171 #define POWER5p_PME_PM_CYC 172 #define POWER5p_PME_PM_THRD_ONE_RUN_CYC 173 #define POWER5p_PME_PM_GRP_BR_REDIR_NONSPEC 174 #define POWER5p_PME_PM_LSU1_SRQ_STFWD 175 #define POWER5p_PME_PM_L3SC_MOD_INV 176 #define POWER5p_PME_PM_L2_PREF 177 #define POWER5p_PME_PM_GCT_NOSLOT_BR_MPRED 178 #define POWER5p_PME_PM_MRK_DATA_FROM_L25_MOD 179 #define POWER5p_PME_PM_L2SB_ST_REQ 180 #define POWER5p_PME_PM_L2SB_MOD_INV 181 #define POWER5p_PME_PM_MRK_L1_RELOAD_VALID 182 #define POWER5p_PME_PM_L3SB_HIT 183 #define POWER5p_PME_PM_L2SB_SHR_MOD 184 #define POWER5p_PME_PM_EE_OFF_EXT_INT 185 #define POWER5p_PME_PM_1PLUS_PPC_CMPL 186 
#define POWER5p_PME_PM_L2SC_SHR_MOD 187 #define POWER5p_PME_PM_PMC6_OVERFLOW 188 #define POWER5p_PME_PM_IC_PREF_INSTALL 189 #define POWER5p_PME_PM_LSU_LRQ_FULL_CYC 190 #define POWER5p_PME_PM_TLB_MISS 191 #define POWER5p_PME_PM_GCT_FULL_CYC 192 #define POWER5p_PME_PM_FXU_BUSY 193 #define POWER5p_PME_PM_MRK_DATA_FROM_L3_CYC 194 #define POWER5p_PME_PM_LSU_REJECT_LMQ_FULL 195 #define POWER5p_PME_PM_LSU_SRQ_S0_ALLOC 196 #define POWER5p_PME_PM_GRP_MRK 197 #define POWER5p_PME_PM_INST_FROM_L25_SHR 198 #define POWER5p_PME_PM_DC_PREF_STREAM_ALLOC 199 #define POWER5p_PME_PM_FPU1_FIN 200 #define POWER5p_PME_PM_BR_MPRED_TA 201 #define POWER5p_PME_PM_MRK_DTLB_REF_64K 202 #define POWER5p_PME_PM_RUN_INST_CMPL 203 #define POWER5p_PME_PM_CRQ_FULL_CYC 204 #define POWER5p_PME_PM_L2SA_RCLD_DISP 205 #define POWER5p_PME_PM_SNOOP_WR_RETRY_QFULL 206 #define POWER5p_PME_PM_MRK_DTLB_REF_4K 207 #define POWER5p_PME_PM_LSU_SRQ_S0_VALID 208 #define POWER5p_PME_PM_LSU0_FLUSH_LRQ 209 #define POWER5p_PME_PM_INST_FROM_L275_MOD 210 #define POWER5p_PME_PM_GCT_EMPTY_CYC 211 #define POWER5p_PME_PM_LARX_LSU0 212 #define POWER5p_PME_PM_THRD_PRIO_DIFF_5or6_CYC 213 #define POWER5p_PME_PM_SNOOP_RETRY_1AHEAD 214 #define POWER5p_PME_PM_FPU1_FSQRT 215 #define POWER5p_PME_PM_MRK_LD_MISS_L1_LSU1 216 #define POWER5p_PME_PM_MRK_FPU_FIN 217 #define POWER5p_PME_PM_THRD_PRIO_5_CYC 218 #define POWER5p_PME_PM_MRK_DATA_FROM_LMEM 219 #define POWER5p_PME_PM_SNOOP_TLBIE 220 #define POWER5p_PME_PM_FPU1_FRSP_FCONV 221 #define POWER5p_PME_PM_DTLB_MISS_16G 222 #define POWER5p_PME_PM_L3SB_SNOOP_RETRY 223 #define POWER5p_PME_PM_FAB_VBYPASS_EMPTY 224 #define POWER5p_PME_PM_MRK_DATA_FROM_L275_MOD 225 #define POWER5p_PME_PM_L2SB_RCST_DISP 226 #define POWER5p_PME_PM_6INST_CLB_CYC 227 #define POWER5p_PME_PM_FLUSH 228 #define POWER5p_PME_PM_L2SC_MOD_INV 229 #define POWER5p_PME_PM_FPU_DENORM 230 #define POWER5p_PME_PM_L3SC_HIT 231 #define POWER5p_PME_PM_SNOOP_WR_RETRY_RQ 232 #define POWER5p_PME_PM_LSU1_REJECT_SRQ 233 #define 
POWER5p_PME_PM_L3SC_ALL_BUSY 234
#define POWER5p_PME_PM_IC_PREF_REQ 235
#define POWER5p_PME_PM_MRK_GRP_IC_MISS 236
#define POWER5p_PME_PM_GCT_NOSLOT_IC_MISS 237
#define POWER5p_PME_PM_MRK_DATA_FROM_L3 238
#define POWER5p_PME_PM_GCT_NOSLOT_SRQ_FULL 239
#define POWER5p_PME_PM_CMPLU_STALL_DCACHE_MISS 240
#define POWER5p_PME_PM_THRD_SEL_OVER_ISU_HOLD 241
#define POWER5p_PME_PM_LSU_FLUSH_LRQ 242
#define POWER5p_PME_PM_THRD_PRIO_2_CYC 243
#define POWER5p_PME_PM_L3SA_MOD_INV 244
#define POWER5p_PME_PM_LSU_FLUSH_SRQ 245
#define POWER5p_PME_PM_MRK_LSU_SRQ_INST_VALID 246
#define POWER5p_PME_PM_L3SA_REF 247
#define POWER5p_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL 248
#define POWER5p_PME_PM_FPU0_STALL3 249
#define POWER5p_PME_PM_TB_BIT_TRANS 250
#define POWER5p_PME_PM_GPR_MAP_FULL_CYC 251
#define POWER5p_PME_PM_MRK_LSU_FLUSH_LRQ 252
#define POWER5p_PME_PM_FPU0_STF 253
#define POWER5p_PME_PM_MRK_DTLB_MISS 254
#define POWER5p_PME_PM_FPU1_FMA 255
#define POWER5p_PME_PM_L2SA_MOD_TAG 256
#define POWER5p_PME_PM_LSU1_FLUSH_ULD 257
#define POWER5p_PME_PM_MRK_INST_FIN 258
#define POWER5p_PME_PM_MRK_LSU0_FLUSH_UST 259
#define POWER5p_PME_PM_FPU0_FULL_CYC 260
#define POWER5p_PME_PM_LSU_LRQ_S0_ALLOC 261
#define POWER5p_PME_PM_MRK_LSU1_FLUSH_ULD 262
#define POWER5p_PME_PM_MRK_DTLB_REF 263
#define POWER5p_PME_PM_BR_UNCOND 264
#define POWER5p_PME_PM_THRD_SEL_OVER_L2MISS 265
#define POWER5p_PME_PM_L2SB_SHR_INV 266
#define POWER5p_PME_PM_MEM_LO_PRIO_WR_CMPL 267
#define POWER5p_PME_PM_MRK_DTLB_MISS_64K 268
#define POWER5p_PME_PM_MRK_ST_MISS_L1 269
#define POWER5p_PME_PM_L3SC_MOD_TAG 270
#define POWER5p_PME_PM_GRP_DISP_SUCCESS 271
#define POWER5p_PME_PM_THRD_PRIO_DIFF_1or2_CYC 272
#define POWER5p_PME_PM_IC_DEMAND_L2_BHT_REDIRECT 273
#define POWER5p_PME_PM_LSU_DERAT_MISS 274
#define POWER5p_PME_PM_MEM_WQ_DISP_Q8to15 275
#define POWER5p_PME_PM_FPU0_SINGLE 276
#define POWER5p_PME_PM_THRD_PRIO_1_CYC 277
#define POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_OTHER 278
#define POWER5p_PME_PM_SNOOP_RD_RETRY_RQ 279
#define POWER5p_PME_PM_FAB_HOLDtoVN_EMPTY 280
#define POWER5p_PME_PM_FPU1_FEST 281
#define POWER5p_PME_PM_SNOOP_DCLAIM_RETRY_QFULL 282
#define POWER5p_PME_PM_MRK_DATA_FROM_L25_SHR_CYC 283
#define POWER5p_PME_PM_MRK_ST_CMPL_INT 284
#define POWER5p_PME_PM_FLUSH_BR_MPRED 285
#define POWER5p_PME_PM_MRK_DTLB_MISS_16G 286
#define POWER5p_PME_PM_FPU_STF 287
#define POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_ADDR 288
#define POWER5p_PME_PM_CMPLU_STALL_FPU 289
#define POWER5p_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC 290
#define POWER5p_PME_PM_GCT_NOSLOT_CYC 291
#define POWER5p_PME_PM_FXU0_BUSY_FXU1_IDLE 292
#define POWER5p_PME_PM_PTEG_FROM_L35_SHR 293
#define POWER5p_PME_PM_MRK_DTLB_REF_16G 294
#define POWER5p_PME_PM_MRK_LSU_FLUSH_UST 295
#define POWER5p_PME_PM_MRK_DATA_FROM_L25_SHR 296
#define POWER5p_PME_PM_L3SA_HIT 297
#define POWER5p_PME_PM_MRK_DATA_FROM_L35_SHR 298
#define POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_ADDR 299
#define POWER5p_PME_PM_IERAT_XLATE_WR 300
#define POWER5p_PME_PM_L2SA_ST_REQ 301
#define POWER5p_PME_PM_INST_FROM_LMEM 302
#define POWER5p_PME_PM_THRD_SEL_T1 303
#define POWER5p_PME_PM_IC_DEMAND_L2_BR_REDIRECT 304
#define POWER5p_PME_PM_MRK_DATA_FROM_L35_SHR_CYC 305
#define POWER5p_PME_PM_FPU0_1FLOP 306
#define POWER5p_PME_PM_PTEG_FROM_L2 307
#define POWER5p_PME_PM_MEM_PW_CMPL 308
#define POWER5p_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC 309
#define POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_OTHER 310
#define POWER5p_PME_PM_MRK_DTLB_MISS_4K 311
#define POWER5p_PME_PM_FPU0_FIN 312
#define POWER5p_PME_PM_L3SC_SHR_INV 313
#define POWER5p_PME_PM_GRP_BR_REDIR 314
#define POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_RC_FULL 315
#define POWER5p_PME_PM_MRK_LSU_FLUSH_SRQ 316
#define POWER5p_PME_PM_PTEG_FROM_L275_SHR 317
#define POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_RC_FULL 318
#define POWER5p_PME_PM_SNOOP_RD_RETRY_WQ 319
#define POWER5p_PME_PM_FAB_DCLAIM_RETRIED 320
#define POWER5p_PME_PM_LSU0_NCLD 321
#define POWER5p_PME_PM_LSU1_BUSY_REJECT 322
#define POWER5p_PME_PM_FXLS0_FULL_CYC 323
#define POWER5p_PME_PM_DTLB_REF_16M 324
#define POWER5p_PME_PM_FPU0_FEST 325
#define POWER5p_PME_PM_GCT_USAGE_60to79_CYC 326
#define POWER5p_PME_PM_DATA_FROM_L25_MOD 327
#define POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_ADDR 328
#define POWER5p_PME_PM_LSU0_REJECT_ERAT_MISS 329
#define POWER5p_PME_PM_DATA_FROM_L375_MOD 330
#define POWER5p_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 331
#define POWER5p_PME_PM_DTLB_MISS_64K 332
#define POWER5p_PME_PM_LSU0_REJECT_RELOAD_CDF 333
#define POWER5p_PME_PM_0INST_FETCH 334
#define POWER5p_PME_PM_LSU1_REJECT_RELOAD_CDF 335
#define POWER5p_PME_PM_MEM_WQ_DISP_Q0to7 336
#define POWER5p_PME_PM_L1_PREF 337
#define POWER5p_PME_PM_MRK_DATA_FROM_LMEM_CYC 338
#define POWER5p_PME_PM_BRQ_FULL_CYC 339
#define POWER5p_PME_PM_GRP_IC_MISS_NONSPEC 340
#define POWER5p_PME_PM_PTEG_FROM_L275_MOD 341
#define POWER5p_PME_PM_MRK_LD_MISS_L1_LSU0 342
#define POWER5p_PME_PM_MRK_DATA_FROM_L375_SHR_CYC 343
#define POWER5p_PME_PM_DATA_FROM_L3 344
#define POWER5p_PME_PM_INST_FROM_L2 345
#define POWER5p_PME_PM_LSU_FLUSH 346
#define POWER5p_PME_PM_PMC2_OVERFLOW 347
#define POWER5p_PME_PM_FPU0_DENORM 348
#define POWER5p_PME_PM_FPU1_FMOV_FEST 349
#define POWER5p_PME_PM_INST_FETCH_CYC 350
#define POWER5p_PME_PM_INST_DISP 351
#define POWER5p_PME_PM_LSU_LDF 352
#define POWER5p_PME_PM_DATA_FROM_L25_SHR 353
#define POWER5p_PME_PM_L1_DCACHE_RELOAD_VALID 354
#define POWER5p_PME_PM_MEM_WQ_DISP_DCLAIM 355
#define POWER5p_PME_PM_MRK_GRP_ISSUED 356
#define POWER5p_PME_PM_FPU_FULL_CYC 357
#define POWER5p_PME_PM_INST_FROM_L35_MOD 358
#define POWER5p_PME_PM_FPU_FMA 359
#define POWER5p_PME_PM_THRD_PRIO_3_CYC 360
#define POWER5p_PME_PM_MRK_CRU_FIN 361
#define POWER5p_PME_PM_SNOOP_WR_RETRY_WQ 362
#define POWER5p_PME_PM_CMPLU_STALL_REJECT 363
#define POWER5p_PME_PM_MRK_FXU_FIN 364
#define POWER5p_PME_PM_LSU1_REJECT_ERAT_MISS 365
#define POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_OTHER 366
#define POWER5p_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY 367
#define POWER5p_PME_PM_PMC4_OVERFLOW 368
#define POWER5p_PME_PM_L3SA_SNOOP_RETRY 369
#define POWER5p_PME_PM_PTEG_FROM_L35_MOD 370
#define POWER5p_PME_PM_INST_FROM_L25_MOD 371
#define POWER5p_PME_PM_THRD_SMT_HANG 372
#define POWER5p_PME_PM_CMPLU_STALL_ERAT_MISS 373
#define POWER5p_PME_PM_L3SA_MOD_TAG 374
#define POWER5p_PME_PM_INST_FROM_L2MISS 375
#define POWER5p_PME_PM_FLUSH_SYNC 376
#define POWER5p_PME_PM_MRK_GRP_DISP 377
#define POWER5p_PME_PM_MEM_RQ_DISP_Q8to11 378
#define POWER5p_PME_PM_L2SC_ST_HIT 379
#define POWER5p_PME_PM_L2SB_MOD_TAG 380
#define POWER5p_PME_PM_CLB_EMPTY_CYC 381
#define POWER5p_PME_PM_L2SB_ST_HIT 382
#define POWER5p_PME_PM_MEM_NONSPEC_RD_CANCEL 383
#define POWER5p_PME_PM_BR_PRED_CR_TA 384
#define POWER5p_PME_PM_MRK_LSU0_FLUSH_SRQ 385
#define POWER5p_PME_PM_MRK_LSU_FLUSH_ULD 386
#define POWER5p_PME_PM_INST_DISP_ATTEMPT 387
#define POWER5p_PME_PM_INST_FROM_RMEM 388
#define POWER5p_PME_PM_ST_REF_L1_LSU0 389
#define POWER5p_PME_PM_LSU0_DERAT_MISS 390
#define POWER5p_PME_PM_FPU_STALL3 391
#define POWER5p_PME_PM_L2SB_RCLD_DISP 392
#define POWER5p_PME_PM_BR_PRED_CR 393
#define POWER5p_PME_PM_MRK_DATA_FROM_L2 394
#define POWER5p_PME_PM_LSU0_FLUSH_SRQ 395
#define POWER5p_PME_PM_FAB_PNtoNN_DIRECT 396
#define POWER5p_PME_PM_IOPS_CMPL 397
#define POWER5p_PME_PM_L2SA_RCST_DISP 398
#define POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_OTHER 399
#define POWER5p_PME_PM_L2SC_SHR_INV 400
#define POWER5p_PME_PM_SNOOP_RETRY_AB_COLLISION 401
#define POWER5p_PME_PM_FAB_PNtoVN_SIDECAR 402
#define POWER5p_PME_PM_LSU0_REJECT_LMQ_FULL 403
#define POWER5p_PME_PM_LSU_LMQ_S0_ALLOC 404
#define POWER5p_PME_PM_SNOOP_PW_RETRY_RQ 405
#define POWER5p_PME_PM_DTLB_REF 406
#define POWER5p_PME_PM_PTEG_FROM_L3 407
#define POWER5p_PME_PM_FAB_M1toVNorNN_SIDECAR_EMPTY 408
#define POWER5p_PME_PM_LSU_SRQ_EMPTY_CYC 409
#define POWER5p_PME_PM_FPU1_STF 410
#define POWER5p_PME_PM_LSU_LMQ_S0_VALID 411
#define POWER5p_PME_PM_GCT_USAGE_00to59_CYC 412
#define POWER5p_PME_PM_FPU_FMOV_FEST 413
#define POWER5p_PME_PM_DATA_FROM_L2MISS 414
#define POWER5p_PME_PM_XER_MAP_FULL_CYC 415
#define POWER5p_PME_PM_GRP_DISP_BLK_SB_CYC 416
#define POWER5p_PME_PM_FLUSH_SB 417
#define POWER5p_PME_PM_MRK_DATA_FROM_L375_SHR 418
#define POWER5p_PME_PM_MRK_GRP_CMPL 419
#define POWER5p_PME_PM_SUSPENDED 420
#define POWER5p_PME_PM_SNOOP_RD_RETRY_QFULL 421
#define POWER5p_PME_PM_GRP_IC_MISS_BR_REDIR_NONSPEC 422
#define POWER5p_PME_PM_DATA_FROM_L35_SHR 423
#define POWER5p_PME_PM_L3SB_MOD_INV 424
#define POWER5p_PME_PM_STCX_FAIL 425
#define POWER5p_PME_PM_LD_MISS_L1_LSU1 426
#define POWER5p_PME_PM_GRP_DISP 427
#define POWER5p_PME_PM_DC_PREF_DST 428
#define POWER5p_PME_PM_FPU1_DENORM 429
#define POWER5p_PME_PM_FPU0_FPSCR 430
#define POWER5p_PME_PM_DATA_FROM_L2 431
#define POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_ADDR 432
#define POWER5p_PME_PM_FPU_1FLOP 433
#define POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_OTHER 434
#define POWER5p_PME_PM_FPU0_FSQRT 435
#define POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_RC_FULL 436
#define POWER5p_PME_PM_LD_REF_L1 437
#define POWER5p_PME_PM_INST_FROM_L1 438
#define POWER5p_PME_PM_TLBIE_HELD 439
#define POWER5p_PME_PM_DC_PREF_OUT_OF_STREAMS 440
#define POWER5p_PME_PM_MRK_DATA_FROM_L25_MOD_CYC 441
#define POWER5p_PME_PM_MRK_LSU1_FLUSH_SRQ 442
#define POWER5p_PME_PM_MEM_RQ_DISP_Q0to3 443
#define POWER5p_PME_PM_ST_REF_L1_LSU1 444
#define POWER5p_PME_PM_MRK_LD_MISS_L1 445
#define POWER5p_PME_PM_L1_WRITE_CYC 446
#define POWER5p_PME_PM_L2SC_ST_REQ 447
#define POWER5p_PME_PM_CMPLU_STALL_FDIV 448
#define POWER5p_PME_PM_THRD_SEL_OVER_CLB_EMPTY 449
#define POWER5p_PME_PM_BR_MPRED_CR 450
#define POWER5p_PME_PM_L3SB_MOD_TAG 451
#define POWER5p_PME_PM_MRK_DATA_FROM_L2MISS 452
#define POWER5p_PME_PM_LSU_REJECT_SRQ 453
#define POWER5p_PME_PM_LD_MISS_L1 454
#define POWER5p_PME_PM_INST_FROM_PREF 455
#define POWER5p_PME_PM_STCX_PASS 456
#define POWER5p_PME_PM_DC_INV_L2 457
#define POWER5p_PME_PM_LSU_SRQ_FULL_CYC 458
#define POWER5p_PME_PM_FPU_FIN 459
#define POWER5p_PME_PM_LSU_SRQ_STFWD 460
#define POWER5p_PME_PM_L2SA_SHR_MOD 461
#define POWER5p_PME_PM_0INST_CLB_CYC 462
#define POWER5p_PME_PM_FXU0_FIN 463
#define POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_RC_FULL 464
#define POWER5p_PME_PM_THRD_GRP_CMPL_BOTH_CYC 465
#define POWER5p_PME_PM_PMC5_OVERFLOW 466
#define POWER5p_PME_PM_FPU0_FDIV 467
#define POWER5p_PME_PM_PTEG_FROM_L375_SHR 468
#define POWER5p_PME_PM_HV_CYC 469
#define POWER5p_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY 470
#define POWER5p_PME_PM_THRD_PRIO_DIFF_0_CYC 471
#define POWER5p_PME_PM_LR_CTR_MAP_FULL_CYC 472
#define POWER5p_PME_PM_L3SB_SHR_INV 473
#define POWER5p_PME_PM_DATA_FROM_RMEM 474
#define POWER5p_PME_PM_DATA_FROM_L275_MOD 475
#define POWER5p_PME_PM_LSU0_REJECT_SRQ 476
#define POWER5p_PME_PM_LSU1_DERAT_MISS 477
#define POWER5p_PME_PM_MRK_LSU_FIN 478
#define POWER5p_PME_PM_DTLB_MISS_16M 479
#define POWER5p_PME_PM_LSU0_FLUSH_UST 480
#define POWER5p_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY 481
#define POWER5p_PME_PM_L2SC_MOD_TAG 482

static const int power5p_event_ids[][POWER5p_NUM_EVENT_COUNTERS] = {
	[ POWER5p_PME_PM_LSU_REJECT_RELOAD_CDF ] = { -1, 243, 240, -1, -1, -1 },
	[ POWER5p_PME_PM_FPU1_SINGLE ] = { 82, 81, 81, 83, -1, -1 },
	[ POWER5p_PME_PM_L3SB_REF ] = { 188, 185, 184, 183, -1, -1 },
	[ POWER5p_PME_PM_THRD_PRIO_DIFF_3or4_CYC ] = { 343, 338, 336, 332, -1, -1 },
	[ POWER5p_PME_PM_INST_FROM_L275_SHR ] = { -1, -1, 115, -1, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L375_MOD ] = { 274, -1, -1, 268, -1, -1 },
	[ POWER5p_PME_PM_DTLB_MISS_4K ] = { 32, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_CLB_FULL_CYC ] = { 14, 13, 13, 14, -1, -1 },
	[ POWER5p_PME_PM_MRK_ST_CMPL ] = { 299, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU_FLUSH_LRQ_FULL ] = { 232, 231, 230, 227, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L275_SHR ] = { -1, -1, 265, -1, -1, -1 },
	[ POWER5p_PME_PM_1INST_CLB_CYC ] = { 1, 1, 1, 2, -1, -1 },
	[ POWER5p_PME_PM_MEM_SPEC_RD_CANCEL ] = { 264, 263, 259, 258, -1, -1 },
	[ POWER5p_PME_PM_MRK_DTLB_MISS_16M ] = { -1, -1, 273, -1, -1, -1 },
	[ POWER5p_PME_PM_FPU_FDIV ] = { 87, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_FPU_SINGLE ] = { 90, -1, -1, 90, -1, -1 },
	[ POWER5p_PME_PM_FPU0_FMA ] = { 63, 62, 62, 64, -1, -1 },
	[ POWER5p_PME_PM_SLB_MISS ] = { -1, 307, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU1_FLUSH_LRQ ] = { 220, 216, 216, 215, -1, -1 },
	[ POWER5p_PME_PM_L2SA_ST_HIT ] = { 142, 139, 138, 137, -1, -1 },
	[ POWER5p_PME_PM_DTLB_MISS ] = { 31, 30, 30, 31, -1, -1 },
	[ POWER5p_PME_PM_BR_PRED_TA ] = { 203, 11, 351, 348, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L375_MOD_CYC ] = { -1, -1, -1, 269, -1, -1 },
	[ POWER5p_PME_PM_CMPLU_STALL_FXU ] = { -1, 16, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_EXT_INT ] = { -1, -1, -1, 37, -1, -1 },
	[ POWER5p_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { 292, 292, 286, 289, -1, -1 },
	[ POWER5p_PME_PM_MRK_ST_GPS ] = { -1, 299, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU1_LDF ] = { 224, 220, 220, 219, -1, -1 },
	[ POWER5p_PME_PM_FAB_CMD_ISSUED ] = { 37, 36, 36, 38, -1, -1 },
	[ POWER5p_PME_PM_LSU0_SRQ_STFWD ] = { 217, 213, 213, 212, -1, -1 },
	[ POWER5p_PME_PM_CR_MAP_FULL_CYC ] = { 16, 19, 15, 20, -1, -1 },
	[ POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_RC_FULL ] = { 137, 134, 133, 132, -1, -1 },
	[ POWER5p_PME_PM_MRK_LSU0_FLUSH_ULD ] = { 290, 290, 284, 287, -1, -1 },
	[ POWER5p_PME_PM_LSU_FLUSH_SRQ_FULL ] = { 234, 232, 231, 229, -1, -1 },
	[ POWER5p_PME_PM_MEM_RQ_DISP_Q16to19 ] = { 360, 353, 352, 246, -1, -1 },
	[ POWER5p_PME_PM_FLUSH_IMBAL ] = { 54, 53, 53, 55, -1, -1 },
	[ POWER5p_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC ] = { 346, 341, 339, 335, -1, -1 },
	[ POWER5p_PME_PM_DATA_FROM_L35_MOD ] = { -1, 22, 21, -1, -1, -1 },
	[ POWER5p_PME_PM_MEM_HI_PRIO_WR_CMPL ] = { 253, 252, 248, 247, -1, -1 },
	[ POWER5p_PME_PM_FPU1_FDIV ] = { 74, 73, 73, 75, -1, -1 },
	[ POWER5p_PME_PM_MEM_RQ_DISP ] = { 261, 260, 256, 255, -1, -1 },
	[ POWER5p_PME_PM_FPU0_FRSP_FCONV ] = { 66, 65, 65, 67, -1, -1 },
	[ POWER5p_PME_PM_LWSYNC_HELD ] = { 250, 249, 245, 244, -1, -1 },
	[ POWER5p_PME_PM_FXU_FIN ] = { -1, -1, 93, -1, -1, -1 },
	[ POWER5p_PME_PM_DSLB_MISS ] = { 30, 29, 29, 30, -1, -1 },
	[ POWER5p_PME_PM_DATA_FROM_L275_SHR ] = { -1, -1, 18, -1, -1, -1 },
	[ POWER5p_PME_PM_FXLS1_FULL_CYC ] = { 92, 90, 89, 92, -1, -1 },
	[ POWER5p_PME_PM_THRD_SEL_T0 ] = { 352, 347, 345, 341, -1, -1 },
	[ POWER5p_PME_PM_PTEG_RELOAD_VALID ] = { 311, 306, 305, 303, -1, -1 },
	[ POWER5p_PME_PM_MRK_STCX_FAIL ] = { 298, 298, 293, 297, -1, -1 },
	[ POWER5p_PME_PM_LSU_LMQ_LHR_MERGE ] = { 238, 235, 233, 232, -1, -1 },
	[ POWER5p_PME_PM_2INST_CLB_CYC ] = { 3, 2, 2, 3, -1, -1 },
	[ POWER5p_PME_PM_FAB_PNtoVN_DIRECT ] = { 49, 48, 48, 50, -1, -1 },
	[ POWER5p_PME_PM_PTEG_FROM_L2MISS ] = { -1, -1, 300, -1, -1, -1 },
	[ POWER5p_PME_PM_CMPLU_STALL_LSU ] = { -1, 17, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_MRK_DSLB_MISS ] = { 276, 278, 271, 273, -1, -1 },
	[ POWER5p_PME_PM_LSU_FLUSH_ULD ] = { 235, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_PTEG_FROM_LMEM ] = { -1, 305, 304, -1, -1, -1 },
	[ POWER5p_PME_PM_MRK_BRU_FIN ] = { -1, 268, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_MEM_WQ_DISP_WRITE ] = { 268, 267, 263, 262, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L275_MOD_CYC ] = { -1, -1, -1, 266, -1, -1 },
	[ POWER5p_PME_PM_LSU1_NCLD ] = { 225, 221, 221, 220, -1, -1 },
	[ POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_OTHER ] = { 132, 129, 128, 127, -1, -1 },
	[ POWER5p_PME_PM_SNOOP_PW_RETRY_WQ_PWQ ] = { 316, 311, 309, 307, -1, -1 },
	[ POWER5p_PME_PM_FPU1_FULL_CYC ] = { 81, 80, 80, 82, -1, -1 },
	[ POWER5p_PME_PM_FPR_MAP_FULL_CYC ] = { 57, 56, 56, 58, -1, -1 },
	[ POWER5p_PME_PM_L3SA_ALL_BUSY ] = { 177, 174, 173, 172, -1, -1 },
	[ POWER5p_PME_PM_3INST_CLB_CYC ] = { 4, 3, 3, 4, -1, -1 },
	[ POWER5p_PME_PM_MEM_PWQ_DISP_Q2or3 ] = { 257, 250, 252, 251, -1, -1 },
	[ POWER5p_PME_PM_L2SA_SHR_INV ] = { 140, 137, 136, 135, -1, -1 },
	[ POWER5p_PME_PM_THRESH_TIMEO ] = { -1, -1, 348, -1, -1, -1 },
	[ POWER5p_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL ] = { 139, 136, 135, 134, -1, -1 },
	[ POWER5p_PME_PM_THRD_SEL_OVER_GCT_IMBAL ] = { 349, 344, 342, 338, -1, -1 },
	[ POWER5p_PME_PM_FPU_FSQRT ] = { -1, 86, 86, -1, -1, -1 },
	[ POWER5p_PME_PM_PMC1_OVERFLOW ] = { -1, 301, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { 288, 288, 282, 285, -1, -1 },
	[ POWER5p_PME_PM_L3SC_SNOOP_RETRY ] = { 197, 194, 193, 192, -1, -1 },
	[ POWER5p_PME_PM_DATA_TABLEWALK_CYC ] = { 25, 24, 24, 25, -1, -1 },
	[ POWER5p_PME_PM_THRD_PRIO_6_CYC ] = { 339, 334, 332, 328, -1, -1 },
	[ POWER5p_PME_PM_FPU_FEST ] = { 88, -1, -1, 87, -1, -1 },
	[ POWER5p_PME_PM_FAB_M1toP1_SIDECAR_EMPTY ] = { 43, 42, 42, 44, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_RMEM ] = { 275, -1, -1, 271, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L35_MOD_CYC ] = { -1, -1, -1, 267, -1, -1 },
	[ POWER5p_PME_PM_MEM_PWQ_DISP ] = { 256, 255, 251, 250, -1, -1 },
	[ POWER5p_PME_PM_FAB_P1toM1_SIDECAR_EMPTY ] = { 45, 44, 44, 46, -1, -1 },
	[ POWER5p_PME_PM_LD_MISS_L1_LSU0 ] = { 199, 196, 196, 194, -1, -1 },
	[ POWER5p_PME_PM_SNOOP_PARTIAL_RTRY_QFULL ] = { 314, 309, 307, 305, -1, -1 },
	[ POWER5p_PME_PM_FPU1_STALL3 ] = { 83, 82, 82, 84, -1, -1 },
	[ POWER5p_PME_PM_GCT_USAGE_80to99_CYC ] = { -1, -1, 96, -1, -1, -1 },
	[ POWER5p_PME_PM_WORK_HELD ] = { -1, -1, -1, 345, -1, -1 },
	[ POWER5p_PME_PM_INST_CMPL ] = { 303, 302, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU1_FLUSH_UST ] = { 223, 219, 219, 218, -1, -1 },
	[ POWER5p_PME_PM_FXU_IDLE ] = { 96, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU0_FLUSH_ULD ] = { 209, 205, 205, 204, -1, -1 },
	[ POWER5p_PME_PM_LSU1_REJECT_LMQ_FULL ] = { 227, 223, 223, 222, -1, -1 },
	[ POWER5p_PME_PM_GRP_DISP_REJECT ] = { 104, 104, 102, 103, -1, -1 },
	[ POWER5p_PME_PM_PTEG_FROM_L25_SHR ] = { 305, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_L2SA_MOD_INV ] = { 128, 125, 124, 123, -1, -1 },
	[ POWER5p_PME_PM_FAB_CMD_RETRIED ] = { 38, 37, 37, 39, -1, -1 },
	[ POWER5p_PME_PM_L3SA_SHR_INV ] = { 182, 179, 178, 177, -1, -1 },
	[ POWER5p_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL ] = { 155, 152, 151, 150, -1, -1 },
	[ POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_ADDR ] = { 135, 132, 131, 130, -1, -1 },
	[ POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_RC_FULL ] = { 133, 130, 129, 128, -1, -1 },
	[ POWER5p_PME_PM_PTEG_FROM_L375_MOD ] = { 309, -1, -1, 301, -1, -1 },
	[ POWER5p_PME_PM_MRK_LSU1_FLUSH_UST ] = { 295, 295, 289, 292, -1, -1 },
	[ POWER5p_PME_PM_BR_ISSUED ] = { 9, 8, 8, 9, -1, -1 },
	[ POWER5p_PME_PM_MRK_GRP_BR_REDIR ] = { -1, 283, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_EE_OFF ] = { 35, 34, 34, 35, -1, -1 },
	[ POWER5p_PME_PM_IERAT_XLATE_WR_LP ] = { 114, 112, 111, 111, -1, -1 },
	[ POWER5p_PME_PM_DTLB_REF_64K ] = { -1, 33, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_MEM_RQ_DISP_Q4to7 ] = { 262, 259, 258, 257, -1, -1 },
	[ POWER5p_PME_PM_MEM_FAST_PATH_RD_DISP ] = { 251, 354, 246, 245, -1, -1 },
	[ POWER5p_PME_PM_INST_FROM_L3 ] = { 121, -1, 116, -1, -1, -1 },
	[ POWER5p_PME_PM_ITLB_MISS ] = { 124, 121, 120, 119, -1, -1 },
	[ POWER5p_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { -1, -1, -1, 95, -1, -1 },
	[ POWER5p_PME_PM_DTLB_REF_4K ] = { 34, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_FXLS_FULL_CYC ] = { 93, -1, -1, 93, -1, -1 },
	[ POWER5p_PME_PM_GRP_DISP_VALID ] = { 105, 105, 104, 104, -1, -1 },
	[ POWER5p_PME_PM_LSU_FLUSH_UST ] = { -1, 233, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_FXU1_FIN ] = { 95, 92, 92, 96, -1, -1 },
	[ POWER5p_PME_PM_THRD_PRIO_4_CYC ] = { 337, 332, 330, 326, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L35_MOD ] = { -1, 273, 268, -1, -1, -1 },
	[ POWER5p_PME_PM_4INST_CLB_CYC ] = { 5, 4, 4, 5, -1, -1 },
	[ POWER5p_PME_PM_MRK_DTLB_REF_16M ] = { -1, -1, 275, -1, -1, -1 },
	[ POWER5p_PME_PM_INST_FROM_L375_MOD ] = { -1, -1, -1, 116, -1, -1 },
	[ POWER5p_PME_PM_GRP_CMPL ] = { -1, -1, 100, -1, -1, -1 },
	[ POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_ADDR ] = { 167, 164, 163, 162, -1, -1 },
	[ POWER5p_PME_PM_FPU1_1FLOP ] = { 72, 71, 71, 73, -1, -1 },
	[ POWER5p_PME_PM_FPU_FRSP_FCONV ] = { -1, 85, 85, -1, -1, -1 },
	[ POWER5p_PME_PM_L3SC_REF ] = { 195, 192, 191, 190, -1, -1 },
	[ POWER5p_PME_PM_5INST_CLB_CYC ] = { 6, 5, 5, 6, -1, -1 },
	[ POWER5p_PME_PM_THRD_L2MISS_BOTH_CYC ] = { 332, 328, 326, 322, -1, -1 },
	[ POWER5p_PME_PM_MEM_PW_GATH ] = { 259, 258, 254, 253, -1, -1 },
	[ POWER5p_PME_PM_DTLB_REF_16G ] = { -1, -1, -1, 34, -1, -1 },
	[ POWER5p_PME_PM_FAB_DCLAIM_ISSUED ] = { 39, 38, 38, 40, -1, -1 },
	[ POWER5p_PME_PM_FAB_PNtoNN_SIDECAR ] = { 48, 47, 47, 49, -1, -1 },
	[ POWER5p_PME_PM_GRP_IC_MISS ] = { 106, 106, 105, 105, -1, -1 },
	[ POWER5p_PME_PM_INST_FROM_L35_SHR ] = { 122, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU_LMQ_FULL_CYC ] = { 237, 234, 232, 231, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L2_CYC ] = { -1, 272, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU_SRQ_SYNC_CYC ] = { 249, 248, 244, 243, -1, -1 },
	[ POWER5p_PME_PM_LSU0_BUSY_REJECT ] = { 205, 201, 201, 200, -1, -1 },
	[ POWER5p_PME_PM_LSU_REJECT_ERAT_MISS ] = { 244, -1, -1, 238, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { -1, -1, -1, 272, -1, -1 },
	[ POWER5p_PME_PM_DATA_FROM_L375_SHR ] = { -1, -1, 22, -1, -1, -1 },
	[ POWER5p_PME_PM_PTEG_FROM_L25_MOD ] = { -1, 303, 298, -1, -1, -1 },
	[ POWER5p_PME_PM_FPU0_FMOV_FEST ] = { 64, 63, 63, 65, -1, -1 },
	[ POWER5p_PME_PM_THRD_PRIO_7_CYC ] = { 340, 335, 333, 329, -1, -1 },
	[ POWER5p_PME_PM_LSU1_FLUSH_SRQ ] = { 221, 217, 217, 216, -1, -1 },
	[ POWER5p_PME_PM_LD_REF_L1_LSU0 ] = { 202, 198, 198, 197, -1, -1 },
	[ POWER5p_PME_PM_L2SC_RCST_DISP ] = { 166, 163, 162, 161, -1, -1 },
	[ POWER5p_PME_PM_CMPLU_STALL_DIV ] = { -1, -1, -1, 15, -1, -1 },
	[ POWER5p_PME_PM_MEM_RQ_DISP_Q12to15 ] = { 359, 262, 255, 248, -1, -1 },
	[ POWER5p_PME_PM_INST_FROM_L375_SHR ] = { -1, -1, 117, -1, -1, -1 },
	[ POWER5p_PME_PM_ST_REF_L1 ] = { -1, 323, 322, -1, -1, -1 },
	[ POWER5p_PME_PM_L3SB_ALL_BUSY ] = { 184, 181, 180, 179, -1, -1 },
	[ POWER5p_PME_PM_FAB_P1toVNorNN_SIDECAR_EMPTY ] = { 46, 45, 45, 47, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L275_SHR_CYC ] = { -1, 271, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_FAB_HOLDtoNN_EMPTY ] = { 41, 40, 40, 42, -1, -1 },
	[ POWER5p_PME_PM_DATA_FROM_LMEM ] = { -1, 23, 23, -1, -1, -1 },
	[ POWER5p_PME_PM_RUN_CYC ] = { 312, -1, -1, -1, -1, 0 },
	[ POWER5p_PME_PM_PTEG_FROM_RMEM ] = { 310, -1, -1, 302, -1, -1 },
	[ POWER5p_PME_PM_L2SC_RCLD_DISP ] = { 162, 159, 158, 157, -1, -1 },
	[ POWER5p_PME_PM_LSU_LRQ_S0_VALID ] = { 243, 241, 239, 237, -1, -1 },
	[ POWER5p_PME_PM_LSU0_LDF ] = { 211, 207, 207, 206, -1, -1 },
	[ POWER5p_PME_PM_PMC3_OVERFLOW ] = { -1, -1, -1, 299, -1, -1 },
	[ POWER5p_PME_PM_MRK_IMR_RELOAD ] = { 283, 284, 277, 281, -1, -1 },
	[ POWER5p_PME_PM_MRK_GRP_TIMEO ] = { -1, -1, -1, 280, -1, -1 },
	[ POWER5p_PME_PM_ST_MISS_L1 ] = { 327, 322, 321, 318, -1, -1 },
	[ POWER5p_PME_PM_STOP_COMPLETION ] = { -1, -1, 320, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU_BUSY_REJECT ] = { -1, 227, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_ISLB_MISS ] = { 123, 120, 119, 118, -1, -1 },
	[ POWER5p_PME_PM_CYC ] = { 17, 20, 16, 21, -1, -1 },
	[ POWER5p_PME_PM_THRD_ONE_RUN_CYC ] = { 333, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_GRP_BR_REDIR_NONSPEC ] = { 102, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU1_SRQ_STFWD ] = { 230, 226, 226, 225, -1, -1 },
	[ POWER5p_PME_PM_L3SC_MOD_INV ] = { 193, 190, 189, 188, -1, -1 },
	[ POWER5p_PME_PM_L2_PREF ] = { 176, 173, 172, 171, -1, -1 },
	[ POWER5p_PME_PM_GCT_NOSLOT_BR_MPRED ] = { -1, -1, -1, 98, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L25_MOD ] = { -1, 269, 264, -1, -1, -1 },
	[ POWER5p_PME_PM_L2SB_ST_REQ ] = { 159, 156, 155, 154, -1, -1 },
	[ POWER5p_PME_PM_L2SB_MOD_INV ] = { 144, 141, 140, 139, -1, -1 },
	[ POWER5p_PME_PM_MRK_L1_RELOAD_VALID ] = { 284, 285, 279, 282, -1, -1 },
	[ POWER5p_PME_PM_L3SB_HIT ] = { 185, 182, 181, 180, -1, -1 },
	[ POWER5p_PME_PM_L2SB_SHR_MOD ] = { 157, 154, 153, 152, -1, -1 },
	[ POWER5p_PME_PM_EE_OFF_EXT_INT ] = { 36, 35, 35, 36, -1, -1 },
	[ POWER5p_PME_PM_1PLUS_PPC_CMPL ] = { 2, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_L2SC_SHR_MOD ] = { 173, 170, 169, 168, -1, -1 },
	[ POWER5p_PME_PM_PMC6_OVERFLOW ] = { -1, -1, 297, -1, -1, -1 },
	[ POWER5p_PME_PM_IC_PREF_INSTALL ] = { 252, 251, 108, 108, -1, -1 },
	[ POWER5p_PME_PM_LSU_LRQ_FULL_CYC ] = { 241, 239, 237, 235, -1, -1 },
	[ POWER5p_PME_PM_TLB_MISS ] = { 356, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_GCT_FULL_CYC ] = { 97, 96, 94, 97, -1, -1 },
	[ POWER5p_PME_PM_FXU_BUSY ] = { -1, 93, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L3_CYC ] = { -1, 276, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU_REJECT_LMQ_FULL ] = { -1, 242, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU_SRQ_S0_ALLOC ] = { 247, 245, 242, 241, -1, -1 },
	[ POWER5p_PME_PM_GRP_MRK ] = { 109, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_INST_FROM_L25_SHR ] = { 119, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_DC_PREF_STREAM_ALLOC ] = { 29, 28, 28, 29, -1, -1 },
	[ POWER5p_PME_PM_FPU1_FIN ] = { 76, 75, 75, 77, -1, -1 },
	[ POWER5p_PME_PM_BR_MPRED_TA ] = { 11, 10, 10, 11, -1, -1 },
	[ POWER5p_PME_PM_MRK_DTLB_REF_64K ] = { -1, 282, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_RUN_INST_CMPL ] = { -1, -1, -1, -1, 0, -1 },
	[ POWER5p_PME_PM_CRQ_FULL_CYC ] = { 15, 18, 14, 19, -1, -1 },
	[ POWER5p_PME_PM_L2SA_RCLD_DISP ] = { 130, 127, 126, 125, -1, -1 },
	[ POWER5p_PME_PM_SNOOP_WR_RETRY_QFULL ] = { 322, 317, 315, 313, -1, -1 },
	[ POWER5p_PME_PM_MRK_DTLB_REF_4K ] = { 280, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU_SRQ_S0_VALID ] = { 248, 246, 243, 242, -1, -1 },
	[ POWER5p_PME_PM_LSU0_FLUSH_LRQ ] = { 207, 203, 203, 202, -1, -1 },
	[ POWER5p_PME_PM_INST_FROM_L275_MOD ] = { -1, -1, -1, 115, -1, -1 },
	[ POWER5p_PME_PM_GCT_EMPTY_CYC ] = { -1, 95, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LARX_LSU0 ] = { 198, 195, 194, 193, -1, -1 },
	[ POWER5p_PME_PM_THRD_PRIO_DIFF_5or6_CYC ] = { 344, 339, 337, 333, -1, -1 },
	[ POWER5p_PME_PM_SNOOP_RETRY_1AHEAD ] = { 320, 315, 313, 311, -1, -1 },
	[ POWER5p_PME_PM_FPU1_FSQRT ] = { 80, 79, 79, 81, -1, -1 },
	[ POWER5p_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { 287, 287, 281, 284, -1, -1 },
	[ POWER5p_PME_PM_MRK_FPU_FIN ] = { -1, -1, 276, -1, -1, -1 },
	[ POWER5p_PME_PM_THRD_PRIO_5_CYC ] = { 338, 333, 331, 327, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_LMEM ] = { -1, 277, 270, -1, -1, -1 },
	[ POWER5p_PME_PM_SNOOP_TLBIE ] = { 321, 316, 314, 312, -1, -1 },
	[ POWER5p_PME_PM_FPU1_FRSP_FCONV ] = { 79, 78, 78, 80, -1, -1 },
	[ POWER5p_PME_PM_DTLB_MISS_16G ] = { -1, -1, -1, 32, -1, -1 },
	[ POWER5p_PME_PM_L3SB_SNOOP_RETRY ] = { 190, 187, 186, 185, -1, -1 },
	[ POWER5p_PME_PM_FAB_VBYPASS_EMPTY ] = { 51, 50, 50, 52, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L275_MOD ] = { 271, -1, -1, 265, -1, -1 },
	[ POWER5p_PME_PM_L2SB_RCST_DISP ] = { 150, 147, 146, 145, -1, -1 },
	[ POWER5p_PME_PM_6INST_CLB_CYC ] = { 7, 6, 6, 7, -1, -1 },
	[ POWER5p_PME_PM_FLUSH ] = { 52, 51, 51, 53, -1, -1 },
	[ POWER5p_PME_PM_L2SC_MOD_INV ] = { 160, 157, 156, 155, -1, -1 },
	[ POWER5p_PME_PM_FPU_DENORM ] = { 86, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_L3SC_HIT ] = { 192, 189, 188, 187, -1, -1 },
	[ POWER5p_PME_PM_SNOOP_WR_RETRY_RQ ] = { 323, 318, 316, 314, -1, -1 },
	[ POWER5p_PME_PM_LSU1_REJECT_SRQ ] = { 229, 225, 225, 224, -1, -1 },
	[ POWER5p_PME_PM_L3SC_ALL_BUSY ] = { 191, 188, 187, 186, -1, -1 },
	[ POWER5p_PME_PM_IC_PREF_REQ ] = { 112, 110, 109, 109, -1, -1 },
	[ POWER5p_PME_PM_MRK_GRP_IC_MISS ] = { -1, -1, -1, 279, -1, -1 },
	[ POWER5p_PME_PM_GCT_NOSLOT_IC_MISS ] = { -1, 97, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L3 ] = { 272, -1, 267, -1, -1, -1 },
	[ POWER5p_PME_PM_GCT_NOSLOT_SRQ_FULL ] = { -1, -1, 95, -1, -1, -1 },
	[ POWER5p_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { -1, 14, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_THRD_SEL_OVER_ISU_HOLD ] = { 350, 345, 343, 339, -1, -1 },
	[ POWER5p_PME_PM_LSU_FLUSH_LRQ ] = { -1, 230, 229, -1, -1, -1 },
	[ POWER5p_PME_PM_THRD_PRIO_2_CYC ] = { 335, 330, 328, 324, -1, -1 },
	[ POWER5p_PME_PM_L3SA_MOD_INV ] = { 179, 176, 175, 174, -1, -1 },
	[ POWER5p_PME_PM_LSU_FLUSH_SRQ ] = { 233, -1, -1, 228, -1, -1 },
	[ POWER5p_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { 297, 297, 292, 296, -1, -1 },
	[ POWER5p_PME_PM_L3SA_REF ] = { 181, 178, 177, 176, -1, -1 },
	[ POWER5p_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL ] = { 171, 168, 167, 166, -1, -1 },
	[ POWER5p_PME_PM_FPU0_STALL3 ] = { 70, 69, 69, 71, -1, -1 },
	[ POWER5p_PME_PM_TB_BIT_TRANS ] = { 331, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_GPR_MAP_FULL_CYC ] = { 100, 99, 97, 99, -1, -1 },
	[ POWER5p_PME_PM_MRK_LSU_FLUSH_LRQ ] = { -1, -1, 290, -1, -1, -1 },
	[ POWER5p_PME_PM_FPU0_STF ] = { 71, 70, 70, 72, -1, -1 },
	[ POWER5p_PME_PM_MRK_DTLB_MISS ] = { 277, 279, 272, 274, -1, -1 },
	[ POWER5p_PME_PM_FPU1_FMA ] = { 77, 76, 76, 78, -1, -1 },
	[ POWER5p_PME_PM_L2SA_MOD_TAG ] = { 129, 126, 125, 124, -1, -1 },
	[ POWER5p_PME_PM_LSU1_FLUSH_ULD ] = { 222, 218, 218, 217, -1, -1 },
	[ POWER5p_PME_PM_MRK_INST_FIN ] = { -1, -1, 278, -1, -1, -1 },
	[ POWER5p_PME_PM_MRK_LSU0_FLUSH_UST ] = { 291, 291, 285, 288, -1, -1 },
	[ POWER5p_PME_PM_FPU0_FULL_CYC ] = { 68, 67, 67, 69, -1, -1 },
	[ POWER5p_PME_PM_LSU_LRQ_S0_ALLOC ] = { 242, 240, 238, 236, -1, -1 },
	[ POWER5p_PME_PM_MRK_LSU1_FLUSH_ULD ] = { 294, 294, 288, 291, -1, -1 },
	[ POWER5p_PME_PM_MRK_DTLB_REF ] = { 279, 281, 274, 276, -1, -1 },
	[ POWER5p_PME_PM_BR_UNCOND ] = { 12, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_THRD_SEL_OVER_L2MISS ] = { 351, 346, 344, 340, -1, -1 },
	[ POWER5p_PME_PM_L2SB_SHR_INV ] = { 156, 153, 152, 151, -1, -1 },
	[ POWER5p_PME_PM_MEM_LO_PRIO_WR_CMPL ] = { 255, 254, 250, 249, -1, -1 },
	[ POWER5p_PME_PM_MRK_DTLB_MISS_64K ] = { -1, 280, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_MRK_ST_MISS_L1 ] = { 300, 300, 295, 298, -1, -1 },
	[ POWER5p_PME_PM_L3SC_MOD_TAG ] = { 194, 191, 190, 189, -1, -1 },
	[ POWER5p_PME_PM_GRP_DISP_SUCCESS ] = { -1, -1, 103, -1, -1, -1 },
	[ POWER5p_PME_PM_THRD_PRIO_DIFF_1or2_CYC ] = { 342, 337, 335, 331, -1, -1 },
	[ POWER5p_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { 110, 108, 106, 106, -1, -1 },
	[ POWER5p_PME_PM_LSU_DERAT_MISS ] = { -1, 228, 227, -1, -1, -1 },
	[ POWER5p_PME_PM_MEM_WQ_DISP_Q8to15 ] = { 266, 265, 261, 260, -1, -1 },
	[ POWER5p_PME_PM_FPU0_SINGLE ] = { 69, 68, 68, 70, -1, -1 },
	[ POWER5p_PME_PM_THRD_PRIO_1_CYC ] = { 334, 329, 327, 323, -1, -1 },
	[ POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_OTHER ] = { 168, 165, 164, 163, -1, -1 },
	[ POWER5p_PME_PM_SNOOP_RD_RETRY_RQ ] = { 318, 313, 311, 309, -1, -1 },
	[ POWER5p_PME_PM_FAB_HOLDtoVN_EMPTY ] = { 42, 41, 41, 43, -1, -1 },
	[ POWER5p_PME_PM_FPU1_FEST ] = { 75, 74, 74, 76, -1, -1 },
	[ POWER5p_PME_PM_SNOOP_DCLAIM_RETRY_QFULL ] = { 313, 308, 306, 304, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L25_SHR_CYC ] = { -1, 270, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_MRK_ST_CMPL_INT ] = { -1, -1, 294, -1, -1, -1 },
	[ POWER5p_PME_PM_FLUSH_BR_MPRED ] = { 53, 52, 52, 54, -1, -1 },
	[ POWER5p_PME_PM_MRK_DTLB_MISS_16G ] = { -1, -1, -1, 275, -1, -1 },
	[ POWER5p_PME_PM_FPU_STF ] = { -1, 88, 87, -1, -1, -1 },
	[ POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_ADDR ] = { 147, 144, 143, 142, -1, -1 },
	[ POWER5p_PME_PM_CMPLU_STALL_FPU ] = { -1, -1, -1, 17, -1, -1 },
	[ POWER5p_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC ] = { 345, 340, 338, 334, -1, -1 },
	[ POWER5p_PME_PM_GCT_NOSLOT_CYC ] = { 98, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { -1, -1, 90, -1, -1, -1 },
	[ POWER5p_PME_PM_PTEG_FROM_L35_SHR ] = { 308, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_MRK_DTLB_REF_16G ] = { -1, -1, -1, 277, -1, -1 },
	[ POWER5p_PME_PM_MRK_LSU_FLUSH_UST ] = { -1, 296, 291, -1, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L25_SHR ] = { 270, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_L3SA_HIT ] = { 178, 175, 174, 173, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L35_SHR ] = { 273, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_ADDR ] = { 151, 148, 147, 146, -1, -1 },
	[ POWER5p_PME_PM_IERAT_XLATE_WR ] = { 113, 111, 110, 110, -1, -1 },
	[ POWER5p_PME_PM_L2SA_ST_REQ ] = { 143, 140, 139, 138, -1, -1 },
	[ POWER5p_PME_PM_INST_FROM_LMEM ] = { -1, 119, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_THRD_SEL_T1 ] = { 353, 348, 346, 342, -1, -1 },
	[ POWER5p_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { 111, 109, 107, 107, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L35_SHR_CYC ] = { -1, 274, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_FPU0_1FLOP ] = { 58, 57, 57, 59, -1, -1 },
	[ POWER5p_PME_PM_PTEG_FROM_L2 ] = { 304, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_MEM_PW_CMPL ] = { 258, 257, 253, 252, -1, -1 },
	[ POWER5p_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC ] = { 347, 342, 340, 336, -1, -1 },
	[ POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_OTHER ] = { 148, 145, 144, 143, -1, -1 },
	[ POWER5p_PME_PM_MRK_DTLB_MISS_4K ] = { 278, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_FPU0_FIN ] = { 62, 61, 61, 63, -1, -1 },
	[ POWER5p_PME_PM_L3SC_SHR_INV ] = { 196, 193, 192, 191, -1, -1 },
	[ POWER5p_PME_PM_GRP_BR_REDIR ] = { 101, 100, 98, 100, -1, -1 },
	[ POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_RC_FULL ] = { 165, 162, 161, 160, -1, -1 },
	[ POWER5p_PME_PM_MRK_LSU_FLUSH_SRQ ] = { -1, -1, -1, 294, -1, -1 },
	[ POWER5p_PME_PM_PTEG_FROM_L275_SHR ] = { -1, -1, 299, -1, -1, -1 },
	[ POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_RC_FULL ] = { 149, 146, 145, 144, -1, -1 },
	[ POWER5p_PME_PM_SNOOP_RD_RETRY_WQ ] = { 319, 314, 312, 310, -1, -1 },
	[ POWER5p_PME_PM_FAB_DCLAIM_RETRIED ] = { 40, 39, 39, 41, -1, -1 },
	[ POWER5p_PME_PM_LSU0_NCLD ] = { 212, 208, 208, 207, -1, -1 },
	[ POWER5p_PME_PM_LSU1_BUSY_REJECT ] = { 218, 214, 214, 213, -1, -1 },
	[ POWER5p_PME_PM_FXLS0_FULL_CYC ] = { 91, 89, 88, 91, -1, -1 },
	[ POWER5p_PME_PM_DTLB_REF_16M ] = { -1, -1, 33, -1, -1, -1 },
	[ POWER5p_PME_PM_FPU0_FEST ] = { 61, 60, 60, 62, -1, -1 },
	[ POWER5p_PME_PM_GCT_USAGE_60to79_CYC ] = { -1, 98, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_DATA_FROM_L25_MOD ] = { -1, 21, 17, -1, -1, -1 },
	[ POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_ADDR ] = { 163, 160, 159, 158, -1, -1 },
	[ POWER5p_PME_PM_LSU0_REJECT_ERAT_MISS ] = { 213, 209, 209, 208, -1, -1 },
	[ POWER5p_PME_PM_DATA_FROM_L375_MOD ] = { 23, -1, -1, 23, -1, -1 },
	[ POWER5p_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { -1, 238, 236, -1, -1, -1 },
	[ POWER5p_PME_PM_DTLB_MISS_64K ] = { -1, 31, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU0_REJECT_RELOAD_CDF ] = { 215, 211, 211, 210, -1, -1 },
	[ POWER5p_PME_PM_0INST_FETCH ] = { -1, -1, -1, 1, -1, -1 },
	[ POWER5p_PME_PM_LSU1_REJECT_RELOAD_CDF ] = { 228, 224, 224, 223, -1, -1 },
	[ POWER5p_PME_PM_MEM_WQ_DISP_Q0to7 ] = { 265, 264, 260, 259, -1, -1 },
	[ POWER5p_PME_PM_L1_PREF ] = { 126, 123, 122, 121, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { -1, -1, -1, 270, -1, -1 },
	[ POWER5p_PME_PM_BRQ_FULL_CYC ] = { 8, 7, 7, 8, -1, -1 },
	[ POWER5p_PME_PM_GRP_IC_MISS_NONSPEC ] = { 108, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_PTEG_FROM_L275_MOD ] = { 306, -1, -1, 300, -1, -1 },
	[ POWER5p_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { 286, 286, 280, 283, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L375_SHR_CYC ] = { -1, 275, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_DATA_FROM_L3 ] = { 21, -1, 20, -1, -1, -1 },
	[ POWER5p_PME_PM_INST_FROM_L2 ] = { 118, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU_FLUSH ] = { 231, 229, 228, 226, -1, -1 },
	[ POWER5p_PME_PM_PMC2_OVERFLOW ] = { -1, -1, 296, -1, -1, -1 },
	[ POWER5p_PME_PM_FPU0_DENORM ] = { 59, 58, 58, 60, -1, -1 },
	[ POWER5p_PME_PM_FPU1_FMOV_FEST ] = { 78, 77, 77, 79, -1, -1 },
	[ POWER5p_PME_PM_INST_FETCH_CYC ] = { 117, 115, 114, 114, -1, -1 },
	[ POWER5p_PME_PM_INST_DISP ] = { -1, -1, 113, 113, -1, -1 },
	[ POWER5p_PME_PM_LSU_LDF ] = { 236, -1, -1, 230, -1, -1 },
	[ POWER5p_PME_PM_DATA_FROM_L25_SHR ] = { 19, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_L1_DCACHE_RELOAD_VALID ] = { 125, 122, 121, 120, -1, -1 },
	[ POWER5p_PME_PM_MEM_WQ_DISP_DCLAIM ] = { 267, 266, 262, 261, -1, -1 },
	[ POWER5p_PME_PM_MRK_GRP_ISSUED ] = { 282, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_FPU_FULL_CYC ] = { 89, -1, -1, 89, -1, -1 },
	[ POWER5p_PME_PM_INST_FROM_L35_MOD ] = { -1, 118, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_FPU_FMA ] = { -1, 84, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_THRD_PRIO_3_CYC ] = { 336, 331, 329, 325, -1, -1 },
	[ POWER5p_PME_PM_MRK_CRU_FIN ] = { -1, -1, -1, 263, -1, -1 },
	[ POWER5p_PME_PM_SNOOP_WR_RETRY_WQ ] = { 324, 319, 317, 315, -1, -1 },
	[ POWER5p_PME_PM_CMPLU_STALL_REJECT ] = { -1, -1, -1, 18, -1, -1 },
	[ POWER5p_PME_PM_MRK_FXU_FIN ] = { -1, 94, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU1_REJECT_ERAT_MISS ] = { 226, 222, 222, 221, -1, -1 },
	[ POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_OTHER ] = { 152, 149, 148, 147, -1, -1 },
	[ POWER5p_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY ] = { 170, 167, 166, 165, -1, -1 },
	[ POWER5p_PME_PM_PMC4_OVERFLOW ] = { 301, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_L3SA_SNOOP_RETRY ] = { 183, 180, 179, 178, -1, -1 },
	[ POWER5p_PME_PM_PTEG_FROM_L35_MOD ] = { -1, 304, 302, -1, -1, -1 },
	[ POWER5p_PME_PM_INST_FROM_L25_MOD ] = { -1, 117, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_THRD_SMT_HANG ] = { 354, 349, 347, 343, -1, -1 },
	[ POWER5p_PME_PM_CMPLU_STALL_ERAT_MISS ] = { -1, -1, -1, 16, -1, -1 },
	[ POWER5p_PME_PM_L3SA_MOD_TAG ] = { 180, 177, 176, 175, -1, -1 },
	[ POWER5p_PME_PM_INST_FROM_L2MISS ] = { 120, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_FLUSH_SYNC ] = { 56, 55, 55, 57, -1, -1 },
	[ POWER5p_PME_PM_MRK_GRP_DISP ] = { 281, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_MEM_RQ_DISP_Q8to11 ] = { 263, 261, 247, 349, -1, -1 },
	[ POWER5p_PME_PM_L2SC_ST_HIT ] = { 174, 171, 170, 169, -1, -1 },
	[ POWER5p_PME_PM_L2SB_MOD_TAG ] = { 145, 142, 141, 140, -1, -1 },
	[ POWER5p_PME_PM_CLB_EMPTY_CYC ] = { 13, 12, 12, 13, -1, -1 },
	[ POWER5p_PME_PM_L2SB_ST_HIT ] = { 158, 155, 154, 153, -1, -1 },
	[ POWER5p_PME_PM_MEM_NONSPEC_RD_CANCEL ] = { 254, 253, 249, 351, -1, -1 },
	[ POWER5p_PME_PM_BR_PRED_CR_TA ] = { -1, -1, -1, 12, -1, -1 },
	[ POWER5p_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { 289, 289, 283, 286, -1, -1 },
	[ POWER5p_PME_PM_MRK_LSU_FLUSH_ULD ] = { 296, -1, -1, 295, -1, -1 },
	[ POWER5p_PME_PM_INST_DISP_ATTEMPT ] = { 116, 114, 354, 254, -1, -1 },
	[ POWER5p_PME_PM_INST_FROM_RMEM ] = { -1, -1, -1, 117, -1, -1 },
	[ POWER5p_PME_PM_ST_REF_L1_LSU0 ] = { 328, 324, 323, 319, -1, -1 },
	[ POWER5p_PME_PM_LSU0_DERAT_MISS ] = { 206, 202, 202, 201, -1, -1 },
	[ POWER5p_PME_PM_FPU_STALL3 ] = { -1, 87, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_L2SB_RCLD_DISP ] = { 146, 143, 142, 141, -1, -1 },
	[ POWER5p_PME_PM_BR_PRED_CR ] = { 358, 352, 11, 347, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L2 ] = { 269, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU0_FLUSH_SRQ ] = { 208, 204, 204, 203, -1, -1 },
	[ POWER5p_PME_PM_FAB_PNtoNN_DIRECT ] = { 47, 46, 46, 48, -1, -1 },
	[ POWER5p_PME_PM_IOPS_CMPL ] = { 115, 113, 112, 112, -1, -1 },
	[ POWER5p_PME_PM_L2SA_RCST_DISP ] = { 134, 131, 130, 129, -1, -1 },
	[ POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_OTHER ] = { 136, 133, 132, 131, -1, -1 },
	[ POWER5p_PME_PM_L2SC_SHR_INV ] = { 172, 169, 168, 167, -1, -1 },
	[ POWER5p_PME_PM_SNOOP_RETRY_AB_COLLISION ] = { 361, 355, 353, 350, -1, -1 },
	[ POWER5p_PME_PM_FAB_PNtoVN_SIDECAR ] = { 50, 49, 49, 51, -1, -1 },
	[ POWER5p_PME_PM_LSU0_REJECT_LMQ_FULL ] = { 214, 210, 210, 209, -1, -1 },
	[ POWER5p_PME_PM_LSU_LMQ_S0_ALLOC ] = { 239, 236, 234, 233, -1, -1 },
	[ POWER5p_PME_PM_SNOOP_PW_RETRY_RQ ] = { 315, 310, 308, 306, -1, -1 },
	[ POWER5p_PME_PM_DTLB_REF ] = { 33, 32, 32, 33, -1, -1 },
	[ POWER5p_PME_PM_PTEG_FROM_L3 ] = { 307, -1, 301, -1, -1, -1 },
	[ POWER5p_PME_PM_FAB_M1toVNorNN_SIDECAR_EMPTY ] = { 44, 43, 43, 45, -1, -1 },
	[ POWER5p_PME_PM_LSU_SRQ_EMPTY_CYC ] = { -1, -1, -1, 239, -1, -1 },
	[ POWER5p_PME_PM_FPU1_STF ] = { 84, 83, 83, 85, -1, -1 },
	[ POWER5p_PME_PM_LSU_LMQ_S0_VALID ] = { 240, 237, 235, 234, -1, -1 },
	[ POWER5p_PME_PM_GCT_USAGE_00to59_CYC ] = { 99, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_FPU_FMOV_FEST ] = { -1, -1, 84, -1, -1, -1 },
	[ POWER5p_PME_PM_DATA_FROM_L2MISS ] = { -1, -1, 19, -1, -1, -1 },
	[ POWER5p_PME_PM_XER_MAP_FULL_CYC ] = { 357, 351, 350, 346, -1, -1 },
	[ POWER5p_PME_PM_GRP_DISP_BLK_SB_CYC ] = { 103, 103, 101, 102, -1, -1 },
	[ POWER5p_PME_PM_FLUSH_SB ] = { 55, 54, 54, 56, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L375_SHR ] = { -1, -1, 269, -1, -1, -1 },
	[ POWER5p_PME_PM_MRK_GRP_CMPL ] = { -1, -1, -1, 278, -1, -1 },
	[ POWER5p_PME_PM_SUSPENDED ] = { 330, 326, 325, 321, -1, -1 },
	[ POWER5p_PME_PM_SNOOP_RD_RETRY_QFULL ] = { 317, 312, 310, 308, -1, -1 },
	[ POWER5p_PME_PM_GRP_IC_MISS_BR_REDIR_NONSPEC ] = { 107, 101, 99, 101, -1, -1 },
	[ POWER5p_PME_PM_DATA_FROM_L35_SHR ] = { 22, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_L3SB_MOD_INV ] = { 186, 183, 182, 181, -1, -1 },
	[ POWER5p_PME_PM_STCX_FAIL ] = { 325, 320, 318, 316, -1, -1 },
	[ POWER5p_PME_PM_LD_MISS_L1_LSU1 ] = { 200, 199, 199, 198, -1, -1 },
	[ POWER5p_PME_PM_GRP_DISP ] = { -1, 102, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_DC_PREF_DST ] = { 28, 27, 27, 28, -1, -1 },
	[ POWER5p_PME_PM_FPU1_DENORM ] = { 73, 72, 72, 74, -1, -1 },
	[ POWER5p_PME_PM_FPU0_FPSCR ] = { 65, 64, 64, 66, -1, -1 },
	[ POWER5p_PME_PM_DATA_FROM_L2 ] = { 18, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_ADDR ] = { 131, 128, 127, 126, -1, -1 },
	[ POWER5p_PME_PM_FPU_1FLOP ] = { 85, -1, -1, 86, -1, -1 },
	[ POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_OTHER ] = { 164, 161, 160, 159, -1, -1 },
	[ POWER5p_PME_PM_FPU0_FSQRT ] = { 67, 66, 66, 68, -1, -1 },
	[ POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_RC_FULL ] = { 169, 166, 165, 164, -1, -1 },
	[ POWER5p_PME_PM_LD_REF_L1 ] = { 201, -1, -1, 196, -1, -1 },
	[ POWER5p_PME_PM_INST_FROM_L1 ] = { -1, 116, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_TLBIE_HELD ] = { 355, 350, 349, 344, -1, -1 },
	[ POWER5p_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { 27, 26, 26, 27, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L25_MOD_CYC ] = { -1, -1, -1, 264, -1, -1 },
	[ POWER5p_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { 293, 293, 287, 290, -1, -1 },
	[ POWER5p_PME_PM_MEM_RQ_DISP_Q0to3 ] = { 260, 256, 257, 256, -1, -1 },
	[ POWER5p_PME_PM_ST_REF_L1_LSU1 ] = { 329, 325, 324, 320, -1, -1 },
	[ POWER5p_PME_PM_MRK_LD_MISS_L1 ] = { 285, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_L1_WRITE_CYC ] = { 127, 124, 123, 122, -1, -1 },
	[ POWER5p_PME_PM_L2SC_ST_REQ ] = { 175, 172, 171, 170, -1, -1 },
	[ POWER5p_PME_PM_CMPLU_STALL_FDIV ] = { -1, 15, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_THRD_SEL_OVER_CLB_EMPTY ] = { 348, 343, 341, 337, -1, -1 },
	[ POWER5p_PME_PM_BR_MPRED_CR ] = { 10, 9, 9, 10, -1, -1 },
	[ POWER5p_PME_PM_L3SB_MOD_TAG ] = { 187, 184, 183, 182, -1, -1 },
	[ POWER5p_PME_PM_MRK_DATA_FROM_L2MISS ] = { -1, -1, 266, -1, -1, -1 },
	[ POWER5p_PME_PM_LSU_REJECT_SRQ ] = { 245, -1, -1, -1, -1, -1 },
	[ POWER5p_PME_PM_LD_MISS_L1 ] = { -1, -1, 195, -1, -1, -1 },
	[ POWER5p_PME_PM_INST_FROM_PREF ] = { -1, -1, 118, -1, -1, -1 },
	[ POWER5p_PME_PM_STCX_PASS ] = { 326, 321, 319, 317, -1, -1 },
	[ POWER5p_PME_PM_DC_INV_L2 ] = { 26, 25, 25, 26, -1, -1 },
	[
POWER5p_PME_PM_LSU_SRQ_FULL_CYC ] = { 246, 244, 241, 240, -1, -1 }, [ POWER5p_PME_PM_FPU_FIN ] = { -1, -1, -1, 88, -1, -1 }, [ POWER5p_PME_PM_LSU_SRQ_STFWD ] = { -1, 247, -1, -1, -1, -1 }, [ POWER5p_PME_PM_L2SA_SHR_MOD ] = { 141, 138, 137, 136, -1, -1 }, [ POWER5p_PME_PM_0INST_CLB_CYC ] = { 0, 0, 0, 0, -1, -1 }, [ POWER5p_PME_PM_FXU0_FIN ] = { 94, 91, 91, 94, -1, -1 }, [ POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_RC_FULL ] = { 153, 150, 149, 148, -1, -1 }, [ POWER5p_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { -1, 327, -1, -1, -1, -1 }, [ POWER5p_PME_PM_PMC5_OVERFLOW ] = { 302, -1, -1, -1, -1, -1 }, [ POWER5p_PME_PM_FPU0_FDIV ] = { 60, 59, 59, 61, -1, -1 }, [ POWER5p_PME_PM_PTEG_FROM_L375_SHR ] = { -1, -1, 303, -1, -1, -1 }, [ POWER5p_PME_PM_HV_CYC ] = { -1, 107, -1, -1, -1, -1 }, [ POWER5p_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY ] = { 138, 135, 134, 133, -1, -1 }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_0_CYC ] = { 341, 336, 334, 330, -1, -1 }, [ POWER5p_PME_PM_LR_CTR_MAP_FULL_CYC ] = { 204, 200, 200, 199, -1, -1 }, [ POWER5p_PME_PM_L3SB_SHR_INV ] = { 189, 186, 185, 184, -1, -1 }, [ POWER5p_PME_PM_DATA_FROM_RMEM ] = { 24, -1, -1, 24, -1, -1 }, [ POWER5p_PME_PM_DATA_FROM_L275_MOD ] = { 20, -1, -1, 22, -1, -1 }, [ POWER5p_PME_PM_LSU0_REJECT_SRQ ] = { 216, 212, 212, 211, -1, -1 }, [ POWER5p_PME_PM_LSU1_DERAT_MISS ] = { 219, 215, 215, 214, -1, -1 }, [ POWER5p_PME_PM_MRK_LSU_FIN ] = { -1, -1, -1, 293, -1, -1 }, [ POWER5p_PME_PM_DTLB_MISS_16M ] = { -1, -1, 31, -1, -1, -1 }, [ POWER5p_PME_PM_LSU0_FLUSH_UST ] = { 210, 206, 206, 205, -1, -1 }, [ POWER5p_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY ] = { 154, 151, 150, 149, -1, -1 }, [ POWER5p_PME_PM_L2SC_MOD_TAG ] = { 161, 158, 157, 156, -1, -1 } }; static const unsigned long long power5p_group_vecs[][POWER5p_NUM_GROUP_VEC] = { [ POWER5p_PME_PM_LSU_REJECT_RELOAD_CDF ] = { 0x0000000000080000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU1_SINGLE ] = { 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL }, [ 
POWER5p_PME_PM_L3SB_REF ] = { 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_3or4_CYC ] = { 0x0000000000000000ULL, 0x0000000200000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_INST_FROM_L275_SHR ] = { 0x0200000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L375_MOD ] = { 0x0000000000000000ULL, 0x2000000000000000ULL, 0x0080000000000000ULL }, [ POWER5p_PME_PM_DTLB_MISS_4K ] = { 0x0001000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_CLB_FULL_CYC ] = { 0x0000000000001000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_ST_CMPL ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000100000000008ULL }, [ POWER5p_PME_PM_LSU_FLUSH_LRQ_FULL ] = { 0x0000000010000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L275_SHR ] = { 0x0000000000000000ULL, 0x0400000000000000ULL, 0x0000000800000000ULL }, [ POWER5p_PME_PM_1INST_CLB_CYC ] = { 0x0000000000002000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MEM_SPEC_RD_CANCEL ] = { 0x0000000000000000ULL, 0x0001000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DTLB_MISS_16M ] = { 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000028000000000ULL }, [ POWER5p_PME_PM_FPU_FDIV ] = { 0x0000000000000000ULL, 0x0000000000020000ULL, 0x0000000000410000ULL }, [ POWER5p_PME_PM_FPU_SINGLE ] = { 0x0000000000000000ULL, 0x0000000000100000ULL, 0x0000000000008000ULL }, [ POWER5p_PME_PM_FPU0_FMA ] = { 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000001000ULL }, [ POWER5p_PME_PM_SLB_MISS ] = { 0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU1_FLUSH_LRQ ] = { 0x0000000000800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SA_ST_HIT ] = { 0x0000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DTLB_MISS ] = { 
0x0002100000000000ULL, 0x0000000000000000ULL, 0x0000000004000080ULL }, [ POWER5p_PME_PM_BR_PRED_TA ] = { 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0000000000000400ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L375_MOD_CYC ] = { 0x0000000000000000ULL, 0x2000000000000000ULL, 0x0000004000000000ULL }, [ POWER5p_PME_PM_CMPLU_STALL_FXU ] = { 0x0000000080000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_EXT_INT ] = { 0x0000000000000000ULL, 0x0002000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0020000000000000ULL }, [ POWER5p_PME_PM_MRK_ST_GPS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000200000000010ULL }, [ POWER5p_PME_PM_LSU1_LDF ] = { 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FAB_CMD_ISSUED ] = { 0x0000000000000000ULL, 0x0000010000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU0_SRQ_STFWD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_CR_MAP_FULL_CYC ] = { 0x0000000800000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_RC_FULL ] = { 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_LSU0_FLUSH_ULD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0008000000000000ULL }, [ POWER5p_PME_PM_LSU_FLUSH_SRQ_FULL ] = { 0x0000000010000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MEM_RQ_DISP_Q16to19 ] = { 0x0000000000000000ULL, 0x0000800000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FLUSH_IMBAL ] = { 0x0000000000108000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC ] = { 0x0000000000000000ULL, 0x0000000400000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DATA_FROM_L35_MOD ] = { 0x0040000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MEM_HI_PRIO_WR_CMPL ] 
= { 0x0000000000000000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU1_FDIV ] = { 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MEM_RQ_DISP ] = { 0x0000000000000000ULL, 0x0001000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU0_FRSP_FCONV ] = { 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000001000ULL }, [ POWER5p_PME_PM_LWSYNC_HELD ] = { 0x0000000000020000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FXU_FIN ] = { 0x0000000000000000ULL, 0x0000000040000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DSLB_MISS ] = { 0x0000400000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DATA_FROM_L275_SHR ] = { 0x0020000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FXLS1_FULL_CYC ] = { 0x0000000400000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_SEL_T0 ] = { 0x0000000000000000ULL, 0x0000002000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_PTEG_RELOAD_VALID ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_STCX_FAIL ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0020000000000010ULL }, [ POWER5p_PME_PM_LSU_LMQ_LHR_MERGE ] = { 0x0000000000000200ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_2INST_CLB_CYC ] = { 0x0000000000000008ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FAB_PNtoVN_DIRECT ] = { 0x0000000000000000ULL, 0x0000040000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_PTEG_FROM_L2MISS ] = { 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_CMPLU_STALL_LSU ] = { 0x0000000020000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DSLB_MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000040000000003ULL }, [ POWER5p_PME_PM_LSU_FLUSH_ULD ] = { 0x0000000002000000ULL, 0x0000000000000000ULL, 
0x0000000000000000ULL }, [ POWER5p_PME_PM_PTEG_FROM_LMEM ] = { 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_BRU_FIN ] = { 0x0000000000000000ULL, 0x0040000000000000ULL, 0x0090000000000000ULL }, [ POWER5p_PME_PM_MEM_WQ_DISP_WRITE ] = { 0x0000000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L275_MOD_CYC ] = { 0x0000000000000000ULL, 0x1000000000000000ULL, 0x0000002000000000ULL }, [ POWER5p_PME_PM_LSU1_NCLD ] = { 0x0000002000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_OTHER ] = { 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_SNOOP_PW_RETRY_WQ_PWQ ] = { 0x0000000000000000ULL, 0x0000800000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU1_FULL_CYC ] = { 0x0000000400000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPR_MAP_FULL_CYC ] = { 0x0000001000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L3SA_ALL_BUSY ] = { 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_3INST_CLB_CYC ] = { 0x0000000000000000ULL, 0x0000000080000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MEM_PWQ_DISP_Q2or3 ] = { 0x0000000000000000ULL, 0x0008000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SA_SHR_INV ] = { 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRESH_TIMEO ] = { 0x0000000000000000ULL, 0x0010000000000000ULL, 0x0000000020000000ULL }, [ POWER5p_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL ] = { 0x0000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_SEL_OVER_GCT_IMBAL ] = { 0x0000000000000000ULL, 0x0000004000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU_FSQRT ] = { 0x0000000000000000ULL, 0x0000000000040000ULL, 0x0000000000410000ULL }, [ POWER5p_PME_PM_PMC1_OVERFLOW ] = { 0x0000000000000000ULL, 
0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0010000000000000ULL }, [ POWER5p_PME_PM_L3SC_SNOOP_RETRY ] = { 0x0000000000000000ULL, 0x0000000000010000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DATA_TABLEWALK_CYC ] = { 0x0000100000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_PRIO_6_CYC ] = { 0x0000000000000000ULL, 0x0000000200000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU_FEST ] = { 0x0000000000000000ULL, 0x0000000000020000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FAB_M1toP1_SIDECAR_EMPTY ] = { 0x0000000000000000ULL, 0x0000080000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_RMEM ] = { 0x0000000000000000ULL, 0x0400000000000000ULL, 0x0000800000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L35_MOD_CYC ] = { 0x0000000000000008ULL, 0x0200000000000000ULL, 0x0000000400000000ULL }, [ POWER5p_PME_PM_MEM_PWQ_DISP ] = { 0x0000000000000000ULL, 0x0008000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FAB_P1toM1_SIDECAR_EMPTY ] = { 0x0000000000000000ULL, 0x0000020000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LD_MISS_L1_LSU0 ] = { 0x0000400000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_SNOOP_PARTIAL_RTRY_QFULL ] = { 0x0000000000000000ULL, 0x0000100000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU1_STALL3 ] = { 0x0000000000000000ULL, 0x0000000001000000ULL, 0x0000000000000800ULL }, [ POWER5p_PME_PM_GCT_USAGE_80to99_CYC ] = { 0x0000000000000040ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_WORK_HELD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_INST_CMPL ] = { 0x0000000000000001ULL, 0x0000000000000000ULL, 0x0ffffffff9880000ULL }, [ POWER5p_PME_PM_LSU1_FLUSH_UST ] = { 0x0000000008000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FXU_IDLE ] = { 0x0000000000000000ULL, 
0x0000000020000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU0_FLUSH_ULD ] = { 0x0000000004000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU1_REJECT_LMQ_FULL ] = { 0x0000000000040000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GRP_DISP_REJECT ] = { 0x0000000000000004ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_PTEG_FROM_L25_SHR ] = { 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SA_MOD_INV ] = { 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FAB_CMD_RETRIED ] = { 0x0000000000000000ULL, 0x0000010000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L3SA_SHR_INV ] = { 0x0000000000000000ULL, 0x0000000000000100ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL ] = { 0x0000000000000000ULL, 0x0000000000000010ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_ADDR ] = { 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_RC_FULL ] = { 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_PTEG_FROM_L375_MOD ] = { 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_LSU1_FLUSH_UST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0004000000000000ULL }, [ POWER5p_PME_PM_BR_ISSUED ] = { 0x0000000002040000ULL, 0x0000000000000000ULL, 0x0000000000000400ULL }, [ POWER5p_PME_PM_MRK_GRP_BR_REDIR ] = { 0x0000000000000000ULL, 0x0000000040000000ULL, 0x0800000000000000ULL }, [ POWER5p_PME_PM_EE_OFF ] = { 0x0000000000000000ULL, 0x0000080000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_IERAT_XLATE_WR_LP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DTLB_REF_64K ] = { 0x0000800000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MEM_RQ_DISP_Q4to7 ] = { 
0x0000000000000000ULL, 0x0002000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MEM_FAST_PATH_RD_DISP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_INST_FROM_L3 ] = { 0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_ITLB_MISS ] = { 0x0000000000200000ULL, 0x0000000000000000ULL, 0x0000000000000080ULL }, [ POWER5p_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { 0x0000000000000000ULL, 0x0000000020000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DTLB_REF_4K ] = { 0x0000800000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FXLS_FULL_CYC ] = { 0x0000000000000000ULL, 0x0000000040000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GRP_DISP_VALID ] = { 0x0000000000000004ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU_FLUSH_UST ] = { 0x0000000002100000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FXU1_FIN ] = { 0x0000000000000000ULL, 0x0000000080000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_PRIO_4_CYC ] = { 0x0000000000000000ULL, 0x0000000100000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L35_MOD ] = { 0x0000000000000000ULL, 0x0200000000000000ULL, 0x0000000400000000ULL }, [ POWER5p_PME_PM_4INST_CLB_CYC ] = { 0x0000000000000000ULL, 0x0000000080000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DTLB_REF_16M ] = { 0x0000000000000000ULL, 0x4000000000000000ULL, 0x0000050000000002ULL }, [ POWER5p_PME_PM_INST_FROM_L375_MOD ] = { 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GRP_CMPL ] = { 0x0000000000000002ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_ADDR ] = { 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU1_1FLOP ] = { 0x0000000000000000ULL, 0x0000000008000000ULL, 0x0000000000002000ULL }, [ POWER5p_PME_PM_FPU_FRSP_FCONV ] = { 0x0000000000000000ULL, 
0x0000000000040000ULL, 0x0000000000010000ULL }, [ POWER5p_PME_PM_L3SC_REF ] = { 0x0000000000000000ULL, 0x0000000000010000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_5INST_CLB_CYC ] = { 0x0000000000000010ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_L2MISS_BOTH_CYC ] = { 0x0000000000000000ULL, 0x0000001000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MEM_PW_GATH ] = { 0x0000000000000000ULL, 0x0008000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DTLB_REF_16G ] = { 0x0000800000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FAB_DCLAIM_ISSUED ] = { 0x0000000000000000ULL, 0x0000010000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FAB_PNtoNN_SIDECAR ] = { 0x0000000000000000ULL, 0x0000040000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GRP_IC_MISS ] = { 0x0000010000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_INST_FROM_L35_SHR ] = { 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU_LMQ_FULL_CYC ] = { 0x0000000200000400ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L2_CYC ] = { 0x0000000000000000ULL, 0x0080000000000000ULL, 0x0008000000000000ULL }, [ POWER5p_PME_PM_LSU_SRQ_SYNC_CYC ] = { 0x0000000000000100ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU0_BUSY_REJECT ] = { 0x0000004000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU_REJECT_ERAT_MISS ] = { 0x0000000000008000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { 0x0000000000000000ULL, 0x0400000000000000ULL, 0x0000000800000000ULL }, [ POWER5p_PME_PM_DATA_FROM_L375_SHR ] = { 0x0040000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_PTEG_FROM_L25_MOD ] = { 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU0_FMOV_FEST ] = { 0x0000000000000000ULL, 
0x0000000000400000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_PRIO_7_CYC ] = { 0x0000000000000000ULL, 0x0000000100000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU1_FLUSH_SRQ ] = { 0x0000000001000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LD_REF_L1_LSU0 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SC_RCST_DISP ] = { 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_CMPLU_STALL_DIV ] = { 0x0000000080000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MEM_RQ_DISP_Q12to15 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_INST_FROM_L375_SHR ] = { 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_ST_REF_L1 ] = { 0x0004200000000000ULL, 0x0000000000000000ULL, 0x00000000092040e0ULL }, [ POWER5p_PME_PM_L3SB_ALL_BUSY ] = { 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FAB_P1toVNorNN_SIDECAR_EMPTY ] = { 0x0000000000000000ULL, 0x0000020000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L275_SHR_CYC ] = { 0x0000000000000000ULL, 0x1400000000000000ULL, 0x0000002800000000ULL }, [ POWER5p_PME_PM_FAB_HOLDtoNN_EMPTY ] = { 0x0000000000000000ULL, 0x0000080000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DATA_FROM_LMEM ] = { 0x0018000000000000ULL, 0x0000000000000000ULL, 0x0000000010000140ULL }, [ POWER5p_PME_PM_RUN_CYC ] = { 0xffffffffffffffffULL, 0xffffffffffffffffULL, 0x0fffffffffffffffULL }, [ POWER5p_PME_PM_PTEG_FROM_RMEM ] = { 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SC_RCLD_DISP ] = { 0x0000000000000000ULL, 0x0000000000000020ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU_LRQ_S0_VALID ] = { 0x0000000000000080ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU0_LDF ] = { 0x0000000000000000ULL, 
0x0000000012000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_PMC3_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_IMR_RELOAD ] = { 0x0000000000000000ULL, 0x0010000000000000ULL, 0x0000400200000000ULL }, [ POWER5p_PME_PM_MRK_GRP_TIMEO ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000200000000010ULL }, [ POWER5p_PME_PM_ST_MISS_L1 ] = { 0x0004200000000000ULL, 0x0000000000000000ULL, 0x0000000008100100ULL }, [ POWER5p_PME_PM_STOP_COMPLETION ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU_BUSY_REJECT ] = { 0x0000000000002000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_ISLB_MISS ] = { 0x0000400000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_CYC ] = { 0x0002000040000003ULL, 0x0000008000000000ULL, 0x000000001eb40201ULL }, [ POWER5p_PME_PM_THRD_ONE_RUN_CYC ] = { 0x0000000000000000ULL, 0x0000001000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GRP_BR_REDIR_NONSPEC ] = { 0x0000080000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU1_SRQ_STFWD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L3SC_MOD_INV ] = { 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2_PREF ] = { 0x0000000000006000ULL, 0x0000000000000000ULL, 0x0000000000000200ULL }, [ POWER5p_PME_PM_GCT_NOSLOT_BR_MPRED ] = { 0x0000000000000020ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L25_MOD ] = { 0x0000000000000000ULL, 0x0080000000000000ULL, 0x0080000100000000ULL }, [ POWER5p_PME_PM_L2SB_ST_REQ ] = { 0x0000000000000000ULL, 0x0000000000000010ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SB_MOD_INV ] = { 0x0000000000000000ULL, 0x0000000000001000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_L1_RELOAD_VALID ] = { 0x0000000000000000ULL, 0x0040000000000000ULL, 0x0000000080000000ULL 
}, [ POWER5p_PME_PM_L3SB_HIT ] = { 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SB_SHR_MOD ] = { 0x0000000000000000ULL, 0x0000000000001000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_EE_OFF_EXT_INT ] = { 0x0000000000000000ULL, 0x0001000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_1PLUS_PPC_CMPL ] = { 0x0000000000000002ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SC_SHR_MOD ] = { 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_PMC6_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_IC_PREF_INSTALL ] = { 0x0000008000001000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU_LRQ_FULL_CYC ] = { 0x0000000200000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_TLB_MISS ] = { 0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000200000ULL }, [ POWER5p_PME_PM_GCT_FULL_CYC ] = { 0x0000000000000040ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FXU_BUSY ] = { 0x0000000000000000ULL, 0x0000000020000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L3_CYC ] = { 0x0000000000000000ULL, 0x0200000000000000ULL, 0x0000000400000000ULL }, [ POWER5p_PME_PM_LSU_REJECT_LMQ_FULL ] = { 0x0000000000008000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU_SRQ_S0_ALLOC ] = { 0x0000000000000100ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GRP_MRK ] = { 0x0000000020000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_INST_FROM_L25_SHR ] = { 0x0200000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DC_PREF_STREAM_ALLOC ] = { 0x0000000000000800ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU1_FIN ] = { 0x0000000000000000ULL, 0x0000000000080000ULL, 0x000000000000a000ULL }, [ POWER5p_PME_PM_BR_MPRED_TA ] = { 
0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DTLB_REF_64K ] = { 0x0000000000000000ULL, 0x4000000000000000ULL, 0x0000010000000000ULL }, [ POWER5p_PME_PM_RUN_INST_CMPL ] = { 0xffffffffffffffffULL, 0xffffffffffffffffULL, 0x0fffffffffffffffULL }, [ POWER5p_PME_PM_CRQ_FULL_CYC ] = { 0x0000000800000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SA_RCLD_DISP ] = { 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_SNOOP_WR_RETRY_QFULL ] = { 0x0000000000000000ULL, 0x0000100000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DTLB_REF_4K ] = { 0x0000000000000000ULL, 0x4000000000000000ULL, 0x0002040000000002ULL }, [ POWER5p_PME_PM_LSU_SRQ_S0_VALID ] = { 0x0000000000000100ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU0_FLUSH_LRQ ] = { 0x0000000000800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_INST_FROM_L275_MOD ] = { 0x0200000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GCT_EMPTY_CYC ] = { 0x0000000000000002ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LARX_LSU0 ] = { 0x0000000200000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_5or6_CYC ] = { 0x0000000000000000ULL, 0x0000000200000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_SNOOP_RETRY_1AHEAD ] = { 0x0000000000000000ULL, 0x0000200000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU1_FSQRT ] = { 0x0000000000000000ULL, 0x0000000000200000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0040000000000000ULL }, [ POWER5p_PME_PM_MRK_FPU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0400200000000010ULL }, [ POWER5p_PME_PM_THRD_PRIO_5_CYC ] = { 0x0000000000000000ULL, 0x0000000400000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_LMEM ] = { 
0x0000000000000000ULL, 0x0800000000000000ULL, 0x0000001000000000ULL }, [ POWER5p_PME_PM_SNOOP_TLBIE ] = { 0x0000000000000000ULL, 0x0000002000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU1_FRSP_FCONV ] = { 0x0000000000000000ULL, 0x0000000004800000ULL, 0x0000000000001000ULL }, [ POWER5p_PME_PM_DTLB_MISS_16G ] = { 0x0001000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L3SB_SNOOP_RETRY ] = { 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FAB_VBYPASS_EMPTY ] = { 0x0000000000000000ULL, 0x0000020000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L275_MOD ] = { 0x0000000000000000ULL, 0x1000000000000000ULL, 0x0001000000000000ULL }, [ POWER5p_PME_PM_L2SB_RCST_DISP ] = { 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_6INST_CLB_CYC ] = { 0x0000000000000010ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FLUSH ] = { 0x0008000000080000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SC_MOD_INV ] = { 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU_DENORM ] = { 0x0000000000000000ULL, 0x0000000000080000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L3SC_HIT ] = { 0x0000000000000000ULL, 0x0000000000010000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_SNOOP_WR_RETRY_RQ ] = { 0x0000000000000000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU1_REJECT_SRQ ] = { 0x0000000000004000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L3SC_ALL_BUSY ] = { 0x0000000000000000ULL, 0x0000000000010000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_IC_PREF_REQ ] = { 0x0000008000000000ULL, 0x0000000000000000ULL, 0x0000000000000200ULL }, [ POWER5p_PME_PM_MRK_GRP_IC_MISS ] = { 0x0000000000000000ULL, 0x0040000000000000ULL, 0x0000000080000000ULL }, [ POWER5p_PME_PM_GCT_NOSLOT_IC_MISS ] = { 0x0000000000000020ULL, 
0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L3 ] = { 0x0000000000000000ULL, 0x0200000000000000ULL, 0x0000400000000000ULL }, [ POWER5p_PME_PM_GCT_NOSLOT_SRQ_FULL ] = { 0x0000000000000020ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { 0x0000000040000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_SEL_OVER_ISU_HOLD ] = { 0x0000000000000000ULL, 0x0000008000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU_FLUSH_LRQ ] = { 0x0000000000400000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_PRIO_2_CYC ] = { 0x0000000000000000ULL, 0x0000000400000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L3SA_MOD_INV ] = { 0x0000000000000000ULL, 0x0000000000000100ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU_FLUSH_SRQ ] = { 0x0000000000400000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { 0x0000000000000010ULL, 0x0000000000000000ULL, 0x0200000000000000ULL }, [ POWER5p_PME_PM_L3SA_REF ] = { 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL ] = { 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU0_STALL3 ] = { 0x0000000000000000ULL, 0x0000000001000000ULL, 0x0000000000000800ULL }, [ POWER5p_PME_PM_TB_BIT_TRANS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GPR_MAP_FULL_CYC ] = { 0x0000000800000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_LSU_FLUSH_LRQ ] = { 0x0000000010000000ULL, 0x0000000000000000ULL, 0x0100000000000000ULL }, [ POWER5p_PME_PM_FPU0_STF ] = { 0x0000000000000000ULL, 0x0000000010000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DTLB_MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000800000000001ULL }, [ POWER5p_PME_PM_FPU1_FMA ] = { 0x0000000000000000ULL, 
0x0000000004000000ULL, 0x0000000000001000ULL }, [ POWER5p_PME_PM_L2SA_MOD_TAG ] = { 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU1_FLUSH_ULD ] = { 0x0000000004000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_INST_FIN ] = { 0x0000000000000000ULL, 0x0020000000000000ULL, 0x0800000040000000ULL }, [ POWER5p_PME_PM_MRK_LSU0_FLUSH_UST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0008000000000000ULL }, [ POWER5p_PME_PM_FPU0_FULL_CYC ] = { 0x0000000400000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU_LRQ_S0_ALLOC ] = { 0x0000000000000080ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_LSU1_FLUSH_ULD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0004000000000000ULL }, [ POWER5p_PME_PM_MRK_DTLB_REF ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0001000000000001ULL }, [ POWER5p_PME_PM_BR_UNCOND ] = { 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0000000000000400ULL }, [ POWER5p_PME_PM_THRD_SEL_OVER_L2MISS ] = { 0x0000000000000000ULL, 0x0000008000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SB_SHR_INV ] = { 0x0000000000000000ULL, 0x0000000000001000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MEM_LO_PRIO_WR_CMPL ] = { 0x0000000000000000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DTLB_MISS_64K ] = { 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000028000000000ULL }, [ POWER5p_PME_PM_MRK_ST_MISS_L1 ] = { 0x0000000000000000ULL, 0x0020000000000000ULL, 0x0040000000000008ULL }, [ POWER5p_PME_PM_L3SC_MOD_TAG ] = { 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GRP_DISP_SUCCESS ] = { 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_1or2_CYC ] = { 0x0000000000000000ULL, 0x0000000100000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { 0x0000004000000000ULL, 
0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU_DERAT_MISS ] = { 0x0000200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MEM_WQ_DISP_Q8to15 ] = { 0x0000000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU0_SINGLE ] = { 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_PRIO_1_CYC ] = { 0x0000000000000000ULL, 0x0000000800000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_OTHER ] = { 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_SNOOP_RD_RETRY_RQ ] = { 0x0000000000000000ULL, 0x0000200000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FAB_HOLDtoVN_EMPTY ] = { 0x0000000000000000ULL, 0x0000020000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU1_FEST ] = { 0x0000000000000000ULL, 0x0000000000200000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_SNOOP_DCLAIM_RETRY_QFULL ] = { 0x0000000000000000ULL, 0x0000100000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L25_SHR_CYC ] = { 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0000400000000000ULL }, [ POWER5p_PME_PM_MRK_ST_CMPL_INT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000080000000004ULL }, [ POWER5p_PME_PM_FLUSH_BR_MPRED ] = { 0x0000080000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DTLB_MISS_16G ] = { 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000028000000000ULL }, [ POWER5p_PME_PM_FPU_STF ] = { 0x0000000000000000ULL, 0x0000000000100000ULL, 0x00000000020a8000ULL }, [ POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_ADDR ] = { 0x0000000000000000ULL, 0x0000000000000004ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_CMPLU_STALL_FPU ] = { 0x0000000100000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC ] = { 0x0000000000000000ULL, 0x0000000400000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GCT_NOSLOT_CYC ] = { 
0x0000000000000020ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { 0x0000000000000000ULL, 0x0000000020000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_PTEG_FROM_L35_SHR ] = { 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DTLB_REF_16G ] = { 0x0000000000000000ULL, 0x4000000000000000ULL, 0x0000010000000000ULL }, [ POWER5p_PME_PM_MRK_LSU_FLUSH_UST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0100100000000008ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L25_SHR ] = { 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0000000200000000ULL }, [ POWER5p_PME_PM_L3SA_HIT ] = { 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L35_SHR ] = { 0x0000000000000000ULL, 0x0800000000000000ULL, 0x0000001000000000ULL }, [ POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_ADDR ] = { 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_IERAT_XLATE_WR ] = { 0x0000008000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SA_ST_REQ ] = { 0x0000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_INST_FROM_LMEM ] = { 0x0100000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_SEL_T1 ] = { 0x0000000000000000ULL, 0x0000002000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { 0x0000004000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L35_SHR_CYC ] = { 0x0000000000000000ULL, 0x0800000000000000ULL, 0x0000800000000000ULL }, [ POWER5p_PME_PM_FPU0_1FLOP ] = { 0x0000000000000000ULL, 0x0000000008000000ULL, 0x0000000000002000ULL }, [ POWER5p_PME_PM_PTEG_FROM_L2 ] = { 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MEM_PW_CMPL ] = { 0x0000000000000000ULL, 0x0008000000000000ULL, 0x0000000000000000ULL }, [ 
POWER5p_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC ] = { 0x0000000000000000ULL, 0x0000000800000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_OTHER ] = { 0x0000000000000000ULL, 0x0000000000000004ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DTLB_MISS_4K ] = { 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0005000000000000ULL }, [ POWER5p_PME_PM_FPU0_FIN ] = { 0x0000000000000000ULL, 0x0000000008080000ULL, 0x000000000000a800ULL }, [ POWER5p_PME_PM_L3SC_SHR_INV ] = { 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GRP_BR_REDIR ] = { 0x0000080000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_RC_FULL ] = { 0x0000000000000000ULL, 0x0000000000000020ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_LSU_FLUSH_SRQ ] = { 0x0000000000008000ULL, 0x0000000000000000ULL, 0x0100000000000000ULL }, [ POWER5p_PME_PM_PTEG_FROM_L275_SHR ] = { 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_RC_FULL ] = { 0x0000000000000000ULL, 0x0000000000000004ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_SNOOP_RD_RETRY_WQ ] = { 0x0000000000000000ULL, 0x0000200000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FAB_DCLAIM_RETRIED ] = { 0x0000000000000000ULL, 0x0000010000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU0_NCLD ] = { 0x0000002000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU1_BUSY_REJECT ] = { 0x0000004000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FXLS0_FULL_CYC ] = { 0x0000000400000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DTLB_REF_16M ] = { 0x0000800000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU0_FEST ] = { 0x0000000000000000ULL, 0x0000000000200000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GCT_USAGE_60to79_CYC ] = { 0x0000000000000040ULL, 0x0000000000000000ULL, 
0x0000000000000000ULL }, [ POWER5p_PME_PM_DATA_FROM_L25_MOD ] = { 0x0020000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_ADDR ] = { 0x0000000000000000ULL, 0x0000000000000020ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU0_REJECT_ERAT_MISS ] = { 0x0000000000020000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DATA_FROM_L375_MOD ] = { 0x0040000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { 0x0000000000000600ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DTLB_MISS_64K ] = { 0x0001000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU0_REJECT_RELOAD_CDF ] = { 0x0000000000010000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_0INST_FETCH ] = { 0x0100008000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU1_REJECT_RELOAD_CDF ] = { 0x0000000000010000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MEM_WQ_DISP_Q0to7 ] = { 0x0000000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L1_PREF ] = { 0x0000000000001000ULL, 0x0000000000000000ULL, 0x0000000000000200ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { 0x0000000000000000ULL, 0x0800000000000000ULL, 0x0000001000000000ULL }, [ POWER5p_PME_PM_BRQ_FULL_CYC ] = { 0x0000000200000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GRP_IC_MISS_NONSPEC ] = { 0x0000010000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_PTEG_FROM_L275_MOD ] = { 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0040000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L375_SHR_CYC ] = { 0x0000000000000000ULL, 0x2000000000000000ULL, 0x0000004000000000ULL }, [ POWER5p_PME_PM_DATA_FROM_L3 ] = { 
0x0018000000000000ULL, 0x0000000000000000ULL, 0x0000000000000140ULL }, [ POWER5p_PME_PM_INST_FROM_L2 ] = { 0x0100000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU_FLUSH ] = { 0x000000000dc80000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_PMC2_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU0_DENORM ] = { 0x0000000000000000ULL, 0x0000000000400000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU1_FMOV_FEST ] = { 0x0000000000000000ULL, 0x0000000000400000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_INST_FETCH_CYC ] = { 0x0000000000000800ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_INST_DISP ] = { 0x0000000000000005ULL, 0x0000000000000000ULL, 0x0000000001080000ULL }, [ POWER5p_PME_PM_LSU_LDF ] = { 0x0000000000000000ULL, 0x0000000000100000ULL, 0x0000000000080000ULL }, [ POWER5p_PME_PM_DATA_FROM_L25_SHR ] = { 0x0020000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L1_DCACHE_RELOAD_VALID ] = { 0x0000010000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MEM_WQ_DISP_DCLAIM ] = { 0x0000000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_GRP_ISSUED ] = { 0x0000000000000000ULL, 0x0040000000000000ULL, 0x0000000080000000ULL }, [ POWER5p_PME_PM_FPU_FULL_CYC ] = { 0x0000000100000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_INST_FROM_L35_MOD ] = { 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU_FMA ] = { 0x0000000000000000ULL, 0x0000000000020000ULL, 0x0000000002424000ULL }, [ POWER5p_PME_PM_THRD_PRIO_3_CYC ] = { 0x0000000000000000ULL, 0x0000000200000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_CRU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000080000000004ULL }, [ POWER5p_PME_PM_SNOOP_WR_RETRY_WQ ] = { 0x0000000000000000ULL, 0x0000400000000000ULL, 
0x0000000000000000ULL }, [ POWER5p_PME_PM_CMPLU_STALL_REJECT ] = { 0x0000000020000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_FXU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0400000000040000ULL }, [ POWER5p_PME_PM_LSU1_REJECT_ERAT_MISS ] = { 0x0000000000020000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_OTHER ] = { 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY ] = { 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_PMC4_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L3SA_SNOOP_RETRY ] = { 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_PTEG_FROM_L35_MOD ] = { 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_INST_FROM_L25_MOD ] = { 0x0200000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_SMT_HANG ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_CMPLU_STALL_ERAT_MISS ] = { 0x0000000040000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L3SA_MOD_TAG ] = { 0x0000000000000000ULL, 0x0000000000000100ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_INST_FROM_L2MISS ] = { 0x0000000000000800ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FLUSH_SYNC ] = { 0x0000000000200000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_GRP_DISP ] = { 0x0000000000000000ULL, 0x0030000040000000ULL, 0x0000000060000000ULL }, [ POWER5p_PME_PM_MEM_RQ_DISP_Q8to11 ] = { 0x0000000000000000ULL, 0x0002000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SC_ST_HIT ] = { 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SB_MOD_TAG ] = { 0x0000000000000000ULL, 0x0000000000001000ULL, 
0x0000000000000000ULL }, [ POWER5p_PME_PM_CLB_EMPTY_CYC ] = { 0x0000000000000008ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SB_ST_HIT ] = { 0x0000000000000000ULL, 0x0000000000000010ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MEM_NONSPEC_RD_CANCEL ] = { 0x0000000000000000ULL, 0x0001000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_BR_PRED_CR_TA ] = { 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0012000000000000ULL }, [ POWER5p_PME_PM_MRK_LSU_FLUSH_ULD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000100000000008ULL }, [ POWER5p_PME_PM_INST_DISP_ATTEMPT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000100000ULL }, [ POWER5p_PME_PM_INST_FROM_RMEM ] = { 0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_ST_REF_L1_LSU0 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU0_DERAT_MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU_STALL3 ] = { 0x0000000000000000ULL, 0x0000000000080000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SB_RCLD_DISP ] = { 0x0000000000000000ULL, 0x0000000000000004ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_BR_PRED_CR ] = { 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0000000000000400ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L2 ] = { 0x0000000000000000ULL, 0x0080000000000000ULL, 0x0200000100000000ULL }, [ POWER5p_PME_PM_LSU0_FLUSH_SRQ ] = { 0x0000000001000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FAB_PNtoNN_DIRECT ] = { 0x0000000000000000ULL, 0x0000040000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_IOPS_CMPL ] = { 0x01080911fff53010ULL, 0x110020f81d100700ULL, 0x0002002000000006ULL }, [ POWER5p_PME_PM_L2SA_RCST_DISP ] = { 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0000000000000000ULL }, [ 
POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_OTHER ] = { 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SC_SHR_INV ] = { 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_SNOOP_RETRY_AB_COLLISION ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FAB_PNtoVN_SIDECAR ] = { 0x0000000000000000ULL, 0x0000040000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU0_REJECT_LMQ_FULL ] = { 0x0000000000040000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU_LMQ_S0_ALLOC ] = { 0x0000000000000080ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_SNOOP_PW_RETRY_RQ ] = { 0x0000000000000000ULL, 0x0000800000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DTLB_REF ] = { 0x0002000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_PTEG_FROM_L3 ] = { 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FAB_M1toVNorNN_SIDECAR_EMPTY ] = { 0x0000000000000000ULL, 0x0000080000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU_SRQ_EMPTY_CYC ] = { 0x0000000000000600ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU1_STF ] = { 0x0000000000000000ULL, 0x0000000010000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU_LMQ_S0_VALID ] = { 0x0000000000000080ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GCT_USAGE_00to59_CYC ] = { 0x0000000000000040ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU_FMOV_FEST ] = { 0x0000000000000000ULL, 0x0000000000020000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DATA_FROM_L2MISS ] = { 0x0010000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_XER_MAP_FULL_CYC ] = { 0x0000001000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GRP_DISP_BLK_SB_CYC ] = { 0x0000000000000004ULL, 0x0000000000000000ULL, 
0x0000000000000000ULL }, [ POWER5p_PME_PM_FLUSH_SB ] = { 0x0000000000200000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L375_SHR ] = { 0x0000000000000000ULL, 0x2000000000000000ULL, 0x0000004000000000ULL }, [ POWER5p_PME_PM_MRK_GRP_CMPL ] = { 0x0000000000000000ULL, 0x0020000000000000ULL, 0x0800000040000000ULL }, [ POWER5p_PME_PM_SUSPENDED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_SNOOP_RD_RETRY_QFULL ] = { 0x0000000000000000ULL, 0x0000100000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GRP_IC_MISS_BR_REDIR_NONSPEC ] = { 0x0000000080000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DATA_FROM_L35_SHR ] = { 0x0040000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L3SB_MOD_INV ] = { 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_STCX_FAIL ] = { 0x0000002000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LD_MISS_L1_LSU1 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_GRP_DISP ] = { 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DC_PREF_DST ] = { 0x0000000000004000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU1_DENORM ] = { 0x0000000000000000ULL, 0x0000000000400000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU0_FPSCR ] = { 0x0000000000000000ULL, 0x0000000001000000ULL, 0x0000000000000800ULL }, [ POWER5p_PME_PM_DATA_FROM_L2 ] = { 0x0000200000000000ULL, 0x0000000000000000ULL, 0x0000000000000020ULL }, [ POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_ADDR ] = { 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU_1FLOP ] = { 0x0000000000000000ULL, 0x0000000000040000ULL, 0x0000000000424000ULL }, [ POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_OTHER ] = { 0x0000000000000000ULL, 0x0000000000000020ULL, 0x0000000000000000ULL }, [ 
POWER5p_PME_PM_FPU0_FSQRT ] = { 0x0000000000000000ULL, 0x0000000000200000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_RC_FULL ] = { 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LD_REF_L1 ] = { 0x0004100000000000ULL, 0x0000000000000000ULL, 0x00000000052040e0ULL }, [ POWER5p_PME_PM_INST_FROM_L1 ] = { 0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000020ULL }, [ POWER5p_PME_PM_TLBIE_HELD ] = { 0x0000000000020000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { 0x0000000000000800ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L25_MOD_CYC ] = { 0x0000000000000000ULL, 0x0080000000000000ULL, 0x0000000100000000ULL }, [ POWER5p_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0020000000000000ULL }, [ POWER5p_PME_PM_MEM_RQ_DISP_Q0to3 ] = { 0x0000000000000000ULL, 0x0002000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_ST_REF_L1_LSU1 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_LD_MISS_L1 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000080000000004ULL }, [ POWER5p_PME_PM_L1_WRITE_CYC ] = { 0x0000000000010000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SC_ST_REQ ] = { 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_CMPLU_STALL_FDIV ] = { 0x0000000100000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_SEL_OVER_CLB_EMPTY ] = { 0x0000000000000000ULL, 0x0000004000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_BR_MPRED_CR ] = { 0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L3SB_MOD_TAG ] = { 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_DATA_FROM_L2MISS ] = { 0x0000001000000000ULL, 0x0000000000000000ULL, 0x0200000000000000ULL }, [ 
POWER5p_PME_PM_LSU_REJECT_SRQ ] = { 0x0000000000080000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LD_MISS_L1 ] = { 0x0004100000000000ULL, 0x0000000000000000ULL, 0x0000000004900100ULL }, [ POWER5p_PME_PM_INST_FROM_PREF ] = { 0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_STCX_PASS ] = { 0x0000002000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DC_INV_L2 ] = { 0x4000000000100000ULL, 0x0000000000000000ULL, 0x0000000000800000ULL }, [ POWER5p_PME_PM_LSU_SRQ_FULL_CYC ] = { 0x0000000000000500ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU_FIN ] = { 0x0000000000000000ULL, 0x0100000000040000ULL, 0x0000000202070000ULL }, [ POWER5p_PME_PM_LSU_SRQ_STFWD ] = { 0x0000000000000200ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SA_SHR_MOD ] = { 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_0INST_CLB_CYC ] = { 0x0000000000000008ULL, 0x0000004000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FXU0_FIN ] = { 0x0000000000000000ULL, 0x0000000080000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_RC_FULL ] = { 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { 0x0000000000000000ULL, 0x0000001000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_PMC5_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_FPU0_FDIV ] = { 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_PTEG_FROM_L375_SHR ] = { 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_HV_CYC ] = { 0x0000000000000000ULL, 0x0000000800000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY ] = { 0x0000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_0_CYC ] = { 
0x0000000000000000ULL, 0x0000000100000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LR_CTR_MAP_FULL_CYC ] = { 0x0000000800000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L3SB_SHR_INV ] = { 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_DATA_FROM_RMEM ] = { 0x0010000000000000ULL, 0x0000000000000000ULL, 0x0000000010000000ULL }, [ POWER5p_PME_PM_DATA_FROM_L275_MOD ] = { 0x0020000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU0_REJECT_SRQ ] = { 0x0000000000004000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU1_DERAT_MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_MRK_LSU_FIN ] = { 0x0000000000000000ULL, 0x0010000000000000ULL, 0x0400000020000000ULL }, [ POWER5p_PME_PM_DTLB_MISS_16M ] = { 0x0001000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_LSU0_FLUSH_UST ] = { 0x0000000008000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY ] = { 0x0000000000000000ULL, 0x0000000000000010ULL, 0x0000000000000000ULL }, [ POWER5p_PME_PM_L2SC_MOD_TAG ] = { 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000000000000000ULL } }; static const pme_power_entry_t power5p_pe[] = { [ POWER5p_PME_PM_LSU_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU_REJECT_RELOAD_CDF", .pme_code = 0x2c4090, .pme_short_desc = "LSU reject due to reload CDF or tag update collision", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. 
Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same cycle when they are being updated. Combined Unit 0 + 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_REJECT_RELOAD_CDF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_REJECT_RELOAD_CDF] }, [ POWER5p_PME_PM_FPU1_SINGLE ] = { .pme_name = "PM_FPU1_SINGLE", .pme_code = 0x20e7, .pme_short_desc = "FPU1 executed single precision instruction", .pme_long_desc = "FPU1 has executed a single precision instruction.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU1_SINGLE], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU1_SINGLE] }, [ POWER5p_PME_PM_L3SB_REF ] = { .pme_name = "PM_L3SB_REF", .pme_code = 0x701c4, .pme_short_desc = "L3 slice B references", .pme_long_desc = "Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SB_REF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SB_REF] }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_3or4_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_3or4_CYC", .pme_code = 0x430e5, .pme_short_desc = "Cycles thread priority difference is 3 or 4", .pme_long_desc = "Cycles when this thread's priority is higher than the other thread's priority by 3 or 4.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_PRIO_DIFF_3or4_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_PRIO_DIFF_3or4_CYC] }, [ POWER5p_PME_PM_INST_FROM_L275_SHR ] = { .pme_name = "PM_INST_FROM_L275_SHR", .pme_code = 0x322096, .pme_short_desc = "Instruction fetched from L2.75 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (T) data from the L2 on a different module than this processor is located. 
Fetch groups can contain up to 8 instructions", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_FROM_L275_SHR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_FROM_L275_SHR] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L375_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L375_MOD", .pme_code = 0x1c70a7, .pme_short_desc = "Marked data loaded from L3.75 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on a different module than this processor is located due to a marked load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L375_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L375_MOD] }, [ POWER5p_PME_PM_DTLB_MISS_4K ] = { .pme_name = "PM_DTLB_MISS_4K", .pme_code = 0x1c208d, .pme_short_desc = "Data TLB miss for 4K page", .pme_long_desc = "Data TLB references to 4KB pages that missed the TLB. Page size is determined at TLB reload time.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DTLB_MISS_4K], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DTLB_MISS_4K] }, [ POWER5p_PME_PM_CLB_FULL_CYC ] = { .pme_name = "PM_CLB_FULL_CYC", .pme_code = 0x220e5, .pme_short_desc = "Cycles CLB full", .pme_long_desc = "Cycles when both threads' CLBs are full.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_CLB_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_CLB_FULL_CYC] }, [ POWER5p_PME_PM_MRK_ST_CMPL ] = { .pme_name = "PM_MRK_ST_CMPL", .pme_code = 0x100003, .pme_short_desc = "Marked store instruction completed", .pme_long_desc = "A sampled store has completed (data home)", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_ST_CMPL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_ST_CMPL] }, [ POWER5p_PME_PM_LSU_FLUSH_LRQ_FULL ] = { .pme_name = "PM_LSU_FLUSH_LRQ_FULL", .pme_code = 0x320e7, .pme_short_desc = "Flush caused by LRQ full", .pme_long_desc = "This thread was flushed at dispatch because its Load Request Queue was full. 
This allows the other thread to have more machine resources for it to make progress while this thread is stalled.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_FLUSH_LRQ_FULL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_FLUSH_LRQ_FULL] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L275_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L275_SHR", .pme_code = 0x3c7097, .pme_short_desc = "Marked data loaded from L2.75 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T) data from the L2 on a different module than this processor is located due to a marked load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L275_SHR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L275_SHR] }, [ POWER5p_PME_PM_1INST_CLB_CYC ] = { .pme_name = "PM_1INST_CLB_CYC", .pme_code = 0x400c1, .pme_short_desc = "Cycles 1 instruction in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_1INST_CLB_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_1INST_CLB_CYC] }, [ POWER5p_PME_PM_MEM_SPEC_RD_CANCEL ] = { .pme_name = "PM_MEM_SPEC_RD_CANCEL", .pme_code = 0x721e6, .pme_short_desc = "Speculative memory read cancelled", .pme_long_desc = "Speculative memory read cancelled (i.e. 
cresp = sourced by L2/L3)", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_SPEC_RD_CANCEL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_SPEC_RD_CANCEL] }, [ POWER5p_PME_PM_MRK_DTLB_MISS_16M ] = { .pme_name = "PM_MRK_DTLB_MISS_16M", .pme_code = 0x3c608d, .pme_short_desc = "Marked Data TLB misses for 16M page", .pme_long_desc = "Marked Data TLB misses for 16M page", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DTLB_MISS_16M], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DTLB_MISS_16M] }, [ POWER5p_PME_PM_FPU_FDIV ] = { .pme_name = "PM_FPU_FDIV", .pme_code = 0x100088, .pme_short_desc = "FPU executed FDIV instruction", .pme_long_desc = "The floating point unit has executed a divide instruction. This could be fdiv, fdivs, fdiv., fdivs.. Combined Unit 0 + Unit 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU_FDIV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU_FDIV] }, [ POWER5p_PME_PM_FPU_SINGLE ] = { .pme_name = "PM_FPU_SINGLE", .pme_code = 0x102090, .pme_short_desc = "FPU executed single precision instruction", .pme_long_desc = "FPU is executing single precision instruction. Combined Unit 0 + Unit 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU_SINGLE], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU_SINGLE] }, [ POWER5p_PME_PM_FPU0_FMA ] = { .pme_name = "PM_FPU0_FMA", .pme_code = 0xc1, .pme_short_desc = "FPU0 executed multiply-add instruction", .pme_long_desc = "The floating point unit has executed a multiply-add kind of instruction. 
This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU0_FMA], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU0_FMA] }, [ POWER5p_PME_PM_SLB_MISS ] = { .pme_name = "PM_SLB_MISS", .pme_code = 0x280088, .pme_short_desc = "SLB misses", .pme_long_desc = "Total of all Segment Lookaside Buffer (SLB) misses, Instructions + Data.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_SLB_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_SLB_MISS] }, [ POWER5p_PME_PM_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_LSU1_FLUSH_LRQ", .pme_code = 0xc00c6, .pme_short_desc = "LSU1 LRQ flushes", .pme_long_desc = "A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU1_FLUSH_LRQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU1_FLUSH_LRQ] }, [ POWER5p_PME_PM_L2SA_ST_HIT ] = { .pme_name = "PM_L2SA_ST_HIT", .pme_code = 0x733e0, .pme_short_desc = "L2 slice A store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. 
This event is provided on each of the three L2 slices A, B, and C.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SA_ST_HIT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SA_ST_HIT] }, [ POWER5p_PME_PM_DTLB_MISS ] = { .pme_name = "PM_DTLB_MISS", .pme_code = 0x800c4, .pme_short_desc = "Data TLB misses", .pme_long_desc = "Data TLB misses, all page sizes.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DTLB_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DTLB_MISS] }, [ POWER5p_PME_PM_BR_PRED_TA ] = { .pme_name = "PM_BR_PRED_TA", .pme_code = 0x230e3, .pme_short_desc = "A conditional branch was predicted", .pme_long_desc = " target prediction", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_BR_PRED_TA], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_BR_PRED_TA] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L375_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L375_MOD_CYC", .pme_code = 0x4c70a7, .pme_short_desc = "Marked load latency from L3.75 modified", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L375_MOD_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L375_MOD_CYC] }, [ POWER5p_PME_PM_CMPLU_STALL_FXU ] = { .pme_name = "PM_CMPLU_STALL_FXU", .pme_code = 0x211099, .pme_short_desc = "Completion stall caused by FXU instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a fixed point instruction.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_CMPLU_STALL_FXU], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_CMPLU_STALL_FXU] }, [ POWER5p_PME_PM_EXT_INT ] = { .pme_name = "PM_EXT_INT", .pme_code = 0x400003, .pme_short_desc = "External interrupts", .pme_long_desc = "An interrupt due to an external exception occurred", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_EXT_INT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_EXT_INT] }, [ POWER5p_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_LRQ", .pme_code = 0x810c6, .pme_short_desc = "LSU1 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LSU1_FLUSH_LRQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LSU1_FLUSH_LRQ] }, [ POWER5p_PME_PM_MRK_ST_GPS ] = { .pme_name = "PM_MRK_ST_GPS", .pme_code = 0x200003, .pme_short_desc = "Marked store sent to GPS", .pme_long_desc = "A sampled store has been sent to the memory subsystem", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_ST_GPS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_ST_GPS] }, [ POWER5p_PME_PM_LSU1_LDF ] = { .pme_name = 
"PM_LSU1_LDF", .pme_code = 0xc50c4, .pme_short_desc = "LSU1 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed by LSU1", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU1_LDF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU1_LDF] }, [ POWER5p_PME_PM_FAB_CMD_ISSUED ] = { .pme_name = "PM_FAB_CMD_ISSUED", .pme_code = 0x700c7, .pme_short_desc = "Fabric command issued", .pme_long_desc = "Incremented when a chip issues a command on its SnoopA address bus. Each of the two address busses (SnoopA and SnoopB) is capable of one transaction per fabric cycle (one fabric cycle = 2 cpu cycles in normal 2:1 mode), but each chip can only drive the SnoopA bus, and can only drive one transaction every two fabric cycles (i.e., every four cpu cycles). In MCM-based systems, two chips interleave their accesses to each of the two fabric busses (SnoopA, SnoopB) to reach a peak capability of one transaction per cpu clock cycle. The two chips that drive SnoopB are wired so that the chips refer to the bus as SnoopA but it is connected to the other two chips as SnoopB. Note that this event will only be recorded by the FBC on the chip that sourced the operation. The signal is delivered at FBC speed and the count must be scaled.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FAB_CMD_ISSUED], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FAB_CMD_ISSUED] }, [ POWER5p_PME_PM_LSU0_SRQ_STFWD ] = { .pme_name = "PM_LSU0_SRQ_STFWD", .pme_code = 0xc60e1, .pme_short_desc = "LSU0 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 0. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. 
A load that hits L1 but becomes a store forward is not treated as a load miss.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU0_SRQ_STFWD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU0_SRQ_STFWD] }, [ POWER5p_PME_PM_CR_MAP_FULL_CYC ] = { .pme_name = "PM_CR_MAP_FULL_CYC", .pme_code = 0x100c4, .pme_short_desc = "Cycles CR logical operation mapper full", .pme_long_desc = "The Conditional Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_CR_MAP_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_CR_MAP_FULL_CYC] }, [ POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SA_RCST_DISP_FAIL_RC_FULL", .pme_code = 0x722e0, .pme_short_desc = "L2 slice A RC store dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a store failed because all RC machines are busy.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_RC_FULL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_RC_FULL] }, [ POWER5p_PME_PM_MRK_LSU0_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU0_FLUSH_ULD", .pme_code = 0x810c1, .pme_short_desc = "LSU0 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LSU0_FLUSH_ULD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LSU0_FLUSH_ULD] }, [ POWER5p_PME_PM_LSU_FLUSH_SRQ_FULL ] = { .pme_name = "PM_LSU_FLUSH_SRQ_FULL", .pme_code = 0x330e0, .pme_short_desc = "Flush caused by SRQ full", .pme_long_desc = "This thread was flushed at dispatch because its Store Request Queue was full. 
This allows the other thread to have more machine resources for it to make progress while this thread is stalled.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_FLUSH_SRQ_FULL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_FLUSH_SRQ_FULL] }, [ POWER5p_PME_PM_MEM_RQ_DISP_Q16to19 ] = { .pme_name = "PM_MEM_RQ_DISP_Q16to19", .pme_code = 0x727e6, .pme_short_desc = "Memory read queue dispatched to queues 16-19", .pme_long_desc = "A memory operation was dispatched to read queue 16,17,18 or 19. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_RQ_DISP_Q16to19], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_RQ_DISP_Q16to19] }, [ POWER5p_PME_PM_FLUSH_IMBAL ] = { .pme_name = "PM_FLUSH_IMBAL", .pme_code = 0x330e3, .pme_short_desc = "Flush caused by thread GCT imbalance", .pme_long_desc = "This thread has been flushed at dispatch because it is stalled and a GCT imbalance exists. GCT thresholds are set in the TSCR register. 
This allows the other thread to have more machine resources for it to make progress while this thread is stalled.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FLUSH_IMBAL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FLUSH_IMBAL] }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus3or4_CYC", .pme_code = 0x430e1, .pme_short_desc = "Cycles thread priority difference is -3 or -4", .pme_long_desc = "Cycles when this thread's priority is lower than the other thread's priority by 3 or 4.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC] }, [ POWER5p_PME_PM_DATA_FROM_L35_MOD ] = { .pme_name = "PM_DATA_FROM_L35_MOD", .pme_code = 0x2c309e, .pme_short_desc = "Data loaded from L3.5 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on the same module as this processor is located due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DATA_FROM_L35_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DATA_FROM_L35_MOD] }, [ POWER5p_PME_PM_MEM_HI_PRIO_WR_CMPL ] = { .pme_name = "PM_MEM_HI_PRIO_WR_CMPL", .pme_code = 0x726e6, .pme_short_desc = "High priority write completed", .pme_long_desc = "A memory write, which was upgraded to high priority, completed. Writes can be upgraded to high priority to ensure that read traffic does not lock out writes. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_HI_PRIO_WR_CMPL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_HI_PRIO_WR_CMPL] }, [ POWER5p_PME_PM_FPU1_FDIV ] = { .pme_name = "PM_FPU1_FDIV", .pme_code = 0xc4, .pme_short_desc = "FPU1 executed FDIV instruction", .pme_long_desc = "FPU1 has executed a divide instruction. This could be fdiv, fdivs, fdiv. 
fdivs.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU1_FDIV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU1_FDIV] }, [ POWER5p_PME_PM_MEM_RQ_DISP ] = { .pme_name = "PM_MEM_RQ_DISP", .pme_code = 0x701c6, .pme_short_desc = "Memory read queue dispatched", .pme_long_desc = "A memory read was dispatched. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_RQ_DISP], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_RQ_DISP] }, [ POWER5p_PME_PM_FPU0_FRSP_FCONV ] = { .pme_name = "PM_FPU0_FRSP_FCONV", .pme_code = 0x10c1, .pme_short_desc = "FPU0 executed FRSP or FCONV instructions", .pme_long_desc = "FPU0 has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU0_FRSP_FCONV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU0_FRSP_FCONV] }, [ POWER5p_PME_PM_LWSYNC_HELD ] = { .pme_name = "PM_LWSYNC_HELD", .pme_code = 0x130e0, .pme_short_desc = "LWSYNC held at dispatch", .pme_long_desc = "Cycles a LWSYNC instruction was held at dispatch. LWSYNC instructions are held at dispatch until all previous loads are done and all previous stores have issued. LWSYNC enters the Store Request Queue and is sent to the storage subsystem but does not wait for a response.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LWSYNC_HELD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LWSYNC_HELD] }, [ POWER5p_PME_PM_FXU_FIN ] = { .pme_name = "PM_FXU_FIN", .pme_code = 0x313088, .pme_short_desc = "FXU produced a result", .pme_long_desc = "The fixed point unit (Unit 0 + Unit 1) finished an instruction. 
Instructions that finish may not necessarily complete.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FXU_FIN], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FXU_FIN] }, [ POWER5p_PME_PM_DSLB_MISS ] = { .pme_name = "PM_DSLB_MISS", .pme_code = 0x800c5, .pme_short_desc = "Data SLB misses", .pme_long_desc = "An SLB miss for a data request occurred. SLB misses trap to the operating system to resolve.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DSLB_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DSLB_MISS] }, [ POWER5p_PME_PM_DATA_FROM_L275_SHR ] = { .pme_name = "PM_DATA_FROM_L275_SHR", .pme_code = 0x3c3097, .pme_short_desc = "Data loaded from L2.75 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T) data from the L2 on a different module than this processor is located due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DATA_FROM_L275_SHR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DATA_FROM_L275_SHR] }, [ POWER5p_PME_PM_FXLS1_FULL_CYC ] = { .pme_name = "PM_FXLS1_FULL_CYC", .pme_code = 0x110c4, .pme_short_desc = "Cycles FXU1/LS1 queue full", .pme_long_desc = "The issue queue that feeds the Fixed Point unit 1 / Load Store Unit 1 is full. This condition will prevent dispatch groups from being dispatched. 
This event only indicates that the queue was full, not that dispatch was prevented.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FXLS1_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FXLS1_FULL_CYC] }, [ POWER5p_PME_PM_THRD_SEL_T0 ] = { .pme_name = "PM_THRD_SEL_T0", .pme_code = 0x410c0, .pme_short_desc = "Decode selected thread 0", .pme_long_desc = "Thread selection picked thread 0 for decode.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_SEL_T0], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_SEL_T0] }, [ POWER5p_PME_PM_PTEG_RELOAD_VALID ] = { .pme_name = "PM_PTEG_RELOAD_VALID", .pme_code = 0x830e4, .pme_short_desc = "PTEG reload valid", .pme_long_desc = "A Page Table Entry was loaded into the TLB.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PTEG_RELOAD_VALID], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PTEG_RELOAD_VALID] }, [ POWER5p_PME_PM_MRK_STCX_FAIL ] = { .pme_name = "PM_MRK_STCX_FAIL", .pme_code = 0x820e6, .pme_short_desc = "Marked STCX failed", .pme_long_desc = "A marked stcx (stwcx or stdcx) failed", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_STCX_FAIL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_STCX_FAIL] }, [ POWER5p_PME_PM_LSU_LMQ_LHR_MERGE ] = { .pme_name = "PM_LSU_LMQ_LHR_MERGE", .pme_code = 0xc70e5, .pme_short_desc = "LMQ LHR merges", .pme_long_desc = "A data cache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_LMQ_LHR_MERGE], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_LMQ_LHR_MERGE] }, [ POWER5p_PME_PM_2INST_CLB_CYC ] = { .pme_name = "PM_2INST_CLB_CYC", .pme_code = 0x400c2, .pme_short_desc = "Cycles 2 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. 
Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_2INST_CLB_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_2INST_CLB_CYC] }, [ POWER5p_PME_PM_FAB_PNtoVN_DIRECT ] = { .pme_name = "PM_FAB_PNtoVN_DIRECT", .pme_code = 0x723e7, .pme_short_desc = "PN to VN beat went straight to its destination", .pme_long_desc = "Fabric Data beats that the base chip takes the inbound PN data and passes it through to the outbound VN bus without going into a sidecar. The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FAB_PNtoVN_DIRECT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FAB_PNtoVN_DIRECT] }, [ POWER5p_PME_PM_PTEG_FROM_L2MISS ] = { .pme_name = "PM_PTEG_FROM_L2MISS", .pme_code = 0x38309b, .pme_short_desc = "PTEG loaded from L2 miss", .pme_long_desc = "A Page Table Entry was loaded into the TLB but not from the local L2.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PTEG_FROM_L2MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PTEG_FROM_L2MISS] }, [ POWER5p_PME_PM_CMPLU_STALL_LSU ] = { .pme_name = "PM_CMPLU_STALL_LSU", .pme_code = 0x211098, .pme_short_desc = "Completion stall caused by LSU instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a load/store instruction.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_CMPLU_STALL_LSU], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_CMPLU_STALL_LSU] }, [ POWER5p_PME_PM_MRK_DSLB_MISS ] = { .pme_name = "PM_MRK_DSLB_MISS", .pme_code = 0xc50c7, 
.pme_short_desc = "Marked Data SLB misses", .pme_long_desc = "A Data SLB miss was caused by a marked instruction.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DSLB_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DSLB_MISS] }, [ POWER5p_PME_PM_LSU_FLUSH_ULD ] = { .pme_name = "PM_LSU_FLUSH_ULD", .pme_code = 0x1c0088, .pme_short_desc = "LRQ unaligned load flushes", .pme_long_desc = "A load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1). Combined Unit 0 + 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_FLUSH_ULD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_FLUSH_ULD] }, [ POWER5p_PME_PM_PTEG_FROM_LMEM ] = { .pme_name = "PM_PTEG_FROM_LMEM", .pme_code = 0x283087, .pme_short_desc = "PTEG loaded from local memory", .pme_long_desc = "A Page Table Entry was loaded into the TLB from memory attached to the same module this processor is located on.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PTEG_FROM_LMEM], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PTEG_FROM_LMEM] }, [ POWER5p_PME_PM_MRK_BRU_FIN ] = { .pme_name = "PM_MRK_BRU_FIN", .pme_code = 0x200005, .pme_short_desc = "Marked instruction BRU processing finished", .pme_long_desc = "The branch unit finished a marked instruction. Instructions that finish may not necessarily complete.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_BRU_FIN], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_BRU_FIN] }, [ POWER5p_PME_PM_MEM_WQ_DISP_WRITE ] = { .pme_name = "PM_MEM_WQ_DISP_WRITE", .pme_code = 0x703c6, .pme_short_desc = "Memory write queue dispatched due to write", .pme_long_desc = "A memory write was dispatched to a write queue. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_WQ_DISP_WRITE], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_WQ_DISP_WRITE] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L275_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L275_MOD_CYC", .pme_code = 0x4c70a3, .pme_short_desc = "Marked load latency from L2.75 modified", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L275_MOD_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L275_MOD_CYC] }, [ POWER5p_PME_PM_LSU1_NCLD ] = { .pme_name = "PM_LSU1_NCLD", .pme_code = 0xc50c5, .pme_short_desc = "LSU1 non-cacheable loads", .pme_long_desc = "A non-cacheable load was executed by Unit 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU1_NCLD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU1_NCLD] }, [ POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SA_RCLD_DISP_FAIL_OTHER", .pme_code = 0x731e0, .pme_short_desc = "L2 slice A RC load dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_OTHER], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_OTHER] }, [ POWER5p_PME_PM_SNOOP_PW_RETRY_WQ_PWQ ] = { .pme_name = "PM_SNOOP_PW_RETRY_WQ_PWQ", .pme_code = 0x717c6, .pme_short_desc = "Snoop partial-write retry due to collision with active write or partial-write queue", .pme_long_desc = "A snoop request for a partial write to memory was retried because it matched the 
cache line of an active write or partial write. When this happens the snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_SNOOP_PW_RETRY_WQ_PWQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_SNOOP_PW_RETRY_WQ_PWQ] }, [ POWER5p_PME_PM_FPU1_FULL_CYC ] = { .pme_name = "PM_FPU1_FULL_CYC", .pme_code = 0x100c7, .pme_short_desc = "Cycles FPU1 issue queue full", .pme_long_desc = "The issue queue for FPU1 cannot accept any more instructions. Dispatch to this issue queue is stopped.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU1_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU1_FULL_CYC] }, [ POWER5p_PME_PM_FPR_MAP_FULL_CYC ] = { .pme_name = "PM_FPR_MAP_FULL_CYC", .pme_code = 0x100c1, .pme_short_desc = "Cycles FPR mapper full", .pme_long_desc = "The Floating Point Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPR_MAP_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPR_MAP_FULL_CYC] }, [ POWER5p_PME_PM_L3SA_ALL_BUSY ] = { .pme_name = "PM_L3SA_ALL_BUSY", .pme_code = 0x721e3, .pme_short_desc = "L3 slice A active for every cycle all CI/CO machines busy", .pme_long_desc = "Cycles All Castin/Castout machines are busy.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SA_ALL_BUSY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SA_ALL_BUSY] }, [ POWER5p_PME_PM_3INST_CLB_CYC ] = { .pme_name = "PM_3INST_CLB_CYC", .pme_code = 0x400c3, .pme_short_desc = "Cycles 3 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. 
Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_3INST_CLB_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_3INST_CLB_CYC] }, [ POWER5p_PME_PM_MEM_PWQ_DISP_Q2or3 ] = { .pme_name = "PM_MEM_PWQ_DISP_Q2or3", .pme_code = 0x734e6, .pme_short_desc = "Memory partial-write queue dispatched to Write Queue 2 or 3", .pme_long_desc = "Memory partial-write queue dispatched to Write Queue 2 or 3. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_PWQ_DISP_Q2or3], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_PWQ_DISP_Q2or3] }, [ POWER5p_PME_PM_L2SA_SHR_INV ] = { .pme_name = "PM_L2SA_SHR_INV", .pme_code = 0x710c0, .pme_short_desc = "L2 slice A transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. 
NOTE: For this event to be useful the tablewalk duration event should also be counted.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SA_SHR_INV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SA_SHR_INV] }, [ POWER5p_PME_PM_THRESH_TIMEO ] = { .pme_name = "PM_THRESH_TIMEO", .pme_code = 0x30000b, .pme_short_desc = "Threshold timeout", .pme_long_desc = "The threshold timer expired", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRESH_TIMEO], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRESH_TIMEO] }, [ POWER5p_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL ] = { .pme_name = "PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL", .pme_code = 0x713c0, .pme_short_desc = "L2 slice A RC dispatch attempt failed due to all CO busy", .pme_long_desc = "A Read/Claim dispatch was rejected because all Castout machines were busy.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL] }, [ POWER5p_PME_PM_THRD_SEL_OVER_GCT_IMBAL ] = { .pme_name = "PM_THRD_SEL_OVER_GCT_IMBAL", .pme_code = 0x410c4, .pme_short_desc = "Thread selection overrides caused by GCT imbalance", .pme_long_desc = "Thread selection was overridden because of a GCT imbalance.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_SEL_OVER_GCT_IMBAL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_SEL_OVER_GCT_IMBAL] }, [ POWER5p_PME_PM_FPU_FSQRT ] = { .pme_name = "PM_FPU_FSQRT", .pme_code = 0x200090, .pme_short_desc = "FPU executed FSQRT instruction", .pme_long_desc = "The floating point unit has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
Combined Unit 0 + Unit 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU_FSQRT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU_FSQRT] }, [ POWER5p_PME_PM_PMC1_OVERFLOW ] = { .pme_name = "PM_PMC1_OVERFLOW", .pme_code = 0x20000a, .pme_short_desc = "PMC1 Overflow", .pme_long_desc = "Overflows from PMC1 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PMC1_OVERFLOW], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PMC1_OVERFLOW] }, [ POWER5p_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_LRQ", .pme_code = 0x810c2, .pme_short_desc = "LSU0 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LSU0_FLUSH_LRQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LSU0_FLUSH_LRQ] }, [ POWER5p_PME_PM_L3SC_SNOOP_RETRY ] = { .pme_name = "PM_L3SC_SNOOP_RETRY", .pme_code = 0x731e5, .pme_short_desc = "L3 slice C snoop retries", .pme_long_desc = "Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b)", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SC_SNOOP_RETRY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SC_SNOOP_RETRY] }, [ POWER5p_PME_PM_DATA_TABLEWALK_CYC ] = { .pme_name = "PM_DATA_TABLEWALK_CYC", .pme_code = 0x800c7, .pme_short_desc = "Cycles doing data tablewalks", .pme_long_desc = "Cycles a translation tablewalk is active. 
While a tablewalk is active any request attempting to access the TLB will be rejected and retried.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DATA_TABLEWALK_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DATA_TABLEWALK_CYC] }, [ POWER5p_PME_PM_THRD_PRIO_6_CYC ] = { .pme_name = "PM_THRD_PRIO_6_CYC", .pme_code = 0x420e5, .pme_short_desc = "Cycles thread running at priority level 6", .pme_long_desc = "Cycles this thread was running at priority level 6.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_PRIO_6_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_PRIO_6_CYC] }, [ POWER5p_PME_PM_FPU_FEST ] = { .pme_name = "PM_FPU_FEST", .pme_code = 0x1010a8, .pme_short_desc = "FPU executed FEST instruction", .pme_long_desc = "The floating point unit has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU_FEST], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU_FEST] }, [ POWER5p_PME_PM_FAB_M1toP1_SIDECAR_EMPTY ] = { .pme_name = "PM_FAB_M1toP1_SIDECAR_EMPTY", .pme_code = 0x702c7, .pme_short_desc = "M1 to P1 sidecar empty", .pme_long_desc = "Fabric cycles when the Minus-1 hip/hop sidecars (sidecars for chip to chip data transfer) are empty. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FAB_M1toP1_SIDECAR_EMPTY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FAB_M1toP1_SIDECAR_EMPTY] }, [ POWER5p_PME_PM_MRK_DATA_FROM_RMEM ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM", .pme_code = 0x1c70a1, .pme_short_desc = "Marked data loaded from remote memory", .pme_long_desc = "The processor's Data Cache was reloaded due to a marked load from memory attached to a different module than this processor is located on.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_RMEM], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_RMEM] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L35_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L35_MOD_CYC", .pme_code = 0x4c70a6, .pme_short_desc = "Marked load latency from L3.5 modified", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L35_MOD_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L35_MOD_CYC] }, [ POWER5p_PME_PM_MEM_PWQ_DISP ] = { .pme_name = "PM_MEM_PWQ_DISP", .pme_code = 0x704c6, .pme_short_desc = "Memory partial-write queue dispatched", .pme_long_desc = "Number of Partial Writes dispatched. The MC provides resources to gather partial cacheline writes (Partial line DMA writes & CI-stores) to up to four different cachelines at a time. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_PWQ_DISP], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_PWQ_DISP] }, [ POWER5p_PME_PM_FAB_P1toM1_SIDECAR_EMPTY ] = { .pme_name = "PM_FAB_P1toM1_SIDECAR_EMPTY", .pme_code = 0x701c7, .pme_short_desc = "P1 to M1 sidecar empty", .pme_long_desc = "Fabric cycles when the Plus-1 hip/hop sidecars (sidecars for chip to chip data transfer) are empty. The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FAB_P1toM1_SIDECAR_EMPTY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FAB_P1toM1_SIDECAR_EMPTY] }, [ POWER5p_PME_PM_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_LD_MISS_L1_LSU0", .pme_code = 0xc10c2, .pme_short_desc = "LSU0 L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache, by unit 0.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LD_MISS_L1_LSU0], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LD_MISS_L1_LSU0] }, [ POWER5p_PME_PM_SNOOP_PARTIAL_RTRY_QFULL ] = { .pme_name = "PM_SNOOP_PARTIAL_RTRY_QFULL", .pme_code = 0x730e6, .pme_short_desc = "Snoop partial write retry due to partial-write queues full", .pme_long_desc = "A snoop request for a partial write to memory was retried because the write queues that handle partial writes were full. When this happens the active writes are changed to high priority. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_SNOOP_PARTIAL_RTRY_QFULL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_SNOOP_PARTIAL_RTRY_QFULL] }, [ POWER5p_PME_PM_FPU1_STALL3 ] = { .pme_name = "PM_FPU1_STALL3", .pme_code = 0x20e5, .pme_short_desc = "FPU1 stalled in pipe3", .pme_long_desc = "FPU1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always).", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU1_STALL3], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU1_STALL3] }, [ POWER5p_PME_PM_GCT_USAGE_80to99_CYC ] = { .pme_name = "PM_GCT_USAGE_80to99_CYC", .pme_code = 0x30001f, .pme_short_desc = "Cycles GCT 80-99% full", .pme_long_desc = "Cycles when the Global Completion Table has between 80% and 99% of its slots used. The GCT has 20 entries shared between threads", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GCT_USAGE_80to99_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GCT_USAGE_80to99_CYC] }, [ POWER5p_PME_PM_WORK_HELD ] = { .pme_name = "PM_WORK_HELD", .pme_code = 0x40000c, .pme_short_desc = "Work held", .pme_long_desc = "RAS Unit has signaled completion to stop and there are groups waiting to complete", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_WORK_HELD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_WORK_HELD] }, [ POWER5p_PME_PM_INST_CMPL ] = { .pme_name = "PM_INST_CMPL", .pme_code = 0x100009, .pme_short_desc = "Instructions completed", .pme_long_desc = "Number of PowerPC instructions that completed.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_CMPL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_CMPL] }, [ POWER5p_PME_PM_LSU1_FLUSH_UST ] = { .pme_name = "PM_LSU1_FLUSH_UST", .pme_code = 0xc00c5, .pme_short_desc = "LSU1 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 1 
because it was unaligned (crossed a 4K boundary)", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU1_FLUSH_UST], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU1_FLUSH_UST] }, [ POWER5p_PME_PM_FXU_IDLE ] = { .pme_name = "PM_FXU_IDLE", .pme_code = 0x100012, .pme_short_desc = "FXU idle", .pme_long_desc = "FXU0 and FXU1 are both idle.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FXU_IDLE], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FXU_IDLE] }, [ POWER5p_PME_PM_LSU0_FLUSH_ULD ] = { .pme_name = "PM_LSU0_FLUSH_ULD", .pme_code = 0xc00c0, .pme_short_desc = "LSU0 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 0 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU0_FLUSH_ULD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU0_FLUSH_ULD] }, [ POWER5p_PME_PM_LSU1_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU1_REJECT_LMQ_FULL", .pme_code = 0xc40c5, .pme_short_desc = "LSU1 reject due to LMQ full or missed data coming", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. 
If all eight entries are full, subsequent load instructions are rejected.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU1_REJECT_LMQ_FULL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU1_REJECT_LMQ_FULL] }, [ POWER5p_PME_PM_GRP_DISP_REJECT ] = { .pme_name = "PM_GRP_DISP_REJECT", .pme_code = 0x120e4, .pme_short_desc = "Group dispatch rejected", .pme_long_desc = "A group that previously attempted dispatch was rejected.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GRP_DISP_REJECT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GRP_DISP_REJECT] }, [ POWER5p_PME_PM_PTEG_FROM_L25_SHR ] = { .pme_name = "PM_PTEG_FROM_L25_SHR", .pme_code = 0x183097, .pme_short_desc = "PTEG loaded from L2.5 shared", .pme_long_desc = "A Page Table Entry was loaded into the TLB with shared (T or SL) data from the L2 of a chip on the same module as this processor is located due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PTEG_FROM_L25_SHR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PTEG_FROM_L25_SHR] }, [ POWER5p_PME_PM_L2SA_MOD_INV ] = { .pme_name = "PM_L2SA_MOD_INV", .pme_code = 0x730e0, .pme_short_desc = "L2 slice A transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SA_MOD_INV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SA_MOD_INV] }, [ POWER5p_PME_PM_FAB_CMD_RETRIED ] = { .pme_name = "PM_FAB_CMD_RETRIED", .pme_code = 0x710c7, .pme_short_desc = "Fabric command retried", .pme_long_desc = "Incremented when a command issued by a chip on its SnoopA address bus is retried for any reason. 
The overwhelming majority of retries are due to running out of memory controller queues but retries can also be caused by trying to reference addresses that are in a transient cache state -- e.g. a line is transient after issuing a DCLAIM instruction to a shared line but before the associated store completes. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FAB_CMD_RETRIED], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FAB_CMD_RETRIED] }, [ POWER5p_PME_PM_L3SA_SHR_INV ] = { .pme_name = "PM_L3SA_SHR_INV", .pme_code = 0x710c3, .pme_short_desc = "L3 slice A transition from shared to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched).", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SA_SHR_INV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SA_SHR_INV] }, [ POWER5p_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL ] = { .pme_name = "PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL", .pme_code = 0x713c1, .pme_short_desc = "L2 slice B RC dispatch attempt failed due to all CO busy", .pme_long_desc = "A Read/Claim dispatch was rejected because all Castout machines were busy.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL] }, [ POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SA_RCST_DISP_FAIL_ADDR", .pme_code = 0x712c0, .pme_short_desc = "L2 slice A RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a store failed because of an address conflict. 
Two RC machines will never both work on the same line or line in the same congruence class at the same time.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_ADDR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_ADDR] }, [ POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SA_RCLD_DISP_FAIL_RC_FULL", .pme_code = 0x721e0, .pme_short_desc = "L2 slice A RC load dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a load failed because all RC machines are busy.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_RC_FULL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_RC_FULL] }, [ POWER5p_PME_PM_PTEG_FROM_L375_MOD ] = { .pme_name = "PM_PTEG_FROM_L375_MOD", .pme_code = 0x1830a7, .pme_short_desc = "PTEG loaded from L3.75 modified", .pme_long_desc = "A Page Table Entry was loaded into the TLB with modified (M) data from the L3 of a chip on a different module than this processor is located, due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PTEG_FROM_L375_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PTEG_FROM_L375_MOD] }, [ POWER5p_PME_PM_MRK_LSU1_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU1_FLUSH_UST", .pme_code = 0x810c5, .pme_short_desc = "LSU1 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary)", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LSU1_FLUSH_UST], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LSU1_FLUSH_UST] }, [ POWER5p_PME_PM_BR_ISSUED ] = { .pme_name = "PM_BR_ISSUED", .pme_code = 0x230e4, .pme_short_desc = "Branches issued", .pme_long_desc = "A branch instruction was issued to the branch unit. 
A branch that was incorrectly predicted may issue and execute multiple times.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_BR_ISSUED], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_BR_ISSUED] }, [ POWER5p_PME_PM_MRK_GRP_BR_REDIR ] = { .pme_name = "PM_MRK_GRP_BR_REDIR", .pme_code = 0x212091, .pme_short_desc = "Group experienced marked branch redirect", .pme_long_desc = "A group containing a marked (sampled) instruction experienced a branch redirect.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_GRP_BR_REDIR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_GRP_BR_REDIR] }, [ POWER5p_PME_PM_EE_OFF ] = { .pme_name = "PM_EE_OFF", .pme_code = 0x130e3, .pme_short_desc = "Cycles MSR(EE) bit off", .pme_long_desc = "Cycles MSR(EE) bit was off indicating that interrupts due to external exceptions were masked.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_EE_OFF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_EE_OFF] }, [ POWER5p_PME_PM_IERAT_XLATE_WR_LP ] = { .pme_name = "PM_IERAT_XLATE_WR_LP", .pme_code = 0x210c6, .pme_short_desc = "Large page translation written to ierat", .pme_long_desc = "An entry was written into the IERAT as a result of an IERAT miss. This event can be used to count IERAT misses. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_IERAT_XLATE_WR_LP], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_IERAT_XLATE_WR_LP] }, [ POWER5p_PME_PM_DTLB_REF_64K ] = { .pme_name = "PM_DTLB_REF_64K", .pme_code = 0x2c2086, .pme_short_desc = "Data TLB reference for 64K page", .pme_long_desc = "Data TLB references for 64KB pages. 
Includes hits + misses.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DTLB_REF_64K], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DTLB_REF_64K] }, [ POWER5p_PME_PM_MEM_RQ_DISP_Q4to7 ] = { .pme_name = "PM_MEM_RQ_DISP_Q4to7", .pme_code = 0x712c6, .pme_short_desc = "Memory read queue dispatched to queues 4-7", .pme_long_desc = "A memory operation was dispatched to read queue 4,5,6 or 7. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_RQ_DISP_Q4to7], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_RQ_DISP_Q4to7] }, [ POWER5p_PME_PM_MEM_FAST_PATH_RD_DISP ] = { .pme_name = "PM_MEM_FAST_PATH_RD_DISP", .pme_code = 0x731e6, .pme_short_desc = "Fast path memory read dispatched", .pme_long_desc = "Fast path memory read dispatched", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_FAST_PATH_RD_DISP], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_FAST_PATH_RD_DISP] }, [ POWER5p_PME_PM_INST_FROM_L3 ] = { .pme_name = "PM_INST_FROM_L3", .pme_code = 0x12208d, .pme_short_desc = "Instruction fetched from L3", .pme_long_desc = "An instruction fetch group was fetched from the local L3. 
Fetch groups can contain up to 8 instructions", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_FROM_L3], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_FROM_L3] }, [ POWER5p_PME_PM_ITLB_MISS ] = { .pme_name = "PM_ITLB_MISS", .pme_code = 0x800c0, .pme_short_desc = "Instruction TLB misses", .pme_long_desc = "A TLB miss for an Instruction Fetch has occurred", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_ITLB_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_ITLB_MISS] }, [ POWER5p_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { .pme_name = "PM_FXU1_BUSY_FXU0_IDLE", .pme_code = 0x400012, .pme_short_desc = "FXU1 busy FXU0 idle", .pme_long_desc = "FXU0 was idle while FXU1 was busy.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FXU1_BUSY_FXU0_IDLE], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FXU1_BUSY_FXU0_IDLE] }, [ POWER5p_PME_PM_DTLB_REF_4K ] = { .pme_name = "PM_DTLB_REF_4K", .pme_code = 0x1c2086, .pme_short_desc = "Data TLB reference for 4K page", .pme_long_desc = "Data TLB references for 4KB pages. Includes hits + misses.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DTLB_REF_4K], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DTLB_REF_4K] }, [ POWER5p_PME_PM_FXLS_FULL_CYC ] = { .pme_name = "PM_FXLS_FULL_CYC", .pme_code = 0x1110a8, .pme_short_desc = "Cycles FXLS queue is full", .pme_long_desc = "Cycles when the issue queues for one or both FXU/LSU units is full. Use with caution since this is the sum of cycles when Unit 0 was full plus Unit 1 full. It does not indicate when both units were full.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FXLS_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FXLS_FULL_CYC] }, [ POWER5p_PME_PM_GRP_DISP_VALID ] = { .pme_name = "PM_GRP_DISP_VALID", .pme_code = 0x120e3, .pme_short_desc = "Group dispatch valid", .pme_long_desc = "A group is available for dispatch. 
This does not mean it was successfully dispatched.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GRP_DISP_VALID], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GRP_DISP_VALID] }, [ POWER5p_PME_PM_LSU_FLUSH_UST ] = { .pme_name = "PM_LSU_FLUSH_UST", .pme_code = 0x2c0088, .pme_short_desc = "SRQ unaligned store flushes", .pme_long_desc = "A store was flushed because it was unaligned (crossed a 4K boundary). Combined Unit 0 + 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_FLUSH_UST], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_FLUSH_UST] }, [ POWER5p_PME_PM_FXU1_FIN ] = { .pme_name = "PM_FXU1_FIN", .pme_code = 0x130e6, .pme_short_desc = "FXU1 produced a result", .pme_long_desc = "The Fixed Point unit 1 finished an instruction and produced a result. Instructions that finish may not necessarily complete.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FXU1_FIN], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FXU1_FIN] }, [ POWER5p_PME_PM_THRD_PRIO_4_CYC ] = { .pme_name = "PM_THRD_PRIO_4_CYC", .pme_code = 0x420e3, .pme_short_desc = "Cycles thread running at priority level 4", .pme_long_desc = "Cycles this thread was running at priority level 4.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_PRIO_4_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_PRIO_4_CYC] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L35_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L35_MOD", .pme_code = 0x2c709e, .pme_short_desc = "Marked data loaded from L3.5 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on the same module as this processor is located due to a marked load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L35_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L35_MOD] }, [ POWER5p_PME_PM_4INST_CLB_CYC ] = { .pme_name = "PM_4INST_CLB_CYC", .pme_code = 0x400c4, .pme_short_desc = "Cycles 4 instructions in CLB", .pme_long_desc = 
"The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_4INST_CLB_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_4INST_CLB_CYC] }, [ POWER5p_PME_PM_MRK_DTLB_REF_16M ] = { .pme_name = "PM_MRK_DTLB_REF_16M", .pme_code = 0x3c6086, .pme_short_desc = "Marked Data TLB reference for 16M page", .pme_long_desc = "Data TLB references by a marked instruction for 16MB pages.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DTLB_REF_16M], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DTLB_REF_16M] }, [ POWER5p_PME_PM_INST_FROM_L375_MOD ] = { .pme_name = "PM_INST_FROM_L375_MOD", .pme_code = 0x42209d, .pme_short_desc = "Instruction fetched from L3.75 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from the L3 of a chip on a different module than this processor is located. Fetch groups can contain up to 8 instructions", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_FROM_L375_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_FROM_L375_MOD] }, [ POWER5p_PME_PM_GRP_CMPL ] = { .pme_name = "PM_GRP_CMPL", .pme_code = 0x300013, .pme_short_desc = "Group completed", .pme_long_desc = "A group completed. 
Microcoded instructions that span multiple groups will generate this event once per group.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GRP_CMPL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GRP_CMPL] }, [ POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SC_RCST_DISP_FAIL_ADDR", .pme_code = 0x712c2, .pme_short_desc = "L2 slice C RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_ADDR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_ADDR] }, [ POWER5p_PME_PM_FPU1_1FLOP ] = { .pme_name = "PM_FPU1_1FLOP", .pme_code = 0xc7, .pme_short_desc = "FPU1 executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "FPU1 executed add, mult, sub, cmp or sel instruction.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU1_1FLOP], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU1_1FLOP] }, [ POWER5p_PME_PM_FPU_FRSP_FCONV ] = { .pme_name = "PM_FPU_FRSP_FCONV", .pme_code = 0x2010a8, .pme_short_desc = "FPU executed FRSP or FCONV instructions", .pme_long_desc = "The floating point unit has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU_FRSP_FCONV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU_FRSP_FCONV] }, [ POWER5p_PME_PM_L3SC_REF ] = { .pme_name = "PM_L3SC_REF", .pme_code = 0x701c5, .pme_short_desc = "L3 slice C references", .pme_long_desc = "Number of attempts made by this chip's cores to find data in the L3. 
Reported per L3 slice.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SC_REF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SC_REF] }, [ POWER5p_PME_PM_5INST_CLB_CYC ] = { .pme_name = "PM_5INST_CLB_CYC", .pme_code = 0x400c5, .pme_short_desc = "Cycles 5 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_5INST_CLB_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_5INST_CLB_CYC] }, [ POWER5p_PME_PM_THRD_L2MISS_BOTH_CYC ] = { .pme_name = "PM_THRD_L2MISS_BOTH_CYC", .pme_code = 0x410c7, .pme_short_desc = "Cycles both threads in L2 misses", .pme_long_desc = "Cycles that both threads have L2 miss pending. If only one thread has a L2 miss pending the other thread is given priority at decode. If both threads have L2 miss pending decode priority is determined by the number of GCT entries used.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_L2MISS_BOTH_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_L2MISS_BOTH_CYC] }, [ POWER5p_PME_PM_MEM_PW_GATH ] = { .pme_name = "PM_MEM_PW_GATH", .pme_code = 0x714c6, .pme_short_desc = "Memory partial-write gathered", .pme_long_desc = "Two or more partial-writes have been merged into a single memory write. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_PW_GATH], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_PW_GATH] }, [ POWER5p_PME_PM_DTLB_REF_16G ] = { .pme_name = "PM_DTLB_REF_16G", .pme_code = 0x4c2086, .pme_short_desc = "Data TLB reference for 16G page", .pme_long_desc = "Data TLB references for 16GB pages. Includes hits + misses.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DTLB_REF_16G], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DTLB_REF_16G] }, [ POWER5p_PME_PM_FAB_DCLAIM_ISSUED ] = { .pme_name = "PM_FAB_DCLAIM_ISSUED", .pme_code = 0x720e7, .pme_short_desc = "dclaim issued", .pme_long_desc = "A DCLAIM command was issued. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FAB_DCLAIM_ISSUED], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FAB_DCLAIM_ISSUED] }, [ POWER5p_PME_PM_FAB_PNtoNN_SIDECAR ] = { .pme_name = "PM_FAB_PNtoNN_SIDECAR", .pme_code = 0x713c7, .pme_short_desc = "PN to NN beat went to sidecar first", .pme_long_desc = "Fabric Data beats that the base chip takes the inbound PN data and forwards it on to the outbound NN data bus after going into a sidecar first. The signal is delivered at FBC speed and the count must be scaled.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FAB_PNtoNN_SIDECAR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FAB_PNtoNN_SIDECAR] }, [ POWER5p_PME_PM_GRP_IC_MISS ] = { .pme_name = "PM_GRP_IC_MISS", .pme_code = 0x120e7, .pme_short_desc = "Group experienced I cache miss", .pme_long_desc = "Number of groups, counted at dispatch, that have encountered an icache miss redirect. 
Every group constructed from a fetch group that missed the instruction cache will count.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GRP_IC_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GRP_IC_MISS] }, [ POWER5p_PME_PM_INST_FROM_L35_SHR ] = { .pme_name = "PM_INST_FROM_L35_SHR", .pme_code = 0x12209d, .pme_short_desc = "Instruction fetched from L3.5 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (S) data from the L3 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_FROM_L35_SHR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_FROM_L35_SHR] }, [ POWER5p_PME_PM_LSU_LMQ_FULL_CYC ] = { .pme_name = "PM_LSU_LMQ_FULL_CYC", .pme_code = 0xc30e7, .pme_short_desc = "Cycles LMQ full", .pme_long_desc = "The Load Miss Queue was full.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_LMQ_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_LMQ_FULL_CYC] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L2_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2_CYC", .pme_code = 0x2c70a0, .pme_short_desc = "Marked load latency from L2", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L2_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L2_CYC] }, [ POWER5p_PME_PM_LSU_SRQ_SYNC_CYC ] = { .pme_name = "PM_LSU_SRQ_SYNC_CYC", .pme_code = 0x830e5, .pme_short_desc = "SRQ sync duration", .pme_long_desc = "Cycles that a sync instruction is active in the Store Request Queue.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_SRQ_SYNC_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_SRQ_SYNC_CYC] }, [ POWER5p_PME_PM_LSU0_BUSY_REJECT ] = { .pme_name = "PM_LSU0_BUSY_REJECT", .pme_code = 0xc20e1, .pme_short_desc = "LSU0 busy due to reject", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU0_BUSY_REJECT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU0_BUSY_REJECT] }, [ POWER5p_PME_PM_LSU_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU_REJECT_ERAT_MISS", .pme_code = 0x1c4090, .pme_short_desc = "LSU reject due to ERAT miss", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions due to an ERAT miss. Combined unit 0 + 1. Requests that miss the Derat are rejected and retried until the request hits in the Erat.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_REJECT_ERAT_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_REJECT_ERAT_MISS] }, [ POWER5p_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM_CYC", .pme_code = 0x4c70a1, .pme_short_desc = "Marked load latency from remote memory", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_RMEM_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_RMEM_CYC] }, [ POWER5p_PME_PM_DATA_FROM_L375_SHR ] = { .pme_name = "PM_DATA_FROM_L375_SHR", .pme_code = 0x3c309e, .pme_short_desc = "Data loaded from L3.75 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on a different module than this processor is located due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DATA_FROM_L375_SHR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DATA_FROM_L375_SHR] }, [ POWER5p_PME_PM_PTEG_FROM_L25_MOD ] = { .pme_name = "PM_PTEG_FROM_L25_MOD", .pme_code = 0x283097, .pme_short_desc = "PTEG loaded from L2.5 modified", .pme_long_desc = "A Page Table Entry was loaded into the TLB with modified (M) data from the L2 of a chip on the same module as this processor is located due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PTEG_FROM_L25_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PTEG_FROM_L25_MOD] }, [ POWER5p_PME_PM_FPU0_FMOV_FEST ] = { .pme_name = "PM_FPU0_FMOV_FEST", .pme_code = 0x10c0, .pme_short_desc = "FPU0 executed FMOV or FEST instructions", .pme_long_desc = "FPU0 has executed a move kind of instruction or one of the estimate instructions. 
This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU0_FMOV_FEST], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU0_FMOV_FEST] }, [ POWER5p_PME_PM_THRD_PRIO_7_CYC ] = { .pme_name = "PM_THRD_PRIO_7_CYC", .pme_code = 0x420e6, .pme_short_desc = "Cycles thread running at priority level 7", .pme_long_desc = "Cycles this thread was running at priority level 7.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_PRIO_7_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_PRIO_7_CYC] }, [ POWER5p_PME_PM_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_LSU1_FLUSH_SRQ", .pme_code = 0xc00c7, .pme_short_desc = "LSU1 SRQ lhs flushes", .pme_long_desc = "A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU1_FLUSH_SRQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU1_FLUSH_SRQ] }, [ POWER5p_PME_PM_LD_REF_L1_LSU0 ] = { .pme_name = "PM_LD_REF_L1_LSU0", .pme_code = 0xc10c0, .pme_short_desc = "LSU0 L1 D cache load references", .pme_long_desc = "Load references to Level 1 Data Cache, by unit 0.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LD_REF_L1_LSU0], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LD_REF_L1_LSU0] }, [ POWER5p_PME_PM_L2SC_RCST_DISP ] = { .pme_name = "PM_L2SC_RCST_DISP", .pme_code = 0x702c2, .pme_short_desc = "L2 slice C RC store dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Store was attempted.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SC_RCST_DISP], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SC_RCST_DISP] }, [ POWER5p_PME_PM_CMPLU_STALL_DIV ] = { .pme_name = "PM_CMPLU_STALL_DIV", .pme_code = 0x411099, .pme_short_desc = "Completion stall caused by DIV instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish 
before completion resumes was a fixed point divide instruction. This is a subset of PM_CMPLU_STALL_FXU.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_CMPLU_STALL_DIV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_CMPLU_STALL_DIV] }, [ POWER5p_PME_PM_MEM_RQ_DISP_Q12to15 ] = { .pme_name = "PM_MEM_RQ_DISP_Q12to15", .pme_code = 0x732e6, .pme_short_desc = "Memory read queue dispatched to queues 12-15", .pme_long_desc = "A memory operation was dispatched to read queue 12,13,14 or 15. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_RQ_DISP_Q12to15], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_RQ_DISP_Q12to15] }, [ POWER5p_PME_PM_INST_FROM_L375_SHR ] = { .pme_name = "PM_INST_FROM_L375_SHR", .pme_code = 0x32209d, .pme_short_desc = "Instruction fetched from L3.75 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (S) data from the L3 of a chip on a different module than this processor is located. Fetch groups can contain up to 8 instructions", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_FROM_L375_SHR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_FROM_L375_SHR] }, [ POWER5p_PME_PM_ST_REF_L1 ] = { .pme_name = "PM_ST_REF_L1", .pme_code = 0x2c10a8, .pme_short_desc = "L1 D cache store references", .pme_long_desc = "Store references to the Data Cache. 
Combined Unit 0 + 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_ST_REF_L1], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_ST_REF_L1] }, [ POWER5p_PME_PM_L3SB_ALL_BUSY ] = { .pme_name = "PM_L3SB_ALL_BUSY", .pme_code = 0x721e4, .pme_short_desc = "L3 slice B active for every cycle all CI/CO machines busy", .pme_long_desc = "Cycles All Castin/Castout machines are busy.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SB_ALL_BUSY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SB_ALL_BUSY] }, [ POWER5p_PME_PM_FAB_P1toVNorNN_SIDECAR_EMPTY ] = { .pme_name = "PM_FAB_P1toVNorNN_SIDECAR_EMPTY", .pme_code = 0x711c7, .pme_short_desc = "P1 to VN/NN sidecar empty", .pme_long_desc = "Fabric cycles when the Plus-1 jump sidecar (sidecars for mcm to mcm data transfer) is empty. The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FAB_P1toVNorNN_SIDECAR_EMPTY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FAB_P1toVNorNN_SIDECAR_EMPTY] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L275_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L275_SHR_CYC", .pme_code = 0x2c70a3, .pme_short_desc = "Marked load latency from L2.75 shared", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L275_SHR_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L275_SHR_CYC] }, [ POWER5p_PME_PM_FAB_HOLDtoNN_EMPTY ] = { .pme_name = "PM_FAB_HOLDtoNN_EMPTY", .pme_code = 0x722e7, .pme_short_desc = "Hold buffer to NN empty", .pme_long_desc = "Fabric cycles when the Next Node out hold-buffers are empty. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FAB_HOLDtoNN_EMPTY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FAB_HOLDtoNN_EMPTY] }, [ POWER5p_PME_PM_DATA_FROM_LMEM ] = { .pme_name = "PM_DATA_FROM_LMEM", .pme_code = 0x2c3087, .pme_short_desc = "Data loaded from local memory", .pme_long_desc = "The processor's Data Cache was reloaded from memory attached to the same module this processor is located on.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DATA_FROM_LMEM], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DATA_FROM_LMEM] }, [ POWER5p_PME_PM_RUN_CYC ] = { .pme_name = "PM_RUN_CYC", .pme_code = 0x100005, .pme_short_desc = "Run cycles", .pme_long_desc = "Processor Cycles gated by the run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. Gating by the run latch filters out the idle loop.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_RUN_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_RUN_CYC] }, [ POWER5p_PME_PM_PTEG_FROM_RMEM ] = { .pme_name = "PM_PTEG_FROM_RMEM", .pme_code = 0x1830a1, .pme_short_desc = "PTEG loaded from remote memory", .pme_long_desc = "A Page Table Entry was loaded into the TLB from memory attached to a different module than this processor is located on.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PTEG_FROM_RMEM], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PTEG_FROM_RMEM] }, [ POWER5p_PME_PM_L2SC_RCLD_DISP ] = { .pme_name = "PM_L2SC_RCLD_DISP", .pme_code = 0x701c2, .pme_short_desc = "L2 slice C RC load dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Load was attempted", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SC_RCLD_DISP], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SC_RCLD_DISP] }, [ POWER5p_PME_PM_LSU_LRQ_S0_VALID ] = { .pme_name = "PM_LSU_LRQ_S0_VALID", .pme_code = 
0xc60e6, .pme_short_desc = "LRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Load Request Queue slot zero is valid. The LRQ is 32 entries long and is allocated round-robin. In SMT mode the LRQ is split between the two threads (16 entries each).", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_LRQ_S0_VALID], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_LRQ_S0_VALID] }, [ POWER5p_PME_PM_LSU0_LDF ] = { .pme_name = "PM_LSU0_LDF", .pme_code = 0xc50c0, .pme_short_desc = "LSU0 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed by LSU0", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU0_LDF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU0_LDF] }, [ POWER5p_PME_PM_PMC3_OVERFLOW ] = { .pme_name = "PM_PMC3_OVERFLOW", .pme_code = 0x40000a, .pme_short_desc = "PMC3 Overflow", .pme_long_desc = "Overflows from PMC3 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PMC3_OVERFLOW], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PMC3_OVERFLOW] }, [ POWER5p_PME_PM_MRK_IMR_RELOAD ] = { .pme_name = "PM_MRK_IMR_RELOAD", .pme_code = 0x820e2, .pme_short_desc = "Marked IMR reloaded", .pme_long_desc = "A DL1 reload occurred due to a marked load", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_IMR_RELOAD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_IMR_RELOAD] }, [ POWER5p_PME_PM_MRK_GRP_TIMEO ] = { .pme_name = "PM_MRK_GRP_TIMEO", .pme_code = 0x40000b, .pme_short_desc = "Marked group completion timeout", .pme_long_desc = "The sampling timeout expired indicating that the previously sampled instruction is no longer in the processor", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_GRP_TIMEO], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_GRP_TIMEO] }, [ 
POWER5p_PME_PM_ST_MISS_L1 ] = { .pme_name = "PM_ST_MISS_L1", .pme_code = 0xc10c3, .pme_short_desc = "L1 D cache store misses", .pme_long_desc = "A store missed the dcache. Combined Unit 0 + 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_ST_MISS_L1], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_ST_MISS_L1] }, [ POWER5p_PME_PM_STOP_COMPLETION ] = { .pme_name = "PM_STOP_COMPLETION", .pme_code = 0x300018, .pme_short_desc = "Completion stopped", .pme_long_desc = "RAS Unit has signaled completion to stop", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_STOP_COMPLETION], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_STOP_COMPLETION] }, [ POWER5p_PME_PM_LSU_BUSY_REJECT ] = { .pme_name = "PM_LSU_BUSY_REJECT", .pme_code = 0x2c2088, .pme_short_desc = "LSU busy due to reject", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions. Combined unit 0 + 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_BUSY_REJECT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_BUSY_REJECT] }, [ POWER5p_PME_PM_ISLB_MISS ] = { .pme_name = "PM_ISLB_MISS", .pme_code = 0x800c1, .pme_short_desc = "Instruction SLB misses", .pme_long_desc = "An SLB miss for an instruction fetch has occurred", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_ISLB_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_ISLB_MISS] }, [ POWER5p_PME_PM_CYC ] = { .pme_name = "PM_CYC", .pme_code = 0xf, .pme_short_desc = "Processor cycles", .pme_long_desc = "Processor cycles", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_CYC] }, [ POWER5p_PME_PM_THRD_ONE_RUN_CYC ] = { .pme_name = "PM_THRD_ONE_RUN_CYC", .pme_code = 0x10000b, .pme_short_desc = "One of the threads in run cycles", .pme_long_desc = "At least one thread has set its run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. 
This event does not respect FCWAIT.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_ONE_RUN_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_ONE_RUN_CYC] }, [ POWER5p_PME_PM_GRP_BR_REDIR_NONSPEC ] = { .pme_name = "PM_GRP_BR_REDIR_NONSPEC", .pme_code = 0x112091, .pme_short_desc = "Group experienced non-speculative branch redirect", .pme_long_desc = "Number of groups, counted at completion, that have encountered a branch redirect.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GRP_BR_REDIR_NONSPEC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GRP_BR_REDIR_NONSPEC] }, [ POWER5p_PME_PM_LSU1_SRQ_STFWD ] = { .pme_name = "PM_LSU1_SRQ_STFWD", .pme_code = 0xc60e5, .pme_short_desc = "LSU1 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 1. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load hits L1 but becomes a store forward, it is not treated as a load miss.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU1_SRQ_STFWD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU1_SRQ_STFWD] }, [ POWER5p_PME_PM_L3SC_MOD_INV ] = { .pme_name = "PM_L3SC_MOD_INV", .pme_code = 0x730e5, .pme_short_desc = "L3 slice C transition from modified to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. 
L3 going M=>I). Mu|Me are not included since they are formed due to a previous read op. Tx is not included since it is considered shared at this point.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SC_MOD_INV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SC_MOD_INV] }, [ POWER5p_PME_PM_L2_PREF ] = { .pme_name = "PM_L2_PREF", .pme_code = 0xc50c3, .pme_short_desc = "L2 cache prefetches", .pme_long_desc = "A request to prefetch data into L2 was made", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2_PREF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2_PREF] }, [ POWER5p_PME_PM_GCT_NOSLOT_BR_MPRED ] = { .pme_name = "PM_GCT_NOSLOT_BR_MPRED", .pme_code = 0x41009c, .pme_short_desc = "No slot in GCT caused by branch mispredict", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread because of a branch misprediction.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GCT_NOSLOT_BR_MPRED], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GCT_NOSLOT_BR_MPRED] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L25_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L25_MOD", .pme_code = 0x2c7097, .pme_short_desc = "Marked data loaded from L2.5 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L2 of a chip on the same module as this processor is located due to a marked load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L25_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L25_MOD] }, [ POWER5p_PME_PM_L2SB_ST_REQ ] = { .pme_name = "PM_L2SB_ST_REQ", .pme_code = 0x723e1, .pme_short_desc = "L2 slice B store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. 
The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SB_ST_REQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SB_ST_REQ] }, [ POWER5p_PME_PM_L2SB_MOD_INV ] = { .pme_name = "PM_L2SB_MOD_INV", .pme_code = 0x730e1, .pme_short_desc = "L2 slice B transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SB_MOD_INV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SB_MOD_INV] }, [ POWER5p_PME_PM_MRK_L1_RELOAD_VALID ] = { .pme_name = "PM_MRK_L1_RELOAD_VALID", .pme_code = 0xc70e4, .pme_short_desc = "Marked L1 reload data source valid", .pme_long_desc = "The source information is valid and is for a marked load", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_L1_RELOAD_VALID], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_L1_RELOAD_VALID] }, [ POWER5p_PME_PM_L3SB_HIT ] = { .pme_name = "PM_L3SB_HIT", .pme_code = 0x711c4, .pme_short_desc = "L3 slice B hits", .pme_long_desc = "Number of attempts made by this chip cores that resulted in an L3 hit. Reported per L3 slice", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SB_HIT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SB_HIT] }, [ POWER5p_PME_PM_L2SB_SHR_MOD ] = { .pme_name = "PM_L2SB_SHR_MOD", .pme_code = 0x700c1, .pme_short_desc = "L2 slice B transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. 
The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SB_SHR_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SB_SHR_MOD] }, [ POWER5p_PME_PM_EE_OFF_EXT_INT ] = { .pme_name = "PM_EE_OFF_EXT_INT", .pme_code = 0x130e7, .pme_short_desc = "Cycles MSR(EE) bit off and external interrupt pending", .pme_long_desc = "Cycles when an interrupt due to an external exception is pending but external exceptions were masked.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_EE_OFF_EXT_INT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_EE_OFF_EXT_INT] }, [ POWER5p_PME_PM_1PLUS_PPC_CMPL ] = { .pme_name = "PM_1PLUS_PPC_CMPL", .pme_code = 0x100013, .pme_short_desc = "One or more PPC instruction completed", .pme_long_desc = "A group containing at least one PPC instruction completed. For microcoded instructions that span multiple groups, this will only occur once.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_1PLUS_PPC_CMPL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_1PLUS_PPC_CMPL] }, [ POWER5p_PME_PM_L2SC_SHR_MOD ] = { .pme_name = "PM_L2SC_SHR_MOD", .pme_code = 0x700c2, .pme_short_desc = "L2 slice C transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SC_SHR_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SC_SHR_MOD] }, [ POWER5p_PME_PM_PMC6_OVERFLOW ] = { .pme_name = "PM_PMC6_OVERFLOW", .pme_code = 0x30001a, .pme_short_desc = "PMC6 Overflow", .pme_long_desc = "Overflows from PMC6 are counted. This effectively widens the PMC. 
The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PMC6_OVERFLOW], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PMC6_OVERFLOW] }, [ POWER5p_PME_PM_IC_PREF_INSTALL ] = { .pme_name = "PM_IC_PREF_INSTALL", .pme_code = 0x210c7, .pme_short_desc = "Instruction prefetched installed in prefetch buffer", .pme_long_desc = "A prefetch buffer entry (line) is allocated but the request is not a demand fetch.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_IC_PREF_INSTALL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_IC_PREF_INSTALL] }, [ POWER5p_PME_PM_LSU_LRQ_FULL_CYC ] = { .pme_name = "PM_LSU_LRQ_FULL_CYC", .pme_code = 0x110c2, .pme_short_desc = "Cycles LRQ full", .pme_long_desc = "Cycles when the LRQ is full.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_LRQ_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_LRQ_FULL_CYC] }, [ POWER5p_PME_PM_TLB_MISS ] = { .pme_name = "PM_TLB_MISS", .pme_code = 0x180088, .pme_short_desc = "TLB misses", .pme_long_desc = "Total of Data TLB misses + Instruction TLB misses", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_TLB_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_TLB_MISS] }, [ POWER5p_PME_PM_GCT_FULL_CYC ] = { .pme_name = "PM_GCT_FULL_CYC", .pme_code = 0x100c0, .pme_short_desc = "Cycles GCT full", .pme_long_desc = "The Global Completion Table is completely full.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GCT_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GCT_FULL_CYC] }, [ POWER5p_PME_PM_FXU_BUSY ] = { .pme_name = "PM_FXU_BUSY", .pme_code = 0x200012, .pme_short_desc = "FXU busy", .pme_long_desc = "Cycles when both FXU0 and FXU1 are busy.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FXU_BUSY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FXU_BUSY] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L3_CYC ] = { 
.pme_name = "PM_MRK_DATA_FROM_L3_CYC", .pme_code = 0x2c70a4, .pme_short_desc = "Marked load latency from L3", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L3_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L3_CYC] }, [ POWER5p_PME_PM_LSU_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU_REJECT_LMQ_FULL", .pme_code = 0x2c4088, .pme_short_desc = "LSU reject due to LMQ full or missed data coming", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all the eight entries are full, subsequent load instructions are rejected. Combined unit 0 + 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_REJECT_LMQ_FULL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_REJECT_LMQ_FULL] }, [ POWER5p_PME_PM_LSU_SRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_SRQ_S0_ALLOC", .pme_code = 0xc20e7, .pme_short_desc = "SRQ slot 0 allocated", .pme_long_desc = "SRQ Slot zero was allocated", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_SRQ_S0_ALLOC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_SRQ_S0_ALLOC] }, [ POWER5p_PME_PM_GRP_MRK ] = { .pme_name = "PM_GRP_MRK", .pme_code = 0x100014, .pme_short_desc = "Group marked in IDU", .pme_long_desc = "A group was sampled (marked). The group is called a marked group. One instruction within the group is tagged for detailed monitoring. The sampled instruction is called a marked instruction. 
Events associated with the marked instruction are annotated with the marked term.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GRP_MRK], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GRP_MRK] }, [ POWER5p_PME_PM_INST_FROM_L25_SHR ] = { .pme_name = "PM_INST_FROM_L25_SHR", .pme_code = 0x122096, .pme_short_desc = "Instruction fetched from L2.5 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (T or SL) data from the L2 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_FROM_L25_SHR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_FROM_L25_SHR] }, [ POWER5p_PME_PM_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_DC_PREF_STREAM_ALLOC", .pme_code = 0x830e7, .pme_short_desc = "D cache new prefetch stream allocated", .pme_long_desc = "A new Prefetch Stream was allocated.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DC_PREF_STREAM_ALLOC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DC_PREF_STREAM_ALLOC] }, [ POWER5p_PME_PM_FPU1_FIN ] = { .pme_name = "PM_FPU1_FIN", .pme_code = 0x10c7, .pme_short_desc = "FPU1 produced a result", .pme_long_desc = "FPU1 finished, produced a result. This only indicates finish, not completion. Floating Point Stores are included in this count but not Floating Point Loads.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU1_FIN], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU1_FIN] }, [ POWER5p_PME_PM_BR_MPRED_TA ] = { .pme_name = "PM_BR_MPRED_TA", .pme_code = 0x230e6, .pme_short_desc = "Branch mispredictions due to target address", .pme_long_desc = "A branch instruction target was incorrectly predicted. 
This will result in a branch mispredict flush unless a flush is detected from an older instruction.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_BR_MPRED_TA], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_BR_MPRED_TA] }, [ POWER5p_PME_PM_MRK_DTLB_REF_64K ] = { .pme_name = "PM_MRK_DTLB_REF_64K", .pme_code = 0x2c6086, .pme_short_desc = "Marked Data TLB reference for 64K page", .pme_long_desc = "Data TLB references by a marked instruction for 64KB pages.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DTLB_REF_64K], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DTLB_REF_64K] }, [ POWER5p_PME_PM_RUN_INST_CMPL ] = { .pme_name = "PM_RUN_INST_CMPL", .pme_code = 0x500009, .pme_short_desc = "Run instructions completed", .pme_long_desc = "Number of run instructions completed.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_RUN_INST_CMPL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_RUN_INST_CMPL] }, [ POWER5p_PME_PM_CRQ_FULL_CYC ] = { .pme_name = "PM_CRQ_FULL_CYC", .pme_code = 0x110c1, .pme_short_desc = "Cycles CR issue queue full", .pme_long_desc = "The issue queue that feeds the Conditional Register unit is full. This condition will prevent dispatch groups from being dispatched. 
This event only indicates that the queue was full, not that dispatch was prevented.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_CRQ_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_CRQ_FULL_CYC] }, [ POWER5p_PME_PM_L2SA_RCLD_DISP ] = { .pme_name = "PM_L2SA_RCLD_DISP", .pme_code = 0x701c0, .pme_short_desc = "L2 slice A RC load dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Load was attempted", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SA_RCLD_DISP], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SA_RCLD_DISP] }, [ POWER5p_PME_PM_SNOOP_WR_RETRY_QFULL ] = { .pme_name = "PM_SNOOP_WR_RETRY_QFULL", .pme_code = 0x710c6, .pme_short_desc = "Snoop write retry due to write queue full", .pme_long_desc = "A snoop request for a write to memory was retried because the write queues were full. When this happens the snoop request is retried and the writes in the write reorder queue are changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_SNOOP_WR_RETRY_QFULL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_SNOOP_WR_RETRY_QFULL] }, [ POWER5p_PME_PM_MRK_DTLB_REF_4K ] = { .pme_name = "PM_MRK_DTLB_REF_4K", .pme_code = 0x1c6086, .pme_short_desc = "Marked Data TLB reference for 4K page", .pme_long_desc = "Data TLB references by a marked instruction for 4KB pages.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DTLB_REF_4K], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DTLB_REF_4K] }, [ POWER5p_PME_PM_LSU_SRQ_S0_VALID ] = { .pme_name = "PM_LSU_SRQ_S0_VALID", .pme_code = 0xc20e6, .pme_short_desc = "SRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. 
In SMT mode the SRQ is split between the two threads (16 entries each).", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_SRQ_S0_VALID], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_SRQ_S0_VALID] }, [ POWER5p_PME_PM_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_LSU0_FLUSH_LRQ", .pme_code = 0xc00c2, .pme_short_desc = "LSU0 LRQ flushes", .pme_long_desc = "A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU0_FLUSH_LRQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU0_FLUSH_LRQ] }, [ POWER5p_PME_PM_INST_FROM_L275_MOD ] = { .pme_name = "PM_INST_FROM_L275_MOD", .pme_code = 0x422096, .pme_short_desc = "Instruction fetched from L2.75 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from the L2 on a different module than this processor is located. 
Fetch groups can contain up to 8 instructions", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_FROM_L275_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_FROM_L275_MOD] }, [ POWER5p_PME_PM_GCT_EMPTY_CYC ] = { .pme_name = "PM_GCT_EMPTY_CYC", .pme_code = 0x200004, .pme_short_desc = "Cycles GCT empty", .pme_long_desc = "The Global Completion Table is completely empty", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GCT_EMPTY_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GCT_EMPTY_CYC] }, [ POWER5p_PME_PM_LARX_LSU0 ] = { .pme_name = "PM_LARX_LSU0", .pme_code = 0x820e7, .pme_short_desc = "Larx executed on LSU0", .pme_long_desc = "A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0)", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LARX_LSU0], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LARX_LSU0] }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_5or6_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_5or6_CYC", .pme_code = 0x430e6, .pme_short_desc = "Cycles thread priority difference is 5 or 6", .pme_long_desc = "Cycles when this thread's priority is higher than the other thread's priority by 5 or 6.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_PRIO_DIFF_5or6_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_PRIO_DIFF_5or6_CYC] }, [ POWER5p_PME_PM_SNOOP_RETRY_1AHEAD ] = { .pme_name = "PM_SNOOP_RETRY_1AHEAD", .pme_code = 0x725e6, .pme_short_desc = "Snoop retry due to one ahead collision", .pme_long_desc = "Snoop retry due to one ahead collision", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_SNOOP_RETRY_1AHEAD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_SNOOP_RETRY_1AHEAD] }, [ POWER5p_PME_PM_FPU1_FSQRT ] = { .pme_name = "PM_FPU1_FSQRT", .pme_code = 0xc6, .pme_short_desc = "FPU1 executed FSQRT instruction", .pme_long_desc = "FPU1 has executed a square root instruction. 
This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU1_FSQRT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU1_FSQRT] }, [ POWER5p_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU1", .pme_code = 0x820e4, .pme_short_desc = "LSU1 marked L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache, by LSU1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LD_MISS_L1_LSU1], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LD_MISS_L1_LSU1] }, [ POWER5p_PME_PM_MRK_FPU_FIN ] = { .pme_name = "PM_MRK_FPU_FIN", .pme_code = 0x300014, .pme_short_desc = "Marked instruction FPU processing finished", .pme_long_desc = "One of the Floating Point Units finished a marked instruction. Instructions that finish may not necessarily complete", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_FPU_FIN], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_FPU_FIN] }, [ POWER5p_PME_PM_THRD_PRIO_5_CYC ] = { .pme_name = "PM_THRD_PRIO_5_CYC", .pme_code = 0x420e4, .pme_short_desc = "Cycles thread running at priority level 5", .pme_long_desc = "Cycles this thread was running at priority level 5.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_PRIO_5_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_PRIO_5_CYC] }, [ POWER5p_PME_PM_MRK_DATA_FROM_LMEM ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM", .pme_code = 0x2c7087, .pme_short_desc = "Marked data loaded from local memory", .pme_long_desc = "The processor's Data Cache was reloaded due to a marked load from memory attached to the same module this processor is located on.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_LMEM], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_LMEM] }, [ POWER5p_PME_PM_SNOOP_TLBIE ] = { .pme_name = "PM_SNOOP_TLBIE", .pme_code = 0x800c3, .pme_short_desc = "Snoop TLBIE", .pme_long_desc = "A tlbie was snooped from 
another processor.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_SNOOP_TLBIE], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_SNOOP_TLBIE] }, [ POWER5p_PME_PM_FPU1_FRSP_FCONV ] = { .pme_name = "PM_FPU1_FRSP_FCONV", .pme_code = 0x10c5, .pme_short_desc = "FPU1 executed FRSP or FCONV instructions", .pme_long_desc = "FPU1 has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU1_FRSP_FCONV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU1_FRSP_FCONV] }, [ POWER5p_PME_PM_DTLB_MISS_16G ] = { .pme_name = "PM_DTLB_MISS_16G", .pme_code = 0x4c208d, .pme_short_desc = "Data TLB miss for 16G page", .pme_long_desc = "Data TLB references to 16GB pages that missed the TLB. Page size is determined at TLB reload time.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DTLB_MISS_16G], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DTLB_MISS_16G] }, [ POWER5p_PME_PM_L3SB_SNOOP_RETRY ] = { .pme_name = "PM_L3SB_SNOOP_RETRY", .pme_code = 0x731e4, .pme_short_desc = "L3 slice B snoop retries", .pme_long_desc = "Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b)", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SB_SNOOP_RETRY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SB_SNOOP_RETRY] }, [ POWER5p_PME_PM_FAB_VBYPASS_EMPTY ] = { .pme_name = "PM_FAB_VBYPASS_EMPTY", .pme_code = 0x731e7, .pme_short_desc = "Vertical bypass buffer empty", .pme_long_desc = "Fabric cycles when the Middle Bypass sidecar is empty. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FAB_VBYPASS_EMPTY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FAB_VBYPASS_EMPTY] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L275_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L275_MOD", .pme_code = 0x1c70a3, .pme_short_desc = "Marked data loaded from L2.75 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L2 on a different module than this processor is located due to a marked load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L275_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L275_MOD] }, [ POWER5p_PME_PM_L2SB_RCST_DISP ] = { .pme_name = "PM_L2SB_RCST_DISP", .pme_code = 0x702c1, .pme_short_desc = "L2 slice B RC store dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Store was attempted.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SB_RCST_DISP], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SB_RCST_DISP] }, [ POWER5p_PME_PM_6INST_CLB_CYC ] = { .pme_name = "PM_6INST_CLB_CYC", .pme_code = 0x400c6, .pme_short_desc = "Cycles 6 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. 
Each thread has its own set of CLB; these events are thread specific.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_6INST_CLB_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_6INST_CLB_CYC] }, [ POWER5p_PME_PM_FLUSH ] = { .pme_name = "PM_FLUSH", .pme_code = 0x110c7, .pme_short_desc = "Flushes", .pme_long_desc = "Flushes occurred including LSU and Branch flushes.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FLUSH], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FLUSH] }, [ POWER5p_PME_PM_L2SC_MOD_INV ] = { .pme_name = "PM_L2SC_MOD_INV", .pme_code = 0x730e2, .pme_short_desc = "L2 slice C transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SC_MOD_INV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SC_MOD_INV] }, [ POWER5p_PME_PM_FPU_DENORM ] = { .pme_name = "PM_FPU_DENORM", .pme_code = 0x102088, .pme_short_desc = "FPU received denormalized data", .pme_long_desc = "The floating point unit has encountered a denormalized operand. Combined Unit 0 + Unit 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU_DENORM], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU_DENORM] }, [ POWER5p_PME_PM_L3SC_HIT ] = { .pme_name = "PM_L3SC_HIT", .pme_code = 0x711c5, .pme_short_desc = "L3 slice C hits", .pme_long_desc = "Number of attempts made by this chip cores that resulted in an L3 hit. 
Reported per L3 Slice", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SC_HIT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SC_HIT] }, [ POWER5p_PME_PM_SNOOP_WR_RETRY_RQ ] = { .pme_name = "PM_SNOOP_WR_RETRY_RQ", .pme_code = 0x706c6, .pme_short_desc = "Snoop write/dclaim retry due to collision with active read queue", .pme_long_desc = "A snoop request for a write or dclaim to memory was retried because it matched the cacheline of an active read. This event is sent from the Memory Controller clock domain and must be scaled accordingly", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_SNOOP_WR_RETRY_RQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_SNOOP_WR_RETRY_RQ] }, [ POWER5p_PME_PM_LSU1_REJECT_SRQ ] = { .pme_name = "PM_LSU1_REJECT_SRQ", .pme_code = 0xc40c4, .pme_short_desc = "LSU1 SRQ lhs rejects", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions because of Load Hit Store conditions. Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU1_REJECT_SRQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU1_REJECT_SRQ] }, [ POWER5p_PME_PM_L3SC_ALL_BUSY ] = { .pme_name = "PM_L3SC_ALL_BUSY", .pme_code = 0x721e5, .pme_short_desc = "L3 slice C active for every cycle all CI/CO machines busy", .pme_long_desc = "Cycles All Castin/Castout machines are busy.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SC_ALL_BUSY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SC_ALL_BUSY] }, [ POWER5p_PME_PM_IC_PREF_REQ ] = { .pme_name = "PM_IC_PREF_REQ", .pme_code = 0x220e6, .pme_short_desc = "Instruction prefetch requests", .pme_long_desc = "An instruction prefetch request has been made.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_IC_PREF_REQ], .pme_group_vector = 
power5p_group_vecs[POWER5p_PME_PM_IC_PREF_REQ] }, [ POWER5p_PME_PM_MRK_GRP_IC_MISS ] = { .pme_name = "PM_MRK_GRP_IC_MISS", .pme_code = 0x412091, .pme_short_desc = "Group experienced marked I cache miss", .pme_long_desc = "A group containing a marked (sampled) instruction experienced an instruction cache miss.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_GRP_IC_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_GRP_IC_MISS] }, [ POWER5p_PME_PM_GCT_NOSLOT_IC_MISS ] = { .pme_name = "PM_GCT_NOSLOT_IC_MISS", .pme_code = 0x21009c, .pme_short_desc = "No slot in GCT caused by I cache miss", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread because of an Instruction Cache miss.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GCT_NOSLOT_IC_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GCT_NOSLOT_IC_MISS] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L3 ] = { .pme_name = "PM_MRK_DATA_FROM_L3", .pme_code = 0x1c708e, .pme_short_desc = "Marked data loaded from L3", .pme_long_desc = "The processor's Data Cache was reloaded from the local L3 due to a marked load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L3], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L3] }, [ POWER5p_PME_PM_GCT_NOSLOT_SRQ_FULL ] = { .pme_name = "PM_GCT_NOSLOT_SRQ_FULL", .pme_code = 0x310084, .pme_short_desc = "No slot in GCT caused by SRQ full", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread because the Store Request Queue (SRQ) is full. This happens when the storage subsystem can not process the stores in the SRQ. 
Groups can not be dispatched until a SRQ entry is available.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GCT_NOSLOT_SRQ_FULL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GCT_NOSLOT_SRQ_FULL] }, [ POWER5p_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { .pme_name = "PM_CMPLU_STALL_DCACHE_MISS", .pme_code = 0x21109a, .pme_short_desc = "Completion stall caused by D cache miss", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered a Data Cache Miss. Data Cache Miss has higher priority than any other Load/Store delay, so if an instruction encounters multiple delays only the Data Cache Miss will be reported and the entire delay period will be charged to Data Cache Miss. This is a subset of PM_CMPLU_STALL_LSU.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_CMPLU_STALL_DCACHE_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_CMPLU_STALL_DCACHE_MISS] }, [ POWER5p_PME_PM_THRD_SEL_OVER_ISU_HOLD ] = { .pme_name = "PM_THRD_SEL_OVER_ISU_HOLD", .pme_code = 0x410c5, .pme_short_desc = "Thread selection overrides caused by ISU holds", .pme_long_desc = "Thread selection was overridden because of an ISU hold.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_SEL_OVER_ISU_HOLD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_SEL_OVER_ISU_HOLD] }, [ POWER5p_PME_PM_LSU_FLUSH_LRQ ] = { .pme_name = "PM_LSU_FLUSH_LRQ", .pme_code = 0x2c0090, .pme_short_desc = "LRQ flushes", .pme_long_desc = "A load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. 
Combined Units 0 and 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_FLUSH_LRQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_FLUSH_LRQ] }, [ POWER5p_PME_PM_THRD_PRIO_2_CYC ] = { .pme_name = "PM_THRD_PRIO_2_CYC", .pme_code = 0x420e1, .pme_short_desc = "Cycles thread running at priority level 2", .pme_long_desc = "Cycles this thread was running at priority level 2.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_PRIO_2_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_PRIO_2_CYC] }, [ POWER5p_PME_PM_L3SA_MOD_INV ] = { .pme_name = "PM_L3SA_MOD_INV", .pme_code = 0x730e3, .pme_short_desc = "L3 slice A transition from modified to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I) Mu|Me are not included since they are formed due to a prev read op. Tx is not included since it is considered shared at this point.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SA_MOD_INV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SA_MOD_INV] }, [ POWER5p_PME_PM_LSU_FLUSH_SRQ ] = { .pme_name = "PM_LSU_FLUSH_SRQ", .pme_code = 0x1c0090, .pme_short_desc = "SRQ flushes", .pme_long_desc = "A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. 
Combined Unit 0 + 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_FLUSH_SRQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_FLUSH_SRQ] }, [ POWER5p_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { .pme_name = "PM_MRK_LSU_SRQ_INST_VALID", .pme_code = 0xc70e6, .pme_short_desc = "Marked instruction valid in SRQ", .pme_long_desc = "This signal is asserted every cycle when a marked request is resident in the Store Request Queue", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LSU_SRQ_INST_VALID], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LSU_SRQ_INST_VALID] }, [ POWER5p_PME_PM_L3SA_REF ] = { .pme_name = "PM_L3SA_REF", .pme_code = 0x701c3, .pme_short_desc = "L3 slice A references", .pme_long_desc = "Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SA_REF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SA_REF] }, [ POWER5p_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL ] = { .pme_name = "PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL", .pme_code = 0x713c2, .pme_short_desc = "L2 slice C RC dispatch attempt failed due to all CO busy", .pme_long_desc = "A Read/Claim dispatch was rejected because all Castout machines were busy.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL] }, [ POWER5p_PME_PM_FPU0_STALL3 ] = { .pme_name = "PM_FPU0_STALL3", .pme_code = 0x20e1, .pme_short_desc = "FPU0 stalled in pipe3", .pme_long_desc = "FPU0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always).", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU0_STALL3], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU0_STALL3] }, [ POWER5p_PME_PM_TB_BIT_TRANS ] = { .pme_name = "PM_TB_BIT_TRANS", .pme_code = 0x100018, .pme_short_desc = "Time Base bit transition", .pme_long_desc = 
"When the selected time base bit (as specified in MMCR0[TBSEL]) transitions from 0 to 1", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_TB_BIT_TRANS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_TB_BIT_TRANS] }, [ POWER5p_PME_PM_GPR_MAP_FULL_CYC ] = { .pme_name = "PM_GPR_MAP_FULL_CYC", .pme_code = 0x130e5, .pme_short_desc = "Cycles GPR mapper full", .pme_long_desc = "The General Purpose Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GPR_MAP_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GPR_MAP_FULL_CYC] }, [ POWER5p_PME_PM_MRK_LSU_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_LRQ", .pme_code = 0x381088, .pme_short_desc = "Marked LRQ flushes", .pme_long_desc = "A marked load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LSU_FLUSH_LRQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LSU_FLUSH_LRQ] }, [ POWER5p_PME_PM_FPU0_STF ] = { .pme_name = "PM_FPU0_STF", .pme_code = 0x20e2, .pme_short_desc = "FPU0 executed store instruction", .pme_long_desc = "FPU0 has executed a Floating Point Store instruction.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU0_STF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU0_STF] }, [ POWER5p_PME_PM_MRK_DTLB_MISS ] = { .pme_name = "PM_MRK_DTLB_MISS", .pme_code = 0xc50c6, .pme_short_desc = "Marked Data TLB misses", .pme_long_desc = "Data TLB references by a marked instruction that missed the TLB (all page sizes).", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DTLB_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DTLB_MISS] }, [ 
POWER5p_PME_PM_FPU1_FMA ] = { .pme_name = "PM_FPU1_FMA", .pme_code = 0xc5, .pme_short_desc = "FPU1 executed multiply-add instruction", .pme_long_desc = "The floating point unit has executed a multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU1_FMA], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU1_FMA] }, [ POWER5p_PME_PM_L2SA_MOD_TAG ] = { .pme_name = "PM_L2SA_MOD_TAG", .pme_code = 0x720e0, .pme_short_desc = "L2 slice A transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SA_MOD_TAG], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SA_MOD_TAG] }, [ POWER5p_PME_PM_LSU1_FLUSH_ULD ] = { .pme_name = "PM_LSU1_FLUSH_ULD", .pme_code = 0xc00c4, .pme_short_desc = "LSU1 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 1 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1).", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU1_FLUSH_ULD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU1_FLUSH_ULD] }, [ POWER5p_PME_PM_MRK_INST_FIN ] = { .pme_name = "PM_MRK_INST_FIN", .pme_code = 0x300005, .pme_short_desc = "Marked instruction finished", .pme_long_desc = "One of the execution units finished a marked instruction. 
Instructions that finish may not necessarily complete", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_INST_FIN], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_INST_FIN] }, [ POWER5p_PME_PM_MRK_LSU0_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU0_FLUSH_UST", .pme_code = 0x810c0, .pme_short_desc = "LSU0 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 0 because it was unaligned", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LSU0_FLUSH_UST], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LSU0_FLUSH_UST] }, [ POWER5p_PME_PM_FPU0_FULL_CYC ] = { .pme_name = "PM_FPU0_FULL_CYC", .pme_code = 0x100c3, .pme_short_desc = "Cycles FPU0 issue queue full", .pme_long_desc = "The issue queue for FPU0 cannot accept any more instructions. Dispatch to this issue queue is stopped.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU0_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU0_FULL_CYC] }, [ POWER5p_PME_PM_LSU_LRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LRQ_S0_ALLOC", .pme_code = 0xc60e7, .pme_short_desc = "LRQ slot 0 allocated", .pme_long_desc = "LRQ slot zero was allocated", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_LRQ_S0_ALLOC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_LRQ_S0_ALLOC] }, [ POWER5p_PME_PM_MRK_LSU1_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU1_FLUSH_ULD", .pme_code = 0x810c4, .pme_short_desc = "LSU1 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LSU1_FLUSH_ULD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LSU1_FLUSH_ULD] }, [ POWER5p_PME_PM_MRK_DTLB_REF ] = { .pme_name = "PM_MRK_DTLB_REF", .pme_code = 0xc60e4, .pme_short_desc = "Marked Data TLB reference", .pme_long_desc = "Total number of Data TLB references by a marked instruction for all page sizes. 
Page size is determined at TLB reload time.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DTLB_REF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DTLB_REF] }, [ POWER5p_PME_PM_BR_UNCOND ] = { .pme_name = "PM_BR_UNCOND", .pme_code = 0x123087, .pme_short_desc = "Unconditional branch", .pme_long_desc = "An unconditional branch was executed.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_BR_UNCOND], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_BR_UNCOND] }, [ POWER5p_PME_PM_THRD_SEL_OVER_L2MISS ] = { .pme_name = "PM_THRD_SEL_OVER_L2MISS", .pme_code = 0x410c3, .pme_short_desc = "Thread selection overrides caused by L2 misses", .pme_long_desc = "Thread selection was overridden because one thread had an L2 miss pending.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_SEL_OVER_L2MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_SEL_OVER_L2MISS] }, [ POWER5p_PME_PM_L2SB_SHR_INV ] = { .pme_name = "PM_L2SB_SHR_INV", .pme_code = 0x710c1, .pme_short_desc = "L2 slice B transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful the tablewalk duration event should also be counted.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SB_SHR_INV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SB_SHR_INV] }, [ POWER5p_PME_PM_MEM_LO_PRIO_WR_CMPL ] = { .pme_name = "PM_MEM_LO_PRIO_WR_CMPL", .pme_code = 0x736e6, .pme_short_desc = "Low priority write completed", .pme_long_desc = "A memory write, which was not upgraded to high priority, completed. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_LO_PRIO_WR_CMPL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_LO_PRIO_WR_CMPL] }, [ POWER5p_PME_PM_MRK_DTLB_MISS_64K ] = { .pme_name = "PM_MRK_DTLB_MISS_64K", .pme_code = 0x2c608d, .pme_short_desc = "Marked Data TLB misses for 64K page", .pme_long_desc = "Data TLB references to 64KB pages by a marked instruction that missed the TLB. Page size is determined at TLB reload time.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DTLB_MISS_64K], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DTLB_MISS_64K] }, [ POWER5p_PME_PM_MRK_ST_MISS_L1 ] = { .pme_name = "PM_MRK_ST_MISS_L1", .pme_code = 0x820e3, .pme_short_desc = "Marked L1 D cache store misses", .pme_long_desc = "A marked store missed the dcache", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_ST_MISS_L1], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_ST_MISS_L1] }, [ POWER5p_PME_PM_L3SC_MOD_TAG ] = { .pme_name = "PM_L3SC_MOD_TAG", .pme_code = 0x720e5, .pme_short_desc = "L3 slice C transition from modified to TAG", .pme_long_desc = "L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. L3 going M->T or M->I(go_Mu case); Mu|Me are not included since they are formed due to a prev read op). 
Tx is not included since it is considered shared at this point.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SC_MOD_TAG], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SC_MOD_TAG] }, [ POWER5p_PME_PM_GRP_DISP_SUCCESS ] = { .pme_name = "PM_GRP_DISP_SUCCESS", .pme_code = 0x300002, .pme_short_desc = "Group dispatch success", .pme_long_desc = "Number of groups successfully dispatched (not rejected)", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GRP_DISP_SUCCESS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GRP_DISP_SUCCESS] }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_1or2_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_1or2_CYC", .pme_code = 0x430e4, .pme_short_desc = "Cycles thread priority difference is 1 or 2", .pme_long_desc = "Cycles when this thread's priority is higher than the other thread's priority by 1 or 2.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_PRIO_DIFF_1or2_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_PRIO_DIFF_1or2_CYC] }, [ POWER5p_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BHT_REDIRECT", .pme_code = 0x230e0, .pme_short_desc = "L2 I cache demand request due to BHT redirect", .pme_long_desc = "A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (CR mispredict).", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_IC_DEMAND_L2_BHT_REDIRECT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_IC_DEMAND_L2_BHT_REDIRECT] }, [ POWER5p_PME_PM_LSU_DERAT_MISS ] = { .pme_name = "PM_LSU_DERAT_MISS", .pme_code = 0x280090, .pme_short_desc = "DERAT misses", .pme_long_desc = "Total D-ERAT Misses. Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction. 
Combined Unit 0 + 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_DERAT_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_DERAT_MISS] }, [ POWER5p_PME_PM_MEM_WQ_DISP_Q8to15 ] = { .pme_name = "PM_MEM_WQ_DISP_Q8to15", .pme_code = 0x733e6, .pme_short_desc = "Memory write queue dispatched to queues 8-15", .pme_long_desc = "A memory operation was dispatched to a write queue in the range between 8 and 15. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_WQ_DISP_Q8to15], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_WQ_DISP_Q8to15] }, [ POWER5p_PME_PM_FPU0_SINGLE ] = { .pme_name = "PM_FPU0_SINGLE", .pme_code = 0x20e3, .pme_short_desc = "FPU0 executed single precision instruction", .pme_long_desc = "FPU0 has executed a single precision instruction.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU0_SINGLE], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU0_SINGLE] }, [ POWER5p_PME_PM_THRD_PRIO_1_CYC ] = { .pme_name = "PM_THRD_PRIO_1_CYC", .pme_code = 0x420e0, .pme_short_desc = "Cycles thread running at priority level 1", .pme_long_desc = "Cycles this thread was running at priority level 1. Priority level 1 is the lowest and indicates the thread is sleeping.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_PRIO_1_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_PRIO_1_CYC] }, [ POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SC_RCST_DISP_FAIL_OTHER", .pme_code = 0x732e2, .pme_short_desc = "L2 slice C RC store dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. 
Rejected dispatches do not count because they have not yet been attempted.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_OTHER], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_OTHER] }, [ POWER5p_PME_PM_SNOOP_RD_RETRY_RQ ] = { .pme_name = "PM_SNOOP_RD_RETRY_RQ", .pme_code = 0x705c6, .pme_short_desc = "Snoop read retry due to collision with active read queue", .pme_long_desc = "A snoop request for a read from memory was retried because it matched the cache line of an active read. The snoop request is retried because the L2 may be able to source data via intervention for the 2nd read faster than the MC. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_SNOOP_RD_RETRY_RQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_SNOOP_RD_RETRY_RQ] }, [ POWER5p_PME_PM_FAB_HOLDtoVN_EMPTY ] = { .pme_name = "PM_FAB_HOLDtoVN_EMPTY", .pme_code = 0x721e7, .pme_short_desc = "Hold buffer to VN empty", .pme_long_desc = "Fabric cycles when the Vertical Node out hold-buffers are empty. The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FAB_HOLDtoVN_EMPTY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FAB_HOLDtoVN_EMPTY] }, [ POWER5p_PME_PM_FPU1_FEST ] = { .pme_name = "PM_FPU1_FEST", .pme_code = 0x10c6, .pme_short_desc = "FPU1 executed FEST instruction", .pme_long_desc = "FPU1 has executed an estimate instruction. 
This could be fres* or frsqrte* where XYZ* means XYZ or XYZ.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU1_FEST], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU1_FEST] }, [ POWER5p_PME_PM_SNOOP_DCLAIM_RETRY_QFULL ] = { .pme_name = "PM_SNOOP_DCLAIM_RETRY_QFULL", .pme_code = 0x720e6, .pme_short_desc = "Snoop dclaim/flush retry due to write/dclaim queues full", .pme_long_desc = "A snoop request for a dclaim or flush was retried by the memory controller because all write/dclaim queues were full. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_SNOOP_DCLAIM_RETRY_QFULL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_SNOOP_DCLAIM_RETRY_QFULL] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L25_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L25_SHR_CYC", .pme_code = 0x2c70a2, .pme_short_desc = "Marked load latency from L2.5 shared", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L25_SHR_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L25_SHR_CYC] }, [ POWER5p_PME_PM_MRK_ST_CMPL_INT ] = { .pme_name = "PM_MRK_ST_CMPL_INT", .pme_code = 0x300003, .pme_short_desc = "Marked store completed with intervention", .pme_long_desc = "A marked store previously sent to the memory subsystem completed (data home) after requiring intervention", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_ST_CMPL_INT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_ST_CMPL_INT] }, [ POWER5p_PME_PM_FLUSH_BR_MPRED ] = { .pme_name = "PM_FLUSH_BR_MPRED", .pme_code = 0x110c6, .pme_short_desc = "Flush caused by branch mispredict", .pme_long_desc = "A flush was caused by a branch mispredict.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FLUSH_BR_MPRED], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FLUSH_BR_MPRED] }, [ POWER5p_PME_PM_MRK_DTLB_MISS_16G ] = { .pme_name = "PM_MRK_DTLB_MISS_16G", .pme_code = 0x4c608d, .pme_short_desc = "Marked Data TLB misses for 16G page", .pme_long_desc = "Data TLB references to 16GB pages by a marked instruction that missed the TLB. Page size is determined at TLB reload time.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DTLB_MISS_16G], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DTLB_MISS_16G] }, [ POWER5p_PME_PM_FPU_STF ] = { .pme_name = "PM_FPU_STF", .pme_code = 0x202090, .pme_short_desc = "FPU executed store instruction", .pme_long_desc = "FPU has executed a store instruction. 
Combined Unit 0 + Unit 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU_STF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU_STF] }, [ POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SB_RCLD_DISP_FAIL_ADDR", .pme_code = 0x711c1, .pme_short_desc = "L2 slice B RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_ADDR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_ADDR] }, [ POWER5p_PME_PM_CMPLU_STALL_FPU ] = { .pme_name = "PM_CMPLU_STALL_FPU", .pme_code = 0x411098, .pme_short_desc = "Completion stall caused by FPU instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a floating point instruction.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_CMPLU_STALL_FPU], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_CMPLU_STALL_FPU] }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus1or2_CYC", .pme_code = 0x430e2, .pme_short_desc = "Cycles thread priority difference is -1 or -2", .pme_long_desc = "Cycles when this thread's priority is lower than the other thread's priority by 1 or 2.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC] }, [ POWER5p_PME_PM_GCT_NOSLOT_CYC ] = { .pme_name = "PM_GCT_NOSLOT_CYC", .pme_code = 0x100004, .pme_short_desc = "Cycles no GCT slot allocated", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GCT_NOSLOT_CYC], .pme_group_vector = 
power5p_group_vecs[POWER5p_PME_PM_GCT_NOSLOT_CYC] }, [ POWER5p_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { .pme_name = "PM_FXU0_BUSY_FXU1_IDLE", .pme_code = 0x300012, .pme_short_desc = "FXU0 busy FXU1 idle", .pme_long_desc = "FXU0 was busy while FXU1 was idle", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FXU0_BUSY_FXU1_IDLE], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FXU0_BUSY_FXU1_IDLE] }, [ POWER5p_PME_PM_PTEG_FROM_L35_SHR ] = { .pme_name = "PM_PTEG_FROM_L35_SHR", .pme_code = 0x18309e, .pme_short_desc = "PTEG loaded from L3.5 shared", .pme_long_desc = "A Page Table Entry was loaded into the TLB with shared (S) data from the L3 of a chip on the same module as this processor is located, due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PTEG_FROM_L35_SHR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PTEG_FROM_L35_SHR] }, [ POWER5p_PME_PM_MRK_DTLB_REF_16G ] = { .pme_name = "PM_MRK_DTLB_REF_16G", .pme_code = 0x4c6086, .pme_short_desc = "Marked Data TLB reference for 16G page", .pme_long_desc = "Data TLB references by a marked instruction for 16GB pages.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DTLB_REF_16G], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DTLB_REF_16G] }, [ POWER5p_PME_PM_MRK_LSU_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU_FLUSH_UST", .pme_code = 0x2810a8, .pme_short_desc = "Marked unaligned store flushes", .pme_long_desc = "A marked store was flushed because it was unaligned", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LSU_FLUSH_UST], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LSU_FLUSH_UST] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L25_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L25_SHR", .pme_code = 0x1c7097, .pme_short_desc = "Marked data loaded from L2.5 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T or SL) data from the L2 of a chip on the same module as this processor is located due to a marked load.", .pme_event_ids = 
power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L25_SHR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L25_SHR] }, [ POWER5p_PME_PM_L3SA_HIT ] = { .pme_name = "PM_L3SA_HIT", .pme_code = 0x711c3, .pme_short_desc = "L3 slice A hits", .pme_long_desc = "Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SA_HIT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SA_HIT] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L35_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L35_SHR", .pme_code = 0x1c709e, .pme_short_desc = "Marked data loaded from L3.5 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on the same module as this processor is located due to a marked load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L35_SHR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L35_SHR] }, [ POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SB_RCST_DISP_FAIL_ADDR", .pme_code = 0x712c1, .pme_short_desc = "L2 slice B RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_ADDR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_ADDR] }, [ POWER5p_PME_PM_IERAT_XLATE_WR ] = { .pme_name = "PM_IERAT_XLATE_WR", .pme_code = 0x220e7, .pme_short_desc = "Translation written to ierat", .pme_long_desc = "An entry was written into the IERAT as a result of an IERAT miss. This event can be used to count IERAT misses. 
An ERAT miss that is later ignored will not be counted unless the ERAT is written before the instruction stream is changed.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_IERAT_XLATE_WR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_IERAT_XLATE_WR] }, [ POWER5p_PME_PM_L2SA_ST_REQ ] = { .pme_name = "PM_L2SA_ST_REQ", .pme_code = 0x723e0, .pme_short_desc = "L2 slice A store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SA_ST_REQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SA_ST_REQ] }, [ POWER5p_PME_PM_INST_FROM_LMEM ] = { .pme_name = "PM_INST_FROM_LMEM", .pme_code = 0x222086, .pme_short_desc = "Instruction fetched from local memory", .pme_long_desc = "An instruction fetch group was fetched from memory attached to the same module this processor is located on. 
Fetch groups can contain up to 8 instructions", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_FROM_LMEM], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_FROM_LMEM] }, [ POWER5p_PME_PM_THRD_SEL_T1 ] = { .pme_name = "PM_THRD_SEL_T1", .pme_code = 0x410c1, .pme_short_desc = "Decode selected thread 1", .pme_long_desc = "Thread selection picked thread 1 for decode.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_SEL_T1], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_SEL_T1] }, [ POWER5p_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BR_REDIRECT", .pme_code = 0x230e1, .pme_short_desc = "L2 I cache demand request due to branch redirect", .pme_long_desc = "A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (either ALL mispredicted or Target).", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_IC_DEMAND_L2_BR_REDIRECT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_IC_DEMAND_L2_BR_REDIRECT] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L35_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L35_SHR_CYC", .pme_code = 0x2c70a6, .pme_short_desc = "Marked load latency from L3.5 shared", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L35_SHR_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L35_SHR_CYC] }, [ POWER5p_PME_PM_FPU0_1FLOP ] = { .pme_name = "PM_FPU0_1FLOP", .pme_code = 0xc3, .pme_short_desc = "FPU0 executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "FPU0 has executed an add, mult, sub, cmp or sel instruction.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU0_1FLOP], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU0_1FLOP] }, [ POWER5p_PME_PM_PTEG_FROM_L2 ] = { .pme_name = "PM_PTEG_FROM_L2", .pme_code = 0x183087, .pme_short_desc = "PTEG loaded from L2", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local L2 due to a demand load", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PTEG_FROM_L2], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PTEG_FROM_L2] }, [ POWER5p_PME_PM_MEM_PW_CMPL ] = { .pme_name = "PM_MEM_PW_CMPL", .pme_code = 0x724e6, .pme_short_desc = "Memory partial-write completed", .pme_long_desc = "Number of Partial Writes completed. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_PW_CMPL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_PW_CMPL] }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus5or6_CYC", .pme_code = 0x430e0, .pme_short_desc = "Cycles thread priority difference is -5 or -6", .pme_long_desc = "Cycles when this thread's priority is lower than the other thread's priority by 5 or 6.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC] }, [ POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SB_RCLD_DISP_FAIL_OTHER", .pme_code = 0x731e1, .pme_short_desc = "L2 slice B RC load dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_OTHER], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_OTHER] }, [ POWER5p_PME_PM_MRK_DTLB_MISS_4K ] = { .pme_name = "PM_MRK_DTLB_MISS_4K", .pme_code = 0x1c608d, .pme_short_desc = "Marked Data TLB misses for 4K page", .pme_long_desc = "Data TLB references to 4KB pages by a marked instruction that missed the TLB. Page size is determined at TLB reload time.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DTLB_MISS_4K], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DTLB_MISS_4K] }, [ POWER5p_PME_PM_FPU0_FIN ] = { .pme_name = "PM_FPU0_FIN", .pme_code = 0x10c3, .pme_short_desc = "FPU0 produced a result", .pme_long_desc = "FPU0 finished, produced a result. This only indicates finish, not completion. 
Floating Point Stores are included in this count but not Floating Point Loads.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU0_FIN], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU0_FIN] }, [ POWER5p_PME_PM_L3SC_SHR_INV ] = { .pme_name = "PM_L3SC_SHR_INV", .pme_code = 0x710c5, .pme_short_desc = "L3 slice C transition from shared to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched).", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SC_SHR_INV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SC_SHR_INV] }, [ POWER5p_PME_PM_GRP_BR_REDIR ] = { .pme_name = "PM_GRP_BR_REDIR", .pme_code = 0x120e6, .pme_short_desc = "Group experienced branch redirect", .pme_long_desc = "Number of groups, counted at dispatch, that have encountered a branch redirect. Every group constructed from a fetch group that has been redirected will count.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GRP_BR_REDIR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GRP_BR_REDIR] }, [ POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SC_RCLD_DISP_FAIL_RC_FULL", .pme_code = 0x721e2, .pme_short_desc = "L2 slice C RC load dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a load failed because all RC machines are busy.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_RC_FULL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_RC_FULL] }, [ POWER5p_PME_PM_MRK_LSU_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_SRQ", .pme_code = 0x481088, .pme_short_desc = "Marked SRQ lhs flushes", .pme_long_desc = "A marked store was flushed because a younger load hit an older store that is already in the SRQ or in the same group.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LSU_FLUSH_SRQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LSU_FLUSH_SRQ] }, [ 
POWER5p_PME_PM_PTEG_FROM_L275_SHR ] = { .pme_name = "PM_PTEG_FROM_L275_SHR", .pme_code = 0x383097, .pme_short_desc = "PTEG loaded from L2.75 shared", .pme_long_desc = "A Page Table Entry was loaded into the TLB with shared (T) data from the L2 on a different module than this processor is located due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PTEG_FROM_L275_SHR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PTEG_FROM_L275_SHR] }, [ POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SB_RCLD_DISP_FAIL_RC_FULL", .pme_code = 0x721e1, .pme_short_desc = "L2 slice B RC load dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a load failed because all RC machines are busy.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_RC_FULL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_RC_FULL] }, [ POWER5p_PME_PM_SNOOP_RD_RETRY_WQ ] = { .pme_name = "PM_SNOOP_RD_RETRY_WQ", .pme_code = 0x715c6, .pme_short_desc = "Snoop read retry due to collision with active write queue", .pme_long_desc = "A snoop request for a read from memory was retried because it matched the cache line of an active write. The snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_SNOOP_RD_RETRY_WQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_SNOOP_RD_RETRY_WQ] }, [ POWER5p_PME_PM_FAB_DCLAIM_RETRIED ] = { .pme_name = "PM_FAB_DCLAIM_RETRIED", .pme_code = 0x730e7, .pme_short_desc = "dclaim retried", .pme_long_desc = "A DCLAIM command was retried. Each chip reports its own counts. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FAB_DCLAIM_RETRIED], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FAB_DCLAIM_RETRIED] }, [ POWER5p_PME_PM_LSU0_NCLD ] = { .pme_name = "PM_LSU0_NCLD", .pme_code = 0xc50c1, .pme_short_desc = "LSU0 non-cacheable loads", .pme_long_desc = "A non-cacheable load was executed by unit 0.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU0_NCLD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU0_NCLD] }, [ POWER5p_PME_PM_LSU1_BUSY_REJECT ] = { .pme_name = "PM_LSU1_BUSY_REJECT", .pme_code = 0xc20e5, .pme_short_desc = "LSU1 busy due to reject", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU1_BUSY_REJECT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU1_BUSY_REJECT] }, [ POWER5p_PME_PM_FXLS0_FULL_CYC ] = { .pme_name = "PM_FXLS0_FULL_CYC", .pme_code = 0x110c0, .pme_short_desc = "Cycles FXU0/LS0 queue full", .pme_long_desc = "The issue queue that feeds the Fixed Point unit 0 / Load Store Unit 0 is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FXLS0_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FXLS0_FULL_CYC] }, [ POWER5p_PME_PM_DTLB_REF_16M ] = { .pme_name = "PM_DTLB_REF_16M", .pme_code = 0x3c2086, .pme_short_desc = "Data TLB reference for 16M page", .pme_long_desc = "Data TLB references for 16MB pages. 
Includes hits + misses.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DTLB_REF_16M], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DTLB_REF_16M] }, [ POWER5p_PME_PM_FPU0_FEST ] = { .pme_name = "PM_FPU0_FEST", .pme_code = 0x10c2, .pme_short_desc = "FPU0 executed FEST instruction", .pme_long_desc = "FPU0 has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU0_FEST], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU0_FEST] }, [ POWER5p_PME_PM_GCT_USAGE_60to79_CYC ] = { .pme_name = "PM_GCT_USAGE_60to79_CYC", .pme_code = 0x20001f, .pme_short_desc = "Cycles GCT 60-79% full", .pme_long_desc = "Cycles when the Global Completion Table has between 60% and 79% of its slots used. The GCT has 20 entries shared between threads.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GCT_USAGE_60to79_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GCT_USAGE_60to79_CYC] }, [ POWER5p_PME_PM_DATA_FROM_L25_MOD ] = { .pme_name = "PM_DATA_FROM_L25_MOD", .pme_code = 0x2c3097, .pme_short_desc = "Data loaded from L2.5 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L2 of a chip on the same module as this processor is located due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DATA_FROM_L25_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DATA_FROM_L25_MOD] }, [ POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SC_RCLD_DISP_FAIL_ADDR", .pme_code = 0x711c2, .pme_short_desc = "L2 slice C RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a load failed because of an address conflict. 
Two RC machines will never both work on the same line or line in the same congruence class at the same time.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_ADDR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_ADDR] }, [ POWER5p_PME_PM_LSU0_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU0_REJECT_ERAT_MISS", .pme_code = 0xc40c3, .pme_short_desc = "LSU0 reject due to ERAT miss", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions due to an ERAT miss. Requests that miss the Derat are rejected and retried until the request hits in the Erat.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU0_REJECT_ERAT_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU0_REJECT_ERAT_MISS] }, [ POWER5p_PME_PM_DATA_FROM_L375_MOD ] = { .pme_name = "PM_DATA_FROM_L375_MOD", .pme_code = 0x1c30a7, .pme_short_desc = "Data loaded from L3.75 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on the same module as this processor is located due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DATA_FROM_L375_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DATA_FROM_L375_MOD] }, [ POWER5p_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", .pme_code = 0x200015, .pme_short_desc = "Cycles LMQ and SRQ empty", .pme_long_desc = "Cycles when both the LMQ and SRQ are empty (LSU is idle)", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC] }, [ POWER5p_PME_PM_DTLB_MISS_64K ] = { .pme_name = "PM_DTLB_MISS_64K", .pme_code = 0x2c208d, .pme_short_desc = "Data TLB miss for 64K page", .pme_long_desc = "Data TLB references to 64KB pages that missed the TLB. 
Page size is determined at TLB reload time.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DTLB_MISS_64K], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DTLB_MISS_64K] }, [ POWER5p_PME_PM_LSU0_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU0_REJECT_RELOAD_CDF", .pme_code = 0xc40c2, .pme_short_desc = "LSU0 reject due to reload CDF or tag update collision", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same system when they are being updated.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU0_REJECT_RELOAD_CDF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU0_REJECT_RELOAD_CDF] }, [ POWER5p_PME_PM_0INST_FETCH ] = { .pme_name = "PM_0INST_FETCH", .pme_code = 0x42208d, .pme_short_desc = "No instructions fetched", .pme_long_desc = "No instructions were fetched this cycle (due to IFU hold, redirect, or icache miss)", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_0INST_FETCH], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_0INST_FETCH] }, [ POWER5p_PME_PM_LSU1_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU1_REJECT_RELOAD_CDF", .pme_code = 0xc40c6, .pme_short_desc = "LSU1 reject due to reload CDF or tag update collision", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. 
Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same system when they are being updated.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU1_REJECT_RELOAD_CDF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU1_REJECT_RELOAD_CDF] }, [ POWER5p_PME_PM_MEM_WQ_DISP_Q0to7 ] = { .pme_name = "PM_MEM_WQ_DISP_Q0to7", .pme_code = 0x723e6, .pme_short_desc = "Memory write queue dispatched to queues 0-7", .pme_long_desc = "A memory operation was dispatched to a write queue in the range between 0 and 7. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_WQ_DISP_Q0to7], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_WQ_DISP_Q0to7] }, [ POWER5p_PME_PM_L1_PREF ] = { .pme_name = "PM_L1_PREF", .pme_code = 0xc70e7, .pme_short_desc = "L1 cache data prefetches", .pme_long_desc = "A request to prefetch data into the L1 was made", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L1_PREF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L1_PREF] }, [ POWER5p_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM_CYC", .pme_code = 0x4c70a0, .pme_short_desc = "Marked load latency from local memory", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_LMEM_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_LMEM_CYC] }, [ POWER5p_PME_PM_BRQ_FULL_CYC ] = { .pme_name = "PM_BRQ_FULL_CYC", .pme_code = 0x100c5, .pme_short_desc = "Cycles branch queue full", .pme_long_desc = "Cycles when the issue queue that feeds the branch unit is full. 
This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_BRQ_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_BRQ_FULL_CYC] }, [ POWER5p_PME_PM_GRP_IC_MISS_NONSPEC ] = { .pme_name = "PM_GRP_IC_MISS_NONSPEC", .pme_code = 0x112099, .pme_short_desc = "Group experienced non-speculative I cache miss", .pme_long_desc = "Number of groups, counted at completion, that have encountered an instruction cache miss.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GRP_IC_MISS_NONSPEC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GRP_IC_MISS_NONSPEC] }, [ POWER5p_PME_PM_PTEG_FROM_L275_MOD ] = { .pme_name = "PM_PTEG_FROM_L275_MOD", .pme_code = 0x1830a3, .pme_short_desc = "PTEG loaded from L2.75 modified", .pme_long_desc = "A Page Table Entry was loaded into the TLB with modified (M) data from the L2 on a different module than this processor is located due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PTEG_FROM_L275_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PTEG_FROM_L275_MOD] }, [ POWER5p_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU0", .pme_code = 0x820e0, .pme_short_desc = "LSU0 marked L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache, by LSU0.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LD_MISS_L1_LSU0], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LD_MISS_L1_LSU0] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L375_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L375_SHR_CYC", .pme_code = 0x2c70a7, .pme_short_desc = "Marked load latency from L3.75 shared", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L375_SHR_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L375_SHR_CYC] }, [ POWER5p_PME_PM_DATA_FROM_L3 ] = { .pme_name = "PM_DATA_FROM_L3", .pme_code = 0x1c308e, .pme_short_desc = "Data loaded from L3", .pme_long_desc = "The processor's Data Cache was reloaded from the local L3 due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DATA_FROM_L3], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DATA_FROM_L3] }, [ POWER5p_PME_PM_INST_FROM_L2 ] = { .pme_name = "PM_INST_FROM_L2", .pme_code = 0x122086, .pme_short_desc = "Instruction fetched from L2", .pme_long_desc = "An instruction fetch group was fetched from L2. Fetch Groups can contain up to 8 instructions", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_FROM_L2], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_FROM_L2] }, [ POWER5p_PME_PM_LSU_FLUSH ] = { .pme_name = "PM_LSU_FLUSH", .pme_code = 0x110c5, .pme_short_desc = "Flush initiated by LSU", .pme_long_desc = "A flush was initiated by the Load Store Unit", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_FLUSH], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_FLUSH] }, [ POWER5p_PME_PM_PMC2_OVERFLOW ] = { .pme_name = "PM_PMC2_OVERFLOW", .pme_code = 0x30000a, .pme_short_desc = "PMC2 Overflow", .pme_long_desc = "Overflows from PMC2 are counted. This effectively widens the PMC. 
The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PMC2_OVERFLOW], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PMC2_OVERFLOW] }, [ POWER5p_PME_PM_FPU0_DENORM ] = { .pme_name = "PM_FPU0_DENORM", .pme_code = 0x20e0, .pme_short_desc = "FPU0 received denormalized data", .pme_long_desc = "FPU0 has encountered a denormalized operand.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU0_DENORM], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU0_DENORM] }, [ POWER5p_PME_PM_FPU1_FMOV_FEST ] = { .pme_name = "PM_FPU1_FMOV_FEST", .pme_code = 0x10c4, .pme_short_desc = "FPU1 executed FMOV or FEST instructions", .pme_long_desc = "FPU1 has executed a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU1_FMOV_FEST], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU1_FMOV_FEST] }, [ POWER5p_PME_PM_INST_FETCH_CYC ] = { .pme_name = "PM_INST_FETCH_CYC", .pme_code = 0x220e4, .pme_short_desc = "Cycles at least 1 instruction fetched", .pme_long_desc = "Cycles when at least one instruction was sent from the fetch unit to the decode unit.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_FETCH_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_FETCH_CYC] }, [ POWER5p_PME_PM_INST_DISP ] = { .pme_name = "PM_INST_DISP", .pme_code = 0x300009, .pme_short_desc = "Instructions dispatched", .pme_long_desc = "Number of PowerPC instructions successfully dispatched.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_DISP], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_DISP] }, [ POWER5p_PME_PM_LSU_LDF ] = { .pme_name = "PM_LSU_LDF", .pme_code = 0x1c50a8, .pme_short_desc = "LSU executed Floating Point load instruction", .pme_long_desc = "LSU executed 
Floating Point load instruction. Combined Unit 0 + 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_LDF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_LDF] }, [ POWER5p_PME_PM_DATA_FROM_L25_SHR ] = { .pme_name = "PM_DATA_FROM_L25_SHR", .pme_code = 0x1c3097, .pme_short_desc = "Data loaded from L2.5 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T or SL) data from the L2 of a chip on the same module as this processor is located due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DATA_FROM_L25_SHR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DATA_FROM_L25_SHR] }, [ POWER5p_PME_PM_L1_DCACHE_RELOAD_VALID ] = { .pme_name = "PM_L1_DCACHE_RELOAD_VALID", .pme_code = 0xc30e4, .pme_short_desc = "L1 reload data source valid", .pme_long_desc = "The data source information is valid, the data cache has been reloaded. Prior to POWER5+ this included data cache reloads due to prefetch activity. With POWER5+ this now only includes reloads due to demand loads.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L1_DCACHE_RELOAD_VALID], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L1_DCACHE_RELOAD_VALID] }, [ POWER5p_PME_PM_MEM_WQ_DISP_DCLAIM ] = { .pme_name = "PM_MEM_WQ_DISP_DCLAIM", .pme_code = 0x713c6, .pme_short_desc = "Memory write queue dispatched due to dclaim/flush", .pme_long_desc = "A memory dclaim or flush operation was dispatched to a write queue. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_WQ_DISP_DCLAIM], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_WQ_DISP_DCLAIM] }, [ POWER5p_PME_PM_MRK_GRP_ISSUED ] = { .pme_name = "PM_MRK_GRP_ISSUED", .pme_code = 0x100015, .pme_short_desc = "Marked group issued", .pme_long_desc = "A sampled instruction was issued.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_GRP_ISSUED], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_GRP_ISSUED] }, [ POWER5p_PME_PM_FPU_FULL_CYC ] = { .pme_name = "PM_FPU_FULL_CYC", .pme_code = 0x110090, .pme_short_desc = "Cycles FPU issue queue full", .pme_long_desc = "Cycles when one or both FPU issue queues are full. Combined Unit 0 + 1. Use with caution since this is the sum of cycles when Unit 0 was full plus Unit 1 full. It does not indicate when both units were full.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU_FULL_CYC] }, [ POWER5p_PME_PM_INST_FROM_L35_MOD ] = { .pme_name = "PM_INST_FROM_L35_MOD", .pme_code = 0x22209d, .pme_short_desc = "Instruction fetched from L3.5 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from the L3 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_FROM_L35_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_FROM_L35_MOD] }, [ POWER5p_PME_PM_FPU_FMA ] = { .pme_name = "PM_FPU_FMA", .pme_code = 0x200088, .pme_short_desc = "FPU executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when FPU is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
Combined Unit 0 + Unit 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU_FMA], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU_FMA] }, [ POWER5p_PME_PM_THRD_PRIO_3_CYC ] = { .pme_name = "PM_THRD_PRIO_3_CYC", .pme_code = 0x420e2, .pme_short_desc = "Cycles thread running at priority level 3", .pme_long_desc = "Cycles this thread was running at priority level 3.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_PRIO_3_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_PRIO_3_CYC] }, [ POWER5p_PME_PM_MRK_CRU_FIN ] = { .pme_name = "PM_MRK_CRU_FIN", .pme_code = 0x400005, .pme_short_desc = "Marked instruction CRU processing finished", .pme_long_desc = "The Condition Register Unit finished a marked instruction. Instructions that finish may not necessarily complete.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_CRU_FIN], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_CRU_FIN] }, [ POWER5p_PME_PM_SNOOP_WR_RETRY_WQ ] = { .pme_name = "PM_SNOOP_WR_RETRY_WQ", .pme_code = 0x716c6, .pme_short_desc = "Snoop write/dclaim retry due to collision with active write queue", .pme_long_desc = "A snoop request for a write or dclaim to memory was retried because it matched the cache line of an active write. The snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_SNOOP_WR_RETRY_WQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_SNOOP_WR_RETRY_WQ] }, [ POWER5p_PME_PM_CMPLU_STALL_REJECT ] = { .pme_name = "PM_CMPLU_STALL_REJECT", .pme_code = 0x41109a, .pme_short_desc = "Completion stall caused by reject", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered a load/store reject. 
This is a subset of PM_CMPLU_STALL_LSU.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_CMPLU_STALL_REJECT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_CMPLU_STALL_REJECT] }, [ POWER5p_PME_PM_MRK_FXU_FIN ] = { .pme_name = "PM_MRK_FXU_FIN", .pme_code = 0x200014, .pme_short_desc = "Marked instruction FXU processing finished", .pme_long_desc = "One of the Fixed Point Units finished a marked instruction. Instructions that finish may not necessarily complete.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_FXU_FIN], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_FXU_FIN] }, [ POWER5p_PME_PM_LSU1_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU1_REJECT_ERAT_MISS", .pme_code = 0xc40c7, .pme_short_desc = "LSU1 reject due to ERAT miss", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions due to an ERAT miss. Requests that miss the Derat are rejected and retried until the request hits in the Erat.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU1_REJECT_ERAT_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU1_REJECT_ERAT_MISS] }, [ POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SB_RCST_DISP_FAIL_OTHER", .pme_code = 0x732e1, .pme_short_desc = "L2 slice B RC store dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_OTHER], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_OTHER] }, [ POWER5p_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY ] = { .pme_name = "PM_L2SC_RC_DISP_FAIL_CO_BUSY", .pme_code = 0x703c2, .pme_short_desc = "L2 slice C RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy", .pme_long_desc = "A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. 
In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss(re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY] }, [ POWER5p_PME_PM_PMC4_OVERFLOW ] = { .pme_name = "PM_PMC4_OVERFLOW", .pme_code = 0x10000a, .pme_short_desc = "PMC4 Overflow", .pme_long_desc = "Overflows from PMC4 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PMC4_OVERFLOW], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PMC4_OVERFLOW] }, [ POWER5p_PME_PM_L3SA_SNOOP_RETRY ] = { .pme_name = "PM_L3SA_SNOOP_RETRY", .pme_code = 0x731e3, .pme_short_desc = "L3 slice A snoop retries", .pme_long_desc = "Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b)", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SA_SNOOP_RETRY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SA_SNOOP_RETRY] }, [ POWER5p_PME_PM_PTEG_FROM_L35_MOD ] = { .pme_name = "PM_PTEG_FROM_L35_MOD", .pme_code = 0x28309e, .pme_short_desc = "PTEG loaded from L3.5 modified", .pme_long_desc = "A Page Table Entry was loaded into the TLB with modified (M) data from the L3 of a chip on the same module as this processor is located, due to a demand load.", .pme_event_ids = 
power5p_event_ids[POWER5p_PME_PM_PTEG_FROM_L35_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PTEG_FROM_L35_MOD] }, [ POWER5p_PME_PM_INST_FROM_L25_MOD ] = { .pme_name = "PM_INST_FROM_L25_MOD", .pme_code = 0x222096, .pme_short_desc = "Instruction fetched from L2.5 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from the L2 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_FROM_L25_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_FROM_L25_MOD] }, [ POWER5p_PME_PM_THRD_SMT_HANG ] = { .pme_name = "PM_THRD_SMT_HANG", .pme_code = 0x330e7, .pme_short_desc = "SMT hang detected", .pme_long_desc = "A hung thread was detected", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_SMT_HANG], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_SMT_HANG] }, [ POWER5p_PME_PM_CMPLU_STALL_ERAT_MISS ] = { .pme_name = "PM_CMPLU_STALL_ERAT_MISS", .pme_code = 0x41109b, .pme_short_desc = "Completion stall caused by ERAT miss", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered an ERAT miss. This is a subset of PM_CMPLU_STALL_REJECT.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_CMPLU_STALL_ERAT_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_CMPLU_STALL_ERAT_MISS] }, [ POWER5p_PME_PM_L3SA_MOD_TAG ] = { .pme_name = "PM_L3SA_MOD_TAG", .pme_code = 0x720e3, .pme_short_desc = "L3 slice A transition from modified to TAG", .pme_long_desc = "L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. L3 going M->T or M->I(go_Mu case) Mu|Me are not included since they are formed due to a prev read op). 
Tx is not included since it is considered shared at this point.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SA_MOD_TAG], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SA_MOD_TAG] }, [ POWER5p_PME_PM_INST_FROM_L2MISS ] = { .pme_name = "PM_INST_FROM_L2MISS", .pme_code = 0x12209b, .pme_short_desc = "Instruction fetched missed L2", .pme_long_desc = "An instruction fetch group was fetched from beyond the local L2.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_FROM_L2MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_FROM_L2MISS] }, [ POWER5p_PME_PM_FLUSH_SYNC ] = { .pme_name = "PM_FLUSH_SYNC", .pme_code = 0x330e1, .pme_short_desc = "Flush caused by sync", .pme_long_desc = "This thread has been flushed at dispatch due to a sync, lwsync, ptesync, or tlbsync instruction. This allows the other thread to have more machine resources for it to make progress until the sync finishes.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FLUSH_SYNC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FLUSH_SYNC] }, [ POWER5p_PME_PM_MRK_GRP_DISP ] = { .pme_name = "PM_MRK_GRP_DISP", .pme_code = 0x100002, .pme_short_desc = "Marked group dispatched", .pme_long_desc = "A group containing a sampled instruction was dispatched", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_GRP_DISP], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_GRP_DISP] }, [ POWER5p_PME_PM_MEM_RQ_DISP_Q8to11 ] = { .pme_name = "PM_MEM_RQ_DISP_Q8to11", .pme_code = 0x722e6, .pme_short_desc = "Memory read queue dispatched to queues 8-11", .pme_long_desc = "A memory operation was dispatched to read queue 8,9,10 or 11. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_RQ_DISP_Q8to11], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_RQ_DISP_Q8to11] }, [ POWER5p_PME_PM_L2SC_ST_HIT ] = { .pme_name = "PM_L2SC_ST_HIT", .pme_code = 0x733e2, .pme_short_desc = "L2 slice C store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SC_ST_HIT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SC_ST_HIT] }, [ POWER5p_PME_PM_L2SB_MOD_TAG ] = { .pme_name = "PM_L2SB_MOD_TAG", .pme_code = 0x720e1, .pme_short_desc = "L2 slice B transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SB_MOD_TAG], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SB_MOD_TAG] }, [ POWER5p_PME_PM_CLB_EMPTY_CYC ] = { .pme_name = "PM_CLB_EMPTY_CYC", .pme_code = 0x410c6, .pme_short_desc = "Cycles CLB empty", .pme_long_desc = "Cycles when both thread's CLB is completely empty.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_CLB_EMPTY_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_CLB_EMPTY_CYC] }, [ POWER5p_PME_PM_L2SB_ST_HIT ] = { .pme_name = "PM_L2SB_ST_HIT", .pme_code = 0x733e1, .pme_short_desc = "L2 slice B store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. 
This event is provided on each of the three L2 slices A, B and C.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SB_ST_HIT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SB_ST_HIT] }, [ POWER5p_PME_PM_MEM_NONSPEC_RD_CANCEL ] = { .pme_name = "PM_MEM_NONSPEC_RD_CANCEL", .pme_code = 0x711c6, .pme_short_desc = "Non speculative memory read cancelled", .pme_long_desc = "A non-speculative read was cancelled because the combined response indicated it was sourced from another L2 or L3. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_NONSPEC_RD_CANCEL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_NONSPEC_RD_CANCEL] }, [ POWER5p_PME_PM_BR_PRED_CR_TA ] = { .pme_name = "PM_BR_PRED_CR_TA", .pme_code = 0x423087, .pme_short_desc = "A conditional branch was predicted", .pme_long_desc = " CR and target prediction", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_BR_PRED_CR_TA], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_BR_PRED_CR_TA] }, [ POWER5p_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_SRQ", .pme_code = 0x810c3, .pme_short_desc = "LSU0 marked SRQ lhs flushes", .pme_long_desc = "A marked store was flushed because a younger load hit an older store that is already in the SRQ or in the same group.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LSU0_FLUSH_SRQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LSU0_FLUSH_SRQ] }, [ POWER5p_PME_PM_MRK_LSU_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU_FLUSH_ULD", .pme_code = 0x1810a8, .pme_short_desc = "Marked unaligned load flushes", .pme_long_desc = "A marked load was flushed because it was unaligned (crossed a 64-byte boundary, or a 32-byte boundary if it missed the L1).", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LSU_FLUSH_ULD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LSU_FLUSH_ULD] }, [ POWER5p_PME_PM_INST_DISP_ATTEMPT ] = { .pme_name = 
"PM_INST_DISP_ATTEMPT", .pme_code = 0x120e1, .pme_short_desc = "Instructions dispatch attempted", .pme_long_desc = "Number of PowerPC Instructions dispatched (attempted, not filtered by success).", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_DISP_ATTEMPT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_DISP_ATTEMPT] }, [ POWER5p_PME_PM_INST_FROM_RMEM ] = { .pme_name = "PM_INST_FROM_RMEM", .pme_code = 0x422086, .pme_short_desc = "Instruction fetched from remote memory", .pme_long_desc = "An instruction fetch group was fetched from memory attached to a different module than this processor is located on. Fetch groups can contain up to 8 instructions", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_FROM_RMEM], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_FROM_RMEM] }, [ POWER5p_PME_PM_ST_REF_L1_LSU0 ] = { .pme_name = "PM_ST_REF_L1_LSU0", .pme_code = 0xc10c1, .pme_short_desc = "LSU0 L1 D cache store references", .pme_long_desc = "Store references to the Data Cache by LSU0.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_ST_REF_L1_LSU0], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_ST_REF_L1_LSU0] }, [ POWER5p_PME_PM_LSU0_DERAT_MISS ] = { .pme_name = "PM_LSU0_DERAT_MISS", .pme_code = 0x800c2, .pme_short_desc = "LSU0 DERAT misses", .pme_long_desc = "Total D-ERAT Misses by LSU0. Requests that miss the DERAT are rejected and retried until the request hits in the ERAT. This may result in multiple ERAT misses for the same instruction.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU0_DERAT_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU0_DERAT_MISS] }, [ POWER5p_PME_PM_FPU_STALL3 ] = { .pme_name = "PM_FPU_STALL3", .pme_code = 0x202088, .pme_short_desc = "FPU stalled in pipe3", .pme_long_desc = "FPU has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). 
This signal is active during the entire duration of the stall. Combined Unit 0 + Unit 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU_STALL3], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU_STALL3] }, [ POWER5p_PME_PM_L2SB_RCLD_DISP ] = { .pme_name = "PM_L2SB_RCLD_DISP", .pme_code = 0x701c1, .pme_short_desc = "L2 slice B RC load dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Load was attempted.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SB_RCLD_DISP], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SB_RCLD_DISP] }, [ POWER5p_PME_PM_BR_PRED_CR ] = { .pme_name = "PM_BR_PRED_CR", .pme_code = 0x230e2, .pme_short_desc = "A conditional branch was predicted", .pme_long_desc = " CR prediction", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_BR_PRED_CR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_BR_PRED_CR] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L2 ] = { .pme_name = "PM_MRK_DATA_FROM_L2", .pme_code = 0x1c7087, .pme_short_desc = "Marked data loaded from L2", .pme_long_desc = "The processor's Data Cache was reloaded from the local L2 due to a marked load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L2], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L2] }, [ POWER5p_PME_PM_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_LSU0_FLUSH_SRQ", .pme_code = 0xc00c3, .pme_short_desc = "LSU0 SRQ lhs flushes", .pme_long_desc = "A store was flushed by unit 0 because a younger load hit an older store that is already in the SRQ or in the same group.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU0_FLUSH_SRQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU0_FLUSH_SRQ] }, [ POWER5p_PME_PM_FAB_PNtoNN_DIRECT ] = { .pme_name = "PM_FAB_PNtoNN_DIRECT", .pme_code = 0x703c7, .pme_short_desc = "PN to NN beat went straight to its destination", .pme_long_desc = "Fabric Data beats that the base chip takes the inbound PN data and passes it through to the outbound NN bus without going 
into a sidecar. The signal is delivered at FBC speed and the count must be scaled.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FAB_PNtoNN_DIRECT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FAB_PNtoNN_DIRECT] }, [ POWER5p_PME_PM_IOPS_CMPL ] = { .pme_name = "PM_IOPS_CMPL", .pme_code = 0x1, .pme_short_desc = "Internal operations completed", .pme_long_desc = "Number of internal operations that completed.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_IOPS_CMPL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_IOPS_CMPL] }, [ POWER5p_PME_PM_L2SA_RCST_DISP ] = { .pme_name = "PM_L2SA_RCST_DISP", .pme_code = 0x702c0, .pme_short_desc = "L2 slice A RC store dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Store was attempted.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SA_RCST_DISP], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SA_RCST_DISP] }, [ POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SA_RCST_DISP_FAIL_OTHER", .pme_code = 0x732e0, .pme_short_desc = "L2 slice A RC store dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_OTHER], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_OTHER] }, [ POWER5p_PME_PM_L2SC_SHR_INV ] = { .pme_name = "PM_L2SC_SHR_INV", .pme_code = 0x710c2, .pme_short_desc = "L2 slice C transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. 
NOTE: For this event to be useful the tablewalk duration event should also be counted.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SC_SHR_INV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SC_SHR_INV] }, [ POWER5p_PME_PM_SNOOP_RETRY_AB_COLLISION ] = { .pme_name = "PM_SNOOP_RETRY_AB_COLLISION", .pme_code = 0x735e6, .pme_short_desc = "Snoop retry due to a b collision", .pme_long_desc = "Snoop retry due to a b collision", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_SNOOP_RETRY_AB_COLLISION], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_SNOOP_RETRY_AB_COLLISION] }, [ POWER5p_PME_PM_FAB_PNtoVN_SIDECAR ] = { .pme_name = "PM_FAB_PNtoVN_SIDECAR", .pme_code = 0x733e7, .pme_short_desc = "PN to VN beat went to sidecar first", .pme_long_desc = "Fabric data beats that the base chip takes the inbound PN data and forwards it on to the outbound VN data bus after going into a sidecar first. The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FAB_PNtoVN_SIDECAR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FAB_PNtoVN_SIDECAR] }, [ POWER5p_PME_PM_LSU0_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU0_REJECT_LMQ_FULL", .pme_code = 0xc40c1, .pme_short_desc = "LSU0 reject due to LMQ full or missed data coming", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. 
If all eight entries are full, subsequent load instructions are rejected.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU0_REJECT_LMQ_FULL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU0_REJECT_LMQ_FULL] }, [ POWER5p_PME_PM_LSU_LMQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LMQ_S0_ALLOC", .pme_code = 0xc30e6, .pme_short_desc = "LMQ slot 0 allocated", .pme_long_desc = "The first entry in the LMQ was allocated.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_LMQ_S0_ALLOC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_LMQ_S0_ALLOC] }, [ POWER5p_PME_PM_SNOOP_PW_RETRY_RQ ] = { .pme_name = "PM_SNOOP_PW_RETRY_RQ", .pme_code = 0x707c6, .pme_short_desc = "Snoop partial-write retry due to collision with active read queue", .pme_long_desc = "A snoop request for a partial write to memory was retried because it matched the cache line of an active read. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_SNOOP_PW_RETRY_RQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_SNOOP_PW_RETRY_RQ] }, [ POWER5p_PME_PM_DTLB_REF ] = { .pme_name = "PM_DTLB_REF", .pme_code = 0xc20e4, .pme_short_desc = "Data TLB references", .pme_long_desc = "Total number of Data TLB references for all page sizes. 
Page size is determined at TLB reload time.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DTLB_REF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DTLB_REF] }, [ POWER5p_PME_PM_PTEG_FROM_L3 ] = { .pme_name = "PM_PTEG_FROM_L3", .pme_code = 0x18308e, .pme_short_desc = "PTEG loaded from L3", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local L3 due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PTEG_FROM_L3], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PTEG_FROM_L3] }, [ POWER5p_PME_PM_FAB_M1toVNorNN_SIDECAR_EMPTY ] = { .pme_name = "PM_FAB_M1toVNorNN_SIDECAR_EMPTY", .pme_code = 0x712c7, .pme_short_desc = "M1 to VN/NN sidecar empty", .pme_long_desc = "Fabric cycles when the Minus-1 jump sidecar (sidecars for mcm to mcm data transfer) is empty. The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FAB_M1toVNorNN_SIDECAR_EMPTY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FAB_M1toVNorNN_SIDECAR_EMPTY] }, [ POWER5p_PME_PM_LSU_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_SRQ_EMPTY_CYC", .pme_code = 0x400015, .pme_short_desc = "Cycles SRQ empty", .pme_long_desc = "Cycles the Store Request Queue is empty", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_SRQ_EMPTY_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_SRQ_EMPTY_CYC] }, [ POWER5p_PME_PM_FPU1_STF ] = { .pme_name = "PM_FPU1_STF", .pme_code = 0x20e6, .pme_short_desc = "FPU1 executed store instruction", .pme_long_desc = "FPU1 has executed a Floating Point Store instruction.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU1_STF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU1_STF] }, [ POWER5p_PME_PM_LSU_LMQ_S0_VALID ] = { .pme_name = "PM_LSU_LMQ_S0_VALID", .pme_code = 0xc30e5, .pme_short_desc = "LMQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle when the first entry in the LMQ is valid. 
The LMQ has eight entries that are allocated FIFO.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_LMQ_S0_VALID], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_LMQ_S0_VALID] }, [ POWER5p_PME_PM_GCT_USAGE_00to59_CYC ] = { .pme_name = "PM_GCT_USAGE_00to59_CYC", .pme_code = 0x10001f, .pme_short_desc = "Cycles GCT less than 60% full", .pme_long_desc = "Cycles when the Global Completion Table has fewer than 60% of its slots used. The GCT has 20 entries shared between threads.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GCT_USAGE_00to59_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GCT_USAGE_00to59_CYC] }, [ POWER5p_PME_PM_FPU_FMOV_FEST ] = { .pme_name = "PM_FPU_FMOV_FEST", .pme_code = 0x301088, .pme_short_desc = "FPU executed FMOV or FEST instructions", .pme_long_desc = "The floating point unit has executed a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ.. Combined Unit 0 + Unit 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU_FMOV_FEST], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU_FMOV_FEST] }, [ POWER5p_PME_PM_DATA_FROM_L2MISS ] = { .pme_name = "PM_DATA_FROM_L2MISS", .pme_code = 0x3c309b, .pme_short_desc = "Data loaded missed L2", .pme_long_desc = "The processor's Data Cache was reloaded but not from the local L2.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DATA_FROM_L2MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DATA_FROM_L2MISS] }, [ POWER5p_PME_PM_XER_MAP_FULL_CYC ] = { .pme_name = "PM_XER_MAP_FULL_CYC", .pme_code = 0x100c2, .pme_short_desc = "Cycles XER mapper full", .pme_long_desc = "The XER mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. 
This event only indicates that the mapper was full, not that dispatch was prevented.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_XER_MAP_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_XER_MAP_FULL_CYC] }, [ POWER5p_PME_PM_GRP_DISP_BLK_SB_CYC ] = { .pme_name = "PM_GRP_DISP_BLK_SB_CYC", .pme_code = 0x130e1, .pme_short_desc = "Cycles group dispatch blocked by scoreboard", .pme_long_desc = "A scoreboard operation on a non-renamed resource has blocked dispatch.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GRP_DISP_BLK_SB_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GRP_DISP_BLK_SB_CYC] }, [ POWER5p_PME_PM_FLUSH_SB ] = { .pme_name = "PM_FLUSH_SB", .pme_code = 0x330e2, .pme_short_desc = "Flush caused by scoreboard operation", .pme_long_desc = "This thread has been flushed at dispatch because its scoreboard bit is set indicating that a non-renamed resource is being updated. This allows the other thread to have more machine resources for it to make progress while this thread is stalled.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FLUSH_SB], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FLUSH_SB] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L375_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L375_SHR", .pme_code = 0x3c709e, .pme_short_desc = "Marked data loaded from L3.75 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on a different module than this processor is located due to a marked load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L375_SHR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L375_SHR] }, [ POWER5p_PME_PM_MRK_GRP_CMPL ] = { .pme_name = "PM_MRK_GRP_CMPL", .pme_code = 0x400013, .pme_short_desc = "Marked group completed", .pme_long_desc = "A group containing a sampled instruction completed. 
Microcoded instructions that span multiple groups will generate this event once per group.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_GRP_CMPL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_GRP_CMPL] }, [ POWER5p_PME_PM_SUSPENDED ] = { .pme_name = "PM_SUSPENDED", .pme_code = 0x0, .pme_short_desc = "Suspended", .pme_long_desc = "The counter is suspended (does not count).", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_SUSPENDED], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_SUSPENDED] }, [ POWER5p_PME_PM_SNOOP_RD_RETRY_QFULL ] = { .pme_name = "PM_SNOOP_RD_RETRY_QFULL", .pme_code = 0x700c6, .pme_short_desc = "Snoop read retry due to read queue full", .pme_long_desc = "A snoop request for a read from memory was retried because the read queues were full. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_SNOOP_RD_RETRY_QFULL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_SNOOP_RD_RETRY_QFULL] }, [ POWER5p_PME_PM_GRP_IC_MISS_BR_REDIR_NONSPEC ] = { .pme_name = "PM_GRP_IC_MISS_BR_REDIR_NONSPEC", .pme_code = 0x120e5, .pme_short_desc = "Group experienced non-speculative I cache miss or branch redirect", .pme_long_desc = "Group experienced non-speculative I cache miss or branch redirect", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GRP_IC_MISS_BR_REDIR_NONSPEC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GRP_IC_MISS_BR_REDIR_NONSPEC] }, [ POWER5p_PME_PM_DATA_FROM_L35_SHR ] = { .pme_name = "PM_DATA_FROM_L35_SHR", .pme_code = 0x1c309e, .pme_short_desc = "Data loaded from L3.5 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on the same module as this processor is located due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DATA_FROM_L35_SHR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DATA_FROM_L35_SHR] }, [ 
POWER5p_PME_PM_L3SB_MOD_INV ] = { .pme_name = "PM_L3SB_MOD_INV", .pme_code = 0x730e4, .pme_short_desc = "L3 slice B transition from modified to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I). Mu|Me are not included since they are formed due to a prev read op. Tx is not included since it is considered shared at this point.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SB_MOD_INV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SB_MOD_INV] }, [ POWER5p_PME_PM_STCX_FAIL ] = { .pme_name = "PM_STCX_FAIL", .pme_code = 0x820e1, .pme_short_desc = "STCX failed", .pme_long_desc = "A stcx (stwcx or stdcx) failed", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_STCX_FAIL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_STCX_FAIL] }, [ POWER5p_PME_PM_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_LD_MISS_L1_LSU1", .pme_code = 0xc10c5, .pme_short_desc = "LSU1 L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache, by unit 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LD_MISS_L1_LSU1], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LD_MISS_L1_LSU1] }, [ POWER5p_PME_PM_GRP_DISP ] = { .pme_name = "PM_GRP_DISP", .pme_code = 0x200002, .pme_short_desc = "Group dispatches", .pme_long_desc = "A group was dispatched", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_GRP_DISP], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_GRP_DISP] }, [ POWER5p_PME_PM_DC_PREF_DST ] = { .pme_name = "PM_DC_PREF_DST", .pme_code = 0x830e6, .pme_short_desc = "DST (Data Stream Touch) stream start", .pme_long_desc = "A prefetch stream was started using the DST instruction.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DC_PREF_DST], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DC_PREF_DST] }, [ POWER5p_PME_PM_FPU1_DENORM ] = { .pme_name = "PM_FPU1_DENORM", .pme_code = 0x20e4, .pme_short_desc = "FPU1 received denormalized data", 
.pme_long_desc = "FPU1 has encountered a denormalized operand.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU1_DENORM], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU1_DENORM] }, [ POWER5p_PME_PM_FPU0_FPSCR ] = { .pme_name = "PM_FPU0_FPSCR", .pme_code = 0x30e0, .pme_short_desc = "FPU0 executed FPSCR instruction", .pme_long_desc = "FPU0 has executed FPSCR move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*, mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU0_FPSCR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU0_FPSCR] }, [ POWER5p_PME_PM_DATA_FROM_L2 ] = { .pme_name = "PM_DATA_FROM_L2", .pme_code = 0x1c3087, .pme_short_desc = "Data loaded from L2", .pme_long_desc = "The processor's Data Cache was reloaded from the local L2 due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DATA_FROM_L2], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DATA_FROM_L2] }, [ POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SA_RCLD_DISP_FAIL_ADDR", .pme_code = 0x711c0, .pme_short_desc = "L2 slice A RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_ADDR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_ADDR] }, [ POWER5p_PME_PM_FPU_1FLOP ] = { .pme_name = "PM_FPU_1FLOP", .pme_code = 0x100090, .pme_short_desc = "FPU executed one flop instruction", .pme_long_desc = "The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. 
These are single FLOP operations.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU_1FLOP], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU_1FLOP] }, [ POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SC_RCLD_DISP_FAIL_OTHER", .pme_code = 0x731e2, .pme_short_desc = "L2 slice C RC load dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_OTHER], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_OTHER] }, [ POWER5p_PME_PM_FPU0_FSQRT ] = { .pme_name = "PM_FPU0_FSQRT", .pme_code = 0xc2, .pme_short_desc = "FPU0 executed FSQRT instruction", .pme_long_desc = "FPU0 has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU0_FSQRT], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU0_FSQRT] }, [ POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SC_RCST_DISP_FAIL_RC_FULL", .pme_code = 0x722e1, .pme_short_desc = "L2 slice C RC store dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a store failed because all RC machines are busy.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_RC_FULL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_RC_FULL] }, [ POWER5p_PME_PM_LD_REF_L1 ] = { .pme_name = "PM_LD_REF_L1", .pme_code = 0x1c10a8, .pme_short_desc = "L1 D cache load references", .pme_long_desc = "Load references to the Level 1 Data Cache. 
Combined unit 0 + 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LD_REF_L1], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LD_REF_L1] }, [ POWER5p_PME_PM_INST_FROM_L1 ] = { .pme_name = "PM_INST_FROM_L1", .pme_code = 0x22208d, .pme_short_desc = "Instruction fetched from L1", .pme_long_desc = "An instruction fetch group was fetched from L1. Fetch Groups can contain up to 8 instructions", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_FROM_L1], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_FROM_L1] }, [ POWER5p_PME_PM_TLBIE_HELD ] = { .pme_name = "PM_TLBIE_HELD", .pme_code = 0x130e4, .pme_short_desc = "TLBIE held at dispatch", .pme_long_desc = "Cycles a TLBIE instruction was held at dispatch.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_TLBIE_HELD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_TLBIE_HELD] }, [ POWER5p_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { .pme_name = "PM_DC_PREF_OUT_OF_STREAMS", .pme_code = 0xc50c2, .pme_short_desc = "D cache out of prefetch streams", .pme_long_desc = "A new prefetch stream was detected but no more stream entries were available.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DC_PREF_OUT_OF_STREAMS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DC_PREF_OUT_OF_STREAMS] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L25_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L25_MOD_CYC", .pme_code = 0x4c70a2, .pme_short_desc = "Marked load latency from L2.5 modified", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency, divide this count by the number of marked misses to the same level.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L25_MOD_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L25_MOD_CYC] }, [ POWER5p_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_SRQ", .pme_code = 0x810c7, .pme_short_desc = "LSU1 marked SRQ lhs flushes", .pme_long_desc = "A marked store was flushed because a younger load hit an older store that is already in the SRQ or in the same group.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LSU1_FLUSH_SRQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LSU1_FLUSH_SRQ] }, [ POWER5p_PME_PM_MEM_RQ_DISP_Q0to3 ] = { .pme_name = "PM_MEM_RQ_DISP_Q0to3", .pme_code = 0x702c6, .pme_short_desc = "Memory read queue dispatched to queues 0-3", .pme_long_desc = "A memory operation was dispatched to read queue 0,1,2, or 3. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MEM_RQ_DISP_Q0to3], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MEM_RQ_DISP_Q0to3] }, [ POWER5p_PME_PM_ST_REF_L1_LSU1 ] = { .pme_name = "PM_ST_REF_L1_LSU1", .pme_code = 0xc10c4, .pme_short_desc = "LSU1 L1 D cache store references", .pme_long_desc = "Store references to the Data Cache by LSU1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_ST_REF_L1_LSU1], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_ST_REF_L1_LSU1] }, [ POWER5p_PME_PM_MRK_LD_MISS_L1 ] = { .pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 0x182088, .pme_short_desc = "Marked L1 D cache load misses", .pme_long_desc = "Marked L1 D cache load misses", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LD_MISS_L1], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LD_MISS_L1] }, [ POWER5p_PME_PM_L1_WRITE_CYC ] = { .pme_name = "PM_L1_WRITE_CYC", .pme_code = 0x230e7, .pme_short_desc = "Cycles writing to instruction 
L1", .pme_long_desc = "Cycles that a cache line was written to the instruction cache.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L1_WRITE_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L1_WRITE_CYC] }, [ POWER5p_PME_PM_L2SC_ST_REQ ] = { .pme_name = "PM_L2SC_ST_REQ", .pme_code = 0x723e2, .pme_short_desc = "L2 slice C store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SC_ST_REQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SC_ST_REQ] }, [ POWER5p_PME_PM_CMPLU_STALL_FDIV ] = { .pme_name = "PM_CMPLU_STALL_FDIV", .pme_code = 0x21109b, .pme_short_desc = "Completion stall caused by FDIV or FSQRT instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a floating point divide or square root instruction. This is a subset of PM_CMPLU_STALL_FPU.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_CMPLU_STALL_FDIV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_CMPLU_STALL_FDIV] }, [ POWER5p_PME_PM_THRD_SEL_OVER_CLB_EMPTY ] = { .pme_name = "PM_THRD_SEL_OVER_CLB_EMPTY", .pme_code = 0x410c2, .pme_short_desc = "Thread selection overrides caused by CLB empty", .pme_long_desc = "Thread selection was overridden because one thread's CLB was empty.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_SEL_OVER_CLB_EMPTY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_SEL_OVER_CLB_EMPTY] }, [ POWER5p_PME_PM_BR_MPRED_CR ] = { .pme_name = "PM_BR_MPRED_CR", .pme_code = 0x230e5, .pme_short_desc = "Branch mispredictions due to CR bit setting", .pme_long_desc = "A conditional branch instruction was incorrectly predicted as taken or not taken. 
The branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This will result in a branch redirect flush if not overridden by a flush of an older instruction.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_BR_MPRED_CR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_BR_MPRED_CR] }, [ POWER5p_PME_PM_L3SB_MOD_TAG ] = { .pme_name = "PM_L3SB_MOD_TAG", .pme_code = 0x720e4, .pme_short_desc = "L3 slice B transition from modified to TAG", .pme_long_desc = "L3 snooper detects someone doing a read to a line that is truly M in this L3 (i.e. L3 going M->T or M->I (go_Mu case); Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SB_MOD_TAG], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SB_MOD_TAG] }, [ POWER5p_PME_PM_MRK_DATA_FROM_L2MISS ] = { .pme_name = "PM_MRK_DATA_FROM_L2MISS", .pme_code = 0x3c709b, .pme_short_desc = "Marked data loaded missed L2", .pme_long_desc = "DL1 was reloaded from beyond L2 due to a marked demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_DATA_FROM_L2MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_DATA_FROM_L2MISS] }, [ POWER5p_PME_PM_LSU_REJECT_SRQ ] = { .pme_name = "PM_LSU_REJECT_SRQ", .pme_code = 0x1c4088, .pme_short_desc = "LSU SRQ lhs rejects", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions because of Load Hit Store conditions. Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue. 
Combined Unit 0 + 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_REJECT_SRQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_REJECT_SRQ] }, [ POWER5p_PME_PM_LD_MISS_L1 ] = { .pme_name = "PM_LD_MISS_L1", .pme_code = 0x3c1088, .pme_short_desc = "L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache. Combined unit 0 + 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LD_MISS_L1], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LD_MISS_L1] }, [ POWER5p_PME_PM_INST_FROM_PREF ] = { .pme_name = "PM_INST_FROM_PREF", .pme_code = 0x32208d, .pme_short_desc = "Instruction fetched from prefetch", .pme_long_desc = "An instruction fetch group was fetched from the prefetch buffer. Fetch groups can contain up to 8 instructions", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_INST_FROM_PREF], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_INST_FROM_PREF] }, [ POWER5p_PME_PM_STCX_PASS ] = { .pme_name = "PM_STCX_PASS", .pme_code = 0x820e5, .pme_short_desc = "Stcx passes", .pme_long_desc = "A stcx (stwcx or stdcx) instruction was successful", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_STCX_PASS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_STCX_PASS] }, [ POWER5p_PME_PM_DC_INV_L2 ] = { .pme_name = "PM_DC_INV_L2", .pme_code = 0xc10c7, .pme_short_desc = "L1 D cache entries invalidated from L2", .pme_long_desc = "A dcache invalidated was received from the L2 because a line in L2 was castout.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DC_INV_L2], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DC_INV_L2] }, [ POWER5p_PME_PM_LSU_SRQ_FULL_CYC ] = { .pme_name = "PM_LSU_SRQ_FULL_CYC", .pme_code = 0x110c3, .pme_short_desc = "Cycles SRQ full", .pme_long_desc = "Cycles the Store Request Queue is full.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_SRQ_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_SRQ_FULL_CYC] }, [ POWER5p_PME_PM_FPU_FIN ] = { .pme_name 
= "PM_FPU_FIN", .pme_code = 0x401088, .pme_short_desc = "FPU produced a result", .pme_long_desc = "FPU finished, produced a result. This only indicates finish, not completion. Combined Unit 0 + Unit 1. Floating Point Stores are included in this count but not Floating Point Loads., , , XYZs", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU_FIN], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU_FIN] }, [ POWER5p_PME_PM_LSU_SRQ_STFWD ] = { .pme_name = "PM_LSU_SRQ_STFWD", .pme_code = 0x2c6088, .pme_short_desc = "SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load that hits L1 but becomes a store forward, then it's not treated as a load miss. Combined Unit 0 + 1.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU_SRQ_STFWD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU_SRQ_STFWD] }, [ POWER5p_PME_PM_L2SA_SHR_MOD ] = { .pme_name = "PM_L2SA_SHR_MOD", .pme_code = 0x700c0, .pme_short_desc = "L2 slice A transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SA_SHR_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SA_SHR_MOD] }, [ POWER5p_PME_PM_0INST_CLB_CYC ] = { .pme_name = "PM_0INST_CLB_CYC", .pme_code = 0x400c0, .pme_short_desc = "Cycles no instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. 
Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_0INST_CLB_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_0INST_CLB_CYC] }, [ POWER5p_PME_PM_FXU0_FIN ] = { .pme_name = "PM_FXU0_FIN", .pme_code = 0x130e2, .pme_short_desc = "FXU0 produced a result", .pme_long_desc = "The Fixed Point unit 0 finished an instruction and produced a result. Instructions that finish may not necessarily complete.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FXU0_FIN], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FXU0_FIN] }, [ POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SB_RCST_DISP_FAIL_RC_FULL", .pme_code = 0x722e2, .pme_short_desc = "L2 slice B RC store dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a store failed because all RC machines are busy.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_RC_FULL], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_RC_FULL] }, [ POWER5p_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { .pme_name = "PM_THRD_GRP_CMPL_BOTH_CYC", .pme_code = 0x200013, .pme_short_desc = "Cycles group completed by both threads", .pme_long_desc = "Cycles that both threads completed.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_GRP_CMPL_BOTH_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_GRP_CMPL_BOTH_CYC] }, [ POWER5p_PME_PM_PMC5_OVERFLOW ] = { .pme_name = "PM_PMC5_OVERFLOW", .pme_code = 0x10001a, .pme_short_desc = "PMC5 Overflow", .pme_long_desc = "Overflows from PMC5 are counted. This effectively widens the PMC. 
The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PMC5_OVERFLOW], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PMC5_OVERFLOW] }, [ POWER5p_PME_PM_FPU0_FDIV ] = { .pme_name = "PM_FPU0_FDIV", .pme_code = 0xc0, .pme_short_desc = "FPU0 executed FDIV instruction", .pme_long_desc = "FPU0 has executed a divide instruction. This could be fdiv, fdivs, fdiv., or fdivs.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_FPU0_FDIV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_FPU0_FDIV] }, [ POWER5p_PME_PM_PTEG_FROM_L375_SHR ] = { .pme_name = "PM_PTEG_FROM_L375_SHR", .pme_code = 0x38309e, .pme_short_desc = "PTEG loaded from L3.75 shared", .pme_long_desc = "A Page Table Entry was loaded into the TLB with shared (S) data from the L3 of a chip on a different module than this processor is located, due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_PTEG_FROM_L375_SHR], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_PTEG_FROM_L375_SHR] }, [ POWER5p_PME_PM_HV_CYC ] = { .pme_name = "PM_HV_CYC", .pme_code = 0x20000b, .pme_short_desc = "Hypervisor Cycles", .pme_long_desc = "Cycles when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0)", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_HV_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_HV_CYC] }, [ POWER5p_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY ] = { .pme_name = "PM_L2SA_RC_DISP_FAIL_CO_BUSY", .pme_code = 0x703c0, .pme_short_desc = "L2 slice A RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy", .pme_long_desc = "A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. 
If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss(re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY] }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_0_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_0_CYC", .pme_code = 0x430e3, .pme_short_desc = "Cycles no thread priority difference", .pme_long_desc = "Cycles when this thread's priority is equal to the other thread's priority.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_THRD_PRIO_DIFF_0_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_THRD_PRIO_DIFF_0_CYC] }, [ POWER5p_PME_PM_LR_CTR_MAP_FULL_CYC ] = { .pme_name = "PM_LR_CTR_MAP_FULL_CYC", .pme_code = 0x100c6, .pme_short_desc = "Cycles LR/CTR mapper full", .pme_long_desc = "The LR/CTR mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LR_CTR_MAP_FULL_CYC], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LR_CTR_MAP_FULL_CYC] }, [ POWER5p_PME_PM_L3SB_SHR_INV ] = { .pme_name = "PM_L3SB_SHR_INV", .pme_code = 0x710c4, .pme_short_desc = "L3 slice B transition from shared to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. 
invalidate hit SX and dispatched).", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L3SB_SHR_INV], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L3SB_SHR_INV] }, [ POWER5p_PME_PM_DATA_FROM_RMEM ] = { .pme_name = "PM_DATA_FROM_RMEM", .pme_code = 0x1c30a1, .pme_short_desc = "Data loaded from remote memory", .pme_long_desc = "The processor's Data Cache was reloaded from memory attached to a different module than this processor is located on.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DATA_FROM_RMEM], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DATA_FROM_RMEM] }, [ POWER5p_PME_PM_DATA_FROM_L275_MOD ] = { .pme_name = "PM_DATA_FROM_L275_MOD", .pme_code = 0x1c30a3, .pme_short_desc = "Data loaded from L2.75 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L2 on a different module than this processor is located due to a demand load.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DATA_FROM_L275_MOD], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DATA_FROM_L275_MOD] }, [ POWER5p_PME_PM_LSU0_REJECT_SRQ ] = { .pme_name = "PM_LSU0_REJECT_SRQ", .pme_code = 0xc40c0, .pme_short_desc = "LSU0 SRQ lhs rejects", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions because of Load Hit Store conditions. Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU0_REJECT_SRQ], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU0_REJECT_SRQ] }, [ POWER5p_PME_PM_LSU1_DERAT_MISS ] = { .pme_name = "PM_LSU1_DERAT_MISS", .pme_code = 0x800c6, .pme_short_desc = "LSU1 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. 
Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU1_DERAT_MISS], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU1_DERAT_MISS] }, [ POWER5p_PME_PM_MRK_LSU_FIN ] = { .pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x400014, .pme_short_desc = "Marked instruction LSU processing finished", .pme_long_desc = "One of the Load/Store Units finished a marked instruction. Instructions that finish may not necessarily complete", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_MRK_LSU_FIN], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_MRK_LSU_FIN] }, [ POWER5p_PME_PM_DTLB_MISS_16M ] = { .pme_name = "PM_DTLB_MISS_16M", .pme_code = 0x3c208d, .pme_short_desc = "Data TLB miss for 16M page", .pme_long_desc = "Data TLB references to 16MB pages that missed the TLB. Page size is determined at TLB reload time.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_DTLB_MISS_16M], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_DTLB_MISS_16M] }, [ POWER5p_PME_PM_LSU0_FLUSH_UST ] = { .pme_name = "PM_LSU0_FLUSH_UST", .pme_code = 0xc00c1, .pme_short_desc = "LSU0 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 0 because it was unaligned (crossed a 4K boundary).", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_LSU0_FLUSH_UST], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_LSU0_FLUSH_UST] }, [ POWER5p_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY ] = { .pme_name = "PM_L2SB_RC_DISP_FAIL_CO_BUSY", .pme_code = 0x703c1, .pme_short_desc = "L2 slice B RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy", .pme_long_desc = "A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. 
If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss(re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY] }, [ POWER5p_PME_PM_L2SC_MOD_TAG ] = { .pme_name = "PM_L2SC_MOD_TAG", .pme_code = 0x720e2, .pme_short_desc = "L2 slice C transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5p_event_ids[POWER5p_PME_PM_L2SC_MOD_TAG], .pme_group_vector = power5p_group_vecs[POWER5p_PME_PM_L2SC_MOD_TAG] } }; #define POWER5p_PME_EVENT_COUNT 483 static const int power5p_group_event_ids[][POWER5p_NUM_EVENT_COUNTERS] = { [ 0 ] = { 312, 302, 113, 21, 0, 0 }, [ 1 ] = { 2, 95, 100, 21, 0, 0 }, [ 2 ] = { 105, 104, 101, 113, 0, 0 }, [ 3 ] = { 0, 2, 12, 267, 0, 0 }, [ 4 ] = { 6, 6, 292, 112, 0, 0 }, [ 5 ] = { 98, 97, 95, 98, 0, 0 }, [ 6 ] = { 99, 98, 96, 97, 0, 0 }, [ 7 ] = { 242, 241, 234, 234, 0, 0 }, [ 8 ] = { 247, 246, 244, 240, 0, 0 }, [ 9 ] = { 238, 247, 236, 239, 0, 0 }, [ 10 ] = { 237, 244, 236, 239, 0, 0 }, [ 11 ] = { 120, 115, 26, 29, 0, 0 }, [ 12 ] = { 115, 13, 122, 108, 0, 0 }, [ 13 ] = { 1, 227, 172, 112, 0, 0 }, [ 14 ] = { 216, 225, 27, 171, 0, 0 }, [ 15 ] = { 244, 242, 53, 294, 0, 0 }, [ 16 ] = { 215, 224, 112, 122, 0, 0 }, [ 17 ] = { 213, 222, 245, 344, 0, 0 }, [ 18 ] = { 214, 223, 112, 9, 0, 0 }, [ 19 ] = { 245, 243, 
228, 53, 0, 0 }, [ 20 ] = { 115, 233, 53, 26, 0, 0 }, [ 21 ] = { 124, 113, 54, 57, 0, 0 }, [ 22 ] = { 233, 230, 112, 226, 0, 0 }, [ 23 ] = { 207, 216, 228, 112, 0, 0 }, [ 24 ] = { 208, 217, 112, 226, 0, 0 }, [ 25 ] = { 235, 233, 8, 112, 0, 0 }, [ 26 ] = { 209, 218, 228, 112, 0, 0 }, [ 27 ] = { 210, 219, 112, 226, 0, 0 }, [ 28 ] = { 232, 113, 290, 229, 0, 0 }, [ 29 ] = { 109, 17, 112, 18, 0, 0 }, [ 30 ] = { 115, 14, 16, 16, 0, 0 }, [ 31 ] = { 107, 16, 112, 15, 0, 0 }, [ 32 ] = { 89, 15, 112, 17, 0, 0 }, [ 33 ] = { 198, 7, 237, 231, 0, 0 }, [ 34 ] = { 68, 80, 88, 92, 0, 0 }, [ 35 ] = { 16, 200, 97, 19, 0, 0 }, [ 36 ] = { 57, 351, 266, 112, 0, 0 }, [ 37 ] = { 325, 321, 208, 220, 0, 0 }, [ 38 ] = { 205, 214, 106, 107, 0, 0 }, [ 39 ] = { 113, 110, 108, 1, 0, 0 }, [ 40 ] = { 108, 106, 121, 112, 0, 0 }, [ 41 ] = { 356, 307, 9, 11, 0, 0 }, [ 42 ] = { 12, 11, 11, 12, 0, 0 }, [ 43 ] = { 102, 100, 52, 112, 0, 0 }, [ 44 ] = { 25, 30, 195, 196, 0, 0 }, [ 45 ] = { 18, 228, 322, 318, 0, 0 }, [ 46 ] = { 30, 120, 196, 195, 0, 0 }, [ 47 ] = { 34, 33, 33, 34, 0, 0 }, [ 48 ] = { 32, 31, 31, 32, 0, 0 }, [ 49 ] = { 33, 30, 16, 21, 0, 0 }, [ 50 ] = { 201, 323, 195, 318, 0, 0 }, [ 51 ] = { 21, 23, 51, 112, 0, 0 }, [ 52 ] = { 21, 23, 19, 24, 0, 0 }, [ 53 ] = { 19, 21, 18, 22, 0, 0 }, [ 54 ] = { 22, 22, 22, 23, 0, 0 }, [ 55 ] = { 121, 116, 118, 117, 0, 0 }, [ 56 ] = { 118, 119, 112, 1, 0, 0 }, [ 57 ] = { 119, 117, 115, 115, 0, 0 }, [ 58 ] = { 122, 118, 117, 116, 0, 0 }, [ 59 ] = { 305, 303, 299, 300, 0, 0 }, [ 60 ] = { 308, 304, 303, 301, 0, 0 }, [ 61 ] = { 304, 305, 300, 302, 0, 0 }, [ 62 ] = { 307, 102, 103, 26, 0, 0 }, [ 63 ] = { 130, 130, 127, 127, 0, 0 }, [ 64 ] = { 134, 134, 131, 131, 0, 0 }, [ 65 ] = { 138, 140, 135, 137, 0, 0 }, [ 66 ] = { 146, 146, 143, 143, 0, 0 }, [ 67 ] = { 150, 150, 147, 147, 0, 0 }, [ 68 ] = { 154, 156, 151, 153, 0, 0 }, [ 69 ] = { 162, 162, 159, 159, 0, 0 }, [ 70 ] = { 166, 166, 163, 163, 0, 0 }, [ 71 ] = { 170, 172, 167, 169, 0, 0 }, [ 72 ] = { 180, 113, 
175, 177, 0, 0 }, [ 73 ] = { 115, 184, 182, 184, 0, 0 }, [ 74 ] = { 115, 191, 189, 191, 0, 0 }, [ 75 ] = { 129, 138, 124, 135, 0, 0 }, [ 76 ] = { 145, 154, 140, 151, 0, 0 }, [ 77 ] = { 161, 170, 156, 167, 0, 0 }, [ 78 ] = { 177, 181, 179, 185, 0, 0 }, [ 79 ] = { 181, 185, 174, 180, 0, 0 }, [ 80 ] = { 191, 192, 193, 187, 0, 0 }, [ 81 ] = { 87, 84, 84, 87, 0, 0 }, [ 82 ] = { 85, 86, 85, 88, 0, 0 }, [ 83 ] = { 86, 87, 61, 77, 0, 0 }, [ 84 ] = { 90, 88, 112, 230, 0, 0 }, [ 85 ] = { 67, 79, 60, 76, 0, 0 }, [ 86 ] = { 59, 72, 63, 79, 0, 0 }, [ 87 ] = { 60, 73, 65, 80, 0, 0 }, [ 88 ] = { 70, 82, 112, 66, 0, 0 }, [ 89 ] = { 69, 81, 207, 219, 0, 0 }, [ 90 ] = { 63, 76, 112, 80, 0, 0 }, [ 91 ] = { 58, 71, 61, 112, 0, 0 }, [ 92 ] = { 71, 83, 207, 112, 0, 0 }, [ 93 ] = { 96, 93, 90, 95, 0, 0 }, [ 94 ] = { 281, 283, 93, 93, 0, 0 }, [ 95 ] = { 4, 4, 91, 96, 0, 0 }, [ 96 ] = { 337, 335, 334, 331, 0, 0 }, [ 97 ] = { 336, 334, 336, 333, 0, 0 }, [ 98 ] = { 335, 333, 338, 335, 0, 0 }, [ 99 ] = { 334, 107, 340, 112, 0, 0 }, [ 100 ] = { 333, 327, 112, 322, 0, 0 }, [ 101 ] = { 321, 113, 345, 342, 0, 0 }, [ 102 ] = { 115, 0, 341, 338, 0, 0 }, [ 103 ] = { 115, 20, 343, 340, 0, 0 }, [ 104 ] = { 37, 38, 37, 41, 0, 0 }, [ 105 ] = { 45, 41, 45, 52, 0, 0 }, [ 106 ] = { 47, 48, 47, 51, 0, 0 }, [ 107 ] = { 43, 40, 34, 45, 0, 0 }, [ 108 ] = { 317, 308, 315, 305, 0, 0 }, [ 109 ] = { 318, 315, 312, 112, 0, 0 }, [ 110 ] = { 323, 252, 317, 249, 0, 0 }, [ 111 ] = { 315, 353, 309, 306, 0, 0 }, [ 112 ] = { 261, 263, 249, 36, 0, 0 }, [ 113 ] = { 260, 261, 258, 37, 0, 0 }, [ 114 ] = { 268, 264, 262, 260, 0, 0 }, [ 115 ] = { 256, 257, 254, 251, 0, 0 }, [ 116 ] = { 281, 284, 348, 293, 0, 0 }, [ 117 ] = { 281, 300, 278, 278, 0, 0 }, [ 118 ] = { 282, 268, 279, 279, 0, 0 }, [ 119 ] = { 269, 272, 264, 264, 0, 0 }, [ 120 ] = { 270, 270, 112, 88, 0, 0 }, [ 121 ] = { 272, 276, 268, 267, 0, 0 }, [ 122 ] = { 275, 271, 265, 272, 0, 0 }, [ 123 ] = { 273, 274, 270, 270, 0, 0 }, [ 124 ] = { 271, 271, 112, 266, 0, 0 }, [ 
125 ] = { 274, 275, 269, 269, 0, 0 }, [ 126 ] = { 280, 282, 275, 277, 0, 0 }, [ 127 ] = { 278, 280, 273, 275, 0, 0 }, [ 128 ] = { 279, 279, 271, 21, 0, 0 }, [ 129 ] = { 280, 113, 275, 273, 0, 0 }, [ 130 ] = { 285, 113, 294, 263, 0, 0 }, [ 131 ] = { 299, 300, 291, 295, 0, 0 }, [ 132 ] = { 298, 299, 276, 280, 0, 0 }, [ 133 ] = { 18, 116, 322, 196, 0, 0 }, [ 134 ] = { 21, 23, 322, 196, 0, 0 }, [ 135 ] = { 124, 30, 322, 196, 0, 0 }, [ 136 ] = { 21, 23, 195, 318, 0, 0 }, [ 137 ] = { 17, 110, 122, 171, 0, 0 }, [ 138 ] = { 12, 11, 11, 9, 0, 0 }, [ 139 ] = { 70, 82, 61, 66, 0, 0 }, [ 140 ] = { 63, 76, 65, 80, 0, 0 }, [ 141 ] = { 58, 71, 61, 77, 0, 0 }, [ 142 ] = { 85, 84, 322, 196, 0, 0 }, [ 143 ] = { 90, 88, 61, 77, 0, 0 }, [ 144 ] = { 87, 86, 85, 88, 0, 0 }, [ 145 ] = { 85, 84, 87, 88, 0, 0 }, [ 146 ] = { 17, 94, 16, 88, 0, 0 }, [ 147 ] = { 303, 88, 113, 230, 0, 0 }, [ 148 ] = { 17, 114, 195, 318, 0, 0 }, [ 149 ] = { 356, 20, 322, 196, 0, 0 }, [ 150 ] = { 87, 84, 86, 86, 0, 0 }, [ 151 ] = { 303, 20, 195, 26, 0, 0 }, [ 152 ] = { 303, 323, 113, 196, 0, 0 }, [ 153 ] = { 17, 84, 87, 88, 0, 0 }, [ 154 ] = { 17, 30, 195, 196, 0, 0 }, [ 155 ] = { 17, 302, 322, 318, 0, 0 }, [ 156 ] = { 303, 23, 16, 24, 0, 0 }, [ 157 ] = { 281, 302, 348, 293, 0, 0 }, [ 158 ] = { 281, 302, 278, 278, 0, 0 }, [ 159 ] = { 282, 302, 279, 279, 0, 0 }, [ 160 ] = { 269, 302, 264, 264, 0, 0 }, [ 161 ] = { 270, 302, 277, 88, 0, 0 }, [ 162 ] = { 303, 276, 268, 267, 0, 0 }, [ 163 ] = { 303, 271, 265, 272, 0, 0 }, [ 164 ] = { 273, 302, 270, 270, 0, 0 }, [ 165 ] = { 303, 271, 112, 266, 0, 0 }, [ 166 ] = { 303, 275, 269, 269, 0, 0 }, [ 167 ] = { 303, 280, 273, 275, 0, 0 }, [ 168 ] = { 303, 282, 275, 277, 0, 0 }, [ 169 ] = { 303, 280, 273, 275, 0, 0 }, [ 170 ] = { 280, 302, 275, 273, 0, 0 }, [ 171 ] = { 285, 302, 294, 263, 0, 0 }, [ 172 ] = { 299, 302, 291, 295, 0, 0 }, [ 173 ] = { 303, 299, 276, 280, 0, 0 }, [ 174 ] = { 303, 270, 267, 281, 0, 0 }, [ 175 ] = { 303, 274, 272, 271, 0, 0 }, [ 176 ] = { 278, 302, 
274, 265, 0, 0 }, [ 177 ] = { 280, 302, 112, 286, 0, 0 }, [ 178 ] = { 278, 302, 288, 292, 0, 0 }, [ 179 ] = { 303, 272, 284, 288, 0, 0 }, [ 180 ] = { 303, 268, 282, 286, 0, 0 }, [ 181 ] = { 303, 292, 287, 297, 0, 0 }, [ 182 ] = { 303, 286, 281, 298, 0, 0 }, [ 183 ] = { 303, 268, 264, 268, 0, 0 }, [ 184 ] = { 303, 296, 290, 294, 0, 0 }, [ 185 ] = { 269, 302, 266, 296, 0, 0 }, [ 186 ] = { 303, 94, 276, 293, 0, 0 }, [ 187 ] = { 303, 283, 278, 278, 0, 0 } }; static const pmg_power_group_t power5p_groups[] = { [ 0 ] = { .pmg_name = "pm_utilization", .pmg_desc = "CPI and utilization data", .pmg_event_ids = power5p_group_event_ids[0], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000000a12121eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 1 ] = { .pmg_name = "pm_completion", .pmg_desc = "Completion and cycle counts", .pmg_event_ids = power5p_group_event_ids[1], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000002608261eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 2 ] = { .pmg_name = "pm_group_dispatch", .pmg_desc = "Group dispatch events", .pmg_event_ids = power5p_group_event_ids[2], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4000000ec6c8c212ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 3 ] = { .pmg_name = "pm_clb1", .pmg_desc = "CLB fullness", .pmg_event_ids = power5p_group_event_ids[3], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x015b000180848c4cULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 4 ] = { .pmg_name = "pm_clb2", .pmg_desc = "CLB fullness", .pmg_event_ids = power5p_group_event_ids[4], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x014300028a8ccc02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 5 ] = { .pmg_name = "pm_gct_empty", .pmg_desc = "GCT empty reasons", .pmg_event_ids = power5p_group_event_ids[5], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4000000008380838ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 6 ] = { .pmg_name = "pm_gct_usage", .pmg_desc = "GCT Usage", .pmg_event_ids = power5p_group_event_ids[6], .pmg_mmcr0 = 
0x0000000000000000ULL, .pmg_mmcr1 = 0x000000003e3e3e3eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 7 ] = { .pmg_name = "pm_lsu1", .pmg_desc = "LSU LRQ and LMQ events", .pmg_event_ids = power5p_group_event_ids[7], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x020f000fcecccccaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 8 ] = { .pmg_name = "pm_lsu2", .pmg_desc = "LSU SRQ events", .pmg_event_ids = power5p_group_event_ids[8], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x400e000ececcca86ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 9 ] = { .pmg_name = "pm_lsu3", .pmg_desc = "LSU SRQ and LMQ events", .pmg_event_ids = power5p_group_event_ids[9], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x030f0004ea102a2aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 10 ] = { .pmg_name = "pm_lsu4", .pmg_desc = "LSU SRQ and LMQ events", .pmg_event_ids = power5p_group_event_ids[10], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40030000eea62a2aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 11 ] = { .pmg_name = "pm_prefetch1", .pmg_desc = "Prefetch stream allocation", .pmg_event_ids = power5p_group_event_ids[11], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8432000d36c884ceULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 12 ] = { .pmg_name = "pm_prefetch2", .pmg_desc = "Prefetch events", .pmg_event_ids = power5p_group_event_ids[12], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8103000602cace8eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 13 ] = { .pmg_name = "pm_prefetch3", .pmg_desc = "L2 prefetch and misc events", .pmg_event_ids = power5p_group_event_ids[13], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x047c000482108602ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 14 ] = { .pmg_name = "pm_prefetch4", .pmg_desc = "Misc prefetch and reject events", .pmg_event_ids = power5p_group_event_ids[14], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0cf200028088cc86ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 15 ] = { .pmg_name = "pm_lsu_reject1", .pmg_desc = "LSU 
reject events", .pmg_event_ids = power5p_group_event_ids[15], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc8e000022010c610ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 16 ] = { .pmg_name = "pm_lsu_reject2", .pmg_desc = "LSU rejects due to reload CDF or tag update collision", .pmg_event_ids = power5p_group_event_ids[16], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x88c00001848c02ceULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 17 ] = { .pmg_name = "LSU rejects due to ERAT", .pmg_desc = " held instuctions", .pmg_event_ids = power5p_group_event_ids[17], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x48c00003868ec0c8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 18 ] = { .pmg_name = "pm_lsu_reject4", .pmg_desc = "LSU0/1 reject LMQ full", .pmg_event_ids = power5p_group_event_ids[18], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x88c00001828a02c8ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 19 ] = { .pmg_name = "pm_lsu_reject5", .pmg_desc = "LSU misc reject and flush events", .pmg_event_ids = power5p_group_event_ids[19], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x48c0000010208a8eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 20 ] = { .pmg_name = "pm_flush1", .pmg_desc = "Misc flush events", .pmg_event_ids = power5p_group_event_ids[20], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc0f000020210c68eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 21 ] = { .pmg_name = "pm_flush2", .pmg_desc = "Flushes due to scoreboard and sync", .pmg_event_ids = power5p_group_event_ids[21], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc08000038002c4c2ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 22 ] = { .pmg_name = "pm_lsu_flush_srq_lrq", .pmg_desc = "LSU flush by SRQ and LRQ events", .pmg_event_ids = power5p_group_event_ids[22], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40c000002020028aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 23 ] = { .pmg_name = "pm_lsu_flush_lrq", .pmg_desc = "LSU0/1 flush due to LRQ", .pmg_event_ids = 
power5p_group_event_ids[23], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40c00000848c8a02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 24 ] = { .pmg_name = "pm_lsu_flush_srq", .pmg_desc = "LSU0/1 flush due to SRQ", .pmg_event_ids = power5p_group_event_ids[24], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40c00000868e028aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 25 ] = { .pmg_name = "pm_lsu_flush_unaligned", .pmg_desc = "LSU flush due to unaligned data", .pmg_event_ids = power5p_group_event_ids[25], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x80c000021010c802ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 26 ] = { .pmg_name = "pm_lsu_flush_uld", .pmg_desc = "LSU0/1 flush due to unaligned load", .pmg_event_ids = power5p_group_event_ids[26], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40c0000080888a02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 27 ] = { .pmg_name = "pm_lsu_flush_ust", .pmg_desc = "LSU0/1 flush due to unaligned store", .pmg_event_ids = power5p_group_event_ids[27], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40c00000828a028aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 28 ] = { .pmg_name = "pm_lsu_flush_full", .pmg_desc = "LSU flush due to LRQ/SRQ full", .pmg_event_ids = power5p_group_event_ids[28], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc0200009ce0210c0ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 29 ] = { .pmg_name = "pm_lsu_stall1", .pmg_desc = "LSU Stalls", .pmg_event_ids = power5p_group_event_ids[29], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4000000028300234ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 30 ] = { .pmg_name = "pm_lsu_stall2", .pmg_desc = "LSU Stalls", .pmg_event_ids = power5p_group_event_ids[30], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4000000002341e36ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 31 ] = { .pmg_name = "pm_fxu_stall", .pmg_desc = "FXU Stalls", .pmg_event_ids = power5p_group_event_ids[31], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40000008ca320232ULL, 
.pmg_mmcra = 0x0000000000000001ULL }, [ 32 ] = { .pmg_name = "pm_fpu_stall", .pmg_desc = "FPU Stalls", .pmg_event_ids = power5p_group_event_ids[32], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4000000020360230ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 33 ] = { .pmg_name = "pm_queue_full", .pmg_desc = "BRQ LRQ LMQ queue full", .pmg_event_ids = power5p_group_event_ids[33], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x400b0009ce8a84ceULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 34 ] = { .pmg_name = "pm_issueq_full", .pmg_desc = "FPU FX full", .pmg_event_ids = power5p_group_event_ids[34], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40000000868e8088ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 35 ] = { .pmg_name = "pm_mapper_full1", .pmg_desc = "CR CTR GPR mapper full", .pmg_event_ids = power5p_group_event_ids[35], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40000002888cca82ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 36 ] = { .pmg_name = "pm_mapper_full2", .pmg_desc = "FPR XER mapper full", .pmg_event_ids = power5p_group_event_ids[36], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4103000282843602ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 37 ] = { .pmg_name = "pm_misc_load", .pmg_desc = "Non-cachable loads and stcx events", .pmg_event_ids = power5p_group_event_ids[37], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0438000cc2ca828aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 38 ] = { .pmg_name = "pm_ic_demand", .pmg_desc = "ICache demand from BR redirect", .pmg_event_ids = power5p_group_event_ids[38], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x800c000fc2cac0c2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 39 ] = { .pmg_name = "pm_ic_pref", .pmg_desc = "ICache prefetch", .pmg_event_ids = power5p_group_event_ids[39], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8000000dcecc8e1aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 40 ] = { .pmg_name = "pm_ic_miss", .pmg_desc = "ICache misses", .pmg_event_ids = 
power5p_group_event_ids[40], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4003000e32cec802ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 41 ] = { .pmg_name = "Branch mispredict", .pmg_desc = " TLB and SLB misses", .pmg_event_ids = power5p_group_event_ids[41], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x808000031010caccULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 42 ] = { .pmg_name = "pm_branch1", .pmg_desc = "Branch operations", .pmg_event_ids = power5p_group_event_ids[42], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8000000f0e0e0e0eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 43 ] = { .pmg_name = "pm_branch2", .pmg_desc = "Branch operations", .pmg_event_ids = power5p_group_event_ids[43], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4000000c22cc8c02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 44 ] = { .pmg_name = "pm_L1_tlbmiss", .pmg_desc = "L1 load and TLB misses", .pmg_event_ids = power5p_group_event_ids[44], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00b000008e881020ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 45 ] = { .pmg_name = "pm_L1_DERAT_miss", .pmg_desc = "L1 store and DERAT misses", .pmg_event_ids = power5p_group_event_ids[45], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00b300080e202086ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 46 ] = { .pmg_name = "pm_L1_slbmiss", .pmg_desc = "L1 load and SLB misses", .pmg_event_ids = power5p_group_event_ids[46], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00b000008a82848cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 47 ] = { .pmg_name = "pm_dtlbref", .pmg_desc = "Data TLB references", .pmg_event_ids = power5p_group_event_ids[47], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000c000f0c0c0c0cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 48 ] = { .pmg_name = "pm_dtlbmiss", .pmg_desc = "Data TLB misses", .pmg_event_ids = power5p_group_event_ids[48], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000c000f1a1a1a1aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 49 ] = { 
.pmg_name = "pm_dtlb", .pmg_desc = "Data TLB references and misses", .pmg_event_ids = power5p_group_event_ids[49], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x008c0008c8881e1eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 50 ] = { .pmg_name = "pm_L1_refmiss", .pmg_desc = "L1 load references and misses and store references and misses", .pmg_event_ids = power5p_group_event_ids[50], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0030000050501086ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 51 ] = { .pmg_name = "pm_dsource1", .pmg_desc = "L3 cache and memory data access", .pmg_event_ids = power5p_group_event_ids[51], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4003000c1c0e8e02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 52 ] = { .pmg_name = "pm_dsource2", .pmg_desc = "L3 cache and memory data access", .pmg_event_ids = power5p_group_event_ids[52], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0003000f1c0e360eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 53 ] = { .pmg_name = "pm_dsource_L2", .pmg_desc = "L2 cache data access", .pmg_event_ids = power5p_group_event_ids[53], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0003000f2e2e2e2eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 54 ] = { .pmg_name = "pm_dsource_L3", .pmg_desc = "L3 cache data access", .pmg_event_ids = power5p_group_event_ids[54], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0003000f3c3c3c3cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 55 ] = { .pmg_name = "pm_isource1", .pmg_desc = "Instruction source information", .pmg_event_ids = power5p_group_event_ids[55], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8000000f1a1a1a0cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 56 ] = { .pmg_name = "pm_isource2", .pmg_desc = "Instruction source information", .pmg_event_ids = power5p_group_event_ids[56], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8000000d0c0c021aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 57 ] = { .pmg_name = "pm_isource_L2", .pmg_desc = "L2 instruction source 
information", .pmg_event_ids = power5p_group_event_ids[57], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8000000f2c2c2c2cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 58 ] = { .pmg_name = "pm_isource_L3", .pmg_desc = "L3 instruction source information", .pmg_event_ids = power5p_group_event_ids[58], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8000000f3a3a3a3aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 59 ] = { .pmg_name = "pm_pteg_source1", .pmg_desc = "PTEG source information", .pmg_event_ids = power5p_group_event_ids[59], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0002000f2e2e2e2eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 60 ] = { .pmg_name = "pm_pteg_source2", .pmg_desc = "PTEG source information", .pmg_event_ids = power5p_group_event_ids[60], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0002000f3c3c3c3cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 61 ] = { .pmg_name = "pm_pteg_source3", .pmg_desc = "PTEG source information", .pmg_event_ids = power5p_group_event_ids[61], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0002000f0e0e360eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 62 ] = { .pmg_name = "pm_pteg_source4", .pmg_desc = "L3 PTEG and group dispatch events", .pmg_event_ids = power5p_group_event_ids[62], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x003200081c04048eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 63 ] = { .pmg_name = "pm_L2SA_ld", .pmg_desc = "L2 slice A load events", .pmg_event_ids = power5p_group_event_ids[63], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055400580c080c0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 64 ] = { .pmg_name = "pm_L2SA_st", .pmg_desc = "L2 slice A store events", .pmg_event_ids = power5p_group_event_ids[64], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055800580c080c0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 65 ] = { .pmg_name = "pm_L2SA_st2", .pmg_desc = "L2 slice A store events", .pmg_event_ids = power5p_group_event_ids[65], .pmg_mmcr0 = 0x0000000000000000ULL, 
.pmg_mmcr1 = 0x3055c00580c080c0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 66 ] = { .pmg_name = "pm_L2SB_ld", .pmg_desc = "L2 slice B load events", .pmg_event_ids = power5p_group_event_ids[66], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055400582c282c2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 67 ] = { .pmg_name = "pm_L2SB_st", .pmg_desc = "L2 slice B store events", .pmg_event_ids = power5p_group_event_ids[67], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055800582c482c2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 68 ] = { .pmg_name = "pm_L2SB_st2", .pmg_desc = "L2 slice B store events", .pmg_event_ids = power5p_group_event_ids[68], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055c00582c282c2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 69 ] = { .pmg_name = "pm_L2SC_ld", .pmg_desc = "L2 slice C load events", .pmg_event_ids = power5p_group_event_ids[69], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055400584c484c4ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 70 ] = { .pmg_name = "pm_L2SC_st", .pmg_desc = "L2 slice C store events", .pmg_event_ids = power5p_group_event_ids[70], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055800584c284c4ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 71 ] = { .pmg_name = "pm_L2SC_st2", .pmg_desc = "L2 slice C store events", .pmg_event_ids = power5p_group_event_ids[71], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055c00584c484c4ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 72 ] = { .pmg_name = "pm_L3SA_trans", .pmg_desc = "L3 slice A state transitions", .pmg_event_ids = power5p_group_event_ids[72], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3015000ac602c686ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 73 ] = { .pmg_name = "pm_L3SB_trans", .pmg_desc = "L3 slice B state transitions", .pmg_event_ids = power5p_group_event_ids[73], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3015000602c8c888ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 74 ] = { .pmg_name = "pm_L3SC_trans", .pmg_desc = 
"L3 slice C state transitions", .pmg_event_ids = power5p_group_event_ids[74], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3015000602caca8aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 75 ] = { .pmg_name = "pm_L2SA_trans", .pmg_desc = "L2 slice A state transitions", .pmg_event_ids = power5p_group_event_ids[75], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055000ac080c080ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 76 ] = { .pmg_name = "pm_L2SB_trans", .pmg_desc = "L2 slice B state transitions", .pmg_event_ids = power5p_group_event_ids[76], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055000ac282c282ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 77 ] = { .pmg_name = "pm_L2SC_trans", .pmg_desc = "L2 slice C state transitions", .pmg_event_ids = power5p_group_event_ids[77], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055000ac484c484ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 78 ] = { .pmg_name = "pm_L3SAB_retry", .pmg_desc = "L3 slice A/B snoop retry and all CI/CO busy", .pmg_event_ids = power5p_group_event_ids[78], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3005100fc6c8c6c8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 79 ] = { .pmg_name = "pm_L3SAB_hit", .pmg_desc = "L3 slice A/B hit and reference", .pmg_event_ids = power5p_group_event_ids[79], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3050100086888688ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 80 ] = { .pmg_name = "pm_L3SC_retry_hit", .pmg_desc = "L3 slice C hit & snoop retry", .pmg_event_ids = power5p_group_event_ids[80], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055100aca8aca8aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 81 ] = { .pmg_name = "pm_fpu1", .pmg_desc = "Floating Point events", .pmg_event_ids = power5p_group_event_ids[81], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000010101020ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 82 ] = { .pmg_name = "pm_fpu2", .pmg_desc = "Floating Point events", .pmg_event_ids = power5p_group_event_ids[82], 
.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000020202010ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 83 ] = { .pmg_name = "pm_fpu3", .pmg_desc = "Floating point events", .pmg_event_ids = power5p_group_event_ids[83], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000c1010868eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 84 ] = { .pmg_name = "pm_fpu4", .pmg_desc = "Floating point events", .pmg_event_ids = power5p_group_event_ids[84], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0430000c20200220ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 85 ] = { .pmg_name = "pm_fpu5", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5p_group_event_ids[85], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000848c848cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 86 ] = { .pmg_name = "pm_fpu6", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5p_group_event_ids[86], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000cc0c88088ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 87 ] = { .pmg_name = "pm_fpu7", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5p_group_event_ids[87], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000008088828aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 88 ] = { .pmg_name = "pm_fpu8", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5p_group_event_ids[88], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000dc2ca02c0ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 89 ] = { .pmg_name = "pm_fpu9", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5p_group_event_ids[89], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0430000cc6ce8088ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 90 ] = { .pmg_name = "pm_fpu10", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5p_group_event_ids[90], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000828a028aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 91 ] = { .pmg_name = 
"pm_fpu11", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5p_group_event_ids[91], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000868e8602ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 92 ] = { .pmg_name = "pm_fpu12", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5p_group_event_ids[92], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0430000cc4cc8002ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 93 ] = { .pmg_name = "pm_fxu1", .pmg_desc = "Fixed Point events", .pmg_event_ids = power5p_group_event_ids[93], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000024242424ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 94 ] = { .pmg_name = "pm_fxu2", .pmg_desc = "Fixed Point events", .pmg_event_ids = power5p_group_event_ids[94], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4000000604221020ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 95 ] = { .pmg_name = "pm_fxu3", .pmg_desc = "Fixed Point events", .pmg_event_ids = power5p_group_event_ids[95], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x404000038688c4ccULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 96 ] = { .pmg_name = "pm_smt_priorities1", .pmg_desc = "Thread priority events", .pmg_event_ids = power5p_group_event_ids[96], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0005000fc6ccc6c8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 97 ] = { .pmg_name = "pm_smt_priorities2", .pmg_desc = "Thread priority events", .pmg_event_ids = power5p_group_event_ids[97], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0005000fc4cacaccULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 98 ] = { .pmg_name = "pm_smt_priorities3", .pmg_desc = "Thread priority events", .pmg_event_ids = power5p_group_event_ids[98], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0005000fc2c8c4c2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 99 ] = { .pmg_name = "pm_smt_priorities4", .pmg_desc = "Thread priority events", .pmg_event_ids = power5p_group_event_ids[99], .pmg_mmcr0 = 
0x0000000000000000ULL, .pmg_mmcr1 = 0x0005000ac016c002ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 100 ] = { .pmg_name = "pm_smt_both", .pmg_desc = "Thread common events", .pmg_event_ids = power5p_group_event_ids[100], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0010000016260208ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 101 ] = { .pmg_name = "pm_smt_selection", .pmg_desc = "Thread selection", .pmg_event_ids = power5p_group_event_ids[101], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0090000086028082ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 102 ] = { .pmg_name = "pm_smt_selectover1", .pmg_desc = "Thread selection override", .pmg_event_ids = power5p_group_event_ids[102], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0050000002808488ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 103 ] = { .pmg_name = "pm_smt_selectover2", .pmg_desc = "Thread selection override", .pmg_event_ids = power5p_group_event_ids[103], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00100000021e8a86ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 104 ] = { .pmg_name = "pm_fabric1", .pmg_desc = "Fabric events", .pmg_event_ids = power5p_group_event_ids[104], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x305500058ece8eceULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 105 ] = { .pmg_name = "pm_fabric2", .pmg_desc = "Fabric data movement", .pmg_event_ids = power5p_group_event_ids[105], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x305500858ece8eceULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 106 ] = { .pmg_name = "pm_fabric3", .pmg_desc = "Fabric data movement", .pmg_event_ids = power5p_group_event_ids[106], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x305501858ece8eceULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 107 ] = { .pmg_name = "pm_fabric4", .pmg_desc = "Fabric data movement", .pmg_event_ids = power5p_group_event_ids[107], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x705401068ecec68eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 108 ] = { .pmg_name = 
"pm_snoop1", .pmg_desc = "Snoop retry", .pmg_event_ids = power5p_group_event_ids[108], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x305500058ccc8cccULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 109 ] = { .pmg_name = "pm_snoop2", .pmg_desc = "Snoop read retry", .pmg_event_ids = power5p_group_event_ids[109], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x30540a048ccc8c02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 110 ] = { .pmg_name = "pm_snoop3", .pmg_desc = "Snoop write retry", .pmg_event_ids = power5p_group_event_ids[110], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x30550c058ccc8cccULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 111 ] = { .pmg_name = "pm_snoop4", .pmg_desc = "Snoop partial write retry", .pmg_event_ids = power5p_group_event_ids[111], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x30540e048ccc8cacULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 112 ] = { .pmg_name = "pm_mem_rq", .pmg_desc = "Memory read queue dispatch", .pmg_event_ids = power5p_group_event_ids[112], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x705402058ccc8cceULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 113 ] = { .pmg_name = "pm_mem_read", .pmg_desc = "Memory read complete and cancel", .pmg_event_ids = power5p_group_event_ids[113], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x305404048ccc8c06ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 114 ] = { .pmg_name = "pm_mem_wq", .pmg_desc = "Memory write queue dispatch", .pmg_event_ids = power5p_group_event_ids[114], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x305506058ccc8cccULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 115 ] = { .pmg_name = "pm_mem_pwq", .pmg_desc = "Memory partial write queue", .pmg_event_ids = power5p_group_event_ids[115], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x305508058ccc8cccULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 116 ] = { .pmg_name = "pm_threshold", .pmg_desc = "Thresholding", .pmg_event_ids = power5p_group_event_ids[116], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 
= 0x0008000404c41628ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 117 ] = { .pmg_name = "pm_mrk_grp1", .pmg_desc = "Marked group events", .pmg_event_ids = power5p_group_event_ids[117], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0008000404c60a26ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 118 ] = { .pmg_name = "pm_mrk_grp2", .pmg_desc = "Marked group events", .pmg_event_ids = power5p_group_event_ids[118], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x410300032a0ac822ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 119 ] = { .pmg_name = "pm_mrk_dsource1", .pmg_desc = "Marked data from", .pmg_event_ids = power5p_group_event_ids[119], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b000f0e404444ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 120 ] = { .pmg_name = "pm_mrk_dsource2", .pmg_desc = "Marked data from", .pmg_event_ids = power5p_group_event_ids[120], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b000c2e440210ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 121 ] = { .pmg_name = "pm_mrk_dsource3", .pmg_desc = "Marked data from", .pmg_event_ids = power5p_group_event_ids[121], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b000f1c484c4cULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 122 ] = { .pmg_name = "pm_mrk_dsource4", .pmg_desc = "Marked data from", .pmg_event_ids = power5p_group_event_ids[122], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b000f42462e42ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 123 ] = { .pmg_name = "pm_mrk_dsource5", .pmg_desc = "Marked data from", .pmg_event_ids = power5p_group_event_ids[123], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b000f3c4c4040ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 124 ] = { .pmg_name = "pm_mrk_dsource6", .pmg_desc = "Marked data from", .pmg_event_ids = power5p_group_event_ids[124], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b000d46460246ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 125 ] = { .pmg_name = "pm_mrk_dsource7", .pmg_desc = "Marked data from", 
.pmg_event_ids = power5p_group_event_ids[125], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b000f4e4e3c4eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 126 ] = { .pmg_name = "pm_mrk_dtlbref", .pmg_desc = "Marked data TLB references", .pmg_event_ids = power5p_group_event_ids[126], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x020c000f0c0c0c0cULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 127 ] = { .pmg_name = "pm_mrk_dtlbmiss", .pmg_desc = "Marked data TLB misses", .pmg_event_ids = power5p_group_event_ids[127], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x020c000f1a1a1a1aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 128 ] = { .pmg_name = "pm_mrk_dtlb_dslb", .pmg_desc = "Marked data TLB references and misses and marked data SLB misses", .pmg_event_ids = power5p_group_event_ids[128], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x063c0008c8ac8e1eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 129 ] = { .pmg_name = "pm_mrk_lbref", .pmg_desc = "Marked TLB and SLB references", .pmg_event_ids = power5p_group_event_ids[129], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x063c000a0c020c8eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 130 ] = { .pmg_name = "pm_mrk_lsmiss", .pmg_desc = "Marked load and store miss", .pmg_event_ids = power5p_group_event_ids[130], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000800081002060aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 131 ] = { .pmg_name = "pm_mrk_ulsflush", .pmg_desc = "Mark unaligned load and store flushes", .pmg_event_ids = power5p_group_event_ids[131], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0028000406c62020ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 132 ] = { .pmg_name = "pm_mrk_misc", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5p_group_event_ids[132], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00080008cc062816ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 133 ] = { .pmg_name = "pm_lsref_L1", .pmg_desc = "Load/Store operations and L1 activity", .pmg_event_ids = 
power5p_group_event_ids[133], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8033000c0e1a2020ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 134 ] = { .pmg_name = "Load/Store operations and L2", .pmg_desc = " L3 activity", .pmg_event_ids = power5p_group_event_ids[134], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0033000c1c0e2020ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 135 ] = { .pmg_name = "pm_lsref_tlbmiss", .pmg_desc = "Load/Store operations and TLB misses", .pmg_event_ids = power5p_group_event_ids[135], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00b0000080882020ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 136 ] = { .pmg_name = "pm_Dmiss", .pmg_desc = "Data cache misses", .pmg_event_ids = power5p_group_event_ids[136], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0033000c1c0e1086ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 137 ] = { .pmg_name = "pm_prefetchX", .pmg_desc = "Prefetch events", .pmg_event_ids = power5p_group_event_ids[137], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x853300061eccce86ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 138 ] = { .pmg_name = "pm_branchX", .pmg_desc = "Branch operations", .pmg_event_ids = power5p_group_event_ids[138], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8000000f0e0e0ec8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 139 ] = { .pmg_name = "pm_fpuX1", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5p_group_event_ids[139], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000dc2ca86c0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 140 ] = { .pmg_name = "pm_fpuX2", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5p_group_event_ids[140], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000828a828aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 141 ] = { .pmg_name = "pm_fpuX3", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5p_group_event_ids[141], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000868e868eULL, 
.pmg_mmcra = 0x0000000000000000ULL }, [ 142 ] = { .pmg_name = "pm_fpuX4", .pmg_desc = "Floating point and L1 events", .pmg_event_ids = power5p_group_event_ids[142], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0030000020102020ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 143 ] = { .pmg_name = "pm_fpuX5", .pmg_desc = "Floating point events", .pmg_event_ids = power5p_group_event_ids[143], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000c2020868eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 144 ] = { .pmg_name = "pm_fpuX6", .pmg_desc = "Floating point events", .pmg_event_ids = power5p_group_event_ids[144], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000010202010ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 145 ] = { .pmg_name = "pm_fpuX7", .pmg_desc = "Floating point events", .pmg_event_ids = power5p_group_event_ids[145], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000220105010ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 146 ] = { .pmg_name = "pm_hpmcount8", .pmg_desc = "HPM group for set 9", .pmg_event_ids = power5p_group_event_ids[146], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000001e281e10ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 147 ] = { .pmg_name = "pm_hpmcount2", .pmg_desc = "HPM group for set 2", .pmg_event_ids = power5p_group_event_ids[147], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0430000412201220ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 148 ] = { .pmg_name = "pm_hpmcount3", .pmg_desc = "HPM group for set 3", .pmg_event_ids = power5p_group_event_ids[148], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x403000041ec21086ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 149 ] = { .pmg_name = "pm_hpmcount4", .pmg_desc = "HPM group for set 7", .pmg_event_ids = power5p_group_event_ids[149], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00b00000101e2020ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 150 ] = { .pmg_name = "pm_flop", .pmg_desc = "Floating point operations", .pmg_event_ids = 
power5p_group_event_ids[150], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000010105050ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 151 ] = { .pmg_name = "pm_eprof1", .pmg_desc = "Group for use with eprof", .pmg_event_ids = power5p_group_event_ids[151], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00300000121e108eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 152 ] = { .pmg_name = "pm_eprof2", .pmg_desc = "Group for use with eprof", .pmg_event_ids = power5p_group_event_ids[152], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0030000012501220ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 153 ] = { .pmg_name = "pm_flip", .pmg_desc = "Group for flips", .pmg_event_ids = power5p_group_event_ids[153], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000021e105010ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 154 ] = { .pmg_name = "pm_hpmcount5", .pmg_desc = "HPM group for set 5", .pmg_event_ids = power5p_group_event_ids[154], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00b000001e881020ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 155 ] = { .pmg_name = "pm_hpmcount6", .pmg_desc = "HPM group for set 6", .pmg_event_ids = power5p_group_event_ids[155], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x003000001e122086ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 156 ] = { .pmg_name = "pm_hpmcount7", .pmg_desc = "HPM group for set 8", .pmg_event_ids = power5p_group_event_ids[156], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00030005120e1e0eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 157 ] = { .pmg_name = "pm_ep_threshold", .pmg_desc = "Thresholding", .pmg_event_ids = power5p_group_event_ids[157], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000004121628ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 158 ] = { .pmg_name = "pm_ep_mrk_grp1", .pmg_desc = "Marked group events", .pmg_event_ids = power5p_group_event_ids[158], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000004120a26ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 159 ] = 
{ .pmg_name = "pm_ep_mrk_grp2", .pmg_desc = "Marked group events", .pmg_event_ids = power5p_group_event_ids[159], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x410300032a12c822ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 160 ] = { .pmg_name = "pm_ep_mrk_dsource1", .pmg_desc = "Marked data from", .pmg_event_ids = power5p_group_event_ids[160], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b000b0e124444ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 161 ] = { .pmg_name = "pm_ep_mrk_dsource2", .pmg_desc = "Marked data from", .pmg_event_ids = power5p_group_event_ids[161], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b00082e12e410ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 162 ] = { .pmg_name = "pm_ep_mrk_dsource3", .pmg_desc = "Marked data from", .pmg_event_ids = power5p_group_event_ids[162], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b000712484c4cULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 163 ] = { .pmg_name = "pm_ep_mrk_dsource4", .pmg_desc = "Marked data from", .pmg_event_ids = power5p_group_event_ids[163], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b000712462e42ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 164 ] = { .pmg_name = "pm_ep_mrk_dsource5", .pmg_desc = "Marked data from", .pmg_event_ids = power5p_group_event_ids[164], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b000b3c124040ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 165 ] = { .pmg_name = "pm_ep_mrk_dsource6", .pmg_desc = "Marked data from", .pmg_event_ids = power5p_group_event_ids[165], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b000512460246ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 166 ] = { .pmg_name = "pm_ep_mrk_dsource7", .pmg_desc = "Marked data from", .pmg_event_ids = power5p_group_event_ids[166], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b0007124e3c4eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 167 ] = { .pmg_name = "pm_ep_mrk_lbmiss", .pmg_desc = "Marked TLB and SLB misses", .pmg_event_ids = 
power5p_group_event_ids[167], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x020c0007121a1a1aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 168 ] = { .pmg_name = "pm_ep_mrk_dtlbref", .pmg_desc = "Marked data TLB references", .pmg_event_ids = power5p_group_event_ids[168], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x020c0007120c0c0cULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 169 ] = { .pmg_name = "pm_ep_mrk_dtlbmiss", .pmg_desc = "Marked data TLB misses", .pmg_event_ids = power5p_group_event_ids[169], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x020c0007121a1a1aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 170 ] = { .pmg_name = "pm_ep_mrk_lbref", .pmg_desc = "Marked TLB and SLB references", .pmg_event_ids = power5p_group_event_ids[170], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x063c000a0c120c8eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 171 ] = { .pmg_name = "pm_ep_mrk_lsmiss", .pmg_desc = "Marked load and store miss", .pmg_event_ids = power5p_group_event_ids[171], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000800081012060aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 172 ] = { .pmg_name = "pm_ep_mrk_ulsflush", .pmg_desc = "Mark unaligned load and store flushes", .pmg_event_ids = power5p_group_event_ids[172], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0020000006122020ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 173 ] = { .pmg_name = "pm_ep_mrk_misc1", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5p_group_event_ids[173], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000012062816ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 174 ] = { .pmg_name = "pm_ep_mrk_misc2", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5p_group_event_ids[174], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b000612445ee4ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 175 ] = { .pmg_name = "pm_ep_mrk_misc3", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5p_group_event_ids[175], .pmg_mmcr0 = 
0x0000000000000000ULL, .pmg_mmcr1 = 0x053b0005124c8c0eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 176 ] = { .pmg_name = "pm_ep_mrk_misc4", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5p_group_event_ids[176], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x030f00091a12e82eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 177 ] = { .pmg_name = "pm_ep_mrk_misc5", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5p_group_event_ids[177], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x022c00080c120286ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 178 ] = { .pmg_name = "pm_ep_mrk_misc6", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5p_group_event_ids[178], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x022c00081a12888aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 179 ] = { .pmg_name = "pm_ep_mrk_misc7", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5p_group_event_ids[179], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x012b000412408280ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 180 ] = { .pmg_name = "pm_ep_mrk_misc8", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5p_group_event_ids[180], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00200000120a8486ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 181 ] = { .pmg_name = "pm_ep_mrk_misc9", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5p_group_event_ids[181], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0028000012ac8eecULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 182 ] = { .pmg_name = "pm_ep_mrk_misc10", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5p_group_event_ids[182], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0008000412c0e8e6ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 183 ] = { .pmg_name = "pm_ep_mrk_misc11", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5p_group_event_ids[183], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x01030003120a443cULL, .pmg_mmcra = 
0x0000000000000001ULL }, [ 184 ] = { .pmg_name = "pm_ep_mrk_misc12", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5p_group_event_ids[184], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0020000012501010ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 185 ] = { .pmg_name = "pm_ep_mrk_misc13", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5p_group_event_ids[185], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0103000b0e1236ccULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 186 ] = { .pmg_name = "pm_ep_mrk_misc14", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5p_group_event_ids[186], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000012282828ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 187 ] = { .pmg_name = "pm_ep_mrk_misc15", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5p_group_event_ids[187], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4000000412220a26ULL, .pmg_mmcra = 0x0000000000000001ULL }
};

#endif

papi-papi-7-2-0-t/src/libperfnec/lib/power5_events.h

/****************************/
/* THIS IS OPEN SOURCE CODE */
/****************************/

#ifndef __POWER5_EVENTS_H__
#define __POWER5_EVENTS_H__

/*
 * File:    power5_events.h
 * CVS:
 * Author:  Corey Ashford
 *          cjashfor@us.ibm.com
 * Mods:
 *
 * (C) Copyright IBM Corporation, 2007.  All Rights Reserved.
 * Contributed by Corey Ashford
 *
 * Note: This code was automatically generated and should not be modified by
 * hand.
* */ #define POWER5_PME_PM_LSU_REJECT_RELOAD_CDF 0 #define POWER5_PME_PM_FPU1_SINGLE 1 #define POWER5_PME_PM_L3SB_REF 2 #define POWER5_PME_PM_THRD_PRIO_DIFF_3or4_CYC 3 #define POWER5_PME_PM_INST_FROM_L275_SHR 4 #define POWER5_PME_PM_MRK_DATA_FROM_L375_MOD 5 #define POWER5_PME_PM_DTLB_MISS_4K 6 #define POWER5_PME_PM_CLB_FULL_CYC 7 #define POWER5_PME_PM_MRK_ST_CMPL 8 #define POWER5_PME_PM_LSU_FLUSH_LRQ_FULL 9 #define POWER5_PME_PM_MRK_DATA_FROM_L275_SHR 10 #define POWER5_PME_PM_1INST_CLB_CYC 11 #define POWER5_PME_PM_MEM_SPEC_RD_CANCEL 12 #define POWER5_PME_PM_MRK_DTLB_MISS_16M 13 #define POWER5_PME_PM_FPU_FDIV 14 #define POWER5_PME_PM_FPU_SINGLE 15 #define POWER5_PME_PM_FPU0_FMA 16 #define POWER5_PME_PM_SLB_MISS 17 #define POWER5_PME_PM_LSU1_FLUSH_LRQ 18 #define POWER5_PME_PM_L2SA_ST_HIT 19 #define POWER5_PME_PM_DTLB_MISS 20 #define POWER5_PME_PM_BR_PRED_TA 21 #define POWER5_PME_PM_MRK_DATA_FROM_L375_MOD_CYC 22 #define POWER5_PME_PM_CMPLU_STALL_FXU 23 #define POWER5_PME_PM_EXT_INT 24 #define POWER5_PME_PM_MRK_LSU1_FLUSH_LRQ 25 #define POWER5_PME_PM_LSU1_LDF 26 #define POWER5_PME_PM_MRK_ST_GPS 27 #define POWER5_PME_PM_FAB_CMD_ISSUED 28 #define POWER5_PME_PM_LSU0_SRQ_STFWD 29 #define POWER5_PME_PM_CR_MAP_FULL_CYC 30 #define POWER5_PME_PM_L2SA_RCST_DISP_FAIL_RC_FULL 31 #define POWER5_PME_PM_MRK_LSU0_FLUSH_ULD 32 #define POWER5_PME_PM_LSU_FLUSH_SRQ_FULL 33 #define POWER5_PME_PM_FLUSH_IMBAL 34 #define POWER5_PME_PM_MEM_RQ_DISP_Q16to19 35 #define POWER5_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC 36 #define POWER5_PME_PM_DATA_FROM_L35_MOD 37 #define POWER5_PME_PM_MEM_HI_PRIO_WR_CMPL 38 #define POWER5_PME_PM_FPU1_FDIV 39 #define POWER5_PME_PM_FPU0_FRSP_FCONV 40 #define POWER5_PME_PM_MEM_RQ_DISP 41 #define POWER5_PME_PM_LWSYNC_HELD 42 #define POWER5_PME_PM_FXU_FIN 43 #define POWER5_PME_PM_DSLB_MISS 44 #define POWER5_PME_PM_FXLS1_FULL_CYC 45 #define POWER5_PME_PM_DATA_FROM_L275_SHR 46 #define POWER5_PME_PM_THRD_SEL_T0 47 #define POWER5_PME_PM_PTEG_RELOAD_VALID 48 #define 
POWER5_PME_PM_LSU_LMQ_LHR_MERGE 49 #define POWER5_PME_PM_MRK_STCX_FAIL 50 #define POWER5_PME_PM_2INST_CLB_CYC 51 #define POWER5_PME_PM_FAB_PNtoVN_DIRECT 52 #define POWER5_PME_PM_PTEG_FROM_L2MISS 53 #define POWER5_PME_PM_CMPLU_STALL_LSU 54 #define POWER5_PME_PM_MRK_DSLB_MISS 55 #define POWER5_PME_PM_LSU_FLUSH_ULD 56 #define POWER5_PME_PM_PTEG_FROM_LMEM 57 #define POWER5_PME_PM_MRK_BRU_FIN 58 #define POWER5_PME_PM_MEM_WQ_DISP_WRITE 59 #define POWER5_PME_PM_MRK_DATA_FROM_L275_MOD_CYC 60 #define POWER5_PME_PM_LSU1_NCLD 61 #define POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_OTHER 62 #define POWER5_PME_PM_SNOOP_PW_RETRY_WQ_PWQ 63 #define POWER5_PME_PM_FPR_MAP_FULL_CYC 64 #define POWER5_PME_PM_FPU1_FULL_CYC 65 #define POWER5_PME_PM_L3SA_ALL_BUSY 66 #define POWER5_PME_PM_3INST_CLB_CYC 67 #define POWER5_PME_PM_MEM_PWQ_DISP_Q2or3 68 #define POWER5_PME_PM_L2SA_SHR_INV 69 #define POWER5_PME_PM_THRESH_TIMEO 70 #define POWER5_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL 71 #define POWER5_PME_PM_THRD_SEL_OVER_GCT_IMBAL 72 #define POWER5_PME_PM_FPU_FSQRT 73 #define POWER5_PME_PM_MRK_LSU0_FLUSH_LRQ 74 #define POWER5_PME_PM_PMC1_OVERFLOW 75 #define POWER5_PME_PM_L3SC_SNOOP_RETRY 76 #define POWER5_PME_PM_DATA_TABLEWALK_CYC 77 #define POWER5_PME_PM_THRD_PRIO_6_CYC 78 #define POWER5_PME_PM_FPU_FEST 79 #define POWER5_PME_PM_FAB_M1toP1_SIDECAR_EMPTY 80 #define POWER5_PME_PM_MRK_DATA_FROM_RMEM 81 #define POWER5_PME_PM_MRK_DATA_FROM_L35_MOD_CYC 82 #define POWER5_PME_PM_MEM_PWQ_DISP 83 #define POWER5_PME_PM_FAB_P1toM1_SIDECAR_EMPTY 84 #define POWER5_PME_PM_LD_MISS_L1_LSU0 85 #define POWER5_PME_PM_SNOOP_PARTIAL_RTRY_QFULL 86 #define POWER5_PME_PM_FPU1_STALL3 87 #define POWER5_PME_PM_GCT_USAGE_80to99_CYC 88 #define POWER5_PME_PM_WORK_HELD 89 #define POWER5_PME_PM_INST_CMPL 90 #define POWER5_PME_PM_LSU1_FLUSH_UST 91 #define POWER5_PME_PM_FXU_IDLE 92 #define POWER5_PME_PM_LSU0_FLUSH_ULD 93 #define POWER5_PME_PM_LSU1_REJECT_LMQ_FULL 94 #define POWER5_PME_PM_GRP_DISP_REJECT 95 #define POWER5_PME_PM_L2SA_MOD_INV 96 
#define POWER5_PME_PM_PTEG_FROM_L25_SHR 97 #define POWER5_PME_PM_FAB_CMD_RETRIED 98 #define POWER5_PME_PM_L3SA_SHR_INV 99 #define POWER5_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL 100 #define POWER5_PME_PM_L2SA_RCST_DISP_FAIL_ADDR 101 #define POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_RC_FULL 102 #define POWER5_PME_PM_PTEG_FROM_L375_MOD 103 #define POWER5_PME_PM_MRK_LSU1_FLUSH_UST 104 #define POWER5_PME_PM_BR_ISSUED 105 #define POWER5_PME_PM_MRK_GRP_BR_REDIR 106 #define POWER5_PME_PM_EE_OFF 107 #define POWER5_PME_PM_MEM_RQ_DISP_Q4to7 108 #define POWER5_PME_PM_MEM_FAST_PATH_RD_DISP 109 #define POWER5_PME_PM_INST_FROM_L3 110 #define POWER5_PME_PM_ITLB_MISS 111 #define POWER5_PME_PM_FXU1_BUSY_FXU0_IDLE 112 #define POWER5_PME_PM_FXLS_FULL_CYC 113 #define POWER5_PME_PM_DTLB_REF_4K 114 #define POWER5_PME_PM_GRP_DISP_VALID 115 #define POWER5_PME_PM_LSU_FLUSH_UST 116 #define POWER5_PME_PM_FXU1_FIN 117 #define POWER5_PME_PM_THRD_PRIO_4_CYC 118 #define POWER5_PME_PM_MRK_DATA_FROM_L35_MOD 119 #define POWER5_PME_PM_4INST_CLB_CYC 120 #define POWER5_PME_PM_MRK_DTLB_REF_16M 121 #define POWER5_PME_PM_INST_FROM_L375_MOD 122 #define POWER5_PME_PM_L2SC_RCST_DISP_FAIL_ADDR 123 #define POWER5_PME_PM_GRP_CMPL 124 #define POWER5_PME_PM_FPU1_1FLOP 125 #define POWER5_PME_PM_FPU_FRSP_FCONV 126 #define POWER5_PME_PM_5INST_CLB_CYC 127 #define POWER5_PME_PM_L3SC_REF 128 #define POWER5_PME_PM_THRD_L2MISS_BOTH_CYC 129 #define POWER5_PME_PM_MEM_PW_GATH 130 #define POWER5_PME_PM_FAB_PNtoNN_SIDECAR 131 #define POWER5_PME_PM_FAB_DCLAIM_ISSUED 132 #define POWER5_PME_PM_GRP_IC_MISS 133 #define POWER5_PME_PM_INST_FROM_L35_SHR 134 #define POWER5_PME_PM_LSU_LMQ_FULL_CYC 135 #define POWER5_PME_PM_MRK_DATA_FROM_L2_CYC 136 #define POWER5_PME_PM_LSU_SRQ_SYNC_CYC 137 #define POWER5_PME_PM_LSU0_BUSY_REJECT 138 #define POWER5_PME_PM_LSU_REJECT_ERAT_MISS 139 #define POWER5_PME_PM_MRK_DATA_FROM_RMEM_CYC 140 #define POWER5_PME_PM_DATA_FROM_L375_SHR 141 #define POWER5_PME_PM_FPU0_FMOV_FEST 142 #define 
POWER5_PME_PM_PTEG_FROM_L25_MOD 143 #define POWER5_PME_PM_LD_REF_L1_LSU0 144 #define POWER5_PME_PM_THRD_PRIO_7_CYC 145 #define POWER5_PME_PM_LSU1_FLUSH_SRQ 146 #define POWER5_PME_PM_L2SC_RCST_DISP 147 #define POWER5_PME_PM_CMPLU_STALL_DIV 148 #define POWER5_PME_PM_MEM_RQ_DISP_Q12to15 149 #define POWER5_PME_PM_INST_FROM_L375_SHR 150 #define POWER5_PME_PM_ST_REF_L1 151 #define POWER5_PME_PM_L3SB_ALL_BUSY 152 #define POWER5_PME_PM_FAB_P1toVNorNN_SIDECAR_EMPTY 153 #define POWER5_PME_PM_MRK_DATA_FROM_L275_SHR_CYC 154 #define POWER5_PME_PM_FAB_HOLDtoNN_EMPTY 155 #define POWER5_PME_PM_DATA_FROM_LMEM 156 #define POWER5_PME_PM_RUN_CYC 157 #define POWER5_PME_PM_PTEG_FROM_RMEM 158 #define POWER5_PME_PM_L2SC_RCLD_DISP 159 #define POWER5_PME_PM_LSU0_LDF 160 #define POWER5_PME_PM_LSU_LRQ_S0_VALID 161 #define POWER5_PME_PM_PMC3_OVERFLOW 162 #define POWER5_PME_PM_MRK_IMR_RELOAD 163 #define POWER5_PME_PM_MRK_GRP_TIMEO 164 #define POWER5_PME_PM_ST_MISS_L1 165 #define POWER5_PME_PM_STOP_COMPLETION 166 #define POWER5_PME_PM_LSU_BUSY_REJECT 167 #define POWER5_PME_PM_ISLB_MISS 168 #define POWER5_PME_PM_CYC 169 #define POWER5_PME_PM_THRD_ONE_RUN_CYC 170 #define POWER5_PME_PM_GRP_BR_REDIR_NONSPEC 171 #define POWER5_PME_PM_LSU1_SRQ_STFWD 172 #define POWER5_PME_PM_L3SC_MOD_INV 173 #define POWER5_PME_PM_L2_PREF 174 #define POWER5_PME_PM_GCT_NOSLOT_BR_MPRED 175 #define POWER5_PME_PM_MRK_DATA_FROM_L25_MOD 176 #define POWER5_PME_PM_L2SB_MOD_INV 177 #define POWER5_PME_PM_L2SB_ST_REQ 178 #define POWER5_PME_PM_MRK_L1_RELOAD_VALID 179 #define POWER5_PME_PM_L3SB_HIT 180 #define POWER5_PME_PM_L2SB_SHR_MOD 181 #define POWER5_PME_PM_EE_OFF_EXT_INT 182 #define POWER5_PME_PM_1PLUS_PPC_CMPL 183 #define POWER5_PME_PM_L2SC_SHR_MOD 184 #define POWER5_PME_PM_PMC6_OVERFLOW 185 #define POWER5_PME_PM_LSU_LRQ_FULL_CYC 186 #define POWER5_PME_PM_IC_PREF_INSTALL 187 #define POWER5_PME_PM_TLB_MISS 188 #define POWER5_PME_PM_GCT_FULL_CYC 189 #define POWER5_PME_PM_FXU_BUSY 190 #define POWER5_PME_PM_MRK_DATA_FROM_L3_CYC 
191 #define POWER5_PME_PM_LSU_REJECT_LMQ_FULL 192 #define POWER5_PME_PM_LSU_SRQ_S0_ALLOC 193 #define POWER5_PME_PM_GRP_MRK 194 #define POWER5_PME_PM_INST_FROM_L25_SHR 195 #define POWER5_PME_PM_FPU1_FIN 196 #define POWER5_PME_PM_DC_PREF_STREAM_ALLOC 197 #define POWER5_PME_PM_BR_MPRED_TA 198 #define POWER5_PME_PM_CRQ_FULL_CYC 199 #define POWER5_PME_PM_L2SA_RCLD_DISP 200 #define POWER5_PME_PM_SNOOP_WR_RETRY_QFULL 201 #define POWER5_PME_PM_MRK_DTLB_REF_4K 202 #define POWER5_PME_PM_LSU_SRQ_S0_VALID 203 #define POWER5_PME_PM_LSU0_FLUSH_LRQ 204 #define POWER5_PME_PM_INST_FROM_L275_MOD 205 #define POWER5_PME_PM_GCT_EMPTY_CYC 206 #define POWER5_PME_PM_LARX_LSU0 207 #define POWER5_PME_PM_THRD_PRIO_DIFF_5or6_CYC 208 #define POWER5_PME_PM_SNOOP_RETRY_1AHEAD 209 #define POWER5_PME_PM_FPU1_FSQRT 210 #define POWER5_PME_PM_MRK_LD_MISS_L1_LSU1 211 #define POWER5_PME_PM_MRK_FPU_FIN 212 #define POWER5_PME_PM_THRD_PRIO_5_CYC 213 #define POWER5_PME_PM_MRK_DATA_FROM_LMEM 214 #define POWER5_PME_PM_FPU1_FRSP_FCONV 215 #define POWER5_PME_PM_SNOOP_TLBIE 216 #define POWER5_PME_PM_L3SB_SNOOP_RETRY 217 #define POWER5_PME_PM_FAB_VBYPASS_EMPTY 218 #define POWER5_PME_PM_MRK_DATA_FROM_L275_MOD 219 #define POWER5_PME_PM_6INST_CLB_CYC 220 #define POWER5_PME_PM_L2SB_RCST_DISP 221 #define POWER5_PME_PM_FLUSH 222 #define POWER5_PME_PM_L2SC_MOD_INV 223 #define POWER5_PME_PM_FPU_DENORM 224 #define POWER5_PME_PM_L3SC_HIT 225 #define POWER5_PME_PM_SNOOP_WR_RETRY_RQ 226 #define POWER5_PME_PM_LSU1_REJECT_SRQ 227 #define POWER5_PME_PM_IC_PREF_REQ 228 #define POWER5_PME_PM_L3SC_ALL_BUSY 229 #define POWER5_PME_PM_MRK_GRP_IC_MISS 230 #define POWER5_PME_PM_GCT_NOSLOT_IC_MISS 231 #define POWER5_PME_PM_MRK_DATA_FROM_L3 232 #define POWER5_PME_PM_GCT_NOSLOT_SRQ_FULL 233 #define POWER5_PME_PM_THRD_SEL_OVER_ISU_HOLD 234 #define POWER5_PME_PM_CMPLU_STALL_DCACHE_MISS 235 #define POWER5_PME_PM_L3SA_MOD_INV 236 #define POWER5_PME_PM_LSU_FLUSH_LRQ 237 #define POWER5_PME_PM_THRD_PRIO_2_CYC 238 #define 
POWER5_PME_PM_LSU_FLUSH_SRQ 239 #define POWER5_PME_PM_MRK_LSU_SRQ_INST_VALID 240 #define POWER5_PME_PM_L3SA_REF 241 #define POWER5_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL 242 #define POWER5_PME_PM_FPU0_STALL3 243 #define POWER5_PME_PM_GPR_MAP_FULL_CYC 244 #define POWER5_PME_PM_TB_BIT_TRANS 245 #define POWER5_PME_PM_MRK_LSU_FLUSH_LRQ 246 #define POWER5_PME_PM_FPU0_STF 247 #define POWER5_PME_PM_MRK_DTLB_MISS 248 #define POWER5_PME_PM_FPU1_FMA 249 #define POWER5_PME_PM_L2SA_MOD_TAG 250 #define POWER5_PME_PM_LSU1_FLUSH_ULD 251 #define POWER5_PME_PM_MRK_LSU0_FLUSH_UST 252 #define POWER5_PME_PM_MRK_INST_FIN 253 #define POWER5_PME_PM_FPU0_FULL_CYC 254 #define POWER5_PME_PM_LSU_LRQ_S0_ALLOC 255 #define POWER5_PME_PM_MRK_LSU1_FLUSH_ULD 256 #define POWER5_PME_PM_MRK_DTLB_REF 257 #define POWER5_PME_PM_BR_UNCOND 258 #define POWER5_PME_PM_THRD_SEL_OVER_L2MISS 259 #define POWER5_PME_PM_L2SB_SHR_INV 260 #define POWER5_PME_PM_MEM_LO_PRIO_WR_CMPL 261 #define POWER5_PME_PM_L3SC_MOD_TAG 262 #define POWER5_PME_PM_MRK_ST_MISS_L1 263 #define POWER5_PME_PM_GRP_DISP_SUCCESS 264 #define POWER5_PME_PM_THRD_PRIO_DIFF_1or2_CYC 265 #define POWER5_PME_PM_IC_DEMAND_L2_BHT_REDIRECT 266 #define POWER5_PME_PM_MEM_WQ_DISP_Q8to15 267 #define POWER5_PME_PM_FPU0_SINGLE 268 #define POWER5_PME_PM_LSU_DERAT_MISS 269 #define POWER5_PME_PM_THRD_PRIO_1_CYC 270 #define POWER5_PME_PM_L2SC_RCST_DISP_FAIL_OTHER 271 #define POWER5_PME_PM_FPU1_FEST 272 #define POWER5_PME_PM_FAB_HOLDtoVN_EMPTY 273 #define POWER5_PME_PM_SNOOP_RD_RETRY_RQ 274 #define POWER5_PME_PM_SNOOP_DCLAIM_RETRY_QFULL 275 #define POWER5_PME_PM_MRK_DATA_FROM_L25_SHR_CYC 276 #define POWER5_PME_PM_MRK_ST_CMPL_INT 277 #define POWER5_PME_PM_FLUSH_BR_MPRED 278 #define POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_ADDR 279 #define POWER5_PME_PM_FPU_STF 280 #define POWER5_PME_PM_CMPLU_STALL_FPU 281 #define POWER5_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC 282 #define POWER5_PME_PM_GCT_NOSLOT_CYC 283 #define POWER5_PME_PM_FXU0_BUSY_FXU1_IDLE 284 #define 
POWER5_PME_PM_PTEG_FROM_L35_SHR 285 #define POWER5_PME_PM_MRK_LSU_FLUSH_UST 286 #define POWER5_PME_PM_L3SA_HIT 287 #define POWER5_PME_PM_MRK_DATA_FROM_L25_SHR 288 #define POWER5_PME_PM_L2SB_RCST_DISP_FAIL_ADDR 289 #define POWER5_PME_PM_MRK_DATA_FROM_L35_SHR 290 #define POWER5_PME_PM_IERAT_XLATE_WR 291 #define POWER5_PME_PM_L2SA_ST_REQ 292 #define POWER5_PME_PM_THRD_SEL_T1 293 #define POWER5_PME_PM_IC_DEMAND_L2_BR_REDIRECT 294 #define POWER5_PME_PM_INST_FROM_LMEM 295 #define POWER5_PME_PM_FPU0_1FLOP 296 #define POWER5_PME_PM_MRK_DATA_FROM_L35_SHR_CYC 297 #define POWER5_PME_PM_PTEG_FROM_L2 298 #define POWER5_PME_PM_MEM_PW_CMPL 299 #define POWER5_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC 300 #define POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_OTHER 301 #define POWER5_PME_PM_FPU0_FIN 302 #define POWER5_PME_PM_MRK_DTLB_MISS_4K 303 #define POWER5_PME_PM_L3SC_SHR_INV 304 #define POWER5_PME_PM_GRP_BR_REDIR 305 #define POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_RC_FULL 306 #define POWER5_PME_PM_MRK_LSU_FLUSH_SRQ 307 #define POWER5_PME_PM_PTEG_FROM_L275_SHR 308 #define POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_RC_FULL 309 #define POWER5_PME_PM_SNOOP_RD_RETRY_WQ 310 #define POWER5_PME_PM_LSU0_NCLD 311 #define POWER5_PME_PM_FAB_DCLAIM_RETRIED 312 #define POWER5_PME_PM_LSU1_BUSY_REJECT 313 #define POWER5_PME_PM_FXLS0_FULL_CYC 314 #define POWER5_PME_PM_FPU0_FEST 315 #define POWER5_PME_PM_DTLB_REF_16M 316 #define POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_ADDR 317 #define POWER5_PME_PM_LSU0_REJECT_ERAT_MISS 318 #define POWER5_PME_PM_DATA_FROM_L25_MOD 319 #define POWER5_PME_PM_GCT_USAGE_60to79_CYC 320 #define POWER5_PME_PM_DATA_FROM_L375_MOD 321 #define POWER5_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 322 #define POWER5_PME_PM_LSU0_REJECT_RELOAD_CDF 323 #define POWER5_PME_PM_0INST_FETCH 324 #define POWER5_PME_PM_LSU1_REJECT_RELOAD_CDF 325 #define POWER5_PME_PM_L1_PREF 326 #define POWER5_PME_PM_MEM_WQ_DISP_Q0to7 327 #define POWER5_PME_PM_MRK_DATA_FROM_LMEM_CYC 328 #define POWER5_PME_PM_BRQ_FULL_CYC 329 #define 
POWER5_PME_PM_GRP_IC_MISS_NONSPEC 330 #define POWER5_PME_PM_PTEG_FROM_L275_MOD 331 #define POWER5_PME_PM_MRK_LD_MISS_L1_LSU0 332 #define POWER5_PME_PM_MRK_DATA_FROM_L375_SHR_CYC 333 #define POWER5_PME_PM_LSU_FLUSH 334 #define POWER5_PME_PM_DATA_FROM_L3 335 #define POWER5_PME_PM_INST_FROM_L2 336 #define POWER5_PME_PM_PMC2_OVERFLOW 337 #define POWER5_PME_PM_FPU0_DENORM 338 #define POWER5_PME_PM_FPU1_FMOV_FEST 339 #define POWER5_PME_PM_INST_FETCH_CYC 340 #define POWER5_PME_PM_LSU_LDF 341 #define POWER5_PME_PM_INST_DISP 342 #define POWER5_PME_PM_DATA_FROM_L25_SHR 343 #define POWER5_PME_PM_L1_DCACHE_RELOAD_VALID 344 #define POWER5_PME_PM_MEM_WQ_DISP_DCLAIM 345 #define POWER5_PME_PM_FPU_FULL_CYC 346 #define POWER5_PME_PM_MRK_GRP_ISSUED 347 #define POWER5_PME_PM_THRD_PRIO_3_CYC 348 #define POWER5_PME_PM_FPU_FMA 349 #define POWER5_PME_PM_INST_FROM_L35_MOD 350 #define POWER5_PME_PM_MRK_CRU_FIN 351 #define POWER5_PME_PM_SNOOP_WR_RETRY_WQ 352 #define POWER5_PME_PM_CMPLU_STALL_REJECT 353 #define POWER5_PME_PM_LSU1_REJECT_ERAT_MISS 354 #define POWER5_PME_PM_MRK_FXU_FIN 355 #define POWER5_PME_PM_L2SB_RCST_DISP_FAIL_OTHER 356 #define POWER5_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY 357 #define POWER5_PME_PM_PMC4_OVERFLOW 358 #define POWER5_PME_PM_L3SA_SNOOP_RETRY 359 #define POWER5_PME_PM_PTEG_FROM_L35_MOD 360 #define POWER5_PME_PM_INST_FROM_L25_MOD 361 #define POWER5_PME_PM_THRD_SMT_HANG 362 #define POWER5_PME_PM_CMPLU_STALL_ERAT_MISS 363 #define POWER5_PME_PM_L3SA_MOD_TAG 364 #define POWER5_PME_PM_FLUSH_SYNC 365 #define POWER5_PME_PM_INST_FROM_L2MISS 366 #define POWER5_PME_PM_L2SC_ST_HIT 367 #define POWER5_PME_PM_MEM_RQ_DISP_Q8to11 368 #define POWER5_PME_PM_MRK_GRP_DISP 369 #define POWER5_PME_PM_L2SB_MOD_TAG 370 #define POWER5_PME_PM_CLB_EMPTY_CYC 371 #define POWER5_PME_PM_L2SB_ST_HIT 372 #define POWER5_PME_PM_MEM_NONSPEC_RD_CANCEL 373 #define POWER5_PME_PM_BR_PRED_CR_TA 374 #define POWER5_PME_PM_MRK_LSU0_FLUSH_SRQ 375 #define POWER5_PME_PM_MRK_LSU_FLUSH_ULD 376 #define 
POWER5_PME_PM_INST_DISP_ATTEMPT 377 #define POWER5_PME_PM_INST_FROM_RMEM 378 #define POWER5_PME_PM_ST_REF_L1_LSU0 379 #define POWER5_PME_PM_LSU0_DERAT_MISS 380 #define POWER5_PME_PM_L2SB_RCLD_DISP 381 #define POWER5_PME_PM_FPU_STALL3 382 #define POWER5_PME_PM_BR_PRED_CR 383 #define POWER5_PME_PM_MRK_DATA_FROM_L2 384 #define POWER5_PME_PM_LSU0_FLUSH_SRQ 385 #define POWER5_PME_PM_FAB_PNtoNN_DIRECT 386 #define POWER5_PME_PM_IOPS_CMPL 387 #define POWER5_PME_PM_L2SC_SHR_INV 388 #define POWER5_PME_PM_L2SA_RCST_DISP_FAIL_OTHER 389 #define POWER5_PME_PM_L2SA_RCST_DISP 390 #define POWER5_PME_PM_SNOOP_RETRY_AB_COLLISION 391 #define POWER5_PME_PM_FAB_PNtoVN_SIDECAR 392 #define POWER5_PME_PM_LSU_LMQ_S0_ALLOC 393 #define POWER5_PME_PM_LSU0_REJECT_LMQ_FULL 394 #define POWER5_PME_PM_SNOOP_PW_RETRY_RQ 395 #define POWER5_PME_PM_DTLB_REF 396 #define POWER5_PME_PM_PTEG_FROM_L3 397 #define POWER5_PME_PM_FAB_M1toVNorNN_SIDECAR_EMPTY 398 #define POWER5_PME_PM_LSU_SRQ_EMPTY_CYC 399 #define POWER5_PME_PM_FPU1_STF 400 #define POWER5_PME_PM_LSU_LMQ_S0_VALID 401 #define POWER5_PME_PM_GCT_USAGE_00to59_CYC 402 #define POWER5_PME_PM_DATA_FROM_L2MISS 403 #define POWER5_PME_PM_GRP_DISP_BLK_SB_CYC 404 #define POWER5_PME_PM_FPU_FMOV_FEST 405 #define POWER5_PME_PM_XER_MAP_FULL_CYC 406 #define POWER5_PME_PM_FLUSH_SB 407 #define POWER5_PME_PM_MRK_DATA_FROM_L375_SHR 408 #define POWER5_PME_PM_MRK_GRP_CMPL 409 #define POWER5_PME_PM_SUSPENDED 410 #define POWER5_PME_PM_GRP_IC_MISS_BR_REDIR_NONSPEC 411 #define POWER5_PME_PM_SNOOP_RD_RETRY_QFULL 412 #define POWER5_PME_PM_L3SB_MOD_INV 413 #define POWER5_PME_PM_DATA_FROM_L35_SHR 414 #define POWER5_PME_PM_LD_MISS_L1_LSU1 415 #define POWER5_PME_PM_STCX_FAIL 416 #define POWER5_PME_PM_DC_PREF_DST 417 #define POWER5_PME_PM_GRP_DISP 418 #define POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_ADDR 419 #define POWER5_PME_PM_FPU0_FPSCR 420 #define POWER5_PME_PM_DATA_FROM_L2 421 #define POWER5_PME_PM_FPU1_DENORM 422 #define POWER5_PME_PM_FPU_1FLOP 423 #define 
POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_OTHER 424 #define POWER5_PME_PM_L2SC_RCST_DISP_FAIL_RC_FULL 425 #define POWER5_PME_PM_FPU0_FSQRT 426 #define POWER5_PME_PM_LD_REF_L1 427 #define POWER5_PME_PM_INST_FROM_L1 428 #define POWER5_PME_PM_TLBIE_HELD 429 #define POWER5_PME_PM_DC_PREF_OUT_OF_STREAMS 430 #define POWER5_PME_PM_MRK_DATA_FROM_L25_MOD_CYC 431 #define POWER5_PME_PM_MRK_LSU1_FLUSH_SRQ 432 #define POWER5_PME_PM_MEM_RQ_DISP_Q0to3 433 #define POWER5_PME_PM_ST_REF_L1_LSU1 434 #define POWER5_PME_PM_MRK_LD_MISS_L1 435 #define POWER5_PME_PM_L1_WRITE_CYC 436 #define POWER5_PME_PM_L2SC_ST_REQ 437 #define POWER5_PME_PM_CMPLU_STALL_FDIV 438 #define POWER5_PME_PM_THRD_SEL_OVER_CLB_EMPTY 439 #define POWER5_PME_PM_BR_MPRED_CR 440 #define POWER5_PME_PM_L3SB_MOD_TAG 441 #define POWER5_PME_PM_MRK_DATA_FROM_L2MISS 442 #define POWER5_PME_PM_LSU_REJECT_SRQ 443 #define POWER5_PME_PM_LD_MISS_L1 444 #define POWER5_PME_PM_INST_FROM_PREF 445 #define POWER5_PME_PM_DC_INV_L2 446 #define POWER5_PME_PM_STCX_PASS 447 #define POWER5_PME_PM_LSU_SRQ_FULL_CYC 448 #define POWER5_PME_PM_FPU_FIN 449 #define POWER5_PME_PM_L2SA_SHR_MOD 450 #define POWER5_PME_PM_LSU_SRQ_STFWD 451 #define POWER5_PME_PM_0INST_CLB_CYC 452 #define POWER5_PME_PM_FXU0_FIN 453 #define POWER5_PME_PM_L2SB_RCST_DISP_FAIL_RC_FULL 454 #define POWER5_PME_PM_THRD_GRP_CMPL_BOTH_CYC 455 #define POWER5_PME_PM_PMC5_OVERFLOW 456 #define POWER5_PME_PM_FPU0_FDIV 457 #define POWER5_PME_PM_PTEG_FROM_L375_SHR 458 #define POWER5_PME_PM_LD_REF_L1_LSU1 459 #define POWER5_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY 460 #define POWER5_PME_PM_HV_CYC 461 #define POWER5_PME_PM_THRD_PRIO_DIFF_0_CYC 462 #define POWER5_PME_PM_LR_CTR_MAP_FULL_CYC 463 #define POWER5_PME_PM_L3SB_SHR_INV 464 #define POWER5_PME_PM_DATA_FROM_RMEM 465 #define POWER5_PME_PM_DATA_FROM_L275_MOD 466 #define POWER5_PME_PM_LSU0_REJECT_SRQ 467 #define POWER5_PME_PM_LSU1_DERAT_MISS 468 #define POWER5_PME_PM_MRK_LSU_FIN 469 #define POWER5_PME_PM_DTLB_MISS_16M 470 #define 
POWER5_PME_PM_LSU0_FLUSH_UST 471 #define POWER5_PME_PM_L2SC_MOD_TAG 472 #define POWER5_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY 473 static const int power5_event_ids[][POWER5_NUM_EVENT_COUNTERS] = { [ POWER5_PME_PM_LSU_REJECT_RELOAD_CDF ] = { -1, 145, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU1_SINGLE ] = { 51, 50, -1, -1, -1, -1 }, [ POWER5_PME_PM_L3SB_REF ] = { 111, 109, -1, -1, -1, -1 }, [ POWER5_PME_PM_THRD_PRIO_DIFF_3or4_CYC ] = { -1, -1, 173, 179, -1, -1 }, [ POWER5_PME_PM_INST_FROM_L275_SHR ] = { -1, -1, 57, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L375_MOD ] = { 165, -1, -1, 139, -1, -1 }, [ POWER5_PME_PM_DTLB_MISS_4K ] = { 24, 23, -1, -1, -1, -1 }, [ POWER5_PME_PM_CLB_FULL_CYC ] = { 10, 9, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_ST_CMPL ] = { 179, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU_FLUSH_LRQ_FULL ] = { 140, 139, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L275_SHR ] = { -1, -1, 130, -1, -1, -1 }, [ POWER5_PME_PM_1INST_CLB_CYC ] = { 1, 1, -1, -1, -1, -1 }, [ POWER5_PME_PM_MEM_SPEC_RD_CANCEL ] = { 157, 155, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DTLB_MISS_16M ] = { 167, 168, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU_FDIV ] = { 55, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU_SINGLE ] = { 58, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU0_FMA ] = { 39, 38, -1, -1, -1, -1 }, [ POWER5_PME_PM_SLB_MISS ] = { -1, 184, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU1_FLUSH_LRQ ] = { 130, 128, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SA_ST_HIT ] = { -1, -1, 70, 74, -1, -1 }, [ POWER5_PME_PM_DTLB_MISS ] = { 22, 21, -1, -1, -1, -1 }, [ POWER5_PME_PM_BR_PRED_TA ] = { -1, 8, 4, 6, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L375_MOD_CYC ] = { -1, -1, -1, 140, -1, -1 }, [ POWER5_PME_PM_CMPLU_STALL_FXU ] = { -1, 12, -1, -1, -1, -1 }, [ POWER5_PME_PM_EXT_INT ] = { -1, -1, -1, 21, -1, -1 }, [ POWER5_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { -1, -1, 143, 154, -1, -1 }, [ POWER5_PME_PM_LSU1_LDF ] = { -1, -1, 107, 111, -1, -1 }, [ POWER5_PME_PM_MRK_ST_GPS ] = { -1, 178, -1, -1, -1, -1 }, [ POWER5_PME_PM_FAB_CMD_ISSUED 
] = { 27, 26, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU0_SRQ_STFWD ] = { 127, 125, -1, -1, -1, -1 }, [ POWER5_PME_PM_CR_MAP_FULL_CYC ] = { 11, 14, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SA_RCST_DISP_FAIL_RC_FULL ] = { 86, 84, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_LSU0_FLUSH_ULD ] = { -1, -1, 142, 153, -1, -1 }, [ POWER5_PME_PM_LSU_FLUSH_SRQ_FULL ] = { -1, -1, 110, 114, -1, -1 }, [ POWER5_PME_PM_FLUSH_IMBAL ] = { -1, -1, 25, 30, -1, -1 }, [ POWER5_PME_PM_MEM_RQ_DISP_Q16to19 ] = { 151, 149, -1, -1, -1, -1 }, [ POWER5_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC ] = { -1, -1, 176, 182, -1, -1 }, [ POWER5_PME_PM_DATA_FROM_L35_MOD ] = { -1, 17, 9, -1, -1, -1 }, [ POWER5_PME_PM_MEM_HI_PRIO_WR_CMPL ] = { 152, 150, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU1_FDIV ] = { 47, 46, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU0_FRSP_FCONV ] = { -1, -1, 33, 38, -1, -1 }, [ POWER5_PME_PM_MEM_RQ_DISP ] = { 156, 154, -1, -1, -1, -1 }, [ POWER5_PME_PM_LWSYNC_HELD ] = { -1, -1, 120, 125, -1, -1 }, [ POWER5_PME_PM_FXU_FIN ] = { -1, -1, 45, -1, -1, -1 }, [ POWER5_PME_PM_DSLB_MISS ] = { 21, 20, -1, -1, -1, -1 }, [ POWER5_PME_PM_FXLS1_FULL_CYC ] = { -1, -1, 41, 46, -1, -1 }, [ POWER5_PME_PM_DATA_FROM_L275_SHR ] = { -1, -1, 8, -1, -1, -1 }, [ POWER5_PME_PM_THRD_SEL_T0 ] = { -1, -1, 182, 188, -1, -1 }, [ POWER5_PME_PM_PTEG_RELOAD_VALID ] = { -1, -1, 191, 195, -1, -1 }, [ POWER5_PME_PM_LSU_LMQ_LHR_MERGE ] = { -1, -1, 112, 117, -1, -1 }, [ POWER5_PME_PM_MRK_STCX_FAIL ] = { 178, 177, -1, -1, -1, -1 }, [ POWER5_PME_PM_2INST_CLB_CYC ] = { 3, 2, -1, -1, -1, -1 }, [ POWER5_PME_PM_FAB_PNtoVN_DIRECT ] = { 34, 33, -1, -1, -1, -1 }, [ POWER5_PME_PM_PTEG_FROM_L2MISS ] = { -1, -1, 189, -1, -1, -1 }, [ POWER5_PME_PM_CMPLU_STALL_LSU ] = { -1, 13, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DSLB_MISS ] = { -1, -1, 134, 144, -1, -1 }, [ POWER5_PME_PM_LSU_FLUSH_ULD ] = { 142, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_PTEG_FROM_LMEM ] = { -1, 183, 157, -1, -1, -1 }, [ POWER5_PME_PM_MRK_BRU_FIN ] = { -1, 158, -1, -1, -1, -1 }, [ 
POWER5_PME_PM_MEM_WQ_DISP_WRITE ] = { 159, 157, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L275_MOD_CYC ] = { -1, -1, -1, 137, -1, -1 }, [ POWER5_PME_PM_LSU1_NCLD ] = { -1, -1, 108, 112, -1, -1 }, [ POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_OTHER ] = { -1, -1, 65, 69, -1, -1 }, [ POWER5_PME_PM_SNOOP_PW_RETRY_WQ_PWQ ] = { -1, -1, 159, 167, -1, -1 }, [ POWER5_PME_PM_FPR_MAP_FULL_CYC ] = { 35, 34, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU1_FULL_CYC ] = { 50, 49, -1, -1, -1, -1 }, [ POWER5_PME_PM_L3SA_ALL_BUSY ] = { 106, 104, -1, -1, -1, -1 }, [ POWER5_PME_PM_3INST_CLB_CYC ] = { 4, 3, -1, -1, -1, -1 }, [ POWER5_PME_PM_MEM_PWQ_DISP_Q2or3 ] = { -1, -1, 123, 128, -1, -1 }, [ POWER5_PME_PM_L2SA_SHR_INV ] = { -1, -1, 69, 73, -1, -1 }, [ POWER5_PME_PM_THRESH_TIMEO ] = { -1, -1, 185, -1, -1, -1 }, [ POWER5_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL ] = { -1, -1, 68, 72, -1, -1 }, [ POWER5_PME_PM_THRD_SEL_OVER_GCT_IMBAL ] = { -1, -1, 179, 185, -1, -1 }, [ POWER5_PME_PM_FPU_FSQRT ] = { -1, 53, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { -1, -1, 139, 150, -1, -1 }, [ POWER5_PME_PM_PMC1_OVERFLOW ] = { -1, 180, -1, -1, -1, -1 }, [ POWER5_PME_PM_L3SC_SNOOP_RETRY ] = { -1, -1, 99, 103, -1, -1 }, [ POWER5_PME_PM_DATA_TABLEWALK_CYC ] = { 20, 19, -1, -1, -1, -1 }, [ POWER5_PME_PM_THRD_PRIO_6_CYC ] = { 208, 202, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU_FEST ] = { -1, -1, -1, 43, -1, -1 }, [ POWER5_PME_PM_FAB_M1toP1_SIDECAR_EMPTY ] = { 31, 30, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_RMEM ] = { 166, -1, -1, 142, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L35_MOD_CYC ] = { -1, -1, -1, 138, -1, -1 }, [ POWER5_PME_PM_MEM_PWQ_DISP ] = { 153, 151, -1, -1, -1, -1 }, [ POWER5_PME_PM_FAB_P1toM1_SIDECAR_EMPTY ] = { 32, 31, -1, -1, -1, -1 }, [ POWER5_PME_PM_LD_MISS_L1_LSU0 ] = { -1, -1, 101, 104, -1, -1 }, [ POWER5_PME_PM_SNOOP_PARTIAL_RTRY_QFULL ] = { -1, -1, 158, 166, -1, -1 }, [ POWER5_PME_PM_FPU1_STALL3 ] = { 52, 51, -1, -1, -1, -1 }, [ POWER5_PME_PM_GCT_USAGE_80to99_CYC ] = { -1, -1, 47, 
-1, -1, -1 }, [ POWER5_PME_PM_WORK_HELD ] = { -1, -1, -1, 192, -1, -1 }, [ POWER5_PME_PM_INST_CMPL ] = { 174, 174, -1, -1, 0, -1 }, [ POWER5_PME_PM_LSU1_FLUSH_UST ] = { 133, 131, -1, -1, -1, -1 }, [ POWER5_PME_PM_FXU_IDLE ] = { 59, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU0_FLUSH_ULD ] = { 121, 119, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU1_REJECT_LMQ_FULL ] = { 135, 133, -1, -1, -1, -1 }, [ POWER5_PME_PM_GRP_DISP_REJECT ] = { 65, 65, -1, 55, -1, -1 }, [ POWER5_PME_PM_L2SA_MOD_INV ] = { -1, -1, 63, 67, -1, -1 }, [ POWER5_PME_PM_PTEG_FROM_L25_SHR ] = { 184, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_FAB_CMD_RETRIED ] = { -1, -1, 17, 22, -1, -1 }, [ POWER5_PME_PM_L3SA_SHR_INV ] = { -1, -1, 90, 94, -1, -1 }, [ POWER5_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL ] = { -1, -1, 76, 80, -1, -1 }, [ POWER5_PME_PM_L2SA_RCST_DISP_FAIL_ADDR ] = { -1, -1, 66, 70, -1, -1 }, [ POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_RC_FULL ] = { 84, 82, -1, -1, -1, -1 }, [ POWER5_PME_PM_PTEG_FROM_L375_MOD ] = { 188, -1, -1, 164, -1, -1 }, [ POWER5_PME_PM_MRK_LSU1_FLUSH_UST ] = { -1, -1, 146, 157, -1, -1 }, [ POWER5_PME_PM_BR_ISSUED ] = { -1, -1, 0, 1, -1, -1 }, [ POWER5_PME_PM_MRK_GRP_BR_REDIR ] = { -1, 172, -1, -1, -1, -1 }, [ POWER5_PME_PM_EE_OFF ] = { -1, -1, 15, 19, -1, -1 }, [ POWER5_PME_PM_MEM_RQ_DISP_Q4to7 ] = { -1, -1, 126, 131, -1, -1 }, [ POWER5_PME_PM_MEM_FAST_PATH_RD_DISP ] = { -1, -1, 190, 193, -1, -1 }, [ POWER5_PME_PM_INST_FROM_L3 ] = { 78, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_ITLB_MISS ] = { 81, 79, -1, -1, -1, -1 }, [ POWER5_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { -1, -1, -1, 49, -1, -1 }, [ POWER5_PME_PM_FXLS_FULL_CYC ] = { -1, -1, -1, 47, -1, -1 }, [ POWER5_PME_PM_DTLB_REF_4K ] = { 26, 25, -1, -1, -1, -1 }, [ POWER5_PME_PM_GRP_DISP_VALID ] = { 66, 66, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU_FLUSH_UST ] = { -1, 140, -1, -1, -1, -1 }, [ POWER5_PME_PM_FXU1_FIN ] = { -1, -1, 44, 50, -1, -1 }, [ POWER5_PME_PM_THRD_PRIO_4_CYC ] = { 206, 200, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L35_MOD ] = { -1, 163, 
131, -1, -1, -1 }, [ POWER5_PME_PM_4INST_CLB_CYC ] = { 5, 4, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DTLB_REF_16M ] = { 169, 170, -1, -1, -1, -1 }, [ POWER5_PME_PM_INST_FROM_L375_MOD ] = { -1, -1, -1, 62, -1, -1 }, [ POWER5_PME_PM_L2SC_RCST_DISP_FAIL_ADDR ] = { -1, -1, 82, 86, -1, -1 }, [ POWER5_PME_PM_GRP_CMPL ] = { -1, -1, 49, -1, -1, -1 }, [ POWER5_PME_PM_FPU1_1FLOP ] = { 45, 44, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU_FRSP_FCONV ] = { -1, -1, 39, -1, -1, -1 }, [ POWER5_PME_PM_5INST_CLB_CYC ] = { 6, 5, -1, -1, -1, -1 }, [ POWER5_PME_PM_L3SC_REF ] = { 114, 112, -1, -1, -1, -1 }, [ POWER5_PME_PM_THRD_L2MISS_BOTH_CYC ] = { -1, -1, 170, 176, -1, -1 }, [ POWER5_PME_PM_MEM_PW_GATH ] = { -1, -1, 124, 129, -1, -1 }, [ POWER5_PME_PM_FAB_PNtoNN_SIDECAR ] = { -1, -1, 21, 26, -1, -1 }, [ POWER5_PME_PM_FAB_DCLAIM_ISSUED ] = { 28, 27, -1, -1, -1, -1 }, [ POWER5_PME_PM_GRP_IC_MISS ] = { 67, 67, -1, -1, -1, -1 }, [ POWER5_PME_PM_INST_FROM_L35_SHR ] = { 79, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU_LMQ_FULL_CYC ] = { -1, -1, 111, 116, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L2_CYC ] = { -1, 162, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU_SRQ_SYNC_CYC ] = { -1, -1, 119, 124, -1, -1 }, [ POWER5_PME_PM_LSU0_BUSY_REJECT ] = { 117, 115, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU_REJECT_ERAT_MISS ] = { 145, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { -1, -1, -1, 143, -1, -1 }, [ POWER5_PME_PM_DATA_FROM_L375_SHR ] = { -1, -1, 10, -1, -1, -1 }, [ POWER5_PME_PM_FPU0_FMOV_FEST ] = { -1, -1, 31, 36, -1, -1 }, [ POWER5_PME_PM_PTEG_FROM_L25_MOD ] = { -1, 181, 153, -1, -1, -1 }, [ POWER5_PME_PM_LD_REF_L1_LSU0 ] = { -1, -1, 103, 107, -1, -1 }, [ POWER5_PME_PM_THRD_PRIO_7_CYC ] = { 209, 203, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU1_FLUSH_SRQ ] = { 131, 129, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SC_RCST_DISP ] = { 101, 99, -1, -1, -1, -1 }, [ POWER5_PME_PM_CMPLU_STALL_DIV ] = { -1, -1, -1, 7, -1, -1 }, [ POWER5_PME_PM_MEM_RQ_DISP_Q12to15 ] = { -1, -1, 121, 126, -1, -1 }, [ 
POWER5_PME_PM_INST_FROM_L375_SHR ] = { -1, -1, 58, -1, -1, -1 }, [ POWER5_PME_PM_ST_REF_L1 ] = { -1, -1, 165, -1, -1, -1 }, [ POWER5_PME_PM_L3SB_ALL_BUSY ] = { 109, 107, -1, -1, -1, -1 }, [ POWER5_PME_PM_FAB_P1toVNorNN_SIDECAR_EMPTY ] = { -1, -1, 20, 25, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L275_SHR_CYC ] = { -1, 161, -1, -1, -1, -1 }, [ POWER5_PME_PM_FAB_HOLDtoNN_EMPTY ] = { 29, 28, -1, -1, -1, -1 }, [ POWER5_PME_PM_DATA_FROM_LMEM ] = { -1, 18, 11, -1, -1, -1 }, [ POWER5_PME_PM_RUN_CYC ] = { 190, -1, -1, -1, -1, 0 }, [ POWER5_PME_PM_PTEG_FROM_RMEM ] = { 189, -1, -1, 165, -1, -1 }, [ POWER5_PME_PM_L2SC_RCLD_DISP ] = { 99, 97, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU0_LDF ] = { -1, -1, 105, 109, -1, -1 }, [ POWER5_PME_PM_LSU_LRQ_S0_VALID ] = { 144, 143, -1, -1, -1, -1 }, [ POWER5_PME_PM_PMC3_OVERFLOW ] = { -1, -1, -1, 162, -1, -1 }, [ POWER5_PME_PM_MRK_IMR_RELOAD ] = { 173, 173, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_GRP_TIMEO ] = { -1, -1, -1, 148, -1, -1 }, [ POWER5_PME_PM_ST_MISS_L1 ] = { -1, -1, 164, 171, -1, -1 }, [ POWER5_PME_PM_STOP_COMPLETION ] = { -1, -1, 163, -1, -1, -1 }, [ POWER5_PME_PM_LSU_BUSY_REJECT ] = { 139, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_ISLB_MISS ] = { 80, 78, -1, -1, -1, -1 }, [ POWER5_PME_PM_CYC ] = { 12, 15, 6, 12, -1, -1 }, [ POWER5_PME_PM_THRD_ONE_RUN_CYC ] = { 202, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_GRP_BR_REDIR_NONSPEC ] = { 64, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU1_SRQ_STFWD ] = { 138, 136, -1, -1, -1, -1 }, [ POWER5_PME_PM_L3SC_MOD_INV ] = { -1, -1, 97, 101, -1, -1 }, [ POWER5_PME_PM_L2_PREF ] = { -1, -1, 87, 91, -1, -1 }, [ POWER5_PME_PM_GCT_NOSLOT_BR_MPRED ] = { -1, -1, -1, 51, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L25_MOD ] = { -1, 159, 129, -1, -1, -1 }, [ POWER5_PME_PM_L2SB_MOD_INV ] = { -1, -1, 71, 75, -1, -1 }, [ POWER5_PME_PM_L2SB_ST_REQ ] = { 97, 95, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_L1_RELOAD_VALID ] = { -1, -1, 138, 149, -1, -1 }, [ POWER5_PME_PM_L3SB_HIT ] = { -1, -1, 92, 96, -1, -1 }, [ 
POWER5_PME_PM_L2SB_SHR_MOD ] = { 96, 94, -1, -1, -1, -1 }, [ POWER5_PME_PM_EE_OFF_EXT_INT ] = { -1, -1, 16, 20, -1, -1 }, [ POWER5_PME_PM_1PLUS_PPC_CMPL ] = { 2, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SC_SHR_MOD ] = { 104, 102, -1, -1, -1, -1 }, [ POWER5_PME_PM_PMC6_OVERFLOW ] = { -1, -1, 152, -1, -1, -1 }, [ POWER5_PME_PM_LSU_LRQ_FULL_CYC ] = { -1, -1, 116, 120, -1, -1 }, [ POWER5_PME_PM_IC_PREF_INSTALL ] = { -1, -1, 54, 58, -1, -1 }, [ POWER5_PME_PM_TLB_MISS ] = { 210, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_GCT_FULL_CYC ] = { 61, 60, -1, 52, -1, -1 }, [ POWER5_PME_PM_FXU_BUSY ] = { -1, 57, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L3_CYC ] = { -1, 166, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU_REJECT_LMQ_FULL ] = { -1, 144, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU_SRQ_S0_ALLOC ] = { 147, 146, -1, -1, -1, -1 }, [ POWER5_PME_PM_GRP_MRK ] = { 70, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_INST_FROM_L25_SHR ] = { 77, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU1_FIN ] = { -1, -1, 35, 40, -1, -1 }, [ POWER5_PME_PM_DC_PREF_STREAM_ALLOC ] = { -1, -1, 14, 18, -1, -1 }, [ POWER5_PME_PM_BR_MPRED_TA ] = { -1, -1, 2, 3, -1, -1 }, [ POWER5_PME_PM_CRQ_FULL_CYC ] = { -1, -1, 5, 11, -1, -1 }, [ POWER5_PME_PM_L2SA_RCLD_DISP ] = { 83, 81, -1, -1, -1, -1 }, [ POWER5_PME_PM_SNOOP_WR_RETRY_QFULL ] = { -1, -1, 161, 169, -1, -1 }, [ POWER5_PME_PM_MRK_DTLB_REF_4K ] = { 170, 171, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU_SRQ_S0_VALID ] = { 148, 147, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU0_FLUSH_LRQ ] = { 119, 117, -1, -1, -1, -1 }, [ POWER5_PME_PM_INST_FROM_L275_MOD ] = { -1, -1, -1, 61, -1, -1 }, [ POWER5_PME_PM_GCT_EMPTY_CYC ] = { -1, 195, -1, -1, -1, -1 }, [ POWER5_PME_PM_LARX_LSU0 ] = { 115, 113, -1, -1, -1, -1 }, [ POWER5_PME_PM_THRD_PRIO_DIFF_5or6_CYC ] = { -1, -1, 174, 180, -1, -1 }, [ POWER5_PME_PM_SNOOP_RETRY_1AHEAD ] = { 195, 189, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU1_FSQRT ] = { 49, 48, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { 177, 176, -1, -1, -1, -1 }, [ 
POWER5_PME_PM_MRK_FPU_FIN ] = { -1, -1, 136, -1, -1, -1 }, [ POWER5_PME_PM_THRD_PRIO_5_CYC ] = { 207, 201, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_LMEM ] = { -1, 167, 133, -1, -1, -1 }, [ POWER5_PME_PM_FPU1_FRSP_FCONV ] = { -1, -1, 37, 42, -1, -1 }, [ POWER5_PME_PM_SNOOP_TLBIE ] = { 196, 190, -1, -1, -1, -1 }, [ POWER5_PME_PM_L3SB_SNOOP_RETRY ] = { -1, -1, 95, 99, -1, -1 }, [ POWER5_PME_PM_FAB_VBYPASS_EMPTY ] = { -1, -1, 23, 28, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L275_MOD ] = { 162, -1, -1, 136, -1, -1 }, [ POWER5_PME_PM_6INST_CLB_CYC ] = { 7, 6, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SB_RCST_DISP ] = { 93, 91, -1, -1, -1, -1 }, [ POWER5_PME_PM_FLUSH ] = { -1, -1, 26, 31, -1, -1 }, [ POWER5_PME_PM_L2SC_MOD_INV ] = { -1, -1, 79, 83, -1, -1 }, [ POWER5_PME_PM_FPU_DENORM ] = { 54, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_L3SC_HIT ] = { -1, -1, 96, 100, -1, -1 }, [ POWER5_PME_PM_SNOOP_WR_RETRY_RQ ] = { 197, 191, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU1_REJECT_SRQ ] = { 137, 135, -1, -1, -1, -1 }, [ POWER5_PME_PM_IC_PREF_REQ ] = { 71, 69, -1, -1, -1, -1 }, [ POWER5_PME_PM_L3SC_ALL_BUSY ] = { 112, 110, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_GRP_IC_MISS ] = { -1, -1, -1, 147, -1, -1 }, [ POWER5_PME_PM_GCT_NOSLOT_IC_MISS ] = { -1, 59, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L3 ] = { 163, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_GCT_NOSLOT_SRQ_FULL ] = { -1, -1, 46, -1, -1, -1 }, [ POWER5_PME_PM_THRD_SEL_OVER_ISU_HOLD ] = { -1, -1, 180, 186, -1, -1 }, [ POWER5_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { -1, 10, -1, -1, -1, -1 }, [ POWER5_PME_PM_L3SA_MOD_INV ] = { -1, -1, 89, 93, -1, -1 }, [ POWER5_PME_PM_LSU_FLUSH_LRQ ] = { -1, 138, -1, -1, -1, -1 }, [ POWER5_PME_PM_THRD_PRIO_2_CYC ] = { 204, 198, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU_FLUSH_SRQ ] = { 141, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { -1, -1, 149, 161, -1, -1 }, [ POWER5_PME_PM_L3SA_REF ] = { 108, 106, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL ] = { -1, -1, 84, 
88, -1, -1 }, [ POWER5_PME_PM_FPU0_STALL3 ] = { 43, 42, -1, -1, -1, -1 }, [ POWER5_PME_PM_GPR_MAP_FULL_CYC ] = { -1, -1, 48, 53, -1, -1 }, [ POWER5_PME_PM_TB_BIT_TRANS ] = { 201, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_LSU_FLUSH_LRQ ] = { -1, -1, 147, -1, -1, -1 }, [ POWER5_PME_PM_FPU0_STF ] = { 44, 43, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DTLB_MISS ] = { -1, -1, 135, 145, -1, -1 }, [ POWER5_PME_PM_FPU1_FMA ] = { 48, 47, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SA_MOD_TAG ] = { 82, 80, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU1_FLUSH_ULD ] = { 132, 130, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_LSU0_FLUSH_UST ] = { -1, -1, 141, 152, -1, -1 }, [ POWER5_PME_PM_MRK_INST_FIN ] = { -1, -1, 137, -1, -1, -1 }, [ POWER5_PME_PM_FPU0_FULL_CYC ] = { 41, 40, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU_LRQ_S0_ALLOC ] = { 143, 142, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_LSU1_FLUSH_ULD ] = { -1, -1, 145, 156, -1, -1 }, [ POWER5_PME_PM_MRK_DTLB_REF ] = { 213, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_BR_UNCOND ] = { 9, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_THRD_SEL_OVER_L2MISS ] = { -1, -1, 181, 187, -1, -1 }, [ POWER5_PME_PM_L2SB_SHR_INV ] = { -1, -1, 77, 81, -1, -1 }, [ POWER5_PME_PM_MEM_LO_PRIO_WR_CMPL ] = { -1, -1, 122, 127, -1, -1 }, [ POWER5_PME_PM_L3SC_MOD_TAG ] = { 113, 111, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_ST_MISS_L1 ] = { 180, 179, -1, -1, -1, -1 }, [ POWER5_PME_PM_GRP_DISP_SUCCESS ] = { -1, -1, 51, -1, -1, -1 }, [ POWER5_PME_PM_THRD_PRIO_DIFF_1or2_CYC ] = { -1, -1, 172, 178, -1, -1 }, [ POWER5_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { -1, -1, 52, 56, -1, -1 }, [ POWER5_PME_PM_MEM_WQ_DISP_Q8to15 ] = { -1, -1, 127, 132, -1, -1 }, [ POWER5_PME_PM_FPU0_SINGLE ] = { 42, 41, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU_DERAT_MISS ] = { -1, 137, -1, -1, -1, -1 }, [ POWER5_PME_PM_THRD_PRIO_1_CYC ] = { 203, 197, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SC_RCST_DISP_FAIL_OTHER ] = { -1, -1, 83, 87, -1, -1 }, [ POWER5_PME_PM_FPU1_FEST ] = { -1, -1, 34, 39, -1, -1 }, [ POWER5_PME_PM_FAB_HOLDtoVN_EMPTY ] = { 30, 
29, -1, -1, -1, -1 }, [ POWER5_PME_PM_SNOOP_RD_RETRY_RQ ] = { 194, 188, -1, -1, -1, -1 }, [ POWER5_PME_PM_SNOOP_DCLAIM_RETRY_QFULL ] = { 191, 185, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L25_SHR_CYC ] = { -1, 160, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_ST_CMPL_INT ] = { -1, -1, 150, -1, -1, -1 }, [ POWER5_PME_PM_FLUSH_BR_MPRED ] = { -1, -1, 24, 29, -1, -1 }, [ POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_ADDR ] = { -1, -1, 72, 76, -1, -1 }, [ POWER5_PME_PM_FPU_STF ] = { -1, 56, -1, -1, -1, -1 }, [ POWER5_PME_PM_CMPLU_STALL_FPU ] = { -1, -1, -1, 9, -1, -1 }, [ POWER5_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC ] = { -1, -1, 175, 181, -1, -1 }, [ POWER5_PME_PM_GCT_NOSLOT_CYC ] = { 60, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { -1, -1, 42, -1, -1, -1 }, [ POWER5_PME_PM_PTEG_FROM_L35_SHR ] = { 187, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_LSU_FLUSH_UST ] = { -1, -1, 148, -1, -1, -1 }, [ POWER5_PME_PM_L3SA_HIT ] = { -1, -1, 88, 92, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L25_SHR ] = { 161, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SB_RCST_DISP_FAIL_ADDR ] = { -1, -1, 74, 78, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L35_SHR ] = { 164, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_IERAT_XLATE_WR ] = { 72, 70, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SA_ST_REQ ] = { 89, 87, -1, -1, -1, -1 }, [ POWER5_PME_PM_THRD_SEL_T1 ] = { -1, -1, 183, 189, -1, -1 }, [ POWER5_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { -1, -1, 53, 57, -1, -1 }, [ POWER5_PME_PM_INST_FROM_LMEM ] = { -1, 77, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU0_1FLOP ] = { 36, 35, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L35_SHR_CYC ] = { -1, 164, -1, -1, -1, -1 }, [ POWER5_PME_PM_PTEG_FROM_L2 ] = { 183, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_MEM_PW_CMPL ] = { 154, 152, -1, -1, -1, -1 }, [ POWER5_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC ] = { -1, -1, 177, 183, -1, -1 }, [ POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_OTHER ] = { -1, -1, 73, 77, -1, -1 }, [ POWER5_PME_PM_FPU0_FIN ] = { -1, -1, 30, 35, -1, -1 }, [ POWER5_PME_PM_MRK_DTLB_MISS_4K ] 
= { 168, 169, -1, -1, -1, -1 }, [ POWER5_PME_PM_L3SC_SHR_INV ] = { -1, -1, 98, 102, -1, -1 }, [ POWER5_PME_PM_GRP_BR_REDIR ] = { 63, 62, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_RC_FULL ] = { 100, 98, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_LSU_FLUSH_SRQ ] = { -1, -1, -1, 159, -1, -1 }, [ POWER5_PME_PM_PTEG_FROM_L275_SHR ] = { -1, -1, 154, -1, -1, -1 }, [ POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_RC_FULL ] = { 92, 90, -1, -1, -1, -1 }, [ POWER5_PME_PM_SNOOP_RD_RETRY_WQ ] = { -1, -1, 160, 168, -1, -1 }, [ POWER5_PME_PM_LSU0_NCLD ] = { -1, -1, 106, 110, -1, -1 }, [ POWER5_PME_PM_FAB_DCLAIM_RETRIED ] = { -1, -1, 18, 23, -1, -1 }, [ POWER5_PME_PM_LSU1_BUSY_REJECT ] = { 128, 126, -1, -1, -1, -1 }, [ POWER5_PME_PM_FXLS0_FULL_CYC ] = { -1, -1, 40, 45, -1, -1 }, [ POWER5_PME_PM_FPU0_FEST ] = { -1, -1, 29, 34, -1, -1 }, [ POWER5_PME_PM_DTLB_REF_16M ] = { 25, 24, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_ADDR ] = { -1, -1, 80, 84, -1, -1 }, [ POWER5_PME_PM_LSU0_REJECT_ERAT_MISS ] = { 123, 121, -1, -1, -1, -1 }, [ POWER5_PME_PM_DATA_FROM_L25_MOD ] = { -1, 16, 7, -1, -1, -1 }, [ POWER5_PME_PM_GCT_USAGE_60to79_CYC ] = { -1, 61, -1, -1, -1, -1 }, [ POWER5_PME_PM_DATA_FROM_L375_MOD ] = { 18, -1, -1, 14, -1, -1 }, [ POWER5_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { -1, 141, 115, -1, -1, -1 }, [ POWER5_PME_PM_LSU0_REJECT_RELOAD_CDF ] = { 125, 123, -1, -1, -1, -1 }, [ POWER5_PME_PM_0INST_FETCH ] = { -1, -1, -1, 0, -1, -1 }, [ POWER5_PME_PM_LSU1_REJECT_RELOAD_CDF ] = { 136, 134, -1, -1, -1, -1 }, [ POWER5_PME_PM_L1_PREF ] = { -1, -1, 61, 65, -1, -1 }, [ POWER5_PME_PM_MEM_WQ_DISP_Q0to7 ] = { 158, 156, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { -1, -1, -1, 141, -1, -1 }, [ POWER5_PME_PM_BRQ_FULL_CYC ] = { 8, 7, -1, -1, -1, -1 }, [ POWER5_PME_PM_GRP_IC_MISS_NONSPEC ] = { 69, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_PTEG_FROM_L275_MOD ] = { 185, -1, -1, 163, -1, -1 }, [ POWER5_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { 176, 175, -1, -1, -1, -1 }, [ 
POWER5_PME_PM_MRK_DATA_FROM_L375_SHR_CYC ] = { -1, 165, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU_FLUSH ] = { -1, -1, 109, 113, -1, -1 }, [ POWER5_PME_PM_DATA_FROM_L3 ] = { 16, -1, 192, -1, -1, -1 }, [ POWER5_PME_PM_INST_FROM_L2 ] = { 76, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_PMC2_OVERFLOW ] = { -1, -1, 151, -1, -1, -1 }, [ POWER5_PME_PM_FPU0_DENORM ] = { 37, 36, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU1_FMOV_FEST ] = { -1, -1, 36, 41, -1, -1 }, [ POWER5_PME_PM_INST_FETCH_CYC ] = { 75, 73, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU_LDF ] = { -1, -1, -1, 115, -1, -1 }, [ POWER5_PME_PM_INST_DISP ] = { -1, -1, 56, 60, -1, -1 }, [ POWER5_PME_PM_DATA_FROM_L25_SHR ] = { 14, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_L1_DCACHE_RELOAD_VALID ] = { -1, -1, 60, 64, -1, -1 }, [ POWER5_PME_PM_MEM_WQ_DISP_DCLAIM ] = { -1, -1, 128, 133, -1, -1 }, [ POWER5_PME_PM_FPU_FULL_CYC ] = { 57, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_GRP_ISSUED ] = { 172, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_THRD_PRIO_3_CYC ] = { 205, 199, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU_FMA ] = { -1, 54, -1, -1, -1, -1 }, [ POWER5_PME_PM_INST_FROM_L35_MOD ] = { -1, 76, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_CRU_FIN ] = { -1, -1, -1, 134, -1, -1 }, [ POWER5_PME_PM_SNOOP_WR_RETRY_WQ ] = { -1, -1, 162, 170, -1, -1 }, [ POWER5_PME_PM_CMPLU_STALL_REJECT ] = { -1, -1, -1, 10, -1, -1 }, [ POWER5_PME_PM_LSU1_REJECT_ERAT_MISS ] = { 134, 132, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_FXU_FIN ] = { -1, 58, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SB_RCST_DISP_FAIL_OTHER ] = { -1, -1, 75, 79, -1, -1 }, [ POWER5_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY ] = { 103, 101, -1, -1, -1, -1 }, [ POWER5_PME_PM_PMC4_OVERFLOW ] = { 181, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_L3SA_SNOOP_RETRY ] = { -1, -1, 91, 95, -1, -1 }, [ POWER5_PME_PM_PTEG_FROM_L35_MOD ] = { -1, 182, 155, -1, -1, -1 }, [ POWER5_PME_PM_INST_FROM_L25_MOD ] = { -1, 75, -1, -1, -1, -1 }, [ POWER5_PME_PM_THRD_SMT_HANG ] = { -1, -1, 184, 190, -1, -1 }, [ POWER5_PME_PM_CMPLU_STALL_ERAT_MISS ] = { -1, -1, -1, 
8, -1, -1 }, [ POWER5_PME_PM_L3SA_MOD_TAG ] = { 107, 105, -1, -1, -1, -1 }, [ POWER5_PME_PM_FLUSH_SYNC ] = { -1, -1, 28, 33, -1, -1 }, [ POWER5_PME_PM_INST_FROM_L2MISS ] = { 212, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SC_ST_HIT ] = { -1, -1, 86, 90, -1, -1 }, [ POWER5_PME_PM_MEM_RQ_DISP_Q8to11 ] = { 150, 148, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_GRP_DISP ] = { 171, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SB_MOD_TAG ] = { 90, 88, -1, -1, -1, -1 }, [ POWER5_PME_PM_CLB_EMPTY_CYC ] = { -1, -1, 169, 175, -1, -1 }, [ POWER5_PME_PM_L2SB_ST_HIT ] = { -1, -1, 78, 82, -1, -1 }, [ POWER5_PME_PM_MEM_NONSPEC_RD_CANCEL ] = { -1, -1, 125, 130, -1, -1 }, [ POWER5_PME_PM_BR_PRED_CR_TA ] = { -1, -1, -1, 5, -1, -1 }, [ POWER5_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { -1, -1, 140, 151, -1, -1 }, [ POWER5_PME_PM_MRK_LSU_FLUSH_ULD ] = { -1, -1, -1, 160, -1, -1 }, [ POWER5_PME_PM_INST_DISP_ATTEMPT ] = { 74, 72, -1, -1, -1, -1 }, [ POWER5_PME_PM_INST_FROM_RMEM ] = { -1, -1, -1, 63, -1, -1 }, [ POWER5_PME_PM_ST_REF_L1_LSU0 ] = { -1, -1, 166, 172, -1, -1 }, [ POWER5_PME_PM_LSU0_DERAT_MISS ] = { 118, 116, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SB_RCLD_DISP ] = { 91, 89, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU_STALL3 ] = { -1, 55, -1, -1, -1, -1 }, [ POWER5_PME_PM_BR_PRED_CR ] = { -1, -1, 3, 4, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L2 ] = { 160, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU0_FLUSH_SRQ ] = { 120, 118, -1, -1, -1, -1 }, [ POWER5_PME_PM_FAB_PNtoNN_DIRECT ] = { 33, 32, -1, -1, -1, -1 }, [ POWER5_PME_PM_IOPS_CMPL ] = { 73, 71, 55, 59, -1, -1 }, [ POWER5_PME_PM_L2SC_SHR_INV ] = { -1, -1, 85, 89, -1, -1 }, [ POWER5_PME_PM_L2SA_RCST_DISP_FAIL_OTHER ] = { -1, -1, 67, 71, -1, -1 }, [ POWER5_PME_PM_L2SA_RCST_DISP ] = { 85, 83, -1, -1, -1, -1 }, [ POWER5_PME_PM_SNOOP_RETRY_AB_COLLISION ] = { -1, -1, -1, 194, -1, -1 }, [ POWER5_PME_PM_FAB_PNtoVN_SIDECAR ] = { -1, -1, 22, 27, -1, -1 }, [ POWER5_PME_PM_LSU_LMQ_S0_ALLOC ] = { -1, -1, 113, 118, -1, -1 }, [ POWER5_PME_PM_LSU0_REJECT_LMQ_FULL ] = { 124, 122, 
-1, -1, -1, -1 }, [ POWER5_PME_PM_SNOOP_PW_RETRY_RQ ] = { 192, 186, -1, 196, -1, -1 }, [ POWER5_PME_PM_DTLB_REF ] = { -1, 63, -1, -1, -1, -1 }, [ POWER5_PME_PM_PTEG_FROM_L3 ] = { 186, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_FAB_M1toVNorNN_SIDECAR_EMPTY ] = { -1, -1, 19, 24, -1, -1 }, [ POWER5_PME_PM_LSU_SRQ_EMPTY_CYC ] = { -1, -1, -1, 122, -1, -1 }, [ POWER5_PME_PM_FPU1_STF ] = { 53, 52, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU_LMQ_S0_VALID ] = { -1, -1, 114, 119, -1, -1 }, [ POWER5_PME_PM_GCT_USAGE_00to59_CYC ] = { 62, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_DATA_FROM_L2MISS ] = { -1, -1, 187, -1, -1, -1 }, [ POWER5_PME_PM_GRP_DISP_BLK_SB_CYC ] = { -1, -1, 50, 54, -1, -1 }, [ POWER5_PME_PM_FPU_FMOV_FEST ] = { -1, -1, 38, -1, -1, -1 }, [ POWER5_PME_PM_XER_MAP_FULL_CYC ] = { 211, 204, -1, -1, -1, -1 }, [ POWER5_PME_PM_FLUSH_SB ] = { -1, -1, 27, 32, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L375_SHR ] = { -1, -1, 132, -1, -1, -1 }, [ POWER5_PME_PM_MRK_GRP_CMPL ] = { -1, -1, -1, 146, -1, -1 }, [ POWER5_PME_PM_SUSPENDED ] = { 200, 194, 168, 174, -1, -1 }, [ POWER5_PME_PM_GRP_IC_MISS_BR_REDIR_NONSPEC ] = { 68, 205, -1, -1, -1, -1 }, [ POWER5_PME_PM_SNOOP_RD_RETRY_QFULL ] = { 193, 187, -1, -1, -1, -1 }, [ POWER5_PME_PM_L3SB_MOD_INV ] = { -1, -1, 93, 97, -1, -1 }, [ POWER5_PME_PM_DATA_FROM_L35_SHR ] = { 17, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_LD_MISS_L1_LSU1 ] = { -1, -1, 102, 105, -1, -1 }, [ POWER5_PME_PM_STCX_FAIL ] = { 198, 192, -1, -1, -1, -1 }, [ POWER5_PME_PM_DC_PREF_DST ] = { -1, -1, 13, 17, -1, -1 }, [ POWER5_PME_PM_GRP_DISP ] = { -1, 64, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_ADDR ] = { -1, -1, 64, 68, -1, -1 }, [ POWER5_PME_PM_FPU0_FPSCR ] = { -1, -1, 32, 37, -1, -1 }, [ POWER5_PME_PM_DATA_FROM_L2 ] = { 13, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU1_DENORM ] = { 46, 45, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU_1FLOP ] = { 56, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_OTHER ] = { -1, -1, 81, 85, -1, -1 }, [ 
POWER5_PME_PM_L2SC_RCST_DISP_FAIL_RC_FULL ] = { 102, 100, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU0_FSQRT ] = { 40, 39, -1, -1, -1, -1 }, [ POWER5_PME_PM_LD_REF_L1 ] = { -1, -1, -1, 106, -1, -1 }, [ POWER5_PME_PM_INST_FROM_L1 ] = { -1, 74, -1, -1, -1, -1 }, [ POWER5_PME_PM_TLBIE_HELD ] = { -1, -1, 186, 191, -1, -1 }, [ POWER5_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { -1, -1, 117, 121, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L25_MOD_CYC ] = { -1, -1, -1, 135, -1, -1 }, [ POWER5_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { -1, -1, 144, 155, -1, -1 }, [ POWER5_PME_PM_MEM_RQ_DISP_Q0to3 ] = { 155, 153, -1, -1, -1, -1 }, [ POWER5_PME_PM_ST_REF_L1_LSU1 ] = { -1, -1, 167, 173, -1, -1 }, [ POWER5_PME_PM_MRK_LD_MISS_L1 ] = { 175, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_L1_WRITE_CYC ] = { -1, -1, 62, 66, -1, -1 }, [ POWER5_PME_PM_L2SC_ST_REQ ] = { 105, 103, -1, -1, -1, -1 }, [ POWER5_PME_PM_CMPLU_STALL_FDIV ] = { -1, 11, -1, -1, -1, -1 }, [ POWER5_PME_PM_THRD_SEL_OVER_CLB_EMPTY ] = { -1, -1, 178, 184, -1, -1 }, [ POWER5_PME_PM_BR_MPRED_CR ] = { -1, -1, 1, 2, -1, -1 }, [ POWER5_PME_PM_L3SB_MOD_TAG ] = { 110, 108, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_DATA_FROM_L2MISS ] = { -1, -1, 188, -1, -1, -1 }, [ POWER5_PME_PM_LSU_REJECT_SRQ ] = { 146, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_LD_MISS_L1 ] = { -1, -1, 100, -1, -1, -1 }, [ POWER5_PME_PM_INST_FROM_PREF ] = { -1, -1, 59, -1, -1, -1 }, [ POWER5_PME_PM_DC_INV_L2 ] = { -1, -1, 12, 16, -1, -1 }, [ POWER5_PME_PM_STCX_PASS ] = { 199, 193, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU_SRQ_FULL_CYC ] = { -1, -1, 118, 123, -1, -1 }, [ POWER5_PME_PM_FPU_FIN ] = { -1, -1, -1, 44, -1, -1 }, [ POWER5_PME_PM_L2SA_SHR_MOD ] = { 88, 86, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU_SRQ_STFWD ] = { 149, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_0INST_CLB_CYC ] = { 0, 0, -1, -1, -1, -1 }, [ POWER5_PME_PM_FXU0_FIN ] = { -1, -1, 43, 48, -1, -1 }, [ POWER5_PME_PM_L2SB_RCST_DISP_FAIL_RC_FULL ] = { 94, 92, -1, -1, -1, -1 }, [ POWER5_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { -1, 196, -1, -1, -1, -1 }, 
[ POWER5_PME_PM_PMC5_OVERFLOW ] = { 182, -1, -1, -1, -1, -1 }, [ POWER5_PME_PM_FPU0_FDIV ] = { 38, 37, -1, -1, -1, -1 }, [ POWER5_PME_PM_PTEG_FROM_L375_SHR ] = { -1, -1, 156, -1, -1, -1 }, [ POWER5_PME_PM_LD_REF_L1_LSU1 ] = { -1, -1, 104, 108, -1, -1 }, [ POWER5_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY ] = { 87, 85, -1, -1, -1, -1 }, [ POWER5_PME_PM_HV_CYC ] = { -1, 68, -1, -1, -1, -1 }, [ POWER5_PME_PM_THRD_PRIO_DIFF_0_CYC ] = { -1, -1, 171, 177, -1, -1 }, [ POWER5_PME_PM_LR_CTR_MAP_FULL_CYC ] = { 116, 114, -1, -1, -1, -1 }, [ POWER5_PME_PM_L3SB_SHR_INV ] = { -1, -1, 94, 98, -1, -1 }, [ POWER5_PME_PM_DATA_FROM_RMEM ] = { 19, -1, -1, 15, -1, -1 }, [ POWER5_PME_PM_DATA_FROM_L275_MOD ] = { 15, -1, -1, 13, -1, -1 }, [ POWER5_PME_PM_LSU0_REJECT_SRQ ] = { 126, 124, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU1_DERAT_MISS ] = { 129, 127, -1, -1, -1, -1 }, [ POWER5_PME_PM_MRK_LSU_FIN ] = { -1, -1, -1, 158, -1, -1 }, [ POWER5_PME_PM_DTLB_MISS_16M ] = { 23, 22, -1, -1, -1, -1 }, [ POWER5_PME_PM_LSU0_FLUSH_UST ] = { 122, 120, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SC_MOD_TAG ] = { 98, 96, -1, -1, -1, -1 }, [ POWER5_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY ] = { 95, 93, -1, -1, -1, -1 } };

static const unsigned long long power5_group_vecs[][POWER5_NUM_GROUP_VEC] = { [ POWER5_PME_PM_LSU_REJECT_RELOAD_CDF ] = { 0x0000000000040000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU1_SINGLE ] = { 0x0000000000000000ULL, 0x0000000000400000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SB_REF ] = { 0x0000000000000000ULL, 0x0000000000001000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_PRIO_DIFF_3or4_CYC ] = { 0x0000000000000000ULL, 0x0000000040000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_INST_FROM_L275_SHR ] = { 0x0040000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L375_MOD ] = { 0x0000000000000000ULL, 0x0400000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DTLB_MISS_4K ] = { 0x0000400000000000ULL, 0x0000000000000000ULL, 
0x0000000000000000ULL }, [ POWER5_PME_PM_CLB_FULL_CYC ] = { 0x0000000000000800ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_ST_CMPL ] = { 0x0000000000000000ULL, 0x4000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_FLUSH_LRQ_FULL ] = { 0x0000000008000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L275_SHR ] = { 0x0000000000000000ULL, 0x0080000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_1INST_CLB_CYC ] = { 0x0000000000001000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_SPEC_RD_CANCEL ] = { 0x0000000000000000ULL, 0x0000200000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DTLB_MISS_16M ] = { 0x0000000000000000ULL, 0x0800000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU_FDIV ] = { 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000800ULL }, [ POWER5_PME_PM_FPU_SINGLE ] = { 0x0000000000000000ULL, 0x0000000000020000ULL, 0x0000000000000400ULL }, [ POWER5_PME_PM_FPU0_FMA ] = { 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000000080ULL }, [ POWER5_PME_PM_SLB_MISS ] = { 0x0000010000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU1_FLUSH_LRQ ] = { 0x0000000000400000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SA_ST_HIT ] = { 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DTLB_MISS ] = { 0x0000080000000000ULL, 0x0000000000000000ULL, 0x0000000000000004ULL }, [ POWER5_PME_PM_BR_PRED_TA ] = { 0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000000020ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L375_MOD_CYC ] = { 0x0000000000000000ULL, 0x0400000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_CMPLU_STALL_FXU ] = { 0x0000000040000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_EXT_INT ] = { 0x0000000000000000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { 
0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU1_LDF ] = { 0x0000000000000000ULL, 0x0000000000400000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_ST_GPS ] = { 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FAB_CMD_ISSUED ] = { 0x0000000000000000ULL, 0x0000002000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU0_SRQ_STFWD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_CR_MAP_FULL_CYC ] = { 0x0000000400000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SA_RCST_DISP_FAIL_RC_FULL ] = { 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_LSU0_FLUSH_ULD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_FLUSH_SRQ_FULL ] = { 0x0000000008000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FLUSH_IMBAL ] = { 0x0000000000084000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_RQ_DISP_Q16to19 ] = { 0x0000000000000000ULL, 0x0000100000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC ] = { 0x0000000000000000ULL, 0x0000000080000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DATA_FROM_L35_MOD ] = { 0x0008000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_HI_PRIO_WR_CMPL ] = { 0x0000000000000000ULL, 0x0000080000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU1_FDIV ] = { 0x0000000000000000ULL, 0x0000000000100000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU0_FRSP_FCONV ] = { 0x0000000000000000ULL, 0x0000000000100000ULL, 0x0000000000000080ULL }, [ POWER5_PME_PM_MEM_RQ_DISP ] = { 0x0000000000000000ULL, 0x0000200000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LWSYNC_HELD ] = { 0x0000000000010000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FXU_FIN ] = { 0x0000000000000000ULL, 
0x0000000008000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DSLB_MISS ] = { 0x0000200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FXLS1_FULL_CYC ] = { 0x0000000200000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DATA_FROM_L275_SHR ] = { 0x0004000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_SEL_T0 ] = { 0x0000000000000000ULL, 0x0000000400000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PTEG_RELOAD_VALID ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_LMQ_LHR_MERGE ] = { 0x0000000000000200ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_STCX_FAIL ] = { 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_2INST_CLB_CYC ] = { 0x0000000000000008ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FAB_PNtoVN_DIRECT ] = { 0x0000000000000000ULL, 0x0000008000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PTEG_FROM_L2MISS ] = { 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_CMPLU_STALL_LSU ] = { 0x0000000010000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DSLB_MISS ] = { 0x0000000000000000ULL, 0x1800000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_FLUSH_ULD ] = { 0x0000000001000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PTEG_FROM_LMEM ] = { 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_BRU_FIN ] = { 0x0000000000000000ULL, 0x0008000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_WQ_DISP_WRITE ] = { 0x0000000000000000ULL, 0x0000800000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L275_MOD_CYC ] = { 0x0000000000000000ULL, 0x0200000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU1_NCLD ] = { 0x0000001000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL 
}, [ POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_OTHER ] = { 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_SNOOP_PW_RETRY_WQ_PWQ ] = { 0x0000000000000000ULL, 0x0000100000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPR_MAP_FULL_CYC ] = { 0x0000000800000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU1_FULL_CYC ] = { 0x0000000200000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SA_ALL_BUSY ] = { 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_3INST_CLB_CYC ] = { 0x0000000000000000ULL, 0x0000000010000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_PWQ_DISP_Q2or3 ] = { 0x0000000000000000ULL, 0x0001000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SA_SHR_INV ] = { 0x0000000000000000ULL, 0x0000000000000100ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRESH_TIMEO ] = { 0x0000000000000000ULL, 0x0002000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL ] = { 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_SEL_OVER_GCT_IMBAL ] = { 0x0000000000000000ULL, 0x0000000800000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU_FSQRT ] = { 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000000800ULL }, [ POWER5_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PMC1_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SC_SNOOP_RETRY ] = { 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DATA_TABLEWALK_CYC ] = { 0x0000080000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_PRIO_6_CYC ] = { 0x0000000000000000ULL, 0x0000000040000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU_FEST ] = { 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL }, [ 
POWER5_PME_PM_FAB_M1toP1_SIDECAR_EMPTY ] = { 0x0000000000000000ULL, 0x0000010000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_RMEM ] = { 0x0000000000000000ULL, 0x0080000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L35_MOD_CYC ] = { 0x0000000000000008ULL, 0x0040000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_PWQ_DISP ] = { 0x0000000000000000ULL, 0x0001000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FAB_P1toM1_SIDECAR_EMPTY ] = { 0x0000000000000000ULL, 0x0000004000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LD_MISS_L1_LSU0 ] = { 0x0000200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_SNOOP_PARTIAL_RTRY_QFULL ] = { 0x0000000000000000ULL, 0x0000020000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU1_STALL3 ] = { 0x0000000000000000ULL, 0x0000000000200000ULL, 0x0000000000000040ULL }, [ POWER5_PME_PM_GCT_USAGE_80to99_CYC ] = { 0x0000000000000040ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_WORK_HELD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_INST_CMPL ] = { 0xffffffffffffffffULL, 0xffffffffffffffffULL, 0x000000000001ffffULL }, [ POWER5_PME_PM_LSU1_FLUSH_UST ] = { 0x0000000004000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FXU_IDLE ] = { 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU0_FLUSH_ULD ] = { 0x0000000002000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU1_REJECT_LMQ_FULL ] = { 0x0000000000020000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GRP_DISP_REJECT ] = { 0x0000000000000004ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SA_MOD_INV ] = { 0x0000000000000000ULL, 0x0000000000000100ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PTEG_FROM_L25_SHR ] = { 0x0100000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ 
POWER5_PME_PM_FAB_CMD_RETRIED ] = { 0x0000000000000000ULL, 0x0000002000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SA_SHR_INV ] = { 0x0000000000000000ULL, 0x0000000000000020ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL ] = { 0x0000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SA_RCST_DISP_FAIL_ADDR ] = { 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_RC_FULL ] = { 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PTEG_FROM_L375_MOD ] = { 0x0200000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_LSU1_FLUSH_UST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_BR_ISSUED ] = { 0x0000000001020000ULL, 0x0000000000000000ULL, 0x0000000000000020ULL }, [ POWER5_PME_PM_MRK_GRP_BR_REDIR ] = { 0x0000000000000000ULL, 0x0000000008000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_EE_OFF ] = { 0x0000000000000000ULL, 0x0000010000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_RQ_DISP_Q4to7 ] = { 0x0000000000000000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_FAST_PATH_RD_DISP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_INST_FROM_L3 ] = { 0x0010000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_ITLB_MISS ] = { 0x0000000000100000ULL, 0x0000000000000000ULL, 0x0000000000000004ULL }, [ POWER5_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FXLS_FULL_CYC ] = { 0x0000000000000000ULL, 0x0000000008000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DTLB_REF_4K ] = { 0x0000400000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GRP_DISP_VALID ] = { 0x0000000000000004ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ 
POWER5_PME_PM_LSU_FLUSH_UST ] = { 0x0000000001080000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FXU1_FIN ] = { 0x0000000000000000ULL, 0x0000000010000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_PRIO_4_CYC ] = { 0x0000000000000000ULL, 0x0000000020000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L35_MOD ] = { 0x0000000000000000ULL, 0x0040000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_4INST_CLB_CYC ] = { 0x0000000000000000ULL, 0x0000000010000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DTLB_REF_16M ] = { 0x0000000000000000ULL, 0x1000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_INST_FROM_L375_MOD ] = { 0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SC_RCST_DISP_FAIL_ADDR ] = { 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GRP_CMPL ] = { 0x0000000000000002ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU1_1FLOP ] = { 0x0000000000000000ULL, 0x0000000001000000ULL, 0x0000000000000100ULL }, [ POWER5_PME_PM_FPU_FRSP_FCONV ] = { 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000000800ULL }, [ POWER5_PME_PM_5INST_CLB_CYC ] = { 0x0000000000000010ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SC_REF ] = { 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_L2MISS_BOTH_CYC ] = { 0x0000000000000000ULL, 0x0000000200000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_PW_GATH ] = { 0x0000000000000000ULL, 0x0001000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FAB_PNtoNN_SIDECAR ] = { 0x0000000000000000ULL, 0x0000008000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FAB_DCLAIM_ISSUED ] = { 0x0000000000000000ULL, 0x0000002000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GRP_IC_MISS ] = { 0x0000008000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_INST_FROM_L35_SHR ] = { 
0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_LMQ_FULL_CYC ] = { 0x0000000100000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L2_CYC ] = { 0x0000000000000000ULL, 0x0010000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_SRQ_SYNC_CYC ] = { 0x0000000000000100ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU0_BUSY_REJECT ] = { 0x0000002000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_REJECT_ERAT_MISS ] = { 0x0000000000004000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { 0x0000000000000000ULL, 0x0080000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DATA_FROM_L375_SHR ] = { 0x0008000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU0_FMOV_FEST ] = { 0x0000000000000000ULL, 0x0000000000080000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PTEG_FROM_L25_MOD ] = { 0x0100000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LD_REF_L1_LSU0 ] = { 0x0000400000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_PRIO_7_CYC ] = { 0x0000000000000000ULL, 0x0000000020000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU1_FLUSH_SRQ ] = { 0x0000000000800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SC_RCST_DISP ] = { 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_CMPLU_STALL_DIV ] = { 0x0000000040000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_RQ_DISP_Q12to15 ] = { 0x0000000000000000ULL, 0x0000100000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_INST_FROM_L375_SHR ] = { 0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_ST_REF_L1 ] = { 0x0000100000000000ULL, 0x0000000000000000ULL, 0x0000000000008207ULL }, [ POWER5_PME_PM_L3SB_ALL_BUSY ] = { 
0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FAB_P1toVNorNN_SIDECAR_EMPTY ] = { 0x0000000000000000ULL, 0x0000004000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L275_SHR_CYC ] = { 0x0000000000000000ULL, 0x0280000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FAB_HOLDtoNN_EMPTY ] = { 0x0000000000000000ULL, 0x0000010000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DATA_FROM_LMEM ] = { 0x0003000000000000ULL, 0x0000000000000000ULL, 0x000000000000000aULL }, [ POWER5_PME_PM_RUN_CYC ] = { 0xffffffffffffffffULL, 0xffffffffffffffffULL, 0x000000000001ffffULL }, [ POWER5_PME_PM_PTEG_FROM_RMEM ] = { 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SC_RCLD_DISP ] = { 0x0000000000000000ULL, 0x0000000000000004ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU0_LDF ] = { 0x0000000000000000ULL, 0x0000000002400000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_LRQ_S0_VALID ] = { 0x0000000000000080ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PMC3_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_IMR_RELOAD ] = { 0x0000000000000000ULL, 0x0002000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_GRP_TIMEO ] = { 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_ST_MISS_L1 ] = { 0x0000100000000000ULL, 0x0000000000000000ULL, 0x0000000000004008ULL }, [ POWER5_PME_PM_STOP_COMPLETION ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_BUSY_REJECT ] = { 0x0000000000001000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_ISLB_MISS ] = { 0x0000200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_CYC ] = { 0x0000000020000003ULL, 0x0000001000000000ULL, 0x000000000001f010ULL }, [ POWER5_PME_PM_THRD_ONE_RUN_CYC ] = { 0x0000000000000000ULL, 0x0000000200000000ULL, 
0x0000000000000000ULL }, [ POWER5_PME_PM_GRP_BR_REDIR_NONSPEC ] = { 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU1_SRQ_STFWD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SC_MOD_INV ] = { 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2_PREF ] = { 0x0000000000003000ULL, 0x0000000000000000ULL, 0x0000000000000010ULL }, [ POWER5_PME_PM_GCT_NOSLOT_BR_MPRED ] = { 0x0000000000000020ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L25_MOD ] = { 0x0000000000000000ULL, 0x0010000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SB_MOD_INV ] = { 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SB_ST_REQ ] = { 0x0000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_L1_RELOAD_VALID ] = { 0x0000000000000000ULL, 0x0008000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SB_HIT ] = { 0x0000000000000000ULL, 0x0000000000001000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SB_SHR_MOD ] = { 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_EE_OFF_EXT_INT ] = { 0x0000000000000000ULL, 0x0000200000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_1PLUS_PPC_CMPL ] = { 0x0000000000000002ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SC_SHR_MOD ] = { 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PMC6_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_LRQ_FULL_CYC ] = { 0x0000000100000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_IC_PREF_INSTALL ] = { 0x0000004000000800ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_TLB_MISS ] = { 0x0000010000000000ULL, 0x0000000000000000ULL, 0x0000000000008000ULL }, [ POWER5_PME_PM_GCT_FULL_CYC ] = { 
0x0000000000000040ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FXU_BUSY ] = { 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L3_CYC ] = { 0x0000000000000000ULL, 0x0040000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_REJECT_LMQ_FULL ] = { 0x0000000000004000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_SRQ_S0_ALLOC ] = { 0x0000000000000100ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GRP_MRK ] = { 0x0000000010000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_INST_FROM_L25_SHR ] = { 0x0040000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU1_FIN ] = { 0x0000000000000000ULL, 0x0000000000010000ULL, 0x0000000000000500ULL }, [ POWER5_PME_PM_DC_PREF_STREAM_ALLOC ] = { 0x0000000000000400ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_BR_MPRED_TA ] = { 0x0000010000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_CRQ_FULL_CYC ] = { 0x0000000400000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SA_RCLD_DISP ] = { 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_SNOOP_WR_RETRY_QFULL ] = { 0x0000000000000000ULL, 0x0000020000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DTLB_REF_4K ] = { 0x0000000000000000ULL, 0x1000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_SRQ_S0_VALID ] = { 0x0000000000000100ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU0_FLUSH_LRQ ] = { 0x0000000000400000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_INST_FROM_L275_MOD ] = { 0x0040000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GCT_EMPTY_CYC ] = { 0x0000000000000002ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LARX_LSU0 ] = { 0x0000000100000000ULL, 0x0000000000000000ULL, 
0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_PRIO_DIFF_5or6_CYC ] = { 0x0000000000000000ULL, 0x0000000040000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_SNOOP_RETRY_1AHEAD ] = { 0x0000000000000000ULL, 0x0000040000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU1_FSQRT ] = { 0x0000000000000000ULL, 0x0000000000040000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_FPU_FIN ] = { 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_PRIO_5_CYC ] = { 0x0000000000000000ULL, 0x0000000080000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_LMEM ] = { 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU1_FRSP_FCONV ] = { 0x0000000000000000ULL, 0x0000000000900000ULL, 0x0000000000000080ULL }, [ POWER5_PME_PM_SNOOP_TLBIE ] = { 0x0000000000000000ULL, 0x0000000400000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SB_SNOOP_RETRY ] = { 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FAB_VBYPASS_EMPTY ] = { 0x0000000000000000ULL, 0x0000004000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L275_MOD ] = { 0x0000000000000000ULL, 0x0200000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_6INST_CLB_CYC ] = { 0x0000000000000010ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SB_RCST_DISP ] = { 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FLUSH ] = { 0x0001000000040000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SC_MOD_INV ] = { 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU_DENORM ] = { 0x0000000000000000ULL, 0x0000000000010000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SC_HIT ] = { 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000000000000000ULL }, [ 
POWER5_PME_PM_SNOOP_WR_RETRY_RQ ] = { 0x0000000000000000ULL, 0x0000080000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU1_REJECT_SRQ ] = { 0x0000000000002000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_IC_PREF_REQ ] = { 0x0000004000000000ULL, 0x0000000000000000ULL, 0x0000000000000010ULL }, [ POWER5_PME_PM_L3SC_ALL_BUSY ] = { 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_GRP_IC_MISS ] = { 0x0000000000000000ULL, 0x0008000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GCT_NOSLOT_IC_MISS ] = { 0x0000000000000020ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L3 ] = { 0x0000000000000000ULL, 0x0040000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GCT_NOSLOT_SRQ_FULL ] = { 0x0000000000000020ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_SEL_OVER_ISU_HOLD ] = { 0x0000000000000000ULL, 0x0000001000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { 0x0000000020000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SA_MOD_INV ] = { 0x0000000000000000ULL, 0x0000000000000020ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_FLUSH_LRQ ] = { 0x0000000000200000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_PRIO_2_CYC ] = { 0x0000000000000000ULL, 0x0000000080000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_FLUSH_SRQ ] = { 0x0000000000200000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { 0x0000000000000010ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SA_REF ] = { 0x0000000000000000ULL, 0x0000000000001000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL ] = { 0x0000000000000000ULL, 0x0000000000000010ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU0_STALL3 ] = { 0x0000000000000000ULL, 0x0000000000200000ULL, 0x0000000000000040ULL }, [ 
POWER5_PME_PM_GPR_MAP_FULL_CYC ] = { 0x0000000400000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_TB_BIT_TRANS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_LSU_FLUSH_LRQ ] = { 0x0000000008000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU0_STF ] = { 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DTLB_MISS ] = { 0x0000000000000000ULL, 0x0800000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU1_FMA ] = { 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000000080ULL }, [ POWER5_PME_PM_L2SA_MOD_TAG ] = { 0x0000000000000000ULL, 0x0000000000000100ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU1_FLUSH_ULD ] = { 0x0000000002000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_LSU0_FLUSH_UST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_INST_FIN ] = { 0x0000000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU0_FULL_CYC ] = { 0x0000000200000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_LRQ_S0_ALLOC ] = { 0x0000000000000080ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_LSU1_FLUSH_ULD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DTLB_REF ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_BR_UNCOND ] = { 0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000000020ULL }, [ POWER5_PME_PM_THRD_SEL_OVER_L2MISS ] = { 0x0000000000000000ULL, 0x0000001000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SB_SHR_INV ] = { 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_LO_PRIO_WR_CMPL ] = { 0x0000000000000000ULL, 0x0000080000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SC_MOD_TAG ] = { 0x0000000000000000ULL, 
0x0000000000000080ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_ST_MISS_L1 ] = { 0x0000000000000000ULL, 0x4004000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GRP_DISP_SUCCESS ] = { 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_PRIO_DIFF_1or2_CYC ] = { 0x0000000000000000ULL, 0x0000000020000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { 0x0000002000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_WQ_DISP_Q8to15 ] = { 0x0000000000000000ULL, 0x0000800000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU0_SINGLE ] = { 0x0000000000000000ULL, 0x0000000000400000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_DERAT_MISS ] = { 0x0000100000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_PRIO_1_CYC ] = { 0x0000000000000000ULL, 0x0000000100000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SC_RCST_DISP_FAIL_OTHER ] = { 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU1_FEST ] = { 0x0000000000000000ULL, 0x0000000000040000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FAB_HOLDtoVN_EMPTY ] = { 0x0000000000000000ULL, 0x0000004000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_SNOOP_RD_RETRY_RQ ] = { 0x0000000000000000ULL, 0x0000040000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_SNOOP_DCLAIM_RETRY_QFULL ] = { 0x0000000000000000ULL, 0x0000020000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L25_SHR_CYC ] = { 0x0000000000000000ULL, 0x0020000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_ST_CMPL_INT ] = { 0x0000000000000000ULL, 0x2000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FLUSH_BR_MPRED ] = { 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_ADDR ] = { 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU_STF ] = { 
0x0000000000000000ULL, 0x0000000000020000ULL, 0x0000000000002400ULL }, [ POWER5_PME_PM_CMPLU_STALL_FPU ] = { 0x0000000080000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC ] = { 0x0000000000000000ULL, 0x0000000080000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GCT_NOSLOT_CYC ] = { 0x0000000000000020ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PTEG_FROM_L35_SHR ] = { 0x0200000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_LSU_FLUSH_UST ] = { 0x0000000000000000ULL, 0x4000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SA_HIT ] = { 0x0000000000000000ULL, 0x0000000000001000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L25_SHR ] = { 0x0000000000000000ULL, 0x0020000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SB_RCST_DISP_FAIL_ADDR ] = { 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L35_SHR ] = { 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_IERAT_XLATE_WR ] = { 0x0000004000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SA_ST_REQ ] = { 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_SEL_T1 ] = { 0x0000000000000000ULL, 0x0000000400000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { 0x0000002000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_INST_FROM_LMEM ] = { 0x0020000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU0_1FLOP ] = { 0x0000000000000000ULL, 0x0000000001000000ULL, 0x0000000000000100ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L35_SHR_CYC ] = { 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PTEG_FROM_L2 ] = { 
0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_PW_CMPL ] = { 0x0000000000000000ULL, 0x0001000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC ] = { 0x0000000000000000ULL, 0x0000000100000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_OTHER ] = { 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU0_FIN ] = { 0x0000000000000000ULL, 0x0000000001010000ULL, 0x0000000000000540ULL }, [ POWER5_PME_PM_MRK_DTLB_MISS_4K ] = { 0x0000000000000000ULL, 0x0800000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SC_SHR_INV ] = { 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GRP_BR_REDIR ] = { 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_RC_FULL ] = { 0x0000000000000000ULL, 0x0000000000000004ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_LSU_FLUSH_SRQ ] = { 0x0000000000004000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PTEG_FROM_L275_SHR ] = { 0x0100000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_RC_FULL ] = { 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_SNOOP_RD_RETRY_WQ ] = { 0x0000000000000000ULL, 0x0000040000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU0_NCLD ] = { 0x0000001000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FAB_DCLAIM_RETRIED ] = { 0x0000000000000000ULL, 0x0000002000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU1_BUSY_REJECT ] = { 0x0000002000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FXLS0_FULL_CYC ] = { 0x0000000200000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU0_FEST ] = { 0x0000000000000000ULL, 0x0000000000040000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DTLB_REF_16M ] = { 
0x0000800000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_ADDR ] = { 0x0000000000000000ULL, 0x0000000000000004ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU0_REJECT_ERAT_MISS ] = { 0x0000000000010000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DATA_FROM_L25_MOD ] = { 0x0004000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GCT_USAGE_60to79_CYC ] = { 0x0000000000000040ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DATA_FROM_L375_MOD ] = { 0x0008000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { 0x0000000000000200ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU0_REJECT_RELOAD_CDF ] = { 0x0000000000008000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_0INST_FETCH ] = { 0x0020004000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU1_REJECT_RELOAD_CDF ] = { 0x0000000000008000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L1_PREF ] = { 0x0000000000000800ULL, 0x0000000000000000ULL, 0x0000000000000010ULL }, [ POWER5_PME_PM_MEM_WQ_DISP_Q0to7 ] = { 0x0000000000000000ULL, 0x0000800000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_BRQ_FULL_CYC ] = { 0x0000000100000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GRP_IC_MISS_NONSPEC ] = { 0x0000008000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PTEG_FROM_L275_MOD ] = { 0x0100000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L375_SHR_CYC ] = { 0x0000000000000000ULL, 0x0400000000000000ULL, 0x0000000000000000ULL }, [ 
POWER5_PME_PM_LSU_FLUSH ] = { 0x0000000006e40000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DATA_FROM_L3 ] = { 0x0003000000000000ULL, 0x0000000000000000ULL, 0x000000000000000aULL }, [ POWER5_PME_PM_INST_FROM_L2 ] = { 0x0020000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PMC2_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU0_DENORM ] = { 0x0000000000000000ULL, 0x0000000000080000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU1_FMOV_FEST ] = { 0x0000000000000000ULL, 0x0000000000080000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_INST_FETCH_CYC ] = { 0x0000000000000400ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_LDF ] = { 0x0000000000000000ULL, 0x0000000000020000ULL, 0x0000000000002000ULL }, [ POWER5_PME_PM_INST_DISP ] = { 0x0000000000000005ULL, 0x0000000000000000ULL, 0x0000000000002000ULL }, [ POWER5_PME_PM_DATA_FROM_L25_SHR ] = { 0x0004000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L1_DCACHE_RELOAD_VALID ] = { 0x0000008000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_WQ_DISP_DCLAIM ] = { 0x0000000000000000ULL, 0x0000800000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU_FULL_CYC ] = { 0x0000000080000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_GRP_ISSUED ] = { 0x0000000000000000ULL, 0x0008000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_PRIO_3_CYC ] = { 0x0000000000000000ULL, 0x0000000040000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU_FMA ] = { 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000010200ULL }, [ POWER5_PME_PM_INST_FROM_L35_MOD ] = { 0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_CRU_FIN ] = { 0x0000000000000000ULL, 0x2000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_SNOOP_WR_RETRY_WQ ] = { 0x0000000000000000ULL, 
0x0000080000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_CMPLU_STALL_REJECT ] = { 0x0000000010000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU1_REJECT_ERAT_MISS ] = { 0x0000000000010000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_FXU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000001000ULL }, [ POWER5_PME_PM_L2SB_RCST_DISP_FAIL_OTHER ] = { 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY ] = { 0x0000000000000000ULL, 0x0000000000000010ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PMC4_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SA_SNOOP_RETRY ] = { 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PTEG_FROM_L35_MOD ] = { 0x0200000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_INST_FROM_L25_MOD ] = { 0x0040000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_SMT_HANG ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_CMPLU_STALL_ERAT_MISS ] = { 0x0000000020000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SA_MOD_TAG ] = { 0x0000000000000000ULL, 0x0000000000000020ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FLUSH_SYNC ] = { 0x0000000000100000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_INST_FROM_L2MISS ] = { 0x0000000000000400ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SC_ST_HIT ] = { 0x0000000000000000ULL, 0x0000000000000010ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_RQ_DISP_Q8to11 ] = { 0x0000000000000000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_GRP_DISP ] = { 0x0000000000000000ULL, 0x0006000008000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SB_MOD_TAG ] = { 0x0000000000000000ULL, 
0x0000000000000200ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_CLB_EMPTY_CYC ] = { 0x0000000000000008ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SB_ST_HIT ] = { 0x0000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_NONSPEC_RD_CANCEL ] = { 0x0000000000000000ULL, 0x0000200000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_BR_PRED_CR_TA ] = { 0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_LSU_FLUSH_ULD ] = { 0x0000000000000000ULL, 0x4000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_INST_DISP_ATTEMPT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000004000ULL }, [ POWER5_PME_PM_INST_FROM_RMEM ] = { 0x0010000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_ST_REF_L1_LSU0 ] = { 0x0000800000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU0_DERAT_MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SB_RCLD_DISP ] = { 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU_STALL3 ] = { 0x0000000000000000ULL, 0x0000000000010000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_BR_PRED_CR ] = { 0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000000020ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L2 ] = { 0x0000000000000000ULL, 0x0010000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU0_FLUSH_SRQ ] = { 0x0000000000800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FAB_PNtoNN_DIRECT ] = { 0x0000000000000000ULL, 0x0000008000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_IOPS_CMPL ] = { 0x00210488fffa9811ULL, 0x3220041f03a200e0ULL, 0x0000000000010000ULL }, [ POWER5_PME_PM_L2SC_SHR_INV ] = { 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000000000000000ULL }, [ 
POWER5_PME_PM_L2SA_RCST_DISP_FAIL_OTHER ] = { 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SA_RCST_DISP ] = { 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_SNOOP_RETRY_AB_COLLISION ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FAB_PNtoVN_SIDECAR ] = { 0x0000000000000000ULL, 0x0000008000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_LMQ_S0_ALLOC ] = { 0x0000000000000080ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU0_REJECT_LMQ_FULL ] = { 0x0000000000020000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_SNOOP_PW_RETRY_RQ ] = { 0x0000000000000000ULL, 0x0000100000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DTLB_REF ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PTEG_FROM_L3 ] = { 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FAB_M1toVNorNN_SIDECAR_EMPTY ] = { 0x0000000000000000ULL, 0x0000010000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_SRQ_EMPTY_CYC ] = { 0x0000000000000200ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU1_STF ] = { 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_LMQ_S0_VALID ] = { 0x0000000000000080ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GCT_USAGE_00to59_CYC ] = { 0x0000000000000040ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DATA_FROM_L2MISS ] = { 0x0002000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GRP_DISP_BLK_SB_CYC ] = { 0x0000000000000004ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU_FMOV_FEST ] = { 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_XER_MAP_FULL_CYC ] = { 0x0000000800000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ 
POWER5_PME_PM_FLUSH_SB ] = { 0x0000000000100000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L375_SHR ] = { 0x0000000000000000ULL, 0x0400000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_GRP_CMPL ] = { 0x0000000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_SUSPENDED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GRP_IC_MISS_BR_REDIR_NONSPEC ] = { 0x0000000040000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_SNOOP_RD_RETRY_QFULL ] = { 0x0000000000000000ULL, 0x0000020000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SB_MOD_INV ] = { 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DATA_FROM_L35_SHR ] = { 0x0008000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LD_MISS_L1_LSU1 ] = { 0x0000200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_STCX_FAIL ] = { 0x0000001000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DC_PREF_DST ] = { 0x0000000000002000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_GRP_DISP ] = { 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_ADDR ] = { 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU0_FPSCR ] = { 0x0000000000000000ULL, 0x0000000000200000ULL, 0x0000000000000040ULL }, [ POWER5_PME_PM_DATA_FROM_L2 ] = { 0x0000100000000000ULL, 0x0000000000000000ULL, 0x0000000000000001ULL }, [ POWER5_PME_PM_FPU1_DENORM ] = { 0x0000000000000000ULL, 0x0000000000080000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU_1FLOP ] = { 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000010200ULL }, [ POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_OTHER ] = { 0x0000000000000000ULL, 0x0000000000000004ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SC_RCST_DISP_FAIL_RC_FULL ] = 
{ 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU0_FSQRT ] = { 0x0000000000000000ULL, 0x0000000000040000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LD_REF_L1 ] = { 0x0000080000000000ULL, 0x0000000000000000ULL, 0x0000000000008207ULL }, [ POWER5_PME_PM_INST_FROM_L1 ] = { 0x0010000000000000ULL, 0x0000000000000000ULL, 0x0000000000000001ULL }, [ POWER5_PME_PM_TLBIE_HELD ] = { 0x0000000000010000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { 0x0000000000000400ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L25_MOD_CYC ] = { 0x0000000000000000ULL, 0x0010000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MEM_RQ_DISP_Q0to3 ] = { 0x0000000000000000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_ST_REF_L1_LSU1 ] = { 0x0000800000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_LD_MISS_L1 ] = { 0x0000000000000000ULL, 0x2000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L1_WRITE_CYC ] = { 0x0000000000008000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SC_ST_REQ ] = { 0x0000000000000000ULL, 0x0000000000000010ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_CMPLU_STALL_FDIV ] = { 0x0000000080000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_SEL_OVER_CLB_EMPTY ] = { 0x0000000000000000ULL, 0x0000000800000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_BR_MPRED_CR ] = { 0x0000010000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SB_MOD_TAG ] = { 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_DATA_FROM_L2MISS ] = { 0x0000000800000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_REJECT_SRQ ] = { 0x0000000000040000ULL, 
0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LD_MISS_L1 ] = { 0x0000080000000000ULL, 0x0000000000000000ULL, 0x0000000000004008ULL }, [ POWER5_PME_PM_INST_FROM_PREF ] = { 0x0010000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DC_INV_L2 ] = { 0x0800000000080000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_STCX_PASS ] = { 0x0000001000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_SRQ_FULL_CYC ] = { 0x0000000000000100ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU_FIN ] = { 0x0000000000000000ULL, 0x0020000000008000ULL, 0x0000000000001800ULL }, [ POWER5_PME_PM_L2SA_SHR_MOD ] = { 0x0000000000000000ULL, 0x0000000000000100ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU_SRQ_STFWD ] = { 0x0000000000000200ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_0INST_CLB_CYC ] = { 0x0000000000000008ULL, 0x0000000800000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FXU0_FIN ] = { 0x0000000000000000ULL, 0x0000000010000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SB_RCST_DISP_FAIL_RC_FULL ] = { 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { 0x0000000000000000ULL, 0x0000000200000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PMC5_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_FPU0_FDIV ] = { 0x0000000000000000ULL, 0x0000000000100000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_PTEG_FROM_L375_SHR ] = { 0x0200000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LD_REF_L1_LSU1 ] = { 0x0000400000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY ] = { 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_HV_CYC ] = { 0x0000000000000000ULL, 0x0000000100000000ULL, 0x0000000000000000ULL }, [ 
POWER5_PME_PM_THRD_PRIO_DIFF_0_CYC ] = { 0x0000000000000000ULL, 0x0000000020000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LR_CTR_MAP_FULL_CYC ] = { 0x0000000400000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L3SB_SHR_INV ] = { 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DATA_FROM_RMEM ] = { 0x0002000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DATA_FROM_L275_MOD ] = { 0x0004000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU0_REJECT_SRQ ] = { 0x0000000000002000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU1_DERAT_MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_MRK_LSU_FIN ] = { 0x0000000000000000ULL, 0x0002000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_DTLB_MISS_16M ] = { 0x0000800000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_LSU0_FLUSH_UST ] = { 0x0000000004000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SC_MOD_TAG ] = { 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000000000000000ULL }, [ POWER5_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY ] = { 0x0000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL } }; static const pme_power_entry_t power5_pe[] = { [ POWER5_PME_PM_LSU_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU_REJECT_RELOAD_CDF", .pme_code = 0x2c6090, .pme_short_desc = "LSU reject due to reload CDF or tag update collision", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction the requires the result bus in the same cycle is rejected. 
Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same system when they are being updated. Combined Unit 0 + 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_REJECT_RELOAD_CDF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_REJECT_RELOAD_CDF] }, [ POWER5_PME_PM_FPU1_SINGLE ] = { .pme_name = "PM_FPU1_SINGLE", .pme_code = 0x20e7, .pme_short_desc = "FPU1 executed single precision instruction", .pme_long_desc = "FPU1 has executed a single precision instruction.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU1_SINGLE], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU1_SINGLE] }, [ POWER5_PME_PM_L3SB_REF ] = { .pme_name = "PM_L3SB_REF", .pme_code = 0x701c4, .pme_short_desc = "L3 slice B references", .pme_long_desc = "Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice ", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SB_REF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SB_REF] }, [ POWER5_PME_PM_THRD_PRIO_DIFF_3or4_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_3or4_CYC", .pme_code = 0x430e5, .pme_short_desc = "Cycles thread priority difference is 3 or 4", .pme_long_desc = "Cycles when this thread's priority is higher than the other thread's priority by 3 or 4.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_PRIO_DIFF_3or4_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_PRIO_DIFF_3or4_CYC] }, [ POWER5_PME_PM_INST_FROM_L275_SHR ] = { .pme_name = "PM_INST_FROM_L275_SHR", .pme_code = 0x322096, .pme_short_desc = "Instruction fetched from L2.75 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (T) data from the L2 on a different module than this processor is located. 
Fetch groups can contain up to 8 instructions", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_FROM_L275_SHR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_FROM_L275_SHR] }, [ POWER5_PME_PM_MRK_DATA_FROM_L375_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L375_MOD", .pme_code = 0x1c70a7, .pme_short_desc = "Marked data loaded from L3.75 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on a different module than this processor is located due to a marked load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L375_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L375_MOD] }, [ POWER5_PME_PM_DTLB_MISS_4K ] = { .pme_name = "PM_DTLB_MISS_4K", .pme_code = 0xc40c0, .pme_short_desc = "Data TLB miss for 4K page", .pme_long_desc = "Data TLB references to 4KB pages that missed the TLB. Page size is determined at TLB reload time.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DTLB_MISS_4K], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DTLB_MISS_4K] }, [ POWER5_PME_PM_CLB_FULL_CYC ] = { .pme_name = "PM_CLB_FULL_CYC", .pme_code = 0x220e5, .pme_short_desc = "Cycles CLB full", .pme_long_desc = "Cycles when both thread's CLB is full.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_CLB_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_CLB_FULL_CYC] }, [ POWER5_PME_PM_MRK_ST_CMPL ] = { .pme_name = "PM_MRK_ST_CMPL", .pme_code = 0x100003, .pme_short_desc = "Marked store instruction completed", .pme_long_desc = "A sampled store has completed (data home)", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_ST_CMPL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_ST_CMPL] }, [ POWER5_PME_PM_LSU_FLUSH_LRQ_FULL ] = { .pme_name = "PM_LSU_FLUSH_LRQ_FULL", .pme_code = 0x320e7, .pme_short_desc = "Flush caused by LRQ full", .pme_long_desc = "This thread was flushed at dispatch because its Load Request Queue was full. 
This allows the other thread to have more machine resources for it to make progress while this thread is stalled.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_FLUSH_LRQ_FULL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_FLUSH_LRQ_FULL] }, [ POWER5_PME_PM_MRK_DATA_FROM_L275_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L275_SHR", .pme_code = 0x3c7097, .pme_short_desc = "Marked data loaded from L2.75 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T) data from the L2 on a different module than this processor is located due to a marked load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L275_SHR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L275_SHR] }, [ POWER5_PME_PM_1INST_CLB_CYC ] = { .pme_name = "PM_1INST_CLB_CYC", .pme_code = 0x400c1, .pme_short_desc = "Cycles 1 instruction in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_1INST_CLB_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_1INST_CLB_CYC] }, [ POWER5_PME_PM_MEM_SPEC_RD_CANCEL ] = { .pme_name = "PM_MEM_SPEC_RD_CANCEL", .pme_code = 0x721e6, .pme_short_desc = "Speculative memory read cancelled", .pme_long_desc = "Speculative memory read cancelled (i.e. 
cresp = sourced by L2/L3)", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_SPEC_RD_CANCEL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_SPEC_RD_CANCEL] }, [ POWER5_PME_PM_MRK_DTLB_MISS_16M ] = { .pme_name = "PM_MRK_DTLB_MISS_16M", .pme_code = 0xc40c5, .pme_short_desc = "Marked Data TLB misses for 16M page", .pme_long_desc = "Marked Data TLB misses for 16M page", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DTLB_MISS_16M], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DTLB_MISS_16M] }, [ POWER5_PME_PM_FPU_FDIV ] = { .pme_name = "PM_FPU_FDIV", .pme_code = 0x100088, .pme_short_desc = "FPU executed FDIV instruction", .pme_long_desc = "The floating point unit has executed a divide instruction. This could be fdiv, fdivs, fdiv., fdivs.. Combined Unit 0 + Unit 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU_FDIV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU_FDIV] }, [ POWER5_PME_PM_FPU_SINGLE ] = { .pme_name = "PM_FPU_SINGLE", .pme_code = 0x102090, .pme_short_desc = "FPU executed single precision instruction", .pme_long_desc = "FPU is executing single precision instruction. Combined Unit 0 + Unit 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU_SINGLE], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU_SINGLE] }, [ POWER5_PME_PM_FPU0_FMA ] = { .pme_name = "PM_FPU0_FMA", .pme_code = 0xc1, .pme_short_desc = "FPU0 executed multiply-add instruction", .pme_long_desc = "The floating point unit has executed a multiply-add kind of instruction. 
This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU0_FMA], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU0_FMA] }, [ POWER5_PME_PM_SLB_MISS ] = { .pme_name = "PM_SLB_MISS", .pme_code = 0x280088, .pme_short_desc = "SLB misses", .pme_long_desc = "Total of all Segment Lookaside Buffer (SLB) misses, Instructions + Data.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_SLB_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_SLB_MISS] }, [ POWER5_PME_PM_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_LSU1_FLUSH_LRQ", .pme_code = 0xc00c6, .pme_short_desc = "LSU1 LRQ flushes", .pme_long_desc = "A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU1_FLUSH_LRQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU1_FLUSH_LRQ] }, [ POWER5_PME_PM_L2SA_ST_HIT ] = { .pme_name = "PM_L2SA_ST_HIT", .pme_code = 0x733e0, .pme_short_desc = "L2 slice A store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. 
This event is provided on each of the three L2 slices A, B, and C.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SA_ST_HIT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SA_ST_HIT] }, [ POWER5_PME_PM_DTLB_MISS ] = { .pme_name = "PM_DTLB_MISS", .pme_code = 0x800c4, .pme_short_desc = "Data TLB misses", .pme_long_desc = "Data TLB misses, all page sizes.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DTLB_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DTLB_MISS] }, [ POWER5_PME_PM_BR_PRED_TA ] = { .pme_name = "PM_BR_PRED_TA", .pme_code = 0x230e3, .pme_short_desc = "A conditional branch was predicted", .pme_long_desc = " target prediction", .pme_event_ids = power5_event_ids[POWER5_PME_PM_BR_PRED_TA], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_BR_PRED_TA] }, [ POWER5_PME_PM_MRK_DATA_FROM_L375_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L375_MOD_CYC", .pme_code = 0x4c70a7, .pme_short_desc = "Marked load latency from L3.75 modified", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L375_MOD_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L375_MOD_CYC] }, [ POWER5_PME_PM_CMPLU_STALL_FXU ] = { .pme_name = "PM_CMPLU_STALL_FXU", .pme_code = 0x211099, .pme_short_desc = "Completion stall caused by FXU instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a fixed point instruction.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_CMPLU_STALL_FXU], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_CMPLU_STALL_FXU] }, [ POWER5_PME_PM_EXT_INT ] = { .pme_name = "PM_EXT_INT", .pme_code = 0x400003, .pme_short_desc = "External interrupts", .pme_long_desc = "An interrupt due to an external exception occurred", .pme_event_ids = power5_event_ids[POWER5_PME_PM_EXT_INT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_EXT_INT] }, [ POWER5_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_LRQ", .pme_code = 0x810c6, .pme_short_desc = "LSU1 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LSU1_FLUSH_LRQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LSU1_FLUSH_LRQ] }, [ POWER5_PME_PM_LSU1_LDF ] = { .pme_name = "PM_LSU1_LDF", .pme_code = 0xc50c4, .pme_short_desc = "LSU1 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed by LSU1", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU1_LDF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU1_LDF] }, [ POWER5_PME_PM_MRK_ST_GPS ] = { .pme_name = "PM_MRK_ST_GPS", 
.pme_code = 0x200003, .pme_short_desc = "Marked store sent to GPS", .pme_long_desc = "A sampled store has been sent to the memory subsystem", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_ST_GPS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_ST_GPS] }, [ POWER5_PME_PM_FAB_CMD_ISSUED ] = { .pme_name = "PM_FAB_CMD_ISSUED", .pme_code = 0x700c7, .pme_short_desc = "Fabric command issued", .pme_long_desc = "Incremented when a chip issues a command on its SnoopA address bus. Each of the two address busses (SnoopA and SnoopB) is capable of one transaction per fabric cycle (one fabric cycle = 2 cpu cycles in normal 2:1 mode), but each chip can only drive the SnoopA bus, and can only drive one transaction every two fabric cycles (i.e., every four cpu cycles). In MCM-based systems, two chips interleave their accesses to each of the two fabric busses (SnoopA, SnoopB) to reach a peak capability of one transaction per cpu clock cycle. The two chips that drive SnoopB are wired so that the chips refer to the bus as SnoopA but it is connected to the other two chips as SnoopB. Note that this event will only be recorded by the FBC on the chip that sourced the operation. The signal is delivered at FBC speed and the count must be scaled.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FAB_CMD_ISSUED], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FAB_CMD_ISSUED] }, [ POWER5_PME_PM_LSU0_SRQ_STFWD ] = { .pme_name = "PM_LSU0_SRQ_STFWD", .pme_code = 0xc20e0, .pme_short_desc = "LSU0 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 0. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. 
If a load hits L1 but becomes a store forward, it is not treated as a load miss.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU0_SRQ_STFWD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU0_SRQ_STFWD] }, [ POWER5_PME_PM_CR_MAP_FULL_CYC ] = { .pme_name = "PM_CR_MAP_FULL_CYC", .pme_code = 0x100c4, .pme_short_desc = "Cycles CR logical operation mapper full", .pme_long_desc = "The Conditional Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_CR_MAP_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_CR_MAP_FULL_CYC] }, [ POWER5_PME_PM_L2SA_RCST_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SA_RCST_DISP_FAIL_RC_FULL", .pme_code = 0x722e0, .pme_short_desc = "L2 slice A RC store dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a store failed because all RC machines are busy.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SA_RCST_DISP_FAIL_RC_FULL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SA_RCST_DISP_FAIL_RC_FULL] }, [ POWER5_PME_PM_MRK_LSU0_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU0_FLUSH_ULD", .pme_code = 0x810c0, .pme_short_desc = "LSU0 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LSU0_FLUSH_ULD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LSU0_FLUSH_ULD] }, [ POWER5_PME_PM_LSU_FLUSH_SRQ_FULL ] = { .pme_name = "PM_LSU_FLUSH_SRQ_FULL", .pme_code = 0x330e0, .pme_short_desc = "Flush caused by SRQ full", .pme_long_desc = "This thread was flushed at dispatch because its Store Request Queue was full. 
This allows the other thread to have more machine resources for it to make progress while this thread is stalled.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_FLUSH_SRQ_FULL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_FLUSH_SRQ_FULL] }, [ POWER5_PME_PM_FLUSH_IMBAL ] = { .pme_name = "PM_FLUSH_IMBAL", .pme_code = 0x330e3, .pme_short_desc = "Flush caused by thread GCT imbalance", .pme_long_desc = "This thread has been flushed at dispatch because it is stalled and a GCT imbalance exists. GCT thresholds are set in the TSCR register. This allows the other thread to have more machine resources for it to make progress while this thread is stalled.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FLUSH_IMBAL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FLUSH_IMBAL] }, [ POWER5_PME_PM_MEM_RQ_DISP_Q16to19 ] = { .pme_name = "PM_MEM_RQ_DISP_Q16to19", .pme_code = 0x727e6, .pme_short_desc = "Memory read queue dispatched to queues 16-19", .pme_long_desc = "A memory operation was dispatched to read queue 16,17,18 or 19. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_RQ_DISP_Q16to19], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_RQ_DISP_Q16to19] }, [ POWER5_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus3or4_CYC", .pme_code = 0x430e1, .pme_short_desc = "Cycles thread priority difference is -3 or -4", .pme_long_desc = "Cycles when this thread's priority is lower than the other thread's priority by 3 or 4.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC] }, [ POWER5_PME_PM_DATA_FROM_L35_MOD ] = { .pme_name = "PM_DATA_FROM_L35_MOD", .pme_code = 0x2c309e, .pme_short_desc = "Data loaded from L3.5 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on the same module as this processor is located due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DATA_FROM_L35_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DATA_FROM_L35_MOD] }, [ POWER5_PME_PM_MEM_HI_PRIO_WR_CMPL ] = { .pme_name = "PM_MEM_HI_PRIO_WR_CMPL", .pme_code = 0x726e6, .pme_short_desc = "High priority write completed", .pme_long_desc = "A memory write, which was upgraded to high priority, completed. Writes can be upgraded to high priority to ensure that read traffic does not lock out writes. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_HI_PRIO_WR_CMPL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_HI_PRIO_WR_CMPL] }, [ POWER5_PME_PM_FPU1_FDIV ] = { .pme_name = "PM_FPU1_FDIV", .pme_code = 0xc4, .pme_short_desc = "FPU1 executed FDIV instruction", .pme_long_desc = "FPU1 has executed a divide instruction. This could be fdiv, fdivs, fdiv. 
fdivs.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU1_FDIV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU1_FDIV] }, [ POWER5_PME_PM_FPU0_FRSP_FCONV ] = { .pme_name = "PM_FPU0_FRSP_FCONV", .pme_code = 0x10c1, .pme_short_desc = "FPU0 executed FRSP or FCONV instructions", .pme_long_desc = "FPU0 has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU0_FRSP_FCONV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU0_FRSP_FCONV] }, [ POWER5_PME_PM_MEM_RQ_DISP ] = { .pme_name = "PM_MEM_RQ_DISP", .pme_code = 0x701c6, .pme_short_desc = "Memory read queue dispatched", .pme_long_desc = "A memory read was dispatched. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_RQ_DISP], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_RQ_DISP] }, [ POWER5_PME_PM_LWSYNC_HELD ] = { .pme_name = "PM_LWSYNC_HELD", .pme_code = 0x130e0, .pme_short_desc = "LWSYNC held at dispatch", .pme_long_desc = "Cycles a LWSYNC instruction was held at dispatch. LWSYNC instructions are held at dispatch until all previous loads are done and all previous stores have issued. LWSYNC enters the Store Request Queue and is sent to the storage subsystem but does not wait for a response.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LWSYNC_HELD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LWSYNC_HELD] }, [ POWER5_PME_PM_FXU_FIN ] = { .pme_name = "PM_FXU_FIN", .pme_code = 0x313088, .pme_short_desc = "FXU produced a result", .pme_long_desc = "The fixed point unit (Unit 0 + Unit 1) finished an instruction. 
Instructions that finish may not necessarily complete.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FXU_FIN], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FXU_FIN] }, [ POWER5_PME_PM_DSLB_MISS ] = { .pme_name = "PM_DSLB_MISS", .pme_code = 0x800c5, .pme_short_desc = "Data SLB misses", .pme_long_desc = "A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DSLB_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DSLB_MISS] }, [ POWER5_PME_PM_FXLS1_FULL_CYC ] = { .pme_name = "PM_FXLS1_FULL_CYC", .pme_code = 0x110c4, .pme_short_desc = "Cycles FXU1/LS1 queue full", .pme_long_desc = "The issue queue that feeds the Fixed Point unit 1 / Load Store Unit 1 is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FXLS1_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FXLS1_FULL_CYC] }, [ POWER5_PME_PM_DATA_FROM_L275_SHR ] = { .pme_name = "PM_DATA_FROM_L275_SHR", .pme_code = 0x3c3097, .pme_short_desc = "Data loaded from L2.75 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T) data from the L2 on a different module than this processor is located due to a demand load. 
", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DATA_FROM_L275_SHR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DATA_FROM_L275_SHR] }, [ POWER5_PME_PM_THRD_SEL_T0 ] = { .pme_name = "PM_THRD_SEL_T0", .pme_code = 0x410c0, .pme_short_desc = "Decode selected thread 0", .pme_long_desc = "Thread selection picked thread 0 for decode.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_SEL_T0], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_SEL_T0] }, [ POWER5_PME_PM_PTEG_RELOAD_VALID ] = { .pme_name = "PM_PTEG_RELOAD_VALID", .pme_code = 0x830e4, .pme_short_desc = "PTEG reload valid", .pme_long_desc = "A Page Table Entry was loaded into the TLB.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PTEG_RELOAD_VALID], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PTEG_RELOAD_VALID] }, [ POWER5_PME_PM_LSU_LMQ_LHR_MERGE ] = { .pme_name = "PM_LSU_LMQ_LHR_MERGE", .pme_code = 0xc70e5, .pme_short_desc = "LMQ LHR merges", .pme_long_desc = "A data cache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_LMQ_LHR_MERGE], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_LMQ_LHR_MERGE] }, [ POWER5_PME_PM_MRK_STCX_FAIL ] = { .pme_name = "PM_MRK_STCX_FAIL", .pme_code = 0x820e6, .pme_short_desc = "Marked STCX failed", .pme_long_desc = "A marked stcx (stwcx or stdcx) failed", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_STCX_FAIL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_STCX_FAIL] }, [ POWER5_PME_PM_2INST_CLB_CYC ] = { .pme_name = "PM_2INST_CLB_CYC", .pme_code = 0x400c2, .pme_short_desc = "Cycles 2 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. 
These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_2INST_CLB_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_2INST_CLB_CYC] }, [ POWER5_PME_PM_FAB_PNtoVN_DIRECT ] = { .pme_name = "PM_FAB_PNtoVN_DIRECT", .pme_code = 0x723e7, .pme_short_desc = "PN to VN beat went straight to its destination", .pme_long_desc = "Fabric Data beats that the base chip takes the inbound PN data and passes it through to the outbound VN bus without going into a sidecar. The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FAB_PNtoVN_DIRECT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FAB_PNtoVN_DIRECT] }, [ POWER5_PME_PM_PTEG_FROM_L2MISS ] = { .pme_name = "PM_PTEG_FROM_L2MISS", .pme_code = 0x38309b, .pme_short_desc = "PTEG loaded from L2 miss", .pme_long_desc = "A Page Table Entry was loaded into the TLB but not from the local L2.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PTEG_FROM_L2MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PTEG_FROM_L2MISS] }, [ POWER5_PME_PM_CMPLU_STALL_LSU ] = { .pme_name = "PM_CMPLU_STALL_LSU", .pme_code = 0x211098, .pme_short_desc = "Completion stall caused by LSU instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a load/store instruction.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_CMPLU_STALL_LSU], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_CMPLU_STALL_LSU] }, [ POWER5_PME_PM_MRK_DSLB_MISS ] = { .pme_name = "PM_MRK_DSLB_MISS", .pme_code = 0xc50c7, .pme_short_desc = "Marked Data SLB misses", .pme_long_desc = "A Data SLB miss was caused by a marked instruction.", .pme_event_ids = 
power5_event_ids[POWER5_PME_PM_MRK_DSLB_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DSLB_MISS] }, [ POWER5_PME_PM_LSU_FLUSH_ULD ] = { .pme_name = "PM_LSU_FLUSH_ULD", .pme_code = 0x1c0088, .pme_short_desc = "LRQ unaligned load flushes", .pme_long_desc = "A load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1). Combined Unit 0 + 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_FLUSH_ULD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_FLUSH_ULD] }, [ POWER5_PME_PM_PTEG_FROM_LMEM ] = { .pme_name = "PM_PTEG_FROM_LMEM", .pme_code = 0x283087, .pme_short_desc = "PTEG loaded from local memory", .pme_long_desc = "A Page Table Entry was loaded into the TLB from memory attached to the same module this processor is located on.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PTEG_FROM_LMEM], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PTEG_FROM_LMEM] }, [ POWER5_PME_PM_MRK_BRU_FIN ] = { .pme_name = "PM_MRK_BRU_FIN", .pme_code = 0x200005, .pme_short_desc = "Marked instruction BRU processing finished", .pme_long_desc = "The branch unit finished a marked instruction. Instructions that finish may not necessarily complete.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_BRU_FIN], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_BRU_FIN] }, [ POWER5_PME_PM_MEM_WQ_DISP_WRITE ] = { .pme_name = "PM_MEM_WQ_DISP_WRITE", .pme_code = 0x703c6, .pme_short_desc = "Memory write queue dispatched due to write", .pme_long_desc = "A memory write was dispatched to a write queue. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_WQ_DISP_WRITE], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_WQ_DISP_WRITE] }, [ POWER5_PME_PM_MRK_DATA_FROM_L275_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L275_MOD_CYC", .pme_code = 0x4c70a3, .pme_short_desc = "Marked load latency from L2.75 modified", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L275_MOD_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L275_MOD_CYC] }, [ POWER5_PME_PM_LSU1_NCLD ] = { .pme_name = "PM_LSU1_NCLD", .pme_code = 0xc50c5, .pme_short_desc = "LSU1 non-cacheable loads", .pme_long_desc = "A non-cacheable load was executed by Unit 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU1_NCLD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU1_NCLD] }, [ POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SA_RCLD_DISP_FAIL_OTHER", .pme_code = 0x731e0, .pme_short_desc = "L2 slice A RC load dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_OTHER], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_OTHER] }, [ POWER5_PME_PM_SNOOP_PW_RETRY_WQ_PWQ ] = { .pme_name = "PM_SNOOP_PW_RETRY_WQ_PWQ", .pme_code = 0x717c6, .pme_short_desc = "Snoop partial-write retry due to collision with active write or partial-write queue", .pme_long_desc = "A snoop request for a partial write to memory was retried because it matched the cache line of an 
active write or partial write. When this happens the snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_SNOOP_PW_RETRY_WQ_PWQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_SNOOP_PW_RETRY_WQ_PWQ] }, [ POWER5_PME_PM_FPR_MAP_FULL_CYC ] = { .pme_name = "PM_FPR_MAP_FULL_CYC", .pme_code = 0x100c1, .pme_short_desc = "Cycles FPR mapper full", .pme_long_desc = "The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations. ", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPR_MAP_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPR_MAP_FULL_CYC] }, [ POWER5_PME_PM_FPU1_FULL_CYC ] = { .pme_name = "PM_FPU1_FULL_CYC", .pme_code = 0x100c7, .pme_short_desc = "Cycles FPU1 issue queue full", .pme_long_desc = "The issue queue for FPU1 cannot accept any more instructions. Dispatch to this issue queue is stopped", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU1_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU1_FULL_CYC] }, [ POWER5_PME_PM_L3SA_ALL_BUSY ] = { .pme_name = "PM_L3SA_ALL_BUSY", .pme_code = 0x721e3, .pme_short_desc = "L3 slice A active for every cycle all CI/CO machines busy", .pme_long_desc = "Cycles All Castin/Castout machines are busy.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SA_ALL_BUSY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SA_ALL_BUSY] }, [ POWER5_PME_PM_3INST_CLB_CYC ] = { .pme_name = "PM_3INST_CLB_CYC", .pme_code = 0x400c3, .pme_short_desc = "Cycles 3 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. 
These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_3INST_CLB_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_3INST_CLB_CYC] }, [ POWER5_PME_PM_MEM_PWQ_DISP_Q2or3 ] = { .pme_name = "PM_MEM_PWQ_DISP_Q2or3", .pme_code = 0x734e6, .pme_short_desc = "Memory partial-write queue dispatched to Write Queue 2 or 3", .pme_long_desc = "Memory partial-write queue dispatched to Write Queue 2 or 3. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_PWQ_DISP_Q2or3], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_PWQ_DISP_Q2or3] }, [ POWER5_PME_PM_L2SA_SHR_INV ] = { .pme_name = "PM_L2SA_SHR_INV", .pme_code = 0x710c0, .pme_short_desc = "L2 slice A transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. 
", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SA_SHR_INV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SA_SHR_INV] }, [ POWER5_PME_PM_THRESH_TIMEO ] = { .pme_name = "PM_THRESH_TIMEO", .pme_code = 0x30000b, .pme_short_desc = "Threshold timeout", .pme_long_desc = "The threshold timer expired", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRESH_TIMEO], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRESH_TIMEO] }, [ POWER5_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL ] = { .pme_name = "PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL", .pme_code = 0x713c0, .pme_short_desc = "L2 slice A RC dispatch attempt failed due to all CO busy", .pme_long_desc = "A Read/Claim dispatch was rejected because all Castout machines were busy.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL] }, [ POWER5_PME_PM_THRD_SEL_OVER_GCT_IMBAL ] = { .pme_name = "PM_THRD_SEL_OVER_GCT_IMBAL", .pme_code = 0x410c4, .pme_short_desc = "Thread selection overrides caused by GCT imbalance", .pme_long_desc = "Thread selection was overridden because of a GCT imbalance.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_SEL_OVER_GCT_IMBAL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_SEL_OVER_GCT_IMBAL] }, [ POWER5_PME_PM_FPU_FSQRT ] = { .pme_name = "PM_FPU_FSQRT", .pme_code = 0x200090, .pme_short_desc = "FPU executed FSQRT instruction", .pme_long_desc = "The floating point unit has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
Combined Unit 0 + Unit 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU_FSQRT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU_FSQRT] }, [ POWER5_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_LRQ", .pme_code = 0x810c2, .pme_short_desc = "LSU0 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LSU0_FLUSH_LRQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LSU0_FLUSH_LRQ] }, [ POWER5_PME_PM_PMC1_OVERFLOW ] = { .pme_name = "PM_PMC1_OVERFLOW", .pme_code = 0x20000a, .pme_short_desc = "PMC1 Overflow", .pme_long_desc = "Overflows from PMC1 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PMC1_OVERFLOW], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PMC1_OVERFLOW] }, [ POWER5_PME_PM_L3SC_SNOOP_RETRY ] = { .pme_name = "PM_L3SC_SNOOP_RETRY", .pme_code = 0x731e5, .pme_short_desc = "L3 slice C snoop retries", .pme_long_desc = "Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b)", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SC_SNOOP_RETRY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SC_SNOOP_RETRY] }, [ POWER5_PME_PM_DATA_TABLEWALK_CYC ] = { .pme_name = "PM_DATA_TABLEWALK_CYC", .pme_code = 0x800c7, .pme_short_desc = "Cycles doing data tablewalks", .pme_long_desc = "Cycles a translation tablewalk is active. 
While a tablewalk is active any request attempting to access the TLB will be rejected and retried.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DATA_TABLEWALK_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DATA_TABLEWALK_CYC] }, [ POWER5_PME_PM_THRD_PRIO_6_CYC ] = { .pme_name = "PM_THRD_PRIO_6_CYC", .pme_code = 0x420e5, .pme_short_desc = "Cycles thread running at priority level 6", .pme_long_desc = "Cycles this thread was running at priority level 6.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_PRIO_6_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_PRIO_6_CYC] }, [ POWER5_PME_PM_FPU_FEST ] = { .pme_name = "PM_FPU_FEST", .pme_code = 0x401090, .pme_short_desc = "FPU executed FEST instruction", .pme_long_desc = "The floating point unit has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU_FEST], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU_FEST] }, [ POWER5_PME_PM_FAB_M1toP1_SIDECAR_EMPTY ] = { .pme_name = "PM_FAB_M1toP1_SIDECAR_EMPTY", .pme_code = 0x702c7, .pme_short_desc = "M1 to P1 sidecar empty", .pme_long_desc = "Fabric cycles when the Minus-1 hip/hop sidecars (sidecars for chip to chip data transfer) are empty. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FAB_M1toP1_SIDECAR_EMPTY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FAB_M1toP1_SIDECAR_EMPTY] }, [ POWER5_PME_PM_MRK_DATA_FROM_RMEM ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM", .pme_code = 0x1c70a1, .pme_short_desc = "Marked data loaded from remote memory", .pme_long_desc = "The processor's Data Cache was reloaded due to a marked load from memory attached to a different module than this processor is located on.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_RMEM], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_RMEM] }, [ POWER5_PME_PM_MRK_DATA_FROM_L35_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L35_MOD_CYC", .pme_code = 0x4c70a6, .pme_short_desc = "Marked load latency from L3.5 modified", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L35_MOD_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L35_MOD_CYC] }, [ POWER5_PME_PM_MEM_PWQ_DISP ] = { .pme_name = "PM_MEM_PWQ_DISP", .pme_code = 0x704c6, .pme_short_desc = "Memory partial-write queue dispatched", .pme_long_desc = "Number of Partial Writes dispatched. The MC provides resources to gather partial cacheline writes (Partial line DMA writes & CI-stores) to up to four different cachelines at a time. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_PWQ_DISP], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_PWQ_DISP] }, [ POWER5_PME_PM_FAB_P1toM1_SIDECAR_EMPTY ] = { .pme_name = "PM_FAB_P1toM1_SIDECAR_EMPTY", .pme_code = 0x701c7, .pme_short_desc = "P1 to M1 sidecar empty", .pme_long_desc = "Fabric cycles when the Plus-1 hip/hop sidecars (sidecars for chip to chip data transfer) are empty. The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FAB_P1toM1_SIDECAR_EMPTY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FAB_P1toM1_SIDECAR_EMPTY] }, [ POWER5_PME_PM_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_LD_MISS_L1_LSU0", .pme_code = 0xc10c2, .pme_short_desc = "LSU0 L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache, by unit 0.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LD_MISS_L1_LSU0], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LD_MISS_L1_LSU0] }, [ POWER5_PME_PM_SNOOP_PARTIAL_RTRY_QFULL ] = { .pme_name = "PM_SNOOP_PARTIAL_RTRY_QFULL", .pme_code = 0x730e6, .pme_short_desc = "Snoop partial write retry due to partial-write queues full", .pme_long_desc = "A snoop request for a partial write to memory was retried because the write queues that handle partial writes were full. When this happens the active writes are changed to high priority. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_SNOOP_PARTIAL_RTRY_QFULL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_SNOOP_PARTIAL_RTRY_QFULL] }, [ POWER5_PME_PM_FPU1_STALL3 ] = { .pme_name = "PM_FPU1_STALL3", .pme_code = 0x20e5, .pme_short_desc = "FPU1 stalled in pipe3", .pme_long_desc = "FPU1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always).", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU1_STALL3], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU1_STALL3] }, [ POWER5_PME_PM_GCT_USAGE_80to99_CYC ] = { .pme_name = "PM_GCT_USAGE_80to99_CYC", .pme_code = 0x30001f, .pme_short_desc = "Cycles GCT 80-99% full", .pme_long_desc = "Cycles when the Global Completion Table has between 80% and 99% of its slots used. The GCT has 20 entries shared between threads", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GCT_USAGE_80to99_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GCT_USAGE_80to99_CYC] }, [ POWER5_PME_PM_WORK_HELD ] = { .pme_name = "PM_WORK_HELD", .pme_code = 0x40000c, .pme_short_desc = "Work held", .pme_long_desc = "RAS Unit has signaled completion to stop and there are groups waiting to complete", .pme_event_ids = power5_event_ids[POWER5_PME_PM_WORK_HELD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_WORK_HELD] }, [ POWER5_PME_PM_INST_CMPL ] = { .pme_name = "PM_INST_CMPL", .pme_code = 0x100009, .pme_short_desc = "Instructions completed", .pme_long_desc = "Number of PowerPC instructions that completed. 
", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_CMPL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_CMPL] }, [ POWER5_PME_PM_LSU1_FLUSH_UST ] = { .pme_name = "PM_LSU1_FLUSH_UST", .pme_code = 0xc00c5, .pme_short_desc = "LSU1 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 1 because it was unaligned (crossed a 4K boundary)", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU1_FLUSH_UST], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU1_FLUSH_UST] }, [ POWER5_PME_PM_FXU_IDLE ] = { .pme_name = "PM_FXU_IDLE", .pme_code = 0x100012, .pme_short_desc = "FXU idle", .pme_long_desc = "FXU0 and FXU1 are both idle.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FXU_IDLE], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FXU_IDLE] }, [ POWER5_PME_PM_LSU0_FLUSH_ULD ] = { .pme_name = "PM_LSU0_FLUSH_ULD", .pme_code = 0xc00c0, .pme_short_desc = "LSU0 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 0 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU0_FLUSH_ULD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU0_FLUSH_ULD] }, [ POWER5_PME_PM_LSU1_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU1_REJECT_LMQ_FULL", .pme_code = 0xc60e5, .pme_short_desc = "LSU1 reject due to LMQ full or missed data coming", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. 
If all eight entries are full, subsequent load instructions are rejected.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU1_REJECT_LMQ_FULL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU1_REJECT_LMQ_FULL] }, [ POWER5_PME_PM_GRP_DISP_REJECT ] = { .pme_name = "PM_GRP_DISP_REJECT", .pme_code = 0x120e4, .pme_short_desc = "Group dispatch rejected", .pme_long_desc = "A group that previously attempted dispatch was rejected.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GRP_DISP_REJECT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GRP_DISP_REJECT] }, [ POWER5_PME_PM_L2SA_MOD_INV ] = { .pme_name = "PM_L2SA_MOD_INV", .pme_code = 0x730e0, .pme_short_desc = "L2 slice A transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SA_MOD_INV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SA_MOD_INV] }, [ POWER5_PME_PM_PTEG_FROM_L25_SHR ] = { .pme_name = "PM_PTEG_FROM_L25_SHR", .pme_code = 0x183097, .pme_short_desc = "PTEG loaded from L2.5 shared", .pme_long_desc = "A Page Table Entry was loaded into the TLB with shared (T or SL) data from the L2 of a chip on the same module as this processor is located due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PTEG_FROM_L25_SHR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PTEG_FROM_L25_SHR] }, [ POWER5_PME_PM_FAB_CMD_RETRIED ] = { .pme_name = "PM_FAB_CMD_RETRIED", .pme_code = 0x710c7, .pme_short_desc = "Fabric command retried", .pme_long_desc = "Incremented when a command issued by a chip on its SnoopA address bus is retried for any reason. 
The overwhelming majority of retries are due to running out of memory controller queues but retries can also be caused by trying to reference addresses that are in a transient cache state -- e.g. a line is transient after issuing a DCLAIM instruction to a shared line but before the associated store completes. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FAB_CMD_RETRIED], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FAB_CMD_RETRIED] }, [ POWER5_PME_PM_L3SA_SHR_INV ] = { .pme_name = "PM_L3SA_SHR_INV", .pme_code = 0x710c3, .pme_short_desc = "L3 slice A transition from shared to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched).", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SA_SHR_INV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SA_SHR_INV] }, [ POWER5_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL ] = { .pme_name = "PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL", .pme_code = 0x713c1, .pme_short_desc = "L2 slice B RC dispatch attempt failed due to all CO busy", .pme_long_desc = "A Read/Claim dispatch was rejected because all Castout machines were busy.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL] }, [ POWER5_PME_PM_L2SA_RCST_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SA_RCST_DISP_FAIL_ADDR", .pme_code = 0x712c0, .pme_short_desc = "L2 slice A RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a store failed because of an address conflict. 
Two RC machines will never both work on the same line or line in the same congruence class at the same time.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SA_RCST_DISP_FAIL_ADDR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SA_RCST_DISP_FAIL_ADDR] }, [ POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SA_RCLD_DISP_FAIL_RC_FULL", .pme_code = 0x721e0, .pme_short_desc = "L2 slice A RC load dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a load failed because all RC machines are busy.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_RC_FULL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_RC_FULL] }, [ POWER5_PME_PM_PTEG_FROM_L375_MOD ] = { .pme_name = "PM_PTEG_FROM_L375_MOD", .pme_code = 0x1830a7, .pme_short_desc = "PTEG loaded from L3.75 modified", .pme_long_desc = "A Page Table Entry was loaded into the TLB with modified (M) data from the L3 of a chip on a different module than this processor is located, due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PTEG_FROM_L375_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PTEG_FROM_L375_MOD] }, [ POWER5_PME_PM_MRK_LSU1_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU1_FLUSH_UST", .pme_code = 0x810c5, .pme_short_desc = "LSU1 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary)", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LSU1_FLUSH_UST], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LSU1_FLUSH_UST] }, [ POWER5_PME_PM_BR_ISSUED ] = { .pme_name = "PM_BR_ISSUED", .pme_code = 0x230e4, .pme_short_desc = "Branches issued", .pme_long_desc = "A branch instruction was issued to the branch unit. 
A branch that was incorrectly predicted may issue and execute multiple times.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_BR_ISSUED], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_BR_ISSUED] }, [ POWER5_PME_PM_MRK_GRP_BR_REDIR ] = { .pme_name = "PM_MRK_GRP_BR_REDIR", .pme_code = 0x212091, .pme_short_desc = "Group experienced marked branch redirect", .pme_long_desc = "A group containing a marked (sampled) instruction experienced a branch redirect.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_GRP_BR_REDIR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_GRP_BR_REDIR] }, [ POWER5_PME_PM_EE_OFF ] = { .pme_name = "PM_EE_OFF", .pme_code = 0x130e3, .pme_short_desc = "Cycles MSR(EE) bit off", .pme_long_desc = "Cycles MSR(EE) bit was off indicating that interrupts due to external exceptions were masked.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_EE_OFF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_EE_OFF] }, [ POWER5_PME_PM_MEM_RQ_DISP_Q4to7 ] = { .pme_name = "PM_MEM_RQ_DISP_Q4to7", .pme_code = 0x712c6, .pme_short_desc = "Memory read queue dispatched to queues 4-7", .pme_long_desc = "A memory operation was dispatched to read queue 4,5,6 or 7. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_RQ_DISP_Q4to7], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_RQ_DISP_Q4to7] }, [ POWER5_PME_PM_MEM_FAST_PATH_RD_DISP ] = { .pme_name = "PM_MEM_FAST_PATH_RD_DISP", .pme_code = 0x713e6, .pme_short_desc = "Fast path memory read dispatched", .pme_long_desc = "Fast path memory read dispatched", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_FAST_PATH_RD_DISP], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_FAST_PATH_RD_DISP] }, [ POWER5_PME_PM_INST_FROM_L3 ] = { .pme_name = "PM_INST_FROM_L3", .pme_code = 0x12208d, .pme_short_desc = "Instruction fetched from L3", .pme_long_desc = "An instruction fetch group was fetched from the local L3. Fetch groups can contain up to 8 instructions", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_FROM_L3], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_FROM_L3] }, [ POWER5_PME_PM_ITLB_MISS ] = { .pme_name = "PM_ITLB_MISS", .pme_code = 0x800c0, .pme_short_desc = "Instruction TLB misses", .pme_long_desc = "A TLB miss for an Instruction Fetch has occurred", .pme_event_ids = power5_event_ids[POWER5_PME_PM_ITLB_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_ITLB_MISS] }, [ POWER5_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { .pme_name = "PM_FXU1_BUSY_FXU0_IDLE", .pme_code = 0x400012, .pme_short_desc = "FXU1 busy FXU0 idle", .pme_long_desc = "FXU0 was idle while FXU1 was busy.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FXU1_BUSY_FXU0_IDLE], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FXU1_BUSY_FXU0_IDLE] }, [ POWER5_PME_PM_FXLS_FULL_CYC ] = { .pme_name = "PM_FXLS_FULL_CYC", .pme_code = 0x411090, .pme_short_desc = "Cycles FXLS queue is full", .pme_long_desc = "Cycles when the issue queues for one or both FXU/LSU units is full. Use with caution since this is the sum of cycles when Unit 0 was full plus Unit 1 full. 
It does not indicate when both units were full.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FXLS_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FXLS_FULL_CYC] }, [ POWER5_PME_PM_DTLB_REF_4K ] = { .pme_name = "PM_DTLB_REF_4K", .pme_code = 0xc40c2, .pme_short_desc = "Data TLB reference for 4K page", .pme_long_desc = "Data TLB references for 4KB pages. Includes hits + misses.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DTLB_REF_4K], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DTLB_REF_4K] }, [ POWER5_PME_PM_GRP_DISP_VALID ] = { .pme_name = "PM_GRP_DISP_VALID", .pme_code = 0x120e3, .pme_short_desc = "Group dispatch valid", .pme_long_desc = "A group is available for dispatch. This does not mean it was successfully dispatched.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GRP_DISP_VALID], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GRP_DISP_VALID] }, [ POWER5_PME_PM_LSU_FLUSH_UST ] = { .pme_name = "PM_LSU_FLUSH_UST", .pme_code = 0x2c0088, .pme_short_desc = "SRQ unaligned store flushes", .pme_long_desc = "A store was flushed because it was unaligned (crossed a 4K boundary). Combined Unit 0 + 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_FLUSH_UST], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_FLUSH_UST] }, [ POWER5_PME_PM_FXU1_FIN ] = { .pme_name = "PM_FXU1_FIN", .pme_code = 0x130e6, .pme_short_desc = "FXU1 produced a result", .pme_long_desc = "The Fixed Point unit 1 finished an instruction and produced a result. 
Instructions that finish may not necessarily complete.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FXU1_FIN], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FXU1_FIN] }, [ POWER5_PME_PM_THRD_PRIO_4_CYC ] = { .pme_name = "PM_THRD_PRIO_4_CYC", .pme_code = 0x420e3, .pme_short_desc = "Cycles thread running at priority level 4", .pme_long_desc = "Cycles this thread was running at priority level 4.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_PRIO_4_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_PRIO_4_CYC] }, [ POWER5_PME_PM_MRK_DATA_FROM_L35_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L35_MOD", .pme_code = 0x2c709e, .pme_short_desc = "Marked data loaded from L3.5 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on the same module as this processor is located due to a marked load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L35_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L35_MOD] }, [ POWER5_PME_PM_4INST_CLB_CYC ] = { .pme_name = "PM_4INST_CLB_CYC", .pme_code = 0x400c4, .pme_short_desc = "Cycles 4 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. 
Each thread has its own set of CLB; these events are thread specific.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_4INST_CLB_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_4INST_CLB_CYC] }, [ POWER5_PME_PM_MRK_DTLB_REF_16M ] = { .pme_name = "PM_MRK_DTLB_REF_16M", .pme_code = 0xc40c7, .pme_short_desc = "Marked Data TLB reference for 16M page", .pme_long_desc = "Data TLB references by a marked instruction for 16MB pages.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DTLB_REF_16M], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DTLB_REF_16M] }, [ POWER5_PME_PM_INST_FROM_L375_MOD ] = { .pme_name = "PM_INST_FROM_L375_MOD", .pme_code = 0x42209d, .pme_short_desc = "Instruction fetched from L3.75 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from the L3 of a chip on a different module than this processor is located. Fetch groups can contain up to 8 instructions", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_FROM_L375_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_FROM_L375_MOD] }, [ POWER5_PME_PM_L2SC_RCST_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SC_RCST_DISP_FAIL_ADDR", .pme_code = 0x712c2, .pme_short_desc = "L2 slice C RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SC_RCST_DISP_FAIL_ADDR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SC_RCST_DISP_FAIL_ADDR] }, [ POWER5_PME_PM_GRP_CMPL ] = { .pme_name = "PM_GRP_CMPL", .pme_code = 0x300013, .pme_short_desc = "Group completed", .pme_long_desc = "A group completed. 
Microcoded instructions that span multiple groups will generate this event once per group.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GRP_CMPL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GRP_CMPL] }, [ POWER5_PME_PM_FPU1_1FLOP ] = { .pme_name = "PM_FPU1_1FLOP", .pme_code = 0xc7, .pme_short_desc = "FPU1 executed one flop instruction", .pme_long_desc = "FPU1 has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU1_1FLOP], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU1_1FLOP] }, [ POWER5_PME_PM_FPU_FRSP_FCONV ] = { .pme_name = "PM_FPU_FRSP_FCONV", .pme_code = 0x301090, .pme_short_desc = "FPU executed FRSP or FCONV instructions", .pme_long_desc = "The floating point unit has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU_FRSP_FCONV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU_FRSP_FCONV] }, [ POWER5_PME_PM_5INST_CLB_CYC ] = { .pme_name = "PM_5INST_CLB_CYC", .pme_code = 0x400c5, .pme_short_desc = "Cycles 5 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_5INST_CLB_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_5INST_CLB_CYC] }, [ POWER5_PME_PM_L3SC_REF ] = { .pme_name = "PM_L3SC_REF", .pme_code = 0x701c5, .pme_short_desc = "L3 slice C references", .pme_long_desc = "Number of attempts made by this chip's cores to find data in the L3. 
Reported per L3 slice.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SC_REF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SC_REF] }, [ POWER5_PME_PM_THRD_L2MISS_BOTH_CYC ] = { .pme_name = "PM_THRD_L2MISS_BOTH_CYC", .pme_code = 0x410c7, .pme_short_desc = "Cycles both threads in L2 misses", .pme_long_desc = "Cycles that both threads have L2 miss pending. If only one thread has a L2 miss pending the other thread is given priority at decode. If both threads have L2 miss pending decode priority is determined by the number of GCT entries used.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_L2MISS_BOTH_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_L2MISS_BOTH_CYC] }, [ POWER5_PME_PM_MEM_PW_GATH ] = { .pme_name = "PM_MEM_PW_GATH", .pme_code = 0x714c6, .pme_short_desc = "Memory partial-write gathered", .pme_long_desc = "Two or more partial-writes have been merged into a single memory write. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_PW_GATH], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_PW_GATH] }, [ POWER5_PME_PM_FAB_PNtoNN_SIDECAR ] = { .pme_name = "PM_FAB_PNtoNN_SIDECAR", .pme_code = 0x713c7, .pme_short_desc = "PN to NN beat went to sidecar first", .pme_long_desc = "Fabric Data beats that the base chip takes the inbound PN data and forwards it on to the outbound NN data bus after going into a sidecar first. The signal is delivered at FBC speed and the count must be scaled.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FAB_PNtoNN_SIDECAR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FAB_PNtoNN_SIDECAR] }, [ POWER5_PME_PM_FAB_DCLAIM_ISSUED ] = { .pme_name = "PM_FAB_DCLAIM_ISSUED", .pme_code = 0x720e7, .pme_short_desc = "dclaim issued", .pme_long_desc = "A DCLAIM command was issued. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly. 
", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FAB_DCLAIM_ISSUED], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FAB_DCLAIM_ISSUED] }, [ POWER5_PME_PM_GRP_IC_MISS ] = { .pme_name = "PM_GRP_IC_MISS", .pme_code = 0x120e7, .pme_short_desc = "Group experienced I cache miss", .pme_long_desc = "Number of groups, counted at dispatch, that have encountered an icache miss redirect. Every group constructed from a fetch group that missed the instruction cache will count.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GRP_IC_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GRP_IC_MISS] }, [ POWER5_PME_PM_INST_FROM_L35_SHR ] = { .pme_name = "PM_INST_FROM_L35_SHR", .pme_code = 0x12209d, .pme_short_desc = "Instruction fetched from L3.5 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (S) data from the L3 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_FROM_L35_SHR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_FROM_L35_SHR] }, [ POWER5_PME_PM_LSU_LMQ_FULL_CYC ] = { .pme_name = "PM_LSU_LMQ_FULL_CYC", .pme_code = 0xc30e7, .pme_short_desc = "Cycles LMQ full", .pme_long_desc = "The Load Miss Queue was full.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_LMQ_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_LMQ_FULL_CYC] }, [ POWER5_PME_PM_MRK_DATA_FROM_L2_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2_CYC", .pme_code = 0x2c70a0, .pme_short_desc = "Marked load latency from L2", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L2_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L2_CYC] }, [ POWER5_PME_PM_LSU_SRQ_SYNC_CYC ] = { .pme_name = "PM_LSU_SRQ_SYNC_CYC", .pme_code = 0x830e5, .pme_short_desc = "SRQ sync duration", .pme_long_desc = "Cycles that a sync instruction is active in the Store Request Queue.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_SRQ_SYNC_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_SRQ_SYNC_CYC] }, [ POWER5_PME_PM_LSU0_BUSY_REJECT ] = { .pme_name = "PM_LSU0_BUSY_REJECT", .pme_code = 0xc20e3, .pme_short_desc = "LSU0 busy due to reject", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions. ", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU0_BUSY_REJECT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU0_BUSY_REJECT] }, [ POWER5_PME_PM_LSU_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU_REJECT_ERAT_MISS", .pme_code = 0x1c6090, .pme_short_desc = "LSU reject due to ERAT miss", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions due to an ERAT miss. Combined unit 0 + 1. Requests that miss the Derat are rejected and retried until the request hits in the Erat.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_REJECT_ERAT_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_REJECT_ERAT_MISS] }, [ POWER5_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM_CYC", .pme_code = 0x4c70a1, .pme_short_desc = "Marked load latency from remote memory", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_RMEM_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_RMEM_CYC] }, [ POWER5_PME_PM_DATA_FROM_L375_SHR ] = { .pme_name = "PM_DATA_FROM_L375_SHR", .pme_code = 0x3c309e, .pme_short_desc = "Data loaded from L3.75 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on a different module than this processor is located due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DATA_FROM_L375_SHR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DATA_FROM_L375_SHR] }, [ POWER5_PME_PM_FPU0_FMOV_FEST ] = { .pme_name = "PM_FPU0_FMOV_FEST", .pme_code = 0x10c0, .pme_short_desc = "FPU0 executed FMOV or FEST instructions", .pme_long_desc = "FPU0 has executed a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU0_FMOV_FEST], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU0_FMOV_FEST] }, [ POWER5_PME_PM_PTEG_FROM_L25_MOD ] = { .pme_name = "PM_PTEG_FROM_L25_MOD", .pme_code = 0x283097, .pme_short_desc = "PTEG loaded from L2.5 modified", .pme_long_desc = "A Page Table Entry was loaded into the TLB with modified (M) data from the L2 of a chip on the same module as this processor is located due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PTEG_FROM_L25_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PTEG_FROM_L25_MOD] }, [ POWER5_PME_PM_LD_REF_L1_LSU0 ] = { .pme_name = "PM_LD_REF_L1_LSU0", .pme_code = 0xc10c0, .pme_short_desc = "LSU0 L1 D cache load references", .pme_long_desc = "Load references to Level 1 Data Cache, by unit 0.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LD_REF_L1_LSU0], .pme_group_vector = 
power5_group_vecs[POWER5_PME_PM_LD_REF_L1_LSU0] }, [ POWER5_PME_PM_THRD_PRIO_7_CYC ] = { .pme_name = "PM_THRD_PRIO_7_CYC", .pme_code = 0x420e6, .pme_short_desc = "Cycles thread running at priority level 7", .pme_long_desc = "Cycles this thread was running at priority level 7.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_PRIO_7_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_PRIO_7_CYC] }, [ POWER5_PME_PM_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_LSU1_FLUSH_SRQ", .pme_code = 0xc00c7, .pme_short_desc = "LSU1 SRQ lhs flushes", .pme_long_desc = "A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. ", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU1_FLUSH_SRQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU1_FLUSH_SRQ] }, [ POWER5_PME_PM_L2SC_RCST_DISP ] = { .pme_name = "PM_L2SC_RCST_DISP", .pme_code = 0x702c2, .pme_short_desc = "L2 slice C RC store dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Store was attempted.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SC_RCST_DISP], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SC_RCST_DISP] }, [ POWER5_PME_PM_CMPLU_STALL_DIV ] = { .pme_name = "PM_CMPLU_STALL_DIV", .pme_code = 0x411099, .pme_short_desc = "Completion stall caused by DIV instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a fixed point divide instruction. This is a subset of PM_CMPLU_STALL_FXU.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_CMPLU_STALL_DIV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_CMPLU_STALL_DIV] }, [ POWER5_PME_PM_MEM_RQ_DISP_Q12to15 ] = { .pme_name = "PM_MEM_RQ_DISP_Q12to15", .pme_code = 0x732e6, .pme_short_desc = "Memory read queue dispatched to queues 12-15", .pme_long_desc = "A memory operation was dispatched to read queue 12,13,14 or 15. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_RQ_DISP_Q12to15], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_RQ_DISP_Q12to15] }, [ POWER5_PME_PM_INST_FROM_L375_SHR ] = { .pme_name = "PM_INST_FROM_L375_SHR", .pme_code = 0x32209d, .pme_short_desc = "Instruction fetched from L3.75 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (S) data from the L3 of a chip on a different module than this processor is located. Fetch groups can contain up to 8 instructions", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_FROM_L375_SHR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_FROM_L375_SHR] }, [ POWER5_PME_PM_ST_REF_L1 ] = { .pme_name = "PM_ST_REF_L1", .pme_code = 0x3c1090, .pme_short_desc = "L1 D cache store references", .pme_long_desc = "Store references to the Data Cache. Combined Unit 0 + 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_ST_REF_L1], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_ST_REF_L1] }, [ POWER5_PME_PM_L3SB_ALL_BUSY ] = { .pme_name = "PM_L3SB_ALL_BUSY", .pme_code = 0x721e4, .pme_short_desc = "L3 slice B active for every cycle all CI/CO machines busy", .pme_long_desc = "Cycles All Castin/Castout machines are busy.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SB_ALL_BUSY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SB_ALL_BUSY] }, [ POWER5_PME_PM_FAB_P1toVNorNN_SIDECAR_EMPTY ] = { .pme_name = "PM_FAB_P1toVNorNN_SIDECAR_EMPTY", .pme_code = 0x711c7, .pme_short_desc = "P1 to VN/NN sidecar empty", .pme_long_desc = "Fabric cycles when the Plus-1 jump sidecar (sidecars for mcm to mcm data transfer) is empty. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FAB_P1toVNorNN_SIDECAR_EMPTY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FAB_P1toVNorNN_SIDECAR_EMPTY] }, [ POWER5_PME_PM_MRK_DATA_FROM_L275_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L275_SHR_CYC", .pme_code = 0x2c70a3, .pme_short_desc = "Marked load latency from L2.75 shared", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L275_SHR_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L275_SHR_CYC] }, [ POWER5_PME_PM_FAB_HOLDtoNN_EMPTY ] = { .pme_name = "PM_FAB_HOLDtoNN_EMPTY", .pme_code = 0x722e7, .pme_short_desc = "Hold buffer to NN empty", .pme_long_desc = "Fabric cycles when the Next Node out hold-buffers are empty. The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FAB_HOLDtoNN_EMPTY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FAB_HOLDtoNN_EMPTY] }, [ POWER5_PME_PM_DATA_FROM_LMEM ] = { .pme_name = "PM_DATA_FROM_LMEM", .pme_code = 0x2c3087, .pme_short_desc = "Data loaded from local memory", .pme_long_desc = "The processor's Data Cache was reloaded from memory attached to the same module this processor is located on.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DATA_FROM_LMEM], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DATA_FROM_LMEM] }, [ POWER5_PME_PM_RUN_CYC ] = { .pme_name = "PM_RUN_CYC", .pme_code = 0x100005, .pme_short_desc = "Run cycles", .pme_long_desc = "Processor Cycles gated by the run latch. Operating systems use the run latch to indicate when they are doing useful work. 
The run latch is typically cleared in the OS idle loop. Gating by the run latch filters out the idle loop.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_RUN_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_RUN_CYC] }, [ POWER5_PME_PM_PTEG_FROM_RMEM ] = { .pme_name = "PM_PTEG_FROM_RMEM", .pme_code = 0x1830a1, .pme_short_desc = "PTEG loaded from remote memory", .pme_long_desc = "A Page Table Entry was loaded into the TLB from memory attached to a different module than this processor is located on.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PTEG_FROM_RMEM], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PTEG_FROM_RMEM] }, [ POWER5_PME_PM_L2SC_RCLD_DISP ] = { .pme_name = "PM_L2SC_RCLD_DISP", .pme_code = 0x701c2, .pme_short_desc = "L2 slice C RC load dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Load was attempted", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SC_RCLD_DISP], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SC_RCLD_DISP] }, [ POWER5_PME_PM_LSU0_LDF ] = { .pme_name = "PM_LSU0_LDF", .pme_code = 0xc50c0, .pme_short_desc = "LSU0 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed by LSU0", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU0_LDF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU0_LDF] }, [ POWER5_PME_PM_LSU_LRQ_S0_VALID ] = { .pme_name = "PM_LSU_LRQ_S0_VALID", .pme_code = 0xc20e2, .pme_short_desc = "LRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Load Request Queue slot zero is valid. The LRQ is 32 entries long and is allocated round-robin. 
In SMT mode the LRQ is split between the two threads (16 entries each).", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_LRQ_S0_VALID], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_LRQ_S0_VALID] }, [ POWER5_PME_PM_PMC3_OVERFLOW ] = { .pme_name = "PM_PMC3_OVERFLOW", .pme_code = 0x40000a, .pme_short_desc = "PMC3 Overflow", .pme_long_desc = "Overflows from PMC3 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PMC3_OVERFLOW], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PMC3_OVERFLOW] }, [ POWER5_PME_PM_MRK_IMR_RELOAD ] = { .pme_name = "PM_MRK_IMR_RELOAD", .pme_code = 0x820e2, .pme_short_desc = "Marked IMR reloaded", .pme_long_desc = "A DL1 reload occurred due to a marked load", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_IMR_RELOAD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_IMR_RELOAD] }, [ POWER5_PME_PM_MRK_GRP_TIMEO ] = { .pme_name = "PM_MRK_GRP_TIMEO", .pme_code = 0x40000b, .pme_short_desc = "Marked group completion timeout", .pme_long_desc = "The sampling timeout expired indicating that the previously sampled instruction is no longer in the processor", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_GRP_TIMEO], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_GRP_TIMEO] }, [ POWER5_PME_PM_ST_MISS_L1 ] = { .pme_name = "PM_ST_MISS_L1", .pme_code = 0xc10c3, .pme_short_desc = "L1 D cache store misses", .pme_long_desc = "A store missed the dcache. 
Combined Unit 0 + 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_ST_MISS_L1], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_ST_MISS_L1] }, [ POWER5_PME_PM_STOP_COMPLETION ] = { .pme_name = "PM_STOP_COMPLETION", .pme_code = 0x300018, .pme_short_desc = "Completion stopped", .pme_long_desc = "RAS Unit has signaled completion to stop", .pme_event_ids = power5_event_ids[POWER5_PME_PM_STOP_COMPLETION], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_STOP_COMPLETION] }, [ POWER5_PME_PM_LSU_BUSY_REJECT ] = { .pme_name = "PM_LSU_BUSY_REJECT", .pme_code = 0x1c2090, .pme_short_desc = "LSU busy due to reject", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions. Combined unit 0 + 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_BUSY_REJECT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_BUSY_REJECT] }, [ POWER5_PME_PM_ISLB_MISS ] = { .pme_name = "PM_ISLB_MISS", .pme_code = 0x800c1, .pme_short_desc = "Instruction SLB misses", .pme_long_desc = "An SLB miss for an instruction fetch has occurred", .pme_event_ids = power5_event_ids[POWER5_PME_PM_ISLB_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_ISLB_MISS] }, [ POWER5_PME_PM_CYC ] = { .pme_name = "PM_CYC", .pme_code = 0xf, .pme_short_desc = "Processor cycles", .pme_long_desc = "Processor cycles", .pme_event_ids = power5_event_ids[POWER5_PME_PM_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_CYC] }, [ POWER5_PME_PM_THRD_ONE_RUN_CYC ] = { .pme_name = "PM_THRD_ONE_RUN_CYC", .pme_code = 0x10000b, .pme_short_desc = "One of the threads in run cycles", .pme_long_desc = "At least one thread has set its run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. 
This event does not respect FCWAIT.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_ONE_RUN_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_ONE_RUN_CYC] }, [ POWER5_PME_PM_GRP_BR_REDIR_NONSPEC ] = { .pme_name = "PM_GRP_BR_REDIR_NONSPEC", .pme_code = 0x112091, .pme_short_desc = "Group experienced non-speculative branch redirect", .pme_long_desc = "Number of groups, counted at completion, that have encountered a branch redirect.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GRP_BR_REDIR_NONSPEC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GRP_BR_REDIR_NONSPEC] }, [ POWER5_PME_PM_LSU1_SRQ_STFWD ] = { .pme_name = "PM_LSU1_SRQ_STFWD", .pme_code = 0xc20e4, .pme_short_desc = "LSU1 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 1. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load hits L1 but becomes a store forward, it is not treated as a load miss.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU1_SRQ_STFWD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU1_SRQ_STFWD] }, [ POWER5_PME_PM_L3SC_MOD_INV ] = { .pme_name = "PM_L3SC_MOD_INV", .pme_code = 0x730e5, .pme_short_desc = "L3 slice C transition from modified to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. 
L3 going M=>I). Mu|Me are not included since they are formed due to a previous read op. Tx is not included since it is considered shared at this point.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SC_MOD_INV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SC_MOD_INV] }, [ POWER5_PME_PM_L2_PREF ] = { .pme_name = "PM_L2_PREF", .pme_code = 0xc50c3, .pme_short_desc = "L2 cache prefetches", .pme_long_desc = "A request to prefetch data into L2 was made", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2_PREF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2_PREF] }, [ POWER5_PME_PM_GCT_NOSLOT_BR_MPRED ] = { .pme_name = "PM_GCT_NOSLOT_BR_MPRED", .pme_code = 0x41009c, .pme_short_desc = "No slot in GCT caused by branch mispredict", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread because of a branch misprediction.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GCT_NOSLOT_BR_MPRED], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GCT_NOSLOT_BR_MPRED] }, [ POWER5_PME_PM_MRK_DATA_FROM_L25_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L25_MOD", .pme_code = 0x2c7097, .pme_short_desc = "Marked data loaded from L2.5 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L2 of a chip on the same module as this processor is located due to a marked load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L25_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L25_MOD] }, [ POWER5_PME_PM_L2SB_MOD_INV ] = { .pme_name = "PM_L2SB_MOD_INV", .pme_code = 0x730e1, .pme_short_desc = "L2 slice B transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. 
The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SB_MOD_INV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SB_MOD_INV] }, [ POWER5_PME_PM_L2SB_ST_REQ ] = { .pme_name = "PM_L2SB_ST_REQ", .pme_code = 0x723e1, .pme_short_desc = "L2 slice B store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SB_ST_REQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SB_ST_REQ] }, [ POWER5_PME_PM_MRK_L1_RELOAD_VALID ] = { .pme_name = "PM_MRK_L1_RELOAD_VALID", .pme_code = 0xc70e4, .pme_short_desc = "Marked L1 reload data source valid", .pme_long_desc = "The source information is valid and is for a marked load", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_L1_RELOAD_VALID], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_L1_RELOAD_VALID] }, [ POWER5_PME_PM_L3SB_HIT ] = { .pme_name = "PM_L3SB_HIT", .pme_code = 0x711c4, .pme_short_desc = "L3 slice B hits", .pme_long_desc = "Number of attempts made by this chip cores that resulted in an L3 hit. Reported per L3 slice", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SB_HIT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SB_HIT] }, [ POWER5_PME_PM_L2SB_SHR_MOD ] = { .pme_name = "PM_L2SB_SHR_MOD", .pme_code = 0x700c1, .pme_short_desc = "L2 slice B transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C. 
", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SB_SHR_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SB_SHR_MOD] }, [ POWER5_PME_PM_EE_OFF_EXT_INT ] = { .pme_name = "PM_EE_OFF_EXT_INT", .pme_code = 0x130e7, .pme_short_desc = "Cycles MSR(EE) bit off and external interrupt pending", .pme_long_desc = "Cycles when an interrupt due to an external exception is pending but external exceptions were masked.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_EE_OFF_EXT_INT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_EE_OFF_EXT_INT] }, [ POWER5_PME_PM_1PLUS_PPC_CMPL ] = { .pme_name = "PM_1PLUS_PPC_CMPL", .pme_code = 0x100013, .pme_short_desc = "One or more PPC instruction completed", .pme_long_desc = "A group containing at least one PPC instruction completed. For microcoded instructions that span multiple groups, this will only occur once.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_1PLUS_PPC_CMPL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_1PLUS_PPC_CMPL] }, [ POWER5_PME_PM_L2SC_SHR_MOD ] = { .pme_name = "PM_L2SC_SHR_MOD", .pme_code = 0x700c2, .pme_short_desc = "L2 slice C transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C. ", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SC_SHR_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SC_SHR_MOD] }, [ POWER5_PME_PM_PMC6_OVERFLOW ] = { .pme_name = "PM_PMC6_OVERFLOW", .pme_code = 0x30001a, .pme_short_desc = "PMC6 Overflow", .pme_long_desc = "Overflows from PMC6 are counted. This effectively widens the PMC. 
The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PMC6_OVERFLOW], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PMC6_OVERFLOW] }, [ POWER5_PME_PM_LSU_LRQ_FULL_CYC ] = { .pme_name = "PM_LSU_LRQ_FULL_CYC", .pme_code = 0x110c2, .pme_short_desc = "Cycles LRQ full", .pme_long_desc = "Cycles when the LRQ is full.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_LRQ_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_LRQ_FULL_CYC] }, [ POWER5_PME_PM_IC_PREF_INSTALL ] = { .pme_name = "PM_IC_PREF_INSTALL", .pme_code = 0x210c7, .pme_short_desc = "Instruction prefetched installed in prefetch buffer", .pme_long_desc = "A prefetch buffer entry (line) is allocated but the request is not a demand fetch.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_IC_PREF_INSTALL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_IC_PREF_INSTALL] }, [ POWER5_PME_PM_TLB_MISS ] = { .pme_name = "PM_TLB_MISS", .pme_code = 0x180088, .pme_short_desc = "TLB misses", .pme_long_desc = "Total of Data TLB misses + Instruction TLB misses", .pme_event_ids = power5_event_ids[POWER5_PME_PM_TLB_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_TLB_MISS] }, [ POWER5_PME_PM_GCT_FULL_CYC ] = { .pme_name = "PM_GCT_FULL_CYC", .pme_code = 0x100c0, .pme_short_desc = "Cycles GCT full", .pme_long_desc = "The Global Completion Table is completely full.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GCT_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GCT_FULL_CYC] }, [ POWER5_PME_PM_FXU_BUSY ] = { .pme_name = "PM_FXU_BUSY", .pme_code = 0x200012, .pme_short_desc = "FXU busy", .pme_long_desc = "Cycles when both FXU0 and FXU1 are busy.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FXU_BUSY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FXU_BUSY] }, [ POWER5_PME_PM_MRK_DATA_FROM_L3_CYC ] = { .pme_name = 
"PM_MRK_DATA_FROM_L3_CYC", .pme_code = 0x2c70a4, .pme_short_desc = "Marked load latency from L3", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L3_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L3_CYC] }, [ POWER5_PME_PM_LSU_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU_REJECT_LMQ_FULL", .pme_code = 0x2c6088, .pme_short_desc = "LSU reject due to LMQ full or missed data coming", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all the eight entries are full, subsequent load instructions are rejected. Combined unit 0 + 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_REJECT_LMQ_FULL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_REJECT_LMQ_FULL] }, [ POWER5_PME_PM_LSU_SRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_SRQ_S0_ALLOC", .pme_code = 0xc20e5, .pme_short_desc = "SRQ slot 0 allocated", .pme_long_desc = "SRQ Slot zero was allocated", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_SRQ_S0_ALLOC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_SRQ_S0_ALLOC] }, [ POWER5_PME_PM_GRP_MRK ] = { .pme_name = "PM_GRP_MRK", .pme_code = 0x100014, .pme_short_desc = "Group marked in IDU", .pme_long_desc = "A group was sampled (marked). The group is called a marked group. One instruction within the group is tagged for detailed monitoring. The sampled instruction is called a marked instruction. 
Events associated with the marked instruction are annotated with the marked term.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GRP_MRK], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GRP_MRK] }, [ POWER5_PME_PM_INST_FROM_L25_SHR ] = { .pme_name = "PM_INST_FROM_L25_SHR", .pme_code = 0x122096, .pme_short_desc = "Instruction fetched from L2.5 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (T or SL) data from the L2 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_FROM_L25_SHR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_FROM_L25_SHR] }, [ POWER5_PME_PM_FPU1_FIN ] = { .pme_name = "PM_FPU1_FIN", .pme_code = 0x10c7, .pme_short_desc = "FPU1 produced a result", .pme_long_desc = "FPU1 finished, produced a result. This only indicates finish, not completion. Floating Point Stores are included in this count but not Floating Point Loads.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU1_FIN], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU1_FIN] }, [ POWER5_PME_PM_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_DC_PREF_STREAM_ALLOC", .pme_code = 0x830e7, .pme_short_desc = "D cache new prefetch stream allocated", .pme_long_desc = "A new Prefetch Stream was allocated.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DC_PREF_STREAM_ALLOC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DC_PREF_STREAM_ALLOC] }, [ POWER5_PME_PM_BR_MPRED_TA ] = { .pme_name = "PM_BR_MPRED_TA", .pme_code = 0x230e6, .pme_short_desc = "Branch mispredictions due to target address", .pme_long_desc = "A branch instruction target was incorrectly predicted. 
This will result in a branch mispredict flush unless a flush is detected from an older instruction.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_BR_MPRED_TA], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_BR_MPRED_TA] }, [ POWER5_PME_PM_CRQ_FULL_CYC ] = { .pme_name = "PM_CRQ_FULL_CYC", .pme_code = 0x110c1, .pme_short_desc = "Cycles CR issue queue full", .pme_long_desc = "The issue queue that feeds the Conditional Register unit is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_CRQ_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_CRQ_FULL_CYC] }, [ POWER5_PME_PM_L2SA_RCLD_DISP ] = { .pme_name = "PM_L2SA_RCLD_DISP", .pme_code = 0x701c0, .pme_short_desc = "L2 slice A RC load dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Load was attempted", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SA_RCLD_DISP], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SA_RCLD_DISP] }, [ POWER5_PME_PM_SNOOP_WR_RETRY_QFULL ] = { .pme_name = "PM_SNOOP_WR_RETRY_QFULL", .pme_code = 0x710c6, .pme_short_desc = "Snoop write retry due to write queue full", .pme_long_desc = "A snoop request for a write to memory was retried because the write queues were full. When this happens the snoop request is retried and the writes in the write reorder queue are changed to high priority. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_SNOOP_WR_RETRY_QFULL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_SNOOP_WR_RETRY_QFULL] }, [ POWER5_PME_PM_MRK_DTLB_REF_4K ] = { .pme_name = "PM_MRK_DTLB_REF_4K", .pme_code = 0xc40c3, .pme_short_desc = "Marked Data TLB reference for 4K page", .pme_long_desc = "Data TLB references by a marked instruction for 4KB pages.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DTLB_REF_4K], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DTLB_REF_4K] }, [ POWER5_PME_PM_LSU_SRQ_S0_VALID ] = { .pme_name = "PM_LSU_SRQ_S0_VALID", .pme_code = 0xc20e1, .pme_short_desc = "SRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. In SMT mode the SRQ is split between the two threads (16 entries each).", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_SRQ_S0_VALID], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_SRQ_S0_VALID] }, [ POWER5_PME_PM_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_LSU0_FLUSH_LRQ", .pme_code = 0xc00c2, .pme_short_desc = "LSU0 LRQ flushes", .pme_long_desc = "A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU0_FLUSH_LRQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU0_FLUSH_LRQ] }, [ POWER5_PME_PM_INST_FROM_L275_MOD ] = { .pme_name = "PM_INST_FROM_L275_MOD", .pme_code = 0x422096, .pme_short_desc = "Instruction fetched from L2.75 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from the L2 on a different module than this processor is located. 
Fetch groups can contain up to 8 instructions ", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_FROM_L275_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_FROM_L275_MOD] }, [ POWER5_PME_PM_GCT_EMPTY_CYC ] = { .pme_name = "PM_GCT_EMPTY_CYC", .pme_code = 0x200004, .pme_short_desc = "Cycles GCT empty", .pme_long_desc = "The Global Completion Table is completely empty", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GCT_EMPTY_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GCT_EMPTY_CYC] }, [ POWER5_PME_PM_LARX_LSU0 ] = { .pme_name = "PM_LARX_LSU0", .pme_code = 0x820e7, .pme_short_desc = "Larx executed on LSU0", .pme_long_desc = "A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0)", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LARX_LSU0], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LARX_LSU0] }, [ POWER5_PME_PM_THRD_PRIO_DIFF_5or6_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_5or6_CYC", .pme_code = 0x430e6, .pme_short_desc = "Cycles thread priority difference is 5 or 6", .pme_long_desc = "Cycles when this thread's priority is higher than the other thread's priority by 5 or 6.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_PRIO_DIFF_5or6_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_PRIO_DIFF_5or6_CYC] }, [ POWER5_PME_PM_SNOOP_RETRY_1AHEAD ] = { .pme_name = "PM_SNOOP_RETRY_1AHEAD", .pme_code = 0x725e6, .pme_short_desc = "Snoop retry due to one ahead collision", .pme_long_desc = "Snoop retry due to one ahead collision", .pme_event_ids = power5_event_ids[POWER5_PME_PM_SNOOP_RETRY_1AHEAD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_SNOOP_RETRY_1AHEAD] }, [ POWER5_PME_PM_FPU1_FSQRT ] = { .pme_name = "PM_FPU1_FSQRT", .pme_code = 0xc6, .pme_short_desc = "FPU1 executed FSQRT instruction", .pme_long_desc = "FPU1 has executed a square root instruction. 
This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU1_FSQRT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU1_FSQRT] }, [ POWER5_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU1", .pme_code = 0x820e4, .pme_short_desc = "LSU1 marked L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache, by LSU1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LD_MISS_L1_LSU1], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LD_MISS_L1_LSU1] }, [ POWER5_PME_PM_MRK_FPU_FIN ] = { .pme_name = "PM_MRK_FPU_FIN", .pme_code = 0x300014, .pme_short_desc = "Marked instruction FPU processing finished", .pme_long_desc = "One of the Floating Point Units finished a marked instruction. Instructions that finish may not necessarily complete", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_FPU_FIN], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_FPU_FIN] }, [ POWER5_PME_PM_THRD_PRIO_5_CYC ] = { .pme_name = "PM_THRD_PRIO_5_CYC", .pme_code = 0x420e4, .pme_short_desc = "Cycles thread running at priority level 5", .pme_long_desc = "Cycles this thread was running at priority level 5.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_PRIO_5_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_PRIO_5_CYC] }, [ POWER5_PME_PM_MRK_DATA_FROM_LMEM ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM", .pme_code = 0x2c7087, .pme_short_desc = "Marked data loaded from local memory", .pme_long_desc = "The processor's Data Cache was reloaded due to a marked load from memory attached to the same module this processor is located on.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_LMEM], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_LMEM] }, [ POWER5_PME_PM_FPU1_FRSP_FCONV ] = { .pme_name = "PM_FPU1_FRSP_FCONV", .pme_code = 0x10c5, .pme_short_desc = "FPU1 executed FRSP or FCONV instructions", .pme_long_desc = "FPU1 has executed a 
frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU1_FRSP_FCONV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU1_FRSP_FCONV] }, [ POWER5_PME_PM_SNOOP_TLBIE ] = { .pme_name = "PM_SNOOP_TLBIE", .pme_code = 0x800c3, .pme_short_desc = "Snoop TLBIE", .pme_long_desc = "A tlbie was snooped from another processor.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_SNOOP_TLBIE], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_SNOOP_TLBIE] }, [ POWER5_PME_PM_L3SB_SNOOP_RETRY ] = { .pme_name = "PM_L3SB_SNOOP_RETRY", .pme_code = 0x731e4, .pme_short_desc = "L3 slice B snoop retries", .pme_long_desc = "Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b)", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SB_SNOOP_RETRY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SB_SNOOP_RETRY] }, [ POWER5_PME_PM_FAB_VBYPASS_EMPTY ] = { .pme_name = "PM_FAB_VBYPASS_EMPTY", .pme_code = 0x731e7, .pme_short_desc = "Vertical bypass buffer empty", .pme_long_desc = "Fabric cycles when the Middle Bypass sidecar is empty. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FAB_VBYPASS_EMPTY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FAB_VBYPASS_EMPTY] }, [ POWER5_PME_PM_MRK_DATA_FROM_L275_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L275_MOD", .pme_code = 0x1c70a3, .pme_short_desc = "Marked data loaded from L2.75 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L2 on a different module than this processor is located due to a marked load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L275_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L275_MOD] }, [ POWER5_PME_PM_6INST_CLB_CYC ] = { .pme_name = "PM_6INST_CLB_CYC", .pme_code = 0x400c6, .pme_short_desc = "Cycles 6 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. 
Each thread has its own set of CLB; these events are thread specific.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_6INST_CLB_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_6INST_CLB_CYC] }, [ POWER5_PME_PM_L2SB_RCST_DISP ] = { .pme_name = "PM_L2SB_RCST_DISP", .pme_code = 0x702c1, .pme_short_desc = "L2 slice B RC store dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Store was attempted.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SB_RCST_DISP], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SB_RCST_DISP] }, [ POWER5_PME_PM_FLUSH ] = { .pme_name = "PM_FLUSH", .pme_code = 0x110c7, .pme_short_desc = "Flushes", .pme_long_desc = "Flushes occurred including LSU and Branch flushes.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FLUSH], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FLUSH] }, [ POWER5_PME_PM_L2SC_MOD_INV ] = { .pme_name = "PM_L2SC_MOD_INV", .pme_code = 0x730e2, .pme_short_desc = "L2 slice C transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SC_MOD_INV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SC_MOD_INV] }, [ POWER5_PME_PM_FPU_DENORM ] = { .pme_name = "PM_FPU_DENORM", .pme_code = 0x102088, .pme_short_desc = "FPU received denormalized data", .pme_long_desc = "The floating point unit has encountered a denormalized operand. 
Combined Unit 0 + Unit 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU_DENORM], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU_DENORM] }, [ POWER5_PME_PM_L3SC_HIT ] = { .pme_name = "PM_L3SC_HIT", .pme_code = 0x711c5, .pme_short_desc = "L3 slice C hits", .pme_long_desc = "Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 Slice", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SC_HIT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SC_HIT] }, [ POWER5_PME_PM_SNOOP_WR_RETRY_RQ ] = { .pme_name = "PM_SNOOP_WR_RETRY_RQ", .pme_code = 0x706c6, .pme_short_desc = "Snoop write/dclaim retry due to collision with active read queue", .pme_long_desc = "A snoop request for a write or dclaim to memory was retried because it matched the cacheline of an active read. This event is sent from the Memory Controller clock domain and must be scaled accordingly", .pme_event_ids = power5_event_ids[POWER5_PME_PM_SNOOP_WR_RETRY_RQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_SNOOP_WR_RETRY_RQ] }, [ POWER5_PME_PM_LSU1_REJECT_SRQ ] = { .pme_name = "PM_LSU1_REJECT_SRQ", .pme_code = 0xc60e4, .pme_short_desc = "LSU1 SRQ lhs rejects", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions because of Load Hit Store conditions. 
Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU1_REJECT_SRQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU1_REJECT_SRQ] }, [ POWER5_PME_PM_IC_PREF_REQ ] = { .pme_name = "PM_IC_PREF_REQ", .pme_code = 0x220e6, .pme_short_desc = "Instruction prefetch requests", .pme_long_desc = "An instruction prefetch request has been made.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_IC_PREF_REQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_IC_PREF_REQ] }, [ POWER5_PME_PM_L3SC_ALL_BUSY ] = { .pme_name = "PM_L3SC_ALL_BUSY", .pme_code = 0x721e5, .pme_short_desc = "L3 slice C active for every cycle all CI/CO machines busy", .pme_long_desc = "Cycles All Castin/Castout machines are busy.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SC_ALL_BUSY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SC_ALL_BUSY] }, [ POWER5_PME_PM_MRK_GRP_IC_MISS ] = { .pme_name = "PM_MRK_GRP_IC_MISS", .pme_code = 0x412091, .pme_short_desc = "Group experienced marked I cache miss", .pme_long_desc = "A group containing a marked (sampled) instruction experienced an instruction cache miss.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_GRP_IC_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_GRP_IC_MISS] }, [ POWER5_PME_PM_GCT_NOSLOT_IC_MISS ] = { .pme_name = "PM_GCT_NOSLOT_IC_MISS", .pme_code = 0x21009c, .pme_short_desc = "No slot in GCT caused by I cache miss", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread because of an Instruction Cache miss.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GCT_NOSLOT_IC_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GCT_NOSLOT_IC_MISS] }, [ POWER5_PME_PM_MRK_DATA_FROM_L3 ] = { .pme_name = "PM_MRK_DATA_FROM_L3", .pme_code = 0x1c708e, 
.pme_short_desc = "Marked data loaded from L3", .pme_long_desc = "The processor's Data Cache was reloaded from the local L3 due to a marked load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L3], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L3] }, [ POWER5_PME_PM_GCT_NOSLOT_SRQ_FULL ] = { .pme_name = "PM_GCT_NOSLOT_SRQ_FULL", .pme_code = 0x310084, .pme_short_desc = "No slot in GCT caused by SRQ full", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread because the Store Request Queue (SRQ) is full. This happens when the storage subsystem can not process the stores in the SRQ. Groups can not be dispatched until a SRQ entry is available.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GCT_NOSLOT_SRQ_FULL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GCT_NOSLOT_SRQ_FULL] }, [ POWER5_PME_PM_THRD_SEL_OVER_ISU_HOLD ] = { .pme_name = "PM_THRD_SEL_OVER_ISU_HOLD", .pme_code = 0x410c5, .pme_short_desc = "Thread selection overrides caused by ISU holds", .pme_long_desc = "Thread selection was overridden because of an ISU hold.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_SEL_OVER_ISU_HOLD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_SEL_OVER_ISU_HOLD] }, [ POWER5_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { .pme_name = "PM_CMPLU_STALL_DCACHE_MISS", .pme_code = 0x21109a, .pme_short_desc = "Completion stall caused by D cache miss", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered a Data Cache Miss. Data Cache Miss has higher priority than any other Load/Store delay, so if an instruction encounters multiple delays only the Data Cache Miss will be reported and the entire delay period will be charged to Data Cache Miss. 
This is a subset of PM_CMPLU_STALL_LSU.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_CMPLU_STALL_DCACHE_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_CMPLU_STALL_DCACHE_MISS] }, [ POWER5_PME_PM_L3SA_MOD_INV ] = { .pme_name = "PM_L3SA_MOD_INV", .pme_code = 0x730e3, .pme_short_desc = "L3 slice A transition from modified to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I) Mu|Me are not included since they are formed due to a prev read op. Tx is not included since it is considered shared at this point.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SA_MOD_INV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SA_MOD_INV] }, [ POWER5_PME_PM_LSU_FLUSH_LRQ ] = { .pme_name = "PM_LSU_FLUSH_LRQ", .pme_code = 0x2c0090, .pme_short_desc = "LRQ flushes", .pme_long_desc = "A load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. Combined Units 0 and 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_FLUSH_LRQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_FLUSH_LRQ] }, [ POWER5_PME_PM_THRD_PRIO_2_CYC ] = { .pme_name = "PM_THRD_PRIO_2_CYC", .pme_code = 0x420e1, .pme_short_desc = "Cycles thread running at priority level 2", .pme_long_desc = "Cycles this thread was running at priority level 2.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_PRIO_2_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_PRIO_2_CYC] }, [ POWER5_PME_PM_LSU_FLUSH_SRQ ] = { .pme_name = "PM_LSU_FLUSH_SRQ", .pme_code = 0x1c0090, .pme_short_desc = "SRQ flushes", .pme_long_desc = "A store was flushed because a younger load hit an older store that is already in the SRQ or in the same group. 
Combined Unit 0 + 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_FLUSH_SRQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_FLUSH_SRQ] }, [ POWER5_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { .pme_name = "PM_MRK_LSU_SRQ_INST_VALID", .pme_code = 0xc70e6, .pme_short_desc = "Marked instruction valid in SRQ", .pme_long_desc = "This signal is asserted every cycle when a marked request is resident in the Store Request Queue", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LSU_SRQ_INST_VALID], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LSU_SRQ_INST_VALID] }, [ POWER5_PME_PM_L3SA_REF ] = { .pme_name = "PM_L3SA_REF", .pme_code = 0x701c3, .pme_short_desc = "L3 slice A references", .pme_long_desc = "Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice ", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SA_REF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SA_REF] }, [ POWER5_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL ] = { .pme_name = "PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL", .pme_code = 0x713c2, .pme_short_desc = "L2 slice C RC dispatch attempt failed due to all CO busy", .pme_long_desc = "A Read/Claim dispatch was rejected because all Castout machines were busy.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL] }, [ POWER5_PME_PM_FPU0_STALL3 ] = { .pme_name = "PM_FPU0_STALL3", .pme_code = 0x20e1, .pme_short_desc = "FPU0 stalled in pipe3", .pme_long_desc = "FPU0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always).", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU0_STALL3], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU0_STALL3] }, [ POWER5_PME_PM_GPR_MAP_FULL_CYC ] = { .pme_name = "PM_GPR_MAP_FULL_CYC", .pme_code = 0x130e5, .pme_short_desc = "Cycles GPR mapper full", .pme_long_desc = "The General 
Purpose Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GPR_MAP_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GPR_MAP_FULL_CYC] }, [ POWER5_PME_PM_TB_BIT_TRANS ] = { .pme_name = "PM_TB_BIT_TRANS", .pme_code = 0x100018, .pme_short_desc = "Time Base bit transition", .pme_long_desc = "When the selected time base bit (as specified in MMCR0[TBSEL]) transitions from 0 to 1 ", .pme_event_ids = power5_event_ids[POWER5_PME_PM_TB_BIT_TRANS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_TB_BIT_TRANS] }, [ POWER5_PME_PM_MRK_LSU_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_LRQ", .pme_code = 0x381088, .pme_short_desc = "Marked LRQ flushes", .pme_long_desc = "A marked load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LSU_FLUSH_LRQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LSU_FLUSH_LRQ] }, [ POWER5_PME_PM_FPU0_STF ] = { .pme_name = "PM_FPU0_STF", .pme_code = 0x20e2, .pme_short_desc = "FPU0 executed store instruction", .pme_long_desc = "FPU0 has executed a Floating Point Store instruction.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU0_STF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU0_STF] }, [ POWER5_PME_PM_MRK_DTLB_MISS ] = { .pme_name = "PM_MRK_DTLB_MISS", .pme_code = 0xc50c6, .pme_short_desc = "Marked Data TLB misses", .pme_long_desc = "Data TLB references by a marked instruction that missed the TLB (all page sizes).", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DTLB_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DTLB_MISS] }, [ POWER5_PME_PM_FPU1_FMA ] = { .pme_name = 
"PM_FPU1_FMA", .pme_code = 0xc5, .pme_short_desc = "FPU1 executed multiply-add instruction", .pme_long_desc = "The floating point unit has executed a multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU1_FMA], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU1_FMA] }, [ POWER5_PME_PM_L2SA_MOD_TAG ] = { .pme_name = "PM_L2SA_MOD_TAG", .pme_code = 0x720e0, .pme_short_desc = "L2 slice A transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SA_MOD_TAG], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SA_MOD_TAG] }, [ POWER5_PME_PM_LSU1_FLUSH_ULD ] = { .pme_name = "PM_LSU1_FLUSH_ULD", .pme_code = 0xc00c4, .pme_short_desc = "LSU1 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 1 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1).", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU1_FLUSH_ULD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU1_FLUSH_ULD] }, [ POWER5_PME_PM_MRK_LSU0_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU0_FLUSH_UST", .pme_code = 0x810c1, .pme_short_desc = "LSU0 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 0 because it was unaligned", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LSU0_FLUSH_UST], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LSU0_FLUSH_UST] }, [ POWER5_PME_PM_MRK_INST_FIN ] = { .pme_name = "PM_MRK_INST_FIN", .pme_code = 0x300005, .pme_short_desc = "Marked instruction finished", .pme_long_desc = "One of the execution units finished a marked instruction. 
Instructions that finish may not necessarily complete", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_INST_FIN], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_INST_FIN] }, [ POWER5_PME_PM_FPU0_FULL_CYC ] = { .pme_name = "PM_FPU0_FULL_CYC", .pme_code = 0x100c3, .pme_short_desc = "Cycles FPU0 issue queue full", .pme_long_desc = "The issue queue for FPU0 cannot accept any more instructions. Dispatch to this issue queue is stopped.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU0_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU0_FULL_CYC] }, [ POWER5_PME_PM_LSU_LRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LRQ_S0_ALLOC", .pme_code = 0xc20e6, .pme_short_desc = "LRQ slot 0 allocated", .pme_long_desc = "LRQ slot zero was allocated", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_LRQ_S0_ALLOC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_LRQ_S0_ALLOC] }, [ POWER5_PME_PM_MRK_LSU1_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU1_FLUSH_ULD", .pme_code = 0x810c4, .pme_short_desc = "LSU1 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 1 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LSU1_FLUSH_ULD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LSU1_FLUSH_ULD] }, [ POWER5_PME_PM_MRK_DTLB_REF ] = { .pme_name = "PM_MRK_DTLB_REF", .pme_code = 0x1c4090, .pme_short_desc = "Marked Data TLB reference", .pme_long_desc = "Total number of Data TLB references by a marked instruction for all page sizes. 
Page size is determined at TLB reload time.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DTLB_REF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DTLB_REF] }, [ POWER5_PME_PM_BR_UNCOND ] = { .pme_name = "PM_BR_UNCOND", .pme_code = 0x123087, .pme_short_desc = "Unconditional branch", .pme_long_desc = "An unconditional branch was executed.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_BR_UNCOND], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_BR_UNCOND] }, [ POWER5_PME_PM_THRD_SEL_OVER_L2MISS ] = { .pme_name = "PM_THRD_SEL_OVER_L2MISS", .pme_code = 0x410c3, .pme_short_desc = "Thread selection overrides caused by L2 misses", .pme_long_desc = "Thread selection was overridden because one thread had an L2 miss pending.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_SEL_OVER_L2MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_SEL_OVER_L2MISS] }, [ POWER5_PME_PM_L2SB_SHR_INV ] = { .pme_name = "PM_L2SB_SHR_INV", .pme_code = 0x710c1, .pme_short_desc = "L2 slice B transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful the tablewalk duration event should also be counted.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SB_SHR_INV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SB_SHR_INV] }, [ POWER5_PME_PM_MEM_LO_PRIO_WR_CMPL ] = { .pme_name = "PM_MEM_LO_PRIO_WR_CMPL", .pme_code = 0x736e6, .pme_short_desc = "Low priority write completed", .pme_long_desc = "A memory write, which was not upgraded to high priority, completed. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_LO_PRIO_WR_CMPL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_LO_PRIO_WR_CMPL] }, [ POWER5_PME_PM_L3SC_MOD_TAG ] = { .pme_name = "PM_L3SC_MOD_TAG", .pme_code = 0x720e5, .pme_short_desc = "L3 slice C transition from modified to TAG", .pme_long_desc = "L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. L3 going M->T or M->I(go_Mu case); Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SC_MOD_TAG], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SC_MOD_TAG] }, [ POWER5_PME_PM_MRK_ST_MISS_L1 ] = { .pme_name = "PM_MRK_ST_MISS_L1", .pme_code = 0x820e3, .pme_short_desc = "Marked L1 D cache store misses", .pme_long_desc = "A marked store missed the dcache", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_ST_MISS_L1], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_ST_MISS_L1] }, [ POWER5_PME_PM_GRP_DISP_SUCCESS ] = { .pme_name = "PM_GRP_DISP_SUCCESS", .pme_code = 0x300002, .pme_short_desc = "Group dispatch success", .pme_long_desc = "Number of groups successfully dispatched (not rejected)", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GRP_DISP_SUCCESS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GRP_DISP_SUCCESS] }, [ POWER5_PME_PM_THRD_PRIO_DIFF_1or2_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_1or2_CYC", .pme_code = 0x430e4, .pme_short_desc = "Cycles thread priority difference is 1 or 2", .pme_long_desc = "Cycles when this thread's priority is higher than the other thread's priority by 1 or 2.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_PRIO_DIFF_1or2_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_PRIO_DIFF_1or2_CYC] }, [ POWER5_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { .pme_name = 
"PM_IC_DEMAND_L2_BHT_REDIRECT", .pme_code = 0x230e0, .pme_short_desc = "L2 I cache demand request due to BHT redirect", .pme_long_desc = "A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (CR mispredict).", .pme_event_ids = power5_event_ids[POWER5_PME_PM_IC_DEMAND_L2_BHT_REDIRECT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_IC_DEMAND_L2_BHT_REDIRECT] }, [ POWER5_PME_PM_MEM_WQ_DISP_Q8to15 ] = { .pme_name = "PM_MEM_WQ_DISP_Q8to15", .pme_code = 0x733e6, .pme_short_desc = "Memory write queue dispatched to queues 8-15", .pme_long_desc = "A memory operation was dispatched to a write queue in the range between 8 and 15. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_WQ_DISP_Q8to15], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_WQ_DISP_Q8to15] }, [ POWER5_PME_PM_FPU0_SINGLE ] = { .pme_name = "PM_FPU0_SINGLE", .pme_code = 0x20e3, .pme_short_desc = "FPU0 executed single precision instruction", .pme_long_desc = "FPU0 has executed a single precision instruction.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU0_SINGLE], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU0_SINGLE] }, [ POWER5_PME_PM_LSU_DERAT_MISS ] = { .pme_name = "PM_LSU_DERAT_MISS", .pme_code = 0x280090, .pme_short_desc = "DERAT misses", .pme_long_desc = "Total D-ERAT Misses. Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction. Combined Unit 0 + 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_DERAT_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_DERAT_MISS] }, [ POWER5_PME_PM_THRD_PRIO_1_CYC ] = { .pme_name = "PM_THRD_PRIO_1_CYC", .pme_code = 0x420e0, .pme_short_desc = "Cycles thread running at priority level 1", .pme_long_desc = "Cycles this thread was running at priority level 1. 
Priority level 1 is the lowest and indicates the thread is sleeping.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_PRIO_1_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_PRIO_1_CYC] }, [ POWER5_PME_PM_L2SC_RCST_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SC_RCST_DISP_FAIL_OTHER", .pme_code = 0x732e2, .pme_short_desc = "L2 slice C RC store dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SC_RCST_DISP_FAIL_OTHER], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SC_RCST_DISP_FAIL_OTHER] }, [ POWER5_PME_PM_FPU1_FEST ] = { .pme_name = "PM_FPU1_FEST", .pme_code = 0x10c6, .pme_short_desc = "FPU1 executed FEST instruction", .pme_long_desc = "FPU1 has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU1_FEST], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU1_FEST] }, [ POWER5_PME_PM_FAB_HOLDtoVN_EMPTY ] = { .pme_name = "PM_FAB_HOLDtoVN_EMPTY", .pme_code = 0x721e7, .pme_short_desc = "Hold buffer to VN empty", .pme_long_desc = "Fabric cycles when the Vertical Node out hold-buffers are empty. The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FAB_HOLDtoVN_EMPTY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FAB_HOLDtoVN_EMPTY] }, [ POWER5_PME_PM_SNOOP_RD_RETRY_RQ ] = { .pme_name = "PM_SNOOP_RD_RETRY_RQ", .pme_code = 0x705c6, .pme_short_desc = "Snoop read retry due to collision with active read queue", .pme_long_desc = "A snoop request for a read from memory was retried because it matched the cache line of an active read. 
The snoop request is retried because the L2 may be able to source data via intervention for the 2nd read faster than the MC. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_SNOOP_RD_RETRY_RQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_SNOOP_RD_RETRY_RQ] }, [ POWER5_PME_PM_SNOOP_DCLAIM_RETRY_QFULL ] = { .pme_name = "PM_SNOOP_DCLAIM_RETRY_QFULL", .pme_code = 0x720e6, .pme_short_desc = "Snoop dclaim/flush retry due to write/dclaim queues full", .pme_long_desc = "A snoop dclaim or flush request was retried because the write/dclaim queues were full. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_SNOOP_DCLAIM_RETRY_QFULL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_SNOOP_DCLAIM_RETRY_QFULL] }, [ POWER5_PME_PM_MRK_DATA_FROM_L25_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L25_SHR_CYC", .pme_code = 0x2c70a2, .pme_short_desc = "Marked load latency from L2.5 shared", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L25_SHR_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L25_SHR_CYC] }, [ POWER5_PME_PM_MRK_ST_CMPL_INT ] = { .pme_name = "PM_MRK_ST_CMPL_INT", .pme_code = 0x300003, .pme_short_desc = "Marked store completed with intervention", .pme_long_desc = "A marked store previously sent to the memory subsystem completed (data home) after requiring intervention", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_ST_CMPL_INT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_ST_CMPL_INT] }, [ POWER5_PME_PM_FLUSH_BR_MPRED ] = { .pme_name = "PM_FLUSH_BR_MPRED", .pme_code = 0x110c6, .pme_short_desc = "Flush caused by branch mispredict", .pme_long_desc = "A flush was caused by a branch mispredict.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FLUSH_BR_MPRED], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FLUSH_BR_MPRED] }, [ POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SB_RCLD_DISP_FAIL_ADDR", .pme_code = 0x711c1, .pme_short_desc = "L2 slice B RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_ADDR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_ADDR] }, [ POWER5_PME_PM_FPU_STF ] = { .pme_name = "PM_FPU_STF", .pme_code = 0x202090, .pme_short_desc = "FPU executed store instruction", .pme_long_desc = "FPU has executed a store instruction. 
Combined Unit 0 + Unit 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU_STF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU_STF] }, [ POWER5_PME_PM_CMPLU_STALL_FPU ] = { .pme_name = "PM_CMPLU_STALL_FPU", .pme_code = 0x411098, .pme_short_desc = "Completion stall caused by FPU instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a floating point instruction.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_CMPLU_STALL_FPU], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_CMPLU_STALL_FPU] }, [ POWER5_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus1or2_CYC", .pme_code = 0x430e2, .pme_short_desc = "Cycles thread priority difference is -1 or -2", .pme_long_desc = "Cycles when this thread's priority is lower than the other thread's priority by 1 or 2.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC] }, [ POWER5_PME_PM_GCT_NOSLOT_CYC ] = { .pme_name = "PM_GCT_NOSLOT_CYC", .pme_code = 0x100004, .pme_short_desc = "Cycles no GCT slot allocated", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GCT_NOSLOT_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GCT_NOSLOT_CYC] }, [ POWER5_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { .pme_name = "PM_FXU0_BUSY_FXU1_IDLE", .pme_code = 0x300012, .pme_short_desc = "FXU0 busy FXU1 idle", .pme_long_desc = "FXU0 is busy while FXU1 was idle", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FXU0_BUSY_FXU1_IDLE], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FXU0_BUSY_FXU1_IDLE] }, [ POWER5_PME_PM_PTEG_FROM_L35_SHR ] = { .pme_name = "PM_PTEG_FROM_L35_SHR", .pme_code = 0x18309e, .pme_short_desc = "PTEG loaded from L3.5 shared", .pme_long_desc = "A Page Table Entry was loaded into the 
TLB with shared (S) data from the L3 of a chip on the same module as this processor is located, due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PTEG_FROM_L35_SHR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PTEG_FROM_L35_SHR] }, [ POWER5_PME_PM_MRK_LSU_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU_FLUSH_UST", .pme_code = 0x381090, .pme_short_desc = "Marked unaligned store flushes", .pme_long_desc = "A marked store was flushed because it was unaligned", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LSU_FLUSH_UST], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LSU_FLUSH_UST] }, [ POWER5_PME_PM_L3SA_HIT ] = { .pme_name = "PM_L3SA_HIT", .pme_code = 0x711c3, .pme_short_desc = "L3 slice A hits", .pme_long_desc = "Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SA_HIT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SA_HIT] }, [ POWER5_PME_PM_MRK_DATA_FROM_L25_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L25_SHR", .pme_code = 0x1c7097, .pme_short_desc = "Marked data loaded from L2.5 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T or SL) data from the L2 of a chip on the same module as this processor is located due to a marked load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L25_SHR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L25_SHR] }, [ POWER5_PME_PM_L2SB_RCST_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SB_RCST_DISP_FAIL_ADDR", .pme_code = 0x712c1, .pme_short_desc = "L2 slice B RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a store failed because of an address conflict. 
Two RC machines will never both work on the same line or line in the same congruence class at the same time.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SB_RCST_DISP_FAIL_ADDR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SB_RCST_DISP_FAIL_ADDR] }, [ POWER5_PME_PM_MRK_DATA_FROM_L35_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L35_SHR", .pme_code = 0x1c709e, .pme_short_desc = "Marked data loaded from L3.5 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on the same module as this processor is located due to a marked load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L35_SHR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L35_SHR] }, [ POWER5_PME_PM_IERAT_XLATE_WR ] = { .pme_name = "PM_IERAT_XLATE_WR", .pme_code = 0x220e7, .pme_short_desc = "Translation written to ierat", .pme_long_desc = "An entry was written into the IERAT as a result of an IERAT miss. This event can be used to count IERAT misses. An ERAT miss that is later ignored will not be counted unless the ERAT is written before the instruction stream is changed.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_IERAT_XLATE_WR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_IERAT_XLATE_WR] }, [ POWER5_PME_PM_L2SA_ST_REQ ] = { .pme_name = "PM_L2SA_ST_REQ", .pme_code = 0x723e0, .pme_short_desc = "L2 slice A store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. 
The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SA_ST_REQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SA_ST_REQ] }, [ POWER5_PME_PM_THRD_SEL_T1 ] = { .pme_name = "PM_THRD_SEL_T1", .pme_code = 0x410c1, .pme_short_desc = "Decode selected thread 1", .pme_long_desc = "Thread selection picked thread 1 for decode.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_SEL_T1], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_SEL_T1] }, [ POWER5_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BR_REDIRECT", .pme_code = 0x230e1, .pme_short_desc = "L2 I cache demand request due to branch redirect", .pme_long_desc = "A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (either ALL mispredicted or Target).", .pme_event_ids = power5_event_ids[POWER5_PME_PM_IC_DEMAND_L2_BR_REDIRECT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_IC_DEMAND_L2_BR_REDIRECT] }, [ POWER5_PME_PM_INST_FROM_LMEM ] = { .pme_name = "PM_INST_FROM_LMEM", .pme_code = 0x222086, .pme_short_desc = "Instruction fetched from local memory", .pme_long_desc = "An instruction fetch group was fetched from memory attached to the same module this processor is located on. 
Fetch groups can contain up to 8 instructions", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_FROM_LMEM], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_FROM_LMEM] }, [ POWER5_PME_PM_FPU0_1FLOP ] = { .pme_name = "PM_FPU0_1FLOP", .pme_code = 0xc3, .pme_short_desc = "FPU0 executed one flop instruction", .pme_long_desc = "FPU0 executed a one flop instruction (add, mult, sub, cmp, or sel).", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU0_1FLOP], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU0_1FLOP] }, [ POWER5_PME_PM_MRK_DATA_FROM_L35_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L35_SHR_CYC", .pme_code = 0x2c70a6, .pme_short_desc = "Marked load latency from L3.5 shared", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L35_SHR_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L35_SHR_CYC] }, [ POWER5_PME_PM_PTEG_FROM_L2 ] = { .pme_name = "PM_PTEG_FROM_L2", .pme_code = 0x183087, .pme_short_desc = "PTEG loaded from L2", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local L2 due to a demand load", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PTEG_FROM_L2], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PTEG_FROM_L2] }, [ POWER5_PME_PM_MEM_PW_CMPL ] = { .pme_name = "PM_MEM_PW_CMPL", .pme_code = 0x724e6, .pme_short_desc = "Memory partial-write completed", .pme_long_desc = "Number of Partial Writes completed. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_PW_CMPL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_PW_CMPL] }, [ POWER5_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus5or6_CYC", .pme_code = 0x430e0, .pme_short_desc = "Cycles thread priority difference is -5 or -6", .pme_long_desc = "Cycles when this thread's priority is lower than the other thread's priority by 5 or 6.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC] }, [ POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SB_RCLD_DISP_FAIL_OTHER", .pme_code = 0x731e1, .pme_short_desc = "L2 slice B RC load dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_OTHER], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_OTHER] }, [ POWER5_PME_PM_FPU0_FIN ] = { .pme_name = "PM_FPU0_FIN", .pme_code = 0x10c3, .pme_short_desc = "FPU0 produced a result", .pme_long_desc = "FPU0 finished, produced a result. This only indicates finish, not completion. Floating Point Stores are included in this count but not Floating Point Loads.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU0_FIN], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU0_FIN] }, [ POWER5_PME_PM_MRK_DTLB_MISS_4K ] = { .pme_name = "PM_MRK_DTLB_MISS_4K", .pme_code = 0xc40c1, .pme_short_desc = "Marked Data TLB misses for 4K page", .pme_long_desc = "Data TLB references to 4KB pages by a marked instruction that missed the TLB. 
Page size is determined at TLB reload time.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DTLB_MISS_4K], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DTLB_MISS_4K] }, [ POWER5_PME_PM_L3SC_SHR_INV ] = { .pme_name = "PM_L3SC_SHR_INV", .pme_code = 0x710c5, .pme_short_desc = "L3 slice C transition from shared to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is Sx in this L3 (i.e. invalidate hit SX and dispatched).", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SC_SHR_INV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SC_SHR_INV] }, [ POWER5_PME_PM_GRP_BR_REDIR ] = { .pme_name = "PM_GRP_BR_REDIR", .pme_code = 0x120e6, .pme_short_desc = "Group experienced branch redirect", .pme_long_desc = "Number of groups, counted at dispatch, that have encountered a branch redirect. Every group constructed from a fetch group that has been redirected will count.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GRP_BR_REDIR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GRP_BR_REDIR] }, [ POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SC_RCLD_DISP_FAIL_RC_FULL", .pme_code = 0x721e2, .pme_short_desc = "L2 slice C RC load dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a load failed because all RC machines are busy.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_RC_FULL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_RC_FULL] }, [ POWER5_PME_PM_MRK_LSU_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_SRQ", .pme_code = 0x481088, .pme_short_desc = "Marked SRQ lhs flushes", .pme_long_desc = "A marked store was flushed because a younger load hit an older store that is already in the SRQ or in the same group.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LSU_FLUSH_SRQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LSU_FLUSH_SRQ] }, [ POWER5_PME_PM_PTEG_FROM_L275_SHR ] = { .pme_name = 
"PM_PTEG_FROM_L275_SHR", .pme_code = 0x383097, .pme_short_desc = "PTEG loaded from L2.75 shared", .pme_long_desc = "A Page Table Entry was loaded into the TLB with shared (T) data from the L2 on a different module than this processor is located due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PTEG_FROM_L275_SHR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PTEG_FROM_L275_SHR] }, [ POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SB_RCLD_DISP_FAIL_RC_FULL", .pme_code = 0x721e1, .pme_short_desc = "L2 slice B RC load dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a load failed because all RC machines are busy.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_RC_FULL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_RC_FULL] }, [ POWER5_PME_PM_SNOOP_RD_RETRY_WQ ] = { .pme_name = "PM_SNOOP_RD_RETRY_WQ", .pme_code = 0x715c6, .pme_short_desc = "Snoop read retry due to collision with active write queue", .pme_long_desc = "A snoop request for a read from memory was retried because it matched the cache line of an active write. The snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_SNOOP_RD_RETRY_WQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_SNOOP_RD_RETRY_WQ] }, [ POWER5_PME_PM_LSU0_NCLD ] = { .pme_name = "PM_LSU0_NCLD", .pme_code = 0xc50c1, .pme_short_desc = "LSU0 non-cacheable loads", .pme_long_desc = "A non-cacheable load was executed by unit 0.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU0_NCLD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU0_NCLD] }, [ POWER5_PME_PM_FAB_DCLAIM_RETRIED ] = { .pme_name = "PM_FAB_DCLAIM_RETRIED", .pme_code = 0x730e7, .pme_short_desc = "dclaim retried", .pme_long_desc = "A DCLAIM command was retried. 
Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FAB_DCLAIM_RETRIED], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FAB_DCLAIM_RETRIED] }, [ POWER5_PME_PM_LSU1_BUSY_REJECT ] = { .pme_name = "PM_LSU1_BUSY_REJECT", .pme_code = 0xc20e7, .pme_short_desc = "LSU1 busy due to reject", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU1_BUSY_REJECT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU1_BUSY_REJECT] }, [ POWER5_PME_PM_FXLS0_FULL_CYC ] = { .pme_name = "PM_FXLS0_FULL_CYC", .pme_code = 0x110c0, .pme_short_desc = "Cycles FXU0/LS0 queue full", .pme_long_desc = "The issue queue that feeds the Fixed Point unit 0 / Load Store Unit 0 is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FXLS0_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FXLS0_FULL_CYC] }, [ POWER5_PME_PM_FPU0_FEST ] = { .pme_name = "PM_FPU0_FEST", .pme_code = 0x10c2, .pme_short_desc = "FPU0 executed FEST instruction", .pme_long_desc = "FPU0 has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. ", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU0_FEST], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU0_FEST] }, [ POWER5_PME_PM_DTLB_REF_16M ] = { .pme_name = "PM_DTLB_REF_16M", .pme_code = 0xc40c6, .pme_short_desc = "Data TLB reference for 16M page", .pme_long_desc = "Data TLB references for 16MB pages. 
Includes hits + misses.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DTLB_REF_16M], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DTLB_REF_16M] }, [ POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SC_RCLD_DISP_FAIL_ADDR", .pme_code = 0x711c2, .pme_short_desc = "L2 slice C RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_ADDR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_ADDR] }, [ POWER5_PME_PM_LSU0_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU0_REJECT_ERAT_MISS", .pme_code = 0xc60e3, .pme_short_desc = "LSU0 reject due to ERAT miss", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions due to an ERAT miss. Requests that miss the Derat are rejected and retried until the request hits in the Erat.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU0_REJECT_ERAT_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU0_REJECT_ERAT_MISS] }, [ POWER5_PME_PM_DATA_FROM_L25_MOD ] = { .pme_name = "PM_DATA_FROM_L25_MOD", .pme_code = 0x2c3097, .pme_short_desc = "Data loaded from L2.5 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L2 of a chip on the same module as this processor is located due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DATA_FROM_L25_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DATA_FROM_L25_MOD] }, [ POWER5_PME_PM_GCT_USAGE_60to79_CYC ] = { .pme_name = "PM_GCT_USAGE_60to79_CYC", .pme_code = 0x20001f, .pme_short_desc = "Cycles GCT 60-79% full", .pme_long_desc = "Cycles when the Global Completion Table has between 60% and 79% of its slots used. 
The GCT has 20 entries shared between threads.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GCT_USAGE_60to79_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GCT_USAGE_60to79_CYC] }, [ POWER5_PME_PM_DATA_FROM_L375_MOD ] = { .pme_name = "PM_DATA_FROM_L375_MOD", .pme_code = 0x1c30a7, .pme_short_desc = "Data loaded from L3.75 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on the same module as this processor is located due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DATA_FROM_L375_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DATA_FROM_L375_MOD] }, [ POWER5_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", .pme_code = 0x200015, .pme_short_desc = "Cycles LMQ and SRQ empty", .pme_long_desc = "Cycles when both the LMQ and SRQ are empty (LSU is idle)", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC] }, [ POWER5_PME_PM_LSU0_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU0_REJECT_RELOAD_CDF", .pme_code = 0xc60e2, .pme_short_desc = "LSU0 reject due to reload CDF or tag update collision", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. 
Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same system when they are being updated.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU0_REJECT_RELOAD_CDF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU0_REJECT_RELOAD_CDF] }, [ POWER5_PME_PM_0INST_FETCH ] = { .pme_name = "PM_0INST_FETCH", .pme_code = 0x42208d, .pme_short_desc = "No instructions fetched", .pme_long_desc = "No instructions were fetched this cycle (due to IFU hold, redirect, or icache miss)", .pme_event_ids = power5_event_ids[POWER5_PME_PM_0INST_FETCH], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_0INST_FETCH] }, [ POWER5_PME_PM_LSU1_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU1_REJECT_RELOAD_CDF", .pme_code = 0xc60e6, .pme_short_desc = "LSU1 reject due to reload CDF or tag update collision", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. 
Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same system when they are being updated.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU1_REJECT_RELOAD_CDF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU1_REJECT_RELOAD_CDF] }, [ POWER5_PME_PM_L1_PREF ] = { .pme_name = "PM_L1_PREF", .pme_code = 0xc70e7, .pme_short_desc = "L1 cache data prefetches", .pme_long_desc = "A request to prefetch data into the L1 was made", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L1_PREF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L1_PREF] }, [ POWER5_PME_PM_MEM_WQ_DISP_Q0to7 ] = { .pme_name = "PM_MEM_WQ_DISP_Q0to7", .pme_code = 0x723e6, .pme_short_desc = "Memory write queue dispatched to queues 0-7", .pme_long_desc = "A memory operation was dispatched to a write queue in the range between 0 and 7. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_WQ_DISP_Q0to7], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_WQ_DISP_Q0to7] }, [ POWER5_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM_CYC", .pme_code = 0x4c70a0, .pme_short_desc = "Marked load latency from local memory", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_LMEM_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_LMEM_CYC] }, [ POWER5_PME_PM_BRQ_FULL_CYC ] = { .pme_name = "PM_BRQ_FULL_CYC", .pme_code = 0x100c5, .pme_short_desc = "Cycles branch queue full", .pme_long_desc = "Cycles when the issue queue that feeds the branch unit is full. 
This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_BRQ_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_BRQ_FULL_CYC] }, [ POWER5_PME_PM_GRP_IC_MISS_NONSPEC ] = { .pme_name = "PM_GRP_IC_MISS_NONSPEC", .pme_code = 0x112099, .pme_short_desc = "Group experienced non-speculative I cache miss", .pme_long_desc = "Number of groups, counted at completion, that have encountered an instruction cache miss.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GRP_IC_MISS_NONSPEC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GRP_IC_MISS_NONSPEC] }, [ POWER5_PME_PM_PTEG_FROM_L275_MOD ] = { .pme_name = "PM_PTEG_FROM_L275_MOD", .pme_code = 0x1830a3, .pme_short_desc = "PTEG loaded from L2.75 modified", .pme_long_desc = "A Page Table Entry was loaded into the TLB with modified (M) data from the L2 on a different module than this processor is located due to a demand load. ", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PTEG_FROM_L275_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PTEG_FROM_L275_MOD] }, [ POWER5_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU0", .pme_code = 0x820e0, .pme_short_desc = "LSU0 marked L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache, by LSU0.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LD_MISS_L1_LSU0], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LD_MISS_L1_LSU0] }, [ POWER5_PME_PM_MRK_DATA_FROM_L375_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L375_SHR_CYC", .pme_code = 0x2c70a7, .pme_short_desc = "Marked load latency from L3.75 shared", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L375_SHR_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L375_SHR_CYC] }, [ POWER5_PME_PM_LSU_FLUSH ] = { .pme_name = "PM_LSU_FLUSH", .pme_code = 0x110c5, .pme_short_desc = "Flush initiated by LSU", .pme_long_desc = "A flush was initiated by the Load Store Unit", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_FLUSH], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_FLUSH] }, [ POWER5_PME_PM_DATA_FROM_L3 ] = { .pme_name = "PM_DATA_FROM_L3", .pme_code = 0x1c308e, .pme_short_desc = "Data loaded from L3", .pme_long_desc = "The processor's Data Cache was reloaded from the local L3 due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DATA_FROM_L3], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DATA_FROM_L3] }, [ POWER5_PME_PM_INST_FROM_L2 ] = { .pme_name = "PM_INST_FROM_L2", .pme_code = 0x122086, .pme_short_desc = "Instruction fetched from L2", .pme_long_desc = "An instruction fetch group was fetched from L2. Fetch Groups can contain up to 8 instructions", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_FROM_L2], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_FROM_L2] }, [ POWER5_PME_PM_PMC2_OVERFLOW ] = { .pme_name = "PM_PMC2_OVERFLOW", .pme_code = 0x30000a, .pme_short_desc = "PMC2 Overflow", .pme_long_desc = "Overflows from PMC2 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PMC2_OVERFLOW], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PMC2_OVERFLOW] }, [ POWER5_PME_PM_FPU0_DENORM ] = { .pme_name = "PM_FPU0_DENORM", .pme_code = 0x20e0, .pme_short_desc = "FPU0 received denormalized data", .pme_long_desc = "FPU0 has encountered a denormalized operand. 
", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU0_DENORM], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU0_DENORM] }, [ POWER5_PME_PM_FPU1_FMOV_FEST ] = { .pme_name = "PM_FPU1_FMOV_FEST", .pme_code = 0x10c4, .pme_short_desc = "FPU1 executed FMOV or FEST instructions", .pme_long_desc = "FPU1 has executed a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU1_FMOV_FEST], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU1_FMOV_FEST] }, [ POWER5_PME_PM_INST_FETCH_CYC ] = { .pme_name = "PM_INST_FETCH_CYC", .pme_code = 0x220e4, .pme_short_desc = "Cycles at least 1 instruction fetched", .pme_long_desc = "Cycles when at least one instruction was sent from the fetch unit to the decode unit.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_FETCH_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_FETCH_CYC] }, [ POWER5_PME_PM_LSU_LDF ] = { .pme_name = "PM_LSU_LDF", .pme_code = 0x4c5090, .pme_short_desc = "LSU executed Floating Point load instruction", .pme_long_desc = "LSU executed Floating Point load instruction. 
Combined Unit 0 + 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_LDF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_LDF] }, [ POWER5_PME_PM_INST_DISP ] = { .pme_name = "PM_INST_DISP", .pme_code = 0x300009, .pme_short_desc = "Instructions dispatched", .pme_long_desc = "Number of PowerPC instructions successfully dispatched.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_DISP], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_DISP] }, [ POWER5_PME_PM_DATA_FROM_L25_SHR ] = { .pme_name = "PM_DATA_FROM_L25_SHR", .pme_code = 0x1c3097, .pme_short_desc = "Data loaded from L2.5 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T or SL) data from the L2 of a chip on the same module as this processor is located due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DATA_FROM_L25_SHR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DATA_FROM_L25_SHR] }, [ POWER5_PME_PM_L1_DCACHE_RELOAD_VALID ] = { .pme_name = "PM_L1_DCACHE_RELOAD_VALID", .pme_code = 0xc30e4, .pme_short_desc = "L1 reload data source valid", .pme_long_desc = "The data source information is valid, the data cache has been reloaded. Prior to POWER5+ this included data cache reloads due to prefetch activity. With POWER5+ this now only includes reloads due to demand loads.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L1_DCACHE_RELOAD_VALID], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L1_DCACHE_RELOAD_VALID] }, [ POWER5_PME_PM_MEM_WQ_DISP_DCLAIM ] = { .pme_name = "PM_MEM_WQ_DISP_DCLAIM", .pme_code = 0x713c6, .pme_short_desc = "Memory write queue dispatched due to dclaim/flush", .pme_long_desc = "A memory dclaim or flush operation was dispatched to a write queue. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_WQ_DISP_DCLAIM], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_WQ_DISP_DCLAIM] }, [ POWER5_PME_PM_FPU_FULL_CYC ] = { .pme_name = "PM_FPU_FULL_CYC", .pme_code = 0x110090, .pme_short_desc = "Cycles FPU issue queue full", .pme_long_desc = "Cycles when one or both FPU issue queues are full. Combined Unit 0 + 1. Use with caution since this is the sum of cycles when Unit 0 was full plus Unit 1 full. It does not indicate when both units were full.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU_FULL_CYC] }, [ POWER5_PME_PM_MRK_GRP_ISSUED ] = { .pme_name = "PM_MRK_GRP_ISSUED", .pme_code = 0x100015, .pme_short_desc = "Marked group issued", .pme_long_desc = "A sampled instruction was issued.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_GRP_ISSUED], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_GRP_ISSUED] }, [ POWER5_PME_PM_THRD_PRIO_3_CYC ] = { .pme_name = "PM_THRD_PRIO_3_CYC", .pme_code = 0x420e2, .pme_short_desc = "Cycles thread running at priority level 3", .pme_long_desc = "Cycles this thread was running at priority level 3.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_PRIO_3_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_PRIO_3_CYC] }, [ POWER5_PME_PM_FPU_FMA ] = { .pme_name = "PM_FPU_FMA", .pme_code = 0x200088, .pme_short_desc = "FPU executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when FPU is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
Combined Unit 0 + Unit 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU_FMA], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU_FMA] }, [ POWER5_PME_PM_INST_FROM_L35_MOD ] = { .pme_name = "PM_INST_FROM_L35_MOD", .pme_code = 0x22209d, .pme_short_desc = "Instruction fetched from L3.5 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from the L3 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_FROM_L35_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_FROM_L35_MOD] }, [ POWER5_PME_PM_MRK_CRU_FIN ] = { .pme_name = "PM_MRK_CRU_FIN", .pme_code = 0x400005, .pme_short_desc = "Marked instruction CRU processing finished", .pme_long_desc = "The Condition Register Unit finished a marked instruction. Instructions that finish may not necessarily complete.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_CRU_FIN], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_CRU_FIN] }, [ POWER5_PME_PM_SNOOP_WR_RETRY_WQ ] = { .pme_name = "PM_SNOOP_WR_RETRY_WQ", .pme_code = 0x716c6, .pme_short_desc = "Snoop write/dclaim retry due to collision with active write queue", .pme_long_desc = "A snoop request for a write or dclaim to memory was retried because it matched the cache line of an active write. The snoop request is retried and the active write is changed to high priority. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_SNOOP_WR_RETRY_WQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_SNOOP_WR_RETRY_WQ] }, [ POWER5_PME_PM_CMPLU_STALL_REJECT ] = { .pme_name = "PM_CMPLU_STALL_REJECT", .pme_code = 0x41109a, .pme_short_desc = "Completion stall caused by reject", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered a load/store reject. This is a subset of PM_CMPLU_STALL_LSU.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_CMPLU_STALL_REJECT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_CMPLU_STALL_REJECT] }, [ POWER5_PME_PM_LSU1_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU1_REJECT_ERAT_MISS", .pme_code = 0xc60e7, .pme_short_desc = "LSU1 reject due to ERAT miss", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions due to an ERAT miss. Requests that miss the Derat are rejected and retried until the request hits in the Erat.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU1_REJECT_ERAT_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU1_REJECT_ERAT_MISS] }, [ POWER5_PME_PM_MRK_FXU_FIN ] = { .pme_name = "PM_MRK_FXU_FIN", .pme_code = 0x200014, .pme_short_desc = "Marked instruction FXU processing finished", .pme_long_desc = "One of the Fixed Point Units finished a marked instruction. Instructions that finish may not necessarily complete.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_FXU_FIN], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_FXU_FIN] }, [ POWER5_PME_PM_L2SB_RCST_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SB_RCST_DISP_FAIL_OTHER", .pme_code = 0x732e1, .pme_short_desc = "L2 slice B RC store dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. 
Rejected dispatches do not count because they have not yet been attempted.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SB_RCST_DISP_FAIL_OTHER], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SB_RCST_DISP_FAIL_OTHER] }, [ POWER5_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY ] = { .pme_name = "PM_L2SC_RC_DISP_FAIL_CO_BUSY", .pme_code = 0x703c2, .pme_short_desc = "L2 slice C RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy", .pme_long_desc = "A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss (re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY] }, [ POWER5_PME_PM_PMC4_OVERFLOW ] = { .pme_name = "PM_PMC4_OVERFLOW", .pme_code = 0x10000a, .pme_short_desc = "PMC4 Overflow", .pme_long_desc = "Overflows from PMC4 are counted. This effectively widens the PMC. 
The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PMC4_OVERFLOW], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PMC4_OVERFLOW] }, [ POWER5_PME_PM_L3SA_SNOOP_RETRY ] = { .pme_name = "PM_L3SA_SNOOP_RETRY", .pme_code = 0x731e3, .pme_short_desc = "L3 slice A snoop retries", .pme_long_desc = "Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b)", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SA_SNOOP_RETRY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SA_SNOOP_RETRY] }, [ POWER5_PME_PM_PTEG_FROM_L35_MOD ] = { .pme_name = "PM_PTEG_FROM_L35_MOD", .pme_code = 0x28309e, .pme_short_desc = "PTEG loaded from L3.5 modified", .pme_long_desc = "A Page Table Entry was loaded into the TLB with modified (M) data from the L3 of a chip on the same module as this processor is located, due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PTEG_FROM_L35_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PTEG_FROM_L35_MOD] }, [ POWER5_PME_PM_INST_FROM_L25_MOD ] = { .pme_name = "PM_INST_FROM_L25_MOD", .pme_code = 0x222096, .pme_short_desc = "Instruction fetched from L2.5 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from the L2 of a chip on the same module as this processor is located. 
Fetch groups can contain up to 8 instructions.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_FROM_L25_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_FROM_L25_MOD] }, [ POWER5_PME_PM_THRD_SMT_HANG ] = { .pme_name = "PM_THRD_SMT_HANG", .pme_code = 0x330e7, .pme_short_desc = "SMT hang detected", .pme_long_desc = "A hung thread was detected", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_SMT_HANG], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_SMT_HANG] }, [ POWER5_PME_PM_CMPLU_STALL_ERAT_MISS ] = { .pme_name = "PM_CMPLU_STALL_ERAT_MISS", .pme_code = 0x41109b, .pme_short_desc = "Completion stall caused by ERAT miss", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered an ERAT miss. This is a subset of PM_CMPLU_STALL_REJECT.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_CMPLU_STALL_ERAT_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_CMPLU_STALL_ERAT_MISS] }, [ POWER5_PME_PM_L3SA_MOD_TAG ] = { .pme_name = "PM_L3SA_MOD_TAG", .pme_code = 0x720e3, .pme_short_desc = "L3 slice A transition from modified to TAG", .pme_long_desc = "L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. L3 going M->T or M->I(go_Mu case) Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SA_MOD_TAG], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SA_MOD_TAG] }, [ POWER5_PME_PM_FLUSH_SYNC ] = { .pme_name = "PM_FLUSH_SYNC", .pme_code = 0x330e1, .pme_short_desc = "Flush caused by sync", .pme_long_desc = "This thread has been flushed at dispatch due to a sync, lwsync, ptesync, or tlbsync instruction. 
This allows the other thread to have more machine resources for it to make progress until the sync finishes.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FLUSH_SYNC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FLUSH_SYNC] }, [ POWER5_PME_PM_INST_FROM_L2MISS ] = { .pme_name = "PM_INST_FROM_L2MISS", .pme_code = 0x12209b, .pme_short_desc = "Instruction fetched missed L2", .pme_long_desc = "An instruction fetch group was fetched from beyond the local L2.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_FROM_L2MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_FROM_L2MISS] }, [ POWER5_PME_PM_L2SC_ST_HIT ] = { .pme_name = "PM_L2SC_ST_HIT", .pme_code = 0x733e2, .pme_short_desc = "L2 slice C store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SC_ST_HIT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SC_ST_HIT] }, [ POWER5_PME_PM_MEM_RQ_DISP_Q8to11 ] = { .pme_name = "PM_MEM_RQ_DISP_Q8to11", .pme_code = 0x722e6, .pme_short_desc = "Memory read queue dispatched to queues 8-11", .pme_long_desc = "A memory operation was dispatched to read queue 8,9,10 or 11. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_RQ_DISP_Q8to11], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_RQ_DISP_Q8to11] }, [ POWER5_PME_PM_MRK_GRP_DISP ] = { .pme_name = "PM_MRK_GRP_DISP", .pme_code = 0x100002, .pme_short_desc = "Marked group dispatched", .pme_long_desc = "A group containing a sampled instruction was dispatched", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_GRP_DISP], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_GRP_DISP] }, [ POWER5_PME_PM_L2SB_MOD_TAG ] = { .pme_name = "PM_L2SB_MOD_TAG", .pme_code = 0x720e1, .pme_short_desc = "L2 slice B transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SB_MOD_TAG], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SB_MOD_TAG] }, [ POWER5_PME_PM_CLB_EMPTY_CYC ] = { .pme_name = "PM_CLB_EMPTY_CYC", .pme_code = 0x410c6, .pme_short_desc = "Cycles CLB empty", .pme_long_desc = "Cycles when both threads' CLBs are completely empty.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_CLB_EMPTY_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_CLB_EMPTY_CYC] }, [ POWER5_PME_PM_L2SB_ST_HIT ] = { .pme_name = "PM_L2SB_ST_HIT", .pme_code = 0x733e1, .pme_short_desc = "L2 slice B store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. 
This event is provided on each of the three L2 slices A, B and C.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SB_ST_HIT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SB_ST_HIT] }, [ POWER5_PME_PM_MEM_NONSPEC_RD_CANCEL ] = { .pme_name = "PM_MEM_NONSPEC_RD_CANCEL", .pme_code = 0x711c6, .pme_short_desc = "Non speculative memory read cancelled", .pme_long_desc = "A non-speculative read was cancelled because the combined response indicated it was sourced from another L2 or L3. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_NONSPEC_RD_CANCEL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_NONSPEC_RD_CANCEL] }, [ POWER5_PME_PM_BR_PRED_CR_TA ] = { .pme_name = "PM_BR_PRED_CR_TA", .pme_code = 0x423087, .pme_short_desc = "A conditional branch was predicted", .pme_long_desc = " CR and target prediction", .pme_event_ids = power5_event_ids[POWER5_PME_PM_BR_PRED_CR_TA], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_BR_PRED_CR_TA] }, [ POWER5_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_SRQ", .pme_code = 0x810c3, .pme_short_desc = "LSU0 marked SRQ lhs flushes", .pme_long_desc = "A marked store was flushed because a younger load hit an older store that is already in the SRQ or in the same group.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LSU0_FLUSH_SRQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LSU0_FLUSH_SRQ] }, [ POWER5_PME_PM_MRK_LSU_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU_FLUSH_ULD", .pme_code = 0x481090, .pme_short_desc = "Marked unaligned load flushes", .pme_long_desc = "A marked load was flushed because it was unaligned (crossed a 64-byte boundary, or a 32-byte boundary if it missed the L1)", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LSU_FLUSH_ULD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LSU_FLUSH_ULD] }, [ POWER5_PME_PM_INST_DISP_ATTEMPT ] = { .pme_name = "PM_INST_DISP_ATTEMPT", .pme_code 
= 0x120e1, .pme_short_desc = "Instructions dispatch attempted", .pme_long_desc = "Number of PowerPC Instructions dispatched (attempted, not filtered by success).", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_DISP_ATTEMPT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_DISP_ATTEMPT] }, [ POWER5_PME_PM_INST_FROM_RMEM ] = { .pme_name = "PM_INST_FROM_RMEM", .pme_code = 0x422086, .pme_short_desc = "Instruction fetched from remote memory", .pme_long_desc = "An instruction fetch group was fetched from memory attached to a different module than this processor is located on. Fetch groups can contain up to 8 instructions.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_FROM_RMEM], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_FROM_RMEM] }, [ POWER5_PME_PM_ST_REF_L1_LSU0 ] = { .pme_name = "PM_ST_REF_L1_LSU0", .pme_code = 0xc10c1, .pme_short_desc = "LSU0 L1 D cache store references", .pme_long_desc = "Store references to the Data Cache by LSU0.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_ST_REF_L1_LSU0], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_ST_REF_L1_LSU0] }, [ POWER5_PME_PM_LSU0_DERAT_MISS ] = { .pme_name = "PM_LSU0_DERAT_MISS", .pme_code = 0x800c2, .pme_short_desc = "LSU0 DERAT misses", .pme_long_desc = "Total D-ERAT Misses by LSU0. Requests that miss the DERAT are rejected and retried until the request hits in the ERAT. 
This may result in multiple ERAT misses for the same instruction.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU0_DERAT_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU0_DERAT_MISS] }, [ POWER5_PME_PM_L2SB_RCLD_DISP ] = { .pme_name = "PM_L2SB_RCLD_DISP", .pme_code = 0x701c1, .pme_short_desc = "L2 slice B RC load dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Load was attempted", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SB_RCLD_DISP], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SB_RCLD_DISP] }, [ POWER5_PME_PM_FPU_STALL3 ] = { .pme_name = "PM_FPU_STALL3", .pme_code = 0x202088, .pme_short_desc = "FPU stalled in pipe3", .pme_long_desc = "FPU has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. Combined Unit 0 + Unit 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU_STALL3], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU_STALL3] }, [ POWER5_PME_PM_BR_PRED_CR ] = { .pme_name = "PM_BR_PRED_CR", .pme_code = 0x230e2, .pme_short_desc = "A conditional branch was predicted", .pme_long_desc = " CR prediction", .pme_event_ids = power5_event_ids[POWER5_PME_PM_BR_PRED_CR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_BR_PRED_CR] }, [ POWER5_PME_PM_MRK_DATA_FROM_L2 ] = { .pme_name = "PM_MRK_DATA_FROM_L2", .pme_code = 0x1c7087, .pme_short_desc = "Marked data loaded from L2", .pme_long_desc = "The processor's Data Cache was reloaded from the local L2 due to a marked load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L2], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L2] }, [ POWER5_PME_PM_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_LSU0_FLUSH_SRQ", .pme_code = 0xc00c3, .pme_short_desc = "LSU0 SRQ lhs flushes", .pme_long_desc = "A store was flushed by unit 0 because a younger load hit an older store that is already in the 
SRQ or in the same group.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU0_FLUSH_SRQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU0_FLUSH_SRQ] }, [ POWER5_PME_PM_FAB_PNtoNN_DIRECT ] = { .pme_name = "PM_FAB_PNtoNN_DIRECT", .pme_code = 0x703c7, .pme_short_desc = "PN to NN beat went straight to its destination", .pme_long_desc = "Fabric Data beats that the base chip takes the inbound PN data and passes it through to the outbound NN bus without going into a sidecar. The signal is delivered at FBC speed and the count must be scaled.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FAB_PNtoNN_DIRECT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FAB_PNtoNN_DIRECT] }, [ POWER5_PME_PM_IOPS_CMPL ] = { .pme_name = "PM_IOPS_CMPL", .pme_code = 0x1, .pme_short_desc = "Internal operations completed", .pme_long_desc = "Number of internal operations that completed.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_IOPS_CMPL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_IOPS_CMPL] }, [ POWER5_PME_PM_L2SC_SHR_INV ] = { .pme_name = "PM_L2SC_SHR_INV", .pme_code = 0x710c2, .pme_short_desc = "L2 slice C transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful the tablewalk duration event should also be counted.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SC_SHR_INV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SC_SHR_INV] }, [ POWER5_PME_PM_L2SA_RCST_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SA_RCST_DISP_FAIL_OTHER", .pme_code = 0x732e0, .pme_short_desc = "L2 slice A RC store dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. 
Rejected dispatches do not count because they have not yet been attempted.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SA_RCST_DISP_FAIL_OTHER], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SA_RCST_DISP_FAIL_OTHER] }, [ POWER5_PME_PM_L2SA_RCST_DISP ] = { .pme_name = "PM_L2SA_RCST_DISP", .pme_code = 0x702c0, .pme_short_desc = "L2 slice A RC store dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Store was attempted.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SA_RCST_DISP], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SA_RCST_DISP] }, [ POWER5_PME_PM_SNOOP_RETRY_AB_COLLISION ] = { .pme_name = "PM_SNOOP_RETRY_AB_COLLISION", .pme_code = 0x735e6, .pme_short_desc = "Snoop retry due to a b collision", .pme_long_desc = "Snoop retry due to a b collision", .pme_event_ids = power5_event_ids[POWER5_PME_PM_SNOOP_RETRY_AB_COLLISION], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_SNOOP_RETRY_AB_COLLISION] }, [ POWER5_PME_PM_FAB_PNtoVN_SIDECAR ] = { .pme_name = "PM_FAB_PNtoVN_SIDECAR", .pme_code = 0x733e7, .pme_short_desc = "PN to VN beat went to sidecar first", .pme_long_desc = "Fabric data beats that the base chip takes the inbound PN data and forwards it on to the outbound VN data bus after going into a sidecar first. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FAB_PNtoVN_SIDECAR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FAB_PNtoVN_SIDECAR] }, [ POWER5_PME_PM_LSU_LMQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LMQ_S0_ALLOC", .pme_code = 0xc30e6, .pme_short_desc = "LMQ slot 0 allocated", .pme_long_desc = "The first entry in the LMQ was allocated.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_LMQ_S0_ALLOC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_LMQ_S0_ALLOC] }, [ POWER5_PME_PM_LSU0_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU0_REJECT_LMQ_FULL", .pme_code = 0xc60e1, .pme_short_desc = "LSU0 reject due to LMQ full or missed data coming", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all eight entries are full, subsequent load instructions are rejected.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU0_REJECT_LMQ_FULL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU0_REJECT_LMQ_FULL] }, [ POWER5_PME_PM_SNOOP_PW_RETRY_RQ ] = { .pme_name = "PM_SNOOP_PW_RETRY_RQ", .pme_code = 0x707c6, .pme_short_desc = "Snoop partial-write retry due to collision with active read queue", .pme_long_desc = "A snoop request for a partial write to memory was retried because it matched the cache line of an active read. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_SNOOP_PW_RETRY_RQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_SNOOP_PW_RETRY_RQ] }, [ POWER5_PME_PM_DTLB_REF ] = { .pme_name = "PM_DTLB_REF", .pme_code = 0x2c4090, .pme_short_desc = "Data TLB references", .pme_long_desc = "Total number of Data TLB references for all page sizes. 
Page size is determined at TLB reload time.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DTLB_REF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DTLB_REF] }, [ POWER5_PME_PM_PTEG_FROM_L3 ] = { .pme_name = "PM_PTEG_FROM_L3", .pme_code = 0x18308e, .pme_short_desc = "PTEG loaded from L3", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local L3 due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PTEG_FROM_L3], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PTEG_FROM_L3] }, [ POWER5_PME_PM_FAB_M1toVNorNN_SIDECAR_EMPTY ] = { .pme_name = "PM_FAB_M1toVNorNN_SIDECAR_EMPTY", .pme_code = 0x712c7, .pme_short_desc = "M1 to VN/NN sidecar empty", .pme_long_desc = "Fabric cycles when the Minus-1 jump sidecar (sidecars for mcm to mcm data transfer) is empty. The signal is delivered at FBC speed and the count must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FAB_M1toVNorNN_SIDECAR_EMPTY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FAB_M1toVNorNN_SIDECAR_EMPTY] }, [ POWER5_PME_PM_LSU_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_SRQ_EMPTY_CYC", .pme_code = 0x400015, .pme_short_desc = "Cycles SRQ empty", .pme_long_desc = "Cycles the Store Request Queue is empty", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_SRQ_EMPTY_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_SRQ_EMPTY_CYC] }, [ POWER5_PME_PM_FPU1_STF ] = { .pme_name = "PM_FPU1_STF", .pme_code = 0x20e6, .pme_short_desc = "FPU1 executed store instruction", .pme_long_desc = "FPU1 has executed a Floating Point Store instruction.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU1_STF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU1_STF] }, [ POWER5_PME_PM_LSU_LMQ_S0_VALID ] = { .pme_name = "PM_LSU_LMQ_S0_VALID", .pme_code = 0xc30e5, .pme_short_desc = "LMQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle when the first entry in the LMQ is valid. 
The LMQ has eight entries that are allocated FIFO.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_LMQ_S0_VALID], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_LMQ_S0_VALID] }, [ POWER5_PME_PM_GCT_USAGE_00to59_CYC ] = { .pme_name = "PM_GCT_USAGE_00to59_CYC", .pme_code = 0x10001f, .pme_short_desc = "Cycles GCT less than 60% full", .pme_long_desc = "Cycles when the Global Completion Table has fewer than 60% of its slots used. The GCT has 20 entries shared between threads.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GCT_USAGE_00to59_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GCT_USAGE_00to59_CYC] }, [ POWER5_PME_PM_DATA_FROM_L2MISS ] = { .pme_name = "PM_DATA_FROM_L2MISS", .pme_code = 0x3c309b, .pme_short_desc = "Data loaded missed L2", .pme_long_desc = "The processor's Data Cache was reloaded but not from the local L2.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DATA_FROM_L2MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DATA_FROM_L2MISS] }, [ POWER5_PME_PM_GRP_DISP_BLK_SB_CYC ] = { .pme_name = "PM_GRP_DISP_BLK_SB_CYC", .pme_code = 0x130e1, .pme_short_desc = "Cycles group dispatch blocked by scoreboard", .pme_long_desc = "A scoreboard operation on a non-renamed resource has blocked dispatch.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GRP_DISP_BLK_SB_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GRP_DISP_BLK_SB_CYC] }, [ POWER5_PME_PM_FPU_FMOV_FEST ] = { .pme_name = "PM_FPU_FMOV_FEST", .pme_code = 0x301088, .pme_short_desc = "FPU executed FMOV or FEST instructions", .pme_long_desc = "The floating point unit has executed a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ.. 
Combined Unit 0 + Unit 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU_FMOV_FEST], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU_FMOV_FEST] }, [ POWER5_PME_PM_XER_MAP_FULL_CYC ] = { .pme_name = "PM_XER_MAP_FULL_CYC", .pme_code = 0x100c2, .pme_short_desc = "Cycles XER mapper full", .pme_long_desc = "The XER mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_XER_MAP_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_XER_MAP_FULL_CYC] }, [ POWER5_PME_PM_FLUSH_SB ] = { .pme_name = "PM_FLUSH_SB", .pme_code = 0x330e2, .pme_short_desc = "Flush caused by scoreboard operation", .pme_long_desc = "This thread has been flushed at dispatch because its scoreboard bit is set indicating that a non-renamed resource is being updated. This allows the other thread to have more machine resources for it to make progress while this thread is stalled.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FLUSH_SB], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FLUSH_SB] }, [ POWER5_PME_PM_MRK_DATA_FROM_L375_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L375_SHR", .pme_code = 0x3c709e, .pme_short_desc = "Marked data loaded from L3.75 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on a different module than this processor is located due to a marked load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L375_SHR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L375_SHR] }, [ POWER5_PME_PM_MRK_GRP_CMPL ] = { .pme_name = "PM_MRK_GRP_CMPL", .pme_code = 0x400013, .pme_short_desc = "Marked group completed", .pme_long_desc = "A group containing a sampled instruction completed. 
Microcoded instructions that span multiple groups will generate this event once per group.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_GRP_CMPL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_GRP_CMPL] }, [ POWER5_PME_PM_SUSPENDED ] = { .pme_name = "PM_SUSPENDED", .pme_code = 0x0, .pme_short_desc = "Suspended", .pme_long_desc = "The counter is suspended (does not count).", .pme_event_ids = power5_event_ids[POWER5_PME_PM_SUSPENDED], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_SUSPENDED] }, [ POWER5_PME_PM_GRP_IC_MISS_BR_REDIR_NONSPEC ] = { .pme_name = "PM_GRP_IC_MISS_BR_REDIR_NONSPEC", .pme_code = 0x120e5, .pme_short_desc = "Group experienced non-speculative I cache miss or branch redirect", .pme_long_desc = "Group experienced non-speculative I cache miss or branch redirect", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GRP_IC_MISS_BR_REDIR_NONSPEC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GRP_IC_MISS_BR_REDIR_NONSPEC] }, [ POWER5_PME_PM_SNOOP_RD_RETRY_QFULL ] = { .pme_name = "PM_SNOOP_RD_RETRY_QFULL", .pme_code = 0x700c6, .pme_short_desc = "Snoop read retry due to read queue full", .pme_long_desc = "A snoop request for a read from memory was retried because the read queues were full. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_SNOOP_RD_RETRY_QFULL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_SNOOP_RD_RETRY_QFULL] }, [ POWER5_PME_PM_L3SB_MOD_INV ] = { .pme_name = "PM_L3SB_MOD_INV", .pme_code = 0x730e4, .pme_short_desc = "L3 slice B transition from modified to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I). Mu|Me are not included since they are formed due to a prev read op. 
Tx is not included since it is considered shared at this point.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SB_MOD_INV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SB_MOD_INV] }, [ POWER5_PME_PM_DATA_FROM_L35_SHR ] = { .pme_name = "PM_DATA_FROM_L35_SHR", .pme_code = 0x1c309e, .pme_short_desc = "Data loaded from L3.5 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on the same module as this processor is located due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DATA_FROM_L35_SHR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DATA_FROM_L35_SHR] }, [ POWER5_PME_PM_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_LD_MISS_L1_LSU1", .pme_code = 0xc10c6, .pme_short_desc = "LSU1 L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache, by unit 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LD_MISS_L1_LSU1], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LD_MISS_L1_LSU1] }, [ POWER5_PME_PM_STCX_FAIL ] = { .pme_name = "PM_STCX_FAIL", .pme_code = 0x820e1, .pme_short_desc = "STCX failed", .pme_long_desc = "A stcx (stwcx or stdcx) failed", .pme_event_ids = power5_event_ids[POWER5_PME_PM_STCX_FAIL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_STCX_FAIL] }, [ POWER5_PME_PM_DC_PREF_DST ] = { .pme_name = "PM_DC_PREF_DST", .pme_code = 0x830e6, .pme_short_desc = "DST (Data Stream Touch) stream start", .pme_long_desc = "A prefetch stream was started using the DST instruction.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DC_PREF_DST], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DC_PREF_DST] }, [ POWER5_PME_PM_GRP_DISP ] = { .pme_name = "PM_GRP_DISP", .pme_code = 0x200002, .pme_short_desc = "Group dispatches", .pme_long_desc = "A group was dispatched", .pme_event_ids = power5_event_ids[POWER5_PME_PM_GRP_DISP], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_GRP_DISP] }, [ POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_ADDR ] = { 
.pme_name = "PM_L2SA_RCLD_DISP_FAIL_ADDR", .pme_code = 0x711c0, .pme_short_desc = "L2 slice A RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_ADDR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_ADDR] }, [ POWER5_PME_PM_FPU0_FPSCR ] = { .pme_name = "PM_FPU0_FPSCR", .pme_code = 0x30e0, .pme_short_desc = "FPU0 executed FPSCR instruction", .pme_long_desc = "FPU0 has executed an FPSCR move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*, mffs*, mtfsf*, mcrfs* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU0_FPSCR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU0_FPSCR] }, [ POWER5_PME_PM_DATA_FROM_L2 ] = { .pme_name = "PM_DATA_FROM_L2", .pme_code = 0x1c3087, .pme_short_desc = "Data loaded from L2", .pme_long_desc = "The processor's Data Cache was reloaded from the local L2 due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DATA_FROM_L2], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DATA_FROM_L2] }, [ POWER5_PME_PM_FPU1_DENORM ] = { .pme_name = "PM_FPU1_DENORM", .pme_code = 0x20e4, .pme_short_desc = "FPU1 received denormalized data", .pme_long_desc = "FPU1 has encountered a denormalized operand.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU1_DENORM], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU1_DENORM] }, [ POWER5_PME_PM_FPU_1FLOP ] = { .pme_name = "PM_FPU_1FLOP", .pme_code = 0x100090, .pme_short_desc = "FPU executed one flop instruction", .pme_long_desc = "The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. 
These are single FLOP operations.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU_1FLOP], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU_1FLOP] }, [ POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SC_RCLD_DISP_FAIL_OTHER", .pme_code = 0x731e2, .pme_short_desc = "L2 slice C RC load dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_OTHER], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_OTHER] }, [ POWER5_PME_PM_L2SC_RCST_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SC_RCST_DISP_FAIL_RC_FULL", .pme_code = 0x722e2, .pme_short_desc = "L2 slice C RC store dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a store failed because all RC machines are busy.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SC_RCST_DISP_FAIL_RC_FULL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SC_RCST_DISP_FAIL_RC_FULL] }, [ POWER5_PME_PM_FPU0_FSQRT ] = { .pme_name = "PM_FPU0_FSQRT", .pme_code = 0xc2, .pme_short_desc = "FPU0 executed FSQRT instruction", .pme_long_desc = "FPU0 has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU0_FSQRT], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU0_FSQRT] }, [ POWER5_PME_PM_LD_REF_L1 ] = { .pme_name = "PM_LD_REF_L1", .pme_code = 0x4c1090, .pme_short_desc = "L1 D cache load references", .pme_long_desc = "Load references to the Level 1 Data Cache. 
Combined unit 0 + 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LD_REF_L1], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LD_REF_L1] }, [ POWER5_PME_PM_INST_FROM_L1 ] = { .pme_name = "PM_INST_FROM_L1", .pme_code = 0x22208d, .pme_short_desc = "Instruction fetched from L1", .pme_long_desc = "An instruction fetch group was fetched from L1. Fetch Groups can contain up to 8 instructions", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_FROM_L1], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_FROM_L1] }, [ POWER5_PME_PM_TLBIE_HELD ] = { .pme_name = "PM_TLBIE_HELD", .pme_code = 0x130e4, .pme_short_desc = "TLBIE held at dispatch", .pme_long_desc = "Cycles a TLBIE instruction was held at dispatch.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_TLBIE_HELD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_TLBIE_HELD] }, [ POWER5_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { .pme_name = "PM_DC_PREF_OUT_OF_STREAMS", .pme_code = 0xc50c2, .pme_short_desc = "D cache out of prefetch streams", .pme_long_desc = "A new prefetch stream was detected but no more stream entries were available.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DC_PREF_OUT_OF_STREAMS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DC_PREF_OUT_OF_STREAMS] }, [ POWER5_PME_PM_MRK_DATA_FROM_L25_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L25_MOD_CYC", .pme_code = 0x4c70a2, .pme_short_desc = "Marked load latency from L2.5 modified", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency, divide this count by the number of marked misses to the same level.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L25_MOD_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L25_MOD_CYC] }, [ POWER5_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_SRQ", .pme_code = 0x810c7, .pme_short_desc = "LSU1 marked SRQ lhs flushes", .pme_long_desc = "A marked store was flushed because a younger load hit an older store that is already in the SRQ or in the same group.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LSU1_FLUSH_SRQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LSU1_FLUSH_SRQ] }, [ POWER5_PME_PM_MEM_RQ_DISP_Q0to3 ] = { .pme_name = "PM_MEM_RQ_DISP_Q0to3", .pme_code = 0x702c6, .pme_short_desc = "Memory read queue dispatched to queues 0-3", .pme_long_desc = "A memory operation was dispatched to read queue 0, 1, 2, or 3. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MEM_RQ_DISP_Q0to3], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MEM_RQ_DISP_Q0to3] }, [ POWER5_PME_PM_ST_REF_L1_LSU1 ] = { .pme_name = "PM_ST_REF_L1_LSU1", .pme_code = 0xc10c5, .pme_short_desc = "LSU1 L1 D cache store references", .pme_long_desc = "Store references to the Data Cache by LSU1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_ST_REF_L1_LSU1], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_ST_REF_L1_LSU1] }, [ POWER5_PME_PM_MRK_LD_MISS_L1 ] = { .pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 0x182088, .pme_short_desc = "Marked L1 D cache load misses", .pme_long_desc = "Marked L1 D cache load misses", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LD_MISS_L1], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LD_MISS_L1] }, [ POWER5_PME_PM_L1_WRITE_CYC ] = { .pme_name = "PM_L1_WRITE_CYC", .pme_code = 0x230e7, .pme_short_desc = "Cycles writing to instruction L1", .pme_long_desc = 
"Cycles that a cache line was written to the instruction cache.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L1_WRITE_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L1_WRITE_CYC] }, [ POWER5_PME_PM_L2SC_ST_REQ ] = { .pme_name = "PM_L2SC_ST_REQ", .pme_code = 0x723e2, .pme_short_desc = "L2 slice C store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SC_ST_REQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SC_ST_REQ] }, [ POWER5_PME_PM_CMPLU_STALL_FDIV ] = { .pme_name = "PM_CMPLU_STALL_FDIV", .pme_code = 0x21109b, .pme_short_desc = "Completion stall caused by FDIV or FQRT instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a floating point divide or square root instruction. This is a subset of PM_CMPLU_STALL_FPU.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_CMPLU_STALL_FDIV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_CMPLU_STALL_FDIV] }, [ POWER5_PME_PM_THRD_SEL_OVER_CLB_EMPTY ] = { .pme_name = "PM_THRD_SEL_OVER_CLB_EMPTY", .pme_code = 0x410c2, .pme_short_desc = "Thread selection overrides caused by CLB empty", .pme_long_desc = "Thread selection was overridden because one thread's CLB was empty.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_SEL_OVER_CLB_EMPTY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_SEL_OVER_CLB_EMPTY] }, [ POWER5_PME_PM_BR_MPRED_CR ] = { .pme_name = "PM_BR_MPRED_CR", .pme_code = 0x230e5, .pme_short_desc = "Branch mispredictions due to CR bit setting", .pme_long_desc = "A conditional branch instruction was incorrectly predicted as taken or not taken. 
The branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This will result in a branch redirect flush if not overridden by a flush of an older instruction.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_BR_MPRED_CR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_BR_MPRED_CR] }, [ POWER5_PME_PM_L3SB_MOD_TAG ] = { .pme_name = "PM_L3SB_MOD_TAG", .pme_code = 0x720e4, .pme_short_desc = "L3 slice B transition from modified to TAG", .pme_long_desc = "L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. L3 going M->T or M->I(go_Mu case); Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SB_MOD_TAG], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SB_MOD_TAG] }, [ POWER5_PME_PM_MRK_DATA_FROM_L2MISS ] = { .pme_name = "PM_MRK_DATA_FROM_L2MISS", .pme_code = 0x3c709b, .pme_short_desc = "Marked data loaded missed L2", .pme_long_desc = "DL1 was reloaded from beyond L2 due to a marked demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_DATA_FROM_L2MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_DATA_FROM_L2MISS] }, [ POWER5_PME_PM_LSU_REJECT_SRQ ] = { .pme_name = "PM_LSU_REJECT_SRQ", .pme_code = 0x1c6088, .pme_short_desc = "LSU SRQ lhs rejects", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions because of Load Hit Store conditions. Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue. 
Combined Unit 0 + 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_REJECT_SRQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_REJECT_SRQ] }, [ POWER5_PME_PM_LD_MISS_L1 ] = { .pme_name = "PM_LD_MISS_L1", .pme_code = 0x3c1088, .pme_short_desc = "L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache. Combined unit 0 + 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LD_MISS_L1], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LD_MISS_L1] }, [ POWER5_PME_PM_INST_FROM_PREF ] = { .pme_name = "PM_INST_FROM_PREF", .pme_code = 0x32208d, .pme_short_desc = "Instruction fetched from prefetch", .pme_long_desc = "An instruction fetch group was fetched from the prefetch buffer. Fetch groups can contain up to 8 instructions", .pme_event_ids = power5_event_ids[POWER5_PME_PM_INST_FROM_PREF], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_INST_FROM_PREF] }, [ POWER5_PME_PM_DC_INV_L2 ] = { .pme_name = "PM_DC_INV_L2", .pme_code = 0xc10c7, .pme_short_desc = "L1 D cache entries invalidated from L2", .pme_long_desc = "A dcache invalidate was received from the L2 because a line in L2 was castout.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DC_INV_L2], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DC_INV_L2] }, [ POWER5_PME_PM_STCX_PASS ] = { .pme_name = "PM_STCX_PASS", .pme_code = 0x820e5, .pme_short_desc = "Stcx passes", .pme_long_desc = "A stcx (stwcx or stdcx) instruction was successful", .pme_event_ids = power5_event_ids[POWER5_PME_PM_STCX_PASS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_STCX_PASS] }, [ POWER5_PME_PM_LSU_SRQ_FULL_CYC ] = { .pme_name = "PM_LSU_SRQ_FULL_CYC", .pme_code = 0x110c3, .pme_short_desc = "Cycles SRQ full", .pme_long_desc = "Cycles the Store Request Queue is full.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_SRQ_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_SRQ_FULL_CYC] }, [ POWER5_PME_PM_FPU_FIN ] = { .pme_name = "PM_FPU_FIN", .pme_code = 
0x401088, .pme_short_desc = "FPU produced a result", .pme_long_desc = "FPU finished, produced a result. This only indicates finish, not completion. Combined Unit 0 + Unit 1. Floating Point Stores are included in this count but not Floating Point Loads.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU_FIN], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU_FIN] }, [ POWER5_PME_PM_L2SA_SHR_MOD ] = { .pme_name = "PM_L2SA_SHR_MOD", .pme_code = 0x700c0, .pme_short_desc = "L2 slice A transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SA_SHR_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SA_SHR_MOD] }, [ POWER5_PME_PM_LSU_SRQ_STFWD ] = { .pme_name = "PM_LSU_SRQ_STFWD", .pme_code = 0x1c2088, .pme_short_desc = "SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load hits L1 but becomes a store forward, it is not treated as a load miss. Combined Unit 0 + 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU_SRQ_STFWD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU_SRQ_STFWD] }, [ POWER5_PME_PM_0INST_CLB_CYC ] = { .pme_name = "PM_0INST_CLB_CYC", .pme_code = 0x400c0, .pme_short_desc = "Cycles no instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. 
These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_0INST_CLB_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_0INST_CLB_CYC] }, [ POWER5_PME_PM_FXU0_FIN ] = { .pme_name = "PM_FXU0_FIN", .pme_code = 0x130e2, .pme_short_desc = "FXU0 produced a result", .pme_long_desc = "The Fixed Point unit 0 finished an instruction and produced a result. Instructions that finish may not necessarily complete.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FXU0_FIN], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FXU0_FIN] }, [ POWER5_PME_PM_L2SB_RCST_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SB_RCST_DISP_FAIL_RC_FULL", .pme_code = 0x722e1, .pme_short_desc = "L2 slice B RC store dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a store failed because all RC machines are busy.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SB_RCST_DISP_FAIL_RC_FULL], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SB_RCST_DISP_FAIL_RC_FULL] }, [ POWER5_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { .pme_name = "PM_THRD_GRP_CMPL_BOTH_CYC", .pme_code = 0x200013, .pme_short_desc = "Cycles group completed by both threads", .pme_long_desc = "Cycles that both threads completed.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_GRP_CMPL_BOTH_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_GRP_CMPL_BOTH_CYC] }, [ POWER5_PME_PM_PMC5_OVERFLOW ] = { .pme_name = "PM_PMC5_OVERFLOW", .pme_code = 0x10001a, .pme_short_desc = "PMC5 Overflow", .pme_long_desc = "Overflows from PMC5 are counted. This effectively widens the PMC. 
The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PMC5_OVERFLOW], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PMC5_OVERFLOW] }, [ POWER5_PME_PM_FPU0_FDIV ] = { .pme_name = "PM_FPU0_FDIV", .pme_code = 0xc0, .pme_short_desc = "FPU0 executed FDIV instruction", .pme_long_desc = "FPU0 has executed a divide instruction. This could be fdiv, fdivs, fdiv., or fdivs.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_FPU0_FDIV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_FPU0_FDIV] }, [ POWER5_PME_PM_PTEG_FROM_L375_SHR ] = { .pme_name = "PM_PTEG_FROM_L375_SHR", .pme_code = 0x38309e, .pme_short_desc = "PTEG loaded from L3.75 shared", .pme_long_desc = "A Page Table Entry was loaded into the TLB with shared (S) data from the L3 of a chip on a different module than this processor is located, due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_PTEG_FROM_L375_SHR], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_PTEG_FROM_L375_SHR] }, [ POWER5_PME_PM_LD_REF_L1_LSU1 ] = { .pme_name = "PM_LD_REF_L1_LSU1", .pme_code = 0xc10c4, .pme_short_desc = "LSU1 L1 D cache load references", .pme_long_desc = "Load references to Level 1 Data Cache, by unit 1.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LD_REF_L1_LSU1], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LD_REF_L1_LSU1] }, [ POWER5_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY ] = { .pme_name = "PM_L2SA_RC_DISP_FAIL_CO_BUSY", .pme_code = 0x703c0, .pme_short_desc = "L2 slice A RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy", .pme_long_desc = "A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. 
If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss(re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY] }, [ POWER5_PME_PM_HV_CYC ] = { .pme_name = "PM_HV_CYC", .pme_code = 0x20000b, .pme_short_desc = "Hypervisor Cycles", .pme_long_desc = "Cycles when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0)", .pme_event_ids = power5_event_ids[POWER5_PME_PM_HV_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_HV_CYC] }, [ POWER5_PME_PM_THRD_PRIO_DIFF_0_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_0_CYC", .pme_code = 0x430e3, .pme_short_desc = "Cycles no thread priority difference", .pme_long_desc = "Cycles when this thread's priority is equal to the other thread's priority.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_THRD_PRIO_DIFF_0_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_THRD_PRIO_DIFF_0_CYC] }, [ POWER5_PME_PM_LR_CTR_MAP_FULL_CYC ] = { .pme_name = "PM_LR_CTR_MAP_FULL_CYC", .pme_code = 0x100c6, .pme_short_desc = "Cycles LR/CTR mapper full", .pme_long_desc = "The LR/CTR mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. 
This event only indicates that the mapper was full, not that dispatch was prevented.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LR_CTR_MAP_FULL_CYC], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LR_CTR_MAP_FULL_CYC] }, [ POWER5_PME_PM_L3SB_SHR_INV ] = { .pme_name = "PM_L3SB_SHR_INV", .pme_code = 0x710c4, .pme_short_desc = "L3 slice B transition from shared to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched).", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L3SB_SHR_INV], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L3SB_SHR_INV] }, [ POWER5_PME_PM_DATA_FROM_RMEM ] = { .pme_name = "PM_DATA_FROM_RMEM", .pme_code = 0x1c30a1, .pme_short_desc = "Data loaded from remote memory", .pme_long_desc = "The processor's Data Cache was reloaded from memory attached to a different module than this processor is located on.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DATA_FROM_RMEM], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DATA_FROM_RMEM] }, [ POWER5_PME_PM_DATA_FROM_L275_MOD ] = { .pme_name = "PM_DATA_FROM_L275_MOD", .pme_code = 0x1c30a3, .pme_short_desc = "Data loaded from L2.75 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L2 on a different module than this processor is located due to a demand load.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DATA_FROM_L275_MOD], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DATA_FROM_L275_MOD] }, [ POWER5_PME_PM_LSU0_REJECT_SRQ ] = { .pme_name = "PM_LSU0_REJECT_SRQ", .pme_code = 0xc60e0, .pme_short_desc = "LSU0 SRQ lhs rejects", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions because of Load Hit Store conditions. 
Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU0_REJECT_SRQ], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU0_REJECT_SRQ] }, [ POWER5_PME_PM_LSU1_DERAT_MISS ] = { .pme_name = "PM_LSU1_DERAT_MISS", .pme_code = 0x800c6, .pme_short_desc = "LSU1 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU1_DERAT_MISS], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU1_DERAT_MISS] }, [ POWER5_PME_PM_MRK_LSU_FIN ] = { .pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x400014, .pme_short_desc = "Marked instruction LSU processing finished", .pme_long_desc = "One of the Load/Store Units finished a marked instruction. Instructions that finish may not necessarily complete", .pme_event_ids = power5_event_ids[POWER5_PME_PM_MRK_LSU_FIN], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_MRK_LSU_FIN] }, [ POWER5_PME_PM_DTLB_MISS_16M ] = { .pme_name = "PM_DTLB_MISS_16M", .pme_code = 0xc40c4, .pme_short_desc = "Data TLB miss for 16M page", .pme_long_desc = "Data TLB references to 16MB pages that missed the TLB. 
Page size is determined at TLB reload time.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_DTLB_MISS_16M], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_DTLB_MISS_16M] }, [ POWER5_PME_PM_LSU0_FLUSH_UST ] = { .pme_name = "PM_LSU0_FLUSH_UST", .pme_code = 0xc00c1, .pme_short_desc = "LSU0 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 0 because it was unaligned (crossed a 4K boundary).", .pme_event_ids = power5_event_ids[POWER5_PME_PM_LSU0_FLUSH_UST], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_LSU0_FLUSH_UST] }, [ POWER5_PME_PM_L2SC_MOD_TAG ] = { .pme_name = "PM_L2SC_MOD_TAG", .pme_code = 0x720e2, .pme_short_desc = "L2 slice C transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SC_MOD_TAG], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SC_MOD_TAG] }, [ POWER5_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY ] = { .pme_name = "PM_L2SB_RC_DISP_FAIL_CO_BUSY", .pme_code = 0x703c1, .pme_short_desc = "L2 slice B RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy", .pme_long_desc = "A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss(re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. 
Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access.", .pme_event_ids = power5_event_ids[POWER5_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY], .pme_group_vector = power5_group_vecs[POWER5_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY] } }; #define POWER5_PME_EVENT_COUNT 474 static const int power5_group_event_ids[][POWER5_NUM_EVENT_COUNTERS] = { [ 0 ] = { 190, 71, 56, 12, 0, 0 }, [ 1 ] = { 2, 195, 49, 12, 0, 0 }, [ 2 ] = { 66, 65, 50, 60, 0, 0 }, [ 3 ] = { 0, 2, 169, 138, 0, 0 }, [ 4 ] = { 6, 6, 149, 59, 0, 0 }, [ 5 ] = { 60, 59, 46, 51, 0, 0 }, [ 6 ] = { 62, 61, 47, 52, 0, 0 }, [ 7 ] = { 143, 143, 113, 119, 0, 0 }, [ 8 ] = { 147, 147, 119, 123, 0, 0 }, [ 9 ] = { 149, 141, 112, 122, 0, 0 }, [ 10 ] = { 212, 73, 117, 18, 0, 0 }, [ 11 ] = { 73, 9, 61, 58, 0, 0 }, [ 12 ] = { 139, 1, 87, 59, 0, 0 }, [ 13 ] = { 126, 135, 13, 91, 0, 0 }, [ 14 ] = { 145, 144, 25, 159, 0, 0 }, [ 15 ] = { 125, 134, 55, 66, 0, 0 }, [ 16 ] = { 123, 132, 120, 191, 0, 0 }, [ 17 ] = { 124, 133, 55, 1, 0, 0 }, [ 18 ] = { 146, 145, 109, 31, 0, 0 }, [ 19 ] = { 73, 140, 25, 16, 0, 0 }, [ 20 ] = { 81, 71, 27, 33, 0, 0 }, [ 21 ] = { 141, 138, 55, 113, 0, 0 }, [ 22 ] = { 119, 128, 109, 59, 0, 0 }, [ 23 ] = { 120, 129, 55, 113, 0, 0 }, [ 24 ] = { 142, 140, 0, 59, 0, 0 }, [ 25 ] = { 121, 130, 109, 59, 0, 0 }, [ 26 ] = { 122, 131, 55, 113, 0, 0 }, [ 27 ] = { 140, 71, 147, 114, 0, 0 }, [ 28 ] = { 70, 13, 55, 10, 0, 0 }, [ 29 ] = { 73, 10, 6, 8, 0, 0 }, [ 30 ] = { 68, 12, 55, 7, 0, 0 }, [ 31 ] = { 57, 11, 55, 9, 0, 0 }, [ 32 ] = { 115, 7, 116, 116, 0, 0 }, [ 33 ] = { 41, 49, 40, 46, 0, 0 }, [ 34 ] = { 11, 114, 48, 11, 0, 0 }, [ 35 ] = { 35, 204, 188, 59, 0, 0 }, [ 36 ] = { 198, 193, 106, 112, 0, 0 }, [ 37 ] = { 117, 126, 52, 57, 0, 0 }, [ 38 ] = { 72, 69, 54, 0, 0, 0 }, [ 39 ] = { 69, 67, 60, 59, 0, 0 }, [ 40 ] = { 210, 184, 1, 3, 0, 0 }, [ 41 ] = { 9, 8, 3, 5, 0, 0 }, [ 42 ] = { 64, 62, 24, 59, 
0, 0 }, [ 43 ] = { 20, 21, 100, 106, 0, 0 }, [ 44 ] = { 13, 137, 165, 171, 0, 0 }, [ 45 ] = { 21, 78, 101, 105, 0, 0 }, [ 46 ] = { 26, 23, 103, 108, 0, 0 }, [ 47 ] = { 25, 22, 166, 173, 0, 0 }, [ 48 ] = { 16, 18, 26, 59, 0, 0 }, [ 49 ] = { 16, 18, 187, 15, 0, 0 }, [ 50 ] = { 14, 16, 8, 13, 0, 0 }, [ 51 ] = { 17, 17, 10, 14, 0, 0 }, [ 52 ] = { 78, 74, 59, 63, 0, 0 }, [ 53 ] = { 76, 77, 55, 0, 0, 0 }, [ 54 ] = { 77, 75, 57, 61, 0, 0 }, [ 55 ] = { 79, 76, 58, 62, 0, 0 }, [ 56 ] = { 184, 181, 154, 163, 0, 0 }, [ 57 ] = { 187, 182, 156, 164, 0, 0 }, [ 58 ] = { 183, 183, 189, 165, 0, 0 }, [ 59 ] = { 186, 64, 51, 16, 0, 0 }, [ 60 ] = { 83, 82, 64, 69, 0, 0 }, [ 61 ] = { 85, 84, 66, 71, 0, 0 }, [ 62 ] = { 87, 87, 68, 74, 0, 0 }, [ 63 ] = { 91, 90, 72, 77, 0, 0 }, [ 64 ] = { 93, 92, 74, 79, 0, 0 }, [ 65 ] = { 95, 95, 76, 82, 0, 0 }, [ 66 ] = { 99, 98, 80, 85, 0, 0 }, [ 67 ] = { 101, 100, 82, 87, 0, 0 }, [ 68 ] = { 103, 103, 84, 90, 0, 0 }, [ 69 ] = { 107, 71, 89, 94, 0, 0 }, [ 70 ] = { 73, 108, 93, 98, 0, 0 }, [ 71 ] = { 73, 111, 97, 102, 0, 0 }, [ 72 ] = { 82, 86, 63, 73, 0, 0 }, [ 73 ] = { 90, 94, 71, 81, 0, 0 }, [ 74 ] = { 98, 102, 79, 89, 0, 0 }, [ 75 ] = { 106, 107, 91, 99, 0, 0 }, [ 76 ] = { 108, 109, 88, 96, 0, 0 }, [ 77 ] = { 112, 112, 99, 100, 0, 0 }, [ 78 ] = { 55, 54, 38, 43, 0, 0 }, [ 79 ] = { 56, 53, 39, 44, 0, 0 }, [ 80 ] = { 54, 55, 30, 40, 0, 0 }, [ 81 ] = { 58, 56, 55, 115, 0, 0 }, [ 82 ] = { 40, 48, 29, 39, 0, 0 }, [ 83 ] = { 37, 45, 31, 41, 0, 0 }, [ 84 ] = { 38, 46, 33, 42, 0, 0 }, [ 85 ] = { 43, 51, 55, 37, 0, 0 }, [ 86 ] = { 42, 50, 105, 111, 0, 0 }, [ 87 ] = { 39, 47, 55, 42, 0, 0 }, [ 88 ] = { 36, 44, 30, 59, 0, 0 }, [ 89 ] = { 44, 52, 105, 59, 0, 0 }, [ 90 ] = { 59, 57, 42, 49, 0, 0 }, [ 91 ] = { 171, 172, 45, 47, 0, 0 }, [ 92 ] = { 4, 4, 43, 50, 0, 0 }, [ 93 ] = { 206, 203, 171, 178, 0, 0 }, [ 94 ] = { 205, 202, 173, 180, 0, 0 }, [ 95 ] = { 204, 201, 175, 182, 0, 0 }, [ 96 ] = { 203, 68, 177, 59, 0, 0 }, [ 97 ] = { 202, 196, 55, 176, 0, 0 }, [ 98 ] 
= { 196, 71, 182, 189, 0, 0 }, [ 99 ] = { 73, 0, 178, 185, 0, 0 }, [ 100 ] = { 73, 15, 180, 187, 0, 0 }, [ 101 ] = { 27, 27, 17, 23, 0, 0 }, [ 102 ] = { 32, 29, 20, 28, 0, 0 }, [ 103 ] = { 33, 33, 21, 27, 0, 0 }, [ 104 ] = { 31, 28, 15, 24, 0, 0 }, [ 105 ] = { 193, 185, 161, 166, 0, 0 }, [ 106 ] = { 194, 189, 160, 59, 0, 0 }, [ 107 ] = { 197, 150, 162, 127, 0, 0 }, [ 108 ] = { 192, 149, 159, 126, 0, 0 }, [ 109 ] = { 156, 155, 125, 20, 0, 0 }, [ 110 ] = { 155, 148, 126, 21, 0, 0 }, [ 111 ] = { 159, 156, 128, 132, 0, 0 }, [ 112 ] = { 153, 152, 124, 128, 0, 0 }, [ 113 ] = { 171, 173, 185, 158, 0, 0 }, [ 114 ] = { 171, 179, 137, 146, 0, 0 }, [ 115 ] = { 172, 158, 138, 147, 0, 0 }, [ 116 ] = { 160, 162, 129, 135, 0, 0 }, [ 117 ] = { 161, 160, 55, 44, 0, 0 }, [ 118 ] = { 163, 166, 131, 138, 0, 0 }, [ 119 ] = { 166, 161, 130, 143, 0, 0 }, [ 120 ] = { 164, 164, 133, 141, 0, 0 }, [ 121 ] = { 162, 161, 55, 137, 0, 0 }, [ 122 ] = { 165, 165, 132, 140, 0, 0 }, [ 123 ] = { 168, 168, 135, 144, 0, 0 }, [ 124 ] = { 170, 170, 55, 144, 0, 0 }, [ 125 ] = { 175, 71, 150, 134, 0, 0 }, [ 126 ] = { 179, 179, 148, 160, 0, 0 }, [ 127 ] = { 178, 178, 136, 148, 0, 0 }, [ 128 ] = { 13, 74, 165, 106, 0, 0 }, [ 129 ] = { 16, 18, 165, 106, 0, 0 }, [ 130 ] = { 81, 21, 165, 106, 0, 0 }, [ 131 ] = { 16, 18, 100, 171, 0, 0 }, [ 132 ] = { 12, 69, 61, 91, 0, 0 }, [ 133 ] = { 9, 8, 3, 1, 0, 0 }, [ 134 ] = { 43, 51, 30, 37, 0, 0 }, [ 135 ] = { 39, 47, 33, 42, 0, 0 }, [ 136 ] = { 36, 44, 30, 40, 0, 0 }, [ 137 ] = { 56, 54, 165, 106, 0, 0 }, [ 138 ] = { 58, 56, 30, 40, 0, 0 }, [ 139 ] = { 55, 53, 39, 44, 0, 0 }, [ 140 ] = { 12, 58, 6, 44, 0, 0 }, [ 141 ] = { 12, 56, 56, 115, 0, 0 }, [ 142 ] = { 12, 72, 100, 171, 0, 0 }, [ 143 ] = { 210, 15, 165, 106, 0, 0 }, [ 144 ] = { 56, 54, 6, 59, 0, 0 } }; static const pmg_power_group_t power5_groups[] = { [ 0 ] = { .pmg_name = "pm_utilization", .pmg_desc = "CPI and utilization data", .pmg_event_ids = power5_group_event_ids[0], .pmg_mmcr0 = 0x0000000000000000ULL, 
.pmg_mmcr1 = 0x000000000a02121eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 1 ] = { .pmg_name = "pm_completion", .pmg_desc = "Completion and cycle counts", .pmg_event_ids = power5_group_event_ids[1], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000002608261eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 2 ] = { .pmg_name = "pm_group_dispatch", .pmg_desc = "Group dispatch events", .pmg_event_ids = power5_group_event_ids[2], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4000000ec6c8c212ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 3 ] = { .pmg_name = "pm_clb1", .pmg_desc = "CLB fullness", .pmg_event_ids = power5_group_event_ids[3], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x015b000180848c4cULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 4 ] = { .pmg_name = "pm_clb2", .pmg_desc = "CLB fullness", .pmg_event_ids = power5_group_event_ids[4], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x014300028a8ccc02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 5 ] = { .pmg_name = "pm_gct_empty", .pmg_desc = "GCT empty reasons", .pmg_event_ids = power5_group_event_ids[5], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4000000008380838ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 6 ] = { .pmg_name = "pm_gct_usage", .pmg_desc = "GCT Usage", .pmg_event_ids = power5_group_event_ids[6], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000003e3e3e3eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 7 ] = { .pmg_name = "pm_lsu1", .pmg_desc = "LSU LRQ and LMQ events", .pmg_event_ids = power5_group_event_ids[7], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000f000fccc4cccaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 8 ] = { .pmg_name = "pm_lsu2", .pmg_desc = "LSU SRQ events", .pmg_event_ids = power5_group_event_ids[8], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x400e000ecac2ca86ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 9 ] = { .pmg_name = "pm_lsu3", .pmg_desc = "LSU SRQ and LMQ events", .pmg_event_ids = power5_group_event_ids[9], .pmg_mmcr0 = 
0x0000000000000000ULL, .pmg_mmcr1 = 0x010f000a102aca2aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 10 ] = { .pmg_name = "pm_prefetch1", .pmg_desc = "Prefetch stream allocation", .pmg_event_ids = power5_group_event_ids[10], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8432000d36c884ceULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 11 ] = { .pmg_name = "pm_prefetch2", .pmg_desc = "Prefetch events", .pmg_event_ids = power5_group_event_ids[11], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8103000602cace8eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 12 ] = { .pmg_name = "pm_prefetch3", .pmg_desc = "L2 prefetch and misc events", .pmg_event_ids = power5_group_event_ids[12], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x047c000820828602ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 13 ] = { .pmg_name = "pm_prefetch4", .pmg_desc = "Misc prefetch and reject events", .pmg_event_ids = power5_group_event_ids[13], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x063e000ec0c8cc86ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 14 ] = { .pmg_name = "pm_lsu_reject1", .pmg_desc = "LSU reject events", .pmg_event_ids = power5_group_event_ids[14], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc22c000e2010c610ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 15 ] = { .pmg_name = "pm_lsu_reject2", .pmg_desc = "LSU rejects due to reload CDF or tag update collision", .pmg_event_ids = power5_group_event_ids[15], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x820c000dc4cc02ceULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 16 ] = { .pmg_name = "pm_lsu_reject3", .pmg_desc = "LSU rejects due to ERAT, held instructions", .pmg_event_ids = power5_group_event_ids[16], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x420c000fc6cec0c8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 17 ] = { .pmg_name = "pm_lsu_reject4", .pmg_desc = "LSU0/1 reject LMQ full", .pmg_event_ids = power5_group_event_ids[17], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x820c000dc2ca02c8ULL, .pmg_mmcra = 0x0000000000000001ULL 
}, [ 18 ] = { .pmg_name = "pm_lsu_reject5", .pmg_desc = "LSU misc reject and flush events", .pmg_event_ids = power5_group_event_ids[18], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x420c000c10208a8eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 19 ] = { .pmg_name = "pm_flush1", .pmg_desc = "Misc flush events", .pmg_event_ids = power5_group_event_ids[19], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc0f000020210c68eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 20 ] = { .pmg_name = "pm_flush2", .pmg_desc = "Flushes due to scoreboard and sync", .pmg_event_ids = power5_group_event_ids[20], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc08000038002c4c2ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 21 ] = { .pmg_name = "pm_lsu_flush_srq_lrq", .pmg_desc = "LSU flush by SRQ and LRQ events", .pmg_event_ids = power5_group_event_ids[21], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40c000002020028aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 22 ] = { .pmg_name = "pm_lsu_flush_lrq", .pmg_desc = "LSU0/1 flush due to LRQ", .pmg_event_ids = power5_group_event_ids[22], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40c00000848c8a02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 23 ] = { .pmg_name = "pm_lsu_flush_srq", .pmg_desc = "LSU0/1 flush due to SRQ", .pmg_event_ids = power5_group_event_ids[23], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40c00000868e028aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 24 ] = { .pmg_name = "pm_lsu_flush_unaligned", .pmg_desc = "LSU flush due to unaligned data", .pmg_event_ids = power5_group_event_ids[24], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x80c000021010c802ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 25 ] = { .pmg_name = "pm_lsu_flush_uld", .pmg_desc = "LSU0/1 flush due to unaligned load", .pmg_event_ids = power5_group_event_ids[25], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40c0000080888a02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 26 ] = { .pmg_name = "pm_lsu_flush_ust", .pmg_desc = "LSU0/1 
flush due to unaligned store", .pmg_event_ids = power5_group_event_ids[26], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40c00000828a028aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 27 ] = { .pmg_name = "pm_lsu_flush_full", .pmg_desc = "LSU flush due to LRQ/SRQ full", .pmg_event_ids = power5_group_event_ids[27], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc0200009ce0210c0ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 28 ] = { .pmg_name = "pm_lsu_stall1", .pmg_desc = "LSU Stalls", .pmg_event_ids = power5_group_event_ids[28], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4000000028300234ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 29 ] = { .pmg_name = "pm_lsu_stall2", .pmg_desc = "LSU Stalls", .pmg_event_ids = power5_group_event_ids[29], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4000000002341e36ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 30 ] = { .pmg_name = "pm_fxu_stall", .pmg_desc = "FXU Stalls", .pmg_event_ids = power5_group_event_ids[30], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4000000822320232ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 31 ] = { .pmg_name = "pm_fpu_stall", .pmg_desc = "FPU Stalls", .pmg_event_ids = power5_group_event_ids[31], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4000000020360230ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 32 ] = { .pmg_name = "pm_queue_full", .pmg_desc = "BRQ LRQ LMQ queue full", .pmg_event_ids = power5_group_event_ids[32], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x400b0009ce8a84ceULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 33 ] = { .pmg_name = "pm_issueq_full", .pmg_desc = "FPU FX full", .pmg_event_ids = power5_group_event_ids[33], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40000000868e8088ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 34 ] = { .pmg_name = "pm_mapper_full1", .pmg_desc = "CR CTR GPR mapper full", .pmg_event_ids = power5_group_event_ids[34], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40000002888cca82ULL, .pmg_mmcra = 0x0000000000000000ULL 
}, [ 35 ] = { .pmg_name = "pm_mapper_full2", .pmg_desc = "FPR XER mapper full", .pmg_event_ids = power5_group_event_ids[35], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4103000282843602ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 36 ] = { .pmg_name = "pm_misc_load", .pmg_desc = "Non-cacheable loads and stcx events", .pmg_event_ids = power5_group_event_ids[36], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0438000cc2ca828aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 37 ] = { .pmg_name = "pm_ic_demand", .pmg_desc = "ICache demand from BR redirect", .pmg_event_ids = power5_group_event_ids[37], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x800c000fc6cec0c2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 38 ] = { .pmg_name = "pm_ic_pref", .pmg_desc = "ICache prefetch", .pmg_event_ids = power5_group_event_ids[38], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8000000ccecc8e1aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 39 ] = { .pmg_name = "pm_ic_miss", .pmg_desc = "ICache misses", .pmg_event_ids = power5_group_event_ids[39], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4003000e32cec802ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 40 ] = { .pmg_name = "pm_branch_miss", .pmg_desc = "Branch mispredict, TLB and SLB misses", .pmg_event_ids = power5_group_event_ids[40], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x808000031010caccULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 41 ] = { .pmg_name = "pm_branch1", .pmg_desc = "Branch operations", .pmg_event_ids = power5_group_event_ids[41], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x800000030e0e0e0eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 42 ] = { .pmg_name = "pm_branch2", .pmg_desc = "Branch operations", .pmg_event_ids = power5_group_event_ids[42], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4000000ccacc8c02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 43 ] = { .pmg_name = "pm_L1_tlbmiss", .pmg_desc = "L1 load and TLB misses", .pmg_event_ids = power5_group_event_ids[43], .pmg_mmcr0 =
0x0000000000000000ULL, .pmg_mmcr1 = 0x00b000008e881020ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 44 ] = { .pmg_name = "pm_L1_DERAT_miss", .pmg_desc = "L1 store and DERAT misses", .pmg_event_ids = power5_group_event_ids[44], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00b300000e202086ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 45 ] = { .pmg_name = "pm_L1_slbmiss", .pmg_desc = "L1 load and SLB misses", .pmg_event_ids = power5_group_event_ids[45], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00b000008a82848cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 46 ] = { .pmg_name = "pm_L1_dtlbmiss_4K", .pmg_desc = "L1 load references and 4K Data TLB references and misses", .pmg_event_ids = power5_group_event_ids[46], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x08f0000084808088ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 47 ] = { .pmg_name = "pm_L1_dtlbmiss_16M", .pmg_desc = "L1 store references and 16M Data TLB references and misses", .pmg_event_ids = power5_group_event_ids[47], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x08f000008c88828aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 48 ] = { .pmg_name = "pm_dsource1", .pmg_desc = "L3 cache and memory data access", .pmg_event_ids = power5_group_event_ids[48], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x400300001c0e8e02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 49 ] = { .pmg_name = "pm_dsource2", .pmg_desc = "L3 cache and memory data access", .pmg_event_ids = power5_group_event_ids[49], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000300031c0e360eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 50 ] = { .pmg_name = "pm_dsource_L2", .pmg_desc = "L2 cache data access", .pmg_event_ids = power5_group_event_ids[50], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000300032e2e2e2eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 51 ] = { .pmg_name = "pm_dsource_L3", .pmg_desc = "L3 cache data access", .pmg_event_ids = power5_group_event_ids[51], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 
0x000300033c3c3c3cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 52 ] = { .pmg_name = "pm_isource1", .pmg_desc = "Instruction source information", .pmg_event_ids = power5_group_event_ids[52], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8000000c1a1a1a0cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 53 ] = { .pmg_name = "pm_isource2", .pmg_desc = "Instruction source information", .pmg_event_ids = power5_group_event_ids[53], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8000000c0c0c021aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 54 ] = { .pmg_name = "pm_isource_L2", .pmg_desc = "L2 instruction source information", .pmg_event_ids = power5_group_event_ids[54], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8000000c2c2c2c2cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 55 ] = { .pmg_name = "pm_isource_L3", .pmg_desc = "L3 instruction source information", .pmg_event_ids = power5_group_event_ids[55], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8000000c3a3a3a3aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 56 ] = { .pmg_name = "pm_pteg_source1", .pmg_desc = "PTEG source information", .pmg_event_ids = power5_group_event_ids[56], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000200032e2e2e2eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 57 ] = { .pmg_name = "pm_pteg_source2", .pmg_desc = "PTEG source information", .pmg_event_ids = power5_group_event_ids[57], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000200033c3c3c3cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 58 ] = { .pmg_name = "pm_pteg_source3", .pmg_desc = "PTEG source information", .pmg_event_ids = power5_group_event_ids[58], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000200030e0e360eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 59 ] = { .pmg_name = "pm_pteg_source4", .pmg_desc = "L3 PTEG and group dispatch events", .pmg_event_ids = power5_group_event_ids[59], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x003200001c04048eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 60 ] = {
.pmg_name = "pm_L2SA_ld", .pmg_desc = "L2 slice A load events", .pmg_event_ids = power5_group_event_ids[60], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055400580c080c0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 61 ] = { .pmg_name = "pm_L2SA_st", .pmg_desc = "L2 slice A store events", .pmg_event_ids = power5_group_event_ids[61], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055800580c080c0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 62 ] = { .pmg_name = "pm_L2SA_st2", .pmg_desc = "L2 slice A store events", .pmg_event_ids = power5_group_event_ids[62], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055c00580c080c0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 63 ] = { .pmg_name = "pm_L2SB_ld", .pmg_desc = "L2 slice B load events", .pmg_event_ids = power5_group_event_ids[63], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055400582c282c2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 64 ] = { .pmg_name = "pm_L2SB_st", .pmg_desc = "L2 slice B store events", .pmg_event_ids = power5_group_event_ids[64], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055800582c282c2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 65 ] = { .pmg_name = "pm_L2SB_st2", .pmg_desc = "L2 slice B store events", .pmg_event_ids = power5_group_event_ids[65], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055c00582c282c2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 66 ] = { .pmg_name = "pm_L2SC_ld", .pmg_desc = "L2 slice C load events", .pmg_event_ids = power5_group_event_ids[66], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055400584c484c4ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 67 ] = { .pmg_name = "pm_L2SC_st", .pmg_desc = "L2 slice C store events", .pmg_event_ids = power5_group_event_ids[67], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055800584c484c4ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 68 ] = { .pmg_name = "pm_L2SC_st2", .pmg_desc = "L2 slice C store events", .pmg_event_ids = power5_group_event_ids[68], .pmg_mmcr0 = 0x0000000000000000ULL,
.pmg_mmcr1 = 0x3055c00584c484c4ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 69 ] = { .pmg_name = "pm_L3SA_trans", .pmg_desc = "L3 slice A state transitions", .pmg_event_ids = power5_group_event_ids[69], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3015000ac602c686ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 70 ] = { .pmg_name = "pm_L3SB_trans", .pmg_desc = "L3 slice B state transitions", .pmg_event_ids = power5_group_event_ids[70], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3015000602c8c888ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 71 ] = { .pmg_name = "pm_L3SC_trans", .pmg_desc = "L3 slice C state transitions", .pmg_event_ids = power5_group_event_ids[71], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3015000602caca8aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 72 ] = { .pmg_name = "pm_L2SA_trans", .pmg_desc = "L2 slice A state transitions", .pmg_event_ids = power5_group_event_ids[72], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055000ac080c080ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 73 ] = { .pmg_name = "pm_L2SB_trans", .pmg_desc = "L2 slice B state transitions", .pmg_event_ids = power5_group_event_ids[73], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055000ac282c282ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 74 ] = { .pmg_name = "pm_L2SC_trans", .pmg_desc = "L2 slice C state transitions", .pmg_event_ids = power5_group_event_ids[74], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055000ac484c484ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 75 ] = { .pmg_name = "pm_L3SAB_retry", .pmg_desc = "L3 slice A/B snoop retry and all CI/CO busy", .pmg_event_ids = power5_group_event_ids[75], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3005100fc6c8c6c8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 76 ] = { .pmg_name = "pm_L3SAB_hit", .pmg_desc = "L3 slice A/B hit and reference", .pmg_event_ids = power5_group_event_ids[76], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3050100086888688ULL, .pmg_mmcra =
0x0000000000000000ULL }, [ 77 ] = { .pmg_name = "pm_L3SC_retry_hit", .pmg_desc = "L3 slice C hit & snoop retry", .pmg_event_ids = power5_group_event_ids[77], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x3055100aca8aca8aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 78 ] = { .pmg_name = "pm_fpu1", .pmg_desc = "Floating Point events", .pmg_event_ids = power5_group_event_ids[78], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000010101020ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 79 ] = { .pmg_name = "pm_fpu2", .pmg_desc = "Floating Point events", .pmg_event_ids = power5_group_event_ids[79], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000020202010ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 80 ] = { .pmg_name = "pm_fpu3", .pmg_desc = "Floating point events", .pmg_event_ids = power5_group_event_ids[80], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000c1010868eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 81 ] = { .pmg_name = "pm_fpu4", .pmg_desc = "Floating point events", .pmg_event_ids = power5_group_event_ids[81], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0430000c20200220ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 82 ] = { .pmg_name = "pm_fpu5", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5_group_event_ids[82], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000848c848cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 83 ] = { .pmg_name = "pm_fpu6", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5_group_event_ids[83], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000cc0c88088ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 84 ] = { .pmg_name = "pm_fpu7", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5_group_event_ids[84], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000008088828aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 85 ] = { .pmg_name = "pm_fpu8", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5_group_event_ids[85], 
.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000dc2ca02c0ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 86 ] = { .pmg_name = "pm_fpu9", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5_group_event_ids[86], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0430000cc6ce8088ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 87 ] = { .pmg_name = "pm_fpu10", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5_group_event_ids[87], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000828a028aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 88 ] = { .pmg_name = "pm_fpu11", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5_group_event_ids[88], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000868e8602ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 89 ] = { .pmg_name = "pm_fpu12", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5_group_event_ids[89], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0430000cc4cc8002ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 90 ] = { .pmg_name = "pm_fxu1", .pmg_desc = "Fixed Point events", .pmg_event_ids = power5_group_event_ids[90], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000024242424ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 91 ] = { .pmg_name = "pm_fxu2", .pmg_desc = "Fixed Point events", .pmg_event_ids = power5_group_event_ids[91], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4000000604221020ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 92 ] = { .pmg_name = "pm_fxu3", .pmg_desc = "Fixed Point events", .pmg_event_ids = power5_group_event_ids[92], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x404000038688c4ccULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 93 ] = { .pmg_name = "pm_smt_priorities1", .pmg_desc = "Thread priority events", .pmg_event_ids = power5_group_event_ids[93], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0005000fc6ccc6c8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 94 ] = { .pmg_name = 
"pm_smt_priorities2", .pmg_desc = "Thread priority events", .pmg_event_ids = power5_group_event_ids[94], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0005000fc4cacaccULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 95 ] = { .pmg_name = "pm_smt_priorities3", .pmg_desc = "Thread priority events", .pmg_event_ids = power5_group_event_ids[95], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0005000fc2c8c4c2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 96 ] = { .pmg_name = "pm_smt_priorities4", .pmg_desc = "Thread priority events", .pmg_event_ids = power5_group_event_ids[96], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0005000ac016c002ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 97 ] = { .pmg_name = "pm_smt_both", .pmg_desc = "Thread common events", .pmg_event_ids = power5_group_event_ids[97], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0010000016260208ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 98 ] = { .pmg_name = "pm_smt_selection", .pmg_desc = "Thread selection", .pmg_event_ids = power5_group_event_ids[98], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0090000086028082ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 99 ] = { .pmg_name = "pm_smt_selectover1", .pmg_desc = "Thread selection override", .pmg_event_ids = power5_group_event_ids[99], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0050000002808488ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 100 ] = { .pmg_name = "pm_smt_selectover2", .pmg_desc = "Thread selection override", .pmg_event_ids = power5_group_event_ids[100], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00100000021e8a86ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 101 ] = { .pmg_name = "pm_fabric1", .pmg_desc = "Fabric events", .pmg_event_ids = power5_group_event_ids[101], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x305500058ece8eceULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 102 ] = { .pmg_name = "pm_fabric2", .pmg_desc = "Fabric data movement", .pmg_event_ids = power5_group_event_ids[102], .pmg_mmcr0 =
0x0000000000000000ULL, .pmg_mmcr1 = 0x305500858ece8eceULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 103 ] = { .pmg_name = "pm_fabric3", .pmg_desc = "Fabric data movement", .pmg_event_ids = power5_group_event_ids[103], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x305501858ece8eceULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 104 ] = { .pmg_name = "pm_fabric4", .pmg_desc = "Fabric data movement", .pmg_event_ids = power5_group_event_ids[104], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x705401068ecec68eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 105 ] = { .pmg_name = "pm_snoop1", .pmg_desc = "Snoop retry", .pmg_event_ids = power5_group_event_ids[105], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x305500058ccc8cccULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 106 ] = { .pmg_name = "pm_snoop2", .pmg_desc = "Snoop read retry", .pmg_event_ids = power5_group_event_ids[106], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x30540a048ccc8c02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 107 ] = { .pmg_name = "pm_snoop3", .pmg_desc = "Snoop write retry", .pmg_event_ids = power5_group_event_ids[107], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x30550c058ccc8cccULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 108 ] = { .pmg_name = "pm_snoop4", .pmg_desc = "Snoop partial write retry", .pmg_event_ids = power5_group_event_ids[108], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x30550e058ccc8cccULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 109 ] = { .pmg_name = "pm_mem_rq", .pmg_desc = "Memory read queue dispatch", .pmg_event_ids = power5_group_event_ids[109], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x705402058ccc8cceULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 110 ] = { .pmg_name = "pm_mem_read", .pmg_desc = "Memory read complete and cancel", .pmg_event_ids = power5_group_event_ids[110], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x305404048ccc8c06ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 111 ] = { .pmg_name = "pm_mem_wq", .pmg_desc = "Memory 
write queue dispatch", .pmg_event_ids = power5_group_event_ids[111], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x305506058ccc8cccULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 112 ] = { .pmg_name = "pm_mem_pwq", .pmg_desc = "Memory partial write queue", .pmg_event_ids = power5_group_event_ids[112], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x305508058ccc8cccULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 113 ] = { .pmg_name = "pm_threshold", .pmg_desc = "Thresholding", .pmg_event_ids = power5_group_event_ids[113], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0008000404c41628ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 114 ] = { .pmg_name = "pm_mrk_grp1", .pmg_desc = "Marked group events", .pmg_event_ids = power5_group_event_ids[114], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0008000404c60a26ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 115 ] = { .pmg_name = "pm_mrk_grp2", .pmg_desc = "Marked group events", .pmg_event_ids = power5_group_event_ids[115], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x410300022a0ac822ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 116 ] = { .pmg_name = "pm_mrk_dsource1", .pmg_desc = "Marked data from ", .pmg_event_ids = power5_group_event_ids[116], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b00030e404444ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 117 ] = { .pmg_name = "pm_mrk_dsource2", .pmg_desc = "Marked data from", .pmg_event_ids = power5_group_event_ids[117], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b00002e440210ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 118 ] = { .pmg_name = "pm_mrk_dsource3", .pmg_desc = "Marked data from", .pmg_event_ids = power5_group_event_ids[118], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b00031c484c4cULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 119 ] = { .pmg_name = "pm_mrk_dsource4", .pmg_desc = "Marked data from", .pmg_event_ids = power5_group_event_ids[119], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b000342462e42ULL, .pmg_mmcra = 
0x0000000000000001ULL }, [ 120 ] = { .pmg_name = "pm_mrk_dsource5", .pmg_desc = "Marked data from", .pmg_event_ids = power5_group_event_ids[120], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b00033c4c4040ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 121 ] = { .pmg_name = "pm_mrk_dsource6", .pmg_desc = "Marked data from", .pmg_event_ids = power5_group_event_ids[121], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b000146460246ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 122 ] = { .pmg_name = "pm_mrk_dsource7", .pmg_desc = "Marked data from", .pmg_event_ids = power5_group_event_ids[122], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x010b00034e4e3c4eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 123 ] = { .pmg_name = "pm_mrk_lbmiss", .pmg_desc = "Marked TLB and SLB misses", .pmg_event_ids = power5_group_event_ids[123], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0cf00000828a8c8eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 124 ] = { .pmg_name = "pm_mrk_lbref", .pmg_desc = "Marked TLB and SLB references", .pmg_event_ids = power5_group_event_ids[124], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0cf00000868e028eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 125 ] = { .pmg_name = "pm_mrk_lsmiss", .pmg_desc = "Marked load and store miss", .pmg_event_ids = power5_group_event_ids[125], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000800081002060aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 126 ] = { .pmg_name = "pm_mrk_ulsflush", .pmg_desc = "Mark unaligned load and store flushes", .pmg_event_ids = power5_group_event_ids[126], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0028000406c62020ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 127 ] = { .pmg_name = "pm_mrk_misc", .pmg_desc = "Misc marked instructions", .pmg_event_ids = power5_group_event_ids[127], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00080008cc062816ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 128 ] = { .pmg_name = "pm_lsref_L1", .pmg_desc = "Load/Store operations 
and L1 activity", .pmg_event_ids = power5_group_event_ids[128], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x803300040e1a2020ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 129 ] = { .pmg_name = "pm_lsref_L2L3", .pmg_desc = "Load/Store operations and L2, L3 activity", .pmg_event_ids = power5_group_event_ids[129], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x003300001c0e2020ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 130 ] = { .pmg_name = "pm_lsref_tlbmiss", .pmg_desc = "Load/Store operations and TLB misses", .pmg_event_ids = power5_group_event_ids[130], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00b0000080882020ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 131 ] = { .pmg_name = "pm_Dmiss", .pmg_desc = "Data cache misses", .pmg_event_ids = power5_group_event_ids[131], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x003300001c0e1086ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 132 ] = { .pmg_name = "pm_prefetchX", .pmg_desc = "Prefetch events", .pmg_event_ids = power5_group_event_ids[132], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x853300061eccce86ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 133 ] = { .pmg_name = "pm_branchX", .pmg_desc = "Branch operations", .pmg_event_ids = power5_group_event_ids[133], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x800000030e0e0ec8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 134 ] = { .pmg_name = "pm_fpuX1", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5_group_event_ids[134], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000dc2ca86c0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 135 ] = { .pmg_name = "pm_fpuX2", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5_group_event_ids[135], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000828a828aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 136 ] = { .pmg_name = "pm_fpuX3", .pmg_desc = "Floating point events by unit", .pmg_event_ids = power5_group_event_ids[136], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 =
0x00000000868e868eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 137 ] = { .pmg_name = "pm_fpuX4", .pmg_desc = "Floating point and L1 events", .pmg_event_ids = power5_group_event_ids[137], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0030000020102020ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 138 ] = { .pmg_name = "pm_fpuX5", .pmg_desc = "Floating point events", .pmg_event_ids = power5_group_event_ids[138], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000c2020868eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 139 ] = { .pmg_name = "pm_fpuX6", .pmg_desc = "Floating point events", .pmg_event_ids = power5_group_event_ids[139], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000010202010ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 140 ] = { .pmg_name = "pm_hpmcount1", .pmg_desc = "HPM group for set 1 ", .pmg_event_ids = power5_group_event_ids[140], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000001e281e10ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 141 ] = { .pmg_name = "pm_hpmcount2", .pmg_desc = "HPM group for set 2", .pmg_event_ids = power5_group_event_ids[141], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x043000041e201220ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 142 ] = { .pmg_name = "pm_hpmcount3", .pmg_desc = "HPM group for set 3 ", .pmg_event_ids = power5_group_event_ids[142], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x403000041ec21086ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 143 ] = { .pmg_name = "pm_hpmcount4", .pmg_desc = "HPM group for set 7", .pmg_event_ids = power5_group_event_ids[143], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00b00000101e2020ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 144 ] = { .pmg_name = "pm_1flop_with_fma", .pmg_desc = "One flop instructions plus FMA", .pmg_event_ids = power5_group_event_ids[144], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000020101e02ULL, .pmg_mmcra = 0x0000000000000000ULL } }; #endif 
papi-papi-7-2-0-t/src/libperfnec/lib/power6_events.h000066400000000000000000014015221502707512200222720ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __POWER6_EVENTS_H__ #define __POWER6_EVENTS_H__ /* * File: power6_events.h * CVS: * Author: Corey Ashford * cjashfor@us.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2007. All Rights Reserved. * Contributed by Corey Ashford * * Note: This code was automatically generated and should not be modified by * hand. * */ #define POWER6_PME_PM_LSU_REJECT_STQ_FULL 0 #define POWER6_PME_PM_DPU_HELD_FXU_MULTI 1 #define POWER6_PME_PM_VMX1_STALL 2 #define POWER6_PME_PM_PMC2_SAVED 3 #define POWER6_PME_PM_L2SB_IC_INV 4 #define POWER6_PME_PM_IERAT_MISS_64K 5 #define POWER6_PME_PM_THRD_PRIO_DIFF_3or4_CYC 6 #define POWER6_PME_PM_LD_REF_L1_BOTH 7 #define POWER6_PME_PM_FPU1_FCONV 8 #define POWER6_PME_PM_IBUF_FULL_COUNT 9 #define POWER6_PME_PM_MRK_LSU_DERAT_MISS 10 #define POWER6_PME_PM_MRK_ST_CMPL 11 #define POWER6_PME_PM_L2_CASTOUT_MOD 12 #define POWER6_PME_PM_FPU1_ST_FOLDED 13 #define POWER6_PME_PM_MRK_INST_TIMEO 14 #define POWER6_PME_PM_DPU_WT 15 #define POWER6_PME_PM_DPU_HELD_RESTART 16 #define POWER6_PME_PM_IERAT_MISS 17 #define POWER6_PME_PM_FPU_SINGLE 18 #define POWER6_PME_PM_MRK_PTEG_FROM_LMEM 19 #define POWER6_PME_PM_HV_COUNT 20 #define POWER6_PME_PM_L2SA_ST_HIT 21 #define POWER6_PME_PM_L2_LD_MISS_INST 22 #define POWER6_PME_PM_EXT_INT 23 #define POWER6_PME_PM_LSU1_LDF 24 #define POWER6_PME_PM_FAB_CMD_ISSUED 25 #define POWER6_PME_PM_PTEG_FROM_L21 26 #define POWER6_PME_PM_L2SA_MISS 27 #define POWER6_PME_PM_PTEG_FROM_RL2L3_MOD 28 #define POWER6_PME_PM_DPU_WT_COUNT 29 #define POWER6_PME_PM_MRK_PTEG_FROM_L25_MOD 30 #define POWER6_PME_PM_LD_HIT_L2 31 #define POWER6_PME_PM_PTEG_FROM_DL2L3_SHR 32 #define POWER6_PME_PM_MEM_DP_RQ_GLOB_LOC 33 #define POWER6_PME_PM_L3SA_MISS 34 #define POWER6_PME_PM_NO_ITAG_COUNT 35 #define POWER6_PME_PM_DSLB_MISS 36 
#define POWER6_PME_PM_LSU_FLUSH_ALIGN 37 #define POWER6_PME_PM_DPU_HELD_FPU_CR 38 #define POWER6_PME_PM_PTEG_FROM_L2MISS 39 #define POWER6_PME_PM_MRK_DATA_FROM_DMEM 40 #define POWER6_PME_PM_PTEG_FROM_LMEM 41 #define POWER6_PME_PM_MRK_DERAT_REF_64K 42 #define POWER6_PME_PM_L2SA_LD_REQ_INST 43 #define POWER6_PME_PM_MRK_DERAT_MISS_16M 44 #define POWER6_PME_PM_DATA_FROM_DL2L3_MOD 45 #define POWER6_PME_PM_FPU0_FXMULT 46 #define POWER6_PME_PM_L3SB_MISS 47 #define POWER6_PME_PM_STCX_CANCEL 48 #define POWER6_PME_PM_L2SA_LD_MISS_DATA 49 #define POWER6_PME_PM_IC_INV_L2 50 #define POWER6_PME_PM_DPU_HELD 51 #define POWER6_PME_PM_PMC1_OVERFLOW 52 #define POWER6_PME_PM_THRD_PRIO_6_CYC 53 #define POWER6_PME_PM_MRK_PTEG_FROM_L3MISS 54 #define POWER6_PME_PM_MRK_LSU0_REJECT_UST 55 #define POWER6_PME_PM_MRK_INST_DISP 56 #define POWER6_PME_PM_LARX 57 #define POWER6_PME_PM_INST_CMPL 58 #define POWER6_PME_PM_FXU_IDLE 59 #define POWER6_PME_PM_MRK_DATA_FROM_DL2L3_MOD 60 #define POWER6_PME_PM_L2_LD_REQ_DATA 61 #define POWER6_PME_PM_LSU_DERAT_MISS_CYC 62 #define POWER6_PME_PM_DPU_HELD_POWER_COUNT 63 #define POWER6_PME_PM_INST_FROM_RL2L3_MOD 64 #define POWER6_PME_PM_DATA_FROM_DMEM_CYC 65 #define POWER6_PME_PM_DATA_FROM_DMEM 66 #define POWER6_PME_PM_LSU_REJECT_PARTIAL_SECTOR 67 #define POWER6_PME_PM_LSU_REJECT_DERAT_MPRED 68 #define POWER6_PME_PM_LSU1_REJECT_ULD 69 #define POWER6_PME_PM_DATA_FROM_L3_CYC 70 #define POWER6_PME_PM_FXU1_BUSY_FXU0_IDLE 71 #define POWER6_PME_PM_INST_FROM_MEM_DP 72 #define POWER6_PME_PM_LSU_FLUSH_DSI 73 #define POWER6_PME_PM_MRK_DERAT_REF_16G 74 #define POWER6_PME_PM_LSU_LDF_BOTH 75 #define POWER6_PME_PM_FPU1_1FLOP 76 #define POWER6_PME_PM_DATA_FROM_RMEM_CYC 77 #define POWER6_PME_PM_INST_PTEG_SECONDARY 78 #define POWER6_PME_PM_L1_ICACHE_MISS 79 #define POWER6_PME_PM_INST_DISP_LLA 80 #define POWER6_PME_PM_THRD_BOTH_RUN_CYC 81 #define POWER6_PME_PM_LSU_ST_CHAINED 82 #define POWER6_PME_PM_FPU1_FXDIV 83 #define POWER6_PME_PM_FREQ_UP 84 #define 
POWER6_PME_PM_FAB_RETRY_SYS_PUMP 85 #define POWER6_PME_PM_DATA_FROM_LMEM 86 #define POWER6_PME_PM_PMC3_OVERFLOW 87 #define POWER6_PME_PM_LSU0_REJECT_SET_MPRED 88 #define POWER6_PME_PM_LSU0_REJECT_DERAT_MPRED 89 #define POWER6_PME_PM_LSU1_REJECT_STQ_FULL 90 #define POWER6_PME_PM_MRK_BR_MPRED 91 #define POWER6_PME_PM_L2SA_ST_MISS 92 #define POWER6_PME_PM_LSU0_REJECT_EXTERN 93 #define POWER6_PME_PM_MRK_BR_TAKEN 94 #define POWER6_PME_PM_ISLB_MISS 95 #define POWER6_PME_PM_CYC 96 #define POWER6_PME_PM_FPU_FXDIV 97 #define POWER6_PME_PM_DPU_HELD_LLA_END 98 #define POWER6_PME_PM_MEM0_DP_CL_WR_LOC 99 #define POWER6_PME_PM_MRK_LSU_REJECT_ULD 100 #define POWER6_PME_PM_1PLUS_PPC_CMPL 101 #define POWER6_PME_PM_PTEG_FROM_DMEM 102 #define POWER6_PME_PM_DPU_WT_BR_MPRED_COUNT 103 #define POWER6_PME_PM_GCT_FULL_CYC 104 #define POWER6_PME_PM_INST_FROM_L25_SHR 105 #define POWER6_PME_PM_MRK_DERAT_MISS_4K 106 #define POWER6_PME_PM_DC_PREF_STREAM_ALLOC 107 #define POWER6_PME_PM_FPU1_FIN 108 #define POWER6_PME_PM_BR_MPRED_TA 109 #define POWER6_PME_PM_DPU_HELD_POWER 110 #define POWER6_PME_PM_RUN_INST_CMPL 111 #define POWER6_PME_PM_GCT_EMPTY_CYC 112 #define POWER6_PME_PM_LLA_COUNT 113 #define POWER6_PME_PM_LSU0_REJECT_NO_SCRATCH 114 #define POWER6_PME_PM_DPU_WT_IC_MISS 115 #define POWER6_PME_PM_DATA_FROM_L3MISS 116 #define POWER6_PME_PM_FPU_FPSCR 117 #define POWER6_PME_PM_VMX1_INST_ISSUED 118 #define POWER6_PME_PM_FLUSH 119 #define POWER6_PME_PM_ST_HIT_L2 120 #define POWER6_PME_PM_SYNC_CYC 121 #define POWER6_PME_PM_FAB_SYS_PUMP 122 #define POWER6_PME_PM_IC_PREF_REQ 123 #define POWER6_PME_PM_MEM0_DP_RQ_GLOB_LOC 124 #define POWER6_PME_PM_FPU_ISSUE_0 125 #define POWER6_PME_PM_THRD_PRIO_2_CYC 126 #define POWER6_PME_PM_VMX_SIMPLE_ISSUED 127 #define POWER6_PME_PM_MRK_FPU1_FIN 128 #define POWER6_PME_PM_DPU_HELD_CW 129 #define POWER6_PME_PM_L3SA_REF 130 #define POWER6_PME_PM_STCX 131 #define POWER6_PME_PM_L2SB_MISS 132 #define POWER6_PME_PM_LSU0_REJECT 133 #define POWER6_PME_PM_TB_BIT_TRANS 134 
#define POWER6_PME_PM_THERMAL_MAX 135
#define POWER6_PME_PM_FPU0_STF 136
#define POWER6_PME_PM_FPU1_FMA 137
#define POWER6_PME_PM_LSU1_REJECT_LHS 138
#define POWER6_PME_PM_DPU_HELD_INT 139
#define POWER6_PME_PM_THRD_LLA_BOTH_CYC 140
#define POWER6_PME_PM_DPU_HELD_THERMAL_COUNT 141
#define POWER6_PME_PM_PMC4_REWIND 142
#define POWER6_PME_PM_DERAT_REF_16M 143
#define POWER6_PME_PM_FPU0_FCONV 144
#define POWER6_PME_PM_L2SA_LD_REQ_DATA 145
#define POWER6_PME_PM_DATA_FROM_MEM_DP 146
#define POWER6_PME_PM_MRK_VMX_FLOAT_ISSUED 147
#define POWER6_PME_PM_MRK_PTEG_FROM_L2MISS 148
#define POWER6_PME_PM_THRD_PRIO_DIFF_1or2_CYC 149
#define POWER6_PME_PM_VMX0_STALL 150
#define POWER6_PME_PM_IC_DEMAND_L2_BHT_REDIRECT 151
#define POWER6_PME_PM_LSU_DERAT_MISS 152
#define POWER6_PME_PM_FPU0_SINGLE 153
#define POWER6_PME_PM_FPU_ISSUE_STEERING 154
#define POWER6_PME_PM_THRD_PRIO_1_CYC 155
#define POWER6_PME_PM_VMX_COMPLEX_ISSUED 156
#define POWER6_PME_PM_FPU_ISSUE_ST_FOLDED 157
#define POWER6_PME_PM_DFU_FIN 158
#define POWER6_PME_PM_BR_PRED_CCACHE 159
#define POWER6_PME_PM_MRK_ST_CMPL_INT 160
#define POWER6_PME_PM_FAB_MMIO 161
#define POWER6_PME_PM_MRK_VMX_SIMPLE_ISSUED 162
#define POWER6_PME_PM_FPU_STF 163
#define POWER6_PME_PM_MEM1_DP_CL_WR_GLOB 164
#define POWER6_PME_PM_MRK_DATA_FROM_L3MISS 165
#define POWER6_PME_PM_GCT_NOSLOT_CYC 166
#define POWER6_PME_PM_L2_ST_REQ_DATA 167
#define POWER6_PME_PM_INST_TABLEWALK_COUNT 168
#define POWER6_PME_PM_PTEG_FROM_L35_SHR 169
#define POWER6_PME_PM_DPU_HELD_ISYNC 170
#define POWER6_PME_PM_MRK_DATA_FROM_L25_SHR 171
#define POWER6_PME_PM_L3SA_HIT 172
#define POWER6_PME_PM_DERAT_MISS_16G 173
#define POWER6_PME_PM_DATA_PTEG_2ND_HALF 174
#define POWER6_PME_PM_L2SA_ST_REQ 175
#define POWER6_PME_PM_INST_FROM_LMEM 176
#define POWER6_PME_PM_IC_DEMAND_L2_BR_REDIRECT 177
#define POWER6_PME_PM_PTEG_FROM_L2 178
#define POWER6_PME_PM_DATA_PTEG_1ST_HALF 179
#define POWER6_PME_PM_BR_MPRED_COUNT 180
#define POWER6_PME_PM_IERAT_MISS_4K 181
#define POWER6_PME_PM_THRD_BOTH_RUN_COUNT 182
#define POWER6_PME_PM_LSU_REJECT_ULD 183
#define POWER6_PME_PM_DATA_FROM_DL2L3_MOD_CYC 184
#define POWER6_PME_PM_MRK_PTEG_FROM_RL2L3_MOD 185
#define POWER6_PME_PM_FPU0_FLOP 186
#define POWER6_PME_PM_FPU0_FEST 187
#define POWER6_PME_PM_MRK_LSU0_REJECT_LHS 188
#define POWER6_PME_PM_VMX_RESULT_SAT_1 189
#define POWER6_PME_PM_NO_ITAG_CYC 190
#define POWER6_PME_PM_LSU1_REJECT_NO_SCRATCH 191
#define POWER6_PME_PM_0INST_FETCH 192
#define POWER6_PME_PM_DPU_WT_BR_MPRED 193
#define POWER6_PME_PM_L1_PREF 194
#define POWER6_PME_PM_VMX_FLOAT_MULTICYCLE 195
#define POWER6_PME_PM_DATA_FROM_L25_SHR_CYC 196
#define POWER6_PME_PM_DATA_FROM_L3 197
#define POWER6_PME_PM_PMC2_OVERFLOW 198
#define POWER6_PME_PM_VMX0_LD_WRBACK 199
#define POWER6_PME_PM_FPU0_DENORM 200
#define POWER6_PME_PM_INST_FETCH_CYC 201
#define POWER6_PME_PM_LSU_LDF 202
#define POWER6_PME_PM_LSU_REJECT_L2_CORR 203
#define POWER6_PME_PM_DERAT_REF_64K 204
#define POWER6_PME_PM_THRD_PRIO_3_CYC 205
#define POWER6_PME_PM_FPU_FMA 206
#define POWER6_PME_PM_INST_FROM_L35_MOD 207
#define POWER6_PME_PM_DFU_CONV 208
#define POWER6_PME_PM_INST_FROM_L25_MOD 209
#define POWER6_PME_PM_PTEG_FROM_L35_MOD 210
#define POWER6_PME_PM_MRK_VMX_ST_ISSUED 211
#define POWER6_PME_PM_VMX_FLOAT_ISSUED 212
#define POWER6_PME_PM_LSU0_REJECT_L2_CORR 213
#define POWER6_PME_PM_THRD_L2MISS 214
#define POWER6_PME_PM_FPU_FCONV 215
#define POWER6_PME_PM_FPU_FXMULT 216
#define POWER6_PME_PM_FPU1_FRSP 217
#define POWER6_PME_PM_MRK_DERAT_REF_16M 218
#define POWER6_PME_PM_L2SB_CASTOUT_SHR 219
#define POWER6_PME_PM_THRD_ONE_RUN_COUNT 220
#define POWER6_PME_PM_INST_FROM_RMEM 221
#define POWER6_PME_PM_LSU_BOTH_BUS 222
#define POWER6_PME_PM_FPU1_FSQRT_FDIV 223
#define POWER6_PME_PM_L2_LD_REQ_INST 224
#define POWER6_PME_PM_MRK_PTEG_FROM_L35_SHR 225
#define POWER6_PME_PM_BR_PRED_CR 226
#define POWER6_PME_PM_MRK_LSU0_REJECT_ULD 227
#define POWER6_PME_PM_LSU_REJECT 228
#define POWER6_PME_PM_LSU_REJECT_LHS_BOTH 229
#define POWER6_PME_PM_GXO_ADDR_CYC_BUSY 230
#define POWER6_PME_PM_LSU_SRQ_EMPTY_COUNT 231
#define POWER6_PME_PM_PTEG_FROM_L3 232
#define POWER6_PME_PM_VMX0_LD_ISSUED 233
#define POWER6_PME_PM_FXU_PIPELINED_MULT_DIV 234
#define POWER6_PME_PM_FPU1_STF 235
#define POWER6_PME_PM_DFU_ADD 236
#define POWER6_PME_PM_MEM_DP_CL_WR_GLOB 237
#define POWER6_PME_PM_MRK_LSU1_REJECT_ULD 238
#define POWER6_PME_PM_ITLB_REF 239
#define POWER6_PME_PM_LSU0_REJECT_L2MISS 240
#define POWER6_PME_PM_DATA_FROM_L35_SHR 241
#define POWER6_PME_PM_MRK_DATA_FROM_RL2L3_MOD 242
#define POWER6_PME_PM_FPU0_FPSCR 243
#define POWER6_PME_PM_DATA_FROM_L2 244
#define POWER6_PME_PM_DPU_HELD_XER 245
#define POWER6_PME_PM_FAB_NODE_PUMP 246
#define POWER6_PME_PM_VMX_RESULT_SAT_0_1 247
#define POWER6_PME_PM_LD_REF_L1 248
#define POWER6_PME_PM_TLB_REF 249
#define POWER6_PME_PM_DC_PREF_OUT_OF_STREAMS 250
#define POWER6_PME_PM_FLUSH_FPU 251
#define POWER6_PME_PM_MEM1_DP_CL_WR_LOC 252
#define POWER6_PME_PM_L2SB_LD_HIT 253
#define POWER6_PME_PM_FAB_DCLAIM 254
#define POWER6_PME_PM_MEM_DP_CL_WR_LOC 255
#define POWER6_PME_PM_BR_MPRED_CR 256
#define POWER6_PME_PM_LSU_REJECT_EXTERN 257
#define POWER6_PME_PM_DATA_FROM_RL2L3_MOD 258
#define POWER6_PME_PM_DPU_HELD_RU_WQ 259
#define POWER6_PME_PM_LD_MISS_L1 260
#define POWER6_PME_PM_DC_INV_L2 261
#define POWER6_PME_PM_MRK_PTEG_FROM_RMEM 262
#define POWER6_PME_PM_FPU_FIN 263
#define POWER6_PME_PM_FXU0_FIN 264
#define POWER6_PME_PM_DPU_HELD_FPQ 265
#define POWER6_PME_PM_GX_DMA_READ 266
#define POWER6_PME_PM_LSU1_REJECT_PARTIAL_SECTOR 267
#define POWER6_PME_PM_0INST_FETCH_COUNT 268
#define POWER6_PME_PM_PMC5_OVERFLOW 269
#define POWER6_PME_PM_L2SB_LD_REQ 270
#define POWER6_PME_PM_THRD_PRIO_DIFF_0_CYC 271
#define POWER6_PME_PM_DATA_FROM_RMEM 272
#define POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_BOTH_CYC 273
#define POWER6_PME_PM_ST_REF_L1_BOTH 274
#define POWER6_PME_PM_VMX_PERMUTE_ISSUED 275
#define POWER6_PME_PM_BR_TAKEN 276
#define POWER6_PME_PM_FAB_DMA 277
#define POWER6_PME_PM_GCT_EMPTY_COUNT 278
#define POWER6_PME_PM_FPU1_SINGLE 279
#define POWER6_PME_PM_L2SA_CASTOUT_SHR 280
#define POWER6_PME_PM_L3SB_REF 281
#define POWER6_PME_PM_FPU0_FRSP 282
#define POWER6_PME_PM_PMC4_SAVED 283
#define POWER6_PME_PM_L2SA_DC_INV 284
#define POWER6_PME_PM_GXI_ADDR_CYC_BUSY 285
#define POWER6_PME_PM_FPU0_FMA 286
#define POWER6_PME_PM_SLB_MISS 287
#define POWER6_PME_PM_MRK_ST_GPS 288
#define POWER6_PME_PM_DERAT_REF_4K 289
#define POWER6_PME_PM_L2_CASTOUT_SHR 290
#define POWER6_PME_PM_DPU_HELD_STCX_CR 291
#define POWER6_PME_PM_FPU0_ST_FOLDED 292
#define POWER6_PME_PM_MRK_DATA_FROM_L21 293
#define POWER6_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC 294
#define POWER6_PME_PM_DATA_FROM_L35_MOD 295
#define POWER6_PME_PM_DATA_FROM_DL2L3_SHR 296
#define POWER6_PME_PM_GXI_DATA_CYC_BUSY 297
#define POWER6_PME_PM_LSU_REJECT_STEAL 298
#define POWER6_PME_PM_ST_FIN 299
#define POWER6_PME_PM_DPU_HELD_CR_LOGICAL 300
#define POWER6_PME_PM_THRD_SEL_T0 301
#define POWER6_PME_PM_PTEG_RELOAD_VALID 302
#define POWER6_PME_PM_L2_PREF_ST 303
#define POWER6_PME_PM_MRK_STCX_FAIL 304
#define POWER6_PME_PM_LSU0_REJECT_LHS 305
#define POWER6_PME_PM_DFU_EXP_EQ 306
#define POWER6_PME_PM_DPU_HELD_FP_FX_MULT 307
#define POWER6_PME_PM_L2_LD_MISS_DATA 308
#define POWER6_PME_PM_DATA_FROM_L35_MOD_CYC 309
#define POWER6_PME_PM_FLUSH_FXU 310
#define POWER6_PME_PM_FPU_ISSUE_1 311
#define POWER6_PME_PM_DATA_FROM_LMEM_CYC 312
#define POWER6_PME_PM_DPU_HELD_LSU_SOPS 313
#define POWER6_PME_PM_INST_PTEG_2ND_HALF 314
#define POWER6_PME_PM_THRESH_TIMEO 315
#define POWER6_PME_PM_LSU_REJECT_UST_BOTH 316
#define POWER6_PME_PM_LSU_REJECT_FAST 317
#define POWER6_PME_PM_DPU_HELD_THRD_PRIO 318
#define POWER6_PME_PM_L2_PREF_LD 319
#define POWER6_PME_PM_FPU_FEST 320
#define POWER6_PME_PM_MRK_DATA_FROM_RMEM 321
#define POWER6_PME_PM_LD_MISS_L1_CYC 322
#define POWER6_PME_PM_DERAT_MISS_4K 323
#define POWER6_PME_PM_DPU_HELD_COMPLETION 324
#define POWER6_PME_PM_FPU_ISSUE_STALL_ST 325
#define POWER6_PME_PM_L2SB_DC_INV 326
#define POWER6_PME_PM_PTEG_FROM_L25_SHR 327
#define POWER6_PME_PM_PTEG_FROM_DL2L3_MOD 328
#define POWER6_PME_PM_FAB_CMD_RETRIED 329
#define POWER6_PME_PM_BR_PRED_LSTACK 330
#define POWER6_PME_PM_GXO_DATA_CYC_BUSY 331
#define POWER6_PME_PM_DFU_SUBNORM 332
#define POWER6_PME_PM_FPU_ISSUE_OOO 333
#define POWER6_PME_PM_LSU_REJECT_ULD_BOTH 334
#define POWER6_PME_PM_L2SB_ST_MISS 335
#define POWER6_PME_PM_DATA_FROM_L25_MOD_CYC 336
#define POWER6_PME_PM_INST_PTEG_1ST_HALF 337
#define POWER6_PME_PM_DERAT_MISS_16M 338
#define POWER6_PME_PM_GX_DMA_WRITE 339
#define POWER6_PME_PM_MRK_PTEG_FROM_DL2L3_MOD 340
#define POWER6_PME_PM_MEM1_DP_RQ_GLOB_LOC 341
#define POWER6_PME_PM_L2SB_LD_REQ_DATA 342
#define POWER6_PME_PM_L2SA_LD_MISS_INST 343
#define POWER6_PME_PM_MRK_LSU0_REJECT_L2MISS 344
#define POWER6_PME_PM_MRK_IFU_FIN 345
#define POWER6_PME_PM_INST_FROM_L3 346
#define POWER6_PME_PM_FXU1_FIN 347
#define POWER6_PME_PM_THRD_PRIO_4_CYC 348
#define POWER6_PME_PM_MRK_DATA_FROM_L35_MOD 349
#define POWER6_PME_PM_LSU_REJECT_SET_MPRED 350
#define POWER6_PME_PM_MRK_DERAT_MISS_16G 351
#define POWER6_PME_PM_FPU0_FXDIV 352
#define POWER6_PME_PM_MRK_LSU1_REJECT_UST 353
#define POWER6_PME_PM_FPU_ISSUE_DIV_SQRT_OVERLAP 354
#define POWER6_PME_PM_INST_FROM_L35_SHR 355
#define POWER6_PME_PM_MRK_LSU_REJECT_LHS 356
#define POWER6_PME_PM_LSU_LMQ_FULL_CYC 357
#define POWER6_PME_PM_SYNC_COUNT 358
#define POWER6_PME_PM_MEM0_DP_RQ_LOC_GLOB 359
#define POWER6_PME_PM_L2SA_CASTOUT_MOD 360
#define POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_BOTH_COUNT 361
#define POWER6_PME_PM_PTEG_FROM_MEM_DP 362
#define POWER6_PME_PM_LSU_REJECT_SLOW 363
#define POWER6_PME_PM_PTEG_FROM_L25_MOD 364
#define POWER6_PME_PM_THRD_PRIO_7_CYC 365
#define POWER6_PME_PM_MRK_PTEG_FROM_RL2L3_SHR 366
#define POWER6_PME_PM_ST_REQ_L2 367
#define POWER6_PME_PM_ST_REF_L1 368
#define POWER6_PME_PM_FPU_ISSUE_STALL_THRD 369
#define POWER6_PME_PM_RUN_COUNT 370
#define POWER6_PME_PM_RUN_CYC 371
#define POWER6_PME_PM_PTEG_FROM_RMEM 372
#define POWER6_PME_PM_LSU0_LDF 373
#define POWER6_PME_PM_ST_MISS_L1 374
#define POWER6_PME_PM_INST_FROM_DL2L3_SHR 375
#define POWER6_PME_PM_L2SA_IC_INV 376
#define POWER6_PME_PM_THRD_ONE_RUN_CYC 377
#define POWER6_PME_PM_L2SB_LD_REQ_INST 378
#define POWER6_PME_PM_MRK_DATA_FROM_L25_MOD 379
#define POWER6_PME_PM_DPU_HELD_XTHRD 380
#define POWER6_PME_PM_L2SB_ST_REQ 381
#define POWER6_PME_PM_INST_FROM_L21 382
#define POWER6_PME_PM_INST_FROM_L3MISS 383
#define POWER6_PME_PM_L3SB_HIT 384
#define POWER6_PME_PM_EE_OFF_EXT_INT 385
#define POWER6_PME_PM_INST_FROM_DL2L3_MOD 386
#define POWER6_PME_PM_PMC6_OVERFLOW 387
#define POWER6_PME_PM_FPU_FLOP 388
#define POWER6_PME_PM_FXU_BUSY 389
#define POWER6_PME_PM_FPU1_FLOP 390
#define POWER6_PME_PM_IC_RELOAD_SHR 391
#define POWER6_PME_PM_INST_TABLEWALK_CYC 392
#define POWER6_PME_PM_DATA_FROM_RL2L3_MOD_CYC 393
#define POWER6_PME_PM_THRD_PRIO_DIFF_5or6_CYC 394
#define POWER6_PME_PM_IBUF_FULL_CYC 395
#define POWER6_PME_PM_L2SA_LD_REQ 396
#define POWER6_PME_PM_VMX1_LD_WRBACK 397
#define POWER6_PME_PM_MRK_FPU_FIN 398
#define POWER6_PME_PM_THRD_PRIO_5_CYC 399
#define POWER6_PME_PM_DFU_BACK2BACK 400
#define POWER6_PME_PM_MRK_DATA_FROM_LMEM 401
#define POWER6_PME_PM_LSU_REJECT_LHS 402
#define POWER6_PME_PM_DPU_HELD_SPR 403
#define POWER6_PME_PM_FREQ_DOWN 404
#define POWER6_PME_PM_DFU_ENC_BCD_DPD 405
#define POWER6_PME_PM_DPU_HELD_GPR 406
#define POWER6_PME_PM_LSU0_NCST 407
#define POWER6_PME_PM_MRK_INST_ISSUED 408
#define POWER6_PME_PM_INST_FROM_RL2L3_SHR 409
#define POWER6_PME_PM_FPU_DENORM 410
#define POWER6_PME_PM_PTEG_FROM_L3MISS 411
#define POWER6_PME_PM_RUN_PURR 412
#define POWER6_PME_PM_MRK_VMX0_LD_WRBACK 413
#define POWER6_PME_PM_L2_MISS 414
#define POWER6_PME_PM_MRK_DATA_FROM_L3 415
#define POWER6_PME_PM_MRK_LSU1_REJECT_LHS 416
#define POWER6_PME_PM_L2SB_LD_MISS_INST 417
#define POWER6_PME_PM_PTEG_FROM_RL2L3_SHR 418
#define POWER6_PME_PM_MRK_DERAT_MISS_64K 419
#define POWER6_PME_PM_LWSYNC 420
#define POWER6_PME_PM_FPU1_FXMULT 421
#define POWER6_PME_PM_MEM0_DP_CL_WR_GLOB 422
#define POWER6_PME_PM_LSU0_REJECT_PARTIAL_SECTOR 423
#define POWER6_PME_PM_INST_IMC_MATCH_CMPL 424
#define POWER6_PME_PM_DPU_HELD_THERMAL 425
#define POWER6_PME_PM_FPU_FRSP 426
#define POWER6_PME_PM_MRK_INST_FIN 427
#define POWER6_PME_PM_MRK_PTEG_FROM_DL2L3_SHR 428
#define POWER6_PME_PM_MRK_DTLB_REF 429
#define POWER6_PME_PM_MRK_PTEG_FROM_L25_SHR 430
#define POWER6_PME_PM_DPU_HELD_LSU 431
#define POWER6_PME_PM_FPU_FSQRT_FDIV 432
#define POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_COUNT 433
#define POWER6_PME_PM_DATA_PTEG_SECONDARY 434
#define POWER6_PME_PM_FPU1_FEST 435
#define POWER6_PME_PM_L2SA_LD_HIT 436
#define POWER6_PME_PM_DATA_FROM_MEM_DP_CYC 437
#define POWER6_PME_PM_BR_MPRED_CCACHE 438
#define POWER6_PME_PM_DPU_HELD_COUNT 439
#define POWER6_PME_PM_LSU1_REJECT_SET_MPRED 440
#define POWER6_PME_PM_FPU_ISSUE_2 441
#define POWER6_PME_PM_LSU1_REJECT_L2_CORR 442
#define POWER6_PME_PM_MRK_PTEG_FROM_DMEM 443
#define POWER6_PME_PM_MEM1_DP_RQ_LOC_GLOB 444
#define POWER6_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC 445
#define POWER6_PME_PM_THRD_PRIO_0_CYC 446
#define POWER6_PME_PM_FXU0_BUSY_FXU1_IDLE 447
#define POWER6_PME_PM_LSU1_REJECT_DERAT_MPRED 448
#define POWER6_PME_PM_MRK_VMX1_LD_WRBACK 449
#define POWER6_PME_PM_DATA_FROM_RL2L3_SHR_CYC 450
#define POWER6_PME_PM_IERAT_MISS_16M 451
#define POWER6_PME_PM_MRK_DATA_FROM_MEM_DP 452
#define POWER6_PME_PM_LARX_L1HIT 453
#define POWER6_PME_PM_L2_ST_MISS_DATA 454
#define POWER6_PME_PM_FPU_ST_FOLDED 455
#define POWER6_PME_PM_MRK_DATA_FROM_L35_SHR 456
#define POWER6_PME_PM_DPU_HELD_MULT_GPR 457
#define POWER6_PME_PM_FPU0_1FLOP 458
#define POWER6_PME_PM_IERAT_MISS_16G 459
#define POWER6_PME_PM_IC_PREF_WRITE 460
#define POWER6_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC 461
#define POWER6_PME_PM_FPU0_FIN 462
#define POWER6_PME_PM_DATA_FROM_L2_CYC 463
#define POWER6_PME_PM_DERAT_REF_16G 464
#define POWER6_PME_PM_BR_PRED 465
#define POWER6_PME_PM_VMX1_LD_ISSUED 466
#define POWER6_PME_PM_L2SB_CASTOUT_MOD 467
#define POWER6_PME_PM_INST_FROM_DMEM 468
#define POWER6_PME_PM_DATA_FROM_L35_SHR_CYC 469
#define POWER6_PME_PM_LSU0_NCLD 470
#define POWER6_PME_PM_FAB_RETRY_NODE_PUMP 471
#define POWER6_PME_PM_VMX0_INST_ISSUED 472
#define POWER6_PME_PM_DATA_FROM_L25_MOD 473
#define POWER6_PME_PM_DPU_HELD_ITLB_ISLB 474
#define POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 475
#define POWER6_PME_PM_THRD_CONC_RUN_INST 476
#define POWER6_PME_PM_MRK_PTEG_FROM_L2 477
#define POWER6_PME_PM_PURR 478
#define POWER6_PME_PM_DERAT_MISS_64K 479
#define POWER6_PME_PM_PMC2_REWIND 480
#define POWER6_PME_PM_INST_FROM_L2 481
#define POWER6_PME_PM_INST_DISP 482
#define POWER6_PME_PM_DATA_FROM_L25_SHR 483
#define POWER6_PME_PM_L1_DCACHE_RELOAD_VALID 484
#define POWER6_PME_PM_LSU1_REJECT_UST 485
#define POWER6_PME_PM_FAB_ADDR_COLLISION 486
#define POWER6_PME_PM_MRK_FXU_FIN 487
#define POWER6_PME_PM_LSU0_REJECT_UST 488
#define POWER6_PME_PM_PMC4_OVERFLOW 489
#define POWER6_PME_PM_MRK_PTEG_FROM_L3 490
#define POWER6_PME_PM_INST_FROM_L2MISS 491
#define POWER6_PME_PM_L2SB_ST_HIT 492
#define POWER6_PME_PM_DPU_WT_IC_MISS_COUNT 493
#define POWER6_PME_PM_MRK_DATA_FROM_DL2L3_SHR 494
#define POWER6_PME_PM_MRK_PTEG_FROM_L35_MOD 495
#define POWER6_PME_PM_FPU1_FPSCR 496
#define POWER6_PME_PM_LSU_REJECT_UST 497
#define POWER6_PME_PM_LSU0_DERAT_MISS 498
#define POWER6_PME_PM_MRK_PTEG_FROM_MEM_DP 499
#define POWER6_PME_PM_MRK_DATA_FROM_L2 500
#define POWER6_PME_PM_FPU0_FSQRT_FDIV 501
#define POWER6_PME_PM_DPU_HELD_FXU_SOPS 502
#define POWER6_PME_PM_MRK_FPU0_FIN 503
#define POWER6_PME_PM_L2SB_LD_MISS_DATA 504
#define POWER6_PME_PM_LSU_SRQ_EMPTY_CYC 505
#define POWER6_PME_PM_1PLUS_PPC_DISP 506
#define POWER6_PME_PM_VMX_ST_ISSUED 507
#define POWER6_PME_PM_DATA_FROM_L2MISS 508
#define POWER6_PME_PM_LSU0_REJECT_ULD 509
#define POWER6_PME_PM_SUSPENDED 510
#define POWER6_PME_PM_DFU_ADD_SHIFTED_BOTH 511
#define POWER6_PME_PM_LSU_REJECT_NO_SCRATCH 512
#define POWER6_PME_PM_STCX_FAIL 513
#define POWER6_PME_PM_FPU1_DENORM 514
#define POWER6_PME_PM_GCT_NOSLOT_COUNT 515
#define POWER6_PME_PM_DATA_FROM_DL2L3_SHR_CYC 516
#define POWER6_PME_PM_DATA_FROM_L21 517
#define POWER6_PME_PM_FPU_1FLOP 518
#define POWER6_PME_PM_LSU1_REJECT 519
#define POWER6_PME_PM_IC_REQ 520
#define POWER6_PME_PM_MRK_DFU_FIN 521
#define POWER6_PME_PM_NOT_LLA_CYC 522
#define POWER6_PME_PM_INST_FROM_L1 523
#define POWER6_PME_PM_MRK_VMX_COMPLEX_ISSUED 524
#define POWER6_PME_PM_BRU_FIN 525
#define POWER6_PME_PM_LSU1_REJECT_EXTERN 526
#define POWER6_PME_PM_DATA_FROM_L21_CYC 527
#define POWER6_PME_PM_GXI_CYC_BUSY 528
#define POWER6_PME_PM_MRK_LD_MISS_L1 529
#define POWER6_PME_PM_L1_WRITE_CYC 530
#define POWER6_PME_PM_LLA_CYC 531
#define POWER6_PME_PM_MRK_DATA_FROM_L2MISS 532
#define POWER6_PME_PM_GCT_FULL_COUNT 533
#define POWER6_PME_PM_MEM_DP_RQ_LOC_GLOB 534
#define POWER6_PME_PM_DATA_FROM_RL2L3_SHR 535
#define POWER6_PME_PM_MRK_LSU_REJECT_UST 536
#define POWER6_PME_PM_MRK_VMX_PERMUTE_ISSUED 537
#define POWER6_PME_PM_MRK_PTEG_FROM_L21 538
#define POWER6_PME_PM_THRD_GRP_CMPL_BOTH_CYC 539
#define POWER6_PME_PM_BR_MPRED 540
#define POWER6_PME_PM_LD_REQ_L2 541
#define POWER6_PME_PM_FLUSH_ASYNC 542
#define POWER6_PME_PM_HV_CYC 543
#define POWER6_PME_PM_LSU1_DERAT_MISS 544
#define POWER6_PME_PM_DPU_HELD_SMT 545
#define POWER6_PME_PM_MRK_LSU_FIN 546
#define POWER6_PME_PM_MRK_DATA_FROM_RL2L3_SHR 547
#define POWER6_PME_PM_LSU0_REJECT_STQ_FULL 548
#define POWER6_PME_PM_MRK_DERAT_REF_4K 549
#define POWER6_PME_PM_FPU_ISSUE_STALL_FPR 550
#define POWER6_PME_PM_IFU_FIN 551
#define POWER6_PME_PM_GXO_CYC_BUSY 552
static const int power6_event_ids[][POWER6_NUM_EVENT_COUNTERS] = {
	[ POWER6_PME_PM_LSU_REJECT_STQ_FULL ] = { 243, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_FXU_MULTI ] = { 37, 45, 36, 44, -1, -1 },
	[ POWER6_PME_PM_VMX1_STALL ] = { 328, 335, 322, 320, -1, -1 },
	[ POWER6_PME_PM_PMC2_SAVED ] = { 291, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_L2SB_IC_INV ] = { 174, 183, 174, 180, -1, -1 },
	[ POWER6_PME_PM_IERAT_MISS_64K ] = { -1, -1, 344, -1, -1, -1 },
	[ POWER6_PME_PM_THRD_PRIO_DIFF_3or4_CYC ] = { -1, -1, 310, -1, -1, -1 },
	[ POWER6_PME_PM_LD_REF_L1_BOTH ] = { 202, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_FPU1_FCONV ] = { 88, 97, 86, 94, -1, -1 },
	[ POWER6_PME_PM_IBUF_FULL_COUNT ] = { 338, 345, 332, 330, -1, -1 },
	[ POWER6_PME_PM_MRK_LSU_DERAT_MISS ] = { -1, -1, -1, 271, -1, -1 },
	[ POWER6_PME_PM_MRK_ST_CMPL ] = { 282, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_L2_CASTOUT_MOD ] = { 185, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_FPU1_ST_FOLDED ] = { 100, 109, 98, 106, -1, -1 },
	[ POWER6_PME_PM_MRK_INST_TIMEO ] = { -1, -1, -1, 263, -1, -1 },
	[ POWER6_PME_PM_DPU_WT ] = { -1, -1, 54, -1, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_RESTART ] = { 47, 56, 46, 54, -1, -1 },
	[ POWER6_PME_PM_IERAT_MISS ] = { 137, 146, 136, 143, -1, -1 },
	[ POWER6_PME_PM_FPU_SINGLE ] = { -1, -1, -1, 122, -1, -1 },
	[ POWER6_PME_PM_MRK_PTEG_FROM_LMEM ] = { -1, -1, -1, 278, -1, -1 },
	[ POWER6_PME_PM_HV_COUNT ] = { -1, 351, -1, -1, -1, -1 },
	[ POWER6_PME_PM_L2SA_ST_HIT ] = { 168, 177, 168, 174, -1, -1 },
	[ POWER6_PME_PM_L2_LD_MISS_INST ] = { -1, 196, -1, -1, -1, -1 },
	[ POWER6_PME_PM_EXT_INT ] = { -1, 67, 57, -1, -1, -1 },
	[ POWER6_PME_PM_LSU1_LDF ] = { 221, 230, 216, 221, -1, -1 },
	[ POWER6_PME_PM_FAB_CMD_ISSUED ] = { 59, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_PTEG_FROM_L21 ] = { -1, 305, -1, -1, -1, -1 },
	[ POWER6_PME_PM_L2SA_MISS ] = { 167, 176, 167, 173, -1, -1 },
	[ POWER6_PME_PM_PTEG_FROM_RL2L3_MOD ] = { 299, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DPU_WT_COUNT ] = { -1, -1, 340, -1, -1, -1 },
	[ POWER6_PME_PM_MRK_PTEG_FROM_L25_MOD ] = { -1, -1, 272, -1, -1, -1 },
	[ POWER6_PME_PM_LD_HIT_L2 ] = { -1, 209, -1, -1, -1, -1 },
	[ POWER6_PME_PM_PTEG_FROM_DL2L3_SHR ] = { -1, -1, 290, -1, -1, -1 },
	[ POWER6_PME_PM_MEM_DP_RQ_GLOB_LOC ] = { 257, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_L3SA_MISS ] = { 192, 202, 190, 195, -1, -1 },
	[ POWER6_PME_PM_NO_ITAG_COUNT ] = { 340, 347, 334, 332, -1, -1 },
	[ POWER6_PME_PM_DSLB_MISS ] = { 56, 65, 55, 63, -1, -1 },
	[ POWER6_PME_PM_LSU_FLUSH_ALIGN ] = { 235, 244, 229, 235, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_FPU_CR ] = { 35, 43, 34, 42, -1, -1 },
	[ POWER6_PME_PM_PTEG_FROM_L2MISS ] = { 296, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_MRK_DATA_FROM_DMEM ] = { -1, 269, -1, -1, -1, -1 },
	[ POWER6_PME_PM_PTEG_FROM_LMEM ] = { -1, -1, -1, 291, -1, -1 },
	[ POWER6_PME_PM_MRK_DERAT_REF_64K ] = { 353, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_L2SA_LD_REQ_INST ] = { 166, 175, 166, 172, -1, -1 },
	[ POWER6_PME_PM_MRK_DERAT_MISS_16M ] = { -1, -1, 346, -1, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_DL2L3_MOD ] = { -1, -1, -1, 13, -1, -1 },
	[ POWER6_PME_PM_FPU0_FXMULT ] = { 82, 91, 80, 88, -1, -1 },
	[ POWER6_PME_PM_L3SB_MISS ] = { 195, 205, 193, 198, -1, -1 },
	[ POWER6_PME_PM_STCX_CANCEL ] = { 305, 311, 297, 296, -1, -1 },
	[ POWER6_PME_PM_L2SA_LD_MISS_DATA ] = { 162, 171, 162, 168, -1, -1 },
	[ POWER6_PME_PM_IC_INV_L2 ] = { -1, 141, 131, -1, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD ] = { -1, 38, -1, -1, -1, -1 },
	[ POWER6_PME_PM_PMC1_OVERFLOW ] = { -1, 303, -1, -1, -1, -1 },
	[ POWER6_PME_PM_THRD_PRIO_6_CYC ] = { -1, 323, -1, -1, -1, -1 },
	[ POWER6_PME_PM_MRK_PTEG_FROM_L3MISS ] = { -1, -1, 274, -1, -1, -1 },
	[ POWER6_PME_PM_MRK_LSU0_REJECT_UST ] = { 272, 284, 266, 267, -1, -1 },
	[ POWER6_PME_PM_MRK_INST_DISP ] = { 267, 279, -1, -1, -1, -1 },
	[ POWER6_PME_PM_LARX ] = { 197, 207, 195, 200, -1, -1 },
	[ POWER6_PME_PM_INST_CMPL ] = { 139, 148, 138, 145, -1, -1 },
	[ POWER6_PME_PM_FXU_IDLE ] = { 117, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_MRK_DATA_FROM_DL2L3_MOD ] = { -1, -1, -1, 256, -1, -1 },
	[ POWER6_PME_PM_L2_LD_REQ_DATA ] = { 186, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_LSU_DERAT_MISS_CYC ] = { 234, -1, -1, 234, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_POWER_COUNT ] = { -1, 356, -1, -1, -1, -1 },
	[ POWER6_PME_PM_INST_FROM_RL2L3_MOD ] = { 146, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_DMEM_CYC ] = { -1, 14, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_DMEM ] = { -1, 13, -1, -1, -1, -1 },
	[ POWER6_PME_PM_LSU_REJECT_PARTIAL_SECTOR ] = { 241, -1, -1, 242, -1, -1 },
	[ POWER6_PME_PM_LSU_REJECT_DERAT_MPRED ] = { -1, 249, -1, -1, -1, -1 },
	[ POWER6_PME_PM_LSU1_REJECT_ULD ] = { 231, 240, 226, 231, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_L3_CYC ] = { -1, 21, -1, -1, -1, -1 },
	[ POWER6_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { -1, -1, -1, 124, -1, -1 },
	[ POWER6_PME_PM_INST_FROM_MEM_DP ] = { 145, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_LSU_FLUSH_DSI ] = { 236, 245, 230, 236, -1, -1 },
	[ POWER6_PME_PM_MRK_DERAT_REF_16G ] = { -1, -1, -1, 345, -1, -1 },
	[ POWER6_PME_PM_LSU_LDF_BOTH ] = { 237, -1, 232, -1, -1, -1 },
	[ POWER6_PME_PM_FPU1_1FLOP ] = { 86, 95, 84, 92, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_RMEM_CYC ] = { -1, -1, -1, 23, -1, -1 },
	[ POWER6_PME_PM_INST_PTEG_SECONDARY ] = { 150, 159, 150, 156, -1, -1 },
	[ POWER6_PME_PM_L1_ICACHE_MISS ] = { 154, 163, -1, -1, -1, -1 },
	[ POWER6_PME_PM_INST_DISP_LLA ] = { 140, 150, 140, 146, -1, -1 },
	[ POWER6_PME_PM_THRD_BOTH_RUN_CYC ] = { -1, -1, -1, 304, -1, -1 },
	[ POWER6_PME_PM_LSU_ST_CHAINED ] = { 246, 257, 240, 245, -1, -1 },
	[ POWER6_PME_PM_FPU1_FXDIV ] = { 96, 105, 94, 102, -1, -1 },
	[ POWER6_PME_PM_FREQ_UP ] = { -1, -1, -1, 123, -1, -1 },
	[ POWER6_PME_PM_FAB_RETRY_SYS_PUMP ] = { 65, 75, 64, 71, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_LMEM ] = { -1, -1, -1, 20, -1, -1 },
	[ POWER6_PME_PM_PMC3_OVERFLOW ] = { -1, -1, -1, 288, -1, -1 },
	[ POWER6_PME_PM_LSU0_REJECT_SET_MPRED ] = { 216, 225, 211, 216, -1, -1 },
	[ POWER6_PME_PM_LSU0_REJECT_DERAT_MPRED ] = { 209, 218, 204, 209, -1, -1 },
	[ POWER6_PME_PM_LSU1_REJECT_STQ_FULL ] = { 230, 239, 225, 230, -1, -1 },
	[ POWER6_PME_PM_MRK_BR_MPRED ] = { -1, -1, 251, -1, -1, -1 },
	[ POWER6_PME_PM_L2SA_ST_MISS ] = { 169, 178, 169, 175, -1, -1 },
	[ POWER6_PME_PM_LSU0_REJECT_EXTERN ] = { 210, 219, 205, 210, -1, -1 },
	[ POWER6_PME_PM_MRK_BR_TAKEN ] = { 258, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_ISLB_MISS ] = { 152, 161, 152, 158, -1, -1 },
	[ POWER6_PME_PM_CYC ] = { 12, 11, 10, 12, -1, -1 },
	[ POWER6_PME_PM_FPU_FXDIV ] = { 105, -1, -1, 110, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_LLA_END ] = { 43, 51, 42, 50, -1, -1 },
	[ POWER6_PME_PM_MEM0_DP_CL_WR_LOC ] = { 249, 260, 243, 248, -1, -1 },
	[ POWER6_PME_PM_MRK_LSU_REJECT_ULD ] = { 276, -1, -1, 274, -1, -1 },
	[ POWER6_PME_PM_1PLUS_PPC_CMPL ] = { 1, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_PTEG_FROM_DMEM ] = { -1, 304, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DPU_WT_BR_MPRED_COUNT ] = { -1, -1, -1, 340, -1, -1 },
	[ POWER6_PME_PM_GCT_FULL_CYC ] = { 120, 128, 119, 127, -1, -1 },
	[ POWER6_PME_PM_INST_FROM_L25_SHR ] = { -1, -1, -1, 150, -1, -1 },
	[ POWER6_PME_PM_MRK_DERAT_MISS_4K ] = { -1, 364, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DC_PREF_STREAM_ALLOC ] = { 22, 29, 21, 29, -1, -1 },
	[ POWER6_PME_PM_FPU1_FIN ] = { 90, 99, 88, 96, -1, -1 },
	[ POWER6_PME_PM_BR_MPRED_TA ] = { 7, 5, 5, 7, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_POWER ] = { -1, 55, -1, -1, -1, -1 },
	[ POWER6_PME_PM_RUN_INST_CMPL ] = { -1, -1, -1, -1, 0, -1 },
	[ POWER6_PME_PM_GCT_EMPTY_CYC ] = { 119, 127, -1, -1, -1, -1 },
	[ POWER6_PME_PM_LLA_COUNT ] = { 347, 354, 339, 339, -1, -1 },
	[ POWER6_PME_PM_LSU0_REJECT_NO_SCRATCH ] = { 214, 223, 209, 214, -1, -1 },
	[ POWER6_PME_PM_DPU_WT_IC_MISS ] = { -1, 64, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_L3MISS ] = { -1, -1, 15, 19, -1, -1 },
	[ POWER6_PME_PM_FPU_FPSCR ] = { -1, 112, 100, -1, -1, -1 },
	[ POWER6_PME_PM_VMX1_INST_ISSUED ] = { 325, 332, 319, 317, -1, -1 },
	[ POWER6_PME_PM_FLUSH ] = { 67, -1, -1, 73, -1, -1 },
	[ POWER6_PME_PM_ST_HIT_L2 ] = { 308, -1, -1, 298, -1, -1 },
	[ POWER6_PME_PM_SYNC_CYC ] = { 312, 319, 303, 303, -1, -1 },
	[ POWER6_PME_PM_FAB_SYS_PUMP ] = { 66, 76, 65, 72, -1, -1 },
	[ POWER6_PME_PM_IC_PREF_REQ ] = { 133, 142, 132, 139, -1, -1 },
	[ POWER6_PME_PM_MEM0_DP_RQ_GLOB_LOC ] = { 250, 261, 244, 249, -1, -1 },
	[ POWER6_PME_PM_FPU_ISSUE_0 ] = { 107, 115, 103, 112, -1, -1 },
	[ POWER6_PME_PM_THRD_PRIO_2_CYC ] = { -1, -1, 308, -1, -1, -1 },
	[ POWER6_PME_PM_VMX_SIMPLE_ISSUED ] = { 335, 342, 329, 327, -1, -1 },
	[ POWER6_PME_PM_MRK_FPU1_FIN ] = { 266, 275, 260, 261, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_CW ] = { 33, 41, 32, 40, -1, -1 },
	[ POWER6_PME_PM_L3SA_REF ] = { 193, 203, 191, 196, -1, -1 },
	[ POWER6_PME_PM_STCX ] = { 304, 310, 296, 295, -1, -1 },
	[ POWER6_PME_PM_L2SB_MISS ] = { 181, 190, 181, 187, -1, -1 },
	[ POWER6_PME_PM_LSU0_REJECT ] = { 208, 217, 203, 208, -1, -1 },
	[ POWER6_PME_PM_TB_BIT_TRANS ] = { 313, -1, 304, -1, -1, -1 },
	[ POWER6_PME_PM_THERMAL_MAX ] = { -1, -1, 305, -1, -1, -1 },
	[ POWER6_PME_PM_FPU0_STF ] = { 84, 93, 82, 90, -1, -1 },
	[ POWER6_PME_PM_FPU1_FMA ] = { 92, 101, 90, 98, -1, -1 },
	[ POWER6_PME_PM_LSU1_REJECT_LHS ] = { 226, 235, 221, 226, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_INT ] = { 40, 48, 39, 47, -1, -1 },
	[ POWER6_PME_PM_THRD_LLA_BOTH_CYC ] = { -1, -1, -1, 306, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_THERMAL_COUNT ] = { 348, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_PMC4_REWIND ] = { 293, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DERAT_REF_16M ] = { -1, -1, 342, -1, -1, -1 },
	[ POWER6_PME_PM_FPU0_FCONV ] = { 73, 82, 71, 79, -1, -1 },
	[ POWER6_PME_PM_L2SA_LD_REQ_DATA ] = { 165, 174, 165, 171, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_MEM_DP ] = { 15, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_MRK_VMX_FLOAT_ISSUED ] = { 286, 298, 281, 283, -1, -1 },
	[ POWER6_PME_PM_MRK_PTEG_FROM_L2MISS ] = { -1, -1, -1, 277, -1, -1 },
	[ POWER6_PME_PM_THRD_PRIO_DIFF_1or2_CYC ] = { -1, 324, -1, -1, -1, -1 },
	[ POWER6_PME_PM_VMX0_STALL ] = { 324, 331, 318, 316, -1, -1 },
	[ POWER6_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { 131, 139, 129, 137, -1, -1 },
	[ POWER6_PME_PM_LSU_DERAT_MISS ] = { -1, 243, -1, 335, -1, -1 },
	[ POWER6_PME_PM_FPU0_SINGLE ] = { 83, 92, 81, 89, -1, -1 },
	[ POWER6_PME_PM_FPU_ISSUE_STEERING ] = { 115, 123, 111, 120, -1, -1 },
	[ POWER6_PME_PM_THRD_PRIO_1_CYC ] = { -1, 322, -1, -1, -1, -1 },
	[ POWER6_PME_PM_VMX_COMPLEX_ISSUED ] = { 329, 336, 323, 321, -1, -1 },
	[ POWER6_PME_PM_FPU_ISSUE_ST_FOLDED ] = { 116, 124, 112, 121, -1, -1 },
	[ POWER6_PME_PM_DFU_FIN ] = { 29, 36, 28, 36, -1, -1 },
	[ POWER6_PME_PM_BR_PRED_CCACHE ] = { 9, 7, 7, 9, -1, -1 },
	[ POWER6_PME_PM_MRK_ST_CMPL_INT ] = { -1, -1, 277, -1, -1, -1 },
	[ POWER6_PME_PM_FAB_MMIO ] = { 62, 72, 61, 68, -1, -1 },
	[ POWER6_PME_PM_MRK_VMX_SIMPLE_ISSUED ] = { 288, 300, 283, 285, -1, -1 },
	[ POWER6_PME_PM_FPU_STF ] = { -1, -1, 113, -1, -1, -1 },
	[ POWER6_PME_PM_MEM1_DP_CL_WR_GLOB ] = { 252, 263, 246, 251, -1, -1 },
	[ POWER6_PME_PM_MRK_DATA_FROM_L3MISS ] = { -1, -1, 255, -1, -1, -1 },
	[ POWER6_PME_PM_GCT_NOSLOT_CYC ] = { 121, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_L2_ST_REQ_DATA ] = { -1, 200, 188, -1, -1, -1 },
	[ POWER6_PME_PM_INST_TABLEWALK_COUNT ] = { 341, 348, 335, 333, -1, -1 },
	[ POWER6_PME_PM_PTEG_FROM_L35_SHR ] = { -1, 306, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_ISYNC ] = { 41, 49, 40, 48, -1, -1 },
	[ POWER6_PME_PM_MRK_DATA_FROM_L25_SHR ] = { -1, -1, -1, 257, -1, -1 },
	[ POWER6_PME_PM_L3SA_HIT ] = { 191, 201, 189, 194, -1, -1 },
	[ POWER6_PME_PM_DERAT_MISS_16G ] = { -1, -1, -1, 343, -1, -1 },
	[ POWER6_PME_PM_DATA_PTEG_2ND_HALF ] = { 18, 26, 18, 25, -1, -1 },
	[ POWER6_PME_PM_L2SA_ST_REQ ] = { 170, 179, 170, 176, -1, -1 },
	[ POWER6_PME_PM_INST_FROM_LMEM ] = { -1, -1, -1, 152, -1, -1 },
	[ POWER6_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { 132, 140, 130, 138, -1, -1 },
	[ POWER6_PME_PM_PTEG_FROM_L2 ] = { 295, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DATA_PTEG_1ST_HALF ] = { 17, 25, 17, 24, -1, -1 },
	[ POWER6_PME_PM_BR_MPRED_COUNT ] = { 5, 3, 3, 5, -1, -1 },
	[ POWER6_PME_PM_IERAT_MISS_4K ] = { -1, -1, -1, 344, -1, -1 },
	[ POWER6_PME_PM_THRD_BOTH_RUN_COUNT ] = { -1, -1, -1, 336, -1, -1 },
	[ POWER6_PME_PM_LSU_REJECT_ULD ] = { 244, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_DL2L3_MOD_CYC ] = { -1, -1, -1, 14, -1, -1 },
	[ POWER6_PME_PM_MRK_PTEG_FROM_RL2L3_MOD ] = { 280, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_FPU0_FLOP ] = { 76, 85, 74, 82, -1, -1 },
	[ POWER6_PME_PM_FPU0_FEST ] = { 74, 83, 72, 80, -1, -1 },
	[ POWER6_PME_PM_MRK_LSU0_REJECT_LHS ] = { 270, 282, 264, 265, -1, -1 },
	[ POWER6_PME_PM_VMX_RESULT_SAT_1 ] = { 334, 341, 328, 326, -1, -1 },
	[ POWER6_PME_PM_NO_ITAG_CYC ] = { 290, 302, 285, 287, -1, -1 },
	[ POWER6_PME_PM_LSU1_REJECT_NO_SCRATCH ] = { 227, 236, 222, 227, -1, -1 },
	[ POWER6_PME_PM_0INST_FETCH ] = { 0, 0, 0, 0, -1, -1 },
	[ POWER6_PME_PM_DPU_WT_BR_MPRED ] = { -1, -1, -1, 62, -1, -1 },
	[ POWER6_PME_PM_L1_PREF ] = { 155, 164, 155, 161, -1, -1 },
	[ POWER6_PME_PM_VMX_FLOAT_MULTICYCLE ] = { 331, 338, 325, 323, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_L25_SHR_CYC ] = { -1, 16, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_L3 ] = { -1, -1, 14, -1, -1, -1 },
	[ POWER6_PME_PM_PMC2_OVERFLOW ] = { -1, -1, 286, -1, -1, -1 },
	[ POWER6_PME_PM_VMX0_LD_WRBACK ] = { 323, 330, 317, 315, -1, -1 },
	[ POWER6_PME_PM_FPU0_DENORM ] = { 72, 81, 70, 78, -1, -1 },
	[ POWER6_PME_PM_INST_FETCH_CYC ] = { 141, 151, 141, 147, -1, -1 },
	[ POWER6_PME_PM_LSU_LDF ] = { -1, 246, 231, -1, -1, -1 },
	[ POWER6_PME_PM_LSU_REJECT_L2_CORR ] = { 239, -1, -1, 239, -1, -1 },
	[ POWER6_PME_PM_DERAT_REF_64K ] = { -1, 360, -1, -1, -1, -1 },
	[ POWER6_PME_PM_THRD_PRIO_3_CYC ] = { -1, -1, -1, 307, -1, -1 },
	[ POWER6_PME_PM_FPU_FMA ] = { -1, 111, -1, -1, -1, -1 },
	[ POWER6_PME_PM_INST_FROM_L35_MOD ] = { 144, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DFU_CONV ] = { 26, 33, 25, 33, -1, -1 },
	[ POWER6_PME_PM_INST_FROM_L25_MOD ] = { -1, -1, 144, -1, -1, -1 },
	[ POWER6_PME_PM_PTEG_FROM_L35_MOD ] = { 297, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_MRK_VMX_ST_ISSUED ] = { 289, 301, 284, 286, -1, -1 },
	[ POWER6_PME_PM_VMX_FLOAT_ISSUED ] = { 330, 337, 324, 322, -1, -1 },
	[ POWER6_PME_PM_LSU0_REJECT_L2_CORR ] = { 212, 221, 207, 212, -1, -1 },
	[ POWER6_PME_PM_THRD_L2MISS ] = { 314, 321, 307, 305, -1, -1 },
	[ POWER6_PME_PM_FPU_FCONV ] = { 102, -1, -1, 107, -1, -1 },
	[ POWER6_PME_PM_FPU_FXMULT ] = { 106, -1, -1, 111, -1, -1 },
	[ POWER6_PME_PM_FPU1_FRSP ] = { 94, 103, 92, 100, -1, -1 },
	[ POWER6_PME_PM_MRK_DERAT_REF_16M ] = { -1, -1, 345, -1, -1, -1 },
	[ POWER6_PME_PM_L2SB_CASTOUT_SHR ] = { 172, 181, 172, 178, -1, -1 },
	[ POWER6_PME_PM_THRD_ONE_RUN_COUNT ] = { 344, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_INST_FROM_RMEM ] = { -1, -1, 147, -1, -1, -1 },
	[ POWER6_PME_PM_LSU_BOTH_BUS ] = { 233, 242, 228, 233, -1, -1 },
	[ POWER6_PME_PM_FPU1_FSQRT_FDIV ] = { 95, 104, 93, 101, -1, -1 },
	[ POWER6_PME_PM_L2_LD_REQ_INST ] = { 187, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_MRK_PTEG_FROM_L35_SHR ] = { -1, 291, -1, -1, -1, -1 },
	[ POWER6_PME_PM_BR_PRED_CR ] = { 10, 8, 8, 10, -1, -1 },
	[ POWER6_PME_PM_MRK_LSU0_REJECT_ULD ] = { 271, 283, 265, 266, -1, -1 },
	[ POWER6_PME_PM_LSU_REJECT ] = { -1, -1, -1, 238, -1, -1 },
	[ POWER6_PME_PM_LSU_REJECT_LHS_BOTH ] = { -1, 250, -1, 241, -1, -1 },
	[ POWER6_PME_PM_GXO_ADDR_CYC_BUSY ] = { 125, 132, 123, 131, -1, -1 },
	[ POWER6_PME_PM_LSU_SRQ_EMPTY_COUNT ] = { -1, -1, -1, 341, -1, -1 },
	[ POWER6_PME_PM_PTEG_FROM_L3 ] = { -1, -1, 292, -1, -1, -1 },
	[ POWER6_PME_PM_VMX0_LD_ISSUED ] = { 322, 329, 316, 314, -1, -1 },
	[ POWER6_PME_PM_FXU_PIPELINED_MULT_DIV ] = { 118, 126, 118, 126, -1, -1 },
	[ POWER6_PME_PM_FPU1_STF ] = { 99, 108, 97, 105, -1, -1 },
	[ POWER6_PME_PM_DFU_ADD ] = { 23, 30, 22, 30, -1, -1 },
	[ POWER6_PME_PM_MEM_DP_CL_WR_GLOB ] = { -1, 267, 250, -1, -1, -1 },
	[ POWER6_PME_PM_MRK_LSU1_REJECT_ULD ] = { 274, 286, 268, 269, -1, -1 },
	[ POWER6_PME_PM_ITLB_REF ] = { 153, 162, 153, 159, -1, -1 },
	[ POWER6_PME_PM_LSU0_REJECT_L2MISS ] = { 211, 220, 206, 211, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_L35_SHR ] = { -1, 19, -1, -1, -1, -1 },
	[ POWER6_PME_PM_MRK_DATA_FROM_RL2L3_MOD ] = { 263, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_FPU0_FPSCR ] = { 78, 87, 76, 84, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_L2 ] = { 13, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_XER ] = { 54, 62, 52, 60, -1, -1 },
	[ POWER6_PME_PM_FAB_NODE_PUMP ] = { 63, 73, 62, 69, -1, -1 },
	[ POWER6_PME_PM_VMX_RESULT_SAT_0_1 ] = { 333, 340, 327, 325, -1, -1 },
	[ POWER6_PME_PM_LD_REF_L1 ] = { 201, 212, 198, 203, -1, -1 },
	[ POWER6_PME_PM_TLB_REF ] = { 320, 327, 314, 312, -1, -1 },
	[ POWER6_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { 21, 28, 20, 28, -1, -1 },
	[ POWER6_PME_PM_FLUSH_FPU ] = { 69, 78, 67, 75, -1, -1 },
	[ POWER6_PME_PM_MEM1_DP_CL_WR_LOC ] = { 253, 264, 247, 252, -1, -1 },
	[ POWER6_PME_PM_L2SB_LD_HIT ] = { 175, 184, 175, 181, -1, -1 },
	[ POWER6_PME_PM_FAB_DCLAIM ] = { 60, 70, 59, 66, -1, -1 },
	[ POWER6_PME_PM_MEM_DP_CL_WR_LOC ] = { 256, -1, -1, 255, -1, -1 },
	[ POWER6_PME_PM_BR_MPRED_CR ] = { 6, 4, 4, 6, -1, -1 },
	[ POWER6_PME_PM_LSU_REJECT_EXTERN ] = { -1, -1, 235, -1, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_RL2L3_MOD ] = { 16, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_RU_WQ ] = { 48, 57, 47, 55, -1, -1 },
	[ POWER6_PME_PM_LD_MISS_L1 ] = { 199, 210, 197, 202, -1, -1 },
	[ POWER6_PME_PM_DC_INV_L2 ] = { 20, -1, -1, 27, -1, -1 },
	[ POWER6_PME_PM_MRK_PTEG_FROM_RMEM ] = { -1, -1, 275, -1, -1, -1 },
	[ POWER6_PME_PM_FPU_FIN ] = { 103, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_FXU0_FIN ] = { -1, -1, 117, -1, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_FPQ ] = { 34, 42, 33, 41, -1, -1 },
	[ POWER6_PME_PM_GX_DMA_READ ] = { 128, 135, 126, 134, -1, -1 },
	[ POWER6_PME_PM_LSU1_REJECT_PARTIAL_SECTOR ] = { 228, 237, 223, 228, -1, -1 },
	[ POWER6_PME_PM_0INST_FETCH_COUNT ] = { 337, 344, 331, 329, -1, -1 },
	[ POWER6_PME_PM_PMC5_OVERFLOW ] = { 294, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_L2SB_LD_REQ ] = { 178, 187, 178, 184, -1, -1 },
	[ POWER6_PME_PM_THRD_PRIO_DIFF_0_CYC ] = { 318, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_RMEM ] = { -1, -1, 16, -1, -1, -1 },
	[ POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_BOTH_CYC ] = { -1, -1, 234, -1, -1, -1 },
	[ POWER6_PME_PM_ST_REF_L1_BOTH ] = { -1, 316, -1, 301, -1, -1 },
	[ POWER6_PME_PM_VMX_PERMUTE_ISSUED ] = { 332, 339, 326, 324, -1, -1 },
	[ POWER6_PME_PM_BR_TAKEN ] = { -1, 10, -1, -1, -1, -1 },
	[ POWER6_PME_PM_FAB_DMA ] = { 61, 71, 60, 67, -1, -1 },
	[ POWER6_PME_PM_GCT_EMPTY_COUNT ] = { -1, 358, -1, -1, -1, -1 },
	[ POWER6_PME_PM_FPU1_SINGLE ] = { 98, 107, 96, 104, -1, -1 },
	[ POWER6_PME_PM_L2SA_CASTOUT_SHR ] = { 158, 167, 158, 164, -1, -1 },
	[ POWER6_PME_PM_L3SB_REF ] = { 196, 206, 194, 199, -1, -1 },
	[ POWER6_PME_PM_FPU0_FRSP ] = { 79, 88, 77, 85, -1, -1 },
	[ POWER6_PME_PM_PMC4_SAVED ] = { -1, -1, 288, -1, -1, -1 },
	[ POWER6_PME_PM_L2SA_DC_INV ] = { 159, 168, 159, 165, -1, -1 },
	[ POWER6_PME_PM_GXI_ADDR_CYC_BUSY ] = { 122, 129, 120, 128, -1, -1 },
	[ POWER6_PME_PM_FPU0_FMA ] = { 77, 86, 75, 83, -1, -1 },
	[ POWER6_PME_PM_SLB_MISS ] = { 303, -1, -1, 294, -1, -1 },
	[ POWER6_PME_PM_MRK_ST_GPS ] = { -1, 294, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DERAT_REF_4K ] = { 350, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_L2_CASTOUT_SHR ] = { -1, 194, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_STCX_CR ] = { 51, 60, 50, 58, -1, -1 },
	[ POWER6_PME_PM_FPU0_ST_FOLDED ] = { 85, 94, 83, 91, -1, -1 },
	[ POWER6_PME_PM_MRK_DATA_FROM_L21 ] = { -1, 270, -1, -1, -1, -1 },
	[ POWER6_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC ] = { -1, -1, 311, -1, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_L35_MOD ] = { 14, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_DL2L3_SHR ] = { -1, -1, 11, -1, -1, -1 },
	[ POWER6_PME_PM_GXI_DATA_CYC_BUSY ] = { 124, 131, 122, 130, -1, -1 },
	[ POWER6_PME_PM_LSU_REJECT_STEAL ] = { 242, 254, 239, 243, -1, -1 },
	[ POWER6_PME_PM_ST_FIN ] = { 307, 313, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_CR_LOGICAL ] = { 32, 40, 31, 39, -1, -1 },
	[ POWER6_PME_PM_THRD_SEL_T0 ] = { 319, 326, 312, 311, -1, -1 },
	[ POWER6_PME_PM_PTEG_RELOAD_VALID ] = { 300, 308, 295, 292, -1, -1 },
	[ POWER6_PME_PM_L2_PREF_ST ] = { 189, 199, 187, 192, -1, -1 },
	[ POWER6_PME_PM_MRK_STCX_FAIL ] = { 281, 293, 276, 279, -1, -1 },
	[ POWER6_PME_PM_LSU0_REJECT_LHS ] = { 213, 222, 208, 213, -1, -1 },
	[ POWER6_PME_PM_DFU_EXP_EQ ] = { 28, 35, 27, 35, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_FP_FX_MULT ] = { 36, 44, 35, 43, -1, -1 },
	[ POWER6_PME_PM_L2_LD_MISS_DATA ] = { -1, 195, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_L35_MOD_CYC ] = { -1, -1, -1, 18, -1, -1 },
	[ POWER6_PME_PM_FLUSH_FXU ] = { 70, 79, 68, 76, -1, -1 },
	[ POWER6_PME_PM_FPU_ISSUE_1 ] = { 108, 116, 104, 113, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_LMEM_CYC ] = { -1, 22, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_LSU_SOPS ] = { 45, 53, 44, 52, -1, -1 },
	[ POWER6_PME_PM_INST_PTEG_2ND_HALF ] = { 149, 158, 149, 155, -1, -1 },
	[ POWER6_PME_PM_THRESH_TIMEO ] = { -1, -1, 313, -1, -1, -1 },
	[ POWER6_PME_PM_LSU_REJECT_UST_BOTH ] = { 245, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_LSU_REJECT_FAST ] = { -1, -1, 236, -1, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_THRD_PRIO ] = { 53, 61, 51, 59, -1, -1 },
	[ POWER6_PME_PM_L2_PREF_LD ] = { 188, 198, 186, 191, -1, -1 },
	[ POWER6_PME_PM_FPU_FEST ] = { -1, -1, -1, 108, -1, -1 },
	[ POWER6_PME_PM_MRK_DATA_FROM_RMEM ] = { -1, -1, 256, -1, -1, -1 },
	[ POWER6_PME_PM_LD_MISS_L1_CYC ] = { 200, 211, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DERAT_MISS_4K ] = { 351, -1, -1, -1, -1, -1 },
	[ POWER6_PME_PM_DPU_HELD_COMPLETION ] = { 31, 39, 30, 38, -1, -1 },
	[ POWER6_PME_PM_FPU_ISSUE_STALL_ST ] = { 113, 121, 109, 118, -1, -1 },
	[ POWER6_PME_PM_L2SB_DC_INV ] = { 173, 182, 173, 179, -1, -1 },
	[ POWER6_PME_PM_PTEG_FROM_L25_SHR ] = { -1, -1, -1, 290, -1, -1 },
	[ POWER6_PME_PM_PTEG_FROM_DL2L3_MOD ] = { -1, -1, -1, 289, -1, -1 },
	[ POWER6_PME_PM_FAB_CMD_RETRIED ] = { -1, 69, -1, -1, -1, -1 },
	[ POWER6_PME_PM_BR_PRED_LSTACK ] = { 11, 9, 9, 11, -1, -1 },
	[ POWER6_PME_PM_GXO_DATA_CYC_BUSY ] = { 127, 134, 125, 133, -1, -1 },
	[ POWER6_PME_PM_DFU_SUBNORM ] = { 30, 37, 29, 37, -1, -1 },
	[ POWER6_PME_PM_FPU_ISSUE_OOO ] = { 111, 119, 107, 116, -1, -1 },
	[ POWER6_PME_PM_LSU_REJECT_ULD_BOTH ] = { -1, 255, -1, -1, -1, -1 },
	[ POWER6_PME_PM_L2SB_ST_MISS ] = { 183, 192, 183, 189, -1, -1 },
	[ POWER6_PME_PM_DATA_FROM_L25_MOD_CYC ] = { -1, -1, -1, 17, -1, -1 },
	[ POWER6_PME_PM_INST_PTEG_1ST_HALF ] = { 148, 157, 148, 154, -1, -1 },
	[ POWER6_PME_PM_DERAT_MISS_16M ] = { -1, -1, 343, -1, -1, -1 },
	[ POWER6_PME_PM_GX_DMA_WRITE ] = { 129, 136, 127, 135, -1, -1 },
	[ POWER6_PME_PM_MRK_PTEG_FROM_DL2L3_MOD ] = { -1, -1, -1, 275, -1, -1 },
	[ POWER6_PME_PM_MEM1_DP_RQ_GLOB_LOC ] = { 254, 265, 248, 253, -1, -1 },
	[
POWER6_PME_PM_L2SB_LD_REQ_DATA ] = { 179, 188, 179, 185, -1, -1 }, [ POWER6_PME_PM_L2SA_LD_MISS_INST ] = { 163, 172, 163, 169, -1, -1 }, [ POWER6_PME_PM_MRK_LSU0_REJECT_L2MISS ] = { 269, 281, 263, 264, -1, -1 }, [ POWER6_PME_PM_MRK_IFU_FIN ] = { -1, 278, -1, -1, -1, -1 }, [ POWER6_PME_PM_INST_FROM_L3 ] = { -1, -1, 145, -1, -1, -1 }, [ POWER6_PME_PM_FXU1_FIN ] = { -1, -1, -1, 125, -1, -1 }, [ POWER6_PME_PM_THRD_PRIO_4_CYC ] = { -1, -1, -1, 308, -1, -1 }, [ POWER6_PME_PM_MRK_DATA_FROM_L35_MOD ] = { 261, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_LSU_REJECT_SET_MPRED ] = { -1, 252, 238, -1, -1, -1 }, [ POWER6_PME_PM_MRK_DERAT_MISS_16G ] = { -1, -1, -1, 346, -1, -1 }, [ POWER6_PME_PM_FPU0_FXDIV ] = { 81, 90, 79, 87, -1, -1 }, [ POWER6_PME_PM_MRK_LSU1_REJECT_UST ] = { 275, 287, 269, 270, -1, -1 }, [ POWER6_PME_PM_FPU_ISSUE_DIV_SQRT_OVERLAP ] = { 110, 118, 106, 115, -1, -1 }, [ POWER6_PME_PM_INST_FROM_L35_SHR ] = { -1, 155, -1, -1, -1, -1 }, [ POWER6_PME_PM_MRK_LSU_REJECT_LHS ] = { -1, -1, -1, 273, -1, -1 }, [ POWER6_PME_PM_LSU_LMQ_FULL_CYC ] = { 238, 247, 233, 237, -1, -1 }, [ POWER6_PME_PM_SYNC_COUNT ] = { 342, 349, 336, 334, -1, -1 }, [ POWER6_PME_PM_MEM0_DP_RQ_LOC_GLOB ] = { 251, 262, 245, 250, -1, -1 }, [ POWER6_PME_PM_L2SA_CASTOUT_MOD ] = { 157, 166, 157, 163, -1, -1 }, [ POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_BOTH_COUNT ] = { -1, -1, 341, -1, -1, -1 }, [ POWER6_PME_PM_PTEG_FROM_MEM_DP ] = { 298, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_LSU_REJECT_SLOW ] = { -1, 253, -1, -1, -1, -1 }, [ POWER6_PME_PM_PTEG_FROM_L25_MOD ] = { -1, -1, 291, -1, -1, -1 }, [ POWER6_PME_PM_THRD_PRIO_7_CYC ] = { 317, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_MRK_PTEG_FROM_RL2L3_SHR ] = { -1, 292, -1, -1, -1, -1 }, [ POWER6_PME_PM_ST_REQ_L2 ] = { -1, 317, 301, -1, -1, -1 }, [ POWER6_PME_PM_ST_REF_L1 ] = { 310, 315, 300, 300, -1, -1 }, [ POWER6_PME_PM_FPU_ISSUE_STALL_THRD ] = { 114, 122, 110, 119, -1, -1 }, [ POWER6_PME_PM_RUN_COUNT ] = { 343, 350, -1, -1, -1, -1 }, [ POWER6_PME_PM_RUN_CYC ] = { 302, 309, -1, 
-1, -1, 0 }, [ POWER6_PME_PM_PTEG_FROM_RMEM ] = { -1, -1, 294, -1, -1, -1 }, [ POWER6_PME_PM_LSU0_LDF ] = { 205, 214, 200, 205, -1, -1 }, [ POWER6_PME_PM_ST_MISS_L1 ] = { 309, 314, 299, 299, -1, -1 }, [ POWER6_PME_PM_INST_FROM_DL2L3_SHR ] = { -1, -1, 142, -1, -1, -1 }, [ POWER6_PME_PM_L2SA_IC_INV ] = { 160, 169, 160, 166, -1, -1 }, [ POWER6_PME_PM_THRD_ONE_RUN_CYC ] = { 315, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_L2SB_LD_REQ_INST ] = { 180, 189, 180, 186, -1, -1 }, [ POWER6_PME_PM_MRK_DATA_FROM_L25_MOD ] = { -1, -1, 253, -1, -1, -1 }, [ POWER6_PME_PM_DPU_HELD_XTHRD ] = { 55, 63, 53, 61, -1, -1 }, [ POWER6_PME_PM_L2SB_ST_REQ ] = { 184, 193, 184, 190, -1, -1 }, [ POWER6_PME_PM_INST_FROM_L21 ] = { -1, 154, -1, -1, -1, -1 }, [ POWER6_PME_PM_INST_FROM_L3MISS ] = { -1, -1, 146, -1, -1, -1 }, [ POWER6_PME_PM_L3SB_HIT ] = { 194, 204, 192, 197, -1, -1 }, [ POWER6_PME_PM_EE_OFF_EXT_INT ] = { 57, 66, 56, 64, -1, -1 }, [ POWER6_PME_PM_INST_FROM_DL2L3_MOD ] = { -1, -1, -1, 148, -1, -1 }, [ POWER6_PME_PM_PMC6_OVERFLOW ] = { -1, -1, 289, -1, -1, -1 }, [ POWER6_PME_PM_FPU_FLOP ] = { 104, -1, -1, 109, -1, -1 }, [ POWER6_PME_PM_FXU_BUSY ] = { -1, 125, -1, -1, -1, -1 }, [ POWER6_PME_PM_FPU1_FLOP ] = { 91, 100, 89, 97, -1, -1 }, [ POWER6_PME_PM_IC_RELOAD_SHR ] = { 135, 144, 134, 141, -1, -1 }, [ POWER6_PME_PM_INST_TABLEWALK_CYC ] = { 151, 160, 151, 157, -1, -1 }, [ POWER6_PME_PM_DATA_FROM_RL2L3_MOD_CYC ] = { -1, -1, -1, 22, -1, -1 }, [ POWER6_PME_PM_THRD_PRIO_DIFF_5or6_CYC ] = { -1, -1, -1, 309, -1, -1 }, [ POWER6_PME_PM_IBUF_FULL_CYC ] = { 130, 138, 128, 136, -1, -1 }, [ POWER6_PME_PM_L2SA_LD_REQ ] = { 164, 173, 164, 170, -1, -1 }, [ POWER6_PME_PM_VMX1_LD_WRBACK ] = { 327, 334, 321, 319, -1, -1 }, [ POWER6_PME_PM_MRK_FPU_FIN ] = { -1, 276, 261, -1, -1, -1 }, [ POWER6_PME_PM_THRD_PRIO_5_CYC ] = { -1, -1, 309, -1, -1, -1 }, [ POWER6_PME_PM_DFU_BACK2BACK ] = { 25, 32, 24, 32, -1, -1 }, [ POWER6_PME_PM_MRK_DATA_FROM_LMEM ] = { -1, -1, -1, 258, -1, -1 }, [ POWER6_PME_PM_LSU_REJECT_LHS ] = 
{ 240, -1, -1, 240, -1, -1 }, [ POWER6_PME_PM_DPU_HELD_SPR ] = { 50, 59, 49, 57, -1, -1 }, [ POWER6_PME_PM_FREQ_DOWN ] = { -1, -1, 115, -1, -1, -1 }, [ POWER6_PME_PM_DFU_ENC_BCD_DPD ] = { 27, 34, 26, 34, -1, -1 }, [ POWER6_PME_PM_DPU_HELD_GPR ] = { 39, 47, 38, 46, -1, -1 }, [ POWER6_PME_PM_LSU0_NCST ] = { 207, 216, 202, 207, -1, -1 }, [ POWER6_PME_PM_MRK_INST_ISSUED ] = { 268, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_INST_FROM_RL2L3_SHR ] = { -1, 156, -1, -1, -1, -1 }, [ POWER6_PME_PM_FPU_DENORM ] = { -1, 110, 99, -1, -1, -1 }, [ POWER6_PME_PM_PTEG_FROM_L3MISS ] = { -1, -1, 293, -1, -1, -1 }, [ POWER6_PME_PM_RUN_PURR ] = { -1, -1, -1, 347, -1, -1 }, [ POWER6_PME_PM_MRK_VMX0_LD_WRBACK ] = { 283, 295, 278, 280, -1, -1 }, [ POWER6_PME_PM_L2_MISS ] = { -1, 197, 185, -1, -1, -1 }, [ POWER6_PME_PM_MRK_DATA_FROM_L3 ] = { -1, -1, 254, -1, -1, -1 }, [ POWER6_PME_PM_MRK_LSU1_REJECT_LHS ] = { 273, 285, 267, 268, -1, -1 }, [ POWER6_PME_PM_L2SB_LD_MISS_INST ] = { 177, 186, 177, 183, -1, -1 }, [ POWER6_PME_PM_PTEG_FROM_RL2L3_SHR ] = { -1, 307, -1, -1, -1, -1 }, [ POWER6_PME_PM_MRK_DERAT_MISS_64K ] = { 354, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_LWSYNC ] = { 247, 258, 241, 246, -1, -1 }, [ POWER6_PME_PM_FPU1_FXMULT ] = { 97, 106, 95, 103, -1, -1 }, [ POWER6_PME_PM_MEM0_DP_CL_WR_GLOB ] = { 248, 259, 242, 247, -1, -1 }, [ POWER6_PME_PM_LSU0_REJECT_PARTIAL_SECTOR ] = { 215, 224, 210, 215, -1, -1 }, [ POWER6_PME_PM_INST_IMC_MATCH_CMPL ] = { 147, -1, -1, 153, -1, -1 }, [ POWER6_PME_PM_DPU_HELD_THERMAL ] = { 52, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_FPU_FRSP ] = { -1, 113, 101, -1, -1, -1 }, [ POWER6_PME_PM_MRK_INST_FIN ] = { -1, -1, 262, 262, -1, -1 }, [ POWER6_PME_PM_MRK_PTEG_FROM_DL2L3_SHR ] = { -1, -1, 271, -1, -1, -1 }, [ POWER6_PME_PM_MRK_DTLB_REF ] = { 264, 273, 258, 259, -1, -1 }, [ POWER6_PME_PM_MRK_PTEG_FROM_L25_SHR ] = { -1, -1, -1, 276, -1, -1 }, [ POWER6_PME_PM_DPU_HELD_LSU ] = { 44, 52, 43, 51, -1, -1 }, [ POWER6_PME_PM_FPU_FSQRT_FDIV ] = { -1, 114, 102, -1, -1, -1 }, [ 
POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_COUNT ] = { -1, 359, -1, -1, -1, -1 }, [ POWER6_PME_PM_DATA_PTEG_SECONDARY ] = { 19, 27, 19, 26, -1, -1 }, [ POWER6_PME_PM_FPU1_FEST ] = { 89, 98, 87, 95, -1, -1 }, [ POWER6_PME_PM_L2SA_LD_HIT ] = { 161, 170, 161, 167, -1, -1 }, [ POWER6_PME_PM_DATA_FROM_MEM_DP_CYC ] = { -1, -1, -1, 21, -1, -1 }, [ POWER6_PME_PM_BR_MPRED_CCACHE ] = { 4, 2, 2, 4, -1, -1 }, [ POWER6_PME_PM_DPU_HELD_COUNT ] = { -1, 355, -1, -1, -1, -1 }, [ POWER6_PME_PM_LSU1_REJECT_SET_MPRED ] = { 229, 238, 224, 229, -1, -1 }, [ POWER6_PME_PM_FPU_ISSUE_2 ] = { 109, 117, 105, 114, -1, -1 }, [ POWER6_PME_PM_LSU1_REJECT_L2_CORR ] = { 225, 234, 220, 225, -1, -1 }, [ POWER6_PME_PM_MRK_PTEG_FROM_DMEM ] = { -1, 289, -1, -1, -1, -1 }, [ POWER6_PME_PM_MEM1_DP_RQ_LOC_GLOB ] = { 255, 266, 249, 254, -1, -1 }, [ POWER6_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC ] = { -1, 325, -1, -1, -1, -1 }, [ POWER6_PME_PM_THRD_PRIO_0_CYC ] = { 316, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { -1, -1, 116, -1, -1, -1 }, [ POWER6_PME_PM_LSU1_REJECT_DERAT_MPRED ] = { 223, 232, 218, 223, -1, -1 }, [ POWER6_PME_PM_MRK_VMX1_LD_WRBACK ] = { 284, 296, 279, 281, -1, -1 }, [ POWER6_PME_PM_DATA_FROM_RL2L3_SHR_CYC ] = { -1, 24, -1, -1, -1, -1 }, [ POWER6_PME_PM_IERAT_MISS_16M ] = { -1, 362, -1, -1, -1, -1 }, [ POWER6_PME_PM_MRK_DATA_FROM_MEM_DP ] = { 262, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_LARX_L1HIT ] = { 198, 208, 196, 201, -1, -1 }, [ POWER6_PME_PM_L2_ST_MISS_DATA ] = { 190, -1, -1, 193, -1, -1 }, [ POWER6_PME_PM_FPU_ST_FOLDED ] = { -1, -1, 114, -1, -1, -1 }, [ POWER6_PME_PM_MRK_DATA_FROM_L35_SHR ] = { -1, 271, -1, -1, -1, -1 }, [ POWER6_PME_PM_DPU_HELD_MULT_GPR ] = { 46, 54, 45, 53, -1, -1 }, [ POWER6_PME_PM_FPU0_1FLOP ] = { 71, 80, 69, 77, -1, -1 }, [ POWER6_PME_PM_IERAT_MISS_16G ] = { 352, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_IC_PREF_WRITE ] = { 134, 143, 133, 140, -1, -1 }, [ POWER6_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC ] = { -1, -1, -1, 310, -1, -1 }, [ POWER6_PME_PM_FPU0_FIN ] = { 75, 
84, 73, 81, -1, -1 }, [ POWER6_PME_PM_DATA_FROM_L2_CYC ] = { -1, 18, -1, -1, -1, -1 }, [ POWER6_PME_PM_DERAT_REF_16G ] = { -1, -1, -1, 342, -1, -1 }, [ POWER6_PME_PM_BR_PRED ] = { 8, 6, 6, 8, -1, -1 }, [ POWER6_PME_PM_VMX1_LD_ISSUED ] = { 326, 333, 320, 318, -1, -1 }, [ POWER6_PME_PM_L2SB_CASTOUT_MOD ] = { 171, 180, 171, 177, -1, -1 }, [ POWER6_PME_PM_INST_FROM_DMEM ] = { -1, 152, -1, -1, -1, -1 }, [ POWER6_PME_PM_DATA_FROM_L35_SHR_CYC ] = { -1, 20, -1, -1, -1, -1 }, [ POWER6_PME_PM_LSU0_NCLD ] = { 206, 215, 201, 206, -1, -1 }, [ POWER6_PME_PM_FAB_RETRY_NODE_PUMP ] = { 64, 74, 63, 70, -1, -1 }, [ POWER6_PME_PM_VMX0_INST_ISSUED ] = { 321, 328, 315, 313, -1, -1 }, [ POWER6_PME_PM_DATA_FROM_L25_MOD ] = { -1, -1, 12, -1, -1, -1 }, [ POWER6_PME_PM_DPU_HELD_ITLB_ISLB ] = { 42, 50, 41, 49, -1, -1 }, [ POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { -1, 248, -1, -1, -1, -1 }, [ POWER6_PME_PM_THRD_CONC_RUN_INST ] = { -1, -1, 306, -1, -1, -1 }, [ POWER6_PME_PM_MRK_PTEG_FROM_L2 ] = { 277, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_PURR ] = { 301, -1, -1, 293, -1, -1 }, [ POWER6_PME_PM_DERAT_MISS_64K ] = { -1, 361, -1, -1, -1, -1 }, [ POWER6_PME_PM_PMC2_REWIND ] = { -1, -1, 287, -1, -1, -1 }, [ POWER6_PME_PM_INST_FROM_L2 ] = { 143, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_INST_DISP ] = { -1, 149, 139, -1, -1, -1 }, [ POWER6_PME_PM_DATA_FROM_L25_SHR ] = { -1, -1, -1, 16, -1, -1 }, [ POWER6_PME_PM_L1_DCACHE_RELOAD_VALID ] = { -1, -1, 154, 160, -1, -1 }, [ POWER6_PME_PM_LSU1_REJECT_UST ] = { 232, 241, 227, 232, -1, -1 }, [ POWER6_PME_PM_FAB_ADDR_COLLISION ] = { 58, 68, 58, 65, -1, -1 }, [ POWER6_PME_PM_MRK_FXU_FIN ] = { -1, 277, -1, -1, -1, -1 }, [ POWER6_PME_PM_LSU0_REJECT_UST ] = { 219, 228, 214, 219, -1, -1 }, [ POWER6_PME_PM_PMC4_OVERFLOW ] = { 292, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_MRK_PTEG_FROM_L3 ] = { -1, -1, 273, -1, -1, -1 }, [ POWER6_PME_PM_INST_FROM_L2MISS ] = { -1, -1, -1, 151, -1, -1 }, [ POWER6_PME_PM_L2SB_ST_HIT ] = { 182, 191, 182, 188, -1, -1 }, [ 
POWER6_PME_PM_DPU_WT_IC_MISS_COUNT ] = { -1, 357, -1, -1, -1, -1 }, [ POWER6_PME_PM_MRK_DATA_FROM_DL2L3_SHR ] = { -1, -1, 252, -1, -1, -1 }, [ POWER6_PME_PM_MRK_PTEG_FROM_L35_MOD ] = { 278, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_FPU1_FPSCR ] = { 93, 102, 91, 99, -1, -1 }, [ POWER6_PME_PM_LSU_REJECT_UST ] = { -1, 256, -1, -1, -1, -1 }, [ POWER6_PME_PM_LSU0_DERAT_MISS ] = { 204, 213, 199, 204, -1, -1 }, [ POWER6_PME_PM_MRK_PTEG_FROM_MEM_DP ] = { 279, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_MRK_DATA_FROM_L2 ] = { 259, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_FPU0_FSQRT_FDIV ] = { 80, 89, 78, 86, -1, -1 }, [ POWER6_PME_PM_DPU_HELD_FXU_SOPS ] = { 38, 46, 37, 45, -1, -1 }, [ POWER6_PME_PM_MRK_FPU0_FIN ] = { 265, 274, 259, 260, -1, -1 }, [ POWER6_PME_PM_L2SB_LD_MISS_DATA ] = { 176, 185, 176, 182, -1, -1 }, [ POWER6_PME_PM_LSU_SRQ_EMPTY_CYC ] = { -1, -1, -1, 244, -1, -1 }, [ POWER6_PME_PM_1PLUS_PPC_DISP ] = { 2, -1, -1, 1, -1, -1 }, [ POWER6_PME_PM_VMX_ST_ISSUED ] = { 336, 343, 330, 328, -1, -1 }, [ POWER6_PME_PM_DATA_FROM_L2MISS ] = { -1, 17, 13, -1, -1, -1 }, [ POWER6_PME_PM_LSU0_REJECT_ULD ] = { 218, 227, 213, 218, -1, -1 }, [ POWER6_PME_PM_SUSPENDED ] = { 311, 318, 302, 302, -1, -1 }, [ POWER6_PME_PM_DFU_ADD_SHIFTED_BOTH ] = { 24, 31, 23, 31, -1, -1 }, [ POWER6_PME_PM_LSU_REJECT_NO_SCRATCH ] = { -1, 251, 237, -1, -1, -1 }, [ POWER6_PME_PM_STCX_FAIL ] = { 306, 312, 298, 297, -1, -1 }, [ POWER6_PME_PM_FPU1_DENORM ] = { 87, 96, 85, 93, -1, -1 }, [ POWER6_PME_PM_GCT_NOSLOT_COUNT ] = { 349, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_DATA_FROM_DL2L3_SHR_CYC ] = { -1, 12, -1, -1, -1, -1 }, [ POWER6_PME_PM_DATA_FROM_L21 ] = { -1, 15, -1, -1, -1, -1 }, [ POWER6_PME_PM_FPU_1FLOP ] = { 101, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_LSU1_REJECT ] = { 222, 231, 217, 222, -1, -1 }, [ POWER6_PME_PM_IC_REQ ] = { 136, 145, 135, 142, -1, -1 }, [ POWER6_PME_PM_MRK_DFU_FIN ] = { -1, -1, 257, -1, -1, -1 }, [ POWER6_PME_PM_NOT_LLA_CYC ] = { 346, 353, 338, 338, -1, -1 }, [ POWER6_PME_PM_INST_FROM_L1 ] = { 
142, 153, 143, 149, -1, -1 }, [ POWER6_PME_PM_MRK_VMX_COMPLEX_ISSUED ] = { 285, 297, 280, 282, -1, -1 }, [ POWER6_PME_PM_BRU_FIN ] = { 3, 1, 1, 2, -1, -1 }, [ POWER6_PME_PM_LSU1_REJECT_EXTERN ] = { 224, 233, 219, 224, -1, -1 }, [ POWER6_PME_PM_DATA_FROM_L21_CYC ] = { -1, -1, -1, 15, -1, -1 }, [ POWER6_PME_PM_GXI_CYC_BUSY ] = { 123, 130, 121, 129, -1, -1 }, [ POWER6_PME_PM_MRK_LD_MISS_L1 ] = { -1, 280, -1, -1, -1, -1 }, [ POWER6_PME_PM_L1_WRITE_CYC ] = { 156, 165, 156, 162, -1, -1 }, [ POWER6_PME_PM_LLA_CYC ] = { 345, 352, 337, 337, -1, -1 }, [ POWER6_PME_PM_MRK_DATA_FROM_L2MISS ] = { 260, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_GCT_FULL_COUNT ] = { 339, 346, 333, 331, -1, -1 }, [ POWER6_PME_PM_MEM_DP_RQ_LOC_GLOB ] = { -1, 268, -1, -1, -1, -1 }, [ POWER6_PME_PM_DATA_FROM_RL2L3_SHR ] = { -1, 23, -1, -1, -1, -1 }, [ POWER6_PME_PM_MRK_LSU_REJECT_UST ] = { -1, 288, 270, -1, -1, -1 }, [ POWER6_PME_PM_MRK_VMX_PERMUTE_ISSUED ] = { 287, 299, 282, 284, -1, -1 }, [ POWER6_PME_PM_MRK_PTEG_FROM_L21 ] = { -1, 290, -1, -1, -1, -1 }, [ POWER6_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { -1, 320, -1, -1, -1, -1 }, [ POWER6_PME_PM_BR_MPRED ] = { -1, -1, -1, 3, -1, -1 }, [ POWER6_PME_PM_LD_REQ_L2 ] = { 203, -1, -1, -1, -1, -1 }, [ POWER6_PME_PM_FLUSH_ASYNC ] = { 68, 77, 66, 74, -1, -1 }, [ POWER6_PME_PM_HV_CYC ] = { -1, 137, -1, -1, -1, -1 }, [ POWER6_PME_PM_LSU1_DERAT_MISS ] = { 220, 229, 215, 220, -1, -1 }, [ POWER6_PME_PM_DPU_HELD_SMT ] = { 49, 58, 48, 56, -1, -1 }, [ POWER6_PME_PM_MRK_LSU_FIN ] = { -1, -1, -1, 272, -1, -1 }, [ POWER6_PME_PM_MRK_DATA_FROM_RL2L3_SHR ] = { -1, 272, -1, -1, -1, -1 }, [ POWER6_PME_PM_LSU0_REJECT_STQ_FULL ] = { 217, 226, 212, 217, -1, -1 }, [ POWER6_PME_PM_MRK_DERAT_REF_4K ] = { -1, 363, -1, -1, -1, -1 }, [ POWER6_PME_PM_FPU_ISSUE_STALL_FPR ] = { 112, 120, 108, 117, -1, -1 }, [ POWER6_PME_PM_IFU_FIN ] = { 138, 147, 137, 144, -1, -1 }, [ POWER6_PME_PM_GXO_CYC_BUSY ] = { 126, 133, 124, 132, -1, -1 } };

static const unsigned long long
power6_group_vecs[][POWER6_NUM_GROUP_VEC] = { [ POWER6_PME_PM_LSU_REJECT_STQ_FULL ] = { 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_FXU_MULTI ] = { 0x0000008000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_VMX1_STALL ] = { 0x0000000000000000ULL, 0x0000000001000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PMC2_SAVED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SB_IC_INV ] = { 0x0000000000000000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_IERAT_MISS_64K ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000100ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_PRIO_DIFF_3or4_CYC ] = { 0x0000000000000000ULL, 0x0000000000010000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LD_REF_L1_BOTH ] = { 0x0000400000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU1_FCONV ] = { 0x0000000000000000ULL, 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_IBUF_FULL_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000100000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_LSU_DERAT_MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_ST_CMPL ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2_CASTOUT_MOD ] = { 0x0000000000000000ULL, 0x0000008000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU1_ST_FOLDED ] = { 0x0000000000000000ULL, 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_INST_TIMEO ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000400000ULL, 
0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_WT ] = { 0x0000000800000000ULL, 0x0000000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_RESTART ] = { 0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_IERAT_MISS ] = { 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_SINGLE ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_LMEM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000040000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_HV_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000800000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SA_ST_HIT ] = { 0x0000000000000000ULL, 0x0001000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2_LD_MISS_INST ] = { 0x0000000000000000ULL, 0x0000004000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_EXT_INT ] = { 0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU1_LDF ] = { 0x0000000000000000ULL, 0x0000000000000100ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FAB_CMD_ISSUED ] = { 0x0000000000000000ULL, 0x0000000028000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_FROM_L21 ] = { 0x0000000018000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SA_MISS ] = { 0x0000000000000000ULL, 0x0000100000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_FROM_RL2L3_MOD ] = { 0x0000000080000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_WT_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0002040000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_L25_MOD ] = { 
0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000008000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LD_HIT_L2 ] = { 0x0000000000000000ULL, 0x0000010000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_FROM_DL2L3_SHR ] = { 0x0000000080000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MEM_DP_RQ_GLOB_LOC ] = { 0x0000000000000000ULL, 0x0000000040000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L3SA_MISS ] = { 0x0000000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_NO_ITAG_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DSLB_MISS ] = { 0x0600000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_FLUSH_ALIGN ] = { 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_FPU_CR ] = { 0x0000188000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_FROM_L2MISS ] = { 0x0000000010000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DATA_FROM_DMEM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_FROM_LMEM ] = { 0x0000000060000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DERAT_REF_64K ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000001000000000ULL, 0x0000000000000010ULL }, [ POWER6_PME_PM_L2SA_LD_REQ_INST ] = { 0x0000000000000000ULL, 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DERAT_MISS_16M ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000002000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_DL2L3_MOD ] = { 
0x0000000000000400ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU0_FXMULT ] = { 0x0000000000000000ULL, 0x0200000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L3SB_MISS ] = { 0x0000000000000000ULL, 0x0008000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_STCX_CANCEL ] = { 0x0008000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SA_LD_MISS_DATA ] = { 0x0000000000000000ULL, 0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_IC_INV_L2 ] = { 0x0000000000000000ULL, 0x0000008000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD ] = { 0x0000002000000000ULL, 0x0000000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PMC1_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_PRIO_6_CYC ] = { 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_L3MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000040000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_LSU0_REJECT_UST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000020000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_INST_DISP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000080000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LARX ] = { 0x0018000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_CMPL ] = { 0x0100000000006001ULL, 0x0000000000000000ULL, 0x3800003ffffffe18ULL, 0x0000000000000038ULL }, [ POWER6_PME_PM_FXU_IDLE ] = { 0x0000000000000000ULL, 0x0000000000040000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DATA_FROM_DL2L3_MOD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 
0x0000000000000400ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2_LD_REQ_DATA ] = { 0x0000000000000000ULL, 0x0000002000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_DERAT_MISS_CYC ] = { 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0040000000000020ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_POWER_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0100020000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FROM_RL2L3_MOD ] = { 0x0000000001000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_DMEM_CYC ] = { 0x0000000000004000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_DMEM ] = { 0x0000000000000800ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_REJECT_PARTIAL_SECTOR ] = { 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_REJECT_DERAT_MPRED ] = { 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU1_REJECT_ULD ] = { 0x8000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_L3_CYC ] = { 0x0000000000010000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { 0x0000000000000000ULL, 0x0000000000040000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FROM_MEM_DP ] = { 0x0000000002000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_FLUSH_DSI ] = { 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DERAT_REF_16G ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000010ULL }, [ POWER6_PME_PM_LSU_LDF_BOTH ] = { 0x0000000000000000ULL, 
0x0000000000000100ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU1_1FLOP ] = { 0x0000000000000000ULL, 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_RMEM_CYC ] = { 0x0000000000004000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_PTEG_SECONDARY ] = { 0x0000000200000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L1_ICACHE_MISS ] = { 0x0000000800000000ULL, 0x0000000000000000ULL, 0x0000040000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_DISP_LLA ] = { 0x0000000000000000ULL, 0x0000000000001000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_BOTH_RUN_CYC ] = { 0x0020000000000000ULL, 0x0000000000000000ULL, 0x0200000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_ST_CHAINED ] = { 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU1_FXDIV ] = { 0x0000000000000000ULL, 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FREQ_UP ] = { 0x0000000400000000ULL, 0x0000000000000000ULL, 0x0000020000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FAB_RETRY_SYS_PUMP ] = { 0x0000000000000000ULL, 0x0000000010000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_LMEM ] = { 0x0000000000100800ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PMC3_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU0_REJECT_SET_MPRED ] = { 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU0_REJECT_DERAT_MPRED ] = { 0x0000000000000000ULL, 0x0000000000000084ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU1_REJECT_STQ_FULL ] = { 0x0000000000000000ULL, 
0x0000000000000040ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_BR_MPRED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SA_ST_MISS ] = { 0x0000000000000000ULL, 0x0000080000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU0_REJECT_EXTERN ] = { 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_BR_TAKEN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_ISLB_MISS ] = { 0x0600000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_CYC ] = { 0x0100200000283003ULL, 0x0000000000000000ULL, 0x5c00000481000018ULL, 0x0000000000000005ULL }, [ POWER6_PME_PM_FPU_FXDIV ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_LLA_END ] = { 0x0000020000000000ULL, 0x0000000000001000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MEM0_DP_CL_WR_LOC ] = { 0x0000000000000000ULL, 0x0000000080000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_LSU_REJECT_ULD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000210000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_1PLUS_PPC_CMPL ] = { 0x0040000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_FROM_DMEM ] = { 0x0000000040000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_WT_BR_MPRED_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0002040000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_GCT_FULL_CYC ] = { 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FROM_L25_SHR ] = { 0x0000000000800000ULL, 0x0000000000000000ULL, 
0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DERAT_MISS_4K ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000002000000000ULL, 0x0000000000000020ULL }, [ POWER6_PME_PM_DC_PREF_STREAM_ALLOC ] = { 0x0001000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU1_FIN ] = { 0x0000000000000000ULL, 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_BR_MPRED_TA ] = { 0x0000000000000028ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_POWER ] = { 0x0000001400000000ULL, 0x0000000000000000ULL, 0x0200000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_RUN_INST_CMPL ] = { 0xffffffffffffffffULL, 0xffffffffffffffffULL, 0xffffffffffffffffULL, 0x000000000000003fULL }, [ POWER6_PME_PM_GCT_EMPTY_CYC ] = { 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0010000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LLA_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000080000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU0_REJECT_NO_SCRATCH ] = { 0x0000000000000000ULL, 0x0000000000000010ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_WT_IC_MISS ] = { 0x0000000800000000ULL, 0x0000000000000000ULL, 0x0000080000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_L3MISS ] = { 0x0000000000000180ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000008ULL }, [ POWER6_PME_PM_FPU_FPSCR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_VMX1_INST_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000200000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FLUSH ] = { 0x0002000000080000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_ST_HIT_L2 ] = { 0x0000000000000000ULL, 0x0000010000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, 
[ POWER6_PME_PM_SYNC_CYC ] = { 0x0000200000000000ULL, 0x0000000000000000ULL, 0x0061800000000010ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FAB_SYS_PUMP ] = { 0x0000000000000000ULL, 0x0000000010000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_IC_PREF_REQ ] = { 0x0004000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MEM0_DP_RQ_GLOB_LOC ] = { 0x0000000000000000ULL, 0x0000000080000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_ISSUE_0 ] = { 0x0000000000000000ULL, 0x0010000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_PRIO_2_CYC ] = { 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_VMX_SIMPLE_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000100000ULL, 0x0000000400000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_FPU1_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000800100000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_CW ] = { 0x0000002000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L3SA_REF ] = { 0x0000000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_STCX ] = { 0x0018000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SB_MISS ] = { 0x0000000000000000ULL, 0x0000100000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU0_REJECT ] = { 0x0000000000000000ULL, 0x0000000000000044ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_TB_BIT_TRANS ] = { 0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THERMAL_MAX ] = { 0x0000001000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU0_STF ] = { 0x0000000000000000ULL, 
0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU1_FMA ] = { 0x0000000000000000ULL, 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU1_REJECT_LHS ] = { 0x0000000000000000ULL, 0x0000000000000020ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_INT ] = { 0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_LLA_BOTH_CYC ] = { 0x0040000000000000ULL, 0x0000000000001000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_THERMAL_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0002020000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PMC4_REWIND ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DERAT_REF_16M ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU0_FCONV ] = { 0x0000000000000000ULL, 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SA_LD_REQ_DATA ] = { 0x0000000000000000ULL, 0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_MEM_DP ] = { 0x0000000000020800ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_VMX_FLOAT_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000100000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_L2MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000020000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_PRIO_DIFF_1or2_CYC ] = { 0x0000000000000000ULL, 0x0000000000010000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_VMX0_STALL ] = { 0x0000000000000000ULL, 0x0000000001000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { 0x0000000000000000ULL, 
0x0002000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_DERAT_MISS ] = { 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0020004000000020ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU0_SINGLE ] = { 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_ISSUE_STEERING ] = { 0x0000000000000000ULL, 0x0010000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_PRIO_1_CYC ] = { 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_VMX_COMPLEX_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000100000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_ISSUE_ST_FOLDED ] = { 0x0000000000000000ULL, 0x0020000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DFU_FIN ] = { 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_BR_PRED_CCACHE ] = { 0x0000000000000018ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_ST_CMPL_INT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FAB_MMIO ] = { 0x0000000000000000ULL, 0x0000000020000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_VMX_SIMPLE_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000400000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_STF ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0800000000000002ULL, 0x0000000000000002ULL }, [ POWER6_PME_PM_MEM1_DP_CL_WR_GLOB ] = { 0x0000000000000000ULL, 0x0000000100000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DATA_FROM_L3MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_GCT_NOSLOT_CYC ] = { 0x0000000000000000ULL, 0x0000000000002000ULL, 
0x0010000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2_ST_REQ_DATA ] = { 0x0000000000000000ULL, 0x0000002000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_TABLEWALK_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000200000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_FROM_L35_SHR ] = { 0x0000000020000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_ISYNC ] = { 0x0000184000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DATA_FROM_L25_SHR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000001000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L3SA_HIT ] = { 0x0000000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DERAT_MISS_16G ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_PTEG_2ND_HALF ] = { 0x0000000100000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SA_ST_REQ ] = { 0x0000000000000000ULL, 0x0001080000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FROM_LMEM ] = { 0x0000000002000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { 0x0000000000000000ULL, 0x0002000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_FROM_L2 ] = { 0x0000000008000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_PTEG_1ST_HALF ] = { 0x0000000100000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_BR_MPRED_COUNT ] = { 0x0000000000000024ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_IERAT_MISS_4K ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 
0x0000000000000100ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_BOTH_RUN_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_REJECT_ULD ] = { 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_DL2L3_MOD_CYC ] = { 0x0000000000002000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_RL2L3_MOD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000040000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU0_FLOP ] = { 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU0_FEST ] = { 0x0000000000000000ULL, 0x0600000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_LSU0_REJECT_LHS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000020000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_VMX_RESULT_SAT_1 ] = { 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_NO_ITAG_CYC ] = { 0x0000000000000000ULL, 0x0000001000000000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU1_REJECT_NO_SCRATCH ] = { 0x0000000000000000ULL, 0x0000000000000014ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_0INST_FETCH ] = { 0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000100000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_WT_BR_MPRED ] = { 0x0000000800000000ULL, 0x0000000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L1_PREF ] = { 0x0001000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_VMX_FLOAT_MULTICYCLE ] = { 0x0000000000000000ULL, 0x0000000001800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_L25_SHR_CYC ] = { 0x0000000000020000ULL, 0x0000000000000000ULL, 
0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_L3 ] = { 0x0000000000010100ULL, 0x000c000000000000ULL, 0x0000000000000000ULL, 0x0000000000000001ULL }, [ POWER6_PME_PM_PMC2_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_VMX0_LD_WRBACK ] = { 0x0000000000000000ULL, 0x0000000000400000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU0_DENORM ] = { 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FETCH_CYC ] = { 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000010000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_LDF ] = { 0x0000000000000000ULL, 0x0000000000000100ULL, 0x0800000000000000ULL, 0x0000000000000004ULL }, [ POWER6_PME_PM_LSU_REJECT_L2_CORR ] = { 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DERAT_REF_64K ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_PRIO_3_CYC ] = { 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_FMA ] = { 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0400000000000000ULL, 0x0000000000000002ULL }, [ POWER6_PME_PM_INST_FROM_L35_MOD ] = { 0x0000000000800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DFU_CONV ] = { 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FROM_L25_MOD ] = { 0x0000000000400000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_FROM_L35_MOD ] = { 0x0000000020000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_VMX_ST_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000000000ULL 
}, [ POWER6_PME_PM_VMX_FLOAT_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000100000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU0_REJECT_L2_CORR ] = { 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_L2MISS ] = { 0x0100000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_FCONV ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000004ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_FXMULT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU1_FRSP ] = { 0x0000000000000000ULL, 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DERAT_REF_16M ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000001000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SB_CASTOUT_SHR ] = { 0x0000000000000000ULL, 0x0000200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_ONE_RUN_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000800000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FROM_RMEM ] = { 0x0000000002000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_BOTH_BUS ] = { 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU1_FSQRT_FDIV ] = { 0x0000000000000000ULL, 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2_LD_REQ_INST ] = { 0x0000000000000000ULL, 0x0000004000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_L35_SHR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000010000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_BR_PRED_CR ] = { 0x0000000000000014ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ 
POWER6_PME_PM_MRK_LSU0_REJECT_ULD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000020000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_REJECT ] = { 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_REJECT_LHS_BOTH ] = { 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_GXO_ADDR_CYC_BUSY ] = { 0x0000000000000000ULL, 0x0000000200000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_SRQ_EMPTY_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0008000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_FROM_L3 ] = { 0x0000000010000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_VMX0_LD_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000600000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FXU_PIPELINED_MULT_DIV ] = { 0x0000000000000000ULL, 0x0000000000080000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU1_STF ] = { 0x0000000000000000ULL, 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DFU_ADD ] = { 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MEM_DP_CL_WR_GLOB ] = { 0x0000000000000000ULL, 0x0000000040000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_LSU1_REJECT_ULD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000040000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_ITLB_REF ] = { 0x0200000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU0_REJECT_L2MISS ] = { 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000008000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_L35_SHR ] = { 0x0000000000000300ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000001ULL }, [ 
POWER6_PME_PM_MRK_DATA_FROM_RL2L3_MOD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU0_FPSCR ] = { 0x0000000000000000ULL, 0x0200000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_L2 ] = { 0x0000000000040080ULL, 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_XER ] = { 0x0000004000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FAB_NODE_PUMP ] = { 0x0000000000000000ULL, 0x0000000010000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_VMX_RESULT_SAT_0_1 ] = { 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LD_REF_L1 ] = { 0x0000c00000000000ULL, 0x0000000000000000ULL, 0x2080000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_TLB_REF ] = { 0x0200000100000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { 0x0001000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FLUSH_FPU ] = { 0x0002000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MEM1_DP_CL_WR_LOC ] = { 0x0000000000000000ULL, 0x0000000100000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SB_LD_HIT ] = { 0x0000000000000000ULL, 0x0000800000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FAB_DCLAIM ] = { 0x0000000000000000ULL, 0x0000000008000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MEM_DP_CL_WR_LOC ] = { 0x0000000000000000ULL, 0x0000000040000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_BR_MPRED_CR ] = { 0x0000000000000024ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_REJECT_EXTERN ] = { 
0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_RL2L3_MOD ] = { 0x0000000000010400ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_RU_WQ ] = { 0x0000084000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LD_MISS_L1 ] = { 0x0000800000000000ULL, 0x0000000000000000ULL, 0x1080084000000020ULL, 0x0000000000000006ULL }, [ POWER6_PME_PM_DC_INV_L2 ] = { 0x0000000000000000ULL, 0x0000008000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_RMEM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000080000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x4000000000000001ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FXU0_FIN ] = { 0x0000000000000000ULL, 0x0000000000080000ULL, 0x4000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_FPQ ] = { 0x0000002000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_GX_DMA_READ ] = { 0x0000000000000000ULL, 0x0000000800000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU1_REJECT_PARTIAL_SECTOR ] = { 0x0000000000000000ULL, 0x0000000000000010ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_0INST_FETCH_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000100000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PMC5_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SB_LD_REQ ] = { 0x0000000000000000ULL, 0x0000800000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_PRIO_DIFF_0_CYC ] = { 0x0000000000000000ULL, 0x0000000000010000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_RMEM ] = { 0x0000000000004800ULL, 
0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_BOTH_CYC ] = { 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0010000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_ST_REF_L1_BOTH ] = { 0x0000400000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_VMX_PERMUTE_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000100000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_BR_TAKEN ] = { 0x0000000000000040ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FAB_DMA ] = { 0x0000000000000000ULL, 0x0000000008000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_GCT_EMPTY_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0008010000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU1_SINGLE ] = { 0x0000000000000000ULL, 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SA_CASTOUT_SHR ] = { 0x0000000000000000ULL, 0x0000200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L3SB_REF ] = { 0x0000000000000000ULL, 0x0008000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU0_FRSP ] = { 0x0000000000000000ULL, 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PMC4_SAVED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SA_DC_INV ] = { 0x0000000000000000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_GXI_ADDR_CYC_BUSY ] = { 0x0000000000000000ULL, 0x0000000400000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU0_FMA ] = { 0x0000000000000000ULL, 0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_SLB_MISS ] = { 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 
0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_ST_GPS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DERAT_REF_4K ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2_CASTOUT_SHR ] = { 0x0000000000000000ULL, 0x0000008000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_STCX_CR ] = { 0x0000084000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU0_ST_FOLDED ] = { 0x0000000000000000ULL, 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DATA_FROM_L21 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC ] = { 0x0000000000000000ULL, 0x0000000000020000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_L35_MOD ] = { 0x0000000000208300ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000001ULL }, [ POWER6_PME_PM_DATA_FROM_DL2L3_SHR ] = { 0x0000000000008400ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_GXI_DATA_CYC_BUSY ] = { 0x0000000000000000ULL, 0x0000000400000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_REJECT_STEAL ] = { 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_ST_FIN ] = { 0x0100000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_CR_LOGICAL ] = { 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0300000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_SEL_T0 ] = { 0x0000000000000000ULL, 0x0000000000020000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_RELOAD_VALID ] = { 0x0000000080000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 
0x0000000000000000ULL }, [ POWER6_PME_PM_L2_PREF_ST ] = { 0x0000000000000000ULL, 0x0002000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_STCX_FAIL ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000400000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU0_REJECT_LHS ] = { 0x0000000000000000ULL, 0x0000000000000020ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DFU_EXP_EQ ] = { 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_FP_FX_MULT ] = { 0x0000010000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2_LD_MISS_DATA ] = { 0x0000000000000000ULL, 0x0000002000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_L35_MOD_CYC ] = { 0x0000000000208000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FLUSH_FXU ] = { 0x0002000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_ISSUE_1 ] = { 0x0000000000000000ULL, 0x0010000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_LMEM_CYC ] = { 0x0000000000102000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_LSU_SOPS ] = { 0x0000200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_PTEG_2ND_HALF ] = { 0x0000000200000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRESH_TIMEO ] = { 0x0040000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_REJECT_UST_BOTH ] = { 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_REJECT_FAST ] = { 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL 
}, [ POWER6_PME_PM_DPU_HELD_THRD_PRIO ] = { 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2_PREF_LD ] = { 0x0000000000000000ULL, 0x0002004000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_FEST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000004ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DATA_FROM_RMEM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LD_MISS_L1_CYC ] = { 0x0000000000001000ULL, 0x0000000000000000ULL, 0x0000000000000020ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DERAT_MISS_4K ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_COMPLETION ] = { 0x0000110000000000ULL, 0x0000000000000000ULL, 0x0300000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_ISSUE_STALL_ST ] = { 0x0000000000000000ULL, 0x0060000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SB_DC_INV ] = { 0x0000000000000000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_FROM_L25_SHR ] = { 0x0000000008000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_FROM_DL2L3_MOD ] = { 0x0000000010000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FAB_CMD_RETRIED ] = { 0x0000000000000000ULL, 0x0000000028000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_BR_PRED_LSTACK ] = { 0x0000000000000018ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_GXO_DATA_CYC_BUSY ] = { 0x0000000000000000ULL, 0x0000000200000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DFU_SUBNORM ] = { 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ 
POWER6_PME_PM_FPU_ISSUE_OOO ] = { 0x0000000000000000ULL, 0x0020000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_REJECT_ULD_BOTH ] = { 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SB_ST_MISS ] = { 0x0000000000000000ULL, 0x0000080000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_L25_MOD_CYC ] = { 0x0000000000001000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_PTEG_1ST_HALF ] = { 0x0000000200000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DERAT_MISS_16M ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_GX_DMA_WRITE ] = { 0x0000000000000000ULL, 0x0000000800000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_DL2L3_MOD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000008000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MEM1_DP_RQ_GLOB_LOC ] = { 0x0000000000000000ULL, 0x0000000100000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SB_LD_REQ_DATA ] = { 0x0000000000000000ULL, 0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SA_LD_MISS_INST ] = { 0x0000000000000000ULL, 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_LSU0_REJECT_L2MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_IFU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000400000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FROM_L3 ] = { 0x0000000000800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FXU1_FIN ] = { 0x0000000000000000ULL, 0x0000000000080000ULL, 0x4000000000000000ULL, 0x0000000000000000ULL }, [ 
POWER6_PME_PM_THRD_PRIO_4_CYC ] = { 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DATA_FROM_L35_MOD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_REJECT_SET_MPRED ] = { 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DERAT_MISS_16G ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000020ULL }, [ POWER6_PME_PM_FPU0_FXDIV ] = { 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_LSU1_REJECT_UST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000040000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_ISSUE_DIV_SQRT_OVERLAP ] = { 0x0000000000000000ULL, 0x0060000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FROM_L35_SHR ] = { 0x0000000000800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_LSU_REJECT_LHS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000010000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_LMQ_FULL_CYC ] = { 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000008000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_SYNC_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0061800000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MEM0_DP_RQ_LOC_GLOB ] = { 0x0000000000000000ULL, 0x0000000080000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SA_CASTOUT_MOD ] = { 0x0000000000000000ULL, 0x0000200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_BOTH_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0008008000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_FROM_MEM_DP ] = { 0x0000000040000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 
0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_REJECT_SLOW ] = { 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_FROM_L25_MOD ] = { 0x0000000008000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_PRIO_7_CYC ] = { 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_RL2L3_SHR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000080000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_ST_REQ_L2 ] = { 0x0000000000000000ULL, 0x0000010000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_ST_REF_L1 ] = { 0x0000c00000000000ULL, 0x0000000000000000ULL, 0x2000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_ISSUE_STALL_THRD ] = { 0x0000000000000000ULL, 0x0040000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_RUN_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000200000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_RUN_CYC ] = { 0xffffffffffffffffULL, 0xffffffffffffffffULL, 0xffffffffffffffffULL, 0x000000000000003fULL }, [ POWER6_PME_PM_PTEG_FROM_RMEM ] = { 0x0000000040000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU0_LDF ] = { 0x0000000000000000ULL, 0x0000000000000100ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_ST_MISS_L1 ] = { 0x0000800000000000ULL, 0x0000000000000000ULL, 0x1080000000000000ULL, 0x0000000000000004ULL }, [ POWER6_PME_PM_INST_FROM_DL2L3_SHR ] = { 0x0000000001000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SA_IC_INV ] = { 0x0000000000000000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_ONE_RUN_CYC ] = { 0x0020000000000002ULL, 0x0000000000000000ULL, 0x0001000000000000ULL, 0x0000000000000000ULL }, [ 
POWER6_PME_PM_L2SB_LD_REQ_INST ] = { 0x0000000000000000ULL, 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DATA_FROM_L25_MOD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_XTHRD ] = { 0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SB_ST_REQ ] = { 0x0000000000000000ULL, 0x0001080000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FROM_L21 ] = { 0x0000000004400000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FROM_L3MISS ] = { 0x0000000004000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000008ULL }, [ POWER6_PME_PM_L3SB_HIT ] = { 0x0000000000000000ULL, 0x0008000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_EE_OFF_EXT_INT ] = { 0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FROM_DL2L3_MOD ] = { 0x0000000001000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PMC6_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_FLOP ] = { 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FXU_BUSY ] = { 0x0000000000000000ULL, 0x0000000000040000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU1_FLOP ] = { 0x0000000000000000ULL, 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_IC_RELOAD_SHR ] = { 0x0004000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_TABLEWALK_CYC ] = { 0x0000000200000000ULL, 0x0000000000000000ULL, 0x0000200000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_RL2L3_MOD_CYC ] = { 
0x0000000000010000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_PRIO_DIFF_5or6_CYC ] = { 0x0000000000000000ULL, 0x0000000000010000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_IBUF_FULL_CYC ] = { 0x0001000000000000ULL, 0x0000000000000000ULL, 0x0000100000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SA_LD_REQ ] = { 0x0000000000000000ULL, 0x0000800000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_VMX1_LD_WRBACK ] = { 0x0000000000000000ULL, 0x0000000000400000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_FPU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000800100000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_PRIO_5_CYC ] = { 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DFU_BACK2BACK ] = { 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DATA_FROM_LMEM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_REJECT_LHS ] = { 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_SPR ] = { 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FREQ_DOWN ] = { 0x0000000400000000ULL, 0x0000000000000000ULL, 0x0000020000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DFU_ENC_BCD_DPD ] = { 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_GPR ] = { 0x0000012000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU0_NCST ] = { 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_INST_ISSUED ] = { 0x0000000000000000ULL, 
0x0000000000000000ULL, 0x0000000000080000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FROM_RL2L3_SHR ] = { 0x0000000001000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_DENORM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_FROM_L3MISS ] = { 0x0000000020000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_RUN_PURR ] = { 0x0000000000000002ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_VMX0_LD_WRBACK ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000200000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2_MISS ] = { 0x0000000000000000ULL, 0x0000104000000000ULL, 0x0000000000000000ULL, 0x0000000000000008ULL }, [ POWER6_PME_PM_MRK_DATA_FROM_L3 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000001000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_LSU1_REJECT_LHS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000040000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SB_LD_MISS_INST ] = { 0x0000000000000000ULL, 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PTEG_FROM_RL2L3_SHR ] = { 0x0000000080000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DERAT_MISS_64K ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000002000000000ULL, 0x0000000000000020ULL }, [ POWER6_PME_PM_LWSYNC ] = { 0x0000200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU1_FXMULT ] = { 0x0000000000000000ULL, 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MEM0_DP_CL_WR_GLOB ] = { 0x0000000000000000ULL, 0x0000000080000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU0_REJECT_PARTIAL_SECTOR ] = { 0x0000000000000000ULL, 
0x0000000000000010ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_IMC_MATCH_CMPL ] = { 0x0000000000000000ULL, 0x0000001000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_THERMAL ] = { 0x0000001400000000ULL, 0x0000000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_FRSP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000005ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_INST_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000080000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_DL2L3_SHR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DTLB_REF ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000200000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_L25_SHR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000010000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_LSU ] = { 0x0000008000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_FSQRT_FDIV ] = { 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0400000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0020008000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_PTEG_SECONDARY ] = { 0x0000000100000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU1_FEST ] = { 0x0000000000000000ULL, 0x6000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SA_LD_HIT ] = { 0x0000000000000000ULL, 0x0000800000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_MEM_DP_CYC ] = { 0x0000000000020000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_BR_MPRED_CCACHE ] = { 0x0000000000000028ULL, 
0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0002000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU1_REJECT_SET_MPRED ] = { 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_ISSUE_2 ] = { 0x0000000000000000ULL, 0x0010000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU1_REJECT_L2_CORR ] = { 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_DMEM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MEM1_DP_RQ_LOC_GLOB ] = { 0x0000000000000000ULL, 0x0000000100000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC ] = { 0x0000000000000000ULL, 0x0000000000020000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_PRIO_0_CYC ] = { 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { 0x0000000000000000ULL, 0x0000000000040000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU1_REJECT_DERAT_MPRED ] = { 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_VMX1_LD_WRBACK ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000200000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_RL2L3_SHR_CYC ] = { 0x0000000000001000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_IERAT_MISS_16M ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000100ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DATA_FROM_MEM_DP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LARX_L1HIT ] 
= { 0x0010000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2_ST_MISS_DATA ] = { 0x0000000000000000ULL, 0x0000002000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_ST_FOLDED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000004ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DATA_FROM_L35_SHR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_MULT_GPR ] = { 0x0000110000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU0_1FLOP ] = { 0x0000000000000000ULL, 0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_IERAT_MISS_16G ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000100ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_IC_PREF_WRITE ] = { 0x0004000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC ] = { 0x0000000000000000ULL, 0x0000000000020000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU0_FIN ] = { 0x0000000000000000ULL, 0x0200000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_L2_CYC ] = { 0x0000000000040000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DERAT_REF_16G ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_BR_PRED ] = { 0x0000000000000054ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_VMX1_LD_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000600000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SB_CASTOUT_MOD ] = { 0x0000000000000000ULL, 0x0000200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FROM_DMEM ] = { 
0x0000000002000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_L35_SHR_CYC ] = { 0x0000000000200000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU0_NCLD ] = { 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FAB_RETRY_NODE_PUMP ] = { 0x0000000000000000ULL, 0x0000000010000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_VMX0_INST_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000200000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_L25_MOD ] = { 0x0000000000020200ULL, 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_ITLB_ISLB ] = { 0x0000008000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0040000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_CONC_RUN_INST ] = { 0x0020000000000002ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_L2 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PURR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DERAT_MISS_64K ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PMC2_REWIND ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FROM_L2 ] = { 0x0000000004400000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_DISP ] = { 0x0000000000140001ULL, 0x0000000000001000ULL, 0x2000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_L25_SHR ] = { 0x0000000000000200ULL, 
0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L1_DCACHE_RELOAD_VALID ] = { 0x0000000000040000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU1_REJECT_UST ] = { 0x4000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FAB_ADDR_COLLISION ] = { 0x0000000000000000ULL, 0x0000000020000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_FXU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000200000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU0_REJECT_UST ] = { 0x4000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_PMC4_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_L3 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000020000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FROM_L2MISS ] = { 0x0000000004400000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SB_ST_HIT ] = { 0x0000000000000000ULL, 0x0001000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_WT_IC_MISS_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0080040000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DATA_FROM_DL2L3_SHR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_L35_MOD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000010000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU1_FPSCR ] = { 0x0000000000000000ULL, 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_REJECT_UST ] = { 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU0_DERAT_MISS ] = { 0x0000000000000000ULL, 
0x00000000000000a0ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_MEM_DP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000020000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DATA_FROM_L2 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU0_FSQRT_FDIV ] = { 0x0000000000000000ULL, 0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_FXU_SOPS ] = { 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_FPU0_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000800100000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L2SB_LD_MISS_DATA ] = { 0x0000000000000000ULL, 0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_SRQ_EMPTY_CYC ] = { 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0010000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_1PLUS_PPC_DISP ] = { 0x0000000000100000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_VMX_ST_ISSUED ] = { 0x0000000000000000ULL, 0x0000000001800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_L2MISS ] = { 0x0000000000000080ULL, 0x0000100000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU0_REJECT_ULD ] = { 0x8000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_SUSPENDED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000010ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DFU_ADD_SHIFTED_BOTH ] = { 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU_REJECT_NO_SCRATCH ] = { 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_STCX_FAIL ] = { 0x0018000000000000ULL, 
0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU1_DENORM ] = { 0x0000000000000000ULL, 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_GCT_NOSLOT_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0008010000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_DL2L3_SHR_CYC ] = { 0x0000000000008000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_L21 ] = { 0x0000000000080080ULL, 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FPU_1FLOP ] = { 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0400000000000000ULL, 0x0000000000000002ULL }, [ POWER6_PME_PM_LSU1_REJECT ] = { 0x0000000000000000ULL, 0x0000000000000044ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_IC_REQ ] = { 0x0004000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DFU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000200000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_NOT_LLA_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_INST_FROM_L1 ] = { 0x0000000000000000ULL, 0x0000001000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_VMX_COMPLEX_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000100000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_BRU_FIN ] = { 0x0000000000000040ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU1_REJECT_EXTERN ] = { 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_L21_CYC ] = { 0x0000000000080000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_GXI_CYC_BUSY ] = { 0x0000000000000000ULL, 0x0000000e00000000ULL, 0x0000000000000000ULL, 
0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_LD_MISS_L1 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_L1_WRITE_CYC ] = { 0x0000000000000000ULL, 0x0000001000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LLA_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000080000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DATA_FROM_L2MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000001000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_GCT_FULL_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000410000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MEM_DP_RQ_LOC_GLOB ] = { 0x0000000000000000ULL, 0x0000000040000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DATA_FROM_RL2L3_SHR ] = { 0x0000000000000400ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_LSU_REJECT_UST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000010000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_VMX_PERMUTE_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000100000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_PTEG_FROM_L21 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000008000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { 0x0020000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_BR_MPRED ] = { 0x0000000000000040ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LD_REQ_L2 ] = { 0x0000000000000000ULL, 0x0000010000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_FLUSH_ASYNC ] = { 0x0002000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_HV_CYC ] = { 0x0040000000000000ULL, 0x0000000000000000ULL, 0x0001000000000000ULL, 0x0000000000000000ULL }, [ 
POWER6_PME_PM_LSU1_DERAT_MISS ] = { 0x0000000000000000ULL, 0x00000000000000a0ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_DPU_HELD_SMT ] = { 0x0000001000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_LSU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000001000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DATA_FROM_RL2L3_SHR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x000000000000c000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_LSU0_REJECT_STQ_FULL ] = { 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_MRK_DERAT_REF_4K ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000001000000000ULL, 0x0000000000000010ULL }, [ POWER6_PME_PM_FPU_ISSUE_STALL_FPR ] = { 0x0000000000000000ULL, 0x0040000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_IFU_FIN ] = { 0x0000000000000000ULL, 0x0000000000080000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER6_PME_PM_GXO_CYC_BUSY ] = { 0x0000000000000000ULL, 0x0000000e00000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL } }; static const pme_power_entry_t power6_pe[] = { [ POWER6_PME_PM_LSU_REJECT_STQ_FULL ] = { .pme_name = "PM_LSU_REJECT_STQ_FULL", .pme_code = 0x1a0030, .pme_short_desc = "LSU reject due to store queue full", .pme_long_desc = "LSU reject due to store queue full", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT_STQ_FULL], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT_STQ_FULL] }, [ POWER6_PME_PM_DPU_HELD_FXU_MULTI ] = { .pme_name = "PM_DPU_HELD_FXU_MULTI", .pme_code = 0x210a6, .pme_short_desc = "DISP unit held due to FXU multicycle", .pme_long_desc = "DISP unit held due to FXU multicycle", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_FXU_MULTI], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_FXU_MULTI] }, [ POWER6_PME_PM_VMX1_STALL ] = { .pme_name = 
"PM_VMX1_STALL", .pme_code = 0xb008c, .pme_short_desc = "VMX1 stall", .pme_long_desc = "VMX1 stall", .pme_event_ids = power6_event_ids[POWER6_PME_PM_VMX1_STALL], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_VMX1_STALL] }, [ POWER6_PME_PM_PMC2_SAVED ] = { .pme_name = "PM_PMC2_SAVED", .pme_code = 0x100022, .pme_short_desc = "PMC2 rewind value saved", .pme_long_desc = "PMC2 rewind value saved", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PMC2_SAVED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PMC2_SAVED] }, [ POWER6_PME_PM_L2SB_IC_INV ] = { .pme_name = "PM_L2SB_IC_INV", .pme_code = 0x5068c, .pme_short_desc = "L2 slice B I cache invalidate", .pme_long_desc = "L2 slice B I cache invalidate", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SB_IC_INV], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SB_IC_INV] }, [ POWER6_PME_PM_IERAT_MISS_64K ] = { .pme_name = "PM_IERAT_MISS_64K", .pme_code = 0x392076, .pme_short_desc = "IERAT misses for 64K page", .pme_long_desc = "IERAT misses for 64K page", .pme_event_ids = power6_event_ids[POWER6_PME_PM_IERAT_MISS_64K], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_IERAT_MISS_64K] }, [ POWER6_PME_PM_THRD_PRIO_DIFF_3or4_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_3or4_CYC", .pme_code = 0x323040, .pme_short_desc = "Cycles thread priority difference is 3 or 4", .pme_long_desc = "Cycles thread priority difference is 3 or 4", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_PRIO_DIFF_3or4_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_PRIO_DIFF_3or4_CYC] }, [ POWER6_PME_PM_LD_REF_L1_BOTH ] = { .pme_name = "PM_LD_REF_L1_BOTH", .pme_code = 0x180036, .pme_short_desc = "Both units L1 D cache load reference", .pme_long_desc = "Both units L1 D cache load reference", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LD_REF_L1_BOTH], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LD_REF_L1_BOTH] }, [ POWER6_PME_PM_FPU1_FCONV ] = { .pme_name = "PM_FPU1_FCONV", .pme_code = 0xd10a8, 
.pme_short_desc = "FPU1 executed FCONV instruction", .pme_long_desc = "FPU1 executed FCONV instruction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU1_FCONV], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU1_FCONV] }, [ POWER6_PME_PM_IBUF_FULL_COUNT ] = { .pme_name = "PM_IBUF_FULL_COUNT", .pme_code = 0x40085, .pme_short_desc = "Periods instruction buffer full", .pme_long_desc = "Periods instruction buffer full", .pme_event_ids = power6_event_ids[POWER6_PME_PM_IBUF_FULL_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_IBUF_FULL_COUNT] }, [ POWER6_PME_PM_MRK_LSU_DERAT_MISS ] = { .pme_name = "PM_MRK_LSU_DERAT_MISS", .pme_code = 0x400012, .pme_short_desc = "Marked DERAT miss", .pme_long_desc = "Marked DERAT miss", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_LSU_DERAT_MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_LSU_DERAT_MISS] }, [ POWER6_PME_PM_MRK_ST_CMPL ] = { .pme_name = "PM_MRK_ST_CMPL", .pme_code = 0x100006, .pme_short_desc = "Marked store instruction completed", .pme_long_desc = "A sampled store has completed (data home)", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_ST_CMPL], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_ST_CMPL] }, [ POWER6_PME_PM_L2_CASTOUT_MOD ] = { .pme_name = "PM_L2_CASTOUT_MOD", .pme_code = 0x150630, .pme_short_desc = "L2 castouts - Modified (M", .pme_long_desc = " Mu", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2_CASTOUT_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2_CASTOUT_MOD] }, [ POWER6_PME_PM_FPU1_ST_FOLDED ] = { .pme_name = "PM_FPU1_ST_FOLDED", .pme_code = 0xd10ac, .pme_short_desc = "FPU1 folded store", .pme_long_desc = "FPU1 folded store", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU1_ST_FOLDED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU1_ST_FOLDED] }, [ POWER6_PME_PM_MRK_INST_TIMEO ] = { .pme_name = "PM_MRK_INST_TIMEO", .pme_code = 0x40003e, .pme_short_desc = "Marked Instruction finish timeout ", .pme_long_desc = "Marked 
Instruction finish timeout ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_INST_TIMEO], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_INST_TIMEO] }, [ POWER6_PME_PM_DPU_WT ] = { .pme_name = "PM_DPU_WT", .pme_code = 0x300004, .pme_short_desc = "Cycles DISP unit is stalled waiting for instructions", .pme_long_desc = "Cycles DISP unit is stalled waiting for instructions", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_WT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_WT] }, [ POWER6_PME_PM_DPU_HELD_RESTART ] = { .pme_name = "PM_DPU_HELD_RESTART", .pme_code = 0x30086, .pme_short_desc = "DISP unit held after restart coming", .pme_long_desc = "DISP unit held after restart coming", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_RESTART], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_RESTART] }, [ POWER6_PME_PM_IERAT_MISS ] = { .pme_name = "PM_IERAT_MISS", .pme_code = 0x420ce, .pme_short_desc = "IERAT miss count", .pme_long_desc = "IERAT miss count", .pme_event_ids = power6_event_ids[POWER6_PME_PM_IERAT_MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_IERAT_MISS] }, [ POWER6_PME_PM_FPU_SINGLE ] = { .pme_name = "PM_FPU_SINGLE", .pme_code = 0x4c1030, .pme_short_desc = "FPU executed single precision instruction", .pme_long_desc = "FPU is executing single precision instruction. 
Combined Unit 0 + Unit 1", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_SINGLE], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_SINGLE] }, [ POWER6_PME_PM_MRK_PTEG_FROM_LMEM ] = { .pme_name = "PM_MRK_PTEG_FROM_LMEM", .pme_code = 0x412042, .pme_short_desc = "Marked PTEG loaded from local memory", .pme_long_desc = "Marked PTEG loaded from local memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_LMEM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_LMEM] }, [ POWER6_PME_PM_HV_COUNT ] = { .pme_name = "PM_HV_COUNT", .pme_code = 0x200017, .pme_short_desc = "Hypervisor Periods", .pme_long_desc = "Periods when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0)", .pme_event_ids = power6_event_ids[POWER6_PME_PM_HV_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_HV_COUNT] }, [ POWER6_PME_PM_L2SA_ST_HIT ] = { .pme_name = "PM_L2SA_ST_HIT", .pme_code = 0x50786, .pme_short_desc = "L2 slice A store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. 
This event is provided on each of the three L2 slices A,B, and C.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SA_ST_HIT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SA_ST_HIT] }, [ POWER6_PME_PM_L2_LD_MISS_INST ] = { .pme_name = "PM_L2_LD_MISS_INST", .pme_code = 0x250530, .pme_short_desc = "L2 instruction load misses", .pme_long_desc = "L2 instruction load misses", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2_LD_MISS_INST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2_LD_MISS_INST] }, [ POWER6_PME_PM_EXT_INT ] = { .pme_name = "PM_EXT_INT", .pme_code = 0x2000f8, .pme_short_desc = "External interrupts", .pme_long_desc = "An external interrupt occurred", .pme_event_ids = power6_event_ids[POWER6_PME_PM_EXT_INT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_EXT_INT] }, [ POWER6_PME_PM_LSU1_LDF ] = { .pme_name = "PM_LSU1_LDF", .pme_code = 0x8008c, .pme_short_desc = "LSU1 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed from LSU unit 1", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU1_LDF], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU1_LDF] }, [ POWER6_PME_PM_FAB_CMD_ISSUED ] = { .pme_name = "PM_FAB_CMD_ISSUED", .pme_code = 0x150130, .pme_short_desc = "Fabric command issued", .pme_long_desc = "Fabric command issued", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FAB_CMD_ISSUED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FAB_CMD_ISSUED] }, [ POWER6_PME_PM_PTEG_FROM_L21 ] = { .pme_name = "PM_PTEG_FROM_L21", .pme_code = 0x213048, .pme_short_desc = "PTEG loaded from private L2 other core", .pme_long_desc = "PTEG loaded from private L2 other core", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_L21], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PTEG_FROM_L21] }, [ POWER6_PME_PM_L2SA_MISS ] = { .pme_name = "PM_L2SA_MISS", .pme_code = 0x50584, .pme_short_desc = "L2 slice A misses", .pme_long_desc = "L2 slice A misses", .pme_event_ids = 
power6_event_ids[POWER6_PME_PM_L2SA_MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SA_MISS] }, [ POWER6_PME_PM_PTEG_FROM_RL2L3_MOD ] = { .pme_name = "PM_PTEG_FROM_RL2L3_MOD", .pme_code = 0x11304c, .pme_short_desc = "PTEG loaded from remote L2 or L3 modified", .pme_long_desc = "PTEG loaded from remote L2 or L3 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_RL2L3_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PTEG_FROM_RL2L3_MOD] }, [ POWER6_PME_PM_DPU_WT_COUNT ] = { .pme_name = "PM_DPU_WT_COUNT", .pme_code = 0x300005, .pme_short_desc = "Periods DISP unit is stalled waiting for instructions", .pme_long_desc = "Periods DISP unit is stalled waiting for instructions", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_WT_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_WT_COUNT] }, [ POWER6_PME_PM_MRK_PTEG_FROM_L25_MOD ] = { .pme_name = "PM_MRK_PTEG_FROM_L25_MOD", .pme_code = 0x312046, .pme_short_desc = "Marked PTEG loaded from L2.5 modified", .pme_long_desc = "Marked PTEG loaded from L2.5 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_L25_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_L25_MOD] }, [ POWER6_PME_PM_LD_HIT_L2 ] = { .pme_name = "PM_LD_HIT_L2", .pme_code = 0x250730, .pme_short_desc = "L2 D cache load hits", .pme_long_desc = "L2 D cache load hits", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LD_HIT_L2], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LD_HIT_L2] }, [ POWER6_PME_PM_PTEG_FROM_DL2L3_SHR ] = { .pme_name = "PM_PTEG_FROM_DL2L3_SHR", .pme_code = 0x31304c, .pme_short_desc = "PTEG loaded from distant L2 or L3 shared", .pme_long_desc = "PTEG loaded from distant L2 or L3 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_DL2L3_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PTEG_FROM_DL2L3_SHR] }, [ POWER6_PME_PM_MEM_DP_RQ_GLOB_LOC ] = { .pme_name = "PM_MEM_DP_RQ_GLOB_LOC", .pme_code = 0x150230, 
.pme_short_desc = "Memory read queue marking cache line double pump state from global to local", .pme_long_desc = "Memory read queue marking cache line double pump state from global to local", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MEM_DP_RQ_GLOB_LOC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MEM_DP_RQ_GLOB_LOC] }, [ POWER6_PME_PM_L3SA_MISS ] = { .pme_name = "PM_L3SA_MISS", .pme_code = 0x50084, .pme_short_desc = "L3 slice A misses", .pme_long_desc = "L3 slice A misses", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L3SA_MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L3SA_MISS] }, [ POWER6_PME_PM_NO_ITAG_COUNT ] = { .pme_name = "PM_NO_ITAG_COUNT", .pme_code = 0x40089, .pme_short_desc = "Periods no ITAG available", .pme_long_desc = "Periods no ITAG available", .pme_event_ids = power6_event_ids[POWER6_PME_PM_NO_ITAG_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_NO_ITAG_COUNT] }, [ POWER6_PME_PM_DSLB_MISS ] = { .pme_name = "PM_DSLB_MISS", .pme_code = 0x830e8, .pme_short_desc = "Data SLB misses", .pme_long_desc = "A SLB miss for a data request occurred. 
SLB misses trap to the operating system to resolve", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DSLB_MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DSLB_MISS] }, [ POWER6_PME_PM_LSU_FLUSH_ALIGN ] = { .pme_name = "PM_LSU_FLUSH_ALIGN", .pme_code = 0x220cc, .pme_short_desc = "Flush caused by alignment exception", .pme_long_desc = "Flush caused by alignment exception", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_FLUSH_ALIGN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_FLUSH_ALIGN] }, [ POWER6_PME_PM_DPU_HELD_FPU_CR ] = { .pme_name = "PM_DPU_HELD_FPU_CR", .pme_code = 0x210a0, .pme_short_desc = "DISP unit held due to FPU updating CR", .pme_long_desc = "DISP unit held due to FPU updating CR", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_FPU_CR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_FPU_CR] }, [ POWER6_PME_PM_PTEG_FROM_L2MISS ] = { .pme_name = "PM_PTEG_FROM_L2MISS", .pme_code = 0x113028, .pme_short_desc = "PTEG loaded from L2 miss", .pme_long_desc = "PTEG loaded from L2 miss", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_L2MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PTEG_FROM_L2MISS] }, [ POWER6_PME_PM_MRK_DATA_FROM_DMEM ] = { .pme_name = "PM_MRK_DATA_FROM_DMEM", .pme_code = 0x20304a, .pme_short_desc = "Marked data loaded from distant memory", .pme_long_desc = "Marked data loaded from distant memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_DMEM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_DMEM] }, [ POWER6_PME_PM_PTEG_FROM_LMEM ] = { .pme_name = "PM_PTEG_FROM_LMEM", .pme_code = 0x41304a, .pme_short_desc = "PTEG loaded from local memory", .pme_long_desc = "PTEG loaded from local memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_LMEM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PTEG_FROM_LMEM] }, [ POWER6_PME_PM_MRK_DERAT_REF_64K ] = { .pme_name = "PM_MRK_DERAT_REF_64K", .pme_code = 0x182044, 
.pme_short_desc = "Marked DERAT reference for 64K page", .pme_long_desc = "Marked DERAT reference for 64K page", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DERAT_REF_64K], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DERAT_REF_64K] }, [ POWER6_PME_PM_L2SA_LD_REQ_INST ] = { .pme_name = "PM_L2SA_LD_REQ_INST", .pme_code = 0x50580, .pme_short_desc = "L2 slice A instruction load requests", .pme_long_desc = "L2 slice A instruction load requests", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SA_LD_REQ_INST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SA_LD_REQ_INST] }, [ POWER6_PME_PM_MRK_DERAT_MISS_16M ] = { .pme_name = "PM_MRK_DERAT_MISS_16M", .pme_code = 0x392044, .pme_short_desc = "Marked DERAT misses for 16M page", .pme_long_desc = "A marked data request (load or store) missed the ERAT for 16M page and resulted in an ERAT reload.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DERAT_MISS_16M], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DERAT_MISS_16M] }, [ POWER6_PME_PM_DATA_FROM_DL2L3_MOD ] = { .pme_name = "PM_DATA_FROM_DL2L3_MOD", .pme_code = 0x40005c, .pme_short_desc = "Data loaded from distant L2 or L3 modified", .pme_long_desc = "Data loaded from distant L2 or L3 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_DL2L3_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_DL2L3_MOD] }, [ POWER6_PME_PM_FPU0_FXMULT ] = { .pme_name = "PM_FPU0_FXMULT", .pme_code = 0xd0086, .pme_short_desc = "FPU0 executed fixed point multiplication", .pme_long_desc = "FPU0 executed fixed point multiplication", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU0_FXMULT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU0_FXMULT] }, [ POWER6_PME_PM_L3SB_MISS ] = { .pme_name = "PM_L3SB_MISS", .pme_code = 0x5008c, .pme_short_desc = "L3 slice B misses", .pme_long_desc = "L3 slice B misses", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L3SB_MISS], .pme_group_vector = 
power6_group_vecs[POWER6_PME_PM_L3SB_MISS] }, [ POWER6_PME_PM_STCX_CANCEL ] = { .pme_name = "PM_STCX_CANCEL", .pme_code = 0x830ec, .pme_short_desc = "stcx cancel by core", .pme_long_desc = "stcx cancel by core", .pme_event_ids = power6_event_ids[POWER6_PME_PM_STCX_CANCEL], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_STCX_CANCEL] }, [ POWER6_PME_PM_L2SA_LD_MISS_DATA ] = { .pme_name = "PM_L2SA_LD_MISS_DATA", .pme_code = 0x50482, .pme_short_desc = "L2 slice A data load misses", .pme_long_desc = "L2 slice A data load misses", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SA_LD_MISS_DATA], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SA_LD_MISS_DATA] }, [ POWER6_PME_PM_IC_INV_L2 ] = { .pme_name = "PM_IC_INV_L2", .pme_code = 0x250632, .pme_short_desc = "L1 I cache entries invalidated from L2", .pme_long_desc = "L1 I cache entries invalidated from L2", .pme_event_ids = power6_event_ids[POWER6_PME_PM_IC_INV_L2], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_IC_INV_L2] }, [ POWER6_PME_PM_DPU_HELD ] = { .pme_name = "PM_DPU_HELD", .pme_code = 0x200004, .pme_short_desc = "DISP unit held", .pme_long_desc = "DISP unit held", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD] }, [ POWER6_PME_PM_PMC1_OVERFLOW ] = { .pme_name = "PM_PMC1_OVERFLOW", .pme_code = 0x200014, .pme_short_desc = "PMC1 Overflow", .pme_long_desc = "PMC1 Overflow", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PMC1_OVERFLOW], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PMC1_OVERFLOW] }, [ POWER6_PME_PM_THRD_PRIO_6_CYC ] = { .pme_name = "PM_THRD_PRIO_6_CYC", .pme_code = 0x222046, .pme_short_desc = "Cycles thread running at priority level 6", .pme_long_desc = "Cycles thread running at priority level 6", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_PRIO_6_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_PRIO_6_CYC] }, [ POWER6_PME_PM_MRK_PTEG_FROM_L3MISS ] = { .pme_name = 
"PM_MRK_PTEG_FROM_L3MISS", .pme_code = 0x312054, .pme_short_desc = "Marked PTEG loaded from L3 miss", .pme_long_desc = "Marked PTEG loaded from L3 miss", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_L3MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_L3MISS] }, [ POWER6_PME_PM_MRK_LSU0_REJECT_UST ] = { .pme_name = "PM_MRK_LSU0_REJECT_UST", .pme_code = 0x930e2, .pme_short_desc = "LSU0 marked unaligned store reject", .pme_long_desc = "LSU0 marked unaligned store reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_LSU0_REJECT_UST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_LSU0_REJECT_UST] }, [ POWER6_PME_PM_MRK_INST_DISP ] = { .pme_name = "PM_MRK_INST_DISP", .pme_code = 0x10001a, .pme_short_desc = "Marked instruction dispatched", .pme_long_desc = "Marked instruction dispatched", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_INST_DISP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_INST_DISP] }, [ POWER6_PME_PM_LARX ] = { .pme_name = "PM_LARX", .pme_code = 0x830ea, .pme_short_desc = "Larx executed", .pme_long_desc = "Larx executed", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LARX], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LARX] }, [ POWER6_PME_PM_INST_CMPL ] = { .pme_name = "PM_INST_CMPL", .pme_code = 0x2, .pme_short_desc = "Instructions completed", .pme_long_desc = "Number of PPC instructions completed. 
", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_CMPL], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_CMPL] }, [ POWER6_PME_PM_FXU_IDLE ] = { .pme_name = "PM_FXU_IDLE", .pme_code = 0x100050, .pme_short_desc = "FXU idle", .pme_long_desc = "FXU0 and FXU1 are both idle", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FXU_IDLE], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FXU_IDLE] }, [ POWER6_PME_PM_MRK_DATA_FROM_DL2L3_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_MOD", .pme_code = 0x40304c, .pme_short_desc = "Marked data loaded from distant L2 or L3 modified", .pme_long_desc = "Marked data loaded from distant L2 or L3 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_DL2L3_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_DL2L3_MOD] }, [ POWER6_PME_PM_L2_LD_REQ_DATA ] = { .pme_name = "PM_L2_LD_REQ_DATA", .pme_code = 0x150430, .pme_short_desc = "L2 data load requests", .pme_long_desc = "L2 data load requests", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2_LD_REQ_DATA], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2_LD_REQ_DATA] }, [ POWER6_PME_PM_LSU_DERAT_MISS_CYC ] = { .pme_name = "PM_LSU_DERAT_MISS_CYC", .pme_code = 0x1000fc, .pme_short_desc = "DERAT miss latency", .pme_long_desc = "DERAT miss latency", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_DERAT_MISS_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_DERAT_MISS_CYC] }, [ POWER6_PME_PM_DPU_HELD_POWER_COUNT ] = { .pme_name = "PM_DPU_HELD_POWER_COUNT", .pme_code = 0x20003d, .pme_short_desc = "Periods DISP unit held due to Power Management", .pme_long_desc = "Periods DISP unit held due to Power Management", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_POWER_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_POWER_COUNT] }, [ POWER6_PME_PM_INST_FROM_RL2L3_MOD ] = { .pme_name = "PM_INST_FROM_RL2L3_MOD", .pme_code = 0x142044, .pme_short_desc = "Instruction fetched from remote L2 or 
L3 modified", .pme_long_desc = "Instruction fetched from remote L2 or L3 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_RL2L3_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_RL2L3_MOD] }, [ POWER6_PME_PM_DATA_FROM_DMEM_CYC ] = { .pme_name = "PM_DATA_FROM_DMEM_CYC", .pme_code = 0x20002e, .pme_short_desc = "Load latency from distant memory", .pme_long_desc = "Load latency from distant memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_DMEM_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_DMEM_CYC] }, [ POWER6_PME_PM_DATA_FROM_DMEM ] = { .pme_name = "PM_DATA_FROM_DMEM", .pme_code = 0x20005e, .pme_short_desc = "Data loaded from distant memory", .pme_long_desc = "Data loaded from distant memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_DMEM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_DMEM] }, [ POWER6_PME_PM_LSU_REJECT_PARTIAL_SECTOR ] = { .pme_name = "PM_LSU_REJECT_PARTIAL_SECTOR", .pme_code = 0x1a0032, .pme_short_desc = "LSU reject due to partial sector valid", .pme_long_desc = "LSU reject due to partial sector valid", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT_PARTIAL_SECTOR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT_PARTIAL_SECTOR] }, [ POWER6_PME_PM_LSU_REJECT_DERAT_MPRED ] = { .pme_name = "PM_LSU_REJECT_DERAT_MPRED", .pme_code = 0x2a0030, .pme_short_desc = "LSU reject due to mispredicted DERAT", .pme_long_desc = "LSU reject due to mispredicted DERAT", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT_DERAT_MPRED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT_DERAT_MPRED] }, [ POWER6_PME_PM_LSU1_REJECT_ULD ] = { .pme_name = "PM_LSU1_REJECT_ULD", .pme_code = 0x90088, .pme_short_desc = "LSU1 unaligned load reject", .pme_long_desc = "LSU1 unaligned load reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU1_REJECT_ULD], .pme_group_vector = 
power6_group_vecs[POWER6_PME_PM_LSU1_REJECT_ULD] }, [ POWER6_PME_PM_DATA_FROM_L3_CYC ] = { .pme_name = "PM_DATA_FROM_L3_CYC", .pme_code = 0x200022, .pme_short_desc = "Load latency from L3", .pme_long_desc = "Load latency from L3", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_L3_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_L3_CYC] }, [ POWER6_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { .pme_name = "PM_FXU1_BUSY_FXU0_IDLE", .pme_code = 0x400050, .pme_short_desc = "FXU1 busy FXU0 idle", .pme_long_desc = "FXU0 was idle while FXU1 was busy", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FXU1_BUSY_FXU0_IDLE], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FXU1_BUSY_FXU0_IDLE] }, [ POWER6_PME_PM_INST_FROM_MEM_DP ] = { .pme_name = "PM_INST_FROM_MEM_DP", .pme_code = 0x142042, .pme_short_desc = "Instruction fetched from double pump memory", .pme_long_desc = "Instruction fetched from double pump memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_MEM_DP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_MEM_DP] }, [ POWER6_PME_PM_LSU_FLUSH_DSI ] = { .pme_name = "PM_LSU_FLUSH_DSI", .pme_code = 0x220ce, .pme_short_desc = "Flush caused by DSI", .pme_long_desc = "Flush caused by DSI", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_FLUSH_DSI], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_FLUSH_DSI] }, [ POWER6_PME_PM_MRK_DERAT_REF_16G ] = { .pme_name = "PM_MRK_DERAT_REF_16G", .pme_code = 0x482044, .pme_short_desc = "Marked DERAT reference for 16G page", .pme_long_desc = "Marked DERAT reference for 16G page", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DERAT_REF_16G], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DERAT_REF_16G] }, [ POWER6_PME_PM_LSU_LDF_BOTH ] = { .pme_name = "PM_LSU_LDF_BOTH", .pme_code = 0x180038, .pme_short_desc = "Both LSU units executed Floating Point load instruction", .pme_long_desc = "Both LSU units executed Floating Point load instruction", .pme_event_ids = 
power6_event_ids[POWER6_PME_PM_LSU_LDF_BOTH], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_LDF_BOTH] }, [ POWER6_PME_PM_FPU1_1FLOP ] = { .pme_name = "PM_FPU1_1FLOP", .pme_code = 0xc0088, .pme_short_desc = "FPU1 executed add", .pme_long_desc = " mult", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU1_1FLOP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU1_1FLOP] }, [ POWER6_PME_PM_DATA_FROM_RMEM_CYC ] = { .pme_name = "PM_DATA_FROM_RMEM_CYC", .pme_code = 0x40002c, .pme_short_desc = "Load latency from remote memory", .pme_long_desc = "Load latency from remote memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_RMEM_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_RMEM_CYC] }, [ POWER6_PME_PM_INST_PTEG_SECONDARY ] = { .pme_name = "PM_INST_PTEG_SECONDARY", .pme_code = 0x910ac, .pme_short_desc = "Instruction table walk matched in secondary PTEG", .pme_long_desc = "Instruction table walk matched in secondary PTEG", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_PTEG_SECONDARY], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_PTEG_SECONDARY] }, [ POWER6_PME_PM_L1_ICACHE_MISS ] = { .pme_name = "PM_L1_ICACHE_MISS", .pme_code = 0x100056, .pme_short_desc = "L1 I cache miss count", .pme_long_desc = "L1 I cache miss count", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L1_ICACHE_MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L1_ICACHE_MISS] }, [ POWER6_PME_PM_INST_DISP_LLA ] = { .pme_name = "PM_INST_DISP_LLA", .pme_code = 0x310a2, .pme_short_desc = "Instruction dispatched under load look ahead", .pme_long_desc = "Instruction dispatched under load look ahead", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_DISP_LLA], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_DISP_LLA] }, [ POWER6_PME_PM_THRD_BOTH_RUN_CYC ] = { .pme_name = "PM_THRD_BOTH_RUN_CYC", .pme_code = 0x400004, .pme_short_desc = "Both threads in run cycles", .pme_long_desc = "Both threads in run cycles", 
.pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_BOTH_RUN_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_BOTH_RUN_CYC] }, [ POWER6_PME_PM_LSU_ST_CHAINED ] = { .pme_name = "PM_LSU_ST_CHAINED", .pme_code = 0x820ce, .pme_short_desc = "number of chained stores", .pme_long_desc = "number of chained stores", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_ST_CHAINED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_ST_CHAINED] }, [ POWER6_PME_PM_FPU1_FXDIV ] = { .pme_name = "PM_FPU1_FXDIV", .pme_code = 0xc10a8, .pme_short_desc = "FPU1 executed fixed point division", .pme_long_desc = "FPU1 executed fixed point division", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU1_FXDIV], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU1_FXDIV] }, [ POWER6_PME_PM_FREQ_UP ] = { .pme_name = "PM_FREQ_UP", .pme_code = 0x40003c, .pme_short_desc = "Frequency is being slewed up due to Power Management", .pme_long_desc = "Frequency is being slewed up due to Power Management", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FREQ_UP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FREQ_UP] }, [ POWER6_PME_PM_FAB_RETRY_SYS_PUMP ] = { .pme_name = "PM_FAB_RETRY_SYS_PUMP", .pme_code = 0x50182, .pme_short_desc = "Retry of a system pump", .pme_long_desc = " locally mastered ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FAB_RETRY_SYS_PUMP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FAB_RETRY_SYS_PUMP] }, [ POWER6_PME_PM_DATA_FROM_LMEM ] = { .pme_name = "PM_DATA_FROM_LMEM", .pme_code = 0x40005e, .pme_short_desc = "Data loaded from local memory", .pme_long_desc = "Data loaded from local memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_LMEM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_LMEM] }, [ POWER6_PME_PM_PMC3_OVERFLOW ] = { .pme_name = "PM_PMC3_OVERFLOW", .pme_code = 0x400014, .pme_short_desc = "PMC3 Overflow", .pme_long_desc = "PMC3 Overflow", .pme_event_ids = 
power6_event_ids[POWER6_PME_PM_PMC3_OVERFLOW], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PMC3_OVERFLOW] }, [ POWER6_PME_PM_LSU0_REJECT_SET_MPRED ] = { .pme_name = "PM_LSU0_REJECT_SET_MPRED", .pme_code = 0xa0084, .pme_short_desc = "LSU0 reject due to mispredicted set", .pme_long_desc = "LSU0 reject due to mispredicted set", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU0_REJECT_SET_MPRED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU0_REJECT_SET_MPRED] }, [ POWER6_PME_PM_LSU0_REJECT_DERAT_MPRED ] = { .pme_name = "PM_LSU0_REJECT_DERAT_MPRED", .pme_code = 0xa0082, .pme_short_desc = "LSU0 reject due to mispredicted DERAT", .pme_long_desc = "LSU0 reject due to mispredicted DERAT", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU0_REJECT_DERAT_MPRED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU0_REJECT_DERAT_MPRED] }, [ POWER6_PME_PM_LSU1_REJECT_STQ_FULL ] = { .pme_name = "PM_LSU1_REJECT_STQ_FULL", .pme_code = 0xa0088, .pme_short_desc = "LSU1 reject due to store queue full", .pme_long_desc = "LSU1 reject due to store queue full", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU1_REJECT_STQ_FULL], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU1_REJECT_STQ_FULL] }, [ POWER6_PME_PM_MRK_BR_MPRED ] = { .pme_name = "PM_MRK_BR_MPRED", .pme_code = 0x300052, .pme_short_desc = "Marked branch mispredicted", .pme_long_desc = "Marked branch mispredicted", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_BR_MPRED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_BR_MPRED] }, [ POWER6_PME_PM_L2SA_ST_MISS ] = { .pme_name = "PM_L2SA_ST_MISS", .pme_code = 0x50486, .pme_short_desc = "L2 slice A store misses", .pme_long_desc = "L2 slice A store misses", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SA_ST_MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SA_ST_MISS] }, [ POWER6_PME_PM_LSU0_REJECT_EXTERN ] = { .pme_name = "PM_LSU0_REJECT_EXTERN", .pme_code = 0xa10a4, .pme_short_desc = "LSU0 external reject request 
", .pme_long_desc = "LSU0 external reject request ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU0_REJECT_EXTERN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU0_REJECT_EXTERN] }, [ POWER6_PME_PM_MRK_BR_TAKEN ] = { .pme_name = "PM_MRK_BR_TAKEN", .pme_code = 0x100052, .pme_short_desc = "Marked branch taken", .pme_long_desc = "Marked branch taken", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_BR_TAKEN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_BR_TAKEN] }, [ POWER6_PME_PM_ISLB_MISS ] = { .pme_name = "PM_ISLB_MISS", .pme_code = 0x830e0, .pme_short_desc = "Instruction SLB misses", .pme_long_desc = "A SLB miss for an instruction fetch as occurred", .pme_event_ids = power6_event_ids[POWER6_PME_PM_ISLB_MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_ISLB_MISS] }, [ POWER6_PME_PM_CYC ] = { .pme_name = "PM_CYC", .pme_code = 0x1e, .pme_short_desc = "Processor cycles", .pme_long_desc = "Processor cycles", .pme_event_ids = power6_event_ids[POWER6_PME_PM_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_CYC] }, [ POWER6_PME_PM_FPU_FXDIV ] = { .pme_name = "PM_FPU_FXDIV", .pme_code = 0x1c1034, .pme_short_desc = "FPU executed fixed point division", .pme_long_desc = "FPU executed fixed point division", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_FXDIV], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_FXDIV] }, [ POWER6_PME_PM_DPU_HELD_LLA_END ] = { .pme_name = "PM_DPU_HELD_LLA_END", .pme_code = 0x30084, .pme_short_desc = "DISP unit held due to load look ahead ended", .pme_long_desc = "DISP unit held due to load look ahead ended", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_LLA_END], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_LLA_END] }, [ POWER6_PME_PM_MEM0_DP_CL_WR_LOC ] = { .pme_name = "PM_MEM0_DP_CL_WR_LOC", .pme_code = 0x50286, .pme_short_desc = "cacheline write setting dp to local side 0", .pme_long_desc = "cacheline write setting dp to local side 0", .pme_event_ids = 
power6_event_ids[POWER6_PME_PM_MEM0_DP_CL_WR_LOC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MEM0_DP_CL_WR_LOC] }, [ POWER6_PME_PM_MRK_LSU_REJECT_ULD ] = { .pme_name = "PM_MRK_LSU_REJECT_ULD", .pme_code = 0x193034, .pme_short_desc = "Marked unaligned load reject", .pme_long_desc = "Marked unaligned load reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_LSU_REJECT_ULD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_LSU_REJECT_ULD] }, [ POWER6_PME_PM_1PLUS_PPC_CMPL ] = { .pme_name = "PM_1PLUS_PPC_CMPL", .pme_code = 0x100004, .pme_short_desc = "One or more PPC instruction completed", .pme_long_desc = "A group containing at least one PPC instruction completed. For microcoded instructions that span multiple groups, this will only occur once.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_1PLUS_PPC_CMPL], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_1PLUS_PPC_CMPL] }, [ POWER6_PME_PM_PTEG_FROM_DMEM ] = { .pme_name = "PM_PTEG_FROM_DMEM", .pme_code = 0x21304a, .pme_short_desc = "PTEG loaded from distant memory", .pme_long_desc = "PTEG loaded from distant memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_DMEM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PTEG_FROM_DMEM] }, [ POWER6_PME_PM_DPU_WT_BR_MPRED_COUNT ] = { .pme_name = "PM_DPU_WT_BR_MPRED_COUNT", .pme_code = 0x40000d, .pme_short_desc = "Periods DISP unit is stalled due to branch misprediction", .pme_long_desc = "Periods DISP unit is stalled due to branch misprediction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_WT_BR_MPRED_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_WT_BR_MPRED_COUNT] }, [ POWER6_PME_PM_GCT_FULL_CYC ] = { .pme_name = "PM_GCT_FULL_CYC", .pme_code = 0x40086, .pme_short_desc = "Cycles GCT full", .pme_long_desc = "The ISU sends a signal indicating the gct is full. 
", .pme_event_ids = power6_event_ids[POWER6_PME_PM_GCT_FULL_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_GCT_FULL_CYC] }, [ POWER6_PME_PM_INST_FROM_L25_SHR ] = { .pme_name = "PM_INST_FROM_L25_SHR", .pme_code = 0x442046, .pme_short_desc = "Instruction fetched from L2.5 shared", .pme_long_desc = "Instruction fetched from L2.5 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_L25_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_L25_SHR] }, [ POWER6_PME_PM_MRK_DERAT_MISS_4K ] = { .pme_name = "PM_MRK_DERAT_MISS_4K", .pme_code = 0x292044, .pme_short_desc = "Marked DERAT misses for 4K page", .pme_long_desc = "A marked data request (load or store) missed the ERAT for 4K page and resulted in an ERAT reload.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DERAT_MISS_4K], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DERAT_MISS_4K] }, [ POWER6_PME_PM_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_DC_PREF_STREAM_ALLOC", .pme_code = 0x810a2, .pme_short_desc = "D cache new prefetch stream allocated", .pme_long_desc = "A new Prefetch Stream was allocated", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DC_PREF_STREAM_ALLOC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DC_PREF_STREAM_ALLOC] }, [ POWER6_PME_PM_FPU1_FIN ] = { .pme_name = "PM_FPU1_FIN", .pme_code = 0xd0088, .pme_short_desc = "FPU1 produced a result", .pme_long_desc = "fp1 finished, produced a result. This only indicates finish, not completion. ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU1_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU1_FIN] }, [ POWER6_PME_PM_BR_MPRED_TA ] = { .pme_name = "PM_BR_MPRED_TA", .pme_code = 0x410ac, .pme_short_desc = "Branch mispredictions due to target address", .pme_long_desc = "branch miss predict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. 
This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_BR_MPRED_TA], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_BR_MPRED_TA] }, [ POWER6_PME_PM_DPU_HELD_POWER ] = { .pme_name = "PM_DPU_HELD_POWER", .pme_code = 0x20003c, .pme_short_desc = "DISP unit held due to Power Management", .pme_long_desc = "DISP unit held due to Power Management", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_POWER], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_POWER] }, [ POWER6_PME_PM_RUN_INST_CMPL ] = { .pme_name = "PM_RUN_INST_CMPL", .pme_code = 0x500009, .pme_short_desc = "Run instructions completed", .pme_long_desc = "Number of run instructions completed. ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_RUN_INST_CMPL], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_RUN_INST_CMPL] }, [ POWER6_PME_PM_GCT_EMPTY_CYC ] = { .pme_name = "PM_GCT_EMPTY_CYC", .pme_code = 0x1000f8, .pme_short_desc = "Cycles GCT empty", .pme_long_desc = "The Global Completion Table is completely empty", .pme_event_ids = power6_event_ids[POWER6_PME_PM_GCT_EMPTY_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_GCT_EMPTY_CYC] }, [ POWER6_PME_PM_LLA_COUNT ] = { .pme_name = "PM_LLA_COUNT", .pme_code = 0xc01f, .pme_short_desc = "Transitions into Load Look Ahead mode", .pme_long_desc = "Transitions into Load Look Ahead mode", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LLA_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LLA_COUNT] }, [ POWER6_PME_PM_LSU0_REJECT_NO_SCRATCH ] = { .pme_name = "PM_LSU0_REJECT_NO_SCRATCH", .pme_code = 0xa10a2, .pme_short_desc = "LSU0 reject due to scratch register not available", .pme_long_desc = "LSU0 reject due to scratch register not available", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU0_REJECT_NO_SCRATCH], .pme_group_vector = 
power6_group_vecs[POWER6_PME_PM_LSU0_REJECT_NO_SCRATCH] }, [ POWER6_PME_PM_DPU_WT_IC_MISS ] = { .pme_name = "PM_DPU_WT_IC_MISS", .pme_code = 0x20000c, .pme_short_desc = "Cycles DISP unit is stalled due to I cache miss", .pme_long_desc = "Cycles DISP unit is stalled due to I cache miss", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_WT_IC_MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_WT_IC_MISS] }, [ POWER6_PME_PM_DATA_FROM_L3MISS ] = { .pme_name = "PM_DATA_FROM_L3MISS", .pme_code = 0x3000fe, .pme_short_desc = "Data loaded from private L3 miss", .pme_long_desc = "Data loaded from private L3 miss", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_L3MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_L3MISS] }, [ POWER6_PME_PM_FPU_FPSCR ] = { .pme_name = "PM_FPU_FPSCR", .pme_code = 0x2d0032, .pme_short_desc = "FPU executed FPSCR instruction", .pme_long_desc = "FPU executed FPSCR instruction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_FPSCR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_FPSCR] }, [ POWER6_PME_PM_VMX1_INST_ISSUED ] = { .pme_name = "PM_VMX1_INST_ISSUED", .pme_code = 0x60088, .pme_short_desc = "VMX1 instruction issued", .pme_long_desc = "VMX1 instruction issued", .pme_event_ids = power6_event_ids[POWER6_PME_PM_VMX1_INST_ISSUED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_VMX1_INST_ISSUED] }, [ POWER6_PME_PM_FLUSH ] = { .pme_name = "PM_FLUSH", .pme_code = 0x100010, .pme_short_desc = "Flushes", .pme_long_desc = "Flushes", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FLUSH], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FLUSH] }, [ POWER6_PME_PM_ST_HIT_L2 ] = { .pme_name = "PM_ST_HIT_L2", .pme_code = 0x150732, .pme_short_desc = "L2 D cache store hits", .pme_long_desc = "L2 D cache store hits", .pme_event_ids = power6_event_ids[POWER6_PME_PM_ST_HIT_L2], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_ST_HIT_L2] }, [ POWER6_PME_PM_SYNC_CYC ] = { .pme_name = 
"PM_SYNC_CYC", .pme_code = 0x920cc, .pme_short_desc = "Sync duration", .pme_long_desc = "Sync duration", .pme_event_ids = power6_event_ids[POWER6_PME_PM_SYNC_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_SYNC_CYC] }, [ POWER6_PME_PM_FAB_SYS_PUMP ] = { .pme_name = "PM_FAB_SYS_PUMP", .pme_code = 0x50180, .pme_short_desc = "System pump operation", .pme_long_desc = " locally mastered", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FAB_SYS_PUMP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FAB_SYS_PUMP] }, [ POWER6_PME_PM_IC_PREF_REQ ] = { .pme_name = "PM_IC_PREF_REQ", .pme_code = 0x4008c, .pme_short_desc = "Instruction prefetch requests", .pme_long_desc = "Asserted when a non-canceled prefetch is made to the cache interface unit (CIU).", .pme_event_ids = power6_event_ids[POWER6_PME_PM_IC_PREF_REQ], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_IC_PREF_REQ] }, [ POWER6_PME_PM_MEM0_DP_RQ_GLOB_LOC ] = { .pme_name = "PM_MEM0_DP_RQ_GLOB_LOC", .pme_code = 0x50280, .pme_short_desc = "Memory read queue marking cache line double pump state from global to local side 0", .pme_long_desc = "Memory read queue marking cache line double pump state from global to local side 0", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MEM0_DP_RQ_GLOB_LOC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MEM0_DP_RQ_GLOB_LOC] }, [ POWER6_PME_PM_FPU_ISSUE_0 ] = { .pme_name = "PM_FPU_ISSUE_0", .pme_code = 0x320c6, .pme_short_desc = "FPU issue 0 per cycle", .pme_long_desc = "FPU issue 0 per cycle", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_ISSUE_0], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_ISSUE_0] }, [ POWER6_PME_PM_THRD_PRIO_2_CYC ] = { .pme_name = "PM_THRD_PRIO_2_CYC", .pme_code = 0x322040, .pme_short_desc = "Cycles thread running at priority level 2", .pme_long_desc = "Cycles thread running at priority level 2", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_PRIO_2_CYC], .pme_group_vector = 
power6_group_vecs[POWER6_PME_PM_THRD_PRIO_2_CYC] }, [ POWER6_PME_PM_VMX_SIMPLE_ISSUED ] = { .pme_name = "PM_VMX_SIMPLE_ISSUED", .pme_code = 0x70082, .pme_short_desc = "VMX instruction issued to simple", .pme_long_desc = "VMX instruction issued to simple", .pme_event_ids = power6_event_ids[POWER6_PME_PM_VMX_SIMPLE_ISSUED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_VMX_SIMPLE_ISSUED] }, [ POWER6_PME_PM_MRK_FPU1_FIN ] = { .pme_name = "PM_MRK_FPU1_FIN", .pme_code = 0xd008a, .pme_short_desc = "Marked instruction FPU1 processing finished", .pme_long_desc = "Marked instruction FPU1 processing finished", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_FPU1_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_FPU1_FIN] }, [ POWER6_PME_PM_DPU_HELD_CW ] = { .pme_name = "PM_DPU_HELD_CW", .pme_code = 0x20084, .pme_short_desc = "DISP unit held due to cache writes ", .pme_long_desc = "DISP unit held due to cache writes ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_CW], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_CW] }, [ POWER6_PME_PM_L3SA_REF ] = { .pme_name = "PM_L3SA_REF", .pme_code = 0x50080, .pme_short_desc = "L3 slice A references", .pme_long_desc = "L3 slice A references", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L3SA_REF], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L3SA_REF] }, [ POWER6_PME_PM_STCX ] = { .pme_name = "PM_STCX", .pme_code = 0x830e6, .pme_short_desc = "STCX executed", .pme_long_desc = "STCX executed", .pme_event_ids = power6_event_ids[POWER6_PME_PM_STCX], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_STCX] }, [ POWER6_PME_PM_L2SB_MISS ] = { .pme_name = "PM_L2SB_MISS", .pme_code = 0x5058c, .pme_short_desc = "L2 slice B misses", .pme_long_desc = "L2 slice B misses", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SB_MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SB_MISS] }, [ POWER6_PME_PM_LSU0_REJECT ] = { .pme_name = "PM_LSU0_REJECT", .pme_code = 0xa10a6, 
.pme_short_desc = "LSU0 reject", .pme_long_desc = "LSU0 reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU0_REJECT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU0_REJECT] }, [ POWER6_PME_PM_TB_BIT_TRANS ] = { .pme_name = "PM_TB_BIT_TRANS", .pme_code = 0x100026, .pme_short_desc = "Time Base bit transition", .pme_long_desc = "When the selected time base bit (as specified in MMCR0[TBSEL])transitions from 0 to 1 ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_TB_BIT_TRANS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_TB_BIT_TRANS] }, [ POWER6_PME_PM_THERMAL_MAX ] = { .pme_name = "PM_THERMAL_MAX", .pme_code = 0x30002a, .pme_short_desc = "Processor in thermal MAX", .pme_long_desc = "Processor in thermal MAX", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THERMAL_MAX], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THERMAL_MAX] }, [ POWER6_PME_PM_FPU0_STF ] = { .pme_name = "PM_FPU0_STF", .pme_code = 0xc10a4, .pme_short_desc = "FPU0 executed store instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing a store instruction.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU0_STF], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU0_STF] }, [ POWER6_PME_PM_FPU1_FMA ] = { .pme_name = "PM_FPU1_FMA", .pme_code = 0xc008a, .pme_short_desc = "FPU1 executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. 
This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU1_FMA], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU1_FMA] }, [ POWER6_PME_PM_LSU1_REJECT_LHS ] = { .pme_name = "PM_LSU1_REJECT_LHS", .pme_code = 0x9008e, .pme_short_desc = "LSU1 load hit store reject", .pme_long_desc = "LSU1 load hit store reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU1_REJECT_LHS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU1_REJECT_LHS] }, [ POWER6_PME_PM_DPU_HELD_INT ] = { .pme_name = "PM_DPU_HELD_INT", .pme_code = 0x310a8, .pme_short_desc = "DISP unit held due to exception", .pme_long_desc = "DISP unit held due to exception", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_INT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_INT] }, [ POWER6_PME_PM_THRD_LLA_BOTH_CYC ] = { .pme_name = "PM_THRD_LLA_BOTH_CYC", .pme_code = 0x400008, .pme_short_desc = "Both threads in Load Look Ahead", .pme_long_desc = "Both threads in Load Look Ahead", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_LLA_BOTH_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_LLA_BOTH_CYC] }, [ POWER6_PME_PM_DPU_HELD_THERMAL_COUNT ] = { .pme_name = "PM_DPU_HELD_THERMAL_COUNT", .pme_code = 0x10002b, .pme_short_desc = "Periods DISP unit held due to thermal condition", .pme_long_desc = "Periods DISP unit held due to thermal condition", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_THERMAL_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_THERMAL_COUNT] }, [ POWER6_PME_PM_PMC4_REWIND ] = { .pme_name = "PM_PMC4_REWIND", .pme_code = 0x100020, .pme_short_desc = "PMC4 rewind event", .pme_long_desc = "PMC4 rewind event", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PMC4_REWIND], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PMC4_REWIND] }, [ POWER6_PME_PM_DERAT_REF_16M ] = { .pme_name = "PM_DERAT_REF_16M", .pme_code = 0x382070, 
.pme_short_desc = "DERAT reference for 16M page", .pme_long_desc = "DERAT reference for 16M page", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DERAT_REF_16M], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DERAT_REF_16M] }, [ POWER6_PME_PM_FPU0_FCONV ] = { .pme_name = "PM_FPU0_FCONV", .pme_code = 0xd10a0, .pme_short_desc = "FPU0 executed FCONV instruction", .pme_long_desc = "FPU0 executed FCONV instruction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU0_FCONV], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU0_FCONV] }, [ POWER6_PME_PM_L2SA_LD_REQ_DATA ] = { .pme_name = "PM_L2SA_LD_REQ_DATA", .pme_code = 0x50480, .pme_short_desc = "L2 slice A data load requests", .pme_long_desc = "L2 slice A data load requests", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SA_LD_REQ_DATA], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SA_LD_REQ_DATA] }, [ POWER6_PME_PM_DATA_FROM_MEM_DP ] = { .pme_name = "PM_DATA_FROM_MEM_DP", .pme_code = 0x10005e, .pme_short_desc = "Data loaded from double pump memory", .pme_long_desc = "Data loaded from double pump memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_MEM_DP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_MEM_DP] }, [ POWER6_PME_PM_MRK_VMX_FLOAT_ISSUED ] = { .pme_name = "PM_MRK_VMX_FLOAT_ISSUED", .pme_code = 0x70088, .pme_short_desc = "Marked VMX instruction issued to float", .pme_long_desc = "Marked VMX instruction issued to float", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_VMX_FLOAT_ISSUED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_VMX_FLOAT_ISSUED] }, [ POWER6_PME_PM_MRK_PTEG_FROM_L2MISS ] = { .pme_name = "PM_MRK_PTEG_FROM_L2MISS", .pme_code = 0x412054, .pme_short_desc = "Marked PTEG loaded from L2 miss", .pme_long_desc = "Marked PTEG loaded from L2 miss", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_L2MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_L2MISS] }, [ 
POWER6_PME_PM_THRD_PRIO_DIFF_1or2_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_1or2_CYC", .pme_code = 0x223040, .pme_short_desc = "Cycles thread priority difference is 1 or 2", .pme_long_desc = "Cycles thread priority difference is 1 or 2", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_PRIO_DIFF_1or2_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_PRIO_DIFF_1or2_CYC] }, [ POWER6_PME_PM_VMX0_STALL ] = { .pme_name = "PM_VMX0_STALL", .pme_code = 0xb0084, .pme_short_desc = "VMX0 stall", .pme_long_desc = "VMX0 stall", .pme_event_ids = power6_event_ids[POWER6_PME_PM_VMX0_STALL], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_VMX0_STALL] }, [ POWER6_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BHT_REDIRECT", .pme_code = 0x420ca, .pme_short_desc = "L2 I cache demand request due to BHT redirect", .pme_long_desc = "L2 I cache demand request due to BHT redirect", .pme_event_ids = power6_event_ids[POWER6_PME_PM_IC_DEMAND_L2_BHT_REDIRECT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_IC_DEMAND_L2_BHT_REDIRECT] }, [ POWER6_PME_PM_LSU_DERAT_MISS ] = { .pme_name = "PM_LSU_DERAT_MISS", .pme_code = 0x20000e, .pme_short_desc = "DERAT misses", .pme_long_desc = "Total DERAT Misses (Unit 0 + Unit 1). Requests that miss the Derat are rejected and retried until the request hits in the Erat. 
This may result in multiple erat misses for the same instruction.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_DERAT_MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_DERAT_MISS] }, [ POWER6_PME_PM_FPU0_SINGLE ] = { .pme_name = "PM_FPU0_SINGLE", .pme_code = 0xc10a6, .pme_short_desc = "FPU0 executed single precision instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing single precision instruction.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU0_SINGLE], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU0_SINGLE] }, [ POWER6_PME_PM_FPU_ISSUE_STEERING ] = { .pme_name = "PM_FPU_ISSUE_STEERING", .pme_code = 0x320c4, .pme_short_desc = "FPU issue steering", .pme_long_desc = "FPU issue steering", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_ISSUE_STEERING], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_ISSUE_STEERING] }, [ POWER6_PME_PM_THRD_PRIO_1_CYC ] = { .pme_name = "PM_THRD_PRIO_1_CYC", .pme_code = 0x222040, .pme_short_desc = "Cycles thread running at priority level 1", .pme_long_desc = "Cycles thread running at priority level 1", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_PRIO_1_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_PRIO_1_CYC] }, [ POWER6_PME_PM_VMX_COMPLEX_ISSUED ] = { .pme_name = "PM_VMX_COMPLEX_ISSUED", .pme_code = 0x70084, .pme_short_desc = "VMX instruction issued to complex", .pme_long_desc = "VMX instruction issued to complex", .pme_event_ids = power6_event_ids[POWER6_PME_PM_VMX_COMPLEX_ISSUED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_VMX_COMPLEX_ISSUED] }, [ POWER6_PME_PM_FPU_ISSUE_ST_FOLDED ] = { .pme_name = "PM_FPU_ISSUE_ST_FOLDED", .pme_code = 0x320c2, .pme_short_desc = "FPU issue a folded store", .pme_long_desc = "FPU issue a folded store", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_ISSUE_ST_FOLDED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_ISSUE_ST_FOLDED] }, [ POWER6_PME_PM_DFU_FIN ] = { .pme_name 
= "PM_DFU_FIN", .pme_code = 0xe0080, .pme_short_desc = "DFU instruction finish", .pme_long_desc = "DFU instruction finish", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DFU_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DFU_FIN] }, [ POWER6_PME_PM_BR_PRED_CCACHE ] = { .pme_name = "PM_BR_PRED_CCACHE", .pme_code = 0x410a4, .pme_short_desc = "Branch count cache prediction", .pme_long_desc = "Branch count cache prediction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_BR_PRED_CCACHE], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_BR_PRED_CCACHE] }, [ POWER6_PME_PM_MRK_ST_CMPL_INT ] = { .pme_name = "PM_MRK_ST_CMPL_INT", .pme_code = 0x300006, .pme_short_desc = "Marked store completed with intervention", .pme_long_desc = "A marked store previously sent to the memory subsystem completed (data home) after requiring intervention", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_ST_CMPL_INT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_ST_CMPL_INT] }, [ POWER6_PME_PM_FAB_MMIO ] = { .pme_name = "PM_FAB_MMIO", .pme_code = 0x50186, .pme_short_desc = "MMIO operation", .pme_long_desc = " locally mastered", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FAB_MMIO], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FAB_MMIO] }, [ POWER6_PME_PM_MRK_VMX_SIMPLE_ISSUED ] = { .pme_name = "PM_MRK_VMX_SIMPLE_ISSUED", .pme_code = 0x7008a, .pme_short_desc = "Marked VMX instruction issued to simple", .pme_long_desc = "Marked VMX instruction issued to simple", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_VMX_SIMPLE_ISSUED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_VMX_SIMPLE_ISSUED] }, [ POWER6_PME_PM_FPU_STF ] = { .pme_name = "PM_FPU_STF", .pme_code = 0x3c1030, .pme_short_desc = "FPU executed store instruction", .pme_long_desc = "FPU is executing a store instruction. 
Combined Unit 0 + Unit 1", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_STF], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_STF] }, [ POWER6_PME_PM_MEM1_DP_CL_WR_GLOB ] = { .pme_name = "PM_MEM1_DP_CL_WR_GLOB", .pme_code = 0x5028c, .pme_short_desc = "cacheline write setting dp to global side 1", .pme_long_desc = "cacheline write setting dp to global side 1", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MEM1_DP_CL_WR_GLOB], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MEM1_DP_CL_WR_GLOB] }, [ POWER6_PME_PM_MRK_DATA_FROM_L3MISS ] = { .pme_name = "PM_MRK_DATA_FROM_L3MISS", .pme_code = 0x303028, .pme_short_desc = "Marked data loaded from L3 miss", .pme_long_desc = "Marked data loaded from L3 miss", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_L3MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_L3MISS] }, [ POWER6_PME_PM_GCT_NOSLOT_CYC ] = { .pme_name = "PM_GCT_NOSLOT_CYC", .pme_code = 0x100008, .pme_short_desc = "Cycles no GCT slot allocated", .pme_long_desc = "Cycles this thread does not have any slots allocated in the GCT.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_GCT_NOSLOT_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_GCT_NOSLOT_CYC] }, [ POWER6_PME_PM_L2_ST_REQ_DATA ] = { .pme_name = "PM_L2_ST_REQ_DATA", .pme_code = 0x250432, .pme_short_desc = "L2 data store requests", .pme_long_desc = "L2 data store requests", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2_ST_REQ_DATA], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2_ST_REQ_DATA] }, [ POWER6_PME_PM_INST_TABLEWALK_COUNT ] = { .pme_name = "PM_INST_TABLEWALK_COUNT", .pme_code = 0x920cb, .pme_short_desc = "Periods doing instruction tablewalks", .pme_long_desc = "Periods doing instruction tablewalks", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_TABLEWALK_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_TABLEWALK_COUNT] }, [ POWER6_PME_PM_PTEG_FROM_L35_SHR ] = { .pme_name = "PM_PTEG_FROM_L35_SHR", 
.pme_code = 0x21304e, .pme_short_desc = "PTEG loaded from L3.5 shared", .pme_long_desc = "PTEG loaded from L3.5 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_L35_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PTEG_FROM_L35_SHR] }, [ POWER6_PME_PM_DPU_HELD_ISYNC ] = { .pme_name = "PM_DPU_HELD_ISYNC", .pme_code = 0x2008a, .pme_short_desc = "DISP unit held due to ISYNC ", .pme_long_desc = "DISP unit held due to ISYNC ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_ISYNC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_ISYNC] }, [ POWER6_PME_PM_MRK_DATA_FROM_L25_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L25_SHR", .pme_code = 0x40304e, .pme_short_desc = "Marked data loaded from L2.5 shared", .pme_long_desc = "DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a marked demand load", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_L25_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_L25_SHR] }, [ POWER6_PME_PM_L3SA_HIT ] = { .pme_name = "PM_L3SA_HIT", .pme_code = 0x50082, .pme_short_desc = "L3 slice A hits", .pme_long_desc = "L3 slice A hits", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L3SA_HIT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L3SA_HIT] }, [ POWER6_PME_PM_DERAT_MISS_16G ] = { .pme_name = "PM_DERAT_MISS_16G", .pme_code = 0x492070, .pme_short_desc = "DERAT misses for 16G page", .pme_long_desc = "A data request (load or store) missed the ERAT for 16G page and resulted in an ERAT reload.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DERAT_MISS_16G], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DERAT_MISS_16G] }, [ POWER6_PME_PM_DATA_PTEG_2ND_HALF ] = { .pme_name = "PM_DATA_PTEG_2ND_HALF", .pme_code = 0x910a2, .pme_short_desc = "Data table walk matched in second half primary PTEG", .pme_long_desc = "Data table walk matched in second half primary PTEG", .pme_event_ids =
power6_event_ids[POWER6_PME_PM_DATA_PTEG_2ND_HALF], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_PTEG_2ND_HALF] }, [ POWER6_PME_PM_L2SA_ST_REQ ] = { .pme_name = "PM_L2SA_ST_REQ", .pme_code = 0x50484, .pme_short_desc = "L2 slice A store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SA_ST_REQ], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SA_ST_REQ] }, [ POWER6_PME_PM_INST_FROM_LMEM ] = { .pme_name = "PM_INST_FROM_LMEM", .pme_code = 0x442042, .pme_short_desc = "Instruction fetched from local memory", .pme_long_desc = "Instruction fetched from local memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_LMEM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_LMEM] }, [ POWER6_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BR_REDIRECT", .pme_code = 0x420cc, .pme_short_desc = "L2 I cache demand request due to branch redirect", .pme_long_desc = "L2 I cache demand request due to branch redirect", .pme_event_ids = power6_event_ids[POWER6_PME_PM_IC_DEMAND_L2_BR_REDIRECT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_IC_DEMAND_L2_BR_REDIRECT] }, [ POWER6_PME_PM_PTEG_FROM_L2 ] = { .pme_name = "PM_PTEG_FROM_L2", .pme_code = 0x113048, .pme_short_desc = "PTEG loaded from L2", .pme_long_desc = "PTEG loaded from L2", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_L2], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PTEG_FROM_L2] }, [ POWER6_PME_PM_DATA_PTEG_1ST_HALF ] = { .pme_name = "PM_DATA_PTEG_1ST_HALF", .pme_code = 0x910a0, .pme_short_desc = "Data table walk matched in first half primary PTEG", .pme_long_desc = "Data table walk matched in first half primary PTEG", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_PTEG_1ST_HALF], .pme_group_vector = 
power6_group_vecs[POWER6_PME_PM_DATA_PTEG_1ST_HALF] }, [ POWER6_PME_PM_BR_MPRED_COUNT ] = { .pme_name = "PM_BR_MPRED_COUNT", .pme_code = 0x410aa, .pme_short_desc = "Branch misprediction due to count prediction", .pme_long_desc = "Branch misprediction due to count prediction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_BR_MPRED_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_BR_MPRED_COUNT] }, [ POWER6_PME_PM_IERAT_MISS_4K ] = { .pme_name = "PM_IERAT_MISS_4K", .pme_code = 0x492076, .pme_short_desc = "IERAT misses for 4K page", .pme_long_desc = "IERAT misses for 4K page", .pme_event_ids = power6_event_ids[POWER6_PME_PM_IERAT_MISS_4K], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_IERAT_MISS_4K] }, [ POWER6_PME_PM_THRD_BOTH_RUN_COUNT ] = { .pme_name = "PM_THRD_BOTH_RUN_COUNT", .pme_code = 0x400005, .pme_short_desc = "Periods both threads in run cycles", .pme_long_desc = "Periods both threads in run cycles", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_BOTH_RUN_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_BOTH_RUN_COUNT] }, [ POWER6_PME_PM_LSU_REJECT_ULD ] = { .pme_name = "PM_LSU_REJECT_ULD", .pme_code = 0x190030, .pme_short_desc = "Unaligned load reject", .pme_long_desc = "Unaligned load reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT_ULD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT_ULD] }, [ POWER6_PME_PM_DATA_FROM_DL2L3_MOD_CYC ] = { .pme_name = "PM_DATA_FROM_DL2L3_MOD_CYC", .pme_code = 0x40002a, .pme_short_desc = "Load latency from distant L2 or L3 modified", .pme_long_desc = "Load latency from distant L2 or L3 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_DL2L3_MOD_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_DL2L3_MOD_CYC] }, [ POWER6_PME_PM_MRK_PTEG_FROM_RL2L3_MOD ] = { .pme_name = "PM_MRK_PTEG_FROM_RL2L3_MOD", .pme_code = 0x112044, .pme_short_desc = "Marked PTEG loaded from remote L2 or L3 modified", .pme_long_desc = "Marked 
PTEG loaded from remote L2 or L3 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_RL2L3_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_RL2L3_MOD] }, [ POWER6_PME_PM_FPU0_FLOP ] = { .pme_name = "PM_FPU0_FLOP", .pme_code = 0xc0086, .pme_short_desc = "FPU0 executed 1FLOP", .pme_long_desc = " FMA", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU0_FLOP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU0_FLOP] }, [ POWER6_PME_PM_FPU0_FEST ] = { .pme_name = "PM_FPU0_FEST", .pme_code = 0xd10a6, .pme_short_desc = "FPU0 executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU0_FEST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU0_FEST] }, [ POWER6_PME_PM_MRK_LSU0_REJECT_LHS ] = { .pme_name = "PM_MRK_LSU0_REJECT_LHS", .pme_code = 0x930e6, .pme_short_desc = "LSU0 marked load hit store reject", .pme_long_desc = "LSU0 marked load hit store reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_LSU0_REJECT_LHS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_LSU0_REJECT_LHS] }, [ POWER6_PME_PM_VMX_RESULT_SAT_1 ] = { .pme_name = "PM_VMX_RESULT_SAT_1", .pme_code = 0xb0086, .pme_short_desc = "VMX valid result with sat=1", .pme_long_desc = "VMX valid result with sat=1", .pme_event_ids = power6_event_ids[POWER6_PME_PM_VMX_RESULT_SAT_1], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_VMX_RESULT_SAT_1] }, [ POWER6_PME_PM_NO_ITAG_CYC ] = { .pme_name = "PM_NO_ITAG_CYC", .pme_code = 0x40088, .pme_short_desc = "Cycles no ITAG available", .pme_long_desc = "Cycles no ITAG available", .pme_event_ids = power6_event_ids[POWER6_PME_PM_NO_ITAG_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_NO_ITAG_CYC] }, [ POWER6_PME_PM_LSU1_REJECT_NO_SCRATCH ] = { .pme_name = "PM_LSU1_REJECT_NO_SCRATCH", .pme_code =
0xa10aa, .pme_short_desc = "LSU1 reject due to scratch register not available", .pme_long_desc = "LSU1 reject due to scratch register not available", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU1_REJECT_NO_SCRATCH], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU1_REJECT_NO_SCRATCH] }, [ POWER6_PME_PM_0INST_FETCH ] = { .pme_name = "PM_0INST_FETCH", .pme_code = 0x40080, .pme_short_desc = "No instructions fetched", .pme_long_desc = "No instructions were fetched this cycle (due to IFU hold, redirect, or icache miss)", .pme_event_ids = power6_event_ids[POWER6_PME_PM_0INST_FETCH], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_0INST_FETCH] }, [ POWER6_PME_PM_DPU_WT_BR_MPRED ] = { .pme_name = "PM_DPU_WT_BR_MPRED", .pme_code = 0x40000c, .pme_short_desc = "Cycles DISP unit is stalled due to branch misprediction", .pme_long_desc = "Cycles DISP unit is stalled due to branch misprediction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_WT_BR_MPRED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_WT_BR_MPRED] }, [ POWER6_PME_PM_L1_PREF ] = { .pme_name = "PM_L1_PREF", .pme_code = 0x810a4, .pme_short_desc = "L1 cache data prefetches", .pme_long_desc = "A request to prefetch data into the L1 was made", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L1_PREF], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L1_PREF] }, [ POWER6_PME_PM_VMX_FLOAT_MULTICYCLE ] = { .pme_name = "PM_VMX_FLOAT_MULTICYCLE", .pme_code = 0xb0082, .pme_short_desc = "VMX multi-cycle floating point instruction issued", .pme_long_desc = "VMX multi-cycle floating point instruction issued", .pme_event_ids = power6_event_ids[POWER6_PME_PM_VMX_FLOAT_MULTICYCLE], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_VMX_FLOAT_MULTICYCLE] }, [ POWER6_PME_PM_DATA_FROM_L25_SHR_CYC ] = { .pme_name = "PM_DATA_FROM_L25_SHR_CYC", .pme_code = 0x200024, .pme_short_desc = "Load latency from L2.5 shared", .pme_long_desc = "Load latency from L2.5 shared", .pme_event_ids =
power6_event_ids[POWER6_PME_PM_DATA_FROM_L25_SHR_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_L25_SHR_CYC] }, [ POWER6_PME_PM_DATA_FROM_L3 ] = { .pme_name = "PM_DATA_FROM_L3", .pme_code = 0x300058, .pme_short_desc = "Data loaded from L3", .pme_long_desc = "DL1 was reloaded from the local L3 due to a demand load", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_L3], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_L3] }, [ POWER6_PME_PM_PMC2_OVERFLOW ] = { .pme_name = "PM_PMC2_OVERFLOW", .pme_code = 0x300014, .pme_short_desc = "PMC2 Overflow", .pme_long_desc = "PMC2 Overflow", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PMC2_OVERFLOW], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PMC2_OVERFLOW] }, [ POWER6_PME_PM_VMX0_LD_WRBACK ] = { .pme_name = "PM_VMX0_LD_WRBACK", .pme_code = 0x60084, .pme_short_desc = "VMX0 load writeback valid", .pme_long_desc = "VMX0 load writeback valid", .pme_event_ids = power6_event_ids[POWER6_PME_PM_VMX0_LD_WRBACK], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_VMX0_LD_WRBACK] }, [ POWER6_PME_PM_FPU0_DENORM ] = { .pme_name = "PM_FPU0_DENORM", .pme_code = 0xc10a2, .pme_short_desc = "FPU0 received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU0_DENORM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU0_DENORM] }, [ POWER6_PME_PM_INST_FETCH_CYC ] = { .pme_name = "PM_INST_FETCH_CYC", .pme_code = 0x420c8, .pme_short_desc = "Cycles at least 1 instruction fetched", .pme_long_desc = "Asserted each cycle when the IFU sends at least one instruction to the IDU. 
", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FETCH_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FETCH_CYC] }, [ POWER6_PME_PM_LSU_LDF ] = { .pme_name = "PM_LSU_LDF", .pme_code = 0x280032, .pme_short_desc = "LSU executed Floating Point load instruction", .pme_long_desc = "LSU executed Floating Point load instruction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_LDF], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_LDF] }, [ POWER6_PME_PM_LSU_REJECT_L2_CORR ] = { .pme_name = "PM_LSU_REJECT_L2_CORR", .pme_code = 0x1a1034, .pme_short_desc = "LSU reject due to L2 correctable error", .pme_long_desc = "LSU reject due to L2 correctable error", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT_L2_CORR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT_L2_CORR] }, [ POWER6_PME_PM_DERAT_REF_64K ] = { .pme_name = "PM_DERAT_REF_64K", .pme_code = 0x282070, .pme_short_desc = "DERAT reference for 64K page", .pme_long_desc = "DERAT reference for 64K page", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DERAT_REF_64K], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DERAT_REF_64K] }, [ POWER6_PME_PM_THRD_PRIO_3_CYC ] = { .pme_name = "PM_THRD_PRIO_3_CYC", .pme_code = 0x422040, .pme_short_desc = "Cycles thread running at priority level 3", .pme_long_desc = "Cycles thread running at priority level 3", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_PRIO_3_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_PRIO_3_CYC] }, [ POWER6_PME_PM_FPU_FMA ] = { .pme_name = "PM_FPU_FMA", .pme_code = 0x2c0030, .pme_short_desc = "FPU executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when FPU is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
Combined Unit 0 + Unit 1", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_FMA], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_FMA] }, [ POWER6_PME_PM_INST_FROM_L35_MOD ] = { .pme_name = "PM_INST_FROM_L35_MOD", .pme_code = 0x142046, .pme_short_desc = "Instruction fetched from L3.5 modified", .pme_long_desc = "Instruction fetched from L3.5 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_L35_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_L35_MOD] }, [ POWER6_PME_PM_DFU_CONV ] = { .pme_name = "PM_DFU_CONV", .pme_code = 0xe008e, .pme_short_desc = "DFU convert from fixed op", .pme_long_desc = "DFU convert from fixed op", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DFU_CONV], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DFU_CONV] }, [ POWER6_PME_PM_INST_FROM_L25_MOD ] = { .pme_name = "PM_INST_FROM_L25_MOD", .pme_code = 0x342046, .pme_short_desc = "Instruction fetched from L2.5 modified", .pme_long_desc = "Instruction fetched from L2.5 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_L25_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_L25_MOD] }, [ POWER6_PME_PM_PTEG_FROM_L35_MOD ] = { .pme_name = "PM_PTEG_FROM_L35_MOD", .pme_code = 0x11304e, .pme_short_desc = "PTEG loaded from L3.5 modified", .pme_long_desc = "PTEG loaded from L3.5 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_L35_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PTEG_FROM_L35_MOD] }, [ POWER6_PME_PM_MRK_VMX_ST_ISSUED ] = { .pme_name = "PM_MRK_VMX_ST_ISSUED", .pme_code = 0xb0088, .pme_short_desc = "Marked VMX store issued", .pme_long_desc = "Marked VMX store issued", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_VMX_ST_ISSUED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_VMX_ST_ISSUED] }, [ POWER6_PME_PM_VMX_FLOAT_ISSUED ] = { .pme_name = "PM_VMX_FLOAT_ISSUED", .pme_code = 0x70080, .pme_short_desc = "VMX instruction issued to float", .pme_long_desc = 
"VMX instruction issued to float", .pme_event_ids = power6_event_ids[POWER6_PME_PM_VMX_FLOAT_ISSUED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_VMX_FLOAT_ISSUED] }, [ POWER6_PME_PM_LSU0_REJECT_L2_CORR ] = { .pme_name = "PM_LSU0_REJECT_L2_CORR", .pme_code = 0xa10a0, .pme_short_desc = "LSU0 reject due to L2 correctable error", .pme_long_desc = "LSU0 reject due to L2 correctable error", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU0_REJECT_L2_CORR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU0_REJECT_L2_CORR] }, [ POWER6_PME_PM_THRD_L2MISS ] = { .pme_name = "PM_THRD_L2MISS", .pme_code = 0x310a0, .pme_short_desc = "Thread in L2 miss", .pme_long_desc = "Thread in L2 miss", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_L2MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_L2MISS] }, [ POWER6_PME_PM_FPU_FCONV ] = { .pme_name = "PM_FPU_FCONV", .pme_code = 0x1d1034, .pme_short_desc = "FPU executed FCONV instruction", .pme_long_desc = "FPU executed FCONV instruction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_FCONV], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_FCONV] }, [ POWER6_PME_PM_FPU_FXMULT ] = { .pme_name = "PM_FPU_FXMULT", .pme_code = 0x1d0032, .pme_short_desc = "FPU executed fixed point multiplication", .pme_long_desc = "FPU executed fixed point multiplication", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_FXMULT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_FXMULT] }, [ POWER6_PME_PM_FPU1_FRSP ] = { .pme_name = "PM_FPU1_FRSP", .pme_code = 0xd10aa, .pme_short_desc = "FPU1 executed FRSP instruction", .pme_long_desc = "FPU1 executed FRSP instruction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU1_FRSP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU1_FRSP] }, [ POWER6_PME_PM_MRK_DERAT_REF_16M ] = { .pme_name = "PM_MRK_DERAT_REF_16M", .pme_code = 0x382044, .pme_short_desc = "Marked DERAT reference for 16M page", .pme_long_desc = "Marked DERAT reference for 16M page", 
.pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DERAT_REF_16M], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DERAT_REF_16M] }, [ POWER6_PME_PM_L2SB_CASTOUT_SHR ] = { .pme_name = "PM_L2SB_CASTOUT_SHR", .pme_code = 0x5068a, .pme_short_desc = "L2 slice B castouts - Shared", .pme_long_desc = "L2 slice B castouts - Shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SB_CASTOUT_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SB_CASTOUT_SHR] }, [ POWER6_PME_PM_THRD_ONE_RUN_COUNT ] = { .pme_name = "PM_THRD_ONE_RUN_COUNT", .pme_code = 0x1000fb, .pme_short_desc = "Periods one of the threads in run cycles", .pme_long_desc = "Periods one of the threads in run cycles", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_ONE_RUN_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_ONE_RUN_COUNT] }, [ POWER6_PME_PM_INST_FROM_RMEM ] = { .pme_name = "PM_INST_FROM_RMEM", .pme_code = 0x342042, .pme_short_desc = "Instruction fetched from remote memory", .pme_long_desc = "Instruction fetched from remote memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_RMEM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_RMEM] }, [ POWER6_PME_PM_LSU_BOTH_BUS ] = { .pme_name = "PM_LSU_BOTH_BUS", .pme_code = 0x810aa, .pme_short_desc = "Both data return buses busy simultaneously", .pme_long_desc = "Both data return buses busy simultaneously", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_BOTH_BUS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_BOTH_BUS] }, [ POWER6_PME_PM_FPU1_FSQRT_FDIV ] = { .pme_name = "PM_FPU1_FSQRT_FDIV", .pme_code = 0xc008c, .pme_short_desc = "FPU1 executed FSQRT or FDIV instruction", .pme_long_desc = "FPU1 executed FSQRT or FDIV instruction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU1_FSQRT_FDIV], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU1_FSQRT_FDIV] }, [ POWER6_PME_PM_L2_LD_REQ_INST ] = { .pme_name = "PM_L2_LD_REQ_INST", .pme_code = 0x150530, .pme_short_desc = 
"L2 instruction load requests", .pme_long_desc = "L2 instruction load requests", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2_LD_REQ_INST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2_LD_REQ_INST] }, [ POWER6_PME_PM_MRK_PTEG_FROM_L35_SHR ] = { .pme_name = "PM_MRK_PTEG_FROM_L35_SHR", .pme_code = 0x212046, .pme_short_desc = "Marked PTEG loaded from L3.5 shared", .pme_long_desc = "Marked PTEG loaded from L3.5 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_L35_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_L35_SHR] }, [ POWER6_PME_PM_BR_PRED_CR ] = { .pme_name = "PM_BR_PRED_CR", .pme_code = 0x410a2, .pme_short_desc = "A conditional branch was predicted", .pme_long_desc = " CR prediction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_BR_PRED_CR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_BR_PRED_CR] }, [ POWER6_PME_PM_MRK_LSU0_REJECT_ULD ] = { .pme_name = "PM_MRK_LSU0_REJECT_ULD", .pme_code = 0x930e0, .pme_short_desc = "LSU0 marked unaligned load reject", .pme_long_desc = "LSU0 marked unaligned load reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_LSU0_REJECT_ULD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_LSU0_REJECT_ULD] }, [ POWER6_PME_PM_LSU_REJECT ] = { .pme_name = "PM_LSU_REJECT", .pme_code = 0x4a1030, .pme_short_desc = "LSU reject", .pme_long_desc = "LSU reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT] }, [ POWER6_PME_PM_LSU_REJECT_LHS_BOTH ] = { .pme_name = "PM_LSU_REJECT_LHS_BOTH", .pme_code = 0x290038, .pme_short_desc = "Load hit store reject both units", .pme_long_desc = "Load hit store reject both units", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT_LHS_BOTH], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT_LHS_BOTH] }, [ POWER6_PME_PM_GXO_ADDR_CYC_BUSY ] = { .pme_name = "PM_GXO_ADDR_CYC_BUSY", .pme_code = 0x50382, .pme_short_desc = "Outbound 
GX address utilization (# of cycles address out is valid)", .pme_long_desc = "Outbound GX address utilization (# of cycles address out is valid)", .pme_event_ids = power6_event_ids[POWER6_PME_PM_GXO_ADDR_CYC_BUSY], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_GXO_ADDR_CYC_BUSY] }, [ POWER6_PME_PM_LSU_SRQ_EMPTY_COUNT ] = { .pme_name = "PM_LSU_SRQ_EMPTY_COUNT", .pme_code = 0x40001d, .pme_short_desc = "Periods SRQ empty", .pme_long_desc = "The Store Request Queue is empty", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_SRQ_EMPTY_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_SRQ_EMPTY_COUNT] }, [ POWER6_PME_PM_PTEG_FROM_L3 ] = { .pme_name = "PM_PTEG_FROM_L3", .pme_code = 0x313048, .pme_short_desc = "PTEG loaded from L3", .pme_long_desc = "PTEG loaded from L3", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_L3], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PTEG_FROM_L3] }, [ POWER6_PME_PM_VMX0_LD_ISSUED ] = { .pme_name = "PM_VMX0_LD_ISSUED", .pme_code = 0x60082, .pme_short_desc = "VMX0 load issued", .pme_long_desc = "VMX0 load issued", .pme_event_ids = power6_event_ids[POWER6_PME_PM_VMX0_LD_ISSUED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_VMX0_LD_ISSUED] }, [ POWER6_PME_PM_FXU_PIPELINED_MULT_DIV ] = { .pme_name = "PM_FXU_PIPELINED_MULT_DIV", .pme_code = 0x210ae, .pme_short_desc = "Fixed point multiply/divide pipelined", .pme_long_desc = "Fixed point multiply/divide pipelined", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FXU_PIPELINED_MULT_DIV], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FXU_PIPELINED_MULT_DIV] }, [ POWER6_PME_PM_FPU1_STF ] = { .pme_name = "PM_FPU1_STF", .pme_code = 0xc10ac, .pme_short_desc = "FPU1 executed store instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing a store instruction.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU1_STF], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU1_STF] }, [ POWER6_PME_PM_DFU_ADD ] = { .pme_name
= "PM_DFU_ADD", .pme_code = 0xe008c, .pme_short_desc = "DFU add type instruction", .pme_long_desc = "DFU add type instruction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DFU_ADD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DFU_ADD] }, [ POWER6_PME_PM_MEM_DP_CL_WR_GLOB ] = { .pme_name = "PM_MEM_DP_CL_WR_GLOB", .pme_code = 0x250232, .pme_short_desc = "cache line write setting double pump state to global", .pme_long_desc = "cache line write setting double pump state to global", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MEM_DP_CL_WR_GLOB], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MEM_DP_CL_WR_GLOB] }, [ POWER6_PME_PM_MRK_LSU1_REJECT_ULD ] = { .pme_name = "PM_MRK_LSU1_REJECT_ULD", .pme_code = 0x930e8, .pme_short_desc = "LSU1 marked unaligned load reject", .pme_long_desc = "LSU1 marked unaligned load reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_LSU1_REJECT_ULD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_LSU1_REJECT_ULD] }, [ POWER6_PME_PM_ITLB_REF ] = { .pme_name = "PM_ITLB_REF", .pme_code = 0x920c2, .pme_short_desc = "Instruction TLB reference", .pme_long_desc = "Instruction TLB reference", .pme_event_ids = power6_event_ids[POWER6_PME_PM_ITLB_REF], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_ITLB_REF] }, [ POWER6_PME_PM_LSU0_REJECT_L2MISS ] = { .pme_name = "PM_LSU0_REJECT_L2MISS", .pme_code = 0x90084, .pme_short_desc = "LSU0 L2 miss reject", .pme_long_desc = "LSU0 L2 miss reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU0_REJECT_L2MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU0_REJECT_L2MISS] }, [ POWER6_PME_PM_DATA_FROM_L35_SHR ] = { .pme_name = "PM_DATA_FROM_L35_SHR", .pme_code = 0x20005a, .pme_short_desc = "Data loaded from L3.5 shared", .pme_long_desc = "Data loaded from L3.5 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_L35_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_L35_SHR] }, [ POWER6_PME_PM_MRK_DATA_FROM_RL2L3_MOD ] = { 
.pme_name = "PM_MRK_DATA_FROM_RL2L3_MOD", .pme_code = 0x10304c, .pme_short_desc = "Marked data loaded from remote L2 or L3 modified", .pme_long_desc = "Marked data loaded from remote L2 or L3 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_RL2L3_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_RL2L3_MOD] }, [ POWER6_PME_PM_FPU0_FPSCR ] = { .pme_name = "PM_FPU0_FPSCR", .pme_code = 0xd0084, .pme_short_desc = "FPU0 executed FPSCR instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*, mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU0_FPSCR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU0_FPSCR] }, [ POWER6_PME_PM_DATA_FROM_L2 ] = { .pme_name = "PM_DATA_FROM_L2", .pme_code = 0x100058, .pme_short_desc = "Data loaded from L2", .pme_long_desc = "DL1 was reloaded from the local L2 due to a demand load", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_L2], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_L2] }, [ POWER6_PME_PM_DPU_HELD_XER ] = { .pme_name = "PM_DPU_HELD_XER", .pme_code = 0x20088, .pme_short_desc = "DISP unit held due to XER dependency", .pme_long_desc = "DISP unit held due to XER dependency", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_XER], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_XER] }, [ POWER6_PME_PM_FAB_NODE_PUMP ] = { .pme_name = "PM_FAB_NODE_PUMP", .pme_code = 0x50188, .pme_short_desc = "Node pump operation", .pme_long_desc = " locally mastered", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FAB_NODE_PUMP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FAB_NODE_PUMP] }, [ POWER6_PME_PM_VMX_RESULT_SAT_0_1 ] = { .pme_name = "PM_VMX_RESULT_SAT_0_1", .pme_code = 0xb008e, .pme_short_desc = "VMX valid result with sat bit is set (0->1)", .pme_long_desc = "VMX valid result with sat bit is set (0->1)", .pme_event_ids = power6_event_ids[POWER6_PME_PM_VMX_RESULT_SAT_0_1],
.pme_group_vector = power6_group_vecs[POWER6_PME_PM_VMX_RESULT_SAT_0_1] }, [ POWER6_PME_PM_LD_REF_L1 ] = { .pme_name = "PM_LD_REF_L1", .pme_code = 0x80082, .pme_short_desc = "L1 D cache load references", .pme_long_desc = "Total DL1 Load references", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LD_REF_L1], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LD_REF_L1] }, [ POWER6_PME_PM_TLB_REF ] = { .pme_name = "PM_TLB_REF", .pme_code = 0x920c8, .pme_short_desc = "TLB reference", .pme_long_desc = "TLB reference", .pme_event_ids = power6_event_ids[POWER6_PME_PM_TLB_REF], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_TLB_REF] }, [ POWER6_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { .pme_name = "PM_DC_PREF_OUT_OF_STREAMS", .pme_code = 0x810a0, .pme_short_desc = "D cache out of streams", .pme_long_desc = "out of streams", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DC_PREF_OUT_OF_STREAMS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DC_PREF_OUT_OF_STREAMS] }, [ POWER6_PME_PM_FLUSH_FPU ] = { .pme_name = "PM_FLUSH_FPU", .pme_code = 0x230ec, .pme_short_desc = "Flush caused by FPU exception", .pme_long_desc = "Flush caused by FPU exception", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FLUSH_FPU], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FLUSH_FPU] }, [ POWER6_PME_PM_MEM1_DP_CL_WR_LOC ] = { .pme_name = "PM_MEM1_DP_CL_WR_LOC", .pme_code = 0x5028e, .pme_short_desc = "cacheline write setting dp to local side 1", .pme_long_desc = "cacheline write setting dp to local side 1", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MEM1_DP_CL_WR_LOC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MEM1_DP_CL_WR_LOC] }, [ POWER6_PME_PM_L2SB_LD_HIT ] = { .pme_name = "PM_L2SB_LD_HIT", .pme_code = 0x5078a, .pme_short_desc = "L2 slice B load hits", .pme_long_desc = "L2 slice B load hits", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SB_LD_HIT],
.pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SB_LD_HIT] }, [ POWER6_PME_PM_FAB_DCLAIM ] = { .pme_name = "PM_FAB_DCLAIM", .pme_code = 0x50184, .pme_short_desc = "Dclaim operation", .pme_long_desc = " locally mastered", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FAB_DCLAIM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FAB_DCLAIM] }, [ POWER6_PME_PM_MEM_DP_CL_WR_LOC ] = { .pme_name = "PM_MEM_DP_CL_WR_LOC", .pme_code = 0x150232, .pme_short_desc = "cache line write setting double pump state to local", .pme_long_desc = "cache line write setting double pump state to local", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MEM_DP_CL_WR_LOC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MEM_DP_CL_WR_LOC] }, [ POWER6_PME_PM_BR_MPRED_CR ] = { .pme_name = "PM_BR_MPRED_CR", .pme_code = 0x410a8, .pme_short_desc = "Branch mispredictions due to CR bit setting", .pme_long_desc = "This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. 
This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_BR_MPRED_CR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_BR_MPRED_CR] }, [ POWER6_PME_PM_LSU_REJECT_EXTERN ] = { .pme_name = "PM_LSU_REJECT_EXTERN", .pme_code = 0x3a1030, .pme_short_desc = "LSU external reject request ", .pme_long_desc = "LSU external reject request ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT_EXTERN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT_EXTERN] }, [ POWER6_PME_PM_DATA_FROM_RL2L3_MOD ] = { .pme_name = "PM_DATA_FROM_RL2L3_MOD", .pme_code = 0x10005c, .pme_short_desc = "Data loaded from remote L2 or L3 modified", .pme_long_desc = "Data loaded from remote L2 or L3 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_RL2L3_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_RL2L3_MOD] }, [ POWER6_PME_PM_DPU_HELD_RU_WQ ] = { .pme_name = "PM_DPU_HELD_RU_WQ", .pme_code = 0x2008e, .pme_short_desc = "DISP unit held due to RU FXU write queue full", .pme_long_desc = "DISP unit held due to RU FXU write queue full", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_RU_WQ], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_RU_WQ] }, [ POWER6_PME_PM_LD_MISS_L1 ] = { .pme_name = "PM_LD_MISS_L1", .pme_code = 0x80080, .pme_short_desc = "L1 D cache load misses", .pme_long_desc = "Total DL1 Load references that miss the DL1", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LD_MISS_L1], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LD_MISS_L1] }, [ POWER6_PME_PM_DC_INV_L2 ] = { .pme_name = "PM_DC_INV_L2", .pme_code = 0x150632, .pme_short_desc = "L1 D cache entries invalidated from L2", .pme_long_desc = "A dcache invalidated was received from the L2 because a line in L2 was castout.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DC_INV_L2], .pme_group_vector = 
power6_group_vecs[POWER6_PME_PM_DC_INV_L2] }, [ POWER6_PME_PM_MRK_PTEG_FROM_RMEM ] = { .pme_name = "PM_MRK_PTEG_FROM_RMEM", .pme_code = 0x312042, .pme_short_desc = "Marked PTEG loaded from remote memory", .pme_long_desc = "Marked PTEG loaded from remote memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_RMEM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_RMEM] }, [ POWER6_PME_PM_FPU_FIN ] = { .pme_name = "PM_FPU_FIN", .pme_code = 0x1d0030, .pme_short_desc = "FPU produced a result", .pme_long_desc = "FPU finished, produced a result This only indicates finish, not completion. Combined Unit 0 + Unit 1", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_FIN] }, [ POWER6_PME_PM_FXU0_FIN ] = { .pme_name = "PM_FXU0_FIN", .pme_code = 0x300016, .pme_short_desc = "FXU0 produced a result", .pme_long_desc = "The Fixed Point unit 0 finished an instruction and produced a result", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FXU0_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FXU0_FIN] }, [ POWER6_PME_PM_DPU_HELD_FPQ ] = { .pme_name = "PM_DPU_HELD_FPQ", .pme_code = 0x20086, .pme_short_desc = "DISP unit held due to FPU issue queue full", .pme_long_desc = "DISP unit held due to FPU issue queue full", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_FPQ], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_FPQ] }, [ POWER6_PME_PM_GX_DMA_READ ] = { .pme_name = "PM_GX_DMA_READ", .pme_code = 0x5038c, .pme_short_desc = "DMA Read Request", .pme_long_desc = "DMA Read Request", .pme_event_ids = power6_event_ids[POWER6_PME_PM_GX_DMA_READ], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_GX_DMA_READ] }, [ POWER6_PME_PM_LSU1_REJECT_PARTIAL_SECTOR ] = { .pme_name = "PM_LSU1_REJECT_PARTIAL_SECTOR", .pme_code = 0xa008e, .pme_short_desc = "LSU1 reject due to partial sector valid", .pme_long_desc = "LSU1 reject due to partial sector valid", .pme_event_ids = 
power6_event_ids[POWER6_PME_PM_LSU1_REJECT_PARTIAL_SECTOR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU1_REJECT_PARTIAL_SECTOR] }, [ POWER6_PME_PM_0INST_FETCH_COUNT ] = { .pme_name = "PM_0INST_FETCH_COUNT", .pme_code = 0x40081, .pme_short_desc = "Periods with no instructions fetched", .pme_long_desc = "No instructions were fetched in these periods (due to IFU hold, redirect, or icache miss)", .pme_event_ids = power6_event_ids[POWER6_PME_PM_0INST_FETCH_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_0INST_FETCH_COUNT] }, [ POWER6_PME_PM_PMC5_OVERFLOW ] = { .pme_name = "PM_PMC5_OVERFLOW", .pme_code = 0x100024, .pme_short_desc = "PMC5 Overflow", .pme_long_desc = "PMC5 Overflow", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PMC5_OVERFLOW], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PMC5_OVERFLOW] }, [ POWER6_PME_PM_L2SB_LD_REQ ] = { .pme_name = "PM_L2SB_LD_REQ", .pme_code = 0x50788, .pme_short_desc = "L2 slice B load requests ", .pme_long_desc = "L2 slice B load requests ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SB_LD_REQ], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SB_LD_REQ] }, [ POWER6_PME_PM_THRD_PRIO_DIFF_0_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_0_CYC", .pme_code = 0x123040, .pme_short_desc = "Cycles no thread priority difference", .pme_long_desc = "Cycles no thread priority difference", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_PRIO_DIFF_0_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_PRIO_DIFF_0_CYC] }, [ POWER6_PME_PM_DATA_FROM_RMEM ] = { .pme_name = "PM_DATA_FROM_RMEM", .pme_code = 0x30005e, .pme_short_desc = "Data loaded from remote memory", .pme_long_desc = "Data loaded from remote memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_RMEM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_RMEM] }, [ POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_BOTH_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_BOTH_CYC", .pme_code = 0x30001c, .pme_short_desc = "Cycles both threads 
LMQ and SRQ empty", .pme_long_desc = "Cycles both threads LMQ and SRQ empty", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_BOTH_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_BOTH_CYC] }, [ POWER6_PME_PM_ST_REF_L1_BOTH ] = { .pme_name = "PM_ST_REF_L1_BOTH", .pme_code = 0x280038, .pme_short_desc = "Both units L1 D cache store reference", .pme_long_desc = "Both units L1 D cache store reference", .pme_event_ids = power6_event_ids[POWER6_PME_PM_ST_REF_L1_BOTH], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_ST_REF_L1_BOTH] }, [ POWER6_PME_PM_VMX_PERMUTE_ISSUED ] = { .pme_name = "PM_VMX_PERMUTE_ISSUED", .pme_code = 0x70086, .pme_short_desc = "VMX instruction issued to permute", .pme_long_desc = "VMX instruction issued to permute", .pme_event_ids = power6_event_ids[POWER6_PME_PM_VMX_PERMUTE_ISSUED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_VMX_PERMUTE_ISSUED] }, [ POWER6_PME_PM_BR_TAKEN ] = { .pme_name = "PM_BR_TAKEN", .pme_code = 0x200052, .pme_short_desc = "Branches taken", .pme_long_desc = "Branches taken", .pme_event_ids = power6_event_ids[POWER6_PME_PM_BR_TAKEN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_BR_TAKEN] }, [ POWER6_PME_PM_FAB_DMA ] = { .pme_name = "PM_FAB_DMA", .pme_code = 0x5018c, .pme_short_desc = "DMA operation", .pme_long_desc = " locally mastered", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FAB_DMA], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FAB_DMA] }, [ POWER6_PME_PM_GCT_EMPTY_COUNT ] = { .pme_name = "PM_GCT_EMPTY_COUNT", .pme_code = 0x200009, .pme_short_desc = "Periods GCT empty", .pme_long_desc = "The Global Completion Table is completely empty.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_GCT_EMPTY_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_GCT_EMPTY_COUNT] }, [ POWER6_PME_PM_FPU1_SINGLE ] = { .pme_name = "PM_FPU1_SINGLE", .pme_code = 0xc10ae, .pme_short_desc = "FPU1 executed single precision instruction", .pme_long_desc = "This 
signal is active for one cycle when fp1 is executing single precision instruction.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU1_SINGLE], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU1_SINGLE] }, [ POWER6_PME_PM_L2SA_CASTOUT_SHR ] = { .pme_name = "PM_L2SA_CASTOUT_SHR", .pme_code = 0x50682, .pme_short_desc = "L2 slice A castouts - Shared", .pme_long_desc = "L2 slice A castouts - Shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SA_CASTOUT_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SA_CASTOUT_SHR] }, [ POWER6_PME_PM_L3SB_REF ] = { .pme_name = "PM_L3SB_REF", .pme_code = 0x50088, .pme_short_desc = "L3 slice B references", .pme_long_desc = "L3 slice B references", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L3SB_REF], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L3SB_REF] }, [ POWER6_PME_PM_FPU0_FRSP ] = { .pme_name = "PM_FPU0_FRSP", .pme_code = 0xd10a2, .pme_short_desc = "FPU0 executed FRSP instruction", .pme_long_desc = "FPU0 executed FRSP instruction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU0_FRSP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU0_FRSP] }, [ POWER6_PME_PM_PMC4_SAVED ] = { .pme_name = "PM_PMC4_SAVED", .pme_code = 0x300022, .pme_short_desc = "PMC4 rewind value saved", .pme_long_desc = "PMC4 rewind value saved", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PMC4_SAVED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PMC4_SAVED] }, [ POWER6_PME_PM_L2SA_DC_INV ] = { .pme_name = "PM_L2SA_DC_INV", .pme_code = 0x50686, .pme_short_desc = "L2 slice A D cache invalidate", .pme_long_desc = "L2 slice A D cache invalidate", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SA_DC_INV], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SA_DC_INV] }, [ POWER6_PME_PM_GXI_ADDR_CYC_BUSY ] = { .pme_name = "PM_GXI_ADDR_CYC_BUSY", .pme_code = 0x50388, .pme_short_desc = "Inbound GX address utilization (# of cycle address is in valid)", .pme_long_desc = "Inbound GX address utilization (# 
of cycle address is in valid)", .pme_event_ids = power6_event_ids[POWER6_PME_PM_GXI_ADDR_CYC_BUSY], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_GXI_ADDR_CYC_BUSY] }, [ POWER6_PME_PM_FPU0_FMA ] = { .pme_name = "PM_FPU0_FMA", .pme_code = 0xc0082, .pme_short_desc = "FPU0 executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU0_FMA], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU0_FMA] }, [ POWER6_PME_PM_SLB_MISS ] = { .pme_name = "PM_SLB_MISS", .pme_code = 0x183034, .pme_short_desc = "SLB misses", .pme_long_desc = "SLB misses", .pme_event_ids = power6_event_ids[POWER6_PME_PM_SLB_MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_SLB_MISS] }, [ POWER6_PME_PM_MRK_ST_GPS ] = { .pme_name = "PM_MRK_ST_GPS", .pme_code = 0x200006, .pme_short_desc = "Marked store sent to GPS", .pme_long_desc = "A sampled store has been sent to the memory subsystem", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_ST_GPS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_ST_GPS] }, [ POWER6_PME_PM_DERAT_REF_4K ] = { .pme_name = "PM_DERAT_REF_4K", .pme_code = 0x182070, .pme_short_desc = "DERAT reference for 4K page", .pme_long_desc = "DERAT reference for 4K page", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DERAT_REF_4K], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DERAT_REF_4K] }, [ POWER6_PME_PM_L2_CASTOUT_SHR ] = { .pme_name = "PM_L2_CASTOUT_SHR", .pme_code = 0x250630, .pme_short_desc = "L2 castouts - Shared (T", .pme_long_desc = " Te", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2_CASTOUT_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2_CASTOUT_SHR] }, [ POWER6_PME_PM_DPU_HELD_STCX_CR ] = { .pme_name = "PM_DPU_HELD_STCX_CR", .pme_code = 0x2008c, .pme_short_desc = "DISP unit held due to STCX updating 
CR ", .pme_long_desc = "DISP unit held due to STCX updating CR ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_STCX_CR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_STCX_CR] }, [ POWER6_PME_PM_FPU0_ST_FOLDED ] = { .pme_name = "PM_FPU0_ST_FOLDED", .pme_code = 0xd10a4, .pme_short_desc = "FPU0 folded store", .pme_long_desc = "FPU0 folded store", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU0_ST_FOLDED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU0_ST_FOLDED] }, [ POWER6_PME_PM_MRK_DATA_FROM_L21 ] = { .pme_name = "PM_MRK_DATA_FROM_L21", .pme_code = 0x203048, .pme_short_desc = "Marked data loaded from private L2 other core", .pme_long_desc = "Marked data loaded from private L2 other core", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_L21], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_L21] }, [ POWER6_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus3or4_CYC", .pme_code = 0x323046, .pme_short_desc = "Cycles thread priority difference is -3 or -4", .pme_long_desc = "Cycles thread priority difference is -3 or -4", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC] }, [ POWER6_PME_PM_DATA_FROM_L35_MOD ] = { .pme_name = "PM_DATA_FROM_L35_MOD", .pme_code = 0x10005a, .pme_short_desc = "Data loaded from L3.5 modified", .pme_long_desc = "Data loaded from L3.5 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_L35_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_L35_MOD] }, [ POWER6_PME_PM_DATA_FROM_DL2L3_SHR ] = { .pme_name = "PM_DATA_FROM_DL2L3_SHR", .pme_code = 0x30005c, .pme_short_desc = "Data loaded from distant L2 or L3 shared", .pme_long_desc = "Data loaded from distant L2 or L3 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_DL2L3_SHR], .pme_group_vector = 
power6_group_vecs[POWER6_PME_PM_DATA_FROM_DL2L3_SHR] }, [ POWER6_PME_PM_GXI_DATA_CYC_BUSY ] = { .pme_name = "PM_GXI_DATA_CYC_BUSY", .pme_code = 0x5038a, .pme_short_desc = "Inbound GX Data utilization (# of cycle data in is valid)", .pme_long_desc = "Inbound GX Data utilization (# of cycle data in is valid)", .pme_event_ids = power6_event_ids[POWER6_PME_PM_GXI_DATA_CYC_BUSY], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_GXI_DATA_CYC_BUSY] }, [ POWER6_PME_PM_LSU_REJECT_STEAL ] = { .pme_name = "PM_LSU_REJECT_STEAL", .pme_code = 0x9008c, .pme_short_desc = "LSU reject due to steal", .pme_long_desc = "LSU reject due to steal", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT_STEAL], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT_STEAL] }, [ POWER6_PME_PM_ST_FIN ] = { .pme_name = "PM_ST_FIN", .pme_code = 0x100054, .pme_short_desc = "Store instructions finished", .pme_long_desc = "Store instructions finished", .pme_event_ids = power6_event_ids[POWER6_PME_PM_ST_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_ST_FIN] }, [ POWER6_PME_PM_DPU_HELD_CR_LOGICAL ] = { .pme_name = "PM_DPU_HELD_CR_LOGICAL", .pme_code = 0x3008e, .pme_short_desc = "DISP unit held due to CR", .pme_long_desc = " LR or CTR updated by CR logical", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_CR_LOGICAL], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_CR_LOGICAL] }, [ POWER6_PME_PM_THRD_SEL_T0 ] = { .pme_name = "PM_THRD_SEL_T0", .pme_code = 0x310a6, .pme_short_desc = "Decode selected thread 0", .pme_long_desc = "Decode selected thread 0", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_SEL_T0], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_SEL_T0] }, [ POWER6_PME_PM_PTEG_RELOAD_VALID ] = { .pme_name = "PM_PTEG_RELOAD_VALID", .pme_code = 0x130e8, .pme_short_desc = "TLB reload valid", .pme_long_desc = "TLB reload valid", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_RELOAD_VALID], .pme_group_vector = 
power6_group_vecs[POWER6_PME_PM_PTEG_RELOAD_VALID] }, [ POWER6_PME_PM_L2_PREF_ST ] = { .pme_name = "PM_L2_PREF_ST", .pme_code = 0x810a8, .pme_short_desc = "L2 cache prefetches", .pme_long_desc = "L2 cache prefetches", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2_PREF_ST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2_PREF_ST] }, [ POWER6_PME_PM_MRK_STCX_FAIL ] = { .pme_name = "PM_MRK_STCX_FAIL", .pme_code = 0x830e4, .pme_short_desc = "Marked STCX failed", .pme_long_desc = "A marked stcx (stwcx or stdcx) failed", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_STCX_FAIL], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_STCX_FAIL] }, [ POWER6_PME_PM_LSU0_REJECT_LHS ] = { .pme_name = "PM_LSU0_REJECT_LHS", .pme_code = 0x90086, .pme_short_desc = "LSU0 load hit store reject", .pme_long_desc = "LSU0 load hit store reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU0_REJECT_LHS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU0_REJECT_LHS] }, [ POWER6_PME_PM_DFU_EXP_EQ ] = { .pme_name = "PM_DFU_EXP_EQ", .pme_code = 0xe0084, .pme_short_desc = "DFU operand exponents are equal for add type", .pme_long_desc = "DFU operand exponents are equal for add type", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DFU_EXP_EQ], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DFU_EXP_EQ] }, [ POWER6_PME_PM_DPU_HELD_FP_FX_MULT ] = { .pme_name = "PM_DPU_HELD_FP_FX_MULT", .pme_code = 0x210a8, .pme_short_desc = "DISP unit held due to non fixed multiple/divide after fixed multiply/divide", .pme_long_desc = "DISP unit held due to non fixed multiple/divide after fixed multiply/divide", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_FP_FX_MULT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_FP_FX_MULT] }, [ POWER6_PME_PM_L2_LD_MISS_DATA ] = { .pme_name = "PM_L2_LD_MISS_DATA", .pme_code = 0x250430, .pme_short_desc = "L2 data load misses", .pme_long_desc = "L2 data load misses", .pme_event_ids = 
power6_event_ids[POWER6_PME_PM_L2_LD_MISS_DATA], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2_LD_MISS_DATA] }, [ POWER6_PME_PM_DATA_FROM_L35_MOD_CYC ] = { .pme_name = "PM_DATA_FROM_L35_MOD_CYC", .pme_code = 0x400026, .pme_short_desc = "Load latency from L3.5 modified", .pme_long_desc = "Load latency from L3.5 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_L35_MOD_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_L35_MOD_CYC] }, [ POWER6_PME_PM_FLUSH_FXU ] = { .pme_name = "PM_FLUSH_FXU", .pme_code = 0x230ea, .pme_short_desc = "Flush caused by FXU exception", .pme_long_desc = "Flush caused by FXU exception", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FLUSH_FXU], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FLUSH_FXU] }, [ POWER6_PME_PM_FPU_ISSUE_1 ] = { .pme_name = "PM_FPU_ISSUE_1", .pme_code = 0x320c8, .pme_short_desc = "FPU issue 1 per cycle", .pme_long_desc = "FPU issue 1 per cycle", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_ISSUE_1], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_ISSUE_1] }, [ POWER6_PME_PM_DATA_FROM_LMEM_CYC ] = { .pme_name = "PM_DATA_FROM_LMEM_CYC", .pme_code = 0x20002c, .pme_short_desc = "Load latency from local memory", .pme_long_desc = "Load latency from local memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_LMEM_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_LMEM_CYC] }, [ POWER6_PME_PM_DPU_HELD_LSU_SOPS ] = { .pme_name = "PM_DPU_HELD_LSU_SOPS", .pme_code = 0x30080, .pme_short_desc = "DISP unit held due to LSU slow ops (sync", .pme_long_desc = " tlbie", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_LSU_SOPS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_LSU_SOPS] }, [ POWER6_PME_PM_INST_PTEG_2ND_HALF ] = { .pme_name = "PM_INST_PTEG_2ND_HALF", .pme_code = 0x910aa, .pme_short_desc = "Instruction table walk matched in second half primary PTEG", .pme_long_desc = "Instruction table walk matched in 
second half primary PTEG", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_PTEG_2ND_HALF], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_PTEG_2ND_HALF] }, [ POWER6_PME_PM_THRESH_TIMEO ] = { .pme_name = "PM_THRESH_TIMEO", .pme_code = 0x300018, .pme_short_desc = "Threshold timeout", .pme_long_desc = "The threshold timer expired", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRESH_TIMEO], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRESH_TIMEO] }, [ POWER6_PME_PM_LSU_REJECT_UST_BOTH ] = { .pme_name = "PM_LSU_REJECT_UST_BOTH", .pme_code = 0x190036, .pme_short_desc = "Unaligned store reject both units", .pme_long_desc = "Unaligned store reject both units", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT_UST_BOTH], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT_UST_BOTH] }, [ POWER6_PME_PM_LSU_REJECT_FAST ] = { .pme_name = "PM_LSU_REJECT_FAST", .pme_code = 0x30003e, .pme_short_desc = "LSU fast reject", .pme_long_desc = "LSU fast reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT_FAST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT_FAST] }, [ POWER6_PME_PM_DPU_HELD_THRD_PRIO ] = { .pme_name = "PM_DPU_HELD_THRD_PRIO", .pme_code = 0x3008a, .pme_short_desc = "DISP unit held due to lower priority thread", .pme_long_desc = "DISP unit held due to lower priority thread", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_THRD_PRIO], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_THRD_PRIO] }, [ POWER6_PME_PM_L2_PREF_LD ] = { .pme_name = "PM_L2_PREF_LD", .pme_code = 0x810a6, .pme_short_desc = "L2 cache prefetches", .pme_long_desc = "L2 cache prefetches", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2_PREF_LD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2_PREF_LD] }, [ POWER6_PME_PM_FPU_FEST ] = { .pme_name = "PM_FPU_FEST", .pme_code = 0x4d1030, .pme_short_desc = "FPU executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when 
executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_FEST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_FEST] }, [ POWER6_PME_PM_MRK_DATA_FROM_RMEM ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM", .pme_code = 0x30304a, .pme_short_desc = "Marked data loaded from remote memory", .pme_long_desc = "Marked data loaded from remote memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_RMEM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_RMEM] }, [ POWER6_PME_PM_LD_MISS_L1_CYC ] = { .pme_name = "PM_LD_MISS_L1_CYC", .pme_code = 0x10000c, .pme_short_desc = "L1 data load miss cycles", .pme_long_desc = "L1 data load miss cycles", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LD_MISS_L1_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LD_MISS_L1_CYC] }, [ POWER6_PME_PM_DERAT_MISS_4K ] = { .pme_name = "PM_DERAT_MISS_4K", .pme_code = 0x192070, .pme_short_desc = "DERAT misses for 4K page", .pme_long_desc = "A data request (load or store) missed the ERAT for 4K page and resulted in an ERAT reload.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DERAT_MISS_4K], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DERAT_MISS_4K] }, [ POWER6_PME_PM_DPU_HELD_COMPLETION ] = { .pme_name = "PM_DPU_HELD_COMPLETION", .pme_code = 0x210ac, .pme_short_desc = "DISP unit held due to completion holding dispatch ", .pme_long_desc = "DISP unit held due to completion holding dispatch ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_COMPLETION], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_COMPLETION] }, [ POWER6_PME_PM_FPU_ISSUE_STALL_ST ] = { .pme_name = "PM_FPU_ISSUE_STALL_ST", .pme_code = 0x320ce, .pme_short_desc = "FPU issue stalled due to store", .pme_long_desc = "FPU issue stalled due to store", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_ISSUE_STALL_ST], .pme_group_vector = 
power6_group_vecs[POWER6_PME_PM_FPU_ISSUE_STALL_ST] }, [ POWER6_PME_PM_L2SB_DC_INV ] = { .pme_name = "PM_L2SB_DC_INV", .pme_code = 0x5068e, .pme_short_desc = "L2 slice B D cache invalidate", .pme_long_desc = "L2 slice B D cache invalidate", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SB_DC_INV], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SB_DC_INV] }, [ POWER6_PME_PM_PTEG_FROM_L25_SHR ] = { .pme_name = "PM_PTEG_FROM_L25_SHR", .pme_code = 0x41304e, .pme_short_desc = "PTEG loaded from L2.5 shared", .pme_long_desc = "PTEG loaded from L2.5 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_L25_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PTEG_FROM_L25_SHR] }, [ POWER6_PME_PM_PTEG_FROM_DL2L3_MOD ] = { .pme_name = "PM_PTEG_FROM_DL2L3_MOD", .pme_code = 0x41304c, .pme_short_desc = "PTEG loaded from distant L2 or L3 modified", .pme_long_desc = "PTEG loaded from distant L2 or L3 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_DL2L3_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PTEG_FROM_DL2L3_MOD] }, [ POWER6_PME_PM_FAB_CMD_RETRIED ] = { .pme_name = "PM_FAB_CMD_RETRIED", .pme_code = 0x250130, .pme_short_desc = "Fabric command retried", .pme_long_desc = "Fabric command retried", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FAB_CMD_RETRIED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FAB_CMD_RETRIED] }, [ POWER6_PME_PM_BR_PRED_LSTACK ] = { .pme_name = "PM_BR_PRED_LSTACK", .pme_code = 0x410a6, .pme_short_desc = "A conditional branch was predicted", .pme_long_desc = " link stack", .pme_event_ids = power6_event_ids[POWER6_PME_PM_BR_PRED_LSTACK], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_BR_PRED_LSTACK] }, [ POWER6_PME_PM_GXO_DATA_CYC_BUSY ] = { .pme_name = "PM_GXO_DATA_CYC_BUSY", .pme_code = 0x50384, .pme_short_desc = "Outbound GX Data utilization (# of cycles data out is valid)", .pme_long_desc = "Outbound GX Data utilization (# of cycles data out is valid)", .pme_event_ids = 
power6_event_ids[POWER6_PME_PM_GXO_DATA_CYC_BUSY], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_GXO_DATA_CYC_BUSY] }, [ POWER6_PME_PM_DFU_SUBNORM ] = { .pme_name = "PM_DFU_SUBNORM", .pme_code = 0xe0086, .pme_short_desc = "DFU result is a subnormal", .pme_long_desc = "DFU result is a subnormal", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DFU_SUBNORM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DFU_SUBNORM] }, [ POWER6_PME_PM_FPU_ISSUE_OOO ] = { .pme_name = "PM_FPU_ISSUE_OOO", .pme_code = 0x320c0, .pme_short_desc = "FPU issue out-of-order", .pme_long_desc = "FPU issue out-of-order", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_ISSUE_OOO], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_ISSUE_OOO] }, [ POWER6_PME_PM_LSU_REJECT_ULD_BOTH ] = { .pme_name = "PM_LSU_REJECT_ULD_BOTH", .pme_code = 0x290036, .pme_short_desc = "Unaligned load reject both units", .pme_long_desc = "Unaligned load reject both units", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT_ULD_BOTH], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT_ULD_BOTH] }, [ POWER6_PME_PM_L2SB_ST_MISS ] = { .pme_name = "PM_L2SB_ST_MISS", .pme_code = 0x5048e, .pme_short_desc = "L2 slice B store misses", .pme_long_desc = "L2 slice B store misses", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SB_ST_MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SB_ST_MISS] }, [ POWER6_PME_PM_DATA_FROM_L25_MOD_CYC ] = { .pme_name = "PM_DATA_FROM_L25_MOD_CYC", .pme_code = 0x400024, .pme_short_desc = "Load latency from L2.5 modified", .pme_long_desc = "Load latency from L2.5 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_L25_MOD_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_L25_MOD_CYC] }, [ POWER6_PME_PM_INST_PTEG_1ST_HALF ] = { .pme_name = "PM_INST_PTEG_1ST_HALF", .pme_code = 0x910a8, .pme_short_desc = "Instruction table walk matched in first half primary PTEG", .pme_long_desc = "Instruction table walk matched in 
first half primary PTEG", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_PTEG_1ST_HALF], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_PTEG_1ST_HALF] }, [ POWER6_PME_PM_DERAT_MISS_16M ] = { .pme_name = "PM_DERAT_MISS_16M", .pme_code = 0x392070, .pme_short_desc = "DERAT misses for 16M page", .pme_long_desc = "A data request (load or store) missed the ERAT for 16M page and resulted in an ERAT reload.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DERAT_MISS_16M], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DERAT_MISS_16M] }, [ POWER6_PME_PM_GX_DMA_WRITE ] = { .pme_name = "PM_GX_DMA_WRITE", .pme_code = 0x5038e, .pme_short_desc = "All DMA Write Requests (including dma wrt lgcy)", .pme_long_desc = "All DMA Write Requests (including dma wrt lgcy)", .pme_event_ids = power6_event_ids[POWER6_PME_PM_GX_DMA_WRITE], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_GX_DMA_WRITE] }, [ POWER6_PME_PM_MRK_PTEG_FROM_DL2L3_MOD ] = { .pme_name = "PM_MRK_PTEG_FROM_DL2L3_MOD", .pme_code = 0x412044, .pme_short_desc = "Marked PTEG loaded from distant L2 or L3 modified", .pme_long_desc = "Marked PTEG loaded from distant L2 or L3 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_DL2L3_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_DL2L3_MOD] }, [ POWER6_PME_PM_MEM1_DP_RQ_GLOB_LOC ] = { .pme_name = "PM_MEM1_DP_RQ_GLOB_LOC", .pme_code = 0x50288, .pme_short_desc = "Memory read queue marking cache line double pump state from global to local side 1", .pme_long_desc = "Memory read queue marking cache line double pump state from global to local side 1", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MEM1_DP_RQ_GLOB_LOC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MEM1_DP_RQ_GLOB_LOC] }, [ POWER6_PME_PM_L2SB_LD_REQ_DATA ] = { .pme_name = "PM_L2SB_LD_REQ_DATA", .pme_code = 0x50488, .pme_short_desc = "L2 slice B data load requests", .pme_long_desc = "L2 slice B data load requests", .pme_event_ids = 
power6_event_ids[POWER6_PME_PM_L2SB_LD_REQ_DATA], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SB_LD_REQ_DATA] }, [ POWER6_PME_PM_L2SA_LD_MISS_INST ] = { .pme_name = "PM_L2SA_LD_MISS_INST", .pme_code = 0x50582, .pme_short_desc = "L2 slice A instruction load misses", .pme_long_desc = "L2 slice A instruction load misses", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SA_LD_MISS_INST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SA_LD_MISS_INST] }, [ POWER6_PME_PM_MRK_LSU0_REJECT_L2MISS ] = { .pme_name = "PM_MRK_LSU0_REJECT_L2MISS", .pme_code = 0x930e4, .pme_short_desc = "LSU0 marked L2 miss reject", .pme_long_desc = "LSU0 marked L2 miss reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_LSU0_REJECT_L2MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_LSU0_REJECT_L2MISS] }, [ POWER6_PME_PM_MRK_IFU_FIN ] = { .pme_name = "PM_MRK_IFU_FIN", .pme_code = 0x20000a, .pme_short_desc = "Marked instruction IFU processing finished", .pme_long_desc = "Marked instruction IFU processing finished", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_IFU_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_IFU_FIN] }, [ POWER6_PME_PM_INST_FROM_L3 ] = { .pme_name = "PM_INST_FROM_L3", .pme_code = 0x342040, .pme_short_desc = "Instruction fetched from L3", .pme_long_desc = "An instruction fetch group was fetched from L3. 
Fetch Groups can contain up to 8 instructions", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_L3], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_L3] }, [ POWER6_PME_PM_FXU1_FIN ] = { .pme_name = "PM_FXU1_FIN", .pme_code = 0x400016, .pme_short_desc = "FXU1 produced a result", .pme_long_desc = "The Fixed Point unit 1 finished an instruction and produced a result", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FXU1_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FXU1_FIN] }, [ POWER6_PME_PM_THRD_PRIO_4_CYC ] = { .pme_name = "PM_THRD_PRIO_4_CYC", .pme_code = 0x422046, .pme_short_desc = "Cycles thread running at priority level 4", .pme_long_desc = "Cycles thread running at priority level 4", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_PRIO_4_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_PRIO_4_CYC] }, [ POWER6_PME_PM_MRK_DATA_FROM_L35_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L35_MOD", .pme_code = 0x10304e, .pme_short_desc = "Marked data loaded from L3.5 modified", .pme_long_desc = "Marked data loaded from L3.5 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_L35_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_L35_MOD] }, [ POWER6_PME_PM_LSU_REJECT_SET_MPRED ] = { .pme_name = "PM_LSU_REJECT_SET_MPRED", .pme_code = 0x2a0032, .pme_short_desc = "LSU reject due to mispredicted set", .pme_long_desc = "LSU reject due to mispredicted set", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT_SET_MPRED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT_SET_MPRED] }, [ POWER6_PME_PM_MRK_DERAT_MISS_16G ] = { .pme_name = "PM_MRK_DERAT_MISS_16G", .pme_code = 0x492044, .pme_short_desc = "Marked DERAT misses for 16G page", .pme_long_desc = "A marked data request (load or store) missed the ERAT for 16G page and resulted in an ERAT reload.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DERAT_MISS_16G], .pme_group_vector = 
power6_group_vecs[POWER6_PME_PM_MRK_DERAT_MISS_16G] }, [ POWER6_PME_PM_FPU0_FXDIV ] = { .pme_name = "PM_FPU0_FXDIV", .pme_code = 0xc10a0, .pme_short_desc = "FPU0 executed fixed point division", .pme_long_desc = "FPU0 executed fixed point division", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU0_FXDIV], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU0_FXDIV] }, [ POWER6_PME_PM_MRK_LSU1_REJECT_UST ] = { .pme_name = "PM_MRK_LSU1_REJECT_UST", .pme_code = 0x930ea, .pme_short_desc = "LSU1 marked unaligned store reject", .pme_long_desc = "LSU1 marked unaligned store reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_LSU1_REJECT_UST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_LSU1_REJECT_UST] }, [ POWER6_PME_PM_FPU_ISSUE_DIV_SQRT_OVERLAP ] = { .pme_name = "PM_FPU_ISSUE_DIV_SQRT_OVERLAP", .pme_code = 0x320cc, .pme_short_desc = "FPU divide/sqrt overlapped with other divide/sqrt", .pme_long_desc = "FPU divide/sqrt overlapped with other divide/sqrt", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_ISSUE_DIV_SQRT_OVERLAP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_ISSUE_DIV_SQRT_OVERLAP] }, [ POWER6_PME_PM_INST_FROM_L35_SHR ] = { .pme_name = "PM_INST_FROM_L35_SHR", .pme_code = 0x242046, .pme_short_desc = "Instruction fetched from L3.5 shared", .pme_long_desc = "Instruction fetched from L3.5 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_L35_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_L35_SHR] }, [ POWER6_PME_PM_MRK_LSU_REJECT_LHS ] = { .pme_name = "PM_MRK_LSU_REJECT_LHS", .pme_code = 0x493030, .pme_short_desc = "Marked load hit store reject", .pme_long_desc = "Marked load hit store reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_LSU_REJECT_LHS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_LSU_REJECT_LHS] }, [ POWER6_PME_PM_LSU_LMQ_FULL_CYC ] = { .pme_name = "PM_LSU_LMQ_FULL_CYC", .pme_code = 0x810ac, .pme_short_desc = "Cycles LMQ full", .pme_long_desc = 
"The LMQ was full", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_LMQ_FULL_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_LMQ_FULL_CYC] }, [ POWER6_PME_PM_SYNC_COUNT ] = { .pme_name = "PM_SYNC_COUNT", .pme_code = 0x920cd, .pme_short_desc = "SYNC instructions completed", .pme_long_desc = "SYNC instructions completed", .pme_event_ids = power6_event_ids[POWER6_PME_PM_SYNC_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_SYNC_COUNT] }, [ POWER6_PME_PM_MEM0_DP_RQ_LOC_GLOB ] = { .pme_name = "PM_MEM0_DP_RQ_LOC_GLOB", .pme_code = 0x50282, .pme_short_desc = "Memory read queue marking cache line double pump state from local to global side 0", .pme_long_desc = "Memory read queue marking cache line double pump state from local to global side 0", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MEM0_DP_RQ_LOC_GLOB], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MEM0_DP_RQ_LOC_GLOB] }, [ POWER6_PME_PM_L2SA_CASTOUT_MOD ] = { .pme_name = "PM_L2SA_CASTOUT_MOD", .pme_code = 0x50680, .pme_short_desc = "L2 slice A castouts - Modified", .pme_long_desc = "L2 slice A castouts - Modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SA_CASTOUT_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SA_CASTOUT_MOD] }, [ POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_BOTH_COUNT ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_BOTH_COUNT", .pme_code = 0x30001d, .pme_short_desc = "Periods both threads LMQ and SRQ empty", .pme_long_desc = "Periods both threads LMQ and SRQ empty", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_BOTH_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_BOTH_COUNT] }, [ POWER6_PME_PM_PTEG_FROM_MEM_DP ] = { .pme_name = "PM_PTEG_FROM_MEM_DP", .pme_code = 0x11304a, .pme_short_desc = "PTEG loaded from double pump memory", .pme_long_desc = "PTEG loaded from double pump memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_MEM_DP], .pme_group_vector = 
power6_group_vecs[POWER6_PME_PM_PTEG_FROM_MEM_DP] }, [ POWER6_PME_PM_LSU_REJECT_SLOW ] = { .pme_name = "PM_LSU_REJECT_SLOW", .pme_code = 0x20003e, .pme_short_desc = "LSU slow reject", .pme_long_desc = "LSU slow reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT_SLOW], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT_SLOW] }, [ POWER6_PME_PM_PTEG_FROM_L25_MOD ] = { .pme_name = "PM_PTEG_FROM_L25_MOD", .pme_code = 0x31304e, .pme_short_desc = "PTEG loaded from L2.5 modified", .pme_long_desc = "PTEG loaded from L2.5 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_L25_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PTEG_FROM_L25_MOD] }, [ POWER6_PME_PM_THRD_PRIO_7_CYC ] = { .pme_name = "PM_THRD_PRIO_7_CYC", .pme_code = 0x122046, .pme_short_desc = "Cycles thread running at priority level 7", .pme_long_desc = "Cycles thread running at priority level 7", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_PRIO_7_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_PRIO_7_CYC] }, [ POWER6_PME_PM_MRK_PTEG_FROM_RL2L3_SHR ] = { .pme_name = "PM_MRK_PTEG_FROM_RL2L3_SHR", .pme_code = 0x212044, .pme_short_desc = "Marked PTEG loaded from remote L2 or L3 shared", .pme_long_desc = "Marked PTEG loaded from remote L2 or L3 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_RL2L3_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_RL2L3_SHR] }, [ POWER6_PME_PM_ST_REQ_L2 ] = { .pme_name = "PM_ST_REQ_L2", .pme_code = 0x250732, .pme_short_desc = "L2 store requests", .pme_long_desc = "L2 store requests", .pme_event_ids = power6_event_ids[POWER6_PME_PM_ST_REQ_L2], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_ST_REQ_L2] }, [ POWER6_PME_PM_ST_REF_L1 ] = { .pme_name = "PM_ST_REF_L1", .pme_code = 0x80086, .pme_short_desc = "L1 D cache store references", .pme_long_desc = "Total DL1 Store references", .pme_event_ids = power6_event_ids[POWER6_PME_PM_ST_REF_L1], .pme_group_vector = 
power6_group_vecs[POWER6_PME_PM_ST_REF_L1] }, [ POWER6_PME_PM_FPU_ISSUE_STALL_THRD ] = { .pme_name = "PM_FPU_ISSUE_STALL_THRD", .pme_code = 0x330e0, .pme_short_desc = "FPU issue stalled due to thread resource conflict", .pme_long_desc = "FPU issue stalled due to thread resource conflict", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_ISSUE_STALL_THRD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_ISSUE_STALL_THRD] }, [ POWER6_PME_PM_RUN_COUNT ] = { .pme_name = "PM_RUN_COUNT", .pme_code = 0x10000b, .pme_short_desc = "Run Periods", .pme_long_desc = "Processor Periods gated by the run latch", .pme_event_ids = power6_event_ids[POWER6_PME_PM_RUN_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_RUN_COUNT] }, [ POWER6_PME_PM_RUN_CYC ] = { .pme_name = "PM_RUN_CYC", .pme_code = 0x10000a, .pme_short_desc = "Run cycles", .pme_long_desc = "Processor Cycles gated by the run latch", .pme_event_ids = power6_event_ids[POWER6_PME_PM_RUN_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_RUN_CYC] }, [ POWER6_PME_PM_PTEG_FROM_RMEM ] = { .pme_name = "PM_PTEG_FROM_RMEM", .pme_code = 0x31304a, .pme_short_desc = "PTEG loaded from remote memory", .pme_long_desc = "PTEG loaded from remote memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_RMEM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PTEG_FROM_RMEM] }, [ POWER6_PME_PM_LSU0_LDF ] = { .pme_name = "PM_LSU0_LDF", .pme_code = 0x80084, .pme_short_desc = "LSU0 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed from LSU unit 0", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU0_LDF], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU0_LDF] }, [ POWER6_PME_PM_ST_MISS_L1 ] = { .pme_name = "PM_ST_MISS_L1", .pme_code = 0x80088, .pme_short_desc = "L1 D cache store misses", .pme_long_desc = "A store missed the dcache", .pme_event_ids = power6_event_ids[POWER6_PME_PM_ST_MISS_L1], .pme_group_vector = 
power6_group_vecs[POWER6_PME_PM_ST_MISS_L1] }, [ POWER6_PME_PM_INST_FROM_DL2L3_SHR ] = { .pme_name = "PM_INST_FROM_DL2L3_SHR", .pme_code = 0x342044, .pme_short_desc = "Instruction fetched from distant L2 or L3 shared", .pme_long_desc = "Instruction fetched from distant L2 or L3 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_DL2L3_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_DL2L3_SHR] }, [ POWER6_PME_PM_L2SA_IC_INV ] = { .pme_name = "PM_L2SA_IC_INV", .pme_code = 0x50684, .pme_short_desc = "L2 slice A I cache invalidate", .pme_long_desc = "L2 slice A I cache invalidate", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SA_IC_INV], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SA_IC_INV] }, [ POWER6_PME_PM_THRD_ONE_RUN_CYC ] = { .pme_name = "PM_THRD_ONE_RUN_CYC", .pme_code = 0x100016, .pme_short_desc = "One of the threads in run cycles", .pme_long_desc = "One of the threads in run cycles", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_ONE_RUN_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_ONE_RUN_CYC] }, [ POWER6_PME_PM_L2SB_LD_REQ_INST ] = { .pme_name = "PM_L2SB_LD_REQ_INST", .pme_code = 0x50588, .pme_short_desc = "L2 slice B instruction load requests", .pme_long_desc = "L2 slice B instruction load requests", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SB_LD_REQ_INST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SB_LD_REQ_INST] }, [ POWER6_PME_PM_MRK_DATA_FROM_L25_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L25_MOD", .pme_code = 0x30304e, .pme_short_desc = "Marked data loaded from L2.5 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a marked demand load", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_L25_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_L25_MOD] }, [ POWER6_PME_PM_DPU_HELD_XTHRD ] = { .pme_name = "PM_DPU_HELD_XTHRD", .pme_code = 0x30082, .pme_short_desc = "DISP unit 
held due to cross thread resource conflicts", .pme_long_desc = "DISP unit held due to cross thread resource conflicts", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_XTHRD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_XTHRD] }, [ POWER6_PME_PM_L2SB_ST_REQ ] = { .pme_name = "PM_L2SB_ST_REQ", .pme_code = 0x5048c, .pme_short_desc = "L2 slice B store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SB_ST_REQ], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SB_ST_REQ] }, [ POWER6_PME_PM_INST_FROM_L21 ] = { .pme_name = "PM_INST_FROM_L21", .pme_code = 0x242040, .pme_short_desc = "Instruction fetched from private L2 other core", .pme_long_desc = "Instruction fetched from private L2 other core", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_L21], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_L21] }, [ POWER6_PME_PM_INST_FROM_L3MISS ] = { .pme_name = "PM_INST_FROM_L3MISS", .pme_code = 0x342054, .pme_short_desc = "Instruction fetched missed L3", .pme_long_desc = "Instruction fetched missed L3", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_L3MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_L3MISS] }, [ POWER6_PME_PM_L3SB_HIT ] = { .pme_name = "PM_L3SB_HIT", .pme_code = 0x5008a, .pme_short_desc = "L3 slice B hits", .pme_long_desc = "L3 slice B hits", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L3SB_HIT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L3SB_HIT] }, [ POWER6_PME_PM_EE_OFF_EXT_INT ] = { .pme_name = "PM_EE_OFF_EXT_INT", .pme_code = 0x230ee, .pme_short_desc = "Cycles MSR(EE) bit off and external interrupt pending", .pme_long_desc = "Cycles MSR(EE) bit off and external interrupt pending", .pme_event_ids = 
power6_event_ids[POWER6_PME_PM_EE_OFF_EXT_INT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_EE_OFF_EXT_INT] }, [ POWER6_PME_PM_INST_FROM_DL2L3_MOD ] = { .pme_name = "PM_INST_FROM_DL2L3_MOD", .pme_code = 0x442044, .pme_short_desc = "Instruction fetched from distant L2 or L3 modified", .pme_long_desc = "Instruction fetched from distant L2 or L3 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_DL2L3_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_DL2L3_MOD] }, [ POWER6_PME_PM_PMC6_OVERFLOW ] = { .pme_name = "PM_PMC6_OVERFLOW", .pme_code = 0x300024, .pme_short_desc = "PMC6 Overflow", .pme_long_desc = "PMC6 Overflow", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PMC6_OVERFLOW], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PMC6_OVERFLOW] }, [ POWER6_PME_PM_FPU_FLOP ] = { .pme_name = "PM_FPU_FLOP", .pme_code = 0x1c0032, .pme_short_desc = "FPU executed 1FLOP", .pme_long_desc = " FMA", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_FLOP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_FLOP] }, [ POWER6_PME_PM_FXU_BUSY ] = { .pme_name = "PM_FXU_BUSY", .pme_code = 0x200050, .pme_short_desc = "FXU busy", .pme_long_desc = "FXU0 and FXU1 are both busy", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FXU_BUSY], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FXU_BUSY] }, [ POWER6_PME_PM_FPU1_FLOP ] = { .pme_name = "PM_FPU1_FLOP", .pme_code = 0xc008e, .pme_short_desc = "FPU1 executed 1FLOP", .pme_long_desc = " FMA", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU1_FLOP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU1_FLOP] }, [ POWER6_PME_PM_IC_RELOAD_SHR ] = { .pme_name = "PM_IC_RELOAD_SHR", .pme_code = 0x4008e, .pme_short_desc = "I cache line reloading to be shared by threads", .pme_long_desc = "I cache line reloading to be shared by threads", .pme_event_ids = power6_event_ids[POWER6_PME_PM_IC_RELOAD_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_IC_RELOAD_SHR] }, [ 
POWER6_PME_PM_INST_TABLEWALK_CYC ] = { .pme_name = "PM_INST_TABLEWALK_CYC", .pme_code = 0x920ca, .pme_short_desc = "Cycles doing instruction tablewalks", .pme_long_desc = "Cycles doing instruction tablewalks", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_TABLEWALK_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_TABLEWALK_CYC] }, [ POWER6_PME_PM_DATA_FROM_RL2L3_MOD_CYC ] = { .pme_name = "PM_DATA_FROM_RL2L3_MOD_CYC", .pme_code = 0x400028, .pme_short_desc = "Load latency from remote L2 or L3 modified", .pme_long_desc = "Load latency from remote L2 or L3 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_RL2L3_MOD_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_RL2L3_MOD_CYC] }, [ POWER6_PME_PM_THRD_PRIO_DIFF_5or6_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_5or6_CYC", .pme_code = 0x423040, .pme_short_desc = "Cycles thread priority difference is 5 or 6", .pme_long_desc = "Cycles thread priority difference is 5 or 6", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_PRIO_DIFF_5or6_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_PRIO_DIFF_5or6_CYC] }, [ POWER6_PME_PM_IBUF_FULL_CYC ] = { .pme_name = "PM_IBUF_FULL_CYC", .pme_code = 0x40084, .pme_short_desc = "Cycles instruction buffer full", .pme_long_desc = "Cycles instruction buffer full", .pme_event_ids = power6_event_ids[POWER6_PME_PM_IBUF_FULL_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_IBUF_FULL_CYC] }, [ POWER6_PME_PM_L2SA_LD_REQ ] = { .pme_name = "PM_L2SA_LD_REQ", .pme_code = 0x50780, .pme_short_desc = "L2 slice A load requests ", .pme_long_desc = "L2 slice A load requests ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SA_LD_REQ], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SA_LD_REQ] }, [ POWER6_PME_PM_VMX1_LD_WRBACK ] = { .pme_name = "PM_VMX1_LD_WRBACK", .pme_code = 0x6008c, .pme_short_desc = "VMX1 load writeback valid", .pme_long_desc = "VMX1 load writeback valid", .pme_event_ids = 
power6_event_ids[POWER6_PME_PM_VMX1_LD_WRBACK], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_VMX1_LD_WRBACK] }, [ POWER6_PME_PM_MRK_FPU_FIN ] = { .pme_name = "PM_MRK_FPU_FIN", .pme_code = 0x2d0030, .pme_short_desc = "Marked instruction FPU processing finished", .pme_long_desc = "One of the Floating Point Units finished a marked instruction. Instructions that finish may not necessary complete", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_FPU_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_FPU_FIN] }, [ POWER6_PME_PM_THRD_PRIO_5_CYC ] = { .pme_name = "PM_THRD_PRIO_5_CYC", .pme_code = 0x322046, .pme_short_desc = "Cycles thread running at priority level 5", .pme_long_desc = "Cycles thread running at priority level 5", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_PRIO_5_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_PRIO_5_CYC] }, [ POWER6_PME_PM_DFU_BACK2BACK ] = { .pme_name = "PM_DFU_BACK2BACK", .pme_code = 0xe0082, .pme_short_desc = "DFU back to back operations executed", .pme_long_desc = "DFU back to back operations executed", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DFU_BACK2BACK], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DFU_BACK2BACK] }, [ POWER6_PME_PM_MRK_DATA_FROM_LMEM ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM", .pme_code = 0x40304a, .pme_short_desc = "Marked data loaded from local memory", .pme_long_desc = "Marked data loaded from local memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_LMEM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_LMEM] }, [ POWER6_PME_PM_LSU_REJECT_LHS ] = { .pme_name = "PM_LSU_REJECT_LHS", .pme_code = 0x190032, .pme_short_desc = "Load hit store reject", .pme_long_desc = "Load hit store reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT_LHS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT_LHS] }, [ POWER6_PME_PM_DPU_HELD_SPR ] = { .pme_name = "PM_DPU_HELD_SPR", .pme_code = 0x3008c, 
.pme_short_desc = "DISP unit held due to MTSPR/MFSPR", .pme_long_desc = "DISP unit held due to MTSPR/MFSPR", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_SPR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_SPR] }, [ POWER6_PME_PM_FREQ_DOWN ] = { .pme_name = "PM_FREQ_DOWN", .pme_code = 0x30003c, .pme_short_desc = "Frequency is being slewed down due to Power Management", .pme_long_desc = "Frequency is being slewed down due to Power Management", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FREQ_DOWN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FREQ_DOWN] }, [ POWER6_PME_PM_DFU_ENC_BCD_DPD ] = { .pme_name = "PM_DFU_ENC_BCD_DPD", .pme_code = 0xe008a, .pme_short_desc = "DFU Encode BCD to DPD", .pme_long_desc = "DFU Encode BCD to DPD", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DFU_ENC_BCD_DPD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DFU_ENC_BCD_DPD] }, [ POWER6_PME_PM_DPU_HELD_GPR ] = { .pme_name = "PM_DPU_HELD_GPR", .pme_code = 0x20080, .pme_short_desc = "DISP unit held due to GPR dependencies", .pme_long_desc = "DISP unit held due to GPR dependencies", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_GPR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_GPR] }, [ POWER6_PME_PM_LSU0_NCST ] = { .pme_name = "PM_LSU0_NCST", .pme_code = 0x820cc, .pme_short_desc = "LSU0 non-cacheable stores", .pme_long_desc = "LSU0 non-cacheable stores", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU0_NCST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU0_NCST] }, [ POWER6_PME_PM_MRK_INST_ISSUED ] = { .pme_name = "PM_MRK_INST_ISSUED", .pme_code = 0x10001c, .pme_short_desc = "Marked instruction issued", .pme_long_desc = "Marked instruction issued", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_INST_ISSUED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_INST_ISSUED] }, [ POWER6_PME_PM_INST_FROM_RL2L3_SHR ] = { .pme_name = "PM_INST_FROM_RL2L3_SHR", .pme_code = 0x242044, .pme_short_desc = 
"Instruction fetched from remote L2 or L3 shared", .pme_long_desc = "Instruction fetched from remote L2 or L3 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_RL2L3_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_RL2L3_SHR] }, [ POWER6_PME_PM_FPU_DENORM ] = { .pme_name = "PM_FPU_DENORM", .pme_code = 0x2c1034, .pme_short_desc = "FPU received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized. Combined Unit 0 + Unit 1", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_DENORM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_DENORM] }, [ POWER6_PME_PM_PTEG_FROM_L3MISS ] = { .pme_name = "PM_PTEG_FROM_L3MISS", .pme_code = 0x313028, .pme_short_desc = "PTEG loaded from L3 miss", .pme_long_desc = "PTEG loaded from L3 miss", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_L3MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PTEG_FROM_L3MISS] }, [ POWER6_PME_PM_RUN_PURR ] = { .pme_name = "PM_RUN_PURR", .pme_code = 0x4000f4, .pme_short_desc = "Run PURR Event", .pme_long_desc = "Run PURR Event", .pme_event_ids = power6_event_ids[POWER6_PME_PM_RUN_PURR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_RUN_PURR] }, [ POWER6_PME_PM_MRK_VMX0_LD_WRBACK ] = { .pme_name = "PM_MRK_VMX0_LD_WRBACK", .pme_code = 0x60086, .pme_short_desc = "Marked VMX0 load writeback valid", .pme_long_desc = "Marked VMX0 load writeback valid", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_VMX0_LD_WRBACK], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_VMX0_LD_WRBACK] }, [ POWER6_PME_PM_L2_MISS ] = { .pme_name = "PM_L2_MISS", .pme_code = 0x250532, .pme_short_desc = "L2 cache misses", .pme_long_desc = "L2 cache misses", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2_MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2_MISS] }, [ POWER6_PME_PM_MRK_DATA_FROM_L3 ] = { .pme_name = "PM_MRK_DATA_FROM_L3", .pme_code = 0x303048, .pme_short_desc = "Marked 
data loaded from L3", .pme_long_desc = "DL1 was reloaded from the local L3 due to a marked demand load", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_L3], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_L3] }, [ POWER6_PME_PM_MRK_LSU1_REJECT_LHS ] = { .pme_name = "PM_MRK_LSU1_REJECT_LHS", .pme_code = 0x930ee, .pme_short_desc = "LSU1 marked load hit store reject", .pme_long_desc = "LSU1 marked load hit store reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_LSU1_REJECT_LHS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_LSU1_REJECT_LHS] }, [ POWER6_PME_PM_L2SB_LD_MISS_INST ] = { .pme_name = "PM_L2SB_LD_MISS_INST", .pme_code = 0x5058a, .pme_short_desc = "L2 slice B instruction load misses", .pme_long_desc = "L2 slice B instruction load misses", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SB_LD_MISS_INST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SB_LD_MISS_INST] }, [ POWER6_PME_PM_PTEG_FROM_RL2L3_SHR ] = { .pme_name = "PM_PTEG_FROM_RL2L3_SHR", .pme_code = 0x21304c, .pme_short_desc = "PTEG loaded from remote L2 or L3 shared", .pme_long_desc = "PTEG loaded from remote L2 or L3 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PTEG_FROM_RL2L3_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PTEG_FROM_RL2L3_SHR] }, [ POWER6_PME_PM_MRK_DERAT_MISS_64K ] = { .pme_name = "PM_MRK_DERAT_MISS_64K", .pme_code = 0x192044, .pme_short_desc = "Marked DERAT misses for 64K page", .pme_long_desc = "A marked data request (load or store) missed the ERAT for 64K page and resulted in an ERAT reload.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DERAT_MISS_64K], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DERAT_MISS_64K] }, [ POWER6_PME_PM_LWSYNC ] = { .pme_name = "PM_LWSYNC", .pme_code = 0x810ae, .pme_short_desc = "Isync instruction completed", .pme_long_desc = "Isync instruction completed", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LWSYNC], .pme_group_vector = 
power6_group_vecs[POWER6_PME_PM_LWSYNC] }, [ POWER6_PME_PM_FPU1_FXMULT ] = { .pme_name = "PM_FPU1_FXMULT", .pme_code = 0xd008e, .pme_short_desc = "FPU1 executed fixed point multiplication", .pme_long_desc = "FPU1 executed fixed point multiplication", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU1_FXMULT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU1_FXMULT] }, [ POWER6_PME_PM_MEM0_DP_CL_WR_GLOB ] = { .pme_name = "PM_MEM0_DP_CL_WR_GLOB", .pme_code = 0x50284, .pme_short_desc = "cacheline write setting dp to global side 0", .pme_long_desc = "cacheline write setting dp to global side 0", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MEM0_DP_CL_WR_GLOB], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MEM0_DP_CL_WR_GLOB] }, [ POWER6_PME_PM_LSU0_REJECT_PARTIAL_SECTOR ] = { .pme_name = "PM_LSU0_REJECT_PARTIAL_SECTOR", .pme_code = 0xa0086, .pme_short_desc = "LSU0 reject due to partial sector valid", .pme_long_desc = "LSU0 reject due to partial sector valid", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU0_REJECT_PARTIAL_SECTOR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU0_REJECT_PARTIAL_SECTOR] }, [ POWER6_PME_PM_INST_IMC_MATCH_CMPL ] = { .pme_name = "PM_INST_IMC_MATCH_CMPL", .pme_code = 0x1000f0, .pme_short_desc = "IMC matched instructions completed", .pme_long_desc = "Number of instructions resulting from the marked instructions expansion that completed.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_IMC_MATCH_CMPL], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_IMC_MATCH_CMPL] }, [ POWER6_PME_PM_DPU_HELD_THERMAL ] = { .pme_name = "PM_DPU_HELD_THERMAL", .pme_code = 0x10002a, .pme_short_desc = "DISP unit held due to thermal condition", .pme_long_desc = "DISP unit held due to thermal condition", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_THERMAL], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_THERMAL] }, [ POWER6_PME_PM_FPU_FRSP ] = { .pme_name = "PM_FPU_FRSP", .pme_code = 0x2d1034, 
.pme_short_desc = "FPU executed FRSP instruction", .pme_long_desc = "FPU executed FRSP instruction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_FRSP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_FRSP] }, [ POWER6_PME_PM_MRK_INST_FIN ] = { .pme_name = "PM_MRK_INST_FIN", .pme_code = 0x30000a, .pme_short_desc = "Marked instruction finished", .pme_long_desc = "One of the execution units finished a marked instruction. Instructions that finish may not necessary complete", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_INST_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_INST_FIN] }, [ POWER6_PME_PM_MRK_PTEG_FROM_DL2L3_SHR ] = { .pme_name = "PM_MRK_PTEG_FROM_DL2L3_SHR", .pme_code = 0x312044, .pme_short_desc = "Marked PTEG loaded from distant L2 or L3 shared", .pme_long_desc = "Marked PTEG loaded from distant L2 or L3 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_DL2L3_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_DL2L3_SHR] }, [ POWER6_PME_PM_MRK_DTLB_REF ] = { .pme_name = "PM_MRK_DTLB_REF", .pme_code = 0x920c0, .pme_short_desc = "Marked Data TLB reference", .pme_long_desc = "Marked Data TLB reference", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DTLB_REF], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DTLB_REF] }, [ POWER6_PME_PM_MRK_PTEG_FROM_L25_SHR ] = { .pme_name = "PM_MRK_PTEG_FROM_L25_SHR", .pme_code = 0x412046, .pme_short_desc = "Marked PTEG loaded from L2.5 shared", .pme_long_desc = "Marked PTEG loaded from L2.5 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_L25_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_L25_SHR] }, [ POWER6_PME_PM_DPU_HELD_LSU ] = { .pme_name = "PM_DPU_HELD_LSU", .pme_code = 0x210a2, .pme_short_desc = "DISP unit held due to LSU move or invalidate SLB and SR", .pme_long_desc = "DISP unit held due to LSU move or invalidate SLB and SR", .pme_event_ids = 
power6_event_ids[POWER6_PME_PM_DPU_HELD_LSU], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_LSU] }, [ POWER6_PME_PM_FPU_FSQRT_FDIV ] = { .pme_name = "PM_FPU_FSQRT_FDIV", .pme_code = 0x2c0032, .pme_short_desc = "FPU executed FSQRT or FDIV instruction", .pme_long_desc = "FPU executed FSQRT or FDIV instruction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_FSQRT_FDIV], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_FSQRT_FDIV] }, [ POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_COUNT ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_COUNT", .pme_code = 0x20001d, .pme_short_desc = "Periods LMQ and SRQ empty", .pme_long_desc = "Periods when both the LMQ and SRQ are empty (LSU is idle)", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_COUNT] }, [ POWER6_PME_PM_DATA_PTEG_SECONDARY ] = { .pme_name = "PM_DATA_PTEG_SECONDARY", .pme_code = 0x910a4, .pme_short_desc = "Data table walk matched in secondary PTEG", .pme_long_desc = "Data table walk matched in secondary PTEG", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_PTEG_SECONDARY], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_PTEG_SECONDARY] }, [ POWER6_PME_PM_FPU1_FEST ] = { .pme_name = "PM_FPU1_FEST", .pme_code = 0xd10ae, .pme_short_desc = "FPU1 executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. 
", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU1_FEST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU1_FEST] }, [ POWER6_PME_PM_L2SA_LD_HIT ] = { .pme_name = "PM_L2SA_LD_HIT", .pme_code = 0x50782, .pme_short_desc = "L2 slice A load hits", .pme_long_desc = "L2 slice A load hits", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SA_LD_HIT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SA_LD_HIT] }, [ POWER6_PME_PM_DATA_FROM_MEM_DP_CYC ] = { .pme_name = "PM_DATA_FROM_MEM_DP_CYC", .pme_code = 0x40002e, .pme_short_desc = "Load latency from double pump memory", .pme_long_desc = "Load latency from double pump memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_MEM_DP_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_MEM_DP_CYC] }, [ POWER6_PME_PM_BR_MPRED_CCACHE ] = { .pme_name = "PM_BR_MPRED_CCACHE", .pme_code = 0x410ae, .pme_short_desc = "Branch misprediction due to count cache prediction", .pme_long_desc = "Branch misprediction due to count cache prediction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_BR_MPRED_CCACHE], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_BR_MPRED_CCACHE] }, [ POWER6_PME_PM_DPU_HELD_COUNT ] = { .pme_name = "PM_DPU_HELD_COUNT", .pme_code = 0x200005, .pme_short_desc = "Periods DISP unit held", .pme_long_desc = "Dispatch unit held", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_COUNT] }, [ POWER6_PME_PM_LSU1_REJECT_SET_MPRED ] = { .pme_name = "PM_LSU1_REJECT_SET_MPRED", .pme_code = 0xa008c, .pme_short_desc = "LSU1 reject due to mispredicted set", .pme_long_desc = "LSU1 reject due to mispredicted set", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU1_REJECT_SET_MPRED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU1_REJECT_SET_MPRED] }, [ POWER6_PME_PM_FPU_ISSUE_2 ] = { .pme_name = "PM_FPU_ISSUE_2", .pme_code = 0x320ca, .pme_short_desc = "FPU issue 2 per cycle", .pme_long_desc = "FPU 
issue 2 per cycle", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_ISSUE_2], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_ISSUE_2] }, [ POWER6_PME_PM_LSU1_REJECT_L2_CORR ] = { .pme_name = "PM_LSU1_REJECT_L2_CORR", .pme_code = 0xa10a8, .pme_short_desc = "LSU1 reject due to L2 correctable error", .pme_long_desc = "LSU1 reject due to L2 correctable error", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU1_REJECT_L2_CORR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU1_REJECT_L2_CORR] }, [ POWER6_PME_PM_MRK_PTEG_FROM_DMEM ] = { .pme_name = "PM_MRK_PTEG_FROM_DMEM", .pme_code = 0x212042, .pme_short_desc = "Marked PTEG loaded from distant memory", .pme_long_desc = "Marked PTEG loaded from distant memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_DMEM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_DMEM] }, [ POWER6_PME_PM_MEM1_DP_RQ_LOC_GLOB ] = { .pme_name = "PM_MEM1_DP_RQ_LOC_GLOB", .pme_code = 0x5028a, .pme_short_desc = "Memory read queue marking cache line double pump state from local to global side 1", .pme_long_desc = "Memory read queue marking cache line double pump state from local to global side 1", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MEM1_DP_RQ_LOC_GLOB], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MEM1_DP_RQ_LOC_GLOB] }, [ POWER6_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus1or2_CYC", .pme_code = 0x223046, .pme_short_desc = "Cycles thread priority difference is -1 or -2", .pme_long_desc = "Cycles thread priority difference is -1 or -2", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC] }, [ POWER6_PME_PM_THRD_PRIO_0_CYC ] = { .pme_name = "PM_THRD_PRIO_0_CYC", .pme_code = 0x122040, .pme_short_desc = "Cycles thread running at priority level 0", .pme_long_desc = "Cycles thread running at priority level 0", .pme_event_ids = 
power6_event_ids[POWER6_PME_PM_THRD_PRIO_0_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_PRIO_0_CYC] }, [ POWER6_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { .pme_name = "PM_FXU0_BUSY_FXU1_IDLE", .pme_code = 0x300050, .pme_short_desc = "FXU0 busy FXU1 idle", .pme_long_desc = "FXU0 is busy while FXU1 was idle", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FXU0_BUSY_FXU1_IDLE], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FXU0_BUSY_FXU1_IDLE] }, [ POWER6_PME_PM_LSU1_REJECT_DERAT_MPRED ] = { .pme_name = "PM_LSU1_REJECT_DERAT_MPRED", .pme_code = 0xa008a, .pme_short_desc = "LSU1 reject due to mispredicted DERAT", .pme_long_desc = "LSU1 reject due to mispredicted DERAT", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU1_REJECT_DERAT_MPRED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU1_REJECT_DERAT_MPRED] }, [ POWER6_PME_PM_MRK_VMX1_LD_WRBACK ] = { .pme_name = "PM_MRK_VMX1_LD_WRBACK", .pme_code = 0x6008e, .pme_short_desc = "Marked VMX1 load writeback valid", .pme_long_desc = "Marked VMX1 load writeback valid", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_VMX1_LD_WRBACK], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_VMX1_LD_WRBACK] }, [ POWER6_PME_PM_DATA_FROM_RL2L3_SHR_CYC ] = { .pme_name = "PM_DATA_FROM_RL2L3_SHR_CYC", .pme_code = 0x200028, .pme_short_desc = "Load latency from remote L2 or L3 shared", .pme_long_desc = "Load latency from remote L2 or L3 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_RL2L3_SHR_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_RL2L3_SHR_CYC] }, [ POWER6_PME_PM_IERAT_MISS_16M ] = { .pme_name = "PM_IERAT_MISS_16M", .pme_code = 0x292076, .pme_short_desc = "IERAT misses for 16M page", .pme_long_desc = "IERAT misses for 16M page", .pme_event_ids = power6_event_ids[POWER6_PME_PM_IERAT_MISS_16M], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_IERAT_MISS_16M] }, [ POWER6_PME_PM_MRK_DATA_FROM_MEM_DP ] = { .pme_name = "PM_MRK_DATA_FROM_MEM_DP", .pme_code = 
0x10304a, .pme_short_desc = "Marked data loaded from double pump memory", .pme_long_desc = "Marked data loaded from double pump memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_MEM_DP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_MEM_DP] }, [ POWER6_PME_PM_LARX_L1HIT ] = { .pme_name = "PM_LARX_L1HIT", .pme_code = 0x830e2, .pme_short_desc = "larx hits in L1", .pme_long_desc = "larx hits in L1", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LARX_L1HIT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LARX_L1HIT] }, [ POWER6_PME_PM_L2_ST_MISS_DATA ] = { .pme_name = "PM_L2_ST_MISS_DATA", .pme_code = 0x150432, .pme_short_desc = "L2 data store misses", .pme_long_desc = "L2 data store misses", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2_ST_MISS_DATA], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2_ST_MISS_DATA] }, [ POWER6_PME_PM_FPU_ST_FOLDED ] = { .pme_name = "PM_FPU_ST_FOLDED", .pme_code = 0x3d1030, .pme_short_desc = "FPU folded store", .pme_long_desc = "FPU folded store", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_ST_FOLDED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_ST_FOLDED] }, [ POWER6_PME_PM_MRK_DATA_FROM_L35_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L35_SHR", .pme_code = 0x20304e, .pme_short_desc = "Marked data loaded from L3.5 shared", .pme_long_desc = "Marked data loaded from L3.5 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_L35_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_L35_SHR] }, [ POWER6_PME_PM_DPU_HELD_MULT_GPR ] = { .pme_name = "PM_DPU_HELD_MULT_GPR", .pme_code = 0x210aa, .pme_short_desc = "DISP unit held due to multiply/divide GPR dependencies", .pme_long_desc = "DISP unit held due to multiply/divide GPR dependencies", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_MULT_GPR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_MULT_GPR] }, [ 
POWER6_PME_PM_FPU0_1FLOP ] = { .pme_name = "PM_FPU0_1FLOP", .pme_code = 0xc0080, .pme_short_desc = "FPU0 executed add", .pme_long_desc = " mult", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU0_1FLOP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU0_1FLOP] }, [ POWER6_PME_PM_IERAT_MISS_16G ] = { .pme_name = "PM_IERAT_MISS_16G", .pme_code = 0x192076, .pme_short_desc = "IERAT misses for 16G page", .pme_long_desc = "IERAT misses for 16G page", .pme_event_ids = power6_event_ids[POWER6_PME_PM_IERAT_MISS_16G], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_IERAT_MISS_16G] }, [ POWER6_PME_PM_IC_PREF_WRITE ] = { .pme_name = "PM_IC_PREF_WRITE", .pme_code = 0x430e0, .pme_short_desc = "Instruction prefetch written into I cache", .pme_long_desc = "Instruction prefetch written into I cache", .pme_event_ids = power6_event_ids[POWER6_PME_PM_IC_PREF_WRITE], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_IC_PREF_WRITE] }, [ POWER6_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus5or6_CYC", .pme_code = 0x423046, .pme_short_desc = "Cycles thread priority difference is -5 or -6", .pme_long_desc = "Cycles thread priority difference is -5 or -6", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC] }, [ POWER6_PME_PM_FPU0_FIN ] = { .pme_name = "PM_FPU0_FIN", .pme_code = 0xd0080, .pme_short_desc = "FPU0 produced a result", .pme_long_desc = "fp0 finished, produced a result This only indicates finish, not completion. 
", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU0_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU0_FIN] }, [ POWER6_PME_PM_DATA_FROM_L2_CYC ] = { .pme_name = "PM_DATA_FROM_L2_CYC", .pme_code = 0x200020, .pme_short_desc = "Load latency from L2", .pme_long_desc = "Load latency from L2", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_L2_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_L2_CYC] }, [ POWER6_PME_PM_DERAT_REF_16G ] = { .pme_name = "PM_DERAT_REF_16G", .pme_code = 0x482070, .pme_short_desc = "DERAT reference for 16G page", .pme_long_desc = "DERAT reference for 16G page", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DERAT_REF_16G], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DERAT_REF_16G] }, [ POWER6_PME_PM_BR_PRED ] = { .pme_name = "PM_BR_PRED", .pme_code = 0x410a0, .pme_short_desc = "A conditional branch was predicted", .pme_long_desc = "A conditional branch was predicted", .pme_event_ids = power6_event_ids[POWER6_PME_PM_BR_PRED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_BR_PRED] }, [ POWER6_PME_PM_VMX1_LD_ISSUED ] = { .pme_name = "PM_VMX1_LD_ISSUED", .pme_code = 0x6008a, .pme_short_desc = "VMX1 load issued", .pme_long_desc = "VMX1 load issued", .pme_event_ids = power6_event_ids[POWER6_PME_PM_VMX1_LD_ISSUED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_VMX1_LD_ISSUED] }, [ POWER6_PME_PM_L2SB_CASTOUT_MOD ] = { .pme_name = "PM_L2SB_CASTOUT_MOD", .pme_code = 0x50688, .pme_short_desc = "L2 slice B castouts - Modified", .pme_long_desc = "L2 slice B castouts - Modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SB_CASTOUT_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SB_CASTOUT_MOD] }, [ POWER6_PME_PM_INST_FROM_DMEM ] = { .pme_name = "PM_INST_FROM_DMEM", .pme_code = 0x242042, .pme_short_desc = "Instruction fetched from distant memory", .pme_long_desc = "Instruction fetched from distant memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_DMEM], 
.pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_DMEM] }, [ POWER6_PME_PM_DATA_FROM_L35_SHR_CYC ] = { .pme_name = "PM_DATA_FROM_L35_SHR_CYC", .pme_code = 0x200026, .pme_short_desc = "Load latency from L3.5 shared", .pme_long_desc = "Load latency from L3.5 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_L35_SHR_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_L35_SHR_CYC] }, [ POWER6_PME_PM_LSU0_NCLD ] = { .pme_name = "PM_LSU0_NCLD", .pme_code = 0x820ca, .pme_short_desc = "LSU0 non-cacheable loads", .pme_long_desc = "LSU0 non-cacheable loads", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU0_NCLD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU0_NCLD] }, [ POWER6_PME_PM_FAB_RETRY_NODE_PUMP ] = { .pme_name = "PM_FAB_RETRY_NODE_PUMP", .pme_code = 0x5018a, .pme_short_desc = "Retry of a node pump", .pme_long_desc = " locally mastered", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FAB_RETRY_NODE_PUMP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FAB_RETRY_NODE_PUMP] }, [ POWER6_PME_PM_VMX0_INST_ISSUED ] = { .pme_name = "PM_VMX0_INST_ISSUED", .pme_code = 0x60080, .pme_short_desc = "VMX0 instruction issued", .pme_long_desc = "VMX0 instruction issued", .pme_event_ids = power6_event_ids[POWER6_PME_PM_VMX0_INST_ISSUED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_VMX0_INST_ISSUED] }, [ POWER6_PME_PM_DATA_FROM_L25_MOD ] = { .pme_name = "PM_DATA_FROM_L25_MOD", .pme_code = 0x30005a, .pme_short_desc = "Data loaded from L2.5 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a demand load", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_L25_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_L25_MOD] }, [ POWER6_PME_PM_DPU_HELD_ITLB_ISLB ] = { .pme_name = "PM_DPU_HELD_ITLB_ISLB", .pme_code = 0x210a4, .pme_short_desc = "DISP unit held due to SLB or TLB invalidates ", .pme_long_desc = "DISP unit held due to SLB or 
TLB invalidates ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_ITLB_ISLB], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_ITLB_ISLB] }, [ POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", .pme_code = 0x20001c, .pme_short_desc = "Cycles LMQ and SRQ empty", .pme_long_desc = "Cycles when both the LMQ and SRQ are empty (LSU is idle)", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC] }, [ POWER6_PME_PM_THRD_CONC_RUN_INST ] = { .pme_name = "PM_THRD_CONC_RUN_INST", .pme_code = 0x300026, .pme_short_desc = "Concurrent run instructions", .pme_long_desc = "Concurrent run instructions", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_CONC_RUN_INST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_CONC_RUN_INST] }, [ POWER6_PME_PM_MRK_PTEG_FROM_L2 ] = { .pme_name = "PM_MRK_PTEG_FROM_L2", .pme_code = 0x112040, .pme_short_desc = "Marked PTEG loaded from L2.5 modified", .pme_long_desc = "Marked PTEG loaded from L2.5 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_L2], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_L2] }, [ POWER6_PME_PM_PURR ] = { .pme_name = "PM_PURR", .pme_code = 0x10000e, .pme_short_desc = "PURR Event", .pme_long_desc = "PURR Event", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PURR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PURR] }, [ POWER6_PME_PM_DERAT_MISS_64K ] = { .pme_name = "PM_DERAT_MISS_64K", .pme_code = 0x292070, .pme_short_desc = "DERAT misses for 64K page", .pme_long_desc = "A data request (load or store) missed the ERAT for 64K page and resulted in an ERAT reload.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DERAT_MISS_64K], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DERAT_MISS_64K] }, [ POWER6_PME_PM_PMC2_REWIND ] = { .pme_name = "PM_PMC2_REWIND", .pme_code = 0x300020, .pme_short_desc = "PMC2 rewind event", 
.pme_long_desc = "PMC2 rewind event", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PMC2_REWIND], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PMC2_REWIND] }, [ POWER6_PME_PM_INST_FROM_L2 ] = { .pme_name = "PM_INST_FROM_L2", .pme_code = 0x142040, .pme_short_desc = "Instructions fetched from L2", .pme_long_desc = "An instruction fetch group was fetched from L2. Fetch Groups can contain up to 8 instructions", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_L2], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_L2] }, [ POWER6_PME_PM_INST_DISP ] = { .pme_name = "PM_INST_DISP", .pme_code = 0x200012, .pme_short_desc = "Instructions dispatched", .pme_long_desc = "The ISU sends the number of instructions dispatched.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_DISP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_DISP] }, [ POWER6_PME_PM_DATA_FROM_L25_SHR ] = { .pme_name = "PM_DATA_FROM_L25_SHR", .pme_code = 0x40005a, .pme_short_desc = "Data loaded from L2.5 shared", .pme_long_desc = "DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a demand load", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_L25_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_L25_SHR] }, [ POWER6_PME_PM_L1_DCACHE_RELOAD_VALID ] = { .pme_name = "PM_L1_DCACHE_RELOAD_VALID", .pme_code = 0x3000f6, .pme_short_desc = "L1 reload data source valid", .pme_long_desc = "The data source information is valid", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L1_DCACHE_RELOAD_VALID], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L1_DCACHE_RELOAD_VALID] }, [ POWER6_PME_PM_LSU1_REJECT_UST ] = { .pme_name = "PM_LSU1_REJECT_UST", .pme_code = 0x9008a, .pme_short_desc = "LSU1 unaligned store reject", .pme_long_desc = "LSU1 unaligned store reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU1_REJECT_UST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU1_REJECT_UST] }, [ 
POWER6_PME_PM_FAB_ADDR_COLLISION ] = { .pme_name = "PM_FAB_ADDR_COLLISION", .pme_code = 0x5018e, .pme_short_desc = "local node launch collision with off-node address ", .pme_long_desc = "local node launch collision with off-node address ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FAB_ADDR_COLLISION], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FAB_ADDR_COLLISION] }, [ POWER6_PME_PM_MRK_FXU_FIN ] = { .pme_name = "PM_MRK_FXU_FIN", .pme_code = 0x20001a, .pme_short_desc = "Marked instruction FXU processing finished", .pme_long_desc = "The fixed point units (Unit 0 + Unit 1) finished a marked instruction. Instructions that finish may not necessary complete.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_FXU_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_FXU_FIN] }, [ POWER6_PME_PM_LSU0_REJECT_UST ] = { .pme_name = "PM_LSU0_REJECT_UST", .pme_code = 0x90082, .pme_short_desc = "LSU0 unaligned store reject", .pme_long_desc = "LSU0 unaligned store reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU0_REJECT_UST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU0_REJECT_UST] }, [ POWER6_PME_PM_PMC4_OVERFLOW ] = { .pme_name = "PM_PMC4_OVERFLOW", .pme_code = 0x100014, .pme_short_desc = "PMC4 Overflow", .pme_long_desc = "PMC4 Overflow", .pme_event_ids = power6_event_ids[POWER6_PME_PM_PMC4_OVERFLOW], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_PMC4_OVERFLOW] }, [ POWER6_PME_PM_MRK_PTEG_FROM_L3 ] = { .pme_name = "PM_MRK_PTEG_FROM_L3", .pme_code = 0x312040, .pme_short_desc = "Marked PTEG loaded from L3", .pme_long_desc = "Marked PTEG loaded from L3", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_L3], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_L3] }, [ POWER6_PME_PM_INST_FROM_L2MISS ] = { .pme_name = "PM_INST_FROM_L2MISS", .pme_code = 0x442054, .pme_short_desc = "Instructions fetched missed L2", .pme_long_desc = "An instruction fetch group was fetched from beyond L2.", .pme_event_ids = 
power6_event_ids[POWER6_PME_PM_INST_FROM_L2MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_L2MISS] }, [ POWER6_PME_PM_L2SB_ST_HIT ] = { .pme_name = "PM_L2SB_ST_HIT", .pme_code = 0x5078e, .pme_short_desc = "L2 slice B store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SB_ST_HIT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SB_ST_HIT] }, [ POWER6_PME_PM_DPU_WT_IC_MISS_COUNT ] = { .pme_name = "PM_DPU_WT_IC_MISS_COUNT", .pme_code = 0x20000d, .pme_short_desc = "Periods DISP unit is stalled due to I cache miss", .pme_long_desc = "Periods DISP unit is stalled due to I cache miss", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_WT_IC_MISS_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_WT_IC_MISS_COUNT] }, [ POWER6_PME_PM_MRK_DATA_FROM_DL2L3_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_SHR", .pme_code = 0x30304c, .pme_short_desc = "Marked data loaded from distant L2 or L3 shared", .pme_long_desc = "Marked data loaded from distant L2 or L3 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_DL2L3_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_DL2L3_SHR] }, [ POWER6_PME_PM_MRK_PTEG_FROM_L35_MOD ] = { .pme_name = "PM_MRK_PTEG_FROM_L35_MOD", .pme_code = 0x112046, .pme_short_desc = "Marked PTEG loaded from L3.5 modified", .pme_long_desc = "Marked PTEG loaded from L3.5 modified", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_L35_MOD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_L35_MOD] }, [ POWER6_PME_PM_FPU1_FPSCR ] = { .pme_name = "PM_FPU1_FPSCR", .pme_code = 0xd008c, .pme_short_desc = "FPU1 executed FPSCR instruction", .pme_long_desc = "FPU1 executed FPSCR instruction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU1_FPSCR], .pme_group_vector = 
power6_group_vecs[POWER6_PME_PM_FPU1_FPSCR] }, [ POWER6_PME_PM_LSU_REJECT_UST ] = { .pme_name = "PM_LSU_REJECT_UST", .pme_code = 0x290030, .pme_short_desc = "Unaligned store reject", .pme_long_desc = "Unaligned store reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT_UST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT_UST] }, [ POWER6_PME_PM_LSU0_DERAT_MISS ] = { .pme_name = "PM_LSU0_DERAT_MISS", .pme_code = 0x910a6, .pme_short_desc = "LSU0 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU0_DERAT_MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU0_DERAT_MISS] }, [ POWER6_PME_PM_MRK_PTEG_FROM_MEM_DP ] = { .pme_name = "PM_MRK_PTEG_FROM_MEM_DP", .pme_code = 0x112042, .pme_short_desc = "Marked PTEG loaded from double pump memory", .pme_long_desc = "Marked PTEG loaded from double pump memory", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_MEM_DP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_MEM_DP] }, [ POWER6_PME_PM_MRK_DATA_FROM_L2 ] = { .pme_name = "PM_MRK_DATA_FROM_L2", .pme_code = 0x103048, .pme_short_desc = "Marked data loaded from L2", .pme_long_desc = "DL1 was reloaded from the local L2 due to a marked demand load", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_L2], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_L2] }, [ POWER6_PME_PM_FPU0_FSQRT_FDIV ] = { .pme_name = "PM_FPU0_FSQRT_FDIV", .pme_code = 0xc0084, .pme_short_desc = "FPU0 executed FSQRT or FDIV instruction", .pme_long_desc = "FPU0 executed FSQRT or FDIV instruction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU0_FSQRT_FDIV], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU0_FSQRT_FDIV] }, [ POWER6_PME_PM_DPU_HELD_FXU_SOPS ] = { .pme_name = 
"PM_DPU_HELD_FXU_SOPS", .pme_code = 0x30088, .pme_short_desc = "DISP unit held due to FXU slow ops (mtmsr", .pme_long_desc = " scv", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_FXU_SOPS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_FXU_SOPS] }, [ POWER6_PME_PM_MRK_FPU0_FIN ] = { .pme_name = "PM_MRK_FPU0_FIN", .pme_code = 0xd0082, .pme_short_desc = "Marked instruction FPU0 processing finished", .pme_long_desc = "Marked instruction FPU0 processing finished", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_FPU0_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_FPU0_FIN] }, [ POWER6_PME_PM_L2SB_LD_MISS_DATA ] = { .pme_name = "PM_L2SB_LD_MISS_DATA", .pme_code = 0x5048a, .pme_short_desc = "L2 slice B data load misses", .pme_long_desc = "L2 slice B data load misses", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L2SB_LD_MISS_DATA], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L2SB_LD_MISS_DATA] }, [ POWER6_PME_PM_LSU_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_SRQ_EMPTY_CYC", .pme_code = 0x40001c, .pme_short_desc = "Cycles SRQ empty", .pme_long_desc = "The Store Request Queue is empty", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_SRQ_EMPTY_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_SRQ_EMPTY_CYC] }, [ POWER6_PME_PM_1PLUS_PPC_DISP ] = { .pme_name = "PM_1PLUS_PPC_DISP", .pme_code = 0x100012, .pme_short_desc = "Cycles at least one instruction dispatched", .pme_long_desc = "Cycles at least one instruction dispatched", .pme_event_ids = power6_event_ids[POWER6_PME_PM_1PLUS_PPC_DISP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_1PLUS_PPC_DISP] }, [ POWER6_PME_PM_VMX_ST_ISSUED ] = { .pme_name = "PM_VMX_ST_ISSUED", .pme_code = 0xb0080, .pme_short_desc = "VMX store issued", .pme_long_desc = "VMX store issued", .pme_event_ids = power6_event_ids[POWER6_PME_PM_VMX_ST_ISSUED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_VMX_ST_ISSUED] }, [ POWER6_PME_PM_DATA_FROM_L2MISS ] = { .pme_name = 
"PM_DATA_FROM_L2MISS", .pme_code = 0x2000fe, .pme_short_desc = "Data loaded missed L2", .pme_long_desc = "DL1 was reloaded from beyond L2.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_L2MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_L2MISS] }, [ POWER6_PME_PM_LSU0_REJECT_ULD ] = { .pme_name = "PM_LSU0_REJECT_ULD", .pme_code = 0x90080, .pme_short_desc = "LSU0 unaligned load reject", .pme_long_desc = "LSU0 unaligned load reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU0_REJECT_ULD], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU0_REJECT_ULD] }, [ POWER6_PME_PM_SUSPENDED ] = { .pme_name = "PM_SUSPENDED", .pme_code = 0x0, .pme_short_desc = "Suspended", .pme_long_desc = "Suspended", .pme_event_ids = power6_event_ids[POWER6_PME_PM_SUSPENDED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_SUSPENDED] }, [ POWER6_PME_PM_DFU_ADD_SHIFTED_BOTH ] = { .pme_name = "PM_DFU_ADD_SHIFTED_BOTH", .pme_code = 0xe0088, .pme_short_desc = "DFU add type with both operands shifted", .pme_long_desc = "DFU add type with both operands shifted", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DFU_ADD_SHIFTED_BOTH], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DFU_ADD_SHIFTED_BOTH] }, [ POWER6_PME_PM_LSU_REJECT_NO_SCRATCH ] = { .pme_name = "PM_LSU_REJECT_NO_SCRATCH", .pme_code = 0x2a1034, .pme_short_desc = "LSU reject due to scratch register not available", .pme_long_desc = "LSU reject due to scratch register not available", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU_REJECT_NO_SCRATCH], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU_REJECT_NO_SCRATCH] }, [ POWER6_PME_PM_STCX_FAIL ] = { .pme_name = "PM_STCX_FAIL", .pme_code = 0x830ee, .pme_short_desc = "STCX failed", .pme_long_desc = "A stcx (stwcx or stdcx) failed", .pme_event_ids = power6_event_ids[POWER6_PME_PM_STCX_FAIL], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_STCX_FAIL] }, [ POWER6_PME_PM_FPU1_DENORM ] = { .pme_name = "PM_FPU1_DENORM", 
.pme_code = 0xc10aa, .pme_short_desc = "FPU1 received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU1_DENORM], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU1_DENORM] }, [ POWER6_PME_PM_GCT_NOSLOT_COUNT ] = { .pme_name = "PM_GCT_NOSLOT_COUNT", .pme_code = 0x100009, .pme_short_desc = "Periods no GCT slot allocated", .pme_long_desc = "Periods this thread does not have any slots allocated in the GCT.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_GCT_NOSLOT_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_GCT_NOSLOT_COUNT] }, [ POWER6_PME_PM_DATA_FROM_DL2L3_SHR_CYC ] = { .pme_name = "PM_DATA_FROM_DL2L3_SHR_CYC", .pme_code = 0x20002a, .pme_short_desc = "Load latency from distant L2 or L3 shared", .pme_long_desc = "Load latency from distant L2 or L3 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_DL2L3_SHR_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_DL2L3_SHR_CYC] }, [ POWER6_PME_PM_DATA_FROM_L21 ] = { .pme_name = "PM_DATA_FROM_L21", .pme_code = 0x200058, .pme_short_desc = "Data loaded from private L2 other core", .pme_long_desc = "Data loaded from private L2 other core", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_L21], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_L21] }, [ POWER6_PME_PM_FPU_1FLOP ] = { .pme_name = "PM_FPU_1FLOP", .pme_code = 0x1c0030, .pme_short_desc = "FPU executed one flop instruction ", .pme_long_desc = "This event counts the number of one flop instructions. 
These could be fadd*, fmul*, fsub*, fneg+, fabs+, fnabs+, fres+, frsqrte+, fcmp**, or fsel where XYZ* means XYZ, XYZs, XYZ., XYZs., XYZ+ means XYZ, XYZ., and XYZ** means XYZu, XYZo.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_1FLOP], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_1FLOP] }, [ POWER6_PME_PM_LSU1_REJECT ] = { .pme_name = "PM_LSU1_REJECT", .pme_code = 0xa10ae, .pme_short_desc = "LSU1 reject", .pme_long_desc = "LSU1 reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU1_REJECT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU1_REJECT] }, [ POWER6_PME_PM_IC_REQ ] = { .pme_name = "PM_IC_REQ", .pme_code = 0x4008a, .pme_short_desc = "I cache demand or prefetch request", .pme_long_desc = "I cache demand or prefetch request", .pme_event_ids = power6_event_ids[POWER6_PME_PM_IC_REQ], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_IC_REQ] }, [ POWER6_PME_PM_MRK_DFU_FIN ] = { .pme_name = "PM_MRK_DFU_FIN", .pme_code = 0x300008, .pme_short_desc = "DFU marked instruction finish", .pme_long_desc = "DFU marked instruction finish", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DFU_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DFU_FIN] }, [ POWER6_PME_PM_NOT_LLA_CYC ] = { .pme_name = "PM_NOT_LLA_CYC", .pme_code = 0x401e, .pme_short_desc = "Load Look Ahead not Active", .pme_long_desc = "Load Look Ahead not Active", .pme_event_ids = power6_event_ids[POWER6_PME_PM_NOT_LLA_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_NOT_LLA_CYC] }, [ POWER6_PME_PM_INST_FROM_L1 ] = { .pme_name = "PM_INST_FROM_L1", .pme_code = 0x40082, .pme_short_desc = "Instruction fetched from L1", .pme_long_desc = "An instruction fetch group was fetched from L1. 
Fetch Groups can contain up to 8 instructions", .pme_event_ids = power6_event_ids[POWER6_PME_PM_INST_FROM_L1], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_INST_FROM_L1] }, [ POWER6_PME_PM_MRK_VMX_COMPLEX_ISSUED ] = { .pme_name = "PM_MRK_VMX_COMPLEX_ISSUED", .pme_code = 0x7008c, .pme_short_desc = "Marked VMX instruction issued to complex", .pme_long_desc = "Marked VMX instruction issued to complex", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_VMX_COMPLEX_ISSUED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_VMX_COMPLEX_ISSUED] }, [ POWER6_PME_PM_BRU_FIN ] = { .pme_name = "PM_BRU_FIN", .pme_code = 0x430e6, .pme_short_desc = "BRU produced a result", .pme_long_desc = "BRU produced a result", .pme_event_ids = power6_event_ids[POWER6_PME_PM_BRU_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_BRU_FIN] }, [ POWER6_PME_PM_LSU1_REJECT_EXTERN ] = { .pme_name = "PM_LSU1_REJECT_EXTERN", .pme_code = 0xa10ac, .pme_short_desc = "LSU1 external reject request ", .pme_long_desc = "LSU1 external reject request ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU1_REJECT_EXTERN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU1_REJECT_EXTERN] }, [ POWER6_PME_PM_DATA_FROM_L21_CYC ] = { .pme_name = "PM_DATA_FROM_L21_CYC", .pme_code = 0x400020, .pme_short_desc = "Load latency from private L2 other core", .pme_long_desc = "Load latency from private L2 other core", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_L21_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_L21_CYC] }, [ POWER6_PME_PM_GXI_CYC_BUSY ] = { .pme_name = "PM_GXI_CYC_BUSY", .pme_code = 0x50386, .pme_short_desc = "Inbound GX bus utilizations (# of cycles in use)", .pme_long_desc = "Inbound GX bus utilizations (# of cycles in use)", .pme_event_ids = power6_event_ids[POWER6_PME_PM_GXI_CYC_BUSY], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_GXI_CYC_BUSY] }, [ POWER6_PME_PM_MRK_LD_MISS_L1 ] = { .pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 
0x200056, .pme_short_desc = "Marked L1 D cache load misses", .pme_long_desc = "Marked L1 D cache load misses", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_LD_MISS_L1], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_LD_MISS_L1] }, [ POWER6_PME_PM_L1_WRITE_CYC ] = { .pme_name = "PM_L1_WRITE_CYC", .pme_code = 0x430e2, .pme_short_desc = "Cycles writing to instruction L1", .pme_long_desc = "This signal is asserted each cycle a cache write is active.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_L1_WRITE_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_L1_WRITE_CYC] }, [ POWER6_PME_PM_LLA_CYC ] = { .pme_name = "PM_LLA_CYC", .pme_code = 0xc01e, .pme_short_desc = "Load Look Ahead Active", .pme_long_desc = "Load Look Ahead Active", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LLA_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LLA_CYC] }, [ POWER6_PME_PM_MRK_DATA_FROM_L2MISS ] = { .pme_name = "PM_MRK_DATA_FROM_L2MISS", .pme_code = 0x103028, .pme_short_desc = "Marked data loaded missed L2", .pme_long_desc = "DL1 was reloaded from beyond L2 due to a marked demand load.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_L2MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_L2MISS] }, [ POWER6_PME_PM_GCT_FULL_COUNT ] = { .pme_name = "PM_GCT_FULL_COUNT", .pme_code = 0x40087, .pme_short_desc = "Periods GCT full", .pme_long_desc = "The ISU sends a signal indicating the gct is full.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_GCT_FULL_COUNT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_GCT_FULL_COUNT] }, [ POWER6_PME_PM_MEM_DP_RQ_LOC_GLOB ] = { .pme_name = "PM_MEM_DP_RQ_LOC_GLOB", .pme_code = 0x250230, .pme_short_desc = "Memory read queue marking cache line double pump state from local to global", .pme_long_desc = "Memory read queue marking cache line double pump state from local to global", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MEM_DP_RQ_LOC_GLOB], .pme_group_vector = 
power6_group_vecs[POWER6_PME_PM_MEM_DP_RQ_LOC_GLOB] }, [ POWER6_PME_PM_DATA_FROM_RL2L3_SHR ] = { .pme_name = "PM_DATA_FROM_RL2L3_SHR", .pme_code = 0x20005c, .pme_short_desc = "Data loaded from remote L2 or L3 shared", .pme_long_desc = "Data loaded from remote L2 or L3 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DATA_FROM_RL2L3_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DATA_FROM_RL2L3_SHR] }, [ POWER6_PME_PM_MRK_LSU_REJECT_UST ] = { .pme_name = "PM_MRK_LSU_REJECT_UST", .pme_code = 0x293034, .pme_short_desc = "Marked unaligned store reject", .pme_long_desc = "Marked unaligned store reject", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_LSU_REJECT_UST], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_LSU_REJECT_UST] }, [ POWER6_PME_PM_MRK_VMX_PERMUTE_ISSUED ] = { .pme_name = "PM_MRK_VMX_PERMUTE_ISSUED", .pme_code = 0x7008e, .pme_short_desc = "Marked VMX instruction issued to permute", .pme_long_desc = "Marked VMX instruction issued to permute", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_VMX_PERMUTE_ISSUED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_VMX_PERMUTE_ISSUED] }, [ POWER6_PME_PM_MRK_PTEG_FROM_L21 ] = { .pme_name = "PM_MRK_PTEG_FROM_L21", .pme_code = 0x212040, .pme_short_desc = "Marked PTEG loaded from private L2 other core", .pme_long_desc = "Marked PTEG loaded from private L2 other core", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_PTEG_FROM_L21], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_PTEG_FROM_L21] }, [ POWER6_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { .pme_name = "PM_THRD_GRP_CMPL_BOTH_CYC", .pme_code = 0x200018, .pme_short_desc = "Cycles group completed by both threads", .pme_long_desc = "Cycles group completed by both threads", .pme_event_ids = power6_event_ids[POWER6_PME_PM_THRD_GRP_CMPL_BOTH_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_THRD_GRP_CMPL_BOTH_CYC] }, [ POWER6_PME_PM_BR_MPRED ] = { .pme_name = "PM_BR_MPRED", .pme_code = 0x400052, .pme_short_desc 
= "Branches incorrectly predicted", .pme_long_desc = "Branches incorrectly predicted", .pme_event_ids = power6_event_ids[POWER6_PME_PM_BR_MPRED], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_BR_MPRED] }, [ POWER6_PME_PM_LD_REQ_L2 ] = { .pme_name = "PM_LD_REQ_L2", .pme_code = 0x150730, .pme_short_desc = "L2 load requests ", .pme_long_desc = "L2 load requests ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LD_REQ_L2], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LD_REQ_L2] }, [ POWER6_PME_PM_FLUSH_ASYNC ] = { .pme_name = "PM_FLUSH_ASYNC", .pme_code = 0x220ca, .pme_short_desc = "Flush caused by asynchronous exception", .pme_long_desc = "Flush caused by asynchronous exception", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FLUSH_ASYNC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FLUSH_ASYNC] }, [ POWER6_PME_PM_HV_CYC ] = { .pme_name = "PM_HV_CYC", .pme_code = 0x200016, .pme_short_desc = "Hypervisor Cycles", .pme_long_desc = "Cycles when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0)", .pme_event_ids = power6_event_ids[POWER6_PME_PM_HV_CYC], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_HV_CYC] }, [ POWER6_PME_PM_LSU1_DERAT_MISS ] = { .pme_name = "PM_LSU1_DERAT_MISS", .pme_code = 0x910ae, .pme_short_desc = "LSU1 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. 
Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU1_DERAT_MISS], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU1_DERAT_MISS] }, [ POWER6_PME_PM_DPU_HELD_SMT ] = { .pme_name = "PM_DPU_HELD_SMT", .pme_code = 0x20082, .pme_short_desc = "DISP unit held due to SMT conflicts ", .pme_long_desc = "DISP unit held due to SMT conflicts ", .pme_event_ids = power6_event_ids[POWER6_PME_PM_DPU_HELD_SMT], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_DPU_HELD_SMT] }, [ POWER6_PME_PM_MRK_LSU_FIN ] = { .pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x40001a, .pme_short_desc = "Marked instruction LSU processing finished", .pme_long_desc = "One of the Load/Store Units finished a marked instruction. Instructions that finish may not necessarily complete", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_LSU_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_LSU_FIN] }, [ POWER6_PME_PM_MRK_DATA_FROM_RL2L3_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_SHR", .pme_code = 0x20304c, .pme_short_desc = "Marked data loaded from remote L2 or L3 shared", .pme_long_desc = "Marked data loaded from remote L2 or L3 shared", .pme_event_ids = power6_event_ids[POWER6_PME_PM_MRK_DATA_FROM_RL2L3_SHR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DATA_FROM_RL2L3_SHR] }, [ POWER6_PME_PM_LSU0_REJECT_STQ_FULL ] = { .pme_name = "PM_LSU0_REJECT_STQ_FULL", .pme_code = 0xa0080, .pme_short_desc = "LSU0 reject due to store queue full", .pme_long_desc = "LSU0 reject due to store queue full", .pme_event_ids = power6_event_ids[POWER6_PME_PM_LSU0_REJECT_STQ_FULL], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_LSU0_REJECT_STQ_FULL] }, [ POWER6_PME_PM_MRK_DERAT_REF_4K ] = { .pme_name = "PM_MRK_DERAT_REF_4K", .pme_code = 0x282044, .pme_short_desc = "Marked DERAT reference for 4K page", .pme_long_desc = "Marked DERAT reference for 4K page", .pme_event_ids = 
power6_event_ids[POWER6_PME_PM_MRK_DERAT_REF_4K], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_MRK_DERAT_REF_4K] }, [ POWER6_PME_PM_FPU_ISSUE_STALL_FPR ] = { .pme_name = "PM_FPU_ISSUE_STALL_FPR", .pme_code = 0x330e2, .pme_short_desc = "FPU issue stalled due to FPR dependencies", .pme_long_desc = "FPU issue stalled due to FPR dependencies", .pme_event_ids = power6_event_ids[POWER6_PME_PM_FPU_ISSUE_STALL_FPR], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_FPU_ISSUE_STALL_FPR] }, [ POWER6_PME_PM_IFU_FIN ] = { .pme_name = "PM_IFU_FIN", .pme_code = 0x430e4, .pme_short_desc = "IFU finished an instruction", .pme_long_desc = "IFU finished an instruction", .pme_event_ids = power6_event_ids[POWER6_PME_PM_IFU_FIN], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_IFU_FIN] }, [ POWER6_PME_PM_GXO_CYC_BUSY ] = { .pme_name = "PM_GXO_CYC_BUSY", .pme_code = 0x50380, .pme_short_desc = "Outbound GX bus utilizations (# of cycles in use)", .pme_long_desc = "Outbound GX bus utilizations (# of cycles in use)", .pme_event_ids = power6_event_ids[POWER6_PME_PM_GXO_CYC_BUSY], .pme_group_vector = power6_group_vecs[POWER6_PME_PM_GXO_CYC_BUSY] } }; #define POWER6_PME_EVENT_COUNT 553 static const int power6_group_event_ids[][POWER6_NUM_EVENT_COUNTERS] = { [ 0 ] = { 302, 148, 139, 12, 0, 0 }, [ 1 ] = { 315, 11, 306, 347, 0, 0 }, [ 2 ] = { 10, 4, 6, 5, 0, 0 }, [ 3 ] = { 9, 9, 2, 7, 0, 0 }, [ 4 ] = { 8, 8, 7, 11, 0, 0 }, [ 5 ] = { 6, 3, 5, 4, 0, 0 }, [ 6 ] = { 8, 10, 1, 3, 0, 0 }, [ 7 ] = { 13, 15, 13, 19, 0, 0 }, [ 8 ] = { 14, 19, 14, 19, 0, 0 }, [ 9 ] = { 14, 19, 12, 16, 0, 0 }, [ 10 ] = { 16, 23, 11, 13, 0, 0 }, [ 11 ] = { 15, 13, 16, 20, 0, 0 }, [ 12 ] = { 200, 24, 10, 17, 0, 0 }, [ 13 ] = { 139, 22, 10, 14, 0, 0 }, [ 14 ] = { 139, 14, 16, 23, 0, 0 }, [ 15 ] = { 14, 12, 11, 18, 0, 0 }, [ 16 ] = { 16, 21, 14, 22, 0, 0 }, [ 17 ] = { 15, 16, 12, 21, 0, 0 }, [ 18 ] = { 13, 18, 139, 160, 0, 0 }, [ 19 ] = { 67, 15, 10, 15, 0, 0 }, [ 20 ] = { 2, 22, 139, 20, 0, 0 }, [ 21 ] = { 14, 20, 
10, 18, 0, 0 }, [ 22 ] = { 143, 154, 144, 151, 0, 0 }, [ 23 ] = { 144, 155, 145, 150, 0, 0 }, [ 24 ] = { 146, 156, 142, 148, 0, 0 }, [ 25 ] = { 145, 152, 147, 152, 0, 0 }, [ 26 ] = { 143, 154, 146, 151, 0, 0 }, [ 27 ] = { 295, 305, 291, 290, 0, 0 }, [ 28 ] = { 296, 305, 292, 289, 0, 0 }, [ 29 ] = { 297, 306, 293, 291, 0, 0 }, [ 30 ] = { 298, 304, 294, 291, 0, 0 }, [ 31 ] = { 299, 307, 290, 292, 0, 0 }, [ 32 ] = { 17, 26, 19, 312, 0, 0 }, [ 33 ] = { 148, 158, 150, 157, 0, 0 }, [ 34 ] = { 52, 55, 115, 123, 0, 0 }, [ 35 ] = { 154, 64, 54, 62, 0, 0 }, [ 36 ] = { 52, 55, 305, 56, 0, 0 }, [ 37 ] = { 39, 38, 32, 41, 0, 0 }, [ 38 ] = { 54, 49, 50, 55, 0, 0 }, [ 39 ] = { 35, 52, 41, 44, 0, 0 }, [ 40 ] = { 36, 54, 30, 46, 0, 0 }, [ 41 ] = { 40, 63, 42, 54, 0, 0 }, [ 42 ] = { 38, 61, 49, 39, 0, 0 }, [ 43 ] = { 41, 60, 47, 42, 0, 0 }, [ 44 ] = { 41, 43, 45, 38, 0, 0 }, [ 45 ] = { 247, 11, 303, 52, 0, 0 }, [ 46 ] = { 202, 212, 300, 301, 0, 0 }, [ 47 ] = { 310, 212, 299, 202, 0, 0 }, [ 48 ] = { 21, 29, 155, 136, 0, 0 }, [ 49 ] = { 67, 77, 67, 76, 0, 0 }, [ 50 ] = { 136, 142, 134, 140, 0, 0 }, [ 51 ] = { 304, 311, 298, 200, 0, 0 }, [ 52 ] = { 197, 208, 296, 297, 0, 0 }, [ 53 ] = { 315, 320, 306, 304, 0, 0 }, [ 54 ] = { 1, 137, 313, 306, 0, 0 }, [ 55 ] = { 57, 67, 304, 0, 0, 0 }, [ 56 ] = { 307, 321, 10, 145, 0, 0 }, [ 57 ] = { 152, 65, 314, 159, 0, 0 }, [ 58 ] = { 152, 65, 136, 294, 0, 0 }, [ 59 ] = { 239, 249, 236, 238, 0, 0 }, [ 60 ] = { 240, 250, 235, 243, 0, 0 }, [ 61 ] = { 243, 253, 237, 242, 0, 0 }, [ 62 ] = { 245, 256, 214, 232, 0, 0 }, [ 63 ] = { 244, 255, 213, 231, 0, 0 }, [ 64 ] = { 216, 238, 238, 244, 0, 0 }, [ 65 ] = { 218, 241, 214, 231, 0, 0 }, [ 66 ] = { 208, 218, 217, 227, 0, 0 }, [ 67 ] = { 210, 221, 219, 225, 0, 0 }, [ 68 ] = { 214, 224, 222, 228, 0, 0 }, [ 69 ] = { 213, 213, 221, 220, 0, 0 }, [ 70 ] = { 217, 217, 225, 222, 0, 0 }, [ 71 ] = { 209, 213, 218, 220, 0, 0 }, [ 72 ] = { 237, 246, 200, 221, 0, 0 }, [ 73 ] = { 206, 216, 240, 233, 0, 0 }, [ 74 ] = { 238, 
248, 234, 211, 0, 0 }, [ 75 ] = { 234, 243, 229, 236, 0, 0 }, [ 76 ] = { 140, 51, 139, 306, 0, 0 }, [ 77 ] = { 121, 127, 119, 147, 0, 0 }, [ 78 ] = { 316, 322, 308, 307, 0, 0 }, [ 79 ] = { 317, 323, 309, 308, 0, 0 }, [ 80 ] = { 318, 324, 310, 309, 0, 0 }, [ 81 ] = { 319, 325, 311, 310, 0, 0 }, [ 82 ] = { 117, 125, 116, 124, 0, 0 }, [ 83 ] = { 118, 147, 117, 125, 0, 0 }, [ 84 ] = { 329, 337, 329, 324, 0, 0 }, [ 85 ] = { 321, 332, 316, 318, 0, 0 }, [ 86 ] = { 322, 330, 320, 319, 0, 0 }, [ 87 ] = { 331, 340, 328, 328, 0, 0 }, [ 88 ] = { 336, 331, 322, 323, 0, 0 }, [ 89 ] = { 23, 31, 24, 33, 0, 0 }, [ 90 ] = { 27, 35, 28, 37, 0, 0 }, [ 91 ] = { 59, 69, 59, 67, 0, 0 }, [ 92 ] = { 63, 74, 64, 72, 0, 0 }, [ 93 ] = { 59, 69, 58, 68, 0, 0 }, [ 94 ] = { 257, 268, 250, 255, 0, 0 }, [ 95 ] = { 250, 262, 242, 248, 0, 0 }, [ 96 ] = { 254, 266, 246, 252, 0, 0 }, [ 97 ] = { 126, 132, 125, 129, 0, 0 }, [ 98 ] = { 123, 129, 122, 132, 0, 0 }, [ 99 ] = { 126, 130, 126, 135, 0, 0 }, [ 100 ] = { 142, 165, 285, 153, 0, 0 }, [ 101 ] = { 186, 195, 188, 193, 0, 0 }, [ 102 ] = { 187, 196, 185, 191, 0, 0 }, [ 103 ] = { 185, 194, 131, 27, 0, 0 }, [ 104 ] = { 203, 209, 301, 298, 0, 0 }, [ 105 ] = { 165, 171, 179, 182, 0, 0 }, [ 106 ] = { 166, 172, 180, 183, 0, 0 }, [ 107 ] = { 170, 178, 184, 189, 0, 0 }, [ 108 ] = { 167, 197, 13, 187, 0, 0 }, [ 109 ] = { 157, 167, 171, 178, 0, 0 }, [ 110 ] = { 160, 168, 174, 179, 0, 0 }, [ 111 ] = { 164, 170, 178, 181, 0, 0 }, [ 112 ] = { 170, 177, 184, 188, 0, 0 }, [ 113 ] = { 131, 140, 187, 191, 0, 0 }, [ 114 ] = { 193, 201, 14, 195, 0, 0 }, [ 115 ] = { 196, 204, 14, 198, 0, 0 }, [ 116 ] = { 107, 116, 105, 120, 0, 0 }, [ 117 ] = { 111, 124, 106, 118, 0, 0 }, [ 118 ] = { 114, 120, 106, 118, 0, 0 }, [ 119 ] = { 71, 86, 78, 90, 0, 0 }, [ 120 ] = { 76, 90, 70, 89, 0, 0 }, [ 121 ] = { 75, 83, 76, 88, 0, 0 }, [ 122 ] = { 73, 88, 83, 80, 0, 0 }, [ 123 ] = { 86, 101, 93, 105, 0, 0 }, [ 124 ] = { 91, 105, 85, 104, 0, 0 }, [ 125 ] = { 90, 98, 91, 103, 0, 0 }, [ 126 ] = 
{ 88, 103, 98, 95, 0, 0 }, [ 127 ] = { 101, 111, 102, 109, 0, 0 }, [ 128 ] = { 103, 113, 100, 111, 0, 0 }, [ 129 ] = { 105, 110, 113, 122, 0, 0 }, [ 130 ] = { 102, 113, 114, 108, 0, 0 }, [ 131 ] = { 301, 309, 10, 145, 0, 0 }, [ 132 ] = { 311, 11, 303, 145, 0, 0 }, [ 133 ] = { 200, 243, 197, 234, 0, 0 }, [ 134 ] = { 351, 361, 343, 343, 0, 0 }, [ 135 ] = { 350, 360, 342, 342, 0, 0 }, [ 136 ] = { 352, 362, 344, 344, 0, 0 }, [ 137 ] = { 258, 280, 251, 145, 0, 0 }, [ 138 ] = { 139, 269, 252, 256, 0, 0 }, [ 139 ] = { 259, 270, 253, 145, 0, 0 }, [ 140 ] = { 260, 148, 254, 257, 0, 0 }, [ 141 ] = { 261, 271, 255, 145, 0, 0 }, [ 142 ] = { 262, 272, 138, 258, 0, 0 }, [ 143 ] = { 263, 272, 256, 145, 0, 0 }, [ 144 ] = { 276, 288, 138, 273, 0, 0 }, [ 145 ] = { 270, 283, 266, 145, 0, 0 }, [ 146 ] = { 273, 286, 269, 145, 0, 0 }, [ 147 ] = { 268, 279, 262, 145, 0, 0 }, [ 148 ] = { 265, 275, 261, 145, 0, 0 }, [ 149 ] = { 276, 277, 257, 145, 0, 0 }, [ 150 ] = { 281, 278, 138, 263, 0, 0 }, [ 151 ] = { 289, 281, 138, 271, 0, 0 }, [ 152 ] = { 12, 11, 138, 272, 0, 0 }, [ 153 ] = { 282, 294, 277, 145, 0, 0 }, [ 154 ] = { 277, 289, 271, 145, 0, 0 }, [ 155 ] = { 139, 290, 272, 275, 0, 0 }, [ 156 ] = { 278, 291, 138, 276, 0, 0 }, [ 157 ] = { 279, 148, 273, 277, 0, 0 }, [ 158 ] = { 280, 148, 274, 278, 0, 0 }, [ 159 ] = { 12, 292, 275, 145, 0, 0 }, [ 160 ] = { 285, 298, 282, 145, 0, 0 }, [ 161 ] = { 283, 296, 258, 145, 0, 0 }, [ 162 ] = { 288, 342, 10, 145, 0, 0 }, [ 163 ] = { 265, 276, 260, 145, 0, 0 }, [ 164 ] = { 353, 363, 345, 145, 0, 0 }, [ 165 ] = { 354, 364, 346, 145, 0, 0 }, [ 166 ] = { 199, 243, 197, 335, 0, 0 }, [ 167 ] = { 238, 359, 341, 211, 0, 0 }, [ 168 ] = { 349, 358, 333, 147, 0, 0 }, [ 169 ] = { 348, 356, 115, 123, 0, 0 }, [ 170 ] = { 154, 357, 340, 340, 0, 0 }, [ 171 ] = { 199, 64, 339, 337, 0, 0 }, [ 172 ] = { 337, 0, 332, 136, 0, 0 }, [ 173 ] = { 343, 309, 335, 157, 0, 0 }, [ 174 ] = { 339, 128, 334, 287, 0, 0 }, [ 175 ] = { 344, 351, 336, 303, 0, 0 }, [ 176 ] = { 315, 137, 
336, 303, 0, 0 }, [ 177 ] = { 348, 355, 340, 340, 0, 0 }, [ 178 ] = { 52, 38, 54, 62, 0, 0 }, [ 179 ] = { 349, 358, 341, 341, 0, 0 }, [ 180 ] = { 121, 127, 234, 244, 0, 0 }, [ 181 ] = { 342, 359, 303, 335, 0, 0 }, [ 182 ] = { 312, 248, 336, 234, 0, 0 }, [ 183 ] = { 309, 357, 197, 203, 0, 0 }, [ 184 ] = { 31, 356, 31, 336, 0, 0 }, [ 185 ] = { 31, 55, 31, 304, 0, 0 }, [ 186 ] = { 101, 111, 102, 12, 0, 0 }, [ 187 ] = { 139, 246, 113, 12, 0, 0 }, [ 188 ] = { 12, 210, 299, 145, 0, 0 }, [ 189 ] = { 139, 149, 198, 300, 0, 0 }, [ 190 ] = { 103, 11, 117, 125, 0, 0 }, [ 191 ] = { 13, 15, 12, 16, 0, 0 }, [ 192 ] = { 14, 19, 14, 12, 0, 0 }, [ 193 ] = { 101, 111, 113, 202, 0, 0 }, [ 194 ] = { 199, 11, 231, 299, 0, 0 }, [ 195 ] = { 139, 197, 146, 19, 0, 0 }, [ 196 ] = { 353, 363, 138, 345, 0, 0 }, [ 197 ] = { 354, 364, 138, 346, 0, 0 } }; static const pmg_power_group_t power6_groups[] = { [ 0 ] = { .pmg_name = "pm_utilization", .pmg_desc = "CPI and utilization data", .pmg_event_ids = power6_group_event_ids[0], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000000a02121eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 1 ] = { .pmg_name = "pm_utilization_capacity", .pmg_desc = "CPU utilization and capacity", .pmg_event_ids = power6_group_event_ids[1], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000fa1ef4f4ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 2 ] = { .pmg_name = "pm_branch", .pmg_desc = "Branch operations", .pmg_event_ids = power6_group_event_ids[2], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x04000000a2a8808aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 3 ] = { .pmg_name = "pm_branch2", .pmg_desc = "Branch operations", .pmg_event_ids = power6_group_event_ids[3], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x04000000a4a68e8cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 4 ] = { .pmg_name = "pm_branch3", .pmg_desc = "Branch operations", .pmg_event_ids = power6_group_event_ids[4], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x04000000a0a28486ULL, 
.pmg_mmcra = 0x0000000000000000ULL }, [ 5 ] = { .pmg_name = "pm_branch4", .pmg_desc = "Branch operations", .pmg_event_ids = power6_group_event_ids[5], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x04000000a8aa8c8eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 6 ] = { .pmg_name = "pm_branch5", .pmg_desc = "Branch operations", .pmg_event_ids = power6_group_event_ids[6], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x04040000a052c652ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 7 ] = { .pmg_name = "pm_dsource", .pmg_desc = "Data source", .pmg_event_ids = power6_group_event_ids[7], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000058585656ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 8 ] = { .pmg_name = "pm_dsource2", .pmg_desc = "Data sources", .pmg_event_ids = power6_group_event_ids[8], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000005a5a5856ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 9 ] = { .pmg_name = "pm_dsource3", .pmg_desc = "Data sources", .pmg_event_ids = power6_group_event_ids[9], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000005a5a5a5aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 10 ] = { .pmg_name = "pm_dsource4", .pmg_desc = "Data sources", .pmg_event_ids = power6_group_event_ids[10], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000005c5c5c5cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 11 ] = { .pmg_name = "pm_dsource5", .pmg_desc = "Data sources", .pmg_event_ids = power6_group_event_ids[11], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000005e5e5e5eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 12 ] = { .pmg_name = "pm_dlatencies", .pmg_desc = "Data latencies", .pmg_event_ids = power6_group_event_ids[12], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000000c281e24ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 13 ] = { .pmg_name = "pm_dlatencies2", .pmg_desc = "Data latencies", .pmg_event_ids = power6_group_event_ids[13], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000022c1e2aULL, 
.pmg_mmcra = 0x0000000000000000ULL }, [ 14 ] = { .pmg_name = "pm_dlatencies3", .pmg_desc = "Data latencies", .pmg_event_ids = power6_group_event_ids[14], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000022e5e2cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 15 ] = { .pmg_name = "pm_dlatencies4", .pmg_desc = "Data latencies", .pmg_event_ids = power6_group_event_ids[15], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000005a2a5c26ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 16 ] = { .pmg_name = "pm_dlatencies5", .pmg_desc = "Data latencies", .pmg_event_ids = power6_group_event_ids[16], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000005c225828ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 17 ] = { .pmg_name = "pm_dlatencies6", .pmg_desc = "Data latencies", .pmg_event_ids = power6_group_event_ids[17], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000005e245a2eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 18 ] = { .pmg_name = "pm_dlatencies7", .pmg_desc = "Data latencies", .pmg_event_ids = power6_group_event_ids[18], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000005820120eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 19 ] = { .pmg_name = "pm_dlatencies8", .pmg_desc = "Data latencies", .pmg_event_ids = power6_group_event_ids[19], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000010581e20ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 20 ] = { .pmg_name = "pm_dlatencies9", .pmg_desc = "Data latencies", .pmg_event_ids = power6_group_event_ids[20], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000122c125eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 21 ] = { .pmg_name = "pm_dlatencies10", .pmg_desc = "Data latencies", .pmg_event_ids = power6_group_event_ids[21], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000005a261e26ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 22 ] = { .pmg_name = "pm_isource", .pmg_desc = "Instruction sources", .pmg_event_ids = power6_group_event_ids[22], .pmg_mmcr0 = 0x0000000000000000ULL, 
.pmg_mmcr1 = 0x0040000040404654ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 23 ] = { .pmg_name = "pm_isource2", .pmg_desc = "Instruction sources", .pmg_event_ids = power6_group_event_ids[23], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0040000046464046ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 24 ] = { .pmg_name = "pm_isource3", .pmg_desc = "Instruction sources", .pmg_event_ids = power6_group_event_ids[24], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0040000044444444ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 25 ] = { .pmg_name = "pm_isource4", .pmg_desc = "Instruction sources", .pmg_event_ids = power6_group_event_ids[25], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0040000042424242ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 26 ] = { .pmg_name = "pm_isource5", .pmg_desc = "Instruction sources", .pmg_event_ids = power6_group_event_ids[26], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0040000040405454ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 27 ] = { .pmg_name = "pm_pteg", .pmg_desc = "PTEG sources", .pmg_event_ids = power6_group_event_ids[27], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0001000048484e4eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 28 ] = { .pmg_name = "pm_pteg2", .pmg_desc = "PTEG sources", .pmg_event_ids = power6_group_event_ids[28], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000100002848484cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 29 ] = { .pmg_name = "pm_pteg3", .pmg_desc = "PTEG sources", .pmg_event_ids = power6_group_event_ids[29], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000100004e4e284aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 30 ] = { .pmg_name = "pm_pteg4", .pmg_desc = "PTEG sources", .pmg_event_ids = power6_group_event_ids[30], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000100004a4a4a4aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 31 ] = { .pmg_name = "pm_pteg5", .pmg_desc = "PTEG sources", .pmg_event_ids = power6_group_event_ids[31], .pmg_mmcr0 = 
0x0000000000000000ULL, .pmg_mmcr1 = 0x000100004c4c4cc8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 32 ] = { .pmg_name = "pm_data_tablewalk", .pmg_desc = "Data tablewalks", .pmg_event_ids = power6_group_event_ids[32], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x09900000a0a284e8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 33 ] = { .pmg_name = "pm_inst_tablewalk", .pmg_desc = "Instruction tablewalks", .pmg_event_ids = power6_group_event_ids[33], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x09900000a8aa8ceaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 34 ] = { .pmg_name = "pm_freq", .pmg_desc = "Frequency events", .pmg_event_ids = power6_group_event_ids[34], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000002a3c3c3cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 35 ] = { .pmg_name = "pm_disp_wait", .pmg_desc = "Dispatch stalls", .pmg_event_ids = power6_group_event_ids[35], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000560c040cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 36 ] = { .pmg_name = "pm_disp_held", .pmg_desc = "Dispatch held conditions", .pmg_event_ids = power6_group_event_ids[36], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x200000002a3c2aa2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 37 ] = { .pmg_name = "pm_disp_held2", .pmg_desc = "Dispatch held conditions", .pmg_event_ids = power6_group_event_ids[37], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x200000008004a4a6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 38 ] = { .pmg_name = "pm_disp_held3", .pmg_desc = "Dispatch held conditions", .pmg_event_ids = power6_group_event_ids[38], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x20000000888aacaeULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 39 ] = { .pmg_name = "pm_disp_held4", .pmg_desc = "Dispatch held conditions", .pmg_event_ids = power6_group_event_ids[39], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x02000000a0a28486ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 40 ] = { .pmg_name = "pm_disp_held5", .pmg_desc 
= "Dispatch held conditions", .pmg_event_ids = power6_group_event_ids[40], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x22000000a8aa8ca0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 41 ] = { .pmg_name = "pm_disp_held6", .pmg_desc = "Dispatch held conditions", .pmg_event_ids = power6_group_event_ids[41], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x33000000a882a4a6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 42 ] = { .pmg_name = "pm_disp_held7", .pmg_desc = "Dispatch held conditions", .pmg_event_ids = power6_group_event_ids[42], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x30000000888aacaeULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 43 ] = { .pmg_name = "pm_disp_held8", .pmg_desc = "Dispatch held conditions", .pmg_event_ids = power6_group_event_ids[43], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x220000008a8cae80ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 44 ] = { .pmg_name = "pm_disp_held9", .pmg_desc = "Dispatch held conditions", .pmg_event_ids = power6_group_event_ids[44], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x220000008aa08a8cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 45 ] = { .pmg_name = "pm_sync", .pmg_desc = "Sync events", .pmg_event_ids = power6_group_event_ids[45], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x38900000ae1eeca0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 46 ] = { .pmg_name = "pm_L1_ref", .pmg_desc = "L1 references", .pmg_event_ids = power6_group_event_ids[46], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x80000000368aa63aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 47 ] = { .pmg_name = "pm_L1_ldst", .pmg_desc = "L1 load/store ref/miss", .pmg_event_ids = power6_group_event_ids[47], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x800000003230a8a0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 48 ] = { .pmg_name = "pm_streams", .pmg_desc = "Streams", .pmg_event_ids = power6_group_event_ids[48], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x48000000a0a284a4ULL, .pmg_mmcra = 
0x0000000000000000ULL }, [ 49 ] = { .pmg_name = "pm_flush", .pmg_desc = "Flushes", .pmg_event_ids = power6_group_event_ids[49], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0022000010cacccaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 50 ] = { .pmg_name = "pm_prefetch", .pmg_desc = "I cache Prefetches", .pmg_event_ids = power6_group_event_ids[50], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x400400008a8caec0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 51 ] = { .pmg_name = "pm_stcx", .pmg_desc = "STCX", .pmg_event_ids = power6_group_event_ids[51], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00080000e6eccecaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 52 ] = { .pmg_name = "pm_larx", .pmg_desc = "LARX", .pmg_event_ids = power6_group_event_ids[52], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00080000eae2c6ceULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 53 ] = { .pmg_name = "pm_thread_cyc", .pmg_desc = "Thread cycles", .pmg_event_ids = power6_group_event_ids[53], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000016182604ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 54 ] = { .pmg_name = "pm_misc", .pmg_desc = "Misc", .pmg_event_ids = power6_group_event_ids[54], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000004161808ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 55 ] = { .pmg_name = "pm_misc2", .pmg_desc = "Misc", .pmg_event_ids = power6_group_event_ids[55], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40020000eef8f8a0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 56 ] = { .pmg_name = "pm_misc3", .pmg_desc = "Misc", .pmg_event_ids = power6_group_event_ids[56], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0300000054a01e02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 57 ] = { .pmg_name = "pm_tlb_slb", .pmg_desc = "TLB and SLB events", .pmg_event_ids = power6_group_event_ids[57], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00980000e0e8e8e2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 58 ] = { .pmg_name = "pm_slb_miss", 
.pmg_desc = "SLB Misses", .pmg_event_ids = power6_group_event_ids[58], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00480001e0e8ee32ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 59 ] = { .pmg_name = "pm_rejects", .pmg_desc = "Reject events", .pmg_event_ids = power6_group_event_ids[59], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaa00000034303e30ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 60 ] = { .pmg_name = "pm_rejects2", .pmg_desc = "Reject events", .pmg_event_ids = power6_group_event_ids[60], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x9a000000323830acULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 61 ] = { .pmg_name = "pm_rejects3", .pmg_desc = "Reject events", .pmg_event_ids = power6_group_event_ids[61], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaa000000303e3234ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 62 ] = { .pmg_name = "pm_rejects4", .pmg_desc = "Unaligned store rejects", .pmg_event_ids = power6_group_event_ids[62], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x900000003630a2aaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 63 ] = { .pmg_name = "pm_rejects5", .pmg_desc = "Unaligned load rejects", .pmg_event_ids = power6_group_event_ids[63], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x900000003036a0a8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 64 ] = { .pmg_name = "pm_rejects6", .pmg_desc = "Set mispredictions rejects", .pmg_event_ids = power6_group_event_ids[64], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xa0000000848c341cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 65 ] = { .pmg_name = "pm_rejects_unit", .pmg_desc = "Unaligned reject events by unit", .pmg_event_ids = power6_group_event_ids[65], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x90000000808aa2a8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 66 ] = { .pmg_name = "pm_rejects_unit2", .pmg_desc = "Reject events by unit", .pmg_event_ids = power6_group_event_ids[66], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaa000000a6828e8aULL, .pmg_mmcra = 
0x0000000000000000ULL }, [ 67 ] = { .pmg_name = "pm_rejects_unit3", .pmg_desc = "Reject events by unit", .pmg_event_ids = power6_group_event_ids[67], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0a000000a4a08c88ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 68 ] = { .pmg_name = "pm_rejects_unit4", .pmg_desc = "Reject events by unit", .pmg_event_ids = power6_group_event_ids[68], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaa000000a2868aaeULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 69 ] = { .pmg_name = "pm_rejects_unit5", .pmg_desc = "Reject events by unit", .pmg_event_ids = power6_group_event_ids[69], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x9900000086a6ae8eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 70 ] = { .pmg_name = "pm_rejects_unit6", .pmg_desc = "Reject events by unit", .pmg_event_ids = power6_group_event_ids[70], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaa00000080a6a88eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 71 ] = { .pmg_name = "pm_rejects_unit7", .pmg_desc = "Reject events by unit", .pmg_event_ids = power6_group_event_ids[71], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xa900000082a6aa8eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 72 ] = { .pmg_name = "pm_ldf", .pmg_desc = "Floating Point loads", .pmg_event_ids = power6_group_event_ids[72], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x800000003832a4acULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 73 ] = { .pmg_name = "pm_lsu_misc", .pmg_desc = "LSU events", .pmg_event_ids = power6_group_event_ids[73], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x08800000caccee8aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 74 ] = { .pmg_name = "pm_lsu_lmq", .pmg_desc = "LSU LMQ events", .pmg_event_ids = power6_group_event_ids[74], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x98000000ac1c1ca4ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 75 ] = { .pmg_name = "pm_lsu_flush_derat_miss", .pmg_desc = "LSU flush and DERAT misses", .pmg_event_ids = 
power6_group_event_ids[75], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00200000fc0eeceeULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 76 ] = { .pmg_name = "pm_lla", .pmg_desc = "Load Look Ahead events", .pmg_event_ids = power6_group_event_ids[76], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x33000000a2841208ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 77 ] = { .pmg_name = "pm_gct", .pmg_desc = "GCT events", .pmg_event_ids = power6_group_event_ids[77], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x404000000808a6e8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 78 ] = { .pmg_name = "pm_smt_priorities", .pmg_desc = "Thread priority events", .pmg_event_ids = power6_group_event_ids[78], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0020000040404040ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 79 ] = { .pmg_name = "pm_smt_priorities2", .pmg_desc = "Thread priority events", .pmg_event_ids = power6_group_event_ids[79], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0020000046464646ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 80 ] = { .pmg_name = "pm_smt_priorities3", .pmg_desc = "Thread priority differences events", .pmg_event_ids = power6_group_event_ids[80], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0002000040404040ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 81 ] = { .pmg_name = "pm_smt_priorities4", .pmg_desc = "Thread priority differences events", .pmg_event_ids = power6_group_event_ids[81], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x03020000a6464646ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 82 ] = { .pmg_name = "pm_fxu", .pmg_desc = "FXU events", .pmg_event_ids = power6_group_event_ids[82], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000050505050ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 83 ] = { .pmg_name = "pm_fxu2", .pmg_desc = "FXU events", .pmg_event_ids = power6_group_event_ids[83], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x02040000aee41616ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 84 ] = {
.pmg_name = "pm_vmx", .pmg_desc = "VMX events", .pmg_event_ids = power6_group_event_ids[84], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x700000008480a2a6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 85 ] = { .pmg_name = "pm_vmx2", .pmg_desc = "VMX events", .pmg_event_ids = power6_group_event_ids[85], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x600000008088a2aaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 86 ] = { .pmg_name = "pm_vmx3", .pmg_desc = "VMX events", .pmg_event_ids = power6_group_event_ids[86], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x600000008284aaacULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 87 ] = { .pmg_name = "pm_vmx4", .pmg_desc = "VMX events", .pmg_event_ids = power6_group_event_ids[87], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xb0000000828ea6a0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 88 ] = { .pmg_name = "pm_vmx5", .pmg_desc = "VMX events", .pmg_event_ids = power6_group_event_ids[88], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xb00000008084aca2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 89 ] = { .pmg_name = "pm_dfu", .pmg_desc = "DFU events", .pmg_event_ids = power6_group_event_ids[89], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xe00000008c88a2aeULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 90 ] = { .pmg_name = "pm_dfu2", .pmg_desc = "DFU events", .pmg_event_ids = power6_group_event_ids[90], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xe00000008a84a0a6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 91 ] = { .pmg_name = "pm_fab", .pmg_desc = "Fabric events", .pmg_event_ids = power6_group_event_ids[91], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x500020003030a4acULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 92 ] = { .pmg_name = "pm_fab2", .pmg_desc = "Fabric events", .pmg_event_ids = power6_group_event_ids[92], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x50002000888aa2a0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 93 ] = { .pmg_name = "pm_fab3", .pmg_desc = "Fabric events", 
.pmg_event_ids = power6_group_event_ids[93], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x500020003030aea6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 94 ] = { .pmg_name = "pm_mem_dblpump", .pmg_desc = "Double pump", .pmg_event_ids = power6_group_event_ids[94], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x5000400030303434ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 95 ] = { .pmg_name = "pm_mem0_dblpump", .pmg_desc = "MCS0 Double pump", .pmg_event_ids = power6_group_event_ids[95], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x500040008082a4a6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 96 ] = { .pmg_name = "pm_mem1_dblpump", .pmg_desc = "MCS1 Double pump", .pmg_event_ids = power6_group_event_ids[96], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x50004000888aacaeULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 97 ] = { .pmg_name = "pm_gxo", .pmg_desc = "GX outbound", .pmg_event_ids = power6_group_event_ids[97], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x500060008082a4a6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 98 ] = { .pmg_name = "pm_gxi", .pmg_desc = "GX inbound", .pmg_event_ids = power6_group_event_ids[98], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x500060008688aaa0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 99 ] = { .pmg_name = "pm_gx_dma", .pmg_desc = "DMA events", .pmg_event_ids = power6_group_event_ids[99], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x500060008086acaeULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 100 ] = { .pmg_name = "pm_L1_misc", .pmg_desc = "L1 misc events", .pmg_event_ids = power6_group_event_ids[100], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4004000082e2a80aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 101 ] = { .pmg_name = "pm_L2_data", .pmg_desc = "L2 load and store data", .pmg_event_ids = power6_group_event_ids[101], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x5000800030303434ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 102 ] = { .pmg_name = "pm_L2_ld_inst", .pmg_desc = "L2 Load 
instructions", .pmg_event_ids = power6_group_event_ids[102], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x5800a00030303486ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 103 ] = { .pmg_name = "pm_L2_castout_invalidate", .pmg_desc = "L2 castout and invalidate events", .pmg_event_ids = power6_group_event_ids[103], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x5000c00030303434ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 104 ] = { .pmg_name = "pm_L2_ldst_reqhit", .pmg_desc = "L2 load and store requests and hits", .pmg_event_ids = power6_group_event_ids[104], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x5000e00030303434ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 105 ] = { .pmg_name = "pm_L2_ld_data_slice", .pmg_desc = "L2 data loads by slice", .pmg_event_ids = power6_group_event_ids[105], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x500080008082a8aaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 106 ] = { .pmg_name = "pm_L2_ld_inst_slice", .pmg_desc = "L2 instruction loads by slice", .pmg_event_ids = power6_group_event_ids[106], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x5000a0008082a8aaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 107 ] = { .pmg_name = "pm_L2_st_slice", .pmg_desc = "L2 slice stores by slice", .pmg_event_ids = power6_group_event_ids[107], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x500080008486acaeULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 108 ] = { .pmg_name = "pm_L2miss_slice", .pmg_desc = "L2 misses by slice", .pmg_event_ids = power6_group_event_ids[108], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x5000a000843256acULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 109 ] = { .pmg_name = "pm_L2_castout_slice", .pmg_desc = "L2 castouts by slice", .pmg_event_ids = power6_group_event_ids[109], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x5000c0008082a8aaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 110 ] = { .pmg_name = "pm_L2_invalidate_slice", .pmg_desc = "L2 invalidate by slice", .pmg_event_ids = 
power6_group_event_ids[110], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x5000c0008486acaeULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 111 ] = { .pmg_name = "pm_L2_ld_reqhit_slice", .pmg_desc = "L2 load requests and hits by slice", .pmg_event_ids = power6_group_event_ids[111], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x5000e0008082a8aaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 112 ] = { .pmg_name = "pm_L2_st_reqhit_slice", .pmg_desc = "L2 store requests and hits by slice", .pmg_event_ids = power6_group_event_ids[112], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x5000e0008486acaeULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 113 ] = { .pmg_name = "pm_L2_redir_pref", .pmg_desc = "L2 redirect and prefetch", .pmg_event_ids = power6_group_event_ids[113], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x08400000cacc8886ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 114 ] = { .pmg_name = "pm_L3_SliceA", .pmg_desc = "L3 slice A events", .pmg_event_ids = power6_group_event_ids[114], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x50000000303058a4ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 115 ] = { .pmg_name = "pm_L3_SliceB", .pmg_desc = "L3 slice B events", .pmg_event_ids = power6_group_event_ids[115], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x50000000888a58acULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 116 ] = { .pmg_name = "pm_fpu_issue", .pmg_desc = "FPU issue events", .pmg_event_ids = power6_group_event_ids[116], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00300000c6c8eae4ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 117 ] = { .pmg_name = "pm_fpu_issue2", .pmg_desc = "FPU issue events", .pmg_event_ids = power6_group_event_ids[117], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00300000c0c2eceeULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 118 ] = { .pmg_name = "pm_fpu_issue3", .pmg_desc = "FPU issue events", .pmg_event_ids = power6_group_event_ids[118], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00330000e0e2eceeULL,
.pmg_mmcra = 0x0000000000000000ULL }, [ 119 ] = { .pmg_name = "pm_fpu0_flop", .pmg_desc = "FPU0 flop events", .pmg_event_ids = power6_group_event_ids[119], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcc0000008082a484ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 120 ] = { .pmg_name = "pm_fpu0_misc", .pmg_desc = "FPU0 events", .pmg_event_ids = power6_group_event_ids[120], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcc00000086a08286ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 121 ] = { .pmg_name = "pm_fpu0_misc2", .pmg_desc = "FPU0 events", .pmg_event_ids = power6_group_event_ids[121], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xdd00000080a6a4a6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 122 ] = { .pmg_name = "pm_fpu0_misc3", .pmg_desc = "FPU0 events", .pmg_event_ids = power6_group_event_ids[122], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0d000000a0a28486ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 123 ] = { .pmg_name = "pm_fpu1_flop", .pmg_desc = "FPU1 flop events", .pmg_event_ids = power6_group_event_ids[123], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcc000000888aac8cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 124 ] = { .pmg_name = "pm_fpu1_misc", .pmg_desc = "FPU1 events", .pmg_event_ids = power6_group_event_ids[124], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcc0000008ea88a8eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 125 ] = { .pmg_name = "pm_fpu1_misc2", .pmg_desc = "FPU1 events", .pmg_event_ids = power6_group_event_ids[125], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xdd00000088aeacaeULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 126 ] = { .pmg_name = "pm_fpu1_misc3", .pmg_desc = "FPU1 events", .pmg_event_ids = power6_group_event_ids[126], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0d000000a8aa8c8eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 127 ] = { .pmg_name = "pm_fpu_flop", .pmg_desc = "FPU flop events", .pmg_event_ids = power6_group_event_ids[127], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 
= 0xc000000030303434ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 128 ] = { .pmg_name = "pm_fpu_misc", .pmg_desc = "FPU events", .pmg_event_ids = power6_group_event_ids[128], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xdd00000030343434ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 129 ] = { .pmg_name = "pm_fpu_misc2", .pmg_desc = "FPU events", .pmg_event_ids = power6_group_event_ids[129], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0c00000034343030ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 130 ] = { .pmg_name = "pm_fpu_misc3", .pmg_desc = "FPU events", .pmg_event_ids = power6_group_event_ids[130], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0d00000034343030ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 131 ] = { .pmg_name = "pm_purr", .pmg_desc = "PURR events", .pmg_event_ids = power6_group_event_ids[131], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000000ef41e02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 132 ] = { .pmg_name = "pm_suspend", .pmg_desc = "SUSPENDED events", .pmg_event_ids = power6_group_event_ids[132], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00900000001eec02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 133 ] = { .pmg_name = "pm_dcache", .pmg_desc = "D cache", .pmg_event_ids = power6_group_event_ids[133], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000000c0e0c06ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 134 ] = { .pmg_name = "pm_derat_miss", .pmg_desc = "DERAT miss", .pmg_event_ids = power6_group_event_ids[134], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0090000f40404040ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 135 ] = { .pmg_name = "pm_derat_ref", .pmg_desc = "DERAT ref", .pmg_event_ids = power6_group_event_ids[135], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0080000f40404040ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 136 ] = { .pmg_name = "pm_ierat_miss", .pmg_desc = "IERAT miss", .pmg_event_ids = power6_group_event_ids[136], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 
0x0090000f46464646ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 137 ] = { .pmg_name = "pm_mrk_br", .pmg_desc = "Marked Branch events", .pmg_event_ids = power6_group_event_ids[137], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000052565202ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 138 ] = { .pmg_name = "pm_mrk_dsource", .pmg_desc = "Marked data sources", .pmg_event_ids = power6_group_event_ids[138], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000024a4c4cULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 139 ] = { .pmg_name = "pm_mrk_dsource2", .pmg_desc = "Marked data sources", .pmg_event_ids = power6_group_event_ids[139], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000048484e02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 140 ] = { .pmg_name = "pm_mrk_dsource3", .pmg_desc = "Marked data sources", .pmg_event_ids = power6_group_event_ids[140], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000002802484eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 141 ] = { .pmg_name = "pm_mrk_dsource4", .pmg_desc = "Marked data sources", .pmg_event_ids = power6_group_event_ids[141], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000004e4e2802ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 142 ] = { .pmg_name = "pm_mrk_dsource5", .pmg_desc = "Marked data sources", .pmg_event_ids = power6_group_event_ids[142], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000004a4c024aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 143 ] = { .pmg_name = "pm_mrk_dsource6", .pmg_desc = "Marked data sources", .pmg_event_ids = power6_group_event_ids[143], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000004c4c4a02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 144 ] = { .pmg_name = "pm_mrk_rejects", .pmg_desc = "Marked rejects", .pmg_event_ids = power6_group_event_ids[144], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0009000d34340230ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 145 ] = { .pmg_name = "pm_mrk_rejects2", .pmg_desc = "Marked rejects LSU0", 
.pmg_event_ids = power6_group_event_ids[145], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00090000e6e0c202ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 146 ] = { .pmg_name = "pm_mrk_rejects3", .pmg_desc = "Marked rejects LSU1", .pmg_event_ids = power6_group_event_ids[146], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00090000eee8ca02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 147 ] = { .pmg_name = "pm_mrk_inst", .pmg_desc = "Marked instruction events", .pmg_event_ids = power6_group_event_ids[147], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000001c100a02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 148 ] = { .pmg_name = "pm_mrk_fpu_fin", .pmg_desc = "Marked Floating Point instructions finished", .pmg_event_ids = power6_group_event_ids[148], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xd0000000828a1a02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 149 ] = { .pmg_name = "pm_mrk_misc", .pmg_desc = "Marked misc events", .pmg_event_ids = power6_group_event_ids[149], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00090008341a0802ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 150 ] = { .pmg_name = "pm_mrk_misc2", .pmg_desc = "Marked misc events", .pmg_event_ids = power6_group_event_ids[150], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00080000e40a023eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 151 ] = { .pmg_name = "pm_mrk_misc3", .pmg_desc = "Marked misc events", .pmg_event_ids = power6_group_event_ids[151], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xb009000088e40212ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 152 ] = { .pmg_name = "pm_mrk_misc4", .pmg_desc = "Marked misc events", .pmg_event_ids = power6_group_event_ids[152], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000001e1e021aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 153 ] = { .pmg_name = "pm_mrk_st", .pmg_desc = "Marked stores events", .pmg_event_ids = power6_group_event_ids[153], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000006060602ULL, 
.pmg_mmcra = 0x0000000000000001ULL },
	[ 154 ] = { .pmg_name = "pm_mrk_pteg", .pmg_desc = "Marked PTEG", .pmg_event_ids = power6_group_event_ids[154],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0010000040424402ULL, .pmg_mmcra = 0x0000000000000001ULL },
	[ 155 ] = { .pmg_name = "pm_mrk_pteg2", .pmg_desc = "Marked PTEG", .pmg_event_ids = power6_group_event_ids[155],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0010000002404644ULL, .pmg_mmcra = 0x0000000000000001ULL },
	[ 156 ] = { .pmg_name = "pm_mrk_pteg3", .pmg_desc = "Marked PTEG", .pmg_event_ids = power6_group_event_ids[156],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0010000046460246ULL, .pmg_mmcra = 0x0000000000000001ULL },
	[ 157 ] = { .pmg_name = "pm_mrk_pteg4", .pmg_desc = "Marked PTEG", .pmg_event_ids = power6_group_event_ids[157],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0010000042024054ULL, .pmg_mmcra = 0x0000000000000001ULL },
	[ 158 ] = { .pmg_name = "pm_mrk_pteg5", .pmg_desc = "Marked PTEG", .pmg_event_ids = power6_group_event_ids[158],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0010000044025442ULL, .pmg_mmcra = 0x0000000000000001ULL },
	[ 159 ] = { .pmg_name = "pm_mrk_pteg6", .pmg_desc = "Marked PTEG", .pmg_event_ids = power6_group_event_ids[159],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x001000001e444202ULL, .pmg_mmcra = 0x0000000000000001ULL },
	[ 160 ] = { .pmg_name = "pm_mrk_vmx", .pmg_desc = "Marked VMX", .pmg_event_ids = power6_group_event_ids[160],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x700000008c88ae02ULL, .pmg_mmcra = 0x0000000000000001ULL },
	[ 161 ] = { .pmg_name = "pm_mrk_vmx2", .pmg_desc = "Marked VMX", .pmg_event_ids = power6_group_event_ids[161],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x60900000868ee002ULL, .pmg_mmcra = 0x0000000000000001ULL },
	[ 162 ] = { .pmg_name = "pm_mrk_vmx3", .pmg_desc = "Marked VMX", .pmg_event_ids = power6_group_event_ids[162],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x700000008a821e02ULL, .pmg_mmcra = 0x0000000000000001ULL },
	[ 163 ] = { .pmg_name = "pm_mrk_fp", .pmg_desc = "Marked FP events", .pmg_event_ids = power6_group_event_ids[163],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xd00000008230aa02ULL, .pmg_mmcra = 0x0000000000000001ULL },
	[ 164 ] = { .pmg_name = "pm_mrk_derat_ref", .pmg_desc = "Marked DERAT ref", .pmg_event_ids = power6_group_event_ids[164],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0080000044444402ULL, .pmg_mmcra = 0x0000000000000001ULL },
	[ 165 ] = { .pmg_name = "pm_mrk_derat_miss", .pmg_desc = "Marked DERAT miss", .pmg_event_ids = power6_group_event_ids[165],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0090000044444402ULL, .pmg_mmcra = 0x0000000000000001ULL },
	[ 166 ] = { .pmg_name = "pm_dcache_edge", .pmg_desc = "D cache - edge", .pmg_event_ids = power6_group_event_ids[166],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000000d0e0c07ULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 167 ] = { .pmg_name = "pm_lsu_lmq_edge", .pmg_desc = "LSU LMQ events - edge", .pmg_event_ids = power6_group_event_ids[167],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x98000000ac1d1da4ULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 168 ] = { .pmg_name = "pm_gct_edge", .pmg_desc = "GCT events - edge", .pmg_event_ids = power6_group_event_ids[168],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x404000000909a7e8ULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 169 ] = { .pmg_name = "pm_freq_edge", .pmg_desc = "Frequency events - edge", .pmg_event_ids = power6_group_event_ids[169],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000002b3d3c3cULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 170 ] = { .pmg_name = "pm_disp_wait_edge", .pmg_desc = "Dispatch stalls - edge", .pmg_event_ids = power6_group_event_ids[170],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000560d050dULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 171 ] = { .pmg_name = "pm_edge1", .pmg_desc = "EDGE event group", .pmg_event_ids = power6_group_event_ids[171],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000006300d0c1f1eULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 172 ] = { .pmg_name = "pm_edge2", .pmg_desc = "EDGE event group", .pmg_event_ids = power6_group_event_ids[172],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x400000008180a5a4ULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 173 ] = { .pmg_name = "pm_edge3", .pmg_desc = "EDGE event group", .pmg_event_ids = power6_group_event_ids[173],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x009000000bf4ebeaULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 174 ] = { .pmg_name = "pm_edge4", .pmg_desc = "EDGE event group", .pmg_event_ids = power6_group_event_ids[174],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x400000008786a9a8ULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 175 ] = { .pmg_name = "pm_edge5", .pmg_desc = "EDGE event group", .pmg_event_ids = power6_group_event_ids[175],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00900000fb17edecULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 176 ] = { .pmg_name = "pm_noedge5", .pmg_desc = "EDGE event group", .pmg_event_ids = power6_group_event_ids[176],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00900000fa16edecULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 177 ] = { .pmg_name = "pm_edge6", .pmg_desc = "EDGE event group", .pmg_event_ids = power6_group_event_ids[177],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000002b05050dULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 178 ] = { .pmg_name = "pm_noedge6", .pmg_desc = "EDGE event group", .pmg_event_ids = power6_group_event_ids[178],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000002a04040cULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 179 ] = { .pmg_name = "pm_edge7", .pmg_desc = "EDGE event group", .pmg_event_ids = power6_group_event_ids[179],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000009091d1dULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 180 ] = { .pmg_name = "pm_noedge7", .pmg_desc = "NOEDGE event group", .pmg_event_ids = power6_group_event_ids[180],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000008081c1cULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 181 ] = { .pmg_name = "pm_edge8", .pmg_desc = "EDGE event group", .pmg_event_ids = power6_group_event_ids[181],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00900000cd1dec07ULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 182 ] = { .pmg_name = "pm_noedge8", .pmg_desc = "NOEDGE event group", .pmg_event_ids = power6_group_event_ids[182],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00900000cc1ced06ULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 183 ] = { .pmg_name = "pm_edge9", .pmg_desc = "EDGE event group", .pmg_event_ids = power6_group_event_ids[183],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x80000000880d0ca2ULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 184 ] = { .pmg_name = "pm_edge10", .pmg_desc = "EDGE event group", .pmg_event_ids = power6_group_event_ids[184],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x32000000ac3dae05ULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 185 ] = { .pmg_name = "pm_noedge10", .pmg_desc = "NOEDGE event group", .pmg_event_ids = power6_group_event_ids[185],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x32000000ac3cae04ULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 186 ] = { .pmg_name = "pm_hpm1", .pmg_desc = "HPM group", .pmg_event_ids = power6_group_event_ids[186],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc00000003030341eULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 187 ] = { .pmg_name = "pm_hpm2", .pmg_desc = "HPM group", .pmg_event_ids = power6_group_event_ids[187],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x8c0000000232301eULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 188 ] = { .pmg_name = "pm_hpm3", .pmg_desc = "HPM group", .pmg_event_ids = power6_group_event_ids[188],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x800000001e80f002ULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 189 ] = { .pmg_name = "pm_hpm4", .pmg_desc = "HPM group", .pmg_event_ids = power6_group_event_ids[189],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x800000000212a234ULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 190 ] = { .pmg_name = "pm_hpm5", .pmg_desc = "HPM group", .pmg_event_ids = power6_group_event_ids[190],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xd0000000301e1616ULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 191 ] = { .pmg_name = "pm_hpm6", .pmg_desc = "HPM group", .pmg_event_ids = power6_group_event_ids[191],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000058585a5aULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 192 ] = { .pmg_name = "pm_hpm7", .pmg_desc = "HPM group", .pmg_event_ids = power6_group_event_ids[192],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000005a5a581eULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 193 ] = { .pmg_name = "pm_hpm8", .pmg_desc = "HPM group", .pmg_event_ids = power6_group_event_ids[193],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcc000000303030f0ULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 194 ] = { .pmg_name = "pm_hpm9", .pmg_desc = "HPM group", .pmg_event_ids = power6_group_event_ids[194],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x80000000801e34a8ULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 195 ] = { .pmg_name = "pm_hpm10", .pmg_desc = "HPM group", .pmg_event_ids = power6_group_event_ids[195],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x5040a00002325456ULL, .pmg_mmcra = 0x0000000000000000ULL },
	[ 196 ] = { .pmg_name = "pm_mrk_derat_ref2", .pmg_desc = "Marked DERAT ref", .pmg_event_ids = power6_group_event_ids[196],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0080000044440244ULL, .pmg_mmcra = 0x0000000000000001ULL },
	[ 197 ] = { .pmg_name = "pm_mrk_derat_miss2", .pmg_desc = "Marked DERAT miss", .pmg_event_ids = power6_group_event_ids[197],
		.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0090000044440244ULL, .pmg_mmcra = 0x0000000000000001ULL }
};
#endif
papi-papi-7-2-0-t/src/libperfnec/lib/power7_events.h
/****************************/
/* THIS IS OPEN SOURCE CODE */
/****************************/

#ifndef __POWER7_EVENTS_H__
#define __POWER7_EVENTS_H__

/*
 * File:    power7_events.h
 * CVS:
 * Author:  Corey Ashford
 *          cjashfor@us.ibm.com
 * Mods:
 *
 * (C) Copyright IBM Corporation, 2009. All Rights Reserved.
 * Contributed by Corey Ashford
 *
 * Note: This code was automatically generated and should not be modified by
 * hand.
 */

#define POWER7_PME_PM_NEST_4 0
#define POWER7_PME_PM_IC_DEMAND_L2_BR_ALL 1
#define POWER7_PME_PM_PMC2_SAVED 2
#define POWER7_PME_PM_CMPLU_STALL_DFU 3
#define POWER7_PME_PM_VSU0_16FLOP 4
#define POWER7_PME_PM_NEST_3 5
#define POWER7_PME_PM_MRK_LSU_DERAT_MISS 6
#define POWER7_PME_PM_MRK_ST_CMPL 7
#define POWER7_PME_PM_L2_ST_DISP 8
#define POWER7_PME_PM_L2_CASTOUT_MOD 9
#define POWER7_PME_PM_ISEG 10
#define POWER7_PME_PM_MRK_INST_TIMEO 11
#define POWER7_PME_PM_L2_RCST_DISP_FAIL_ADDR 12
#define POWER7_PME_PM_LSU1_DC_PREF_STREAM_CONFIRM 13
#define POWER7_PME_PM_IERAT_WR_64K 14
#define POWER7_PME_PM_MRK_DTLB_MISS_16M 15
#define POWER7_PME_PM_IERAT_MISS 16
#define POWER7_PME_PM_MRK_PTEG_FROM_LMEM 17
#define POWER7_PME_PM_FLOP 18
#define POWER7_PME_PM_THRD_PRIO_4_5_CYC 19
#define POWER7_PME_PM_BR_PRED_TA 20
#define POWER7_PME_PM_CMPLU_STALL_FXU 21
#define POWER7_PME_PM_EXT_INT 22
#define POWER7_PME_PM_VSU_FSQRT_FDIV 23
#define POWER7_PME_PM_MRK_LD_MISS_EXPOSED_CYC 24
#define POWER7_PME_PM_LSU1_LDF 25
#define POWER7_PME_PM_IC_WRITE_ALL 26
#define POWER7_PME_PM_LSU0_SRQ_STFWD 27
#define POWER7_PME_PM_PTEG_FROM_RL2L3_MOD 28
#define POWER7_PME_PM_MRK_DATA_FROM_L31_SHR 29
#define POWER7_PME_PM_DATA_FROM_L21_MOD 30
#define POWER7_PME_PM_VSU1_SCAL_DOUBLE_ISSUED 31
#define POWER7_PME_PM_VSU0_8FLOP 32
#define POWER7_PME_PM_POWER_EVENT1 33
#define POWER7_PME_PM_DISP_CLB_HELD_BAL 34
#define POWER7_PME_PM_VSU1_2FLOP 35
#define
POWER7_PME_PM_LWSYNC_HELD 36 #define POWER7_PME_PM_INST_FROM_L21_MOD 37 #define POWER7_PME_PM_IC_REQ_ALL 38 #define POWER7_PME_PM_DSLB_MISS 39 #define POWER7_PME_PM_L3_MISS 40 #define POWER7_PME_PM_LSU0_L1_PREF 41 #define POWER7_PME_PM_VSU_SCALAR_SINGLE_ISSUED 42 #define POWER7_PME_PM_LSU1_DC_PREF_STREAM_CONFIRM_STRIDE 43 #define POWER7_PME_PM_L2_INST 44 #define POWER7_PME_PM_VSU0_FRSP 45 #define POWER7_PME_PM_FLUSH_DISP 46 #define POWER7_PME_PM_PTEG_FROM_L2MISS 47 #define POWER7_PME_PM_VSU1_DQ_ISSUED 48 #define POWER7_PME_PM_CMPLU_STALL_LSU 49 #define POWER7_PME_PM_MRK_DATA_FROM_DMEM 50 #define POWER7_PME_PM_LSU_FLUSH_ULD 51 #define POWER7_PME_PM_PTEG_FROM_LMEM 52 #define POWER7_PME_PM_MRK_DERAT_MISS_16M 53 #define POWER7_PME_PM_THRD_ALL_RUN_CYC 54 #define POWER7_PME_PM_MRK_STALL_CMPLU_CYC_COUNT 55 #define POWER7_PME_PM_DATA_FROM_DL2L3_MOD 56 #define POWER7_PME_PM_VSU_FRSP 57 #define POWER7_PME_PM_MRK_DATA_FROM_L21_MOD 58 #define POWER7_PME_PM_PMC1_OVERFLOW 59 #define POWER7_PME_PM_VSU0_SINGLE 60 #define POWER7_PME_PM_MRK_PTEG_FROM_L3MISS 61 #define POWER7_PME_PM_MRK_PTEG_FROM_L31_SHR 62 #define POWER7_PME_PM_VSU0_VECTOR_SP_ISSUED 63 #define POWER7_PME_PM_VSU1_FEST 64 #define POWER7_PME_PM_MRK_INST_DISP 65 #define POWER7_PME_PM_VSU0_COMPLEX_ISSUED 66 #define POWER7_PME_PM_LSU1_FLUSH_UST 67 #define POWER7_PME_PM_INST_CMPL 68 #define POWER7_PME_PM_FXU_IDLE 69 #define POWER7_PME_PM_LSU0_FLUSH_ULD 70 #define POWER7_PME_PM_MRK_DATA_FROM_DL2L3_MOD 71 #define POWER7_PME_PM_LSU_LMQ_SRQ_EMPTY_ALL_CYC 72 #define POWER7_PME_PM_LSU1_REJECT_LMQ_FULL 73 #define POWER7_PME_PM_INST_PTEG_FROM_L21_MOD 74 #define POWER7_PME_PM_GCT_UTIL_3TO6_SLOT 75 #define POWER7_PME_PM_INST_FROM_RL2L3_MOD 76 #define POWER7_PME_PM_SHL_CREATED 77 #define POWER7_PME_PM_L2_ST_HIT 78 #define POWER7_PME_PM_DATA_FROM_DMEM 79 #define POWER7_PME_PM_L3_LD_MISS 80 #define POWER7_PME_PM_FXU1_BUSY_FXU0_IDLE 81 #define POWER7_PME_PM_DISP_CLB_HELD_RES 82 #define POWER7_PME_PM_L2_SN_SX_I_DONE 83 #define 
POWER7_PME_PM_GRP_CMPL 84 #define POWER7_PME_PM_BCPLUS8_CONV 85 #define POWER7_PME_PM_STCX_CMPL 86 #define POWER7_PME_PM_VSU0_2FLOP 87 #define POWER7_PME_PM_L3_PREF_MISS 88 #define POWER7_PME_PM_LSU_SRQ_SYNC_CYC 89 #define POWER7_PME_PM_LSU_REJECT_ERAT_MISS 90 #define POWER7_PME_PM_L1_ICACHE_MISS 91 #define POWER7_PME_PM_LSU1_FLUSH_SRQ 92 #define POWER7_PME_PM_LD_REF_L1_LSU0 93 #define POWER7_PME_PM_VSU0_FEST 94 #define POWER7_PME_PM_VSU_VECTOR_SINGLE_ISSUED 95 #define POWER7_PME_PM_FREQ_UP 96 #define POWER7_PME_PM_DATA_FROM_LMEM 97 #define POWER7_PME_PM_LSU1_LDX 98 #define POWER7_PME_PM_PMC3_OVERFLOW 99 #define POWER7_PME_PM_MRK_BR_MPRED 100 #define POWER7_PME_PM_SHL_MATCH 101 #define POWER7_PME_PM_MRK_BR_TAKEN 102 #define POWER7_PME_PM_ISLB_MISS 103 #define POWER7_PME_PM_CYC 104 #define POWER7_PME_PM_MRK_DATA_FROM_DRL2L3_MOD_CYC 105 #define POWER7_PME_PM_DISP_HELD_THERMAL 106 #define POWER7_PME_PM_INST_PTEG_FROM_RL2L3_SHR 107 #define POWER7_PME_PM_LSU1_SRQ_STFWD 108 #define POWER7_PME_PM_GCT_NOSLOT_BR_MPRED 109 #define POWER7_PME_PM_1PLUS_PPC_CMPL 110 #define POWER7_PME_PM_PTEG_FROM_DMEM 111 #define POWER7_PME_PM_VSU_2FLOP 112 #define POWER7_PME_PM_GCT_FULL_CYC 113 #define POWER7_PME_PM_MRK_DATA_FROM_L3_CYC 114 #define POWER7_PME_PM_LSU_SRQ_S0_ALLOC 115 #define POWER7_PME_PM_MRK_DERAT_MISS_4K 116 #define POWER7_PME_PM_BR_MPRED_TA 117 #define POWER7_PME_PM_INST_PTEG_FROM_L2MISS 118 #define POWER7_PME_PM_DPU_HELD_POWER 119 #define POWER7_PME_PM_RUN_INST_CMPL 120 #define POWER7_PME_PM_MRK_VSU_FIN 121 #define POWER7_PME_PM_LSU_SRQ_S0_VALID 122 #define POWER7_PME_PM_GCT_EMPTY_CYC 123 #define POWER7_PME_PM_IOPS_DISP 124 #define POWER7_PME_PM_RUN_SPURR 125 #define POWER7_PME_PM_PTEG_FROM_L21_MOD 126 #define POWER7_PME_PM_VSU0_1FLOP 127 #define POWER7_PME_PM_SNOOP_TLBIE 128 #define POWER7_PME_PM_DATA_FROM_L3MISS 129 #define POWER7_PME_PM_VSU_SINGLE 130 #define POWER7_PME_PM_DTLB_MISS_16G 131 #define POWER7_PME_PM_CMPLU_STALL_VECTOR 132 #define POWER7_PME_PM_FLUSH 133 
#define POWER7_PME_PM_L2_LD_HIT 134 #define POWER7_PME_PM_NEST_2 135 #define POWER7_PME_PM_VSU1_1FLOP 136 #define POWER7_PME_PM_IC_PREF_REQ 137 #define POWER7_PME_PM_L3_LD_HIT 138 #define POWER7_PME_PM_GCT_NOSLOT_IC_MISS 139 #define POWER7_PME_PM_DISP_HELD 140 #define POWER7_PME_PM_L2_LD 141 #define POWER7_PME_PM_LSU_FLUSH_SRQ 142 #define POWER7_PME_PM_MRK_DATA_FROM_L31_MOD_CYC 143 #define POWER7_PME_PM_L2_RCST_BUSY_RC_FULL 144 #define POWER7_PME_PM_TB_BIT_TRANS 145 #define POWER7_PME_PM_THERMAL_MAX 146 #define POWER7_PME_PM_LSU1_FLUSH_ULD 147 #define POWER7_PME_PM_LSU1_REJECT_LHS 148 #define POWER7_PME_PM_LSU_LRQ_S0_ALLOC 149 #define POWER7_PME_PM_POWER_EVENT4 150 #define POWER7_PME_PM_DATA_FROM_L31_SHR 151 #define POWER7_PME_PM_BR_UNCOND 152 #define POWER7_PME_PM_LSU1_DC_PREF_STREAM_ALLOC 153 #define POWER7_PME_PM_PMC4_REWIND 154 #define POWER7_PME_PM_L2_RCLD_DISP 155 #define POWER7_PME_PM_THRD_PRIO_2_3_CYC 156 #define POWER7_PME_PM_MRK_PTEG_FROM_L2MISS 157 #define POWER7_PME_PM_IC_DEMAND_L2_BHT_REDIRECT 158 #define POWER7_PME_PM_LSU_DERAT_MISS 159 #define POWER7_PME_PM_IC_PREF_CANCEL_L2 160 #define POWER7_PME_PM_GCT_UTIL_7TO10_SLOT 161 #define POWER7_PME_PM_MRK_FIN_STALL_CYC_COUNT 162 #define POWER7_PME_PM_BR_PRED_CCACHE 163 #define POWER7_PME_PM_MRK_ST_CMPL_INT 164 #define POWER7_PME_PM_LSU_TWO_TABLEWALK_CYC 165 #define POWER7_PME_PM_MRK_DATA_FROM_L3MISS 166 #define POWER7_PME_PM_GCT_NOSLOT_CYC 167 #define POWER7_PME_PM_LSU_SET_MPRED 168 #define POWER7_PME_PM_FLUSH_DISP_TLBIE 169 #define POWER7_PME_PM_VSU1_FCONV 170 #define POWER7_PME_PM_NEST_1 171 #define POWER7_PME_PM_DERAT_MISS_16G 172 #define POWER7_PME_PM_INST_FROM_LMEM 173 #define POWER7_PME_PM_IC_DEMAND_L2_BR_REDIRECT 174 #define POWER7_PME_PM_CMPLU_STALL_SCALAR_LONG 175 #define POWER7_PME_PM_INST_PTEG_FROM_L2 176 #define POWER7_PME_PM_PTEG_FROM_L2 177 #define POWER7_PME_PM_MRK_DATA_FROM_L21_SHR_CYC 178 #define POWER7_PME_PM_MRK_DTLB_MISS_4K 179 #define POWER7_PME_PM_VSU0_FPSCR 180 #define 
POWER7_PME_PM_VSU1_VECT_DOUBLE_ISSUED 181 #define POWER7_PME_PM_MRK_PTEG_FROM_RL2L3_MOD 182 #define POWER7_PME_PM_L2_LD_MISS 183 #define POWER7_PME_PM_VMX_RESULT_SAT_1 184 #define POWER7_PME_PM_L1_PREF 185 #define POWER7_PME_PM_MRK_DATA_FROM_LMEM_CYC 186 #define POWER7_PME_PM_GRP_IC_MISS_NONSPEC 187 #define POWER7_PME_PM_SHL_MERGED 188 #define POWER7_PME_PM_DATA_FROM_L3 189 #define POWER7_PME_PM_LSU_FLUSH 190 #define POWER7_PME_PM_LSU_SRQ_SYNC_COUNT 191 #define POWER7_PME_PM_PMC2_OVERFLOW 192 #define POWER7_PME_PM_LSU_LDF 193 #define POWER7_PME_PM_POWER_EVENT3 194 #define POWER7_PME_PM_DISP_WT 195 #define POWER7_PME_PM_CMPLU_STALL_REJECT 196 #define POWER7_PME_PM_IC_BANK_CONFLICT 197 #define POWER7_PME_PM_BR_MPRED_CR_TA 198 #define POWER7_PME_PM_L2_INST_MISS 199 #define POWER7_PME_PM_CMPLU_STALL_ERAT_MISS 200 #define POWER7_PME_PM_MRK_LSU_FLUSH 201 #define POWER7_PME_PM_L2_LDST 202 #define POWER7_PME_PM_INST_FROM_L31_SHR 203 #define POWER7_PME_PM_VSU0_FIN 204 #define POWER7_PME_PM_LARX_LSU 205 #define POWER7_PME_PM_INST_FROM_RMEM 206 #define POWER7_PME_PM_DISP_CLB_HELD_TLBIE 207 #define POWER7_PME_PM_MRK_DATA_FROM_DMEM_CYC 208 #define POWER7_PME_PM_BR_PRED_CR 209 #define POWER7_PME_PM_LSU_REJECT 210 #define POWER7_PME_PM_CMPLU_STALL_END_GCT_NOSLOT 211 #define POWER7_PME_PM_LSU0_REJECT_LMQ_FULL 212 #define POWER7_PME_PM_VSU_FEST 213 #define POWER7_PME_PM_PTEG_FROM_L3 214 #define POWER7_PME_PM_POWER_EVENT2 215 #define POWER7_PME_PM_IC_PREF_CANCEL_PAGE 216 #define POWER7_PME_PM_VSU0_FSQRT_FDIV 217 #define POWER7_PME_PM_MRK_GRP_CMPL 218 #define POWER7_PME_PM_VSU0_SCAL_DOUBLE_ISSUED 219 #define POWER7_PME_PM_GRP_DISP 220 #define POWER7_PME_PM_LSU0_LDX 221 #define POWER7_PME_PM_DATA_FROM_L2 222 #define POWER7_PME_PM_MRK_DATA_FROM_RL2L3_MOD 223 #define POWER7_PME_PM_LD_REF_L1 224 #define POWER7_PME_PM_VSU0_VECT_DOUBLE_ISSUED 225 #define POWER7_PME_PM_VSU1_2FLOP_DOUBLE 226 #define POWER7_PME_PM_THRD_PRIO_6_7_CYC 227 #define POWER7_PME_PM_BR_MPRED_CR 228 #define 
POWER7_PME_PM_LD_MISS_L1 229 #define POWER7_PME_PM_DATA_FROM_RL2L3_MOD 230 #define POWER7_PME_PM_LSU_SRQ_FULL_CYC 231 #define POWER7_PME_PM_TABLEWALK_CYC 232 #define POWER7_PME_PM_MRK_PTEG_FROM_RMEM 233 #define POWER7_PME_PM_LSU_SRQ_STFWD 234 #define POWER7_PME_PM_INST_PTEG_FROM_RMEM 235 #define POWER7_PME_PM_FXU0_FIN 236 #define POWER7_PME_PM_PTEG_FROM_L31_MOD 237 #define POWER7_PME_PM_PMC5_OVERFLOW 238 #define POWER7_PME_PM_LD_REF_L1_LSU1 239 #define POWER7_PME_PM_INST_PTEG_FROM_L21_SHR 240 #define POWER7_PME_PM_CMPLU_STALL_THRD 241 #define POWER7_PME_PM_DATA_FROM_RMEM 242 #define POWER7_PME_PM_VSU0_SCAL_SINGLE_ISSUED 243 #define POWER7_PME_PM_BR_MPRED_LSTACK 244 #define POWER7_PME_PM_NEST_8 245 #define POWER7_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC 246 #define POWER7_PME_PM_LSU0_FLUSH_UST 247 #define POWER7_PME_PM_LSU_NCST 248 #define POWER7_PME_PM_BR_TAKEN 249 #define POWER7_PME_PM_INST_PTEG_FROM_LMEM 250 #define POWER7_PME_PM_GCT_NOSLOT_BR_MPRED_IC_MISS 251 #define POWER7_PME_PM_DTLB_MISS_4K 252 #define POWER7_PME_PM_PMC4_SAVED 253 #define POWER7_PME_PM_VSU1_PERMUTE_ISSUED 254 #define POWER7_PME_PM_SLB_MISS 255 #define POWER7_PME_PM_LSU1_FLUSH_LRQ 256 #define POWER7_PME_PM_DTLB_MISS 257 #define POWER7_PME_PM_VSU1_FRSP 258 #define POWER7_PME_PM_VSU_VECTOR_DOUBLE_ISSUED 259 #define POWER7_PME_PM_L2_CASTOUT_SHR 260 #define POWER7_PME_PM_NEST_7 261 #define POWER7_PME_PM_DATA_FROM_DL2L3_SHR 262 #define POWER7_PME_PM_VSU1_STF 263 #define POWER7_PME_PM_ST_FIN 264 #define POWER7_PME_PM_PTEG_FROM_L21_SHR 265 #define POWER7_PME_PM_L2_LOC_GUESS_WRONG 266 #define POWER7_PME_PM_MRK_STCX_FAIL 267 #define POWER7_PME_PM_LSU0_REJECT_LHS 268 #define POWER7_PME_PM_IC_PREF_CANCEL_HIT 269 #define POWER7_PME_PM_L3_PREF_BUSY 270 #define POWER7_PME_PM_MRK_BRU_FIN 271 #define POWER7_PME_PM_LSU1_NCLD 272 #define POWER7_PME_PM_INST_PTEG_FROM_L31_MOD 273 #define POWER7_PME_PM_LSU_NCLD 274 #define POWER7_PME_PM_LSU_LDX 275 #define POWER7_PME_PM_L2_LOC_GUESS_CORRECT 276 #define 
POWER7_PME_PM_THRESH_TIMEO 277 #define POWER7_PME_PM_L3_PREF_ST 278 #define POWER7_PME_PM_DISP_CLB_HELD_SYNC 279 #define POWER7_PME_PM_VSU_SIMPLE_ISSUED 280 #define POWER7_PME_PM_VSU1_SINGLE 281 #define POWER7_PME_PM_DATA_TABLEWALK_CYC 282 #define POWER7_PME_PM_L2_RC_ST_DONE 283 #define POWER7_PME_PM_MRK_PTEG_FROM_L21_MOD 284 #define POWER7_PME_PM_LARX_LSU1 285 #define POWER7_PME_PM_MRK_DATA_FROM_RMEM 286 #define POWER7_PME_PM_DISP_CLB_HELD 287 #define POWER7_PME_PM_DERAT_MISS_4K 288 #define POWER7_PME_PM_L2_RCLD_DISP_FAIL_ADDR 289 #define POWER7_PME_PM_SEG_EXCEPTION 290 #define POWER7_PME_PM_FLUSH_DISP_SB 291 #define POWER7_PME_PM_L2_DC_INV 292 #define POWER7_PME_PM_PTEG_FROM_DL2L3_MOD 293 #define POWER7_PME_PM_DSEG 294 #define POWER7_PME_PM_BR_PRED_LSTACK 295 #define POWER7_PME_PM_VSU0_STF 296 #define POWER7_PME_PM_LSU_FX_FIN 297 #define POWER7_PME_PM_DERAT_MISS_16M 298 #define POWER7_PME_PM_MRK_PTEG_FROM_DL2L3_MOD 299 #define POWER7_PME_PM_INST_FROM_L3 300 #define POWER7_PME_PM_MRK_IFU_FIN 301 #define POWER7_PME_PM_ITLB_MISS 302 #define POWER7_PME_PM_VSU_STF 303 #define POWER7_PME_PM_LSU_FLUSH_UST 304 #define POWER7_PME_PM_L2_LDST_MISS 305 #define POWER7_PME_PM_FXU1_FIN 306 #define POWER7_PME_PM_SHL_DEALLOCATED 307 #define POWER7_PME_PM_L2_SN_M_WR_DONE 308 #define POWER7_PME_PM_LSU_REJECT_SET_MPRED 309 #define POWER7_PME_PM_L3_PREF_LD 310 #define POWER7_PME_PM_L2_SN_M_RD_DONE 311 #define POWER7_PME_PM_MRK_DERAT_MISS_16G 312 #define POWER7_PME_PM_VSU_FCONV 313 #define POWER7_PME_PM_ANY_THRD_RUN_CYC 314 #define POWER7_PME_PM_LSU_LMQ_FULL_CYC 315 #define POWER7_PME_PM_MRK_LSU_REJECT_LHS 316 #define POWER7_PME_PM_MRK_LD_MISS_L1_CYC 317 #define POWER7_PME_PM_MRK_DATA_FROM_L2_CYC 318 #define POWER7_PME_PM_INST_IMC_MATCH_DISP 319 #define POWER7_PME_PM_MRK_DATA_FROM_RMEM_CYC 320 #define POWER7_PME_PM_VSU0_SIMPLE_ISSUED 321 #define POWER7_PME_PM_CMPLU_STALL_DIV 322 #define POWER7_PME_PM_MRK_PTEG_FROM_RL2L3_SHR 323 #define POWER7_PME_PM_VSU_FMA_DOUBLE 324 #define 
POWER7_PME_PM_VSU_4FLOP 325 #define POWER7_PME_PM_VSU1_FIN 326 #define POWER7_PME_PM_INST_PTEG_FROM_RL2L3_MOD 327 #define POWER7_PME_PM_RUN_CYC 328 #define POWER7_PME_PM_PTEG_FROM_RMEM 329 #define POWER7_PME_PM_LSU_LRQ_S0_VALID 330 #define POWER7_PME_PM_LSU0_LDF 331 #define POWER7_PME_PM_FLUSH_COMPLETION 332 #define POWER7_PME_PM_ST_MISS_L1 333 #define POWER7_PME_PM_L2_NODE_PUMP 334 #define POWER7_PME_PM_INST_FROM_DL2L3_SHR 335 #define POWER7_PME_PM_MRK_STALL_CMPLU_CYC 336 #define POWER7_PME_PM_VSU1_DENORM 337 #define POWER7_PME_PM_MRK_DATA_FROM_L31_SHR_CYC 338 #define POWER7_PME_PM_GCT_USAGE_1TO2_SLOT 339 #define POWER7_PME_PM_NEST_6 340 #define POWER7_PME_PM_INST_FROM_L3MISS 341 #define POWER7_PME_PM_EE_OFF_EXT_INT 342 #define POWER7_PME_PM_INST_PTEG_FROM_DMEM 343 #define POWER7_PME_PM_INST_FROM_DL2L3_MOD 344 #define POWER7_PME_PM_PMC6_OVERFLOW 345 #define POWER7_PME_PM_VSU_2FLOP_DOUBLE 346 #define POWER7_PME_PM_TLB_MISS 347 #define POWER7_PME_PM_FXU_BUSY 348 #define POWER7_PME_PM_L2_RCLD_DISP_FAIL_OTHER 349 #define POWER7_PME_PM_LSU_REJECT_LMQ_FULL 350 #define POWER7_PME_PM_IC_RELOAD_SHR 351 #define POWER7_PME_PM_GRP_MRK 352 #define POWER7_PME_PM_MRK_ST_NEST 353 #define POWER7_PME_PM_VSU1_FSQRT_FDIV 354 #define POWER7_PME_PM_LSU0_FLUSH_LRQ 355 #define POWER7_PME_PM_LARX_LSU0 356 #define POWER7_PME_PM_IBUF_FULL_CYC 357 #define POWER7_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC 358 #define POWER7_PME_PM_LSU_DC_PREF_STREAM_ALLOC 359 #define POWER7_PME_PM_GRP_MRK_CYC 360 #define POWER7_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC 361 #define POWER7_PME_PM_L2_GLOB_GUESS_CORRECT 362 #define POWER7_PME_PM_LSU_REJECT_LHS 363 #define POWER7_PME_PM_MRK_DATA_FROM_LMEM 364 #define POWER7_PME_PM_INST_PTEG_FROM_L3 365 #define POWER7_PME_PM_FREQ_DOWN 366 #define POWER7_PME_PM_INST_FROM_RL2L3_SHR 367 #define POWER7_PME_PM_MRK_INST_ISSUED 368 #define POWER7_PME_PM_PTEG_FROM_L3MISS 369 #define POWER7_PME_PM_RUN_PURR 370 #define POWER7_PME_PM_MRK_DATA_FROM_L3 371 #define 
POWER7_PME_PM_MRK_GRP_IC_MISS 372 #define POWER7_PME_PM_CMPLU_STALL_DCACHE_MISS 373 #define POWER7_PME_PM_PTEG_FROM_RL2L3_SHR 374 #define POWER7_PME_PM_LSU_FLUSH_LRQ 375 #define POWER7_PME_PM_MRK_DERAT_MISS_64K 376 #define POWER7_PME_PM_INST_PTEG_FROM_DL2L3_MOD 377 #define POWER7_PME_PM_L2_ST_MISS 378 #define POWER7_PME_PM_LWSYNC 379 #define POWER7_PME_PM_LSU0_DC_PREF_STREAM_CONFIRM_STRIDE 380 #define POWER7_PME_PM_MRK_PTEG_FROM_L21_SHR 381 #define POWER7_PME_PM_MRK_LSU_FLUSH_LRQ 382 #define POWER7_PME_PM_INST_IMC_MATCH_CMPL 383 #define POWER7_PME_PM_MRK_INST_FIN 384 #define POWER7_PME_PM_INST_FROM_L31_MOD 385 #define POWER7_PME_PM_MRK_DTLB_MISS_64K 386 #define POWER7_PME_PM_LSU_FIN 387 #define POWER7_PME_PM_MRK_LSU_REJECT 388 #define POWER7_PME_PM_L2_CO_FAIL_BUSY 389 #define POWER7_PME_PM_DATA_FROM_L31_MOD 390 #define POWER7_PME_PM_THERMAL_WARN 391 #define POWER7_PME_PM_VSU0_4FLOP 392 #define POWER7_PME_PM_BR_MPRED_CCACHE 393 #define POWER7_PME_PM_L1_DEMAND_WRITE 394 #define POWER7_PME_PM_FLUSH_BR_MPRED 395 #define POWER7_PME_PM_MRK_DTLB_MISS_16G 396 #define POWER7_PME_PM_MRK_PTEG_FROM_DMEM 397 #define POWER7_PME_PM_L2_RCST_DISP 398 #define POWER7_PME_PM_CMPLU_STALL 399 #define POWER7_PME_PM_LSU_PARTIAL_CDF 400 #define POWER7_PME_PM_DISP_CLB_HELD_SB 401 #define POWER7_PME_PM_VSU0_FMA_DOUBLE 402 #define POWER7_PME_PM_FXU0_BUSY_FXU1_IDLE 403 #define POWER7_PME_PM_IC_DEMAND_CYC 404 #define POWER7_PME_PM_MRK_DATA_FROM_L21_SHR 405 #define POWER7_PME_PM_MRK_LSU_FLUSH_UST 406 #define POWER7_PME_PM_INST_PTEG_FROM_L3MISS 407 #define POWER7_PME_PM_VSU_DENORM 408 #define POWER7_PME_PM_MRK_LSU_PARTIAL_CDF 409 #define POWER7_PME_PM_INST_FROM_L21_SHR 410 #define POWER7_PME_PM_IC_PREF_WRITE 411 #define POWER7_PME_PM_BR_PRED 412 #define POWER7_PME_PM_INST_FROM_DMEM 413 #define POWER7_PME_PM_IC_PREF_CANCEL_ALL 414 #define POWER7_PME_PM_LSU_DC_PREF_STREAM_CONFIRM 415 #define POWER7_PME_PM_MRK_LSU_FLUSH_SRQ 416 #define POWER7_PME_PM_MRK_FIN_STALL_CYC 417 #define 
POWER7_PME_PM_GCT_UTIL_11PLUS_SLOT 418 #define POWER7_PME_PM_L2_RCST_DISP_FAIL_OTHER 419 #define POWER7_PME_PM_VSU1_DD_ISSUED 420 #define POWER7_PME_PM_PTEG_FROM_L31_SHR 421 #define POWER7_PME_PM_DATA_FROM_L21_SHR 422 #define POWER7_PME_PM_LSU0_NCLD 423 #define POWER7_PME_PM_VSU1_4FLOP 424 #define POWER7_PME_PM_VSU1_8FLOP 425 #define POWER7_PME_PM_VSU_8FLOP 426 #define POWER7_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 427 #define POWER7_PME_PM_DTLB_MISS_64K 428 #define POWER7_PME_PM_THRD_CONC_RUN_INST 429 #define POWER7_PME_PM_MRK_PTEG_FROM_L2 430 #define POWER7_PME_PM_VSU_FIN 431 #define POWER7_PME_PM_MRK_DATA_FROM_L31_MOD 432 #define POWER7_PME_PM_THRD_PRIO_0_1_CYC 433 #define POWER7_PME_PM_DERAT_MISS_64K 434 #define POWER7_PME_PM_PMC2_REWIND 435 #define POWER7_PME_PM_INST_FROM_L2 436 #define POWER7_PME_PM_GRP_BR_MPRED_NONSPEC 437 #define POWER7_PME_PM_INST_DISP 438 #define POWER7_PME_PM_LSU0_DC_PREF_STREAM_CONFIRM 439 #define POWER7_PME_PM_L1_DCACHE_RELOAD_VALID 440 #define POWER7_PME_PM_VSU_SCALAR_DOUBLE_ISSUED 441 #define POWER7_PME_PM_L3_PREF_HIT 442 #define POWER7_PME_PM_MRK_PTEG_FROM_L31_MOD 443 #define POWER7_PME_PM_MRK_FXU_FIN 444 #define POWER7_PME_PM_PMC4_OVERFLOW 445 #define POWER7_PME_PM_MRK_PTEG_FROM_L3 446 #define POWER7_PME_PM_LSU0_LMQ_LHR_MERGE 447 #define POWER7_PME_PM_BTAC_HIT 448 #define POWER7_PME_PM_IERAT_XLATE_WR_16MPLUS 449 #define POWER7_PME_PM_L3_RD_BUSY 450 #define POWER7_PME_PM_INST_FROM_L2MISS 451 #define POWER7_PME_PM_LSU0_DC_PREF_STREAM_ALLOC 452 #define POWER7_PME_PM_L2_ST 453 #define POWER7_PME_PM_VSU0_DENORM 454 #define POWER7_PME_PM_MRK_DATA_FROM_DL2L3_SHR 455 #define POWER7_PME_PM_BR_PRED_CR_TA 456 #define POWER7_PME_PM_VSU0_FCONV 457 #define POWER7_PME_PM_MRK_LSU_FLUSH_ULD 458 #define POWER7_PME_PM_BTAC_MISS 459 #define POWER7_PME_PM_MRK_LD_MISS_EXPOSED_CYC_COUNT 460 #define POWER7_PME_PM_MRK_DATA_FROM_L2 461 #define POWER7_PME_PM_VSU_FMA 462 #define POWER7_PME_PM_LSU0_FLUSH_SRQ 463 #define POWER7_PME_PM_LSU1_L1_PREF 464 #define 
POWER7_PME_PM_IOPS_CMPL 465 #define POWER7_PME_PM_L2_SYS_PUMP 466 #define POWER7_PME_PM_L2_RCLD_BUSY_RC_FULL 467 #define POWER7_PME_PM_BCPLUS8_RSLV_TAKEN 468 #define POWER7_PME_PM_NEST_5 469 #define POWER7_PME_PM_LSU_LMQ_S0_ALLOC 470 #define POWER7_PME_PM_FLUSH_DISP_SYNC 471 #define POWER7_PME_PM_L2_IC_INV 472 #define POWER7_PME_PM_MRK_DATA_FROM_L21_MOD_CYC 473 #define POWER7_PME_PM_L3_PREF_LDST 474 #define POWER7_PME_PM_LSU_SRQ_EMPTY_CYC 475 #define POWER7_PME_PM_LSU_LMQ_S0_VALID 476 #define POWER7_PME_PM_FLUSH_PARTIAL 477 #define POWER7_PME_PM_VSU1_FMA_DOUBLE 478 #define POWER7_PME_PM_1PLUS_PPC_DISP 479 #define POWER7_PME_PM_DATA_FROM_L2MISS 480 #define POWER7_PME_PM_SUSPENDED 481 #define POWER7_PME_PM_VSU0_FMA 482 #define POWER7_PME_PM_CMPLU_STALL_SCALAR 483 #define POWER7_PME_PM_STCX_FAIL 484 #define POWER7_PME_PM_VSU0_FSQRT_FDIV_DOUBLE 485 #define POWER7_PME_PM_DC_PREF_DST 486 #define POWER7_PME_PM_VSU1_SCAL_SINGLE_ISSUED 487 #define POWER7_PME_PM_L3_HIT 488 #define POWER7_PME_PM_L2_GLOB_GUESS_WRONG 489 #define POWER7_PME_PM_MRK_DFU_FIN 490 #define POWER7_PME_PM_INST_FROM_L1 491 #define POWER7_PME_PM_BRU_FIN 492 #define POWER7_PME_PM_IC_DEMAND_REQ 493 #define POWER7_PME_PM_VSU1_FSQRT_FDIV_DOUBLE 494 #define POWER7_PME_PM_VSU1_FMA 495 #define POWER7_PME_PM_MRK_LD_MISS_L1 496 #define POWER7_PME_PM_VSU0_2FLOP_DOUBLE 497 #define POWER7_PME_PM_LSU_DC_PREF_STRIDED_STREAM_CONFIRM 498 #define POWER7_PME_PM_INST_PTEG_FROM_L31_SHR 499 #define POWER7_PME_PM_MRK_LSU_REJECT_ERAT_MISS 500 #define POWER7_PME_PM_MRK_DATA_FROM_L2MISS 501 #define POWER7_PME_PM_DATA_FROM_RL2L3_SHR 502 #define POWER7_PME_PM_INST_FROM_PREF 503 #define POWER7_PME_PM_VSU1_SQ 504 #define POWER7_PME_PM_L2_LD_DISP 505 #define POWER7_PME_PM_L2_DISP_ALL 506 #define POWER7_PME_PM_THRD_GRP_CMPL_BOTH_CYC 507 #define POWER7_PME_PM_VSU_FSQRT_FDIV_DOUBLE 508 #define POWER7_PME_PM_BR_MPRED 509 #define POWER7_PME_PM_VSU_1FLOP 510 #define POWER7_PME_PM_HV_CYC 511 #define POWER7_PME_PM_MRK_DATA_FROM_RL2L3_SHR 512 
#define POWER7_PME_PM_DTLB_MISS_16M 513 #define POWER7_PME_PM_MRK_LSU_FIN 514 #define POWER7_PME_PM_LSU1_LMQ_LHR_MERGE 515 #define POWER7_PME_PM_IFU_FIN 516 static const int power7_event_ids[][POWER7_NUM_EVENT_COUNTERS] = { [ POWER7_PME_PM_NEST_4 ] = { 213, 213, 208, 203, -1, -1 }, [ POWER7_PME_PM_IC_DEMAND_L2_BR_ALL ] = { 65, 62, 60, 60, -1, -1 }, [ POWER7_PME_PM_PMC2_SAVED ] = { 218, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_CMPLU_STALL_DFU ] = { -1, 18, -1, -1, -1, -1 }, [ POWER7_PME_PM_VSU0_16FLOP ] = { 269, 267, 261, 255, -1, -1 }, [ POWER7_PME_PM_NEST_3 ] = { 212, 212, 207, 202, -1, -1 }, [ POWER7_PME_PM_MRK_LSU_DERAT_MISS ] = { -1, -1, 188, -1, -1, -1 }, [ POWER7_PME_PM_MRK_ST_CMPL ] = { 208, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_L2_ST_DISP ] = { -1, -1, -1, 95, -1, -1 }, [ POWER7_PME_PM_L2_CASTOUT_MOD ] = { 99, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_ISEG ] = { 95, 89, 88, 85, -1, -1 }, [ POWER7_PME_PM_MRK_INST_TIMEO ] = { -1, -1, -1, 184, -1, -1 }, [ POWER7_PME_PM_L2_RCST_DISP_FAIL_ADDR ] = { -1, -1, 100, -1, -1, -1 }, [ POWER7_PME_PM_LSU1_DC_PREF_STREAM_CONFIRM ] = { 167, 161, 161, 154, -1, -1 }, [ POWER7_PME_PM_IERAT_WR_64K ] = { 78, 74, 72, 72, -1, -1 }, [ POWER7_PME_PM_MRK_DTLB_MISS_16M ] = { -1, -1, -1, 181, -1, -1 }, [ POWER7_PME_PM_IERAT_MISS ] = { 77, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_PTEG_FROM_LMEM ] = { -1, -1, -1, 198, -1, -1 }, [ POWER7_PME_PM_FLOP ] = { 42, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_THRD_PRIO_4_5_CYC ] = { 244, 242, 237, 231, -1, -1 }, [ POWER7_PME_PM_BR_PRED_TA ] = { 14, 12, 13, 14, -1, -1 }, [ POWER7_PME_PM_CMPLU_STALL_FXU ] = { -1, 19, -1, -1, -1, -1 }, [ POWER7_PME_PM_EXT_INT ] = { -1, 43, -1, -1, -1, -1 }, [ POWER7_PME_PM_VSU_FSQRT_FDIV ] = { 260, 258, 252, 246, -1, -1 }, [ POWER7_PME_PM_MRK_LD_MISS_EXPOSED_CYC ] = { 196, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU1_LDF ] = { 174, 168, 168, 161, -1, -1 }, [ POWER7_PME_PM_IC_WRITE_ALL ] = { 76, 73, 71, 71, -1, -1 }, [ POWER7_PME_PM_LSU0_SRQ_STFWD ] = { 165, 159, 159, 152, -1, 
-1 }, [ POWER7_PME_PM_PTEG_FROM_RL2L3_MOD ] = { 225, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_L31_SHR ] = { 188, 184, -1, -1, -1, -1 }, [ POWER7_PME_PM_DATA_FROM_L21_MOD ] = { -1, -1, 20, -1, -1, -1 }, [ POWER7_PME_PM_VSU1_SCAL_DOUBLE_ISSUED ] = { 310, 308, 302, 296, -1, -1 }, [ POWER7_PME_PM_VSU0_8FLOP ] = { 274, 272, 266, 260, -1, -1 }, [ POWER7_PME_PM_POWER_EVENT1 ] = { 222, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_DISP_CLB_HELD_BAL ] = { 32, 33, 29, 31, -1, -1 }, [ POWER7_PME_PM_VSU1_2FLOP ] = { 294, 292, 286, 280, -1, -1 }, [ POWER7_PME_PM_LWSYNC_HELD ] = { 182, 176, 176, 169, -1, -1 }, [ POWER7_PME_PM_INST_FROM_L21_MOD ] = { -1, -1, 79, -1, -1, -1 }, [ POWER7_PME_PM_IC_REQ_ALL ] = { 75, 72, 70, 70, -1, -1 }, [ POWER7_PME_PM_DSLB_MISS ] = { 39, 40, 37, 37, -1, -1 }, [ POWER7_PME_PM_L3_MISS ] = { 110, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU0_L1_PREF ] = { 158, 152, 152, 145, -1, -1 }, [ POWER7_PME_PM_VSU_SCALAR_SINGLE_ISSUED ] = { 263, 261, 255, 249, -1, -1 }, [ POWER7_PME_PM_LSU1_DC_PREF_STREAM_CONFIRM_STRIDE ] = { 168, 162, 162, 155, -1, -1 }, [ POWER7_PME_PM_L2_INST ] = { -1, -1, 93, -1, -1, -1 }, [ POWER7_PME_PM_VSU0_FRSP ] = { 283, 281, 275, 269, -1, -1 }, [ POWER7_PME_PM_FLUSH_DISP ] = { 44, 45, 43, 42, -1, -1 }, [ POWER7_PME_PM_PTEG_FROM_L2MISS ] = { -1, -1, -1, 212, -1, -1 }, [ POWER7_PME_PM_VSU1_DQ_ISSUED ] = { 300, 298, 292, 286, -1, -1 }, [ POWER7_PME_PM_CMPLU_STALL_LSU ] = { -1, 20, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_DMEM ] = { 184, 179, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU_FLUSH_ULD ] = { 126, 121, 122, 115, -1, -1 }, [ POWER7_PME_PM_PTEG_FROM_LMEM ] = { -1, -1, -1, 213, -1, -1 }, [ POWER7_PME_PM_MRK_DERAT_MISS_16M ] = { -1, -1, 184, -1, -1, -1 }, [ POWER7_PME_PM_THRD_ALL_RUN_CYC ] = { -1, 239, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_STALL_CMPLU_CYC_COUNT ] = { -1, -1, 202, -1, -1, -1 }, [ POWER7_PME_PM_DATA_FROM_DL2L3_MOD ] = { -1, -1, 18, 24, -1, -1 }, [ POWER7_PME_PM_VSU_FRSP ] = { 259, 257, 251, 245, -1, -1 }, [ 
POWER7_PME_PM_MRK_DATA_FROM_L21_MOD ] = { -1, -1, 180, -1, -1, -1 }, [ POWER7_PME_PM_PMC1_OVERFLOW ] = { -1, 218, -1, -1, -1, -1 }, [ POWER7_PME_PM_VSU0_SINGLE ] = { 289, 287, 281, 275, -1, -1 }, [ POWER7_PME_PM_MRK_PTEG_FROM_L3MISS ] = { -1, 206, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_PTEG_FROM_L31_SHR ] = { -1, 205, -1, -1, -1, -1 }, [ POWER7_PME_PM_VSU0_VECTOR_SP_ISSUED ] = { 292, 290, 284, 278, -1, -1 }, [ POWER7_PME_PM_VSU1_FEST ] = { 302, 300, 294, 288, -1, -1 }, [ POWER7_PME_PM_MRK_INST_DISP ] = { -1, 194, -1, -1, -1, -1 }, [ POWER7_PME_PM_VSU0_COMPLEX_ISSUED ] = { 275, 273, 267, 261, -1, -1 }, [ POWER7_PME_PM_LSU1_FLUSH_UST ] = { 172, 166, 166, 159, -1, -1 }, [ POWER7_PME_PM_INST_CMPL ] = { 80, 76, 74, 75, -1, -1 }, [ POWER7_PME_PM_FXU_IDLE ] = { 49, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU0_FLUSH_ULD ] = { 156, 150, 150, 143, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_DL2L3_MOD ] = { -1, -1, 178, 170, -1, -1 }, [ POWER7_PME_PM_LSU_LMQ_SRQ_EMPTY_ALL_CYC ] = { -1, -1, 129, -1, -1, -1 }, [ POWER7_PME_PM_LSU1_REJECT_LMQ_FULL ] = { 179, 173, 173, 166, -1, -1 }, [ POWER7_PME_PM_INST_PTEG_FROM_L21_MOD ] = { -1, -1, 84, -1, -1, -1 }, [ POWER7_PME_PM_GCT_UTIL_3TO6_SLOT ] = { 55, 56, 53, 55, -1, -1 }, [ POWER7_PME_PM_INST_FROM_RL2L3_MOD ] = { 88, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_SHL_CREATED ] = { 228, 227, 222, 217, -1, -1 }, [ POWER7_PME_PM_L2_ST_HIT ] = { -1, -1, -1, 96, -1, -1 }, [ POWER7_PME_PM_DATA_FROM_DMEM ] = { 22, 24, -1, -1, -1, -1 }, [ POWER7_PME_PM_L3_LD_MISS ] = { -1, 104, -1, -1, -1, -1 }, [ POWER7_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { -1, -1, -1, 48, -1, -1 }, [ POWER7_PME_PM_DISP_CLB_HELD_RES ] = { 33, 34, 30, 32, -1, -1 }, [ POWER7_PME_PM_L2_SN_SX_I_DONE ] = { -1, -1, 101, -1, -1, -1 }, [ POWER7_PME_PM_GRP_CMPL ] = { -1, -1, 55, -1, -1, -1 }, [ POWER7_PME_PM_BCPLUS8_CONV ] = { 2, 0, 1, 1, -1, -1 }, [ POWER7_PME_PM_STCX_CMPL ] = { 234, 234, 229, 223, -1, -1 }, [ POWER7_PME_PM_VSU0_2FLOP ] = { 271, 269, 263, 257, -1, -1 }, [ POWER7_PME_PM_L3_PREF_MISS ] 
= { -1, -1, 106, -1, -1, -1 }, [ POWER7_PME_PM_LSU_SRQ_SYNC_CYC ] = { 149, 143, 143, 136, -1, -1 }, [ POWER7_PME_PM_LSU_REJECT_ERAT_MISS ] = { -1, 134, -1, -1, -1, -1 }, [ POWER7_PME_PM_L1_ICACHE_MISS ] = { -1, 92, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU1_FLUSH_SRQ ] = { 170, 164, 164, 157, -1, -1 }, [ POWER7_PME_PM_LD_REF_L1_LSU0 ] = { 118, 112, 112, 107, -1, -1 }, [ POWER7_PME_PM_VSU0_FEST ] = { 278, 276, 270, 264, -1, -1 }, [ POWER7_PME_PM_VSU_VECTOR_SINGLE_ISSUED ] = { 268, 266, 260, 254, -1, -1 }, [ POWER7_PME_PM_FREQ_UP ] = { -1, -1, -1, 47, -1, -1 }, [ POWER7_PME_PM_DATA_FROM_LMEM ] = { -1, -1, 23, 27, -1, -1 }, [ POWER7_PME_PM_LSU1_LDX ] = { 175, 169, 169, 162, -1, -1 }, [ POWER7_PME_PM_PMC3_OVERFLOW ] = { -1, -1, -1, 208, -1, -1 }, [ POWER7_PME_PM_MRK_BR_MPRED ] = { -1, -1, 177, -1, -1, -1 }, [ POWER7_PME_PM_SHL_MATCH ] = { 230, 229, 224, 219, -1, -1 }, [ POWER7_PME_PM_MRK_BR_TAKEN ] = { 183, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_ISLB_MISS ] = { 96, 90, 89, 86, -1, -1 }, [ POWER7_PME_PM_CYC ] = { 21, 23, 17, 23, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_DRL2L3_MOD_CYC ] = { -1, -1, -1, 171, -1, -1 }, [ POWER7_PME_PM_DISP_HELD_THERMAL ] = { -1, -1, 34, -1, -1, -1 }, [ POWER7_PME_PM_INST_PTEG_FROM_RL2L3_SHR ] = { -1, 88, 85, -1, -1, -1 }, [ POWER7_PME_PM_LSU1_SRQ_STFWD ] = { 180, 174, 174, 167, -1, -1 }, [ POWER7_PME_PM_GCT_NOSLOT_BR_MPRED ] = { -1, -1, -1, 51, -1, -1 }, [ POWER7_PME_PM_1PLUS_PPC_CMPL ] = { 0, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_PTEG_FROM_DMEM ] = { -1, 220, -1, -1, -1, -1 }, [ POWER7_PME_PM_VSU_2FLOP ] = { 249, 247, 241, 235, -1, -1 }, [ POWER7_PME_PM_GCT_FULL_CYC ] = { 51, 52, 50, 50, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_L3_CYC ] = { -1, -1, -1, 175, -1, -1 }, [ POWER7_PME_PM_LSU_SRQ_S0_ALLOC ] = { 145, 139, 139, 132, -1, -1 }, [ POWER7_PME_PM_MRK_DERAT_MISS_4K ] = { 191, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_BR_MPRED_TA ] = { 8, 6, 7, 8, -1, -1 }, [ POWER7_PME_PM_INST_PTEG_FROM_L2MISS ] = { -1, -1, -1, 83, -1, -1 }, [ 
POWER7_PME_PM_DPU_HELD_POWER ] = { -1, 38, -1, -1, -1, -1 }, [ POWER7_PME_PM_RUN_INST_CMPL ] = { -1, -1, -1, 214, 0, -1 }, [ POWER7_PME_PM_MRK_VSU_FIN ] = { -1, -1, 204, -1, -1, -1 }, [ POWER7_PME_PM_LSU_SRQ_S0_VALID ] = { 146, 140, 140, 133, -1, -1 }, [ POWER7_PME_PM_GCT_EMPTY_CYC ] = { -1, 51, -1, -1, -1, -1 }, [ POWER7_PME_PM_IOPS_DISP ] = { -1, -1, 87, -1, -1, -1 }, [ POWER7_PME_PM_RUN_SPURR ] = { 226, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_PTEG_FROM_L21_MOD ] = { -1, -1, 218, -1, -1, -1 }, [ POWER7_PME_PM_VSU0_1FLOP ] = { 270, 268, 262, 256, -1, -1 }, [ POWER7_PME_PM_SNOOP_TLBIE ] = { 233, 232, 227, 222, -1, -1 }, [ POWER7_PME_PM_DATA_FROM_L3MISS ] = { -1, 28, 22, -1, -1, -1 }, [ POWER7_PME_PM_VSU_SINGLE ] = { 265, 263, 257, 251, -1, -1 }, [ POWER7_PME_PM_DTLB_MISS_16G ] = { 40, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_CMPLU_STALL_VECTOR ] = { -1, 22, -1, -1, -1, -1 }, [ POWER7_PME_PM_FLUSH ] = { -1, -1, -1, 40, -1, -1 }, [ POWER7_PME_PM_L2_LD_HIT ] = { -1, -1, 96, -1, -1, -1 }, [ POWER7_PME_PM_NEST_2 ] = { 211, 211, 206, 201, -1, -1 }, [ POWER7_PME_PM_VSU1_1FLOP ] = { 293, 291, 285, 279, -1, -1 }, [ POWER7_PME_PM_IC_PREF_REQ ] = { 72, 69, 67, 67, -1, -1 }, [ POWER7_PME_PM_L3_LD_HIT ] = { -1, 103, -1, -1, -1, -1 }, [ POWER7_PME_PM_GCT_NOSLOT_IC_MISS ] = { -1, 53, -1, -1, -1, -1 }, [ POWER7_PME_PM_DISP_HELD ] = { 37, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_L2_LD ] = { 103, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU_FLUSH_SRQ ] = { 125, 120, 121, 114, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_L31_MOD_CYC ] = { -1, -1, -1, 176, -1, -1 }, [ POWER7_PME_PM_L2_RCST_BUSY_RC_FULL ] = { -1, 101, -1, -1, -1, -1 }, [ POWER7_PME_PM_TB_BIT_TRANS ] = { -1, -1, 232, -1, -1, -1 }, [ POWER7_PME_PM_THERMAL_MAX ] = { -1, -1, -1, 226, -1, -1 }, [ POWER7_PME_PM_LSU1_FLUSH_ULD ] = { 171, 165, 165, 158, -1, -1 }, [ POWER7_PME_PM_LSU1_REJECT_LHS ] = { 178, 172, 172, 165, -1, -1 }, [ POWER7_PME_PM_LSU_LRQ_S0_ALLOC ] = { 134, 129, 130, 122, -1, -1 }, [ POWER7_PME_PM_POWER_EVENT4 ] = { -1, -1, 
-1, 209, -1, -1 }, [ POWER7_PME_PM_DATA_FROM_L31_SHR ] = { 26, 27, -1, -1, -1, -1 }, [ POWER7_PME_PM_BR_UNCOND ] = { 15, 14, 14, 15, -1, -1 }, [ POWER7_PME_PM_LSU1_DC_PREF_STREAM_ALLOC ] = { 166, 160, 160, 153, -1, -1 }, [ POWER7_PME_PM_PMC4_REWIND ] = { 220, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_L2_RCLD_DISP ] = { 106, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_THRD_PRIO_2_3_CYC ] = { 243, 241, 236, 230, -1, -1 }, [ POWER7_PME_PM_MRK_PTEG_FROM_L2MISS ] = { -1, -1, -1, 197, -1, -1 }, [ POWER7_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { 64, 61, 59, 59, -1, -1 }, [ POWER7_PME_PM_LSU_DERAT_MISS ] = { -1, 117, 117, -1, -1, -1 }, [ POWER7_PME_PM_IC_PREF_CANCEL_L2 ] = { 70, 67, 65, 65, -1, -1 }, [ POWER7_PME_PM_GCT_UTIL_7TO10_SLOT ] = { 56, 57, 54, 56, -1, -1 }, [ POWER7_PME_PM_MRK_FIN_STALL_CYC_COUNT ] = { 194, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_BR_PRED_CCACHE ] = { 10, 8, 9, 10, -1, -1 }, [ POWER7_PME_PM_MRK_ST_CMPL_INT ] = { -1, -1, 200, -1, -1, -1 }, [ POWER7_PME_PM_LSU_TWO_TABLEWALK_CYC ] = { 150, 144, 144, 137, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_L3MISS ] = { -1, 186, -1, -1, -1, -1 }, [ POWER7_PME_PM_GCT_NOSLOT_CYC ] = { 52, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU_SET_MPRED ] = { 143, 138, 138, 130, -1, -1 }, [ POWER7_PME_PM_FLUSH_DISP_TLBIE ] = { 47, 48, 46, 45, -1, -1 }, [ POWER7_PME_PM_VSU1_FCONV ] = { 301, 299, 293, 287, -1, -1 }, [ POWER7_PME_PM_NEST_1 ] = { 210, 210, 205, 200, -1, -1 }, [ POWER7_PME_PM_DERAT_MISS_16G ] = { -1, -1, -1, 29, -1, -1 }, [ POWER7_PME_PM_INST_FROM_LMEM ] = { -1, -1, 81, 80, -1, -1 }, [ POWER7_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { 66, 63, 61, 61, -1, -1 }, [ POWER7_PME_PM_CMPLU_STALL_SCALAR_LONG ] = { -1, 21, -1, -1, -1, -1 }, [ POWER7_PME_PM_INST_PTEG_FROM_L2 ] = { 91, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_PTEG_FROM_L2 ] = { 223, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_L21_SHR_CYC ] = { -1, 182, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_DTLB_MISS_4K ] = { -1, 192, -1, -1, -1, -1 }, [ POWER7_PME_PM_VSU0_FPSCR ] = { 282, 280, 
274, 268, -1, -1 }, [ POWER7_PME_PM_VSU1_VECT_DOUBLE_ISSUED ] = { 315, 313, 307, 301, -1, -1 }, [ POWER7_PME_PM_MRK_PTEG_FROM_RL2L3_MOD ] = { 207, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_L2_LD_MISS ] = { -1, 97, -1, -1, -1, -1 }, [ POWER7_PME_PM_VMX_RESULT_SAT_1 ] = { 247, 245, 239, 233, -1, -1 }, [ POWER7_PME_PM_L1_PREF ] = { 98, 93, 92, 89, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { -1, 187, -1, -1, -1, -1 }, [ POWER7_PME_PM_GRP_IC_MISS_NONSPEC ] = { 58, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_SHL_MERGED ] = { 231, 230, 225, 220, -1, -1 }, [ POWER7_PME_PM_DATA_FROM_L3 ] = { 24, 26, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU_FLUSH ] = { 123, 118, 119, 112, -1, -1 }, [ POWER7_PME_PM_LSU_SRQ_SYNC_COUNT ] = { 148, 142, 142, 135, -1, -1 }, [ POWER7_PME_PM_PMC2_OVERFLOW ] = { -1, -1, 213, -1, -1, -1 }, [ POWER7_PME_PM_LSU_LDF ] = { 129, 123, 124, 117, -1, -1 }, [ POWER7_PME_PM_POWER_EVENT3 ] = { -1, -1, 217, -1, -1, -1 }, [ POWER7_PME_PM_DISP_WT ] = { -1, -1, 35, -1, -1, -1 }, [ POWER7_PME_PM_CMPLU_STALL_REJECT ] = { -1, -1, -1, 21, -1, -1 }, [ POWER7_PME_PM_IC_BANK_CONFLICT ] = { 62, 60, 58, 58, -1, -1 }, [ POWER7_PME_PM_BR_MPRED_CR_TA ] = { 6, 4, 5, 6, -1, -1 }, [ POWER7_PME_PM_L2_INST_MISS ] = { -1, -1, 94, -1, -1, -1 }, [ POWER7_PME_PM_CMPLU_STALL_ERAT_MISS ] = { -1, -1, -1, 20, -1, -1 }, [ POWER7_PME_PM_MRK_LSU_FLUSH ] = { 198, 196, 189, 187, -1, -1 }, [ POWER7_PME_PM_L2_LDST ] = { 104, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_INST_FROM_L31_SHR ] = { 86, 81, -1, -1, -1, -1 }, [ POWER7_PME_PM_VSU0_FIN ] = { 279, 277, 271, 265, -1, -1 }, [ POWER7_PME_PM_LARX_LSU ] = { 114, 108, 108, 102, -1, -1 }, [ POWER7_PME_PM_INST_FROM_RMEM ] = { -1, -1, 82, -1, -1, -1 }, [ POWER7_PME_PM_DISP_CLB_HELD_TLBIE ] = { 36, 37, 33, 35, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_DMEM_CYC ] = { -1, 180, -1, -1, -1, -1 }, [ POWER7_PME_PM_BR_PRED_CR ] = { 11, 9, 10, 11, -1, -1 }, [ POWER7_PME_PM_LSU_REJECT ] = { 139, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_CMPLU_STALL_END_GCT_NOSLOT ] = { 19, 
-1, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU0_REJECT_LMQ_FULL ] = { 164, 158, 158, 151, -1, -1 }, [ POWER7_PME_PM_VSU_FEST ] = { 255, 253, 247, 241, -1, -1 }, [ POWER7_PME_PM_PTEG_FROM_L3 ] = { -1, 221, -1, -1, -1, -1 }, [ POWER7_PME_PM_POWER_EVENT2 ] = { -1, 219, -1, -1, -1, -1 }, [ POWER7_PME_PM_IC_PREF_CANCEL_PAGE ] = { 71, 68, 66, 66, -1, -1 }, [ POWER7_PME_PM_VSU0_FSQRT_FDIV ] = { 284, 282, 276, 270, -1, -1 }, [ POWER7_PME_PM_MRK_GRP_CMPL ] = { -1, -1, -1, 182, -1, -1 }, [ POWER7_PME_PM_VSU0_SCAL_DOUBLE_ISSUED ] = { 286, 284, 278, 272, -1, -1 }, [ POWER7_PME_PM_GRP_DISP ] = { -1, -1, 56, -1, -1, -1 }, [ POWER7_PME_PM_LSU0_LDX ] = { 160, 154, 154, 147, -1, -1 }, [ POWER7_PME_PM_DATA_FROM_L2 ] = { 23, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_RL2L3_MOD ] = { 189, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_LD_REF_L1 ] = { 117, 111, 111, 106, -1, -1 }, [ POWER7_PME_PM_VSU0_VECT_DOUBLE_ISSUED ] = { 291, 289, 283, 277, -1, -1 }, [ POWER7_PME_PM_VSU1_2FLOP_DOUBLE ] = { 295, 293, 287, 281, -1, -1 }, [ POWER7_PME_PM_THRD_PRIO_6_7_CYC ] = { 245, 243, 238, 232, -1, -1 }, [ POWER7_PME_PM_BR_MPRED_CR ] = { 5, 3, 4, 5, -1, -1 }, [ POWER7_PME_PM_LD_MISS_L1 ] = { -1, -1, -1, 105, -1, -1 }, [ POWER7_PME_PM_DATA_FROM_RL2L3_MOD ] = { 27, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU_SRQ_FULL_CYC ] = { 144, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_TABLEWALK_CYC ] = { 237, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_PTEG_FROM_RMEM ] = { -1, -1, 199, -1, -1, -1 }, [ POWER7_PME_PM_LSU_SRQ_STFWD ] = { 147, 141, 141, 134, -1, -1 }, [ POWER7_PME_PM_INST_PTEG_FROM_RMEM ] = { -1, -1, 86, -1, -1, -1 }, [ POWER7_PME_PM_FXU0_FIN ] = { 50, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_PTEG_FROM_L31_MOD ] = { 224, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_PMC5_OVERFLOW ] = { 221, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_LD_REF_L1_LSU1 ] = { 119, 113, 113, 108, -1, -1 }, [ POWER7_PME_PM_INST_PTEG_FROM_L21_SHR ] = { -1, -1, -1, 82, -1, -1 }, [ POWER7_PME_PM_CMPLU_STALL_THRD ] = { 20, -1, -1, -1, -1, -1 }, [ 
POWER7_PME_PM_DATA_FROM_RMEM ] = { -1, -1, 24, -1, -1, -1 }, [ POWER7_PME_PM_VSU0_SCAL_SINGLE_ISSUED ] = { 287, 285, 279, 273, -1, -1 }, [ POWER7_PME_PM_BR_MPRED_LSTACK ] = { 7, 5, 6, 7, -1, -1 }, [ POWER7_PME_PM_NEST_8 ] = { 217, 217, 212, 207, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC ] = { -1, -1, -1, 178, -1, -1 }, [ POWER7_PME_PM_LSU0_FLUSH_UST ] = { 157, 151, 151, 144, -1, -1 }, [ POWER7_PME_PM_LSU_NCST ] = { 137, 132, 133, 125, -1, -1 }, [ POWER7_PME_PM_BR_TAKEN ] = { -1, 13, -1, -1, -1, -1 }, [ POWER7_PME_PM_INST_PTEG_FROM_LMEM ] = { -1, -1, -1, 84, -1, -1 }, [ POWER7_PME_PM_GCT_NOSLOT_BR_MPRED_IC_MISS ] = { -1, -1, -1, 52, -1, -1 }, [ POWER7_PME_PM_DTLB_MISS_4K ] = { -1, 41, -1, -1, -1, -1 }, [ POWER7_PME_PM_PMC4_SAVED ] = { -1, -1, 215, -1, -1, -1 }, [ POWER7_PME_PM_VSU1_PERMUTE_ISSUED ] = { 309, 307, 301, 295, -1, -1 }, [ POWER7_PME_PM_SLB_MISS ] = { 232, 231, 226, 221, -1, -1 }, [ POWER7_PME_PM_LSU1_FLUSH_LRQ ] = { 169, 163, 163, 156, -1, -1 }, [ POWER7_PME_PM_DTLB_MISS ] = { -1, -1, 38, -1, -1, -1 }, [ POWER7_PME_PM_VSU1_FRSP ] = { 306, 304, 298, 292, -1, -1 }, [ POWER7_PME_PM_VSU_VECTOR_DOUBLE_ISSUED ] = { 267, 265, 259, 253, -1, -1 }, [ POWER7_PME_PM_L2_CASTOUT_SHR ] = { 100, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_NEST_7 ] = { 216, 216, 211, 206, -1, -1 }, [ POWER7_PME_PM_DATA_FROM_DL2L3_SHR ] = { -1, -1, 19, -1, -1, -1 }, [ POWER7_PME_PM_VSU1_STF ] = { 314, 312, 306, 300, -1, -1 }, [ POWER7_PME_PM_ST_FIN ] = { -1, 233, -1, -1, -1, -1 }, [ POWER7_PME_PM_PTEG_FROM_L21_SHR ] = { -1, -1, -1, 211, -1, -1 }, [ POWER7_PME_PM_L2_LOC_GUESS_WRONG ] = { -1, 99, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_STCX_FAIL ] = { 209, 209, 203, 199, -1, -1 }, [ POWER7_PME_PM_LSU0_REJECT_LHS ] = { 163, 157, 157, 150, -1, -1 }, [ POWER7_PME_PM_IC_PREF_CANCEL_HIT ] = { 69, 66, 64, 64, -1, -1 }, [ POWER7_PME_PM_L3_PREF_BUSY ] = { -1, -1, -1, 97, -1, -1 }, [ POWER7_PME_PM_MRK_BRU_FIN ] = { -1, 177, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU1_NCLD ] = { 177, 171, 171, 164, 
-1, -1 }, [ POWER7_PME_PM_INST_PTEG_FROM_L31_MOD ] = { 92, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU_NCLD ] = { 136, 131, 132, 124, -1, -1 }, [ POWER7_PME_PM_LSU_LDX ] = { 130, 124, 125, 118, -1, -1 }, [ POWER7_PME_PM_L2_LOC_GUESS_CORRECT ] = { 105, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_THRESH_TIMEO ] = { 246, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_L3_PREF_ST ] = { 113, 107, 107, 100, -1, -1 }, [ POWER7_PME_PM_DISP_CLB_HELD_SYNC ] = { 35, 36, 32, 34, -1, -1 }, [ POWER7_PME_PM_VSU_SIMPLE_ISSUED ] = { 264, 262, 256, 250, -1, -1 }, [ POWER7_PME_PM_VSU1_SINGLE ] = { 312, 310, 304, 298, -1, -1 }, [ POWER7_PME_PM_DATA_TABLEWALK_CYC ] = { -1, -1, 25, -1, -1, -1 }, [ POWER7_PME_PM_L2_RC_ST_DONE ] = { -1, -1, 98, -1, -1, -1 }, [ POWER7_PME_PM_MRK_PTEG_FROM_L21_MOD ] = { -1, -1, 197, -1, -1, -1 }, [ POWER7_PME_PM_LARX_LSU1 ] = { 116, 110, 110, 104, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_RMEM ] = { -1, -1, 183, -1, -1, -1 }, [ POWER7_PME_PM_DISP_CLB_HELD ] = { 31, 32, 28, 30, -1, -1 }, [ POWER7_PME_PM_DERAT_MISS_4K ] = { 30, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_L2_RCLD_DISP_FAIL_ADDR ] = { 107, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_SEG_EXCEPTION ] = { 227, 226, 221, 216, -1, -1 }, [ POWER7_PME_PM_FLUSH_DISP_SB ] = { 45, 46, 44, 43, -1, -1 }, [ POWER7_PME_PM_L2_DC_INV ] = { -1, 94, -1, -1, -1, -1 }, [ POWER7_PME_PM_PTEG_FROM_DL2L3_MOD ] = { -1, -1, -1, 210, -1, -1 }, [ POWER7_PME_PM_DSEG ] = { 38, 39, 36, 36, -1, -1 }, [ POWER7_PME_PM_BR_PRED_LSTACK ] = { 13, 11, 12, 13, -1, -1 }, [ POWER7_PME_PM_VSU0_STF ] = { 290, 288, 282, 276, -1, -1 }, [ POWER7_PME_PM_LSU_FX_FIN ] = { 128, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_DERAT_MISS_16M ] = { -1, -1, 27, -1, -1, -1 }, [ POWER7_PME_PM_MRK_PTEG_FROM_DL2L3_MOD ] = { -1, -1, -1, 195, -1, -1 }, [ POWER7_PME_PM_INST_FROM_L3 ] = { 84, 80, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_IFU_FIN ] = { -1, -1, 186, -1, -1, -1 }, [ POWER7_PME_PM_ITLB_MISS ] = { -1, -1, -1, 87, -1, -1 }, [ POWER7_PME_PM_VSU_STF ] = { 266, 264, 258, 252, -1, -1 }, [ 
POWER7_PME_PM_LSU_FLUSH_UST ] = { 127, 122, 123, 116, -1, -1 }, [ POWER7_PME_PM_L2_LDST_MISS ] = { -1, 98, -1, -1, -1, -1 }, [ POWER7_PME_PM_FXU1_FIN ] = { -1, -1, -1, 49, -1, -1 }, [ POWER7_PME_PM_SHL_DEALLOCATED ] = { 229, 228, 223, 218, -1, -1 }, [ POWER7_PME_PM_L2_SN_M_WR_DONE ] = { -1, -1, -1, 94, -1, -1 }, [ POWER7_PME_PM_LSU_REJECT_SET_MPRED ] = { 142, 137, 137, 129, -1, -1 }, [ POWER7_PME_PM_L3_PREF_LD ] = { 111, 105, 104, 98, -1, -1 }, [ POWER7_PME_PM_L2_SN_M_RD_DONE ] = { -1, -1, -1, 93, -1, -1 }, [ POWER7_PME_PM_MRK_DERAT_MISS_16G ] = { -1, -1, -1, 180, -1, -1 }, [ POWER7_PME_PM_VSU_FCONV ] = { 254, 252, 246, 240, -1, -1 }, [ POWER7_PME_PM_ANY_THRD_RUN_CYC ] = { 1, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU_LMQ_FULL_CYC ] = { 131, 125, 126, 119, -1, -1 }, [ POWER7_PME_PM_MRK_LSU_REJECT_LHS ] = { 204, 202, 196, 194, -1, -1 }, [ POWER7_PME_PM_MRK_LD_MISS_L1_CYC ] = { -1, -1, -1, 185, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_L2_CYC ] = { -1, 181, -1, -1, -1, -1 }, [ POWER7_PME_PM_INST_IMC_MATCH_DISP ] = { -1, -1, 83, -1, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { -1, -1, -1, 179, -1, -1 }, [ POWER7_PME_PM_VSU0_SIMPLE_ISSUED ] = { 288, 286, 280, 274, -1, -1 }, [ POWER7_PME_PM_CMPLU_STALL_DIV ] = { -1, -1, -1, 19, -1, -1 }, [ POWER7_PME_PM_MRK_PTEG_FROM_RL2L3_SHR ] = { -1, 207, 198, -1, -1, -1 }, [ POWER7_PME_PM_VSU_FMA_DOUBLE ] = { 258, 256, 250, 244, -1, -1 }, [ POWER7_PME_PM_VSU_4FLOP ] = { 251, 249, 243, 237, -1, -1 }, [ POWER7_PME_PM_VSU1_FIN ] = { 303, 301, 295, 289, -1, -1 }, [ POWER7_PME_PM_INST_PTEG_FROM_RL2L3_MOD ] = { 93, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_RUN_CYC ] = { -1, 225, -1, -1, -1, 0 }, [ POWER7_PME_PM_PTEG_FROM_RMEM ] = { -1, -1, 220, -1, -1, -1 }, [ POWER7_PME_PM_LSU_LRQ_S0_VALID ] = { 135, 130, 131, 123, -1, -1 }, [ POWER7_PME_PM_LSU0_LDF ] = { 159, 153, 153, 146, -1, -1 }, [ POWER7_PME_PM_FLUSH_COMPLETION ] = { -1, -1, 42, -1, -1, -1 }, [ POWER7_PME_PM_ST_MISS_L1 ] = { -1, -1, 228, -1, -1, -1 }, [ 
POWER7_PME_PM_L2_NODE_PUMP ] = { -1, -1, 97, -1, -1, -1 }, [ POWER7_PME_PM_INST_FROM_DL2L3_SHR ] = { -1, -1, 77, -1, -1, -1 }, [ POWER7_PME_PM_MRK_STALL_CMPLU_CYC ] = { -1, -1, 201, -1, -1, -1 }, [ POWER7_PME_PM_VSU1_DENORM ] = { 299, 297, 291, 285, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_L31_SHR_CYC ] = { -1, 185, -1, -1, -1, -1 }, [ POWER7_PME_PM_GCT_USAGE_1TO2_SLOT ] = { 53, 54, 51, 53, -1, -1 }, [ POWER7_PME_PM_NEST_6 ] = { 215, 215, 210, 205, -1, -1 }, [ POWER7_PME_PM_INST_FROM_L3MISS ] = { -1, 82, -1, -1, -1, -1 }, [ POWER7_PME_PM_EE_OFF_EXT_INT ] = { 41, 42, 40, 39, -1, -1 }, [ POWER7_PME_PM_INST_PTEG_FROM_DMEM ] = { -1, 84, -1, -1, -1, -1 }, [ POWER7_PME_PM_INST_FROM_DL2L3_MOD ] = { -1, -1, 76, 76, -1, -1 }, [ POWER7_PME_PM_PMC6_OVERFLOW ] = { -1, -1, 216, -1, -1, -1 }, [ POWER7_PME_PM_VSU_2FLOP_DOUBLE ] = { 250, 248, 242, 236, -1, -1 }, [ POWER7_PME_PM_TLB_MISS ] = { -1, 244, -1, -1, -1, -1 }, [ POWER7_PME_PM_FXU_BUSY ] = { -1, 50, -1, -1, -1, -1 }, [ POWER7_PME_PM_L2_RCLD_DISP_FAIL_OTHER ] = { -1, 100, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU_REJECT_LMQ_FULL ] = { 141, 136, 136, 128, -1, -1 }, [ POWER7_PME_PM_IC_RELOAD_SHR ] = { 74, 71, 69, 69, -1, -1 }, [ POWER7_PME_PM_GRP_MRK ] = { 59, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_ST_NEST ] = { -1, 208, -1, -1, -1, -1 }, [ POWER7_PME_PM_VSU1_FSQRT_FDIV ] = { 307, 305, 299, 293, -1, -1 }, [ POWER7_PME_PM_LSU0_FLUSH_LRQ ] = { 154, 148, 148, 141, -1, -1 }, [ POWER7_PME_PM_LARX_LSU0 ] = { 115, 109, 109, 103, -1, -1 }, [ POWER7_PME_PM_IBUF_FULL_CYC ] = { 61, 59, 57, 57, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC ] = { -1, 178, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU_DC_PREF_STREAM_ALLOC ] = { 120, 114, 114, 109, -1, -1 }, [ POWER7_PME_PM_GRP_MRK_CYC ] = { 60, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC ] = { -1, 189, -1, -1, -1, -1 }, [ POWER7_PME_PM_L2_GLOB_GUESS_CORRECT ] = { 102, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU_REJECT_LHS ] = { 140, 135, 135, 127, -1, -1 }, [ 
POWER7_PME_PM_MRK_DATA_FROM_LMEM ] = { -1, -1, 182, 177, -1, -1 }, [ POWER7_PME_PM_INST_PTEG_FROM_L3 ] = { -1, 85, -1, -1, -1, -1 }, [ POWER7_PME_PM_FREQ_DOWN ] = { -1, -1, 48, -1, -1, -1 }, [ POWER7_PME_PM_INST_FROM_RL2L3_SHR ] = { 89, 83, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_INST_ISSUED ] = { 195, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_PTEG_FROM_L3MISS ] = { -1, 223, -1, -1, -1, -1 }, [ POWER7_PME_PM_RUN_PURR ] = { -1, -1, -1, 215, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_L3 ] = { 186, 183, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_GRP_IC_MISS ] = { -1, -1, -1, 183, -1, -1 }, [ POWER7_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { -1, 17, -1, -1, -1, -1 }, [ POWER7_PME_PM_PTEG_FROM_RL2L3_SHR ] = { -1, 224, 219, -1, -1, -1 }, [ POWER7_PME_PM_LSU_FLUSH_LRQ ] = { 124, 119, 120, 113, -1, -1 }, [ POWER7_PME_PM_MRK_DERAT_MISS_64K ] = { -1, 190, -1, -1, -1, -1 }, [ POWER7_PME_PM_INST_PTEG_FROM_DL2L3_MOD ] = { -1, -1, -1, 81, -1, -1 }, [ POWER7_PME_PM_L2_ST_MISS ] = { -1, 102, -1, -1, -1, -1 }, [ POWER7_PME_PM_LWSYNC ] = { 181, 175, 175, 168, -1, -1 }, [ POWER7_PME_PM_LSU0_DC_PREF_STREAM_CONFIRM_STRIDE ] = { 153, 147, 147, 140, -1, -1 }, [ POWER7_PME_PM_MRK_PTEG_FROM_L21_SHR ] = { -1, -1, -1, 196, -1, -1 }, [ POWER7_PME_PM_MRK_LSU_FLUSH_LRQ ] = { 199, 197, 190, 188, -1, -1 }, [ POWER7_PME_PM_INST_IMC_MATCH_CMPL ] = { 90, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_INST_FIN ] = { -1, -1, 187, -1, -1, -1 }, [ POWER7_PME_PM_INST_FROM_L31_MOD ] = { 85, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_DTLB_MISS_64K ] = { -1, -1, 185, -1, -1, -1 }, [ POWER7_PME_PM_LSU_FIN ] = { -1, -1, 118, -1, -1, -1 }, [ POWER7_PME_PM_MRK_LSU_REJECT ] = { -1, -1, -1, 193, -1, -1 }, [ POWER7_PME_PM_L2_CO_FAIL_BUSY ] = { 101, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_DATA_FROM_L31_MOD ] = { 25, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_THERMAL_WARN ] = { 238, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_VSU0_4FLOP ] = { 273, 271, 265, 259, -1, -1 }, [ POWER7_PME_PM_BR_MPRED_CCACHE ] = { 4, 2, 3, 4, -1, -1 }, [ 
POWER7_PME_PM_L1_DEMAND_WRITE ] = { 97, 91, 91, 88, -1, -1 }, [ POWER7_PME_PM_FLUSH_BR_MPRED ] = { 43, 44, 41, 41, -1, -1 }, [ POWER7_PME_PM_MRK_DTLB_MISS_16G ] = { 192, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_PTEG_FROM_DMEM ] = { -1, 203, -1, -1, -1, -1 }, [ POWER7_PME_PM_L2_RCST_DISP ] = { -1, -1, 99, -1, -1, -1 }, [ POWER7_PME_PM_CMPLU_STALL ] = { -1, -1, -1, 18, -1, -1 }, [ POWER7_PME_PM_LSU_PARTIAL_CDF ] = { 138, 133, 134, 126, -1, -1 }, [ POWER7_PME_PM_DISP_CLB_HELD_SB ] = { 34, 35, 31, 33, -1, -1 }, [ POWER7_PME_PM_VSU0_FMA_DOUBLE ] = { 281, 279, 273, 267, -1, -1 }, [ POWER7_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { -1, -1, 49, -1, -1, -1 }, [ POWER7_PME_PM_IC_DEMAND_CYC ] = { 63, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_L21_SHR ] = { -1, -1, 181, 173, -1, -1 }, [ POWER7_PME_PM_MRK_LSU_FLUSH_UST ] = { 202, 200, 193, 191, -1, -1 }, [ POWER7_PME_PM_INST_PTEG_FROM_L3MISS ] = { -1, 87, -1, -1, -1, -1 }, [ POWER7_PME_PM_VSU_DENORM ] = { 253, 251, 245, 239, -1, -1 }, [ POWER7_PME_PM_MRK_LSU_PARTIAL_CDF ] = { 203, 201, 194, 192, -1, -1 }, [ POWER7_PME_PM_INST_FROM_L21_SHR ] = { -1, -1, 80, 78, -1, -1 }, [ POWER7_PME_PM_IC_PREF_WRITE ] = { 73, 70, 68, 68, -1, -1 }, [ POWER7_PME_PM_BR_PRED ] = { 9, 7, 8, 9, -1, -1 }, [ POWER7_PME_PM_INST_FROM_DMEM ] = { 81, 78, -1, -1, -1, -1 }, [ POWER7_PME_PM_IC_PREF_CANCEL_ALL ] = { 68, 65, 63, 63, -1, -1 }, [ POWER7_PME_PM_LSU_DC_PREF_STREAM_CONFIRM ] = { 121, 115, 115, 110, -1, -1 }, [ POWER7_PME_PM_MRK_LSU_FLUSH_SRQ ] = { 200, 198, 191, 189, -1, -1 }, [ POWER7_PME_PM_MRK_FIN_STALL_CYC ] = { 193, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_GCT_UTIL_11PLUS_SLOT ] = { 54, 55, 52, 54, -1, -1 }, [ POWER7_PME_PM_L2_RCST_DISP_FAIL_OTHER ] = { -1, -1, -1, 92, -1, -1 }, [ POWER7_PME_PM_VSU1_DD_ISSUED ] = { 298, 296, 290, 284, -1, -1 }, [ POWER7_PME_PM_PTEG_FROM_L31_SHR ] = { -1, 222, -1, -1, -1, -1 }, [ POWER7_PME_PM_DATA_FROM_L21_SHR ] = { -1, -1, 21, 25, -1, -1 }, [ POWER7_PME_PM_LSU0_NCLD ] = { 162, 156, 156, 149, -1, -1 }, [ 
POWER7_PME_PM_VSU1_4FLOP ] = { 296, 294, 288, 282, -1, -1 }, [ POWER7_PME_PM_VSU1_8FLOP ] = { 297, 295, 289, 283, -1, -1 }, [ POWER7_PME_PM_VSU_8FLOP ] = { 252, 250, 244, 238, -1, -1 }, [ POWER7_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { -1, 128, -1, -1, -1, -1 }, [ POWER7_PME_PM_DTLB_MISS_64K ] = { -1, -1, 39, -1, -1, -1 }, [ POWER7_PME_PM_THRD_CONC_RUN_INST ] = { -1, -1, 234, -1, -1, -1 }, [ POWER7_PME_PM_MRK_PTEG_FROM_L2 ] = { 205, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_VSU_FIN ] = { 256, 254, 248, 242, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_L31_MOD ] = { 187, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_THRD_PRIO_0_1_CYC ] = { 242, 240, 235, 229, -1, -1 }, [ POWER7_PME_PM_DERAT_MISS_64K ] = { -1, 31, -1, -1, -1, -1 }, [ POWER7_PME_PM_PMC2_REWIND ] = { -1, -1, 214, -1, -1, -1 }, [ POWER7_PME_PM_INST_FROM_L2 ] = { 83, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_GRP_BR_MPRED_NONSPEC ] = { 57, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_INST_DISP ] = { -1, 77, 75, -1, -1, -1 }, [ POWER7_PME_PM_LSU0_DC_PREF_STREAM_CONFIRM ] = { 152, 146, 146, 139, -1, -1 }, [ POWER7_PME_PM_L1_DCACHE_RELOAD_VALID ] = { -1, -1, 90, -1, -1, -1 }, [ POWER7_PME_PM_VSU_SCALAR_DOUBLE_ISSUED ] = { 262, 260, 254, 248, -1, -1 }, [ POWER7_PME_PM_L3_PREF_HIT ] = { -1, -1, 103, -1, -1, -1 }, [ POWER7_PME_PM_MRK_PTEG_FROM_L31_MOD ] = { 206, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_FXU_FIN ] = { -1, 193, -1, -1, -1, -1 }, [ POWER7_PME_PM_PMC4_OVERFLOW ] = { 219, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_PTEG_FROM_L3 ] = { -1, 204, -1, -1, -1, -1 }, [ POWER7_PME_PM_LSU0_LMQ_LHR_MERGE ] = { 161, 155, 155, 148, -1, -1 }, [ POWER7_PME_PM_BTAC_HIT ] = { 17, 15, 15, 16, -1, -1 }, [ POWER7_PME_PM_IERAT_XLATE_WR_16MPLUS ] = { 79, 75, 73, 73, -1, -1 }, [ POWER7_PME_PM_L3_RD_BUSY ] = { -1, -1, -1, 101, -1, -1 }, [ POWER7_PME_PM_INST_FROM_L2MISS ] = { -1, -1, -1, 79, -1, -1 }, [ POWER7_PME_PM_LSU0_DC_PREF_STREAM_ALLOC ] = { 151, 145, 145, 138, -1, -1 }, [ POWER7_PME_PM_L2_ST ] = { 108, -1, -1, -1, -1, -1 }, [ 
POWER7_PME_PM_VSU0_DENORM ] = { 276, 274, 268, 262, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_DL2L3_SHR ] = { -1, -1, 179, -1, -1, -1 }, [ POWER7_PME_PM_BR_PRED_CR_TA ] = { 12, 10, 11, 12, -1, -1 }, [ POWER7_PME_PM_VSU0_FCONV ] = { 277, 275, 269, 263, -1, -1 }, [ POWER7_PME_PM_MRK_LSU_FLUSH_ULD ] = { 201, 199, 192, 190, -1, -1 }, [ POWER7_PME_PM_BTAC_MISS ] = { 18, 16, 16, 17, -1, -1 }, [ POWER7_PME_PM_MRK_LD_MISS_EXPOSED_CYC_COUNT ] = { 197, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_L2 ] = { 185, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_VSU_FMA ] = { 257, 255, 249, 243, -1, -1 }, [ POWER7_PME_PM_LSU0_FLUSH_SRQ ] = { 155, 149, 149, 142, -1, -1 }, [ POWER7_PME_PM_LSU1_L1_PREF ] = { 173, 167, 167, 160, -1, -1 }, [ POWER7_PME_PM_IOPS_CMPL ] = { 94, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_L2_SYS_PUMP ] = { -1, -1, 102, -1, -1, -1 }, [ POWER7_PME_PM_L2_RCLD_BUSY_RC_FULL ] = { -1, -1, -1, 91, -1, -1 }, [ POWER7_PME_PM_BCPLUS8_RSLV_TAKEN ] = { 3, 1, 2, 2, -1, -1 }, [ POWER7_PME_PM_NEST_5 ] = { 214, 214, 209, 204, -1, -1 }, [ POWER7_PME_PM_LSU_LMQ_S0_ALLOC ] = { 132, 126, 127, 120, -1, -1 }, [ POWER7_PME_PM_FLUSH_DISP_SYNC ] = { 46, 47, 45, 44, -1, -1 }, [ POWER7_PME_PM_L2_IC_INV ] = { -1, 96, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_L21_MOD_CYC ] = { -1, -1, -1, 172, -1, -1 }, [ POWER7_PME_PM_L3_PREF_LDST ] = { 112, 106, 105, 99, -1, -1 }, [ POWER7_PME_PM_LSU_SRQ_EMPTY_CYC ] = { -1, -1, -1, 131, -1, -1 }, [ POWER7_PME_PM_LSU_LMQ_S0_VALID ] = { 133, 127, 128, 121, -1, -1 }, [ POWER7_PME_PM_FLUSH_PARTIAL ] = { 48, 49, 47, 46, -1, -1 }, [ POWER7_PME_PM_VSU1_FMA_DOUBLE ] = { 305, 303, 297, 291, -1, -1 }, [ POWER7_PME_PM_1PLUS_PPC_DISP ] = { -1, -1, -1, 0, -1, -1 }, [ POWER7_PME_PM_DATA_FROM_L2MISS ] = { -1, 25, -1, 26, -1, -1 }, [ POWER7_PME_PM_SUSPENDED ] = { 236, 236, 231, 225, -1, -1 }, [ POWER7_PME_PM_VSU0_FMA ] = { 280, 278, 272, 266, -1, -1 }, [ POWER7_PME_PM_CMPLU_STALL_SCALAR ] = { -1, -1, -1, 22, -1, -1 }, [ POWER7_PME_PM_STCX_FAIL ] = { 235, 235, 230, 224, 
-1, -1 }, [ POWER7_PME_PM_VSU0_FSQRT_FDIV_DOUBLE ] = { 285, 283, 277, 271, -1, -1 }, [ POWER7_PME_PM_DC_PREF_DST ] = { 29, 30, 26, 28, -1, -1 }, [ POWER7_PME_PM_VSU1_SCAL_SINGLE_ISSUED ] = { 311, 309, 303, 297, -1, -1 }, [ POWER7_PME_PM_L3_HIT ] = { 109, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_L2_GLOB_GUESS_WRONG ] = { -1, 95, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_DFU_FIN ] = { -1, 191, -1, -1, -1, -1 }, [ POWER7_PME_PM_INST_FROM_L1 ] = { 82, 79, 78, 77, -1, -1 }, [ POWER7_PME_PM_BRU_FIN ] = { 16, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_IC_DEMAND_REQ ] = { 67, 64, 62, 62, -1, -1 }, [ POWER7_PME_PM_VSU1_FSQRT_FDIV_DOUBLE ] = { 308, 306, 300, 294, -1, -1 }, [ POWER7_PME_PM_VSU1_FMA ] = { 304, 302, 296, 290, -1, -1 }, [ POWER7_PME_PM_MRK_LD_MISS_L1 ] = { -1, 195, -1, -1, -1, -1 }, [ POWER7_PME_PM_VSU0_2FLOP_DOUBLE ] = { 272, 270, 264, 258, -1, -1 }, [ POWER7_PME_PM_LSU_DC_PREF_STRIDED_STREAM_CONFIRM ] = { 122, 116, 116, 111, -1, -1 }, [ POWER7_PME_PM_INST_PTEG_FROM_L31_SHR ] = { -1, 86, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_LSU_REJECT_ERAT_MISS ] = { -1, -1, 195, -1, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_L2MISS ] = { -1, -1, -1, 174, -1, -1 }, [ POWER7_PME_PM_DATA_FROM_RL2L3_SHR ] = { 28, 29, -1, -1, -1, -1 }, [ POWER7_PME_PM_INST_FROM_PREF ] = { 87, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_VSU1_SQ ] = { 313, 311, 305, 299, -1, -1 }, [ POWER7_PME_PM_L2_LD_DISP ] = { -1, -1, 95, -1, -1, -1 }, [ POWER7_PME_PM_L2_DISP_ALL ] = { -1, -1, -1, 90, -1, -1 }, [ POWER7_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { 241, -1, -1, -1, -1, -1 }, [ POWER7_PME_PM_VSU_FSQRT_FDIV_DOUBLE ] = { 261, 259, 253, 247, -1, -1 }, [ POWER7_PME_PM_BR_MPRED ] = { -1, -1, -1, 3, -1, -1 }, [ POWER7_PME_PM_VSU_1FLOP ] = { 248, 246, 240, 234, -1, -1 }, [ POWER7_PME_PM_HV_CYC ] = { -1, 58, -1, -1, -1, -1 }, [ POWER7_PME_PM_MRK_DATA_FROM_RL2L3_SHR ] = { 190, 188, -1, -1, -1, -1 }, [ POWER7_PME_PM_DTLB_MISS_16M ] = { -1, -1, -1, 38, -1, -1 }, [ POWER7_PME_PM_MRK_LSU_FIN ] = { -1, -1, -1, 186, -1, -1 }, [ 
POWER7_PME_PM_LSU1_LMQ_LHR_MERGE ] = { 176, 170, 170, 163, -1, -1 }, [ POWER7_PME_PM_IFU_FIN ] = { -1, -1, -1, 74, -1, -1 } }; static const unsigned long long power7_group_vecs[][POWER7_NUM_GROUP_VEC] = { [ POWER7_PME_PM_NEST_4 ] = { 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IC_DEMAND_L2_BR_ALL ] = { 0x0000000000000000ULL, 0x0000000000040000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_PMC2_SAVED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000001000ULL }, [ POWER7_PME_PM_CMPLU_STALL_DFU ] = { 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU0_16FLOP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_NEST_3 ] = { 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_LSU_DERAT_MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000800000000000ULL }, [ POWER7_PME_PM_MRK_ST_CMPL ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000040000000000ULL }, [ POWER7_PME_PM_L2_ST_DISP ] = { 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_CASTOUT_MOD ] = { 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_ISEG ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000800000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_INST_TIMEO ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0020000000000000ULL }, [ POWER7_PME_PM_L2_RCST_DISP_FAIL_ADDR ] = { 0x0100000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU1_DC_PREF_STREAM_CONFIRM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 
0x0010000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IERAT_WR_64K ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000020ULL }, [ POWER7_PME_PM_MRK_DTLB_MISS_16M ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000080000000000ULL }, [ POWER7_PME_PM_IERAT_MISS ] = { 0x0000000000080400ULL, 0x0000000000100000ULL, 0x0000000000000000ULL, 0x0000000000204020ULL }, [ POWER7_PME_PM_MRK_PTEG_FROM_LMEM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0010000000000000ULL }, [ POWER7_PME_PM_FLOP ] = { 0x0000000000000000ULL, 0x0000000001000000ULL, 0x0000010000040000ULL, 0x0000000000020000ULL }, [ POWER7_PME_PM_THRD_PRIO_4_5_CYC ] = { 0x0001000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_BR_PRED_TA ] = { 0x0000000000000040ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_CMPLU_STALL_FXU ] = { 0x0000000000000000ULL, 0x0000000000400000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_EXT_INT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000002080000000ULL, 0x0000000000800000ULL }, [ POWER7_PME_PM_VSU_FSQRT_FDIV ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000020010ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_LD_MISS_EXPOSED_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000800000000000ULL }, [ POWER7_PME_PM_LSU1_LDF ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IC_WRITE_ALL ] = { 0x0000000000000000ULL, 0x0000000000040000ULL, 0x0000000800000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU0_SRQ_STFWD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_PTEG_FROM_RL2L3_MOD ] = { 0x0000000041000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL 
}, [ POWER7_PME_PM_MRK_DATA_FROM_L31_SHR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000008000000ULL }, [ POWER7_PME_PM_DATA_FROM_L21_MOD ] = { 0x0000000000000000ULL, 0x0000000040000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_SCAL_DOUBLE_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU0_8FLOP ] = { 0x0000000000000000ULL, 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_POWER_EVENT1 ] = { 0x0000000300000000ULL, 0x0000000000008000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DISP_CLB_HELD_BAL ] = { 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_2FLOP ] = { 0x0000000000000000ULL, 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LWSYNC_HELD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_FROM_L21_MOD ] = { 0x0000000000000000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IC_REQ_ALL ] = { 0x0000000000000000ULL, 0x0000000000040000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DSLB_MISS ] = { 0x00000000000c8400ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L3_MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0008000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU0_L1_PREF ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000008000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU_SCALAR_SINGLE_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU1_DC_PREF_STREAM_CONFIRM_STRIDE ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0020000000000000ULL, 0x0000000000000000ULL }, [ 
POWER7_PME_PM_L2_INST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000001ULL }, [ POWER7_PME_PM_VSU0_FRSP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_FLUSH_DISP ] = { 0x0000003000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_PTEG_FROM_L2MISS ] = { 0x0000000010020000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_DQ_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_CMPLU_STALL_LSU ] = { 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_DMEM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000008000000ULL }, [ POWER7_PME_PM_LSU_FLUSH_ULD ] = { 0x000000c000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_PTEG_FROM_LMEM ] = { 0x0000000080c00000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DERAT_MISS_16M ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000600000000000ULL }, [ POWER7_PME_PM_THRD_ALL_RUN_CYC ] = { 0x0000200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_STALL_CMPLU_CYC_COUNT ] = { 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL, 0x0200000000000000ULL }, [ POWER7_PME_PM_DATA_FROM_DL2L3_MOD ] = { 0x0000000000000000ULL, 0x0000012480000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU_FRSP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000082000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_L21_MOD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000020000000ULL }, [ 
POWER7_PME_PM_PMC1_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000600ULL }, [ POWER7_PME_PM_VSU0_SINGLE ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_PTEG_FROM_L3MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0010000000000000ULL }, [ POWER7_PME_PM_MRK_PTEG_FROM_L31_SHR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0004000000000000ULL }, [ POWER7_PME_PM_VSU0_VECTOR_SP_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_FEST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_INST_DISP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000020000000000ULL }, [ POWER7_PME_PM_VSU0_COMPLEX_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU1_FLUSH_UST ] = { 0x0000010000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_CMPL ] = { 0x1ea80000e00c4001ULL, 0xe0f0070804120ce6ULL, 0x60007b087f80f3f7ULL, 0xdffffffffcb838ffULL }, [ POWER7_PME_PM_FXU_IDLE ] = { 0x0024000000000000ULL, 0x0000000000400000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU0_FLUSH_ULD ] = { 0x0000008000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_DL2L3_MOD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000800000000ULL }, [ POWER7_PME_PM_LSU_LMQ_SRQ_EMPTY_ALL_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000100000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU1_REJECT_LMQ_FULL ] = { 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ 
POWER7_PME_PM_INST_PTEG_FROM_L21_MOD ] = { 0x0000000006000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_GCT_UTIL_3TO6_SLOT ] = { 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_FROM_RL2L3_MOD ] = { 0x0000000000000000ULL, 0x0022000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_SHL_CREATED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0002000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_ST_HIT ] = { 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DATA_FROM_DMEM ] = { 0x0000000000000000ULL, 0x0000068140000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L3_LD_MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0008000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { 0x0014000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DISP_CLB_HELD_RES ] = { 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_SN_SX_I_DONE ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000004ULL }, [ POWER7_PME_PM_GRP_CMPL ] = { 0x0000000000000000ULL, 0x0000000000400000ULL, 0x0000010000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_BCPLUS8_CONV ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x2000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_STCX_CMPL ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x1800000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU0_2FLOP ] = { 0x0000000000000000ULL, 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L3_PREF_MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0008000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_SRQ_SYNC_CYC ] = { 
0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000001000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_REJECT_ERAT_MISS ] = { 0x0000000000000000ULL, 0x0000000000000010ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L1_ICACHE_MISS ] = { 0x0000000000000000ULL, 0x0000000000100000ULL, 0x0000000000000000ULL, 0x0000000000204000ULL }, [ POWER7_PME_PM_LSU1_FLUSH_SRQ ] = { 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LD_REF_L1_LSU0 ] = { 0x0000000400000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU0_FEST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU_VECTOR_SINGLE_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_FREQ_UP ] = { 0x0000000300000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DATA_FROM_LMEM ] = { 0x0000000000000000ULL, 0x0000068830000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU1_LDX ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0200000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_PMC3_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000600ULL }, [ POWER7_PME_PM_MRK_BR_MPRED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000004000000ULL }, [ POWER7_PME_PM_SHL_MATCH ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0002000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_BR_TAKEN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000004000000ULL }, [ POWER7_PME_PM_ISLB_MISS ] = { 0x0000000000080400ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_CYC ] = { 0x1eb0002020030001ULL, 0x0050000000120d22ULL, 
0x27b1f912b0000000ULL, 0x100000c0028381dfULL }, [ POWER7_PME_PM_MRK_DATA_FROM_DRL2L3_MOD_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000800000000ULL }, [ POWER7_PME_PM_DISP_HELD_THERMAL ] = { 0x0000000200000000ULL, 0x0000000000003000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_PTEG_FROM_RL2L3_SHR ] = { 0x0000000000200000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU1_SRQ_STFWD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_GCT_NOSLOT_BR_MPRED ] = { 0x0000000000000000ULL, 0x0000000008000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_1PLUS_PPC_CMPL ] = { 0x0000000000000000ULL, 0x0000000000030000ULL, 0x0000000100000000ULL, 0x0000000000040000ULL }, [ POWER7_PME_PM_PTEG_FROM_DMEM ] = { 0x0000000080800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU_2FLOP ] = { 0x0000000000000000ULL, 0x1000000000000000ULL, 0x0000000000000000ULL, 0x2000000000000000ULL }, [ POWER7_PME_PM_GCT_FULL_CYC ] = { 0x0000000000000000ULL, 0x0000000000000100ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_L3_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000100000000ULL }, [ POWER7_PME_PM_LSU_SRQ_S0_ALLOC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DERAT_MISS_4K ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000400000000000ULL }, [ POWER7_PME_PM_BR_MPRED_TA ] = { 0x0000000000000112ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_PTEG_FROM_L2MISS ] = { 0x0000000008000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DPU_HELD_POWER ] = { 0x0000000300000000ULL, 
0x0000000000003000ULL, 0x0000000000000000ULL, 0x0000000000000080ULL }, [ POWER7_PME_PM_RUN_INST_CMPL ] = { 0xfffd2fffffffffffULL, 0xffffffffffffffffULL, 0xffffffffffffffffULL, 0xffffffffffffffffULL }, [ POWER7_PME_PM_MRK_VSU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0100000000000000ULL }, [ POWER7_PME_PM_LSU_SRQ_S0_VALID ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000003000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_GCT_EMPTY_CYC ] = { 0x0000000000000000ULL, 0x0000000000000100ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IOPS_DISP ] = { 0x0000000000000000ULL, 0x0000000008000000ULL, 0x0000200000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_RUN_SPURR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000100ULL }, [ POWER7_PME_PM_PTEG_FROM_L21_MOD ] = { 0x0000000000100000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU0_1FLOP ] = { 0x0000000000000000ULL, 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_SNOOP_TLBIE ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000002000ULL }, [ POWER7_PME_PM_DATA_FROM_L3MISS ] = { 0x0000000000000000ULL, 0x0000004c40000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU_SINGLE ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DTLB_MISS_16G ] = { 0x0000000000001000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_CMPLU_STALL_VECTOR ] = { 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_FLUSH ] = { 0x000007a800000000ULL, 0x0000000000000000ULL, 0x0000020000000000ULL, 0x0000000001000000ULL }, [ POWER7_PME_PM_L2_LD_HIT ] = { 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 
0x0000000000000000ULL }, [ POWER7_PME_PM_NEST_2 ] = { 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_1FLOP ] = { 0x0000000000000000ULL, 0x0400000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IC_PREF_REQ ] = { 0x0000080000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L3_LD_HIT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0001000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_GCT_NOSLOT_IC_MISS ] = { 0x0000000000000000ULL, 0x0000000008000000ULL, 0x0000000200000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DISP_HELD ] = { 0x0000000000000000ULL, 0x0000000000001000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_LD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0400000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_FLUSH_SRQ ] = { 0x0000044000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_L31_MOD_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000200000000ULL }, [ POWER7_PME_PM_L2_RCST_BUSY_RC_FULL ] = { 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_TB_BIT_TRANS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000020080000000ULL, 0x0000000000800000ULL }, [ POWER7_PME_PM_THERMAL_MAX ] = { 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU1_FLUSH_ULD ] = { 0x0000008000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU1_REJECT_LHS ] = { 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_LRQ_S0_ALLOC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000000000ULL }, [ 
POWER7_PME_PM_POWER_EVENT4 ] = { 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DATA_FROM_L31_SHR ] = { 0x0000000000000000ULL, 0x0000000120000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_BR_UNCOND ] = { 0x0000000000000010ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU1_DC_PREF_STREAM_ALLOC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0040000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_PMC4_REWIND ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000800ULL }, [ POWER7_PME_PM_L2_RCLD_DISP ] = { 0x0140000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_THRD_PRIO_2_3_CYC ] = { 0x0001000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_PTEG_FROM_L2MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0008000000000000ULL }, [ POWER7_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_DERAT_MISS ] = { 0x0000000000038000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000008000ULL }, [ POWER7_PME_PM_IC_PREF_CANCEL_L2 ] = { 0x0000000000000000ULL, 0x0000000000080000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_GCT_UTIL_7TO10_SLOT ] = { 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_FIN_STALL_CYC_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0200000000000000ULL }, [ POWER7_PME_PM_BR_PRED_CCACHE ] = { 0x00000000000003c6ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_ST_CMPL_INT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000040000000000ULL 
}, [ POWER7_PME_PM_LSU_TWO_TABLEWALK_CYC ] = { 0x0000000400000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_L3MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000100000000ULL }, [ POWER7_PME_PM_GCT_NOSLOT_CYC ] = { 0x0000000000000000ULL, 0x0000000008000100ULL, 0x0000040000000000ULL, 0x0000000002000000ULL }, [ POWER7_PME_PM_LSU_SET_MPRED ] = { 0x0000000000000000ULL, 0x0000000000000020ULL, 0x0000004000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_FLUSH_DISP_TLBIE ] = { 0x0000000800000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000002000ULL }, [ POWER7_PME_PM_VSU1_FCONV ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000001000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_NEST_1 ] = { 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DERAT_MISS_16G ] = { 0x0000000000006000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_FROM_LMEM ] = { 0x0000000000000000ULL, 0x00ad180000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_CMPLU_STALL_SCALAR_LONG ] = { 0x0000000000000000ULL, 0x0000000001000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_PTEG_FROM_L2 ] = { 0x0000000008200000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_PTEG_FROM_L2 ] = { 0x0000000030100000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_L21_SHR_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000080000000ULL }, [ POWER7_PME_PM_MRK_DTLB_MISS_4K ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000180000000000ULL 
}, [ POWER7_PME_PM_VSU0_FPSCR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_VECT_DOUBLE_ISSUED ] = { 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_PTEG_FROM_RL2L3_MOD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0010000000000000ULL }, [ POWER7_PME_PM_L2_LD_MISS ] = { 0x0200000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VMX_RESULT_SAT_1 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L1_PREF ] = { 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0000008008000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000001000000000ULL }, [ POWER7_PME_PM_GRP_IC_MISS_NONSPEC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000200000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_SHL_MERGED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0002000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DATA_FROM_L3 ] = { 0x0000000000000000ULL, 0x0000000830000002ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_FLUSH ] = { 0x0000001000000000ULL, 0x0000000000000000ULL, 0x0000000000200000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_SRQ_SYNC_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000001000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_PMC2_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000200ULL }, [ POWER7_PME_PM_LSU_LDF ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0100000000010000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_POWER_EVENT3 ] = { 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DISP_WT ] = { 
0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000080ULL }, [ POWER7_PME_PM_CMPLU_STALL_REJECT ] = { 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IC_BANK_CONFLICT ] = { 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_BR_MPRED_CR_TA ] = { 0x0000000000000020ULL, 0x0000000000000000ULL, 0x0000000400000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_INST_MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_CMPLU_STALL_ERAT_MISS ] = { 0x0000000000000000ULL, 0x0000000000200000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_LSU_FLUSH ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000010000000000ULL }, [ POWER7_PME_PM_L2_LDST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_FROM_L31_SHR ] = { 0x0000000000000000ULL, 0x0000c00000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU0_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LARX_LSU ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0880000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_FROM_RMEM ] = { 0x0000000000000000ULL, 0x0088000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DISP_CLB_HELD_TLBIE ] = { 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL, 0x0000000000002000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_DMEM_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000002000000000ULL }, [ POWER7_PME_PM_BR_PRED_CR ] = { 0x00000000000003e4ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_REJECT ] = { 0x0000000000000000ULL, 
0x0000000000000018ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_CMPLU_STALL_END_GCT_NOSLOT ] = { 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU0_REJECT_LMQ_FULL ] = { 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU_FEST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000044000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_PTEG_FROM_L3 ] = { 0x0000000030200000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_POWER_EVENT2 ] = { 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IC_PREF_CANCEL_PAGE ] = { 0x0000000000000000ULL, 0x0000000000080000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU0_FSQRT_FDIV ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000010ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_GRP_CMPL ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0400000000000000ULL }, [ POWER7_PME_PM_VSU0_SCAL_DOUBLE_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_GRP_DISP ] = { 0x0000000000000000ULL, 0x0000000000010000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU0_LDX ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0200000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DATA_FROM_L2 ] = { 0x0000000000000000ULL, 0x0000005010200001ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_RL2L3_MOD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000440000000ULL }, [ POWER7_PME_PM_LD_REF_L1 ] = { 0x0000000400000000ULL, 0x0200000000000000ULL, 0x0000000000000000ULL, 0xc000000000000000ULL }, [ POWER7_PME_PM_VSU0_VECT_DOUBLE_ISSUED ] = { 0x0000000000000000ULL, 
0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_2FLOP_DOUBLE ] = { 0x0000000000000000ULL, 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_THRD_PRIO_6_7_CYC ] = { 0x0001000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_BR_MPRED_CR ] = { 0x0000000000000090ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LD_MISS_L1 ] = { 0x0000000000008000ULL, 0x0200001000000000ULL, 0x0000000000000000ULL, 0x8000000000184000ULL }, [ POWER7_PME_PM_DATA_FROM_RL2L3_MOD ] = { 0x0000000000000000ULL, 0x000000a200000004ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_SRQ_FULL_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000008000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_TABLEWALK_CYC ] = { 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_PTEG_FROM_RMEM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0002000000000000ULL }, [ POWER7_PME_PM_LSU_SRQ_STFWD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_PTEG_FROM_RMEM ] = { 0x0000000008400000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_FXU0_FIN ] = { 0x0008000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_PTEG_FROM_L31_MOD ] = { 0x0000000000400000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_PMC5_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000400ULL }, [ POWER7_PME_PM_LD_REF_L1_LSU1 ] = { 0x0000000400000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_PTEG_FROM_L21_SHR ] = { 0x0000000004000000ULL, 
0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_CMPLU_STALL_THRD ] = { 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DATA_FROM_RMEM ] = { 0x0000000000000000ULL, 0x0000028010000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU0_SCAL_SINGLE_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_BR_MPRED_LSTACK ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000002000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_NEST_8 ] = { 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000400000000ULL }, [ POWER7_PME_PM_LSU0_FLUSH_UST ] = { 0x0000010000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_NCST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000200000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_BR_TAKEN ] = { 0x0000000000000008ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_PTEG_FROM_LMEM ] = { 0x0000000002000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_GCT_NOSLOT_BR_MPRED_IC_MISS ] = { 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000200000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DTLB_MISS_4K ] = { 0x0000000000001000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_PMC4_SAVED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000001000ULL }, [ POWER7_PME_PM_VSU1_PERMUTE_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_SLB_MISS ] = { 0x0000000000000400ULL, 
0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU1_FLUSH_LRQ ] = { 0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DTLB_MISS ] = { 0x0000000000000800ULL, 0x0000000000000000ULL, 0x0000040000000000ULL, 0x0000000000008000ULL }, [ POWER7_PME_PM_VSU1_FRSP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU_VECTOR_DOUBLE_ISSUED ] = { 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_CASTOUT_SHR ] = { 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_NEST_7 ] = { 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DATA_FROM_DL2L3_SHR ] = { 0x0000000000000000ULL, 0x0000012100000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_STF ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000004ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_ST_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000060000000000ULL, 0xc000000000080000ULL }, [ POWER7_PME_PM_PTEG_FROM_L21_SHR ] = { 0x0000000001000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_LOC_GUESS_WRONG ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000010000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_STCX_FAIL ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0020000000000000ULL }, [ POWER7_PME_PM_LSU0_REJECT_LHS ] = { 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IC_PREF_CANCEL_HIT ] = { 0x0000000000000000ULL, 0x0000000000080000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L3_PREF_BUSY ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 
0x0008000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_BRU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0080000000000000ULL }, [ POWER7_PME_PM_LSU1_NCLD ] = { 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_PTEG_FROM_L31_MOD ] = { 0x0000000006000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_NCLD ] = { 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_LDX ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0200000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_LOC_GUESS_CORRECT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000010000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_THRESH_TIMEO ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000001000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L3_PREF_ST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DISP_CLB_HELD_SYNC ] = { 0x0000000000000000ULL, 0x0000000000004000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU_SIMPLE_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_SINGLE ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DATA_TABLEWALK_CYC ] = { 0x0000000000000000ULL, 0x0000000000800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_RC_ST_DONE ] = { 0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_PTEG_FROM_L21_MOD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0005000000000000ULL }, [ POWER7_PME_PM_LARX_LSU1 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0080000000000000ULL, 
0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_RMEM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000001040000000ULL }, [ POWER7_PME_PM_DISP_CLB_HELD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000040ULL }, [ POWER7_PME_PM_DERAT_MISS_4K ] = { 0x0000000000002000ULL, 0x0000010000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_RCLD_DISP_FAIL_ADDR ] = { 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_SEG_EXCEPTION ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000800000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_FLUSH_DISP_SB ] = { 0x0000000800000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_DC_INV ] = { 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_PTEG_FROM_DL2L3_MOD ] = { 0x0000000040200000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DSEG ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000800000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_BR_PRED_LSTACK ] = { 0x00000000000003c6ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU0_STF ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000004ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_FX_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000200000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DERAT_MISS_16M ] = { 0x0000000000006000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_PTEG_FROM_DL2L3_MOD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0004000000000000ULL }, [ POWER7_PME_PM_INST_FROM_L3 ] = { 0x0000000000000000ULL, 0x0010180000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ 
POWER7_PME_PM_MRK_IFU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0060000000000000ULL }, [ POWER7_PME_PM_ITLB_MISS ] = { 0x0000000000000800ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000208000ULL }, [ POWER7_PME_PM_VSU_STF ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000090004ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_FLUSH_UST ] = { 0x0000014000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_LDST_MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_FXU1_FIN ] = { 0x0008000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_SHL_DEALLOCATED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0002000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_SN_M_WR_DONE ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000004ULL }, [ POWER7_PME_PM_LSU_REJECT_SET_MPRED ] = { 0x0000000000000000ULL, 0x0000000000000030ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L3_PREF_LD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_SN_M_RD_DONE ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000010ULL }, [ POWER7_PME_PM_MRK_DERAT_MISS_16G ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000200000000000ULL }, [ POWER7_PME_PM_VSU_FCONV ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000081000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_ANY_THRD_RUN_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000010000ULL }, [ POWER7_PME_PM_LSU_LMQ_FULL_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000500000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_LSU_REJECT_LHS ] = { 
0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0800010000000000ULL }, [ POWER7_PME_PM_MRK_LD_MISS_L1_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000800000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_L2_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000010000000ULL }, [ POWER7_PME_PM_INST_IMC_MATCH_DISP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x4000000040000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000001000000000ULL }, [ POWER7_PME_PM_VSU0_SIMPLE_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000400ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_CMPLU_STALL_DIV ] = { 0x0000000000000000ULL, 0x0000000000400000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_PTEG_FROM_RL2L3_SHR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0002000000000000ULL }, [ POWER7_PME_PM_VSU_FMA_DOUBLE ] = { 0x0000000000000000ULL, 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU_4FLOP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000100ULL, 0x2000000000000000ULL }, [ POWER7_PME_PM_VSU1_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000002ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_PTEG_FROM_RL2L3_MOD ] = { 0x0000000000800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_RUN_CYC ] = { 0xfffd2fffffffffffULL, 0xffffffffffffffffULL, 0xffffffffffffffffULL, 0xffffffffffffffffULL }, [ POWER7_PME_PM_PTEG_FROM_RMEM ] = { 0x0000000090800000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_LRQ_S0_VALID ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU0_LDF ] = { 
0x0000000000000000ULL, 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_FLUSH_COMPLETION ] = { 0x0000002000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_ST_MISS_L1 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x4000000000084000ULL }, [ POWER7_PME_PM_L2_NODE_PUMP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000008ULL }, [ POWER7_PME_PM_INST_FROM_DL2L3_SHR ] = { 0x0000000000000000ULL, 0x0042200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_STALL_CMPLU_CYC ] = { 0x0000000000000000ULL, 0x0000000001000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_DENORM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_L31_SHR_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000200000000ULL }, [ POWER7_PME_PM_GCT_USAGE_1TO2_SLOT ] = { 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_NEST_6 ] = { 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_FROM_L3MISS ] = { 0x0000000000040000ULL, 0x0005200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_EE_OFF_EXT_INT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000080000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_PTEG_FROM_DMEM ] = { 0x0000000002000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_FROM_DL2L3_MOD ] = { 0x0000000000000000ULL, 0x0042300000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_PMC6_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000400ULL }, [ POWER7_PME_PM_VSU_2FLOP_DOUBLE ] = { 
0x0000000000000000ULL, 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_TLB_MISS ] = { 0x0000000000000800ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_FXU_BUSY ] = { 0x0034000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_RCLD_DISP_FAIL_OTHER ] = { 0x0140000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_REJECT_LMQ_FULL ] = { 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IC_RELOAD_SHR ] = { 0x0000080000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_GRP_MRK ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_ST_NEST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000040000000000ULL }, [ POWER7_PME_PM_VSU1_FSQRT_FDIV ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000010ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU0_FLUSH_LRQ ] = { 0x0000020000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LARX_LSU0 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0080000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IBUF_FULL_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000008000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000800000000ULL }, [ POWER7_PME_PM_LSU_DC_PREF_STREAM_ALLOC ] = { 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0040008000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_GRP_MRK_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0400000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC ] = { 
0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000400000000ULL }, [ POWER7_PME_PM_L2_GLOB_GUESS_CORRECT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000020000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_REJECT_LHS ] = { 0x0000000000000000ULL, 0x0000000000000008ULL, 0x0800000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_LMEM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000020000000ULL }, [ POWER7_PME_PM_INST_PTEG_FROM_L3 ] = { 0x0000000000100000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_FREQ_DOWN ] = { 0x0000000100000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_FROM_RL2L3_SHR ] = { 0x0000000000000000ULL, 0x0026800000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_INST_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000020000000000ULL }, [ POWER7_PME_PM_PTEG_FROM_L3MISS ] = { 0x0000000000430000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_RUN_PURR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000010100ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_L3 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000100000000ULL }, [ POWER7_PME_PM_MRK_GRP_IC_MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0100000000000000ULL }, [ POWER7_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { 0x0000000000000000ULL, 0x0000000000200000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_PTEG_FROM_RL2L3_SHR ] = { 0x0000000041000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_FLUSH_LRQ ] = { 0x0000024000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DERAT_MISS_64K ] = { 
0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000600000000000ULL }, [ POWER7_PME_PM_INST_PTEG_FROM_DL2L3_MOD ] = { 0x0000000000100000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_ST_MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0400000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LWSYNC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000500000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU0_DC_PREF_STREAM_CONFIRM_STRIDE ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0020000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_PTEG_FROM_L21_SHR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0001000000000000ULL }, [ POWER7_PME_PM_MRK_LSU_FLUSH_LRQ ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000008000000000ULL }, [ POWER7_PME_PM_INST_IMC_MATCH_CMPL ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x4000000040000000ULL, 0x0000000001000000ULL }, [ POWER7_PME_PM_MRK_INST_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000020000000000ULL }, [ POWER7_PME_PM_INST_FROM_L31_MOD ] = { 0x0000000000000000ULL, 0x0000400000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DTLB_MISS_64K ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000180000000000ULL }, [ POWER7_PME_PM_LSU_FIN ] = { 0x0000000000000000ULL, 0x0200000000000000ULL, 0x0000000000200000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_LSU_REJECT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0800010000000000ULL }, [ POWER7_PME_PM_L2_CO_FAIL_BUSY ] = { 0x0080000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DATA_FROM_L31_MOD ] = { 0x0000000000000000ULL, 0x0000000080000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_THERMAL_WARN ] = { 
0x0000000000000000ULL, 0x0000000000002000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU0_4FLOP ] = { 0x0000000000000000ULL, 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_BR_MPRED_CCACHE ] = { 0x0000000000000212ULL, 0x0000000000000000ULL, 0x0000000400000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L1_DEMAND_WRITE ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000800000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_FLUSH_BR_MPRED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000004000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DTLB_MISS_16G ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000100000000000ULL }, [ POWER7_PME_PM_MRK_PTEG_FROM_DMEM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0001000000000000ULL }, [ POWER7_PME_PM_L2_RCST_DISP ] = { 0x0040000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_CMPLU_STALL ] = { 0x0000000000000000ULL, 0x0000000002000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_PARTIAL_CDF ] = { 0x0000001000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DISP_CLB_HELD_SB ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000040ULL }, [ POWER7_PME_PM_VSU0_FMA_DOUBLE ] = { 0x0000000000000000ULL, 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { 0x0014000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IC_DEMAND_CYC ] = { 0x0000080000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_L21_SHR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000080000000ULL }, [ POWER7_PME_PM_MRK_LSU_FLUSH_UST ] = { 
0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000004000000000ULL }, [ POWER7_PME_PM_INST_PTEG_FROM_L3MISS ] = { 0x0000000008000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU_DENORM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_LSU_PARTIAL_CDF ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0080000000000000ULL }, [ POWER7_PME_PM_INST_FROM_L21_SHR ] = { 0x0000000000000000ULL, 0x0000c00000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IC_PREF_WRITE ] = { 0x0000080000000000ULL, 0x0000000000000000ULL, 0x0000000800000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_BR_PRED ] = { 0x000000000000002cULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_FROM_DMEM ] = { 0x0000000000000000ULL, 0x0088300000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IC_PREF_CANCEL_ALL ] = { 0x0000000000000000ULL, 0x00000000000c0000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_DC_PREF_STREAM_CONFIRM ] = { 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0010000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_LSU_FLUSH_SRQ ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000008000000000ULL }, [ POWER7_PME_PM_MRK_FIN_STALL_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0100000000000000ULL }, [ POWER7_PME_PM_GCT_UTIL_11PLUS_SLOT ] = { 0x0000000000000000ULL, 0x0000000000000200ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_RCST_DISP_FAIL_OTHER ] = { 0x0100000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_DD_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL }, [ 
POWER7_PME_PM_PTEG_FROM_L31_SHR ] = { 0x0000000001000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DATA_FROM_L21_SHR ] = { 0x0000000000000000ULL, 0x0000000300000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU0_NCLD ] = { 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_4FLOP ] = { 0x0000000000000000ULL, 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_8FLOP ] = { 0x0000000000000000ULL, 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU_8FLOP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000100ULL, 0x2000000000000000ULL }, [ POWER7_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000100000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DTLB_MISS_64K ] = { 0x0000000000001000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_THRD_CONC_RUN_INST ] = { 0x0000200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000001000000ULL }, [ POWER7_PME_PM_MRK_PTEG_FROM_L2 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0002000000000000ULL }, [ POWER7_PME_PM_VSU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x00000000000e8002ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_L31_MOD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000220000000ULL }, [ POWER7_PME_PM_THRD_PRIO_0_1_CYC ] = { 0x0001200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DERAT_MISS_64K ] = { 0x0000000000006000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_PMC2_REWIND ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000800ULL }, [ POWER7_PME_PM_INST_FROM_L2 ] = { 
0x0000000000000000ULL, 0x0010080000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_GRP_BR_MPRED_NONSPEC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000400000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_DISP ] = { 0x0000000000000001ULL, 0x0000000000010001ULL, 0x4000000100000000ULL, 0x0000000003040000ULL }, [ POWER7_PME_PM_LSU0_DC_PREF_STREAM_CONFIRM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0010000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L1_DCACHE_RELOAD_VALID ] = { 0x0000000000000000ULL, 0x0000041000200007ULL, 0x0000002000000000ULL, 0x0000000000100000ULL }, [ POWER7_PME_PM_VSU_SCALAR_DOUBLE_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000040ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L3_PREF_HIT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0401000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_PTEG_FROM_L31_MOD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0008000000000000ULL }, [ POWER7_PME_PM_MRK_FXU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0040000000000000ULL }, [ POWER7_PME_PM_PMC4_OVERFLOW ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000200ULL }, [ POWER7_PME_PM_MRK_PTEG_FROM_L3 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0008000000000000ULL }, [ POWER7_PME_PM_LSU0_LMQ_LHR_MERGE ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000400000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_BTAC_HIT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x1000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IERAT_XLATE_WR_16MPLUS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000020ULL }, [ POWER7_PME_PM_L3_RD_BUSY ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_FROM_L2MISS ] 
= { 0x0000000000000000ULL, 0x0005880000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU0_DC_PREF_STREAM_ALLOC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0040000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_ST ] = { 0x0200000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU0_DENORM ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000001ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_DL2L3_SHR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000008000000ULL }, [ POWER7_PME_PM_BR_PRED_CR_TA ] = { 0x0000000000000020ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU0_FCONV ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000001000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_LSU_FLUSH_ULD ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000004000000000ULL }, [ POWER7_PME_PM_BTAC_MISS ] = { 0x0000000000000800ULL, 0x0000000000000000ULL, 0x1000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_LD_MISS_EXPOSED_CYC_COUNT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000002000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_L2 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000090000000ULL }, [ POWER7_PME_PM_VSU_FMA ] = { 0x0000000000000000ULL, 0x2000000000000000ULL, 0x0000000000030000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU0_FLUSH_SRQ ] = { 0x0000040000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU1_L1_PREF ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000008000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IOPS_CMPL ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000200000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_SYS_PUMP ] = { 0x0000000000000000ULL, 
0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000002ULL }, [ POWER7_PME_PM_L2_RCLD_BUSY_RC_FULL ] = { 0x0040000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_BCPLUS8_RSLV_TAKEN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x2000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_NEST_5 ] = { 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_LMQ_S0_ALLOC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000004000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_FLUSH_DISP_SYNC ] = { 0x0000000800000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_IC_INV ] = { 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_L21_MOD_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000002000000000ULL }, [ POWER7_PME_PM_L3_PREF_LDST ] = { 0x0000000000000000ULL, 0x0100000000000000ULL, 0x0004000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_SRQ_EMPTY_CYC ] = { 0x0000000000000000ULL, 0x0000000000000010ULL, 0x0000000000100000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_LMQ_S0_VALID ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000002400000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_FLUSH_PARTIAL ] = { 0x0000001000000000ULL, 0x0000000000000000ULL, 0x0000004000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_FMA_DOUBLE ] = { 0x0000000000000000ULL, 0x4000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_1PLUS_PPC_DISP ] = { 0x0000000000000000ULL, 0x0000000000031001ULL, 0x0000000100000000ULL, 0x0000000000040000ULL }, [ POWER7_PME_PM_DATA_FROM_L2MISS ] = { 0x0000000000008000ULL, 0x0000005260000000ULL, 0x0000000000000000ULL, 0x0000000000100000ULL }, [ POWER7_PME_PM_SUSPENDED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 
0x0000100000000000ULL, 0x0000000000400000ULL }, [ POWER7_PME_PM_VSU0_FMA ] = { 0x0000000000000000ULL, 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_CMPLU_STALL_SCALAR ] = { 0x0000000000000000ULL, 0x0000000001000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_STCX_FAIL ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x1800000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU0_FSQRT_FDIV_DOUBLE ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000020ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_DC_PREF_DST ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0040000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_SCAL_SINGLE_ISSUED ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000080ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L3_HIT ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0001000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_GLOB_GUESS_WRONG ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000020000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DFU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0200000000000000ULL }, [ POWER7_PME_PM_INST_FROM_L1 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000040000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_BRU_FIN ] = { 0x0000000000000008ULL, 0x0000000000000000ULL, 0x0000000000008000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IC_DEMAND_REQ ] = { 0x8000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_FSQRT_FDIV_DOUBLE ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000020ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_FMA ] = { 0x0000000000000000ULL, 0x2000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_LD_MISS_L1 ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 
0x0000000004000000ULL }, [ POWER7_PME_PM_VSU0_2FLOP_DOUBLE ] = { 0x0000000000000000ULL, 0x1000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_LSU_DC_PREF_STRIDED_STREAM_CONFIRM ] = { 0x0000000000000000ULL, 0x0200000000000000ULL, 0x0020000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_PTEG_FROM_L31_SHR ] = { 0x0000000004000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_LSU_REJECT_ERAT_MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0800000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_L2MISS ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000010000000ULL }, [ POWER7_PME_PM_DATA_FROM_RL2L3_SHR ] = { 0x0000000000000000ULL, 0x0000002680000004ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_INST_FROM_PREF ] = { 0x0000000000000000ULL, 0x0009000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU1_SQ ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000800ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_LD_DISP ] = { 0x0800000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_L2_DISP_ALL ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x8000000000000000ULL, 0x0000000000000001ULL }, [ POWER7_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { 0x0000200000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_VSU_FSQRT_FDIV_DOUBLE ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000020ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_BR_MPRED ] = { 0x0000000000000008ULL, 0x0000000000000000ULL, 0x0000046400000000ULL, 0x0000000002000000ULL }, [ POWER7_PME_PM_VSU_1FLOP ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000070100ULL, 0x2000000000000000ULL }, [ POWER7_PME_PM_HV_CYC ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000001100000000ULL, 
0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_DATA_FROM_RL2L3_SHR ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000040000000ULL }, [ POWER7_PME_PM_DTLB_MISS_16M ] = { 0x0000000000001000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_MRK_LSU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000000000ULL, 0x10c0000000000000ULL }, [ POWER7_PME_PM_LSU1_LMQ_LHR_MERGE ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000000000400000ULL, 0x0000000000000000ULL }, [ POWER7_PME_PM_IFU_FIN ] = { 0x0000000000000000ULL, 0x0000000000000000ULL, 0x0000081000000000ULL, 0x0000000000000000ULL } }; static const pme_power_entry_t power7_pe[] = { [ POWER7_PME_PM_NEST_4 ] = { .pme_name = "PM_NEST_4", .pme_code = 0x87, .pme_short_desc = "PlaceHolder for Nest events (MC0/MC1/PB/GX)", .pme_long_desc = "PlaceHolder for Nest events (MC0/MC1/PB/GX)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_NEST_4], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_NEST_4] }, [ POWER7_PME_PM_IC_DEMAND_L2_BR_ALL ] = { .pme_name = "PM_IC_DEMAND_L2_BR_ALL", .pme_code = 0x4898, .pme_short_desc = " L2 I cache demand request due to BHT or redirect", .pme_long_desc = " L2 I cache demand request due to BHT or redirect", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IC_DEMAND_L2_BR_ALL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IC_DEMAND_L2_BR_ALL] }, [ POWER7_PME_PM_PMC2_SAVED ] = { .pme_name = "PM_PMC2_SAVED", .pme_code = 0x10022, .pme_short_desc = "PMC2 Rewind Value saved", .pme_long_desc = "PMC2 was counting speculatively. 
The speculative condition was met and the counter value was committed by copying it to the backup register.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PMC2_SAVED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PMC2_SAVED] }, [ POWER7_PME_PM_CMPLU_STALL_DFU ] = { .pme_name = "PM_CMPLU_STALL_DFU", .pme_code = 0x2003c, .pme_short_desc = "Completion stall caused by Decimal Floating Point Unit", .pme_long_desc = "Completion stall caused by Decimal Floating Point Unit", .pme_event_ids = power7_event_ids[POWER7_PME_PM_CMPLU_STALL_DFU], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_CMPLU_STALL_DFU] }, [ POWER7_PME_PM_VSU0_16FLOP ] = { .pme_name = "PM_VSU0_16FLOP", .pme_code = 0xa0a4, .pme_short_desc = "Sixteen flops operation (SP vector versions of fdiv", .pme_long_desc = "fsqrt) ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_16FLOP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_16FLOP] }, [ POWER7_PME_PM_NEST_3 ] = { .pme_name = "PM_NEST_3", .pme_code = 0x85, .pme_short_desc = "PlaceHolder for Nest events (MC0/MC1/PB/GX)", .pme_long_desc = "PlaceHolder for Nest events (MC0/MC1/PB/GX)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_NEST_3], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_NEST_3] }, [ POWER7_PME_PM_MRK_LSU_DERAT_MISS ] = { .pme_name = "PM_MRK_LSU_DERAT_MISS", .pme_code = 0x3d05a, .pme_short_desc = "Marked DERAT Miss", .pme_long_desc = "Marked DERAT Miss", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_LSU_DERAT_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_LSU_DERAT_MISS] }, [ POWER7_PME_PM_MRK_ST_CMPL ] = { .pme_name = "PM_MRK_ST_CMPL", .pme_code = 0x10034, .pme_short_desc = "marked store finished (was complete)", .pme_long_desc = "A sampled store has completed (data home)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_ST_CMPL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_ST_CMPL] }, [ POWER7_PME_PM_L2_ST_DISP ] = { .pme_name = "PM_L2_ST_DISP", .pme_code = 0x46180, 
.pme_short_desc = "All successful store dispatches", .pme_long_desc = "All successful store dispatches", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_ST_DISP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_ST_DISP] }, [ POWER7_PME_PM_L2_CASTOUT_MOD ] = { .pme_name = "PM_L2_CASTOUT_MOD", .pme_code = 0x16180, .pme_short_desc = "L2 Castouts - Modified (M", .pme_long_desc = " Mu", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_CASTOUT_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_CASTOUT_MOD] }, [ POWER7_PME_PM_ISEG ] = { .pme_name = "PM_ISEG", .pme_code = 0x20a4, .pme_short_desc = "ISEG Exception", .pme_long_desc = "ISEG Exception", .pme_event_ids = power7_event_ids[POWER7_PME_PM_ISEG], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_ISEG] }, [ POWER7_PME_PM_MRK_INST_TIMEO ] = { .pme_name = "PM_MRK_INST_TIMEO", .pme_code = 0x40034, .pme_short_desc = "marked Instruction finish timeout ", .pme_long_desc = "The number of instructions finished since the last progress indicator from a marked instruction exceeded the threshold. 
The marked instruction was flushed.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_INST_TIMEO], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_INST_TIMEO] }, [ POWER7_PME_PM_L2_RCST_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2_RCST_DISP_FAIL_ADDR", .pme_code = 0x36282, .pme_short_desc = " L2 RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = " L2 RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_RCST_DISP_FAIL_ADDR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_RCST_DISP_FAIL_ADDR] }, [ POWER7_PME_PM_LSU1_DC_PREF_STREAM_CONFIRM ] = { .pme_name = "PM_LSU1_DC_PREF_STREAM_CONFIRM", .pme_code = 0xd0b6, .pme_short_desc = "LS1 'Dcache prefetch stream confirmed", .pme_long_desc = "LS1 'Dcache prefetch stream confirmed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU1_DC_PREF_STREAM_CONFIRM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU1_DC_PREF_STREAM_CONFIRM] }, [ POWER7_PME_PM_IERAT_WR_64K ] = { .pme_name = "PM_IERAT_WR_64K", .pme_code = 0x40be, .pme_short_desc = "large page 64k ", .pme_long_desc = "large page 64k ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IERAT_WR_64K], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IERAT_WR_64K] }, [ POWER7_PME_PM_MRK_DTLB_MISS_16M ] = { .pme_name = "PM_MRK_DTLB_MISS_16M", .pme_code = 0x4d05e, .pme_short_desc = "Marked Data TLB misses for 16M page", .pme_long_desc = "Data TLB references to 16M pages by a marked instruction that missed the TLB. 
Page size is determined at TLB reload time.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DTLB_MISS_16M], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DTLB_MISS_16M] }, [ POWER7_PME_PM_IERAT_MISS ] = { .pme_name = "PM_IERAT_MISS", .pme_code = 0x100f6, .pme_short_desc = "IERAT Miss (Not implemented as DI on POWER6)", .pme_long_desc = "A translation request missed the Instruction Effective to Real Address Translation (ERAT) table", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IERAT_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IERAT_MISS] }, [ POWER7_PME_PM_MRK_PTEG_FROM_LMEM ] = { .pme_name = "PM_MRK_PTEG_FROM_LMEM", .pme_code = 0x4d052, .pme_short_desc = "Marked PTEG loaded from local memory", .pme_long_desc = "A Page Table Entry was loaded into the ERAT from memory attached to the same module this processor is located on due to a marked load or store.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_PTEG_FROM_LMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_PTEG_FROM_LMEM] }, [ POWER7_PME_PM_FLOP ] = { .pme_name = "PM_FLOP", .pme_code = 0x100f4, .pme_short_desc = "Floating Point Operation Finished", .pme_long_desc = "A floating point operation has completed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FLOP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FLOP] }, [ POWER7_PME_PM_THRD_PRIO_4_5_CYC ] = { .pme_name = "PM_THRD_PRIO_4_5_CYC", .pme_code = 0x40b4, .pme_short_desc = " Cycles thread running at priority level 4 or 5", .pme_long_desc = " Cycles thread running at priority level 4 or 5", .pme_event_ids = power7_event_ids[POWER7_PME_PM_THRD_PRIO_4_5_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_THRD_PRIO_4_5_CYC] }, [ POWER7_PME_PM_BR_PRED_TA ] = { .pme_name = "PM_BR_PRED_TA", .pme_code = 0x40aa, .pme_short_desc = "Branch predict - target address", .pme_long_desc = "The target address of a branch instruction was predicted.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BR_PRED_TA], 
.pme_group_vector = power7_group_vecs[POWER7_PME_PM_BR_PRED_TA] }, [ POWER7_PME_PM_CMPLU_STALL_FXU ] = { .pme_name = "PM_CMPLU_STALL_FXU", .pme_code = 0x20014, .pme_short_desc = "Completion stall caused by FXU instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a fixed point instruction.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_CMPLU_STALL_FXU], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_CMPLU_STALL_FXU] }, [ POWER7_PME_PM_EXT_INT ] = { .pme_name = "PM_EXT_INT", .pme_code = 0x200f8, .pme_short_desc = "external interrupt", .pme_long_desc = "An interrupt due to an external exception occurred", .pme_event_ids = power7_event_ids[POWER7_PME_PM_EXT_INT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_EXT_INT] }, [ POWER7_PME_PM_VSU_FSQRT_FDIV ] = { .pme_name = "PM_VSU_FSQRT_FDIV", .pme_code = 0xa888, .pme_short_desc = "four flops operation (fdiv", .pme_long_desc = "fsqrt) Scalar Instructions only!", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_FSQRT_FDIV], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_FSQRT_FDIV] }, [ POWER7_PME_PM_MRK_LD_MISS_EXPOSED_CYC ] = { .pme_name = "PM_MRK_LD_MISS_EXPOSED_CYC", .pme_code = 0x1003e, .pme_short_desc = "Marked Load exposed Miss ", .pme_long_desc = "Marked Load exposed Miss ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_LD_MISS_EXPOSED_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_LD_MISS_EXPOSED_CYC] }, [ POWER7_PME_PM_LSU1_LDF ] = { .pme_name = "PM_LSU1_LDF", .pme_code = 0xc086, .pme_short_desc = "LS1 Scalar Loads ", .pme_long_desc = "A floating point load was executed by LSU1", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU1_LDF], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU1_LDF] }, [ POWER7_PME_PM_IC_WRITE_ALL ] = { .pme_name = "PM_IC_WRITE_ALL", .pme_code = 0x488c, .pme_short_desc = "Icache sectors written", .pme_long_desc = " prefetch + 
demand", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IC_WRITE_ALL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IC_WRITE_ALL] }, [ POWER7_PME_PM_LSU0_SRQ_STFWD ] = { .pme_name = "PM_LSU0_SRQ_STFWD", .pme_code = 0xc0a0, .pme_short_desc = "LS0 SRQ forwarded data to a load", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 0. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load hits L1 but becomes a store forward, it is not treated as a load miss.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU0_SRQ_STFWD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU0_SRQ_STFWD] }, [ POWER7_PME_PM_PTEG_FROM_RL2L3_MOD ] = { .pme_name = "PM_PTEG_FROM_RL2L3_MOD", .pme_code = 0x1c052, .pme_short_desc = "PTEG loaded from remote L2 or L3 modified", .pme_long_desc = "A Page Table Entry was loaded into the ERAT with modified (M) data from an L2 or L3 on a remote module due to a demand load or store.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PTEG_FROM_RL2L3_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PTEG_FROM_RL2L3_MOD] }, [ POWER7_PME_PM_MRK_DATA_FROM_L31_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L31_SHR", .pme_code = 0x1d04e, .pme_short_desc = "Marked data loaded from another L3 on same chip shared", .pme_long_desc = "Marked data loaded from another L3 on same chip shared", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_L31_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_L31_SHR] }, [ POWER7_PME_PM_DATA_FROM_L21_MOD ] = { .pme_name = "PM_DATA_FROM_L21_MOD", .pme_code = 0x3c046, .pme_short_desc = "Data loaded from another L2 on same chip modified", .pme_long_desc = "Data loaded from another L2 on same chip modified", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DATA_FROM_L21_MOD], .pme_group_vector = 
power7_group_vecs[POWER7_PME_PM_DATA_FROM_L21_MOD] }, [ POWER7_PME_PM_VSU1_SCAL_DOUBLE_ISSUED ] = { .pme_name = "PM_VSU1_SCAL_DOUBLE_ISSUED", .pme_code = 0xb08a, .pme_short_desc = "Double Precision scalar instruction issued on Pipe1", .pme_long_desc = "Double Precision scalar instruction issued on Pipe1", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_SCAL_DOUBLE_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_SCAL_DOUBLE_ISSUED] }, [ POWER7_PME_PM_VSU0_8FLOP ] = { .pme_name = "PM_VSU0_8FLOP", .pme_code = 0xa0a0, .pme_short_desc = "eight flops operation (DP vector versions of fdiv", .pme_long_desc = "fsqrt and SP vector versions of fmadd", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_8FLOP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_8FLOP] }, [ POWER7_PME_PM_POWER_EVENT1 ] = { .pme_name = "PM_POWER_EVENT1", .pme_code = 0x1006e, .pme_short_desc = "Power Management Event 1", .pme_long_desc = "Power Management Event 1", .pme_event_ids = power7_event_ids[POWER7_PME_PM_POWER_EVENT1], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_POWER_EVENT1] }, [ POWER7_PME_PM_DISP_CLB_HELD_BAL ] = { .pme_name = "PM_DISP_CLB_HELD_BAL", .pme_code = 0x2092, .pme_short_desc = "Dispatch/CLB Hold: Balance", .pme_long_desc = "Dispatch/CLB Hold: Balance", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DISP_CLB_HELD_BAL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DISP_CLB_HELD_BAL] }, [ POWER7_PME_PM_VSU1_2FLOP ] = { .pme_name = "PM_VSU1_2FLOP", .pme_code = 0xa09a, .pme_short_desc = "two flops operation (scalar fmadd", .pme_long_desc = " fnmadd", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_2FLOP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_2FLOP] }, [ POWER7_PME_PM_LWSYNC_HELD ] = { .pme_name = "PM_LWSYNC_HELD", .pme_code = 0x209a, .pme_short_desc = "LWSYNC held at dispatch", .pme_long_desc = "Cycles a LWSYNC instruction was held at dispatch. 
LWSYNC instructions are held at dispatch until all previous loads are done and all previous stores have issued. LWSYNC enters the Store Request Queue and is sent to the storage subsystem but does not wait for a response.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LWSYNC_HELD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LWSYNC_HELD] }, [ POWER7_PME_PM_INST_FROM_L21_MOD ] = { .pme_name = "PM_INST_FROM_L21_MOD", .pme_code = 0x34046, .pme_short_desc = "Instruction fetched from another L2 on same chip modified", .pme_long_desc = "Instruction fetched from another L2 on same chip modified", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_L21_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_L21_MOD] }, [ POWER7_PME_PM_IC_REQ_ALL ] = { .pme_name = "PM_IC_REQ_ALL", .pme_code = 0x4888, .pme_short_desc = "Icache requests", .pme_long_desc = " prefetch + demand", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IC_REQ_ALL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IC_REQ_ALL] }, [ POWER7_PME_PM_DSLB_MISS ] = { .pme_name = "PM_DSLB_MISS", .pme_code = 0xd090, .pme_short_desc = "Data SLB Miss - Total of all segment sizes", .pme_long_desc = "A SLB miss for a data request occurred. 
SLB misses trap to the operating system to resolve.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DSLB_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DSLB_MISS] }, [ POWER7_PME_PM_L3_MISS ] = { .pme_name = "PM_L3_MISS", .pme_code = 0x1f082, .pme_short_desc = "L3 Misses ", .pme_long_desc = "L3 Misses ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L3_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L3_MISS] }, [ POWER7_PME_PM_LSU0_L1_PREF ] = { .pme_name = "PM_LSU0_L1_PREF", .pme_code = 0xd0b8, .pme_short_desc = " LS0 L1 cache data prefetches", .pme_long_desc = " LS0 L1 cache data prefetches", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU0_L1_PREF], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU0_L1_PREF] }, [ POWER7_PME_PM_VSU_SCALAR_SINGLE_ISSUED ] = { .pme_name = "PM_VSU_SCALAR_SINGLE_ISSUED", .pme_code = 0xb884, .pme_short_desc = "Single Precision scalar instruction issued on Pipe0", .pme_long_desc = "Single Precision scalar instruction issued on Pipe0", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_SCALAR_SINGLE_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_SCALAR_SINGLE_ISSUED] }, [ POWER7_PME_PM_LSU1_DC_PREF_STREAM_CONFIRM_STRIDE ] = { .pme_name = "PM_LSU1_DC_PREF_STREAM_CONFIRM_STRIDE", .pme_code = 0xd0be, .pme_short_desc = "LS1 Dcache Strided prefetch stream confirmed", .pme_long_desc = "LS1 Dcache Strided prefetch stream confirmed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU1_DC_PREF_STREAM_CONFIRM_STRIDE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU1_DC_PREF_STREAM_CONFIRM_STRIDE] }, [ POWER7_PME_PM_L2_INST ] = { .pme_name = "PM_L2_INST", .pme_code = 0x36080, .pme_short_desc = "Instruction Load Count", .pme_long_desc = "Instruction Load Count", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_INST], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_INST] }, [ POWER7_PME_PM_VSU0_FRSP ] = { .pme_name = "PM_VSU0_FRSP", .pme_code = 0xa0b4, .pme_short_desc = "Round 
to single precision instruction executed", .pme_long_desc = "Round to single precision instruction executed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_FRSP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_FRSP] }, [ POWER7_PME_PM_FLUSH_DISP ] = { .pme_name = "PM_FLUSH_DISP", .pme_code = 0x2082, .pme_short_desc = "Dispatch flush", .pme_long_desc = "Dispatch flush", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FLUSH_DISP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FLUSH_DISP] }, [ POWER7_PME_PM_PTEG_FROM_L2MISS ] = { .pme_name = "PM_PTEG_FROM_L2MISS", .pme_code = 0x4c058, .pme_short_desc = "PTEG loaded from L2 miss", .pme_long_desc = "A Page Table Entry was loaded into the TLB but not from the local L2.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PTEG_FROM_L2MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PTEG_FROM_L2MISS] }, [ POWER7_PME_PM_VSU1_DQ_ISSUED ] = { .pme_name = "PM_VSU1_DQ_ISSUED", .pme_code = 0xb09a, .pme_short_desc = "128BIT Decimal Issued on Pipe1", .pme_long_desc = "128BIT Decimal Issued on Pipe1", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_DQ_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_DQ_ISSUED] }, [ POWER7_PME_PM_CMPLU_STALL_LSU ] = { .pme_name = "PM_CMPLU_STALL_LSU", .pme_code = 0x20012, .pme_short_desc = "Completion stall caused by LSU instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a load/store instruction.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_CMPLU_STALL_LSU], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_CMPLU_STALL_LSU] }, [ POWER7_PME_PM_MRK_DATA_FROM_DMEM ] = { .pme_name = "PM_MRK_DATA_FROM_DMEM", .pme_code = 0x1d04a, .pme_short_desc = "Marked data loaded from distant memory", .pme_long_desc = "The processor's Data Cache was reloaded with data from memory attached to a distant module due to a marked load.", .pme_event_ids = 
power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_DMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_DMEM] }, [ POWER7_PME_PM_LSU_FLUSH_ULD ] = { .pme_name = "PM_LSU_FLUSH_ULD", .pme_code = 0xc8b0, .pme_short_desc = "Flush: Unaligned Load", .pme_long_desc = "A load was flushed because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1). Combined Unit 0 + 1.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_FLUSH_ULD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_FLUSH_ULD] }, [ POWER7_PME_PM_PTEG_FROM_LMEM ] = { .pme_name = "PM_PTEG_FROM_LMEM", .pme_code = 0x4c052, .pme_short_desc = "PTEG loaded from local memory", .pme_long_desc = "A Page Table Entry was loaded into the TLB from memory attached to the same module this processor is located on.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PTEG_FROM_LMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PTEG_FROM_LMEM] }, [ POWER7_PME_PM_MRK_DERAT_MISS_16M ] = { .pme_name = "PM_MRK_DERAT_MISS_16M", .pme_code = 0x3d05c, .pme_short_desc = "Marked DERAT misses for 16M page", .pme_long_desc = "A marked data request (load or store) missed the ERAT for 16M page and resulted in an ERAT reload.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DERAT_MISS_16M], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DERAT_MISS_16M] }, [ POWER7_PME_PM_THRD_ALL_RUN_CYC ] = { .pme_name = "PM_THRD_ALL_RUN_CYC", .pme_code = 0x2000c, .pme_short_desc = "All Threads in run_cycles", .pme_long_desc = "Cycles when all threads had their run latches set. 
Operating systems use the run latch to indicate when they are doing useful work.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_THRD_ALL_RUN_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_THRD_ALL_RUN_CYC] }, [ POWER7_PME_PM_MRK_STALL_CMPLU_CYC_COUNT ] = { .pme_name = "PM_MRK_STALL_CMPLU_CYC_COUNT", .pme_code = 0x3003f, .pme_short_desc = "Marked Group Completion Stall cycles (use edge detect to count #)", .pme_long_desc = "Marked Group Completion Stall cycles (use edge detect to count #)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_STALL_CMPLU_CYC_COUNT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_STALL_CMPLU_CYC_COUNT] }, [ POWER7_PME_PM_DATA_FROM_DL2L3_MOD ] = { .pme_name = "PM_DATA_FROM_DL2L3_MOD", .pme_code = 0x3c04c, .pme_short_desc = "Data loaded from distant L2 or L3 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from an L2 or L3 on a distant module due to a demand load", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DATA_FROM_DL2L3_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DATA_FROM_DL2L3_MOD] }, [ POWER7_PME_PM_VSU_FRSP ] = { .pme_name = "PM_VSU_FRSP", .pme_code = 0xa8b4, .pme_short_desc = "Round to single precision instruction executed", .pme_long_desc = "Round to single precision instruction executed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_FRSP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_FRSP] }, [ POWER7_PME_PM_MRK_DATA_FROM_L21_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L21_MOD", .pme_code = 0x3d046, .pme_short_desc = "Marked data loaded from another L2 on same chip modified", .pme_long_desc = "Marked data loaded from another L2 on same chip modified", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_L21_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_L21_MOD] }, [ POWER7_PME_PM_PMC1_OVERFLOW ] = { .pme_name = "PM_PMC1_OVERFLOW", .pme_code = 0x20010, .pme_short_desc = "Overflow from counter 1", 
.pme_long_desc = "Overflows from PMC1 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PMC1_OVERFLOW], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PMC1_OVERFLOW] }, [ POWER7_PME_PM_VSU0_SINGLE ] = { .pme_name = "PM_VSU0_SINGLE", .pme_code = 0xa0a8, .pme_short_desc = "FPU single precision", .pme_long_desc = "VSU0 executed single precision instruction", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_SINGLE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_SINGLE] }, [ POWER7_PME_PM_MRK_PTEG_FROM_L3MISS ] = { .pme_name = "PM_MRK_PTEG_FROM_L3MISS", .pme_code = 0x2d058, .pme_short_desc = "Marked PTEG loaded from L3 miss", .pme_long_desc = "A Page Table Entry was loaded into the ERAT from beyond the L3 due to a marked load or store", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_PTEG_FROM_L3MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_PTEG_FROM_L3MISS] }, [ POWER7_PME_PM_MRK_PTEG_FROM_L31_SHR ] = { .pme_name = "PM_MRK_PTEG_FROM_L31_SHR", .pme_code = 0x2d056, .pme_short_desc = "Marked PTEG loaded from another L3 on same chip shared", .pme_long_desc = "Marked PTEG loaded from another L3 on same chip shared", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_PTEG_FROM_L31_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_PTEG_FROM_L31_SHR] }, [ POWER7_PME_PM_VSU0_VECTOR_SP_ISSUED ] = { .pme_name = "PM_VSU0_VECTOR_SP_ISSUED", .pme_code = 0xb090, .pme_short_desc = "Single Precision vector instruction issued (executed)", .pme_long_desc = "Single Precision vector instruction issued (executed)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_VECTOR_SP_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_VECTOR_SP_ISSUED] }, [ POWER7_PME_PM_VSU1_FEST ] = { .pme_name = "PM_VSU1_FEST", .pme_code = 0xa0ba, .pme_short_desc = 
"Estimate instruction executed", .pme_long_desc = "Estimate instruction executed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_FEST], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_FEST] }, [ POWER7_PME_PM_MRK_INST_DISP ] = { .pme_name = "PM_MRK_INST_DISP", .pme_code = 0x20030, .pme_short_desc = "marked instruction dispatch", .pme_long_desc = "A marked instruction was dispatched", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_INST_DISP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_INST_DISP] }, [ POWER7_PME_PM_VSU0_COMPLEX_ISSUED ] = { .pme_name = "PM_VSU0_COMPLEX_ISSUED", .pme_code = 0xb096, .pme_short_desc = "Complex VMX instruction issued", .pme_long_desc = "Complex VMX instruction issued", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_COMPLEX_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_COMPLEX_ISSUED] }, [ POWER7_PME_PM_LSU1_FLUSH_UST ] = { .pme_name = "PM_LSU1_FLUSH_UST", .pme_code = 0xc0b6, .pme_short_desc = "LS1 Flush: Unaligned Store", .pme_long_desc = "A store was flushed from unit 1 because it was unaligned (crossed a 4K boundary)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU1_FLUSH_UST], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU1_FLUSH_UST] }, [ POWER7_PME_PM_INST_CMPL ] = { .pme_name = "PM_INST_CMPL", .pme_code = 0x2, .pme_short_desc = "# PPC Instructions Finished", .pme_long_desc = "Number of PowerPC Instructions that completed.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_CMPL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_CMPL] }, [ POWER7_PME_PM_FXU_IDLE ] = { .pme_name = "PM_FXU_IDLE", .pme_code = 0x1000e, .pme_short_desc = "fxu0 idle and fxu1 idle", .pme_long_desc = "FXU0 and FXU1 are both idle.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FXU_IDLE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FXU_IDLE] }, [ POWER7_PME_PM_LSU0_FLUSH_ULD ] = { .pme_name = "PM_LSU0_FLUSH_ULD", .pme_code = 0xc0b0, .pme_short_desc = "LS0 Flush: 
Unaligned Load", .pme_long_desc = "A load was flushed from unit 0 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU0_FLUSH_ULD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU0_FLUSH_ULD] }, [ POWER7_PME_PM_MRK_DATA_FROM_DL2L3_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_MOD", .pme_code = 0x3d04c, .pme_short_desc = "Marked data loaded from distant L2 or L3 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from an L2 or L3 on a distant module due to a marked load.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_DL2L3_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_DL2L3_MOD] }, [ POWER7_PME_PM_LSU_LMQ_SRQ_EMPTY_ALL_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_ALL_CYC", .pme_code = 0x3001c, .pme_short_desc = "ALL threads lsu empty (lmq and srq empty)", .pme_long_desc = "ALL threads lsu empty (lmq and srq empty)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_LMQ_SRQ_EMPTY_ALL_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_LMQ_SRQ_EMPTY_ALL_CYC] }, [ POWER7_PME_PM_LSU1_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU1_REJECT_LMQ_FULL", .pme_code = 0xc0a6, .pme_short_desc = "LS1 Reject: LMQ Full (LHR)", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. 
If all eight entries are full, subsequent load instructions are rejected.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU1_REJECT_LMQ_FULL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU1_REJECT_LMQ_FULL] }, [ POWER7_PME_PM_INST_PTEG_FROM_L21_MOD ] = { .pme_name = "PM_INST_PTEG_FROM_L21_MOD", .pme_code = 0x3e056, .pme_short_desc = "Instruction PTEG loaded from another L2 on same chip modified", .pme_long_desc = "Instruction PTEG loaded from another L2 on same chip modified", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_PTEG_FROM_L21_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_PTEG_FROM_L21_MOD] }, [ POWER7_PME_PM_GCT_UTIL_3TO6_SLOT ] = { .pme_name = "PM_GCT_UTIL_3-6_SLOT", .pme_code = 0x209e, .pme_short_desc = "GCT Utilization 3-6 entries", .pme_long_desc = "GCT Utilization 3-6 entries", .pme_event_ids = power7_event_ids[POWER7_PME_PM_GCT_UTIL_3TO6_SLOT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_GCT_UTIL_3TO6_SLOT] }, [ POWER7_PME_PM_INST_FROM_RL2L3_MOD ] = { .pme_name = "PM_INST_FROM_RL2L3_MOD", .pme_code = 0x14042, .pme_short_desc = "Instruction fetched from remote L2 or L3 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from an L2 or L3 on a remote module. Fetch groups can contain up to 8 instructions", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_RL2L3_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_RL2L3_MOD] }, [ POWER7_PME_PM_SHL_CREATED ] = { .pme_name = "PM_SHL_CREATED", .pme_code = 0x5082, .pme_short_desc = "SHL table entry Created", .pme_long_desc = "SHL table entry Created", .pme_event_ids = power7_event_ids[POWER7_PME_PM_SHL_CREATED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_SHL_CREATED] }, [ POWER7_PME_PM_L2_ST_HIT ] = { .pme_name = "PM_L2_ST_HIT", .pme_code = 0x46182, .pme_short_desc = "All successful store dispatches that were L2Hits", .pme_long_desc = "A store request hit in the L2 directory. 
This event includes all requests to this L2 from all sources. Total for all slices.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_ST_HIT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_ST_HIT] }, [ POWER7_PME_PM_DATA_FROM_DMEM ] = { .pme_name = "PM_DATA_FROM_DMEM", .pme_code = 0x1c04a, .pme_short_desc = "Data loaded from distant memory", .pme_long_desc = "The processor's Data Cache was reloaded with data from memory attached to a distant module due to a demand load", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DATA_FROM_DMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DATA_FROM_DMEM] }, [ POWER7_PME_PM_L3_LD_MISS ] = { .pme_name = "PM_L3_LD_MISS", .pme_code = 0x2f082, .pme_short_desc = "L3 demand LD Miss", .pme_long_desc = "L3 demand LD Miss", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L3_LD_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L3_LD_MISS] }, [ POWER7_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { .pme_name = "PM_FXU1_BUSY_FXU0_IDLE", .pme_code = 0x4000e, .pme_short_desc = "fxu0 idle and fxu1 busy. 
", .pme_long_desc = "FXU0 was idle while FXU1 was busy", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FXU1_BUSY_FXU0_IDLE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FXU1_BUSY_FXU0_IDLE] }, [ POWER7_PME_PM_DISP_CLB_HELD_RES ] = { .pme_name = "PM_DISP_CLB_HELD_RES", .pme_code = 0x2094, .pme_short_desc = "Dispatch/CLB Hold: Resource", .pme_long_desc = "Dispatch/CLB Hold: Resource", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DISP_CLB_HELD_RES], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DISP_CLB_HELD_RES] }, [ POWER7_PME_PM_L2_SN_SX_I_DONE ] = { .pme_name = "PM_L2_SN_SX_I_DONE", .pme_code = 0x36382, .pme_short_desc = "SNP dispatched and went from Sx or Tx to Ix", .pme_long_desc = "SNP dispatched and went from Sx or Tx to Ix", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_SN_SX_I_DONE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_SN_SX_I_DONE] }, [ POWER7_PME_PM_GRP_CMPL ] = { .pme_name = "PM_GRP_CMPL", .pme_code = 0x30004, .pme_short_desc = "group completed", .pme_long_desc = "A group completed. 
Microcoded instructions that span multiple groups will generate this event once per group.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_GRP_CMPL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_GRP_CMPL] }, [ POWER7_PME_PM_BCPLUS8_CONV ] = { .pme_name = "PM_BC+8_CONV", .pme_code = 0x40b8, .pme_short_desc = "BC+8 Converted", .pme_long_desc = "BC+8 Converted", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BCPLUS8_CONV], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BCPLUS8_CONV] }, [ POWER7_PME_PM_STCX_CMPL ] = { .pme_name = "PM_STCX_CMPL", .pme_code = 0xc098, .pme_short_desc = "STCX executed", .pme_long_desc = "Conditional stores with reservation completed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_STCX_CMPL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_STCX_CMPL] }, [ POWER7_PME_PM_VSU0_2FLOP ] = { .pme_name = "PM_VSU0_2FLOP", .pme_code = 0xa098, .pme_short_desc = "two flops operation (scalar fmadd", .pme_long_desc = " fnmadd", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_2FLOP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_2FLOP] }, [ POWER7_PME_PM_L3_PREF_MISS ] = { .pme_name = "PM_L3_PREF_MISS", .pme_code = 0x3f082, .pme_short_desc = "L3 Prefetch Directory Miss", .pme_long_desc = "L3 Prefetch Directory Miss", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L3_PREF_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L3_PREF_MISS] }, [ POWER7_PME_PM_LSU_SRQ_SYNC_CYC ] = { .pme_name = "PM_LSU_SRQ_SYNC_CYC", .pme_code = 0xd096, .pme_short_desc = "A sync is in the SRQ", .pme_long_desc = "Cycles that a sync instruction is active in the Store Request Queue.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_SRQ_SYNC_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_SRQ_SYNC_CYC] }, [ POWER7_PME_PM_LSU_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU_REJECT_ERAT_MISS", .pme_code = 0x20064, .pme_short_desc = "LSU Reject due to ERAT (up to 2 per cycles)", .pme_long_desc = "Total cycles the Load Store 
Unit is busy rejecting instructions due to an ERAT miss. Combined unit 0 + 1. Requests that miss the Derat are rejected and retried until the request hits in the Erat.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_REJECT_ERAT_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_REJECT_ERAT_MISS] }, [ POWER7_PME_PM_L1_ICACHE_MISS ] = { .pme_name = "PM_L1_ICACHE_MISS", .pme_code = 0x200fc, .pme_short_desc = "Demand iCache Miss", .pme_long_desc = "An instruction fetch request missed the L1 cache.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L1_ICACHE_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L1_ICACHE_MISS] }, [ POWER7_PME_PM_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_LSU1_FLUSH_SRQ", .pme_code = 0xc0be, .pme_short_desc = "LS1 Flush: SRQ", .pme_long_desc = "Load Hit Store flush. A younger load was flushed from unit 1 because it hits (overlaps) an older store that is already in the SRQ or in the same group. If the real addresses match but the effective addresses do not, an alias condition exists that prevents store forwarding. If the load and store are in the same group the load must be flushed to separate the two instructions. 
", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU1_FLUSH_SRQ], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU1_FLUSH_SRQ] }, [ POWER7_PME_PM_LD_REF_L1_LSU0 ] = { .pme_name = "PM_LD_REF_L1_LSU0", .pme_code = 0xc080, .pme_short_desc = "LS0 L1 D cache load references counted at finish", .pme_long_desc = "Load references to Level 1 Data Cache, by unit 0.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LD_REF_L1_LSU0], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LD_REF_L1_LSU0] }, [ POWER7_PME_PM_VSU0_FEST ] = { .pme_name = "PM_VSU0_FEST", .pme_code = 0xa0b8, .pme_short_desc = "Estimate instruction executed", .pme_long_desc = "Estimate instruction executed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_FEST], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_FEST] }, [ POWER7_PME_PM_VSU_VECTOR_SINGLE_ISSUED ] = { .pme_name = "PM_VSU_VECTOR_SINGLE_ISSUED", .pme_code = 0xb890, .pme_short_desc = "Single Precision vector instruction issued (executed)", .pme_long_desc = "Single Precision vector instruction issued (executed)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_VECTOR_SINGLE_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_VECTOR_SINGLE_ISSUED] }, [ POWER7_PME_PM_FREQ_UP ] = { .pme_name = "PM_FREQ_UP", .pme_code = 0x4000c, .pme_short_desc = "Power Management: Above Threshold A", .pme_long_desc = "Processor frequency was sped up due to power management", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FREQ_UP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FREQ_UP] }, [ POWER7_PME_PM_DATA_FROM_LMEM ] = { .pme_name = "PM_DATA_FROM_LMEM", .pme_code = 0x3c04a, .pme_short_desc = "Data loaded from local memory", .pme_long_desc = "The processor's Data Cache was reloaded from memory attached to the same module this processor is located on.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DATA_FROM_LMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DATA_FROM_LMEM] }, [ POWER7_PME_PM_LSU1_LDX ] = { 
.pme_name = "PM_LSU1_LDX", .pme_code = 0xc08a, .pme_short_desc = "LS1 Vector Loads", .pme_long_desc = "LS1 Vector Loads", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU1_LDX], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU1_LDX] }, [ POWER7_PME_PM_PMC3_OVERFLOW ] = { .pme_name = "PM_PMC3_OVERFLOW", .pme_code = 0x40010, .pme_short_desc = "Overflow from counter 3", .pme_long_desc = "Overflows from PMC3 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PMC3_OVERFLOW], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PMC3_OVERFLOW] }, [ POWER7_PME_PM_MRK_BR_MPRED ] = { .pme_name = "PM_MRK_BR_MPRED", .pme_code = 0x30036, .pme_short_desc = "Marked Branch Mispredicted", .pme_long_desc = "A marked branch was mispredicted", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_BR_MPRED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_BR_MPRED] }, [ POWER7_PME_PM_SHL_MATCH ] = { .pme_name = "PM_SHL_MATCH", .pme_code = 0x5086, .pme_short_desc = "SHL Table Match", .pme_long_desc = "SHL Table Match", .pme_event_ids = power7_event_ids[POWER7_PME_PM_SHL_MATCH], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_SHL_MATCH] }, [ POWER7_PME_PM_MRK_BR_TAKEN ] = { .pme_name = "PM_MRK_BR_TAKEN", .pme_code = 0x10036, .pme_short_desc = "Marked Branch Taken", .pme_long_desc = "A marked branch was taken", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_BR_TAKEN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_BR_TAKEN] }, [ POWER7_PME_PM_ISLB_MISS ] = { .pme_name = "PM_ISLB_MISS", .pme_code = 0xd092, .pme_short_desc = "Instruction SLB Miss - Tota of all segment sizes", .pme_long_desc = "A SLB miss for an instruction fetch as occurred", .pme_event_ids = power7_event_ids[POWER7_PME_PM_ISLB_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_ISLB_MISS] }, [ POWER7_PME_PM_CYC 
] = { .pme_name = "PM_CYC", .pme_code = 0x1e, .pme_short_desc = "Cycles", .pme_long_desc = "Processor Cycles", .pme_event_ids = power7_event_ids[POWER7_PME_PM_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_CYC] }, [ POWER7_PME_PM_MRK_DATA_FROM_DRL2L3_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_DRL2L3_MOD_CYC", .pme_code = 0x4002a, .pme_short_desc = "Marked ld latency Data source 1011 (L2.75/L3.75 M different 4 chip node)", .pme_long_desc = "Marked ld latency Data source 1011 (L2.75/L3.75 M different 4 chip node)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_DRL2L3_MOD_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_DRL2L3_MOD_CYC] }, [ POWER7_PME_PM_DISP_HELD_THERMAL ] = { .pme_name = "PM_DISP_HELD_THERMAL", .pme_code = 0x30006, .pme_short_desc = "Dispatch Held due to Thermal", .pme_long_desc = "Dispatch Held due to Thermal", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DISP_HELD_THERMAL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DISP_HELD_THERMAL] }, [ POWER7_PME_PM_INST_PTEG_FROM_RL2L3_SHR ] = { .pme_name = "PM_INST_PTEG_FROM_RL2L3_SHR", .pme_code = 0x2e054, .pme_short_desc = "Instruction PTEG loaded from remote L2 or L3 shared", .pme_long_desc = "Instruction PTEG loaded from remote L2 or L3 shared", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_PTEG_FROM_RL2L3_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_PTEG_FROM_RL2L3_SHR] }, [ POWER7_PME_PM_LSU1_SRQ_STFWD ] = { .pme_name = "PM_LSU1_SRQ_STFWD", .pme_code = 0xc0a2, .pme_short_desc = "LS1 SRQ forwarded data to a load", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 1. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. 
A load that hits L1 but becomes a store forward is not treated as a load miss.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU1_SRQ_STFWD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU1_SRQ_STFWD] }, [ POWER7_PME_PM_GCT_NOSLOT_BR_MPRED ] = { .pme_name = "PM_GCT_NOSLOT_BR_MPRED", .pme_code = 0x4001a, .pme_short_desc = "GCT empty by branch mispredict", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread because of a branch misprediction.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_GCT_NOSLOT_BR_MPRED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_GCT_NOSLOT_BR_MPRED] }, [ POWER7_PME_PM_1PLUS_PPC_CMPL ] = { .pme_name = "PM_1PLUS_PPC_CMPL", .pme_code = 0x100f2, .pme_short_desc = "1 or more ppc insts finished", .pme_long_desc = "A group containing at least one PPC instruction completed. For microcoded instructions that span multiple groups, this will only occur once.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_1PLUS_PPC_CMPL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_1PLUS_PPC_CMPL] }, [ POWER7_PME_PM_PTEG_FROM_DMEM ] = { .pme_name = "PM_PTEG_FROM_DMEM", .pme_code = 0x2c052, .pme_short_desc = "PTEG loaded from distant memory", .pme_long_desc = "A Page Table Entry was loaded into the ERAT with data from memory attached to a distant module due to a demand load or store.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PTEG_FROM_DMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PTEG_FROM_DMEM] }, [ POWER7_PME_PM_VSU_2FLOP ] = { .pme_name = "PM_VSU_2FLOP", .pme_code = 0xa898, .pme_short_desc = "two flops operation (scalar fmadd", .pme_long_desc = " fnmadd", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_2FLOP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_2FLOP] }, [ POWER7_PME_PM_GCT_FULL_CYC ] = { .pme_name = "PM_GCT_FULL_CYC", .pme_code = 0x4086, .pme_short_desc = "Cycles No room in EAT", .pme_long_desc = "The Global Completion Table is completely
full.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_GCT_FULL_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_GCT_FULL_CYC] }, [ POWER7_PME_PM_MRK_DATA_FROM_L3_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L3_CYC", .pme_code = 0x40020, .pme_short_desc = "Marked ld latency Data source 0001 (L3)", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_L3_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_L3_CYC] }, [ POWER7_PME_PM_LSU_SRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_SRQ_S0_ALLOC", .pme_code = 0xd09d, .pme_short_desc = "Slot 0 of SRQ valid", .pme_long_desc = "Slot 0 of SRQ valid", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_SRQ_S0_ALLOC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_SRQ_S0_ALLOC] }, [ POWER7_PME_PM_MRK_DERAT_MISS_4K ] = { .pme_name = "PM_MRK_DERAT_MISS_4K", .pme_code = 0x1d05c, .pme_short_desc = "Marked DERAT misses for 4K page", .pme_long_desc = "A marked data request (load or store) missed the ERAT for 4K page and resulted in an ERAT reload.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DERAT_MISS_4K], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DERAT_MISS_4K] }, [ POWER7_PME_PM_BR_MPRED_TA ] = { .pme_name = "PM_BR_MPRED_TA", .pme_code = 0x40ae, .pme_short_desc = "Branch mispredict - target address", .pme_long_desc = "A branch instruction target was incorrectly predicted. 
This will result in a branch mispredict flush unless a flush is detected from an older instruction.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BR_MPRED_TA], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BR_MPRED_TA] }, [ POWER7_PME_PM_INST_PTEG_FROM_L2MISS ] = { .pme_name = "PM_INST_PTEG_FROM_L2MISS", .pme_code = 0x4e058, .pme_short_desc = "Instruction PTEG loaded from L2 miss", .pme_long_desc = "Instruction PTEG loaded from L2 miss", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_PTEG_FROM_L2MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_PTEG_FROM_L2MISS] }, [ POWER7_PME_PM_DPU_HELD_POWER ] = { .pme_name = "PM_DPU_HELD_POWER", .pme_code = 0x20006, .pme_short_desc = "Dispatch Held due to Power Management", .pme_long_desc = "Cycles that Instruction Dispatch was held due to power management. More than one hold condition can exist at the same time", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DPU_HELD_POWER], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DPU_HELD_POWER] }, [ POWER7_PME_PM_RUN_INST_CMPL ] = { .pme_name = "PM_RUN_INST_CMPL", .pme_code = 0x400fa, .pme_short_desc = "Run_Instructions", .pme_long_desc = "Number of run instructions completed. ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_RUN_INST_CMPL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_RUN_INST_CMPL] }, [ POWER7_PME_PM_MRK_VSU_FIN ] = { .pme_name = "PM_MRK_VSU_FIN", .pme_code = 0x30032, .pme_short_desc = "vsu (fpu) marked instr finish", .pme_long_desc = "vsu (fpu) marked instr finish", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_VSU_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_VSU_FIN] }, [ POWER7_PME_PM_LSU_SRQ_S0_VALID ] = { .pme_name = "PM_LSU_SRQ_S0_VALID", .pme_code = 0xd09c, .pme_short_desc = "Slot 0 of SRQ valid", .pme_long_desc = "This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. 
In SMT mode the SRQ is split between the two threads (16 entries each).", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_SRQ_S0_VALID], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_SRQ_S0_VALID] }, [ POWER7_PME_PM_GCT_EMPTY_CYC ] = { .pme_name = "PM_GCT_EMPTY_CYC", .pme_code = 0x20008, .pme_short_desc = "GCT empty", .pme_long_desc = " all threads", .pme_event_ids = power7_event_ids[POWER7_PME_PM_GCT_EMPTY_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_GCT_EMPTY_CYC] }, [ POWER7_PME_PM_IOPS_DISP ] = { .pme_name = "PM_IOPS_DISP", .pme_code = 0x30014, .pme_short_desc = "IOPS dispatched", .pme_long_desc = "IOPS dispatched", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IOPS_DISP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IOPS_DISP] }, [ POWER7_PME_PM_RUN_SPURR ] = { .pme_name = "PM_RUN_SPURR", .pme_code = 0x10008, .pme_short_desc = "Run SPURR", .pme_long_desc = "Run SPURR", .pme_event_ids = power7_event_ids[POWER7_PME_PM_RUN_SPURR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_RUN_SPURR] }, [ POWER7_PME_PM_PTEG_FROM_L21_MOD ] = { .pme_name = "PM_PTEG_FROM_L21_MOD", .pme_code = 0x3c056, .pme_short_desc = "PTEG loaded from another L2 on same chip modified", .pme_long_desc = "PTEG loaded from another L2 on same chip modified", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PTEG_FROM_L21_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PTEG_FROM_L21_MOD] }, [ POWER7_PME_PM_VSU0_1FLOP ] = { .pme_name = "PM_VSU0_1FLOP", .pme_code = 0xa080, .pme_short_desc = "one flop (fadd", .pme_long_desc = " fmul", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_1FLOP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_1FLOP] }, [ POWER7_PME_PM_SNOOP_TLBIE ] = { .pme_name = "PM_SNOOP_TLBIE", .pme_code = 0xd0b2, .pme_short_desc = "TLBIE snoop", .pme_long_desc = "A tlbie was snooped from another processor.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_SNOOP_TLBIE], .pme_group_vector = 
power7_group_vecs[POWER7_PME_PM_SNOOP_TLBIE] }, [ POWER7_PME_PM_DATA_FROM_L3MISS ] = { .pme_name = "PM_DATA_FROM_L3MISS", .pme_code = 0x2c048, .pme_short_desc = "Demand LD - L3 Miss (not L2 hit and not L3 hit)", .pme_long_desc = "The processor's Data Cache was reloaded from beyond L3 due to a demand load", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DATA_FROM_L3MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DATA_FROM_L3MISS] }, [ POWER7_PME_PM_VSU_SINGLE ] = { .pme_name = "PM_VSU_SINGLE", .pme_code = 0xa8a8, .pme_short_desc = "Vector or Scalar single precision", .pme_long_desc = "Vector or Scalar single precision", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_SINGLE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_SINGLE] }, [ POWER7_PME_PM_DTLB_MISS_16G ] = { .pme_name = "PM_DTLB_MISS_16G", .pme_code = 0x1c05e, .pme_short_desc = "Data TLB miss for 16G page", .pme_long_desc = "Data TLB references to 16GB pages that missed the TLB. Page size is determined at TLB reload time.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DTLB_MISS_16G], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DTLB_MISS_16G] }, [ POWER7_PME_PM_CMPLU_STALL_VECTOR ] = { .pme_name = "PM_CMPLU_STALL_VECTOR", .pme_code = 0x2001c, .pme_short_desc = "Completion stall caused by Vector instruction", .pme_long_desc = "Completion stall caused by Vector instruction", .pme_event_ids = power7_event_ids[POWER7_PME_PM_CMPLU_STALL_VECTOR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_CMPLU_STALL_VECTOR] }, [ POWER7_PME_PM_FLUSH ] = { .pme_name = "PM_FLUSH", .pme_code = 0x400f8, .pme_short_desc = "Flush (any type)", .pme_long_desc = "Flushes occurred including LSU and Branch flushes.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FLUSH], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FLUSH] }, [ POWER7_PME_PM_L2_LD_HIT ] = { .pme_name = "PM_L2_LD_HIT", .pme_code = 0x36182, .pme_short_desc = "All successful load dispatches that were L2 hits", .pme_long_desc 
= "A load request (data or instruction) hit in the L2 directory. Includes speculative, prefetched, and demand requests. This event includes all requests to this L2 from all sources. Total for all slices", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_LD_HIT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_LD_HIT] }, [ POWER7_PME_PM_NEST_2 ] = { .pme_name = "PM_NEST_2", .pme_code = 0x83, .pme_short_desc = "PlaceHolder for Nest events (MC0/MC1/PB/GX)", .pme_long_desc = "PlaceHolder for Nest events (MC0/MC1/PB/GX)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_NEST_2], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_NEST_2] }, [ POWER7_PME_PM_VSU1_1FLOP ] = { .pme_name = "PM_VSU1_1FLOP", .pme_code = 0xa082, .pme_short_desc = "one flop (fadd", .pme_long_desc = " fmul", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_1FLOP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_1FLOP] }, [ POWER7_PME_PM_IC_PREF_REQ ] = { .pme_name = "PM_IC_PREF_REQ", .pme_code = 0x408a, .pme_short_desc = "Instruction prefetch requests", .pme_long_desc = "An instruction prefetch request has been made.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IC_PREF_REQ], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IC_PREF_REQ] }, [ POWER7_PME_PM_L3_LD_HIT ] = { .pme_name = "PM_L3_LD_HIT", .pme_code = 0x2f080, .pme_short_desc = "L3 demand LD Hits", .pme_long_desc = "L3 demand LD Hits", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L3_LD_HIT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L3_LD_HIT] }, [ POWER7_PME_PM_GCT_NOSLOT_IC_MISS ] = { .pme_name = "PM_GCT_NOSLOT_IC_MISS", .pme_code = 0x2001a, .pme_short_desc = "GCT empty by I cache miss", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread because of an Instruction Cache miss.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_GCT_NOSLOT_IC_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_GCT_NOSLOT_IC_MISS] }, [ POWER7_PME_PM_DISP_HELD ] = { .pme_name 
= "PM_DISP_HELD", .pme_code = 0x10006, .pme_short_desc = "Dispatch Held", .pme_long_desc = "Dispatch Held", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DISP_HELD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DISP_HELD] }, [ POWER7_PME_PM_L2_LD ] = { .pme_name = "PM_L2_LD", .pme_code = 0x16080, .pme_short_desc = "Data Load Count", .pme_long_desc = "Data Load Count", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_LD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_LD] }, [ POWER7_PME_PM_LSU_FLUSH_SRQ ] = { .pme_name = "PM_LSU_FLUSH_SRQ", .pme_code = 0xc8bc, .pme_short_desc = "Flush: SRQ", .pme_long_desc = "Load Hit Store flush. A younger load was flushed because it hits (overlaps) an older store that is already in the SRQ or in the same group. If the real addresses match but the effective addresses do not, an alias condition exists that prevents store forwarding. If the load and store are in the same group the load must be flushed to separate the two instructions. Combined Unit 0 + 1.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_FLUSH_SRQ], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_FLUSH_SRQ] }, [ POWER7_PME_PM_MRK_DATA_FROM_L31_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L31_MOD_CYC", .pme_code = 0x40026, .pme_short_desc = "Marked ld latency Data source 0111 (L3.1 M same chip)", .pme_long_desc = "Marked ld latency Data source 0111 (L3.1 M same chip)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_L31_MOD_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_L31_MOD_CYC] }, [ POWER7_PME_PM_L2_RCST_BUSY_RC_FULL ] = { .pme_name = "PM_L2_RCST_BUSY_RC_FULL", .pme_code = 0x26282, .pme_short_desc = " L2 activated Busy to the core for stores due to all RC full", .pme_long_desc = " L2 activated Busy to the core for stores due to all RC full", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_RCST_BUSY_RC_FULL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_RCST_BUSY_RC_FULL] }, [ 
POWER7_PME_PM_TB_BIT_TRANS ] = { .pme_name = "PM_TB_BIT_TRANS", .pme_code = 0x300f8, .pme_short_desc = "Time Base bit transition", .pme_long_desc = "When the selected time base bit (as specified in MMCR0[TBSEL])transitions from 0 to 1 ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_TB_BIT_TRANS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_TB_BIT_TRANS] }, [ POWER7_PME_PM_THERMAL_MAX ] = { .pme_name = "PM_THERMAL_MAX", .pme_code = 0x40006, .pme_short_desc = "Processor In Thermal MAX", .pme_long_desc = "The processor experienced a thermal overload condition. This bit is sticky, it remains set until cleared by software.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_THERMAL_MAX], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_THERMAL_MAX] }, [ POWER7_PME_PM_LSU1_FLUSH_ULD ] = { .pme_name = "PM_LSU1_FLUSH_ULD", .pme_code = 0xc0b2, .pme_short_desc = "LS 1 Flush: Unaligned Load", .pme_long_desc = "A load was flushed from unit 1 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1).", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU1_FLUSH_ULD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU1_FLUSH_ULD] }, [ POWER7_PME_PM_LSU1_REJECT_LHS ] = { .pme_name = "PM_LSU1_REJECT_LHS", .pme_code = 0xc0ae, .pme_short_desc = "LS1 Reject: Load Hit Store", .pme_long_desc = "Load Store Unit 1 rejected a load instruction that had an address overlap with an older store in the store queue. 
The store must be committed and de-allocated from the Store Queue before the load can execute successfully.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU1_REJECT_LHS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU1_REJECT_LHS] }, [ POWER7_PME_PM_LSU_LRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LRQ_S0_ALLOC", .pme_code = 0xd09f, .pme_short_desc = "Slot 0 of LRQ valid", .pme_long_desc = "Slot 0 of LRQ valid", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_LRQ_S0_ALLOC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_LRQ_S0_ALLOC] }, [ POWER7_PME_PM_POWER_EVENT4 ] = { .pme_name = "PM_POWER_EVENT4", .pme_code = 0x4006e, .pme_short_desc = "Power Management Event 4", .pme_long_desc = "Power Management Event 4", .pme_event_ids = power7_event_ids[POWER7_PME_PM_POWER_EVENT4], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_POWER_EVENT4] }, [ POWER7_PME_PM_DATA_FROM_L31_SHR ] = { .pme_name = "PM_DATA_FROM_L31_SHR", .pme_code = 0x1c04e, .pme_short_desc = "Data loaded from another L3 on same chip shared", .pme_long_desc = "Data loaded from another L3 on same chip shared", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DATA_FROM_L31_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DATA_FROM_L31_SHR] }, [ POWER7_PME_PM_BR_UNCOND ] = { .pme_name = "PM_BR_UNCOND", .pme_code = 0x409e, .pme_short_desc = "Unconditional Branch", .pme_long_desc = "An unconditional branch was executed.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BR_UNCOND], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BR_UNCOND] }, [ POWER7_PME_PM_LSU1_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_LSU1_DC_PREF_STREAM_ALLOC", .pme_code = 0xd0aa, .pme_short_desc = "LS 1 D cache new prefetch stream allocated", .pme_long_desc = "LS 1 D cache new prefetch stream allocated", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU1_DC_PREF_STREAM_ALLOC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU1_DC_PREF_STREAM_ALLOC] }, [ POWER7_PME_PM_PMC4_REWIND ] = { .pme_name = 
"PM_PMC4_REWIND", .pme_code = 0x10020, .pme_short_desc = "PMC4 Rewind Event", .pme_long_desc = "PMC4 was counting speculatively. The speculative condition was not met and the counter was restored to its previous value.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PMC4_REWIND], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PMC4_REWIND] }, [ POWER7_PME_PM_L2_RCLD_DISP ] = { .pme_name = "PM_L2_RCLD_DISP", .pme_code = 0x16280, .pme_short_desc = " L2 RC load dispatch attempt", .pme_long_desc = " L2 RC load dispatch attempt", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_RCLD_DISP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_RCLD_DISP] }, [ POWER7_PME_PM_THRD_PRIO_2_3_CYC ] = { .pme_name = "PM_THRD_PRIO_2_3_CYC", .pme_code = 0x40b2, .pme_short_desc = " Cycles thread running at priority level 2 or 3", .pme_long_desc = " Cycles thread running at priority level 2 or 3", .pme_event_ids = power7_event_ids[POWER7_PME_PM_THRD_PRIO_2_3_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_THRD_PRIO_2_3_CYC] }, [ POWER7_PME_PM_MRK_PTEG_FROM_L2MISS ] = { .pme_name = "PM_MRK_PTEG_FROM_L2MISS", .pme_code = 0x4d058, .pme_short_desc = "Marked PTEG loaded from L2 miss", .pme_long_desc = "A Page Table Entry was loaded into the ERAT but not from the local L2 due to a marked load or store.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_PTEG_FROM_L2MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_PTEG_FROM_L2MISS] }, [ POWER7_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BHT_REDIRECT", .pme_code = 0x4098, .pme_short_desc = " L2 I cache demand request due to BHT redirect", .pme_long_desc = "A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (CR mispredict).", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IC_DEMAND_L2_BHT_REDIRECT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IC_DEMAND_L2_BHT_REDIRECT] }, [ POWER7_PME_PM_LSU_DERAT_MISS ] = { 
.pme_name = "PM_LSU_DERAT_MISS", .pme_code = 0x200f6, .pme_short_desc = "DERAT Reloaded due to a DERAT miss", .pme_long_desc = "Total D-ERAT Misses. Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction. Combined Unit 0 + 1.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_DERAT_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_DERAT_MISS] }, [ POWER7_PME_PM_IC_PREF_CANCEL_L2 ] = { .pme_name = "PM_IC_PREF_CANCEL_L2", .pme_code = 0x4094, .pme_short_desc = "L2 Squashed request", .pme_long_desc = "L2 Squashed request", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IC_PREF_CANCEL_L2], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IC_PREF_CANCEL_L2] }, [ POWER7_PME_PM_GCT_UTIL_7TO10_SLOT ] = { .pme_name = "PM_GCT_UTIL_7-10_SLOT", .pme_code = 0x20a0, .pme_short_desc = "GCT Utilization 7-10 entries", .pme_long_desc = "GCT Utilization 7-10 entries", .pme_event_ids = power7_event_ids[POWER7_PME_PM_GCT_UTIL_7TO10_SLOT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_GCT_UTIL_7TO10_SLOT] }, [ POWER7_PME_PM_MRK_FIN_STALL_CYC_COUNT ] = { .pme_name = "PM_MRK_FIN_STALL_CYC_COUNT", .pme_code = 0x1003d, .pme_short_desc = "Marked instruction Finish Stall cycles (marked finish after NTC) (use edge detect to count #)", .pme_long_desc = "Marked instruction Finish Stall cycles (marked finish after NTC) (use edge detect to count #)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_FIN_STALL_CYC_COUNT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_FIN_STALL_CYC_COUNT] }, [ POWER7_PME_PM_BR_PRED_CCACHE ] = { .pme_name = "PM_BR_PRED_CCACHE", .pme_code = 0x40a0, .pme_short_desc = "Count Cache Predictions", .pme_long_desc = "The count value of a Branch and Count instruction was predicted", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BR_PRED_CCACHE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BR_PRED_CCACHE] }, [ 
POWER7_PME_PM_MRK_ST_CMPL_INT ] = { .pme_name = "PM_MRK_ST_CMPL_INT", .pme_code = 0x30034, .pme_short_desc = "marked store complete (data home) with intervention", .pme_long_desc = "A marked store previously sent to the memory subsystem completed (data home) after requiring intervention", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_ST_CMPL_INT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_ST_CMPL_INT] }, [ POWER7_PME_PM_LSU_TWO_TABLEWALK_CYC ] = { .pme_name = "PM_LSU_TWO_TABLEWALK_CYC", .pme_code = 0xd0a6, .pme_short_desc = "Cycles when two tablewalks pending on this thread", .pme_long_desc = "Cycles when two tablewalks pending on this thread", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_TWO_TABLEWALK_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_TWO_TABLEWALK_CYC] }, [ POWER7_PME_PM_MRK_DATA_FROM_L3MISS ] = { .pme_name = "PM_MRK_DATA_FROM_L3MISS", .pme_code = 0x2d048, .pme_short_desc = "Marked data loaded from L3 miss", .pme_long_desc = "DL1 was reloaded from beyond L3 due to a marked load.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_L3MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_L3MISS] }, [ POWER7_PME_PM_GCT_NOSLOT_CYC ] = { .pme_name = "PM_GCT_NOSLOT_CYC", .pme_code = 0x100f8, .pme_short_desc = "No itags assigned ", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_GCT_NOSLOT_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_GCT_NOSLOT_CYC] }, [ POWER7_PME_PM_LSU_SET_MPRED ] = { .pme_name = "PM_LSU_SET_MPRED", .pme_code = 0xc0a8, .pme_short_desc = "Line already in cache at reload time", .pme_long_desc = "Line already in cache at reload time", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_SET_MPRED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_SET_MPRED] }, [ POWER7_PME_PM_FLUSH_DISP_TLBIE ] = { .pme_name = "PM_FLUSH_DISP_TLBIE", .pme_code = 0x208a, 
.pme_short_desc = "Dispatch Flush: TLBIE", .pme_long_desc = "Dispatch Flush: TLBIE", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FLUSH_DISP_TLBIE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FLUSH_DISP_TLBIE] }, [ POWER7_PME_PM_VSU1_FCONV ] = { .pme_name = "PM_VSU1_FCONV", .pme_code = 0xa0b2, .pme_short_desc = "Convert instruction executed", .pme_long_desc = "Convert instruction executed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_FCONV], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_FCONV] }, [ POWER7_PME_PM_NEST_1 ] = { .pme_name = "PM_NEST_1", .pme_code = 0x81, .pme_short_desc = "PlaceHolder for Nest events (MC0/MC1/PB/GX)", .pme_long_desc = "PlaceHolder for Nest events (MC0/MC1/PB/GX)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_NEST_1], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_NEST_1] }, [ POWER7_PME_PM_DERAT_MISS_16G ] = { .pme_name = "PM_DERAT_MISS_16G", .pme_code = 0x4c05c, .pme_short_desc = "DERAT misses for 16G page", .pme_long_desc = "A data request (load or store) missed the ERAT for 16G page and resulted in an ERAT reload.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DERAT_MISS_16G], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DERAT_MISS_16G] }, [ POWER7_PME_PM_INST_FROM_LMEM ] = { .pme_name = "PM_INST_FROM_LMEM", .pme_code = 0x3404a, .pme_short_desc = "Instruction fetched from local memory", .pme_long_desc = "An instruction fetch group was fetched from memory attached to the same module this proccessor is located on. 
Fetch groups can contain up to 8 instructions", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_LMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_LMEM] }, [ POWER7_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BR_REDIRECT", .pme_code = 0x409a, .pme_short_desc = " L2 I cache demand request due to branch redirect", .pme_long_desc = "A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (either ALL mispredicted or Target).", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IC_DEMAND_L2_BR_REDIRECT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IC_DEMAND_L2_BR_REDIRECT] }, [ POWER7_PME_PM_CMPLU_STALL_SCALAR_LONG ] = { .pme_name = "PM_CMPLU_STALL_SCALAR_LONG", .pme_code = 0x20018, .pme_short_desc = "Completion stall caused by long latency scalar instruction", .pme_long_desc = "Completion stall caused by long latency scalar instruction", .pme_event_ids = power7_event_ids[POWER7_PME_PM_CMPLU_STALL_SCALAR_LONG], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_CMPLU_STALL_SCALAR_LONG] }, [ POWER7_PME_PM_INST_PTEG_FROM_L2 ] = { .pme_name = "PM_INST_PTEG_FROM_L2", .pme_code = 0x1e050, .pme_short_desc = "Instruction PTEG loaded from L2", .pme_long_desc = "Instruction PTEG loaded from L2", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_PTEG_FROM_L2], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_PTEG_FROM_L2] }, [ POWER7_PME_PM_PTEG_FROM_L2 ] = { .pme_name = "PM_PTEG_FROM_L2", .pme_code = 0x1c050, .pme_short_desc = "PTEG loaded from L2", .pme_long_desc = "A Page Table Entry was loaded into the ERAT from the local L2 due to a demand load or store.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PTEG_FROM_L2], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PTEG_FROM_L2] }, [ POWER7_PME_PM_MRK_DATA_FROM_L21_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L21_SHR_CYC", .pme_code = 0x20024, .pme_short_desc = "Marked ld latency Data source 
0100 (L2.1 S)", .pme_long_desc = "Marked load latency Data source 0100 (L2.1 S)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_L21_SHR_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_L21_SHR_CYC] }, [ POWER7_PME_PM_MRK_DTLB_MISS_4K ] = { .pme_name = "PM_MRK_DTLB_MISS_4K", .pme_code = 0x2d05a, .pme_short_desc = "Marked Data TLB misses for 4K page", .pme_long_desc = "Data TLB references to 4KB pages by a marked instruction that missed the TLB. Page size is determined at TLB reload time.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DTLB_MISS_4K], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DTLB_MISS_4K] }, [ POWER7_PME_PM_VSU0_FPSCR ] = { .pme_name = "PM_VSU0_FPSCR", .pme_code = 0xb09c, .pme_short_desc = "Move to/from FPSCR type instruction issued on Pipe 0", .pme_long_desc = "Move to/from FPSCR type instruction issued on Pipe 0", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_FPSCR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_FPSCR] }, [ POWER7_PME_PM_VSU1_VECT_DOUBLE_ISSUED ] = { .pme_name = "PM_VSU1_VECT_DOUBLE_ISSUED", .pme_code = 0xb082, .pme_short_desc = "Double Precision vector instruction issued on Pipe1", .pme_long_desc = "Double Precision vector instruction issued on Pipe1", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_VECT_DOUBLE_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_VECT_DOUBLE_ISSUED] }, [ POWER7_PME_PM_MRK_PTEG_FROM_RL2L3_MOD ] = { .pme_name = "PM_MRK_PTEG_FROM_RL2L3_MOD", .pme_code = 0x1d052, .pme_short_desc = "Marked PTEG loaded from remote L2 or L3 modified", .pme_long_desc = "A Page Table Entry was loaded into the ERAT with shared (T or SL) data from an L2 or L3 on a remote module due to a marked load or store.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_PTEG_FROM_RL2L3_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_PTEG_FROM_RL2L3_MOD] }, [ POWER7_PME_PM_L2_LD_MISS ] = { .pme_name = "PM_L2_LD_MISS", .pme_code 
= 0x26080, .pme_short_desc = "Data Load Miss", .pme_long_desc = "Data Load Miss", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_LD_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_LD_MISS] }, [ POWER7_PME_PM_VMX_RESULT_SAT_1 ] = { .pme_name = "PM_VMX_RESULT_SAT_1", .pme_code = 0xb0a0, .pme_short_desc = "Valid result with sat=1", .pme_long_desc = "Valid result with sat=1", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VMX_RESULT_SAT_1], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VMX_RESULT_SAT_1] }, [ POWER7_PME_PM_L1_PREF ] = { .pme_name = "PM_L1_PREF", .pme_code = 0xd8b8, .pme_short_desc = "L1 Prefetches", .pme_long_desc = "A request to prefetch data into the L1 was made", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L1_PREF], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L1_PREF] }, [ POWER7_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM_CYC", .pme_code = 0x2002c, .pme_short_desc = "Marked ld latency Data Source 1100 (Local Memory)", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_LMEM_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_LMEM_CYC] }, [ POWER7_PME_PM_GRP_IC_MISS_NONSPEC ] = { .pme_name = "PM_GRP_IC_MISS_NONSPEC", .pme_code = 0x1000c, .pme_short_desc = "Group experienced non-speculative I cache miss", .pme_long_desc = "Number of groups, counted at completion, that have encountered an instruction cache miss.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_GRP_IC_MISS_NONSPEC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_GRP_IC_MISS_NONSPEC] }, [ POWER7_PME_PM_SHL_MERGED ] = { .pme_name = "PM_SHL_MERGED", .pme_code = 0x5084, .pme_short_desc = "SHL table entry merged with existing", .pme_long_desc = "SHL table entry merged with existing", .pme_event_ids = power7_event_ids[POWER7_PME_PM_SHL_MERGED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_SHL_MERGED] }, [ POWER7_PME_PM_DATA_FROM_L3 ] = { .pme_name = "PM_DATA_FROM_L3", .pme_code = 0x1c048, .pme_short_desc = "Data loaded from L3", .pme_long_desc = "The processor's Data Cache was reloaded from the local L3 due to a demand load.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DATA_FROM_L3], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DATA_FROM_L3] }, [ POWER7_PME_PM_LSU_FLUSH ] = { .pme_name = "PM_LSU_FLUSH", .pme_code = 0x208e, .pme_short_desc = "Flush initiated by LSU", .pme_long_desc = "A flush was initiated by the Load Store Unit.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_FLUSH], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_FLUSH] }, [ POWER7_PME_PM_LSU_SRQ_SYNC_COUNT ] = { .pme_name = "PM_LSU_SRQ_SYNC_COUNT", .pme_code = 0xd097, .pme_short_desc = "SRQ sync count (edge of PM_LSU_SRQ_SYNC_CYC)", .pme_long_desc = "SRQ sync count (edge of PM_LSU_SRQ_SYNC_CYC)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_SRQ_SYNC_COUNT], .pme_group_vector = 
power7_group_vecs[POWER7_PME_PM_LSU_SRQ_SYNC_COUNT] }, [ POWER7_PME_PM_PMC2_OVERFLOW ] = { .pme_name = "PM_PMC2_OVERFLOW", .pme_code = 0x30010, .pme_short_desc = "Overflow from counter 2", .pme_long_desc = "Overflows from PMC2 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PMC2_OVERFLOW], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PMC2_OVERFLOW] }, [ POWER7_PME_PM_LSU_LDF ] = { .pme_name = "PM_LSU_LDF", .pme_code = 0xc884, .pme_short_desc = "All Scalar Loads", .pme_long_desc = "LSU executed Floating Point load instruction. Combined Unit 0 + 1.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_LDF], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_LDF] }, [ POWER7_PME_PM_POWER_EVENT3 ] = { .pme_name = "PM_POWER_EVENT3", .pme_code = 0x3006e, .pme_short_desc = "Power Management Event 3", .pme_long_desc = "Power Management Event 3", .pme_event_ids = power7_event_ids[POWER7_PME_PM_POWER_EVENT3], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_POWER_EVENT3] }, [ POWER7_PME_PM_DISP_WT ] = { .pme_name = "PM_DISP_WT", .pme_code = 0x30008, .pme_short_desc = "Dispatched Starved (not held", .pme_long_desc = " nothing to dispatch)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DISP_WT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DISP_WT] }, [ POWER7_PME_PM_CMPLU_STALL_REJECT ] = { .pme_name = "PM_CMPLU_STALL_REJECT", .pme_code = 0x40016, .pme_short_desc = "Completion stall caused by reject", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered a load/store reject. 
This is a subset of PM_CMPLU_STALL_LSU.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_CMPLU_STALL_REJECT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_CMPLU_STALL_REJECT] }, [ POWER7_PME_PM_IC_BANK_CONFLICT ] = { .pme_name = "PM_IC_BANK_CONFLICT", .pme_code = 0x4082, .pme_short_desc = "Read blocked due to interleave conflict. ", .pme_long_desc = "Read blocked due to interleave conflict. ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IC_BANK_CONFLICT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IC_BANK_CONFLICT] }, [ POWER7_PME_PM_BR_MPRED_CR_TA ] = { .pme_name = "PM_BR_MPRED_CR_TA", .pme_code = 0x48ae, .pme_short_desc = "Branch mispredict - taken/not taken and target", .pme_long_desc = "Branch mispredict - taken/not taken and target", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BR_MPRED_CR_TA], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BR_MPRED_CR_TA] }, [ POWER7_PME_PM_L2_INST_MISS ] = { .pme_name = "PM_L2_INST_MISS", .pme_code = 0x36082, .pme_short_desc = "Instruction Load Misses", .pme_long_desc = "Instruction Load Misses", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_INST_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_INST_MISS] }, [ POWER7_PME_PM_CMPLU_STALL_ERAT_MISS ] = { .pme_name = "PM_CMPLU_STALL_ERAT_MISS", .pme_code = 0x40018, .pme_short_desc = "Completion stall caused by ERAT miss", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered an ERAT miss. 
This is a subset of PM_CMPLU_STALL_REJECT.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_CMPLU_STALL_ERAT_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_CMPLU_STALL_ERAT_MISS] }, [ POWER7_PME_PM_MRK_LSU_FLUSH ] = { .pme_name = "PM_MRK_LSU_FLUSH", .pme_code = 0xd08c, .pme_short_desc = "Flush: (marked) : All Cases", .pme_long_desc = "Marked flush initiated by LSU", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_LSU_FLUSH], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_LSU_FLUSH] }, [ POWER7_PME_PM_L2_LDST ] = { .pme_name = "PM_L2_LDST", .pme_code = 0x16880, .pme_short_desc = "Data Load+Store Count", .pme_long_desc = "Data Load+Store Count", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_LDST], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_LDST] }, [ POWER7_PME_PM_INST_FROM_L31_SHR ] = { .pme_name = "PM_INST_FROM_L31_SHR", .pme_code = 0x1404e, .pme_short_desc = "Instruction fetched from another L3 on same chip shared", .pme_long_desc = "Instruction fetched from another L3 on same chip shared", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_L31_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_L31_SHR] }, [ POWER7_PME_PM_VSU0_FIN ] = { .pme_name = "PM_VSU0_FIN", .pme_code = 0xa0bc, .pme_short_desc = "VSU0 Finished an instruction", .pme_long_desc = "VSU0 Finished an instruction", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_FIN] }, [ POWER7_PME_PM_LARX_LSU ] = { .pme_name = "PM_LARX_LSU", .pme_code = 0xc894, .pme_short_desc = "Larx Finished", .pme_long_desc = "Larx Finished", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LARX_LSU], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LARX_LSU] }, [ POWER7_PME_PM_INST_FROM_RMEM ] = { .pme_name = "PM_INST_FROM_RMEM", .pme_code = 0x34042, .pme_short_desc = "Instruction fetched from remote memory", .pme_long_desc = "An instruction fetch group was fetched from memory attached to a 
different module than this processor is located on. Fetch groups can contain up to 8 instructions.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_RMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_RMEM] }, [ POWER7_PME_PM_DISP_CLB_HELD_TLBIE ] = { .pme_name = "PM_DISP_CLB_HELD_TLBIE", .pme_code = 0x2096, .pme_short_desc = "Dispatch Hold: Due to TLBIE", .pme_long_desc = "Dispatch Hold: Due to TLBIE", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DISP_CLB_HELD_TLBIE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DISP_CLB_HELD_TLBIE] }, [ POWER7_PME_PM_MRK_DATA_FROM_DMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_DMEM_CYC", .pme_code = 0x2002e, .pme_short_desc = "Marked ld latency Data Source 1110 (Distant Memory)", .pme_long_desc = "Marked ld latency Data Source 1110 (Distant Memory)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_DMEM_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_DMEM_CYC] }, [ POWER7_PME_PM_BR_PRED_CR ] = { .pme_name = "PM_BR_PRED_CR", .pme_code = 0x40a8, .pme_short_desc = "Branch predict - taken/not taken", .pme_long_desc = "A conditional branch instruction was predicted as taken or not taken.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BR_PRED_CR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BR_PRED_CR] }, [ POWER7_PME_PM_LSU_REJECT ] = { .pme_name = "PM_LSU_REJECT", .pme_code = 0x10064, .pme_short_desc = "LSU Reject (up to 2 per cycle)", .pme_long_desc = "The Load Store Unit rejected an instruction. 
Combined Unit 0 + 1", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_REJECT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_REJECT] }, [ POWER7_PME_PM_CMPLU_STALL_END_GCT_NOSLOT ] = { .pme_name = "PM_CMPLU_STALL_END_GCT_NOSLOT", .pme_code = 0x10028, .pme_short_desc = "Count ended because GCT went empty", .pme_long_desc = "Count ended because GCT went empty", .pme_event_ids = power7_event_ids[POWER7_PME_PM_CMPLU_STALL_END_GCT_NOSLOT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_CMPLU_STALL_END_GCT_NOSLOT] }, [ POWER7_PME_PM_LSU0_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU0_REJECT_LMQ_FULL", .pme_code = 0xc0a4, .pme_short_desc = "LS0 Reject: LMQ Full (LHR)", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all eight entries are full, subsequent load instructions are rejected.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU0_REJECT_LMQ_FULL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU0_REJECT_LMQ_FULL] }, [ POWER7_PME_PM_VSU_FEST ] = { .pme_name = "PM_VSU_FEST", .pme_code = 0xa8b8, .pme_short_desc = "Estimate instruction executed", .pme_long_desc = "Estimate instruction executed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_FEST], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_FEST] }, [ POWER7_PME_PM_PTEG_FROM_L3 ] = { .pme_name = "PM_PTEG_FROM_L3", .pme_code = 0x2c050, .pme_short_desc = "PTEG loaded from L3", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local L3 due to a demand load.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PTEG_FROM_L3], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PTEG_FROM_L3] }, [ POWER7_PME_PM_POWER_EVENT2 ] = { .pme_name = "PM_POWER_EVENT2", .pme_code = 0x2006e, .pme_short_desc = "Power Management Event 2", .pme_long_desc = "Power Management Event 2", .pme_event_ids = power7_event_ids[POWER7_PME_PM_POWER_EVENT2], .pme_group_vector = 
power7_group_vecs[POWER7_PME_PM_POWER_EVENT2] }, [ POWER7_PME_PM_IC_PREF_CANCEL_PAGE ] = { .pme_name = "PM_IC_PREF_CANCEL_PAGE", .pme_code = 0x4090, .pme_short_desc = "Prefetch Canceled due to page boundary", .pme_long_desc = "Prefetch Canceled due to page boundary", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IC_PREF_CANCEL_PAGE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IC_PREF_CANCEL_PAGE] }, [ POWER7_PME_PM_VSU0_FSQRT_FDIV ] = { .pme_name = "PM_VSU0_FSQRT_FDIV", .pme_code = 0xa088, .pme_short_desc = "four flops operation (fdiv", .pme_long_desc = "fsqrt", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_FSQRT_FDIV], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_FSQRT_FDIV] }, [ POWER7_PME_PM_MRK_GRP_CMPL ] = { .pme_name = "PM_MRK_GRP_CMPL", .pme_code = 0x40030, .pme_short_desc = "Marked group complete", .pme_long_desc = "A group containing a sampled instruction completed. Microcoded instructions that span multiple groups will generate this event once per group.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_GRP_CMPL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_GRP_CMPL] }, [ POWER7_PME_PM_VSU0_SCAL_DOUBLE_ISSUED ] = { .pme_name = "PM_VSU0_SCAL_DOUBLE_ISSUED", .pme_code = 0xb088, .pme_short_desc = "Double Precision scalar instruction issued on Pipe0", .pme_long_desc = "Double Precision scalar instruction issued on Pipe0", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_SCAL_DOUBLE_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_SCAL_DOUBLE_ISSUED] }, [ POWER7_PME_PM_GRP_DISP ] = { .pme_name = "PM_GRP_DISP", .pme_code = 0x3000a, .pme_short_desc = "dispatch_success (Group Dispatched)", .pme_long_desc = "A group was dispatched", .pme_event_ids = power7_event_ids[POWER7_PME_PM_GRP_DISP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_GRP_DISP] }, [ POWER7_PME_PM_LSU0_LDX ] = { .pme_name = "PM_LSU0_LDX", .pme_code = 0xc088, .pme_short_desc = "LS0 Vector Loads", .pme_long_desc = "LS0 Vector 
Loads", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU0_LDX], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU0_LDX] }, [ POWER7_PME_PM_DATA_FROM_L2 ] = { .pme_name = "PM_DATA_FROM_L2", .pme_code = 0x1c040, .pme_short_desc = "Data loaded from L2", .pme_long_desc = "The processor's Data Cache was reloaded from the local L2 due to a demand load.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DATA_FROM_L2], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DATA_FROM_L2] }, [ POWER7_PME_PM_MRK_DATA_FROM_RL2L3_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_MOD", .pme_code = 0x1d042, .pme_short_desc = "Marked data loaded from remote L2 or L3 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from an L2 or L3 on a remote module due to a marked load.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_RL2L3_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_RL2L3_MOD] }, [ POWER7_PME_PM_LD_REF_L1 ] = { .pme_name = "PM_LD_REF_L1", .pme_code = 0xc880, .pme_short_desc = " L1 D cache load references counted at finish", .pme_long_desc = " L1 D cache load references counted at finish", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LD_REF_L1], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LD_REF_L1] }, [ POWER7_PME_PM_VSU0_VECT_DOUBLE_ISSUED ] = { .pme_name = "PM_VSU0_VECT_DOUBLE_ISSUED", .pme_code = 0xb080, .pme_short_desc = "Double Precision vector instruction issued on Pipe0", .pme_long_desc = "Double Precision vector instruction issued on Pipe0", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_VECT_DOUBLE_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_VECT_DOUBLE_ISSUED] }, [ POWER7_PME_PM_VSU1_2FLOP_DOUBLE ] = { .pme_name = "PM_VSU1_2FLOP_DOUBLE", .pme_code = 0xa08e, .pme_short_desc = "two flop DP vector operation (xvadddp", .pme_long_desc = " xvmuldp", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_2FLOP_DOUBLE], .pme_group_vector = 
power7_group_vecs[POWER7_PME_PM_VSU1_2FLOP_DOUBLE] }, [ POWER7_PME_PM_THRD_PRIO_6_7_CYC ] = { .pme_name = "PM_THRD_PRIO_6_7_CYC", .pme_code = 0x40b6, .pme_short_desc = " Cycles thread running at priority level 6 or 7", .pme_long_desc = " Cycles thread running at priority level 6 or 7", .pme_event_ids = power7_event_ids[POWER7_PME_PM_THRD_PRIO_6_7_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_THRD_PRIO_6_7_CYC] }, [ POWER7_PME_PM_BR_MPRED_CR ] = { .pme_name = "PM_BR_MPRED_CR", .pme_code = 0x40ac, .pme_short_desc = "Branch mispredict - taken/not taken", .pme_long_desc = "A conditional branch instruction was incorrectly predicted as taken or not taken. The branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This will result in a branch redirect flush if not overridden by a flush of an older instruction.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BR_MPRED_CR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BR_MPRED_CR] }, [ POWER7_PME_PM_LD_MISS_L1 ] = { .pme_name = "PM_LD_MISS_L1", .pme_code = 0x400f0, .pme_short_desc = "Load Missed L1", .pme_long_desc = "Load references that miss the Level 1 Data cache. 
Combined unit 0 + 1.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LD_MISS_L1], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LD_MISS_L1] }, [ POWER7_PME_PM_DATA_FROM_RL2L3_MOD ] = { .pme_name = "PM_DATA_FROM_RL2L3_MOD", .pme_code = 0x1c042, .pme_short_desc = "Data loaded from remote L2 or L3 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from an L2 or L3 on a remote module due to a demand load", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DATA_FROM_RL2L3_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DATA_FROM_RL2L3_MOD] }, [ POWER7_PME_PM_LSU_SRQ_FULL_CYC ] = { .pme_name = "PM_LSU_SRQ_FULL_CYC", .pme_code = 0x1001a, .pme_short_desc = "Storage Queue is full and is blocking dispatch", .pme_long_desc = "Cycles the Store Request Queue is full.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_SRQ_FULL_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_SRQ_FULL_CYC] }, [ POWER7_PME_PM_TABLEWALK_CYC ] = { .pme_name = "PM_TABLEWALK_CYC", .pme_code = 0x10026, .pme_short_desc = "Cycles when a tablewalk (I or D) is active", .pme_long_desc = "Cycles doing instruction or data tablewalks", .pme_event_ids = power7_event_ids[POWER7_PME_PM_TABLEWALK_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_TABLEWALK_CYC] }, [ POWER7_PME_PM_MRK_PTEG_FROM_RMEM ] = { .pme_name = "PM_MRK_PTEG_FROM_RMEM", .pme_code = 0x3d052, .pme_short_desc = "Marked PTEG loaded from remote memory", .pme_long_desc = "A Page Table Entry was loaded into the TLB from remote memory due to a marked load or store.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_PTEG_FROM_RMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_PTEG_FROM_RMEM] }, [ POWER7_PME_PM_LSU_SRQ_STFWD ] = { .pme_name = "PM_LSU_SRQ_STFWD", .pme_code = 0xc8a0, .pme_short_desc = "Load got data from a store", .pme_long_desc = "Data from a store instruction was forwarded to a load. 
A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. A load that hits L1 but becomes a store forward is not treated as a load miss. Combined Unit 0 + 1.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_SRQ_STFWD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_SRQ_STFWD] }, [ POWER7_PME_PM_INST_PTEG_FROM_RMEM ] = { .pme_name = "PM_INST_PTEG_FROM_RMEM", .pme_code = 0x3e052, .pme_short_desc = "Instruction PTEG loaded from remote memory", .pme_long_desc = "Instruction PTEG loaded from remote memory", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_PTEG_FROM_RMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_PTEG_FROM_RMEM] }, [ POWER7_PME_PM_FXU0_FIN ] = { .pme_name = "PM_FXU0_FIN", .pme_code = 0x10004, .pme_short_desc = "FXU0 Finished", .pme_long_desc = "The Fixed Point unit 0 finished an instruction and produced a result. Instructions that finish may not necessarily complete.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FXU0_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FXU0_FIN] }, [ POWER7_PME_PM_PTEG_FROM_L31_MOD ] = { .pme_name = "PM_PTEG_FROM_L31_MOD", .pme_code = 0x1c054, .pme_short_desc = "PTEG loaded from another L3 on same chip modified", .pme_long_desc = "PTEG loaded from another L3 on same chip modified", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PTEG_FROM_L31_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PTEG_FROM_L31_MOD] }, [ POWER7_PME_PM_PMC5_OVERFLOW ] = { .pme_name = "PM_PMC5_OVERFLOW", .pme_code = 0x10024, .pme_short_desc = "Overflow from counter 5", .pme_long_desc = "Overflows from PMC5 are counted. This effectively widens the PMC. 
The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PMC5_OVERFLOW], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PMC5_OVERFLOW] }, [ POWER7_PME_PM_LD_REF_L1_LSU1 ] = { .pme_name = "PM_LD_REF_L1_LSU1", .pme_code = 0xc082, .pme_short_desc = "LS1 L1 D cache load references counted at finish", .pme_long_desc = "Load references to Level 1 Data Cache, by unit 1.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LD_REF_L1_LSU1], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LD_REF_L1_LSU1] }, [ POWER7_PME_PM_INST_PTEG_FROM_L21_SHR ] = { .pme_name = "PM_INST_PTEG_FROM_L21_SHR", .pme_code = 0x4e056, .pme_short_desc = "Instruction PTEG loaded from another L2 on same chip shared", .pme_long_desc = "Instruction PTEG loaded from another L2 on same chip shared", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_PTEG_FROM_L21_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_PTEG_FROM_L21_SHR] }, [ POWER7_PME_PM_CMPLU_STALL_THRD ] = { .pme_name = "PM_CMPLU_STALL_THRD", .pme_code = 0x1001c, .pme_short_desc = "Completion Stalled due to thread conflict. Group ready to complete but it was another thread's turn", .pme_long_desc = "Completion Stalled due to thread conflict. 
Group ready to complete but it was another thread's turn", .pme_event_ids = power7_event_ids[POWER7_PME_PM_CMPLU_STALL_THRD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_CMPLU_STALL_THRD] }, [ POWER7_PME_PM_DATA_FROM_RMEM ] = { .pme_name = "PM_DATA_FROM_RMEM", .pme_code = 0x3c042, .pme_short_desc = "Data loaded from remote memory", .pme_long_desc = "The processor's Data Cache was reloaded from memory attached to a different module than this processor is located on.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DATA_FROM_RMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DATA_FROM_RMEM] }, [ POWER7_PME_PM_VSU0_SCAL_SINGLE_ISSUED ] = { .pme_name = "PM_VSU0_SCAL_SINGLE_ISSUED", .pme_code = 0xb084, .pme_short_desc = "Single Precision scalar instruction issued on Pipe0", .pme_long_desc = "Single Precision scalar instruction issued on Pipe0", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_SCAL_SINGLE_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_SCAL_SINGLE_ISSUED] }, [ POWER7_PME_PM_BR_MPRED_LSTACK ] = { .pme_name = "PM_BR_MPRED_LSTACK", .pme_code = 0x40a6, .pme_short_desc = "Branch Mispredict due to Link Stack", .pme_long_desc = "Branch Mispredict due to Link Stack", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BR_MPRED_LSTACK], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BR_MPRED_LSTACK] }, [ POWER7_PME_PM_NEST_8 ] = { .pme_name = "PM_NEST_8", .pme_code = 0x8f, .pme_short_desc = "PlaceHolder for Nest events (MC0/MC1/PB/GX)", .pme_long_desc = "PlaceHolder for Nest events (MC0/MC1/PB/GX)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_NEST_8], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_NEST_8] }, [ POWER7_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_MOD_CYC", .pme_code = 0x40028, .pme_short_desc = "Marked ld latency Data source 1001 (L2.5/L3.5 M same 4 chip node)", .pme_long_desc = "Marked ld latency Data source 1001 (L2.5/L3.5 M same 4 chip node)", .pme_event_ids = 
power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC] }, [ POWER7_PME_PM_LSU0_FLUSH_UST ] = { .pme_name = "PM_LSU0_FLUSH_UST", .pme_code = 0xc0b4, .pme_short_desc = "LS0 Flush: Unaligned Store", .pme_long_desc = "A store was flushed from unit 0 because it was unaligned (crossed a 4K boundary).", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU0_FLUSH_UST], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU0_FLUSH_UST] }, [ POWER7_PME_PM_LSU_NCST ] = { .pme_name = "PM_LSU_NCST", .pme_code = 0xc090, .pme_short_desc = "Non-cachable Stores sent to nest", .pme_long_desc = "Non-cachable Stores sent to nest", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_NCST], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_NCST] }, [ POWER7_PME_PM_BR_TAKEN ] = { .pme_name = "PM_BR_TAKEN", .pme_code = 0x20004, .pme_short_desc = "Branch Taken", .pme_long_desc = "A branch instruction was taken. This could have been a conditional branch or an unconditional branch", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BR_TAKEN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BR_TAKEN] }, [ POWER7_PME_PM_INST_PTEG_FROM_LMEM ] = { .pme_name = "PM_INST_PTEG_FROM_LMEM", .pme_code = 0x4e052, .pme_short_desc = "Instruction PTEG loaded from local memory", .pme_long_desc = "Instruction PTEG loaded from local memory", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_PTEG_FROM_LMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_PTEG_FROM_LMEM] }, [ POWER7_PME_PM_GCT_NOSLOT_BR_MPRED_IC_MISS ] = { .pme_name = "PM_GCT_NOSLOT_BR_MPRED_IC_MISS", .pme_code = 0x4001c, .pme_short_desc = "GCT empty by branch mispredict + IC miss", .pme_long_desc = "No slot in GCT caused by branch mispredict or I cache miss", .pme_event_ids = power7_event_ids[POWER7_PME_PM_GCT_NOSLOT_BR_MPRED_IC_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_GCT_NOSLOT_BR_MPRED_IC_MISS] }, [ 
POWER7_PME_PM_DTLB_MISS_4K ] = { .pme_name = "PM_DTLB_MISS_4K", .pme_code = 0x2c05a, .pme_short_desc = "Data TLB miss for 4K page", .pme_long_desc = "Data TLB references to 4KB pages that missed the TLB. Page size is determined at TLB reload time.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DTLB_MISS_4K], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DTLB_MISS_4K] }, [ POWER7_PME_PM_PMC4_SAVED ] = { .pme_name = "PM_PMC4_SAVED", .pme_code = 0x30022, .pme_short_desc = "PMC4 Rewind Value saved (matched condition)", .pme_long_desc = "PMC4 was counting speculatively. The speculative condition was met and the counter value was committed by copying it to the backup register.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PMC4_SAVED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PMC4_SAVED] }, [ POWER7_PME_PM_VSU1_PERMUTE_ISSUED ] = { .pme_name = "PM_VSU1_PERMUTE_ISSUED", .pme_code = 0xb092, .pme_short_desc = "Permute VMX Instruction Issued", .pme_long_desc = "Permute VMX Instruction Issued", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_PERMUTE_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_PERMUTE_ISSUED] }, [ POWER7_PME_PM_SLB_MISS ] = { .pme_name = "PM_SLB_MISS", .pme_code = 0xd890, .pme_short_desc = "Data + Instruction SLB Miss - Total of all segment sizes", .pme_long_desc = "Total of all Segment Lookaside Buffer (SLB) misses, Instructions + Data.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_SLB_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_SLB_MISS] }, [ POWER7_PME_PM_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_LSU1_FLUSH_LRQ", .pme_code = 0xc0ba, .pme_short_desc = "LS1 Flush: LRQ", .pme_long_desc = "Load Hit Load or Store Hit Load flush. 
A younger load was flushed from unit 1 because it executed before an older store and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU1_FLUSH_LRQ], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU1_FLUSH_LRQ] }, [ POWER7_PME_PM_DTLB_MISS ] = { .pme_name = "PM_DTLB_MISS", .pme_code = 0x300fc, .pme_short_desc = "TLB reload valid", .pme_long_desc = "Data TLB misses, all page sizes.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DTLB_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DTLB_MISS] }, [ POWER7_PME_PM_VSU1_FRSP ] = { .pme_name = "PM_VSU1_FRSP", .pme_code = 0xa0b6, .pme_short_desc = "Round to single precision instruction executed", .pme_long_desc = "Round to single precision instruction executed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_FRSP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_FRSP] }, [ POWER7_PME_PM_VSU_VECTOR_DOUBLE_ISSUED ] = { .pme_name = "PM_VSU_VECTOR_DOUBLE_ISSUED", .pme_code = 0xb880, .pme_short_desc = "Double Precision vector instruction issued on Pipe0", .pme_long_desc = "Double Precision vector instruction issued on Pipe0", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_VECTOR_DOUBLE_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_VECTOR_DOUBLE_ISSUED] }, [ POWER7_PME_PM_L2_CASTOUT_SHR ] = { .pme_name = "PM_L2_CASTOUT_SHR", .pme_code = 0x16182, .pme_short_desc = "L2 Castouts - Shared (T", .pme_long_desc = " Te", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_CASTOUT_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_CASTOUT_SHR] }, [ POWER7_PME_PM_NEST_7 ] = { .pme_name = "PM_NEST_7", .pme_code = 0x8d, .pme_short_desc = "PlaceHolder for Nest events (MC0/MC1/PB/GX)", .pme_long_desc = "PlaceHolder for Nest events (MC0/MC1/PB/GX)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_NEST_7], .pme_group_vector = 
power7_group_vecs[POWER7_PME_PM_NEST_7] }, [ POWER7_PME_PM_DATA_FROM_DL2L3_SHR ] = { .pme_name = "PM_DATA_FROM_DL2L3_SHR", .pme_code = 0x3c044, .pme_short_desc = "Data loaded from distant L2 or L3 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T or SL) data from an L2 or L3 on a distant module due to a demand load", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DATA_FROM_DL2L3_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DATA_FROM_DL2L3_SHR] }, [ POWER7_PME_PM_VSU1_STF ] = { .pme_name = "PM_VSU1_STF", .pme_code = 0xb08e, .pme_short_desc = "FPU store (SP or DP) issued on Pipe1", .pme_long_desc = "FPU store (SP or DP) issued on Pipe1", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_STF], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_STF] }, [ POWER7_PME_PM_ST_FIN ] = { .pme_name = "PM_ST_FIN", .pme_code = 0x200f0, .pme_short_desc = "Store Instructions Finished", .pme_long_desc = "Store requests sent to the nest.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_ST_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_ST_FIN] }, [ POWER7_PME_PM_PTEG_FROM_L21_SHR ] = { .pme_name = "PM_PTEG_FROM_L21_SHR", .pme_code = 0x4c056, .pme_short_desc = "PTEG loaded from another L2 on same chip shared", .pme_long_desc = "PTEG loaded from another L2 on same chip shared", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PTEG_FROM_L21_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PTEG_FROM_L21_SHR] }, [ POWER7_PME_PM_L2_LOC_GUESS_WRONG ] = { .pme_name = "PM_L2_LOC_GUESS_WRONG", .pme_code = 0x26480, .pme_short_desc = "L2 guess loc and guess was not correct (ie data remote)", .pme_long_desc = "L2 guess loc and guess was not correct (ie data remote)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_LOC_GUESS_WRONG], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_LOC_GUESS_WRONG] }, [ POWER7_PME_PM_MRK_STCX_FAIL ] = { .pme_name = "PM_MRK_STCX_FAIL", .pme_code = 0xd08e, .pme_short_desc = "Marked 
STCX failed", .pme_long_desc = "A marked stcx (stwcx or stdcx) failed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_STCX_FAIL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_STCX_FAIL] }, [ POWER7_PME_PM_LSU0_REJECT_LHS ] = { .pme_name = "PM_LSU0_REJECT_LHS", .pme_code = 0xc0ac, .pme_short_desc = "LS0 Reject: Load Hit Store", .pme_long_desc = "Load Store Unit 0 rejected a load instruction that had an address overlap with an older store in the store queue. The store must be committed and de-allocated from the Store Queue before the load can execute successfully.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU0_REJECT_LHS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU0_REJECT_LHS] }, [ POWER7_PME_PM_IC_PREF_CANCEL_HIT ] = { .pme_name = "PM_IC_PREF_CANCEL_HIT", .pme_code = 0x4092, .pme_short_desc = "Prefetch Canceled due to icache hit", .pme_long_desc = "Prefetch Canceled due to icache hit", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IC_PREF_CANCEL_HIT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IC_PREF_CANCEL_HIT] }, [ POWER7_PME_PM_L3_PREF_BUSY ] = { .pme_name = "PM_L3_PREF_BUSY", .pme_code = 0x4f080, .pme_short_desc = "Prefetch machines >= threshold (8", .pme_long_desc = "16", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L3_PREF_BUSY], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L3_PREF_BUSY] }, [ POWER7_PME_PM_MRK_BRU_FIN ] = { .pme_name = "PM_MRK_BRU_FIN", .pme_code = 0x2003a, .pme_short_desc = "bru marked instr finish", .pme_long_desc = "The branch unit finished a marked instruction. 
Instructions that finish may not necessarily complete.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_BRU_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_BRU_FIN] }, [ POWER7_PME_PM_LSU1_NCLD ] = { .pme_name = "PM_LSU1_NCLD", .pme_code = 0xc08e, .pme_short_desc = "LS1 Non-cachable Loads counted at finish", .pme_long_desc = "A non-cacheable load was executed by Unit 1.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU1_NCLD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU1_NCLD] }, [ POWER7_PME_PM_INST_PTEG_FROM_L31_MOD ] = { .pme_name = "PM_INST_PTEG_FROM_L31_MOD", .pme_code = 0x1e054, .pme_short_desc = "Instruction PTEG loaded from another L3 on same chip modified", .pme_long_desc = "Instruction PTEG loaded from another L3 on same chip modified", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_PTEG_FROM_L31_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_PTEG_FROM_L31_MOD] }, [ POWER7_PME_PM_LSU_NCLD ] = { .pme_name = "PM_LSU_NCLD", .pme_code = 0xc88c, .pme_short_desc = "Non-cachable Loads counted at finish", .pme_long_desc = "A non-cacheable load was executed. 
Combined Unit 0 + 1.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_NCLD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_NCLD] }, [ POWER7_PME_PM_LSU_LDX ] = { .pme_name = "PM_LSU_LDX", .pme_code = 0xc888, .pme_short_desc = "All Vector loads (vsx vector + vmx vector)", .pme_long_desc = "All Vector loads (vsx vector + vmx vector)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_LDX], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_LDX] }, [ POWER7_PME_PM_L2_LOC_GUESS_CORRECT ] = { .pme_name = "PM_L2_LOC_GUESS_CORRECT", .pme_code = 0x16480, .pme_short_desc = "L2 guess loc and guess was correct (ie data local)", .pme_long_desc = "L2 guess loc and guess was correct (ie data local)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_LOC_GUESS_CORRECT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_LOC_GUESS_CORRECT] }, [ POWER7_PME_PM_THRESH_TIMEO ] = { .pme_name = "PM_THRESH_TIMEO", .pme_code = 0x10038, .pme_short_desc = "Threshold timeout event", .pme_long_desc = "The threshold timer expired", .pme_event_ids = power7_event_ids[POWER7_PME_PM_THRESH_TIMEO], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_THRESH_TIMEO] }, [ POWER7_PME_PM_L3_PREF_ST ] = { .pme_name = "PM_L3_PREF_ST", .pme_code = 0xd0ae, .pme_short_desc = "L3 cache ST prefetches", .pme_long_desc = "L3 cache ST prefetches", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L3_PREF_ST], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L3_PREF_ST] }, [ POWER7_PME_PM_DISP_CLB_HELD_SYNC ] = { .pme_name = "PM_DISP_CLB_HELD_SYNC", .pme_code = 0x2098, .pme_short_desc = "Dispatch/CLB Hold: Sync type instruction", .pme_long_desc = "Dispatch/CLB Hold: Sync type instruction", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DISP_CLB_HELD_SYNC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DISP_CLB_HELD_SYNC] }, [ POWER7_PME_PM_VSU_SIMPLE_ISSUED ] = { .pme_name = "PM_VSU_SIMPLE_ISSUED", .pme_code = 0xb894, .pme_short_desc = "Simple VMX instruction issued", 
.pme_long_desc = "Simple VMX instruction issued", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_SIMPLE_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_SIMPLE_ISSUED] }, [ POWER7_PME_PM_VSU1_SINGLE ] = { .pme_name = "PM_VSU1_SINGLE", .pme_code = 0xa0aa, .pme_short_desc = "FPU single precision", .pme_long_desc = "VSU1 executed single precision instruction", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_SINGLE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_SINGLE] }, [ POWER7_PME_PM_DATA_TABLEWALK_CYC ] = { .pme_name = "PM_DATA_TABLEWALK_CYC", .pme_code = 0x3001a, .pme_short_desc = "Data Tablewalk Active", .pme_long_desc = "Cycles a translation tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DATA_TABLEWALK_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DATA_TABLEWALK_CYC] }, [ POWER7_PME_PM_L2_RC_ST_DONE ] = { .pme_name = "PM_L2_RC_ST_DONE", .pme_code = 0x36380, .pme_short_desc = "RC did st to line that was Tx or Sx", .pme_long_desc = "RC did st to line that was Tx or Sx", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_RC_ST_DONE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_RC_ST_DONE] }, [ POWER7_PME_PM_MRK_PTEG_FROM_L21_MOD ] = { .pme_name = "PM_MRK_PTEG_FROM_L21_MOD", .pme_code = 0x3d056, .pme_short_desc = "Marked PTEG loaded from another L2 on same chip modified", .pme_long_desc = "Marked PTEG loaded from another L2 on same chip modified", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_PTEG_FROM_L21_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_PTEG_FROM_L21_MOD] }, [ POWER7_PME_PM_LARX_LSU1 ] = { .pme_name = "PM_LARX_LSU1", .pme_code = 0xc096, .pme_short_desc = "ls1 Larx Finished", .pme_long_desc = "A larx (lwarx or ldarx) was executed on side 1 ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LARX_LSU1], .pme_group_vector = 
power7_group_vecs[POWER7_PME_PM_LARX_LSU1] }, [ POWER7_PME_PM_MRK_DATA_FROM_RMEM ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM", .pme_code = 0x3d042, .pme_short_desc = "Marked data loaded from remote memory", .pme_long_desc = "The processor's Data Cache was reloaded due to a marked load from memory attached to a different module than this processor is located on.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_RMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_RMEM] }, [ POWER7_PME_PM_DISP_CLB_HELD ] = { .pme_name = "PM_DISP_CLB_HELD", .pme_code = 0x2090, .pme_short_desc = "CLB Hold: Any Reason", .pme_long_desc = "CLB Hold: Any Reason", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DISP_CLB_HELD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DISP_CLB_HELD] }, [ POWER7_PME_PM_DERAT_MISS_4K ] = { .pme_name = "PM_DERAT_MISS_4K", .pme_code = 0x1c05c, .pme_short_desc = "DERAT misses for 4K page", .pme_long_desc = "A data request (load or store) missed the ERAT for 4K page and resulted in an ERAT reload.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DERAT_MISS_4K], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DERAT_MISS_4K] }, [ POWER7_PME_PM_L2_RCLD_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2_RCLD_DISP_FAIL_ADDR", .pme_code = 0x16282, .pme_short_desc = " L2 RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = " L2 RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_RCLD_DISP_FAIL_ADDR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_RCLD_DISP_FAIL_ADDR] }, [ POWER7_PME_PM_SEG_EXCEPTION ] = { .pme_name = "PM_SEG_EXCEPTION", .pme_code = 0x28a4, .pme_short_desc = "ISEG + DSEG Exception", .pme_long_desc = "ISEG + DSEG Exception", .pme_event_ids = power7_event_ids[POWER7_PME_PM_SEG_EXCEPTION], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_SEG_EXCEPTION] }, [ POWER7_PME_PM_FLUSH_DISP_SB ] = { 
.pme_name = "PM_FLUSH_DISP_SB", .pme_code = 0x208c, .pme_short_desc = "Dispatch Flush: Scoreboard", .pme_long_desc = "Dispatch Flush: Scoreboard", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FLUSH_DISP_SB], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FLUSH_DISP_SB] }, [ POWER7_PME_PM_L2_DC_INV ] = { .pme_name = "PM_L2_DC_INV", .pme_code = 0x26182, .pme_short_desc = "Dcache invalidates from L2 ", .pme_long_desc = "The L2 invalidated a line in processor's data cache. This is caused by the L2 line being cast out or invalidated. Total for all slices", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_DC_INV], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_DC_INV] }, [ POWER7_PME_PM_PTEG_FROM_DL2L3_MOD ] = { .pme_name = "PM_PTEG_FROM_DL2L3_MOD", .pme_code = 0x4c054, .pme_short_desc = "PTEG loaded from distant L2 or L3 modified", .pme_long_desc = "A Page Table Entry was loaded into the ERAT with modified (M) data from an L2 or L3 on a distant module due to a demand load or store.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PTEG_FROM_DL2L3_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PTEG_FROM_DL2L3_MOD] }, [ POWER7_PME_PM_DSEG ] = { .pme_name = "PM_DSEG", .pme_code = 0x20a6, .pme_short_desc = "DSEG Exception", .pme_long_desc = "DSEG Exception", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DSEG], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DSEG] }, [ POWER7_PME_PM_BR_PRED_LSTACK ] = { .pme_name = "PM_BR_PRED_LSTACK", .pme_code = 0x40a2, .pme_short_desc = "Link Stack Predictions", .pme_long_desc = "The target address of a Branch to Link instruction was predicted by the link stack.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BR_PRED_LSTACK], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BR_PRED_LSTACK] }, [ POWER7_PME_PM_VSU0_STF ] = { .pme_name = "PM_VSU0_STF", .pme_code = 0xb08c, .pme_short_desc = "FPU store (SP or DP) issued on Pipe0", .pme_long_desc = "FPU store (SP or DP) issued on Pipe0", .pme_event_ids = 
power7_event_ids[POWER7_PME_PM_VSU0_STF], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_STF] }, [ POWER7_PME_PM_LSU_FX_FIN ] = { .pme_name = "PM_LSU_FX_FIN", .pme_code = 0x10066, .pme_short_desc = "LSU Finished a FX operation (up to 2 per cycle)", .pme_long_desc = "LSU Finished a FX operation (up to 2 per cycle)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_FX_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_FX_FIN] }, [ POWER7_PME_PM_DERAT_MISS_16M ] = { .pme_name = "PM_DERAT_MISS_16M", .pme_code = 0x3c05c, .pme_short_desc = "DERAT misses for 16M page", .pme_long_desc = "A data request (load or store) missed the ERAT for 16M page and resulted in an ERAT reload.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DERAT_MISS_16M], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DERAT_MISS_16M] }, [ POWER7_PME_PM_MRK_PTEG_FROM_DL2L3_MOD ] = { .pme_name = "PM_MRK_PTEG_FROM_DL2L3_MOD", .pme_code = 0x4d054, .pme_short_desc = "Marked PTEG loaded from distant L2 or L3 modified", .pme_long_desc = "A Page Table Entry was loaded into the ERAT with modified (M) data from an L2 or L3 on a distant module due to a marked load or store.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_PTEG_FROM_DL2L3_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_PTEG_FROM_DL2L3_MOD] }, [ POWER7_PME_PM_INST_FROM_L3 ] = { .pme_name = "PM_INST_FROM_L3", .pme_code = 0x14048, .pme_short_desc = "Instruction fetched from L3", .pme_long_desc = "An instruction fetch group was fetched from L3. 
Fetch Groups can contain up to 8 instructions", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_L3], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_L3] }, [ POWER7_PME_PM_MRK_IFU_FIN ] = { .pme_name = "PM_MRK_IFU_FIN", .pme_code = 0x3003a, .pme_short_desc = "IFU non-branch marked instruction finished", .pme_long_desc = "The Instruction Fetch Unit finished a marked instruction.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_IFU_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_IFU_FIN] }, [ POWER7_PME_PM_ITLB_MISS ] = { .pme_name = "PM_ITLB_MISS", .pme_code = 0x400fc, .pme_short_desc = "ITLB Reloaded (always zero on POWER6)", .pme_long_desc = "A TLB miss for an Instruction Fetch has occurred", .pme_event_ids = power7_event_ids[POWER7_PME_PM_ITLB_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_ITLB_MISS] }, [ POWER7_PME_PM_VSU_STF ] = { .pme_name = "PM_VSU_STF", .pme_code = 0xb88c, .pme_short_desc = "FPU store (SP or DP) issued on Pipe0", .pme_long_desc = "FPU store (SP or DP) issued on Pipe0", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_STF], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_STF] }, [ POWER7_PME_PM_LSU_FLUSH_UST ] = { .pme_name = "PM_LSU_FLUSH_UST", .pme_code = 0xc8b4, .pme_short_desc = "Flush: Unaligned Store", .pme_long_desc = "A store was flushed because it was unaligned (crossed a 4K boundary). 
Combined Unit 0 + 1.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_FLUSH_UST], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_FLUSH_UST] }, [ POWER7_PME_PM_L2_LDST_MISS ] = { .pme_name = "PM_L2_LDST_MISS", .pme_code = 0x26880, .pme_short_desc = "Data Load+Store Miss", .pme_long_desc = "Data Load+Store Miss", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_LDST_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_LDST_MISS] }, [ POWER7_PME_PM_FXU1_FIN ] = { .pme_name = "PM_FXU1_FIN", .pme_code = 0x40004, .pme_short_desc = "FXU1 Finished", .pme_long_desc = "The Fixed Point unit 1 finished an instruction and produced a result. Instructions that finish may not necessarily complete.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FXU1_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FXU1_FIN] }, [ POWER7_PME_PM_SHL_DEALLOCATED ] = { .pme_name = "PM_SHL_DEALLOCATED", .pme_code = 0x5080, .pme_short_desc = "SHL Table entry deallocated", .pme_long_desc = "SHL Table entry deallocated", .pme_event_ids = power7_event_ids[POWER7_PME_PM_SHL_DEALLOCATED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_SHL_DEALLOCATED] }, [ POWER7_PME_PM_L2_SN_M_WR_DONE ] = { .pme_name = "PM_L2_SN_M_WR_DONE", .pme_code = 0x46382, .pme_short_desc = "SNP dispatched for a write and was M", .pme_long_desc = "SNP dispatched for a write and was M", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_SN_M_WR_DONE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_SN_M_WR_DONE] }, [ POWER7_PME_PM_LSU_REJECT_SET_MPRED ] = { .pme_name = "PM_LSU_REJECT_SET_MPRED", .pme_code = 0xc8a8, .pme_short_desc = "Reject: Set Predict Wrong", .pme_long_desc = "The Load Store Unit rejected an instruction because the cache set was improperly predicted. This is a fast reject and will be immediately redispatched. 
Combined Unit 0 + 1", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_REJECT_SET_MPRED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_REJECT_SET_MPRED] }, [ POWER7_PME_PM_L3_PREF_LD ] = { .pme_name = "PM_L3_PREF_LD", .pme_code = 0xd0ac, .pme_short_desc = "L3 cache LD prefetches", .pme_long_desc = "L3 cache LD prefetches", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L3_PREF_LD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L3_PREF_LD] }, [ POWER7_PME_PM_L2_SN_M_RD_DONE ] = { .pme_name = "PM_L2_SN_M_RD_DONE", .pme_code = 0x46380, .pme_short_desc = "SNP dispatched for a read and was M", .pme_long_desc = "SNP dispatched for a read and was M", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_SN_M_RD_DONE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_SN_M_RD_DONE] }, [ POWER7_PME_PM_MRK_DERAT_MISS_16G ] = { .pme_name = "PM_MRK_DERAT_MISS_16G", .pme_code = 0x4d05c, .pme_short_desc = "Marked DERAT misses for 16G page", .pme_long_desc = "A marked data request (load or store) missed the ERAT for 16G page and resulted in an ERAT reload.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DERAT_MISS_16G], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DERAT_MISS_16G] }, [ POWER7_PME_PM_VSU_FCONV ] = { .pme_name = "PM_VSU_FCONV", .pme_code = 0xa8b0, .pme_short_desc = "Convert instruction executed", .pme_long_desc = "Convert instruction executed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_FCONV], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_FCONV] }, [ POWER7_PME_PM_ANY_THRD_RUN_CYC ] = { .pme_name = "PM_ANY_THRD_RUN_CYC", .pme_code = 0x100fa, .pme_short_desc = "One of threads in run_cycles ", .pme_long_desc = "One of threads in run_cycles ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_ANY_THRD_RUN_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_ANY_THRD_RUN_CYC] }, [ POWER7_PME_PM_LSU_LMQ_FULL_CYC ] = { .pme_name = "PM_LSU_LMQ_FULL_CYC", .pme_code = 0xd0a4, .pme_short_desc = "LMQ full", 
.pme_long_desc = "The Load Miss Queue was full.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_LMQ_FULL_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_LMQ_FULL_CYC] }, [ POWER7_PME_PM_MRK_LSU_REJECT_LHS ] = { .pme_name = "PM_MRK_LSU_REJECT_LHS", .pme_code = 0xd082, .pme_short_desc = " Reject(marked): Load Hit Store", .pme_long_desc = "The Load Store Unit rejected a marked load instruction that had an address overlap with an older store in the store queue. The store must be committed and de-allocated from the Store Queue before the load can execute successfully", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_LSU_REJECT_LHS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_LSU_REJECT_LHS] }, [ POWER7_PME_PM_MRK_LD_MISS_L1_CYC ] = { .pme_name = "PM_MRK_LD_MISS_L1_CYC", .pme_code = 0x4003e, .pme_short_desc = "L1 data load miss cycles", .pme_long_desc = "L1 data load miss cycles", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_LD_MISS_L1_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_LD_MISS_L1_CYC] }, [ POWER7_PME_PM_MRK_DATA_FROM_L2_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2_CYC", .pme_code = 0x20020, .pme_short_desc = "Marked ld latency Data source 0000 (L2 hit)", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_L2_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_L2_CYC] }, [ POWER7_PME_PM_INST_IMC_MATCH_DISP ] = { .pme_name = "PM_INST_IMC_MATCH_DISP", .pme_code = 0x30016, .pme_short_desc = "IMC Matches dispatched", .pme_long_desc = "IMC Matches dispatched", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_IMC_MATCH_DISP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_IMC_MATCH_DISP] }, [ POWER7_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM_CYC", .pme_code = 0x4002c, .pme_short_desc = "Marked ld latency Data source 1101 (Memory same 4 chip node)", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_RMEM_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_RMEM_CYC] }, [ POWER7_PME_PM_VSU0_SIMPLE_ISSUED ] = { .pme_name = "PM_VSU0_SIMPLE_ISSUED", .pme_code = 0xb094, .pme_short_desc = "Simple VMX instruction issued", .pme_long_desc = "Simple VMX instruction issued", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_SIMPLE_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_SIMPLE_ISSUED] }, [ POWER7_PME_PM_CMPLU_STALL_DIV ] = { .pme_name = "PM_CMPLU_STALL_DIV", .pme_code = 0x40014, .pme_short_desc = "Completion stall caused by DIV instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a fixed point divide instruction. 
This is a subset of PM_CMPLU_STALL_FXU.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_CMPLU_STALL_DIV], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_CMPLU_STALL_DIV] }, [ POWER7_PME_PM_MRK_PTEG_FROM_RL2L3_SHR ] = { .pme_name = "PM_MRK_PTEG_FROM_RL2L3_SHR", .pme_code = 0x2d054, .pme_short_desc = "Marked PTEG loaded from remote L2 or L3 shared", .pme_long_desc = "A Page Table Entry was loaded into the ERAT from memory attached to a different module than this processor is located on due to a marked load or store.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_PTEG_FROM_RL2L3_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_PTEG_FROM_RL2L3_SHR] }, [ POWER7_PME_PM_VSU_FMA_DOUBLE ] = { .pme_name = "PM_VSU_FMA_DOUBLE", .pme_code = 0xa890, .pme_short_desc = "DP vector version of fmadd", .pme_long_desc = "fnmadd", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_FMA_DOUBLE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_FMA_DOUBLE] }, [ POWER7_PME_PM_VSU_4FLOP ] = { .pme_name = "PM_VSU_4FLOP", .pme_code = 0xa89c, .pme_short_desc = "four flops operation (scalar fdiv", .pme_long_desc = " fsqrt; DP vector version of fmadd", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_4FLOP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_4FLOP] }, [ POWER7_PME_PM_VSU1_FIN ] = { .pme_name = "PM_VSU1_FIN", .pme_code = 0xa0be, .pme_short_desc = "VSU1 Finished an instruction", .pme_long_desc = "VSU1 Finished an instruction", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_FIN] }, [ POWER7_PME_PM_INST_PTEG_FROM_RL2L3_MOD ] = { .pme_name = "PM_INST_PTEG_FROM_RL2L3_MOD", .pme_code = 0x1e052, .pme_short_desc = "Instruction PTEG loaded from remote L2 or L3 modified", .pme_long_desc = "Instruction PTEG loaded from remote L2 or L3 modified", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_PTEG_FROM_RL2L3_MOD], .pme_group_vector = 
power7_group_vecs[POWER7_PME_PM_INST_PTEG_FROM_RL2L3_MOD] }, [ POWER7_PME_PM_RUN_CYC ] = { .pme_name = "PM_RUN_CYC", .pme_code = 0x200f4, .pme_short_desc = "Run_cycles", .pme_long_desc = "Processor Cycles gated by the run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. Gating by the run latch filters out the idle loop.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_RUN_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_RUN_CYC] }, [ POWER7_PME_PM_PTEG_FROM_RMEM ] = { .pme_name = "PM_PTEG_FROM_RMEM", .pme_code = 0x3c052, .pme_short_desc = "PTEG loaded from remote memory", .pme_long_desc = "A Page Table Entry was loaded into the TLB from memory attached to a different module than this processor is located on.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PTEG_FROM_RMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PTEG_FROM_RMEM] }, [ POWER7_PME_PM_LSU_LRQ_S0_VALID ] = { .pme_name = "PM_LSU_LRQ_S0_VALID", .pme_code = 0xd09e, .pme_short_desc = "Slot 0 of LRQ valid", .pme_long_desc = "This signal is asserted every cycle that the Load Request Queue slot zero is valid. The LRQ is 32 entries long and is allocated round-robin. 
In SMT mode the LRQ is split between the two threads (16 entries each).", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_LRQ_S0_VALID], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_LRQ_S0_VALID] }, [ POWER7_PME_PM_LSU0_LDF ] = { .pme_name = "PM_LSU0_LDF", .pme_code = 0xc084, .pme_short_desc = "LS0 Scalar Loads", .pme_long_desc = "A floating point load was executed by LSU0", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU0_LDF], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU0_LDF] }, [ POWER7_PME_PM_FLUSH_COMPLETION ] = { .pme_name = "PM_FLUSH_COMPLETION", .pme_code = 0x30012, .pme_short_desc = "Completion Flush", .pme_long_desc = "Completion Flush", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FLUSH_COMPLETION], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FLUSH_COMPLETION] }, [ POWER7_PME_PM_ST_MISS_L1 ] = { .pme_name = "PM_ST_MISS_L1", .pme_code = 0x300f0, .pme_short_desc = "L1 D cache store misses", .pme_long_desc = "A store missed the dcache. Combined Unit 0 + 1.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_ST_MISS_L1], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_ST_MISS_L1] }, [ POWER7_PME_PM_L2_NODE_PUMP ] = { .pme_name = "PM_L2_NODE_PUMP", .pme_code = 0x36480, .pme_short_desc = "RC req that was a local (aka node) pump attempt", .pme_long_desc = "RC req that was a local (aka node) pump attempt", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_NODE_PUMP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_NODE_PUMP] }, [ POWER7_PME_PM_INST_FROM_DL2L3_SHR ] = { .pme_name = "PM_INST_FROM_DL2L3_SHR", .pme_code = 0x34044, .pme_short_desc = "Instruction fetched from distant L2 or L3 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (S) data from the L2 or L3 on a distant module. 
Fetch groups can contain up to 8 instructions", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_DL2L3_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_DL2L3_SHR] }, [ POWER7_PME_PM_MRK_STALL_CMPLU_CYC ] = { .pme_name = "PM_MRK_STALL_CMPLU_CYC", .pme_code = 0x3003e, .pme_short_desc = "Marked Group Completion Stall cycles ", .pme_long_desc = "Marked Group Completion Stall cycles ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_STALL_CMPLU_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_STALL_CMPLU_CYC] }, [ POWER7_PME_PM_VSU1_DENORM ] = { .pme_name = "PM_VSU1_DENORM", .pme_code = 0xa0ae, .pme_short_desc = "FPU denorm operand", .pme_long_desc = "VSU1 received denormalized data", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_DENORM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_DENORM] }, [ POWER7_PME_PM_MRK_DATA_FROM_L31_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L31_SHR_CYC", .pme_code = 0x20026, .pme_short_desc = "Marked ld latency Data source 0110 (L3.1 S) ", .pme_long_desc = "Marked load latency Data source 0110 (L3.1 S) ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_L31_SHR_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_L31_SHR_CYC] }, [ POWER7_PME_PM_GCT_USAGE_1TO2_SLOT ] = { .pme_name = "PM_GCT_USAGE_1TO2_SLOT", .pme_code = 0x209c, .pme_short_desc = "GCT Utilization 1-2 entries", .pme_long_desc = "GCT Utilization 1-2 entries", .pme_event_ids = power7_event_ids[POWER7_PME_PM_GCT_USAGE_1TO2_SLOT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_GCT_USAGE_1TO2_SLOT] }, [ POWER7_PME_PM_NEST_6 ] = { .pme_name = "PM_NEST_6", .pme_code = 0x8b, .pme_short_desc = "PlaceHolder for Nest events (MC0/MC1/PB/GX)", .pme_long_desc = "PlaceHolder for Nest events (MC0/MC1/PB/GX)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_NEST_6], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_NEST_6] }, [ POWER7_PME_PM_INST_FROM_L3MISS ] = { .pme_name = 
"PM_INST_FROM_L3MISS", .pme_code = 0x24048, .pme_short_desc = "Instruction fetched missed L3", .pme_long_desc = "An instruction fetch group was fetched from beyond L3. Fetch groups can contain up to 8 instructions.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_L3MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_L3MISS] }, [ POWER7_PME_PM_EE_OFF_EXT_INT ] = { .pme_name = "PM_EE_OFF_EXT_INT", .pme_code = 0x2080, .pme_short_desc = "ee off and external interrupt", .pme_long_desc = "Cycles when an interrupt due to an external exception is pending but external exceptions were masked.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_EE_OFF_EXT_INT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_EE_OFF_EXT_INT] }, [ POWER7_PME_PM_INST_PTEG_FROM_DMEM ] = { .pme_name = "PM_INST_PTEG_FROM_DMEM", .pme_code = 0x2e052, .pme_short_desc = "Instruction PTEG loaded from distant memory", .pme_long_desc = "Instruction PTEG loaded from distant memory", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_PTEG_FROM_DMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_PTEG_FROM_DMEM] }, [ POWER7_PME_PM_INST_FROM_DL2L3_MOD ] = { .pme_name = "PM_INST_FROM_DL2L3_MOD", .pme_code = 0x3404c, .pme_short_desc = "Instruction fetched from distant L2 or L3 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from an L2 or L3 on a distant module. Fetch groups can contain up to 8 instructions", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_DL2L3_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_DL2L3_MOD] }, [ POWER7_PME_PM_PMC6_OVERFLOW ] = { .pme_name = "PM_PMC6_OVERFLOW", .pme_code = 0x30024, .pme_short_desc = "Overflow from counter 6", .pme_long_desc = "Overflows from PMC6 are counted. This effectively widens the PMC. 
The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PMC6_OVERFLOW], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PMC6_OVERFLOW] }, [ POWER7_PME_PM_VSU_2FLOP_DOUBLE ] = { .pme_name = "PM_VSU_2FLOP_DOUBLE", .pme_code = 0xa88c, .pme_short_desc = "DP vector version of fmul", .pme_long_desc = " fsub", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_2FLOP_DOUBLE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_2FLOP_DOUBLE] }, [ POWER7_PME_PM_TLB_MISS ] = { .pme_name = "PM_TLB_MISS", .pme_code = 0x20066, .pme_short_desc = "TLB Miss (I + D)", .pme_long_desc = "Total of Data TLB misses + Instruction TLB misses", .pme_event_ids = power7_event_ids[POWER7_PME_PM_TLB_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_TLB_MISS] }, [ POWER7_PME_PM_FXU_BUSY ] = { .pme_name = "PM_FXU_BUSY", .pme_code = 0x2000e, .pme_short_desc = "fxu0 busy and fxu1 busy.", .pme_long_desc = "Cycles when both FXU0 and FXU1 are busy.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FXU_BUSY], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FXU_BUSY] }, [ POWER7_PME_PM_L2_RCLD_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2_RCLD_DISP_FAIL_OTHER", .pme_code = 0x26280, .pme_short_desc = " L2 RC load dispatch attempt failed due to other reasons", .pme_long_desc = " L2 RC load dispatch attempt failed due to other reasons", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_RCLD_DISP_FAIL_OTHER], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_RCLD_DISP_FAIL_OTHER] }, [ POWER7_PME_PM_LSU_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU_REJECT_LMQ_FULL", .pme_code = 0xc8a4, .pme_short_desc = "Reject: LMQ Full (LHR)", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all the eight entries are full, subsequent load instructions are rejected. 
Combined unit 0 + 1.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_REJECT_LMQ_FULL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_REJECT_LMQ_FULL] }, [ POWER7_PME_PM_IC_RELOAD_SHR ] = { .pme_name = "PM_IC_RELOAD_SHR", .pme_code = 0x4096, .pme_short_desc = "Reloading line to be shared between the threads", .pme_long_desc = "An Instruction Cache request was made by this thread and the cache line was already in the cache for the other thread. The line is marked valid for all threads.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IC_RELOAD_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IC_RELOAD_SHR] }, [ POWER7_PME_PM_GRP_MRK ] = { .pme_name = "PM_GRP_MRK", .pme_code = 0x10031, .pme_short_desc = "IDU Marked Instruction", .pme_long_desc = "A group was sampled (marked). The group is called a marked group. One instruction within the group is tagged for detailed monitoring. The sampled instruction is called a marked instructions. Events associated with the marked instruction are annotated with the marked term.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_GRP_MRK], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_GRP_MRK] }, [ POWER7_PME_PM_MRK_ST_NEST ] = { .pme_name = "PM_MRK_ST_NEST", .pme_code = 0x20034, .pme_short_desc = "marked store sent to Nest", .pme_long_desc = "A sampled store has been sent to the memory subsystem", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_ST_NEST], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_ST_NEST] }, [ POWER7_PME_PM_VSU1_FSQRT_FDIV ] = { .pme_name = "PM_VSU1_FSQRT_FDIV", .pme_code = 0xa08a, .pme_short_desc = "four flops operation (fdiv", .pme_long_desc = "fsqrt", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_FSQRT_FDIV], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_FSQRT_FDIV] }, [ POWER7_PME_PM_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_LSU0_FLUSH_LRQ", .pme_code = 0xc0b8, .pme_short_desc = "LS0 Flush: LRQ", .pme_long_desc = "Load Hit Load or Store Hit Load flush. 
A younger load was flushed from unit 0 because it executed before an older store and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU0_FLUSH_LRQ], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU0_FLUSH_LRQ] }, [ POWER7_PME_PM_LARX_LSU0 ] = { .pme_name = "PM_LARX_LSU0", .pme_code = 0xc094, .pme_short_desc = "ls0 Larx Finished", .pme_long_desc = "A larx (lwarx or ldarx) was executed on side 0 ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LARX_LSU0], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LARX_LSU0] }, [ POWER7_PME_PM_IBUF_FULL_CYC ] = { .pme_name = "PM_IBUF_FULL_CYC", .pme_code = 0x4084, .pme_short_desc = "Cycles No room in ibuff", .pme_long_desc = "Cycles when the Instruction Buffer was full. The Instruction Buffer is a circular queue of 64 instructions per thread, organized as 16 groups of 4 instructions.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IBUF_FULL_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IBUF_FULL_CYC] }, [ POWER7_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_SHR_CYC", .pme_code = 0x2002a, .pme_short_desc = "Marked ld latency Data Source 1010 (Distant L2.75/L3.75 S)", .pme_long_desc = "Marked ld latency Data Source 1010 (Distant L2.75/L3.75 S)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC] }, [ POWER7_PME_PM_LSU_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_LSU_DC_PREF_STREAM_ALLOC", .pme_code = 0xd8a8, .pme_short_desc = "D cache new prefetch stream allocated", .pme_long_desc = "D cache new prefetch stream allocated", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_DC_PREF_STREAM_ALLOC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_DC_PREF_STREAM_ALLOC] }, [ POWER7_PME_PM_GRP_MRK_CYC ] = { .pme_name = 
"PM_GRP_MRK_CYC", .pme_code = 0x10030, .pme_short_desc = "cycles IDU marked instruction before dispatch", .pme_long_desc = "cycles IDU marked instruction before dispatch", .pme_event_ids = power7_event_ids[POWER7_PME_PM_GRP_MRK_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_GRP_MRK_CYC] }, [ POWER7_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_SHR_CYC", .pme_code = 0x20028, .pme_short_desc = "Marked ld latency Data Source 1000 (Remote L2.5/L3.5 S)", .pme_long_desc = "Marked load latency Data Source 1000 (Remote L2.5/L3.5 S)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC] }, [ POWER7_PME_PM_L2_GLOB_GUESS_CORRECT ] = { .pme_name = "PM_L2_GLOB_GUESS_CORRECT", .pme_code = 0x16482, .pme_short_desc = "L2 guess glb and guess was correct (ie data remote)", .pme_long_desc = "L2 guess glb and guess was correct (ie data remote)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_GLOB_GUESS_CORRECT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_GLOB_GUESS_CORRECT] }, [ POWER7_PME_PM_LSU_REJECT_LHS ] = { .pme_name = "PM_LSU_REJECT_LHS", .pme_code = 0xc8ac, .pme_short_desc = "Reject: Load Hit Store", .pme_long_desc = "The Load Store Unit rejected a load instruction that had an address overlap with an older store in the store queue. The store must be committed and de-allocated from the Store Queue before the load can execute successfully. 
Combined Unit 0 + 1", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_REJECT_LHS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_REJECT_LHS] }, [ POWER7_PME_PM_MRK_DATA_FROM_LMEM ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM", .pme_code = 0x3d04a, .pme_short_desc = "Marked data loaded from local memory", .pme_long_desc = "The processor's Data Cache was reloaded due to a marked load from memory attached to the same module this processor is located on.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_LMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_LMEM] }, [ POWER7_PME_PM_INST_PTEG_FROM_L3 ] = { .pme_name = "PM_INST_PTEG_FROM_L3", .pme_code = 0x2e050, .pme_short_desc = "Instruction PTEG loaded from L3", .pme_long_desc = "Instruction PTEG loaded from L3", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_PTEG_FROM_L3], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_PTEG_FROM_L3] }, [ POWER7_PME_PM_FREQ_DOWN ] = { .pme_name = "PM_FREQ_DOWN", .pme_code = 0x3000c, .pme_short_desc = "Frequency is being slewed down due to Power Management", .pme_long_desc = "Processor frequency was slowed down due to power management", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FREQ_DOWN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FREQ_DOWN] }, [ POWER7_PME_PM_INST_FROM_RL2L3_SHR ] = { .pme_name = "PM_INST_FROM_RL2L3_SHR", .pme_code = 0x1404c, .pme_short_desc = "Instruction fetched from remote L2 or L3 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (S) data from the L2 or L3 on a remote module. 
Fetch groups can contain up to 8 instructions", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_RL2L3_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_RL2L3_SHR] }, [ POWER7_PME_PM_MRK_INST_ISSUED ] = { .pme_name = "PM_MRK_INST_ISSUED", .pme_code = 0x10032, .pme_short_desc = "Marked instruction issued", .pme_long_desc = "A marked instruction was issued to an execution unit.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_INST_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_INST_ISSUED] }, [ POWER7_PME_PM_PTEG_FROM_L3MISS ] = { .pme_name = "PM_PTEG_FROM_L3MISS", .pme_code = 0x2c058, .pme_short_desc = "PTEG loaded from L3 miss", .pme_long_desc = " Page Table Entry was loaded into the ERAT from beyond the L3 due to a demand load or store.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PTEG_FROM_L3MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PTEG_FROM_L3MISS] }, [ POWER7_PME_PM_RUN_PURR ] = { .pme_name = "PM_RUN_PURR", .pme_code = 0x400f4, .pme_short_desc = "Run_PURR", .pme_long_desc = "The Processor Utilization of Resources Register was incremented while the run latch was set. The PURR registers will be incremented roughly in the ratio in which the instructions are dispatched from the two threads. 
", .pme_event_ids = power7_event_ids[POWER7_PME_PM_RUN_PURR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_RUN_PURR] }, [ POWER7_PME_PM_MRK_DATA_FROM_L3 ] = { .pme_name = "PM_MRK_DATA_FROM_L3", .pme_code = 0x1d048, .pme_short_desc = "Marked data loaded from L3", .pme_long_desc = "The processor's Data Cache was reloaded from the local L3 due to a marked load.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_L3], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_L3] }, [ POWER7_PME_PM_MRK_GRP_IC_MISS ] = { .pme_name = "PM_MRK_GRP_IC_MISS", .pme_code = 0x40038, .pme_short_desc = "Marked group experienced I cache miss", .pme_long_desc = "A group containing a marked (sampled) instruction experienced an instruction cache miss.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_GRP_IC_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_GRP_IC_MISS] }, [ POWER7_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { .pme_name = "PM_CMPLU_STALL_DCACHE_MISS", .pme_code = 0x20016, .pme_short_desc = " Completion stall caused by D cache miss", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered a Data Cache Miss. Data Cache Miss has higher priority than any other Load/Store delay, so if an instruction encounters multiple delays only the Data Cache Miss will be reported and the entire delay period will be charged to Data Cache Miss. 
This is a subset of PM_CMPLU_STALL_LSU.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_CMPLU_STALL_DCACHE_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_CMPLU_STALL_DCACHE_MISS] }, [ POWER7_PME_PM_PTEG_FROM_RL2L3_SHR ] = { .pme_name = "PM_PTEG_FROM_RL2L3_SHR", .pme_code = 0x2c054, .pme_short_desc = "PTEG loaded from remote L2 or L3 shared", .pme_long_desc = "A Page Table Entry was loaded into the ERAT with shared (T or SL) data from an L2 or L3 on a remote module due to a demand load or store.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PTEG_FROM_RL2L3_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PTEG_FROM_RL2L3_SHR] }, [ POWER7_PME_PM_LSU_FLUSH_LRQ ] = { .pme_name = "PM_LSU_FLUSH_LRQ", .pme_code = 0xc8b8, .pme_short_desc = "Flush: LRQ", .pme_long_desc = "Load Hit Load or Store Hit Load flush. A younger load was flushed because it executed before an older store and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. 
Combined Unit 0 + 1.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_FLUSH_LRQ], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_FLUSH_LRQ] }, [ POWER7_PME_PM_MRK_DERAT_MISS_64K ] = { .pme_name = "PM_MRK_DERAT_MISS_64K", .pme_code = 0x2d05c, .pme_short_desc = "Marked DERAT misses for 64K page", .pme_long_desc = "A marked data request (load or store) missed the ERAT for 64K page and resulted in an ERAT reload.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DERAT_MISS_64K], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DERAT_MISS_64K] }, [ POWER7_PME_PM_INST_PTEG_FROM_DL2L3_MOD ] = { .pme_name = "PM_INST_PTEG_FROM_DL2L3_MOD", .pme_code = 0x4e054, .pme_short_desc = "Instruction PTEG loaded from distant L2 or L3 modified", .pme_long_desc = "Instruction PTEG loaded from distant L2 or L3 modified", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_PTEG_FROM_DL2L3_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_PTEG_FROM_DL2L3_MOD] }, [ POWER7_PME_PM_L2_ST_MISS ] = { .pme_name = "PM_L2_ST_MISS", .pme_code = 0x26082, .pme_short_desc = "Data Store Miss", .pme_long_desc = "Data Store Miss", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_ST_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_ST_MISS] }, [ POWER7_PME_PM_LWSYNC ] = { .pme_name = "PM_LWSYNC", .pme_code = 0xd094, .pme_short_desc = "lwsync count (easier to use than IMC)", .pme_long_desc = "lwsync count (easier to use than IMC)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LWSYNC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LWSYNC] }, [ POWER7_PME_PM_LSU0_DC_PREF_STREAM_CONFIRM_STRIDE ] = { .pme_name = "PM_LSU0_DC_PREF_STREAM_CONFIRM_STRIDE", .pme_code = 0xd0bc, .pme_short_desc = "LS0 Dcache Strided prefetch stream confirmed", .pme_long_desc = "LS0 Dcache Strided prefetch stream confirmed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU0_DC_PREF_STREAM_CONFIRM_STRIDE], .pme_group_vector = 
power7_group_vecs[POWER7_PME_PM_LSU0_DC_PREF_STREAM_CONFIRM_STRIDE] }, [ POWER7_PME_PM_MRK_PTEG_FROM_L21_SHR ] = { .pme_name = "PM_MRK_PTEG_FROM_L21_SHR", .pme_code = 0x4d056, .pme_short_desc = "Marked PTEG loaded from another L2 on same chip shared", .pme_long_desc = "Marked PTEG loaded from another L2 on same chip shared", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_PTEG_FROM_L21_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_PTEG_FROM_L21_SHR] }, [ POWER7_PME_PM_MRK_LSU_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_LRQ", .pme_code = 0xd088, .pme_short_desc = "Flush: (marked) LRQ", .pme_long_desc = "Load Hit Load or Store Hit Load flush. A marked load was flushed because it executed before an older store and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_LSU_FLUSH_LRQ], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_LSU_FLUSH_LRQ] }, [ POWER7_PME_PM_INST_IMC_MATCH_CMPL ] = { .pme_name = "PM_INST_IMC_MATCH_CMPL", .pme_code = 0x100f0, .pme_short_desc = "IMC Match Count", .pme_long_desc = "Number of instructions resulting from the marked instructions expansion that completed.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_IMC_MATCH_CMPL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_IMC_MATCH_CMPL] }, [ POWER7_PME_PM_MRK_INST_FIN ] = { .pme_name = "PM_MRK_INST_FIN", .pme_code = 0x30030, .pme_short_desc = "marked instr finish any unit ", .pme_long_desc = "One of the execution units finished a marked instruction. 
Instructions that finish may not necessarily complete", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_INST_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_INST_FIN] }, [ POWER7_PME_PM_INST_FROM_L31_MOD ] = { .pme_name = "PM_INST_FROM_L31_MOD", .pme_code = 0x14044, .pme_short_desc = "Instruction fetched from another L3 on same chip modified", .pme_long_desc = "Instruction fetched from another L3 on same chip modified", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_L31_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_L31_MOD] }, [ POWER7_PME_PM_MRK_DTLB_MISS_64K ] = { .pme_name = "PM_MRK_DTLB_MISS_64K", .pme_code = 0x3d05e, .pme_short_desc = "Marked Data TLB misses for 64K page", .pme_long_desc = "Data TLB references to 64KB pages by a marked instruction that missed the TLB. Page size is determined at TLB reload time.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DTLB_MISS_64K], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DTLB_MISS_64K] }, [ POWER7_PME_PM_LSU_FIN ] = { .pme_name = "PM_LSU_FIN", .pme_code = 0x30066, .pme_short_desc = "LSU Finished an instruction (up to 2 per cycle)", .pme_long_desc = "LSU Finished an instruction (up to 2 per cycle)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_FIN] }, [ POWER7_PME_PM_MRK_LSU_REJECT ] = { .pme_name = "PM_MRK_LSU_REJECT", .pme_code = 0x40064, .pme_short_desc = "LSU marked reject (up to 2 per cycle)", .pme_long_desc = "LSU marked reject (up to 2 per cycle)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_LSU_REJECT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_LSU_REJECT] }, [ POWER7_PME_PM_L2_CO_FAIL_BUSY ] = { .pme_name = "PM_L2_CO_FAIL_BUSY", .pme_code = 0x16382, .pme_short_desc = " L2 RC Cast Out dispatch attempt failed due to all CO machines busy", .pme_long_desc = " L2 RC Cast Out dispatch attempt failed due to all CO machines busy", .pme_event_ids = 
power7_event_ids[POWER7_PME_PM_L2_CO_FAIL_BUSY], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_CO_FAIL_BUSY] }, [ POWER7_PME_PM_DATA_FROM_L31_MOD ] = { .pme_name = "PM_DATA_FROM_L31_MOD", .pme_code = 0x1c044, .pme_short_desc = "Data loaded from another L3 on same chip modified", .pme_long_desc = "Data loaded from another L3 on same chip modified", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DATA_FROM_L31_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DATA_FROM_L31_MOD] }, [ POWER7_PME_PM_THERMAL_WARN ] = { .pme_name = "PM_THERMAL_WARN", .pme_code = 0x10016, .pme_short_desc = "Processor in Thermal Warning", .pme_long_desc = "Processor in Thermal Warning", .pme_event_ids = power7_event_ids[POWER7_PME_PM_THERMAL_WARN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_THERMAL_WARN] }, [ POWER7_PME_PM_VSU0_4FLOP ] = { .pme_name = "PM_VSU0_4FLOP", .pme_code = 0xa09c, .pme_short_desc = "four flops operation (scalar fdiv", .pme_long_desc = " fsqrt; DP vector version of fmadd", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_4FLOP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_4FLOP] }, [ POWER7_PME_PM_BR_MPRED_CCACHE ] = { .pme_name = "PM_BR_MPRED_CCACHE", .pme_code = 0x40a4, .pme_short_desc = "Branch Mispredict due to Count Cache prediction", .pme_long_desc = "A branch instruction target was incorrectly predicted by the count cache. 
This will result in a branch redirect flush if not overridden by a flush of an older instruction.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BR_MPRED_CCACHE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BR_MPRED_CCACHE] }, [ POWER7_PME_PM_L1_DEMAND_WRITE ] = { .pme_name = "PM_L1_DEMAND_WRITE", .pme_code = 0x408c, .pme_short_desc = "Instruction Demand sectors written into IL1", .pme_long_desc = "Instruction Demand sectors written into IL1", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L1_DEMAND_WRITE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L1_DEMAND_WRITE] }, [ POWER7_PME_PM_FLUSH_BR_MPRED ] = { .pme_name = "PM_FLUSH_BR_MPRED", .pme_code = 0x2084, .pme_short_desc = "Flush caused by branch mispredict", .pme_long_desc = "A flush was caused by a branch mispredict.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FLUSH_BR_MPRED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FLUSH_BR_MPRED] }, [ POWER7_PME_PM_MRK_DTLB_MISS_16G ] = { .pme_name = "PM_MRK_DTLB_MISS_16G", .pme_code = 0x1d05e, .pme_short_desc = "Marked Data TLB misses for 16G page", .pme_long_desc = "Data TLB references to 16GB pages by a marked instruction that missed the TLB. 
Page size is determined at TLB reload time.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DTLB_MISS_16G], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DTLB_MISS_16G] }, [ POWER7_PME_PM_MRK_PTEG_FROM_DMEM ] = { .pme_name = "PM_MRK_PTEG_FROM_DMEM", .pme_code = 0x2d052, .pme_short_desc = "Marked PTEG loaded from distant memory", .pme_long_desc = "A Page Table Entry was loaded into the ERAT from memory attached to a different module than this processor is located on due to a marked load or store.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_PTEG_FROM_DMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_PTEG_FROM_DMEM] }, [ POWER7_PME_PM_L2_RCST_DISP ] = { .pme_name = "PM_L2_RCST_DISP", .pme_code = 0x36280, .pme_short_desc = " L2 RC store dispatch attempt", .pme_long_desc = " L2 RC store dispatch attempt", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_RCST_DISP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_RCST_DISP] }, [ POWER7_PME_PM_CMPLU_STALL ] = { .pme_name = "PM_CMPLU_STALL", .pme_code = 0x4000a, .pme_short_desc = "No groups completed", .pme_long_desc = " GCT not empty", .pme_event_ids = power7_event_ids[POWER7_PME_PM_CMPLU_STALL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_CMPLU_STALL] }, [ POWER7_PME_PM_LSU_PARTIAL_CDF ] = { .pme_name = "PM_LSU_PARTIAL_CDF", .pme_code = 0xc0aa, .pme_short_desc = "A partial cacheline was returned from the L3", .pme_long_desc = "A partial cacheline was returned from the L3", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_PARTIAL_CDF], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_PARTIAL_CDF] }, [ POWER7_PME_PM_DISP_CLB_HELD_SB ] = { .pme_name = "PM_DISP_CLB_HELD_SB", .pme_code = 0x20a8, .pme_short_desc = "Dispatch/CLB Hold: Scoreboard", .pme_long_desc = "Dispatch/CLB Hold: Scoreboard", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DISP_CLB_HELD_SB], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DISP_CLB_HELD_SB] }, [ 
POWER7_PME_PM_VSU0_FMA_DOUBLE ] = { .pme_name = "PM_VSU0_FMA_DOUBLE", .pme_code = 0xa090, .pme_short_desc = "four flop DP vector operations (xvmadddp", .pme_long_desc = " xvnmadddp", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_FMA_DOUBLE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_FMA_DOUBLE] }, [ POWER7_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { .pme_name = "PM_FXU0_BUSY_FXU1_IDLE", .pme_code = 0x3000e, .pme_short_desc = "fxu0 busy and fxu1 idle", .pme_long_desc = "FXU0 is busy while FXU1 was idle", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FXU0_BUSY_FXU1_IDLE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FXU0_BUSY_FXU1_IDLE] }, [ POWER7_PME_PM_IC_DEMAND_CYC ] = { .pme_name = "PM_IC_DEMAND_CYC", .pme_code = 0x10018, .pme_short_desc = "Cycles when a demand ifetch was pending", .pme_long_desc = "Cycles when a demand ifetch was pending", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IC_DEMAND_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IC_DEMAND_CYC] }, [ POWER7_PME_PM_MRK_DATA_FROM_L21_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L21_SHR", .pme_code = 0x3d04e, .pme_short_desc = "Marked data loaded from another L2 on same chip shared", .pme_long_desc = "Marked data loaded from another L2 on same chip shared", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_L21_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_L21_SHR] }, [ POWER7_PME_PM_MRK_LSU_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU_FLUSH_UST", .pme_code = 0xd086, .pme_short_desc = "Flush: (marked) Unaligned Store", .pme_long_desc = "A marked store was flushed because it was unaligned", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_LSU_FLUSH_UST], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_LSU_FLUSH_UST] }, [ POWER7_PME_PM_INST_PTEG_FROM_L3MISS ] = { .pme_name = "PM_INST_PTEG_FROM_L3MISS", .pme_code = 0x2e058, .pme_short_desc = "Instruction PTEG loaded from L3 miss", .pme_long_desc = "Instruction PTEG loaded from L3 miss", 
.pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_PTEG_FROM_L3MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_PTEG_FROM_L3MISS] }, [ POWER7_PME_PM_VSU_DENORM ] = { .pme_name = "PM_VSU_DENORM", .pme_code = 0xa8ac, .pme_short_desc = "Vector or Scalar denorm operand", .pme_long_desc = "Vector or Scalar denorm operand", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_DENORM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_DENORM] }, [ POWER7_PME_PM_MRK_LSU_PARTIAL_CDF ] = { .pme_name = "PM_MRK_LSU_PARTIAL_CDF", .pme_code = 0xd080, .pme_short_desc = "A partial cacheline was returned from the L3 for a marked load", .pme_long_desc = "A partial cacheline was returned from the L3 for a marked load", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_LSU_PARTIAL_CDF], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_LSU_PARTIAL_CDF] }, [ POWER7_PME_PM_INST_FROM_L21_SHR ] = { .pme_name = "PM_INST_FROM_L21_SHR", .pme_code = 0x3404e, .pme_short_desc = "Instruction fetched from another L2 on same chip shared", .pme_long_desc = "Instruction fetched from another L2 on same chip shared", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_L21_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_L21_SHR] }, [ POWER7_PME_PM_IC_PREF_WRITE ] = { .pme_name = "PM_IC_PREF_WRITE", .pme_code = 0x408e, .pme_short_desc = "Instruction prefetch written into IL1", .pme_long_desc = "Number of Instruction Cache entries written because of prefetch. Prefetch entries are marked least recently used and are candidates for eviction if they are not needed to satisfy a demand fetch.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IC_PREF_WRITE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IC_PREF_WRITE] }, [ POWER7_PME_PM_BR_PRED ] = { .pme_name = "PM_BR_PRED", .pme_code = 0x409c, .pme_short_desc = "Branch Predictions made", .pme_long_desc = "A branch prediction was made. 
This could have been a target prediction, a condition prediction, or both", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BR_PRED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BR_PRED] }, [ POWER7_PME_PM_INST_FROM_DMEM ] = { .pme_name = "PM_INST_FROM_DMEM", .pme_code = 0x1404a, .pme_short_desc = "Instruction fetched from distant memory", .pme_long_desc = "An instruction fetch group was fetched from memory attached to a distant module. Fetch groups can contain up to 8 instructions", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_DMEM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_DMEM] }, [ POWER7_PME_PM_IC_PREF_CANCEL_ALL ] = { .pme_name = "PM_IC_PREF_CANCEL_ALL", .pme_code = 0x4890, .pme_short_desc = "Prefetch Canceled due to page boundary or icache hit", .pme_long_desc = "Prefetch Canceled due to page boundary or icache hit", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IC_PREF_CANCEL_ALL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IC_PREF_CANCEL_ALL] }, [ POWER7_PME_PM_LSU_DC_PREF_STREAM_CONFIRM ] = { .pme_name = "PM_LSU_DC_PREF_STREAM_CONFIRM", .pme_code = 0xd8b4, .pme_short_desc = "Dcache new prefetch stream confirmed", .pme_long_desc = "Dcache new prefetch stream confirmed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_DC_PREF_STREAM_CONFIRM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_DC_PREF_STREAM_CONFIRM] }, [ POWER7_PME_PM_MRK_LSU_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_SRQ", .pme_code = 0xd08a, .pme_short_desc = "Flush: (marked) SRQ", .pme_long_desc = "Load Hit Store flush. A marked load was flushed because it hits (overlaps) an older store that is already in the SRQ or in the same group. If the real addresses match but the effective addresses do not, an alias condition exists that prevents store forwarding. If the load and store are in the same group the load must be flushed to separate the two instructions. 
", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_LSU_FLUSH_SRQ], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_LSU_FLUSH_SRQ] }, [ POWER7_PME_PM_MRK_FIN_STALL_CYC ] = { .pme_name = "PM_MRK_FIN_STALL_CYC", .pme_code = 0x1003c, .pme_short_desc = "Marked instruction Finish Stall cycles (marked finish after NTC) ", .pme_long_desc = "Marked instruction Finish Stall cycles (marked finish after NTC) ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_FIN_STALL_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_FIN_STALL_CYC] }, [ POWER7_PME_PM_GCT_UTIL_11PLUS_SLOT ] = { .pme_name = "PM_GCT_UTIL_11+_SLOT", .pme_code = 0x20a2, .pme_short_desc = "GCT Utilization 11+ entries", .pme_long_desc = "GCT Utilization 11+ entries", .pme_event_ids = power7_event_ids[POWER7_PME_PM_GCT_UTIL_11PLUS_SLOT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_GCT_UTIL_11PLUS_SLOT] }, [ POWER7_PME_PM_L2_RCST_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2_RCST_DISP_FAIL_OTHER", .pme_code = 0x46280, .pme_short_desc = " L2 RC store dispatch attempt failed due to other reasons", .pme_long_desc = " L2 RC store dispatch attempt failed due to other reasons", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_RCST_DISP_FAIL_OTHER], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_RCST_DISP_FAIL_OTHER] }, [ POWER7_PME_PM_VSU1_DD_ISSUED ] = { .pme_name = "PM_VSU1_DD_ISSUED", .pme_code = 0xb098, .pme_short_desc = "64BIT Decimal Issued on Pipe1", .pme_long_desc = "64BIT Decimal Issued on Pipe1", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_DD_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_DD_ISSUED] }, [ POWER7_PME_PM_PTEG_FROM_L31_SHR ] = { .pme_name = "PM_PTEG_FROM_L31_SHR", .pme_code = 0x2c056, .pme_short_desc = "PTEG loaded from another L3 on same chip shared", .pme_long_desc = "PTEG loaded from another L3 on same chip shared", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PTEG_FROM_L31_SHR], .pme_group_vector = 
power7_group_vecs[POWER7_PME_PM_PTEG_FROM_L31_SHR] }, [ POWER7_PME_PM_DATA_FROM_L21_SHR ] = { .pme_name = "PM_DATA_FROM_L21_SHR", .pme_code = 0x3c04e, .pme_short_desc = "Data loaded from another L2 on same chip shared", .pme_long_desc = "Data loaded from another L2 on same chip shared", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DATA_FROM_L21_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DATA_FROM_L21_SHR] }, [ POWER7_PME_PM_LSU0_NCLD ] = { .pme_name = "PM_LSU0_NCLD", .pme_code = 0xc08c, .pme_short_desc = "LS0 Non-cacheable Loads counted at finish", .pme_long_desc = "A non-cacheable load was executed by unit 0.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU0_NCLD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU0_NCLD] }, [ POWER7_PME_PM_VSU1_4FLOP ] = { .pme_name = "PM_VSU1_4FLOP", .pme_code = 0xa09e, .pme_short_desc = "four flops operation (scalar fdiv", .pme_long_desc = " fsqrt; DP vector version of fmadd", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_4FLOP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_4FLOP] }, [ POWER7_PME_PM_VSU1_8FLOP ] = { .pme_name = "PM_VSU1_8FLOP", .pme_code = 0xa0a2, .pme_short_desc = "eight flops operation (DP vector versions of fdiv", .pme_long_desc = "fsqrt and SP vector versions of fmadd", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_8FLOP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_8FLOP] }, [ POWER7_PME_PM_VSU_8FLOP ] = { .pme_name = "PM_VSU_8FLOP", .pme_code = 0xa8a0, .pme_short_desc = "eight flops operation (DP vector versions of fdiv", .pme_long_desc = "fsqrt and SP vector versions of fmadd", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_8FLOP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_8FLOP] }, [ POWER7_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", .pme_code = 0x2003e, .pme_short_desc = "LSU empty (lmq and srq empty)", .pme_long_desc = "Cycles when both the LMQ and SRQ are empty (LSU is idle)", 
.pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC] }, [ POWER7_PME_PM_DTLB_MISS_64K ] = { .pme_name = "PM_DTLB_MISS_64K", .pme_code = 0x3c05e, .pme_short_desc = "Data TLB miss for 64K page", .pme_long_desc = "Data TLB references to 64KB pages that missed the TLB. Page size is determined at TLB reload time.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DTLB_MISS_64K], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DTLB_MISS_64K] }, [ POWER7_PME_PM_THRD_CONC_RUN_INST ] = { .pme_name = "PM_THRD_CONC_RUN_INST", .pme_code = 0x300f4, .pme_short_desc = "Concurrent Run Instructions", .pme_long_desc = "Instructions completed by this thread when both threads had their run latches set.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_THRD_CONC_RUN_INST], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_THRD_CONC_RUN_INST] }, [ POWER7_PME_PM_MRK_PTEG_FROM_L2 ] = { .pme_name = "PM_MRK_PTEG_FROM_L2", .pme_code = 0x1d050, .pme_short_desc = "Marked PTEG loaded from L2", .pme_long_desc = "A Page Table Entry was loaded into the ERAT from the local L2 due to a marked load or store.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_PTEG_FROM_L2], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_PTEG_FROM_L2] }, [ POWER7_PME_PM_VSU_FIN ] = { .pme_name = "PM_VSU_FIN", .pme_code = 0xa8bc, .pme_short_desc = "VSU0 Finished an instruction", .pme_long_desc = "VSU0 Finished an instruction", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_FIN] }, [ POWER7_PME_PM_MRK_DATA_FROM_L31_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L31_MOD", .pme_code = 0x1d044, .pme_short_desc = "Marked data loaded from another L3 on same chip modified", .pme_long_desc = "Marked data loaded from another L3 on same chip modified", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_L31_MOD], .pme_group_vector = 
power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_L31_MOD] }, [ POWER7_PME_PM_THRD_PRIO_0_1_CYC ] = { .pme_name = "PM_THRD_PRIO_0_1_CYC", .pme_code = 0x40b0, .pme_short_desc = " Cycles thread running at priority level 0 or 1", .pme_long_desc = " Cycles thread running at priority level 0 or 1", .pme_event_ids = power7_event_ids[POWER7_PME_PM_THRD_PRIO_0_1_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_THRD_PRIO_0_1_CYC] }, [ POWER7_PME_PM_DERAT_MISS_64K ] = { .pme_name = "PM_DERAT_MISS_64K", .pme_code = 0x2c05c, .pme_short_desc = "DERAT misses for 64K page", .pme_long_desc = "A data request (load or store) missed the ERAT for 64K page and resulted in an ERAT reload.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DERAT_MISS_64K], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DERAT_MISS_64K] }, [ POWER7_PME_PM_PMC2_REWIND ] = { .pme_name = "PM_PMC2_REWIND", .pme_code = 0x30020, .pme_short_desc = "PMC2 Rewind Event (did not match condition)", .pme_long_desc = "PMC2 was counting speculatively. The speculative condition was not met and the counter was restored to its previous value.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PMC2_REWIND], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PMC2_REWIND] }, [ POWER7_PME_PM_INST_FROM_L2 ] = { .pme_name = "PM_INST_FROM_L2", .pme_code = 0x14040, .pme_short_desc = "Instruction fetched from L2", .pme_long_desc = "An instruction fetch group was fetched from L2. 
Fetch Groups can contain up to 8 instructions", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_L2], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_L2] }, [ POWER7_PME_PM_GRP_BR_MPRED_NONSPEC ] = { .pme_name = "PM_GRP_BR_MPRED_NONSPEC", .pme_code = 0x1000a, .pme_short_desc = "Group experienced non-speculative branch redirect", .pme_long_desc = "Group experienced non-speculative branch redirect", .pme_event_ids = power7_event_ids[POWER7_PME_PM_GRP_BR_MPRED_NONSPEC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_GRP_BR_MPRED_NONSPEC] }, [ POWER7_PME_PM_INST_DISP ] = { .pme_name = "PM_INST_DISP", .pme_code = 0x200f2, .pme_short_desc = "# PPC Dispatched", .pme_long_desc = "Number of PowerPC instructions successfully dispatched.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_DISP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_DISP] }, [ POWER7_PME_PM_LSU0_DC_PREF_STREAM_CONFIRM ] = { .pme_name = "PM_LSU0_DC_PREF_STREAM_CONFIRM", .pme_code = 0xd0b4, .pme_short_desc = "LS0 Dcache prefetch stream confirmed", .pme_long_desc = "LS0 Dcache prefetch stream confirmed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU0_DC_PREF_STREAM_CONFIRM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU0_DC_PREF_STREAM_CONFIRM] }, [ POWER7_PME_PM_L1_DCACHE_RELOAD_VALID ] = { .pme_name = "PM_L1_DCACHE_RELOAD_VALID", .pme_code = 0x300f6, .pme_short_desc = "L1 reload data source valid", .pme_long_desc = "The data source information is valid, the data cache has been reloaded. Prior to POWER5+ this included data cache reloads due to prefetch activity. 
With POWER5+ this now only includes reloads due to demand loads.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L1_DCACHE_RELOAD_VALID], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L1_DCACHE_RELOAD_VALID] }, [ POWER7_PME_PM_VSU_SCALAR_DOUBLE_ISSUED ] = { .pme_name = "PM_VSU_SCALAR_DOUBLE_ISSUED", .pme_code = 0xb888, .pme_short_desc = "Double Precision scalar instruction issued on Pipe0", .pme_long_desc = "Double Precision scalar instruction issued on Pipe0", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_SCALAR_DOUBLE_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_SCALAR_DOUBLE_ISSUED] }, [ POWER7_PME_PM_L3_PREF_HIT ] = { .pme_name = "PM_L3_PREF_HIT", .pme_code = 0x3f080, .pme_short_desc = "L3 Prefetch Directory Hit", .pme_long_desc = "L3 Prefetch Directory Hit", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L3_PREF_HIT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L3_PREF_HIT] }, [ POWER7_PME_PM_MRK_PTEG_FROM_L31_MOD ] = { .pme_name = "PM_MRK_PTEG_FROM_L31_MOD", .pme_code = 0x1d054, .pme_short_desc = "Marked PTEG loaded from another L3 on same chip modified", .pme_long_desc = "Marked PTEG loaded from another L3 on same chip modified", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_PTEG_FROM_L31_MOD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_PTEG_FROM_L31_MOD] }, [ POWER7_PME_PM_MRK_FXU_FIN ] = { .pme_name = "PM_MRK_FXU_FIN", .pme_code = 0x20038, .pme_short_desc = "fxu marked instr finish", .pme_long_desc = "One of the Fixed Point Units finished a marked instruction. Instructions that finish may not necessarily complete.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_FXU_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_FXU_FIN] }, [ POWER7_PME_PM_PMC4_OVERFLOW ] = { .pme_name = "PM_PMC4_OVERFLOW", .pme_code = 0x10010, .pme_short_desc = "Overflow from counter 4", .pme_long_desc = "Overflows from PMC4 are counted. This effectively widens the PMC. 
The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_PMC4_OVERFLOW], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_PMC4_OVERFLOW] }, [ POWER7_PME_PM_MRK_PTEG_FROM_L3 ] = { .pme_name = "PM_MRK_PTEG_FROM_L3", .pme_code = 0x2d050, .pme_short_desc = "Marked PTEG loaded from L3", .pme_long_desc = "A Page Table Entry was loaded into the ERAT from the local L3 due to a marked load or store.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_PTEG_FROM_L3], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_PTEG_FROM_L3] }, [ POWER7_PME_PM_LSU0_LMQ_LHR_MERGE ] = { .pme_name = "PM_LSU0_LMQ_LHR_MERGE", .pme_code = 0xd098, .pme_short_desc = "LS0 Load Merged with another cacheline request", .pme_long_desc = "LS0 Load Merged with another cacheline request", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU0_LMQ_LHR_MERGE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU0_LMQ_LHR_MERGE] }, [ POWER7_PME_PM_BTAC_HIT ] = { .pme_name = "PM_BTAC_HIT", .pme_code = 0x508a, .pme_short_desc = "BTAC Correct Prediction", .pme_long_desc = "BTAC Correct Prediction", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BTAC_HIT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BTAC_HIT] }, [ POWER7_PME_PM_IERAT_XLATE_WR_16MPLUS ] = { .pme_name = "PM_IERAT_XLATE_WR_16M+", .pme_code = 0x40bc, .pme_short_desc = "large page 16M+", .pme_long_desc = "large page 16M+", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IERAT_XLATE_WR_16MPLUS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IERAT_XLATE_WR_16MPLUS] }, [ POWER7_PME_PM_L3_RD_BUSY ] = { .pme_name = "PM_L3_RD_BUSY", .pme_code = 0x4f082, .pme_short_desc = "Rd machines busy >= threshold (2", .pme_long_desc = "4", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L3_RD_BUSY], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L3_RD_BUSY] }, [ POWER7_PME_PM_INST_FROM_L2MISS ] = { .pme_name = 
"PM_INST_FROM_L2MISS", .pme_code = 0x44048, .pme_short_desc = "Instruction fetched missed L2", .pme_long_desc = "An instruction fetch group was fetched from beyond the local L2.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_L2MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_L2MISS] }, [ POWER7_PME_PM_LSU0_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_LSU0_DC_PREF_STREAM_ALLOC", .pme_code = 0xd0a8, .pme_short_desc = "LS0 D cache new prefetch stream allocated", .pme_long_desc = "LS0 D cache new prefetch stream allocated", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU0_DC_PREF_STREAM_ALLOC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU0_DC_PREF_STREAM_ALLOC] }, [ POWER7_PME_PM_L2_ST ] = { .pme_name = "PM_L2_ST", .pme_code = 0x16082, .pme_short_desc = "Data Store Count", .pme_long_desc = "Data Store Count", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_ST], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_ST] }, [ POWER7_PME_PM_VSU0_DENORM ] = { .pme_name = "PM_VSU0_DENORM", .pme_code = 0xa0ac, .pme_short_desc = "FPU denorm operand", .pme_long_desc = "VSU0 received denormalized data", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_DENORM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_DENORM] }, [ POWER7_PME_PM_MRK_DATA_FROM_DL2L3_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_SHR", .pme_code = 0x3d044, .pme_short_desc = "Marked data loaded from distant L2 or L3 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T or SL) data from an L2 or L3 on a distant module due to a marked load.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_DL2L3_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_DL2L3_SHR] }, [ POWER7_PME_PM_BR_PRED_CR_TA ] = { .pme_name = "PM_BR_PRED_CR_TA", .pme_code = 0x48aa, .pme_short_desc = "Branch predict - taken/not taken and target", .pme_long_desc = "Both the condition (taken or not taken) and the target address 
of a branch instruction was predicted.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BR_PRED_CR_TA], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BR_PRED_CR_TA] }, [ POWER7_PME_PM_VSU0_FCONV ] = { .pme_name = "PM_VSU0_FCONV", .pme_code = 0xa0b0, .pme_short_desc = "Convert instruction executed", .pme_long_desc = "Convert instruction executed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_FCONV], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_FCONV] }, [ POWER7_PME_PM_MRK_LSU_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU_FLUSH_ULD", .pme_code = 0xd084, .pme_short_desc = "Flush: (marked) Unaligned Load", .pme_long_desc = "A marked load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_LSU_FLUSH_ULD], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_LSU_FLUSH_ULD] }, [ POWER7_PME_PM_BTAC_MISS ] = { .pme_name = "PM_BTAC_MISS", .pme_code = 0x5088, .pme_short_desc = "BTAC Mispredicted", .pme_long_desc = "BTAC Mispredicted", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BTAC_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BTAC_MISS] }, [ POWER7_PME_PM_MRK_LD_MISS_EXPOSED_CYC_COUNT ] = { .pme_name = "PM_MRK_LD_MISS_EXPOSED_CYC_COUNT", .pme_code = 0x1003f, .pme_short_desc = "Marked Load exposed Miss (use edge detect to count #)", .pme_long_desc = "Marked Load exposed Miss (use edge detect to count #)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_LD_MISS_EXPOSED_CYC_COUNT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_LD_MISS_EXPOSED_CYC_COUNT] }, [ POWER7_PME_PM_MRK_DATA_FROM_L2 ] = { .pme_name = "PM_MRK_DATA_FROM_L2", .pme_code = 0x1d040, .pme_short_desc = "Marked data loaded from L2", .pme_long_desc = "The processor's Data Cache was reloaded from the local L2 due to a marked load.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_L2], .pme_group_vector = 
power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_L2] }, [ POWER7_PME_PM_VSU_FMA ] = { .pme_name = "PM_VSU_FMA", .pme_code = 0xa884, .pme_short_desc = "two flops operation (fmadd", .pme_long_desc = " fnmadd", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_FMA], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_FMA] }, [ POWER7_PME_PM_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_LSU0_FLUSH_SRQ", .pme_code = 0xc0bc, .pme_short_desc = "LS0 Flush: SRQ", .pme_long_desc = "Load Hit Store flush. A younger load was flushed from unit 0 because it hits (overlaps) an older store that is already in the SRQ or in the same group. If the real addresses match but the effective addresses do not, an alias condition exists that prevents store forwarding. If the load and store are in the same group the load must be flushed to separate the two instructions. ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU0_FLUSH_SRQ], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU0_FLUSH_SRQ] }, [ POWER7_PME_PM_LSU1_L1_PREF ] = { .pme_name = "PM_LSU1_L1_PREF", .pme_code = 0xd0ba, .pme_short_desc = " LS1 L1 cache data prefetches", .pme_long_desc = " LS1 L1 cache data prefetches", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU1_L1_PREF], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU1_L1_PREF] }, [ POWER7_PME_PM_IOPS_CMPL ] = { .pme_name = "PM_IOPS_CMPL", .pme_code = 0x10014, .pme_short_desc = "Internal Operations completed", .pme_long_desc = "Number of internal operations that completed.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IOPS_CMPL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IOPS_CMPL] }, [ POWER7_PME_PM_L2_SYS_PUMP ] = { .pme_name = "PM_L2_SYS_PUMP", .pme_code = 0x36482, .pme_short_desc = "RC req that was a global (aka system) pump attempt", .pme_long_desc = "RC req that was a global (aka system) pump attempt", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_SYS_PUMP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_SYS_PUMP] }, [ 
POWER7_PME_PM_L2_RCLD_BUSY_RC_FULL ] = { .pme_name = "PM_L2_RCLD_BUSY_RC_FULL", .pme_code = 0x46282, .pme_short_desc = " L2 activated Busy to the core for loads due to all RC full", .pme_long_desc = " L2 activated Busy to the core for loads due to all RC full", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_RCLD_BUSY_RC_FULL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_RCLD_BUSY_RC_FULL] }, [ POWER7_PME_PM_BCPLUS8_RSLV_TAKEN ] = { .pme_name = "PM_BC+8_RSLV_TAKEN", .pme_code = 0x40ba, .pme_short_desc = "BC+8 Resolve outcome was Taken", .pme_long_desc = " resulting in the conditional instruction being canceled", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BCPLUS8_RSLV_TAKEN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BCPLUS8_RSLV_TAKEN] }, [ POWER7_PME_PM_NEST_5 ] = { .pme_name = "PM_NEST_5", .pme_code = 0x89, .pme_short_desc = "PlaceHolder for Nest events (MC0/MC1/PB/GX)", .pme_long_desc = "PlaceHolder for Nest events (MC0/MC1/PB/GX)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_NEST_5], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_NEST_5] }, [ POWER7_PME_PM_LSU_LMQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LMQ_S0_ALLOC", .pme_code = 0xd0a1, .pme_short_desc = "Slot 0 of LMQ valid", .pme_long_desc = "Slot 0 of LMQ valid", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_LMQ_S0_ALLOC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_LMQ_S0_ALLOC] }, [ POWER7_PME_PM_FLUSH_DISP_SYNC ] = { .pme_name = "PM_FLUSH_DISP_SYNC", .pme_code = 0x2088, .pme_short_desc = "Dispatch Flush: Sync", .pme_long_desc = "Dispatch Flush: Sync", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FLUSH_DISP_SYNC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FLUSH_DISP_SYNC] }, [ POWER7_PME_PM_L2_IC_INV ] = { .pme_name = "PM_L2_IC_INV", .pme_code = 0x26180, .pme_short_desc = "Icache Invalidates from L2 ", .pme_long_desc = "Icache Invalidates from L2 ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_IC_INV], .pme_group_vector = 
power7_group_vecs[POWER7_PME_PM_L2_IC_INV] }, [ POWER7_PME_PM_MRK_DATA_FROM_L21_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L21_MOD_CYC", .pme_code = 0x40024, .pme_short_desc = "Marked ld latency Data source 0101 (L2.1 M same chip)", .pme_long_desc = "Marked ld latency Data source 0101 (L2.1 M same chip)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_L21_MOD_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_L21_MOD_CYC] }, [ POWER7_PME_PM_L3_PREF_LDST ] = { .pme_name = "PM_L3_PREF_LDST", .pme_code = 0xd8ac, .pme_short_desc = "L3 cache prefetches LD + ST", .pme_long_desc = "L3 cache prefetches LD + ST", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L3_PREF_LDST], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L3_PREF_LDST] }, [ POWER7_PME_PM_LSU_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_SRQ_EMPTY_CYC", .pme_code = 0x40008, .pme_short_desc = "ALL threads srq empty", .pme_long_desc = "The Store Request Queue is empty", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_SRQ_EMPTY_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_SRQ_EMPTY_CYC] }, [ POWER7_PME_PM_LSU_LMQ_S0_VALID ] = { .pme_name = "PM_LSU_LMQ_S0_VALID", .pme_code = 0xd0a0, .pme_short_desc = "Slot 0 of LMQ valid", .pme_long_desc = "This signal is asserted every cycle that the Load Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. 
In SMT mode the LRQ is split between the two threads (16 entries each).", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_LMQ_S0_VALID], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_LMQ_S0_VALID] }, [ POWER7_PME_PM_FLUSH_PARTIAL ] = { .pme_name = "PM_FLUSH_PARTIAL", .pme_code = 0x2086, .pme_short_desc = "Partial flush", .pme_long_desc = "Partial flush", .pme_event_ids = power7_event_ids[POWER7_PME_PM_FLUSH_PARTIAL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_FLUSH_PARTIAL] }, [ POWER7_PME_PM_VSU1_FMA_DOUBLE ] = { .pme_name = "PM_VSU1_FMA_DOUBLE", .pme_code = 0xa092, .pme_short_desc = "four flop DP vector operations (xvmadddp", .pme_long_desc = " xvnmadddp", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_FMA_DOUBLE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_FMA_DOUBLE] }, [ POWER7_PME_PM_1PLUS_PPC_DISP ] = { .pme_name = "PM_1PLUS_PPC_DISP", .pme_code = 0x400f2, .pme_short_desc = "Cycles at least one Instr Dispatched", .pme_long_desc = "", .pme_event_ids = power7_event_ids[POWER7_PME_PM_1PLUS_PPC_DISP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_1PLUS_PPC_DISP] }, [ POWER7_PME_PM_DATA_FROM_L2MISS ] = { .pme_name = "PM_DATA_FROM_L2MISS", .pme_code = 0x200fe, .pme_short_desc = "Demand LD - L2 Miss (not L2 hit)", .pme_long_desc = "The processor's Data Cache was reloaded but not from the local L2.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DATA_FROM_L2MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DATA_FROM_L2MISS] }, [ POWER7_PME_PM_SUSPENDED ] = { .pme_name = "PM_SUSPENDED", .pme_code = 0x0, .pme_short_desc = "Counter OFF", .pme_long_desc = "The counter is suspended (does not count)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_SUSPENDED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_SUSPENDED] }, [ POWER7_PME_PM_VSU0_FMA ] = { .pme_name = "PM_VSU0_FMA", .pme_code = 0xa084, .pme_short_desc = "two flops operation (fmadd", .pme_long_desc = " fnmadd", .pme_event_ids = 
power7_event_ids[POWER7_PME_PM_VSU0_FMA], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_FMA] }, [ POWER7_PME_PM_CMPLU_STALL_SCALAR ] = { .pme_name = "PM_CMPLU_STALL_SCALAR", .pme_code = 0x40012, .pme_short_desc = "Completion stall caused by FPU instruction", .pme_long_desc = "Completion stall caused by FPU instruction", .pme_event_ids = power7_event_ids[POWER7_PME_PM_CMPLU_STALL_SCALAR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_CMPLU_STALL_SCALAR] }, [ POWER7_PME_PM_STCX_FAIL ] = { .pme_name = "PM_STCX_FAIL", .pme_code = 0xc09a, .pme_short_desc = "STCX failed", .pme_long_desc = "A stcx (stwcx or stdcx) failed", .pme_event_ids = power7_event_ids[POWER7_PME_PM_STCX_FAIL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_STCX_FAIL] }, [ POWER7_PME_PM_VSU0_FSQRT_FDIV_DOUBLE ] = { .pme_name = "PM_VSU0_FSQRT_FDIV_DOUBLE", .pme_code = 0xa094, .pme_short_desc = "eight flop DP vector operations (xvfdivdp", .pme_long_desc = " xvsqrtdp ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU0_FSQRT_FDIV_DOUBLE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_FSQRT_FDIV_DOUBLE] }, [ POWER7_PME_PM_DC_PREF_DST ] = { .pme_name = "PM_DC_PREF_DST", .pme_code = 0xd0b0, .pme_short_desc = "Data Stream Touch", .pme_long_desc = "A prefetch stream was started using the DST instruction.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DC_PREF_DST], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DC_PREF_DST] }, [ POWER7_PME_PM_VSU1_SCAL_SINGLE_ISSUED ] = { .pme_name = "PM_VSU1_SCAL_SINGLE_ISSUED", .pme_code = 0xb086, .pme_short_desc = "Single Precision scalar instruction issued on Pipe1", .pme_long_desc = "Single Precision scalar instruction issued on Pipe1", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_SCAL_SINGLE_ISSUED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_SCAL_SINGLE_ISSUED] }, [ POWER7_PME_PM_L3_HIT ] = { .pme_name = "PM_L3_HIT", .pme_code = 0x1f080, .pme_short_desc = "L3 Hits", .pme_long_desc = "L3 Hits", 
.pme_event_ids = power7_event_ids[POWER7_PME_PM_L3_HIT], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L3_HIT] }, [ POWER7_PME_PM_L2_GLOB_GUESS_WRONG ] = { .pme_name = "PM_L2_GLOB_GUESS_WRONG", .pme_code = 0x26482, .pme_short_desc = "L2 guess glb and guess was not correct (ie data local)", .pme_long_desc = "L2 guess glb and guess was not correct (ie data local)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_GLOB_GUESS_WRONG], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_GLOB_GUESS_WRONG] }, [ POWER7_PME_PM_MRK_DFU_FIN ] = { .pme_name = "PM_MRK_DFU_FIN", .pme_code = 0x20032, .pme_short_desc = "Decimal Unit marked Instruction Finish", .pme_long_desc = "The Decimal Floating Point Unit finished a marked instruction.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DFU_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DFU_FIN] }, [ POWER7_PME_PM_INST_FROM_L1 ] = { .pme_name = "PM_INST_FROM_L1", .pme_code = 0x4080, .pme_short_desc = "Instruction fetches from L1", .pme_long_desc = "An instruction fetch group was fetched from L1. 
Fetch Groups can contain up to 8 instructions", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_L1], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_L1] }, [ POWER7_PME_PM_BRU_FIN ] = { .pme_name = "PM_BRU_FIN", .pme_code = 0x10068, .pme_short_desc = "Branch Instruction Finished ", .pme_long_desc = "The Branch execution unit finished an instruction", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BRU_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BRU_FIN] }, [ POWER7_PME_PM_IC_DEMAND_REQ ] = { .pme_name = "PM_IC_DEMAND_REQ", .pme_code = 0x4088, .pme_short_desc = "Demand Instruction fetch request", .pme_long_desc = "Demand Instruction fetch request", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IC_DEMAND_REQ], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IC_DEMAND_REQ] }, [ POWER7_PME_PM_VSU1_FSQRT_FDIV_DOUBLE ] = { .pme_name = "PM_VSU1_FSQRT_FDIV_DOUBLE", .pme_code = 0xa096, .pme_short_desc = "eight flop DP vector operations (xvfdivdp", .pme_long_desc = " xvsqrtdp ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_FSQRT_FDIV_DOUBLE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_FSQRT_FDIV_DOUBLE] }, [ POWER7_PME_PM_VSU1_FMA ] = { .pme_name = "PM_VSU1_FMA", .pme_code = 0xa086, .pme_short_desc = "two flops operation (fmadd", .pme_long_desc = " fnmadd", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_FMA], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_FMA] }, [ POWER7_PME_PM_MRK_LD_MISS_L1 ] = { .pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 0x20036, .pme_short_desc = "Marked DL1 Demand Miss", .pme_long_desc = "Marked L1 D cache load misses", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_LD_MISS_L1], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_LD_MISS_L1] }, [ POWER7_PME_PM_VSU0_2FLOP_DOUBLE ] = { .pme_name = "PM_VSU0_2FLOP_DOUBLE", .pme_code = 0xa08c, .pme_short_desc = "two flop DP vector operation (xvadddp", .pme_long_desc = " xvmuldp", .pme_event_ids = 
power7_event_ids[POWER7_PME_PM_VSU0_2FLOP_DOUBLE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU0_2FLOP_DOUBLE] }, [ POWER7_PME_PM_LSU_DC_PREF_STRIDED_STREAM_CONFIRM ] = { .pme_name = "PM_LSU_DC_PREF_STRIDED_STREAM_CONFIRM", .pme_code = 0xd8bc, .pme_short_desc = "Dcache Strided prefetch stream confirmed (software + hardware)", .pme_long_desc = "Dcache Strided prefetch stream confirmed (software + hardware)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU_DC_PREF_STRIDED_STREAM_CONFIRM], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU_DC_PREF_STRIDED_STREAM_CONFIRM] }, [ POWER7_PME_PM_INST_PTEG_FROM_L31_SHR ] = { .pme_name = "PM_INST_PTEG_FROM_L31_SHR", .pme_code = 0x2e056, .pme_short_desc = "Instruction PTEG loaded from another L3 on same chip shared", .pme_long_desc = "Instruction PTEG loaded from another L3 on same chip shared", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_PTEG_FROM_L31_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_PTEG_FROM_L31_SHR] }, [ POWER7_PME_PM_MRK_LSU_REJECT_ERAT_MISS ] = { .pme_name = "PM_MRK_LSU_REJECT_ERAT_MISS", .pme_code = 0x30064, .pme_short_desc = "LSU marked reject due to ERAT (up to 2 per cycle)", .pme_long_desc = "LSU marked reject due to ERAT (up to 2 per cycle)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_LSU_REJECT_ERAT_MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_LSU_REJECT_ERAT_MISS] }, [ POWER7_PME_PM_MRK_DATA_FROM_L2MISS ] = { .pme_name = "PM_MRK_DATA_FROM_L2MISS", .pme_code = 0x4d048, .pme_short_desc = "Marked data loaded missed L2", .pme_long_desc = "DL1 was reloaded from beyond L2 due to a marked demand load.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_L2MISS], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_L2MISS] }, [ POWER7_PME_PM_DATA_FROM_RL2L3_SHR ] = { .pme_name = "PM_DATA_FROM_RL2L3_SHR", .pme_code = 0x1c04c, .pme_short_desc = "Data loaded from remote L2 or L3 shared", .pme_long_desc = "The 
processor's Data Cache was reloaded with shared (T or SL) data from an L2 or L3 on a remote module due to a demand load", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DATA_FROM_RL2L3_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DATA_FROM_RL2L3_SHR] }, [ POWER7_PME_PM_INST_FROM_PREF ] = { .pme_name = "PM_INST_FROM_PREF", .pme_code = 0x14046, .pme_short_desc = "Instruction fetched from prefetch", .pme_long_desc = "An instruction fetch group was fetched from the prefetch buffer. Fetch groups can contain up to 8 instructions", .pme_event_ids = power7_event_ids[POWER7_PME_PM_INST_FROM_PREF], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_INST_FROM_PREF] }, [ POWER7_PME_PM_VSU1_SQ ] = { .pme_name = "PM_VSU1_SQ", .pme_code = 0xb09e, .pme_short_desc = "Store Vector Issued on Pipe1", .pme_long_desc = "Store Vector Issued on Pipe1", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU1_SQ], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU1_SQ] }, [ POWER7_PME_PM_L2_LD_DISP ] = { .pme_name = "PM_L2_LD_DISP", .pme_code = 0x36180, .pme_short_desc = "All successful load dispatches", .pme_long_desc = "All successful load dispatches", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_LD_DISP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_LD_DISP] }, [ POWER7_PME_PM_L2_DISP_ALL ] = { .pme_name = "PM_L2_DISP_ALL", .pme_code = 0x46080, .pme_short_desc = "All successful LD/ST dispatches for this thread(i+d)", .pme_long_desc = "All successful LD/ST dispatches for this thread(i+d)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_L2_DISP_ALL], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_L2_DISP_ALL] }, [ POWER7_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { .pme_name = "PM_THRD_GRP_CMPL_BOTH_CYC", .pme_code = 0x10012, .pme_short_desc = "Cycles group completed by both threads", .pme_long_desc = "Cycles that both threads completed.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_THRD_GRP_CMPL_BOTH_CYC], .pme_group_vector = 
power7_group_vecs[POWER7_PME_PM_THRD_GRP_CMPL_BOTH_CYC] }, [ POWER7_PME_PM_VSU_FSQRT_FDIV_DOUBLE ] = { .pme_name = "PM_VSU_FSQRT_FDIV_DOUBLE", .pme_code = 0xa894, .pme_short_desc = "DP vector versions of fdiv", .pme_long_desc = "fsqrt ", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_FSQRT_FDIV_DOUBLE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_FSQRT_FDIV_DOUBLE] }, [ POWER7_PME_PM_BR_MPRED ] = { .pme_name = "PM_BR_MPRED", .pme_code = 0x400f6, .pme_short_desc = "Number of Branch Mispredicts", .pme_long_desc = "A branch instruction was incorrectly predicted. This could have been a target prediction, a condition prediction, or both", .pme_event_ids = power7_event_ids[POWER7_PME_PM_BR_MPRED], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_BR_MPRED] }, [ POWER7_PME_PM_VSU_1FLOP ] = { .pme_name = "PM_VSU_1FLOP", .pme_code = 0xa880, .pme_short_desc = "one flop (fadd", .pme_long_desc = " fmul", .pme_event_ids = power7_event_ids[POWER7_PME_PM_VSU_1FLOP], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_VSU_1FLOP] }, [ POWER7_PME_PM_HV_CYC ] = { .pme_name = "PM_HV_CYC", .pme_code = 0x2000a, .pme_short_desc = "cycles in hypervisor mode ", .pme_long_desc = "Cycles when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0)", .pme_event_ids = power7_event_ids[POWER7_PME_PM_HV_CYC], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_HV_CYC] }, [ POWER7_PME_PM_MRK_DATA_FROM_RL2L3_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_SHR", .pme_code = 0x1d04c, .pme_short_desc = "Marked data loaded from remote L2 or L3 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T or SL) data from an L2 or L3 on a remote module due to a marked load", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_DATA_FROM_RL2L3_SHR], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_DATA_FROM_RL2L3_SHR] }, [ POWER7_PME_PM_DTLB_MISS_16M ] = { .pme_name = "PM_DTLB_MISS_16M", .pme_code = 0x4c05e, .pme_short_desc = "Data TLB miss for 16M 
page", .pme_long_desc = "Data TLB references to 16MB pages that missed the TLB. Page size is determined at TLB reload time.", .pme_event_ids = power7_event_ids[POWER7_PME_PM_DTLB_MISS_16M], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_DTLB_MISS_16M] }, [ POWER7_PME_PM_MRK_LSU_FIN ] = { .pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x40032, .pme_short_desc = "Marked LSU instruction finished", .pme_long_desc = "One of the Load/Store Units finished a marked instruction. Instructions that finish may not necessarily complete", .pme_event_ids = power7_event_ids[POWER7_PME_PM_MRK_LSU_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_MRK_LSU_FIN] }, [ POWER7_PME_PM_LSU1_LMQ_LHR_MERGE ] = { .pme_name = "PM_LSU1_LMQ_LHR_MERGE", .pme_code = 0xd09a, .pme_short_desc = "LS1 Load Merge with another cacheline request", .pme_long_desc = "LS1 Load Merge with another cacheline request", .pme_event_ids = power7_event_ids[POWER7_PME_PM_LSU1_LMQ_LHR_MERGE], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_LSU1_LMQ_LHR_MERGE] }, [ POWER7_PME_PM_IFU_FIN ] = { .pme_name = "PM_IFU_FIN", .pme_code = 0x40066, .pme_short_desc = "IFU Finished a (non-branch) instruction", .pme_long_desc = "The Instruction Fetch Unit finished an instruction", .pme_event_ids = power7_event_ids[POWER7_PME_PM_IFU_FIN], .pme_group_vector = power7_group_vecs[POWER7_PME_PM_IFU_FIN] } }; #define POWER7_PME_EVENT_COUNT 517 static const int power7_group_event_ids[][POWER7_NUM_EVENT_COUNTERS] = { [ 0 ] = { 21, 225, 75, 75, 0, 0 }, [ 1 ] = { 10, 11, 3, 8, 0, 0 }, [ 2 ] = { 9, 9, 9, 13, 0, 0 }, [ 3 ] = { 16, 13, 8, 3, 0, 0 }, [ 4 ] = { 5, 14, 7, 4, 0, 0 }, [ 5 ] = { 12, 4, 8, 11, 0, 0 }, [ 6 ] = { 10, 11, 10, 14, 0, 0 }, [ 7 ] = { 5, 9, 9, 13, 0, 0 }, [ 8 ] = { 8, 9, 9, 13, 0, 0 }, [ 9 ] = { 4, 9, 9, 13, 0, 0 }, [ 10 ] = { 77, 40, 89, 221, 0, 0 }, [ 11 ] = { 18, 244, 38, 87, 0, 0 }, [ 12 ] = { 40, 41, 39, 38, 0, 0 }, [ 13 ] = { 30, 31, 27, 29, 0, 0 }, [ 14 ] = { 80, 31, 27, 29, 0, 0 }, [ 15 ] = { 39, 25, 117, 
105, 0, 0 }, [ 16 ] = { 21, 223, 117, 214, 0, 0 }, [ 17 ] = { 21, 223, 117, 212, 0, 0 }, [ 18 ] = { 39, 82, 74, 214, 0, 0 }, [ 19 ] = { 77, 40, 89, 75, 0, 0 }, [ 20 ] = { 223, 85, 218, 81, 0, 0 }, [ 21 ] = { 91, 221, 85, 210, 0, 0 }, [ 22 ] = { 224, 223, 86, 213, 0, 0 }, [ 23 ] = { 93, 220, 220, 213, 0, 0 }, [ 24 ] = { 225, 222, 219, 211, 0, 0 }, [ 25 ] = { 92, 84, 84, 84, 0, 0 }, [ 26 ] = { 92, 86, 84, 82, 0, 0 }, [ 27 ] = { 91, 87, 86, 83, 0, 0 }, [ 28 ] = { 223, 221, 220, 212, 0, 0 }, [ 29 ] = { 223, 221, 74, 23, 0, 0 }, [ 30 ] = { 225, 224, 74, 210, 0, 0 }, [ 31 ] = { 80, 220, 220, 213, 0, 0 }, [ 32 ] = { 222, 38, 48, 47, 0, 0 }, [ 33 ] = { 222, 38, 34, 47, 0, 0 }, [ 34 ] = { 117, 112, 113, 137, 0, 0 }, [ 35 ] = { 46, 48, 44, 40, 0, 0 }, [ 36 ] = { 48, 45, 119, 126, 0, 0 }, [ 37 ] = { 44, 23, 42, 40, 0, 0 }, [ 38 ] = { 126, 122, 120, 114, 0, 0 }, [ 39 ] = { 126, 150, 165, 40, 0, 0 }, [ 40 ] = { 127, 151, 166, 40, 0, 0 }, [ 41 ] = { 124, 148, 163, 40, 0, 0 }, [ 42 ] = { 125, 149, 164, 40, 0, 0 }, [ 43 ] = { 63, 69, 69, 68, 0, 0 }, [ 44 ] = { 0, 0, 0, 0, 0, 0 }, [ 45 ] = { 241, 239, 234, 229, 0, 0 }, [ 46 ] = { 0, 0, 0, 0, 0, 0 }, [ 47 ] = { 0, 0, 0, 0, 0, 0 }, [ 48 ] = { 242, 241, 237, 232, 0, 0 }, [ 49 ] = { 0, 0, 0, 0, 0, 0 }, [ 50 ] = { 49, 50, 49, 48, 0, 0 }, [ 51 ] = { 50, 225, 74, 49, 0, 0 }, [ 52 ] = { 21, 50, 49, 48, 0, 0 }, [ 53 ] = { 49, 50, 17, 75, 0, 0 }, [ 54 ] = { 106, 100, 99, 91, 0, 0 }, [ 55 ] = { 101, 23, 98, 75, 0, 0 }, [ 56 ] = { 106, 100, 100, 92, 0, 0 }, [ 57 ] = { 108, 97, 74, 23, 0, 0 }, [ 58 ] = { 80, 23, 96, 96, 0, 0 }, [ 59 ] = { 80, 23, 95, 95, 0, 0 }, [ 60 ] = { 107, 101, 74, 23, 0, 0 }, [ 61 ] = { 210, 211, 207, 203, 0, 0 }, [ 62 ] = { 214, 215, 211, 207, 0, 0 }, [ 63 ] = { 64, 63, 62, 58, 0, 0 }, [ 64 ] = { 23, 77, 90, 0, 0, 0 }, [ 65 ] = { 24, 23, 90, 75, 0, 0 }, [ 66 ] = { 27, 29, 90, 75, 0, 0 }, [ 67 ] = { 139, 157, 172, 127, 0, 0 }, [ 68 ] = { 139, 134, 137, 131, 0, 0 }, [ 69 ] = { 142, 138, 17, 75, 0, 0 }, [ 70 ] = { 141, 158, 
173, 75, 0, 0 }, [ 71 ] = { 136, 156, 171, 75, 0, 0 }, [ 72 ] = { 52, 51, 50, 23, 0, 0 }, [ 73 ] = { 53, 56, 54, 54, 0, 0 }, [ 74 ] = { 99, 94, 74, 23, 0, 0 }, [ 75 ] = { 100, 96, 74, 23, 0, 0 }, [ 76 ] = { 37, 38, 34, 0, 0, 0 }, [ 77 ] = { 238, 38, 34, 226, 0, 0 }, [ 78 ] = { 32, 34, 33, 34, 0, 0 }, [ 79 ] = { 222, 219, 217, 209, 0, 0 }, [ 80 ] = { 0, 77, 56, 0, 0, 0 }, [ 81 ] = { 0, 23, 74, 0, 0, 0 }, [ 82 ] = { 75, 73, 63, 60, 0, 0 }, [ 83 ] = { 71, 66, 65, 63, 0, 0 }, [ 84 ] = { 77, 92, 74, 23, 0, 0 }, [ 85 ] = { 23, 17, 90, 20, 0, 0 }, [ 86 ] = { 49, 19, 55, 19, 0, 0 }, [ 87 ] = { 237, 20, 25, 21, 0, 0 }, [ 88 ] = { 42, 21, 201, 22, 0, 0 }, [ 89 ] = { 19, 22, 202, 18, 0, 0 }, [ 90 ] = { 20, 18, 74, 52, 0, 0 }, [ 91 ] = { 52, 53, 87, 51, 0, 0 }, [ 92 ] = { 23, 26, 24, 27, 0, 0 }, [ 93 ] = { 24, 27, 23, 26, 0, 0 }, [ 94 ] = { 22, 28, 20, 26, 0, 0 }, [ 95 ] = { 25, 29, 18, 24, 0, 0 }, [ 96 ] = { 26, 24, 19, 25, 0, 0 }, [ 97 ] = { 27, 29, 21, 26, 0, 0 }, [ 98 ] = { 28, 28, 18, 24, 0, 0 }, [ 99 ] = { 80, 26, 22, 27, 0, 0 }, [ 100 ] = { 23, 25, 90, 105, 0, 0 }, [ 101 ] = { 27, 29, 19, 24, 0, 0 }, [ 102 ] = { 23, 25, 22, 214, 0, 0 }, [ 103 ] = { 27, 24, 24, 27, 0, 0 }, [ 104 ] = { 30, 76, 19, 24, 0, 0 }, [ 105 ] = { 22, 76, 24, 27, 0, 0 }, [ 106 ] = { 22, 76, 90, 27, 0, 0 }, [ 107 ] = { 83, 80, 81, 79, 0, 0 }, [ 108 ] = { 84, 78, 76, 80, 0, 0 }, [ 109 ] = { 81, 82, 77, 76, 0, 0 }, [ 110 ] = { 85, 81, 79, 78, 0, 0 }, [ 111 ] = { 86, 83, 80, 79, 0, 0 }, [ 112 ] = { 87, 82, 81, 79, 0, 0 }, [ 113 ] = { 88, 83, 77, 76, 0, 0 }, [ 114 ] = { 89, 82, 81, 79, 0, 0 }, [ 115 ] = { 87, 78, 82, 80, 0, 0 }, [ 116 ] = { 83, 80, 74, 23, 0, 0 }, [ 117 ] = { 88, 83, 81, 75, 0, 0 }, [ 118 ] = { 21, 76, 77, 76, 0, 0 }, [ 119 ] = { 81, 76, 82, 80, 0, 0 }, [ 120 ] = { 120, 106, 115, 89, 0, 0 }, [ 121 ] = { 122, 111, 118, 105, 0, 0 }, [ 122 ] = { 270, 291, 263, 280, 0, 0 }, [ 123 ] = { 273, 294, 266, 283, 0, 0 }, [ 124 ] = { 249, 248, 264, 281, 0, 0 }, [ 125 ] = { 280, 302, 249, 75, 0, 0 }, 
[ 126 ] = { 281, 303, 250, 75, 0, 0 }, [ 127 ] = { 267, 289, 307, 75, 0, 0 }, [ 128 ] = { 253, 274, 291, 75, 0, 0 }, [ 129 ] = { 256, 277, 295, 75, 0, 0 }, [ 130 ] = { 266, 288, 306, 75, 0, 0 }, [ 131 ] = { 265, 287, 304, 255, 0, 0 }, [ 132 ] = { 260, 282, 299, 75, 0, 0 }, [ 133 ] = { 261, 283, 300, 75, 0, 0 }, [ 134 ] = { 262, 284, 302, 75, 0, 0 }, [ 135 ] = { 263, 285, 303, 75, 0, 0 }, [ 136 ] = { 248, 249, 244, 75, 0, 0 }, [ 137 ] = { 268, 290, 274, 75, 0, 0 }, [ 138 ] = { 264, 286, 267, 233, 0, 0 }, [ 139 ] = { 298, 298, 301, 299, 0, 0 }, [ 140 ] = { 254, 275, 293, 75, 0, 0 }, [ 141 ] = { 259, 281, 298, 75, 0, 0 }, [ 142 ] = { 255, 276, 294, 75, 0, 0 }, [ 143 ] = { 16, 225, 74, 242, 0, 0 }, [ 144 ] = { 129, 264, 249, 234, 0, 0 }, [ 145 ] = { 260, 254, 249, 234, 0, 0 }, [ 146 ] = { 42, 254, 247, 234, 0, 0 }, [ 147 ] = { 266, 254, 251, 240, 0, 0 }, [ 148 ] = { 131, 128, 129, 131, 0, 0 }, [ 149 ] = { 128, 132, 118, 112, 0, 0 }, [ 150 ] = { 161, 170, 128, 119, 0, 0 }, [ 151 ] = { 147, 159, 174, 75, 0, 0 }, [ 152 ] = { 149, 142, 140, 75, 0, 0 }, [ 153 ] = { 146, 130, 128, 75, 0, 0 }, [ 154 ] = { 132, 129, 139, 75, 0, 0 }, [ 155 ] = { 98, 152, 167, 75, 0, 0 }, [ 156 ] = { 105, 99, 17, 75, 0, 0 }, [ 157 ] = { 102, 95, 17, 75, 0, 0 }, [ 158 ] = { 90, 79, 83, 75, 0, 0 }, [ 159 ] = { 41, 43, 232, 23, 0, 0 }, [ 160 ] = { 0, 58, 75, 0, 0, 0 }, [ 161 ] = { 58, 53, 17, 52, 0, 0 }, [ 162 ] = { 57, 4, 3, 3, 0, 0 }, [ 163 ] = { 97, 70, 71, 75, 0, 0 }, [ 164 ] = { 246, 58, 17, 74, 0, 0 }, [ 165 ] = { 7, 43, 90, 3, 0, 0 }, [ 166 ] = { 43, 49, 138, 3, 0, 0 }, [ 167 ] = { 144, 114, 92, 57, 0, 0 }, [ 168 ] = { 42, 23, 55, 75, 0, 0 }, [ 169 ] = { 80, 233, 232, 40, 0, 0 }, [ 170 ] = { 52, 233, 38, 3, 0, 0 }, [ 171 ] = { 21, 23, 74, 74, 0, 0 }, [ 172 ] = { 236, 23, 175, 75, 0, 0 }, [ 173 ] = { 94, 23, 87, 75, 0, 0 }, [ 174 ] = { 181, 23, 176, 75, 0, 0 }, [ 175 ] = { 21, 226, 88, 36, 0, 0 }, [ 176 ] = { 109, 103, 103, 23, 0, 0 }, [ 177 ] = { 229, 227, 225, 219, 0, 0 }, [ 178 ] = { 111, 
107, 105, 101, 0, 0 }, [ 179 ] = { 110, 104, 106, 97, 0, 0 }, [ 180 ] = { 21, 115, 146, 154, 0, 0 }, [ 181 ] = { 21, 116, 147, 155, 0, 0 }, [ 182 ] = { 29, 114, 145, 153, 0, 0 }, [ 183 ] = { 115, 110, 17, 102, 0, 0 }, [ 184 ] = { 21, 123, 153, 161, 0, 0 }, [ 185 ] = { 21, 124, 154, 162, 0, 0 }, [ 186 ] = { 103, 102, 103, 23, 0, 0 }, [ 187 ] = { 114, 135, 229, 224, 0, 0 }, [ 188 ] = { 17, 16, 229, 224, 0, 0 }, [ 189 ] = { 2, 1, 17, 75, 0, 0 }, [ 190 ] = { 90, 77, 83, 75, 0, 0 }, [ 191 ] = { 104, 98, 94, 90, 0, 0 }, [ 192 ] = { 80, 23, 93, 90, 0, 0 }, [ 193 ] = { 80, 23, 102, 214, 0, 0 }, [ 194 ] = { 80, 23, 101, 94, 0, 0 }, [ 195 ] = { 80, 23, 97, 214, 0, 0 }, [ 196 ] = { 80, 225, 17, 93, 0, 0 }, [ 197 ] = { 77, 75, 72, 75, 0, 0 }, [ 198 ] = { 31, 35, 17, 75, 0, 0 }, [ 199 ] = { 21, 38, 35, 75, 0, 0 }, [ 200 ] = { 226, 225, 17, 215, 0, 0 }, [ 201 ] = { 219, 218, 213, 208, 0, 0 }, [ 202 ] = { 221, 218, 216, 208, 0, 0 }, [ 203 ] = { 220, 225, 214, 75, 0, 0 }, [ 204 ] = { 218, 225, 215, 75, 0, 0 }, [ 205 ] = { 47, 37, 227, 75, 0, 0 }, [ 206 ] = { 77, 92, 228, 105, 0, 0 }, [ 207 ] = { 21, 117, 38, 87, 0, 0 }, [ 208 ] = { 1, 225, 17, 215, 0, 0 }, [ 209 ] = { 42, 225, 17, 214, 0, 0 }, [ 210 ] = { 0, 225, 75, 0, 0, 0 }, [ 211 ] = { 80, 233, 228, 105, 0, 0 }, [ 212 ] = { 80, 25, 90, 105, 0, 0 }, [ 213 ] = { 77, 92, 74, 87, 0, 0 }, [ 214 ] = { 236, 236, 231, 225, 0, 0 }, [ 215 ] = { 80, 43, 232, 23, 0, 0 }, [ 216 ] = { 90, 77, 234, 40, 0, 0 }, [ 217 ] = { 52, 77, 17, 3, 0, 0 }, [ 218 ] = { 183, 195, 177, 75, 0, 0 }, [ 219 ] = { 188, 179, 179, 75, 0, 0 }, [ 220 ] = { 185, 181, 74, 174, 0, 0 }, [ 221 ] = { 187, 76, 180, 177, 0, 0 }, [ 222 ] = { 189, 188, 183, 75, 0, 0 }, [ 223 ] = { 185, 182, 181, 75, 0, 0 }, [ 224 ] = { 186, 186, 74, 175, 0, 0 }, [ 225 ] = { 187, 185, 74, 176, 0, 0 }, [ 226 ] = { 189, 189, 74, 178, 0, 0 }, [ 227 ] = { 80, 178, 178, 171, 0, 0 }, [ 228 ] = { 80, 187, 183, 179, 0, 0 }, [ 229 ] = { 197, 180, 74, 172, 0, 0 }, [ 230 ] = { 201, 200, 74, 23, 0, 0 }, 
[ 231 ] = { 80, 23, 190, 189, 0, 0 }, [ 232 ] = { 204, 196, 74, 193, 0, 0 }, [ 233 ] = { 195, 194, 187, 75, 0, 0 }, [ 234 ] = { 208, 208, 200, 75, 0, 0 }, [ 235 ] = { 80, 192, 185, 181, 0, 0 }, [ 236 ] = { 192, 192, 185, 75, 0, 0 }, [ 237 ] = { 80, 190, 184, 180, 0, 0 }, [ 238 ] = { 191, 190, 184, 75, 0, 0 }, [ 239 ] = { 196, 76, 188, 185, 0, 0 }, [ 240 ] = { 80, 203, 197, 196, 0, 0 }, [ 241 ] = { 205, 207, 199, 75, 0, 0 }, [ 242 ] = { 80, 205, 197, 195, 0, 0 }, [ 243 ] = { 206, 204, 74, 197, 0, 0 }, [ 244 ] = { 207, 206, 74, 198, 0, 0 }, [ 245 ] = { 209, 76, 186, 184, 0, 0 }, [ 246 ] = { 80, 193, 186, 186, 0, 0 }, [ 247 ] = { 80, 177, 194, 186, 0, 0 }, [ 248 ] = { 193, 76, 204, 183, 0, 0 }, [ 249 ] = { 194, 191, 202, 75, 0, 0 }, [ 250 ] = { 60, 225, 74, 182, 0, 0 }, [ 251 ] = { 204, 76, 195, 193, 0, 0 }, [ 252 ] = { 21, 23, 74, 186, 0, 0 }, [ 253 ] = { 248, 249, 244, 235, 0, 0 }, [ 254 ] = { 80, 233, 228, 106, 0, 0 }, [ 255 ] = { 80, 233, 111, 105, 0, 0 } }; static pmg_power_group_t power7_groups[] = { [ 0 ] = { .pmg_name = "pm_utilization", .pmg_desc = "CPI and utilization data", .pmg_event_ids = power7_group_event_ids[0], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000001ef4f202ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 1 ] = { .pmg_name = "pm_branch1", .pmg_desc = "Branch operations", .pmg_event_ids = power7_group_event_ids[1], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x44440000a0a2a4aeULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 2 ] = { .pmg_name = "pm_branch2", .pmg_desc = "Branch operations", .pmg_event_ids = power7_group_event_ids[2], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x444400009ca8a0a2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 3 ] = { .pmg_name = "pm_branch3", .pmg_desc = "Branch operations", .pmg_event_ids = power7_group_event_ids[3], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0040000068049cf6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 4 ] = { .pmg_name = "pm_branch4", .pmg_desc = "Branch operations", 
.pmg_event_ids = power7_group_event_ids[4], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x44440000ac9eaea4ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 5 ] = { .pmg_name = "pm_branch5", .pmg_desc = "Branch operations", .pmg_event_ids = power7_group_event_ids[5], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4444000caaae9ca8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 6 ] = { .pmg_name = "pm_branch6", .pmg_desc = "Branch operations", .pmg_event_ids = power7_group_event_ids[6], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x44440000a0a2a8aaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 7 ] = { .pmg_name = "pm_branch7", .pmg_desc = "Branch operations", .pmg_event_ids = power7_group_event_ids[7], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x44440000aca8a0a2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 8 ] = { .pmg_name = "pm_branch8", .pmg_desc = "Branch operations", .pmg_event_ids = power7_group_event_ids[8], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x44440000aea8a0a2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 9 ] = { .pmg_name = "pm_branch9", .pmg_desc = "Branch operations", .pmg_event_ids = power7_group_event_ids[9], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x44440000a4a8a0a2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 10 ] = { .pmg_name = "pm_slb_miss", .pmg_desc = "SLB Misses", .pmg_event_ids = power7_group_event_ids[10], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0ddd0001f6909290ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 11 ] = { .pmg_name = "pm_tlb_miss", .pmg_desc = "TLB Misses", .pmg_event_ids = power7_group_event_ids[11], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x500000008866fcfcULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 12 ] = { .pmg_name = "pm_dtlb_miss", .pmg_desc = "DTLB Misses", .pmg_event_ids = power7_group_event_ids[12], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcccc00005e5e5e5eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 13 ] = { .pmg_name = "pm_derat_miss1", .pmg_desc = "DERAT misses", 
.pmg_event_ids = power7_group_event_ids[13], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcccc00005c5c5c5cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 14 ] = { .pmg_name = "pm_derat_miss2", .pmg_desc = "DERAT misses", .pmg_event_ids = power7_group_event_ids[14], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0ccc0000025c5c5cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 15 ] = { .pmg_name = "pm_misc_miss1", .pmg_desc = "Misses", .pmg_event_ids = power7_group_event_ids[15], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xd0c0000090fe5af0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 16 ] = { .pmg_name = "pm_misc_miss2", .pmg_desc = "Misses", .pmg_event_ids = power7_group_event_ids[16], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0cc000001e585afaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 17 ] = { .pmg_name = "pm_misc_miss3", .pmg_desc = "Misses", .pmg_event_ids = power7_group_event_ids[17], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0ccc00001e585a58ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 18 ] = { .pmg_name = "pm_misc_miss4", .pmg_desc = "Misses", .pmg_event_ids = power7_group_event_ids[18], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xd4000000904802faULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 19 ] = { .pmg_name = "pm_misc_miss5", .pmg_desc = "Misses", .pmg_event_ids = power7_group_event_ids[19], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0dd00000f6909202ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 20 ] = { .pmg_name = "pm_pteg1", .pmg_desc = "PTEG sources", .pmg_event_ids = power7_group_event_ids[20], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcece000050505654ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 21 ] = { .pmg_name = "pm_pteg2", .pmg_desc = "PTEG sources", .pmg_event_ids = power7_group_event_ids[21], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xecec000050505454ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 22 ] = { .pmg_name = "pm_pteg3", .pmg_desc = "PTEG sources", .pmg_event_ids = 
power7_group_event_ids[22], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xccec000054585252ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 23 ] = { .pmg_name = "pm_pteg4", .pmg_desc = "PTEG sources", .pmg_event_ids = power7_group_event_ids[23], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xeccc000052525252ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 24 ] = { .pmg_name = "pm_pteg5", .pmg_desc = "PTEG sources", .pmg_event_ids = power7_group_event_ids[24], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcccc000052565456ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 25 ] = { .pmg_name = "pm_pteg6", .pmg_desc = "PTEG sources", .pmg_event_ids = power7_group_event_ids[25], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xeeee000054525652ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 26 ] = { .pmg_name = "pm_pteg7", .pmg_desc = "PTEG sources", .pmg_event_ids = power7_group_event_ids[26], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xeeee000054565656ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 27 ] = { .pmg_name = "pm_pteg8", .pmg_desc = "PTEG sources", .pmg_event_ids = power7_group_event_ids[27], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xeeee000050585258ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 28 ] = { .pmg_name = "pm_pteg9", .pmg_desc = "PTEG sources", .pmg_event_ids = power7_group_event_ids[28], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcccc000050505258ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 29 ] = { .pmg_name = "pm_pteg10", .pmg_desc = "PTEG sources", .pmg_event_ids = power7_group_event_ids[29], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcc0000005050021eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 30 ] = { .pmg_name = "pm_pteg11", .pmg_desc = "PTEG sources", .pmg_event_ids = power7_group_event_ids[30], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcc0c000052540254ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 31 ] = { .pmg_name = "pm_pteg12", .pmg_desc = "PTEG sources", .pmg_event_ids = power7_group_event_ids[31], 
.pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0ccc000002525252ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 32 ] = { .pmg_name = "pm_freq1", .pmg_desc = "Frequency events", .pmg_event_ids = power7_group_event_ids[32], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000006e060c0cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 33 ] = { .pmg_name = "pm_freq2", .pmg_desc = "Frequency events", .pmg_event_ids = power7_group_event_ids[33], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000006e06060cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 34 ] = { .pmg_name = "pm_L1_ref", .pmg_desc = "L1 references", .pmg_event_ids = power7_group_event_ids[34], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcccd0008808082a6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 35 ] = { .pmg_name = "pm_flush1", .pmg_desc = "Flushes", .pmg_event_ids = power7_group_event_ids[35], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x22200000888a8cf8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 36 ] = { .pmg_name = "pm_flush2", .pmg_desc = "Flushes", .pmg_event_ids = power7_group_event_ids[36], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x222c000086828eaaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 37 ] = { .pmg_name = "pm_flush", .pmg_desc = "Flushes", .pmg_event_ids = power7_group_event_ids[37], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x20000000821e12f8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 38 ] = { .pmg_name = "pm_lsu_flush1", .pmg_desc = "LSU Flush", .pmg_event_ids = power7_group_event_ids[38], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcccc000fb0b4b8bcULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 39 ] = { .pmg_name = "pm_lsu_flush2", .pmg_desc = "LSU Flush ULD", .pmg_event_ids = power7_group_event_ids[39], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xccc00008b0b0b2f8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 40 ] = { .pmg_name = "pm_lsu_flush3", .pmg_desc = "LSU Flush UST", .pmg_event_ids = power7_group_event_ids[40], .pmg_mmcr0 = 
0x0000000000000000ULL, .pmg_mmcr1 = 0xccc00008b4b4b6f8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 41 ] = { .pmg_name = "pm_lsu_flush4", .pmg_desc = "LSU Flush LRQ", .pmg_event_ids = power7_group_event_ids[41], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xccc00008b8b8baf8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 42 ] = { .pmg_name = "pm_lsu_flush5", .pmg_desc = "LSU Flush SRQ", .pmg_event_ids = power7_group_event_ids[42], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xccc00008bcbcbef8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 43 ] = { .pmg_name = "pm_prefetch", .pmg_desc = "I cache Prefetches", .pmg_event_ids = power7_group_event_ids[43], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x04440000188a968eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 44 ] = { .pmg_name = "", .pmg_desc = "", .pmg_event_ids = power7_group_event_ids[44], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000000000000ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 45 ] = { .pmg_name = "pm_thread_cyc2", .pmg_desc = "Thread cycles", .pmg_event_ids = power7_group_event_ids[45], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00040000120cf4b0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 46 ] = { .pmg_name = "", .pmg_desc = "", .pmg_event_ids = power7_group_event_ids[46], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000000000000ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 47 ] = { .pmg_name = "", .pmg_desc = "", .pmg_event_ids = power7_group_event_ids[47], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000000000000ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 48 ] = { .pmg_name = "pm_thread_cyc5", .pmg_desc = "Thread cycles", .pmg_event_ids = power7_group_event_ids[48], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x44440000b0b2b4b6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 49 ] = { .pmg_name = "", .pmg_desc = "", .pmg_event_ids = power7_group_event_ids[49], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000000000000ULL, .pmg_mmcra = 
0x0000000000000000ULL }, [ 50 ] = { .pmg_name = "pm_fxu1", .pmg_desc = "FXU events", .pmg_event_ids = power7_group_event_ids[50], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000000e0e0e0eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 51 ] = { .pmg_name = "pm_fxu2", .pmg_desc = "FXU events", .pmg_event_ids = power7_group_event_ids[51], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000004f40204ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 52 ] = { .pmg_name = "pm_fxu3", .pmg_desc = "FXU events", .pmg_event_ids = power7_group_event_ids[52], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000001e0e0e0eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 53 ] = { .pmg_name = "pm_fxu4", .pmg_desc = "FXU events", .pmg_event_ids = power7_group_event_ids[53], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000000e0e1e02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 54 ] = { .pmg_name = "pm_L2_RCLD", .pmg_desc = "L2 RC load events ", .pmg_event_ids = power7_group_event_ids[54], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x6666400080808082ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 55 ] = { .pmg_name = "pm_L2_RC", .pmg_desc = "RC related events", .pmg_event_ids = power7_group_event_ids[55], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x60606000821e8002ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 56 ] = { .pmg_name = "pm_L2_RCST", .pmg_desc = "L2 RC Store Events", .pmg_event_ids = power7_group_event_ids[56], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x6666400080808280ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 57 ] = { .pmg_name = "pm_L2_ldst_1", .pmg_desc = "L2 load/store ", .pmg_event_ids = power7_group_event_ids[57], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x660000008280021eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 58 ] = { .pmg_name = "pm_L2_ldst_2", .pmg_desc = "L2 load/store ", .pmg_event_ids = power7_group_event_ids[58], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00662000021e8282ULL, .pmg_mmcra = 
0x0000000000000000ULL }, [ 59 ] = { .pmg_name = "pm_L2_ldst_3", .pmg_desc = "L2 load/store ", .pmg_event_ids = power7_group_event_ids[59], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00662000021e8080ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 60 ] = { .pmg_name = "pm_L2_RCSTLD", .pmg_desc = "L2 RC Load/Store Events", .pmg_event_ids = power7_group_event_ids[60], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x660040008282021eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 61 ] = { .pmg_name = "pm_nest1", .pmg_desc = "Nest Events", .pmg_event_ids = power7_group_event_ids[61], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000081838587ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 62 ] = { .pmg_name = "pm_nest2", .pmg_desc = "Nest Events", .pmg_event_ids = power7_group_event_ids[62], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000898b8d8fULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 63 ] = { .pmg_name = "pm_L2_redir_pref", .pmg_desc = "L2 redirect and prefetch", .pmg_event_ids = power7_group_event_ids[63], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x44440000989a8882ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 64 ] = { .pmg_name = "pm_dlatencies1", .pmg_desc = "Data latencies", .pmg_event_ids = power7_group_event_ids[64], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc000000040f2f6f2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 65 ] = { .pmg_name = "pm_dlatencies2", .pmg_desc = "Data latencies", .pmg_event_ids = power7_group_event_ids[65], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc0000000481ef602ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 66 ] = { .pmg_name = "pm_dlatencies3", .pmg_desc = "Data latencies", .pmg_event_ids = power7_group_event_ids[66], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcc0000004244f602ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 67 ] = { .pmg_name = "pm_rejects1", .pmg_desc = "Reject event", .pmg_event_ids = power7_group_event_ids[67], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 
0x0ccc000164acaeacULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 68 ] = { .pmg_name = "pm_rejects2", .pmg_desc = "Reject events", .pmg_event_ids = power7_group_event_ids[68], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00c000026464a808ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 69 ] = { .pmg_name = "pm_rejects3", .pmg_desc = "Set mispredictions rejects", .pmg_event_ids = power7_group_event_ids[69], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcc000008a8a81e02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 70 ] = { .pmg_name = "pm_lsu_reject", .pmg_desc = "LSU Reject Event", .pmg_event_ids = power7_group_event_ids[70], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xccc00008a4a4a602ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 71 ] = { .pmg_name = "pm_lsu_ncld", .pmg_desc = "Non cachable loads", .pmg_event_ids = power7_group_event_ids[71], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xccc000088c8c8e02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 72 ] = { .pmg_name = "pm_gct1", .pmg_desc = "GCT events", .pmg_event_ids = power7_group_event_ids[72], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00400000f808861eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 73 ] = { .pmg_name = "pm_gct2", .pmg_desc = "GCT Events", .pmg_event_ids = power7_group_event_ids[73], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x222200009c9ea0a2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 74 ] = { .pmg_name = "pm_L2_castout_invalidate_1", .pmg_desc = "L2 castout and invalidate events", .pmg_event_ids = power7_group_event_ids[74], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x660020008082021eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 75 ] = { .pmg_name = "pm_L2_castout_invalidate_2", .pmg_desc = "L2 castout and invalidate events", .pmg_event_ids = power7_group_event_ids[75], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x660020008280021eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 76 ] = { .pmg_name = "pm_disp_held1", .pmg_desc = "Dispatch held conditions", 
.pmg_event_ids = power7_group_event_ids[76], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000060606f2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 77 ] = { .pmg_name = "pm_disp_held2", .pmg_desc = "Dispatch held conditions", .pmg_event_ids = power7_group_event_ids[77], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000016060606ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 78 ] = { .pmg_name = "pm_disp_clb_held", .pmg_desc = "Display CLB held conditions", .pmg_event_ids = power7_group_event_ids[78], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x2222000092949698ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 79 ] = { .pmg_name = "pm_power", .pmg_desc = "Power Events", .pmg_event_ids = power7_group_event_ids[79], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000006e6e6e6eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 80 ] = { .pmg_name = "pm_dispatch1", .pmg_desc = "Groups and instructions dispatched", .pmg_event_ids = power7_group_event_ids[80], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000f2f20af2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 81 ] = { .pmg_name = "pm_dispatch2", .pmg_desc = "Groups and instructions dispatched", .pmg_event_ids = power7_group_event_ids[81], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000f21e02f2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 82 ] = { .pmg_name = "pm_ic", .pmg_desc = "I cache operations", .pmg_event_ids = power7_group_event_ids[82], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4444000f888c9098ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 83 ] = { .pmg_name = "pm_ic_pref_cancel", .pmg_desc = "Instruction pre-fetched cancelled", .pmg_event_ids = power7_group_event_ids[83], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4444000190929490ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 84 ] = { .pmg_name = "pm_ic_miss", .pmg_desc = "Icache and Ierat miss events", .pmg_event_ids = power7_group_event_ids[84], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 
0x00000000f6fc021eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 85 ] = { .pmg_name = "pm_cpi_stack1", .pmg_desc = "CPI stack breakdown", .pmg_event_ids = power7_group_event_ids[85], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc00000004016f618ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 86 ] = { .pmg_name = "pm_cpi_stack2", .pmg_desc = "CPI stack breakdown", .pmg_event_ids = power7_group_event_ids[86], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000000e140414ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 87 ] = { .pmg_name = "pm_cpi_stack3", .pmg_desc = "CPI stack breakdown", .pmg_event_ids = power7_group_event_ids[87], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000026121a16ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 88 ] = { .pmg_name = "pm_cpi_stack4", .pmg_desc = "CPI stack breakdown", .pmg_event_ids = power7_group_event_ids[88], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000f4183e12ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 89 ] = { .pmg_name = "pm_cpi_stack5", .pmg_desc = "CPI stack breakdown", .pmg_event_ids = power7_group_event_ids[89], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000281c3f0aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 90 ] = { .pmg_name = "pm_cpi_stack6", .pmg_desc = "CPI stack breakdown", .pmg_event_ids = power7_group_event_ids[90], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000001c3c021cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 91 ] = { .pmg_name = "pm_cpi_stack7", .pmg_desc = "CPI stack breakdown", .pmg_event_ids = power7_group_event_ids[91], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000f81a141aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 92 ] = { .pmg_name = "pm_dsource1", .pmg_desc = "Data source information", .pmg_event_ids = power7_group_event_ids[92], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcccc000040404242ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 93 ] = { .pmg_name = "pm_dsource2", .pmg_desc = "Data source information", .pmg_event_ids = 
power7_group_event_ids[93], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcccc000048464a48ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 94 ] = { .pmg_name = "pm_dsource3", .pmg_desc = "Data source information", .pmg_event_ids = power7_group_event_ids[94], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcccc00004a484648ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 95 ] = { .pmg_name = "pm_dsource4", .pmg_desc = "Data source information", .pmg_event_ids = power7_group_event_ids[95], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcccc000044444c44ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 96 ] = { .pmg_name = "pm_dsource5", .pmg_desc = "Data source information", .pmg_event_ids = power7_group_event_ids[96], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcccc00004e424446ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 97 ] = { .pmg_name = "pm_dsource6", .pmg_desc = "Data source information", .pmg_event_ids = power7_group_event_ids[97], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcccc000042444e48ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 98 ] = { .pmg_name = "pm_dsource7", .pmg_desc = "Data source information", .pmg_event_ids = power7_group_event_ids[98], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcccc00004c484c44ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 99 ] = { .pmg_name = "pm_dsource8", .pmg_desc = "Data source information", .pmg_event_ids = power7_group_event_ids[99], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0c0c00000240fe42ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 100 ] = { .pmg_name = "pm_dsource9", .pmg_desc = "Data source information", .pmg_event_ids = power7_group_event_ids[100], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc000000040fef6f0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 101 ] = { .pmg_name = "pm_dsource10", .pmg_desc = "Data source information", .pmg_event_ids = power7_group_event_ids[101], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcccc000042444444ULL, .pmg_mmcra = 0x0000000000000000ULL }, 
[ 102 ] = { .pmg_name = "pm_dsource11", .pmg_desc = "Data source information", .pmg_event_ids = power7_group_event_ids[102], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc000000040fefefaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 103 ] = { .pmg_name = "pm_dsource12", .pmg_desc = "Data source information", .pmg_event_ids = power7_group_event_ids[103], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcccc000042424242ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 104 ] = { .pmg_name = "pm_dsource13", .pmg_desc = "Data source information", .pmg_event_ids = power7_group_event_ids[104], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc0cc00005c024444ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 105 ] = { .pmg_name = "pm_dsource14", .pmg_desc = "Data source information", .pmg_event_ids = power7_group_event_ids[105], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc0cc00004a024242ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 106 ] = { .pmg_name = "pm_dsource15", .pmg_desc = "Data source information", .pmg_event_ids = power7_group_event_ids[106], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xc00c00004a02f642ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 107 ] = { .pmg_name = "pm_isource1", .pmg_desc = "Instruction source information", .pmg_event_ids = power7_group_event_ids[107], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4444000040404a48ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 108 ] = { .pmg_name = "pm_isource2", .pmg_desc = "Instruction source information", .pmg_event_ids = power7_group_event_ids[108], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4444000048424c42ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 109 ] = { .pmg_name = "pm_isource3", .pmg_desc = "Instruction source information", .pmg_event_ids = power7_group_event_ids[109], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x444400004a484444ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 110 ] = { .pmg_name = "pm_isource4", .pmg_desc = "Instruction source information", .pmg_event_ids = 
power7_group_event_ids[110], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4444000044464646ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 111 ] = { .pmg_name = "pm_isource5", .pmg_desc = "Instruction source information", .pmg_event_ids = power7_group_event_ids[111], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x444400004e444e48ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 112 ] = { .pmg_name = "pm_isource6", .pmg_desc = "Instruction source information", .pmg_event_ids = power7_group_event_ids[112], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4444000046484a48ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 113 ] = { .pmg_name = "pm_isource7", .pmg_desc = "Instruction source information", .pmg_event_ids = power7_group_event_ids[113], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4444000042444444ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 114 ] = { .pmg_name = "pm_isource8", .pmg_desc = "Instruction source information", .pmg_event_ids = power7_group_event_ids[114], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x444400004c484a48ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 115 ] = { .pmg_name = "pm_isource9", .pmg_desc = "Instruction source information", .pmg_event_ids = power7_group_event_ids[115], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4444000046424242ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 116 ] = { .pmg_name = "pm_isource10", .pmg_desc = "Instruction source information", .pmg_event_ids = power7_group_event_ids[116], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x440000004040021eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 117 ] = { .pmg_name = "pm_isource11", .pmg_desc = "Instruction source information", .pmg_event_ids = power7_group_event_ids[117], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x4440000042444a02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 118 ] = { .pmg_name = "pm_isource12", .pmg_desc = "Instruction source information", .pmg_event_ids = power7_group_event_ids[118], .pmg_mmcr0 = 0x0000000000000000ULL, 
.pmg_mmcr1 = 0x004400001e024444ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 119 ] = { .pmg_name = "pm_isource13", .pmg_desc = "Instruction source information", .pmg_event_ids = power7_group_event_ids[119], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x404400004a024242ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 120 ] = { .pmg_name = "pm_prefetch1", .pmg_desc = "Prefetch events", .pmg_event_ids = power7_group_event_ids[120], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xdddd000fa8acb4b8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 121 ] = { .pmg_name = "pm_prefetch2", .pmg_desc = "Prefetch events", .pmg_event_ids = power7_group_event_ids[121], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xdc00000cbc8066f0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 122 ] = { .pmg_name = "pm_vsu0", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[122], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaaaa00008082989aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 123 ] = { .pmg_name = "pm_vsu1", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[123], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaaaa00009c9ea0a2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 124 ] = { .pmg_name = "pm_vsu2", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[124], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaaaa000c988c8c8eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 125 ] = { .pmg_name = "pm_vsu3", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[125], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaaa0000284868402ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 126 ] = { .pmg_name = "pm_vsu4", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[126], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaaa0000290929002ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 127 ] = { .pmg_name = "pm_vsu5", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[127], .pmg_mmcr0 = 
0x0000000000000000ULL, .pmg_mmcr1 = 0xbbb0000880808202ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 128 ] = { .pmg_name = "pm_vsu6", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[128], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaaa00008acacae02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 129 ] = { .pmg_name = "pm_vsu7", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[129], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaaa00008bcbcbe02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 130 ] = { .pmg_name = "pm_vsu8", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[130], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xbbb000088c8c8e02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 131 ] = { .pmg_name = "pm_vsu9", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[131], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaaaa0008a8a8aaa4ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 132 ] = { .pmg_name = "pm_vsu10", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[132], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaaa0000888888a02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 133 ] = { .pmg_name = "pm_vsu11", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[133], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaaa0000894949602ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 134 ] = { .pmg_name = "pm_vsu12", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[134], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xbbb0000888888a02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 135 ] = { .pmg_name = "pm_vsu13", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[135], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xbbb0000884848602ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 136 ] = { .pmg_name = "pm_vsu14", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[136], .pmg_mmcr0 = 
0x0000000000000000ULL, .pmg_mmcr1 = 0xaaa0000e809ca002ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 137 ] = { .pmg_name = "pm_vsu15", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[137], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xbbb0000890909c02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 138 ] = { .pmg_name = "pm_vsu16", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[138], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xbbbb0008949496a0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 139 ] = { .pmg_name = "pm_vsu17", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[139], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xbbbb0000989a929eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 140 ] = { .pmg_name = "pm_vsu18", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[140], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaaa00008b0b0b202ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 141 ] = { .pmg_name = "pm_vsu19", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[141], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaaa00008b4b4b602ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 142 ] = { .pmg_name = "pm_vsu20", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[142], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaaa00008b8b8ba02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 143 ] = { .pmg_name = "pm_vsu21", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[143], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000a000168f402bcULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 144 ] = { .pmg_name = "pm_vsu22", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[144], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcbaa000f848c8480ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 145 ] = { .pmg_name = "pm_vsu23", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[145], .pmg_mmcr0 = 
0x0000000000000000ULL, .pmg_mmcr1 = 0xaaaa000f88bc8480ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 146 ] = { .pmg_name = "pm_vsu24", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[146], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0aaa0007f4bcb880ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 147 ] = { .pmg_name = "pm_vsu25", .pmg_desc = "VSU Execution", .pmg_event_ids = power7_group_event_ids[147], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xbaaa000f8cbcb4b0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 148 ] = { .pmg_name = "pm_lsu1", .pmg_desc = "LSU LMQ SRQ events", .pmg_event_ids = power7_group_event_ids[148], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xd0000000a43e1c08ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 149 ] = { .pmg_name = "pm_lsu2", .pmg_desc = "LSU events", .pmg_event_ids = power7_group_event_ids[149], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0c0200006690668eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 150 ] = { .pmg_name = "pm_lsu_lmq", .pmg_desc = "LSU LMQ Events", .pmg_event_ids = power7_group_event_ids[150], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xdddd0000989aa0a4ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 151 ] = { .pmg_name = "pm_lsu_srq1", .pmg_desc = "Store Request Queue Info", .pmg_event_ids = power7_group_event_ids[151], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xccc00008a0a0a202ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 152 ] = { .pmg_name = "pm_lsu_srq2", .pmg_desc = "Store Request Queue Info", .pmg_event_ids = power7_group_event_ids[152], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xddd0000096979c02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 153 ] = { .pmg_name = "pm_lsu_s0_valid", .pmg_desc = "LSU Events", .pmg_event_ids = power7_group_event_ids[153], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xddd000009c9ea002ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 154 ] = { .pmg_name = "pm_lsu_s0_alloc", .pmg_desc = "LSU Events", .pmg_event_ids = 
power7_group_event_ids[154], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xddd00000a19f9d02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 155 ] = { .pmg_name = "pm_l1_pref", .pmg_desc = "L1 pref Events", .pmg_event_ids = power7_group_event_ids[155], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xddd00008b8b8ba02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 156 ] = { .pmg_name = "pm_l2_guess_1", .pmg_desc = "L2_Guess_events", .pmg_event_ids = power7_group_event_ids[156], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x6600800080801e02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 157 ] = { .pmg_name = "pm_l2_guess_2", .pmg_desc = "L2_Guess_events", .pmg_event_ids = power7_group_event_ids[157], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x6600800082821e02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 158 ] = { .pmg_name = "pm_misc1", .pmg_desc = "Misc events", .pmg_event_ids = power7_group_event_ids[158], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x04000000f0801602ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 159 ] = { .pmg_name = "pm_misc2", .pmg_desc = "Misc events", .pmg_event_ids = power7_group_event_ids[159], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x2000000080f8f81eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 160 ] = { .pmg_name = "pm_misc3", .pmg_desc = "Misc events", .pmg_event_ids = power7_group_event_ids[160], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000f20af2f2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 161 ] = { .pmg_name = "pm_misc4", .pmg_desc = "Misc events", .pmg_event_ids = power7_group_event_ids[161], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000000c1a1e1cULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 162 ] = { .pmg_name = "pm_misc5", .pmg_desc = "Misc events", .pmg_event_ids = power7_group_event_ids[162], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x044000040aaea4f6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 163 ] = { .pmg_name = "pm_misc6", .pmg_desc = "Misc events", .pmg_event_ids = 
power7_group_event_ids[163], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x444000028c8e8c02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 164 ] = { .pmg_name = "pm_misc7", .pmg_desc = "Misc events", .pmg_event_ids = power7_group_event_ids[164], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000380a1e66ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 165 ] = { .pmg_name = "pm_misc8", .pmg_desc = "Misc events", .pmg_event_ids = power7_group_event_ids[165], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x40000000a6f8f6f6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 166 ] = { .pmg_name = "pm_misc9", .pmg_desc = "Misc events", .pmg_event_ids = power7_group_event_ids[166], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x22c000008486a8f6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 167 ] = { .pmg_name = "pm_misc10", .pmg_desc = "Misc events", .pmg_event_ids = power7_group_event_ids[167], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0dd400061aa8b884ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 168 ] = { .pmg_name = "pm_misc11", .pmg_desc = "Misc events", .pmg_event_ids = power7_group_event_ids[168], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000f41e0402ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 169 ] = { .pmg_name = "pm_misc_12", .pmg_desc = "Misc Events", .pmg_event_ids = power7_group_event_ids[169], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000002f0f8f8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 170 ] = { .pmg_name = "pm_misc_13", .pmg_desc = "Misc Events", .pmg_event_ids = power7_group_event_ids[170], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000f8f0fcf6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 171 ] = { .pmg_name = "pm_misc_14", .pmg_desc = "Misc Events", .pmg_event_ids = power7_group_event_ids[171], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000001e1e0266ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 172 ] = { .pmg_name = "pm_suspend", .pmg_desc = "SUSPENDED events", .pmg_event_ids = 
power7_group_event_ids[172], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00d00000001e9402ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 173 ] = { .pmg_name = "pm_iops", .pmg_desc = "Internal Operations events", .pmg_event_ids = power7_group_event_ids[173], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000141e1402ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 174 ] = { .pmg_name = "pm_sync", .pmg_desc = "sync", .pmg_event_ids = power7_group_event_ids[174], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xd0200000941e9a02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 175 ] = { .pmg_name = "pm_seg", .pmg_desc = "Segment events", .pmg_event_ids = power7_group_event_ids[175], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x022200041ea4a4a6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 176 ] = { .pmg_name = "pm_l3_hit", .pmg_desc = "L3 Hit Events", .pmg_event_ids = power7_group_event_ids[176], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xfff000008080801eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 177 ] = { .pmg_name = "pm_shl", .pmg_desc = "Shell Events", .pmg_event_ids = power7_group_event_ids[177], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x5555000080828486ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 178 ] = { .pmg_name = "pm_l3_pref", .pmg_desc = "L3 Prefetch events", .pmg_event_ids = power7_group_event_ids[178], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xdddf0002acaeac82ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 179 ] = { .pmg_name = "pm_l3", .pmg_desc = "L3 events", .pmg_event_ids = power7_group_event_ids[179], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xffff000082828280ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 180 ] = { .pmg_name = "pm_streams1", .pmg_desc = "Streams", .pmg_event_ids = power7_group_event_ids[180], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0ddd00041eb4b4b6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 181 ] = { .pmg_name = "pm_streams2", .pmg_desc = "Streams", .pmg_event_ids = 
power7_group_event_ids[181], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0ddd00041ebcbcbeULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 182 ] = { .pmg_name = "pm_streams3", .pmg_desc = "Streams", .pmg_event_ids = power7_group_event_ids[182], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xdddd0004b0a8a8aaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 183 ] = { .pmg_name = "pm_larx", .pmg_desc = "LARX", .pmg_event_ids = power7_group_event_ids[183], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcc0c000194961e94ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 184 ] = { .pmg_name = "pm_ldf", .pmg_desc = "Floating Point loads", .pmg_event_ids = power7_group_event_ids[184], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0ccc00041e848486ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 185 ] = { .pmg_name = "pm_ldx", .pmg_desc = "Vector Load", .pmg_event_ids = power7_group_event_ids[185], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0ccc00041e88888aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 186 ] = { .pmg_name = "pm_l2_ld_st", .pmg_desc = "L2 load and store events", .pmg_event_ids = power7_group_event_ids[186], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x66f000008082801eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 187 ] = { .pmg_name = "pm_stcx", .pmg_desc = "STCX", .pmg_event_ids = power7_group_event_ids[187], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xcccc000c94ac989aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 188 ] = { .pmg_name = "pm_btac", .pmg_desc = "BTAC", .pmg_event_ids = power7_group_event_ids[188], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x55cc00008a88989aULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 189 ] = { .pmg_name = "pm_br_bc", .pmg_desc = "Branch BC events", .pmg_event_ids = power7_group_event_ids[189], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x44000000b8ba1e02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 190 ] = { .pmg_name = "pm_inst_imc ", .pmg_desc = "inst imc events", .pmg_event_ids = 
power7_group_event_ids[190], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000f0f21602ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 191 ] = { .pmg_name = "pm_l2_misc1", .pmg_desc = "L2 load/store Miss events", .pmg_event_ids = power7_group_event_ids[191], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x6666000c80808280ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 192 ] = { .pmg_name = "pm_l2_misc2", .pmg_desc = "L2 Events", .pmg_event_ids = power7_group_event_ids[192], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00660000021e8080ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 193 ] = { .pmg_name = "pm_l2_misc3", .pmg_desc = "L2 Events", .pmg_event_ids = power7_group_event_ids[193], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00608000021e82faULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 194 ] = { .pmg_name = "pm_l2_misc4", .pmg_desc = "L2 Events", .pmg_event_ids = power7_group_event_ids[194], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00666000021e8282ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 195 ] = { .pmg_name = "pm_l2_misc5", .pmg_desc = "L2 Events", .pmg_event_ids = power7_group_event_ids[195], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00608000021e80faULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 196 ] = { .pmg_name = "pm_l2_misc6", .pmg_desc = "L2 Events", .pmg_event_ids = power7_group_event_ids[196], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0006600002f41e80ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 197 ] = { .pmg_name = "pm_ierat", .pmg_desc = "IERAT Events", .pmg_event_ids = power7_group_event_ids[197], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x04400000f6bcbe02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 198 ] = { .pmg_name = "pm_disp_clb", .pmg_desc = "Dispatch CLB Events", .pmg_event_ids = power7_group_event_ids[198], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x2200000090a81e02ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 199 ] = { .pmg_name = "pm_dpu", .pmg_desc = "DPU Events", .pmg_event_ids 
= power7_group_event_ids[199], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000001e060802ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 200 ] = { .pmg_name = "pm_cpu_util", .pmg_desc = "Basic CPU utilization", .pmg_event_ids = power7_group_event_ids[200], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000008f41ef4ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 201 ] = { .pmg_name = "pm_overflow1", .pmg_desc = "Overflow events", .pmg_event_ids = power7_group_event_ids[201], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000010101010ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 202 ] = { .pmg_name = "pm_overflow2", .pmg_desc = "Overflow events", .pmg_event_ids = power7_group_event_ids[202], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000024102410ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 203 ] = { .pmg_name = "pm_rewind", .pmg_desc = "Rewind events", .pmg_event_ids = power7_group_event_ids[203], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000020f42002ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 204 ] = { .pmg_name = "pm_saved", .pmg_desc = "Saved Events", .pmg_event_ids = power7_group_event_ids[204], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000022f42202ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 205 ] = { .pmg_name = "pm_tlbie", .pmg_desc = "TLBIE Events", .pmg_event_ids = power7_group_event_ids[205], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x22d000008a96b202ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 206 ] = { .pmg_name = "pm_id_miss_erat_l1", .pmg_desc = "Instruction/Data miss from ERAT/L1 cache", .pmg_event_ids = power7_group_event_ids[206], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000f6fcf0f0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 207 ] = { .pmg_name = "pm_id_miss_erat_tlab", .pmg_desc = "Instruction/Data miss from ERAT/TLB", .pmg_event_ids = power7_group_event_ids[207], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000001ef6fcfcULL, .pmg_mmcra = 0x0000000000000000ULL 
}, [ 208 ] = { .pmg_name = "pm_compat_utilization1", .pmg_desc = "Basic CPU utilization", .pmg_event_ids = power7_group_event_ids[208], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000faf41ef4ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 209 ] = { .pmg_name = "pm_compat_utilization2", .pmg_desc = "CPI and utilization data", .pmg_event_ids = power7_group_event_ids[209], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000f4f41efaULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 210 ] = { .pmg_name = "pm_compat_cpi_1plus_ppc", .pmg_desc = "Misc CPI and utilization data", .pmg_event_ids = power7_group_event_ids[210], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000f2f4f2f2ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 211 ] = { .pmg_name = "pm_compat_l1_dcache_load_store_miss", .pmg_desc = "L1 D-Cache load/store miss", .pmg_event_ids = power7_group_event_ids[211], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000002f0f0f0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 212 ] = { .pmg_name = "pm_compat_l1_cache_load", .pmg_desc = "L1 Cache loads", .pmg_event_ids = power7_group_event_ids[212], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000002fef6f0ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 213 ] = { .pmg_name = "pm_compat_instruction_directory", .pmg_desc = "Instruction Directory", .pmg_event_ids = power7_group_event_ids[213], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000f6fc02fcULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 214 ] = { .pmg_name = "pm_compat_suspend", .pmg_desc = "Suspend Events", .pmg_event_ids = power7_group_event_ids[214], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000000000000ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 215 ] = { .pmg_name = "pm_compat_misc_events1", .pmg_desc = "Misc Events", .pmg_event_ids = power7_group_event_ids[215], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000002f8f81eULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 216 ] = { .pmg_name = 
"pm_compat_misc_events2", .pmg_desc = "Misc Events", .pmg_event_ids = power7_group_event_ids[216], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000f0f2f4f8ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 217 ] = { .pmg_name = "pm_compat_misc_events3", .pmg_desc = "Misc Events", .pmg_event_ids = power7_group_event_ids[217], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000f8f21ef6ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 218 ] = { .pmg_name = "pm_mrk_br", .pmg_desc = "Marked Branch events", .pmg_event_ids = power7_group_event_ids[218], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000036363602ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 219 ] = { .pmg_name = "pm_mrk_dsource1", .pmg_desc = "Marked data sources", .pmg_event_ids = power7_group_event_ids[219], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xddd000004e424402ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 220 ] = { .pmg_name = "pm_mrk_dsource2", .pmg_desc = "Marked data sources", .pmg_event_ids = power7_group_event_ids[220], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xd00d000040200248ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 221 ] = { .pmg_name = "pm_mrk_dsource3", .pmg_desc = "Marked data sources", .pmg_event_ids = power7_group_event_ids[221], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xd0dd000044024642ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 222 ] = { .pmg_name = "pm_mrk_dsource4", .pmg_desc = "Marked data sources", .pmg_event_ids = power7_group_event_ids[222], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xddd0000042444202ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 223 ] = { .pmg_name = "pm_mrk_dsource5", .pmg_desc = "Marked data sources", .pmg_event_ids = power7_group_event_ids[223], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xd0d0000040244e02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 224 ] = { .pmg_name = "pm_mrk_dsource6", .pmg_desc = "Marked data sources", .pmg_event_ids = power7_group_event_ids[224], .pmg_mmcr0 = 
0x0000000000000000ULL, .pmg_mmcr1 = 0xdd00000048480220ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 225 ] = { .pmg_name = "pm_mrk_dsource7", .pmg_desc = "Marked data sources", .pmg_event_ids = power7_group_event_ids[225], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xd000000044260226ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 226 ] = { .pmg_name = "pm_mrk_dsource8", .pmg_desc = "Marked data sources", .pmg_event_ids = power7_group_event_ids[226], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xd000000042280228ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 227 ] = { .pmg_name = "pm_mrk_dsource9", .pmg_desc = "Marked data sources", .pmg_event_ids = power7_group_event_ids[227], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00d00000022a4c2aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 228 ] = { .pmg_name = "pm_mrk_dsource10", .pmg_desc = "Marked data sources", .pmg_event_ids = power7_group_event_ids[228], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00d00000022c422cULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 229 ] = { .pmg_name = "pm_mrk_dsource11", .pmg_desc = "Marked data sources", .pmg_event_ids = power7_group_event_ids[229], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000003f2e0224ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 230 ] = { .pmg_name = "pm_mrk_lsu_flush1", .pmg_desc = "Marked LSU Flush", .pmg_event_ids = power7_group_event_ids[230], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xdd0000008486021eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 231 ] = { .pmg_name = "pm_mrk_lsu_flush2", .pmg_desc = "Marked LSU Flush", .pmg_event_ids = power7_group_event_ids[231], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00dd0000021e888aULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 232 ] = { .pmg_name = "pm_mrk_rejects", .pmg_desc = "Marked rejects", .pmg_event_ids = power7_group_event_ids[232], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xdd000000828c0264ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 233 ] = { .pmg_name = 
"pm_mrk_inst", .pmg_desc = "Marked instruction events", .pmg_event_ids = power7_group_event_ids[233], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000032303002ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 234 ] = { .pmg_name = "pm_mrk_st", .pmg_desc = "Marked stores events", .pmg_event_ids = power7_group_event_ids[234], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000034343402ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 235 ] = { .pmg_name = "pm_mrk_dtlb_miss1", .pmg_desc = "Marked Data TLB Miss", .pmg_event_ids = power7_group_event_ids[235], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0ddd0000025e5e5eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 236 ] = { .pmg_name = "pm_mrk_dtlb_miss2", .pmg_desc = "Marked Data TLB Miss", .pmg_event_ids = power7_group_event_ids[236], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xddd000005e5e5e02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 237 ] = { .pmg_name = "pm_mrk_derat_miss1", .pmg_desc = "Marked DERAT Miss events", .pmg_event_ids = power7_group_event_ids[237], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0ddd0000025c5c5cULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 238 ] = { .pmg_name = "pm_mrk_derat_miss2", .pmg_desc = "Marked DERAT Miss events", .pmg_event_ids = power7_group_event_ids[238], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xddd000005c5c5c02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 239 ] = { .pmg_name = "pm_mrk_misc_miss", .pmg_desc = "marked Miss Events", .pmg_event_ids = power7_group_event_ids[239], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00d000003e025a3eULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 240 ] = { .pmg_name = "pm_mrk_pteg1", .pmg_desc = "Marked PTEG", .pmg_event_ids = power7_group_event_ids[240], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0ddd000002525656ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 241 ] = { .pmg_name = "pm_mrk_pteg2", .pmg_desc = "Marked PTEG", .pmg_event_ids = power7_group_event_ids[241], .pmg_mmcr0 = 
0x0000000000000000ULL, .pmg_mmcr1 = 0xddd0000050545202ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 242 ] = { .pmg_name = "pm_mrk_pteg3", .pmg_desc = "Marked PTEG", .pmg_event_ids = power7_group_event_ids[242], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0ddd000002565654ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 243 ] = { .pmg_name = "pm_mrk_pteg4", .pmg_desc = "Marked PTEG", .pmg_event_ids = power7_group_event_ids[243], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xdd0d000054500258ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 244 ] = { .pmg_name = "pm_mrk_pteg5", .pmg_desc = "Marked PTEG", .pmg_event_ids = power7_group_event_ids[244], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xdd0d000052580252ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 245 ] = { .pmg_name = "pm_mrk_misc1", .pmg_desc = "Marked misc events", .pmg_event_ids = power7_group_event_ids[245], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xd00000008e023a34ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 246 ] = { .pmg_name = "pm_mrk_misc2", .pmg_desc = "Marked misc events", .pmg_event_ids = power7_group_event_ids[246], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000002383a32ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 247 ] = { .pmg_name = "pm_mrk_misc3", .pmg_desc = "Marked misc events", .pmg_event_ids = power7_group_event_ids[247], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00d00000023a8032ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 248 ] = { .pmg_name = "pm_mrk_misc4", .pmg_desc = "Marked misc events", .pmg_event_ids = power7_group_event_ids[248], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000003c023238ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 249 ] = { .pmg_name = "pm_mrk_misc5", .pmg_desc = "Marked misc events", .pmg_event_ids = power7_group_event_ids[249], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000003d323f02ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 250 ] = { .pmg_name = "pm_mrk_misc6", .pmg_desc = "Marked misc events", 
.pmg_event_ids = power7_group_event_ids[250], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x0000000030f40230ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 251 ] = { .pmg_name = "pm_mrk_misc7", .pmg_desc = "Marked misc events", .pmg_event_ids = power7_group_event_ids[251], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xd000000082026464ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 252 ] = { .pmg_name = "pm_mrk_misc8", .pmg_desc = "Marked misc events", .pmg_event_ids = power7_group_event_ids[252], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000000001e1e0232ULL, .pmg_mmcra = 0x0000000000000001ULL }, [ 253 ] = { .pmg_name = "pm_vsu15", .pmg_desc = "FP ops", .pmg_event_ids = power7_group_event_ids[253], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0xaaaa000f809ca098ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 254 ] = { .pmg_name = "pm_l1_dcache_accesses", .pmg_desc = "L1 D-Cache accesses", .pmg_event_ids = power7_group_event_ids[254], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000c000102f0f080ULL, .pmg_mmcra = 0x0000000000000000ULL }, [ 255 ] = { .pmg_name = "pm_loads_and_stores", .pmg_desc = "Load and Store instructions", .pmg_event_ids = power7_group_event_ids[255], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00c0000202f080f0ULL, .pmg_mmcra = 0x0000000000000000ULL } }; #endif papi-papi-7-2-0-t/src/libperfnec/lib/powerpc_events.h000066400000000000000000000022551502707512200225260ustar00rootroot00000000000000/* * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies 
 * or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
 * IN THE SOFTWARE.
 *
 * powerpc_events.h
 */

#ifndef _POWERPC_EVENTS_H_
#define _POWERPC_EVENTS_H_

#define PME_INSTR_COMPLETED 1

#endif
papi-papi-7-2-0-t/src/libperfnec/lib/powerpc_reg.h000066400000000000000000000064711502707512200220030ustar00rootroot00000000000000
/*
 * These definitions were taken from the reg.h file which, until Linux
 * 2.6.18, resided in /usr/include/asm-ppc64. Most of the unneeded
 * definitions have been removed, but there are still a few in this file
 * that are currently unused by libpfm.
 */
#ifndef _POWER_REG_H
#define _POWER_REG_H

#define __stringify_1(x)  #x
#define __stringify(x)    __stringify_1(x)

#define mfspr(rn) ({unsigned long rval; \
		asm volatile("mfspr %0," __stringify(rn) \
			     : "=r" (rval)); rval;})

/* Special Purpose Registers (SPRNs) */
#define SPRN_PVR    0x11F /* Processor Version Register */

/* Performance monitor SPRs */
#define SPRN_MMCR0  795
#define MMCR0_FC    0x80000000UL /* freeze counters */
#define MMCR0_FCS   0x40000000UL /* freeze in supervisor state */
#define MMCR0_KERNEL_DISABLE MMCR0_FCS
#define MMCR0_FCP   0x20000000UL /* freeze in problem state */
#define MMCR0_PROBLEM_DISABLE MMCR0_FCP
#define MMCR0_FCM1  0x10000000UL /* freeze counters while MSR mark = 1 */
#define MMCR0_FCM0  0x08000000UL /* freeze counters while MSR mark = 0 */
#define MMCR0_PMXE  0x04000000UL /* performance monitor exception enable */
#define MMCR0_FCECE 0x02000000UL /* freeze ctrs on enabled cond or event */
#define MMCR0_TBEE  0x00400000UL /* time base exception enable */
#define MMCR0_PMC1CE 0x00008000UL /* PMC1 count enable */
#define MMCR0_PMCjCE 0x00004000UL /* PMCj count enable */
#define MMCR0_TRIGGER 0x00002000UL /* TRIGGER enable */
#define MMCR0_PMAO  0x00000080UL /* performance monitor alert has occurred, set to 0 after handling exception */
#define MMCR0_SHRFC 0x00000040UL /* SHRre freeze conditions between threads */
#define MMCR0_FC1_4 0x00000020UL /* freeze counters 1 - 4 on POWER5/5+ */
#define MMCR0_FC5_6 0x00000010UL /* freeze counters 5 & 6 on POWER5/5+ */
#define MMCR0_FCTI  0x00000008UL /* freeze counters in tags inactive mode */
#define MMCR0_FCTA  0x00000004UL /* freeze counters in tags active mode */
#define MMCR0_FCWAIT 0x00000002UL /* freeze counter in WAIT state */
#define MMCR0_FCHV  0x00000001UL /* freeze conditions in hypervisor mode */
#define SPRN_MMCR1  798
#define SPRN_MMCRA  0x312
#define MMCRA_SIHV  0x10000000UL /* state of MSR HV when SIAR set */
#define MMCRA_SIPR  0x08000000UL /* state of MSR PR when SIAR set */
#define MMCRA_SAMPLE_ENABLE 0x00000001UL /* enable sampling */
#define SPRN_PMC1   787
#define SPRN_PMC2   788
#define SPRN_PMC3   789
#define SPRN_PMC4   790
#define SPRN_PMC5   791
#define SPRN_PMC6   792
#define SPRN_PMC7   793
#define SPRN_PMC8   794
#define SPRN_SIAR   780
#define SPRN_SDAR   781

/* Processor Version Register (PVR) field extraction */
#define PVR_VER(pvr)  (((pvr) >> 16) & 0xFFFF) /* Version field */
#define PVR_REV(pvr)  (((pvr) >>  0) & 0xFFFF) /* Revision field */

#define __is_processor(pv)  (PVR_VER(mfspr(SPRN_PVR)) == (pv))

/* 64-bit processors */
/* XXX the prefix should be PVR_, we'll do a global sweep to fix it one day */
#define PV_NORTHSTAR 0x0033
#define PV_PULSAR    0x0034
#define PV_POWER4    0x0035
#define PV_ICESTAR   0x0036
#define PV_SSTAR     0x0037
#define PV_POWER4p   0x0038
#define PV_970       0x0039
#define PV_POWER5    0x003A
#define PV_POWER5p   0x003B
#define PV_970FX     0x003C
#define PV_POWER6    0x003E
#define PV_POWER7    0x003F
#define PV_630       0x0040
#define PV_630p      0x0041
#define PV_970MP     0x0044
#define PV_970GX     0x0045
#define PV_BE        0x0070

#endif /* _POWER_REG_H */
papi-papi-7-2-0-t/src/libperfnec/lib/ppc970_events.h000066400000000000000000004110161502707512200220700ustar00rootroot00000000000000
/****************************/
/* THIS IS OPEN SOURCE CODE */
/****************************/

#ifndef __PPC970_EVENTS_H__
#define __PPC970_EVENTS_H__

/*
 * File:    ppc970_events.h
 * CVS:
 * Author:  Corey Ashford
 *          cjashfor@us.ibm.com
 * Mods:
 *
 *
 * (C) Copyright IBM Corporation, 2007. All Rights Reserved.
 * Contributed by Corey Ashford
 *
 * Note: This code was automatically generated and should not be modified by
 * hand.
 *
 */

#define PPC970_PME_PM_LSU_REJECT_RELOAD_CDF 0
#define PPC970_PME_PM_MRK_LSU_SRQ_INST_VALID 1
#define PPC970_PME_PM_FPU1_SINGLE 2
#define PPC970_PME_PM_FPU0_STALL3 3
#define PPC970_PME_PM_TB_BIT_TRANS 4
#define PPC970_PME_PM_GPR_MAP_FULL_CYC 5
#define PPC970_PME_PM_MRK_ST_CMPL 6
#define PPC970_PME_PM_FPU0_STF 7
#define PPC970_PME_PM_FPU1_FMA 8
#define PPC970_PME_PM_LSU1_FLUSH_ULD 9
#define PPC970_PME_PM_MRK_INST_FIN 10
#define PPC970_PME_PM_MRK_LSU0_FLUSH_UST 11
#define PPC970_PME_PM_LSU_LRQ_S0_ALLOC 12
#define PPC970_PME_PM_FPU_FDIV 13
#define PPC970_PME_PM_FPU0_FULL_CYC 14
#define PPC970_PME_PM_FPU_SINGLE 15
#define PPC970_PME_PM_FPU0_FMA 16
#define PPC970_PME_PM_MRK_LSU1_FLUSH_ULD 17
#define PPC970_PME_PM_LSU1_FLUSH_LRQ 18
#define PPC970_PME_PM_DTLB_MISS 19
#define PPC970_PME_PM_MRK_ST_MISS_L1 20
#define PPC970_PME_PM_EXT_INT 21
#define PPC970_PME_PM_MRK_LSU1_FLUSH_LRQ 22
#define PPC970_PME_PM_MRK_ST_GPS 23
#define PPC970_PME_PM_GRP_DISP_SUCCESS 24
#define PPC970_PME_PM_LSU1_LDF 25
#define PPC970_PME_PM_LSU0_SRQ_STFWD 26
#define PPC970_PME_PM_CR_MAP_FULL_CYC 27
#define PPC970_PME_PM_MRK_LSU0_FLUSH_ULD 28
#define PPC970_PME_PM_LSU_DERAT_MISS 29
#define PPC970_PME_PM_FPU0_SINGLE 30
#define PPC970_PME_PM_FPU1_FDIV 31
#define PPC970_PME_PM_FPU1_FEST 32
#define PPC970_PME_PM_FPU0_FRSP_FCONV 33
#define PPC970_PME_PM_GCT_EMPTY_SRQ_FULL 34
#define PPC970_PME_PM_MRK_ST_CMPL_INT 35
#define PPC970_PME_PM_FLUSH_BR_MPRED 36
#define PPC970_PME_PM_FXU_FIN 37
#define PPC970_PME_PM_FPU_STF 38
#define PPC970_PME_PM_DSLB_MISS 39
#define PPC970_PME_PM_FXLS1_FULL_CYC 40
#define PPC970_PME_PM_LSU_LMQ_LHR_MERGE 41
#define PPC970_PME_PM_MRK_STCX_FAIL 42
#define PPC970_PME_PM_FXU0_BUSY_FXU1_IDLE 43
#define PPC970_PME_PM_MRK_DATA_FROM_L25_SHR 44
#define PPC970_PME_PM_LSU_FLUSH_ULD 45
#define PPC970_PME_PM_MRK_BRU_FIN 46
#define PPC970_PME_PM_IERAT_XLATE_WR 47
#define PPC970_PME_PM_DATA_FROM_MEM 48
#define PPC970_PME_PM_FPR_MAP_FULL_CYC 49
#define PPC970_PME_PM_FPU1_FULL_CYC 50
#define PPC970_PME_PM_FPU0_FIN 51
#define PPC970_PME_PM_GRP_BR_REDIR 52
#define PPC970_PME_PM_THRESH_TIMEO 53
#define PPC970_PME_PM_FPU_FSQRT 54
#define PPC970_PME_PM_MRK_LSU0_FLUSH_LRQ 55
#define PPC970_PME_PM_PMC1_OVERFLOW 56
#define PPC970_PME_PM_FXLS0_FULL_CYC 57
#define PPC970_PME_PM_FPU0_ALL 58
#define PPC970_PME_PM_DATA_TABLEWALK_CYC 59
#define PPC970_PME_PM_FPU0_FEST 60
#define PPC970_PME_PM_DATA_FROM_L25_MOD 61
#define PPC970_PME_PM_LSU0_REJECT_ERAT_MISS 62
#define PPC970_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 63
#define PPC970_PME_PM_LSU0_REJECT_RELOAD_CDF 64
#define PPC970_PME_PM_FPU_FEST 65
#define PPC970_PME_PM_0INST_FETCH 66
#define PPC970_PME_PM_LD_MISS_L1_LSU0 67
#define PPC970_PME_PM_LSU1_REJECT_RELOAD_CDF 68
#define PPC970_PME_PM_L1_PREF 69
#define PPC970_PME_PM_FPU1_STALL3 70
#define PPC970_PME_PM_BRQ_FULL_CYC 71
#define PPC970_PME_PM_PMC8_OVERFLOW 72
#define PPC970_PME_PM_PMC7_OVERFLOW 73
#define PPC970_PME_PM_WORK_HELD 74
#define PPC970_PME_PM_MRK_LD_MISS_L1_LSU0 75
#define PPC970_PME_PM_FXU_IDLE 76
#define PPC970_PME_PM_INST_CMPL 77
#define PPC970_PME_PM_LSU1_FLUSH_UST 78
#define PPC970_PME_PM_LSU0_FLUSH_ULD 79
#define PPC970_PME_PM_LSU_FLUSH 80
#define PPC970_PME_PM_INST_FROM_L2 81
#define PPC970_PME_PM_LSU1_REJECT_LMQ_FULL 82
#define PPC970_PME_PM_PMC2_OVERFLOW 83
#define PPC970_PME_PM_FPU0_DENORM 84
#define PPC970_PME_PM_FPU1_FMOV_FEST 85
#define PPC970_PME_PM_GRP_DISP_REJECT 86
#define PPC970_PME_PM_LSU_LDF 87
#define PPC970_PME_PM_INST_DISP 88
#define PPC970_PME_PM_DATA_FROM_L25_SHR 89
#define PPC970_PME_PM_L1_DCACHE_RELOAD_VALID 90
#define PPC970_PME_PM_MRK_GRP_ISSUED 91
#define PPC970_PME_PM_FPU_FMA 92
#define PPC970_PME_PM_MRK_CRU_FIN 93
#define PPC970_PME_PM_MRK_LSU1_FLUSH_UST 94
#define PPC970_PME_PM_MRK_FXU_FIN 95
#define PPC970_PME_PM_LSU1_REJECT_ERAT_MISS 96
#define PPC970_PME_PM_BR_ISSUED 97
#define PPC970_PME_PM_PMC4_OVERFLOW 98
#define PPC970_PME_PM_EE_OFF 99
#define PPC970_PME_PM_INST_FROM_L25_MOD 100
#define PPC970_PME_PM_ITLB_MISS 101
#define PPC970_PME_PM_FXU1_BUSY_FXU0_IDLE 102
#define PPC970_PME_PM_GRP_DISP_VALID 103
#define PPC970_PME_PM_MRK_GRP_DISP 104
#define PPC970_PME_PM_LSU_FLUSH_UST 105
#define PPC970_PME_PM_FXU1_FIN 106
#define PPC970_PME_PM_GRP_CMPL 107
#define PPC970_PME_PM_FPU_FRSP_FCONV 108
#define PPC970_PME_PM_MRK_LSU0_FLUSH_SRQ 109
#define PPC970_PME_PM_LSU_LMQ_FULL_CYC 110
#define PPC970_PME_PM_ST_REF_L1_LSU0 111
#define PPC970_PME_PM_LSU0_DERAT_MISS 112
#define PPC970_PME_PM_LSU_SRQ_SYNC_CYC 113
#define PPC970_PME_PM_FPU_STALL3 114
#define PPC970_PME_PM_LSU_REJECT_ERAT_MISS 115
#define PPC970_PME_PM_MRK_DATA_FROM_L2 116
#define PPC970_PME_PM_LSU0_FLUSH_SRQ 117
#define PPC970_PME_PM_FPU0_FMOV_FEST 118
#define PPC970_PME_PM_LD_REF_L1_LSU0 119
#define PPC970_PME_PM_LSU1_FLUSH_SRQ 120
#define PPC970_PME_PM_GRP_BR_MPRED 121
#define PPC970_PME_PM_LSU_LMQ_S0_ALLOC 122
#define PPC970_PME_PM_LSU0_REJECT_LMQ_FULL 123
#define PPC970_PME_PM_ST_REF_L1 124
#define PPC970_PME_PM_MRK_VMX_FIN 125
#define PPC970_PME_PM_LSU_SRQ_EMPTY_CYC 126
#define PPC970_PME_PM_FPU1_STF 127
#define PPC970_PME_PM_RUN_CYC 128
#define PPC970_PME_PM_LSU_LMQ_S0_VALID 129
#define PPC970_PME_PM_LSU0_LDF 130
#define PPC970_PME_PM_LSU_LRQ_S0_VALID 131
#define PPC970_PME_PM_PMC3_OVERFLOW 132
#define PPC970_PME_PM_MRK_IMR_RELOAD 133
#define PPC970_PME_PM_MRK_GRP_TIMEO 134
#define PPC970_PME_PM_FPU_FMOV_FEST 135
#define PPC970_PME_PM_GRP_DISP_BLK_SB_CYC 136
#define PPC970_PME_PM_XER_MAP_FULL_CYC 137
#define PPC970_PME_PM_ST_MISS_L1 138
#define PPC970_PME_PM_STOP_COMPLETION 139
#define PPC970_PME_PM_MRK_GRP_CMPL 140
#define PPC970_PME_PM_ISLB_MISS 141
#define PPC970_PME_PM_SUSPENDED 142
#define PPC970_PME_PM_CYC 143
#define PPC970_PME_PM_LD_MISS_L1_LSU1 144
#define PPC970_PME_PM_STCX_FAIL 145
#define PPC970_PME_PM_LSU1_SRQ_STFWD 146
#define PPC970_PME_PM_GRP_DISP 147
#define PPC970_PME_PM_L2_PREF 148
#define PPC970_PME_PM_FPU1_DENORM 149
#define PPC970_PME_PM_DATA_FROM_L2 150
#define PPC970_PME_PM_FPU0_FPSCR 151
#define PPC970_PME_PM_MRK_DATA_FROM_L25_MOD 152
#define PPC970_PME_PM_FPU0_FSQRT 153
#define PPC970_PME_PM_LD_REF_L1 154
#define PPC970_PME_PM_MRK_L1_RELOAD_VALID 155
#define PPC970_PME_PM_1PLUS_PPC_CMPL 156
#define PPC970_PME_PM_INST_FROM_L1 157
#define PPC970_PME_PM_EE_OFF_EXT_INT 158
#define PPC970_PME_PM_PMC6_OVERFLOW 159
#define PPC970_PME_PM_LSU_LRQ_FULL_CYC 160
#define PPC970_PME_PM_IC_PREF_INSTALL 161
#define PPC970_PME_PM_DC_PREF_OUT_OF_STREAMS 162
#define PPC970_PME_PM_MRK_LSU1_FLUSH_SRQ 163
#define PPC970_PME_PM_GCT_FULL_CYC 164
#define PPC970_PME_PM_INST_FROM_MEM 165
#define PPC970_PME_PM_FLUSH_LSU_BR_MPRED 166
#define PPC970_PME_PM_FXU_BUSY 167
#define PPC970_PME_PM_ST_REF_L1_LSU1 168
#define PPC970_PME_PM_MRK_LD_MISS_L1 169
#define PPC970_PME_PM_L1_WRITE_CYC 170
#define PPC970_PME_PM_LSU_REJECT_LMQ_FULL 171
#define PPC970_PME_PM_FPU_ALL 172
#define PPC970_PME_PM_LSU_SRQ_S0_ALLOC 173
#define PPC970_PME_PM_INST_FROM_L25_SHR 174
#define PPC970_PME_PM_GRP_MRK 175
#define PPC970_PME_PM_BR_MPRED_CR 176
#define PPC970_PME_PM_DC_PREF_STREAM_ALLOC 177
#define PPC970_PME_PM_FPU1_FIN 178
#define PPC970_PME_PM_LSU_REJECT_SRQ 179
#define PPC970_PME_PM_BR_MPRED_TA 180
#define PPC970_PME_PM_CRQ_FULL_CYC 181
#define PPC970_PME_PM_LD_MISS_L1 182
#define PPC970_PME_PM_INST_FROM_PREF 183
#define PPC970_PME_PM_STCX_PASS 184
#define PPC970_PME_PM_DC_INV_L2 185
#define PPC970_PME_PM_LSU_SRQ_FULL_CYC 186
#define PPC970_PME_PM_LSU0_FLUSH_LRQ 187
#define PPC970_PME_PM_LSU_SRQ_S0_VALID 188
#define PPC970_PME_PM_LARX_LSU0 189
#define PPC970_PME_PM_GCT_EMPTY_CYC 190
#define PPC970_PME_PM_FPU1_ALL 191
#define PPC970_PME_PM_FPU1_FSQRT 192
#define PPC970_PME_PM_FPU_FIN 193
#define PPC970_PME_PM_LSU_SRQ_STFWD 194
#define PPC970_PME_PM_MRK_LD_MISS_L1_LSU1 195
#define PPC970_PME_PM_FXU0_FIN 196
#define PPC970_PME_PM_MRK_FPU_FIN 197
#define PPC970_PME_PM_PMC5_OVERFLOW 198
#define PPC970_PME_PM_SNOOP_TLBIE 199
#define PPC970_PME_PM_FPU1_FRSP_FCONV 200
#define PPC970_PME_PM_FPU0_FDIV 201
#define PPC970_PME_PM_LD_REF_L1_LSU1 202
#define PPC970_PME_PM_HV_CYC 203
#define PPC970_PME_PM_LR_CTR_MAP_FULL_CYC 204
#define PPC970_PME_PM_FPU_DENORM 205
#define PPC970_PME_PM_LSU0_REJECT_SRQ 206
#define PPC970_PME_PM_LSU1_REJECT_SRQ 207
#define PPC970_PME_PM_LSU1_DERAT_MISS 208
#define PPC970_PME_PM_IC_PREF_REQ 209
#define PPC970_PME_PM_MRK_LSU_FIN 210
#define PPC970_PME_PM_MRK_DATA_FROM_MEM 211
#define PPC970_PME_PM_LSU0_FLUSH_UST 212
#define PPC970_PME_PM_LSU_FLUSH_LRQ 213
#define PPC970_PME_PM_LSU_FLUSH_SRQ 214

static const int ppc970_event_ids[][PPC970_NUM_EVENT_COUNTERS] = {
	[ PPC970_PME_PM_LSU_REJECT_RELOAD_CDF ] = { -1, -1, -1, -1, -1, 68, -1, -1 },
	[ PPC970_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { -1, -1, 63, 61, -1, -1, 60, 61 },
	[ PPC970_PME_PM_FPU1_SINGLE ] = { 23, 22, -1, -1, 24, 23, -1, -1 },
	[ PPC970_PME_PM_FPU0_STALL3 ] = { 15, 14, -1, -1, 16, 15, -1, -1 },
	[ PPC970_PME_PM_TB_BIT_TRANS ] = { -1, -1, -1, -1, -1, -1, -1, 67 },
	[ PPC970_PME_PM_GPR_MAP_FULL_CYC ] = { -1, -1, 28, 28, -1, -1, 27, 27 },
	[ PPC970_PME_PM_MRK_ST_CMPL ] = { 79, -1, -1, -1, -1, -1, -1, -1 },
	[ PPC970_PME_PM_FPU0_STF ] = { 16, 15, -1, -1, 17, 16, -1, -1 },
	[ PPC970_PME_PM_FPU1_FMA ] = { 20, 19, -1, -1, 21, 20, -1, -1 },
	[ PPC970_PME_PM_LSU1_FLUSH_ULD ] = { 58, 57, -1, -1, 60, 57, -1, -1 },
	[ PPC970_PME_PM_MRK_INST_FIN ] = { -1, -1, -1, -1, -1, -1, 50, -1 },
	[ PPC970_PME_PM_MRK_LSU0_FLUSH_UST ] = { -1, -1, 58, 56, -1, -1, 55, 55 },
	[ PPC970_PME_PM_LSU_LRQ_S0_ALLOC ] = { 66, 66, -1, -1, 68, 66, -1, -1 },
	[ PPC970_PME_PM_FPU_FDIV ] = { 27, -1, -1, -1, -1, -1, -1, -1 },
	[ PPC970_PME_PM_FPU0_FULL_CYC ] = { 13, 12, -1, -1, 14, 13, -1, -1 },
	[ PPC970_PME_PM_FPU_SINGLE ] = { -1, -1, -1, -1, 28, -1, -1, -1 },
	[ PPC970_PME_PM_FPU0_FMA ] = { 11, 10, -1, -1, 12, 11, -1, -1 },
	[ PPC970_PME_PM_MRK_LSU1_FLUSH_ULD ] = { -1, -1, 61, 59, -1, -1, 58, 58 },
	[ PPC970_PME_PM_LSU1_FLUSH_LRQ ] = { 56, 55, -1, -1, 58, 55, -1, -1 },
	[ PPC970_PME_PM_DTLB_MISS ] = { 6, 5, -1, -1, 7, 6, -1, -1 },
	[
PPC970_PME_PM_MRK_ST_MISS_L1 ] = { 80, 76, -1, -1, 79, 79, -1, -1 }, [ PPC970_PME_PM_EXT_INT ] = { -1, -1, -1, -1, -1, -1, -1, 10 }, [ PPC970_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { -1, -1, 59, 57, -1, -1, 56, 56 }, [ PPC970_PME_PM_MRK_ST_GPS ] = { -1, -1, -1, -1, -1, 78, -1, -1 }, [ PPC970_PME_PM_GRP_DISP_SUCCESS ] = { -1, -1, -1, -1, 34, -1, -1, -1 }, [ PPC970_PME_PM_LSU1_LDF ] = { -1, -1, 43, 40, -1, -1, 40, 41 }, [ PPC970_PME_PM_LSU0_SRQ_STFWD ] = { 54, 53, -1, -1, 56, 53, -1, -1 }, [ PPC970_PME_PM_CR_MAP_FULL_CYC ] = { 1, 1, -1, -1, 2, 1, -1, -1 }, [ PPC970_PME_PM_MRK_LSU0_FLUSH_ULD ] = { -1, -1, 57, 55, -1, -1, 54, 54 }, [ PPC970_PME_PM_LSU_DERAT_MISS ] = { -1, -1, -1, -1, -1, 64, -1, -1 }, [ PPC970_PME_PM_FPU0_SINGLE ] = { 14, 13, -1, -1, 15, 14, -1, -1 }, [ PPC970_PME_PM_FPU1_FDIV ] = { 19, 18, -1, -1, 20, 19, -1, -1 }, [ PPC970_PME_PM_FPU1_FEST ] = { -1, -1, 18, 18, -1, -1, 17, 18 }, [ PPC970_PME_PM_FPU0_FRSP_FCONV ] = { -1, -1, 17, 17, -1, -1, 16, 17 }, [ PPC970_PME_PM_GCT_EMPTY_SRQ_FULL ] = { -1, 27, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_MRK_ST_CMPL_INT ] = { -1, -1, 64, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_FLUSH_BR_MPRED ] = { -1, -1, 11, 11, -1, -1, 10, 11 }, [ PPC970_PME_PM_FXU_FIN ] = { -1, -1, 27, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_FPU_STF ] = { -1, -1, -1, -1, -1, 27, -1, -1 }, [ PPC970_PME_PM_DSLB_MISS ] = { 5, 4, -1, -1, 6, 5, -1, -1 }, [ PPC970_PME_PM_FXLS1_FULL_CYC ] = { -1, -1, 24, 24, -1, -1, 23, 24 }, [ PPC970_PME_PM_LSU_LMQ_LHR_MERGE ] = { -1, -1, 46, 43, -1, -1, 43, 45 }, [ PPC970_PME_PM_MRK_STCX_FAIL ] = { 78, 75, -1, -1, 78, 77, -1, -1 }, [ PPC970_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { -1, -1, -1, -1, -1, -1, 24, -1 }, [ PPC970_PME_PM_MRK_DATA_FROM_L25_SHR ] = { -1, -1, -1, -1, 73, -1, -1, -1 }, [ PPC970_PME_PM_LSU_FLUSH_ULD ] = { 65, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_MRK_BRU_FIN ] = { -1, 71, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_IERAT_XLATE_WR ] = { 36, 36, -1, -1, 39, 36, -1, -1 }, [ PPC970_PME_PM_DATA_FROM_MEM ] = { -1, -1, 
5, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_FPR_MAP_FULL_CYC ] = { 7, 6, -1, -1, 8, 7, -1, -1 }, [ PPC970_PME_PM_FPU1_FULL_CYC ] = { 22, 21, -1, -1, 23, 22, -1, -1 }, [ PPC970_PME_PM_FPU0_FIN ] = { -1, -1, 14, 14, -1, -1, 13, 14 }, [ PPC970_PME_PM_GRP_BR_REDIR ] = { 31, 30, -1, -1, 32, 31, -1, -1 }, [ PPC970_PME_PM_THRESH_TIMEO ] = { -1, 83, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_FPU_FSQRT ] = { -1, -1, -1, -1, -1, 26, -1, -1 }, [ PPC970_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { -1, -1, 55, 53, -1, -1, 52, 52 }, [ PPC970_PME_PM_PMC1_OVERFLOW ] = { -1, 77, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_FXLS0_FULL_CYC ] = { -1, -1, 23, 23, -1, -1, 22, 23 }, [ PPC970_PME_PM_FPU0_ALL ] = { 8, 7, -1, -1, 9, 8, -1, -1 }, [ PPC970_PME_PM_DATA_TABLEWALK_CYC ] = { 4, 3, -1, -1, 5, 4, -1, -1 }, [ PPC970_PME_PM_FPU0_FEST ] = { -1, -1, 13, 13, -1, -1, 12, 13 }, [ PPC970_PME_PM_DATA_FROM_L25_MOD ] = { -1, -1, -1, -1, -1, 3, -1, -1 }, [ PPC970_PME_PM_LSU0_REJECT_ERAT_MISS ] = { 50, 49, -1, -1, 52, 49, -1, -1 }, [ PPC970_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { -1, 65, 49, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_LSU0_REJECT_RELOAD_CDF ] = { 52, 51, -1, -1, 54, 51, -1, -1 }, [ PPC970_PME_PM_FPU_FEST ] = { -1, -1, 22, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_0INST_FETCH ] = { -1, -1, -1, 0, -1, -1, -1, -1 }, [ PPC970_PME_PM_LD_MISS_L1_LSU0 ] = { -1, -1, 38, 35, -1, -1, 35, 35 }, [ PPC970_PME_PM_LSU1_REJECT_RELOAD_CDF ] = { 62, 61, -1, -1, 64, 61, -1, -1 }, [ PPC970_PME_PM_L1_PREF ] = { -1, -1, 34, 32, -1, -1, 32, 32 }, [ PPC970_PME_PM_FPU1_STALL3 ] = { 24, 23, -1, -1, 25, 24, -1, -1 }, [ PPC970_PME_PM_BRQ_FULL_CYC ] = { 0, 0, -1, -1, 1, 0, -1, -1 }, [ PPC970_PME_PM_PMC8_OVERFLOW ] = { 81, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_PMC7_OVERFLOW ] = { -1, -1, -1, -1, -1, -1, -1, 62 }, [ PPC970_PME_PM_WORK_HELD ] = { -1, 84, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { 76, 73, -1, -1, 76, 75, -1, -1 }, [ PPC970_PME_PM_FXU_IDLE ] = { -1, -1, -1, -1, 29, -1, -1, -1 }, [ 
PPC970_PME_PM_INST_CMPL ] = { 37, 37, 31, 30, 40, 37, 30, 30 }, [ PPC970_PME_PM_LSU1_FLUSH_UST ] = { 59, 58, -1, -1, 61, 58, -1, -1 }, [ PPC970_PME_PM_LSU0_FLUSH_ULD ] = { 48, 47, -1, -1, 50, 47, -1, -1 }, [ PPC970_PME_PM_LSU_FLUSH ] = { -1, -1, 44, 41, -1, -1, 41, 42 }, [ PPC970_PME_PM_INST_FROM_L2 ] = { 40, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_LSU1_REJECT_LMQ_FULL ] = { 61, 60, -1, -1, 63, 60, -1, -1 }, [ PPC970_PME_PM_PMC2_OVERFLOW ] = { -1, -1, 66, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_FPU0_DENORM ] = { 9, 8, -1, -1, 10, 9, -1, -1 }, [ PPC970_PME_PM_FPU1_FMOV_FEST ] = { -1, -1, 20, 20, -1, -1, 19, 20 }, [ PPC970_PME_PM_GRP_DISP_REJECT ] = { 32, 32, -1, -1, 33, 32, -1, 29 }, [ PPC970_PME_PM_LSU_LDF ] = { -1, -1, -1, -1, -1, -1, -1, 43 }, [ PPC970_PME_PM_INST_DISP ] = { 38, 38, -1, -1, 41, 38, -1, -1 }, [ PPC970_PME_PM_DATA_FROM_L25_SHR ] = { -1, -1, -1, -1, 4, -1, -1, -1 }, [ PPC970_PME_PM_L1_DCACHE_RELOAD_VALID ] = { -1, -1, 33, 31, -1, -1, 31, 31 }, [ PPC970_PME_PM_MRK_GRP_ISSUED ] = { -1, -1, -1, -1, -1, 73, -1, -1 }, [ PPC970_PME_PM_FPU_FMA ] = { -1, 25, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_MRK_CRU_FIN ] = { -1, -1, -1, 50, -1, -1, -1, -1 }, [ PPC970_PME_PM_MRK_LSU1_FLUSH_UST ] = { -1, -1, 62, 60, -1, -1, 59, 59 }, [ PPC970_PME_PM_MRK_FXU_FIN ] = { -1, -1, -1, -1, -1, 72, -1, -1 }, [ PPC970_PME_PM_LSU1_REJECT_ERAT_MISS ] = { 60, 59, -1, -1, 62, 59, -1, -1 }, [ PPC970_PME_PM_BR_ISSUED ] = { -1, -1, 0, 1, -1, -1, 0, 0 }, [ PPC970_PME_PM_PMC4_OVERFLOW ] = { -1, -1, -1, -1, 80, -1, -1, -1 }, [ PPC970_PME_PM_EE_OFF ] = { -1, -1, 9, 9, -1, -1, 8, 8 }, [ PPC970_PME_PM_INST_FROM_L25_MOD ] = { -1, -1, -1, -1, -1, 39, -1, -1 }, [ PPC970_PME_PM_ITLB_MISS ] = { 42, 41, -1, -1, 44, 41, -1, -1 }, [ PPC970_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { -1, -1, -1, 26, -1, -1, -1, -1 }, [ PPC970_PME_PM_GRP_DISP_VALID ] = { 33, 33, -1, -1, 35, 33, -1, -1 }, [ PPC970_PME_PM_MRK_GRP_DISP ] = { 73, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_LSU_FLUSH_UST ] = { -1, 64, -1, 
-1, -1, -1, -1, -1 }, [ PPC970_PME_PM_FXU1_FIN ] = { -1, -1, 26, 27, -1, -1, 26, 26 }, [ PPC970_PME_PM_GRP_CMPL ] = { -1, -1, -1, -1, -1, -1, 28, -1 }, [ PPC970_PME_PM_FPU_FRSP_FCONV ] = { -1, -1, -1, -1, -1, -1, 21, -1 }, [ PPC970_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { -1, -1, 56, 54, -1, -1, 53, 53 }, [ PPC970_PME_PM_LSU_LMQ_FULL_CYC ] = { -1, -1, 45, 42, -1, -1, 42, 44 }, [ PPC970_PME_PM_ST_REF_L1_LSU0 ] = { -1, -1, 69, 64, -1, -1, 64, 64 }, [ PPC970_PME_PM_LSU0_DERAT_MISS ] = { 45, 44, -1, -1, 47, 44, -1, -1 }, [ PPC970_PME_PM_LSU_SRQ_SYNC_CYC ] = { -1, -1, 52, 49, -1, -1, 48, 50 }, [ PPC970_PME_PM_FPU_STALL3 ] = { -1, 26, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_LSU_REJECT_ERAT_MISS ] = { -1, -1, -1, -1, 70, -1, -1, -1 }, [ PPC970_PME_PM_MRK_DATA_FROM_L2 ] = { 72, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_LSU0_FLUSH_SRQ ] = { 47, 46, -1, -1, 49, 46, -1, -1 }, [ PPC970_PME_PM_FPU0_FMOV_FEST ] = { -1, -1, 15, 15, -1, -1, 14, 15 }, [ PPC970_PME_PM_LD_REF_L1_LSU0 ] = { -1, -1, 40, 37, -1, -1, 37, 38 }, [ PPC970_PME_PM_LSU1_FLUSH_SRQ ] = { 57, 56, -1, -1, 59, 56, -1, -1 }, [ PPC970_PME_PM_GRP_BR_MPRED ] = { 30, 29, -1, -1, 31, 30, -1, -1 }, [ PPC970_PME_PM_LSU_LMQ_S0_ALLOC ] = { -1, -1, 47, 44, -1, -1, 44, 46 }, [ PPC970_PME_PM_LSU0_REJECT_LMQ_FULL ] = { 51, 50, -1, -1, 53, 50, -1, -1 }, [ PPC970_PME_PM_ST_REF_L1 ] = { -1, -1, -1, -1, -1, -1, 63, -1 }, [ PPC970_PME_PM_MRK_VMX_FIN ] = { -1, -1, 65, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_LSU_SRQ_EMPTY_CYC ] = { -1, -1, -1, 47, -1, -1, -1, -1 }, [ PPC970_PME_PM_FPU1_STF ] = { 25, 24, -1, -1, 26, 25, -1, -1 }, [ PPC970_PME_PM_RUN_CYC ] = { 82, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_LSU_LMQ_S0_VALID ] = { -1, -1, 48, 45, -1, -1, 45, 47 }, [ PPC970_PME_PM_LSU0_LDF ] = { -1, -1, 42, 39, -1, -1, 39, 40 }, [ PPC970_PME_PM_LSU_LRQ_S0_VALID ] = { 67, 67, -1, -1, 69, 67, -1, -1 }, [ PPC970_PME_PM_PMC3_OVERFLOW ] = { -1, -1, -1, 62, -1, -1, -1, -1 }, [ PPC970_PME_PM_MRK_IMR_RELOAD ] = { 74, 72, -1, -1, 75, 74, -1, -1 }, [ 
PPC970_PME_PM_MRK_GRP_TIMEO ] = { -1, -1, -1, -1, 74, -1, -1, -1 }, [ PPC970_PME_PM_FPU_FMOV_FEST ] = { -1, -1, -1, -1, -1, -1, -1, 22 }, [ PPC970_PME_PM_GRP_DISP_BLK_SB_CYC ] = { -1, -1, 29, 29, -1, -1, 29, 28 }, [ PPC970_PME_PM_XER_MAP_FULL_CYC ] = { 88, 85, -1, -1, 86, 86, -1, -1 }, [ PPC970_PME_PM_ST_MISS_L1 ] = { 86, 81, 68, 63, 84, 84, 62, 63 }, [ PPC970_PME_PM_STOP_COMPLETION ] = { -1, -1, 67, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_MRK_GRP_CMPL ] = { -1, -1, -1, 51, -1, -1, -1, -1 }, [ PPC970_PME_PM_ISLB_MISS ] = { 41, 40, -1, -1, 43, 40, -1, -1 }, [ PPC970_PME_PM_SUSPENDED ] = { 87, 82, 71, 66, 85, 85, 66, 66 }, [ PPC970_PME_PM_CYC ] = { 2, 2, 4, 5, 3, 2, 4, 4 }, [ PPC970_PME_PM_LD_MISS_L1_LSU1 ] = { -1, -1, 39, 36, -1, -1, 36, 36 }, [ PPC970_PME_PM_STCX_FAIL ] = { 84, 79, -1, -1, 82, 82, -1, -1 }, [ PPC970_PME_PM_LSU1_SRQ_STFWD ] = { 64, 63, -1, -1, 66, 63, -1, -1 }, [ PPC970_PME_PM_GRP_DISP ] = { -1, 31, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_L2_PREF ] = { -1, -1, 36, 34, -1, -1, 34, 34 }, [ PPC970_PME_PM_FPU1_DENORM ] = { 18, 17, -1, -1, 19, 18, -1, -1 }, [ PPC970_PME_PM_DATA_FROM_L2 ] = { 3, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_FPU0_FPSCR ] = { -1, -1, 16, 16, -1, -1, 15, 16 }, [ PPC970_PME_PM_MRK_DATA_FROM_L25_MOD ] = { -1, -1, -1, -1, -1, 71, -1, -1 }, [ PPC970_PME_PM_FPU0_FSQRT ] = { 12, 11, -1, -1, 13, 12, -1, -1 }, [ PPC970_PME_PM_LD_REF_L1 ] = { -1, -1, -1, -1, -1, -1, -1, 37 }, [ PPC970_PME_PM_MRK_L1_RELOAD_VALID ] = { -1, -1, 54, 52, -1, -1, 51, 51 }, [ PPC970_PME_PM_1PLUS_PPC_CMPL ] = { -1, -1, -1, -1, 0, -1, -1, -1 }, [ PPC970_PME_PM_INST_FROM_L1 ] = { 39, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_EE_OFF_EXT_INT ] = { -1, -1, 10, 10, -1, -1, 9, 9 }, [ PPC970_PME_PM_PMC6_OVERFLOW ] = { -1, -1, -1, -1, -1, -1, 61, -1 }, [ PPC970_PME_PM_LSU_LRQ_FULL_CYC ] = { -1, -1, 50, 46, -1, -1, 46, 48 }, [ PPC970_PME_PM_IC_PREF_INSTALL ] = { 34, 34, -1, -1, 37, 34, -1, -1 }, [ PPC970_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { -1, -1, 7, 7, -1, -1, 6, 
6 }, [ PPC970_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { -1, -1, 60, 58, -1, -1, 57, 57 }, [ PPC970_PME_PM_GCT_FULL_CYC ] = { 29, 28, -1, -1, 30, 29, -1, -1 }, [ PPC970_PME_PM_INST_FROM_MEM ] = { -1, 39, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_FLUSH_LSU_BR_MPRED ] = { -1, -1, 12, 12, -1, -1, 11, 12 }, [ PPC970_PME_PM_FXU_BUSY ] = { -1, -1, -1, -1, -1, 28, -1, -1 }, [ PPC970_PME_PM_ST_REF_L1_LSU1 ] = { -1, -1, 70, 65, -1, -1, 65, 65 }, [ PPC970_PME_PM_MRK_LD_MISS_L1 ] = { 75, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_L1_WRITE_CYC ] = { -1, -1, 35, 33, -1, -1, 33, 33 }, [ PPC970_PME_PM_LSU_REJECT_LMQ_FULL ] = { -1, 68, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_FPU_ALL ] = { -1, -1, -1, -1, 27, -1, -1, -1 }, [ PPC970_PME_PM_LSU_SRQ_S0_ALLOC ] = { 69, 69, -1, -1, 71, 69, -1, -1 }, [ PPC970_PME_PM_INST_FROM_L25_SHR ] = { -1, -1, -1, -1, 42, -1, -1, -1 }, [ PPC970_PME_PM_GRP_MRK ] = { -1, -1, -1, -1, 36, -1, -1, -1 }, [ PPC970_PME_PM_BR_MPRED_CR ] = { -1, -1, 1, 2, -1, -1, 1, 1 }, [ PPC970_PME_PM_DC_PREF_STREAM_ALLOC ] = { -1, -1, 8, 8, -1, -1, 7, 7 }, [ PPC970_PME_PM_FPU1_FIN ] = { -1, -1, 19, 19, -1, -1, 18, 19 }, [ PPC970_PME_PM_LSU_REJECT_SRQ ] = { 68, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_BR_MPRED_TA ] = { -1, -1, 2, 3, -1, -1, 2, 2 }, [ PPC970_PME_PM_CRQ_FULL_CYC ] = { -1, -1, 3, 4, -1, -1, 3, 3 }, [ PPC970_PME_PM_LD_MISS_L1 ] = { -1, -1, 37, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_INST_FROM_PREF ] = { -1, -1, 32, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_STCX_PASS ] = { 85, 80, -1, -1, 83, 83, -1, -1 }, [ PPC970_PME_PM_DC_INV_L2 ] = { -1, -1, 6, 6, -1, -1, 5, 5 }, [ PPC970_PME_PM_LSU_SRQ_FULL_CYC ] = { -1, -1, 51, 48, -1, -1, 47, 49 }, [ PPC970_PME_PM_LSU0_FLUSH_LRQ ] = { 46, 45, -1, -1, 48, 45, -1, -1 }, [ PPC970_PME_PM_LSU_SRQ_S0_VALID ] = { 70, 70, -1, -1, 72, 70, -1, -1 }, [ PPC970_PME_PM_LARX_LSU0 ] = { 43, 42, -1, -1, 45, 42, -1, -1 }, [ PPC970_PME_PM_GCT_EMPTY_CYC ] = { 28, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_FPU1_ALL ] = { 17, 16, -1, -1, 18, 17, 
-1, -1 }, [ PPC970_PME_PM_FPU1_FSQRT ] = { 21, 20, -1, -1, 22, 21, -1, -1 }, [ PPC970_PME_PM_FPU_FIN ] = { -1, -1, -1, 22, -1, -1, -1, -1 }, [ PPC970_PME_PM_LSU_SRQ_STFWD ] = { 71, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { 77, 74, -1, -1, 77, 76, -1, -1 }, [ PPC970_PME_PM_FXU0_FIN ] = { -1, -1, 25, 25, -1, -1, 25, 25 }, [ PPC970_PME_PM_MRK_FPU_FIN ] = { -1, -1, -1, -1, -1, -1, 49, -1 }, [ PPC970_PME_PM_PMC5_OVERFLOW ] = { -1, -1, -1, -1, -1, 80, -1, -1 }, [ PPC970_PME_PM_SNOOP_TLBIE ] = { 83, 78, -1, -1, 81, 81, -1, -1 }, [ PPC970_PME_PM_FPU1_FRSP_FCONV ] = { -1, -1, 21, 21, -1, -1, 20, 21 }, [ PPC970_PME_PM_FPU0_FDIV ] = { 10, 9, -1, -1, 11, 10, -1, -1 }, [ PPC970_PME_PM_LD_REF_L1_LSU1 ] = { -1, -1, 41, 38, -1, -1, 38, 39 }, [ PPC970_PME_PM_HV_CYC ] = { -1, -1, 30, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_LR_CTR_MAP_FULL_CYC ] = { 44, 43, -1, -1, 46, 43, -1, -1 }, [ PPC970_PME_PM_FPU_DENORM ] = { 26, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_LSU0_REJECT_SRQ ] = { 53, 52, -1, -1, 55, 52, -1, -1 }, [ PPC970_PME_PM_LSU1_REJECT_SRQ ] = { 63, 62, -1, -1, 65, 62, -1, -1 }, [ PPC970_PME_PM_LSU1_DERAT_MISS ] = { 55, 54, -1, -1, 57, 54, -1, -1 }, [ PPC970_PME_PM_IC_PREF_REQ ] = { 35, 35, -1, -1, 38, 35, -1, -1 }, [ PPC970_PME_PM_MRK_LSU_FIN ] = { -1, -1, -1, -1, -1, -1, -1, 60 }, [ PPC970_PME_PM_MRK_DATA_FROM_MEM ] = { -1, -1, 53, -1, -1, -1, -1, -1 }, [ PPC970_PME_PM_LSU0_FLUSH_UST ] = { 49, 48, -1, -1, 51, 48, -1, -1 }, [ PPC970_PME_PM_LSU_FLUSH_LRQ ] = { -1, -1, -1, -1, -1, 65, -1, -1 }, [ PPC970_PME_PM_LSU_FLUSH_SRQ ] = { -1, -1, -1, -1, 67, -1, -1, -1 } }; static const unsigned long long ppc970_group_vecs[][PPC970_NUM_GROUP_VEC] = { [ PPC970_PME_PM_LSU_REJECT_RELOAD_CDF ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { 0x0000000800000000ULL }, [ PPC970_PME_PM_FPU1_SINGLE ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_FPU0_STALL3 ] = { 0x0000000000002000ULL }, [ PPC970_PME_PM_TB_BIT_TRANS ] = { 
0x0000000000080000ULL }, [ PPC970_PME_PM_GPR_MAP_FULL_CYC ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_MRK_ST_CMPL ] = { 0x0000000800000000ULL }, [ PPC970_PME_PM_FPU0_STF ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_FPU1_FMA ] = { 0x0000000000000400ULL }, [ PPC970_PME_PM_LSU1_FLUSH_ULD ] = { 0x0000000000008000ULL }, [ PPC970_PME_PM_MRK_INST_FIN ] = { 0x0000000200000000ULL }, [ PPC970_PME_PM_MRK_LSU0_FLUSH_UST ] = { 0x0000001000000000ULL }, [ PPC970_PME_PM_LSU_LRQ_S0_ALLOC ] = { 0x0000000010000000ULL }, [ PPC970_PME_PM_FPU_FDIV ] = { 0x0000000000900010ULL }, [ PPC970_PME_PM_FPU0_FULL_CYC ] = { 0x0000000000000080ULL }, [ PPC970_PME_PM_FPU_SINGLE ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_FPU0_FMA ] = { 0x0000000000000400ULL }, [ PPC970_PME_PM_MRK_LSU1_FLUSH_ULD ] = { 0x0000001000000000ULL }, [ PPC970_PME_PM_LSU1_FLUSH_LRQ ] = { 0x0000000000004000ULL }, [ PPC970_PME_PM_DTLB_MISS ] = { 0x0000000010600000ULL }, [ PPC970_PME_PM_MRK_ST_MISS_L1 ] = { 0x0000001000000000ULL }, [ PPC970_PME_PM_EXT_INT ] = { 0x0000000000000200ULL }, [ PPC970_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { 0x0000002000000000ULL }, [ PPC970_PME_PM_MRK_ST_GPS ] = { 0x0000000800000000ULL }, [ PPC970_PME_PM_GRP_DISP_SUCCESS ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_LSU1_LDF ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_LSU0_SRQ_STFWD ] = { 0x0000000000020000ULL }, [ PPC970_PME_PM_CR_MAP_FULL_CYC ] = { 0x0000000000000040ULL }, [ PPC970_PME_PM_MRK_LSU0_FLUSH_ULD ] = { 0x0000001000000000ULL }, [ PPC970_PME_PM_LSU_DERAT_MISS ] = { 0x0000000100000000ULL }, [ PPC970_PME_PM_FPU0_SINGLE ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_FPU1_FDIV ] = { 0x0000000000000400ULL }, [ PPC970_PME_PM_FPU1_FEST ] = { 0x0000000000001000ULL }, [ PPC970_PME_PM_FPU0_FRSP_FCONV ] = { 0x0000000000000400ULL }, [ PPC970_PME_PM_GCT_EMPTY_SRQ_FULL ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_MRK_ST_CMPL_INT ] = { 0x0000000800000000ULL }, [ PPC970_PME_PM_FLUSH_BR_MPRED ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_FXU_FIN ] = { 
0x0000004000100000ULL }, [ PPC970_PME_PM_FPU_STF ] = { 0x0000000000800020ULL }, [ PPC970_PME_PM_DSLB_MISS ] = { 0x0000000004000000ULL }, [ PPC970_PME_PM_FXLS1_FULL_CYC ] = { 0x0000008000000080ULL }, [ PPC970_PME_PM_LSU_LMQ_LHR_MERGE ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_MRK_STCX_FAIL ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { 0x0000004000000000ULL }, [ PPC970_PME_PM_MRK_DATA_FROM_L25_SHR ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_LSU_FLUSH_ULD ] = { 0x0000000000000008ULL }, [ PPC970_PME_PM_MRK_BRU_FIN ] = { 0x0000000400000000ULL }, [ PPC970_PME_PM_IERAT_XLATE_WR ] = { 0x0000000080000000ULL }, [ PPC970_PME_PM_DATA_FROM_MEM ] = { 0x0000000008000000ULL }, [ PPC970_PME_PM_FPR_MAP_FULL_CYC ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_FPU1_FULL_CYC ] = { 0x0000000000000080ULL }, [ PPC970_PME_PM_FPU0_FIN ] = { 0x0000000000802800ULL }, [ PPC970_PME_PM_GRP_BR_REDIR ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_THRESH_TIMEO ] = { 0x0000000200000000ULL }, [ PPC970_PME_PM_FPU_FSQRT ] = { 0x0000000000100010ULL }, [ PPC970_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { 0x0000002000000000ULL }, [ PPC970_PME_PM_PMC1_OVERFLOW ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_FXLS0_FULL_CYC ] = { 0x0000008000000080ULL }, [ PPC970_PME_PM_FPU0_ALL ] = { 0x0000000000000800ULL }, [ PPC970_PME_PM_DATA_TABLEWALK_CYC ] = { 0x0000000020000000ULL }, [ PPC970_PME_PM_FPU0_FEST ] = { 0x0000000000001000ULL }, [ PPC970_PME_PM_DATA_FROM_L25_MOD ] = { 0x0000000008000000ULL }, [ PPC970_PME_PM_LSU0_REJECT_ERAT_MISS ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { 0x0000000000480000ULL }, [ PPC970_PME_PM_LSU0_REJECT_RELOAD_CDF ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_FPU_FEST ] = { 0x0000000000000010ULL }, [ PPC970_PME_PM_0INST_FETCH ] = { 0x0000030000000000ULL }, [ PPC970_PME_PM_LD_MISS_L1_LSU0 ] = { 0x0000000000008000ULL }, [ PPC970_PME_PM_LSU1_REJECT_RELOAD_CDF ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_L1_PREF ] = { 
0x0000000010000000ULL }, [ PPC970_PME_PM_FPU1_STALL3 ] = { 0x0000000000002000ULL }, [ PPC970_PME_PM_BRQ_FULL_CYC ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_PMC8_OVERFLOW ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_PMC7_OVERFLOW ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_WORK_HELD ] = { 0x0000000000000200ULL }, [ PPC970_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { 0x0000002000000000ULL }, [ PPC970_PME_PM_FXU_IDLE ] = { 0x000000c000000000ULL }, [ PPC970_PME_PM_INST_CMPL ] = { 0x000003fbffffffffULL }, [ PPC970_PME_PM_LSU1_FLUSH_UST ] = { 0x0000000000010000ULL }, [ PPC970_PME_PM_LSU0_FLUSH_ULD ] = { 0x0000000000008000ULL }, [ PPC970_PME_PM_LSU_FLUSH ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_INST_FROM_L2 ] = { 0x0000020020000000ULL }, [ PPC970_PME_PM_LSU1_REJECT_LMQ_FULL ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_PMC2_OVERFLOW ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_FPU0_DENORM ] = { 0x0000000000001000ULL }, [ PPC970_PME_PM_FPU1_FMOV_FEST ] = { 0x0000000000001000ULL }, [ PPC970_PME_PM_GRP_DISP_REJECT ] = { 0x0000000000000101ULL }, [ PPC970_PME_PM_LSU_LDF ] = { 0x0000000000800020ULL }, [ PPC970_PME_PM_INST_DISP ] = { 0x0000000100000146ULL }, [ PPC970_PME_PM_DATA_FROM_L25_SHR ] = { 0x0000000008000000ULL }, [ PPC970_PME_PM_L1_DCACHE_RELOAD_VALID ] = { 0x0000000100040000ULL }, [ PPC970_PME_PM_MRK_GRP_ISSUED ] = { 0x0000000200000000ULL }, [ PPC970_PME_PM_FPU_FMA ] = { 0x0000000000900010ULL }, [ PPC970_PME_PM_MRK_CRU_FIN ] = { 0x0000000400000000ULL }, [ PPC970_PME_PM_MRK_LSU1_FLUSH_UST ] = { 0x0000001000000000ULL }, [ PPC970_PME_PM_MRK_FXU_FIN ] = { 0x0000000400000000ULL }, [ PPC970_PME_PM_LSU1_REJECT_ERAT_MISS ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_BR_ISSUED ] = { 0x0000000007000000ULL }, [ PPC970_PME_PM_PMC4_OVERFLOW ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_EE_OFF ] = { 0x0000000000000200ULL }, [ PPC970_PME_PM_INST_FROM_L25_MOD ] = { 0x0000020000000000ULL }, [ PPC970_PME_PM_ITLB_MISS ] = { 0x0000000010200000ULL }, [ 
PPC970_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { 0x0000004000000000ULL }, [ PPC970_PME_PM_GRP_DISP_VALID ] = { 0x0000000100000100ULL }, [ PPC970_PME_PM_MRK_GRP_DISP ] = { 0x0000000400000000ULL }, [ PPC970_PME_PM_LSU_FLUSH_UST ] = { 0x0000000000000008ULL }, [ PPC970_PME_PM_FXU1_FIN ] = { 0x0000008000000100ULL }, [ PPC970_PME_PM_GRP_CMPL ] = { 0x0000000020080001ULL }, [ PPC970_PME_PM_FPU_FRSP_FCONV ] = { 0x0000000000000020ULL }, [ PPC970_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { 0x0000002000000000ULL }, [ PPC970_PME_PM_LSU_LMQ_FULL_CYC ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_ST_REF_L1_LSU0 ] = { 0x0000000000030000ULL }, [ PPC970_PME_PM_LSU0_DERAT_MISS ] = { 0x0000000000040000ULL }, [ PPC970_PME_PM_LSU_SRQ_SYNC_CYC ] = { 0x0000000040000000ULL }, [ PPC970_PME_PM_FPU_STALL3 ] = { 0x0000000000000020ULL }, [ PPC970_PME_PM_LSU_REJECT_ERAT_MISS ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_MRK_DATA_FROM_L2 ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_LSU0_FLUSH_SRQ ] = { 0x0000000000004000ULL }, [ PPC970_PME_PM_FPU0_FMOV_FEST ] = { 0x0000000000001000ULL }, [ PPC970_PME_PM_LD_REF_L1_LSU0 ] = { 0x0000000000008000ULL }, [ PPC970_PME_PM_LSU1_FLUSH_SRQ ] = { 0x0000000000004000ULL }, [ PPC970_PME_PM_GRP_BR_MPRED ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_LSU_LMQ_S0_ALLOC ] = { 0x0000000008000000ULL }, [ PPC970_PME_PM_LSU0_REJECT_LMQ_FULL ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_ST_REF_L1 ] = { 0x000000010260000eULL }, [ PPC970_PME_PM_MRK_VMX_FIN ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_LSU_SRQ_EMPTY_CYC ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_FPU1_STF ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_RUN_CYC ] = { 0x0000000004000001ULL }, [ PPC970_PME_PM_LSU_LMQ_S0_VALID ] = { 0x0000000008000000ULL }, [ PPC970_PME_PM_LSU0_LDF ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_LSU_LRQ_S0_VALID ] = { 0x0000000010000000ULL }, [ PPC970_PME_PM_PMC3_OVERFLOW ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_MRK_IMR_RELOAD ] = { 0x0000001000000000ULL }, [ PPC970_PME_PM_MRK_GRP_TIMEO 
] = { 0x0000000800000000ULL }, [ PPC970_PME_PM_FPU_FMOV_FEST ] = { 0x0000000000100010ULL }, [ PPC970_PME_PM_GRP_DISP_BLK_SB_CYC ] = { 0x0000000000000040ULL }, [ PPC970_PME_PM_XER_MAP_FULL_CYC ] = { 0x0000000000000040ULL }, [ PPC970_PME_PM_ST_MISS_L1 ] = { 0x0000000003630000ULL }, [ PPC970_PME_PM_STOP_COMPLETION ] = { 0x0000000000000201ULL }, [ PPC970_PME_PM_MRK_GRP_CMPL ] = { 0x0000000a00000000ULL }, [ PPC970_PME_PM_ISLB_MISS ] = { 0x0000000004000000ULL }, [ PPC970_PME_PM_SUSPENDED ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_CYC ] = { 0x000003ffffffffffULL }, [ PPC970_PME_PM_LD_MISS_L1_LSU1 ] = { 0x0000000000008000ULL }, [ PPC970_PME_PM_STCX_FAIL ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_LSU1_SRQ_STFWD ] = { 0x0000000000020000ULL }, [ PPC970_PME_PM_GRP_DISP ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_L2_PREF ] = { 0x0000000010000000ULL }, [ PPC970_PME_PM_FPU1_DENORM ] = { 0x0000000000001000ULL }, [ PPC970_PME_PM_DATA_FROM_L2 ] = { 0x0000000008000000ULL }, [ PPC970_PME_PM_FPU0_FPSCR ] = { 0x0000000000002000ULL }, [ PPC970_PME_PM_MRK_DATA_FROM_L25_MOD ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_FPU0_FSQRT ] = { 0x0000000000000800ULL }, [ PPC970_PME_PM_LD_REF_L1 ] = { 0x000000004260000eULL }, [ PPC970_PME_PM_MRK_L1_RELOAD_VALID ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_1PLUS_PPC_CMPL ] = { 0x0000000000080001ULL }, [ PPC970_PME_PM_INST_FROM_L1 ] = { 0x0000010080000000ULL }, [ PPC970_PME_PM_EE_OFF_EXT_INT ] = { 0x0000000000000200ULL }, [ PPC970_PME_PM_PMC6_OVERFLOW ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_LSU_LRQ_FULL_CYC ] = { 0x0000000000000080ULL }, [ PPC970_PME_PM_IC_PREF_INSTALL ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { 0x0000002000000000ULL }, [ PPC970_PME_PM_GCT_FULL_CYC ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_INST_FROM_MEM ] = { 0x0000030020000000ULL }, [ PPC970_PME_PM_FLUSH_LSU_BR_MPRED ] = { 0x0000000000000000ULL }, [ 
PPC970_PME_PM_FXU_BUSY ] = { 0x000000c000000000ULL }, [ PPC970_PME_PM_ST_REF_L1_LSU1 ] = { 0x0000000000030000ULL }, [ PPC970_PME_PM_MRK_LD_MISS_L1 ] = { 0x0000000200000000ULL }, [ PPC970_PME_PM_L1_WRITE_CYC ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_LSU_REJECT_LMQ_FULL ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_FPU_ALL ] = { 0x0000000000000020ULL }, [ PPC970_PME_PM_LSU_SRQ_S0_ALLOC ] = { 0x0000000040000000ULL }, [ PPC970_PME_PM_INST_FROM_L25_SHR ] = { 0x0000020000000000ULL }, [ PPC970_PME_PM_GRP_MRK ] = { 0x0000000600000000ULL }, [ PPC970_PME_PM_BR_MPRED_CR ] = { 0x0000000005000000ULL }, [ PPC970_PME_PM_DC_PREF_STREAM_ALLOC ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_FPU1_FIN ] = { 0x0000000000802800ULL }, [ PPC970_PME_PM_LSU_REJECT_SRQ ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_BR_MPRED_TA ] = { 0x0000000005000000ULL }, [ PPC970_PME_PM_CRQ_FULL_CYC ] = { 0x0000000000000040ULL }, [ PPC970_PME_PM_LD_MISS_L1 ] = { 0x0000000043600006ULL }, [ PPC970_PME_PM_INST_FROM_PREF ] = { 0x0000030000000000ULL }, [ PPC970_PME_PM_STCX_PASS ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_DC_INV_L2 ] = { 0x0000000020010006ULL }, [ PPC970_PME_PM_LSU_SRQ_FULL_CYC ] = { 0x0000000000000080ULL }, [ PPC970_PME_PM_LSU0_FLUSH_LRQ ] = { 0x0000000000004000ULL }, [ PPC970_PME_PM_LSU_SRQ_S0_VALID ] = { 0x0000000040000000ULL }, [ PPC970_PME_PM_LARX_LSU0 ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_GCT_EMPTY_CYC ] = { 0x0000000100080200ULL }, [ PPC970_PME_PM_FPU1_ALL ] = { 0x0000000000000800ULL }, [ PPC970_PME_PM_FPU1_FSQRT ] = { 0x0000000000000800ULL }, [ PPC970_PME_PM_FPU_FIN ] = { 0x0000000000100010ULL }, [ PPC970_PME_PM_LSU_SRQ_STFWD ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { 0x0000002000000000ULL }, [ PPC970_PME_PM_FXU0_FIN ] = { 0x0000008000000100ULL }, [ PPC970_PME_PM_MRK_FPU_FIN ] = { 0x0000000400000000ULL }, [ PPC970_PME_PM_PMC5_OVERFLOW ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_SNOOP_TLBIE ] = { 0x0000000000000000ULL }, [ 
PPC970_PME_PM_FPU1_FRSP_FCONV ] = { 0x0000000000000400ULL }, [ PPC970_PME_PM_FPU0_FDIV ] = { 0x0000000000000400ULL }, [ PPC970_PME_PM_LD_REF_L1_LSU1 ] = { 0x0000000000008000ULL }, [ PPC970_PME_PM_HV_CYC ] = { 0x0000000020080000ULL }, [ PPC970_PME_PM_LR_CTR_MAP_FULL_CYC ] = { 0x0000000000000040ULL }, [ PPC970_PME_PM_FPU_DENORM ] = { 0x0000000000000020ULL }, [ PPC970_PME_PM_LSU0_REJECT_SRQ ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_LSU1_REJECT_SRQ ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_LSU1_DERAT_MISS ] = { 0x0000000000040000ULL }, [ PPC970_PME_PM_IC_PREF_REQ ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_MRK_LSU_FIN ] = { 0x0000000400000000ULL }, [ PPC970_PME_PM_MRK_DATA_FROM_MEM ] = { 0x0000000000000000ULL }, [ PPC970_PME_PM_LSU0_FLUSH_UST ] = { 0x0000000000010000ULL }, [ PPC970_PME_PM_LSU_FLUSH_LRQ ] = { 0x0000000000000008ULL }, [ PPC970_PME_PM_LSU_FLUSH_SRQ ] = { 0x0000000000000008ULL } }; static const pme_power_entry_t ppc970_pe[] = { [ PPC970_PME_PM_LSU_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU_REJECT_RELOAD_CDF", .pme_code = 0x6920, .pme_short_desc = "LSU reject due to reload CDF or tag update collision", .pme_long_desc = "LSU reject due to reload CDF or tag update collision", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_REJECT_RELOAD_CDF], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_REJECT_RELOAD_CDF] }, [ PPC970_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { .pme_name = "PM_MRK_LSU_SRQ_INST_VALID", .pme_code = 0x936, .pme_short_desc = "Marked instruction valid in SRQ", .pme_long_desc = "This signal is asserted every cycle when a marked request is resident in the Store Request Queue", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_LSU_SRQ_INST_VALID], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_LSU_SRQ_INST_VALID] }, [ PPC970_PME_PM_FPU1_SINGLE ] = { .pme_name = "PM_FPU1_SINGLE", .pme_code = 0x127, .pme_short_desc = "FPU1 executed single precision instruction", .pme_long_desc = "This signal is active for one cycle when 
fp1 is executing single precision instruction.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU1_SINGLE], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU1_SINGLE] }, [ PPC970_PME_PM_FPU0_STALL3 ] = { .pme_name = "PM_FPU0_STALL3", .pme_code = 0x121, .pme_short_desc = "FPU0 stalled in pipe3", .pme_long_desc = "This signal indicates that fp0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. ", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU0_STALL3], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU0_STALL3] }, [ PPC970_PME_PM_TB_BIT_TRANS ] = { .pme_name = "PM_TB_BIT_TRANS", .pme_code = 0x8005, .pme_short_desc = "Time Base bit transition", .pme_long_desc = "When the selected time base bit (as specified in MMCR0[TBSEL])transitions from 0 to 1 ", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_TB_BIT_TRANS], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_TB_BIT_TRANS] }, [ PPC970_PME_PM_GPR_MAP_FULL_CYC ] = { .pme_name = "PM_GPR_MAP_FULL_CYC", .pme_code = 0x335, .pme_short_desc = "Cycles GPR mapper full", .pme_long_desc = "The ISU sends a signal indicating that the gpr mapper cannot accept any more groups. Dispatch is stopped. 
Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_GPR_MAP_FULL_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_GPR_MAP_FULL_CYC] }, [ PPC970_PME_PM_MRK_ST_CMPL ] = { .pme_name = "PM_MRK_ST_CMPL", .pme_code = 0x1003, .pme_short_desc = "Marked store instruction completed", .pme_long_desc = "A sampled store has completed (data home)", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_ST_CMPL], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_ST_CMPL] }, [ PPC970_PME_PM_FPU0_STF ] = { .pme_name = "PM_FPU0_STF", .pme_code = 0x122, .pme_short_desc = "FPU0 executed store instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing a store instruction.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU0_STF], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU0_STF] }, [ PPC970_PME_PM_FPU1_FMA ] = { .pme_name = "PM_FPU1_FMA", .pme_code = 0x105, .pme_short_desc = "FPU1 executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. 
This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU1_FMA], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU1_FMA] }, [ PPC970_PME_PM_LSU1_FLUSH_ULD ] = { .pme_name = "PM_LSU1_FLUSH_ULD", .pme_code = 0x804, .pme_short_desc = "LSU1 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU1_FLUSH_ULD], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU1_FLUSH_ULD] }, [ PPC970_PME_PM_MRK_INST_FIN ] = { .pme_name = "PM_MRK_INST_FIN", .pme_code = 0x7005, .pme_short_desc = "Marked instruction finished", .pme_long_desc = "One of the execution units finished a marked instruction. Instructions that finish may not necessarily complete", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_INST_FIN], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_INST_FIN] }, [ PPC970_PME_PM_MRK_LSU0_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU0_FLUSH_UST", .pme_code = 0x711, .pme_short_desc = "LSU0 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 0 because it was unaligned", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_LSU0_FLUSH_UST], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_LSU0_FLUSH_UST] }, [ PPC970_PME_PM_LSU_LRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LRQ_S0_ALLOC", .pme_code = 0x826, .pme_short_desc = "LRQ slot 0 allocated", .pme_long_desc = "LRQ slot zero was allocated", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_LRQ_S0_ALLOC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_LRQ_S0_ALLOC] }, [ PPC970_PME_PM_FPU_FDIV ] = { .pme_name = "PM_FPU_FDIV", .pme_code = 0x1100, .pme_short_desc = "FPU executed FDIV instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when FPU is executing a divide instruction. 
This could be fdiv, fdivs, fdiv. fdivs. Combined Unit 0 + Unit 1", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU_FDIV], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU_FDIV] }, [ PPC970_PME_PM_FPU0_FULL_CYC ] = { .pme_name = "PM_FPU0_FULL_CYC", .pme_code = 0x303, .pme_short_desc = "Cycles FPU0 issue queue full", .pme_long_desc = "The issue queue for FPU unit 0 cannot accept any more instructions. Issue is stopped", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU0_FULL_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU0_FULL_CYC] }, [ PPC970_PME_PM_FPU_SINGLE ] = { .pme_name = "PM_FPU_SINGLE", .pme_code = 0x5120, .pme_short_desc = "FPU executed single precision instruction", .pme_long_desc = "FPU is executing single precision instruction. Combined Unit 0 + Unit 1", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU_SINGLE], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU_SINGLE] }, [ PPC970_PME_PM_FPU0_FMA ] = { .pme_name = "PM_FPU0_FMA", .pme_code = 0x101, .pme_short_desc = "FPU0 executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. 
This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU0_FMA], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU0_FMA] }, [ PPC970_PME_PM_MRK_LSU1_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU1_FLUSH_ULD", .pme_code = 0x714, .pme_short_desc = "LSU1 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_LSU1_FLUSH_ULD], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_LSU1_FLUSH_ULD] }, [ PPC970_PME_PM_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_LSU1_FLUSH_LRQ", .pme_code = 0x806, .pme_short_desc = "LSU1 LRQ flushes", .pme_long_desc = "A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU1_FLUSH_LRQ], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU1_FLUSH_LRQ] }, [ PPC970_PME_PM_DTLB_MISS ] = { .pme_name = "PM_DTLB_MISS", .pme_code = 0x704, .pme_short_desc = "Data TLB misses", .pme_long_desc = "A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). 
This may result in multiple TLB misses for the same instruction.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_DTLB_MISS], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_DTLB_MISS] }, [ PPC970_PME_PM_MRK_ST_MISS_L1 ] = { .pme_name = "PM_MRK_ST_MISS_L1", .pme_code = 0x723, .pme_short_desc = "Marked L1 D cache store misses", .pme_long_desc = "A marked store missed the dcache", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_ST_MISS_L1], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_ST_MISS_L1] }, [ PPC970_PME_PM_EXT_INT ] = { .pme_name = "PM_EXT_INT", .pme_code = 0x8002, .pme_short_desc = "External interrupts", .pme_long_desc = "An external interrupt occurred", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_EXT_INT], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_EXT_INT] }, [ PPC970_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_LRQ", .pme_code = 0x716, .pme_short_desc = "LSU1 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_LSU1_FLUSH_LRQ], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_LSU1_FLUSH_LRQ] }, [ PPC970_PME_PM_MRK_ST_GPS ] = { .pme_name = "PM_MRK_ST_GPS", .pme_code = 0x6003, .pme_short_desc = "Marked store sent to GPS", .pme_long_desc = "A sampled store has been sent to the memory subsystem", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_ST_GPS], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_ST_GPS] }, [ PPC970_PME_PM_GRP_DISP_SUCCESS ] = { .pme_name = "PM_GRP_DISP_SUCCESS", .pme_code = 0x5001, .pme_short_desc = "Group dispatch success", .pme_long_desc = "Number of groups successfully dispatched (not rejected)", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_GRP_DISP_SUCCESS], .pme_group_vector = 
ppc970_group_vecs[PPC970_PME_PM_GRP_DISP_SUCCESS] }, [ PPC970_PME_PM_LSU1_LDF ] = { .pme_name = "PM_LSU1_LDF", .pme_code = 0x734, .pme_short_desc = "LSU1 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed from LSU unit 1", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU1_LDF], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU1_LDF] }, [ PPC970_PME_PM_LSU0_SRQ_STFWD ] = { .pme_name = "PM_LSU0_SRQ_STFWD", .pme_code = 0x820, .pme_short_desc = "LSU0 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 0", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU0_SRQ_STFWD], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU0_SRQ_STFWD] }, [ PPC970_PME_PM_CR_MAP_FULL_CYC ] = { .pme_name = "PM_CR_MAP_FULL_CYC", .pme_code = 0x304, .pme_short_desc = "Cycles CR logical operation mapper full", .pme_long_desc = "The ISU sends a signal indicating that the cr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_CR_MAP_FULL_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_CR_MAP_FULL_CYC] }, [ PPC970_PME_PM_MRK_LSU0_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU0_FLUSH_ULD", .pme_code = 0x710, .pme_short_desc = "LSU0 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_LSU0_FLUSH_ULD], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_LSU0_FLUSH_ULD] }, [ PPC970_PME_PM_LSU_DERAT_MISS ] = { .pme_name = "PM_LSU_DERAT_MISS", .pme_code = 0x6700, .pme_short_desc = "DERAT misses", .pme_long_desc = "Total D-ERAT Misses (Unit 0 + Unit 1). Requests that miss the Derat are rejected and retried until the request hits in the Erat. 
This may result in multiple erat misses for the same instruction.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_DERAT_MISS], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_DERAT_MISS] }, [ PPC970_PME_PM_FPU0_SINGLE ] = { .pme_name = "PM_FPU0_SINGLE", .pme_code = 0x123, .pme_short_desc = "FPU0 executed single precision instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing single precision instruction.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU0_SINGLE], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU0_SINGLE] }, [ PPC970_PME_PM_FPU1_FDIV ] = { .pme_name = "PM_FPU1_FDIV", .pme_code = 0x104, .pme_short_desc = "FPU1 executed FDIV instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp1 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU1_FDIV], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU1_FDIV] }, [ PPC970_PME_PM_FPU1_FEST ] = { .pme_name = "PM_FPU1_FEST", .pme_code = 0x116, .pme_short_desc = "FPU1 executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. ", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU1_FEST], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU1_FEST] }, [ PPC970_PME_PM_FPU0_FRSP_FCONV ] = { .pme_name = "PM_FPU0_FRSP_FCONV", .pme_code = 0x111, .pme_short_desc = "FPU0 executed FRSP or FCONV instructions", .pme_long_desc = "This signal is active for one cycle when fp0 is executing frsp or convert kind of instruction. 
This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU0_FRSP_FCONV], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU0_FRSP_FCONV] }, [ PPC970_PME_PM_GCT_EMPTY_SRQ_FULL ] = { .pme_name = "PM_GCT_EMPTY_SRQ_FULL", .pme_code = 0x200b, .pme_short_desc = "GCT empty caused by SRQ full", .pme_long_desc = "GCT empty caused by SRQ full", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_GCT_EMPTY_SRQ_FULL], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_GCT_EMPTY_SRQ_FULL] }, [ PPC970_PME_PM_MRK_ST_CMPL_INT ] = { .pme_name = "PM_MRK_ST_CMPL_INT", .pme_code = 0x3003, .pme_short_desc = "Marked store completed with intervention", .pme_long_desc = "A marked store previously sent to the memory subsystem completed (data home) after requiring intervention", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_ST_CMPL_INT], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_ST_CMPL_INT] }, [ PPC970_PME_PM_FLUSH_BR_MPRED ] = { .pme_name = "PM_FLUSH_BR_MPRED", .pme_code = 0x316, .pme_short_desc = "Flush caused by branch mispredict", .pme_long_desc = "Flush caused by branch mispredict", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FLUSH_BR_MPRED], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FLUSH_BR_MPRED] }, [ PPC970_PME_PM_FXU_FIN ] = { .pme_name = "PM_FXU_FIN", .pme_code = 0x3330, .pme_short_desc = "FXU produced a result", .pme_long_desc = "The fixed point unit (Unit 0 + Unit 1) finished a marked instruction. Instructions that finish may not necessarily complete.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FXU_FIN], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FXU_FIN] }, [ PPC970_PME_PM_FPU_STF ] = { .pme_name = "PM_FPU_STF", .pme_code = 0x6120, .pme_short_desc = "FPU executed store instruction", .pme_long_desc = "FPU is executing a store instruction. 
Combined Unit 0 + Unit 1", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU_STF], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU_STF] }, [ PPC970_PME_PM_DSLB_MISS ] = { .pme_name = "PM_DSLB_MISS", .pme_code = 0x705, .pme_short_desc = "Data SLB misses", .pme_long_desc = "A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_DSLB_MISS], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_DSLB_MISS] }, [ PPC970_PME_PM_FXLS1_FULL_CYC ] = { .pme_name = "PM_FXLS1_FULL_CYC", .pme_code = 0x314, .pme_short_desc = "Cycles FXU1/LS1 queue full", .pme_long_desc = "The issue queue for FXU/LSU unit 1 cannot accept any more instructions. Issue is stopped", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FXLS1_FULL_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FXLS1_FULL_CYC] }, [ PPC970_PME_PM_LSU_LMQ_LHR_MERGE ] = { .pme_name = "PM_LSU_LMQ_LHR_MERGE", .pme_code = 0x935, .pme_short_desc = "LMQ LHR merges", .pme_long_desc = "A dcache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_LMQ_LHR_MERGE], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_LMQ_LHR_MERGE] }, [ PPC970_PME_PM_MRK_STCX_FAIL ] = { .pme_name = "PM_MRK_STCX_FAIL", .pme_code = 0x726, .pme_short_desc = "Marked STCX failed", .pme_long_desc = "A marked stcx (stwcx or stdcx) failed", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_STCX_FAIL], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_STCX_FAIL] }, [ PPC970_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { .pme_name = "PM_FXU0_BUSY_FXU1_IDLE", .pme_code = 0x7002, .pme_short_desc = "FXU0 busy FXU1 idle", .pme_long_desc = "FXU0 is busy while FXU1 was idle", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FXU0_BUSY_FXU1_IDLE], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FXU0_BUSY_FXU1_IDLE] }, [ 
PPC970_PME_PM_MRK_DATA_FROM_L25_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L25_SHR", .pme_code = 0x193d, .pme_short_desc = "Marked data loaded from L2.5 shared", .pme_long_desc = "DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a marked demand load", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_DATA_FROM_L25_SHR], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_DATA_FROM_L25_SHR] }, [ PPC970_PME_PM_LSU_FLUSH_ULD ] = { .pme_name = "PM_LSU_FLUSH_ULD", .pme_code = 0x1800, .pme_short_desc = "LRQ unaligned load flushes", .pme_long_desc = "A load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_FLUSH_ULD], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_FLUSH_ULD] }, [ PPC970_PME_PM_MRK_BRU_FIN ] = { .pme_name = "PM_MRK_BRU_FIN", .pme_code = 0x2005, .pme_short_desc = "Marked instruction BRU processing finished", .pme_long_desc = "The branch unit finished a marked instruction. Instructions that finish may not necessarily complete", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_BRU_FIN], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_BRU_FIN] }, [ PPC970_PME_PM_IERAT_XLATE_WR ] = { .pme_name = "PM_IERAT_XLATE_WR", .pme_code = 0x430, .pme_short_desc = "Translation written to ierat", .pme_long_desc = "This signal will be asserted each time the I-ERAT is written. This indicates that an ERAT miss has been serviced. ERAT misses will initiate a sequence resulting in the ERAT being written. 
ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed. This should be a fairly accurate count of ERAT misses (best available).", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_IERAT_XLATE_WR], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_IERAT_XLATE_WR] }, [ PPC970_PME_PM_DATA_FROM_MEM ] = { .pme_name = "PM_DATA_FROM_MEM", .pme_code = 0x3837, .pme_short_desc = "Data loaded from memory", .pme_long_desc = "Data loaded from memory", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_DATA_FROM_MEM], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_DATA_FROM_MEM] }, [ PPC970_PME_PM_FPR_MAP_FULL_CYC ] = { .pme_name = "PM_FPR_MAP_FULL_CYC", .pme_code = 0x301, .pme_short_desc = "Cycles FPR mapper full", .pme_long_desc = "The ISU sends a signal indicating that the FPR mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPR_MAP_FULL_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPR_MAP_FULL_CYC] }, [ PPC970_PME_PM_FPU1_FULL_CYC ] = { .pme_name = "PM_FPU1_FULL_CYC", .pme_code = 0x307, .pme_short_desc = "Cycles FPU1 issue queue full", .pme_long_desc = "The issue queue for FPU unit 1 cannot accept any more instructions. Issue is stopped", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU1_FULL_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU1_FULL_CYC] }, [ PPC970_PME_PM_FPU0_FIN ] = { .pme_name = "PM_FPU0_FIN", .pme_code = 0x113, .pme_short_desc = "FPU0 produced a result", .pme_long_desc = "fp0 finished, produced a result. This only indicates finish, not completion. 
", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU0_FIN], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU0_FIN] }, [ PPC970_PME_PM_GRP_BR_REDIR ] = { .pme_name = "PM_GRP_BR_REDIR", .pme_code = 0x326, .pme_short_desc = "Group experienced branch redirect", .pme_long_desc = "Group experienced branch redirect", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_GRP_BR_REDIR], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_GRP_BR_REDIR] }, [ PPC970_PME_PM_THRESH_TIMEO ] = { .pme_name = "PM_THRESH_TIMEO", .pme_code = 0x2003, .pme_short_desc = "Threshold timeout", .pme_long_desc = "The threshold timer expired", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_THRESH_TIMEO], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_THRESH_TIMEO] }, [ PPC970_PME_PM_FPU_FSQRT ] = { .pme_name = "PM_FPU_FSQRT", .pme_code = 0x6100, .pme_short_desc = "FPU executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when FPU is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
Combined Unit 0 + Unit 1", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU_FSQRT], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU_FSQRT] }, [ PPC970_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_LRQ", .pme_code = 0x712, .pme_short_desc = "LSU0 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_LSU0_FLUSH_LRQ], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_LSU0_FLUSH_LRQ] }, [ PPC970_PME_PM_PMC1_OVERFLOW ] = { .pme_name = "PM_PMC1_OVERFLOW", .pme_code = 0x200a, .pme_short_desc = "PMC1 Overflow", .pme_long_desc = "PMC1 Overflow", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_PMC1_OVERFLOW], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_PMC1_OVERFLOW] }, [ PPC970_PME_PM_FXLS0_FULL_CYC ] = { .pme_name = "PM_FXLS0_FULL_CYC", .pme_code = 0x310, .pme_short_desc = "Cycles FXU0/LS0 queue full", .pme_long_desc = "The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FXLS0_FULL_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FXLS0_FULL_CYC] }, [ PPC970_PME_PM_FPU0_ALL ] = { .pme_name = "PM_FPU0_ALL", .pme_code = 0x103, .pme_short_desc = "FPU0 executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "FPU0 executed add, mult, sub, cmp or sel instruction", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU0_ALL], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU0_ALL] }, [ PPC970_PME_PM_DATA_TABLEWALK_CYC ] = { .pme_name = "PM_DATA_TABLEWALK_CYC", .pme_code = 0x707, .pme_short_desc = "Cycles doing data tablewalks", .pme_long_desc = "This signal is asserted every cycle when a tablewalk is active. 
While a tablewalk is active any request attempting to access the TLB will be rejected and retried.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_DATA_TABLEWALK_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_DATA_TABLEWALK_CYC] }, [ PPC970_PME_PM_FPU0_FEST ] = { .pme_name = "PM_FPU0_FEST", .pme_code = 0x112, .pme_short_desc = "FPU0 executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. ", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU0_FEST], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU0_FEST] }, [ PPC970_PME_PM_DATA_FROM_L25_MOD ] = { .pme_name = "PM_DATA_FROM_L25_MOD", .pme_code = 0x383d, .pme_short_desc = "Data loaded from L2.5 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a demand load", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_DATA_FROM_L25_MOD], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_DATA_FROM_L25_MOD] }, [ PPC970_PME_PM_LSU0_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU0_REJECT_ERAT_MISS", .pme_code = 0x923, .pme_short_desc = "LSU0 reject due to ERAT miss", .pme_long_desc = "LSU0 reject due to ERAT miss", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU0_REJECT_ERAT_MISS], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU0_REJECT_ERAT_MISS] }, [ PPC970_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", .pme_code = 0x2002, .pme_short_desc = "Cycles LMQ and SRQ empty", .pme_long_desc = "Cycles when both the LMQ and SRQ are empty (LSU is idle)", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC] }, [ PPC970_PME_PM_LSU0_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU0_REJECT_RELOAD_CDF", .pme_code = 0x922, .pme_short_desc = "LSU0 reject due to reload CDF or tag update collision", 
.pme_long_desc = "LSU0 reject due to reload CDF or tag update collision", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU0_REJECT_RELOAD_CDF], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU0_REJECT_RELOAD_CDF] }, [ PPC970_PME_PM_FPU_FEST ] = { .pme_name = "PM_FPU_FEST", .pme_code = 0x3110, .pme_short_desc = "FPU executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU_FEST], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU_FEST] }, [ PPC970_PME_PM_0INST_FETCH ] = { .pme_name = "PM_0INST_FETCH", .pme_code = 0x442d, .pme_short_desc = "No instructions fetched", .pme_long_desc = "No instructions were fetched this cycle (due to IFU hold, redirect, or icache miss)", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_0INST_FETCH], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_0INST_FETCH] }, [ PPC970_PME_PM_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_LD_MISS_L1_LSU0", .pme_code = 0x812, .pme_short_desc = "LSU0 L1 D cache load misses", .pme_long_desc = "A load, executing on unit 0, missed the dcache", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LD_MISS_L1_LSU0], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LD_MISS_L1_LSU0] }, [ PPC970_PME_PM_LSU1_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU1_REJECT_RELOAD_CDF", .pme_code = 0x926, .pme_short_desc = "LSU1 reject due to reload CDF or tag update collision", .pme_long_desc = "LSU1 reject due to reload CDF or tag update collision", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU1_REJECT_RELOAD_CDF], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU1_REJECT_RELOAD_CDF] }, [ PPC970_PME_PM_L1_PREF ] = { .pme_name = "PM_L1_PREF", .pme_code = 0x731, .pme_short_desc = "L1 cache data prefetches", .pme_long_desc = "A request to prefetch data into the L1 was made", .pme_event_ids = 
ppc970_event_ids[PPC970_PME_PM_L1_PREF], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_L1_PREF] }, [ PPC970_PME_PM_FPU1_STALL3 ] = { .pme_name = "PM_FPU1_STALL3", .pme_code = 0x125, .pme_short_desc = "FPU1 stalled in pipe3", .pme_long_desc = "This signal indicates that fp1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU1_STALL3], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU1_STALL3] }, [ PPC970_PME_PM_BRQ_FULL_CYC ] = { .pme_name = "PM_BRQ_FULL_CYC", .pme_code = 0x305, .pme_short_desc = "Cycles branch queue full", .pme_long_desc = "The ISU sends a signal indicating that the issue queue that feeds the ifu br unit cannot accept any more groups (queue is full of groups).", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_BRQ_FULL_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_BRQ_FULL_CYC] }, [ PPC970_PME_PM_PMC8_OVERFLOW ] = { .pme_name = "PM_PMC8_OVERFLOW", .pme_code = 0x100a, .pme_short_desc = "PMC8 Overflow", .pme_long_desc = "PMC8 Overflow", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_PMC8_OVERFLOW], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_PMC8_OVERFLOW] }, [ PPC970_PME_PM_PMC7_OVERFLOW ] = { .pme_name = "PM_PMC7_OVERFLOW", .pme_code = 0x800a, .pme_short_desc = "PMC7 Overflow", .pme_long_desc = "PMC7 Overflow", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_PMC7_OVERFLOW], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_PMC7_OVERFLOW] }, [ PPC970_PME_PM_WORK_HELD ] = { .pme_name = "PM_WORK_HELD", .pme_code = 0x2001, .pme_short_desc = "Work held", .pme_long_desc = "RAS Unit has signaled completion to stop and there are groups waiting to complete", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_WORK_HELD], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_WORK_HELD] }, [ PPC970_PME_PM_MRK_LD_MISS_L1_LSU0 ] = 
{ .pme_name = "PM_MRK_LD_MISS_L1_LSU0", .pme_code = 0x720, .pme_short_desc = "LSU0 L1 D cache load misses", .pme_long_desc = "A marked load, executing on unit 0, missed the dcache", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_LD_MISS_L1_LSU0], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_LD_MISS_L1_LSU0] }, [ PPC970_PME_PM_FXU_IDLE ] = { .pme_name = "PM_FXU_IDLE", .pme_code = 0x5002, .pme_short_desc = "FXU idle", .pme_long_desc = "FXU0 and FXU1 are both idle", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FXU_IDLE], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FXU_IDLE] }, [ PPC970_PME_PM_INST_CMPL ] = { .pme_name = "PM_INST_CMPL", .pme_code = 0x1, .pme_short_desc = "Instructions completed", .pme_long_desc = "Number of Eligible Instructions that completed. ", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_INST_CMPL], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_INST_CMPL] }, [ PPC970_PME_PM_LSU1_FLUSH_UST ] = { .pme_name = "PM_LSU1_FLUSH_UST", .pme_code = 0x805, .pme_short_desc = "LSU1 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 1 because it was unaligned (crossed a 4k boundary)", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU1_FLUSH_UST], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU1_FLUSH_UST] }, [ PPC970_PME_PM_LSU0_FLUSH_ULD ] = { .pme_name = "PM_LSU0_FLUSH_ULD", .pme_code = 0x800, .pme_short_desc = "LSU0 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU0_FLUSH_ULD], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU0_FLUSH_ULD] }, [ PPC970_PME_PM_LSU_FLUSH ] = { .pme_name = "PM_LSU_FLUSH", .pme_code = 0x315, .pme_short_desc = "Flush initiated by LSU", .pme_long_desc = "Flush initiated by LSU", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_FLUSH], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_FLUSH] 
}, [ PPC970_PME_PM_INST_FROM_L2 ] = { .pme_name = "PM_INST_FROM_L2", .pme_code = 0x1426, .pme_short_desc = "Instructions fetched from L2", .pme_long_desc = "An instruction fetch group was fetched from L2. Fetch Groups can contain up to 8 instructions", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_INST_FROM_L2], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_INST_FROM_L2] }, [ PPC970_PME_PM_LSU1_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU1_REJECT_LMQ_FULL", .pme_code = 0x925, .pme_short_desc = "LSU1 reject due to LMQ full or missed data coming", .pme_long_desc = "LSU1 reject due to LMQ full or missed data coming", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU1_REJECT_LMQ_FULL], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU1_REJECT_LMQ_FULL] }, [ PPC970_PME_PM_PMC2_OVERFLOW ] = { .pme_name = "PM_PMC2_OVERFLOW", .pme_code = 0x300a, .pme_short_desc = "PMC2 Overflow", .pme_long_desc = "PMC2 Overflow", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_PMC2_OVERFLOW], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_PMC2_OVERFLOW] }, [ PPC970_PME_PM_FPU0_DENORM ] = { .pme_name = "PM_FPU0_DENORM", .pme_code = 0x120, .pme_short_desc = "FPU0 received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU0_DENORM], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU0_DENORM] }, [ PPC970_PME_PM_FPU1_FMOV_FEST ] = { .pme_name = "PM_FPU1_FMOV_FEST", .pme_code = 0x114, .pme_short_desc = "FPU1 executing FMOV or FEST instructions", .pme_long_desc = "This signal is active for one cycle when fp1 is executing a move kind of instruction or one of the estimate instructions. 
This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU1_FMOV_FEST], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU1_FMOV_FEST] }, [ PPC970_PME_PM_GRP_DISP_REJECT ] = { .pme_name = "PM_GRP_DISP_REJECT", .pme_code = 0x324, .pme_short_desc = "Group dispatch rejected", .pme_long_desc = "A group that previously attempted dispatch was rejected.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_GRP_DISP_REJECT], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_GRP_DISP_REJECT] }, [ PPC970_PME_PM_LSU_LDF ] = { .pme_name = "PM_LSU_LDF", .pme_code = 0x8730, .pme_short_desc = "LSU executed Floating Point load instruction", .pme_long_desc = "LSU executed Floating Point load instruction", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_LDF], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_LDF] }, [ PPC970_PME_PM_INST_DISP ] = { .pme_name = "PM_INST_DISP", .pme_code = 0x320, .pme_short_desc = "Instructions dispatched", .pme_long_desc = "The ISU sends the number of instructions dispatched.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_INST_DISP], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_INST_DISP] }, [ PPC970_PME_PM_DATA_FROM_L25_SHR ] = { .pme_name = "PM_DATA_FROM_L25_SHR", .pme_code = 0x183d, .pme_short_desc = "Data loaded from L2.5 shared", .pme_long_desc = "DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a demand load", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_DATA_FROM_L25_SHR], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_DATA_FROM_L25_SHR] }, [ PPC970_PME_PM_L1_DCACHE_RELOAD_VALID ] = { .pme_name = "PM_L1_DCACHE_RELOAD_VALID", .pme_code = 0x834, .pme_short_desc = "L1 reload data source valid", .pme_long_desc = "The data source information is valid", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_L1_DCACHE_RELOAD_VALID], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_L1_DCACHE_RELOAD_VALID] }, 
[ PPC970_PME_PM_MRK_GRP_ISSUED ] = { .pme_name = "PM_MRK_GRP_ISSUED", .pme_code = 0x6005, .pme_short_desc = "Marked group issued", .pme_long_desc = "A sampled instruction was issued", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_GRP_ISSUED], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_GRP_ISSUED] }, [ PPC970_PME_PM_FPU_FMA ] = { .pme_name = "PM_FPU_FMA", .pme_code = 0x2100, .pme_short_desc = "FPU executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when FPU is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU_FMA], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU_FMA] }, [ PPC970_PME_PM_MRK_CRU_FIN ] = { .pme_name = "PM_MRK_CRU_FIN", .pme_code = 0x4005, .pme_short_desc = "Marked instruction CRU processing finished", .pme_long_desc = "The Condition Register Unit finished a marked instruction. 
Instructions that finish may not necessarily complete", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_CRU_FIN], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_CRU_FIN] }, [ PPC970_PME_PM_MRK_LSU1_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU1_FLUSH_UST", .pme_code = 0x715, .pme_short_desc = "LSU1 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary)", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_LSU1_FLUSH_UST], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_LSU1_FLUSH_UST] }, [ PPC970_PME_PM_MRK_FXU_FIN ] = { .pme_name = "PM_MRK_FXU_FIN", .pme_code = 0x6004, .pme_short_desc = "Marked instruction FXU processing finished", .pme_long_desc = "Marked instruction FXU processing finished", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_FXU_FIN], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_FXU_FIN] }, [ PPC970_PME_PM_LSU1_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU1_REJECT_ERAT_MISS", .pme_code = 0x927, .pme_short_desc = "LSU1 reject due to ERAT miss", .pme_long_desc = "LSU1 reject due to ERAT miss", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU1_REJECT_ERAT_MISS], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU1_REJECT_ERAT_MISS] }, [ PPC970_PME_PM_BR_ISSUED ] = { .pme_name = "PM_BR_ISSUED", .pme_code = 0x431, .pme_short_desc = "Branches issued", .pme_long_desc = "This signal will be asserted each time the ISU issues a branch instruction. 
This signal will be asserted each time the ISU selects a branch instruction to issue.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_BR_ISSUED], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_BR_ISSUED] }, [ PPC970_PME_PM_PMC4_OVERFLOW ] = { .pme_name = "PM_PMC4_OVERFLOW", .pme_code = 0x500a, .pme_short_desc = "PMC4 Overflow", .pme_long_desc = "PMC4 Overflow", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_PMC4_OVERFLOW], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_PMC4_OVERFLOW] }, [ PPC970_PME_PM_EE_OFF ] = { .pme_name = "PM_EE_OFF", .pme_code = 0x333, .pme_short_desc = "Cycles MSR(EE) bit off", .pme_long_desc = "The number of Cycles MSR(EE) bit was off.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_EE_OFF], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_EE_OFF] }, [ PPC970_PME_PM_INST_FROM_L25_MOD ] = { .pme_name = "PM_INST_FROM_L25_MOD", .pme_code = 0x6426, .pme_short_desc = "Instruction fetched from L2.5 modified", .pme_long_desc = "Instruction fetched from L2.5 modified", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_INST_FROM_L25_MOD], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_INST_FROM_L25_MOD] }, [ PPC970_PME_PM_ITLB_MISS ] = { .pme_name = "PM_ITLB_MISS", .pme_code = 0x700, .pme_short_desc = "Instruction TLB misses", .pme_long_desc = "A TLB miss for an Instruction Fetch has occurred", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_ITLB_MISS], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_ITLB_MISS] }, [ PPC970_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { .pme_name = "PM_FXU1_BUSY_FXU0_IDLE", .pme_code = 0x4002, .pme_short_desc = "FXU1 busy FXU0 idle", .pme_long_desc = "FXU0 was idle while FXU1 was busy", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FXU1_BUSY_FXU0_IDLE], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FXU1_BUSY_FXU0_IDLE] }, [ PPC970_PME_PM_GRP_DISP_VALID ] = { .pme_name = "PM_GRP_DISP_VALID", .pme_code = 0x323, .pme_short_desc = "Group dispatch valid", .pme_long_desc = "Dispatch has been attempted 
for a valid group. Some groups may be rejected. The total number of successful dispatches is the number of dispatch valid minus dispatch reject.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_GRP_DISP_VALID], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_GRP_DISP_VALID] }, [ PPC970_PME_PM_MRK_GRP_DISP ] = { .pme_name = "PM_MRK_GRP_DISP", .pme_code = 0x1002, .pme_short_desc = "Marked group dispatched", .pme_long_desc = "A group containing a sampled instruction was dispatched", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_GRP_DISP], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_GRP_DISP] }, [ PPC970_PME_PM_LSU_FLUSH_UST ] = { .pme_name = "PM_LSU_FLUSH_UST", .pme_code = 0x2800, .pme_short_desc = "SRQ unaligned store flushes", .pme_long_desc = "A store was flushed because it was unaligned", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_FLUSH_UST], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_FLUSH_UST] }, [ PPC970_PME_PM_FXU1_FIN ] = { .pme_name = "PM_FXU1_FIN", .pme_code = 0x336, .pme_short_desc = "FXU1 produced a result", .pme_long_desc = "The Fixed Point unit 1 finished an instruction and produced a result", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FXU1_FIN], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FXU1_FIN] }, [ PPC970_PME_PM_GRP_CMPL ] = { .pme_name = "PM_GRP_CMPL", .pme_code = 0x7003, .pme_short_desc = "Group completed", .pme_long_desc = "A group completed. Microcoded instructions that span multiple groups will generate this event once per group.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_GRP_CMPL], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_GRP_CMPL] }, [ PPC970_PME_PM_FPU_FRSP_FCONV ] = { .pme_name = "PM_FPU_FRSP_FCONV", .pme_code = 0x7110, .pme_short_desc = "FPU executed FRSP or FCONV instructions", .pme_long_desc = "This signal is active for one cycle when executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
Combined Unit 0 + Unit 1", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU_FRSP_FCONV], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU_FRSP_FCONV] }, [ PPC970_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_SRQ", .pme_code = 0x713, .pme_short_desc = "LSU0 marked SRQ flushes", .pme_long_desc = "A marked store was flushed because a younger load hits an older store that is already in the SRQ or in the same group.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_LSU0_FLUSH_SRQ], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_LSU0_FLUSH_SRQ] }, [ PPC970_PME_PM_LSU_LMQ_FULL_CYC ] = { .pme_name = "PM_LSU_LMQ_FULL_CYC", .pme_code = 0x837, .pme_short_desc = "Cycles LMQ full", .pme_long_desc = "The LMQ was full", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_LMQ_FULL_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_LMQ_FULL_CYC] }, [ PPC970_PME_PM_ST_REF_L1_LSU0 ] = { .pme_name = "PM_ST_REF_L1_LSU0", .pme_code = 0x811, .pme_short_desc = "LSU0 L1 D cache store references", .pme_long_desc = "A store executed on unit 0", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_ST_REF_L1_LSU0], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_ST_REF_L1_LSU0] }, [ PPC970_PME_PM_LSU0_DERAT_MISS ] = { .pme_name = "PM_LSU0_DERAT_MISS", .pme_code = 0x702, .pme_short_desc = "LSU0 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. 
Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU0_DERAT_MISS], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU0_DERAT_MISS] }, [ PPC970_PME_PM_LSU_SRQ_SYNC_CYC ] = { .pme_name = "PM_LSU_SRQ_SYNC_CYC", .pme_code = 0x735, .pme_short_desc = "SRQ sync duration", .pme_long_desc = "This signal is asserted every cycle when a sync is in the SRQ.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_SRQ_SYNC_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_SRQ_SYNC_CYC] }, [ PPC970_PME_PM_FPU_STALL3 ] = { .pme_name = "PM_FPU_STALL3", .pme_code = 0x2120, .pme_short_desc = "FPU stalled in pipe3", .pme_long_desc = "FPU has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. Combined Unit 0 + Unit 1", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU_STALL3], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU_STALL3] }, [ PPC970_PME_PM_LSU_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU_REJECT_ERAT_MISS", .pme_code = 0x5920, .pme_short_desc = "LSU reject due to ERAT miss", .pme_long_desc = "LSU reject due to ERAT miss", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_REJECT_ERAT_MISS], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_REJECT_ERAT_MISS] }, [ PPC970_PME_PM_MRK_DATA_FROM_L2 ] = { .pme_name = "PM_MRK_DATA_FROM_L2", .pme_code = 0x1937, .pme_short_desc = "Marked data loaded from L2", .pme_long_desc = "DL1 was reloaded from the local L2 due to a marked demand load", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_DATA_FROM_L2], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_DATA_FROM_L2] }, [ PPC970_PME_PM_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_LSU0_FLUSH_SRQ", .pme_code = 0x803, .pme_short_desc = "LSU0 SRQ flushes", .pme_long_desc = "A store was flushed because a younger load 
hits an older store that is already in the SRQ or in the same group.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU0_FLUSH_SRQ], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU0_FLUSH_SRQ] }, [ PPC970_PME_PM_FPU0_FMOV_FEST ] = { .pme_name = "PM_FPU0_FMOV_FEST", .pme_code = 0x110, .pme_short_desc = "FPU0 executed FMOV or FEST instructions", .pme_long_desc = "This signal is active for one cycle when fp0 is executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU0_FMOV_FEST], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU0_FMOV_FEST] }, [ PPC970_PME_PM_LD_REF_L1_LSU0 ] = { .pme_name = "PM_LD_REF_L1_LSU0", .pme_code = 0x810, .pme_short_desc = "LSU0 L1 D cache load references", .pme_long_desc = "A load executed on unit 0", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LD_REF_L1_LSU0], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LD_REF_L1_LSU0] }, [ PPC970_PME_PM_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_LSU1_FLUSH_SRQ", .pme_code = 0x807, .pme_short_desc = "LSU1 SRQ flushes", .pme_long_desc = "A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. 
", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU1_FLUSH_SRQ], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU1_FLUSH_SRQ] }, [ PPC970_PME_PM_GRP_BR_MPRED ] = { .pme_name = "PM_GRP_BR_MPRED", .pme_code = 0x327, .pme_short_desc = "Group experienced a branch mispredict", .pme_long_desc = "Group experienced a branch mispredict", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_GRP_BR_MPRED], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_GRP_BR_MPRED] }, [ PPC970_PME_PM_LSU_LMQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LMQ_S0_ALLOC", .pme_code = 0x836, .pme_short_desc = "LMQ slot 0 allocated", .pme_long_desc = "The first entry in the LMQ was allocated.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_LMQ_S0_ALLOC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_LMQ_S0_ALLOC] }, [ PPC970_PME_PM_LSU0_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU0_REJECT_LMQ_FULL", .pme_code = 0x921, .pme_short_desc = "LSU0 reject due to LMQ full or missed data coming", .pme_long_desc = "LSU0 reject due to LMQ full or missed data coming", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU0_REJECT_LMQ_FULL], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU0_REJECT_LMQ_FULL] }, [ PPC970_PME_PM_ST_REF_L1 ] = { .pme_name = "PM_ST_REF_L1", .pme_code = 0x7810, .pme_short_desc = "L1 D cache store references", .pme_long_desc = "Total DL1 Store references", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_ST_REF_L1], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_ST_REF_L1] }, [ PPC970_PME_PM_MRK_VMX_FIN ] = { .pme_name = "PM_MRK_VMX_FIN", .pme_code = 0x3005, .pme_short_desc = "Marked instruction VMX processing finished", .pme_long_desc = "Marked instruction VMX processing finished", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_VMX_FIN], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_VMX_FIN] }, [ PPC970_PME_PM_LSU_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_SRQ_EMPTY_CYC", .pme_code = 0x4003, .pme_short_desc = "Cycles SRQ empty", .pme_long_desc = "The 
Store Request Queue is empty", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_SRQ_EMPTY_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_SRQ_EMPTY_CYC] }, [ PPC970_PME_PM_FPU1_STF ] = { .pme_name = "PM_FPU1_STF", .pme_code = 0x126, .pme_short_desc = "FPU1 executed store instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing a store instruction.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU1_STF], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU1_STF] }, [ PPC970_PME_PM_RUN_CYC ] = { .pme_name = "PM_RUN_CYC", .pme_code = 0x1005, .pme_short_desc = "Run cycles", .pme_long_desc = "Processor Cycles gated by the run latch", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_RUN_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_RUN_CYC] }, [ PPC970_PME_PM_LSU_LMQ_S0_VALID ] = { .pme_name = "PM_LSU_LMQ_S0_VALID", .pme_code = 0x835, .pme_short_desc = "LMQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ had eight entries that are allocated FIFO", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_LMQ_S0_VALID], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_LMQ_S0_VALID] }, [ PPC970_PME_PM_LSU0_LDF ] = { .pme_name = "PM_LSU0_LDF", .pme_code = 0x730, .pme_short_desc = "LSU0 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed from LSU unit 0", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU0_LDF], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU0_LDF] }, [ PPC970_PME_PM_LSU_LRQ_S0_VALID ] = { .pme_name = "PM_LSU_LRQ_S0_VALID", .pme_code = 0x822, .pme_short_desc = "LRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Load Request Queue slot zero is valid. 
The LRQ is 32 entries long and is allocated round-robin.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_LRQ_S0_VALID], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_LRQ_S0_VALID] }, [ PPC970_PME_PM_PMC3_OVERFLOW ] = { .pme_name = "PM_PMC3_OVERFLOW", .pme_code = 0x400a, .pme_short_desc = "PMC3 Overflow", .pme_long_desc = "PMC3 Overflow", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_PMC3_OVERFLOW], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_PMC3_OVERFLOW] }, [ PPC970_PME_PM_MRK_IMR_RELOAD ] = { .pme_name = "PM_MRK_IMR_RELOAD", .pme_code = 0x722, .pme_short_desc = "Marked IMR reloaded", .pme_long_desc = "A DL1 reload occurred due to a marked load", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_IMR_RELOAD], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_IMR_RELOAD] }, [ PPC970_PME_PM_MRK_GRP_TIMEO ] = { .pme_name = "PM_MRK_GRP_TIMEO", .pme_code = 0x5005, .pme_short_desc = "Marked group completion timeout", .pme_long_desc = "The sampling timeout expired indicating that the previously sampled instruction is no longer in the processor", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_GRP_TIMEO], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_GRP_TIMEO] }, [ PPC970_PME_PM_FPU_FMOV_FEST ] = { .pme_name = "PM_FPU_FMOV_FEST", .pme_code = 0x8110, .pme_short_desc = "FPU executing FMOV or FEST instructions", .pme_long_desc = "This signal is active for one cycle when executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ. 
Combined Unit 0 + Unit 1", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU_FMOV_FEST], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU_FMOV_FEST] }, [ PPC970_PME_PM_GRP_DISP_BLK_SB_CYC ] = { .pme_name = "PM_GRP_DISP_BLK_SB_CYC", .pme_code = 0x331, .pme_short_desc = "Cycles group dispatch blocked by scoreboard", .pme_long_desc = "The ISU sends a signal indicating that dispatch is blocked by scoreboard.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_GRP_DISP_BLK_SB_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_GRP_DISP_BLK_SB_CYC] }, [ PPC970_PME_PM_XER_MAP_FULL_CYC ] = { .pme_name = "PM_XER_MAP_FULL_CYC", .pme_code = 0x302, .pme_short_desc = "Cycles XER mapper full", .pme_long_desc = "The ISU sends a signal indicating that the xer mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mappers is full but the entire mapper may not be.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_XER_MAP_FULL_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_XER_MAP_FULL_CYC] }, [ PPC970_PME_PM_ST_MISS_L1 ] = { .pme_name = "PM_ST_MISS_L1", .pme_code = 0x813, .pme_short_desc = "L1 D cache store misses", .pme_long_desc = "A store missed the dcache", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_ST_MISS_L1], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_ST_MISS_L1] }, [ PPC970_PME_PM_STOP_COMPLETION ] = { .pme_name = "PM_STOP_COMPLETION", .pme_code = 0x3001, .pme_short_desc = "Completion stopped", .pme_long_desc = "RAS Unit has signaled completion to stop", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_STOP_COMPLETION], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_STOP_COMPLETION] }, [ PPC970_PME_PM_MRK_GRP_CMPL ] = { .pme_name = "PM_MRK_GRP_CMPL", .pme_code = 0x4004, .pme_short_desc = "Marked group completed", .pme_long_desc = "A group containing a sampled instruction completed. 
Microcoded instructions that span multiple groups will generate this event once per group.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_GRP_CMPL], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_GRP_CMPL] }, [ PPC970_PME_PM_ISLB_MISS ] = { .pme_name = "PM_ISLB_MISS", .pme_code = 0x701, .pme_short_desc = "Instruction SLB misses", .pme_long_desc = "An SLB miss for an instruction fetch has occurred", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_ISLB_MISS], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_ISLB_MISS] }, [ PPC970_PME_PM_SUSPENDED ] = { .pme_name = "PM_SUSPENDED", .pme_code = 0x0, .pme_short_desc = "Suspended", .pme_long_desc = "Suspended", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_SUSPENDED], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_SUSPENDED] }, [ PPC970_PME_PM_CYC ] = { .pme_name = "PM_CYC", .pme_code = 0x7, .pme_short_desc = "Processor cycles", .pme_long_desc = "Processor cycles", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_CYC] }, [ PPC970_PME_PM_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_LD_MISS_L1_LSU1", .pme_code = 0x816, .pme_short_desc = "LSU1 L1 D cache load misses", .pme_long_desc = "A load, executing on unit 1, missed the dcache", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LD_MISS_L1_LSU1], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LD_MISS_L1_LSU1] }, [ PPC970_PME_PM_STCX_FAIL ] = { .pme_name = "PM_STCX_FAIL", .pme_code = 0x721, .pme_short_desc = "STCX failed", .pme_long_desc = "A stcx (stwcx or stdcx) failed", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_STCX_FAIL], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_STCX_FAIL] }, [ PPC970_PME_PM_LSU1_SRQ_STFWD ] = { .pme_name = "PM_LSU1_SRQ_STFWD", .pme_code = 0x824, .pme_short_desc = "LSU1 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 1", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU1_SRQ_STFWD], .pme_group_vector = 
ppc970_group_vecs[PPC970_PME_PM_LSU1_SRQ_STFWD] }, [ PPC970_PME_PM_GRP_DISP ] = { .pme_name = "PM_GRP_DISP", .pme_code = 0x2004, .pme_short_desc = "Group dispatches", .pme_long_desc = "A group was dispatched", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_GRP_DISP], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_GRP_DISP] }, [ PPC970_PME_PM_L2_PREF ] = { .pme_name = "PM_L2_PREF", .pme_code = 0x733, .pme_short_desc = "L2 cache prefetches", .pme_long_desc = "A request to prefetch data into L2 was made", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_L2_PREF], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_L2_PREF] }, [ PPC970_PME_PM_FPU1_DENORM ] = { .pme_name = "PM_FPU1_DENORM", .pme_code = 0x124, .pme_short_desc = "FPU1 received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU1_DENORM], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU1_DENORM] }, [ PPC970_PME_PM_DATA_FROM_L2 ] = { .pme_name = "PM_DATA_FROM_L2", .pme_code = 0x1837, .pme_short_desc = "Data loaded from L2", .pme_long_desc = "DL1 was reloaded from the local L2 due to a demand load", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_DATA_FROM_L2], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_DATA_FROM_L2] }, [ PPC970_PME_PM_FPU0_FPSCR ] = { .pme_name = "PM_FPU0_FPSCR", .pme_code = 0x130, .pme_short_desc = "FPU0 executed FPSCR instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*. 
mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU0_FPSCR], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU0_FPSCR] }, [ PPC970_PME_PM_MRK_DATA_FROM_L25_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L25_MOD", .pme_code = 0x393d, .pme_short_desc = "Marked data loaded from L2.5 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a marked demand load", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_DATA_FROM_L25_MOD], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_DATA_FROM_L25_MOD] }, [ PPC970_PME_PM_FPU0_FSQRT ] = { .pme_name = "PM_FPU0_FSQRT", .pme_code = 0x102, .pme_short_desc = "FPU0 executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp0 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU0_FSQRT], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU0_FSQRT] }, [ PPC970_PME_PM_LD_REF_L1 ] = { .pme_name = "PM_LD_REF_L1", .pme_code = 0x8810, .pme_short_desc = "L1 D cache load references", .pme_long_desc = "Total DL1 Load references", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LD_REF_L1], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LD_REF_L1] }, [ PPC970_PME_PM_MRK_L1_RELOAD_VALID ] = { .pme_name = "PM_MRK_L1_RELOAD_VALID", .pme_code = 0x934, .pme_short_desc = "Marked L1 reload data source valid", .pme_long_desc = "The source information is valid and is for a marked load", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_L1_RELOAD_VALID], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_L1_RELOAD_VALID] }, [ PPC970_PME_PM_1PLUS_PPC_CMPL ] = { .pme_name = "PM_1PLUS_PPC_CMPL", .pme_code = 0x5003, .pme_short_desc = "One or more PPC instruction completed", .pme_long_desc = "A group containing at least one PPC instruction completed. 
For microcoded instructions that span multiple groups, this will only occur once.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_1PLUS_PPC_CMPL], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_1PLUS_PPC_CMPL] }, [ PPC970_PME_PM_INST_FROM_L1 ] = { .pme_name = "PM_INST_FROM_L1", .pme_code = 0x142d, .pme_short_desc = "Instruction fetched from L1", .pme_long_desc = "An instruction fetch group was fetched from L1. Fetch Groups can contain up to 8 instructions", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_INST_FROM_L1], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_INST_FROM_L1] }, [ PPC970_PME_PM_EE_OFF_EXT_INT ] = { .pme_name = "PM_EE_OFF_EXT_INT", .pme_code = 0x337, .pme_short_desc = "Cycles MSR(EE) bit off and external interrupt pending", .pme_long_desc = "Cycles MSR(EE) bit off and external interrupt pending", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_EE_OFF_EXT_INT], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_EE_OFF_EXT_INT] }, [ PPC970_PME_PM_PMC6_OVERFLOW ] = { .pme_name = "PM_PMC6_OVERFLOW", .pme_code = 0x700a, .pme_short_desc = "PMC6 Overflow", .pme_long_desc = "PMC6 Overflow", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_PMC6_OVERFLOW], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_PMC6_OVERFLOW] }, [ PPC970_PME_PM_LSU_LRQ_FULL_CYC ] = { .pme_name = "PM_LSU_LRQ_FULL_CYC", .pme_code = 0x312, .pme_short_desc = "Cycles LRQ full", .pme_long_desc = "The ISU sends this signal when the LRQ is full.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_LRQ_FULL_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_LRQ_FULL_CYC] }, [ PPC970_PME_PM_IC_PREF_INSTALL ] = { .pme_name = "PM_IC_PREF_INSTALL", .pme_code = 0x427, .pme_short_desc = "Instruction prefetched installed in prefetch", .pme_long_desc = "New line coming into the prefetch buffer", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_IC_PREF_INSTALL], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_IC_PREF_INSTALL] }, [ PPC970_PME_PM_DC_PREF_OUT_OF_STREAMS 
] = { .pme_name = "PM_DC_PREF_OUT_OF_STREAMS", .pme_code = 0x732, .pme_short_desc = "D cache out of streams", .pme_long_desc = "out of streams", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_DC_PREF_OUT_OF_STREAMS], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_DC_PREF_OUT_OF_STREAMS] }, [ PPC970_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_SRQ", .pme_code = 0x717, .pme_short_desc = "LSU1 marked SRQ flushes", .pme_long_desc = "A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_LSU1_FLUSH_SRQ], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_LSU1_FLUSH_SRQ] }, [ PPC970_PME_PM_GCT_FULL_CYC ] = { .pme_name = "PM_GCT_FULL_CYC", .pme_code = 0x300, .pme_short_desc = "Cycles GCT full", .pme_long_desc = "The ISU sends a signal indicating the gct is full. ", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_GCT_FULL_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_GCT_FULL_CYC] }, [ PPC970_PME_PM_INST_FROM_MEM ] = { .pme_name = "PM_INST_FROM_MEM", .pme_code = 0x2426, .pme_short_desc = "Instruction fetched from memory", .pme_long_desc = "Instruction fetched from memory", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_INST_FROM_MEM], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_INST_FROM_MEM] }, [ PPC970_PME_PM_FLUSH_LSU_BR_MPRED ] = { .pme_name = "PM_FLUSH_LSU_BR_MPRED", .pme_code = 0x317, .pme_short_desc = "Flush caused by LSU or branch mispredict", .pme_long_desc = "Flush caused by LSU or branch mispredict", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FLUSH_LSU_BR_MPRED], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FLUSH_LSU_BR_MPRED] }, [ PPC970_PME_PM_FXU_BUSY ] = { .pme_name = "PM_FXU_BUSY", .pme_code = 0x6002, .pme_short_desc = "FXU busy", .pme_long_desc = "FXU0 and FXU1 are both busy", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FXU_BUSY], .pme_group_vector = 
ppc970_group_vecs[PPC970_PME_PM_FXU_BUSY] }, [ PPC970_PME_PM_ST_REF_L1_LSU1 ] = { .pme_name = "PM_ST_REF_L1_LSU1", .pme_code = 0x815, .pme_short_desc = "LSU1 L1 D cache store references", .pme_long_desc = "A store executed on unit 1", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_ST_REF_L1_LSU1], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_ST_REF_L1_LSU1] }, [ PPC970_PME_PM_MRK_LD_MISS_L1 ] = { .pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 0x1720, .pme_short_desc = "Marked L1 D cache load misses", .pme_long_desc = "Marked L1 D cache load misses", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_LD_MISS_L1], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_LD_MISS_L1] }, [ PPC970_PME_PM_L1_WRITE_CYC ] = { .pme_name = "PM_L1_WRITE_CYC", .pme_code = 0x434, .pme_short_desc = "Cycles writing to instruction L1", .pme_long_desc = "This signal is asserted each cycle a cache write is active.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_L1_WRITE_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_L1_WRITE_CYC] }, [ PPC970_PME_PM_LSU_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU_REJECT_LMQ_FULL", .pme_code = 0x2920, .pme_short_desc = "LSU reject due to LMQ full or missed data coming", .pme_long_desc = "LSU reject due to LMQ full or missed data coming", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_REJECT_LMQ_FULL], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_REJECT_LMQ_FULL] }, [ PPC970_PME_PM_FPU_ALL ] = { .pme_name = "PM_FPU_ALL", .pme_code = 0x5100, .pme_short_desc = "FPU executed add", .pme_long_desc = " mult", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU_ALL], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU_ALL] }, [ PPC970_PME_PM_LSU_SRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_SRQ_S0_ALLOC", .pme_code = 0x825, .pme_short_desc = "SRQ slot 0 allocated", .pme_long_desc = "SRQ Slot zero was allocated", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_SRQ_S0_ALLOC], .pme_group_vector = 
ppc970_group_vecs[PPC970_PME_PM_LSU_SRQ_S0_ALLOC] }, [ PPC970_PME_PM_INST_FROM_L25_SHR ] = { .pme_name = "PM_INST_FROM_L25_SHR", .pme_code = 0x5426, .pme_short_desc = "Instruction fetched from L2.5 shared", .pme_long_desc = "Instruction fetched from L2.5 shared", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_INST_FROM_L25_SHR], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_INST_FROM_L25_SHR] }, [ PPC970_PME_PM_GRP_MRK ] = { .pme_name = "PM_GRP_MRK", .pme_code = 0x5004, .pme_short_desc = "Group marked in IDU", .pme_long_desc = "A group was sampled (marked)", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_GRP_MRK], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_GRP_MRK] }, [ PPC970_PME_PM_BR_MPRED_CR ] = { .pme_name = "PM_BR_MPRED_CR", .pme_code = 0x432, .pme_short_desc = "Branch mispredictions due to CR bit setting", .pme_long_desc = "This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_BR_MPRED_CR], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_BR_MPRED_CR] }, [ PPC970_PME_PM_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_DC_PREF_STREAM_ALLOC", .pme_code = 0x737, .pme_short_desc = "D cache new prefetch stream allocated", .pme_long_desc = "A new Prefetch Stream was allocated", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_DC_PREF_STREAM_ALLOC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_DC_PREF_STREAM_ALLOC] }, [ PPC970_PME_PM_FPU1_FIN ] = { .pme_name = "PM_FPU1_FIN", .pme_code = 0x117, .pme_short_desc = "FPU1 produced a result", .pme_long_desc = "fp1 finished, produced a result. This only indicates finish, not completion. 
", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU1_FIN], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU1_FIN] }, [ PPC970_PME_PM_LSU_REJECT_SRQ ] = { .pme_name = "PM_LSU_REJECT_SRQ", .pme_code = 0x1920, .pme_short_desc = "LSU SRQ rejects", .pme_long_desc = "LSU SRQ rejects", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_REJECT_SRQ], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_REJECT_SRQ] }, [ PPC970_PME_PM_BR_MPRED_TA ] = { .pme_name = "PM_BR_MPRED_TA", .pme_code = 0x433, .pme_short_desc = "Branch mispredictions due to target address", .pme_long_desc = "branch miss predict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_BR_MPRED_TA], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_BR_MPRED_TA] }, [ PPC970_PME_PM_CRQ_FULL_CYC ] = { .pme_name = "PM_CRQ_FULL_CYC", .pme_code = 0x311, .pme_short_desc = "Cycles CR issue queue full", .pme_long_desc = "The ISU sends a signal indicating that the issue queue that feeds the ifu cr unit cannot accept any more group (queue is full of groups).", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_CRQ_FULL_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_CRQ_FULL_CYC] }, [ PPC970_PME_PM_LD_MISS_L1 ] = { .pme_name = "PM_LD_MISS_L1", .pme_code = 0x3810, .pme_short_desc = "L1 D cache load misses", .pme_long_desc = "Total DL1 Load references that miss the DL1", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LD_MISS_L1], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LD_MISS_L1] }, [ PPC970_PME_PM_INST_FROM_PREF ] = { .pme_name = "PM_INST_FROM_PREF", .pme_code = 0x342d, .pme_short_desc = "Instructions fetched from prefetch", .pme_long_desc = "An instruction fetch group was 
fetched from the prefetch buffer. Fetch Groups can contain up to 8 instructions", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_INST_FROM_PREF], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_INST_FROM_PREF] }, [ PPC970_PME_PM_STCX_PASS ] = { .pme_name = "PM_STCX_PASS", .pme_code = 0x725, .pme_short_desc = "Stcx passes", .pme_long_desc = "A stcx (stwcx or stdcx) instruction was successful", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_STCX_PASS], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_STCX_PASS] }, [ PPC970_PME_PM_DC_INV_L2 ] = { .pme_name = "PM_DC_INV_L2", .pme_code = 0x817, .pme_short_desc = "L1 D cache entries invalidated from L2", .pme_long_desc = "A dcache invalidated was received from the L2 because a line in L2 was castout.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_DC_INV_L2], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_DC_INV_L2] }, [ PPC970_PME_PM_LSU_SRQ_FULL_CYC ] = { .pme_name = "PM_LSU_SRQ_FULL_CYC", .pme_code = 0x313, .pme_short_desc = "Cycles SRQ full", .pme_long_desc = "The ISU sends this signal when the srq is full.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_SRQ_FULL_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_SRQ_FULL_CYC] }, [ PPC970_PME_PM_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_LSU0_FLUSH_LRQ", .pme_code = 0x802, .pme_short_desc = "LSU0 LRQ flushes", .pme_long_desc = "A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU0_FLUSH_LRQ], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU0_FLUSH_LRQ] }, [ PPC970_PME_PM_LSU_SRQ_S0_VALID ] = { .pme_name = "PM_LSU_SRQ_S0_VALID", .pme_code = 0x821, .pme_short_desc = "SRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Store Request Queue slot zero is valid. 
The SRQ is 32 entries long and is allocated round-robin.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_SRQ_S0_VALID], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_SRQ_S0_VALID] }, [ PPC970_PME_PM_LARX_LSU0 ] = { .pme_name = "PM_LARX_LSU0", .pme_code = 0x727, .pme_short_desc = "Larx executed on LSU0", .pme_long_desc = "A larx (lwarx or ldarx) was executed on side 0 (there is no coresponding unit 1 event since larx instructions can only execute on unit 0)", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LARX_LSU0], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LARX_LSU0] }, [ PPC970_PME_PM_GCT_EMPTY_CYC ] = { .pme_name = "PM_GCT_EMPTY_CYC", .pme_code = 0x1004, .pme_short_desc = "Cycles GCT empty", .pme_long_desc = "The Global Completion Table is completely empty", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_GCT_EMPTY_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_GCT_EMPTY_CYC] }, [ PPC970_PME_PM_FPU1_ALL ] = { .pme_name = "PM_FPU1_ALL", .pme_code = 0x107, .pme_short_desc = "FPU1 executed add", .pme_long_desc = " mult", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU1_ALL], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU1_ALL] }, [ PPC970_PME_PM_FPU1_FSQRT ] = { .pme_name = "PM_FPU1_FSQRT", .pme_code = 0x106, .pme_short_desc = "FPU1 executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp1 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU1_FSQRT], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU1_FSQRT] }, [ PPC970_PME_PM_FPU_FIN ] = { .pme_name = "PM_FPU_FIN", .pme_code = 0x4110, .pme_short_desc = "FPU produced a result", .pme_long_desc = "FPU finished, produced a result This only indicates finish, not completion. 
Combined Unit 0 + Unit 1", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU_FIN], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU_FIN] }, [ PPC970_PME_PM_LSU_SRQ_STFWD ] = { .pme_name = "PM_LSU_SRQ_STFWD", .pme_code = 0x1820, .pme_short_desc = "SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_SRQ_STFWD], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_SRQ_STFWD] }, [ PPC970_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU1", .pme_code = 0x724, .pme_short_desc = "LSU1 L1 D cache load misses", .pme_long_desc = "A marked load, executing on unit 1, missed the dcache", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_LD_MISS_L1_LSU1], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_LD_MISS_L1_LSU1] }, [ PPC970_PME_PM_FXU0_FIN ] = { .pme_name = "PM_FXU0_FIN", .pme_code = 0x332, .pme_short_desc = "FXU0 produced a result", .pme_long_desc = "The Fixed Point unit 0 finished an instruction and produced a result", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FXU0_FIN], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FXU0_FIN] }, [ PPC970_PME_PM_MRK_FPU_FIN ] = { .pme_name = "PM_MRK_FPU_FIN", .pme_code = 0x7004, .pme_short_desc = "Marked instruction FPU processing finished", .pme_long_desc = "One of the Floating Point Units finished a marked instruction. 
Instructions that finish may not necessary complete", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_FPU_FIN], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_FPU_FIN] }, [ PPC970_PME_PM_PMC5_OVERFLOW ] = { .pme_name = "PM_PMC5_OVERFLOW", .pme_code = 0x600a, .pme_short_desc = "PMC5 Overflow", .pme_long_desc = "PMC5 Overflow", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_PMC5_OVERFLOW], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_PMC5_OVERFLOW] }, [ PPC970_PME_PM_SNOOP_TLBIE ] = { .pme_name = "PM_SNOOP_TLBIE", .pme_code = 0x703, .pme_short_desc = "Snoop TLBIE", .pme_long_desc = "A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_SNOOP_TLBIE], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_SNOOP_TLBIE] }, [ PPC970_PME_PM_FPU1_FRSP_FCONV ] = { .pme_name = "PM_FPU1_FRSP_FCONV", .pme_code = 0x115, .pme_short_desc = "FPU1 executed FRSP or FCONV instructions", .pme_long_desc = "This signal is active for one cycle when fp1 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU1_FRSP_FCONV], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU1_FRSP_FCONV] }, [ PPC970_PME_PM_FPU0_FDIV ] = { .pme_name = "PM_FPU0_FDIV", .pme_code = 0x100, .pme_short_desc = "FPU0 executed FDIV instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp0 is executing a divide instruction. This could be fdiv, fdivs, fdiv. 
fdivs.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU0_FDIV], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU0_FDIV] }, [ PPC970_PME_PM_LD_REF_L1_LSU1 ] = { .pme_name = "PM_LD_REF_L1_LSU1", .pme_code = 0x814, .pme_short_desc = "LSU1 L1 D cache load references", .pme_long_desc = "A load executed on unit 1", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LD_REF_L1_LSU1], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LD_REF_L1_LSU1] }, [ PPC970_PME_PM_HV_CYC ] = { .pme_name = "PM_HV_CYC", .pme_code = 0x3004, .pme_short_desc = "Hypervisor Cycles", .pme_long_desc = "Cycles when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0)", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_HV_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_HV_CYC] }, [ PPC970_PME_PM_LR_CTR_MAP_FULL_CYC ] = { .pme_name = "PM_LR_CTR_MAP_FULL_CYC", .pme_code = 0x306, .pme_short_desc = "Cycles LR/CTR mapper full", .pme_long_desc = "The ISU sends a signal indicating that the lr/ctr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LR_CTR_MAP_FULL_CYC], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LR_CTR_MAP_FULL_CYC] }, [ PPC970_PME_PM_FPU_DENORM ] = { .pme_name = "PM_FPU_DENORM", .pme_code = 0x1120, .pme_short_desc = "FPU received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized. 
Combined Unit 0 + Unit 1", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_FPU_DENORM], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_FPU_DENORM] }, [ PPC970_PME_PM_LSU0_REJECT_SRQ ] = { .pme_name = "PM_LSU0_REJECT_SRQ", .pme_code = 0x920, .pme_short_desc = "LSU0 SRQ rejects", .pme_long_desc = "LSU0 SRQ rejects", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU0_REJECT_SRQ], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU0_REJECT_SRQ] }, [ PPC970_PME_PM_LSU1_REJECT_SRQ ] = { .pme_name = "PM_LSU1_REJECT_SRQ", .pme_code = 0x924, .pme_short_desc = "LSU1 SRQ rejects", .pme_long_desc = "LSU1 SRQ rejects", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU1_REJECT_SRQ], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU1_REJECT_SRQ] }, [ PPC970_PME_PM_LSU1_DERAT_MISS ] = { .pme_name = "PM_LSU1_DERAT_MISS", .pme_code = 0x706, .pme_short_desc = "LSU1 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU1_DERAT_MISS], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU1_DERAT_MISS] }, [ PPC970_PME_PM_IC_PREF_REQ ] = { .pme_name = "PM_IC_PREF_REQ", .pme_code = 0x426, .pme_short_desc = "Instruction prefetch requests", .pme_long_desc = "Asserted when a non-canceled prefetch is made to the cache interface unit (CIU).", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_IC_PREF_REQ], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_IC_PREF_REQ] }, [ PPC970_PME_PM_MRK_LSU_FIN ] = { .pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x8004, .pme_short_desc = "Marked instruction LSU processing finished", .pme_long_desc = "One of the Load/Store Units finished a marked instruction. 
Instructions that finish may not necessary complete", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_LSU_FIN], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_LSU_FIN] }, [ PPC970_PME_PM_MRK_DATA_FROM_MEM ] = { .pme_name = "PM_MRK_DATA_FROM_MEM", .pme_code = 0x3937, .pme_short_desc = "Marked data loaded from memory", .pme_long_desc = "Marked data loaded from memory", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_MRK_DATA_FROM_MEM], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_MRK_DATA_FROM_MEM] }, [ PPC970_PME_PM_LSU0_FLUSH_UST ] = { .pme_name = "PM_LSU0_FLUSH_UST", .pme_code = 0x801, .pme_short_desc = "LSU0 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 0 because it was unaligned (crossed a 4k boundary)", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU0_FLUSH_UST], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU0_FLUSH_UST] }, [ PPC970_PME_PM_LSU_FLUSH_LRQ ] = { .pme_name = "PM_LSU_FLUSH_LRQ", .pme_code = 0x6800, .pme_short_desc = "LRQ flushes", .pme_long_desc = "A load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_FLUSH_LRQ], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_FLUSH_LRQ] }, [ PPC970_PME_PM_LSU_FLUSH_SRQ ] = { .pme_name = "PM_LSU_FLUSH_SRQ", .pme_code = 0x5800, .pme_short_desc = "SRQ flushes", .pme_long_desc = "A store was flushed because younger load hits and older store that is already in the SRQ or in the same group.", .pme_event_ids = ppc970_event_ids[PPC970_PME_PM_LSU_FLUSH_SRQ], .pme_group_vector = ppc970_group_vecs[PPC970_PME_PM_LSU_FLUSH_SRQ] } }; #define PPC970_PME_EVENT_COUNT 215 static const int ppc970_group_event_ids[][PPC970_NUM_EVENT_COUNTERS] = { [ 0 ] = { 82, 2, 67, 30, 0, 2, 28, 29 }, [ 1 ] = { 2, 2, 37, 6, 41, 37, 63, 37 }, [ 2 ] = { 37, 
2, 37, 6, 41, 37, 63, 37 }, [ 3 ] = { 65, 64, 4, 30, 67, 65, 63, 37 }, [ 4 ] = { 27, 25, 22, 22, 3, 26, 30, 22 }, [ 5 ] = { 26, 26, 4, 30, 27, 27, 21, 43 }, [ 6 ] = { 88, 1, 3, 29, 46, 38, 30, 4 }, [ 7 ] = { 13, 21, 23, 24, 3, 37, 46, 49 }, [ 8 ] = { 38, 2, 25, 27, 35, 32, 30, 4 }, [ 9 ] = { 28, 84, 67, 10, 3, 37, 8, 10 }, [ 10 ] = { 10, 18, 17, 21, 12, 20, 30, 4 }, [ 11 ] = { 12, 20, 14, 19, 9, 17, 30, 4 }, [ 12 ] = { 9, 17, 15, 20, 3, 37, 12, 18 }, [ 13 ] = { 15, 23, 14, 19, 3, 37, 4, 16 }, [ 14 ] = { 46, 55, 4, 5, 49, 56, 30, 4 }, [ 15 ] = { 48, 57, 40, 38, 3, 37, 35, 36 }, [ 16 ] = { 49, 58, 69, 65, 3, 37, 62, 5 }, [ 17 ] = { 54, 63, 69, 65, 84, 2, 30, 4 }, [ 18 ] = { 45, 54, 4, 5, 40, 2, 31, 4 }, [ 19 ] = { 28, 65, 30, 5, 0, 37, 28, 67 }, [ 20 ] = { 27, 25, 27, 22, 3, 26, 30, 22 }, [ 21 ] = { 6, 41, 37, 63, 3, 37, 63, 37 }, [ 22 ] = { 6, 65, 37, 63, 3, 37, 63, 37 }, [ 23 ] = { 27, 25, 14, 19, 3, 27, 30, 43 }, [ 24 ] = { 37, 2, 37, 1, 84, 2, 1, 2 }, [ 25 ] = { 37, 2, 37, 1, 3, 84, 63, 37 }, [ 26 ] = { 82, 4, 0, 2, 43, 2, 30, 2 }, [ 27 ] = { 3, 37, 5, 5, 4, 3, 44, 47 }, [ 28 ] = { 6, 41, 31, 5, 68, 67, 32, 34 }, [ 29 ] = { 40, 39, 30, 30, 5, 2, 28, 5 }, [ 30 ] = { 69, 70, 37, 49, 40, 37, 4, 37 }, [ 31 ] = { 39, 36, 31, 5, 40, 2, 30, 4 }, [ 32 ] = { 28, 33, 33, 30, 41, 64, 63, 4 }, [ 33 ] = { 75, 83, 4, 51, 36, 73, 50, 30 }, [ 34 ] = { 73, 71, 4, 50, 36, 72, 49, 60 }, [ 35 ] = { 79, 2, 64, 51, 74, 78, 60, 30 }, [ 36 ] = { 80, 72, 58, 60, 3, 37, 54, 58 }, [ 37 ] = { 76, 74, 55, 57, 3, 37, 53, 57 }, [ 38 ] = { 37, 37, 27, 26, 29, 28, 24, 4 }, [ 39 ] = { 37, 2, 24, 23, 29, 28, 25, 26 }, [ 40 ] = { 39, 39, 32, 0, 40, 2, 4, 30 }, [ 41 ] = { 40, 39, 32, 0, 42, 39, 4, 30 } }; static const pmg_power_group_t ppc970_groups[] = { [ 0 ] = { .pmg_name = "pm_slice0", .pmg_desc = "Time Slice 0", .pmg_event_ids = ppc970_group_event_ids[0], .pmg_mmcr0 = 0x000000000000051eULL, .pmg_mmcr1 = 0x000000000a46f18cULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 1 ] = { .pmg_name = 
"pm_eprof", .pmg_desc = "Group for use with eprof", .pmg_event_ids = ppc970_group_event_ids[1], .pmg_mmcr0 = 0x0000000000000f1eULL, .pmg_mmcr1 = 0x4003001005f09000ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 2 ] = { .pmg_name = "pm_basic", .pmg_desc = "Basic performance indicators", .pmg_event_ids = ppc970_group_event_ids[2], .pmg_mmcr0 = 0x000000000000091eULL, .pmg_mmcr1 = 0x4003001005f09000ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 3 ] = { .pmg_name = "pm_lsu", .pmg_desc = "Information on the Load Store Unit", .pmg_event_ids = ppc970_group_event_ids[3], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000f00007a400000ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 4 ] = { .pmg_name = "pm_fpu1", .pmg_desc = "Floating Point events", .pmg_event_ids = ppc970_group_event_ids[4], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000001e0480ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 5 ] = { .pmg_name = "pm_fpu2", .pmg_desc = "Floating Point events", .pmg_event_ids = ppc970_group_event_ids[5], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000020e87a400000ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 6 ] = { .pmg_name = "pm_isu_rename", .pmg_desc = "ISU Rename Pool Events", .pmg_event_ids = ppc970_group_event_ids[6], .pmg_mmcr0 = 0x0000000000001228ULL, .pmg_mmcr1 = 0x400000218e6d84bcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 7 ] = { .pmg_name = "pm_isu_queues1", .pmg_desc = "ISU Rename Pool Events", .pmg_event_ids = ppc970_group_event_ids[7], .pmg_mmcr0 = 0x000000000000132eULL, .pmg_mmcr1 = 0x40000000851e994cULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 8 ] = { .pmg_name = "pm_isu_flow", .pmg_desc = "ISU Instruction Flow Events", .pmg_event_ids = ppc970_group_event_ids[8], .pmg_mmcr0 = 0x000000000000181eULL, .pmg_mmcr1 = 0x400000b3d7b7c4bcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 9 ] = { .pmg_name = "pm_isu_work", .pmg_desc = "ISU Indicators of Work Blockage", .pmg_event_ids = ppc970_group_event_ids[9], .pmg_mmcr0 = 0x0000000000000402ULL, .pmg_mmcr1 = 
0x400000050fde9d88ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 10 ] = { .pmg_name = "pm_fpu3", .pmg_desc = "Floating Point events by unit", .pmg_event_ids = ppc970_group_event_ids[10], .pmg_mmcr0 = 0x0000000000001028ULL, .pmg_mmcr1 = 0x000000008d6354bcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 11 ] = { .pmg_name = "pm_fpu4", .pmg_desc = "Floating Point events by unit", .pmg_event_ids = ppc970_group_event_ids[11], .pmg_mmcr0 = 0x000000000000122cULL, .pmg_mmcr1 = 0x000000009de774bcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 12 ] = { .pmg_name = "pm_fpu5", .pmg_desc = "Floating Point events by unit", .pmg_event_ids = ppc970_group_event_ids[12], .pmg_mmcr0 = 0x0000000000001838ULL, .pmg_mmcr1 = 0x000000c0851e9958ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 13 ] = { .pmg_name = "pm_fpu7", .pmg_desc = "Floating Point events by unit", .pmg_event_ids = ppc970_group_event_ids[13], .pmg_mmcr0 = 0x000000000000193aULL, .pmg_mmcr1 = 0x000000c89dde97e0ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 14 ] = { .pmg_name = "pm_lsu_flush", .pmg_desc = "LSU Flush Events", .pmg_event_ids = ppc970_group_event_ids[14], .pmg_mmcr0 = 0x000000000000122cULL, .pmg_mmcr1 = 0x000c00007be774bcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 15 ] = { .pmg_name = "pm_lsu_load1", .pmg_desc = "LSU Load Events", .pmg_event_ids = ppc970_group_event_ids[15], .pmg_mmcr0 = 0x0000000000001028ULL, .pmg_mmcr1 = 0x000f0000851e9958ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 16 ] = { .pmg_name = "pm_lsu_store1", .pmg_desc = "LSU Store Events", .pmg_event_ids = ppc970_group_event_ids[16], .pmg_mmcr0 = 0x000000000000112aULL, .pmg_mmcr1 = 0x000f00008d5e99dcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 17 ] = { .pmg_name = "pm_lsu_store2", .pmg_desc = "LSU Store Events", .pmg_event_ids = ppc970_group_event_ids[17], .pmg_mmcr0 = 0x0000000000001838ULL, .pmg_mmcr1 = 0x0003c0d08d76f4bcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 18 ] = { .pmg_name = "pm_lsu7", .pmg_desc = "Information on the Load Store Unit", 
.pmg_event_ids = ppc970_group_event_ids[18], .pmg_mmcr0 = 0x000000000000122cULL, .pmg_mmcr1 = 0x000830047bd2fe3cULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 19 ] = { .pmg_name = "pm_misc", .pmg_desc = "Misc Events for testing", .pmg_event_ids = ppc970_group_event_ids[19], .pmg_mmcr0 = 0x0000000000000404ULL, .pmg_mmcr1 = 0x0000000023c69194ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 20 ] = { .pmg_name = "pm_pe_bench1", .pmg_desc = "PE Benchmarker group for FP analysis", .pmg_event_ids = ppc970_group_event_ids[20], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x10001002001e0480ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 21 ] = { .pmg_name = "pm_pe_bench4", .pmg_desc = "PE Benchmarker group for L1 and TLB", .pmg_event_ids = ppc970_group_event_ids[21], .pmg_mmcr0 = 0x0000000000001420ULL, .pmg_mmcr1 = 0x000b000004de9000ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 22 ] = { .pmg_name = "pm_hpmcount1", .pmg_desc = "Hpmcount group for L1 and TLB behavior", .pmg_event_ids = ppc970_group_event_ids[22], .pmg_mmcr0 = 0x0000000000001404ULL, .pmg_mmcr1 = 0x000b000004de9000ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 23 ] = { .pmg_name = "pm_hpmcount2", .pmg_desc = "Hpmcount group for computation", .pmg_event_ids = ppc970_group_event_ids[23], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000020289dde0480ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 24 ] = { .pmg_name = "pm_l1andbr", .pmg_desc = "L1 misses and branch misspredict analysis", .pmg_event_ids = ppc970_group_event_ids[24], .pmg_mmcr0 = 0x000000000000091eULL, .pmg_mmcr1 = 0x8003c01d0636fce8ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 25 ] = { .pmg_name = "Instruction mix: loads", .pmg_desc = " stores and branches", .pmg_event_ids = ppc970_group_event_ids[25], .pmg_mmcr0 = 0x000000000000091eULL, .pmg_mmcr1 = 0x8003c021061fb000ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 26 ] = { .pmg_name = "pm_branch", .pmg_desc = "SLB and branch misspredict analysis", .pmg_event_ids = ppc970_group_event_ids[26], .pmg_mmcr0 = 
0x000000000000052aULL, .pmg_mmcr1 = 0x8008000bc662f4e8ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 27 ] = { .pmg_name = "pm_data", .pmg_desc = "data source and LMQ", .pmg_event_ids = ppc970_group_event_ids[27], .pmg_mmcr0 = 0x0000000000000712ULL, .pmg_mmcr1 = 0x0000300e3bce7f74ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 28 ] = { .pmg_name = "pm_tlb", .pmg_desc = "TLB and LRQ plus data prefetch", .pmg_event_ids = ppc970_group_event_ids[28], .pmg_mmcr0 = 0x0000000000001420ULL, .pmg_mmcr1 = 0x0008e03c4bfdacecULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 29 ] = { .pmg_name = "pm_isource", .pmg_desc = "inst source and tablewalk", .pmg_event_ids = ppc970_group_event_ids[29], .pmg_mmcr0 = 0x000000000000060cULL, .pmg_mmcr1 = 0x800b00c0226ef1dcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 30 ] = { .pmg_name = "pm_sync", .pmg_desc = "Sync and SRQ", .pmg_event_ids = ppc970_group_event_ids[30], .pmg_mmcr0 = 0x0000000000001d32ULL, .pmg_mmcr1 = 0x0003e0c107529780ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 31 ] = { .pmg_name = "pm_ierat", .pmg_desc = "IERAT", .pmg_event_ids = ppc970_group_event_ids[31], .pmg_mmcr0 = 0x0000000000000d3eULL, .pmg_mmcr1 = 0x800000c04bd2f4bcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 32 ] = { .pmg_name = "pm_derat", .pmg_desc = "DERAT", .pmg_event_ids = ppc970_group_event_ids[32], .pmg_mmcr0 = 0x0000000000000436ULL, .pmg_mmcr1 = 0x100b7052e274003cULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 33 ] = { .pmg_name = "pm_mark1", .pmg_desc = "Information on marked instructions", .pmg_event_ids = ppc970_group_event_ids[33], .pmg_mmcr0 = 0x0000000000000006ULL, .pmg_mmcr1 = 0x00008080790852a4ULL, .pmg_mmcra = 0x0000000000002001ULL }, [ 34 ] = { .pmg_name = "pm_mark2", .pmg_desc = "Marked Instructions Processing Flow", .pmg_event_ids = ppc970_group_event_ids[34], .pmg_mmcr0 = 0x000000000000020aULL, .pmg_mmcr1 = 0x0000000079484210ULL, .pmg_mmcra = 0x0000000000002001ULL }, [ 35 ] = { .pmg_name = "pm_mark3", .pmg_desc = "Marked Stores Processing Flow", 
.pmg_event_ids = ppc970_group_event_ids[35], .pmg_mmcr0 = 0x000000000000031eULL, .pmg_mmcr1 = 0x00203004190a3f24ULL, .pmg_mmcra = 0x0000000000002001ULL }, [ 36 ] = { .pmg_name = "pm_lsu_mark1", .pmg_desc = "Load Store Unit Marked Events", .pmg_event_ids = ppc970_group_event_ids[36], .pmg_mmcr0 = 0x0000000000001b34ULL, .pmg_mmcr1 = 0x000280c08d5e9850ULL, .pmg_mmcra = 0x0000000000002001ULL }, [ 37 ] = { .pmg_name = "pm_lsu_mark2", .pmg_desc = "Load Store Unit Marked Events", .pmg_event_ids = ppc970_group_event_ids[37], .pmg_mmcr0 = 0x0000000000001838ULL, .pmg_mmcr1 = 0x000280c0959e99dcULL, .pmg_mmcra = 0x0000000000002001ULL }, [ 38 ] = { .pmg_name = "pm_fxu1", .pmg_desc = "Fixed Point events by unit", .pmg_event_ids = ppc970_group_event_ids[38], .pmg_mmcr0 = 0x0000000000000912ULL, .pmg_mmcr1 = 0x100010020084213cULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 39 ] = { .pmg_name = "pm_fxu2", .pmg_desc = "Fixed Point events by unit", .pmg_event_ids = ppc970_group_event_ids[39], .pmg_mmcr0 = 0x000000000000091eULL, .pmg_mmcr1 = 0x4000000ca4042d78ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 40 ] = { .pmg_name = "pm_ifu", .pmg_desc = "Instruction Fetch Unit events", .pmg_event_ids = ppc970_group_event_ids[40], .pmg_mmcr0 = 0x0000000000000d0cULL, .pmg_mmcr1 = 0x800000c06b52f7a4ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 41 ] = { .pmg_name = "pm_L1_icm", .pmg_desc = " Level 1 instruction cache misses", .pmg_event_ids = ppc970_group_event_ids[41], .pmg_mmcr0 = 0x000000000000060cULL, .pmg_mmcr1 = 0x800000f06b4c67a4ULL, .pmg_mmcra = 0x0000000000002000ULL } }; #endif papi-papi-7-2-0-t/src/libperfnec/lib/ppc970mp_events.h000066400000000000000000004454411502707512200224360ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __PPC970MP_EVENTS_H__ #define __PPC970MP_EVENTS_H__ /* * File: ppc970mp_events.h * CVS: * Author: Corey Ashford * cjashfor@us.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2007. 
All Rights Reserved. * Contributed by Corey Ashford * * Note: This code was automatically generated and should not be modified by * hand. * */ #define PPC970MP_PME_PM_LSU_REJECT_RELOAD_CDF 0 #define PPC970MP_PME_PM_MRK_LSU_SRQ_INST_VALID 1 #define PPC970MP_PME_PM_FPU1_SINGLE 2 #define PPC970MP_PME_PM_FPU0_STALL3 3 #define PPC970MP_PME_PM_TB_BIT_TRANS 4 #define PPC970MP_PME_PM_GPR_MAP_FULL_CYC 5 #define PPC970MP_PME_PM_MRK_ST_CMPL 6 #define PPC970MP_PME_PM_FPU0_STF 7 #define PPC970MP_PME_PM_FPU1_FMA 8 #define PPC970MP_PME_PM_LSU1_FLUSH_ULD 9 #define PPC970MP_PME_PM_MRK_INST_FIN 10 #define PPC970MP_PME_PM_MRK_LSU0_FLUSH_UST 11 #define PPC970MP_PME_PM_LSU_LRQ_S0_ALLOC 12 #define PPC970MP_PME_PM_FPU_FDIV 13 #define PPC970MP_PME_PM_FPU0_FULL_CYC 14 #define PPC970MP_PME_PM_FPU_SINGLE 15 #define PPC970MP_PME_PM_FPU0_FMA 16 #define PPC970MP_PME_PM_MRK_LSU1_FLUSH_ULD 17 #define PPC970MP_PME_PM_LSU1_FLUSH_LRQ 18 #define PPC970MP_PME_PM_DTLB_MISS 19 #define PPC970MP_PME_PM_CMPLU_STALL_FXU 20 #define PPC970MP_PME_PM_MRK_ST_MISS_L1 21 #define PPC970MP_PME_PM_EXT_INT 22 #define PPC970MP_PME_PM_MRK_LSU1_FLUSH_LRQ 23 #define PPC970MP_PME_PM_MRK_ST_GPS 24 #define PPC970MP_PME_PM_GRP_DISP_SUCCESS 25 #define PPC970MP_PME_PM_LSU1_LDF 26 #define PPC970MP_PME_PM_LSU0_SRQ_STFWD 27 #define PPC970MP_PME_PM_CR_MAP_FULL_CYC 28 #define PPC970MP_PME_PM_MRK_LSU0_FLUSH_ULD 29 #define PPC970MP_PME_PM_LSU_DERAT_MISS 30 #define PPC970MP_PME_PM_FPU0_SINGLE 31 #define PPC970MP_PME_PM_FPU1_FDIV 32 #define PPC970MP_PME_PM_FPU1_FEST 33 #define PPC970MP_PME_PM_FPU0_FRSP_FCONV 34 #define PPC970MP_PME_PM_GCT_EMPTY_SRQ_FULL 35 #define PPC970MP_PME_PM_MRK_ST_CMPL_INT 36 #define PPC970MP_PME_PM_FLUSH_BR_MPRED 37 #define PPC970MP_PME_PM_FXU_FIN 38 #define PPC970MP_PME_PM_FPU_STF 39 #define PPC970MP_PME_PM_DSLB_MISS 40 #define PPC970MP_PME_PM_FXLS1_FULL_CYC 41 #define PPC970MP_PME_PM_CMPLU_STALL_FPU 42 #define PPC970MP_PME_PM_LSU_LMQ_LHR_MERGE 43 #define PPC970MP_PME_PM_MRK_STCX_FAIL 44 #define 
PPC970MP_PME_PM_FXU0_BUSY_FXU1_IDLE 45 #define PPC970MP_PME_PM_CMPLU_STALL_LSU 46 #define PPC970MP_PME_PM_MRK_DATA_FROM_L25_SHR 47 #define PPC970MP_PME_PM_LSU_FLUSH_ULD 48 #define PPC970MP_PME_PM_MRK_BRU_FIN 49 #define PPC970MP_PME_PM_IERAT_XLATE_WR 50 #define PPC970MP_PME_PM_GCT_EMPTY_BR_MPRED 51 #define PPC970MP_PME_PM_LSU0_BUSY 52 #define PPC970MP_PME_PM_DATA_FROM_MEM 53 #define PPC970MP_PME_PM_FPR_MAP_FULL_CYC 54 #define PPC970MP_PME_PM_FPU1_FULL_CYC 55 #define PPC970MP_PME_PM_FPU0_FIN 56 #define PPC970MP_PME_PM_GRP_BR_REDIR 57 #define PPC970MP_PME_PM_GCT_EMPTY_IC_MISS 58 #define PPC970MP_PME_PM_THRESH_TIMEO 59 #define PPC970MP_PME_PM_FPU_FSQRT 60 #define PPC970MP_PME_PM_MRK_LSU0_FLUSH_LRQ 61 #define PPC970MP_PME_PM_PMC1_OVERFLOW 62 #define PPC970MP_PME_PM_FXLS0_FULL_CYC 63 #define PPC970MP_PME_PM_FPU0_ALL 64 #define PPC970MP_PME_PM_DATA_TABLEWALK_CYC 65 #define PPC970MP_PME_PM_FPU0_FEST 66 #define PPC970MP_PME_PM_DATA_FROM_L25_MOD 67 #define PPC970MP_PME_PM_LSU0_REJECT_ERAT_MISS 68 #define PPC970MP_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 69 #define PPC970MP_PME_PM_LSU0_REJECT_RELOAD_CDF 70 #define PPC970MP_PME_PM_FPU_FEST 71 #define PPC970MP_PME_PM_0INST_FETCH 72 #define PPC970MP_PME_PM_LD_MISS_L1_LSU0 73 #define PPC970MP_PME_PM_LSU1_REJECT_RELOAD_CDF 74 #define PPC970MP_PME_PM_L1_PREF 75 #define PPC970MP_PME_PM_FPU1_STALL3 76 #define PPC970MP_PME_PM_BRQ_FULL_CYC 77 #define PPC970MP_PME_PM_PMC8_OVERFLOW 78 #define PPC970MP_PME_PM_PMC7_OVERFLOW 79 #define PPC970MP_PME_PM_WORK_HELD 80 #define PPC970MP_PME_PM_MRK_LD_MISS_L1_LSU0 81 #define PPC970MP_PME_PM_FXU_IDLE 82 #define PPC970MP_PME_PM_INST_CMPL 83 #define PPC970MP_PME_PM_LSU1_FLUSH_UST 84 #define PPC970MP_PME_PM_LSU0_FLUSH_ULD 85 #define PPC970MP_PME_PM_LSU_FLUSH 86 #define PPC970MP_PME_PM_INST_FROM_L2 87 #define PPC970MP_PME_PM_LSU1_REJECT_LMQ_FULL 88 #define PPC970MP_PME_PM_PMC2_OVERFLOW 89 #define PPC970MP_PME_PM_FPU0_DENORM 90 #define PPC970MP_PME_PM_FPU1_FMOV_FEST 91 #define PPC970MP_PME_PM_INST_FETCH_CYC 92 
#define PPC970MP_PME_PM_GRP_DISP_REJECT 93 #define PPC970MP_PME_PM_LSU_LDF 94 #define PPC970MP_PME_PM_INST_DISP 95 #define PPC970MP_PME_PM_DATA_FROM_L25_SHR 96 #define PPC970MP_PME_PM_L1_DCACHE_RELOAD_VALID 97 #define PPC970MP_PME_PM_MRK_GRP_ISSUED 98 #define PPC970MP_PME_PM_FPU_FMA 99 #define PPC970MP_PME_PM_MRK_CRU_FIN 100 #define PPC970MP_PME_PM_CMPLU_STALL_REJECT 101 #define PPC970MP_PME_PM_MRK_LSU1_FLUSH_UST 102 #define PPC970MP_PME_PM_MRK_FXU_FIN 103 #define PPC970MP_PME_PM_LSU1_REJECT_ERAT_MISS 104 #define PPC970MP_PME_PM_BR_ISSUED 105 #define PPC970MP_PME_PM_PMC4_OVERFLOW 106 #define PPC970MP_PME_PM_EE_OFF 107 #define PPC970MP_PME_PM_INST_FROM_L25_MOD 108 #define PPC970MP_PME_PM_CMPLU_STALL_ERAT_MISS 109 #define PPC970MP_PME_PM_ITLB_MISS 110 #define PPC970MP_PME_PM_FXU1_BUSY_FXU0_IDLE 111 #define PPC970MP_PME_PM_GRP_DISP_VALID 112 #define PPC970MP_PME_PM_MRK_GRP_DISP 113 #define PPC970MP_PME_PM_LSU_FLUSH_UST 114 #define PPC970MP_PME_PM_FXU1_FIN 115 #define PPC970MP_PME_PM_GRP_CMPL 116 #define PPC970MP_PME_PM_FPU_FRSP_FCONV 117 #define PPC970MP_PME_PM_MRK_LSU0_FLUSH_SRQ 118 #define PPC970MP_PME_PM_CMPLU_STALL_OTHER 119 #define PPC970MP_PME_PM_LSU_LMQ_FULL_CYC 120 #define PPC970MP_PME_PM_ST_REF_L1_LSU0 121 #define PPC970MP_PME_PM_LSU0_DERAT_MISS 122 #define PPC970MP_PME_PM_LSU_SRQ_SYNC_CYC 123 #define PPC970MP_PME_PM_FPU_STALL3 124 #define PPC970MP_PME_PM_LSU_REJECT_ERAT_MISS 125 #define PPC970MP_PME_PM_MRK_DATA_FROM_L2 126 #define PPC970MP_PME_PM_LSU0_FLUSH_SRQ 127 #define PPC970MP_PME_PM_FPU0_FMOV_FEST 128 #define PPC970MP_PME_PM_IOPS_CMPL 129 #define PPC970MP_PME_PM_LD_REF_L1_LSU0 130 #define PPC970MP_PME_PM_LSU1_FLUSH_SRQ 131 #define PPC970MP_PME_PM_CMPLU_STALL_DIV 132 #define PPC970MP_PME_PM_GRP_BR_MPRED 133 #define PPC970MP_PME_PM_LSU_LMQ_S0_ALLOC 134 #define PPC970MP_PME_PM_LSU0_REJECT_LMQ_FULL 135 #define PPC970MP_PME_PM_ST_REF_L1 136 #define PPC970MP_PME_PM_MRK_VMX_FIN 137 #define PPC970MP_PME_PM_LSU_SRQ_EMPTY_CYC 138 #define PPC970MP_PME_PM_FPU1_STF 
139 #define PPC970MP_PME_PM_RUN_CYC 140 #define PPC970MP_PME_PM_LSU_LMQ_S0_VALID 141 #define PPC970MP_PME_PM_LSU0_LDF 142 #define PPC970MP_PME_PM_LSU_LRQ_S0_VALID 143 #define PPC970MP_PME_PM_PMC3_OVERFLOW 144 #define PPC970MP_PME_PM_MRK_IMR_RELOAD 145 #define PPC970MP_PME_PM_MRK_GRP_TIMEO 146 #define PPC970MP_PME_PM_FPU_FMOV_FEST 147 #define PPC970MP_PME_PM_GRP_DISP_BLK_SB_CYC 148 #define PPC970MP_PME_PM_XER_MAP_FULL_CYC 149 #define PPC970MP_PME_PM_ST_MISS_L1 150 #define PPC970MP_PME_PM_STOP_COMPLETION 151 #define PPC970MP_PME_PM_MRK_GRP_CMPL 152 #define PPC970MP_PME_PM_ISLB_MISS 153 #define PPC970MP_PME_PM_SUSPENDED 154 #define PPC970MP_PME_PM_CYC 155 #define PPC970MP_PME_PM_LD_MISS_L1_LSU1 156 #define PPC970MP_PME_PM_STCX_FAIL 157 #define PPC970MP_PME_PM_LSU1_SRQ_STFWD 158 #define PPC970MP_PME_PM_GRP_DISP 159 #define PPC970MP_PME_PM_L2_PREF 160 #define PPC970MP_PME_PM_FPU1_DENORM 161 #define PPC970MP_PME_PM_DATA_FROM_L2 162 #define PPC970MP_PME_PM_FPU0_FPSCR 163 #define PPC970MP_PME_PM_MRK_DATA_FROM_L25_MOD 164 #define PPC970MP_PME_PM_FPU0_FSQRT 165 #define PPC970MP_PME_PM_LD_REF_L1 166 #define PPC970MP_PME_PM_MRK_L1_RELOAD_VALID 167 #define PPC970MP_PME_PM_1PLUS_PPC_CMPL 168 #define PPC970MP_PME_PM_INST_FROM_L1 169 #define PPC970MP_PME_PM_EE_OFF_EXT_INT 170 #define PPC970MP_PME_PM_PMC6_OVERFLOW 171 #define PPC970MP_PME_PM_LSU_LRQ_FULL_CYC 172 #define PPC970MP_PME_PM_IC_PREF_INSTALL 173 #define PPC970MP_PME_PM_DC_PREF_OUT_OF_STREAMS 174 #define PPC970MP_PME_PM_MRK_LSU1_FLUSH_SRQ 175 #define PPC970MP_PME_PM_GCT_FULL_CYC 176 #define PPC970MP_PME_PM_INST_FROM_MEM 177 #define PPC970MP_PME_PM_FLUSH_LSU_BR_MPRED 178 #define PPC970MP_PME_PM_FXU_BUSY 179 #define PPC970MP_PME_PM_ST_REF_L1_LSU1 180 #define PPC970MP_PME_PM_MRK_LD_MISS_L1 181 #define PPC970MP_PME_PM_L1_WRITE_CYC 182 #define PPC970MP_PME_PM_LSU1_BUSY 183 #define PPC970MP_PME_PM_LSU_REJECT_LMQ_FULL 184 #define PPC970MP_PME_PM_CMPLU_STALL_FDIV 185 #define PPC970MP_PME_PM_FPU_ALL 186 #define 
PPC970MP_PME_PM_LSU_SRQ_S0_ALLOC 187 #define PPC970MP_PME_PM_INST_FROM_L25_SHR 188 #define PPC970MP_PME_PM_GRP_MRK 189 #define PPC970MP_PME_PM_BR_MPRED_CR 190 #define PPC970MP_PME_PM_DC_PREF_STREAM_ALLOC 191 #define PPC970MP_PME_PM_FPU1_FIN 192 #define PPC970MP_PME_PM_LSU_REJECT_SRQ 193 #define PPC970MP_PME_PM_BR_MPRED_TA 194 #define PPC970MP_PME_PM_CRQ_FULL_CYC 195 #define PPC970MP_PME_PM_LD_MISS_L1 196 #define PPC970MP_PME_PM_INST_FROM_PREF 197 #define PPC970MP_PME_PM_STCX_PASS 198 #define PPC970MP_PME_PM_DC_INV_L2 199 #define PPC970MP_PME_PM_LSU_SRQ_FULL_CYC 200 #define PPC970MP_PME_PM_LSU0_FLUSH_LRQ 201 #define PPC970MP_PME_PM_LSU_SRQ_S0_VALID 202 #define PPC970MP_PME_PM_LARX_LSU0 203 #define PPC970MP_PME_PM_GCT_EMPTY_CYC 204 #define PPC970MP_PME_PM_FPU1_ALL 205 #define PPC970MP_PME_PM_FPU1_FSQRT 206 #define PPC970MP_PME_PM_FPU_FIN 207 #define PPC970MP_PME_PM_LSU_SRQ_STFWD 208 #define PPC970MP_PME_PM_MRK_LD_MISS_L1_LSU1 209 #define PPC970MP_PME_PM_FXU0_FIN 210 #define PPC970MP_PME_PM_MRK_FPU_FIN 211 #define PPC970MP_PME_PM_PMC5_OVERFLOW 212 #define PPC970MP_PME_PM_SNOOP_TLBIE 213 #define PPC970MP_PME_PM_FPU1_FRSP_FCONV 214 #define PPC970MP_PME_PM_FPU0_FDIV 215 #define PPC970MP_PME_PM_LD_REF_L1_LSU1 216 #define PPC970MP_PME_PM_HV_CYC 217 #define PPC970MP_PME_PM_LR_CTR_MAP_FULL_CYC 218 #define PPC970MP_PME_PM_FPU_DENORM 219 #define PPC970MP_PME_PM_LSU0_REJECT_SRQ 220 #define PPC970MP_PME_PM_LSU1_REJECT_SRQ 221 #define PPC970MP_PME_PM_LSU1_DERAT_MISS 222 #define PPC970MP_PME_PM_IC_PREF_REQ 223 #define PPC970MP_PME_PM_MRK_LSU_FIN 224 #define PPC970MP_PME_PM_MRK_DATA_FROM_MEM 225 #define PPC970MP_PME_PM_CMPLU_STALL_DCACHE_MISS 226 #define PPC970MP_PME_PM_LSU0_FLUSH_UST 227 #define PPC970MP_PME_PM_LSU_FLUSH_LRQ 228 #define PPC970MP_PME_PM_LSU_FLUSH_SRQ 229 static const int ppc970mp_event_ids[][PPC970MP_NUM_EVENT_COUNTERS] = { [ PPC970MP_PME_PM_LSU_REJECT_RELOAD_CDF ] = { -1, -1, -1, -1, -1, 66, -1, -1 }, [ PPC970MP_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { -1, -1, 61, 61, 
-1, -1, 60, 61 }, [ PPC970MP_PME_PM_FPU1_SINGLE ] = { 23, 22, -1, -1, 23, 22, -1, -1 }, [ PPC970MP_PME_PM_FPU0_STALL3 ] = { 15, 14, -1, -1, 15, 14, -1, -1 }, [ PPC970MP_PME_PM_TB_BIT_TRANS ] = { -1, -1, -1, -1, -1, -1, -1, 67 }, [ PPC970MP_PME_PM_GPR_MAP_FULL_CYC ] = { -1, -1, 27, 28, -1, -1, 27, 27 }, [ PPC970MP_PME_PM_MRK_ST_CMPL ] = { 78, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_FPU0_STF ] = { 16, 15, -1, -1, 16, 15, -1, -1 }, [ PPC970MP_PME_PM_FPU1_FMA ] = { 20, 19, -1, -1, 20, 19, -1, -1 }, [ PPC970MP_PME_PM_LSU1_FLUSH_ULD ] = { 57, 56, -1, -1, 58, 55, -1, -1 }, [ PPC970MP_PME_PM_MRK_INST_FIN ] = { -1, -1, -1, -1, -1, -1, 50, -1 }, [ PPC970MP_PME_PM_MRK_LSU0_FLUSH_UST ] = { -1, -1, 56, 56, -1, -1, 55, 55 }, [ PPC970MP_PME_PM_LSU_LRQ_S0_ALLOC ] = { 65, 65, -1, -1, 66, 64, -1, -1 }, [ PPC970MP_PME_PM_FPU_FDIV ] = { 27, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_FPU0_FULL_CYC ] = { 13, 12, -1, -1, 13, 12, -1, -1 }, [ PPC970MP_PME_PM_FPU_SINGLE ] = { -1, -1, -1, -1, 27, -1, -1, -1 }, [ PPC970MP_PME_PM_FPU0_FMA ] = { 11, 10, -1, -1, 11, 10, -1, -1 }, [ PPC970MP_PME_PM_MRK_LSU1_FLUSH_ULD ] = { -1, -1, 59, 59, -1, -1, 58, 58 }, [ PPC970MP_PME_PM_LSU1_FLUSH_LRQ ] = { 55, 54, -1, -1, 56, 53, -1, -1 }, [ PPC970MP_PME_PM_DTLB_MISS ] = { 6, 5, -1, -1, 6, 5, -1, -1 }, [ PPC970MP_PME_PM_CMPLU_STALL_FXU ] = { -1, -1, -1, -1, 85, -1, -1, -1 }, [ PPC970MP_PME_PM_MRK_ST_MISS_L1 ] = { 79, 75, -1, -1, 76, 76, -1, -1 }, [ PPC970MP_PME_PM_EXT_INT ] = { -1, -1, -1, -1, -1, -1, -1, 10 }, [ PPC970MP_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { -1, -1, 57, 57, -1, -1, 56, 56 }, [ PPC970MP_PME_PM_MRK_ST_GPS ] = { -1, -1, -1, -1, -1, 75, -1, -1 }, [ PPC970MP_PME_PM_GRP_DISP_SUCCESS ] = { -1, -1, -1, -1, 33, -1, -1, -1 }, [ PPC970MP_PME_PM_LSU1_LDF ] = { -1, -1, 42, 40, -1, -1, 40, 41 }, [ PPC970MP_PME_PM_LSU0_SRQ_STFWD ] = { 53, 52, -1, -1, 54, 51, -1, -1 }, [ PPC970MP_PME_PM_CR_MAP_FULL_CYC ] = { 1, 1, -1, -1, 2, 1, -1, -1 }, [ PPC970MP_PME_PM_MRK_LSU0_FLUSH_ULD ] = { -1, -1, 55, 55, -1, 
-1, 54, 54 }, [ PPC970MP_PME_PM_LSU_DERAT_MISS ] = { -1, -1, -1, -1, -1, 62, -1, -1 }, [ PPC970MP_PME_PM_FPU0_SINGLE ] = { 14, 13, -1, -1, 14, 13, -1, -1 }, [ PPC970MP_PME_PM_FPU1_FDIV ] = { 19, 18, -1, -1, 19, 18, -1, -1 }, [ PPC970MP_PME_PM_FPU1_FEST ] = { -1, -1, 17, 18, -1, -1, 17, 18 }, [ PPC970MP_PME_PM_FPU0_FRSP_FCONV ] = { -1, -1, 16, 17, -1, -1, 16, 17 }, [ PPC970MP_PME_PM_GCT_EMPTY_SRQ_FULL ] = { -1, 27, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_MRK_ST_CMPL_INT ] = { -1, -1, 62, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_FLUSH_BR_MPRED ] = { -1, -1, 10, 11, -1, -1, 10, 11 }, [ PPC970MP_PME_PM_FXU_FIN ] = { -1, -1, 26, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_FPU_STF ] = { -1, -1, -1, -1, -1, 26, -1, -1 }, [ PPC970MP_PME_PM_DSLB_MISS ] = { 5, 4, -1, -1, 5, 4, -1, -1 }, [ PPC970MP_PME_PM_FXLS1_FULL_CYC ] = { -1, -1, 23, 24, -1, -1, 23, 24 }, [ PPC970MP_PME_PM_CMPLU_STALL_FPU ] = { -1, -1, -1, -1, -1, -1, 67, -1 }, [ PPC970MP_PME_PM_LSU_LMQ_LHR_MERGE ] = { -1, -1, 45, 43, -1, -1, 43, 45 }, [ PPC970MP_PME_PM_MRK_STCX_FAIL ] = { 77, 74, -1, -1, 75, 74, -1, -1 }, [ PPC970MP_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { -1, -1, -1, -1, -1, -1, 24, -1 }, [ PPC970MP_PME_PM_CMPLU_STALL_LSU ] = { -1, -1, -1, -1, 84, -1, -1, -1 }, [ PPC970MP_PME_PM_MRK_DATA_FROM_L25_SHR ] = { -1, -1, -1, -1, 92, -1, -1, -1 }, [ PPC970MP_PME_PM_LSU_FLUSH_ULD ] = { 64, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_MRK_BRU_FIN ] = { -1, 70, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_IERAT_XLATE_WR ] = { -1, -1, 70, 67, -1, -1, 72, 68 }, [ PPC970MP_PME_PM_GCT_EMPTY_BR_MPRED ] = { -1, -1, -1, -1, -1, -1, 71, -1 }, [ PPC970MP_PME_PM_LSU0_BUSY ] = { 85, 80, -1, -1, 81, 81, -1, -1 }, [ PPC970MP_PME_PM_DATA_FROM_MEM ] = { -1, 87, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_FPR_MAP_FULL_CYC ] = { 7, 6, -1, -1, 7, 6, -1, -1 }, [ PPC970MP_PME_PM_FPU1_FULL_CYC ] = { 22, 21, -1, -1, 22, 21, -1, -1 }, [ PPC970MP_PME_PM_FPU0_FIN ] = { -1, -1, 13, 14, -1, -1, 13, 14 }, [ PPC970MP_PME_PM_GRP_BR_REDIR ] = { 31, 
30, -1, -1, 31, 30, -1, -1 }, [ PPC970MP_PME_PM_GCT_EMPTY_IC_MISS ] = { -1, -1, -1, -1, 88, -1, -1, -1 }, [ PPC970MP_PME_PM_THRESH_TIMEO ] = { -1, 82, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_FPU_FSQRT ] = { -1, -1, -1, -1, -1, 25, -1, -1 }, [ PPC970MP_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { -1, -1, 53, 53, -1, -1, 52, 52 }, [ PPC970MP_PME_PM_PMC1_OVERFLOW ] = { -1, 76, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_FXLS0_FULL_CYC ] = { -1, -1, 22, 23, -1, -1, 22, 23 }, [ PPC970MP_PME_PM_FPU0_ALL ] = { 8, 7, -1, -1, 8, 7, -1, -1 }, [ PPC970MP_PME_PM_DATA_TABLEWALK_CYC ] = { 4, 3, -1, -1, 4, 3, -1, -1 }, [ PPC970MP_PME_PM_FPU0_FEST ] = { -1, -1, 12, 13, -1, -1, 12, 13 }, [ PPC970MP_PME_PM_DATA_FROM_L25_MOD ] = { -1, -1, -1, -1, -1, 87, -1, -1 }, [ PPC970MP_PME_PM_LSU0_REJECT_ERAT_MISS ] = { 49, 48, -1, -1, 50, 47, -1, -1 }, [ PPC970MP_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { -1, 64, 48, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_LSU0_REJECT_RELOAD_CDF ] = { 51, 50, -1, -1, 52, 49, -1, -1 }, [ PPC970MP_PME_PM_FPU_FEST ] = { -1, -1, 21, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_0INST_FETCH ] = { -1, -1, -1, 0, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_LD_MISS_L1_LSU0 ] = { -1, -1, 37, 35, -1, -1, 35, 35 }, [ PPC970MP_PME_PM_LSU1_REJECT_RELOAD_CDF ] = { 61, 60, -1, -1, 62, 59, -1, -1 }, [ PPC970MP_PME_PM_L1_PREF ] = { -1, -1, 33, 32, -1, -1, 32, 32 }, [ PPC970MP_PME_PM_FPU1_STALL3 ] = { 24, 23, -1, -1, 24, 23, -1, -1 }, [ PPC970MP_PME_PM_BRQ_FULL_CYC ] = { 0, 0, -1, -1, 1, 0, -1, -1 }, [ PPC970MP_PME_PM_PMC8_OVERFLOW ] = { 80, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_PMC7_OVERFLOW ] = { -1, -1, -1, -1, -1, -1, -1, 62 }, [ PPC970MP_PME_PM_WORK_HELD ] = { -1, 83, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { 75, 72, -1, -1, 73, 72, -1, -1 }, [ PPC970MP_PME_PM_FXU_IDLE ] = { -1, -1, -1, -1, 28, -1, -1, -1 }, [ PPC970MP_PME_PM_INST_CMPL ] = { 36, 36, 30, 30, 38, 35, 30, 30 }, [ PPC970MP_PME_PM_LSU1_FLUSH_UST ] = { 58, 57, -1, -1, 59, 56, -1, -1 }, [ 
PPC970MP_PME_PM_LSU0_FLUSH_ULD ] = { 47, 46, -1, -1, 48, 45, -1, -1 }, [ PPC970MP_PME_PM_LSU_FLUSH ] = { -1, -1, 43, 41, -1, -1, 41, 42 }, [ PPC970MP_PME_PM_INST_FROM_L2 ] = { 39, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_LSU1_REJECT_LMQ_FULL ] = { 60, 59, -1, -1, 61, 58, -1, -1 }, [ PPC970MP_PME_PM_PMC2_OVERFLOW ] = { -1, -1, 64, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_FPU0_DENORM ] = { 9, 8, -1, -1, 9, 8, -1, -1 }, [ PPC970MP_PME_PM_FPU1_FMOV_FEST ] = { -1, -1, 19, 20, -1, -1, 19, 20 }, [ PPC970MP_PME_PM_INST_FETCH_CYC ] = { 90, 86, -1, -1, 90, 85, -1, -1 }, [ PPC970MP_PME_PM_GRP_DISP_REJECT ] = { 32, 32, -1, -1, 32, 31, -1, 29 }, [ PPC970MP_PME_PM_LSU_LDF ] = { -1, -1, -1, -1, -1, -1, -1, 43 }, [ PPC970MP_PME_PM_INST_DISP ] = { 37, 37, -1, -1, 39, 36, -1, -1 }, [ PPC970MP_PME_PM_DATA_FROM_L25_SHR ] = { -1, -1, -1, -1, 91, -1, -1, -1 }, [ PPC970MP_PME_PM_L1_DCACHE_RELOAD_VALID ] = { -1, -1, 32, 31, -1, -1, 31, 31 }, [ PPC970MP_PME_PM_MRK_GRP_ISSUED ] = { -1, -1, -1, -1, -1, 70, -1, -1 }, [ PPC970MP_PME_PM_FPU_FMA ] = { -1, 25, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_MRK_CRU_FIN ] = { -1, -1, -1, 50, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_CMPLU_STALL_REJECT ] = { -1, -1, -1, -1, -1, -1, 69, -1 }, [ PPC970MP_PME_PM_MRK_LSU1_FLUSH_UST ] = { -1, -1, 60, 60, -1, -1, 59, 59 }, [ PPC970MP_PME_PM_MRK_FXU_FIN ] = { -1, -1, -1, -1, -1, 69, -1, -1 }, [ PPC970MP_PME_PM_LSU1_REJECT_ERAT_MISS ] = { 59, 58, -1, -1, 60, 57, -1, -1 }, [ PPC970MP_PME_PM_BR_ISSUED ] = { -1, -1, 0, 1, -1, -1, 0, 0 }, [ PPC970MP_PME_PM_PMC4_OVERFLOW ] = { -1, -1, -1, -1, 77, -1, -1, -1 }, [ PPC970MP_PME_PM_EE_OFF ] = { -1, -1, 8, 9, -1, -1, 8, 8 }, [ PPC970MP_PME_PM_INST_FROM_L25_MOD ] = { -1, -1, -1, -1, -1, 37, -1, -1 }, [ PPC970MP_PME_PM_CMPLU_STALL_ERAT_MISS ] = { -1, -1, -1, -1, -1, -1, 70, -1 }, [ PPC970MP_PME_PM_ITLB_MISS ] = { 41, 40, -1, -1, 42, 39, -1, -1 }, [ PPC970MP_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { -1, -1, -1, 26, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_GRP_DISP_VALID ] = { 33, 33, 
-1, -1, 34, 32, -1, -1 }, [ PPC970MP_PME_PM_MRK_GRP_DISP ] = { 72, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_LSU_FLUSH_UST ] = { -1, 63, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_FXU1_FIN ] = { -1, -1, 25, 27, -1, -1, 26, 26 }, [ PPC970MP_PME_PM_GRP_CMPL ] = { -1, -1, -1, -1, -1, -1, 28, -1 }, [ PPC970MP_PME_PM_FPU_FRSP_FCONV ] = { -1, -1, -1, -1, -1, -1, 21, -1 }, [ PPC970MP_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { -1, -1, 54, 54, -1, -1, 53, 53 }, [ PPC970MP_PME_PM_CMPLU_STALL_OTHER ] = { 88, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_LSU_LMQ_FULL_CYC ] = { -1, -1, 44, 42, -1, -1, 42, 44 }, [ PPC970MP_PME_PM_ST_REF_L1_LSU0 ] = { -1, -1, 67, 64, -1, -1, 64, 64 }, [ PPC970MP_PME_PM_LSU0_DERAT_MISS ] = { 44, 43, -1, -1, 45, 42, -1, -1 }, [ PPC970MP_PME_PM_LSU_SRQ_SYNC_CYC ] = { -1, -1, 51, 49, -1, -1, 48, 50 }, [ PPC970MP_PME_PM_FPU_STALL3 ] = { -1, 26, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_LSU_REJECT_ERAT_MISS ] = { -1, -1, -1, -1, 68, -1, -1, -1 }, [ PPC970MP_PME_PM_MRK_DATA_FROM_L2 ] = { 71, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_LSU0_FLUSH_SRQ ] = { 46, 45, -1, -1, 47, 44, -1, -1 }, [ PPC970MP_PME_PM_FPU0_FMOV_FEST ] = { -1, -1, 14, 15, -1, -1, 14, 15 }, [ PPC970MP_PME_PM_IOPS_CMPL ] = { 91, -1, -1, 68, -1, 86, 73, 69 }, [ PPC970MP_PME_PM_LD_REF_L1_LSU0 ] = { -1, -1, 39, 37, -1, -1, 37, 38 }, [ PPC970MP_PME_PM_LSU1_FLUSH_SRQ ] = { 56, 55, -1, -1, 57, 54, -1, -1 }, [ PPC970MP_PME_PM_CMPLU_STALL_DIV ] = { -1, -1, -1, -1, -1, -1, 68, -1 }, [ PPC970MP_PME_PM_GRP_BR_MPRED ] = { 30, 29, -1, -1, 30, 29, -1, -1 }, [ PPC970MP_PME_PM_LSU_LMQ_S0_ALLOC ] = { -1, -1, 46, 44, -1, -1, 44, 46 }, [ PPC970MP_PME_PM_LSU0_REJECT_LMQ_FULL ] = { 50, 49, -1, -1, 51, 48, -1, -1 }, [ PPC970MP_PME_PM_ST_REF_L1 ] = { -1, -1, -1, -1, -1, -1, 63, -1 }, [ PPC970MP_PME_PM_MRK_VMX_FIN ] = { -1, -1, 63, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_LSU_SRQ_EMPTY_CYC ] = { -1, -1, -1, 47, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_FPU1_STF ] = { 25, 24, -1, -1, 25, 24, -1, -1 }, [ 
PPC970MP_PME_PM_RUN_CYC ] = { 81, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_LSU_LMQ_S0_VALID ] = { -1, -1, 47, 45, -1, -1, 45, 47 }, [ PPC970MP_PME_PM_LSU0_LDF ] = { -1, -1, 41, 39, -1, -1, 39, 40 }, [ PPC970MP_PME_PM_LSU_LRQ_S0_VALID ] = { 66, 66, -1, -1, 67, 65, -1, -1 }, [ PPC970MP_PME_PM_PMC3_OVERFLOW ] = { -1, -1, -1, 62, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_MRK_IMR_RELOAD ] = { 73, 71, -1, -1, 72, 71, -1, -1 }, [ PPC970MP_PME_PM_MRK_GRP_TIMEO ] = { -1, -1, -1, -1, 71, -1, -1, -1 }, [ PPC970MP_PME_PM_FPU_FMOV_FEST ] = { -1, -1, -1, -1, -1, -1, -1, 22 }, [ PPC970MP_PME_PM_GRP_DISP_BLK_SB_CYC ] = { -1, -1, 28, 29, -1, -1, 29, 28 }, [ PPC970MP_PME_PM_XER_MAP_FULL_CYC ] = { 87, 84, -1, -1, 83, 83, -1, -1 }, [ PPC970MP_PME_PM_ST_MISS_L1 ] = { -1, -1, 66, 63, -1, -1, 62, 63 }, [ PPC970MP_PME_PM_STOP_COMPLETION ] = { -1, -1, 65, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_MRK_GRP_CMPL ] = { -1, -1, -1, 51, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_ISLB_MISS ] = { 40, 39, -1, -1, 41, 38, -1, -1 }, [ PPC970MP_PME_PM_SUSPENDED ] = { 86, 81, 69, 66, 82, 82, 66, 66 }, [ PPC970MP_PME_PM_CYC ] = { 2, 2, 4, 5, 3, 2, 4, 4 }, [ PPC970MP_PME_PM_LD_MISS_L1_LSU1 ] = { -1, -1, 38, 36, -1, -1, 36, 36 }, [ PPC970MP_PME_PM_STCX_FAIL ] = { 83, 78, -1, -1, 79, 79, -1, -1 }, [ PPC970MP_PME_PM_LSU1_SRQ_STFWD ] = { 63, 62, -1, -1, 64, 61, -1, -1 }, [ PPC970MP_PME_PM_GRP_DISP ] = { -1, 31, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_L2_PREF ] = { -1, -1, 35, 34, -1, -1, 34, 34 }, [ PPC970MP_PME_PM_FPU1_DENORM ] = { 18, 17, -1, -1, 18, 17, -1, -1 }, [ PPC970MP_PME_PM_DATA_FROM_L2 ] = { 3, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_FPU0_FPSCR ] = { -1, -1, 15, 16, -1, -1, 15, 16 }, [ PPC970MP_PME_PM_MRK_DATA_FROM_L25_MOD ] = { -1, -1, -1, -1, -1, 88, -1, -1 }, [ PPC970MP_PME_PM_FPU0_FSQRT ] = { 12, 11, -1, -1, 12, 11, -1, -1 }, [ PPC970MP_PME_PM_LD_REF_L1 ] = { -1, -1, -1, -1, -1, -1, -1, 37 }, [ PPC970MP_PME_PM_MRK_L1_RELOAD_VALID ] = { -1, -1, 52, 52, -1, -1, 51, 51 }, [ 
PPC970MP_PME_PM_1PLUS_PPC_CMPL ] = { -1, -1, -1, -1, 0, -1, -1, -1 }, [ PPC970MP_PME_PM_INST_FROM_L1 ] = { 38, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_EE_OFF_EXT_INT ] = { -1, -1, 9, 10, -1, -1, 9, 9 }, [ PPC970MP_PME_PM_PMC6_OVERFLOW ] = { -1, -1, -1, -1, -1, -1, 61, -1 }, [ PPC970MP_PME_PM_LSU_LRQ_FULL_CYC ] = { -1, -1, 49, 46, -1, -1, 46, 48 }, [ PPC970MP_PME_PM_IC_PREF_INSTALL ] = { 34, 34, -1, -1, 36, 33, -1, -1 }, [ PPC970MP_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { -1, -1, 6, 7, -1, -1, 6, 6 }, [ PPC970MP_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { -1, -1, 58, 58, -1, -1, 57, 57 }, [ PPC970MP_PME_PM_GCT_FULL_CYC ] = { 29, 28, -1, -1, 29, 28, -1, -1 }, [ PPC970MP_PME_PM_INST_FROM_MEM ] = { -1, 38, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_FLUSH_LSU_BR_MPRED ] = { -1, -1, 11, 12, -1, -1, 11, 12 }, [ PPC970MP_PME_PM_FXU_BUSY ] = { -1, -1, -1, -1, -1, 27, -1, -1 }, [ PPC970MP_PME_PM_ST_REF_L1_LSU1 ] = { -1, -1, 68, 65, -1, -1, 65, 65 }, [ PPC970MP_PME_PM_MRK_LD_MISS_L1 ] = { 74, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_L1_WRITE_CYC ] = { -1, -1, 34, 33, -1, -1, 33, 33 }, [ PPC970MP_PME_PM_LSU1_BUSY ] = { 89, 85, -1, -1, 89, 84, -1, -1 }, [ PPC970MP_PME_PM_LSU_REJECT_LMQ_FULL ] = { -1, 67, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_CMPLU_STALL_FDIV ] = { -1, -1, -1, -1, 87, -1, -1, -1 }, [ PPC970MP_PME_PM_FPU_ALL ] = { -1, -1, -1, -1, 26, -1, -1, -1 }, [ PPC970MP_PME_PM_LSU_SRQ_S0_ALLOC ] = { 68, 68, -1, -1, 69, 67, -1, -1 }, [ PPC970MP_PME_PM_INST_FROM_L25_SHR ] = { -1, -1, -1, -1, 40, -1, -1, -1 }, [ PPC970MP_PME_PM_GRP_MRK ] = { -1, -1, -1, -1, 35, -1, -1, -1 }, [ PPC970MP_PME_PM_BR_MPRED_CR ] = { -1, -1, 1, 2, -1, -1, 1, 1 }, [ PPC970MP_PME_PM_DC_PREF_STREAM_ALLOC ] = { -1, -1, 7, 8, -1, -1, 7, 7 }, [ PPC970MP_PME_PM_FPU1_FIN ] = { -1, -1, 18, 19, -1, -1, 18, 19 }, [ PPC970MP_PME_PM_LSU_REJECT_SRQ ] = { 67, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_BR_MPRED_TA ] = { -1, -1, 2, 3, -1, -1, 2, 2 }, [ PPC970MP_PME_PM_CRQ_FULL_CYC ] = { -1, -1, 3, 4, -1, -1, 
3, 3 }, [ PPC970MP_PME_PM_LD_MISS_L1 ] = { -1, -1, 36, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_INST_FROM_PREF ] = { -1, -1, 31, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_STCX_PASS ] = { 84, 79, -1, -1, 80, 80, -1, -1 }, [ PPC970MP_PME_PM_DC_INV_L2 ] = { -1, -1, 5, 6, -1, -1, 5, 5 }, [ PPC970MP_PME_PM_LSU_SRQ_FULL_CYC ] = { -1, -1, 50, 48, -1, -1, 47, 49 }, [ PPC970MP_PME_PM_LSU0_FLUSH_LRQ ] = { 45, 44, -1, -1, 46, 43, -1, -1 }, [ PPC970MP_PME_PM_LSU_SRQ_S0_VALID ] = { 69, 69, -1, -1, 70, 68, -1, -1 }, [ PPC970MP_PME_PM_LARX_LSU0 ] = { 42, 41, -1, -1, 43, 40, -1, -1 }, [ PPC970MP_PME_PM_GCT_EMPTY_CYC ] = { 28, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_FPU1_ALL ] = { 17, 16, -1, -1, 17, 16, -1, -1 }, [ PPC970MP_PME_PM_FPU1_FSQRT ] = { 21, 20, -1, -1, 21, 20, -1, -1 }, [ PPC970MP_PME_PM_FPU_FIN ] = { -1, -1, -1, 22, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_LSU_SRQ_STFWD ] = { 70, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { 76, 73, -1, -1, 74, 73, -1, -1 }, [ PPC970MP_PME_PM_FXU0_FIN ] = { -1, -1, 24, 25, -1, -1, 25, 25 }, [ PPC970MP_PME_PM_MRK_FPU_FIN ] = { -1, -1, -1, -1, -1, -1, 49, -1 }, [ PPC970MP_PME_PM_PMC5_OVERFLOW ] = { -1, -1, -1, -1, -1, 77, -1, -1 }, [ PPC970MP_PME_PM_SNOOP_TLBIE ] = { 82, 77, -1, -1, 78, 78, -1, -1 }, [ PPC970MP_PME_PM_FPU1_FRSP_FCONV ] = { -1, -1, 20, 21, -1, -1, 20, 21 }, [ PPC970MP_PME_PM_FPU0_FDIV ] = { 10, 9, -1, -1, 10, 9, -1, -1 }, [ PPC970MP_PME_PM_LD_REF_L1_LSU1 ] = { -1, -1, 40, 38, -1, -1, 38, 39 }, [ PPC970MP_PME_PM_HV_CYC ] = { -1, -1, 29, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_LR_CTR_MAP_FULL_CYC ] = { 43, 42, -1, -1, 44, 41, -1, -1 }, [ PPC970MP_PME_PM_FPU_DENORM ] = { 26, -1, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_LSU0_REJECT_SRQ ] = { 52, 51, -1, -1, 53, 50, -1, -1 }, [ PPC970MP_PME_PM_LSU1_REJECT_SRQ ] = { 62, 61, -1, -1, 63, 60, -1, -1 }, [ PPC970MP_PME_PM_LSU1_DERAT_MISS ] = { 54, 53, -1, -1, 55, 52, -1, -1 }, [ PPC970MP_PME_PM_IC_PREF_REQ ] = { 35, 35, -1, -1, 37, 34, -1, -1 }, [ 
PPC970MP_PME_PM_MRK_LSU_FIN ] = { -1, -1, -1, -1, -1, -1, -1, 60 }, [ PPC970MP_PME_PM_MRK_DATA_FROM_MEM ] = { -1, 88, -1, -1, -1, -1, -1, -1 }, [ PPC970MP_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { -1, -1, -1, -1, 86, -1, -1, -1 }, [ PPC970MP_PME_PM_LSU0_FLUSH_UST ] = { 48, 47, -1, -1, 49, 46, -1, -1 }, [ PPC970MP_PME_PM_LSU_FLUSH_LRQ ] = { -1, -1, -1, -1, -1, 63, -1, -1 }, [ PPC970MP_PME_PM_LSU_FLUSH_SRQ ] = { -1, -1, -1, -1, 65, -1, -1, -1 } }; static const unsigned long long ppc970mp_group_vecs[][PPC970MP_NUM_GROUP_VEC] = { [ PPC970MP_PME_PM_LSU_REJECT_RELOAD_CDF ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { 0x0000000800000000ULL }, [ PPC970MP_PME_PM_FPU1_SINGLE ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_FPU0_STALL3 ] = { 0x0000000000002000ULL }, [ PPC970MP_PME_PM_TB_BIT_TRANS ] = { 0x0000000000080000ULL }, [ PPC970MP_PME_PM_GPR_MAP_FULL_CYC ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_MRK_ST_CMPL ] = { 0x0000000800000000ULL }, [ PPC970MP_PME_PM_FPU0_STF ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_FPU1_FMA ] = { 0x0000000000000400ULL }, [ PPC970MP_PME_PM_LSU1_FLUSH_ULD ] = { 0x0000000000008000ULL }, [ PPC970MP_PME_PM_MRK_INST_FIN ] = { 0x0004000200000000ULL }, [ PPC970MP_PME_PM_MRK_LSU0_FLUSH_UST ] = { 0x0000001000000000ULL }, [ PPC970MP_PME_PM_LSU_LRQ_S0_ALLOC ] = { 0x0000000010000000ULL }, [ PPC970MP_PME_PM_FPU_FDIV ] = { 0x0000100000900010ULL }, [ PPC970MP_PME_PM_FPU0_FULL_CYC ] = { 0x0000000000000080ULL }, [ PPC970MP_PME_PM_FPU_SINGLE ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_FPU0_FMA ] = { 0x0000000000000400ULL }, [ PPC970MP_PME_PM_MRK_LSU1_FLUSH_ULD ] = { 0x0000001000000000ULL }, [ PPC970MP_PME_PM_LSU1_FLUSH_LRQ ] = { 0x0000000000004000ULL }, [ PPC970MP_PME_PM_DTLB_MISS ] = { 0x0000000010600000ULL }, [ PPC970MP_PME_PM_CMPLU_STALL_FXU ] = { 0x0000080000000000ULL }, [ PPC970MP_PME_PM_MRK_ST_MISS_L1 ] = { 0x0000001000000000ULL }, [ PPC970MP_PME_PM_EXT_INT ] = { 0x0000000000000200ULL }, [ 
PPC970MP_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { 0x0000002000000000ULL }, [ PPC970MP_PME_PM_MRK_ST_GPS ] = { 0x0000000800000000ULL }, [ PPC970MP_PME_PM_GRP_DISP_SUCCESS ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_LSU1_LDF ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_LSU0_SRQ_STFWD ] = { 0x0000000000020000ULL }, [ PPC970MP_PME_PM_CR_MAP_FULL_CYC ] = { 0x0000000000000040ULL }, [ PPC970MP_PME_PM_MRK_LSU0_FLUSH_ULD ] = { 0x0000001000000000ULL }, [ PPC970MP_PME_PM_LSU_DERAT_MISS ] = { 0x0000040100000000ULL }, [ PPC970MP_PME_PM_FPU0_SINGLE ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_FPU1_FDIV ] = { 0x0000000000000400ULL }, [ PPC970MP_PME_PM_FPU1_FEST ] = { 0x0000000000001000ULL }, [ PPC970MP_PME_PM_FPU0_FRSP_FCONV ] = { 0x0000000000000400ULL }, [ PPC970MP_PME_PM_GCT_EMPTY_SRQ_FULL ] = { 0x0000080000000000ULL }, [ PPC970MP_PME_PM_MRK_ST_CMPL_INT ] = { 0x0000000800000000ULL }, [ PPC970MP_PME_PM_FLUSH_BR_MPRED ] = { 0x0000200000000000ULL }, [ PPC970MP_PME_PM_FXU_FIN ] = { 0x0000084000100000ULL }, [ PPC970MP_PME_PM_FPU_STF ] = { 0x0000000000800020ULL }, [ PPC970MP_PME_PM_DSLB_MISS ] = { 0x0000000004000000ULL }, [ PPC970MP_PME_PM_FXLS1_FULL_CYC ] = { 0x0000008000000080ULL }, [ PPC970MP_PME_PM_CMPLU_STALL_FPU ] = { 0x0000100000000000ULL }, [ PPC970MP_PME_PM_LSU_LMQ_LHR_MERGE ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_MRK_STCX_FAIL ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { 0x0000004000000000ULL }, [ PPC970MP_PME_PM_CMPLU_STALL_LSU ] = { 0x0000020000000000ULL }, [ PPC970MP_PME_PM_MRK_DATA_FROM_L25_SHR ] = { 0x0004000000000000ULL }, [ PPC970MP_PME_PM_LSU_FLUSH_ULD ] = { 0x0000000000000008ULL }, [ PPC970MP_PME_PM_MRK_BRU_FIN ] = { 0x0000000400000000ULL }, [ PPC970MP_PME_PM_IERAT_XLATE_WR ] = { 0x0000000080000000ULL }, [ PPC970MP_PME_PM_GCT_EMPTY_BR_MPRED ] = { 0x0000200000000000ULL }, [ PPC970MP_PME_PM_LSU0_BUSY ] = { 0x0000020003020000ULL }, [ PPC970MP_PME_PM_DATA_FROM_MEM ] = { 0x0003000008000000ULL }, [ 
PPC970MP_PME_PM_FPR_MAP_FULL_CYC ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_FPU1_FULL_CYC ] = { 0x0000000000000080ULL }, [ PPC970MP_PME_PM_FPU0_FIN ] = { 0x0000000000802800ULL }, [ PPC970MP_PME_PM_GRP_BR_REDIR ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_GCT_EMPTY_IC_MISS ] = { 0x0000200000000000ULL }, [ PPC970MP_PME_PM_THRESH_TIMEO ] = { 0x0000000200000000ULL }, [ PPC970MP_PME_PM_FPU_FSQRT ] = { 0x0000100000100010ULL }, [ PPC970MP_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { 0x0000002000000000ULL }, [ PPC970MP_PME_PM_PMC1_OVERFLOW ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_FXLS0_FULL_CYC ] = { 0x0000008000000080ULL }, [ PPC970MP_PME_PM_FPU0_ALL ] = { 0x0000000000000800ULL }, [ PPC970MP_PME_PM_DATA_TABLEWALK_CYC ] = { 0x0000000020000000ULL }, [ PPC970MP_PME_PM_FPU0_FEST ] = { 0x0000000000001000ULL }, [ PPC970MP_PME_PM_DATA_FROM_L25_MOD ] = { 0x0002400000000000ULL }, [ PPC970MP_PME_PM_LSU0_REJECT_ERAT_MISS ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { 0x0000000000480000ULL }, [ PPC970MP_PME_PM_LSU0_REJECT_RELOAD_CDF ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_FPU_FEST ] = { 0x0000000000000010ULL }, [ PPC970MP_PME_PM_0INST_FETCH ] = { 0x0000010000000000ULL }, [ PPC970MP_PME_PM_LD_MISS_L1_LSU0 ] = { 0x0001000000008000ULL }, [ PPC970MP_PME_PM_LSU1_REJECT_RELOAD_CDF ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_L1_PREF ] = { 0x0000000010000000ULL }, [ PPC970MP_PME_PM_FPU1_STALL3 ] = { 0x0000000000002000ULL }, [ PPC970MP_PME_PM_BRQ_FULL_CYC ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_PMC8_OVERFLOW ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_PMC7_OVERFLOW ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_WORK_HELD ] = { 0x0000000000000200ULL }, [ PPC970MP_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { 0x0000002000000000ULL }, [ PPC970MP_PME_PM_FXU_IDLE ] = { 0x000000c000000000ULL }, [ PPC970MP_PME_PM_INST_CMPL ] = { 0x0007fffbffffffffULL }, [ PPC970MP_PME_PM_LSU1_FLUSH_UST ] = { 0x0000000000010000ULL }, [ PPC970MP_PME_PM_LSU0_FLUSH_ULD 
] = { 0x0000000000008000ULL }, [ PPC970MP_PME_PM_LSU_FLUSH ] = { 0x0000020000000000ULL }, [ PPC970MP_PME_PM_INST_FROM_L2 ] = { 0x0000800020000000ULL }, [ PPC970MP_PME_PM_LSU1_REJECT_LMQ_FULL ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_PMC2_OVERFLOW ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_FPU0_DENORM ] = { 0x0000000000001000ULL }, [ PPC970MP_PME_PM_FPU1_FMOV_FEST ] = { 0x0000000000001000ULL }, [ PPC970MP_PME_PM_INST_FETCH_CYC ] = { 0x0000010000000000ULL }, [ PPC970MP_PME_PM_GRP_DISP_REJECT ] = { 0x0000000000000101ULL }, [ PPC970MP_PME_PM_LSU_LDF ] = { 0x0000000000800020ULL }, [ PPC970MP_PME_PM_INST_DISP ] = { 0x0000000100000146ULL }, [ PPC970MP_PME_PM_DATA_FROM_L25_SHR ] = { 0x0002400000000000ULL }, [ PPC970MP_PME_PM_L1_DCACHE_RELOAD_VALID ] = { 0x0000000100040000ULL }, [ PPC970MP_PME_PM_MRK_GRP_ISSUED ] = { 0x0000000200000000ULL }, [ PPC970MP_PME_PM_FPU_FMA ] = { 0x0000100000900010ULL }, [ PPC970MP_PME_PM_MRK_CRU_FIN ] = { 0x0000000400000000ULL }, [ PPC970MP_PME_PM_CMPLU_STALL_REJECT ] = { 0x0000040000000000ULL }, [ PPC970MP_PME_PM_MRK_LSU1_FLUSH_UST ] = { 0x0000001000000000ULL }, [ PPC970MP_PME_PM_MRK_FXU_FIN ] = { 0x0000000400000000ULL }, [ PPC970MP_PME_PM_LSU1_REJECT_ERAT_MISS ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_BR_ISSUED ] = { 0x0000800007000000ULL }, [ PPC970MP_PME_PM_PMC4_OVERFLOW ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_EE_OFF ] = { 0x0000000000000200ULL }, [ PPC970MP_PME_PM_INST_FROM_L25_MOD ] = { 0x0000010000000000ULL }, [ PPC970MP_PME_PM_CMPLU_STALL_ERAT_MISS ] = { 0x0000020000000000ULL }, [ PPC970MP_PME_PM_ITLB_MISS ] = { 0x0000000010200000ULL }, [ PPC970MP_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { 0x0000004000000000ULL }, [ PPC970MP_PME_PM_GRP_DISP_VALID ] = { 0x0000000100000100ULL }, [ PPC970MP_PME_PM_MRK_GRP_DISP ] = { 0x0000000400000000ULL }, [ PPC970MP_PME_PM_LSU_FLUSH_UST ] = { 0x0000000000000008ULL }, [ PPC970MP_PME_PM_FXU1_FIN ] = { 0x0000008000000100ULL }, [ PPC970MP_PME_PM_GRP_CMPL ] = { 0x0000000020080001ULL }, [ 
PPC970MP_PME_PM_FPU_FRSP_FCONV ] = { 0x0000000000000020ULL }, [ PPC970MP_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { 0x0000002000000000ULL }, [ PPC970MP_PME_PM_CMPLU_STALL_OTHER ] = { 0x0000040000000000ULL }, [ PPC970MP_PME_PM_LSU_LMQ_FULL_CYC ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_ST_REF_L1_LSU0 ] = { 0x0000000000030000ULL }, [ PPC970MP_PME_PM_LSU0_DERAT_MISS ] = { 0x0000000000040000ULL }, [ PPC970MP_PME_PM_LSU_SRQ_SYNC_CYC ] = { 0x0000000040000000ULL }, [ PPC970MP_PME_PM_FPU_STALL3 ] = { 0x0000000000000020ULL }, [ PPC970MP_PME_PM_LSU_REJECT_ERAT_MISS ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_MRK_DATA_FROM_L2 ] = { 0x0004000000000000ULL }, [ PPC970MP_PME_PM_LSU0_FLUSH_SRQ ] = { 0x0000000000004000ULL }, [ PPC970MP_PME_PM_FPU0_FMOV_FEST ] = { 0x0000000000001000ULL }, [ PPC970MP_PME_PM_IOPS_CMPL ] = { 0x0000100000000000ULL }, [ PPC970MP_PME_PM_LD_REF_L1_LSU0 ] = { 0x0000000000008000ULL }, [ PPC970MP_PME_PM_LSU1_FLUSH_SRQ ] = { 0x0000000000004000ULL }, [ PPC970MP_PME_PM_CMPLU_STALL_DIV ] = { 0x0000080000000000ULL }, [ PPC970MP_PME_PM_GRP_BR_MPRED ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_LSU_LMQ_S0_ALLOC ] = { 0x0000400008000000ULL }, [ PPC970MP_PME_PM_LSU0_REJECT_LMQ_FULL ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_ST_REF_L1 ] = { 0x000000010260000eULL }, [ PPC970MP_PME_PM_MRK_VMX_FIN ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_LSU_SRQ_EMPTY_CYC ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_FPU1_STF ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_RUN_CYC ] = { 0x0000000004000001ULL }, [ PPC970MP_PME_PM_LSU_LMQ_S0_VALID ] = { 0x0000400008000000ULL }, [ PPC970MP_PME_PM_LSU0_LDF ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_LSU_LRQ_S0_VALID ] = { 0x0000000010000000ULL }, [ PPC970MP_PME_PM_PMC3_OVERFLOW ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_MRK_IMR_RELOAD ] = { 0x0000001000000000ULL }, [ PPC970MP_PME_PM_MRK_GRP_TIMEO ] = { 0x0000000800000000ULL }, [ PPC970MP_PME_PM_FPU_FMOV_FEST ] = { 0x0000000000100010ULL }, [ 
PPC970MP_PME_PM_GRP_DISP_BLK_SB_CYC ] = { 0x0000000000000040ULL }, [ PPC970MP_PME_PM_XER_MAP_FULL_CYC ] = { 0x0000000000000040ULL }, [ PPC970MP_PME_PM_ST_MISS_L1 ] = { 0x0000000000610000ULL }, [ PPC970MP_PME_PM_STOP_COMPLETION ] = { 0x0000000000000201ULL }, [ PPC970MP_PME_PM_MRK_GRP_CMPL ] = { 0x0000000a00000000ULL }, [ PPC970MP_PME_PM_ISLB_MISS ] = { 0x0000000004000000ULL }, [ PPC970MP_PME_PM_SUSPENDED ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_CYC ] = { 0x0007ffffffffffffULL }, [ PPC970MP_PME_PM_LD_MISS_L1_LSU1 ] = { 0x0003000000008000ULL }, [ PPC970MP_PME_PM_STCX_FAIL ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_LSU1_SRQ_STFWD ] = { 0x0000000000020000ULL }, [ PPC970MP_PME_PM_GRP_DISP ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_L2_PREF ] = { 0x0000000010000000ULL }, [ PPC970MP_PME_PM_FPU1_DENORM ] = { 0x0000000000001000ULL }, [ PPC970MP_PME_PM_DATA_FROM_L2 ] = { 0x0003000008000000ULL }, [ PPC970MP_PME_PM_FPU0_FPSCR ] = { 0x0000000000002000ULL }, [ PPC970MP_PME_PM_MRK_DATA_FROM_L25_MOD ] = { 0x0004000000000000ULL }, [ PPC970MP_PME_PM_FPU0_FSQRT ] = { 0x0000000000000800ULL }, [ PPC970MP_PME_PM_LD_REF_L1 ] = { 0x000304004260000eULL }, [ PPC970MP_PME_PM_MRK_L1_RELOAD_VALID ] = { 0x0004000000000000ULL }, [ PPC970MP_PME_PM_1PLUS_PPC_CMPL ] = { 0x0001000000080001ULL }, [ PPC970MP_PME_PM_INST_FROM_L1 ] = { 0x0000010080000000ULL }, [ PPC970MP_PME_PM_EE_OFF_EXT_INT ] = { 0x0000000000000200ULL }, [ PPC970MP_PME_PM_PMC6_OVERFLOW ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_LSU_LRQ_FULL_CYC ] = { 0x0000000000000080ULL }, [ PPC970MP_PME_PM_IC_PREF_INSTALL ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { 0x0000002000000000ULL }, [ PPC970MP_PME_PM_GCT_FULL_CYC ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_INST_FROM_MEM ] = { 0x0000810020000000ULL }, [ PPC970MP_PME_PM_FLUSH_LSU_BR_MPRED ] = { 0x0000020000000000ULL }, [ PPC970MP_PME_PM_FXU_BUSY ] = { 
0x000008c000000000ULL }, [ PPC970MP_PME_PM_ST_REF_L1_LSU1 ] = { 0x0000000000030000ULL }, [ PPC970MP_PME_PM_MRK_LD_MISS_L1 ] = { 0x0000000200000000ULL }, [ PPC970MP_PME_PM_L1_WRITE_CYC ] = { 0x0000200000000000ULL }, [ PPC970MP_PME_PM_LSU1_BUSY ] = { 0x0000020000000000ULL }, [ PPC970MP_PME_PM_LSU_REJECT_LMQ_FULL ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_CMPLU_STALL_FDIV ] = { 0x0000100000000000ULL }, [ PPC970MP_PME_PM_FPU_ALL ] = { 0x0000000000000020ULL }, [ PPC970MP_PME_PM_LSU_SRQ_S0_ALLOC ] = { 0x0000000040000000ULL }, [ PPC970MP_PME_PM_INST_FROM_L25_SHR ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_GRP_MRK ] = { 0x0000000600000000ULL }, [ PPC970MP_PME_PM_BR_MPRED_CR ] = { 0x0000800005000000ULL }, [ PPC970MP_PME_PM_DC_PREF_STREAM_ALLOC ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_FPU1_FIN ] = { 0x0000000000802800ULL }, [ PPC970MP_PME_PM_LSU_REJECT_SRQ ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_BR_MPRED_TA ] = { 0x0000a00005000000ULL }, [ PPC970MP_PME_PM_CRQ_FULL_CYC ] = { 0x0000000000000040ULL }, [ PPC970MP_PME_PM_LD_MISS_L1 ] = { 0x0000040043600006ULL }, [ PPC970MP_PME_PM_INST_FROM_PREF ] = { 0x0000810000000000ULL }, [ PPC970MP_PME_PM_STCX_PASS ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_DC_INV_L2 ] = { 0x0000000020010006ULL }, [ PPC970MP_PME_PM_LSU_SRQ_FULL_CYC ] = { 0x0000000000000080ULL }, [ PPC970MP_PME_PM_LSU0_FLUSH_LRQ ] = { 0x0000000000004000ULL }, [ PPC970MP_PME_PM_LSU_SRQ_S0_VALID ] = { 0x0000000040000000ULL }, [ PPC970MP_PME_PM_LARX_LSU0 ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_GCT_EMPTY_CYC ] = { 0x0000200100080200ULL }, [ PPC970MP_PME_PM_FPU1_ALL ] = { 0x0000000000000800ULL }, [ PPC970MP_PME_PM_FPU1_FSQRT ] = { 0x0000000000000800ULL }, [ PPC970MP_PME_PM_FPU_FIN ] = { 0x0000080000100010ULL }, [ PPC970MP_PME_PM_LSU_SRQ_STFWD ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { 0x0000002000000000ULL }, [ PPC970MP_PME_PM_FXU0_FIN ] = { 0x0000008000000100ULL }, [ PPC970MP_PME_PM_MRK_FPU_FIN ] = { 
0x0000000400000000ULL }, [ PPC970MP_PME_PM_PMC5_OVERFLOW ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_SNOOP_TLBIE ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_FPU1_FRSP_FCONV ] = { 0x0000000000000400ULL }, [ PPC970MP_PME_PM_FPU0_FDIV ] = { 0x0000000000000400ULL }, [ PPC970MP_PME_PM_LD_REF_L1_LSU1 ] = { 0x0000000000008000ULL }, [ PPC970MP_PME_PM_HV_CYC ] = { 0x0000000020080000ULL }, [ PPC970MP_PME_PM_LR_CTR_MAP_FULL_CYC ] = { 0x0000000000000040ULL }, [ PPC970MP_PME_PM_FPU_DENORM ] = { 0x0000000000000020ULL }, [ PPC970MP_PME_PM_LSU0_REJECT_SRQ ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_LSU1_REJECT_SRQ ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_LSU1_DERAT_MISS ] = { 0x0000000000040000ULL }, [ PPC970MP_PME_PM_IC_PREF_REQ ] = { 0x0000000000000000ULL }, [ PPC970MP_PME_PM_MRK_LSU_FIN ] = { 0x0000000400000000ULL }, [ PPC970MP_PME_PM_MRK_DATA_FROM_MEM ] = { 0x0004000000000000ULL }, [ PPC970MP_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { 0x0000040000000000ULL }, [ PPC970MP_PME_PM_LSU0_FLUSH_UST ] = { 0x0000000000010000ULL }, [ PPC970MP_PME_PM_LSU_FLUSH_LRQ ] = { 0x0000000000000008ULL }, [ PPC970MP_PME_PM_LSU_FLUSH_SRQ ] = { 0x0000000000000008ULL } }; static const pme_power_entry_t ppc970mp_pe[] = { [ PPC970MP_PME_PM_LSU_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU_REJECT_RELOAD_CDF", .pme_code = 0x6920, .pme_short_desc = "LSU reject due to reload CDF or tag update collision", .pme_long_desc = "LSU reject due to reload CDF or tag update collision", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_REJECT_RELOAD_CDF], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_REJECT_RELOAD_CDF] }, [ PPC970MP_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { .pme_name = "PM_MRK_LSU_SRQ_INST_VALID", .pme_code = 0x936, .pme_short_desc = "Marked instruction valid in SRQ", .pme_long_desc = "This signal is asserted every cycle when a marked request is resident in the Store Request Queue", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_LSU_SRQ_INST_VALID], 
.pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_LSU_SRQ_INST_VALID] }, [ PPC970MP_PME_PM_FPU1_SINGLE ] = { .pme_name = "PM_FPU1_SINGLE", .pme_code = 0x127, .pme_short_desc = "FPU1 executed single precision instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing single precision instruction.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU1_SINGLE], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU1_SINGLE] }, [ PPC970MP_PME_PM_FPU0_STALL3 ] = { .pme_name = "PM_FPU0_STALL3", .pme_code = 0x121, .pme_short_desc = "FPU0 stalled in pipe3", .pme_long_desc = "This signal indicates that fp0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. ", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU0_STALL3], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU0_STALL3] }, [ PPC970MP_PME_PM_TB_BIT_TRANS ] = { .pme_name = "PM_TB_BIT_TRANS", .pme_code = 0x8005, .pme_short_desc = "Time Base bit transition", .pme_long_desc = "When the selected time base bit (as specified in MMCR0[TBSEL]) transitions from 0 to 1 ", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_TB_BIT_TRANS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_TB_BIT_TRANS] }, [ PPC970MP_PME_PM_GPR_MAP_FULL_CYC ] = { .pme_name = "PM_GPR_MAP_FULL_CYC", .pme_code = 0x335, .pme_short_desc = "Cycles GPR mapper full", .pme_long_desc = "The ISU sends a signal indicating that the gpr mapper cannot accept any more groups. Dispatch is stopped. 
Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_GPR_MAP_FULL_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_GPR_MAP_FULL_CYC] }, [ PPC970MP_PME_PM_MRK_ST_CMPL ] = { .pme_name = "PM_MRK_ST_CMPL", .pme_code = 0x1003, .pme_short_desc = "Marked store instruction completed", .pme_long_desc = "A sampled store has completed (data home)", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_ST_CMPL], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_ST_CMPL] }, [ PPC970MP_PME_PM_FPU0_STF ] = { .pme_name = "PM_FPU0_STF", .pme_code = 0x122, .pme_short_desc = "FPU0 executed store instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing a store instruction.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU0_STF], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU0_STF] }, [ PPC970MP_PME_PM_FPU1_FMA ] = { .pme_name = "PM_FPU1_FMA", .pme_code = 0x105, .pme_short_desc = "FPU1 executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. 
This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU1_FMA], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU1_FMA] }, [ PPC970MP_PME_PM_LSU1_FLUSH_ULD ] = { .pme_name = "PM_LSU1_FLUSH_ULD", .pme_code = 0x804, .pme_short_desc = "LSU1 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU1_FLUSH_ULD], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU1_FLUSH_ULD] }, [ PPC970MP_PME_PM_MRK_INST_FIN ] = { .pme_name = "PM_MRK_INST_FIN", .pme_code = 0x7005, .pme_short_desc = "Marked instruction finished", .pme_long_desc = "One of the execution units finished a marked instruction. Instructions that finish may not necessarily complete", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_INST_FIN], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_INST_FIN] }, [ PPC970MP_PME_PM_MRK_LSU0_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU0_FLUSH_UST", .pme_code = 0x711, .pme_short_desc = "LSU0 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 0 because it was unaligned", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_LSU0_FLUSH_UST], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_LSU0_FLUSH_UST] }, [ PPC970MP_PME_PM_LSU_LRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LRQ_S0_ALLOC", .pme_code = 0x826, .pme_short_desc = "LRQ slot 0 allocated", .pme_long_desc = "LRQ slot zero was allocated", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_LRQ_S0_ALLOC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_LRQ_S0_ALLOC] }, [ PPC970MP_PME_PM_FPU_FDIV ] = { .pme_name = "PM_FPU_FDIV", .pme_code = 0x1100, .pme_short_desc = "FPU executed FDIV instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when 
FPU is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. Combined Unit 0 + Unit 1", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU_FDIV], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU_FDIV] }, [ PPC970MP_PME_PM_FPU0_FULL_CYC ] = { .pme_name = "PM_FPU0_FULL_CYC", .pme_code = 0x303, .pme_short_desc = "Cycles FPU0 issue queue full", .pme_long_desc = "The issue queue for FPU unit 0 cannot accept any more instructions. Issue is stopped", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU0_FULL_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU0_FULL_CYC] }, [ PPC970MP_PME_PM_FPU_SINGLE ] = { .pme_name = "PM_FPU_SINGLE", .pme_code = 0x5120, .pme_short_desc = "FPU executed single precision instruction", .pme_long_desc = "FPU is executing single precision instruction. Combined Unit 0 + Unit 1", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU_SINGLE], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU_SINGLE] }, [ PPC970MP_PME_PM_FPU0_FMA ] = { .pme_name = "PM_FPU0_FMA", .pme_code = 0x101, .pme_short_desc = "FPU0 executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. 
This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU0_FMA], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU0_FMA] }, [ PPC970MP_PME_PM_MRK_LSU1_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU1_FLUSH_ULD", .pme_code = 0x714, .pme_short_desc = "LSU1 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_LSU1_FLUSH_ULD], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_LSU1_FLUSH_ULD] }, [ PPC970MP_PME_PM_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_LSU1_FLUSH_LRQ", .pme_code = 0x806, .pme_short_desc = "LSU1 LRQ flushes", .pme_long_desc = "A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU1_FLUSH_LRQ], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU1_FLUSH_LRQ] }, [ PPC970MP_PME_PM_DTLB_MISS ] = { .pme_name = "PM_DTLB_MISS", .pme_code = 0x704, .pme_short_desc = "Data TLB misses", .pme_long_desc = "A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). 
This may result in multiple TLB misses for the same instruction.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_DTLB_MISS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_DTLB_MISS] }, [ PPC970MP_PME_PM_CMPLU_STALL_FXU ] = { .pme_name = "PM_CMPLU_STALL_FXU", .pme_code = 0x508b, .pme_short_desc = "Completion stall caused by FXU instruction", .pme_long_desc = "Completion stall caused by FXU instruction", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_CMPLU_STALL_FXU], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_CMPLU_STALL_FXU] }, [ PPC970MP_PME_PM_MRK_ST_MISS_L1 ] = { .pme_name = "PM_MRK_ST_MISS_L1", .pme_code = 0x723, .pme_short_desc = "Marked L1 D cache store misses", .pme_long_desc = "A marked store missed the dcache", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_ST_MISS_L1], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_ST_MISS_L1] }, [ PPC970MP_PME_PM_EXT_INT ] = { .pme_name = "PM_EXT_INT", .pme_code = 0x8002, .pme_short_desc = "External interrupts", .pme_long_desc = "An external interrupt occurred", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_EXT_INT], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_EXT_INT] }, [ PPC970MP_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_LRQ", .pme_code = 0x716, .pme_short_desc = "LSU1 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_LSU1_FLUSH_LRQ], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_LSU1_FLUSH_LRQ] }, [ PPC970MP_PME_PM_MRK_ST_GPS ] = { .pme_name = "PM_MRK_ST_GPS", .pme_code = 0x6003, .pme_short_desc = "Marked store sent to GPS", .pme_long_desc = "A sampled store has been sent to the memory subsystem", .pme_event_ids = 
ppc970mp_event_ids[PPC970MP_PME_PM_MRK_ST_GPS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_ST_GPS] }, [ PPC970MP_PME_PM_GRP_DISP_SUCCESS ] = { .pme_name = "PM_GRP_DISP_SUCCESS", .pme_code = 0x5001, .pme_short_desc = "Group dispatch success", .pme_long_desc = "Number of groups successfully dispatched (not rejected)", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_GRP_DISP_SUCCESS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_GRP_DISP_SUCCESS] }, [ PPC970MP_PME_PM_LSU1_LDF ] = { .pme_name = "PM_LSU1_LDF", .pme_code = 0x734, .pme_short_desc = "LSU1 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed from LSU unit 1", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU1_LDF], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU1_LDF] }, [ PPC970MP_PME_PM_LSU0_SRQ_STFWD ] = { .pme_name = "PM_LSU0_SRQ_STFWD", .pme_code = 0x820, .pme_short_desc = "LSU0 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 0", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU0_SRQ_STFWD], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU0_SRQ_STFWD] }, [ PPC970MP_PME_PM_CR_MAP_FULL_CYC ] = { .pme_name = "PM_CR_MAP_FULL_CYC", .pme_code = 0x304, .pme_short_desc = "Cycles CR logical operation mapper full", .pme_long_desc = "The ISU sends a signal indicating that the cr mapper cannot accept any more groups. Dispatch is stopped. 
Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_CR_MAP_FULL_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_CR_MAP_FULL_CYC] }, [ PPC970MP_PME_PM_MRK_LSU0_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU0_FLUSH_ULD", .pme_code = 0x710, .pme_short_desc = "LSU0 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_LSU0_FLUSH_ULD], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_LSU0_FLUSH_ULD] }, [ PPC970MP_PME_PM_LSU_DERAT_MISS ] = { .pme_name = "PM_LSU_DERAT_MISS", .pme_code = 0x6700, .pme_short_desc = "DERAT misses", .pme_long_desc = "Total D-ERAT Misses (Unit 0 + Unit 1). Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_DERAT_MISS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_DERAT_MISS] }, [ PPC970MP_PME_PM_FPU0_SINGLE ] = { .pme_name = "PM_FPU0_SINGLE", .pme_code = 0x123, .pme_short_desc = "FPU0 executed single precision instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing single precision instruction.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU0_SINGLE], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU0_SINGLE] }, [ PPC970MP_PME_PM_FPU1_FDIV ] = { .pme_name = "PM_FPU1_FDIV", .pme_code = 0x104, .pme_short_desc = "FPU1 executed FDIV instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp1 is executing a divide instruction. This could be fdiv, fdivs, fdiv. 
fdivs.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU1_FDIV], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU1_FDIV] }, [ PPC970MP_PME_PM_FPU1_FEST ] = { .pme_name = "PM_FPU1_FEST", .pme_code = 0x116, .pme_short_desc = "FPU1 executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. ", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU1_FEST], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU1_FEST] }, [ PPC970MP_PME_PM_FPU0_FRSP_FCONV ] = { .pme_name = "PM_FPU0_FRSP_FCONV", .pme_code = 0x111, .pme_short_desc = "FPU0 executed FRSP or FCONV instructions", .pme_long_desc = "This signal is active for one cycle when fp0 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU0_FRSP_FCONV], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU0_FRSP_FCONV] }, [ PPC970MP_PME_PM_GCT_EMPTY_SRQ_FULL ] = { .pme_name = "PM_GCT_EMPTY_SRQ_FULL", .pme_code = 0x200b, .pme_short_desc = "GCT empty caused by SRQ full", .pme_long_desc = "GCT empty caused by SRQ full", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_GCT_EMPTY_SRQ_FULL], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_GCT_EMPTY_SRQ_FULL] }, [ PPC970MP_PME_PM_MRK_ST_CMPL_INT ] = { .pme_name = "PM_MRK_ST_CMPL_INT", .pme_code = 0x3003, .pme_short_desc = "Marked store completed with intervention", .pme_long_desc = "A marked store previously sent to the memory subsystem completed (data home) after requiring intervention", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_ST_CMPL_INT], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_ST_CMPL_INT] }, [ PPC970MP_PME_PM_FLUSH_BR_MPRED ] = { .pme_name = "PM_FLUSH_BR_MPRED", .pme_code = 0x316, .pme_short_desc = "Flush caused by branch mispredict", 
.pme_long_desc = "Flush caused by branch mispredict", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FLUSH_BR_MPRED], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FLUSH_BR_MPRED] }, [ PPC970MP_PME_PM_FXU_FIN ] = { .pme_name = "PM_FXU_FIN", .pme_code = 0x3330, .pme_short_desc = "FXU produced a result", .pme_long_desc = "The fixed point unit (Unit 0 + Unit 1) finished an instruction. Instructions that finish may not necessarily complete.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FXU_FIN], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FXU_FIN] }, [ PPC970MP_PME_PM_FPU_STF ] = { .pme_name = "PM_FPU_STF", .pme_code = 0x6120, .pme_short_desc = "FPU executed store instruction", .pme_long_desc = "FPU is executing a store instruction. Combined Unit 0 + Unit 1", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU_STF], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU_STF] }, [ PPC970MP_PME_PM_DSLB_MISS ] = { .pme_name = "PM_DSLB_MISS", .pme_code = 0x705, .pme_short_desc = "Data SLB misses", .pme_long_desc = "An SLB miss for a data request occurred. SLB misses trap to the operating system to resolve", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_DSLB_MISS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_DSLB_MISS] }, [ PPC970MP_PME_PM_FXLS1_FULL_CYC ] = { .pme_name = "PM_FXLS1_FULL_CYC", .pme_code = 0x314, .pme_short_desc = "Cycles FXU1/LS1 queue full", .pme_long_desc = "The issue queue for FXU/LSU unit 1 cannot accept any more instructions. 
Issue is stopped", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FXLS1_FULL_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FXLS1_FULL_CYC] }, [ PPC970MP_PME_PM_CMPLU_STALL_FPU ] = { .pme_name = "PM_CMPLU_STALL_FPU", .pme_code = 0x704b, .pme_short_desc = "Completion stall caused by FPU instruction", .pme_long_desc = "Completion stall caused by FPU instruction", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_CMPLU_STALL_FPU], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_CMPLU_STALL_FPU] }, [ PPC970MP_PME_PM_LSU_LMQ_LHR_MERGE ] = { .pme_name = "PM_LSU_LMQ_LHR_MERGE", .pme_code = 0x935, .pme_short_desc = "LMQ LHR merges", .pme_long_desc = "A dcache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_LMQ_LHR_MERGE], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_LMQ_LHR_MERGE] }, [ PPC970MP_PME_PM_MRK_STCX_FAIL ] = { .pme_name = "PM_MRK_STCX_FAIL", .pme_code = 0x726, .pme_short_desc = "Marked STCX failed", .pme_long_desc = "A marked stcx (stwcx or stdcx) failed", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_STCX_FAIL], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_STCX_FAIL] }, [ PPC970MP_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { .pme_name = "PM_FXU0_BUSY_FXU1_IDLE", .pme_code = 0x7002, .pme_short_desc = "FXU0 busy FXU1 idle", .pme_long_desc = "FXU0 is busy while FXU1 was idle", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FXU0_BUSY_FXU1_IDLE], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FXU0_BUSY_FXU1_IDLE] }, [ PPC970MP_PME_PM_CMPLU_STALL_LSU ] = { .pme_name = "PM_CMPLU_STALL_LSU", .pme_code = 0x504b, .pme_short_desc = "Completion stall caused by LSU instruction", .pme_long_desc = "Completion stall caused by LSU instruction", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_CMPLU_STALL_LSU], .pme_group_vector = 
ppc970mp_group_vecs[PPC970MP_PME_PM_CMPLU_STALL_LSU] }, [ PPC970MP_PME_PM_MRK_DATA_FROM_L25_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L25_SHR", .pme_code = 0x5937, .pme_short_desc = "Marked data loaded from L2.5 shared", .pme_long_desc = "DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a marked demand load", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_DATA_FROM_L25_SHR], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_DATA_FROM_L25_SHR] }, [ PPC970MP_PME_PM_LSU_FLUSH_ULD ] = { .pme_name = "PM_LSU_FLUSH_ULD", .pme_code = 0x1800, .pme_short_desc = "LRQ unaligned load flushes", .pme_long_desc = "A load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_FLUSH_ULD], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_FLUSH_ULD] }, [ PPC970MP_PME_PM_MRK_BRU_FIN ] = { .pme_name = "PM_MRK_BRU_FIN", .pme_code = 0x2005, .pme_short_desc = "Marked instruction BRU processing finished", .pme_long_desc = "The branch unit finished a marked instruction. Instructions that finish may not necessarily complete", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_BRU_FIN], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_BRU_FIN] }, [ PPC970MP_PME_PM_IERAT_XLATE_WR ] = { .pme_name = "PM_IERAT_XLATE_WR", .pme_code = 0x430, .pme_short_desc = "Translation written to ierat", .pme_long_desc = "This signal will be asserted each time the I-ERAT is written. This indicates that an ERAT miss has been serviced. ERAT misses will initiate a sequence resulting in the ERAT being written. 
ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed. This should be a fairly accurate count of ERAT misses (best available).", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_IERAT_XLATE_WR], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_IERAT_XLATE_WR] }, [ PPC970MP_PME_PM_GCT_EMPTY_BR_MPRED ] = { .pme_name = "PM_GCT_EMPTY_BR_MPRED", .pme_code = 0x708c, .pme_short_desc = "GCT empty due to branch mispredict", .pme_long_desc = "GCT empty due to branch mispredict", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_GCT_EMPTY_BR_MPRED], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_GCT_EMPTY_BR_MPRED] }, [ PPC970MP_PME_PM_LSU0_BUSY ] = { .pme_name = "PM_LSU0_BUSY", .pme_code = 0x823, .pme_short_desc = "LSU0 busy", .pme_long_desc = "LSU unit 0 is busy rejecting instructions", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU0_BUSY], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU0_BUSY] }, [ PPC970MP_PME_PM_DATA_FROM_MEM ] = { .pme_name = "PM_DATA_FROM_MEM", .pme_code = 0x2837, .pme_short_desc = "Data loaded from memory", .pme_long_desc = "Data loaded from memory", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_DATA_FROM_MEM], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_DATA_FROM_MEM] }, [ PPC970MP_PME_PM_FPR_MAP_FULL_CYC ] = { .pme_name = "PM_FPR_MAP_FULL_CYC", .pme_code = 0x301, .pme_short_desc = "Cycles FPR mapper full", .pme_long_desc = "The ISU sends a signal indicating that the FPR mapper cannot accept any more groups. Dispatch is stopped. 
Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPR_MAP_FULL_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPR_MAP_FULL_CYC] }, [ PPC970MP_PME_PM_FPU1_FULL_CYC ] = { .pme_name = "PM_FPU1_FULL_CYC", .pme_code = 0x307, .pme_short_desc = "Cycles FPU1 issue queue full", .pme_long_desc = "The issue queue for FPU unit 1 cannot accept any more instructions. Issue is stopped", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU1_FULL_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU1_FULL_CYC] }, [ PPC970MP_PME_PM_FPU0_FIN ] = { .pme_name = "PM_FPU0_FIN", .pme_code = 0x113, .pme_short_desc = "FPU0 produced a result", .pme_long_desc = "fp0 finished, produced a result. This only indicates finish, not completion. ", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU0_FIN], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU0_FIN] }, [ PPC970MP_PME_PM_GRP_BR_REDIR ] = { .pme_name = "PM_GRP_BR_REDIR", .pme_code = 0x326, .pme_short_desc = "Group experienced branch redirect", .pme_long_desc = "Group experienced branch redirect", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_GRP_BR_REDIR], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_GRP_BR_REDIR] }, [ PPC970MP_PME_PM_GCT_EMPTY_IC_MISS ] = { .pme_name = "PM_GCT_EMPTY_IC_MISS", .pme_code = 0x508c, .pme_short_desc = "GCT empty due to I cache miss", .pme_long_desc = "GCT empty due to I cache miss", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_GCT_EMPTY_IC_MISS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_GCT_EMPTY_IC_MISS] }, [ PPC970MP_PME_PM_THRESH_TIMEO ] = { .pme_name = "PM_THRESH_TIMEO", .pme_code = 0x2003, .pme_short_desc = "Threshold timeout", .pme_long_desc = "The threshold timer expired", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_THRESH_TIMEO], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_THRESH_TIMEO] }, [ 
PPC970MP_PME_PM_FPU_FSQRT ] = { .pme_name = "PM_FPU_FSQRT", .pme_code = 0x6100, .pme_short_desc = "FPU executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when FPU is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU_FSQRT], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU_FSQRT] }, [ PPC970MP_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_LRQ", .pme_code = 0x712, .pme_short_desc = "LSU0 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_LSU0_FLUSH_LRQ], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_LSU0_FLUSH_LRQ] }, [ PPC970MP_PME_PM_PMC1_OVERFLOW ] = { .pme_name = "PM_PMC1_OVERFLOW", .pme_code = 0x200a, .pme_short_desc = "PMC1 Overflow", .pme_long_desc = "PMC1 Overflow", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_PMC1_OVERFLOW], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_PMC1_OVERFLOW] }, [ PPC970MP_PME_PM_FXLS0_FULL_CYC ] = { .pme_name = "PM_FXLS0_FULL_CYC", .pme_code = 0x310, .pme_short_desc = "Cycles FXU0/LS0 queue full", .pme_long_desc = "The issue queue for FXU/LSU unit 0 cannot accept any more instructions. 
Issue is stopped", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FXLS0_FULL_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FXLS0_FULL_CYC] }, [ PPC970MP_PME_PM_FPU0_ALL ] = { .pme_name = "PM_FPU0_ALL", .pme_code = 0x103, .pme_short_desc = "FPU0 executed add", .pme_long_desc = " mult", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU0_ALL], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU0_ALL] }, [ PPC970MP_PME_PM_DATA_TABLEWALK_CYC ] = { .pme_name = "PM_DATA_TABLEWALK_CYC", .pme_code = 0x707, .pme_short_desc = "Cycles doing data tablewalks", .pme_long_desc = "This signal is asserted every cycle when a tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_DATA_TABLEWALK_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_DATA_TABLEWALK_CYC] }, [ PPC970MP_PME_PM_FPU0_FEST ] = { .pme_name = "PM_FPU0_FEST", .pme_code = 0x112, .pme_short_desc = "FPU0 executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. 
", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU0_FEST], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU0_FEST] }, [ PPC970MP_PME_PM_DATA_FROM_L25_MOD ] = { .pme_name = "PM_DATA_FROM_L25_MOD", .pme_code = 0x6837, .pme_short_desc = "Data loaded from L2.5 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a demand load", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_DATA_FROM_L25_MOD], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_DATA_FROM_L25_MOD] }, [ PPC970MP_PME_PM_LSU0_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU0_REJECT_ERAT_MISS", .pme_code = 0x923, .pme_short_desc = "LSU0 reject due to ERAT miss", .pme_long_desc = "LSU0 reject due to ERAT miss", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU0_REJECT_ERAT_MISS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU0_REJECT_ERAT_MISS] }, [ PPC970MP_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", .pme_code = 0x2002, .pme_short_desc = "Cycles LMQ and SRQ empty", .pme_long_desc = "Cycles when both the LMQ and SRQ are empty (LSU is idle)", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC] }, [ PPC970MP_PME_PM_LSU0_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU0_REJECT_RELOAD_CDF", .pme_code = 0x922, .pme_short_desc = "LSU0 reject due to reload CDF or tag update collision", .pme_long_desc = "LSU0 reject due to reload CDF or tag update collision", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU0_REJECT_RELOAD_CDF], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU0_REJECT_RELOAD_CDF] }, [ PPC970MP_PME_PM_FPU_FEST ] = { .pme_name = "PM_FPU_FEST", .pme_code = 0x3110, .pme_short_desc = "FPU executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when executing one of the estimate instructions. 
This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU_FEST], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU_FEST] }, [ PPC970MP_PME_PM_0INST_FETCH ] = { .pme_name = "PM_0INST_FETCH", .pme_code = 0x442d, .pme_short_desc = "No instructions fetched", .pme_long_desc = "No instructions were fetched this cycle (due to IFU hold, redirect, or icache miss)", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_0INST_FETCH], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_0INST_FETCH] }, [ PPC970MP_PME_PM_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_LD_MISS_L1_LSU0", .pme_code = 0x812, .pme_short_desc = "LSU0 L1 D cache load misses", .pme_long_desc = "A load, executing on unit 0, missed the dcache", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LD_MISS_L1_LSU0], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LD_MISS_L1_LSU0] }, [ PPC970MP_PME_PM_LSU1_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU1_REJECT_RELOAD_CDF", .pme_code = 0x926, .pme_short_desc = "LSU1 reject due to reload CDF or tag update collision", .pme_long_desc = "LSU1 reject due to reload CDF or tag update collision", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU1_REJECT_RELOAD_CDF], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU1_REJECT_RELOAD_CDF] }, [ PPC970MP_PME_PM_L1_PREF ] = { .pme_name = "PM_L1_PREF", .pme_code = 0x731, .pme_short_desc = "L1 cache data prefetches", .pme_long_desc = "A request to prefetch data into the L1 was made", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_L1_PREF], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_L1_PREF] }, [ PPC970MP_PME_PM_FPU1_STALL3 ] = { .pme_name = "PM_FPU1_STALL3", .pme_code = 0x125, .pme_short_desc = "FPU1 stalled in pipe3", .pme_long_desc = "This signal indicates that fp1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer 
(always). This signal is active during the entire duration of the stall. ", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU1_STALL3], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU1_STALL3] }, [ PPC970MP_PME_PM_BRQ_FULL_CYC ] = { .pme_name = "PM_BRQ_FULL_CYC", .pme_code = 0x305, .pme_short_desc = "Cycles branch queue full", .pme_long_desc = "The ISU sends a signal indicating that the issue queue that feeds the ifu br unit cannot accept any more groups (the queue is full of groups).", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_BRQ_FULL_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_BRQ_FULL_CYC] }, [ PPC970MP_PME_PM_PMC8_OVERFLOW ] = { .pme_name = "PM_PMC8_OVERFLOW", .pme_code = 0x100a, .pme_short_desc = "PMC8 Overflow", .pme_long_desc = "PMC8 Overflow", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_PMC8_OVERFLOW], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_PMC8_OVERFLOW] }, [ PPC970MP_PME_PM_PMC7_OVERFLOW ] = { .pme_name = "PM_PMC7_OVERFLOW", .pme_code = 0x800a, .pme_short_desc = "PMC7 Overflow", .pme_long_desc = "PMC7 Overflow", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_PMC7_OVERFLOW], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_PMC7_OVERFLOW] }, [ PPC970MP_PME_PM_WORK_HELD ] = { .pme_name = "PM_WORK_HELD", .pme_code = 0x2001, .pme_short_desc = "Work held", .pme_long_desc = "RAS Unit has signaled completion to stop and there are groups waiting to complete", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_WORK_HELD], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_WORK_HELD] }, [ PPC970MP_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU0", .pme_code = 0x720, .pme_short_desc = "LSU0 L1 D cache load misses", .pme_long_desc = "A marked load, executing on unit 0, missed the dcache", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_LD_MISS_L1_LSU0], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_LD_MISS_L1_LSU0] }, [ 
PPC970MP_PME_PM_FXU_IDLE ] = { .pme_name = "PM_FXU_IDLE", .pme_code = 0x5002, .pme_short_desc = "FXU idle", .pme_long_desc = "FXU0 and FXU1 are both idle", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FXU_IDLE], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FXU_IDLE] }, [ PPC970MP_PME_PM_INST_CMPL ] = { .pme_name = "PM_INST_CMPL", .pme_code = 0x1, .pme_short_desc = "Instructions completed", .pme_long_desc = "Number of Eligible Instructions that completed. ", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_INST_CMPL], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_INST_CMPL] }, [ PPC970MP_PME_PM_LSU1_FLUSH_UST ] = { .pme_name = "PM_LSU1_FLUSH_UST", .pme_code = 0x805, .pme_short_desc = "LSU1 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 1 because it was unaligned (crossed a 4k boundary)", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU1_FLUSH_UST], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU1_FLUSH_UST] }, [ PPC970MP_PME_PM_LSU0_FLUSH_ULD ] = { .pme_name = "PM_LSU0_FLUSH_ULD", .pme_code = 0x800, .pme_short_desc = "LSU0 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU0_FLUSH_ULD], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU0_FLUSH_ULD] }, [ PPC970MP_PME_PM_LSU_FLUSH ] = { .pme_name = "PM_LSU_FLUSH", .pme_code = 0x315, .pme_short_desc = "Flush initiated by LSU", .pme_long_desc = "Flush initiated by LSU", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_FLUSH], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_FLUSH] }, [ PPC970MP_PME_PM_INST_FROM_L2 ] = { .pme_name = "PM_INST_FROM_L2", .pme_code = 0x1426, .pme_short_desc = "Instructions fetched from L2", .pme_long_desc = "An instruction fetch group was fetched from L2. 
Fetch Groups can contain up to 8 instructions", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_INST_FROM_L2], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_INST_FROM_L2] }, [ PPC970MP_PME_PM_LSU1_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU1_REJECT_LMQ_FULL", .pme_code = 0x925, .pme_short_desc = "LSU1 reject due to LMQ full or missed data coming", .pme_long_desc = "LSU1 reject due to LMQ full or missed data coming", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU1_REJECT_LMQ_FULL], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU1_REJECT_LMQ_FULL] }, [ PPC970MP_PME_PM_PMC2_OVERFLOW ] = { .pme_name = "PM_PMC2_OVERFLOW", .pme_code = 0x300a, .pme_short_desc = "PMC2 Overflow", .pme_long_desc = "PMC2 Overflow", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_PMC2_OVERFLOW], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_PMC2_OVERFLOW] }, [ PPC970MP_PME_PM_FPU0_DENORM ] = { .pme_name = "PM_FPU0_DENORM", .pme_code = 0x120, .pme_short_desc = "FPU0 received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU0_DENORM], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU0_DENORM] }, [ PPC970MP_PME_PM_FPU1_FMOV_FEST ] = { .pme_name = "PM_FPU1_FMOV_FEST", .pme_code = 0x114, .pme_short_desc = "FPU1 executing FMOV or FEST instructions", .pme_long_desc = "This signal is active for one cycle when fp1 is executing a move kind of instruction or one of the estimate instructions. 
This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU1_FMOV_FEST], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU1_FMOV_FEST] }, [ PPC970MP_PME_PM_INST_FETCH_CYC ] = { .pme_name = "PM_INST_FETCH_CYC", .pme_code = 0x424, .pme_short_desc = "Cycles at least 1 instruction fetched", .pme_long_desc = "Asserted each cycle when the IFU sends at least one instruction to the IDU. ", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_INST_FETCH_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_INST_FETCH_CYC] }, [ PPC970MP_PME_PM_GRP_DISP_REJECT ] = { .pme_name = "PM_GRP_DISP_REJECT", .pme_code = 0x324, .pme_short_desc = "Group dispatch rejected", .pme_long_desc = "A group that previously attempted dispatch was rejected.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_GRP_DISP_REJECT], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_GRP_DISP_REJECT] }, [ PPC970MP_PME_PM_LSU_LDF ] = { .pme_name = "PM_LSU_LDF", .pme_code = 0x8730, .pme_short_desc = "LSU executed Floating Point load instruction", .pme_long_desc = "LSU executed Floating Point load instruction", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_LDF], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_LDF] }, [ PPC970MP_PME_PM_INST_DISP ] = { .pme_name = "PM_INST_DISP", .pme_code = 0x320, .pme_short_desc = "Instructions dispatched", .pme_long_desc = "The ISU sends the number of instructions dispatched.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_INST_DISP], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_INST_DISP] }, [ PPC970MP_PME_PM_DATA_FROM_L25_SHR ] = { .pme_name = "PM_DATA_FROM_L25_SHR", .pme_code = 0x5837, .pme_short_desc = "Data loaded from L2.5 shared", .pme_long_desc = "DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a demand load", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_DATA_FROM_L25_SHR], 
.pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_DATA_FROM_L25_SHR] }, [ PPC970MP_PME_PM_L1_DCACHE_RELOAD_VALID ] = { .pme_name = "PM_L1_DCACHE_RELOAD_VALID", .pme_code = 0x834, .pme_short_desc = "L1 reload data source valid", .pme_long_desc = "The data source information is valid", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_L1_DCACHE_RELOAD_VALID], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_L1_DCACHE_RELOAD_VALID] }, [ PPC970MP_PME_PM_MRK_GRP_ISSUED ] = { .pme_name = "PM_MRK_GRP_ISSUED", .pme_code = 0x6005, .pme_short_desc = "Marked group issued", .pme_long_desc = "A sampled instruction was issued", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_GRP_ISSUED], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_GRP_ISSUED] }, [ PPC970MP_PME_PM_FPU_FMA ] = { .pme_name = "PM_FPU_FMA", .pme_code = 0x2100, .pme_short_desc = "FPU executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when FPU is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU_FMA], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU_FMA] }, [ PPC970MP_PME_PM_MRK_CRU_FIN ] = { .pme_name = "PM_MRK_CRU_FIN", .pme_code = 0x4005, .pme_short_desc = "Marked instruction CRU processing finished", .pme_long_desc = "The Condition Register Unit finished a marked instruction. 
Instructions that finish may not necessarily complete", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_CRU_FIN], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_CRU_FIN] }, [ PPC970MP_PME_PM_CMPLU_STALL_REJECT ] = { .pme_name = "PM_CMPLU_STALL_REJECT", .pme_code = 0x70cb, .pme_short_desc = "Completion stall caused by reject", .pme_long_desc = "Completion stall caused by reject", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_CMPLU_STALL_REJECT], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_CMPLU_STALL_REJECT] }, [ PPC970MP_PME_PM_MRK_LSU1_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU1_FLUSH_UST", .pme_code = 0x715, .pme_short_desc = "LSU1 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary)", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_LSU1_FLUSH_UST], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_LSU1_FLUSH_UST] }, [ PPC970MP_PME_PM_MRK_FXU_FIN ] = { .pme_name = "PM_MRK_FXU_FIN", .pme_code = 0x6004, .pme_short_desc = "Marked instruction FXU processing finished", .pme_long_desc = "Marked instruction FXU processing finished", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_FXU_FIN], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_FXU_FIN] }, [ PPC970MP_PME_PM_LSU1_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU1_REJECT_ERAT_MISS", .pme_code = 0x927, .pme_short_desc = "LSU1 reject due to ERAT miss", .pme_long_desc = "LSU1 reject due to ERAT miss", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU1_REJECT_ERAT_MISS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU1_REJECT_ERAT_MISS] }, [ PPC970MP_PME_PM_BR_ISSUED ] = { .pme_name = "PM_BR_ISSUED", .pme_code = 0x431, .pme_short_desc = "Branches issued", .pme_long_desc = "This signal will be asserted each time the ISU issues a branch instruction. 
This signal will be asserted each time the ISU selects a branch instruction to issue.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_BR_ISSUED], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_BR_ISSUED] }, [ PPC970MP_PME_PM_PMC4_OVERFLOW ] = { .pme_name = "PM_PMC4_OVERFLOW", .pme_code = 0x500a, .pme_short_desc = "PMC4 Overflow", .pme_long_desc = "PMC4 Overflow", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_PMC4_OVERFLOW], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_PMC4_OVERFLOW] }, [ PPC970MP_PME_PM_EE_OFF ] = { .pme_name = "PM_EE_OFF", .pme_code = 0x333, .pme_short_desc = "Cycles MSR(EE) bit off", .pme_long_desc = "The number of Cycles MSR(EE) bit was off.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_EE_OFF], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_EE_OFF] }, [ PPC970MP_PME_PM_INST_FROM_L25_MOD ] = { .pme_name = "PM_INST_FROM_L25_MOD", .pme_code = 0x6426, .pme_short_desc = "Instruction fetched from L2.5 modified", .pme_long_desc = "Instruction fetched from L2.5 modified", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_INST_FROM_L25_MOD], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_INST_FROM_L25_MOD] }, [ PPC970MP_PME_PM_CMPLU_STALL_ERAT_MISS ] = { .pme_name = "PM_CMPLU_STALL_ERAT_MISS", .pme_code = 0x704c, .pme_short_desc = "Completion stall caused by ERAT miss", .pme_long_desc = "Completion stall caused by ERAT miss", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_CMPLU_STALL_ERAT_MISS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_CMPLU_STALL_ERAT_MISS] }, [ PPC970MP_PME_PM_ITLB_MISS ] = { .pme_name = "PM_ITLB_MISS", .pme_code = 0x700, .pme_short_desc = "Instruction TLB misses", .pme_long_desc = "A TLB miss for an Instruction Fetch has occurred", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_ITLB_MISS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_ITLB_MISS] }, [ PPC970MP_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { .pme_name = "PM_FXU1_BUSY_FXU0_IDLE", .pme_code = 
0x4002, .pme_short_desc = "FXU1 busy FXU0 idle", .pme_long_desc = "FXU0 was idle while FXU1 was busy", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FXU1_BUSY_FXU0_IDLE], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FXU1_BUSY_FXU0_IDLE] }, [ PPC970MP_PME_PM_GRP_DISP_VALID ] = { .pme_name = "PM_GRP_DISP_VALID", .pme_code = 0x323, .pme_short_desc = "Group dispatch valid", .pme_long_desc = "Dispatch has been attempted for a valid group. Some groups may be rejected. The total number of successful dispatches is the number of dispatch valid minus dispatch reject.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_GRP_DISP_VALID], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_GRP_DISP_VALID] }, [ PPC970MP_PME_PM_MRK_GRP_DISP ] = { .pme_name = "PM_MRK_GRP_DISP", .pme_code = 0x1002, .pme_short_desc = "Marked group dispatched", .pme_long_desc = "A group containing a sampled instruction was dispatched", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_GRP_DISP], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_GRP_DISP] }, [ PPC970MP_PME_PM_LSU_FLUSH_UST ] = { .pme_name = "PM_LSU_FLUSH_UST", .pme_code = 0x2800, .pme_short_desc = "SRQ unaligned store flushes", .pme_long_desc = "A store was flushed because it was unaligned", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_FLUSH_UST], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_FLUSH_UST] }, [ PPC970MP_PME_PM_FXU1_FIN ] = { .pme_name = "PM_FXU1_FIN", .pme_code = 0x336, .pme_short_desc = "FXU1 produced a result", .pme_long_desc = "The Fixed Point unit 1 finished an instruction and produced a result", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FXU1_FIN], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FXU1_FIN] }, [ PPC970MP_PME_PM_GRP_CMPL ] = { .pme_name = "PM_GRP_CMPL", .pme_code = 0x7003, .pme_short_desc = "Group completed", .pme_long_desc = "A group completed. 
Microcoded instructions that span multiple groups will generate this event once per group.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_GRP_CMPL], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_GRP_CMPL] }, [ PPC970MP_PME_PM_FPU_FRSP_FCONV ] = { .pme_name = "PM_FPU_FRSP_FCONV", .pme_code = 0x7110, .pme_short_desc = "FPU executed FRSP or FCONV instructions", .pme_long_desc = "This signal is active for one cycle when executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU_FRSP_FCONV], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU_FRSP_FCONV] }, [ PPC970MP_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_SRQ", .pme_code = 0x713, .pme_short_desc = "LSU0 marked SRQ flushes", .pme_long_desc = "A marked store was flushed because a younger load hits an older store that is already in the SRQ or in the same group.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_LSU0_FLUSH_SRQ], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_LSU0_FLUSH_SRQ] }, [ PPC970MP_PME_PM_CMPLU_STALL_OTHER ] = { .pme_name = "PM_CMPLU_STALL_OTHER", .pme_code = 0x100b, .pme_short_desc = "Completion stall caused by other reason", .pme_long_desc = "Completion stall caused by other reason", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_CMPLU_STALL_OTHER], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_CMPLU_STALL_OTHER] }, [ PPC970MP_PME_PM_LSU_LMQ_FULL_CYC ] = { .pme_name = "PM_LSU_LMQ_FULL_CYC", .pme_code = 0x837, .pme_short_desc = "Cycles LMQ full", .pme_long_desc = "The LMQ was full", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_LMQ_FULL_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_LMQ_FULL_CYC] }, [ PPC970MP_PME_PM_ST_REF_L1_LSU0 ] = { .pme_name = "PM_ST_REF_L1_LSU0", .pme_code = 0x811, .pme_short_desc = "LSU0 L1 D cache store references", 
.pme_long_desc = "A store executed on unit 0", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_ST_REF_L1_LSU0], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_ST_REF_L1_LSU0] }, [ PPC970MP_PME_PM_LSU0_DERAT_MISS ] = { .pme_name = "PM_LSU0_DERAT_MISS", .pme_code = 0x702, .pme_short_desc = "LSU0 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU0_DERAT_MISS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU0_DERAT_MISS] }, [ PPC970MP_PME_PM_LSU_SRQ_SYNC_CYC ] = { .pme_name = "PM_LSU_SRQ_SYNC_CYC", .pme_code = 0x735, .pme_short_desc = "SRQ sync duration", .pme_long_desc = "This signal is asserted every cycle when a sync is in the SRQ.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_SRQ_SYNC_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_SRQ_SYNC_CYC] }, [ PPC970MP_PME_PM_FPU_STALL3 ] = { .pme_name = "PM_FPU_STALL3", .pme_code = 0x2120, .pme_short_desc = "FPU stalled in pipe3", .pme_long_desc = "FPU has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. 
Combined Unit 0 + Unit 1", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU_STALL3], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU_STALL3] }, [ PPC970MP_PME_PM_LSU_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU_REJECT_ERAT_MISS", .pme_code = 0x5920, .pme_short_desc = "LSU reject due to ERAT miss", .pme_long_desc = "LSU reject due to ERAT miss", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_REJECT_ERAT_MISS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_REJECT_ERAT_MISS] }, [ PPC970MP_PME_PM_MRK_DATA_FROM_L2 ] = { .pme_name = "PM_MRK_DATA_FROM_L2", .pme_code = 0x1937, .pme_short_desc = "Marked data loaded from L2", .pme_long_desc = "DL1 was reloaded from the local L2 due to a marked demand load", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_DATA_FROM_L2], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_DATA_FROM_L2] }, [ PPC970MP_PME_PM_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_LSU0_FLUSH_SRQ", .pme_code = 0x803, .pme_short_desc = "LSU0 SRQ flushes", .pme_long_desc = "A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU0_FLUSH_SRQ], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU0_FLUSH_SRQ] }, [ PPC970MP_PME_PM_FPU0_FMOV_FEST ] = { .pme_name = "PM_FPU0_FMOV_FEST", .pme_code = 0x110, .pme_short_desc = "FPU0 executed FMOV or FEST instructions", .pme_long_desc = "This signal is active for one cycle when fp0 is executing a move kind of instruction or one of the estimate instructions. 
This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU0_FMOV_FEST], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU0_FMOV_FEST] }, [ PPC970MP_PME_PM_IOPS_CMPL ] = { .pme_name = "PM_IOPS_CMPL", .pme_code = 0x1001, .pme_short_desc = "IOPS instructions completed", .pme_long_desc = "Number of IOPS Instructions that completed.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_IOPS_CMPL], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_IOPS_CMPL] }, [ PPC970MP_PME_PM_LD_REF_L1_LSU0 ] = { .pme_name = "PM_LD_REF_L1_LSU0", .pme_code = 0x810, .pme_short_desc = "LSU0 L1 D cache load references", .pme_long_desc = "A load executed on unit 0", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LD_REF_L1_LSU0], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LD_REF_L1_LSU0] }, [ PPC970MP_PME_PM_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_LSU1_FLUSH_SRQ", .pme_code = 0x807, .pme_short_desc = "LSU1 SRQ flushes", .pme_long_desc = "A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. 
", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU1_FLUSH_SRQ], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU1_FLUSH_SRQ] }, [ PPC970MP_PME_PM_CMPLU_STALL_DIV ] = { .pme_name = "PM_CMPLU_STALL_DIV", .pme_code = 0x708b, .pme_short_desc = "Completion stall caused by DIV instruction", .pme_long_desc = "Completion stall caused by DIV instruction", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_CMPLU_STALL_DIV], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_CMPLU_STALL_DIV] }, [ PPC970MP_PME_PM_GRP_BR_MPRED ] = { .pme_name = "PM_GRP_BR_MPRED", .pme_code = 0x327, .pme_short_desc = "Group experienced a branch mispredict", .pme_long_desc = "Group experienced a branch mispredict", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_GRP_BR_MPRED], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_GRP_BR_MPRED] }, [ PPC970MP_PME_PM_LSU_LMQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LMQ_S0_ALLOC", .pme_code = 0x836, .pme_short_desc = "LMQ slot 0 allocated", .pme_long_desc = "The first entry in the LMQ was allocated.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_LMQ_S0_ALLOC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_LMQ_S0_ALLOC] }, [ PPC970MP_PME_PM_LSU0_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU0_REJECT_LMQ_FULL", .pme_code = 0x921, .pme_short_desc = "LSU0 reject due to LMQ full or missed data coming", .pme_long_desc = "LSU0 reject due to LMQ full or missed data coming", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU0_REJECT_LMQ_FULL], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU0_REJECT_LMQ_FULL] }, [ PPC970MP_PME_PM_ST_REF_L1 ] = { .pme_name = "PM_ST_REF_L1", .pme_code = 0x7810, .pme_short_desc = "L1 D cache store references", .pme_long_desc = "Total DL1 Store references", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_ST_REF_L1], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_ST_REF_L1] }, [ PPC970MP_PME_PM_MRK_VMX_FIN ] = { .pme_name = "PM_MRK_VMX_FIN", .pme_code = 
0x3005, .pme_short_desc = "Marked instruction VMX processing finished", .pme_long_desc = "Marked instruction VMX processing finished", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_VMX_FIN], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_VMX_FIN] }, [ PPC970MP_PME_PM_LSU_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_SRQ_EMPTY_CYC", .pme_code = 0x4003, .pme_short_desc = "Cycles SRQ empty", .pme_long_desc = "The Store Request Queue is empty", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_SRQ_EMPTY_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_SRQ_EMPTY_CYC] }, [ PPC970MP_PME_PM_FPU1_STF ] = { .pme_name = "PM_FPU1_STF", .pme_code = 0x126, .pme_short_desc = "FPU1 executed store instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing a store instruction.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU1_STF], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU1_STF] }, [ PPC970MP_PME_PM_RUN_CYC ] = { .pme_name = "PM_RUN_CYC", .pme_code = 0x1005, .pme_short_desc = "Run cycles", .pme_long_desc = "Processor Cycles gated by the run latch", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_RUN_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_RUN_CYC] }, [ PPC970MP_PME_PM_LSU_LMQ_S0_VALID ] = { .pme_name = "PM_LSU_LMQ_S0_VALID", .pme_code = 0x835, .pme_short_desc = "LMQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle when the first entry in the LMQ is valid. 
The LMQ has eight entries that are allocated FIFO", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_LMQ_S0_VALID], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_LMQ_S0_VALID] }, [ PPC970MP_PME_PM_LSU0_LDF ] = { .pme_name = "PM_LSU0_LDF", .pme_code = 0x730, .pme_short_desc = "LSU0 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed from LSU unit 0", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU0_LDF], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU0_LDF] }, [ PPC970MP_PME_PM_LSU_LRQ_S0_VALID ] = { .pme_name = "PM_LSU_LRQ_S0_VALID", .pme_code = 0x822, .pme_short_desc = "LRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Load Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_LRQ_S0_VALID], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_LRQ_S0_VALID] }, [ PPC970MP_PME_PM_PMC3_OVERFLOW ] = { .pme_name = "PM_PMC3_OVERFLOW", .pme_code = 0x400a, .pme_short_desc = "PMC3 Overflow", .pme_long_desc = "PMC3 Overflow", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_PMC3_OVERFLOW], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_PMC3_OVERFLOW] }, [ PPC970MP_PME_PM_MRK_IMR_RELOAD ] = { .pme_name = "PM_MRK_IMR_RELOAD", .pme_code = 0x722, .pme_short_desc = "Marked IMR reloaded", .pme_long_desc = "A DL1 reload occurred due to a marked load", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_IMR_RELOAD], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_IMR_RELOAD] }, [ PPC970MP_PME_PM_MRK_GRP_TIMEO ] = { .pme_name = "PM_MRK_GRP_TIMEO", .pme_code = 0x5005, .pme_short_desc = "Marked group completion timeout", .pme_long_desc = "The sampling timeout expired indicating that the previously sampled instruction is no longer in the processor", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_GRP_TIMEO], .pme_group_vector = 
ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_GRP_TIMEO] }, [ PPC970MP_PME_PM_FPU_FMOV_FEST ] = { .pme_name = "PM_FPU_FMOV_FEST", .pme_code = 0x8110, .pme_short_desc = "FPU executing FMOV or FEST instructions", .pme_long_desc = "This signal is active for one cycle when executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU_FMOV_FEST], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU_FMOV_FEST] }, [ PPC970MP_PME_PM_GRP_DISP_BLK_SB_CYC ] = { .pme_name = "PM_GRP_DISP_BLK_SB_CYC", .pme_code = 0x331, .pme_short_desc = "Cycles group dispatch blocked by scoreboard", .pme_long_desc = "The ISU sends a signal indicating that dispatch is blocked by scoreboard.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_GRP_DISP_BLK_SB_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_GRP_DISP_BLK_SB_CYC] }, [ PPC970MP_PME_PM_XER_MAP_FULL_CYC ] = { .pme_name = "PM_XER_MAP_FULL_CYC", .pme_code = 0x302, .pme_short_desc = "Cycles XER mapper full", .pme_long_desc = "The ISU sends a signal indicating that the xer mapper cannot accept any more groups. Dispatch is stopped. 
Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_XER_MAP_FULL_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_XER_MAP_FULL_CYC] }, [ PPC970MP_PME_PM_ST_MISS_L1 ] = { .pme_name = "PM_ST_MISS_L1", .pme_code = 0x813, .pme_short_desc = "L1 D cache store misses", .pme_long_desc = "A store missed the dcache", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_ST_MISS_L1], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_ST_MISS_L1] }, [ PPC970MP_PME_PM_STOP_COMPLETION ] = { .pme_name = "PM_STOP_COMPLETION", .pme_code = 0x3001, .pme_short_desc = "Completion stopped", .pme_long_desc = "RAS Unit has signaled completion to stop", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_STOP_COMPLETION], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_STOP_COMPLETION] }, [ PPC970MP_PME_PM_MRK_GRP_CMPL ] = { .pme_name = "PM_MRK_GRP_CMPL", .pme_code = 0x4004, .pme_short_desc = "Marked group completed", .pme_long_desc = "A group containing a sampled instruction completed. 
Microcoded instructions that span multiple groups will generate this event once per group.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_GRP_CMPL], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_GRP_CMPL] }, [ PPC970MP_PME_PM_ISLB_MISS ] = { .pme_name = "PM_ISLB_MISS", .pme_code = 0x701, .pme_short_desc = "Instruction SLB misses", .pme_long_desc = "An SLB miss for an instruction fetch has occurred", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_ISLB_MISS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_ISLB_MISS] }, [ PPC970MP_PME_PM_SUSPENDED ] = { .pme_name = "PM_SUSPENDED", .pme_code = 0x0, .pme_short_desc = "Suspended", .pme_long_desc = "Suspended", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_SUSPENDED], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_SUSPENDED] }, [ PPC970MP_PME_PM_CYC ] = { .pme_name = "PM_CYC", .pme_code = 0x7, .pme_short_desc = "Processor cycles", .pme_long_desc = "Processor cycles", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_CYC] }, [ PPC970MP_PME_PM_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_LD_MISS_L1_LSU1", .pme_code = 0x816, .pme_short_desc = "LSU1 L1 D cache load misses", .pme_long_desc = "A load, executing on unit 1, missed the dcache", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LD_MISS_L1_LSU1], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LD_MISS_L1_LSU1] }, [ PPC970MP_PME_PM_STCX_FAIL ] = { .pme_name = "PM_STCX_FAIL", .pme_code = 0x721, .pme_short_desc = "STCX failed", .pme_long_desc = "A stcx (stwcx or stdcx) failed", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_STCX_FAIL], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_STCX_FAIL] }, [ PPC970MP_PME_PM_LSU1_SRQ_STFWD ] = { .pme_name = "PM_LSU1_SRQ_STFWD", .pme_code = 0x824, .pme_short_desc = "LSU1 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 1", .pme_event_ids = 
ppc970mp_event_ids[PPC970MP_PME_PM_LSU1_SRQ_STFWD], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU1_SRQ_STFWD] }, [ PPC970MP_PME_PM_GRP_DISP ] = { .pme_name = "PM_GRP_DISP", .pme_code = 0x2004, .pme_short_desc = "Group dispatches", .pme_long_desc = "A group was dispatched", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_GRP_DISP], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_GRP_DISP] }, [ PPC970MP_PME_PM_L2_PREF ] = { .pme_name = "PM_L2_PREF", .pme_code = 0x733, .pme_short_desc = "L2 cache prefetches", .pme_long_desc = "A request to prefetch data into L2 was made", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_L2_PREF], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_L2_PREF] }, [ PPC970MP_PME_PM_FPU1_DENORM ] = { .pme_name = "PM_FPU1_DENORM", .pme_code = 0x124, .pme_short_desc = "FPU1 received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU1_DENORM], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU1_DENORM] }, [ PPC970MP_PME_PM_DATA_FROM_L2 ] = { .pme_name = "PM_DATA_FROM_L2", .pme_code = 0x1837, .pme_short_desc = "Data loaded from L2", .pme_long_desc = "DL1 was reloaded from the local L2 due to a demand load", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_DATA_FROM_L2], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_DATA_FROM_L2] }, [ PPC970MP_PME_PM_FPU0_FPSCR ] = { .pme_name = "PM_FPU0_FPSCR", .pme_code = 0x130, .pme_short_desc = "FPU0 executed FPSCR instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*. 
mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU0_FPSCR], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU0_FPSCR] }, [ PPC970MP_PME_PM_MRK_DATA_FROM_L25_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L25_MOD", .pme_code = 0x6937, .pme_short_desc = "Marked data loaded from L2.5 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a marked demand load", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_DATA_FROM_L25_MOD], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_DATA_FROM_L25_MOD] }, [ PPC970MP_PME_PM_FPU0_FSQRT ] = { .pme_name = "PM_FPU0_FSQRT", .pme_code = 0x102, .pme_short_desc = "FPU0 executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp0 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU0_FSQRT], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU0_FSQRT] }, [ PPC970MP_PME_PM_LD_REF_L1 ] = { .pme_name = "PM_LD_REF_L1", .pme_code = 0x8810, .pme_short_desc = "L1 D cache load references", .pme_long_desc = "Total DL1 Load references", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LD_REF_L1], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LD_REF_L1] }, [ PPC970MP_PME_PM_MRK_L1_RELOAD_VALID ] = { .pme_name = "PM_MRK_L1_RELOAD_VALID", .pme_code = 0x934, .pme_short_desc = "Marked L1 reload data source valid", .pme_long_desc = "The source information is valid and is for a marked load", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_L1_RELOAD_VALID], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_L1_RELOAD_VALID] }, [ PPC970MP_PME_PM_1PLUS_PPC_CMPL ] = { .pme_name = "PM_1PLUS_PPC_CMPL", .pme_code = 0x5003, .pme_short_desc = "One or more PPC instruction completed", .pme_long_desc = "A group 
containing at least one PPC instruction completed. For microcoded instructions that span multiple groups, this will only occur once.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_1PLUS_PPC_CMPL], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_1PLUS_PPC_CMPL] }, [ PPC970MP_PME_PM_INST_FROM_L1 ] = { .pme_name = "PM_INST_FROM_L1", .pme_code = 0x142d, .pme_short_desc = "Instruction fetched from L1", .pme_long_desc = "An instruction fetch group was fetched from L1. Fetch Groups can contain up to 8 instructions", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_INST_FROM_L1], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_INST_FROM_L1] }, [ PPC970MP_PME_PM_EE_OFF_EXT_INT ] = { .pme_name = "PM_EE_OFF_EXT_INT", .pme_code = 0x337, .pme_short_desc = "Cycles MSR(EE) bit off and external interrupt pending", .pme_long_desc = "Cycles MSR(EE) bit off and external interrupt pending", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_EE_OFF_EXT_INT], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_EE_OFF_EXT_INT] }, [ PPC970MP_PME_PM_PMC6_OVERFLOW ] = { .pme_name = "PM_PMC6_OVERFLOW", .pme_code = 0x700a, .pme_short_desc = "PMC6 Overflow", .pme_long_desc = "PMC6 Overflow", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_PMC6_OVERFLOW], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_PMC6_OVERFLOW] }, [ PPC970MP_PME_PM_LSU_LRQ_FULL_CYC ] = { .pme_name = "PM_LSU_LRQ_FULL_CYC", .pme_code = 0x312, .pme_short_desc = "Cycles LRQ full", .pme_long_desc = "The ISU sends this signal when the LRQ is full.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_LRQ_FULL_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_LRQ_FULL_CYC] }, [ PPC970MP_PME_PM_IC_PREF_INSTALL ] = { .pme_name = "PM_IC_PREF_INSTALL", .pme_code = 0x427, .pme_short_desc = "Instruction prefetched installed in prefetch", .pme_long_desc = "New line coming into the prefetch buffer", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_IC_PREF_INSTALL], 
.pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_IC_PREF_INSTALL] }, [ PPC970MP_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { .pme_name = "PM_DC_PREF_OUT_OF_STREAMS", .pme_code = 0x732, .pme_short_desc = "D cache out of streams", .pme_long_desc = "out of streams", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_DC_PREF_OUT_OF_STREAMS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_DC_PREF_OUT_OF_STREAMS] }, [ PPC970MP_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_SRQ", .pme_code = 0x717, .pme_short_desc = "LSU1 marked SRQ flushes", .pme_long_desc = "A marked store was flushed because a younger load hit an older store that was already in the SRQ or in the same group.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_LSU1_FLUSH_SRQ], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_LSU1_FLUSH_SRQ] }, [ PPC970MP_PME_PM_GCT_FULL_CYC ] = { .pme_name = "PM_GCT_FULL_CYC", .pme_code = 0x300, .pme_short_desc = "Cycles GCT full", .pme_long_desc = "The ISU sends a signal indicating the gct is full. 
", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_GCT_FULL_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_GCT_FULL_CYC] }, [ PPC970MP_PME_PM_INST_FROM_MEM ] = { .pme_name = "PM_INST_FROM_MEM", .pme_code = 0x2426, .pme_short_desc = "Instruction fetched from memory", .pme_long_desc = "Instruction fetched from memory", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_INST_FROM_MEM], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_INST_FROM_MEM] }, [ PPC970MP_PME_PM_FLUSH_LSU_BR_MPRED ] = { .pme_name = "PM_FLUSH_LSU_BR_MPRED", .pme_code = 0x317, .pme_short_desc = "Flush caused by LSU or branch mispredict", .pme_long_desc = "Flush caused by LSU or branch mispredict", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FLUSH_LSU_BR_MPRED], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FLUSH_LSU_BR_MPRED] }, [ PPC970MP_PME_PM_FXU_BUSY ] = { .pme_name = "PM_FXU_BUSY", .pme_code = 0x6002, .pme_short_desc = "FXU busy", .pme_long_desc = "FXU0 and FXU1 are both busy", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FXU_BUSY], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FXU_BUSY] }, [ PPC970MP_PME_PM_ST_REF_L1_LSU1 ] = { .pme_name = "PM_ST_REF_L1_LSU1", .pme_code = 0x815, .pme_short_desc = "LSU1 L1 D cache store references", .pme_long_desc = "A store executed on unit 1", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_ST_REF_L1_LSU1], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_ST_REF_L1_LSU1] }, [ PPC970MP_PME_PM_MRK_LD_MISS_L1 ] = { .pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 0x1720, .pme_short_desc = "Marked L1 D cache load misses", .pme_long_desc = "Marked L1 D cache load misses", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_LD_MISS_L1], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_LD_MISS_L1] }, [ PPC970MP_PME_PM_L1_WRITE_CYC ] = { .pme_name = "PM_L1_WRITE_CYC", .pme_code = 0x434, .pme_short_desc = "Cycles writing to instruction L1", .pme_long_desc = "This signal is asserted 
each cycle a cache write is active.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_L1_WRITE_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_L1_WRITE_CYC] }, [ PPC970MP_PME_PM_LSU1_BUSY ] = { .pme_name = "PM_LSU1_BUSY", .pme_code = 0x827, .pme_short_desc = "LSU1 busy", .pme_long_desc = "LSU unit 1 is busy rejecting instructions ", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU1_BUSY], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU1_BUSY] }, [ PPC970MP_PME_PM_LSU_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU_REJECT_LMQ_FULL", .pme_code = 0x2920, .pme_short_desc = "LSU reject due to LMQ full or missed data coming", .pme_long_desc = "LSU reject due to LMQ full or missed data coming", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_REJECT_LMQ_FULL], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_REJECT_LMQ_FULL] }, [ PPC970MP_PME_PM_CMPLU_STALL_FDIV ] = { .pme_name = "PM_CMPLU_STALL_FDIV", .pme_code = 0x504c, .pme_short_desc = "Completion stall caused by FDIV or FSQRT instruction", .pme_long_desc = "Completion stall caused by FDIV or FSQRT instruction", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_CMPLU_STALL_FDIV], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_CMPLU_STALL_FDIV] }, [ PPC970MP_PME_PM_FPU_ALL ] = { .pme_name = "PM_FPU_ALL", .pme_code = 0x5100, .pme_short_desc = "FPU executed add", .pme_long_desc = " mult", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU_ALL], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU_ALL] }, [ PPC970MP_PME_PM_LSU_SRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_SRQ_S0_ALLOC", .pme_code = 0x825, .pme_short_desc = "SRQ slot 0 allocated", .pme_long_desc = "SRQ Slot zero was allocated", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_SRQ_S0_ALLOC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_SRQ_S0_ALLOC] }, [ PPC970MP_PME_PM_INST_FROM_L25_SHR ] = { .pme_name = "PM_INST_FROM_L25_SHR", .pme_code = 0x5426, .pme_short_desc = 
"Instruction fetched from L2.5 shared", .pme_long_desc = "Instruction fetched from L2.5 shared", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_INST_FROM_L25_SHR], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_INST_FROM_L25_SHR] }, [ PPC970MP_PME_PM_GRP_MRK ] = { .pme_name = "PM_GRP_MRK", .pme_code = 0x5004, .pme_short_desc = "Group marked in IDU", .pme_long_desc = "A group was sampled (marked)", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_GRP_MRK], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_GRP_MRK] }, [ PPC970MP_PME_PM_BR_MPRED_CR ] = { .pme_name = "PM_BR_MPRED_CR", .pme_code = 0x432, .pme_short_desc = "Branch mispredictions due to CR bit setting", .pme_long_desc = "This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_BR_MPRED_CR], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_BR_MPRED_CR] }, [ PPC970MP_PME_PM_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_DC_PREF_STREAM_ALLOC", .pme_code = 0x737, .pme_short_desc = "D cache new prefetch stream allocated", .pme_long_desc = "A new Prefetch Stream was allocated", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_DC_PREF_STREAM_ALLOC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_DC_PREF_STREAM_ALLOC] }, [ PPC970MP_PME_PM_FPU1_FIN ] = { .pme_name = "PM_FPU1_FIN", .pme_code = 0x117, .pme_short_desc = "FPU1 produced a result", .pme_long_desc = "fp1 finished, produced a result. This only indicates finish, not completion. 
", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU1_FIN], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU1_FIN] }, [ PPC970MP_PME_PM_LSU_REJECT_SRQ ] = { .pme_name = "PM_LSU_REJECT_SRQ", .pme_code = 0x1920, .pme_short_desc = "LSU SRQ rejects", .pme_long_desc = "LSU SRQ rejects", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_REJECT_SRQ], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_REJECT_SRQ] }, [ PPC970MP_PME_PM_BR_MPRED_TA ] = { .pme_name = "PM_BR_MPRED_TA", .pme_code = 0x433, .pme_short_desc = "Branch mispredictions due to target address", .pme_long_desc = "Branch mispredict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_BR_MPRED_TA], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_BR_MPRED_TA] }, [ PPC970MP_PME_PM_CRQ_FULL_CYC ] = { .pme_name = "PM_CRQ_FULL_CYC", .pme_code = 0x311, .pme_short_desc = "Cycles CR issue queue full", .pme_long_desc = "The ISU sends a signal indicating that the issue queue that feeds the ifu cr unit cannot accept any more group (queue is full of groups).", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_CRQ_FULL_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_CRQ_FULL_CYC] }, [ PPC970MP_PME_PM_LD_MISS_L1 ] = { .pme_name = "PM_LD_MISS_L1", .pme_code = 0x3810, .pme_short_desc = "L1 D cache load misses", .pme_long_desc = "Total DL1 Load references that miss the DL1", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LD_MISS_L1], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LD_MISS_L1] }, [ PPC970MP_PME_PM_INST_FROM_PREF ] = { .pme_name = "PM_INST_FROM_PREF", .pme_code = 0x342d, .pme_short_desc = "Instructions fetched from prefetch", 
.pme_long_desc = "An instruction fetch group was fetched from the prefetch buffer. Fetch Groups can contain up to 8 instructions", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_INST_FROM_PREF], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_INST_FROM_PREF] }, [ PPC970MP_PME_PM_STCX_PASS ] = { .pme_name = "PM_STCX_PASS", .pme_code = 0x725, .pme_short_desc = "Stcx passes", .pme_long_desc = "A stcx (stwcx or stdcx) instruction was successful", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_STCX_PASS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_STCX_PASS] }, [ PPC970MP_PME_PM_DC_INV_L2 ] = { .pme_name = "PM_DC_INV_L2", .pme_code = 0x817, .pme_short_desc = "L1 D cache entries invalidated from L2", .pme_long_desc = "A dcache invalidate was received from the L2 because a line in L2 was castout.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_DC_INV_L2], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_DC_INV_L2] }, [ PPC970MP_PME_PM_LSU_SRQ_FULL_CYC ] = { .pme_name = "PM_LSU_SRQ_FULL_CYC", .pme_code = 0x313, .pme_short_desc = "Cycles SRQ full", .pme_long_desc = "The ISU sends this signal when the srq is full.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_SRQ_FULL_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_SRQ_FULL_CYC] }, [ PPC970MP_PME_PM_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_LSU0_FLUSH_LRQ", .pme_code = 0x802, .pme_short_desc = "LSU0 LRQ flushes", .pme_long_desc = "A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU0_FLUSH_LRQ], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU0_FLUSH_LRQ] }, [ PPC970MP_PME_PM_LSU_SRQ_S0_VALID ] = { .pme_name = "PM_LSU_SRQ_S0_VALID", .pme_code = 0x821, .pme_short_desc = "SRQ slot 0 valid", .pme_long_desc = 
"This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_SRQ_S0_VALID], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_SRQ_S0_VALID] }, [ PPC970MP_PME_PM_LARX_LSU0 ] = { .pme_name = "PM_LARX_LSU0", .pme_code = 0x727, .pme_short_desc = "Larx executed on LSU0", .pme_long_desc = "A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0)", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LARX_LSU0], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LARX_LSU0] }, [ PPC970MP_PME_PM_GCT_EMPTY_CYC ] = { .pme_name = "PM_GCT_EMPTY_CYC", .pme_code = 0x1004, .pme_short_desc = "Cycles GCT empty", .pme_long_desc = "The Global Completion Table is completely empty", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_GCT_EMPTY_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_GCT_EMPTY_CYC] }, [ PPC970MP_PME_PM_FPU1_ALL ] = { .pme_name = "PM_FPU1_ALL", .pme_code = 0x107, .pme_short_desc = "FPU1 executed add", .pme_long_desc = " mult", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU1_ALL], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU1_ALL] }, [ PPC970MP_PME_PM_FPU1_FSQRT ] = { .pme_name = "PM_FPU1_FSQRT", .pme_code = 0x106, .pme_short_desc = "FPU1 executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp1 is executing a square root instruction. 
This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU1_FSQRT], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU1_FSQRT] }, [ PPC970MP_PME_PM_FPU_FIN ] = { .pme_name = "PM_FPU_FIN", .pme_code = 0x4110, .pme_short_desc = "FPU produced a result", .pme_long_desc = "FPU finished, produced a result. This only indicates finish, not completion. Combined Unit 0 + Unit 1", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU_FIN], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU_FIN] }, [ PPC970MP_PME_PM_LSU_SRQ_STFWD ] = { .pme_name = "PM_LSU_SRQ_STFWD", .pme_code = 0x1820, .pme_short_desc = "SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_SRQ_STFWD], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_SRQ_STFWD] }, [ PPC970MP_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU1", .pme_code = 0x724, .pme_short_desc = "LSU1 L1 D cache load misses", .pme_long_desc = "A marked load, executing on unit 1, missed the dcache", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_LD_MISS_L1_LSU1], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_LD_MISS_L1_LSU1] }, [ PPC970MP_PME_PM_FXU0_FIN ] = { .pme_name = "PM_FXU0_FIN", .pme_code = 0x332, .pme_short_desc = "FXU0 produced a result", .pme_long_desc = "The Fixed Point unit 0 finished an instruction and produced a result", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FXU0_FIN], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FXU0_FIN] }, [ PPC970MP_PME_PM_MRK_FPU_FIN ] = { .pme_name = "PM_MRK_FPU_FIN", .pme_code = 0x7004, .pme_short_desc = "Marked instruction FPU processing finished", .pme_long_desc = "One of the Floating Point Units finished a marked instruction. 
Instructions that finish may not necessarily complete", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_FPU_FIN], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_FPU_FIN] }, [ PPC970MP_PME_PM_PMC5_OVERFLOW ] = { .pme_name = "PM_PMC5_OVERFLOW", .pme_code = 0x600a, .pme_short_desc = "PMC5 Overflow", .pme_long_desc = "PMC5 Overflow", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_PMC5_OVERFLOW], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_PMC5_OVERFLOW] }, [ PPC970MP_PME_PM_SNOOP_TLBIE ] = { .pme_name = "PM_SNOOP_TLBIE", .pme_code = 0x703, .pme_short_desc = "Snoop TLBIE", .pme_long_desc = "A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_SNOOP_TLBIE], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_SNOOP_TLBIE] }, [ PPC970MP_PME_PM_FPU1_FRSP_FCONV ] = { .pme_name = "PM_FPU1_FRSP_FCONV", .pme_code = 0x115, .pme_short_desc = "FPU1 executed FRSP or FCONV instructions", .pme_long_desc = "This signal is active for one cycle when fp1 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU1_FRSP_FCONV], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU1_FRSP_FCONV] }, [ PPC970MP_PME_PM_FPU0_FDIV ] = { .pme_name = "PM_FPU0_FDIV", .pme_code = 0x100, .pme_short_desc = "FPU0 executed FDIV instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp0 is executing a divide instruction. This could be fdiv, fdivs, fdiv. 
fdivs.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU0_FDIV], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU0_FDIV] }, [ PPC970MP_PME_PM_LD_REF_L1_LSU1 ] = { .pme_name = "PM_LD_REF_L1_LSU1", .pme_code = 0x814, .pme_short_desc = "LSU1 L1 D cache load references", .pme_long_desc = "A load executed on unit 1", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LD_REF_L1_LSU1], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LD_REF_L1_LSU1] }, [ PPC970MP_PME_PM_HV_CYC ] = { .pme_name = "PM_HV_CYC", .pme_code = 0x3004, .pme_short_desc = "Hypervisor Cycles", .pme_long_desc = "Cycles when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0)", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_HV_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_HV_CYC] }, [ PPC970MP_PME_PM_LR_CTR_MAP_FULL_CYC ] = { .pme_name = "PM_LR_CTR_MAP_FULL_CYC", .pme_code = 0x306, .pme_short_desc = "Cycles LR/CTR mapper full", .pme_long_desc = "The ISU sends a signal indicating that the lr/ctr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LR_CTR_MAP_FULL_CYC], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LR_CTR_MAP_FULL_CYC] }, [ PPC970MP_PME_PM_FPU_DENORM ] = { .pme_name = "PM_FPU_DENORM", .pme_code = 0x1120, .pme_short_desc = "FPU received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized. 
Combined Unit 0 + Unit 1", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_FPU_DENORM], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_FPU_DENORM] }, [ PPC970MP_PME_PM_LSU0_REJECT_SRQ ] = { .pme_name = "PM_LSU0_REJECT_SRQ", .pme_code = 0x920, .pme_short_desc = "LSU0 SRQ rejects", .pme_long_desc = "LSU0 SRQ rejects", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU0_REJECT_SRQ], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU0_REJECT_SRQ] }, [ PPC970MP_PME_PM_LSU1_REJECT_SRQ ] = { .pme_name = "PM_LSU1_REJECT_SRQ", .pme_code = 0x924, .pme_short_desc = "LSU1 SRQ rejects", .pme_long_desc = "LSU1 SRQ rejects", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU1_REJECT_SRQ], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU1_REJECT_SRQ] }, [ PPC970MP_PME_PM_LSU1_DERAT_MISS ] = { .pme_name = "PM_LSU1_DERAT_MISS", .pme_code = 0x706, .pme_short_desc = "LSU1 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU1_DERAT_MISS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU1_DERAT_MISS] }, [ PPC970MP_PME_PM_IC_PREF_REQ ] = { .pme_name = "PM_IC_PREF_REQ", .pme_code = 0x426, .pme_short_desc = "Instruction prefetch requests", .pme_long_desc = "Asserted when a non-canceled prefetch is made to the cache interface unit (CIU).", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_IC_PREF_REQ], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_IC_PREF_REQ] }, [ PPC970MP_PME_PM_MRK_LSU_FIN ] = { .pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x8004, .pme_short_desc = "Marked instruction LSU processing finished", .pme_long_desc = "One of the Load/Store Units finished a marked instruction. 
Instructions that finish may not necessarily complete", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_LSU_FIN], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_LSU_FIN] }, [ PPC970MP_PME_PM_MRK_DATA_FROM_MEM ] = { .pme_name = "PM_MRK_DATA_FROM_MEM", .pme_code = 0x2937, .pme_short_desc = "Marked data loaded from memory", .pme_long_desc = "Marked data loaded from memory", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_MRK_DATA_FROM_MEM], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_MRK_DATA_FROM_MEM] }, [ PPC970MP_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { .pme_name = "PM_CMPLU_STALL_DCACHE_MISS", .pme_code = 0x50cb, .pme_short_desc = "Completion stall caused by D cache miss", .pme_long_desc = "Completion stall caused by D cache miss", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_CMPLU_STALL_DCACHE_MISS], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_CMPLU_STALL_DCACHE_MISS] }, [ PPC970MP_PME_PM_LSU0_FLUSH_UST ] = { .pme_name = "PM_LSU0_FLUSH_UST", .pme_code = 0x801, .pme_short_desc = "LSU0 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 0 because it was unaligned (crossed a 4k boundary)", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU0_FLUSH_UST], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU0_FLUSH_UST] }, [ PPC970MP_PME_PM_LSU_FLUSH_LRQ ] = { .pme_name = "PM_LSU_FLUSH_LRQ", .pme_code = 0x6800, .pme_short_desc = "LRQ flushes", .pme_long_desc = "A load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_FLUSH_LRQ], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_FLUSH_LRQ] }, [ PPC970MP_PME_PM_LSU_FLUSH_SRQ ] = { .pme_name = "PM_LSU_FLUSH_SRQ", .pme_code = 0x5800, .pme_short_desc = "SRQ flushes", .pme_long_desc = "A store was flushed 
because younger load hits and older store that is already in the SRQ or in the same group.", .pme_event_ids = ppc970mp_event_ids[PPC970MP_PME_PM_LSU_FLUSH_SRQ], .pme_group_vector = ppc970mp_group_vecs[PPC970MP_PME_PM_LSU_FLUSH_SRQ] } }; #define PPC970MP_PME_EVENT_COUNT 230 static const int ppc970mp_group_event_ids[][PPC970MP_NUM_EVENT_COUNTERS] = { [ 0 ] = { 81, 2, 65, 30, 0, 2, 28, 29 }, [ 1 ] = { 2, 2, 36, 6, 39, 35, 63, 37 }, [ 2 ] = { 36, 2, 36, 6, 39, 35, 63, 37 }, [ 3 ] = { 64, 63, 4, 30, 65, 63, 63, 37 }, [ 4 ] = { 27, 25, 21, 22, 3, 25, 30, 22 }, [ 5 ] = { 26, 26, 4, 30, 26, 26, 21, 43 }, [ 6 ] = { 87, 1, 3, 29, 44, 36, 30, 4 }, [ 7 ] = { 13, 21, 22, 24, 3, 35, 46, 49 }, [ 8 ] = { 37, 2, 24, 27, 34, 31, 30, 4 }, [ 9 ] = { 28, 83, 65, 10, 3, 35, 8, 10 }, [ 10 ] = { 10, 18, 16, 21, 11, 19, 30, 4 }, [ 11 ] = { 12, 20, 13, 19, 8, 16, 30, 4 }, [ 12 ] = { 9, 17, 14, 20, 3, 35, 12, 18 }, [ 13 ] = { 15, 23, 13, 19, 3, 35, 4, 16 }, [ 14 ] = { 45, 54, 4, 5, 47, 54, 30, 4 }, [ 15 ] = { 47, 56, 39, 38, 3, 35, 35, 36 }, [ 16 ] = { 48, 57, 67, 65, 3, 35, 62, 5 }, [ 17 ] = { 53, 62, 67, 65, 81, 2, 30, 4 }, [ 18 ] = { 44, 53, 4, 5, 38, 2, 31, 4 }, [ 19 ] = { 28, 64, 29, 5, 0, 35, 28, 67 }, [ 20 ] = { 27, 25, 26, 22, 3, 25, 30, 22 }, [ 21 ] = { 6, 40, 36, 63, 3, 35, 63, 37 }, [ 22 ] = { 6, 64, 36, 63, 3, 35, 63, 37 }, [ 23 ] = { 27, 25, 13, 19, 3, 26, 30, 43 }, [ 24 ] = { 36, 2, 36, 1, 81, 2, 1, 2 }, [ 25 ] = { 36, 2, 36, 1, 3, 81, 63, 37 }, [ 26 ] = { 81, 4, 0, 2, 41, 2, 30, 2 }, [ 27 ] = { 3, 87, 30, 5, 38, 2, 44, 47 }, [ 28 ] = { 6, 40, 30, 5, 66, 65, 32, 34 }, [ 29 ] = { 39, 38, 29, 30, 4, 2, 28, 5 }, [ 30 ] = { 68, 69, 36, 49, 38, 35, 4, 37 }, [ 31 ] = { 38, 36, 70, 5, 38, 2, 30, 4 }, [ 32 ] = { 28, 33, 32, 30, 39, 62, 63, 4 }, [ 33 ] = { 74, 82, 4, 51, 35, 70, 50, 30 }, [ 34 ] = { 72, 70, 4, 50, 35, 69, 49, 60 }, [ 35 ] = { 78, 2, 62, 51, 71, 75, 60, 30 }, [ 36 ] = { 79, 71, 56, 60, 3, 35, 54, 58 }, [ 37 ] = { 75, 73, 53, 57, 3, 35, 53, 57 }, [ 38 ] = { 36, 36, 26, 
26, 28, 27, 24, 4 }, [ 39 ] = { 36, 2, 23, 23, 28, 27, 25, 26 }, [ 40 ] = { 38, 38, 31, 0, 90, 37, 4, 30 }, [ 41 ] = { 85, 85, 43, 12, 84, 35, 70, 4 }, [ 42 ] = { 88, 36, 36, 5, 86, 62, 69, 37 }, [ 43 ] = { 36, 27, 26, 22, 85, 27, 68, 4 }, [ 44 ] = { 27, 25, 30, 68, 87, 25, 67, 4 }, [ 45 ] = { 28, 36, 10, 3, 88, 2, 71, 33 }, [ 46 ] = { 36, 36, 4, 5, 91, 87, 44, 47 }, [ 47 ] = { 39, 38, 31, 1, 3, 35, 1, 2 }, [ 48 ] = { 3, 87, 30, 35, 0, 2, 36, 37 }, [ 49 ] = { 3, 87, 30, 5, 91, 87, 36, 37 }, [ 50 ] = { 71, 88, 30, 5, 92, 88, 50, 51 } }; static const pmg_power_group_t ppc970mp_groups[] = { [ 0 ] = { .pmg_name = "pm_slice0", .pmg_desc = "Time Slice 0", .pmg_event_ids = ppc970mp_group_event_ids[0], .pmg_mmcr0 = 0x000000000000051eULL, .pmg_mmcr1 = 0x000000000a46f18cULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 1 ] = { .pmg_name = "pm_eprof", .pmg_desc = "Group for use with eprof", .pmg_event_ids = ppc970mp_group_event_ids[1], .pmg_mmcr0 = 0x0000000000000f1eULL, .pmg_mmcr1 = 0x4003001005f09000ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 2 ] = { .pmg_name = "pm_basic", .pmg_desc = "Basic performance indicators", .pmg_event_ids = ppc970mp_group_event_ids[2], .pmg_mmcr0 = 0x000000000000091eULL, .pmg_mmcr1 = 0x4003001005f09000ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 3 ] = { .pmg_name = "pm_lsu", .pmg_desc = "Information on the Load Store Unit", .pmg_event_ids = ppc970mp_group_event_ids[3], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000f00007a400000ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 4 ] = { .pmg_name = "pm_fpu1", .pmg_desc = "Floating Point events", .pmg_event_ids = ppc970mp_group_event_ids[4], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000001e0480ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 5 ] = { .pmg_name = "pm_fpu2", .pmg_desc = "Floating Point events", .pmg_event_ids = ppc970mp_group_event_ids[5], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000020e87a400000ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 6 ] = { .pmg_name = 
"pm_isu_rename", .pmg_desc = "ISU Rename Pool Events", .pmg_event_ids = ppc970mp_group_event_ids[6], .pmg_mmcr0 = 0x0000000000001228ULL, .pmg_mmcr1 = 0x400000218e6d84bcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 7 ] = { .pmg_name = "pm_isu_queues1", .pmg_desc = "ISU Rename Pool Events", .pmg_event_ids = ppc970mp_group_event_ids[7], .pmg_mmcr0 = 0x000000000000132eULL, .pmg_mmcr1 = 0x40000000851e994cULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 8 ] = { .pmg_name = "pm_isu_flow", .pmg_desc = "ISU Instruction Flow Events", .pmg_event_ids = ppc970mp_group_event_ids[8], .pmg_mmcr0 = 0x000000000000181eULL, .pmg_mmcr1 = 0x400000b3d7b7c4bcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 9 ] = { .pmg_name = "pm_isu_work", .pmg_desc = "ISU Indicators of Work Blockage", .pmg_event_ids = ppc970mp_group_event_ids[9], .pmg_mmcr0 = 0x0000000000000402ULL, .pmg_mmcr1 = 0x400000050fde9d88ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 10 ] = { .pmg_name = "pm_fpu3", .pmg_desc = "Floating Point events by unit", .pmg_event_ids = ppc970mp_group_event_ids[10], .pmg_mmcr0 = 0x0000000000001028ULL, .pmg_mmcr1 = 0x000000008d6354bcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 11 ] = { .pmg_name = "pm_fpu4", .pmg_desc = "Floating Point events by unit", .pmg_event_ids = ppc970mp_group_event_ids[11], .pmg_mmcr0 = 0x000000000000122cULL, .pmg_mmcr1 = 0x000000009de774bcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 12 ] = { .pmg_name = "pm_fpu5", .pmg_desc = "Floating Point events by unit", .pmg_event_ids = ppc970mp_group_event_ids[12], .pmg_mmcr0 = 0x0000000000001838ULL, .pmg_mmcr1 = 0x000000c0851e9958ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 13 ] = { .pmg_name = "pm_fpu7", .pmg_desc = "Floating Point events by unit", .pmg_event_ids = ppc970mp_group_event_ids[13], .pmg_mmcr0 = 0x000000000000193aULL, .pmg_mmcr1 = 0x000000c89dde97e0ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 14 ] = { .pmg_name = "pm_lsu_flush", .pmg_desc = "LSU Flush Events", .pmg_event_ids = ppc970mp_group_event_ids[14], .pmg_mmcr0 = 
0x000000000000122cULL, .pmg_mmcr1 = 0x000c00007be774bcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 15 ] = { .pmg_name = "pm_lsu_load1", .pmg_desc = "LSU Load Events", .pmg_event_ids = ppc970mp_group_event_ids[15], .pmg_mmcr0 = 0x0000000000001028ULL, .pmg_mmcr1 = 0x000f0000851e9958ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 16 ] = { .pmg_name = "pm_lsu_store1", .pmg_desc = "LSU Store Events", .pmg_event_ids = ppc970mp_group_event_ids[16], .pmg_mmcr0 = 0x000000000000112aULL, .pmg_mmcr1 = 0x000f00008d5e99dcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 17 ] = { .pmg_name = "pm_lsu_store2", .pmg_desc = "LSU Store Events", .pmg_event_ids = ppc970mp_group_event_ids[17], .pmg_mmcr0 = 0x0000000000001838ULL, .pmg_mmcr1 = 0x0003c0d08d76f4bcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 18 ] = { .pmg_name = "pm_lsu7", .pmg_desc = "Information on the Load Store Unit", .pmg_event_ids = ppc970mp_group_event_ids[18], .pmg_mmcr0 = 0x000000000000122cULL, .pmg_mmcr1 = 0x000830047bd2fe3cULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 19 ] = { .pmg_name = "pm_misc", .pmg_desc = "Misc Events for testing", .pmg_event_ids = ppc970mp_group_event_ids[19], .pmg_mmcr0 = 0x0000000000000404ULL, .pmg_mmcr1 = 0x0000000023c69194ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 20 ] = { .pmg_name = "pm_pe_bench1", .pmg_desc = "PE Benchmarker group for FP analysis", .pmg_event_ids = ppc970mp_group_event_ids[20], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x10001002001e0480ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 21 ] = { .pmg_name = "pm_pe_bench4", .pmg_desc = "PE Benchmarker group for L1 and TLB", .pmg_event_ids = ppc970mp_group_event_ids[21], .pmg_mmcr0 = 0x0000000000001420ULL, .pmg_mmcr1 = 0x000b000004de9000ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 22 ] = { .pmg_name = "pm_hpmcount1", .pmg_desc = "Hpmcount group for L1 and TLB behavior", .pmg_event_ids = ppc970mp_group_event_ids[22], .pmg_mmcr0 = 0x0000000000001404ULL, .pmg_mmcr1 = 0x000b000004de9000ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 
23 ] = { .pmg_name = "pm_hpmcount2", .pmg_desc = "Hpmcount group for computation", .pmg_event_ids = ppc970mp_group_event_ids[23], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x000020289dde0480ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 24 ] = { .pmg_name = "pm_l1andbr", .pmg_desc = "L1 misses and branch mispredict analysis", .pmg_event_ids = ppc970mp_group_event_ids[24], .pmg_mmcr0 = 0x000000000000091eULL, .pmg_mmcr1 = 0x8003c01d0676fd6cULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 25 ] = { .pmg_name = "Instruction mix: loads", .pmg_desc = " stores and branches", .pmg_event_ids = ppc970mp_group_event_ids[25], .pmg_mmcr0 = 0x000000000000091eULL, .pmg_mmcr1 = 0x8003c021065fb000ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 26 ] = { .pmg_name = "pm_branch", .pmg_desc = "SLB and branch mispredict analysis", .pmg_event_ids = ppc970mp_group_event_ids[26], .pmg_mmcr0 = 0x000000000000052aULL, .pmg_mmcr1 = 0x8008000bcea2f4ecULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 27 ] = { .pmg_name = "pm_data", .pmg_desc = "data source and LMQ", .pmg_event_ids = ppc970mp_group_event_ids[27], .pmg_mmcr0 = 0x000000000000070eULL, .pmg_mmcr1 = 0x0000300c4bd2ff74ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 28 ] = { .pmg_name = "pm_tlb", .pmg_desc = "TLB and LRQ plus data prefetch", .pmg_event_ids = ppc970mp_group_event_ids[28], .pmg_mmcr0 = 0x0000000000001420ULL, .pmg_mmcr1 = 0x0008e03c4bfdacecULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 29 ] = { .pmg_name = "pm_isource", .pmg_desc = "inst source and tablewalk", .pmg_event_ids = ppc970mp_group_event_ids[29], .pmg_mmcr0 = 0x000000000000060cULL, .pmg_mmcr1 = 0x800b00c0226ef1dcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 30 ] = { .pmg_name = "pm_sync", .pmg_desc = "Sync and SRQ", .pmg_event_ids = ppc970mp_group_event_ids[30], .pmg_mmcr0 = 0x0000000000001d32ULL, .pmg_mmcr1 = 0x0003e0c107529780ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 31 ] = { .pmg_name = "pm_ierat", .pmg_desc = "IERAT", .pmg_event_ids = ppc970mp_group_event_ids[31], 
.pmg_mmcr0 = 0x0000000000000d12ULL, .pmg_mmcr1 = 0x80000082c3d2f4bcULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 32 ] = { .pmg_name = "pm_derat", .pmg_desc = "DERAT", .pmg_event_ids = ppc970mp_group_event_ids[32], .pmg_mmcr0 = 0x0000000000000436ULL, .pmg_mmcr1 = 0x100b7052e274003cULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 33 ] = { .pmg_name = "pm_mark1", .pmg_desc = "Information on marked instructions", .pmg_event_ids = ppc970mp_group_event_ids[33], .pmg_mmcr0 = 0x0000000000000006ULL, .pmg_mmcr1 = 0x00008080790852a4ULL, .pmg_mmcra = 0x0000000000002001ULL }, [ 34 ] = { .pmg_name = "pm_mark2", .pmg_desc = "Marked Instructions Processing Flow", .pmg_event_ids = ppc970mp_group_event_ids[34], .pmg_mmcr0 = 0x000000000000020aULL, .pmg_mmcr1 = 0x0000000079484210ULL, .pmg_mmcra = 0x0000000000002001ULL }, [ 35 ] = { .pmg_name = "pm_mark3", .pmg_desc = "Marked Stores Processing Flow", .pmg_event_ids = ppc970mp_group_event_ids[35], .pmg_mmcr0 = 0x000000000000031eULL, .pmg_mmcr1 = 0x00203004190a3f24ULL, .pmg_mmcra = 0x0000000000002001ULL }, [ 36 ] = { .pmg_name = "pm_lsu_mark1", .pmg_desc = "Load Store Unit Marked Events", .pmg_event_ids = ppc970mp_group_event_ids[36], .pmg_mmcr0 = 0x0000000000001b34ULL, .pmg_mmcr1 = 0x000280c08d5e9850ULL, .pmg_mmcra = 0x0000000000002001ULL }, [ 37 ] = { .pmg_name = "pm_lsu_mark2", .pmg_desc = "Load Store Unit Marked Events", .pmg_event_ids = ppc970mp_group_event_ids[37], .pmg_mmcr0 = 0x0000000000001838ULL, .pmg_mmcr1 = 0x000280c0959e99dcULL, .pmg_mmcra = 0x0000000000002001ULL }, [ 38 ] = { .pmg_name = "pm_fxu1", .pmg_desc = "Fixed Point events by unit", .pmg_event_ids = ppc970mp_group_event_ids[38], .pmg_mmcr0 = 0x0000000000000912ULL, .pmg_mmcr1 = 0x100010020084213cULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 39 ] = { .pmg_name = "pm_fxu2", .pmg_desc = "Fixed Point events by unit", .pmg_event_ids = ppc970mp_group_event_ids[39], .pmg_mmcr0 = 0x000000000000091eULL, .pmg_mmcr1 = 0x4000000ca4042d78ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 40 ] 
= { .pmg_name = "pm_ifu", .pmg_desc = "pm_ifu", .pmg_event_ids = ppc970mp_group_event_ids[40], .pmg_mmcr0 = 0x0000000000000d0cULL, .pmg_mmcr1 = 0x800000f06b7867a4ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 41 ] = { .pmg_name = "pm_cpi_stack1", .pmg_desc = "CPI stack analysis", .pmg_event_ids = ppc970mp_group_event_ids[41], .pmg_mmcr0 = 0x0000000000001b3eULL, .pmg_mmcr1 = 0x4000c0c0add6963dULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 42 ] = { .pmg_name = "pm_cpi_stack2", .pmg_desc = "CPI stack analysis", .pmg_event_ids = ppc970mp_group_event_ids[42], .pmg_mmcr0 = 0x0000000000000b12ULL, .pmg_mmcr1 = 0x000b000003d60583ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 43 ] = { .pmg_name = "pm_cpi_stack3", .pmg_desc = "CPI stack analysis", .pmg_event_ids = ppc970mp_group_event_ids[43], .pmg_mmcr0 = 0x0000000000000916ULL, .pmg_mmcr1 = 0x10001002001625beULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 44 ] = { .pmg_name = "pm_cpi_stack4", .pmg_desc = "CPI stack analysis", .pmg_event_ids = ppc970mp_group_event_ids[44], .pmg_mmcr0 = 0x0000000000000000ULL, .pmg_mmcr1 = 0x00000000485805bdULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 45 ] = { .pmg_name = "pm_cpi_stack5", .pmg_desc = "CPI stack analysis", .pmg_event_ids = ppc970mp_group_event_ids[45], .pmg_mmcr0 = 0x0000000000000412ULL, .pmg_mmcr1 = 0x90014009b6d8f672ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 46 ] = { .pmg_name = "pm_data2", .pmg_desc = "data source and LMQ", .pmg_event_ids = ppc970mp_group_event_ids[46], .pmg_mmcr0 = 0x0000000000000912ULL, .pmg_mmcr1 = 0x0000300c7bce7f74ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 47 ] = { .pmg_name = "pm_fetch_branch", .pmg_desc = "Instruction fetch and branch events", .pmg_event_ids = ppc970mp_group_event_ids[47], .pmg_mmcr0 = 0x000000000000060cULL, .pmg_mmcr1 = 0x800000cd6e5e9d6cULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 48 ] = { .pmg_name = "pm_l1l2_miss", .pmg_desc = "L1 and L2 miss events", .pmg_event_ids = ppc970mp_group_event_ids[48], .pmg_mmcr0 = 0x000000000000070eULL, 
.pmg_mmcr1 = 0x000330004c86fb00ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 49 ] = { .pmg_name = "pm_data_from", .pmg_desc = "Data From L2 instructions", .pmg_event_ids = ppc970mp_group_event_ids[49], .pmg_mmcr0 = 0x000000000000070eULL, .pmg_mmcr1 = 0x000330004bce7b00ULL, .pmg_mmcra = 0x0000000000002000ULL }, [ 50 ] = { .pmg_name = "pm_mark_data_from", .pmg_desc = "Marked Data From L2 instructions", .pmg_event_ids = ppc970mp_group_event_ids[50], .pmg_mmcr0 = 0x000000000000070eULL, .pmg_mmcr1 = 0x002030084bce72f0ULL, .pmg_mmcra = 0x0000000000002001ULL } }; #endif papi-papi-7-2-0-t/src/libperfnec/lib/ultra12_events.h000066400000000000000000000066661502707512200223530ustar00rootroot00000000000000static pme_sparc_entry_t ultra12_pe[] = { /* These two must always be first. */ { .pme_name = "Cycle_cnt", .pme_desc = "Accumulated cycles", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x0, }, { .pme_name = "Instr_cnt", .pme_desc = "Number of instructions completed", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x1, }, { .pme_name = "Dispatch0_IC_miss", .pme_desc = "I-buffer is empty from I-Cache miss", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x2, }, /* PIC0 events for UltraSPARC-I/II/IIi/IIe */ { .pme_name = "Dispatch0_storeBuf", .pme_desc = "Store buffer can not hold additional stores", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x3, }, { .pme_name = "IC_ref", .pme_desc = "I-cache references", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x8, }, { .pme_name = "DC_rd", .pme_desc = "D-cache read references (including accesses that subsequently trap)", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x9, }, { .pme_name = "DC_wr", .pme_desc = "D-cache write references (including accesses that subsequently trap)", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xa, }, { .pme_name = "Load_use", .pme_desc = "An instruction in the execute stage depends on an earlier load result that is not yet available", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xb, }, { .pme_name = "EC_ref", .pme_desc = "Total E-cache references", 
.pme_ctrl = PME_CTRL_S0, .pme_val = 0xc, }, { .pme_name = "EC_write_hit_RDO", .pme_desc = "E-cache hits that do a read for ownership UPA transaction", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xd, }, { .pme_name = "EC_snoop_inv", .pme_desc = "E-cache invalidates from the following UPA transactions: S_INV_REQ, S_CPI_REQ", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xe, }, { .pme_name = "EC_rd_hit", .pme_desc = "E-cache read hits from D-cache misses", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xf, }, /* PIC1 events for UltraSPARC-I/II/IIi/IIe */ { .pme_name = "Dispatch0_mispred", .pme_desc = "I-buffer is empty from Branch misprediction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x2, }, { .pme_name = "Dispatch0_FP_use", .pme_desc = "First instruction in the group depends on an earlier floating point result that is not yet available", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x3, }, { .pme_name = "IC_hit", .pme_desc = "I-cache hits", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x8, }, { .pme_name = "DC_rd_hit", .pme_desc = "D-cache read hits", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x9, }, { .pme_name = "DC_wr_hit", .pme_desc = "D-cache write hits", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xa, }, { .pme_name = "Load_use_RAW", .pme_desc = "There is a load use in the execute stage and there is a read-after-write hazard on the oldest outstanding load", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xb, }, { .pme_name = "EC_hit", .pme_desc = "Total E-cache hits", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xc, }, { .pme_name = "EC_wb", .pme_desc = "E-cache misses that do writebacks", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xd, }, { .pme_name = "EC_snoop_cb", .pme_desc = "E-cache snoop copy-backs from the following UPA transactions: S_CPB_REQ, S_CPI_REQ, S_CPD_REQ, S_CPB_MIS_REQ", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xe, }, { .pme_name = "EC_ic_hit", .pme_desc = "E-cache read hits from I-cache misses", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xf, }, }; #define PME_ULTRA12_EVENT_COUNT (sizeof(ultra12_pe)/sizeof(pme_sparc_entry_t)) 
papi-papi-7-2-0-t/src/libperfnec/lib/ultra3_events.h000066400000000000000000000264771502707512200222710ustar00rootroot00000000000000static pme_sparc_entry_t ultra3_pe[] = { /* These two must always be first. */ { .pme_name = "Cycle_cnt", .pme_desc = "Accumulated cycles", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x0, }, { .pme_name = "Instr_cnt", .pme_desc = "Number of instructions completed", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x1, }, /* PIC0 events common to all UltraSPARC processors */ { .pme_name = "Dispatch0_IC_miss", .pme_desc = "I-buffer is empty from I-Cache miss", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x2, }, { .pme_name = "IC_ref", .pme_desc = "I-cache references", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x8, }, { .pme_name = "DC_rd", .pme_desc = "D-cache read references (including accesses that subsequently trap)", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x9, }, { .pme_name = "DC_wr", .pme_desc = "D-cache store accesses (including cacheable stores that subsequently trap)", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xa, }, { .pme_name = "EC_ref", .pme_desc = "E-cache references", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xc, }, { .pme_name = "EC_snoop_inv", .pme_desc = "L2-cache invalidates generated from a snoop by a remote processor", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xe, }, /* PIC1 events common to all UltraSPARC processors */ { .pme_name = "Dispatch0_mispred", .pme_desc = "I-buffer is empty from Branch misprediction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x2, }, { .pme_name = "EC_wb", .pme_desc = "Dirty sub-blocks that produce writebacks due to L2-cache miss events", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xd, }, { .pme_name = "EC_snoop_cb", .pme_desc = "L2-cache copybacks generated from a snoop by a remote processor", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xe, }, /* PIC0 events common to all UltraSPARC-III/III+/IIIi processors */ { .pme_name = "Dispatch0_br_target", .pme_desc = "I-buffer is empty due to a branch target address calculation", .pme_ctrl = 
PME_CTRL_S0, .pme_val = 0x3, }, { .pme_name = "Dispatch0_2nd_br", .pme_desc = "Stall cycles due to having two branch instructions line-up in one 4-instruction group causing the second branch in the group to be re-fetched, delaying its entrance into the I-buffer", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x4, }, { .pme_name = "Rstall_storeQ", .pme_desc = "R-stage stall for a store instruction which is the next instruction to be executed, but it stalled due to the store queue being full", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x5, }, { .pme_name = "Rstall_IU_use", .pme_desc = "R-stage stall for an event that the next instruction to be executed depends on the result of a preceding integer instruction in the pipeline that is not yet available", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x6, }, { .pme_name = "EC_write_hit_RTO", .pme_desc = "W-cache exclusive requests that hit L2-cache in S, O, or Os state and thus, do a read-to-own bus transaction", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xd, }, { .pme_name = "EC_rd_miss", .pme_desc = "L2-cache miss events (including atomics) from D-cache events", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xf, }, { .pme_name = "PC_port0_rd", .pme_desc = "P-cache cacheable FP loads to the first port (general purpose load path to D-cache and P-cache via MS pipeline)", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x10, }, { .pme_name = "SI_snoop", .pme_desc = "Counts snoops from remote processor(s) including RTS, RTSR, RTO, RTOR, RS, RSR, RTSM, and WS", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x11, }, { .pme_name = "SI_ciq_flow", .pme_desc = "Counts system clock cycles when the flow control (PauseOut) signal is asserted", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x12, }, { .pme_name = "SI_owned", .pme_desc = "Counts events where owned_in is asserted on bus requests from the local processor", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x13, }, { .pme_name = "SW_count0", .pme_desc = "Counts software-generated occurrences of 'sethi %hi(0xfc000), %g0' instruction", .pme_ctrl = 
PME_CTRL_S0, .pme_val = 0x14, }, { .pme_name = "IU_Stat_Br_miss_taken", .pme_desc = "Retired branches that were predicted to be taken, but in fact were not taken", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x15, }, { .pme_name = "IU_Stat_Br_Count_taken", .pme_desc = "Retired taken branches", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x16, }, { .pme_name = "Dispatch0_rs_mispred", .pme_desc = "I-buffer is empty due to a Return Address Stack misprediction", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x4, }, { .pme_name = "FA_pipe_completion", .pme_desc = "Instructions that complete execution on the FPG ALU pipelines", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x18, }, /* PIC1 events common to all UltraSPARC-III/III+/IIIi processors */ { .pme_name = "IC_miss_cancelled", .pme_desc = "I-cache misses cancelled due to mis-speculation, recycle, or other events", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x3, }, { .pme_name = "Re_FPU_bypass", .pme_desc = "Stall due to recirculation when an FPU bypass condition that does not have a direct bypass path occurs", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x5, }, { .pme_name = "Re_DC_miss", .pme_desc = "Stall due to loads that miss D-cache and get recirculated", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x6, }, { .pme_name = "Re_EC_miss", .pme_desc = "Stall due to loads that miss L2-cache and get recirculated", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x7, }, { .pme_name = "IC_miss", .pme_desc = "I-cache misses, including fetches from mis-speculated execution paths which are later cancelled", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x8, }, { .pme_name = "DC_rd_miss", .pme_desc = "Recirculated loads that miss the D-cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x9, }, { .pme_name = "DC_wr_miss", .pme_desc = "D-cache store accesses that miss D-cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xa, }, { .pme_name = "Rstall_FP_use", .pme_desc = "R-stage stall for an event that the next instruction to be executed depends on the result of a preceding floating-point instruction in the pipeline 
that is not yet available", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xb, }, { .pme_name = "EC_misses", .pme_desc = "E-cache misses", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xc, }, { .pme_name = "EC_ic_miss", .pme_desc = "L2-cache read misses from I-cache requests", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xf, }, { .pme_name = "Re_PC_miss", .pme_desc = "Stall due to recirculation when a prefetch cache miss occurs on a prefetch predicted second load", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x10, }, { .pme_name = "ITLB_miss", .pme_desc = "I-TLB miss traps taken", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x11, }, { .pme_name = "DTLB_miss", .pme_desc = "Memory reference instructions which trap due to D-TLB miss", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x12, }, { .pme_name = "WC_miss", .pme_desc = "W-cache misses", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x13, }, { .pme_name = "WC_snoop_cb", .pme_desc = "W-cache copybacks generated by a snoop from a remote processor", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x14, }, { .pme_name = "WC_scrubbed", .pme_desc = "W-cache hits to clean lines", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x15, }, { .pme_name = "WC_wb_wo_read", .pme_desc = "W-cache writebacks not requiring a read", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x16, }, { .pme_name = "PC_soft_hit", .pme_desc = "FP loads that hit a P-cache line that was prefetched by a software-prefetch instruction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x18, }, { .pme_name = "PC_snoop_inv", .pme_desc = "P-cache invalidates that were generated by a snoop from a remote processor and stores by a local processor", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x19, }, { .pme_name = "PC_hard_hit", .pme_desc = "FP loads that hit a P-cache line that was prefetched by a hardware prefetch", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1a, }, { .pme_name = "PC_port1_rd", .pme_desc = "P-cache cacheable FP loads to the second port (memory and out-of-pipeline instruction execution loads via the A0 and A1 pipelines)", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1b, 
}, { .pme_name = "SW_count1", .pme_desc = "Counts software-generated occurrences of 'sethi %hi(0xfc000), %g0' instruction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1c, }, { .pme_name = "IU_Stat_Br_miss_untaken", .pme_desc = "Retired branches that were predicted to be untaken, but in fact were taken", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1d, }, { .pme_name = "IU_Stat_Br_Count_untaken", .pme_desc = "Retired untaken branches", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1e, }, { .pme_name = "PC_MS_miss", .pme_desc = "FP loads through the MS pipeline that miss P-cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1f, }, { .pme_name = "Re_RAW_miss", .pme_desc = "Stall due to recirculation when there is a load in the E-stage which has a non-bypassable read-after-write hazard with an earlier store instruction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x26, }, { .pme_name = "FM_pipe_completion", .pme_desc = "Instructions that complete execution on the FPG Multiply pipelines", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x27, }, /* PIC0 memory controller events common to UltraSPARC-III/III+ processors */ { .pme_name = "MC_reads_0", .pme_desc = "Read requests completed to memory bank 0", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x20, }, { .pme_name = "MC_reads_1", .pme_desc = "Read requests completed to memory bank 1", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x21, }, { .pme_name = "MC_reads_2", .pme_desc = "Read requests completed to memory bank 2", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x22, }, { .pme_name = "MC_reads_3", .pme_desc = "Read requests completed to memory bank 3", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x23, }, { .pme_name = "MC_stalls_0", .pme_desc = "Clock cycles that requests were stalled in the MCU queues because bank 0 was busy with a previous request", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x24, }, { .pme_name = "MC_stalls_2", .pme_desc = "Clock cycles that requests were stalled in the MCU queues because bank 2 was busy with a previous request", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x25, }, /* 
PIC1 memory controller events common to all UltraSPARC-III/III+ processors */ { .pme_name = "MC_writes_0", .pme_desc = "Write requests completed to memory bank 0", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x20, }, { .pme_name = "MC_writes_1", .pme_desc = "Write requests completed to memory bank 1", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x21, }, { .pme_name = "MC_writes_2", .pme_desc = "Write requests completed to memory bank 2", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x22, }, { .pme_name = "MC_writes_3", .pme_desc = "Write requests completed to memory bank 3", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x23, }, { .pme_name = "MC_stalls_1", .pme_desc = "Clock cycles that requests were stalled in the MCU queues because bank 1 was busy with a previous request", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x24, }, { .pme_name = "MC_stalls_3", .pme_desc = "Clock cycles that requests were stalled in the MCU queues because bank 3 was busy with a previous request", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x25, }, }; #define PME_ULTRA3_EVENT_COUNT (sizeof(ultra3_pe)/sizeof(pme_sparc_entry_t)) papi-papi-7-2-0-t/src/libperfnec/lib/ultra3i_events.h000066400000000000000000000260071502707512200224330ustar00rootroot00000000000000static pme_sparc_entry_t ultra3i_pe[] = { /* These two must always be first. 
*/ { .pme_name = "Cycle_cnt", .pme_desc = "Accumulated cycles", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x0, }, { .pme_name = "Instr_cnt", .pme_desc = "Number of instructions completed", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x1, }, /* PIC0 events common to all UltraSPARC processors */ { .pme_name = "Dispatch0_IC_miss", .pme_desc = "I-buffer is empty from I-Cache miss", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x2, }, { .pme_name = "IC_ref", .pme_desc = "I-cache references", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x8, }, { .pme_name = "DC_rd", .pme_desc = "D-cache read references (including accesses that subsequently trap)", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x9, }, { .pme_name = "DC_wr", .pme_desc = "D-cache store accesses (including cacheable stores that subsequently trap)", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xa, }, { .pme_name = "EC_ref", .pme_desc = "E-cache references", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xc, }, { .pme_name = "EC_snoop_inv", .pme_desc = "L2-cache invalidates generated from a snoop by a remote processor", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xe, }, /* PIC1 events common to all UltraSPARC processors */ { .pme_name = "Dispatch0_mispred", .pme_desc = "I-buffer is empty from Branch misprediction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x2, }, { .pme_name = "EC_wb", .pme_desc = "Dirty sub-blocks that produce writebacks due to L2-cache miss events", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xd, }, { .pme_name = "EC_snoop_cb", .pme_desc = "L2-cache copybacks generated from a snoop by a remote processor", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xe, }, /* PIC0 events common to all UltraSPARC-III/III+/IIIi processors */ { .pme_name = "Dispatch0_br_target", .pme_desc = "I-buffer is empty due to a branch target address calculation", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x3, }, { .pme_name = "Dispatch0_2nd_br", .pme_desc = "Stall cycles due to having two branch instructions line-up in one 4-instruction group causing the second branch in the group to 
be re-fetched, delaying its entrance into the I-buffer", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x4, }, { .pme_name = "Rstall_storeQ", .pme_desc = "R-stage stall for a store instruction which is the next instruction to be executed, but it stalled due to the store queue being full", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x5, }, { .pme_name = "Rstall_IU_use", .pme_desc = "R-stage stall for an event that the next instruction to be executed depends on the result of a preceding integer instruction in the pipeline that is not yet available", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x6, }, { .pme_name = "EC_write_hit_RTO", .pme_desc = "W-cache exclusive requests that hit L2-cache in S, O, or Os state and thus, do a read-to-own bus transaction", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xd, }, { .pme_name = "EC_rd_miss", .pme_desc = "L2-cache miss events (including atomics) from D-cache events", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xf, }, { .pme_name = "PC_port0_rd", .pme_desc = "P-cache cacheable FP loads to the first port (general purpose load path to D-cache and P-cache via MS pipeline)", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x10, }, { .pme_name = "SI_snoop", .pme_desc = "Counts snoops from remote processor(s) including RTS, RTSR, RTO, RTOR, RS, RSR, RTSM, and WS", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x11, }, { .pme_name = "SI_ciq_flow", .pme_desc = "Counts system clock cycles when the flow control (PauseOut) signal is asserted", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x12, }, { .pme_name = "SI_owned", .pme_desc = "Counts events where owned_in is asserted on bus requests from the local processor", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x13, }, { .pme_name = "SW_count0", .pme_desc = "Counts software-generated occurrences of 'sethi %hi(0xfc000), %g0' instruction", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x14, }, { .pme_name = "IU_Stat_Br_miss_taken", .pme_desc = "Retired branches that were predicted to be taken, but in fact were not taken", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x15, }, { 
.pme_name = "IU_Stat_Br_Count_taken", .pme_desc = "Retired taken branches", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x16, },
{ .pme_name = "Dispatch0_rs_mispred", .pme_desc = "I-buffer is empty due to a Return Address Stack misprediction", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x4, },
{ .pme_name = "FA_pipe_completion", .pme_desc = "Instructions that complete execution on the FPG ALU pipelines", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x18, },
/* PIC1 events common to all UltraSPARC-III/III+/IIIi processors */
{ .pme_name = "IC_miss_cancelled", .pme_desc = "I-cache misses cancelled due to mis-speculation, recycle, or other events", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x3, },
{ .pme_name = "Re_FPU_bypass", .pme_desc = "Stall due to recirculation when an FPU bypass condition that does not have a direct bypass path occurs", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x5, },
{ .pme_name = "Re_DC_miss", .pme_desc = "Stall due to loads that miss D-cache and get recirculated", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x6, },
{ .pme_name = "Re_EC_miss", .pme_desc = "Stall due to loads that miss L2-cache and get recirculated", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x7, },
{ .pme_name = "IC_miss", .pme_desc = "I-cache misses, including fetches from mis-speculated execution paths which are later cancelled", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x8, },
{ .pme_name = "DC_rd_miss", .pme_desc = "Recirculated loads that miss the D-cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x9, },
{ .pme_name = "DC_wr_miss", .pme_desc = "D-cache store accesses that miss D-cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xa, },
{ .pme_name = "Rstall_FP_use", .pme_desc = "R-stage stall for an event that the next instruction to be executed depends on the result of a preceding floating-point instruction in the pipeline that is not yet available", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xb, },
{ .pme_name = "EC_misses", .pme_desc = "E-cache misses", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xc, },
{ .pme_name = "EC_ic_miss", .pme_desc = "L2-cache read misses from I-cache requests", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xf, },
{ .pme_name = "Re_PC_miss", .pme_desc = "Stall due to recirculation when a prefetch cache miss occurs on a prefetch predicted second load", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x10, },
{ .pme_name = "ITLB_miss", .pme_desc = "I-TLB miss traps taken", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x11, },
{ .pme_name = "DTLB_miss", .pme_desc = "Memory reference instructions which trap due to D-TLB miss", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x12, },
{ .pme_name = "WC_miss", .pme_desc = "W-cache misses", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x13, },
{ .pme_name = "WC_snoop_cb", .pme_desc = "W-cache copybacks generated by a snoop from a remote processor", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x14, },
{ .pme_name = "WC_scrubbed", .pme_desc = "W-cache hits to clean lines", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x15, },
{ .pme_name = "WC_wb_wo_read", .pme_desc = "W-cache writebacks not requiring a read", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x16, },
{ .pme_name = "PC_soft_hit", .pme_desc = "FP loads that hit a P-cache line that was prefetched by a software-prefetch instruction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x18, },
{ .pme_name = "PC_snoop_inv", .pme_desc = "P-cache invalidates that were generated by a snoop from a remote processor and stores by a local processor", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x19, },
{ .pme_name = "PC_hard_hit", .pme_desc = "FP loads that hit a P-cache line that was prefetched by a hardware prefetch", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1a, },
{ .pme_name = "PC_port1_rd", .pme_desc = "P-cache cacheable FP loads to the second port (memory and out-of-pipeline instruction execution loads via the A0 and A1 pipelines)", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1b, },
{ .pme_name = "SW_count1", .pme_desc = "Counts software-generated occurrences of 'sethi %hi(0xfc000), %g0' instruction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1c, },
{ .pme_name = "IU_Stat_Br_miss_untaken", .pme_desc = "Retired branches that were predicted to be untaken, but in fact were taken", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1d, },
{ .pme_name = "IU_Stat_Br_Count_untaken", .pme_desc = "Retired untaken branches", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1e, },
{ .pme_name = "PC_MS_miss", .pme_desc = "FP loads through the MS pipeline that miss P-cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1f, },
{ .pme_name = "Re_RAW_miss", .pme_desc = "Stall due to recirculation when there is a load in the E-stage which has a non-bypassable read-after-write hazard with an earlier store instruction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x26, },
{ .pme_name = "FM_pipe_completion", .pme_desc = "Instructions that complete execution on the FPG Multiply pipelines", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x27, },
/* PIC0 memory controller events specific to UltraSPARC-IIIi processors */
{ .pme_name = "MC_read_dispatched", .pme_desc = "DDR 64-byte reads dispatched by the MIU", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x20, },
{ .pme_name = "MC_write_dispatched", .pme_desc = "DDR 64-byte writes dispatched by the MIU", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x21, },
{ .pme_name = "MC_read_returned_to_JBU", .pme_desc = "64-byte reads that return data to JBU", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x22, },
{ .pme_name = "MC_msl_busy_stall", .pme_desc = "Stall cycles due to msl_busy", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x23, },
{ .pme_name = "MC_mdb_overflow_stall", .pme_desc = "Stall cycles due to potential memory data buffer overflow", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x24, },
{ .pme_name = "MC_miu_spec_request", .pme_desc = "Speculative requests accepted by MIU", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x25, },
/* PIC1 memory controller events specific to UltraSPARC-IIIi processors */
{ .pme_name = "MC_reads", .pme_desc = "64-byte reads by the MSL", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x20, },
{ .pme_name = "MC_writes", .pme_desc = "64-byte writes by the MSL", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x21,
},
{ .pme_name = "MC_page_close_stall", .pme_desc = "DDR page conflicts", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x22, },
/* PIC1 events specific to UltraSPARC-III+/IIIi */
{ .pme_name = "Re_DC_missovhd", .pme_desc = "Used to measure D-cache stall counts separately for L2-cache hits and misses. This counter is used with the recirculation and cache access events to separately calculate the D-cache loads that hit and miss the L2-cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x4, },
};
#define PME_ULTRA3I_EVENT_COUNT (sizeof(ultra3i_pe)/sizeof(pme_sparc_entry_t))

/* ===== src/libperfnec/lib/ultra3plus_events.h ===== */
static pme_sparc_entry_t ultra3plus_pe[] = {
/* These two must always be first. */
{ .pme_name = "Cycle_cnt", .pme_desc = "Accumulated cycles", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x0, },
{ .pme_name = "Instr_cnt", .pme_desc = "Number of instructions completed", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x1, },
/* PIC0 events common to all UltraSPARC processors */
{ .pme_name = "Dispatch0_IC_miss", .pme_desc = "I-buffer is empty from I-Cache miss", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x2, },
{ .pme_name = "IC_ref", .pme_desc = "I-cache references", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x8, },
{ .pme_name = "DC_rd", .pme_desc = "D-cache read references (including accesses that subsequently trap)", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x9, },
{ .pme_name = "DC_wr", .pme_desc = "D-cache store accesses (including cacheable stores that subsequently trap)", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xa, },
{ .pme_name = "EC_ref", .pme_desc = "E-cache references", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xc, },
{ .pme_name = "EC_snoop_inv", .pme_desc = "L2-cache invalidates generated from a snoop by a remote processor", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xe, },
/* PIC1 events common to all UltraSPARC processors */
{ .pme_name = "Dispatch0_mispred", .pme_desc = "I-buffer is empty from Branch misprediction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x2, },
{ .pme_name = "EC_wb", .pme_desc = "Dirty sub-blocks that produce writebacks due to L2-cache miss events", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xd, },
{ .pme_name = "EC_snoop_cb", .pme_desc = "L2-cache copybacks generated from a snoop by a remote processor", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xe, },
/* PIC0 events common to all UltraSPARC-III/III+/IIIi processors */
{ .pme_name = "Dispatch0_br_target", .pme_desc = "I-buffer is empty due to a branch target address calculation", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x3, },
{ .pme_name = "Dispatch0_2nd_br", .pme_desc = "Stall cycles due to having two branch instructions line-up in one 4-instruction group causing the second branch in the group to be re-fetched, delaying its entrance into the I-buffer", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x4, },
{ .pme_name = "Rstall_storeQ", .pme_desc = "R-stage stall for a store instruction which is the next instruction to be executed, but is stalled due to the store queue being full", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x5, },
{ .pme_name = "Rstall_IU_use", .pme_desc = "R-stage stall for an event that the next instruction to be executed depends on the result of a preceding integer instruction in the pipeline that is not yet available", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x6, },
{ .pme_name = "EC_write_hit_RTO", .pme_desc = "W-cache exclusive requests that hit L2-cache in S, O, or Os state and thus, do a read-to-own bus transaction", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xd, },
{ .pme_name = "EC_rd_miss", .pme_desc = "L2-cache miss events (including atomics) from D-cache events", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xf, },
{ .pme_name = "PC_port0_rd", .pme_desc = "P-cache cacheable FP loads to the first port (general purpose load path to D-cache and P-cache via MS pipeline)", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x10, },
{ .pme_name = "SI_snoop", .pme_desc = "Counts snoops from remote processor(s) including RTS, RTSR, RTO, RTOR, RS, RSR, RTSM, and WS", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x11, },
{ .pme_name = "SI_ciq_flow", .pme_desc = "Counts system clock cycles when the flow control (PauseOut) signal is asserted", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x12, },
{ .pme_name = "SI_owned", .pme_desc = "Counts events where owned_in is asserted on bus requests from the local processor", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x13, },
{ .pme_name = "SW_count0", .pme_desc = "Counts software-generated occurrences of 'sethi %hi(0xfc000), %g0' instruction", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x14, },
{ .pme_name = "IU_Stat_Br_miss_taken", .pme_desc = "Retired branches that were predicted to be taken, but in fact were not taken", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x15, },
{ .pme_name = "IU_Stat_Br_Count_taken", .pme_desc = "Retired taken branches", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x16, },
{ .pme_name = "Dispatch0_rs_mispred", .pme_desc = "I-buffer is empty due to a Return Address Stack misprediction", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x4, },
{ .pme_name = "FA_pipe_completion", .pme_desc = "Instructions that complete execution on the FPG ALU pipelines", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x18, },
/* PIC1 events common to all UltraSPARC-III/III+/IIIi processors */
{ .pme_name = "IC_miss_cancelled", .pme_desc = "I-cache misses cancelled due to mis-speculation, recycle, or other events", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x3, },
{ .pme_name = "Re_FPU_bypass", .pme_desc = "Stall due to recirculation when an FPU bypass condition that does not have a direct bypass path occurs", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x5, },
{ .pme_name = "Re_DC_miss", .pme_desc = "Stall due to loads that miss D-cache and get recirculated", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x6, },
{ .pme_name = "Re_EC_miss", .pme_desc = "Stall due to loads that miss L2-cache and get recirculated", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x7, },
{ .pme_name = "IC_miss", .pme_desc = "I-cache misses, including fetches from mis-speculated execution paths which are later cancelled", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x8, },
{ .pme_name = "DC_rd_miss", .pme_desc = "Recirculated loads that miss the D-cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x9, },
{ .pme_name = "DC_wr_miss", .pme_desc = "D-cache store accesses that miss D-cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xa, },
{ .pme_name = "Rstall_FP_use", .pme_desc = "R-stage stall for an event that the next instruction to be executed depends on the result of a preceding floating-point instruction in the pipeline that is not yet available", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xb, },
{ .pme_name = "EC_misses", .pme_desc = "E-cache misses", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xc, },
{ .pme_name = "EC_ic_miss", .pme_desc = "L2-cache read misses from I-cache requests", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xf, },
{ .pme_name = "Re_PC_miss", .pme_desc = "Stall due to recirculation when a prefetch cache miss occurs on a prefetch predicted second load", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x10, },
{ .pme_name = "ITLB_miss", .pme_desc = "I-TLB miss traps taken", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x11, },
{ .pme_name = "DTLB_miss", .pme_desc = "Memory reference instructions which trap due to D-TLB miss", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x12, },
{ .pme_name = "WC_miss", .pme_desc = "W-cache misses", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x13, },
{ .pme_name = "WC_snoop_cb", .pme_desc = "W-cache copybacks generated by a snoop from a remote processor", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x14, },
{ .pme_name = "WC_scrubbed", .pme_desc = "W-cache hits to clean lines", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x15, },
{ .pme_name = "WC_wb_wo_read", .pme_desc = "W-cache writebacks not requiring a read", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x16, },
{ .pme_name = "PC_soft_hit", .pme_desc = "FP loads that hit a P-cache line that was prefetched by a software-prefetch instruction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x18, },
{
.pme_name = "PC_snoop_inv", .pme_desc = "P-cache invalidates that were generated by a snoop from a remote processor and stores by a local processor", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x19, },
{ .pme_name = "PC_hard_hit", .pme_desc = "FP loads that hit a P-cache line that was prefetched by a hardware prefetch", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1a, },
{ .pme_name = "PC_port1_rd", .pme_desc = "P-cache cacheable FP loads to the second port (memory and out-of-pipeline instruction execution loads via the A0 and A1 pipelines)", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1b, },
{ .pme_name = "SW_count1", .pme_desc = "Counts software-generated occurrences of 'sethi %hi(0xfc000), %g0' instruction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1c, },
{ .pme_name = "IU_Stat_Br_miss_untaken", .pme_desc = "Retired branches that were predicted to be untaken, but in fact were taken", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1d, },
{ .pme_name = "IU_Stat_Br_Count_untaken", .pme_desc = "Retired untaken branches", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1e, },
{ .pme_name = "PC_MS_miss", .pme_desc = "FP loads through the MS pipeline that miss P-cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1f, },
{ .pme_name = "Re_RAW_miss", .pme_desc = "Stall due to recirculation when there is a load in the E-stage which has a non-bypassable read-after-write hazard with an earlier store instruction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x26, },
{ .pme_name = "FM_pipe_completion", .pme_desc = "Instructions that complete execution on the FPG Multiply pipelines", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x27, },
/* PIC0 memory controller events common to UltraSPARC-III/III+ processors */
{ .pme_name = "MC_reads_0", .pme_desc = "Read requests completed to memory bank 0", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x20, },
{ .pme_name = "MC_reads_1", .pme_desc = "Read requests completed to memory bank 1", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x21, },
{ .pme_name = "MC_reads_2", .pme_desc = "Read requests completed to memory bank 2", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x22, },
{ .pme_name = "MC_reads_3", .pme_desc = "Read requests completed to memory bank 3", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x23, },
{ .pme_name = "MC_stalls_0", .pme_desc = "Clock cycles that requests were stalled in the MCU queues because bank 0 was busy with a previous request", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x24, },
{ .pme_name = "MC_stalls_2", .pme_desc = "Clock cycles that requests were stalled in the MCU queues because bank 2 was busy with a previous request", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x25, },
/* PIC1 memory controller events common to all UltraSPARC-III/III+ processors */
{ .pme_name = "MC_writes_0", .pme_desc = "Write requests completed to memory bank 0", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x20, },
{ .pme_name = "MC_writes_1", .pme_desc = "Write requests completed to memory bank 1", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x21, },
{ .pme_name = "MC_writes_2", .pme_desc = "Write requests completed to memory bank 2", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x22, },
{ .pme_name = "MC_writes_3", .pme_desc = "Write requests completed to memory bank 3", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x23, },
{ .pme_name = "MC_stalls_1", .pme_desc = "Clock cycles that requests were stalled in the MCU queues because bank 1 was busy with a previous request", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x24, },
{ .pme_name = "MC_stalls_3", .pme_desc = "Clock cycles that requests were stalled in the MCU queues because bank 3 was busy with a previous request", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x25, },
/* PIC0 events specific to UltraSPARC-III+ processors */
{ .pme_name = "EC_wb_remote", .pme_desc = "Counts the retry event when any victimization for which the processor generates an R_WB transaction to non-LPA address region", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x19, },
{ .pme_name = "EC_miss_local", .pme_desc = "Counts any transaction to an LPA for which the processor issues an RTS/RTO/RS transaction", .pme_ctrl =
PME_CTRL_S0, .pme_val = 0x1a, },
{ .pme_name = "EC_miss_mtag_remote", .pme_desc = "Counts any transaction to an LPA in which the processor is required to generate a retry transaction", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x1b, },
/* PIC1 events specific to UltraSPARC-III+/IIIi processors */
{ .pme_name = "Re_DC_missovhd", .pme_desc = "Used to measure D-cache stall counts separately for L2-cache hits and misses. This counter is used with the recirculation and cache access events to separately calculate the D-cache loads that hit and miss the L2-cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x4, },
/* PIC1 events specific to UltraSPARC-III+ processors */
{ .pme_name = "EC_miss_mtag_remote", .pme_desc = "Counts any transaction to an LPA in which the processor is required to generate a retry transaction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x28, },
{ .pme_name = "EC_miss_remote", .pme_desc = "Counts the events triggered whenever the processor generates a remote (R_*) transaction and the address is to a non-LPA portion (remote) of the physical address space, or an R_WS transaction due to block-store/block-store-commit to any address space (LPA or non-LPA), or an R-RTO due to store/swap request on Os state to LPA space", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x29, },
};
#define PME_ULTRA3PLUS_EVENT_COUNT (sizeof(ultra3plus_pe)/sizeof(pme_sparc_entry_t))

/* ===== src/libperfnec/lib/ultra4plus_events.h ===== */
static pme_sparc_entry_t ultra4plus_pe[] = {
/* These two must always be first. */
{ .pme_name = "Cycle_cnt", .pme_desc = "Accumulated cycles", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x0, },
{ .pme_name = "Instr_cnt", .pme_desc = "Number of instructions completed", .pme_ctrl = PME_CTRL_S0 | PME_CTRL_S1, .pme_val = 0x1, },
/* PIC0 UltraSPARC-IV+ events */
{ .pme_name = "Dispatch0_IC_miss", .pme_desc = "I-buffer is empty from I-Cache miss", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x2, },
{ .pme_name = "IU_stat_jmp_correct_pred", .pme_desc = "Retired non-annulled register indirect jumps predicted correctly", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x3, },
{ .pme_name = "Dispatch0_2nd_br", .pme_desc = "Stall cycles due to having two branch instructions line-up in one 4-instruction group causing the second branch in the group to be re-fetched, delaying its entrance into the I-buffer", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x4, },
{ .pme_name = "Rstall_storeQ", .pme_desc = "R-stage stall for a store instruction which is the next instruction to be executed, but is stalled due to the store queue being full", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x5, },
{ .pme_name = "Rstall_IU_use", .pme_desc = "R-stage stall for an event that the next instruction to be executed depends on the result of a preceding integer instruction in the pipeline that is not yet available", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x6, },
{ .pme_name = "IU_stat_ret_correct_pred", .pme_desc = "Retired non-annulled returns predicted correctly", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x7, },
{ .pme_name = "IC_ref", .pme_desc = "I-cache references", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x8, },
{ .pme_name = "DC_rd", .pme_desc = "D-cache read references (including accesses that subsequently trap)", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x9, },
{ .pme_name = "Rstall_FP_use", .pme_desc = "R-stage stall for an event that the next instruction to be executed depends on the result of a preceding floating-point instruction in the pipeline that is not yet available", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xa, },
{ .pme_name = "SW_pf_instr", .pme_desc = "Retired SW prefetch instructions", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xb, },
{ .pme_name = "L2_ref", .pme_desc = "L2-cache references", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xc, },
{ .pme_name = "L2_write_hit_RTO", .pme_desc = "L2-cache exclusive requests that hit L2-cache in S, O, or Os state and thus, do a read-to-own bus transaction", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xd, },
{ .pme_name = "L2_snoop_inv_sh", .pme_desc = "L2 cache lines that were written back to the L3 cache due to requests from both cores", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xe, },
{ .pme_name = "L2_rd_miss", .pme_desc = "L2-cache miss events (including atomics) from D-cache events", .pme_ctrl = PME_CTRL_S0, .pme_val = 0xf, },
{ .pme_name = "PC_rd", .pme_desc = "P-cache cacheable loads", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x10, },
{ .pme_name = "SI_snoop_sh", .pme_desc = "Counts snoops from remote processor(s) including RTS, RTSR, RTO, RTOR, RS, RSR, RTSM, and WS", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x11, },
{ .pme_name = "SI_ciq_flow_sh", .pme_desc = "Counts system clock cycles when the flow control (PauseOut) signal is asserted", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x12, },
{ .pme_name = "Re_DC_miss", .pme_desc = "Stall due to loads that miss D-cache and get recirculated", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x13, },
{ .pme_name = "SW_count_NOP0", .pme_desc = "Retired, non-annulled special software NOP instructions (which is equivalent to 'sethi %hi(0xfc000), %g0' instruction)", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x14, },
{ .pme_name = "IU_Stat_Br_miss_taken", .pme_desc = "Retired branches that were predicted to be taken, but in fact were not taken", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x15, },
{ .pme_name = "IU_Stat_Br_Count_taken", .pme_desc = "Retired taken branches", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x16, },
{ .pme_name = "HW_pf_exec", .pme_desc = "Hardware prefetches enqueued in the prefetch queue", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x17, },
{ .pme_name = "FA_pipe_completion", .pme_desc = "Instructions that complete execution on the FPG ALU pipelines", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x18, },
{ .pme_name = "SSM_L3_wb_remote", .pme_desc = "L3 cache line victimizations from this core which generate R_WB transactions to non-LPA (remote physical address) regions", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x19, },
{ .pme_name = "SSM_L3_miss_local", .pme_desc = "L3 cache misses to LPA (local physical address) from this core which generate an RTS, RTO, or RS transaction", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x1a, },
{ .pme_name = "SSM_L3_miss_mtag_remote", .pme_desc = "L3 cache misses to LPA (local physical address) from this core which generate retry (R_*) transactions including R_RTS, R_RTO, and R_RS", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x1b, },
{ .pme_name = "SW_pf_str_trapped", .pme_desc = "Strong software prefetch instructions trapping due to TLB miss", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x1c, },
{ .pme_name = "SW_pf_PC_installed", .pme_desc = "Software prefetch instructions that installed lines in the P-cache", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x1d, },
{ .pme_name = "IPB_to_IC_fill", .pme_desc = "I-cache fills from the instruction prefetch buffer", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x1e, },
{ .pme_name = "L2_write_miss", .pme_desc = "L2-cache misses from this core by cacheable store requests", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x1f, },
{ .pme_name = "MC_reads_0_sh", .pme_desc = "Read requests completed to memory bank 0", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x20, },
{ .pme_name = "MC_reads_1_sh", .pme_desc = "Read requests completed to memory bank 1", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x21, },
{ .pme_name = "MC_reads_2_sh", .pme_desc = "Read requests completed to memory bank 2", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x22, },
{ .pme_name = "MC_reads_3_sh", .pme_desc = "Read requests completed to memory bank 3", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x23, },
{ .pme_name = "MC_stalls_0_sh", .pme_desc = "Clock cycles that requests were stalled in the MCU queues because bank 0 was busy with a previous request", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x24, },
{ .pme_name = "MC_stalls_2_sh", .pme_desc = "Clock cycles that requests were stalled in the MCU queues because bank 2 was busy with a previous request", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x25, },
{ .pme_name = "L2_hit_other_half", .pme_desc = "L2 cache hits from this core to the ways filled by the other core when the cache is in the pseudo-split mode", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x26, },
{ .pme_name = "L3_rd_miss", .pme_desc = "L3 cache misses sent out to SIU from this core by cacheable I-cache, D-cache, P-cache, and W-cache (excluding block store) requests", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x28, },
{ .pme_name = "Re_L2_miss", .pme_desc = "Stall cycles due to recirculation of cacheable loads that miss both D-cache and L2 cache", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x29, },
{ .pme_name = "IC_miss_cancelled", .pme_desc = "I-cache miss requests cancelled due to new fetch stream", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x2a, },
{ .pme_name = "DC_wr_miss", .pme_desc = "D-cache store accesses that miss D-cache", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x2b, },
{ .pme_name = "L3_hit_I_state_sh", .pme_desc = "Tag hits in L3 cache when the line is in I state", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x2c, },
{ .pme_name = "SI_RTS_src_data", .pme_desc = "Local RTS transactions due to I-cache, D-cache, or P-cache requests from this core where data is from the cache of another processor on the system, not from memory", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x2d, },
{ .pme_name = "L2_IC_miss", .pme_desc = "L2 cache misses from this core by cacheable I-cache requests", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x2e, },
{ .pme_name = "SSM_new_transaction_sh", .pme_desc = "New SSM transactions (RTSU, RTOU, UGM) observed by this processor on the Fireplane Interconnect", .pme_ctrl = PME_CTRL_S0, .pme_val
= 0x2f, },
{ .pme_name = "L2_SW_pf_miss", .pme_desc = "L2 cache misses by software prefetch requests from this core", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x30, },
{ .pme_name = "L2_wb", .pme_desc = "L2 cache lines that were written back to the L3 cache because of requests from this core", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x31, },
{ .pme_name = "L2_wb_sh", .pme_desc = "L2 cache lines that were written back to the L3 cache because of requests from both cores", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x32, },
{ .pme_name = "L2_snoop_cb_sh", .pme_desc = "L2 cache lines that were copied back due to other processors", .pme_ctrl = PME_CTRL_S0, .pme_val = 0x33, },
/* PIC1 UltraSPARC-IV+ events */
{ .pme_name = "Dispatch0_other", .pme_desc = "Stall cycles due to the event that no instructions are dispatched because the I-queue is empty due to various other events, including branch target address fetch and various events which cause an instruction to be refetched", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x2, },
{ .pme_name = "DC_wr", .pme_desc = "D-cache write references by cacheable stores (excluding block stores)", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x3, },
{ .pme_name = "Re_DC_missovhd", .pme_desc = "Stall cycles due to D-cache load miss", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x4, },
{ .pme_name = "Re_FPU_bypass", .pme_desc = "Stall due to recirculation when an FPU bypass condition that does not have a direct bypass path occurs", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x5, },
{ .pme_name = "L3_write_hit_RTO", .pme_desc = "L3 cache hits in O, Os, or S state by cacheable store requests from this core that do a read-to-own (RTO) bus transaction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x6, },
{ .pme_name = "L2L3_snoop_inv_sh", .pme_desc = "L2 and L3 cache lines that were invalidated due to other processors doing RTO, RTOR, RTOU, or WS transactions", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x7, },
{ .pme_name = "IC_L2_req", .pme_desc = "I-cache requests sent to L2 cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x8, },
{ .pme_name = "DC_rd_miss", .pme_desc = "Cacheable loads (excluding atomics and block loads) that miss D-cache as well as P-cache (for FP loads)", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x9, },
{ .pme_name = "L2_hit_I_state_sh", .pme_desc = "Tag hits in L2 cache when the line is in I state", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xa, },
{ .pme_name = "L3_write_miss_RTO", .pme_desc = "L3 cache misses from this core by cacheable store requests that do a read-to-own (RTO) bus transaction. This count does not include RTO requests for prefetch (fcn=2,3/22,23) instructions", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xb, },
{ .pme_name = "L2_miss", .pme_desc = "L2 cache misses from this core by cacheable I-cache, D-cache, P-cache, and W-cache (excluding block stores) requests", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xc, },
{ .pme_name = "SI_owned_sh", .pme_desc = "Number of times owned_in is asserted on bus requests from the local processor", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xd, },
{ .pme_name = "SI_RTO_src_data", .pme_desc = "Number of local RTO transactions due to W-cache or P-cache requests from this core where data is from the cache of another processor on the system, not from memory", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xe, },
{ .pme_name = "SW_pf_duplicate", .pme_desc = "Number of software prefetch instructions that were dropped because the prefetch request matched an outstanding request in the prefetch queue or the request hit the P-cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0xf, },
{ .pme_name = "IU_stat_jmp_mispred", .pme_desc = "Number of retired non-annulled register indirect jumps mispredicted", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x10, },
{ .pme_name = "ITLB_miss", .pme_desc = "I-TLB misses", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x11, },
{ .pme_name = "DTLB_miss", .pme_desc = "D-TLB misses", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x12, },
{ .pme_name = "WC_miss", .pme_desc = "W-cache misses", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x13, },
{ .pme_name = "IC_fill", .pme_desc = "Number of I-cache fills excluding fills from the instruction prefetch buffer. This is the best approximation of the number of I-cache misses for instructions that were actually executed", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x14, },
{ .pme_name = "IU_stat_ret_mispred", .pme_desc = "Number of retired non-annulled returns mispredicted", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x15, },
{ .pme_name = "Re_L3_miss", .pme_desc = "Stall cycles due to recirculation of cacheable loads that miss D-cache, L2, and L3 cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x16, },
{ .pme_name = "Re_PFQ_full", .pme_desc = "Stall cycles due to recirculation of prefetch instructions because the prefetch queue (PFQ) was full", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x17, },
{ .pme_name = "PC_soft_hit", .pme_desc = "Number of cacheable FP loads that hit a P-cache line that was prefetched by a software prefetch instruction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x18, },
{ .pme_name = "PC_inv", .pme_desc = "Number of P-cache lines that were invalidated due to external snoops, internal stores, and L2 evictions", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x19, },
{ .pme_name = "PC_hard_hit", .pme_desc = "Number of FP loads that hit a P-cache line that was fetched by an FP load or a hardware prefetch, irrespective of whether the loads hit or miss the D-cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1a, },
{ .pme_name = "IC_pf", .pme_desc = "Number of I-cache prefetch requests sent to L2 cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1b, },
{ .pme_name = "SW_count_NOP1", .pme_desc = "Retired, non-annulled special software NOP instructions (which is equivalent to 'sethi %hi(0xfc000), %g0' instruction)", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1c, },
{ .pme_name = "IU_stat_br_miss_untaken", .pme_desc = "Number of retired non-annulled conditional branches that were predicted to be not taken, but in fact were taken", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1d, },
{ .pme_name =
"IU_stat_br_count_taken", .pme_desc = "Number of retired non-annulled conditional branches that were taken", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1e, }, { .pme_name = "PC_miss", .pme_desc = "Number of cacheable FP loads that miss P-cache, irrespective of whether the loads hit or miss the D-cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x1f, }, { .pme_name = "MC_writes_0_sh", .pme_desc = "Number of write requests complete to memory bank 0", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x20, }, { .pme_name = "MC_writes_1_sh", .pme_desc = "Number of write requests complete to memory bank 1", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x21, }, { .pme_name = "MC_writes_2_sh", .pme_desc = "Number of write requests complete to memory bank 2", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x22, }, { .pme_name = "MC_writes_3_sh", .pme_desc = "Number of write requests complete to memory bank 3", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x23, }, { .pme_name = "MC_stalls_1_sh", .pme_desc = "Number of processor cycles that requests were stalled in the MCU queues because bank 0 was busy with a previous requests", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x24, }, { .pme_name = "MC_stalls_3_sh", .pme_desc = "Number of processor cycles that requests were stalled in the MCU queues because bank 3 was busy with a previous requests", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x25, }, { .pme_name = "Re_RAW_miss", .pme_desc = "Stall cycles due to recirculation when there is a load instruction in the E-stage of the pipeline which has a non-bypassable read-after-write (RAW) hazard with an earlier store instruction", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x26, }, { .pme_name = "FM_pipe_completion", .pme_desc = "Number of retired instructions that complete execution on the FLoat-Point/Graphics Multiply pipeline", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x27, }, { .pme_name = "SSM_L3_miss_mtag_remote", .pme_desc = "Number of L3 cache misses to LPA (local physical address) from this core which generate retry (R_*) transactions including 
R_RTS, R_RTO, and R_RS", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x28, }, { .pme_name = "SSM_L3_miss_remote", .pme_desc = "Number of L3 cache misses from this core which generate retry (R_*) transactions to non-LPA (non-local physical address) address space, or R_WS transactions due to block store (BST) / block store commit (BSTC) to any address space (LPA or non-LPA), or R_RTO due to atomic request on Os state to LPA space.", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x29, }, { .pme_name = "SW_pf_exec", .pme_desc = "Number of retired, non-trapping software prefetch instructions that completed, i.e. number of retired prefetch instructions that were not dropped due to the prefetch queue being full", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x2a, }, { .pme_name = "SW_pf_str_exec", .pme_desc = "Number of retired, non-trapping strong prefetch instructions that completed", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x2b, }, { .pme_name = "SW_pf_dropped", .pme_desc = "Number of software prefetch instructions dropped due to TLB miss or due to the prefetch queue being full", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x2c, }, { .pme_name = "SW_pf_L2_installed", .pme_desc = "Number of software prefetch instructions that installed lines in the L2 cache", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x2d, }, { .pme_name = "L2_HW_pf_miss", .pme_desc = "Number of L2 cache misses by hardware prefetch requests from this core", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x2f, }, { .pme_name = "L3_miss", .pme_desc = "Number of L3 cache misses sent out to SIU from this core by cacheable I-cache, D-cache, P-cache, and W-cache (excluding block stores) requests", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x31, }, { .pme_name = "L3_IC_miss", .pme_desc = "Number of L3 cache misses by cacheable I-cache requests from this core", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x32, }, { .pme_name = "L3_SW_pf_miss", .pme_desc = "Number of L3 cache misses by software prefetch requests from this core", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x33, }, { 
.pme_name = "L3_hit_other_half", .pme_desc = "Number of L3 cache hits from this core to the ways filled by the other core when the cache is in pseudo-split mode", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x34, }, { .pme_name = "L3_wb", .pme_desc = "Number of L3 cache lines that were written back because of requests from this core", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x35, }, { .pme_name = "L3_wb_sh", .pme_desc = "Number of L3 cache lines that were written back because of requests from both cores", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x36, }, { .pme_name = "L2L3_snoop_cb_sh", .pme_desc = "Total number of L2 and L3 cache lines that were copied back due to other processors", .pme_ctrl = PME_CTRL_S1, .pme_val = 0x37, }, }; #define PME_ULTRA4PLUS_EVENT_COUNT (sizeof(ultra4plus_pe)/sizeof(pme_sparc_entry_t)) papi-papi-7-2-0-t/src/libperfnec/libpfms/000077500000000000000000000000001502707512200201745ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libperfnec/libpfms/Makefile000066400000000000000000000035511502707512200216400ustar00rootroot00000000000000# # Copyright (c) 2006 Hewlett-Packard Development Company, L.P. # Contributed by Stephane Eranian # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # TOPDIR := $(shell if [ "$$PWD" != "" ]; then echo $$PWD; else pwd; fi)/.. include $(TOPDIR)/config.mk include $(TOPDIR)/rules.mk DIRS=lib CFLAGS+= -pthread -D_GNU_SOURCE -I./include LIBS += -L$(TOPDIR)/libpfms/lib -lpfms $(PFMLIB) -lm TARGETS=syst_smp all: $(TARGETS) syst_smp: ./lib/libpfms.a syst_smp.o $(CC) $(CFLAGS) $(LDFLAGS) -o $@ syst_smp.o $(LIBS) -lpthread clean: @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d $@ ; done $(RM) -f *.o $(TARGETS) *~ distclean: clean .FORCE: lib/libpfms.a lib/libpfms.a: @set -e ; $(MAKE) -C lib all install depend: $(TARGETS) install depend: ifeq ($(CONFIG_PFMLIB_ARCH_SICORTEX),y) @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d $@ ; done endif papi-papi-7-2-0-t/src/libperfnec/libpfms/include/000077500000000000000000000000001502707512200216175ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libperfnec/libpfms/include/libpfms.h000066400000000000000000000036301502707512200234260ustar00rootroot00000000000000/* * libpfms.h - header file for libpfms - a helper library for perfmon SMP monitoring * * Copyright (c) 2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __LIBPFMS_H__ #define __LIBPFMS_H__ #ifdef __cplusplus extern "C" { #endif typedef int (*pfms_ovfl_t)(pfarg_msg_t *msg); int pfms_initialize(void); int pfms_create(uint64_t *cpu_list, size_t n, pfarg_ctx_t *ctx, pfms_ovfl_t *ovfl, void **desc); int pfms_write_pmcs(void *desc, pfarg_pmc_t *pmcs, uint32_t n); int pfms_write_pmds(void *desc, pfarg_pmd_t *pmds, uint32_t n); int pfms_read_pmds(void *desc, pfarg_pmd_t *pmds, uint32_t n); int pfms_start(void *desc); int pfms_stop(void *desc); int pfms_close(void *desc); int pfms_unload(void *desc); int pfms_load(void *desc); #ifdef __cplusplus /* extern C */ } #endif #endif /* __LIBPFMS_H__ */ papi-papi-7-2-0-t/src/libperfnec/libpfms/lib/000077500000000000000000000000001502707512200207425ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libperfnec/libpfms/lib/Makefile000066400000000000000000000051561502707512200224110ustar00rootroot00000000000000# # Copyright (c) 2006 Hewlett-Packard Development Company, L.P. 
# Contributed by Stephane Eranian # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # TOPDIR := $(shell if [ "$$PWD" != "" ]; then echo $$PWD; else pwd; fi)/../.. include $(TOPDIR)/config.mk include $(TOPDIR)/rules.mk CFLAGS+= -pthread -D_GNU_SOURCE LDFLAGS+=-static PFMSINCDIR=../include # # Library version # VERSION=0 REVISION=1 AGE=0 SRCS=libpfms.c HEADERS=../include/libpfms.h ALIBPFM=libpfms.a TARGETS=$(ALIBPFM) ifneq ($(CONFIG_PFMLIB_ARCH_CRAYX2),y) SLIBPFM=libpfms.so.$(VERSION).$(REVISION).$(AGE) VLIBPFM=libpfms.so.$(VERSION) endif OBJS=$(SRCS:.c=.o) SOBJS=$(OBJS:.o=.lo) # # assume that if llibpfm built static, libpfms should # also be static, i.e., likely platform does not support # shared libraries. 
# ifeq ($(CONFIG_PFMLIB_SHARED),y) TARGETS += $(SLIBPFM) endif ifeq ($(SYS),Linux) SLDFLAGS=-shared -Wl,-soname -Wl,libpfms.so.$(VERSION) endif CFLAGS+=-I$(PFMSINCDIR) all: $(TARGETS) $(OBJS) $(SOBJS): $(HEADERS) $(TOPDIR)/config.mk $(TOPDIR)/rules.mk Makefile libpfms.a: $(OBJS) $(RM) $@ $(AR) cru $@ $(OBJS) $(SLIBPFM): $(SOBJS) $(CC) $(CFLAGS) $(SLDFLAGS) -o $@ $(SOBJS) $(LN) -sf $@ libpfms.so.$(VERSION) clean: $(RM) -f *.o *.lo *.a *.so* *~ distclean: clean install: $(TARGETS) install: -mkdir -p $(DESTDIR)$(LIBDIR) $(INSTALL) -m 644 $(ALIBPFM) $(DESTDIR)$(LIBDIR) $(INSTALL) $(SLIBPFM) $(DESTDIR)$(LIBDIR) cd $(DESTDIR)$(LIBDIR); $(LN) $(SLIBPFM) $(VLIBPFM) cd $(DESTDIR)$(LIBDIR); $(LN) $(SLIBPFM) libpfms.so -mkdir -p $(DESTDIR)$(INCDIR)/perfmon $(INSTALL) -m 644 $(HEADERS) $(DESTDIR)$(INCDIR)/perfmon papi-papi-7-2-0-t/src/libperfnec/libpfms/lib/libpfms.c000066400000000000000000000432461502707512200225530ustar00rootroot00000000000000#include #include #include #include #include #include #include #include #include #include #include #include #include "libpfms.h" //#define dprint(format, arg...) fprintf(stderr, "%s.%d: " format , __FUNCTION__ , __LINE__, ## arg) #define dprint(format, arg...) 
typedef enum { CMD_NONE, CMD_CTX, CMD_LOAD, CMD_UNLOAD, CMD_WPMCS, CMD_WPMDS, CMD_RPMDS, CMD_STOP, CMD_START, CMD_CLOSE } pfms_cmd_t; typedef struct _barrier { pthread_mutex_t mutex; pthread_cond_t cond; uint32_t counter; uint32_t max; uint64_t generation; /* avoid race condition on wake-up */ } barrier_t; typedef struct { uint32_t cpu; uint32_t fd; void *smpl_vaddr; size_t smpl_buf_size; } pfms_cpu_t; typedef struct _pfms_thread { uint32_t cpu; pfms_cmd_t cmd; void *data; uint32_t ndata; sem_t cmd_sem; int ret; pthread_t tid; barrier_t *barrier; } pfms_thread_t; typedef struct { barrier_t barrier; uint32_t ncpus; } pfms_session_t; static uint32_t ncpus; static pfms_thread_t *tds; static pthread_mutex_t tds_lock = PTHREAD_MUTEX_INITIALIZER; static int barrier_init(barrier_t *b, uint32_t count) { int r; r = pthread_mutex_init(&b->mutex, NULL); if (r == -1) return -1; r = pthread_cond_init(&b->cond, NULL); if (r == -1) return -1; b->max = b->counter = count; b->generation = 0; return 0; } static void cleanup_barrier(void *arg) { barrier_t *b = (barrier_t *)arg; int r; r = pthread_mutex_unlock(&b->mutex); dprint("free barrier mutex r=%d\n", r); (void) r; } static int barrier_wait(barrier_t *b) { uint64_t generation; int oldstate; pthread_cleanup_push(cleanup_barrier, b); pthread_mutex_lock(&b->mutex); pthread_testcancel(); if (--b->counter == 0) { /* reset barrier */ b->counter = b->max; /* * bump generation number, this avoids thread getting stuck in the * wake up loop below in case a thread just out of the barrier goes * back in right away before all the thread from the previous "round" * have "escaped". 
*/ b->generation++; pthread_cond_broadcast(&b->cond); } else { generation = b->generation; pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &oldstate); while (b->counter != b->max && generation == b->generation) { pthread_cond_wait(&b->cond, &b->mutex); } pthread_setcancelstate(oldstate, NULL); } pthread_mutex_unlock(&b->mutex); pthread_cleanup_pop(0); return 0; } /* * placeholder for pthread_setaffinity_np(). This stuff is ugly * and I could not figure out a way to get it compiled while also preserving * the pthread_*cancel(). There are issues with LinuxThreads and NPTL. I * decided to quit on this and implement my own affinity call until this * settles. */ static int pin_cpu(uint32_t cpu) { uint64_t *mask; size_t size; pid_t pid; int ret; pid = syscall(__NR_gettid); size = ncpus * sizeof(uint64_t); mask = calloc(1, size); if (mask == NULL) { dprint("CPU%u: cannot allocate bitvector\n", cpu); return -1; } mask[cpu>>6] = 1ULL << (cpu & 63); ret = syscall(__NR_sched_setaffinity, pid, size, mask); free(mask); return ret; } static void pfms_thread_mainloop(void *arg) { long k = (long )arg; uint32_t mycpu = (uint32_t)k; pfarg_ctx_t myctx, *ctx; pfarg_load_t load_args; int fd = -1; pfms_thread_t *td; sem_t *cmd_sem; int ret = 0; memset(&load_args, 0, sizeof(load_args)); load_args.load_pid = mycpu; td = tds+mycpu; ret = pin_cpu(mycpu); dprint("CPU%u wthread created and pinned ret=%d\n", mycpu, ret); cmd_sem = &tds[mycpu].cmd_sem; for(;;) { dprint("CPU%u waiting for cmd\n", mycpu); sem_wait(cmd_sem); switch(td->cmd) { case CMD_NONE: ret = 0; break; case CMD_CTX: /* * copy context to get private fd */ ctx = td->data; myctx = *ctx; fd = pfm_create_context(&myctx, NULL, NULL, 0); ret = fd < 0 ? 
-1 : 0; dprint("CPU%u CMD_CTX ret=%d errno=%d fd=%d\n", mycpu, ret, errno, fd); break; case CMD_LOAD: ret = pfm_load_context(fd, &load_args); dprint("CPU%u CMD_LOAD ret=%d errno=%d fd=%d\n", mycpu, ret, errno, fd); break; case CMD_UNLOAD: ret = pfm_unload_context(fd); dprint("CPU%u CMD_UNLOAD ret=%d errno=%d fd=%d\n", mycpu, ret, errno, fd); break; case CMD_START: ret = pfm_start(fd, NULL); dprint("CPU%u CMD_START ret=%d errno=%d fd=%d\n", mycpu, ret, errno, fd); break; case CMD_STOP: ret = pfm_stop(fd); dprint("CPU%u CMD_STOP ret=%d errno=%d fd=%d\n", mycpu, ret, errno, fd); break; case CMD_WPMCS: ret = pfm_write_pmcs(fd,(pfarg_pmc_t *)td->data, td->ndata); dprint("CPU%u CMD_WPMCS ret=%d errno=%d fd=%d\n", mycpu, ret, errno, fd); break; case CMD_WPMDS: ret = pfm_write_pmds(fd,(pfarg_pmd_t *)td->data, td->ndata); dprint("CPU%u CMD_WPMDS ret=%d errno=%d fd=%d\n", mycpu, ret, errno, fd); break; case CMD_RPMDS: ret = pfm_read_pmds(fd,(pfarg_pmd_t *)td->data, td->ndata); dprint("CPU%u CMD_RPMDS ret=%d errno=%d fd=%d\n", mycpu, ret, errno, fd); break; case CMD_CLOSE: dprint("CPU%u CMD_CLOSE fd=%d\n", mycpu, fd); ret = close(fd); fd = -1; break; default: break; } td->ret = ret; dprint("CPU%u td->ret=%d\n", mycpu, ret); barrier_wait(td->barrier); } } static int create_one_wthread(int cpu) { int ret; sem_init(&tds[cpu].cmd_sem, 0, 0); ret = pthread_create(&tds[cpu].tid, NULL, (void *(*)(void *))pfms_thread_mainloop, (void *)(long)cpu); return ret; } /* * must be called with tds_lock held */ static int create_wthreads(uint64_t *cpu_list, uint32_t n) { uint64_t v; uint32_t i,k, cpu; int ret = 0; for(k=0, cpu = 0; k < n; k++, cpu+= 64) { v = cpu_list[k]; for(i=0; v && i < 63; i++, v>>=1, cpu++) { if ((v & 0x1) && tds[cpu].tid == 0) { ret = create_one_wthread(cpu); if (ret) break; } } } if (ret) dprint("cannot create wthread on CPU%u\n", cpu); return ret; } int pfms_initialize(void) { printf("cpu_t=%zu thread=%zu session_t=%zu\n", sizeof(pfms_cpu_t), sizeof(pfms_thread_t), 
sizeof(pfms_session_t)); ncpus = (uint32_t)sysconf(_SC_NPROCESSORS_ONLN); if (ncpus == -1) { dprint("cannot retrieve number of online processors\n"); return -1; } dprint("configured for %u CPUs\n", ncpus); /* * XXX: assuming CPU are contiguously indexed */ tds = calloc(ncpus, sizeof(*tds)); if (tds == NULL) { dprint("cannot allocate thread descriptors\n"); return -1; } return 0; } int pfms_create(uint64_t *cpu_list, size_t n, pfarg_ctx_t *ctx, pfms_ovfl_t *ovfl, void **desc) { uint64_t v; size_t k, i; uint32_t num, cpu; pfms_session_t *s; int ret; if (cpu_list == NULL || n == 0 || ctx == NULL || desc == NULL) { dprint("invalid parameters\n"); return -1; } if ((ctx->ctx_flags & PFM_FL_SYSTEM_WIDE) == 0) { dprint("only works for system wide\n"); return -1; } *desc = NULL; /* * XXX: assuming CPU are contiguously indexed */ num = 0; for(k=0, cpu = 0; k < n; k++, cpu+=64) { v = cpu_list[k]; for(i=0; v && i < 63; i++, v>>=1, cpu++) { if (v & 0x1) { if (cpu >= ncpus) { dprint("unavailable CPU%u\n", cpu); return -1; } num++; } } } if (num == 0) return 0; s = calloc(1, sizeof(*s)); if (s == NULL) { dprint("cannot allocate %u contexts\n", num); return -1; } s->ncpus = num; printf("%u-way session\n", num); /* * +1 to account for main thread waiting */ ret = barrier_init(&s->barrier, num + 1); if (ret) { dprint("cannot init barrier\n"); goto error_free; } /* * lock thread descriptor table, no other create_session, close_session * can occur */ pthread_mutex_lock(&tds_lock); if (create_wthreads(cpu_list, n)) goto error_free_unlock; /* * check all needed threads are available */ for(k=0, cpu = 0; k < n; k++, cpu += 64) { v = cpu_list[k]; for(i=0; v && i < 63; i++, v>>=1, cpu++) { if (v & 0x1) { if (tds[cpu].barrier) { dprint("CPU%u already managing a session\n", cpu); goto error_free_unlock; } } } } /* * send create context order */ for(k=0, cpu = 0; k < n; k++, cpu += 64) { v = cpu_list[k]; for(i=0; v && i < 63; i++, v>>=1, cpu++) { if (v & 0x1) { tds[cpu].cmd = CMD_CTX; 
tds[cpu].data = ctx; tds[cpu].barrier = &s->barrier; sem_post(&tds[cpu].cmd_sem); } } } barrier_wait(&s->barrier); ret = 0; /* * check for errors */ for(k=0; k < ncpus; k++) { if (tds[k].barrier == &s->barrier) { ret = tds[k].ret; if (ret) break; } } /* * undo if error found */ if (k < ncpus) { for(k=0; k < ncpus; k++) { if (tds[k].barrier == &s->barrier) { if (tds[k].ret == 0) { tds[k].cmd = CMD_CLOSE; sem_post(&tds[k].cmd_sem); } /* mark as free */ tds[k].barrier = NULL; } } } pthread_mutex_unlock(&tds_lock); if (ret == 0) *desc = s; return ret ? -1 : 0; error_free_unlock: pthread_mutex_unlock(&tds_lock); error_free: free(s); return -1; } int pfms_load(void *desc) { uint32_t k; pfms_session_t *s; int ret; if (desc == NULL) { dprint("invalid parameters\n"); return -1; } s = (pfms_session_t *)desc; if (s->ncpus == 0) { dprint("invalid session content 0 CPUS\n"); return -1; } /* * send create context order */ for(k=0; k < ncpus; k++) { if (tds[k].barrier == &s->barrier) { tds[k].cmd = CMD_LOAD; sem_post(&tds[k].cmd_sem); } } barrier_wait(&s->barrier); ret = 0; /* * check for errors */ for(k=0; k < ncpus; k++) { if (tds[k].barrier == &s->barrier) { ret = tds[k].ret; if (ret) { dprint("failure on CPU%u\n", k); break; } } } /* * if error, unload all others */ if (k < ncpus) { for(k=0; k < ncpus; k++) { if (tds[k].barrier == &s->barrier) { if (tds[k].ret == 0) { tds[k].cmd = CMD_UNLOAD; sem_post(&tds[k].cmd_sem); } } } } return ret ? 
-1 : 0; } static int __pfms_do_simple_cmd(pfms_cmd_t cmd, void *desc, void *data, uint32_t n) { size_t k; pfms_session_t *s; int ret; if (desc == NULL) { dprint("invalid parameters\n"); return -1; } s = (pfms_session_t *)desc; if (s->ncpus == 0) { dprint("invalid session content 0 CPUS\n"); return -1; } /* * send create context order */ for(k=0; k < ncpus; k++) { if (tds[k].barrier == &s->barrier) { tds[k].cmd = cmd; tds[k].data = data; tds[k].ndata = n; sem_post(&tds[k].cmd_sem); } } barrier_wait(&s->barrier); ret = 0; /* * check for errors */ for(k=0; k < ncpus; k++) { if (tds[k].barrier == &s->barrier) { ret = tds[k].ret; if (ret) { dprint("failure on CPU%zu\n", k); break; } } } /* * simple commands cannot be undone */ return ret ? -1 : 0; } int pfms_unload(void *desc) { return __pfms_do_simple_cmd(CMD_UNLOAD, desc, NULL, 0); } int pfms_start(void *desc) { return __pfms_do_simple_cmd(CMD_START, desc, NULL, 0); } int pfms_stop(void *desc) { return __pfms_do_simple_cmd(CMD_STOP, desc, NULL, 0); } int pfms_write_pmcs(void *desc, pfarg_pmc_t *pmcs, uint32_t n) { return __pfms_do_simple_cmd(CMD_WPMCS, desc, pmcs, n); } int pfms_write_pmds(void *desc, pfarg_pmd_t *pmds, uint32_t n) { return __pfms_do_simple_cmd(CMD_WPMDS, desc, pmds, n); } int pfms_close(void *desc) { size_t k; pfms_session_t *s; int ret; if (desc == NULL) { dprint("invalid parameters\n"); return -1; } s = (pfms_session_t *)desc; if (s->ncpus == 0) { dprint("invalid session content 0 CPUS\n"); return -1; } for(k=0; k < ncpus; k++) { if (tds[k].barrier == &s->barrier) { tds[k].cmd = CMD_CLOSE; sem_post(&tds[k].cmd_sem); } } barrier_wait(&s->barrier); ret = 0; pthread_mutex_lock(&tds_lock); /* * check for errors */ for(k=0; k < ncpus; k++) { if (tds[k].barrier == &s->barrier) { if (tds[k].ret) { dprint("failure on CPU%zu\n", k); } ret |= tds[k].ret; tds[k].barrier = NULL; } } pthread_mutex_unlock(&tds_lock); free(s); /* * XXX: we cannot undo close */ return ret ? 
-1 : 0; } int pfms_read_pmds(void *desc, pfarg_pmd_t *pmds, uint32_t n) { pfms_session_t *s; uint32_t k, pmds_per_cpu; int ret; if (desc == NULL) { dprint("invalid parameters\n"); return -1; } s = (pfms_session_t *)desc; if (s->ncpus == 0) { dprint("invalid session content 0 CPUS\n"); return -1; } if (n % s->ncpus) { dprint("invalid number of pfarg_pmd_t provided, must be multiple of %u\n", s->ncpus); return -1; } pmds_per_cpu = n / s->ncpus; dprint("n=%u ncpus=%u per_cpu=%u\n", n, s->ncpus, pmds_per_cpu); for(k=0; k < ncpus; k++) { if (tds[k].barrier == &s->barrier) { tds[k].cmd = CMD_RPMDS; tds[k].data = pmds; tds[k].ndata= pmds_per_cpu; sem_post(&tds[k].cmd_sem); pmds += pmds_per_cpu; } } barrier_wait(&s->barrier); ret = 0; /* * check for errors */ for(k=0; k < ncpus; k++) { if (tds[k].barrier == &s->barrier) { ret = tds[k].ret; if (ret) { dprint("failure on CPU%u\n", k); break; } } } /* * cannot undo pfm_read_pmds */ return ret ? -1 : 0; } #if 0 /* * beginning of test program */ #include #define NUM_PMCS PFMLIB_MAX_PMCS #define NUM_PMDS PFMLIB_MAX_PMDS static void fatal_error(char *fmt,...) __attribute__((noreturn)); static void fatal_error(char *fmt, ...) 
{ va_list ap; va_start(ap, fmt); vfprintf(stderr, fmt, ap); va_end(ap); exit(1); } static uint32_t popcount(uint64_t c) { uint32_t count = 0; for(; c; c>>=1) { if (c & 0x1) count++; } return count; } int main(int argc, char **argv) { pfarg_ctx_t ctx; pfarg_pmc_t pc[NUM_PMCS]; pfarg_pmd_t *pd; pfmlib_input_param_t inp; pfmlib_output_param_t outp; uint64_t cpu_list; void *desc; unsigned int num_counters; uint32_t i, j, k, l, ncpus, npmds; size_t len; int ret; char *name; if (pfm_initialize() != PFMLIB_SUCCESS) fatal_error("cannot initialize libpfm\n"); if (pfms_initialize()) fatal_error("cannot initialize libpfms\n"); pfm_get_num_counters(&num_counters); pfm_get_max_event_name_len(&len); name = malloc(len+1); if (name == NULL) fatal_error("cannot allocate memory for event name\n"); memset(&ctx, 0, sizeof(ctx)); memset(pc, 0, sizeof(pc)); memset(&inp,0, sizeof(inp)); memset(&outp,0, sizeof(outp)); cpu_list = argc > 1 ? strtoul(argv[1], NULL, 0) : 0x3; ncpus = popcount(cpu_list); if (pfm_get_cycle_event(&inp.pfp_events[0].event) != PFMLIB_SUCCESS) fatal_error("cannot find cycle event\n"); if (pfm_get_inst_retired_event(&inp.pfp_events[1].event) != PFMLIB_SUCCESS) fatal_error("cannot find inst retired event\n"); i = 2; inp.pfp_dfl_plm = PFM_PLM3|PFM_PLM0; if (i > num_counters) { i = num_counters; printf("too many events provided (max=%d events), using first %d event(s)\n", num_counters, i); } /* * how many counters we use */ inp.pfp_event_count = i; /* * indicate we are using the monitors for a system-wide session. * This may impact the way the library sets up the PMC values. 
*/ inp.pfp_flags = PFMLIB_PFP_SYSTEMWIDE; /* * let the library figure out the values for the PMCS */ if ((ret=pfm_dispatch_events(&inp, NULL, &outp, NULL)) != PFMLIB_SUCCESS) fatal_error("cannot configure events: %s\n", pfm_strerror(ret)); npmds = ncpus * inp.pfp_event_count; dprint("ncpus=%u npmds=%u\n", ncpus, npmds); pd = calloc(npmds, sizeof(pfarg_pmd_t)); if (pd == NULL) fatal_error("cannot allocate pd array\n"); for (i=0; i < outp.pfp_pmc_count; i++) { pc[i].reg_num = outp.pfp_pmcs[i].reg_num; pc[i].reg_value = outp.pfp_pmcs[i].reg_value; } for(l=0, k = 0; l < ncpus; l++) { for (i=0, j=0; i < inp.pfp_event_count; i++, k++) { pd[k].reg_num = outp.pfp_pmcs[j].reg_pmd_num; for(; j < outp.pfp_pmc_count; j++) if (outp.pfp_pmcs[j].reg_evt_idx != i) break; } } /* * create a context on all CPUs we asked for * * libpfms only works for system-wide, so we set the flag in * the master context. the context argument is not modified by * call. * * desc is an opaque descriptor used to identify session. */ ctx.ctx_flags = PFM_FL_SYSTEM_WIDE; ret = pfms_create(&cpu_list, 1, &ctx, NULL, &desc); if (ret == -1) fatal_error("create error %d\n", ret); /* * program the PMC registers on all CPUs of interest */ ret = pfms_write_pmcs(desc, pc, outp.pfp_pmc_count); if (ret == -1) fatal_error("write_pmcs error %d\n", ret); /* * program the PMD registers on all CPUs of interest */ ret = pfms_write_pmds(desc, pd, inp.pfp_event_count); if (ret == -1) fatal_error("write_pmds error %d\n", ret); /* * load context on all CPUs of interest */ ret = pfms_load(desc); if (ret == -1) fatal_error("load error %d\n", ret); /* * start monitoring on all CPUs of interest */ ret = pfms_start(desc); if (ret == -1) fatal_error("start error %d\n", ret); /* * simulate some work */ sleep(10); /* * stop monitoring on all CPUs of interest */ ret = pfms_stop(desc); if (ret == -1) fatal_error("stop error %d\n", ret); /* * read the PMD registers on all CPUs of interest. 
* The pd[] array must be organized such that to * read 2 PMDs on each CPU you need: * - 2 * number of CPUs of interest * - the first 2 elements of pd[] read on 1st CPU * - the next 2 elements of pd[] read on the 2nd CPU * - and so on */ ret = pfms_read_pmds(desc, pd, npmds); if (ret == -1) fatal_error("read_pmds error %d\n", ret); /* * pre per-CPU results */ for(j=0, k= 0; j < ncpus; j++) { for (i=0; i < inp.pfp_event_count; i++, k++) { pfm_get_full_event_name(&inp.pfp_events[i], name, len); printf("CPU%-3d PMD%u %20"PRIu64" %s\n", j, pd[k].reg_num, pd[k].reg_value, name); } } /* * destroy context on all CPUs of interest. * After this call desc is invalid */ ret = pfms_close(desc); if (ret == -1) fatal_error("close error %d\n", ret); free(name); return 0; } #endif papi-papi-7-2-0-t/src/libperfnec/libpfms/syst_smp.c000066400000000000000000000157561502707512200222370ustar00rootroot00000000000000/* * syst_smp.c - system-wide monitoring for SMP machine using libpfms helper * library * * Copyright (c) 2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include #include #include #define NUM_PMCS PFMLIB_MAX_PMCS #define NUM_PMDS PFMLIB_MAX_PMDS static void fatal_error(char *fmt,...) __attribute__((noreturn)); static void fatal_error(char *fmt, ...) { va_list ap; va_start(ap, fmt); vfprintf(stderr, fmt, ap); va_end(ap); exit(1); } static uint32_t popcount(uint64_t c) { uint32_t count = 0; for(; c; c>>=1) { if (c & 0x1) count++; } return count; } int main(int argc, char **argv) { pfarg_ctx_t ctx; pfarg_pmc_t pc[NUM_PMCS]; pfarg_pmd_t *pd; pfmlib_input_param_t inp; pfmlib_output_param_t outp; uint64_t cpu_list; void *desc; unsigned int num_counters; uint32_t i, j, l, k, ncpus, npmds; size_t len; int ret; char *name; if (pfm_initialize() != PFMLIB_SUCCESS) fatal_error("cannot initialize libpfm\n"); if (pfms_initialize()) fatal_error("cannot initialize libpfms\n"); memset(&ctx, 0, sizeof(ctx)); memset(pc, 0, sizeof(pc)); ncpus = (uint32_t)sysconf(_SC_NPROCESSORS_ONLN); if (ncpus == -1) fatal_error("cannot retrieve number of online processors\n"); if (argc > 1) { cpu_list = strtoul(argv[1],NULL,0); if (popcount(cpu_list) > ncpus) fatal_error("too many processors specified\n"); } else { cpu_list = ((1< num_counters) { i = num_counters; printf("too many events provided (max=%d events), using first %d event(s)\n", num_counters, i); } /* * how many counters we use */ inp.pfp_event_count = i; /* * indicate we are using the monitors for a system-wide session. * This may impact the way the library sets up the PMC values. 
*/ inp.pfp_flags = PFMLIB_PFP_SYSTEMWIDE; /* * let the library figure out the values for the PMCS */ if ((ret=pfm_dispatch_events(&inp, NULL, &outp, NULL)) != PFMLIB_SUCCESS) fatal_error("cannot configure events: %s\n", pfm_strerror(ret)); npmds = ncpus * inp.pfp_event_count; printf("ncpus=%u npmds=%u\n", ncpus, npmds); pd = calloc(npmds, sizeof(pfarg_pmd_t)); if (pd == NULL) fatal_error("cannot allocate pd array\n"); for (i=0; i < outp.pfp_pmc_count; i++) { pc[i].reg_num = outp.pfp_pmcs[i].reg_num; pc[i].reg_value = outp.pfp_pmcs[i].reg_value; } /* * We use inp.pfp_event_count PMD registers for our events per-CPU. * We need to set up the PMDs we use. They are determined based on the * PMC registers used. The following loop prepares the pd[] array * for pfm_write_pmds(). With libpfms, on PMD write we need to pass * only pfp_event_count PMD registers. But on PMD read, we need * to pass pfp_event_count PMD registers per-CPU because libpfms * does not aggregate counts. To prepare for PMD read, we therefore * propagate the PMD setup beyond just the first pfp_event_count * elements of pd[]. */ for(l=0, k= 0; l < ncpus; l++) { for (i=0; i < outp.pfp_pmd_count; i++, k++) pd[k].reg_num = outp.pfp_pmds[i].reg_num; } /* * create a context on all CPUs we asked for * * libpfms only works for system-wide, so we set the flag in * the master context. the context argument is not modified by * the call. * * desc is an opaque descriptor used to identify the session. 
*/ ctx.ctx_flags = PFM_FL_SYSTEM_WIDE; ret = pfms_create(&cpu_list, 1, &ctx, NULL, &desc); if (ret == -1) fatal_error("create error %d\n", ret); /* * program the PMC registers on all CPUs of interest */ ret = pfms_write_pmcs(desc, pc, outp.pfp_pmc_count); if (ret == -1) fatal_error("write_pmcs error %d\n", ret); /* * program the PMD registers on all CPUs of interest */ ret = pfms_write_pmds(desc, pd, outp.pfp_pmd_count); if (ret == -1) fatal_error("write_pmds error %d\n", ret); /* * load context on all CPUs of interest */ ret = pfms_load(desc); if (ret == -1) fatal_error("load error %d\n", ret); printf("monitoring for 10s on all CPUs\n"); /* * start monitoring on all CPUs of interest */ ret = pfms_start(desc); if (ret == -1) fatal_error("start error %d\n", ret); /* * stop and listen to activity for 10s */ sleep(10); /* * stop monitoring on all CPUs of interest */ ret = pfms_stop(desc); if (ret == -1) fatal_error("stop error %d\n", ret); /* * read the PMD registers on all CPUs of interest. * The pd[] array must be organized such that to * read 2 PMDs on each CPU you need: * - 2 * number of CPUs of interest * - the first 2 elements of pd[] read on CPU0 * - the next 2 elements of pd[] read on CPU1 * - and so on */ ret = pfms_read_pmds(desc, pd, npmds); if (ret == -1) fatal_error("read_pmds error %d\n", ret); /* * print per-CPU results */ for(j=0, k= 0; j < ncpus; j++) { for (i=0; i < inp.pfp_event_count; i++, k++) { pfm_get_full_event_name(&inp.pfp_events[i], name, len); printf("CPU%-3d PMD%u %20"PRIu64" %s\n", j, pd[k].reg_num, pd[k].reg_value, name); } } /* * destroy context on all CPUs of interest. 
* After this call desc is invalid */ ret = pfms_close(desc); if (ret == -1) fatal_error("close error %d\n", ret); free(name); return 0; } papi-papi-7-2-0-t/src/libperfnec/python/000077500000000000000000000000001502707512200200615ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libperfnec/python/Makefile000066400000000000000000000024231502707512200215220ustar00rootroot00000000000000# # Copyright (c) 2008 Google, Inc. # Contributed by Arun Sharma # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # all: ./setup.py build install: ./setup.py install clean: $(RM) src/perfmon_int_wrap.c src/perfmon_int.py src/*.pyc $(RM) -r build papi-papi-7-2-0-t/src/libperfnec/python/README000066400000000000000000000003711502707512200207420ustar00rootroot00000000000000Requirements: To use the python bindings, you need the following packages: 1. swig (http://www.swig.org) 2. python-dev (http://www.python.org) 3. 
pycpuid (http://code.google.com/p/pycpuid) linux.sched is python package that comes with pycpuid. papi-papi-7-2-0-t/src/libperfnec/python/self.py000077500000000000000000000040121502707512200213640ustar00rootroot00000000000000#!/usr/bin/env python # # Copyright (c) 2008 Google, Inc. # Contributed by Arun Sharma # # Permission is hereby granted, free of charge, to any person obtaining a # copy of this software and associated documentation files (the "Software"), # to deal in the Software without restriction, including without limitation # the rights to use, copy, modify, merge, publish, distribute, sublicense, # and/or sell copies of the Software, and to permit persons to whom the # Software is furnished to do so, subject to the following conditions: # # The above copyright notice and this permission notice shall be included # in all copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL # THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR # OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, # ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR # OTHER DEALINGS IN THE SOFTWARE. # # Self monitoring example. 
Copied from self.c import os from optparse import OptionParser import random import errno from perfmon import * if __name__ == '__main__': parser = OptionParser() parser.add_option("-e", "--events", help="Events to use", action="store", dest="events") (options, args) = parser.parse_args() s = PerThreadSession(int(os.getpid())) if options.events: events = options.events.split(",") else: raise "You need to specify events to monitor" s.dispatch_events(events) s.load() s.start() # code to be measured # # note that this is not identical to what examples/self.c does # thus counts will be different in the end for i in range(1, 10000000): random.random() s.stop() # read the counts for i in xrange(s.npmds): print """PMD%d\t%lu""" % (s.pmds[0][i].reg_num, s.pmds[0][i].reg_value) papi-papi-7-2-0-t/src/libperfnec/python/setup.py000077500000000000000000000012401502707512200215730ustar00rootroot00000000000000#!/usr/bin/env python from distutils.core import setup, Extension from distutils.command.install_data import install_data setup(name='perfmon', version='0.1', author='Arun Sharma', author_email='arun.sharma@google.com', description='libpfm wrapper', packages=['perfmon'], package_dir={ 'perfmon' : 'src' }, py_modules=['perfmon.perfmon_int'], ext_modules=[Extension('perfmon._perfmon_int', sources = ['src/perfmon_int.i'], libraries = ['pfm'], library_dirs = ['../lib'], include_dirs = ['../include'], swig_opts=['-I../include'])]) papi-papi-7-2-0-t/src/libperfnec/python/src/000077500000000000000000000000001502707512200206505ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libperfnec/python/src/__init__.py000066400000000000000000000001021502707512200227520ustar00rootroot00000000000000from perfmon_int import * from pmu import * from session import * papi-papi-7-2-0-t/src/libperfnec/python/src/perfmon_int.i000066400000000000000000000127651502707512200233550ustar00rootroot00000000000000/* * Copyright (c) 2008 Google, Inc. 
* Contributed by Arun Sharma * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included * in all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR * OTHER DEALINGS IN THE SOFTWARE. * * Python Bindings for perfmon. 
*/ %module perfmon_int %{ #include #include static PyObject *libpfm_err; %} %include "carrays.i" %include "cstring.i" %include /* Some typemaps for corner cases SWIG can't handle */ /* Convert from Python --> C */ %typemap(memberin) pfmlib_event_t[ANY] { int i; for (i = 0; i < $1_dim0; i++) { $1[i] = $input[i]; } } %typemap(out) pfmlib_event_t[ANY] { int len, i; len = $1_dim0; $result = PyList_New(len); for (i = 0; i < len; i++) { PyObject *o = SWIG_NewPointerObj(SWIG_as_voidptr(&$1[i]), SWIGTYPE_p_pfmlib_event_t, 0 | 0 ); PyList_SetItem($result, i, o); } } /* Convert from Python --> C */ %typemap(memberin) pfmlib_reg_t[ANY] { int i; for (i = 0; i < $1_dim0; i++) { $1[i] = $input[i]; } } %typemap(out) pfmlib_reg_t[ANY] { int len, i; len = $1_dim0; $result = PyList_New(len); for (i = 0; i < len; i++) { PyObject *o = SWIG_NewPointerObj(SWIG_as_voidptr(&$1[i]), SWIGTYPE_p_pfmlib_reg_t, 0 | 0 ); PyList_SetItem($result, i, o); } } /* Convert libpfm errors into exceptions */ %typemap(out) os_err_t { if (result == -1) { PyErr_SetFromErrno(PyExc_OSError); SWIG_fail; } resultobj = SWIG_From_int((int)(result)); }; %typemap(out) pfm_err_t { if (result != PFMLIB_SUCCESS) { PyObject *obj = Py_BuildValue("(i,s)", result, pfm_strerror(result)); PyErr_SetObject(libpfm_err, obj); SWIG_fail; } else { PyErr_Clear(); } resultobj = SWIG_From_int((int)(result)); } /* Convert libpfm errors into exceptions */ %typemap(out) os_err_t { if (result == -1) { PyErr_SetFromErrno(PyExc_OSError); SWIG_fail; } resultobj = SWIG_From_int((int)(result)); }; %typemap(out) pfm_err_t { if (result != PFMLIB_SUCCESS) { PyObject *obj = Py_BuildValue("(i,s)", result, pfm_strerror(result)); PyErr_SetObject(libpfm_err, obj); SWIG_fail; } else { PyErr_Clear(); } resultobj = SWIG_From_int((int)(result)); } %cstring_output_maxsize(char *name, size_t maxlen) %cstring_output_maxsize(char *name, int maxlen) %extend pfmlib_regmask_t { unsigned int weight() { unsigned int w = 0; pfm_regmask_weight($self, &w); return 
w; } } /* Kernel interface */ %include %array_class(pfarg_pmc_t, pmc) %array_class(pfarg_pmd_t, pmd) /* Library interface */ %include %extend pfarg_ctx_t { void zero() { memset(self, 0, sizeof(self)); } } %extend pfarg_load_t { void zero() { memset(self, 0, sizeof(self)); } } %init %{ libpfm_err = PyErr_NewException("perfmon.libpfmError", NULL, NULL); PyDict_SetItemString(d, "libpfmError", libpfm_err); %} %inline %{ /* Helper functions to avoid pointer classes */ int pfm_py_get_pmu_type(void) { int tmp = -1; pfm_get_pmu_type(&tmp); return tmp; } unsigned int pfm_py_get_hw_counter_width(void) { unsigned int tmp = 0; pfm_get_hw_counter_width(&tmp); return tmp; } unsigned int pfm_py_get_num_events(void) { unsigned int tmp = 0; pfm_get_num_events(&tmp); return tmp; } int pfm_py_get_event_code(int idx) { int tmp = 0; pfm_get_event_code(idx, &tmp); return tmp; } unsigned int pfm_py_get_num_event_masks(int idx) { unsigned int tmp = 0; pfm_get_num_event_masks(idx, &tmp); return tmp; } unsigned int pfm_py_get_event_mask_code(int idx, int i) { unsigned int tmp = 0; pfm_get_event_mask_code(idx, i, &tmp); return tmp; } #define PFMON_MAX_EVTNAME_LEN 128 PyObject *pfm_py_get_event_name(int idx) { char name[PFMON_MAX_EVTNAME_LEN]; pfm_get_event_name(idx, name, PFMON_MAX_EVTNAME_LEN); return PyString_FromString(name); } PyObject *pfm_py_get_event_mask_name(int idx, int i) { char name[PFMON_MAX_EVTNAME_LEN]; pfm_get_event_mask_name(idx, i, name, PFMON_MAX_EVTNAME_LEN); return PyString_FromString(name); } PyObject *pfm_py_get_event_description(int idx) { char *desc; PyObject *ret; pfm_get_event_description(idx, &desc); ret = PyString_FromString(desc); free(desc); return ret; } PyObject *pfm_py_get_event_mask_description(int idx, int i) { char *desc; PyObject *ret; pfm_get_event_mask_description(idx, i, &desc); ret = PyString_FromString(desc); free(desc); return ret; } %} 
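
The %inline helpers above all follow one pattern: a libpfm call that returns its result through an out-pointer (e.g. pfm_get_num_events(&tmp)) is wrapped so that Python sees an ordinary return value. A minimal pure-Python sketch of that pattern follows; make_getter and fake_get_num_events are illustrative names invented here, not part of the bindings:

```python
def make_getter(c_style_func):
    """Wrap a C-style function that fills a one-element 'out' holder,
    so callers get the result back as a plain return value."""
    def getter(*args):
        out = [None]              # stands in for the C out-pointer
        c_style_func(*args, out)  # callee writes its result into out[0]
        return out[0]
    return getter

# Stand-in for a call such as pfm_get_num_events(unsigned int *count)
def fake_get_num_events(out):
    out[0] = 42

get_num_events = make_getter(fake_get_num_events)
print(get_num_events())  # -> 42
```

This is why helpers such as pfm_py_get_num_events() exist: Python callers never have to manage raw output pointers.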
papi-papi-7-2-0-t/src/libperfnec/python/src/pmu.py000066400000000000000000000066351502707512200220350ustar00rootroot00000000000000#!/usr/bin/env python # # Copyright (c) 2008 Google, Inc. # Contributed by Arun Sharma # # Permission is hereby granted, free of charge, to any person obtaining a # copy of this software and associated documentation files (the "Software"), # to deal in the Software without restriction, including without limitation # the rights to use, copy, modify, merge, publish, distribute, sublicense, # and/or sell copies of the Software, and to permit persons to whom the # Software is furnished to do so, subject to the following conditions: # # The above copyright notice and this permission notice shall be included # in all copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL # THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR # OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, # ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR # OTHER DEALINGS IN THE SOFTWARE. # import os from perfmon import * def public_members(self): s = "{ " for k, v in self.__dict__.iteritems(): if not k[0] == '_': s += "%s : %s, " % (k, v) s += " }" return s class System: def __init__(self): self.ncpus = os.sysconf('SC_NPROCESSORS_ONLN') self.pmu = PMU() def __repr__(self): return public_members(self) class Event: def __init__(self): pass def __repr__(self): return '\n' + public_members(self) class EventMask: def __init__(self): pass def __repr__(self): return '\n\t' + public_members(self) class PMU: def __init__(self): pfm_initialize() self.type = pfm_py_get_pmu_type() self.name = pfm_get_pmu_name(PFMON_MAX_EVTNAME_LEN)[1] self.width = pfm_py_get_hw_counter_width() # What does the PMU support? 
self.__implemented_pmcs = pfmlib_regmask_t() self.__implemented_pmds = pfmlib_regmask_t() self.__implemented_counters = pfmlib_regmask_t() pfm_get_impl_pmcs(self.__implemented_pmcs) pfm_get_impl_pmds(self.__implemented_pmds) pfm_get_impl_counters(self.__implemented_counters) self.implemented_pmcs = self.__implemented_pmcs.weight() self.implemented_pmds = self.__implemented_pmds.weight() self.implemented_counters = self.__implemented_counters.weight() self.__events = None def __parse_events(self): nevents = pfm_py_get_num_events() self.__events = [] for idx in range(0, nevents): e = Event() e.name = pfm_py_get_event_name(idx) e.code = pfm_py_get_event_code(idx) e.__counters = pfmlib_regmask_t() pfm_get_event_counters(idx, e.__counters) # Now the event masks e.masks = [] nmasks = pfm_py_get_num_event_masks(idx) for mask_idx in range(0, nmasks): em = EventMask() em.name = pfm_py_get_event_mask_name(idx, mask_idx) em.code = pfm_py_get_event_mask_code(idx, mask_idx) em.desc = pfm_py_get_event_mask_description(idx, mask_idx) e.masks.append(em) self.__events.append(e) def events(self): if not self.__events: self.__parse_events() return self.__events def __repr__(self): return public_members(self) if __name__ == '__main__': from perfmon import * s = System() print s print s.pmu.events() papi-papi-7-2-0-t/src/libperfnec/python/src/session.py000066400000000000000000000161141502707512200227100ustar00rootroot00000000000000#!/usr/bin/env python # # Copyright (c) 2008 Google, Inc. 
# Contributed by Arun Sharma # # Permission is hereby granted, free of charge, to any person obtaining a # copy of this software and associated documentation files (the "Software"), # to deal in the Software without restriction, including without limitation # the rights to use, copy, modify, merge, publish, distribute, sublicense, # and/or sell copies of the Software, and to permit persons to whom the # Software is furnished to do so, subject to the following conditions: # # The above copyright notice and this permission notice shall be included # in all copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL # THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR # OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, # ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR # OTHER DEALINGS IN THE SOFTWARE. # from perfmon import * from linux import sched import os import sys from threading import Thread # Shouldn't be necessary for python version >= 2.5 from Queue import Queue # http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/425445 def once(func): "A decorator that runs a function only once." 
def decorated(*args, **kwargs): try: return decorated._once_result except AttributeError: decorated._once_result = func(*args, **kwargs) return decorated._once_result return decorated @once def pfm_initialize_once(): # Initialize once opts = pfmlib_options_t() opts.pfm_verbose = 1 pfm_set_options(opts) pfm_initialize() # Common base class class Session: def __init__(self, n): self.system = System() pfm_initialize_once() # Setup context self.ctxts = [] self.fds = [] self.inps = [] self.outps = [] self.pmcs = [] self.pmds = [] for i in xrange(n): ctx = pfarg_ctx_t() ctx.zero() ctx.ctx_flags = self.ctx_flags fd = pfm_create_context(ctx, None, None, 0) self.ctxts.append(ctx) self.fds.append(fd) def __del__(self): if self.__dict__.has_key("fds"): for fd in self.fds: os.close(fd) def dispatch_event_one(self, events, which): # Select and dispatch events inp = pfmlib_input_param_t() for i in xrange(0, len(events)): pfm_find_full_event(events[i], inp.pfp_events[i]) inp.pfp_dfl_plm = self.default_pl inp.pfp_flags = self.pfp_flags outp = pfmlib_output_param_t() cnt = len(events) inp.pfp_event_count = cnt pfm_dispatch_events(inp, None, outp, None) # pfp_pm_count may be > cnt cnt = outp.pfp_pmc_count pmcs = pmc(outp.pfp_pmc_count) pmds = pmd(outp.pfp_pmd_count) for i in xrange(outp.pfp_pmc_count): npmc = pfarg_pmc_t() npmc.reg_num = outp.pfp_pmcs[i].reg_num npmc.reg_value = outp.pfp_pmcs[i].reg_value pmcs[i] = npmc self.npmds = outp.pfp_pmd_count for i in xrange(outp.pfp_pmd_count): npmd = pfarg_pmd_t() npmd.reg_num = outp.pfp_pmds[i].reg_num pmds[i] = npmd # Program PMCs and PMDs fd = self.fds[which] pfm_write_pmcs(fd, pmcs, outp.pfp_pmc_count) pfm_write_pmds(fd, pmds, outp.pfp_pmd_count) # Save all the state in various vectors self.inps.append(inp) self.outps.append(outp) self.pmcs.append(pmcs) self.pmds.append(pmds) def dispatch_events(self, events): for i in xrange(len(self.fds)): self.dispatch_event_one(events, i) def load_one(self, i): fd = self.fds[i] load = 
pfarg_load_t() load.zero() load.load_pid = self.targets[i] try: pfm_load_context(fd, load) except OSError, err: import errno if (err.errno == errno.EBUSY): err.strerror = "Another conflicting perfmon session?" raise err def load(self): for i in xrange(len(self.fds)): self.load_one(i) def start_one(self, i): pfm_start(self.fds[i], None) def start(self): for i in xrange(len(self.fds)): self.start_one(i) def stop_one(self, i): fd = self.fds[i] pmds = self.pmds[i] pfm_stop(fd) pfm_read_pmds(fd, pmds, self.npmds) def stop(self): for i in xrange(len(self.fds)): self.stop_one(i) class PerfmonThread(Thread): def __init__(self, session, i, cpu): Thread.__init__(self) self.cpu = cpu self.session = session self.index = i self.done = 0 self.started = 0 def run(self): queue = self.session.queues[self.index] exceptions = self.session.exceptions[self.index] cpu_set = sched.cpu_set_t() cpu_set.set(self.cpu) sched.setaffinity(0, cpu_set) while not self.done: # wait for a command from the master method = queue.get() try: method(self.session, self.index) except: exceptions.put(sys.exc_info()) queue.task_done() break queue.task_done() def run_in_other_thread(func): "A decorator that runs a function in another thread (second argument)" def decorated(*args, **kwargs): self = args[0] i = args[1] # Tell thread i to call func() self.queues[i].put(func) self.queues[i].join() if not self.exceptions[i].empty(): exc = self.exceptions[i].get() # Let the main thread know we had an exception self.exceptions[i].put(exc) print "CPU: %d, exception: %s" % (i, exc) raise exc[1] return decorated class SystemWideSession(Session): def __init__(self, cpulist): self.default_pl = PFM_PLM3 | PFM_PLM0 self.targets = cpulist self.ctx_flags = PFM_FL_SYSTEM_WIDE self.pfp_flags = PFMLIB_PFP_SYSTEMWIDE self.threads = [] self.queues = [] self.exceptions = [] n = len(cpulist) for i in xrange(n): t = PerfmonThread(self, i, cpulist[i]) self.threads.append(t) self.queues.append(Queue(0)) 
self.exceptions.append(Queue(0)) t.start() Session.__init__(self, n) def __del__(self): self.cleanup() Session.__del__(self) def cleanup(self): for t in self.threads: t.done = 1 # join only threads with no exceptions if self.exceptions[t.index].empty(): if t.started: self.stop_one(t.index) else: self.wakeup(t.index) t.join() self.threads = [] @run_in_other_thread def load_one(self, i): Session.load_one(self, i) @run_in_other_thread def start_one(self, i): Session.start_one(self, i) self.threads[i].started = 1 @run_in_other_thread def stop_one(self, i): Session.stop_one(self, i) self.threads[i].started = 0 @run_in_other_thread def wakeup(self, i): "Do nothing. Just wakeup the other thread" pass class PerThreadSession(Session): def __init__(self, pid): self.targets = [pid] self.default_pl = PFM_PLM3 self.ctx_flags = 0 self.pfp_flags = 0 Session.__init__(self, 1) def __del__(self): Session.__del__(self) papi-papi-7-2-0-t/src/libperfnec/python/sys.py000077500000000000000000000043731502707512200212630ustar00rootroot00000000000000#!/usr/bin/env python # # Copyright (c) 2008 Google, Inc. # Contributed by Arun Sharma # # Permission is hereby granted, free of charge, to any person obtaining a # copy of this software and associated documentation files (the "Software"), # to deal in the Software without restriction, including without limitation # the rights to use, copy, modify, merge, publish, distribute, sublicense, # and/or sell copies of the Software, and to permit persons to whom the # Software is furnished to do so, subject to the following conditions: # # The above copyright notice and this permission notice shall be included # in all copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL # THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR # OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, # ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR # OTHER DEALINGS IN THE SOFTWARE. # # System wide monitoring example. Copied from syst.c # # Run as: ./sys.py -c cpulist -e eventlist import sys import os from optparse import OptionParser import time from perfmon import * if __name__ == '__main__': parser = OptionParser() parser.add_option("-e", "--events", help="Events to use", action="store", dest="events") parser.add_option("-c", "--cpulist", help="CPUs to monitor", action="store", dest="cpulist") parser.set_defaults(cpu=0) (options, args) = parser.parse_args() cpus = options.cpulist.split(',') cpus = [ int(c) for c in cpus ] try: s = SystemWideSession(cpus) if options.events: events = options.events.split(",") else: raise "You need to specify events to monitor" s.dispatch_events(events) s.load() # Measuring loop for i in range(1, 10): s.start() time.sleep(1) s.stop() # Print the counts for cpu in xrange(len(cpus)): for i in xrange(s.npmds): print "CPU%d.PMD%d\t%lu""" % (cpu, s.pmds[cpu][i].reg_num, s.pmds[cpu][i].reg_value) finally: s.cleanup() papi-papi-7-2-0-t/src/libperfnec/rules.mk000066400000000000000000000031011502707512200202160ustar00rootroot00000000000000# # Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. 
# Contributed by Stephane Eranian # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # # This file is part of libpfm, a performance monitoring support library for # applications on Linux/ia64. # .SUFFIXES: .c .S .o .lo .cpp .PHONY: all clean distclean depend install install_examples .S.o: $(CC) $(CFLAGS) -c $*.S .c.o: $(CC) $(CFLAGS) -c $*.c .cpp.o: $(CXX) $(CFLAGS) -c $*.cpp .c.lo: $(CC) -fPIC -DPIC $(CFLAGS) -c $*.c -o $*.lo .S.lo: $(CC) -fPIC -DPIC $(CFLAGS) -c $*.S -o $*.lo papi-papi-7-2-0-t/src/libpfm4/000077500000000000000000000000001502707512200157645ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libpfm4/COPYING000066400000000000000000000021761502707512200170250ustar00rootroot00000000000000All other files are published under the following license: Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. 
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. papi-papi-7-2-0-t/src/libpfm4/Makefile000066400000000000000000000053311502707512200174260ustar00rootroot00000000000000# # Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. # Contributed by Stephane Eranian # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. 
# # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # # # Look in config.mk for options # TOPDIR := $(shell if [ "$$PWD" != "" ]; then echo $$PWD; else pwd; fi) include config.mk EXAMPLE_DIRS=examples DIRS=lib tests $(EXAMPLE_DIRS) include docs ifeq ($(SYS),Linux) EXAMPLE_DIRS +=perf_examples endif ifneq ($(CONFIG_PFMLIB_NOPYTHON),y) DIRS += python endif TAR=tar --exclude=.git --exclude=.gitignore CURDIR=$(shell basename "$$PWD") PKG=libpfm-4.$(REVISION).$(AGE) TARBALL=$(PKG).tar.gz all: @echo Compiling for \'$(ARCH)\' target @echo Compiling for \'$(SYS)\' system @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d $@ ; done lib: $(MAKE) -C lib clean: @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d $@ ; done distclean: clean @(cd debian; $(RM) -f *.log *.debhelper *.substvars; $(RM) -rf libpfm4-dev libpfm4 python-libpfm4 tmp files) $(RM) -f tags depend: @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d $@ ; done tar: clean ln -s $$PWD ../$(PKG) && cd .. && $(TAR) -zcf $(TARBALL) $(PKG)/. 
&& rm $(PKG) @echo generated ../$(TARBALL) install-lib: @echo installing in $(DESTDIR)$(PREFIX) @$(MAKE) -C lib install install install-all: @echo installing in $(DESTDIR)$(PREFIX) @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d install ; done install-examples install_examples: @echo installing in $(DESTDIR)$(PREFIX) @set -e ; for d in $(EXAMPLE_DIRS) ; do $(MAKE) -C $$d $@ ; done tags: @echo creating tags $(MAKE) -C lib $@ static: make all CONFIG_PFMLIB_SHARED=n .PHONY: all clean distclean depend tar install install-all install-lib install-examples lib static install_examples # DO NOT DELETE papi-papi-7-2-0-t/src/libpfm4/README000066400000000000000000000140211502707512200166420ustar00rootroot00000000000000 ------------------------------------------------------------ libpfm-4.x: a helper library to program the performance monitoring events ------------------------------------------------------------ Copyright (c) 2009 Google, Inc Contributed by Stephane Eranian Copyright (c) 2001-2007 Hewlett-Packard Development Company, L.P. Contributed by Stephane Eranian This package provides a library, called libpfm4 which is used to develop monitoring tools exploiting the performance monitoring events such as those provided by the Performance Monitoring Unit (PMU) of modern processors. This is a complete rewrite of libpfm3 and it is NOT backward compatible with it. Libpfm4 helps convert from an event name, expressed as a string, to the event encoding that is either the raw event as documented by HW vendor or the OS-specific encoding. In the latter case, the library is able to prepare the OS-specific data structures needed by the kernel to setup the event. The current libpfm4 provides support for the perf_events interface which was introduced in Linux v2.6.31. Perfmon support is not present yet. The library does not make any performance monitoring system calls. It is portable and supports other operating system environments beyond Linux, such as Mac OS X, and Windows. 
The library supports many PMUs. The current version can handle: - For AMD X86: AMD64 K7, K8 AMD64 Fam10h (Barcelona, Shanghai, Istanbul) AMD64 Fam11h (Turion) AMD64 Fam12h (Llano) AMD64 Fam14h (Bobcat) AMD64 Fam15h (Bulldozer) (core and uncore) AMD64 Fam16h (Jaguar) AMD64 Fam17h (Zen1) AMD64 Fam17h (Zen2) AMD64 Fam19h (Zen3) (core and L3) AMD64 Fam19h (Zen4) AMD64 Fam1Ah (Zen5) (core and L3) - For Intel X86: Intel P6 (Pentium II, Pentium Pro, Pentium III, Pentium M) Intel Yonah (Core Duo/Core Solo), Intel Core (Merom, Penryn, Dunnington) Intel Atom Intel Nehalem, Westmere Intel Sandy Bridge Intel Ivy Bridge Intel Haswell Intel Broadwell Intel SkyLake Intel CascadeLake Intel IceLake Intel SapphireRapid Intel EmeraldRapid Intel GraniteRapids Intel Alderlake (P-core) Intel Alderlake (E-core) Intel Raptorlake (P-core, E-core) Intel Silvermont Intel Airmont Intel Goldmont Intel Tremont Intel RAPL (energy consumption) Intel Knights Corner Intel Knights Landing (core, uncore) Intel Knights Mill (core, uncore) Intel architectural perfmon v1, v2, v3 - For ARM: ARMV7 Cortex A8 ARMV7 Cortex A9 ARMV7 Cortex A15 ARMV8 Cortex A57, A53, A55, A72, A76 Applied Micro X-Gene Qualcomm Krait Fujitsu A64FX Fujitsu FUJITSU-MONAKA Arm Neoverse V1, V2, V3 Arm Neoverse N1, N2, N3 Huawei HiSilicon Kunpeng 920 - For SPARC Ultra I, II Ultra III, IIIi, III+ Ultra IV+ Niagara I, Niagara II - For IBM Power 4 Power 5 Power 6 Power 7 Power 8 Power 8 Nest Power 9 Power 10 PPC970 Torrent System z (s390x) - For MIPS Mips 74k WHAT'S THERE ------------- - the library source code including support for all processors listed above - a set of generic examples showing how to list and query events. They are in examples. - a set of examples showing how the library can be used with the perf_events interface. They are in perf_examples. 
- a set of library header files used to compile the library and perf_examples
- man pages for all the library entry points
- Python bindings for the library
- a SPEC file to build RPMs from the library
- the Debian-style config file to build a .deb package from the library

INSTALLATION
------------

- edit config.mk to:
	- update some of the configuration variables
	- select your compiler options

- type make

- type make install

- The default installation location is /usr/local. You can specify a
  different install location as follows:

	$ make PREFIX= install

  Depending on your install location, you may need to run the 'ldconfig'
  command or use LD_LIBRARY_PATH when you build and run tools that link
  to the libpfm4 library.

- By default, libpfm library files are installed in /lib. If 'make' builds
  64-bit libraries on your system, and your target architecture expects
  64-bit libraries to be located in a directory named "lib64", then you
  should use the LIBDIR variable when installing, as follows:

	$ make LIBDIR=/lib64 install

- To compile and install the Python bindings, you need to go to the python
  sub-directory and type make. The Python bindings may not be built
  systematically.

- to compile the library for another ABI (e.g. 32-bit x86 on a 64-bit x86
  system), you can pass the ABI flag to the compiler as follows (assuming
  you have the multilib version of gcc):

	$ make OPTIM="-m32 -O2"

PACKAGING
---------

The library comes with the config files necessary to generate RPMs or
Debian packages. The source code produces 3 packages:
	- libpfm : runtime library
	- libpfm-dev: development files (headers, manpages, static library)
	- libpfm-python: Python bindings for the library

To generate the RPMs:
	$ rpmbuild -ba libpfm.spec

To generate the Debian packages:
	$ debuild -i -us -uc -b

You may need to install some extra packages to make Debian package
generation possible.

REQUIREMENTS:
-------------

- to run the programs in the perf_examples subdir, you MUST be using a
  linux kernel with perf_events.
That means v2.6.31 or later. - to compile the Python bindings, you need to have SWIG and the python development packages installed - To compile on Windows, you need the MinGW and MSYS compiler environment (see www.mingw.org). The environment needs to be augmented with the mingw regex user contributed package (mingw-libgnurx-2.5.1.dev.tar.gz). - To compile on Mac OS X, you need to have gmake installed. DOCUMENTATION ------------- - man pages for all entry points. It is recommended you start with: man libpfm - More information can be found on library web site: http://perfmon2.sf.net papi-papi-7-2-0-t/src/libpfm4/config.mk000066400000000000000000000134411502707512200175650ustar00rootroot00000000000000# # Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. # Contributed by Stephane Eranian # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # # This file is part of libpfm, a performance monitoring support library for # applications on Linux. 
# # # This file defines the global compilation settings. # It is included by every Makefile # # SYS := $(shell uname -s) ARCH := $(shell uname -m) ifeq (i686,$(findstring i686,$(ARCH))) override ARCH=i386 endif ifeq (i586,$(findstring i586,$(ARCH))) override ARCH=i386 endif ifeq (i486,$(findstring i486,$(ARCH))) override ARCH=i386 endif ifeq (i386,$(findstring i386,$(ARCH))) override ARCH=i386 endif ifeq (i86pc,$(findstring i86pc,$(ARCH))) override ARCH=i386 endif ifeq (x86,$(findstring x86,$(ARCH))) override ARCH=x86_64 endif ifeq ($(ARCH),x86_64) override ARCH=x86_64 endif ifeq ($(ARCH),amd64) override ARCH=x86_64 endif ifeq (ppc,$(findstring ppc,$(ARCH))) override ARCH=powerpc endif ifeq (sparc64,$(findstring sparc64,$(ARCH))) override ARCH=sparc endif ifeq (armv6,$(findstring armv6,$(ARCH))) override ARCH=arm endif ifeq (armv7,$(findstring armv7,$(ARCH))) override ARCH=arm endif ifeq (aarch32,$(findstring aarch32,$(ARCH))) override ARCH=arm endif ifeq (armv8l,$(findstring armv8l,$(ARCH))) override ARCH=arm endif ifeq (mips64,$(findstring mips64,$(ARCH))) override ARCH=mips endif ifeq (mips,$(findstring mips,$(ARCH))) override ARCH=mips endif ifeq (MINGW,$(findstring MINGW,$(SYS))) override SYS=WINDOWS endif # # CONFIG_PFMLIB_SHARED: y=compile static and shared versions, n=static only # CONFIG_PFMLIB_DEBUG: enable debugging output support # CONFIG_PFMLIB_NOPYTHON: do not generate the python support, incompatible with PFMLIB_SHARED=n # CONFIG_PFMLIB_NOTRACEPOINT: no tracepoint support in perf PMU (eliminate startup overhead) # CONFIG_PFMLIB_SHARED?=y CONFIG_PFMLIB_DEBUG?=y CONFIG_PFMLIB_NOPYTHON?=y CONFIG_PFMLIB_NOTRACEPOINT?=y # # Cell Broadband Engine is reported as PPC but needs special handling.
# ifeq ($(SYS),Linux) MACHINE := $(shell grep -q 'Cell Broadband Engine' /proc/cpuinfo && echo cell) ifeq (cell,$(MACHINE)) override ARCH=cell endif endif # # Library version # VERSION=4 REVISION=13 AGE=0 # # Where should things (lib, headers, man) go in the end. # PREFIX?=/usr/local LIBDIR=$(PREFIX)/lib INCDIR=$(PREFIX)/include MANDIR=$(PREFIX)/share/man DOCDIR=$(PREFIX)/share/doc/libpfm-$(VERSION).$(REVISION).$(AGE) # # System header files # # SYSINCDIR : where to find standard header files (defaults to .) SYSINCDIR=. # # Configuration Parameters for libpfm library # ifeq ($(ARCH),ia64) CONFIG_PFMLIB_ARCH_IA64=y endif ifeq ($(ARCH),x86_64) CONFIG_PFMLIB_ARCH_X86_64=y CONFIG_PFMLIB_ARCH_X86=y endif ifeq ($(ARCH),i386) CONFIG_PFMLIB_ARCH_I386=y CONFIG_PFMLIB_ARCH_X86=y endif ifeq ($(ARCH),mips) CONFIG_PFMLIB_ARCH_MIPS=y endif ifeq ($(ARCH),powerpc) CONFIG_PFMLIB_ARCH_POWERPC=y endif ifeq ($(ARCH),sparc) CONFIG_PFMLIB_ARCH_SPARC=y endif ifeq ($(ARCH),arm) CONFIG_PFMLIB_ARCH_ARM=y endif ifeq ($(ARCH),aarch64) CONFIG_PFMLIB_ARCH_ARM64=y endif ifeq ($(ARCH),arm64) CONFIG_PFMLIB_ARCH_ARM64=y endif ifeq ($(ARCH),s390x) CONFIG_PFMLIB_ARCH_S390X=y endif ifeq ($(ARCH),cell) CONFIG_PFMLIB_CELL=y endif # # you shouldn't have to touch anything beyond this point # # # The entire package can be compiled using # icc, the Intel Itanium compiler (7.x, 8.x, 9.x), # or GNU C #CC=icc CC?=gcc LIBS= INSTALL=install LDCONFIG=ldconfig LN?=ln -sf PFMINCDIR=$(TOPDIR)/include PFMLIBDIR=$(TOPDIR)/lib # # -Wextra: to enable extra compiler sanity checks (e.g., signed vs.
unsigned) # -Wno-unused-parameter: to avoid warnings on unused foo(void *this) parameter # DBG?=-g -Wall -Werror -Wextra -Wno-unused-parameter ifeq ($(SYS),Darwin) # older gcc-4.2 does not like -Wextra and some of our initialization code # Xcode uses a gcc version which is too old for some static initializers CC=clang DBG?=-g -Wall -Werror LDCONFIG=true endif ifeq ($(SYS),FreeBSD) # gcc-4.2 does not like -Wextra and some of our initialization code DBG=-g -Wall -Werror endif CFLAGS+=$(OPTIM) $(DBG) -I$(SYSINCDIR) -I$(PFMINCDIR) MKDEP=makedepend PFMLIB=$(PFMLIBDIR)/libpfm.a ifeq ($(CONFIG_PFMLIB_DEBUG),y) CFLAGS += -DCONFIG_PFMLIB_DEBUG endif CTAGS?=ctags # # Python is for use with perf_events # so it only works on Linux # ifneq ($(SYS),Linux) CONFIG_PFMLIB_NOPYTHON=y endif # # mark that we are compiling on Linux # ifeq ($(SYS),Linux) CFLAGS+= -DCONFIG_PFMLIB_OS_LINUX endif # # compile examples statically if the library is # compiled static # not compatible with python support, so disable for now # ifeq ($(CONFIG_PFMLIB_SHARED),n) LDFLAGS+= -static CONFIG_PFMLIB_NOPYTHON=y endif ifeq ($(SYS),WINDOWS) CFLAGS +=-DPFMLIB_WINDOWS endif ifeq ($(CONFIG_PFMLIB_NOTRACEPOINT),y) CFLAGS += -DCONFIG_PFMLIB_NOTRACEPOINT endif papi-papi-7-2-0-t/src/libpfm4/debian/000077500000000000000000000000001502707512200172065ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libpfm4/debian/README000066400000000000000000000002601502707512200200640ustar00rootroot00000000000000The Debian Package libpfm4 ---------------------------- libpfm4 packaging tested on Ubuntu Lucid (amd64).
-- Arun Sharma Mon, 21 Jun 2010 15:17:22 -0700 papi-papi-7-2-0-t/src/libpfm4/debian/README.source000066400000000000000000000001311502707512200213600ustar00rootroot00000000000000Sources were slightly modified to compile with -Werror -Arun Sharma (aruns@google.com) papi-papi-7-2-0-t/src/libpfm4/debian/changelog000066400000000000000000000150411502707512200210610ustar00rootroot00000000000000libpfm4 (13.0) unstable; urgency=low * update Intel SKL/SKX/CLX event table * add ARM Neoverse V2 core PMU support * move ARM Neoverse N2 to ARMv9 support * add ARM v9 support basic infrastructure * add Arm Neoverse V1 core PMU support * Update Intel SapphireRapid event table * update Intel Icelake event table * Update AMD Zen4 event table * remove useless combination in AMD Zen4 packed_int_ops_retired event * Fix AMD Zen4 cpu_family used in detection code * Add AMD Zen4 core PMU support * Correctly detect all AMD Zen3 processors * fix CPU_CLK_UNHALTED.REF_DISTRIBUTED on Intel Icelake * add missing break in amd64_get_revision() * update Intel Skylake event table * Fix typos in Intel Icelake event table * Update Intel SapphireRapid event table * update perf_event.h to Linux 5.18 * fix amd_get_revision() to identify AMD Zen3 uniquely -- Stephane Eranian Tue, 28 Mar 2023 08:00:01 +0200 libpfm4 (12.1) unstable; urgency=low * fix debian changelog 12.0 entry order * fix debian rules to build again -- Stephane Eranian Tue, 20 Sep 2022 08:00:01 +0200 libpfm4 (12.0) unstable; urgency=low * Add IBM Power10 core PMU support * Add Intel IcelakeX core PMU support * Add Intel SapphireRapid core PMU support * Add Intel SapphireRapid RAPL PMU support * Update Intel Icelake RAPL PMU support * Add support HiSilicon Kunpeng uncore PMUs * Add support HiSilicon Kunpeng core PMU * Remove arm_fujitsu_a64fx_support for ARM(32 bit) * Update Intel Skylake event table * Add Intel PERF_METRICS event support for Icelake * Add support for ARM Neoverse N2 core PMU * Add ARM SPE events for Neoverse N1 core PMU * 
Add cgroup-switches software event * Add Intel Tigerlake and Rocketlake core PMU support * Add AMD64 Fam19h Zen3 L3 PMU support * Add AMD64 Fam17h Zen2 RAPL support * Add AMD64 Fam19h Zen3 core PMU support * Add RAPL for AMD64 Fam19h Zen3 processor * Update ARM N1 event table * Update AMD Fam17h Zen2 event table * s390: Update counter definition for IBM z16 -- Stephane Eranian Fri, 16 Sep 2022 12:00:01 +0200 libpfm4 (11.1) unstable; urgency=low * Fix Makefile revision * Fix Fujitsu A64Fx man page install in ARM64 mode -- Stephane Eranian Wed, 23 Sep 2020 12:00:01 +0200 libpfm4 (11.0) unstable; urgency=low * Cavium TX2 core updates * Cavium TX2 uncore core * Intel CascadeLakeX core PMU support * AMD Zen1 updates * AMD Zen2 core pmu support * S390 updates * Update Skylake event table * speculative event info infrastructure * RAPL updates * Intel Icelake core PMU support * add hw_smpl attribute to the interface * enable priv level filtering on ARMv8 * add ARM Neoverse N1 core PMU support -- Stephane Eranian Wed, 2 Sep 2020 12:00:01 +0200 libpfm4 (10.1) unstable; urgency=low * fix Cavium Thunder X2 build issues * Update Skylake event table -- Stephane Eranian Thu, 13 Jun 2018 23:40:01 +0200 libpfm4 (10.0) unstable; urgency=low * add support for Skylake X uncore PMUs * add support for Cavium Thunder X2 core PMU * add support for Intel Knights Mill core PMU * various fixes and event table updates -- Stephane Eranian Thu, 6 Jun 2018 11:37:01 +0200 libpfm4 (9.0) unstable; urgency=low * add support for Broadwell EP uncore PMUa * add support for Intel Skylake X * add support for AMD Fam17h * add support for IBM Power9 * add support for AMD Fam16h * various fixes and event table updates -- Stephane Eranian Thu, 4 Jan 2018 13:37:01 +0200 libpfm4 (8.0) unstable; urgency=low * add Intel Knights Landing support * add Intel Goldmont support * update Intel event tables * allow . 
as delimiter for event string * add SQ_MISC:SPLIT_LOCK * enable Broadwell EP * various fixes -- Stephane Eranian Sat, 5 Nov 2016 14:38:01 +0200 libpfm4 (7.0) unstable; urgency=low * add Intel Skylake support * add Intel Haswell-EP uncore PMU support * add Broadwell DE support * updated most Intel x86 event tables to match official tables * refreshed perf_event.h header to 4.2 * more bug fixes and minor updates -- Stephane Eranian Thu, 11 Feb 2016 16:56:01 +0200 libpfm4 (6.0) unstable; urgency=low * add Intel Broadwell (desktop) support * add Intel Haswell-EP support (core) * add Applied Micro X-Gene processor support * simplified X86 model detection for Intel processors * Intel SNB, IVB, HSW event table updates * IBM Power8 event table updates * add ARM Cortex A53 support * more bug fixes and minor updates -- Stephane Eranian Tue, 30 Dec 2014 16:56:01 +0200 libpfm4 (5.0) unstable; urgency=low * Intel IVB-EP uncore PMU support * Intel Silvermont support * Perf raw event syntax support * Intel RAPL event support * AMD Fam15h northbridge support * Qualcomm Krait support * IBM Power 8 support * IBM s390 updates * AMD Fam15h fixes * various IVB, SNB, HSW event table updates * more bug fixes -- Stephane Eranian Fri, 21 Feb 2014 18:45:01 +0200 libpfm4 (4.0) unstable; urgency=low * Intel IVB-EP support * Intel IVB updates support * Intel SNB updates support * Intel SNB-EP uncore support * ldlat support (PEBS-LL) * New Intel Atom support * bug fixes -- Stephane Eranian Fri, 08 JUn 2013 18:45:01 +0200 libpfm4 (3.0) unstable; urgency=low * ARM Cortex A15 support * updated Intel Sandy Bridge core PMU events * Intel Sandy Bridge desktop (model 42) uncore PMU support * Intel Ivy Bridge support * full perf_events generic event support * updated perf_examples * enabled Intel Nehalem/Westmere uncore PMU support * AMD LLano processor supoprt (Fam 12h) * AMD Turion rocessor supoprt (Fam 11h) * Intel Atom Cedarview processor support * Win32 compilation support * perf_events excl 
attribute * perf_events generic hw event aliases support * many bug fixes -- Stephane Eranian Mon, 27 Aug 2012 17:45:22 +0200 libpfm4 (2.0) unstable; urgency=low * updated event tables for Intel X86 processors * new AMD Fam15h support * new MIPS 74k support * updated ARM Cortex A8/A9 support * 30% size reduction for Intel/AMD X86 event tables * bug fixes and other improvements -- Stephane Eranian Fri, 7 Oct 2011 15:55:22 +0200 libpfm4 (1.0) unstable; urgency=low * Initial Release. -- Arun Sharma Mon, 21 Jun 2010 15:17:22 -0700 papi-papi-7-2-0-t/src/libpfm4/debian/compat000066400000000000000000000000021502707512200204040ustar00rootroot000000000000007 papi-papi-7-2-0-t/src/libpfm4/debian/control000066400000000000000000000033401502707512200206110ustar00rootroot00000000000000Source: libpfm4 Priority: extra Maintainer: Stephane Eranian Build-Depends: debhelper (>= 7), python (>= 2.4), python-support, python-dev (>= 2.4), swig Standards-Version: 3.8.4 Section: libs Homepage: http://perfmon2.sourceforge.net/ Package: libpfm4-dev Section: libdevel Architecture: any Depends: ${shlibs:Depends}, ${misc:Depends} Description: A library to program the performance monitoring events Libpfm4 helps convert from an event name, expressed as a string, to the event encoding. The encoding can then be used with specific OS interfaces. Libpfm4 also provides OS-specific interfaces to directly setup OS-specific data structures to be passed to the kernel. The current libpfm4, for instance, provides support for the perf_events interface which was introduced in Linux v2.6.31. Package: libpfm4 Section: libs Architecture: any Depends: ${shlibs:Depends}, ${misc:Depends} Description: A library to program the performance monitoring events Libpfm4 helps convert from an event name, expressed as a string, to the event encoding. The encoding can then be used with specific OS interfaces. Libpfm4 also provides OS-specific interfaces to directly setup OS-specific data structures to be passed to the kernel. 
The current libpfm4, for instance, provides support for the perf_events interface which was introduced in Linux v2.6.31. Package: python-libpfm4 Depends: libpfm4, python, ${shlibs:Depends}, ${misc:Depends} Architecture: any Section: python Description: Python bindings for libpfm4 This package allows you to write simple python scripts that monitor various hardware performance monitoring events. It may be more efficient to use this approach instead of parsing the output of other tools. papi-papi-7-2-0-t/src/libpfm4/debian/copyright000066400000000000000000000027031502707512200211430ustar00rootroot00000000000000This work was packaged for Debian by: Arun Sharma on Mon, 21 Jun 2010 15:17:22 -0700 It was downloaded from: git://perfmon2.git.sourceforge.net/gitroot/perfmon2/libpfm4 Upstream Author(s): Stephane Eranian Packaging by: Copyright (C) 2010 Arun Sharma Library and packaging released under the following license: Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. papi-papi-7-2-0-t/src/libpfm4/debian/docs000066400000000000000000000000071502707512200200560ustar00rootroot00000000000000README papi-papi-7-2-0-t/src/libpfm4/debian/libpfm4-dev.dirs000066400000000000000000000000241502707512200221760ustar00rootroot00000000000000usr/lib usr/include papi-papi-7-2-0-t/src/libpfm4/debian/libpfm4-dev.install000066400000000000000000000000351502707512200227050ustar00rootroot00000000000000usr/include/* usr/lib/lib*.a papi-papi-7-2-0-t/src/libpfm4/debian/libpfm4-dev.manpages000066400000000000000000000000161502707512200230310ustar00rootroot00000000000000docs/man3/*.3 papi-papi-7-2-0-t/src/libpfm4/debian/libpfm4.install000066400000000000000000000000221502707512200221250ustar00rootroot00000000000000usr/lib/lib*.so.* papi-papi-7-2-0-t/src/libpfm4/debian/python-libpfm4.install000066400000000000000000000001261502707512200234510ustar00rootroot00000000000000usr/lib/python*/site-packages/perfmon/*.py usr/lib/python*/site-packages/perfmon/*.so papi-papi-7-2-0-t/src/libpfm4/debian/pyversions000066400000000000000000000000051502707512200213450ustar00rootroot000000000000002.4- papi-papi-7-2-0-t/src/libpfm4/debian/rules000077500000000000000000000011441502707512200202660ustar00rootroot00000000000000#!/usr/bin/make -f # -*- makefile -*- # Sample debian/rules that uses debhelper. # This file was originally written by Joey Hess and Craig Small. # As a special exception, when this file is copied by dh-make into a # dh-make output file, you may use that output file without restriction. # This special exception was added by Craig Small in version 0.37 of dh-make. # Uncomment this to turn on verbose mode. 
#export DH_VERBOSE=1 #include /usr/share/dpatch/dpatch.make override_dh_auto_install: build $(MAKE) install DESTDIR=$(CURDIR)/debian/tmp PREFIX=/usr CONFIG_PFMLIB_NOPYTHON=n LDCONFIG=true %: dh $@ papi-papi-7-2-0-t/src/libpfm4/debian/source/000077500000000000000000000000001502707512200205065ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libpfm4/debian/source/format000066400000000000000000000000141502707512200217140ustar00rootroot000000000000003.0 (quilt) papi-papi-7-2-0-t/src/libpfm4/docs/000077500000000000000000000000001502707512200167145ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libpfm4/docs/Makefile000066400000000000000000000137651502707512200203700ustar00rootroot00000000000000# # Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. # Contributed by Stephane Eranian # # Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. # Contributed by John Linford # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
# TOPDIR := $(shell if [ "$$PWD" != "" ]; then echo $$PWD; else pwd; fi)/.. include $(TOPDIR)/config.mk include $(TOPDIR)/rules.mk .PHONY: all clean distclean depend ARCH_MAN= SYS_MAN= ifeq ($(CONFIG_PFMLIB_ARCH_X86),y) ARCH_MAN=libpfm_intel_core.3 \ libpfm_intel_x86_arch.3\ libpfm_amd64.3 \ libpfm_amd64_k7.3 \ libpfm_amd64_k8.3 \ libpfm_amd64_fam10h.3 \ libpfm_amd64_fam15h.3 \ libpfm_amd64_fam16h.3 \ libpfm_amd64_fam17h.3 \ libpfm_amd64_fam17h_zen2.3 \ libpfm_amd64_fam19h_zen3.3 \ libpfm_amd64_fam19h_zen4.3 \ libpfm_amd64_fam19h_zen3_l3.3 \ libpfm_amd64_fam1ah_zen5.3 \ libpfm_amd64_fam1ah_zen5_l3.3 \ libpfm_intel_atom.3 \ libpfm_intel_nhm.3 \ libpfm_intel_nhm_unc.3 \ libpfm_intel_wsm.3 \ libpfm_intel_wsm_unc.3 \ libpfm_intel_snb.3 \ libpfm_intel_snb_unc.3 \ libpfm_intel_ivb.3 \ libpfm_intel_ivb_unc.3 \ libpfm_intel_hsw.3 \ libpfm_intel_bdw.3 \ libpfm_intel_rapl.3 \ libpfm_intel_slm.3 \ libpfm_intel_tmt.3 \ libpfm_intel_skl.3 \ libpfm_intel_icl.3 \ libpfm_intel_icx.3 \ libpfm_intel_spr.3 \ libpfm_intel_emr.3 \ libpfm_intel_gnr.3 \ libpfm_intel_glm.3 \ libpfm_intel_adl_glc.3 \ libpfm_intel_adl_grt.3 \ libpfm_intel_knl.3 \ libpfm_intel_knm.3 \ libpfm_intel_snbep_unc_cbo.3 \ libpfm_intel_snbep_unc_ha.3 \ libpfm_intel_snbep_unc_imc.3 \ libpfm_intel_snbep_unc_pcu.3 \ libpfm_intel_snbep_unc_qpi.3 \ libpfm_intel_snbep_unc_ubo.3 \ libpfm_intel_snbep_unc_r2pcie.3 \ libpfm_intel_snbep_unc_r3qpi.3 \ libpfm_intel_ivbep_unc_cbo.3 \ libpfm_intel_ivbep_unc_ha.3 \ libpfm_intel_ivbep_unc_imc.3 \ libpfm_intel_ivbep_unc_pcu.3 \ libpfm_intel_ivbep_unc_qpi.3 \ libpfm_intel_ivbep_unc_ubo.3 \ libpfm_intel_ivbep_unc_r2pcie.3 \ libpfm_intel_ivbep_unc_r3qpi.3 \ libpfm_intel_ivbep_unc_irp.3 \ libpfm_intel_knc.3 \ libpfm_intel_hswep_unc_cbo.3 \ libpfm_intel_hswep_unc_ha.3 \ libpfm_intel_hswep_unc_imc.3 \ libpfm_intel_hswep_unc_irp.3 \ libpfm_intel_hswep_unc_pcu.3 \ libpfm_intel_hswep_unc_qpi.3 \ libpfm_intel_hswep_unc_r2pcie.3 \ libpfm_intel_hswep_unc_r3qpi.3 \ libpfm_intel_hswep_unc_sbo.3 \ 
libpfm_intel_hswep_unc_ubo.3 \ libpfm_intel_bdx_unc_cbo.3 \ libpfm_intel_bdx_unc_ha.3 \ libpfm_intel_bdx_unc_imc.3 \ libpfm_intel_bdx_unc_irp.3 \ libpfm_intel_bdx_unc_pcu.3 \ libpfm_intel_bdx_unc_qpi.3 \ libpfm_intel_bdx_unc_r2pcie.3 \ libpfm_intel_bdx_unc_r3qpi.3 \ libpfm_intel_bdx_unc_sbo.3 \ libpfm_intel_bdx_unc_ubo.3 \ libpfm_intel_skx_unc_cha.3 \ libpfm_intel_skx_unc_imc.3 \ libpfm_intel_skx_unc_irp.3 \ libpfm_intel_skx_unc_m2m.3 \ libpfm_intel_skx_unc_m3upi.3 \ libpfm_intel_skx_unc_pcu.3 \ libpfm_intel_skx_unc_ubo.3 \ libpfm_intel_skx_unc_upi.3 \ libpfm_intel_icx_unc_cha.3 \ libpfm_intel_icx_unc_imc.3 \ libpfm_intel_icx_unc_m2m.3 \ libpfm_intel_icx_unc_iio.3 \ libpfm_intel_icx_unc_pcu.3 \ libpfm_intel_icx_unc_upi.3 \ libpfm_intel_icx_unc_m3upi.3 \ libpfm_intel_icx_unc_ubox.3 \ libpfm_intel_icx_unc_m2pcie.3 \ libpfm_intel_icx_unc_irp.3 \ libpfm_intel_spr_unc_imc.3 \ libpfm_intel_spr_unc_upi.3 \ libpfm_intel_spr_unc_cha.3 \ libpfm_intel_gnr_unc_imc.3 ifeq ($(CONFIG_PFMLIB_ARCH_I386),y) ARCH_MAN += libpfm_intel_p6.3 libpfm_intel_coreduo.3 endif endif ifeq ($(CONFIG_PFMLIB_ARCH_ARM),y) ARCH_MAN += libpfm_arm_xgene.3 \ libpfm_arm_ac7.3 \ libpfm_arm_ac57.3 \ libpfm_arm_ac53.3 \ libpfm_arm_ac55.3 \ libpfm_arm_ac72.3 \ libpfm_arm_ac76.3 \ libpfm_arm_ac15.3 \ libpfm_arm_ac8.3 \ libpfm_arm_ac9.3 \ libpfm_arm_qcom_krait.3 \ libpfm_arm_neoverse_n1.3 \ libpfm_arm_neoverse_n2.3 \ libpfm_arm_neoverse_n3.3 \ libpfm_arm_neoverse_v1.3 \ libpfm_arm_neoverse_v2.3 \ libpfm_arm_neoverse_v3.3 endif ifeq ($(CONFIG_PFMLIB_ARCH_ARM64),y) ARCH_MAN += libpfm_arm_xgene.3 \ libpfm_arm_ac57.3 \ libpfm_arm_ac53.3 \ libpfm_arm_ac55.3 \ libpfm_arm_ac72.3 \ libpfm_arm_ac76.3 \ libpfm_arm_a64fx.3 \ libpfm_arm_monaka.3 \ libpfm_arm_neoverse_n1.3 \ libpfm_arm_neoverse_n2.3 \ libpfm_arm_neoverse_n3.3 \ libpfm_arm_neoverse_v1.3 \ libpfm_arm_neoverse_v2.3 \ libpfm_arm_neoverse_v3.3 endif ifeq ($(CONFIG_PFMLIB_ARCH_MIPS),y) ARCH_MAN += libpfm_mips_74k.3 endif GEN_MAN= libpfm.3 \ pfm_find_event.3 \ 
pfm_get_event_attr_info.3 \ pfm_get_event_info.3 \ pfm_get_event_encoding.3 \ pfm_get_event_next.3 \ pfm_get_pmu_info.3 \ pfm_get_os_event_encoding.3 \ pfm_get_version.3 \ pfm_initialize.3 \ pfm_terminate.3 \ pfm_strerror.3 ifeq ($(SYS),Linux) SYS_MAN=pfm_get_perf_event_encoding.3 libpfm_perf_event_raw.3 endif MAN=$(GEN_MAN) $(ARCH_MAN) $(SYS_MAN) install: -mkdir -p $(DESTDIR)$(MANDIR)/man3 ( cd man3; $(INSTALL) -m 644 $(MAN) $(DESTDIR)$(MANDIR)/man3 ) papi-papi-7-2-0-t/src/libpfm4/docs/man3/000077500000000000000000000000001502707512200175525ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm.3000066400000000000000000000135271502707512200211170ustar00rootroot00000000000000.TH LIBPFM 3 "May, 2010" "" "Linux Programmer's Manual" .SH NAME libpfm \- a helper library to develop monitoring tools .SH SYNOPSIS .nf .B #include .SH DESCRIPTION This is a helper library used by applications to program specific performance monitoring events. Those events are typically provided by the hardware or the OS kernel. The most common hardware events are provided by the Performance Monitoring Unit (PMU) of modern processors. They can measure elapsed cycles or the number of cache misses. Software events usually count kernel events such as the number of context switches, or pages faults. The library groups events based on which source is providing them. The term PMU is generalized to any event source, not just hardware sources. The library supports hardware performance events from most common processors, each group under a specific PMU name, such as Intel Core, IBM Power 6. Programming events is usually done through a kernel API, such as Oprofile, perfmon, perfctr, or perf_events on Linux. The library provides support for perf_events which is available in the Linux kernel as of v2.6.31. Perf_events supports selected PMU models and several software events. 
At its core, the library provides a simple translation service, whereby a user specifies an event to measure as a string and the library returns the parameters needed to invoke the kernel API. It is important to realize that the library does \fBnot\fR make the system call to program the event. \fBNote:\fR You must first call \fBpfm_initialize()\fR in order to use any of the other functions provided by the library. A first part of the library provides an event listing and query interface. This can be used to discover the events available on a specific hardware platform. The second part of the library provides a set of functions to obtain event encodings from event strings. Event encoding depends primarily on the underlying hardware but also on the kernel API. The library offers a generic API to address the first situation, but it also provides entry points for specific kernel APIs such as perf_events. In that case, it is able to prepare the data structure which must be passed to the kernel to program a specific event. .SH EVENT DETECTION When the library is initialized via \fBpfm_initialize()\fR, it first detects the underlying hardware and software configuration. Based on this information it enables certain PMU support. Multiple event tables may be activated. It is possible to force activation of a specific PMU (group of events) using an environment variable. .SH EVENT STRINGS Events are expressed as strings. Those strings are structured and may contain several components depending on the type of event and the underlying hardware. String parsing is always case insensitive. The string structure is defined as follows: .sp .ce .B [pmu::][event_name][:unit_mask][:modifier|:modifier=val] or .ce .B [pmu::][event_name][.unit_mask][.modifier|.modifier=val] The components are defined as follows: .TP .B pmu Optional name of the PMU (group of events) to which the event belongs. This is useful to disambiguate events in case events from different sources have the same name.
If not specified, the first match is used. .TP .B event_name The name of the event. It must be the complete name; partial matches are not accepted. This component is required. .TP .B unit_mask This designates an optional sub-event. Some events can be refined using sub-events. An event may have multiple unit masks, and it may or may not be possible to combine them. If more than one unit mask needs to be passed, the [:unit_mask] pattern can be repeated. .TP .B modifier A modifier is an optional filter which modifies how the event counts. Modifiers have a type and a value. The value is specified after the equal sign. No space is allowed. In case of boolean modifiers, it is possible to omit the value true (1). The presence of the modifier is interpreted as meaning true. Events may support multiple modifiers, in which case the [:modifier|:modifier=val] pattern can be repeated. There is no ordering constraint between modifiers and unit masks. Modifiers may be specified before unit masks and vice-versa. .SH ENVIRONMENT VARIABLES It is possible to enable certain debug features of the library using environment variables. The following variables are defined: .TP .B LIBPFM_VERBOSE Enable verbose output. Value must be 0 or 1. .TP .B LIBPFM_DEBUG Enable debug output. Value must be 0 or 1. .TP .B LIBPFM_DEBUG_STDOUT Redirect verbose and debug output to the standard output file descriptor (stdout). By default, the output is directed to the standard error file descriptor (stderr). .TP .B LIBPFM_FORCE_PMU Force a specific PMU model to be activated. In this mode, only that one model is activated. The value of the variable must be the PMU name as returned by the \fBpfm_get_pmu_name()\fR function. Note that for some PMU models, it may be possible to specify additional options, such as specific processor models or steppings. Additional parameters appear after a comma. For instance, LIBPFM_FORCE_PMU=amd64,16,2,1.
.TP .B LIBPFM_ENCODE_INACTIVE Set this variable to 1 to enable encoding of events for non-detected, but supported, PMU models. .TP .B LIBPFM_DISABLED_PMUS Provides a list of PMU models to disable. This is a comma-separated list of PMU models. The PMU model is the string in the \fBname\fR field of the \fBpfm_pmu_info_t\fR structure. For instance, LIBPFM_DISABLED_PMUS=core,snb will disable both the Intel Core and Intel SandyBridge core PMU support. .SH AUTHORS .nf Stephane Eranian Robert Richter .fi .SH SEE ALSO libpfm_amd64_k7(3), libpfm_amd64_k8(3), libpfm_amd64_fam10h(3), libpfm_intel_core(3), libpfm_intel_atom(3), libpfm_intel_p6(3), libpfm_intel_nhm(3), libpfm_intel_nhm_unc(3), pfm_get_perf_event_encoding(3), pfm_initialize(3) .sp Some examples are shipped with the library papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_amd64.3000066400000000000000000000012151502707512200221010ustar00rootroot00000000000000.TH LIBPFM 3 "August, 2010" "" "Linux Programmer's Manual" .SH NAME libpfm_amd64 - support for AMD64 processors .SH SYNOPSIS .nf .B #include .sp .SH DESCRIPTION The library supports all AMD64 processors in both 32 and 64-bit modes. The support is broken down into three groups: .TP .B AMD K7 processors (family 6) .TP .B AMD K8 processors (family 15) .TP .B AMD Family 10h processors (family 16) .sp .TP Each group has a distinct man page. See links below.
.SH SEE ALSO libpfm_amd64_k7(3), libpfm_amd64_k8(3), libpfm_amd64_fam10h(3) .SH AUTHORS .nf Stephane Eranian Robert Richter .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_amd64_fam10h.3000066400000000000000000000034141502707512200232400ustar00rootroot00000000000000.TH LIBPFM 3 "August, 2010" "" "Linux Programmer's Manual" .SH NAME libpfm_amd64_fam10h - support for AMD64 Family 10h processors .SH SYNOPSIS .nf .B #include .sp .B PMU name: amd64_fam10h_barcelona, amd64_fam10h_shanghai, amd64_fam10h_istanbul .B PMU desc: AMD64 Fam10h Barcelona, AMD64 Fam10h Shanghai, AMD64 Fam10h Istanbul .sp .SH DESCRIPTION The library supports AMD Family 10h processors in both 32 and 64-bit modes. They correspond to processor family 16. .SH MODIFIERS The following modifiers are supported on AMD64 Family 10h (16) processors: .TP .B u Measure at user level, which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level, which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B h Measure while executing in host mode (when using virtualization). This corresponds to \fBPFM_PLMH\fR. This modifier is available starting with Fam10h. This is a boolean modifier. .TP .B g Measure while executing in guest mode (when using virtualization). This modifier is available starting with Fam10h. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater than or equal to the threshold. This is an integer modifier with values in the range [0:255].
.SH AUTHORS
.nf
Stephane Eranian
Robert Richter
.fi
.PP
.TH LIBPFM 3 "Nov, 2013" "" "Linux Programmer's Manual"
.SH NAME
libpfm_amd64_fam15h - support for AMD64 Family 15h processors
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: amd64_fam15h_interlagos
.B PMU desc: AMD64 Fam15h Interlagos
.B PMU name: amd64_fam15h_nb
.B PMU desc: AMD64 Fam15h Northbridge
.sp
.SH DESCRIPTION
The library supports the AMD Family 15h core PMU in both 32 and 64-bit
modes. The uncore (NorthBridge) PMU is also supported as a separate PMU
model.
.SH MODIFIERS
The following modifiers are supported on the AMD64 Family 15h core PMU:
.TP
.B u
Measure at user level which includes privilege levels 1, 2, 3.
This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at kernel level which includes privilege level 0.
This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B h
Measure while executing in host mode (when using virtualization).
This corresponds to \fBPFM_PLMH\fR.
This modifier is available starting with Fam10h.
This is a boolean modifier.
.TP
.B g
Measure while executing in guest mode (when using virtualization).
This modifier is available starting with Fam10h.
This is a boolean modifier.
.TP
.B i
Invert the meaning of the event. The counter will now count cycles in which
the event is \fBnot\fR occurring.
This is a boolean modifier.
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition.
This is a boolean modifier.
.TP
.B c
Set the counter mask value. The mask acts as a threshold. The counter will
count the number of cycles in which the number of occurrences of the event
is greater or equal to the threshold. This is an integer modifier with
values in the range [0:255].
.PP
The uncore (NorthBridge) PMU \fBdoes not support\fR any modifiers.
.SH AUTHORS
.nf
Stephane Eranian
Robert Richter
.fi
.PP
.TH LIBPFM 3 "July, 2017" "" "Linux Programmer's Manual"
.SH NAME
libpfm_amd64_fam16h - support for AMD64 Family 16h processors
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: amd64_fam16h
.B PMU desc: AMD64 Fam16h Zen
.sp
.SH DESCRIPTION
The library supports the AMD Family 16h core PMU in both 32 and 64-bit
modes.
.SH MODIFIERS
The following modifiers are supported on the AMD64 Family 16h core PMU:
.TP
.B u
Measure at user level which includes privilege levels 1, 2, 3.
This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at kernel level which includes privilege level 0.
This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B h
Measure while executing in host mode (when using virtualization).
This corresponds to \fBPFM_PLMH\fR.
This modifier is available starting with Fam10h.
This is a boolean modifier.
.TP
.B g
Measure while executing in guest mode (when using virtualization).
This modifier is available starting with Fam10h.
This is a boolean modifier.
.TP
.B i
Invert the meaning of the event. The counter will now count cycles in which
the event is \fBnot\fR occurring.
This is a boolean modifier.
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition.
This is a boolean modifier.
.TP
.B c
Set the counter mask value. The mask acts as a threshold. The counter will
count the number of cycles in which the number of occurrences of the event
is greater or equal to the threshold. This is an integer modifier with
values in the range [0:255].
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "July, 2017" "" "Linux Programmer's Manual"
.SH NAME
libpfm_amd64_fam17h - support for AMD64 Family 17h processors
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: amd64_fam17h (deprecated), amd_fam17h_zen1
.B PMU desc: AMD64 Fam17h Zen1
.sp
.SH DESCRIPTION
The library supports the AMD Family 17h Zen1 core PMU in both 32 and 64-bit
modes. The amd64_fam17h PMU model name has been deprecated in favor of
amd_fam17h_zen1. The old name is maintained for backward compatibility
reasons, but should not be used anymore.
.SH MODIFIERS
The following modifiers are supported on the AMD64 Family 17h Zen1 core PMU:
.TP
.B u
Measure at user level which includes privilege levels 1, 2, 3.
This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at kernel level which includes privilege level 0.
This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B h
Measure while executing in host mode (when using virtualization).
This corresponds to \fBPFM_PLMH\fR.
This modifier is available starting with Fam10h.
This is a boolean modifier.
.TP
.B g
Measure while executing in guest mode (when using virtualization).
This modifier is available starting with Fam10h.
This is a boolean modifier.
.TP
.B i
Invert the meaning of the event. The counter will now count cycles in which
the event is \fBnot\fR occurring.
This is a boolean modifier.
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition.
This is a boolean modifier.
.TP
.B c
Set the counter mask value. The mask acts as a threshold. The counter will
count the number of cycles in which the number of occurrences of the event
is greater or equal to the threshold. This is an integer modifier with
values in the range [0:255].
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "December, 2019" "" "Linux Programmer's Manual"
.SH NAME
libpfm_amd64_fam17h_zen2 - support for AMD64 Family 17h model 31h processors
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: amd64_fam17h_zen2
.B PMU desc: AMD64 Fam17h Zen2
.sp
.SH DESCRIPTION
The library supports the AMD Family 17h Zen2 core PMU in both 32 and 64-bit
modes.
.SH MODIFIERS
The following modifiers are supported on the AMD64 Family 17h Zen2 core PMU:
.TP
.B u
Measure at user level which includes privilege levels 1, 2, 3.
This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at kernel level which includes privilege level 0.
This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B h
Measure while executing in host mode (when using virtualization).
This corresponds to \fBPFM_PLMH\fR.
This modifier is available starting with Fam10h.
This is a boolean modifier.
.TP
.B g
Measure while executing in guest mode (when using virtualization).
This modifier is available starting with Fam10h.
This is a boolean modifier.
.TP
.B i
Invert the meaning of the event. The counter will now count cycles in which
the event is \fBnot\fR occurring.
This is a boolean modifier.
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition.
This is a boolean modifier.
.TP
.B c
Set the counter mask value. The mask acts as a threshold. The counter will
count the number of cycles in which the number of occurrences of the event
is greater or equal to the threshold. This is an integer modifier with
values in the range [0:255].
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "February, 2021" "" "Linux Programmer's Manual"
.SH NAME
libpfm_amd64_fam19h_zen3 - support for AMD64 Family 19h processors
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: amd64_fam19h_zen3
.B PMU desc: AMD64 Fam19h Zen3
.sp
.SH DESCRIPTION
The library supports the AMD Family 19h Zen3 core PMU in both 32 and 64-bit
modes.
.SH MODIFIERS
The following modifiers are supported on the AMD64 Family 19h Zen3 core PMU:
.TP
.B u
Measure at user level which includes privilege levels 1, 2, 3.
This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at kernel level which includes privilege level 0.
This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B h
Measure while executing in host mode (when using virtualization).
This corresponds to \fBPFM_PLMH\fR.
This modifier is available starting with Fam10h.
This is a boolean modifier.
.TP
.B g
Measure while executing in guest mode (when using virtualization).
This modifier is available starting with Fam10h.
This is a boolean modifier.
.TP
.B i
Invert the meaning of the event. The counter will now count cycles in which
the event is \fBnot\fR occurring.
This is a boolean modifier.
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition.
This is a boolean modifier.
.TP
.B c
Set the counter mask value. The mask acts as a threshold. The counter will
count the number of cycles in which the number of occurrences of the event
is greater or equal to the threshold. This is an integer modifier with
values in the range [0:255].
.SH AUTHORS
.nf
Swarup Sahoo
.fi
.PP
.TH LIBPFM 3 "March, 2021" "" "Linux Programmer's Manual"
.SH NAME
libpfm_amd64_fam19h_zen3_l3 - support for AMD64 Family 19h L3 PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: amd64_fam19h_zen3_l3
.B PMU desc: AMD64 Fam19h Zen3 L3
.sp
.SH DESCRIPTION
The library supports the AMD Family 19h Zen3 L3 PMU in both 32 and 64-bit
modes. At this point, no modifiers are supported.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "December, 2022" "" "Linux Programmer's Manual"
.SH NAME
libpfm_amd64_fam19h_zen4 - support for AMD64 Family 19h processors
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: amd64_fam19h_zen4
.B PMU desc: AMD64 Fam19h Zen4
.sp
.SH DESCRIPTION
The library supports the AMD Family 19h Zen4 core PMU in both 32 and 64-bit
modes.
.SH MODIFIERS
The following modifiers are supported on the AMD64 Family 19h Zen4 core PMU:
.TP
.B u
Measure at user level which includes privilege levels 1, 2, 3.
This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at kernel level which includes privilege level 0.
This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B h
Measure while executing in host mode (when using virtualization).
This corresponds to \fBPFM_PLMH\fR.
This modifier is available starting with Fam10h.
This is a boolean modifier.
.TP
.B g
Measure while executing in guest mode (when using virtualization).
This modifier is available starting with Fam10h.
This is a boolean modifier.
.TP
.B i
Invert the meaning of the event. The counter will now count cycles in which
the event is \fBnot\fR occurring.
This is a boolean modifier.
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition.
This is a boolean modifier.
.TP
.B c
Set the counter mask value. The mask acts as a threshold. The counter will
count the number of cycles in which the number of occurrences of the event
is greater or equal to the threshold. This is an integer modifier with
values in the range [0:255].
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "May, 2024" "" "Linux Programmer's Manual"
.SH NAME
libpfm_amd64_fam1ah_zen5 - support for AMD64 Family 1Ah processors
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: amd64_fam1ah_zen5
.B PMU desc: AMD64 Fam1Ah Zen5
.sp
.SH DESCRIPTION
The library supports the AMD Family 1Ah Zen5 core PMU in both 32 and 64-bit
modes.
.SH MODIFIERS
The following modifiers are supported on the AMD64 Family 1Ah Zen5 core PMU:
.TP
.B u
Measure at user level which includes privilege levels 1, 2, 3.
This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at kernel level which includes privilege level 0.
This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B h
Measure while executing in host mode (when using virtualization).
This corresponds to \fBPFM_PLMH\fR.
This modifier is available starting with Fam10h.
This is a boolean modifier.
.TP
.B g
Measure while executing in guest mode (when using virtualization).
This modifier is available starting with Fam10h.
This is a boolean modifier.
.TP
.B i
Invert the meaning of the event. The counter will now count cycles in which
the event is \fBnot\fR occurring.
This is a boolean modifier.
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition.
This is a boolean modifier.
.TP
.B c
Set the counter mask value. The mask acts as a threshold.
The counter will count the number of cycles in which the number of
occurrences of the event is greater or equal to the threshold. This is an
integer modifier with values in the range [0:255].
.SH AUTHORS
.nf
Swarup Sahoo
.fi
.PP
.TH LIBPFM 3 "May, 2024" "" "Linux Programmer's Manual"
.SH NAME
libpfm_amd64_fam1ah_zen5_l3 - support for AMD64 Family 1Ah L3 PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: amd64_fam1ah_zen5_l3
.B PMU desc: AMD64 Fam1Ah Zen5 L3
.sp
.SH DESCRIPTION
The library supports the AMD Family 1Ah Zen5 L3 PMU in both 32 and 64-bit
modes. At this point, no modifiers are supported.
.SH AUTHORS
.nf
Swarup Sahoo
.fi
.PP
.TH LIBPFM 3 "August, 2010" "" "Linux Programmer's Manual"
.SH NAME
libpfm_amd64_k7 - support for AMD64 K7 processors
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: amd64_k7
.B PMU desc: AMD64 K7
.sp
.SH DESCRIPTION
The library supports AMD K7 processors in both 32 and 64-bit modes.
They correspond to processor family 6.
.SH MODIFIERS
The following modifiers are supported on AMD64 K7 processors:
.TP
.B u
Measure at user level which includes privilege levels 1, 2, 3.
This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at kernel level which includes privilege level 0.
This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B i
Invert the meaning of the event. The counter will now count cycles in which
the event is \fBnot\fR occurring.
This is a boolean modifier.
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition.
This is a boolean modifier.
.TP
.B c
Set the counter mask value. The mask acts as a threshold.
The counter will count the number of cycles in which the number of
occurrences of the event is greater or equal to the threshold. This is an
integer modifier with values in the range [0:255].
.SH AUTHORS
.nf
Stephane Eranian
Robert Richter
.fi
.PP
.TH LIBPFM 3 "April, 2009" "" "Linux Programmer's Manual"
.SH NAME
libpfm_amd64_k8 - support for AMD64 K8 processors
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: amd64_k8_revb, amd64_k8_revc, amd64_k8_revd, amd64_k8_reve, amd64_k8_revf, amd64_k8_revg
.B PMU desc: AMD64 K8 RevB, AMD64 K8 RevC, AMD64 K8 RevD, AMD64 K8 RevE, AMD64 K8 RevF, AMD64 K8 RevG
.sp
.SH DESCRIPTION
The library supports AMD K8 processors in both 32 and 64-bit modes.
They correspond to processor family 15.
.SH MODIFIERS
The following modifiers are supported on AMD64 K8 processors:
.TP
.B u
Measure at user level which includes privilege levels 1, 2, 3.
This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at kernel level which includes privilege level 0.
This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B i
Invert the meaning of the event. The counter will now count cycles in which
the event is \fBnot\fR occurring.
This is a boolean modifier.
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition.
This is a boolean modifier.
.TP
.B c
Set the counter mask value. The mask acts as a threshold. The counter will
count the number of cycles in which the number of occurrences of the event
is greater or equal to the threshold. This is an integer modifier with
values in the range [0:255].
.SH AUTHORS
.nf
Stephane Eranian
Robert Richter
.fi
.PP
.TH LIBPFM 3 "May, 2020" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_a64fx - support for Fujitsu A64FX PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_a64fx
.B PMU desc: Fujitsu A64FX
.sp
.SH DESCRIPTION
The library supports the Fujitsu A64FX core PMU. This PMU supports 8
counters and privilege level filtering. It can operate in 64-bit mode only.
.SH MODIFIERS
The following modifiers are supported on the Fujitsu A64FX:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "August, 2012" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_ac15 - support for ARM Cortex A15 PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_ac15
.B PMU desc: ARM Cortex A15
.sp
.SH DESCRIPTION
The library supports the ARM Cortex A15 core PMU. This PMU supports 6
counters and privilege level filtering.
.SH MODIFIERS
The following modifiers are supported on the ARM Cortex A15:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "May, 2014" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_ac53 - support for ARM Cortex A53 PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_ac53
.B PMU desc: ARM Cortex A53
.sp
.SH DESCRIPTION
The library supports the ARM Cortex A53 core PMU. This PMU supports 6
counters and privilege level filtering. It can operate in both 32 and 64-bit
modes.
.SH MODIFIERS
The following modifiers are supported on the ARM Cortex A53:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "September, 2024" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_ac55 - support for ARM Cortex A55 PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_ac55
.B PMU desc: ARM Cortex A55
.sp
.SH DESCRIPTION
The library supports the ARM Cortex A55 core PMU. This PMU supports 6
counters and privilege level filtering. It can operate in both 32 and 64-bit
modes.
.SH MODIFIERS
The following modifiers are supported on the ARM Cortex A55:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "May, 2014" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_ac57 - support for ARM Cortex A57 PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_ac57
.B PMU desc: ARM Cortex A57
.sp
.SH DESCRIPTION
The library supports the ARM Cortex A57 core PMU. This PMU supports 6
counters and privilege level filtering. It can operate in both 32 and 64-bit
modes.
.SH MODIFIERS
The following modifiers are supported on the ARM Cortex A57:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "August, 2012" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_ac7 - support for ARM Cortex A7 PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_ac7
.B PMU desc: ARM Cortex A7
.sp
.SH DESCRIPTION
The library supports the ARM Cortex A7 core PMU. This PMU supports 4
counters and privilege level filtering.
.SH MODIFIERS
The following modifiers are supported on the ARM Cortex A7:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "September, 2024" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_ac72 - support for ARM Cortex A72 PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_ac72
.B PMU desc: ARM Cortex A72
.sp
.SH DESCRIPTION
The library supports the ARM Cortex A72 core PMU. This PMU supports 6
counters and privilege level filtering. It can operate in both 32 and 64-bit
modes.
.SH MODIFIERS
The following modifiers are supported on the ARM Cortex A72:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "September, 2024" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_ac76 - support for ARM Cortex A76 PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_ac76
.B PMU desc: ARM Cortex A76
.sp
.SH DESCRIPTION
The library supports the ARM Cortex A76 core PMU. This PMU supports 6
counters and privilege level filtering. It can operate in both 32 and 64-bit
modes.
.SH MODIFIERS
The following modifiers are supported on the ARM Cortex A76:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "August, 2012" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_ac8 - support for ARM Cortex A8 PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_ac8
.B PMU desc: ARM Cortex A8
.sp
.SH DESCRIPTION
The library supports the ARM Cortex A8 core PMU. This PMU supports 2
counters and has no privilege level filtering. No event modifiers are
available.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "August, 2012" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_ac9 - support for ARM Cortex A9 PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_ac9
.B PMU desc: ARM Cortex A9
.sp
.SH DESCRIPTION
The library supports the ARM Cortex A9 core PMU. This PMU supports 2
counters and has no privilege level filtering. No event modifiers are
available.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "October, 2024" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_monaka - support for Fujitsu FUJITSU-MONAKA PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_monaka
.B PMU desc: Fujitsu FUJITSU-MONAKA
.sp
.SH DESCRIPTION
The library supports the Fujitsu FUJITSU-MONAKA core PMU. This PMU supports
8 counters and privilege level filtering. It can operate in 64-bit mode
only.
.SH MODIFIERS
The following modifiers are supported on the Fujitsu FUJITSU-MONAKA:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level.
This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "July, 2020" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_neoverse_n1 - support for ARM Neoverse N1 core PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_n1
.B PMU desc: ARM Neoverse N1
.sp
.SH DESCRIPTION
The library supports the ARM Neoverse N1 core PMU. This PMU supports 6
counters and privilege level filtering. It can operate in both 32 and 64-bit
modes.
.SH MODIFIERS
The following modifiers are supported on the ARM Neoverse N1:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "August, 2021" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_neoverse_n2 - support for ARM Neoverse N2 core PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_n2
.B PMU desc: ARM Neoverse N2
.sp
.SH DESCRIPTION
The library supports the ARM Neoverse N2 core PMU. This PMU supports 6
counters and privilege level filtering. It can operate in both 32 and 64-bit
modes.
.SH MODIFIERS
The following modifiers are supported on the ARM Neoverse N2:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "October, 2024" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_neoverse_n3 - support for ARM Neoverse N3 core PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_n3
.B PMU desc: ARM Neoverse N3
.sp
.SH DESCRIPTION
The library supports the ARM Neoverse N3 core PMU. This PMU supports 6 or 20
64-bit counters and privilege level filtering. It can operate in both 32 and
64-bit modes.
.SH MODIFIERS
The following modifiers are supported on the ARM Neoverse N3:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "August, 2022" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_neoverse_v1 - support for Arm Neoverse V1 core PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_v1
.B PMU desc: Arm Neoverse V1
.sp
.SH DESCRIPTION
The library supports the Arm Neoverse V1 core PMU. This PMU supports 6
counters and privilege level filtering. It can operate in both 32 and 64-bit
modes.
.SH MODIFIERS
The following modifiers are supported on the Arm Neoverse V1:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS
.nf
John Linford
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "August, 2022" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_neoverse_v2 - support for Arm Neoverse V2 core PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_v2
.B PMU desc: Arm Neoverse V2
.sp
.SH DESCRIPTION
The library supports the Arm Neoverse V2 core PMU. This PMU supports 6
counters and privilege level filtering. It can operate in both 32 and 64-bit
modes.
.SH MODIFIERS
The following modifiers are supported on the Arm Neoverse V2:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS
.nf
John Linford
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "September, 2024" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_neoverse_v3 - support for Arm Neoverse V3 core PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_v3
.B PMU desc: Arm Neoverse V3
.sp
.SH DESCRIPTION
The library supports the Arm Neoverse V3 core PMU. This PMU supports 6
counters and privilege level filtering. It can operate in both 32 and 64-bit
modes.
.SH MODIFIERS
The following modifiers are supported on the Arm Neoverse V3:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "January, 2014" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_qcom_krait - support for Qualcomm Krait PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: qcom_krait
.B PMU desc: Qualcomm Krait
.sp
.SH DESCRIPTION
The library supports the Qualcomm Krait core PMU. This PMU supports 5
counters and privilege level filtering.
.SH MODIFIERS
The following modifiers are supported on this PMU:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
.TH LIBPFM 3 "May, 2014" "" "Linux Programmer's Manual"
.SH NAME
libpfm_arm_xgene - support for Applied Micro X-Gene PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: arm_xgene
.B PMU desc: Applied Micro X-Gene
.sp
.SH DESCRIPTION
The library supports the Applied Micro X-Gene PMU. This PMU supports 6
counters and privilege level filtering. It can operate in both 32 and 64-bit
modes.
.SH MODIFIERS
The following modifiers are supported on the Applied Micro X-Gene:
.TP
.B u
Measure at the user level. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at the kernel level. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B hv
Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR.
This is a boolean modifier.
.SH AUTHORS .nf Stephane Eranian .if .nf William Cohen .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_adl_glc.3 .TH LIBPFM 3 "February, 2024" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_adl_glc - support for Intel Alderlake Goldencove (P-Core) core PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: adl_glc .B PMU desc: Intel Alderlake Goldencove (P-core) .sp .SH DESCRIPTION The library supports the Intel Alderlake Goldencove (P-Core) core PMU. It should be noted that this PMU model only covers each core's PMU and not the socket level PMU. Because the processor uses a hybrid architecture with a P-Core and E-Core with a different PMU model, it may be necessary to force a PMU instance name to get the desired encoding. For instance, to encode for the P-Core use adl_glc::BR_INST_RETIRED, and for the E-Core use adl_grt::BR_INST_RETIRED. On Alderlake Goldencove (P-Core), the number of generic counters depends on the Hyperthreading (HT) mode. The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters in \fBnum_cntrs\fR. .SH MODIFIERS The following modifiers are supported on Intel Alderlake Goldencove (P-Core) processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (c) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold.
The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B intx Monitor the event only when executing inside a transactional memory region (in tx). Event does not count otherwise. This is a boolean modifier. Default value is 0. .TP .B intxcp Do not count occurrences of the event when they are inside an aborted transactional memory region. This is a boolean modifier. Default value is 0. .TP .B ldlat Pass a latency threshold in core cycles to the MEM_TRANS_RETIRED:LOAD_LATENCY event. This is an integer attribute that must be in the range [1:65535]. It is required for this event. The library provides a set of presets for specific latencies, such as 128. Note that the event \fBmust\fR be used with precise sampling (PEBS). .SH FRONTEND_RETIRED event For Alderlake Goldencove (P-Core), the library uses the hardcoded bubble width and bubble length provided by Intel preset events. It is not possible to tweak either the latency or the width via the library. .SH OFFCORE_RESPONSE events Intel Alderlake Goldencove (P-Core) supports two encodings for offcore_response events (0x2a, 0x2b). In the library, these are called OCR0 and OCR1. The two encodings are equivalent. On Linux, the kernel can schedule any OCR encoding into any of the two OCR counters. The offcore_response events are exposed as normal events by the library. The extra settings are exposed as regular umasks. The library takes care of encoding the events according to the underlying kernel interface. On Intel Alderlake Goldencove (P-Core), the event is treated as a regular event with a flat set of umasks to choose from. It is no longer possible to combine the various request, supplier, and snoop bits. The library offers the list of validated combinations as per Intel's official event list.
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_adl_grt.3 .TH LIBPFM 3 "February, 2024" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_adl_grt - support for Intel Alderlake Gracemont (E-core) core PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: adl_grt .B PMU desc: Intel Alderlake Gracemont (E-core) .sp .SH DESCRIPTION The library supports the Intel Alderlake Gracemont (E-core) core PMU. It should be noted that this PMU model only covers each core's PMU and not the socket level PMU. Because the processor uses a hybrid architecture with a P-Core and E-Core with a different PMU model, it may be necessary to force a PMU instance name to get the desired encoding. For instance, to encode for the P-Core use adl_glc::BR_INST_RETIRED, and for the E-Core use adl_grt::BR_INST_RETIRED. On Alderlake Gracemont (E-Core), there are 6 generic counters and 3 fixed counters. The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters in \fBnum_cntrs\fR. .SH MODIFIERS The following modifiers are supported on Intel Alderlake Gracemont (E-Core) processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (c) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold.
The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B ldlat Pass a latency threshold in core cycles to the MEM_TRANS_RETIRED:LOAD_LATENCY event. This is an integer attribute that must be in the range [1:65535]. It is required for this event. The library provides a set of presets for specific latencies, such as 128. Note that the event \fBmust\fR be used with precise sampling (PEBS). .SH OFFCORE_RESPONSE events Intel Alderlake Gracemont (E-Core) supports two encodings for offcore_response events (0x1b7, 0x2b7). In the library, these are called OCR0 and OCR1. The two encodings are equivalent. On Linux, the kernel can schedule any OCR encoding into any of the two OCR counters. The offcore_response events are exposed as normal events by the library. The extra settings are exposed as regular umasks. The library takes care of encoding the events according to the underlying kernel interface. On Intel Alderlake Gracemont (E-Core), the event is treated as a regular event with a flat set of umasks to choose from. It is no longer possible to combine the various request, supplier, and snoop bits. The library offers the list of validated combinations as per Intel's official event list. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_atom.3 .TH LIBPFM 3 "September, 2009" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_atom - support for Intel Atom processors .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: atom .B PMU desc: Intel Atom .sp .SH DESCRIPTION The library supports all Intel Atom-based processors, which includes family 6 model 28. .SH MODIFIERS The following modifiers are supported on Intel Atom processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3.
This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B t Measure on both threads at the same time assuming hyper-threading is enabled. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_bdw.3 .TH LIBPFM 3 "October, 2014" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_bdw - support for Intel Broadwell core PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: bdw .B PMU desc: Intel Broadwell .sp .SH DESCRIPTION The library supports the Intel Broadwell core PMU. It should be noted that this PMU model only covers each core's PMU and not the socket level PMU. On Broadwell, the number of generic counters depends on the Hyperthreading (HT) mode. When HT is on, then only 4 generic counters are available. When HT is off, then 8 generic counters are available. The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters in \fBnum_cntrs\fR. .SH MODIFIERS The following modifiers are supported on Intel Broadwell processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0.
This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (c) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B t Measure on both threads at the same time assuming hyper-threading is enabled. This is a boolean modifier. .TP .B ldlat Pass a latency threshold to the MEM_TRANS_RETIRED:LOAD_LATENCY event. This is an integer attribute that must be in the range [1:65535]. It is required for this event. Note that the event must be used with precise sampling (PEBS). .TP .B intx Monitor the event only when executing inside a transactional memory region (in tx). Event does not count otherwise. This is a boolean modifier. Default value is 0. .TP .B intxcp Do not count occurrences of the event when they are inside an aborted transactional memory region. This is a boolean modifier. Default value is 0. .SH OFFCORE_RESPONSE events Intel Broadwell provides two offcore_response events. They are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1. Those events need special treatment in the performance monitoring infrastructure because each event uses an extra register to store some settings. Thus, in case multiple offcore_response events are monitored simultaneously, the kernel needs to manage the sharing of that extra register. The offcore_response events are exposed as normal events by the library. The extra settings are exposed as regular umasks.
The library takes care of encoding the events according to the underlying kernel interface. On Intel Broadwell, the umasks are divided into three categories: request, supplier and snoop. The user must provide at least one umask for each category. The categories are shown in the umask descriptions. There is also the special response umask called \fBANY_RESPONSE\fR. When this umask is used then it overrides any supplier and snoop umasks. In other words, users can specify either \fBANY_RESPONSE\fR \fBOR\fR any combinations of supplier + snoops. In case no supplier or snoop is specified, the library defaults to using \fBANY_RESPONSE\fR. For instance, the following are valid event selections: .TP .B OFFCORE_RESPONSE_0:DMND_DATA_RD:ANY_RESPONSE .TP .B OFFCORE_RESPONSE_0:ANY_REQUEST .TP .B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY .P But the following are illegal: .TP .B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:ANY_RESPONSE .TP .B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY:ANY_RESPONSE .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_bdx_unc_cbo.3 .TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_bdx_unc_cbo - support for Intel Broadwell Server C-Box uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: bdx_unc_cbo[0-21] .B PMU desc: Intel Broadwell Server C-Box uncore PMU .sp .SH DESCRIPTION The library supports the Intel Broadwell Server C-Box (coherency engine) uncore PMU. This PMU model exists on various Broadwell server models (79, 86). There is one C-Box PMU per physical core. Therefore there are up to twenty-two identical C-Box PMU instances numbered from 0 to 21. On dual-socket systems, the number refers to the C-Box PMU on the socket where the program runs. For instance, if running on CPU18, then bdx_unc_cbo0 refers to the C-Box for physical core 0 on socket 1.
Conversely, if running on CPU0, then the same bdx_unc_cbo0 refers to the C-Box for physical core 0 but on socket 0. Each C-Box PMU implements 4 generic counters and two filter registers used only with certain events and umasks. .SH MODIFIERS The following modifiers are supported on Intel Broadwell C-Box uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of C-Box cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B nf Node filter. Certain events, such as UNC_C_LLC_LOOKUP, UNC_C_LLC_VICTIMS, provide a \fBNID\fR umask. Sometimes the \fBNID\fR is combined with other filtering capabilities, such as opcodes. The node filter is an 8-bit max bitmask. A node corresponds to a processor socket. The legal values therefore depend on the underlying hardware configuration. For dual-socket systems, the bitmask has two valid bits [0:1]. .TP .B cf Core Filter. This is a 5-bit filter which is used to filter based on physical core origin of the C-Box request. Possible values are 0-31. If the filter is not specified, then no filtering takes place. Bits 0-3 indicate the physical core id and bit 4 filters on non thread-related data. .TP .B tf Thread Filter. This is a 1-bit filter which is used to filter C-Box requests based on logical processor (hyper-thread) identification. Possible values are 0-1. If the filter is not specified, then no filtering takes place. .TP .B nc Non-Coherent. This is a 1-bit filter which is used to filter C-Box requests only for the TOR_INSERTS and TOR_OCCUPANCY umasks using the OPCODE matcher.
If the filter is not specified, then no filtering takes place. .TP .B isoc Isochronous. This is a 1-bit filter which is used to filter C-Box requests only for the TOR_INSERTS and TOR_OCCUPANCY umasks using the OPCODE matcher. If the filter is not specified, then no filtering takes place. .SH Opcode filtering Certain events, such as UNC_C_TOR_INSERTS, support opcode matching on the C-Box transaction type. To use this feature, first an opcode matching umask must be selected, e.g., MISS_OPCODE. Second, the opcode to match on must be selected via a second umask among the OPC_* umasks. For instance, UNC_C_TOR_INSERTS:OPCODE:OPC_RFO counts the number of TOR insertions for RFO transactions. Opcode matching may be combined with node filtering with certain umasks. In general, the filtering support is encoded into the umask name, e.g., NID_OPCODE supports both node and opcode filtering. For instance, UNC_C_TOR_INSERTS:NID_OPCODE:OPC_RFO:nf=1. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_bdx_unc_ha.3 .TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_bdx_unc_ha - support for Intel Broadwell Server Home Agent (HA) uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: bdx_unc_ha0, bdx_unc_ha1 .B PMU desc: Intel Broadwell Server HA uncore PMU .sp .SH DESCRIPTION The library supports the Intel Broadwell Server Home Agent (HA) uncore PMU. This PMU model only exists on various Broadwell models (79, 86). .SH MODIFIERS The following modifiers are supported on Intel Broadwell server HA uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value.
When set to a non-zero value, the counter counts the number of HA cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_bdx_unc_imc.3 .TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_bdx_unc_imc - support for Intel Broadwell Server Integrated Memory Controller (IMC) uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: bdx_unc_imc[0-7] .B PMU desc: Intel Broadwell Server IMC uncore PMU .sp .SH DESCRIPTION The library supports the Intel Broadwell Server Integrated Memory Controller (IMC) uncore PMU. This PMU model only exists on various Broadwell server models (79, 86). .SH MODIFIERS The following modifiers are supported on Intel Broadwell server IMC uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of IMC cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the meaning of the threshold or edge filter.
If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_bdx_unc_irp.3 .TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_bdx_unc_irp - support for Intel Broadwell Server IRP uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: bdx_unc_irp .B PMU desc: Intel Broadwell Server IRP uncore PMU .sp .SH DESCRIPTION The library supports the Intel Broadwell Server IRP (IIO coherency) uncore PMU. This PMU model only exists on various Broadwell server models (79, 86). .SH MODIFIERS The following modifiers are supported on Intel Broadwell server IRP uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_bdx_unc_pcu.3 .TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_bdx_unc_pcu - support for Intel Broadwell Server Power Controller Unit (PCU) uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: bdx_unc_pcu .B PMU desc: Intel Broadwell Server PCU uncore PMU .sp .SH DESCRIPTION The library supports the Intel Broadwell Server Power Controller Unit uncore PMU. This PMU model only exists on various Broadwell server models (79, 86). .SH MODIFIERS The following modifiers are supported on Intel Broadwell server PCU uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of PCU cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .TP .B i Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set. This is a boolean modifier. .TP .B ff Enable frequency band filtering. This modifier applies only to the UNC_P_FREQ_BANDx_CYCLES events, where x is [0-3]. The modifier expects an integer in the range [0-255]. The value is interpreted as a frequency value to be multiplied by 100MHz.
Thus if the value is 32, then all cycles where the processor is running at 3.2GHz and more are counted. .SH Frequency band filtering There are 4 events which support frequency band filtering, namely, UNC_P_FREQ_BAND0_CYCLES, UNC_P_FREQ_BAND1_CYCLES, UNC_P_FREQ_BAND2_CYCLES, UNC_P_FREQ_BAND3_CYCLES. The frequency filter (available via the ff modifier) is stored into a PMU shared register which holds all 4 possible frequency bands, one per event. However, the library generates the encoding for each event individually because it processes events one at a time. The caller or the underlying kernel interface may have to merge the band filter settings to program the filter register properly. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_bdx_unc_qpi.3 .TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_bdx_unc_qpi - support for Intel Broadwell Server QPI uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: bdx_unc_qpi0, bdx_unc_qpi1 .B PMU desc: Intel Broadwell Server QPI uncore PMU .sp .SH DESCRIPTION The library supports the Intel Broadwell Server QPI uncore PMU. This PMU model only exists on various Broadwell server models (79, 86). .SH MODIFIERS The following modifiers are supported on Broadwell server QPI uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of QPI cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the meaning of the threshold or edge filter.
If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_bdx_unc_r2pcie.3 .TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_bdx_unc_r2pcie - support for Intel Broadwell Server R2 PCIe uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: bdx_unc_r2pcie .B PMU desc: Intel Broadwell Server R2 PCIe uncore PMU .sp .SH DESCRIPTION The library supports the Intel Broadwell server R2 PCIe uncore PMU. This PMU model only exists on Broadwell server models (79, 86). .SH MODIFIERS The following modifiers are supported on Intel Broadwell server R2PCIe uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of R2PCIe cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .TP .B i Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_bdx_unc_r3qpi.3 .TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_bdx_unc_r3qpi - support for Intel Broadwell Server R3QPI uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: bdx_unc_r3qpi[0-2] .B PMU desc: Intel Broadwell server R3QPI uncore PMU .sp .SH DESCRIPTION The library supports the Intel Broadwell server R3QPI uncore PMU. This PMU model only exists on various Broadwell server models (79, 86). .SH MODIFIERS The following modifiers are supported on Intel Broadwell server R3QPI uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of R3QPI cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .TP .B i Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_bdx_unc_sbo.3 .TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_bdx_unc_sbo - support for Intel Broadwell Server S-Box uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: bdx_unc_sbo .B PMU desc: Intel Broadwell Server S-Box uncore PMU .sp .SH DESCRIPTION The library supports the Intel Broadwell server Ring Transfer unit (S-Box) uncore PMU. This PMU model only exists on various Broadwell server models (79, 86). .SH MODIFIERS The following modifiers are supported on Intel Broadwell server S-Box uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of S-Box cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .TP .B i Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .if .PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_bdx_unc_ubo.3 .TH LIBPFM 3 "June, 2017" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_bdx_unc_ubo - support for Intel Broadwell Server U-Box uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: bdx_unc_ubo .B PMU desc: Intel Broadwell Server U-Box uncore PMU .sp .SH DESCRIPTION The library supports the Intel Broadwell server system configuration unit (U-Box) uncore PMU. This PMU model only exists on various Broadwell server models (79, 86). .SH MODIFIERS The following modifiers are supported on Intel Broadwell server U-Box uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of U-Box cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .TP .B i Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_core.3000066400000000000000000000026111502707512200233120ustar00rootroot00000000000000.TH LIBPFM 3 "September, 2009" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_core - support for Intel Core-based processors .SH SYNOPSIS .nf .B #include .sp .B PMU name: core .B PMU desc: Intel Core .sp .SH DESCRIPTION The library supports all Intel Core-based processors, which include models 15, 23, and 29. .SH MODIFIERS The following modifiers are supported on Intel Core processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (m) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255].
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_coreduo.3000066400000000000000000000027421502707512200240270ustar00rootroot00000000000000.TH LIBPFM 3 "September, 2009" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_coreduo - support for Intel Core Duo/Solo processors .SH SYNOPSIS .nf .B #include .sp .B PMU name: coreduo .B PMU desc: Intel Core Duo .sp .SH DESCRIPTION The library supports all Intel Yonah-based processors such as Intel Core Duo and Intel Core Solo processors. .SH MODIFIERS The following modifiers are supported on Intel Core Duo processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier .TP .B e Enable edge detection, i.e., count only when there is a state transition. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .SH ENVIRONMENT VARIABLES It is possible to force activation of the Intel Core Duo support using the \fBLIBPFM_FORCE_PMU\fR variable. The PMU name, coreduo, must be passed. No additional options are supported. 
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_emr.3000066400000000000000000000071341502707512200231510ustar00rootroot00000000000000.TH LIBPFM 3 "April, 2022" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_emr - support for Intel EmeraldRapids core PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: emr .B PMU desc: Intel EmeraldRapids .sp .SH DESCRIPTION The library supports the Intel EmeraldRapids core PMU. It should be noted that this PMU model only covers each core's PMU and not the socket level PMU. On EmeraldRapids, the number of generic counters depends on the Hyperthreading (HT) mode. The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters in \fBnum_cntrs\fR. .SH MODIFIERS The following modifiers are supported on Intel EmeraldRapids processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (m) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B ldlat Pass a latency threshold to the MEM_TRANS_RETIRED:LOAD_LATENCY event. This is an integer attribute that must be in the range [1:65535]. It is required for this event.
Note that the event must be used with precise sampling (PEBS). .TP .B intx Monitor the event only when executing inside a transactional memory region (in tx). Event does not count otherwise. This is a boolean modifier. Default value is 0. .TP .B intxcp Do not count occurrences of the event when they are inside an aborted transactional memory region. This is a boolean modifier. Default value is 0. .TP .B fe_thres This modifier is for the FRONTEND_RETIRED event only. It defines the period in core cycles after which the IDQ_*_BUBBLES umask counts. It acts as a threshold, i.e., at least a period of N core cycles where the frontend did not deliver X uops. It can only be used with the IDQ_*_BUBBLES umasks. If not specified, the default threshold value is 1 cycle. The valid values are in [1-4095]. .SH OFFCORE_RESPONSE events Intel EmeraldRapids supports two encodings for offcore_response events. In the library, these are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1. Those events need special treatment in the performance monitoring infrastructure because each event uses an extra register to store some settings. Thus, in case multiple offcore_response events are monitored simultaneously, the operating system needs to manage the sharing of that extra register. The offcore_response events are exposed as normal events by the library. The extra settings are exposed as regular umasks. The library takes care of encoding the events according to the underlying kernel interface. On Intel EmeraldRapids, unlike older processors, the event is treated as a regular event with a flat set of umasks to choose from. It is not possible to combine the various request, supplier, and snoop bits anymore. Therefore the library offers the list of validated combinations as per Intel's official event list.
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_glm.3000066400000000000000000000106331502707512200231440ustar00rootroot00000000000000.TH LIBPFM 3 "July, 2016" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_glm - support for Intel Goldmont core PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: glm .B PMU desc: Intel Goldmont .sp .SH DESCRIPTION The library supports the Intel Goldmont core PMU. It should be noted that this PMU model only covers each core's PMU and not the socket level PMU. On Goldmont, the number of generic counters is 4. There is no HyperThreading support. The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters in \fBnum_cntrs\fR. .SH MODIFIERS The following modifiers are supported on Intel Goldmont processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (m) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .SH OFFCORE_RESPONSE events Intel Goldmont provides two offcore_response events. They are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1.
Those events need special treatment in the performance monitoring infrastructure because each event uses an extra register to store some settings. Thus, in case multiple offcore_response events are monitored simultaneously, the kernel needs to manage the sharing of that extra register. The offcore_response events are exposed as normal events by the library. The extra settings are exposed as regular umasks. The library takes care of encoding the events according to the underlying kernel interface. On Intel Goldmont, the umasks are divided into four categories: request, supplier, snoop, and average latency. The offcore_response events have two modes of operation: normal and average latency. In the first mode, the two offcore_response events operate independently of each other. The user must provide at least one umask for each of the first three categories: request, supplier, snoop. In the second mode, the two offcore_response events are combined to compute an average latency per request type. For the normal mode, there is a special supplier (response) umask called \fBANY_RESPONSE\fR. When this umask is used then it overrides any supplier and snoop umasks. In other words, users can specify either \fBANY_RESPONSE\fR \fBOR\fR any combinations of supplier + snoops. In case no supplier or snoop is specified, the library defaults to using \fBANY_RESPONSE\fR. For instance, the following are valid event selections: .TP .B OFFCORE_RESPONSE_0:DMND_DATA_RD:ANY_RESPONSE .TP .B OFFCORE_RESPONSE_0:ANY_REQUEST .TP .B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY .P But the following are illegal: .TP .B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:ANY_RESPONSE .TP .B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY:ANY_RESPONSE .P In average latency mode, \fBOFFCORE_RESPONSE_0\fR must be programmed to select the request types of interest, for instance, \fBDMND_DATA_RD\fR, and the \fBOUTSTANDING\fR umask must be set and no others.
The library will enforce that restriction as soon as the \fBOUTSTANDING\fR umask is used. Then \fBOFFCORE_RESPONSE_1\fR must be set with the same request types and the \fBANY_RESPONSE\fR umask. It should be noted that the library encodes events independently of each other and therefore cannot verify that the requests are matching between the two events. Example of average latency settings: .TP .B OFFCORE_RESPONSE_0:DMND_DATA_RD:OUTSTANDING+OFFCORE_RESPONSE_1:DMND_DATA_RD:ANY_RESPONSE .TP .B OFFCORE_RESPONSE_0:ANY_REQUEST:OUTSTANDING+OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE .P The average latency for the request(s) is obtained by dividing the counts of \fBOFFCORE_RESPONSE_0\fR by the count of \fBOFFCORE_RESPONSE_1\fR. The ratio is expressed in core cycles. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_gnr.3000066400000000000000000000126761502707512200231570ustar00rootroot00000000000000.TH LIBPFM 3 "June, 2024" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_gnr - support for Intel GraniteRapids core PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: gnr .B PMU desc: Intel GraniteRapids .sp .SH DESCRIPTION The library supports the Intel GraniteRapids core PMU. It should be noted that this PMU model only covers each core's PMU and not the socket level PMU. On Intel GraniteRapids, the number of generic counters depends on the Hyperthreading (HT) mode. The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters in \fBnum_cntrs\fR. .SH MODIFIERS The following modifiers are supported on Intel GraniteRapids processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event.
The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (m) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B ldlat Pass a latency threshold to the MEM_TRANS_RETIRED:LOAD_LATENCY event. This is an integer attribute that must be in the range [1:65535]. It is required for this event. Note that the event must be used with precise sampling (PEBS). .TP .B intx Monitor the event only when executing inside a transactional memory region (in tx). Event does not count otherwise. This is a boolean modifier. Default value is 0. .TP .B intxcp Do not count occurrences of the event when they are inside an aborted transactional memory region. This is a boolean modifier. Default value is 0. .TP .B fe_thres This modifier is for the FRONTEND_RETIRED event only. It defines the period in core cycles after which the IDQ_*_BUBBLES umask counts. It acts as a threshold, i.e., at least a period of N core cycles where the frontend did not deliver X uops. It can only be used with the IDQ_*_BUBBLES umasks. If not specified, the default threshold value is 1 cycle. The valid values are in [1-4095]. .SH OFFCORE_RESPONSE events Intel GraniteRapids supports two encodings for offcore_response events. The library uses the generic \fBOCR\fR event name to match Intel's definitions. However, the old \fBOFFCORE_RESPONSE_0\fR and \fBOFFCORE_RESPONSE_1\fR names are still available for backward compatibility.
It should be noted that \fBOCR\fR is not an architected event, therefore it can change between processor generations. Those events need special treatment in the performance monitoring infrastructure because each event uses an extra register to store some settings. Thus, in case multiple \fBOCR\fR events are monitored simultaneously, the operating system needs to manage the sharing of that extra register. The \fBOCR\fR events are exposed as normal events by the library. The extra settings are exposed as regular umasks. The library takes care of encoding the events according to the underlying kernel interface. On Intel GraniteRapids, the event is treated as a regular event with a flat set of umasks to choose from. It is not possible to combine the various request, supplier, and snoop bits anymore. Therefore the library offers the list of validated combinations as per Intel's official event list. .SH Topdown via PERF_METRICS Intel GraniteRapids supports the PERF_METRICS MSR which provides support for Topdown Level 1 and 2 via a single PMU counter. This special counter provides percentages of slots for each metric. This feature must be used in conjunction with fixed counter 3, which counts SLOTS, in order to work properly. The Linux kernel exposes the PERF_METRICS metrics as individual pseudo events counting in slots units; however, to operate correctly, all events must be programmed together. The Linux kernel requires all PERF_METRICS events to be programmed as a single event group, with SLOTS required as the first event. Example: '{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}'. Libpfm4 provides access to the PERF_METRICS pseudo events via a dedicated event called \fBTOPDOWN_M\fR. This event uses the pseudo encodings assigned by the Linux kernel to PERF_METRICS pseudo events. Using these encodings ensures the kernel detects them as targeting the PERF_METRICS MSR.
Note that libpfm4 only provides the encodings and that it is up to the user on Linux to group them and order them properly for the perf_events interface. There exist generic counter encodings for most of the Topdown metrics and libpfm4 provides support for those via the \fBTOPDOWN\fR event. Note that all subevents of \fBTOPDOWN_M\fR use fixed counters which have, by definition, no actual event codes. The library uses the Linux pseudo event codes for them, even when compiled on non-Linux operating systems. The same holds true for any fixed counter pseudo event exported by libpfm4. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_gnr_unc_imc.3000066400000000000000000000023301502707512200246430ustar00rootroot00000000000000.TH LIBPFM 3 "February, 2025" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_gnr_unc_imc - support for Intel GraniteRapids Server Integrated Memory Controller (IMC) uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: gnr_unc_imc[0-11] .B PMU desc: Intel GraniteRapids Server IMC uncore PMU .sp .SH DESCRIPTION The library supports the Intel GraniteRapids Server Integrated Memory Controller (IMC) uncore PMU. .SH MODIFIERS The following modifiers are supported on Intel GraniteRapids server IMC uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of IMC cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the threshold (t) test from strictly greater than to less or equal to. This is a boolean modifier.
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_hsw.3000066400000000000000000000077741502707512200231760ustar00rootroot00000000000000.TH LIBPFM 3 "April, 2013" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_hsw - support for Intel Haswell core PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: hsw .B PMU desc: Intel Haswell .B PMU name: hsw_ep .B PMU desc: Intel Haswell-EP .sp .SH DESCRIPTION The library supports the Intel Haswell and Haswell-EP core PMU. It should be noted that this PMU model only covers each core's PMU and not the socket level PMU. On Haswell, the number of generic counters depends on the Hyperthreading (HT) mode. When HT is on, then only 4 generic counters are available. When HT is off, then 8 generic counters are available. The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters in \fBnum_cntrs\fR. .SH MODIFIERS The following modifiers are supported on Intel Haswell processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (m) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255].
.TP .B t Measure on both threads at the same time assuming hyper-threading is enabled. This is a boolean modifier. .TP .B ldlat Pass a latency threshold to the MEM_TRANS_RETIRED:LOAD_LATENCY event. This is an integer attribute that must be in the range [1:65535]. It is required for this event. Note that the event must be used with precise sampling (PEBS). .TP .B intx Monitor the event only when executing inside a transactional memory region (in tx). Event does not count otherwise. This is a boolean modifier. Default value is 0. .TP .B intxcp Do not count occurrences of the event when they are inside an aborted transactional memory region. This is a boolean modifier. Default value is 0. .SH OFFCORE_RESPONSE events Intel Haswell provides two offcore_response events. They are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1. Those events need special treatment in the performance monitoring infrastructure because each event uses an extra register to store some settings. Thus, in case multiple offcore_response events are monitored simultaneously, the kernel needs to manage the sharing of that extra register. The offcore_response events are exposed as normal events by the library. The extra settings are exposed as regular umasks. The library takes care of encoding the events according to the underlying kernel interface. On Intel Haswell, the umasks are divided into three categories: request, supplier, and snoop. The user must provide at least one umask for each category. The categories are shown in the umask descriptions. There is also the special response umask called \fBANY_RESPONSE\fR. When this umask is used then it overrides any supplier and snoop umasks. In other words, users can specify either \fBANY_RESPONSE\fR \fBOR\fR any combinations of supplier + snoops. In case no supplier or snoop is specified, the library defaults to using \fBANY_RESPONSE\fR.
For instance, the following are valid event selections: .TP .B OFFCORE_RESPONSE_0:DMND_DATA_RD:ANY_RESPONSE .TP .B OFFCORE_RESPONSE_0:ANY_REQUEST .TP .B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY .P But the following are illegal: .TP .B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:ANY_RESPONSE .TP .B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY:ANY_RESPONSE .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_hswep_unc_cbo.3000066400000000000000000000072171502707512200252070ustar00rootroot00000000000000.TH LIBPFM 3 "May, 2015" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_hswep_unc_cbo - support for Intel Haswell-EP C-Box uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: hswep_unc_cbo[0-17] .B PMU desc: Intel Haswell-EP C-Box uncore PMU .sp .SH DESCRIPTION The library supports the Intel Haswell C-Box (coherency engine) uncore PMU. This PMU model only exists on Haswell model 63. There is one C-box PMU per physical core. Therefore there are up to eighteen identical C-Box PMU instances numbered from 0 to 17. On dual-socket systems, the number refers to the C-Box PMU on the socket where the program runs. For instance, if running on CPU18, then hswep_unc_cbo0 refers to the C-Box for physical core 0 on socket 1. Conversely, if running on CPU0, then the same hswep_unc_cbo0 refers to the C-Box for physical core 0 but on socket 0. Each C-Box PMU implements 4 generic counters and two filter registers used only with certain events and umasks. .SH MODIFIERS The following modifiers are supported on Intel Haswell C-Box uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. 
When set to a non-zero value, the counter counts the number of C-Box cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B nf Node filter. Certain events, such as UNC_C_LLC_LOOKUP, UNC_C_LLC_VICTIMS, provide a \fBNID\fR umask. Sometimes the \fBNID\fR is combined with other filtering capabilities, such as opcodes. The node filter is an 8-bit max bitmask. A node corresponds to a processor socket. The legal values therefore depend on the underlying hardware configuration. For dual-socket systems, the bitmask has two valid bits [0:1]. .TP .B cf Core Filter. This is a 5-bit filter which is used to filter based on the physical core origin of the C-Box request. Possible values are 0-31. If the filter is not specified, then no filtering takes place. Bits 0-3 indicate the physical core id, and bit 4 filters on non thread-related data. .TP .B tf Thread Filter. This is a 1-bit filter which is used to filter C-Box requests based on logical processor (hyper-thread) identification. Possible values are 0-1. If the filter is not specified, then no filtering takes place. .TP .B nc Non-Coherent. This is a 1-bit filter which is used to filter C-Box requests only for the TOR_INSERTS and TOR_OCCUPANCY umasks using the OPCODE matcher. If the filter is not specified, then no filtering takes place. .TP .B isoc Isochronous. This is a 1-bit filter which is used to filter C-Box requests only for the TOR_INSERTS and TOR_OCCUPANCY umasks using the OPCODE matcher. If the filter is not specified, then no filtering takes place. .SH Opcode filtering Certain events, such as UNC_C_TOR_INSERTS, support opcode matching on the C-BOX transaction type. To use this feature, first an opcode matching umask must be selected, e.g., MISS_OPCODE. Second, the opcode to match on must be selected via a second umask among the OPC_* umasks.
For instance, UNC_C_TOR_INSERTS:OPCODE:OPC_RFO, counts the number of TOR insertions for RFO transactions. Opcode matching may be combined with node filtering with certain umasks. In general, the filtering support is encoded into the umask name, e.g., NID_OPCODE supports both node and opcode filtering. For instance, UNC_C_TOR_INSERTS:NID_OPCODE:OPC_RFO:nf=1. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_hswep_unc_ha.3000066400000000000000000000027111502707512200250260ustar00rootroot00000000000000.TH LIBPFM 3 "May, 2015" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_hswep_unc_ha - support for Intel Haswell-EP Home Agent (HA) uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: hswep_unc_ha0, hswep_unc_ha1 .B PMU desc: Intel Haswell-EP HA uncore PMU .sp .SH DESCRIPTION The library supports the Intel Haswell Home Agent (HA) uncore PMU. This PMU model only exists on Haswell model 63. .SH MODIFIERS The following modifiers are supported on Intel Haswell HA uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of HA cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set. 
This is a boolean modifier .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_hswep_unc_imc.3000066400000000000000000000027501502707512200252110ustar00rootroot00000000000000.TH LIBPFM 3 "May, 2015" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_hswep_unc_imc - support for Intel Haswell-EP Integrated Memory Controller (IMC) uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: hswep_unc_imc[0-7] .B PMU desc: Intel Haswell-EP IMC uncore PMU .sp .SH DESCRIPTION The library supports the Intel Haswell Integrated Memory Controller (IMC) uncore PMU. This PMU model only exists on Haswell model 63. .SH MODIFIERS The following modifiers are supported on Intel Haswell IMC uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of IMC cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_hswep_unc_irp.3000066400000000000000000000026371502707512200252330ustar00rootroot00000000000000.TH LIBPFM 3 "May, 2015" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_hswep_unc_irp - support for Intel Haswell-EP IRP uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: hswep_unc_irp .B PMU desc: Intel Haswell-EP IRP uncore PMU .sp .SH DESCRIPTION The library supports the Intel Haswell IRP uncore PMU. This PMU model only exists on Haswell model 63. .SH MODIFIERS The following modifiers are supported on Intel Haswell IRP uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_hswep_unc_pcu.3000066400000000000000000000046051502707512200252270ustar00rootroot00000000000000.TH LIBPFM 3 "May, 2015" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_hswep_unc_pcu - support for Intel Haswell-EP Power Controller Unit (PCU) uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: hswep_unc_pcu .B PMU desc: Intel Haswell-EP PCU uncore PMU .sp .SH DESCRIPTION The library supports the Intel Haswell Power Controller Unit uncore PMU. This PMU model only exists on Haswell model 63. .SH MODIFIERS The following modifiers are supported on Intel Haswell PCU uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of PCU cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .TP .B i Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set. This is a boolean modifier .TP .B ff Enable frequency band filtering. This modifier applies only to the UNC_P_FREQ_BANDx_CYCLES events, where x is [0-3]. The modifier expects an integer in the range [0-255]. The value is interpreted as a frequency value to be multiplied by 100MHz. Thus if the value is 32, then all cycles where the processor is running at 3.2GHz and more are counted.
.SH Frequency band filtering There are 4 events which support frequency band filtering, namely, UNC_P_FREQ_BAND0_CYCLES, UNC_P_FREQ_BAND1_CYCLES, UNC_P_FREQ_BAND2_CYCLES, UNC_P_FREQ_BAND3_CYCLES. The frequency filter (available via the ff modifier) is stored into a PMU shared register which holds all 4 possible frequency bands, one per event. However, the library generates the encoding for each event individually because it processes events one at a time. The caller or the underlying kernel interface may have to merge the band filter settings to program the filter register properly. .SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_hswep_unc_qpi.3000066400000000000000000000027001502707512200252250ustar00rootroot00000000000000.TH LIBPFM 3 "May, 2015" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_hswep_unc_qpi - support for Intel Haswell-EP QPI uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: hswep_unc_qpi0, hswep_unc_qpi1 .B PMU desc: Intel Haswell-EP QPI uncore PMU .sp .SH DESCRIPTION The library supports the Intel Haswell QPI uncore PMU. This PMU model only exists on Haswell model 63. .SH MODIFIERS The following modifiers are supported on Intel Haswell QPI uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of QPI cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N.
When invert is set, then threshold must be set to a non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_hswep_unc_r2pcie.3000066400000000000000000000027001502707512200256200ustar00rootroot00000000000000.TH LIBPFM 3 "May, 2015" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_hswep_unc_r2pcie - support for Intel Haswell-EP R2 PCIe uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: hswep_unc_r2pcie .B PMU desc: Intel Haswell-EP R2 PCIe uncore PMU .sp .SH DESCRIPTION The library supports the Intel Haswell R2 PCIe uncore PMU. This PMU model only exists on Haswell model 63. .SH MODIFIERS The following modifiers are supported on Intel Haswell R2PCIe uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of R2PCIe cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .TP .B i Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to a non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_hswep_unc_r3qpi.3000066400000000000000000000026721502707512200255020ustar00rootroot00000000000000.TH LIBPFM 3 "May, 2015" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_hswep_unc_r3qpi - support for Intel Haswell-EP R3QPI uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: hswep_unc_r3qpi[0-2] .B PMU desc: Intel Haswell-EP R3QPI uncore PMU .sp .SH DESCRIPTION The library supports the Intel Haswell R3QPI uncore PMU. This PMU model only exists on Haswell model 63. .SH MODIFIERS The following modifiers are supported on Intel Haswell R3QPI uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of R3QPI cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .TP .B i Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to a non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_hswep_unc_sbo.3000066400000000000000000000035051502707512200252230ustar00rootroot00000000000000.TH LIBPFM 3 "May, 2015" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_hswep_unc_sbo - support for Intel Haswell-EP S-Box uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: hswep_unc_sbo .B PMU desc: Intel Haswell-EP S-Box uncore PMU .sp .SH DESCRIPTION The library supports the Intel Haswell Ring Transfer unit (S-Box) uncore PMU. This PMU model only exists on Haswell model 63. .SH MODIFIERS The following modifiers are supported on Intel Haswell S-Box uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of S-Box cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .TP .B i Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to a non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_hswep_unc_ubo.3000066400000000000000000000027111502707512200252230ustar00rootroot00000000000000.TH LIBPFM 3 "May, 2015" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_hswep_unc_ubo - support for Intel Haswell-EP U-Box uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: hswep_unc_ubo .B PMU desc: Intel Haswell-EP U-Box uncore PMU .sp .SH DESCRIPTION The library supports the Intel Haswell system configuration unit (U-Box) uncore PMU. This PMU model only exists on Haswell model 63. .SH MODIFIERS The following modifiers are supported on Intel Haswell U-Box uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of U-Box cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .TP .B i Invert the meaning of the threshold or edge filter. If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N. When invert is set, then threshold must be set to a non-zero value. If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_icl.3000066400000000000000000000122611502707512200231330ustar00rootroot00000000000000.TH LIBPFM 3 "August, 2019" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_icl - support for Intel IceLake core PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: icl .B PMU desc: Intel IceLake .sp .SH DESCRIPTION The library supports the Intel IceLake core PMU. It should be noted that this PMU model only covers each core's PMU and not the socket level PMU. On IceLake, the number of generic counters depends on the Hyperthreading (HT) mode. The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters in \fBnum_cntrs\fR. .SH MODIFIERS The following modifiers are supported on Intel IceLake processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (c) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B ldlat Pass a latency threshold to the MEM_TRANS_RETIRED:LOAD_LATENCY event. This is an integer attribute that must be in the range [1:65535]. It is required for this event.
Note that the event must be used with precise sampling (PEBS). .TP .B intx Monitor the event only when executing inside a transactional memory region (in tx). Event does not count otherwise. This is a boolean modifier. Default value is 0. .TP .B intxcp Do not count occurrences of the event when they are inside an aborted transactional memory region. This is a boolean modifier. Default value is 0. .TP .B fe_thres This modifier is for the FRONTEND_RETIRED event only. It defines the period in core cycles after which the IDQ_*_BUBBLES umask counts. It acts as a threshold, i.e., at least a period of N core cycles where the frontend did not deliver X uops. It can only be used with the IDQ_*_BUBBLES umasks. If not specified, the default threshold value is 1 cycle. The valid values are in [1-4095]. .SH OFFCORE_RESPONSE events Intel IceLake supports two encodings for offcore_response events. In the library, these are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1. Those events need special treatment in the performance monitoring infrastructure because each event uses an extra register to store some settings. Thus, in case multiple offcore_response events are monitored simultaneously, the operating system needs to manage the sharing of that extra register. The offcore_response events are exposed as normal events by the library. The extra settings are exposed as regular umasks. The library takes care of encoding the events according to the underlying kernel interface. On Intel IceLake, unlike older processors, the event is treated as a regular event with a flat set of umasks to choose from. It is not possible to combine the various requests, supplier, snoop bits anymore. Therefore the library offers the list of validated combinations as per Intel's official event list. .SH Topdown via PERF_METRICS Intel Icelake supports the PERF_METRICS MSR which provides support for Topdown Level 1 via a single PMU counter.
This special counter provides percentages of slots for each metric. This feature must be used in conjunction with fixed counter 3, which counts SLOTS, in order to work properly. The Linux kernel exposes PERF_METRICS metrics as individual pseudo events counting in slots units; however, to operate correctly, all events must be programmed together. The Linux kernel requires all PERF_METRICS events to be programmed as a single event group, with SLOTS as the required first event. Example: '{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}'. Libpfm4 provides access to the PERF_METRICS pseudo events via a dedicated event called \fBTOPDOWN_M\fR. This event uses the pseudo encodings assigned by the Linux kernel to PERF_METRICS pseudo events. Using these encodings ensures the kernel detects them as targeting the PERF_METRICS MSR. Note that libpfm4 only provides the encodings and that it is up to the user on Linux to group them and order them properly for the perf_events interface. There exist generic counter encodings for most of the Topdown metrics and libpfm4 provides support for those via the \fBTOPDOWN\fR event. Note that all subevents of \fBTOPDOWN_M\fR use fixed counters which have, by definition, no actual event codes. The library uses the Linux pseudo event codes for them, even when compiled on non-Linux operating systems. The same holds true for any fixed counters pseudo event exported by libpfm4. .SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_icx.3000066400000000000000000000070751502707512200231540ustar00rootroot00000000000000.TH LIBPFM 3 "May, 2021" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_icx - support for Intel IceLakeX core PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: icx .B PMU desc: Intel IceLakeX .sp .SH DESCRIPTION The library supports the Intel IceLakeX core PMU.
It should be noted that this PMU model only covers each core's PMU and not the socket level PMU. On IceLakeX, the number of generic counters depends on the Hyperthreading (HT) mode. The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters in \fBnum_cntrs\fR. .SH MODIFIERS The following modifiers are supported on Intel IceLakeX processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (c) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B ldlat Pass a latency threshold to the MEM_TRANS_RETIRED:LOAD_LATENCY event. This is an integer attribute that must be in the range [1:65535]. It is required for this event. Note that the event must be used with precise sampling (PEBS). .TP .B intx Monitor the event only when executing inside a transactional memory region (in tx). Event does not count otherwise. This is a boolean modifier. Default value is 0. .TP .B intxcp Do not count occurrences of the event when they are inside an aborted transactional memory region. This is a boolean modifier. Default value is 0. .TP .B fe_thres This modifier is for the FRONTEND_RETIRED event only.
It defines the period in core cycles after which the IDQ_*_BUBBLES umask counts. It acts as a threshold, i.e., at least a period of N core cycles where the frontend did not deliver X uops. It can only be used with the IDQ_*_BUBBLES umasks. If not specified, the default threshold value is 1 cycle. The valid values are in [1-4095]. .SH OFFCORE_RESPONSE events Intel IceLakeX supports two encodings for offcore_response events. In the library, these are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1. Those events need special treatment in the performance monitoring infrastructure because each event uses an extra register to store some settings. Thus, in case multiple offcore_response events are monitored simultaneously, the operating system needs to manage the sharing of that extra register. The offcore_response events are exposed as normal events by the library. The extra settings are exposed as regular umasks. The library takes care of encoding the events according to the underlying kernel interface. On Intel IceLakeX, unlike older processors, the event is treated as a regular event with a flat set of umasks to choose from. It is not possible to combine the various requests, supplier, snoop bits anymore. Therefore the library offers the list of validated combinations as per Intel's official event list. .SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_icx_unc_cha.3000066400000000000000000000026501502707512200246300ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2023" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_icx_unc_cha - support for Intel IcelakeX Server CHA uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: icx_unc_cha[0-39] .B PMU desc: Intel IcelakeX CHA uncore PMU .sp .SH DESCRIPTION The library supports the Intel IcelakeX CHA (coherency and home agent) uncore PMU. There is one CHA-box PMU per physical core.
Therefore there are up to forty identical CHA PMU instances, numbered from 0 up to 39. On dual-socket systems, the number refers to the CHA PMUs on the socket where the program runs. Each CHA PMU implements 4 generic counters. .SH MODIFIERS The following modifiers are supported on Intel IcelakeX CHA uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of CHA clockticks in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the threshold (t) test from strictly greater than to less or equal to. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_icx_unc_iio.3000066400000000000000000000026111502707512200246520ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2023" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_icx_unc_iio - support for Intel IcelakeX Server IIO uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: icx_unc_iio[0-5] .B PMU desc: Intel IcelakeX IIO uncore PMU .sp .SH DESCRIPTION The library supports the Intel IcelakeX IIO (I/O controller) uncore PMU. Each IIO PMU implements 4 generic counters and free-running counters (not yet supported by libpfm4). The current version of libpfm4 does not expose the \fBfc_mask\fR and \fBch_mask\fR filters because these are hardcoded in the events provided by the library. .SH MODIFIERS The following modifiers are supported on Intel IcelakeX IIO uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence.
This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of IIO clockticks in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the threshold (t) test from strictly greater than to less or equal to. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_icx_unc_imc.3000066400000000000000000000023071502707512200246440ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2023" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_icx_unc_imc - support for Intel Icelake X Server Integrated Memory Controller (IMC) uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: icx_unc_imc[0-11] .B PMU desc: Intel Icelake X Server IMC uncore PMU .sp .SH DESCRIPTION The library supports the Intel Icelake X Server Integrated Memory Controller (IMC) uncore PMU. .SH MODIFIERS The following modifiers are supported on Intel Icelake X server IMC uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of IMC cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the threshold (t) test from strictly greater than to less or equal to. This is a boolean modifier.
.SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_icx_unc_irp.3000066400000000000000000000023151502707512200246650ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2023" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_icx_unc_irp - support for Intel IcelakeX Server IRP uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: icx_unc_irp[0-5] .B PMU desc: Intel IcelakeX IRP uncore PMU .sp .SH DESCRIPTION The library supports the Intel IcelakeX IRP (IIO Ring Port) uncore PMU. There is one IRP per IIO. Each IRP PMU implements 4 generic counters. .SH MODIFIERS The following modifiers are supported on Intel IcelakeX IRP uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of IRP clockticks in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the threshold (t) test from strictly greater than to less or equal to. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_icx_unc_m2m.3000066400000000000000000000022521502707512200245660ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2023" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_icx_unc_m2m - support for Intel Icelake X Server Mesh to Memory (M2M) uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: icx_unc_m2m[0-1] .B PMU desc: Intel Icelake X Server M2M uncore PMU .sp .SH DESCRIPTION The library supports the Intel Icelake X Server Mesh to Memory (M2M) uncore PMU.
.SH MODIFIERS The following modifiers are supported on Intel Icelake X server M2M uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of M2M cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the threshold (t) test from strictly greater than to less or equal to. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_icx_unc_m2pcie.3000066400000000000000000000023111502707512200252460ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2023" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_icx_unc_m2pcie - support for Intel IcelakeX Server M2PCIE uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: icx_unc_m2pcie[0-2] .B PMU desc: Intel IcelakeX M2PCIE uncore PMU .sp .SH DESCRIPTION The library supports the Intel IcelakeX M2PCIE (Mesh to IIO) uncore PMU. Each M2PCIE PMU implements 4 generic counters. .SH MODIFIERS The following modifiers are supported on Intel IcelakeX M2PCIE uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of M2PCIE clockticks in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255].
.TP .B i Invert the threshold (t) test from strictly greater than to less or equal to. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_icx_unc_m3upi.3000066400000000000000000000022721502707512200251320ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2023" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_icx_unc_m3upi - support for Intel IcelakeX Server M3UPI uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: icx_unc_m3upi[0-3] .B PMU desc: Intel IcelakeX M3UPI uncore PMU .sp .SH DESCRIPTION The library supports the Intel IcelakeX M3UPI (Mesh to UPI) uncore PMU. Each M3UPI PMU implements 4 generic counters. .SH MODIFIERS The following modifiers are supported on Intel IcelakeX M3UPI uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of M3UPI clockticks in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the threshold (t) test from strictly greater than to less or equal to. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_icx_unc_pcu.3000066400000000000000000000032751502707512200246700ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2023" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_icx_unc_pcu - support for Intel IcelakeX Server PCU uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: icx_unc_pcu .B PMU desc: Intel IcelakeX PCU uncore PMU .sp .SH DESCRIPTION The library supports the Intel IcelakeX PCU (Power Control Unit) uncore PMU. There is one PCU per system.
Each PCU PMU implements 4 generic counters. .SH MODIFIERS The following modifiers are supported on Intel IcelakeX PCU uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of PCU clockticks in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:63]. .TP .B i Invert the threshold (t) test from strictly greater than to less or equal to. This is a boolean modifier. .TP .B occ_i Invert the threshold (t) test for the occupancy events \fBPOWER_STATE_OCCUPANCY\fR from strictly greater than to less or equal to. This is a boolean modifier. .TP .B occ_e Enable edge detection for occupancy events \fBPOWER_STATE_OCCUPANCY\fR, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_icx_unc_ubox.3000066400000000000000000000023511502707512200250500ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2023" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_icx_unc_ubox - support for Intel IcelakeX Server UBOX uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: icx_unc_ubox .B PMU desc: Intel IcelakeX UBOX uncore PMU .sp .SH DESCRIPTION The library supports the Intel IcelakeX UBOX (System configuration controller) uncore PMU. There is one UBOX per processor. Each UBOX PMU implements 4 generic counters.
.SH MODIFIERS The following modifiers are supported on Intel IcelakeX UBOX uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of UBOX clockticks in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the threshold (t) test from strictly greater than to less or equal to. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .fi .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_icx_unc_upi.3000066400000000000000000000022701502707512200246700ustar00rootroot00000000000000.TH LIBPFM 3 "November, 2023" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_icx_unc_upi - support for Intel IcelakeX Server UPI uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: icx_unc_upi[0-3] .B PMU desc: Intel IcelakeX UPI uncore PMU .sp .SH DESCRIPTION The library supports the Intel IcelakeX UPI (Ultra Path Interconnect) uncore PMU. Each UPI PMU implements 4 generic counters. .SH MODIFIERS The following modifiers are supported on Intel IcelakeX UPI uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of UPI clockticks in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255].
.B i Invert the threshold (t) test from strictly greater than to less or equal to. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_ivb.3000066400000000000000000000072771502707512200231550ustar00rootroot00000000000000.TH LIBPFM 3 "August, 2012" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_ivb - support for Intel Ivy Bridge core PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: ivb .B PMU desc: Intel Ivy Bridge .B PMU name: ivb_ep .B PMU desc: Intel Ivy Bridge EP .sp .SH DESCRIPTION The library supports the Intel Ivy Bridge core PMU. It should be noted that this PMU model only covers each core's PMU and not the socket level PMU. On Ivy Bridge, the number of generic counters depends on the Hyperthreading (HT) mode. When HT is on, then only 4 generic counters are available. When HT is off, then 8 generic counters are available. The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters in \fBnum_cntrs\fR. .SH MODIFIERS The following modifiers are supported on Intel Ivy Bridge processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (m) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:255]. .TP .B t Measure on both threads at the same time assuming hyper-threading is enabled. This is a boolean modifier. .TP .B ldlat Pass a latency threshold to the MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD event. This is an integer attribute that must be in the range [1:65535]. It is required for this event. Note that the event must be used with precise sampling (PEBS). .SH OFFCORE_RESPONSE events Intel Ivy Bridge provides two offcore_response events. They are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1. Those events need special treatment in the performance monitoring infrastructure because each event uses an extra register to store some settings. Thus, in case multiple offcore_response events are monitored simultaneously, the kernel needs to manage the sharing of that extra register. The offcore_response events are exposed as normal events by the library. The extra settings are exposed as regular umasks. The library takes care of encoding the events according to the underlying kernel interface. On Intel Ivy Bridge, the umasks are divided into three categories: request, supplier and snoop. The user must provide at least one umask for each category. The categories are shown in the umask descriptions. There is also the special response umask called \fBANY_RESPONSE\fR. When this umask is used then it overrides any supplier and snoop umasks. In other words, users can specify either \fBANY_RESPONSE\fR \fBOR\fR any combinations of supplier + snoops. In case no supplier or snoop is specified, the library defaults to using \fBANY_RESPONSE\fR.
For instance, the following are valid event selections: .TP .B OFFCORE_RESPONSE_0:DMND_DATA_RD:ANY_RESPONSE .TP .B OFFCORE_RESPONSE_0:ANY_REQUEST .TP .B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY .P But the following are illegal: .TP .B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:ANY_RESPONSE .TP .B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY:ANY_RESPONSE .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_ivb_unc.3000066400000000000000000000036011502707512200240070ustar00rootroot00000000000000.TH LIBPFM 3 "June, 2013" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_ivb_unc - support for Intel Ivy Bridge uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: ivb_unc_cbo0, ivb_unc_cbo1, ivb_unc_cbo2, ivb_unc_cbo3 .B PMU desc: Intel Ivy Bridge C-box uncore .sp .SH DESCRIPTION The library supports the Intel Ivy Bridge client part (model 58) uncore PMU. The support is currently limited to the Coherency Box, so-called C-Box, for up to 4 physical cores. Each physical core has an associated C-Box which it uses to communicate with the L3 cache. The C-boxes all support the same set of events. However, the Core 0 C-box (ivb_unc_cbo0) supports an additional uncore clock ticks event: \fBUNC_CLOCKTICKS\fR. .SH MODIFIERS The following modifiers are supported on Intel Ivy Bridge C-Box uncore PMU: .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (m) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:255]. .P Both the \fBUNC_CBO_CACHE_LOOKUP\fR and \fBUNC_CBO_XSNP_RESPONSE\fR events require two umasks to be valid. For \fBUNC_CBO_CACHE_LOOKUP\fR, the first umask must be one of the MESI state umasks; the second has to be one of the filters. For \fBUNC_CBO_XSNP_RESPONSE\fR, the first umask must be one of the snoop types; the second has to be one of the filters. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_ivbep_unc_cbo.3000066400000000000000000000071131502707512200251610ustar00rootroot00000000000000.TH LIBPFM 3 "February, 2014" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_ivbep_unc_cbo - support for Intel Ivy Bridge-EP C-Box uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: ivbep_unc_cbo[0-7] .B PMU desc: Intel Ivy Bridge-EP C-Box uncore PMU .sp .SH DESCRIPTION The library supports the Intel Ivy Bridge C-Box (coherency engine) uncore PMU. This PMU model only exists on Ivy Bridge model 62. There is one C-box PMU per physical core. Therefore there are up to fifteen identical C-Box PMU instances numbered from 0 to 14. On dual-socket systems, the number refers to the C-Box PMU on the socket where the program runs. For instance, if running on CPU15, then ivbep_unc_cbo0 refers to the C-Box for physical core 0 on socket 1. Conversely, if running on CPU0, then the same ivbep_unc_cbo0 refers to the C-Box for physical core 0 but on socket 0. Each C-Box PMU implements 4 generic counters and two filter registers used only with certain events and umasks. .SH MODIFIERS The following modifiers are supported on Intel Ivy Bridge C-Box uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value.
When set to a non-zero value, the counter counts the number of C-Box cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B nf Node filter. Certain events, such as UNC_C_LLC_LOOKUP, UNC_C_LLC_VICTIMS, provide a \fBNID\fR umask. Sometimes the \fBNID\fR is combined with other filtering capabilities, such as opcodes. The node filter is an 8-bit max bitmask. A node corresponds to a processor socket. The legal values therefore depend on the underlying hardware configuration. For dual-socket systems, the bitmask has two valid bits [0:1]. .TP .B cf Core Filter. This is a 3-bit filter which is used to filter based on physical core origin of the C-Box request. Possible values are 0-7. If the filter is not specified, then no filtering takes place. .TP .B tf Thread Filter. This is a 1-bit filter which is used to filter C-Box requests based on logical processor (hyper-thread) identification. Possible values are 0-1. If the filter is not specified, then no filtering takes place. .TP .B nc Non-Coherent. This is a 1-bit filter which is used to filter C-Box requests only for the TOR_INSERTS and TOR_OCCUPANCY umasks using the OPCODE matcher. If the filter is not specified, then no filtering takes place. .TP .B isoc Isochronous. This is a 1-bit filter which is used to filter C-Box requests only for the TOR_INSERTS and TOR_OCCUPANCY umasks using the OPCODE matcher. If the filter is not specified, then no filtering takes place. .SH Opcode filtering Certain events, such as UNC_C_TOR_INSERTS, support opcode matching on the C-BOX transaction type. To use this feature, first an opcode matching umask must be selected, e.g., MISS_OPCODE. Second, the opcode to match on must be selected via a second umask among the OPC_* umasks. For instance, UNC_C_TOR_INSERTS:OPCODE:OPC_RFO counts the number of TOR insertions for RFO transactions.
Opcode matching may be combined with node filtering with certain umasks. In general, the filtering support is encoded into the umask name, e.g., NID_OPCODE supports both node and opcode filtering. For instance, UNC_C_TOR_INSERTS:NID_OPCODE:OPC_RFO:nf=1. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_ivbep_unc_ha.3000066400000000000000000000021341502707512200250040ustar00rootroot00000000000000.TH LIBPFM 3 "February, 2014" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_ivbep_unc_ha - support for Intel Ivy Bridge-EP Home Agent (HA) uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: ivbep_unc_ha0, ivbep_unc_ha1 .B PMU desc: Intel Ivy Bridge-EP HA uncore PMU .sp .SH DESCRIPTION The library supports the Intel Ivy Bridge Home Agent (HA) uncore PMU. This PMU model only exists on Ivy Bridge model 62. .SH MODIFIERS The following modifiers are supported on Intel Ivy Bridge HA uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of HA cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. 
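As an illustration of the threshold (t) and edge (e) modifiers described above, a fully qualified event string combines the PMU instance, the event, its umasks, and the modifiers. The strings below are a hedged sketch: the event name \fBUNC_H_REQUESTS\fR and its \fBREADS\fR umask are assumed from libpfm4's Ivy Bridge-EP HA event table and may differ between library versions.

```
# hypothetical event strings for the ivbep_unc_ha0 PMU
ivbep_unc_ha0::UNC_H_REQUESTS:READS          # raw count of read requests
ivbep_unc_ha0::UNC_H_REQUESTS:READS:t=1      # HA cycles with >= 1 occurrence
ivbep_unc_ha0::UNC_H_REQUESTS:READS:t=1:e    # transitions from 0 to >= 1
```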
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_ivbep_unc_imc.3000066400000000000000000000021771502707512200251710ustar00rootroot00000000000000.TH LIBPFM 3 "February, 2014" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_ivbep_unc_imc - support for Intel Ivy Bridge-EP Integrated Memory Controller (IMC) uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: ivbep_unc_imc[0-7] .B PMU desc: Intel Ivy Bridge-EP IMC uncore PMU .sp .SH DESCRIPTION The library supports the Intel Ivy Bridge Integrated Memory Controller (IMC) uncore PMU. This PMU model only exists on Ivy Bridge model 62. .SH MODIFIERS The following modifiers are supported on Intel Ivy Bridge IMC uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of IMC cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_ivbep_unc_irp.3000066400000000000000000000020611502707512200252050ustar00rootroot00000000000000.TH LIBPFM 3 "February, 2014" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_ivbep_unc_irp - support for Intel Ivy Bridge-EP IRP uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: ivbep_unc_irp .B PMU desc: Intel Ivy Bridge-EP IRP uncore PMU .sp .SH DESCRIPTION The library supports the Intel Ivy Bridge IRP uncore PMU. This PMU model only exists on Ivy Bridge model 62.
.SH MODIFIERS The following modifiers are supported on Intel Ivy Bridge IRP uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_ivbep_unc_pcu.3000066400000000000000000000040271502707512200252060ustar00rootroot00000000000000.TH LIBPFM 3 "February, 2014" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_ivbep_unc_pcu - support for Intel Ivy Bridge-EP Power Controller Unit (PCU) uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: ivbep_unc_pcu .B PMU desc: Intel Ivy Bridge-EP PCU uncore PMU .sp .SH DESCRIPTION The library supports the Intel Ivy Bridge Power Controller Unit uncore PMU. This PMU model only exists on Ivy Bridge model 62. .SH MODIFIERS The following modifiers are supported on Intel Ivy Bridge PCU uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of PCU cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .TP .B ff Enable frequency band filtering. This modifier applies only to the UNC_P_FREQ_BANDx_CYCLES events, where x is [0-3].
The modifier expects an integer in the range [0-255]. The value is interpreted as a frequency value to be multiplied by 100MHz. Thus if the value is 32, then all cycles where the processor is running at 3.2GHz or higher are counted. .SH Frequency band filtering There are 4 events which support frequency band filtering, namely, UNC_P_FREQ_BAND0_CYCLES, UNC_P_FREQ_BAND1_CYCLES, UNC_P_FREQ_BAND2_CYCLES, UNC_P_FREQ_BAND3_CYCLES. The frequency filter (available via the ff modifier) is stored into a PMU shared register which holds all 4 possible frequency bands, one per event. However, the library generates the encoding for each event individually because it processes events one at a time. The caller or the underlying kernel interface may have to merge the band filter settings to program the filter register properly. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_ivbep_unc_qpi.3000066400000000000000000000021201502707512200252010ustar00rootroot00000000000000.TH LIBPFM 3 "February, 2014" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_ivbep_unc_qpi - support for Intel Ivy Bridge-EP QPI uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: ivbep_unc_qpi0, ivbep_unc_qpi1 .B PMU desc: Intel Ivy Bridge-EP QPI uncore PMU .sp .SH DESCRIPTION The library supports the Intel Ivy Bridge QPI uncore PMU. This PMU model only exists on Ivy Bridge model 62. .SH MODIFIERS The following modifiers are supported on Intel Ivy Bridge QPI uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of QPI cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:255]. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_ivbep_unc_r2pcie.3000066400000000000000000000021221502707512200255750ustar00rootroot00000000000000.TH LIBPFM 3 "February, 2014" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_ivbep_unc_r2pcie - support for Intel Ivy Bridge-EP R2 PCIe uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: ivbep_unc_r2pcie .B PMU desc: Intel Ivy Bridge-EP R2 PCIe uncore PMU .sp .SH DESCRIPTION The library supports the Intel Ivy Bridge R2 PCIe uncore PMU. This PMU model only exists on Ivy Bridge model 62. .SH MODIFIERS The following modifiers are supported on Intel Ivy Bridge R2PCIe uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of R2PCIe cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_ivbep_unc_r3qpi.3000066400000000000000000000021541502707512200254540ustar00rootroot00000000000000.TH LIBPFM 3 "February, 2014" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_ivbep_unc_r3qpi - support for Intel Ivy Bridge-EP R3QPI uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: ivbep_unc_r3qpi0, ivbep_unc_r3qpi1, ivbep_unc_r3qpi2 .B PMU desc: Intel Ivy Bridge-EP R3QPI uncore PMU .sp .SH DESCRIPTION The library supports the Intel Ivy Bridge R3QPI uncore PMU. This PMU model only exists on Ivy Bridge model 62. 
.SH MODIFIERS The following modifiers are supported on Intel Ivy Bridge R3QPI uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of R3QPI cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_ivbep_unc_ubo.3000066400000000000000000000021331502707512200252000ustar00rootroot00000000000000.TH LIBPFM 3 "February, 2014" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_ivbep_unc_ubo - support for Intel Ivy Bridge-EP U-Box uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: ivbep_unc_ubo .B PMU desc: Intel Ivy Bridge-EP U-Box uncore PMU .sp .SH DESCRIPTION The library supports the Intel Ivy Bridge system configuration unit (U-Box) uncore PMU. This PMU model only exists on Ivy Bridge model 62. .SH MODIFIERS The following modifiers are supported on Intel Ivy Bridge U-Box uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of U-Box cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15].
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_knc.3000066400000000000000000000025041502707512200231360ustar00rootroot00000000000000.TH LIBPFM 3 "September, 2012" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_knc - support for Intel Knights Corner .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: knc .B PMU desc: Intel Knights Corner .sp .SH DESCRIPTION The library supports Intel Knights Corner processors. .SH MODIFIERS The following modifiers are supported on Intel Knights Corner processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B t Measure on all threads at the same time assuming hyper-threading is enabled. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_knl.3000066400000000000000000000115201502707512200231450ustar00rootroot00000000000000.TH LIBPFM 3 "July, 2016" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_knl - support for Intel Knights Landing core PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: knl .B PMU desc: Intel Knights Landing .sp .SH DESCRIPTION The library supports the Intel Knights Landing core PMU.
It should be noted that this PMU model only covers each core's PMU and not the socket level PMU. On Knights Landing, the number of generic counters is 4. There is 4-way HyperThreading support. The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters in \fBnum_cntrs\fR. .SH MODIFIERS The following modifiers are supported on Intel Knights Landing processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (m) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B t Measure on any of the 4 hyper-threads at the same time assuming hyper-threading is enabled. This is a boolean modifier. This modifier is only available on fixed counters (unhalted_reference_cycles, instructions_retired, unhalted_core_cycles). Depending on the underlying kernel interface, the event may be programmed on a fixed counter or a generic counter, except for unhalted_reference_cycles, in which case, this modifier may be ignored or rejected. .SH OFFCORE_RESPONSE events Intel Knights Landing provides two offcore_response events. They are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1.
Those events need special treatment in the performance monitoring infrastructure because each event uses an extra register to store some settings. Thus, in case multiple offcore_response events are monitored simultaneously, the kernel needs to manage the sharing of that extra register. The offcore_response events are exposed as normal events by the library. The extra settings are exposed as regular umasks. The library takes care of encoding the events according to the underlying kernel interface. On Intel Knights Landing, the umasks are divided into 4 categories: request, supplier, snoop, and average latency. The offcore_response events have two modes of operation: normal and average latency. In the first mode, the two offcore_response events operate independently of each other. The user must provide at least one umask for each of the first 3 categories: request, supplier, snoop. In the second mode, the two offcore_response events are combined to compute an average latency per request type. For the normal mode, there is a special supplier (response) umask called \fBANY_RESPONSE\fR. When this umask is used then it overrides any supplier and snoop umasks. In other words, users can specify either \fBANY_RESPONSE\fR \fBOR\fR any combinations of supplier + snoops. In case no supplier or snoop is specified, the library defaults to using \fBANY_RESPONSE\fR. For instance, the following are valid event selections: .TP .B OFFCORE_RESPONSE_0:DMND_DATA_RD:ANY_RESPONSE .TP .B OFFCORE_RESPONSE_0:ANY_REQUEST .TP .B OFFCORE_RESPONSE_0:ANY_RFO:DDR_NEAR .P But the following is illegal: .TP .B OFFCORE_RESPONSE_0:ANY_RFO:DDR_NEAR:ANY_RESPONSE .P In average latency mode, \fBOFFCORE_RESPONSE_0\fR must be programmed to select the request types of interest, for instance, \fBDMND_DATA_RD\fR, and the \fBOUTSTANDING\fR umask must be set and no others. The library will enforce that restriction as soon as the \fBOUTSTANDING\fR umask is used.
Then \fBOFFCORE_RESPONSE_1\fR must be set with the same request types and the \fBANY_RESPONSE\fR umask. It should be noted that the library encodes events independently of each other and therefore cannot verify that the requests match between the two events. Example of average latency settings: .TP .B OFFCORE_RESPONSE_0:DMND_DATA_RD:OUTSTANDING+OFFCORE_RESPONSE_1:DMND_DATA_RD:ANY_RESPONSE .TP .B OFFCORE_RESPONSE_0:ANY_REQUEST:OUTSTANDING+OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE .P The average latency for the request(s) is obtained by dividing the counts of \fBOFFCORE_RESPONSE_0\fR by the count of \fBOFFCORE_RESPONSE_1\fR. The ratio is expressed in core cycles. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_knm.3000066400000000000000000000114741502707512200231560ustar00rootroot00000000000000.TH LIBPFM 3 "March, 2018" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_knm - support for Intel Knights Mill core PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: knm .B PMU desc: Intel Knights Mill .sp .SH DESCRIPTION The library supports the Intel Knights Mill core PMU. It should be noted that this PMU model only covers each core's PMU and not the socket level PMU. On Knights Mill, the number of generic counters is 4. There is 4-way HyperThreading support. The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters in \fBnum_cntrs\fR. .SH MODIFIERS The following modifiers are supported on Intel Knights Mill processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring.
This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (m) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B t Measure on any of the 4 hyper-threads at the same time assuming hyper-threading is enabled. This is a boolean modifier. This modifier is only available on fixed counters (unhalted_reference_cycles, instructions_retired, unhalted_core_cycles). Depending on the underlying kernel interface, the event may be programmed on a fixed counter or a generic counter, except for unhalted_reference_cycles, in which case, this modifier may be ignored or rejected. .SH OFFCORE_RESPONSE events Intel Knights Mill provides two offcore_response events. They are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1. Those events need special treatment in the performance monitoring infrastructure because each event uses an extra register to store some settings. Thus, in case multiple offcore_response events are monitored simultaneously, the kernel needs to manage the sharing of that extra register. The offcore_response events are exposed as normal events by the library. The extra settings are exposed as regular umasks. The library takes care of encoding the events according to the underlying kernel interface. On Intel Knights Mill, the umasks are divided into 4 categories: request, supplier, snoop, and average latency. The offcore_response events have two modes of operation: normal and average latency. In the first mode, the two offcore_response events operate independently of each other.
The user must provide at least one umask for each of the first 3 categories: request, supplier, snoop. In the second mode, the two offcore_response events are combined to compute an average latency per request type. For the normal mode, there is a special supplier (response) umask called \fBANY_RESPONSE\fR. When this umask is used then it overrides any supplier and snoop umasks. In other words, users can specify either \fBANY_RESPONSE\fR \fBOR\fR any combinations of supplier + snoops. In case no supplier or snoop is specified, the library defaults to using \fBANY_RESPONSE\fR. For instance, the following are valid event selections: .TP .B OFFCORE_RESPONSE_0:DMND_DATA_RD:ANY_RESPONSE .TP .B OFFCORE_RESPONSE_0:ANY_REQUEST .TP .B OFFCORE_RESPONSE_0:ANY_RFO:DDR_NEAR .P But the following is illegal: .TP .B OFFCORE_RESPONSE_0:ANY_RFO:DDR_NEAR:ANY_RESPONSE .P In average latency mode, \fBOFFCORE_RESPONSE_0\fR must be programmed to select the request types of interest, for instance, \fBDMND_DATA_RD\fR, and the \fBOUTSTANDING\fR umask must be set and no others. The library will enforce that restriction as soon as the \fBOUTSTANDING\fR umask is used. Then \fBOFFCORE_RESPONSE_1\fR must be set with the same request types and the \fBANY_RESPONSE\fR umask. It should be noted that the library encodes events independently of each other and therefore cannot verify that the requests match between the two events. Example of average latency settings: .TP .B OFFCORE_RESPONSE_0:DMND_DATA_RD:OUTSTANDING+OFFCORE_RESPONSE_1:DMND_DATA_RD:ANY_RESPONSE .TP .B OFFCORE_RESPONSE_0:ANY_REQUEST:OUTSTANDING+OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE .P The average latency for the request(s) is obtained by dividing the counts of \fBOFFCORE_RESPONSE_0\fR by the count of \fBOFFCORE_RESPONSE_1\fR. The ratio is expressed in core cycles.
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_nhm.3000066400000000000000000000056421502707512200231520ustar00rootroot00000000000000.TH LIBPFM 3 "September, 2009" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_nhm - support for Intel Nehalem core PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: nhm .B PMU desc: Intel Nehalem .B PMU name: nhm_ex .B PMU desc: Intel Nehalem EX .sp .SH DESCRIPTION The library supports the Intel Nehalem core PMU. It should be noted that this PMU model only covers each core's PMU and not the socket level PMU, which is provided separately. Support is provided for the Intel Core i7 and Core i5 processors. .SH MODIFIERS The following modifiers are supported on Intel Nehalem processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (m) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B t Measure on both threads at the same time assuming hyper-threading is enabled. This is a boolean modifier. .TP .B ldlat Pass a latency threshold to the MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD event. This is an integer attribute that must be in the range [1:65535].
It is required for this event.
Note that the event must be used with precise sampling (PEBS).
.SH OFFCORE_RESPONSE event
The library is able to encode the OFFCORE_RESPONSE_0 event.
This is a special event because it needs a second MSR (0x1a6) to be programmed for the event to count properly.
Thus two values are necessary.
The first value can be programmed on any of the generic counters.
The second value goes into the dedicated MSR (0x1a6).
The OFFCORE_RESPONSE event is exposed as a normal event with several umasks which are divided into two groups: request and response.
The user must provide \fBat least\fR one umask from each group.
For instance, OFFCORE_RESPONSE_0:ANY_DATA:LOCAL_DRAM.
When using \fBpfm_get_event_encoding()\fR, two 64-bit values are returned.
The first value corresponds to what needs to be programmed into any of the generic counters.
The second value must be programmed into the dedicated MSR (0x1a6).
When using an OS-specific encoding routine, the encoding is OS-specific.
Refer to the corresponding man page for more information.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_nhm_unc.3000066400000000000000000000024351502707512200240150ustar00rootroot00000000000000
.TH LIBPFM 3 "September, 2009" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_nhm_unc \- support for Intel Nehalem uncore PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: nhm_unc
.B PMU desc: Intel Nehalem uncore
.sp
.SH DESCRIPTION
The library supports the Nehalem uncore PMU as implemented by processors such as Intel Core i7 and Intel Core i5.
The PMU is located at the socket level and is therefore shared between the various cores.
By construction, it can only measure at all privilege levels.
.SH MODIFIERS
The following modifiers are supported on Intel Nehalem processors:
.TP
.B i
Invert the meaning of the event.
The counter will now count cycles in which the event is \fBnot\fR occurring.
This is a boolean modifier.
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition.
This is a boolean modifier.
.TP
.B c
Set the counter mask value.
The mask acts as a threshold.
The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:255].
.TP
.B o
Causes the queue occupancy counter associated with the event to be cleared (zeroed).
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_p6.3000066400000000000000000000026521502707512200227140ustar00rootroot00000000000000
.TH LIBPFM 3 "September, 2009" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_p6 - support for Intel P6 based processors
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: pm, ppro, pii, piii, p6
.B PMU desc: Intel Pentium M, Intel Pentium Pro, Intel Pentium II, Intel Pentium III, Intel P6
.sp
.SH DESCRIPTION
The library supports all Intel P6-based processors all the way back to the Pentium Pro.
Although all those processors offer the same PMU architecture, they differ in the events they provide.
.SH MODIFIERS
The following modifiers are supported on all Intel P6 processors:
.TP
.B u
Measure at user level which includes privilege levels 1, 2, 3.
This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at kernel level which includes privilege level 0.
This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B i
Invert the meaning of the event.
The counter will now count cycles in which the event is \fBnot\fR occurring.
This is a boolean modifier.
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition.
This is a boolean modifier.
.TP
.B c
Set the counter mask value.
The mask acts as a threshold.
The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:255].
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_rapl.3000066400000000000000000000023101502707512200233140ustar00rootroot00000000000000
.TH LIBPFM 3 "November, 2013" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_rapl - support for Intel RAPL PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: rapl
.B PMU desc: Intel RAPL (Intel SandyBridge, IvyBridge, Haswell)
.sp
.SH DESCRIPTION
The library supports the Intel Running Average Power Limit (RAPL) energy consumption counters.
This is a socket-level set of counters which reports energy consumption in Joules.
There are up to 3 counters, each measuring only one event.
The following events are defined:
.TP
.B RAPL_ENERGY_CORES
On all processors, the event reports the number of Joules consumed by all cores.
.TP
.B RAPL_ENERGY_PKG
On all processors, the event reports the number of Joules consumed by all the cores and the Last Level cache (L3).
.TP
.B RAPL_ENERGY_DRAM
On server processors, the event reports the number of Joules consumed by the DRAM controller.
.P
By construction, the events are socket-level and can only be measured in system-wide mode.
It is necessary and sufficient to measure only one CPU per socket to get meaningful results.
.SH MODIFIERS
The PMU does not support any modifiers.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_skl.3000066400000000000000000000103651502707512200231570ustar00rootroot00000000000000
.TH LIBPFM 3 "August, 2015" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_skl - support for Intel SkyLake core PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: skl
.B PMU desc: Intel SkyLake
.sp
.SH DESCRIPTION
The library supports the Intel SkyLake core PMU.
It should be noted that this PMU model only covers each core's PMU and not the socket-level PMU.
On SkyLake, the number of generic counters depends on the Hyperthreading (HT) mode.
When HT is on, then only 4 generic counters are available.
When HT is off, then 8 generic counters are available.
The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters in \fBnum_cntrs\fR.
.SH MODIFIERS
The following modifiers are supported on Intel SkyLake processors:
.TP
.B u
Measure at user level which includes privilege levels 1, 2, 3.
This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at kernel level which includes privilege level 0.
This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B i
Invert the meaning of the event.
The counter will now count cycles in which the event is \fBnot\fR occurring.
This is a boolean modifier.
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence.
This modifier must be combined with a counter mask modifier (m) with a value greater or equal to one.
This is a boolean modifier.
.TP
.B c
Set the counter mask value.
The mask acts as a threshold.
The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:255].
.TP
.B t
Measure on both threads at the same time assuming hyper-threading is enabled.
This is a boolean modifier.
.TP
.B ldlat
Pass a latency threshold to the MEM_TRANS_RETIRED:LOAD_LATENCY event.
This is an integer attribute that must be in the range [1:65535].
It is required for this event.
Note that the event must be used with precise sampling (PEBS).
.TP
.B intx
Monitor the event only when executing inside a transactional memory region (in tx).
The event does not count otherwise.
This is a boolean modifier.
Default value is 0.
.TP
.B intxcp
Do not count occurrences of the event when they are inside an aborted transactional memory region.
This is a boolean modifier.
Default value is 0.
.TP
.B fe_thres
This modifier is for the FRONTEND_RETIRED event only.
It defines the period in core cycles after which the IDQ_*_BUBBLES umask counts.
It acts as a threshold, i.e., at least a period of N core cycles where the frontend did not deliver X uops.
It can only be used with the IDQ_*_BUBBLES umasks.
If not specified, the default threshold value is 1 cycle.
The valid values are in the range [1:4095].
.SH OFFCORE_RESPONSE events
Intel SkyLake provides two offcore_response events.
They are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1.
Those events need special treatment in the performance monitoring infrastructure because each event uses an extra register to store some settings.
Thus, in case multiple offcore_response events are monitored simultaneously, the kernel needs to manage the sharing of that extra register.
The offcore_response events are exposed as normal events by the library.
The extra settings are exposed as regular umasks.
The library takes care of encoding the events according to the underlying kernel interface.
On Intel SkyLake, the umasks are divided into three categories: request, supplier and snoop.
The user must provide at least one umask for each category.
The categories are shown in the umask descriptions.
There is also the special response umask called \fBANY_RESPONSE\fR.
When this umask is used, it overrides any supplier and snoop umasks.
In other words, users can specify either \fBANY_RESPONSE\fR \fBOR\fR any combination of supplier + snoop umasks.
In case no supplier or snoop is specified, the library defaults to using \fBANY_RESPONSE\fR.
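The ANY_RESPONSE combination rule above can be sketched as a small C check. This is a hypothetical illustration, not libpfm code; the umask-name prefixes used to recognize supplier and snoop umasks are simplified for the example:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not part of libpfm): applies the rule above to
 * an event string such as "OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY".
 * ANY_RESPONSE overrides suppliers and snoops, so combining it with a
 * supplier or snoop umask is rejected. */
static bool offcore_combo_valid(const char *evstr)
{
    bool any_response = false, supplier_or_snoop = false;
    char buf[256];

    strncpy(buf, evstr, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    for (char *tok = strtok(buf, ":"); tok; tok = strtok(NULL, ":")) {
        if (!strcmp(tok, "ANY_RESPONSE"))
            any_response = true;
        else if (!strncmp(tok, "LLC_", 4) || !strncmp(tok, "SNOOP_", 6))
            supplier_or_snoop = true; /* supplier or snoop umask */
    }
    return !(any_response && supplier_or_snoop);
}
```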
For instance, the following are valid event selections:
.TP
.B OFFCORE_RESPONSE_0:DMND_DATA_RD:ANY_RESPONSE
.TP
.B OFFCORE_RESPONSE_0:ANY_REQUEST
.TP
.B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY
.P
But the following are illegal:
.TP
.B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:ANY_RESPONSE
.TP
.B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY:ANY_RESPONSE
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_skx_unc_cha.3000066400000000000000000000066021502707512200246530ustar00rootroot00000000000000
.TH LIBPFM 3 "January, 2018" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_skx_unc_cha - support for Intel Skylake X Server CHA-Box uncore PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: skx_unc_cha[0-27]
.B PMU desc: Intel Skylake X CHA uncore PMU
.sp
.SH DESCRIPTION
The library supports the Intel Skylake X CHA-Box (coherency and home agent engine) uncore PMU.
There is one CHA-Box PMU per physical core.
Therefore there are up to twenty-eight identical CHA-Box PMU instances, numbered from 0 up to 27.
On dual-socket systems, the number refers to the CHA-Box PMU on the socket where the program runs.
For instance, if running on CPU18, then skx_unc_cha0 refers to the CHA-Box for physical core 0 on socket 1.
Conversely, if running on CPU0, then the same skx_unc_cha0 refers to the CHA-Box for physical core 0 but on socket 0.
Each CHA-Box PMU implements 4 generic counters and two filter registers used only with certain events and umasks.
The filters are accessed either via modifiers (see below) or via umasks, such as the opcode or cache state filter.
.SH MODIFIERS
The following modifiers are supported on the Intel Skylake CHA-Box uncore PMU:
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence.
This modifier must be combined with a threshold modifier (t) with a value greater or equal to one.
This is a boolean modifier.
.TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of C-Box cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B loc Match on local node target. This filter is only supported on UNC_C_TOR_INSERTS and UNC_C_TOR_OCCUPANCY. This is a boolean filter. .TP .B rem Match on remote node target. This filter is only supported on UNC_C_TOR_INSERTS and UNC_C_TOR_OCCUPANCY. This is a boolean filter. .TP .B lmem Match near memory cacheable. This filter is only supported on UNC_C_TOR_INSERTS and UNC_C_TOR_OCCUPANCY. This is a boolean filter. .TP .B rmem Match not near memory cacheable. This filter is only supported on UNC_C_TOR_INSERTS and UNC_C_TOR_OCCUPANCY. This is a boolean filter. .TP .B nc Match non-coherent requests. This filter is only supported on UNC_C_TOR_INSERTS and UNC_C_TOR_OCCUPANCY. This is a boolean filter. .TP .B isoc Match isochronous requests. This filter is only supported on UNC_C_TOR_INSERTS and UNC_C_TOR_OCCUPANCY. This is a boolean filter. .SH Opcode filtering Events UNC_C_TOR_INSERTS and UNC_C_TOR_OCCUPANCY support opcode matching. The processor implements two opcode filters. Both are used at the same time. The OPC0 umasks correspond to the first opcode matcher and OPC1 to the second opcode matcher. If only one opcode must be tracked then the unused filter will be set to 0. The opcode umasks must be used in combination with a specific queue umask otherwise the library will reject the event. The umask description shows which queue umask is required for each opcode. For instance, OPC0_RFO/OPC1_RFO require the IRQ queue and thus the IRQ umask. The opcode match umasks can be combined with other modifiers. 
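The opcode/queue pairing rule above can be sketched in C. This is an illustrative check only, not libpfm code; the event-string syntax follows libpfm's `pmu::EVENT:UMASK` form and the queue umask is simplified to IRQ for the example:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical sketch (not libpfm code): mirrors the rule above that
 * OPC0_ and OPC1_ opcode umasks on the UNC_C_TOR_* events are only
 * accepted together with a queue umask (here simplified to IRQ). */
static bool opcode_filter_ok(const char *evstr)
{
    bool has_opcode = strstr(evstr, ":OPC0_") || strstr(evstr, ":OPC1_");
    bool has_queue  = strstr(evstr, ":IRQ") != NULL; /* required queue umask */

    return !has_opcode || has_queue;
}
```

For example, `UNC_C_TOR_INSERTS:IRQ:OPC0_RFO` passes the check, while `UNC_C_TOR_INSERTS:OPC0_RFO` is rejected because the queue umask is missing.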
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_skx_unc_iio.3000066400000000000000000000034101502707512200246720ustar00rootroot00000000000000
.TH LIBPFM 3 "January, 2018" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_skx_unc_iio - support for Intel Skylake X Server IIO uncore PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: skx_unc_iio[0-5]
.B PMU desc: Intel Skylake X Server IIO uncore PMU
.sp
.SH DESCRIPTION
The library supports the Intel Skylake X server IIO uncore PMU.
The IIO PMU is used to analyze I/O traffic from the PCIe controllers.
.SH MODIFIERS
The following modifiers are supported on the Intel Skylake X server IIO uncore PMU:
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence.
This modifier must be combined with a threshold modifier (t) with a value greater or equal to one.
This is a boolean modifier.
.TP
.B t
Set the threshold value.
When set to a non-zero value, the counter counts the number of IIO cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:15].
.TP
.B i
Invert the meaning of the threshold or edge filter.
If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N.
When invert is set, then threshold must be set to a non-zero value.
If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier.
.SH Filters
Events UNC_IO_COMP_BUF_INSERTS, UNC_IO_DATA_REQ_BY_CPU, and UNC_IO_DATA_REQ_OF_CPU support additional filtering via a set of umasks tracking request completions.
These umasks start with an FC_ prefix.
The other filter is the port filter; it is hardcoded as umasks for the same events.
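The threshold (t) and invert (i) modifier semantics described above, which recur across the Skylake X uncore PMUs, can be modeled by a small C predicate. This is illustrative only and not part of libpfm:

```c
#include <stdbool.h>

/* Illustrative model (not libpfm code) of the t and i modifiers: with
 * a non-zero threshold N, the counter increments in cycles where the
 * event occurs at least N times; with invert also set, it increments
 * when strictly fewer than N occurrences happen.  Invert requires a
 * non-zero threshold, so threshold == 0 with invert never counts. */
static bool counter_increments(unsigned occurrences, unsigned threshold,
                               bool invert)
{
    if (threshold == 0)
        return occurrences > 0 && !invert;
    return invert ? (occurrences < threshold) : (occurrences >= threshold);
}
```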
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_skx_unc_imc.3000066400000000000000000000027241502707512200246700ustar00rootroot00000000000000
.TH LIBPFM 3 "January, 2018" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_skx_unc_imc - support for Intel Skylake X Server Integrated Memory Controller (IMC) uncore PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: skx_unc_imc[0-7]
.B PMU desc: Intel Skylake X Server IMC uncore PMU
.sp
.SH DESCRIPTION
The library supports the Intel Skylake X Server Integrated Memory Controller (IMC) uncore PMU.
.SH MODIFIERS
The following modifiers are supported on the Intel Skylake X server IMC uncore PMU:
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence.
This modifier must be combined with a threshold modifier (t) with a value greater or equal to one.
This is a boolean modifier.
.TP
.B t
Set the threshold value.
When set to a non-zero value, the counter counts the number of IMC cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:255].
.TP
.B i
Invert the meaning of the threshold or edge filter.
If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N.
When invert is set, then threshold must be set to a non-zero value.
If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_skx_unc_irp.3000066400000000000000000000026421502707512200247120ustar00rootroot00000000000000
.TH LIBPFM 3 "January, 2018" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_skx_unc_irp - support for Intel Skylake X Server IRP uncore PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: skx_unc_irp
.B PMU desc: Intel Skylake X Server IRP uncore PMU
.sp
.SH DESCRIPTION
The library supports the Intel Skylake X Server IRP (IIO coherency) uncore PMU.
.SH MODIFIERS
The following modifiers are supported on the Intel Skylake X server IRP uncore PMU:
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence.
This modifier must be combined with a threshold modifier (t) with a value greater or equal to one.
This is a boolean modifier.
.TP
.B t
Set the threshold value.
When set to a non-zero value, the counter counts the number of cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:255].
.TP
.B i
Invert the meaning of the threshold or edge filter.
If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N.
When invert is set, then threshold must be set to a non-zero value.
If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_skx_unc_m2m.3000066400000000000000000000026741502707512200246160ustar00rootroot00000000000000
.TH LIBPFM 3 "January, 2018" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_skx_unc_m2m - support for Intel Skylake X Server M2M uncore PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: skx_unc_m2m[0-1]
.B PMU desc: Intel Skylake X Server Mesh-2-Memory uncore PMU
.sp
.SH DESCRIPTION
The library supports the Intel Skylake X server Mesh-to-Memory (M2M) uncore PMU.
.SH MODIFIERS
The following modifiers are supported on the Intel Skylake X server M2M uncore PMU:
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence.
This modifier must be combined with a threshold modifier (t) with a value greater or equal to one.
This is a boolean modifier.
.TP
.B t
Set the threshold value.
When set to a non-zero value, the counter counts the number of M2M cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:15].
.TP
.B i
Invert the meaning of the threshold or edge filter.
If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N.
When invert is set, then threshold must be set to a non-zero value.
If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_skx_unc_m3upi.3000066400000000000000000000026501502707512200251540ustar00rootroot00000000000000
.TH LIBPFM 3 "January, 2018" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_skx_unc_m3upi - support for Intel Skylake X Server M3UPI uncore PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: skx_unc_m3upi[0-2]
.B PMU desc: Intel Skylake X server M3UPI uncore PMU
.sp
.SH DESCRIPTION
The library supports the Intel Skylake X server M3UPI uncore PMU.
.SH MODIFIERS
The following modifiers are supported on the Intel Skylake X server M3UPI uncore PMU:
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence.
This modifier must be combined with a threshold modifier (t) with a value greater or equal to one.
This is a boolean modifier.
.TP
.B t
Set the threshold value.
When set to a non-zero value, the counter counts the number of M3UPI cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:15].
.TP
.B i
Invert the meaning of the threshold or edge filter.
If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N.
When invert is set, then threshold must be set to a non-zero value.
If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_skx_unc_pcu.3000066400000000000000000000040041502707512200247010ustar00rootroot00000000000000
.TH LIBPFM 3 "January, 2018" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_skx_unc_pcu - support for Intel Skylake X Power Controller Unit (PCU) uncore PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: skx_unc_pcu
.B PMU desc: Intel Skylake X Server PCU uncore PMU
.sp
.SH DESCRIPTION
The library supports the Intel Skylake X Server Power Controller Unit uncore PMU.
.SH MODIFIERS
The following modifiers are supported on the Intel Skylake X server PCU uncore PMU:
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence.
This modifier must be combined with a threshold modifier (t) with a value greater or equal to one.
This is a boolean modifier.
.TP
.B t
Set the threshold value.
When set to a non-zero value, the counter counts the number of PCU cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:15].
.TP
.B i
Invert the meaning of the threshold or edge filter.
If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N.
When invert is set, then threshold must be set to a non-zero value.
If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier.
.SH Frequency band filtering
There are 4 events which support frequency band filtering, namely: UNC_P_FREQ_BAND0_CYCLES, UNC_P_FREQ_BAND1_CYCLES, UNC_P_FREQ_BAND2_CYCLES, UNC_P_FREQ_BAND3_CYCLES.
The frequency filter (available via the ff modifier) is stored in a shared PMU register which holds all 4 possible frequency bands, one per event.
However, the library generates the encoding for each event individually because it processes events one at a time.
The caller or the underlying kernel interface may have to merge the band filter settings to program the filter register properly.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_skx_unc_ubo.3000066400000000000000000000026671502707512200247120ustar00rootroot00000000000000
.TH LIBPFM 3 "January, 2018" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_skx_unc_ubo - support for Intel Skylake X Server U-Box uncore PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: skx_unc_ubo
.B PMU desc: Intel Skylake X Server U-Box uncore PMU
.sp
.SH DESCRIPTION
The library supports the Intel Skylake X server system configuration unit (U-Box) uncore PMU.
.SH MODIFIERS
The following modifiers are supported on the Intel Skylake X server U-Box uncore PMU:
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence.
This modifier must be combined with a threshold modifier (t) with a value greater or equal to one.
This is a boolean modifier.
.TP
.B t
Set the threshold value.
When set to a non-zero value, the counter counts the number of U-Box cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:15].
.TP
.B i
Invert the meaning of the threshold or edge filter.
If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N.
When invert is set, then threshold must be set to a non-zero value.
If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_skx_unc_upi.3000066400000000000000000000036521502707512200247170ustar00rootroot00000000000000
.TH LIBPFM 3 "January, 2018" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_skx_unc_upi - support for Intel Skylake X Server UPI uncore PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: skx_unc_upi[0-2]
.B PMU desc: Intel Skylake X Server UPI uncore PMU
.sp
.SH DESCRIPTION
The library supports the Intel Skylake X Server UPI uncore PMU.
.SH MODIFIERS
The following modifiers are supported on the Intel Skylake X server UPI uncore PMU:
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence.
This modifier must be combined with a threshold modifier (t) with a value greater or equal to one.
This is a boolean modifier.
.TP
.B t
Set the threshold value.
When set to a non-zero value, the counter counts the number of UPI cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:255].
.TP
.B i
Invert the meaning of the threshold or edge filter.
If set, the event counts when strictly less than N occurrences occur per cycle if threshold is set to N.
When invert is set, then threshold must be set to a non-zero value.
If set, the event counts when the event transitions from occurring to not occurring (falling edge) when edge detection is set.
This is a boolean modifier.
.SH BASIC_HDR_MATCH events
The library also supports the special \fBRXL_BASIC_HDR_MATCH\fR and \fBTXL_BASIC_HDR_MATCH\fR opcode matcher events.
These events have many more filters available, in the form of either a modifier (listed below) or specific umasks.
The following modifiers are available in addition to the standard ones listed above:
.TP
.B rcsnid
Specify a RCS Node identifier as an integer in the range [0-15].
Default: 0
.TP
.B dnid
Specify a destination Node identifier as an integer in the range [0-15].
Default: 0
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_slm.3000066400000000000000000000057161502707512200231660ustar00rootroot00000000000000
.TH LIBPFM 3 "November, 2013" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_slm - support for Intel Silvermont core PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: slm
.B PMU desc: Intel Silvermont
.sp
.SH DESCRIPTION
The library supports the Intel Silvermont core PMU.
.SH MODIFIERS
The following modifiers are supported on Intel Silvermont processors:
.TP
.B u
Measure at user level which includes privilege levels 1, 2, 3.
This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at kernel level which includes privilege level 0.
This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B i
Invert the meaning of the event.
The counter will now count cycles in which the event is \fBnot\fR occurring.
This is a boolean modifier.
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence.
This modifier must be combined with a counter mask modifier (m) with a value greater or equal to one.
This is a boolean modifier.
.TP
.B c
Set the counter mask value.
The mask acts as a threshold.
The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:255].
.SH OFFCORE_RESPONSE events
Intel Silvermont provides two offcore_response events: \fBOFFCORE_RESPONSE_0\fR and \fBOFFCORE_RESPONSE_1\fR.
Those events need special treatment in the performance monitoring infrastructure because each event uses an extra register to store some settings.
Thus, in case multiple offcore_response events are monitored simultaneously, the kernel needs to manage the sharing of that extra register.
The offcore_response events are exposed as normal events by the library.
The extra settings are exposed as regular umasks.
The library takes care of encoding the events according to the underlying kernel interface.
On Intel Silvermont, the umasks are divided into three categories: request, supplier and snoop.
The user must provide at least one umask for each category.
The categories are shown in the umask descriptions.
The library provides a default umask per category if not provided by the user.
There is also the special response umask called \fBANY_RESPONSE\fR.
When this umask is used, it overrides any supplier and snoop umasks.
In other words, users can specify either \fBANY_RESPONSE\fR \fBOR\fR any combination of supplier + snoop umasks.
In case no supplier or snoop is specified, the library defaults to using \fBANY_RESPONSE\fR.
For instance, the following are valid event selections:
.TP
.B OFFCORE_RESPONSE_0:DMND_DATA_RD:ANY_RESPONSE
.TP
.B OFFCORE_RESPONSE_0:ANY_REQUEST
.TP
.B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY
.P
But the following are illegal:
.TP
.B OFFCORE_RESPONSE_0:ANY_RFO:NON_DRAM:ANY_RESPONSE
.TP
.B OFFCORE_RESPONSE_0:ANY_RFO:L2_HIT:SNOOP_ANY:ANY_RESPONSE
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_snb.3000066400000000000000000000074741502707512200231560ustar00rootroot00000000000000
.TH LIBPFM 3 "January, 2011" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_snb - support for Intel Sandy Bridge core PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: snb
.B PMU desc: Intel Sandy Bridge
.B PMU name: snb_ep
.B PMU desc: Intel Sandy Bridge EP
.sp
.SH DESCRIPTION
The library supports the Intel Sandy Bridge core PMU.
It should be noted that this PMU model only covers each core's PMU and not the socket-level PMU.
For that refer to the Sandy Bridge uncore PMU support.
On Sandy Bridge, the number of generic counters depends on the Hyperthreading (HT) mode.
When HT is on, then only 4 generic counters are available.
When HT is off, then 8 generic counters are available.
The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters in \fBnum_cntrs\fR.
.SH MODIFIERS
The following modifiers are supported on Intel Sandy Bridge processors:
.TP
.B u
Measure at user level which includes privilege levels 1, 2, 3.
This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at kernel level which includes privilege level 0.
This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B i
Invert the meaning of the event.
The counter will now count cycles in which the event is \fBnot\fR occurring.
This is a boolean modifier.
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence.
This modifier must be combined with a counter mask modifier (m) with a value greater or equal to one.
This is a boolean modifier.
.TP
.B c
Set the counter mask value.
The mask acts as a threshold.
The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:255].
.TP
.B t
Measure on both threads at the same time assuming hyper-threading is enabled.
This is a boolean modifier.
.TP
.B ldlat
Pass a latency threshold to the MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD event.
This is an integer attribute that must be in the range [1:65535].
It is required for this event.
Note that the event must be used with precise sampling (PEBS).
.SH OFFCORE_RESPONSE events
Intel Sandy Bridge provides two offcore_response events, like Intel Westmere.
They are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1.
Those events need special treatment in the performance monitoring infrastructure because each event uses an extra register to store some settings. Thus, in case multiple offcore_response events are monitored simultaneously, the kernel needs to manage the sharing of that extra register. The offcore_response events are exposed as normal events by the library. The extra settings are exposed as regular umasks. The library takes care of encoding the events according to the underlying kernel interface. On Intel Sandy Bridge, the umasks are divided into three categories: request, supplier and snoop. The user must provide at least one umask for each category. The categories are shown in the umask descriptions. There is also the special response umask called \fBANY_RESPONSE\fR. When this umask is used, it overrides any supplier and snoop umasks. In other words, users can specify either \fBANY_RESPONSE\fR \fBOR\fR any combination of supplier + snoop umasks. In case no supplier or snoop is specified, the library defaults to using \fBANY_RESPONSE\fR.
For instance, the following are valid event selections: .TP .B OFFCORE_RESPONSE_0:DMND_DATA_RD:ANY_RESPONSE .TP .B OFFCORE_RESPONSE_0:ANY_REQUEST .TP .B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY .P But the following are illegal: .TP .B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:ANY_RESPONSE .TP .B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY:ANY_RESPONSE .SH SEE ALSO libpfm_snb_unc(3) .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_snb_unc.3000066400000000000000000000036131502707512200240140ustar00rootroot00000000000000.TH LIBPFM 3 "August, 2012" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_snb_unc - support for Intel Sandy Bridge uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: snb_unc_cbo0, snb_unc_cbo1, snb_unc_cbo2, snb_unc_cbo3 .B PMU desc: Intel Sandy Bridge C-box uncore .sp .SH DESCRIPTION The library supports the Intel Sandy Bridge client part (model 42) uncore PMU. The support is currently limited to the Coherency Box, the so-called C-Box, for up to 4 physical cores. Each physical core has an associated C-Box which it uses to communicate with the L3 cache. The C-boxes all support the same set of events. However, the Core 0 C-box (snb_unc_cbo0) supports an additional uncore clock ticks event: \fBUNC_CLOCKTICKS\fR. .SH MODIFIERS The following modifiers are supported on Intel Sandy Bridge C-Box uncore PMU: .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (c) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold.
This is an integer modifier with values in the range [0:255]. .P Both the \fBUNC_CBO_CACHE_LOOKUP\fR and \fBUNC_CBO_XSNP_RESPONSE\fR events require two umasks to be valid. For \fBUNC_CBO_CACHE_LOOKUP\fR the first umask must be one of the MESI state umasks, the second has to be one of the filters. For \fBUNC_CBO_XSNP_RESPONSE\fR the first umask must be one of the snoop types, the second has to be one of the filters. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_snbep_unc_cbo.3000066400000000000000000000064321502707512200251660ustar00rootroot00000000000000.TH LIBPFM 3 "August, 2012" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_snbep_unc_cbo - support for Intel Sandy Bridge-EP C-Box uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: snbep_unc_cbo[0-7] .B PMU desc: Intel Sandy Bridge-EP C-Box uncore PMU .sp .SH DESCRIPTION The library supports the Intel Sandy Bridge C-Box (coherency engine) uncore PMU. This PMU model only exists on Sandy Bridge model 45. There is one C-box PMU per physical core. Therefore there are eight identical C-Box PMU instances numbered from 0 to 7. On dual-socket systems, the number refers to the C-Box PMU on the socket where the program runs. For instance, if running on CPU8, then snbep_unc_cbo0 refers to the C-Box for physical core 0 on socket 1. Conversely, if running on CPU0, then the same snbep_unc_cbo0 refers to the C-Box for physical core 0 but on socket 0. Each C-Box PMU implements 4 generic counters and a filter register used only with certain events and umasks. .SH MODIFIERS The following modifiers are supported on Intel Sandy Bridge C-Box uncore PMU: .TP .B i Invert the meaning of the event. The counter will now count C-Box cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence.
This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of C-Box cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B nf Node filter. Certain events, such as UNC_C_LLC_LOOKUP, UNC_C_LLC_VICTIMS, provide a \fBNID\fR umask. Sometimes the \fBNID\fR is combined with other filtering capabilities, such as opcodes. The node filter is a bitmask of up to 8 bits. A node corresponds to a processor socket. The legal values therefore depend on the underlying hardware configuration. For dual-socket systems, the bitmask has two valid bits [0:1]. .TP .B cf Core Filter. This is a 3-bit filter which is used to filter based on the physical core origin of the C-Box request. Possible values are 0-7. If the filter is not specified, then no filtering takes place. .TP .B tf Thread Filter. This is a 1-bit filter which is used to filter C-Box requests based on logical processor (hyper-thread) identification. Possible values are 0-1. If the filter is not specified, then no filtering takes place. .SH Opcode filtering Certain events, such as UNC_C_TOR_INSERTS, support opcode matching on the C-Box transaction type. To use this feature, first an opcode matching umask must be selected, e.g., MISS_OPCODE. Second, the opcode to match on must be selected via a second umask among the OPC_* umasks. For instance, UNC_C_TOR_INSERTS:OPCODE:OPC_RFO counts the number of TOR insertions for RFO transactions. Opcode matching may be combined with node filtering with certain umasks. In general, the filtering support is encoded into the umask name, e.g., NID_OPCODE supports both node and opcode filtering. For instance, UNC_C_TOR_INSERTS:NID_OPCODE:OPC_RFO:nf=1.
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_snbep_unc_ha.3000066400000000000000000000024371502707512200250140ustar00rootroot00000000000000.TH LIBPFM 3 "August, 2012" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_snbep_unc_ha - support for Intel Sandy Bridge-EP Home Agent (HA) uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: snbep_unc_ha .B PMU desc: Intel Sandy Bridge-EP HA uncore PMU .sp .SH DESCRIPTION The library supports the Intel Sandy Bridge Home Agent (HA) uncore PMU. This PMU model only exists on Sandy Bridge model 45. There is only one Home Agent per processor socket. .SH MODIFIERS The following modifiers are supported on Intel Sandy Bridge HA uncore PMU: .TP .B i Invert the meaning of the event. The counter will now count HA cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of HA cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255].
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_snbep_unc_imc.3000066400000000000000000000025011502707512200251640ustar00rootroot00000000000000.TH LIBPFM 3 "August, 2012" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_snbep_unc_imc - support for Intel Sandy Bridge-EP Integrated Memory Controller (IMC) uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: snbep_unc_imc[0-3] .B PMU desc: Intel Sandy Bridge-EP IMC uncore PMU .sp .SH DESCRIPTION The library supports the Intel Sandy Bridge Integrated Memory Controller (IMC) uncore PMU. This PMU model only exists on Sandy Bridge model 45. There are four IMC PMUs per socket. .SH MODIFIERS The following modifiers are supported on Intel Sandy Bridge IMC uncore PMU: .TP .B i Invert the meaning of the event. The counter will now count IMC cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of IMC cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255].
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_snbep_unc_pcu.3000066400000000000000000000043461502707512200252140ustar00rootroot00000000000000.TH LIBPFM 3 "August, 2012" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_snbep_unc_pcu - support for Intel Sandy Bridge-EP Power Controller Unit (PCU) uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: snbep_unc_pcu .B PMU desc: Intel Sandy Bridge-EP PCU uncore PMU .sp .SH DESCRIPTION The library supports the Intel Sandy Bridge Power Controller Unit uncore PMU. This PMU model only exists on Sandy Bridge model 45. There is only one PCU PMU per processor socket. .SH MODIFIERS The following modifiers are supported on Intel Sandy Bridge PCU uncore PMU: .TP .B i Invert the meaning of the event. The counter will now count PCU cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of PCU cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .TP .B ff Enable frequency band filtering. This modifier applies only to the UNC_P_FREQ_BANDx_CYCLES events, where x is [0-3]. The modifier expects an integer in the range [0-255]. The value is interpreted as a frequency value to be multiplied by 100MHz. Thus if the value is 32, then all cycles where the processor is running at 3.2GHz and more are counted. .SH Frequency band filtering There are 4 events which support frequency band filtering, namely, UNC_P_FREQ_BAND0_CYCLES, UNC_P_FREQ_BAND1_CYCLES, UNC_P_FREQ_BAND2_CYCLES, UNC_P_FREQ_BAND3_CYCLES.
The frequency filter (available via the ff modifier) is stored into a PMU shared register which holds all 4 possible frequency bands, one per event. However, the library generates the encoding for each event individually because it processes events one at a time. The caller or the underlying kernel interface may have to merge the band filter settings to program the filter register properly. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_snbep_unc_qpi.3000066400000000000000000000024331502707512200252110ustar00rootroot00000000000000.TH LIBPFM 3 "August, 2012" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_snbep_unc_qpi - support for Intel Sandy Bridge-EP QPI uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: snbep_unc_qpi0, snbep_unc_qpi1 .B PMU desc: Intel Sandy Bridge-EP QPI uncore PMU .sp .SH DESCRIPTION The library supports the Intel Sandy Bridge QPI uncore PMU. This PMU model only exists on Sandy Bridge model 45. There are two QPI PMUs per processor socket. .SH MODIFIERS The following modifiers are supported on Intel Sandy Bridge QPI uncore PMU: .TP .B i Invert the meaning of the event. The counter will now count QPI cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of QPI cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255].
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_snbep_unc_r2pcie.3000066400000000000000000000024471502707512200256110ustar00rootroot00000000000000.TH LIBPFM 3 "August, 2012" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_snbep_unc_r2pcie - support for Intel Sandy Bridge-EP R2 PCIe uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: snbep_unc_r2pcie .B PMU desc: Intel Sandy Bridge-EP R2 PCIe uncore PMU .sp .SH DESCRIPTION The library supports the Intel Sandy Bridge R2 PCIe uncore PMU. This PMU model only exists on Sandy Bridge model 45. There is only one R2PCIe PMU per processor socket. .SH MODIFIERS The following modifiers are supported on Intel Sandy Bridge R2PCIe uncore PMU: .TP .B i Invert the meaning of the event. The counter will now count R2 PCIe cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of R2PCIe cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_snbep_unc_r3qpi.3000066400000000000000000000024511502707512200254560ustar00rootroot00000000000000.TH LIBPFM 3 "August, 2012" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_snbep_unc_r3qpi - support for Intel Sandy Bridge-EP R3QPI uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: snbep_unc_r3qpi0, snbep_unc_r3qpi1 .B PMU desc: Intel Sandy Bridge-EP R3QPI uncore PMU .sp .SH DESCRIPTION The library supports the Intel Sandy Bridge R3QPI uncore PMU.
This PMU model only exists on Sandy Bridge model 45. There are two R3QPI PMUs per processor socket. .SH MODIFIERS The following modifiers are supported on Intel Sandy Bridge R3QPI uncore PMU: .TP .B i Invert the meaning of the event. The counter will now count R3QPI cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of R3QPI cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_snbep_unc_ubo.3000066400000000000000000000053361502707512200252100ustar00rootroot00000000000000.TH LIBPFM 3 "August, 2012" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_snbep_unc_ubo - support for Intel Sandy Bridge-EP U-Box uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: snbep_unc_ubo .B PMU desc: Intel Sandy Bridge-EP U-Box uncore PMU .sp .SH DESCRIPTION The library supports the Intel Sandy Bridge system configuration unit (U-Box) uncore PMU. This PMU model only exists on Sandy Bridge model 45. There is only one U-Box PMU per processor socket. .SH MODIFIERS The following modifiers are supported on Intel Sandy Bridge U-Box uncore PMU: .TP .B i Invert the meaning of the event. The counter will now count U-Box cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence.
This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of U-Box cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:15]. .TP .B oi Invert the meaning of the occupancy event POWER_STATE_OCCUPANCY. The counter will now count PCU cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B oe Enable edge detection for the occupancy event POWER_STATE_OCCUPANCY. The event now counts only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B ff Enable frequency band filtering. This modifier applies only to the UNC_P_FREQ_BANDx_CYCLES events, where x is [0-3]. The modifier expects an integer in the range [0-255]. The value is interpreted as a frequency value to be multiplied by 100MHz. Thus if the value is 32, then all cycles where the processor is running at 3.2GHz and more are counted. .SH Frequency band filtering There are 4 events which support frequency band filtering, namely, UNC_P_FREQ_BAND0_CYCLES, UNC_P_FREQ_BAND1_CYCLES, UNC_P_FREQ_BAND2_CYCLES, UNC_P_FREQ_BAND3_CYCLES. The frequency filter (available via the ff modifier) is stored into a PMU shared register which holds all 4 possible frequency bands, one per event. However, the library generates the encoding for each event individually because it processes events one at a time. The caller or the underlying kernel interface may have to merge the band filter settings to program the filter register properly.
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_spr.3000066400000000000000000000123561502707512200231750ustar00rootroot00000000000000.TH LIBPFM 3 "April, 2022" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_spr - support for Intel SapphireRapids core PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: spr .B PMU desc: Intel SapphireRapids .sp .SH DESCRIPTION The library supports the Intel SapphireRapids core PMU. It should be noted that this PMU model only covers each core's PMU and not the socket level PMU. On SapphireRapids, the number of generic counters depends on the Hyperthreading (HT) mode. The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters in \fBnum_cntrs\fR. .SH MODIFIERS The following modifiers are supported on Intel SapphireRapids processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (c) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B ldlat Pass a latency threshold to the MEM_TRANS_RETIRED:LOAD_LATENCY event. This is an integer attribute that must be in the range [1:65535]. It is required for this event.
Note that the event must be used with precise sampling (PEBS). .TP .B intx Monitor the event only when executing inside a transactional memory region (in tx). Event does not count otherwise. This is a boolean modifier. Default value is 0. .TP .B intxcp Do not count occurrences of the event when they are inside an aborted transactional memory region. This is a boolean modifier. Default value is 0. .TP .B fe_thres This modifier is for the FRONTEND_RETIRED event only. It defines the period in core cycles after which the IDQ_*_BUBBLES umask counts. It acts as a threshold, i.e., at least a period of N core cycles where the frontend did not deliver X uops. It can only be used with the IDQ_*_BUBBLES umasks. If not specified, the default threshold value is 1 cycle. The valid values are in [1-4095]. .SH OFFCORE_RESPONSE events Intel SapphireRapids supports two encodings for offcore_response events. In the library, these are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1. Those events need special treatment in the performance monitoring infrastructure because each event uses an extra register to store some settings. Thus, in case multiple offcore_response events are monitored simultaneously, the operating system needs to manage the sharing of that extra register. The offcore_response events are exposed as normal events by the library. The extra settings are exposed as regular umasks. The library takes care of encoding the events according to the underlying kernel interface. On Intel SapphireRapids, unlike older processors, the event is treated as a regular event with a flat set of umasks to choose from. It is not possible to combine the various request, supplier, and snoop bits anymore. Therefore, the library offers the list of validated combinations as per Intel's official event list. .SH Topdown via PERF_METRICS Intel SapphireRapids supports the PERF_METRICS MSR which provides support for Topdown Level 1 and 2 via a single PMU counter.
This special counter provides percentages of slots for each metric. This feature must be used in conjunction with fixed counter 3 which counts SLOTS in order to work properly. The Linux kernel exposes PERF_METRICS metrics as individual pseudo events counting in slots units; however, to operate correctly, all events must be programmed together. The Linux kernel requires all PERF_METRICS events to be programmed as a single event group with SLOTS required as the first event. Example: '{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}'. Libpfm4 provides access to the PERF_METRICS pseudo events via a dedicated event called \fBTOPDOWN_M\fR. This event uses the pseudo encodings assigned by the Linux kernel to PERF_METRICS pseudo events. Using these encodings ensures the kernel detects them as targeting the PERF_METRICS MSR. Note that libpfm4 only provides the encodings and that it is up to the user on Linux to group them and order them properly for the perf_events interface. There exist generic counter encodings for most of the Topdown metrics and libpfm4 provides support for those via the \fBTOPDOWN\fR event. Note that all subevents of \fBTOPDOWN_M\fR use fixed counters which have, by definition, no actual event codes. The library uses the Linux pseudo event codes for them, even when compiled on non-Linux operating systems. The same holds true for any fixed counters pseudo event exported by libpfm4.
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_spr_unc_cha.3000066400000000000000000000026711502707512200246540ustar00rootroot00000000000000.TH LIBPFM 3 "January, 2024" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_spr_unc_cha - support for Intel SapphireRapids Server CHA uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: spr_unc_cha[0-59] .B PMU desc: Intel SapphireRapids CHA uncore PMU .sp .SH DESCRIPTION The library supports the Intel SapphireRapids CHA (coherency and home agent) uncore PMU. There is one CHA PMU per physical core. Therefore there are up to sixty identical CHA PMU instances numbered from 0 up to possibly 59. On dual-socket systems, the number refers to the CHA PMUs on the socket where the program runs. Each CHA PMU implements 4 generic counters. .SH MODIFIERS The following modifiers are supported on Intel SapphireRapids CHA uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of CHA clockticks in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the threshold (t) test from strictly greater than to less or equal to. This is a boolean modifier.
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_spr_unc_imc.3000066400000000000000000000023321502707512200246630ustar00rootroot00000000000000.TH LIBPFM 3 "January, 2024" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_spr_unc_imc - support for Intel SapphireRapids Server Integrated Memory Controller (IMC) uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: spr_unc_imc[0-11] .B PMU desc: Intel SapphireRapids Server IMC uncore PMU .sp .SH DESCRIPTION The library supports the Intel SapphireRapids Server Integrated Memory Controller (IMC) uncore PMU. .SH MODIFIERS The following modifiers are supported on Intel SapphireRapids server IMC uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of IMC cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the threshold (t) test from strictly greater than to less or equal to. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_spr_unc_upi.3000066400000000000000000000023171502707512200247130ustar00rootroot00000000000000.TH LIBPFM 3 "January, 2024" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_spr_unc_upi - support for Intel SapphireRapids Server UPI uncore PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: spr_unc_upi[0-3] .B PMU desc: Intel SapphireRapids UPI uncore PMU .sp .SH DESCRIPTION The library supports the Intel SapphireRapids UPI (Ultra Path Interconnect) uncore PMU. Each UPI PMU implements 4 generic counters.
.SH MODIFIERS The following modifiers are supported on Intel SapphireRapids UPI uncore PMU: .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a threshold modifier (t) with a value greater or equal to one. This is a boolean modifier. .TP .B t Set the threshold value. When set to a non-zero value, the counter counts the number of UPI clockticks in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B i Invert the threshold (t) test from strictly greater than to less or equal to. This is a boolean modifier. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_tmt.3000066400000000000000000000043601502707512200231710ustar00rootroot00000000000000.TH LIBPFM 3 "March, 2020" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_tmt - support for Intel Tremont core PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: tmt .B PMU desc: Intel Tremont .sp .SH DESCRIPTION The library supports the Intel Tremont core PMU. .SH MODIFIERS The following modifiers are supported on Intel Tremont processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier. .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (c) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value.
The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .SH OFFCORE_RESPONSE events Intel Tremont provides two offcore_response events: \fBOFFCORE_RESPONSE_0\fR and \fBOFFCORE_RESPONSE_1\fR. The \fBOCR\fR event is aliased to \fBOFFCORE_RESPONSE_0\fR. Those events need special treatment in the performance monitoring infrastructure because each event uses an extra register to store some settings. Thus, in case multiple offcore_response events are monitored simultaneously, the kernel needs to manage the sharing of that extra register. The offcore_response event is exposed as a normal event by the library. The extra settings are exposed as regular umasks. The library takes care of encoding the events according to the underlying kernel interface. On Intel Tremont, it is not possible to combine the request, supplier, and snoop fields anymore, to avoid invalid combinations. As such, the umasks provided by the library are the only ones supported and validated. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_wsm.3000066400000000000000000000060401502707512200231700ustar00rootroot00000000000000.TH LIBPFM 3 "September, 2009" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_wsm - support for Intel Westmere core PMU .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .B PMU name: wsm .B PMU desc: Intel Westmere .B PMU name: wsm_dp .B PMU desc: Intel Westmere DP .sp .SH DESCRIPTION The library supports the Intel Westmere core PMU. It should be noted that this PMU model only covers each core's PMU and not the socket level PMU. It is provided separately. Support is provided for the Intel Core i7 and Core i5 processors (models 37, 44). .SH MODIFIERS The following modifiers are supported on Intel Westmere processors: .TP .B u Measure at user level which includes privilege levels 1, 2, 3.
This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier .TP .B e Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event to at least one occurrence. This modifier must be combined with a counter mask modifier (m) with a value greater or equal to one. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B t Measure on both threads at the same time assuming hyper-threading is enabled. This is a boolean modifier. .TP .B ldlat Pass a latency threshold to the MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD event. This is an integer attribute that must be in the range [1:65535]. It is required for this event. Note that the event must be used with precise sampling (PEBS). .SH OFFCORE_RESPONSE events The library is able to encode the OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1 events. Those are special events because they, each, need a second MSR (0x1a6 and 0x1a7 respectively) to be programmed for the event to count properly. Thus two values are necessary for each event. The first value can be programmed on any of the generic counters. The second value goes into the dedicated MSR (0x1a6 or 0x1a7). The OFFCORE_RESPONSE events are exposed as normal events with several umasks which are divided in two groups: request and response. The user must provide \fBat least\fR one umask from each group. For instance, OFFCORE_RESPONSE_0:ANY_DATA:LOCAL_DRAM. When using \fBpfm_get_event_encoding()\fR, two 64-bit values are returned. 
The first value corresponds to what needs to be programmed into any of the generic counters. The second value must be programmed into the corresponding dedicated MSR (0x1a6 or 0x1a7). When using an OS-specific encoding routine, the way the event is encoded is OS specific. Refer to the corresponding man page for more information. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_wsm_unc.3000066400000000000000000000024651502707512200240440ustar00rootroot00000000000000.TH LIBPFM 3 "February, 2010" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_wsm_unc \- support for Intel Westmere uncore PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: wsm_unc .B PMU desc: Intel Westmere uncore .sp .SH DESCRIPTION The library supports the Intel Westmere uncore PMU as implemented by processors such as Intel Core i7, and Intel Core i5 (models 37, 44). The PMU is located at the socket-level and is therefore shared between the various cores. By construction it can only measure at all privilege levels. .SH MODIFIERS The following modifiers are supported on Intel Westmere processors: .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier .TP .B e Enable edge detection, i.e., count only when there is a state transition. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B o Causes the queue occupancy counter associated with the event to be cleared (zeroed). This is a boolean modifier. 
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_intel_x86_arch.3000066400000000000000000000033661502707512200240140ustar00rootroot00000000000000.TH LIBPFM 3 "September, 2009" "" "Linux Programmer's Manual" .SH NAME libpfm_intel_x86_arch - support for Intel X86 architectural PMU .SH SYNOPSIS .nf .B #include .sp .B PMU name: ix86arch .B PMU desc: Intel X86 architectural PMU .sp .SH DESCRIPTION The library supports \fbany\fR processor implementing the Intel architectural PMU. This is a minimal PMU with a variable number of counters but predefined set of events. It is implemented in all recent processors starting with Intel Core Duo/Core Solo. It acts as a default PMU support in case the library is run on a very recent processor for which the specific support has not yet been implemented. .SH MODIFIERS The following modifiers are supported on Intel architectural PMU: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B i Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR occurring. This is a boolean modifier .TP .B e Enable edge detection, i.e., count only when there is a state transition. This is a boolean modifier. .TP .B c Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles in which the number of occurrences of the event is greater or equal to the threshold. This is an integer modifier with values in the range [0:255]. .TP .B t Measure on both threads at the same time assuming hyper-threading is enabled. This modifier requires at least version 3 of the architectural PMU. This is a boolean modifier. 
.SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_mips_74k.3000066400000000000000000000036301502707512200226260ustar00rootroot00000000000000.TH LIBPFM 3 "September, 2011" "" "Linux Programmer's Manual" .SH NAME libpfm_mips_74k - support for MIPS 74k processors .SH SYNOPSIS .nf .B #include .sp .B PMU name: mips_74k .B PMU desc: MIPS 74k .sp .SH DESCRIPTION The library supports MIPS 74k processors in big or little endian modes. .SH ENCODINGS On this processor, what is measured by an event depends on the event code and on the counter it is programmed on. Usually the meaning of the event code changes between odd and even indexed counters. For instance, event code \fB0x2\fR means 'PREDICTED_JR31' when programmed on even-indexed counters and it means 'JR_31_MISPREDICTIONS' when programmed on odd-indexed counters. To correctly measure an event, one needs both the event encoding and a list of possible counters. When \fRpfm_get_os_event_encoding()\fR is used with \fBPFM_OS_NONE\fR to return the raw PMU encoding, the library returns two values: the event encoding as per the architecture manual and a bitmask of valid counters to program it on. For instance, for 'JR_31_MISPREDICTIONS' The library returns codes[0] = 0x4a, codes[1]= 0xa (supported on counter 1, 3). The encoding for a specific kernel interface may vary and is handled internally by the library. .SH MODIFIERS The following modifiers are supported on MIPS 74k. .TP .B u Measure at user level. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B e Measure at exception level. This corresponds to \fBPFM_PLM2\fR. This is a boolean modifier. .TP .B s Measure at supervisor level. This corresponds to \fBPFM_PLM1\fR. This is a boolean modifier. 
It should be noted that those modifiers are available for encoding as raw mode with \fBPFM_OS_NONE\fR but they may not all be present with specific kernel interfaces. .SH AUTHORS .nf Stephane Eranian .if .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/libpfm_perf_event_raw.3000066400000000000000000000104771502707512200242060ustar00rootroot00000000000000.TH LIBPFM 3 "February, 2014" "" "Linux Programmer's Manual" .SH NAME libpfm_perf_event_raw - support for perf_events raw events syntax .SH SYNOPSIS .nf .B #include .sp .B PMU name: perf_raw .B PMU desc: Raw perf_events event syntax .sp .SH DESCRIPTION The library supports a pseudo PMU model to allow raw encodings of PMU events for the Linux perf_events kernel interface. With this PMU, it is possible to provide the raw hexadecimal encoding of any hardware event for any PMU models. The raw encoding is passed as is to the kernel. All events are encoded as \fBPERF_TYPE_RAW\fR. As such, perf_events generic events, such as cycles, instructions, cannot be encoded by this PMU. The syntax is very simple: rX. X is the hexadecimal 64-bit value for the event. It may include event filters on some PMU models. The hexadecimal number is passed without the 0x prefix, e.g., r01c4. The library's standard perf_events attributes are supported by this PMU model. They are separated with colons as is customary with the library. .SH MODIFIERS The following modifiers are supported by this PMU model: .TP .B u Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR. This is a boolean modifier. .TP .B k Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR. This is a boolean modifier. .TP .B h Measure at the hypervisor level. This corresponds to \fBPFM_PLMH\fR. This is a boolean modifier .TP .B mg Measure guest execution only. This is a boolean modifier .TP .B mh Measure host execution only. 
This is a boolean modifier.
.TP
.B period
Specify the sampling period value.
The value can be expressed in decimal or hexadecimal and is 64-bit wide.
This option is mutually exclusive with \fBfreq\fR.
The period is expressed in the unit of the event.
There is no default value.
.TP
.B freq
Specify the sampling frequency value.
The value can be expressed in decimal or hexadecimal and is 64-bit wide.
This option is mutually exclusive with \fBperiod\fR.
The value is expressed in Hertz.
For instance, freq=100 means that the event should be sampled 100 times
per second on average.
There is no default value.
.TP
.B excl
The associated event is the only event measured on the PMU.
This applies only to hardware events.
This attribute requires admin privileges.
Default is off.
.TP
.B precise
Enables precise sampling mode.
This option is only valid when sampling on events.
The option takes an integer argument with the following values: 1=enable
precise sampling, 2=enable precise sampling and eliminate skid, 3=enable
precise sampling, eliminate skid and bias.
Not all events necessarily support precise mode at all levels; this is
dependent on the underlying PMU.
Eliminating skid is a best-effort feature; it may not work for all
samples.
This option is mutually exclusive with \fBhw_smpl\fR.
This option implies using the hardware assist sampling mechanism.
.TP
.B hw_smpl
Enables hardware assist sampling.
This is a boolean option.
It is false by default.
On some processors, it is possible to have the hardware record samples in
a buffer and then notify the kernel when it is full.
Such a feature may not be available for all events.
Using a hardware buffer does not necessarily eliminate skid and bias; it
usually lowers the overhead of interrupt-based sampling by amortizing the
interrupt over multiple samples.
This option is usually implicit with precise sampling events.
.TP
.B cpu
This integer option is used with system-wide events, i.e., events attached
to a CPU instead of a thread.
The value designates the CPU to attach the event to.
It is up to the caller of the library to use the cpu field in the library
event encoding argument to create the event.
No verification of the validity of the CPU number is made by the library.
Default value is -1 for this field.
.TP
.B pinned
This boolean option is used with system-wide events, i.e., events attached
to a CPU instead of a thread.
If set, then the event is marked as pinned.
That means it needs to remain on the counters at all times, i.e., it
cannot be multiplexed.
There can only be as many pinned events as there are counters, yet the
library does not check for that; the perf_event subsystem does.
The default value for this field is false, i.e., the event is not pinned.
.SH AUTHORS
.nf
Stephane Eranian
.fi
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/pfm_find_event.3
.TH LIBPFM 3 "September, 2009" "" "Linux Programmer's Manual"
.SH NAME
pfm_find_event \- search for an event
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.BI "int pfm_find_event(const char *"str ");"
.sp
.SH DESCRIPTION
This function is used to convert an event string passed in \fBstr\fR into
an opaque event identifier, i.e., the return value.
Events are first manipulated as strings which contain the event name,
sub-event names and optional filters and modifiers.
This function analyzes the string and tries to find the matching event.
The event string is a structured string and it is composed as follows:
.TP
.B [pmu_name::]event_name[:unit_mask][:modifier|:modifier=val]
.PP
The various components are separated by \fB:\fR or \fB::\fR; they are
defined as follows:
.TP
.B pmu_name
This is an optional prefix to designate a specific PMU model.
With the prefix, the event matching event_name on that PMU model is used.
In case multiple PMU models are activated, there may be conflicts where
identical event names mean the same or different things.
In that case, it is necessary to fully specify the event with a pmu_name.
That string corresponds to what is returned by \fBpfm_get_pmu_name()\fR.
.TP
.B event_name
This is the event name and is required.
The library is not case sensitive on event strings.
The event name must match the actual event name \fBcompletely\fR; it
cannot be a substring.
.TP
.B unit_mask
The optional unit mask which can be considered as a sub-event of the major
event.
If an event has unit masks, and there is no default, then at least one
unit mask must be passed in the string.
Multiple unit masks may be specified for a single event.
.TP
.B modifier
A modifier is an optional filter which is provided by the hardware
register hosting the event or by the underlying kernel infrastructure.
Typical modifiers include privilege level filters.
Some modifiers are simple booleans, in which case just passing their names
is equivalent to setting their value to \fBtrue\fR.
Other modifiers need a specific value, in which case it is provided after
the equal sign.
No space is tolerated around the equal sign.
The list of modifiers depends on the host PMU and underlying kernel API.
They are documented in PMU-specific documentation.
Multiple modifiers may be passed.
There is no ordering constraint between unit masks and modifiers.
.PP
The library uses the generic term \fBattribute\fR to designate both unit
masks and modifiers.
Here are a few examples of event strings:
.TP
.B amd64::RETIRED_INSTRUCTIONS:u
Event RETIRED_INSTRUCTIONS on an AMD64 processor, measured at the user
privilege level only
.TP
.B RS_UOPS_DISPATCHED:c=1:i:u
Event RS_UOPS_DISPATCHED measured at the user privilege level only, with
inversion enabled and the counter-mask set to 1
.PP
For the purpose of this function, only the pmu_name and event_name are
considered; everything else is parsed, thus must be valid, but is ignored.
The function searches only for one event per call.
As a convenience, the function will identify the event up to the first
comma.
In other words, if \fBstr\fR is equal to "EVENTA,EVENTB", then the
function will only look at EVENTA and will not return an error because of
an invalid event string.
This is handy when parsing constant event strings containing multiple,
comma-separated events.
.SH RETURN
The function returns the opaque event identifier that corresponds to the
event string.
In case of error, a negative error code is returned instead.
.SH ERRORS
.TP
.B PFMLIB_ERR_NOINIT
The library has not been initialized properly.
.TP
.B PFMLIB_ERR_INVAL
The event string is NULL.
.TP
.B PFMLIB_ERR_NOMEM
The library ran out of memory.
.TP
.B PFMLIB_ERR_NOTFOUND
The event was not found.
.TP
.B PFMLIB_ERR_ATTR
Invalid event attribute.
.TP
.B PFMLIB_ERR_ATTR_VAL
Invalid event attribute value.
.TP
.B PFMLIB_ERR_TOOMANY
Too many event attributes passed.
.SH AUTHOR
Stephane Eranian
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/pfm_get_event_attr_info.3
.TH LIBPFM 3 "December, 2009" "" "Linux Programmer's Manual"
.SH NAME
pfm_get_event_attr_info \- get event attribute information
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.BI "int pfm_get_event_attr_info(int " idx ", int " attr ", pfm_os_t " os ", pfm_event_attr_info_t *" info ");"
.sp
.SH DESCRIPTION
This function returns in \fBinfo\fR information about the attribute
designated by \fBattr\fR for the event specified in \fBidx\fR and the OS
layer in \fBos\fR.
The \fBpfm_os_t\fR enumeration provides the following choices:
.TP
.B PFM_OS_NONE
The returned information pertains only to what the PMU hardware exports.
No operating system attributes are taken into account.
.TP
.B PFM_OS_PERF_EVENT
The returned information includes the actual PMU hardware and the
additional attributes exported by the perf_events kernel interface.
The perf_event attributes pertain only the PMU hardware. In case perf_events is not detected, an error is returned. .TP .B PFM_OS_PERF_EVENT_EXT The returned information includes all of what is already provided by \fBPFM_OS_PERF_EVENT\fR plus all the software attributes controlled by perf_events, such as sampling period, precise sampling. .PP The \fBpfm_event_attr_info_t\fR structure is defined as follows: .nf typedef struct { const char *name; const char *desc; const char *equiv; size_t size; uint64_t code; pfm_attr_t type; int idx; pfm_attr_ctrl_t ctrl; int reserved1; struct { int is_dfl:1; int is_precise:1; int is_speculative:2; int support_hw_smpl:1; int support_no_mods:1; int reserved:26; }; union { uint64_t dfl_val64; const char *dfl_str; int dfl_bool; int dfl_int; }; } pfm_event_attr_info_t; .fi The fields of this structure are defined as follows: .TP .B name This is the name of the attribute. This is a read-only string. .TP .B desc This is the description of the attribute. This is a read-only string. It may contain multiple sentences. .TP .B equiv Certain attributes may be just variations of other attributes for the same event. They may be provided as handy shortcuts to avoid supplying a long list of attributes. For those attributes, this field is not NULL and contains the complete equivalent attribute string. This string, once appended to the event name, may be used library calls requiring an event string. .TP .B code This is the raw attribute code. For PFM_ATTR_UMASK, this is the unit mask code. For all other attributes, this is an opaque index. .TP .B type This is the type of the attribute. Attributes represent either sub-events or extra filters that can be applied to the event. Filters (also called modifiers) may be tied to the event or the PMU register the event is programmed into. The type of an attribute determines how it must be specified. The following types are defined: .RS .TP .B PFM_ATTR_UMASK This is a unit mask, i.e., a sub-event. 
It is specified using its name. Depending on the event, it may be possible to specify multiple unit masks. .TP .B PFM_ATTR_MOD_BOOL This is a boolean attribute. It has a value of 0, 1, y or n. The value is specified after the equal sign, e.g., foo=1. As a convenience, the equal sign and value may be omitted, in which case this is equivalent to =1. .TP .B PFM_ATTR_MOD_INTEGER This is an integer attribute. It has a value which must be passed after the equal sign. The range of valid values depends on the attribute and is usually specified in its description. .PP .RE .TP .B idx This is the attribute index. It is identical to the value of \fBattr\fR passed to the call and is provided for completeness. .TP .B size This field contains the size of the struct passed. This field is used to provide for extensibility of the struct without compromising backward compatibility. The value should be set to \fBsizeof(pfm_event_attr_info_t)\fR. If instead, a value of \fB0\fR is specified, the library assumes the struct passed is identical to the first ABI version which size is \fBPFM_ATTR_INFO_ABI0\fR. Thus, if fields were added after the first ABI, they will not be set by the library. The library does check that bytes beyond what is implemented are zeroes. .TP .B is_dfl This field indicates whether or not this attribute is set by default. This applies mostly for PFM_ATTR_UMASK. If a unit mask is marked as default, and no unit mask is specified in the event string, then the library uses it by default. Note that there may be multiple defaults per event depending on how unit masks are grouped. .TP .B is_precise This field indicates whether or not this umask supports precise sampling. Precise sampling is a hardware mechanism that avoids instruction address skid when using interrupt-based sampling. On Intel X86 processors, this field indicates that the umask supports Precise Event-Based Sampling (PEBS). 
.TP .B is_speculative This bitfield indicates whether or not the attribute includes occurrences happening during speculative execution for both wrong and correct paths. Given that this kind of event information is not always available from vendors, this field uses multiple bits. A value of \fBPFM_EVENT_INFO_SPEC_NA\fR indicates that speculation information is not available. A value of \fBPFM_EVENT_INFO_SPEC_TRUE\fR indicates that the attribute counts during speculative execution. A value of \fBPFM_EVENT_INFO_SPEC_FALSE\fR indicates that the attribute does not count during speculative execution. .TP .B support_hw_smpl This boolean field indicates that the attribute (in this case a umask) supports hardware sampling. That means the hardware can sample this event+umasks without involving the kernel at each sample. .TP .B support_no_mods This boolean field indicates that the attribute (in this case a umask) does not support any hardware or software modifiers, such as privilege level filters, sampling, precise sampling, and such. This is necessary when select umasks of an event have more restrictions than others, e.g., the event and most umasks support modifiers except a few umasks. .TP .B dfl_val64, dfl_str, dfl_bool, dfl_int This union contains the value of an attribute. For PFM_ATTR_UMASK, the is the unit mask code, for all other types this is the actual value of the attribute. .TP .B ctrl This field indicates which layer or source controls the attribute. The following sources are defined: .RS .TP .B PFM_ATTR_CTRL_UNKNOWN The source controlling the attribute is not known. .TP .B PFM_ATTR_CTRL_PMU The attribute is controlled by the PMU hardware. .TP .B PFM_ATTR_CTRL_PERF_EVENT The attribute is controlled by the perf_events kernel interface. .RE .TP .B reserved These fields must be set to zero. .PP .SH RETURN If successful, the function returns \fBPFM_SUCCESS\fR and attribute information in \fBinfo\fR, otherwise it returns an error code. 
.SH ERRORS
.TP
.B PFMLIB_ERR_NOINIT
Library has not been initialized properly.
.TP
.B PFMLIB_ERR_INVAL
The \fBidx\fR or \fBattr\fR arguments are invalid, or \fBinfo\fR is
\fBNULL\fR, or \fBsize\fR is not zero.
.TP
.B PFM_ERR_NOTSUPP
The requested OS layer has not been detected on the host system.
.SH AUTHOR
Stephane Eranian
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/pfm_get_event_encoding.3
.TH LIBPFM 3 "September, 2009" "" "Linux Programmer's Manual"
.SH NAME
pfm_get_event_encoding \- get raw event encoding
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.BI "int pfm_get_event_encoding(const char *" str ", int " dfl_plm ", char **" fstr ", int *" idx ", uint64_t **" codes ", int *" count ");"
.sp
.SH DESCRIPTION
This function is used to retrieve the raw event encoding corresponding to
the event string in \fBstr\fR.
Only one event per call can be encoded.
As such, \fBstr\fR can contain only one symbolic event name.
The string may contain unit masks and modifiers.
The default privilege level mask is passed in \fBdfl_plm\fR.
It may be used depending on the event.
This function is \fBdeprecated\fR.
It is superseded by \fBpfm_get_os_event_encoding()\fR where the OS is set
to \fBPFM_OS_NONE\fR.
The encoding is retrieved through the \fBpfm_pmu_encode_arg_t\fR
structure.
The following example illustrates the transition:
.nf
int i, ret, count = 0;
uint64_t *codes = NULL;

ret = pfm_get_event_encoding("RETIRED_INSTRUCTIONS", PFM_PLM3, NULL, NULL, &codes, &count);
if (ret != PFM_SUCCESS)
   err(1, "cannot get encoding: %s", pfm_strerror(ret));

for (i = 0; i < count; i++)
   printf("count[%d]=0x%"PRIx64"\\n", i, codes[i]);

free(codes);
.fi
is equivalent to:
.nf
pfm_pmu_encode_arg_t arg;
int i, ret;

memset(&arg, 0, sizeof(arg));
arg.size = sizeof(arg);

ret = pfm_get_os_event_encoding("RETIRED_INSTRUCTIONS", PFM_PLM3, PFM_OS_NONE, &arg);
if (ret != PFM_SUCCESS)
   err(1, "cannot get encoding: %s", pfm_strerror(ret));

for (i = 0; i < arg.count; i++)
   printf("count[%d]=0x%"PRIx64"\\n", i, arg.codes[i]);

free(arg.codes);
.fi
The encoding may take several 64-bit integers.
The function can use the array passed in \fBcodes\fR if the number of
entries passed in \fBcount\fR is big enough.
However, if both \fB*codes\fR is \fBNULL\fR and \fBcount\fR is 0, the
function allocates the memory necessary to store the encoding.
It is up to the caller to eventually free the memory.
The number of 64-bit entries in \fBcodes\fR is reflected in \fB*count\fR
upon return regardless of whether \fBcodes\fR was allocated or used as is.
If the number of 64-bit integers is greater than one, then the order in
which each component is returned is PMU-model specific.
Refer to the PMU-specific man page.
The raw encoding means the encoding as mandated by the underlying PMU
model.
It may not be directly suitable to pass to a kernel API.
You may want to use API-specific library calls to ensure the correct
encoding is passed.
If \fBfstr\fR is not NULL, it will point to the fully qualified event
string upon successful return.
The string contains the event name, any umask set, and the value of all
the modifiers.
It reflects what the encoding will actually measure.
The function allocates the memory to store the string.
The caller must eventually free the string.
Here is an example of how this function could be used:
.nf
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include <err.h>
#include <perfmon/pfmlib.h>

int main(int argc, char **argv)
{
   uint64_t *codes = NULL;
   int i, count = 0;
   int ret;

   ret = pfm_initialize();
   if (ret != PFM_SUCCESS)
      err(1, "cannot initialize library: %s", pfm_strerror(ret));

   ret = pfm_get_event_encoding("RETIRED_INSTRUCTIONS", PFM_PLM3, NULL, NULL, &codes, &count);
   if (ret != PFM_SUCCESS)
      err(1, "cannot get encoding: %s", pfm_strerror(ret));

   for (i = 0; i < count; i++)
      printf("count[%d]=0x%"PRIx64"\\n", i, codes[i]);

   free(codes);
   return 0;
}
.fi
.SH RETURN
The function returns in \fB*codes\fR the encoding of the event and in
\fB*count\fR the number of 64-bit integers to support that encoding.
Upon success, \fBPFM_SUCCESS\fR is returned, otherwise a specific error
code is returned.
.SH ERRORS
.TP
.B PFM_ERR_TOOSMALL
The \fBcodes\fR argument is too small for the encoding.
.TP
.B PFM_ERR_INVAL
The \fBcodes\fR or \fBcount\fR argument is \fBNULL\fR or \fBstr\fR
contains more than one symbolic event.
.TP
.B PFM_ERR_NOMEM
Not enough memory.
.TP
.B PFM_ERR_NOTFOUND
Event not found.
.TP
.B PFM_ERR_ATTR
Invalid event attribute (unit mask or modifier).
.TP
.B PFM_ERR_ATTR_VAL
Invalid modifier value.
.TP
.B PFM_ERR_ATTR_SET
Attribute already set, cannot be changed.
.TP
.B PFM_ERR_ATTR_UMASK
Missing unit mask.
.TP
.B PFM_ERR_ATTR_FEATCOMB
Unit masks or features cannot be combined into a single event.
.SH AUTHOR Stephane Eranian .SH SEE ALSO pfm_get_os_event_encoding(3) papi-papi-7-2-0-t/src/libpfm4/docs/man3/pfm_get_event_info.3000066400000000000000000000116731502707512200235030ustar00rootroot00000000000000.TH LIBPFM 3 "December, 2009" "" "Linux Programmer's Manual" .SH NAME pfm_get_event_info \- get event information .SH SYNOPSIS .nf .B #include .sp .BI "int pfm_get_event_info(int " idx ", pfm_os_t " os ", pfm_event_info_t *" info ");" .sp .SH DESCRIPTION This function returns in \fBinfo\fR information about a specific event designated by its opaque unique identifier in \fBidx\fR for the operating system specified in \fBos\fR. The \fBpfm_event_info_t\fR structure is defined as follows: .nf typedef struct { const char *name; const char *desc; const char *equiv; size_t size; uint64_t code; pfm_pmu_t pmu; pfm_dtype_t dtype int idx; int nattrs; struct { unsigned int is_precise:1; unsigned int reserved_bits:31; }; } pfm_event_info_t; .fi The fields of this structure are defined as follows: .TP .B name This is the name of the event. This is a read-only string. .TP .B desc This is the description of the event. This is a read-only string. It may contain multiple sentences. .TP .B equiv Certain events may be just variations of actual events. They may be provided as handy shortcuts to avoid supplying a long list of attributes. For those events, this field is not NULL and contains the complete equivalent event string. .TP .B code This is the raw event code. It should not be confused with the encoding of the event. This field represents only the event selection code, it does not include any unit mask or attribute settings. .TP .B pmu This is the identification of the PMU model this event belongs to. It is of type \fBpfm_pmu_t\fR. Using this value and the \fBpfm_get_pmu_info\fR function, it is possible to get PMU information. .TP .B dtype This field returns the representation of the event data. By default, it is \fBPFM_DATA_UINT64\fR. 
.TP
.B idx
This is the event unique opaque identifier.
It is identical to the idx passed to the call and is provided for
completeness.
.TP
.B nattrs
This is the number of attributes supported by this event.
Attributes may be unit masks or modifiers.
If the event has no attributes, then the value of this field is simply 0.
.TP
.B size
This field contains the size of the struct passed.
This field is used to provide for extensibility of the struct without
compromising backward compatibility.
The value should be set to \fBsizeof(pfm_event_info_t)\fR.
If instead a value of \fB0\fR is specified, the library assumes the struct
passed is identical to the first ABI version, whose size is
\fBPFM_EVENT_INFO_ABI0\fR.
Thus, if fields were added after the first ABI, they will not be set by
the library.
The library does check that bytes beyond what is implemented are zeroes.
.TP
.B is_precise
This bitfield indicates whether or not the event supports precise
sampling.
Precise sampling is a hardware mechanism that avoids instruction address
skid when using interrupt-based sampling.
When the event has umasks, this field means that at least one umask
supports precise sampling.
On Intel X86 processors, this indicates whether the event supports Precise
Event-Based Sampling (PEBS).
.TP
.B is_speculative
This bitfield indicates whether or not the event includes occurrences
happening during speculative execution for both wrong and correct paths.
Given that this kind of event information is not always available from
vendors, this field uses multiple bits.
A value of \fBPFM_EVENT_INFO_SPEC_NA\fR indicates that speculation
information is not available.
A value of \fBPFM_EVENT_INFO_SPEC_TRUE\fR indicates that the event counts
during speculative execution.
A value of \fBPFM_EVENT_INFO_SPEC_FALSE\fR indicates that the event does
not count during speculative execution.
.PP
The \fBpfm_os_t\fR enumeration provides the following choices:
.TP
.B PFM_OS_NONE
The returned information pertains only to what the PMU hardware exports.
No operating system attributes are taken into account.
.TP
.B PFM_OS_PERF_EVENT
The returned information includes the actual PMU hardware and the
additional attributes exported by the perf_events kernel interface.
The perf_event attributes pertain only to the PMU hardware.
In case perf_events is not detected, an error is returned.
.TP
.B PFM_OS_PERF_EVENT_EXT
The returned information includes all of what is already provided by
\fBPFM_OS_PERF_EVENT\fR plus all the software attributes controlled by
perf_events, such as sampling period and precise sampling.
.PP
.SH RETURN
If successful, the function returns \fBPFM_SUCCESS\fR and event
information in \fBinfo\fR, otherwise it returns an error code.
.SH ERRORS
.TP
.B PFMLIB_ERR_NOINIT
Library has not been initialized properly.
.TP
.B PFMLIB_ERR_INVAL
The \fBidx\fR argument is invalid, or \fBinfo\fR is \fBNULL\fR, or
\fBsize\fR is not zero.
.TP
.B PFMLIB_ERR_NOTSUPP
The requested \fBos\fR is not detected or supported.
.SH AUTHOR
Stephane Eranian
.PP
papi-papi-7-2-0-t/src/libpfm4/docs/man3/pfm_get_event_next.3
.TH LIBPFM 3 "September, 2009" "" "Linux Programmer's Manual"
.SH NAME
pfm_get_event_next \- iterate over events
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.BI "int pfm_get_event_next(int "idx ");"
.sp
.SH DESCRIPTION
Events are uniquely identified with opaque integer identifiers.
There is no guaranteed order within identifiers.
Thus, to list all the events, it is necessary to use iterators.
Events are grouped in tables within the library.
A table usually corresponds to a PMU model or family.
The library contains support for multiple PMU models, thus it has multiple
tables.
Based on the host hardware and software environments, tables get activated when the library is initialized via \fBpfm_initialize()\fR. Events from activated tables are called active events. Events from non-activated tables are called supported events. Event identifiers are usually retrieved via \fBpfm_find_event()\fR or when encoding events. To iterate over a list of events for a given PMU model, all that is needed is an initial identifier for the PMU. The first event identifier is usually obtained via \fBpfm_get_pmu_info()\fR. The \fBpfm_get_event_next()\fR function returns the identifier of next supported event after the one passed in \fBidx\fR. This iterator stops when the last event for the PMU is passed as argument, in which case the function returns \-1. .sp .nf void list_pmu_events(pfm_pmu_t pmu) { struct pfm_event_info info; struct pfm_pmu_info pinfo; int i, ret; memset(&info, 0, sizeof(info)); memset(&pinfo, 0, sizeof(pinfo)); info.size = sizeof(info); pinfo.size = sizeof(pinfo); ret = pfm_get_pmu_info(pmu, &pinfo); if (ret != PFM_SUCCESS) errx(1, "cannot get pmu info"); for (i = pinfo.first_event; i != \-1; i = pfm_get_event_next(i)) { ret = pfm_get_event_info(i, &info); if (ret != PFM_SUCCESS) errx(1, "cannot get event info"); printf("%s Event: %s::%s\\n", pinfo.present ? "Active" : "Supported", pinfo.name, info.name); } } .fi .SH RETURN The function returns the identifier of the next supported event. It returns \-1 when the argument is already the last event for the PMU. .SH ERRORS No error code, besides \-1, is returned by this function. 
.SH SEE ALSO pfm_find_event(3) .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/pfm_get_os_event_encoding.3000066400000000000000000000166041502707512200250360ustar00rootroot00000000000000.TH LIBPFM 3 "January, 2011" "" "Linux Programmer's Manual" .SH NAME pfm_get_os_event_encoding \- get event encoding for a specific operating system .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .BI "int pfm_get_os_event_encoding(const char *" str ", int " dfl_plm ", pfm_os_t " os ", void *" arg ");" .sp .SH DESCRIPTION This is the key function to retrieve the encoding of an event for a specific operating system interface. The event string passed in \fBstr\fR is parsed and encoded for the operating system specified by \fBos\fR. Only one event per call can be encoded. As such, \fBstr\fR can contain only one symbolic event name. The event is encoded to monitor at the privilege levels specified by the \fBdfl_plm\fR mask, if supported, otherwise this parameter is ignored. The operating system specific input and output arguments are passed in \fBarg\fR. The event string, \fBstr\fR, may contain sub-event masks (umask) and any other supported modifiers. Only one event is parsed from the string. For convenience, it is possible to pass a comma-separated list of events in \fBstr\fR but only the first event is encoded. The following values are supported for \fBos\fR: .TP .B PFM_OS_NONE This value causes the event to be encoded purely as specified by the PMU hardware. The \fBarg\fR argument must be a pointer to a \fBpfm_pmu_encode_arg_t\fR structure which is defined as follows: .nf typedef struct { uint64_t *codes; char **fstr; size_t size; int count; int idx; } pfm_pmu_encode_arg_t; .fi The fields are defined as follows: .RS .TP .B codes A pointer to an array of 64-bit values. On input, if \fBcodes\fR is NULL, then the library allocates whatever is necessary to store the encoding of the event. 
If \fBcodes\fR is not NULL on input, then \fBcount\fR must reflect its actual number of elements. If \fBcount\fR is big enough, the library stores the encoding at the address provided. Otherwise, an error is returned. .TP .B count On input, the field contains the maximum number of elements in the array \fBcodes\fR. Upon return, it contains the number of actual entries in \fBcodes\fR. If \fBcodes\fR is NULL, then count must be zero. .TP .B fstr If the caller is interested in retrieving the fully qualified event string where all used unit masks and all modifiers are spelled out, this field must be set to a non-null address of a pointer to a string (char **). Upon return, if \fBfstr\fR was not NULL, then the string pointer passed on entry points to the event string. The string is dynamically allocated and \fBmust\fR eventually be freed by the caller. If \fBfstr\fR was NULL on entry, then nothing is returned in this field. The typical calling sequence looks as follows: .nf char *fstr = NULL; pfm_pmu_encode_arg_t arg; arg.fstr = &fstr; ret = pfm_get_os_event_encoding("event", PFM_PLM0|PFM_PLM3, PFM_OS_NONE, &arg); if (ret == PFM_SUCCESS) { printf("fstr=%s\\n", fstr); free(fstr); } .fi .TP .B size This field contains the size of the struct passed. This field is used to provide for extensibility of the struct without compromising backward compatibility. The value should be set to \fBsizeof(pfm_pmu_encode_arg_t)\fR. If instead, a value of \fB0\fR is specified, the library assumes the struct passed is identical to the first ABI version whose size is \fBPFM_RAW_ENCODE_ABI0\fR. Thus, if fields were added after the first ABI, they will not be set by the library. The library does check that bytes beyond what is implemented are zeroes. .TP .B idx Upon return, this field contains the opaque unique identifier for the event described in \fBstr\fR. This index can be used to retrieve information about the event using \fBpfm_get_event_info()\fR, for instance. 
.RE .TP .B PFM_OS_PERF_EVENT, PFM_OS_PERF_EVENT_EXT This value causes the event to be encoded for the perf_event Linux kernel interface (available since 2.6.31). The \fBarg\fR argument must be a pointer to a \fBpfm_perf_encode_arg_t\fR structure. The PFM_OS_PERF_EVENT layer provides the modifiers exported by the underlying PMU hardware, some of which may actually be overridden by the perf_event interface, such as the monitoring privilege levels. The \fBPFM_OS_PERF_EVENT_EXT\fR layer extends \fBPFM_OS_PERF_EVENT\fR to add modifiers controlled only by the perf_event interface, such as sampling period (\fBperiod\fR), frequency (\fBfreq\fR) and exclusive resource access (\fBexcl\fR). .nf typedef struct { struct perf_event_attr *attr; char **fstr; size_t size; int idx; int cpu; int flags; } pfm_perf_encode_arg_t; .fi The fields are defined as follows: .RS .TP .B attr A pointer to a struct perf_event_attr as defined in perf_event.h. This field cannot be NULL on entry. The struct is not completely overwritten by the call. The library only modifies the fields it knows about, thereby allowing perf_event ABI mismatch between caller and library. .TP .B fstr Same behavior as is described for PFM_OS_NONE above. .TP .B size This field contains the size of the struct passed. This field is used to provide for extensibility of the struct without compromising backward compatibility. The value should be set to \fBsizeof(pfm_perf_encode_arg_t)\fR. If instead, a value of \fB0\fR is specified, the library assumes the struct passed is identical to the first ABI version whose size is \fBPFM_PERF_ENCODE_ABI0\fR. Thus, if fields were added after the first ABI, they will not be set by the library. The library does check that bytes beyond what is implemented are zeroes. .TP .B idx Upon return, this field contains the opaque unique identifier for the event described in \fBstr\fR. This index can be used to retrieve information about the event using \fBpfm_get_event_info()\fR, for instance. 
.TP .B cpu Not used yet. .TP .B flags Not used yet. .RE .PP Here is an example of how this function could be used with PFM_OS_NONE: .nf #include <sys/types.h> #include <inttypes.h> #include <string.h> #include <stdio.h> #include <stdlib.h> #include <err.h> #include <perfmon/pfmlib.h> int main(int argc, char **argv) { pfm_pmu_encode_arg_t arg; int i, ret; ret = pfm_initialize(); if (ret != PFM_SUCCESS) errx(1, "cannot initialize library %s", pfm_strerror(ret)); memset(&arg, 0, sizeof(arg)); ret = pfm_get_os_event_encoding("RETIRED_INSTRUCTIONS", PFM_PLM3, PFM_OS_NONE, &arg); if (ret != PFM_SUCCESS) err(1, "cannot get encoding %s", pfm_strerror(ret)); for(i = 0; i < arg.count; i++) printf("count[%d]=0x%"PRIx64"\\n", i, arg.codes[i]); free(arg.codes); return 0; } .fi .SH RETURN The function returns in \fBarg\fR the encoding of the event for the os passed in \fBos\fR. The content of \fBarg\fR depends on the \fBos\fR argument. Upon success, \fBPFM_SUCCESS\fR is returned otherwise a specific error code is returned. .SH ERRORS .TP .B PFM_ERR_TOOSMALL The \fBcode\fR argument is too small for the encoding. .TP .B PFM_ERR_INVAL The \fBcode\fR or \fBcount\fR argument is \fBNULL\fR or the \fBstr\fR contains more than one symbolic event. .TP .B PFM_ERR_NOMEM Not enough memory. .TP .B PFM_ERR_NOTFOUND Event not found. .TP .B PFM_ERR_ATTR Invalid event attribute (unit mask or modifier) .TP .B PFM_ERR_ATTR_VAL Invalid modifier value. .TP .B PFM_ERR_ATTR_SET attribute already set, cannot be changed. .TP .B PFM_ERR_ATTR_UMASK Missing unit mask. .TP .B PFM_ERR_ATTR_FEATCOMB Unit masks or features cannot be combined into a single event. 
.SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/pfm_get_perf_event_encoding.3000066400000000000000000000123351502707512200253460ustar00rootroot00000000000000.TH LIBPFM 3 "September, 2009" "" "Linux Programmer's Manual" .SH NAME pfm_get_perf_event_encoding \- encode event for perf_event API .SH SYNOPSIS .nf .B #include <perfmon/pfmlib_perf_event.h> .sp .BI "int pfm_get_perf_event_encoding(const char *" str ", int " dfl_plm ", struct perf_event_attr *" attr ", char **" fstr ", int *" idx ");" .sp .SH DESCRIPTION This function can be used in conjunction with the perf_events Linux kernel API which provides access to hardware performance counters, kernel software counters and tracepoints. The function takes an event string in \fBstr\fR and a default privilege level mask in \fBdfl_plm\fR and fills out the relevant parts of the perf_events specific data structure in \fBattr\fR. This function is \fBdeprecated\fR. It is superseded by \fBpfm_get_os_event_encoding()\fR with the OS argument set to either \fBPFM_OS_PERF_EVENT\fR or \fBPFM_OS_PERF_EVENT_EXT\fR. Using this newer function provides extended support for perf_events. Certain perf_event configuration options are only available through this new interface. The following example illustrates the transition: .nf struct perf_event_attr attr; int i, count = 0; uint64_t *codes; memset(&attr, 0, sizeof(attr)); ret = pfm_get_perf_event_encoding("RETIRED_INSTRUCTIONS", PFM_PLM3, &attr, NULL, NULL); if (ret != PFM_SUCCESS) err(1, "cannot get encoding %s", pfm_strerror(ret)); .fi is equivalent to: .nf #include <perfmon/pfmlib_perf_event.h> struct perf_event_attr attr; pfm_perf_encode_arg_t arg; memset(&arg, 0, sizeof(arg)); arg.size = sizeof(arg); arg.attr = &attr; ret = pfm_get_os_event_encoding("RETIRED_INSTRUCTIONS", PFM_PLM3, PFM_OS_PERF_EVENT, &arg); if (ret != PFM_SUCCESS) err(1, "cannot get encoding %s", pfm_strerror(ret)); .fi The \fBdfl_plm\fR cannot be zero, though it may not necessarily be used by the event. 
Depending on the event, a combination of the following privilege levels may be used: .TP .B PFM_PLM3 Measure at privilege level 3. This usually corresponds to user level. On X86, it corresponds to privilege levels 3, 2, 1. Check the PMU specific man page to verify if this level is supported by your PMU model. .TP .B PFM_PLM2 Measure at privilege level 2. Check the PMU specific man page to verify if this level is supported by your PMU model. .TP .B PFM_PLM1 Measure at privilege level 1. Check the PMU specific man page to verify if this level is supported by your PMU model. .TP .B PFM_PLM0 Measure at privilege level 0. This usually corresponds to kernel level. Check the PMU specific man page to verify if this level is supported by your PMU model. .TP .B PFM_PLMH Measure at hypervisor privilege level. This is used in conjunction with hardware virtualization. Check the PMU specific man page to verify if this level is supported by your PMU model. .PP If \fBfstr\fR is not NULL, the function will make it point to the fully qualified event string, i.e., a string with the event name, all unit masks set, and the value of all modifiers. The library will allocate memory to store the event string but it is the responsibility of the caller to eventually free that string using free(). If \fBidx\fR is not NULL, it returns the corresponding unique event identifier. Only select fields are modified by the function; the others are untouched. The following fields in \fBattr\fR are modified: .TP .B type The type of the event .TP .B config The encoding of the event .TP .B exclude_user Whether or not user level execution should be excluded from monitoring. The definition of user is PMU model specific. .TP .B exclude_kernel Whether or not kernel level execution should be excluded from monitoring. The definition of kernel is PMU model specific. .TP .B exclude_hv Whether or not hypervisor level execution should be excluded from monitoring. The definition of hypervisor is PMU model specific. 
.PP By default, if no privilege level modifier is specified in the event string, the library clears \fBexclude_user\fR, \fBexclude_kernel\fR and \fBexclude_hv\fR, resulting in the event being measured at all levels, subject to hardware support. The function is able to work on only one event at a time. For convenience, it accepts event strings with commas. In that case, it will translate the first event up to the first comma. This is handy in case tools get passed events as a comma-separated list. .SH RETURN The function returns in \fBattr\fR the perf_event encoding which corresponds to the event string. If \fBidx\fR is not NULL, then it will contain the unique event identifier upon successful return. The value \fBPFM_SUCCESS\fR is returned if successful, otherwise a negative error code is returned. .SH ERRORS .TP .B PFM_ERR_TOOSMALL The \fBcode\fR argument is too small for the encoding. .TP .B PFM_ERR_INVAL The \fBattr\fR argument is \fBNULL\fR. .TP .B PFM_ERR_NOMEM Not enough memory. .TP .B PFM_ERR_NOTFOUND Event not found. .TP .B PFM_ERR_ATTR Invalid event attribute (unit mask or modifier) .TP .B PFM_ERR_ATTR_VAL Invalid modifier value. .TP .B PFM_ERR_ATTR_SET attribute already set, cannot be changed. .TP .B PFM_ERR_ATTR_UMASK Missing unit mask. .TP .B PFM_ERR_ATTR_FEATCOMB Unit masks or features cannot be combined into a single event. .SH AUTHOR Stephane Eranian .SH SEE ALSO pfm_get_os_event_encoding(3) papi-papi-7-2-0-t/src/libpfm4/docs/man3/pfm_get_pmu_info.3000066400000000000000000000106421502707512200231560ustar00rootroot00000000000000.TH LIBPFM 3 "December, 2009" "" "Linux Programmer's Manual" .SH NAME pfm_get_pmu_info \- get PMU information .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .BI "int pfm_get_pmu_info(pfm_pmu_t " pmu ", pfm_pmu_info_t *" info ");" .sp .SH DESCRIPTION This function returns in \fBinfo\fR information about a PMU model designated by its identifier in \fBpmu\fR. 
The \fBpfm_pmu_info\fR structure is defined as follows: .nf typedef struct { const char *name; const char *desc; pfm_pmu_t pmu; pfm_pmu_type_t type; int size; int nevents; int first_event; int max_encoding; int num_cntrs; int num_fixed_cntrs; struct { int is_present:1; int is_arch_default:1; int is_core:1; int is_uncore:1; int reserved:28; }; } pfm_pmu_info_t; .fi The fields of this structure are defined as follows: .TP .B name This is the symbolic name of the PMU. This name can be used as a prefix in an event string. This is a read-only string. .TP .B desc This is the description of the PMU. This is a read-only string. .TP .B pmu This is the unique PMU identification code. It is identical to the value passed in \fBpmu\fR and is provided only for completeness. .TP .B type This field contains the type of the PMU. The following types are defined: .RS .TP .B PFM_PMU_TYPE_UNKNOWN The type of the PMU could not be determined. .TP .B PFM_PMU_TYPE_CORE The PMU is implemented by the processor core. .TP .B PFM_PMU_TYPE_UNCORE The PMU is implemented on the processor die but at the socket level, i.e., it captures events for all cores. .PP .RE .TP .B nevents This is the number of available events for this PMU model based on the host processor. It is \fBonly\fR valid if the \fBis_present\fR field is set to 1. Event identifiers are not guaranteed contiguous. In other words, the fact that \fBnevents\fR equals 100 does not mean that event identifiers range from 0 to 99. The iterator function \fBpfm_get_event_next()\fR must be used to go from one identifier to the next. .TP .B first_event This field returns the opaque index of the first event for this PMU model. The index can be used with the \fBpfm_get_event_info()\fR or \fBpfm_get_event_next()\fR functions. In case no event is available, this field contains \fB-1\fR. .TP .B num_cntrs This field contains the number of generic counters supported by the PMU. 
A counter is generic if it can count more than one event. When it is not possible to determine the number of generic counters, this field contains \fB-1\fR. .TP .B num_fixed_cntrs This field contains the number of fixed counters supported by the PMU. A counter is fixed if it is hardwired to count only one event. When it is not possible to determine the number of fixed counters, this field contains \fB-1\fR. .TP .B size This field contains the size of the struct passed. This field is used to provide for extensibility of the struct without compromising backward compatibility. The value should be set to \fBsizeof(pfm_pmu_info_t)\fR. If instead, a value of \fB0\fR is specified, the library assumes the struct passed is identical to the first ABI version whose size is \fBPFM_PMU_INFO_ABI0\fR. Thus, if fields were added after the first ABI, they will not be set by the library. The library does check that bytes beyond what is implemented are zeroes. .TP .B max_encoding This field returns the maximum number of event codes returned by \fBpfm_get_event_encoding()\fR. .TP .B is_present This field is set to one if the PMU model has been detected on the host system. .TP .B is_dfl This field is set to one if the PMU is the default PMU for this architecture. Otherwise this field is zero. .PP .SH RETURN If successful, the function returns \fBPFM_SUCCESS\fR and PMU information in \fBinfo\fR, otherwise it returns an error code. .SH ERRORS .TP .B PFMLIB_ERR_NOINIT Library has not been initialized properly. .TP .B PFMLIB_ERR_NOTSUPP PMU model is not supported by the library. .TP .B PFMLIB_ERR_INVAL The \fBpmu\fR argument is invalid or \fBinfo\fR is \fBNULL\fR or \fBsize\fR is not zero. 
.SH SEE ALSO pfm_get_event_next(3) .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/pfm_get_version.3000066400000000000000000000015001502707512200230240ustar00rootroot00000000000000.TH LIBPFM 3 "September, 2009" "" "Linux Programmer's Manual" .SH NAME pfm_get_version \- get library version .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .BI "int pfm_get_version(void);" .sp .SH DESCRIPTION This function can be called at any time to get the revision level of the library. It is not necessary to have invoked \fBpfm_initialize()\fR prior to calling this function. The revision number is composed of two fields: a major number and a minor number. Both can be extracted using macros provided in the header file: .TP .B PFMLIB_MAJ_VERSION(v) returns the major number encoded in v. .TP .B PFMLIB_MIN_VERSION(v) returns the minor number encoded in v. .SH RETURN The function is always successful, i.e., it always returns the 32-bit version number. .SH ERRORS .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/pfm_initialize.3000066400000000000000000000020461502707512200226430ustar00rootroot00000000000000.TH LIBPFM 3 "September, 2009" "" "Linux Programmer's Manual" .SH NAME pfm_initialize \- initialize library .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .BI "int pfm_initialize(void);" .sp .SH DESCRIPTION This is the first function that a program \fBmust\fR call, otherwise the library will not operate. This function probes the underlying hardware looking for valid PMU event tables to activate. Multiple distinct PMU tables may be activated at the same time. The function must be called only once. If the function is called more than once, it does not execute the initialization multiple times; it simply returns the same value as for the first call. This is \fBnot a reentrant function\fR. Only one thread at a time can call the function. .SH RETURN The function returns whether or not it was successful, i.e., at least one PMU was activated. 
A return value of \fBPFMLIB_SUCCESS\fR indicates success, otherwise the value is an error code. .SH ERRORS .TP .B PFMLIB_ERR_NOTSUPP No PMU was activated. .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/pfm_strerror.3000066400000000000000000000022451502707512200223650ustar00rootroot00000000000000.TH LIBPFM 3 "September, 2009" "" "Linux Programmer's Manual" .SH NAME pfm_strerror \- return constant string describing error code .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .BI "const char *pfm_strerror(int " code ");" .sp .SH DESCRIPTION This function returns a string which describes the libpfm error value in \fBcode\fR. The string returned by the call is \fBread-only\fR. The function must \fBonly\fR be used with libpfm calls documented to return specific error codes. The value \-1 is not considered a specific error code. Strings and \fBpfm_pmu_t\fR return values cannot be used with this function. Typically \fBNULL\fR is returned in case of error for string values, and \fBPFM_PMU_NONE\fR is returned for \fBpfm_pmu_t\fR values. The function is also not designed to handle OS system call errors, i.e., errno values. .SH RETURN The function returns a pointer to the constant string describing the error code. The string is in English. If code is invalid then a default error message is returned. .SH ERRORS If the error code is invalid, then the function returns a pointer to a string which says "unknown error code". .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libpfm4/docs/man3/pfm_terminate.3000066400000000000000000000010411502707512200224640ustar00rootroot00000000000000.TH LIBPFM 3 "September, 2009" "" "Linux Programmer's Manual" .SH NAME pfm_terminate \- free resources used by library .SH SYNOPSIS .nf .B #include <perfmon/pfmlib.h> .sp .BI "int pfm_terminate(void);" .sp .SH DESCRIPTION This is the last function that a program \fBmust\fR call to free all the resources allocated by the library, e.g., memory. 
The function is not reentrant; the caller must ensure only one thread at a time is executing it. .SH RETURN This function has no return value. .SH AUTHOR Stephane Eranian .PP papi-papi-7-2-0-t/src/libpfm4/examples/000077500000000000000000000000001502707512200176025ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libpfm4/examples/Makefile000066400000000000000000000041701502707512200212440ustar00rootroot00000000000000# # Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. # Contributed by Stephane Eranian # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # TOPDIR := $(shell if [ "$$PWD" != "" ]; then echo $$PWD; else pwd; fi)/.. include $(TOPDIR)/config.mk include $(TOPDIR)/rules.mk CFLAGS+= -I. 
-D_GNU_SOURCE LIBS += -lm ifeq ($(SYS),Linux) CFLAGS+= -pthread LIBS += -lrt endif ifeq ($(SYS),WINDOWS) LIBS += -lgnurx endif TARGETS=showevtinfo check_events EXAMPLESDIR=$(DESTDIR)$(DOCDIR)/examples all: $(TARGETS) @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d $@ ; done $(TARGETS): %:%.o $(PFMLIB) $(CC) $(CFLAGS) -o $@ $(LDFLAGS) $^ $(LIBS) clean: @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d $@ ; done $(RM) -f *.o $(TARGETS) *~ distclean: clean install-examples install_examples: $(TARGETS) @echo installing: $(TARGETS) -mkdir -p $(EXAMPLESDIR) $(INSTALL) -m 755 $(TARGETS) $(EXAMPLESDIR) @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d $@ ; done # # examples are installed as part of the RPM install, typically in /usr/share/doc/libpfm-X.Y/ # .PHONY: install depend install-example install_examples papi-papi-7-2-0-t/src/libpfm4/examples/check_events.c000066400000000000000000000107041502707512200224110ustar00rootroot00000000000000/* * check_events.c - show event encoding * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ #include <sys/types.h> #include <stdio.h> #include <stdlib.h> #include <inttypes.h> #include <string.h> #include <err.h> #include <perfmon/pfmlib.h> int pmu_is_present(pfm_pmu_t p) { pfm_pmu_info_t pinfo; int ret; memset(&pinfo, 0, sizeof(pinfo)); ret = pfm_get_pmu_info(p, &pinfo); return ret == PFM_SUCCESS ? pinfo.is_present : 0; } int main(int argc, const char **argv) { pfm_pmu_info_t pinfo; pfm_pmu_encode_arg_t e; const char *arg[3]; const char **p; char *fqstr; pfm_event_info_t info; int j, ret; pfm_pmu_t i; int total_supported_events = 0; int total_available_events = 0; /* * Initialize pfm library (required before we can use it) */ ret = pfm_initialize(); if (ret != PFM_SUCCESS) errx(1, "cannot initialize library: %s\n", pfm_strerror(ret)); memset(&pinfo, 0, sizeof(pinfo)); memset(&info, 0, sizeof(info)); printf("Supported PMU models:\n"); for(i=PFM_PMU_NONE; i < PFM_PMU_MAX; i++) { ret = pfm_get_pmu_info(i, &pinfo); if (ret != PFM_SUCCESS) continue; printf("\t[%d, %s, \"%s\"]\n", i, pinfo.name, pinfo.desc); } printf("Detected PMU models:\n"); for(i=PFM_PMU_NONE; i < PFM_PMU_MAX; i++) { ret = pfm_get_pmu_info(i, &pinfo); if (ret != PFM_SUCCESS) continue; if (pinfo.is_present) { printf("\t[%d, %s, \"%s\"]\n", i, pinfo.name, pinfo.desc); total_supported_events += pinfo.nevents; } total_available_events += pinfo.nevents; } printf("Total events: %d available, %d supported\n", total_available_events, total_supported_events); /* * be nice to user! 
*/ if (argc < 2 && pmu_is_present(PFM_PMU_PERF_EVENT)) { arg[0] = "PERF_COUNT_HW_CPU_CYCLES"; arg[1] = "PERF_COUNT_HW_INSTRUCTIONS"; arg[2] = NULL; p = arg; } else { p = argv+1; } if (!*p) errx(1, "you must pass at least one event"); memset(&e, 0, sizeof(e)); while(*p) { /* * extract raw event encoding * * For perf_event encoding, use * #include * and the function: * pfm_get_perf_event_encoding() */ fqstr = NULL; e.fstr = &fqstr; ret = pfm_get_os_event_encoding(*p, PFM_PLM0|PFM_PLM3, PFM_OS_NONE, &e); if (ret != PFM_SUCCESS) { /* * codes is too small for this event * free and let the library resize */ if (ret == PFM_ERR_TOOSMALL) { free(e.codes); e.codes = NULL; e.count = 0; free(fqstr); continue; } if (ret == PFM_ERR_NOTFOUND && strstr(*p, "::")) errx(1, "%s: try setting LIBPFM_ENCODE_INACTIVE=1", pfm_strerror(ret)); errx(1, "cannot encode event %s: %s", *p, pfm_strerror(ret)); } ret = pfm_get_event_info(e.idx, PFM_OS_NONE, &info); if (ret != PFM_SUCCESS) errx(1, "cannot get event info: %s", pfm_strerror(ret)); ret = pfm_get_pmu_info(info.pmu, &pinfo); if (ret != PFM_SUCCESS) errx(1, "cannot get PMU info: %s", pfm_strerror(ret)); printf("Requested Event: %s\n", *p); printf("Actual Event: %s\n", fqstr); printf("PMU : %s\n", pinfo.desc); printf("IDX : %d\n", e.idx); printf("Codes :"); for(j=0; j < e.count; j++) printf(" 0x%"PRIx64, e.codes[j]); putchar('\n'); free(fqstr); p++; } if (e.codes) free(e.codes); return 0; } papi-papi-7-2-0-t/src/libpfm4/examples/showevtinfo.c000066400000000000000000000513261502707512200223300ustar00rootroot00000000000000/* * showevtinfo.c - show event information * * Copyright (c) 2010 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or 
sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ #include <sys/types.h> #include <stdio.h> #include <stdlib.h> #include <stdarg.h> #include <string.h> #include <inttypes.h> #include <err.h> #include <regex.h> #include <perfmon/pfmlib.h> #define MAXBUF 1024 #define COMBO_MAX 18 static struct { int compact; int sort; uint8_t encode; uint8_t combo; uint8_t combo_lim; uint8_t name_only; uint8_t desc; char *csv_sep; pfm_event_info_t efilter; pfm_event_attr_info_t ufilter; pfm_os_t os; uint64_t mask; } options; typedef struct { uint64_t code; int idx; } code_info_t; static void show_event_info_compact(pfm_event_info_t *info); static const char *srcs[PFM_ATTR_CTRL_MAX]={ [PFM_ATTR_CTRL_UNKNOWN] = "???", [PFM_ATTR_CTRL_PMU] = "PMU", [PFM_ATTR_CTRL_PERF_EVENT] = "perf_event", }; #ifdef PFMLIB_WINDOWS int set_env_var(const char *var, const char *value, int ov) { size_t len; char *str; int ret; len = strlen(var) + 1 + strlen(value) + 1; str = malloc(len); if (!str) return PFM_ERR_NOMEM; sprintf(str, "%s=%s", var, value); ret = putenv(str); free(str); return ret ? 
PFM_ERR_INVAL : PFM_SUCCESS; } #else static inline int set_env_var(const char *var, const char *value, int ov) { return setenv(var, value, ov); } #endif static int event_has_pname(char *s) { char *p; return (p = strchr(s, ':')) && *(p+1) == ':'; } static int print_codes(char *buf, int plm, int max_encoding) { uint64_t *codes = NULL; int j, ret, count = 0; ret = pfm_get_event_encoding(buf, PFM_PLM0|PFM_PLM3, NULL, NULL, &codes, &count); if (ret != PFM_SUCCESS) { if (ret == PFM_ERR_NOTFOUND) errx(1, "encoding failed, try setting env variable LIBPFM_ENCODE_INACTIVE=1"); return -1; } for(j = 0; j < max_encoding; j++) { if (j < count) printf("0x%"PRIx64, codes[j]); printf("%s", options.csv_sep); } free(codes); return 0; } static int check_valid(char *buf, int plm) { uint64_t *codes = NULL; int ret, count = 0; ret = pfm_get_event_encoding(buf, PFM_PLM0|PFM_PLM3, NULL, NULL, &codes, &count); if (ret != PFM_SUCCESS) return -1; free(codes); return 0; } static int match_ufilters(pfm_event_attr_info_t *info) { uint32_t ufilter1 = 0; uint32_t ufilter2 = 0; if (options.ufilter.is_dfl) ufilter1 |= 0x1; if (info->is_dfl) ufilter2 |= 0x1; if (options.ufilter.is_precise) ufilter1 |= 0x2; if (info->is_precise) ufilter2 |= 0x2; if (!ufilter1) return 1; /* at least one filter matches */ return ufilter1 & ufilter2; } static int match_efilters(pfm_event_info_t *info) { pfm_event_attr_info_t ainfo; int n = 0; int i, ret; if (options.efilter.is_precise && !info->is_precise) return 0; memset(&ainfo, 0, sizeof(ainfo)); ainfo.size = sizeof(ainfo); pfm_for_each_event_attr(i, info) { ret = pfm_get_event_attr_info(info->idx, i, options.os, &ainfo); if (ret != PFM_SUCCESS) continue; if (match_ufilters(&ainfo)) return 1; if (ainfo.type == PFM_ATTR_UMASK) n++; } return n ? 
0 : 1; } static void show_event_info_combo(pfm_event_info_t *info) { pfm_event_attr_info_t *ainfo; pfm_pmu_info_t pinfo; char buf[MAXBUF]; size_t len; int numasks = 0; int i, j, ret; uint64_t total, m, u; memset(&pinfo, 0, sizeof(pinfo)); pinfo.size = sizeof(pinfo); ret = pfm_get_pmu_info(info->pmu, &pinfo); if (ret != PFM_SUCCESS) errx(1, "cannot get PMU info"); ainfo = calloc(info->nattrs, sizeof(*ainfo)); if (!ainfo) err(1, "event %s : ", info->name); /* * extract attribute information and count number * of umasks * * we cannot just drop non umasks because we need * to keep attributes in order for the enumeration * of 2^n */ pfm_for_each_event_attr(i, info) { ainfo[i].size = sizeof(*ainfo); ret = pfm_get_event_attr_info(info->idx, i, options.os, &ainfo[i]); if (ret != PFM_SUCCESS) errx(1, "cannot get attribute info: %s", pfm_strerror(ret)); if (ainfo[i].type == PFM_ATTR_UMASK) numasks++; } if (numasks > options.combo_lim) { warnx("event %s has too many umasks to print all combinations, dropping to simple enumeration", info->name); free(ainfo); show_event_info_compact(info); return; } if (numasks) { if (info->nattrs > (int)((sizeof(total)<<3))) { warnx("too many umasks, cannot show all combinations for event %s", info->name); goto end; } total = 1ULL << info->nattrs; for (u = 1; u < total; u++) { len = sizeof(buf); len -= snprintf(buf, len, "%s::%s", pinfo.name, info->name); if (len <= 0) { warnx("event name too long%s", info->name); goto end; } for(m = u, j = 0; m; m >>=1, j++) { if (m & 0x1ULL) { /* we have hit a non umasks attribute, skip */ if (ainfo[j].type != PFM_ATTR_UMASK) break; if (len < (1 + strlen(ainfo[j].name))) { warnx("umasks combination too long for event %s", buf); break; } strncat(buf, ":", len-1);buf[len-1] = '\0'; len--; strncat(buf, ainfo[j].name, len-1);buf[len-1] = '\0'; len -= strlen(ainfo[j].name); } } /* if found a valid umask combination, check encoding */ if (m == 0) { if (options.encode) ret = print_codes(buf, PFM_PLM0|PFM_PLM3, 
pinfo.max_encoding); else ret = check_valid(buf, PFM_PLM0|PFM_PLM3); if (!ret) printf("%s\n", buf); } } } else { snprintf(buf, sizeof(buf)-1, "%s::%s", pinfo.name, info->name); buf[sizeof(buf)-1] = '\0'; ret = options.encode ? print_codes(buf, PFM_PLM0|PFM_PLM3, pinfo.max_encoding) : 0; if (!ret) printf("%s\n", buf); } end: free(ainfo); } static void show_event_info_compact(pfm_event_info_t *info) { pfm_event_attr_info_t ainfo; pfm_pmu_info_t pinfo; char buf[MAXBUF]; int i, ret, um = 0; memset(&ainfo, 0, sizeof(ainfo)); memset(&pinfo, 0, sizeof(pinfo)); pinfo.size = sizeof(pinfo); ainfo.size = sizeof(ainfo); ret = pfm_get_pmu_info(info->pmu, &pinfo); if (ret != PFM_SUCCESS) errx(1, "cannot get pmu info: %s", pfm_strerror(ret)); if (options.name_only) { if (options.encode) printf("0x%-10"PRIx64, info->code); printf("%s\n", info->name); return; } pfm_for_each_event_attr(i, info) { ret = pfm_get_event_attr_info(info->idx, i, options.os, &ainfo); if (ret != PFM_SUCCESS) errx(1, "cannot get attribute info: %s", pfm_strerror(ret)); if (ainfo.type != PFM_ATTR_UMASK) continue; if (!match_ufilters(&ainfo)) continue; snprintf(buf, sizeof(buf)-1, "%s::%s:%s", pinfo.name, info->name, ainfo.name); buf[sizeof(buf)-1] = '\0'; ret = 0; if (options.encode) { ret = print_codes(buf, PFM_PLM0|PFM_PLM3, pinfo.max_encoding); } if (!ret) { printf("%s", buf); if (options.desc) { printf("%s", options.csv_sep); printf("\"%s. 
%s.\"", info->desc, ainfo.desc); } putchar('\n'); } um++; } if (um == 0) { if (!match_efilters(info)) return; snprintf(buf, sizeof(buf)-1, "%s::%s", pinfo.name, info->name); buf[sizeof(buf)-1] = '\0'; if (options.encode) { ret = print_codes(buf, PFM_PLM0|PFM_PLM3, pinfo.max_encoding); if (ret) return; } printf("%s", buf); if (options.desc) { printf("%s", options.csv_sep); printf("\"%s.\"", info->desc); } putchar('\n'); } } int compare_codes(const void *a, const void *b) { const code_info_t *aa = a; const code_info_t *bb = b; uint64_t m = options.mask; if ((aa->code & m) < (bb->code &m)) return -1; if ((aa->code & m) == (bb->code & m)) return 0; return 1; } static void print_event_flags(pfm_event_info_t *info) { int n = 0; int spec = info->is_speculative; if (info->is_precise) { printf("[precise] "); n++; } if (info->support_hw_smpl) { printf("[hw_smpl] "); n++; } if (spec > PFM_EVENT_INFO_SPEC_NA) { printf("[%s] ", spec == PFM_EVENT_INFO_SPEC_TRUE ? "speculative" : "non-speculative"); n++; } if (!n) printf("None"); } static void print_attr_flags(pfm_event_attr_info_t *info) { int n = 0; int spec = info->is_speculative; if (info->support_no_mods) { printf("[no modifier supported] "); n++; } if (info->is_dfl) { printf("[default] "); n++; } if (info->is_precise) { printf("[precise] "); n++; } if (info->support_hw_smpl) { printf("[hw_smpl] "); n++; } if (spec > PFM_EVENT_INFO_SPEC_NA) { printf("[%s] ", spec == PFM_EVENT_INFO_SPEC_TRUE ? 
"speculative" : "non-speculative"); n++; } if (!n) printf("None "); } static void show_event_info(pfm_event_info_t *info) { pfm_event_attr_info_t ainfo; pfm_pmu_info_t pinfo; int mod = 0, um = 0; int i, ret; const char *src; if (options.name_only) { printf("%s\n", info->name); return; } memset(&ainfo, 0, sizeof(ainfo)); memset(&pinfo, 0, sizeof(pinfo)); pinfo.size = sizeof(pinfo); ainfo.size = sizeof(ainfo); if (!match_efilters(info)) return; ret = pfm_get_pmu_info(info->pmu, &pinfo); if (ret) errx(1, "cannot get pmu info: %s", pfm_strerror(ret)); printf("#-----------------------------\n" "IDX : %d\n" "PMU name : %s (%s)\n" "Name : %s\n" "Equiv : %s\n", info->idx, pinfo.name, pinfo.desc, info->name, info->equiv ? info->equiv : "None"); printf("Flags : "); print_event_flags(info); putchar('\n'); printf("Desc : %s\n", info->desc ? info->desc : "no description available"); printf("Code : 0x%"PRIx64"\n", info->code); pfm_for_each_event_attr(i, info) { ret = pfm_get_event_attr_info(info->idx, i, options.os, &ainfo); if (ret != PFM_SUCCESS) errx(1, "cannot retrieve event %s attribute info: %s", info->name, pfm_strerror(ret)); if (ainfo.ctrl >= PFM_ATTR_CTRL_MAX) { warnx("event: %s has unsupported attribute source %d", info->name, ainfo.ctrl); ainfo.ctrl = PFM_ATTR_CTRL_UNKNOWN; } src = srcs[ainfo.ctrl]; switch(ainfo.type) { case PFM_ATTR_UMASK: if (!match_ufilters(&ainfo)) continue; printf("Umask-%02u : 0x%02"PRIx64" : %s : [%s] : ", um, ainfo.code, src, ainfo.name); print_attr_flags(&ainfo); putchar(':'); if (ainfo.equiv) printf(" Alias to %s", ainfo.equiv); else printf(" %s", ainfo.desc); putchar('\n'); um++; break; case PFM_ATTR_MOD_BOOL: printf("Modif-%02u : 0x%02"PRIx64" : %s : [%s] : %s (boolean)\n", mod, ainfo.code, src, ainfo.name, ainfo.desc); mod++; break; case PFM_ATTR_MOD_INTEGER: printf("Modif-%02u : 0x%02"PRIx64" : %s : [%s] : %s (integer)\n", mod, ainfo.code, src, ainfo.name, ainfo.desc); mod++; break; default: printf("Attr-%02u : 0x%02"PRIx64" : %s : [%s] 
: %s\n", i, ainfo.code, ainfo.name, src, ainfo.desc); } } } static int show_info(char *event, regex_t *preg) { pfm_pmu_info_t pinfo; pfm_event_info_t info; pfm_pmu_t j; int i, ret, match = 0, pname; size_t len, l = 0; char *fullname = NULL; memset(&pinfo, 0, sizeof(pinfo)); memset(&info, 0, sizeof(info)); pinfo.size = sizeof(pinfo); info.size = sizeof(info); pname = event_has_pname(event); /* * scan all supported events, incl. those * from undetected PMU models */ pfm_for_all_pmus(j) { ret = pfm_get_pmu_info(j, &pinfo); if (ret != PFM_SUCCESS) continue; /* no pmu prefix, just look for detected PMU models */ if (!pname && !pinfo.is_present) continue; for (i = pinfo.first_event; i != -1; i = pfm_get_event_next(i)) { ret = pfm_get_event_info(i, options.os, &info); if (ret != PFM_SUCCESS) errx(1, "cannot get event info: %s", pfm_strerror(ret)); len = strlen(info.name) + strlen(pinfo.name) + 1 + 2; if (len > l) { l = len; fullname = realloc(fullname, l); if (!fullname) err(1, "cannot allocate memory"); } sprintf(fullname, "%s::%s", pinfo.name, info.name); if (regexec(preg, fullname, 0, NULL, 0) == 0) { if (options.compact) if (options.combo) show_event_info_combo(&info); else show_event_info_compact(&info); else show_event_info(&info); match++; } } } if (fullname) free(fullname); return match; } static int show_info_sorted(char *event, regex_t *preg) { pfm_pmu_info_t pinfo; pfm_event_info_t info; pfm_pmu_t j; int i, ret, n, match = 0; size_t len, l = 0; char *fullname = NULL; code_info_t *codes; memset(&pinfo, 0, sizeof(pinfo)); memset(&info, 0, sizeof(info)); pinfo.size = sizeof(pinfo); info.size = sizeof(info); pfm_for_all_pmus(j) { ret = pfm_get_pmu_info(j, &pinfo); if (ret != PFM_SUCCESS) continue; codes = malloc(pinfo.nevents * sizeof(*codes)); if (!codes) err(1, "cannot allocate memory\n"); /* scans all supported events */ n = 0; for (i = pinfo.first_event; i != -1; i = pfm_get_event_next(i)) { ret = pfm_get_event_info(i, options.os, &info); if (ret != 
PFM_SUCCESS) errx(1, "cannot get event info: %s", pfm_strerror(ret)); if (info.pmu != j) continue; codes[n].idx = info.idx; codes[n].code = info.code; n++; } qsort(codes, n, sizeof(*codes), compare_codes); for(i=0; i < n; i++) { ret = pfm_get_event_info(codes[i].idx, options.os, &info); if (ret != PFM_SUCCESS) errx(1, "cannot get event info: %s", pfm_strerror(ret)); len = strlen(info.name) + strlen(pinfo.name) + 1 + 2; if (len > l) { l = len; fullname = realloc(fullname, l); if (!fullname) err(1, "cannot allocate memory"); } sprintf(fullname, "%s::%s", pinfo.name, info.name); if (regexec(preg, fullname, 0, NULL, 0) == 0) { if (options.compact) show_event_info_compact(&info); else show_event_info(&info); match++; } } free(codes); } if (fullname) free(fullname); return match; } static void usage(void) { printf("showevtinfo [-L] [-E] [-h] [-s] [-m mask]\n" "-L\t\tlist one event per line (compact mode)\n" "-E\t\tlist one event per line with encoding (compact mode)\n" "-M\t\tdisplay all valid unit masks combination (use with -L or -E)\n" "-h\t\tget help\n" "-s\t\tsort event by PMU and by code based on -m mask\n" "-l\t\tmaximum number of umasks to list all combinations (default: %d)\n" "-F\t\tshow only events and attributes with certain flags (precise,...)\n" "-m mask\t\thexadecimal event code mask, bits to match when sorting\n" "-x sep\t\tuse sep as field separator in compact mode\n" "-D\t\t\tprint event description in compact mode\n" "-O os\t\tshow attributes for the specific operating system\n", COMBO_MAX); } /* * keep: [pmu::]event * drop everything else */ static void drop_event_attributes(char *str) { char *p; p = strchr(str, ':'); if (!p) return; str = p+1; /* keep PMU name */ if (*str == ':') str++; /* stop string at 1st attribute */ p = strchr(str, ':'); if (p) *p = '\0'; } #define EVENT_FLAGS(n, f, l) { .name = n, .ebit = f, .ubit = l } struct attr_flags { const char *name; int ebit; /* bit position in pfm_event_info_t.flags, -1 means ignore */ int ubit; /* bit 
position in pfm_event_attr_info_t.flags, -1 means ignore */ }; static const struct attr_flags event_flags[]={ EVENT_FLAGS("precise", 0, 1), EVENT_FLAGS("pebs", 0, 1), EVENT_FLAGS("default", -1, 0), EVENT_FLAGS("dfl", -1, 0), EVENT_FLAGS(NULL, 0, 0) }; static void parse_filters(char *arg) { const struct attr_flags *attr; char *p; while (arg) { p = strchr(arg, ','); if (p) *p++ = 0; for (attr = event_flags; attr->name; attr++) { if (!strcasecmp(attr->name, arg)) { switch(attr->ebit) { case 0: options.efilter.is_precise = 1; break; case -1: break; default: errx(1, "unknown event flag %d", attr->ebit); } switch (attr->ubit) { case 0: options.ufilter.is_dfl = 1; break; case 1: options.ufilter.is_precise = 1; break; case -1: break; default: errx(1, "unknown umaks flag %d", attr->ubit); } break; } } arg = p; } } static const struct { char *name; pfm_os_t os; } supported_oses[]={ { .name = "none", .os = PFM_OS_NONE }, { .name = "raw", .os = PFM_OS_NONE }, { .name = "pmu", .os = PFM_OS_NONE }, { .name = "perf", .os = PFM_OS_PERF_EVENT}, { .name = "perf_ext", .os = PFM_OS_PERF_EVENT_EXT}, { .name = NULL, } }; static const char *pmu_types[]={ "unknown type", "core", "uncore", "OS generic", }; static void setup_os(char *ostr) { int i; for (i = 0; supported_oses[i].name; i++) { if (!strcmp(supported_oses[i].name, ostr)) { options.os = supported_oses[i].os; return; } } fprintf(stderr, "unknown OS layer %s, choose from:", ostr); for (i = 0; supported_oses[i].name; i++) { if (i) fputc(',', stderr); fprintf(stderr, " %s", supported_oses[i].name); } fputc('\n', stderr); exit(1); } int main(int argc, char **argv) { static char *argv_all[2] = { ".*", NULL }; pfm_pmu_info_t pinfo; char *endptr = NULL; char default_sep[2] = "\t"; char *ostr = NULL; char **args; pfm_pmu_t i; int match; regex_t preg; int ret, c; memset(&pinfo, 0, sizeof(pinfo)); pinfo.size = sizeof(pinfo); while ((c=getopt(argc, argv,"hELsm:MNl:F:x:DO:")) != -1) { switch(c) { case 'L': options.compact = 1; break; case 
'F': parse_filters(optarg); break; case 'E': options.compact = 1; options.encode = 1; break; case 'M': options.combo = 1; break; case 'N': options.name_only = 1; break; case 's': options.sort = 1; break; case 'D': options.desc = 1; break; case 'l': options.combo_lim = atoi(optarg); break; case 'x': options.csv_sep = optarg; break; case 'O': ostr = optarg; break; case 'm': options.mask = strtoull(optarg, &endptr, 16); if (*endptr) errx(1, "mask must be in hexadecimal\n"); break; case 'h': usage(); exit(0); default: errx(1, "unknown option error"); } } /* to allow encoding of events from non detected PMU models */ ret = set_env_var("LIBPFM_ENCODE_INACTIVE", "1", 1); if (ret != PFM_SUCCESS) errx(1, "cannot force inactive encoding"); ret = pfm_initialize(); if (ret != PFM_SUCCESS) errx(1, "cannot initialize libpfm: %s", pfm_strerror(ret)); if (options.mask == 0) options.mask = ~0; if (optind == argc) { args = argv_all; } else { args = argv + optind; } if (!options.csv_sep) options.csv_sep = default_sep; /* avoid combinatorial explosion */ if (options.combo_lim == 0) options.combo_lim = COMBO_MAX; if (ostr) setup_os(ostr); else options.os = PFM_OS_NONE; if (!options.compact) { int total_supported_events = 0; int total_available_events = 0; printf("Supported PMU models:\n"); pfm_for_all_pmus(i) { ret = pfm_get_pmu_info(i, &pinfo); if (ret != PFM_SUCCESS) continue; printf("\t[%d, %s, \"%s\"]\n", i, pinfo.name, pinfo.desc); } printf("Detected PMU models:\n"); pfm_for_all_pmus(i) { ret = pfm_get_pmu_info(i, &pinfo); if (ret != PFM_SUCCESS) continue; if (pinfo.is_present) { if (pinfo.type >= PFM_PMU_TYPE_MAX) pinfo.type = PFM_PMU_TYPE_UNKNOWN; printf("\t[%d, %s, \"%s\", %d events, %d max encoding, %d counters, %s PMU]\n", i, pinfo.name, pinfo.desc, pinfo.nevents, pinfo.max_encoding, pinfo.num_cntrs + pinfo.num_fixed_cntrs, pmu_types[pinfo.type]); total_supported_events += pinfo.nevents; } total_available_events += pinfo.nevents; } printf("Total events: %d available, %d 
supported\n", total_available_events, total_supported_events); } while(*args) { /* drop umasks and modifiers */ drop_event_attributes(*args); if (regcomp(&preg, *args, REG_ICASE)) errx(1, "error in regular expression for event \"%s\"", *argv); if (options.sort) match = show_info_sorted(*args, &preg); else match = show_info(*args, &preg); if (match == 0) errx(1, "event %s not found", *args); args++; } regfree(&preg); pfm_terminate(); return 0; } papi-papi-7-2-0-t/src/libpfm4/include/000077500000000000000000000000001502707512200174075ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libpfm4/include/Makefile000066400000000000000000000030571502707512200210540ustar00rootroot00000000000000# # Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. # Contributed by Stephane Eranian # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # TOPDIR := $(shell if [ "$$PWD" != "" ]; then echo $$PWD; else pwd; fi)/.. 
include $(TOPDIR)/config.mk
include $(TOPDIR)/rules.mk

HEADERS= perfmon/pfmlib.h \
	 perfmon/perf_event.h \
	 perfmon/pfmlib_perf_event.h

dir:
	-mkdir -p $(DESTDIR)$(INCDIR)/perfmon

install: dir
	$(INSTALL) -m 644 $(HEADERS) $(DESTDIR)$(INCDIR)/perfmon

.PHONY: all clean distclean depend dir

papi-papi-7-2-0-t/src/libpfm4/include/perfmon/err.h
/*
 * err.h: substitute header for compiling on Windows with MinGW
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */
#ifndef __PFM_ERR_H__
#define __PFM_ERR_H__

#ifndef PFMLIB_WINDOWS
#include
#else /* PFMLIB_WINDOWS */

#define warnx(...) do { \
	fprintf (stderr, __VA_ARGS__); \
	fprintf (stderr, "\n"); \
} while (0)

#define errx(code, ...) \
do { \
	fprintf (stderr, __VA_ARGS__); \
	fprintf (stderr, "\n"); \
	exit (code); \
} while (0)

#define err(code, ...) do { \
	fprintf (stderr, __VA_ARGS__); \
	fprintf (stderr, " : %s\n", strerror(errno)); \
	exit (code); \
} while (0)

#endif

#endif /* __PFM_ERR_H__ */

papi-papi-7-2-0-t/src/libpfm4/include/perfmon/perf_event.h
/*
 * Copyright (c) 2011 Google, Inc
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/ #ifndef __PERFMON_PERF_EVENT_H__ #define __PERFMON_PERF_EVENT_H__ #pragma GCC visibility push(default) #include #include /* for syscall numbers */ #include #include /* for syscall stub macros */ #include /* for _IO */ #include /* for prctl() comamnds */ #ifdef __cplusplus extern "C" { #endif /* * avoid clashes with actual kernel header file */ #if !(defined(_LINUX_PERF_EVENT_H) || defined(_UAPI_LINUX_PERF_EVENT_H)) /* * attr->type field values */ enum perf_type_id { PERF_TYPE_HARDWARE = 0, PERF_TYPE_SOFTWARE = 1, PERF_TYPE_TRACEPOINT = 2, PERF_TYPE_HW_CACHE = 3, PERF_TYPE_RAW = 4, PERF_TYPE_BREAKPOINT = 5, PERF_TYPE_MAX }; /* * attr->config values for generic HW PMU events * * they get mapped onto actual events by the kernel */ enum perf_hw_id { PERF_COUNT_HW_CPU_CYCLES = 0, PERF_COUNT_HW_INSTRUCTIONS = 1, PERF_COUNT_HW_CACHE_REFERENCES = 2, PERF_COUNT_HW_CACHE_MISSES = 3, PERF_COUNT_HW_BRANCH_INSTRUCTIONS = 4, PERF_COUNT_HW_BRANCH_MISSES = 5, PERF_COUNT_HW_BUS_CYCLES = 6, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND = 7, PERF_COUNT_HW_STALLED_CYCLES_BACKEND = 8, PERF_COUNT_HW_REF_CPU_CYCLES = 9, PERF_COUNT_HW_MAX }; /* * attr->config values for generic HW cache events * * they get mapped onto actual events by the kernel */ enum perf_hw_cache_id { PERF_COUNT_HW_CACHE_L1D = 0, PERF_COUNT_HW_CACHE_L1I = 1, PERF_COUNT_HW_CACHE_LL = 2, PERF_COUNT_HW_CACHE_DTLB = 3, PERF_COUNT_HW_CACHE_ITLB = 4, PERF_COUNT_HW_CACHE_BPU = 5, PERF_COUNT_HW_CACHE_NODE = 6, PERF_COUNT_HW_CACHE_MAX }; enum perf_hw_cache_op_id { PERF_COUNT_HW_CACHE_OP_READ = 0, PERF_COUNT_HW_CACHE_OP_WRITE = 1, PERF_COUNT_HW_CACHE_OP_PREFETCH = 2, PERF_COUNT_HW_CACHE_OP_MAX }; enum perf_hw_cache_op_result_id { PERF_COUNT_HW_CACHE_RESULT_ACCESS = 0, PERF_COUNT_HW_CACHE_RESULT_MISS = 1, PERF_COUNT_HW_CACHE_RESULT_MAX }; /* * attr->config values for SW events */ enum perf_sw_ids { PERF_COUNT_SW_CPU_CLOCK = 0, PERF_COUNT_SW_TASK_CLOCK = 1, PERF_COUNT_SW_PAGE_FAULTS = 2, PERF_COUNT_SW_CONTEXT_SWITCHES = 3, 
PERF_COUNT_SW_CPU_MIGRATIONS = 4, PERF_COUNT_SW_PAGE_FAULTS_MIN = 5, PERF_COUNT_SW_PAGE_FAULTS_MAJ = 6, PERF_COUNT_SW_ALIGNMENT_FAULTS = 7, PERF_COUNT_SW_EMULATION_FAULTS = 8, PERF_COUNT_SW_DUMMY = 9, PERF_COUNT_SW_BPF_OUTPUT = 10, PERF_COUNT_SW_CGROUP_SWITCHES = 11, PERF_COUNT_SW_MAX }; /* * attr->sample_type values */ enum perf_event_sample_format { PERF_SAMPLE_IP = 1U << 0, PERF_SAMPLE_TID = 1U << 1, PERF_SAMPLE_TIME = 1U << 2, PERF_SAMPLE_ADDR = 1U << 3, PERF_SAMPLE_READ = 1U << 4, PERF_SAMPLE_CALLCHAIN = 1U << 5, PERF_SAMPLE_ID = 1U << 6, PERF_SAMPLE_CPU = 1U << 7, PERF_SAMPLE_PERIOD = 1U << 8, PERF_SAMPLE_STREAM_ID = 1U << 9, PERF_SAMPLE_RAW = 1U << 10, PERF_SAMPLE_BRANCH_STACK = 1U << 11, PERF_SAMPLE_REGS_USER = 1U << 12, PERF_SAMPLE_STACK_USER = 1U << 13, PERF_SAMPLE_WEIGHT = 1U << 14, PERF_SAMPLE_DATA_SRC = 1U << 15, PERF_SAMPLE_IDENTIFIER = 1U << 16, PERF_SAMPLE_TRANSACTION = 1U << 17, PERF_SAMPLE_REGS_INTR = 1U << 18, PERF_SAMPLE_PHYS_ADDR = 1U << 19, PERF_SAMPLE_AUX = 1U << 20, PERF_SAMPLE_CGROUP = 1U << 21, PERF_SAMPLE_DATA_PAGE_SIZE = 1U << 22, PERF_SAMPLE_CODE_PAGE_SIZE = 1U << 23, PERF_SAMPLE_WEIGHT_STRUCT = 1U << 24, PERF_SAMPLE_MAX = 1U << 25, }; enum perf_txn_qualifier { PERF_TXN_ELISION = (1 << 0), PERF_TXN_TRANSACTION = (1 << 1), PERF_TXN_SYNC = (1 << 2), PERF_TXN_ASYNC = (1 << 3), PERF_TXN_RETRY = (1 << 4), PERF_TXN_CONFLICT = (1 << 5), PERF_TXN_CAPACITY_WRITE = (1 << 6), PERF_TXN_CAPACITY_READ = (1 << 7), PERF_TXN_MAX = (1 << 8), PERF_TXN_ABORT_MASK = (0xffffffffULL << 32), PERF_TXN_ABORT_SHIFT = 32, }; /* * branch_sample_type values */ enum perf_branch_sample_type_shift { PERF_SAMPLE_BRANCH_USER_SHIFT = 0, PERF_SAMPLE_BRANCH_KERNEL_SHIFT = 1, PERF_SAMPLE_BRANCH_HV_SHIFT = 2, PERF_SAMPLE_BRANCH_ANY_SHIFT = 3, PERF_SAMPLE_BRANCH_ANY_CALL_SHIFT = 4, PERF_SAMPLE_BRANCH_ANY_RETURN_SHIFT = 5, PERF_SAMPLE_BRANCH_IND_CALL_SHIFT = 6, PERF_SAMPLE_BRANCH_ABORT_TX_SHIFT = 7, PERF_SAMPLE_BRANCH_IN_TX_SHIFT = 8, PERF_SAMPLE_BRANCH_NO_TX_SHIFT = 9, 
PERF_SAMPLE_BRANCH_COND_SHIFT		= 10,
	PERF_SAMPLE_BRANCH_CALL_STACK_SHIFT	= 11,
	PERF_SAMPLE_BRANCH_IND_JUMP_SHIFT	= 12,
	PERF_SAMPLE_BRANCH_CALL_SHIFT		= 13,
	PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT	= 14,
	PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT	= 15,
	PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT	= 16,
	PERF_SAMPLE_BRANCH_HW_INDEX_SHIFT	= 17,
	PERF_SAMPLE_BRANCH_PRIV_SAVE_SHIFT	= 18,
	PERF_SAMPLE_BRANCH_COUNTERS_SHIFT	= 19,

	PERF_SAMPLE_BRANCH_MAX_SHIFT		/* non-ABI */
};

enum perf_branch_sample_type {
	PERF_SAMPLE_BRANCH_USER		= 1U << PERF_SAMPLE_BRANCH_USER_SHIFT,
	PERF_SAMPLE_BRANCH_KERNEL	= 1U << PERF_SAMPLE_BRANCH_KERNEL_SHIFT,
	PERF_SAMPLE_BRANCH_HV		= 1U << PERF_SAMPLE_BRANCH_HV_SHIFT,
	PERF_SAMPLE_BRANCH_ANY		= 1U << PERF_SAMPLE_BRANCH_ANY_SHIFT,
	PERF_SAMPLE_BRANCH_ANY_CALL	= 1U << PERF_SAMPLE_BRANCH_ANY_CALL_SHIFT,
	PERF_SAMPLE_BRANCH_ANY_RETURN	= 1U << PERF_SAMPLE_BRANCH_ANY_RETURN_SHIFT,
	PERF_SAMPLE_BRANCH_IND_CALL	= 1U << PERF_SAMPLE_BRANCH_IND_CALL_SHIFT,
	PERF_SAMPLE_BRANCH_ABORT_TX	= 1U << PERF_SAMPLE_BRANCH_ABORT_TX_SHIFT,
	PERF_SAMPLE_BRANCH_IN_TX	= 1U << PERF_SAMPLE_BRANCH_IN_TX_SHIFT,
	PERF_SAMPLE_BRANCH_NO_TX	= 1U << PERF_SAMPLE_BRANCH_NO_TX_SHIFT,
	PERF_SAMPLE_BRANCH_COND		= 1U << PERF_SAMPLE_BRANCH_COND_SHIFT,
	PERF_SAMPLE_BRANCH_CALL_STACK	= 1U << PERF_SAMPLE_BRANCH_CALL_STACK_SHIFT,
	PERF_SAMPLE_BRANCH_IND_JUMP	= 1U << PERF_SAMPLE_BRANCH_IND_JUMP_SHIFT,
	PERF_SAMPLE_BRANCH_CALL		= 1U << PERF_SAMPLE_BRANCH_CALL_SHIFT,
	PERF_SAMPLE_BRANCH_NO_FLAGS	= 1U << PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT,
	PERF_SAMPLE_BRANCH_NO_CYCLES	= 1U << PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT,
	PERF_SAMPLE_BRANCH_TYPE_SAVE	= 1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT,
	PERF_SAMPLE_BRANCH_HW_INDEX	= 1U << PERF_SAMPLE_BRANCH_HW_INDEX_SHIFT,
	PERF_SAMPLE_BRANCH_PRIV_SAVE	= 1U << PERF_SAMPLE_BRANCH_PRIV_SAVE_SHIFT,
	PERF_SAMPLE_BRANCH_COUNTERS	= 1U << PERF_SAMPLE_BRANCH_COUNTERS_SHIFT,

	PERF_SAMPLE_BRANCH_MAX		= 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
};

enum perf_branch_type {
	PERF_BR_UNKNOWN	= 0,
	PERF_BR_COND	= 1,
	PERF_BR_UNCOND	= 2,
PERF_BR_IND = 3, PERF_BR_CALL = 4, PERF_BR_IND_CALL = 5, PERF_BR_RET = 6, PERF_BR_SYSCALL = 7, PERF_BR_SYSRET = 8, PERF_BR_COND_CALL = 9, PERF_BR_COND_RET = 10, PERF_BR_ERET = 11, PERF_BR_IRQ = 12, PERF_BR_SERROR = 13, PERF_BR_NO_TX = 14, PERF_BR_EXTEND_ABI = 15, PERF_BR_MAX, }; enum perf_branch_spec { PERF_BR_SPEC_NA = 0, PERF_BR_SPEC_WRONG_PATH = 1, PERF_BR_NON_SPEC_CORRECT_PATH = 2, PERF_BR_SPEC_CORRECT_PATH = 3, PERF_BR_SPEC_MAX, }; enum perf_branch_fault { PERF_BR_NEW_FAULT_ALGN = 0, PERF_BR_NEW_FAULT_DATA = 1, PERF_BR_NEW_FAULT_INST = 2, PERF_BR_NEW_ARCH_1 = 3, PERF_BR_NEW_ARCH_2 = 4, PERF_BR_NEW_ARCH_3 = 5, PERF_BR_NEW_ARCH_4 = 6, PERF_BR_NEW_ARCH_5 = 7, PERF_BR_NEW_MAX, }; enum perf_branch_priv { PERF_BR_PRIV_UNKNOWN = 0, PERF_BR_PRIV_USER = 1, PERF_BR_PRIV_KERNEL = 2, PERF_BR_PRIV_HV = 3, }; enum perf_sample_regs_abi { PERF_SAMPLE_REGS_ABI_NONE = 0, PERF_SAMPLE_REGS_ABI_32 = 1, PERF_SAMPLE_REGS_ABI_64 = 2, }; /* * attr->read_format values */ enum perf_event_read_format { PERF_FORMAT_TOTAL_TIME_ENABLED = 1U << 0, PERF_FORMAT_TOTAL_TIME_RUNNING = 1U << 1, PERF_FORMAT_ID = 1U << 2, PERF_FORMAT_GROUP = 1U << 3, PERF_FORMAT_LOST = 1U << 4, PERF_FORMAT_MAX = 1U << 5, }; #define PERF_ATTR_SIZE_VER0 64 /* sizeof first published struct */ #define PERF_ATTR_SIZE_VER1 72 /* add: config2 */ #define PERF_ATTR_SIZE_VER2 80 /* add: branch_sample_type */ #define PERF_ATTR_SIZE_VER3 96 /* add: sample_regs_user */ /* add: sample_stack_user */ #define PERF_ATTR_SIZE_VER4 104 /* add: sample_regs_intr */ #define PERF_ATTR_SIZE_VER5 112 /* add: aux_watermark */ #define PERF_ATTR_SIZE_VER6 120 /* add: aux_sample_size */ #define PERF_ATTR_SIZE_VER7 128 /* add: sig_data */ #define PERF_ATTR_SIZE_VER8 136 /* add: config3 */ /* * SWIG doesn't deal well with anonymous nested structures * so we add names for the nested structure only when swig * is used. 
*/

#ifdef SWIG
#define SWIG_NAME(x) x
#else
#define SWIG_NAME(x)
#endif /* SWIG */

/*
 * perf_event_attr struct passed to perf_event_open()
 */
typedef struct perf_event_attr {
	uint32_t type;
	uint32_t size;
	uint64_t config;

	union {
		uint64_t sample_period;
		uint64_t sample_freq;
	} SWIG_NAME(sample);

	uint64_t sample_type;
	uint64_t read_format;

	uint64_t disabled       : 1,
		 inherit        : 1,
		 pinned         : 1,
		 exclusive      : 1,
		 exclude_user   : 1,
		 exclude_kernel : 1,
		 exclude_hv     : 1,
		 exclude_idle   : 1,
		 mmap           : 1,
		 comm           : 1,
		 freq           : 1,
		 inherit_stat   : 1,
		 enable_on_exec : 1,
		 task           : 1,
		 watermark      : 1,
		 precise_ip     : 2,
		 mmap_data      : 1,
		 sample_id_all  : 1,
		 exclude_host   : 1,
		 exclude_guest  : 1,
		 exclude_callchain_kernel : 1,
		 exclude_callchain_user   : 1,
		 mmap2          : 1,
		 comm_exec      : 1,
		 use_clockid    : 1,
		 context_switch : 1,
		 write_backward : 1,
		 namespaces     : 1,
		 ksymbol        : 1,
		 bpf_event      : 1,
		 aux_output     : 1,
		 cgroup         : 1,
		 text_poke      : 1,
		 build_id       : 1,
		 inherit_thread : 1,
		 remove_on_exec : 1,
		 sigtrap        : 1,
		 __reserved_1   : 26;

	union {
		uint32_t wakeup_events;
		uint32_t wakeup_watermark;
	} SWIG_NAME(wakeup);

	uint32_t bp_type;

	union {
		uint64_t bp_addr;
		uint64_t kprobe_func;
		uint64_t uprobe_path;
		uint64_t config1; /* extend config */
	} SWIG_NAME(bpa);

	union {
		uint64_t bp_len;
		uint64_t kprobe_addr;
		uint64_t probe_offset;
		uint64_t config2; /* extend config1 */
	} SWIG_NAME(bpb);

	uint64_t branch_sample_type;
	uint64_t sample_regs_user;
	uint32_t sample_stack_user;
	int32_t  clockid;
	uint64_t sample_regs_intr;
	uint32_t aux_watermark;
	uint16_t sample_max_stack;
	uint16_t __reserved_2;
	uint32_t aux_sample_size;
	uint32_t __reserved_3;
	uint64_t sig_data;
	uint64_t config3;
} perf_event_attr_t;

struct perf_branch_entry {
	uint64_t from;
	uint64_t to;
	uint64_t mispred:1,   /* target mispredicted */
		 predicted:1, /* target predicted */
		 in_tx:1,     /* in transaction */
		 abort:1,     /* transaction abort */
		 cycles:16,   /* cycle count to last branch */
		 type:4,      /* branch type */
		 spec:2,      /* branch speculation info */
		 new_type:4,  /* additional branch type */
		 priv:3,      /* privilege level */
		 reserved:31;
};

/*
 * branch stack layout:
 *  nr: number of taken branches stored in entries[]
 *
 * Note that nr can vary from sample to sample
 * branches (to, from) are stored from most recent
 * to least recent, i.e., entries[0] contains the most
 * recent branch.
 */
struct perf_branch_stack {
	uint64_t nr;
	struct perf_branch_entry entries[0];
};

/*
 * Structure used by below PERF_EVENT_IOC_QUERY_BPF command
 * to query bpf programs attached to the same perf tracepoint
 * as the given perf event.
 */
struct perf_event_query_bpf {
	uint32_t ids_len;
	uint32_t prog_cnt;
	uint32_t ids[0];
};

union perf_sample_weight {
	uint64_t full;
#if __BYTE_ORDER == __LITTLE_ENDIAN
	struct {
		uint32_t var1_dw;
		uint16_t var2_w;
		uint16_t var3_w;
	};
#elif __BYTE_ORDER == __BIG_ENDIAN
	struct {
		uint16_t var3_w;
		uint16_t var2_w;
		uint32_t var1_dw;
	};
#else
#error "Unsupported endianness"
#endif
};

/*
 * perf_events ioctl commands, use with event fd
 */
#define PERF_EVENT_IOC_ENABLE		_IO ('$', 0)
#define PERF_EVENT_IOC_DISABLE		_IO ('$', 1)
#define PERF_EVENT_IOC_REFRESH		_IO ('$', 2)
#define PERF_EVENT_IOC_RESET		_IO ('$', 3)
#define PERF_EVENT_IOC_PERIOD		_IOW('$', 4, uint64_t)
#define PERF_EVENT_IOC_SET_OUTPUT	_IO ('$', 5)
#define PERF_EVENT_IOC_SET_FILTER	_IOW('$', 6, char *)
#define PERF_EVENT_IOC_ID		_IOR('$', 7, uint64_t *)
#define PERF_EVENT_IOC_SET_BPF		_IOW('$', 8, uint32_t)
#define PERF_EVENT_IOC_PAUSE_OUTPUT	_IOW('$', 9, __u32)
#define PERF_EVENT_IOC_QUERY_BPF	_IOWR('$', 10, struct perf_event_query_bpf *)
#define PERF_EVENT_IOC_MODIFY_ATTRIBUTES _IOW('$', 11, struct perf_event_attr *)

/*
 * ioctl() 3rd argument
 */
enum perf_event_ioc_flags {
	PERF_IOC_FLAG_GROUP = 1U << 0,
};

/*
 * mmapped sampling buffer layout
 * occupies a 4kb page
 */
struct perf_event_mmap_page {
	uint32_t version;
	uint32_t compat_version;
	uint32_t lock;
	uint32_t index;
	int64_t  offset;
	uint64_t time_enabled;
	uint64_t time_running;
	union {
		uint64_t capabilities;
		struct {
			uint64_t cap_bit0:1,
				 cap_bit0_is_deprecated:1,
				 cap_usr_rdpmc:1,
				 cap_user_time:1,
				 cap_user_time_zero:1,
				 cap_user_time_short:1,
				 cap_____res:58;
		} SWIG_NAME(rdmap_cap_s);
	} SWIG_NAME(rdmap_cap_u);
	uint16_t pmc_width;
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_offset;
	uint64_t time_zero;
	uint32_t size;
	uint32_t __reserved_1;
	uint64_t time_cycles;
	uint64_t time_mask;
	uint8_t  __reserved[116*8];
	uint64_t data_head;
	uint64_t data_tail;
	uint64_t data_offset;
	uint64_t data_size;
	uint64_t aux_head;
	uint64_t aux_tail;
	uint64_t aux_offset;
	uint64_t aux_size;
};

/*
 * sampling buffer event header
 */
struct perf_event_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/*
 * event header misc field values
 */
#define PERF_EVENT_MISC_CPUMODE_MASK	(3 << 0)
#define PERF_EVENT_MISC_CPUMODE_UNKNOWN	(0 << 0)
#define PERF_EVENT_MISC_KERNEL		(1 << 0)
#define PERF_EVENT_MISC_USER		(2 << 0)
#define PERF_EVENT_MISC_HYPERVISOR	(3 << 0)
#define PERF_RECORD_MISC_GUEST_KERNEL	(4 << 0)
#define PERF_RECORD_MISC_GUEST_USER	(5 << 0)
#define PERF_RECORD_MISC_PROC_MAP_PARSE_TIMEOUT (1 << 12)
#define PERF_RECORD_MISC_MMAP_DATA	(1 << 13)
#define PERF_RECORD_MISC_COMM_EXEC	(1 << 13)
#define PERF_RECORD_MISC_FORK_EXEC	(1 << 13)
#define PERF_RECORD_MISC_SWITCH_OUT	(1 << 13)
#define PERF_RECORD_MISC_EXACT		(1 << 14)
#define PERF_RECORD_MISC_EXACT_IP	(1 << 14)
#define PERF_RECORD_MISC_SWITCH_OUT_PREEMPT (1 << 14)
#define PERF_RECORD_MISC_MMAP_BUILD_ID	(1 << 14)
#define PERF_RECORD_MISC_EXT_RESERVED	(1 << 15)

/*
 * header->type values
 */
enum perf_event_type {
	PERF_RECORD_MMAP		= 1,
	PERF_RECORD_LOST		= 2,
	PERF_RECORD_COMM		= 3,
	PERF_RECORD_EXIT		= 4,
	PERF_RECORD_THROTTLE		= 5,
	PERF_RECORD_UNTHROTTLE		= 6,
	PERF_RECORD_FORK		= 7,
	PERF_RECORD_READ		= 8,
	PERF_RECORD_SAMPLE		= 9,
	PERF_RECORD_MMAP2		= 10,
	PERF_RECORD_AUX			= 11,
	PERF_RECORD_ITRACE_START	= 12,
	PERF_RECORD_LOST_SAMPLES	= 13,
	PERF_RECORD_SWITCH		= 14,
	PERF_RECORD_SWITCH_CPU_WIDE	= 15,
	PERF_RECORD_NAMESPACES		= 16,
	PERF_RECORD_KSYMBOL		= 17,
	PERF_RECORD_BPF_EVENT		= 18,
	PERF_RECORD_CGROUP		= 19,
	PERF_RECORD_TEXT_POKE		= 20,
	PERF_RECORD_AUX_OUTPUT_HW_ID	= 21,
	PERF_RECORD_MAX
};

struct perf_ns_link_info {
	uint64_t dev;
	uint64_t ino;
};

enum perf_ns_type {
	NET_NS_INDEX	= 0,
	UTS_NS_INDEX	= 1,
	IPC_NS_INDEX	= 2,
	PID_NS_INDEX	= 3,
	USER_NS_INDEX	= 4,
	MNT_NS_INDEX	= 5,
	CGROUP_NS_INDEX	= 6,
	NR_NAMESPACES
};

enum perf_record_ksymbol_type {
	PERF_RECORD_KSYMBOL_TYPE_UNKNOWN = 0,
	PERF_RECORD_KSYMBOL_TYPE_BPF	 = 1,
	/*
	 * Out of line code such as kprobe-replaced instructions or optimized
	 * kprobes or ftrace trampolines.
	 */
	PERF_RECORD_KSYMBOL_TYPE_OOL	 = 2,
	PERF_RECORD_KSYMBOL_TYPE_MAX	/* non-ABI */
};

#define PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER (1 << 0)

enum perf_bpf_event_type {
	PERF_BPF_EVENT_UNKNOWN		= 0,
	PERF_BPF_EVENT_PROG_LOAD	= 1,
	PERF_BPF_EVENT_PROG_UNLOAD	= 2,
	PERF_BPF_EVENT_MAX,		/* non-ABI */
};

#define PERF_MAX_STACK_DEPTH		127
#define PERF_MAX_CONTEXTS_PER_STACK	8

enum perf_callchain_context {
	PERF_CONTEXT_HV			= (uint64_t)-32,
	PERF_CONTEXT_KERNEL		= (uint64_t)-128,
	PERF_CONTEXT_USER		= (uint64_t)-512,
	PERF_CONTEXT_GUEST		= (uint64_t)-2048,
	PERF_CONTEXT_GUEST_KERNEL	= (uint64_t)-2176,
	PERF_CONTEXT_GUEST_USER		= (uint64_t)-2560,
	PERF_CONTEXT_MAX		= (uint64_t)-4095,
};

#define PERF_AUX_FLAG_TRUNCATED			0x01
#define PERF_AUX_FLAG_OVERWRITE			0x02
#define PERF_AUX_FLAG_PARTIAL			0x04
#define PERF_AUX_FLAG_COLLISION			0x08
#define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK	0xff00

/* CoreSight PMU AUX buffer formats */
#define PERF_AUX_FLAG_CORESIGHT_FORMAT_CORESIGHT	0x0000
#define PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW		0x0100

/*
 * flags for perf_event_open()
 */
#define PERF_FLAG_FD_NO_GROUP	(1U << 0)
#define PERF_FLAG_FD_OUTPUT	(1U << 1)
#define PERF_FLAG_PID_CGROUP	(1U << 2)
#define PERF_FLAG_FD_CLOEXEC	(1UL << 3)

#endif /* _LINUX_PERF_EVENT_H */

#ifndef __NR_perf_event_open
#ifdef __x86_64__
# define __NR_perf_event_open 298
#endif
#ifdef __i386__
# define __NR_perf_event_open 336
#endif
#ifdef __powerpc__
# define __NR_perf_event_open 319
#endif
#ifdef __s390__
# define __NR_perf_event_open 331
#endif
#ifdef __arm__
#if defined(__ARM_EABI__) || defined(__thumb__)
# define __NR_perf_event_open 364
#else
# define __NR_perf_event_open (0x900000+364)
#endif
#endif
#ifdef __mips__
#if _MIPS_SIM == _MIPS_SIM_ABI32
# define __NR_perf_event_open __NR_Linux + 333
#elif _MIPS_SIM == _MIPS_SIM_ABI64
# define __NR_perf_event_open __NR_Linux + 292
#else /* if _MIPS_SIM == MIPS_SIM_NABI32 */
# define __NR_perf_event_open __NR_Linux + 296
#endif
#endif
#endif /* __NR_perf_event_open */

/*
 * perf_event_open() syscall stub
 */
static inline int
perf_event_open(
	struct perf_event_attr	*hw_event_uptr,
	pid_t			pid,
	int			cpu,
	int			group_fd,
	unsigned long		flags)
{
	return syscall(__NR_perf_event_open, hw_event_uptr, pid, cpu, group_fd, flags);
}

/*
 * compensate for some distros which do not
 * have recent enough linux/prctl.h
 */
#ifndef PR_TASK_PERF_EVENTS_DISABLE
#define PR_TASK_PERF_EVENTS_ENABLE	32
#define PR_TASK_PERF_EVENTS_DISABLE	31
#endif

/* handle case of older system perf_event.h included before this file */
#ifndef PERF_MEM_OP_NA

union perf_mem_data_src {
	uint64_t val;
	struct {
		uint64_t mem_op:5,	/* type of opcode */
			 mem_lvl:14,	/* memory hierarchy level */
			 mem_snoop:5,	/* snoop mode */
			 mem_lock:2,	/* lock instr */
			 mem_dtlb:7,	/* tlb access */
			 mem_lvl_num:4,	/* memory hierarchy level number */
			 mem_remote:1,	/* remote */
			 mem_snoopx:2,	/* snoop mode, ext */
			 mem_blk:3,	/* access blocked */
			 mem_hops:3,	/* hop level */
			 mem_rsvd:18;
	};
};

/* type of opcode (load/store/prefetch,code) */
#define PERF_MEM_OP_NA		0x01 /* not available */
#define PERF_MEM_OP_LOAD	0x02 /* load instruction */
#define PERF_MEM_OP_STORE	0x04 /* store instruction */
#define PERF_MEM_OP_PFETCH	0x08 /* prefetch */
#define PERF_MEM_OP_EXEC	0x10 /* code (execution) */
#define PERF_MEM_OP_SHIFT	0

/* memory hierarchy (memory level, hit or miss) */
#define PERF_MEM_LVL_NA		0x01   /* not available */
#define PERF_MEM_LVL_HIT	0x02   /* hit level */
#define PERF_MEM_LVL_MISS	0x04   /* miss level */
#define PERF_MEM_LVL_L1		0x08   /* L1 */
#define PERF_MEM_LVL_LFB	0x10   /* Line Fill Buffer */
#define PERF_MEM_LVL_L2		0x20   /* L2 */
#define PERF_MEM_LVL_L3		0x40   /* L3 */
#define PERF_MEM_LVL_LOC_RAM	0x80   /* Local DRAM */
#define PERF_MEM_LVL_REM_RAM1	0x100  /* Remote DRAM (1 hop) */
#define PERF_MEM_LVL_REM_RAM2	0x200  /* Remote DRAM (2 hops) */
#define PERF_MEM_LVL_REM_CCE1	0x400  /* Remote Cache (1 hop) */
#define PERF_MEM_LVL_REM_CCE2	0x800  /* Remote Cache (2 hops) */
#define PERF_MEM_LVL_IO		0x1000 /* I/O memory */
#define PERF_MEM_LVL_UNC	0x2000 /* Uncached memory */
#define PERF_MEM_LVL_SHIFT	5

#define PERF_MEM_REMOTE_REMOTE	0x01 /* Remote */
#define PERF_MEM_REMOTE_SHIFT	37

/* snoop mode */
#define PERF_MEM_SNOOP_NA	0x01 /* not available */
#define PERF_MEM_SNOOP_NONE	0x02 /* no snoop */
#define PERF_MEM_SNOOP_HIT	0x04 /* snoop hit */
#define PERF_MEM_SNOOP_MISS	0x08 /* snoop miss */
#define PERF_MEM_SNOOP_HITM	0x10 /* snoop hit modified */
#define PERF_MEM_SNOOP_SHIFT	19

#define PERF_MEM_SNOOPX_FWD	0x01 /* forward */
/* 1 free */
#define PERF_MEM_SNOOPX_SHIFT	38

/* locked instruction */
#define PERF_MEM_LOCK_NA	0x01 /* not available */
#define PERF_MEM_LOCK_LOCKED	0x02 /* locked transaction */
#define PERF_MEM_LOCK_SHIFT	24

/* TLB access */
#define PERF_MEM_TLB_NA		0x01 /* not available */
#define PERF_MEM_TLB_HIT	0x02 /* hit level */
#define PERF_MEM_TLB_MISS	0x04 /* miss level */
#define PERF_MEM_TLB_L1		0x08 /* L1 */
#define PERF_MEM_TLB_L2		0x10 /* L2 */
#define PERF_MEM_TLB_WK		0x20 /* Hardware Walker */
#define PERF_MEM_TLB_OS		0x40 /* OS fault handler */
#define PERF_MEM_TLB_SHIFT	26

/* Access blocked */
#define PERF_MEM_BLK_NA		0x01 /* not available */
#define PERF_MEM_BLK_DATA	0x02 /* data could not be forwarded */
#define PERF_MEM_BLK_ADDR	0x04 /* address conflict */
#define PERF_MEM_BLK_SHIFT	40

/* hop level */
#define PERF_MEM_HOPS_0		0x01 /* remote core, same node */
#define PERF_MEM_HOPS_1		0x02 /* remote node, same socket */
#define PERF_MEM_HOPS_2		0x03 /* remote
socket, same board */
#define PERF_MEM_HOPS_3		0x04 /* remote board */
/* 5-7 available */
#define PERF_MEM_HOPS_SHIFT	43

#define PERF_MEM_S(a, s) \
	(((uint64_t)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)

#endif /* PERF_MEM_OP_NA */

#ifdef __cplusplus /* extern C */
}
#endif

#pragma GCC visibility pop

#endif /* __PERFMON_PERF_EVENT_H__ */

/* src/libpfm4/include/perfmon/pfmlib.h */
/*
 * Copyright (c) 2009 Google, Inc
 * Contributed by Stephane Eranian
 *
 * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES.
 * Contributed by John Linford
 *
 * Based on:
 * Copyright (c) 2001-2007 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/ #ifndef __PFMLIB_H__ #define __PFMLIB_H__ #pragma GCC visibility push(default) #ifdef __cplusplus extern "C" { #endif #include #include #include #include #define LIBPFM_VERSION (4 << 16 | 0) #define PFM_MAJ_VERSION(v) ((v)>>16) #define PFM_MIN_VERSION(v) ((v) & 0xffff) /* * ABI revision level */ #define LIBPFM_ABI_VERSION 0 /* * priv level mask (for dfl_plm) */ #define PFM_PLM0 0x01 /* kernel */ #define PFM_PLM1 0x02 /* not yet used */ #define PFM_PLM2 0x04 /* not yet used */ #define PFM_PLM3 0x08 /* priv level 3, 2, 1 (x86) */ #define PFM_PLMH 0x10 /* hypervisor */ /* * Performance Event Source * * The source is what is providing events. * It can be: * - Hardware Performance Monitoring Unit (PMU) * - a particular kernel subsystem * * Identifiers are guaranteed constant across libpfm revisions * * New sources must be added at the end before PFM_PMU_MAX */ typedef enum { PFM_PMU_NONE= 0, /* no PMU */ PFM_PMU_GEN_IA64, /* Intel IA-64 architected PMU */ PFM_PMU_ITANIUM, /* Intel Itanium */ PFM_PMU_ITANIUM2, /* Intel Itanium 2 */ PFM_PMU_MONTECITO, /* Intel Dual-Core Itanium 2 9000 */ PFM_PMU_AMD64, /* AMD AMD64 (obsolete) */ PFM_PMU_I386_P6, /* Intel PIII (P6 core) */ PFM_PMU_INTEL_NETBURST, /* Intel Netburst (Pentium 4) */ PFM_PMU_INTEL_NETBURST_P, /* Intel Netburst Prescott (Pentium 4) */ PFM_PMU_COREDUO, /* Intel Core Duo/Core Solo */ PFM_PMU_I386_PM, /* Intel Pentium M */ PFM_PMU_INTEL_CORE, /* Intel Core */ PFM_PMU_INTEL_PPRO, /* Intel Pentium Pro */ PFM_PMU_INTEL_PII, /* Intel Pentium II */ PFM_PMU_INTEL_ATOM, /* Intel Atom */ PFM_PMU_INTEL_NHM, /* Intel Nehalem core PMU */ PFM_PMU_INTEL_NHM_EX, /* Intel Nehalem-EX core PMU */ PFM_PMU_INTEL_NHM_UNC, /* Intel Nehalem uncore PMU */ PFM_PMU_INTEL_X86_ARCH, /* Intel X86 architectural PMU */ PFM_PMU_MIPS_20KC, /* MIPS 20KC */ PFM_PMU_MIPS_24K, /* MIPS 24K */ PFM_PMU_MIPS_25KF, /* MIPS 25KF */ PFM_PMU_MIPS_34K, /* MIPS 34K */ PFM_PMU_MIPS_5KC, /* MIPS 5KC */ PFM_PMU_MIPS_74K, /* MIPS 74K */ PFM_PMU_MIPS_R10000, /* 
MIPS R10000 */ PFM_PMU_MIPS_R12000, /* MIPS R12000 */ PFM_PMU_MIPS_RM7000, /* MIPS RM7000 */ PFM_PMU_MIPS_RM9000, /* MIPS RM9000 */ PFM_PMU_MIPS_SB1, /* MIPS SB1/SB1A */ PFM_PMU_MIPS_VR5432, /* MIPS VR5432 */ PFM_PMU_MIPS_VR5500, /* MIPS VR5500 */ PFM_PMU_MIPS_ICE9A, /* SiCortex ICE9A */ PFM_PMU_MIPS_ICE9B, /* SiCortex ICE9B */ PFM_PMU_POWERPC, /* POWERPC */ PFM_PMU_CELL, /* IBM CELL */ PFM_PMU_SPARC_ULTRA12, /* UltraSPARC I, II, IIi, and IIe */ PFM_PMU_SPARC_ULTRA3, /* UltraSPARC III */ PFM_PMU_SPARC_ULTRA3I, /* UltraSPARC IIIi and IIIi+ */ PFM_PMU_SPARC_ULTRA3PLUS, /* UltraSPARC III+ and IV */ PFM_PMU_SPARC_ULTRA4PLUS, /* UltraSPARC IV+ */ PFM_PMU_SPARC_NIAGARA1, /* Niagara-1 */ PFM_PMU_SPARC_NIAGARA2, /* Niagara-2 */ PFM_PMU_PPC970, /* IBM PowerPC 970(FX,GX) */ PFM_PMU_PPC970MP, /* IBM PowerPC 970MP */ PFM_PMU_POWER3, /* IBM POWER3 */ PFM_PMU_POWER4, /* IBM POWER4 */ PFM_PMU_POWER5, /* IBM POWER5 */ PFM_PMU_POWER5p, /* IBM POWER5+ */ PFM_PMU_POWER6, /* IBM POWER6 */ PFM_PMU_POWER7, /* IBM POWER7 */ PFM_PMU_PERF_EVENT, /* perf_event PMU */ PFM_PMU_INTEL_WSM, /* Intel Westmere single-socket (Clarkdale) */ PFM_PMU_INTEL_WSM_DP, /* Intel Westmere dual-socket (Westmere-EP, Gulftown) */ PFM_PMU_INTEL_WSM_UNC, /* Intel Westmere uncore PMU */ PFM_PMU_AMD64_K7, /* AMD AMD64 K7 */ PFM_PMU_AMD64_K8_REVB, /* AMD AMD64 K8 RevB */ PFM_PMU_AMD64_K8_REVC, /* AMD AMD64 K8 RevC */ PFM_PMU_AMD64_K8_REVD, /* AMD AMD64 K8 RevD */ PFM_PMU_AMD64_K8_REVE, /* AMD AMD64 K8 RevE */ PFM_PMU_AMD64_K8_REVF, /* AMD AMD64 K8 RevF */ PFM_PMU_AMD64_K8_REVG, /* AMD AMD64 K8 RevG */ PFM_PMU_AMD64_FAM10H_BARCELONA, /* AMD AMD64 Fam10h Barcelona RevB */ PFM_PMU_AMD64_FAM10H_SHANGHAI, /* AMD AMD64 Fam10h Shanghai RevC */ PFM_PMU_AMD64_FAM10H_ISTANBUL, /* AMD AMD64 Fam10h Istanbul RevD */ PFM_PMU_ARM_CORTEX_A8, /* ARM Cortex A8 */ PFM_PMU_ARM_CORTEX_A9, /* ARM Cortex A9 */ PFM_PMU_TORRENT, /* IBM Torrent hub chip */ PFM_PMU_INTEL_SNB, /* Intel Sandy Bridge (single socket) */
PFM_PMU_AMD64_FAM14H_BOBCAT, /* AMD AMD64 Fam14h Bobcat */ PFM_PMU_AMD64_FAM15H_INTERLAGOS,/* AMD AMD64 Fam15h Interlagos */ PFM_PMU_INTEL_SNB_EP, /* Intel SandyBridge EP */ PFM_PMU_AMD64_FAM12H_LLANO, /* AMD AMD64 Fam12h Llano */ PFM_PMU_AMD64_FAM11H_TURION, /* AMD AMD64 Fam11h Turion */ PFM_PMU_INTEL_IVB, /* Intel IvyBridge */ PFM_PMU_ARM_CORTEX_A15, /* ARM Cortex A15 */ PFM_PMU_INTEL_SNB_UNC_CB0, /* Intel SandyBridge C-box 0 uncore PMU */ PFM_PMU_INTEL_SNB_UNC_CB1, /* Intel SandyBridge C-box 1 uncore PMU */ PFM_PMU_INTEL_SNB_UNC_CB2, /* Intel SandyBridge C-box 2 uncore PMU */ PFM_PMU_INTEL_SNB_UNC_CB3, /* Intel SandyBridge C-box 3 uncore PMU */ PFM_PMU_INTEL_SNBEP_UNC_CB0, /* Intel SandyBridge-EP C-Box core 0 uncore */ PFM_PMU_INTEL_SNBEP_UNC_CB1, /* Intel SandyBridge-EP C-Box core 1 uncore */ PFM_PMU_INTEL_SNBEP_UNC_CB2, /* Intel SandyBridge-EP C-Box core 2 uncore */ PFM_PMU_INTEL_SNBEP_UNC_CB3, /* Intel SandyBridge-EP C-Box core 3 uncore */ PFM_PMU_INTEL_SNBEP_UNC_CB4, /* Intel SandyBridge-EP C-Box core 4 uncore */ PFM_PMU_INTEL_SNBEP_UNC_CB5, /* Intel SandyBridge-EP C-Box core 5 uncore */ PFM_PMU_INTEL_SNBEP_UNC_CB6, /* Intel SandyBridge-EP C-Box core 6 uncore */ PFM_PMU_INTEL_SNBEP_UNC_CB7, /* Intel SandyBridge-EP C-Box core 7 uncore */ PFM_PMU_INTEL_SNBEP_UNC_HA, /* Intel SandyBridge-EP HA uncore */ PFM_PMU_INTEL_SNBEP_UNC_IMC0, /* Intel SandyBridge-EP IMC socket 0 uncore */ PFM_PMU_INTEL_SNBEP_UNC_IMC1, /* Intel SandyBridge-EP IMC socket 1 uncore */ PFM_PMU_INTEL_SNBEP_UNC_IMC2, /* Intel SandyBridge-EP IMC socket 2 uncore */ PFM_PMU_INTEL_SNBEP_UNC_IMC3, /* Intel SandyBridge-EP IMC socket 3 uncore */ PFM_PMU_INTEL_SNBEP_UNC_PCU, /* Intel SandyBridge-EP PCU uncore */ PFM_PMU_INTEL_SNBEP_UNC_QPI0, /* Intel SandyBridge-EP QPI link 0 uncore */ PFM_PMU_INTEL_SNBEP_UNC_QPI1, /* Intel SandyBridge-EP QPI link 1 uncore */ PFM_PMU_INTEL_SNBEP_UNC_UBOX, /* Intel SandyBridge-EP U-Box uncore */ PFM_PMU_INTEL_SNBEP_UNC_R2PCIE, /* Intel SandyBridge-EP R2PCIe uncore */ 
PFM_PMU_INTEL_SNBEP_UNC_R3QPI0, /* Intel SandyBridge-EP R3QPI 0 uncore */ PFM_PMU_INTEL_SNBEP_UNC_R3QPI1, /* Intel SandyBridge-EP R3QPI 1 uncore */ PFM_PMU_INTEL_KNC, /* Intel Knights Corner (Xeon Phi) */ PFM_PMU_S390X_CPUM_CF, /* s390x: CPU-M counter facility */ PFM_PMU_ARM_1176, /* ARM 1176 */ PFM_PMU_INTEL_IVB_EP, /* Intel IvyBridge EP */ PFM_PMU_INTEL_HSW, /* Intel Haswell */ PFM_PMU_INTEL_IVB_UNC_CB0, /* Intel IvyBridge C-box 0 uncore PMU */ PFM_PMU_INTEL_IVB_UNC_CB1, /* Intel IvyBridge C-box 1 uncore PMU */ PFM_PMU_INTEL_IVB_UNC_CB2, /* Intel IvyBridge C-box 2 uncore PMU */ PFM_PMU_INTEL_IVB_UNC_CB3, /* Intel IvyBridge C-box 3 uncore PMU */ PFM_PMU_POWER8, /* IBM POWER8 */ PFM_PMU_INTEL_RAPL, /* Intel RAPL */ PFM_PMU_INTEL_SLM, /* Intel Silvermont */ PFM_PMU_AMD64_FAM15H_NB, /* AMD AMD64 Fam15h NorthBridge */ PFM_PMU_ARM_QCOM_KRAIT, /* Qualcomm Krait */ PFM_PMU_PERF_EVENT_RAW, /* perf_events RAW event syntax */ PFM_PMU_INTEL_IVBEP_UNC_CB0, /* Intel IvyBridge-EP C-Box core 0 uncore */ PFM_PMU_INTEL_IVBEP_UNC_CB1, /* Intel IvyBridge-EP C-Box core 1 uncore */ PFM_PMU_INTEL_IVBEP_UNC_CB2, /* Intel IvyBridge-EP C-Box core 2 uncore */ PFM_PMU_INTEL_IVBEP_UNC_CB3, /* Intel IvyBridge-EP C-Box core 3 uncore */ PFM_PMU_INTEL_IVBEP_UNC_CB4, /* Intel IvyBridge-EP C-Box core 4 uncore */ PFM_PMU_INTEL_IVBEP_UNC_CB5, /* Intel IvyBridge-EP C-Box core 5 uncore */ PFM_PMU_INTEL_IVBEP_UNC_CB6, /* Intel IvyBridge-EP C-Box core 6 uncore */ PFM_PMU_INTEL_IVBEP_UNC_CB7, /* Intel IvyBridge-EP C-Box core 7 uncore */ PFM_PMU_INTEL_IVBEP_UNC_CB8, /* Intel IvyBridge-EP C-Box core 8 uncore */ PFM_PMU_INTEL_IVBEP_UNC_CB9, /* Intel IvyBridge-EP C-Box core 9 uncore */ PFM_PMU_INTEL_IVBEP_UNC_CB10, /* Intel IvyBridge-EP C-Box core 10 uncore */ PFM_PMU_INTEL_IVBEP_UNC_CB11, /* Intel IvyBridge-EP C-Box core 11 uncore */ PFM_PMU_INTEL_IVBEP_UNC_CB12, /* Intel IvyBridge-EP C-Box core 12 uncore */ PFM_PMU_INTEL_IVBEP_UNC_CB13, /* Intel IvyBridge-EP C-Box core 13 uncore */ 
PFM_PMU_INTEL_IVBEP_UNC_CB14, /* Intel IvyBridge-EP C-Box core 14 uncore */ PFM_PMU_INTEL_IVBEP_UNC_HA0, /* Intel IvyBridge-EP HA 0 uncore */ PFM_PMU_INTEL_IVBEP_UNC_HA1, /* Intel IvyBridge-EP HA 1 uncore */ PFM_PMU_INTEL_IVBEP_UNC_IMC0, /* Intel IvyBridge-EP IMC socket 0 uncore */ PFM_PMU_INTEL_IVBEP_UNC_IMC1, /* Intel IvyBridge-EP IMC socket 1 uncore */ PFM_PMU_INTEL_IVBEP_UNC_IMC2, /* Intel IvyBridge-EP IMC socket 2 uncore */ PFM_PMU_INTEL_IVBEP_UNC_IMC3, /* Intel IvyBridge-EP IMC socket 3 uncore */ PFM_PMU_INTEL_IVBEP_UNC_IMC4, /* Intel IvyBridge-EP IMC socket 4 uncore */ PFM_PMU_INTEL_IVBEP_UNC_IMC5, /* Intel IvyBridge-EP IMC socket 5 uncore */ PFM_PMU_INTEL_IVBEP_UNC_IMC6, /* Intel IvyBridge-EP IMC socket 6 uncore */ PFM_PMU_INTEL_IVBEP_UNC_IMC7, /* Intel IvyBridge-EP IMC socket 7 uncore */ PFM_PMU_INTEL_IVBEP_UNC_PCU, /* Intel IvyBridge-EP PCU uncore */ PFM_PMU_INTEL_IVBEP_UNC_QPI0, /* Intel IvyBridge-EP QPI link 0 uncore */ PFM_PMU_INTEL_IVBEP_UNC_QPI1, /* Intel IvyBridge-EP QPI link 1 uncore */ PFM_PMU_INTEL_IVBEP_UNC_QPI2, /* Intel IvyBridge-EP QPI link 2 uncore */ PFM_PMU_INTEL_IVBEP_UNC_UBOX, /* Intel IvyBridge-EP U-Box uncore */ PFM_PMU_INTEL_IVBEP_UNC_R2PCIE, /* Intel IvyBridge-EP R2PCIe uncore */ PFM_PMU_INTEL_IVBEP_UNC_R3QPI0, /* Intel IvyBridge-EP R3QPI 0 uncore */ PFM_PMU_INTEL_IVBEP_UNC_R3QPI1, /* Intel IvyBridge-EP R3QPI 1 uncore */ PFM_PMU_INTEL_IVBEP_UNC_R3QPI2, /* Intel IvyBridge-EP R3QPI 2 uncore */ PFM_PMU_INTEL_IVBEP_UNC_IRP, /* Intel IvyBridge-EP IRP uncore */ PFM_PMU_S390X_CPUM_SF, /* s390x: CPU-M sampling facility */ PFM_PMU_ARM_CORTEX_A57, /* ARM Cortex A57 (ARMv8) */ PFM_PMU_ARM_CORTEX_A53, /* ARM Cortex A53 (ARMv8) */ PFM_PMU_ARM_CORTEX_A7, /* ARM Cortex A7 */ PFM_PMU_INTEL_HSW_EP, /* Intel Haswell EP */ PFM_PMU_INTEL_BDW, /* Intel Broadwell */ PFM_PMU_ARM_XGENE, /* Applied Micro X-Gene (ARMv8) */ PFM_PMU_INTEL_HSWEP_UNC_CB0, /* Intel Haswell-EP C-Box core 0 uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB1, /* Intel Haswell-EP C-Box core 1 
uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB2, /* Intel Haswell-EP C-Box core 2 uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB3, /* Intel Haswell-EP C-Box core 3 uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB4, /* Intel Haswell-EP C-Box core 4 uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB5, /* Intel Haswell-EP C-Box core 5 uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB6, /* Intel Haswell-EP C-Box core 6 uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB7, /* Intel Haswell-EP C-Box core 7 uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB8, /* Intel Haswell-EP C-Box core 8 uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB9, /* Intel Haswell-EP C-Box core 9 uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB10, /* Intel Haswell-EP C-Box core 10 uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB11, /* Intel Haswell-EP C-Box core 11 uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB12, /* Intel Haswell-EP C-Box core 12 uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB13, /* Intel Haswell-EP C-Box core 13 uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB14, /* Intel Haswell-EP C-Box core 14 uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB15, /* Intel Haswell-EP C-Box core 15 uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB16, /* Intel Haswell-EP C-Box core 16 uncore */ PFM_PMU_INTEL_HSWEP_UNC_CB17, /* Intel Haswell-EP C-Box core 17 uncore */ PFM_PMU_INTEL_HSWEP_UNC_HA0, /* Intel Haswell-EP HA 0 uncore */ PFM_PMU_INTEL_HSWEP_UNC_HA1, /* Intel Haswell-EP HA 1 uncore */ PFM_PMU_INTEL_HSWEP_UNC_IMC0, /* Intel Haswell-EP IMC socket 0 uncore */ PFM_PMU_INTEL_HSWEP_UNC_IMC1, /* Intel Haswell-EP IMC socket 1 uncore */ PFM_PMU_INTEL_HSWEP_UNC_IMC2, /* Intel Haswell-EP IMC socket 2 uncore */ PFM_PMU_INTEL_HSWEP_UNC_IMC3, /* Intel Haswell-EP IMC socket 3 uncore */ PFM_PMU_INTEL_HSWEP_UNC_IMC4, /* Intel Haswell-EP IMC socket 4 uncore */ PFM_PMU_INTEL_HSWEP_UNC_IMC5, /* Intel Haswell-EP IMC socket 5 uncore */ PFM_PMU_INTEL_HSWEP_UNC_IMC6, /* Intel Haswell-EP IMC socket 6 uncore */ PFM_PMU_INTEL_HSWEP_UNC_IMC7, /* Intel Haswell-EP IMC socket 7 uncore */ PFM_PMU_INTEL_HSWEP_UNC_PCU, /* Intel Haswell-EP PCU uncore */ PFM_PMU_INTEL_HSWEP_UNC_QPI0, /* Intel 
Haswell-EP QPI link 0 uncore */ PFM_PMU_INTEL_HSWEP_UNC_QPI1, /* Intel Haswell-EP QPI link 1 uncore */ PFM_PMU_INTEL_HSWEP_UNC_UBOX, /* Intel Haswell-EP U-Box uncore */ PFM_PMU_INTEL_HSWEP_UNC_R2PCIE, /* Intel Haswell-EP R2PCIe uncore */ PFM_PMU_INTEL_HSWEP_UNC_R3QPI0, /* Intel Haswell-EP R3QPI 0 uncore */ PFM_PMU_INTEL_HSWEP_UNC_R3QPI1, /* Intel Haswell-EP R3QPI 1 uncore */ PFM_PMU_INTEL_HSWEP_UNC_R3QPI2, /* Intel Haswell-EP R3QPI 2 uncore */ PFM_PMU_INTEL_HSWEP_UNC_IRP, /* Intel Haswell-EP IRP uncore */ PFM_PMU_INTEL_HSWEP_UNC_SB0, /* Intel Haswell-EP S-Box 0 uncore */ PFM_PMU_INTEL_HSWEP_UNC_SB1, /* Intel Haswell-EP S-Box 1 uncore */ PFM_PMU_INTEL_HSWEP_UNC_SB2, /* Intel Haswell-EP S-Box 2 uncore */ PFM_PMU_INTEL_HSWEP_UNC_SB3, /* Intel Haswell-EP S-Box 3 uncore */ PFM_PMU_POWERPC_NEST_MCS_READ_BW, /* POWERPC Nest Memory Read bandwidth */ PFM_PMU_POWERPC_NEST_MCS_WRITE_BW, /* POWERPC Nest Memory Write bandwidth */ PFM_PMU_INTEL_SKL, /* Intel Skylake */ PFM_PMU_INTEL_BDW_EP, /* Intel Broadwell EP */ PFM_PMU_INTEL_GLM, /* Intel Goldmont */ PFM_PMU_INTEL_KNL, /* Intel Knights Landing */ PFM_PMU_INTEL_KNL_UNC_IMC0, /* Intel KnightLanding IMC channel 0 uncore */ PFM_PMU_INTEL_KNL_UNC_IMC1, /* Intel KnightLanding IMC channel 1 uncore */ PFM_PMU_INTEL_KNL_UNC_IMC2, /* Intel KnightLanding IMC channel 2 uncore */ PFM_PMU_INTEL_KNL_UNC_IMC3, /* Intel KnightLanding IMC channel 3 uncore */ PFM_PMU_INTEL_KNL_UNC_IMC4, /* Intel KnightLanding IMC channel 4 uncore */ PFM_PMU_INTEL_KNL_UNC_IMC5, /* Intel KnightLanding IMC channel 5 uncore */ PFM_PMU_INTEL_KNL_UNC_IMC_UCLK0,/* Intel KnightLanding IMC UCLK unit 0 uncore */ PFM_PMU_INTEL_KNL_UNC_IMC_UCLK1,/* Intel KnightLanding IMC UCLK unit 1 uncore */ PFM_PMU_INTEL_KNL_UNC_EDC_ECLK0,/* Intel KnightLanding EDC ECLK unit 0 uncore */ PFM_PMU_INTEL_KNL_UNC_EDC_ECLK1,/* Intel KnightLanding EDC ECLK unit 1 uncore */ PFM_PMU_INTEL_KNL_UNC_EDC_ECLK2,/* Intel KnightLanding EDC ECLK unit 2 uncore */ PFM_PMU_INTEL_KNL_UNC_EDC_ECLK3,/* Intel 
KnightLanding EDC ECLK unit 3 uncore */ PFM_PMU_INTEL_KNL_UNC_EDC_ECLK4,/* Intel KnightLanding EDC ECLK unit 4 uncore */ PFM_PMU_INTEL_KNL_UNC_EDC_ECLK5,/* Intel KnightLanding EDC ECLK unit 5 uncore */ PFM_PMU_INTEL_KNL_UNC_EDC_ECLK6,/* Intel KnightLanding EDC ECLK unit 6 uncore */ PFM_PMU_INTEL_KNL_UNC_EDC_ECLK7,/* Intel KnightLanding EDC ECLK unit 7 uncore */ PFM_PMU_INTEL_KNL_UNC_EDC_UCLK0,/* Intel KnightLanding EDC UCLK unit 0 uncore */ PFM_PMU_INTEL_KNL_UNC_EDC_UCLK1,/* Intel KnightLanding EDC UCLK unit 1 uncore */ PFM_PMU_INTEL_KNL_UNC_EDC_UCLK2,/* Intel KnightLanding EDC UCLK unit 2 uncore */ PFM_PMU_INTEL_KNL_UNC_EDC_UCLK3,/* Intel KnightLanding EDC UCLK unit 3 uncore */ PFM_PMU_INTEL_KNL_UNC_EDC_UCLK4,/* Intel KnightLanding EDC UCLK unit 4 uncore */ PFM_PMU_INTEL_KNL_UNC_EDC_UCLK5,/* Intel KnightLanding EDC UCLK unit 5 uncore */ PFM_PMU_INTEL_KNL_UNC_EDC_UCLK6,/* Intel KnightLanding EDC UCLK unit 6 uncore */ PFM_PMU_INTEL_KNL_UNC_EDC_UCLK7,/* Intel KnightLanding EDC UCLK unit 7 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA0, /* Intel KnightLanding CHA unit 0 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA1, /* Intel KnightLanding CHA unit 1 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA2, /* Intel KnightLanding CHA unit 2 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA3, /* Intel KnightLanding CHA unit 3 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA4, /* Intel KnightLanding CHA unit 4 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA5, /* Intel KnightLanding CHA unit 5 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA6, /* Intel KnightLanding CHA unit 6 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA7, /* Intel KnightLanding CHA unit 7 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA8, /* Intel KnightLanding CHA unit 8 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA9, /* Intel KnightLanding CHA unit 9 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA10, /* Intel KnightLanding CHA unit 10 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA11, /* Intel KnightLanding CHA unit 11 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA12, /* Intel KnightLanding CHA unit 12 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA13, /* Intel 
KnightLanding CHA unit 13 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA14, /* Intel KnightLanding CHA unit 14 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA15, /* Intel KnightLanding CHA unit 15 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA16, /* Intel KnightLanding CHA unit 16 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA17, /* Intel KnightLanding CHA unit 17 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA18, /* Intel KnightLanding CHA unit 18 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA19, /* Intel KnightLanding CHA unit 19 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA20, /* Intel KnightLanding CHA unit 20 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA21, /* Intel KnightLanding CHA unit 21 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA22, /* Intel KnightLanding CHA unit 22 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA23, /* Intel KnightLanding CHA unit 23 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA24, /* Intel KnightLanding CHA unit 24 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA25, /* Intel KnightLanding CHA unit 25 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA26, /* Intel KnightLanding CHA unit 26 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA27, /* Intel KnightLanding CHA unit 27 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA28, /* Intel KnightLanding CHA unit 28 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA29, /* Intel KnightLanding CHA unit 29 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA30, /* Intel KnightLanding CHA unit 30 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA31, /* Intel KnightLanding CHA unit 31 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA32, /* Intel KnightLanding CHA unit 32 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA33, /* Intel KnightLanding CHA unit 33 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA34, /* Intel KnightLanding CHA unit 34 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA35, /* Intel KnightLanding CHA unit 35 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA36, /* Intel KnightLanding CHA unit 36 uncore */ PFM_PMU_INTEL_KNL_UNC_CHA37, /* Intel KnightLanding CHA unit 37 uncore */ PFM_PMU_INTEL_KNL_UNC_UBOX, /* Intel KnightLanding Ubox uncore */ PFM_PMU_INTEL_KNL_UNC_M2PCIE, /* Intel KnightLanding M2PCIe uncore */ PFM_PMU_POWER9, /* IBM POWER9 */ 
PFM_PMU_INTEL_BDX_UNC_CB0, /* Intel Broadwell-X C-Box core 0 uncore */ PFM_PMU_INTEL_BDX_UNC_CB1, /* Intel Broadwell-X C-Box core 1 uncore */ PFM_PMU_INTEL_BDX_UNC_CB2, /* Intel Broadwell-X C-Box core 2 uncore */ PFM_PMU_INTEL_BDX_UNC_CB3, /* Intel Broadwell-X C-Box core 3 uncore */ PFM_PMU_INTEL_BDX_UNC_CB4, /* Intel Broadwell-X C-Box core 4 uncore */ PFM_PMU_INTEL_BDX_UNC_CB5, /* Intel Broadwell-X C-Box core 5 uncore */ PFM_PMU_INTEL_BDX_UNC_CB6, /* Intel Broadwell-X C-Box core 6 uncore */ PFM_PMU_INTEL_BDX_UNC_CB7, /* Intel Broadwell-X C-Box core 7 uncore */ PFM_PMU_INTEL_BDX_UNC_CB8, /* Intel Broadwell-X C-Box core 8 uncore */ PFM_PMU_INTEL_BDX_UNC_CB9, /* Intel Broadwell-X C-Box core 9 uncore */ PFM_PMU_INTEL_BDX_UNC_CB10, /* Intel Broadwell-X C-Box core 10 uncore */ PFM_PMU_INTEL_BDX_UNC_CB11, /* Intel Broadwell-X C-Box core 11 uncore */ PFM_PMU_INTEL_BDX_UNC_CB12, /* Intel Broadwell-X C-Box core 12 uncore */ PFM_PMU_INTEL_BDX_UNC_CB13, /* Intel Broadwell-X C-Box core 13 uncore */ PFM_PMU_INTEL_BDX_UNC_CB14, /* Intel Broadwell-X C-Box core 14 uncore */ PFM_PMU_INTEL_BDX_UNC_CB15, /* Intel Broadwell-X C-Box core 15 uncore */ PFM_PMU_INTEL_BDX_UNC_CB16, /* Intel Broadwell-X C-Box core 16 uncore */ PFM_PMU_INTEL_BDX_UNC_CB17, /* Intel Broadwell-X C-Box core 17 uncore */ PFM_PMU_INTEL_BDX_UNC_CB18, /* Intel Broadwell-X C-Box core 18 uncore */ PFM_PMU_INTEL_BDX_UNC_CB19, /* Intel Broadwell-X C-Box core 19 uncore */ PFM_PMU_INTEL_BDX_UNC_CB20, /* Intel Broadwell-X C-Box core 20 uncore */ PFM_PMU_INTEL_BDX_UNC_CB21, /* Intel Broadwell-X C-Box core 21 uncore */ PFM_PMU_INTEL_BDX_UNC_CB22, /* Intel Broadwell-X C-Box core 22 uncore */ PFM_PMU_INTEL_BDX_UNC_CB23, /* Intel Broadwell-X C-Box core 23 uncore */ PFM_PMU_INTEL_BDX_UNC_HA0, /* Intel Broadwell-X HA 0 uncore */ PFM_PMU_INTEL_BDX_UNC_HA1, /* Intel Broadwell-X HA 1 uncore */ PFM_PMU_INTEL_BDX_UNC_IMC0, /* Intel Broadwell-X IMC socket 0 uncore */ PFM_PMU_INTEL_BDX_UNC_IMC1, /* Intel Broadwell-X IMC socket 1 uncore 
*/ PFM_PMU_INTEL_BDX_UNC_IMC2, /* Intel Broadwell-X IMC socket 2 uncore */ PFM_PMU_INTEL_BDX_UNC_IMC3, /* Intel Broadwell-X IMC socket 3 uncore */ PFM_PMU_INTEL_BDX_UNC_IMC4, /* Intel Broadwell-X IMC socket 4 uncore */ PFM_PMU_INTEL_BDX_UNC_IMC5, /* Intel Broadwell-X IMC socket 5 uncore */ PFM_PMU_INTEL_BDX_UNC_IMC6, /* Intel Broadwell-X IMC socket 6 uncore */ PFM_PMU_INTEL_BDX_UNC_IMC7, /* Intel Broadwell-X IMC socket 7 uncore */ PFM_PMU_INTEL_BDX_UNC_PCU, /* Intel Broadwell-X PCU uncore */ PFM_PMU_INTEL_BDX_UNC_QPI0, /* Intel Broadwell-X QPI link 0 uncore */ PFM_PMU_INTEL_BDX_UNC_QPI1, /* Intel Broadwell-X QPI link 1 uncore */ PFM_PMU_INTEL_BDX_UNC_QPI2, /* Intel Broadwell-X QPI link 2 uncore */ PFM_PMU_INTEL_BDX_UNC_UBOX, /* Intel Broadwell-X U-Box uncore */ PFM_PMU_INTEL_BDX_UNC_R2PCIE, /* Intel Broadwell-X R2PCIe uncore */ PFM_PMU_INTEL_BDX_UNC_R3QPI0, /* Intel Broadwell-X R3QPI 0 uncore */ PFM_PMU_INTEL_BDX_UNC_R3QPI1, /* Intel Broadwell-X R3QPI 1 uncore */ PFM_PMU_INTEL_BDX_UNC_R3QPI2, /* Intel Broadwell-X R3QPI 2 uncore */ PFM_PMU_INTEL_BDX_UNC_IRP, /* Intel Broadwell-X IRP uncore */ PFM_PMU_INTEL_BDX_UNC_SB0, /* Intel Broadwell-X S-Box 0 uncore */ PFM_PMU_INTEL_BDX_UNC_SB1, /* Intel Broadwell-X S-Box 1 uncore */ PFM_PMU_INTEL_BDX_UNC_SB2, /* Intel Broadwell-X S-Box 2 uncore */ PFM_PMU_INTEL_BDX_UNC_SB3, /* Intel Broadwell-X S-Box 3 uncore */ PFM_PMU_AMD64_FAM17H, /* AMD AMD64 Fam17h Zen1 (deprecated) */ PFM_PMU_AMD64_FAM16H, /* AMD AMD64 Fam16h Jaguar */ PFM_PMU_INTEL_SKX, /* Intel Skylake-X */ PFM_PMU_INTEL_SKX_UNC_CHA0, /* Intel Skylake-X CHA core 0 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA1, /* Intel Skylake-X CHA core 1 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA2, /* Intel Skylake-X CHA core 2 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA3, /* Intel Skylake-X CHA core 3 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA4, /* Intel Skylake-X CHA core 4 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA5, /* Intel Skylake-X CHA core 5 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA6, /* Intel Skylake-X CHA core 6 
uncore */ PFM_PMU_INTEL_SKX_UNC_CHA7, /* Intel Skylake-X CHA core 7 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA8, /* Intel Skylake-X CHA core 8 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA9, /* Intel Skylake-X CHA core 9 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA10, /* Intel Skylake-X CHA core 10 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA11, /* Intel Skylake-X CHA core 11 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA12, /* Intel Skylake-X CHA core 12 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA13, /* Intel Skylake-X CHA core 13 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA14, /* Intel Skylake-X CHA core 14 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA15, /* Intel Skylake-X CHA core 15 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA16, /* Intel Skylake-X CHA core 16 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA17, /* Intel Skylake-X CHA core 17 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA18, /* Intel Skylake-X CHA core 18 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA19, /* Intel Skylake-X CHA core 19 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA20, /* Intel Skylake-X CHA core 20 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA21, /* Intel Skylake-X CHA core 21 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA22, /* Intel Skylake-X CHA core 22 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA23, /* Intel Skylake-X CHA core 23 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA24, /* Intel Skylake-X CHA core 24 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA25, /* Intel Skylake-X CHA core 25 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA26, /* Intel Skylake-X CHA core 26 uncore */ PFM_PMU_INTEL_SKX_UNC_CHA27, /* Intel Skylake-X CHA core 27 uncore */ PFM_PMU_INTEL_SKX_UNC_IIO0, /* Intel Skylake-X IIO0 uncore */ PFM_PMU_INTEL_SKX_UNC_IIO1, /* Intel Skylake-X IIO1 uncore */ PFM_PMU_INTEL_SKX_UNC_IIO2, /* Intel Skylake-X IIO2 uncore */ PFM_PMU_INTEL_SKX_UNC_IIO3, /* Intel Skylake-X IIO3 uncore */ PFM_PMU_INTEL_SKX_UNC_IIO4, /* Intel Skylake-X IIO4 uncore */ PFM_PMU_INTEL_SKX_UNC_IIO5, /* Intel Skylake-X IIO5 uncore */ PFM_PMU_INTEL_SKX_UNC_IMC0, /* Intel Skylake-X IMC0 uncore */ PFM_PMU_INTEL_SKX_UNC_IMC1, /* Intel Skylake-X IMC1 uncore */ PFM_PMU_INTEL_SKX_UNC_IMC2, 
/* Intel Skylake-X IMC2 uncore */ PFM_PMU_INTEL_SKX_UNC_IMC3, /* Intel Skylake-X IMC3 uncore */ PFM_PMU_INTEL_SKX_UNC_IMC4, /* Intel Skylake-X IMC4 uncore */ PFM_PMU_INTEL_SKX_UNC_IMC5, /* Intel Skylake-X IMC5 uncore */ PFM_PMU_INTEL_SKX_UNC_IRP, /* Intel Skylake-X IRP uncore */ PFM_PMU_INTEL_SKX_UNC_M2M0, /* Intel Skylake-X M2M0 uncore */ PFM_PMU_INTEL_SKX_UNC_M2M1, /* Intel Skylake-X M2M1 uncore */ PFM_PMU_INTEL_SKX_UNC_M3UPI0, /* Intel Skylake-X M3UPI0 uncore */ PFM_PMU_INTEL_SKX_UNC_M3UPI1, /* Intel Skylake-X M3UPI1 uncore */ PFM_PMU_INTEL_SKX_UNC_M3UPI2, /* Intel Skylake-X M3UPI2 uncore */ PFM_PMU_INTEL_SKX_UNC_PCU, /* Intel Skylake-X PCU uncore */ PFM_PMU_INTEL_SKX_UNC_UBOX, /* Intel Skylake-X U-Box uncore */ PFM_PMU_INTEL_SKX_UNC_UPI0, /* Intel Skylake-X UPI link 0 uncore */ PFM_PMU_INTEL_SKX_UNC_UPI1, /* Intel Skylake-X UPI link 1 uncore */ PFM_PMU_INTEL_SKX_UNC_UPI2, /* Intel Skylake-X UPI link 2 uncore */ PFM_PMU_INTEL_KNM, /* Intel Knights Mill */ PFM_PMU_INTEL_KNM_UNC_IMC0, /* Intel Knights Mill IMC channel 0 uncore */ PFM_PMU_INTEL_KNM_UNC_IMC1, /* Intel Knights Mill IMC channel 1 uncore */ PFM_PMU_INTEL_KNM_UNC_IMC2, /* Intel Knights Mill IMC channel 2 uncore */ PFM_PMU_INTEL_KNM_UNC_IMC3, /* Intel Knights Mill IMC channel 3 uncore */ PFM_PMU_INTEL_KNM_UNC_IMC4, /* Intel Knights Mill IMC channel 4 uncore */ PFM_PMU_INTEL_KNM_UNC_IMC5, /* Intel Knights Mill IMC channel 5 uncore */ PFM_PMU_INTEL_KNM_UNC_IMC_UCLK0,/* Intel Knights Mill IMC UCLK unit 0 uncore */ PFM_PMU_INTEL_KNM_UNC_IMC_UCLK1,/* Intel Knights Mill IMC UCLK unit 1 uncore */ PFM_PMU_INTEL_KNM_UNC_EDC_ECLK0,/* Intel Knights Mill EDC ECLK unit 0 uncore */ PFM_PMU_INTEL_KNM_UNC_EDC_ECLK1,/* Intel Knights Mill EDC ECLK unit 1 uncore */ PFM_PMU_INTEL_KNM_UNC_EDC_ECLK2,/* Intel Knights Mill EDC ECLK unit 2 uncore */ PFM_PMU_INTEL_KNM_UNC_EDC_ECLK3,/* Intel Knights Mill EDC ECLK unit 3 uncore */ PFM_PMU_INTEL_KNM_UNC_EDC_ECLK4,/* Intel Knights Mill EDC ECLK unit 4 uncore */ 
PFM_PMU_INTEL_KNM_UNC_EDC_ECLK5,/* Intel Knights Mill EDC ECLK unit 5 uncore */ PFM_PMU_INTEL_KNM_UNC_EDC_ECLK6,/* Intel Knights Mill EDC ECLK unit 6 uncore */ PFM_PMU_INTEL_KNM_UNC_EDC_ECLK7,/* Intel Knights Mill EDC ECLK unit 7 uncore */ PFM_PMU_INTEL_KNM_UNC_EDC_UCLK0,/* Intel Knights Mill EDC UCLK unit 0 uncore */ PFM_PMU_INTEL_KNM_UNC_EDC_UCLK1,/* Intel Knights Mill EDC UCLK unit 1 uncore */ PFM_PMU_INTEL_KNM_UNC_EDC_UCLK2,/* Intel Knights Mill EDC UCLK unit 2 uncore */ PFM_PMU_INTEL_KNM_UNC_EDC_UCLK3,/* Intel Knights Mill EDC UCLK unit 3 uncore */ PFM_PMU_INTEL_KNM_UNC_EDC_UCLK4,/* Intel Knights Mill EDC UCLK unit 4 uncore */ PFM_PMU_INTEL_KNM_UNC_EDC_UCLK5,/* Intel Knights Mill EDC UCLK unit 5 uncore */ PFM_PMU_INTEL_KNM_UNC_EDC_UCLK6,/* Intel Knights Mill EDC UCLK unit 6 uncore */ PFM_PMU_INTEL_KNM_UNC_EDC_UCLK7,/* Intel Knights Mill EDC UCLK unit 7 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA0, /* Intel Knights Mill CHA unit 0 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA1, /* Intel Knights Mill CHA unit 1 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA2, /* Intel Knights Mill CHA unit 2 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA3, /* Intel Knights Mill CHA unit 3 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA4, /* Intel Knights Mill CHA unit 4 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA5, /* Intel Knights Mill CHA unit 5 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA6, /* Intel Knights Mill CHA unit 6 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA7, /* Intel Knights Mill CHA unit 7 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA8, /* Intel Knights Mill CHA unit 8 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA9, /* Intel Knights Mill CHA unit 9 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA10, /* Intel Knights Mill CHA unit 10 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA11, /* Intel Knights Mill CHA unit 11 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA12, /* Intel Knights Mill CHA unit 12 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA13, /* Intel Knights Mill CHA unit 13 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA14, /* Intel Knights Mill CHA unit 14 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA15, /* Intel 
Knights Mill CHA unit 15 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA16, /* Intel Knights Mill CHA unit 16 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA17, /* Intel Knights Mill CHA unit 17 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA18, /* Intel Knights Mill CHA unit 18 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA19, /* Intel Knights Mill CHA unit 19 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA20, /* Intel Knights Mill CHA unit 20 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA21, /* Intel Knights Mill CHA unit 21 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA22, /* Intel Knights Mill CHA unit 22 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA23, /* Intel Knights Mill CHA unit 23 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA24, /* Intel Knights Mill CHA unit 24 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA25, /* Intel Knights Mill CHA unit 25 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA26, /* Intel Knights Mill CHA unit 26 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA27, /* Intel Knights Mill CHA unit 27 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA28, /* Intel Knights Mill CHA unit 28 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA29, /* Intel Knights Mill CHA unit 29 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA30, /* Intel Knights Mill CHA unit 30 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA31, /* Intel Knights Mill CHA unit 31 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA32, /* Intel Knights Mill CHA unit 32 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA33, /* Intel Knights Mill CHA unit 33 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA34, /* Intel Knights Mill CHA unit 34 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA35, /* Intel Knights Mill CHA unit 35 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA36, /* Intel Knights Mill CHA unit 36 uncore */ PFM_PMU_INTEL_KNM_UNC_CHA37, /* Intel Knights Mill CHA unit 37 uncore */ PFM_PMU_INTEL_KNM_UNC_UBOX, /* Intel Knights Mill Ubox uncore */ PFM_PMU_INTEL_KNM_UNC_M2PCIE, /* Intel Knights Mill M2PCIe uncore */ PFM_PMU_ARM_THUNDERX2, /* Marvell ThunderX2 */ PFM_PMU_INTEL_CLX, /* Intel CascadeLake X */ PFM_PMU_ARM_THUNDERX2_DMC0, /* Marvell ThunderX2 DMC unit 0 uncore */ PFM_PMU_ARM_THUNDERX2_DMC1, /* Marvell ThunderX2 DMC unit 1 
uncore */ PFM_PMU_ARM_THUNDERX2_LLC0, /* Marvell ThunderX2 LLC unit 0 uncore */ PFM_PMU_ARM_THUNDERX2_LLC1, /* Marvell ThunderX2 LLC unit 1 uncore */ PFM_PMU_ARM_THUNDERX2_CCPI0, /* Marvell ThunderX2 Cross-Socket Interconnect unit 0 uncore */ PFM_PMU_ARM_THUNDERX2_CCPI1, /* Marvell ThunderX2 Cross-Socket Interconnect unit 1 uncore */ PFM_PMU_AMD64_FAM17H_ZEN1, /* AMD AMD64 Fam17h Zen1 */ PFM_PMU_AMD64_FAM17H_ZEN2, /* AMD AMD64 Fam17h Zen2 */ PFM_PMU_INTEL_TMT, /* Intel Tremont */ PFM_PMU_INTEL_ICL, /* Intel IceLake */ PFM_PMU_ARM_A64FX, /* Fujitsu A64FX processor */ PFM_PMU_ARM_N1, /* Arm Neoverse N1 */ PFM_PMU_AMD64_FAM19H_ZEN3, /* AMD AMD64 Fam19h Zen3 */ PFM_PMU_AMD64_RAPL, /* AMD64 RAPL */ PFM_PMU_AMD64_FAM19H_ZEN3_L3, /* AMD64 Fam19h Zen3 L3 */ PFM_PMU_INTEL_ICX, /* Intel IceLakeX */ PFM_PMU_ARM_N2, /* Arm Neoverse N2 */ PFM_PMU_ARM_KUNPENG, /* HiSilicon Kunpeng processor */ PFM_PMU_ARM_KUNPENG_UNC_SCCL1_DDRC0, /* Hisilicon Kunpeng SCCL unit 1 DDRC 0 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL1_DDRC1, /* Hisilicon Kunpeng SCCL unit 1 DDRC 1 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL1_DDRC2, /* Hisilicon Kunpeng SCCL unit 1 DDRC 2 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL1_DDRC3, /* Hisilicon Kunpeng SCCL unit 1 DDRC 3 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL3_DDRC0, /* Hisilicon Kunpeng SCCL unit 3 DDRC 0 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL3_DDRC1, /* Hisilicon Kunpeng SCCL unit 3 DDRC 1 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL3_DDRC2, /* Hisilicon Kunpeng SCCL unit 3 DDRC 2 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL3_DDRC3, /* Hisilicon Kunpeng SCCL unit 3 DDRC 3 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL5_DDRC0, /* Hisilicon Kunpeng SCCL unit 5 DDRC 0 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL5_DDRC1, /* Hisilicon Kunpeng SCCL unit 5 DDRC 1 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL5_DDRC2, /* Hisilicon Kunpeng SCCL unit 5 DDRC 2 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL5_DDRC3, /* Hisilicon Kunpeng SCCL unit 5 DDRC 3 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL7_DDRC0, /* Hisilicon Kunpeng SCCL
unit 7 DDRC 0 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL7_DDRC1, /* Hisilicon Kunpeng SCCL unit 7 DDRC 1 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL7_DDRC2, /* Hisilicon Kunpeng SCCL unit 7 DDRC 2 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL7_DDRC3, /* Hisilicon Kunpeng SCCL unit 7 DDRC 3 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL1_HHA2, /* Hisilicon Kunpeng SCCL unit 1 HHA 2 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL1_HHA3, /* Hisilicon Kunpeng SCCL unit 1 HHA 3 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL3_HHA0, /* Hisilicon Kunpeng SCCL unit 3 HHA 0 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL3_HHA1, /* Hisilicon Kunpeng SCCL unit 3 HHA 1 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL5_HHA6, /* Hisilicon Kunpeng SCCL unit 5 HHA 6 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL5_HHA7, /* Hisilicon Kunpeng SCCL unit 5 HHA 7 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL7_HHA4, /* Hisilicon Kunpeng SCCL unit 7 HHA 4 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL7_HHA5, /* Hisilicon Kunpeng SCCL unit 7 HHA 5 uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL1_L3C10, /* Hisilicon Kunpeng SCCL unit 1 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL1_L3C11, /* Hisilicon Kunpeng SCCL unit 1 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL1_L3C12, /* Hisilicon Kunpeng SCCL unit 1 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL1_L3C13, /* Hisilicon Kunpeng SCCL unit 1 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL1_L3C14, /* Hisilicon Kunpeng SCCL unit 1 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL1_L3C15, /* Hisilicon Kunpeng SCCL unit 1 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL1_L3C8, /* Hisilicon Kunpeng SCCL unit 1 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL1_L3C9, /* Hisilicon Kunpeng SCCL unit 1 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL3_L3C0, /* Hisilicon Kunpeng SCCL unit 3 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL3_L3C1, /* Hisilicon Kunpeng SCCL unit 3 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL3_L3C2, /* Hisilicon Kunpeng SCCL unit 3 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL3_L3C3, /* Hisilicon Kunpeng SCCL unit 3 L3C uncore */ 
PFM_PMU_ARM_KUNPENG_UNC_SCCL3_L3C4, /* Hisilicon Kunpeng SCCL unit 3 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL3_L3C5, /* Hisilicon Kunpeng SCCL unit 3 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL3_L3C6, /* Hisilicon Kunpeng SCCL unit 3 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL3_L3C7, /* Hisilicon Kunpeng SCCL unit 3 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL5_L3C24, /* Hisilicon Kunpeng SCCL unit 5 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL5_L3C25, /* Hisilicon Kunpeng SCCL unit 5 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL5_L3C26, /* Hisilicon Kunpeng SCCL unit 5 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL5_L3C27, /* Hisilicon Kunpeng SCCL unit 5 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL5_L3C28, /* Hisilicon Kunpeng SCCL unit 5 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL5_L3C29, /* Hisilicon Kunpeng SCCL unit 5 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL5_L3C30, /* Hisilicon Kunpeng SCCL unit 5 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL5_L3C31, /* Hisilicon Kunpeng SCCL unit 5 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL7_L3C16, /* Hisilicon Kunpeng SCCL unit 7 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL7_L3C17, /* Hisilicon Kunpeng SCCL unit 7 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL7_L3C18, /* Hisilicon Kunpeng SCCL unit 7 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL7_L3C19, /* Hisilicon Kunpeng SCCL unit 7 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL7_L3C20, /* Hisilicon Kunpeng SCCL unit 7 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL7_L3C21, /* Hisilicon Kunpeng SCCL unit 7 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL7_L3C22, /* Hisilicon Kunpeng SCCL unit 7 L3C uncore */ PFM_PMU_ARM_KUNPENG_UNC_SCCL7_L3C23, /* Hisilicon Kunpeng SCCL unit 7 L3C uncore */ PFM_PMU_INTEL_SPR, /* Intel SapphireRapid */ PFM_PMU_POWER10, /* IBM POWER10 */ PFM_PMU_AMD64_FAM19H_ZEN4, /* AMD AMD64 Fam19h Zen4 */ PFM_PMU_ARM_V1, /* ARM Neoverse V1 */ PFM_PMU_ARM_V2, /* Arm Neoverse V2 */ PFM_PMU_INTEL_EMR, /* Intel EmeraldRapid */ PFM_PMU_INTEL_ICX_UNC_CHA0, /* Intel Icelake-X CHA core 0 uncore */ 
PFM_PMU_INTEL_ICX_UNC_CHA1, /* Intel Icelake-X CHA core 1 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA2, /* Intel Icelake-X CHA core 2 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA3, /* Intel Icelake-X CHA core 3 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA4, /* Intel Icelake-X CHA core 4 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA5, /* Intel Icelake-X CHA core 5 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA6, /* Intel Icelake-X CHA core 6 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA7, /* Intel Icelake-X CHA core 7 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA8, /* Intel Icelake-X CHA core 8 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA9, /* Intel Icelake-X CHA core 9 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA10, /* Intel Icelake-X CHA core 10 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA11, /* Intel Icelake-X CHA core 11 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA12, /* Intel Icelake-X CHA core 12 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA13, /* Intel Icelake-X CHA core 13 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA14, /* Intel Icelake-X CHA core 14 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA15, /* Intel Icelake-X CHA core 15 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA16, /* Intel Icelake-X CHA core 16 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA17, /* Intel Icelake-X CHA core 17 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA18, /* Intel Icelake-X CHA core 18 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA19, /* Intel Icelake-X CHA core 19 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA20, /* Intel Icelake-X CHA core 20 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA21, /* Intel Icelake-X CHA core 21 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA22, /* Intel Icelake-X CHA core 22 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA23, /* Intel Icelake-X CHA core 23 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA24, /* Intel Icelake-X CHA core 24 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA25, /* Intel Icelake-X CHA core 25 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA26, /* Intel Icelake-X CHA core 26 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA27, /* Intel Icelake-X CHA core 27 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA28, /* Intel Icelake-X CHA core 28 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA29, /* Intel Icelake-X CHA core 
29 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA30, /* Intel Icelake-X CHA core 30 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA31, /* Intel Icelake-X CHA core 31 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA32, /* Intel Icelake-X CHA core 32 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA33, /* Intel Icelake-X CHA core 33 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA34, /* Intel Icelake-X CHA core 34 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA35, /* Intel Icelake-X CHA core 35 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA36, /* Intel Icelake-X CHA core 36 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA37, /* Intel Icelake-X CHA core 37 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA38, /* Intel Icelake-X CHA core 38 uncore */ PFM_PMU_INTEL_ICX_UNC_CHA39, /* Intel Icelake-X CHA core 39 uncore */ PFM_PMU_INTEL_ICX_UNC_IMC0, /* Intel Icelake-X IMC channel 0 uncore */ PFM_PMU_INTEL_ICX_UNC_IMC1, /* Intel Icelake-X IMC channel 1 uncore */ PFM_PMU_INTEL_ICX_UNC_IMC2, /* Intel Icelake-X IMC channel 2 uncore */ PFM_PMU_INTEL_ICX_UNC_IMC3, /* Intel Icelake-X IMC channel 3 uncore */ PFM_PMU_INTEL_ICX_UNC_IMC4, /* Intel Icelake-X IMC channel 4 uncore */ PFM_PMU_INTEL_ICX_UNC_IMC5, /* Intel Icelake-X IMC channel 5 uncore */ PFM_PMU_INTEL_ICX_UNC_IMC6, /* Intel Icelake-X IMC channel 6 uncore */ PFM_PMU_INTEL_ICX_UNC_IMC7, /* Intel Icelake-X IMC channel 7 uncore */ PFM_PMU_INTEL_ICX_UNC_IMC8, /* Intel Icelake-X IMC channel 8 uncore */ PFM_PMU_INTEL_ICX_UNC_IMC9, /* Intel Icelake-X IMC channel 9 uncore */ PFM_PMU_INTEL_ICX_UNC_IMC10, /* Intel Icelake-X IMC channel 10 uncore */ PFM_PMU_INTEL_ICX_UNC_IMC11, /* Intel Icelake-X IMC channel 11 uncore */ PFM_PMU_INTEL_ICX_UNC_IIO0, /* Intel Icelake-X IIO 0 uncore */ PFM_PMU_INTEL_ICX_UNC_IIO1, /* Intel Icelake-X IIO 1 uncore */ PFM_PMU_INTEL_ICX_UNC_IIO2, /* Intel Icelake-X IIO 2 uncore */ PFM_PMU_INTEL_ICX_UNC_IIO3, /* Intel Icelake-X IIO 3 uncore */ PFM_PMU_INTEL_ICX_UNC_IIO4, /* Intel Icelake-X IIO 4 uncore */ PFM_PMU_INTEL_ICX_UNC_IIO5, /* Intel Icelake-X IIO 5 uncore */ PFM_PMU_INTEL_ICX_UNC_IRP0, /* Intel Icelake-X IRP 0
uncore */ PFM_PMU_INTEL_ICX_UNC_IRP1, /* Intel Icelake-X IRP 1 uncore */ PFM_PMU_INTEL_ICX_UNC_IRP2, /* Intel Icelake-X IRP 2 uncore */ PFM_PMU_INTEL_ICX_UNC_IRP3, /* Intel Icelake-X IRP 3 uncore */ PFM_PMU_INTEL_ICX_UNC_IRP4, /* Intel Icelake-X IRP 4 uncore */ PFM_PMU_INTEL_ICX_UNC_IRP5, /* Intel Icelake-X IRP 5 uncore */ PFM_PMU_INTEL_ICX_UNC_M2M0, /* Intel Icelake-X M2M 0 uncore */ PFM_PMU_INTEL_ICX_UNC_M2M1, /* Intel Icelake-X M2M 1 uncore */ PFM_PMU_INTEL_ICX_UNC_PCU, /* Intel Icelake-X PCU uncore */ PFM_PMU_INTEL_ICX_UNC_UPI0, /* Intel Icelake-X UPI0 uncore */ PFM_PMU_INTEL_ICX_UNC_UPI1, /* Intel Icelake-X UPI1 uncore */ PFM_PMU_INTEL_ICX_UNC_UPI2, /* Intel Icelake-X UPI2 uncore */ PFM_PMU_INTEL_ICX_UNC_UPI3, /* Intel Icelake-X UPI3 uncore */ PFM_PMU_INTEL_ICX_UNC_M3UPI0, /* Intel Icelake-X M3UPI0 uncore */ PFM_PMU_INTEL_ICX_UNC_M3UPI1, /* Intel Icelake-X M3UPI1 uncore */ PFM_PMU_INTEL_ICX_UNC_M3UPI2, /* Intel Icelake-X M3UPI2 uncore */ PFM_PMU_INTEL_ICX_UNC_M3UPI3, /* Intel Icelake-X M3UPI3 uncore */ PFM_PMU_INTEL_ICX_UNC_UBOX, /* Intel Icelake-X UBOX uncore */ PFM_PMU_INTEL_ICX_UNC_M2PCIE0, /* Intel Icelake-X M2PCIE0 uncore */ PFM_PMU_INTEL_ICX_UNC_M2PCIE1, /* Intel Icelake-X M2PCIE1 uncore */ PFM_PMU_INTEL_ICX_UNC_M2PCIE2, /* Intel Icelake-X M2PCIE2 uncore */ PFM_PMU_INTEL_ADL_GLC, /* Intel AlderLake Goldencove (P-Core) */ PFM_PMU_INTEL_ADL_GRT, /* Intel AlderLake Gracemont (E-Core) */ PFM_PMU_INTEL_SPR_UNC_IMC0, /* Intel SapphireRapids IMC channel 0 uncore */ PFM_PMU_INTEL_SPR_UNC_IMC1, /* Intel SapphireRapids IMC channel 1 uncore */ PFM_PMU_INTEL_SPR_UNC_IMC2, /* Intel SapphireRapids IMC channel 2 uncore */ PFM_PMU_INTEL_SPR_UNC_IMC3, /* Intel SapphireRapids IMC channel 3 uncore */ PFM_PMU_INTEL_SPR_UNC_IMC4, /* Intel SapphireRapids IMC channel 4 uncore */ PFM_PMU_INTEL_SPR_UNC_IMC5, /* Intel SapphireRapids IMC channel 5 uncore */ PFM_PMU_INTEL_SPR_UNC_IMC6, /* Intel SapphireRapids IMC channel 6 uncore */ PFM_PMU_INTEL_SPR_UNC_IMC7, /* Intel 
SapphireRapids IMC channel 7 uncore */ PFM_PMU_INTEL_SPR_UNC_IMC8, /* Intel SapphireRapids IMC channel 8 uncore */ PFM_PMU_INTEL_SPR_UNC_IMC9, /* Intel SapphireRapids IMC channel 9 uncore */ PFM_PMU_INTEL_SPR_UNC_IMC10, /* Intel SapphireRapids IMC channel 10 uncore */ PFM_PMU_INTEL_SPR_UNC_IMC11, /* Intel SapphireRapids IMC channel 11 uncore */ PFM_PMU_INTEL_SPR_UNC_UPI0, /* Intel SapphireRapids UPI0 uncore */ PFM_PMU_INTEL_SPR_UNC_UPI1, /* Intel SapphireRapids UPI1 uncore */ PFM_PMU_INTEL_SPR_UNC_UPI2, /* Intel SapphireRapids UPI2 uncore */ PFM_PMU_INTEL_SPR_UNC_UPI3, /* Intel SapphireRapids UPI3 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA0, /* Intel SapphireRapids CHA core 0 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA1, /* Intel SapphireRapids CHA core 1 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA2, /* Intel SapphireRapids CHA core 2 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA3, /* Intel SapphireRapids CHA core 3 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA4, /* Intel SapphireRapids CHA core 4 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA5, /* Intel SapphireRapids CHA core 5 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA6, /* Intel SapphireRapids CHA core 6 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA7, /* Intel SapphireRapids CHA core 7 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA8, /* Intel SapphireRapids CHA core 8 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA9, /* Intel SapphireRapids CHA core 9 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA10, /* Intel SapphireRapids CHA core 10 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA11, /* Intel SapphireRapids CHA core 11 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA12, /* Intel SapphireRapids CHA core 12 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA13, /* Intel SapphireRapids CHA core 13 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA14, /* Intel SapphireRapids CHA core 14 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA15, /* Intel SapphireRapids CHA core 15 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA16, /* Intel SapphireRapids CHA core 16 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA17, /* Intel SapphireRapids CHA core 17 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA18, /* Intel SapphireRapids 
CHA core 18 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA19, /* Intel SapphireRapids CHA core 19 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA20, /* Intel SapphireRapids CHA core 20 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA21, /* Intel SapphireRapids CHA core 21 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA22, /* Intel SapphireRapids CHA core 22 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA23, /* Intel SapphireRapids CHA core 23 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA24, /* Intel SapphireRapids CHA core 24 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA25, /* Intel SapphireRapids CHA core 25 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA26, /* Intel SapphireRapids CHA core 26 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA27, /* Intel SapphireRapids CHA core 27 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA28, /* Intel SapphireRapids CHA core 28 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA29, /* Intel SapphireRapids CHA core 29 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA30, /* Intel SapphireRapids CHA core 30 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA31, /* Intel SapphireRapids CHA core 31 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA32, /* Intel SapphireRapids CHA core 32 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA33, /* Intel SapphireRapids CHA core 33 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA34, /* Intel SapphireRapids CHA core 34 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA35, /* Intel SapphireRapids CHA core 35 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA36, /* Intel SapphireRapids CHA core 36 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA37, /* Intel SapphireRapids CHA core 37 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA38, /* Intel SapphireRapids CHA core 38 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA39, /* Intel SapphireRapids CHA core 39 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA40, /* Intel SapphireRapids CHA core 40 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA41, /* Intel SapphireRapids CHA core 41 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA42, /* Intel SapphireRapids CHA core 42 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA43, /* Intel SapphireRapids CHA core 43 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA44, /* Intel SapphireRapids CHA core 44 uncore */
PFM_PMU_INTEL_SPR_UNC_CHA45, /* Intel SapphireRapids CHA core 45 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA46, /* Intel SapphireRapids CHA core 46 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA47, /* Intel SapphireRapids CHA core 47 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA48, /* Intel SapphireRapids CHA core 48 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA49, /* Intel SapphireRapids CHA core 49 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA50, /* Intel SapphireRapids CHA core 50 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA51, /* Intel SapphireRapids CHA core 51 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA52, /* Intel SapphireRapids CHA core 52 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA53, /* Intel SapphireRapids CHA core 53 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA54, /* Intel SapphireRapids CHA core 54 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA55, /* Intel SapphireRapids CHA core 55 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA56, /* Intel SapphireRapids CHA core 56 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA57, /* Intel SapphireRapids CHA core 57 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA58, /* Intel SapphireRapids CHA core 58 uncore */ PFM_PMU_INTEL_SPR_UNC_CHA59, /* Intel SapphireRapids CHA core 59 uncore */ PFM_PMU_INTEL_GNR, /* Intel GraniteRapids core PMU */ PFM_PMU_AMD64_FAM1AH_ZEN5, /* AMD64 Fam1Ah Zen5 */ PFM_PMU_AMD64_FAM1AH_ZEN5_L3, /* AMD64 Fam1Ah Zen5 L3 */ PFM_PMU_ARM_CORTEX_A72, /* ARM Cortex A72 (ARMv8) */ PFM_PMU_ARM_V3, /* Arm Neoverse V3 (ARMv9) */ PFM_PMU_ARM_CORTEX_A55, /* ARM Cortex A55 (ARMv8) */ PFM_PMU_ARM_CORTEX_A76, /* ARM Cortex A76 (ARMv8) */ PFM_PMU_ARM_N3, /* Arm Neoverse N3 */ PFM_PMU_ARM_MONAKA, /* Fujitsu FUJITSU-MONAKA processor */ PFM_PMU_INTEL_GNR_UNC_IMC0, /* Intel GraniteRapids IMC channel 0 */ PFM_PMU_INTEL_GNR_UNC_IMC1, /* Intel GraniteRapids IMC channel 1 */ PFM_PMU_INTEL_GNR_UNC_IMC2, /* Intel GraniteRapids IMC channel 2 */ PFM_PMU_INTEL_GNR_UNC_IMC3, /* Intel GraniteRapids IMC channel 3 */ PFM_PMU_INTEL_GNR_UNC_IMC4, /* Intel GraniteRapids IMC channel 4 */ PFM_PMU_INTEL_GNR_UNC_IMC5, /* Intel GraniteRapids IMC channel 5 */ 
PFM_PMU_INTEL_GNR_UNC_IMC6, /* Intel GraniteRapids IMC channel 6 */ PFM_PMU_INTEL_GNR_UNC_IMC7, /* Intel GraniteRapids IMC channel 7 */ PFM_PMU_INTEL_GNR_UNC_IMC8, /* Intel GraniteRapids IMC channel 8 */ PFM_PMU_INTEL_GNR_UNC_IMC9, /* Intel GraniteRapids IMC channel 9 */ PFM_PMU_INTEL_GNR_UNC_IMC10, /* Intel GraniteRapids IMC channel 10 */ PFM_PMU_INTEL_GNR_UNC_IMC11, /* Intel GraniteRapids IMC channel 11 */ /* MUST ADD NEW PMU MODELS HERE */ PFM_PMU_MAX /* end marker */ } pfm_pmu_t; typedef enum { PFM_PMU_TYPE_UNKNOWN=0, /* unknown PMU type */ PFM_PMU_TYPE_CORE, /* processor core PMU */ PFM_PMU_TYPE_UNCORE, /* processor socket-level PMU */ PFM_PMU_TYPE_OS_GENERIC,/* generic OS-provided PMU */ PFM_PMU_TYPE_MAX } pfm_pmu_type_t; typedef enum { PFM_ATTR_NONE=0, /* no attribute */ PFM_ATTR_UMASK, /* unit mask */ PFM_ATTR_MOD_BOOL, /* register modifier */ PFM_ATTR_MOD_INTEGER, /* register modifier */ PFM_ATTR_RAW_UMASK, /* raw umask (not user visible) */ PFM_ATTR_MAX /* end-marker */ } pfm_attr_t; /* * define additional event data types beyond historic uint64 * what else can fit in 64 bits?
*/ typedef enum { PFM_DTYPE_UNKNOWN=0, /* unknown */ PFM_DTYPE_UINT64, /* uint64 */ PFM_DTYPE_INT64, /* int64 */ PFM_DTYPE_DOUBLE, /* IEEE double precision float */ PFM_DTYPE_FIXED, /* 32.32 fixed point */ PFM_DTYPE_RATIO, /* 32/32 integer ratio */ PFM_DTYPE_CHAR8, /* 8 char unterminated string */ PFM_DTYPE_MAX /* end-marker */ } pfm_dtype_t; /* * event attribute control: which layer is controlling * the attribute could be PMU, OS APIs */ typedef enum { PFM_ATTR_CTRL_UNKNOWN = 0, /* unknown */ PFM_ATTR_CTRL_PMU, /* PMU hardware */ PFM_ATTR_CTRL_PERF_EVENT, /* perf_events kernel interface */ PFM_ATTR_CTRL_MAX } pfm_attr_ctrl_t; /* * OS layer * Used when querying event or attribute information */ typedef enum { PFM_OS_NONE = 0, /* only PMU */ PFM_OS_PERF_EVENT, /* perf_events PMU attribute subset + PMU */ PFM_OS_PERF_EVENT_EXT, /* perf_events all attributes + PMU */ PFM_OS_MAX, } pfm_os_t; /* SWIG doesn't deal well with anonymous nested structures */ #ifdef SWIG #define SWIG_NAME(x) x #else #define SWIG_NAME(x) #endif /* SWIG */ /* * special data type for libpfm error value used to help * with Python support and in particular for SWIG. By using * a specific type we can detect library calls and trap errors * in one SWIG statement as opposed to having to keep track of * each call individually. Programs can use 'int' safely for * the return value.
*/ typedef int pfm_err_t; /* error if !PFM_SUCCESS */ typedef int os_err_t; /* error if a syscall fails */ typedef struct { const char *name; /* event name */ const char *desc; /* event description */ size_t size; /* struct sizeof */ pfm_pmu_t pmu; /* PMU identification */ pfm_pmu_type_t type; /* PMU type */ int nevents; /* how many events for this PMU */ int first_event; /* opaque index of first event */ int max_encoding; /* max number of uint64_t to encode an event */ int num_cntrs; /* number of generic counters */ int num_fixed_cntrs;/* number of fixed counters */ struct { unsigned int is_present:1; /* present on host system */ unsigned int is_dfl:1; /* is architecture default PMU */ unsigned int reserved_bits:30; } SWIG_NAME(flags); } pfm_pmu_info_t; /* * possible values for pfm_event_info_t.is_speculative * possible values for pfm_event_attr_info_t.is_speculative */ typedef enum { PFM_EVENT_INFO_SPEC_NA = 0, /* speculation info not available */ PFM_EVENT_INFO_SPEC_TRUE = 1, /* counts speculative exec events */ PFM_EVENT_INFO_SPEC_FALSE = 2, /* counts non-speculative exec events */ } pfm_event_info_spec_t; typedef struct { const char *name; /* event name */ const char *desc; /* event description */ const char *equiv; /* event is equivalent to */ size_t size; /* struct sizeof */ uint64_t code; /* event raw code (not encoding) */ pfm_pmu_t pmu; /* which PMU */ pfm_dtype_t dtype; /* data type of event value */ int idx; /* unique event identifier */ int nattrs; /* number of attributes */ int reserved; /* for future use */ struct { unsigned int is_precise:1; /* precise sampling (Intel X86=PEBS) */ unsigned int is_speculative:2; /* count correct and wrong path occurrences */ unsigned int support_hw_smpl:1;/* can be recorded by hw buffer (Intel X86=EXTPEBS) */ unsigned int reserved_bits:28; } SWIG_NAME(flags); } pfm_event_info_t; typedef struct { const char *name; /* attribute symbolic name */ const char *desc; /* attribute description */ const char *equiv; /* 
attribute is equivalent to */ size_t size; /* struct sizeof */ uint64_t code; /* attribute code */ pfm_attr_t type; /* attribute type */ int idx; /* attribute opaque index */ pfm_attr_ctrl_t ctrl; /* what is providing attr */ struct { unsigned int is_dfl:1; /* is default umask */ unsigned int is_precise:1; /* Intel X86: supports PEBS */ unsigned int is_speculative:2; /* count correct and wrong path occurrences */ unsigned int support_hw_smpl:1;/* can be recorded by hw buffer (Intel X86=EXTPEBS) */ unsigned int support_no_mods:1;/* attribute does not support modifiers (umask only) */ unsigned int reserved_bits:26; } SWIG_NAME(flags); union { uint64_t dfl_val64; /* default 64-bit value */ const char *dfl_str; /* default string value */ int dfl_bool; /* default boolean value */ int dfl_int; /* default integer value */ } SWIG_NAME(defaults); } pfm_event_attr_info_t; /* * use with PFM_OS_NONE for pfm_get_os_event_encoding() */ typedef struct { uint64_t *codes; /* out/in: event codes array */ char **fstr; /* out/in: fully qualified event string */ size_t size; /* sizeof struct */ int count; /* out/in: # of elements in array */ int idx; /* out: unique event identifier */ } pfm_pmu_encode_arg_t; #if __WORDSIZE == 64 #define PFM_PMU_INFO_ABI0 56 #define PFM_EVENT_INFO_ABI0 64 #define PFM_ATTR_INFO_ABI0 64 #define PFM_RAW_ENCODE_ABI0 32 #else #define PFM_PMU_INFO_ABI0 44 #define PFM_EVENT_INFO_ABI0 48 #define PFM_ATTR_INFO_ABI0 48 #define PFM_RAW_ENCODE_ABI0 20 #endif /* * initialization, configuration, errors */ extern pfm_err_t pfm_initialize(void); extern void pfm_terminate(void); extern const char *pfm_strerror(int code); extern int pfm_get_version(void); /* * PMU API */ extern pfm_err_t pfm_get_pmu_info(pfm_pmu_t pmu, pfm_pmu_info_t *output); /* * event API */ extern int pfm_get_event_next(int idx); extern int pfm_find_event(const char *str); extern pfm_err_t pfm_get_event_info(int idx, pfm_os_t os, pfm_event_info_t *output); /* * event encoding API * * content of args 
depends on value of os (refer to man page) */ extern pfm_err_t pfm_get_os_event_encoding(const char *str, int dfl_plm, pfm_os_t os, void *args); /* * attribute API */ extern pfm_err_t pfm_get_event_attr_info(int eidx, int aidx, pfm_os_t os, pfm_event_attr_info_t *output); /* * library validation API */ extern pfm_err_t pfm_pmu_validate(pfm_pmu_t pmu_id, FILE *fp); /* * older encoding API */ extern pfm_err_t pfm_get_event_encoding(const char *str, int dfl_plm, char **fstr, int *idx, uint64_t **codes, int *count); /* * error codes */ #define PFM_SUCCESS 0 /* success */ #define PFM_ERR_NOTSUPP -1 /* function not supported */ #define PFM_ERR_INVAL -2 /* invalid parameters */ #define PFM_ERR_NOINIT -3 /* library was not initialized */ #define PFM_ERR_NOTFOUND -4 /* event not found */ #define PFM_ERR_FEATCOMB -5 /* invalid combination of features */ #define PFM_ERR_UMASK -6 /* invalid or missing unit mask */ #define PFM_ERR_NOMEM -7 /* out of memory */ #define PFM_ERR_ATTR -8 /* invalid event attribute */ #define PFM_ERR_ATTR_VAL -9 /* invalid event attribute value */ #define PFM_ERR_ATTR_SET -10 /* attribute value already set */ #define PFM_ERR_TOOMANY -11 /* too many parameters */ #define PFM_ERR_TOOSMALL -12 /* parameter is too small */ /* * event, attribute iterators * must be used because no guarantee indexes are contiguous * * for pmu, simply iterate over pfm_pmu_t enum and use * pfm_get_pmu_info() and the is_present field */ #define pfm_for_each_event_attr(x, z) \ for((x)=0; (x) < (z)->nattrs; (x) = (x)+1) #define pfm_for_all_pmus(x) \ for((x)= PFM_PMU_NONE ; (x) < PFM_PMU_MAX; (x)++) #ifdef __cplusplus /* extern C */ } #endif #pragma GCC visibility pop #endif /* __PFMLIB_H__ */ papi-papi-7-2-0-t/src/libpfm4/include/perfmon/pfmlib_perf_event.h000066400000000000000000000045541502707512200247240ustar00rootroot00000000000000/* * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a
copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_PERF_EVENTS_H__ #define __PFMLIB_PERF_EVENTS_H__ #include <perfmon/pfmlib.h> #include <perfmon/perf_event.h> #pragma GCC visibility push(default) #ifdef __cplusplus extern "C" { #endif /* * use with PFM_OS_PERF, PFM_OS_PERF_EXT for pfm_get_os_event_encoding() */ typedef struct { struct perf_event_attr *attr; /* in/out: perf_event struct pointer */ char **fstr; /* out/in: fully qualified event string */ size_t size; /* sizeof struct */ int idx; /* out: opaque event identifier */ int cpu; /* out: cpu to program, -1 = not set */ int flags; /* out: perf_event_open() flags */ int pad0; /* explicit 64-bit mode padding */ } pfm_perf_encode_arg_t; #if __WORDSIZE == 64 #define PFM_PERF_ENCODE_ABI0 40 /* includes 4-byte padding */ #else #define PFM_PERF_ENCODE_ABI0 28 #endif /* * old interface, maintained for backward compatibility with older versions of * the library.
Should use pfm_get_os_event_encoding() now */ extern pfm_err_t pfm_get_perf_event_encoding(const char *str, int dfl_plm, struct perf_event_attr *output, char **fstr, int *idx); #ifdef __cplusplus /* extern C */ } #endif #pragma GCC visibility pop #endif /* __PFMLIB_PERF_EVENT_H__ */ papi-papi-7-2-0-t/src/libpfm4/lib/000077500000000000000000000000001502707512200165325ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libpfm4/lib/Makefile000066400000000000000000000344311502707512200201770ustar00rootroot00000000000000# # Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. # Contributed by Stephane Eranian # # Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. # Contributed by John Linford # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # TOPDIR := $(shell if [ "$$PWD" != "" ]; then echo $$PWD; else pwd; fi)/.. 
include $(TOPDIR)/config.mk include $(TOPDIR)/rules.mk # # Common files # SRCS=pfmlib_common.c ifeq ($(SYS),Linux) CFLAGS += -DHAS_OPENAT SRCS += pfmlib_perf_event_pmu.c pfmlib_perf_event.c pfmlib_perf_event_raw.c endif CFLAGS+=-D_REENTRANT -I. -fvisibility=hidden # # list all library support modules # ifeq ($(CONFIG_PFMLIB_ARCH_IA64),y) INCARCH = $(INC_IA64) #SRCS += pfmlib_gen_ia64.c pfmlib_itanium.c pfmlib_itanium2.c pfmlib_montecito.c CFLAGS += -DCONFIG_PFMLIB_ARCH_IA64 endif ifeq ($(CONFIG_PFMLIB_ARCH_X86),y) ifeq ($(SYS),Linux) SRCS += pfmlib_intel_x86_perf_event.c pfmlib_amd64_perf_event.c \ pfmlib_intel_netburst_perf_event.c \ pfmlib_intel_snbep_unc_perf_event.c endif INCARCH = $(INC_X86) SRCS += pfmlib_amd64.c pfmlib_intel_core.c pfmlib_intel_x86.c \ pfmlib_intel_x86_arch.c pfmlib_intel_atom.c \ pfmlib_intel_nhm_unc.c pfmlib_intel_nhm.c \ pfmlib_intel_wsm.c \ pfmlib_intel_snb.c pfmlib_intel_snb_unc.c \ pfmlib_intel_ivb.c pfmlib_intel_ivb_unc.c \ pfmlib_intel_hsw.c \ pfmlib_intel_bdw.c \ pfmlib_intel_skl.c \ pfmlib_intel_icl.c \ pfmlib_intel_spr.c \ pfmlib_intel_gnr.c \ pfmlib_intel_rapl.c \ pfmlib_intel_snbep_unc.c \ pfmlib_intel_snbep_unc_cbo.c \ pfmlib_intel_snbep_unc_ha.c \ pfmlib_intel_snbep_unc_imc.c \ pfmlib_intel_snbep_unc_pcu.c \ pfmlib_intel_snbep_unc_qpi.c \ pfmlib_intel_snbep_unc_ubo.c \ pfmlib_intel_snbep_unc_r2pcie.c \ pfmlib_intel_snbep_unc_r3qpi.c \ pfmlib_intel_ivbep_unc_cbo.c \ pfmlib_intel_ivbep_unc_ha.c \ pfmlib_intel_ivbep_unc_imc.c \ pfmlib_intel_ivbep_unc_pcu.c \ pfmlib_intel_ivbep_unc_qpi.c \ pfmlib_intel_ivbep_unc_ubo.c \ pfmlib_intel_ivbep_unc_r2pcie.c \ pfmlib_intel_ivbep_unc_r3qpi.c \ pfmlib_intel_ivbep_unc_irp.c \ pfmlib_intel_hswep_unc_cbo.c \ pfmlib_intel_hswep_unc_ha.c \ pfmlib_intel_hswep_unc_imc.c \ pfmlib_intel_hswep_unc_pcu.c \ pfmlib_intel_hswep_unc_qpi.c \ pfmlib_intel_hswep_unc_ubo.c \ pfmlib_intel_hswep_unc_r2pcie.c \ pfmlib_intel_hswep_unc_r3qpi.c \ pfmlib_intel_hswep_unc_irp.c \ pfmlib_intel_hswep_unc_sbo.c \ 
pfmlib_intel_bdx_unc_cbo.c \ pfmlib_intel_bdx_unc_ubo.c \ pfmlib_intel_bdx_unc_sbo.c \ pfmlib_intel_bdx_unc_ha.c \ pfmlib_intel_bdx_unc_imc.c \ pfmlib_intel_bdx_unc_irp.c \ pfmlib_intel_bdx_unc_pcu.c \ pfmlib_intel_bdx_unc_qpi.c \ pfmlib_intel_bdx_unc_r2pcie.c \ pfmlib_intel_bdx_unc_r3qpi.c \ pfmlib_intel_skx_unc_cha.c \ pfmlib_intel_skx_unc_iio.c \ pfmlib_intel_skx_unc_imc.c \ pfmlib_intel_skx_unc_irp.c \ pfmlib_intel_skx_unc_m2m.c \ pfmlib_intel_skx_unc_m3upi.c \ pfmlib_intel_skx_unc_pcu.c \ pfmlib_intel_skx_unc_ubo.c \ pfmlib_intel_skx_unc_upi.c \ pfmlib_intel_icx_unc_cha.c \ pfmlib_intel_icx_unc_imc.c \ pfmlib_intel_icx_unc_m2m.c \ pfmlib_intel_icx_unc_iio.c \ pfmlib_intel_icx_unc_irp.c \ pfmlib_intel_icx_unc_pcu.c \ pfmlib_intel_icx_unc_upi.c \ pfmlib_intel_icx_unc_m3upi.c \ pfmlib_intel_icx_unc_ubox.c \ pfmlib_intel_icx_unc_m2pcie.c \ pfmlib_intel_spr_unc_imc.c \ pfmlib_intel_spr_unc_upi.c \ pfmlib_intel_spr_unc_cha.c \ pfmlib_intel_gnr_unc_imc.c \ pfmlib_intel_knc.c \ pfmlib_intel_slm.c \ pfmlib_intel_tmt.c \ pfmlib_intel_knl.c \ pfmlib_intel_adl.c \ pfmlib_intel_knl_unc_imc.c \ pfmlib_intel_knl_unc_edc.c \ pfmlib_intel_knl_unc_cha.c \ pfmlib_intel_knl_unc_m2pcie.c \ pfmlib_intel_glm.c \ pfmlib_intel_netburst.c \ pfmlib_amd64_k7.c pfmlib_amd64_k8.c pfmlib_amd64_fam10h.c \ pfmlib_amd64_fam11h.c pfmlib_amd64_fam12h.c \ pfmlib_amd64_fam14h.c pfmlib_amd64_fam15h.c \ pfmlib_amd64_fam17h.c pfmlib_amd64_fam16h.c \ pfmlib_amd64_fam19h.c pfmlib_amd64_rapl.c \ pfmlib_amd64_fam19h_l3.c \ pfmlib_amd64_fam1ah.c pfmlib_amd64_fam1ah_l3.c CFLAGS += -DCONFIG_PFMLIB_ARCH_X86 ifeq ($(CONFIG_PFMLIB_ARCH_I386),y) SRCS += pfmlib_intel_coreduo.c pfmlib_intel_p6.c CFLAGS += -DCONFIG_PFMLIB_ARCH_I386 endif ifeq ($(CONFIG_PFMLIB_ARCH_X86_64),y) CFLAGS += -DCONFIG_PFMLIB_ARCH_X86_64 endif endif ifeq ($(CONFIG_PFMLIB_ARCH_POWERPC),y) ifeq ($(SYS),Linux) SRCS += pfmlib_powerpc_perf_event.c endif INCARCH = $(INC_POWERPC) SRCS += pfmlib_powerpc.c pfmlib_power4.c pfmlib_ppc970.c 
pfmlib_power5.c \ pfmlib_power6.c pfmlib_power7.c pfmlib_torrent.c pfmlib_power8.c \ pfmlib_power9.c pfmlib_powerpc_nest.c pfmlib_power10.c CFLAGS += -DCONFIG_PFMLIB_ARCH_POWERPC endif ifeq ($(CONFIG_PFMLIB_ARCH_S390X),y) ifeq ($(SYS),Linux) SRCS += pfmlib_s390x_perf_event.c endif INCARCH = $(INC_S390X) SRCS += pfmlib_s390x_cpumf.c CFLAGS += -DCONFIG_PFMLIB_ARCH_S390X endif ifeq ($(CONFIG_PFMLIB_ARCH_SPARC),y) ifeq ($(SYS),Linux) SRCS += pfmlib_sparc_perf_event.c endif INCARCH = $(INC_SPARC) SRCS += pfmlib_sparc.c pfmlib_sparc_ultra12.c pfmlib_sparc_ultra3.c pfmlib_sparc_ultra4.c pfmlib_sparc_niagara.c CFLAGS += -DCONFIG_PFMLIB_ARCH_SPARC endif ifeq ($(CONFIG_PFMLIB_ARCH_ARM),y) ifeq ($(SYS),Linux) SRCS += pfmlib_arm_perf_event.c pfmlib_arm_armv8_thunderx2_unc_perf_event.c pfmlib_arm_armv8_kunpeng_unc_perf_event.c endif INCARCH = $(INC_ARM) SRCS += pfmlib_arm.c \ pfmlib_arm_armv7_pmuv1.c \ pfmlib_arm_armv6.c \ pfmlib_arm_armv8.c \ pfmlib_arm_armv9.c \ pfmlib_arm_armv8_thunderx2_unc.c \ pfmlib_arm_armv8_kunpeng_unc.c CFLAGS += -DCONFIG_PFMLIB_ARCH_ARM endif ifeq ($(CONFIG_PFMLIB_ARCH_ARM64),y) ifeq ($(SYS),Linux) SRCS += pfmlib_arm_perf_event.c pfmlib_arm_armv8_thunderx2_unc_perf_event.c pfmlib_arm_armv8_kunpeng_unc_perf_event.c endif INCARCH = $(INC_ARM64) SRCS += pfmlib_arm.c \ pfmlib_arm_armv8.c \ pfmlib_arm_armv9.c \ pfmlib_arm_armv8_thunderx2_unc.c \ pfmlib_arm_armv8_kunpeng_unc.c CFLAGS += -DCONFIG_PFMLIB_ARCH_ARM64 endif ifeq ($(CONFIG_PFMLIB_ARCH_MIPS),y) ifeq ($(SYS),Linux) SRCS += pfmlib_mips_perf_event.c endif INCARCH = $(INC_MIPS) SRCS += pfmlib_mips.c pfmlib_mips_74k.c CFLAGS += -DCONFIG_PFMLIB_ARCH_MIPS endif ifeq ($(CONFIG_PFMLIB_CELL),y) INCARCH = $(INC_CELL) #SRCS += pfmlib_cell.c CFLAGS += -DCONFIG_PFMLIB_CELL endif ifeq ($(SYS),Linux) SLDFLAGS=$(LDFLAGS) -shared -Wl,-soname -Wl,$(VLIBPFM) SLIBPFM=libpfm.so.$(VERSION).$(REVISION).$(AGE) VLIBPFM=libpfm.so.$(VERSION) SOLIBEXT=so endif CFLAGS+=-I. 
ALIBPFM=libpfm.a TARGETS=$(ALIBPFM) ifeq ($(CONFIG_PFMLIB_SHARED),y) TARGETS += $(SLIBPFM) endif OBJS=$(SRCS:.c=.o) SOBJS=$(OBJS:.o=.lo) INC_COMMON= $(PFMINCDIR)/perfmon/pfmlib.h pfmlib_priv.h ifeq ($(SYS),Linux) INC_COMMON += $(PFMINCDIR)/perfmon/perf_event.h events/perf_events.h endif INC_IA64=pfmlib_ia64_priv.h \ events/itanium_events.h \ events/itanium2_events.h \ events/montecito_events.h INC_X86= pfmlib_intel_x86_priv.h \ pfmlib_amd64_priv.h \ events/amd64_events_k7.h \ events/amd64_events_k8.h \ events/amd64_events_fam10h.h \ events/amd64_events_fam11h.h \ events/amd64_events_fam12h.h \ events/amd64_events_fam14h.h \ events/amd64_events_fam15h.h \ events/amd64_events_fam17h_zen1.h \ events/amd64_events_fam17h_zen2.h \ events/amd64_events_fam19h_zen3.h \ events/amd64_events_fam19h_zen4.h \ events/amd64_events_fam19h_zen3_l3.h \ events/amd64_events_fam16h.h \ events/amd64_events_fam1ah_zen5.h \ events/amd64_events_fam1ah_zen5_l3.h \ events/intel_p6_events.h \ events/intel_netburst_events.h \ events//intel_x86_arch_events.h \ events/intel_coreduo_events.h \ events/intel_core_events.h \ events/intel_atom_events.h \ events/intel_nhm_events.h \ events/intel_nhm_unc_events.h \ events/intel_wsm_events.h \ events/intel_wsm_unc_events.h \ events/intel_snb_events.h \ events/intel_snb_unc_events.h \ events/intel_ivb_events.h \ events/intel_hsw_events.h \ events/intel_bdw_events.h \ events/intel_skl_events.h \ events/intel_glm_events.h \ events/intel_icl_events.h \ events/intel_spr_events.h \ events/intel_gnr_events.h \ events/intel_adl_glc_events.h \ events/intel_adl_grt_events.h \ pfmlib_intel_snbep_unc_priv.h \ events/intel_snbep_unc_cbo_events.h \ events/intel_snbep_unc_ha_events.h \ events/intel_snbep_unc_imc_events.h \ events/intel_snbep_unc_pcu_events.h \ events/intel_snbep_unc_qpi_events.h \ events/intel_snbep_unc_ubo_events.h \ events/intel_snbep_unc_r2pcie_events.h \ events/intel_snbep_unc_r3qpi_events.h \ events/intel_tmt_events.h \ events/intel_knc_events.h \ 
events/intel_knl_events.h \ events/intel_ivbep_unc_cbo_events.h \ events/intel_ivbep_unc_ha_events.h \ events/intel_ivbep_unc_imc_events.h \ events/intel_ivbep_unc_pcu_events.h \ events/intel_ivbep_unc_qpi_events.h \ events/intel_ivbep_unc_ubo_events.h \ events/intel_ivbep_unc_r2pcie_events.h \ events/intel_ivbep_unc_r3qpi_events.h \ events/intel_ivbep_unc_irp_events.h \ events/intel_hswep_unc_cbo_events.h \ events/intel_hswep_unc_sbo_events.h \ events/intel_hswep_unc_ha_events.h \ events/intel_hswep_unc_imc_events.h \ events/intel_hswep_unc_pcu_events.h \ events/intel_hswep_unc_qpi_events.h \ events/intel_hswep_unc_ubo_events.h \ events/intel_hswep_unc_r2pcie_events.h \ events/intel_hswep_unc_r3qpi_events.h \ events/intel_hswep_unc_irp_events.h \ events/intel_bdx_unc_cbo_events.h \ events/intel_bdx_unc_ubo_events.h \ events/intel_bdx_unc_sbo_events.h \ events/intel_bdx_unc_ha_events.h \ events/intel_bdx_unc_imc_events.h \ events/intel_bdx_unc_irp_events.h \ events/intel_bdx_unc_pcu_events.h \ events/intel_bdx_unc_qpi_events.h \ events/intel_bdx_unc_r2pcie_events.h \ events/intel_bdx_unc_r3qpi_events.h \ events/intel_skx_unc_cha_events.h \ events/intel_skx_unc_iio_events.h \ events/intel_skx_unc_imc_events.h \ events/intel_skx_unc_irp_events.h \ events/intel_skx_unc_m2m_events.h \ events/intel_skx_unc_m3upi_events.h \ events/intel_skx_unc_pcu_events.h \ events/intel_skx_unc_ubo_events.h \ events/intel_skx_unc_upi_events.h \ events/intel_knl_unc_imc_events.h \ events/intel_knl_unc_edc_events.h \ events/intel_knl_unc_cha_events.h \ events/intel_knl_unc_m2pcie_events.h \ events/intel_icx_unc_cha_events.h \ events/intel_icx_unc_imc_events.h \ events/intel_icx_unc_m2m_events.h \ events/intel_icx_unc_irp_events.h \ events/intel_icx_unc_pcu_events.h \ events/intel_icx_unc_upi_events.h \ events/intel_icx_unc_m3upi_events.h \ events/intel_icx_unc_ubox_events.h \ events/intel_icx_unc_m2pcie_events.h \ events/intel_spr_unc_imc_events.h \ events/intel_spr_unc_upi_events.h \ 
events/intel_spr_unc_cha_events.h \ events/intel_gnr_unc_imc_events.h \ events/intel_slm_events.h INC_MIPS=events/mips_74k_events.h events/mips_74k_events.h INC_POWERPC=events/ppc970_events.h \ events/ppc970mp_events.h \ events/power4_events.h \ events/power5_events.h \ events/power5+_events.h \ events/power6_events.h \ events/power7_events.h \ events/power8_events.h \ events/power9_events.h \ events/power10_events.h \ events/torrent_events.h \ events/powerpc_nest_events.h INC_S390X=pfmlib_s390x_priv.h \ events/s390x_cpumf_events.h INC_SPARC=events/sparc_ultra12_events.h \ events/sparc_ultra3_events.h \ events/sparc_ultra3plus_events.h \ events/sparc_ultra3i_events.h \ events/sparc_ultra4plus_events.h \ events/sparc_niagara1_events.h \ events/sparc_niagara2_events.h INC_CELL=events/cell_events.h INC_ARM=pfmlib_arm_priv.h \ events/arm_cortex_a7_events.h \ events/arm_cortex_a8_events.h \ events/arm_cortex_a9_events.h \ events/arm_cortex_a15_events.h \ events/arm_cortex_a57_events.h \ events/arm_cortex_a53_events.h \ events/arm_xgene_events.h \ events/arm_cavium_tx2_events.h \ events/arm_marvell_tx2_unc_events.h \ events/arm_neoverse_n1_events.h \ events/arm_neoverse_n2_events.h \ events/arm_neoverse_v1_events.h \ events/arm_neoverse_v2_events.h \ events/arm_neoverse_v3_events.h \ events/arm_hisilicon_kunpeng_events.h \ events/arm_hisilicon_kunpeng_unc_events.h INC_ARM64=pfmlib_arm_priv.h \ events/arm_cortex_a57_events.h \ events/arm_cortex_a53_events.h \ events/arm_xgene_events.h \ events/arm_cavium_tx2_events.h \ events/arm_marvell_tx2_unc_events.h \ events/arm_fujitsu_a64fx_events.h \ events/arm_fujitsu_monaka_events.h \ events/arm_neoverse_n1_events.h \ events/arm_neoverse_n2_events.h \ events/arm_neoverse_v1_events.h \ events/arm_neoverse_v2_events.h \ events/arm_neoverse_v3_events.h \ events/arm_hisilicon_kunpeng_events.h \ events/arm_hisilicon_kunpeng_unc_events.h INCDEP=$(INC_COMMON) $(INCARCH) all: $(TARGETS) $(OBJS) $(SOBJS): $(TOPDIR)/config.mk 
$(TOPDIR)/rules.mk Makefile $(INCDEP) libpfm.a: $(OBJS) $(RM) $@ $(AR) cq $@ $(OBJS) $(SLIBPFM): $(SOBJS) $(CC) $(CFLAGS) $(SLDFLAGS) -o $@ $(SOBJS) $(LN) $@ $(VLIBPFM) $(LN) $@ libpfm.$(SOLIBEXT) clean: $(RM) -f *.o *.lo *.a *.so* *~ *.$(SOLIBEXT) distclean: clean depend: $(MKDEP) $(CFLAGS) $(SRCS) install: $(TARGETS) install: @echo building: $(TARGETS) -mkdir -p $(DESTDIR)$(LIBDIR) $(INSTALL) -m 644 $(ALIBPFM) $(DESTDIR)$(LIBDIR) ifeq ($(CONFIG_PFMLIB_SHARED),y) $(INSTALL) $(SLIBPFM) $(DESTDIR)$(LIBDIR) cd $(DESTDIR)$(LIBDIR); $(LN) $(SLIBPFM) $(VLIBPFM) cd $(DESTDIR)$(LIBDIR); $(LN) $(SLIBPFM) libpfm.$(SOLIBEXT) -$(LDCONFIG) endif tags: $(CTAGS) -o $(TOPDIR)/tags --tag-relative=yes $(SRCS) $(INCDEP) papi-papi-7-2-0-t/src/libpfm4/lib/events/000077500000000000000000000000001502707512200200365ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libpfm4/lib/events/amd64_events_fam10h.h000066400000000000000000002105521502707512200236470ustar00rootroot00000000000000/* * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Regenerated from previous version by: * Copyright (c) 2007 Advanced Micro Devices, Inc. * Contributed by Robert Richter * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. * * PMU: amd64_fam10h (AMD64 Fam10h) */ /* History * * May 28 2010 -- Robert Richter, robert.richter@amd.com: * * Update from: BIOS and Kernel Developer's Guide (BKDG) For AMD * Family 10h Processors, 31116 Rev 3.48 - April 22, 2010 * * Feb 06 2009 -- Robert Richter, robert.richter@amd.com: * * Update for Family 10h RevD (Istanbul) from: BIOS and Kernel * Developer's Guide (BKDG) For AMD Family 10h Processors, 31116 Rev * 3.20 - February 04, 2009 * This file has been automatically generated. * * Update for Family 10h RevC (Shanghai) from: BIOS and Kernel * Developer's Guide (BKDG) For AMD Family 10h Processors, 31116 Rev * 3.20 - February 04, 2009 * * * Dec 12 2007 -- Robert Richter, robert.richter@amd.com: * * Created from: BIOS and Kernel Developer's Guide (BKDG) For AMD * Family 10h Processors, 31116 Rev 3.00 - September 07, 2007 * PMU: amd64_fam10h (AMD64 Fam10h) */ static const amd64_umask_t amd64_fam10h_dispatched_fpu[]={ { .uname = "OPS_ADD", .udesc = "Add pipe ops excluding load ops and SSE move ops", .ucode = 0x1, }, { .uname = "OPS_MULTIPLY", .udesc = "Multiply pipe ops excluding load ops and SSE move ops", .ucode = 0x2, }, { .uname = "OPS_STORE", .udesc = "Store pipe ops excluding load ops and SSE move ops", .ucode = 0x4, }, { .uname = "OPS_ADD_PIPE_LOAD_OPS", .udesc = "Add pipe load ops and SSE move ops", .ucode = 0x8, }, { .uname = "OPS_MULTIPLY_PIPE_LOAD_OPS", .udesc = "Multiply pipe load ops and SSE move ops", .ucode = 0x10, }, { .uname = "OPS_STORE_PIPE_LOAD_OPS", .udesc = "Store pipe load ops and SSE move ops", .ucode = 0x20, }, { .uname = 
"ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_retired_sse_operations[]={ { .uname = "SINGLE_ADD_SUB_OPS", .udesc = "Single precision add/subtract ops", .ucode = 0x1, }, { .uname = "SINGLE_MUL_OPS", .udesc = "Single precision multiply ops", .ucode = 0x2, }, { .uname = "SINGLE_DIV_OPS", .udesc = "Single precision divide/square root ops", .ucode = 0x4, }, { .uname = "DOUBLE_ADD_SUB_OPS", .udesc = "Double precision add/subtract ops", .ucode = 0x8, }, { .uname = "DOUBLE_MUL_OPS", .udesc = "Double precision multiply ops", .ucode = 0x10, }, { .uname = "DOUBLE_DIV_OPS", .udesc = "Double precision divide/square root ops", .ucode = 0x20, }, { .uname = "OP_TYPE", .udesc = "Op type: 0=uops. 1=FLOPS", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_retired_move_ops[]={ { .uname = "LOW_QW_MOVE_UOPS", .udesc = "Merging low quadword move uops", .ucode = 0x1, }, { .uname = "HIGH_QW_MOVE_UOPS", .udesc = "Merging high quadword move uops", .ucode = 0x2, }, { .uname = "ALL_OTHER_MERGING_MOVE_UOPS", .udesc = "All other merging move uops", .ucode = 0x4, }, { .uname = "ALL_OTHER_MOVE_UOPS", .udesc = "All other move uops", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_retired_serializing_ops[]={ { .uname = "SSE_BOTTOM_EXECUTING_UOPS", .udesc = "SSE bottom-executing uops retired", .ucode = 0x1, }, { .uname = "SSE_BOTTOM_SERIALIZING_UOPS", .udesc = "SSE bottom-serializing uops retired", .ucode = 0x2, }, { .uname = "X87_BOTTOM_EXECUTING_UOPS", .udesc = "X87 bottom-executing uops retired", .ucode = 0x4, }, { .uname = "X87_BOTTOM_SERIALIZING_UOPS", .udesc = "X87 bottom-serializing uops retired", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All 
sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_fp_scheduler_cycles[]={ { .uname = "BOTTOM_EXECUTE_CYCLES", .udesc = "Number of cycles a bottom-execute uop is in the FP scheduler", .ucode = 0x1, }, { .uname = "BOTTOM_SERIALIZING_CYCLES", .udesc = "Number of cycles a bottom-serializing uop is in the FP scheduler", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_segment_register_loads[]={ { .uname = "ES", .udesc = "ES", .ucode = 0x1, }, { .uname = "CS", .udesc = "CS", .ucode = 0x2, }, { .uname = "SS", .udesc = "SS", .ucode = 0x4, }, { .uname = "DS", .udesc = "DS", .ucode = 0x8, }, { .uname = "FS", .udesc = "FS", .ucode = 0x10, }, { .uname = "GS", .udesc = "GS", .ucode = 0x20, }, { .uname = "HS", .udesc = "HS", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_locked_ops[]={ { .uname = "EXECUTED", .udesc = "The number of locked instructions executed", .ucode = 0x1, }, { .uname = "CYCLES_SPECULATIVE_PHASE", .udesc = "The number of cycles spent in speculative phase", .ucode = 0x2, }, { .uname = "CYCLES_NON_SPECULATIVE_PHASE", .udesc = "The number of cycles spent in non-speculative phase (including cache miss penalty)", .ucode = 0x4, }, { .uname = "CYCLES_WAITING", .udesc = "The number of cycles waiting for a cache hit (cache miss penalty).", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_cancelled_store_to_load_forward_operations[]={ { .uname = "ADDRESS_MISMATCHES", .udesc = "Address mismatches (starting byte not the same).", .ucode = 0x1, }, { .uname = "STORE_IS_SMALLER_THAN_LOAD", .udesc = "Store is smaller than load.", .ucode = 
0x2, }, { .uname = "MISALIGNED", .udesc = "Misaligned.", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_data_cache_refills[]={ { .uname = "SYSTEM", .udesc = "Refill from the Northbridge", .ucode = 0x1, }, { .uname = "L2_SHARED", .udesc = "Shared-state line from L2", .ucode = 0x2, }, { .uname = "L2_EXCLUSIVE", .udesc = "Exclusive-state line from L2", .ucode = 0x4, }, { .uname = "L2_OWNED", .udesc = "Owned-state line from L2", .ucode = 0x8, }, { .uname = "L2_MODIFIED", .udesc = "Modified-state line from L2", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_data_cache_refills_from_system[]={ { .uname = "INVALID", .udesc = "Invalid", .ucode = 0x1, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x2, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x8, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_data_cache_lines_evicted[]={ { .uname = "INVALID", .udesc = "Invalid", .ucode = 0x1, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x2, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x8, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x10, }, { .uname = "BY_PREFETCHNTA", .udesc = "Cache line evicted was brought into the cache with by a PrefetchNTA instruction.", .ucode = 0x20, }, { .uname = "NOT_BY_PREFETCHNTA", .udesc = "Cache line evicted was not brought into the cache with by a PrefetchNTA instruction.", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags= 
AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_l1_dtlb_miss_and_l2_dtlb_hit[]={ { .uname = "L2_4K_TLB_HIT", .udesc = "L2 4K TLB hit", .ucode = 0x1, }, { .uname = "L2_2M_TLB_HIT", .udesc = "L2 2M TLB hit", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL | AMD64_FL_TILL_FAM10H_REV_B, }, { .uname = "L2_1G_TLB_HIT", .udesc = "L2 1G TLB hit", .ucode = 0x4, .uflags= AMD64_FL_FAM10H_REV_C, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL | AMD64_FL_FAM10H_REV_C, }, }; static const amd64_umask_t amd64_fam10h_l1_dtlb_and_l2_dtlb_miss[]={ { .uname = "4K_TLB_RELOAD", .udesc = "4K TLB reload", .ucode = 0x1, }, { .uname = "2M_TLB_RELOAD", .udesc = "2M TLB reload", .ucode = 0x2, }, { .uname = "1G_TLB_RELOAD", .udesc = "1G TLB reload", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_scrubber_single_bit_ecc_errors[]={ { .uname = "SCRUBBER_ERROR", .udesc = "Scrubber error", .ucode = 0x1, }, { .uname = "PIGGYBACK_ERROR", .udesc = "Piggyback scrubber errors", .ucode = 0x2, }, { .uname = "LOAD_PIPE_ERROR", .udesc = "Load pipe error", .ucode = 0x4, }, { .uname = "STORE_WRITE_PIPE_ERROR", .udesc = "Store write pipe error", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_prefetch_instructions_dispatched[]={ { .uname = "LOAD", .udesc = "Load (Prefetch, PrefetchT0/T1/T2)", .ucode = 0x1, }, { .uname = "STORE", .udesc = "Store (PrefetchW)", .ucode = 0x2, }, { .uname = "NTA", .udesc = "NTA (PrefetchNTA)", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t 
amd64_fam10h_dcache_misses_by_locked_instructions[]={ { .uname = "DATA_CACHE_MISSES_BY_LOCKED_INSTRUCTIONS", .udesc = "Data cache misses by locked instructions", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x2, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_l1_dtlb_hit[]={ { .uname = "L1_4K_TLB_HIT", .udesc = "L1 4K TLB hit", .ucode = 0x1, }, { .uname = "L1_2M_TLB_HIT", .udesc = "L1 2M TLB hit", .ucode = 0x2, }, { .uname = "L1_1G_TLB_HIT", .udesc = "L1 1G TLB hit", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_ineffective_sw_prefetches[]={ { .uname = "SW_PREFETCH_HIT_IN_L1", .udesc = "Software prefetch hit in the L1.", .ucode = 0x1, }, { .uname = "SW_PREFETCH_HIT_IN_L2", .udesc = "Software prefetch hit in L2.", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x9, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_memory_requests[]={ { .uname = "NON_CACHEABLE", .udesc = "Requests to non-cacheable (UC) memory", .ucode = 0x1, }, { .uname = "WRITE_COMBINING", .udesc = "Requests to write-combining (WC) memory or WC buffer flushes to WB memory", .ucode = 0x2, }, { .uname = "STREAMING_STORE", .udesc = "Streaming store (SS) requests", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x83, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_data_prefetches[]={ { .uname = "CANCELLED", .udesc = "Cancelled prefetches", .ucode = 0x1, }, { .uname = "ATTEMPTED", .udesc = "Prefetch attempts", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_mab_requests[]={ { .uname = "BUFFER_0", .udesc = "Buffer 0", .ucode = 0x0, .uflags= 
AMD64_FL_NCOMBO, }, { .uname = "BUFFER_1", .udesc = "Buffer 1", .ucode = 0x1, .uflags= AMD64_FL_NCOMBO, }, { .uname = "BUFFER_2", .udesc = "Buffer 2", .ucode = 0x2, .uflags= AMD64_FL_NCOMBO, }, { .uname = "BUFFER_3", .udesc = "Buffer 3", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO, }, { .uname = "BUFFER_4", .udesc = "Buffer 4", .ucode = 0x4, .uflags= AMD64_FL_NCOMBO, }, { .uname = "BUFFER_5", .udesc = "Buffer 5", .ucode = 0x5, .uflags= AMD64_FL_NCOMBO, }, { .uname = "BUFFER_6", .udesc = "Buffer 6", .ucode = 0x6, .uflags= AMD64_FL_NCOMBO, }, { .uname = "BUFFER_7", .udesc = "Buffer 7", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO, }, { .uname = "BUFFER_8", .udesc = "Buffer 8", .ucode = 0x8, .uflags= AMD64_FL_NCOMBO, }, { .uname = "BUFFER_9", .udesc = "Buffer 9", .ucode = 0x9, .uflags= AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam10h_system_read_responses[]={ { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x1, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x2, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x8, }, { .uname = "DATA_ERROR", .udesc = "Data Error", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_quadwords_written_to_system[]={ { .uname = "QUADWORD_WRITE_TRANSFER", .udesc = "Octword write transfer", .ucode = 0x1, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_requests_to_l2[]={ { .uname = "INSTRUCTIONS", .udesc = "IC fill", .ucode = 0x1, }, { .uname = "DATA", .udesc = "DC fill", .ucode = 0x2, }, { .uname = "TLB_WALK", .udesc = "TLB fill (page table walks)", .ucode = 0x4, }, { .uname = "SNOOP", .udesc = "Tag snoop request", .ucode = 0x8, }, { .uname = "CANCELLED", .udesc = "Cancelled request", .ucode = 0x10, }, { .uname = "HW_PREFETCH_FROM_DC", 
.udesc = "Hardware prefetch from DC", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_l2_cache_miss[]={ { .uname = "INSTRUCTIONS", .udesc = "IC fill", .ucode = 0x1, }, { .uname = "DATA", .udesc = "DC fill (includes possible replays, whereas EventSelect 041h does not)", .ucode = 0x2, }, { .uname = "TLB_WALK", .udesc = "TLB page table walk", .ucode = 0x4, }, { .uname = "HW_PREFETCH_FROM_DC", .udesc = "Hardware prefetch from DC", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_l2_fill_writeback[]={ { .uname = "L2_FILLS", .udesc = "L2 fills (victims from L1 caches, TLB page table walks and data prefetches)", .ucode = 0x1, }, { .uname = "L2_WRITEBACKS", .udesc = "L2 Writebacks to system.", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_l1_itlb_miss_and_l2_itlb_miss[]={ { .uname = "4K_PAGE_FETCHES", .udesc = "Instruction fetches to a 4K page.", .ucode = 0x1, }, { .uname = "2M_PAGE_FETCHES", .udesc = "Instruction fetches to a 2M page.", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_instruction_cache_lines_invalidated[]={ { .uname = "INVALIDATING_PROBE_NO_IN_FLIGHT", .udesc = "Invalidating probe that did not hit any in-flight instructions.", .ucode = 0x1, }, { .uname = "INVALIDATING_PROBE_ONE_OR_MORE_IN_FLIGHT", .udesc = "Invalidating probe that hit one or more in-flight instructions.", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t 
amd64_fam10h_retired_mmx_and_fp_instructions[]={ { .uname = "X87", .udesc = "X87 instructions", .ucode = 0x1, }, { .uname = "MMX_AND_3DNOW", .udesc = "MMX and 3DNow! instructions", .ucode = 0x2, }, { .uname = "PACKED_SSE_AND_SSE2", .udesc = "SSE instructions (SSE, SSE2, SSE3, and SSE4A)", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_retired_fastpath_double_op_instructions[]={ { .uname = "POSITION_0", .udesc = "With low op in position 0", .ucode = 0x1, }, { .uname = "POSITION_1", .udesc = "With low op in position 1", .ucode = 0x2, }, { .uname = "POSITION_2", .udesc = "With low op in position 2", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_fpu_exceptions[]={ { .uname = "X87_RECLASS_MICROFAULTS", .udesc = "X87 reclass microfaults", .ucode = 0x1, }, { .uname = "SSE_RETYPE_MICROFAULTS", .udesc = "SSE retype microfaults", .ucode = 0x2, }, { .uname = "SSE_RECLASS_MICROFAULTS", .udesc = "SSE reclass microfaults", .ucode = 0x4, }, { .uname = "SSE_AND_X87_MICROTRAPS", .udesc = "SSE and x87 microtraps", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_dram_accesses_page[]={ { .uname = "HIT", .udesc = "DCT0 Page hit", .ucode = 0x1, }, { .uname = "MISS", .udesc = "DCT0 Page Miss", .ucode = 0x2, }, { .uname = "CONFLICT", .udesc = "DCT0 Page Conflict", .ucode = 0x4, }, { .uname = "DCT1_PAGE_HIT", .udesc = "DCT1 Page hit", .ucode = 0x8, }, { .uname = "DCT1_PAGE_MISS", .udesc = "DCT1 Page Miss", .ucode = 0x10, }, { .uname = "DCT1_PAGE_CONFLICT", .udesc = "DCT1 Page Conflict", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; 
static const amd64_umask_t amd64_fam10h_memory_controller_page_table_overflows[]={ { .uname = "DCT0_PAGE_TABLE_OVERFLOW", .udesc = "DCT0 Page Table Overflow", .ucode = 0x1, }, { .uname = "DCT1_PAGE_TABLE_OVERFLOW", .udesc = "DCT1 Page Table Overflow", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_memory_controller_slot_misses[]={ { .uname = "DCT0_COMMAND_SLOTS_MISSED", .udesc = "DCT0 Command Slots Missed", .ucode = 0x1, }, { .uname = "DCT1_COMMAND_SLOTS_MISSED", .udesc = "DCT1 Command Slots Missed", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_memory_controller_turnarounds[]={ { .uname = "CHIP_SELECT", .udesc = "DCT0 DIMM (chip select) turnaround", .ucode = 0x1, }, { .uname = "READ_TO_WRITE", .udesc = "DCT0 Read to write turnaround", .ucode = 0x2, }, { .uname = "WRITE_TO_READ", .udesc = "DCT0 Write to read turnaround", .ucode = 0x4, }, { .uname = "DCT1_DIMM", .udesc = "DCT1 DIMM (chip select) turnaround", .ucode = 0x8, }, { .uname = "DCT1_READ_TO_WRITE_TURNAROUND", .udesc = "DCT1 Read to write turnaround", .ucode = 0x10, }, { .uname = "DCT1_WRITE_TO_READ_TURNAROUND", .udesc = "DCT1 Write to read turnaround", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_memory_controller_bypass[]={ { .uname = "HIGH_PRIORITY", .udesc = "Memory controller high priority bypass", .ucode = 0x1, }, { .uname = "LOW_PRIORITY", .udesc = "Memory controller medium priority bypass", .ucode = 0x2, }, { .uname = "DRAM_INTERFACE", .udesc = "DCT0 DCQ bypass", .ucode = 0x4, }, { .uname = "DRAM_QUEUE", .udesc = "DCT1 DCQ bypass", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= 
AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_thermal_status_and_ecc_errors[]={ { .uname = "CLKS_DIE_TEMP_TOO_HIGH", .udesc = "Number of times the HTC trip point is crossed", .ucode = 0x4, }, { .uname = "CLKS_TEMP_THRESHOLD_EXCEEDED", .udesc = "Number of clocks when STC trip point active", .ucode = 0x8, }, { .uname = "STC_TRIP_POINTS_CROSSED", .udesc = "Number of times the STC trip point is crossed", .ucode = 0x10, }, { .uname = "CLOCKS_HTC_P_STATE_INACTIVE", .udesc = "Number of clocks HTC P-state is inactive.", .ucode = 0x20, }, { .uname = "CLOCKS_HTC_P_STATE_ACTIVE", .udesc = "Number of clocks HTC P-state is active", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7c, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_cpu_io_requests_to_memory_io[]={ { .uname = "I_O_TO_I_O", .udesc = "IO to IO", .ucode = 0x1, }, { .uname = "I_O_TO_MEM", .udesc = "IO to Mem", .ucode = 0x2, }, { .uname = "CPU_TO_I_O", .udesc = "CPU to IO", .ucode = 0x4, }, { .uname = "CPU_TO_MEM", .udesc = "CPU to Mem", .ucode = 0x8, }, { .uname = "TO_REMOTE_NODE", .udesc = "To remote node", .ucode = 0x10, }, { .uname = "TO_LOCAL_NODE", .udesc = "To local node", .ucode = 0x20, }, { .uname = "FROM_REMOTE_NODE", .udesc = "From remote node", .ucode = 0x40, }, { .uname = "FROM_LOCAL_NODE", .udesc = "From local node", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_cache_block[]={ { .uname = "VICTIM_WRITEBACK", .udesc = "Victim Block (Writeback)", .ucode = 0x1, }, { .uname = "DCACHE_LOAD_MISS", .udesc = "Read Block (Dcache load miss refill)", .ucode = 0x4, }, { .uname = "SHARED_ICACHE_REFILL", .udesc = "Read Block Shared (Icache refill)", .ucode = 0x8, }, { .uname = "READ_BLOCK_MODIFIED", .udesc = "Read Block Modified (Dcache store miss refill)", .ucode = 0x10, }, { .uname = 
"READ_TO_DIRTY", .udesc = "Change-to-Dirty (first store to clean block already in cache)", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3d, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_sized_commands[]={ { .uname = "NON_POSTED_WRITE_BYTE", .udesc = "Non-Posted SzWr Byte (1-32 bytes) Legacy or mapped IO, typically 1-4 bytes", .ucode = 0x1, }, { .uname = "NON_POSTED_WRITE_DWORD", .udesc = "Non-Posted SzWr DW (1-16 dwords) Legacy or mapped IO, typically 1 DWORD", .ucode = 0x2, }, { .uname = "POSTED_WRITE_BYTE", .udesc = "Posted SzWr Byte (1-32 bytes) Subcache-line DMA writes, size varies; also flushes of partially-filled Write Combining buffer", .ucode = 0x4, }, { .uname = "POSTED_WRITE_DWORD", .udesc = "Posted SzWr DW (1-16 dwords) Block-oriented DMA writes, often cache-line sized; also processor Write Combining buffer flushes", .ucode = 0x8, }, { .uname = "READ_BYTE_4_BYTES", .udesc = "SzRd Byte (4 bytes) Legacy or mapped IO", .ucode = 0x10, }, { .uname = "READ_DWORD_1_16_DWORDS", .udesc = "SzRd DW (1-16 dwords) Block-oriented DMA reads, typically cache-line size", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_probe[]={ { .uname = "MISS", .udesc = "Probe miss", .ucode = 0x1, }, { .uname = "HIT_CLEAN", .udesc = "Probe hit clean", .ucode = 0x2, }, { .uname = "HIT_DIRTY_NO_MEMORY_CANCEL", .udesc = "Probe hit dirty without memory cancel (probed by Sized Write or Change2Dirty)", .ucode = 0x4, }, { .uname = "HIT_DIRTY_WITH_MEMORY_CANCEL", .udesc = "Probe hit dirty with memory cancel (probed by DMA read or cache refill request)", .ucode = 0x8, }, { .uname = "UPSTREAM_DISPLAY_REFRESH_READS", .udesc = "Upstream display refresh/ISOC reads", .ucode = 0x10, }, { .uname = "UPSTREAM_NON_DISPLAY_REFRESH_READS", .udesc = "Upstream non-display refresh reads", .ucode = 0x20, }, { 
.uname = "UPSTREAM_WRITES", .udesc = "Upstream ISOC writes", .ucode = 0x40, }, { .uname = "UPSTREAM_NON_ISOC_WRITES", .udesc = "Upstream non-ISOC writes", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_gart[]={ { .uname = "APERTURE_HIT_FROM_CPU", .udesc = "GART aperture hit on access from CPU", .ucode = 0x1, }, { .uname = "APERTURE_HIT_FROM_IO", .udesc = "GART aperture hit on access from IO", .ucode = 0x2, }, { .uname = "MISS", .udesc = "GART miss", .ucode = 0x4, }, { .uname = "REQUEST_HIT_TABLE_WALK", .udesc = "GART/DEV Request hit table walk in progress", .ucode = 0x8, }, { .uname = "DEV_HIT", .udesc = "DEV hit", .ucode = 0x10, }, { .uname = "DEV_MISS", .udesc = "DEV miss", .ucode = 0x20, }, { .uname = "DEV_ERROR", .udesc = "DEV error", .ucode = 0x40, }, { .uname = "MULTIPLE_TABLE_WALK", .udesc = "GART/DEV multiple table walk in progress", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_memory_controller_requests[]={ { .uname = "WRITE_REQUESTS", .udesc = "Write requests sent to the DCT", .ucode = 0x1, }, { .uname = "READ_REQUESTS", .udesc = "Read requests (including prefetch requests) sent to the DCT", .ucode = 0x2, }, { .uname = "PREFETCH_REQUESTS", .udesc = "Prefetch requests sent to the DCT", .ucode = 0x4, }, { .uname = "32_BYTES_WRITES", .udesc = "32 Bytes Sized Writes", .ucode = 0x8, }, { .uname = "64_BYTES_WRITES", .udesc = "64 Bytes Sized Writes", .ucode = 0x10, }, { .uname = "32_BYTES_READS", .udesc = "32 Bytes Sized Reads", .ucode = 0x20, }, { .uname = "64_BYTES_READS", .udesc = "64 Byte Sized Reads", .ucode = 0x40, }, { .uname = "READ_REQUESTS_WHILE_WRITES_REQUESTS", .udesc = "Read requests sent to the DCT while writes requests are pending in the DCT", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All 
sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_cpu_to_dram_requests_to_target_node[]={ { .uname = "LOCAL_TO_0", .udesc = "From Local node to Node 0", .ucode = 0x1, }, { .uname = "LOCAL_TO_1", .udesc = "From Local node to Node 1", .ucode = 0x2, }, { .uname = "LOCAL_TO_2", .udesc = "From Local node to Node 2", .ucode = 0x4, }, { .uname = "LOCAL_TO_3", .udesc = "From Local node to Node 3", .ucode = 0x8, }, { .uname = "LOCAL_TO_4", .udesc = "From Local node to Node 4", .ucode = 0x10, }, { .uname = "LOCAL_TO_5", .udesc = "From Local node to Node 5", .ucode = 0x20, }, { .uname = "LOCAL_TO_6", .udesc = "From Local node to Node 6", .ucode = 0x40, }, { .uname = "LOCAL_TO_7", .udesc = "From Local node to Node 7", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_cpu_read_command_latency_to_target_node_0_3[]={ { .uname = "READ_BLOCK", .udesc = "Read block", .ucode = 0x1, }, { .uname = "READ_BLOCK_SHARED", .udesc = "Read block shared", .ucode = 0x2, }, { .uname = "READ_BLOCK_MODIFIED", .udesc = "Read block modified", .ucode = 0x4, }, { .uname = "CHANGE_TO_DIRTY", .udesc = "Change-to-Dirty", .ucode = 0x8, }, { .uname = "LOCAL_TO_0", .udesc = "From Local node to Node 0", .ucode = 0x10, }, { .uname = "LOCAL_TO_1", .udesc = "From Local node to Node 1", .ucode = 0x20, }, { .uname = "LOCAL_TO_2", .udesc = "From Local node to Node 2", .ucode = 0x40, }, { .uname = "LOCAL_TO_3", .udesc = "From Local node to Node 3", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_cpu_read_command_latency_to_target_node_4_7[]={ { .uname = "READ_BLOCK", .udesc = "Read block", .ucode = 0x1, }, { .uname = "READ_BLOCK_SHARED", .udesc = "Read block shared", .ucode = 0x2, }, { .uname = 
"READ_BLOCK_MODIFIED", .udesc = "Read block modified", .ucode = 0x4, }, { .uname = "CHANGE_TO_DIRTY", .udesc = "Change-to-Dirty", .ucode = 0x8, }, { .uname = "LOCAL_TO_4", .udesc = "From Local node to Node 4", .ucode = 0x10, }, { .uname = "LOCAL_TO_5", .udesc = "From Local node to Node 5", .ucode = 0x20, }, { .uname = "LOCAL_TO_6", .udesc = "From Local node to Node 6", .ucode = 0x40, }, { .uname = "LOCAL_TO_7", .udesc = "From Local node to Node 7", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_cpu_command_latency_to_target_node_0_3_4_7[]={ { .uname = "READ_SIZED", .udesc = "Read Sized", .ucode = 0x1, }, { .uname = "WRITE_SIZED", .udesc = "Write Sized", .ucode = 0x2, }, { .uname = "VICTIM_BLOCK", .udesc = "Victim Block", .ucode = 0x4, }, { .uname = "NODE_GROUP_SELECT", .udesc = "Node Group Select. 0=Nodes 0-3. 1= Nodes 4-7.", .ucode = 0x8, }, { .uname = "LOCAL_TO_0_4", .udesc = "From Local node to Node 0/4", .ucode = 0x10, }, { .uname = "LOCAL_TO_1_5", .udesc = "From Local node to Node 1/5", .ucode = 0x20, }, { .uname = "LOCAL_TO_2_6", .udesc = "From Local node to Node 2/6", .ucode = 0x40, }, { .uname = "LOCAL_TO_3_7", .udesc = "From Local node to Node 3/7", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_hypertransport_link0[]={ { .uname = "COMMAND_DWORD_SENT", .udesc = "Command DWORD sent", .ucode = 0x1, .grpid = 0, }, { .uname = "DATA_DWORD_SENT", .udesc = "Data DWORD sent", .ucode = 0x2, .grpid = 0, }, { .uname = "BUFFER_RELEASE_DWORD_SENT", .udesc = "Buffer release DWORD sent", .ucode = 0x4, .grpid = 0, }, { .uname = "NOP_DWORD_SENT", .udesc = "Nop DW sent (idle)", .ucode = 0x8, .grpid = 0, }, { .uname = "ADDRESS_EXT_DWORD_SENT", .udesc = "Address extension DWORD sent", .ucode = 0x10, .grpid = 0, }, { .uname = 
"PER_PACKET_CRC_SENT", .udesc = "Per packet CRC sent", .ucode = 0x20, .grpid = 0, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 0, }, { .uname = "SUBLINK_MASK", .udesc = "SubLink Mask", .ucode = 0x80, .uflags= AMD64_FL_OMIT, .grpid = 1, }, }; static const amd64_umask_t amd64_fam10h_hypertransport_link3[]={ { .uname = "COMMAND_DWORD_SENT", .udesc = "Command DWORD sent", .ucode = 0x1, .grpid = 0, }, { .uname = "DATA_DWORD_SENT", .udesc = "Data DWORD sent", .ucode = 0x2, .grpid = 0, }, { .uname = "BUFFER_RELEASE_DWORD_SENT", .udesc = "Buffer release DWORD sent", .ucode = 0x4, .grpid = 0, }, { .uname = "NOP_DWORD_SENT", .udesc = "Nop DW sent (idle)", .ucode = 0x8, .grpid = 0, }, { .uname = "ADDRESS_EXT_DWORD_SENT", .udesc = "Address extension DWORD sent", .ucode = 0x10, .grpid = 0, }, { .uname = "PER_PACKET_CRC_SENT", .udesc = "Per packet CRC sent", .ucode = 0x20, .grpid = 0, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 0, }, { .uname = "SUBLINK_MASK", .udesc = "SubLink Mask", .ucode = 0x80, .uflags= AMD64_FL_OMIT, .grpid = 1, }, }; static const amd64_umask_t amd64_fam10h_read_request_to_l3_cache[]={ { .uname = "READ_BLOCK_EXCLUSIVE", .udesc = "Read Block Exclusive (Data cache read)", .ucode = 0x1, .grpid = 0, }, { .uname = "READ_BLOCK_SHARED", .udesc = "Read Block Shared (Instruction cache read)", .ucode = 0x2, .grpid = 0, }, { .uname = "READ_BLOCK_MODIFY", .udesc = "Read Block Modify", .ucode = 0x4, .grpid = 0, }, { .uname = "ANY_READ", .udesc = "Any read modes (exclusive, shared, modify)", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 0, }, { .uname = "ALL_CORES", .udesc = "All cores", .ucode = 0xf0, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 1, }, }; static const amd64_umask_t amd64_fam10h_l3_cache_misses[]={ { .uname = "READ_BLOCK_EXCLUSIVE", .udesc = "Read Block Exclusive (Data cache 
read)", .ucode = 0x1, .grpid = 0, }, { .uname = "READ_BLOCK_SHARED", .udesc = "Read Block Shared (Instruction cache read)", .ucode = 0x2, .grpid = 0, }, { .uname = "READ_BLOCK_MODIFY", .udesc = "Read Block Modify", .ucode = 0x4, .grpid = 0, }, { .uname = "ANY_READ", .udesc = "Any read modes (exclusive, shared, modify)", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 0, }, { .uname = "ALL_CORES", .udesc = "All cores", .ucode = 0xf0, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 1, }, }; static const amd64_umask_t amd64_fam10h_l3_fills_caused_by_l2_evictions[]={ { .uname = "SHARED", .udesc = "Shared", .ucode = 0x1, .grpid = 0, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x2, .grpid = 0, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x4, .grpid = 0, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x8, .grpid = 0, }, { .uname = "ANY_STATE", .udesc = "Any line state (shared, owned, exclusive, modified)", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 0, }, { .uname = "ALL_CORES", .udesc = "All cores", .ucode = 0xf0, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 1, }, }; static const amd64_umask_t amd64_fam10h_l3_evictions[]={ { .uname = "SHARED", .udesc = "Shared", .ucode = 0x1, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x2, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x4, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_page_size_mismatches[]={ { .uname = "GUEST_LARGER", .udesc = "Guest page size is larger than the host page size.", .ucode = 0x1, }, { .uname = "MTRR_MISMATCH", .udesc = "MTRR mismatch.", .ucode = 0x2, }, { .uname = "HOST_LARGER", .udesc = "Host page size is larger than the guest page size.", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= 
AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam10h_retired_x87_ops[]={ { .uname = "ADD_SUB_OPS", .udesc = "Add/subtract ops", .ucode = 0x1, }, { .uname = "MUL_OPS", .udesc = "Multiply ops", .ucode = 0x2, }, { .uname = "DIV_OPS", .udesc = "Divide ops", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_entry_t amd64_fam10h_pe[]={ { .name = "DISPATCHED_FPU", .desc = "Dispatched FPU Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_dispatched_fpu), .ngrp = 1, .umasks = amd64_fam10h_dispatched_fpu, }, { .name = "CYCLES_NO_FPU_OPS_RETIRED", .desc = "Cycles in which the FPU is Empty", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1, }, { .name = "DISPATCHED_FPU_OPS_FAST_FLAG", .desc = "Dispatched Fast Flag FPU Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x2, }, { .name = "RETIRED_SSE_OPERATIONS", .desc = "Retired SSE Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x3, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_retired_sse_operations), .ngrp = 1, .umasks = amd64_fam10h_retired_sse_operations, }, { .name = "RETIRED_MOVE_OPS", .desc = "Retired Move Ops", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_retired_move_ops), .ngrp = 1, .umasks = amd64_fam10h_retired_move_ops, }, { .name = "RETIRED_SERIALIZING_OPS", .desc = "Retired Serializing Ops", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x5, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_retired_serializing_ops), .ngrp = 1, .umasks = amd64_fam10h_retired_serializing_ops, }, { .name = "FP_SCHEDULER_CYCLES", .desc = "Number of Cycles that a Serializing uop is in the FP Scheduler", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x6, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_fp_scheduler_cycles), .ngrp = 1, .umasks = amd64_fam10h_fp_scheduler_cycles, }, { .name = "SEGMENT_REGISTER_LOADS", .desc = "Segment Register Loads", .modmsk = 
AMD64_FAM10H_ATTRS, .code = 0x20, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_segment_register_loads), .ngrp = 1, .umasks = amd64_fam10h_segment_register_loads, }, { .name = "PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE", .desc = "Pipeline Restart Due to Self-Modifying Code", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x21, }, { .name = "PIPELINE_RESTART_DUE_TO_PROBE_HIT", .desc = "Pipeline Restart Due to Probe Hit", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x22, }, { .name = "LS_BUFFER_2_FULL_CYCLES", .desc = "LS Buffer 2 Full", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x23, }, { .name = "LOCKED_OPS", .desc = "Locked Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x24, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_locked_ops), .ngrp = 1, .umasks = amd64_fam10h_locked_ops, }, { .name = "RETIRED_CLFLUSH_INSTRUCTIONS", .desc = "Retired CLFLUSH Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x26, }, { .name = "RETIRED_CPUID_INSTRUCTIONS", .desc = "Retired CPUID Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x27, }, { .name = "CANCELLED_STORE_TO_LOAD_FORWARD_OPERATIONS", .desc = "Cancelled Store to Load Forward Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x2a, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_cancelled_store_to_load_forward_operations), .ngrp = 1, .umasks = amd64_fam10h_cancelled_store_to_load_forward_operations, }, { .name = "SMIS_RECEIVED", .desc = "SMIs Received", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x2b, }, { .name = "DATA_CACHE_ACCESSES", .desc = "Data Cache Accesses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x40, }, { .name = "DATA_CACHE_MISSES", .desc = "Data Cache Misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x41, }, { .name = "DATA_CACHE_REFILLS", .desc = "Data Cache Refills from L2 or Northbridge", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x42, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_data_cache_refills), .ngrp = 1, .umasks = amd64_fam10h_data_cache_refills, }, { .name = "DATA_CACHE_REFILLS_FROM_SYSTEM", .desc = "Data Cache Refills from the 
Northbridge", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x43, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_data_cache_refills_from_system), .ngrp = 1, .umasks = amd64_fam10h_data_cache_refills_from_system, }, { .name = "DATA_CACHE_LINES_EVICTED", .desc = "Data Cache Lines Evicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x44, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_data_cache_lines_evicted), .ngrp = 1, .umasks = amd64_fam10h_data_cache_lines_evicted, }, { .name = "L1_DTLB_MISS_AND_L2_DTLB_HIT", .desc = "L1 DTLB Miss and L2 DTLB Hit", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x45, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_l1_dtlb_miss_and_l2_dtlb_hit), .ngrp = 1, .umasks = amd64_fam10h_l1_dtlb_miss_and_l2_dtlb_hit, }, { .name = "L1_DTLB_AND_L2_DTLB_MISS", .desc = "L1 DTLB and L2 DTLB Miss", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x46, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_l1_dtlb_and_l2_dtlb_miss), .ngrp = 1, .umasks = amd64_fam10h_l1_dtlb_and_l2_dtlb_miss, }, { .name = "MISALIGNED_ACCESSES", .desc = "Misaligned Accesses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x47, }, { .name = "MICROARCHITECTURAL_LATE_CANCEL_OF_AN_ACCESS", .desc = "Microarchitectural Late Cancel of an Access", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x48, }, { .name = "MICROARCHITECTURAL_EARLY_CANCEL_OF_AN_ACCESS", .desc = "Microarchitectural Early Cancel of an Access", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x49, }, { .name = "SCRUBBER_SINGLE_BIT_ECC_ERRORS", .desc = "Single-bit ECC Errors Recorded by Scrubber", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4a, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_scrubber_single_bit_ecc_errors), .ngrp = 1, .umasks = amd64_fam10h_scrubber_single_bit_ecc_errors, }, { .name = "PREFETCH_INSTRUCTIONS_DISPATCHED", .desc = "Prefetch Instructions Dispatched", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4b, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_prefetch_instructions_dispatched), .ngrp = 1, .umasks = amd64_fam10h_prefetch_instructions_dispatched, }, { .name = 
"DCACHE_MISSES_BY_LOCKED_INSTRUCTIONS", .desc = "DCACHE Misses by Locked Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4c, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_dcache_misses_by_locked_instructions), .ngrp = 1, .umasks = amd64_fam10h_dcache_misses_by_locked_instructions, }, { .name = "L1_DTLB_HIT", .desc = "L1 DTLB Hit", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4d, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_l1_dtlb_hit), .ngrp = 1, .umasks = amd64_fam10h_l1_dtlb_hit, }, { .name = "INEFFECTIVE_SW_PREFETCHES", .desc = "Ineffective Software Prefetches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x52, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_ineffective_sw_prefetches), .ngrp = 1, .umasks = amd64_fam10h_ineffective_sw_prefetches, }, { .name = "GLOBAL_TLB_FLUSHES", .desc = "Global TLB Flushes", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x54, }, { .name = "MEMORY_REQUESTS", .desc = "Memory Requests by Type", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x65, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_memory_requests), .ngrp = 1, .umasks = amd64_fam10h_memory_requests, }, { .name = "DATA_PREFETCHES", .desc = "Data Prefetcher", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x67, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_data_prefetches), .ngrp = 1, .umasks = amd64_fam10h_data_prefetches, }, { .name = "MAB_REQUESTS", .desc = "Average L1 refill latency for Icache and Dcache misses (request count for cache refills)", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x68, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_mab_requests), .ngrp = 1, .umasks = amd64_fam10h_mab_requests, }, { .name = "MAB_WAIT_CYCLES", .desc = "Average L1 refill latency for Icache and Dcache misses (cycles that requests spent waiting for the refills)", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x69, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_mab_requests), .ngrp = 1, .umasks = amd64_fam10h_mab_requests, /* identical to actual umasks list for this event */ }, { .name = "SYSTEM_READ_RESPONSES", .desc = "Northbridge Read Responses by 
Coherency State", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x6c, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_system_read_responses), .ngrp = 1, .umasks = amd64_fam10h_system_read_responses, }, { .name = "QUADWORDS_WRITTEN_TO_SYSTEM", .desc = "Octwords Written to System", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x6d, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_quadwords_written_to_system), .ngrp = 1, .umasks = amd64_fam10h_quadwords_written_to_system, }, { .name = "CPU_CLK_UNHALTED", .desc = "CPU Clocks not Halted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x76, }, { .name = "REQUESTS_TO_L2", .desc = "Requests to L2 Cache", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x7d, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_requests_to_l2), .ngrp = 1, .umasks = amd64_fam10h_requests_to_l2, }, { .name = "L2_CACHE_MISS", .desc = "L2 Cache Misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x7e, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_l2_cache_miss), .ngrp = 1, .umasks = amd64_fam10h_l2_cache_miss, }, { .name = "L2_FILL_WRITEBACK", .desc = "L2 Fill/Writeback", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x7f, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_l2_fill_writeback), .ngrp = 1, .umasks = amd64_fam10h_l2_fill_writeback, }, { .name = "INSTRUCTION_CACHE_FETCHES", .desc = "Instruction Cache Fetches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x80, }, { .name = "INSTRUCTION_CACHE_MISSES", .desc = "Instruction Cache Misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x81, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_L2", .desc = "Instruction Cache Refills from L2", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x82, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM", .desc = "Instruction Cache Refills from System", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x83, }, { .name = "L1_ITLB_MISS_AND_L2_ITLB_HIT", .desc = "L1 ITLB Miss and L2 ITLB Hit", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x84, }, { .name = "L1_ITLB_MISS_AND_L2_ITLB_MISS", .desc = "L1 ITLB Miss and L2 ITLB Miss", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x85, 
.numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_l1_itlb_miss_and_l2_itlb_miss), .ngrp = 1, .umasks = amd64_fam10h_l1_itlb_miss_and_l2_itlb_miss, }, { .name = "PIPELINE_RESTART_DUE_TO_INSTRUCTION_STREAM_PROBE", .desc = "Pipeline Restart Due to Instruction Stream Probe", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x86, }, { .name = "INSTRUCTION_FETCH_STALL", .desc = "Instruction Fetch Stall", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x87, }, { .name = "RETURN_STACK_HITS", .desc = "Return Stack Hits", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x88, }, { .name = "RETURN_STACK_OVERFLOWS", .desc = "Return Stack Overflows", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x89, }, { .name = "INSTRUCTION_CACHE_VICTIMS", .desc = "Instruction Cache Victims", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x8b, }, { .name = "INSTRUCTION_CACHE_LINES_INVALIDATED", .desc = "Instruction Cache Lines Invalidated", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x8c, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_instruction_cache_lines_invalidated), .ngrp = 1, .umasks = amd64_fam10h_instruction_cache_lines_invalidated, }, { .name = "ITLB_RELOADS", .desc = "ITLB Reloads", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x99, }, { .name = "ITLB_RELOADS_ABORTED", .desc = "ITLB Reloads Aborted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x9a, }, { .name = "RETIRED_INSTRUCTIONS", .desc = "Retired Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc0, }, { .name = "RETIRED_UOPS", .desc = "Retired uops", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc1, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS", .desc = "Retired Branch Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc2, }, { .name = "RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS", .desc = "Retired Mispredicted Branch Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc3, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .desc = "Retired Taken Branch Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc4, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Retired Taken Branch 
Instructions Mispredicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc5, }, { .name = "RETIRED_FAR_CONTROL_TRANSFERS", .desc = "Retired Far Control Transfers", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc6, }, { .name = "RETIRED_BRANCH_RESYNCS", .desc = "Retired Branch Resyncs", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc7, }, { .name = "RETIRED_NEAR_RETURNS", .desc = "Retired Near Returns", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc8, }, { .name = "RETIRED_NEAR_RETURNS_MISPREDICTED", .desc = "Retired Near Returns Mispredicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc9, }, { .name = "RETIRED_INDIRECT_BRANCHES_MISPREDICTED", .desc = "Retired Indirect Branches Mispredicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xca, }, { .name = "RETIRED_MMX_AND_FP_INSTRUCTIONS", .desc = "Retired MMX/FP Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xcb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_retired_mmx_and_fp_instructions), .ngrp = 1, .umasks = amd64_fam10h_retired_mmx_and_fp_instructions, }, { .name = "RETIRED_FASTPATH_DOUBLE_OP_INSTRUCTIONS", .desc = "Retired Fastpath Double Op Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xcc, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_retired_fastpath_double_op_instructions), .ngrp = 1, .umasks = amd64_fam10h_retired_fastpath_double_op_instructions, }, { .name = "INTERRUPTS_MASKED_CYCLES", .desc = "Interrupts-Masked Cycles", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xcd, }, { .name = "INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING", .desc = "Interrupts-Masked Cycles with Interrupt Pending", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xce, }, { .name = "INTERRUPTS_TAKEN", .desc = "Interrupts Taken", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xcf, }, { .name = "DECODER_EMPTY", .desc = "Decoder Empty", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd0, }, { .name = "DISPATCH_STALLS", .desc = "Dispatch Stalls", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd1, }, { .name = "DISPATCH_STALL_FOR_BRANCH_ABORT", .desc = "Dispatch Stall for Branch Abort to 
Retire", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd2, }, { .name = "DISPATCH_STALL_FOR_SERIALIZATION", .desc = "Dispatch Stall for Serialization", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd3, }, { .name = "DISPATCH_STALL_FOR_SEGMENT_LOAD", .desc = "Dispatch Stall for Segment Load", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd4, }, { .name = "DISPATCH_STALL_FOR_REORDER_BUFFER_FULL", .desc = "Dispatch Stall for Reorder Buffer Full", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd5, }, { .name = "DISPATCH_STALL_FOR_RESERVATION_STATION_FULL", .desc = "Dispatch Stall for Reservation Station Full", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd6, }, { .name = "DISPATCH_STALL_FOR_FPU_FULL", .desc = "Dispatch Stall for FPU Full", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd7, }, { .name = "DISPATCH_STALL_FOR_LS_FULL", .desc = "Dispatch Stall for LS Full", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd8, }, { .name = "DISPATCH_STALL_WAITING_FOR_ALL_QUIET", .desc = "Dispatch Stall Waiting for All Quiet", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd9, }, { .name = "DISPATCH_STALL_FOR_FAR_TRANSFER_OR_RSYNC", .desc = "Dispatch Stall for Far Transfer or Resync to Retire", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xda, }, { .name = "FPU_EXCEPTIONS", .desc = "FPU Exceptions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_fpu_exceptions), .ngrp = 1, .umasks = amd64_fam10h_fpu_exceptions, }, { .name = "DR0_BREAKPOINT_MATCHES", .desc = "DR0 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdc, }, { .name = "DR1_BREAKPOINT_MATCHES", .desc = "DR1 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdd, }, { .name = "DR2_BREAKPOINT_MATCHES", .desc = "DR2 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xde, }, { .name = "DR3_BREAKPOINT_MATCHES", .desc = "DR3 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdf, }, { .name = "DRAM_ACCESSES_PAGE", .desc = "DRAM Accesses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe0, .numasks = 
LIBPFM_ARRAY_SIZE(amd64_fam10h_dram_accesses_page), .ngrp = 1, .umasks = amd64_fam10h_dram_accesses_page, }, { .name = "MEMORY_CONTROLLER_PAGE_TABLE_OVERFLOWS", .desc = "DRAM Controller Page Table Overflows", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_memory_controller_page_table_overflows), .ngrp = 1, .umasks = amd64_fam10h_memory_controller_page_table_overflows, }, { .name = "MEMORY_CONTROLLER_SLOT_MISSES", .desc = "Memory Controller DRAM Command Slots Missed", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe2, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_memory_controller_slot_misses), .ngrp = 1, .umasks = amd64_fam10h_memory_controller_slot_misses, }, { .name = "MEMORY_CONTROLLER_TURNAROUNDS", .desc = "Memory Controller Turnarounds", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe3, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_memory_controller_turnarounds), .ngrp = 1, .umasks = amd64_fam10h_memory_controller_turnarounds, }, { .name = "MEMORY_CONTROLLER_BYPASS", .desc = "Memory Controller Bypass Counter Saturation", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe4, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_memory_controller_bypass), .ngrp = 1, .umasks = amd64_fam10h_memory_controller_bypass, }, { .name = "THERMAL_STATUS_AND_ECC_ERRORS", .desc = "Thermal Status", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe8, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_thermal_status_and_ecc_errors), .ngrp = 1, .umasks = amd64_fam10h_thermal_status_and_ecc_errors, }, { .name = "CPU_IO_REQUESTS_TO_MEMORY_IO", .desc = "CPU/IO Requests to Memory/IO", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe9, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_cpu_io_requests_to_memory_io), .ngrp = 1, .umasks = amd64_fam10h_cpu_io_requests_to_memory_io, }, { .name = "CACHE_BLOCK", .desc = "Cache Block Commands", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xea, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_cache_block), .ngrp = 1, .umasks = amd64_fam10h_cache_block, }, { .name = "SIZED_COMMANDS", .desc = 
"Sized Commands", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xeb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_sized_commands), .ngrp = 1, .umasks = amd64_fam10h_sized_commands, }, { .name = "PROBE", .desc = "Probe Responses and Upstream Requests", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xec, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_probe), .ngrp = 1, .umasks = amd64_fam10h_probe, }, { .name = "GART", .desc = "GART Events", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xee, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_gart), .ngrp = 1, .umasks = amd64_fam10h_gart, }, { .name = "MEMORY_CONTROLLER_REQUESTS", .desc = "Memory Controller Requests", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1f0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_memory_controller_requests), .ngrp = 1, .umasks = amd64_fam10h_memory_controller_requests, }, { .name = "CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE", .desc = "CPU to DRAM Requests to Target Node", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1e0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_cpu_to_dram_requests_to_target_node), .ngrp = 1, .umasks = amd64_fam10h_cpu_to_dram_requests_to_target_node, }, { .name = "IO_TO_DRAM_REQUESTS_TO_TARGET_NODE", .desc = "IO to DRAM Requests to Target Node", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1e1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_cpu_to_dram_requests_to_target_node), .ngrp = 1, .umasks = amd64_fam10h_cpu_to_dram_requests_to_target_node, /* identical to actual umasks list for this event */ }, { .name = "CPU_READ_COMMAND_LATENCY_TO_TARGET_NODE_0_3", .desc = "CPU Read Command Latency to Target Node 0-3", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1e2, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_cpu_read_command_latency_to_target_node_0_3), .ngrp = 1, .umasks = amd64_fam10h_cpu_read_command_latency_to_target_node_0_3, }, { .name = "CPU_READ_COMMAND_REQUESTS_TO_TARGET_NODE_0_3", .desc = "CPU Read Command Requests to Target Node 0-3", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1e3, .numasks = 
LIBPFM_ARRAY_SIZE(amd64_fam10h_cpu_read_command_latency_to_target_node_0_3), .ngrp = 1, .umasks = amd64_fam10h_cpu_read_command_latency_to_target_node_0_3, /* identical to actual umasks list for this event */ }, { .name = "CPU_READ_COMMAND_LATENCY_TO_TARGET_NODE_4_7", .desc = "CPU Read Command Latency to Target Node 4-7", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1e4, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_cpu_read_command_latency_to_target_node_4_7), .ngrp = 1, .umasks = amd64_fam10h_cpu_read_command_latency_to_target_node_4_7, }, { .name = "CPU_READ_COMMAND_REQUESTS_TO_TARGET_NODE_4_7", .desc = "CPU Read Command Requests to Target Node 4-7", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1e5, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_cpu_read_command_latency_to_target_node_4_7), .ngrp = 1, .umasks = amd64_fam10h_cpu_read_command_latency_to_target_node_4_7, /* identical to actual umasks list for this event */ }, { .name = "CPU_COMMAND_LATENCY_TO_TARGET_NODE_0_3_4_7", .desc = "CPU Command Latency to Target Node 0-3/4-7", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1e6, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_cpu_command_latency_to_target_node_0_3_4_7), .ngrp = 1, .umasks = amd64_fam10h_cpu_command_latency_to_target_node_0_3_4_7, }, { .name = "CPU_REQUESTS_TO_TARGET_NODE_0_3_4_7", .desc = "CPU Requests to Target Node 0-3/4-7", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1e7, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_cpu_command_latency_to_target_node_0_3_4_7), .ngrp = 1, .umasks = amd64_fam10h_cpu_command_latency_to_target_node_0_3_4_7, /* identical to actual umasks list for this event */ }, { .name = "HYPERTRANSPORT_LINK0", .desc = "HyperTransport Link 0 Transmit Bandwidth", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xf6, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_hypertransport_link0), .ngrp = 2, .umasks = amd64_fam10h_hypertransport_link0, }, { .name = "HYPERTRANSPORT_LINK1", .desc = "HyperTransport Link 1 Transmit Bandwidth", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xf7, .numasks 
= LIBPFM_ARRAY_SIZE(amd64_fam10h_hypertransport_link0), .ngrp = 2, .umasks = amd64_fam10h_hypertransport_link0, /* identical to actual umasks list for this event */ }, { .name = "HYPERTRANSPORT_LINK2", .desc = "HyperTransport Link 2 Transmit Bandwidth", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xf8, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_hypertransport_link0), .ngrp = 2, .umasks = amd64_fam10h_hypertransport_link0, /* identical to actual umasks list for this event */ }, { .name = "HYPERTRANSPORT_LINK3", .desc = "HyperTransport Link 3 Transmit Bandwidth", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1f9, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_hypertransport_link3), .ngrp = 2, .umasks = amd64_fam10h_hypertransport_link3, }, { .name = "READ_REQUEST_TO_L3_CACHE", .desc = "Read Request to L3 Cache", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4e0, .flags = AMD64_FL_TILL_FAM10H_REV_C, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_read_request_to_l3_cache), .ngrp = 2, .umasks = amd64_fam10h_read_request_to_l3_cache, }, { .name = "L3_CACHE_MISSES", .desc = "L3 Cache Misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4e1, .flags = AMD64_FL_TILL_FAM10H_REV_C, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_l3_cache_misses), .ngrp = 2, .umasks = amd64_fam10h_l3_cache_misses, }, { .name = "L3_FILLS_CAUSED_BY_L2_EVICTIONS", .desc = "L3 Fills caused by L2 Evictions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4e2, .flags = AMD64_FL_TILL_FAM10H_REV_C, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_l3_fills_caused_by_l2_evictions), .ngrp = 2, .umasks = amd64_fam10h_l3_fills_caused_by_l2_evictions, }, { .name = "L3_EVICTIONS", .desc = "L3 Evictions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4e3, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_l3_evictions), .ngrp = 1, .umasks = amd64_fam10h_l3_evictions, }, { .name = "PAGE_SIZE_MISMATCHES", .desc = "Page Size Mismatches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x165, .flags = AMD64_FL_FAM10H_REV_C, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_page_size_mismatches), 
.ngrp = 1, .umasks = amd64_fam10h_page_size_mismatches, }, { .name = "RETIRED_X87_OPS", .desc = "Retired x87 Floating Point Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1c0, .flags = AMD64_FL_FAM10H_REV_C, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_retired_x87_ops), .ngrp = 1, .umasks = amd64_fam10h_retired_x87_ops, }, { .name = "IBS_OPS_TAGGED", .desc = "IBS Ops Tagged", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1cf, .flags = AMD64_FL_FAM10H_REV_C, }, { .name = "LFENCE_INST_RETIRED", .desc = "LFENCE Instructions Retired", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1d3, .flags = AMD64_FL_FAM10H_REV_C, }, { .name = "SFENCE_INST_RETIRED", .desc = "SFENCE Instructions Retired", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1d4, .flags = AMD64_FL_FAM10H_REV_C, }, { .name = "MFENCE_INST_RETIRED", .desc = "MFENCE Instructions Retired", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1d5, .flags = AMD64_FL_FAM10H_REV_C, }, { .name = "READ_REQUEST_TO_L3_CACHE", .desc = "Read Request to L3 Cache", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4e0, .flags = AMD64_FL_FAM10H_REV_D, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_l3_cache_misses), .ngrp = 2, .umasks = amd64_fam10h_l3_cache_misses, /* identical to actual umasks list for this event */ }, { .name = "L3_CACHE_MISSES", .desc = "L3 Cache Misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4e1, .flags = AMD64_FL_FAM10H_REV_D, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_l3_cache_misses), .ngrp = 2, .umasks = amd64_fam10h_l3_cache_misses, /* identical to actual umasks list for this event */ }, { .name = "L3_FILLS_CAUSED_BY_L2_EVICTIONS", .desc = "L3 Fills caused by L2 Evictions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4e2, .flags = AMD64_FL_FAM10H_REV_D, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_l3_fills_caused_by_l2_evictions), .ngrp = 2, .umasks = amd64_fam10h_l3_fills_caused_by_l2_evictions, /* identical to actual umasks list for this event */ }, { .name = "NON_CANCELLED_L3_READ_REQUESTS", .desc = "Non-cancelled L3 Read Requests", .modmsk = 
AMD64_FAM10H_ATTRS, .code = 0x4ed, .flags = AMD64_FL_FAM10H_REV_D, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam10h_l3_cache_misses), .ngrp = 2, .umasks = amd64_fam10h_l3_cache_misses, /* identical to actual umasks list for this event */ }, };

papi-papi-7-2-0-t/src/libpfm4/lib/events/amd64_events_fam11h.h

/* * Copyright (c) 2012 University of Tennessee * Contributed by Vince Weaver * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux.
* * PMU: amd64_fam11h (AMD64 Fam11h) */ static const amd64_umask_t amd64_fam11h_dispatched_fpu[]={ { .uname = "OPS_ADD", .udesc = "Add pipe ops excluding load ops and SSE move ops", .ucode = 0x1, }, { .uname = "OPS_MULTIPLY", .udesc = "Multiply pipe ops excluding load ops and SSE move ops", .ucode = 0x2, }, { .uname = "OPS_STORE", .udesc = "Store pipe ops excluding load ops and SSE move ops", .ucode = 0x4, }, { .uname = "OPS_ADD_PIPE_LOAD_OPS", .udesc = "Add pipe load ops and SSE move ops", .ucode = 0x8, }, { .uname = "OPS_MULTIPLY_PIPE_LOAD_OPS", .udesc = "Multiply pipe load ops and SSE move ops", .ucode = 0x10, }, { .uname = "OPS_STORE_PIPE_LOAD_OPS", .udesc = "Store pipe load ops and SSE move ops", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_segment_register_loads[]={ { .uname = "ES", .udesc = "ES", .ucode = 0x1, }, { .uname = "CS", .udesc = "CS", .ucode = 0x2, }, { .uname = "SS", .udesc = "SS", .ucode = 0x4, }, { .uname = "DS", .udesc = "DS", .ucode = 0x8, }, { .uname = "FS", .udesc = "FS", .ucode = 0x10, }, { .uname = "GS", .udesc = "GS", .ucode = 0x20, }, { .uname = "HS", .udesc = "HS", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_locked_ops[]={ { .uname = "EXECUTED", .udesc = "The number of locked instructions executed", .ucode = 0x1, }, { .uname = "CYCLES_SPECULATIVE_PHASE", .udesc = "The number of cycles spent in speculative phase", .ucode = 0x2, }, { .uname = "CYCLES_NON_SPECULATIVE_PHASE", .udesc = "The number of cycles spent in non-speculative phase (including cache miss penalty)", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_data_cache_refills[]={ { .uname = "SYSTEM", 
.udesc = "Refill from the Northbridge", .ucode = 0x1, }, { .uname = "L2_SHARED", .udesc = "Shared-state line from L2", .ucode = 0x2, }, { .uname = "L2_EXCLUSIVE", .udesc = "Exclusive-state line from L2", .ucode = 0x4, }, { .uname = "L2_OWNED", .udesc = "Owned-state line from L2", .ucode = 0x8, }, { .uname = "L2_MODIFIED", .udesc = "Modified-state line from L2", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_data_cache_refills_from_system[]={ { .uname = "INVALID", .udesc = "Invalid", .ucode = 0x1, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x2, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x8, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_data_cache_lines_evicted[]={ { .uname = "INVALID", .udesc = "Invalid", .ucode = 0x1, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x2, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x8, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_scrubber_single_bit_ecc_errors[]={ { .uname = "SCRUBBER_ERROR", .udesc = "Scrubber error", .ucode = 0x1, }, { .uname = "PIGGYBACK_ERROR", .udesc = "Piggyback scrubber errors", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_prefetch_instructions_dispatched[]={ { .uname = "LOAD", .udesc = "Load (Prefetch, PrefetchT0/T1/T2)", .ucode = 0x1, }, { .uname = "STORE", .udesc 
= "Store (PrefetchW)", .ucode = 0x2, }, { .uname = "NTA", .udesc = "NTA (PrefetchNTA)", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_dcache_misses_by_locked_instructions[]={ { .uname = "DATA_CACHE_MISSES_BY_LOCKED_INSTRUCTIONS", .udesc = "Data cache misses by locked instructions", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x2, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_memory_requests[]={ { .uname = "NON_CACHEABLE", .udesc = "Requests to non-cacheable (UC) memory", .ucode = 0x1, }, { .uname = "WRITE_COMBINING", .udesc = "Requests to write-combining (WC) memory or WC buffer flushes to WB memory", .ucode = 0x2, }, { .uname = "STREAMING_STORE", .udesc = "Streaming store (SS) requests", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x83, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_data_prefetches[]={ { .uname = "CANCELLED", .udesc = "Cancelled prefetches", .ucode = 0x1, }, { .uname = "ATTEMPTED", .udesc = "Prefetch attempts", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_system_read_responses[]={ { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x1, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x2, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x4, }, { .uname = "DATA_ERROR", .udesc = "Data Error", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x17, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_quadwords_written_to_system[]={ { .uname = "QUADWORD_WRITE_TRANSFER", .udesc = "Quadword write transfer", .ucode = 0x1, }, { .uname = "ALL", .udesc = "All sub-events 
selected", .ucode = 0x1, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_requests_to_l2[]={ { .uname = "INSTRUCTIONS", .udesc = "IC fill", .ucode = 0x1, }, { .uname = "DATA", .udesc = "DC fill", .ucode = 0x2, }, { .uname = "TLB_WALK", .udesc = "TLB fill (page table walks)", .ucode = 0x4, }, { .uname = "SNOOP", .udesc = "Tag snoop request", .ucode = 0x8, }, { .uname = "CANCELLED", .udesc = "Cancelled request", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_l2_cache_miss[]={ { .uname = "INSTRUCTIONS", .udesc = "IC fill", .ucode = 0x1, }, { .uname = "DATA", .udesc = "DC fill (includes possible replays, whereas EventSelect 041h does not)", .ucode = 0x2, }, { .uname = "TLB_WALK", .udesc = "TLB page table walk", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_l2_fill_writeback[]={ { .uname = "L2_FILLS", .udesc = "L2 fills (victims from L1 caches, TLB page table walks and data prefetches)", .ucode = 0x1, }, { .uname = "L2_WRITEBACKS", .udesc = "L2 Writebacks to system.", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_retired_mmx_and_fp_instructions[]={ { .uname = "X87", .udesc = "X87 instructions", .ucode = 0x1, }, { .uname = "MMX_AND_3DNOW", .udesc = "MMX and 3DNow! 
instructions", .ucode = 0x2, }, { .uname = "PACKED_SSE_AND_SSE2", .udesc = "Packed SSE and SSE2 instructions", .ucode = 0x4, }, { .uname = "SCALAR_SSE_AND_SSE2", .udesc = "Scalar SSE and SSE2 instructions", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_retired_fastpath_double_op_instructions[]={ { .uname = "POSITION_0", .udesc = "With low op in position 0", .ucode = 0x1, }, { .uname = "POSITION_1", .udesc = "With low op in position 1", .ucode = 0x2, }, { .uname = "POSITION_2", .udesc = "With low op in position 2", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_interrupt_events[]={ { .uname = "FIXED_AND_LPA", .udesc = "Fixed and LPA", .ucode = 0x1, }, { .uname = "LPA", .udesc = "LPA", .ucode = 0x2, }, { .uname = "SMI", .udesc = "SMI", .ucode = 0x4, }, { .uname = "NMI", .udesc = "NMI", .ucode = 0x8, }, { .uname = "INIT", .udesc = "INIT", .ucode = 0x10, }, { .uname = "STARTUP", .udesc = "STARTUP", .ucode = 0x20, }, { .uname = "INT", .udesc = "INT", .ucode = 0x40, }, { .uname = "EOI", .udesc = "EOI", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_sideband_signals[]={ { .uname = "HALT", .udesc = "HALT", .ucode = 0x1, }, { .uname = "STOPGRANT", .udesc = "STOPGRANT", .ucode = 0x2, }, { .uname = "SHUTDOWN", .udesc = "SHUTDOWN", .ucode = 0x4, }, { .uname = "WBINVD", .udesc = "WBINVD", .ucode = 0x8, }, { .uname = "INVD", .udesc = "INVD", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_fpu_exceptions[]={ { .uname = "X87_RECLASS_MICROFAULTS", .udesc = "X87 reclass microfaults", 
.ucode = 0x1, }, { .uname = "SSE_RETYPE_MICROFAULTS", .udesc = "SSE retype microfaults", .ucode = 0x2, }, { .uname = "SSE_RECLASS_MICROFAULTS", .udesc = "SSE reclass microfaults", .ucode = 0x4, }, { .uname = "SSE_AND_X87_MICROTRAPS", .udesc = "SSE and x87 microtraps", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_dram_accesses[]={ { .uname = "DCT0_PAGE_HIT", .udesc = "DCT0 Page hit", .ucode = 0x1, }, { .uname = "DCT0_PAGE_MISS", .udesc = "DCT0 Page Miss", .ucode = 0x2, }, { .uname = "DCT0_PAGE_CONFLICT", .udesc = "DCT0 Page Conflict", .ucode = 0x4, }, { .uname = "DCT1_PAGE_HIT", .udesc = "DCT1 Page hit", .ucode = 0x8, }, { .uname = "DCT1_PAGE_MISS", .udesc = "DCT1 Page Miss", .ucode = 0x10, }, { .uname = "DCT1_PAGE_CONFLICT", .udesc = "DCT1 Page Conflict", .ucode = 0x20, }, { .uname = "WRITE_REQUEST", .udesc = "Write request.", .ucode = 0x40, }, { .uname = "READ_REQUEST", .udesc = "Read request.", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_dram_controller_page_table_events[]={ { .uname = "DCT_PAGE_TABLE_OVERFLOW", .udesc = "DCT Page Table Overflow", .ucode = 0x1, }, { .uname = "STALE_TABLE_ENTRY_HITS", .udesc = "Number of stale table entry hits. 
(hit on a page closed too soon).", .ucode = 0x2, }, { .uname = "PAGE_TABLE_IDLE_CYCLE_LIMIT_INCREMENTED", .udesc = "Page table idle cycle limit incremented.", .ucode = 0x4, }, { .uname = "PAGE_TABLE_IDLE_CYCLE_LIMIT_DECREMENTED", .udesc = "Page table idle cycle limit decremented.", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_memory_controller_turnarounds[]={ { .uname = "DCT0_READ_TO_WRITE", .udesc = "DCT0 read-to-write turnaround.", .ucode = 0x1, }, { .uname = "DCT0_WRITE_TO_READ", .udesc = "DCT0 write-to-read turnaround", .ucode = 0x2, }, { .uname = "DCT0_DIMM", .udesc = "DCT0 DIMM (chip select) turnaround", .ucode = 0x4, }, { .uname = "DCT1_READ_TO_WRITE", .udesc = "DCT1 read-to-write turnaround.", .ucode = 0x8, }, { .uname = "DCT1_WRITE_TO_READ", .udesc = "DCT1 write-to-read turnaround", .ucode = 0x10, }, { .uname = "DCT1_DIMM", .udesc = "DCT1 DIMM (chip select) turnaround", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_memory_rbd_queue[]={ { .uname = "COUNTER_REACHED", .udesc = "F2x[1,0]94[DcqBypassMax] counter reached.", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x4, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_thermal_status[]={ { .uname = "MEMHOT_L_ASSERTIONS", .udesc = "Number of clocks MEMHOT_L is asserted.", .ucode = 0x1, }, { .uname = "HTC_TRANSITIONS", .udesc = "Number of times the HTC transitions from inactive to active.", .ucode = 0x4, }, { .uname = "CLOCKS_HTC_P_STATE_INACTIVE", .udesc = "Number of clocks HTC P-state is inactive.", .ucode = 0x20, }, { .uname = "CLOCKS_HTC_P_STATE_ACTIVE", .udesc = "Number of clocks HTC P-state is active", .ucode = 0x40, }, { .uname = "PROCHOT_L_ASSERTIONS", .udesc = "PROCHOT_L asserted 
by an external source and the assertion causes a P-state change.", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xe5, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_cpu_io_requests_to_memory_io[]={ { .uname = "I_O_TO_I_O", .udesc = "IO to IO", .ucode = 0xa1, .uflags= AMD64_FL_NCOMBO, }, { .uname = "I_O_TO_MEM", .udesc = "IO to Mem", .ucode = 0xa2, .uflags= AMD64_FL_NCOMBO, }, { .uname = "CPU_TO_I_O", .udesc = "CPU to IO", .ucode = 0xa4, .uflags= AMD64_FL_NCOMBO, }, { .uname = "CPU_TO_MEM", .udesc = "CPU to Mem", .ucode = 0xa8, .uflags= AMD64_FL_NCOMBO, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xaf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_cache_block[]={ { .uname = "VICTIM_WRITEBACK", .udesc = "Victim Block (Writeback)", .ucode = 0x1, }, { .uname = "DCACHE_LOAD_MISS", .udesc = "Read Block (Dcache load miss refill)", .ucode = 0x4, }, { .uname = "SHARED_ICACHE_REFILL", .udesc = "Read Block Shared (Icache refill)", .ucode = 0x8, }, { .uname = "READ_BLOCK_MODIFIED", .udesc = "Read Block Modified (Dcache store miss refill)", .ucode = 0x10, }, { .uname = "READ_TO_DIRTY", .udesc = "Change-to-Dirty (first store to clean block already in cache)", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3d, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_sized_commands[]={ { .uname = "NON_POSTED_WRITE_BYTE", .udesc = "Non-Posted SzWr Byte (1-32 bytes) Legacy or mapped IO, typically 1-4 bytes", .ucode = 0x1, }, { .uname = "NON_POSTED_WRITE_DWORD", .udesc = "Non-Posted SzWr DW (1-16 dwords) Legacy or mapped IO, typically 1 DWORD", .ucode = 0x2, }, { .uname = "POSTED_WRITE_BYTE", .udesc = "Posted SzWr Byte (1-32 bytes) Subcache-line DMA writes, size varies; also flushes of partially-filled Write Combining buffer", .ucode = 0x4, }, { .uname = "POSTED_WRITE_DWORD", .udesc 
= "Posted SzWr DW (1-16 dwords) Block-oriented DMA writes, often cache-line sized; also processor Write Combining buffer flushes", .ucode = 0x8, }, { .uname = "READ_BYTE_4_BYTES", .udesc = "SzRd Byte (4 bytes) Legacy or mapped IO", .ucode = 0x10, }, { .uname = "READ_DWORD_1_16_DWORDS", .udesc = "SzRd DW (1-16 dwords) Block-oriented DMA reads, typically cache-line size", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_probe[]={ { .uname = "MISS", .udesc = "Probe miss", .ucode = 0x1, }, { .uname = "HIT_CLEAN", .udesc = "Probe hit clean", .ucode = 0x2, }, { .uname = "HIT_DIRTY_NO_MEMORY_CANCEL", .udesc = "Probe hit dirty without memory cancel (probed by Sized Write or Change2Dirty)", .ucode = 0x4, }, { .uname = "HIT_DIRTY_WITH_MEMORY_CANCEL", .udesc = "Probe hit dirty with memory cancel (probed by DMA read or cache refill request)", .ucode = 0x8, }, { .uname = "UPSTREAM_DISPLAY_REFRESH_READS", .udesc = "Upstream display refresh/ISOC reads.", .ucode = 0x10, }, { .uname = "UPSTREAM_NON_DISPLAY_REFRESH_READS", .udesc = "Upstream non-display refresh reads.", .ucode = 0x20, }, { .uname = "UPSTREAM_ISOC_WRITES", .udesc = "Upstream ISOC writes.", .ucode = 0x40, }, { .uname = "UPSTREAM_NON_ISOC_WRITES", .udesc = "Upstream non-ISOC writes.", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_dev[]={ { .uname = "DEV_HIT", .udesc = "DEV hit", .ucode = 0x10, }, { .uname = "DEV_MISS", .udesc = "DEV miss", .ucode = 0x20, }, { .uname = "DEV_ERROR", .udesc = "DEV error", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x70, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_memory_controller_requests[]={ { .uname = "32_BYTES_WRITES", .udesc = "32 Bytes Sized Writes", 
.ucode = 0x8, }, { .uname = "64_BYTES_WRITES", .udesc = "64 Bytes Sized Writes", .ucode = 0x10, }, { .uname = "32_BYTES_READS", .udesc = "32 Bytes Sized Reads", .ucode = 0x20, }, { .uname = "64_BYTES_READS", .udesc = "64 Byte Sized Reads", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x78, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam11h_hypertransport_link0[]={ { .uname = "COMMAND_DWORD_SENT", .udesc = "Command DWORD sent", .ucode = 0x1, .grpid = 0, }, { .uname = "ADDRESS_DWORD_SENT", .udesc = "Address DWORD sent", .ucode = 0x2, .grpid = 0, }, { .uname = "DATA_DWORD_SENT", .udesc = "Data DWORD sent", .ucode = 0x4, .grpid = 0, }, { .uname = "BUFFER_RELEASE_DWORD_SENT", .udesc = "Buffer release DWORD sent", .ucode = 0x8, .grpid = 0, }, { .uname = "NOP_DWORD_SENT", .udesc = "Nop DW sent (idle)", .ucode = 0x10, .grpid = 0, }, { .uname = "PER_PACKET_CRC_SENT", .udesc = "Per packet CRC sent", .ucode = 0x20, .grpid = 0, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 0, }, }; static const amd64_entry_t amd64_fam11h_pe[]={ { .name = "DISPATCHED_FPU", .desc = "Dispatched FPU Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_dispatched_fpu), .ngrp = 1, .umasks = amd64_fam11h_dispatched_fpu, }, { .name = "CYCLES_NO_FPU_OPS_RETIRED", .desc = "Cycles in which the FPU is Empty", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1, }, { .name = "DISPATCHED_FPU_OPS_FAST_FLAG", .desc = "Dispatched Fast Flag FPU Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x2, }, { .name = "SEGMENT_REGISTER_LOADS", .desc = "Segment Register Loads", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x20, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_segment_register_loads), .ngrp = 1, .umasks = amd64_fam11h_segment_register_loads, }, { .name = "PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE", .desc = "Pipeline Restart Due to 
Self-Modifying Code", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x21, }, { .name = "PIPELINE_RESTART_DUE_TO_PROBE_HIT", .desc = "Pipeline Restart Due to Probe Hit", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x22, }, { .name = "LS_BUFFER_2_FULL_CYCLES", .desc = "LS Buffer 2 Full", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x23, }, { .name = "LOCKED_OPS", .desc = "Locked Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x24, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_locked_ops), .ngrp = 1, .umasks = amd64_fam11h_locked_ops, }, { .name = "RETIRED_CLFLUSH_INSTRUCTIONS", .desc = "Retired CLFLUSH Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x26, }, { .name = "RETIRED_CPUID_INSTRUCTIONS", .desc = "Retired CPUID Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x27, }, { .name = "DATA_CACHE_ACCESSES", .desc = "Data Cache Accesses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x40, }, { .name = "DATA_CACHE_MISSES", .desc = "Data Cache Misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x41, }, { .name = "DATA_CACHE_REFILLS", .desc = "Data Cache Refills from L2 or System", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x42, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_data_cache_refills), .ngrp = 1, .umasks = amd64_fam11h_data_cache_refills, }, { .name = "DATA_CACHE_REFILLS_FROM_SYSTEM", .desc = "Data Cache Refills from the System", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x43, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_data_cache_refills_from_system), .ngrp = 1, .umasks = amd64_fam11h_data_cache_refills_from_system, }, { .name = "DATA_CACHE_LINES_EVICTED", .desc = "Data Cache Lines Evicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x44, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_data_cache_lines_evicted), .ngrp = 1, .umasks = amd64_fam11h_data_cache_lines_evicted, }, { .name = "L1_DTLB_MISS_AND_L2_DTLB_HIT", .desc = "Number of data cache accesses that miss in L1 DTLB and hit in L2 DTLB", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x45, }, { .name = "L1_DTLB_AND_L2_DTLB_MISS", .desc = "Number of data 
cache accesses that miss both the L1 and L2 DTLBs", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x46, }, { .name = "MISALIGNED_ACCESSES", .desc = "Misaligned Accesses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x47, }, { .name = "MICROARCHITECTURAL_LATE_CANCEL_OF_AN_ACCESS", .desc = "Microarchitectural Late Cancel of an Access", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x48, }, { .name = "MICROARCHITECTURAL_EARLY_CANCEL_OF_AN_ACCESS", .desc = "Microarchitectural Early Cancel of an Access", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x49, }, { .name = "SCRUBBER_SINGLE_BIT_ECC_ERRORS", .desc = "Single-bit ECC Errors Recorded by Scrubber", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4a, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_scrubber_single_bit_ecc_errors), .ngrp = 1, .umasks = amd64_fam11h_scrubber_single_bit_ecc_errors, }, { .name = "PREFETCH_INSTRUCTIONS_DISPATCHED", .desc = "Prefetch Instructions Dispatched", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4b, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_prefetch_instructions_dispatched), .ngrp = 1, .umasks = amd64_fam11h_prefetch_instructions_dispatched, }, { .name = "DCACHE_MISSES_BY_LOCKED_INSTRUCTIONS", .desc = "DCACHE Misses by Locked Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4c, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_dcache_misses_by_locked_instructions), .ngrp = 1, .umasks = amd64_fam11h_dcache_misses_by_locked_instructions, }, { .name = "MEMORY_REQUESTS", .desc = "Memory Requests by Type", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x65, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_memory_requests), .ngrp = 1, .umasks = amd64_fam11h_memory_requests, }, { .name = "DATA_PREFETCHES", .desc = "Data Prefetcher", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x67, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_data_prefetches), .ngrp = 1, .umasks = amd64_fam11h_data_prefetches, }, { .name = "SYSTEM_READ_RESPONSES", .desc = "System Read Responses by Coherency State", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x6c, .numasks = 
LIBPFM_ARRAY_SIZE(amd64_fam11h_system_read_responses), .ngrp = 1, .umasks = amd64_fam11h_system_read_responses, }, { .name = "QUADWORDS_WRITTEN_TO_SYSTEM", .desc = "Quadwords Written to System", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x6d, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_quadwords_written_to_system), .ngrp = 1, .umasks = amd64_fam11h_quadwords_written_to_system, }, { .name = "CPU_CLK_UNHALTED", .desc = "CPU Clocks not Halted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x76, }, { .name = "REQUESTS_TO_L2", .desc = "Requests to L2 Cache", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x7d, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_requests_to_l2), .ngrp = 1, .umasks = amd64_fam11h_requests_to_l2, }, { .name = "L2_CACHE_MISS", .desc = "L2 Cache Misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x7e, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_l2_cache_miss), .ngrp = 1, .umasks = amd64_fam11h_l2_cache_miss, }, { .name = "L2_FILL_WRITEBACK", .desc = "L2 Fill/Writeback", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x7f, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_l2_fill_writeback), .ngrp = 1, .umasks = amd64_fam11h_l2_fill_writeback, }, { .name = "INSTRUCTION_CACHE_FETCHES", .desc = "Instruction Cache Fetches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x80, }, { .name = "INSTRUCTION_CACHE_MISSES", .desc = "Instruction Cache Misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x81, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_L2", .desc = "Instruction Cache Refills from L2", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x82, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM", .desc = "Instruction Cache Refills from System", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x83, }, { .name = "L1_ITLB_MISS_AND_L2_ITLB_HIT", .desc = "L1 ITLB Miss and L2 ITLB Hit", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x84, }, { .name = "L1_ITLB_MISS_AND_L2_ITLB_MISS", .desc = "L1 ITLB Miss and L2 ITLB Miss", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x85, }, { .name = "PIPELINE_RESTART_DUE_TO_INSTRUCTION_STREAM_PROBE", .desc = 
"Pipeline Restart Due to Instruction Stream Probe", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x86, }, { .name = "INSTRUCTION_FETCH_STALL", .desc = "Instruction Fetch Stall", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x87, }, { .name = "RETURN_STACK_HITS", .desc = "Return Stack Hits", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x88, }, { .name = "RETURN_STACK_OVERFLOWS", .desc = "Return Stack Overflows", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x89, }, { .name = "RETIRED_INSTRUCTIONS", .desc = "Retired Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc0, }, { .name = "RETIRED_UOPS", .desc = "Retired uops", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc1, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS", .desc = "Retired Branch Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc2, }, { .name = "RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS", .desc = "Retired Mispredicted Branch Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc3, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .desc = "Retired Taken Branch Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc4, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Retired Taken Branch Instructions Mispredicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc5, }, { .name = "RETIRED_FAR_CONTROL_TRANSFERS", .desc = "Retired Far Control Transfers", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc6, }, { .name = "RETIRED_BRANCH_RESYNCS", .desc = "Retired Branch Resyncs", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc7, }, { .name = "RETIRED_NEAR_RETURNS", .desc = "Retired Near Returns", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc8, }, { .name = "RETIRED_NEAR_RETURNS_MISPREDICTED", .desc = "Retired Near Returns Mispredicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc9, }, { .name = "RETIRED_INDIRECT_BRANCHES_MISPREDICTED", .desc = "Retired Indirect Branches Mispredicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xca, }, { .name = "RETIRED_MMX_AND_FP_INSTRUCTIONS", .desc = "Retired MMX/FP Instructions", .modmsk = AMD64_FAM10H_ATTRS, 
.code = 0xcb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_retired_mmx_and_fp_instructions), .ngrp = 1, .umasks = amd64_fam11h_retired_mmx_and_fp_instructions, }, { .name = "RETIRED_FASTPATH_DOUBLE_OP_INSTRUCTIONS", .desc = "Retired Fastpath Double Op Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xcc, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_retired_fastpath_double_op_instructions), .ngrp = 1, .umasks = amd64_fam11h_retired_fastpath_double_op_instructions, }, { .name = "INTERRUPTS_MASKED_CYCLES", .desc = "Interrupts-Masked Cycles", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xcd, }, { .name = "INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING", .desc = "Interrupts-Masked Cycles with Interrupt Pending", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xce, }, { .name = "INTERRUPTS_TAKEN", .desc = "Interrupts Taken", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xcf, }, { .name = "DECODER_EMPTY", .desc = "Decoder Empty", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd0, }, { .name = "DISPATCH_STALLS", .desc = "Dispatch Stalls", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd1, }, { .name = "DISPATCH_STALL_FOR_BRANCH_ABORT", .desc = "Dispatch Stall for Branch Abort to Retire", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd2, }, { .name = "DISPATCH_STALL_FOR_SERIALIZATION", .desc = "Dispatch Stall for Serialization", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd3, }, { .name = "DISPATCH_STALL_FOR_SEGMENT_LOAD", .desc = "Dispatch Stall for Segment Load", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd4, }, { .name = "DISPATCH_STALL_FOR_REORDER_BUFFER_FULL", .desc = "Dispatch Stall for Reorder Buffer Full", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd5, }, { .name = "DISPATCH_STALL_FOR_RESERVATION_STATION_FULL", .desc = "Dispatch Stall for Reservation Station Full", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd6, }, { .name = "DISPATCH_STALL_FOR_FPU_FULL", .desc = "Dispatch Stall for FPU Full", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd7, }, { .name = "DISPATCH_STALL_FOR_LS_FULL", .desc = "Dispatch Stall for LS Full", 
.modmsk = AMD64_FAM10H_ATTRS, .code = 0xd8, }, { .name = "DISPATCH_STALL_WAITING_FOR_ALL_QUIET", .desc = "Dispatch Stall Waiting for All Quiet", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd9, }, { .name = "DISPATCH_STALL_FOR_FAR_TRANSFER_OR_RSYNC", .desc = "Dispatch Stall for Far Transfer or Resync to Retire", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xda, }, { .name = "FPU_EXCEPTIONS", .desc = "FPU Exceptions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_fpu_exceptions), .ngrp = 1, .umasks = amd64_fam11h_fpu_exceptions, }, { .name = "DR0_BREAKPOINT_MATCHES", .desc = "DR0 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdc, }, { .name = "DR1_BREAKPOINT_MATCHES", .desc = "DR1 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdd, }, { .name = "DR2_BREAKPOINT_MATCHES", .desc = "DR2 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xde, }, { .name = "DR3_BREAKPOINT_MATCHES", .desc = "DR3 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdf, }, { .name = "DRAM_ACCESSES", .desc = "DRAM Accesses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_dram_accesses), .ngrp = 1, .umasks = amd64_fam11h_dram_accesses, }, { .name = "DRAM_CONTROLLER_PAGE_TABLE_EVENTS", .desc = "DRAM Controller Page Table Events", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_dram_controller_page_table_events), .ngrp = 1, .umasks = amd64_fam11h_dram_controller_page_table_events, }, { .name = "MEMORY_CONTROLLER_TURNAROUNDS", .desc = "Memory Controller Turnarounds", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe3, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_memory_controller_turnarounds), .ngrp = 1, .umasks = amd64_fam11h_memory_controller_turnarounds, }, { .name = "MEMORY_CONTROLLER_RBD_QUEUE", .desc = "Memory Controller RBD Queue Events", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe4, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_memory_rbd_queue), .ngrp = 1, 
.umasks = amd64_fam11h_memory_rbd_queue, }, { .name = "THERMAL_STATUS", .desc = "Thermal Status", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe8, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_thermal_status), .ngrp = 1, .umasks = amd64_fam11h_thermal_status, }, { .name = "CPU_IO_REQUESTS_TO_MEMORY_IO", .desc = "CPU/IO Requests to Memory/IO", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe9, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_cpu_io_requests_to_memory_io), .ngrp = 1, .umasks = amd64_fam11h_cpu_io_requests_to_memory_io, }, { .name = "CACHE_BLOCK", .desc = "Cache Block Commands", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xea, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_cache_block), .ngrp = 1, .umasks = amd64_fam11h_cache_block, }, { .name = "SIZED_COMMANDS", .desc = "Sized Commands", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xeb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_sized_commands), .ngrp = 1, .umasks = amd64_fam11h_sized_commands, }, { .name = "PROBE", .desc = "Probe Responses and Upstream Requests", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xec, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_probe), .ngrp = 1, .umasks = amd64_fam11h_probe, }, { .name = "DEV", .desc = "DEV Events", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xee, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_dev), .ngrp = 1, .umasks = amd64_fam11h_dev, }, { .name = "HYPERTRANSPORT_LINK0", .desc = "HyperTransport Link 0 Transmit Bandwidth", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xf6, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_hypertransport_link0), .ngrp = 1, .umasks = amd64_fam11h_hypertransport_link0, }, { .name = "MEMORY_CONTROLLER_REQUESTS", .desc = "Memory Controller Requests", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1f0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_memory_controller_requests), .ngrp = 1, .umasks = amd64_fam11h_memory_controller_requests, }, { .name = "SIDEBAND_SIGNALS", .desc = "Sideband Signals and Special Cycles", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1e9, .numasks = 
LIBPFM_ARRAY_SIZE(amd64_fam11h_sideband_signals), .ngrp = 1, .umasks = amd64_fam11h_sideband_signals, }, { .name = "INTERRUPT_EVENTS", .desc = "Interrupt Events", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1ea, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam11h_interrupt_events), .ngrp = 1, .umasks = amd64_fam11h_interrupt_events, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/amd64_events_fam12h.h000066400000000000000000001417371502707512200236610ustar00rootroot00000000000000/* * Copyright (c) 2011 University of Tennessee * Contributed by Vince Weaver * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: amd64_fam12h (AMD64 Fam12h) */ static const amd64_umask_t amd64_fam12h_dispatched_fpu[]={ { .uname = "OPS_ADD", .udesc = "Add pipe ops excluding load ops and SSE move ops", .ucode = 0x1, }, { .uname = "OPS_MULTIPLY", .udesc = "Multiply pipe ops excluding load ops and SSE move ops", .ucode = 0x2, }, { .uname = "OPS_STORE", .udesc = "Store pipe ops excluding load ops and SSE move ops", .ucode = 0x4, }, { .uname = "OPS_ADD_PIPE_LOAD_OPS", .udesc = "Add pipe load ops and SSE move ops", .ucode = 0x8, }, { .uname = "OPS_MULTIPLY_PIPE_LOAD_OPS", .udesc = "Multiply pipe load ops and SSE move ops", .ucode = 0x10, }, { .uname = "OPS_STORE_PIPE_LOAD_OPS", .udesc = "Store pipe load ops and SSE move ops", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_retired_sse_operations[]={ { .uname = "SINGLE_ADD_SUB_OPS", .udesc = "Single precision add/subtract ops", .ucode = 0x1, }, { .uname = "SINGLE_MUL_OPS", .udesc = "Single precision multiply ops", .ucode = 0x2, }, { .uname = "SINGLE_DIV_OPS", .udesc = "Single precision divide/square root ops", .ucode = 0x4, }, { .uname = "DOUBLE_ADD_SUB_OPS", .udesc = "Double precision add/subtract ops", .ucode = 0x8, }, { .uname = "DOUBLE_MUL_OPS", .udesc = "Double precision multiply ops", .ucode = 0x10, }, { .uname = "DOUBLE_DIV_OPS", .udesc = "Double precision divide/square root ops", .ucode = 0x20, }, { .uname = "OP_TYPE", .udesc = "Op type: 0=uops. 
1=FLOPS", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_retired_move_ops[]={ { .uname = "LOW_QW_MOVE_UOPS", .udesc = "Merging low quadword move uops", .ucode = 0x1, }, { .uname = "HIGH_QW_MOVE_UOPS", .udesc = "Merging high quadword move uops", .ucode = 0x2, }, { .uname = "ALL_OTHER_MERGING_MOVE_UOPS", .udesc = "All other merging move uops", .ucode = 0x4, }, { .uname = "ALL_OTHER_MOVE_UOPS", .udesc = "All other move uops", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_retired_serializing_ops[]={ { .uname = "SSE_BOTTOM_EXECUTING_UOPS", .udesc = "SSE bottom-executing uops retired", .ucode = 0x1, }, { .uname = "SSE_BOTTOM_SERIALIZING_UOPS", .udesc = "SSE bottom-serializing uops retired", .ucode = 0x2, }, { .uname = "X87_BOTTOM_EXECUTING_UOPS", .udesc = "X87 bottom-executing uops retired", .ucode = 0x4, }, { .uname = "X87_BOTTOM_SERIALIZING_UOPS", .udesc = "X87 bottom-serializing uops retired", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_fp_scheduler_cycles[]={ { .uname = "BOTTOM_EXECUTE_CYCLES", .udesc = "Number of cycles a bottom-execute uop is in the FP scheduler", .ucode = 0x1, }, { .uname = "BOTTOM_SERIALIZING_CYCLES", .udesc = "Number of cycles a bottom-serializing uop is in the FP scheduler", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_segment_register_loads[]={ { .uname = "ES", .udesc = "ES", .ucode = 0x1, }, { .uname = "CS", .udesc = "CS", .ucode = 0x2, }, { .uname = "SS", .udesc = "SS", .ucode = 0x4, }, { .uname = "DS", .udesc = "DS", .ucode = 0x8, }, { .uname 
= "FS", .udesc = "FS", .ucode = 0x10, }, { .uname = "GS", .udesc = "GS", .ucode = 0x20, }, { .uname = "HS", .udesc = "HS", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_locked_ops[]={ { .uname = "EXECUTED", .udesc = "The number of locked instructions executed", .ucode = 0x1, }, { .uname = "CYCLES_SPECULATIVE_PHASE", .udesc = "The number of cycles spent in speculative phase", .ucode = 0x2, }, { .uname = "CYCLES_NON_SPECULATIVE_PHASE", .udesc = "The number of cycles spent in non-speculative phase (including cache miss penalty)", .ucode = 0x4, }, { .uname = "CYCLES_WAITING", .udesc = "The number of cycles waiting for a cache hit (cache miss penalty).", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_cancelled_store_to_load_forward_operations[]={ { .uname = "ADDRESS_MISMATCHES", .udesc = "Address mismatches (starting byte not the same).", .ucode = 0x1, }, { .uname = "STORE_IS_SMALLER_THAN_LOAD", .udesc = "Store is smaller than load.", .ucode = 0x2, }, { .uname = "MISALIGNED", .udesc = "Misaligned.", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_data_cache_refills[]={ { .uname = "SYSTEM", .udesc = "Refill from the Northbridge", .ucode = 0x1, }, { .uname = "L2_SHARED", .udesc = "Shared-state line from L2", .ucode = 0x2, }, { .uname = "L2_EXCLUSIVE", .udesc = "Exclusive-state line from L2", .ucode = 0x4, }, { .uname = "L2_OWNED", .udesc = "Owned-state line from L2", .ucode = 0x8, }, { .uname = "L2_MODIFIED", .udesc = "Modified-state line from L2", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const 
amd64_umask_t amd64_fam12h_data_cache_refills_from_northbridge[]={ { .uname = "INVALID", .udesc = "Invalid", .ucode = 0x1, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x2, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x8, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_data_cache_lines_evicted[]={ { .uname = "INVALID", .udesc = "Invalid", .ucode = 0x1, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x2, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x8, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x10, }, { .uname = "BY_PREFETCHNTA", .udesc = "Cache line evicted was brought into the cache by a PrefetchNTA instruction.", .ucode = 0x20, }, { .uname = "NOT_BY_PREFETCHNTA", .udesc = "Cache line evicted was not brought into the cache by a PrefetchNTA instruction.", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_l1_dtlb_miss_and_l2_dtlb_hit[]={ { .uname = "L2_4K_TLB_HIT", .udesc = "L2 4K TLB hit", .ucode = 0x1, }, { .uname = "L2_2M_TLB_HIT", .udesc = "L2 2M TLB hit", .ucode = 0x2, }, { .uname = "L2_1G_TLB_HIT", .udesc = "L2 1G TLB hit", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_l1_dtlb_and_l2_dtlb_miss[]={ { .uname = "4K_TLB_RELOAD", .udesc = "4K TLB reload", .ucode = 0x1, }, { .uname = "2M_TLB_RELOAD", .udesc = "2M TLB reload", .ucode = 0x2, }, { .uname = "1G_TLB_RELOAD", .udesc = "1G TLB reload", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7,
.uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_prefetch_instructions_dispatched[]={ { .uname = "LOAD", .udesc = "Load (Prefetch, PrefetchT0/T1/T2)", .ucode = 0x1, }, { .uname = "STORE", .udesc = "Store (PrefetchW)", .ucode = 0x2, }, { .uname = "NTA", .udesc = "NTA (PrefetchNTA)", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_dcache_misses_by_locked_instructions[]={ { .uname = "DATA_CACHE_MISSES_BY_LOCKED_INSTRUCTIONS", .udesc = "Data cache misses by locked instructions", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x2, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_l1_dtlb_hit[]={ { .uname = "L1_4K_TLB_HIT", .udesc = "L1 4K TLB hit", .ucode = 0x1, }, { .uname = "L1_2M_TLB_HIT", .udesc = "L1 2M TLB hit", .ucode = 0x2, }, { .uname = "L1_1G_TLB_HIT", .udesc = "L1 1G TLB hit", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_ineffective_sw_prefetches[]={ { .uname = "SW_PREFETCH_HIT_IN_L1", .udesc = "Software prefetch hit in the L1.", .ucode = 0x1, }, { .uname = "SW_PREFETCH_HIT_IN_L2", .udesc = "Software prefetch hit in L2.", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x9, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_memory_requests[]={ { .uname = "NON_CACHEABLE", .udesc = "Requests to non-cacheable (UC) memory", .ucode = 0x1, }, { .uname = "WRITE_COMBINING", .udesc = "Requests to write-combining (WC) memory or WC buffer flushes to WB memory", .ucode = 0x2, }, { .uname = "CACHE_DISABLED", .udesc = "Requests to cache-disabled (CD) memory", .ucode = 0x4, }, { .uname = "STREAMING_STORE", .udesc = "Streaming store (SS) requests", .ucode 
= 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x87, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_data_prefetches[]={ { .uname = "CANCELLED", .udesc = "Cancelled prefetches", .ucode = 0x1, }, { .uname = "ATTEMPTED", .udesc = "Prefetch attempts", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_northbridge_read_responses[]={ { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x1, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x2, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x8, }, { .uname = "DATA_ERROR", .udesc = "Data Error", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_octwords_written_to_system[]={ { .uname = "OCTWORD_WRITE_TRANSFER", .udesc = "Octword write transfer", .ucode = 0x1, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_requests_to_l2[]={ { .uname = "INSTRUCTIONS", .udesc = "IC fill", .ucode = 0x1, }, { .uname = "DATA", .udesc = "DC fill", .ucode = 0x2, }, { .uname = "TLB_WALK", .udesc = "TLB fill (page table walks)", .ucode = 0x4, }, { .uname = "SNOOP", .udesc = "Tag snoop request", .ucode = 0x8, }, { .uname = "CANCELLED", .udesc = "Cancelled request", .ucode = 0x10, }, { .uname = "HW_PREFETCH_FROM_DC", .udesc = "Hardware prefetch from DC", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_l2_cache_miss[]={ { .uname = "INSTRUCTIONS", .udesc = "IC fill", .ucode = 0x1, }, { .uname = "DATA", .udesc = "DC fill (includes possible 
replays, whereas EventSelect 041h does not)", .ucode = 0x2, }, { .uname = "TLB_WALK", .udesc = "TLB page table walk", .ucode = 0x4, }, { .uname = "HW_PREFETCH_FROM_DC", .udesc = "Hardware prefetch from DC", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_l2_fill_writeback[]={ { .uname = "L2_FILLS", .udesc = "L2 fills (victims from L1 caches, TLB page table walks and data prefetches)", .ucode = 0x1, }, { .uname = "L2_WRITEBACKS", .udesc = "L2 Writebacks to system.", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_l1_itlb_miss_and_l2_itlb_miss[]={ { .uname = "4K_PAGE_FETCHES", .udesc = "Instruction fetches to a 4K page.", .ucode = 0x1, }, { .uname = "2M_PAGE_FETCHES", .udesc = "Instruction fetches to a 2M page.", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_instruction_cache_lines_invalidated[]={ { .uname = "INVALIDATING_PROBE_NO_IN_FLIGHT", .udesc = "Invalidating probe that did not hit any in-flight instructions.", .ucode = 0x1, }, { .uname = "INVALIDATING_PROBE_ONE_OR_MORE_IN_FLIGHT", .udesc = "Invalidating probe that hit one or more in-flight instructions.", .ucode = 0x2, }, { .uname = "SMC_NO_INFLIGHT", .udesc = "SMC that did not hit any in-flight instructions.", .ucode = 0x4, }, { .uname = "SMC_INFLIGHT", .udesc = "SMC that hit one or more in-flight instructions.", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_retired_mmx_and_fp_instructions[]={ { .uname = "X87", .udesc = "X87 instructions", .ucode = 0x1, }, { .uname = "MMX_AND_3DNOW", .udesc = "MMX and 3DNow! 
instructions", .ucode = 0x2, }, { .uname = "SSE_AND_SSE2", .udesc = "SSE and SSE2 instructions", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_interrupt_events[]={ { .uname = "FIXED_AND_LPA", .udesc = "Fixed and LPA", .ucode = 0x1, }, { .uname = "LPA", .udesc = "LPA", .ucode = 0x2, }, { .uname = "SMI", .udesc = "SMI", .ucode = 0x4, }, { .uname = "NMI", .udesc = "NMI", .ucode = 0x8, }, { .uname = "INIT", .udesc = "INIT", .ucode = 0x10, }, { .uname = "STARTUP", .udesc = "STARTUP", .ucode = 0x20, }, { .uname = "INT", .udesc = "INT", .ucode = 0x40, }, { .uname = "EOI", .udesc = "EOI", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_sideband_signals[]={ { .uname = "STOPGRANT", .udesc = "STOPGRANT", .ucode = 0x2, }, { .uname = "SHUTDOWN", .udesc = "SHUTDOWN", .ucode = 0x4, }, { .uname = "WBINVD", .udesc = "WBINVD", .ucode = 0x8, }, { .uname = "INVD", .udesc = "INVD", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1e, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_fpu_exceptions[]={ { .uname = "X87_RECLASS_MICROFAULTS", .udesc = "X87 reclass microfaults", .ucode = 0x1, }, { .uname = "SSE_RETYPE_MICROFAULTS", .udesc = "SSE retype microfaults", .ucode = 0x2, }, { .uname = "SSE_RECLASS_MICROFAULTS", .udesc = "SSE reclass microfaults", .ucode = 0x4, }, { .uname = "SSE_AND_X87_MICROTRAPS", .udesc = "SSE and x87 microtraps", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_dram_accesses_page[]={ { .uname = "DCT0_HIT", .udesc = "DCT0 Page hit", .ucode = 0x1, }, { .uname = "DCT0_MISS", .udesc = "DCT0 Page Miss", .ucode = 0x2, }, { .uname 
= "DCT0_CONFLICT", .udesc = "DCT0 Page Conflict", .ucode = 0x4, }, { .uname = "DCT1_PAGE_HIT", .udesc = "DCT1 Page hit", .ucode = 0x8, }, { .uname = "DCT1_PAGE_MISS", .udesc = "DCT1 Page Miss", .ucode = 0x10, }, { .uname = "DCT1_PAGE_CONFLICT", .udesc = "DCT1 Page Conflict", .ucode = 0x20, }, { .uname = "WRITE_REQUEST", .udesc = "Write request.", .ucode = 0x40, }, { .uname = "READ_REQUEST", .udesc = "Read request.", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_memory_controller_page_table_events[]={ { .uname = "PAGE_TABLE_OVERFLOW", .udesc = "Page Table Overflow", .ucode = 0x1, }, { .uname = "STALE_TABLE_ENTRY_HITS", .udesc = "Number of stale table entry hits (hit on a page closed too soon).", .ucode = 0x2, }, { .uname = "PAGE_TABLE_IDLE_CYCLE_LIMIT_INCREMENTED", .udesc = "Page table idle cycle limit incremented.", .ucode = 0x4, }, { .uname = "PAGE_TABLE_IDLE_CYCLE_LIMIT_DECREMENTED", .udesc = "Page table idle cycle limit decremented.", .ucode = 0x8, }, { .uname = "PAGE_TABLE_CLOSED_INACTIVITY", .udesc = "Page table is closed due to row inactivity.", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_memory_controller_slot_misses[]={ { .uname = "DCT0_RBD", .udesc = "DCT0 RBD.", .ucode = 0x10, }, { .uname = "DCT1_RBD", .udesc = "DCT1 RBD.", .ucode = 0x20, }, { .uname = "DCT0_PREFETCH", .udesc = "DCT0 Prefetch.", .ucode = 0x40, }, { .uname = "DCT1_PREFETCH", .udesc = "DCT1 Prefetch.", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf0, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_memory_controller_turnarounds[]={ { .uname = "DCT0_READ_TO_WRITE", .udesc = "DCT0 read-to-write turnaround.", .ucode = 0x1, }, { .uname = "DCT0_WRITE_TO_READ", .udesc
= "DCT0 write-to-read turnaround", .ucode = 0x2, }, { .uname = "DCT1_READ_TO_WRITE", .udesc = "DCT1 read-to-write turnaround.", .ucode = 0x8, }, { .uname = "DCT1_WRITE_TO_READ", .udesc = "DCT1 write-to-read turnaround", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1b, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_memory_rbd_queue[]={ { .uname = "COUNTER_REACHED", .udesc = "D18F2x[1,0]94[DcqBypassMax] counter reached.", .ucode = 0x4, }, { .uname = "BANK_CLOSED", .udesc = "Bank is closed due to bank conflict with an outstanding request in the RBD queue.", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xc, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_thermal_status[]={ { .uname = "MEMHOT_L_ASSERTIONS", .udesc = "MEMHOT_L assertions.", .ucode = 0x1, }, { .uname = "HTC_TRANSITIONS", .udesc = "Number of times the HTC transitions from inactive to active.", .ucode = 0x4, }, { .uname = "CLOCKS_HTC_P_STATE_INACTIVE", .udesc = "Number of clocks HTC P-state is inactive.", .ucode = 0x20, }, { .uname = "CLOCKS_HTC_P_STATE_ACTIVE", .udesc = "Number of clocks HTC P-state is active", .ucode = 0x40, }, { .uname = "PROCHOT_L_ASSERTIONS", .udesc = "PROCHOT_L asserted by an external source and the assertion causes a P-state change.", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xe5, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_cpu_io_requests_to_memory_io[]={ { .uname = "I_O_TO_I_O", .udesc = "IO to IO", .ucode = 0x1, }, { .uname = "I_O_TO_MEM", .udesc = "IO to Mem", .ucode = 0x2, }, { .uname = "CPU_TO_I_O", .udesc = "CPU to IO", .ucode = 0x4, }, { .uname = "CPU_TO_MEM", .udesc = "CPU to Mem", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x0f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t 
amd64_fam12h_cache_block[]={ { .uname = "VICTIM_WRITEBACK", .udesc = "Victim Block (Writeback)", .ucode = 0x1, }, { .uname = "DCACHE_LOAD_MISS", .udesc = "Read Block (Dcache load miss refill)", .ucode = 0x4, }, { .uname = "SHARED_ICACHE_REFILL", .udesc = "Read Block Shared (Icache refill)", .ucode = 0x8, }, { .uname = "READ_BLOCK_MODIFIED", .udesc = "Read Block Modified (Dcache store miss refill)", .ucode = 0x10, }, { .uname = "READ_TO_DIRTY", .udesc = "Change-to-Dirty (first store to clean block already in cache)", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3d, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_sized_commands[]={ { .uname = "NON_POSTED_WRITE_BYTE", .udesc = "Non-Posted SzWr Byte (1-32 bytes) Legacy or mapped IO, typically 1-4 bytes", .ucode = 0x1, }, { .uname = "NON_POSTED_WRITE_DWORD", .udesc = "Non-Posted SzWr DW (1-16 dwords) Legacy or mapped IO, typically 1 DWORD", .ucode = 0x2, }, { .uname = "POSTED_WRITE_BYTE", .udesc = "Posted SzWr Byte (1-32 bytes) Subcache-line DMA writes, size varies; also flushes of partially-filled Write Combining buffer", .ucode = 0x4, }, { .uname = "POSTED_WRITE_DWORD", .udesc = "Posted SzWr DW (1-16 dwords) Block-oriented DMA writes, often cache-line sized; also processor Write Combining buffer flushes", .ucode = 0x8, }, { .uname = "READ_BYTE_4_BYTES", .udesc = "SzRd Byte (4 bytes) Legacy or mapped IO", .ucode = 0x10, }, { .uname = "READ_DWORD_1_16_DWORDS", .udesc = "SzRd DW (1-16 dwords) Block-oriented DMA reads, typically cache-line size", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_probe[]={ { .uname = "MISS", .udesc = "Probe miss", .ucode = 0x1, }, { .uname = "HIT_CLEAN", .udesc = "Probe hit clean", .ucode = 0x2, }, { .uname = "HIT_DIRTY_NO_MEMORY_CANCEL", .udesc = "Probe hit dirty without memory cancel 
(probed by Sized Write or Change2Dirty)", .ucode = 0x4, }, { .uname = "HIT_DIRTY_WITH_MEMORY_CANCEL", .udesc = "Probe hit dirty with memory cancel (probed by DMA read or cache refill request)", .ucode = 0x8, }, { .uname = "UPSTREAM_HIGH_PRIORITY_READS", .udesc = "Upstream high priority reads.", .ucode = 0x10, }, { .uname = "UPSTREAM_LOW_PRIORITY_READS", .udesc = "Upstream low priority reads.", .ucode = 0x20, }, { .uname = "UPSTREAM_LOW_PRIORITY_WRITES", .udesc = "Upstream low priority writes.", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xbf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_dev[]={ { .uname = "DEV_HIT", .udesc = "DEV hit", .ucode = 0x10, }, { .uname = "DEV_MISS", .udesc = "DEV miss", .ucode = 0x20, }, { .uname = "DEV_ERROR", .udesc = "DEV error", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x70, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_memory_controller_requests[]={ { .uname = "32_BYTES_WRITES", .udesc = "32 Bytes Sized Writes", .ucode = 0x8, }, { .uname = "64_BYTES_WRITES", .udesc = "64 Bytes Sized Writes", .ucode = 0x10, }, { .uname = "32_BYTES_READS", .udesc = "32 Bytes Sized Reads", .ucode = 0x20, }, { .uname = "64_BYTES_READS", .udesc = "64 Byte Sized Reads", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x78, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam12h_page_size_mismatches[]={ { .uname = "GUEST_LARGER", .udesc = "Guest page size is larger than the host page size.", .ucode = 0x1, }, { .uname = "MTRR_MISMATCH", .udesc = "MTRR mismatch.", .ucode = 0x2, }, { .uname = "HOST_LARGER", .udesc = "Host page size is larger than the guest page size.", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t 
amd64_fam12h_retired_x87_ops[]={ { .uname = "ADD_SUB_OPS", .udesc = "Add/subtract ops", .ucode = 0x1, }, { .uname = "MUL_OPS", .udesc = "Multiply ops", .ucode = 0x2, }, { .uname = "DIV_OPS", .udesc = "Divide ops", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_entry_t amd64_fam12h_pe[]={ { .name = "DISPATCHED_FPU", .desc = "Dispatched FPU Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_dispatched_fpu), .ngrp = 1, .umasks = amd64_fam12h_dispatched_fpu, }, { .name = "CYCLES_NO_FPU_OPS_RETIRED", .desc = "Cycles in which the FPU is Empty", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1, }, { .name = "DISPATCHED_FPU_OPS_FAST_FLAG", .desc = "Dispatched Fast Flag FPU Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x2, }, { .name = "RETIRED_SSE_OPERATIONS", .desc = "Retired SSE Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x3, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_retired_sse_operations), .ngrp = 1, .umasks = amd64_fam12h_retired_sse_operations, }, { .name = "RETIRED_MOVE_OPS", .desc = "Retired Move Ops", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_retired_move_ops), .ngrp = 1, .umasks = amd64_fam12h_retired_move_ops, }, { .name = "RETIRED_SERIALIZING_OPS", .desc = "Retired Serializing Ops", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x5, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_retired_serializing_ops), .ngrp = 1, .umasks = amd64_fam12h_retired_serializing_ops, }, { .name = "FP_SCHEDULER_CYCLES", .desc = "Number of Cycles that a Serializing uop is in the FP Scheduler", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x6, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_fp_scheduler_cycles), .ngrp = 1, .umasks = amd64_fam12h_fp_scheduler_cycles, }, { .name = "SEGMENT_REGISTER_LOADS", .desc = "Segment Register Loads", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x20, .numasks = 
LIBPFM_ARRAY_SIZE(amd64_fam12h_segment_register_loads), .ngrp = 1, .umasks = amd64_fam12h_segment_register_loads, }, { .name = "PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE", .desc = "Pipeline Restart Due to Self-Modifying Code", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x21, }, { .name = "PIPELINE_RESTART_DUE_TO_PROBE_HIT", .desc = "Pipeline Restart Due to Probe Hit", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x22, }, { .name = "LS_BUFFER_2_FULL_CYCLES", .desc = "LS Buffer 2 Full", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x23, }, { .name = "LOCKED_OPS", .desc = "Locked Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x24, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_locked_ops), .ngrp = 1, .umasks = amd64_fam12h_locked_ops, }, { .name = "RETIRED_CLFLUSH_INSTRUCTIONS", .desc = "Retired CLFLUSH Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x26, }, { .name = "RETIRED_CPUID_INSTRUCTIONS", .desc = "Retired CPUID Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x27, }, { .name = "CANCELLED_STORE_TO_LOAD_FORWARD_OPERATIONS", .desc = "Cancelled Store to Load Forward Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x2a, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_cancelled_store_to_load_forward_operations), .ngrp = 1, .umasks = amd64_fam12h_cancelled_store_to_load_forward_operations, }, { .name = "SMIS_RECEIVED", .desc = "SMIs Received", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x2b, }, { .name = "DATA_CACHE_ACCESSES", .desc = "Data Cache Accesses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x40, }, { .name = "DATA_CACHE_MISSES", .desc = "Data Cache Misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x41, }, { .name = "DATA_CACHE_REFILLS", .desc = "Data Cache Refills from L2 or Northbridge", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x42, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_data_cache_refills), .ngrp = 1, .umasks = amd64_fam12h_data_cache_refills, }, { .name = "DATA_CACHE_REFILLS_FROM_SYSTEM", .desc = "Data Cache Refills from the Northbridge", .modmsk = AMD64_FAM10H_ATTRS, 
.code = 0x43, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_data_cache_refills_from_northbridge), .ngrp = 1, .umasks = amd64_fam12h_data_cache_refills_from_northbridge, }, { .name = "DATA_CACHE_LINES_EVICTED", .desc = "Data Cache Lines Evicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x44, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_data_cache_lines_evicted), .ngrp = 1, .umasks = amd64_fam12h_data_cache_lines_evicted, }, { .name = "L1_DTLB_MISS_AND_L2_DTLB_HIT", .desc = "L1 DTLB Miss and L2 DTLB Hit", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x45, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_l1_dtlb_miss_and_l2_dtlb_hit), .ngrp = 1, .umasks = amd64_fam12h_l1_dtlb_miss_and_l2_dtlb_hit, }, { .name = "L1_DTLB_AND_L2_DTLB_MISS", .desc = "L1 DTLB and L2 DTLB Miss", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x46, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_l1_dtlb_and_l2_dtlb_miss), .ngrp = 1, .umasks = amd64_fam12h_l1_dtlb_and_l2_dtlb_miss, }, { .name = "MISALIGNED_ACCESSES", .desc = "Misaligned Accesses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x47, }, { .name = "MICROARCHITECTURAL_LATE_CANCEL_OF_AN_ACCESS", .desc = "Microarchitectural Late Cancel of an Access", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x48, }, { .name = "MICROARCHITECTURAL_EARLY_CANCEL_OF_AN_ACCESS", .desc = "Microarchitectural Early Cancel of an Access", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x49, }, { .name = "PREFETCH_INSTRUCTIONS_DISPATCHED", .desc = "Prefetch Instructions Dispatched", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4b, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_prefetch_instructions_dispatched), .ngrp = 1, .umasks = amd64_fam12h_prefetch_instructions_dispatched, }, { .name = "DCACHE_MISSES_BY_LOCKED_INSTRUCTIONS", .desc = "DCACHE Misses by Locked Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4c, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_dcache_misses_by_locked_instructions), .ngrp = 1, .umasks = amd64_fam12h_dcache_misses_by_locked_instructions, }, { .name = "L1_DTLB_HIT", .desc = "L1 DTLB Hit", 
.modmsk = AMD64_FAM10H_ATTRS, .code = 0x4d, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_l1_dtlb_hit), .ngrp = 1, .umasks = amd64_fam12h_l1_dtlb_hit, }, { .name = "INEFFECTIVE_SW_PREFETCHES", .desc = "Ineffective Software Prefetches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x52, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_ineffective_sw_prefetches), .ngrp = 1, .umasks = amd64_fam12h_ineffective_sw_prefetches, }, { .name = "GLOBAL_TLB_FLUSHES", .desc = "Global TLB Flushes", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x54, }, { .name = "MEMORY_REQUESTS", .desc = "Memory Requests by Type", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x65, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_memory_requests), .ngrp = 1, .umasks = amd64_fam12h_memory_requests, }, { .name = "DATA_PREFETCHES", .desc = "Data Prefetcher", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x67, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_data_prefetches), .ngrp = 1, .umasks = amd64_fam12h_data_prefetches, }, { .name = "NORTHBRIDGE_READ_RESPONSES", .desc = "Northbridge Read Responses by Coherency State", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x6c, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_northbridge_read_responses), .ngrp = 1, .umasks = amd64_fam12h_northbridge_read_responses, }, { .name = "OCTWORDS_WRITTEN_TO_SYSTEM", .desc = "Octwords Written to System", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x6d, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_octwords_written_to_system), .ngrp = 1, .umasks = amd64_fam12h_octwords_written_to_system, }, { .name = "CPU_CLK_UNHALTED", .desc = "CPU Clocks not Halted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x76, }, { .name = "REQUESTS_TO_L2", .desc = "Requests to L2 Cache", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x7d, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_requests_to_l2), .ngrp = 1, .umasks = amd64_fam12h_requests_to_l2, }, { .name = "L2_CACHE_MISS", .desc = "L2 Cache Misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x7e, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_l2_cache_miss), .ngrp = 1, .umasks = 
amd64_fam12h_l2_cache_miss, }, { .name = "L2_FILL_WRITEBACK", .desc = "L2 Fill/Writeback", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x7f, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_l2_fill_writeback), .ngrp = 1, .umasks = amd64_fam12h_l2_fill_writeback, }, { .name = "PAGE_SIZE_MISMATCHES", .desc = "Page Size Mismatches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x165, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_page_size_mismatches), .ngrp = 1, .umasks = amd64_fam12h_page_size_mismatches, }, { .name = "INSTRUCTION_CACHE_FETCHES", .desc = "Instruction Cache Fetches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x80, }, { .name = "INSTRUCTION_CACHE_MISSES", .desc = "Instruction Cache Misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x81, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_L2", .desc = "Instruction Cache Refills from L2", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x82, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM", .desc = "Instruction Cache Refills from System", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x83, }, { .name = "L1_ITLB_MISS_AND_L2_ITLB_HIT", .desc = "L1 ITLB Miss and L2 ITLB Hit", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x84, }, { .name = "L1_ITLB_MISS_AND_L2_ITLB_MISS", .desc = "L1 ITLB Miss and L2 ITLB Miss", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x85, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_l1_itlb_miss_and_l2_itlb_miss), .ngrp = 1, .umasks = amd64_fam12h_l1_itlb_miss_and_l2_itlb_miss, }, { .name = "PIPELINE_RESTART_DUE_TO_INSTRUCTION_STREAM_PROBE", .desc = "Pipeline Restart Due to Instruction Stream Probe", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x86, }, { .name = "INSTRUCTION_FETCH_STALL", .desc = "Instruction Fetch Stall", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x87, }, { .name = "RETURN_STACK_HITS", .desc = "Return Stack Hits", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x88, }, { .name = "RETURN_STACK_OVERFLOWS", .desc = "Return Stack Overflows", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x89, }, { .name = "INSTRUCTION_CACHE_VICTIMS", .desc = "Instruction Cache 
Victims", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x8b, }, { .name = "INSTRUCTION_CACHE_LINES_INVALIDATED", .desc = "Instruction Cache Lines Invalidated", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x8c, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_instruction_cache_lines_invalidated), .ngrp = 1, .umasks = amd64_fam12h_instruction_cache_lines_invalidated, }, { .name = "ITLB_RELOADS", .desc = "ITLB Reloads", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x99, }, { .name = "ITLB_RELOADS_ABORTED", .desc = "ITLB Reloads Aborted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x9a, }, { .name = "RETIRED_INSTRUCTIONS", .desc = "Retired Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc0, }, { .name = "RETIRED_UOPS", .desc = "Retired uops", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc1, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS", .desc = "Retired Branch Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc2, }, { .name = "RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS", .desc = "Retired Mispredicted Branch Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc3, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .desc = "Retired Taken Branch Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc4, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Retired Taken Branch Instructions Mispredicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc5, }, { .name = "RETIRED_FAR_CONTROL_TRANSFERS", .desc = "Retired Far Control Transfers", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc6, }, { .name = "RETIRED_BRANCH_RESYNCS", .desc = "Retired Branch Resyncs", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc7, }, { .name = "RETIRED_NEAR_RETURNS", .desc = "Retired Near Returns", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc8, }, { .name = "RETIRED_NEAR_RETURNS_MISPREDICTED", .desc = "Retired Near Returns Mispredicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc9, }, { .name = "RETIRED_INDIRECT_BRANCHES_MISPREDICTED", .desc = "Retired Indirect Branches Mispredicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 
0xca, }, { .name = "RETIRED_MMX_AND_FP_INSTRUCTIONS", .desc = "Retired MMX/FP Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xcb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_retired_mmx_and_fp_instructions), .ngrp = 1, .umasks = amd64_fam12h_retired_mmx_and_fp_instructions, }, { .name = "INTERRUPTS_MASKED_CYCLES", .desc = "Interrupts-Masked Cycles", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xcd, }, { .name = "INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING", .desc = "Interrupts-Masked Cycles with Interrupt Pending", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xce, }, { .name = "INTERRUPTS_TAKEN", .desc = "Interrupts Taken", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xcf, }, { .name = "DECODER_EMPTY", .desc = "Decoder Empty", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd0, }, { .name = "DISPATCH_STALLS", .desc = "Dispatch Stalls", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd1, }, { .name = "DISPATCH_STALL_FOR_BRANCH_ABORT", .desc = "Dispatch Stall for Branch Abort to Retire", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd2, }, { .name = "DISPATCH_STALL_FOR_SERIALIZATION", .desc = "Dispatch Stall for Serialization", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd3, }, { .name = "DISPATCH_STALL_FOR_SEGMENT_LOAD", .desc = "Dispatch Stall for Segment Load", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd4, }, { .name = "DISPATCH_STALL_FOR_REORDER_BUFFER_FULL", .desc = "Dispatch Stall for Reorder Buffer Full", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd5, }, { .name = "DISPATCH_STALL_FOR_RESERVATION_STATION_FULL", .desc = "Dispatch Stall for Reservation Station Full", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd6, }, { .name = "DISPATCH_STALL_FOR_FPU_FULL", .desc = "Dispatch Stall for FPU Full", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd7, }, { .name = "DISPATCH_STALL_FOR_LS_FULL", .desc = "Dispatch Stall for LS Full", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd8, }, { .name = "DISPATCH_STALL_WAITING_FOR_ALL_QUIET", .desc = "Dispatch Stall Waiting for All Quiet", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xd9, }, 
{ .name = "DISPATCH_STALL_FOR_FAR_TRANSFER_OR_RSYNC", .desc = "Dispatch Stall for Far Transfer or Resync to Retire", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xda, }, { .name = "FPU_EXCEPTIONS", .desc = "FPU Exceptions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_fpu_exceptions), .ngrp = 1, .umasks = amd64_fam12h_fpu_exceptions, }, { .name = "DR0_BREAKPOINT_MATCHES", .desc = "DR0 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdc, }, { .name = "DR1_BREAKPOINT_MATCHES", .desc = "DR1 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdd, }, { .name = "DR2_BREAKPOINT_MATCHES", .desc = "DR2 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xde, }, { .name = "DR3_BREAKPOINT_MATCHES", .desc = "DR3 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdf, }, { .name = "RETIRED_X87_OPS", .desc = "Retired x87 Floating Point Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1c0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_retired_x87_ops), .ngrp = 1, .umasks = amd64_fam12h_retired_x87_ops, }, { .name = "LFENCE_INST_RETIRED", .desc = "LFENCE Instructions Retired", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1d3, }, { .name = "SFENCE_INST_RETIRED", .desc = "SFENCE Instructions Retired", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1d4, }, { .name = "MFENCE_INST_RETIRED", .desc = "MFENCE Instructions Retired", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1d5, }, { .name = "DRAM_ACCESSES_PAGE", .desc = "DRAM Accesses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_dram_accesses_page), .ngrp = 1, .umasks = amd64_fam12h_dram_accesses_page, }, { .name = "MEMORY_CONTROLLER_0_PAGE", .desc = "DRAM Controller 0 Page Table Events", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_memory_controller_page_table_events), .ngrp = 1, .umasks = amd64_fam12h_memory_controller_page_table_events, }, { .name = "MEMORY_CONTROLLER_SLOT_MISSES", .desc = 
"Memory Controller DRAM Command Slots Missed", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe2, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_memory_controller_slot_misses), .ngrp = 1, .umasks = amd64_fam12h_memory_controller_slot_misses, }, { .name = "MEMORY_CONTROLLER_TURNAROUNDS", .desc = "Memory Controller Turnarounds", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe3, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_memory_controller_turnarounds), .ngrp = 1, .umasks = amd64_fam12h_memory_controller_turnarounds, }, { .name = "MEMORY_CONTROLLER_RBD_QUEUE", .desc = "Memory Controller RBD Queue Events", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe4, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_memory_rbd_queue), .ngrp = 1, .umasks = amd64_fam12h_memory_rbd_queue, }, { .name = "MEMORY_CONTROLLER_1_PAGE", .desc = "DRAM Controller 1 Page Table Events", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe5, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_memory_controller_page_table_events), .ngrp = 1, .umasks = amd64_fam12h_memory_controller_page_table_events, }, { .name = "THERMAL_STATUS", .desc = "Thermal Status", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe8, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_thermal_status), .ngrp = 1, .umasks = amd64_fam12h_thermal_status, }, { .name = "CPU_IO_REQUESTS_TO_MEMORY_IO", .desc = "CPU/IO Requests to Memory/IO", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe9, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_cpu_io_requests_to_memory_io), .ngrp = 1, .umasks = amd64_fam12h_cpu_io_requests_to_memory_io, }, { .name = "CACHE_BLOCK", .desc = "Cache Block Commands", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xea, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_cache_block), .ngrp = 1, .umasks = amd64_fam12h_cache_block, }, { .name = "SIZED_COMMANDS", .desc = "Sized Commands", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xeb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_sized_commands), .ngrp = 1, .umasks = amd64_fam12h_sized_commands, }, { .name = "PROBE", .desc = "Probe Responses and Upstream Requests", 
.modmsk = AMD64_FAM10H_ATTRS, .code = 0xec, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_probe), .ngrp = 1, .umasks = amd64_fam12h_probe, }, { .name = "DEV", .desc = "DEV Events", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xee, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_dev), .ngrp = 1, .umasks = amd64_fam12h_dev, }, { .name = "MEMORY_CONTROLLER_REQUESTS", .desc = "Memory Controller Requests", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1f0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_memory_controller_requests), .ngrp = 1, .umasks = amd64_fam12h_memory_controller_requests, }, { .name = "SIDEBAND_SIGNALS", .desc = "Sideband Signals and Special Cycles", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1e9, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_sideband_signals), .ngrp = 1, .umasks = amd64_fam12h_sideband_signals, }, { .name = "INTERRUPT_EVENTS", .desc = "Interrupt Events", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1ea, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam12h_interrupt_events), .ngrp = 1, .umasks = amd64_fam12h_interrupt_events, }, };
papi-papi-7-2-0-t/src/libpfm4/lib/events/amd64_events_fam14h.h
/* * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software.
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. * * PMU: amd64_fam14h (AMD64 Fam14h) */ static const amd64_umask_t amd64_fam14h_dispatched_fpu[]={ { .uname = "PIPE0", .udesc = "Pipe 0 (fadd, imul, mmx) ops", .ucode = 0x1, }, { .uname = "PIPE1", .udesc = "Pipe 1 (fmul, store, mmx) ops", .ucode = 0x2, }, { .uname = "ANY", .udesc = "Pipe 1 and Pipe 0 ops", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_retired_sse_operations[]={ { .uname = "SINGLE_ADD_SUB_OPS", .udesc = "Single precision add/subtract ops", .ucode = 0x1, }, { .uname = "SINGLE_MUL_OPS", .udesc = "Single precision multiply ops", .ucode = 0x2, }, { .uname = "SINGLE_DIV_OPS", .udesc = "Single precision divide/square root ops", .ucode = 0x4, }, { .uname = "DOUBLE_ADD_SUB_OPS", .udesc = "Double precision add/subtract ops", .ucode = 0x8, }, { .uname = "DOUBLE_MUL_OPS", .udesc = "Double precision multiply ops", .ucode = 0x10, }, { .uname = "DOUBLE_DIV_OPS", .udesc = "Double precision divide/square root ops", .ucode = 0x20, }, { .uname = "OP_TYPE", .udesc = "Op type: 0=uops. 
1=FLOPS", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_retired_move_ops[]={ { .uname = "ALL_OTHER_MERGING_MOVE_UOPS", .udesc = "All other merging move uops", .ucode = 0x4, }, { .uname = "ALL_OTHER_MOVE_UOPS", .udesc = "All other move uops", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xc, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_retired_serializing_ops[]={ { .uname = "SSE_BOTTOM_EXECUTING_UOPS", .udesc = "SSE bottom-executing uops retired", .ucode = 0x1, }, { .uname = "SSE_BOTTOM_SERIALIZING_UOPS", .udesc = "SSE bottom-serializing uops retired", .ucode = 0x2, }, { .uname = "X87_BOTTOM_EXECUTING_UOPS", .udesc = "X87 bottom-executing uops retired", .ucode = 0x4, }, { .uname = "X87_BOTTOM_SERIALIZING_UOPS", .udesc = "X87 bottom-serializing uops retired", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_retired_x87_fpu_ops[]={ { .uname = "ADD_SUB_OPS", .udesc = "Add/subtract ops", .ucode = 0x1, }, { .uname = "MULT_OPS", .udesc = "Multiply ops", .ucode = 0x2, }, { .uname = "DIV_FSQRT_OPS", .udesc = "Divide and fsqrt ops", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_segment_register_loads[]={ { .uname = "ES", .udesc = "ES", .ucode = 0x1, }, { .uname = "CS", .udesc = "CS", .ucode = 0x2, }, { .uname = "SS", .udesc = "SS", .ucode = 0x4, }, { .uname = "DS", .udesc = "DS", .ucode = 0x8, }, { .uname = "FS", .udesc = "FS", .ucode = 0x10, }, { .uname = "GS", .udesc = "GS", .ucode = 0x20, }, { .uname = "HS", .udesc = "HS", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags=
AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_locked_ops[]={ { .uname = "EXECUTED", .udesc = "Number of locked instructions executed", .ucode = 0x1, }, { .uname = "BUS_LOCK", .udesc = "Number of cycles to acquire bus lock", .ucode = 0x2, }, { .uname = "UNLOCK_LINE", .udesc = "Number of cycles to unlock line (not including cache miss)", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_cancelled_store_to_load_forward_operations[]={ { .uname = "ADDRESS_MISMATCHES", .udesc = "Address mismatches (starting byte not the same).", .ucode = 0x1, }, { .uname = "STORE_IS_SMALLER_THAN_LOAD", .udesc = "Store is smaller than load.", .ucode = 0x2, }, { .uname = "MISALIGNED", .udesc = "Misaligned.", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_data_cache_refills[]={ { .uname = "UNCACHEABLE", .udesc = "From non-cacheable data", .ucode = 0x1, }, { .uname = "SHARED", .udesc = "From shared lines", .ucode = 0x2, }, { .uname = "EXCLUSIVE", .udesc = "From exclusive lines", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "From owned lines", .ucode = 0x8, }, { .uname = "MODIFIED", .udesc = "From modified lines", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_data_cache_refills_from_nb[]={ { .uname = "UNCACHEABLE", .udesc = "Uncacheable data", .ucode = 0x1, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x2, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x8, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | 
AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_data_cache_lines_evicted[]={ { .uname = "PROBE", .udesc = "Eviction from probe", .ucode = 0x1, }, { .uname = "SHARED", .udesc = "Shared eviction", .ucode = 0x2, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive eviction", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned eviction", .ucode = 0x8, }, { .uname = "MODIFIED", .udesc = "Modified eviction", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_dtlb_miss[]={ { .uname = "STORES_L1TLB_MISS", .udesc = "Stores that miss L1TLB", .ucode = 0x1, }, { .uname = "LOADS_L1TLB_MISS", .udesc = "Loads that miss L1TLB", .ucode = 0x2, }, { .uname = "STORES_L2TLB_MISS", .udesc = "Stores that miss L2TLB", .ucode = 0x4, }, { .uname = "LOADS_L2TLB_MISS", .udesc = "Loads that miss L2TLB", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_prefetch_instructions_dispatched[]={ { .uname = "LOAD", .udesc = "Load (Prefetch, PrefetchT0/T1/T2)", .ucode = 0x1, }, { .uname = "STORE", .udesc = "Store (PrefetchW)", .ucode = 0x2, }, { .uname = "NTA", .udesc = "NTA (PrefetchNTA)", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_l1_dtlb_hit[]={ { .uname = "L1_4K_TLB_HIT", .udesc = "L1 4K TLB hit", .ucode = 0x1, }, { .uname = "L1_2M_TLB_HIT", .udesc = "L1 2M TLB hit", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_dcache_sw_prefetches[]={ { .uname = "HIT", .udesc = "SW prefetch hit in the data cache", .ucode = 0x1, }, { .uname = "PENDING_FILL", .udesc = "SW prefetch hit a pending fill", .ucode = 0x2, }, 
{ .uname = "NO_MAB", .udesc = "SW prefetch does not get a MAB", .ucode = 0x4, }, { .uname = "L2_HIT", .udesc = "SW prefetch hits L2", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_memory_requests[]={ { .uname = "NON_CACHEABLE", .udesc = "Requests to non-cacheable (UC) memory", .ucode = 0x1, }, { .uname = "WRITE_COMBINING", .udesc = "Requests to write-combining (WC) memory or WC buffer flushes to WB memory", .ucode = 0x2, }, { .uname = "STREAMING_STORE", .udesc = "Streaming store (SS) requests", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x83, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_mab_requests[]={ { .uname = "DC_BUFFER_0", .udesc = "Data cache buffer 0", .ucode = 0x0, .uflags= AMD64_FL_NCOMBO, }, { .uname = "DC_BUFFER_1", .udesc = "Data cache buffer 1", .ucode = 0x1, .uflags= AMD64_FL_NCOMBO, }, { .uname = "DC_BUFFER_2", .udesc = "Data cache buffer 2", .ucode = 0x2, .uflags= AMD64_FL_NCOMBO, }, { .uname = "DC_BUFFER_3", .udesc = "Data cache buffer 3", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO, }, { .uname = "DC_BUFFER_4", .udesc = "Data cache buffer 4", .ucode = 0x4, .uflags= AMD64_FL_NCOMBO, }, { .uname = "DC_BUFFER_5", .udesc = "Data cache buffer 5", .ucode = 0x5, .uflags= AMD64_FL_NCOMBO, }, { .uname = "DC_BUFFER_6", .udesc = "Data cache buffer 6", .ucode = 0x6, .uflags= AMD64_FL_NCOMBO, }, { .uname = "DC_BUFFER_7", .udesc = "Data cache buffer 7", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO, }, { .uname = "IC_BUFFER_0", .udesc = "Instruction cache buffer 0", .ucode = 0x8, .uflags= AMD64_FL_NCOMBO, }, { .uname = "IC_BUFFER_1", .udesc = "Instruction cache buffer 1", .ucode = 0x9, .uflags= AMD64_FL_NCOMBO, }, { .uname = "ANY_IC_BUFFER", .udesc = "Any instruction cache buffer", .ucode = 0xa, .uflags= AMD64_FL_NCOMBO, }, { .uname = "ANY_DC_BUFFER", .udesc = "Any
data cache buffer", .ucode = 0xb, .uflags= AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam14h_system_read_responses[]={ { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x1, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x2, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x8, }, { .uname = "DATA_ERROR", .udesc = "Data Error", .ucode = 0x10, }, { .uname = "DIRTY_SUCCESS", .udesc = "Change-to-dirty success", .ucode = 0x20, }, { .uname = "UNCACHEABLE", .udesc = "Uncacheable", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_requests_to_l2[]={ { .uname = "INSTRUCTIONS", .udesc = "IC fill", .ucode = 0x1, }, { .uname = "DATA", .udesc = "DC fill", .ucode = 0x2, }, { .uname = "SNOOP", .udesc = "Tag snoop request", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xb, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_l2_cache_miss[]={ { .uname = "INSTRUCTIONS", .udesc = "IC fill", .ucode = 0x1, }, { .uname = "DATA", .udesc = "DC fill (includes possible replays, whereas EventSelect 041h does not)", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_l2_fill_writeback[]={ { .uname = "L2_FILLS", .udesc = "L2 fills (victims from L1 caches, TLB page table walks and data prefetches)", .ucode = 0x1, }, { .uname = "L2_WRITEBACKS", .udesc = "L2 Writebacks to system.", .ucode = 0x2, }, { .uname = "IC_ATTR_WRITES_L2_ACCESS", .udesc = "Ic attribute writes which access the L2", .ucode = 0x4, }, { .uname = "IC_ATTR_WRITES_L2_WRITES", .udesc = "Ic attribute writes which store into the L2", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= 
AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_l1_itlb_miss_and_l2_itlb_miss[]={ { .uname = "4K_PAGE_FETCHES", .udesc = "Instruction fetches to a 4K page.", .ucode = 0x1, }, { .uname = "2M_PAGE_FETCHES", .udesc = "Instruction fetches to a 2M page.", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_instruction_cache_lines_invalidated[]={ { .uname = "INVALIDATING_LS_PROBE", .udesc = "IC invalidate due to an LS probe", .ucode = 0x1, }, { .uname = "INVALIDATING_BU_PROBE", .udesc = "IC invalidate due to a BU probe", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_retired_floating_point_instructions[]={ { .uname = "X87", .udesc = "X87 or MMX instructions", .ucode = 0x1, }, { .uname = "SSE", .udesc = "SSE (SSE, SSE2, SSE3, MNI) instructions", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_fpu_exceptions[]={ { .uname = "X87_RECLASS_MICROFAULTS", .udesc = "X87 reclass microfaults", .ucode = 0x1, }, { .uname = "SSE_RETYPE_MICROFAULTS", .udesc = "SSE retype microfaults", .ucode = 0x2, }, { .uname = "SSE_RECLASS_MICROFAULTS", .udesc = "SSE reclass microfaults", .ucode = 0x4, }, { .uname = "SSE_AND_X87_MICROTRAPS", .udesc = "SSE and x87 microtraps", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_dram_accesses_page[]={ { .uname = "HIT", .udesc = "DCT0 Page hit", .ucode = 0x1, }, { .uname = "MISS", .udesc = "DCT0 Page Miss", .ucode = 0x2, }, { .uname = "CONFLICT", .udesc = "DCT0 Page Conflict", .ucode = 0x4, }, { .uname = "WRITE_REQUEST", .udesc = "Write 
request", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x47, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_memory_controller_page_table[]={ { .uname = "DCT0_PAGE_TABLE_OVERFLOW", .udesc = "DCT0 Page Table Overflow", .ucode = 0x1, }, { .uname = "DCT0_PAGE_TABLE_STALE_HIT", .udesc = "DCT0 number of stale table entry hits (hit on a page closed too soon)", .ucode = 0x2, }, { .uname = "DCT0_PAGE_TABLE_IDLE_INC", .udesc = "DCT0 page table idle cycle limit incremented", .ucode = 0x4, }, { .uname = "DCT0_PAGE_TABLE_IDLE_DEC", .udesc = "DCT0 page table idle cycle limit decremented", .ucode = 0x8, }, { .uname = "DCT0_PAGE_TABLE_CLOSED", .udesc = "DCT0 page table is closed due to row inactivity", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_memory_controller_slot_misses[]={ { .uname = "DCT0_RBD", .udesc = "DCT0 RBD", .ucode = 0x10, }, { .uname = "DCT0_PREFETCH", .udesc = "DCT0 prefetch", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x50, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_memory_controller_rbd_queue_events[]={ { .uname = "DCQ_BYPASS_MAX", .udesc = "DCQ_BYPASS_MAX counter reached", .ucode = 0x4, }, { .uname = "BANK_CLOSED", .udesc = "Bank is closed due to bank conflict with an outstanding request in the RBD queue", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xc, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_thermal_status[]={ { .uname = "MEMHOT_L", .udesc = "MEMHOT_L assertions", .ucode = 0x1, }, { .uname = "HTC_TRANSITION", .udesc = "Number of times HTC transitions from inactive to active", .ucode = 0x4, }, { .uname = "CLOCKS_HTC_P_STATE_INACTIVE", .udesc = "Number of clocks HTC P-state is inactive.", .ucode = 
0x20, }, { .uname = "CLOCKS_HTC_P_STATE_ACTIVE", .udesc = "Number of clocks HTC P-state is active", .ucode = 0x40, }, { .uname = "PROCHOT_L", .udesc = "PROCHOT_L asserted by an external source and the assertion causes a P-state change", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xc5, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_cpu_io_requests_to_memory_io[]={ { .uname = "I_O_TO_I_O", .udesc = "IO to IO", .ucode = 0x1, }, { .uname = "I_O_TO_MEM", .udesc = "IO to Mem", .ucode = 0x2, }, { .uname = "CPU_TO_I_O", .udesc = "CPU to IO", .ucode = 0x4, }, { .uname = "CPU_TO_MEM", .udesc = "CPU to Mem", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_cache_block[]={ { .uname = "VICTIM_WRITEBACK", .udesc = "Victim Block (Writeback)", .ucode = 0x1, }, { .uname = "DCACHE_LOAD_MISS", .udesc = "Read Block (Dcache load miss refill)", .ucode = 0x4, }, { .uname = "SHARED_ICACHE_REFILL", .udesc = "Read Block Shared (Icache refill)", .ucode = 0x8, }, { .uname = "READ_BLOCK_MODIFIED", .udesc = "Read Block Modified (Dcache store miss refill)", .ucode = 0x10, }, { .uname = "CHANGE_TO_DIRTY", .udesc = "Change-to-Dirty (first store to clean block already in cache)", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3d, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_sized_commands[]={ { .uname = "NON_POSTED_WRITE_BYTE", .udesc = "Non-Posted SzWr Byte (1-32 bytes) Legacy or mapped IO, typically 1-4 bytes", .ucode = 0x1, }, { .uname = "NON_POSTED_WRITE_DWORD", .udesc = "Non-Posted SzWr DW (1-16 dwords) Legacy or mapped IO, typically 1 DWORD", .ucode = 0x2, }, { .uname = "POSTED_WRITE_BYTE", .udesc = "Posted SzWr Byte (1-32 bytes) Subcache-line DMA writes, size varies; also flushes of partially-filled Write Combining 
buffer", .ucode = 0x4, }, { .uname = "POSTED_WRITE_DWORD", .udesc = "Posted SzWr DW (1-16 dwords) Block-oriented DMA writes, often cache-line sized; also processor Write Combining buffer flushes", .ucode = 0x8, }, { .uname = "READ_BYTE_4_BYTES", .udesc = "SzRd Byte (4 bytes) Legacy or mapped IO", .ucode = 0x10, }, { .uname = "READ_DWORD_1_16_DWORDS", .udesc = "SzRd DW (1-16 dwords) Block-oriented DMA reads, typically cache-line size", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_probe[]={ { .uname = "MISS", .udesc = "Probe miss", .ucode = 0x1, }, { .uname = "HIT_CLEAN", .udesc = "Probe hit clean", .ucode = 0x2, }, { .uname = "HIT_DIRTY_NO_MEMORY_CANCEL", .udesc = "Probe hit dirty without memory cancel (probed by Sized Write or Change2Dirty)", .ucode = 0x4, }, { .uname = "HIT_DIRTY_WITH_MEMORY_CANCEL", .udesc = "Probe hit dirty with memory cancel (probed by DMA read or cache refill request)", .ucode = 0x8, }, { .uname = "UPSTREAM_HIGH_PRIO_READS", .udesc = "Upstream high priority reads", .ucode = 0x10, }, { .uname = "UPSTREAM_LOW_PRIO_READS", .udesc = "Upstream low priority reads", .ucode = 0x20, }, { .uname = "UPSTREAM_LOW_PRIO_WRITES", .udesc = "Upstream non-ISOC writes", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xbf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_dev_events[]={ { .uname = "HIT", .udesc = "DEV hit", .ucode = 0x10, }, { .uname = "MISS", .udesc = "DEV miss", .ucode = 0x20, }, { .uname = "ERROR", .udesc = "DEV error", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x70, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_memory_controller_requests[]={ { .uname = "32_BYTES_WRITES", .udesc = "32 Bytes Sized Writes", .ucode = 0x8, }, { .uname = "64_BYTES_WRITES", .udesc = "64 
Bytes Sized Writes", .ucode = 0x10, }, { .uname = "32_BYTES_READS", .udesc = "32 Bytes Sized Reads", .ucode = 0x20, }, { .uname = "64_BYTES_READS", .udesc = "64 Byte Sized Reads", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x78, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_sideband_signals_special_signals[]={ { .uname = "STOPGRANT", .udesc = "Stopgrant", .ucode = 0x2, }, { .uname = "SHUTDOWN", .udesc = "Shutdown", .ucode = 0x4, }, { .uname = "WBINVD", .udesc = "Wbinvd", .ucode = 0x8, }, { .uname = "INVD", .udesc = "Invd", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1c, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_interrupt_events[]={ { .uname = "FIXED_AND_LPA", .udesc = "Fixed and LPA", .ucode = 0x1, }, { .uname = "LPA", .udesc = "LPA", .ucode = 0x2, }, { .uname = "SMI", .udesc = "SMI", .ucode = 0x4, }, { .uname = "NMI", .udesc = "NMI", .ucode = 0x8, }, { .uname = "INIT", .udesc = "INIT", .ucode = 0x10, }, { .uname = "STARTUP", .udesc = "STARTUP", .ucode = 0x20, }, { .uname = "INT", .udesc = "INT", .ucode = 0x40, }, { .uname = "EOI", .udesc = "EOI", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam14h_pdc_miss[]={ { .uname = "HOST_PDE_LEVEL", .udesc = "Host PDE level", .ucode = 0x1, }, { .uname = "HOST_PDPE_LEVEL", .udesc = "Host PDPE level", .ucode = 0x2, }, { .uname = "HOST_PML4E_LEVEL", .udesc = "Host PML4E level", .ucode = 0x4, }, { .uname = "GUEST_PDE_LEVEL", .udesc = "Guest PDE level", .ucode = 0x10, }, { .uname = "GUEST_PDPE_LEVEL", .udesc = "Guest PDPE level", .ucode = 0x20, }, { .uname = "GUEST_PML4E_LEVEL", .udesc = "Guest PML4E level", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x67, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static 
const amd64_entry_t amd64_fam14h_pe[]={ { .name = "DISPATCHED_FPU", .desc = "Number of uops dispatched to FPU execution pipelines", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_dispatched_fpu), .ngrp = 1, .umasks = amd64_fam14h_dispatched_fpu, }, { .name = "CYCLES_NO_FPU_OPS_RETIRED", .desc = "Cycles in which the FPU is Empty", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1, }, { .name = "DISPATCHED_FPU_OPS_FAST_FLAG", .desc = "Dispatched Fast Flag FPU Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x2, }, { .name = "RETIRED_SSE_OPERATIONS", .desc = "Retired SSE Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x3, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_retired_sse_operations), .ngrp = 1, .umasks = amd64_fam14h_retired_sse_operations, }, { .name = "RETIRED_MOVE_OPS", .desc = "Retired Move Ops", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_retired_move_ops), .ngrp = 1, .umasks = amd64_fam14h_retired_move_ops, }, { .name = "RETIRED_SERIALIZING_OPS", .desc = "Retired Serializing Ops", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x5, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_retired_serializing_ops), .ngrp = 1, .umasks = amd64_fam14h_retired_serializing_ops, }, { .name = "RETIRED_X87_FPU_OPS", .desc = "Number of x87 floating points ops that have retired", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x11, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_retired_x87_fpu_ops), .ngrp = 1, .umasks = amd64_fam14h_retired_x87_fpu_ops, }, { .name = "SEGMENT_REGISTER_LOADS", .desc = "Segment Register Loads", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x20, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_segment_register_loads), .ngrp = 1, .umasks = amd64_fam14h_segment_register_loads, }, { .name = "PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE", .desc = "Pipeline Restart Due to Self-Modifying Code", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x21, }, { .name = "PIPELINE_RESTART_DUE_TO_PROBE_HIT", .desc = "Pipeline Restart Due to 
Probe Hit", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x22, }, { .name = "RSQ_FULL", .desc = "Number of cycles that the RSQ holds retired stores. This buffer holds the stores waiting to retire as well as requests that missed the data cache and are waiting on a refill", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x23, }, { .name = "LOCKED_OPS", .desc = "Locked Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x24, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_locked_ops), .ngrp = 1, .umasks = amd64_fam14h_locked_ops, }, { .name = "RETIRED_CLFLUSH_INSTRUCTIONS", .desc = "Retired CLFLUSH Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x26, }, { .name = "RETIRED_CPUID_INSTRUCTIONS", .desc = "Retired CPUID Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x27, }, { .name = "CANCELLED_STORE_TO_LOAD_FORWARD_OPERATIONS", .desc = "Cancelled Store to Load Forward Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x2a, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_cancelled_store_to_load_forward_operations), .ngrp = 1, .umasks = amd64_fam14h_cancelled_store_to_load_forward_operations, }, { .name = "DATA_CACHE_ACCESSES", .desc = "Data Cache Accesses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x40, }, { .name = "DATA_CACHE_MISSES", .desc = "Data Cache Misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x41, }, { .name = "DATA_CACHE_REFILLS", .desc = "Data Cache Refills from L2 or Northbridge", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x42, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_data_cache_refills), .ngrp = 1, .umasks = amd64_fam14h_data_cache_refills, }, { .name = "DATA_CACHE_REFILLS_FROM_NB", .desc = "Data Cache Refills from the Northbridge", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x43, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_data_cache_refills_from_nb), .ngrp = 1, .umasks = amd64_fam14h_data_cache_refills_from_nb, }, { .name = "DATA_CACHE_LINES_EVICTED", .desc = "Data Cache Lines Evicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x44, .numasks =
LIBPFM_ARRAY_SIZE(amd64_fam14h_data_cache_lines_evicted), .ngrp = 1, .umasks = amd64_fam14h_data_cache_lines_evicted, }, { .name = "L1_DTLB_MISS_AND_L2_DTLB_HIT", .desc = "Number of data cache accesses that miss in the L1 DTLB and hit the L2 DTLB. This is a speculative event", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x45, }, { .name = "DTLB_MISS", .desc = "L1 DTLB and L2 DTLB Miss", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x46, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_dtlb_miss), .ngrp = 1, .umasks = amd64_fam14h_dtlb_miss, }, { .name = "MISALIGNED_ACCESSES", .desc = "Misaligned Accesses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x47, }, { .name = "PREFETCH_INSTRUCTIONS_DISPATCHED", .desc = "Prefetch Instructions Dispatched", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4b, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_prefetch_instructions_dispatched), .ngrp = 1, .umasks = amd64_fam14h_prefetch_instructions_dispatched, }, { .name = "DCACHE_MISSES_BY_LOCKED_INSTRUCTIONS", .desc = "DCACHE Misses by Locked Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4c, }, { .name = "L1_DTLB_HIT", .desc = "L1 DTLB Hit", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4d, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_l1_dtlb_hit), .ngrp = 1, .umasks = amd64_fam14h_l1_dtlb_hit, }, { .name = "DCACHE_SW_PREFETCHES", .desc = "Number of software prefetches that do not cause an actual data cache refill", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x52, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_dcache_sw_prefetches), .ngrp = 1, .umasks = amd64_fam14h_dcache_sw_prefetches, }, { .name = "GLOBAL_TLB_FLUSHES", .desc = "Global TLB Flushes", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x54, }, { .name = "MEMORY_REQUESTS", .desc = "Memory Requests by Type", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x65, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_memory_requests), .ngrp = 1, .umasks = amd64_fam14h_memory_requests, }, { .name = "MAB_REQUESTS", .desc = "Number of L1 I-cache and D-cache misses per buffer. 
Average latency can be computed by combining with MAB_WAIT_CYCLES.", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x68, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_mab_requests), .ngrp = 1, .umasks = amd64_fam14h_mab_requests, }, { .name = "MAB_WAIT_CYCLES", .desc = "Latency of L1 I-cache and D-cache misses per buffer. Average latency can be computed by combining with MAB_REQUESTS.", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x69, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_mab_requests), .ngrp = 1, .umasks = amd64_fam14h_mab_requests, /* identical to actual umasks list for this event */ }, { .name = "SYSTEM_READ_RESPONSES", .desc = "Northbridge Read Responses by Coherency State", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x6c, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_system_read_responses), .ngrp = 1, .umasks = amd64_fam14h_system_read_responses, }, { .name = "CPU_CLK_UNHALTED", .desc = "CPU Clocks not Halted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x76, }, { .name = "REQUESTS_TO_L2", .desc = "Requests to L2 Cache", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x7d, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_requests_to_l2), .ngrp = 1, .umasks = amd64_fam14h_requests_to_l2, }, { .name = "L2_CACHE_MISS", .desc = "L2 Cache Misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x7e, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_l2_cache_miss), .ngrp = 1, .umasks = amd64_fam14h_l2_cache_miss, }, { .name = "L2_FILL_WRITEBACK", .desc = "L2 Fill/Writeback", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x7f, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_l2_fill_writeback), .ngrp = 1, .umasks = amd64_fam14h_l2_fill_writeback, }, { .name = "INSTRUCTION_CACHE_FETCHES", .desc = "Instruction Cache Fetches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x80, }, { .name = "INSTRUCTION_CACHE_MISSES", .desc = "Instruction Cache Misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x81, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_L2", .desc = "Instruction Cache Refills from L2", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x82, }, { .name =
"INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM", .desc = "Instruction Cache Refills from System", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x83, }, { .name = "L1_ITLB_MISS_AND_L2_ITLB_MISS", .desc = "L1 ITLB Miss and L2 ITLB Miss", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x85, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_l1_itlb_miss_and_l2_itlb_miss), .ngrp = 1, .umasks = amd64_fam14h_l1_itlb_miss_and_l2_itlb_miss, }, { .name = "INSTRUCTION_FETCH_STALL", .desc = "Instruction Fetch Stall", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x87, }, { .name = "RETURN_STACK_HITS", .desc = "Return Stack Hits", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x88, }, { .name = "RETURN_STACK_OVERFLOWS", .desc = "Return Stack Overflows", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x89, }, { .name = "INSTRUCTION_CACHE_VICTIMS", .desc = "Instruction Cache Victims", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x8b, }, { .name = "INSTRUCTION_CACHE_LINES_INVALIDATED", .desc = "Instruction Cache Lines Invalidated", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x8c, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_instruction_cache_lines_invalidated), .ngrp = 1, .umasks = amd64_fam14h_instruction_cache_lines_invalidated, }, { .name = "ITLB_RELOADS", .desc = "ITLB Reloads", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x99, }, { .name = "ITLB_RELOADS_ABORTED", .desc = "ITLB Reloads Aborted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x9a, }, { .name = "RETIRED_INSTRUCTIONS", .desc = "Retired Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc0, }, { .name = "RETIRED_UOPS", .desc = "Retired uops", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc1, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS", .desc = "Retired Branch Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc2, }, { .name = "RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS", .desc = "Retired Mispredicted Branch Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc3, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .desc = "Retired Taken Branch Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 
0xc4, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Retired Taken Branch Instructions Mispredicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc5, }, { .name = "RETIRED_FAR_CONTROL_TRANSFERS", .desc = "Retired Far Control Transfers", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc6, }, { .name = "RETIRED_BRANCH_RESYNCS", .desc = "Retired Branch Resyncs", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc7, }, { .name = "RETIRED_NEAR_RETURNS", .desc = "Retired Near Returns", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc8, }, { .name = "RETIRED_NEAR_RETURNS_MISPREDICTED", .desc = "Retired Near Returns Mispredicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc9, }, { .name = "RETIRED_INDIRECT_BRANCHES_MISPREDICTED", .desc = "Retired Indirect Branches Mispredicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xca, }, { .name = "RETIRED_FLOATING_POINT_INSTRUCTIONS", .desc = "Retired SSE/MMX/FP Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xcb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_retired_floating_point_instructions), .ngrp = 1, .umasks = amd64_fam14h_retired_floating_point_instructions, }, { .name = "INTERRUPTS_MASKED_CYCLES", .desc = "Interrupts-Masked Cycles", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xcd, }, { .name = "INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING", .desc = "Interrupts-Masked Cycles with Interrupt Pending", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xce, }, { .name = "INTERRUPTS_TAKEN", .desc = "Interrupts Taken", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xcf, }, { .name = "FPU_EXCEPTIONS", .desc = "FPU Exceptions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_fpu_exceptions), .ngrp = 1, .umasks = amd64_fam14h_fpu_exceptions, }, { .name = "DR0_BREAKPOINT_MATCHES", .desc = "DR0 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdc, }, { .name = "DR1_BREAKPOINT_MATCHES", .desc = "DR1 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdd, }, { .name = "DR2_BREAKPOINT_MATCHES", .desc = "DR2 
Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xde, }, { .name = "DR3_BREAKPOINT_MATCHES", .desc = "DR3 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdf, }, { .name = "DRAM_ACCESSES_PAGE", .desc = "DRAM Accesses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_dram_accesses_page), .ngrp = 1, .umasks = amd64_fam14h_dram_accesses_page, }, { .name = "MEMORY_CONTROLLER_PAGE_TABLE", .desc = "Number of page table events in the local DRAM controller", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_memory_controller_page_table), .ngrp = 1, .umasks = amd64_fam14h_memory_controller_page_table, }, { .name = "MEMORY_CONTROLLER_SLOT_MISSES", .desc = "Memory Controller DRAM Command Slots Missed", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe2, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_memory_controller_slot_misses), .ngrp = 1, .umasks = amd64_fam14h_memory_controller_slot_misses, }, { .name = "MEMORY_CONTROLLER_RBD_QUEUE_EVENTS", .desc = "Memory Controller Bypass Counter Saturation", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe4, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_memory_controller_rbd_queue_events), .ngrp = 1, .umasks = amd64_fam14h_memory_controller_rbd_queue_events, }, { .name = "THERMAL_STATUS", .desc = "Thermal Status", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe8, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_thermal_status), .ngrp = 1, .umasks = amd64_fam14h_thermal_status, }, { .name = "CPU_IO_REQUESTS_TO_MEMORY_IO", .desc = "CPU/IO Requests to Memory/IO", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xe9, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_cpu_io_requests_to_memory_io), .ngrp = 1, .umasks = amd64_fam14h_cpu_io_requests_to_memory_io, }, { .name = "CACHE_BLOCK", .desc = "Cache Block Commands", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xea, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_cache_block), .ngrp = 1, .umasks = amd64_fam14h_cache_block, }, { .name = "SIZED_COMMANDS", .desc 
= "Sized Commands", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xeb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_sized_commands), .ngrp = 1, .umasks = amd64_fam14h_sized_commands, }, { .name = "PROBE", .desc = "Probe Responses and Upstream Requests", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xec, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_probe), .ngrp = 1, .umasks = amd64_fam14h_probe, }, { .name = "DEV_EVENTS", .desc = "DEV Events", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xee, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_dev_events), .ngrp = 1, .umasks = amd64_fam14h_dev_events, }, { .name = "MEMORY_CONTROLLER_REQUESTS", .desc = "Memory Controller Requests", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1f0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_memory_controller_requests), .ngrp = 1, .umasks = amd64_fam14h_memory_controller_requests, }, { .name = "SIDEBAND_SIGNALS_SPECIAL_SIGNALS", .desc = "Sideband signals and special cycles", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1e9, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_sideband_signals_special_signals), .ngrp = 1, .umasks = amd64_fam14h_sideband_signals_special_signals, }, { .name = "INTERRUPT_EVENTS", .desc = "Interrupt events", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1ea, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_interrupt_events), .ngrp = 1, .umasks = amd64_fam14h_interrupt_events, }, { .name = "PDC_MISS", .desc = "PDC miss", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x162, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam14h_pdc_miss), .ngrp = 1, .umasks = amd64_fam14h_pdc_miss, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/amd64_events_fam15h.h000066400000000000000000001112431502707512200236510ustar00rootroot00000000000000/* * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, 
modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. * * PMU: amd64_fam15h (AMD64 Fam15h Interlagos) * * Based on libpfm patch by Robert Richter : * Family 15h Microarchitecture performance monitor events * * History: * * Apr 29 2011 -- Robert Richter, robert.richter@amd.com: * Source: BKDG for AMD Family 15h Models 00h-0Fh Processors, * 42301, Rev 1.15, April 18, 2011 * * Dec 09 2010 -- Robert Richter, robert.richter@amd.com: * Source: BIOS and Kernel Developer's Guide for the AMD Family 15h * Processors, Rev 0.90, May 18, 2010 */ #define CORE_SELECT(b) \ { .uname = "CORE_0",\ .udesc = "Measure on Core0",\ .ucode = 0 << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO,\ },\ { .uname = "CORE_1",\ .udesc = "Measure on Core1",\ .ucode = 1 << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO,\ },\ { .uname = "CORE_2",\ .udesc = "Measure on Core2",\ .ucode = 2 << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO,\ },\ { .uname = "CORE_3",\ .udesc = "Measure on Core3",\ .ucode = 3 << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO,\ },\ { .uname = "CORE_4",\ .udesc = "Measure on Core4",\ .ucode = 4 << 4,\ .grpid = b,\ 
.uflags= AMD64_FL_NCOMBO,\ },\ { .uname = "CORE_5",\ .udesc = "Measure on Core5",\ .ucode = 5 << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO,\ },\ { .uname = "CORE_6",\ .udesc = "Measure on Core6",\ .ucode = 6 << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO,\ },\ { .uname = "CORE_7",\ .udesc = "Measure on Core7",\ .ucode = 7 << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO,\ },\ { .uname = "ANY_CORE",\ .udesc = "Measure on any core",\ .ucode = 0xf << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL,\ } static const amd64_umask_t amd64_fam15h_dispatched_fpu_ops[]={ { .uname = "OPS_PIPE0", .udesc = "Total number uops assigned to Pipe 0", .ucode = 0x1, }, { .uname = "OPS_PIPE1", .udesc = "Total number uops assigned to Pipe 1", .ucode = 0x2, }, { .uname = "OPS_PIPE2", .udesc = "Total number uops assigned to Pipe 2", .ucode = 0x4, }, { .uname = "OPS_PIPE3", .udesc = "Total number uops assigned to Pipe 3", .ucode = 0x8, }, { .uname = "OPS_DUAL_PIPE0", .udesc = "Total number dual-pipe uops assigned to Pipe 0", .ucode = 0x10, }, { .uname = "OPS_DUAL_PIPE1", .udesc = "Total number dual-pipe uops assigned to Pipe 1", .ucode = 0x20, }, { .uname = "OPS_DUAL_PIPE2", .udesc = "Total number dual-pipe uops assigned to Pipe 2", .ucode = 0x40, }, { .uname = "OPS_DUAL_PIPE3", .udesc = "Total number dual-pipe uops assigned to Pipe 3", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_retired_sse_ops[]={ { .uname = "SINGLE_ADD_SUB_OPS", .udesc = "Single-precision add/subtract FLOPS", .ucode = 0x1, }, { .uname = "SINGLE_MUL_OPS", .udesc = "Single-precision multiply FLOPS", .ucode = 0x2, }, { .uname = "SINGLE_DIV_OPS", .udesc = "Single-precision divide/square root FLOPS", .ucode = 0x4, }, { .uname = "SINGLE_MUL_ADD_OPS", .udesc = "Single precision multiply-add FLOPS. 
Multiply-add counts as 2 FLOPS", .ucode = 0x8, }, { .uname = "DOUBLE_ADD_SUB_OPS", .udesc = "Double precision add/subtract FLOPS", .ucode = 0x10, }, { .uname = "DOUBLE_MUL_OPS", .udesc = "Double precision multiply FLOPS", .ucode = 0x20, }, { .uname = "DOUBLE_DIV_OPS", .udesc = "Double precision divide/square root FLOPS", .ucode = 0x40, }, { .uname = "DOUBLE_MUL_ADD_OPS", .udesc = "Double precision multiply-add FLOPS. Multiply-add counts as 2 FLOPS", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_move_scalar_optimization[]={ { .uname = "SSE_MOVE_OPS", .udesc = "Number of SSE Move Ops", .ucode = 0x1, }, { .uname = "SSE_MOVE_OPS_ELIM", .udesc = "Number of SSE Move Ops eliminated", .ucode = 0x2, }, { .uname = "OPT_CAND", .udesc = "Number of Ops that are candidates for optimization (Z-bit set or pass)", .ucode = 0x4, }, { .uname = "SCALAR_OPS_OPTIMIZED", .udesc = "Number of Scalar ops optimized", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_retired_serializing_ops[]={ { .uname = "SSE_RETIRED", .udesc = "SSE bottom-executing uops retired", .ucode = 0x1, }, { .uname = "SSE_MISPREDICTED", .udesc = "SSE control word mispredict traps due to mispredictions", .ucode = 0x2, }, { .uname = "X87_RETIRED", .udesc = "X87 bottom-executing uops retired", .ucode = 0x4, }, { .uname = "X87_MISPREDICTED", .udesc = "X87 control word mispredict traps due to mispredictions", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_segment_register_loads[]={ { .uname = "ES", .udesc = "ES", .ucode = 0x1, }, { .uname = "CS", .udesc = "CS", .ucode = 0x2, }, { .uname = "SS", .udesc = "SS", .ucode = 0x4, }, { .uname = "DS", .udesc = 
"DS", .ucode = 0x8, }, { .uname = "FS", .udesc = "FS", .ucode = 0x10, }, { .uname = "GS", .udesc = "GS", .ucode = 0x20, }, { .uname = "HS", .udesc = "HS", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_load_q_store_q_full[]={ { .uname = "LOAD_QUEUE", .udesc = "The number of cycles that the load buffer is full", .ucode = 0x1, }, { .uname = "STORE_QUEUE", .udesc = "The number of cycles that the store buffer is full", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_locked_ops[]={ { .uname = "EXECUTED", .udesc = "Number of locked instructions executed", .ucode = 0x1, }, { .uname = "CYCLES_NON_SPECULATIVE_PHASE", .udesc = "Number of cycles spent in non-speculative phase, excluding cache miss penalty", .ucode = 0x4, }, { .uname = "CYCLES_WAITING", .udesc = "Number of cycles spent in non-speculative phase, including the cache miss penalty", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xd, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_cancelled_store_to_load[]={ { .uname = "SIZE_ADDRESS_MISMATCHES", .udesc = "Store is smaller than load or different starting byte but partial overlap", .ucode = 0x1, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_data_cache_misses[]={ { .uname = "DC_MISS_STREAMING_STORE", .udesc = "First data cache miss or streaming store to a 64B cache line", .ucode = 0x1, }, { .uname = "STREAMING_STORE", .udesc = "First streaming store to a 64B cache line", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t 
amd64_fam15h_data_cache_refills_from_l2_or_northbridge[]={ { .uname = "GOOD", .udesc = "Fill with good data. (Final valid status is valid)", .ucode = 0x1, }, { .uname = "INVALID", .udesc = "Early valid status turned out to be invalid", .ucode = 0x2, }, { .uname = "POISON", .udesc = "Fill with poison data", .ucode = 0x4, }, { .uname = "READ_ERROR", .udesc = "Fill with read data error", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_unified_tlb_hit[]={ { .uname = "4K_DATA", .udesc = "4 KB unified TLB hit for data", .ucode = 0x1, }, { .uname = "2M_DATA", .udesc = "2 MB unified TLB hit for data", .ucode = 0x2, }, { .uname = "1G_DATA", .udesc = "1 GB unified TLB hit for data", .ucode = 0x4, }, { .uname = "4K_INST", .udesc = "4 KB unified TLB hit for instruction", .ucode = 0x10, }, { .uname = "2M_INST", .udesc = "2 MB unified TLB hit for instruction", .ucode = 0x20, }, { .uname = "1G_INST", .udesc = "1 GB unified TLB hit for instruction", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x77, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_unified_tlb_miss[]={ { .uname = "4K_DATA", .udesc = "4 KB unified TLB miss for data", .ucode = 0x1, }, { .uname = "2M_DATA", .udesc = "2 MB unified TLB miss for data", .ucode = 0x2, }, { .uname = "1GB_DATA", .udesc = "1 GB unified TLB miss for data", .ucode = 0x4, }, { .uname = "4K_INST", .udesc = "4 KB unified TLB miss for instruction", .ucode = 0x10, }, { .uname = "2M_INST", .udesc = "2 MB unified TLB miss for instruction", .ucode = 0x20, }, { .uname = "1G_INST", .udesc = "1 GB unified TLB miss for instruction", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x77, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_prefetch_instructions_dispatched[]={ { .uname = "LOAD", 
.udesc = "Load (Prefetch, PrefetchT0/T1/T2)", .ucode = 0x1, }, { .uname = "STORE", .udesc = "Store (PrefetchW)", .ucode = 0x2, }, { .uname = "NTA", .udesc = "NTA (PrefetchNTA)", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_ineffective_sw_prefetches[]={ { .uname = "SW_PREFETCH_HIT_IN_L1", .udesc = "Software prefetch hit in the L1", .ucode = 0x1, }, { .uname = "SW_PREFETCH_HIT_IN_L2", .udesc = "Software prefetch hit in the L2", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x9, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_memory_requests[]={ { .uname = "NON_CACHEABLE", .udesc = "Requests to non-cacheable (UC) memory", .ucode = 0x1, }, { .uname = "WRITE_COMBINING", .udesc = "Requests to non-cacheable (WC, but not WC+/SS) memory", .ucode = 0x2, }, { .uname = "STREAMING_STORE", .udesc = "Requests to non-cacheable (WC+/SS, but not WC) memory", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x83, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_data_prefetcher[]={ { .uname = "ATTEMPTED", .udesc = "Prefetch attempts", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x2, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_mab_reqs[]={ { .uname = "BUFFER_BIT_0", .udesc = "Buffer entry index bit 0", .ucode = 0x1, }, { .uname = "BUFFER_BIT_1", .udesc = "Buffer entry index bit 1", .ucode = 0x2, }, { .uname = "BUFFER_BIT_2", .udesc = "Buffer entry index bit 2", .ucode = 0x4, }, { .uname = "BUFFER_BIT_3", .udesc = "Buffer entry index bit 3", .ucode = 0x8, }, { .uname = "BUFFER_BIT_4", .udesc = "Buffer entry index bit 4", .ucode = 0x10, }, { .uname = "BUFFER_BIT_5", .udesc = "Buffer entry index bit 5", .ucode = 0x20, }, { .uname = 
"BUFFER_BIT_6", .udesc = "Buffer entry index bit 6", .ucode = 0x40, }, { .uname = "BUFFER_BIT_7", .udesc = "Buffer entry index bit 7", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_system_read_responses[]={ { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x1, }, { .uname = "MODIFIED", .udesc = "Modified (D18F0x68[ATMModeEn]==0), Modified written (D18F0x68[ATMModeEn]==1)", .ucode = 0x2, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x8, }, { .uname = "DATA_ERROR", .udesc = "Data Error", .ucode = 0x10, }, { .uname = "MODIFIED_UNWRITTEN", .udesc = "Modified unwritten", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_octword_write_transfers[]={ { .uname = "OCTWORD_WRITE_TRANSFER", .udesc = "OW write transfer", .ucode = 0x1, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_requests_to_l2[]={ { .uname = "INSTRUCTIONS", .udesc = "IC fill", .ucode = 0x1, }, { .uname = "DATA", .udesc = "DC fill", .ucode = 0x2, }, { .uname = "TLB_WALK", .udesc = "TLB fill (page table walks)", .ucode = 0x4, }, { .uname = "SNOOP", .udesc = "NB probe request", .ucode = 0x8, }, { .uname = "CANCELLED", .udesc = "Canceled request", .ucode = 0x10, }, { .uname = "PREFETCHER", .udesc = "L2 cache prefetcher request", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x5f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_l2_cache_miss[]={ { .uname = "INSTRUCTIONS", .udesc = "IC fill", .ucode = 0x1, }, { .uname = "DATA", .udesc = "DC fill (includes possible replays, whereas PMCx041 does not)", .ucode = 0x2, }, { 
.uname = "TLB_WALK", .udesc = "TLB page table walk", .ucode = 0x4, }, { .uname = "PREFETCHER", .udesc = "L2 Cache Prefetcher request", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x17, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_l2_cache_fill_writeback[]={ { .uname = "L2_FILLS", .udesc = "L2 fills from system", .ucode = 0x1, }, { .uname = "L2_WRITEBACKS", .udesc = "L2 Writebacks to system (Clean and Dirty)", .ucode = 0x2, }, { .uname = "L2_WRITEBACKS_CLEAN", .udesc = "L2 Clean Writebacks to system", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_page_splintering[]={ { .uname = "GUEST_LARGER", .udesc = "Guest page size is larger than host page size when nested paging is enabled", .ucode = 0x1, }, { .uname = "MTRR_MISMATCH", .udesc = "Splintering due to MTRRs, IORRs, APIC, TOMs or other special address region", .ucode = 0x2, }, { .uname = "HOST_LARGER", .udesc = "Host page size is larger than the guest page size", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_l1_itlb_miss_and_l2_itlb_miss[]={ { .uname = "4K_PAGE_FETCHES", .udesc = "Instruction fetches to a 4 KB page", .ucode = 0x1, }, { .uname = "2M_PAGE_FETCHES", .udesc = "Instruction fetches to a 2 MB page", .ucode = 0x2, }, { .uname = "1G_PAGE_FETCHES", .udesc = "Instruction fetches to a 1 GB page", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_instruction_cache_invalidated[]={ { .uname = "NON_SMC_PROBE_MISS", .udesc = "Non-SMC invalidating probe that missed on in-flight instructions", .ucode = 0x1, }, { .uname = "NON_SMC_PROBE_HIT", .udesc = "Non-SMC invalidating 
probe that hit on in-flight instructions", .ucode = 0x2, }, { .uname = "SMC_PROBE_MISS", .udesc = "SMC invalidating probe that missed on in-flight instructions", .ucode = 0x4, }, { .uname = "SMC_PROBE_HIT", .udesc = "SMC invalidating probe that hit on in-flight instructions", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_retired_mmx_fp_instructions[]={ { .uname = "X87", .udesc = "X87 instructions", .ucode = 0x1, }, { .uname = "MMX", .udesc = "MMX(tm) instructions", .ucode = 0x2, }, { .uname = "SSE", .udesc = "SSE instructions (SSE,SSE2,SSE3,SSSE3,SSE4A,SSE4.1,SSE4.2,AVX,XOP,FMA4)", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_fpu_exceptions[]={ { .uname = "TOTAL_FAULTS", .udesc = "Total microfaults", .ucode = 0x1, }, { .uname = "TOTAL_TRAPS", .udesc = "Total microtraps", .ucode = 0x2, }, { .uname = "INT2EXT_FAULTS", .udesc = "Int2Ext faults", .ucode = 0x4, }, { .uname = "EXT2INT_FAULTS", .udesc = "Ext2Int faults", .ucode = 0x8, }, { .uname = "BYPASS_FAULTS", .udesc = "Bypass faults", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_ibs_ops_tagged[]={ { .uname = "TAGGED", .udesc = "Number of ops tagged by IBS", .ucode = 0x1, }, { .uname = "RETIRED", .udesc = "Number of ops tagged by IBS that retired", .ucode = 0x2, }, { .uname = "IGNORED", .udesc = "Number of times an op could not be tagged by IBS because of a previous tagged op that has not retired", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_ls_dispatch[]={ { .uname = "LOADS", .udesc = "Loads", .ucode = 0x1, }, { 
.uname = "STORES", .udesc = "Stores", .ucode = 0x2, }, { .uname = "LOAD_OP_STORES", .udesc = "Load-op-Stores", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_l2_prefetcher_trigger_events[]={ { .uname = "LOAD_L1_MISS_SEEN_BY_PREFETCHER", .udesc = "Load L1 miss seen by prefetcher", .ucode = 0x1, }, { .uname = "STORE_L1_MISS_SEEN_BY_PREFETCHER", .udesc = "Store L1 miss seen by prefetcher", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_entry_t amd64_fam15h_pe[]={ { .name = "DISPATCHED_FPU_OPS", .desc = "FPU Pipe Assignment", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_dispatched_fpu_ops), .ngrp = 1, .umasks = amd64_fam15h_dispatched_fpu_ops, }, { .name = "CYCLES_FPU_EMPTY", .desc = "FP Scheduler Empty", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x1, }, { .name = "RETIRED_SSE_OPS", .desc = "Retired SSE/BNI Ops", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x3, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_retired_sse_ops), .ngrp = 1, .umasks = amd64_fam15h_retired_sse_ops, }, { .name = "MOVE_SCALAR_OPTIMIZATION", .desc = "Number of Move Elimination and Scalar Op Optimization", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x4, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_move_scalar_optimization), .ngrp = 1, .umasks = amd64_fam15h_move_scalar_optimization, }, { .name = "RETIRED_SERIALIZING_OPS", .desc = "Retired Serializing Ops", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x5, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_retired_serializing_ops), .ngrp = 1, .umasks = amd64_fam15h_retired_serializing_ops, }, { .name = "BOTTOM_EXECUTE_OP", .desc = "Number of Cycles that a Bottom-Execute uop is in the FP Scheduler", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x6, }, { .name = "SEGMENT_REGISTER_LOADS", .desc = "Segment Register Loads", .modmsk = 
AMD64_FAM15H_ATTRS, .code = 0x20, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_segment_register_loads), .ngrp = 1, .umasks = amd64_fam15h_segment_register_loads, }, { .name = "PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE", .desc = "Pipeline Restart Due to Self-Modifying Code", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x21, }, { .name = "PIPELINE_RESTART_DUE_TO_PROBE_HIT", .desc = "Pipeline Restart Due to Probe Hit", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x22, }, { .name = "LOAD_Q_STORE_Q_FULL", .desc = "Load Queue/Store Queue Full", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x23, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_load_q_store_q_full), .ngrp = 1, .umasks = amd64_fam15h_load_q_store_q_full, }, { .name = "LOCKED_OPS", .desc = "Locked Operations", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x24, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_locked_ops), .ngrp = 1, .umasks = amd64_fam15h_locked_ops, }, { .name = "RETIRED_CLFLUSH_INSTRUCTIONS", .desc = "Retired CLFLUSH Instructions", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x26, }, { .name = "RETIRED_CPUID_INSTRUCTIONS", .desc = "Retired CPUID Instructions", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x27, }, { .name = "CANCELLED_STORE_TO_LOAD", .desc = "Canceled Store to Load Forward Operations", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x2a, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_cancelled_store_to_load), .ngrp = 1, .umasks = amd64_fam15h_cancelled_store_to_load, }, { .name = "SMIS_RECEIVED", .desc = "SMIs Received", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x2b, }, { .name = "DATA_CACHE_ACCESSES", .desc = "Data Cache Accesses", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x40, }, { .name = "DATA_CACHE_MISSES", .desc = "Data Cache Misses", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x41, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_data_cache_misses), .ngrp = 1, .umasks = amd64_fam15h_data_cache_misses, }, { .name = "DATA_CACHE_REFILLS_FROM_L2_OR_NORTHBRIDGE", .desc = "Data Cache Refills from L2 or System", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x42, 
.numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_data_cache_refills_from_l2_or_northbridge), .ngrp = 1, .umasks = amd64_fam15h_data_cache_refills_from_l2_or_northbridge, }, { .name = "DATA_CACHE_REFILLS_FROM_NORTHBRIDGE", .desc = "Data Cache Refills from System", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x43, }, { .name = "UNIFIED_TLB_HIT", .desc = "Unified TLB Hit", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x45, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_unified_tlb_hit), .ngrp = 1, .umasks = amd64_fam15h_unified_tlb_hit, }, { .name = "UNIFIED_TLB_MISS", .desc = "Unified TLB Miss", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x46, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_unified_tlb_miss), .ngrp = 1, .umasks = amd64_fam15h_unified_tlb_miss, }, { .name = "MISALIGNED_ACCESSES", .desc = "Misaligned Accesses", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x47, }, { .name = "PREFETCH_INSTRUCTIONS_DISPATCHED", .desc = "Prefetch Instructions Dispatched", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x4b, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_prefetch_instructions_dispatched), .ngrp = 1, .umasks = amd64_fam15h_prefetch_instructions_dispatched, }, { .name = "INEFFECTIVE_SW_PREFETCHES", .desc = "Ineffective Software Prefetches", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x52, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_ineffective_sw_prefetches), .ngrp = 1, .umasks = amd64_fam15h_ineffective_sw_prefetches, }, { .name = "MEMORY_REQUESTS", .desc = "Memory Requests by Type", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x65, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_memory_requests), .ngrp = 1, .umasks = amd64_fam15h_memory_requests, }, { .name = "DATA_PREFETCHER", .desc = "Data Prefetcher", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x67, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_data_prefetcher), .ngrp = 1, .umasks = amd64_fam15h_data_prefetcher, }, { .name = "MAB_REQS", .desc = "MAB Requests", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x68, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_mab_reqs), .ngrp = 1, .umasks = 
amd64_fam15h_mab_reqs, }, { .name = "MAB_WAIT", .desc = "MAB Wait Cycles", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x69, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_mab_reqs), .ngrp = 1, .umasks = amd64_fam15h_mab_reqs, /* identical to actual umasks list for this event */ }, { .name = "SYSTEM_READ_RESPONSES", .desc = "Response From System on Cache Refills", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x6c, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_system_read_responses), .ngrp = 1, .umasks = amd64_fam15h_system_read_responses, }, { .name = "OCTWORD_WRITE_TRANSFERS", .desc = "Octwords Written to System", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x6d, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_octword_write_transfers), .ngrp = 1, .umasks = amd64_fam15h_octword_write_transfers, }, { .name = "CPU_CLK_UNHALTED", .desc = "CPU Clocks not Halted", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x76, }, { .name = "REQUESTS_TO_L2", .desc = "Requests to L2 Cache", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x7d, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_requests_to_l2), .ngrp = 1, .umasks = amd64_fam15h_requests_to_l2, }, { .name = "L2_CACHE_MISS", .desc = "L2 Cache Misses", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x7e, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_l2_cache_miss), .ngrp = 1, .umasks = amd64_fam15h_l2_cache_miss, }, { .name = "L2_CACHE_FILL_WRITEBACK", .desc = "L2 Fill/Writeback", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x7f, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_l2_cache_fill_writeback), .ngrp = 1, .umasks = amd64_fam15h_l2_cache_fill_writeback, }, { .name = "PAGE_SPLINTERING", .desc = "Page Splintering", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x165, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_page_splintering), .ngrp = 1, .umasks = amd64_fam15h_page_splintering, }, { .name = "INSTRUCTION_CACHE_FETCHES", .desc = "Instruction Cache Fetches", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x80, }, { .name = "INSTRUCTION_CACHE_MISSES", .desc = "Instruction Cache Misses", .modmsk = AMD64_FAM15H_ATTRS, 
.code = 0x81, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_L2", .desc = "Instruction Cache Refills from L2", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x82, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM", .desc = "Instruction Cache Refills from System", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x83, }, { .name = "L1_ITLB_MISS_AND_L2_ITLB_HIT", .desc = "L1 ITLB Miss, L2 ITLB Hit", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x84, }, { .name = "L1_ITLB_MISS_AND_L2_ITLB_MISS", .desc = "L1 ITLB Miss, L2 ITLB Miss", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x85, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_l1_itlb_miss_and_l2_itlb_miss), .ngrp = 1, .umasks = amd64_fam15h_l1_itlb_miss_and_l2_itlb_miss, }, { .name = "PIPELINE_RESTART_DUE_TO_INSTRUCTION_STREAM_PROBE", .desc = "Pipeline Restart Due to Instruction Stream Probe", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x86, }, { .name = "INSTRUCTION_FETCH_STALL", .desc = "Instruction Fetch Stall", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x87, }, { .name = "RETURN_STACK_HITS", .desc = "Return Stack Hits", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x88, }, { .name = "RETURN_STACK_OVERFLOWS", .desc = "Return Stack Overflows", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x89, }, { .name = "INSTRUCTION_CACHE_VICTIMS", .desc = "Instruction Cache Victims", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x8b, }, { .name = "INSTRUCTION_CACHE_INVALIDATED", .desc = "Instruction Cache Lines Invalidated", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x8c, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_instruction_cache_invalidated), .ngrp = 1, .umasks = amd64_fam15h_instruction_cache_invalidated, }, { .name = "ITLB_RELOADS", .desc = "ITLB Reloads", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x99, }, { .name = "ITLB_RELOADS_ABORTED", .desc = "ITLB Reloads Aborted", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x9a, }, { .name = "RETIRED_INSTRUCTIONS", .desc = "Retired Instructions", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xc0, }, { .name = "RETIRED_UOPS", .desc = "Retired uops", .modmsk = 
AMD64_FAM15H_ATTRS, .code = 0xc1, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS", .desc = "Retired Branch Instructions", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xc2, }, { .name = "RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS", .desc = "Retired Mispredicted Branch Instructions", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xc3, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .desc = "Retired Taken Branch Instructions", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xc4, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Retired Taken Branch Instructions Mispredicted", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xc5, }, { .name = "RETIRED_FAR_CONTROL_TRANSFERS", .desc = "Retired Far Control Transfers", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xc6, }, { .name = "RETIRED_BRANCH_RESYNCS", .desc = "Retired Branch Resyncs", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xc7, }, { .name = "RETIRED_NEAR_RETURNS", .desc = "Retired Near Returns", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xc8, }, { .name = "RETIRED_NEAR_RETURNS_MISPREDICTED", .desc = "Retired Near Returns Mispredicted", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xc9, }, { .name = "RETIRED_INDIRECT_BRANCHES_MISPREDICTED", .desc = "Retired Indirect Branches Mispredicted", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xca, }, { .name = "RETIRED_MMX_FP_INSTRUCTIONS", .desc = "Retired MMX/FP Instructions", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xcb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_retired_mmx_fp_instructions), .ngrp = 1, .umasks = amd64_fam15h_retired_mmx_fp_instructions, }, { .name = "INTERRUPTS_MASKED_CYCLES", .desc = "Interrupts-Masked Cycles", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xcd, }, { .name = "INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING", .desc = "Interrupts-Masked Cycles with Interrupt Pending", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xce, }, { .name = "INTERRUPTS_TAKEN", .desc = "Interrupts Taken", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xcf, }, { .name = "DECODER_EMPTY", .desc = "Decoder Empty", .modmsk = 
AMD64_FAM15H_ATTRS, .code = 0xd0, }, { .name = "DISPATCH_STALLS", .desc = "Dispatch Stalls", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xd1, }, { .name = "DISPATCH_STALL_FOR_SERIALIZATION", .desc = "Microsequencer Stall due to Serialization", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xd3, }, { .name = "DISPATCH_STALL_FOR_RETIRE_QUEUE_FULL", .desc = "Dispatch Stall for Instruction Retire Q Full", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xd5, }, { .name = "DISPATCH_STALL_FOR_INT_SCHED_QUEUE_FULL", .desc = "Dispatch Stall for Integer Scheduler Queue Full", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xd6, }, { .name = "DISPATCH_STALL_FOR_FPU_FULL", .desc = "Dispatch Stall for FP Scheduler Queue Full", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xd7, }, { .name = "DISPATCH_STALL_FOR_LDQ_FULL", .desc = "Dispatch Stall for LDQ Full", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xd8, }, { .name = "MICROSEQ_STALL_WAITING_FOR_ALL_QUIET", .desc = "Microsequencer Stall Waiting for All Quiet", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xd9, }, { .name = "FPU_EXCEPTIONS", .desc = "FPU Exceptions", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xdb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_fpu_exceptions), .ngrp = 1, .umasks = amd64_fam15h_fpu_exceptions, }, { .name = "DR0_BREAKPOINTS", .desc = "DR0 Breakpoint Match", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xdc, }, { .name = "DR1_BREAKPOINTS", .desc = "DR1 Breakpoint Match", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xdd, }, { .name = "DR2_BREAKPOINTS", .desc = "DR2 Breakpoint Match", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xde, }, { .name = "DR3_BREAKPOINTS", .desc = "DR3 Breakpoint Match", .modmsk = AMD64_FAM15H_ATTRS, .code = 0xdf, }, { .name = "IBS_OPS_TAGGED", .desc = "Tagged IBS Ops", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x1cf, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_ibs_ops_tagged), .ngrp = 1, .umasks = amd64_fam15h_ibs_ops_tagged, }, { .name = "LS_DISPATCH", .desc = "LS Dispatch", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x29, .numasks = 
LIBPFM_ARRAY_SIZE(amd64_fam15h_ls_dispatch), .ngrp = 1, .umasks = amd64_fam15h_ls_dispatch, }, { .name = "EXECUTED_CLFLUSH_INSTRUCTIONS", .desc = "Executed CLFLUSH Instructions", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x30, }, { .name = "L2_PREFETCHER_TRIGGER_EVENTS", .desc = "L2 Prefetcher Trigger Events", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x16c, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_l2_prefetcher_trigger_events), .ngrp = 1, .umasks = amd64_fam15h_l2_prefetcher_trigger_events, }, { .name = "DISPATCH_STALL_FOR_STQ_FULL", .desc = "Dispatch Stall for STQ Full", .modmsk = AMD64_FAM15H_ATTRS, .code = 0x1d8, }, };

/* src/libpfm4/lib/events/amd64_events_fam15h_nb.h */
/* * Copyright (c) 2013 Google, Inc * Contributed by Stephane Eranian <eranian@gmail.com> * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux.
* * This file has been automatically generated. * * PMU: amd64_fam15h_nb (AMD64 Fam15h Interlagos NorthBridge) * * Based on libpfm patch by Robert Richter <robert.richter@amd.com>: * Family 15h Microarchitecture performance monitor events * * History: * * Nov 30 2013 -- Stephane Eranian, eranian@gmail.com: * Split core and Northbridge events as PMU is distinct * * Apr 29 2011 -- Robert Richter, robert.richter@amd.com: * Source: BKDG for AMD Family 15h Models 00h-0Fh Processors, * 42301, Rev 1.15, April 18, 2011 * * Dec 09 2010 -- Robert Richter, robert.richter@amd.com: * Source: BIOS and Kernel Developer's Guide for the AMD Family 15h * Processors, Rev 0.90, May 18, 2010 */ #define CORE_SELECT(b) \ { .uname = "CORE_0",\ .udesc = "Measure on Core0",\ .ucode = 0 << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO,\ },\ { .uname = "CORE_1",\ .udesc = "Measure on Core1",\ .ucode = 1 << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO,\ },\ { .uname = "CORE_2",\ .udesc = "Measure on Core2",\ .ucode = 2 << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO,\ },\ { .uname = "CORE_3",\ .udesc = "Measure on Core3",\ .ucode = 3 << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO,\ },\ { .uname = "CORE_4",\ .udesc = "Measure on Core4",\ .ucode = 4 << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO,\ },\ { .uname = "CORE_5",\ .udesc = "Measure on Core5",\ .ucode = 5 << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO,\ },\ { .uname = "CORE_6",\ .udesc = "Measure on Core6",\ .ucode = 6 << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO,\ },\ { .uname = "CORE_7",\ .udesc = "Measure on Core7",\ .ucode = 7 << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO,\ },\ { .uname = "ANY_CORE",\ .udesc = "Measure on any core",\ .ucode = 0xf << 4,\ .grpid = b,\ .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL,\ } static const amd64_umask_t amd64_fam15h_nb_dram_accesses[]={ { .uname = "DCT0_PAGE_HIT", .udesc = "DCT0 Page hit", .ucode = 0x1, }, { .uname = "DCT0_PAGE_MISS", .udesc = "DCT0 Page Miss", .ucode = 0x2, }, { .uname = "DCT0_PAGE_CONFLICT", .udesc = "DCT0 Page
Conflict", .ucode = 0x4, }, { .uname = "DCT1_PAGE_HIT", .udesc = "DCT1 Page hit", .ucode = 0x8, }, { .uname = "DCT1_PAGE_MISS", .udesc = "DCT1 Page Miss", .ucode = 0x10, }, { .uname = "DCT1_PAGE_CONFLICT", .udesc = "DCT1 Page Conflict", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_nb_dram_controller_page_table_overflows[]={ { .uname = "DCT0_PAGE_TABLE_OVERFLOW", .udesc = "DCT0 Page Table Overflow", .ucode = 0x1, }, { .uname = "DCT1_PAGE_TABLE_OVERFLOW", .udesc = "DCT1 Page Table Overflow", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_nb_memory_controller_dram_command_slots_missed[]={ { .uname = "DCT0_COMMAND_SLOTS_MISSED", .udesc = "DCT0 Command Slots Missed (in MemClks)", .ucode = 0x1, }, { .uname = "DCT1_COMMAND_SLOTS_MISSED", .udesc = "DCT1 Command Slots Missed (in MemClks)", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_nb_memory_controller_turnarounds[]={ { .uname = "DCT0_DIMM_TURNAROUND", .udesc = "DCT0 DIMM (chip select) turnaround", .ucode = 0x1, }, { .uname = "DCT0_READ_WRITE_TURNAROUND", .udesc = "DCT0 Read to write turnaround", .ucode = 0x2, }, { .uname = "DCT0_WRITE_READ_TURNAROUND", .udesc = "DCT0 Write to read turnaround", .ucode = 0x4, }, { .uname = "DCT1_DIMM_TURNAROUND", .udesc = "DCT1 DIMM (chip select) turnaround", .ucode = 0x8, }, { .uname = "DCT1_READ_WRITE_TURNAROUND", .udesc = "DCT1 Read to write turnaround", .ucode = 0x10, }, { .uname = "DCT1_WRITE_READ_TURNAROUND", .udesc = "DCT1 Write to read turnaround", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const 
amd64_umask_t amd64_fam15h_nb_memory_controller_bypass_counter_saturation[]={ { .uname = "MEMORY_CONTROLLER_HIGH_PRIORITY_BYPASS", .udesc = "Memory controller high priority bypass", .ucode = 0x1, }, { .uname = "MEMORY_CONTROLLER_MEDIUM_PRIORITY_BYPASS", .udesc = "Memory controller medium priority bypass", .ucode = 0x2, }, { .uname = "DCT0_DCQ_BYPASS", .udesc = "DCT0 DCQ bypass", .ucode = 0x4, }, { .uname = "DCT1_DCQ_BYPASS", .udesc = "DCT1 DCQ bypass", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_nb_thermal_status[]={ { .uname = "NUM_HTC_TRIP_POINT_CROSSED", .udesc = "Number of times the HTC trip point is crossed", .ucode = 0x4, }, { .uname = "NUM_CLOCKS_HTC_PSTATE_INACTIVE", .udesc = "Number of clocks HTC P-state is inactive", .ucode = 0x20, }, { .uname = "NUM_CLOCKS_HTC_PSTATE_ACTIVE", .udesc = "Number of clocks HTC P-state is active", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x64, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_nb_cpu_io_requests_to_memory_io[]={ { .uname = "REMOTE_IO_TO_LOCAL_IO", .udesc = "Remote IO to Local IO", .ucode = 0x61, .uflags= AMD64_FL_NCOMBO, }, { .uname = "REMOTE_CPU_TO_LOCAL_IO", .udesc = "Remote CPU to Local IO", .ucode = 0x64, .uflags= AMD64_FL_NCOMBO, }, { .uname = "LOCAL_IO_TO_REMOTE_IO", .udesc = "Local IO to Remote IO", .ucode = 0x91, .uflags= AMD64_FL_NCOMBO, }, { .uname = "LOCAL_IO_TO_REMOTE_MEM", .udesc = "Local IO to Remote Mem", .ucode = 0x92, .uflags= AMD64_FL_NCOMBO, }, { .uname = "LOCAL_CPU_TO_REMOTE_IO", .udesc = "Local CPU to Remote IO", .ucode = 0x94, .uflags= AMD64_FL_NCOMBO, }, { .uname = "LOCAL_CPU_TO_REMOTE_MEM", .udesc = "Local CPU to Remote Mem", .ucode = 0x98, .uflags= AMD64_FL_NCOMBO, }, { .uname = "LOCAL_IO_TO_LOCAL_IO", .udesc = "Local IO to Local IO", .ucode = 0xa1, .uflags= AMD64_FL_NCOMBO, }, { 
.uname = "LOCAL_IO_TO_LOCAL_MEM", .udesc = "Local IO to Local Mem", .ucode = 0xa2, .uflags= AMD64_FL_NCOMBO, }, { .uname = "LOCAL_CPU_TO_LOCAL_IO", .udesc = "Local CPU to Local IO", .ucode = 0xa4, .uflags= AMD64_FL_NCOMBO, }, { .uname = "LOCAL_CPU_TO_LOCAL_MEM", .udesc = "Local CPU to Local Mem", .ucode = 0xa8, .uflags= AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam15h_nb_cache_block_commands[]={ { .uname = "VICTIM_BLOCK", .udesc = "Victim Block (Writeback)", .ucode = 0x1, }, { .uname = "READ_BLOCK", .udesc = "Read Block (Dcache load miss refill)", .ucode = 0x4, }, { .uname = "READ_BLOCK_SHARED", .udesc = "Read Block Shared (Icache refill)", .ucode = 0x8, }, { .uname = "READ_BLOCK_MODIFIED", .udesc = "Read Block Modified (Dcache store miss refill)", .ucode = 0x10, }, { .uname = "CHANGE_TO_DIRTY", .udesc = "Change-to-Dirty (first store to clean block already in cache)", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3d, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_nb_sized_commands[]={ { .uname = "NON-POSTED_SZWR_BYTE", .udesc = "Non-Posted SzWr Byte (1-32 bytes). Typical Usage: Legacy or mapped IO, typically 1-4 bytes.", .ucode = 0x1, }, { .uname = "NON-POSTED_SZWR_DW", .udesc = "Non-Posted SzWr DW (1-16 dwords). Typical Usage: Legacy or mapped IO, typically 1", .ucode = 0x2, }, { .uname = "POSTED_SZWR_BYTE", .udesc = "Posted SzWr Byte (1-32 bytes). Typical Usage: Subcache-line DMA writes, size varies; also", .ucode = 0x4, }, { .uname = "POSTED_SZWR_DW", .udesc = "Posted SzWr DW (1-16 dwords). Typical Usage: Block-oriented DMA writes, often cache-line", .ucode = 0x8, }, { .uname = "SZRD_BYTE", .udesc = "SzRd Byte (4 bytes). Typical Usage: Legacy or mapped IO.", .ucode = 0x10, }, { .uname = "SZRD_DW", .udesc = "SzRd DW (1-16 dwords). 
Typical Usage: Block-oriented DMA reads, typically cache-line size.", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_nb_probe_responses_and_upstream_requests[]={ { .uname = "PROBE_MISS", .udesc = "Probe miss", .ucode = 0x1, }, { .uname = "PROBE_HIT_CLEAN", .udesc = "Probe hit clean", .ucode = 0x2, }, { .uname = "PROBE_HIT_DIRTY_WITHOUT_MEMORY_CANCEL", .udesc = "Probe hit dirty without memory cancel (probed by Sized Write or Change2Dirty)", .ucode = 0x4, }, { .uname = "PROBE_HIT_DIRTY_WITH_MEMORY_CANCEL", .udesc = "Probe hit dirty with memory cancel (probed by DMA read or cache refill request)", .ucode = 0x8, }, { .uname = "UPSTREAM_DISPLAY_REFRESH_ISOC_READS", .udesc = "Upstream display refresh/ISOC reads", .ucode = 0x10, }, { .uname = "UPSTREAM_NON-DISPLAY_REFRESH_READS", .udesc = "Upstream non-display refresh reads", .ucode = 0x20, }, { .uname = "UPSTREAM_ISOC_WRITES", .udesc = "Upstream ISOC writes", .ucode = 0x40, }, { .uname = "UPSTREAM_NON-ISOC_WRITES", .udesc = "Upstream non-ISOC writes", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_nb_gart_events[]={ { .uname = "GART_APERTURE_HIT_ON_ACCESS_FROM_CPU", .udesc = "GART aperture hit on access from CPU", .ucode = 0x1, }, { .uname = "GART_APERTURE_HIT_ON_ACCESS_FROM_IO", .udesc = "GART aperture hit on access from IO", .ucode = 0x2, }, { .uname = "GART_MISS", .udesc = "GART miss", .ucode = 0x4, }, { .uname = "GART_REQUEST_HIT_TABLE_WALK_IN_PROGRESS", .udesc = "GART Request hit table walk in progress", .ucode = 0x8, }, { .uname = "GART_MULTIPLE_TABLE_WALK_IN_PROGRESS", .udesc = "GART multiple table walk in progress", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x8f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const 
amd64_umask_t amd64_fam15h_nb_link_transmit_bandwidth[]={ { .uname = "COMMAND_DW_SENT", .udesc = "Command DW sent", .ucode = 0x1, .grpid = 0, }, { .uname = "DATA_DW_SENT", .udesc = "Data DW sent", .ucode = 0x2, .grpid = 0, }, { .uname = "BUFFER_RELEASE_DW_SENT", .udesc = "Buffer release DW sent", .ucode = 0x4, .grpid = 0, }, { .uname = "NOP_DW_SENT", .udesc = "NOP DW sent (idle)", .ucode = 0x8, .grpid = 0, }, { .uname = "ADDRESS_DW_SENT", .udesc = "Address (including extensions) DW sent", .ucode = 0x10, .grpid = 0, }, { .uname = "PER_PACKET_CRC_SENT", .udesc = "Per packet CRC sent", .ucode = 0x20, .grpid = 0, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 0, }, { .uname = "SUBLINK_1", .udesc = "When links are unganged, enable this umask to select sublink 1", .ucode = 0x80, .grpid = 1, .uflags= AMD64_FL_NCOMBO, }, { .uname = "SUBLINK_0", .udesc = "When links are unganged, enable this umask to select sublink 0 (default when links ganged)", .ucode = 0x00, .grpid = 1, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_nb_cpu_to_dram_requests_to_target_node[]={ { .uname = "LOCAL_TO_NODE_0", .udesc = "From Local node to Node 0", .ucode = 0x1, }, { .uname = "LOCAL_TO_NODE_1", .udesc = "From Local node to Node 1", .ucode = 0x2, }, { .uname = "LOCAL_TO_NODE_2", .udesc = "From Local node to Node 2", .ucode = 0x4, }, { .uname = "LOCAL_TO_NODE_3", .udesc = "From Local node to Node 3", .ucode = 0x8, }, { .uname = "LOCAL_TO_NODE_4", .udesc = "From Local node to Node 4", .ucode = 0x10, }, { .uname = "LOCAL_TO_NODE_5", .udesc = "From Local node to Node 5", .ucode = 0x20, }, { .uname = "LOCAL_TO_NODE_6", .udesc = "From Local node to Node 6", .ucode = 0x40, }, { .uname = "LOCAL_TO_NODE_7", .udesc = "From Local node to Node 7", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const 
amd64_umask_t amd64_fam15h_nb_io_to_dram_requests_to_target_node[]={ { .uname = "LOCAL_TO_NODE_0", .udesc = "From Local node to Node 0", .ucode = 0x1, }, { .uname = "LOCAL_TO_NODE_1", .udesc = "From Local node to Node 1", .ucode = 0x2, }, { .uname = "LOCAL_TO_NODE_2", .udesc = "From Local node to Node 2", .ucode = 0x4, }, { .uname = "LOCAL_TO_NODE_3", .udesc = "From Local node to Node 3", .ucode = 0x8, }, { .uname = "LOCAL_TO_NODE_4", .udesc = "From Local node to Node 4", .ucode = 0x10, }, { .uname = "LOCAL_TO_NODE_5", .udesc = "From Local node to Node 5", .ucode = 0x20, }, { .uname = "LOCAL_TO_NODE_6", .udesc = "From Local node to Node 6", .ucode = 0x40, }, { .uname = "LOCAL_TO_NODE_7", .udesc = "From Local node to Node 7", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_nb_cpu_read_command_requests_to_target_node_0_3[]={ { .uname = "READ_BLOCK_LOCAL_TO_NODE_0", .udesc = "Read block From Local node to Node 0", .ucode = 0x11, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_SHARED_LOCAL_TO_NODE_0", .udesc = "Read block shared From Local node to Node 0", .ucode = 0x12, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_MODIFIED_LOCAL_TO_NODE_0", .udesc = "Read block modified From Local node to Node 0", .ucode = 0x14, .uflags= AMD64_FL_NCOMBO, }, { .uname = "CHANGE_TO_DIRTY_LOCAL_TO_NODE_0", .udesc = "Change-to-Dirty From Local node to Node 0", .ucode = 0x18, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_LOCAL_TO_NODE_1", .udesc = "Read block From Local node to Node 1", .ucode = 0x21, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_SHARED_LOCAL_TO_NODE_1", .udesc = "Read block shared From Local node to Node 1", .ucode = 0x22, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_MODIFIED_LOCAL_TO_NODE_1", .udesc = "Read block modified From Local node to Node 1", .ucode = 0x24, .uflags= AMD64_FL_NCOMBO, }, { .uname = 
"CHANGE_TO_DIRTY_LOCAL_TO_NODE_1", .udesc = "Change-to-Dirty From Local node to Node 1", .ucode = 0x28, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_LOCAL_TO_NODE_2", .udesc = "Read block From Local node to Node 2", .ucode = 0x41, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_SHARED_LOCAL_TO_NODE_2", .udesc = "Read block shared From Local node to Node 2", .ucode = 0x42, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_MODIFIED_LOCAL_TO_NODE_2", .udesc = "Read block modified From Local node to Node 2", .ucode = 0x44, .uflags= AMD64_FL_NCOMBO, }, { .uname = "CHANGE_TO_DIRTY_LOCAL_TO_NODE_2", .udesc = "Change-to-Dirty From Local node to Node 2", .ucode = 0x48, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_LOCAL_TO_NODE_3", .udesc = "Read block From Local node to Node 3", .ucode = 0x81, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_SHARED_LOCAL_TO_NODE_3", .udesc = "Read block shared From Local node to Node 3", .ucode = 0x82, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_MODIFIED_LOCAL_TO_NODE_3", .udesc = "Read block modified From Local node to Node 3", .ucode = 0x84, .uflags= AMD64_FL_NCOMBO, }, { .uname = "CHANGE_TO_DIRTY_LOCAL_TO_NODE_3", .udesc = "Change-to-Dirty From Local node to Node 3", .ucode = 0x88, .uflags= AMD64_FL_NCOMBO, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_nb_cpu_read_command_requests_to_target_node_4_7[]={ { .uname = "READ_BLOCK_LOCAL_TO_NODE_4", .udesc = "Read block From Local node to Node 4", .ucode = 0x11, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_SHARED_LOCAL_TO_NODE_4", .udesc = "Read block shared From Local node to Node 4", .ucode = 0x12, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_MODIFIED_LOCAL_TO_NODE_4", .udesc = "Read block modified From Local node to Node 4", .ucode = 0x14, .uflags= AMD64_FL_NCOMBO, }, { .uname = "CHANGE_TO_DIRTY_LOCAL_TO_NODE_4", .udesc = 
"Change-to-Dirty From Local node to Node 4", .ucode = 0x18, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_LOCAL_TO_NODE_5", .udesc = "Read block From Local node to Node 5", .ucode = 0x21, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_SHARED_LOCAL_TO_NODE_5", .udesc = "Read block shared From Local node to Node 5", .ucode = 0x22, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_MODIFIED_LOCAL_TO_NODE_5", .udesc = "Read block modified From Local node to Node 5", .ucode = 0x24, .uflags= AMD64_FL_NCOMBO, }, { .uname = "CHANGE_TO_DIRTY_LOCAL_TO_NODE_5", .udesc = "Change-to-Dirty From Local node to Node 5", .ucode = 0x28, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_LOCAL_TO_NODE_6", .udesc = "Read block From Local node to Node 6", .ucode = 0x41, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_SHARED_LOCAL_TO_NODE_6", .udesc = "Read block shared From Local node to Node 6", .ucode = 0x42, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_MODIFIED_LOCAL_TO_NODE_6", .udesc = "Read block modified From Local node to Node 6", .ucode = 0x44, .uflags= AMD64_FL_NCOMBO, }, { .uname = "CHANGE_TO_DIRTY_LOCAL_TO_NODE_6", .udesc = "Change-to-Dirty From Local node to Node 6", .ucode = 0x48, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_LOCAL_TO_NODE_7", .udesc = "Read block From Local node to Node 7", .ucode = 0x81, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_SHARED_LOCAL_TO_NODE_7", .udesc = "Read block shared From Local node to Node 7", .ucode = 0x82, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_BLOCK_MODIFIED_LOCAL_TO_NODE_7", .udesc = "Read block modified From Local node to Node 7", .ucode = 0x84, .uflags= AMD64_FL_NCOMBO, }, { .uname = "CHANGE_TO_DIRTY_LOCAL_TO_NODE_7", .udesc = "Change-to-Dirty From Local node to Node 7", .ucode = 0x88, .uflags= AMD64_FL_NCOMBO, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t 
amd64_fam15h_nb_cpu_command_requests_to_target_node[]={ { .uname = "READ_SIZED_LOCAL_TO_NODE_0", .udesc = "Read Sized From Local node to Node 0", .ucode = 0x11, .uflags= AMD64_FL_NCOMBO, }, { .uname = "WRITE_SIZED_LOCAL_TO_NODE_0", .udesc = "Write Sized From Local node to Node 0", .ucode = 0x12, .uflags= AMD64_FL_NCOMBO, }, { .uname = "VICTIM_BLOCK_LOCAL_TO_NODE_0", .udesc = "Victim Block From Local node to Node 0", .ucode = 0x14, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_SIZED_LOCAL_TO_NODE_1", .udesc = "Read Sized From Local node to Node 1", .ucode = 0x21, .uflags= AMD64_FL_NCOMBO, }, { .uname = "WRITE_SIZED_LOCAL_TO_NODE_1", .udesc = "Write Sized From Local node to Node 1", .ucode = 0x22, .uflags= AMD64_FL_NCOMBO, }, { .uname = "VICTIM_BLOCK_LOCAL_TO_NODE_1", .udesc = "Victim Block From Local node to Node 1", .ucode = 0x24, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_SIZED_LOCAL_TO_NODE_2", .udesc = "Read Sized From Local node to Node 2", .ucode = 0x41, .uflags= AMD64_FL_NCOMBO, }, { .uname = "WRITE_SIZED_LOCAL_TO_NODE_2", .udesc = "Write Sized From Local node to Node 2", .ucode = 0x42, .uflags= AMD64_FL_NCOMBO, }, { .uname = "VICTIM_BLOCK_LOCAL_TO_NODE_2", .udesc = "Victim Block From Local node to Node 2", .ucode = 0x44, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_SIZED_LOCAL_TO_NODE_3", .udesc = "Read Sized From Local node to Node 3", .ucode = 0x81, .uflags= AMD64_FL_NCOMBO, }, { .uname = "WRITE_SIZED_LOCAL_TO_NODE_3", .udesc = "Write Sized From Local node to Node 3", .ucode = 0x82, .uflags= AMD64_FL_NCOMBO, }, { .uname = "VICTIM_BLOCK_LOCAL_TO_NODE_3", .udesc = "Victim Block From Local node to Node 3", .ucode = 0x84, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_SIZED_LOCAL_TO_NODE_4", .udesc = "Read Sized From Local node to Node 4", .ucode = 0x19, .uflags= AMD64_FL_NCOMBO, }, { .uname = "WRITE_SIZED_LOCAL_TO_NODE_4", .udesc = "Write Sized From Local node to Node 4", .ucode = 0x1a, .uflags= AMD64_FL_NCOMBO, }, { .uname = 
"VICTIM_BLOCK_LOCAL_TO_NODE_4", .udesc = "Victim Block From Local node to Node 4", .ucode = 0x1c, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_SIZED_LOCAL_TO_NODE_5", .udesc = "Read Sized From Local node to Node 5", .ucode = 0x29, .uflags= AMD64_FL_NCOMBO, }, { .uname = "WRITE_SIZED_LOCAL_TO_NODE_5", .udesc = "Write Sized From Local node to Node 5", .ucode = 0x2a, .uflags= AMD64_FL_NCOMBO, }, { .uname = "VICTIM_BLOCK_LOCAL_TO_NODE_5", .udesc = "Victim Block From Local node to Node 5", .ucode = 0x2c, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_SIZED_LOCAL_TO_NODE_6", .udesc = "Read Sized From Local node to Node 6", .ucode = 0x49, .uflags= AMD64_FL_NCOMBO, }, { .uname = "WRITE_SIZED_LOCAL_TO_NODE_6", .udesc = "Write Sized From Local node to Node 6", .ucode = 0x4a, .uflags= AMD64_FL_NCOMBO, }, { .uname = "VICTIM_BLOCK_LOCAL_TO_NODE_6", .udesc = "Victim Block From Local node to Node 6", .ucode = 0x4c, .uflags= AMD64_FL_NCOMBO, }, { .uname = "READ_SIZED_LOCAL_TO_NODE_7", .udesc = "Read Sized From Local node to Node 7", .ucode = 0x89, .uflags= AMD64_FL_NCOMBO, }, { .uname = "WRITE_SIZED_LOCAL_TO_NODE_7", .udesc = "Write Sized From Local node to Node 7", .ucode = 0x8a, .uflags= AMD64_FL_NCOMBO, }, { .uname = "VICTIM_BLOCK_LOCAL_TO_NODE_7", .udesc = "Victim Block From Local node to Node 7", .ucode = 0x8c, .uflags= AMD64_FL_NCOMBO, }, { .uname = "ALL_LOCAL_TO_NODE_0_3", .udesc = "All From Local node to Node 0-3", .ucode = 0xf7, .uflags= AMD64_FL_NCOMBO, }, { .uname = "ALL_LOCAL_TO_NODE_4_7", .udesc = "All From Local node to Node 4-7", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_nb_request_cache_status_0[]={ { .uname = "PROBE_HIT_S", .udesc = "Probe Hit S", .ucode = 0x1, }, { .uname = "PROBE_HIT_E", .udesc = "Probe Hit E", .ucode = 0x2, }, { .uname = "PROBE_HIT_MUW_OR_O", .udesc = "Probe Hit MuW or O", .ucode = 0x4, }, { .uname = "PROBE_HIT_M", .udesc = "Probe Hit M", .ucode = 0x8, }, { .uname = "PROBE_MISS", 
.udesc = "Probe Miss", .ucode = 0x10, }, { .uname = "DIRECTED_PROBE", .udesc = "Directed Probe", .ucode = 0x20, }, { .uname = "TRACK_CACHE_STAT_FOR_RDBLK", .udesc = "Track Cache Stat for RdBlk", .ucode = 0x40, }, { .uname = "TRACK_CACHE_STAT_FOR_RDBLKS", .udesc = "Track Cache Stat for RdBlkS", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_nb_request_cache_status_1[]={ { .uname = "PROBE_HIT_S", .udesc = "Probe Hit S", .ucode = 0x1, }, { .uname = "PROBE_HIT_E", .udesc = "Probe Hit E", .ucode = 0x2, }, { .uname = "PROBE_HIT_MUW_OR_O", .udesc = "Probe Hit MuW or O", .ucode = 0x4, }, { .uname = "PROBE_HIT_M", .udesc = "Probe Hit M", .ucode = 0x8, }, { .uname = "PROBE_MISS", .udesc = "Probe Miss", .ucode = 0x10, }, { .uname = "DIRECTED_PROBE", .udesc = "Directed Probe", .ucode = 0x20, }, { .uname = "TRACK_CACHE_STAT_FOR_CHGTODIRTY", .udesc = "Track Cache Stat for ChgToDirty", .ucode = 0x40, }, { .uname = "TRACK_CACHE_STAT_FOR_RDBLKM", .udesc = "Track Cache Stat for RdBlkM", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_nb_memory_controller_requests[]={ { .uname = "WRITE_REQUESTS_TO_DCT", .udesc = "Write requests sent to the DCT", .ucode = 0x1, }, { .uname = "READ_REQUESTS_TO_DCT", .udesc = "Read requests (including prefetch requests) sent to the DCT", .ucode = 0x2, }, { .uname = "PREFETCH_REQUESTS_TO_DCT", .udesc = "Prefetch requests sent to the DCT", .ucode = 0x4, }, { .uname = "32_BYTES_SIZED_WRITES", .udesc = "32 Bytes Sized Writes", .ucode = 0x8, }, { .uname = "64_BYTES_SIZED_WRITES", .udesc = "64 Bytes Sized Writes", .ucode = 0x10, }, { .uname = "32_BYTES_SIZED_READS", .udesc = "32 Bytes Sized Reads", .ucode = 0x20, }, { .uname = "64_BYTE_SIZED_READS", .udesc = "64 Byte Sized Reads", .ucode = 0x40, }, { .uname = 
"READ_REQUESTS_TO_DCT_WHILE_WRITES_PENDING", .udesc = "Read requests sent to the DCT while writes requests are pending in the DCT", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_nb_read_request_to_l3_cache[]={ { .uname = "READ_BLOCK_EXCLUSIVE", .udesc = "Read Block Exclusive (Data cache read)", .ucode = 0x1, .grpid = 0, }, { .uname = "READ_BLOCK_SHARED", .udesc = "Read Block Shared (Instruction cache read)", .ucode = 0x2, .grpid = 0, }, { .uname = "READ_BLOCK_MODIFY", .udesc = "Read Block Modify", .ucode = 0x4, .grpid = 0, }, { .uname = "PREFETCH", .udesc = "Count prefetches only", .ucode = 0x8, .grpid = 0, }, { .uname = "READ_BLOCK_ANY", .udesc = "Count any read request", .ucode = 0x7, .grpid = 0, .uflags= AMD64_FL_DFL | AMD64_FL_NCOMBO, }, CORE_SELECT(1), }; static const amd64_umask_t amd64_fam15h_nb_l3_fills_caused_by_l2_evictions[]={ { .uname = "SHARED", .udesc = "Shared", .ucode = 0x1, .grpid = 0, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x2, .grpid = 0, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x4, .grpid = 0, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x8, .grpid = 0, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 0, }, CORE_SELECT(1), }; static const amd64_umask_t amd64_fam15h_nb_l3_evictions[]={ { .uname = "SHARED", .udesc = "Shared", .ucode = 0x1, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x2, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x4, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam15h_nb_l3_latency[]={ { .uname = "L3_REQUEST_CYCLE", .udesc = "L3 Request cycle count.", .ucode = 0x1, }, { .uname = "L3_REQUEST", .udesc = "L3 
request count.", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_entry_t amd64_fam15h_nb_pe[]={ { .name = "DRAM_ACCESSES", .desc = "DRAM Accesses", .code = 0xe0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_dram_accesses), .ngrp = 1, .umasks = amd64_fam15h_nb_dram_accesses, }, { .name = "DRAM_CONTROLLER_PAGE_TABLE_OVERFLOWS", .desc = "DRAM Controller Page Table Overflows", .code = 0xe1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_dram_controller_page_table_overflows), .ngrp = 1, .umasks = amd64_fam15h_nb_dram_controller_page_table_overflows, }, { .name = "MEMORY_CONTROLLER_DRAM_COMMAND_SLOTS_MISSED", .desc = "Memory Controller DRAM Command Slots Missed", .code = 0xe2, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_memory_controller_dram_command_slots_missed), .ngrp = 1, .umasks = amd64_fam15h_nb_memory_controller_dram_command_slots_missed, }, { .name = "MEMORY_CONTROLLER_TURNAROUNDS", .desc = "Memory Controller Turnarounds", .code = 0xe3, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_memory_controller_turnarounds), .ngrp = 1, .umasks = amd64_fam15h_nb_memory_controller_turnarounds, }, { .name = "MEMORY_CONTROLLER_BYPASS_COUNTER_SATURATION", .desc = "Memory Controller Bypass Counter Saturation", .code = 0xe4, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_memory_controller_bypass_counter_saturation), .ngrp = 1, .umasks = amd64_fam15h_nb_memory_controller_bypass_counter_saturation, }, { .name = "THERMAL_STATUS", .desc = "Thermal Status", .code = 0xe8, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_thermal_status), .ngrp = 1, .umasks = amd64_fam15h_nb_thermal_status, }, { .name = "CPU_IO_REQUESTS_TO_MEMORY_IO", .desc = "CPU/IO Requests to Memory/IO", .code = 0xe9, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_cpu_io_requests_to_memory_io), .ngrp = 1, .umasks = amd64_fam15h_nb_cpu_io_requests_to_memory_io, }, { .name = "CACHE_BLOCK_COMMANDS", .desc = "Cache Block Commands", 
.code = 0xea, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_cache_block_commands), .ngrp = 1, .umasks = amd64_fam15h_nb_cache_block_commands, }, { .name = "SIZED_COMMANDS", .desc = "Sized Commands", .code = 0xeb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_sized_commands), .ngrp = 1, .umasks = amd64_fam15h_nb_sized_commands, }, { .name = "PROBE_RESPONSES_AND_UPSTREAM_REQUESTS", .desc = "Probe Responses and Upstream Requests", .code = 0xec, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_probe_responses_and_upstream_requests), .ngrp = 1, .umasks = amd64_fam15h_nb_probe_responses_and_upstream_requests, }, { .name = "GART_EVENTS", .desc = "GART Events", .code = 0xee, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_gart_events), .ngrp = 1, .umasks = amd64_fam15h_nb_gart_events, }, { .name = "LINK_TRANSMIT_BANDWIDTH_LINK_0", .desc = "Link Transmit Bandwidth Link 0", .code = 0xf6, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_link_transmit_bandwidth), .ngrp = 2, .umasks = amd64_fam15h_nb_link_transmit_bandwidth, }, { .name = "LINK_TRANSMIT_BANDWIDTH_LINK_1", .desc = "Link Transmit Bandwidth Link 1", .code = 0xf7, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_link_transmit_bandwidth), .ngrp = 2, .umasks = amd64_fam15h_nb_link_transmit_bandwidth, }, { .name = "LINK_TRANSMIT_BANDWIDTH_LINK_2", .desc = "Link Transmit Bandwidth Link 2", .code = 0xf8, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_link_transmit_bandwidth), .ngrp = 2, .umasks = amd64_fam15h_nb_link_transmit_bandwidth, }, { .name = "LINK_TRANSMIT_BANDWIDTH_LINK_3", .desc = "Link Transmit Bandwidth Link 3", .code = 0x1f9, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_link_transmit_bandwidth), .ngrp = 2, .umasks = amd64_fam15h_nb_link_transmit_bandwidth, }, { .name = "CPU_TO_DRAM_REQUESTS_TO_TARGET_NODE", .desc = "CPU to DRAM Requests to Target Node", .code = 0x1e0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_cpu_to_dram_requests_to_target_node), .ngrp = 1, .umasks = amd64_fam15h_nb_cpu_to_dram_requests_to_target_node, }, { 
.name = "IO_TO_DRAM_REQUESTS_TO_TARGET_NODE", .desc = "IO to DRAM Requests to Target Node", .code = 0x1e1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_io_to_dram_requests_to_target_node), .ngrp = 1, .umasks = amd64_fam15h_nb_io_to_dram_requests_to_target_node, }, { .name = "CPU_READ_COMMAND_LATENCY_TO_TARGET_NODE_0_3", .desc = "CPU Read Command Latency to Target Node 0-3", .code = 0x1e2, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_cpu_read_command_requests_to_target_node_0_3), .ngrp = 1, .umasks = amd64_fam15h_nb_cpu_read_command_requests_to_target_node_0_3, }, { .name = "CPU_READ_COMMAND_REQUESTS_TO_TARGET_NODE_0_3", .desc = "CPU Read Command Requests to Target Node 0-3", .code = 0x1e3, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_cpu_read_command_requests_to_target_node_0_3), .ngrp = 1, .umasks = amd64_fam15h_nb_cpu_read_command_requests_to_target_node_0_3, }, { .name = "CPU_READ_COMMAND_LATENCY_TO_TARGET_NODE_4_7", .desc = "CPU Read Command Latency to Target Node 4-7", .code = 0x1e4, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_cpu_read_command_requests_to_target_node_4_7), .ngrp = 1, .umasks = amd64_fam15h_nb_cpu_read_command_requests_to_target_node_4_7, }, { .name = "CPU_READ_COMMAND_REQUESTS_TO_TARGET_NODE_4_7", .desc = "CPU Read Command Requests to Target Node 4-7", .code = 0x1e5, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_cpu_read_command_requests_to_target_node_4_7), .ngrp = 1, .umasks = amd64_fam15h_nb_cpu_read_command_requests_to_target_node_4_7, }, { .name = "CPU_COMMAND_LATENCY_TO_TARGET_NODE", .desc = "CPU Command Latency to Target Node", .code = 0x1e6, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_cpu_command_requests_to_target_node), .ngrp = 1, .umasks = amd64_fam15h_nb_cpu_command_requests_to_target_node, }, { .name = "CPU_REQUESTS_TO_TARGET_NODE", .desc = "CPU Requests to Target Node", .code = 0x1e7, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_cpu_command_requests_to_target_node), .ngrp = 1, .umasks = 
amd64_fam15h_nb_cpu_command_requests_to_target_node, }, { .name = "REQUEST_CACHE_STATUS_0", .desc = "Request Cache Status 0", .code = 0x1ea, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_request_cache_status_0), .ngrp = 1, .umasks = amd64_fam15h_nb_request_cache_status_0, }, { .name = "REQUEST_CACHE_STATUS_1", .desc = "Request Cache Status 1", .code = 0x1eb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_request_cache_status_1), .ngrp = 1, .umasks = amd64_fam15h_nb_request_cache_status_1, }, { .name = "MEMORY_CONTROLLER_REQUESTS", .desc = "Memory Controller Requests", .code = 0x1f0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_memory_controller_requests), .ngrp = 1, .umasks = amd64_fam15h_nb_memory_controller_requests, }, { .name = "READ_REQUEST_TO_L3_CACHE", .desc = "Read Request to L3 Cache", .code = 0x4e0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_read_request_to_l3_cache), .ngrp = 2, .umasks = amd64_fam15h_nb_read_request_to_l3_cache, }, { .name = "L3_CACHE_MISSES", .desc = "L3 Cache Misses", .code = 0x4e1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_read_request_to_l3_cache), .ngrp = 2, .umasks = amd64_fam15h_nb_read_request_to_l3_cache, }, { .name = "L3_FILLS_CAUSED_BY_L2_EVICTIONS", .desc = "L3 Fills caused by L2 Evictions", .code = 0x4e2, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_l3_fills_caused_by_l2_evictions), .ngrp = 2, .umasks = amd64_fam15h_nb_l3_fills_caused_by_l2_evictions, }, { .name = "L3_EVICTIONS", .desc = "L3 Evictions", .code = 0x4e3, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_l3_evictions), .ngrp = 1, .umasks = amd64_fam15h_nb_l3_evictions, }, { .name = "NON_CANCELED_L3_READ_REQUESTS", .desc = "Non-canceled L3 Read Requests", .code = 0x4ed, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_read_request_to_l3_cache), .ngrp = 2, .umasks = amd64_fam15h_nb_read_request_to_l3_cache, }, { .name = "L3_LATENCY", .desc = "L3 Latency", .code = 0x4ef, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_l3_latency), .ngrp = 1, .umasks = 
amd64_fam15h_nb_l3_latency, }, };

/* ==== src/libpfm4/lib/events/amd64_events_fam16h.h ==== */

/*
 * Copyright (c) 2017 by Vince Weaver
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
* * PMU: amd64_fam16h (AMD64 Fam16h) */ /* Dispatched FPU 0x0 */ static const amd64_umask_t amd64_fam16h_dispatched_fpu[]={ { .uname = "PIPE0", .udesc = "Pipe0 dispatches", .ucode = 0x1, }, { .uname = "PIPE1", .udesc = "Pipe1 dispatches", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Retired SSE/AVX 0x03 */ static const amd64_umask_t amd64_fam16h_retired_sse_operations[]={ { .uname = "SINGLE_ADD_SUB_OPS", .udesc = "Single precision add/subtract ops", .ucode = 0x1, }, { .uname = "SINGLE_MUL_OPS", .udesc = "Single precision multiply ops", .ucode = 0x2, }, { .uname = "SINGLE_DIV_OPS", .udesc = "Single precision divide/square root ops", .ucode = 0x4, }, { .uname = "DOUBLE_ADD_SUB_OPS", .udesc = "Double precision add/subtract ops", .ucode = 0x10, }, { .uname = "DOUBLE_MUL_OPS", .udesc = "Double precision multiply ops", .ucode = 0x20, }, { .uname = "DOUBLE_DIV_OPS", .udesc = "Double precision divide/square root ops", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Retired serializing ops 0x05 */ static const amd64_umask_t amd64_fam16h_retired_serializing_ops[]={ { .uname = "SSE_BOTTOM_EXECUTING_UOPS", .udesc = "SSE bottom-executing uops retired", .ucode = 0x1, }, { .uname = "SSE_CONTROL_RENAMING_UOPS", .udesc = "SSE control-renaming uops retired", .ucode = 0x2, }, { .uname = "X87_BOTTOM_EXECUTING_UOPS", .udesc = "X87 bottom-executing uops retired", .ucode = 0x4, }, { .uname = "X87_CONTROL_RENAMING_UOPS", .udesc = "X87 control-renaming uops retired", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Retired x87 ops 0x11 */ static const amd64_umask_t amd64_fam16h_retired_x87_ops[]={ { .uname = "ADD_AND_SUB", .udesc = "Add and subtract", .ucode = 0x1, }, { .uname = "MULTIPLY", .udesc = "Multiply", .ucode = 0x2, }, { 
.uname = "DIVIDE_AND_FSQRT", .udesc = "Divide and fsqrt", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Segment Register Loads 0x20 */ static const amd64_umask_t amd64_fam16h_segment_register_loads[]={ { .uname = "ES", .udesc = "ES", .ucode = 0x1, }, { .uname = "CS", .udesc = "CS", .ucode = 0x2, }, { .uname = "SS", .udesc = "SS", .ucode = 0x4, }, { .uname = "DS", .udesc = "DS", .ucode = 0x8, }, { .uname = "FS", .udesc = "FS", .ucode = 0x10, }, { .uname = "GS", .udesc = "GS", .ucode = 0x20, }, { .uname = "HS", .udesc = "HS", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Pipeline Restart 0x21 */ static const amd64_umask_t amd64_fam16h_pipeline_restart[]={ { .uname = "INVALIDATING_PROBES", .udesc = "Evictions caused by invalidating probes", .ucode = 0x1, }, { .uname = "FILLS", .udesc = "Evictions caused by fills", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Locked Operations 0x24 */ static const amd64_umask_t amd64_fam16h_locked_ops[]={ { .uname = "EXECUTED", .udesc = "The number of locked instructions executed", .ucode = 0x1, }, { .uname = "CYCLES_TO_ACQUIRE", .udesc = "The number of cycles to acquire bus lock", .ucode = 0x2, }, { .uname = "CYCLES_TO_UNLOCK", .udesc = "The number of cycles to unlock cache line", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* LS Dispatch 0x29 */ static const amd64_umask_t amd64_fam16h_ls_dispatch[]={ { .uname = "LOADS", .udesc = "The number of loads", .ucode = 0x1, }, { .uname = "STORES", .udesc = "The number of stores", .ucode = 0x2, }, { .uname = "LOAD_OP_STORES", .udesc = "The number of load-op-stores", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode 
= 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Cancelled Store to Load 0x2a */ static const amd64_umask_t amd64_fam16h_cancelled_store_to_load_forward_operations[]={ { .uname = "ADDRESS_MISMATCHES", .udesc = "Address mismatches (starting byte not the same).", .ucode = 0x1, }, { .uname = "STORE_IS_SMALLER_THAN_LOAD", .udesc = "Store is smaller than load.", .ucode = 0x2, }, { .uname = "MISALIGNED", .udesc = "Misaligned.", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Data cache refills 0x42 */ static const amd64_umask_t amd64_fam16h_data_cache_refills[]={ { .uname = "NON_CACHABLE", .udesc = "Non-cacheable", .ucode = 0x1, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x2, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x8, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Cache refills from northbridge 0x43 */ static const amd64_umask_t amd64_fam16h_data_cache_refills_from_system[]={ { .uname = "NON_CACHABLE", .udesc = "Non-cacheable", .ucode = 0x1, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x2, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x8, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Data cache lines evicted 0x44 */ static const amd64_umask_t amd64_fam16h_data_cache_lines_evicted[]={ { .uname = "EVICTED", .udesc = "Evicted from probe", .ucode = 0x1, }, { .uname = "SHARED", .udesc = "Shared eviction", .ucode = 0x2, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive eviction", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned eviction", .ucode = 0x8, }, {
.uname = "MODIFIED", .udesc = "Modified eviction", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* DTLB Miss 0x46 */ static const amd64_umask_t amd64_fam16h_dtlb_miss[]={ { .uname = "STORES_L1TLB", .udesc = "Stores that miss L1TLB", .ucode = 0x1, }, { .uname = "LOADS_L1TLB", .udesc = "Loads that miss L1TLB", .ucode = 0x2, }, { .uname = "STORES_L2TLB", .udesc = "Stores that miss L2TLB", .ucode = 0x4, }, { .uname = "LOADS_L2TLB", .udesc = "Loads that miss L2TLB", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Misaligned accesses 0x47 */ static const amd64_umask_t amd64_fam16h_misaligned_accesses[]={ { .uname = "MISALIGN_16B", .udesc = "Misaligns that cross 16 Byte boundary", .ucode = 0x1, }, { .uname = "MISALIGN_4KB", .udesc = "Misaligns that cross a 4kB boundary", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Prefetch Instruction Dispatched 0x4b */ static const amd64_umask_t amd64_fam16h_prefetch_instructions_dispatched[]={ { .uname = "LOAD", .udesc = "Load (Prefetch, PrefetchT0/T1/T2)", .ucode = 0x1, }, { .uname = "STORE", .udesc = "Store (PrefetchW)", .ucode = 0x2, }, { .uname = "NTA", .udesc = "NTA (PrefetchNTA)", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* L1 DTLB Hit 0x4d */ static const amd64_umask_t amd64_fam16h_l1_dtlb_hit[]={ { .uname = "L1_4K_TLB_HIT", .udesc = "L1 4K TLB hit", .ucode = 0x1, }, { .uname = "L1_2M_TLB_HIT", .udesc = "L1 2M TLB hit", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Ineffective SW Prefetch 0x52 */ static const amd64_umask_t amd64_fam16h_ineffective_sw_prefetches[]={ { .uname = 
"SW_PREFETCH_DATA_CACHE", .udesc = "Software prefetch hit in data cache", .ucode = 0x1, }, { .uname = "SW_PREFETCH_PENDING_FILL", .udesc = "Software prefetch hit a pending fill", .ucode = 0x2, }, { .uname = "SW_PREFETCH_MAB", .udesc = "Software prefetches that don't get a MAB", .ucode = 0x4, }, { .uname = "SW_PREFETCH_HIT_L2", .udesc = "Software prefetches that hit in L2", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Uncacheable Memory 0x61 */ static const amd64_umask_t amd64_fam16h_uncachable_memory[]={ { .uname = "READ_BYTE", .udesc = "Read byte", .ucode = 0x1, }, { .uname = "READ_DOUBLEWORD", .udesc = "Read doubleword", .ucode = 0x2, }, { .uname = "WRITE_BYTE", .udesc = "Write byte", .ucode = 0x10, }, { .uname = "WRITE_DOUBLEWORD", .udesc = "Write doubleword", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x33, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Read Block Operations 0x62 */ static const amd64_umask_t amd64_fam16h_read_block[]={ { .uname = "READ_BLOCK", .udesc = "Read block", .ucode = 0x1, }, { .uname = "RDBLKMOD", .udesc = "RdBlkMod", .ucode = 0x2, }, { .uname = "READ_BLOCK_SHARED", .udesc = "Read block shared", .ucode = 0x4, }, { .uname = "READ_BLOCK_SPEC", .udesc = "Read block speculative", .ucode = 0x10, }, { .uname = "READ_BLOCK_SPEC_MOD", .udesc = "Read block speculative modified", .ucode = 0x20, }, { .uname = "READ_BLOCK_SPEC_SHARED", .udesc = "Read block speculative shared", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x77, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Change to Dirty 0x63 */ static const amd64_umask_t amd64_fam16h_change_dirty[]={ { .uname = "CHANGE_DIRTY", .udesc = "Change to dirty", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x10, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Memory Requests 0x65 */ static const
amd64_umask_t amd64_fam16h_memory_requests[]={ { .uname = "NON_CACHEABLE", .udesc = "Requests to non-cacheable (UC) memory", .ucode = 0x1, }, { .uname = "WRITE_COMBINING", .udesc = "Requests to write-combining (WC) memory or WC buffer flushes to WB memory", .ucode = 0x2, }, { .uname = "STREAMING_STORE", .udesc = "Streaming store (SS) requests", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x83, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Data Cache Prefetches 0x67 */ static const amd64_umask_t amd64_fam16h_data_prefetches[]={ { .uname = "ATTEMPTED", .udesc = "Prefetch attempts", .ucode = 0x2, }, { .uname = "MAB", .udesc = "Hits on MAB", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xa, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* MAB Requests 0x68 and 0x69 */ static const amd64_umask_t amd64_fam16h_mab_requests[]={ { .uname = "DC_MISS0", .udesc = "Data cache miss buffer 0", .ucode = 0x1, }, { .uname = "DC_MISS1", .udesc = "Data cache miss buffer 1", .ucode = 0x2, }, { .uname = "DC_MISS2", .udesc = "Data cache miss buffer 2", .ucode = 0x4, }, { .uname = "DC_MISS3", .udesc = "Data cache miss buffer 3", .ucode = 0x8, }, { .uname = "DC_MISS4", .udesc = "Data cache miss buffer 4", .ucode = 0x10, }, { .uname = "DC_MISS5", .udesc = "Data cache miss buffer 5", .ucode = 0x20, }, { .uname = "DC_MISS6", .udesc = "Data cache miss buffer 6", .ucode = 0x40, }, { .uname = "DC_MISS7", .udesc = "Data cache miss buffer 7", .ucode = 0x80, }, { .uname = "IC_MISS0", .udesc = "Instruction cache miss buffer 0", .ucode = 0x100, }, { .uname = "IC_MISS1", .udesc = "Instruction cache miss buffer 1", .ucode = 0x200, }, { .uname = "DC_ANY", .udesc = "Any data cache miss buffer", .ucode = 0x800, }, { .uname = "IC_ANY", .udesc = "Any instruction cache miss buffer", .ucode = 0x1000, }, }; /* System Response by Coherence 0x6c */ static const amd64_umask_t amd64_fam16h_system_responses[]={ { .uname = "EXCLUSIVE", 
.udesc = "Exclusive", .ucode = 0x1, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x2, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x8, }, { .uname = "DATA_ERROR", .udesc = "Data Error", .ucode = 0x10, }, { .uname = "CHANGE_DIRTY", .udesc = "Change to dirty success", .ucode = 0x20, }, { .uname = "UNCACHEABLE", .udesc = "Uncacheable", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Data written to system 0x6d */ static const amd64_umask_t amd64_fam16h_data_written_to_system[]={ { .uname = "DATA_LINE_EVICTIONS", .udesc = "Data line evictions", .ucode = 0x1, }, { .uname = "INSTRUCTION_ATTRIBUTE_EVICTIONS", .udesc = "Instruction attribute evictions", .ucode = 0x2, }, { .uname = "BYTE_ENABLE_MASK_UNCACHEABLE", .udesc = "Byte enable mask for uncacheable or I/O store", .ucode = 0x4, }, { .uname = "DATA_FOR_UNCACHEABLE", .udesc = "Data for uncacheable or I/O store", .ucode = 0x8, }, { .uname = "BYTE_ENABLE_MASK_WRITE_COMBINE", .udesc = "Byte enable mask for write combine context flush", .ucode = 0x10, }, { .uname = "DATA_FOR_WRITE_COMBINE", .udesc = "Data for write combine context flush", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* cache cross invalidate 0x75 */ static const amd64_umask_t amd64_fam16h_cache_cross_invalidates[]={ { .uname = "DC_INVALIDATES_IC", .udesc = "Modification of instructions or data too close to code", .ucode = 0x1, }, { .uname = "DC_INVALIDATES_DC", .udesc = "CD or WBINVD", .ucode = 0x2, }, { .uname = "IC_INVALIDATES_IC", .udesc = "Aliasing", .ucode = 0x4, }, { .uname = "IC_INVALIDATES_DC_DIRTY", .udesc = "Execution of modified instruction or data too close to code", .ucode = 0x8, }, { .uname = "IC_HITS_DC_CLEAN_LINE", .udesc = "Reading code", .ucode = 0x10, }, { .uname = "DC_PROBE_REJECTED_EARLY",
.udesc = "DC probe rejected early", .ucode = 0x20, }, { .uname = "DC_PROBE_REJECTED_LATE", .udesc = "DC probe rejected late", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* PDC Miss 0x162 */ static const amd64_umask_t amd64_fam16h_pdc_miss[]={ { .uname = "HOST_PDE_LEVEL", .udesc = "Host: PDE level", .ucode = 0x1, }, { .uname = "HOST_PDPE_LEVEL", .udesc = "Host: PDPE level", .ucode = 0x2, }, { .uname = "HOST_PML4E_LEVEL", .udesc = "Host: PML4E level", .ucode = 0x4, }, { .uname = "GUEST_PDE_LEVEL", .udesc = "Guest: PDE level", .ucode = 0x10, }, { .uname = "GUEST_PDPE_LEVEL", .udesc = "Guest: PDPE level", .ucode = 0x20, }, { .uname = "GUEST_PML4E_LEVEL", .udesc = "Guest: PML4E level", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x77, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* ITLB Miss 0x85 */ static const amd64_umask_t amd64_fam16h_itlb_miss[]={ { .uname = "4K_PAGE_FETCHES", .udesc = "Instruction fetches to a 4K page.", .ucode = 0x1, }, { .uname = "2M_PAGE_FETCHES", .udesc = "Instruction fetches to a 2M page.", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Instruction Cache Lines Invalidated 0x8c */ static const amd64_umask_t amd64_fam16h_instruction_cache_lines_invalidated[]={ { .uname = "IC_INVALIDATE_LS_PROBE", .udesc = "Instruction cache invalidate due to LS probe", .ucode = 0x1, }, { .uname = "IC_INVALIDATE_BU_PROBE", .udesc = "Instruction cache invalidate due to BU probe", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Retired indirect branch info (0x19a) */ static const amd64_umask_t amd64_fam16h_retired_branch_info[]={ { .uname = "RETIRED", .udesc = "Retired indirect branch instruction.", .ucode = 0x1, }, { .uname = "MISPREDICTED", .udesc = "Retired 
mispredicted near unconditional jump.", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* Retired MMX/FP instructions 0xcb */ static const amd64_umask_t amd64_fam16h_retired_mmx_and_fp_instructions[]={ { .uname = "X87", .udesc = "X87 instructions", .ucode = 0x1, }, { .uname = "SSE", .udesc = "SSE, SSE2, SSE3, MNI instructions", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; /* FPU exceptions 0xdb */ static const amd64_umask_t amd64_fam16h_fpu_exceptions[]={ { .uname = "X87_RECLASS_MICROFAULTS", .udesc = "X87 reclass microfaults", .ucode = 0x1, }, { .uname = "SSE_RETYPE_MICROFAULTS", .udesc = "SSE retype microfaults", .ucode = 0x2, }, { .uname = "SSE_RECLASS_MICROFAULTS", .udesc = "SSE reclass microfaults", .ucode = 0x4, }, { .uname = "SSE_AND_X87_MICROTRAPS", .udesc = "SSE and x87 microtraps", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_entry_t amd64_fam16h_pe[]={ { .name = "DISPATCHED_FPU", .desc = "Dispatched FPU Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x0, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_dispatched_fpu), .ngrp = 1, .umasks = amd64_fam16h_dispatched_fpu, }, { .name = "FP_SCHEDULER_EMPTY", .desc = "Cycles in which the FPU is Empty", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1, }, { .name = "DISPATCHED_FPU_OPS_FAST_FLAG", .desc = "Dispatched Fast Flag FPU Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x2, }, { .name = "RETIRED_SSE_AVX_OPERATIONS", .desc = "Retired SSE/AVX Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x3, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_retired_sse_operations), .ngrp = 1, .umasks = amd64_fam16h_retired_sse_operations, }, { .name = "RETIRED_SERIALIZING_OPS", .desc = "Retired Serializing Ops", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x5, 
.numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_retired_serializing_ops), .ngrp = 1, .umasks = amd64_fam16h_retired_serializing_ops, }, { .name = "RETIRED_X87_OPERATIONS", .desc = "Retired x87 operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x11, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_retired_x87_ops), .ngrp = 1, .umasks = amd64_fam16h_retired_x87_ops, }, { .name = "SEGMENT_REGISTER_LOADS", .desc = "Segment Register Loads", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x20, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_segment_register_loads), .ngrp = 1, .umasks = amd64_fam16h_segment_register_loads, }, { .name = "PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE", .desc = "Pipeline Restart Due to Self-Modifying Code", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x21, }, { .name = "PIPELINE_RESTART_DUE_TO_PROBE_HIT", .desc = "Pipeline Restart Due to Probe Hit", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x22, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_pipeline_restart), .ngrp = 1, .umasks = amd64_fam16h_pipeline_restart, }, { .name = "LOCKED_OPS", .desc = "Locked Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x24, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_locked_ops), .ngrp = 1, .umasks = amd64_fam16h_locked_ops, }, { .name = "RETIRED_CLFLUSH_INSTRUCTIONS", .desc = "Retired CLFLUSH Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x26, }, { .name = "RETIRED_CPUID_INSTRUCTIONS", .desc = "Retired CPUID Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x27, }, { .name = "LS_DISPATCH", .desc = "Transactions dispatched to load-store unit", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x29, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_ls_dispatch), .ngrp = 1, .umasks = amd64_fam16h_ls_dispatch, }, { .name = "CANCELLED_STORE_TO_LOAD_FORWARD_OPERATIONS", .desc = "Cancelled Store to Load Forward Operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x2a, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_cancelled_store_to_load_forward_operations), .ngrp = 1, .umasks = 
amd64_fam16h_cancelled_store_to_load_forward_operations, }, { .name = "DATA_CACHE_ACCESSES", .desc = "Data Cache Accesses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x40, }, { .name = "DATA_CACHE_MISSES", .desc = "Data Cache Misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x41, }, { .name = "DATA_CACHE_REFILLS", .desc = "Data Cache Refills from L2 or Northbridge", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x42, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_data_cache_refills), .ngrp = 1, .umasks = amd64_fam16h_data_cache_refills, }, { .name = "DATA_CACHE_REFILLS_FROM_NORTHBRIDGE", .desc = "Data Cache Refills from the Northbridge", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x43, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_data_cache_refills_from_system), .ngrp = 1, .umasks = amd64_fam16h_data_cache_refills_from_system, }, { .name = "DATA_CACHE_LINES_EVICTED", .desc = "Data Cache Lines Evicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x44, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_data_cache_lines_evicted), .ngrp = 1, .umasks = amd64_fam16h_data_cache_lines_evicted, }, { .name = "L1_DTLB_MISS_AND_L2_DTLB_HIT", .desc = "L1 DTLB Miss and L2 DTLB Hit", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x45, }, { .name = "DTLB_MISS", .desc = "L1 DTLB and L2 DTLB Miss", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x46, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_dtlb_miss), .ngrp = 1, .umasks = amd64_fam16h_dtlb_miss, }, { .name = "MISALIGNED_ACCESSES", .desc = "Misaligned Accesses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x47, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_misaligned_accesses), .ngrp = 1, .umasks = amd64_fam16h_misaligned_accesses, }, { .name = "PREFETCH_INSTRUCTIONS_DISPATCHED", .desc = "Prefetch Instructions Dispatched", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4b, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_prefetch_instructions_dispatched), .ngrp = 1, .umasks = amd64_fam16h_prefetch_instructions_dispatched, }, { .name = "DCACHE_MISSES_BY_LOCKED_INSTRUCTIONS", .desc = "DCACHE Misses by Locked 
Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4c, }, { .name = "L1_DTLB_HIT", .desc = "L1 DTLB Hit", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x4d, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_l1_dtlb_hit), .ngrp = 1, .umasks = amd64_fam16h_l1_dtlb_hit, }, { .name = "INEFFECTIVE_SW_PREFETCHES", .desc = "Ineffective Software Prefetches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x52, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_ineffective_sw_prefetches), .ngrp = 1, .umasks = amd64_fam16h_ineffective_sw_prefetches, }, { .name = "GLOBAL_TLB_FLUSHES", .desc = "Global TLB Flushes", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x54, }, /* fam30h only */ { .name = "COMMAND_RELATED_UNCACHABLE", .desc = "Commands related to uncacheable memory and I/O", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x61, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_uncachable_memory), .ngrp = 1, .umasks = amd64_fam16h_uncachable_memory, }, { .name = "COMMAND_RELATED_READ_BLOCK", .desc = "Commands related to read block operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x62, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_read_block), .ngrp = 1, .umasks = amd64_fam16h_read_block, }, { .name = "COMMAND_RELATED_DIRTY", .desc = "Commands related to change dirty operations", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x63, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_change_dirty), .ngrp = 1, .umasks = amd64_fam16h_change_dirty, }, { .name = "MEMORY_REQUESTS", .desc = "Memory Requests by Type", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x65, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_memory_requests), .ngrp = 1, .umasks = amd64_fam16h_memory_requests, }, { .name = "DATA_PREFETCHES", .desc = "Data Prefetches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x67, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_data_prefetches), .ngrp = 1, .umasks = amd64_fam16h_data_prefetches, }, { .name = "MAB_REQUESTS", .desc = "Miss address buffer requests", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x68, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_mab_requests), .ngrp =
1, .umasks = amd64_fam16h_mab_requests, }, { .name = "MAB_WAIT_CYCLES", .desc = "Miss address buffer wait cycles", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x69, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_mab_requests), .ngrp = 1, .umasks = amd64_fam16h_mab_requests, }, { .name = "SYSTEM_RESPONSES", .desc = "L2I Responses by Coherency State", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x6c, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_system_responses), .ngrp = 1, .umasks = amd64_fam16h_system_responses, }, { .name = "DATA_WRITTEN_TO_SYSTEM", .desc = "16-byte transfers written to system", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x6d, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_data_written_to_system), .ngrp = 1, .umasks = amd64_fam16h_data_written_to_system, }, { .name = "CACHE_CROSS_INVALIDATES", .desc = "Internal probes causing cache lines to be invalidated", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x75, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_cache_cross_invalidates), .ngrp = 1, .umasks = amd64_fam16h_cache_cross_invalidates, }, { .name = "CPU_CLK_UNHALTED", .desc = "CPU Clocks not Halted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x76, }, { .name = "PDC_MISS", .desc = "Number of PDC misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x162, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_pdc_miss), .ngrp = 1, .umasks = amd64_fam16h_pdc_miss, }, { .name = "INSTRUCTION_CACHE_FETCHES", .desc = "Instruction Cache Fetches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x80, }, { .name = "INSTRUCTION_CACHE_MISSES", .desc = "Instruction Cache Misses", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x81, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_L2", .desc = "Instruction Cache Refills from L2", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x82, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM", .desc = "Instruction Cache Refills from System", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x83, }, { .name = "L1_ITLB_MISS_AND_L2_ITLB_HIT", .desc = "L1 ITLB Miss and L2 ITLB Hit", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x84, }, 
{ .name = "ITLB_MISS", .desc = "Instruction fetches that miss in 4k and 2M ITLB", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x85, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_itlb_miss), .ngrp = 1, .umasks = amd64_fam16h_itlb_miss, }, { .name = "INSTRUCTION_FETCH_STALL", .desc = "Instruction Fetch Stall", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x87, }, { .name = "RETURN_STACK_HITS", .desc = "Return Stack Hits", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x88, }, { .name = "RETURN_STACK_OVERFLOWS", .desc = "Return Stack Overflows", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x89, }, { .name = "INSTRUCTION_CACHE_VICTIMS", .desc = "Instruction Cache Victims", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x8b, }, { .name = "INSTRUCTION_CACHE_LINES_INVALIDATED", .desc = "Instruction Cache Lines Invalidated", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x8c, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_instruction_cache_lines_invalidated), .ngrp = 1, .umasks = amd64_fam16h_instruction_cache_lines_invalidated, }, { .name = "ITLB_RELOADS", .desc = "ITLB Reloads", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x99, }, { .name = "ITLB_RELOADS_ABORTED", .desc = "ITLB reloads aborted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x9a, }, { .name = "RETIRED_INDIRECT_BRANCH_INFO", .desc = "Retired indirect branch info", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x19a, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_retired_branch_info), .ngrp = 1, .umasks = amd64_fam16h_retired_branch_info, }, { .name = "RETIRED_INSTRUCTIONS", .desc = "Retired Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc0, }, { .name = "RETIRED_UOPS", .desc = "Retired uops", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc1, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS", .desc = "Retired Branch Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc2, }, { .name = "RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS", .desc = "Retired Mispredicted Branch Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc3, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .desc = 
"Retired Taken Branch Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc4, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Retired Taken Branch Instructions Mispredicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc5, }, { .name = "RETIRED_FAR_CONTROL_TRANSFERS", .desc = "Retired Far Control Transfers", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc6, }, { .name = "RETIRED_BRANCH_RESYNCS", .desc = "Retired Branch Resyncs", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc7, }, { .name = "RETIRED_NEAR_RETURNS", .desc = "Retired Near Returns", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc8, }, { .name = "RETIRED_NEAR_RETURNS_MISPREDICTED", .desc = "Retired Near Returns Mispredicted", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xc9, }, { .name = "RETIRED_MISPREDICTED_TAKEN", .desc = "Retired mispredicted taken branches due to target mismatch", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xca, }, { .name = "RETIRED_MMX_AND_FP_INSTRUCTIONS", .desc = "Retired MMX/FP Instructions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xcb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_retired_mmx_and_fp_instructions), .ngrp = 1, .umasks = amd64_fam16h_retired_mmx_and_fp_instructions, }, { .name = "INTERRUPTS_MASKED_CYCLES", .desc = "Interrupts-Masked Cycles", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xcd, }, { .name = "INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING", .desc = "Interrupts-Masked Cycles with Interrupt Pending", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xce, }, { .name = "INTERRUPTS_TAKEN", .desc = "Interrupts Taken", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xcf, }, { .name = "FPU_EXCEPTIONS", .desc = "FPU Exceptions", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdb, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam16h_fpu_exceptions), .ngrp = 1, .umasks = amd64_fam16h_fpu_exceptions, }, { .name = "DR0_BREAKPOINT_MATCHES", .desc = "DR0 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdc, }, { .name = "DR1_BREAKPOINT_MATCHES", .desc = "DR1 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, 
.code = 0xdd, }, { .name = "DR2_BREAKPOINT_MATCHES", .desc = "DR2 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xde, }, { .name = "DR3_BREAKPOINT_MATCHES", .desc = "DR3 Breakpoint Matches", .modmsk = AMD64_FAM10H_ATTRS, .code = 0xdf, }, { .name = "TAGGED_IBS_OPS", .desc = "Ops tagged by IBS", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1cf, }, { .name = "TAGGED_IBS_OPS_RETIRED", .desc = "Ops tagged by IBS that retired", .modmsk = AMD64_FAM10H_ATTRS, .code = 0x1d0, }, };

/* src/libpfm4/lib/events/amd64_events_fam17h_zen1.h */

/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux.
* * PMU: amd64_fam17h_zen1 (AMD64 Fam17h Zen1) */ static const amd64_umask_t amd64_fam17h_zen1_l1_itlb_miss_l2_itlb_miss[]={ { .uname = "IF1G", .udesc = "TBD", .ucode = 0x4, }, { .uname = "IF2M", .udesc = "TBD", .ucode = 0x2, }, { .uname = "IF4K", .udesc = "TBD", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_retired_mmx_fp_instructions[]={ { .uname = "SSE_INSTR", .udesc = "TBD", .ucode = 0x4, }, { .uname = "MMX_INSTR", .udesc = "TBD", .ucode = 0x2, }, { .uname = "X87_INSTR", .udesc = "TBD", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_tagged_ibs_ops[]={ { .uname = "IBS_COUNT_ROLLOVER", .udesc = "Number of times a uop could not be tagged by IBS because of a previous tagged uop that has not retired.", .ucode = 0x4, }, { .uname = "IBS_TAGGED_OPS_RET", .udesc = "Number of uops tagged by IBS that retired.", .ucode = 0x2, }, { .uname = "IBS_TAGGED_OPS", .udesc = "Number of uops tagged by IBS.", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_number_of_move_elimination_and_scalar_op_optimization[]={ { .uname = "OPTIMIZED", .udesc = "Number of scalar ops optimized.", .ucode = 0x8, }, { .uname = "OPT_POTENTIAL", .udesc = "Number of ops that are candidates for optimization (have z-bit either set or pass).", .ucode = 0x4, }, { .uname = "SSE_MOV_OPS_ELIM", .udesc = "Number of SSE move ops eliminated.", .ucode = 0x2, }, { .uname = "SSE_MOV_OPS", .udesc = "Number of SSE move ops.", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_retired_sse_avx_operations[]={ { .uname = "DP_MULT_ADD_FLOPS", .udesc = "Double precision multiply-add flops.", .ucode = 0x80, }, { .uname = "DP_DIV_FLOPS", .udesc = "Double precision divide/square root flops.", .ucode = 0x40, }, { .uname = "DP_MULT_FLOPS", .udesc = "Double precision multiply flops.", .ucode = 0x20, }, { .uname = "DP_ADD_SUB_FLOPS", .udesc = "Double precision add/subtract flops.", .ucode = 0x10, }, { .uname = "SP_MULT_ADD_FLOPS", .udesc = "Single precision
multiply-add flops.", .ucode = 0x8, }, { .uname = "SP_DIV_FLOPS", .udesc = "Single precision divide/square root flops.", .ucode = 0x4, }, { .uname = "SP_MULT_FLOPS", .udesc = "Single precision multiply flops.", .ucode = 0x2, }, { .uname = "SP_ADD_SUB_FLOPS", .udesc = "Single precision add/subtract flops.", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_retired_serializing_ops[]={ { .uname = "X87_CTRL_RET", .udesc = "X87 control word mispredict traps due to misprediction in RC or PC, or changes in mask bits.", .ucode = 0x8, }, { .uname = "X87_BOT_RET", .udesc = "X87 bottom-executing uops retired.", .ucode = 0x4, }, { .uname = "SSE_CTRL_RET", .udesc = "SSE control word mispredict traps due to mispredictions in RC, FTZ or DAZ or changes in mask bits.", .ucode = 0x2, }, { .uname = "SSE_BOT_RET", .udesc = "SSE bottom-executing uops retired.", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_retired_x87_floating_point_operations[]={ { .uname = "DIV_SQR_R_OPS", .udesc = "Divide and square root ops", .ucode = 0x4, }, { .uname = "MUL_OPS", .udesc = "Multiply ops", .ucode = 0x2, }, { .uname = "ADD_SUB_OPS", .udesc = "Add/subtract ops", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_fpu_pipe_assignment[]={ { .uname = "DUAL3", .udesc = "Total number of multi-pipe uops assigned to pipe3", .ucode = 0x80, }, { .uname = "DUAL2", .udesc = "Total number of multi-pipe uops assigned to pipe2", .ucode = 0x40, }, { .uname = "DUAL1", .udesc = "Total number of multi-pipe uops assigned to pipe1", .ucode = 0x20, }, { .uname = "DUAL0", .udesc = "Total number of multi-pipe uops assigned to pipe0", .ucode = 0x10, }, { .uname = "TOTAL3", .udesc = "Total number of uops assigned to pipe3", .ucode = 0x8, }, { .uname = "TOTAL2", .udesc = "Total number of uops assigned to pipe2", .ucode = 0x4, }, { .uname = "TOTAL1", .udesc = "Total number of uops assigned to pipe1", .ucode = 0x2, }, { .uname = "TOTAL0", .udesc = "Total number of uops assigned to
pipe0", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_instruction_cache_lines_invalidated[]={ { .uname = "L2_INVALIDATING_PROBE", .udesc = "IC line invalidated due to L2 invalidating probe (external or LS).", .ucode = 0x2, }, { .uname = "FILL_INVALIDATED", .udesc = "IC line invalidated due to overwriting fill response.", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_instruction_pipe_stall[]={ { .uname = "IC_STALL_ANY", .udesc = "IC pipe was stalled during this clock cycle for any reason (nothing valid in pipe ICM1).", .ucode = 0x4, }, { .uname = "IC_STALL_DQ_EMPTY", .udesc = "IC pipe was stalled during this clock cycle (including IC to OC fetches) due to DQ empty.", .ucode = 0x2, }, { .uname = "IC_STALL_BACK_PRESSURE", .udesc = "IC pipe was stalled during this clock cycle (including IC to OC fetches) due to back pressure.", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_core_to_l2_cacheable_request_access_status[]={ { .uname = "LS_RD_BLK_C_S", .udesc = "Load/Store ReadBlock C/S hit", .ucode = 0x80, }, { .uname = "LS_RD_BLK_L_HIT_X", .udesc = "Load/Store Readblock L hit eXclusive.", .ucode = 0x40, }, { .uname = "LS_RD_BLK_L_HIT_S", .udesc = "Load/Store ReadBlock L hit Shared.", .ucode = 0x20, }, { .uname = "LS_RD_BLK_X", .udesc = "Load/Store ReadblockX/ChangeToX hit eXclusive.", .ucode = 0x10, }, { .uname = "LS_RD_BLK_C", .udesc = "Load/Store ReadBlock C S L X Change To X Miss.", .ucode = 0x8, }, { .uname = "IC_FILL_HIT_X", .udesc = "Icache fill hit eXclusive.", .ucode = 0x4, }, { .uname = "IC_FILL_HIT_S", .udesc = "Icache fill hit Shared.", .ucode = 0x2, }, { .uname = "IC_FILL_MISS", .udesc = "Icache fill miss.", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_cycles_with_fill_pending_from_l2[]={ { .uname = "L2_FILL_BUSY", .udesc = "TBD", .ucode = 0x1, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam17h_zen1_l2_latency[]={ { .uname = "L2_CYCLES_WAITING_ON_FILLS", .udesc =
"TBD", .ucode = 0x1, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam17h_zen1_requests_to_l2_group1[]={ { .uname = "RD_BLK_L", .udesc = "TBD", .ucode = 0x80, }, { .uname = "RD_BLK_X", .udesc = "TBD", .ucode = 0x40, }, { .uname = "LS_RD_BLK_C_S", .udesc = "TBD", .ucode = 0x20, }, { .uname = "CACHEABLE_IC_READ", .udesc = "TBD", .ucode = 0x10, }, { .uname = "CHANGE_TO_X", .udesc = "TBD", .ucode = 0x8, }, { .uname = "PREFETCH_L2", .udesc = "TBD", .ucode = 0x4, }, { .uname = "L2_HW_PF", .udesc = "TBD", .ucode = 0x2, }, { .uname = "OTHER_REQUESTS", .udesc = "TBD", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_requests_to_l2_group2[]={ { .uname = "GROUP1", .udesc = "TBD", .ucode = 0x80, }, { .uname = "LS_RD_SIZED", .udesc = "TBD", .ucode = 0x40, }, { .uname = "LS_RD_SIZED_N_C", .udesc = "TBD", .ucode = 0x20, }, { .uname = "IC_RD_SIZED", .udesc = "TBD", .ucode = 0x10, }, { .uname = "IC_RD_SIZED_N_C", .udesc = "TBD", .ucode = 0x8, }, { .uname = "SMC_INVAL", .udesc = "TBD", .ucode = 0x4, }, { .uname = "BUS_LOCKS_ORIGINATOR", .udesc = "TBD", .ucode = 0x2, }, { .uname = "BUS_LOCKS_RESPONSES", .udesc = "TBD", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_ls_to_l2_wbc_requests[]={ { .uname = "WCB_WRITE", .udesc = "TBD", .ucode = 0x40, }, { .uname = "WCB_CLOSE", .udesc = "TBD", .ucode = 0x20, }, { .uname = "CACHE_LINE_FLUSH", .udesc = "TBD", .ucode = 0x10, }, { .uname = "I_LINE_FLUSH", .udesc = "TBD", .ucode = 0x8, }, { .uname = "ZERO_BYTE_STORE", .udesc = "TBD", .ucode = 0x4, }, { .uname = "LOCAL_IC_CLR", .udesc = "TBD", .ucode = 0x2, }, { .uname = "C_L_ZERO", .udesc = "TBD", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_ls_dispatch[]={ { .uname = "LD_ST_DISPATCH", .udesc = "Load/Store uops dispatched.", .ucode = 0x4, }, { .uname = "STORE_DISPATCH", .udesc = "Store uops dispatched.", .ucode = 0x2, }, { .uname = "LD_DISPATCH", .udesc = "Load uops dispatched.", .ucode = 0x1, }, }; static const amd64_umask_t 
amd64_fam17h_zen1_ineffective_software_prefetch[]={ { .uname = "MAB_MCH_CNT", .udesc = "TBD", .ucode = 0x2, }, { .uname = "DATA_PIPE_SW_PF_DC_HIT", .udesc = "TBD", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_l1_dtlb_miss[]={ { .uname = "TLB_RELOAD_1G_L2_MISS", .udesc = "TBD", .ucode = 0x80, }, { .uname = "TLB_RELOAD_2M_L2_MISS", .udesc = "TBD", .ucode = 0x40, }, { .uname = "TLB_RELOAD_COALESCED_PAGE_MISS", .udesc = "TBD", .ucode = 0x20, }, { .uname = "TLB_RELOAD_4K_L2_MISS", .udesc = "TBD", .ucode = 0x10, }, { .uname = "TLB_RELOAD_1G_L2_HIT", .udesc = "TBD", .ucode = 0x8, }, { .uname = "TLB_RELOAD_2M_L2_HIT", .udesc = "TBD", .ucode = 0x4, }, { .uname = "TLB_RELOAD_COALESCED_PAGE_HIT", .udesc = "TBD", .ucode = 0x2, }, { .uname = "TLB_RELOAD_4K_L2_HIT", .udesc = "TBD", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_locks[]={ { .uname = "SPEC_LOCK_MAP_COMMIT", .udesc = "TBD", .ucode = 0x8, }, { .uname = "SPEC_LOCK", .udesc = "TBD", .ucode = 0x4, }, { .uname = "NON_SPEC_LOCK", .udesc = "TBD", .ucode = 0x2, }, { .uname = "BUS_LOCK", .udesc = "TBD", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_mab_allocation_by_pipe[]={ { .uname = "TLB_PIPE_EARLY", .udesc = "TBD", .ucode = 0x10, }, { .uname = "HW_PF", .udesc = "hw_pf", .ucode = 0x8, }, { .uname = "TLB_PIPE_LATE", .udesc = "TBD", .ucode = 0x4, }, { .uname = "ST_PIPE", .udesc = "TBD", .ucode = 0x2, }, { .uname = "DATA_PIPE", .udesc = "TBD", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_prefetch_instructions_dispatched[]={ { .uname = "PREFETCH_NTA", .udesc = "Non-temporal prefetches.", .ucode = 0x4, }, { .uname = "STORE_PREFETCH_W", .udesc = "TBD", .ucode = 0x2, }, { .uname = "LOAD_PREFETCH_W", .udesc = "TBD", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_tablewalker_allocation[]={ { .uname = "ALLOC_ISIDE1", .udesc = "TBD", .ucode = 0x8, }, { .uname = "ALLOC_ISIDE0", .udesc = "TBD", .ucode = 0x4, }, { .uname = "ALLOC_DSIDE1", 
.udesc = "TBD", .ucode = 0x2, }, { .uname = "ALLOC_DSIDE0", .udesc = "TBD", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_oc_mode_switch[]={ { .uname = "OC_IC_MODE_SWITCH", .udesc = "TBD", .ucode = 0x2, }, { .uname = "IC_OC_MODE_SWITCH", .udesc = "TBD", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_dynamic_tokens_dispatch_stall_cycles_0[]={ { .uname = "RETIRE_TOKEN_STALL", .udesc = "Retire tokens unavailable", .ucode = 0x40, }, { .uname = "AGSQ_TOKEN_STALL", .udesc = "AGSQ tokens unavailable", .ucode = 0x20, }, { .uname = "ALU_TOKEN_STALL", .udesc = "ALU tokens unavailable", .ucode = 0x10, }, { .uname = "ALSQ3_0_TOKEN_STALL", .udesc = "TBD", .ucode = 0x8, }, { .uname = "ALSQ3_TOKEN_STALL", .udesc = "ALSQ3 tokens unavailable", .ucode = 0x4, }, { .uname = "ALSQ2_TOKEN_STALL", .udesc = "ALSQ2 tokens unavailable", .ucode = 0x2, }, { .uname = "ALSQ1_TOKEN_STALL", .udesc = "ALSQ1 tokens unavailable", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen1_software_prefetch_data_cache_fills[]={ { .uname = "MABRESP_LCL_L2", .udesc = "Fill from local L2.", .ucode = 0x1, }, { .uname = "LS_MABRESP_LCL_CACHE", .udesc = "Fill from another cache (home node local).", .ucode = 0x2, }, { .uname = "LS_MABRESP_LCL_DRAM", .udesc = "Fill from DRAM (home node local).", .ucode = 0x8, }, { .uname = "LS_MABRESP_LCL_RMT_CACHE", .udesc = "Fill from another cache (home node remote).", .ucode = 0x10, }, { .uname = "LS_MABRESP_LCL_RMT_DRAM", .udesc = "Fill from DRAM (home node remote).", .ucode = 0x40, }, }; static const amd64_umask_t amd64_fam17h_zen1_uops_dispatched_from_decoder[]={ { .uname = "DECODER_DISPATCHED", .udesc = "Number of uops dispatched from the Decoder", .ucode = 0x1, }, { .uname = "OPCACHE_DISPATCHED", .udesc = "Number of uops dispatched from the OpCache", .ucode = 0x2, }, }; static const amd64_umask_t amd64_fam17h_zen1_dispatch_resource_stall_cycles_1[]={ { .uname = "INT_PHY_REG_FILE_RSRC_STALL", .udesc = "Number of cycles 
stalled due to integer physical register file resource stalls. Applies to all uops that have integer destination register.", .ucode = 0x1, }, { .uname = "LOAD_QUEUE_RSRC_STALL", .udesc = "Number of cycles stalled due to load queue resource stalls. Applies to all uops with load semantics.", .ucode = 0x2, }, { .uname = "STORE_QUEUE_RSRC_STALL", .udesc = "Number of cycles stalled due to store queue resource stalls. Applies to all uops with store semantics.", .ucode = 0x4, }, { .uname = "INT_SCHEDULER_MISC_RSRC_STALL", .udesc = "Number of cycles stalled due to integer scheduler miscellaneous resource stalls.", .ucode = 0x8, }, { .uname = "TAKEN_BRANCH_BUFFER_RSRC_STALL", .udesc = "Number of cycles stalled due to taken branch buffer resource stalls.", .ucode = 0x10, }, { .uname = "FP_REG_FILE_RSRC_STALL", .udesc = "Number of cycles stalled due to floating-point register file resource stalls.", .ucode = 0x20, }, { .uname = "FP_SCHEDULER_FILE_RSRC_STALL", .udesc = "Number of cycles stalled due to floating-point scheduler resource stalls.", .ucode = 0x40, }, { .uname = "FP_MISC_FILE_RSRC_STALL", .udesc = "Number of cycles stalled due to floating-point miscellaneous resource unavailable.", .ucode = 0x80, }, }; static const amd64_umask_t amd64_fam17h_zen1_dispatch_resource_stall_cycles_0[]={ { .uname = "ALSQ1_RSRC_STALL", .udesc = "ALSQ1 resources unavailable.", .ucode = 0x1, }, { .uname = "ALSQ2_RSRC_STALL", .udesc = "ALSQ2 resources unavailable.", .ucode = 0x2, }, { .uname = "ALSQ3_RSRC_STALL", .udesc = "ALSQ3 resources unavailable.", .ucode = 0x4, }, { .uname = "ALSQ3_0_RSRC_STALL", .udesc = "TBD", .ucode = 0x8, }, { .uname = "ALU_RSRC_STALL", .udesc = "ALU resource total unavailable", .ucode = 0x10, }, { .uname = "AGSQ_RSRC_STALL", .udesc = "AGSQ resource unavailable", .ucode = 0x20, }, { .uname = "RETIRE_RSRC_STALL", .udesc = "RETIRE resource unavailable", .ucode = 0x40, }, }; static const amd64_umask_t amd64_fam17h_zen1_l2_prefetch_hit_l2[]={ { .uname = "ANY", .udesc = 
"Any L2 prefetch requests", .ucode = 0x3f, .uflags = AMD64_FL_DFL, }, }; static const amd64_entry_t amd64_fam17h_zen1_pe[]={ { .name = "L1_ITLB_MISS_L2_ITLB_HIT", .desc = "The number of instruction fetches that miss in the L1 ITLB but hit in the L2 ITLB.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x84, .flags = 0, .ngrp = 0, }, { .name = "L1_ITLB_MISS_L2_ITLB_MISS", .desc = "The number of instruction fetches that miss in both the L1 and L2 TLBs.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x85, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_l1_itlb_miss_l2_itlb_miss), .umasks = amd64_fam17h_zen1_l1_itlb_miss_l2_itlb_miss, }, { .name = "PIPELINE_RESTART_DUE_TO_INSTRUCTION_STREAM_PROBE", .desc = "The number of pipeline restarts caused by invalidating probes that hit on the instruction stream currently being executed. This would happen if the active instruction stream was being modified by another processor in an MP system - typically a highly unlikely event.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x86, .flags = 0, .ngrp = 0, }, { .name = "ITLB_RELOADS", .desc = "The number of ITLB reload requests.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x99, .flags = 0, .ngrp = 0, }, { .name = "DIV_CYCLES_BUSY_COUNT", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xd3, .flags = 0, .ngrp = 0, }, { .name = "DIV_OP_COUNT", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xd4, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS", .desc = "The number of branch instructions retired. This includes all types of architectural control flow changes, including exceptions and interrupts.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc2, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_FAR_CONTROL_TRANSFERS", .desc = "The number of far control transfers retired including far call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and interrupts. 
Far control transfers are not subject to branch prediction.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc6, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_INDIRECT_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xca, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "The number of branch instructions retired, of any type, that were not correctly predicted. This includes those for which prediction is not attempted (far control transfers, exceptions and interrupts).", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc3, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_BRANCH_RESYNCS", .desc = "The number of resync branches. These reflect pipeline restarts due to certain microcode assists and events such as writes to the active instruction stream, among other things. Each occurrence reflects a restart penalty similar to a branch mispredict. This is relatively rare.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc7, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .desc = "The number of taken branches that were retired. This includes all types of architectural control flow changes, including exceptions and interrupts.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc4, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "The number of retired taken branch instructions that were mispredicted.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc5, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_CONDITIONAL_BRANCH_INSTRUCTIONS", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xd1, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_CONDITIONAL_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xd2, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_UOPS", .desc = "The number of uops retired. This includes all processor activity (instructions, exceptions, interrupts, microcode assists, etc.). 
The number of events logged per cycle can vary from 0 to 4.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc1, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_FUSED_BRANCH_INSTRUCTIONS", .desc = "The number of fused branch instructions retired per cycle. The number of events logged per cycle can vary from 0 to 3.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x1d0, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_INSTRUCTIONS", .desc = "Instructions Retired.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc0, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_MMX_FP_INSTRUCTIONS", .desc = "The number of MMX, SSE or x87 instructions retired. The UnitMask allows the selection of the individual classes of instructions as given in the table. Each increment represents one complete instruction. Since this event includes non-numeric instructions it is not suitable for measuring MFLOPS.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xcb, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_retired_mmx_fp_instructions), .umasks = amd64_fam17h_zen1_retired_mmx_fp_instructions, }, { .name = "RETIRED_NEAR_RETURNS", .desc = "The number of near return instructions (RET or RETI) retired.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc8, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_NEAR_RETURNS_MISPREDICTED", .desc = "The number of near returns retired that were not correctly predicted by the return address predictor. Each such mispredict incurs the same penalty as a mispredicted conditional branch instruction.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc9, .flags = 0, .ngrp = 0, }, { .name = "TAGGED_IBS_OPS", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x1cf, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_tagged_ibs_ops), .umasks = amd64_fam17h_zen1_tagged_ibs_ops, }, { .name = "NUMBER_OF_MOVE_ELIMINATION_AND_SCALAR_OP_OPTIMIZATION", .desc = "This is a dispatch based speculative event. 
It is useful for measuring the effectiveness of the Move elimination and Scalar code optimization schemes.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x4, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_number_of_move_elimination_and_scalar_op_optimization), .umasks = amd64_fam17h_zen1_number_of_move_elimination_and_scalar_op_optimization, }, { .name = "RETIRED_SSE_AVX_OPERATIONS", .desc = "This is a retire-based event. The number of retired SSE/AVX FLOPS. The number of events logged per cycle can vary from 0 to 64. This event can count above 15.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x3, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_retired_sse_avx_operations), .umasks = amd64_fam17h_zen1_retired_sse_avx_operations, }, { .name = "RETIRED_SERIALIZING_OPS", .desc = "The number of serializing Ops retired.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x5, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_retired_serializing_ops), .umasks = amd64_fam17h_zen1_retired_serializing_ops, }, { .name = "RETIRED_X87_FLOATING_POINT_OPERATIONS", .desc = "The number of x87 floating-point Ops that have retired. The number of events logged per cycle can vary from 0 to 8.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x2, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_retired_x87_floating_point_operations), .umasks = amd64_fam17h_zen1_retired_x87_floating_point_operations, }, { .name = "FP_SCHEDULER_EMPTY", .desc = "This is a speculative event. The number of cycles in which the FPU scheduler is empty. Note that some Ops like FP loads bypass the scheduler. Invert this to count cycles in which at least one FPU operation is present in the FPU.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x1, .flags = 0, .ngrp = 0, }, { .name = "FPU_PIPE_ASSIGNMENT", .desc = "The number of operations (uOps) and dual-pipe uOps dispatched to each of the 4 FPU execution pipelines. 
This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one-cycle dispatch event. This event is a speculative event. (See Core::X86::Pmc::Core::ExRetMmxFpInstr). Since this event includes non-numeric operations it is not suitable for measuring MFLOPS.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x0, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_fpu_pipe_assignment), .umasks = amd64_fam17h_zen1_fpu_pipe_assignment, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_L2", .desc = "The number of 64-byte instruction cachelines that were fulfilled by the L2 cache.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x82, .flags = 0, .ngrp = 0, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM", .desc = "The number of 64-byte instruction cachelines fulfilled from system memory or another cache.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x83, .flags = 0, .ngrp = 0, }, { .name = "INSTRUCTION_CACHE_LINES_INVALIDATED", .desc = "The number of instruction cachelines invalidated. 
A non-SMC event is CMC (cross modifying code), either from the other thread of the core or another core.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x8c, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_instruction_cache_lines_invalidated), .umasks = amd64_fam17h_zen1_instruction_cache_lines_invalidated, }, { .name = "INSTRUCTION_PIPE_STALL", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x87, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_instruction_pipe_stall), .umasks = amd64_fam17h_zen1_instruction_pipe_stall, }, { .name = "32_BYTE_INSTRUCTION_CACHE_FETCH", .desc = "The number of 32B fetch windows transferred from IC pipe to DE instruction decoder (includes non-cacheable and cacheable fill responses).", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x80, .flags = 0, .ngrp = 0, }, { .name = "32_BYTE_INSTRUCTION_CACHE_MISSES", .desc = "The number of 32B fetch windows that tried to read the L1 IC and missed in the full tag.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x81, .flags = 0, .ngrp = 0, }, { .name = "CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS", .desc = "This event does not count accesses to the L2 cache by the L2 prefetcher, but it does count accesses by the L1 prefetcher.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x64, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_core_to_l2_cacheable_request_access_status), .umasks = amd64_fam17h_zen1_core_to_l2_cacheable_request_access_status, }, { .name = "CYCLES_WITH_FILL_PENDING_FROM_L2", .desc = "Total cycles spent with one or more fill requests in flight from L2.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x6d, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_cycles_with_fill_pending_from_l2), .umasks = amd64_fam17h_zen1_cycles_with_fill_pending_from_l2, }, { .name = "L2_LATENCY", .desc = "Total cycles spent waiting for L2 fills to complete from L3 or memory, divided by four. 
This may be used to calculate average latency by multiplying this count by four and then dividing by the total number of L2 fills (umask L2RequestG1). Event counts are for both threads. To calculate average latency, the number of fills from both threads must be used.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x62, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_l2_latency), .umasks = amd64_fam17h_zen1_l2_latency, }, { .name = "REQUESTS_TO_L2_GROUP1", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x60, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_requests_to_l2_group1), .umasks = amd64_fam17h_zen1_requests_to_l2_group1, }, { .name = "REQUESTS_TO_L2_GROUP2", .desc = "Multi-events in that LS and IF requests can be received simultaneously.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x61, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_requests_to_l2_group2), .umasks = amd64_fam17h_zen1_requests_to_l2_group2, }, { .name = "LS_TO_L2_WBC_REQUESTS", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x63, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_ls_to_l2_wbc_requests), .umasks = amd64_fam17h_zen1_ls_to_l2_wbc_requests, }, { .name = "DATA_CACHE_ACCESSES", .desc = "The number of accesses to the data cache for load and store references. This may include certain microcode scratchpad accesses, although these are generally rare. Each increment represents an eight-byte access, although the instruction may only be accessing a portion of that. This event is a speculative event.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x40, .flags = 0, .ngrp = 0, }, { .name = "LS_DISPATCH", .desc = "Counts the number of operations dispatched to the LS unit. 
Unit Masks ADDed.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x29, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_ls_dispatch), .umasks = amd64_fam17h_zen1_ls_dispatch, }, { .name = "INEFFECTIVE_SOFTWARE_PREFETCH", .desc = "The number of software prefetches that did not fetch data outside of the processor core.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x52, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_ineffective_software_prefetch), .umasks = amd64_fam17h_zen1_ineffective_software_prefetch, }, { .name = "SOFTWARE_PREFETCH_DATA_CACHE_FILLS", .desc = "Number of software prefetch fills by data source", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x59, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_software_prefetch_data_cache_fills), .umasks = amd64_fam17h_zen1_software_prefetch_data_cache_fills, }, { .name = "HARDWARE_PREFETCH_DATA_CACHE_FILLS", .desc = "Number of hardware prefetch fills by data source", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x5a, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_software_prefetch_data_cache_fills), .umasks = amd64_fam17h_zen1_software_prefetch_data_cache_fills, /* shared */ }, { .name = "L1_DTLB_MISS", .desc = "L1 Data TLB misses.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x45, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_l1_dtlb_miss), .umasks = amd64_fam17h_zen1_l1_dtlb_miss, }, { .name = "LOCKS", .desc = "Lock operations. 
Unit masks ORed", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x25, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_locks), .umasks = amd64_fam17h_zen1_locks, }, { .name = "MAB_ALLOCATION_BY_PIPE", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x41, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_mab_allocation_by_pipe), .umasks = amd64_fam17h_zen1_mab_allocation_by_pipe, }, { .name = "MISALIGNED_LOADS", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x47, .flags = 0, .ngrp = 0, }, { .name = "CYCLES_NOT_IN_HALT", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x76, .flags = 0, .ngrp = 0, }, { .name = "PREFETCH_INSTRUCTIONS_DISPATCHED", .desc = "Software Prefetch Instructions Dispatched.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x4b, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_prefetch_instructions_dispatched), .umasks = amd64_fam17h_zen1_prefetch_instructions_dispatched, }, { .name = "STORE_TO_LOAD_FORWARD", .desc = "Number of Store-to-Load Forward hits.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x35, .flags = 0, .ngrp = 0, }, { .name = "TABLEWALKER_ALLOCATION", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x46, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_tablewalker_allocation), .umasks = amd64_fam17h_zen1_tablewalker_allocation, }, { .name = "L1_BTB_CORRECTION", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x8a, .flags = 0, .ngrp = 0, }, { .name = "L2_BTB_CORRECTION", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x8b, .flags = 0, .ngrp = 0, }, { .name = "OC_MODE_SWITCH", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x28a, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_oc_mode_switch), .umasks = amd64_fam17h_zen1_oc_mode_switch, }, { .name = "DYNAMIC_TOKENS_DISPATCH_STALLS_CYCLES_0", .desc = "Cycles where a dispatch group is valid but does not get dispatched due to a token stall.", .modmsk = 
AMD64_FAM17H_ATTRS, .code = 0xaf, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_dynamic_tokens_dispatch_stall_cycles_0), .umasks = amd64_fam17h_zen1_dynamic_tokens_dispatch_stall_cycles_0, }, { .name = "UOPS_DISPATCHED_FROM_DECODER", .desc = "Number of uops dispatched from either the Decoder, OpCache or both", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xaa, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_uops_dispatched_from_decoder), .umasks = amd64_fam17h_zen1_uops_dispatched_from_decoder, }, { .name = "DISPATCH_RESOURCE_STALL_CYCLES_1", .desc = "Number of cycles where a dispatch group is valid but does not get dispatched due to a Token Stall", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xae, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_dispatch_resource_stall_cycles_1), .umasks = amd64_fam17h_zen1_dispatch_resource_stall_cycles_1, }, { .name = "L2_PREFETCH_HIT_L2", .desc = "Number of L2 prefetcher hits in the L2", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x70, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_l2_prefetch_hit_l2), .umasks = amd64_fam17h_zen1_l2_prefetch_hit_l2, }, { .name = "L2_PREFETCH_HIT_L3", .desc = "Number of L2 prefetcher hits in the L3", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x71, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_l2_prefetch_hit_l2), .umasks = amd64_fam17h_zen1_l2_prefetch_hit_l2, /* shared */ }, { .name = "L2_PREFETCH_MISS_L3", .desc = "Number of L2 prefetcher misses in the L3", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x72, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_l2_prefetch_hit_l2), .umasks = amd64_fam17h_zen1_l2_prefetch_hit_l2, /* shared */ }, { .name = "DYNAMIC_INDIRECT_PREDICTIONS", .desc = "Indirect Branch Prediction for potential multi-target branch (speculative)", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x8e, .flags = 0, }, { .name = "DECODER_OVERRIDES_PREDICTION", .desc = "Decoder Overrides 
Existing Branch Prediction (speculative)", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x91, .flags = 0, }, }; 
src/libpfm4/lib/events/amd64_events_fam17h_zen2.h: /* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
 * * PMU: amd64_fam17h_zen2 (AMD64 Fam17h Zen2) */ static const amd64_umask_t amd64_fam17h_zen2_l1_itlb_miss_l2_itlb_miss[]={ { .uname = "IF1G", .udesc = "Number of instruction fetches to a 1GB page", .ucode = 0x4, }, { .uname = "IF2M", .udesc = "Number of instruction fetches to a 2MB page", .ucode = 0x2, }, { .uname = "IF4K", .udesc = "Number of instruction fetches to a 4KB page", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen2_itlb_fetch_hit[]={ { .uname = "IF1G", .udesc = "L1 instruction fetch that hit a 1GB page.", .ucode = 0x4, }, { .uname = "IF2M", .udesc = "L1 instruction fetch that hit a 2MB page.", .ucode = 0x2, }, { .uname = "IF4K", .udesc = "L1 instruction fetch that hit a 4KB page.", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen2_retired_mmx_fp_instructions[]={ { .uname = "SSE_INSTR", .udesc = "Number of SSE instructions (SSE, SSE2, SSE3, SSSE3, SSE4A, SSE41, SSE42, AVX).", .ucode = 0x4, }, { .uname = "MMX_INSTR", .udesc = "Number of MMX instructions.", .ucode = 0x2, }, { .uname = "X87_INSTR", .udesc = "Number of X87 instructions.", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen2_tagged_ibs_ops[]={ { .uname = "IBS_COUNT_ROLLOVER", .udesc = "Number of times a uop could not be tagged by IBS because of a previous tagged uop that has not retired.", .ucode = 0x4, }, { .uname = "IBS_TAGGED_OPS_RET", .udesc = "Number of uops tagged by IBS that retired.", .ucode = 0x2, }, { .uname = "IBS_TAGGED_OPS", .udesc = "Number of uops tagged by IBS.", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen2_core_to_l2_cacheable_request_access_status[]={ { .uname = "LS_RD_BLK_C_S", .udesc = "Number of data cache shared reads hitting in the L2.", .ucode = 0x80, }, { .uname = "LS_RD_BLK_L_HIT_X", .udesc = "Number of data cache reads hitting in the L2.", .ucode = 0x40, }, { .uname = "LS_RD_BLK_L_HIT_S", .udesc = "Number of data cache reads hitting a shared line in the L2.", .ucode = 0x20, }, { .uname = 
"LS_RD_BLK_X", .udesc = "Number of data cache store or state change (to exclusive) requests hitting in the L2.", .ucode = 0x10, }, { .uname = "LS_RD_BLK_C", .udesc = "Number of data cache fill requests missing in the L2 (all types).", .ucode = 0x8, }, { .uname = "IC_FILL_HIT_X", .udesc = "Number of I-cache fill requests hitting a modifiable (exclusive) line in the L2.", .ucode = 0x4, }, { .uname = "IC_FILL_HIT_S", .udesc = "Number of I-cache fill requests hitting a clean line in the L2.", .ucode = 0x2, }, { .uname = "IC_FILL_MISS", .udesc = "Number of I-cache fill requests missing the L2.", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen2_l2_prefetch_hit_l2[]={ { .uname = "ANY", .udesc = "Any L2 prefetch requests", .ucode = 0x1f, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam17h_zen2_requests_to_l2_group1[]={ { .uname = "RD_BLK_L", .udesc = "Number of data cache reads (including software and hardware prefetches).", .ucode = 0x80, }, { .uname = "RD_BLK_X", .udesc = "Number of data cache stores", .ucode = 0x40, }, { .uname = "LS_RD_BLK_C_S", .udesc = "Number of data cache shared reads.", .ucode = 0x20, }, { .uname = "CACHEABLE_IC_READ", .udesc = "Number of instruction cache reads.", .ucode = 0x10, }, { .uname = "CHANGE_TO_X", .udesc = "Number of requests that change to writable. 
Check L2 for current state.", .ucode = 0x8, }, { .uname = "PREFETCH_L2", .udesc = "TBD", .ucode = 0x4, }, { .uname = "L2_HW_PF", .udesc = "Number of prefetches accepted by L2 pipeline, hit or miss.", .ucode = 0x2, }, { .uname = "GROUP2", .udesc = "Number of miscellaneous requests covered in more detail by REQUESTS_TO_L2_GROUP2", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen2_requests_to_l2_group2[]={ { .uname = "GROUP1", .udesc = "Number of miscellaneous requests covered in more detail by REQUESTS_TO_L2_GROUP1", .ucode = 0x80, }, { .uname = "LS_RD_SIZED", .udesc = "Number of data cache reads sized.", .ucode = 0x40, }, { .uname = "LS_RD_SIZED_N_C", .udesc = "Number of data cache reads sized non-cacheable.", .ucode = 0x20, }, { .uname = "IC_RD_SIZED", .udesc = "Number of instruction cache reads sized.", .ucode = 0x10, }, { .uname = "IC_RD_SIZED_N_C", .udesc = "Number of instruction cache reads sized non-cacheable.", .ucode = 0x8, }, { .uname = "SMC_INVAL", .udesc = "Number of self-modifying code invalidates.", .ucode = 0x4, }, { .uname = "BUS_LOCKS_ORIGINATOR", .udesc = "Number of bus locks.", .ucode = 0x2, }, { .uname = "BUS_LOCKS_RESPONSES", .udesc = "Number of bus lock responses.", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen2_bad_status_2[]={ { .uname = "STLI_OTHER", .udesc = "Store-to-load conflicts. 
A load was unable to complete due to a non-forwardable conflict with an older store.", .ucode = 0x2, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam17h_zen2_retired_lock_instructions[]={ { .uname = "CACHEABLE_LOCKS", .udesc = "Lock in cacheable memory region.", .ucode = 0xe, }, { .uname = "BUS_LOCK", .udesc = "Number of bus locks", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen2_tlb_flushes[]={ { .uname = "ALL", .udesc = "ANY TLB flush.", .ucode = 0xff, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam17h_zen2_ls_dispatch[]={ { .uname = "LD_ST_DISPATCH", .udesc = "Load/Store single uops dispatched (compare-and-exchange).", .ucode = 0x4, }, { .uname = "STORE_DISPATCH", .udesc = "Store uops dispatched.", .ucode = 0x2, }, { .uname = "LD_DISPATCH", .udesc = "Load uops dispatched.", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen2_ineffective_software_prefetch[]={ { .uname = "MAB_MCH_CNT", .udesc = "Software prefetch instructions saw a match on an already allocated miss request buffer.", .ucode = 0x2, }, { .uname = "DATA_PIPE_SW_PF_DC_HIT", .udesc = "Software Prefetch instruction saw a DC hit", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen2_software_prefetch_data_cache_fills[]={ { .uname = "MABRESP_LCL_L2", .udesc = "Fill from local L2.", .ucode = 0x1, }, { .uname = "LS_MABRESP_LCL_CACHE", .udesc = "Fill from another cache (home node local).", .ucode = 0x2, }, { .uname = "LS_MABRESP_LCL_DRAM", .udesc = "Fill from DRAM (home node local).", .ucode = 0x8, }, { .uname = "LS_MABRESP_RMT_CACHE", .udesc = "Fill from another cache (home node remote).", .ucode = 0x10, }, { .uname = "LS_MABRESP_RMT_DRAM", .udesc = "Fill from DRAM (home node remote).", .ucode = 0x40, }, }; static const amd64_umask_t amd64_fam17h_zen2_store_commit_cancels_2[]={ { .uname = "WCB_FULL", .udesc = "Non cacheable store and the non-cacheable commit buffer is full.", .ucode = 0x1, .uflags = AMD64_FL_DFL, }, }; static 
const amd64_umask_t amd64_fam17h_zen2_l1_dtlb_miss[]={ { .uname = "TLB_RELOAD_1G_L2_MISS", .udesc = "Data TLB reload to a 1GB page that missed in the L2 TLB", .ucode = 0x80, }, { .uname = "TLB_RELOAD_2M_L2_MISS", .udesc = "Data TLB reload to a 2MB page that missed in the L2 TLB", .ucode = 0x40, }, { .uname = "TLB_RELOAD_COALESCED_PAGE_MISS", .udesc = "Data TLB reload to coalesced pages that missed", .ucode = 0x20, }, { .uname = "TLB_RELOAD_4K_L2_MISS", .udesc = "Data TLB reload to a 4KB page that missed in the L2 TLB", .ucode = 0x10, }, { .uname = "TLB_RELOAD_1G_L2_HIT", .udesc = "Data TLB reload to a 1GB page that hit in the L2 TLB", .ucode = 0x8, }, { .uname = "TLB_RELOAD_2M_L2_HIT", .udesc = "Data TLB reload to a 2MB page that hit in the L2 TLB", .ucode = 0x4, }, { .uname = "TLB_RELOAD_COALESCED_PAGE_HIT", .udesc = "Data TLB reload to coalesced pages that hit", .ucode = 0x2, }, { .uname = "TLB_RELOAD_4K_L2_HIT", .udesc = "Data TLB reload to a 4KB page that hit in the L2 TLB", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen2_mab_allocation_by_pipe[]={ { .uname = "TLB_PIPE_EARLY", .udesc = "TBD", .ucode = 0x10, }, { .uname = "HW_PF", .udesc = "hw_pf", .ucode = 0x8, }, { .uname = "TLB_PIPE_LATE", .udesc = "TBD", .ucode = 0x4, }, { .uname = "ST_PIPE", .udesc = "TBD", .ucode = 0x2, }, { .uname = "DATA_PIPE", .udesc = "TBD", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam17h_zen2_prefetch_instructions_dispatched[]={ { .uname = "PREFETCH_T0_T1_T2", .udesc = "Number of prefetcht0, prefetcht1, prefetcht2 instructions dispatched", .ucode = 0x1, }, { .uname = "PREFETCHW", .udesc = "Number of prefetchw instructions dispatched", .ucode = 0x2, }, { .uname = "PREFETCHNTA", .udesc = "Number of prefetchnta instructions dispatched", .ucode = 0x4, }, { .uname = "ANY", .udesc = "Any prefetch", .ucode = 0x7, .uflags = AMD64_FL_DFL | AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam17h_zen2_uops_dispatched_from_decoder[]={ { .uname =
"DECODER_DISPATCHED", .udesc = "Number of uops dispatched from the Decoder", .ucode = 0x1, }, { .uname = "OPCACHE_DISPATCHED", .udesc = "Number of uops dispatched from the OpCache", .ucode = 0x2, }, }; static const amd64_umask_t amd64_fam17h_zen2_dispatch_resource_stall_cycles_1[]={ { .uname = "INT_PHY_REG_FILE_RSRC_STALL", .udesc = "Number of cycles stalled due to integer physical register file resource stalls. Applies to all uops that have integer destination register.", .ucode = 0x1, }, { .uname = "LOAD_QUEUE_RSRC_STALL", .udesc = "Number of cycles stalled due to load queue resource stalls. Applies to all uops with load semantics.", .ucode = 0x2, }, { .uname = "STORE_QUEUE_RSRC_STALL", .udesc = "Number of cycles stalled due to store queue resource stalls. Applies to all uops with store semantics.", .ucode = 0x4, }, { .uname = "INT_SCHEDULER_MISC_RSRC_STALL", .udesc = "Number of cycles stalled due to integer scheduler miscellaneous resource stalls.", .ucode = 0x8, }, { .uname = "TAKEN_BRANCH_BUFFER_RSRC_STALL", .udesc = "Number of cycles stalled due to taken branch buffer resource stalls.", .ucode = 0x10, }, { .uname = "FP_REG_FILE_RSRC_STALL", .udesc = "Number of cycles stalled due to floating-point register file resource stalls.", .ucode = 0x20, }, { .uname = "FP_SCHEDULER_FILE_RSRC_STALL", .udesc = "Number of cycles stalled due to floating-point scheduler resource stalls.", .ucode = 0x40, }, { .uname = "FP_MISC_FILE_RSRC_STALL", .udesc = "Number of cycles stalled due to floating-point miscellaneous resource unavailable.", .ucode = 0x80, }, }; static const amd64_umask_t amd64_fam17h_zen2_dispatch_resource_stall_cycles_0[]={ { .uname = "ALU_TOKEN_STALL", .udesc = "Number of cycles ALU tokens total unavailable.", .ucode = 0x8, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam17h_zen2_retired_serializing_ops[]={ { .uname = "X87_CTRL_RET", .udesc = "X87 control word mispredict traps due to misprediction in RC or PC, or changes in mask bits.", .ucode
= 0x1, }, { .uname = "X87_BOT_RET", .udesc = "X87 bottom-executing uops retired.", .ucode = 0x2, }, { .uname = "SSE_CTRL_RET", .udesc = "SSE control word mispredict traps due to mispredictions in RC, FTZ or DAZ or changes in mask bits.", .ucode = 0x4, }, { .uname = "SSE_BOT_RET", .udesc = "SSE bottom-executing uops retired.", .ucode = 0x8, }, }; static const amd64_umask_t amd64_fam17h_zen2_retired_sse_avx_flops[]={ { .uname = "ADD_SUB_FLOPS", .udesc = "Addition/subtraction FLOPS", .ucode = 0x1, }, { .uname = "MULT_FLOPS", .udesc = "Multiplication FLOPS", .ucode = 0x2, }, { .uname = "DIV_FLOPS", .udesc = "Division FLOPS.", .ucode = 0x4, }, { .uname = "MAC_FLOPS", .udesc = "Multiply-accumulate FLOPS. Each MAC operation is counted as 2 FLOPS.", .ucode = 0x8, }, { .uname = "ANY", .udesc = "Any FLOPS.", .ucode = 0xf, .uflags = AMD64_FL_DFL | AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam17h_zen2_fp_dispatch_faults[]={ { .uname = "X87_FILL_FAULT", .udesc = "x87 fill faults", .ucode = 0x1, }, { .uname = "XMM_FILL_FAULT", .udesc = "XMM fill faults", .ucode = 0x2, }, { .uname = "YMM_FILL_FAULT", .udesc = "YMM fill faults", .ucode = 0x4, }, { .uname = "YMM_SPILL_FAULT", .udesc = "YMM spill faults", .ucode = 0x8, }, { .uname = "ANY", .udesc = "Any FP dispatch faults", .ucode = 0xf, .uflags = AMD64_FL_DFL | AMD64_FL_NCOMBO, }, }; static const amd64_entry_t amd64_fam17h_zen2_pe[]={ { .name = "L1_ITLB_MISS_L2_ITLB_HIT", .desc = "Number of instruction fetches that miss in the L1 ITLB but hit in the L2 ITLB.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x84, .flags = 0, .ngrp = 0, }, { .name = "L1_ITLB_MISS_L2_ITLB_MISS", .desc = "Number of instruction fetches that miss in both the L1 and L2 TLBs.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x85, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_l1_itlb_miss_l2_itlb_miss), .umasks = amd64_fam17h_zen2_l1_itlb_miss_l2_itlb_miss, }, { .name = "RETIRED_SSE_AVX_FLOPS", .desc = "This is a retire-based event.
The number of retired SSE/AVX FLOPS. The number of events logged per cycle can vary from 0 to 64. This event can count above 15 and therefore requires the MergeEvent. On Linux, the kernel handles this case without the need to pass the merge event.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x3, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_retired_sse_avx_flops), .umasks = amd64_fam17h_zen2_retired_sse_avx_flops, }, { .name = "DIV_CYCLES_BUSY_COUNT", .desc = "Number of cycles when the divider is busy.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xd3, .flags = 0, .ngrp = 0, }, { .name = "DIV_OP_COUNT", .desc = "Number of divide uops.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xd4, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS", .desc = "Number of branch instructions retired. This includes all types of architectural control flow changes, including exceptions and interrupts.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc2, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_FAR_CONTROL_TRANSFERS", .desc = "Number of far control transfers retired including far call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and interrupts. Far control transfers are not subject to branch prediction.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc6, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_INDIRECT_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Number of indirect branches retired that were not correctly predicted. Each such mispredict incurs the same penalty as a mispredicted conditional branch instruction. Only EX mispredicts are counted.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xca, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Number of branch instructions retired, of any type, that were not correctly predicted.
This includes those for which prediction is not attempted (far control transfers, exceptions and interrupts).", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc3, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .desc = "Number of taken branches that were retired. This includes all types of architectural control flow changes, including exceptions and interrupts.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc4, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Number of retired taken branch instructions that were mispredicted.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc5, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_CONDITIONAL_BRANCH_INSTRUCTIONS", .desc = "Number of retired conditional branch instructions.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xd1, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_UOPS", .desc = "Number of uops retired. This includes all processor activity (instructions, exceptions, interrupts, microcode assists, etc.). The number of events logged per cycle can vary from 0 to 8.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc1, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_FUSED_INSTRUCTIONS", .desc = "Number of fused branch instructions retired per cycle. The number of events logged per cycle can vary from 0 to 3.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x1d0, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_INSTRUCTIONS", .desc = "Instructions Retired.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc0, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_MMX_FP_INSTRUCTIONS", .desc = "Number of MMX, SSE or x87 instructions retired. The UnitMask allows the selection of the individual classes of instructions as given in the table. Each increment represents one complete instruction.
Since this event includes non-numeric instructions, it is not suitable for measuring MFLOPS.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xcb, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_retired_mmx_fp_instructions), .umasks = amd64_fam17h_zen2_retired_mmx_fp_instructions, }, { .name = "RETIRED_NEAR_RETURNS", .desc = "Number of near return instructions (RET or RETI) retired.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc8, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_NEAR_RETURNS_MISPREDICTED", .desc = "Number of near returns retired that were not correctly predicted by the return address predictor. Each such mispredict incurs the same penalty as a mispredicted conditional branch instruction.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xc9, .flags = 0, .ngrp = 0, }, { .name = "TAGGED_IBS_OPS", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x1cf, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_tagged_ibs_ops), .umasks = amd64_fam17h_zen2_tagged_ibs_ops, }, { .name = "RETIRED_BRANCH_MISPREDICTED_DIRECTION_MISMATCH", .desc = "Number of retired conditional branch instructions that were not correctly predicted because of branch direction mismatch", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x1c7, .flags = 0, .ngrp = 0, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_L2", .desc = "Number of 64-byte instruction cachelines that were fulfilled by the L2 cache.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x82, .flags = 0, .ngrp = 0, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM", .desc = "Number of 64-byte instruction cachelines fulfilled from system memory or another cache.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x83, .flags = 0, .ngrp = 0, }, { .name = "CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS", .desc = "L2 cache request outcomes.
This event does not count accesses to the L2 cache by the L2 prefetcher.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x64, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_core_to_l2_cacheable_request_access_status), .umasks = amd64_fam17h_zen2_core_to_l2_cacheable_request_access_status, }, { .name = "L2_PREFETCH_HIT_L2", .desc = "Number of L2 prefetcher hits in the L2", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x70, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_l2_prefetch_hit_l2), .umasks = amd64_fam17h_zen2_l2_prefetch_hit_l2, }, { .name = "L2_PREFETCH_HIT_L3", .desc = "Number of L2 prefetcher hits in the L3", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x71, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_l2_prefetch_hit_l2), .umasks = amd64_fam17h_zen2_l2_prefetch_hit_l2, /* shared */ }, { .name = "L2_PREFETCH_MISS_L3", .desc = "Number of L2 prefetcher misses in the L3", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x72, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_l2_prefetch_hit_l2), .umasks = amd64_fam17h_zen2_l2_prefetch_hit_l2, /* shared */ }, { .name = "REQUESTS_TO_L2_GROUP1", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x60, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_requests_to_l2_group1), .umasks = amd64_fam17h_zen2_requests_to_l2_group1, }, { .name = "REQUESTS_TO_L2_GROUP2", .desc = "Multi-events in that LS and IF requests can be received simultaneously.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x61, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_requests_to_l2_group2), .umasks = amd64_fam17h_zen2_requests_to_l2_group2, }, { .name = "BAD_STATUS_2", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x24, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_bad_status_2), .umasks = amd64_fam17h_zen2_bad_status_2, }, { .name = "LS_DISPATCH", .desc = "Counts the number of operations dispatched to the LS unit.
Unit Masks ADDed.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x29, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_ls_dispatch), .umasks = amd64_fam17h_zen2_ls_dispatch, }, { .name = "INEFFECTIVE_SOFTWARE_PREFETCH", .desc = "Number of software prefetches that did not fetch data outside of the processor core.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x52, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_ineffective_software_prefetch), .umasks = amd64_fam17h_zen2_ineffective_software_prefetch, }, { .name = "SOFTWARE_PREFETCH_DATA_CACHE_FILLS", .desc = "Number of software prefetch fills by data source", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x59, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_software_prefetch_data_cache_fills), .umasks = amd64_fam17h_zen2_software_prefetch_data_cache_fills, }, { .name = "HARDWARE_PREFETCH_DATA_CACHE_FILLS", .desc = "Number of hardware prefetch fills by data source", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x5a, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_software_prefetch_data_cache_fills), .umasks = amd64_fam17h_zen2_software_prefetch_data_cache_fills, /* shared */ }, { .name = "L1_DTLB_MISS", .desc = "L1 Data TLB misses.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x45, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_l1_dtlb_miss), .umasks = amd64_fam17h_zen2_l1_dtlb_miss, }, { .name = "RETIRED_LOCK_INSTRUCTIONS", .desc = "Counts the number of retired locked instructions", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x25, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_retired_lock_instructions), .umasks = amd64_fam17h_zen2_retired_lock_instructions, }, { .name = "RETIRED_CLFLUSH_INSTRUCTIONS", .desc = "Counts the number of retired non-speculative clflush instructions", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x26, .flags = 0, }, { .name = "RETIRED_CPUID_INSTRUCTIONS", .desc = "Counts the number of retired cpuid
instructions", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x27, .flags = 0, }, { .name = "SMI_RECEIVED", .desc = "Counts the number of system management interrupts (SMI) received", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x2b, .flags = 0, }, { .name = "INTERRUPT_TAKEN", .desc = "Counts the number of interrupts taken", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x2c, .flags = 0, }, { .name = "MAB_ALLOCATION_BY_PIPE", .desc = "TBD", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x41, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_mab_allocation_by_pipe), .umasks = amd64_fam17h_zen2_mab_allocation_by_pipe, }, { .name = "MISALIGNED_LOADS", .desc = "Misaligned loads retired", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x47, .flags = 0, .ngrp = 0, }, { .name = "CYCLES_NOT_IN_HALT", .desc = "Number of core cycles not in halted state", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x76, .flags = 0, .ngrp = 0, }, { .name = "TLB_FLUSHES", .desc = "Number of TLB flushes", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x78, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_tlb_flushes), .umasks = amd64_fam17h_zen2_tlb_flushes, }, { .name = "PREFETCH_INSTRUCTIONS_DISPATCHED", .desc = "Software Prefetch Instructions Dispatched.
This is a speculative event", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x4b, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_prefetch_instructions_dispatched), .umasks = amd64_fam17h_zen2_prefetch_instructions_dispatched, }, { .name = "STORE_TO_LOAD_FORWARD", .desc = "Number of Store-to-Load Forward hits.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x35, .flags = 0, .ngrp = 0, }, { .name = "STORE_COMMIT_CANCELS_2", .desc = "Number of store commit cancellations", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x37, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_store_commit_cancels_2), .umasks = amd64_fam17h_zen2_store_commit_cancels_2, }, { .name = "L1_BTB_CORRECTION", .desc = "Number of L1 branch prediction overrides of existing prediction. This is a speculative event.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x8a, .flags = 0, .ngrp = 0, }, { .name = "L2_BTB_CORRECTION", .desc = "Number of L2 branch prediction overrides of existing prediction. This is a speculative event.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x8b, .flags = 0, .ngrp = 0, }, { .name = "DYNAMIC_INDIRECT_PREDICTIONS", .desc = "Number of indirect branch predictions for potential multi-target branches. This is a speculative event.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x8e, .flags = 0, .ngrp = 0, }, { .name = "DECODER_OVERRIDE_BRANCH_PRED", .desc = "Number of decoder overrides of existing branch prediction.
This is a speculative event.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x91, .flags = 0, .ngrp = 0, }, { .name = "ITLB_FETCH_HIT", .desc = "Instruction fetches that hit in the L1 ITLB", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x94, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_itlb_fetch_hit), .umasks = amd64_fam17h_zen2_itlb_fetch_hit, }, { .name = "UOPS_QUEUE_EMPTY", .desc = "Cycles where the uops queue is empty", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xa9, .flags = 0, .ngrp = 0, }, { .name = "UOPS_DISPATCHED_FROM_DECODER", .desc = "Number of uops dispatched from either the Decoder, OpCache or both", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xaa, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_uops_dispatched_from_decoder), .umasks = amd64_fam17h_zen2_uops_dispatched_from_decoder, }, { .name = "DISPATCH_RESOURCE_STALL_CYCLES_1", .desc = "Number of cycles where a dispatch group is valid but does not get dispatched due to a Token Stall", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xae, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_dispatch_resource_stall_cycles_1), .umasks = amd64_fam17h_zen2_dispatch_resource_stall_cycles_1, }, { .name = "DISPATCH_RESOURCE_STALL_CYCLES_0", .desc = "Number of cycles where a dispatch group is valid but does not get dispatched due to a Token Stall", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xaf, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_dispatch_resource_stall_cycles_0), .umasks = amd64_fam17h_zen2_dispatch_resource_stall_cycles_0, }, { .name = "RETIRED_SERIALIZING_OPS", .desc = "The number of serializing Ops retired.", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x5, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_retired_serializing_ops), .umasks = amd64_fam17h_zen2_retired_serializing_ops, }, { .name = "FP_DISPATCH_FAULTS", .desc = "Floating-point dispatch faults", .modmsk = AMD64_FAM17H_ATTRS, .code = 0xe, .flags = 0, .ngrp = 1, 
.numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_fp_dispatch_faults), .umasks = amd64_fam17h_zen2_fp_dispatch_faults, }, { .name = "DATA_CACHE_REFILLS_FROM_SYSTEM", .desc = "Demand Data Cache fills by data source", .modmsk = AMD64_FAM17H_ATTRS, .code = 0x43, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_software_prefetch_data_cache_fills), .umasks = amd64_fam17h_zen2_software_prefetch_data_cache_fills, /* shared */ }, }; /* papi-papi-7-2-0-t/src/libpfm4/lib/events/amd64_events_fam19h_zen3.h */ /* * Contributed by Swarup Sahoo * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for
* applications on Linux. * * PMU: amd64_fam19h_zen3 (AMD64 Fam19h Zen3) */ static const amd64_umask_t amd64_fam19h_zen3_retired_sse_avx_flops[]={ { .uname = "ADD_SUB_FLOPS", .udesc = "Addition/subtraction FLOPS", .ucode = 0x1, }, { .uname = "MULT_FLOPS", .udesc = "Multiplication FLOPS", .ucode = 0x2, }, { .uname = "DIV_FLOPS", .udesc = "Division/Square-root FLOPS", .ucode = 0x4, }, { .uname = "MAC_FLOPS", .udesc = "Multiply-Accumulate flops. Each MAC operation is counted as 2 FLOPS", .ucode = 0x8, }, { .uname = "ANY", .udesc = "Any FLOPS", .ucode = 0xf, .uflags = AMD64_FL_DFL | AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam19h_zen3_retired_serializing_ops[]={ { .uname = "X87_CTRL_RET", .udesc = "x87 control word mispredict traps due to misprediction in RC or PC, or changes in Exception Mask bits", .ucode = 0x1, }, { .uname = "X87_BOT_RET", .udesc = "x87 bottom-executing ops retired", .ucode = 0x2, }, { .uname = "SSE_CTRL_RET", .udesc = "SSE/AVX control word mispredict traps", .ucode = 0x4, }, { .uname = "SSE_BOT_RET", .udesc = "SSE/AVX bottom-executing ops retired", .ucode = 0x8, }, }; static const amd64_umask_t amd64_fam19h_zen3_fp_dispatch_faults[]={ { .uname = "X87_FILL_FAULT", .udesc = "x87 fill faults", .ucode = 0x1, }, { .uname = "XMM_FILL_FAULT", .udesc = "XMM fill faults", .ucode = 0x2, }, { .uname = "YMM_FILL_FAULT", .udesc = "YMM fill faults", .ucode = 0x4, }, { .uname = "YMM_SPILL_FAULT", .udesc = "YMM spill faults", .ucode = 0x8, }, { .uname = "ANY", .udesc = "Any FP dispatch faults", .ucode = 0xf, .uflags = AMD64_FL_DFL | AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam19h_zen3_bad_status_2[]={ { .uname = "STLI_OTHER", .udesc = "Store-to-load conflicts.
A load was unable to complete due to a non-forwardable conflict with an older store", .ucode = 0x2, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam19h_zen3_retired_lock_instructions[]={ { .uname = "BUS_LOCK", .udesc = "Number of bus locks", .ucode = 0x1, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam19h_zen3_ls_dispatch[]={ { .uname = "LD_ST_DISPATCH", .udesc = "Dispatched op that performs a load from and store to the same memory address", .ucode = 0x4, }, { .uname = "STORE_DISPATCH", .udesc = "Store ops dispatched", .ucode = 0x2, }, { .uname = "LD_DISPATCH", .udesc = "Load ops dispatched", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen3_store_commit_cancels_2[]={ { .uname = "WCB_FULL", .udesc = "Non cacheable store and the non-cacheable commit buffer is full", .ucode = 0x1, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam19h_zen3_mab_allocation_by_type[]={ { .uname = "LS", .udesc = "Load store allocations", .ucode = 0x3f, .uflags = AMD64_FL_NCOMBO, }, { .uname = "HW_PF", .udesc = "Hardware prefetcher allocations", .ucode = 0x40, .uflags = AMD64_FL_NCOMBO, }, { .uname = "ALL", .udesc = "All allocations", .ucode = 0x7f, .uflags = AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam19h_zen3_software_prefetch_data_cache_fills[]={ { .uname = "LCL_L2", .udesc = "Fill from local L2 to the core", .ucode = 0x1, }, { .uname = "INT_CACHE", .udesc = "Fill from L3 or different L2 in same CCX", .ucode = 0x2, }, { .uname = "EXT_CACHE_LCL", .udesc = "Fill from cache of different CCX in same node", .ucode = 0x4, }, { .uname = "MEM_IO_LCL", .udesc = "Fill from DRAM or IO connected in same node", .ucode = 0x8, }, { .uname = "EXT_CACHE_RMT", .udesc = "Fill from CCX cache in different node", .ucode = 0x10, }, { .uname = "MEM_IO_RMT", .udesc = "Fill from DRAM or IO connected in different node", .ucode = 0x40, }, }; static const amd64_umask_t amd64_fam19h_zen3_l1_dtlb_miss[]={ { .uname = 
"TLB_RELOAD_1G_L2_MISS", .udesc = "Data TLB reload to a 1GB page that missed in the L2 TLB", .ucode = 0x80, }, { .uname = "TLB_RELOAD_2M_L2_MISS", .udesc = "Data TLB reload to a 2MB page that missed in the L2 TLB", .ucode = 0x40, }, { .uname = "TLB_RELOAD_COALESCED_PAGE_MISS", .udesc = "Data TLB reload to a coalesced page that also missed in the L2 TLB", .ucode = 0x20, }, { .uname = "TLB_RELOAD_4K_L2_MISS", .udesc = "Data TLB reload to a 4KB page that missed in the L2 TLB", .ucode = 0x10, }, { .uname = "TLB_RELOAD_1G_L2_HIT", .udesc = "Data TLB reload to a 1GB page that hit in the L2 TLB", .ucode = 0x8, }, { .uname = "TLB_RELOAD_2M_L2_HIT", .udesc = "Data TLB reload to a 2MB page that hit in the L2 TLB", .ucode = 0x4, }, { .uname = "TLB_RELOAD_COALESCED_PAGE_HIT", .udesc = "Data TLB reload to a coalesced page that hit in the L2 TLB", .ucode = 0x2, }, { .uname = "TLB_RELOAD_4K_L2_HIT", .udesc = "Data TLB reload to a 4KB page that hit in the L2 TLB", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen3_misaligned_loads[]={ { .uname = "MA4K", .udesc = "The number of 4KB misaligned (page crossing) loads", .ucode = 0x2, }, { .uname = "MA64", .udesc = "The number of 64B misaligned (cacheline crossing) loads", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen3_prefetch_instructions_dispatched[]={ { .uname = "PREFETCH_T0_T1_T2", .udesc = "Number of prefetcht0, prefetcht1, prefetcht2 instructions dispatched", .ucode = 0x1, }, { .uname = "PREFETCHW", .udesc = "Number of prefetchw instructions dispatched", .ucode = 0x2, }, { .uname = "PREFETCHNTA", .udesc = "Number of prefetchnta instructions dispatched", .ucode = 0x4, }, { .uname = "ANY", .udesc = "Any prefetch", .ucode = 0x7, .uflags = AMD64_FL_DFL | AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam19h_zen3_ineffective_software_prefetch[]={ { .uname = "MAB_MCH_CNT", .udesc = "Software prefetch instructions saw a match on an already allocated miss request buffer", .ucode = 0x2, }, {
.uname = "DATA_PIPE_SW_PF_DC_HIT", .udesc = "Software Prefetch instruction saw a DC hit", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen3_tlb_flushes[]={ { .uname = "ALL", .udesc = "Any TLB flush", .ucode = 0xff, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam19h_zen3_l1_itlb_miss_l2_itlb_miss[]={ { .uname = "COALESCED4K", .udesc = "Number of instruction fetches to a >4K coalesced page", .ucode = 0x8, }, { .uname = "IF1G", .udesc = "Number of instruction fetches to a 1GB page", .ucode = 0x4, }, { .uname = "IF2M", .udesc = "Number of instruction fetches to a 2MB page", .ucode = 0x2, }, { .uname = "IF4K", .udesc = "Number of instruction fetches to a 4KB page", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen3_itlb_fetch_hit[]={ { .uname = "IF1G", .udesc = "L1 instruction fetch TLB hit a 1GB page size", .ucode = 0x4, }, { .uname = "IF2M", .udesc = "L1 instruction fetch TLB hit a 2MB page size", .ucode = 0x2, }, { .uname = "IF4K", .udesc = "L1 instruction fetch TLB hit a 4KB or 16KB page size", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen3_ic_tag_hit_miss[]={ { .uname = "IC_HIT", .udesc = "Instruction cache hit", .ucode = 0x7, .uflags = AMD64_FL_NCOMBO, }, { .uname = "IC_MISS", .udesc = "Instruction cache miss", .ucode = 0x18, .uflags = AMD64_FL_NCOMBO, }, { .uname = "ALL_IC_ACCESS", .udesc = "All instruction cache accesses", .ucode = 0x1f, .uflags = AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam19h_zen3_op_cache_hit_miss[]={ { .uname = "OC_HIT", .udesc = "Op cache hit", .ucode = 0x3, .uflags = AMD64_FL_NCOMBO, }, { .uname = "OC_MISS", .udesc = "Op cache miss", .ucode = 0x4, .uflags = AMD64_FL_NCOMBO, }, { .uname = "ALL_OC_ACCESS", .udesc = "All op cache accesses", .ucode = 0x7, .uflags = AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam19h_zen3_ops_source_dispatched_from_decoder[]={ { .uname = "X86DECODER_DISPATCHED", .udesc = "Number of ops fetched from Instruction Cache and 
dispatched", .ucode = 0x1, }, { .uname = "OPCACHE_DISPATCHED", .udesc = "Number of ops fetched from Op Cache and dispatched", .ucode = 0x2, }, }; static const amd64_umask_t amd64_fam19h_zen3_ops_type_dispatched_from_decoder[]={ { .uname = "FP_DISP_IBS_MODE", .udesc = "Any FP dispatch. Count aligns with IBS count", .ucode = 0x04, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT_DISP_IBS_MODE", .udesc = "Any Integer dispatch. Count aligns with IBS count", .ucode = 0x08, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP_DISP_RETIRE_MODE", .udesc = "Any FP dispatch. Count aligns with RETIRED_OPS count", .ucode = 0x84, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT_DISP_RETIRE_MODE", .udesc = "Any Integer dispatch. Count aligns with RETIRED_OPS count", .ucode = 0x88, .uflags = AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam19h_zen3_dispatch_resource_stall_cycles_1[]={ { .uname = "INT_PHY_REG_FILE_RSRC_STALL", .udesc = "Number of cycles stalled due to integer physical register file resource stalls. Applies to all ops that have integer destination register", .ucode = 0x1, }, { .uname = "LOAD_QUEUE_RSRC_STALL", .udesc = "Number of cycles stalled due to load queue resource stalls. Applies to all ops with load semantics", .ucode = 0x2, }, { .uname = "STORE_QUEUE_RSRC_STALL", .udesc = "Number of cycles stalled due to store queue resource stalls. Applies to all ops with store semantics", .ucode = 0x4, }, { .uname = "TAKEN_BRANCH_BUFFER_RSRC_STALL", .udesc = "Number of cycles stalled due to taken branch buffer resource stalls", .ucode = 0x10, }, { .uname = "FP_REG_FILE_RSRC_STALL", .udesc = "Number of cycles stalled due to floating-point register file resource stalls. Applies to all FP ops that have a destination register", .ucode = 0x20, }, { .uname = "FP_SCHEDULER_RSRC_STALL", .udesc = "Number of cycles stalled due to floating-point scheduler resource stalls. 
Applies to ops that use the FP scheduler", .ucode = 0x40, }, { .uname = "FP_FLUSH_RECOVERY_STALL", .udesc = "Number of cycles stalled due to floating-point flush recovery", .ucode = 0x80, }, }; static const amd64_umask_t amd64_fam19h_zen3_dispatch_resource_stall_cycles_2[]={ { .uname = "INT_SCHEDULER_0_TOKEN_STALL", .udesc = "Number of cycles stalled due to no tokens available for Integer Scheduler Queue 0", .ucode = 0x1, }, { .uname = "INT_SCHEDULER_1_TOKEN_STALL", .udesc = "Number of cycles stalled due to no tokens available for Integer Scheduler Queue 1", .ucode = 0x2, }, { .uname = "INT_SCHEDULER_2_TOKEN_STALL", .udesc = "Number of cycles stalled due to no tokens available for Integer Scheduler Queue 2", .ucode = 0x4, }, { .uname = "INT_SCHEDULER_3_TOKEN_STALL", .udesc = "Number of cycles stalled due to no tokens available for Integer Scheduler Queue 3", .ucode = 0x8, }, { .uname = "RETIRE_TOKEN_STALL", .udesc = "Number of cycles stalled due to insufficient tokens available for Retire Queue", .ucode = 0x20, }, }; static const amd64_umask_t amd64_fam19h_zen3_retired_mmx_fp_instructions[]={ { .uname = "SSE_INSTR", .udesc = "Number of SSE instructions (SSE, SSE2, SSE3, SSSE3, SSE4A, SSE41, SSE42, AVX)", .ucode = 0x4, }, { .uname = "MMX_INSTR", .udesc = "Number of MMX instructions", .ucode = 0x2, }, { .uname = "X87_INSTR", .udesc = "Number of x87 instructions", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen3_tagged_ibs_ops[]={ { .uname = "IBS_COUNT_ROLLOVER", .udesc = "Number of times an op could not be tagged by IBS because of a previously tagged op that has not retired", .ucode = 0x4, }, { .uname = "IBS_TAGGED_OPS_RET", .udesc = "Number of ops tagged by IBS that retired", .ucode = 0x2, }, { .uname = "IBS_TAGGED_OPS", .udesc = "Number of ops tagged by IBS", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen3_requests_to_l2_group1[]={ { .uname = "RD_BLK_L", .udesc = "Number of data cache reads (including software and hardware
prefetches)", .ucode = 0x80, }, { .uname = "RD_BLK_X", .udesc = "Number of data cache stores", .ucode = 0x40, }, { .uname = "LS_RD_BLK_C_S", .udesc = "Number of data cache shared reads", .ucode = 0x20, }, { .uname = "CACHEABLE_IC_READ", .udesc = "Number of instruction cache reads", .ucode = 0x10, }, { .uname = "CHANGE_TO_X", .udesc = "Number of requests change to writable, check L2 for current state", .ucode = 0x8, }, { .uname = "PREFETCH_L2", .udesc = "TBD", .ucode = 0x4, }, { .uname = "L2_HW_PF", .udesc = "Number of prefetches accepted by L2 pipeline, hit or miss", .ucode = 0x2, }, }; static const amd64_umask_t amd64_fam19h_zen3_core_to_l2_cacheable_request_access_status[]={ { .uname = "LS_RD_BLK_C_S", .udesc = "Number of data cache shared read hitting in the L2", .ucode = 0x80, }, { .uname = "LS_RD_BLK_L_HIT_X", .udesc = "Number of data cache reads hitting in the L2", .ucode = 0x40, }, { .uname = "LS_RD_BLK_L_HIT_S", .udesc = "Number of data cache reads hitting a non-modifiable line in the L2", .ucode = 0x20, }, { .uname = "LS_RD_BLK_X", .udesc = "Number of data cache store or state change requests hitting in the L2", .ucode = 0x10, }, { .uname = "LS_RD_BLK_C", .udesc = "Number of data cache requests missing in the L2 (all types)", .ucode = 0x8, }, { .uname = "IC_FILL_HIT_X", .udesc = "Number of instruction cache fill requests hitting a modifiable line in the L2", .ucode = 0x4, }, { .uname = "IC_FILL_HIT_S", .udesc = "Number of instruction cache fill requests hitting a non-modifiable line in the L2", .ucode = 0x2, }, { .uname = "IC_FILL_MISS", .udesc = "Number of instruction cache fill requests missing the L2", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen3_l2_prefetch_hit_l2[]={ { .uname = "L2_HW_PREFETCHER", .udesc = "Number of requests generated by L2 hardware prefetcher", .ucode = 0x1f, }, { .uname = "L1_HW_PREFETCHER", .udesc = "Number of requests generated by L1 hardware prefetcher", .ucode = 0xe0, }, }; static const amd64_entry_t 
amd64_fam19h_zen3_pe[]={ { .name = "RETIRED_SSE_AVX_FLOPS", .desc = "This is a retire-based event. The number of retired SSE/AVX FLOPS. The number of events logged per cycle can vary from 0 to 64. This event can count above 15 and therefore requires the MergeEvent", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x3, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_retired_sse_avx_flops), .umasks = amd64_fam19h_zen3_retired_sse_avx_flops, }, { .name = "RETIRED_SERIALIZING_OPS", .desc = "The number of serializing Ops retired", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x5, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_retired_serializing_ops), .umasks = amd64_fam19h_zen3_retired_serializing_ops, }, { .name = "FP_DISPATCH_FAULTS", .desc = "Floating-point dispatch faults", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xe, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_fp_dispatch_faults), .umasks = amd64_fam19h_zen3_fp_dispatch_faults, }, { .name = "BAD_STATUS_2", .desc = "TBD", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x24, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_bad_status_2), .umasks = amd64_fam19h_zen3_bad_status_2, }, { .name = "RETIRED_LOCK_INSTRUCTIONS", .desc = "Counts the number of retired locked instructions", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x25, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_retired_lock_instructions), .umasks = amd64_fam19h_zen3_retired_lock_instructions, }, { .name = "RETIRED_CLFLUSH_INSTRUCTIONS", .desc = "Counts the number of retired non-speculative clflush instructions", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x26, .flags = 0, }, { .name = "RETIRED_CPUID_INSTRUCTIONS", .desc = "Counts the number of retired cpuid instructions", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x27, .flags = 0, }, { .name = "LS_DISPATCH", .desc = "Counts the number of operations dispatched to the LS unit. 
Unit Masks ADDed", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x29, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_ls_dispatch), .umasks = amd64_fam19h_zen3_ls_dispatch, }, { .name = "SMI_RECEIVED", .desc = "Counts the number of system management interrupts (SMI) received", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x2b, .flags = 0, }, { .name = "INTERRUPT_TAKEN", .desc = "Counts the number of interrupts taken", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x2c, .flags = 0, }, { .name = "STORE_TO_LOAD_FORWARD", .desc = "Number of Store to Load Forward hits", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x35, .flags = 0, .ngrp = 0, }, { .name = "STORE_COMMIT_CANCELS_2", .desc = "TBD", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x37, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_store_commit_cancels_2), .umasks = amd64_fam19h_zen3_store_commit_cancels_2, }, { .name = "MAB_ALLOCATION_BY_TYPE", .desc = "Counts when a LS pipe allocates a MAB entry", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x41, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_mab_allocation_by_type), .umasks = amd64_fam19h_zen3_mab_allocation_by_type, }, { .name = "DEMAND_DATA_CACHE_FILLS_FROM_SYSTEM", .desc = "Demand Data Cache fills by data source", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x43, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_software_prefetch_data_cache_fills), .umasks = amd64_fam19h_zen3_software_prefetch_data_cache_fills, /* shared */ }, { .name = "ANY_DATA_CACHE_FILLS_FROM_SYSTEM", .desc = "Any Data Cache fills by data source", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x44, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_software_prefetch_data_cache_fills), .umasks = amd64_fam19h_zen3_software_prefetch_data_cache_fills, /* shared */ }, { .name = "L1_DTLB_MISS", .desc = "L1 Data TLB misses", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x45, .flags = 0, .ngrp = 1, .numasks = 
LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_l1_dtlb_miss), .umasks = amd64_fam19h_zen3_l1_dtlb_miss, }, { .name = "MISALIGNED_LOADS", .desc = "Misaligned loads retired", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x47, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_misaligned_loads), .umasks = amd64_fam19h_zen3_misaligned_loads, }, { .name = "PREFETCH_INSTRUCTIONS_DISPATCHED", .desc = "Software Prefetch Instructions Dispatched. This is a speculative event", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x4b, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_prefetch_instructions_dispatched), .umasks = amd64_fam19h_zen3_prefetch_instructions_dispatched, }, { .name = "INEFFECTIVE_SOFTWARE_PREFETCH", .desc = "Number of software prefetches that did not fetch data outside of the processor core", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x52, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_ineffective_software_prefetch), .umasks = amd64_fam19h_zen3_ineffective_software_prefetch, }, { .name = "SOFTWARE_PREFETCH_DATA_CACHE_FILLS", .desc = "Number of software prefetch fills by data source", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x59, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_software_prefetch_data_cache_fills), .umasks = amd64_fam19h_zen3_software_prefetch_data_cache_fills, /* shared */ }, { .name = "HARDWARE_PREFETCH_DATA_CACHE_FILLS", .desc = "Number of hardware prefetch fills by data source", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x5a, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_software_prefetch_data_cache_fills), .umasks = amd64_fam19h_zen3_software_prefetch_data_cache_fills, /* shared */ }, { .name = "ALLOC_MAB_COUNT", .desc = "Counts the in-flight L1 data cache misses (allocated Miss Address Buffers) divided by 4 and rounded down each cycle unless used with the MergeEvent functionality. 
If the MergeEvent is used, it counts the exact number of outstanding L1 data cache misses", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x5f, .flags = 0, .ngrp = 0, }, { .name = "CYCLES_NOT_IN_HALT", .desc = "Number of core cycles not in halted state", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x76, .flags = 0, .ngrp = 0, }, { .name = "TLB_FLUSHES", .desc = "Number of TLB flushes", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x78, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_tlb_flushes), .umasks = amd64_fam19h_zen3_tlb_flushes, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_L2", .desc = "Number of 64-byte instruction cachelines that were fulfilled by the L2 cache", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x82, .flags = 0, .ngrp = 0, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM", .desc = "Number of 64-byte instruction cachelines fulfilled from system memory or another cache", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x83, .flags = 0, .ngrp = 0, }, { .name = "L1_ITLB_MISS_L2_ITLB_HIT", .desc = "Number of instruction fetches that miss in the L1 ITLB but hit in the L2 ITLB", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x84, .flags = 0, .ngrp = 0, }, { .name = "L1_ITLB_MISS_L2_ITLB_MISS", .desc = "The number of valid fills into the ITLB originating from the LS Page-Table Walker. Tablewalk requests are issued for L1-ITLB and L2-ITLB misses", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x85, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_l1_itlb_miss_l2_itlb_miss), .umasks = amd64_fam19h_zen3_l1_itlb_miss_l2_itlb_miss, }, { .name = "L2_BTB_CORRECTION", .desc = "Number of L2 branch prediction overrides of existing prediction. 
This is a speculative event", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x8b, .flags = 0, .ngrp = 0, }, { .name = "DYNAMIC_INDIRECT_PREDICTIONS", .desc = "Number of times a branch used the indirect predictor to make a prediction", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x8e, .flags = 0, .ngrp = 0, }, { .name = "DECODER_OVERRIDE_BRANCH_PRED", .desc = "Number of decoder overrides of existing branch prediction", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x91, .flags = 0, .ngrp = 0, }, { .name = "L1_ITLB_FETCH_HIT", .desc = "Instruction fetches that hit in the L1 ITLB", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x94, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_itlb_fetch_hit), .umasks = amd64_fam19h_zen3_itlb_fetch_hit, }, { .name = "IC_TAG_HIT_MISS", .desc = "Counts various IC tag related hit and miss events", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x18e, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_ic_tag_hit_miss), .umasks = amd64_fam19h_zen3_ic_tag_hit_miss, }, { .name = "OP_CACHE_HIT_MISS", .desc = "Counts op cache micro-tag hit/miss events", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x28f, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_op_cache_hit_miss), .umasks = amd64_fam19h_zen3_op_cache_hit_miss, }, { .name = "OPS_SOURCE_DISPATCHED_FROM_DECODER", .desc = "Number of ops dispatched from the decoder classified by op source", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xaa, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_ops_source_dispatched_from_decoder), .umasks = amd64_fam19h_zen3_ops_source_dispatched_from_decoder, }, { .name = "OPS_TYPE_DISPATCHED_FROM_DECODER", .desc = "Number of ops dispatched from the decoder classified by op type", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xab, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_ops_type_dispatched_from_decoder), .umasks = amd64_fam19h_zen3_ops_type_dispatched_from_decoder, }, { .name = 
"DISPATCH_RESOURCE_STALL_CYCLES_1", .desc = "Number of cycles where a dispatch group is valid but does not get dispatched due to a Token Stall", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xae, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_dispatch_resource_stall_cycles_1), .umasks = amd64_fam19h_zen3_dispatch_resource_stall_cycles_1, }, { .name = "DISPATCH_RESOURCE_STALL_CYCLES_2", .desc = "Number of cycles where a dispatch group is valid but does not get dispatched due to a Token Stall", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xaf, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_dispatch_resource_stall_cycles_2), .umasks = amd64_fam19h_zen3_dispatch_resource_stall_cycles_2, }, { .name = "RETIRED_INSTRUCTIONS", .desc = "Number of instructions retired", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc0, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_OPS", .desc = "Number of macro-ops retired", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc1, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS", .desc = "Number of branch instructions retired. This includes all types of architectural control flow changes, including exceptions and interrupts", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc2, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Number of retired branch instructions that were mispredicted", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc3, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .desc = "Number of taken branches that were retired. 
This includes all types of architectural control flow changes, including exceptions and interrupts", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc4, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Number of retired taken branch instructions that were mispredicted", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc5, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_FAR_CONTROL_TRANSFERS", .desc = "Number of far control transfers retired including far call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and interrupts. Far control transfers are not subject to branch prediction", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc6, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_NEAR_RETURNS", .desc = "Number of near return instructions (RET or RET Iw) retired", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc8, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_NEAR_RETURNS_MISPREDICTED", .desc = "Number of near returns retired that were not correctly predicted by the return address predictor. Each such mispredict incurs the same penalty as a mispredicted conditional branch instruction", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc9, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_INDIRECT_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Number of indirect branches retired that were not correctly predicted. Each such mispredict incurs the same penalty as a mispredicted conditional branch instruction. Only EX mispredicts are counted", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xca, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_MMX_FP_INSTRUCTIONS", .desc = "Number of MMX, SSE or x87 instructions retired. The UnitMask allows the selection of the individual classes of instructions as given in the table. Each increment represents one complete instruction. 
Since this event includes non-numeric instructions, it is not suitable for measuring MFLOPS", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xcb, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_retired_mmx_fp_instructions), .umasks = amd64_fam19h_zen3_retired_mmx_fp_instructions, }, { .name = "RETIRED_INDIRECT_BRANCH_INSTRUCTIONS", .desc = "Number of indirect branches retired", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xcc, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_CONDITIONAL_BRANCH_INSTRUCTIONS", .desc = "Number of retired conditional branch instructions", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xd1, .flags = 0, .ngrp = 0, }, { .name = "DIV_CYCLES_BUSY_COUNT", .desc = "Number of cycles when the divider is busy", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xd3, .flags = 0, .ngrp = 0, }, { .name = "DIV_OP_COUNT", .desc = "Number of divide ops", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xd4, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_BRANCH_MISPREDICTED_DIRECTION_MISMATCH", .desc = "Number of retired conditional branch instructions that were not correctly predicted because of branch direction mismatch", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x1c7, .flags = 0, .ngrp = 0, }, { .name = "TAGGED_IBS_OPS", .desc = "Counts Op IBS related events", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x1cf, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_tagged_ibs_ops), .umasks = amd64_fam19h_zen3_tagged_ibs_ops, }, { .name = "RETIRED_FUSED_INSTRUCTIONS", .desc = "Counts retired fused instructions", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x1d0, .flags = 0, .ngrp = 0, }, { .name = "REQUESTS_TO_L2_GROUP1", .desc = "All L2 cache requests", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x60, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_requests_to_l2_group1), .umasks = amd64_fam19h_zen3_requests_to_l2_group1, }, { .name = "CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS", .desc = "L2 cache request outcomes. 
This event does not count accesses to the L2 cache by the L2 prefetcher", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x64, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_core_to_l2_cacheable_request_access_status), .umasks = amd64_fam19h_zen3_core_to_l2_cacheable_request_access_status, }, { .name = "L2_PREFETCH_HIT_L2", .desc = "Number of L2 prefetches that hit in the L2", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x70, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_l2_prefetch_hit_l2), .umasks = amd64_fam19h_zen3_l2_prefetch_hit_l2, }, { .name = "L2_PREFETCH_HIT_L3", .desc = "Number of L2 prefetches accepted by the L2 pipeline which miss the L2 cache and hit the L3", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x71, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_l2_prefetch_hit_l2), .umasks = amd64_fam19h_zen3_l2_prefetch_hit_l2, }, { .name = "L2_PREFETCH_MISS_L3", .desc = "Number of L2 prefetches accepted by the L2 pipeline which miss the L2 and the L3 caches", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x72, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_l2_prefetch_hit_l2), .umasks = amd64_fam19h_zen3_l2_prefetch_hit_l2, }, { .name = "UOPS_QUEUE_EMPTY", .desc = "Counts cycles where the decoded uops queue is empty", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xa9, .flags = 0, .ngrp = 0, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/amd64_events_fam19h_zen3_l3.h000066400000000000000000000047651502707512200252240ustar00rootroot00000000000000/* * Copyright 2021 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to 
do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: amd64_fam19h_zen3_l3 (AMD64 Fam19h Zen3 L3) */ static const amd64_umask_t amd64_fam19h_zen3_l3_requests[]={ { .uname = "ALL", .udesc = "All types of requests", .ucode = 0xff, .uflags = AMD64_FL_DFL, }, }; static const amd64_entry_t amd64_fam19h_zen3_l3_pe[]={ { .name = "UNC_L3_REQUESTS", .desc = "Number of requests to L3 cache", .code = 0x04, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_l3_requests), .umasks = amd64_fam19h_zen3_l3_requests, }, { .name = "UNC_L3_MISS_LATENCY", .desc = "Each cycle, this event increments by the total number of read requests outstanding from the CCX divided by XiSysFillLatencyDivider. The user can calculate the average system fill latency in cycles by multiplying by XiSysFillLatencyDivider and dividing by the total number of fill requests over the same period (counted by event 0x9A UserMask 0x1F). 
XiSysFillLatencyDivider is 16 for this product, but may change for future products", .code = 0x90, }, { .name = "UNC_L3_MISSES", .desc = "Number of L3 cache misses", .code = 0x9a, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_l3_requests), .umasks = amd64_fam19h_zen3_l3_requests, /* shared */ }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/amd64_events_fam19h_zen4.h000066400000000000000000001626361502707512200246310ustar00rootroot00000000000000/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: amd64_fam19h_zen4 (AMD64 Fam19h Zen4) */ static const amd64_umask_t amd64_fam19h_zen4_retired_sse_avx_flops[]={ { .uname = "ADD_SUB_FLOPS", .udesc = "Addition/subtraction FLOPS", .ucode = 0x1, }, { .uname = "MULT_FLOPS", .udesc = "Multiplication FLOPS", .ucode = 0x2, }, { .uname = "DIV_FLOPS", .udesc = "Division/Square-root FLOPS", .ucode = 0x4, }, { .uname = "MAC_FLOPS", .udesc = "Multiply-Accumulate flops. Each MAC operation is counted as 2 FLOPS. This event does not include bfloat MAC operations", .ucode = 0x8, }, { .uname = "BFLOAT_MAC_FLOPS", .udesc = "Bfloat Multiply-Accumulate flops. Each MAC operation is counted as 2 FLOPS", .ucode = 0x10, }, { .uname = "ANY", .udesc = "Any FLOPS", .ucode = 0x1f, .uflags = AMD64_FL_DFL | AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam19h_zen4_retired_serializing_ops[]={ { .uname = "X87_CTRL_RET", .udesc = "x87 control word mispredict traps due to misprediction in RC or PC, or changes in Exception Mask bits", .ucode = 0x1, }, { .uname = "X87_BOT_RET", .udesc = "x87 bottom-executing ops retired", .ucode = 0x2, }, { .uname = "SSE_CTRL_RET", .udesc = "SSE/AVX control word mispredict traps", .ucode = 0x4, }, { .uname = "SSE_BOT_RET", .udesc = "SSE/AVX bottom-executing ops retired", .ucode = 0x8, }, }; static const amd64_umask_t amd64_fam19h_zen4_retired_fp_ops_by_width[]={ { .uname = "X87_UOPS_RETIRED", .udesc = "X87 uops retired", .ucode = 0x1, }, { .uname = "MMX_UOPS_RETIRED", .udesc = "MMX uops retired", .ucode = 0x2, }, { .uname = "SCALAR_UOPS_RETIRED", .udesc = "Scalar uops retired", .ucode = 0x4, }, { .uname = "PACK128_UOPS_RETIRED", .udesc = "Packed 128-bit uops retired", .ucode = 0x8, }, { .uname = "PACK256_UOPS_RETIRED", .udesc = "Packed 256-bit uops retired", .ucode = 0x10, }, { .uname = "PACK512_UOPS_RETIRED", .udesc = "Packed 512-bit uops retired", .ucode = 0x20, }, }; static const amd64_umask_t amd64_fam19h_zen4_retired_fp_ops_by_type[]={ { .uname = "SCALAR_ADD", .udesc 
= "Number of scalar ADD ops retired", .ucode = 0x01, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SCALAR_SUB", .udesc = "Number of scalar SUB ops retired", .ucode = 0x02, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SCALAR_MUL", .udesc = "Number of scalar MUL ops retired", .ucode = 0x03, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SCALAR_MAC", .udesc = "Number of scalar MAC ops retired", .ucode = 0x04, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SCALAR_DIV", .udesc = "Number of scalar DIV ops retired", .ucode = 0x05, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SCALAR_SQRT", .udesc = "Number of scalar SQRT ops retired", .ucode = 0x06, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SCALAR_CMP", .udesc = "Number of scalar CMP ops retired", .ucode = 0x07, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SCALAR_CVT", .udesc = "Number of scalar CVT ops retired", .ucode = 0x08, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SCALAR_BLEND", .udesc = "Number of scalar BLEND ops retired", .ucode = 0x09, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SCALAR_OTHER", .udesc = "Number of other scalar ops retired", .ucode = 0x0e, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SCALAR_ALL", .udesc = "Number of any scalar ops retired", .ucode = 0x0f, .uflags = AMD64_FL_NCOMBO, }, { .uname = "VECTOR_ADD", .udesc = "Number of vector ADD ops retired", .ucode = 0x10, .uflags = AMD64_FL_NCOMBO, }, { .uname = "VECTOR_SUB", .udesc = "Number of vector SUB ops retired", .ucode = 0x20, .uflags = AMD64_FL_NCOMBO, }, { .uname = "VECTOR_MUL", .udesc = "Number of vector MUL ops retired", .ucode = 0x30, .uflags = AMD64_FL_NCOMBO, }, { .uname = "VECTOR_MAC", .udesc = "Number of vector MAC ops retired", .ucode = 0x40, .uflags = AMD64_FL_NCOMBO, }, { .uname = "VECTOR_DIV", .udesc = "Number of vector DIV ops retired", .ucode = 0x50, .uflags = AMD64_FL_NCOMBO, }, { .uname = "VECTOR_SQRT", .udesc = "Number of vector SQRT ops retired", .ucode = 0x60, .uflags = AMD64_FL_NCOMBO, }, { .uname = "VECTOR_CMP", .udesc = "Number of vector CMP ops retired", 
.ucode = 0x70, .uflags = AMD64_FL_NCOMBO, }, { .uname = "VECTOR_CVT", .udesc = "Number of vector CVT ops retired", .ucode = 0x80, .uflags = AMD64_FL_NCOMBO, }, { .uname = "VECTOR_BLEND", .udesc = "Number of vector BLEND ops retired", .ucode = 0x90, .uflags = AMD64_FL_NCOMBO, }, { .uname = "VECTOR_SHUFFLE", .udesc = "Number of vector SHUFFLE ops retired", .ucode = 0xb0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "VECTOR_LOGICAL", .udesc = "Number of vector LOGICAL ops retired", .ucode = 0xd0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "VECTOR_OTHER", .udesc = "Number of other vector ops retired", .ucode = 0xe0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "VECTOR_ALL", .udesc = "Number of vector ops of any type retired", .ucode = 0xf0, .uflags = AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam19h_zen4_retired_int_ops[]={ { .uname = "MMX_ADD", .udesc = "Number of MMX ADD ops retired", .ucode = 0x1, .uflags = AMD64_FL_NCOMBO, }, { .uname = "MMX_SUB", .udesc = "Number of MMX SUB ops retired", .ucode = 0x2, .uflags = AMD64_FL_NCOMBO, }, { .uname = "MMX_MUL", .udesc = "Number of MMX MUL ops retired", .ucode = 0x3, .uflags = AMD64_FL_NCOMBO, }, { .uname = "MMX_MAC", .udesc = "Number of MMX MAC ops retired", .ucode = 0x4, .uflags = AMD64_FL_NCOMBO, }, { .uname = "MMX_CMP", .udesc = "Number of MMX CMP ops retired", .ucode = 0x7, .uflags = AMD64_FL_NCOMBO, }, { .uname = "MMX_SHIFT", .udesc = "Number of MMX SHIFT ops retired", .ucode = 0x9, .uflags = AMD64_FL_NCOMBO, }, { .uname = "MMX_MOV", .udesc = "Number of MMX MOV ops retired", .ucode = 0xa, .uflags = AMD64_FL_NCOMBO, }, { .uname = "MMX_SHUFFLE", .udesc = "Number of MMX SHUFFLE ops retired", .ucode = 0xb, .uflags = AMD64_FL_NCOMBO, }, { .uname = "MMX_PACK", .udesc = "Number of MMX PACK ops retired", .ucode = 0xc, .uflags = AMD64_FL_NCOMBO, }, { .uname = "MMX_LOGICAL", .udesc = "Number of MMX LOGICAL ops retired", .ucode = 0xd, .uflags = AMD64_FL_NCOMBO, }, { .uname = "MMX_OTHER", .udesc = "Number of other MMX 
ops retired", .ucode = 0xe, .uflags = AMD64_FL_NCOMBO, }, { .uname = "MMX_ALL", .udesc = "Any MMX ops retired", .ucode = 0xf, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SSE_AVX_ADD", .udesc = "Number of SSE/AVX ADD ops retired", .ucode = 0x10, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SSE_AVX_SUB", .udesc = "Number of SSE/AVX SUB ops retired", .ucode = 0x20, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SSE_AVX_MUL", .udesc = "Number of SSE/AVX MUL ops retired", .ucode = 0x30, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SSE_AVX_MAC", .udesc = "Number of SSE/AVX MAC ops retired", .ucode = 0x40, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SSE_AVX_AES", .udesc = "Number of SSE/AVX AES ops retired", .ucode = 0x50, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SSE_AVX_SHA", .udesc = "Number of SSE/AVX SHA ops retired", .ucode = 0x60, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SSE_AVX_CMP", .udesc = "Number of SSE/AVX CMP ops retired", .ucode = 0x70, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SSE_AVX_CLM", .udesc = "Number of SSE/AVX CLM ops retired", .ucode = 0x80, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SSE_AVX_SHIFT", .udesc = "Number of SSE/AVX SHIFT ops retired", .ucode = 0x90, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SSE_AVX_MOV", .udesc = "Number of SSE/AVX MOV ops retired", .ucode = 0xa0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SSE_AVX_SHUFFLE", .udesc = "Number of SSE/AVX SHUFFLE ops retired", .ucode = 0xb0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SSE_AVX_PACK", .udesc = "Number of SSE/AVX PACK ops retired", .ucode = 0xc0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SSE_AVX_LOGICAL", .udesc = "Number of SSE/AVX LOGICAL ops retired", .ucode = 0xd0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SSE_AVX_OTHER", .udesc = "Number of other SSE/AVX ops retired", .ucode = 0xe0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SSE_AVX_ALL", .udesc = "Any SSE/AVX ops retired", .ucode = 0xf0, .uflags = AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam19h_zen4_fp_dispatch_faults[]={ { 
.uname = "X87_FILL_FAULT", .udesc = "x87 fill faults", .ucode = 0x1, }, { .uname = "XMM_FILL_FAULT", .udesc = "XMM fill faults", .ucode = 0x2, }, { .uname = "YMM_FILL_FAULT", .udesc = "YMM fill faults", .ucode = 0x4, }, { .uname = "YMM_SPILL_FAULT", .udesc = "YMM spill faults", .ucode = 0x8, }, { .uname = "ANY", .udesc = "Any FP dispatch faults", .ucode = 0xf, .uflags = AMD64_FL_DFL | AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam19h_zen4_bad_status_2[]={ { .uname = "STLI_OTHER", .udesc = "Store-to-load conflicts. A load was unable to complete due to a non-forwardable conflict with an older store", .ucode = 0x2, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam19h_zen4_retired_lock_instructions[]={ { .uname = "BUS_LOCK", .udesc = "Number of bus locks", .ucode = 0x1, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam19h_zen4_ls_dispatch[]={ { .uname = "LD_ST_DISPATCH", .udesc = "Dispatched op that performs a load from and store to the same memory address", .ucode = 0x4, }, { .uname = "STORE_DISPATCH", .udesc = "Store ops dispatched", .ucode = 0x2, }, { .uname = "LD_DISPATCH", .udesc = "Load ops dispatched", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen4_store_commit_cancels_2[]={ { .uname = "WCB_FULL", .udesc = "Non cacheable store and the non-cacheable commit buffer is full", .ucode = 0x1, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam19h_zen4_mab_allocation_by_type[]={ { .uname = "LS", .udesc = "Load store allocations", .ucode = 0x3f, .uflags = AMD64_FL_NCOMBO, }, { .uname = "HW_PF", .udesc = "Hardware prefetcher allocations", .ucode = 0x40, .uflags = AMD64_FL_NCOMBO, }, { .uname = "ALL", .udesc = "All allocations", .ucode = 0x7f, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam19h_zen4_demand_data_fills_from_system[]={ { .uname = "LCL_L2", .udesc = "Fill from local L2 to the core", .ucode = 0x1, }, { .uname = "LOCAL_CCX", .udesc = "Fill from L3 
or different L2 in same CCX", .ucode = 0x2, }, { .uname = "NEAR_CACHE_NEAR_FAR", .udesc = "Fill from cache of different CCX in same node", .ucode = 0x4, }, { .uname = "DRAM_IO_NEAR", .udesc = "Fill from DRAM or IO connected to same node", .ucode = 0x8, }, { .uname = "FAR_CACHE_NEAR_FAR", .udesc = "Fill from CCX cache in different node", .ucode = 0x10, }, { .uname = "DRAM_IO_FAR", .udesc = "Fill from DRAM or IO connected from a different node (same socket or remote)", .ucode = 0x40, }, { .uname = "ALT_MEM_NEAR_FAR", .udesc = "Fill from Extension Memory", .ucode = 0x80, }, }; static const amd64_umask_t amd64_fam19h_zen4_l1_dtlb_miss[]={ { .uname = "TLB_RELOAD_1G_L2_MISS", .udesc = "Data TLB reload to a 1GB page that missed in the L2 TLB", .ucode = 0x80, }, { .uname = "TLB_RELOAD_2M_L2_MISS", .udesc = "Data TLB reload to a 2MB page that missed in the L2 TLB", .ucode = 0x40, }, { .uname = "TLB_RELOAD_COALESCED_PAGE_MISS", .udesc = "Data TLB reload to a coalesced page that also missed in the L2 TLB", .ucode = 0x20, }, { .uname = "TLB_RELOAD_4K_L2_MISS", .udesc = "Data TLB reload to a 4KB page that missed in the L2 TLB", .ucode = 0x10, }, { .uname = "TLB_RELOAD_1G_L2_HIT", .udesc = "Data TLB reload to a 1GB page that hit in the L2 TLB", .ucode = 0x8, }, { .uname = "TLB_RELOAD_2M_L2_HIT", .udesc = "Data TLB reload to a 2MB page that hit in the L2 TLB", .ucode = 0x4, }, { .uname = "TLB_RELOAD_COALESCED_PAGE_HIT", .udesc = "Data TLB reload to a coalesced page that hit in the L2 TLB", .ucode = 0x2, }, { .uname = "TLB_RELOAD_4K_L2_HIT", .udesc = "Data TLB reload to a 4KB page that hit in the L2 TLB", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen4_misaligned_loads[]={ { .uname = "MA4K", .udesc = "The number of 4KB misaligned (page crossing) loads", .ucode = 0x2, }, { .uname = "MA64", .udesc = "The number of 64B misaligned (cacheline crossing) loads", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen4_prefetch_instructions_dispatched[]={ { .uname 
= "PREFETCH_T0_T1_T2", .udesc = "Number of prefetcht0, perfetcht1, prefetcht2 instructions dispatched", .ucode = 0x1, }, { .uname = "PREFETCHW", .udesc = "Number of prefetchtw instructions dispatched", .ucode = 0x2, }, { .uname = "PREFETCHNTA", .udesc = "Number of prefetchtnta instructions dispatched", .ucode = 0x4, }, { .uname = "ANY", .udesc = "Any prefetch", .ucode = 0x7, .uflags = AMD64_FL_DFL | AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam19h_zen4_ineffective_software_prefetch[]={ { .uname = "MAB_MCH_CNT", .udesc = "Software prefetch instructions saw a match on an already allocated miss request buffer", .ucode = 0x2, }, { .uname = "DATA_PIPE_SW_PF_DC_HIT", .udesc = "Software Prefetch instruction saw a DC hit", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen4_tlb_flushes[]={ { .uname = "ALL", .udesc = "Any TLB flush", .ucode = 0xff, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam19h_zen4_l1_itlb_miss_l2_itlb_miss[]={ { .uname = "COALESCED4K", .udesc = "Number of instruction fetches to a >4K coalesced page", .ucode = 0x8, }, { .uname = "IF1G", .udesc = "Number of instruction fetches to a 1GB page", .ucode = 0x4, }, { .uname = "IF2M", .udesc = "Number of instruction fetches to a 2MB page", .ucode = 0x2, }, { .uname = "IF4K", .udesc = "Number of instruction fetches to a 4KB page", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen4_itlb_fetch_hit[]={ { .uname = "IF1G", .udesc = "L1 instruction fetch TLB hit a 1GB page size", .ucode = 0x4, }, { .uname = "IF2M", .udesc = "L1 instruction fetch TLB hit a 2MB page size", .ucode = 0x2, }, { .uname = "IF4K", .udesc = "L1 instruction fetch TLB hit a 4KB or 16KB page size", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen4_ic_tag_hit_miss[]={ { .uname = "IC_HIT", .udesc = "Instruction cache hit", .ucode = 0x7, .uflags = AMD64_FL_NCOMBO, }, { .uname = "IC_MISS", .udesc = "Instruction cache miss", .ucode = 0x18, .uflags = AMD64_FL_NCOMBO, }, { 
.uname = "ALL_IC_ACCESS", .udesc = "All instruction cache accesses", .ucode = 0x1f, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam19h_zen4_op_cache_hit_miss[]={ { .uname = "OC_HIT", .udesc = "Op cache hit", .ucode = 0x3, .uflags = AMD64_FL_NCOMBO, }, { .uname = "OC_MISS", .udesc = "Op cache miss", .ucode = 0x4, .uflags = AMD64_FL_NCOMBO, }, { .uname = "ALL_OC_ACCESS", .udesc = "All op cache accesses", .ucode = 0x7, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam19h_zen4_ops_source_dispatched_from_decoder[]={ { .uname = "DECODER", .udesc = "Number of ops fetched from Instruction Cache and dispatched", .ucode = 0x1, }, { .uname = "OPCACHE", .udesc = "Number of ops fetched from Op Cache and dispatched", .ucode = 0x2, }, { .uname = "LOOP_BUFFER", .udesc = "Number of ops fetched from Loop buffer", .ucode = 0x4, }, }; static const amd64_umask_t amd64_fam19h_zen4_ops_type_dispatched_from_decoder[]={ { .uname = "FP_DISPATCH", .udesc = "Any FP dispatch", .ucode = 0x04, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INTEGER_DISPATCH", .udesc = "Any Integer dispatch", .ucode = 0x08, .uflags = AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam19h_zen4_dispatch_resource_stall_cycles_1[]={ { .uname = "INT_PHY_REG_FILE_RSRC_STALL", .udesc = "Number of cycles stalled due to integer physical register file resource stalls. 
Applies to all ops that have integer destination register", .ucode = 0x1, }, { .uname = "LOAD_QUEUE_RSRC_STALL", .udesc = "Number of cycles stalled due to load queue resource stalls", .ucode = 0x2, }, { .uname = "STORE_QUEUE_RSRC_STALL", .udesc = "Number of cycles stalled due to store queue resource stalls", .ucode = 0x4, }, { .uname = "TAKEN_BRANCH_BUFFER_RSRC_STALL", .udesc = "Number of cycles stalled due to taken branch buffer resource stalls", .ucode = 0x10, }, { .uname = "FP_REG_FILE_RSRC_STALL", .udesc = "Number of cycles stalled due to floating-point register file resource stalls. Applies to all FP ops that have a destination register", .ucode = 0x20, }, { .uname = "FP_SCHEDULER_RSRC_STALL", .udesc = "Number of cycles stalled due to floating-point scheduler resource stalls", .ucode = 0x40, }, { .uname = "FP_FLUSH_RECOVERY_STALL", .udesc = "Number of cycles stalled due to floating-point flush recovery", .ucode = 0x80, }, }; static const amd64_umask_t amd64_fam19h_zen4_dispatch_resource_stall_cycles_2[]={ { .uname = "INT_SCHEDULER_0_TOKEN_STALL", .udesc = "Number of cycles stalled due to no tokens available for Integer Scheduler Queue 0", .ucode = 0x1, }, { .uname = "INT_SCHEDULER_1_TOKEN_STALL", .udesc = "Number of cycles stalled due to no tokens available for Integer Scheduler Queue 1", .ucode = 0x2, }, { .uname = "INT_SCHEDULER_2_TOKEN_STALL", .udesc = "Number of cycles stalled due to no tokens available for Integer Scheduler Queue 2", .ucode = 0x4, }, { .uname = "INT_SCHEDULER_3_TOKEN_STALL", .udesc = "Number of cycles stalled due to no tokens available for Integer Scheduler Queue 3", .ucode = 0x8, }, { .uname = "RETIRE_TOKEN_STALL", .udesc = "Number of cycles stalled due to insufficient tokens available for Retire Queue", .ucode = 0x20, }, }; static const amd64_umask_t amd64_fam19h_zen4_retired_mmx_fp_instructions[]={ { .uname = "SSE_INSTR", .udesc = "Number of SSE instructions (SSE, SSE2, SSE3, SSSE3, SSE4A, SSE41, SSE42, AVX)", .ucode = 0x4, }, { .uname 
= "MMX_INSTR", .udesc = "Number of MMX instructions", .ucode = 0x2, }, { .uname = "X87_INSTR", .udesc = "Number of x87 instructions", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen4_tagged_ibs_ops[]={ { .uname = "IBS_COUNT_ROLLOVER", .udesc = "Number of times a op could not be tagged by IBS because of a previous tagged op that has not retired", .ucode = 0x4, }, { .uname = "IBS_TAGGED_OPS_RET", .udesc = "Number of ops tagged by IBS that retired", .ucode = 0x2, }, { .uname = "IBS_TAGGED_OPS", .udesc = "Number of ops tagged by IBS", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen4_requests_to_l2_group1[]={ { .uname = "RD_BLK_L", .udesc = "Number of data cache reads (including software and hardware prefetches)", .ucode = 0x80, }, { .uname = "RD_BLK_X", .udesc = "Number of data cache stores", .ucode = 0x40, }, { .uname = "LS_RD_BLK_C_S", .udesc = "Number of data cache shared reads", .ucode = 0x20, }, { .uname = "CACHEABLE_IC_READ", .udesc = "Number of instruction cache reads", .ucode = 0x10, }, { .uname = "CHANGE_TO_X", .udesc = "Number of data requests change to writable, check L2 for current state", .ucode = 0x8, }, { .uname = "PREFETCH_L2_CMD", .udesc = "TBD", .ucode = 0x4, }, { .uname = "L2_HW_PF", .udesc = "Number of prefetches accepted by L2 pipeline, hit or miss", .ucode = 0x2, }, { .uname = "MISC", .udesc = "Count various non-cacheable requests: non-cached data read, non-cached instruction reads, self-modifying code checks", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen4_core_to_l2_cacheable_request_access_status[]={ { .uname = "LS_RD_BLK_C_S", .udesc = "Number of data cache shared read hitting in the L2", .ucode = 0x80, }, { .uname = "LS_RD_BLK_L_HIT_X", .udesc = "Number of data cache reads hitting in the L2", .ucode = 0x40, }, { .uname = "LS_RD_BLK_L_HIT_S", .udesc = "Number of data cache reads hitting a non-modifiable line in the L2", .ucode = 0x20, }, { .uname = "LS_RD_BLK_X", .udesc = "Number of data cache 
store or state change requests hitting in the L2", .ucode = 0x10, }, { .uname = "LS_RD_BLK_C", .udesc = "Number of data cache requests missing in the L2 (all types)", .ucode = 0x8, }, { .uname = "IC_FILL_HIT_X", .udesc = "Number of instruction cache fill requests hitting a modifiable line in the L2", .ucode = 0x4, }, { .uname = "IC_FILL_HIT_S", .udesc = "Number of instruction cache fill requests hitting a non-modifiable line in the L2", .ucode = 0x2, }, { .uname = "IC_FILL_MISS", .udesc = "Number of instruction cache fill requests missing the L2", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam19h_zen4_l2_prefetch_hit_l2[]={ { .uname = "L2_STREAM", .udesc = "Number of requests from the L2 Stream prefetcher", .ucode = 0x1, }, { .uname = "L2_NEXT_LINE", .udesc = "Number of requests from the L2 Next Line prefetcher", .ucode = 0x2, }, { .uname = "L2_UP_DOWN", .udesc = "Number of requests from the L2 Up Down prefetcher", .ucode = 0x4, }, { .uname = "L2_BURST", .udesc = "Number of requests from the L2 Burst prefetcher", .ucode = 0x8, }, { .uname = "L2_STRIDE", .udesc = "Number of requests from the L2 Stride prefetcher", .ucode = 0x10, }, { .uname = "L1_STREAM", .udesc = "Number of requests from the L1 Stream prefetcher", .ucode = 0x20, }, { .uname = "L1_STRIDE", .udesc = "Number of requests from the L1 Stride prefetcher", .ucode = 0x40, }, { .uname = "L1_REGION", .udesc = "Number of requests from the L1 Region prefetcher", .ucode = 0x80, }, }; static const amd64_umask_t amd64_fam19h_zen4_retired_x87_fp_ops[]={ { .uname = "ADD_SUB_OPS", .udesc = "Number of add/subtract ops", .ucode = 0x01, }, { .uname = "MUL_OPS", .udesc = "Number of multiply ops", .ucode = 0x2, }, { .uname = "DIV_SQRT_OPS", .udesc = "Number of divide and square root ops", .ucode = 0x4, }, }; static const amd64_umask_t amd64_fam19h_zen4_packed_int_ops_retired[]={ { .uname = "INT128_ADD", .udesc = "Number of integer 128-bit ADD ops retired", .ucode = 0x01, .uflags = AMD64_FL_NCOMBO, }, { .uname =
"INT128_SUB", .udesc = "Number of integer 128-bit SUB ops retired", .ucode = 0x02, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT128_MUL", .udesc = "Number of integer 128-bit MUL ops retired", .ucode = 0x03, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT128_MAC", .udesc = "Number of integer 256-bit MAC ops retired", .ucode = 0x04, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT128_AES", .udesc = "Number of integer 128-bit AES ops retired", .ucode = 0x05, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT128_SHA", .udesc = "Number of integer 128-bit SHA ops retired", .ucode = 0x06, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT128_CMP", .udesc = "Number of integer 128-bit CMP ops retired", .ucode = 0x07, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT128_CLM", .udesc = "Number of integer 128-bit CLM ops retired", .ucode = 0x08, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT128_SHIFT", .udesc = "Number of integer 128-bit SHIFT ops retired", .ucode = 0x09, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT128_MOV", .udesc = "Number of integer 128-bit MOV ops retired", .ucode = 0x0a, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT128_SHUFFLE", .udesc = "Number of integer 128-bit SHUFFLE ops retired", .ucode = 0x0b, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT128_PACK", .udesc = "Number of integer 128-bit PACK ops retired", .ucode = 0x0c, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT128_LOGICAL", .udesc = "Number of integer 128-bit LOGICAL ops retired", .ucode = 0x0d, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT128_OTHER", .udesc = "Number of other integer 128-bit ops retired", .ucode = 0x0e, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT128_ALL", .udesc = "Any integer 128-bit ops retired", .ucode = 0x0f, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT256_ADD", .udesc = "Number of integer 256-bit ADD ops retired", .ucode = 0x10, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT256_SUB", .udesc = "Number of integer 256-bit SHIFT ops retired", .ucode = 0x20, .uflags = AMD64_FL_NCOMBO, }, { .uname 
= "INT256_MUL", .udesc = "Number of integer 256-bit MUL ops retired", .ucode = 0x30, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT256_MAC", .udesc = "Number of integer 256-bit MAC ops retired", .ucode = 0x40, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT256_CMP", .udesc = "Number of integer 256-bit CMP ops retired", .ucode = 0x70, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT256_SHIFT", .udesc = "Number of integer 256-bit SHIFT ops retired", .ucode = 0x90, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT256_MOV", .udesc = "Number of integer 256-bit MOV ops retired", .ucode = 0xa0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT256_SHUFFLE", .udesc = "Number of integer 256-bit SHUFFLE ops retired", .ucode = 0xb0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT256_PACK", .udesc = "Number of integer 256-bit NONE ops retired", .ucode = 0xc0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT256_LOGICAL", .udesc = "Number of integer 256-bit LOGICAL ops retired", .ucode = 0xd0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT256_OTHER", .udesc = "Number of other integer 256-bit ops retired", .ucode = 0xe0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INT256_ALL", .udesc = "Any integer 256-bit ops retired", .ucode = 0xf0, .uflags = AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam19h_zen4_packed_fp_ops_retired[]={ { .uname = "FP128_ADD", .udesc = "Number of floating-point 128-bit ADD ops retired", .ucode = 0x01, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP128_SUB", .udesc = "Number of floating-point 128-bit SUB ops retired", .ucode = 0x02, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP128_MUL", .udesc = "Number of floating-point 128-bit MUL ops retired", .ucode = 0x03, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP128_MAC", .udesc = "Number of floating-point 128-bit MAC ops retired", .ucode = 0x04, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP128_DIV", .udesc = "Number of floating-point 128-bit DIV ops retired", .ucode = 0x05, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP128_SQRT", 
.udesc = "Number of floating-point 128-bit SQRT ops retired", .ucode = 0x06, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP128_CMP", .udesc = "Number of floating-point 128-bit CMP ops retired", .ucode = 0x07, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP128_CVT", .udesc = "Number of floating-point 128-bit CVT ops retired", .ucode = 0x08, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP128_BLEND", .udesc = "Number of floating-point 128-bit 256-bit BLEND ops retired", .ucode = 0x09, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP128_SHUFFLE", .udesc = "Number of floating-point 128-bit SHUFFLE ops retired", .ucode = 0x0b, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP128_LOGICAL", .udesc = "Number of floating-point 128-bit LOGICAL ops retired", .ucode = 0x0d, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP128_OTHER", .udesc = "Number of other floating-point 128-bit ops retired", .ucode = 0x0e, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP128_ALL", .udesc = "Number of any floating-point 128-bit ops retired", .ucode = 0x0f, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP256_ADD", .udesc = "Number of floating-point 256-bit ADD ops retired", .ucode = 0x10, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP256_SUB", .udesc = "Number of floating-point 256-bit SUB ops retired", .ucode = 0x20, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP256_MUL", .udesc = "Number of floating-point 256-bit MUL ops retired", .ucode = 0x30, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP256_MAC", .udesc = "Number of floating-point 256-bit MAC ops retired", .ucode = 0x40, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP256_DIV", .udesc = "Number of floating-point 256-bit DIV ops retired", .ucode = 0x50, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP256_SQRT", .udesc = "Number of floating-point 256-bit SQRT ops retired", .ucode = 0x60, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP256_CMP", .udesc = "Number of floating-point 256-bit CMP ops retired", .ucode = 0x70, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP256_CVT", .udesc = 
"Number of floating-point 256-bit CVT ops retired", .ucode = 0x80, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP256_BLEND", .udesc = "Number of floating-point 256-bit BLEND ops retired", .ucode = 0x90, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP256_SHUFFLE", .udesc = "Number of floating-point 256-bit SHUFFLE ops retired", .ucode = 0xb0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP256_LOGICAL", .udesc = "Number of floating-point 256-bit LOGICAL ops retired", .ucode = 0xd0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP256_OTHER", .udesc = "Number of other floating-point 256-bit ops retired", .ucode = 0xe0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "FP256_ALL", .udesc = "Any floating-point 256-bit ops retired", .ucode = 0xf0, .uflags = AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam19h_zen4_p0_freq_cycles_not_in_halt[]={ { .uname = "P0_FREQ_CYCLES", .udesc = "Counts at P0 frequency (same as MPERF) when CPU is not in halted state", .ucode = 0x1, .uflags = AMD64_FL_DFL, } }; static const amd64_umask_t amd64_fam19h_zen4_dispatch_stalls_1[]={ { .uname = "FE_NO_OPS", .udesc = "Counts dispatch slots left empty because the front-end did not supply ops", .ucode = 0x01, .uflags = AMD64_FL_NCOMBO, }, { .uname = "BE_STALLS", .udesc = "Counts uops unable to dispatch due to back-end stalls", .ucode = 0x1e, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SMT_CONTENTION", .udesc = "Counts dispatch slots left empty because of back-end stalls", .ucode = 0x60, .uflags = AMD64_FL_NCOMBO, } }; static const amd64_umask_t amd64_fam19h_zen4_dispatch_stalls_2[]={ { .uname = "FE_NO_OPS", .udesc = "Counts cycles dispatch is stall due to the lack of dispatch resources", .ucode = 0x30, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam19h_zen4_cycles_no_retire[]={ { .uname = "EMPTY", .udesc = "Number of cycles when there were no valid ops in the retire queue. 
This may be caused by front-end bottlenecks or pipeline redirects", .ucode = 0x1, .uflags = AMD64_FL_NCOMBO, }, { .uname = "NOT_COMPLETE_LOAD_AND_ALU", .udesc = "Number of cycles when the oldest retire slot did not have its completion bits set. Only load and ALU completion considered", .ucode = 0x2, .uflags = AMD64_FL_NCOMBO, }, { .uname = "NOT_COMPLETE_MISSING_LOAD", .udesc = "Number of cycles when the oldest retire slot did not have its completion bits set. Only missing Load completion considered", .ucode = 0xa2, .uflags = AMD64_FL_NCOMBO, }, { .uname = "OTHER", .udesc = "Number of cycles where ops could have retired but were stopped from retirement for other reasons: retire breaks, traps, faults, etc", .ucode = 0x8, .uflags = AMD64_FL_NCOMBO, }, { .uname = "THREAD_NOT_SELECTED", .udesc = "Number of cycles where ops could have retired but did not because thread arbitration did not select the thread for retire", .ucode = 0x10, .uflags = AMD64_FL_NCOMBO, }, }; static const amd64_entry_t amd64_fam19h_zen4_pe[]={ { .name = "RETIRED_X87_FP_OPS", .desc = "Number of X87 floating-point ops retired", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x2, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_retired_x87_fp_ops), .umasks = amd64_fam19h_zen4_retired_x87_fp_ops, }, { .name = "RETIRED_SSE_AVX_FLOPS", .desc = "This is a retire-based event. The number of retired SSE/AVX FLOPS. The number of events logged per cycle can vary from 0 to 64. 
This event can count above 15 and therefore requires the MergeEvent", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x3, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_retired_sse_avx_flops), .umasks = amd64_fam19h_zen4_retired_sse_avx_flops, }, { .name = "RETIRED_SERIALIZING_OPS", .desc = "The number of serializing ops retired", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x5, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_retired_serializing_ops), .umasks = amd64_fam19h_zen4_retired_serializing_ops, }, { .name = "RETIRED_FP_OPS_BY_WIDTH", .desc = "The number of retired floating-point ops by width", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x8, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_retired_fp_ops_by_width), .umasks = amd64_fam19h_zen4_retired_fp_ops_by_width, }, { .name = "RETIRED_FP_OPS_BY_TYPE", .desc = "The number of retired floating-point ops by type", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xa, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_retired_fp_ops_by_type), .umasks = amd64_fam19h_zen4_retired_fp_ops_by_type, }, { .name = "RETIRED_INT_OPS", .desc = "The number of retired integer ops (SSE/AVX)", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xb, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_retired_int_ops), .umasks = amd64_fam19h_zen4_retired_int_ops, }, { .name = "PACKED_FP_OPS_RETIRED", .desc = "The number of packed floating-point operations", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_packed_fp_ops_retired), .umasks = amd64_fam19h_zen4_packed_fp_ops_retired, }, { .name = "PACKED_INT_OPS_RETIRED", .desc = "The number of packed integer operations", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xd, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_packed_int_ops_retired), .umasks = amd64_fam19h_zen4_packed_int_ops_retired, }, { .name = "FP_DISPATCH_FAULTS", .desc = 
"Floating-point dispatch faults", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xe, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_fp_dispatch_faults), .umasks = amd64_fam19h_zen4_fp_dispatch_faults, }, { .name = "BAD_STATUS_2", .desc = "TBD", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x24, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_bad_status_2), .umasks = amd64_fam19h_zen4_bad_status_2, }, { .name = "RETIRED_LOCK_INSTRUCTIONS", .desc = "Counts the number of retired locked instructions", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x25, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_retired_lock_instructions), .umasks = amd64_fam19h_zen4_retired_lock_instructions, }, { .name = "RETIRED_CLFLUSH_INSTRUCTIONS", .desc = "Counts the number of retired non-speculative clflush instructions", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x26, .flags = 0, }, { .name = "RETIRED_CPUID_INSTRUCTIONS", .desc = "Counts the number of retired cpuid instructions", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x27, .flags = 0, }, { .name = "LS_DISPATCH", .desc = "Counts the number of operations dispatched to the LS unit. 
Unit Masks ADDed", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x29, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_ls_dispatch), .umasks = amd64_fam19h_zen4_ls_dispatch, }, { .name = "SMI_RECEIVED", .desc = "Counts the number of system management interrupts (SMI) received", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x2b, .flags = 0, }, { .name = "INTERRUPT_TAKEN", .desc = "Counts the number of interrupts taken", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x2c, .flags = 0, }, { .name = "STORE_TO_LOAD_FORWARD", .desc = "Number of Store-to-Load Forward hits", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x35, .flags = 0, .ngrp = 0, }, { .name = "STORE_COMMIT_CANCELS_2", .desc = "TBD", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x37, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_store_commit_cancels_2), .umasks = amd64_fam19h_zen4_store_commit_cancels_2, }, { .name = "MAB_ALLOCATION_BY_TYPE", .desc = "Counts when a LS pipe allocates a MAB entry", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x41, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_mab_allocation_by_type), .umasks = amd64_fam19h_zen4_mab_allocation_by_type, }, { .name = "DEMAND_DATA_CACHE_FILLS_FROM_SYSTEM", .desc = "Demand Data Cache fills by data source", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x43, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_demand_data_fills_from_system), .umasks = amd64_fam19h_zen4_demand_data_fills_from_system, }, { .name = "ANY_DATA_CACHE_FILLS_FROM_SYSTEM", .desc = "Any Data Cache fills by data source", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x44, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_demand_data_fills_from_system), .umasks = amd64_fam19h_zen4_demand_data_fills_from_system, /* shared */ }, { .name = "L1_DTLB_MISS", .desc = "L1 Data TLB misses", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x45, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_l1_dtlb_miss), .umasks = 
amd64_fam19h_zen4_l1_dtlb_miss, }, { .name = "MISALIGNED_LOADS", .desc = "Misaligned loads retired", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x47, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_misaligned_loads), .umasks = amd64_fam19h_zen4_misaligned_loads, }, { .name = "PREFETCH_INSTRUCTIONS_DISPATCHED", .desc = "Software Prefetch Instructions Dispatched. This is a speculative event", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x4b, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_prefetch_instructions_dispatched), .umasks = amd64_fam19h_zen4_prefetch_instructions_dispatched, }, { .name = "INEFFECTIVE_SOFTWARE_PREFETCH", .desc = "Number of software prefetches that did not fetch data outside of the processor core", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x52, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_ineffective_software_prefetch), .umasks = amd64_fam19h_zen4_ineffective_software_prefetch, }, { .name = "SOFTWARE_PREFETCH_DATA_CACHE_FILLS", .desc = "Number of software prefetches fills by data source", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x59, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_demand_data_fills_from_system), .umasks = amd64_fam19h_zen4_demand_data_fills_from_system, /* shared */ }, { .name = "HARDWARE_PREFETCH_DATA_CACHE_FILLS", .desc = "Number of hardware prefetches fills by data source", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x5a, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_demand_data_fills_from_system), .umasks = amd64_fam19h_zen4_demand_data_fills_from_system, /* shared */ }, { .name = "ALLOC_MAB_COUNT", .desc = "Counts the in-flight L1 data cache misses (allocated Miss Address Buffers) divided by 4 and rounded down each cycle unless used with the MergeEvent functionality. 
If the MergeEvent is used, it counts the exact number of outstanding L1 data cache misses", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x5f, .flags = 0, .ngrp = 0, }, { .name = "CYCLES_NOT_IN_HALT", .desc = "Number of core cycles not in halted state", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x76, .flags = 0, .ngrp = 0, }, { .name = "TLB_FLUSHES", .desc = "Number of TLB flushes", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x78, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_tlb_flushes), .umasks = amd64_fam19h_zen4_tlb_flushes, }, { .name = "P0_FREQ_CYCLES_NOT_IN_HALT", .desc = "Number of core cycles not in halted state by P-level", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x120, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_p0_freq_cycles_not_in_halt), .umasks = amd64_fam19h_zen4_p0_freq_cycles_not_in_halt, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_L2", .desc = "Number of 64-byte instruction cachelines that were fulfilled by the L2 cache", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x82, .flags = 0, .ngrp = 0, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM", .desc = "Number of 64-byte instruction cachelines fulfilled from system memory or another cache", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x83, .flags = 0, .ngrp = 0, }, { .name = "L1_ITLB_MISS_L2_ITLB_HIT", .desc = "Number of instruction fetches that miss in the L1 ITLB but hit in the L2 ITLB", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x84, .flags = 0, .ngrp = 0, }, { .name = "L1_ITLB_MISS_L2_ITLB_MISS", .desc = "The number of valid fills into the ITLB originating from the LS Page-Table Walker. Tablewalk requests are issued for L1-ITLB and L2-ITLB misses", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x85, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_l1_itlb_miss_l2_itlb_miss), .umasks = amd64_fam19h_zen4_l1_itlb_miss_l2_itlb_miss, }, { .name = "L2_BTB_CORRECTION", .desc = "Number of L2 branch prediction overrides of existing prediction. 
This is a speculative event", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x8b, .flags = 0, .ngrp = 0, }, { .name = "DYNAMIC_INDIRECT_PREDICTIONS", .desc = "Number of times a branch used the indirect predictor to make a prediction", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x8e, .flags = 0, .ngrp = 0, }, { .name = "DECODER_OVERRIDE_BRANCH_PRED", .desc = "Number of decoder overrides of existing branch prediction", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x91, .flags = 0, .ngrp = 0, }, { .name = "L1_ITLB_FETCH_HIT", .desc = "Instruction fetches that hit in the L1 ITLB", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x94, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_itlb_fetch_hit), .umasks = amd64_fam19h_zen4_itlb_fetch_hit, }, { .name = "RESYNCS", .desc = "Number of HW resyncs (pipeline restarts) or NC redirects. NC redirects occur when front-end transitions to fetching from un-cacheable memory", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x96, .flags = 0, .ngrp = 0, }, { .name = "IC_TAG_HIT_MISS", .desc = "Counts various IC tag related hit and miss events", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x18e, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_ic_tag_hit_miss), .umasks = amd64_fam19h_zen4_ic_tag_hit_miss, }, { .name = "OP_CACHE_HIT_MISS", .desc = "Counts op cache micro-tag hit/miss events", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x28f, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_op_cache_hit_miss), .umasks = amd64_fam19h_zen4_op_cache_hit_miss, }, { .name = "OPS_QUEUE_EMPTY", .desc = "Number of cycles where the uop queue is empty", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xa9, .flags = 0, .ngrp = 0, }, { .name = "OPS_SOURCE_DISPATCHED_FROM_DECODER", .desc = "Number of ops dispatched from the decoder classified by op source", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xaa, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_ops_source_dispatched_from_decoder), .umasks = 
amd64_fam19h_zen4_ops_source_dispatched_from_decoder, }, { .name = "OPS_TYPE_DISPATCHED_FROM_DECODER", .desc = "Number of ops dispatched from the decoder classified by op type", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xab, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_ops_type_dispatched_from_decoder), .umasks = amd64_fam19h_zen4_ops_type_dispatched_from_decoder, }, { .name = "DISPATCH_RESOURCE_STALL_CYCLES_1", .desc = "Number of cycles where a dispatch group is valid but does not get dispatched due to a Token Stall", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xae, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_dispatch_resource_stall_cycles_1), .umasks = amd64_fam19h_zen4_dispatch_resource_stall_cycles_1, }, { .name = "DISPATCH_RESOURCE_STALL_CYCLES_2", .desc = "Number of cycles where a dispatch group is valid but does not get dispatched due to a Token Stall", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xaf, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_dispatch_resource_stall_cycles_2), .umasks = amd64_fam19h_zen4_dispatch_resource_stall_cycles_2, }, { .name = "DISPATCH_STALLS_1", .desc = "For each cycle, counts the number of dispatch slots that remained unused for a given reason", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x1a0, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_dispatch_stalls_1), .umasks = amd64_fam19h_zen4_dispatch_stalls_1, }, { .name = "DISPATCH_STALLS_2", .desc = "For each cycle, counts the number of dispatch slots that remained unused for a given reason", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x1a2, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_dispatch_stalls_2), .umasks = amd64_fam19h_zen4_dispatch_stalls_2, }, { .name = "RETIRED_INSTRUCTIONS", .desc = "Number of instructions retired", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc0, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_OPS", .desc = "Number of macro-ops retired", .modmsk = 
AMD64_FAM19H_ATTRS, .code = 0xc1, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS", .desc = "Number of branch instructions retired. This includes all types of architectural control flow changes, including exceptions and interrupts", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc2, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Number of retired branch instructions that were mispredicted", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc3, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .desc = "Number of taken branches that were retired. This includes all types of architectural control flow changes, including exceptions and interrupts", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc4, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Number of retired taken branch instructions that were mispredicted", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc5, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_FAR_CONTROL_TRANSFERS", .desc = "Number of far control transfers retired including far call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and interrupts. Far control transfers are not subject to branch prediction", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc6, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_NEAR_RETURNS", .desc = "Number of near return instructions (RET or RET Iw) retired", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc8, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_NEAR_RETURNS_MISPREDICTED", .desc = "Number of near returns retired that were not correctly predicted by the return address predictor. Each such mispredict incurs the same penalty as a mispredicted conditional branch instruction", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xc9, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_INDIRECT_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Number of indirect branches retired that were not correctly predicted.
Each such mispredict incurs the same penalty as a mispredicted conditional branch instruction. Only EX mispredicts are counted", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xca, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_MMX_FP_INSTRUCTIONS", .desc = "Number of MMX, SSE or x87 instructions retired. The UnitMask allows the selection of the individual classes of instructions as given in the table. Each increment represents one complete instruction. Since this event includes non-numeric instructions, it is not suitable for measuring MFLOPS", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xcb, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_retired_mmx_fp_instructions), .umasks = amd64_fam19h_zen4_retired_mmx_fp_instructions, }, { .name = "RETIRED_INDIRECT_BRANCH_INSTRUCTIONS", .desc = "Number of indirect branches retired", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xcc, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_CONDITIONAL_BRANCH_INSTRUCTIONS", .desc = "Number of retired conditional branch instructions", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xd1, .flags = 0, .ngrp = 0, }, { .name = "DIV_CYCLES_BUSY_COUNT", .desc = "Number of cycles when the divider is busy", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xd3, .flags = 0, .ngrp = 0, }, { .name = "DIV_OP_COUNT", .desc = "Number of divide ops", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xd4, .flags = 0, .ngrp = 0, }, { .name = "CYCLES_NO_RETIRE", .desc = "Counts cycles when the hardware does not retire any ops for a given reason. Event can only track one reason at a time.
If multiple reasons apply for a given cycle, the lowest numbered reason is counted", .modmsk = AMD64_FAM19H_ATTRS, .code = 0xd6, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_cycles_no_retire), .umasks = amd64_fam19h_zen4_cycles_no_retire, }, { .name = "RETIRED_UCODE_INSTRUCTIONS", .desc = "Number of microcode instructions retired", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x1c1, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_UCODE_OPS", .desc = "Number of microcode ops retired", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x1c2, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_BRANCH_MISPREDICTED_DIRECTION_MISMATCH", .desc = "Number of retired conditional branch instructions that were not correctly predicted because of branch direction mismatch", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x1c7, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_UNCONDITIONAL_INDIRECT_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Number of retired unconditional indirect branch instructions that were mispredicted", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x1c8, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_UNCONDITIONAL_BRANCH_INSTRUCTIONS", .desc = "Number of retired unconditional branch instructions", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x1c9, .flags = 0, .ngrp = 0, }, { .name = "TAGGED_IBS_OPS", .desc = "Counts Op IBS related events", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x1cf, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_tagged_ibs_ops), .umasks = amd64_fam19h_zen4_tagged_ibs_ops, }, { .name = "RETIRED_FUSED_INSTRUCTIONS", .desc = "Counts retired fused instructions", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x1d0, .flags = 0, .ngrp = 0, }, { .name = "REQUESTS_TO_L2_GROUP1", .desc = "All L2 cache requests", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x60, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_requests_to_l2_group1), .umasks = amd64_fam19h_zen4_requests_to_l2_group1, }, { .name = "CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS", .desc = 
"L2 cache request outcomes. This event does not count accesses to the L2 cache by the L2 prefetcher", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x64, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_core_to_l2_cacheable_request_access_status), .umasks = amd64_fam19h_zen4_core_to_l2_cacheable_request_access_status, }, { .name = "L2_PREFETCH_HIT_L2", .desc = "Number of L2 prefetches that hit in the L2", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x70, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_l2_prefetch_hit_l2), .umasks = amd64_fam19h_zen4_l2_prefetch_hit_l2, }, { .name = "L2_PREFETCH_HIT_L3", .desc = "Number of L2 prefetches accepted by the L2 pipeline which miss the L2 cache and hit the L3", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x71, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_l2_prefetch_hit_l2), .umasks = amd64_fam19h_zen4_l2_prefetch_hit_l2, }, { .name = "L2_PREFETCH_MISS_L3", .desc = "Number of L2 prefetches accepted by the L2 pipeline which miss the L2 and the L3 caches", .modmsk = AMD64_FAM19H_ATTRS, .code = 0x72, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_l2_prefetch_hit_l2), .umasks = amd64_fam19h_zen4_l2_prefetch_hit_l2, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/amd64_events_fam1ah_zen5.h000066400000000000000000002046261502707512200246760ustar00rootroot00000000000000#include "../pfmlib_amd64_priv.h" #include "../pfmlib_priv.h" /* * Copyright (C) 2024 Advanced Micro Devices, Inc. All rights reserved. 
* Contributed by Swarup Sahoo * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: amd64_fam1ah_zen5 (AMD64 Fam1Ah Zen5) */ static const amd64_umask_t amd64_fam1ah_zen5_retired_x87_fp_ops[]={ { .uname = "ADD_SUB_OPS", .udesc = "Number of add/subtract ops", .ucode = 0x01, }, { .uname = "MUL_OPS", .udesc = "Number of multiply ops", .ucode = 0x2, }, { .uname = "DIV_SQRT_OPS", .udesc = "Number of divide and square root ops", .ucode = 0x4, }, }; static const amd64_umask_t amd64_fam1ah_zen5_retired_sse_avx_flops[]={ { .uname = "ADD_SUB_FLOPS", .udesc = "Addition/subtraction FLOPS", .ucode = 0x1, .grpid = 0, }, { .uname = "MULT_FLOPS", .udesc = "Multiplication FLOPS", .ucode = 0x2, .grpid = 0, }, { .uname = "DIV_FLOPS", .udesc = "Division/Square-root FLOPS", .ucode = 0x4, .grpid = 0, }, { .uname = "MAC_FLOPS", .udesc = "Multiply-Accumulate flops. 
Each MAC operation is counted as 2 FLOPS. This event does not include bfloat MAC operations", .ucode = 0x8, .grpid = 0, }, { .uname = "ALL_TYPE", .udesc = "All types", .ucode = 0x0, .uflags = AMD64_FL_DFL | AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FLOP_TYPE_BFLOAT16", .udesc = "BFloat16", .ucode = 0x20, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FLOP_TYPE_SCALAR_SINGLE", .udesc = "Scalar single", .ucode = 0x40, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FLOP_TYPE_PACKED_SINGLE", .udesc = "Packed single", .ucode = 0x60, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FLOP_TYPE_SCALAR_DOUBLE", .udesc = "Scalar double", .ucode = 0x80, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FLOP_TYPE_PACKED_DOUBLE", .udesc = "Packed double", .ucode = 0xa0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "ANY", .udesc = "Any FLOPS", .ucode = 0x0f, .uflags = AMD64_FL_DFL | AMD64_FL_NCOMBO, .grpid = 0, }, }; static const amd64_umask_t amd64_fam1ah_zen5_retired_serializing_ops[]={ { .uname = "X87_CTRL_RET", .udesc = "x87 control word mispredict traps due to mispredictions in RC or PC, or changes in Exception Mask bits", .ucode = 0x1, }, { .uname = "X87_BOT_RET", .udesc = "x87 bottom-executing ops retired", .ucode = 0x2, }, { .uname = "SSE_CTRL_RET", .udesc = "SSE/AVX control word mispredict traps", .ucode = 0x4, }, { .uname = "SSE_BOT_RET", .udesc = "SSE/AVX bottom-executing ops retired", .ucode = 0x8, }, }; static const amd64_umask_t amd64_fam1ah_zen5_retired_fp_ops_by_width[]={ { .uname = "X87_UOPS_RETIRED", .udesc = "X87 uops retired", .ucode = 0x1, }, { .uname = "MMX_UOPS_RETIRED", .udesc = "MMX uops retired", .ucode = 0x2, }, { .uname = "SCALAR_UOPS_RETIRED", .udesc = "Scalar uops retired", .ucode = 0x4, }, { .uname = "PACK128_UOPS_RETIRED", .udesc = "Packed 128-bit uops retired", .ucode = 0x8, }, { .uname = "PACK256_UOPS_RETIRED", .udesc = "Packed 256-bit uops retired", .ucode = 0x10, }, { .uname =
"PACK512_UOPS_RETIRED", .udesc = "Packed 512-bit uops retired", .ucode = 0x20, }, }; static const amd64_umask_t amd64_fam1ah_zen5_retired_fp_ops_by_type[]={ { .uname = "SCALAR_NONE", .udesc = "Do not count scalar ops", .ucode = 0x0, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 0, }, { .uname = "SCALAR_ADD", .udesc = "Number of scalar ADD ops retired", .ucode = 0x01, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "SCALAR_SUB", .udesc = "Number of scalar SUB ops retired", .ucode = 0x02, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "SCALAR_MUL", .udesc = "Number of scalar MUL ops retired", .ucode = 0x03, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "SCALAR_MAC", .udesc = "Number of scalar MAC ops retired", .ucode = 0x04, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "SCALAR_DIV", .udesc = "Number of scalar DIV ops retired", .ucode = 0x05, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "SCALAR_SQRT", .udesc = "Number of scalar SQRT ops retired", .ucode = 0x06, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "SCALAR_CMP", .udesc = "Number of scalar CMP ops retired", .ucode = 0x07, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "SCALAR_CVT", .udesc = "Number of scalar CVT ops retired", .ucode = 0x08, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "SCALAR_BLEND", .udesc = "Number of scalar BLEND ops retired", .ucode = 0x09, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "SCALAR_MOVE", .udesc = "Number of scalar MOV ops retired", .ucode = 0x0a, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "SCALAR_SHUFFLE", .udesc = "Number of scalar SHUFFLE ops retired", .ucode = 0x0b, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "SCALAR_BFLOAT", .udesc = "Number of scalar BFloat ops retired", .ucode = 0x0c, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "SCALAR_LOGICAL", .udesc = "Number of scalar LOGICAL ops retired", .ucode = 0x0d, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "SCALAR_OTHER", 
.udesc = "Number of other scalar ops retired", .ucode = 0x0e, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "SCALAR_ALL", .udesc = "Number of any scalar ops retired", .ucode = 0x0f, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "VECTOR_NONE", .udesc = "Do not count vector ops", .ucode = 0x0, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 1, }, { .uname = "VECTOR_ADD", .udesc = "Number of vector ADD ops retired", .ucode = 0x10, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "VECTOR_SUB", .udesc = "Number of vector SUB ops retired", .ucode = 0x20, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "VECTOR_MUL", .udesc = "Number of vector MUL ops retired", .ucode = 0x30, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "VECTOR_MAC", .udesc = "Number of vector MAC ops retired", .ucode = 0x40, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "VECTOR_DIV", .udesc = "Number of DIV ops retired", .ucode = 0x50, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "VECTOR_SQRT", .udesc = "Number of vector SQRT ops retired", .ucode = 0x60, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "VECTOR_CMP", .udesc = "Number of vector CMP ops retired", .ucode = 0x70, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "VECTOR_CVT", .udesc = "Number of vector CVT ops retired", .ucode = 0x80, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "VECTOR_BLEND", .udesc = "Number of vector BLEND ops retired", .ucode = 0x90, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "VECTOR_MOVE", .udesc = "Number of vector MOV ops retired", .ucode = 0xa0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "VECTOR_SHUFFLE", .udesc = "Number of vector SHUFFLE ops retired", .ucode = 0xb0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "VECTOR_BFLOAT", .udesc = "Number of vector BFloat ops retired", .ucode = 0xc0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "VECTOR_LOGICAL", .udesc = "Number of vector LOGICAL ops retired", .ucode = 0xd0, 
.uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "VECTOR_OTHER", .udesc = "Number of other vector ops retired", .ucode = 0xe0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "VECTOR_ALL", .udesc = "Number of any vector ops retired", .ucode = 0xf0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, }; static const amd64_umask_t amd64_fam1ah_zen5_retired_int_ops[]={ { .uname = "MMX_NONE", .udesc = "Do not count MMX ops", .ucode = 0x0, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 0, }, { .uname = "MMX_ADD", .udesc = "Number of MMX ADD ops retired", .ucode = 0x1, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "MMX_SUB", .udesc = "Number of MMX SUB ops retired", .ucode = 0x2, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "MMX_MUL", .udesc = "Number of MMX MUL ops retired", .ucode = 0x3, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "MMX_MAC", .udesc = "Number of MMX MAC ops retired", .ucode = 0x4, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "MMX_AES", .udesc = "Number of MMX AES ops retired", .ucode = 0x5, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "MMX_SHA", .udesc = "Number of MMX SHA ops retired", .ucode = 0x6, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "MMX_CMP", .udesc = "Number of MMX CMP ops retired", .ucode = 0x7, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "MMX_CVT", .udesc = "Number of MMX CVT or PACK ops retired", .ucode = 0x8, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "MMX_SHIFT", .udesc = "Number of MMX SHIFT ops retired", .ucode = 0x9, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "MMX_MOV", .udesc = "Number of MMX MOV ops retired", .ucode = 0xa, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "MMX_SHUFFLE", .udesc = "Number of MMX SHUFFLE ops retired", .ucode = 0xb, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "MMX_VNNI", .udesc = "Number of MMX VNNI ops retired", .ucode = 0xc, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "MMX_LOGICAL", .udesc = "Number
of MMX LOGICAL ops retired", .ucode = 0xd, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "MMX_OTHER", .udesc = "Number of other MMX ops retired", .ucode = 0xe, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "MMX_ALL", .udesc = "Any MMX ops retired", .ucode = 0xf, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "SSE_AVX_NONE", .udesc = "Do not count SSE/AVX ops", .ucode = 0x0, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 1, }, { .uname = "SSE_AVX_ADD", .udesc = "Number of SSE/AVX ADD ops retired", .ucode = 0x10, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "SSE_AVX_SUB", .udesc = "Number of SSE/AVX SUB ops retired", .ucode = 0x20, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "SSE_AVX_MUL", .udesc = "Number of SSE/AVX MUL ops retired", .ucode = 0x30, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "SSE_AVX_MAC", .udesc = "Number of SSE/AVX MAC ops retired", .ucode = 0x40, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "SSE_AVX_AES", .udesc = "Number of SSE/AVX AES ops retired", .ucode = 0x50, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "SSE_AVX_SHA", .udesc = "Number of SSE/AVX SHA ops retired", .ucode = 0x60, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "SSE_AVX_CMP", .udesc = "Number of SSE/AVX CMP ops retired", .ucode = 0x70, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "SSE_AVX_CVT", .udesc = "Number of SSE/AVX CVT or PACK ops retired", .ucode = 0x80, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "SSE_AVX_SHIFT", .udesc = "Number of SSE/AVX SHIFT ops retired", .ucode = 0x90, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "SSE_AVX_MOV", .udesc = "Number of SSE/AVX MOV ops retired", .ucode = 0xa0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "SSE_AVX_SHUFFLE", .udesc = "Number of SSE/AVX SHUFFLE ops retired", .ucode = 0xb0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "SSE_AVX_VNNI", .udesc = "Number of SSE/AVX VNNI ops retired", .ucode = 0xc0, .uflags = 
AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "SSE_AVX_LOGICAL", .udesc = "Number of SSE/AVX LOGICAL ops retired", .ucode = 0xd0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "SSE_AVX_OTHER", .udesc = "Number of other SSE/AVX ops retired", .ucode = 0xe0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "SSE_AVX_ALL", .udesc = "Any SSE/AVX ops retired", .ucode = 0xf0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, }; static const amd64_umask_t amd64_fam1ah_zen5_packed_fp_ops_retired[]={ { .uname = "FP128_NONE", .udesc = "Do not count any 128-bit ops", .ucode = 0x0, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 0, }, { .uname = "FP128_ADD", .udesc = "Number of floating-point 128-bit ADD ops retired", .ucode = 0x01, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "FP128_SUB", .udesc = "Number of floating-point 128-bit SUB ops retired", .ucode = 0x02, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "FP128_MUL", .udesc = "Number of floating-point 128-bit MUL ops retired", .ucode = 0x03, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "FP128_MAC", .udesc = "Number of floating-point 128-bit MAC ops retired", .ucode = 0x04, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "FP128_DIV", .udesc = "Number of floating-point 128-bit DIV ops retired", .ucode = 0x05, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "FP128_SQRT", .udesc = "Number of floating-point 128-bit SQRT ops retired", .ucode = 0x06, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "FP128_CMP", .udesc = "Number of floating-point 128-bit CMP ops retired", .ucode = 0x07, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "FP128_CVT", .udesc = "Number of floating-point 128-bit CVT ops retired", .ucode = 0x08, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "FP128_BLEND", .udesc = "Number of floating-point 128-bit BLEND ops retired", .ucode = 0x09, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "FP128_MOV", .udesc = "Number of floating-point 128-bit MOV ops
retired", .ucode = 0x0a, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "FP128_SHUFFLE", .udesc = "Number of floating-point 128-bit SHUFFLE ops retired", .ucode = 0x0b, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "FP128_BFLOAT", .udesc = "Number of floating-point 128-bit BFloat ops retired", .ucode = 0x0c, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "FP128_LOGICAL", .udesc = "Number of floating-point 128-bit LOGICAL ops retired", .ucode = 0x0d, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "FP128_OTHER", .udesc = "Number of other floating-point 128-bit ops retired", .ucode = 0x0e, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "FP128_ALL", .udesc = "Number of any floating-point 128-bit ops retired", .ucode = 0x0f, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "FP256_NONE", .udesc = "Do not count any 256-bit ops", .ucode = 0x0, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 1, }, { .uname = "FP256_ADD", .udesc = "Number of floating-point 256-bit ADD ops retired", .ucode = 0x10, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FP256_SUB", .udesc = "Number of floating-point 256-bit SUB ops retired", .ucode = 0x20, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FP256_MUL", .udesc = "Number of floating-point 256-bit MUL ops retired", .ucode = 0x30, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FP256_MAC", .udesc = "Number of floating-point 256-bit MAC ops retired", .ucode = 0x40, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FP256_DIV", .udesc = "Number of floating-point 256-bit DIV ops retired", .ucode = 0x50, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FP256_SQRT", .udesc = "Number of floating-point 256-bit SQRT ops retired", .ucode = 0x60, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FP256_CMP", .udesc = "Number of floating-point 256-bit CMP ops retired", .ucode = 0x70, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FP256_CVT", .udesc = "Number of floating-point 
256-bit CVT ops retired", .ucode = 0x80, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FP256_BLEND", .udesc = "Number of floating-point 256-bit BLEND ops retired", .ucode = 0x90, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FP256_MOV", .udesc = "Number of floating-point 256-bit MOV ops retired", .ucode = 0xa0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FP256_SHUFFLE", .udesc = "Number of floating-point 256-bit SHUFFLE ops retired", .ucode = 0xb0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FP256_BFLOAT", .udesc = "Number of floating-point 256-bit BFloat ops retired", .ucode = 0xc0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FP256_LOGICAL", .udesc = "Number of floating-point 256-bit LOGICAL ops retired", .ucode = 0xd0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FP256_OTHER", .udesc = "Number of other floating-point 256-bit ops retired", .ucode = 0xe0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "FP256_ALL", .udesc = "Any floating-point 256-bit ops retired", .ucode = 0xf0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, }; static const amd64_umask_t amd64_fam1ah_zen5_packed_int_ops_retired[]={ { .uname = "INT128_NONE", .udesc = "Do not count 128-bit ops", .ucode = 0x0, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 0, }, { .uname = "INT128_ADD", .udesc = "Number of integer 128-bit ADD ops retired", .ucode = 0x01, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "INT128_SUB", .udesc = "Number of integer 128-bit SUB ops retired", .ucode = 0x02, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "INT128_MUL", .udesc = "Number of integer 128-bit MUL ops retired", .ucode = 0x03, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "INT128_MAC", .udesc = "Number of integer 128-bit MAC ops retired", .ucode = 0x04, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "INT128_AES", .udesc = "Number of integer 128-bit AES ops retired", .ucode = 0x05, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname =
"INT128_SHA", .udesc = "Number of integer 128-bit SHA ops retired", .ucode = 0x06, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "INT128_CMP", .udesc = "Number of integer 128-bit CMP ops retired", .ucode = 0x07, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "INT128_CVT", .udesc = "Number of integer 128-bit CVT or PACK ops retired", .ucode = 0x08, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "INT128_SHIFT", .udesc = "Number of integer 128-bit SHIFT ops retired", .ucode = 0x09, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "INT128_MOV", .udesc = "Number of integer 128-bit MOV ops retired", .ucode = 0x0a, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "INT128_SHUFFLE", .udesc = "Number of integer 128-bit SHUFFLE ops retired", .ucode = 0x0b, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "INT128_VNNI", .udesc = "Number of integer 128-bit VNNI ops retired", .ucode = 0x0c, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "INT128_LOGICAL", .udesc = "Number of integer 128-bit LOGICAL ops retired", .ucode = 0x0d, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "INT128_OTHER", .udesc = "Number of other integer 128-bit ops retired", .ucode = 0x0e, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "INT128_ALL", .udesc = "Any integer 128-bit ops retired", .ucode = 0x0f, .uflags = AMD64_FL_NCOMBO, .grpid = 0, }, { .uname = "INT256_NONE", .udesc = "Do not count 256-bit ops", .ucode = 0x0, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, .grpid = 1, }, { .uname = "INT256_ADD", .udesc = "Number of integer 256-bit ADD ops retired", .ucode = 0x10, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "INT256_SUB", .udesc = "Number of integer 256-bit SHIFT ops retired", .ucode = 0x20, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "INT256_MUL", .udesc = "Number of integer 256-bit MUL ops retired", .ucode = 0x30, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "INT256_MAC", .udesc = "Number of integer 256-bit MAC ops retired", 
.ucode = 0x40, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "INT256_AES", .udesc = "Number of integer 256-bit AES ops retired", .ucode = 0x50, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "INT256_SHA", .udesc = "Number of integer 256-bit SHA ops retired", .ucode = 0x60, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "INT256_CMP", .udesc = "Number of integer 256-bit CMP ops retired", .ucode = 0x70, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "INT256_CVT", .udesc = "Number of integer 256-bit CVT or PACK ops retired", .ucode = 0x80, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "INT256_SHIFT", .udesc = "Number of integer 256-bit SHIFT ops retired", .ucode = 0x90, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "INT256_MOV", .udesc = "Number of integer 256-bit MOV ops retired", .ucode = 0xa0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "INT256_SHUFFLE", .udesc = "Number of integer 256-bit SHUFFLE ops retired", .ucode = 0xb0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "INT256_VNNI", .udesc = "Number of integer 256-bit VNNI ops retired", .ucode = 0xc0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "INT256_LOGICAL", .udesc = "Number of integer 256-bit LOGICAL ops retired", .ucode = 0xd0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "INT256_OTHER", .udesc = "Number of other integer 256-bit ops retired", .ucode = 0xe0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, { .uname = "INT256_ALL", .udesc = "Any integer 256-bit ops retired", .ucode = 0xf0, .uflags = AMD64_FL_NCOMBO, .grpid = 1, }, }; static const amd64_umask_t amd64_fam1ah_zen5_fp_dispatch_faults[]={ { .uname = "X87_FILL_FAULT", .udesc = "x87 fill faults", .ucode = 0x1, }, { .uname = "XMM_FILL_FAULT", .udesc = "XMM fill faults", .ucode = 0x2, }, { .uname = "YMM_FILL_FAULT", .udesc = "YMM fill faults", .ucode = 0x4, }, { .uname = "YMM_SPILL_FAULT", .udesc = "YMM spill faults", .ucode = 0x8, }, { .uname = "ANY", .udesc = "Any FP dispatch faults", 
.ucode = 0xf, .uflags = AMD64_FL_DFL | AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam1ah_zen5_bad_status_2[]={ { .uname = "STLI_OTHER", .udesc = "Store-to-load conflicts. A load was unable to complete due to a non-forwardable conflict with an older store", .ucode = 0x2, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam1ah_zen5_retired_lock_instructions[]={ { .uname = "BUS_LOCK", .udesc = "Number of non-cacheable or cacheline-misaligned locks", .ucode = 0x1, }, { .uname = "ANY", .udesc = "Number of all locks", .ucode = 0x1f, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam1ah_zen5_ls_dispatch[]={ { .uname = "LD_ST_DISPATCH", .udesc = "Dispatched op that performs a load from and store to the same memory address", .ucode = 0x4, }, { .uname = "STORE_DISPATCH", .udesc = "Store ops dispatched", .ucode = 0x2, }, { .uname = "LD_DISPATCH", .udesc = "Load ops dispatched", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam1ah_zen5_store_commit_cancels_2[]={ { .uname = "OLDER_ST_VISIBLE_CANCEL", .udesc = "Older SCB being waited on to become globally visible was unable to become globally visible", .ucode = 0x1, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam1ah_zen5_mab_allocation_by_type[]={ { .uname = "LS", .udesc = "Load store allocations", .ucode = 0x7, .uflags = AMD64_FL_NCOMBO, }, { .uname = "HW_PF", .udesc = "Hardware prefetcher allocations", .ucode = 0x8, .uflags = AMD64_FL_NCOMBO, }, { .uname = "ALL", .udesc = "All allocations", .ucode = 0xf, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam1ah_zen5_demand_data_fills_from_system[]={ { .uname = "LCL_L2", .udesc = "Fill from local L2 to the core", .ucode = 0x1, }, { .uname = "LOCAL_CCX", .udesc = "Fill from L3 or different L2 in same CCX", .ucode = 0x2, }, { .uname = "NEAR_FAR_CACHE_NEAR", .udesc = "Fill from cache of different CCX in same node", .ucode = 0x4, }, { .uname = "DRAM_IO_NEAR", .udesc 
= "Fill from DRAM or IO connected to same node", .ucode = 0x8, }, { .uname = "NEAR_FAR_CACHE_FAR", .udesc = "Fill from CCX cache in different node", .ucode = 0x10, }, { .uname = "DRAM_IO_FAR", .udesc = "Fill from DRAM or IO connected from a different node (same socket or remote)", .ucode = 0x40, }, { .uname = "ALT_MEM_NEAR_FAR", .udesc = "Fill from Extension Memory", .ucode = 0x80, }, }; static const amd64_umask_t amd64_fam1ah_zen5_l1_dtlb_miss[]={ { .uname = "TLB_RELOAD_1G_L2_MISS", .udesc = "Data TLB reload to a 1GB page that missed in the L2 TLB", .ucode = 0x80, }, { .uname = "TLB_RELOAD_2M_L2_MISS", .udesc = "Data TLB reload to a 2MB page that missed in the L2 TLB", .ucode = 0x40, }, { .uname = "TLB_RELOAD_COALESCED_PAGE_MISS", .udesc = "Data TLB reload to a coalesced page that also missed in the L2 TLB", .ucode = 0x20, }, { .uname = "TLB_RELOAD_4K_L2_MISS", .udesc = "Data TLB reload to a 4KB page that missed in the L2 TLB", .ucode = 0x10, }, { .uname = "TLB_RELOAD_1G_L2_HIT", .udesc = "Data TLB reload to a 1GB page that hit in the L2 TLB", .ucode = 0x8, }, { .uname = "TLB_RELOAD_2M_L2_HIT", .udesc = "Data TLB reload to a 2MB page that hit in the L2 TLB", .ucode = 0x4, }, { .uname = "TLB_RELOAD_COALESCED_PAGE_HIT", .udesc = "Data TLB reload to a coalesced page that hit in the L2 TLB", .ucode = 0x2, }, { .uname = "TLB_RELOAD_4K_L2_HIT", .udesc = "Data TLB reload to a 4KB page that hit in the L2 TLB", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam1ah_zen5_misaligned_loads[]={ { .uname = "MA4K", .udesc = "The number of 4KB misaligned (page crossing) loads", .ucode = 0x2, }, { .uname = "MA64", .udesc = "The number of 64B misaligned (cacheline crossing) loads", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam1ah_zen5_prefetch_instructions_dispatched[]={ { .uname = "PREFETCH_T0_T1_T2", .udesc = "Number of prefetcht0, prefetcht1, prefetcht2 instructions dispatched", .ucode = 0x1, }, { .uname = "PREFETCHW", .udesc = "Number of prefetchw instructions
dispatched", .ucode = 0x2, }, { .uname = "PREFETCHNTA", .udesc = "Number of prefetchnta instructions dispatched", .ucode = 0x4, }, { .uname = "ANY", .udesc = "Any prefetch", .ucode = 0x7, .uflags = AMD64_FL_DFL | AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam1ah_zen5_wcb_close[]={ { .uname = "FULL_LINE_64B", .udesc = "All 64 bytes of the WCB entry have been written", .ucode = 0x1, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam1ah_zen5_ineffective_software_prefetch[]={ { .uname = "MAB_MCH_CNT", .udesc = "Software prefetch instructions saw a match on an already allocated miss request buffer", .ucode = 0x2, }, { .uname = "DATA_PIPE_SW_PF_DC_HIT", .udesc = "Software Prefetch instruction saw a DC hit", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam1ah_zen5_tlb_flushes[]={ { .uname = "TLB_FLUSHES", .udesc = "All TLB Flushes", .ucode = 0xff, .uflags = AMD64_FL_DFL, } }; static const amd64_umask_t amd64_fam1ah_zen5_p0_freq_cycles_not_in_halt[]={ { .uname = "P0_FREQ_CYCLES", .udesc = "Counts at P0 frequency (same as MPERF) when CPU is not in halted state", .ucode = 0x1, .uflags = AMD64_FL_DFL, } }; static const amd64_umask_t amd64_fam1ah_zen5_l1_itlb_miss_l2_itlb_miss[]={ { .uname = "COALESCED4K", .udesc = "Number of instruction fetches to a >4K coalesced page", .ucode = 0x8, }, { .uname = "IF1G", .udesc = "Number of instruction fetches to a 1GB page", .ucode = 0x4, }, { .uname = "IF2M", .udesc = "Number of instruction fetches to a 2MB page", .ucode = 0x2, }, { .uname = "IF4K", .udesc = "Number of instruction fetches to a 4KB page", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam1ah_zen5_itlb_fetch_hit[]={ { .uname = "IF1G", .udesc = "L1 instruction fetch TLB hit a 1GB page size", .ucode = 0x4, }, { .uname = "IF2M", .udesc = "L1 instruction fetch TLB hit a 2MB page size", .ucode = 0x2, }, { .uname = "IF4K", .udesc = "L1 instruction fetch TLB hit a 4KB or 16KB page size", .ucode = 0x1, }, }; static const amd64_umask_t
amd64_fam1ah_zen5_bp_redirects[]={ { .uname = "EX_REDIR", .udesc = "Mispredict redirect from EX (execution-time)", .ucode = 0x2, }, { .uname = "RESYNC", .udesc = "Resync redirect from RT (retire-time)", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam1ah_zen5_fetch_ibs_events[]={ { .uname = "SAMPLE_VAL", .udesc = "Counts the number of valid fetch IBS samples that were collected. Each valid sample also created an IBS interrupt", .ucode = 0x10, }, { .uname = "SAMPLE_FILTERED", .udesc = "Counts the number of fetch IBS tagged fetches that were discarded due to IBS filtering. When a tagged fetch is discarded the fetch IBS facility will automatically tag a new fetch", .ucode = 0x8, }, { .uname = "SAMPLE_DISCARDED", .udesc = "Counts when the fetch IBS facility discards an IBS tagged fetch for reasons other than IBS filtering. When a tagged fetch is discarded the fetch IBS facility will automatically tag a new fetch", .ucode = 0x4, }, { .uname = "FETCH_TAGGED", .udesc = "Counts the number of fetches tagged for fetch IBS. 
Not all tagged fetches create an IBS interrupt and valid fetch sample", .ucode = 0x2, }, }; static const amd64_umask_t amd64_fam1ah_zen5_ic_tag_hit_miss[]={ { .uname = "IC_HIT", .udesc = "Instruction cache hit", .ucode = 0x7, .uflags = AMD64_FL_NCOMBO, }, { .uname = "IC_MISS", .udesc = "Instruction cache miss", .ucode = 0x18, .uflags = AMD64_FL_NCOMBO, }, { .uname = "ALL_IC_ACCESS", .udesc = "All instruction cache accesses", .ucode = 0x1f, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam1ah_zen5_op_cache_hit_miss[]={ { .uname = "OC_HIT", .udesc = "Op cache hit", .ucode = 0x3, .uflags = AMD64_FL_NCOMBO, }, { .uname = "OC_MISS", .udesc = "Op cache miss", .ucode = 0x4, .uflags = AMD64_FL_NCOMBO, }, { .uname = "ALL_OC_ACCESS", .udesc = "All op cache accesses", .ucode = 0x7, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam1ah_zen5_ops_source_dispatched_from_decoder[]={ { .uname = "DECODER", .udesc = "Number of ops dispatched from x86 decoder", .ucode = 0x1, }, { .uname = "OPCACHE", .udesc = "Number of ops dispatched from Op Cache", .ucode = 0x2, }, }; static const amd64_umask_t amd64_fam1ah_zen5_ops_type_dispatched_from_decoder[]={ { .uname = "FP_DISPATCH", .udesc = "Any FP dispatch", .ucode = 0x04, .uflags = AMD64_FL_NCOMBO, }, { .uname = "INTEGER_DISPATCH", .udesc = "Any Integer dispatch", .ucode = 0x08, .uflags = AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam1ah_zen5_dispatch_resource_stall_cycles_1[]={ { .uname = "INT_PHY_REG_FILE_RSRC_STALL", .udesc = "Number of cycles stalled due to integer physical register file resource stalls", .ucode = 0x1, }, { .uname = "LOAD_QUEUE_RSRC_STALL", .udesc = "Number of cycles stalled due to load queue resource stalls", .ucode = 0x2, }, { .uname = "STORE_QUEUE_RSRC_STALL", .udesc = "Number of cycles stalled due to store queue resource stalls", .ucode = 0x4, }, { .uname = "TAKEN_BRANCH_BUFFER_RSRC_STALL", .udesc = "Number of cycles stalled due to 
taken branch buffer resource stalls", .ucode = 0x10, }, { .uname = "FP_SCHEDULER_RSRC_STALL", .udesc = "Number of cycles stalled due to floating-point scheduler resource stalls", .ucode = 0x40, }, }; static const amd64_umask_t amd64_fam1ah_zen5_dispatch_resource_stall_cycles_2[]={ { .uname = "AL_TOKEN_STALL", .udesc = "Number of cycles stalled due to insufficient tokens available for ALU", .ucode = 0x1, }, { .uname = "AG_TOKEN_STALL", .udesc = "Number of cycles stalled due to insufficient tokens available for Agen", .ucode = 0x2, }, { .uname = "EX_FLUSH_STALL", .udesc = "Number of cycles stalled due to integer execution flush recovery pending", .ucode = 0x4, }, { .uname = "RETIRE_TOKEN_STALL", .udesc = "Number of cycles stalled due to insufficient tokens available for Retire Queue", .ucode = 0x20, }, }; static const amd64_umask_t amd64_fam1ah_zen5_dispatch_stalls_1[]={ { .uname = "FE_NO_OPS", .udesc = "Counts dispatch slots left empty because the front-end did not supply ops", .ucode = 0x01, .uflags = AMD64_FL_NCOMBO, }, { .uname = "BE_STALLS", .udesc = "Counts ops unable to dispatch due to back-end stalls", .ucode = 0x1e, .uflags = AMD64_FL_NCOMBO, }, { .uname = "SMT_CONTENTION", .udesc = "Counts ops unable to dispatch because the dispatch cycle was granted to the other SMT thread", .ucode = 0x60, .uflags = AMD64_FL_NCOMBO, } }; static const amd64_umask_t amd64_fam1ah_zen5_dispatch_stalls_2[]={ { .uname = "FE_NO_OPS", .udesc = "Counts cycles dispatch is stalled due to the lack of dispatch resources", .ucode = 0x30, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam1ah_zen5_retired_mmx_fp_instructions[]={ { .uname = "SSE_INSTR", .udesc = "Number of SSE instructions (SSE, SSE2, SSE3, SSSE3, SSE4A, SSE41, SSE42, AVX)", .ucode = 0x4, }, { .uname = "MMX_INSTR", .udesc = "Number of MMX instructions", .ucode = 0x2, }, { .uname = "X87_INSTR", .udesc = "Number of x87 instructions", .ucode = 0x1, }, }; static const amd64_umask_t 
amd64_fam1ah_zen5_cycles_no_retire[]={ { .uname = "EMPTY", .udesc = "Number of cycles when there were no valid ops in the retire queue. This may be caused by front-end bottlenecks or pipeline redirects", .ucode = 0x1, .uflags = AMD64_FL_NCOMBO, }, { .uname = "NOT_COMPLETE_LOAD_AND_ALU", .udesc = "Number of cycles when the oldest retire slot did not have its completion bits set", .ucode = 0x2, .uflags = AMD64_FL_NCOMBO, }, { .uname = "OTHER", .udesc = "Number of cycles where ops could have retired but were stopped from retirement for other reasons: retire breaks, traps, faults, etc", .ucode = 0x8, .uflags = AMD64_FL_NCOMBO, }, { .uname = "THREAD_NOT_SELECTED", .udesc = "Number of cycles where ops could have retired but did not because thread arbitration did not select the thread for retire", .ucode = 0x10, .uflags = AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam1ah_zen5_tagged_ibs_ops[]={ { .uname = "IBS_COUNT_ROLLOVER", .udesc = "Number of times a op could not be tagged by IBS because of a previous tagged op that has not yet signaled interrupt", .ucode = 0x4, }, { .uname = "IBS_TAGGED_OPS_RET", .udesc = "Number of ops tagged by IBS that retired", .ucode = 0x2, }, { .uname = "IBS_TAGGED_OPS", .udesc = "Number of ops tagged by IBS", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam1ah_zen5_requests_to_l2_group1[]={ { .uname = "RD_BLK_L", .udesc = "Number of data cache reads (including software and hardware prefetches)", .ucode = 0x80, }, { .uname = "RD_BLK_X", .udesc = "Number of data cache stores", .ucode = 0x40, }, { .uname = "LS_RD_BLK_C_S", .udesc = "Number of data cache shared reads", .ucode = 0x20, }, { .uname = "CACHEABLE_IC_READ", .udesc = "Number of instruction cache reads", .ucode = 0x10, }, { .uname = "PREFETCH_L2_CMD", .udesc = "TBD", .ucode = 0x4, }, { .uname = "L2_HW_PF", .udesc = "Number of prefetches accepted by L2 pipeline, hit or miss", .ucode = 0x2, }, { .uname = "MISC", .udesc = "Count various non-cacheable requests: non-cached 
data read, non-cached instruction reads, self-modifying code checks", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam1ah_zen5_requests_to_l2_group2[]={ { .uname = "LS_RD_SIZED", .udesc = "LS sized read, coherent non-cacheable", .ucode = 0x40, }, { .uname = "LS_RD_SIZED_NC", .udesc = "LS sized read, non-coherent, non-cacheable", .ucode = 0x20, }, }; static const amd64_umask_t amd64_fam1ah_zen5_ls_to_l2_wbc_requests[]={ { .uname = "WCB_CLOSE", .udesc = "Write combining buffer close", .ucode = 0x20, .uflags = AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_fam1ah_zen5_core_to_l2_cacheable_request_access_status[]={ { .uname = "LS_RD_BLK_C_S", .udesc = "Number of data cache shared read hitting in the L2", .ucode = 0x80, }, { .uname = "LS_RD_BLK_L_HIT_X", .udesc = "Number of data cache reads hitting in the L2", .ucode = 0x40, }, { .uname = "LS_RD_BLK_L_HIT_S", .udesc = "Number of data cache reads hitting a non-modifiable line in the L2", .ucode = 0x20, }, { .uname = "LS_RD_BLK_X", .udesc = "Number of data cache store requests hitting in the L2", .ucode = 0x10, }, { .uname = "LS_RD_BLK_C", .udesc = "Number of data cache requests missing in the L2", .ucode = 0x8, }, { .uname = "IC_FILL_HIT_X", .udesc = "Number of instruction cache fill requests hitting a modifiable line in the L2", .ucode = 0x4, }, { .uname = "IC_FILL_HIT_S", .udesc = "Number of instruction cache fill requests hitting a non-modifiable line in the L2", .ucode = 0x2, }, { .uname = "IC_FILL_MISS", .udesc = "Number of instruction cache fill requests missing in the L2", .ucode = 0x1, }, }; static const amd64_umask_t amd64_fam1ah_zen5_l2_prefetch_hit_l2[]={ { .uname = "L2_HW_PF", .udesc = "Counts requests generated from L2 hardware prefetchers", .ucode = 0x1f, .uflags = AMD64_FL_NCOMBO, }, { .uname = "L1_DC_HW_PF", .udesc = "Counts requests generated from L1 DC hardware prefetchers", .ucode = 0xe0, .uflags = AMD64_FL_NCOMBO, }, { .uname = "L1_DC_L2_HW_PF", .udesc = "Counts requests generated from 
L1 DC and L2 hardware prefetchers", .ucode = 0xff, .uflags = AMD64_FL_NCOMBO, }, }; static const amd64_umask_t amd64_fam1ah_zen5_l2_fill_resp_src[]={ { .uname = "LOCAL_CCX", .udesc = "Fill from L3 or different L2 in same CCX", .ucode = 0x2, }, { .uname = "NEAR_CACHE", .udesc = "Fill from cache of different CCX in same node", .ucode = 0x4, }, { .uname = "DRAM_IO_NEAR", .udesc = "Fill from DRAM or IO connected to same node", .ucode = 0x8, }, { .uname = "FAR_CACHE", .udesc = "Fill from CCX cache in different node", .ucode = 0x10, }, { .uname = "DRAM_IO_FAR", .udesc = "Fill from DRAM or IO connected from a different node (same socket or remote)", .ucode = 0x40, }, { .uname = "ALT_MEM", .udesc = "Fill from Extension Memory", .ucode = 0x80, }, }; static const amd64_entry_t amd64_fam1ah_zen5_pe[]={ { .name = "RETIRED_X87_FP_OPS", .desc = "Number of X87 floating-point ops retired", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x2, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_retired_x87_fp_ops), .umasks = amd64_fam1ah_zen5_retired_x87_fp_ops, }, { .name = "RETIRED_SSE_AVX_FLOPS", .desc = "This is a retire-based event. The number of retired SSE/AVX FLOPS. The number of events logged per cycle can vary from 0 to 64. 
This event can count above 15 and therefore requires the MergeEvent", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x3, .flags = 0, .ngrp = 2, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_retired_sse_avx_flops), .umasks = amd64_fam1ah_zen5_retired_sse_avx_flops, }, { .name = "RETIRED_FP_OPS_BY_WIDTH", .desc = "The number of retired floating-point ops by width", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x8, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_retired_fp_ops_by_width), .umasks = amd64_fam1ah_zen5_retired_fp_ops_by_width, }, { .name = "RETIRED_FP_OPS_BY_TYPE", .desc = "The number of retired floating-point ops by type", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xa, .flags = 0, .ngrp = 2, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_retired_fp_ops_by_type), .umasks = amd64_fam1ah_zen5_retired_fp_ops_by_type, }, { .name = "RETIRED_INT_OPS", .desc = "The number of retired integer ops (SSE/AVX)", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xb, .flags = 0, .ngrp = 2, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_retired_int_ops), .umasks = amd64_fam1ah_zen5_retired_int_ops, }, { .name = "PACKED_FP_OPS_RETIRED", .desc = "The number of packed floating-point operations", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xc, .flags = 0, .ngrp = 2, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_packed_fp_ops_retired), .umasks = amd64_fam1ah_zen5_packed_fp_ops_retired, }, { .name = "PACKED_INT_OPS_RETIRED", .desc = "The number of packed integer operations", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xd, .flags = 0, .ngrp = 2, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_packed_int_ops_retired), .umasks = amd64_fam1ah_zen5_packed_int_ops_retired, }, { .name = "FP_DISPATCH_FAULTS", .desc = "Floating-point dispatch faults", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xe, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_fp_dispatch_faults), .umasks = amd64_fam1ah_zen5_fp_dispatch_faults, }, { .name = "BAD_STATUS_2", .desc = "TBD", .modmsk = AMD64_FAM1AH_ATTRS, 
.code = 0x24, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_bad_status_2), .umasks = amd64_fam1ah_zen5_bad_status_2, }, { .name = "RETIRED_LOCK_INSTRUCTIONS", .desc = "Counts the number of retired locked instructions", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x25, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_retired_lock_instructions), .umasks = amd64_fam1ah_zen5_retired_lock_instructions, }, { .name = "RETIRED_CLFLUSH_INSTRUCTIONS", .desc = "Counts the number of retired non-speculative clflush instructions", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x26, .flags = 0, }, { .name = "RETIRED_CPUID_INSTRUCTIONS", .desc = "Counts the number of retired cpuid instructions", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x27, .flags = 0, }, { .name = "LS_DISPATCH", .desc = "Counts the number of operations dispatched to the LS unit. Unit Masks ADDed", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x29, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_ls_dispatch), .umasks = amd64_fam1ah_zen5_ls_dispatch, }, { .name = "SMI_RECEIVED", .desc = "Counts the number of system management interrupts (SMI) received", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x2b, .flags = 0, }, { .name = "INTERRUPT_TAKEN", .desc = "Counts the number of interrupts taken", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x2c, .flags = 0, }, { .name = "STORE_TO_LOAD_FORWARD", .desc = "Number of store-to-load-forward hits", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x35, .flags = 0, .ngrp = 0, }, { .name = "STORE_GLOBALLY_VISIBLE_CANCELS_2", .desc = "Count reasons why a store coalescing buffer (SCB) commit is canceled", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x37, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_store_commit_cancels_2), .umasks = amd64_fam1ah_zen5_store_commit_cancels_2, }, { .name = "MAB_ALLOCATION_BY_TYPE", .desc = "Counts when a LS pipe allocates a MAB entry", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x41, .flags = 0, .ngrp = 1, .numasks =
LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_mab_allocation_by_type), .umasks = amd64_fam1ah_zen5_mab_allocation_by_type, }, { .name = "DEMAND_DATA_CACHE_FILLS_FROM_SYSTEM", .desc = "Demand Data Cache fills by data source", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x43, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_demand_data_fills_from_system), .umasks = amd64_fam1ah_zen5_demand_data_fills_from_system, /* shared */ }, { .name = "ANY_DATA_CACHE_FILLS_FROM_SYSTEM", .desc = "Any Data Cache fills by data source", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x44, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_demand_data_fills_from_system), .umasks = amd64_fam1ah_zen5_demand_data_fills_from_system, /* shared */ }, { .name = "L1_DTLB_MISS", .desc = "L1 Data TLB misses", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x45, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_l1_dtlb_miss), .umasks = amd64_fam1ah_zen5_l1_dtlb_miss, }, { .name = "MISALIGNED_LOADS", .desc = "Misaligned loads retired", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x47, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_misaligned_loads), .umasks = amd64_fam1ah_zen5_misaligned_loads, }, { .name = "PREFETCH_INSTRUCTIONS_DISPATCHED", .desc = "Software Prefetch Instructions Dispatched. 
This is a speculative event", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x4b, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_prefetch_instructions_dispatched), .umasks = amd64_fam1ah_zen5_prefetch_instructions_dispatched, }, { .name = "WRITE_COMBINING_BUFFER_CLOSE", .desc = "Counts events that cause a Write Combining Buffer (WCB) entry to close", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x50, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_wcb_close), .umasks = amd64_fam1ah_zen5_wcb_close, }, { .name = "INEFFECTIVE_SOFTWARE_PREFETCH", .desc = "Number of software prefetches that did not fetch data outside of the processor core", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x52, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_ineffective_software_prefetch), .umasks = amd64_fam1ah_zen5_ineffective_software_prefetch, }, { .name = "SOFTWARE_PREFETCH_DATA_CACHE_FILLS", .desc = "Number of software prefetch fills by data source", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x59, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_demand_data_fills_from_system), .umasks = amd64_fam1ah_zen5_demand_data_fills_from_system, /* shared */ }, { .name = "HARDWARE_PREFETCH_DATA_CACHE_FILLS", .desc = "Number of hardware prefetch fills by data source", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x5a, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_demand_data_fills_from_system), .umasks = amd64_fam1ah_zen5_demand_data_fills_from_system, /* shared */ }, { .name = "ALLOC_MAB_COUNT", .desc = "Counts the in-flight L1 data cache misses each cycle", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x5f, .flags = 0, .ngrp = 0, }, { .name = "CYCLES_NOT_IN_HALT", .desc = "Number of core cycles not in halted state", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x76, .flags = 0, .ngrp = 0, }, { .name = "TLB_FLUSHES", .desc = "Number of TLB flushes", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x78, .flags = 0, .ngrp = 1, .numasks =
LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_tlb_flushes), .umasks = amd64_fam1ah_zen5_tlb_flushes, }, { .name = "P0_FREQ_CYCLES_NOT_IN_HALT", .desc = "Number of core cycles not in halted state by P-level", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x120, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_p0_freq_cycles_not_in_halt), .umasks = amd64_fam1ah_zen5_p0_freq_cycles_not_in_halt, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_L2", .desc = "Number of 64-byte instruction cachelines that were fulfilled by the L2 cache", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x82, .flags = 0, .ngrp = 0, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM", .desc = "Number of 64-byte instruction cachelines fulfilled from system memory or another cache", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x83, .flags = 0, .ngrp = 0, }, { .name = "L1_ITLB_MISS_L2_ITLB_HIT", .desc = "Number of instruction fetches that miss in the L1 ITLB but hit in the L2 ITLB", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x84, .flags = 0, .ngrp = 0, }, { .name = "L1_ITLB_MISS_L2_ITLB_MISS", .desc = "The number of valid fills into the ITLB originating from the LS Page-Table Walker. Tablewalk requests are issued for L1-ITLB and L2-ITLB misses", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x85, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_l1_itlb_miss_l2_itlb_miss), .umasks = amd64_fam1ah_zen5_l1_itlb_miss_l2_itlb_miss, }, { .name = "BP_CORRECTION", .desc = "Number of times the Branch Predictor (BP) flushed its own pipeline due to internal conditions such as a second level prediction structure.
Does not count the number of bubbles caused by these internal flushes", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x8b, .flags = 0, .ngrp = 0, }, { .name = "DYNAMIC_INDIRECT_PREDICTIONS", .desc = "Number of times a branch used the indirect predictor to make a prediction", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x8e, .flags = 0, .ngrp = 0, }, { .name = "DECODER_OVERRIDE_BRANCH_PRED", .desc = "Number of decoder overrides of existing branch prediction", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x91, .flags = 0, .ngrp = 0, }, { .name = "L1_ITLB_FETCH_HIT", .desc = "Instruction fetches that hit in the L1 ITLB", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x94, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_itlb_fetch_hit), .umasks = amd64_fam1ah_zen5_itlb_fetch_hit, }, { .name = "BP_REDIRECTS", .desc = "Counts redirects of the branch predictor", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x9f, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_bp_redirects), .umasks = amd64_fam1ah_zen5_bp_redirects, }, { .name = "FETCH_IBS_EVENTS", .desc = "Counts significant Fetch Instruction Based Sampling (IBS) state transitions", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x188, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_fetch_ibs_events), .umasks = amd64_fam1ah_zen5_fetch_ibs_events, }, { .name = "IC_TAG_HIT_MISS", .desc = "Counts various IC tag related hit and miss events", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x18e, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_ic_tag_hit_miss), .umasks = amd64_fam1ah_zen5_ic_tag_hit_miss, }, { .name = "OP_CACHE_HIT_MISS", .desc = "Counts op cache micro-tag hit/miss events", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x28f, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_op_cache_hit_miss), .umasks = amd64_fam1ah_zen5_op_cache_hit_miss, }, { .name = "OPS_QUEUE_EMPTY", .desc = "Number of cycles where the op queue is empty", .modmsk = AMD64_FAM1AH_ATTRS, .code = 
0xa9, .flags = 0, .ngrp = 0, }, { .name = "OPS_SOURCE_DISPATCHED_FROM_DECODER", .desc = "Number of ops dispatched from the decoder classified by op source", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xaa, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_ops_source_dispatched_from_decoder), .umasks = amd64_fam1ah_zen5_ops_source_dispatched_from_decoder, }, { .name = "OPS_TYPE_DISPATCHED_FROM_DECODER", .desc = "Number of ops dispatched from the decoder classified by op type", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xab, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_ops_type_dispatched_from_decoder), .umasks = amd64_fam1ah_zen5_ops_type_dispatched_from_decoder, }, { .name = "DISPATCH_RESOURCE_STALL_CYCLES_1", .desc = "Number of cycles where a dispatch group is valid but does not get dispatched due to a Token Stall", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xae, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_dispatch_resource_stall_cycles_1), .umasks = amd64_fam1ah_zen5_dispatch_resource_stall_cycles_1, }, { .name = "DISPATCH_RESOURCE_STALL_CYCLES_2", .desc = "Number of cycles where a dispatch group is valid but does not get dispatched due to a Token Stall", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xaf, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_dispatch_resource_stall_cycles_2), .umasks = amd64_fam1ah_zen5_dispatch_resource_stall_cycles_2, }, { .name = "DISPATCH_STALLS_1", .desc = "For each cycle, counts the number of dispatch slots that remained unused for a given reason", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x1a0, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_dispatch_stalls_1), .umasks = amd64_fam1ah_zen5_dispatch_stalls_1, }, { .name = "DISPATCH_STALLS_2", .desc = "For each cycle, counts the number of dispatch slots that remained unused for a given reason", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x1a2, .flags = 0, .ngrp = 1, .numasks = 
LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_dispatch_stalls_2), .umasks = amd64_fam1ah_zen5_dispatch_stalls_2, }, { .name = "RETIRED_INSTRUCTIONS", .desc = "Number of instructions retired", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xc0, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_OPS", .desc = "Number of macro-ops retired", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xc1, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS", .desc = "Number of branch instructions retired. This includes all types of architectural control flow changes, including exceptions and interrupts", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xc2, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Number of retired branch instructions that were mispredicted", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xc3, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .desc = "Number of taken branches that were retired. This includes all types of architectural control flow changes, including exceptions and interrupts", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xc4, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Number of retired taken branch instructions that were mispredicted", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xc5, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_FAR_CONTROL_TRANSFERS", .desc = "Number of far control transfers retired including far call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and interrupts. Far control transfers are not subject to branch prediction", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xc6, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_NEAR_RETURNS", .desc = "Number of near return instructions (RET or RET Iw) retired", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xc8, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_NEAR_RETURNS_MISPREDICTED", .desc = "Number of near returns retired that were not correctly predicted by the return address predictor.
Each such mispredict incurs the same penalty as a mispredicted conditional branch instruction. Only EX mispredicts are counted", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xc9, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_INDIRECT_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Number of indirect branches retired that were not correctly predicted. Each such mispredict incurs the same penalty as a mispredicted conditional branch instruction. Only EX mispredicts are counted", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xca, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_MMX_FP_INSTRUCTIONS", .desc = "Number of MMX, SSE or x87 instructions retired. The UnitMask allows the selection of the individual classes of instructions as given in the table. Each increment represents one complete instruction. Since this event includes non-numeric instructions, it is not suitable for measuring MFLOPS", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xcb, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_retired_mmx_fp_instructions), .umasks = amd64_fam1ah_zen5_retired_mmx_fp_instructions, }, { .name = "RETIRED_INDIRECT_BRANCH_INSTRUCTIONS", .desc = "Number of indirect branches retired", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xcc, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_CONDITIONAL_BRANCH_INSTRUCTIONS", .desc = "Number of retired conditional branch instructions", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xd1, .flags = 0, .ngrp = 0, }, { .name = "DIV_CYCLES_BUSY_COUNT", .desc = "Number of cycles when the divider is busy", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xd3, .flags = 0, .ngrp = 0, }, { .name = "DIV_OP_COUNT", .desc = "Number of divide ops", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xd4, .flags = 0, .ngrp = 0, }, { .name = "CYCLES_NO_RETIRE", .desc = "Counts cycles when the hardware does not retire any ops for a given reason. Event can only track one reason at a time.
If multiple reasons apply for a given cycle, the lowest numbered reason is counted", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0xd6, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_cycles_no_retire), .umasks = amd64_fam1ah_zen5_cycles_no_retire, }, { .name = "RETIRED_UCODE_INSTRUCTIONS", .desc = "Number of microcode instructions retired", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x1c1, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_UCODE_OPS", .desc = "Number of microcode ops retired", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x1c2, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_BRANCH_MISPREDICTED_DIRECTION_MISMATCH", .desc = "Number of retired conditional branch instructions that were not correctly predicted because of branch direction mismatch", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x1c7, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_UNCONDITIONAL_INDIRECT_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Number of retired unconditional indirect branch instructions that were mispredicted", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x1c8, .flags = 0, .ngrp = 0, }, { .name = "RETIRED_UNCONDITIONAL_BRANCH_INSTRUCTIONS", .desc = "Number of retired unconditional branch instructions", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x1c9, .flags = 0, .ngrp = 0, }, { .name = "TAGGED_IBS_OPS", .desc = "Counts Op IBS related events", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x1cf, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_tagged_ibs_ops), .umasks = amd64_fam1ah_zen5_tagged_ibs_ops, }, { .name = "RETIRED_FUSED_INSTRUCTIONS", .desc = "Counts retired fused instructions", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x1d0, .flags = 0, .ngrp = 0, }, { .name = "REQUESTS_TO_L2_GROUP1", .desc = "All L2 cache requests", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x60, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_requests_to_l2_group1), .umasks = amd64_fam1ah_zen5_requests_to_l2_group1, }, { .name = "REQUESTS_TO_L2_GROUP2", .desc = "All L2 cache requests 
(rare cases)", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x61, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_requests_to_l2_group2), .umasks = amd64_fam1ah_zen5_requests_to_l2_group2, }, { .name = "LS_TO_L2_WBC_REQUESTS", .desc = "Number of write combining buffer (WCB) operations", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x63, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_ls_to_l2_wbc_requests), .umasks = amd64_fam1ah_zen5_ls_to_l2_wbc_requests, }, { .name = "CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS", .desc = "L2 cache request outcomes. This event does not count accesses to the L2 cache by the L2 prefetcher", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x64, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_core_to_l2_cacheable_request_access_status), .umasks = amd64_fam1ah_zen5_core_to_l2_cacheable_request_access_status, }, { .name = "L2_PREFETCH_HIT_L2", .desc = "Number of L2 prefetches that hit in the L2", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x70, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_l2_prefetch_hit_l2), .umasks = amd64_fam1ah_zen5_l2_prefetch_hit_l2, /* shared */ }, { .name = "L2_PREFETCH_HIT_L3", .desc = "Number of L2 prefetches accepted by the L2 pipeline which miss the L2 cache and hit the L3", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x71, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_l2_prefetch_hit_l2), .umasks = amd64_fam1ah_zen5_l2_prefetch_hit_l2, /* shared */ }, { .name = "L2_FILL_RESPONSE_SRC", .desc = "Number of fills into the L2 cache, by response source", .modmsk = AMD64_FAM1AH_ATTRS, .code = 0x165, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_l2_fill_resp_src), .umasks = amd64_fam1ah_zen5_l2_fill_resp_src, }, { .name = "L2_PREFETCH_MISS_L3", .desc = "Number of L2 prefetches accepted by the L2 pipeline which miss the L2 and the L3 caches", .modmsk = AMD64_FAM1AH_ATTRS, .code = 
0x72, .flags = 0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_l2_prefetch_hit_l2), .umasks = amd64_fam1ah_zen5_l2_prefetch_hit_l2, /* shared */ }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/amd64_events_fam1ah_zen5_l3.h000066400000000000000000000040101502707512200252550ustar00rootroot00000000000000#include "../pfmlib_amd64_priv.h" #include "../pfmlib_priv.h" /* * Copyright (C) 2024 Advanced Micro Devices, Inc. All rights reserved. * Contributed by Swarup Sahoo * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: amd64_fam1ah_zen5_l3 (AMD64 Fam1Ah Zen5 L3) */ static const amd64_umask_t amd64_fam1ah_zen5_l3_requests[]={ { .uname = "L3_MISS", .udesc = "L3 miss", .ucode = 0x1, }, { .uname = "L3_HIT", .udesc = "L3 hit", .ucode = 0xfe, }, { .uname = "ALL", .udesc = "All types of requests", .ucode = 0xff, .uflags = AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_entry_t amd64_fam1ah_zen5_l3_pe[]={ { .name = "UNC_L3_REQUESTS", .desc = "Number of requests to L3 cache", .code = 0x04, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_l3_requests), .umasks = amd64_fam1ah_zen5_l3_requests, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/amd64_events_k7.h000066400000000000000000000155571502707512200231240ustar00rootroot00000000000000/* * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Regenerated from previous version by: * * Copyright (c) 2006, 2007 Advanced Micro Devices, Inc. * Contributed by Ray Bryant * Contributed by Robert Richter * Modified for K7 by Vince Weaver * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. * * PMU: amd64_k7 (AMD64 K7) */ /* * Definitions taken from "AMD Athlon Processor x86 Code Optimization Guide" * Table 11 February 2002 */ static const amd64_umask_t amd64_k7_data_cache_refills[]={ { .uname = "L2_INVALID", .udesc = "Invalid line from L2", .ucode = 0x1, }, { .uname = "L2_SHARED", .udesc = "Shared-state line from L2", .ucode = 0x2, }, { .uname = "L2_EXCLUSIVE", .udesc = "Exclusive-state line from L2", .ucode = 0x4, }, { .uname = "L2_OWNED", .udesc = "Owned-state line from L2", .ucode = 0x8, }, { .uname = "L2_MODIFIED", .udesc = "Modified-state line from L2", .ucode = 0x10, }, { .uname = "ALL", .udesc = "Shared, Exclusive, Owned, Modified State Refills", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k7_data_cache_refills_from_system[]={ { .uname = "INVALID", .udesc = "Invalid", .ucode = 0x1, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x2, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x8, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x10, }, { .uname = "ALL", .udesc = "Invalid, Shared, Exclusive, Owned, Modified", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_entry_t amd64_k7_pe[]={ { .name = "DATA_CACHE_ACCESSES", .desc = "Data Cache Accesses", .modmsk = AMD64_BASIC_ATTRS, .code = 0x40, }, { .name = "DATA_CACHE_MISSES", .desc = "Data Cache Misses", .modmsk = AMD64_BASIC_ATTRS, .code = 0x41, }, { .name = "DATA_CACHE_REFILLS", .desc = "Data Cache Refills from L2", .modmsk = 
AMD64_BASIC_ATTRS, .code = 0x42, .numasks = LIBPFM_ARRAY_SIZE(amd64_k7_data_cache_refills), .ngrp = 1, .umasks = amd64_k7_data_cache_refills, }, { .name = "DATA_CACHE_REFILLS_FROM_SYSTEM", .desc = "Data Cache Refills from System", .modmsk = AMD64_BASIC_ATTRS, .code = 0x43, .numasks = LIBPFM_ARRAY_SIZE(amd64_k7_data_cache_refills_from_system), .ngrp = 1, .umasks = amd64_k7_data_cache_refills_from_system, }, { .name = "DATA_CACHE_LINES_EVICTED", .desc = "Data Cache Lines Evicted", .modmsk = AMD64_BASIC_ATTRS, .code = 0x44, .numasks = LIBPFM_ARRAY_SIZE(amd64_k7_data_cache_refills_from_system), .ngrp = 1, .umasks = amd64_k7_data_cache_refills_from_system, /* identical to actual umasks list for this event */ }, { .name = "L1_DTLB_MISS_AND_L2_DTLB_HIT", .desc = "L1 DTLB Miss and L2 DTLB Hit", .modmsk = AMD64_BASIC_ATTRS, .code = 0x45, }, { .name = "L1_DTLB_AND_L2_DTLB_MISS", .desc = "L1 DTLB and L2 DTLB Miss", .modmsk = AMD64_BASIC_ATTRS, .code = 0x46, }, { .name = "MISALIGNED_ACCESSES", .desc = "Misaligned Accesses", .modmsk = AMD64_BASIC_ATTRS, .code = 0x47, }, { .name = "CPU_CLK_UNHALTED", .desc = "CPU Clocks not Halted", .modmsk = AMD64_BASIC_ATTRS, .code = 0x76, }, { .name = "INSTRUCTION_CACHE_FETCHES", .desc = "Instruction Cache Fetches", .modmsk = AMD64_BASIC_ATTRS, .code = 0x80, }, { .name = "INSTRUCTION_CACHE_MISSES", .desc = "Instruction Cache Misses", .modmsk = AMD64_BASIC_ATTRS, .code = 0x81, }, { .name = "L1_ITLB_MISS_AND_L2_ITLB_HIT", .desc = "L1 ITLB Miss and L2 ITLB Hit", .modmsk = AMD64_BASIC_ATTRS, .code = 0x84, }, { .name = "L1_ITLB_MISS_AND_L2_ITLB_MISS", .desc = "L1 ITLB Miss and L2 ITLB Miss", .modmsk = AMD64_BASIC_ATTRS, .code = 0x85, }, { .name = "RETIRED_INSTRUCTIONS", .desc = "Retired Instructions (includes exceptions, interrupts, resyncs)", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc0, }, { .name = "RETIRED_UOPS", .desc = "Retired uops", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc1, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS", .desc = "Retired Branch 
Instructions", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc2, }, { .name = "RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS", .desc = "Retired Mispredicted Branch Instructions", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc3, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .desc = "Retired Taken Branch Instructions", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc4, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Retired Taken Branch Instructions Mispredicted", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc5, }, { .name = "RETIRED_FAR_CONTROL_TRANSFERS", .desc = "Retired Far Control Transfers", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc6, }, { .name = "RETIRED_BRANCH_RESYNCS", .desc = "Retired Branch Resyncs (only non-control transfer branches)", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc7, }, { .name = "INTERRUPTS_MASKED_CYCLES", .desc = "Interrupts-Masked Cycles", .modmsk = AMD64_BASIC_ATTRS, .code = 0xcd, }, { .name = "INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING", .desc = "Interrupts-Masked Cycles with Interrupt Pending", .modmsk = AMD64_BASIC_ATTRS, .code = 0xce, }, { .name = "INTERRUPTS_TAKEN", .desc = "Interrupts Taken", .modmsk = AMD64_BASIC_ATTRS, .code = 0xcf, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/amd64_events_k8.h000066400000000000000000001114751502707512200231210ustar00rootroot00000000000000/* * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Regenerated from previous version by: * * Copyright (c) 2006, 2007 Advanced Micro Devices, Inc. 
* Contributed by Ray Bryant * Contributed by Robert Richter * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. * * PMU: amd64_k8 (AMD64 K8) */ /* History * * Feb 10 2006 -- Ray Bryant, raybry@mpdtxmail.amd.com * * Brought event table up-to-date with the 3.85 (October 2005) version of the * "BIOS and Kernel Developer's Guide for the AMD Athlon[tm] 64 and * AMD Opteron[tm] Processors," AMD Publication # 26094. 
* * Dec 12 2007 -- Robert Richter, robert.richter@amd.com * * Updated to: BIOS and Kernel Developer's Guide for AMD NPT Family * 0Fh Processors, Publication # 32559, Revision: 3.08, Issue Date: * July 2007 * * Feb 26 2009 -- Robert Richter, robert.richter@amd.com * * Updates and fixes of some revision flags and descriptions according * to these documents: * BIOS and Kernel Developer's Guide, #26094, Revision: 3.30 * BIOS and Kernel Developer's Guide, #32559, Revision: 3.12 */ static const amd64_umask_t amd64_k8_dispatched_fpu[]={ { .uname = "OPS_ADD", .udesc = "Add pipe ops", .ucode = 0x1, }, { .uname = "OPS_MULTIPLY", .udesc = "Multiply pipe ops", .ucode = 0x2, }, { .uname = "OPS_STORE", .udesc = "Store pipe ops", .ucode = 0x4, }, { .uname = "OPS_ADD_PIPE_LOAD_OPS", .udesc = "Add pipe load ops", .ucode = 0x8, }, { .uname = "OPS_MULTIPLY_PIPE_LOAD_OPS", .udesc = "Multiply pipe load ops", .ucode = 0x10, }, { .uname = "OPS_STORE_PIPE_LOAD_OPS", .udesc = "Store pipe load ops", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_segment_register_loads[]={ { .uname = "ES", .udesc = "ES", .ucode = 0x1, }, { .uname = "CS", .udesc = "CS", .ucode = 0x2, }, { .uname = "SS", .udesc = "SS", .ucode = 0x4, }, { .uname = "DS", .udesc = "DS", .ucode = 0x8, }, { .uname = "FS", .udesc = "FS", .ucode = 0x10, }, { .uname = "GS", .udesc = "GS", .ucode = 0x20, }, { .uname = "HS", .udesc = "HS", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All segments", .ucode = 0x7f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_locked_ops[]={ { .uname = "EXECUTED", .udesc = "The number of locked instructions executed", .ucode = 0x1, }, { .uname = "CYCLES_SPECULATIVE_PHASE", .udesc = "The number of cycles spent in speculative phase", .ucode = 0x2, }, { .uname = "CYCLES_NON_SPECULATIVE_PHASE", .udesc = "The number of cycles spent in non-speculative 
phase (including cache miss penalty)", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_memory_requests[]={ { .uname = "NON_CACHEABLE", .udesc = "Requests to non-cacheable (UC) memory", .ucode = 0x1, }, { .uname = "WRITE_COMBINING", .udesc = "Requests to write-combining (WC) memory or WC buffer flushes to WB memory", .ucode = 0x2, }, { .uname = "STREAMING_STORE", .udesc = "Streaming store (SS) requests", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x83, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_data_cache_refills[]={ { .uname = "SYSTEM", .udesc = "Refill from System", .ucode = 0x1, }, { .uname = "L2_SHARED", .udesc = "Shared-state line from L2", .ucode = 0x2, }, { .uname = "L2_EXCLUSIVE", .udesc = "Exclusive-state line from L2", .ucode = 0x4, }, { .uname = "L2_OWNED", .udesc = "Owned-state line from L2", .ucode = 0x8, }, { .uname = "L2_MODIFIED", .udesc = "Modified-state line from L2", .ucode = 0x10, }, { .uname = "ALL", .udesc = "Shared, Exclusive, Owned, Modified State Refills", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_data_cache_refills_from_system[]={ { .uname = "INVALID", .udesc = "Invalid", .ucode = 0x1, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x2, }, { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x4, }, { .uname = "OWNED", .udesc = "Owned", .ucode = 0x8, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x10, }, { .uname = "ALL", .udesc = "Invalid, Shared, Exclusive, Owned, Modified", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_scrubber_single_bit_ecc_errors[]={ { .uname = "SCRUBBER_ERROR", .udesc = "Scrubber error", .ucode = 0x1, }, { .uname = "PIGGYBACK_ERROR", .udesc = "Piggyback scrubber errors", .ucode = 0x2, }, { .uname = 
"ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_prefetch_instructions_dispatched[]={ { .uname = "LOAD", .udesc = "Load (Prefetch, PrefetchT0/T1/T2)", .ucode = 0x1, }, { .uname = "STORE", .udesc = "Store (PrefetchW)", .ucode = 0x2, }, { .uname = "NTA", .udesc = "NTA (PrefetchNTA)", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_dcache_misses_by_locked_instructions[]={ { .uname = "DATA_CACHE_MISSES_BY_LOCKED_INSTRUCTIONS", .udesc = "Data cache misses by locked instructions", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x2, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_data_prefetches[]={ { .uname = "CANCELLED", .udesc = "Cancelled prefetches", .ucode = 0x1, }, { .uname = "ATTEMPTED", .udesc = "Prefetch attempts", .ucode = 0x2, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_system_read_responses[]={ { .uname = "EXCLUSIVE", .udesc = "Exclusive", .ucode = 0x1, }, { .uname = "MODIFIED", .udesc = "Modified", .ucode = 0x2, }, { .uname = "SHARED", .udesc = "Shared", .ucode = 0x4, }, { .uname = "ALL", .udesc = "Exclusive, Modified, Shared", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_quadwords_written_to_system[]={ { .uname = "QUADWORD_WRITE_TRANSFER", .udesc = "Quadword write transfer", .ucode = 0x1, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_requests_to_l2[]={ { .uname = "INSTRUCTIONS", .udesc = "IC fill", .ucode = 0x1, }, { .uname = "DATA", .udesc = "DC fill", .ucode = 0x2, }, { .uname = "TLB_WALK", .udesc = "TLB fill (page 
table walks)", .ucode = 0x4, }, { .uname = "SNOOP", .udesc = "Tag snoop request", .ucode = 0x8, }, { .uname = "CANCELLED", .udesc = "Cancelled request", .ucode = 0x10, }, { .uname = "ALL", .udesc = "All non-cancelled requests", .ucode = 0x1f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_l2_cache_miss[]={ { .uname = "INSTRUCTIONS", .udesc = "IC fill", .ucode = 0x1, }, { .uname = "DATA", .udesc = "DC fill (includes possible replays, whereas event 41h does not)", .ucode = 0x2, }, { .uname = "TLB_WALK", .udesc = "TLB page table walk", .ucode = 0x4, }, { .uname = "ALL", .udesc = "Instructions, Data, TLB walk", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_l2_fill_writeback[]={ { .uname = "L2_FILLS", .udesc = "L2 fills (victims from L1 caches, TLB page table walks and data prefetches)", .ucode = 0x1, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x1, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL | AMD64_FL_TILL_K8_REV_E, }, { .uname = "L2_WRITEBACKS", .udesc = "L2 Writebacks to system.", .ucode = 0x2, .uflags= AMD64_FL_K8_REV_F, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL | AMD64_FL_K8_REV_F, }, }; static const amd64_umask_t amd64_k8_retired_mmx_and_fp_instructions[]={ { .uname = "X87", .udesc = "X87 instructions", .ucode = 0x1, }, { .uname = "MMX_AND_3DNOW", .udesc = "MMX and 3DNow! 
instructions", .ucode = 0x2, }, { .uname = "PACKED_SSE_AND_SSE2", .udesc = "Packed SSE and SSE2 instructions", .ucode = 0x4, }, { .uname = "SCALAR_SSE_AND_SSE2", .udesc = "Scalar SSE and SSE2 instructions", .ucode = 0x8, }, { .uname = "ALL", .udesc = "X87, MMX(TM), 3DNow!(TM), Scalar and Packed SSE and SSE2 instructions", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_retired_fastpath_double_op_instructions[]={ { .uname = "POSITION_0", .udesc = "With low op in position 0", .ucode = 0x1, }, { .uname = "POSITION_1", .udesc = "With low op in position 1", .ucode = 0x2, }, { .uname = "POSITION_2", .udesc = "With low op in position 2", .ucode = 0x4, }, { .uname = "ALL", .udesc = "With low op in position 0, 1, or 2", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_fpu_exceptions[]={ { .uname = "X87_RECLASS_MICROFAULTS", .udesc = "X87 reclass microfaults", .ucode = 0x1, }, { .uname = "SSE_RETYPE_MICROFAULTS", .udesc = "SSE retype microfaults", .ucode = 0x2, }, { .uname = "SSE_RECLASS_MICROFAULTS", .udesc = "SSE reclass microfaults", .ucode = 0x4, }, { .uname = "SSE_AND_X87_MICROTRAPS", .udesc = "SSE and x87 microtraps", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_dram_accesses_page[]={ { .uname = "HIT", .udesc = "Page hit", .ucode = 0x1, }, { .uname = "MISS", .udesc = "Page Miss", .ucode = 0x2, }, { .uname = "CONFLICT", .udesc = "Page Conflict", .ucode = 0x4, }, { .uname = "ALL", .udesc = "Page Hit, Miss, or Conflict", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_memory_controller_turnarounds[]={ { .uname = "CHIP_SELECT", .udesc = "DIMM (chip select) turnaround", .ucode = 0x1, }, { .uname = "READ_TO_WRITE", .udesc = "Read to write turnaround", .ucode = 0x2, }, { .uname = "WRITE_TO_READ", .udesc = "Write 
to read turnaround", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All Memory Controller Turnarounds", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_memory_controller_bypass[]={ { .uname = "HIGH_PRIORITY", .udesc = "Memory controller high priority bypass", .ucode = 0x1, }, { .uname = "LOW_PRIORITY", .udesc = "Memory controller low priority bypass", .ucode = 0x2, }, { .uname = "DRAM_INTERFACE", .udesc = "DRAM controller interface bypass", .ucode = 0x4, }, { .uname = "DRAM_QUEUE", .udesc = "DRAM controller queue bypass", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_sized_blocks[]={ { .uname = "32_BYTE_WRITES", .udesc = "32-byte Sized Writes", .ucode = 0x4, }, { .uname = "64_BYTE_WRITES", .udesc = "64-byte Sized Writes", .ucode = 0x8, }, { .uname = "32_BYTE_READS", .udesc = "32-byte Sized Reads", .ucode = 0x10, }, { .uname = "64_BYTE_READS", .udesc = "64-byte Sized Reads", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3c, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_thermal_status_and_ecc_errors[]={ { .uname = "CLKS_CPU_ACTIVE", .udesc = "Number of clocks CPU is active when HTC is active", .ucode = 0x1, .uflags= AMD64_FL_K8_REV_F, }, { .uname = "CLKS_CPU_INACTIVE", .udesc = "Number of clocks CPU clock is inactive when HTC is active", .ucode = 0x2, .uflags= AMD64_FL_K8_REV_F, }, { .uname = "CLKS_DIE_TEMP_TOO_HIGH", .udesc = "Number of clocks when die temperature is higher than the software high temperature threshold", .ucode = 0x4, .uflags= AMD64_FL_K8_REV_F, }, { .uname = "CLKS_TEMP_THRESHOLD_EXCEEDED", .udesc = "Number of clocks when high temperature threshold was exceeded", .ucode = 0x8, .uflags= AMD64_FL_K8_REV_F, }, { .uname = "DRAM_ECC_ERRORS", .udesc = "Number of correctable and Uncorrectable DRAM ECC errors", .ucode = 
0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x80, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL | AMD64_FL_TILL_K8_REV_E, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x8f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL | AMD64_FL_K8_REV_F, }, }; static const amd64_umask_t amd64_k8_cpu_io_requests_to_memory_io[]={ { .uname = "I_O_TO_I_O", .udesc = "I/O to I/O", .ucode = 0x1, }, { .uname = "I_O_TO_MEM", .udesc = "I/O to Mem", .ucode = 0x2, }, { .uname = "CPU_TO_I_O", .udesc = "CPU to I/O", .ucode = 0x4, }, { .uname = "CPU_TO_MEM", .udesc = "CPU to Mem", .ucode = 0x8, }, { .uname = "TO_REMOTE_NODE", .udesc = "To remote node", .ucode = 0x10, }, { .uname = "TO_LOCAL_NODE", .udesc = "To local node", .ucode = 0x20, }, { .uname = "FROM_REMOTE_NODE", .udesc = "From remote node", .ucode = 0x40, }, { .uname = "FROM_LOCAL_NODE", .udesc = "From local node", .ucode = 0x80, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xff, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_cache_block[]={ { .uname = "VICTIM_WRITEBACK", .udesc = "Victim Block (Writeback)", .ucode = 0x1, }, { .uname = "DCACHE_LOAD_MISS", .udesc = "Read Block (Dcache load miss refill)", .ucode = 0x4, }, { .uname = "SHARED_ICACHE_REFILL", .udesc = "Read Block Shared (Icache refill)", .ucode = 0x8, }, { .uname = "READ_BLOCK_MODIFIED", .udesc = "Read Block Modified (Dcache store miss refill)", .ucode = 0x10, }, { .uname = "READ_TO_DIRTY", .udesc = "Change to Dirty (first store to clean block already in cache)", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3d, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_sized_commands[]={ { .uname = "NON_POSTED_WRITE_BYTE", .udesc = "NonPosted SzWr Byte (1-32 bytes) Legacy or mapped I/O, typically 1-4 bytes", .ucode = 0x1, }, { .uname = "NON_POSTED_WRITE_DWORD", .udesc = "NonPosted SzWr Dword (1-16 dwords) Legacy or 
mapped I/O, typically 1 dword", .ucode = 0x2, }, { .uname = "POSTED_WRITE_BYTE", .udesc = "Posted SzWr Byte (1-32 bytes) Sub-cache-line DMA writes, size varies; also flushes of partially-filled Write Combining buffer", .ucode = 0x4, }, { .uname = "POSTED_WRITE_DWORD", .udesc = "Posted SzWr Dword (1-16 dwords) Block-oriented DMA writes, often cache-line sized; also processor Write Combining buffer flushes", .ucode = 0x8, }, { .uname = "READ_BYTE_4_BYTES", .udesc = "SzRd Byte (4 bytes) Legacy or mapped I/O", .ucode = 0x10, }, { .uname = "READ_DWORD_1_16_DWORDS", .udesc = "SzRd Dword (1-16 dwords) Block-oriented DMA reads, typically cache-line size", .ucode = 0x20, }, { .uname = "READ_MODIFY_WRITE", .udesc = "RdModWr", .ucode = 0x40, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_probe[]={ { .uname = "MISS", .udesc = "Probe miss", .ucode = 0x1, }, { .uname = "HIT_CLEAN", .udesc = "Probe hit clean", .ucode = 0x2, }, { .uname = "HIT_DIRTY_NO_MEMORY_CANCEL", .udesc = "Probe hit dirty without memory cancel (probed by Sized Write or Change2Dirty)", .ucode = 0x4, }, { .uname = "HIT_DIRTY_WITH_MEMORY_CANCEL", .udesc = "Probe hit dirty with memory cancel (probed by DMA read or cache refill request)", .ucode = 0x8, }, { .uname = "UPSTREAM_DISPLAY_REFRESH_READS", .udesc = "Upstream display refresh reads", .ucode = 0x10, }, { .uname = "UPSTREAM_NON_DISPLAY_REFRESH_READS", .udesc = "Upstream non-display refresh reads", .ucode = 0x20, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x3f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL | AMD64_FL_TILL_K8_REV_C, }, { .uname = "UPSTREAM_WRITES", .udesc = "Upstream writes", .ucode = 0x40, .uflags= AMD64_FL_K8_REV_D, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7f, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL | AMD64_FL_K8_REV_D, }, }; static const amd64_umask_t amd64_k8_gart[]={ { .uname = 
"APERTURE_HIT_FROM_CPU", .udesc = "GART aperture hit on access from CPU", .ucode = 0x1, }, { .uname = "APERTURE_HIT_FROM_IO", .udesc = "GART aperture hit on access from I/O", .ucode = 0x2, }, { .uname = "MISS", .udesc = "GART miss", .ucode = 0x4, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0x7, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_umask_t amd64_k8_hypertransport_link0[]={ { .uname = "COMMAND_DWORD_SENT", .udesc = "Command dword sent", .ucode = 0x1, }, { .uname = "DATA_DWORD_SENT", .udesc = "Data dword sent", .ucode = 0x2, }, { .uname = "BUFFER_RELEASE_DWORD_SENT", .udesc = "Buffer release dword sent", .ucode = 0x4, }, { .uname = "NOP_DWORD_SENT", .udesc = "Nop dword sent (idle)", .ucode = 0x8, }, { .uname = "ALL", .udesc = "All sub-events selected", .ucode = 0xf, .uflags= AMD64_FL_NCOMBO | AMD64_FL_DFL, }, }; static const amd64_entry_t amd64_k8_pe[]={ { .name = "DISPATCHED_FPU", .desc = "Dispatched FPU Operations", .modmsk = AMD64_BASIC_ATTRS, .code = 0x0, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_dispatched_fpu), .ngrp = 1, .umasks = amd64_k8_dispatched_fpu, }, { .name = "CYCLES_NO_FPU_OPS_RETIRED", .desc = "Cycles with no FPU Ops Retired", .modmsk = AMD64_BASIC_ATTRS, .code = 0x1, }, { .name = "DISPATCHED_FPU_OPS_FAST_FLAG", .desc = "Dispatched Fast Flag FPU Operations", .modmsk = AMD64_BASIC_ATTRS, .code = 0x2, }, { .name = "SEGMENT_REGISTER_LOADS", .desc = "Segment Register Loads", .modmsk = AMD64_BASIC_ATTRS, .code = 0x20, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_segment_register_loads), .ngrp = 1, .umasks = amd64_k8_segment_register_loads, }, { .name = "PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE", .desc = "Pipeline restart due to self-modifying code", .modmsk = AMD64_BASIC_ATTRS, .code = 0x21, }, { .name = "PIPELINE_RESTART_DUE_TO_PROBE_HIT", .desc = "Pipeline restart due to probe hit", .modmsk = AMD64_BASIC_ATTRS, .code = 0x22, }, { .name = "LS_BUFFER_2_FULL_CYCLES", .desc = "LS Buffer 2 Full", .modmsk = 
AMD64_BASIC_ATTRS, .code = 0x23, }, { .name = "LOCKED_OPS", .desc = "Locked Operations", .modmsk = AMD64_BASIC_ATTRS, .code = 0x24, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_locked_ops), .ngrp = 1, .umasks = amd64_k8_locked_ops, }, { .name = "MEMORY_REQUESTS", .desc = "Memory Requests by Type", .modmsk = AMD64_BASIC_ATTRS, .code = 0x65, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_memory_requests), .ngrp = 1, .umasks = amd64_k8_memory_requests, }, { .name = "DATA_CACHE_ACCESSES", .desc = "Data Cache Accesses", .modmsk = AMD64_BASIC_ATTRS, .code = 0x40, }, { .name = "DATA_CACHE_MISSES", .desc = "Data Cache Misses", .modmsk = AMD64_BASIC_ATTRS, .code = 0x41, }, { .name = "DATA_CACHE_REFILLS", .desc = "Data Cache Refills from L2 or System", .modmsk = AMD64_BASIC_ATTRS, .code = 0x42, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_data_cache_refills), .ngrp = 1, .umasks = amd64_k8_data_cache_refills, }, { .name = "DATA_CACHE_REFILLS_FROM_SYSTEM", .desc = "Data Cache Refills from System", .modmsk = AMD64_BASIC_ATTRS, .code = 0x43, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_data_cache_refills_from_system), .ngrp = 1, .umasks = amd64_k8_data_cache_refills_from_system, }, { .name = "DATA_CACHE_LINES_EVICTED", .desc = "Data Cache Lines Evicted", .modmsk = AMD64_BASIC_ATTRS, .code = 0x44, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_data_cache_refills_from_system), .ngrp = 1, .umasks = amd64_k8_data_cache_refills_from_system, /* identical to actual umasks list for this event */ }, { .name = "L1_DTLB_MISS_AND_L2_DTLB_HIT", .desc = "L1 DTLB Miss and L2 DTLB Hit", .modmsk = AMD64_BASIC_ATTRS, .code = 0x45, }, { .name = "L1_DTLB_AND_L2_DTLB_MISS", .desc = "L1 DTLB and L2 DTLB Miss", .modmsk = AMD64_BASIC_ATTRS, .code = 0x46, }, { .name = "MISALIGNED_ACCESSES", .desc = "Misaligned Accesses", .modmsk = AMD64_BASIC_ATTRS, .code = 0x47, }, { .name = "MICROARCHITECTURAL_LATE_CANCEL_OF_AN_ACCESS", .desc = "Microarchitectural Late Cancel of an Access", .modmsk = AMD64_BASIC_ATTRS, .code = 0x48, }, { .name = 
"MICROARCHITECTURAL_EARLY_CANCEL_OF_AN_ACCESS", .desc = "Microarchitectural Early Cancel of an Access", .modmsk = AMD64_BASIC_ATTRS, .code = 0x49, }, { .name = "SCRUBBER_SINGLE_BIT_ECC_ERRORS", .desc = "Single-bit ECC Errors Recorded by Scrubber", .modmsk = AMD64_BASIC_ATTRS, .code = 0x4a, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_scrubber_single_bit_ecc_errors), .ngrp = 1, .umasks = amd64_k8_scrubber_single_bit_ecc_errors, }, { .name = "PREFETCH_INSTRUCTIONS_DISPATCHED", .desc = "Prefetch Instructions Dispatched", .modmsk = AMD64_BASIC_ATTRS, .code = 0x4b, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_prefetch_instructions_dispatched), .ngrp = 1, .umasks = amd64_k8_prefetch_instructions_dispatched, }, { .name = "DCACHE_MISSES_BY_LOCKED_INSTRUCTIONS", .desc = "DCACHE Misses by Locked Instructions", .modmsk = AMD64_BASIC_ATTRS, .code = 0x4c, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_dcache_misses_by_locked_instructions), .ngrp = 1, .umasks = amd64_k8_dcache_misses_by_locked_instructions, }, { .name = "DATA_PREFETCHES", .desc = "Data Prefetcher", .modmsk = AMD64_BASIC_ATTRS, .code = 0x67, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_data_prefetches), .ngrp = 1, .umasks = amd64_k8_data_prefetches, }, { .name = "SYSTEM_READ_RESPONSES", .desc = "System Read Responses by Coherency State", .modmsk = AMD64_BASIC_ATTRS, .code = 0x6c, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_system_read_responses), .ngrp = 1, .umasks = amd64_k8_system_read_responses, }, { .name = "QUADWORDS_WRITTEN_TO_SYSTEM", .desc = "Quadwords Written to System", .modmsk = AMD64_BASIC_ATTRS, .code = 0x6d, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_quadwords_written_to_system), .ngrp = 1, .umasks = amd64_k8_quadwords_written_to_system, }, { .name = "REQUESTS_TO_L2", .desc = "Requests to L2 Cache", .modmsk = AMD64_BASIC_ATTRS, .code = 0x7d, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_requests_to_l2), .ngrp = 1, .umasks = amd64_k8_requests_to_l2, }, { .name = "L2_CACHE_MISS", .desc = "L2 Cache Misses", .modmsk = AMD64_BASIC_ATTRS, .code = 0x7e, 
.numasks = LIBPFM_ARRAY_SIZE(amd64_k8_l2_cache_miss), .ngrp = 1, .umasks = amd64_k8_l2_cache_miss, }, { .name = "L2_FILL_WRITEBACK", .desc = "L2 Fill/Writeback", .modmsk = AMD64_BASIC_ATTRS, .code = 0x7f, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_l2_fill_writeback), .ngrp = 1, .umasks = amd64_k8_l2_fill_writeback, }, { .name = "INSTRUCTION_CACHE_FETCHES", .desc = "Instruction Cache Fetches", .modmsk = AMD64_BASIC_ATTRS, .code = 0x80, }, { .name = "INSTRUCTION_CACHE_MISSES", .desc = "Instruction Cache Misses", .modmsk = AMD64_BASIC_ATTRS, .code = 0x81, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_L2", .desc = "Instruction Cache Refills from L2", .modmsk = AMD64_BASIC_ATTRS, .code = 0x82, }, { .name = "INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM", .desc = "Instruction Cache Refills from System", .modmsk = AMD64_BASIC_ATTRS, .code = 0x83, }, { .name = "L1_ITLB_MISS_AND_L2_ITLB_HIT", .desc = "L1 ITLB Miss and L2 ITLB Hit", .modmsk = AMD64_BASIC_ATTRS, .code = 0x84, }, { .name = "L1_ITLB_MISS_AND_L2_ITLB_MISS", .desc = "L1 ITLB Miss and L2 ITLB Miss", .modmsk = AMD64_BASIC_ATTRS, .code = 0x85, }, { .name = "PIPELINE_RESTART_DUE_TO_INSTRUCTION_STREAM_PROBE", .desc = "Pipeline Restart Due to Instruction Stream Probe", .modmsk = AMD64_BASIC_ATTRS, .code = 0x86, }, { .name = "INSTRUCTION_FETCH_STALL", .desc = "Instruction Fetch Stall", .modmsk = AMD64_BASIC_ATTRS, .code = 0x87, }, { .name = "RETURN_STACK_HITS", .desc = "Return Stack Hits", .modmsk = AMD64_BASIC_ATTRS, .code = 0x88, }, { .name = "RETURN_STACK_OVERFLOWS", .desc = "Return Stack Overflows", .modmsk = AMD64_BASIC_ATTRS, .code = 0x89, }, { .name = "RETIRED_CLFLUSH_INSTRUCTIONS", .desc = "Retired CLFLUSH Instructions", .modmsk = AMD64_BASIC_ATTRS, .code = 0x26, }, { .name = "RETIRED_CPUID_INSTRUCTIONS", .desc = "Retired CPUID Instructions", .modmsk = AMD64_BASIC_ATTRS, .code = 0x27, }, { .name = "CPU_CLK_UNHALTED", .desc = "CPU Clocks not Halted", .modmsk = AMD64_BASIC_ATTRS, .code = 0x76, }, { .name = 
"RETIRED_INSTRUCTIONS", .desc = "Retired Instructions", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc0, }, { .name = "RETIRED_UOPS", .desc = "Retired uops", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc1, }, { .name = "RETIRED_BRANCH_INSTRUCTIONS", .desc = "Retired Branch Instructions", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc2, }, { .name = "RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS", .desc = "Retired Mispredicted Branch Instructions", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc3, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS", .desc = "Retired Taken Branch Instructions", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc4, }, { .name = "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED", .desc = "Retired Taken Branch Instructions Mispredicted", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc5, }, { .name = "RETIRED_FAR_CONTROL_TRANSFERS", .desc = "Retired Far Control Transfers", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc6, }, { .name = "RETIRED_BRANCH_RESYNCS", .desc = "Retired Branch Resyncs", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc7, }, { .name = "RETIRED_NEAR_RETURNS", .desc = "Retired Near Returns", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc8, }, { .name = "RETIRED_NEAR_RETURNS_MISPREDICTED", .desc = "Retired Near Returns Mispredicted", .modmsk = AMD64_BASIC_ATTRS, .code = 0xc9, }, { .name = "RETIRED_INDIRECT_BRANCHES_MISPREDICTED", .desc = "Retired Indirect Branches Mispredicted", .modmsk = AMD64_BASIC_ATTRS, .code = 0xca, }, { .name = "RETIRED_MMX_AND_FP_INSTRUCTIONS", .desc = "Retired MMX/FP Instructions", .modmsk = AMD64_BASIC_ATTRS, .code = 0xcb, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_retired_mmx_and_fp_instructions), .ngrp = 1, .umasks = amd64_k8_retired_mmx_and_fp_instructions, }, { .name = "RETIRED_FASTPATH_DOUBLE_OP_INSTRUCTIONS", .desc = "Retired Fastpath Double Op Instructions", .modmsk = AMD64_BASIC_ATTRS, .code = 0xcc, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_retired_fastpath_double_op_instructions), .ngrp = 1, .umasks = amd64_k8_retired_fastpath_double_op_instructions, }, { .name = 
"INTERRUPTS_MASKED_CYCLES", .desc = "Interrupts-Masked Cycles", .modmsk = AMD64_BASIC_ATTRS, .code = 0xcd, }, { .name = "INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING", .desc = "Interrupts-Masked Cycles with Interrupt Pending", .modmsk = AMD64_BASIC_ATTRS, .code = 0xce, }, { .name = "INTERRUPTS_TAKEN", .desc = "Interrupts Taken", .modmsk = AMD64_BASIC_ATTRS, .code = 0xcf, }, { .name = "DECODER_EMPTY", .desc = "Decoder Empty", .modmsk = AMD64_BASIC_ATTRS, .code = 0xd0, }, { .name = "DISPATCH_STALLS", .desc = "Dispatch Stalls", .modmsk = AMD64_BASIC_ATTRS, .code = 0xd1, }, { .name = "DISPATCH_STALL_FOR_BRANCH_ABORT", .desc = "Dispatch Stall for Branch Abort to Retire", .modmsk = AMD64_BASIC_ATTRS, .code = 0xd2, }, { .name = "DISPATCH_STALL_FOR_SERIALIZATION", .desc = "Dispatch Stall for Serialization", .modmsk = AMD64_BASIC_ATTRS, .code = 0xd3, }, { .name = "DISPATCH_STALL_FOR_SEGMENT_LOAD", .desc = "Dispatch Stall for Segment Load", .modmsk = AMD64_BASIC_ATTRS, .code = 0xd4, }, { .name = "DISPATCH_STALL_FOR_REORDER_BUFFER_FULL", .desc = "Dispatch Stall for Reorder Buffer Full", .modmsk = AMD64_BASIC_ATTRS, .code = 0xd5, }, { .name = "DISPATCH_STALL_FOR_RESERVATION_STATION_FULL", .desc = "Dispatch Stall for Reservation Station Full", .modmsk = AMD64_BASIC_ATTRS, .code = 0xd6, }, { .name = "DISPATCH_STALL_FOR_FPU_FULL", .desc = "Dispatch Stall for FPU Full", .modmsk = AMD64_BASIC_ATTRS, .code = 0xd7, }, { .name = "DISPATCH_STALL_FOR_LS_FULL", .desc = "Dispatch Stall for LS Full", .modmsk = AMD64_BASIC_ATTRS, .code = 0xd8, }, { .name = "DISPATCH_STALL_WAITING_FOR_ALL_QUIET", .desc = "Dispatch Stall Waiting for All Quiet", .modmsk = AMD64_BASIC_ATTRS, .code = 0xd9, }, { .name = "DISPATCH_STALL_FOR_FAR_TRANSFER_OR_RSYNC", .desc = "Dispatch Stall for Far Transfer or Resync to Retire", .modmsk = AMD64_BASIC_ATTRS, .code = 0xda, }, { .name = "FPU_EXCEPTIONS", .desc = "FPU Exceptions", .modmsk = AMD64_BASIC_ATTRS, .code = 0xdb, .numasks = 
LIBPFM_ARRAY_SIZE(amd64_k8_fpu_exceptions), .ngrp = 1, .umasks = amd64_k8_fpu_exceptions, }, { .name = "DR0_BREAKPOINT_MATCHES", .desc = "DR0 Breakpoint Matches", .modmsk = AMD64_BASIC_ATTRS, .code = 0xdc, }, { .name = "DR1_BREAKPOINT_MATCHES", .desc = "DR1 Breakpoint Matches", .modmsk = AMD64_BASIC_ATTRS, .code = 0xdd, }, { .name = "DR2_BREAKPOINT_MATCHES", .desc = "DR2 Breakpoint Matches", .modmsk = AMD64_BASIC_ATTRS, .code = 0xde, }, { .name = "DR3_BREAKPOINT_MATCHES", .desc = "DR3 Breakpoint Matches", .modmsk = AMD64_BASIC_ATTRS, .code = 0xdf, }, { .name = "DRAM_ACCESSES_PAGE", .desc = "DRAM Accesses", .modmsk = AMD64_BASIC_ATTRS, .code = 0xe0, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_dram_accesses_page), .ngrp = 1, .umasks = amd64_k8_dram_accesses_page, }, { .name = "MEMORY_CONTROLLER_PAGE_TABLE_OVERFLOWS", .desc = "Memory Controller Page Table Overflows", .modmsk = AMD64_BASIC_ATTRS, .code = 0xe1, }, { .name = "MEMORY_CONTROLLER_TURNAROUNDS", .desc = "Memory Controller Turnarounds", .modmsk = AMD64_BASIC_ATTRS, .code = 0xe3, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_memory_controller_turnarounds), .ngrp = 1, .umasks = amd64_k8_memory_controller_turnarounds, }, { .name = "MEMORY_CONTROLLER_BYPASS", .desc = "Memory Controller Bypass Counter Saturation", .modmsk = AMD64_BASIC_ATTRS, .code = 0xe4, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_memory_controller_bypass), .ngrp = 1, .umasks = amd64_k8_memory_controller_bypass, }, { .name = "SIZED_BLOCKS", .desc = "Sized Blocks", .modmsk = AMD64_BASIC_ATTRS, .code = 0xe5, .flags = AMD64_FL_K8_REV_D, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_sized_blocks), .ngrp = 1, .umasks = amd64_k8_sized_blocks, }, { .name = "THERMAL_STATUS_AND_ECC_ERRORS", .desc = "Thermal Status and ECC Errors", .modmsk = AMD64_BASIC_ATTRS, .code = 0xe8, .flags = AMD64_FL_K8_REV_E, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_thermal_status_and_ecc_errors), .ngrp = 1, .umasks = amd64_k8_thermal_status_and_ecc_errors, }, { .name = "CPU_IO_REQUESTS_TO_MEMORY_IO", .desc = 
"CPU/IO Requests to Memory/IO", .modmsk = AMD64_BASIC_ATTRS, .code = 0xe9, .flags = AMD64_FL_K8_REV_E, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_cpu_io_requests_to_memory_io), .ngrp = 1, .umasks = amd64_k8_cpu_io_requests_to_memory_io, }, { .name = "CACHE_BLOCK", .desc = "Cache Block Commands", .modmsk = AMD64_BASIC_ATTRS, .code = 0xea, .flags = AMD64_FL_K8_REV_E, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_cache_block), .ngrp = 1, .umasks = amd64_k8_cache_block, }, { .name = "SIZED_COMMANDS", .desc = "Sized Commands", .modmsk = AMD64_BASIC_ATTRS, .code = 0xeb, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_sized_commands), .ngrp = 1, .umasks = amd64_k8_sized_commands, }, { .name = "PROBE", .desc = "Probe Responses and Upstream Requests", .modmsk = AMD64_BASIC_ATTRS, .code = 0xec, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_probe), .ngrp = 1, .umasks = amd64_k8_probe, }, { .name = "GART", .desc = "GART Events", .modmsk = AMD64_BASIC_ATTRS, .code = 0xee, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_gart), .ngrp = 1, .umasks = amd64_k8_gart, }, { .name = "HYPERTRANSPORT_LINK0", .desc = "HyperTransport Link 0 Transmit Bandwidth", .modmsk = AMD64_BASIC_ATTRS, .code = 0xf6, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_hypertransport_link0), .ngrp = 1, .umasks = amd64_k8_hypertransport_link0, }, { .name = "HYPERTRANSPORT_LINK1", .desc = "HyperTransport Link 1 Transmit Bandwidth", .modmsk = AMD64_BASIC_ATTRS, .code = 0xf7, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_hypertransport_link0), .ngrp = 1, .umasks = amd64_k8_hypertransport_link0, /* identical to actual umasks list for this event */ }, { .name = "HYPERTRANSPORT_LINK2", .desc = "HyperTransport Link 2 Transmit Bandwidth", .modmsk = AMD64_BASIC_ATTRS, .code = 0xf8, .numasks = LIBPFM_ARRAY_SIZE(amd64_k8_hypertransport_link0), .ngrp = 1, .umasks = amd64_k8_hypertransport_link0, /* identical to actual umasks list for this event */ }, }; 
papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_1176_events.h

/*
 * Copyright (c) 2013 by Vince Weaver
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
 */

/*
 * the various event names are the same as those given in the
 * file linux-2.6/arch/arm/kernel/perf_event_v6.c
 */

/*
 * ARM1176 Event Table
 */
static const arm_entry_t arm_1176_pe[] = {
	{.name = "ICACHE_MISS", .code = 0x00, .desc = "Instruction cache miss (includes speculative accesses)"},
	{.name = "IBUF_STALL", .code = 0x01, .desc = "Stall because instruction buffer cannot deliver an instruction"},
	{.name = "DDEP_STALL", .code = 0x02, .desc = "Stall because of data dependency"},
	{.name = "ITLB_MISS", .code = 0x03, .desc = "Instruction MicroTLB miss"},
	{.name = "DTLB_MISS", .code = 0x04, .desc = "Data MicroTLB miss"},
	{.name = "BR_EXEC", .code = 0x05, .desc = "Branch instruction executed"},
	{.name = "BR_MISPREDICT", .code = 0x06, .desc = "Branch mispredicted"},
	{.name = "INSTR_EXEC", .code = 0x07, .desc = "Instruction executed"},
	{.name = "DCACHE_HIT", .code = 0x09, .desc = "Data cache hit"},
	{.name = "DCACHE_ACCESS", .code = 0x0a, .desc = "Data cache access"},
	{.name = "DCACHE_MISS", .code = 0x0b, .desc = "Data cache miss"},
	{.name = "DCACHE_WBACK", .code = 0x0c, .desc = "Data cache writeback"},
	{.name = "SW_PC_CHANGE", .code = 0x0d, .desc = "Software changed the PC."},
	{.name = "MAIN_TLB_MISS", .code = 0x0f, .desc = "Main TLB miss"},
	{.name = "EXPL_D_ACCESS", .code = 0x10, .desc = "Explicit external data cache access"},
	{.name = "LSU_FULL_STALL", .code = 0x11, .desc = "Stall because of a full Load Store Unit request queue."},
	{.name = "WBUF_DRAINED", .code = 0x12, .desc = "Write buffer drained due to data synchronization barrier or strongly ordered operation"},
	{.name = "ETMEXTOUT_0", .code = 0x20, .desc = "ETMEXTOUT[0] was asserted"},
	{.name = "ETMEXTOUT_1", .code = 0x21, .desc = "ETMEXTOUT[1] was asserted"},
	{.name = "ETMEXTOUT", .code = 0x22, .desc = "Increment once for each of ETMEXTOUT[0] or ETMEXTOUT[1]"},
	{.name = "PROC_CALL_EXEC", .code = 0x23, .desc = "Procedure call instruction executed"},
	{.name = "PROC_RET_EXEC", .code = 0x24, .desc = "Procedure return instruction executed"},
	{.name = "PROC_RET_EXEC_PRED", .code = 0x25, .desc = "Procedure return instruction executed and address predicted"},
	{.name = "PROC_RET_EXEC_PRED_INCORRECT", .code = 0x26, .desc = "Procedure return instruction executed and address predicted incorrectly"},
	{.name = "CPU_CYCLES", .code = 0xff, .desc = "CPU cycles"},
};
#define ARM_1176_EVENT_COUNT (sizeof(arm_1176_pe)/sizeof(arm_entry_t))

papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_cavium_tx2_events.h

/*
 * Copyright (c) 2018 Cavium, Inc
 * Contributed by Steve Walk
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * Cavium ThunderX2
 *
 * ARM Architecture Reference Manual, ARMv8, for ARMv8-A architecture profile,
 * ARM DDI 0487B.a (ID033117)
 *
 * Cavium ThunderX2 C99XX PMU Events (Abridged), July 31, 2018
 * https://cavium.com/resources.html
 */
static const arm_entry_t arm_thunderx2_pe[]={ {.name = "SW_INCR", .modmsk = ARMV8_ATTRS, .code = 0x00, .desc = "Instruction architecturally executed (condition check pass) software increment" }, {.name = "L1I_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x01, .desc = "Level 1 instruction cache refill" }, {.name = "L1I_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x02, .desc = "Level 1 instruction TLB refill" }, {.name = "L1D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x03, .desc = "Level 1 data cache refill" }, {.name = "L1D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x04, .desc = "Level 1 data cache access" }, {.name = "L1D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x05, .desc = "Level 1 data TLB refill" }, {.name = "LD_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x06, .desc = "Instruction architecturally executed (condition check pass) - Load" }, {.name = "ST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x07, .desc = "Instruction architecturally executed (condition check pass) - Store" }, {.name = "INST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x08, .desc = "Instruction architecturally executed" }, {.name = "EXC_TAKEN", .modmsk = ARMV8_ATTRS, .code = 0x09, .desc = "Exception taken" }, {.name = "EXC_RETURN", .modmsk = ARMV8_ATTRS, .code = 0x0A, .desc = "Instruction architecturally executed (condition check pass) - Exception return" }, {.name = "CID_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0B, .desc = "Instruction architecturally executed (condition check pass) - Write to
CONTEXTIDR" }, {.name = "BR_IMMED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0D, .desc = "Instruction architecturally executed, immediate branch" }, {.name = "BR_RETURN_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0E, .desc = "Instruction architecturally executed (condition check pass) - procedure return" }, {.name = "UNALIGNED_LDST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0F, .desc = "Instruction architecturally executed (condition check pass), unaligned load/store" }, {.name = "BR_MIS_PRED", .modmsk = ARMV8_ATTRS, .code = 0x10, .desc = "Mispredicted or not predicted branch speculatively executed" }, {.name = "CPU_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x11, .desc = "Cycles" }, {.name = "BR_PRED", .modmsk = ARMV8_ATTRS, .code = 0x12, .desc = "Predictable branch speculatively executed" }, {.name = "MEM_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x13, .desc = "Data memory access" }, {.name = "L1I_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x14, .desc = "Level 1 instruction cache access" }, {.name = "L1D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x15, .desc = "Level 1 data cache write-back" }, {.name = "L2D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x16, .desc = "Level 2 data cache access" }, {.name = "L2D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x17, .desc = "Level 2 data cache refill" }, {.name = "L2D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x18, .desc = "Level 2 data cache write-back" }, {.name = "BUS_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x19, .desc = "Bus access" }, {.name = "INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x1B, .desc = "Instruction speculatively executed" }, {.name = "TTBR_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x1C, .desc = "Instruction architecturally executed (condition check pass) Write to translation table base" }, {.name = "CHAIN", .modmsk = ARMV8_ATTRS, .code = 0x1E, .desc = "For odd-numbered counters, increments the count by one for each overflow of the proceeding even counter" }, {.name = "L1D_CACHE_ALLOCATE", .modmsk = 
ARMV8_ATTRS, .code = 0x1F, .desc = "Level 1 data cache allocation without refill" }, {.name = "L2D_CACHE_ALLOCATE", .modmsk = ARMV8_ATTRS, .code = 0x20, .desc = "Level 2 data/unified cache allocation without refill" }, {.name = "BR_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x21, .desc = "Counts all branches on the architecturally executed path that would incur cost if mispredicted" }, {.name = "BR_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x22, .desc = "Instructions executed, mis-predicted branch. All instructions counted by BR_RETIRED that were not correctly predicted" }, {.name = "STALL_FRONTEND", .modmsk = ARMV8_ATTRS, .code = 0x23, .desc = "Cycle on which no operation issued because there were no operations to issue" }, {.name = "STALL_BACKEND", .modmsk = ARMV8_ATTRS, .code = 0x24, .desc = "Cycle on which no operation issued due to back-end resources being unavailable" }, {.name = "L1D_TLB", .modmsk = ARMV8_ATTRS, .code = 0x25, .desc = "Level 1 data TLB access" }, {.name = "L1I_TLB", .modmsk = ARMV8_ATTRS, .code = 0x26, .desc = "Instruction TLB access" }, {.name = "L2D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x2D, .desc = "Attributable memory-read or attributable memory-write operation that causes a TLB refill" }, {.name = "L2I_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x2E, .desc = "Attributable instruction memory access that causes a TLB refill" }, {.name = "L2D_TLB", .modmsk = ARMV8_ATTRS, .code = 0x2F, .desc = "Attributable memory read operation or attributable memory write operation that causes a TLB access" }, {.name = "L2I_TLB", .modmsk = ARMV8_ATTRS, .code = 0x30, .desc = "Attributable memory read operation or attributable memory write operation that causes a TLB access" }, {.name = "L1D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x40, .desc = "Level 1 data cache access, read" }, {.name = "L1D_CACHE_WR", .modmsk = ARMV8_ATTRS, .code = 0x41, .desc = "Level 1 data cache access, write" }, {.name = "L1D_CACHE_REFILL_RD", .modmsk = ARMV8_ATTRS, 
.code = 0x42, .desc = "Level 1 data cache refill, read" }, {.name = "L1D_CACHE_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x43, .desc = "Level 1 data cache refill, write" }, {.name = "L1D_CACHE_REFILL_INNER", .modmsk = ARMV8_ATTRS, .code = 0x44, .desc = "Level 1 data cache refill, inner" }, {.name = "L1D_CACHE_REFILL_OUTER", .modmsk = ARMV8_ATTRS, .code = 0x45, .desc = "Level 1 data cache refill, outer" }, {.name = "L1D_CACHE_WB_VICTIM", .modmsk = ARMV8_ATTRS, .code = 0x46, .desc = "Level 1 data cache write-back, victim" }, {.name = "L1D_CACHE_WB_CLEAN", .modmsk = ARMV8_ATTRS, .code = 0x47, .desc = "Level 1 data cache write-back, cleaning and coherency" }, {.name = "L1D_CACHE_INVAL", .modmsk = ARMV8_ATTRS, .code = 0x48, .desc = "Level 1 data cache invalidate" }, {.name = "L1D_TLB_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x4C, .desc = "Level 1 data TLB read refill" }, {.name = "L1D_TLB_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x4D, .desc = "Level 1 data TLB write refill" }, {.name = "L1D_TLB_RD", .modmsk = ARMV8_ATTRS, .code = 0x4E, .desc = "Level 1 data TLB access, read" }, {.name = "L1D_TLB_WR", .modmsk = ARMV8_ATTRS, .code = 0x4F, .desc = "Level 1 data TLB access, write" }, {.name = "L2D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x50, .desc = "Level 2 data cache access, read" }, {.name = "L2D_CACHE_WR", .modmsk = ARMV8_ATTRS, .code = 0x51, .desc = "Level 2 data cache access, write" }, {.name = "L2D_CACHE_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x52, .desc = "Level 2 data cache refill, read" }, {.name = "L2D_CACHE_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x53, .desc = "Level 2 data cache refill, write" }, {.name = "L2D_CACHE_WB_VICTIM", .modmsk = ARMV8_ATTRS, .code = 0x56, .desc = "Level 2 data cache write-back, victim" }, {.name = "L2D_CACHE_WB_CLEAN", .modmsk = ARMV8_ATTRS, .code = 0x57, .desc = "Level 2 data cache write-back, cleaning and coherency" }, {.name = "L2D_CACHE_INVAL", .modmsk = ARMV8_ATTRS, .code = 0x58, .desc = "Level 2 data cache 
invalidate" }, {.name = "L2D_TLB_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x5C, .desc = "Level 2 data/unified TLB refill, read" }, {.name = "L2D_TLB_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x5D, .desc = "Level 2 data/unified TLB refill, write" }, {.name = "L2D_TLB_RD", .modmsk = ARMV8_ATTRS, .code = 0x5E, .desc = "Level 2 data/unified TLB access, read" }, {.name = "L2D_TLB_WR", .modmsk = ARMV8_ATTRS, .code = 0x5F, .desc = "Level 2 data/unified TLB access, write" }, {.name = "BUS_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x60, .desc = "Bus access, read" }, {.name = "BUS_ACCESS_WR", .modmsk = ARMV8_ATTRS, .code = 0x61, .desc = "Bus access, write" }, {.name = "BUS_ACCESS_SHARED", .modmsk = ARMV8_ATTRS, .code = 0x62, .desc = "Bus access, normal, cacheable, shareable" }, {.name = "BUS_ACCESS_NOT_SHARED", .modmsk = ARMV8_ATTRS, .code = 0x63, .desc = "Bus not normal access" }, {.name = "BUS_ACCESS_NORMAL", .modmsk = ARMV8_ATTRS, .code = 0x64, .desc = "Bus access, normal" }, {.name = "BUS_ACCESS_PERIPH", .modmsk = ARMV8_ATTRS, .code = 0x65, .desc = "Bus access, peripheral" }, {.name = "MEM_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x66, .desc = "Data memory access, read" }, {.name = "MEM_ACCESS_WR", .modmsk = ARMV8_ATTRS, .code = 0x67, .desc = "Data memory access, write" }, {.name = "UNALIGNED_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x68, .desc = "Unaligned access, read" }, {.name = "UNALIGNED_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x69, .desc = "Unaligned access, write" }, {.name = "UNALIGNED_LDST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6A, .desc = "Unaligned access" }, {.name = "LDREX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6C, .desc = "Exclusive operation speculatively executed - LDREX or LDX" }, {.name = "STREX_PASS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6D, .desc = "Exclusive operation speculative executed - STREX or STX pass" }, {.name = "STREX_FAIL_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6E, .desc = "Exclusive operation speculative executed - STREX or STX 
fail" }, {.name = "STREX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6F, .desc = "Exclusive operation speculatively executed - STREX or STX" }, {.name = "LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x70, .desc = "Operation speculatively executed, load" }, {.name = "ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x71, .desc = "Operation speculatively executed, store" }, {.name = "LDST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x72, .desc = "Operation speculatively executed, load or store" }, {.name = "DP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x73, .desc = "Operation speculatively executed, data-processing" }, {.name = "ASE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x74, .desc = "Operation speculatively executed, Advanced SIMD instruction" }, {.name = "VFP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x75, .desc = "Operation speculatively executed, floating point instruction" }, {.name = "CRYPTO_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x77, .desc = "Operation speculatively executed, Cryptographic instruction" }, {.name = "BR_IMMED_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x78, .desc = "Branch speculatively executed, immediate branch" }, {.name = "BR_RETURN_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x79, .desc = "Branch speculatively executed, return" }, {.name = "BR_INDIRECT_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7A, .desc = "Branch speculatively executed, indirect branch" }, {.name = "ISB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7C, .desc = "Barrier speculatively executed, ISB" }, {.name = "DSB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7D, .desc = "barrier speculatively executed, DSB" }, {.name = "DMB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7E, .desc = "Barrier speculatively executed, DMB" }, {.name = "EXC_UNDEF", .modmsk = ARMV8_ATTRS, .code = 0x81, .desc = "Exception taken, other synchronous" }, {.name = "EXC_SVC", .modmsk = ARMV8_ATTRS, .code = 0x82, .desc = "Exception taken, supervisor call" }, {.name = "EXC_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x83, .desc = "Exception taken, instruction abort" }, 
{.name = "EXC_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x84, .desc = "Exception taken, data abort or SError" }, {.name = "EXC_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x86, .desc = "Exception taken, irq" }, {.name = "EXC_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x87, .desc = "Exception taken, fiq" }, {.name = "EXC_SMC", .modmsk = ARMV8_ATTRS, .code = 0x88, .desc = "Exception taken, smc" }, {.name = "EXC_HVC", .modmsk = ARMV8_ATTRS, .code = 0x8A, .desc = "Exception taken, hypervisor call" }, {.name = "EXC_TRAP_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x8B, .desc = "Exception taken, instruction abort not taken locally" }, {.name = "EXC_TRAP_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x8C, .desc = "Exception taken, data abort or SError not taken locally" }, {.name = "EXC_TRAP_OTHER", .modmsk = ARMV8_ATTRS, .code = 0x8D, .desc = "Exception taken, other traps not taken locally" }, {.name = "EXC_TRAP_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x8E, .desc = "Exception taken, irq not taken locally" }, {.name = "EXC_TRAP_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x8F, .desc = "Exception taken, fiq not taken locally" }, {.name = "RC_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x90, .desc = "Release consistency instruction speculatively executed (load-acquire)" }, {.name = "RC_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x91, .desc = "Release consistency instruction speculatively executed (store-release)" }, {.name = "L1D_LHS_VANOTP", .modmsk = ARMV8_ATTRS, .code = 0xC1, .desc = "A Load hit store retry" }, {.name = "L1D_LHS_OVRLAP", .modmsk = ARMV8_ATTRS, .code = 0xC2, .desc = "A Load hit store retry, VA match, PA mismatch" }, {.name = "L1D_LHS_VANOSD", .modmsk = ARMV8_ATTRS, .code = 0xC3, .desc = "A Load hit store retry, VA match, store data not issued" }, {.name = "L1D_LHS_FWD", .modmsk = ARMV8_ATTRS, .code = 0xC4, .desc = "A Load hit store forwarding. 
Load completes" }, {.name = "L1D_BNKCFL", .modmsk = ARMV8_ATTRS, .code = 0xC6, .desc = "Bank conflict load retry" }, {.name = "L1D_LSMQ_FULL", .modmsk = ARMV8_ATTRS, .code = 0xC7, .desc = "LSMQ retry" }, {.name = "L1D_LSMQ_HIT", .modmsk = ARMV8_ATTRS, .code = 0xC8, .desc = "LSMQ hit retry" }, {.name = "L1D_EXPB_MISS", .modmsk = ARMV8_ATTRS, .code = 0xC9, .desc = "An external probe missed the L1" }, {.name = "L1D_L2EV_MISS", .modmsk = ARMV8_ATTRS, .code = 0xCA, .desc = "An L2 evict operation missed the L1" }, {.name = "L1D_EXPB_HITM", .modmsk = ARMV8_ATTRS, .code = 0xCB, .desc = "An external probe hit a modified line in the L1" }, {.name = "L1D_L2EV_HITM", .modmsk = ARMV8_ATTRS, .code = 0xCC, .desc = "An L2 evict operation hit a modified line in the L1" }, {.name = "L1D_EXPB_HIT", .modmsk = ARMV8_ATTRS, .code = 0xCD, .desc = "An external probe hit in the L1" }, {.name = "L1D_L2EV_HIT", .modmsk = ARMV8_ATTRS, .code = 0xCE, .desc = "An L2 evict operation hit in the L1" }, {.name = "L1D_EXPB_RETRY", .modmsk = ARMV8_ATTRS, .code = 0xCF, .desc = "An external probe hit was retried" }, {.name = "L1D_L2EV_RETRY", .modmsk = ARMV8_ATTRS, .code = 0xD0, .desc = "An L2 evict operation was retried" }, {.name = "L1D_ST_RMW", .modmsk = ARMV8_ATTRS, .code = 0xD1, .desc = "A read modify write store was drained and updated the L1" }, {.name = "L1D_LSMQ00_LDREQ", .modmsk = ARMV8_ATTRS, .code = 0xD2, .desc = "A load has allocated LSMQ entry 0" }, {.name = "L1D_LSMQ00_LDVLD", .modmsk = ARMV8_ATTRS, .code = 0xD3, .desc = "LSMQ entry 0 was initiated by a load" }, {.name = "L1D_LSMQ15_STREQ", .modmsk = ARMV8_ATTRS, .code = 0xD4, .desc = "A store was allocated LSMQ entry 15" }, {.name = "L1D_LSMQ15_STVLD", .modmsk = ARMV8_ATTRS, .code = 0xD5, .desc = "LSMQ entry 15 was initiated by a store" }, {.name = "L1D_PB_FLUSH", .modmsk = ARMV8_ATTRS, .code = 0xD6, .desc = "LRQ ordering flush" }, {.name = "BR_COND_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0xE0, .desc = "Conditional branch 
instruction executed, but mis-predicted" }, {.name = "BR_IND_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0xE1, .desc = "Indirect branch instruction executed, but mis-predicted" }, {.name = "BR_RETURN_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0xE2, .desc = "Return branch instruction executed, but mis-predicted" }, {.name = "OP_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0xE8, .desc = "Uops executed" }, {.name = "LD_OP_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0xE9, .desc = "Load uops executed" }, {.name = "ST_OP_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0xEA, .desc = "Store uops executed" }, {.name = "FUSED_OP_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0xEB, .desc = "Fused uops executed" }, {.name = "IRQ_MASK", .modmsk = ARMV8_ATTRS, .code = 0xF8, .desc = "Cumulative duration of a PSTATE.I interrupt mask set to 1" }, {.name = "FIQ_MASK", .modmsk = ARMV8_ATTRS, .code = 0xF9, .desc = "Cumulative duration of a PSTATE.F interrupt mask set to 1" }, {.name = "SERROR_MASK", .modmsk = ARMV8_ATTRS, .code = 0xFA, .desc = "Cumulative duration of PSTATE.A interrupt mask set to 1" }, {.name = "WFIWFE_SLEEP", .modmsk = ARMV8_ATTRS, .code = 0x108, .desc = "Number of cycles in which CPU is in low power mode due to WFI/WFE instruction" }, {.name = "L2TLB_4K_PAGE_MISS", .modmsk = ARMV8_ATTRS, .code = 0x127, .desc = "L2 TLB lookup miss using 4K page size" }, {.name = "L2TLB_64K_PAGE_MISS", .modmsk = ARMV8_ATTRS, .code = 0x128, .desc = "L2 TLB lookup miss using 64K page size" }, {.name = "L2TLB_2M_PAGE_MISS", .modmsk = ARMV8_ATTRS, .code = 0x129, .desc = "L2 TLB lookup miss using 2M page size" }, {.name = "L2TLB_512M_PAGE_MISS", .modmsk = ARMV8_ATTRS, .code = 0x12A, .desc = "L2 TLB lookup miss using 512M page size" }, {.name = "ISB_EMPTY", .modmsk = ARMV8_ATTRS, .code = 0x150, .desc = "Number of cycles during which micro-op skid-buffer is empty" }, {.name = "ISB_FULL", .modmsk = ARMV8_ATTRS, .code = 0x151, .desc = "Number of cycles during which micro-op skid-buffer is 
back-pressuring decode" }, {.name = "STALL_NOTSELECTED", .modmsk = ARMV8_ATTRS, .code = 0x152, .desc = "Number of cycles during which thread was available for dispatch but not selected" }, {.name = "ROB_RECYCLE", .modmsk = ARMV8_ATTRS, .code = 0x153, .desc = "Number of cycles in which one or more valid micro-ops did not dispatch due to ROB full" }, {.name = "ISSQ_RECYCLE", .modmsk = ARMV8_ATTRS, .code = 0x154, .desc = "Number of cycles in which one or more valid micro-ops did not dispatch due to ISSQ full" }, {.name = "GPR_RECYCLE", .modmsk = ARMV8_ATTRS, .code = 0x155, .desc = "Number of cycles in which one or more valid micro-ops did not dispatch due to GPR full" }, {.name = "FPR_RECYCLE", .modmsk = ARMV8_ATTRS, .code = 0x156, .desc = "Number of cycles in which one or more valid micro-ops did not dispatch due to FPR full" }, {.name = "LRQ_RECYCLE", .modmsk = ARMV8_ATTRS, .code = 0x158, .desc = "Number of cycles in which one or more valid micro-ops did not dispatch due to LRQ full" }, {.name = "SRQ_RECYCLE", .modmsk = ARMV8_ATTRS, .code = 0x159, .desc = "Number of cycles in which one or more valid micro-ops did not dispatch due to SRQ full" }, {.name = "BSR_RECYCLE", .modmsk = ARMV8_ATTRS, .code = 0x15B, .desc = "Number of cycles in which one or more valid micro-ops did not dispatch due to BSR full" }, {.name = "UOPSFUSED", .modmsk = ARMV8_ATTRS, .code = 0x164, .desc = "Number of fused micro-ops dispatched" }, {.name = "L2D_TLBI_INT", .modmsk = ARMV8_ATTRS, .code = 0x20B, .desc = "Internal mmu tlbi cacheops" }, {.name = "L2D_TLBI_EXT", .modmsk = ARMV8_ATTRS, .code = 0x20C, .desc = "External mmu tlbi cacheops" }, {.name = "L2D_HWPF_DMD_HIT", .modmsk = ARMV8_ATTRS, .code = 0x218, .desc = "Scu ld/st requests that hit cache or msg for lines brought in by the hardware prefetcher" }, {.name = "L2D_HWPF_REQ_VAL", .modmsk = ARMV8_ATTRS, .code = 0x219, .desc = "Scu hwpf requests into the pipeline" }, {.name = "L2D_HWPF_REQ_LD", .modmsk = ARMV8_ATTRS, .code = 0x21A, .desc = 
"Scu hwpf ld requests into the pipeline" }, {.name = "L2D_HWPF_REQ_MISS", .modmsk = ARMV8_ATTRS, .code = 0x21B, .desc = "Scu hwpf ld requests that miss" }, {.name = "L2D_HWPF_NEXT_LINE", .modmsk = ARMV8_ATTRS, .code = 0x21C, .desc = "Scu hwpf next line requests generated" }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_cortex_a15_events.h000066400000000000000000000232511502707512200244070ustar00rootroot00000000000000/* * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * Contributed by Will Deacon * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* * Cortex A15 r2p0 * based on Table 11-6 from the "Cortex A15 Technical Reference Manual" */ static const arm_entry_t arm_cortex_a15_pe[]={ {.name = "SW_INCR", .modmsk = ARMV7_A15_ATTRS, .code = 0x00, .desc = "Instruction architecturally executed (condition check pass) Software increment" }, {.name = "L1I_CACHE_REFILL", .modmsk = ARMV7_A15_ATTRS, .code = 0x01, .desc = "Level 1 instruction cache refill" }, {.name = "L1I_TLB_REFILL", .modmsk = ARMV7_A15_ATTRS, .code = 0x02, .desc = "Level 1 instruction TLB refill" }, {.name = "L1D_CACHE_REFILL", .modmsk = ARMV7_A15_ATTRS, .code = 0x03, .desc = "Level 1 data cache refill" }, {.name = "L1D_CACHE_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x04, .desc = "Level 1 data cache access" }, {.name = "L1D_TLB_REFILL", .modmsk = ARMV7_A15_ATTRS, .code = 0x05, .desc = "Level 1 data TLB refill" }, {.name = "INST_RETIRED", .modmsk = ARMV7_A15_ATTRS, .code = 0x08, .desc = "Instruction architecturally executed" }, {.name = "EXCEPTION_TAKEN", .modmsk = ARMV7_A15_ATTRS, .code = 0x09, .desc = "Exception taken" }, {.name = "EXCEPTION_RETURN", .modmsk = ARMV7_A15_ATTRS, .code = 0x0a, .desc = "Instruction architecturally executed (condition check pass) Exception return" }, {.name = "CID_WRITE_RETIRED", .modmsk = ARMV7_A15_ATTRS, .code = 0x0b, .desc = "Instruction architecturally executed (condition check pass) Write to CONTEXTIDR" }, {.name = "BRANCH_MISPRED", .modmsk = ARMV7_A15_ATTRS, .code = 0x10, .desc = "Mispredicted or not predicted branch speculatively executed" }, {.name = "CPU_CYCLES", .modmsk = ARMV7_A15_ATTRS, .code = 0x11, .desc = "Cycles" }, {.name = "BRANCH_PRED", .modmsk = ARMV7_A15_ATTRS, .code = 0x12, .desc = "Predictable branch speculatively executed" }, {.name = "DATA_MEM_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x13, .desc = "Data memory access" }, {.name = "L1I_CACHE_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x14, .desc = "Level 1 instruction cache access" }, {.name = "L1D_CACHE_WB", .modmsk = ARMV7_A15_ATTRS, 
.code = 0x15, .desc = "Level 1 data cache WriteBack" }, {.name = "L2D_CACHE_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x16, .desc = "Level 2 data cache access" }, {.name = "L2D_CACHE_REFILL", .modmsk = ARMV7_A15_ATTRS, .code = 0x17, .desc = "Level 2 data cache refill" }, {.name = "L2D_CACHE_WB", .modmsk = ARMV7_A15_ATTRS, .code = 0x18, .desc = "Level 2 data cache WriteBack" }, {.name = "BUS_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x19, .desc = "Bus access" }, {.name = "LOCAL_MEMORY_ERROR", .modmsk = ARMV7_A15_ATTRS, .code = 0x1a, .desc = "Local memory error" }, {.name = "INST_SPEC_EXEC", .modmsk = ARMV7_A15_ATTRS, .code = 0x1b, .desc = "Instruction speculatively executed" }, {.name = "TTBR_WRITE_RETIRED", .modmsk = ARMV7_A15_ATTRS, .code = 0x1c, .desc = "Instruction architecturally executed (condition check pass) Write to translation table base" }, {.name = "BUS_CYCLES", .modmsk = ARMV7_A15_ATTRS, .code = 0x1d, .desc = "Bus cycle" }, {.name = "L1D_READ_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x40, .desc = "Level 1 data cache read access" }, {.name = "L1D_WRITE_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x41, .desc = "Level 1 data cache write access" }, {.name = "L1D_READ_REFILL", .modmsk = ARMV7_A15_ATTRS, .code = 0x42, .desc = "Level 1 data cache read refill" }, {.name = "L1D_WRITE_REFILL", .modmsk = ARMV7_A15_ATTRS, .code = 0x43, .desc = "Level 1 data cache write refill" }, {.name = "L1D_WB_VICTIM", .modmsk = ARMV7_A15_ATTRS, .code = 0x46, .desc = "Level 1 data cache writeback victim" }, {.name = "L1D_WB_CLEAN_COHERENCY", .modmsk = ARMV7_A15_ATTRS, .code = 0x47, .desc = "Level 1 data cache writeback cleaning and coherency" }, {.name = "L1D_INVALIDATE", .modmsk = ARMV7_A15_ATTRS, .code = 0x48, .desc = "Level 1 data cache invalidate" }, {.name = "L1D_TLB_READ_REFILL", .modmsk = ARMV7_A15_ATTRS, .code = 0x4c, .desc = "Level 1 data TLB read refill" }, {.name = "L1D_TLB_WRITE_REFILL", .modmsk = ARMV7_A15_ATTRS, .code = 0x4d, .desc = "Level 1 data TLB write 
refill" }, {.name = "L2D_READ_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x50, .desc = "Level 2 data cache read access" }, {.name = "L2D_WRITE_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x51, .desc = "Level 2 data cache write access" }, {.name = "L2D_READ_REFILL", .modmsk = ARMV7_A15_ATTRS, .code = 0x52, .desc = "Level 2 data cache read refill" }, {.name = "L2D_WRITE_REFILL", .modmsk = ARMV7_A15_ATTRS, .code = 0x53, .desc = "Level 2 data cache write refill" }, {.name = "L2D_WB_VICTIM", .modmsk = ARMV7_A15_ATTRS, .code = 0x56, .desc = "Level 2 data cache writeback victim" }, {.name = "L2D_WB_CLEAN_COHERENCY", .modmsk = ARMV7_A15_ATTRS, .code = 0x57, .desc = "Level 2 data cache writeback cleaning and coherency" }, {.name = "L2D_INVALIDATE", .modmsk = ARMV7_A15_ATTRS, .code = 0x58, .desc = "Level 2 data cache invalidate" }, {.name = "BUS_READ_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x60, .desc = "Bus read access" }, {.name = "BUS_WRITE_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x61, .desc = "Bus write access" }, {.name = "BUS_NORMAL_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x62, .desc = "Bus normal access" }, {.name = "BUS_NOT_NORMAL_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x63, .desc = "Bus not normal access" }, {.name = "BUS_NORMAL_ACCESS_2", .modmsk = ARMV7_A15_ATTRS, .code = 0x64, .desc = "Bus normal access" }, {.name = "BUS_PERIPH_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x65, .desc = "Bus peripheral access" }, {.name = "DATA_MEM_READ_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x66, .desc = "Data memory read access" }, {.name = "DATA_MEM_WRITE_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x67, .desc = "Data memory write access" }, {.name = "UNALIGNED_READ_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x68, .desc = "Unaligned read access" }, {.name = "UNALIGNED_WRITE_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x69, .desc = "Unaligned write access" }, {.name = "UNALIGNED_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x6a, .desc = "Unaligned access" },
{.name = "INST_SPEC_EXEC_LDREX", .modmsk = ARMV7_A15_ATTRS, .code = 0x6c, .desc = "LDREX exclusive instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_STREX_PASS", .modmsk = ARMV7_A15_ATTRS, .code = 0x6d, .desc = "STREX pass exclusive instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_STREX_FAIL", .modmsk = ARMV7_A15_ATTRS, .code = 0x6e, .desc = "STREX fail exclusive instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_LOAD", .modmsk = ARMV7_A15_ATTRS, .code = 0x70, .desc = "Load instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_STORE", .modmsk = ARMV7_A15_ATTRS, .code = 0x71, .desc = "Store instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_LOAD_STORE", .modmsk = ARMV7_A15_ATTRS, .code = 0x72, .desc = "Load or store instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_INTEGER_INST", .modmsk = ARMV7_A15_ATTRS, .code = 0x73, .desc = "Integer data processing instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_SIMD", .modmsk = ARMV7_A15_ATTRS, .code = 0x74, .desc = "Advanced SIMD instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_VFP", .modmsk = ARMV7_A15_ATTRS, .code = 0x75, .desc = "VFP instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_SOFT_PC", .modmsk = ARMV7_A15_ATTRS, .code = 0x76, .desc = "Software change of the PC instruction speculatively executed" }, {.name = "BRANCH_SPEC_EXEC_IMM_BRANCH", .modmsk = ARMV7_A15_ATTRS, .code = 0x78, .desc = "Immediate branch speculatively executed" }, {.name = "BRANCH_SPEC_EXEC_RET", .modmsk = ARMV7_A15_ATTRS, .code = 0x79, .desc = "Return branch speculatively executed" }, {.name = "BRANCH_SPEC_EXEC_IND", .modmsk = ARMV7_A15_ATTRS, .code = 0x7a, .desc = "Indirect branch speculatively executed" }, {.name = "BARRIER_SPEC_EXEC_ISB", .modmsk = ARMV7_A15_ATTRS, .code = 0x7c, .desc = "ISB barrier speculatively executed" }, {.name = "BARRIER_SPEC_EXEC_DSB", .modmsk = ARMV7_A15_ATTRS, .code = 0x7d, .desc = "DSB barrier
speculatively executed" }, {.name = "BARRIER_SPEC_EXEC_DMB", .modmsk = ARMV7_A15_ATTRS, .code = 0x7e, .desc = "DMB barrier speculatively executed" }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_cortex_a53_events.h000066400000000000000000000121561502707512200244130ustar00rootroot00000000000000/* * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
 * * Cortex A53 r0p2 * based on Table 12.9 from the "Cortex A53 Technical Reference Manual" */ static const arm_entry_t arm_cortex_a53_pe[]={ {.name = "SW_INCR", .modmsk = ARMV8_ATTRS, .code = 0x00, .desc = "Instruction architecturally executed (condition check pass) Software increment" }, {.name = "L1I_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x01, .desc = "Level 1 instruction cache refill" }, {.name = "L1I_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x02, .desc = "Level 1 instruction TLB refill" }, {.name = "L1D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x03, .desc = "Level 1 data cache refill" }, {.name = "L1D_CACHE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x04, .desc = "Level 1 data cache access" }, {.name = "L1D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x05, .desc = "Level 1 data TLB refill" }, {.name = "LD_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x06, .desc = "Load Instruction architecturally executed, condition check", }, {.name = "ST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x07, .desc = "Store Instruction architecturally executed, condition check", }, {.name = "INST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x08, .desc = "Instruction architecturally executed" }, {.name = "EXCEPTION_TAKEN", .modmsk = ARMV8_ATTRS, .code = 0x09, .desc = "Exception taken" }, {.name = "EXCEPTION_RETURN", .modmsk = ARMV8_ATTRS, .code = 0x0a, .desc = "Instruction architecturally executed (condition check pass) Exception return" }, {.name = "CID_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0b, .desc = "Change to Context ID retired", }, {.name = "PC_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0c, .desc = "Software change of the PC, instruction architecturally executed, condition check pass" }, {.name = "BR_IMMED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0d, .desc = "Immediate branch, instruction architecturally executed" }, {.name = "UNALIGNED_LDST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0f, .desc = "Unaligned load or store, instruction
architecturally executed, condition check pass" }, {.name = "BRANCH_MISPRED", .modmsk = ARMV8_ATTRS, .code = 0x10, .desc = "Mispredicted or not predicted branch speculatively executed" }, {.name = "CPU_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x11, .desc = "Cycles" }, {.name = "BRANCH_PRED", .modmsk = ARMV8_ATTRS, .code = 0x12, .desc = "Predictable branch speculatively executed" }, {.name = "DATA_MEM_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x13, .desc = "Data memory access" }, {.name = "L1I_CACHE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x14, .desc = "Level 1 instruction cache access" }, {.name = "L1D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x15, .desc = "Level 1 data cache WriteBack" }, {.name = "L2D_CACHE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x16, .desc = "Level 2 data cache access" }, {.name = "L2D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x17, .desc = "Level 2 data cache refill" }, {.name = "L2D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x18, .desc = "Level 2 data cache WriteBack" }, {.name = "BUS_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x19, .desc = "Bus access" }, {.name = "LOCAL_MEMORY_ERROR", .modmsk = ARMV8_ATTRS, .code = 0x1a, .desc = "Local memory error" }, {.name = "BUS_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x1d, .desc = "Bus cycle" }, {.name = "BUS_READ_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x60, .desc = "Bus read access" }, {.name = "BUS_WRITE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x61, .desc = "Bus write access" }, {.name = "BRANCH_SPEC_EXEC_IND", .modmsk = ARMV8_ATTRS, .code = 0x7a, .desc = "Indirect branch speculatively executed" }, {.name = "EXCEPTION_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x86, .desc = "Exception taken, irq" }, {.name = "EXCEPTION_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x87, .desc = "Exception taken, fiq" }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_cortex_a55_events.h000066400000000000000000000367011502707512200244170ustar00rootroot00000000000000/* * Copyright (c) 2024 Google, Inc * Contributed by Stephane
Eranian * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER * DEALINGS IN THE SOFTWARE. 
* * ARM Cortex A55 * References: * - Arm Cortex A55 TRM: https://developer.arm.com/documentation/100442/0100/debug-descriptions/pmu/pmu-events * - https://github.com/ARM-software/data/blob/master/pmu/cortex-a55.json */ static const arm_entry_t arm_cortex_a55_pe[]={ {.name = "SW_INCR", .modmsk = ARMV8_ATTRS, .code = 0x00, .desc = "Instruction architecturally executed, condition code check pass, software increment" }, {.name = "L1I_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x01, .desc = "Level 1 instruction cache refill" }, {.name = "L1I_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x02, .desc = "Level 1 instruction TLB refill" }, {.name = "L1D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x03, .desc = "Level 1 data cache refill" }, {.name = "L1D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x04, .desc = "Level 1 data cache access" }, {.name = "L1D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x05, .desc = "Level 1 data TLB refill" }, {.name = "LD_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x06, .desc = "Instruction architecturally executed, condition code check pass, load" }, {.name = "ST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x07, .desc = "Instruction architecturally executed, condition code check pass, store" }, {.name = "INST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x08, .desc = "Instruction architecturally executed" }, {.name = "EXC_TAKEN", .modmsk = ARMV8_ATTRS, .code = 0x09, .desc = "Exception taken" }, {.name = "EXC_RETURN", .modmsk = ARMV8_ATTRS, .code = 0x0a, .desc = "Instruction architecturally executed, condition code check pass, exception return" }, {.name = "CID_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0b, .desc = "Instruction architecturally executed, condition code check pass, write to CONTEXTIDR" }, {.name = "PC_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0c, .desc = "Instruction architecturally executed, condition code check pass, software change of the PC" }, {.name = "BR_IMMED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0d, .desc = 
"Instruction architecturally executed, immediate branch" }, {.name = "BR_RETURN_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0e, .desc = "Instruction architecturally executed, condition code check pass, procedure return" }, {.name = "UNALIGNED_LDST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0f, .desc = "Instruction architecturally executed, condition code check pass, unaligned load or store" }, {.name = "BR_MIS_PRED", .modmsk = ARMV8_ATTRS, .code = 0x10, .desc = "Mispredicted or not predicted branch speculatively executed" }, {.name = "CPU_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x11, .desc = "Cycle" }, {.name = "BR_PRED", .modmsk = ARMV8_ATTRS, .code = 0x12, .desc = "Predictable branch speculatively executed" }, {.name = "MEM_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x13, .desc = "Data memory access" }, {.name = "L1I_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x14, .desc = "Level 1 instruction cache access" }, {.name = "L1D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x15, .desc = "Level 1 data cache Write-Back" }, {.name = "L2D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x16, .desc = "Level 2 data cache access" }, {.name = "L2D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x17, .desc = "Level 2 data cache refill" }, {.name = "L2D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x18, .desc = "Level 2 data cache Write-Back" }, {.name = "BUS_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x19, .desc = "Bus access" }, {.name = "MEMORY_ERROR", .modmsk = ARMV8_ATTRS, .code = 0x1a, .desc = "Local memory error" }, {.name = "INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x1b, .desc = "Operation speculatively executed" }, {.name = "INT_SPEC", .modmsk = ARMV8_ATTRS, .equiv = "INST_SPEC", .code = 0x1b, .desc = "Operation speculatively executed" }, {.name = "TTBR_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x1c, .desc = "Instruction architecturally executed, condition code check pass, write to TTBR" }, {.name = "BUS_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x1d, .desc = "Bus cycles" }, {.name = 
"L2D_CACHE_ALLOCATE", .modmsk = ARMV8_ATTRS, .code = 0x20, .desc = "Level 2 data cache allocation without refill" }, {.name = "BR_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x21, .desc = "Instruction architecturally executed, branch" }, {.name = "BR_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x22, .desc = "Instruction architecturally executed, mispredicted branch" }, {.name = "BR__MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .equiv = "BR_MIS_PRED_RETIRED", .code = 0x22, .desc = "Instruction architecturally executed, mispredicted branch" }, {.name = "STALL_FRONTEND", .modmsk = ARMV8_ATTRS, .code = 0x23, .desc = "No operation issued because of the frontend" }, {.name = "STALL_BACKEND", .modmsk = ARMV8_ATTRS, .code = 0x24, .desc = "No operation issued because of the backend" }, {.name = "L1D_TLB", .modmsk = ARMV8_ATTRS, .code = 0x25, .desc = "Level 1 data TLB access" }, {.name = "L1I_TLB", .modmsk = ARMV8_ATTRS, .code = 0x26, .desc = "Level 1 instruction TLB access" }, {.name = "L3D_CACHE_ALLOCATE", .modmsk = ARMV8_ATTRS, .code = 0x29, .desc = "Attributable Level 3 unified cache allocation without refill" }, {.name = "L3D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x2a, .desc = "Attributable Level 3 unified cache refill" }, {.name = "L3D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x2b, .desc = "Attributable Level 3 unified cache access" }, {.name = "L2D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x2d, .desc = "Attributable Level 2 unified TLB refill" }, {.name = "L2D_TLB", .modmsk = ARMV8_ATTRS, .code = 0x2f, .desc = "Attributable Level 2 unified TLB access" }, {.name = "DTLB_WALK", .modmsk = ARMV8_ATTRS, .code = 0x34, .desc = "Access to data TLB that caused a page table walk" }, {.name = "ITLB_WALK", .modmsk = ARMV8_ATTRS, .code = 0x35, .desc = "Access to instruction TLB that caused a page table walk" }, {.name = "LL_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x36, .desc = "Last level cache access, read" }, {.name = "LL_CACHE_MISS_RD", .modmsk = ARMV8_ATTRS, .code = 
0x37, .desc = "Last level cache miss, read" }, {.name = "REMOTE_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x38, .desc = "Access to another socket in a multi-socket system, read" }, {.name = "L1D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x40, .desc = "Level 1 data cache access, read" }, {.name = "L1D_CACHE_WR", .modmsk = ARMV8_ATTRS, .code = 0x41, .desc = "Level 1 data cache access, write" }, {.name = "L1D_CACHE_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x42, .desc = "Level 1 data cache refill, read" }, {.name = "L1D_CACHE_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x43, .desc = "Level 1 data cache refill, write" }, {.name = "L1D_CACHE_REFILL_INNER", .modmsk = ARMV8_ATTRS, .code = 0x44, .desc = "Level 1 data cache refill, inner" }, {.name = "L1D_CACHE_REFILL_OUTER", .modmsk = ARMV8_ATTRS, .code = 0x45, .desc = "Level 1 data cache refill, outer" }, {.name = "L2D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x50, .desc = "Level 2 cache access, read" }, {.name = "L2D_CACHE_WR", .modmsk = ARMV8_ATTRS, .code = 0x51, .desc = "Level 2 cache access, write" }, {.name = "L2D_CACHE_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x52, .desc = "Level 2 cache refill, read" }, {.name = "L2D_CACHE_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x53, .desc = "Level 2 cache refill, write" }, {.name = "BUS_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x60, .desc = "Bus access, read" }, {.name = "BUS_ACCESS_WR", .modmsk = ARMV8_ATTRS, .code = 0x61, .desc = "Bus access, write" }, {.name = "MEM_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x66, .desc = "Data memory access, read" }, {.name = "MEM_ACCESS_WR", .modmsk = ARMV8_ATTRS, .code = 0x67, .desc = "Data memory access, write" }, {.name = "LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x70, .desc = "Operation speculatively executed, load" }, {.name = "ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x71, .desc = "Operation speculatively executed, store" }, {.name = "LDST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x72, .desc = "Operation speculatively executed, 
load or store" }, {.name = "DP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x73, .desc = "Operation speculatively executed, integer data processing" }, {.name = "ASE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x74, .desc = "Operation speculatively executed, Advanced SIMD instruction" }, {.name = "VFP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x75, .desc = "Operation speculatively executed, floating-point instruction" }, {.name = "PC_WRITE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x76, .desc = "Operation speculatively executed, software change of the PC" }, {.name = "CRYPTO_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x77, .desc = "Operation speculatively executed, Cryptographic instruction" }, {.name = "BR_IMMED_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x78, .desc = "Branch speculatively executed, immediate branch" }, {.name = "BR_RETURN_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x79, .desc = "Branch speculatively executed, procedure return" }, {.name = "BR_INDIRECT_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7a, .desc = "Branch speculatively executed, indirect branch" }, {.name = "EXC_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x86, .desc = "Exception taken, IRQ" }, {.name = "EXC_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x87, .desc = "Exception taken, FIQ" }, {.name = "L3D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0xa0, .desc = "Attributable Level 3 unified cache access, read" }, {.name = "L3D_CACHE_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0xa2, .desc = "Attributable Level 3 unified cache refill, read" }, {.name = "L3D_CACHE_REFILL_PREFETCH", .modmsk = ARMV8_ATTRS, .code = 0xc0, .desc = "Level 3 cache refill due to prefetch" }, {.name = "L2D_CACHE_REFILL_PREFETCH", .modmsk = ARMV8_ATTRS, .code = 0xc1, .desc = "Level 2 cache refill due to prefetch" }, {.name = "L1D_CACHE_REFILL_PREFETCH", .modmsk = ARMV8_ATTRS, .code = 0xc2, .desc = "Level 1 data cache refill due to prefetch" }, {.name = "L2D_WS_MODE", .modmsk = ARMV8_ATTRS, .code = 0xc3, .desc = "Level 2 cache write streaming mode" }, {.name = 
"L1D_WS_MODE_ENTRY", .modmsk = ARMV8_ATTRS, .code = 0xc4, .desc = "Level 1 data cache entering write streaming mode" }, {.name = "L1D_WS_MODE", .modmsk = ARMV8_ATTRS, .code = 0xc5, .desc = "Level 1 data cache write streaming mode" }, {.name = "PREDECODE_ERROR", .modmsk = ARMV8_ATTRS, .code = 0xc6, .desc = "Predecode error" }, {.name = "L3D_WS_MODE", .modmsk = ARMV8_ATTRS, .code = 0xc7, .desc = "Level 3 cache write streaming mode" }, {.name = "BR_COND_PRED", .modmsk = ARMV8_ATTRS, .code = 0xc9, .desc = "Predicted conditional branch executed" }, {.name = "BR_INDIRECT_MIS_PRED", .modmsk = ARMV8_ATTRS, .code = 0xca, .desc = "Indirect branch mis-predicted" }, {.name = "BR_INDIRECT_ADDR_MIS_PRED", .modmsk = ARMV8_ATTRS, .code = 0xcb, .desc = "Indirect branch mis-predicted due to address mis-compare" }, {.name = "BR_COND_MIS_PRED", .modmsk = ARMV8_ATTRS, .code = 0xcc, .desc = "Conditional branch mis-predicted" }, {.name = "BR_INDIRECT_ADDR_PRED", .modmsk = ARMV8_ATTRS, .code = 0xcd, .desc = "Indirect branch with predicted address executed" }, {.name = "BR_RETURN_ADDR_PRED", .modmsk = ARMV8_ATTRS, .code = 0xce, .desc = "Procedure return with predicted address executed" }, {.name = "BR_RETURN_ADDR_MIS_PRED", .modmsk = ARMV8_ATTRS, .code = 0xcf, .desc = "Procedure return mis-predicted due to address mis-compare" }, {.name = "L2D_LLWALK_TLB", .modmsk = ARMV8_ATTRS, .code = 0xd0, .desc = "Level 2 TLB last-level walk cache access" }, {.name = "L2D_LLWALK_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0xd1, .desc = "Level 2 TLB last-level walk cache refill" }, {.name = "L2D_L2WALK_TLB", .modmsk = ARMV8_ATTRS, .code = 0xd2, .desc = "Level 2 TLB level-2 walk cache access" }, {.name = "L2D_L2WALK_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0xd3, .desc = "Level 2 TLB level-2 walk cache refill" }, {.name = "L2D_S2_TLB", .modmsk = ARMV8_ATTRS, .code = 0xd4, .desc = "Level 2 TLB IPA cache access" }, {.name = "L2D_S2_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0xd5, .desc = "Level 2 TLB 
IPA cache refill" }, {.name = "L2D_CACHE_STASH_DROPPED", .modmsk = ARMV8_ATTRS, .code = 0xd6, .desc = "Level 2 cache stash dropped" }, {.name = "STALL_FRONTEND_CACHE", .modmsk = ARMV8_ATTRS, .code = 0xe1, .desc = "No operation issued due to the frontend, cache miss" }, {.name = "STALL_FRONTEND_TLB", .modmsk = ARMV8_ATTRS, .code = 0xe2, .desc = "No operation issued due to the frontend, TLB miss" }, {.name = "STALL_FRONTEND_PDERR", .modmsk = ARMV8_ATTRS, .code = 0xe3, .desc = "No operation issued due to the frontend, pre-decode error" }, {.name = "STALL_BACKEND_ILOCK", .modmsk = ARMV8_ATTRS, .code = 0xe4, .desc = "No operation issued due to the backend interlock" }, {.name = "STALL_BACKEND_ILOCK_AGU", .modmsk = ARMV8_ATTRS, .code = 0xe5, .desc = "No operation issued due to the backend, interlock, AGU" }, {.name = "STALL_BACKEND_ILOCK_FPU", .modmsk = ARMV8_ATTRS, .code = 0xe6, .desc = "No operation issued due to the backend, interlock, FPU" }, {.name = "STALL_BACKEND_LD", .modmsk = ARMV8_ATTRS, .code = 0xe7, .desc = "No operation issued due to the backend, load" }, {.name = "STALL_BACKEND_ST", .modmsk = ARMV8_ATTRS, .code = 0xe8, .desc = "No operation issued due to the backend, store" }, {.name = "STALL_BACKEND_LD_CACHE", .modmsk = ARMV8_ATTRS, .code = 0xe9, .desc = "No operation issued due to the backend, load, cache miss" }, {.name = "STALL_BACKEND_LD_TLB", .modmsk = ARMV8_ATTRS, .code = 0xea, .desc = "No operation issued due to the backend, load, TLB miss" }, {.name = "STALL_BACKEND_ST_STB", .modmsk = ARMV8_ATTRS, .code = 0xeb, .desc = "No operation issued due to the backend, store, STB full" }, {.name = "STALL_BACKEND_ST_TLB", .modmsk = ARMV8_ATTRS, .code = 0xec, .desc = "No operation issued due to the backend, store, TLB miss" }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_cortex_a57_events.h000066400000000000000000000264511502707512200244220ustar00rootroot00000000000000/* * Copyright (c) 2014 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* * Cortex A57 r1p1 * based on Table 11-24 from the "Cortex A57 Technical Reference Manual" */ static const arm_entry_t arm_cortex_a57_pe[]={ {.name = "SW_INCR", .modmsk = ARMV8_ATTRS, .code = 0x00, .desc = "Instruction architecturally executed (condition check pass) Software increment" }, {.name = "L1I_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x01, .desc = "Level 1 instruction cache refill" }, {.name = "L1I_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x02, .desc = "Level 1 instruction TLB refill" }, {.name = "L1D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x03, .desc = "Level 1 data cache refill" }, {.name = "L1D_CACHE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x04, .desc = "Level 1 data cache access" }, {.name = "L1D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x05, .desc = "Level 1 data TLB refill" }, {.name = "INST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x08, .desc = "Instruction architecturally executed" }, {.name = "EXCEPTION_TAKEN", .modmsk = ARMV8_ATTRS, .code = 0x09, .desc = "Exception taken" }, {.name = "EXCEPTION_RETURN", .modmsk = ARMV8_ATTRS, .code = 0x0a, .desc = "Instruction architecturally executed (condition check pass) Exception return" }, {.name = "CID_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0b, .desc = "Instruction architecturally executed (condition check pass) Write to CONTEXTIDR" }, {.name = "BRANCH_MISPRED", .modmsk = ARMV8_ATTRS, .code = 0x10, .desc = "Mispredicted or not predicted branch speculatively executed" }, {.name = "CPU_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x11, .desc = "Cycles" }, {.name = "BRANCH_PRED", .modmsk = ARMV8_ATTRS, .code = 0x12, .desc = "Predictable branch speculatively executed" }, {.name = "DATA_MEM_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x13, .desc = "Data memory access" }, {.name = "L1I_CACHE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x14, .desc = "Level 1 instruction cache access" }, {.name = "L1D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x15, .desc = "Level 1 data cache WriteBack" }, {.name = 
"L2D_CACHE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x16, .desc = "Level 2 data cache access" }, {.name = "L2D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x17, .desc = "Level 2 data cache refill" }, {.name = "L2D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x18, .desc = "Level 2 data cache WriteBack" }, {.name = "BUS_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x19, .desc = "Bus access" }, {.name = "LOCAL_MEMORY_ERROR", .modmsk = ARMV8_ATTRS, .code = 0x1a, .desc = "Local memory error" }, {.name = "INST_SPEC_EXEC", .modmsk = ARMV8_ATTRS, .code = 0x1b, .desc = "Instruction speculatively executed" }, {.name = "TTBR_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x1c, .desc = "Instruction architecturally executed (condition check pass) Write to translation table base" }, {.name = "BUS_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x1d, .desc = "Bus cycle" }, {.name = "L1D_READ_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x40, .desc = "Level 1 data cache read access" }, {.name = "L1D_WRITE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x41, .desc = "Level 1 data cache write access" }, {.name = "L1D_READ_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x42, .desc = "Level 1 data cache read refill" }, {.name = "L1D_WRITE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x43, .desc = "Level 1 data cache write refill" }, {.name = "L1D_WB_VICTIM", .modmsk = ARMV8_ATTRS, .code = 0x46, .desc = "Level 1 data cache writeback victim" }, {.name = "L1D_WB_CLEAN_COHERENCY", .modmsk = ARMV8_ATTRS, .code = 0x47, .desc = "Level 1 data cache writeback cleaning and coherency" }, {.name = "L1D_INVALIDATE", .modmsk = ARMV8_ATTRS, .code = 0x48, .desc = "Level 1 data cache invalidate" }, {.name = "L1D_TLB_READ_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x4c, .desc = "Level 1 data TLB read refill" }, {.name = "L1D_TLB_WRITE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x4d, .desc = "Level 1 data TLB write refill" }, {.name = "L2D_READ_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x50, .desc = "Level 2 data cache read access" }, {.name = 
"L2D_WRITE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x51, .desc = "Level 2 data cache write access" }, {.name = "L2D_READ_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x52, .desc = "Level 2 data cache read refill" }, {.name = "L2D_WRITE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x53, .desc = "Level 2 data cache write refill" }, {.name = "L2D_WB_VICTIM", .modmsk = ARMV8_ATTRS, .code = 0x56, .desc = "Level 2 data cache writeback victim" }, {.name = "L2D_WB_CLEAN_COHERENCY", .modmsk = ARMV8_ATTRS, .code = 0x57, .desc = "Level 2 data cache writeback cleaning and coherency" }, {.name = "L2D_INVALIDATE", .modmsk = ARMV8_ATTRS, .code = 0x58, .desc = "Level 2 data cache invalidate" }, {.name = "BUS_READ_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x60, .desc = "Bus read access" }, {.name = "BUS_WRITE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x61, .desc = "Bus write access" }, {.name = "BUS_NORMAL_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x62, .desc = "Bus normal access" }, {.name = "BUS_NOT_NORMAL_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x63, .desc = "Bus not normal access" }, {.name = "BUS_NORMAL_ACCESS_2", .modmsk = ARMV8_ATTRS, .code = 0x64, .desc = "Bus normal access" }, {.name = "BUS_PERIPH_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x65, .desc = "Bus peripheral access" }, {.name = "DATA_MEM_READ_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x66, .desc = "Data memory read access" }, {.name = "DATA_MEM_WRITE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x67, .desc = "Data memory write access" }, {.name = "UNALIGNED_READ_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x68, .desc = "Unaligned read access" }, {.name = "UNALIGNED_WRITE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x69, .desc = "Unaligned write access" }, {.name = "UNALIGNED_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x6a, .desc = "Unaligned access" }, {.name = "INST_SPEC_EXEC_LDREX", .modmsk = ARMV8_ATTRS, .code = 0x6c, .desc = "LDREX exclusive instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_STREX_PASS", .modmsk = ARMV8_ATTRS, .code
= 0x6d, .desc = "STREX pass exclusive instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_STREX_FAIL", .modmsk = ARMV8_ATTRS, .code = 0x6e, .desc = "STREX fail exclusive instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_LOAD", .modmsk = ARMV8_ATTRS, .code = 0x70, .desc = "Load instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_STORE", .modmsk = ARMV8_ATTRS, .code = 0x71, .desc = "Store instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_LOAD_STORE", .modmsk = ARMV8_ATTRS, .code = 0x72, .desc = "Load or store instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_INTEGER_INST", .modmsk = ARMV8_ATTRS, .code = 0x73, .desc = "Integer data processing instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_SIMD", .modmsk = ARMV8_ATTRS, .code = 0x74, .desc = "Advanced SIMD instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_VFP", .modmsk = ARMV8_ATTRS, .code = 0x75, .desc = "VFP instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_SOFT_PC", .modmsk = ARMV8_ATTRS, .code = 0x76, .desc = "Software change of the PC instruction speculatively executed" }, {.name = "BRANCH_SPEC_EXEC_IMM_BRANCH", .modmsk = ARMV8_ATTRS, .code = 0x78, .desc = "Immediate branch speculatively executed" }, {.name = "BRANCH_SPEC_EXEC_RET", .modmsk = ARMV8_ATTRS, .code = 0x79, .desc = "Return branch speculatively executed" }, {.name = "BRANCH_SPEC_EXEC_IND", .modmsk = ARMV8_ATTRS, .code = 0x7a, .desc = "Indirect branch speculatively executed" }, {.name = "BARRIER_SPEC_EXEC_ISB", .modmsk = ARMV8_ATTRS, .code = 0x7c, .desc = "ISB barrier speculatively executed" }, {.name = "BARRIER_SPEC_EXEC_DSB", .modmsk = ARMV8_ATTRS, .code = 0x7d, .desc = "DSB barrier speculatively executed" }, {.name = "BARRIER_SPEC_EXEC_DMB", .modmsk = ARMV8_ATTRS, .code = 0x7e, .desc = "DMB barrier speculatively executed" }, {.name = "EXCEPTION_UNDEF", .modmsk = ARMV8_ATTRS, .code = 0x81, .desc = "Exception taken, other synchronous" }, {.name =
"EXCEPTION_SVC", .modmsk = ARMV8_ATTRS, .code = 0x82, .desc = "Exception taken, supervisor call" }, {.name = "EXCEPTION_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x83, .desc = "Exception taken, instruction abort" }, {.name = "EXCEPTION_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x84, .desc = "Exception taken, data abort or SError" }, {.name = "EXCEPTION_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x86, .desc = "Exception taken, irq" }, {.name = "EXCEPTION_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x87, .desc = "Exception taken, fiq" }, {.name = "EXCEPTION_SMC", .modmsk = ARMV8_ATTRS, .code = 0x88, .desc = "Exception taken, secure monitor call" }, {.name = "EXCEPTION_HVC", .modmsk = ARMV8_ATTRS, .code = 0x8a, .desc = "Exception taken, hypervisor call" }, {.name = "EXCEPTION_TRAP_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x8b, .desc = "Exception taken, instruction abort not taken locally" }, {.name = "EXCEPTION_TRAP_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x8c, .desc = "Exception taken, data abort or SError not taken locally" }, {.name = "EXCEPTION_TRAP_OTHER", .modmsk = ARMV8_ATTRS, .code = 0x8d, .desc = "Exception taken, other traps not taken locally" }, {.name = "EXCEPTION_TRAP_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x8e, .desc = "Exception taken, irq not taken locally" }, {.name = "EXCEPTION_TRAP_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x8f, .desc = "Exception taken, fiq not taken locally" }, {.name = "RC_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x90, .desc = "Release consistency instruction speculatively executed (load-acquire)", }, {.name = "RC_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x91, .desc = "Release consistency instruction speculatively executed (store-release)", }, /* END Cortex A57 specific events */ }; papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_cortex_a76_events.h000066400000000000000000000342751502707512200244260ustar00rootroot00000000000000/* Copyright (c) 2024 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person
obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER * DEALINGS IN THE SOFTWARE. * * ARM Cortex A76 * References: * - Arm Cortex A76 TRM: https://developer.arm.com/documentation/100798/0401/Performance-Monitoring-Unit/PMU-events * - https://github.com/ARM-software/data/blob/master/pmu/cortex-a76.json */ static const arm_entry_t arm_cortex_a76_pe[]={ {.name = "SW_INCR", .modmsk = ARMV8_ATTRS, .code = 0x00, .desc = "Software increment" }, {.name = "L1I_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x01, .desc = "L1 instruction cache refill" }, {.name = "L1I_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x02, .desc = "L1 instruction TLB refill" }, {.name = "L1D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x03, .desc = "L1 data cache refill" }, {.name = "L1D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x04, .desc = "L1 data cache access" }, {.name = "L1D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x05, .desc = "L1 data TLB refill" }, {.name = "INST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x08, .desc = "Instruction architecturally executed" }, {.name = "EXC_TAKEN", .modmsk = 
ARMV8_ATTRS, .code = 0x09, .desc = "Exception taken" }, {.name = "EXC_RETURN", .modmsk = ARMV8_ATTRS, .code = 0x0a, .desc = "Instruction architecturally executed, condition code check pass, exception return" }, {.name = "CID_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0b, .desc = "Instruction architecturally executed, condition code check pass, write to CONTEXTIDR" }, {.name = "BR_MIS_PRED", .modmsk = ARMV8_ATTRS, .code = 0x10, .desc = "Mispredicted or not predicted branch speculatively executed" }, {.name = "CPU_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x11, .desc = "Cycle" }, {.name = "BR_PRED", .modmsk = ARMV8_ATTRS, .code = 0x12, .desc = "Predictable branch speculatively executed" }, {.name = "MEM_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x13, .desc = "Data memory access" }, {.name = "L1I_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x14, .desc = "Level 1 instruction cache access or Level 0 Macro-op cache access" }, {.name = "L1D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x15, .desc = "L1 data cache Write-Back" }, {.name = "L2D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x16, .desc = "L2 unified cache access" }, {.name = "L2D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x17, .desc = "L2 unified cache refill" }, {.name = "L2D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x18, .desc = "L2 unified cache write-back" }, {.name = "BUS_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x19, .desc = "Bus access" }, {.name = "MEMORY_ERROR", .modmsk = ARMV8_ATTRS, .code = 0x1a, .desc = "Local memory error" }, {.name = "INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x1b, .desc = "Operation speculatively executed" }, {.name = "TTBR_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x1c, .desc = "Instruction architecturally executed, condition code check pass, write to TTBR" }, {.name = "BUS_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x1d, .desc = "Bus cycles" }, {.name = "L2D_CACHE_ALLOCATE", .modmsk = ARMV8_ATTRS, .code = 0x20, .desc = "L2 unified cache allocation without refill" }, {.name = 
"BR_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x21, .desc = "Instruction architecturally executed, branch" }, {.name = "BR_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x22, .desc = "Instruction architecturally executed, mispredicted branch" }, {.name = "STALL_FRONTEND", .modmsk = ARMV8_ATTRS, .code = 0x23, .desc = "No operation issued because of the frontend" }, {.name = "STALL_BACKEND", .modmsk = ARMV8_ATTRS, .code = 0x24, .desc = "No operation issued because of the backend" }, {.name = "L1D_TLB", .modmsk = ARMV8_ATTRS, .code = 0x25, .desc = "Level 1 data TLB access" }, {.name = "L1I_TLB", .modmsk = ARMV8_ATTRS, .code = 0x26, .desc = "Level 1 instruction TLB access" }, {.name = "L3D_CACHE_ALLOCATE", .modmsk = ARMV8_ATTRS, .code = 0x29, .desc = "Attributable L3 data or unified cache allocation without refill" }, {.name = "L3D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x2a, .desc = "Attributable Level 3 unified cache refill" }, {.name = "L3D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x2b, .desc = "Attributable Level 3 unified cache access" }, {.name = "L2D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x2d, .desc = "Attributable L2 data or unified TLB refill" }, {.name = "L2D_TLB", .modmsk = ARMV8_ATTRS, .code = 0x2f, .desc = "Attributable L2 data or unified TLB access" }, {.name = "REMOTE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x31, .desc = "Access to another socket in a multi-socket system" }, {.name = "DTLB_WALK", .modmsk = ARMV8_ATTRS, .code = 0x34, .desc = "Access to data TLB that caused a page table walk" }, {.name = "ITLB_WALK", .modmsk = ARMV8_ATTRS, .code = 0x35, .desc = "Access to instruction TLB that caused a page table walk" }, {.name = "LL_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x36, .desc = "Last level cache access, read" }, {.name = "LL_CACHE_MISS_RD", .modmsk = ARMV8_ATTRS, .code = 0x37, .desc = "Last level cache miss, read" }, {.name = "L1D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x40, .desc = "L1 data cache access, read" }, {.name = 
"L1D_CACHE_WR", .modmsk = ARMV8_ATTRS, .code = 0x41, .desc = "L1 data cache access, write" }, {.name = "L1D_CACHE_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x42, .desc = "L1 data cache refill, read" }, {.name = "L1D_CACHE_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x43, .desc = "L1 data cache refill, write" }, {.name = "L1D_CACHE_REFILL_INNER", .modmsk = ARMV8_ATTRS, .code = 0x44, .desc = "L1 data cache refill, inner" }, {.name = "L1D_CACHE_REFILL_OUTER", .modmsk = ARMV8_ATTRS, .code = 0x45, .desc = "L1 data cache refill, outer" }, {.name = "L1D_CACHE_WB_VICTIM", .modmsk = ARMV8_ATTRS, .code = 0x46, .desc = "L1 data cache write-back, victim" }, {.name = "L1D_CACHE_WB_CLEAN", .modmsk = ARMV8_ATTRS, .code = 0x47, .desc = "L1 data cache write-back cleaning and coherency" }, {.name = "L1D_CACHE_INVAL", .modmsk = ARMV8_ATTRS, .code = 0x48, .desc = "L1 data cache invalidate" }, {.name = "L1D_TLB_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x4c, .desc = "L1 data TLB refill, read" }, {.name = "L1D_TLB_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x4d, .desc = "L1 data TLB refill, write" }, {.name = "L1D_TLB_RD", .modmsk = ARMV8_ATTRS, .code = 0x4e, .desc = "L1 data TLB access, read" }, {.name = "L1D_TLB_WR", .modmsk = ARMV8_ATTRS, .code = 0x4f, .desc = "L1 data TLB access, write" }, {.name = "L2D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x50, .desc = "L2 unified cache access, read" }, {.name = "L2D_CACHE_WR", .modmsk = ARMV8_ATTRS, .code = 0x51, .desc = "L2 unified cache access, write" }, {.name = "L2D_CACHE_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x52, .desc = "L2 unified cache refill, read" }, {.name = "L2D_CACHE_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x53, .desc = "L2 unified cache refill, write" }, {.name = "L2D_CACHE_WB_VICTIM", .modmsk = ARMV8_ATTRS, .code = 0x56, .desc = "L2 unified cache write-back, victim" }, {.name = "L2D_CACHE_WB_CLEAN", .modmsk = ARMV8_ATTRS, .code = 0x57, .desc = "L2 unified cache write-back, cleaning, and coherency" }, {.name = 
"L2D_CACHE_INVAL", .modmsk = ARMV8_ATTRS, .code = 0x58, .desc = "L2 unified cache invalidate" }, {.name = "L2D_TLB_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x5c, .desc = "L2 data or unified TLB refill, read" }, {.name = "L2D_TLB_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x5d, .desc = "L2 data or unified TLB refill, write" }, {.name = "L2D_TLB_RD", .modmsk = ARMV8_ATTRS, .code = 0x5e, .desc = "L2 data or unified TLB access, read" }, {.name = "L2D_TLB_WR", .modmsk = ARMV8_ATTRS, .code = 0x5f, .desc = "L2 data or unified TLB access, write" }, {.name = "BUS_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x60, .desc = "Bus access read" }, {.name = "BUS_ACCESS_WR", .modmsk = ARMV8_ATTRS, .code = 0x61, .desc = "Bus access write" }, {.name = "MEM_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x66, .desc = "Data memory access, read" }, {.name = "MEM_ACCESS_WR", .modmsk = ARMV8_ATTRS, .code = 0x67, .desc = "Data memory access, write" }, {.name = "UNALIGNED_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x68, .desc = "Unaligned access, read" }, {.name = "UNALIGNED_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x69, .desc = "Unaligned access, write" }, {.name = "UNALIGNED_LDST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6a, .desc = "Unaligned access" }, {.name = "LDREX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6c, .desc = "Exclusive operation speculatively executed, LDREX or LDX" }, {.name = "STREX_PASS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6d, .desc = "Exclusive operation speculatively executed, STREX or STX pass" }, {.name = "STREX_FAIL_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6e, .desc = "Exclusive operation speculatively executed, STREX or STX fail" }, {.name = "STREX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6f, .desc = "Exclusive operation speculatively executed, STREX or STX" }, {.name = "LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x70, .desc = "Operation speculatively executed, load" }, {.name = "ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x71, .desc =
"Operation speculatively executed, store" }, {.name = "LDST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x72, .desc = "Operation speculatively executed, load or store" }, {.name = "DP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x73, .desc = "Operation speculatively executed, integer data-processing" }, {.name = "ASE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x74, .desc = "Operation speculatively executed, Advanced SIMD instruction" }, {.name = "VFP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x75, .desc = "Operation speculatively executed, floating-point instruction" }, {.name = "PC_WRITE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x76, .desc = "Operation speculatively executed, software change of the PC" }, {.name = "CRYPTO_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x77, .desc = "Operation speculatively executed, Cryptographic instruction" }, {.name = "BR_IMMED_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x78, .desc = "Branch speculatively executed, immediate branch" }, {.name = "BR_RETURN_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x79, .desc = "Branch speculatively executed, procedure return" }, {.name = "BR_INDIRECT_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7a, .desc = "Branch speculatively executed, indirect branch" }, {.name = "ISB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7c, .desc = "Barrier speculatively executed, ISB" }, {.name = "DSB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7d, .desc = "Barrier speculatively executed, DSB" }, {.name = "DMB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7e, .desc = "Barrier speculatively executed, DMB" }, {.name = "EXC_UNDEF", .modmsk = ARMV8_ATTRS, .code = 0x81, .desc = "Counts the number of undefined exceptions taken locally" }, {.name = "EXC_SVC", .modmsk = ARMV8_ATTRS, .code = 0x82, .desc = "Exception taken locally, Supervisor Call" }, {.name = "EXC_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x83, .desc = "Exception taken locally, Instruction Abort" }, {.name = "EXC_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x84, .desc = "Exception taken locally, Data Abort and SError" }, 
{.name = "EXC_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x86, .desc = "Exception taken locally, IRQ" }, {.name = "EXC_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x87, .desc = "Exception taken locally, FIQ" }, {.name = "EXC_SMC", .modmsk = ARMV8_ATTRS, .code = 0x88, .desc = "Exception taken locally, Secure Monitor Call" }, {.name = "EXC_HVC", .modmsk = ARMV8_ATTRS, .code = 0x8a, .desc = "Exception taken locally, Hypervisor Call" }, {.name = "EXC_TRAP_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x8b, .desc = "Exception taken, Instruction Abort not taken locally" }, {.name = "EXC_TRAP_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x8c, .desc = "Exception taken, Data Abort or SError not taken locally" }, {.name = "EXC_TRAP_OTHER", .modmsk = ARMV8_ATTRS, .code = 0x8d, .desc = "Exception taken, Other traps not taken locally" }, {.name = "EXC_TRAP_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x8e, .desc = "Exception taken, IRQ not taken locally" }, {.name = "EXC_TRAP_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x8f, .desc = "Exception taken, FIQ not taken locally" }, {.name = "RC_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x90, .desc = "Release consistency operation speculatively executed, load-acquire" }, {.name = "RC_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x91, .desc = "Release consistency operation speculatively executed, store-release" }, {.name = "L3D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0xa0, .desc = "L3 cache read" }, {.name = "L3_CACHE_RD", .modmsk = ARMV8_ATTRS, .equiv = "L3D_CACHE_RD", .code = 0xa0, .desc = "L3 cache read" }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_cortex_a7_events.h000066400000000000000000000137601502707512200243340ustar00rootroot00000000000000/* * Copyright (c) 2014 by Vince Weaver * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, 
distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * Cortex A7 MPCore * based on Table 11-5 from the "Cortex-A7 MPCore Technical Reference Manual" */ static const arm_entry_t arm_cortex_a7_pe[]={ {.name = "SW_INCR", .modmsk = ARMV7_A7_ATTRS, .code = 0x00, .desc = "Incremented on writes to the Software Increment Register" }, {.name = "L1I_CACHE_REFILL", .modmsk = ARMV7_A7_ATTRS, .code = 0x01, .desc = "Level 1 instruction cache refill" }, {.name = "L1I_TLB_REFILL", .modmsk = ARMV7_A7_ATTRS, .code = 0x02, .desc = "Level 1 instruction TLB refill" }, {.name = "L1D_CACHE_REFILL", .modmsk = ARMV7_A7_ATTRS, .code = 0x03, .desc = "Level 1 data cache refill" }, {.name = "L1D_CACHE_ACCESS", .modmsk = ARMV7_A7_ATTRS, .code = 0x04, .desc = "Level 1 data cache access" }, {.name = "L1D_TLB_REFILL", .modmsk = ARMV7_A7_ATTRS, .code = 0x05, .desc = "Level 1 data TLB refill" }, {.name = "DATA_READS", .modmsk = ARMV7_A7_ATTRS, .code = 0x06, .desc = "Data reads architecturally executed" }, {.name = "DATA_WRITES", .modmsk = ARMV7_A7_ATTRS, .code = 0x07, .desc = "Data writes architecturally executed" }, {.name = "INST_RETIRED", .modmsk = ARMV7_A7_ATTRS, .code = 0x08, .desc = "Instruction architecturally executed" }, {.name = "EXCEPTION_TAKEN", .modmsk = ARMV7_A7_ATTRS, .code 
= 0x09, .desc = "Exception taken" }, {.name = "EXCEPTION_RETURN", .modmsk = ARMV7_A7_ATTRS, .code = 0x0a, .desc = "Instruction architecturally executed" }, {.name = "CID_WRITE_RETIRED", .modmsk = ARMV7_A7_ATTRS, .code = 0x0b, .desc = "Change to ContextID retired" }, {.name = "SW_CHANGE_PC", .modmsk = ARMV7_A7_ATTRS, .code = 0x0c, .desc = "Software change of PC" }, {.name = "IMMEDIATE_BRANCHES", .modmsk = ARMV7_A7_ATTRS, .code = 0x0d, .desc = "Immediate branch architecturally executed" }, {.name = "PROCEDURE_RETURNS", .modmsk = ARMV7_A7_ATTRS, .code = 0x0e, .desc = "Procedure returns architecturally executed" }, {.name = "UNALIGNED_LOAD_STORE", .modmsk = ARMV7_A7_ATTRS, .code = 0x0f, .desc = "Unaligned load-store" }, {.name = "BRANCH_MISPRED", .modmsk = ARMV7_A7_ATTRS, .code = 0x10, .desc = "Branches mispredicted/not predicted" }, {.name = "CPU_CYCLES", .modmsk = ARMV7_A7_ATTRS, .code = 0x11, .desc = "Cycles" }, {.name = "BRANCH_PRED", .modmsk = ARMV7_A7_ATTRS, .code = 0x12, .desc = "Predictable branch speculatively executed" }, {.name = "DATA_MEM_ACCESS", .modmsk = ARMV7_A7_ATTRS, .code = 0x13, .desc = "Data memory access" }, {.name = "L1I_CACHE_ACCESS", .modmsk = ARMV7_A7_ATTRS, .code = 0x14, .desc = "Level 1 instruction cache access" }, {.name = "L1D_CACHE_EVICTION", .modmsk = ARMV7_A7_ATTRS, .code = 0x15, .desc = "Level 1 data cache eviction" }, {.name = "L2D_CACHE_ACCESS", .modmsk = ARMV7_A7_ATTRS, .code = 0x16, .desc = "Level 2 data cache access" }, {.name = "L2D_CACHE_REFILL", .modmsk = ARMV7_A7_ATTRS, .code = 0x17, .desc = "Level 2 data cache refill" }, {.name = "L2D_CACHE_WB", .modmsk = ARMV7_A7_ATTRS, .code = 0x18, .desc = "Level 2 data cache WriteBack" }, {.name = "BUS_ACCESS", .modmsk = ARMV7_A7_ATTRS, .code = 0x19, .desc = "Bus accesses" }, {.name = "BUS_CYCLES", .modmsk = ARMV7_A7_ATTRS, .code = 0x1d, .desc = "Bus cycle" }, {.name = "BUS_READ_ACCESS", .modmsk = ARMV7_A7_ATTRS, .code = 0x60, .desc = "Bus read access" }, {.name = "BUS_WRITE_ACCESS", 
.modmsk = ARMV7_A7_ATTRS, .code = 0x61, .desc = "Bus write access" }, {.name = "IRQ_EXCEPTION_TAKEN", .modmsk = ARMV7_A7_ATTRS, .code = 0x86, .desc = "IRQ Exception Taken" }, {.name = "FIQ_EXCEPTION_TAKEN", .modmsk = ARMV7_A7_ATTRS, .code = 0x87, .desc = "FIQ Exception Taken" }, {.name = "EXTERNAL_MEMORY_REQUEST", .modmsk = ARMV7_A7_ATTRS, .code = 0xc0, .desc = "External memory request" }, {.name = "NONCACHE_EXTERNAL_MEMORY_REQUEST", .modmsk = ARMV7_A7_ATTRS, .code = 0xc1, .desc = "Non-cacheable external memory request" }, {.name = "PREFETCH_LINEFILL", .modmsk = ARMV7_A7_ATTRS, .code = 0xc2, .desc = "Linefill due to prefetch" }, {.name = "PREFETCH_LINEFILL_DROPPED", .modmsk = ARMV7_A7_ATTRS, .code = 0xc3, .desc = "Prefetch linefill dropped" }, {.name = "ENTERING_READ_ALLOC", .modmsk = ARMV7_A7_ATTRS, .code = 0xc4, .desc = "Entering read allocate mode" }, {.name = "READ_ALLOC", .modmsk = ARMV7_A7_ATTRS, .code = 0xc5, .desc = "Read allocate mode" }, /* 0xc6 is Reserved */ {.name = "ETM_EXT_OUT_0", .modmsk = ARMV7_A7_ATTRS, .code = 0xc7, .desc = "ETM Ext Out[0]" }, {.name = "ETM_EXT_OUT_1", .modmsk = ARMV7_A7_ATTRS, .code = 0xc8, .desc = "ETM Ext Out[1]" }, {.name = "DATA_WRITE_STALL", .modmsk = ARMV7_A7_ATTRS, .code = 0xc9, .desc = "Data write operation that stalls pipeline due to full store buffer" }, {.name = "DATA_SNOOPED", .modmsk = ARMV7_A7_ATTRS, .code = 0xca, .desc = "Data snooped from other processor" }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_cortex_a8_events.h000066400000000000000000000146031502707512200243320ustar00rootroot00000000000000/* * Copyright (c) 2010 University of Tennessee * Contributed by Vince Weaver * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software,
and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ /* * the various event names are the same as those given in the * file linux-2.6/arch/arm/kernel/perf_event.c */ /* * Cortex A8 Event Table */ static const arm_entry_t arm_cortex_a8_pe []={ {.name = "PMNC_SW_INCR", .code = 0x00, .desc = "Incremented by writes to the Software Increment Register" }, {.name = "IFETCH_MISS", .code = 0x01, .desc = "Instruction fetches that cause lowest-level cache miss" }, {.name = "ITLB_MISS", .code = 0x02, .desc = "Instruction fetches that cause lowest-level TLB miss" }, {.name = "DCACHE_REFILL", .code = 0x03, .desc = "Data read or writes that cause lowest-level cache miss" }, {.name = "DCACHE_ACCESS", .code = 0x04, .desc = "Data read or writes that cause lowest-level cache access" }, {.name = "DTLB_REFILL", .code = 0x05, .desc = "Data read or writes that cause lowest-level TLB refill" }, {.name = "DREAD", .code = 0x06, .desc = "Data read architecturally executed" }, {.name = "DWRITE", .code = 0x07, .desc = "Data write architecturally executed" }, {.name = "INSTR_EXECUTED", .code = 0x08, .desc = "Instructions architecturally executed" }, {.name = "EXC_TAKEN", .code = 0x09, .desc = "Counts each exception taken" }, {.name = 
"EXC_EXECUTED", .code = 0x0a, .desc = "Exception returns architecturally executed" }, {.name = "CID_WRITE", .code = 0x0b, .desc = "Instruction writes to Context ID Register, architecturally executed" }, {.name = "PC_WRITE", .code = 0x0c, .desc = "Software change of PC. Equivalent to branches" }, {.name = "PC_IMM_BRANCH", .code = 0x0d, .desc = "Immediate branches architecturally executed" }, {.name = "PC_PROC_RETURN", .code = 0x0e, .desc = "Procedure returns architecturally executed" }, {.name = "UNALIGNED_ACCESS", .code = 0x0f, .desc = "Unaligned accesses architecturally executed" }, {.name = "PC_BRANCH_MIS_PRED", .code = 0x10, .desc = "Branches mispredicted or not predicted" }, {.name = "CLOCK_CYCLES", /* this isn't in the Cortex-A8 tech doc */ .code = 0x11, /* but is in linux kernel */ .desc = "Clock cycles" }, {.name = "PC_BRANCH_MIS_USED", .code = 0x12, .desc = "Branches that could have been predicted" }, {.name = "WRITE_BUFFER_FULL", .code = 0x40, .desc = "Cycles Write buffer full" }, {.name = "L2_STORE_MERGED", .code = 0x41, .desc = "Stores merged in L2" }, {.name = "L2_STORE_BUFF", .code = 0x42, .desc = "Bufferable store transactions to L2" }, {.name = "L2_ACCESS", .code = 0x43, .desc = "Accesses to L2 cache" }, {.name = "L2_CACHE_MISS", .code = 0x44, .desc = "L2 cache misses" }, {.name = "AXI_READ_CYCLES", .code = 0x45, .desc = "Cycles with active AXI read channel transactions" }, {.name = "AXI_WRITE_CYCLES", .code = 0x46, .desc = "Cycles with Active AXI write channel transactions" }, {.name = "MEMORY_REPLAY", .code = 0x47, .desc = "Memory replay events" }, {.name = "UNALIGNED_ACCESS_REPLAY", .code = 0x48, .desc = "Unaligned accesses causing replays" }, {.name = "L1_DATA_MISS", .code = 0x49, .desc = "L1 data misses due to hashing algorithm" }, {.name = "L1_INST_MISS", .code = 0x4a, .desc = "L1 instruction misses due to hashing algorithm" }, {.name = "L1_DATA_COLORING", .code = 0x4b, .desc = "L1 data access where page color alias occurs" }, {.name = 
"L1_NEON_DATA", .code = 0x4c, .desc = "NEON accesses that hit in L1 cache" }, {.name = "L1_NEON_CACH_DATA", .code = 0x4d, .desc = "NEON cache accesses for L1 cache" }, {.name = "L2_NEON", .code = 0x4e, .desc = "L2 accesses caused by NEON" }, {.name = "L2_NEON_HIT", .code = 0x4f, .desc = "L2 hits caused by NEON" }, {.name = "L1_INST", .code = 0x50, .desc = "L1 instruction cache accesses" }, {.name = "PC_RETURN_MIS_PRED", .code = 0x51, .desc = "Return stack mispredictions" }, {.name = "PC_BRANCH_FAILED", .code = 0x52, .desc = "Branch prediction failures" }, {.name = "PC_BRANCH_TAKEN", .code = 0x53, .desc = "Branches predicted taken" }, {.name = "PC_BRANCH_EXECUTED", .code = 0x54, .desc = "Taken branches executed" }, {.name = "OP_EXECUTED", .code = 0x55, .desc = "Operations executed (includes sub-ops in multi-cycle instructions)" }, {.name = "CYCLES_INST_STALL", .code = 0x56, .desc = "Cycles no instruction is available for issue" }, {.name = "CYCLES_INST", .code = 0x57, .desc = "Number of instructions issued in cycle" }, {.name = "CYCLES_NEON_DATA_STALL", .code = 0x58, .desc = "Cycles stalled waiting on NEON MRC data" }, {.name = "CYCLES_NEON_INST_STALL", .code = 0x59, .desc = "Cycles stalled due to full NEON queues" }, {.name = "NEON_CYCLES", .code = 0x5a, .desc = "Cycles NEON and integer processors both not idle" }, {.name = "PMU0_EVENTS", .code = 0x70, .desc = "External PMUEXTIN[0] event" }, {.name = "PMU1_EVENTS", .code = 0x71, .desc = "External PMUEXTIN[1] event" }, {.name = "PMU_EVENTS", .code = 0x72, .desc = "External PMUEXTIN[0] or PMUEXTIN[1] event" }, {.name = "CPU_CYCLES", .code = 0xff, .desc = "CPU cycles" }, }; #define ARM_CORTEX_A8_EVENT_COUNT (sizeof(arm_cortex_a8_pe)/sizeof(arm_entry_t))

/* ==== papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_cortex_a9_events.h ==== */

/* * Copyright (c) 2010 University of Tennessee * Contributed by Vince Weaver * * Permission is hereby granted, free of
charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
*/ /* * the various event names are the same as those given in the * file linux-2.6/arch/arm/kernel/perf_event.c */ /* * Cortex A9 r2p2 Event Table * based on Table 11-7 from the "Cortex A9 Technical Reference Manual" */ static const arm_entry_t arm_cortex_a9_pe []={ /* * ARMv7 events */ {.name = "PMNC_SW_INCR", .code = 0x00, .desc = "Incremented by writes to the Software Increment Register" }, {.name = "IFETCH_MISS", .code = 0x01, .desc = "Instruction fetches that cause lowest-level cache miss" }, {.name = "ITLB_MISS", .code = 0x02, .desc = "Instruction fetches that cause lowest-level TLB miss" }, {.name = "DCACHE_REFILL", .code = 0x03, .desc = "Data read or writes that cause lowest-level cache miss" }, {.name = "DCACHE_ACCESS", .code = 0x04, .desc = "Data read or writes that cause lowest-level cache access" }, {.name = "DTLB_REFILL", .code = 0x05, .desc = "Data read or writes that cause lowest-level TLB refill" }, {.name = "DREAD", .code = 0x06, .desc = "Data read architecturally executed" }, {.name = "DWRITE", .code = 0x07, .desc = "Data write architecturally executed" }, {.name = "EXC_TAKEN", .code = 0x09, .desc = "Counts each exception taken" }, {.name = "EXC_EXECUTED", .code = 0x0a, .desc = "Exception returns architecturally executed" }, {.name = "CID_WRITE", .code = 0x0b, .desc = "Instruction writes to Context ID Register, architecturally executed" }, {.name = "PC_WRITE", .code = 0x0c, .desc = "Software change of PC. 
Equivalent to branches" }, {.name = "PC_IMM_BRANCH", .code = 0x0d, .desc = "Immediate branches architecturally executed" }, {.name = "UNALIGNED_ACCESS", .code = 0x0f, .desc = "Unaligned accesses architecturally executed" }, {.name = "PC_BRANCH_MIS_PRED", .code = 0x10, .desc = "Branches mispredicted or not predicted" }, {.name = "CLOCK_CYCLES", .code = 0x11, .desc = "Clock cycles" }, {.name = "PC_BRANCH_MIS_USED", .code = 0x12, .desc = "Branches that could have been predicted" }, /* * Cortex A9 specific events */ {.name = "JAVA_HW_BYTECODE_EXEC", .code = 0x40, .desc = "Java bytecodes decoded, including speculative (approximate)" }, {.name = "JAVA_SW_BYTECODE_EXEC", .code = 0x41, .desc = "Software Java bytecodes decoded, including speculative (approximate)" }, {.name = "JAZELLE_BRANCH_EXEC", .code = 0x42, .desc = "Jazelle backward branches executed. Includes branches that are flushed because of previous load/store which abort late (approximate)" }, {.name = "COHERENT_LINE_MISS", .code = 0x50, .desc = "Coherent linefill misses which also miss on other processors" }, {.name = "COHERENT_LINE_HIT", .code = 0x51, .desc = "Coherent linefill requests that hit on another processor" }, {.name = "ICACHE_DEP_STALL_CYCLES", .code = 0x60, .desc = "Cycles processor is stalled waiting for instruction cache and the instruction cache is performing at least one linefill (approximate)" }, {.name = "DCACHE_DEP_STALL_CYCLES", .code = 0x61, .desc = "Cycles processor is stalled waiting for data cache" }, {.name = "TLB_MISS_DEP_STALL_CYCLES", .code = 0x62, .desc = "Cycles processor is stalled waiting for completion of TLB walk (approximate)" }, {.name = "STREX_EXECUTED_PASSED", .code = 0x63, .desc = "Number of STREX instructions executed and passed" }, {.name = "STREX_EXECUTED_FAILED", .code = 0x64, .desc = "Number of STREX instructions executed and failed" }, {.name = "DATA_EVICTION", .code = 0x65, .desc = "Data eviction requests due to linefill in data cache" }, {.name = 
"ISSUE_STAGE_NO_INST", .code = 0x66, .desc = "Cycles the issue stage does not dispatch any instructions" }, {.name = "ISSUE_STAGE_EMPTY", .code = 0x67, .desc = "Cycles where issue stage is empty" }, {.name = "INST_OUT_OF_RENAME_STAGE", .code = 0x68, .desc = "Number of instructions going through register renaming stage (approximate)" }, {.name = "PREDICTABLE_FUNCT_RETURNS", .code = 0x6e, .desc = "Number of predictable function returns whose condition codes do not fail (approximate)" }, {.name = "MAIN_UNIT_EXECUTED_INST", .code = 0x70, .desc = "Instructions executed in the main execution, multiply, ALU pipelines (approximate)" }, {.name = "SECOND_UNIT_EXECUTED_INST", .code = 0x71, .desc = "Instructions executed in the second execution pipeline" }, {.name = "LD_ST_UNIT_EXECUTED_INST", .code = 0x72, .desc = "Instructions executed in the Load/Store unit" }, {.name = "FP_EXECUTED_INST", .code = 0x73, .desc = "Floating point instructions going through register renaming stage" }, {.name = "NEON_EXECUTED_INST", .code = 0x74, .desc = "NEON instructions going through register renaming stage (approximate)" }, {.name = "PLD_FULL_DEP_STALL_CYCLES", .code = 0x80, .desc = "Cycles processor is stalled because PLD slots are full (approximate)" }, {.name = "DATA_WR_DEP_STALL_CYCLES", .code = 0x81, .desc = "Cycles processor is stalled due to writes to external memory (approximate)" }, {.name = "ITLB_MISS_DEP_STALL_CYCLES", .code = 0x82, .desc = "Cycles stalled due to main instruction TLB miss (approximate)" }, {.name = "DTLB_MISS_DEP_STALL_CYCLES", .code = 0x83, .desc = "Cycles stalled due to main data TLB miss (approximate)" }, {.name = "MICRO_ITLB_MISS_DEP_STALL_CYCLES", .code = 0x84, .desc = "Cycles stalled due to micro instruction TLB miss (approximate)" }, {.name = "MICRO_DTLB_MISS_DEP_STALL_CYCLES", .code = 0x85, .desc = "Cycles stalled due to micro data TLB miss (approximate)" }, {.name = "DMB_DEP_STALL_CYCLES", .code = 0x86, .desc = "Cycles stalled due to DMB memory barrier 
(approximate)" }, {.name = "INTGR_CLK_ENABLED_CYCLES", .code = 0x8a, .desc = "Cycles during which integer core clock is enabled (approximate)" }, {.name = "DATA_ENGINE_CLK_EN_CYCLES", .code = 0x8b, .desc = "Cycles during which Data Engine clock is enabled (approximate)" }, {.name = "ISB_INST", .code = 0x90, .desc = "Number of ISB instructions architecturally executed" }, {.name = "DSB_INST", .code = 0x91, .desc = "Number of DSB instructions architecturally executed" }, {.name = "DMB_INST", .code = 0x92, .desc = "Number of DMB instructions architecturally executed (approximate)" }, {.name = "EXT_INTERRUPTS", .code = 0x93, .desc = "Number of External interrupts (approximate)" }, {.name = "PLE_CACHE_LINE_RQST_COMPLETED", .code = 0xa0, .desc = "PLE cache line requests completed" }, {.name = "PLE_CACHE_LINE_RQST_SKIPPED", .code = 0xa1, .desc = "PLE cache line requests skipped" }, {.name = "PLE_FIFO_FLUSH", .code = 0xa2, .desc = "PLE FIFO flushes" }, {.name = "PLE_RQST_COMPLETED", .code = 0xa3, .desc = "PLE requests completed" }, {.name = "PLE_FIFO_OVERFLOW", .code = 0xa4, .desc = "PLE FIFO overflows" }, {.name = "PLE_RQST_PROG", .code = 0xa5, .desc = "PLE requests programmed" }, {.name = "CPU_CYCLES", .code = 0xff, .desc = "CPU cycles" }, }; #define ARM_CORTEX_A9_EVENT_COUNT (sizeof(arm_cortex_a9_pe)/sizeof(arm_entry_t))

/* ==== papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_fujitsu_a64fx_events.h ==== */

/* * Copyright 2020 Cray Inc. All Rights Reserved.
*/ /* * Fujitsu A64FX processor * * A64FX® PMU Events * Fujitsu Limited * 1.2, 28 April 2020 */ static const arm_entry_t arm_a64fx_pe[ ] = { { .name = "SW_INCR", .modmsk = ARMV8_ATTRS, .code = 0x0000, .desc = "This event counts on writes to the PMSWINC register.", }, { .name = "L1I_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x0001, .desc = "This event counts operations that cause a refill of at least the L1I cache.", }, { .name = "L1I_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x0002, .desc = "This event counts operations that cause a TLB refill of at least the L1I TLB.", }, { .name = "L1D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x0003, .desc = "This event counts operations that cause a refill of at least the L1D cache.", }, { .name = "L1D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x0004, .desc = "This event counts operations that cause a cache access to at least the L1D cache.", }, { .name = "L1D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x0005, .desc = "This event counts operations that cause a TLB refill of at least the L1D TLB.", }, { .name = "INST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0008, .desc = "This event counts every architecturally executed instruction.", }, { .name = "EXC_TAKEN", .modmsk = ARMV8_ATTRS, .code = 0x0009, .desc = "This event counts each exception taken.", }, { .name = "EXC_RETURN", .modmsk = ARMV8_ATTRS, .code = 0x000a, .desc = "This event counts each executed exception return instruction.", }, { .name = "CID_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x000b, .desc = "This event counts every write to CONTEXTIDR.", }, { .name = "BR_MIS_PRED", .modmsk = ARMV8_ATTRS, .code = 0x0010, .desc = "This event counts each correction to the predicted program flow that occurs because of a misprediction from, or no prediction from, the branch prediction resources and that relates to instructions that the branch prediction resources are capable of predicting.", }, { .name = "CPU_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x0011, .desc = 
"This event counts every cycle.", }, { .name = "BR_PRED", .modmsk = ARMV8_ATTRS, .code = 0x0012, .desc = "This event counts every branch or other change in the program flow that the branch prediction resources are capable of predicting.", }, { .name = "L1I_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x0014, .desc = "This event counts operations that cause a cache access to at least the L1I cache.", }, { .name = "L1D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x0015, .desc = "This event counts every write-back of data from the L1D cache.", }, { .name = "L2D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x0016, .desc = "This event counts operations that cause a cache access to at least the L2 cache.", }, { .name = "L2D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x0017, .desc = "This event counts operations that cause a refill of at least the L2 cache.", }, { .name = "L2D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x0018, .desc = "This event counts every write-back of data from the L2 cache.", }, { .name = "INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x001b, .desc = "This event counts every architecturally executed instruction.", }, { .name = "STALL_FRONTEND", .modmsk = ARMV8_ATTRS, .code = 0x0023, .desc = "This event counts every cycle counted by the CPU_CYCLES event on that no operations are issued because there are no operations available to issue for this PE from the frontend.", }, { .name = "STALL_BACKEND", .modmsk = ARMV8_ATTRS, .code = 0x0024, .desc = "This event counts every cycle counted by the CPU_CYCLES event on that no operations are issued because the backend is unable to accept any operations.", }, { .name = "L2D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x002d, .desc = "This event counts operations that cause a TLB refill of at least the L2D TLB.", }, { .name = "L2I_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x002e, .desc = "This event counts operations that cause a TLB refill of at least the L2I TLB.", }, { .name = "L2D_TLB", .modmsk = ARMV8_ATTRS, .code = 
0x002f, .desc = "This event counts operations that cause a TLB access to at least the L2D TLB.", }, { .name = "L2I_TLB", .modmsk = ARMV8_ATTRS, .code = 0x0030, .desc = "This event counts operations that cause a TLB access to at least the L2I TLB.", }, { .name = "L1D_CACHE_REFILL_PRF", .modmsk = ARMV8_ATTRS, .code = 0x0049, .desc = "This event counts L1D_CACHE_REFILL caused by software or hardware prefetch.", }, { .name = "L2D_CACHE_REFILL_PRF", .modmsk = ARMV8_ATTRS, .code = 0x0059, .desc = "This event counts L2D_CACHE_REFILL caused by software or hardware prefetch.", }, { .name = "LDREX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x006c, .desc = "This event counts architecturally executed load-exclusive instructions.", }, { .name = "STREX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x006f, .desc = "This event counts architecturally executed store-exclusive instructions.", }, { .name = "LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0070, .desc = "This event counts architecturally executed memory-reading instructions, as defined by the LD_RETIRED event.", }, { .name = "ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0071, .desc = "This event counts architecturally executed memory-writing instructions, as defined by the ST_RETIRED event. 
This event counts DCZVA as a store operation.", }, { .name = "LDST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0072, .desc = "This event counts architecturally executed memory-reading instructions and memory-writing instructions, as defined by the LD_RETIRED and ST_RETIRED events.", }, { .name = "DP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0073, .desc = "This event counts architecturally executed integer data-processing instructions.", }, { .name = "ASE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0074, .desc = "This event counts architecturally executed Advanced SIMD data-processing instructions.", }, { .name = "VFP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0075, .desc = "This event counts architecturally executed floating-point data-processing instructions.", }, { .name = "PC_WRITE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0076, .desc = "This event counts only software changes of the PC that defined by the instruction architecturally executed, condition code check pass and software change of the PC event.", }, { .name = "CRYPTO_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0077, .desc = "This event counts architecturally executed cryptographic instructions, except PMULL and VMULL.", }, { .name = "BR_IMMED_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0078, .desc = "This event counts architecturally executed immediate branch instructions.", }, { .name = "BR_RETURN_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0079, .desc = "This event counts architecturally executed procedure return operations that defined by the BR_RETURN_RETIRED event.", }, { .name = "BR_INDIRECT_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x007a, .desc = "This event counts architecturally executed indirect branch instructions that includes software change of the PC other than exception-generating instructions and immediate branch instructions.", }, { .name = "ISB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x007c, .desc = "This event counts architecturally executed Instruction Synchronization Barrier instructions.", }, { .name = 
"DSB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x007d, .desc = "This event counts architecturally executed Data Synchronization Barrier instructions.", }, { .name = "DMB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x007e, .desc = "This event counts architecturally executed Data Memory Barrier instructions, excluding the implied barrier operations of load/store operations with release consistency semantics.", }, { .name = "EXC_UNDEF", .modmsk = ARMV8_ATTRS, .code = 0x0081, .desc = "This event counts only other synchronous exceptions that are taken locally.", }, { .name = "EXC_SVC", .modmsk = ARMV8_ATTRS, .code = 0x0082, .desc = "This event counts only Supervisor Call exceptions that are taken locally.", }, { .name = "EXC_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x0083, .desc = "This event counts only Instruction Abort exceptions that are taken locally.", }, { .name = "EXC_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x0084, .desc = "This event counts only Data Abort or SError interrupt exceptions that are taken locally.", }, { .name = "EXC_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x0086, .desc = "This event counts only IRQ exceptions that are taken locally, including Virtual IRQ exceptions.", }, { .name = "EXC_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x0087, .desc = "This event counts only FIQ exceptions that are taken locally, including Virtual FIQ exceptions.", }, { .name = "EXC_SMC", .modmsk = ARMV8_ATTRS, .code = 0x0088, .desc = "This event counts only Secure Monitor Call exceptions. 
The counter does not increment on SMC instructions trapped as a Hyp Trap exception.", }, { .name = "EXC_HVC", .modmsk = ARMV8_ATTRS, .code = 0x008a, .desc = "This event counts for both Hypervisor Call exceptions taken locally in the hypervisor and those taken as an exception from Non-secure EL1.", }, { .name = "DCZVA_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x009f, .desc = "This event counts architecturally executed zero blocking operations due to the 'DC ZVA' instruction.", }, { .name = "FP_MV_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0105, .desc = "This event counts architecturally executed floating-point move operations.", }, { .name = "PRD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0108, .desc = "This event counts architecturally executed operations that using predicate register.", }, { .name = "IEL_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0109, .desc = "This event counts architecturally executed inter-element manipulation operations.", }, { .name = "IREG_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x010a, .desc = "This event counts architecturally executed inter-register manipulation operations.", }, { .name = "FP_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0112, .desc = "This event counts architecturally executed NOSIMD load operations that using SIMD and FP registers.", }, { .name = "FP_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0113, .desc = "This event counts architecturally executed NOSIMD store operations that using SIMD and FP registers.", }, { .name = "BC_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x011a, .desc = "This event counts architecturally executed SIMD broadcast floating-point load operations.", }, { .name = "EFFECTIVE_INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0121, .desc = "This event counts architecturally executed instructions, excluding the MOVPRFX instruction.", }, { .name = "PRE_INDEX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0123, .desc = "This event counts architecturally executed operations that uses 'pre-index' as its addressing mode.", }, { .name = 
"POST_INDEX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x0124, .desc = "This event counts architecturally executed operations that uses 'post-index' as its addressing mode.", }, { .name = "UOP_SPLIT", .modmsk = ARMV8_ATTRS, .code = 0x0139, .desc = "This event counts the occurrence count of the micro-operation split.", }, { .name = "LD_COMP_WAIT_L2_MISS", .modmsk = ARMV8_ATTRS, .code = 0x0180, .desc = "This event counts every cycle that no operation was committed because the oldest and uncommitted load/store operation waits for memory access.", }, { .name = "LD_COMP_WAIT_L2_MISS_EX", .modmsk = ARMV8_ATTRS, .code = 0x0181, .desc = "This event counts every cycle that no instructions are committed because the oldest and uncommitted integer load instruction waits for memory access.", }, { .name = "LD_COMP_WAIT_L1_MISS", .modmsk = ARMV8_ATTRS, .code = 0x0182, .desc = "This event counts every cycle that no instruction was committed because the oldest and uncommitted load/store operation waits for L2 cache access.", }, { .name = "LD_COMP_WAIT_L1_MISS_EX", .modmsk = ARMV8_ATTRS, .code = 0x0183, .desc = "This event counts every cycle that no instructions are committed because the oldest and uncommitted integer load instruction waits for L2 cache access.", }, { .name = "LD_COMP_WAIT", .modmsk = ARMV8_ATTRS, .code = 0x0184, .desc = "This event counts every cycle that no instruction was committed because the oldest and uncommitted load/store operation waits for L1D, L2 and memory access.", }, { .name = "LD_COMP_WAIT_EX", .modmsk = ARMV8_ATTRS, .code = 0x0185, .desc = "This event counts every cycle that no instructions are committed because the oldest and uncommitted integer load instruction waits for L1D, L2 and memory access.", }, { .name = "LD_COMP_WAIT_PFP_BUSY", .modmsk = ARMV8_ATTRS, .code = 0x0186, .desc = "This event counts every cycle that no instructions are committed due to the lack of an available prefetch port.", }, { .name = "LD_COMP_WAIT_PFP_BUSY_EX", .modmsk = 
ARMV8_ATTRS, .code = 0x0187, .desc = "This event counts the LD_COMP_WAIT_PFP_BUSY caused by an integer load operation.", }, { .name = "LD_COMP_WAIT_PFP_BUSY_SWPF", .modmsk = ARMV8_ATTRS, .code = 0x0188, .desc = "This event counts the LD_COMP_WAIT_PFP_BUSY caused by a software prefetch instruction.", }, { .name = "EU_COMP_WAIT", .modmsk = ARMV8_ATTRS, .code = 0x0189, .desc = "This event counts every cycle that no instructions are committed, and the oldest and uncommitted instruction is an integer or floating-point instruction.", }, { .name = "FL_COMP_WAIT", .modmsk = ARMV8_ATTRS, .code = 0x018a, .desc = "This event counts every cycle that no instructions are committed, and the oldest and uncommitted instruction is a floating-point instruction.", }, { .name = "BR_COMP_WAIT", .modmsk = ARMV8_ATTRS, .code = 0x018b, .desc = "This event counts every cycle that no instructions are committed, and the oldest and uncommitted instruction is a branch instruction.", }, { .name = "ROB_EMPTY", .modmsk = ARMV8_ATTRS, .code = 0x018c, .desc = "This event counts every cycle that no instructions are committed because the CSE is empty.", }, { .name = "ROB_EMPTY_STQ_BUSY", .modmsk = ARMV8_ATTRS, .code = 0x018d, .desc = "This event counts every cycle that no instructions are committed because the CSE is empty and the all store ports are full.", }, { .name = "WFE_WFI_CYCLE", .modmsk = ARMV8_ATTRS, .code = 0x018e, .desc = "This event counts every cycle that the WFE/WFI instruction brings the instruction unit to a halt.", }, { .name = "0INST_COMMIT", .modmsk = ARMV8_ATTRS, .code = 0x0190, .desc = "This event counts every cycle that no instructions are committed, but counts at the time when commits MOVPRFX only.", }, { .name = "1INST_COMMIT", .modmsk = ARMV8_ATTRS, .code = 0x0191, .desc = "This event counts every cycle that one instruction is committed.", }, { .name = "2INST_COMMIT", .modmsk = ARMV8_ATTRS, .code = 0x0192, .desc = "This event counts every cycle that two instructions are 
committed.", }, { .name = "3INST_COMMIT", .modmsk = ARMV8_ATTRS, .code = 0x0193, .desc = "This event counts every cycle that three instructions are committed.", }, { .name = "4INST_COMMIT", .modmsk = ARMV8_ATTRS, .code = 0x0194, .desc = "This event counts every cycle that four instructions are committed.", }, { .name = "UOP_ONLY_COMMIT", .modmsk = ARMV8_ATTRS, .code = 0x0198, .desc = "This event counts every cycle that only any micro-operations are committed.", }, { .name = "SINGLE_MOVPRFX_COMMIT", .modmsk = ARMV8_ATTRS, .code = 0x0199, .desc = "This event counts every cycle that only the MOVPRFX instruction is committed.", }, { .name = "EAGA_VAL", .modmsk = ARMV8_ATTRS, .code = 0x01a0, .desc = "This event counts valid cycles of EAGA pipeline.", }, { .name = "EAGB_VAL", .modmsk = ARMV8_ATTRS, .code = 0x01a1, .desc = "This event counts valid cycles of EAGB pipeline.", }, { .name = "EXA_VAL", .modmsk = ARMV8_ATTRS, .code = 0x01a2, .desc = "This event counts valid cycles of EXA pipeline.", }, { .name = "EXB_VAL", .modmsk = ARMV8_ATTRS, .code = 0x01a3, .desc = "This event counts valid cycles of EXB pipeline.", }, { .name = "FLA_VAL", .modmsk = ARMV8_ATTRS, .code = 0x01a4, .desc = "This event counts valid cycles of FLA pipeline.", }, { .name = "FLB_VAL", .modmsk = ARMV8_ATTRS, .code = 0x01a5, .desc = "This event counts valid cycles of FLB pipeline.", }, { .name = "PRX_VAL", .modmsk = ARMV8_ATTRS, .code = 0x01a6, .desc = "This event counts valid cycles of PRX pipeline.", }, { .name = "FLA_VAL_PRD_CNT", .modmsk = ARMV8_ATTRS, .code = 0x01b4, .desc = "This event counts the number of 1 in the predicate bits of request in FLA pipeline, and corrects itself to be 16 when all bits are 1.", }, { .name = "FLB_VAL_PRD_CNT", .modmsk = ARMV8_ATTRS, .code = 0x01b5, .desc = "This event counts the number of 1 in the predicate bits of request in FLB pipeline, and corrects itself to be 16 when all bits are 1.", }, { .name = "EA_CORE", .modmsk = ARMV8_ATTRS, .code = 0x01e0, .desc = "This 
event counts energy consumption per cycle of core.", }, { .name = "L1D_CACHE_REFILL_DM", .modmsk = ARMV8_ATTRS, .code = 0x0200, .desc = "This event counts L1D_CACHE_REFILL caused by demand access.", }, { .name = "L1D_CACHE_REFILL_HWPRF", .modmsk = ARMV8_ATTRS, .code = 0x0202, .desc = "This event counts L1D_CACHE_REFILL caused by hardware prefetch.", }, { .name = "L1_MISS_WAIT", .modmsk = ARMV8_ATTRS, .code = 0x0208, .desc = "This event counts outstanding L1D cache miss requests per cycle.", }, { .name = "L1I_MISS_WAIT", .modmsk = ARMV8_ATTRS, .code = 0x0209, .desc = "This event counts outstanding L1I cache miss requests per cycle.", }, { .name = "L1HWPF_STREAM_PF", .modmsk = ARMV8_ATTRS, .code = 0x0230, .desc = "This event counts streaming prefetch requests to L1D cache generated by hardware prefetcher.", }, { .name = "L1HWPF_INJ_ALLOC_PF", .modmsk = ARMV8_ATTRS, .code = 0x0231, .desc = "This event counts allocation type prefetch injection requests to L1D cache generated by hardware prefetcher.", }, { .name = "L1HWPF_INJ_NOALLOC_PF", .modmsk = ARMV8_ATTRS, .code = 0x0232, .desc = "This event counts non-allocation type prefetch injection requests to L1D cache generated by hardware prefetcher.", }, { .name = "L2HWPF_STREAM_PF", .modmsk = ARMV8_ATTRS, .code = 0x0233, .desc = "This event counts streaming prefetch requests to L2 cache generated by hardware prefetcher.", }, { .name = "L2HWPF_INJ_ALLOC_PF", .modmsk = ARMV8_ATTRS, .code = 0x0234, .desc = "This event counts allocation type prefetch injection requests to L2 cache generated by hardware prefetcher.", }, { .name = "L2HWPF_INJ_NOALLOC_PF", .modmsk = ARMV8_ATTRS, .code = 0x0235, .desc = "This event counts non-allocation type prefetch injection requests to L2 cache generated by hardware prefetcher.", }, { .name = "L2HWPF_OTHER", .modmsk = ARMV8_ATTRS, .code = 0x0236, .desc = "This event counts prefetch requests to L2 cache generated by the other causes.", }, { .name = "L1_PIPE0_VAL", .modmsk = ARMV8_ATTRS, .code = 
0x0240, .desc = "This event counts valid cycles of L1D cache pipeline#0.", }, { .name = "L1_PIPE1_VAL", .modmsk = ARMV8_ATTRS, .code = 0x0241, .desc = "This event counts valid cycles of L1D cache pipeline#1.", }, { .name = "L1_PIPE0_VAL_IU_TAG_ADRS_SCE", .modmsk = ARMV8_ATTRS, .code = 0x0250, .desc = "This event counts requests in L1D cache pipeline#0 that its sce bit of tagged address is 1.", }, { .name = "L1_PIPE0_VAL_IU_TAG_ADRS_PFE", .modmsk = ARMV8_ATTRS, .code = 0x0251, .desc = "This event counts requests in L1D cache pipeline#0 that its pfe bit of tagged address is 1.", }, { .name = "L1_PIPE1_VAL_IU_TAG_ADRS_SCE", .modmsk = ARMV8_ATTRS, .code = 0x0252, .desc = "This event counts requests in L1D cache pipeline#1 that its sce bit of tagged address is 1.", }, { .name = "L1_PIPE1_VAL_IU_TAG_ADRS_PFE", .modmsk = ARMV8_ATTRS, .code = 0x0253, .desc = "This event counts requests in L1D cache pipeline#1 that its pfe bit of tagged address is 1.", }, { .name = "L1_PIPE0_COMP", .modmsk = ARMV8_ATTRS, .code = 0x0260, .desc = "This event counts completed requests in L1D cache pipeline#0.", }, { .name = "L1_PIPE1_COMP", .modmsk = ARMV8_ATTRS, .code = 0x0261, .desc = "This event counts completed requests in L1D cache pipeline#1.", }, { .name = "L1I_PIPE_COMP", .modmsk = ARMV8_ATTRS, .code = 0x0268, .desc = "This event counts completed requests in L1I cache pipeline.", }, { .name = "L1I_PIPE_VAL", .modmsk = ARMV8_ATTRS, .code = 0x0269, .desc = "This event counts valid cycles of L1I cache pipeline.", }, { .name = "L1_PIPE_ABORT_STLD_INTLK", .modmsk = ARMV8_ATTRS, .code = 0x0274, .desc = "This event counts aborted requests in L1D pipelines that due to store-load interlock.", }, { .name = "L1_PIPE0_VAL_IU_NOT_SEC0", .modmsk = ARMV8_ATTRS, .code = 0x02a0, .desc = "This event counts requests in L1D cache pipeline#0 that its sector cache ID is not 0.", }, { .name = "L1_PIPE1_VAL_IU_NOT_SEC0", .modmsk = ARMV8_ATTRS, .code = 0x02a1, .desc = "This event counts requests in L1D cache 
pipeline#1 that its sector cache ID is not 0.", }, { .name = "L1_PIPE_COMP_GATHER_2FLOW", .modmsk = ARMV8_ATTRS, .code = 0x02b0, .desc = "This event counts the number of times where 2 elements of the gather instructions became 2flows because 2 elements could not be combined.", }, { .name = "L1_PIPE_COMP_GATHER_1FLOW", .modmsk = ARMV8_ATTRS, .code = 0x02b1, .desc = "This event counts the number of times where 2 elements of the gather instructions became 1flow because 2 elements could be combined.", }, { .name = "L1_PIPE_COMP_GATHER_0FLOW", .modmsk = ARMV8_ATTRS, .code = 0x02b2, .desc = "This event counts the number of times where 2 elements of the gather instructions became 0flow because both predicate values are 0.", }, { .name = "L1_PIPE_COMP_SCATTER_1FLOW", .modmsk = ARMV8_ATTRS, .code = 0x02b3, .desc = "This event counts the number of flows of the scatter instructions.", }, { .name = "L1_PIPE0_COMP_PRD_CNT", .modmsk = ARMV8_ATTRS, .code = 0x02b8, .desc = "This event counts the number of 1 in the predicate bits of request in L1D cache pipeline#0, and corrects itself to be 16 when all bits are 1.", }, { .name = "L1_PIPE1_COMP_PRD_CNT", .modmsk = ARMV8_ATTRS, .code = 0x02b9, .desc = "This event counts the number of 1 in the predicate bits of request in L1D cache pipeline#1, and corrects itself to be 16 when all bits are 1.", }, { .name = "L2D_CACHE_REFILL_DM", .modmsk = ARMV8_ATTRS, .code = 0x0300, .desc = "This event counts L2D_CACHE_REFILL caused by demand access.", }, { .name = "L2D_CACHE_REFILL_HWPRF", .modmsk = ARMV8_ATTRS, .code = 0x0302, .desc = "This event counts L2D_CACHE_REFILL caused by hardware prefetch.", }, { .name = "L2_MISS_WAIT", .modmsk = ARMV8_ATTRS, .code = 0x0308, .desc = "This event counts outstanding L2 cache miss requests per cycle. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "L2_MISS_COUNT", .modmsk = ARMV8_ATTRS, .code = 0x0309, .desc = "This event counts the number of times of L2 cache miss. 
It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "BUS_READ_TOTAL_CMG0", .modmsk = ARMV8_ATTRS, .code = 0x0310, .desc = "This event counts read requests from CMG0 to measured CMG, if measured CMG is not CMG0. Otherwise, this event counts read requests from CMG0 local memory to measured CMG. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "BUS_READ_TOTAL_CMG1", .modmsk = ARMV8_ATTRS, .code = 0x0311, .desc = "This event counts read requests from CMG1 to measured CMG, if measured CMG is not CMG1. Otherwise, this event counts read requests from CMG1 local memory to measured CMG. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "BUS_READ_TOTAL_CMG2", .modmsk = ARMV8_ATTRS, .code = 0x0312, .desc = "This event counts read requests from CMG2 to measured CMG, if measured CMG is not CMG2. Otherwise, this event counts read requests from CMG2 local memory to measured CMG. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "BUS_READ_TOTAL_CMG3", .modmsk = ARMV8_ATTRS, .code = 0x0313, .desc = "This event counts read requests from CMG3 to measured CMG, if measured CMG is not CMG3. Otherwise, this event counts read requests from CMG3 local memory to measured CMG. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "BUS_READ_TOTAL_TOFU", .modmsk = ARMV8_ATTRS, .code = 0x0314, .desc = "This event counts read requests from tofu controller to measured CMG. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "BUS_READ_TOTAL_PCI", .modmsk = ARMV8_ATTRS, .code = 0x0315, .desc = "This event counts read requests from PCI controller to measured CMG. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "BUS_READ_TOTAL_MEM", .modmsk = ARMV8_ATTRS, .code = 0x0316, .desc = "This event counts read requests from measured CMG local memory to measured CMG. 
It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "BUS_WRITE_TOTAL_CMG0", .modmsk = ARMV8_ATTRS, .code = 0x0318, .desc = "This event counts write requests from measured CMG to CMG0, if measured CMG is not CMG0. Otherwise, this event counts write requests from measured CMG to CMG0 local memory. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "BUS_WRITE_TOTAL_CMG1", .modmsk = ARMV8_ATTRS, .code = 0x0319, .desc = "This event counts write requests from measured CMG to CMG1, if measured CMG is not CMG1. Otherwise, this event counts write requests from measured CMG to CMG1 local memory. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "BUS_WRITE_TOTAL_CMG2", .modmsk = ARMV8_ATTRS, .code = 0x031a, .desc = "This event counts write requests from measured CMG to CMG2, if measured CMG is not CMG2. Otherwise, this event counts write requests from measured CMG to CMG2 local memory. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "BUS_WRITE_TOTAL_CMG3", .modmsk = ARMV8_ATTRS, .code = 0x031b, .desc = "This event counts write requests from measured CMG to CMG3, if measured CMG is not CMG3. Otherwise, this event counts write requests from measured CMG to CMG3 local memory. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "BUS_WRITE_TOTAL_TOFU", .modmsk = ARMV8_ATTRS, .code = 0x031c, .desc = "This event counts write requests from measured CMG to tofu controller. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "BUS_WRITE_TOTAL_PCI", .modmsk = ARMV8_ATTRS, .code = 0x031d, .desc = "This event counts write requests from measured CMG to PCI controller. 
It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "BUS_WRITE_TOTAL_MEM", .modmsk = ARMV8_ATTRS, .code = 0x031e, .desc = "This event counts write requests from measured CMG to measured CMG local memory. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "L2D_SWAP_DM", .modmsk = ARMV8_ATTRS, .code = 0x0325, .desc = "This event counts operations where demand access hits an L2 cache refill buffer allocated by software or hardware prefetch.", }, { .name = "L2D_CACHE_MIBMCH_PRF", .modmsk = ARMV8_ATTRS, .code = 0x0326, .desc = "This event counts operations where software or hardware prefetch hits an L2 cache refill buffer allocated by demand access.", }, { .name = "L2_PIPE_VAL", .modmsk = ARMV8_ATTRS, .code = 0x0330, .desc = "This event counts valid cycles of L2 cache pipeline. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "L2_PIPE_COMP_ALL", .modmsk = ARMV8_ATTRS, .code = 0x0350, .desc = "This event counts completed requests in L2 cache pipeline. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "L2_PIPE_COMP_PF_L2MIB_MCH", .modmsk = ARMV8_ATTRS, .code = 0x0370, .desc = "This event counts operations where software or hardware prefetch hits an L2 cache refill buffer allocated by demand access. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "L2D_CACHE_SWAP_LOCAL", .modmsk = ARMV8_ATTRS, .code = 0x0396, .desc = "This event counts operations where demand access hits an L2 cache refill buffer allocated by software or hardware prefetch. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "EA_L2", .modmsk = ARMV8_ATTRS, .code = 0x03e0, .desc = "This event counts energy consumption per cycle of L2 cache. 
It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "EA_MEMORY", .modmsk = ARMV8_ATTRS, .code = 0x03e8, .desc = "This event counts energy consumption per cycle of CMG local memory. It counts all events caused in measured CMG regardless of measured PE.", }, { .name = "SIMD_INST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8000, .desc = "This event counts architecturally executed SIMD instructions, excluding the Advanced SIMD scalar instructions and the instructions listed in Non-SIMD SVE instructions section of SVE Reference Manual.", }, { .name = "SVE_INST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8002, .desc = "This event counts architecturally executed SVE instructions, including the instructions listed in Non-SIMD SVE instructions section of SVE Reference Manual.", }, { .name = "UOP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8008, .desc = "This event counts all architecturally executed micro-operations.", }, { .name = "SVE_MATH_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x800e, .desc = "This event counts architecturally executed math function operations due to the SVE FTSMUL, FTMAD, FTSSEL, and FEXPA instructions.", }, { .name = "FP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8010, .desc = "This event counts architecturally executed operations due to scalar, Advanced SIMD, and SVE instructions listed in Floating-point instructions section of SVE Reference Manual.", }, { .name = "FP_FMA_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8028, .desc = "This event counts architecturally executed floating-point fused multiply-add and multiply-subtract operations.", }, { .name = "FP_RECPE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8034, .desc = "This event counts architecturally executed floating-point reciprocal estimate operations due to the Advanced SIMD scalar, Advanced SIMD vector, and SVE FRECPE and FRSQRTE instructions.", }, { .name = "FP_CVT_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8038, .desc = "This event counts architecturally executed 
floating-point convert operations due to the scalar, Advanced SIMD, and SVE floating-point conversion instructions listed in Floating-point conversions section of SVE Reference Manual.", }, { .name = "ASE_SVE_INT_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8043, .desc = "This event counts architecturally executed integer arithmetic operations due to Advanced SIMD and SVE data-processing instructions listed in Integer instructions section of SVE Reference Manual.", }, { .name = "SVE_PRED_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8074, .desc = "This event counts architecturally executed SIMD data-processing and load/store operations due to SVE instructions with a Governing predicate operand that determines the Active elements.", }, { .name = "SVE_MOVPRFX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x807c, .desc = "This event counts architecturally executed operations due to MOVPRFX instructions, whether or not they are fused with the prefixed instruction.", }, { .name = "SVE_MOVPRFX_U_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x807f, .desc = "This event counts architecturally executed operations due to MOVPRFX instructions that are not fused with the prefixed instruction.", }, { .name = "ASE_SVE_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8085, .desc = "This event counts architecturally executed operations that read from memory due to SVE and Advanced SIMD load instructions.", }, { .name = "ASE_SVE_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8086, .desc = "This event counts architecturally executed operations that write to memory due to SVE and Advanced SIMD store instructions.", }, { .name = "PRF_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8087, .desc = "This event counts architecturally executed prefetch operations due to scalar PRFM and SVE PRF instructions.", }, { .name = "BASE_LD_REG_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8089, .desc = "This event counts architecturally executed operations that read from memory due to an instruction that loads a general-purpose register.", }, { 
.name = "BASE_ST_REG_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x808a, .desc = "This event counts architecturally executed operations that write to memory due to an instruction that stores a general-purpose register, excluding the 'DC ZVA' instruction.", }, { .name = "SVE_LDR_REG_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8091, .desc = "This event counts architecturally executed operations that read from memory due to an SVE LDR instruction.", }, { .name = "SVE_STR_REG_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8092, .desc = "This event counts architecturally executed operations that write to memory due to an SVE STR instruction.", }, { .name = "SVE_LDR_PREG_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8095, .desc = "This event counts architecturally executed operations that read from memory due to an SVE LDR (predicate) instruction.", }, { .name = "SVE_STR_PREG_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8096, .desc = "This event counts architecturally executed operations that write to memory due to an SVE STR (predicate) instruction.", }, { .name = "SVE_PRF_CONTIG_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x809f, .desc = "This event counts architecturally executed operations that prefetch memory due to an SVE predicated single contiguous element prefetch instruction.", }, { .name = "ASE_SVE_LD_MULTI_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80a5, .desc = "This event counts architecturally executed operations that read from memory due to SVE and Advanced SIMD multiple vector contiguous structure load instructions.", }, { .name = "ASE_SVE_ST_MULTI_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80a6, .desc = "This event counts architecturally executed operations that write to memory due to SVE and Advanced SIMD multiple vector contiguous structure store instructions.", }, { .name = "SVE_LD_GATHER_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80ad, .desc = "This event counts architecturally executed operations that read from memory due to SVE noncontiguous gather-load instructions.", }, { .name = 
"SVE_ST_SCATTER_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80ae, .desc = "This event counts architecturally executed operations that write to memory due to SVE noncontiguous scatter-store instructions.", }, { .name = "SVE_PRF_GATHER_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80af, .desc = "This event counts architecturally executed operations that prefetch memory due to SVE noncontiguous gather-prefetch instructions.", }, { .name = "SVE_LDFF_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80bc, .desc = "This event counts architecturally executed memory read operations due to SVE First-fault and Non-fault load instructions.", }, { .name = "FP_SCALE_OPS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80c0, .desc = "This event counts architecturally executed SVE arithmetic operations. This event counter is incremented by (128 / CSIZE) and by twice that amount for operations that would also be counted by SVE_FP_FMA_SPEC.", }, { .name = "FP_FIXED_OPS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80c1, .desc = "This event counts architecturally executed v8SIMD and FP arithmetic operations. The event counter is incremented by the specified number of elements for Advanced SIMD operations or by 1 for scalar operations, and by twice those amounts for operations that would also be counted by FP_FMA_SPEC.", }, { .name = "FP_HP_SCALE_OPS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80c2, .desc = "This event counts architecturally executed SVE half-precision arithmetic operations. This event counter is incremented by 8, or by 16 for operations that would also be counted by SVE_FP_FMA_SPEC.", }, { .name = "FP_HP_FIXED_OPS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80c3, .desc = "This event counts architecturally executed v8SIMD and FP half-precision arithmetic operations. 
This event counter is incremented by the number of 16-bit elements for Advanced SIMD operations, or by 1 for scalar operations, and by twice those amounts for operations that would also be counted by FP_FMA_SPEC.", }, { .name = "FP_SP_SCALE_OPS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80c4, .desc = "This event counts architecturally executed SVE single-precision arithmetic operations. This event counter is incremented by 4, or by 8 for operations that would also be counted by SVE_FP_FMA_SPEC.", }, { .name = "FP_SP_FIXED_OPS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80c5, .desc = "This event counts architecturally executed v8SIMD and FP single-precision arithmetic operations. This event counter is incremented by the number of 32-bit elements for Advanced SIMD operations, or by 1 for scalar operations, and by twice those amounts for operations that would also be counted by FP_FMA_SPEC.", }, { .name = "FP_DP_SCALE_OPS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80c6, .desc = "This event counts architecturally executed SVE double-precision arithmetic operations. This event counter is incremented by 2, or by 4 for operations that would also be counted by SVE_FP_FMA_SPEC.", }, { .name = "FP_DP_FIXED_OPS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80c7, .desc = "This event counts architecturally executed v8SIMD and FP double-precision arithmetic operations. This event counter is incremented by 2 for Advanced SIMD operations, or by 1 for scalar operations, and by twice those amounts for operations that would also be counted by FP_FMA_SPEC.", }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_fujitsu_monaka_events.h /* * Copyright (c) 2024 Fujitsu Limited. All rights reserved. 
*/ /* * Fujitsu FUJITSU-MONAKA processor * * FUJITSU-MONAKA Specification * Fujitsu Limited * 1.0, 30 September 2024 */ static const arm_entry_t arm_monaka_pe[ ] = { { .name = "SW_INCR", .modmsk = ARMV9_ATTRS, .code = 0x0000, .desc = "This event counts on writes to the PMSWINC register.", }, { .name = "L1I_CACHE_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x0001, .desc = "This event counts operations that cause a refill of the L1I cache. See L1I_CACHE_REFILL of ARMv9 Reference Manual for more information.", }, { .name = "L1I_TLB_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x0002, .desc = "This event counts operations that cause a TLB refill of the L1I TLB. See L1I_TLB_REFILL of ARMv9 Reference Manual for more information.", }, { .name = "L1D_CACHE_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x0003, .desc = "This event counts operations that cause a refill of the L1D cache. See L1D_CACHE_REFILL of ARMv9 Reference Manual for more information.", }, { .name = "L1D_CACHE", .modmsk = ARMV9_ATTRS, .code = 0x0004, .desc = "This event counts operations that cause a cache access to the L1D cache. See L1D_CACHE of ARMv9 Reference Manual for more information.", }, { .name = "L1D_TLB_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x0005, .desc = "This event counts operations that cause a TLB refill of the L1D TLB. 
See L1D_TLB_REFILL of ARMv9 Reference Manual for more information.", }, { .name = "INST_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x0008, .desc = "This event counts every architecturally executed instruction.", }, { .name = "EXC_TAKEN", .modmsk = ARMV9_ATTRS, .code = 0x0009, .desc = "This event counts each exception taken.", }, { .name = "EXC_RETURN", .modmsk = ARMV9_ATTRS, .code = 0x000a, .desc = "This event counts each executed exception return instruction.", }, { .name = "CID_WRITE_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x000b, .desc = "This event counts every write to CONTEXTIDR.", }, { .name = "BR_MIS_PRED", .modmsk = ARMV9_ATTRS, .code = 0x0010, .desc = "This event counts each correction to the predicted program flow that occurs because of a misprediction from, or no prediction from, the branch prediction resources and that relates to instructions that the branch prediction resources are capable of predicting.", }, { .name = "CPU_CYCLES", .modmsk = ARMV9_ATTRS, .code = 0x0011, .desc = "This event counts every cycle.", }, { .name = "BR_PRED", .modmsk = ARMV9_ATTRS, .code = 0x0012, .desc = "This event counts every branch or other change in the program flow that the branch prediction resources are capable of predicting.", }, { .name = "MEM_ACCESS", .modmsk = ARMV9_ATTRS, .code = 0x0013, .desc = "This event counts architecturally executed memory-reading instructions and memory-writing instructions, as defined by the LDST_SPEC events.", }, { .name = "L1I_CACHE", .modmsk = ARMV9_ATTRS, .code = 0x0014, .desc = "This event counts operations that cause a cache access to the L1I cache. See L1I_CACHE of ARMv9 Reference Manual for more information.", }, { .name = "L1D_CACHE_WB", .modmsk = ARMV9_ATTRS, .code = 0x0015, .desc = "This event counts every write-back of data from the L1D cache. 
See L1D_CACHE_WB of ARMv9 Reference Manual for more information.", }, { .name = "L2D_CACHE", .modmsk = ARMV9_ATTRS, .code = 0x0016, .desc = "This event counts operations that cause a cache access to the L2 cache. See L2D_CACHE of ARMv9 Reference Manual for more information.", }, { .name = "L2D_CACHE_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x0017, .desc = "This event counts operations that cause a refill of the L2 cache. See L2D_CACHE_REFILL of ARMv9 Reference Manual for more information.", }, { .name = "L2D_CACHE_WB", .modmsk = ARMV9_ATTRS, .code = 0x0018, .desc = "This event counts every write-back of data from the L2 cache caused by L2 replace, non-temporal-store and DC ZVA.", }, { .name = "INST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x001b, .desc = "This event counts every architecturally executed instruction.", }, { .name = "BR_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x0021, .desc = "This event counts architecturally executed branch instructions.", }, { .name = "BR_MIS_PRED_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x0022, .desc = "This event counts architecturally executed branch instructions that were mispredicted.", }, { .name = "STALL_FRONTEND", .modmsk = ARMV9_ATTRS, .code = 0x0023, .desc = "This event counts every cycle counted by the CPU_CYCLES event on which no operation was issued because there are no operations available to issue for this PE from the frontend.", }, { .name = "STALL_BACKEND", .modmsk = ARMV9_ATTRS, .code = 0x0024, .desc = "This event counts every cycle counted by the CPU_CYCLES event on which no operation was issued because the backend is unable to accept any operations.", }, { .name = "L1D_TLB", .modmsk = ARMV9_ATTRS, .code = 0x0025, .desc = "This event counts operations that cause a TLB access to the L1D TLB. See L1D_TLB of ARMv9 Reference Manual for more information.", }, { .name = "L1I_TLB", .modmsk = ARMV9_ATTRS, .code = 0x0026, .desc = "This event counts operations that cause a TLB access to the L1I TLB. 
See L1I_TLB of ARMv9 Reference Manual for more information.", }, { .name = "L3D_CACHE", .modmsk = ARMV9_ATTRS, .code = 0x002b, .desc = "This event counts operations that cause a cache access to the L3 cache, as defined by the sum of L2D_CACHE_REFILL_L3D_CACHE and L2D_CACHE_WB_VICTIM_CLEAN events.", }, { .name = "L2D_TLB_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x002d, .desc = "This event counts operations that cause a TLB refill of the L2D TLB. See L2D_TLB_REFILL of ARMv9 Reference Manual for more information.", }, { .name = "L2I_TLB_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x002e, .desc = "This event counts operations that cause a TLB refill of the L2I TLB. See L2I_TLB_REFILL of ARMv9 Reference Manual for more information.", }, { .name = "L2D_TLB", .modmsk = ARMV9_ATTRS, .code = 0x002f, .desc = "This event counts operations that cause a TLB access to the L2D TLB. See L2D_TLB of ARMv9 Reference Manual for more information.", }, { .name = "L2I_TLB", .modmsk = ARMV9_ATTRS, .code = 0x0030, .desc = "This event counts operations that cause a TLB access to the L2I TLB. 
See L2I_TLB of ARMv9 Reference Manual for more information.", }, { .name = "DTLB_WALK", .modmsk = ARMV9_ATTRS, .code = 0x0034, .desc = "This event counts data TLB accesses with at least one translation table walk.", }, { .name = "ITLB_WALK", .modmsk = ARMV9_ATTRS, .code = 0x0035, .desc = "This event counts instruction TLB accesses with at least one translation table walk.", }, { .name = "LL_CACHE_RD", .modmsk = ARMV9_ATTRS, .code = 0x0036, .desc = "This event counts access counted by L3D_CACHE that is a Memory-read operation, as defined by the L2D_CACHE_REFILL_L3D_CACHE events.", }, { .name = "LL_CACHE_MISS_RD", .modmsk = ARMV9_ATTRS, .code = 0x0037, .desc = "This event counts access counted by L3D_CACHE that is not completed by the L3D cache, and a Memory-read operation, as defined by the L2D_CACHE_REFILL_L3D_MISS events.", }, { .name = "L1D_CACHE_LMISS_RD", .modmsk = ARMV9_ATTRS, .code = 0x0039, .desc = "This event counts operations that cause a refill of the L1D cache that incurs additional latency.", }, { .name = "OP_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x003a, .desc = "This event counts every architecturally executed micro-operation.", }, { .name = "OP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x003b, .desc = "This event counts every speculatively executed micro-operation.", }, { .name = "STALL", .modmsk = ARMV9_ATTRS, .code = 0x003c, .desc = "This event counts every cycle that no instruction was dispatched from decode unit.", }, { .name = "STALL_SLOT_BACKEND", .modmsk = ARMV9_ATTRS, .code = 0x003d, .desc = "This event counts every cycle that no instruction was dispatched from decode unit due to the backend.", }, { .name = "STALL_SLOT_FRONTEND", .modmsk = ARMV9_ATTRS, .code = 0x003e, .desc = "This event counts every cycle that no instruction was dispatched from decode unit due to the frontend.", }, { .name = "STALL_SLOT", .modmsk = ARMV9_ATTRS, .code = 0x003f, .desc = "This event counts every cycle that no instruction or operation slot was dispatched from decode 
unit.", }, { .name = "L1D_CACHE_RD", .modmsk = ARMV9_ATTRS, .code = 0x0040, .desc = "This event counts L1D CACHE caused by read access.", }, { .name = "L1D_CACHE_WR", .modmsk = ARMV9_ATTRS, .code = 0x0041, .desc = "This event counts L1D CACHE caused by write access.", }, { .name = "L1D_CACHE_REFILL_RD", .modmsk = ARMV9_ATTRS, .code = 0x0042, .desc = "This event counts L1D_CACHE_REFILL caused by read access.", }, { .name = "L1D_CACHE_REFILL_WR", .modmsk = ARMV9_ATTRS, .code = 0x0043, .desc = "This event counts L1D_CACHE_REFILL caused by write access.", }, { .name = "L2D_CACHE_RD", .modmsk = ARMV9_ATTRS, .code = 0x0050, .desc = "This event counts L2D CACHE caused by read access.", }, { .name = "L2D_CACHE_WR", .modmsk = ARMV9_ATTRS, .code = 0x0051, .desc = "This event counts L2D CACHE caused by write access.", }, { .name = "L2D_CACHE_REFILL_RD", .modmsk = ARMV9_ATTRS, .code = 0x0052, .desc = "This event counts L2D CACHE_REFILL caused by read access.", }, { .name = "L2D_CACHE_REFILL_WR", .modmsk = ARMV9_ATTRS, .code = 0x0053, .desc = "This event counts L2D CACHE_REFILL caused by write access.", }, { .name = "L2D_CACHE_WB_VICTIM", .modmsk = ARMV9_ATTRS, .code = 0x0056, .desc = "This event counts every write-back of data from the L2 cache caused by L2 replace.", }, { .name = "MEM_ACCESS_RD", .modmsk = ARMV9_ATTRS, .code = 0x0066, .desc = "This event counts architecturally executed memory-reading instructions, as defined by the LD_SPEC events.", }, { .name = "LDREX_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x006c, .desc = "This event counts architecturally executed load-exclusive instructions.", }, { .name = "STREX_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x006f, .desc = "This event counts architecturally executed store-exclusive instructions.", }, { .name = "LD_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0070, .desc = "This event counts architecturally executed memory-reading instructions, as defined by the LD_RETIRED event.", }, { .name = "ST_SPEC", .modmsk = ARMV9_ATTRS, .code = 
0x0071, .desc = "This event counts architecturally executed memory-writing instructions, as defined by the ST_RETIRED event. This event counts DCZVA as a store operation.", }, { .name = "LDST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0072, .desc = "This event counts architecturally executed memory-reading instructions and memory-writing instructions, as defined by the LD_RETIRED and ST_RETIRED events.", }, { .name = "DP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0073, .desc = "This event counts architecturally executed integer data-processing instructions. See DP_SPEC of ARMv9 Reference Manual for more information.", }, { .name = "ASE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0074, .desc = "This event counts architecturally executed Advanced SIMD data-processing instructions.", }, { .name = "VFP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0075, .desc = "This event counts architecturally executed floating-point data-processing instructions.", }, { .name = "PC_WRITE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0076, .desc = "This event counts only software changes of the PC, as defined by the 'instruction architecturally executed, condition code check pass, software change of the PC' event.", }, { .name = "CRYPTO_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0077, .desc = "This event counts architecturally executed cryptographic instructions, except PMULL and VMULL.", }, { .name = "BR_IMMED_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0078, .desc = "This event counts architecturally executed immediate branch instructions.", }, { .name = "BR_RETURN_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0079, .desc = "This event counts architecturally executed procedure return operations that are defined by the BR_RETURN_RETIRED event.", }, { .name = "BR_INDIRECT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x007a, .desc = "This event counts architecturally executed indirect branch instructions that include software changes of the PC other than exception-generating instructions and immediate branch instructions.", }, { 
.name = "ISB_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x007c, .desc = "This event counts architecturally executed Instruction Synchronization Barrier instructions.", }, { .name = "DSB_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x007d, .desc = "This event counts architecturally executed Data Synchronization Barrier instructions.", }, { .name = "DMB_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x007e, .desc = "This event counts architecturally executed Data Memory Barrier instructions, excluding the implied barrier operations of load/store operations with release consistency semantics.", }, { .name = "CSDB_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x007f, .desc = "This event counts speculatively executed control speculation barrier instructions.", }, { .name = "EXC_UNDEF", .modmsk = ARMV9_ATTRS, .code = 0x0081, .desc = "This event counts only other synchronous exceptions that are taken locally.", }, { .name = "EXC_SVC", .modmsk = ARMV9_ATTRS, .code = 0x0082, .desc = "This event counts only Supervisor Call exceptions that are taken locally.", }, { .name = "EXC_PABORT", .modmsk = ARMV9_ATTRS, .code = 0x0083, .desc = "This event counts only Instruction Abort exceptions that are taken locally.", }, { .name = "EXC_DABORT", .modmsk = ARMV9_ATTRS, .code = 0x0084, .desc = "This event counts only Data Abort or SError interrupt exceptions that are taken locally.", }, { .name = "EXC_IRQ", .modmsk = ARMV9_ATTRS, .code = 0x0086, .desc = "This event counts only IRQ exceptions that are taken locally, including Virtual IRQ exceptions.", }, { .name = "EXC_FIQ", .modmsk = ARMV9_ATTRS, .code = 0x0087, .desc = "This event counts only FIQ exceptions that are taken locally, including Virtual FIQ exceptions.", }, { .name = "EXC_SMC", .modmsk = ARMV9_ATTRS, .code = 0x0088, .desc = "This event counts only Secure Monitor Call exceptions. 
The counter does not increment on SMC instructions trapped as a Hyp Trap exception.", }, { .name = "EXC_HVC", .modmsk = ARMV9_ATTRS, .code = 0x008a, .desc = "This event counts for both Hypervisor Call exceptions taken locally in the hypervisor and those taken as an exception from Non-secure EL1.", }, { .name = "L3D_CACHE_RD", .modmsk = ARMV9_ATTRS, .code = 0x00a0, .desc = "This event counts access counted by L3D_CACHE that is a Memory-read operation, as defined by the L2D_CACHE_REFILL_L3D_CACHE events.", }, { .name = "FP_MV_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0105, .desc = "This event counts architecturally executed floating-point move operations.", }, { .name = "PRD_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0108, .desc = "This event counts architecturally executed operations that use the predicate register.", }, { .name = "IEL_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0109, .desc = "This event counts architecturally executed inter-element manipulation operations.", }, { .name = "IREG_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x010a, .desc = "This event counts architecturally executed inter-register manipulation operations.", }, { .name = "FP_LD_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0112, .desc = "This event counts architecturally executed NOSIMD load operations that use SIMD&FP registers.", }, { .name = "FP_ST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0113, .desc = "This event counts architecturally executed NOSIMD store operations that use SIMD&FP registers.", }, { .name = "BC_LD_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x011a, .desc = "This event counts architecturally executed SIMD broadcast floating-point load operations.", }, { .name = "DCZVA_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x011b, .desc = "This event counts architecturally executed zero blocking operations due to the DC ZVA instruction.", }, { .name = "EFFECTIVE_INST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0121, .desc = "This event counts architecturally executed instructions, excluding the MOVPRFX 
instruction.", }, { .name = "PRE_INDEX_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0123, .desc = "This event counts architecturally executed operations that uses pre-index as its addressing mode.", }, { .name = "POST_INDEX_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x0124, .desc = "This event counts architecturally executed operations that uses post-index as its addressing mode.", }, { .name = "UOP_SPLIT", .modmsk = ARMV9_ATTRS, .code = 0x0139, .desc = "This event counts the occurrence count of the micro-operation split.", }, { .name = "LD_COMP_WAIT_L1_MISS", .modmsk = ARMV9_ATTRS, .code = 0x0182, .desc = "This event counts every cycle that no instruction was committed because the oldest and uncommitted load/store/prefetch operation waits for L2 cache access.", }, { .name = "LD_COMP_WAIT_L1_MISS_EX", .modmsk = ARMV9_ATTRS, .code = 0x0183, .desc = "This event counts every cycle that no instruction was committed because the oldest and uncommitted integer load operation waits for L2 cache access.", }, { .name = "LD_COMP_WAIT", .modmsk = ARMV9_ATTRS, .code = 0x0184, .desc = "This event counts every cycle that no instruction was committed because the oldest and uncommitted load/store/prefetch operation waits for L1D cache, L2 cache and memory access.", }, { .name = "LD_COMP_WAIT_EX", .modmsk = ARMV9_ATTRS, .code = 0x0185, .desc = "This event counts every cycle that no instruction was committed because the oldest and uncommitted integer load operation waits for L1D cache, L2 cache and memory access.", }, { .name = "LD_COMP_WAIT_PFP_BUSY", .modmsk = ARMV9_ATTRS, .code = 0x0186, .desc = "This event counts every cycle that no instruction was committed due to the lack of an available prefetch port.", }, { .name = "LD_COMP_WAIT_PFP_BUSY_EX", .modmsk = ARMV9_ATTRS, .code = 0x0187, .desc = "This event counts the LD_COMP_WAIT_PFP_BUSY caused by an integer load operation.", }, { .name = "LD_COMP_WAIT_PFP_BUSY_SWPF", .modmsk = ARMV9_ATTRS, .code = 0x0188, .desc = "This event counts the 
LD_COMP_WAIT_PFP_BUSY caused by a software prefetch instruction.", }, { .name = "EU_COMP_WAIT", .modmsk = ARMV9_ATTRS, .code = 0x0189, .desc = "This event counts every cycle that no instruction was committed and the oldest and uncommitted instruction is an integer or floating-point/SIMD instruction.", }, { .name = "FL_COMP_WAIT", .modmsk = ARMV9_ATTRS, .code = 0x018a, .desc = "This event counts every cycle that no instruction was committed and the oldest and uncommitted instruction is a floating-point/SIMD instruction.", }, { .name = "BR_COMP_WAIT", .modmsk = ARMV9_ATTRS, .code = 0x018b, .desc = "This event counts every cycle that no instruction was committed and the oldest and uncommitted instruction is a branch instruction.", }, { .name = "ROB_EMPTY", .modmsk = ARMV9_ATTRS, .code = 0x018c, .desc = "This event counts every cycle that no instruction was committed because the CSE is empty.", }, { .name = "ROB_EMPTY_STQ_BUSY", .modmsk = ARMV9_ATTRS, .code = 0x018d, .desc = "This event counts every cycle that no instruction was committed because the CSE is empty and the store port (SP) is full.", }, { .name = "WFE_WFI_CYCLE", .modmsk = ARMV9_ATTRS, .code = 0x018e, .desc = "This event counts every cycle that the instruction unit is halted by the WFE/WFI instruction.", }, { .name = "RETENTION_CYCLE", .modmsk = ARMV9_ATTRS, .code = 0x018f, .desc = "This event counts every cycle that the instruction unit is halted by the RETENTION state.", }, { .name = "_0INST_COMMIT", .modmsk = ARMV9_ATTRS, .code = 0x0190, .desc = "This event counts every cycle that no instruction was committed. Cycles in which only MOVPRFX is committed are also counted.
", }, { .name = "_1INST_COMMIT", .modmsk = ARMV9_ATTRS, .code = 0x0191, .desc = "This event counts every cycle that one instruction is committed.", }, { .name = "_2INST_COMMIT", .modmsk = ARMV9_ATTRS, .code = 0x0192, .desc = "This event counts every cycle that two instructions are committed.", }, { .name = "_3INST_COMMIT", .modmsk = ARMV9_ATTRS, .code = 0x0193, .desc = "This event counts every cycle that three instructions are committed.", }, { .name = "_4INST_COMMIT", .modmsk = ARMV9_ATTRS, .code = 0x0194, .desc = "This event counts every cycle that four instructions are committed.", }, { .name = "_5INST_COMMIT", .modmsk = ARMV9_ATTRS, .code = 0x0195, .desc = "This event counts every cycle that five instructions are committed.", }, { .name = "UOP_ONLY_COMMIT", .modmsk = ARMV9_ATTRS, .code = 0x0198, .desc = "This event counts every cycle in which only micro-operations are committed.", }, { .name = "SINGLE_MOVPRFX_COMMIT", .modmsk = ARMV9_ATTRS, .code = 0x0199, .desc = "This event counts every cycle that only the MOVPRFX instruction is committed.", }, { .name = "LD_COMP_WAIT_L2_MISS", .modmsk = ARMV9_ATTRS, .code = 0x019c, .desc = "This event counts every cycle that no instruction was committed because the oldest and uncommitted load/store/prefetch operation waits for L2 cache miss.", }, { .name = "LD_COMP_WAIT_L2_MISS_EX", .modmsk = ARMV9_ATTRS, .code = 0x019d, .desc = "This event counts every cycle that no instruction was committed because the oldest and uncommitted integer load operation waits for L2 cache miss.", }, { .name = "EAGA_VAL", .modmsk = ARMV9_ATTRS, .code = 0x01a0, .desc = "This event counts valid cycles of EAGA pipeline.", }, { .name = "EAGB_VAL", .modmsk = ARMV9_ATTRS, .code = 0x01a1, .desc = "This event counts valid cycles of EAGB pipeline.", }, { .name = "PRX_VAL", .modmsk = ARMV9_ATTRS, .code = 0x01a3, .desc = "This event counts valid cycles of PRX pipeline.", }, { .name = "EXA_VAL", .modmsk = ARMV9_ATTRS, .code = 0x01a4, .desc = "This event
counts valid cycles of EXA pipeline.", }, { .name = "EXB_VAL", .modmsk = ARMV9_ATTRS, .code = 0x01a5, .desc = "This event counts valid cycles of EXB pipeline.", }, { .name = "EXC_VAL", .modmsk = ARMV9_ATTRS, .code = 0x01a6, .desc = "This event counts valid cycles of EXC pipeline.", }, { .name = "EXD_VAL", .modmsk = ARMV9_ATTRS, .code = 0x01a7, .desc = "This event counts valid cycles of EXD pipeline.", }, { .name = "FLA_VAL", .modmsk = ARMV9_ATTRS, .code = 0x01a8, .desc = "This event counts valid cycles of FLA pipeline.", }, { .name = "FLB_VAL", .modmsk = ARMV9_ATTRS, .code = 0x01a9, .desc = "This event counts valid cycles of FLB pipeline.", }, { .name = "STEA_VAL", .modmsk = ARMV9_ATTRS, .code = 0x01aa, .desc = "This event counts valid cycles of STEA pipeline.", }, { .name = "STEB_VAL", .modmsk = ARMV9_ATTRS, .code = 0x01ab, .desc = "This event counts valid cycles of STEB pipeline.", }, { .name = "STFL_VAL", .modmsk = ARMV9_ATTRS, .code = 0x01ac, .desc = "This event counts valid cycles of STFL pipeline.", }, { .name = "STPX_VAL", .modmsk = ARMV9_ATTRS, .code = 0x01ad, .desc = "This event counts valid cycles of STPX pipeline.", }, { .name = "FLA_VAL_PRD_CNT", .modmsk = ARMV9_ATTRS, .code = 0x01b0, .desc = "This event counts the number of 1's in the predicate bits of request in FLA pipeline, where it is corrected so that it becomes 32 when all bits are 1.", }, { .name = "FLB_VAL_PRD_CNT", .modmsk = ARMV9_ATTRS, .code = 0x01b1, .desc = "This event counts the number of 1's in the predicate bits of request in FLB pipeline, where it is corrected so that it becomes 32 when all bits are 1.", }, { .name = "FLA_VAL_FOR_PRD", .modmsk = ARMV9_ATTRS, .code = 0x01b2, .desc = "This event counts valid cycles of FLA pipeline.", }, { .name = "FLB_VAL_FOR_PRD", .modmsk = ARMV9_ATTRS, .code = 0x01b3, .desc = "This event counts valid cycles of FLB pipeline.", }, { .name = "EA_CORE", .modmsk = ARMV9_ATTRS, .code = 0x01f0, .desc = "This event counts energy consumption of core.", }, { 
.name = "L1D_CACHE_DM", .modmsk = ARMV9_ATTRS, .code = 0x0200, .desc = "This event counts L1D_CACHE caused by demand access.", }, { .name = "L1D_CACHE_DM_RD", .modmsk = ARMV9_ATTRS, .code = 0x0201, .desc = "This event counts L1D_CACHE caused by demand read access.", }, { .name = "L1D_CACHE_DM_WR", .modmsk = ARMV9_ATTRS, .code = 0x0202, .desc = "This event counts L1D_CACHE caused by demand write access.", }, { .name = "L1I_CACHE_DM_RD", .modmsk = ARMV9_ATTRS, .code = 0x0207, .desc = "This event counts L1I_CACHE caused by demand read access.", }, { .name = "L1D_CACHE_REFILL_DM", .modmsk = ARMV9_ATTRS, .code = 0x0208, .desc = "This event counts L1D_CACHE_REFILL caused by demand access.", }, { .name = "L1D_CACHE_REFILL_DM_RD", .modmsk = ARMV9_ATTRS, .code = 0x0209, .desc = "This event counts L1D_CACHE_REFILL caused by demand read access.", }, { .name = "L1D_CACHE_REFILL_DM_WR", .modmsk = ARMV9_ATTRS, .code = 0x020a, .desc = "This event counts L1D_CACHE_REFILL caused by demand write access.", }, { .name = "L1D_CACHE_BTC", .modmsk = ARMV9_ATTRS, .code = 0x020d, .desc = "This event counts demand access that hits cache line with shared status and requests exclusive access in the Level 1 data cache, causing a coherence access to outside of the Level 1 caches of this PE.", }, { .name = "L1I_CACHE_REFILL_DM_RD", .modmsk = ARMV9_ATTRS, .code = 0x020f, .desc = "This event counts L1I_CACHE_REFILL caused by demand read access.", }, { .name = "L1HWPF_STREAM_PF", .modmsk = ARMV9_ATTRS, .code = 0x0230, .desc = "This event counts streaming prefetch requests to L1D cache generated by hardware prefetcher.", }, { .name = "L1HWPF_STRIDE_PF", .modmsk = ARMV9_ATTRS, .code = 0x0231, .desc = "This event counts stride prefetch requests to L1D cache generated by hardware prefetcher.", }, { .name = "L1HWPF_PFTGT_PF", .modmsk = ARMV9_ATTRS, .code = 0x0232, .desc = "This event counts LDS prefetch requests to L1D cache generated by hardware prefetcher.", }, { .name = "L2HWPF_STREAM_PF", .modmsk = 
ARMV9_ATTRS, .code = 0x0234, .desc = "This event counts streaming prefetch requests to L2 cache generated by hardware prefetcher.", }, { .name = "L2HWPF_STRIDE_PF", .modmsk = ARMV9_ATTRS, .code = 0x0235, .desc = "This event counts stride prefetch requests to L2 cache generated by hardware prefetcher.", }, { .name = "L2HWPF_OTHER", .modmsk = ARMV9_ATTRS, .code = 0x0237, .desc = "This event counts prefetch requests to L2 cache generated by other causes.", }, { .name = "L3HWPF_STREAM_PF", .modmsk = ARMV9_ATTRS, .code = 0x0238, .desc = "This event counts streaming prefetch requests to L3 cache generated by hardware prefetcher.", }, { .name = "L3HWPF_STRIDE_PF", .modmsk = ARMV9_ATTRS, .code = 0x0239, .desc = "This event counts stride prefetch requests to L3 cache generated by hardware prefetcher.", }, { .name = "L3HWPF_OTHER", .modmsk = ARMV9_ATTRS, .code = 0x023b, .desc = "This event counts prefetch requests to L3 cache generated by other causes.", }, { .name = "L1IHWPF_NEXTLINE_PF", .modmsk = ARMV9_ATTRS, .code = 0x023c, .desc = "This event counts next-line prefetch requests to L1I cache generated by hardware prefetcher.", }, { .name = "L1_PIPE0_VAL", .modmsk = ARMV9_ATTRS, .code = 0x0240, .desc = "This event counts valid cycles of L1D cache pipeline#0.", }, { .name = "L1_PIPE1_VAL", .modmsk = ARMV9_ATTRS, .code = 0x0241, .desc = "This event counts valid cycles of L1D cache pipeline#1.", }, { .name = "L1_PIPE2_VAL", .modmsk = ARMV9_ATTRS, .code = 0x0242, .desc = "This event counts valid cycles of L1D cache pipeline#2.", }, { .name = "L1_PIPE0_COMP", .modmsk = ARMV9_ATTRS, .code = 0x0250, .desc = "This event counts completed requests in L1D cache pipeline#0.", }, { .name = "L1_PIPE1_COMP", .modmsk = ARMV9_ATTRS, .code = 0x0251, .desc = "This event counts completed requests in L1D cache pipeline#1.", }, { .name = "L1_PIPE_ABORT_STLD_INTLK", .modmsk = ARMV9_ATTRS, .code = 0x025a, .desc = "This event counts requests in L1D pipelines that were aborted due to
store-load interlock.", }, { .name = "L1I_PIPE_COMP", .modmsk = ARMV9_ATTRS, .code = 0x026c, .desc = "This event counts completed requests in L1I cache pipeline.", }, { .name = "L1I_PIPE_VAL", .modmsk = ARMV9_ATTRS, .code = 0x026d, .desc = "This event counts valid cycles of L1I cache pipeline.", }, { .name = "L1_PIPE0_VAL_IU_TAG_ADRS_SCE", .modmsk = ARMV9_ATTRS, .code = 0x0278, .desc = "This event counts requests in L1D cache pipeline#0 whose sce bit of the tagged address is 1.", }, { .name = "L1_PIPE1_VAL_IU_TAG_ADRS_SCE", .modmsk = ARMV9_ATTRS, .code = 0x0279, .desc = "This event counts requests in L1D cache pipeline#1 whose sce bit of the tagged address is 1.", }, { .name = "L1_PIPE0_VAL_IU_NOT_SEC0", .modmsk = ARMV9_ATTRS, .code = 0x02a0, .desc = "This event counts requests in L1D cache pipeline#0 whose sector cache ID is not 0.", }, { .name = "L1_PIPE1_VAL_IU_NOT_SEC0", .modmsk = ARMV9_ATTRS, .code = 0x02a1, .desc = "This event counts requests in L1D cache pipeline#1 whose sector cache ID is not 0.", }, { .name = "L1_PIPE_COMP_GATHER_2FLOW", .modmsk = ARMV9_ATTRS, .code = 0x02b0, .desc = "This event counts the number of times where 2 elements of the gather instructions became 2 flows because 2 elements could not be combined.", }, { .name = "L1_PIPE_COMP_GATHER_1FLOW", .modmsk = ARMV9_ATTRS, .code = 0x02b1, .desc = "This event counts the number of times where 2 elements of the gather instructions became 1 flow because 2 elements could be combined.", }, { .name = "L1_PIPE_COMP_GATHER_0FLOW", .modmsk = ARMV9_ATTRS, .code = 0x02b2, .desc = "This event counts the number of times where 2 elements of the gather instructions became 0 flow because both predicate values are 0.", }, { .name = "L1_PIPE_COMP_SCATTER_1FLOW", .modmsk = ARMV9_ATTRS, .code = 0x02b3, .desc = "This event counts the number of flows of the scatter instructions.", }, { .name = "L1_PIPE0_COMP_PRD_CNT", .modmsk = ARMV9_ATTRS, .code = 0x02b8, .desc = "This event counts the number of 1's in the
predicate bits of request in L1D cache pipeline#0, where it is corrected so that it becomes 64 when all bits are 1.", }, { .name = "L1_PIPE1_COMP_PRD_CNT", .modmsk = ARMV9_ATTRS, .code = 0x02b9, .desc = "This event counts the number of 1's in the predicate bits of request in L1D cache pipeline#1, where it is corrected so that it becomes 64 when all bits are 1.", }, { .name = "L2D_CACHE_DM", .modmsk = ARMV9_ATTRS, .code = 0x0300, .desc = "This event counts L2D_CACHE caused by demand access.", }, { .name = "L2D_CACHE_DM_RD", .modmsk = ARMV9_ATTRS, .code = 0x0301, .desc = "This event counts L2D_CACHE caused by demand read access.", }, { .name = "L2D_CACHE_DM_WR", .modmsk = ARMV9_ATTRS, .code = 0x0302, .desc = "This event counts L2D_CACHE caused by demand write access.", }, { .name = "L2D_CACHE_HWPRF_ADJACENT", .modmsk = ARMV9_ATTRS, .code = 0x0305, .desc = "This event counts L2D_CACHE caused by hardware adjacent prefetch access.", }, { .name = "L2D_CACHE_REFILL_DM", .modmsk = ARMV9_ATTRS, .code = 0x0308, .desc = "This event counts L2D_CACHE_REFILL caused by demand access.", }, { .name = "L2D_CACHE_REFILL_DM_RD", .modmsk = ARMV9_ATTRS, .code = 0x0309, .desc = "This event counts L2D_CACHE_REFILL caused by demand read access.", }, { .name = "L2D_CACHE_REFILL_DM_WR", .modmsk = ARMV9_ATTRS, .code = 0x030a, .desc = "This event counts L2D_CACHE_REFILL caused by demand write access.", }, { .name = "L2D_CACHE_REFILL_DM_WR_EXCL", .modmsk = ARMV9_ATTRS, .code = 0x030b, .desc = "This event counts L2D_CACHE_REFILL caused by demand write exclusive access.", }, { .name = "L2D_CACHE_REFILL_DM_WR_ATOM", .modmsk = ARMV9_ATTRS, .code = 0x030c, .desc = "This event counts L2D_CACHE_REFILL caused by demand write atomic access.", }, { .name = "L2D_CACHE_BTC", .modmsk = ARMV9_ATTRS, .code = 0x030d, .desc = "This event counts demand access that hits cache line with shared status and requests exclusive access in the Level 1 data and Level 2 caches, causing a coherence access to outside of the 
Level 1 and Level 2 caches of this PE.", }, { .name = "L2_PIPE_VAL", .modmsk = ARMV9_ATTRS, .code = 0x0330, .desc = "This event counts valid cycles of L2 cache pipeline.", }, { .name = "L2_PIPE_COMP_ALL", .modmsk = ARMV9_ATTRS, .code = 0x0350, .desc = "This event counts completed requests in L2 cache pipeline.", }, { .name = "L2_PIPE_COMP_PF_L2MIB_MCH", .modmsk = ARMV9_ATTRS, .code = 0x0370, .desc = "This event counts operations where software or hardware prefetch hits an L2 cache refill buffer allocated by demand access.", }, { .name = "L2D_CACHE_REFILL_L3D_CACHE", .modmsk = ARMV9_ATTRS, .code = 0x0390, .desc = "This event counts operations that cause a cache access to the L3 cache.", }, { .name = "L2D_CACHE_REFILL_L3D_CACHE_DM", .modmsk = ARMV9_ATTRS, .code = 0x0391, .desc = "This event counts L2D_CACHE_REFILL_L3D_CACHE caused by demand access.", }, { .name = "L2D_CACHE_REFILL_L3D_CACHE_DM_RD", .modmsk = ARMV9_ATTRS, .code = 0x0392, .desc = "This event counts L2D_CACHE_REFILL_L3D_CACHE caused by demand read access.", }, { .name = "L2D_CACHE_REFILL_L3D_CACHE_DM_WR", .modmsk = ARMV9_ATTRS, .code = 0x0393, .desc = "This event counts L2D_CACHE_REFILL_L3D_CACHE caused by demand write access.", }, { .name = "L2D_CACHE_REFILL_L3D_CACHE_PRF", .modmsk = ARMV9_ATTRS, .code = 0x0394, .desc = "This event counts L2D_CACHE_REFILL_L3D_CACHE caused by prefetch access.", }, { .name = "L2D_CACHE_REFILL_L3D_CACHE_HWPRF", .modmsk = ARMV9_ATTRS, .code = 0x0395, .desc = "This event counts L2D_CACHE_REFILL_L3D_CACHE caused by hardware prefetch access.", }, { .name = "L2D_CACHE_REFILL_L3D_MISS", .modmsk = ARMV9_ATTRS, .code = 0x0396, .desc = "This event counts operations that cause a miss of the L3 cache.", }, { .name = "L2D_CACHE_REFILL_L3D_MISS_DM", .modmsk = ARMV9_ATTRS, .code = 0x0397, .desc = "This event counts L2D_CACHE_REFILL_L3D_MISS caused by demand access.", }, { .name = "L2D_CACHE_REFILL_L3D_MISS_DM_RD", .modmsk = ARMV9_ATTRS, .code = 0x0398, .desc = "This event counts 
L2D_CACHE_REFILL_L3D_MISS caused by demand read access.", }, { .name = "L2D_CACHE_REFILL_L3D_MISS_DM_WR", .modmsk = ARMV9_ATTRS, .code = 0x0399, .desc = "This event counts L2D_CACHE_REFILL_L3D_MISS caused by demand write access.", }, { .name = "L2D_CACHE_REFILL_L3D_MISS_PRF", .modmsk = ARMV9_ATTRS, .code = 0x039a, .desc = "This event counts L2D_CACHE_REFILL_L3D_MISS caused by prefetch access.", }, { .name = "L2D_CACHE_REFILL_L3D_MISS_HWPRF", .modmsk = ARMV9_ATTRS, .code = 0x039b, .desc = "This event counts L2D_CACHE_REFILL_L3D_MISS caused by hardware prefetch access.", }, { .name = "L2D_CACHE_REFILL_L3D_HIT", .modmsk = ARMV9_ATTRS, .code = 0x039c, .desc = "This event counts operations that cause a hit of the L3 cache.", }, { .name = "L2D_CACHE_REFILL_L3D_HIT_DM", .modmsk = ARMV9_ATTRS, .code = 0x039d, .desc = "This event counts L2D_CACHE_REFILL_L3D_HIT caused by demand access.", }, { .name = "L2D_CACHE_REFILL_L3D_HIT_DM_RD", .modmsk = ARMV9_ATTRS, .code = 0x039e, .desc = "This event counts L2D_CACHE_REFILL_L3D_HIT caused by demand read access.", }, { .name = "L2D_CACHE_REFILL_L3D_HIT_DM_WR", .modmsk = ARMV9_ATTRS, .code = 0x039f, .desc = "This event counts L2D_CACHE_REFILL_L3D_HIT caused by demand write access.", }, { .name = "L2D_CACHE_REFILL_L3D_HIT_PRF", .modmsk = ARMV9_ATTRS, .code = 0x03a0, .desc = "This event counts L2D_CACHE_REFILL_L3D_HIT caused by prefetch access.", }, { .name = "L2D_CACHE_REFILL_L3D_HIT_HWPRF", .modmsk = ARMV9_ATTRS, .code = 0x03a1, .desc = "This event counts L2D_CACHE_REFILL_L3D_HIT caused by hardware prefetch access.", }, { .name = "L2D_CACHE_REFILL_L3D_MISS_PFTGT_HIT", .modmsk = ARMV9_ATTRS, .code = 0x03a2, .desc = "This event counts the number of L3 cache misses where the requests hit the PFTGT buffer.", }, { .name = "L2D_CACHE_REFILL_L3D_MISS_PFTGT_HIT_DM", .modmsk = ARMV9_ATTRS, .code = 0x03a3, .desc = "This event counts L2D_CACHE_REFILL_L3D_MISS_PFTGT_HIT caused by demand access.", }, { .name = 
"L2D_CACHE_REFILL_L3D_MISS_PFTGT_HIT_DM_RD", .modmsk = ARMV9_ATTRS, .code = 0x03a4, .desc = "This event counts L2D_CACHE_REFILL_L3D_MISS_PFTGT_HIT caused by demand read access.", }, { .name = "L2D_CACHE_REFILL_L3D_MISS_PFTGT_HIT_DM_WR", .modmsk = ARMV9_ATTRS, .code = 0x03a5, .desc = "This event counts L2D_CACHE_REFILL_L3D_MISS_PFTGT_HIT caused by demand write access.", }, { .name = "L2D_CACHE_REFILL_L3D_MISS_L_MEM", .modmsk = ARMV9_ATTRS, .code = 0x03a6, .desc = "This event counts the number of L3 cache misses where the requests access the memory in the same socket as the requests.", }, { .name = "L2D_CACHE_REFILL_L3D_MISS_FR_MEM", .modmsk = ARMV9_ATTRS, .code = 0x03a7, .desc = "This event counts the number of L3 cache misses where the requests access the memory in the different socket from the requests.", }, { .name = "L2D_CACHE_REFILL_L3D_MISS_L_L2", .modmsk = ARMV9_ATTRS, .code = 0x03a8, .desc = "This event counts the number of L3 cache misses where the requests access the different L2 cache from the requests in the same NUMA nodes as the requests.", }, { .name = "L2D_CACHE_REFILL_L3D_MISS_NR_L2", .modmsk = ARMV9_ATTRS, .code = 0x03a9, .desc = "This event counts the number of L3 cache misses where the requests access L2 cache in the different NUMA nodes from the requests in the same socket as the requests.", }, { .name = "L2D_CACHE_REFILL_L3D_MISS_NR_L3", .modmsk = ARMV9_ATTRS, .code = 0x03aa, .desc = "This event counts the number of L3 cache misses where the requests access L3 cache in the different NUMA nodes from the requests in the same socket as the requests.", }, { .name = "L2D_CACHE_REFILL_L3D_MISS_FR_L2", .modmsk = ARMV9_ATTRS, .code = 0x03ab, .desc = "This event counts the number of L3 cache misses where the requests access L2 cache in the different socket from the requests.", }, { .name = "L2D_CACHE_REFILL_L3D_MISS_FR_L3", .modmsk = ARMV9_ATTRS, .code = 0x03ac, .desc = "This event counts the number of L3 cache misses where the requests access L3 cache
in the different socket from the requests.", }, { .name = "L2D_CACHE_WB_VICTIM_CLEAN", .modmsk = ARMV9_ATTRS, .code = 0x03b0, .desc = "This event counts every write-back of data from the L2 cache caused by L2 replace where the data is clean. In this case, the data will usually be written to L3 cache.", }, { .name = "L2D_CACHE_WB_NT", .modmsk = ARMV9_ATTRS, .code = 0x03b1, .desc = "This event counts every write-back of data from the L2 cache caused by non-temporal-store.", }, { .name = "L2D_CACHE_WB_DCZVA", .modmsk = ARMV9_ATTRS, .code = 0x03b2, .desc = "This event counts every write-back of data from the L2 cache caused by DC ZVA.", }, { .name = "L2D_CACHE_FB", .modmsk = ARMV9_ATTRS, .code = 0x03b3, .desc = "This event counts every flush-back (drop) of data from the L2 cache.", }, { .name = "EA_L3", .modmsk = ARMV9_ATTRS, .code = 0x03f0, .desc = "This event counts energy consumption of L3 cache.", }, { .name = "EA_LDO_LOSS", .modmsk = ARMV9_ATTRS, .code = 0x03f1, .desc = "This event counts energy consumption of LDO loss.", }, { .name = "GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x0880, .desc = "This event counts the number of cycles at 100MHz.", }, { .name = "FL0_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x0890, .desc = "This event counts the number of cycles where the measured core is staying in the Frequency Level 0.", }, { .name = "FL1_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x0891, .desc = "This event counts the number of cycles where the measured core is staying in the Frequency Level 1.", }, { .name = "FL2_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x0892, .desc = "This event counts the number of cycles where the measured core is staying in the Frequency Level 2.", }, { .name = "FL3_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x0893, .desc = "This event counts the number of cycles where the measured core is staying in the Frequency Level 3.", }, { .name = "FL4_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x0894, .desc = "This event counts the number of cycles where the 
measured core is staying in the Frequency Level 4.", }, { .name = "FL5_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x0895, .desc = "This event counts the number of cycles where the measured core is staying in the Frequency Level 5.", }, { .name = "FL6_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x0896, .desc = "This event counts the number of cycles where the measured core is staying in the Frequency Level 6.", }, { .name = "FL7_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x0897, .desc = "This event counts the number of cycles where the measured core is staying in the Frequency Level 7.", }, { .name = "FL8_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x0898, .desc = "This event counts the number of cycles where the measured core is staying in the Frequency Level 8.", }, { .name = "FL9_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x0899, .desc = "This event counts the number of cycles where the measured core is staying in the Frequency Level 9.", }, { .name = "FL10_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x089a, .desc = "This event counts the number of cycles where the measured core is staying in the Frequency Level 10.", }, { .name = "FL11_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x089b, .desc = "This event counts the number of cycles where the measured core is staying in the Frequency Level 11.", }, { .name = "FL12_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x089c, .desc = "This event counts the number of cycles where the measured core is staying in the Frequency Level 12.", }, { .name = "FL13_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x089d, .desc = "This event counts the number of cycles where the measured core is staying in the Frequency Level 13.", }, { .name = "FL14_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x089e, .desc = "This event counts the number of cycles where the measured core is staying in the Frequency Level 14.", }, { .name = "FL15_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x089f, .desc = "This event counts the number of cycles where the measured core is staying in 
the Frequency Level 15.", }, { .name = "RETENTION_GCYCLES", .modmsk = ARMV9_ATTRS, .code = 0x08a0, .desc = "This event counts the number of cycles where the measured core is staying in the RETENTION state.", }, { .name = "RETENTION_COUNT", .modmsk = ARMV9_ATTRS, .code = 0x08a1, .desc = "This event counts the number of changes from the normal state to the RETENTION state.", }, { .name = "L1I_TLB_4K", .modmsk = ARMV9_ATTRS, .code = 0x0c00, .desc = "This event counts operations that cause a TLB access to the L1I in 4KB page.", }, { .name = "L1I_TLB_64K", .modmsk = ARMV9_ATTRS, .code = 0x0c01, .desc = "This event counts operations that cause a TLB access to the L1I in 64KB page.", }, { .name = "L1I_TLB_2M", .modmsk = ARMV9_ATTRS, .code = 0x0c02, .desc = "This event counts operations that cause a TLB access to the L1I in 2MB page.", }, { .name = "L1I_TLB_32M", .modmsk = ARMV9_ATTRS, .code = 0x0c03, .desc = "This event counts operations that cause a TLB access to the L1I in 32MB page.", }, { .name = "L1I_TLB_512M", .modmsk = ARMV9_ATTRS, .code = 0x0c04, .desc = "This event counts operations that cause a TLB access to the L1I in 512MB page.", }, { .name = "L1I_TLB_1G", .modmsk = ARMV9_ATTRS, .code = 0x0c05, .desc = "This event counts operations that cause a TLB access to the L1I in 1GB page.", }, { .name = "L1I_TLB_16G", .modmsk = ARMV9_ATTRS, .code = 0x0c06, .desc = "This event counts operations that cause a TLB access to the L1I in 16GB page.", }, { .name = "L1D_TLB_4K", .modmsk = ARMV9_ATTRS, .code = 0x0c08, .desc = "This event counts operations that cause a TLB access to the L1D in 4KB page.", }, { .name = "L1D_TLB_64K", .modmsk = ARMV9_ATTRS, .code = 0x0c09, .desc = "This event counts operations that cause a TLB access to the L1D in 64KB page.", }, { .name = "L1D_TLB_2M", .modmsk = ARMV9_ATTRS, .code = 0x0c0a, .desc = "This event counts operations that cause a TLB access to the L1D in 2MB page.", }, { .name = "L1D_TLB_32M", .modmsk = ARMV9_ATTRS, .code = 0x0c0b, 
.desc = "This event counts operations that cause a TLB access to the L1D in 32MB page.", }, { .name = "L1D_TLB_512M", .modmsk = ARMV9_ATTRS, .code = 0x0c0c, .desc = "This event counts operations that cause a TLB access to the L1D in 512MB page.", }, { .name = "L1D_TLB_1G", .modmsk = ARMV9_ATTRS, .code = 0x0c0d, .desc = "This event counts operations that cause a TLB access to the L1D in 1GB page.", }, { .name = "L1D_TLB_16G", .modmsk = ARMV9_ATTRS, .code = 0x0c0e, .desc = "This event counts operations that cause a TLB access to the L1D in 16GB page.", }, { .name = "L1I_TLB_REFILL_4K", .modmsk = ARMV9_ATTRS, .code = 0x0c10, .desc = "This event counts operations that cause a TLB refill to the L1I in 4KB page.", }, { .name = "L1I_TLB_REFILL_64K", .modmsk = ARMV9_ATTRS, .code = 0x0c11, .desc = "This event counts operations that cause a TLB refill to the L1I in 64KB page.", }, { .name = "L1I_TLB_REFILL_2M", .modmsk = ARMV9_ATTRS, .code = 0x0c12, .desc = "This event counts operations that cause a TLB refill to the L1I in 2MB page.", }, { .name = "L1I_TLB_REFILL_32M", .modmsk = ARMV9_ATTRS, .code = 0x0c13, .desc = "This event counts operations that cause a TLB refill to the L1I in 32MB page.", }, { .name = "L1I_TLB_REFILL_512M", .modmsk = ARMV9_ATTRS, .code = 0x0c14, .desc = "This event counts operations that cause a TLB refill to the L1I in 512MB page.", }, { .name = "L1I_TLB_REFILL_1G", .modmsk = ARMV9_ATTRS, .code = 0x0c15, .desc = "This event counts operations that cause a TLB refill to the L1I in 1GB page.", }, { .name = "L1I_TLB_REFILL_16G", .modmsk = ARMV9_ATTRS, .code = 0x0c16, .desc = "This event counts operations that cause a TLB refill to the L1I in 16GB page.", }, { .name = "L1D_TLB_REFILL_4K", .modmsk = ARMV9_ATTRS, .code = 0x0c18, .desc = "This event counts operations that cause a TLB refill to the L1D in 4KB page.", }, { .name = "L1D_TLB_REFILL_64K", .modmsk = ARMV9_ATTRS, .code = 0x0c19, .desc = "This event counts operations that cause a TLB refill to the 
L1D in 64KB page.", }, { .name = "L1D_TLB_REFILL_2M", .modmsk = ARMV9_ATTRS, .code = 0x0c1a, .desc = "This event counts operations that cause a TLB refill to the L1D in 2MB page.", }, { .name = "L1D_TLB_REFILL_32M", .modmsk = ARMV9_ATTRS, .code = 0x0c1b, .desc = "This event counts operations that cause a TLB refill to the L1D in 32MB page.", }, { .name = "L1D_TLB_REFILL_512M", .modmsk = ARMV9_ATTRS, .code = 0x0c1c, .desc = "This event counts operations that cause a TLB refill to the L1D in 512MB page.", }, { .name = "L1D_TLB_REFILL_1G", .modmsk = ARMV9_ATTRS, .code = 0x0c1d, .desc = "This event counts operations that cause a TLB refill to the L1D in 1GB page.", }, { .name = "L1D_TLB_REFILL_16G", .modmsk = ARMV9_ATTRS, .code = 0x0c1e, .desc = "This event counts operations that cause a TLB refill to the L1D in 16GB page.", }, { .name = "L2I_TLB_4K", .modmsk = ARMV9_ATTRS, .code = 0x0c20, .desc = "This event counts operations that cause a TLB access to the L2I in 4KB page.", }, { .name = "L2I_TLB_64K", .modmsk = ARMV9_ATTRS, .code = 0x0c21, .desc = "This event counts operations that cause a TLB access to the L2I in 64KB page.", }, { .name = "L2I_TLB_2M", .modmsk = ARMV9_ATTRS, .code = 0x0c22, .desc = "This event counts operations that cause a TLB access to the L2I in 2MB page.", }, { .name = "L2I_TLB_32M", .modmsk = ARMV9_ATTRS, .code = 0x0c23, .desc = "This event counts operations that cause a TLB access to the L2I in 32MB page.", }, { .name = "L2I_TLB_512M", .modmsk = ARMV9_ATTRS, .code = 0x0c24, .desc = "This event counts operations that cause a TLB access to the L2I in 512MB page.", }, { .name = "L2I_TLB_1G", .modmsk = ARMV9_ATTRS, .code = 0x0c25, .desc = "This event counts operations that cause a TLB access to the L2I in 1GB page.", }, { .name = "L2I_TLB_16G", .modmsk = ARMV9_ATTRS, .code = 0x0c26, .desc = "This event counts operations that cause a TLB access to the L2I in 16GB page.", }, { .name = "L2D_TLB_4K", .modmsk = ARMV9_ATTRS, .code = 0x0c28, .desc = 
"This event counts operations that cause a TLB access to the L2D in 4KB page.", }, { .name = "L2D_TLB_64K", .modmsk = ARMV9_ATTRS, .code = 0x0c29, .desc = "This event counts operations that cause a TLB access to the L2D in 64KB page.", }, { .name = "L2D_TLB_2M", .modmsk = ARMV9_ATTRS, .code = 0x0c2a, .desc = "This event counts operations that cause a TLB access to the L2D in 2MB page.", }, { .name = "L2D_TLB_32M", .modmsk = ARMV9_ATTRS, .code = 0x0c2b, .desc = "This event counts operations that cause a TLB access to the L2D in 32MB page.", }, { .name = "L2D_TLB_512M", .modmsk = ARMV9_ATTRS, .code = 0x0c2c, .desc = "This event counts operations that cause a TLB access to the L2D in 512MB page.", }, { .name = "L2D_TLB_1G", .modmsk = ARMV9_ATTRS, .code = 0x0c2d, .desc = "This event counts operations that cause a TLB access to the L2D in 1GB page.", }, { .name = "L2D_TLB_16G", .modmsk = ARMV9_ATTRS, .code = 0x0c2e, .desc = "This event counts operations that cause a TLB access to the L2D in 16GB page.", }, { .name = "L2I_TLB_REFILL_4K", .modmsk = ARMV9_ATTRS, .code = 0x0c30, .desc = "This event counts operations that cause a TLB refill to the L2I in 4KB page.", }, { .name = "L2I_TLB_REFILL_64K", .modmsk = ARMV9_ATTRS, .code = 0x0c31, .desc = "This event counts operations that cause a TLB refill to the L2I in 64KB page.", }, { .name = "L2I_TLB_REFILL_2M", .modmsk = ARMV9_ATTRS, .code = 0x0c32, .desc = "This event counts operations that cause a TLB refill to the L2I in 2MB page.", }, { .name = "L2I_TLB_REFILL_32M", .modmsk = ARMV9_ATTRS, .code = 0x0c33, .desc = "This event counts operations that cause a TLB refill to the L2I in 32MB page.", }, { .name = "L2I_TLB_REFILL_512M", .modmsk = ARMV9_ATTRS, .code = 0x0c34, .desc = "This event counts operations that cause a TLB refill to the L2I in 512MB page.", }, { .name = "L2I_TLB_REFILL_1G", .modmsk = ARMV9_ATTRS, .code = 0x0c35, .desc = "This event counts operations that cause a TLB refill to the L2I in 1GB page.", }, { .name =
"L2I_TLB_REFILL_16G", .modmsk = ARMV9_ATTRS, .code = 0x0c36, .desc = "This event counts operations that cause a TLB refill to the L2I in 16GB page.", }, { .name = "L2D_TLB_REFILL_4K", .modmsk = ARMV9_ATTRS, .code = 0x0c38, .desc = "This event counts operations that cause a TLB refill to the L2D in 4KB page.", }, { .name = "L2D_TLB_REFILL_64K", .modmsk = ARMV9_ATTRS, .code = 0x0c39, .desc = "This event counts operations that cause a TLB refill to the L2D in 64KB page.", }, { .name = "L2D_TLB_REFILL_2M", .modmsk = ARMV9_ATTRS, .code = 0x0c3a, .desc = "This event counts operations that cause a TLB refill to the L2D in 2MB page.", }, { .name = "L2D_TLB_REFILL_32M", .modmsk = ARMV9_ATTRS, .code = 0x0c3b, .desc = "This event counts operations that cause a TLB refill to the L2D in 32MB page.", }, { .name = "L2D_TLB_REFILL_512M", .modmsk = ARMV9_ATTRS, .code = 0x0c3c, .desc = "This event counts operations that cause a TLB refill to the L2D in 512MB page.", }, { .name = "L2D_TLB_REFILL_1G", .modmsk = ARMV9_ATTRS, .code = 0x0c3d, .desc = "This event counts operations that cause a TLB refill to the L2D in 1GB page.", }, { .name = "L2D_TLB_REFILL_16G", .modmsk = ARMV9_ATTRS, .code = 0x0c3e, .desc = "This event counts operations that cause a TLB refill to the L2D in 16GB page.", }, { .name = "CNT_CYCLES", .modmsk = ARMV9_ATTRS, .code = 0x4004, .desc = "This event counts constant frequency cycles. The counter increments at a constant frequency equal to the rate of increment of the System counter.", }, { .name = "STALL_BACKEND_MEM", .modmsk = ARMV9_ATTRS, .code = 0x4005, .desc = "This event counts every cycle that no instruction was dispatched from the decode unit due to memory stall.", }, { .name = "L1I_CACHE_LMISS", .modmsk = ARMV9_ATTRS, .code = 0x4006, .desc = "This event counts operations that cause a refill of the L1I cache that incurs additional latency.", }, { .name = "L2D_CACHE_LMISS_RD", .modmsk = ARMV9_ATTRS, .code = 0x4009, .desc = "This event counts operations that cause
a refill of the L2D cache that incurs additional latency.", }, { .name = "L3D_CACHE_LMISS_RD", .modmsk = ARMV9_ATTRS, .code = 0x400b, .desc = "This event counts access counted by L3D_CACHE that is not completed by the L3D cache and is a Memory-read operation, as defined by the L2D_CACHE_REFILL_L3D_MISS events.", }, { .name = "TRB_WRAP", .modmsk = ARMV9_ATTRS, .code = 0x400c, .desc = "This event counts the event generated each time the current write pointer is wrapped to the base pointer.", }, { .name = "PMU_OVFS", .modmsk = ARMV9_ATTRS, .code = 0x400d, .desc = "This event counts the event generated each time one of the conditions described in the Arm Architecture Reference Manual for A-profile architecture occurs. This event is only for output to the trace unit.", }, { .name = "TRB_TRIG", .modmsk = ARMV9_ATTRS, .code = 0x400e, .desc = "This event counts the event generated when a Trace Buffer Extension Trigger Event occurs.", }, { .name = "PMU_HOVFS", .modmsk = ARMV9_ATTRS, .code = 0x400f, .desc = "This event counts the event generated each time an event is counted by an event counter and all of the conditions described in the Arm Architecture Reference Manual for A-profile architecture occur.
This event is only for output to the trace unit.", }, { .name = "TRCEXTOUT0", .modmsk = ARMV9_ATTRS, .code = 0x4010, .desc = "This event counts the event generated each time an event is signaled by the trace unit external event 0.", }, { .name = "CTI_TRIGOUT4", .modmsk = ARMV9_ATTRS, .code = 0x4018, .desc = "This event counts the event generated each time an event is signaled on CTI output trigger 4.", }, { .name = "SIMD_INST_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x8000, .desc = "This event counts architecturally executed SIMD instructions, excluding the Advanced SIMD scalar instructions and the instructions listed in Non-SIMD SVE instructions section of ARMv9 Reference Manual.", }, { .name = "SVE_INST_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x8002, .desc = "This event counts architecturally executed SVE instructions, including the instructions listed in Non-SIMD SVE instructions section of ARMv9 Reference Manual.", }, { .name = "ASE_INST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8005, .desc = "This event counts architecturally executed Advanced SIMD operations.", }, { .name = "SVE_INST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8006, .desc = "This event counts architecturally executed SVE instructions, including the instructions listed in Non-SIMD SVE instructions section of ARMv9 Reference Manual.", }, { .name = "ASE_SVE_INST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8007, .desc = "This event counts architecturally executed Advanced SIMD and SVE operations.", }, { .name = "UOP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8008, .desc = "This event counts all architecturally executed micro-operations.", }, { .name = "SVE_MATH_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x800e, .desc = "This event counts architecturally executed math function operations due to the SVE FTSMUL, FTMAD, FTSSEL, and FEXPA instructions.", }, { .name = "FP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8010, .desc = "This event counts architecturally executed operations due to scalar, Advanced SIMD, and SVE 
instructions listed in Floating-point instructions section of ARMv9 Reference Manual.", }, { .name = "ASE_FP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8011, .desc = "This event counts architecturally executed Advanced SIMD floating-point operation.", }, { .name = "SVE_FP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8012, .desc = "This event counts architecturally executed SVE floating-point operation.", }, { .name = "ASE_SVE_FP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8013, .desc = "This event counts architecturally executed Advanced SIMD and SVE floating-point operations.", }, { .name = "FP_HP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8014, .desc = "This event counts architecturally executed half-precision floating-point operation.", }, { .name = "ASE_FP_HP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8015, .desc = "This event counts architecturally executed Advanced SIMD half-precision floating-point operation.", }, { .name = "SVE_FP_HP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8016, .desc = "This event counts architecturally executed SVE half-precision floating-point operation.", }, { .name = "ASE_SVE_FP_HP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8017, .desc = "This event counts architecturally executed Advanced SIMD and SVE half-precision floating-point operations.", }, { .name = "FP_SP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8018, .desc = "This event counts architecturally executed single-precision floating-point operation.", }, { .name = "ASE_FP_SP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8019, .desc = "This event counts architecturally executed Advanced SIMD single-precision floating-point operation.", }, { .name = "SVE_FP_SP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x801a, .desc = "This event counts architecturally executed SVE single-precision floating-point operation.", }, { .name = "ASE_SVE_FP_SP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x801b, .desc = "This event counts architecturally executed Advanced SIMD and SVE single-precision floating-point operations.", }, { .name = 
"FP_DP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x801c, .desc = "This event counts architecturally executed double-precision floating-point operation.", }, { .name = "ASE_FP_DP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x801d, .desc = "This event counts architecturally executed Advanced SIMD double-precision floating-point operation.", }, { .name = "SVE_FP_DP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x801e, .desc = "This event counts architecturally executed SVE double-precision floating-point operation.", }, { .name = "ASE_SVE_FP_DP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x801f, .desc = "This event counts architecturally executed Advanced SIMD and SVE double-precision floating-point operations.", }, { .name = "FP_DIV_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8020, .desc = "This event counts architecturally executed floating-point divide operation.", }, { .name = "ASE_FP_DIV_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8021, .desc = "This event counts architecturally executed Advanced SIMD floating-point divide operation.", }, { .name = "SVE_FP_DIV_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8022, .desc = "This event counts architecturally executed SVE floating-point divide operation.", }, { .name = "ASE_SVE_FP_DIV_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8023, .desc = "This event counts architecturally executed Advanced SIMD and SVE floating-point divide operations.", }, { .name = "FP_SQRT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8024, .desc = "This event counts architecturally executed floating-point square root operation.", }, { .name = "ASE_FP_SQRT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8025, .desc = "This event counts architecturally executed Advanced SIMD floating-point square root operation.", }, { .name = "SVE_FP_SQRT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8026, .desc = "This event counts architecturally executed SVE floating-point square root operation.", }, { .name = "ASE_SVE_FP_SQRT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8027, .desc = "This event counts architecturally 
executed Advanced SIMD and SVE floating-point square root operations.", }, { .name = "FP_FMA_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8028, .desc = "This event counts architecturally executed floating-point fused multiply-add and multiply-subtract operations.", }, { .name = "ASE_FP_FMA_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8029, .desc = "This event counts architecturally executed Advanced SIMD floating-point FMA operation.", }, { .name = "SVE_FP_FMA_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x802a, .desc = "This event counts architecturally executed SVE floating-point FMA operation.", }, { .name = "ASE_SVE_FP_FMA_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x802b, .desc = "This event counts architecturally executed Advanced SIMD and SVE floating-point FMA operations.", }, { .name = "FP_MUL_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x802c, .desc = "This event counts architecturally executed floating-point multiply operations.", }, { .name = "ASE_FP_MUL_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x802d, .desc = "This event counts architecturally executed Advanced SIMD floating-point multiply operation.", }, { .name = "SVE_FP_MUL_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x802e, .desc = "This event counts architecturally executed SVE floating-point multiply operation.", }, { .name = "ASE_SVE_FP_MUL_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x802f, .desc = "This event counts architecturally executed Advanced SIMD and SVE floating-point multiply operations.", }, { .name = "FP_ADDSUB_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8030, .desc = "This event counts architecturally executed floating-point add or subtract operations.", }, { .name = "ASE_FP_ADDSUB_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8031, .desc = "This event counts architecturally executed Advanced SIMD floating-point add or subtract operation.", }, { .name = "SVE_FP_ADDSUB_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8032, .desc = "This event counts architecturally executed SVE floating-point add or subtract operation.", }, { .name = 
"ASE_SVE_FP_ADDSUB_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8033, .desc = "This event counts architecturally executed Advanced SIMD and SVE floating-point add or subtract operations.", }, { .name = "FP_RECPE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8034, .desc = "This event counts architecturally executed floating-point reciprocal estimate operations due to the Advanced SIMD scalar, Advanced SIMD vector, and SVE FRECPE and FRSQRTE instructions.", }, { .name = "ASE_FP_RECPE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8035, .desc = "This event counts architecturally executed Advanced SIMD floating-point reciprocal estimate operations.", }, { .name = "SVE_FP_RECPE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8036, .desc = "This event counts architecturally executed SVE floating-point reciprocal estimate operations.", }, { .name = "ASE_SVE_FP_RECPE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8037, .desc = "This event counts architecturally executed Advanced SIMD and SVE floating-point reciprocal estimate operations.", }, { .name = "FP_CVT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8038, .desc = "This event counts architecturally executed floating-point convert operations due to the scalar, Advanced SIMD, and SVE floating-point conversion instructions listed in Floating-point conversions section of ARMv9 Reference Manual.", }, { .name = "ASE_FP_CVT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8039, .desc = "This event counts architecturally executed Advanced SIMD floating-point convert operation.", }, { .name = "SVE_FP_CVT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x803a, .desc = "This event counts architecturally executed SVE floating-point convert operation.", }, { .name = "ASE_SVE_FP_CVT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x803b, .desc = "This event counts architecturally executed Advanced SIMD and SVE floating-point convert operations.", }, { .name = "SVE_FP_AREDUCE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x803c, .desc = "This event counts architecturally executed SVE floating-point 
accumulating reduction operations.", }, { .name = "ASE_FP_PREDUCE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x803d, .desc = "This event counts architecturally executed Advanced SIMD floating-point pairwise add step operations.", }, { .name = "SVE_FP_VREDUCE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x803e, .desc = "This event counts architecturally executed SVE floating-point vector reduction operation.", }, { .name = "ASE_SVE_FP_VREDUCE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x803f, .desc = "This event counts architecturally executed Advanced SIMD and SVE floating-point vector reduction operations.", }, { .name = "INT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8040, .desc = "This event counts architecturally executed operations due to scalar, Advanced SIMD, and SVE instructions listed in Integer instructions section of ARMv9 Reference Manual.", }, { .name = "ASE_INT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8041, .desc = "This event counts architecturally executed Advanced SIMD integer operations.", }, { .name = "SVE_INT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8042, .desc = "This event counts architecturally executed SVE integer operations.", }, { .name = "ASE_SVE_INT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8043, .desc = "This event counts architecturally executed Advanced SIMD and SVE integer operations.", }, { .name = "INT_DIV_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8044, .desc = "This event counts architecturally executed integer divide operation.", }, { .name = "INT_DIV64_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8045, .desc = "This event counts architecturally executed 64-bit integer divide operation.", }, { .name = "SVE_INT_DIV_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8046, .desc = "This event counts architecturally executed SVE integer divide operation.", }, { .name = "SVE_INT_DIV64_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8047, .desc = "This event counts architecturally executed SVE 64-bit integer divide operation.", }, { .name = "INT_MUL_SPEC", .modmsk = ARMV9_ATTRS, .code 
= 0x8048, .desc = "This event counts architecturally executed integer multiply operation.", }, { .name = "ASE_INT_MUL_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8049, .desc = "This event counts architecturally executed Advanced SIMD integer multiply operation.", }, { .name = "SVE_INT_MUL_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x804a, .desc = "This event counts architecturally executed SVE integer multiply operation.", }, { .name = "ASE_SVE_INT_MUL_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x804b, .desc = "This event counts architecturally executed Advanced SIMD and SVE integer multiply operations.", }, { .name = "INT_MUL64_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x804c, .desc = "This event counts architecturally executed integer 64-bit x 64-bit multiply operation.", }, { .name = "SVE_INT_MUL64_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x804d, .desc = "This event counts architecturally executed SVE integer 64-bit x 64-bit multiply operation.", }, { .name = "INT_MULH64_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x804e, .desc = "This event counts architecturally executed integer 64-bit x 64-bit multiply returning high part operation.", }, { .name = "SVE_INT_MULH64_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x804f, .desc = "This event counts architecturally executed SVE integer 64-bit x 64-bit multiply returning high part operations.", }, { .name = "NONFP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8058, .desc = "This event counts architecturally executed non-floating-point operations.", }, { .name = "ASE_NONFP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8059, .desc = "This event counts architecturally executed Advanced SIMD non-floating-point operations.", }, { .name = "SVE_NONFP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x805a, .desc = "This event counts architecturally executed SVE non-floating-point operations.", }, { .name = "ASE_SVE_NONFP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x805b, .desc = "This event counts architecturally executed Advanced SIMD and SVE non-floating-point operations.", }, { 
.name = "ASE_INT_VREDUCE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x805d, .desc = "This event counts architecturally executed Advanced SIMD integer reduction operation.", }, { .name = "SVE_INT_VREDUCE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x805e, .desc = "This event counts architecturally executed SVE integer reduction operation.", }, { .name = "ASE_SVE_INT_VREDUCE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x805f, .desc = "This event counts architecturally executed Advanced SIMD and SVE integer reduction operations.", }, { .name = "SVE_PERM_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8060, .desc = "This event counts architecturally executed vector or predicate permute operation.", }, { .name = "SVE_XPIPE_Z2R_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8065, .desc = "This event counts architecturally executed vector to general-purpose scalar cross-pipeline transfer operation.", }, { .name = "SVE_XPIPE_R2Z_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8066, .desc = "This event counts architecturally executed general-purpose scalar to vector cross-pipeline transfer operation.", }, { .name = "SVE_PGEN_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8068, .desc = "This event counts architecturally executed predicate-generating operation.", }, { .name = "SVE_PGEN_FLG_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8069, .desc = "This event counts architecturally executed predicate-generating operation that sets condition flags.", }, { .name = "SVE_PPERM_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x806d, .desc = "This event counts architecturally executed predicate permute operation.", }, { .name = "SVE_PRED_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8074, .desc = "This event counts architecturally executed SIMD data-processing and load/store operations due to SVE instructions with a Governing predicate operand that determines the Active elements.", }, { .name = "SVE_MOVPRFX_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x807c, .desc = "This event counts architecturally executed operations due to MOVPRFX instructions, 
whether or not they were fused with the prefixed instruction.", }, { .name = "SVE_MOVPRFX_Z_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x807d, .desc = "This event counts architecturally executed operation counted by SVE_MOVPRFX_SPEC where the operation uses zeroing predication.", }, { .name = "SVE_MOVPRFX_M_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x807e, .desc = "This event counts architecturally executed operation counted by SVE_MOVPRFX_SPEC where the operation uses merging predication.", }, { .name = "SVE_MOVPRFX_U_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x807f, .desc = "This event counts architecturally executed operations due to MOVPRFX instructions that were not fused with the prefixed instruction.", }, { .name = "ASE_SVE_LD_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8085, .desc = "This event counts architecturally executed operations that read from memory due to SVE and Advanced SIMD load instructions.", }, { .name = "ASE_SVE_ST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8086, .desc = "This event counts architecturally executed operations that write to memory due to SVE and Advanced SIMD store instructions.", }, { .name = "PRF_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8087, .desc = "This event counts architecturally executed prefetch operations due to scalar PRFM, PRFUM and SVE PRF instructions.", }, { .name = "BASE_LD_REG_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8089, .desc = "This event counts architecturally executed operations that read from memory due to an instruction that loads a general-purpose register.", }, { .name = "BASE_ST_REG_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x808a, .desc = "This event counts architecturally executed operations that write to memory due to an instruction that stores a general-purpose register, excluding the DC ZVA instruction.", }, { .name = "SVE_LDR_REG_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8091, .desc = "This event counts architecturally executed operations that read from memory due to an SVE LDR instruction.", }, { .name = 
"SVE_STR_REG_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8092, .desc = "This event counts architecturally executed operations that write to memory due to an SVE STR instruction.", }, { .name = "SVE_LDR_PREG_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8095, .desc = "This event counts architecturally executed operations that read from memory due to an SVE LDR (predicate) instruction.", }, { .name = "SVE_STR_PREG_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8096, .desc = "This event counts architecturally executed operations that write to memory due to an SVE STR (predicate) instruction.", }, { .name = "SVE_PRF_CONTIG_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x809f, .desc = "This event counts architecturally executed operations that prefetch memory due to an SVE predicated single contiguous element prefetch instruction.", }, { .name = "SVE_LDNT_CONTIG_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80a1, .desc = "This event counts architecturally executed operation that reads from memory with a non-temporal hint due to an SVE non-temporal contiguous element load instruction.", }, { .name = "SVE_STNT_CONTIG_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80a2, .desc = "This event counts architecturally executed operation that writes to memory with a non-temporal hint due to an SVE non-temporal contiguous element store instruction.", }, { .name = "ASE_SVE_LD_MULTI_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80a5, .desc = "This event counts architecturally executed operations that read from memory due to SVE and Advanced SIMD multiple vector contiguous structure load instructions.", }, { .name = "ASE_SVE_ST_MULTI_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80a6, .desc = "This event counts architecturally executed operations that write to memory due to SVE and Advanced SIMD multiple vector contiguous structure store instructions.", }, { .name = "SVE_LD_GATHER_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80ad, .desc = "This event counts architecturally executed operations that read from memory due to SVE 
non-contiguous gather-load instructions.", }, { .name = "SVE_ST_SCATTER_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80ae, .desc = "This event counts architecturally executed operations that write to memory due to SVE non-contiguous scatter-store instructions.", }, { .name = "SVE_PRF_GATHER_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80af, .desc = "This event counts architecturally executed operations that prefetch memory due to SVE non-contiguous gather-prefetch instructions.", }, { .name = "SVE_LDFF_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80bc, .desc = "This event counts architecturally executed memory read operations due to SVE First-fault and Non-fault load instructions.", }, { .name = "FP_SCALE_OPS_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80c0, .desc = "This event counts architecturally executed SVE arithmetic operations. See FP_SCALE_OPS_SPEC of ARMv9 Reference Manual for more information. This event counter is incremented by (128 / CSIZE) and by twice that amount for operations that would also be counted by SVE_FP_FMA_SPEC.", }, { .name = "FP_FIXED_OPS_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80c1, .desc = "This event counts architecturally executed v8SIMD&FP arithmetic operations. See FP_FIXED_OPS_SPEC of ARMv9 Reference Manual for more information. The event counter is incremented by the specified number of elements for Advanced SIMD operations or by 1 for scalar operations, and by twice those amounts for operations that would also be counted by FP_FMA_SPEC.", }, { .name = "FP_HP_SCALE_OPS_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80c2, .desc = "This event counts architecturally executed SVE half-precision arithmetic operations. See FP_HP_SCALE_OPS_SPEC of ARMv9 Reference Manual for more information. 
This event counter is incremented by 8, or by 16 for operations that would also be counted by SVE_FP_FMA_SPEC.", }, { .name = "FP_HP_FIXED_OPS_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80c3, .desc = "This event counts architecturally executed v8SIMD&FP half-precision arithmetic operations. See FP_HP_FIXED_OPS_SPEC of ARMv9 Reference Manual for more information. This event counter is incremented by the number of 16-bit elements for Advanced SIMD operations, or by 1 for scalar operations, and by twice those amounts for operations that would also be counted by FP_FMA_SPEC.", }, { .name = "FP_SP_SCALE_OPS_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80c4, .desc = "This event counts architecturally executed SVE single-precision arithmetic operations. See FP_SP_SCALE_OPS_SPEC of ARMv9 Reference Manual for more information. This event counter is incremented by 4, or by 8 for operations that would also be counted by SVE_FP_FMA_SPEC.", }, { .name = "FP_SP_FIXED_OPS_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80c5, .desc = "This event counts architecturally executed v8SIMD&FP single-precision arithmetic operations. See FP_SP_FIXED_OPS_SPEC of ARMv9 Reference Manual for more information. This event counter is incremented by the number of 32-bit elements for Advanced SIMD operations, or by 1 for scalar operations, and by twice those amounts for operations that would also be counted by FP_FMA_SPEC.", }, { .name = "FP_DP_SCALE_OPS_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80c6, .desc = "This event counts architecturally executed SVE double-precision arithmetic operations. See FP_DP_SCALE_OPS_SPEC of ARMv9 Reference Manual for more information. This event counter is incremented by 2, or by 4 for operations that would also be counted by SVE_FP_FMA_SPEC.", }, { .name = "FP_DP_FIXED_OPS_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80c7, .desc = "This event counts architecturally executed v8SIMD&FP double-precision arithmetic operations. 
See FP_DP_FIXED_OPS_SPEC of ARMv9 Reference Manual for more information. This event counter is incremented by 2 for Advanced SIMD operations, or by 1 for scalar operations, and by twice those amounts for operations that would also be counted by FP_FMA_SPEC.", }, { .name = "INT_SCALE_OPS_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80c8, .desc = "This event counts each integer ALU operation counted by SVE_INT_SPEC. See ALU operation counts section of ARMv9 Reference Manual for information on the counter increment for different types of instruction.", }, { .name = "INT_FIXED_OPS_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80c9, .desc = "This event counts each integer ALU operation counted by INT_SPEC that is not counted by SVE_INT_SPEC. See ALU operation counts section of ARMv9 Reference Manual for information on the counter increment for different types of instruction.", }, { .name = "ASE_SVE_FP_DOT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80f3, .desc = "This event counts architecturally executed microarchitectural Advanced SIMD or SVE floating-point dot-product operation.", }, { .name = "ASE_SVE_FP_MMLA_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80f7, .desc = "This event counts architecturally executed microarchitectural Advanced SIMD or SVE floating-point matrix multiply operation.", }, { .name = "ASE_SVE_INT_DOT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80fb, .desc = "This event counts architecturally executed microarchitectural Advanced SIMD or SVE integer dot-product operation.", }, { .name = "ASE_SVE_INT_MMLA_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80ff, .desc = "This event counts architecturally executed microarchitectural Advanced SIMD or SVE integer matrix multiply operation.", }, { .name = "DTLB_WALK_PERCYC", .modmsk = ARMV9_ATTRS, .code = 0x8128, .desc = "This event counts the number of DTLB_WALK events in progress on each Processor cycle.", }, { .name = "ITLB_WALK_PERCYC", .modmsk = ARMV9_ATTRS, .code = 0x8129, .desc = "This event counts the number of ITLB_WALK events 
in progress on each Processor cycle.", }, { .name = "DTLB_STEP", .modmsk = ARMV9_ATTRS, .code = 0x8136, .desc = "This event counts translation table walk access made by a refill of the data TLB.", }, { .name = "ITLB_STEP", .modmsk = ARMV9_ATTRS, .code = 0x8137, .desc = "This event counts translation table walk access made by a refill of the instruction TLB.", }, { .name = "DTLB_WALK_LARGE", .modmsk = ARMV9_ATTRS, .code = 0x8138, .desc = "This event counts translation table walk counted by DTLB_WALK where the result of the walk yields a large page size.", }, { .name = "ITLB_WALK_LARGE", .modmsk = ARMV9_ATTRS, .code = 0x8139, .desc = "This event counts translation table walk counted by ITLB_WALK where the result of the walk yields a large page size.", }, { .name = "DTLB_WALK_SMALL", .modmsk = ARMV9_ATTRS, .code = 0x813a, .desc = "This event counts translation table walk counted by DTLB_WALK where the result of the walk yields a small page size.", }, { .name = "ITLB_WALK_SMALL", .modmsk = ARMV9_ATTRS, .code = 0x813b, .desc = "This event counts translation table walk counted by ITLB_WALK where the result of the walk yields a small page size.", }, { .name = "L1D_CACHE_MISS", .modmsk = ARMV9_ATTRS, .code = 0x8144, .desc = "This event counts demand access that misses in the Level 1 data cache, causing an access to outside of the Level 1 caches of this PE.", }, { .name = "L1I_CACHE_HWPRF", .modmsk = ARMV9_ATTRS, .code = 0x8145, .desc = "This event counts access counted by L1I_CACHE that is due to a hardware prefetch.", }, { .name = "L2D_CACHE_MISS", .modmsk = ARMV9_ATTRS, .code = 0x814c, .desc = "This event counts demand access that misses in the Level 1 data and Level 2 caches, causing an access to outside of the Level 1 and Level 2 caches of this PE.", }, { .name = "L1D_CACHE_HWPRF", .modmsk = ARMV9_ATTRS, .code = 0x8154, .desc = "This event counts access counted by L1D_CACHE that is due to a hardware prefetch.", }, { .name = "L2D_CACHE_HWPRF", .modmsk = ARMV9_ATTRS, 
.code = 0x8155, .desc = "This event counts access counted by L2D_CACHE that is due to a hardware prefetch.", }, { .name = "STALL_FRONTEND_MEMBOUND", .modmsk = ARMV9_ATTRS, .code = 0x8158, .desc = "This event counts every cycle counted by STALL_FRONTEND when no instructions are delivered from the memory system.", }, { .name = "STALL_FRONTEND_L1I", .modmsk = ARMV9_ATTRS, .code = 0x8159, .desc = "This event counts every cycle counted by STALL_FRONTEND_MEMBOUND when there is a demand instruction miss in the first level of instruction cache.", }, { .name = "STALL_FRONTEND_L2I", .modmsk = ARMV9_ATTRS, .code = 0x815a, .desc = "This event counts every cycle counted by STALL_FRONTEND_MEMBOUND when there is a demand instruction miss in the second level of instruction cache.", }, { .name = "STALL_FRONTEND_MEM", .modmsk = ARMV9_ATTRS, .code = 0x815b, .desc = "This event counts every cycle counted by STALL_FRONTEND_MEMBOUND when there is a demand instruction miss in the last level of instruction cache within the PE clock domain or a non-cacheable instruction fetch in progress.", }, { .name = "STALL_FRONTEND_TLB", .modmsk = ARMV9_ATTRS, .code = 0x815c, .desc = "This event counts every cycle counted by STALL_FRONTEND_MEMBOUND when there is a demand instruction miss in the instruction TLB.", }, { .name = "STALL_FRONTEND_CPUBOUND", .modmsk = ARMV9_ATTRS, .code = 0x8160, .desc = "This event counts every cycle counted by STALL_FRONTEND when the frontend is stalled on a frontend processor resource, not including memory.", }, { .name = "STALL_FRONTEND_FLOW", .modmsk = ARMV9_ATTRS, .code = 0x8161, .desc = "This event counts every cycle counted by STALL_FRONTEND_CPUBOUND when the frontend is stalled on unavailability of prediction flow resources.", }, { .name = "STALL_FRONTEND_FLUSH", .modmsk = ARMV9_ATTRS, .code = 0x8162, .desc = "This event counts every cycle counted by STALL_FRONTEND_CPUBOUND when the frontend is recovering from a pipeline flush.", }, { .name = 
"STALL_FRONTEND_RENAME", .modmsk = ARMV9_ATTRS, .code = 0x8163, .desc = "This event counts every cycle counted by STALL_FRONTEND_CPUBOUND when operations are available from the frontend but at least one is not ready to be sent to the backend because no rename register is available.", }, { .name = "STALL_BACKEND_MEMBOUND", .modmsk = ARMV9_ATTRS, .code = 0x8164, .desc = "This event counts every cycle counted by STALL_BACKEND when the backend is waiting for a memory access to complete.", }, { .name = "STALL_BACKEND_L1D", .modmsk = ARMV9_ATTRS, .code = 0x8165, .desc = "This event counts every cycle counted by STALL_BACKEND_MEMBOUND when there is a demand data miss in L1D cache.", }, { .name = "STALL_BACKEND_L2D", .modmsk = ARMV9_ATTRS, .code = 0x8166, .desc = "This event counts every cycle counted by STALL_BACKEND_MEMBOUND when there is a demand data miss in L2D cache.", }, { .name = "STALL_BACKEND_TLB", .modmsk = ARMV9_ATTRS, .code = 0x8167, .desc = "This event counts every cycle counted by STALL_BACKEND_MEMBOUND when there is a demand data miss in the data TLB.", }, { .name = "STALL_BACKEND_ST", .modmsk = ARMV9_ATTRS, .code = 0x8168, .desc = "This event counts every cycle counted by STALL_BACKEND_MEMBOUND when the backend is stalled waiting for a store.", }, { .name = "STALL_BACKEND_CPUBOUND", .modmsk = ARMV9_ATTRS, .code = 0x816a, .desc = "This event counts every cycle counted by STALL_BACKEND when the backend is stalled on a processor resource, not including memory.", }, { .name = "STALL_BACKEND_BUSY", .modmsk = ARMV9_ATTRS, .code = 0x816b, .desc = "This event counts every cycle counted by STALL_BACKEND when operations are available from the frontend but the backend is not able to accept an operation because an execution unit is busy.", }, { .name = "STALL_BACKEND_ILOCK", .modmsk = ARMV9_ATTRS, .code = 0x816c, .desc = "This event counts every cycle counted by STALL_BACKEND when operations are available from the frontend but at least one is not ready to be sent to 
the backend because of an input dependency.", }, { .name = "STALL_BACKEND_RENAME", .modmsk = ARMV9_ATTRS, .code = 0x816d, .desc = "This event counts every cycle counted by STALL_BACKEND_CPUBOUND when operations are available from the frontend but at least one is not ready to be sent to the backend because no rename register is available.", }, { .name = "STALL_BACKEND_ATOMIC", .modmsk = ARMV9_ATTRS, .code = 0x816e, .desc = "This event counts every cycle counted by STALL_BACKEND_MEMBOUND when the backend is processing an Atomic operation.", }, { .name = "STALL_BACKEND_MEMCPYSET", .modmsk = ARMV9_ATTRS, .code = 0x816f, .desc = "This event counts every cycle counted by STALL_BACKEND_MEMBOUND when the backend is processing a Memory Copy or Set instruction.", }, { .name = "UOP_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x8186, .desc = "This event counts micro-operation that would be executed in a Simple sequential execution of the program.", }, { .name = "DTLB_WALK_BLOCK", .modmsk = ARMV9_ATTRS, .code = 0x8188, .desc = "This event counts translation table walk counted by DTLB_WALK where the result of the walk yields a Block.", }, { .name = "ITLB_WALK_BLOCK", .modmsk = ARMV9_ATTRS, .code = 0x8189, .desc = "This event counts translation table walk counted by ITLB_WALK where the result of the walk yields a Block.", }, { .name = "DTLB_WALK_PAGE", .modmsk = ARMV9_ATTRS, .code = 0x818a, .desc = "This event counts translation table walk counted by DTLB_WALK where the result of the walk yields a Page.", }, { .name = "ITLB_WALK_PAGE", .modmsk = ARMV9_ATTRS, .code = 0x818b, .desc = "This event counts translation table walk counted by ITLB_WALK where the result of the walk yields a Page.", }, { .name = "L1I_CACHE_REFILL_HWPRF", .modmsk = ARMV9_ATTRS, .code = 0x81b8, .desc = "This event counts hardware prefetch counted by L1I_CACHE_HWPRF that causes a refill of the Level 1 instruction cache from outside of the Level 1 instruction cache.", }, { .name = "L1D_CACHE_REFILL_HWPRF", .modmsk 
= ARMV9_ATTRS, .code = 0x81bc, .desc = "This event counts hardware prefetch counted by L1D_CACHE_HWPRF that causes a refill of the Level 1 data cache from outside of the Level 1 data cache.", }, { .name = "L2D_CACHE_REFILL_HWPRF", .modmsk = ARMV9_ATTRS, .code = 0x81bd, .desc = "This event counts hardware prefetch counted by L2D_CACHE_HWPRF that causes a refill of the Level 2 cache, or any Level 1 data and instruction cache of this PE, from outside of those caches.", }, { .name = "L1I_CACHE_HIT_RD", .modmsk = ARMV9_ATTRS, .code = 0x81c0, .desc = "This event counts demand fetch counted by L1I_CACHE_DM_RD that hits in the Level 1 instruction cache.", }, { .name = "L1D_CACHE_HIT_RD", .modmsk = ARMV9_ATTRS, .code = 0x81c4, .desc = "This event counts demand read counted by L1D_CACHE_RD that hits in the Level 1 data cache.", }, { .name = "L2D_CACHE_HIT_RD", .modmsk = ARMV9_ATTRS, .code = 0x81c5, .desc = "This event counts demand read counted by L2D_CACHE_RD that hits in the Level 2 data cache.", }, { .name = "L1D_CACHE_HIT_WR", .modmsk = ARMV9_ATTRS, .code = 0x81c8, .desc = "This event counts demand write counted by L1D_CACHE_WR that hits in the Level 1 data cache.", }, { .name = "L2D_CACHE_HIT_WR", .modmsk = ARMV9_ATTRS, .code = 0x81c9, .desc = "This event counts demand write counted by L2D_CACHE_WR that hits in the Level 2 data cache.", }, { .name = "L1I_CACHE_HIT", .modmsk = ARMV9_ATTRS, .code = 0x8200, .desc = "This event counts access counted by L1I_CACHE that hits in the Level 1 instruction cache.", }, { .name = "L1D_CACHE_HIT", .modmsk = ARMV9_ATTRS, .code = 0x8204, .desc = "This event counts access counted by L1D_CACHE that hits in the Level 1 data cache.", }, { .name = "L2D_CACHE_HIT", .modmsk = ARMV9_ATTRS, .code = 0x8205, .desc = "This event counts access counted by L2D_CACHE that hits in the Level 2 data cache.", }, { .name = "L1I_LFB_HIT_RD", .modmsk = ARMV9_ATTRS, .code = 0x8240, .desc = "This event counts demand access counted by L1I_CACHE_HIT_RD that hits 
a cache line that is in the process of being loaded into the Level 1 instruction cache.", }, { .name = "L1D_LFB_HIT_RD", .modmsk = ARMV9_ATTRS, .code = 0x8244, .desc = "This event counts demand access counted by L1D_CACHE_HIT_RD that hits a cache line that is in the process of being loaded into the Level 1 data cache.", }, { .name = "L2D_LFB_HIT_RD", .modmsk = ARMV9_ATTRS, .code = 0x8245, .desc = "This event counts demand access counted by L2D_CACHE_HIT_RD that hits a recently fetched line in the Level 2 cache.", }, { .name = "L1D_LFB_HIT_WR", .modmsk = ARMV9_ATTRS, .code = 0x8248, .desc = "This event counts demand access counted by L1D_CACHE_HIT_WR that hits a cache line that is in the process of being loaded into the Level 1 data cache.", }, { .name = "L2D_LFB_HIT_WR", .modmsk = ARMV9_ATTRS, .code = 0x8249, .desc = "This event counts demand access counted by L2D_CACHE_HIT_WR that hits a recently fetched line in the Level 2 cache.", }, { .name = "L1I_CACHE_PRF", .modmsk = ARMV9_ATTRS, .code = 0x8280, .desc = "This event counts fetch counted by either Level 1 instruction hardware prefetch or Level 1 instruction software prefetch.", }, { .name = "L1D_CACHE_PRF", .modmsk = ARMV9_ATTRS, .code = 0x8284, .desc = "This event counts fetch counted by either Level 1 data hardware prefetch or Level 1 data software prefetch.", }, { .name = "L2D_CACHE_PRF", .modmsk = ARMV9_ATTRS, .code = 0x8285, .desc = "This event counts fetch counted by either Level 2 data hardware prefetch or Level 2 data software prefetch.", }, { .name = "L1I_CACHE_REFILL_PRF", .modmsk = ARMV9_ATTRS, .code = 0x8288, .desc = "This event counts hardware prefetch counted by L1I_CACHE_PRF that causes a refill of the Level 1 instruction cache from outside of the Level 1 instruction cache.", }, { .name = "L1D_CACHE_REFILL_PRF", .modmsk = ARMV9_ATTRS, .code = 0x828c, .desc = "This event counts hardware prefetch counted by L1D_CACHE_PRF that causes a refill of the Level 1 data cache from outside of the Level 1 
data cache.", }, { .name = "L2D_CACHE_REFILL_PRF", .modmsk = ARMV9_ATTRS, .code = 0x828d, .desc = "This event counts hardware prefetch counted by L2D_CACHE_PRF that causes a refill of the Level 2 data cache from outside of the Level 1 data cache.", }, { .name = "L1D_CACHE_REFILL_PERCYC", .modmsk = ARMV9_ATTRS, .code = 0x8320, .desc = "The counter counts by the number of cache refills counted by L1D_CACHE_REFILL in progress on each Processor cycle.", }, { .name = "L2D_CACHE_REFILL_PERCYC", .modmsk = ARMV9_ATTRS, .code = 0x8321, .desc = "The counter counts by the number of cache refills counted by L2D_CACHE_REFILL in progress on each Processor cycle.", }, { .name = "L1I_CACHE_REFILL_PERCYC", .modmsk = ARMV9_ATTRS, .code = 0x8324, .desc = "The counter counts by the number of cache refills counted by L1I_CACHE_REFILL in progress on each Processor cycle.", }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_hisilicon_kunpeng_events.h000066400000000000000000000426121502707512200261470ustar00rootroot00000000000000/* * Copyright (c) 2021 Barcelona Supercomputing Center * Contributed by Estanislao Mercadal Melià * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * Hisilicon Kunpeng 920 * Based on https://developer.arm.com/documentation/ddi0487/latest/ and * https://github.com/torvalds/linux/blob/master/tools/perf/pmu-events/arch/arm64/hisilicon/hip08/core-imp-def.json */ static const arm_entry_t arm_kunpeng_pe[ ] = { /* Common architectural events */ { .name = "SW_INCR", .modmsk = ARMV8_ATTRS, .code = 0x00, .desc = "Instruction architecturally executed, Condition code check pass, software increment" }, { .name = "INST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x08, .desc = "Instruction architecturally executed" }, { .name = "EXC_TAKEN", .modmsk = ARMV8_ATTRS, .code = 0x09, .desc = "Exception taken" }, { .name = "EXC_RETURN", .modmsk = ARMV8_ATTRS, .code = 0x0a, .desc = "Instruction architecturally executed, Condition code check pass, exception return" }, { .name = "CID_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0b, .desc = "Instruction architecturally executed, Condition code check pass, write to CONTEXTIDR" }, { .name = "BR_RETURN_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0e, .desc = "Instruction architecturally executed, Condition code check pass, procedure return" }, { .name = "TTBR_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x1c, .desc = "Instruction architecturally executed, Condition code check pass, write to TTBR" }, { .name = "BR_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x21, .desc = "Instruction architecturally executed, branch" }, { .name = "SVE_INST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8002, .desc = "This event counts architecturally executed SVE instructions.", }, /* Common microarchitectural events */ { .name = "L1I_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x01, .desc = "Level 1 instruction cache refill." 
}, { .name = "L1I_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x02, .desc = "Attributable Level 1 instruction TLB refill." }, { .name = "L1D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x03, .desc = "Level 1 data cache refill." }, { .name = "L1D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x04, .desc = "Level 1 data cache access." }, { .name = "L1D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x05, .desc = "Attributable Level 1 data TLB refill." }, { .name = "BR_MIS_PRED", .modmsk = ARMV8_ATTRS, .code = 0x10, .desc = "Mispredicted or not predicted branch. Speculatively executed." }, { .name = "CPU_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x11, .desc = "Cycle." }, { .name = "BR_PRED", .modmsk = ARMV8_ATTRS, .code = 0x12, .desc = "Predictable branch. Speculatively executed." }, { .name = "MEM_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x13, .desc = "Data memory access." }, { .name = "L1I_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x14, .desc = "Attributable Level 1 instruction cache access." }, { .name = "L1D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x15, .desc = "Attributable Level 1 data cache write-back." }, { .name = "L2D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x16, .desc = "Level 2 data cache access." }, { .name = "L2D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x17, .desc = "Level 2 data cache refill." }, { .name = "L2D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x18, .desc = "Attributable Level 2 data cache write-back." }, { .name = "BUS_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x19, .desc = "Attributable Bus access." }, { .name = "MEMORY_ERROR", .modmsk = ARMV8_ATTRS, .code = 0x1a, .desc = "Local memory error." }, { .name = "INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x001b, .desc = "Operation speculatively executed." }, { .name = "BUS_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x1d, .desc = "Bus cycle." }, { .name = "BR_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x22, .desc = "Instruction architecturally executed, mispredicted branch." 
}, { .name = "STALL_FRONTEND", .modmsk = ARMV8_ATTRS, .code = 0x23, .desc = "No operation issued due to the frontend." }, { .name = "STALL_BACKEND", .modmsk = ARMV8_ATTRS, .code = 0x24, .desc = "No operation issued due to the backend." }, { .name = "L1D_TLB", .modmsk = ARMV8_ATTRS, .code = 0x25, .desc = "Attributable Level 1 data or unified TLB access." }, { .name = "L1I_TLB", .modmsk = ARMV8_ATTRS, .code = 0x26, .desc = "Attributable Level 1 instruction TLB access." }, { .name = "L2I_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x27, .desc = "Attributable Level 2 instruction cache access." }, { .name = "L2I_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x28, .desc = "Attributable Level 2 instruction cache refill." }, { .name = "L2D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x2d, .desc = "Attributable Level 2 data TLB refill." }, { .name = "L2I_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x2e, .desc = "Attributable Level 2 instruction TLB refill." }, { .name = "L2D_TLB", .modmsk = ARMV8_ATTRS, .code = 0x2f, .desc = "Attributable Level 2 data or unified TLB access." }, { .name = "L2I_TLB", .modmsk = ARMV8_ATTRS, .code = 0x30, .desc = "Attributable Level 2 instruction TLB access." }, { .name = "REMOTE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x31, .desc = "Access to another socket in a multi-socket system." }, { .name = "LL_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x32, .desc = "Last Level cache access." }, { .name = "LL_CACHE_MISS", .modmsk = ARMV8_ATTRS, .code = 0x33, .desc = "Last Level cache miss." }, { .name = "DTLB_WALK", .modmsk = ARMV8_ATTRS, .code = 0x34, .desc = "Access to data TLB causes a translation table walk." }, { .name = "ITLB_WALK", .modmsk = ARMV8_ATTRS, .code = 0x35, .desc = "Access to instruction TLB that causes a translation table walk." }, { .name = "LL_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x36, .desc = "Attributable Last level cache memory read." 
}, { .name = "LL_CACHE_MISS_RD", .modmsk = ARMV8_ATTRS, .code = 0x37, .desc = "Last level cache miss, read." }, { .name = "REMOTE_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x38, .desc = "Access to another socket in a multi-socket system, read." }, { .name = "SAMPLE_POP", .modmsk = ARMV8_ATTRS, .code = 0x4000, .desc = "Sample Population." }, { .name = "SAMPLE_FEED", .modmsk = ARMV8_ATTRS, .code = 0x4001, .desc = "Sample Taken." }, { .name = "SAMPLE_FILTRATE", .modmsk = ARMV8_ATTRS, .code = 0x4002, .desc = "Sample taken and not removed by filtering." }, { .name = "SAMPLE_COLLISION", .modmsk = ARMV8_ATTRS, .code = 0x4003, .desc = "Sample collided with a previous sample." }, /* ARM recommended Implementation Defined */ { .name = "L1D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x40, .desc = "Attributable Level 1 data cache access, read." }, { .name = "L1D_CACHE_WR", .modmsk = ARMV8_ATTRS, .code = 0x41, .desc = "Attributable Level 1 data cache access, write." }, { .name = "L1D_CACHE_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x42, .desc = "Attributable Level 1 data cache refill, read." }, { .name = "L1D_CACHE_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x43, .desc = "Attributable Level 1 data cache refill, write." }, { .name = "L1D_CACHE_WB_VICTIM", .modmsk = ARMV8_ATTRS, .code = 0x46, .desc = "Attributable Level 1 data cache Write-Back, victim." }, { .name = "L1D_CACHE_WB_CLEAN", .modmsk = ARMV8_ATTRS, .code = 0x47, .desc = "Level 1 data cache Write-Back, cleaning and coherency." }, { .name = "L1D_CACHE_INVAL", .modmsk = ARMV8_ATTRS, .code = 0x48, .desc = "Attributable Level 1 data cache invalidate." }, { .name = "L1D_TLB_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x4c, .desc = "Attributable Level 1 data TLB refill, read." }, { .name = "L1D_TLB_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x4d, .desc = "Attributable Level 1 data TLB refill, write." 
}, { .name = "L1D_TLB_RD", .modmsk = ARMV8_ATTRS, .code = 0x4e, .desc = "Attributable Level 1 data or unified TLB access, read." }, { .name = "L1D_TLB_WR", .modmsk = ARMV8_ATTRS, .code = 0x4f, .desc = "Attributable Level 1 data or unified TLB access, write." }, { .name = "L2D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x50, .desc = "Attributable Level 2 data cache access, read." }, { .name = "L2D_CACHE_WR", .modmsk = ARMV8_ATTRS, .code = 0x51, .desc = "Attributable Level 2 data cache access, write." }, { .name = "L2D_CACHE_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x52, .desc = "Attributable Level 2 data cache refill, read." }, { .name = "L2D_CACHE_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x53, .desc = "Attributable Level 2 data cache refill, write." }, { .name = "L2D_CACHE_WB_VICTIM", .modmsk = ARMV8_ATTRS, .code = 0x56, .desc = "Attributable Level 2 data cache Write-Back, victim." }, { .name = "L2D_CACHE_WB_CLEAN", .modmsk = ARMV8_ATTRS, .code = 0x57, .desc = "Level 2 data cache Write-Back, cleaning and coherency." }, { .name = "L2D_CACHE_INVAL", .modmsk = ARMV8_ATTRS, .code = 0x58, .desc = "Attributable Level 2 data cache invalidate." }, { .name = "BUS_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x60, .desc = "Bus access, read." }, { .name = "BUS_ACCESS_WR", .modmsk = ARMV8_ATTRS, .code = 0x61, .desc = "Bus access, write." }, { .name = "BUS_ACCESS_SHARED", .modmsk = ARMV8_ATTRS, .code = 0x62, .desc = "Bus access, Normal, Cacheable, Shareable." }, { .name = "BUS_ACCESS_NOT_SHARED", .modmsk = ARMV8_ATTRS, .code = 0x63, .desc = "Bus access, not Normal, Cacheable, Shareable." }, { .name = "BUS_ACCESS_NORMAL", .modmsk = ARMV8_ATTRS, .code = 0x64, .desc = "Bus access, normal." }, { .name = "BUS_ACCESS_PERIPH", .modmsk = ARMV8_ATTRS, .code = 0x65, .desc = "Bus access, peripheral." }, { .name = "MEM_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x66, .desc = "Data memory access, read." 
}, { .name = "MEM_ACCESS_WR", .modmsk = ARMV8_ATTRS, .code = 0x67, .desc = "Data memory access, write." }, { .name = "UNALIGNED_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x68, .desc = "Unaligned access, read." }, { .name = "UNALIGNED_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x69, .desc = "Unaligned access, write." }, { .name = "UNALIGNED_LDST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6a, .desc = "Unaligned access." }, { .name = "LDREX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6c, .desc = "Exclusive operation speculatively executed, Load-Exclusive." }, { .name = "STREX_PASS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6d, .desc = "Exclusive operation speculatively executed, Store-Exclusive pass." }, { .name = "STREX_FAIL_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6e, .desc = "Exclusive operation speculatively executed, Store-Exclusive fail." }, { .name = "STREX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6f, .desc = "Exclusive operation speculatively executed, Store-Exclusive." }, { .name = "LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x70, .desc = "Operation speculatively executed, load." }, { .name = "ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x71, .desc = "Operation speculatively executed, store." }, { .name = "LDST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x72, .desc = "Operation speculatively executed, load or store." }, { .name = "DP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x73, .desc = "Operation speculatively executed, integer data processing." }, { .name = "ASE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x74, .desc = "Operation speculatively executed, Advanced SIMD." }, { .name = "VFP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x75, .desc = "Operation speculatively executed, floating-point." }, { .name = "PC_WRITE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x76, .desc = "Operation speculatively executed, software change of the PC." }, { .name = "CRYPTO_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x77, .desc = "Operation speculatively executed, Cryptographic instruction." 
}, { .name = "BR_IMMED_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x78, .desc = "Branch speculatively executed, immediate branch." }, { .name = "BR_RETURN_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x79, .desc = "Branch speculatively executed, procedure return." }, { .name = "BR_INDIRECT_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7a, .desc = "Branch speculatively executed, indirect branch." }, { .name = "ISB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7c, .desc = "Barrier speculatively executed, ISB." }, { .name = "DSB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7d, .desc = "Barrier speculatively executed, DSB." }, { .name = "DMB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7e, .desc = "Barrier speculatively executed, DMB." }, { .name = "EXC_UNDEF", .modmsk = ARMV8_ATTRS, .code = 0x81, .desc = "Exception taken, other synchronous." }, { .name = "EXC_SVC", .modmsk = ARMV8_ATTRS, .code = 0x82, .desc = "Exception taken, Supervisor Call." }, { .name = "EXC_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x83, .desc = "Exception taken, Instruction Abort." }, { .name = "EXC_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x84, .desc = "Exception taken, Data Abort or SError." }, { .name = "EXC_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x86, .desc = "Exception taken, IRQ." }, { .name = "EXC_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x87, .desc = "Exception taken, FIQ." }, { .name = "EXC_SMC", .modmsk = ARMV8_ATTRS, .code = 0x88, .desc = "Exception taken, Secure Monitor Call." }, { .name = "EXC_HVC", .modmsk = ARMV8_ATTRS, .code = 0x8a, .desc = "Exception taken, Hypervisor Call." }, { .name = "EXC_TRAP_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x8b, .desc = "Exception taken, Instruction Abort not Taken locally." }, { .name = "EXC_TRAP_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x8c, .desc = "Exception taken, Data Abort or SError not Taken locally." }, { .name = "EXC_TRAP_OTHER", .modmsk = ARMV8_ATTRS, .code = 0x8d, .desc = "Exception taken, other traps not Taken locally." 
}, { .name = "EXC_TRAP_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x8e, .desc = "Exception taken, IRQ not Taken locally." }, { .name = "EXC_TRAP_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x8f, .desc = "Exception taken, FIQ not Taken locally." }, { .name = "RC_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x90, .desc = "Release consistency operation speculatively executed, Load-Acquire." }, { .name = "RC_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x91, .desc = "Release consistency operation speculatively executed, Store-Release." }, /* Implementation Defined */ { .name = "L1I_CACHE_PRF", .modmsk = ARMV8_ATTRS, .code = 0x102e, .desc = "Level 1 instruction cache prefetch access count." }, { .name = "L1I_CACHE_PRF_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x102f, .desc = "Level 1 instruction cache miss due to prefetch access count" }, { .name = "IQ_IS_EMPTY", .modmsk = ARMV8_ATTRS, .code = 0x1043, .desc = "Instruction queue is empty" }, { .name = "IF_IS_STALL", .modmsk = ARMV8_ATTRS, .code = 0x1044, .desc = "Instruction fetch stall cycles" }, { .name = "FETCH_BUBBLE", .modmsk = ARMV8_ATTRS, .code = 0x2014, .desc = "Instructions can receive, but not send" }, { .name = "PRF_REQ", .modmsk = ARMV8_ATTRS, .code = 0x6013, .desc = "Prefetch request from LSU" }, { .name = "HIT_ON_PRF", .modmsk = ARMV8_ATTRS, .code = 0x6014, .desc = "Hit on prefetched data" }, { .name = "EXE_STALL_CYCLE", .modmsk = ARMV8_ATTRS, .code = 0x7001, .desc = "Cycles of that the number of issuing micro operations are less than 4" }, { .name = "MEM_STALL_ANYLOAD", .modmsk = ARMV8_ATTRS, .code = 0x7004, .desc = "No any micro operation is issued and meanwhile any load operation is not resolved" }, { .name = "MEM_STALL_L1MISS", .modmsk = ARMV8_ATTRS, .code = 0x7006, .desc = "No any micro operation is issued and meanwhile there is any load operation missing L1 cache and pending data refill" }, { .name = "MEM_STALL_L2MISS", .modmsk = ARMV8_ATTRS, .code = 0x7007, .desc = "No any micro operation is issued and meanwhile there is 
any load operation missing both L1 and L2 cache and pending data refill from L3 cache" } }; papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_hisilicon_kunpeng_unc_events.h000066400000000000000000000120661502707512200270140ustar00rootroot00000000000000/* * Copyright (c) 2021 Barcelona Supercomputing Center * Contributed by Estanislao Mercadal Melià * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * Hisilicon Kunpeng 920 * Based on https://developer.arm.com/documentation/ddi0487/latest/ and * https://github.com/torvalds/linux/blob/master/tools/perf/pmu-events/arch/arm64/hisilicon/hip08/uncore-ddrc.json * https://github.com/torvalds/linux/blob/master/tools/perf/pmu-events/arch/arm64/hisilicon/hip08/uncore-hha.json * https://github.com/torvalds/linux/blob/master/tools/perf/pmu-events/arch/arm64/hisilicon/hip08/uncore-l3c.json */ static const arm_entry_t arm_kunpeng_unc_ddrc_pe[ ] = { { .name = "flux_wr", .code = 0x00, .desc = "DDRC total write operations." 
}, { .name = "flux_rd", .code = 0x01, .desc = "DDRC total read operations." }, { .name = "flux_wcmd", .code = 0x02, .desc = "DDRC write commands." }, { .name = "flux_rcmd", .code = 0x03, .desc = "DDRC read commands." }, { .name = "pre_cmd", .code = 0x04, .desc = "DDRC precharge commands." }, { .name = "act_cmd", .code = 0x05, .desc = "DDRC active commands." }, { .name = "rnk_chg", .code = 0x06, .desc = "DDRC rank commands." }, { .name = "rw_chg", .code = 0x07, .desc = "DDRC read and write changes." } }; static const arm_entry_t arm_kunpeng_unc_hha_pe[ ] = { { .name = "rx_ops_num", .code = 0x00, .desc = "The number of all operations received by the HHA." }, { .name = "rx_outer", .code = 0x01, .desc = "The number of all operations received by the HHA from another socket." }, { .name = "rx_sccl", .code = 0x02, .desc = "The number of all operations received by the HHA from another SCCL in this socket." }, { .name = "rx_ccix", .code = 0x03, .desc = "Count of the number of operations that HHA has received from CCIX." }, { .name = "rd_ddr_64b", .code = 0x1c, .desc = "The number of read operations sent by HHA to DDRC which size is 64bytes." }, { .name = "wr_ddr_64b", .code = 0x1d, .desc = "The number of write operations sent by HHA to DDRC which size is 64 bytes." }, { .name = "rd_ddr_128b", .code = 0x1e, .desc = "The number of read operations sent by HHA to DDRC which size is 128 bytes." }, { .name = "wr_ddr_128b", .code = 0x1f, .desc = "The number of write operations sent by HHA to DDRC which size is 128 bytes." }, { .name = "spill_num", .code = 0x20, .desc = "Count of the number of spill operations that the HHA has sent." }, { .name = "spill_success", .code = 0x21, .desc = "Count of the number of successful spill operations that the HHA has sent." } }; static const arm_entry_t arm_kunpeng_unc_l3c_pe[ ] = { { .name = "rd_cpipe", .code = 0x00, .desc = "Total read accesses." }, { .name = "wr_cpipe", .code = 0x01, .desc = "Total write accesses." 
}, { .name = "rd_hit_cpipe", .code = 0x02, .desc = "Total read hits." }, { .name = "wr_hit_cpipe", .code = 0x03, .desc = "Total write hits." }, { .name = "victim_num", .code = 0x04, .desc = "l3c precharge commands." }, { .name = "rd_spipe", .code = 0x20, .desc = "Count of the number of read lines that come from this cluster of CPU core in spipe." }, { .name = "wr_spipe", .code = 0x21, .desc = "Count of the number of write lines that come from this cluster of CPU core in spipe." }, { .name = "rd_hit_spipe", .code = 0x22, .desc = "Count of the number of read lines that hits in spipe of this L3C." }, { .name = "wr_hit_spipe", .code = 0x23, .desc = "Count of the number of write lines that hits in spipe of this L3C." }, { .name = "back_invalid", .code = 0x29, .desc = "Count of the number of L3C back invalid operations." }, { .name = "retry_cpu", .code = 0x40, .desc = "Count of the number of retry that L3C suppresses the CPU operations." }, { .name = "retry_ring", .code = 0x41, .desc = "Count of the number of retry that L3C suppresses the ring operations." }, { .name = "prefetch_drop", .code = 0x42, .desc = "Count of the number of prefetch drops from this L3C." } }; papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_marvell_tx2_unc_events.h000066400000000000000000000100601502707512200255330ustar00rootroot00000000000000/* * Copyright (c) 2019 Marvell Technology Group Ltd * Contributed by Shay Gal-On * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * Marvell ThunderX2 * * ARM Architecture Reference Manual, ARMv8, for ARMv8-A architecture profile, * ARM DDI 0487B.a (ID033117) * * Marvell ThunderX2 C99XX Core and Uncore PMU Events (Abridged) can be found at * https://www.marvell.com/documents/hrur6mybdvk5uki1w0z7/ * */ /* L3C event IDs */ #define L3_EVENT_READ_REQ 0xD #define L3_EVENT_WRITEBACK_REQ 0xE #define L3_EVENT_EVICT_REQ 0x13 #define L3_EVENT_READ_HIT 0x17 #define L3_EVENT_MAX 0x18 /* DMC event IDs */ #define DMC_EVENT_COUNT_CYCLES 0x1 #define DMC_EVENT_WRITE_TXNS 0xB #define DMC_EVENT_DATA_TRANSFERS 0xD #define DMC_EVENT_READ_TXNS 0xF #define DMC_EVENT_MAX 0x10 /* CCPI event IDs */ #define CCPI2_EVENT_REQ_PKT_SENT 0x3D #define CCPI2_EVENT_SNOOP_PKT_SENT 0x65 #define CCPI2_EVENT_DATA_PKT_SENT 0x105 #define CCPI2_EVENT_GIC_PKT_SENT 0x12D static const arm_entry_t arm_thunderx2_unc_dmc_pe[]={ {.name = "UNC_DMC_READS", .modmsk = ARMV8_ATTRS, .code = DMC_EVENT_READ_TXNS, .desc = "Memory read transactions" }, {.name = "UNC_DMC_WRITES", .modmsk = ARMV8_ATTRS, .code = DMC_EVENT_WRITE_TXNS, .desc = "Memory write transactions" }, {.name = "UNC_DMC_DATA_TRANSFERS", .modmsk = ARMV8_ATTRS, .code = DMC_EVENT_DATA_TRANSFERS, .desc = "Memory data transfers" }, {.name = "UNC_DMC_CYCLES", .modmsk = ARMV8_ATTRS, .code = DMC_EVENT_COUNT_CYCLES, .desc = "Clocks at the DMC clock rate" } }; #define ARM_TX2_CORE_DMC_COUNT (sizeof(arm_thunderx2_unc_dmc_pe)/sizeof(arm_entry_t)) static const arm_entry_t arm_thunderx2_unc_ccpi_pe[]={ {.name = "UNC_CCPI_REQ", 
.modmsk = ARMV8_ATTRS, .code = CCPI2_EVENT_REQ_PKT_SENT, .desc = "Request packets sent from this node" }, {.name = "UNC_CCPI_SNOOP", .modmsk = ARMV8_ATTRS, .code = CCPI2_EVENT_SNOOP_PKT_SENT, .desc = "Snoop packets sent from this node" }, {.name = "UNC_CCPI_DATA", .modmsk = ARMV8_ATTRS, .code = CCPI2_EVENT_DATA_PKT_SENT , .desc = "Data packets sent from this node" }, {.name = "UNC_CCPI_GIC", .modmsk = ARMV8_ATTRS, .code = CCPI2_EVENT_GIC_PKT_SENT, .desc = "Interrupt related packets sent from this node" } }; #define ARM_TX2_CORE_CCPI_COUNT (sizeof(arm_thunderx2_unc_ccpi_pe)/sizeof(arm_entry_t)) static const arm_entry_t arm_thunderx2_unc_llc_pe[]={ {.name = "UNC_LLC_READ", .modmsk = ARMV8_ATTRS, .code = L3_EVENT_READ_REQ, .desc = "Read requests to LLC" }, {.name = "UNC_LLC_EVICT", .modmsk = ARMV8_ATTRS, .code = L3_EVENT_EVICT_REQ, .desc = "Evict requests to LLC" }, {.name = "UNC_LLC_READ_HIT", .modmsk = ARMV8_ATTRS, .code = L3_EVENT_READ_HIT, .desc = "Read requests to LLC which hit" }, {.name = "UNC_LLC_WB", .modmsk = ARMV8_ATTRS, .code = L3_EVENT_WRITEBACK_REQ, .desc = "Writeback requests to LLC" } }; #define ARM_TX2_CORE_LLC_COUNT (sizeof(arm_thunderx2_unc_llc_pe)/sizeof(arm_entry_t)) papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_neoverse_n1_events.h000066400000000000000000000416351502707512200246670ustar00rootroot00000000000000/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * ARM Neoverse N1 * Based on https://static.docs.arm.com/100616/0301/neoverse_n1_trm_100616_0301_01_en.pdf * Section C2.3 PMU events */ static const arm_entry_t arm_n1_pe[]={ {.name = "SW_INCR", .modmsk = ARMV8_ATTRS, .code = 0x00, .desc = "Instruction architecturally executed (condition check pass) software increment" }, {.name = "L1I_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x01, .desc = "Level 1 instruction cache refills" }, {.name = "L1I_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x02, .desc = "Level 1 instruction TLB refills" }, {.name = "L1D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x03, .desc = "Level 1 data cache refills" }, {.name = "L1D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x04, .desc = "Level 1 data cache accesses" }, {.name = "L1D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x05, .desc = "Level 1 data TLB refills" }, {.name = "INST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x08, .desc = "Instructions architecturally executed" }, {.name = "EXC_TAKEN", .modmsk = ARMV8_ATTRS, .code = 0x09, .desc = "Exceptions taken" }, {.name = "EXC_RETURN", .modmsk = ARMV8_ATTRS, .code = 0x0a, .desc = "Instructions architecturally executed (condition check pass) - Exception return" }, {.name = "CID_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0b, .desc = "Instructions architecturally executed (condition check pass) - Write to CONTEXTIDR", }, {.name = "BR_MIS_PRED", .modmsk = ARMV8_ATTRS, .code = 0x10, .desc = "Mispredicted or not predicted branches speculatively executed" }, {.name = 
"CPU_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x11, .desc = "Cycles" }, {.name = "BR_PRED", .modmsk = ARMV8_ATTRS, .code = 0x12, .desc = "Predictable branches speculatively executed" }, {.name = "MEM_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x13, .desc = "Data memory accesses" }, {.name = "L1I_CACHE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x14, .desc = "Level 1 instruction cache accesses" }, {.name = "L1I_CACHE", .modmsk = ARMV9_ATTRS, .equiv = "L1I_CACHE", .code = 0x14, .desc = "Level 1 instruction cache accesses (deprecated)" }, {.name = "L1D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x15, .desc = "Level 1 data cache write-backs" }, {.name = "L2D_CACHE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x16, .equiv = "L2D_CACHE", .desc = "Level 2 data cache accesses (alias to L2D_CACHE)" }, {.name = "L2D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x16, .desc = "Level 2 data cache accesses" }, {.name = "L2D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x17, .desc = "Level 2 data cache refills" }, {.name = "L2D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x18, .desc = "Level 2 data cache write-backs" }, {.name = "BUS_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x19, .desc = "Counts every beat of data transferred over the data channels between the core and the SCU" }, {.name = "MEMORY_ERROR", .modmsk = ARMV8_ATTRS, .code = 0x1a, .desc = "Local memory errors" }, {.name = "INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x1b, .desc = "Instructions speculatively executed" }, {.name = "TTBR_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x1c, .desc = "Instructions architecturally executed (condition check pass, write to TTBR). Counts writes to TTBR0_EL1/TTBR1_EL1 in aarch64 mode" }, {.name = "BUS_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x1d, .desc = "Bus cycles. 
This event duplicates CPU_CYCLES", }, {.name = "L2D_CACHE_ALLOCATE", .modmsk = ARMV8_ATTRS, .code = 0x20, .desc = "Level 2 data/unified cache allocations without refill" }, {.name = "BR_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x21, .desc = "Counts all branches on the architecturally executed path that would incur cost if mispredicted" }, {.name = "BR_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x22, .desc = "Instructions executed, mis-predicted branch. All instructions counted by BR_RETIRED that were not correctly predicted" }, {.name = "STALL_FRONTEND", .modmsk = ARMV8_ATTRS, .code = 0x23, .desc = "Cycles in which no operation issued because there were no operations to issue" }, {.name = "STALL_BACKEND", .modmsk = ARMV8_ATTRS, .code = 0x24, .desc = "Cycles in which no operation issued due to back-end resources being unavailable" }, {.name = "L1D_TLB", .modmsk = ARMV8_ATTRS, .code = 0x25, .desc = "Level 1 data TLB accesses" }, {.name = "L1I_TLB", .modmsk = ARMV8_ATTRS, .code = 0x26, .desc = "Instruction TLB accesses" }, {.name = "L3D_CACHE_ALLOCATE", .modmsk = ARMV8_ATTRS, .code = 0x29, .desc = "Attributable L3 data or unified cache allocations without a refill. Counts any full cache line write into the L3 cache which does not cause a linefill, including write-backs from L2 to L3 and full line writes which do not allocate into L2", }, {.name = "L3D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x2a, .desc = "Attributable L3 unified cache refills. Counts any cacheable read transaction returning data from the SCU for which the data source was outside the cluster", }, {.name = "L3D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x2b, .desc = "Attributable L3 unified cache accesses. Counts any cacheable read transaction returning data from the SCU, or any cacheable write to the SCU", }, {.name = "L2D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x2d, .desc = "Attributable L2 data or unified TLB refills. 
Counts on any refill of the L2TLB caused by either an instruction or data access (MMU must be enabled)", }, {.name = "L2D_TLB", .modmsk = ARMV8_ATTRS, .code = 0x2f, .desc = "Attributable L2 data or unified TLB accesses. Counts on any access to the L2 TLB caused by a refill of any of the L1 TLB (MMU must be enabled)", }, {.name = "REMOTE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x31, .desc = "Number of accesses to another socket", }, {.name = "DTLB_WALK", .modmsk = ARMV8_ATTRS, .code = 0x34, .desc = "Accesses to the data TLB that caused a page walk. Counts any data access which causes L2D_TLB_REFILL to count", }, {.name = "ITLB_WALK", .modmsk = ARMV8_ATTRS, .code = 0x35, .desc = "Accesses to the instruction TLB that caused a page walk. Counts any instruction which causes L2D_TLB_REFILL to count", }, {.name = "LL_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x36, .desc = "Last Level cache accesses for reads", }, {.name = "LL_CACHE_MISS_RD", .modmsk = ARMV8_ATTRS, .code = 0x37, .desc = "Last Level cache misses for reads", }, {.name = "L1D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x40, .desc = "Level 1 data cache read accesses" }, {.name = "L1D_CACHE_WR", .modmsk = ARMV8_ATTRS, .code = 0x41, .desc = "Level 1 data cache write accesses" }, {.name = "L1D_CACHE_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x42, .desc = "Level 1 data cache read refills" }, {.name = "L1D_CACHE_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x43, .desc = "Level 1 data cache write refills" }, {.name = "L1D_CACHE_REFILL_INNER", .modmsk = ARMV8_ATTRS, .code = 0x44, .desc = "Level 1 data cache refills, inner. Counts any L1D cache line fill which hits in the L2, L3 or another core in the cluster" }, {.name = "L1D_CACHE_REFILL_OUTER", .modmsk = ARMV8_ATTRS, .code = 0x45, .desc = "Level 1 data cache refills, outer. 
Counts any L1D cache line fill which does not hit in the L2, L3 or another core in the cluster and instead obtains data from outside the cluster" }, {.name = "L1D_CACHE_WB_VICTIM", .modmsk = ARMV8_ATTRS, .code = 0x46, .desc = "Level 1 data cache write-backs (victim eviction)", }, {.name = "L1D_CACHE_WB_CLEAN", .modmsk = ARMV8_ATTRS, .code = 0x47, .desc = "Level 1 data cache write-backs (clean and coherency eviction)", }, {.name = "L1D_CACHE_INVAL", .modmsk = ARMV8_ATTRS, .code = 0x48, .desc = "Level 1 data cache invalidations" }, {.name = "L1D_TLB_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x4c, .desc = "Level 1 data TLB read refills" }, {.name = "L1D_TLB_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x4d, .desc = "Level 1 data TLB write refills" }, {.name = "L1D_TLB_RD", .modmsk = ARMV8_ATTRS, .code = 0x4e, .desc = "Level 1 data TLB read accesses" }, {.name = "L1D_TLB_WR", .modmsk = ARMV8_ATTRS, .code = 0x4f, .desc = "Level 1 data TLB write accesses" }, {.name = "L2D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x50, .desc = "Level 2 data cache read accesses" }, {.name = "L2D_CACHE_WR", .modmsk = ARMV8_ATTRS, .code = 0x51, .desc = "Level 2 data cache write accesses" }, {.name = "L2D_CACHE_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x52, .desc = "Level 2 data cache read refills" }, {.name = "L2D_CACHE_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x53, .desc = "Level 2 data cache write refills" }, {.name = "L2D_CACHE_WB_VICTIM", .modmsk = ARMV8_ATTRS, .code = 0x56, .desc = "Level 2 data cache victim write-backs" }, {.name = "L2D_CACHE_WB_CLEAN", .modmsk = ARMV8_ATTRS, .code = 0x57, .desc = "Level 2 data cache cleaning and coherency write-backs" }, {.name = "L2D_CACHE_INVAL", .modmsk = ARMV8_ATTRS, .code = 0x58, .desc = "Level 2 data cache invalidations" }, {.name = "L2D_TLB_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x5c, .desc = "Level 2 data TLB refills on read" }, {.name = "L2D_TLB_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x5d, .desc = "Level 2 data TLB refills on 
write" }, {.name = "L2D_TLB_RD", .modmsk = ARMV8_ATTRS, .code = 0x5e, .desc = "Level 2 data TLB accesses on read" }, {.name = "L2D_TLB_WR", .modmsk = ARMV8_ATTRS, .code = 0x5f, .desc = "Level 2 data TLB accesses on write" }, {.name = "BUS_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x60, .desc = "Bus read accesses" }, {.name = "BUS_ACCESS_WR", .modmsk = ARMV8_ATTRS, .code = 0x61, .desc = "Bus write accesses" }, {.name = "MEM_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x66, .desc = "Data memory read accesses" }, {.name = "MEM_READ_ACCESS", .modmsk = ARMV8_ATTRS, .equiv = "MEM_ACCESS_RD", .code = 0x66, .desc = "Data memory read accesses" }, {.name = "MEM_ACCESS_WR", .modmsk = ARMV8_ATTRS, .code = 0x67, .desc = "Data memory write accesses" }, {.name = "MEM_WRITE_ACCESS", .modmsk = ARMV8_ATTRS, .equiv = "MEM_ACCESS_WR", .code = 0x67, .desc = "Data memory write accesses" }, {.name = "UNALIGNED_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x68, .desc = "Unaligned read accesses" }, {.name = "UNALIGNED_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x69, .desc = "Unaligned write accesses" }, {.name = "UNALIGNED_LDST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6a, .desc = "Unaligned accesses" }, {.name = "UNALIGNED_LDST_ACCESS", .modmsk = ARMV8_ATTRS, .equiv = "UNALIGNED_LDST_SPEC", .code = 0x6a, .desc = "Unaligned accesses" }, {.name = "LDREX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6c, .desc = "Exclusive operations speculatively executed - LDREX or LDX" }, {.name = "STREX_PASS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6d, .desc = "Exclusive operations speculative executed - STREX or STX pass" }, {.name = "STREX_FAIL_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6e, .desc = "Exclusive operations speculative executed - STREX or STX fail" }, {.name = "STREX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6f, .desc = "Exclusive operations speculatively executed - STREX or STX" }, {.name = "LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x70, .desc = "Load instructions speculatively executed" }, {.name = 
"ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x71, .desc = "Store instructions speculatively executed" }, {.name = "LDST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x72, .desc = "Load or store instructions speculatively executed" }, {.name = "DP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x73, .desc = "Integer data processing instructions speculatively executed" }, {.name = "ASE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x74, .desc = "Advanced SIMD instructions speculatively executed" }, {.name = "VFP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x75, .desc = "Floating-point instructions speculatively executed" }, {.name = "PC_WRITE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x76, .desc = "Software change of the PC instruction speculatively executed" }, {.name = "CRYPTO_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x77, .desc = "Cryptographic instructions speculatively executed" }, {.name = "BR_IMMED_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x78, .desc = "Immediate branches speculatively executed" }, {.name = "BR_RET_SPEC", .modmsk = ARMV8_ATTRS, .equiv = "BR_RETURN_SPEC", .code = 0x79, .desc = "Return branches speculatively executed" }, {.name = "BR_RETURN_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x79, .desc = "Return branches speculatively executed" }, {.name = "BR_INDIRECT_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7a, .desc = "Indirect branches speculatively executed" }, {.name = "ISB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7c, .desc = "ISB barriers speculatively executed" }, {.name = "DSB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7d, .desc = "DSB barriers speculatively executed" }, {.name = "DMB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7e, .desc = "DMB barriers speculatively executed" }, {.name = "EXC_UNDEF", .modmsk = ARMV8_ATTRS, .code = 0x81, .desc = "Undefined exceptions taken locally" }, {.name = "EXC_SVC", .modmsk = ARMV8_ATTRS, .code = 0x82, .desc = "Exceptions taken, supervisor call" }, {.name = "EXC_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x83, .desc = "Exceptions taken, instruction abort" }, 
{.name = "EXC_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x84, .desc = "Exceptions taken locally, data abort or SError" }, {.name = "EXC_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x86, .desc = "Exceptions taken locally, IRQ" }, {.name = "EXC_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x87, .desc = "Exceptions taken locally, FIQ" }, {.name = "EXC_SMC", .modmsk = ARMV8_ATTRS, .code = 0x88, .desc = "Exceptions taken locally, secure monitor call" }, {.name = "EXC_HVC", .modmsk = ARMV8_ATTRS, .code = 0x8a, .desc = "Exceptions taken, hypervisor call" }, {.name = "EXC_TRAP_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x8b, .desc = "Exceptions taken, instruction abort not taken locally" }, {.name = "EXC_TRAP_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x8c, .desc = "Exceptions taken, data abort or SError not taken locally" }, {.name = "EXC_TRAP_OTHER", .modmsk = ARMV8_ATTRS, .code = 0x8d, .desc = "Exceptions taken, other traps not taken locally" }, {.name = "EXC_TRAP_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x8e, .desc = "Exceptions taken, irq not taken locally" }, {.name = "EXC_TRAP_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x8f, .desc = "Exceptions taken, fiq not taken locally" }, {.name = "RC_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x90, .desc = "Release consistency instructions speculatively executed (load-acquire)", }, {.name = "RC_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x91, .desc = "Release consistency instructions speculatively executed (store-release)", }, {.name = "L3_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0xa0, .desc = "L3 cache reads", }, {.name = "SAMPLE_POP", .modmsk = ARMV8_ATTRS, .code = 0x4000, .desc = "Number of operations that might be sampled by SPE, whether or not the operation was sampled", }, {.name = "SAMPLE_FEED", .modmsk = ARMV8_ATTRS, .code = 0x4001, .desc = "Number of times the SPE sample interval counter reaches zero and is reloaded", }, {.name = "SAMPLE_FILTRATE", .modmsk = ARMV8_ATTRS, .code = 0x4002, .desc = "Number of times SPE completed sample record passes the SPE 
filters and is written to the buffer" }, {.name = "SAMPLE_COLLISION", .modmsk = ARMV8_ATTRS, .code = 0x4003, .desc = "Number of times SPE has a sample record taken when the previous sampled operation has not yet completed its record" }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_neoverse_n2_events.h000066400000000000000000000557531502707512200246760ustar00rootroot00000000000000/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* * ARM Neoverse N2 * Based on ARM Neoverse N2 Technical Reference Manual rev 0 * Section 18.1 Performance Monitors events */ static const arm_entry_t arm_n2_pe[]={ {.name = "SW_INCR", .modmsk = ARMV9_ATTRS, .code = 0x00, .desc = "Instruction architecturally executed (condition check pass) software increment" }, {.name = "L1I_CACHE_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x01, .desc = "Level 1 instruction cache refills" }, {.name = "L1I_TLB_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x02, .desc = "Level 1 instruction TLB refills" }, {.name = "L1D_CACHE_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x03, .desc = "Level 1 data cache refills" }, {.name = "L1D_CACHE", .modmsk = ARMV9_ATTRS, .code = 0x04, .desc = "Level 1 data cache accesses" }, {.name = "L1D_TLB_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x05, .desc = "Level 1 data TLB refills" }, {.name = "INST_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x08, .desc = "Instructions architecturally executed" }, {.name = "EXC_TAKEN", .modmsk = ARMV9_ATTRS, .code = 0x09, .desc = "Exceptions taken" }, {.name = "EXC_RETURN", .modmsk = ARMV9_ATTRS, .code = 0x0a, .desc = "Instructions architecturally executed (condition check pass) - Exception return" }, {.name = "CID_WRITE_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x0b, .desc = "Instructions architecturally executed (condition check pass) - Write to CONTEXTIDR", }, {.name = "BR_MIS_PRED", .modmsk = ARMV9_ATTRS, .code = 0x10, .desc = "Mispredicted or not predicted branches speculatively executed" }, {.name = "CPU_CYCLES", .modmsk = ARMV9_ATTRS, .code = 0x11, .desc = "Cycles" }, {.name = "BR_PRED", .modmsk = ARMV9_ATTRS, .code = 0x12, .desc = "Predictable branches speculatively executed" }, {.name = "MEM_ACCESS", .modmsk = ARMV9_ATTRS, .code = 0x13, .desc = "Data memory accesses" }, {.name = "L1I_CACHE_ACCESS", .modmsk = ARMV9_ATTRS, .equiv = "L1I_CACHE", .code = 0x14, .desc = "Level 1 instruction cache accesses (deprecated)" }, {.name = "L1I_CACHE", .modmsk = ARMV9_ATTRS, .code = 0x14, 
.desc = "Level 1 instruction cache accesses" }, {.name = "L1D_CACHE_WB", .modmsk = ARMV9_ATTRS, .code = 0x15, .desc = "Level 1 data cache write-backs" }, {.name = "L2D_CACHE_ACCESS", .modmsk = ARMV9_ATTRS, .code = 0x16, .equiv = "L2D_CACHE", .desc = "Level 2 data cache accesses (alias to L2D_CACHE)" }, {.name = "L2D_CACHE", .modmsk = ARMV9_ATTRS, .code = 0x16, .desc = "Level 2 data cache accesses" }, {.name = "L2D_CACHE_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x17, .desc = "Level 2 data cache refills" }, {.name = "L2D_CACHE_WB", .modmsk = ARMV9_ATTRS, .code = 0x18, .desc = "Level 2 data cache write-backs" }, {.name = "BUS_ACCESS", .modmsk = ARMV9_ATTRS, .code = 0x19, .desc = "Counts every beat of data transferred over the data channels between the core and the SCU" }, {.name = "MEMORY_ERROR", .modmsk = ARMV9_ATTRS, .code = 0x1a, .desc = "Local memory errors" }, {.name = "INST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x1b, .desc = "Instructions speculatively executed" }, {.name = "TTBR_WRITE_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x1c, .desc = "Instructions architecturally executed (condition check pass, write to TTBR). Counts writes to TTBR0_EL1/TTBR1_EL1 in aarch64 mode" }, {.name = "BUS_MASTER_CYCLE", .modmsk = ARMV9_ATTRS, .code = 0x1d, .desc = "Bus cycles. This event duplicate cycles", }, {.name = "COUNTER_OVERFLOW", .modmsk = ARMV9_ATTRS, .code = 0x1e, .desc = "For odd-numbered counters, this event increments the count by one for each overflow of the preceding even-numbered counter. There is no increment for even-numbered counters", }, {.name = "L2D_CACHE_ALLOCATE", .modmsk = ARMV9_ATTRS, .code = 0x20, .desc = "Level 2 data/unified cache allocations without refill" }, {.name = "BR_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x21, .desc = "Counts all branches on the architecturally executed path that would incur cost if mispredicted" }, {.name = "BR_MIS_PRED_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x22, .desc = "Instructions executed, mis-predicted branch. 
All instructions counted by BR_RETIRED that were not correctly predicted" }, {.name = "STALL_FRONTEND", .modmsk = ARMV9_ATTRS, .code = 0x23, .desc = "Cycles in which no operation issued because there were no operations to issue" }, {.name = "STALL_BACKEND", .modmsk = ARMV9_ATTRS, .code = 0x24, .desc = "Cycles in which no operation issued due to back-end resources being unavailable" }, {.name = "L1D_TLB", .modmsk = ARMV9_ATTRS, .code = 0x25, .desc = "Level 1 data TLB accesses" }, {.name = "L1I_TLB", .modmsk = ARMV9_ATTRS, .code = 0x26, .desc = "Instruction TLB accesses" }, {.name = "L3D_CACHE_ALLOCATE", .modmsk = ARMV9_ATTRS, .code = 0x29, .desc = "Attributable L3 data or unified cache allocations without a refill. Counts any full cache line write into the L3 cache which does not cause a linefill, including write-backs from L2 to L3 and full line writes which do not allocate into L2", }, {.name = "L3D_CACHE_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x2a, .desc = "Attributable L3 unified cache refills. Counts any cacheable read transaction returning data from the SCU for which the data source was outside the cluster", }, {.name = "L3D_CACHE", .modmsk = ARMV9_ATTRS, .code = 0x2b, .desc = "Attributable L3 unified cache accesses. Counts any cacheable read transaction returning data from the SCU, or any cacheable write to the SCU", }, {.name = "L2D_TLB_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x2d, .desc = "Attributable L2 data or unified TLB refills. Counts on any refill of the L2TLB caused by either an instruction or data access (MMU must be enabled)", }, {.name = "L2D_TLB_REQ", .modmsk = ARMV9_ATTRS, .code = 0x2f, .desc = "Attributable L2 TLB accesses. Counts on any access to the MMUTC (caused by a refill of any of the L1 TLBs). This event does not count if the MMU is disabled", }, {.name = "L2D_TLB", .modmsk = ARMV9_ATTRS, .code = 0x2f, .equiv = "L2D_TLB_REQ", .desc = "Attributable L2 TLB accesses. 
Counts on any access to the MMUTC (caused by a refill of any of the L1 TLBs). This event does not count if the MMU is disabled", }, {.name = "REMOTE_ACCESS", .modmsk = ARMV9_ATTRS, .code = 0x31, .desc = "Number of accesses to another socket", }, {.name = "DTLB_WALK", .modmsk = ARMV9_ATTRS, .code = 0x34, .desc = "Accesses to the data TLB that caused a page walk. Counts any data access which causes L2D_TLB_REFILL to count", }, {.name = "ITLB_WALK", .modmsk = ARMV9_ATTRS, .code = 0x35, .desc = "Accesses to the instruction TLB that caused a page walk. Counts any instruction which causes L2D_TLB_REFILL to count", }, {.name = "LL_CACHE_RD", .modmsk = ARMV9_ATTRS, .code = 0x36, .desc = "Last Level cache accesses for reads", }, {.name = "LL_CACHE_MISS_RD", .modmsk = ARMV9_ATTRS, .code = 0x37, .desc = "Last Level cache misses for reads", }, {.name = "L1D_CACHE_LMISS_RD", .modmsk = ARMV9_ATTRS, .code = 0x39, .desc = "Counts the number of Level 1 data cache long-latency misses", }, {.name = "OP_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x3a, .desc = "Counts the number of micro-ops architecturally executed", }, {.name = "OP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x3b, .desc = "Counts the number of speculatively executed micro-ops", }, {.name = "STALL", .modmsk = ARMV9_ATTRS, .code = 0x3c, .desc = "Counts cycles in which no operation is sent for execution", }, {.name = "STALL_SLOT_BACKEND", .modmsk = ARMV9_ATTRS, .code = 0x3d, .desc = "No operation sent for execution on a slot due to the backend", }, {.name = "STALL_SLOT_FRONTEND", .modmsk = ARMV9_ATTRS, .code = 0x3e, .desc = "No operation sent for execution on a slot due to the frontend", }, {.name = "STALL_SLOT", .modmsk = ARMV9_ATTRS, .code = 0x3f, .desc = "No operation sent for execution on a slot", }, {.name = "L1D_CACHE_RD", .modmsk = ARMV9_ATTRS, .code = 0x40, .desc = "Level 1 data cache read accesses" }, {.name = "L1D_CACHE_WR", .modmsk = ARMV9_ATTRS, .code = 0x41, .desc = "Level 1 data cache write accesses" }, {.name = 
"L1D_CACHE_REFILL_RD", .modmsk = ARMV9_ATTRS, .code = 0x42, .desc = "Level 1 data cache read refills" }, {.name = "L1D_CACHE_REFILL_WR", .modmsk = ARMV9_ATTRS, .code = 0x43, .desc = "Level 1 data cache write refills" }, {.name = "L1D_CACHE_REFILL_INNER", .modmsk = ARMV9_ATTRS, .code = 0x44, .desc = "Level 1 data cache refills, inner. Counts any L1D cache line fill which hits in the L2, L3 or another core in the cluster" }, {.name = "L1D_CACHE_REFILL_OUTER", .modmsk = ARMV9_ATTRS, .code = 0x45, .desc = "Level 1 data cache refills, outer. Counts any L1D cache line fill which does not hit in the L2, L3 or another core in the cluster and instead obtains data from outside the cluster" }, {.name = "L1D_CACHE_WB_VICTIM", .modmsk = ARMV9_ATTRS, .code = 0x46, .desc = "Level 1 data cache write-backs (victim eviction)", }, {.name = "L1D_CACHE_WB_CLEAN", .modmsk = ARMV9_ATTRS, .code = 0x47, .desc = "Level 1 data cache write-backs (clean and coherency eviction)", }, {.name = "L1D_CACHE_INVAL", .modmsk = ARMV9_ATTRS, .code = 0x48, .desc = "Level 1 data cache invalidations" }, {.name = "L1D_TLB_REFILL_RD", .modmsk = ARMV9_ATTRS, .code = 0x4c, .desc = "Level 1 data TLB read refills" }, {.name = "L1D_TLB_REFILL_WR", .modmsk = ARMV9_ATTRS, .code = 0x4d, .desc = "Level 1 data TLB write refills" }, {.name = "L1D_TLB_RD", .modmsk = ARMV9_ATTRS, .code = 0x4e, .desc = "Level 1 data TLB read accesses" }, {.name = "L1D_TLB_WR", .modmsk = ARMV9_ATTRS, .code = 0x4f, .desc = "Level 1 data TLB write accesses" }, {.name = "L2D_CACHE_RD", .modmsk = ARMV9_ATTRS, .code = 0x50, .desc = "Level 2 data cache read accesses" }, {.name = "L2D_CACHE_WR", .modmsk = ARMV9_ATTRS, .code = 0x51, .desc = "Level 2 data cache write accesses" }, {.name = "L2D_CACHE_REFILL_RD", .modmsk = ARMV9_ATTRS, .code = 0x52, .desc = "Level 2 data cache read refills" }, {.name = "L2D_CACHE_REFILL_WR", .modmsk = ARMV9_ATTRS, .code = 0x53, .desc = "Level 2 data cache write refills" }, {.name = "L2D_CACHE_WB_VICTIM", .modmsk = 
ARMV9_ATTRS, .code = 0x56, .desc = "Level 2 data cache victim write-backs" }, {.name = "L2D_CACHE_WB_CLEAN", .modmsk = ARMV9_ATTRS, .code = 0x57, .desc = "Level 2 data cache cleaning and coherency write-backs" }, {.name = "L2D_CACHE_INVAL", .modmsk = ARMV9_ATTRS, .code = 0x58, .desc = "Level 2 data cache invalidations" }, {.name = "L2D_TLB_REFILL_RD", .modmsk = ARMV9_ATTRS, .code = 0x5c, .desc = "Level 2 data TLB refills on read" }, {.name = "L2D_TLB_REFILL_WR", .modmsk = ARMV9_ATTRS, .code = 0x5d, .desc = "Level 2 data TLB refills on write" }, {.name = "L2D_TLB_RD", .modmsk = ARMV9_ATTRS, .code = 0x5e, .desc = "Level 2 data TLB accesses on read" }, {.name = "L2D_TLB_WR", .modmsk = ARMV9_ATTRS, .code = 0x5f, .desc = "Level 2 data TLB accesses on write" }, {.name = "BUS_ACCESS_RD", .modmsk = ARMV9_ATTRS, .code = 0x60, .desc = "Bus read accesses" }, {.name = "BUS_ACCESS_WR", .modmsk = ARMV9_ATTRS, .code = 0x61, .desc = "Bus write accesses" }, {.name = "MEM_READ_ACCESS", .modmsk = ARMV9_ATTRS, .code = 0x66, .desc = "Data memory read accesses" }, {.name = "MEM_WRITE_ACCESS", .modmsk = ARMV9_ATTRS, .code = 0x67, .desc = "Data memory write accesses" }, {.name = "UNALIGNED_LD_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x68, .desc = "Unaligned read accesses" }, {.name = "UNALIGNED_ST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x69, .desc = "Unaligned write accesses" }, {.name = "UNALIGNED_LDST_ACCESS", .modmsk = ARMV9_ATTRS, .code = 0x6a, .desc = "Unaligned accesses" }, {.name = "LDREX_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x6c, .desc = "Exclusive operations speculatively executed - LDREX or LDX" }, {.name = "STREX_PASS_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x6d, .desc = "Exclusive operations speculatively executed - STREX or STX pass" }, {.name = "STREX_FAIL_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x6e, .desc = "Exclusive operations speculatively executed - STREX or STX fail" }, {.name = "STREX_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x6f, .desc = "Exclusive operations speculatively 
executed - STREX or STX" }, {.name = "LD_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x70, .desc = "Load instructions speculatively executed" }, {.name = "ST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x71, .desc = "Store instructions speculatively executed" }, {.name = "DP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x73, .desc = "Integer data processing instructions speculatively executed" }, {.name = "ASE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x74, .desc = "Advanced SIMD instructions speculatively executed" }, {.name = "VFP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x75, .desc = "Floating-point instructions speculatively executed" }, {.name = "PC_WRITE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x76, .desc = "Software change of the PC instruction speculatively executed" }, {.name = "CRYPTO_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x77, .desc = "Cryptographic instructions speculatively executed" }, {.name = "BR_IMMED_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x78, .desc = "Immediate branches speculatively executed" }, {.name = "BR_RET_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x79, .desc = "Return branches speculatively executed" }, {.name = "BR_INDIRECT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x7a, .desc = "Indirect branches speculatively executed" }, {.name = "ISB_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x7c, .desc = "ISB barriers speculatively executed" }, {.name = "DSB_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x7d, .desc = "DSB barriers speculatively executed" }, {.name = "DMB_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x7e, .desc = "DMB barriers speculatively executed" }, {.name = "EXC_UNDEF", .modmsk = ARMV9_ATTRS, .code = 0x81, .desc = "Undefined exceptions taken locally" }, {.name = "EXC_SVC", .modmsk = ARMV9_ATTRS, .code = 0x82, .desc = "Exceptions taken, supervisor call" }, {.name = "EXC_PABORT", .modmsk = ARMV9_ATTRS, .code = 0x83, .desc = "Exceptions taken, instruction abort" }, {.name = "EXC_DABORT", .modmsk = ARMV9_ATTRS, .code = 0x84, .desc = "Exceptions taken locally, data abort or SError" }, 
{.name = "EXC_IRQ", .modmsk = ARMV9_ATTRS, .code = 0x86, .desc = "Exceptions taken locally, IRQ" }, {.name = "EXC_FIQ", .modmsk = ARMV9_ATTRS, .code = 0x87, .desc = "Exceptions taken locally, FIQ" }, {.name = "EXC_SMC", .modmsk = ARMV9_ATTRS, .code = 0x88, .desc = "Exceptions taken locally, secure monitor call" }, {.name = "EXC_HVC", .modmsk = ARMV9_ATTRS, .code = 0x8a, .desc = "Exceptions taken, hypervisor call" }, {.name = "EXC_TRAP_PABORT", .modmsk = ARMV9_ATTRS, .code = 0x8b, .desc = "Exceptions taken, instruction abort not taken locally" }, {.name = "EXC_TRAP_DABORT", .modmsk = ARMV9_ATTRS, .code = 0x8c, .desc = "Exceptions taken, data abort or SError not taken locally" }, {.name = "EXC_TRAP_OTHER", .modmsk = ARMV9_ATTRS, .code = 0x8d, .desc = "Exceptions taken, other traps not taken locally" }, {.name = "EXC_TRAP_IRQ", .modmsk = ARMV9_ATTRS, .code = 0x8e, .desc = "Exceptions taken, irq not taken locally" }, {.name = "EXC_TRAP_FIQ", .modmsk = ARMV9_ATTRS, .code = 0x8f, .desc = "Exceptions taken, fiq not taken locally" }, {.name = "RC_LD_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x90, .desc = "Release consistency instructions speculatively executed (load-acquire)", }, {.name = "RC_ST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x91, .desc = "Release consistency instructions speculatively executed (store-release)", }, {.name = "L3_CACHE_RD", .modmsk = ARMV9_ATTRS, .code = 0xa0, .desc = "L3 cache reads", }, {.name = "SAMPLE_POP", .modmsk = ARMV9_ATTRS, .code = 0x4000, .desc = "Number of operations that might be sampled by SPE, whether or not the operation was sampled", }, {.name = "SAMPLE_FEED", .modmsk = ARMV9_ATTRS, .code = 0x4001, .desc = "Number of times the SPE sample interval counter reaches zero and is reloaded", }, {.name = "SAMPLE_FILTRATE", .modmsk = ARMV9_ATTRS, .code = 0x4002, .desc = "Number of times SPE completed sample record passes the SPE filters and is written to the buffer" }, {.name = "SAMPLE_COLLISION", .modmsk = ARMV9_ATTRS, .code = 0x4003, .desc = 
"Number of times SPE has a sample record taken when the previous sampled operation has not yet completed its record" }, {.name = "CNT_CYCLES", .modmsk = ARMV9_ATTRS, .code = 0x4004, .desc = "Constant frequency cycles", }, {.name = "STALL_BACKEND_MEM", .modmsk = ARMV9_ATTRS, .code = 0x4005, .desc = "No operation sent due to the backend and memory stalls", }, {.name = "L1I_CACHE_LMISS", .modmsk = ARMV9_ATTRS, .code = 0x4006, .desc = "Counts L1 instruction cache long latency misses", }, {.name = "L2D_CACHE_LMISS_RD", .modmsk = ARMV9_ATTRS, .code = 0x4009, .desc = "Counts L2 cache long latency misses", }, {.name = "L3D_CACHE_LMISS_RD", .modmsk = ARMV9_ATTRS, .code = 0x400b, .desc = "Counts L3 cache long latency misses", }, {.name = "TRB_WRAP", .modmsk = ARMV9_ATTRS, .code = 0x400c, .desc = "Counts number of times the Trace buffer current write pointer wrapped", }, {.name = "TRCEXTOUT0", .modmsk = ARMV9_ATTRS, .code = 0x4010, .desc = "PE Trace unit external output 0", }, {.name = "TRCEXTOUT1", .modmsk = ARMV9_ATTRS, .code = 0x4011, .desc = "PE Trace unit external output 1", }, {.name = "TRCEXTOUT2", .modmsk = ARMV9_ATTRS, .code = 0x4012, .desc = "PE Trace unit external output 2", }, {.name = "TRCEXTOUT3", .modmsk = ARMV9_ATTRS, .code = 0x4013, .desc = "PE Trace unit external output 3", }, {.name = "CTI_TRIGOUT4", .modmsk = ARMV9_ATTRS, .code = 0x4018, .desc = "Cross-trigger interface output trigger 4", }, {.name = "CTI_TRIGOUT5", .modmsk = ARMV9_ATTRS, .code = 0x4019, .desc = "Cross-trigger interface output trigger 5", }, {.name = "CTI_TRIGOUT6", .modmsk = ARMV9_ATTRS, .code = 0x401a, .desc = "Cross-trigger interface output trigger 6", }, {.name = "CTI_TRIGOUT7", .modmsk = ARMV9_ATTRS, .code = 0x401b, .desc = "Cross-trigger interface output trigger 7", }, {.name = "LDST_ALIGN_LAT", .modmsk = ARMV9_ATTRS, .code = 0x4020, .desc = "Accesses with additional latency from alignment", }, {.name = "LD_ALIGN_LAT", .modmsk = ARMV9_ATTRS, .code = 0x4021, .desc = "Loads with additional latency
from alignment", }, {.name = "ST_ALIGN_LAT", .modmsk = ARMV9_ATTRS, .code = 0x4022, .desc = "Stores with additional latency from alignment", }, {.name = "MEM_ACCESS_CHECKED", .modmsk = ARMV9_ATTRS, .code = 0x4024, .desc = "Checked data memory access", }, {.name = "MEM_ACCESS_RD_CHECKED", .modmsk = ARMV9_ATTRS, .code = 0x4025, .desc = "Checked data memory read access", }, {.name = "MEM_ACCESS_WR_CHECKED", .modmsk = ARMV9_ATTRS, .code = 0x4026, .desc = "Checked data memory write access", }, {.name = "ASE_INST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8005, .desc = "Advanced SIMD operations speculatively executed", }, {.name = "SVE_INST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8006, .desc = "SVE operations speculatively executed", }, {.name = "FP_HP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8014, .desc = "Half precision floating-point operations speculatively executed", }, {.name = "FP_SP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8018, .desc = "Single precision floating-point operations speculatively executed", }, {.name = "FP_DP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x801c, .desc = "Double precision floating-point operations speculatively executed", }, {.name = "SVE_PRED_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8074, .desc = "SVE predicated operations speculatively executed", }, {.name = "SVE_PRED_EMPTY_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8075, .desc = "SVE predicated operations with no active predicates speculatively executed", }, {.name = "SVE_PRED_FULL_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8076, .desc = "SVE predicated operations with all active predicates speculatively executed", }, {.name = "SVE_PRED_PARTIAL_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8077, .desc = "SVE predicated operations with partially active predicates speculatively executed", }, {.name = "SVE_PRED_NOT_FULL_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8079, .desc = "SVE predicated operations speculatively executed with a governing predicate in which at least one element is false", }, {.name =
"SVE_LDFF_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80bc, .desc = "SVE first-fault load operations speculatively executed", }, {.name = "SVE_LDFF_FAULT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80bd, .desc = "SVE first-fault load operations speculatively executed which set FFR bit to 0", }, {.name = "FP_SCALE_OPS_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80c0, .desc = "Scalable floating-point element operations speculatively executed", }, {.name = "FP_FIXED_OPS_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80c1, .desc = "Non-scalable floating-point element operations speculatively executed", }, {.name = "ASE_SVE_INT8_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80e3, .desc = "Operations counted by ASE_SVE_INT_SPEC where the large type is an 8-bit integer", }, {.name = "ASE_SVE_INT16_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80e7, .desc = "Operations counted by ASE_SVE_INT_SPEC where the large type is a 16-bit integer", }, {.name = "ASE_SVE_INT32_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80eb, .desc = "Operations counted by ASE_SVE_INT_SPEC where the large type is a 32-bit integer", }, {.name = "ASE_SVE_INT64_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80ef, .desc = "Operations counted by ASE_SVE_INT_SPEC where the large type is a 64-bit integer", }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_neoverse_n3_events.h000066400000000000000000001250671502707512200246710ustar00rootroot00000000000000/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the
Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * ARM Neoverse N3 * Based on ARM Neoverse N3 Technical Reference Manual rev 0 * Section 19.1 Performance Monitors events */ static const arm_entry_t arm_n3_pe[]={ {.name = "SW_INCR", .modmsk = ARMV8_ATTRS, .code = 0x00, .desc = "Instruction architecturally executed, Condition code check pass, software increment This event counts any instruction architecturally executed (condition code check pass)" }, {.name = "L1I_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x01, .desc = "Level 1 instruction cache refill This event counts any instruction fetch which misses in the cache" }, {.name = "L1I_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x02, .desc = "Level 1 instruction TLB refill This event counts any refill of the L1 instruction TLB from the MMU Translation Cache (MMUTC)" }, {.name = "L1D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x03, .desc = "Level 1 data cache refill This event counts any load or store operation or translation table walk that causes data to be read from outside the L1 cache, including accesses which do not allocate into L1" }, {.name = "L1D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x04, .desc = "Level 1 data cache access This event counts any load or store operation or translation table walk that looks up in the L1 data cache" }, {.name = "L1D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x05, .desc = "Level 1 data TLB refill This event counts any refill of the data L1 TLB from the L2 TLB" }, {.name = "INST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x08, .desc = 
"Instruction architecturally executed This event counts all retired instructions, including ones that fail their condition check" }, {.name = "EXC_TAKEN", .modmsk = ARMV8_ATTRS, .code = 0x09, .desc = "Exception taken The counter counts each exception taken" }, {.name = "EXC_RETURN", .modmsk = ARMV8_ATTRS, .code = 0x0a, .desc = "Instruction architecturally executed, Condition code check pass, exception return" }, {.name = "CID_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0b, .desc = "Instruction architecturally executed, Condition code check pass, write to CONTEXTIDR This event only counts writes using the CONTEXTIDR_EL1 mnemonic" }, {.name = "PC_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0c, .desc = "Instruction architecturally executed, Condition code check pass, software change of the PC This event counts all branches taken and popped from the branch monitor" }, {.name = "BR_IMMED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0d, .desc = "Instruction architecturally executed, immediate branch This event counts all branches decoded as immediate branches, taken or not, and popped from the branch monitor" }, {.name = "BR_RETURN_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0e, .desc = "Branch instruction architecturally executed, procedure return, taken Instruction architecturally executed, Condition code check pass, procedure return" }, {.name = "BR_MIS_PRED", .modmsk = ARMV8_ATTRS, .code = 0x10, .desc = "Mispredicted or not predicted branch speculatively executed This event counts any predictable branch instruction which is mispredicted either due to dynamic misprediction or because the MMU is off and the branches are statically predicted not taken" }, {.name = "CPU_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x11, .desc = "Cycle" }, {.name = "BR_PRED", .modmsk = ARMV8_ATTRS, .code = 0x12, .desc = "Predictable branch instruction speculatively executed This event counts all predictable branches" }, {.name = "MEM_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x13, .desc = 
"Data memory access This event counts memory accesses due to load or store instructions" }, {.name = "L1I_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x14, .desc = "Level 1 instruction cache access This event counts any instruction fetch which accesses the L1 instruction cache" }, {.name = "L1D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x15, .desc = "Level 1 data cache write-back This event counts any write-back of data from the L1 data cache to lower level of caches" }, {.name = "L2D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x16, .desc = "Level 2 data cache access - If the core is configured with a per-core L2 cache, this event counts any transaction from L1 which looks up in the L2 cache, and any writeback from the L1 to the L2" }, {.name = "L2D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x17, .desc = "Level 2 data cache refill - If the core is configured with a per-core L2 cache, this event counts any Cacheable transaction from L1 which causes data to be read from outside the core" }, {.name = "L2D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x18, .desc = "Level 2 data cache write-back If the core is configured with a per-core L2 cache, this event counts any write-back of data from the L2 cache to a location outside the core" }, {.name = "BUS_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x19, .desc = "Bus access This event counts for every beat of data that is transferred over the data channels between the core and the SCU" }, {.name = "INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x1b, .desc = "Operation speculatively executed This event duplicates INST_RETIRED" }, {.name = "TTBR_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x1c, .desc = "Instruction architecturally executed, condition code check pass, write to TTBR This event only counts writes to TTBR0/TTBR1 in AArch32 and TTBR0_EL1/TTBR1_EL1 in AArch64" }, {.name = "BUS_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x1d, .desc = "Bus cycle This event duplicates CPU_CYCLES" }, {.name = "L2D_CACHE_ALLOCATE", .modmsk = 
ARMV8_ATTRS, .code = 0x20, .desc = "Level 2 data cache allocation without refill This event counts any full cache line write into the L2 cache which does not cause a linefill, including write-backs from L1 to L2 and full-line writes which do not allocate into L1" }, {.name = "BR_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x21, .desc = "Branch instruction architecturally executed This event counts all branches, taken or not, popped from the branch monitor" }, {.name = "BR_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x22, .desc = "Branch instruction architecturally executed, mispredicted This event counts any branch that is counted by BR_RETIRED which is not correctly predicted and causes pipeline clears" }, {.name = "STALL_FRONTEND", .modmsk = ARMV8_ATTRS, .code = 0x23, .desc = "No operation has been sent for execution, due to the frontend No operation has been issued, because of the frontend The counter counts on any cycle when no operations are issued due to the instruction queue being empty" }, {.name = "STALL_BACKEND", .modmsk = ARMV8_ATTRS, .code = 0x24, .desc = "No operation has been sent for execution due to the backend" }, {.name = "L1D_TLB", .modmsk = ARMV8_ATTRS, .code = 0x25, .desc = "Level 1 data TLB access This event counts any load or store operation which accesses the data L1 TLB" }, {.name = "L1I_TLB", .modmsk = ARMV8_ATTRS, .code = 0x26, .desc = "Level 1 instruction TLB access This event counts any instruction fetch which accesses the instruction L1 TLB" }, {.name = "L2I_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x27, .desc = "Level 2 instruction cache access The counter counts each instruction memory access to at least the L2 instruction or unified cache" }, {.name = "L2I_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x28, .desc = "Level 2 instruction cache refill The counter counts each access counted by L2I_CACHE that causes a refill of the L2 instruction or unified cache, or any L1 data, instruction, or unified cache of this PE, from outside of 
those caches" }, {.name = "L3D_CACHE_ALLOCATE", .modmsk = ARMV8_ATTRS, .code = 0x29, .desc = "Level 3 data cache allocation without refill This event counts any full cache line write into the L3 cache which does not cause a linefill, including write-backs from L2 to L3 cache and full-line writes which do not allocate into L2" }, {.name = "L3D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x2a, .desc = "Level 3 data cache refill This event counts for any cacheable read transaction returning data from the SCU for which the data source was outside the cluster" }, {.name = "L3D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x2b, .desc = "Level 3 data cache access This event counts for any cacheable read, write or write-back transaction sent to the SCU" }, {.name = "L2D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x2d, .desc = "Level 2 data TLB refill This event counts on any refill of the L2 TLB, caused by either an instruction or data access" }, {.name = "L2D_TLB", .modmsk = ARMV8_ATTRS, .code = 0x2f, .desc = "Level 2 data TLB access Attributable level 2 unified TLB access" }, {.name = "REMOTE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x31, .desc = "Access to another socket in a multi-socket system This event counts any transactions returning data from another socket in a multi-socket system" }, {.name = "LL_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x32, .desc = "Last level cache access The counter counts each memory-read operation or memory-write operation that causes a cache access to at least the last level cache" }, {.name = "LL_CACHE_MISS", .modmsk = ARMV8_ATTRS, .code = 0x33, .desc = "Last level cache miss The counter counts each access counted by LL_CACHE that is not completed by the last level cache" }, {.name = "DTLB_WALK", .modmsk = ARMV8_ATTRS, .code = 0x34, .desc = "Data TLB access with at least one translation table walk Access to data TLB that caused a translation table walk" }, {.name = "ITLB_WALK", .modmsk = ARMV8_ATTRS, .code = 0x35, .desc = "Instruction TLB 
access with at least one translation table walk Access to instruction TLB that caused a translation table walk" }, {.name = "LL_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x36, .desc = "Last level cache access, read This event counts any cacheable read transaction which returns a data source of 'interconnect cache', 'DRAM', 'remote' or 'inter-cluster peer'" }, {.name = "LL_CACHE_MISS_RD", .modmsk = ARMV8_ATTRS, .code = 0x37, .desc = "Last Level cache miss read This event counts any cacheable read transaction which returns a data source of 'DRAM', 'remote' or 'inter-cluster peer'" }, {.name = "L1D_CACHE_LMISS_RD", .modmsk = ARMV8_ATTRS, .code = 0x39, .desc = "Level 1 data cache long-latency read miss Level 1 data cache access, read" }, {.name = "OP_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x3a, .desc = "This event counts each operation counted by OP_SPEC that would be executed in a Simple sequential execution of the program" }, {.name = "OP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x3b, .desc = "Micro-operation speculatively executed This event counts the number of operations executed by the core, including those that are executed speculatively and would not be executed in a simple sequential execution of the program" }, {.name = "STALL", .modmsk = ARMV8_ATTRS, .code = 0x3c, .desc = "No operation sent for execution This event counts every Attributable cycle on which no Attributable instruction or operation was sent for execution on this core" }, {.name = "STALL_SLOT_BACKEND", .modmsk = ARMV8_ATTRS, .code = 0x3d, .desc = "No operation sent for execution on a Slot due to the backend Counts each Slot counted by STALL_SLOT where no Attributable instruction or operation was sent for execution because the backend is unable to accept one of: - The instruction or operation available for the PE on the slot" }, {.name = "STALL_SLOT_FRONTEND", .modmsk = ARMV8_ATTRS, .code = 0x3e, .desc = "No operation sent for execution due to the frontend" }, {.name = "STALL_SLOT", .modmsk = ARMV8_ATTRS,
.code = 0x3f, .desc = "No operation sent for execution on a Slot The counter counts on each Attributable cycle the number of instruction or operation Slots that were not occupied by an instruction or operation Attributable to the PE" }, {.name = "L1D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x40, .desc = "Level 1 data cache access, read Counts any load operation or translation table walk access which looks up in the L1 data cache" }, {.name = "L1D_CACHE_WR", .modmsk = ARMV8_ATTRS, .code = 0x41, .desc = "Level 1 data cache access, write Counts any store operation which looks up in the L1 data cache" }, {.name = "L1D_CACHE_REFILL_INNER", .modmsk = ARMV8_ATTRS, .code = 0x44, .desc = "Level 1 data cache refill, inner This event counts any L1 data cache linefill (as counted by L1D_CACHE_REFILL) which hits in lower level of caches, or another core in the cluster" }, {.name = "L1D_CACHE_REFILL_OUTER", .modmsk = ARMV8_ATTRS, .code = 0x45, .desc = "Level 1 data cache refill, outer This event counts any L1 data cache linefill (as counted by L1D_CACHE_REFILL) which does not hit in lower level of caches, or another core in the cluster, and instead obtains data from outside the cluster" }, {.name = "L1D_CACHE_INVAL", .modmsk = ARMV8_ATTRS, .code = 0x48, .desc = "Level 1 data cache invalidate The counter counts each invalidation of a cache line in the Level 1 data or unified cache" }, {.name = "L2D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x50, .desc = "Level 2 data cache access, read This event counts any transaction issued from L1 caches which looks up in the L2 cache, including requests for instructions fetches and MMU table walks" }, {.name = "L2D_CACHE_WR", .modmsk = ARMV8_ATTRS, .code = 0x51, .desc = "Level 2 data cache access, write This event counts any full cache line write into the L2 cache which does not cause a linefill, including write-backs from L1 to L2, full-line writes which do not allocate into L1 and MMU descriptor hardware updates performed in L2" }, {.name 
= "L2D_CACHE_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x52, .desc = "Level 2 data cache refill, read This event counts any Cacheable transaction generated by a read operation which causes data to be read from outside the L2" }, {.name = "L2D_CACHE_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x53, .desc = "Level 2 data cache refill, write This event counts any Cacheable transaction generated by a store operation which causes data to be read from outside the L2" }, {.name = "L2D_CACHE_WB_VICTIM", .modmsk = ARMV8_ATTRS, .code = 0x56, .desc = "Level 2 data cache write-back, victim This event counts any datafull write-back operation caused by allocations" }, {.name = "L2D_CACHE_WB_CLEAN", .modmsk = ARMV8_ATTRS, .code = 0x57, .desc = "Level 2 data cache write-back, cleaning, and coherency This event counts any datafull write-back operation caused by cache maintenance operations or external coherency requests" }, {.name = "L2D_CACHE_INVAL", .modmsk = ARMV8_ATTRS, .code = 0x58, .desc = "Level 2 data cache invalidate This event counts any cache maintenance operation which causes the invalidation of a line present in the L2 cache" }, {.name = "BUS_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x60, .desc = "Bus access, read This event counts for every beat of data that is transferred over the read data channel between the core and the SCU" }, {.name = "BUS_ACCESS_WR", .modmsk = ARMV8_ATTRS, .code = 0x61, .desc = "Bus access, write This event counts for every beat of data that is transferred over the write data channel between the core and the SCU" }, {.name = "MEM_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x66, .desc = "Data memory access, read This event counts memory accesses due to load instructions" }, {.name = "MEM_ACCESS_WR", .modmsk = ARMV8_ATTRS, .code = 0x67, .desc = "Data memory access, write This event counts memory accesses due to store instructions" }, {.name = "STREX_FAIL_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6e, .desc = "Exclusive operation speculatively 
executed, Store-Exclusive fail" }, {.name = "STREX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6f, .desc = "Exclusive operation speculatively executed, Store-Exclusive" }, {.name = "LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x70, .desc = "Operation speculatively executed, load" }, {.name = "ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x71, .desc = "Operation speculatively executed, store" }, {.name = "DP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x73, .desc = "Operation speculatively executed, integer data processing This event counts retired integer data-processing instructions" }, {.name = "ASE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x74, .desc = "Operation speculatively executed, Advanced SIMD This event counts retired Advanced SIMD instructions" }, {.name = "VFP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x75, .desc = "Operation speculatively executed, floating-point This event counts retired floating-point instructions" }, {.name = "PC_WRITE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x76, .desc = "Operation speculatively executed, software change of the PC This event counts at Decoder step each instruction changing the PC: all branches, some exceptions (HVC/SVC/SMC/ISB and exception return)" }, {.name = "CRYPTO_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x77, .desc = "Operation speculatively executed, Cryptographic instruction This event counts retired Cryptographic instructions" }, {.name = "ISB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7c, .desc = "Barrier speculatively executed, ISB" }, {.name = "DSB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7d, .desc = "Barrier speculatively executed, DSB" }, {.name = "DMB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7e, .desc = "Barrier speculatively executed, DMB" }, {.name = "EXC_UNDEF", .modmsk = ARMV8_ATTRS, .code = 0x81, .desc = "Exception taken, other synchronous Counts the number of undefined exceptions taken locally" }, {.name = "EXC_SVC", .modmsk = ARMV8_ATTRS, .code = 0x82, .desc = "Exception taken, Supervisor Call Exception taken locally, 
Supervisor Call" }, {.name = "EXC_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x83, .desc = "Exception taken, Instruction Abort Exception taken locally, Instruction Abort" }, {.name = "EXC_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x84, .desc = "Exception taken, Data Abort or SError Exception taken locally, Data Abort and SError" }, {.name = "EXC_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x86, .desc = "Exception taken, IRQ Exception taken locally, IRQ" }, {.name = "EXC_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x87, .desc = "Exception taken, FIQ Exception taken locally, FIQ" }, {.name = "EXC_SMC", .modmsk = ARMV8_ATTRS, .code = 0x88, .desc = "Exception taken, Secure Monitor Call Exception taken locally, Secure Monitor Call" }, {.name = "EXC_HVC", .modmsk = ARMV8_ATTRS, .code = 0x8a, .desc = "Exception taken, Hypervisor Call Exception taken locally, Hypervisor Call" }, {.name = "EXC_TRAP_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x8b, .desc = "Exception taken, Instruction Abort not Taken locally" }, {.name = "EXC_TRAP_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x8c, .desc = "Exception taken, Data Abort or SError not Taken locally" }, {.name = "EXC_TRAP_OTHER", .modmsk = ARMV8_ATTRS, .code = 0x8d, .desc = "Exception taken, other traps not Taken locally" }, {.name = "EXC_TRAP_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x8e, .desc = "Exception taken, IRQ not Taken locally" }, {.name = "EXC_TRAP_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x8f, .desc = "Exception taken, FIQ not Taken locally" }, {.name = "RC_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x90, .desc = "Release consistency operation speculatively executed, Load-Acquire" }, {.name = "RC_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x91, .desc = "Release consistency operation speculatively executed, Store-Release" }, {.name = "L3D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0xa0, .desc = "Level 3 data cache access, read This event counts for any cacheable read transaction sent to the SCU" }, {.name = "SAMPLE_POP", .modmsk = ARMV8_ATTRS, .code = 
0x4000, .desc = "Sample Population" }, {.name = "SAMPLE_FEED", .modmsk = ARMV8_ATTRS, .code = 0x4001, .desc = "Sample Taken" }, {.name = "SAMPLE_FILTRATE", .modmsk = ARMV8_ATTRS, .code = 0x4002, .desc = "Sample Taken and not removed by filtering" }, {.name = "SAMPLE_COLLISION", .modmsk = ARMV8_ATTRS, .code = 0x4003, .desc = "Sample collided with previous sample" }, {.name = "CNT_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x4004, .desc = "Constant frequency cycles The counter increments at a constant frequency equal to the rate of increment of the System Counter, CNTPCT_EL0" }, {.name = "STALL_BACKEND_MEM", .modmsk = ARMV8_ATTRS, .code = 0x4005, .desc = "Memory stall cycles The counter counts each cycle counted by STALL_BACKEND_MEMBOUND where there is a demand data miss in the last level of data or unified cache within the PE clock domain or a non-cacheable data access in progress" }, {.name = "L1I_CACHE_LMISS", .modmsk = ARMV8_ATTRS, .code = 0x4006, .desc = "Level 1 instruction cache long-latency miss The counter counts each access counted by L1I_CACHE that incurs additional latency because it returns instructions from outside the L1 instruction cache" }, {.name = "L2D_CACHE_LMISS_RD", .modmsk = ARMV8_ATTRS, .code = 0x4009, .desc = "Level 2 data cache long-latency read miss The counter counts each memory read access counted by L2D_CACHE that incurs additional latency because it returns data from outside the L2 data or unified cache of this PE" }, {.name = "L2I_CACHE_LMISS", .modmsk = ARMV8_ATTRS, .code = 0x400a, .desc = "Level 2 instruction cache long-latency miss The counter counts each memory read access counted by L2I_CACHE that incurs additional latency because it returns data from outside the L2 instruction or unified cache of this PE" }, {.name = "L3D_CACHE_LMISS_RD", .modmsk = ARMV8_ATTRS, .code = 0x400b, .desc = "Level 3 data cache long-latency read miss The counter counts each memory read access counted by L3D_CACHE that incurs additional latency because it 
returns data from outside the L3 data or unified cache of this PE" }, {.name = "TRB_WRAP", .modmsk = ARMV8_ATTRS, .code = 0x400c, .desc = "Trace buffer current write pointer wrapped" }, {.name = "PMU_OVFS", .modmsk = ARMV8_ATTRS, .code = 0x400d, .desc = "PMU overflow, counters accessible to EL1 and EL0 Note: This event is exported to the trace unit, but cannot be counted in the PMU" }, {.name = "TRB_TRIG", .modmsk = ARMV8_ATTRS, .code = 0x400e, .desc = "Trace buffer Trigger Event Note: This event is only exported to the trace unit and is not visible to the PMU" }, {.name = "PMU_HOVFS", .modmsk = ARMV8_ATTRS, .code = 0x400f, .desc = "PMU overflow, counters reserved for use by EL2 Note: This event is only exported to the trace unit and is not visible to the PMU" }, {.name = "TRCEXTOUT0", .modmsk = ARMV8_ATTRS, .code = 0x4010, .desc = "Trace unit external output 0 PE Trace Unit external output 0 Note: This event is not exported to the trace unit" }, {.name = "TRCEXTOUT1", .modmsk = ARMV8_ATTRS, .code = 0x4011, .desc = "Trace unit external output 1 PE Trace Unit external output 1 Note: This event is not exported to the trace unit" }, {.name = "TRCEXTOUT2", .modmsk = ARMV8_ATTRS, .code = 0x4012, .desc = "Trace unit external output 2 PE Trace Unit external output 2 Note: This event is not exported to the trace unit" }, {.name = "TRCEXTOUT3", .modmsk = ARMV8_ATTRS, .code = 0x4013, .desc = "Trace unit external output 3 PE Trace Unit external output 3 Note: This event is not exported to the trace unit" }, {.name = "CTI_TRIGOUT4", .modmsk = ARMV8_ATTRS, .code = 0x4018, .desc = "Cross Trigger Interface output trigger 4" }, {.name = "CTI_TRIGOUT5", .modmsk = ARMV8_ATTRS, .code = 0x4019, .desc = "Cross Trigger Interface output trigger 5" }, {.name = "CTI_TRIGOUT6", .modmsk = ARMV8_ATTRS, .code = 0x401a, .desc = "Cross Trigger Interface output trigger 6" }, {.name = "CTI_TRIGOUT7", .modmsk = ARMV8_ATTRS, .code = 0x401b, .desc = "Cross Trigger Interface output trigger 7" }, 
{.name = "LDST_ALIGN_LAT", .modmsk = ARMV8_ATTRS, .code = 0x4020, .desc = "Access with additional latency from alignment The counter counts each access counted by MEM_ACCESS that, due to the alignment of the address and size of data being accessed, incurred additional latency" }, {.name = "LD_ALIGN_LAT", .modmsk = ARMV8_ATTRS, .code = 0x4021, .desc = "Load with additional latency from alignment The counter counts each memory-read access counted by LDST_ALIGN_LAT" }, {.name = "ST_ALIGN_LAT", .modmsk = ARMV8_ATTRS, .code = 0x4022, .desc = "Store with additional latency from alignment The counter counts each memory-write access counted by LDST_ALIGN_LAT" }, {.name = "MEM_ACCESS_CHECKED", .modmsk = ARMV8_ATTRS, .code = 0x4024, .desc = "Checked data memory access" }, {.name = "MEM_ACCESS_CHECKED_RD", .modmsk = ARMV8_ATTRS, .code = 0x4025, .desc = "Checked data memory access, read" }, {.name = "MEM_ACCESS_CHECKED_WR", .modmsk = ARMV8_ATTRS, .code = 0x4026, .desc = "Checked data memory access, write" }, {.name = "ASE_INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8005, .desc = "Advanced SIMD operations speculatively executed" }, {.name = "SVE_INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8006, .desc = "SVE operation, including load/store The counter counts speculatively executed operations due to SVE instructions" }, {.name = "FP_HP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8014, .desc = "Half-precision floating-point operation speculatively executed" }, {.name = "FP_SP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8018, .desc = "Single-precision floating-point operation speculatively executed" }, {.name = "FP_DP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x801c, .desc = "Double-precision floating-point operation speculatively executed" }, {.name = "SVE_PRED_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8074, .desc = "SVE predicated operations speculatively executed" }, {.name = "SVE_PRED_EMPTY_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8075, .desc = "SVE predicated operations with no active 
predicates speculatively executed" }, {.name = "SVE_PRED_FULL_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8076, .desc = "SVE predicated operations with all active predicates speculatively executed" }, {.name = "SVE_PRED_PARTIAL_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8077, .desc = "SVE predicated operations with partially active predicates speculatively executed" }, {.name = "SVE_PRED_NOT_FULL_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8079, .desc = "SVE predicated operations with no or partially active predicates speculatively executed" }, {.name = "SVE_LDFF_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80bc, .desc = "SVE First-fault load operations speculatively executed" }, {.name = "SVE_LDFF_FAULT_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80bd, .desc = "SVE First-fault load operations speculatively executed which set FFR bit to 0" }, {.name = "FP_SCALE_OPS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80c0, .desc = "Scalable floating-point element operations speculatively executed" }, {.name = "FP_FIXED_OPS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80c1, .desc = "Non-scalable floating-point element operations speculatively executed" }, {.name = "ASE_SVE_INT8_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80e3, .desc = "Advanced SIMD and SVE 8-bit integer operation speculatively executed" }, {.name = "ASE_SVE_INT16_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80e7, .desc = "Advanced SIMD and SVE 16-bit integer operation speculatively executed" }, {.name = "ASE_SVE_INT32_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80eb, .desc = "Advanced SIMD and SVE 32-bit integer operation speculatively executed" }, {.name = "ASE_SVE_INT64_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80ef, .desc = "Advanced SIMD and SVE 64-bit integer operation speculatively executed" }, {.name = "BR_IMMED_TAKEN_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8108, .desc = "Instruction architecturally executed, immediate branch taken" }, {.name = "BR_INDNR_TAKEN_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x810c, .desc = "Instruction 
architecturally executed, indirect branch excluding procedure return retired" }, {.name = "BR_IMMED_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8110, .desc = "Branch instruction architecturally executed, predicted immediate The counter counts the instructions on the architecturally executed path counted by both BR_IMMED_RETIRED and BR_PRED_RETIRED" }, {.name = "BR_IMMED_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8111, .desc = "Branch instruction architecturally executed, mispredicted immediate The counter counts the instructions on the architecturally executed path, counted by both BR_IMMED_RETIRED and BR_MIS_PRED_RETIRED" }, {.name = "BR_IND_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8112, .desc = "Branch instruction architecturally executed, predicted indirect The counter counts the instructions on the architecturally executed path counted by both BR_IND_RETIRED and BR_PRED_RETIRED" }, {.name = "BR_IND_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8113, .desc = "Branch instruction architecturally executed, mispredicted indirect The counter counts the instructions on the architecturally executed path counted by both BR_IND_RETIRED and BR_MIS_PRED_RETIRED" }, {.name = "BR_RETURN_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8114, .desc = "Branch instruction architecturally executed, predicted procedure return The counter counts the instructions on the architecturally executed path counted by BR_IND_PRED_RETIRED where, if taken, the branch would be counted by BR_RETURN_RETIRED" }, {.name = "BR_RETURN_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8115, .desc = "Branch instruction architecturally executed, mispredicted procedure return The counter counts the instructions on the architecturally executed path counted by BR_IND_MIS_PRED_RETIRED where, if taken, the branch would also be counted by BR_RETURN_RETIRED" }, {.name = "BR_INDNR_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8116, .desc = "Branch instruction architecturally 
executed, predicted indirect excluding procedure return The counter counts the instructions on the architecturally executed path counted by BR_IND_PRED_RETIRED where, if taken, the branch would not be counted by BR_RETURN_RETIRED" }, {.name = "BR_INDNR_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8117, .desc = "Branch instruction architecturally executed, mispredicted indirect excluding procedure return The counter counts the instructions on the architecturally executed path counted by BR_IND_MIS_PRED_RETIRED where, if taken, the branch would not be counted by BR_RETURN_RETIRED" }, {.name = "BR_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x811c, .desc = "Branch instruction architecturally executed, predicted branch The counter counts the instructions on the architecturally executed path counted by BR_RETIRED that are not counted by BR_MIS_PRED_RETIRED" }, {.name = "BR_IND_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x811d, .desc = "Instruction architecturally executed, indirect branch" }, {.name = "INST_FETCH_PERCYC", .modmsk = ARMV8_ATTRS, .code = 0x8120, .desc = "Event in progress, INST_FETCH The counter counts by the number of INST_FETCH events in progress on each Processor cycle" }, {.name = "MEM_ACCESS_RD_PERCYC", .modmsk = ARMV8_ATTRS, .code = 0x8121, .desc = "Event in progress, MEM_ACCESS_RD The counter counts by the number of MEM_ACCESS_RD events in progress on each Processor Cycle" }, {.name = "INST_FETCH", .modmsk = ARMV8_ATTRS, .code = 0x8124, .desc = "Instruction memory access The counter counts each Instruction memory access that the PE makes" }, {.name = "DTLB_WALK_PERCYC", .modmsk = ARMV8_ATTRS, .code = 0x8128, .desc = "Total cycles, DTLB_WALK The counter counts by the number of data TLB walk events in progress on each processor cycle" }, {.name = "ITLB_WALK_PERCYC", .modmsk = ARMV8_ATTRS, .code = 0x8129, .desc = "Total cycles, ITLB_WALK The counter counts by the number of instruction TLB walk events in progress on each processor cycle" }, 
{.name = "SAMPLE_FEED_BR", .modmsk = ARMV8_ATTRS, .code = 0x812a, .desc = "Statistical Profiling sample taken, branch The counter counts each sample counted by SAMPLE_FEED that are branch operations" }, {.name = "SAMPLE_FEED_LD", .modmsk = ARMV8_ATTRS, .code = 0x812b, .desc = "Statistical Profiling sample taken, load The counter counts each sample counted by SAMPLE_FEED that are load or load atomic operations" }, {.name = "SAMPLE_FEED_ST", .modmsk = ARMV8_ATTRS, .code = 0x812c, .desc = "Statistical Profiling sample taken, store The counter counts each sample counted by SAMPLE_FEED that are store or atomic operations, including load atomic operations" }, {.name = "SAMPLE_FEED_OP", .modmsk = ARMV8_ATTRS, .code = 0x812d, .desc = "Statistical Profiling sample taken, matching operation type The counter counts each sample counted by SAMPLE_FEED that meets the operation type filter constraints" }, {.name = "SAMPLE_FEED_EVENT", .modmsk = ARMV8_ATTRS, .code = 0x812e, .desc = "Statistical Profiling sample taken, matching events The counter counts each sample counted by SAMPLE_FEED that meets the Events packet filter constraints" }, {.name = "SAMPLE_FEED_LAT", .modmsk = ARMV8_ATTRS, .code = 0x812f, .desc = "Statistical Profiling sample taken, exceeding minimum latency The counter counts each sample counted by SAMPLE_FEED that meets the operation latency filter constraints" }, {.name = "DTLB_HWUPD", .modmsk = ARMV8_ATTRS, .code = 0x8134, .desc = "Data TLB hardware update of translation table The counter counts each access counted by L1D_TLB that causes a hardware update of a translation table entry" }, {.name = "ITLB_HWUPD", .modmsk = ARMV8_ATTRS, .code = 0x8135, .desc = "Instruction TLB hardware update of translation table The counter counts each access counted by L1I_TLB that causes a hardware update of a translation table entry" }, {.name = "DTLB_STEP", .modmsk = ARMV8_ATTRS, .code = 0x8136, .desc = "Data TLB translation table walk, step The counter counts each translation 
table walk access made by a refill of the data or unified TLB" }, {.name = "ITLB_STEP", .modmsk = ARMV8_ATTRS, .code = 0x8137, .desc = "Instruction TLB translation table walk, step The counter counts each translation table walk access made by a refill of the instruction TLB" }, {.name = "DTLB_WALK_LARGE", .modmsk = ARMV8_ATTRS, .code = 0x8138, .desc = "Data TLB large page translation table walk The counter counts each translation table walk counted by DTLB_WALK where the result of the walk yields a large page size" }, {.name = "ITLB_WALK_LARGE", .modmsk = ARMV8_ATTRS, .code = 0x8139, .desc = "Instruction TLB large page translation table walk The counter counts each translation table walk counted by ITLB_WALK where the result of the walk yields a large page size" }, {.name = "DTLB_WALK_SMALL", .modmsk = ARMV8_ATTRS, .code = 0x813a, .desc = "Data TLB small page translation table walk The counter counts each translation table walk counted by DTLB_WALK where the result of the walk yields a small page size" }, {.name = "ITLB_WALK_SMALL", .modmsk = ARMV8_ATTRS, .code = 0x813b, .desc = "Instruction TLB small page translation table walk The counter counts each translation table walk counted by ITLB_WALK where the result of the walk yields a small page size" }, {.name = "L1D_CACHE_RW", .modmsk = ARMV8_ATTRS, .code = 0x8140, .desc = "Level 1 data cache demand access The counter counts each access counted by L1D_CACHE that is due to a demand read or demand write access" }, {.name = "L2D_CACHE_RW", .modmsk = ARMV8_ATTRS, .code = 0x8148, .desc = "Level 2 data cache demand access The counter counts each access counted by L2D_CACHE that is due to a demand Memory-read operation or demand Memory-write operation" }, {.name = "L2I_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x8149, .desc = "Level 2 instruction cache demand fetch The counter counts each access counted by L2I_CACHE that is due to a demand instruction memory access" }, {.name = "L3D_CACHE_MISS", .modmsk = ARMV8_ATTRS,
.code = 0x8152, .desc = "Level 3 data cache demand access miss The counter counts each access counted by L3D_CACHE_RW that misses in the L1 to L3 data or unified caches, causing an access to outside of the L1 to L3 caches of this PE" }, {.name = "L1D_CACHE_HWPRF", .modmsk = ARMV8_ATTRS, .code = 0x8154, .desc = "Level 1 data cache hardware prefetch The counter counts each fetch triggered by L1 prefetchers" }, {.name = "STALL_FRONTEND_MEMBOUND", .modmsk = ARMV8_ATTRS, .code = 0x8158, .desc = "Frontend stall cycles, memory bound The counter counts each cycle counted by STALL_FRONTEND when no instructions are delivered from the memory system" }, {.name = "STALL_FRONTEND_L1I", .modmsk = ARMV8_ATTRS, .code = 0x8159, .desc = "Frontend stall cycles, level 1 instruction cache The counter counts each cycle counted by STALL_FRONTEND_MEMBOUND when there is a demand miss in the first level instruction cache" }, {.name = "STALL_FRONTEND_MEM", .modmsk = ARMV8_ATTRS, .code = 0x815b, .desc = "Frontend stall cycles, last level PE cache or memory The counter counts each cycle counted by STALL_FRONTEND_MEMBOUND when there is a demand instruction miss in the last level of instruction or unified cache within the PE clock domain or a non-cacheable instruction fetch in progress" }, {.name = "STALL_FRONTEND_TLB", .modmsk = ARMV8_ATTRS, .code = 0x815c, .desc = "Frontend stall cycles, TLB The counter counts each cycle counted by STALL_FRONTEND_MEMBOUND when there is an instruction or unified TLB demand miss" }, {.name = "STALL_FRONTEND_CPUBOUND", .modmsk = ARMV8_ATTRS, .code = 0x8160, .desc = "Frontend stall cycles, processor bound The counter counts each cycle counted by STALL_FRONTEND when the frontend is stalled on a frontend processor resource, not including memory" }, {.name = "STALL_FRONTEND_FLUSH", .modmsk = ARMV8_ATTRS, .code = 0x8162, .desc = "Frontend stall cycles, flush recovery" }, {.name = "STALL_BACKEND_MEMBOUND", .modmsk = ARMV8_ATTRS, .code = 0x8164, .desc = "Backend stall 
cycles, memory bound The counter counts each cycle counted by STALL_BACKEND when the backend is waiting for a memory access to complete" }, {.name = "STALL_BACKEND_L1D", .modmsk = ARMV8_ATTRS, .code = 0x8165, .desc = "Backend stall cycles, level 1 data cache The counter counts each cycle counted by STALL_BACKEND_MEMBOUND where there is a demand data miss in the L1 data or unified cache" }, {.name = "STALL_BACKEND_TLB", .modmsk = ARMV8_ATTRS, .code = 0x8167, .desc = "Backend stall cycles, TLB The counter counts each cycle counted by STALL_BACKEND_MEMBOUND where there is a demand data miss in the data or unified TLB" }, {.name = "STALL_BACKEND_ST", .modmsk = ARMV8_ATTRS, .code = 0x8168, .desc = "Backend stall cycles, store The counter counts each cycle counted by STALL_BACKEND_MEMBOUND when the backend is stalled waiting for a store" }, {.name = "STALL_BACKEND_CPUBOUND", .modmsk = ARMV8_ATTRS, .code = 0x816a, .desc = "Backend stall cycles, processor bound The counter counts each cycle counted by STALL_BACKEND when the backend is stalled on a processor resource, not including memory" }, {.name = "STALL_BACKEND_BUSY", .modmsk = ARMV8_ATTRS, .code = 0x816b, .desc = "Backend stall cycles, backend busy The counter counts each cycle counted by STALL_BACKEND when operations are available from the frontend but the backend is not able to accept an operation because an execution unit is busy" }, {.name = "STALL_BACKEND_RENAME", .modmsk = ARMV8_ATTRS, .code = 0x816d, .desc = "Backend stall cycles, rename full The counter counts each cycle counted by STALL_BACKEND_CPUBOUND when operations are available from the frontend but at least one is not ready to be sent to the backend because no rename register is available" }, {.name = "CAS_NEAR_PASS", .modmsk = ARMV8_ATTRS, .code = 0x8171, .desc = "Atomic memory Operation speculatively executed, Compare and Swap pass The counter counts each Compare and Swap operation counted by CAS_NEAR_SPEC that updates the location accessed" }, {.name =
"CAS_NEAR_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8172, .desc = "Atomic memory Operation speculatively executed, Compare and Swap near The counter counts each Compare and Swap operation that executes locally to the PE" }, {.name = "CAS_FAR_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8173, .desc = "Atomic memory Operation speculatively executed, Compare and Swap far The counter counts each Compare and Swap operation that does not execute locally to the PE" }, {.name = "L1D_CACHE_REFILL_HWPRF", .modmsk = ARMV8_ATTRS, .code = 0x81bc, .desc = "Level 1 data cache refill, hardware prefetch The counter counts each refill triggered by L1 prefetchers" }, {.name = "L2D_CACHE_PRF", .modmsk = ARMV8_ATTRS, .code = 0x8285, .desc = "Level 2 data cache, preload or prefetch hit The counter counts each fetch counted by either L2D_CACHE_HWPRF or L2D_CACHE_PRFM" }, {.name = "L2D_CACHE_REFILL_PRF", .modmsk = ARMV8_ATTRS, .code = 0x828d, .desc = "Level 2 data cache refill, preload or prefetch hit The counter counts each refill counted by either L2D_CACHE_REFILL_HWPRF or L2D_CACHE_REFILL_PRFM" }, {.name = "LL_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x829a, .desc = "Last level cache refill The counter counts each access counted by LL_CACHE that causes a refill of the Last level cache, or any other data, instruction, or unified cache of this PE, from outside of those caches" }, /* END Neoverse N3 specific events */ }; papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_neoverse_v1_events.h /* * Contributed by John Linford * Stephane Eranian * * SPDX-FileCopyrightText: Copyright (c) 2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
* SPDX-License-Identifier: MIT * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER * DEALINGS IN THE SOFTWARE. 
* * ARM Neoverse V1 * References: * - https://github.com/ARM-software/data/blob/master/pmu/neoverse-v1.json * - [Arm Neoverse V1 PMU Guide](https://developer.arm.com/documentation/PJDOC-1063724031-605393/r1p1/?lang=en) * - [Arm Architecture Reference Manual Armv8, for Armv8-A architecture profile](https://developer.arm.com/documentation/ddi0487/ga) * - [Arm Architecture Reference Manual Supplement: The Scalable Vector Extension (SVE), for ARMv8-A](https://developer.arm.com/documentation/ddi0584/ba/) */ static const arm_entry_t arm_v1_pe[]={ {.name = "SW_INCR", .modmsk = ARMV8_ATTRS, .code = 0x00, .desc = "Instruction architecturally executed (condition check pass) software increment" }, {.name = "L1I_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x01, .desc = "Level 1 instruction cache refills" }, {.name = "L1I_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x02, .desc = "Level 1 instruction TLB refills" }, {.name = "L1D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x03, .desc = "Level 1 data cache refills" }, {.name = "L1D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x04, .desc = "Level 1 data cache accesses" }, {.name = "L1D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x05, .desc = "Level 1 data TLB refills" }, {.name = "INST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x08, .desc = "Instructions architecturally executed" }, {.name = "EXC_TAKEN", .modmsk = ARMV8_ATTRS, .code = 0x09, .desc = "Exceptions taken" }, {.name = "EXC_RETURN", .modmsk = ARMV8_ATTRS, .code = 0x0a, .desc = "Instructions architecturally executed (condition check pass) - Exception return" }, {.name = "CID_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0b, .desc = "Instructions architecturally executed (condition check pass) - Write to CONTEXTIDR", }, {.name = "BR_MIS_PRED", .modmsk = ARMV8_ATTRS, .code = 0x10, .desc = "Mispredicted or not predicted branches speculatively executed" }, {.name = "CPU_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x11, .desc = "Cycles" }, {.name = "BR_PRED", .modmsk = 
ARMV8_ATTRS, .code = 0x12, .desc = "Predictable branches speculatively executed" }, {.name = "MEM_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x13, .desc = "Data memory accesses" }, {.name = "L1I_CACHE_ACCESS", .modmsk = ARMV8_ATTRS, .equiv = "L1I_CACHE", .code = 0x14, .desc = "Level 1 instruction cache accesses (deprecated)" }, {.name = "L1I_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x14, .desc = "Level 1 instruction cache accesses" }, {.name = "L1D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x15, .desc = "Level 1 data cache write-backs" }, {.name = "L2D_CACHE_ACCESS", .modmsk = ARMV8_ATTRS, .equiv = "L2D_CACHE", .code = 0x16, .desc = "Level 2 data cache accesses (alias to L2D_CACHE)" }, {.name = "L2D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x16, .desc = "Level 2 data cache accesses" }, {.name = "L2D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x17, .desc = "Level 2 data cache refills" }, {.name = "L2D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x18, .desc = "Level 2 data cache write-backs" }, {.name = "BUS_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x19, .desc = "Counts every beat of data transferred over the data channels between the core and the SCU" }, {.name = "MEMORY_ERROR", .modmsk = ARMV8_ATTRS, .code = 0x1a, .desc = "Local memory errors" }, {.name = "INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x1b, .desc = "Instructions speculatively executed" }, {.name = "TTBR_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x1c, .desc = "Instructions architecturally executed (condition check pass, write to TTBR). Counts writes to TTBR0_EL1/TTBR1_EL1 in aarch64 mode" }, {.name = "BUS_MASTER_CYCLE", .modmsk = ARMV8_ATTRS, .code = 0x1d, .desc = "Bus cycles. This event duplicates cycles", }, {.name = "COUNTER_OVERFLOW", .modmsk = ARMV8_ATTRS, .code = 0x1e, .desc = "For odd-numbered counters, this event increments the count by one for each overflow of the preceding even-numbered counter.
There is no increment for even-numbered counters", }, {.name = "L2D_CACHE_ALLOCATE", .modmsk = ARMV8_ATTRS, .code = 0x20, .desc = "Level 2 data/unified cache allocations without refill" }, {.name = "BR_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x21, .desc = "Counts all branches on the architecturally executed path that would incur cost if mispredicted" }, {.name = "BR_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x22, .desc = "Instructions executed, mis-predicted branch. All instructions counted by BR_RETIRED that were not correctly predicted" }, {.name = "STALL_FRONTEND", .modmsk = ARMV8_ATTRS, .code = 0x23, .desc = "Cycles in which no operation issued because there were no operations to issue" }, {.name = "STALL_BACKEND", .modmsk = ARMV8_ATTRS, .code = 0x24, .desc = "Cycles in which no operation issued due to back-end resources being unavailable" }, {.name = "L1D_TLB", .modmsk = ARMV8_ATTRS, .code = 0x25, .desc = "Level 1 data TLB accesses" }, {.name = "L1I_TLB", .modmsk = ARMV8_ATTRS, .code = 0x26, .desc = "Instruction TLB accesses" }, {.name = "L2D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x2d, .desc = "Attributable L2 data or unified TLB refills. Counts on any refill of the L2TLB caused by either an instruction or data access (MMU must be enabled)", }, {.name = "L2D_TLB_REQ", .modmsk = ARMV8_ATTRS, .code = 0x2f, .desc = "Attributable L2 TLB accesses. Counts on any access to the MMUTC (caused by a refill of any of the L1 TLBs). This event does not count if the MMU is disabled", }, {.name = "L2D_TLB", .modmsk = ARMV8_ATTRS, .code = 0x2f, .equiv = "L2D_TLB_REQ", .desc = "Attributable L2 TLB accesses. Counts on any access to the MMUTC (caused by a refill of any of the L1 TLBs). 
This event does not count if the MMU is disabled", }, {.name = "REMOTE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x31, .desc = "Number of accesses to another socket", }, {.name = "DTLB_WALK", .modmsk = ARMV8_ATTRS, .code = 0x34, .desc = "Accesses to the data TLB that caused a page walk. Counts any data access which causes L2D_TLB_REFILL to count", }, {.name = "ITLB_WALK", .modmsk = ARMV8_ATTRS, .code = 0x35, .desc = "Accesses to the instruction TLB that caused a page walk. Counts any instruction which causes L2D_TLB_REFILL to count", }, {.name = "LL_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x36, .desc = "Last Level cache accesses for reads", }, {.name = "LL_CACHE_MISS_RD", .modmsk = ARMV8_ATTRS, .code = 0x37, .desc = "Last Level cache misses for reads", }, {.name = "L1D_CACHE_LMISS_RD", .modmsk = ARMV8_ATTRS, .code = 0x39, .desc = "Counts the number of Level 1 data cache long-latency misses", }, {.name = "OP_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x3a, .desc = "Counts the number of micro-ops architecturally executed", }, {.name = "OP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x3b, .desc = "Counts the number of speculatively executed micro-ops", }, {.name = "STALL", .modmsk = ARMV8_ATTRS, .code = 0x3c, .desc = "Counts cycles in which no operation is sent for execution", }, {.name = "STALL_SLOT_BACKEND", .modmsk = ARMV8_ATTRS, .code = 0x3d, .desc = "No operation sent for execution on a slot due to the backend", }, {.name = "STALL_SLOT_FRONTEND", .modmsk = ARMV8_ATTRS, .code = 0x3e, .desc = "No operation sent for execution on a slot due to the frontend", }, {.name = "STALL_SLOT", .modmsk = ARMV8_ATTRS, .code = 0x3f, .desc = "No operation sent for execution on a slot", }, {.name = "L1D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x40, .desc = "Level 1 data cache read accesses" }, {.name = "L1D_CACHE_WR", .modmsk = ARMV8_ATTRS, .code = 0x41, .desc = "Level 1 data cache write accesses" }, {.name = "L1D_CACHE_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x42, .desc = "Level 1 data
cache read refills" }, {.name = "L1D_CACHE_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x43, .desc = "Level 1 data cache write refills" }, {.name = "L1D_CACHE_REFILL_INNER", .modmsk = ARMV8_ATTRS, .code = 0x44, .desc = "Level 1 data cache refills, inner. Counts any L1D cache line fill which hits in the L2, L3 or another core in the cluster" }, {.name = "L1D_CACHE_REFILL_OUTER", .modmsk = ARMV8_ATTRS, .code = 0x45, .desc = "Level 1 data cache refills, outer. Counts any L1D cache line fill which does not hit in the L2, L3 or another core in the cluster and instead obtains data from outside the cluster" }, {.name = "L1D_CACHE_WB_VICTIM", .modmsk = ARMV8_ATTRS, .code = 0x46, .desc = "Level 1 data cache write-backs (victim eviction)", }, {.name = "L1D_CACHE_WB_CLEAN", .modmsk = ARMV8_ATTRS, .code = 0x47, .desc = "Level 1 data cache write-backs (clean and coherency eviction)", }, {.name = "L1D_CACHE_INVAL", .modmsk = ARMV8_ATTRS, .code = 0x48, .desc = "Level 1 data cache invalidations" }, {.name = "L1D_TLB_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x4c, .desc = "Level 1 data TLB read refills" }, {.name = "L1D_TLB_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x4d, .desc = "Level 1 data TLB write refills" }, {.name = "L1D_TLB_RD", .modmsk = ARMV8_ATTRS, .code = 0x4e, .desc = "Level 1 data TLB read accesses" }, {.name = "L1D_TLB_WR", .modmsk = ARMV8_ATTRS, .code = 0x4f, .desc = "Level 1 data TLB write accesses" }, {.name = "L2D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x50, .desc = "Level 2 data cache read accesses" }, {.name = "L2D_CACHE_WR", .modmsk = ARMV8_ATTRS, .code = 0x51, .desc = "Level 2 data cache write accesses" }, {.name = "L2D_CACHE_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x52, .desc = "Level 2 data cache read refills" }, {.name = "L2D_CACHE_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x53, .desc = "Level 2 data cache write refills" }, {.name = "L2D_CACHE_WB_VICTIM", .modmsk = ARMV8_ATTRS, .code = 0x56, .desc = "Level 2 data cache victim write-backs" }, {.name 
= "L2D_CACHE_WB_CLEAN", .modmsk = ARMV8_ATTRS, .code = 0x57, .desc = "Level 2 data cache cleaning and coherency write-backs" }, {.name = "L2D_CACHE_INVAL", .modmsk = ARMV8_ATTRS, .code = 0x58, .desc = "Level 2 data cache invalidations" }, {.name = "L2D_TLB_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x5c, .desc = "Level 2 data TLB refills on read" }, {.name = "L2D_TLB_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x5d, .desc = "Level 2 data TLB refills on write" }, {.name = "L2D_TLB_RD", .modmsk = ARMV8_ATTRS, .code = 0x5e, .desc = "Level 2 data TLB accesses on read" }, {.name = "L2D_TLB_WR", .modmsk = ARMV8_ATTRS, .code = 0x5f, .desc = "Level 2 data TLB accesses on write" }, {.name = "BUS_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x60, .desc = "Bus read accesses" }, {.name = "BUS_ACCESS_WR", .modmsk = ARMV8_ATTRS, .code = 0x61, .desc = "Bus write accesses" }, {.name = "MEM_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x66, .desc = "Data memory read accesses (includes SVE)" }, {.name = "MEM_READ_ACCESS", .modmsk = ARMV8_ATTRS, .equiv = "MEM_ACCESS_RD", .code = 0x66, .desc = "Data memory read accesses" }, {.name = "MEM_ACCESS_WR", .modmsk = ARMV8_ATTRS, .code = 0x67, .desc = "Data memory write accesses (includes SVE)" }, {.name = "MEM_WRITE_ACCESS", .modmsk = ARMV8_ATTRS, .equiv = "MEM_ACCESS_WR", .code = 0x67, .desc = "Data memory write accesses" }, {.name = "UNALIGNED_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x68, .desc = "Unaligned read accesses" }, {.name = "UNALIGNED_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x69, .desc = "Unaligned write accesses" }, {.name = "UNALIGNED_LDST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6a, .desc = "Unaligned accesses (includes speculatively executed SVE load and store operations that access at least one unaligned element address)" }, {.name = "UNALIGNED_LDST_ACCESS", .modmsk = ARMV8_ATTRS, .equiv = "UNALIGNED_LDST_SPEC", .code = 0x6a, .desc = "Unaligned accesses" }, {.name = "LDREX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6c, .desc = 
"Exclusive operations speculatively executed - LDREX or LDX" }, {.name = "STREX_PASS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6d, .desc = "Exclusive operations speculatively executed - STREX or STX pass" }, {.name = "STREX_FAIL_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6e, .desc = "Exclusive operations speculatively executed - STREX or STX fail" }, {.name = "STREX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6f, .desc = "Exclusive operations speculatively executed - STREX or STX" }, {.name = "LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x70, .desc = "Load instructions speculatively executed" }, {.name = "ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x71, .desc = "Store instructions speculatively executed" }, {.name = "DP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x73, .desc = "Integer data processing instructions speculatively executed" }, {.name = "ASE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x74, .desc = "Advanced SIMD instructions speculatively executed" }, {.name = "VFP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x75, .desc = "Floating-point instructions speculatively executed" }, {.name = "PC_WRITE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x76, .desc = "Software change of the PC instruction speculatively executed" }, {.name = "CRYPTO_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x77, .desc = "Cryptographic instructions speculatively executed" }, {.name = "BR_IMMED_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x78, .desc = "Immediate branches speculatively executed" }, {.name = "BR_RET_SPEC", .modmsk = ARMV8_ATTRS, .equiv = "BR_RETURN_SPEC", .code = 0x79, .desc = "Return branches speculatively executed" }, {.name = "BR_RETURN_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x79, .desc = "Return branches speculatively executed" }, {.name = "BR_INDIRECT_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7a, .desc = "Indirect branches speculatively executed" }, {.name = "ISB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7c, .desc = "ISB barriers speculatively executed" }, {.name = "DSB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7d, .desc =
"DSB barriers speculatively executed" }, {.name = "DMB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7e, .desc = "DMB barriers speculatively executed" }, {.name = "EXC_UNDEF", .modmsk = ARMV8_ATTRS, .code = 0x81, .desc = "Undefined exceptions taken locally" }, {.name = "EXC_SVC", .modmsk = ARMV8_ATTRS, .code = 0x82, .desc = "Exceptions taken, supervisor call" }, {.name = "EXC_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x83, .desc = "Exceptions taken, instruction abort" }, {.name = "EXC_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x84, .desc = "Exceptions taken locally, data abort or SError" }, {.name = "EXC_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x86, .desc = "Exceptions taken locally, IRQ" }, {.name = "EXC_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x87, .desc = "Exceptions taken locally, FIQ" }, {.name = "EXC_SMC", .modmsk = ARMV8_ATTRS, .code = 0x88, .desc = "Exceptions taken locally, secure monitor call" }, {.name = "EXC_HVC", .modmsk = ARMV8_ATTRS, .code = 0x8a, .desc = "Exceptions taken, hypervisor call" }, {.name = "EXC_TRAP_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x8b, .desc = "Exceptions taken, instruction abort not taken locally" }, {.name = "EXC_TRAP_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x8c, .desc = "Exceptions taken, data abort or SError not taken locally" }, {.name = "EXC_TRAP_OTHER", .modmsk = ARMV8_ATTRS, .code = 0x8d, .desc = "Exceptions taken, other traps not taken locally" }, {.name = "EXC_TRAP_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x8e, .desc = "Exceptions taken, irq not taken locally" }, {.name = "EXC_TRAP_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x8f, .desc = "Exceptions taken, fiq not taken locally" }, {.name = "RC_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x90, .desc = "Release consistency instructions speculatively executed (load-acquire)", }, {.name = "RC_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x91, .desc = "Release consistency instructions speculatively executed (store-release)", }, {.name = "SAMPLE_POP", .modmsk = ARMV8_ATTRS, .code = 0x4000, .desc = "Number 
of operations that might be sampled by SPE, whether or not the operation was sampled", }, {.name = "SAMPLE_FEED", .modmsk = ARMV8_ATTRS, .code = 0x4001, .desc = "Number of times the SPE sample interval counter reaches zero and is reloaded", }, {.name = "SAMPLE_FILTRATE", .modmsk = ARMV8_ATTRS, .code = 0x4002, .desc = "Number of times SPE completed sample record passes the SPE filters and is written to the buffer" }, {.name = "SAMPLE_COLLISION", .modmsk = ARMV8_ATTRS, .code = 0x4003, .desc = "Number of times SPE has a sample record taken when the previous sampled operation has not yet completed its record" }, {.name = "CNT_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x4004, .desc = "Constant frequency cycles", }, {.name = "STALL_BACKEND_MEM", .modmsk = ARMV8_ATTRS, .code = 0x4005, .desc = "No operation sent due to the backend and memory stalls", }, {.name = "L1I_CACHE_LMISS", .modmsk = ARMV8_ATTRS, .code = 0x4006, .desc = "Counts L1 instruction cache long latency misses", }, {.name = "L2D_CACHE_LMISS_RD", .modmsk = ARMV8_ATTRS, .code = 0x4009, .desc = "Counts L2 cache long latency misses", }, {.name = "ASE_INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8005, .desc = "Advanced SIMD operations speculatively executed", }, {.name = "SVE_INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8006, .desc = "SVE operations speculatively executed", }, {.name = "SVE_PRED_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8074, .desc = "SVE predicated operations speculatively executed", }, {.name = "SVE_PRED_EMPTY_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8075, .desc = "SVE predicated operations with no active predicates speculatively executed", }, {.name = "SVE_PRED_FULL_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8076, .desc = "SVE predicated operations with all active predicates speculatively executed", }, {.name = "SVE_PRED_PARTIAL_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8077, .desc = "SVE predicated operations with partially active predicates speculatively executed", }, {.name = "SVE_LDFF_SPEC", .modmsk
= ARMV8_ATTRS, .code = 0x80bc, .desc = "SVE first-fault load operations speculatively executed", }, {.name = "SVE_LDFF_FAULT_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80bd, .desc = "SVE first-fault load operations speculatively executed which set FFR bit to 0", }, {.name = "FP_SCALE_OPS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80c0, .desc = "Scalable floating-point element operations speculatively executed", }, {.name = "FP_FIXED_OPS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80c1, .desc = "Non-scalable floating-point element operations speculatively executed", }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_neoverse_v2_events.h000066400000000000000000000605641502707512200247020ustar00rootroot00000000000000/* * Contributed by John Linford * Stephane Eranian * * SPDX-FileCopyrightText: Copyright (c) 2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved. * SPDX-License-Identifier: MIT * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER * DEALINGS IN THE SOFTWARE.
* * ARM Neoverse V2 * References: * - Arm Neoverse V2 Core TRM: https://developer.arm.com/documentation/102375/0002 * - https://github.com/ARM-software/data/blob/master/pmu/neoverse-v2.json */ static const arm_entry_t arm_v2_pe[]={ {.name = "SW_INCR", .modmsk = ARMV9_ATTRS, .code = 0x00, .desc = "Instruction architecturally executed (condition check pass) software increment" }, {.name = "L1I_CACHE_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x01, .desc = "Level 1 instruction cache refills" }, {.name = "L1I_TLB_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x02, .desc = "Level 1 instruction TLB refills" }, {.name = "L1D_CACHE_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x03, .desc = "Level 1 data cache refills" }, {.name = "L1D_CACHE", .modmsk = ARMV9_ATTRS, .code = 0x04, .desc = "Level 1 data cache accesses" }, {.name = "L1D_TLB_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x05, .desc = "Level 1 data TLB refills" }, {.name = "INST_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x08, .desc = "Instructions architecturally executed" }, {.name = "EXC_TAKEN", .modmsk = ARMV9_ATTRS, .code = 0x09, .desc = "Exceptions taken" }, {.name = "EXC_RETURN", .modmsk = ARMV9_ATTRS, .code = 0x0a, .desc = "Instructions architecturally executed (condition check pass) - Exception return" }, {.name = "CID_WRITE_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x0b, .desc = "Instructions architecturally executed (condition check pass) - Write to CONTEXTIDR", }, {.name = "BR_MIS_PRED", .modmsk = ARMV9_ATTRS, .code = 0x10, .desc = "Mispredicted or not predicted branches speculatively executed" }, {.name = "CPU_CYCLES", .modmsk = ARMV9_ATTRS, .code = 0x11, .desc = "Cycles" }, {.name = "BR_PRED", .modmsk = ARMV9_ATTRS, .code = 0x12, .desc = "Predictable branches speculatively executed" }, {.name = "MEM_ACCESS", .modmsk = ARMV9_ATTRS, .code = 0x13, .desc = "Data memory accesses" }, {.name = "L1I_CACHE_ACCESS", .modmsk = ARMV9_ATTRS, .equiv = "L1I_CACHE", .code = 0x14, .desc = "Level 1 instruction cache accesses (deprecated)" 
}, {.name = "L1I_CACHE", .modmsk = ARMV9_ATTRS, .code = 0x14, .desc = "Level 1 instruction cache accesses" }, {.name = "L1D_CACHE_WB", .modmsk = ARMV9_ATTRS, .code = 0x15, .desc = "Level 1 data cache write-backs" }, {.name = "L2D_CACHE", .modmsk = ARMV9_ATTRS, .code = 0x16, .desc = "Level 2 data cache accesses" }, {.name = "L2D_CACHE_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x17, .desc = "Level 2 data cache refills" }, {.name = "L2D_CACHE_WB", .modmsk = ARMV9_ATTRS, .code = 0x18, .desc = "Level 2 data cache write-backs" }, {.name = "BUS_ACCESS", .modmsk = ARMV9_ATTRS, .code = 0x19, .desc = "Bus access This event counts for every beat of data transferred over the data channels between the core and the Snoop Control Unit (SCU). If both read and write data beats are transferred on a given cycle, this event is counted twice on that cycle. This event counts the sum of BUS_ACCESS_RD, BUS_ACCESS_WR, and any snoop data responses" }, {.name = "MEMORY_ERROR", .modmsk = ARMV9_ATTRS, .code = 0x1a, .desc = "Local memory error This event counts any correctable or uncorrectable memory error (ECC or parity) in the protected core RAMs" }, {.name = "INST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x1b, .desc = "Instructions speculatively executed" }, {.name = "TTBR_WRITE_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x1c, .desc = "Instruction architecturally executed, condition code check pass, write to TTBR This event only counts writes to TTBR0_EL1/TTBR1_EL1. Accesses to TTBR0_EL12/TTBR1_EL12 or TTBR0_EL2/TTBR1_EL2 are not counted" }, {.name = "BUS_MASTER_CYCLE", .modmsk = ARMV9_ATTRS, .code = 0x1d, .desc = "Bus cycles. This event duplicate cycles", }, {.name = "COUNTER_OVERFLOW", .modmsk = ARMV9_ATTRS, .code = 0x1e, .desc = "For odd-numbered counters, this event increments the count by one for each overflow of the preceding even-numbered counter. 
For even-numbered counters, there is no increment", }, {.name = "L2D_CACHE_ALLOCATE", .modmsk = ARMV9_ATTRS, .code = 0x20, .desc = "Level 2 data/unified cache allocations without refill" }, {.name = "BR_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x21, .desc = "Counts all branches on the architecturally executed path that would incur cost if mispredicted" }, {.name = "BR_MIS_PRED_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x22, .desc = "Instructions executed, mis-predicted branch. All instructions counted by BR_RETIRED that were not correctly predicted" }, {.name = "STALL_FRONTEND", .modmsk = ARMV9_ATTRS, .code = 0x23, .desc = "Cycles in which no operation issued because there were no operations to issue" }, {.name = "STALL_BACKEND", .modmsk = ARMV9_ATTRS, .code = 0x24, .desc = "Cycles in which no operation issued due to back-end resources being unavailable" }, {.name = "L1D_TLB", .modmsk = ARMV9_ATTRS, .code = 0x25, .desc = "Level 1 data TLB accesses" }, {.name = "L1I_TLB", .modmsk = ARMV9_ATTRS, .code = 0x26, .desc = "Instruction TLB accesses" }, {.name = "L3D_CACHE_ALLOCATE", .modmsk = ARMV9_ATTRS, .code = 0x29, .desc = "Attributable L3 cache allocation without refill This event counts any full cache line write into the L3 cache which does not cause a linefill, including Write-Backs from L2 to L3 and full-line writes which do not allocate into L2" }, {.name = "L3D_CACHE_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x2a, .desc = "Attributable L3 cache refill This event counts for any cacheable read transaction returning data from the SCU for which the data source was outside the cluster. 
Transactions such as ReadUnique are counted as read transactions, even though they can be generated by store instructions" }, {.name = "L3D_CACHE", .modmsk = ARMV9_ATTRS, .code = 0x2b, .desc = "Attributable L3 cache access This event counts for any cacheable read transaction returning data from the SCU, or for any cacheable write to the SCU" }, {.name = "L2D_TLB_REFILL", .modmsk = ARMV9_ATTRS, .code = 0x2d, .desc = "Attributable L2 data or unified TLB refills. Counts on any refill of the L2TLB caused by either an instruction or data access (MMU must be enabled)", }, {.name = "L2_TLB_REQ", .modmsk = ARMV9_ATTRS, .code = 0x2f, .equiv = "L2D_TLB", .desc = "Attributable L2 TLB access This event counts on any access to the MMUTC (caused by a refill of any of the L1 TLBs). This event does not count if the MMU is disabled", }, {.name = "L2D_TLB", .modmsk = ARMV9_ATTRS, .code = 0x2f, .desc = "Attributable L2 TLB access This event counts on any access to the MMUTC (caused by a refill of any of the L1 TLBs). This event does not count if the MMU is disabled", }, {.name = "REMOTE_ACCESS", .modmsk = ARMV9_ATTRS, .code = 0x31, .desc = "Number of accesses to another socket", }, {.name = "DTLB_WALK", .modmsk = ARMV9_ATTRS, .code = 0x34, .desc = "Accesses to the data TLB that caused a page walk. Counts any data access which causes L2D_TLB_REFILL to count", }, {.name = "ITLB_WALK", .modmsk = ARMV9_ATTRS, .code = 0x35, .desc = "Accesses to the instruction TLB that caused a page walk. 
Counts any instruction which causes L2D_TLB_REFILL to count", }, {.name = "LL_CACHE_RD", .modmsk = ARMV9_ATTRS, .code = 0x36, .desc = "Last Level cache accesses for reads", }, {.name = "LL_CACHE_MISS_RD", .modmsk = ARMV9_ATTRS, .code = 0x37, .desc = "Last Level cache misses for reads", }, {.name = "L1D_CACHE_LMISS_RD", .modmsk = ARMV9_ATTRS, .code = 0x39, .desc = "Counts the number of Level 1 data cache long-latency misses", }, {.name = "OP_RETIRED", .modmsk = ARMV9_ATTRS, .code = 0x3a, .desc = "Counts the number of micro-ops architecturally executed", }, {.name = "OP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x3b, .desc = "Counts the number of speculatively executed micro-ops", }, {.name = "STALL", .modmsk = ARMV9_ATTRS, .code = 0x3c, .desc = "Counts cycles in which no operation is sent for execution", }, {.name = "STALL_SLOT_BACKEND", .modmsk = ARMV9_ATTRS, .code = 0x3d, .desc = "No operation sent for execution on a slot due to the backend", }, {.name = "STALL_SLOT_FRONTEND", .modmsk = ARMV9_ATTRS, .code = 0x3e, .desc = "No operation sent for execution on a slot due to the frontend", }, {.name = "STALL_SLOT", .modmsk = ARMV9_ATTRS, .code = 0x3f, .desc = "No operation sent for execution on a slot", }, {.name = "L1D_CACHE_RD", .modmsk = ARMV9_ATTRS, .code = 0x40, .desc = "Level 1 data cache read accesses" }, {.name = "L1D_CACHE_WR", .modmsk = ARMV9_ATTRS, .code = 0x41, .desc = "Level 1 data cache write accesses" }, {.name = "L1D_CACHE_REFILL_RD", .modmsk = ARMV9_ATTRS, .code = 0x42, .desc = "Level 1 data cache read refills" }, {.name = "L1D_CACHE_REFILL_WR", .modmsk = ARMV9_ATTRS, .code = 0x43, .desc = "Level 1 data cache write refills" }, {.name = "L1D_CACHE_REFILL_INNER", .modmsk = ARMV9_ATTRS, .code = 0x44, .desc = "Level 1 data cache refills, inner. Counts any L1D cache line fill which hits in the L2, L3 or another core in the cluster" }, {.name = "L1D_CACHE_REFILL_OUTER", .modmsk = ARMV9_ATTRS, .code = 0x45, .desc = "Level 1 data cache refills, outer.
Counts any L1D cache line fill which does not hit in the L2, L3 or another core in the cluster and instead obtains data from outside the cluster" }, {.name = "L1D_CACHE_WB_VICTIM", .modmsk = ARMV9_ATTRS, .code = 0x46, .desc = "Level 1 data cache write-backs (victim eviction)", }, {.name = "L1D_CACHE_WB_CLEAN", .modmsk = ARMV9_ATTRS, .code = 0x47, .desc = "Level 1 data cache write-backs (clean and coherency eviction)", }, {.name = "L1D_CACHE_INVAL", .modmsk = ARMV9_ATTRS, .code = 0x48, .desc = "Level 1 data cache invalidations" }, {.name = "L1D_TLB_REFILL_RD", .modmsk = ARMV9_ATTRS, .code = 0x4c, .desc = "Level 1 data TLB read refills" }, {.name = "L1D_TLB_REFILL_WR", .modmsk = ARMV9_ATTRS, .code = 0x4d, .desc = "Level 1 data TLB write refills" }, {.name = "L1D_TLB_RD", .modmsk = ARMV9_ATTRS, .code = 0x4e, .desc = "Level 1 data TLB read accesses" }, {.name = "L1D_TLB_WR", .modmsk = ARMV9_ATTRS, .code = 0x4f, .desc = "Level 1 data TLB write accesses" }, {.name = "L2D_CACHE_RD", .modmsk = ARMV9_ATTRS, .code = 0x50, .desc = "Level 2 data cache read accesses" }, {.name = "L2D_CACHE_WR", .modmsk = ARMV9_ATTRS, .code = 0x51, .desc = "Level 2 data cache write accesses" }, {.name = "L2D_CACHE_REFILL_RD", .modmsk = ARMV9_ATTRS, .code = 0x52, .desc = "Level 2 data cache read refills" }, {.name = "L2D_CACHE_REFILL_WR", .modmsk = ARMV9_ATTRS, .code = 0x53, .desc = "Level 2 data cache write refills" }, {.name = "L2D_CACHE_WB_VICTIM", .modmsk = ARMV9_ATTRS, .code = 0x56, .desc = "Level 2 data cache victim write-backs" }, {.name = "L2D_CACHE_WB_CLEAN", .modmsk = ARMV9_ATTRS, .code = 0x57, .desc = "Level 2 data cache cleaning and coherency write-backs" }, {.name = "L2D_CACHE_INVAL", .modmsk = ARMV9_ATTRS, .code = 0x58, .desc = "Level 2 data cache invalidations" }, {.name = "L2D_TLB_REFILL_RD", .modmsk = ARMV9_ATTRS, .code = 0x5c, .desc = "Level 2 data TLB refills on read" }, {.name = "L2D_TLB_REFILL_WR", .modmsk = ARMV9_ATTRS, .code = 0x5d, .desc = "Level 2 data TLB refills on 
write" }, {.name = "L2D_TLB_RD", .modmsk = ARMV9_ATTRS, .code = 0x5e, .desc = "Level 2 data TLB accesses on read" }, {.name = "L2D_TLB_WR", .modmsk = ARMV9_ATTRS, .code = 0x5f, .desc = "Level 2 data TLB accesses on write" }, {.name = "BUS_ACCESS_RD", .modmsk = ARMV9_ATTRS, .code = 0x60, .desc = "Bus read accesses" }, {.name = "BUS_ACCESS_WR", .modmsk = ARMV9_ATTRS, .code = 0x61, .desc = "Bus write accesses" }, {.name = "MEM_ACCESS_RD", .modmsk = ARMV9_ATTRS, .code = 0x66, .desc = "Data memory read accesses (includes SVE)" }, {.name = "MEM_READ_ACCESS", .modmsk = ARMV9_ATTRS, .equiv = "MEM_ACCESS_RD", .code = 0x66, .desc = "Data memory read accesses" }, {.name = "MEM_ACCESS_WR", .modmsk = ARMV9_ATTRS, .code = 0x67, .desc = "Data memory write accesses (includes SVE)" }, {.name = "MEM_WRITE_ACCESS", .modmsk = ARMV9_ATTRS, .equiv = "MEM_ACCESS_WR", .code = 0x67, .desc = "Data memory write accesses" }, {.name = "UNALIGNED_LD_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x68, .desc = "Unaligned read accesses" }, {.name = "UNALIGNED_ST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x69, .desc = "Unaligned write accesses" }, {.name = "UNALIGNED_LDST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x6a, .desc = "Unaligned accesses (includes speculatively executed SVE load and store operations that access at least one unaligned element address)" }, {.name = "UNALIGNED_LDST_ACCESS", .modmsk = ARMV9_ATTRS, .equiv = "UNALIGNED_LDST_SPEC", .code = 0x6a, .desc = "Unaligned accesses" }, {.name = "LDREX_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x6c, .desc = "Exclusive operations speculatively executed - LDREX or LDX" }, {.name = "STREX_PASS_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x6d, .desc = "Exclusive operations speculative executed - STREX or STX pass" }, {.name = "STREX_FAIL_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x6e, .desc = "Exclusive operations speculative executed - STREX or STX fail" }, {.name = "STREX_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x6f, .desc = "Exclusive operations speculatively executed - 
STREX or STX" }, {.name = "LD_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x70, .desc = "Load instructions speculatively executed" }, {.name = "ST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x71, .desc = "Store instructions speculatively executed" }, {.name = "DP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x73, .desc = "Integer data processing instructions speculatively executed" }, {.name = "ASE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x74, .desc = "Advanced SIMD instructions speculatively executed" }, {.name = "VFP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x75, .desc = "Floating-point instructions speculatively executed" }, {.name = "PC_WRITE_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x76, .desc = "Software change of the PC instruction speculatively executed" }, {.name = "CRYPTO_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x77, .desc = "Cryptographic instructions speculatively executed" }, {.name = "BR_IMMED_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x78, .desc = "Immediate branches speculatively executed" }, {.name = "BR_RET_SPEC", .modmsk = ARMV9_ATTRS, .equiv = "BR_RETURN_SPEC", .code = 0x79, .desc = "Return branches speculatively executed" }, {.name = "BR_RETURN_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x79, .desc = "Return branches speculatively executed" }, {.name = "BR_INDIRECT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x7a, .desc = "Indirect branches speculatively executed" }, {.name = "ISB_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x7c, .desc = "ISB barriers speculatively executed" }, {.name = "DSB_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x7d, .desc = "DSB barriers speculatively executed" }, {.name = "DMB_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x7e, .desc = "DMB barriers speculatively executed" }, {.name = "EXC_UNDEF", .modmsk = ARMV9_ATTRS, .code = 0x81, .desc = "Undefined exceptions taken locally" }, {.name = "EXC_SVC", .modmsk = ARMV9_ATTRS, .code = 0x82, .desc = "Exceptions taken, supervisor call" }, {.name = "EXC_PABORT", .modmsk = ARMV9_ATTRS, .code = 0x83, .desc = "Exceptions taken, instruction 
abort" }, {.name = "EXC_DABORT", .modmsk = ARMV9_ATTRS, .code = 0x84, .desc = "Exceptions taken locally, data abort or SError" }, {.name = "EXC_IRQ", .modmsk = ARMV9_ATTRS, .code = 0x86, .desc = "Exceptions taken locally, IRQ" }, {.name = "EXC_FIQ", .modmsk = ARMV9_ATTRS, .code = 0x87, .desc = "Exceptions taken locally, FIQ" }, {.name = "EXC_SMC", .modmsk = ARMV9_ATTRS, .code = 0x88, .desc = "Exceptions taken locally, secure monitor call" }, {.name = "EXC_HVC", .modmsk = ARMV9_ATTRS, .code = 0x8a, .desc = "Exceptions taken, hypervisor call" }, {.name = "EXC_TRAP_PABORT", .modmsk = ARMV9_ATTRS, .code = 0x8b, .desc = "Exceptions taken, instruction abort not taken locally" }, {.name = "EXC_TRAP_DABORT", .modmsk = ARMV9_ATTRS, .code = 0x8c, .desc = "Exceptions taken, data abort or SError not taken locally" }, {.name = "EXC_TRAP_OTHER", .modmsk = ARMV9_ATTRS, .code = 0x8d, .desc = "Exceptions taken, other traps not taken locally" }, {.name = "EXC_TRAP_IRQ", .modmsk = ARMV9_ATTRS, .code = 0x8e, .desc = "Exceptions taken, irq not taken locally" }, {.name = "EXC_TRAP_FIQ", .modmsk = ARMV9_ATTRS, .code = 0x8f, .desc = "Exceptions taken, fiq not taken locally" }, {.name = "RC_LD_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x90, .desc = "Release consistency instructions speculatively executed (load-acquire)", }, {.name = "RC_ST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x91, .desc = "Release consistency instructions speculatively executed (store-release)", }, {.name = "L3_CACHE_RD", .modmsk = ARMV9_ATTRS, .code = 0xA0, .desc = "L3 cache read", }, {.name = "SAMPLE_POP", .modmsk = ARMV9_ATTRS, .code = 0x4000, .desc = "Number of operations that might be sampled by SPE, whether or not the operation was sampled", }, {.name = "SAMPLE_FEED", .modmsk = ARMV9_ATTRS, .code = 0x4001, .desc = "Number of times the SPE sample interval counter reaches zero and is reloaded", }, {.name = "SAMPLE_FILTRATE", .modmsk = ARMV9_ATTRS, .code = 0x4002, .desc = "Number of times SPE completed sample record 
passes the SPE filters and is written to the buffer" }, {.name = "SAMPLE_COLLISION", .modmsk = ARMV9_ATTRS, .code = 0x4003, .desc = "Number of times SPE has a sample record taken when the previous sampled operation has not yet completed its record" }, {.name = "CNT_CYCLES", .modmsk = ARMV9_ATTRS, .code = 0x4004, .desc = "Constant frequency cycles", }, {.name = "STALL_BACKEND_MEM", .modmsk = ARMV9_ATTRS, .code = 0x4005, .desc = "No operation sent due to the backend and memory stalls", }, {.name = "L1I_CACHE_LMISS", .modmsk = ARMV9_ATTRS, .code = 0x4006, .desc = "Counts L1 instruction cache long latency misses", }, {.name = "L2D_CACHE_LMISS_RD", .modmsk = ARMV9_ATTRS, .code = 0x4009, .desc = "Counts L2 cache long latency misses", }, {.name = "TRB_WRAP", .modmsk = ARMV9_ATTRS, .code = 0x400C, .desc = "Trace buffer current write pointer wrapped", }, {.name = "TRCEXTOUT0", .modmsk = ARMV9_ATTRS, .code = 0x4010, .desc = "PE Trace Unit external output 0 This event is not exported to the trace unit", }, {.name = "TRCEXTOUT1", .modmsk = ARMV9_ATTRS, .code = 0x4011, .desc = "PE Trace Unit external output 1 This event is not exported to the trace unit", }, {.name = "TRCEXTOUT2", .modmsk = ARMV9_ATTRS, .code = 0x4012, .desc = "PE Trace Unit external output 2 This event is not exported to the trace unit", }, {.name = "TRCEXTOUT3", .modmsk = ARMV9_ATTRS, .code = 0x4013, .desc = "PE Trace Unit external output 3 This event is not exported to the trace unit", }, {.name = "CTI_TRIGOUT4", .modmsk = ARMV9_ATTRS, .code = 0x4018, .desc = "Cross-trigger Interface output trigger 4", }, {.name = "CTI_TRIGOUT5", .modmsk = ARMV9_ATTRS, .code = 0x4019, .desc = "Cross-trigger Interface output trigger 5", }, {.name = "CTI_TRIGOUT6", .modmsk = ARMV9_ATTRS, .code = 0x401a, .desc = "Cross-trigger Interface output trigger 6", }, {.name = "CTI_TRIGOUT7", .modmsk = ARMV9_ATTRS, .code = 0x401b, .desc = "Cross-trigger Interface output trigger 7", }, {.name = "LDST_ALIGN_LAT", .modmsk = ARMV9_ATTRS, 
.code = 0x4020, .desc = "Access with additional latency from alignment", }, {.name = "LD_ALIGN_LAT", .modmsk = ARMV9_ATTRS, .code = 0x4021, .desc = "Load with additional latency from alignment", }, {.name = "ST_ALIGN_LAT", .modmsk = ARMV9_ATTRS, .code = 0x4022, .desc = "Store with additional latency from alignment", }, {.name = "MEM_ACCESS_CHECKED", .modmsk = ARMV9_ATTRS, .code = 0x4024, .desc = "Checked data memory access", }, {.name = "MEM_ACCESS_RD_CHECKED", .modmsk = ARMV9_ATTRS, .code = 0x4025, .desc = "Checked data memory access, read", }, {.name = "MEM_ACCESS_WR_CHECKED", .modmsk = ARMV9_ATTRS, .code = 0x4026, .desc = "Checked data memory access, write", }, {.name = "ASE_INST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8005, .desc = "Advanced SIMD operations speculatively executed", }, {.name = "SVE_INST_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8006, .desc = "SVE operations speculatively executed", }, {.name = "FP_HP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8014, .desc = "Half-precision floating-point operation speculatively executed", }, {.name = "FP_SP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8018, .desc = "Single-precision floating-point operation speculatively executed", }, {.name = "FP_DP_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x801C, .desc = "Double-precision floating-point operation speculatively executed", }, {.name = "SVE_PRED_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8074, .desc = "SVE predicated operations speculatively executed", }, {.name = "SVE_PRED_EMPTY_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8075, .desc = "SVE predicated operations with no active predicates speculatively executed", }, {.name = "SVE_PRED_FULL_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8076, .desc = "SVE predicated operations with all active predicates speculatively executed", }, {.name = "SVE_PRED_PARTIAL_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x8077, .desc = "SVE predicated operations with partially active predicates speculatively executed", }, {.name = "SVE_PRED_NOT_FULL_SPEC", .modmsk =
ARMV9_ATTRS, .code = 0x8079, .desc = "SVE predicated operations speculatively executed with a Governing predicate in which at least one element is FALSE", }, {.name = "SVE_LDFF_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80bc, .desc = "SVE first-fault load operations speculatively executed", }, {.name = "SVE_LDFF_FAULT_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80bd, .desc = "SVE first-fault load operations speculatively executed which set FFR bit to 0", }, {.name = "FP_SCALE_OPS_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80c0, .desc = "Scalable floating-point element operations speculatively executed", }, {.name = "FP_FIXED_OPS_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80c1, .desc = "Non-scalable floating-point element operations speculatively executed", }, {.name = "ASE_SVE_INT8_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80e3, .desc = "Operation counted by ASE_SVE_INT_SPEC where the largest type is 8-bit integer", }, {.name = "ASE_SVE_INT16_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80e7, .desc = "Operation counted by ASE_SVE_INT_SPEC where the largest type is 16-bit integer", }, {.name = "ASE_SVE_INT32_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80eb, .desc = "Operation counted by ASE_SVE_INT_SPEC where the largest type is 32-bit integer", }, {.name = "ASE_SVE_INT64_SPEC", .modmsk = ARMV9_ATTRS, .code = 0x80ef, .desc = "Operation counted by ASE_SVE_INT_SPEC where the largest type is 64-bit integer", }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_neoverse_v3_events.h000066400000000000000000001517371502707512200247060ustar00rootroot00000000000000/* * Copyright (c) 2024 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom 
the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER * DEALINGS IN THE SOFTWARE. * * ARM Neoverse V3 * References: * - Arm Neoverse V3 Core TRM: https://developer.arm.com/documentation/107734/ * - https://github.com/ARM-software/data/blob/master/pmu/neoverse-v3.json */ static const arm_entry_t arm_neoverse_v3_pe[]={ {.name = "SW_INCR", .modmsk = ARMV8_ATTRS, .code = 0x00, .desc = "Instruction architecturally executed, Condition code check pass, software increment Counts software writes to the PMSWINC_EL0 (software PMU increment) register" }, {.name = "L1I_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x01, .desc = "Level 1 instruction cache refill Counts cache line refills in the level 1 instruction cache caused by a missed instruction fetch" }, {.name = "L1I_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x02, .desc = "Level 1 instruction TLB refill Counts level 1 instruction TLB refills from any Instruction fetch" }, {.name = "L1D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x03, .desc = "Level 1 data cache refill Counts level 1 data cache refills caused by speculatively executed load or store operations that missed in the level 1 data cache" }, {.name = "L1D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x04, .desc = "Level 1 data cache access Counts level 1 data cache accesses from any load/store operations" }, {.name = "L1D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x05, 
.desc = "Level 1 data TLB refill Counts level 1 data TLB accesses that resulted in TLB refills" }, {.name = "INST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x08, .desc = "Instruction architecturally executed Counts instructions that have been architecturally executed" }, {.name = "EXC_TAKEN", .modmsk = ARMV8_ATTRS, .code = 0x09, .desc = "Exception taken Counts any taken architecturally visible exceptions such as IRQ, FIQ, SError, and other synchronous exceptions" }, {.name = "EXC_RETURN", .modmsk = ARMV8_ATTRS, .code = 0x0a, .desc = "Instruction architecturally executed, Condition code check pass, exception return Counts any architecturally executed exception return instructions" }, {.name = "CID_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0b, .desc = "Instruction architecturally executed, Condition code check pass, write to CONTEXTIDR Counts architecturally executed writes to the CONTEXTIDR_EL1 register, which usually contain the kernel PID and can be output with hardware trace" }, {.name = "PC_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0c, .desc = "Instruction architecturally executed, Condition code check pass, Software change of the PC Counts branch instructions that caused a change of Program Counter, which effectively causes a change in the control flow of the program" }, {.name = "BR_IMMED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0d, .desc = "Branch instruction architecturally executed, immediate Counts architecturally executed direct branches" }, {.name = "BR_RETURN_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0e, .desc = "Branch instruction architecturally executed, procedure return, taken Counts architecturally executed procedure returns" }, {.name = "BR_MIS_PRED", .modmsk = ARMV8_ATTRS, .code = 0x10, .desc = "Branch instruction speculatively executed, mispredicted or not predicted Counts branches which are speculatively executed and mispredicted" }, {.name = "CPU_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x11, .desc = "Cycle Counts CPU clock 
cycles (not timer cycles)" }, {.name = "BR_PRED", .modmsk = ARMV8_ATTRS, .code = 0x12, .desc = "Predictable branch instruction speculatively executed Counts all speculatively executed branches" }, {.name = "MEM_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x13, .desc = "Data memory access Counts memory accesses issued by the CPU load store unit, where those accesses are issued due to load or store operations" }, {.name = "L1I_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x14, .desc = "Level 1 instruction cache access Counts instruction fetches which access the level 1 instruction cache" }, {.name = "L1D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x15, .desc = "Level 1 data cache write-back Counts write-backs of dirty data from the L1 data cache to the L2 cache" }, {.name = "L2D_CACHE", .modmsk = ARMV8_ATTRS, .code = 0x16, .desc = "Level 2 data cache access Counts accesses to the level 2 cache due to data accesses" }, {.name = "L2D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x17, .desc = "Level 2 data cache refill Counts cache line refills into the level 2 cache" }, {.name = "L2D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x18, .desc = "Level 2 data cache write-back Counts write-backs of data from the L2 cache to outside the CPU" }, {.name = "BUS_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x19, .desc = "Bus access Counts memory transactions issued by the CPU to the external bus, including snoop requests and snoop responses" }, {.name = "MEMORY_ERROR", .modmsk = ARMV8_ATTRS, .code = 0x1a, .desc = "Local memory error Counts any detected correctable or uncorrectable physical memory errors (ECC or parity) in protected CPUs RAMs" }, {.name = "INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x1b, .desc = "Operation speculatively executed Counts operations that have been speculatively executed" }, {.name = "TTBR_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x1c, .desc = "Instruction architecturally executed, Condition code check pass, write to TTBR Counts architectural writes to 
TTBR0/1_EL1" }, {.name = "BUS_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x1d, .desc = "Bus cycle Counts bus cycles in the CPU" }, {.name = "BR_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x21, .desc = "Instruction architecturally executed, branch Counts architecturally executed branches, whether the branch is taken or not" }, {.name = "BR_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x22, .desc = "Branch instruction architecturally executed, mispredicted Counts branches counted by BR_RETIRED which were mispredicted and caused a pipeline flush" }, {.name = "STALL_FRONTEND", .modmsk = ARMV8_ATTRS, .code = 0x23, .desc = "No operation sent for execution due to the frontend Counts cycles when frontend could not send any micro-operations to the rename stage because of frontend resource stalls caused by fetch memory latency or branch prediction flow stalls" }, {.name = "STALL_BACKEND", .modmsk = ARMV8_ATTRS, .code = 0x24, .desc = "No operation sent for execution due to the backend Counts cycles whenever the rename unit is unable to send any micro-operations to the backend of the pipeline because of backend resource constraints" }, {.name = "L1D_TLB", .modmsk = ARMV8_ATTRS, .code = 0x25, .desc = "Level 1 data TLB access Counts level 1 data TLB accesses caused by any memory load or store operation" }, {.name = "L1I_TLB", .modmsk = ARMV8_ATTRS, .code = 0x26, .desc = "Level 1 instruction TLB access Counts level 1 instruction TLB accesses, whether the access hits or misses in the TLB" }, {.name = "L2D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x2d, .desc = "Level 2 data TLB refill Counts level 2 TLB refills caused by memory operations from both data and instruction fetch, except for those caused by TLB maintenance operations and hardware prefetches" }, {.name = "L2D_TLB", .modmsk = ARMV8_ATTRS, .code = 0x2f, .desc = "Level 2 data TLB access Counts level 2 TLB accesses except those caused by TLB maintenance operations" }, {.name = "REMOTE_ACCESS", .modmsk = ARMV8_ATTRS, .code 
= 0x31, .desc = "Access to another socket in a multi-socket system Counts accesses to another chip, which is implemented as a different CMN mesh in the system" }, {.name = "DTLB_WALK", .modmsk = ARMV8_ATTRS, .code = 0x34, .desc = "Data TLB access with at least one translation table walk Counts number of demand data translation table walks caused by a miss in the L2 TLB and performing at least one memory access" }, {.name = "ITLB_WALK", .modmsk = ARMV8_ATTRS, .code = 0x35, .desc = "Instruction TLB access with at least one translation table walk Counts number of instruction translation table walks caused by a miss in the L2 TLB and performing at least one memory access" }, {.name = "LL_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x36, .desc = "Last level cache access, read Counts read transactions that were returned from outside the core cluster" }, {.name = "LL_CACHE_MISS_RD", .modmsk = ARMV8_ATTRS, .code = 0x37, .desc = "Last level cache miss, read Counts read transactions that were returned from outside the core cluster but missed in the system level cache" }, {.name = "L1D_CACHE_LMISS_RD", .modmsk = ARMV8_ATTRS, .code = 0x39, .desc = "Level 1 data cache long-latency read miss Counts cache line refills into the level 1 data cache from any memory read operations, that incurred additional latency" }, {.name = "OP_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x3a, .desc = "Micro-operation architecturally executed Counts micro-operations that are architecturally executed" }, {.name = "OP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x3b, .desc = "Micro-operation speculatively executed Counts micro-operations speculatively executed" }, {.name = "STALL", .modmsk = ARMV8_ATTRS, .code = 0x3c, .desc = "No operation sent for execution Counts cycles when no operations are sent to the rename unit from the frontend or from the rename unit to the backend for any reason (either frontend or backend stall)" }, {.name = "STALL_SLOT_BACKEND", .modmsk = ARMV8_ATTRS, .code = 0x3d, .desc = "No 
operation sent for execution on a Slot due to the backend Counts slots per cycle in which no operations are sent from the rename unit to the backend due to backend resource constraints" }, {.name = "STALL_SLOT_FRONTEND", .modmsk = ARMV8_ATTRS, .code = 0x3e, .desc = "No operation sent for execution on a Slot due to the frontend Counts slots per cycle in which no operations are sent to the rename unit from the frontend due to frontend resource constraints" }, {.name = "STALL_SLOT", .modmsk = ARMV8_ATTRS, .code = 0x3f, .desc = "No operation sent for execution on a Slot Counts slots per cycle in which no operations are sent to the rename unit from the frontend or from the rename unit to the backend for any reason (either frontend or backend stall)" }, {.name = "L1D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x40, .desc = "Level 1 data cache access, read Counts level 1 data cache accesses from any load operation" }, {.name = "L1D_CACHE_WR", .modmsk = ARMV8_ATTRS, .code = 0x41, .desc = "Level 1 data cache access, write Counts level 1 data cache accesses generated by store operations" }, {.name = "L1D_CACHE_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x42, .desc = "Level 1 data cache refill, read Counts level 1 data cache refills caused by speculatively executed load instructions where the memory read operation misses in the level 1 data cache" }, {.name = "L1D_CACHE_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x43, .desc = "Level 1 data cache refill, write Counts level 1 data cache refills caused by speculatively executed store instructions where the memory write operation misses in the level 1 data cache" }, {.name = "L1D_CACHE_REFILL_INNER", .modmsk = ARMV8_ATTRS, .code = 0x44, .desc = "Level 1 data cache refill, inner Counts level 1 data cache refills where the cache line data came from caches inside the immediate cluster of the core" }, {.name = "L1D_CACHE_REFILL_OUTER", .modmsk = ARMV8_ATTRS, .code = 0x45, .desc = "Level 1 data cache refill, outer Counts level 1 data 
cache refills for which the cache line data came from outside the immediate cluster of the core, like an SLC in the system interconnect or DRAM" }, {.name = "L1D_CACHE_WB_VICTIM", .modmsk = ARMV8_ATTRS, .code = 0x46, .desc = "Level 1 data cache write-back, victim Counts dirty cache line evictions from the level 1 data cache caused by a new cache line allocation" }, {.name = "L1D_CACHE_WB_CLEAN", .modmsk = ARMV8_ATTRS, .code = 0x47, .desc = "Level 1 data cache write-back, cleaning and coherency Counts write-backs from the level 1 data cache that are a result of a coherency operation made by another CPU" }, {.name = "L1D_CACHE_INVAL", .modmsk = ARMV8_ATTRS, .code = 0x48, .desc = "Level 1 data cache invalidate Counts each explicit invalidation of a cache line in the level 1 data cache caused by: - Cache Maintenance Operations (CMO) that operate by a virtual address" }, {.name = "L1D_TLB_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x4c, .desc = "Level 1 data TLB refill, read Counts level 1 data TLB refills caused by memory read operations" }, {.name = "L1D_TLB_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x4d, .desc = "Level 1 data TLB refill, write Counts level 1 data TLB refills caused by data side memory write operations" }, {.name = "L1D_TLB_RD", .modmsk = ARMV8_ATTRS, .code = 0x4e, .desc = "Level 1 data TLB access, read Counts level 1 data TLB accesses caused by memory read operations" }, {.name = "L1D_TLB_WR", .modmsk = ARMV8_ATTRS, .code = 0x4f, .desc = "Level 1 data TLB access, write Counts any L1 data side TLB accesses caused by memory write operations" }, {.name = "L2D_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x50, .desc = "Level 2 data cache access, read Counts level 2 data cache accesses due to memory read operations" }, {.name = "L2D_CACHE_WR", .modmsk = ARMV8_ATTRS, .code = 0x51, .desc = "Level 2 data cache access, write Counts level 2 cache accesses due to memory write operations" }, {.name = "L2D_CACHE_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x52, 
.desc = "Level 2 data cache refill, read Counts refills for memory accesses due to memory read operation counted by L2D_CACHE_RD" }, {.name = "L2D_CACHE_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x53, .desc = "Level 2 data cache refill, write Counts refills for memory accesses due to memory write operation counted by L2D_CACHE_WR" }, {.name = "L2D_CACHE_WB_VICTIM", .modmsk = ARMV8_ATTRS, .code = 0x56, .desc = "Level 2 data cache write-back, victim Counts evictions from the level 2 cache because of a line being allocated into the L2 cache" }, {.name = "L2D_CACHE_WB_CLEAN", .modmsk = ARMV8_ATTRS, .code = 0x57, .desc = "Level 2 data cache write-back, cleaning and coherency Counts write-backs from the level 2 cache that are a result of either: 1" }, {.name = "L2D_CACHE_INVAL", .modmsk = ARMV8_ATTRS, .code = 0x58, .desc = "Level 2 data cache invalidate Counts each explicit invalidation of a cache line in the level 2 cache by cache maintenance operations that operate by a virtual address, or by external coherency operations" }, {.name = "L2D_TLB_REFILL_RD", .modmsk = ARMV8_ATTRS, .code = 0x5c, .desc = "Level 2 data TLB refill, read Counts level 2 TLB refills caused by memory read operations from both data and instruction fetch except for those caused by TLB maintenance operations or hardware prefetches" }, {.name = "L2D_TLB_REFILL_WR", .modmsk = ARMV8_ATTRS, .code = 0x5d, .desc = "Level 2 data TLB refill, write Counts level 2 TLB refills caused by memory write operations from both data and instruction fetch except for those caused by TLB maintenance operations" }, {.name = "L2D_TLB_RD", .modmsk = ARMV8_ATTRS, .code = 0x5e, .desc = "Level 2 data TLB access, read Counts level 2 TLB accesses caused by memory read operations from both data and instruction fetch except for those caused by TLB maintenance operations" }, {.name = "L2D_TLB_WR", .modmsk = ARMV8_ATTRS, .code = 0x5f, .desc = "Level 2 data TLB access, write Counts level 2 TLB accesses caused by memory write 
operations from both data and instruction fetch except for those caused by TLB maintenance operations" }, {.name = "BUS_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x60, .desc = "Bus access, read Counts memory read transactions seen on the external bus" }, {.name = "BUS_ACCESS_WR", .modmsk = ARMV8_ATTRS, .code = 0x61, .desc = "Bus access, write Counts memory write transactions seen on the external bus" }, {.name = "MEM_ACCESS_RD", .modmsk = ARMV8_ATTRS, .code = 0x66, .desc = "Data memory access, read Counts memory accesses issued by the CPU due to load operations" }, {.name = "MEM_ACCESS_WR", .modmsk = ARMV8_ATTRS, .code = 0x67, .desc = "Data memory access, write Counts memory accesses issued by the CPU due to store operations" }, {.name = "UNALIGNED_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x68, .desc = "Unaligned access, read Counts unaligned memory read operations issued by the CPU" }, {.name = "UNALIGNED_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x69, .desc = "Unaligned access, write Counts unaligned memory write operations issued by the CPU" }, {.name = "UNALIGNED_LDST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6a, .desc = "Unaligned access Counts unaligned memory operations issued by the CPU" }, {.name = "LDREX_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6c, .desc = "Exclusive operation speculatively executed, Load-Exclusive Counts Load-Exclusive operations that have been speculatively executed" }, {.name = "STREX_PASS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6d, .desc = "Exclusive operation speculatively executed, Store-Exclusive pass Counts store-exclusive operations that have been speculatively executed and have successfully completed the store operation" }, {.name = "STREX_FAIL_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x6e, .desc = "Exclusive operation speculatively executed, Store-Exclusive fail Counts store-exclusive operations that have been speculatively executed and have not successfully completed the store operation" }, {.name = "STREX_SPEC", .modmsk = 
ARMV8_ATTRS, .code = 0x6f, .desc = "Exclusive operation speculatively executed, Store-Exclusive Counts store-exclusive operations that have been speculatively executed" }, {.name = "LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x70, .desc = "Operation speculatively executed, load Counts speculatively executed load operations including Single Instruction Multiple Data (SIMD) load operations" }, {.name = "ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x71, .desc = "Operation speculatively executed, store Counts speculatively executed store operations including Single Instruction Multiple Data (SIMD) store operations" }, {.name = "LDST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x72, .desc = "Operation speculatively executed, load or store Counts load and store operations that have been speculatively executed" }, {.name = "DP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x73, .desc = "Operation speculatively executed, integer data processing Counts speculatively executed logical or arithmetic instructions such as MOV/MVN operations" }, {.name = "ASE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x74, .desc = "Operation speculatively executed, Advanced SIMD Counts speculatively executed Advanced SIMD operations excluding load, store and move micro-operations that move data to or from SIMD (vector) registers" }, {.name = "VFP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x75, .desc = "Operation speculatively executed, scalar floating-point Counts speculatively executed floating point operations" }, {.name = "PC_WRITE_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x76, .desc = "Operation speculatively executed, Software change of the PC Counts speculatively executed operations which cause software changes of the PC" }, {.name = "CRYPTO_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x77, .desc = "Operation speculatively executed, Cryptographic instruction Counts speculatively executed cryptographic operations except for PMULL and VMULL operations" }, {.name = "BR_IMMED_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x78, .desc = 
"Branch speculatively executed, immediate branch Counts direct branch operations which are speculatively executed" }, {.name = "BR_RETURN_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x79, .desc = "Branch speculatively executed, procedure return Counts procedure return operations (RET, RETAA and RETAB) which are speculatively executed" }, {.name = "BR_INDIRECT_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7a, .desc = "Branch speculatively executed, indirect branch Counts indirect branch operations including procedure returns, which are speculatively executed" }, {.name = "ISB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7c, .desc = "Barrier speculatively executed, ISB Counts ISB operations that are executed" }, {.name = "DSB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7d, .desc = "Barrier speculatively executed, DSB Counts DSB operations that are speculatively issued to the Load/Store unit in the CPU" }, {.name = "DMB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7e, .desc = "Barrier speculatively executed, DMB Counts DMB operations that are speculatively issued to the Load/Store unit in the CPU" }, {.name = "CSDB_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x7f, .desc = "Barrier speculatively executed, CSDB Counts CSDB operations that are speculatively issued to the Load/Store unit in the CPU" }, {.name = "EXC_UNDEF", .modmsk = ARMV8_ATTRS, .code = 0x81, .desc = "Exception taken, other synchronous Counts the number of synchronous exceptions which are taken locally that are due to attempting to execute an instruction that is UNDEFINED" }, {.name = "EXC_SVC", .modmsk = ARMV8_ATTRS, .code = 0x82, .desc = "Exception taken, Supervisor Call Counts SVC exceptions taken locally" }, {.name = "EXC_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x83, .desc = "Exception taken, Instruction Abort Counts synchronous exceptions that are taken locally and caused by Instruction Aborts" }, {.name = "EXC_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x84, .desc = "Exception taken, Data Abort or SError Counts exceptions that are taken
locally and are caused by data aborts or SErrors" }, {.name = "EXC_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x86, .desc = "Exception taken, IRQ Counts IRQ exceptions including the virtual IRQs that are taken locally" }, {.name = "EXC_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x87, .desc = "Exception taken, FIQ Counts FIQ exceptions including the virtual FIQs that are taken locally" }, {.name = "EXC_SMC", .modmsk = ARMV8_ATTRS, .code = 0x88, .desc = "Exception taken, Secure Monitor Call Counts SMC exceptions taken to EL3" }, {.name = "EXC_HVC", .modmsk = ARMV8_ATTRS, .code = 0x8a, .desc = "Exception taken, Hypervisor Call Counts HVC exceptions taken to EL2" }, {.name = "EXC_TRAP_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x8b, .desc = "Exception taken, Instruction Abort not Taken locally Counts exceptions which are traps not taken locally and are caused by Instruction Aborts" }, {.name = "EXC_TRAP_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x8c, .desc = "Exception taken, Data Abort or SError not Taken locally Counts exceptions which are traps not taken locally and are caused by Data Aborts or SError interrupts" }, {.name = "EXC_TRAP_OTHER", .modmsk = ARMV8_ATTRS, .code = 0x8d, .desc = "Exception taken, other traps not Taken locally Counts the number of synchronous trap exceptions which are not taken locally and are not SVC, SMC, HVC, data aborts, Instruction Aborts, or interrupts" }, {.name = "EXC_TRAP_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x8e, .desc = "Exception taken, IRQ not Taken locally Counts IRQ exceptions including the virtual IRQs that are not taken locally" }, {.name = "EXC_TRAP_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x8f, .desc = "Exception taken, FIQ not Taken locally Counts FIQs which are not taken locally but taken from EL0, EL1, or EL2 to EL3 (which would be the normal behavior for FIQs when not executing in EL3)" }, {.name = "RC_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x90, .desc = "Release consistency operation speculatively executed, Load-Acquire Counts any load
acquire operations that are speculatively executed" }, {.name = "RC_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x91, .desc = "Release consistency operation speculatively executed, Store-Release Counts any store release operations that are speculatively executed" }, {.name = "SAMPLE_POP", .modmsk = ARMV8_ATTRS, .code = 0x4000, .desc = "Sample Population" }, {.name = "SAMPLE_FEED", .modmsk = ARMV8_ATTRS, .code = 0x4001, .desc = "Sample Taken" }, {.name = "SAMPLE_FILTRATE", .modmsk = ARMV8_ATTRS, .code = 0x4002, .desc = "Sample Taken and not removed by filtering" }, {.name = "SAMPLE_COLLISION", .modmsk = ARMV8_ATTRS, .code = 0x4003, .desc = "Sample collided with previous sample" }, {.name = "CNT_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x4004, .desc = "Constant frequency cycles Increments at a constant frequency equal to the rate of increment of the System Counter, CNTPCT_EL0" }, {.name = "STALL_BACKEND_MEM", .modmsk = ARMV8_ATTRS, .code = 0x4005, .desc = "Memory stall cycles Counts cycles when the backend is stalled because there is a pending demand load request in progress in the last level core cache" }, {.name = "L1I_CACHE_LMISS", .modmsk = ARMV8_ATTRS, .code = 0x4006, .desc = "Level 1 instruction cache long-latency miss Counts cache line refills into the level 1 instruction cache, that incurred additional latency" }, {.name = "L2D_CACHE_LMISS_RD", .modmsk = ARMV8_ATTRS, .code = 0x4009, .desc = "Level 2 data cache long-latency read miss Counts cache line refills into the level 2 unified cache from any memory read operations that incurred additional latency" }, {.name = "LDST_ALIGN_LAT", .modmsk = ARMV8_ATTRS, .code = 0x4020, .desc = "Access with additional latency from alignment Counts the number of memory read and write accesses in a cycle that incurred additional latency, due to the alignment of the address and the size of data being accessed, which results in store crossing a single cache line" }, {.name = "LD_ALIGN_LAT", .modmsk = ARMV8_ATTRS, .code = 0x4021, 
.desc = "Load with additional latency from alignment Counts the number of memory read accesses in a cycle that incurred additional latency, due to the alignment of the address and size of data being accessed, which results in load crossing a single cache line" }, {.name = "ST_ALIGN_LAT", .modmsk = ARMV8_ATTRS, .code = 0x4022, .desc = "Store with additional latency from alignment Counts the number of memory write accesses in a cycle that incurred additional latency, due to the alignment of the address and size of data being accessed" }, {.name = "MEM_ACCESS_CHECKED", .modmsk = ARMV8_ATTRS, .code = 0x4024, .desc = "Checked data memory access Counts the number of memory read and write accesses counted by MEM_ACCESS that are tag checked by the Memory Tagging Extension (MTE)" }, {.name = "MEM_ACCESS_CHECKED_RD", .modmsk = ARMV8_ATTRS, .code = 0x4025, .desc = "Checked data memory access, read Counts the number of memory read accesses in a cycle that are tag checked by the Memory Tagging Extension (MTE)" }, {.name = "MEM_ACCESS_CHECKED_WR", .modmsk = ARMV8_ATTRS, .code = 0x4026, .desc = "Checked data memory access, write Counts the number of memory write accesses in a cycle that are tag checked by the Memory Tagging Extension (MTE)" }, {.name = "SIMD_INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8004, .desc = "Operation speculatively executed, SIMD Counts speculatively executed operations that are SIMD or SVE vector operations or Advanced SIMD non-scalar operations" }, {.name = "ASE_INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8005, .desc = "Operation speculatively executed, Advanced SIMD Counts speculatively executed Advanced SIMD operations" }, {.name = "SVE_INST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8006, .desc = "Operation speculatively executed, SVE, including load and store Counts speculatively executed operations that are SVE operations" }, {.name = "FP_HP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8014, .desc = "Floating-point operation
speculatively executed, half precision Counts speculatively executed half precision floating point operations" }, {.name = "FP_SP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8018, .desc = "Floating-point operation speculatively executed, single precision Counts speculatively executed single precision floating point operations" }, {.name = "FP_DP_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x801c, .desc = "Floating-point operation speculatively executed, double precision Counts speculatively executed double precision floating point operations" }, {.name = "INT_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8040, .desc = "Integer operation speculatively executed Counts speculatively executed integer arithmetic operations" }, {.name = "SVE_PRED_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8074, .desc = "Operation speculatively executed, SVE predicated Counts speculatively executed predicated SVE operations" }, {.name = "SVE_PRED_EMPTY_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8075, .desc = "Operation speculatively executed, SVE predicated with no active predicates Counts speculatively executed predicated SVE operations with no active predicate elements" }, {.name = "SVE_PRED_FULL_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8076, .desc = "Operation speculatively executed, SVE predicated with all active predicates Counts speculatively executed predicated SVE operations with all predicate elements active" }, {.name = "SVE_PRED_PARTIAL_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8077, .desc = "Operation speculatively executed, SVE predicated with partially active predicates Counts speculatively executed predicated SVE operations with at least one but not all active predicate elements" }, {.name = "SVE_PRED_NOT_FULL_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x8079, .desc = "SVE predicated operations speculatively executed with no active or partially active predicates Counts speculatively executed predicated SVE operations with at least one non-active predicate element" }, {.name = "PRF_SPEC", .modmsk =
ARMV8_ATTRS, .code = 0x8087, .desc = "Operation speculatively executed, Prefetch Counts speculatively executed operations that prefetch memory" }, {.name = "SVE_LDFF_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80bc, .desc = "Operation speculatively executed, SVE first-fault load Counts speculatively executed SVE first fault or non-fault load operations" }, {.name = "SVE_LDFF_FAULT_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80bd, .desc = "Operation speculatively executed, SVE first-fault load which set FFR bit to 0b0 Counts speculatively executed SVE first fault or non-fault load operations that clear at least one bit in the FFR" }, {.name = "FP_SCALE_OPS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80c0, .desc = "Scalable floating-point element ALU operations speculatively executed Counts speculatively executed scalable single precision floating point operations" }, {.name = "FP_FIXED_OPS_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80c1, .desc = "Non-scalable floating-point element ALU operations speculatively executed Counts speculatively executed non-scalable single precision floating point operations" }, {.name = "ASE_SVE_INT8_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80e3, .desc = "Integer operation speculatively executed, Advanced SIMD or SVE 8-bit Counts speculatively executed Advanced SIMD or SVE integer operations with the largest data type an 8-bit integer" }, {.name = "ASE_SVE_INT16_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80e7, .desc = "Integer operation speculatively executed, Advanced SIMD or SVE 16-bit Counts speculatively executed Advanced SIMD or SVE integer operations with the largest data type a 16-bit integer" }, {.name = "ASE_SVE_INT32_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80eb, .desc = "Integer operation speculatively executed, Advanced SIMD or SVE 32-bit Counts speculatively executed Advanced SIMD or SVE integer operations with the largest data type a 32-bit integer" }, {.name = "ASE_SVE_INT64_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x80ef, .desc = "Integer 
operation speculatively executed, Advanced SIMD or SVE 64-bit Counts speculatively executed Advanced SIMD or SVE integer operations with the largest data type a 64-bit integer" }, {.name = "BR_IMMED_TAKEN_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8108, .desc = "Branch instruction architecturally executed, immediate, taken Counts architecturally executed direct branches that were taken" }, {.name = "BR_INDNR_TAKEN_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x810c, .desc = "Branch instruction architecturally executed, indirect excluding procedure return, taken Counts architecturally executed indirect branches excluding procedure returns that were taken" }, {.name = "BR_IMMED_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8110, .desc = "Branch instruction architecturally executed, predicted immediate Counts architecturally executed direct branches that were correctly predicted" }, {.name = "BR_IMMED_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8111, .desc = "Branch instruction architecturally executed, mispredicted immediate Counts architecturally executed direct branches that were mispredicted and caused a pipeline flush" }, {.name = "BR_IND_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8112, .desc = "Branch instruction architecturally executed, predicted indirect Counts architecturally executed indirect branches including procedure returns that were correctly predicted" }, {.name = "BR_IND_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8113, .desc = "Branch instruction architecturally executed, mispredicted indirect Counts architecturally executed indirect branches including procedure returns that were mispredicted and caused a pipeline flush" }, {.name = "BR_RETURN_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8114, .desc = "Branch instruction architecturally executed, predicted procedure return Counts architecturally executed procedure returns that were correctly predicted" }, {.name = "BR_RETURN_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 
0x8115, .desc = "Branch instruction architecturally executed, mispredicted procedure return Counts architecturally executed procedure returns that were mispredicted and caused a pipeline flush" }, {.name = "BR_INDNR_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8116, .desc = "Branch instruction architecturally executed, predicted indirect excluding procedure return Counts architecturally executed indirect branches excluding procedure returns that were correctly predicted" }, {.name = "BR_INDNR_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8117, .desc = "Branch instruction architecturally executed, mispredicted indirect excluding procedure return Counts architecturally executed indirect branches excluding procedure returns that were mispredicted and caused a pipeline flush" }, {.name = "BR_TAKEN_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8118, .desc = "Branch instruction architecturally executed, predicted branch, taken Counts architecturally executed branches that were taken and were correctly predicted" }, {.name = "BR_TAKEN_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x8119, .desc = "Branch instruction architecturally executed, mispredicted branch, taken Counts architecturally executed branches that were taken and were mispredicted causing a pipeline flush" }, {.name = "BR_SKIP_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x811a, .desc = "Branch instruction architecturally executed, predicted branch, not taken Counts architecturally executed branches that were not taken and were correctly predicted" }, {.name = "BR_SKIP_MIS_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x811b, .desc = "Branch instruction architecturally executed, mispredicted branch, not taken Counts architecturally executed branches that were not taken and were mispredicted causing a pipeline flush" }, {.name = "BR_PRED_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x811c, .desc = "Branch instruction architecturally executed, predicted branch Counts branch instructions counted by 
BR_RETIRED which were correctly predicted" }, {.name = "BR_IND_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x811d, .desc = "Instruction architecturally executed, indirect branch Counts architecturally executed indirect branches including procedure returns" }, {.name = "INST_FETCH_PERCYC", .modmsk = ARMV8_ATTRS, .code = 0x8120, .desc = "Event in progress, INST FETCH Counts number of instruction fetches outstanding per cycle, which will provide an average latency of instruction fetch" }, {.name = "MEM_ACCESS_RD_PERCYC", .modmsk = ARMV8_ATTRS, .code = 0x8121, .desc = "Event in progress, MEM ACCESS RD Counts the number of outstanding loads or memory read accesses per cycle" }, {.name = "INST_FETCH", .modmsk = ARMV8_ATTRS, .code = 0x8124, .desc = "Instruction memory access Counts Instruction memory accesses that the PE makes" }, {.name = "DTLB_WALK_PERCYC", .modmsk = ARMV8_ATTRS, .code = 0x8128, .desc = "Event in progress, DTLB WALK Counts the number of data translation table walks in progress per cycle" }, {.name = "ITLB_WALK_PERCYC", .modmsk = ARMV8_ATTRS, .code = 0x8129, .desc = "Event in progress, ITLB WALK Counts the number of instruction translation table walks in progress per cycle" }, {.name = "SAMPLE_FEED_BR", .modmsk = ARMV8_ATTRS, .code = 0x812a, .desc = "Statistical Profiling sample taken, branch Counts statistical profiling samples taken which are branches" }, {.name = "SAMPLE_FEED_LD", .modmsk = ARMV8_ATTRS, .code = 0x812b, .desc = "Statistical Profiling sample taken, load Counts statistical profiling samples taken which are loads or load atomic operations" }, {.name = "SAMPLE_FEED_ST", .modmsk = ARMV8_ATTRS, .code = 0x812c, .desc = "Statistical Profiling sample taken, store Counts statistical profiling samples taken which are stores or store atomic operations" }, {.name = "SAMPLE_FEED_OP", .modmsk = ARMV8_ATTRS, .code = 0x812d, .desc = "Statistical Profiling sample taken, matching operation type Counts statistical profiling samples taken which are matching any
operation type filters supported" }, {.name = "SAMPLE_FEED_EVENT", .modmsk = ARMV8_ATTRS, .code = 0x812e, .desc = "Statistical Profiling sample taken, matching events Counts statistical profiling samples taken which are matching event packet filter constraints" }, {.name = "SAMPLE_FEED_LAT", .modmsk = ARMV8_ATTRS, .code = 0x812f, .desc = "Statistical Profiling sample taken, exceeding minimum latency Counts statistical profiling samples taken which are exceeding minimum latency set by operation latency filter constraints" }, {.name = "L1D_TLB_RW", .modmsk = ARMV8_ATTRS, .code = 0x8130, .desc = "Level 1 data TLB demand access Counts level 1 data TLB demand accesses caused by memory read or write operations" }, {.name = "L1I_TLB_RD", .modmsk = ARMV8_ATTRS, .code = 0x8131, .desc = "Level 1 instruction TLB demand access Counts level 1 instruction TLB demand accesses whether the access hits or misses in the TLB" }, {.name = "L1D_TLB_PRFM", .modmsk = ARMV8_ATTRS, .code = 0x8132, .desc = "Level 1 data TLB software preload Counts level 1 data TLB accesses generated by software prefetch or preload memory accesses" }, {.name = "L1I_TLB_PRFM", .modmsk = ARMV8_ATTRS, .code = 0x8133, .desc = "Level 1 instruction TLB software preload Counts level 1 instruction TLB accesses generated by software preload or prefetch instructions" }, {.name = "DTLB_HWUPD", .modmsk = ARMV8_ATTRS, .code = 0x8134, .desc = "Data TLB hardware update of translation table Counts number of memory accesses triggered by a data translation table walk and performing an update of a translation table entry" }, {.name = "ITLB_HWUPD", .modmsk = ARMV8_ATTRS, .code = 0x8135, .desc = "Instruction TLB hardware update of translation table Counts number of memory accesses triggered by an instruction translation table walk and performing an update of a translation table entry" }, {.name = "DTLB_STEP", .modmsk = ARMV8_ATTRS, .code = 0x8136, .desc = "Data TLB translation table walk, step Counts number of memory accesses
triggered by a demand data translation table walk and performing a read of a translation table entry" }, {.name = "ITLB_STEP", .modmsk = ARMV8_ATTRS, .code = 0x8137, .desc = "Instruction TLB translation table walk, step Counts number of memory accesses triggered by an instruction translation table walk and performing a read of a translation table entry" }, {.name = "DTLB_WALK_LARGE", .modmsk = ARMV8_ATTRS, .code = 0x8138, .desc = "Data TLB large page translation table walk Counts number of demand data translation table walks caused by a miss in the L2 TLB and yielding a large page" }, {.name = "ITLB_WALK_LARGE", .modmsk = ARMV8_ATTRS, .code = 0x8139, .desc = "Instruction TLB large page translation table walk Counts number of instruction translation table walks caused by a miss in the L2 TLB and yielding a large page" }, {.name = "DTLB_WALK_SMALL", .modmsk = ARMV8_ATTRS, .code = 0x813a, .desc = "Data TLB small page translation table walk Counts number of data translation table walks caused by a miss in the L2 TLB and yielding a small page" }, {.name = "ITLB_WALK_SMALL", .modmsk = ARMV8_ATTRS, .code = 0x813b, .desc = "Instruction TLB small page translation table walk Counts number of instruction translation table walks caused by a miss in the L2 TLB and yielding a small page" }, {.name = "DTLB_WALK_RW", .modmsk = ARMV8_ATTRS, .code = 0x813c, .desc = "Data TLB demand access with at least one translation table walk Counts number of demand data translation table walks caused by a miss in the L2 TLB and performing at least one memory access" }, {.name = "ITLB_WALK_RD", .modmsk = ARMV8_ATTRS, .code = 0x813d, .desc = "Instruction TLB demand access with at least one translation table walk Counts number of demand instruction translation table walks caused by a miss in the L2 TLB and performing at least one memory access" }, {.name = "DTLB_WALK_PRFM", .modmsk = ARMV8_ATTRS, .code = 0x813e, .desc = "Data TLB software preload access with at least one translation table walk 
Counts number of software prefetches or preloads generated data translation table walks caused by a miss in the L2 TLB and performing at least one memory access" }, {.name = "ITLB_WALK_PRFM", .modmsk = ARMV8_ATTRS, .code = 0x813f, .desc = "Instruction TLB software preload access with at least one translation table walk Counts number of software prefetches or preloads generated instruction translation table walks caused by a miss in the L2 TLB and performing at least one memory access" }, {.name = "L1D_CACHE_RW", .modmsk = ARMV8_ATTRS, .code = 0x8140, .desc = "Level 1 data cache demand access Counts level 1 data demand cache accesses from any load or store operation" }, {.name = "L1I_CACHE_RD", .modmsk = ARMV8_ATTRS, .code = 0x8141, .desc = "Level 1 instruction cache demand fetch Counts demand instruction fetches which access the level 1 instruction cache" }, {.name = "L1D_CACHE_PRFM", .modmsk = ARMV8_ATTRS, .code = 0x8142, .desc = "Level 1 data cache software preload Counts level 1 data cache accesses from software preload or prefetch instructions" }, {.name = "L1I_CACHE_PRFM", .modmsk = ARMV8_ATTRS, .code = 0x8143, .desc = "Level 1 instruction cache software preload Counts instruction fetches generated by software preload or prefetch instructions which access the level 1 instruction cache" }, {.name = "L1D_CACHE_MISS", .modmsk = ARMV8_ATTRS, .code = 0x8144, .desc = "Level 1 data cache demand access miss Counts cache line misses in the level 1 data cache" }, {.name = "L1I_CACHE_HWPRF", .modmsk = ARMV8_ATTRS, .code = 0x8145, .desc = "Level 1 instruction cache hardware prefetch Counts instruction fetches which access the level 1 instruction cache generated by the hardware prefetcher" }, {.name = "L1D_CACHE_REFILL_PRFM", .modmsk = ARMV8_ATTRS, .code = 0x8146, .desc = "Level 1 data cache refill, software preload Counts level 1 data cache refills where the cache line access was generated by software preload or prefetch instructions" }, {.name = "L1I_CACHE_REFILL_PRFM", 
.modmsk = ARMV8_ATTRS, .code = 0x8147, .desc = "Level 1 instruction cache refill, software preload Counts cache line refills in the level 1 instruction cache caused by a missed instruction fetch generated by software preload or prefetch instructions" }, {.name = "L2D_CACHE_RW", .modmsk = ARMV8_ATTRS, .code = 0x8148, .desc = "Level 2 data cache demand access Counts level 2 cache demand accesses from any load/store operations" }, {.name = "L2D_CACHE_PRFM", .modmsk = ARMV8_ATTRS, .code = 0x814a, .desc = "Level 2 data cache software preload Counts level 2 data cache accesses generated by software preload or prefetch instructions" }, {.name = "L2D_CACHE_MISS", .modmsk = ARMV8_ATTRS, .code = 0x814c, .desc = "Level 2 data cache demand access miss Counts cache line misses in the level 2 cache" }, {.name = "L2D_CACHE_REFILL_PRFM", .modmsk = ARMV8_ATTRS, .code = 0x814e, .desc = "Level 2 data cache refill, software preload Counts refills due to accesses generated as a result of software preload or prefetch instructions as counted by L2D_CACHE_PRFM" }, {.name = "L1D_CACHE_HWPRF", .modmsk = ARMV8_ATTRS, .code = 0x8154, .desc = "Level 1 data cache hardware prefetch Counts level 1 data cache accesses from any load/store operations generated by the hardware prefetcher" }, {.name = "L2D_CACHE_HWPRF", .modmsk = ARMV8_ATTRS, .code = 0x8155, .desc = "Level 2 data cache hardware prefetch Counts level 2 data cache accesses generated by L2D hardware prefetchers" }, {.name = "STALL_FRONTEND_MEMBOUND", .modmsk = ARMV8_ATTRS, .code = 0x8158, .desc = "Frontend stall cycles, memory bound Counts cycles when the frontend could not send any micro-operations to the rename stage due to resource constraints in the memory resources" }, {.name = "STALL_FRONTEND_L1I", .modmsk = ARMV8_ATTRS, .code = 0x8159, .desc = "Frontend stall cycles, level 1 instruction cache Counts cycles when the frontend is stalled because there is an instruction fetch request pending in the level 1 instruction cache" }, {.name 
= "STALL_FRONTEND_MEM", .modmsk = ARMV8_ATTRS, .code = 0x815b, .desc = "Frontend stall cycles, last level PE cache or memory Counts cycles when the frontend is stalled because there is an instruction fetch request pending in the last level core cache" }, {.name = "STALL_FRONTEND_TLB", .modmsk = ARMV8_ATTRS, .code = 0x815c, .desc = "Frontend stall cycles, TLB Counts when the frontend is stalled on any TLB misses being handled" }, {.name = "STALL_FRONTEND_CPUBOUND", .modmsk = ARMV8_ATTRS, .code = 0x8160, .desc = "Frontend stall cycles, processor bound Counts cycles when the frontend could not send any micro-operations to the rename stage due to resource constraints in the CPU resources excluding memory resources" }, {.name = "STALL_FRONTEND_FLOW", .modmsk = ARMV8_ATTRS, .code = 0x8161, .desc = "Frontend stall cycles, flow control Counts cycles when the frontend could not send any micro-operations to the rename stage due to resource constraints in the branch prediction unit" }, {.name = "STALL_FRONTEND_FLUSH", .modmsk = ARMV8_ATTRS, .code = 0x8162, .desc = "Frontend stall cycles, flush recovery Counts cycles when the frontend could not send any micro-operations to the rename stage as the frontend is recovering from a machine flush or resteer" }, {.name = "STALL_BACKEND_MEMBOUND", .modmsk = ARMV8_ATTRS, .code = 0x8164, .desc = "Backend stall cycles, memory bound Counts cycles when the backend could not accept any micro-operations due to resource constraints in the memory resources" }, {.name = "STALL_BACKEND_L1D", .modmsk = ARMV8_ATTRS, .code = 0x8165, .desc = "Backend stall cycles, level 1 data cache Counts cycles when the backend is stalled because there is a pending demand load request in progress in the level 1 data cache" }, {.name = "STALL_BACKEND_L2D", .modmsk = ARMV8_ATTRS, .code = 0x8166, .desc = "Backend stall cycles, level 2 data cache Counts cycles when the backend is stalled because there is a pending demand load request in progress in the level 2 data 
cache" }, {.name = "STALL_BACKEND_TLB", .modmsk = ARMV8_ATTRS, .code = 0x8167, .desc = "Backend stall cycles, TLB Counts cycles when the backend is stalled on any demand TLB misses being handled" }, {.name = "STALL_BACKEND_ST", .modmsk = ARMV8_ATTRS, .code = 0x8168, .desc = "Backend stall cycles, store Counts cycles when the backend is stalled and there is a store that has not reached the pre-commit stage" }, {.name = "STALL_BACKEND_CPUBOUND", .modmsk = ARMV8_ATTRS, .code = 0x816a, .desc = "Backend stall cycles, processor bound Counts cycles when the backend could not accept any micro-operations due to any resource constraints in the CPU excluding memory resources" }, {.name = "STALL_BACKEND_BUSY", .modmsk = ARMV8_ATTRS, .code = 0x816b, .desc = "Backend stall cycles, backend busy Counts cycles when the backend could not accept any micro-operations because the issue queues are full to take any operations for execution" }, {.name = "STALL_BACKEND_ILOCK", .modmsk = ARMV8_ATTRS, .code = 0x816c, .desc = "Backend stall cycles, input dependency Counts cycles when the backend could not accept any micro-operations due to resource constraints imposed by input dependency" }, {.name = "STALL_BACKEND_RENAME", .modmsk = ARMV8_ATTRS, .code = 0x816d, .desc = "Backend stall cycles, rename full Counts cycles when backend is stalled even when operations are available from the frontend but at least one is not ready to be sent to the backend because no rename register is available" }, {.name = "L1I_CACHE_HIT_RD", .modmsk = ARMV8_ATTRS, .code = 0x81c0, .desc = "Level 1 instruction cache demand fetch hit Counts demand instruction fetches that access the level 1 instruction cache and hit in the L1 instruction cache" }, {.name = "L1I_CACHE_HIT_RD_FPRFM", .modmsk = ARMV8_ATTRS, .code = 0x81d0, .desc = "Level 1 instruction cache demand fetch first hit, fetched by software preload Counts demand instruction fetches that access the level 1 instruction cache that hit in the L1 instruction cache 
and the line was requested by a software prefetch" }, {.name = "L1I_CACHE_HIT_RD_FHWPRF", .modmsk = ARMV8_ATTRS, .code = 0x81e0, .desc = "Level 1 instruction cache demand fetch first hit, fetched by hardware prefetcher Counts demand instruction fetches generated by hardware prefetch that access the level 1 instruction cache and hit in the L1 instruction cache" }, {.name = "L1I_CACHE_HIT", .modmsk = ARMV8_ATTRS, .code = 0x8200, .desc = "Level 1 instruction cache hit Counts instruction fetches that access the level 1 instruction cache and hit in the level 1 instruction cache" }, {.name = "L1I_CACHE_HIT_PRFM", .modmsk = ARMV8_ATTRS, .code = 0x8208, .desc = "Level 1 instruction cache software preload hit Counts instruction fetches generated by software preload or prefetch instructions that access the level 1 instruction cache and hit in the level 1 instruction cache" }, {.name = "L1I_LFB_HIT_RD", .modmsk = ARMV8_ATTRS, .code = 0x8240, .desc = "Level 1 instruction cache demand fetch line-fill buffer hit Counts demand instruction fetches that access the level 1 instruction cache and hit in a line that is in the process of being loaded into the level 1 instruction cache" }, {.name = "L1I_LFB_HIT_RD_FPRFM", .modmsk = ARMV8_ATTRS, .code = 0x8250, .desc = "Level 1 instruction cache demand fetch line-fill buffer first hit, recently fetched by software preload Counts demand instruction fetches generated by software prefetch instructions that access the level 1 instruction cache and hit in a line that is in the process of being loaded into the level 1 instruction cache" }, {.name = "L1I_LFB_HIT_RD_FHWPRF", .modmsk = ARMV8_ATTRS, .code = 0x8260, .desc = "Level 1 instruction cache demand fetch line-fill buffer first hit, recently fetched by hardware prefetcher Counts demand instruction fetches generated by hardware prefetch that access the level 1 instruction cache and hit in a line that is in the process of being loaded into the level 1 instruction cache" }, /* END Neoverse V3 
specific events */ };
papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_qcom_krait_events.h
/* * Copyright (c) 2014 by Vince Weaver * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * Qualcomm Krait Chips * based on info in the thread on linux-kernel: * [PATCH 0/7] Support Krait CPU PMUs */ static const arm_entry_t arm_qcom_krait_pe[]={ {.name = "L1D_CACHE_REFILL", .modmsk = ARMV7_A15_ATTRS, .code = 0x03, .desc = "Level 1 data cache refill" }, {.name = "L1D_CACHE_ACCESS", .modmsk = ARMV7_A15_ATTRS, .code = 0x04, .desc = "Level 1 data cache access" }, {.name = "INSTR_EXECUTED", .modmsk = ARMV7_A15_ATTRS, .code = 0x08, .desc = "Instructions architecturally executed" }, {.name = "PC_WRITE", .modmsk = ARMV7_A15_ATTRS, .code = 0x0c, .desc = "Software change of PC.
Equivalent to branches" }, {.name = "PC_BRANCH_MIS_PRED", .modmsk = ARMV7_A15_ATTRS, .code = 0x10, .desc = "Branches mispredicted or not predicted" }, {.name = "CLOCK_CYCLES", .modmsk = ARMV7_A15_ATTRS, .code = 0x11, .desc = "Cycles" }, {.name = "BRANCH_PRED", .modmsk = ARMV7_A15_ATTRS, .code = 0x12, .desc = "Predictable branch speculatively executed" }, {.name = "CPU_CYCLES", .modmsk = ARMV7_A15_ATTRS, .code = 0xff, .desc = "Cycles" }, };
papi-papi-7-2-0-t/src/libpfm4/lib/events/arm_xgene_events.h
/* * Copyright (c) 2014 Red Hat Inc. All rights reserved * Contributed by William Cohen * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
* * Applied Micro X-Gene * based on https://github.com/AppliedMicro/ENGLinuxLatest/blob/apm_linux_v3.17-rc4/Documentation/arm64/xgene_pmu.txt */ static const arm_entry_t arm_xgene_pe[]={ {.name = "SW_INCR", .modmsk = ARMV8_ATTRS, .code = 0x00, .desc = "Instruction architecturally executed (condition check pass) software increment" }, {.name = "L1I_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x01, .desc = "Level 1 instruction cache refill" }, {.name = "L1I_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x02, .desc = "Level 1 instruction TLB refill" }, {.name = "L1D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x03, .desc = "Level 1 data cache refill" }, {.name = "L1D_CACHE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x04, .desc = "Level 1 data cache access" }, {.name = "L1D_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x05, .desc = "Level 1 data TLB refill" }, {.name = "INST_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x08, .desc = "Instruction architecturally executed" }, {.name = "EXCEPTION_TAKEN", .modmsk = ARMV8_ATTRS, .code = 0x09, .desc = "Exception taken" }, {.name = "EXCEPTION_RETURN", .modmsk = ARMV8_ATTRS, .code = 0x0a, .desc = "Instruction architecturally executed (condition check pass) - Exception return" }, {.name = "CID_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x0b, .desc = "Instruction architecturally executed (condition check pass) - Write to CONTEXTIDR", }, {.name = "BRANCH_MISPRED", .modmsk = ARMV8_ATTRS, .code = 0x10, .desc = "Mispredicted or not predicted branch speculatively executed" }, {.name = "CPU_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x11, .desc = "Cycles" }, {.name = "BRANCH_PRED", .modmsk = ARMV8_ATTRS, .code = 0x12, .desc = "Predictable branch speculatively executed" }, {.name = "DATA_MEM_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x13, .desc = "Data memory access" }, {.name = "L1I_CACHE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x14, .desc = "Level 1 instruction cache access" }, {.name = "L2D_CACHE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 
0x16, .desc = "Level 2 data cache access" }, {.name = "L2D_CACHE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x17, .desc = "Level 2 data cache refill" }, {.name = "L2D_CACHE_WB", .modmsk = ARMV8_ATTRS, .code = 0x18, .desc = "Level 2 data cache WriteBack" }, {.name = "BUS_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x19, .desc = "Bus access" }, {.name = "LOCAL_MEMORY_ERROR", .modmsk = ARMV8_ATTRS, .code = 0x1a, .desc = "Local memory error" }, {.name = "INST_SPEC_EXEC", .modmsk = ARMV8_ATTRS, .code = 0x1b, .desc = "Instruction speculatively executed" }, {.name = "TTBR_WRITE_RETIRED", .modmsk = ARMV8_ATTRS, .code = 0x1c, .desc = "Instruction architecturally executed (condition check pass) Write to translation table base" }, {.name = "L1D_READ_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x40, .desc = "Level 1 data cache read access" }, {.name = "L1D_WRITE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x41, .desc = "Level 1 data cache write access" }, {.name = "L1D_READ_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x42, .desc = "Level 1 data cache read refill" }, {.name = "L1D_INVALIDATE", .modmsk = ARMV8_ATTRS, .code = 0x48, .desc = "Level 1 data cache invalidate" }, {.name = "L1D_TLB_READ_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x4c, .desc = "Level 1 data TLB read refill" }, {.name = "L1D_TLB_WRITE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x4d, .desc = "Level 1 data TLB write refill" }, {.name = "L2D_READ_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x50, .desc = "Level 2 data cache read access" }, {.name = "L2D_WRITE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x51, .desc = "Level 2 data cache write access" }, {.name = "L2D_READ_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x52, .desc = "Level 2 data cache read refill" }, {.name = "L2D_WRITE_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x53, .desc = "Level 2 data cache write refill" }, {.name = "L2D_WB_VICTIM", .modmsk = ARMV8_ATTRS, .code = 0x56, .desc = "Level 2 data cache writeback victim" }, {.name = "L2D_WB_CLEAN_COHERENCY", .modmsk = ARMV8_ATTRS, .code 
= 0x57, .desc = "Level 2 data cache writeback cleaning and coherency" }, {.name = "L2D_INVALIDATE", .modmsk = ARMV8_ATTRS, .code = 0x58, .desc = "Level 2 data cache invalidate" }, {.name = "BUS_READ_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x60, .desc = "Bus read access" }, {.name = "BUS_WRITE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x61, .desc = "Bus write access" }, {.name = "BUS_NORMAL_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x62, .desc = "Bus normal access" }, {.name = "BUS_NOT_NORMAL_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x63, .desc = "Bus not normal access" }, {.name = "BUS_NORMAL_ACCESS_2", .modmsk = ARMV8_ATTRS, .code = 0x64, .desc = "Bus normal access" }, {.name = "BUS_PERIPH_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x65, .desc = "Bus peripheral access" }, {.name = "DATA_MEM_READ_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x66, .desc = "Data memory read access" }, {.name = "DATA_MEM_WRITE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x67, .desc = "Data memory write access" }, {.name = "UNALIGNED_READ_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x68, .desc = "Unaligned read access" }, {.name = "UNALIGNED_WRITE_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x69, .desc = "Unaligned write access" }, {.name = "UNALIGNED_ACCESS", .modmsk = ARMV8_ATTRS, .code = 0x6a, .desc = "Unaligned access" }, {.name = "INST_SPEC_EXEC_LDREX", .modmsk = ARMV8_ATTRS, .code = 0x6c, .desc = "Exclusive operation speculatively executed - Load exclusive" }, {.name = "INST_SPEC_EXEC_STREX_PASS", .modmsk = ARMV8_ATTRS, .code = 0x6d, .desc = "Exclusive operation speculatively executed - Store exclusive pass" }, {.name = "INST_SPEC_EXEC_STREX_FAIL", .modmsk = ARMV8_ATTRS, .code = 0x6e, .desc = "Exclusive operation speculatively executed - Store exclusive fail" }, {.name = "INST_SPEC_EXEC_STREX", .modmsk = ARMV8_ATTRS, .code = 0x6f, .desc = "Exclusive operation speculatively executed - Store exclusive" }, {.name = "INST_SPEC_EXEC_LOAD", .modmsk = ARMV8_ATTRS, .code = 0x70, .desc = "Load instruction
speculatively executed" }, {.name = "INST_SPEC_EXEC_STORE", .modmsk = ARMV8_ATTRS, .code = 0x71, .desc = "Store instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_LOAD_STORE", .modmsk = ARMV8_ATTRS, .code = 0x72, .desc = "Load or store instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_INTEGER_INST", .modmsk = ARMV8_ATTRS, .code = 0x73, .desc = "Integer data processing instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_SIMD", .modmsk = ARMV8_ATTRS, .code = 0x74, .desc = "Advanced SIMD instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_VFP", .modmsk = ARMV8_ATTRS, .code = 0x75, .desc = "VFP instruction speculatively executed" }, {.name = "INST_SPEC_EXEC_SOFT_PC", .modmsk = ARMV8_ATTRS, .code = 0x76, .desc = "Software change of the PC instruction speculatively executed" }, {.name = "BRANCH_SPEC_EXEC_IMM_BRANCH", .modmsk = ARMV8_ATTRS, .code = 0x78, .desc = "Immediate branch speculatively executed" }, {.name = "BRANCH_SPEC_EXEC_RET", .modmsk = ARMV8_ATTRS, .code = 0x79, .desc = "Return branch speculatively executed" }, {.name = "BRANCH_SPEC_EXEC_IND", .modmsk = ARMV8_ATTRS, .code = 0x7a, .desc = "Indirect branch speculatively executed" }, {.name = "BARRIER_SPEC_EXEC_ISB", .modmsk = ARMV8_ATTRS, .code = 0x7c, .desc = "ISB barrier speculatively executed" }, {.name = "BARRIER_SPEC_EXEC_DSB", .modmsk = ARMV8_ATTRS, .code = 0x7d, .desc = "DSB barrier speculatively executed" }, {.name = "BARRIER_SPEC_EXEC_DMB", .modmsk = ARMV8_ATTRS, .code = 0x7e, .desc = "DMB barrier speculatively executed" }, {.name = "EXCEPTION_UNDEF", .modmsk = ARMV8_ATTRS, .code = 0x81, .desc = "Exception taken, other synchronous" }, {.name = "EXCEPTION_SVC", .modmsk = ARMV8_ATTRS, .code = 0x82, .desc = "Exception taken, supervisor call" }, {.name = "EXCEPTION_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x83, .desc = "Exception taken, instruction abort" }, {.name = "EXCEPTION_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x84, .desc = "Exception taken, data 
abort or SError" }, {.name = "EXCEPTION_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x86, .desc = "Exception taken, irq" }, {.name = "EXCEPTION_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x87, .desc = "Exception taken, fiq" }, {.name = "EXCEPTION_HVC", .modmsk = ARMV8_ATTRS, .code = 0x8a, .desc = "Exception taken, hypervisor call" }, {.name = "EXCEPTION_TRAP_PABORT", .modmsk = ARMV8_ATTRS, .code = 0x8b, .desc = "Exception taken, instruction abort not taken locally" }, {.name = "EXCEPTION_TRAP_DABORT", .modmsk = ARMV8_ATTRS, .code = 0x8c, .desc = "Exception taken, data abort or SError not taken locally" }, {.name = "EXCEPTION_TRAP_OTHER", .modmsk = ARMV8_ATTRS, .code = 0x8d, .desc = "Exception taken, other traps not taken locally" }, {.name = "EXCEPTION_TRAP_IRQ", .modmsk = ARMV8_ATTRS, .code = 0x8e, .desc = "Exception taken, irq not taken locally" }, {.name = "EXCEPTION_TRAP_FIQ", .modmsk = ARMV8_ATTRS, .code = 0x8f, .desc = "Exception taken, fiq not taken locally" }, {.name = "RC_LD_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x90, .desc = "Release consistency instruction speculatively executed (load-acquire)", }, {.name = "RC_ST_SPEC", .modmsk = ARMV8_ATTRS, .code = 0x91, .desc = "Release consistency instruction speculatively executed (store-release)", }, {.name = "INST_SPEC_EXEC_NOP", .modmsk = ARMV8_ATTRS, .code = 0x100, .desc = "Operation speculatively executed - NOP", }, {.name = "FSU_CLOCK_OFF", .modmsk = ARMV8_ATTRS, .code = 0x101, .desc = "FSU clocking gated off cycle", }, {.name = "BTB_MISPREDICT", .modmsk = ARMV8_ATTRS, .code = 0x102, .desc = "BTB misprediction", }, {.name = "ITB_MISS", .modmsk = ARMV8_ATTRS, .code = 0x103, .desc = "ITB miss", }, {.name = "DTB_MISS", .modmsk = ARMV8_ATTRS, .code = 0x104, .desc = "DTB miss", }, {.name = "L1D_CACHE_LATE_MISS", .modmsk = ARMV8_ATTRS, .code = 0x105, .desc = "L1 data cache late miss", }, {.name = "L1D_CACHE_PREFETCH", .modmsk = ARMV8_ATTRS, .code = 0x106, .desc = "L1 data cache prefetch request", }, {.name = 
"L2_CACHE_PREFETCH", .modmsk = ARMV8_ATTRS, .code = 0x107, .desc = "L2 data prefetch request", }, {.name = "STALLED_CYCLES_FRONTEND", .modmsk = ARMV8_ATTRS, .code = 0x108, .desc = "Decode starved for instruction cycle", }, {.name = "STALLED_CYCLES_BACKEND", .modmsk = ARMV8_ATTRS, .code = 0x109, .desc = "Op dispatch stalled cycle", }, {.name = "IXA_NO_ISSUE", .modmsk = ARMV8_ATTRS, .code = 0x10A, .desc = "IXA Op non-issue", }, {.name = "IXB_NO_ISSUE", .modmsk = ARMV8_ATTRS, .code = 0x10B, .desc = "IXB Op non-issue", }, {.name = "BX_NO_ISSUE", .modmsk = ARMV8_ATTRS, .code = 0x10C, .desc = "BX Op non-issue", }, {.name = "LX_NO_ISSUE", .modmsk = ARMV8_ATTRS, .code = 0x10D, .desc = "LX Op non-issue", }, {.name = "SX_NO_ISSUE", .modmsk = ARMV8_ATTRS, .code = 0x10E, .desc = "SX Op non-issue", }, {.name = "FX_NO_ISSUE", .modmsk = ARMV8_ATTRS, .code = 0x10F, .desc = "FX Op non-issue", }, {.name = "WAIT_CYCLES", .modmsk = ARMV8_ATTRS, .code = 0x110, .desc = "Wait state cycle", }, {.name = "L1_STAGE2_TLB_REFILL", .modmsk = ARMV8_ATTRS, .code = 0x111, .desc = "L1 stage-2 TLB refill", }, {.name = "PAGE_WALK_L0_STAGE1_HIT", .modmsk = ARMV8_ATTRS, .code = 0x112, .desc = "Page Walk Cache level-0 stage-1 hit", }, {.name = "PAGE_WALK_L1_STAGE1_HIT", .modmsk = ARMV8_ATTRS, .code = 0x113, .desc = "Page Walk Cache level-1 stage-1 hit", }, {.name = "PAGE_WALK_L2_STAGE1_HIT", .modmsk = ARMV8_ATTRS, .code = 0x114, .desc = "Page Walk Cache level-2 stage-1 hit", }, {.name = "PAGE_WALK_L1_STAGE2_HIT", .modmsk = ARMV8_ATTRS, .code = 0x115, .desc = "Page Walk Cache level-1 stage-2 hit", }, {.name = "PAGE_WALK_L2_STAGE2_HIT", .modmsk = ARMV8_ATTRS, .code = 0x116, .desc = "Page Walk Cache level-2 stage-2 hit", }, /* END Applied Micro X-Gene specific events */ };
papi-papi-7-2-0-t/src/libpfm4/lib/events/cell_events.h
/* * Copyright (c) 2007 TOSHIBA CORPORATION based on code from * Copyright (c) 2001-2006
Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ static pme_cell_entry_t cell_pe[] = { {.pme_name = "CYCLES", .pme_desc = "CPU cycles", .pme_code = 0x0, /* 0 */ .pme_enable_word = WORD_NONE, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BRANCH_COMMIT_TH0", .pme_desc = "Branch instruction committed.", .pme_code = 0x834, /* 2100 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BRANCH_FLUSH_TH0", .pme_desc = "Branch instruction that caused a misprediction flush is committed. 
Branch misprediction includes", .pme_code = 0x835, /* 2101 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "INST_BUFF_EMPTY_TH0", .pme_desc = "Instruction buffer empty.", .pme_code = 0x836, /* 2102 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "INST_ERAT_MISS_TH0", .pme_desc = "Instruction effective-address-to-real-address translation (I-ERAT) miss.", .pme_code = 0x837, /* 2103 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L1_ICACHE_MISS_CYCLES_TH0", .pme_desc = "L1 Instruction cache miss cycles. Counts the cycles from the miss event until the returned instruction is dispatched or cancelled due to branch misprediction, completion restart, or exceptions.", .pme_code = 0x838, /* 2104 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "DISPATCH_BLOCKED_TH0", .pme_desc = "Valid instruction available for dispatch, but dispatch is blocked.", .pme_code = 0x83a, /* 2106 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "INST_FLUSH_TH0", .pme_desc = "Instruction in pipeline stage EX7 causes a flush.", .pme_code = 0x83d, /* 2109 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "PPC_INST_COMMIT_TH0", .pme_desc = "Two PowerPC instructions committed. For microcode sequences, only the last microcode operation is counted. Committed instructions are counted two at a time. 
If only one instruction has committed for a given cycle, this event will not be raised until another instruction has been committed in a future cycle.", .pme_code = 0x83f, /* 2111 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BRANCH_COMMIT_TH1", .pme_desc = "Branch instruction committed.", .pme_code = 0x847, /* 2119 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BRANCH_FLUSH_TH1", .pme_desc = "Branch instruction that caused a misprediction flush is committed. Branch misprediction includes", .pme_code = 0x848, /* 2120 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "INST_BUFF_EMPTY_TH1", .pme_desc = "Instruction buffer empty.", .pme_code = 0x849, /* 2121 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "INST_ERAT_MISS_TH1", .pme_desc = "Instruction effective-address-to-real-address translation (I-ERAT) miss.", .pme_code = 0x84a, /* 2122 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L1_ICACHE_MISS_CYCLES_TH1", .pme_desc = "L1 Instruction cache miss cycles. 
Counts the cycles from the miss event until the returned instruction is dispatched or cancelled due to branch misprediction, completion restart, or exceptions.", .pme_code = 0x84b, /* 2123 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "DISPATCH_BLOCKED_TH1", .pme_desc = "Valid instruction available for dispatch, but dispatch is blocked.", .pme_code = 0x84d, /* 2125 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "INST_FLUSH_TH1", .pme_desc = "Instruction in pipeline stage EX7 causes a flush.", .pme_code = 0x850, /* 2128 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "PPC_INST_COMMIT_TH1", .pme_desc = "Two PowerPC instructions committed. For microcode sequences, only the last microcode operation is counted. Committed instructions are counted two at a time. If only one instruction has committed for a given cycle, this event will not be raised until another instruction has been committed in a future cycle.", .pme_code = 0x852, /* 2130 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "DATA_ERAT_MISS_TH0", .pme_desc = "Data effective-address-to-real-address translation (D-ERAT) miss. Not speculative.", .pme_code = 0x89a, /* 2202 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "ST_REQ_TH0", .pme_desc = "Store request counted at the L2 interface. Counts microcoded PPE sequences more than once. (Thread 0 and 1)", .pme_code = 0x89b, /* 2203 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "LD_VALID_TH0", .pme_desc = "Load valid at a particular pipe stage. 
Speculative, since flushed operations are counted as well. Counts microcoded PPE sequences more than once. Misaligned flushes might be counted the first time as well. Load operations include all loads that read data from the cache, dcbt and dcbtst. Does not include load Vector/SIMD multimedia extension pattern instructions.", .pme_code = 0x89c, /* 2204 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "L1_DCACHE_MISS_TH0", .pme_desc = "L1 D-cache load miss. Pulsed when there is a miss request that has a tag miss but not an ERAT miss. Speculative, since flushed operations are counted as well.", .pme_code = 0x89d, /* 2205 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "DATA_ERAT_MISS_TH1", .pme_desc = "Data effective-address-to-real-address translation (D-ERAT) miss. Not speculative.", .pme_code = 0x8aa, /* 2218 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "LD_VALID_TH1", .pme_desc = "Load valid at a particular pipe stage. Speculative, since flushed operations are counted as well. Counts microcoded PPE sequences more than once. Misaligned flushes might be counted the first time as well. Load operations include all loads that read data from the cache, dcbt and dcbtst. Does not include load Vector/SIMD multimedia extension pattern instructions.", .pme_code = 0x8ac, /* 2220 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "L1_DCACHE_MISS_TH1", .pme_desc = "L1 D-cache load miss. Pulsed when there is a miss request that has a tag miss but not an ERAT miss. 
Speculative, since flushed operations are counted as well.", .pme_code = 0x8ad, /* 2221 */ .pme_enable_word = WORD_0_AND_1, .pme_freq = PFM_CELL_PME_FREQ_PPU_MFC, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "LD_MFC_MMIO", .pme_desc = "Load from MFC memory-mapped I/O (MMIO) space.", .pme_code = 0xc1c, /* 3100 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "ST_MFC_MMIO", .pme_desc = "Stores to MFC MMIO space.", .pme_code = 0xc1d, /* 3101 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "REQ_TOKEN_TYPE", .pme_desc = "Request token for even memory bank numbers 0-14.", .pme_code = 0xc22, /* 3106 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "RCV_8BEAT_DATA", .pme_desc = "Receive 8-beat data from the Element Interconnect Bus (EIB).", .pme_code = 0xc2b, /* 3115 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "SEND_8BEAT_DATA", .pme_desc = "Send 8-beat data to the EIB.", .pme_code = 0xc2c, /* 3116 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "SEND_CMD", .pme_desc = "Send a command to the EIB; includes retried commands.", .pme_code = 0xc2d, /* 3117 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "DATA_GRANT_CYCLES", .pme_desc = "Cycles between data request and data grant.", .pme_code = 0xc2e, /* 3118 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "NCU_ST_Q_NOT_EMPTY_CYCLES", .pme_desc = "The five-entry Non-Cacheable Unit (NCU) Store Command queue not empty.", .pme_code = 0xc33, /* 3123 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = 
PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "L2_CACHE_HIT", .pme_desc = "Cache hit for core interface unit (CIU) loads and stores.", .pme_code = 0xc80, /* 3200 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_CACHE_MISS", .pme_desc = "Cache miss for CIU loads and stores.", .pme_code = 0xc81, /* 3201 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_LD_MISS", .pme_desc = "CIU load miss.", .pme_code = 0xc84, /* 3204 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_ST_MISS", .pme_desc = "CIU store to Invalid state (miss).", .pme_code = 0xc85, /* 3205 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_LWARX_LDARX_MISS_TH0", .pme_desc = "Load word and reserve indexed (lwarx/ldarx) for Thread 0 hits Invalid cache state.", .pme_code = 0xc87, /* 3207 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_STWCX_STDCX_MISS_TH0", .pme_desc = "Store word conditional indexed (stwcx/stdcx) for Thread 0 hits Invalid cache state when reservation is set.", .pme_code = 0xc8e, /* 3214 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_ALL_SNOOP_SM_BUSY", .pme_desc = "All four snoop state machines busy.", .pme_code = 0xc99, /* 3225 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "L2_DCLAIM_GOOD", .pme_desc = "Data line claim (dclaim) that received good combined response; includes store/stcx/dcbz to Shared (S), Shared Last (SL), or Tagged (T) cache state; does not include dcbz to Invalid (I) cache state.", .pme_code = 0xce8, /* 
3304 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_DCLAIM_TO_RWITM", .pme_desc = "Dclaim converted into rwitm; may still not get to the bus if stcx is aborted.", .pme_code = 0xcef, /* 3311 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_ST_TO_M_MU_E", .pme_desc = "Store to modified (M), modified unsolicited (MU), or exclusive (E) cache state.", .pme_code = 0xcf0, /* 3312 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_ST_Q_FULL", .pme_desc = "8-entry store queue (STQ) full.", .pme_code = 0xcf1, /* 3313 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "L2_ST_TO_RC_ACKED", .pme_desc = "Store dispatched to RC machine is acknowledged.", .pme_code = 0xcf2, /* 3314 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_GATHERABLE_ST", .pme_desc = "Gatherable store (type = 00000) received from CIU.", .pme_code = 0xcf3, /* 3315 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_SNOOP_PUSH", .pme_desc = "Snoop push.", .pme_code = 0xcf6, /* 3318 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_INTERVENTION_FROM_SL_E_SAME_MODE", .pme_desc = "Send intervention from (SL | E) cache state to a destination within the same CBE chip.", .pme_code = 0xcf7, /* 3319 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_INTERVENTION_FROM_M_MU_SAME_MODE", .pme_desc = "Send intervention from (M | MU) cache state to a destination within the same CBE chip.", .pme_code = 0xcf8, /* 3320 
*/ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_SNOOP_RETRY_CONFLICTS", .pme_desc = "Respond with Retry to a snooped request due to one of the following conflicts", .pme_code = 0xcfd, /* 3325 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_SNOOP_RETRY_BUSY", .pme_desc = "Respond with Retry to a snooped request because all snoop machines are busy.", .pme_code = 0xcfe, /* 3326 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_SNOOP_RESP_MMU_TO_EST", .pme_desc = "Snooped response causes a cache state transition from (M | MU) to (E | S | T).", .pme_code = 0xcff, /* 3327 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_SNOOP_RESP_E_TO_S", .pme_desc = "Snooped response causes a cache state transition from E to S.", .pme_code = 0xd00, /* 3328 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_SNOOP_RESP_ESLST_TO_I", .pme_desc = "Snooped response causes a cache state transition from (E | SL | S | T) to Invalid (I).", .pme_code = 0xd01, /* 3329 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_SNOOP_RESP_MMU_TO_I", .pme_desc = "Snooped response causes a cache state transition from (M | MU) to I.", .pme_code = 0xd02, /* 3330 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "L2_LWARX_LDARX_MISS_TH1", .pme_desc = "Load and reserve indexed (lwarx/ldarx) for Thread 1 hits Invalid cache state.", .pme_code = 0xd54, /* 3412 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = 
"L2_STWCX_STDCX_MISS_TH1", .pme_desc = "Store conditional indexed (stwcx/stdcx) for Thread 1 hits Invalid cache state.", .pme_code = 0xd5b, /* 3419 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "NCU_NON_CACHEABLE_ST_ALL", .pme_desc = "Non-cacheable store request received from CIU; includes all synchronization operations such as sync and eieio.", .pme_code = 0xdac, /* 3500 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "NCU_SYNC_REQ", .pme_desc = "sync received from CIU.", .pme_code = 0xdad, /* 3501 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "NCU_NON_CACHEABLE_ST", .pme_desc = "Non-cacheable store request received from CIU; includes only stores.", .pme_code = 0xdb0, /* 3504 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "NCU_EIEIO_REQ", .pme_desc = "eieio received from CIU.", .pme_code = 0xdb2, /* 3506 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "NCU_TLBIE_REQ", .pme_desc = "tlbie received from CIU.", .pme_code = 0xdb3, /* 3507 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "NCU_SYNC_WAIT", .pme_desc = "sync at the bottom of the store queue, while waiting on st_done signal from the Bus Interface Unit (BIU) and sync_done signal from L2.", .pme_code = 0xdb4, /* 3508 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "NCU_LWSYNC_WAIT", .pme_desc = "lwsync at the bottom of the store queue, while waiting for a sync_done signal from the L2.", .pme_code = 0xdb5, /* 3509 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type 
= COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "NCU_EIEIO_WAIT", .pme_desc = "eieio at the bottom of the store queue, while waiting for a st_done signal from the BIU and a sync_done signal from the L2.", .pme_code = 0xdb6, /* 3510 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "NCU_TLBIE_WAIT", .pme_desc = "tlbie at the bottom of the store queue, while waiting for a st_done signal from the BIU.", .pme_code = 0xdb7, /* 3511 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "NCU_COMBINED_NON_CACHEABLE_ST", .pme_desc = "Non-cacheable store combined with the previous non-cacheable store with a contiguous address.", .pme_code = 0xdb8, /* 3512 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "NCU_ALL_ST_GATHER_BUFFS_FULL", .pme_desc = "All four store-gather buffers full.", .pme_code = 0xdbb, /* 3515 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "NCU_LD_REQ", .pme_desc = "Non-cacheable load request received from CIU; includes instruction and data fetches.", .pme_code = 0xdbc, /* 3516 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "NCU_ST_Q_NOT_EMPTY", .pme_desc = "The four-deep store queue not empty.", .pme_code = 0xdbd, /* 3517 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "NCU_ST_Q_FULL", .pme_desc = "The four-deep store queue full.", .pme_code = 0xdbe, /* 3518 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "NCU_AT_LEAST_ONE_ST_GATHER_BUFF_NOT_EMPTY", .pme_desc = "At least one store gather buffer not empty.", .pme_code = 0xdbf, /* 3519 */ 
.pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_DUAL_INST_COMMITTED", .pme_desc = "A dual instruction is committed.", .pme_code = 0x1004, /* 4100 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_SINGLE_INST_COMMITTED", .pme_desc = "A single instruction is committed.", .pme_code = 0x1005, /* 4101 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_PIPE0_INST_COMMITTED", .pme_desc = "A pipeline 0 instruction is committed.", .pme_code = 0x1006, /* 4102 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_PIPE1_INST_COMMITTED", .pme_desc = "A pipeline 1 instruction is committed.", .pme_code = 0x1007, /* 4103 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_LS_BUSY", .pme_desc = "Local storage is busy.", .pme_code = 0x1009, /* 4105 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_DMA_CONFLICT_LD_ST", .pme_desc = "A direct memory access (DMA) might conflict with a load or store.", .pme_code = 0x100a, /* 4106 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_LS_ST", .pme_desc = "A store instruction to local storage is issued.", .pme_code = 0x100b, /* 4107 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_LS_LD", .pme_desc = "A load instruction from local storage is issued.", .pme_code = 0x100c, /* 4108 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = 
"SPU_FP_EXCEPTION", .pme_desc = "A floating-point unit exception occurred.", .pme_code = 0x100d, /* 4109 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_BRANCH_COMMIT", .pme_desc = "A branch instruction is committed.", .pme_code = 0x100e, /* 4110 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_NON_SEQ_PC", .pme_desc = "A nonsequential change of the SPU program counter has occurred. This can be caused by branch, asynchronous interrupt, stalled wait on channel, error-correction code (ECC) error, and so forth.", .pme_code = 0x100f, /* 4111 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_BRANCH_NOT_TAKEN", .pme_desc = "A branch was not taken.", .pme_code = 0x1010, /* 4112 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_BRANCH_MISS_PREDICTION", .pme_desc = "Branch miss prediction. This count is not exact. Certain other code sequences can cause additional pulses on this signal.", .pme_code = 0x1011, /* 4113 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_BRANCH_HINT_MISS_PREDICTION", .pme_desc = "Branch hint miss prediction. This count is not exact. 
Certain other code sequences can cause additional pulses on this signal.", .pme_code = 0x1012, /* 4114 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_INST_SEQ_ERROR", .pme_desc = "Instruction sequence error.", .pme_code = 0x1013, /* 4115 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "SPU_STALL_CH_WRITE", .pme_desc = "Stalled waiting on any blocking channel write.", .pme_code = 0x1015, /* 4117 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_STALL_EXTERNAL_EVENT_CH0", .pme_desc = "Stalled waiting on external event status (Channel 0).", .pme_code = 0x1016, /* 4118 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_STALL_SIGNAL_1_CH3", .pme_desc = "Stalled waiting on SPU Signal Notification 1 (Channel 3).", .pme_code = 0x1017, /* 4119 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_STALL_SIGNAL_2_CH4", .pme_desc = "Stalled waiting on SPU Signal Notification 2 (Channel 4).", .pme_code = 0x1018, /* 4120 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_STALL_DMA_CH21", .pme_desc = "Stalled waiting on DMA Command Opcode or ClassID Register (Channel 21).", .pme_code = 0x1019, /* 4121 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_STALL_MFC_READ_CH24", .pme_desc = "Stalled waiting on memory flow control (MFC) Read Tag-Group Status (Channel 24).", .pme_code = 0x101a, /* 4122 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_STALL_MFC_READ_CH25", .pme_desc = 
"Stalled waiting on MFC Read List Stall-and-Notify Tag Status (Channel 25).", .pme_code = 0x101b, /* 4123 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_STALL_OUTBOUND_MAILBOX_WRITE_CH28", .pme_desc = "Stalled waiting on SPU Write Outbound Mailbox (Channel 28).", .pme_code = 0x101c, /* 4124 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_STALL_MAILBOX_CH29", .pme_desc = "Stalled waiting on SPU Mailbox (Channel 29).", .pme_code = 0x1022, /* 4130 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_TR_STALL_CH", .pme_desc = "Stalled waiting on a channel operation.", .pme_code = 0x10a1, /* 4257 */ .pme_enable_word = WORD_NONE, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_EV_INST_FETCH_STALL", .pme_desc = "Instruction fetch stall.", .pme_code = 0x1107, /* 4359 */ .pme_enable_word = WORD_NONE, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "SPU_EV_ADDR_TRACE", .pme_desc = "Serialized SPU address (program counter) trace.", .pme_code = 0x110b, /* 4363 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_SPU, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_LD", .pme_desc = "An atomic load was received from direct memory access controller (DMAC).", .pme_code = 0x13ed, /* 5101 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_DCLAIM", .pme_desc = "An atomic dclaim was sent to synergistic bus interface (SBI); includes retried requests.", .pme_code = 0x13ee, /* 5102 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_RWITM", .pme_desc = "An atomic rwitm was sent to SBI; 
includes retried requests.", .pme_code = 0x13ef, /* 5103 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_LD_CACHE_MISS_MU", .pme_desc = "An atomic load miss caused MU cache state.", .pme_code = 0x13f0, /* 5104 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_LD_CACHE_MISS_E", .pme_desc = "An atomic load miss caused E cache state.", .pme_code = 0x13f1, /* 5105 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_LD_CACHE_MISS_SL", .pme_desc = "An atomic load miss caused SL cache state.", .pme_code = 0x13f2, /* 5106 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_LD_CACHE_HIT", .pme_desc = "An atomic load hits cache.", .pme_code = 0x13f3, /* 5107 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_LD_CACHE_MISS_INTERVENTION", .pme_desc = "Atomic load misses cache with data intervention; sum of signals 4 and 6 in this group.", .pme_code = 0x13f4, /* 5108 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ATOMIC_PUTLLXC_CACHE_MISS_WO_INTERVENTION", .pme_desc = "putllc or putlluc misses cache without data intervention; for putllc, counts only when reservation is set for the address.", .pme_code = 0x13fa, /* 5114 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_SNOOP_MACHINE_BUSY", .pme_desc = "Snoop machine busy.", .pme_code = 0x13fd, /* 5117 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MFC_SNOOP_MMU_TO_I", .pme_desc = 
"A snoop caused cache transition from [M | MU] to I.", .pme_code = 0x13ff, /* 5119 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_SNOOP_ESSL_TO_I", .pme_desc = "A snoop caused cache transition from [E | S | SL] to I.", .pme_code = 0x1401, /* 5121 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_SNOOP_MU_TO_T", .pme_desc = "A snoop caused cache transition from MU to T cache state.", .pme_code = 0x1403, /* 5123 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_SENT_INTERVENTION_LOCAL", .pme_desc = "Sent modified data intervention to a destination within the same CBE chip.", .pme_code = 0x1407, /* 5127 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ANY_DMA_GET", .pme_desc = "Any flavor of DMA get[] command issued to Synergistic Bus Interface (SBI); sum of signals 17-25 in this group.", .pme_code = 0x1450, /* 5200 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ANY_DMA_PUT", .pme_desc = "Any flavor of DMA put[] command issued to SBI; sum of signals 2-16 in this group.", .pme_code = 0x1451, /* 5201 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_DMA_PUT", .pme_desc = "DMA put (put) is issued to SBI.", .pme_code = 0x1452, /* 5202 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_DMA_GET", .pme_desc = "DMA get data from effective address to local storage (get) issued to SBI.", .pme_code = 0x1461, /* 5217 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = 
"MFC_LD_REQ", .pme_desc = "Load request sent to element interconnect bus (EIB); includes read, read atomic, rwitm, rwitm atomic, and retried commands.", .pme_code = 0x14b8, /* 5304 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_ST_REQ", .pme_desc = "Store request sent to EIB; includes wwf, wwc, wwk, dclaim, dclaim atomic, and retried commands.", .pme_code = 0x14b9, /* 5305 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_RECV_DATA", .pme_desc = "Received data from EIB, including partial cache line data.", .pme_code = 0x14ba, /* 5306 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_SENT_DATA", .pme_desc = "Sent data to EIB, both as a master and a snooper.", .pme_code = 0x14bb, /* 5307 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_SBI_Q_NOT_EMPTY", .pme_desc = "16-deep synergistic bus interface (SBI) queue with outgoing requests not empty; does not include atomic requests.", .pme_code = 0x14bc, /* 5308 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MFC_SBI_Q_FULL", .pme_desc = "16-deep SBI queue with outgoing requests full; does not include atomic requests.", .pme_code = 0x14bd, /* 5309 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MFC_SENT_REQ", .pme_desc = "Sent request to EIB.", .pme_code = 0x14be, /* 5310 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_RECV_DATA_BUS_GRANT", .pme_desc = "Received data bus grant; includes data sent for MMIO operations.", .pme_code = 0x14c0, /* 5312 */ .pme_enable_word = 
WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_WAIT_DATA_BUS_GRANT", .pme_desc = "Cycles between data bus request and data bus grant.", .pme_code = 0x14c1, /* 5313 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MFC_CMD_O_MEM", .pme_desc = "Command (read or write) for an odd-numbered memory bank; valid only when resource allocation is turned on.", .pme_code = 0x14c2, /* 5314 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_CMD_E_MEM", .pme_desc = "Command (read or write) for an even-numbered memory bank; valid only when resource allocation is turned on.", .pme_code = 0x14c3, /* 5315 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_RECV_RETRY_RESP", .pme_desc = "Request gets the Retry response; includes local and global requests.", .pme_code = 0x14c6, /* 5318 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_SENT_DATA_BUS_REQ", .pme_desc = "Sent data bus request to EIB.", .pme_code = 0x14c7, /* 5319 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_TLB_MISS", .pme_desc = "Translation Lookaside Buffer (TLB) miss without parity or protection errors.", .pme_code = 0x1518, /* 5400 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "MFC_TLB_CYCLES", .pme_desc = "TLB miss (cycles).", .pme_code = 0x1519, /* 5401 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MFC_TLB_HIT", .pme_desc = "TLB hit.", .pme_code = 0x151a, /* 5402 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = 
PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_READ_RWITM_1", .pme_desc = "Number of read and rwitm commands (including atomic) from AC1 to AC0. (Group 1)", .pme_code = 0x17d4, /* 6100 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_DCLAIM_1", .pme_desc = "Number of dclaim commands (including atomic) from AC1 to AC0. (Group 1)", .pme_code = 0x17d5, /* 6101 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_WWK_WWC_WWF_1", .pme_desc = "Number of wwk, wwc, and wwf commands from AC1 to AC0. (Group 1)", .pme_code = 0x17d6, /* 6102 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SYNC_TLBSYNC_EIEIO_1", .pme_desc = "Number of sync, tlbsync, and eieio commands from AC1 to AC0. (Group 1)", .pme_code = 0x17d7, /* 6103 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_TLBIE_1", .pme_desc = "Number of tlbie commands from AC1 to AC0. (Group 1)", .pme_code = 0x17d8, /* 6104 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_PAAM_CAM_HIT_1", .pme_desc = "Previous adjacent address match (PAAM) Content Addressable Memory (CAM) hit. (Group 1)", .pme_code = 0x17df, /* 6111 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_PAAM_CAM_MISS_1", .pme_desc = "PAAM CAM miss. (Group 1)", .pme_code = 0x17e0, /* 6112 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_CMD_REFLECTED_1", .pme_desc = "Command reflected. 
(Group 1)", .pme_code = 0x17e2, /* 6114 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_READ_RWITM_2", .pme_desc = "Number of read and rwitm commands (including atomic) from AC1 to AC0. (Group 2)", .pme_code = 0x17e4, /* 6116 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_DCLAIM_2", .pme_desc = "Number of dclaim commands (including atomic) from AC1 to AC0. (Group 2)", .pme_code = 0x17e5, /* 6117 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_WWK_WWC_WWF_2", .pme_desc = "Number of wwk, wwc, and wwf commands from AC1 to AC0. (Group 2)", .pme_code = 0x17e6, /* 6118 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SYNC_TLBSYNC_EIEIO_2", .pme_desc = "Number of sync, tlbsync, and eieio commands from AC1 to AC0. (Group 2)", .pme_code = 0x17e7, /* 6119 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_TLBIE_2", .pme_desc = "Number of tlbie commands from AC1 to AC0. (Group 2)", .pme_code = 0x17e8, /* 6120 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_PAAM_CAM_HIT_2", .pme_desc = "PAAM CAM hit. (Group 2)", .pme_code = 0x17ef, /* 6127 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_PAAM_CAM_MISS_2", .pme_desc = "PAAM CAM miss. (Group 2)", .pme_code = 0x17f0, /* 6128 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_CMD_REFLECTED_2", .pme_desc = "Command reflected. 
(Group 2)", .pme_code = 0x17f2, /* 6130 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_LOCAL_CMD_FROM_SPE6", .pme_desc = "Local command from SPE 6.", .pme_code = 0x1839, /* 6201 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_LOCAL_CMD_FROM_SPE4", .pme_desc = "Local command from SPE 4.", .pme_code = 0x183a, /* 6202 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_LOCAL_CME_FROM_SPE2", .pme_desc = "Local command from SPE 2.", .pme_code = 0x183b, /* 6203 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_LOCAL_CMD_FROM_PPE", .pme_desc = "Local command from PPE.", .pme_code = 0x183d, /* 6205 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_LOCAL_CMD_FROM_SPE1", .pme_desc = "Local command from SPE 1.", .pme_code = 0x183e, /* 6206 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_LOCAL_CMD_FROM_SPE3", .pme_desc = "Local command from SPE 3.", .pme_code = 0x183f, /* 6207 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_LOCAL_CMD_FROM_SPE5", .pme_desc = "Local command from SPE 5.", .pme_code = 0x1840, /* 6208 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_LOCAL_CMD_FROM_SPE7", .pme_desc = "Local command from SPE 7.", .pme_code = 0x1841, /* 6209 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_SPE6", .pme_desc = "AC1-to-AC0 global command from SPE 6.", 
.pme_code = 0x1844, /* 6212 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_SPE4", .pme_desc = "AC1-to-AC0 global command from SPE 4.", .pme_code = 0x1845, /* 6213 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_SPE2", .pme_desc = "AC1-to-AC0 global command from SPE 2.", .pme_code = 0x1846, /* 6214 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_SPE0", .pme_desc = "AC1-to-AC0 global command from SPE 0.", .pme_code = 0x1847, /* 6215 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_PPE", .pme_desc = "AC1-to-AC0 global command from PPE.", .pme_code = 0x1848, /* 6216 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_SPE1", .pme_desc = "AC1-to-AC0 global command from SPE 1.", .pme_code = 0x1849, /* 6217 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_SPE3", .pme_desc = "AC1-to-AC0 global command from SPE 3.", .pme_code = 0x184a, /* 6218 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_SPE5", .pme_desc = "AC1-to-AC0 global command from SPE 5.", .pme_code = 0x184b, /* 6219 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GLOBAL_CMD_FROM_SPE7", .pme_desc = "AC1-to-AC0 global command from SPE 7.", .pme_code = 0x184c, /* 6220 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, 
{.pme_name = "EIB_AC1_REFLECTING_LOCAL_CMD", .pme_desc = "AC1 is reflecting any local command.", .pme_code = 0x184e, /* 6222 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_AC1_SEND_GLOBAL_CMD", .pme_desc = "AC1 sends a global command to AC0.", .pme_code = 0x184f, /* 6223 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_AC0_REFLECT_GLOBAL_CMD", .pme_desc = "AC0 reflects a global command back to AC1.", .pme_code = 0x1850, /* 6224 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_AC1_REFLECT_CMD_TO_BM", .pme_desc = "AC1 reflects a command back to the bus masters.", .pme_code = 0x1851, /* 6225 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GRANT_DATA_RING0_1", .pme_desc = "Grant on data ring 0.", .pme_code = 0x189c, /* 6300 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GRANT_DATA_RING1_1", .pme_desc = "Grant on data ring 1.", .pme_code = 0x189d, /* 6301 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GRANT_DATA_RING2_1", .pme_desc = "Grant on data ring 2.", .pme_code = 0x189e, /* 6302 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GRANT_DATA_RING3_1", .pme_desc = "Grant on data ring 3.", .pme_code = 0x189f, /* 6303 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_DATA_RING0_INUSE_1", .pme_desc = "Data ring 0 is in use.", .pme_code = 0x18a0, /* 6304 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = 
COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_DATA_RING1_INUSE_1", .pme_desc = "Data ring 1 is in use.", .pme_code = 0x18a1, /* 6305 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_DATA_RING2_INUSE_1", .pme_desc = "Data ring 2 is in use.", .pme_code = 0x18a2, /* 6306 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_DATA_RING3_INUSE_1", .pme_desc = "Data ring 3 is in use.", .pme_code = 0x18a3, /* 6307 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_ALL_DATA_RINGS_IDLE_1", .pme_desc = "All data rings are idle.", .pme_code = 0x18a4, /* 6308 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_ONE_DATA_RING_BUSY_1", .pme_desc = "One data ring is busy.", .pme_code = 0x18a5, /* 6309 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TWO_OR_THREE_DATA_RINGS_BUSY_1", .pme_desc = "Two or three data rings are busy.", .pme_code = 0x18a6, /* 6310 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_ALL_DATA_RINGS_BUSY_1", .pme_desc = "All data rings are busy.", .pme_code = 0x18a7, /* 6311 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_IOIF0_DATA_REQ_PENDING_1", .pme_desc = "BIC(IOIF0) data request pending.", .pme_code = 0x18a8, /* 6312 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE6_DATA_REQ_PENDING_1", .pme_desc = "SPE 6 data request pending.", .pme_code = 0x18a9, /* 6313 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = 
PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE4_DATA_REQ_PENDING_1", .pme_desc = "SPE 4 data request pending.", .pme_code = 0x18aa, /* 6314 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE2_DATA_REQ_PENDING_1", .pme_desc = "SPE 2 data request pending.", .pme_code = 0x18ab, /* 6315 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE0_DATA_REQ_PENDING_1", .pme_desc = "SPE 0 data request pending.", .pme_code = 0x18ac, /* 6316 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_MIC_DATA_REQ_PENDING_1", .pme_desc = "MIC data request pending.", .pme_code = 0x18ad, /* 6317 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_PPE_DATA_REQ_PENDING_1", .pme_desc = "PPE data request pending.", .pme_code = 0x18ae, /* 6318 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE1_DATA_REQ_PENDING_1", .pme_desc = "SPE 1 data request pending.", .pme_code = 0x18af, /* 6319 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE3_DATA_REQ_PENDING_1", .pme_desc = "SPE 3 data request pending.", .pme_code = 0x18b0, /* 6320 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE5_DATA_REQ_PENDING_1", .pme_desc = "SPE 5 data request pending.", .pme_code = 0x18b1, /* 6321 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE7_DATA_REQ_PENDING_1", .pme_desc = "SPE 7 data request pending.", .pme_code = 0x18b2, /* 6322 */ .pme_enable_word = WORD_0_AND_2, 
.pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_IOIF0_DATA_DEST_1", .pme_desc = "IOIF0 is data destination.", .pme_code = 0x18b4, /* 6324 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE6_DATA_DEST_1", .pme_desc = "SPE 6 is data destination.", .pme_code = 0x18b5, /* 6325 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE4_DATA_DEST_1", .pme_desc = "SPE 4 is data destination.", .pme_code = 0x18b6, /* 6326 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE2_DATA_DEST_1", .pme_desc = "SPE 2 is data destination.", .pme_code = 0x18b7, /* 6327 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE0_DATA_DEST_1", .pme_desc = "SPE 0 is data destination.", .pme_code = 0x18b8, /* 6328 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_MIC_DATA_DEST_1", .pme_desc = "MIC is data destination.", .pme_code = 0x18b9, /* 6329 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_PPE_DATA_DEST_1", .pme_desc = "PPE is data destination.", .pme_code = 0x18ba, /* 6330 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE1_DATA_DEST_1", .pme_desc = "SPE 1 is data destination.", .pme_code = 0x18bb, /* 6331 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_IOIF0_DATA_REQ_PENDING_2", .pme_desc = "BIC(IOIF0) data request pending.", .pme_code = 0x1900, /* 6400 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, 
.pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE6_DATA_REQ_PENDING_2", .pme_desc = "SPE 6 data request pending.", .pme_code = 0x1901, /* 6401 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE4_DATA_REQ_PENDING_2", .pme_desc = "SPE 4 data request pending.", .pme_code = 0x1902, /* 6402 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE2_DATA_REQ_PENDING_2", .pme_desc = "SPE 2 data request pending.", .pme_code = 0x1903, /* 6403 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE0_DATA_REQ_PENDING_2", .pme_desc = "SPE 0 data request pending.", .pme_code = 0x1904, /* 6404 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_MIC_DATA_REQ_PENDING_2", .pme_desc = "MIC data request pending.", .pme_code = 0x1905, /* 6405 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_PPE_DATA_REQ_PENDING_2", .pme_desc = "PPE data request pending.", .pme_code = 0x1906, /* 6406 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE1_DATA_REQ_PENDING_2", .pme_desc = "SPE 1 data request pending.", .pme_code = 0x1907, /* 6407 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE3_DATA_REQ_PENDING_2", .pme_desc = "SPE 3 data request pending.", .pme_code = 0x1908, /* 6408 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE5_DATA_REQ_PENDING_2", .pme_desc = "SPE 5 data request pending.", .pme_code = 0x1909, /* 6409 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = 
PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_SPE7_DATA_REQ_PENDING_2", .pme_desc = "SPE 7 data request pending.", .pme_code = 0x190a, /* 6410 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_IOIF1_DATA_REQ_PENDING_2", .pme_desc = "IOIF1 data request pending.", .pme_code = 0x190b, /* 6411 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "EIB_IOIF0_DATA_DEST_2", .pme_desc = "IOIF0 is data destination.", .pme_code = 0x190c, /* 6412 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE6_DATA_DEST_2", .pme_desc = "SPE 6 is data destination.", .pme_code = 0x190d, /* 6413 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE4_DATA_DEST_2", .pme_desc = "SPE 4 is data destination.", .pme_code = 0x190e, /* 6414 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE2_DATA_DEST_2", .pme_desc = "SPE 2 is data destination.", .pme_code = 0x190f, /* 6415 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE0_DATA_DEST_2", .pme_desc = "SPE 0 is data destination.", .pme_code = 0x1910, /* 6416 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_MIC_DATA_DEST_2", .pme_desc = "MIC is data destination.", .pme_code = 0x1911, /* 6417 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_PPE_DATA_DEST_2", .pme_desc = "PPE is data destination.", .pme_code = 0x1912, /* 6418 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = 
COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE1_DATA_DEST_2", .pme_desc = "SPE 1 is data destination.", .pme_code = 0x1913, /* 6419 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE3_DATA_DEST_2", .pme_desc = "SPE 3 is data destination.", .pme_code = 0x1914, /* 6420 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE5_DATA_DEST_2", .pme_desc = "SPE 5 is data destination.", .pme_code = 0x1915, /* 6421 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_SPE7_DATA_DEST_2", .pme_desc = "SPE 7 is data destination.", .pme_code = 0x1916, /* 6422 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_IOIF1_DATA_DEST_2", .pme_desc = "IOIF1 is data destination.", .pme_code = 0x1917, /* 6423 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GRANT_DATA_RING0_2", .pme_desc = "Grant on data ring 0.", .pme_code = 0x1918, /* 6424 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GRANT_DATA_RING1_2", .pme_desc = "Grant on data ring 1.", .pme_code = 0x1919, /* 6425 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GRANT_DATA_RING2_2", .pme_desc = "Grant on data ring 2.", .pme_code = 0x191a, /* 6426 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "EIB_GRANT_DATA_RING3_2", .pme_desc = "Grant on data ring 3.", .pme_code = 0x191b, /* 6427 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = 
"EIB_ALL_DATA_RINGS_IDLE_2", .pme_desc = "All data rings are idle.", .pme_code = 0x191c, /* 6428 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_ONE_DATA_RING_BUSY_2", .pme_desc = "One data ring is busy.", .pme_code = 0x191d, /* 6429 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TWO_OR_THREE_DATA_RINGS_BUSY_2", .pme_desc = "Two or three data rings are busy.", .pme_code = 0x191e, /* 6430 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_ALL_DATA_RINGS_BUSY_2", .pme_desc = "All four data rings are busy.", .pme_code = 0x191f, /* 6431 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_XIO_UNUSED", .pme_desc = "Even XIO token unused by RAG 0.", .pme_code = 0xfe4c, /* 65100 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_XIO_UNUSED", .pme_desc = "Odd XIO token unused by RAG 0.", .pme_code = 0xfe4d, /* 65101 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_BANK_UNUSED", .pme_desc = "Even bank token unused by RAG 0.", .pme_code = 0xfe4e, /* 65102 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_BANK_UNUSED", .pme_desc = "Odd bank token unused by RAG 0.", .pme_code = 0xfe4f, /* 65103 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_SPE0", .pme_desc = "Token granted for SPE 0.", .pme_code = 0xfe54, /* 65108 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, 
.pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_SPE1", .pme_desc = "Token granted for SPE 1.", .pme_code = 0xfe55, /* 65109 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_SPE2", .pme_desc = "Token granted for SPE 2.", .pme_code = 0xfe56, /* 65110 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_SPE3", .pme_desc = "Token granted for SPE 3.", .pme_code = 0xfe57, /* 65111 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_SPE4", .pme_desc = "Token granted for SPE 4.", .pme_code = 0xfe58, /* 65112 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_SPE5", .pme_desc = "Token granted for SPE 5.", .pme_code = 0xfe59, /* 65113 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_SPE6", .pme_desc = "Token granted for SPE 6.", .pme_code = 0xfe5a, /* 65114 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_SPE7", .pme_desc = "Token granted for SPE 7.", .pme_code = 0xfe5b, /* 65115 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_XIO_WASTED", .pme_desc = "Even XIO token wasted by RAG 0; valid only when Unused Enable (UE) = 1 in TKM_CR register.", .pme_code = 0xfeb0, /* 65200 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_XIO_WASTED", .pme_desc = "Odd XIO token wasted by RAG 0; valid only when Unused Enable (UE) = 1 in 
TKM_CR register.", .pme_code = 0xfeb1, /* 65201 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_BANK_WASTED", .pme_desc = "Even bank token wasted by RAG 0; valid only when Unused Enable (UE) = 1 in TKM_CR register.", .pme_code = 0xfeb2, /* 65202 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_BANK_WASTED", .pme_desc = "Odd bank token wasted by RAG 0; valid only when Unused Enable (UE) = 1 in TKM_CR register.", .pme_code = 0xfeb3, /* 65203 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAGU_E_XIO_WASTED", .pme_desc = "Even XIO token wasted by RAG U.", .pme_code = 0xfebc, /* 65212 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAGU_O_XIO_WASTED", .pme_desc = "Odd XIO token wasted by RAG U.", .pme_code = 0xfebd, /* 65213 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAGU_E_BANK_WASTED", .pme_desc = "Even bank token wasted by RAG U.", .pme_code = 0xfebe, /* 65214 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAGU_O_BANK_WASTED", .pme_desc = "Odd bank token wasted by RAG U.", .pme_code = 0xfebf, /* 65215 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_XIO_RAG1", .pme_desc = "Even XIO token from RAG 0 shared with RAG 1", .pme_code = 0xff14, /* 65300 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_XIO_RAG2", .pme_desc = "Even XIO token from RAG 0 shared with RAG 2", .pme_code = 0xff15, 
/* 65301 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_XIO_RAG3", .pme_desc = "Even XIO token from RAG 0 shared with RAG 3", .pme_code = 0xff16, /* 65302 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_XIO_RAG1", .pme_desc = "Odd XIO token from RAG 0 shared with RAG 1", .pme_code = 0xff17, /* 65303 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_XIO_RAG2", .pme_desc = "Odd XIO token from RAG 0 shared with RAG 2", .pme_code = 0xff18, /* 65304 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_XIO_RAG3", .pme_desc = "Odd XIO token from RAG 0 shared with RAG 3", .pme_code = 0xff19, /* 65305 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_BANK_RAG1", .pme_desc = "Even bank token from RAG 0 shared with RAG 1", .pme_code = 0xff1a, /* 65306 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_BANK_RAG2", .pme_desc = "Even bank token from RAG 0 shared with RAG 2", .pme_code = 0xff1b, /* 65307 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_E_BANK_RAG3", .pme_desc = "Even bank token from RAG 0 shared with RAG 3", .pme_code = 0xff1c, /* 65308 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_BANK_RAG1", .pme_desc = "Odd bank token from RAG 0 shared with RAG 1", .pme_code = 0xff1d, /* 65309 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = 
COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_BANK_RAG2", .pme_desc = "Odd bank token from RAG 0 shared with RAG 2", .pme_code = 0xff1e, /* 65310 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_O_BANK_RAG3", .pme_desc = "Odd bank token from RAG 0 shared with RAG 3", .pme_code = 0xff1f, /* 65311 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_XIO_UNUSED", .pme_desc = "Even XIO token was unused by RAG 1.", .pme_code = 0xff88, /* 65416 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_XIO_UNUSED", .pme_desc = "Odd XIO token was unused by RAG 1.", .pme_code = 0xff89, /* 65417 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_BANK_UNUSED", .pme_desc = "Even bank token was unused by RAG 1.", .pme_code = 0xff8a, /* 65418 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_BANK_UNUSED", .pme_desc = "Odd bank token was unused by RAG 1.", .pme_code = 0xff8b, /* 65419 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_IOC0", .pme_desc = "Token was granted for IOC0.", .pme_code = 0xff91, /* 65425 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_TOKEN_GRANTED_IOC1", .pme_desc = "Token was granted for IOC1.", .pme_code = 0xff92, /* 65426 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_XIO_WASTED", .pme_desc = "Even XIO token was wasted by RAG 1. 
This is valid only when UE = 1 in TKM_CR.", .pme_code = 0xffec, /* 65516 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_XIO_WASTED", .pme_desc = "Odd XIO token was wasted by RAG 1. This is valid only when UE = 1 in TKM_CR.", .pme_code = 0xffed, /* 65517 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_BANK_WASTED", .pme_desc = "Even bank token was wasted by RAG 1. This is valid only when UE = 1 in TKM_CR.", .pme_code = 0xffee, /* 65518 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_BANK_WASTED", .pme_desc = "Odd bank token was wasted by RAG 1. This is valid only when UE = 1 in TKM_CR.", .pme_code = 0xffef, /* 65519 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_XIO_RAG0", .pme_desc = "Even XIO token from RAG 1 shared with RAG 0", .pme_code = 0x10050, /* 65616 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_XIO_RAG2", .pme_desc = "Even XIO token from RAG 1 shared with RAG 2", .pme_code = 0x10051, /* 65617 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_XIO_RAG3", .pme_desc = "Even XIO token from RAG 1 shared with RAG 3", .pme_code = 0x10052, /* 65618 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_XIO_RAG0", .pme_desc = "Odd XIO token from RAG 1 shared with RAG 0", .pme_code = 0x10053, /* 65619 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_XIO_RAG2", 
.pme_desc = "Odd XIO token from RAG 1 shared with RAG 2", .pme_code = 0x10054, /* 65620 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_XIO_RAG3", .pme_desc = "Odd XIO token from RAG 1 shared with RAG 3", .pme_code = 0x10055, /* 65621 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_BANK_RAG0", .pme_desc = "Even bank token from RAG 1 shared with RAG 0", .pme_code = 0x10056, /* 65622 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_BANK_RAG2", .pme_desc = "Even bank token from RAG 1 shared with RAG 2", .pme_code = 0x10057, /* 65623 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_E_BANK_RAG3", .pme_desc = "Even bank token from RAG 1 shared with RAG 3", .pme_code = 0x10058, /* 65624 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_BANK_RAG0", .pme_desc = "Odd bank token from RAG 1 shared with RAG 0", .pme_code = 0x10059, /* 65625 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_BANK_RAG2", .pme_desc = "Odd bank token from RAG 1 shared with RAG 2", .pme_code = 0x1005a, /* 65626 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG1_O_BANK_RAG3", .pme_desc = "Odd bank token from RAG 1 shared with RAG 3", .pme_code = 0x1005b, /* 65627 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAGU_E_XIO_RAG1", .pme_desc = "Even XIO token from RAG U shared with RAG 1", .pme_code = 0x1005c, /* 65628 
*/ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAGU_O_XIO_RAG1", .pme_desc = "Odd XIO token from RAG U shared with RAG 1", .pme_code = 0x1005d, /* 65629 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAGU_E_BANK_RAG1", .pme_desc = "Even bank token from RAG U shared with RAG 1", .pme_code = 0x1005e, /* 65630 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAGU_O_BANK_RAG1", .pme_desc = "Odd bank token from RAG U shared with RAG 1", .pme_code = 0x1005f, /* 65631 */ .pme_enable_word = WORD_0_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_XIO_UNUSED", .pme_desc = "Even XIO token unused by RAG 2", .pme_code = 0x100e4, /* 65764 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_XIO_UNUSED", .pme_desc = "Odd XIO token unused by RAG 2", .pme_code = 0x100e5, /* 65765 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_BANK_UNUSED", .pme_desc = "Even bank token unused by RAG 2", .pme_code = 0x100e6, /* 65766 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_BANK_UNUSED", .pme_desc = "Odd bank token unused by RAG 2", .pme_code = 0x100e7, /* 65767 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_IOIF0_IN_TOKEN_UNUSED", .pme_desc = "IOIF0 In token unused by RAG 0", .pme_code = 0x100e8, /* 65768 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = 
"EIB_RAG0_IOIF0_OUT_TOKEN_UNUSED", .pme_desc = "IOIF0 Out token unused by RAG 0", .pme_code = 0x100e9, /* 65769 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_IOIF1_IN_TOKEN_UNUSED", .pme_desc = "IOIF1 In token unused by RAG 0", .pme_code = 0x100ea, /* 65770 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_IOIF1_OUT_TOKEN_UNUSED", .pme_desc = "IOIF1 Out token unused by RAG 0", .pme_code = 0x100eb, /* 65771 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_XIO_WASTED", .pme_desc = "Even XIO token wasted by RAG 2", .pme_code = 0x10148, /* 65864 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_XIO_WASTED", .pme_desc = "Odd XIO token wasted by RAG 2", .pme_code = 0x10149, /* 65865 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_BANK_WASTED", .pme_desc = "Even bank token wasted by RAG 2", .pme_code = 0x1014a, /* 65866 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_BANK_WASTED", .pme_desc = "Odd bank token wasted by RAG 2", .pme_code = 0x1014b, /* 65867 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_XIO_RAG0", .pme_desc = "Even XIO token from RAG 2 shared with RAG 0", .pme_code = 0x101ac, /* 65964 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_XIO_RAG1", .pme_desc = "Even XIO token from RAG 2 shared with RAG 1", .pme_code = 0x101ad, /* 65965 */ .pme_enable_word = 
WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_XIO_RAG3", .pme_desc = "Even XIO token from RAG 2 shared with RAG 3", .pme_code = 0x101ae, /* 65966 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_XIO_RAG0", .pme_desc = "Odd XIO token from RAG 2 shared with RAG 0", .pme_code = 0x101af, /* 65967 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_XIO_RAG1", .pme_desc = "Odd XIO token from RAG 2 shared with RAG 1", .pme_code = 0x101b0, /* 65968 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_XIO_RAG3", .pme_desc = "Odd XIO token from RAG 2 shared with RAG 3", .pme_code = 0x101b1, /* 65969 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_BANK_RAG0", .pme_desc = "Even bank token from RAG 2 shared with RAG 0", .pme_code = 0x101b2, /* 65970 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_BANK_RAG1", .pme_desc = "Even bank token from RAG 2 shared with RAG 1", .pme_code = 0x101b3, /* 65971 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_E_BANK_RAG3", .pme_desc = "Even bank token from RAG 2 shared with RAG 3", .pme_code = 0x101b4, /* 65972 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_BANK_RAG0", .pme_desc = "Odd bank token from RAG 2 shared with RAG 0", .pme_code = 0x101b5, /* 65973 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, 
{.pme_name = "EIB_RAG2_O_BANK_RAG1", .pme_desc = "Odd bank token from RAG 2 shared with RAG 1", .pme_code = 0x101b6, /* 65974 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG2_O_BANK_RAG3", .pme_desc = "Odd bank token from RAG 2 shared with RAG 3", .pme_code = 0x101b7, /* 65975 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_IOIF0_IN_TOKEN_WASTED", .pme_desc = "IOIF0 In token wasted by RAG 0", .pme_code = 0x9ef38, /* 651064 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_IOIF0_OUT_TOKEN_WASTED", .pme_desc = "IOIF0 Out token wasted by RAG 0", .pme_code = 0x9ef39, /* 651065 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_IOIF1_IN_TOKEN_WASTED", .pme_desc = "IOIF1 In token wasted by RAG 0", .pme_code = 0x9ef3a, /* 651066 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG0_IOIF1_OUT_TOKEN_WASTED", .pme_desc = "IOIF1 Out token wasted by RAG 0", .pme_code = 0x9ef3b, /* 651067 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_XIO_UNUSED", .pme_desc = "Even XIO token was unused by RAG 3.", .pme_code = 0x9efac, /* 651180 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_XIO_UNUSED", .pme_desc = "Odd XIO token was unused by RAG 3.", .pme_code = 0x9efad, /* 651181 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_BANK_UNUSED", .pme_desc = "Even bank token was unused by RAG 3.", .pme_code = 
0x9efae, /* 651182 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_BANK_UNUSED", .pme_desc = "Odd bank token was unused by RAG 3.", .pme_code = 0x9efaf, /* 651183 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_XIO_WASTED", .pme_desc = "Even XIO token wasted by RAG 3", .pme_code = 0x9f010, /* 651280 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_XIO_WASTED", .pme_desc = "Odd XIO token wasted by RAG 3", .pme_code = 0x9f011, /* 651281 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_BANK_WASTED", .pme_desc = "Even bank token wasted by RAG 3", .pme_code = 0x9f012, /* 651282 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_BANK_WASTED", .pme_desc = "Odd bank token wasted by RAG 3", .pme_code = 0x9f013, /* 651283 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_XIO_RAG0", .pme_desc = "Even XIO token from RAG 3 shared with RAG 0", .pme_code = 0x9f074, /* 651380 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_XIO_RAG1", .pme_desc = "Even XIO token from RAG 3 shared with RAG 1", .pme_code = 0x9f075, /* 651381 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_XIO_RAG2", .pme_desc = "Even XIO token from RAG 3 shared with RAG 2", .pme_code = 0x9f076, /* 651382 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, 
{.pme_name = "EIB_RAG3_O_XIO_RAG0", .pme_desc = "Odd XIO token from RAG 3 shared with RAG 0", .pme_code = 0x9f077, /* 651383 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_XIO_RAG1", .pme_desc = "Odd XIO token from RAG 3 shared with RAG 1", .pme_code = 0x9f078, /* 651384 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_XIO_RAG2", .pme_desc = "Odd XIO token from RAG 3 shared with RAG 2", .pme_code = 0x9f079, /* 651385 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_BANK_RAG0", .pme_desc = "Even bank token from RAG 3 shared with RAG 0", .pme_code = 0x9f07a, /* 651386 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_BANK_RAG1", .pme_desc = "Even bank token from RAG 3 shared with RAG 1", .pme_code = 0x9f07b, /* 651387 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_E_BANK_RAG2", .pme_desc = "Even bank token from RAG 3 shared with RAG 2", .pme_code = 0x9f07c, /* 651388 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_BANK_RAG0", .pme_desc = "Odd bank token from RAG 3 shared with RAG 0", .pme_code = 0x9f07d, /* 651389 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_BANK_RAG1", .pme_desc = "Odd bank token from RAG 3 shared with RAG 1", .pme_code = 0x9f07e, /* 651390 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "EIB_RAG3_O_BANK_RAG2", .pme_desc = "Odd bank token from RAG 3 shared 
with RAG 2", .pme_code = 0x9f07f, /* 651391 */ .pme_enable_word = WORD_2_ONLY, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_READ_CMD_Q_EMPTY", .pme_desc = "XIO1 - Read command queue is empty.", .pme_code = 0x1bc5, /* 7109 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_WRITE_CMD_Q_EMPTY", .pme_desc = "XIO1 - Write command queue is empty.", .pme_code = 0x1bc6, /* 7110 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_READ_CMD_Q_FULL", .pme_desc = "XIO1 - Read command queue is full.", .pme_code = 0x1bc8, /* 7112 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_RESPONDS_READ_RETRY", .pme_desc = "XIO1 - MIC responds with a Retry for a read command because the read command queue is full.", .pme_code = 0x1bc9, /* 7113 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_WRITE_CMD_Q_FULL", .pme_desc = "XIO1 - Write command queue is full.", .pme_code = 0x1bca, /* 7114 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_RESPONDS_WRITE_RETRY", .pme_desc = "XIO1 - MIC responds with a Retry for a write command because the write command queue is full.", .pme_code = 0x1bcb, /* 7115 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_READ_CMD_DISPATCHED", .pme_desc = "XIO1 - Read command dispatched; includes high-priority and fast-path reads.", .pme_code = 0x1bde, /* 7134 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_WRITE_CMD_DISPATCHED", .pme_desc = "XIO1 - Write command dispatched.", 
.pme_code = 0x1bdf, /* 7135 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_READ_MOD_WRITE_CMD_DISPATCHED", .pme_desc = "XIO1 - Read-Modify-Write command (data size < 16 bytes) dispatched.", .pme_code = 0x1be0, /* 7136 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_REFRESH_DISPATCHED", .pme_desc = "XIO1 - Refresh dispatched.", .pme_code = 0x1be1, /* 7137 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_BYTE_MSK_WRITE_CMD_DISPATCHED", .pme_desc = "XIO1 - Byte-masking write command (data size >= 16 bytes) dispatched.", .pme_code = 0x1be3, /* 7139 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_WRITE_CMD_DISPATCHED_AFTER_READ", .pme_desc = "XIO1 - Write command dispatched after a read command was previously dispatched.", .pme_code = 0x1be5, /* 7141 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO1_READ_CMD_DISPATCHED_AFTER_WRITE", .pme_desc = "XIO1 - Read command dispatched after a write command was previously dispatched.", .pme_code = 0x1be6, /* 7142 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_READ_CMD_Q_EMPTY", .pme_desc = "XIO0 - Read command queue is empty.", .pme_code = 0x1c29, /* 7209 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_WRITE_CMD_Q_EMPTY", .pme_desc = "XIO0 - Write command queue is empty.", .pme_code = 0x1c2a, /* 7210 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_READ_CMD_Q_FULL", .pme_desc = "XIO0 - Read command queue 
is full.", .pme_code = 0x1c2c, /* 7212 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_RESPONDS_READ_RETRY", .pme_desc = "XIO0 - MIC responds with a Retry for a read command because the read command queue is full.", .pme_code = 0x1c2d, /* 7213 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_WRITE_CMD_Q_FULL", .pme_desc = "XIO0 - Write command queue is full.", .pme_code = 0x1c2e, /* 7214 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_RESPONDS_WRITE_RETRY", .pme_desc = "XIO0 - MIC responds with a Retry for a write command because the write command queue is full.", .pme_code = 0x1c2f, /* 7215 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_READ_CMD_DISPATCHED", .pme_desc = "XIO0 - Read command dispatched; includes high-priority and fast-path reads.", .pme_code = 0x1c42, /* 7234 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_WRITE_CMD_DISPATCHED", .pme_desc = "XIO0 - Write command dispatched.", .pme_code = 0x1c43, /* 7235 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_READ_MOD_WRITE_CMD_DISPATCHED", .pme_desc = "XIO0 - Read-Modify-Write command (data size < 16 bytes) dispatched.", .pme_code = 0x1c44, /* 7236 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_REFRESH_DISPATCHED", .pme_desc = "XIO0 - Refresh dispatched.", .pme_code = 0x1c45, /* 7237 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_WRITE_CMD_DISPATCHED_AFTER_READ", .pme_desc = 
"XIO0 - Write command dispatched after a read command was previously dispatched.", .pme_code = 0x1c49, /* 7241 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_READ_CMD_DISPATCHED_AFTER_WRITE", .pme_desc = "XIO0 - Read command dispatched after a write command was previously dispatched.", .pme_code = 0x1c4a, /* 7242 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_WRITE_CMD_DISPATCHED_2", .pme_desc = "XIO0 - Write command dispatched.", .pme_code = 0x1ca7, /* 7335 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_READ_MOD_WRITE_CMD_DISPATCHED_2", .pme_desc = "XIO0 - Read-Modify-Write command (data size < 16 bytes) dispatched.", .pme_code = 0x1ca8, /* 7336 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_REFRESH_DISPATCHED_2", .pme_desc = "XIO0 - Refresh dispatched.", .pme_code = 0x1ca9, /* 7337 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "MIC_XIO0_BYTE_MSK_WRITE_CMD_DISPATCHED", .pme_desc = "XIO0 - Byte-masking write command (data size >= 16 bytes) dispatched.", .pme_code = 0x1cab, /* 7339 */ .pme_enable_word = 0xF, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_TYPEA_DATA_PLG", .pme_desc = "Type A data physical layer group (PLG). Does not include header-only or credit-only data PLGs. In IOIF mode, counts I/O device read data; in BIF mode, counts all outbound data.", .pme_code = 0x1fb0, /* 8112 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_TYPEB_DATA_PLG", .pme_desc = "Type B data PLG. 
In IOIF mode, counts I/O device read data; in BIF mode, counts all outbound data.", .pme_code = 0x1fb1, /* 8113 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_IOIF_TYPEA_DATA_PLG", .pme_desc = "Type A data PLG. Does not include header-only or credit-only PLGs. In IOIF mode, counts CBE store data to I/O device. Does not apply in BIF mode.", .pme_code = 0x1fb2, /* 8114 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_IOIF_TYPEB_DATA_PLG", .pme_desc = "Type B data PLG. In IOIF mode, counts CBE store data to an I/O device. Does not apply in BIF mode.", .pme_code = 0x1fb3, /* 8115 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_DATA_PLG", .pme_desc = "Data PLG. Does not include header-only or credit-only PLGs.", .pme_code = 0x1fb4, /* 8116 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_CMD_PLG", .pme_desc = "Command PLG (no credit-only PLG). In IOIF mode, counts I/O command or reply PLGs. In BIF mode, counts command/ reflected command or snoop/combined responses.", .pme_code = 0x1fb5, /* 8117 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_TYPEA_TRANSFER", .pme_desc = "Type A data transfer regardless of length. 
Can also be used to count Type A data header PLGs (but not credit-only PLGs).", .pme_code = 0x1fb6, /* 8118 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_TYPEB_TRANSFER", .pme_desc = "Type B data transfer.", .pme_code = 0x1fb7, /* 8119 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_CMD_GREDIT_ONLY_PLG", .pme_desc = "Command-credit-only command PLG in either IOIF or BIF mode.", .pme_code = 0x1fb8, /* 8120 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_DATA_CREDIT_ONLY_PLG", .pme_desc = "Data-credit-only data PLG sent in either IOIF or BIF mode.", .pme_code = 0x1fb9, /* 8121 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_NON_NULL_ENVLP_SENT", .pme_desc = "Non-null envelope sent (does not include long envelopes).", .pme_code = 0x1fba, /* 8122 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_NULL_ENVLP_SENT", .pme_desc = "Null envelope sent.", .pme_code = 0x1fbc, /* 8124 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF0_NO_VALID_DATA_SENT", .pme_desc = "No valid data sent this cycle.", .pme_code = 0x1fbd, /* 8125 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF0_NORMAL_ENVLP_SENT", .pme_desc = "Normal envelope sent.", .pme_code = 0x1fbe, /* 8126 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF0_LONG_ENVLP_SENT", .pme_desc = "Long envelope sent.", .pme_code = 0x1fbf, /* 8127 */ 
.pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF0_NULL_PLG_INSERTED", .pme_desc = "A Null PLG inserted in an outgoing envelope.", .pme_code = 0x1fc0, /* 8128 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF0_OUTBOUND_ENV_ARRAY_FULL", .pme_desc = "Outbound envelope array is full.", .pme_code = 0x1fc1, /* 8129 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_TYPEB_TRANSFER", .pme_desc = "Type B data transfer.", .pme_code = 0x201b, /* 8219 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_NULL_ENVLP_RECV", .pme_desc = "Null envelope received.", .pme_code = 0x206d, /* 8301 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF0_CMD_PLG_2", .pme_desc = "Command PLG, but not credit-only PLG. In IOIF mode, counts I/O command or reply PLGs. 
In BIF mode, counts command/reflected command or snoop/combined responses.", .pme_code = 0x207a, /* 8314 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_CMD_GREDIT_ONLY_PLG_2", .pme_desc = "Command-credit-only command PLG.", .pme_code = 0x207b, /* 8315 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_NORMAL_ENVLP_RECV", .pme_desc = "Normal envelope received is good.", .pme_code = 0x2080, /* 8320 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF0_LONG_ENVLP_RECV", .pme_desc = "Long envelope received is good.", .pme_code = 0x2081, /* 8321 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF0_DATA_GREDIT_ONLY_PLG_2", .pme_desc = "Data-credit-only data PLG in either IOIF or BIF mode; will count a maximum of one per envelope.", .pme_code = 0x2082, /* 8322 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_NON_NULL_ENVLP", .pme_desc = "Non-null envelope; does not include long envelopes; includes retried envelopes.", .pme_code = 0x2083, /* 8323 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_DATA_GRANT_RECV", .pme_desc = "Data grant received.", .pme_code = 0x2084, /* 8324 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_DATA_PLG_2", .pme_desc = "Data PLG. 
Does not include header-only or credit-only PLGs.", .pme_code = 0x2088, /* 8328 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_TYPEA_TRANSFER_2", .pme_desc = "Type A data transfer regardless of length. Can also be used to count Type A data header PLGs, but not credit-only PLGs.", .pme_code = 0x2089, /* 8329 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF0_TYPEB_TRANSFER_2", .pme_desc = "Type B data transfer.", .pme_code = 0x208a, /* 8330 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_NULL_ENVLP_RECV", .pme_desc = "Null envelope received.", .pme_code = 0x20d1, /* 8401 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF1_CMD_PLG_2", .pme_desc = "Command PLG (no credit-only PLG). 
Counts I/O command or reply PLGs.", .pme_code = 0x20de, /* 8414 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_CMD_GREDIT_ONLY_PLG_2", .pme_desc = "Command-credit-only command PLG.", .pme_code = 0x20df, /* 8415 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_NORMAL_ENVLP_RECV", .pme_desc = "Normal envelope received is good.", .pme_code = 0x20e4, /* 8420 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF1_LONG_ENVLP_RECV", .pme_desc = "Long envelope received is good.", .pme_code = 0x20e5, /* 8421 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "BIF_IOIF1_DATA_GREDIT_ONLY_PLG_2", .pme_desc = "Data-credit-only data PLG received; will count a maximum of one per envelope.", .pme_code = 0x20e6, /* 8422 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_NON_NULL_ENVLP", .pme_desc = "Non-Null envelope received; does not include long envelopes; includes retried envelopes.", .pme_code = 0x20e7, /* 8423 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_DATA_GRANT_RECV", .pme_desc = "Data grant received.", .pme_code = 0x20e8, /* 8424 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_DATA_PLG_2", .pme_desc = "Data PLG received. 
Does not include header-only or credit-only PLGs.", .pme_code = 0x20ec, /* 8428 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_TYPEA_TRANSFER_2", .pme_desc = "Type I A data transfer regardless of length. Can also be used to count Type A data header PLGs (but not credit-only PLGs).", .pme_code = 0x20ed, /* 8429 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "BIF_IOIF1_TYPEB_TRANSFER_2", .pme_desc = "Type B data transfer received.", .pme_code = 0x20ee, /* 8430 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_MMIO_READ_IOIF1", .pme_desc = "Received MMIO read targeted to IOIF1.", .pme_code = 0x213c, /* 8508 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_MMIO_WRITE_IOIF1", .pme_desc = "Received MMIO write targeted to IOIF1.", .pme_code = 0x213d, /* 8509 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_MMIO_READ_IOIF0", .pme_desc = "Received MMIO read targeted to IOIF0.", .pme_code = 0x213e, /* 8510 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_MMIO_WRITE_IOIF0", .pme_desc = "Received MMIO write targeted to IOIF0.", .pme_code = 0x213f, /* 8511 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_CMD_TO_IOIF0", .pme_desc = "Sent command to IOIF0.", .pme_code = 0x2140, /* 8512 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_CMD_TO_IOIF1", .pme_desc = "Sent command to IOIF1.", .pme_code = 0x2141, /* 8513 */ .pme_enable_word = 
WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOIF0_MATRIX3_OCCUPIED", .pme_desc = "IOIF0 Dependency Matrix 3 is occupied by a dependent command.", .pme_code = 0x219d, /* 8605 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "IOC_IOIF0_MATRIX4_OCCUPIED", .pme_desc = "IOIF0 Dependency Matrix 4 is occupied by a dependent command.", .pme_code = 0x219e, /* 8606 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "IOC_IOIF0_MATRIX5_OCCUPIED", .pme_desc = "IOIF0 Dependency Matrix 5 is occupied by a dependent command.", .pme_code = 0x219f, /* 8607 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_BOTH_TYPE, }, {.pme_name = "IOC_DMA_READ_IOIF0", .pme_desc = "Received read request from IOIF0.", .pme_code = 0x21a2, /* 8610 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_DMA_WRITE_IOIF0", .pme_desc = "Received write request from IOIF0.", .pme_code = 0x21a3, /* 8611 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_INTERRUPT_IOIF0", .pme_desc = "Received interrupt from the IOIF0.", .pme_code = 0x21a6, /* 8614 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_IOIF0_REQ_TOKEN_E_MEM", .pme_desc = "IOIF0 request for token for even memory banks 0-14.", .pme_code = 0x220c, /* 8716 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOIF0_REQ_TOKEN_O_MEM", .pme_desc = "IOIF0 request for token for odd memory banks 1-15.", .pme_code = 0x220d, /* 8717 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, 
.pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOIF0_REQ_TOKEN_1357", .pme_desc = "IOIF0 request for token type 1, 3, 5, or 7.", .pme_code = 0x220e, /* 8718 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOIF0_REQ_TOKEN_9111315", .pme_desc = "IOIF0 request for token type 9, 11, 13, or 15.", .pme_code = 0x220f, /* 8719 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOIF0_REQ_TOKEN_16", .pme_desc = "IOIF0 request for token type 16.", .pme_code = 0x2214, /* 8724 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOIF0_REQ_TOKEN_17", .pme_desc = "IOIF0 request for token type 17.", .pme_code = 0x2215, /* 8725 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOIF0_REQ_TOKEN_18", .pme_desc = "IOIF0 request for token type 18.", .pme_code = 0x2216, /* 8726 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOIF0_REQ_TOKEN_19", .pme_desc = "IOIF0 request for token type 19.", .pme_code = 0x2217, /* 8727 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_CUMULATIVE_LEN, }, {.pme_name = "IOC_IOPT_CACHE_HIT", .pme_desc = "I/O page table cache hit for commands from IOIF.", .pme_code = 0x2260, /* 8800 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_IOPT_CACHE_MISS", .pme_desc = "I/O page table cache miss for commands from IOIF.", .pme_code = 0x2261, /* 8801 */ .pme_enable_word = WORD_0_AND_2, .pme_freq = PFM_CELL_PME_FREQ_HALF, .pme_type = COUNT_TYPE_OCCURRENCE, }, {.pme_name = "IOC_IOST_CACHE_HIT", .pme_desc = "I/O segment table cache 
hit.",
	 .pme_code = 0x2263, /* 8803 */
	 .pme_enable_word = WORD_0_AND_2,
	 .pme_freq = PFM_CELL_PME_FREQ_HALF,
	 .pme_type = COUNT_TYPE_OCCURRENCE,
	},
	{.pme_name = "IOC_IOST_CACHE_MISS",
	 .pme_desc = "I/O segment table cache miss.",
	 .pme_code = 0x2264, /* 8804 */
	 .pme_enable_word = WORD_0_AND_2,
	 .pme_freq = PFM_CELL_PME_FREQ_HALF,
	 .pme_type = COUNT_TYPE_OCCURRENCE,
	},
	{.pme_name = "IOC_INTERRUPT_FROM_SPU",
	 .pme_desc = "Interrupt received from any SPU (reflected cmd when IIC has sent ACK response).",
	 .pme_code = 0x2278, /* 8824 */
	 .pme_enable_word = WORD_0_AND_2,
	 .pme_freq = PFM_CELL_PME_FREQ_HALF,
	 .pme_type = COUNT_TYPE_OCCURRENCE,
	},
	{.pme_name = "IOC_IIC_INTERRUPT_TO_PPU_TH0",
	 .pme_desc = "Internal interrupt controller (IIC) generated interrupt to PPU thread 0.",
	 .pme_code = 0x2279, /* 8825 */
	 .pme_enable_word = WORD_0_AND_2,
	 .pme_freq = PFM_CELL_PME_FREQ_HALF,
	 .pme_type = COUNT_TYPE_OCCURRENCE,
	},
	{.pme_name = "IOC_IIC_INTERRUPT_TO_PPU_TH1",
	 .pme_desc = "IIC generated interrupt to PPU thread 1.",
	 .pme_code = 0x227a, /* 8826 */
	 .pme_enable_word = WORD_0_AND_2,
	 .pme_freq = PFM_CELL_PME_FREQ_HALF,
	 .pme_type = COUNT_TYPE_OCCURRENCE,
	},
	{.pme_name = "IOC_RECV_EXTERNAL_INTERRUPT_TO_TH0",
	 .pme_desc = "Received external interrupt (using MMIO) from PPU to PPU thread 0.",
	 .pme_code = 0x227b, /* 8827 */
	 .pme_enable_word = WORD_0_AND_2,
	 .pme_freq = PFM_CELL_PME_FREQ_HALF,
	 .pme_type = COUNT_TYPE_OCCURRENCE,
	},
	{.pme_name = "IOC_RECV_EXTERNAL_INTERRUPT_TO_TH1",
	 .pme_desc = "Received external interrupt (using MMIO) from PPU to PPU thread 1.",
	 .pme_code = 0x227c, /* 8828 */
	 .pme_enable_word = WORD_0_AND_2,
	 .pme_freq = PFM_CELL_PME_FREQ_HALF,
	 .pme_type = COUNT_TYPE_OCCURRENCE,
	},
};
/*--- The number of events : 435 ---*/
#define PME_CELL_EVENT_COUNT (sizeof(cell_pe)/sizeof(pme_cell_entry_t))

papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_adl_glc_events.h

/*
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
 *
 * PMU: adl_glc (Alderlake GoldenCove P-Core)
 * Based on Intel JSON event table version : 1.24
 * Based on Intel JSON event table published : 12/04/2023
 */
static const intel_x86_umask_t adl_glc_arith[]={
	{ .uname = "DIVIDER_ACTIVE",
	  .udesc = "This event is deprecated.
Refer to new event ARITH.DIV_ACTIVE", .ucode = 0x0900ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DIV_ACTIVE", .udesc = "Cycles when divide unit is busy executing divide or square root operations", .ucode = 0x0900ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "FPDIV_ACTIVE", .udesc = "TBD", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "FP_DIVIDER_ACTIVE", .udesc = "This event is deprecated. Refer to new event ARITH.FPDIV_ACTIVE", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "IDIV_ACTIVE", .udesc = "This event counts the cycles the integer divider is busy", .ucode = 0x0800ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "INT_DIVIDER_ACTIVE", .udesc = "This event is deprecated. 
Refer to new event ARITH.IDIV_ACTIVE", .ucode = 0x0800ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t adl_glc_assists[]={ { .uname = "ANY", .udesc = "Number of occurrences where a microcode assist is invoked by hardware", .ucode = 0x1b00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FP", .udesc = "Counts all microcode FP assists", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HARDWARE", .udesc = "TBD", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PAGE_FAULT", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SSE_AVX_MIX", .udesc = "TBD", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_baclears[]={ { .uname = "ANY", .udesc = "Clears due to Unknown Branches", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_br_inst_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "All branch instructions retired", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND", .udesc = "Conditional branch instructions retired", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_NTAKEN", .udesc = "Not taken branch instructions retired", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_TAKEN", .udesc = "Taken conditional branch instructions retired", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "FAR_BRANCH", .udesc = "Far branch instructions retired", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "INDIRECT", .udesc = "Indirect near branch instructions retired (excluding returns)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_CALL", .udesc = "Direct and indirect near call instructions retired", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { 
.uname = "NEAR_RETURN", .udesc = "Return instructions retired", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_TAKEN", .udesc = "Taken branch instructions retired", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t adl_glc_br_misp_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "All mispredicted branch instructions retired", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND", .udesc = "Mispredicted conditional branch instructions retired", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_NTAKEN", .udesc = "Mispredicted non-taken conditional branch instructions retired", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_TAKEN", .udesc = "Number of branch instructions retired that were mispredicted and taken", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "INDIRECT", .udesc = "Mispredicted near indirect branch instructions retired (excluding returns)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "INDIRECT_CALL", .udesc = "Mispredicted indirect CALL retired", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_TAKEN", .udesc = "Number of near branch instructions retired that were mispredicted and taken", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RET", .udesc = "This event counts the number of mispredicted ret instructions retired.
Non PEBS", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t adl_glc_core_power[]={ { .uname = "LICENSE_1", .udesc = "TBD", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LICENSE_2", .udesc = "TBD", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LICENSE_3", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_cpu_clk_unhalted[]={ { .uname = "C01", .udesc = "Core clocks when the thread is in the C0.1 light-weight slower wakeup time but more power saving optimized state", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "C02", .udesc = "Core clocks when the thread is in the C0.2 light-weight faster wakeup time but less power saving optimized state", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "C0_WAIT", .udesc = "Core clocks when the thread is in the C0.1 or C0.2 or running a PAUSE in C0 ACPI state", .ucode = 0x7000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DISTRIBUTED", .udesc = "Cycle counts are evenly distributed between active threads in the Core", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ONE_THREAD_ACTIVE", .udesc = "Core crystal clock cycles when this thread is unhalted and the other thread is halted", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PAUSE", .udesc = "TBD", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PAUSE_INST", .udesc = "TBD", .ucode = 0x4000ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "REF_DISTRIBUTED", .udesc = "Core crystal clock cycles. 
Cycle counts are evenly distributed between active threads in the Core", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REF_TSC", .udesc = "Reference cycles when the core is not in halt state", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REF_TSC_P", .udesc = "Reference cycles when the core is not in halt state", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "THREAD", .udesc = "Core cycles when the thread is not in halt state", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "THREAD_P", .udesc = "Thread cycles when thread is not in halt state", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_cycle_activity[]={ { .uname = "CYCLES_L1D_MISS", .udesc = "Cycles while L1 cache miss demand load is outstanding", .ucode = 0x0800ull | (0x8 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_L2_MISS", .udesc = "Cycles while L2 cache miss demand load is outstanding", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_MEM_ANY", .udesc = "Cycles while memory subsystem has an outstanding load", .ucode = 0x1000ull | (0x10 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L1D_MISS", .udesc = "Execution stalls while L1 cache miss demand load is outstanding", .ucode = 0x0c00ull | (0xc << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L2_MISS", .udesc = "Execution stalls while L2 cache miss demand load is outstanding", .ucode = 0x0500ull | (0x5 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L3_MISS", .udesc = "Execution stalls while L3 cache miss demand load is outstanding", .ucode = 0x0600ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname 
= "STALLS_TOTAL", .udesc = "Total execution stalls", .ucode = 0x0400ull | (0x4 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t adl_glc_decode[]={ { .uname = "LCP", .udesc = "Stalls caused by changing prefix length of the instruction", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_BUSY", .udesc = "Cycles the Microcode Sequencer is busy", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_dsb2mite_switches[]={ { .uname = "PENALTY_CYCLES", .udesc = "DSB-to-MITE switch true penalty cycles", .ucode = 0x0200ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_dtlb_load_misses[]={ { .uname = "STLB_HIT", .udesc = "Loads that miss the DTLB and hit the STLB", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_ACTIVE", .udesc = "Cycles when at least one PMH is busy with a page walk for a demand load", .ucode = 0x1000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "WALK_COMPLETED", .udesc = "Load miss in all TLB levels causes a page walk that completes. 
(All page sizes)", .ucode = 0x0e00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_1G", .udesc = "Page walks completed due to a demand data load to a 1G page", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Page walks completed due to a demand data load to a 2M/4M page", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Page walks completed due to a demand data load to a 4K page", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_PENDING", .udesc = "Number of page walks outstanding for a demand load in the PMH each cycle", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_dtlb_store_misses[]={ { .uname = "STLB_HIT", .udesc = "Stores that miss the DTLB and hit the STLB", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_ACTIVE", .udesc = "Cycles when at least one PMH is busy with a page walk for a store", .ucode = 0x1000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "WALK_COMPLETED", .udesc = "Store misses in all TLB levels causes a page walk that completes. 
(All page sizes)", .ucode = 0x0e00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_1G", .udesc = "Page walks completed due to a demand data store to a 1G page", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Page walks completed due to a demand data store to a 2M/4M page", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Page walks completed due to a demand data store to a 4K page", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_PENDING", .udesc = "Number of page walks outstanding for a store in the PMH each cycle", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_exe_activity[]={ { .uname = "1_PORTS_UTIL", .udesc = "Cycles total of 1 uop is executed on all ports and Reservation Station was not empty", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "2_PORTS_UTIL", .udesc = "Cycles total of 2 uops are executed on all ports and Reservation Station was not empty", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "3_PORTS_UTIL", .udesc = "Cycles total of 3 uops are executed on all ports and Reservation Station was not empty", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "4_PORTS_UTIL", .udesc = "Cycles total of 4 uops are executed on all ports and Reservation Station was not empty", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BOUND_ON_LOADS", .udesc = "Execution stalls while memory subsystem has an outstanding load", .ucode = 0x2100ull | (0x5 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "BOUND_ON_STORES", .udesc = "Cycles where the Store Buffer was full and no loads caused an execution stall", .ucode = 0x4000ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "EXE_BOUND_0_PORTS", .udesc = "Cycles no uop executed while RS was not empty, 
the SB was not full and there was no outstanding load", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_fp_arith_dispatched[]={ { .uname = "PORT_0", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_1", .udesc = "TBD", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_5", .udesc = "TBD", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "V0", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "V1", .udesc = "TBD", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "V5", .udesc = "TBD", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_fp_arith_inst_retired[]={ { .uname = "128B_PACKED_DOUBLE", .udesc = "Counts number of SSE/AVX computational 128-bit packed double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 2 computation operations, one for each element. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "128B_PACKED_SINGLE", .udesc = "Counts number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 4 computation operations, one for each element. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP14 RSQRT14 SQRT DPP FM(N)ADD/SUB. 
DPP and FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "256B_PACKED_DOUBLE", .udesc = "Counts number of SSE/AVX computational 256-bit packed double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 4 computation operations, one for each element. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "256B_PACKED_SINGLE", .udesc = "Counts number of SSE/AVX computational 256-bit packed single precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 8 computation operations, one for each element. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT RSQRT RCP DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "4_FLOPS", .udesc = "Counts number of SSE/AVX computational 128-bit packed single and 256-bit packed double precision FP instructions retired; some instructions will count twice as noted below. Each count represents 2 or/and 4 computation operations, 1 for each element. Applies to SSE* and AVX* packed single precision and packed double precision FP instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX RCP14 RSQRT14 SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB count twice as they perform 2 calculations per element", .ucode = 0x1800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR", .udesc = "Counts number of SSE/AVX computational scalar floating-point instructions retired; some instructions will count twice as noted below. 
Applies to SSE* and AVX* scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RCP14 RSQRT14 RANGE SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR_DOUBLE", .udesc = "Counts number of SSE/AVX computational scalar double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 1 computational operation. Applies to SSE* and AVX* scalar double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR_SINGLE", .udesc = "Counts number of SSE/AVX computational scalar single precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 1 computational operation. Applies to SSE* and AVX* scalar single precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT RSQRT RCP FM(N)ADD/SUB. 
FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VECTOR", .udesc = "Number of any Vector retired FP arithmetic instructions", .ucode = 0xfc00ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_frontend_retired[]={ { .uname = "ANY_DSB_MISS", .udesc = "Retired Instructions who experienced DSB miss", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "DSB_MISS", .udesc = "Retired Instructions who experienced a critical DSB miss", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ITLB_MISS", .udesc = "Retired Instructions who experienced iTLB true miss", .ucode = 0x1400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1I_MISS", .udesc = "Retired Instructions who experienced Instruction L1 Cache true miss", .ucode = 0x1200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_MISS", .udesc = "Retired Instructions who experienced Instruction L2 Cache true miss", .ucode = 0x1300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LATENCY_GE_1", .udesc = "Retired instructions after front-end starvation of at least 1 cycle", .ucode = 0x60010600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, { .uname = "LATENCY_GE_128", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 128 cycles which was not interrupted by a back-end stall", .ucode = 0x60800600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, { .uname = "LATENCY_GE_16", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 16 cycles which was not interrupted by a back-end stall", .ucode = 0x60100600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, { .uname = "LATENCY_GE_2", .udesc = "Retired instructions 
after front-end starvation of at least 2 cycles", .ucode = 0x60020600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, { .uname = "LATENCY_GE_256", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 256 cycles which was not interrupted by a back-end stall", .ucode = 0x61000600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, { .uname = "LATENCY_GE_2_BUBBLES_GE_1", .udesc = "Retired instructions that are fetched after an interval where the front-end had at least 1 bubble-slot for a period of 2 cycles which was not interrupted by a back-end stall", .ucode = 0x10020600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LATENCY_GE_32", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 32 cycles which was not interrupted by a back-end stall", .ucode = 0x60200600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, { .uname = "LATENCY_GE_4", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 4 cycles which was not interrupted by a back-end stall", .ucode = 0x60040600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, { .uname = "LATENCY_GE_512", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 512 cycles which was not interrupted by a back-end stall", .ucode = 0x62000600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, { .uname = "LATENCY_GE_64", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 64 cycles which was not interrupted by a back-end stall", .ucode = 0x60400600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, { .uname = "LATENCY_GE_8", .udesc = "Retired instructions that are fetched 
after an interval where the front-end delivered no uops for a period of 8 cycles which was not interrupted by a back-end stall", .ucode = 0x60080600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, { .uname = "MS_FLOWS", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS", .udesc = "Retired Instructions who experienced STLB (2nd level TLB) true miss", .ucode = 0x1500ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "UNKNOWN_BRANCH", .udesc = "TBD", .ucode = 0x1700ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t adl_glc_icache_data[]={ { .uname = "STALLS", .udesc = "Cycles where a code fetch is stalled due to L1 instruction cache miss", .ucode = 0x0400ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_icache_tag[]={ { .uname = "STALLS", .udesc = "Cycles where a code fetch is stalled due to L1 instruction cache tag miss", .ucode = 0x0400ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_idq[]={ { .uname = "DSB_CYCLES_ANY", .udesc = "Cycles Decode Stream Buffer (DSB) is delivering any Uop", .ucode = 0x0800ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DSB_CYCLES_OK", .udesc = "Cycles DSB is delivering optimal number of Uops", .ucode = 0x0800ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DSB_UOPS", .udesc = "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MITE_CYCLES_ANY", .udesc = "Cycles MITE is delivering any Uop", .ucode = 0x0400ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MITE_CYCLES_OK", .udesc = "Cycles MITE is delivering optimal number of Uops", .ucode = 0x0400ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = 
INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MITE_UOPS", .udesc = "Uops delivered to Instruction Decode Queue (IDQ) from MITE path", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_CYCLES_ANY", .udesc = "Cycles when uops are being delivered to IDQ while MS is busy", .ucode = 0x2000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MS_SWITCHES", .udesc = "Number of switches from DSB or MITE to the MS", .ucode = 0x2000ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "MS_UOPS", .udesc = "Uops delivered to IDQ while MS is busy", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_idq_uops_not_delivered[]={ { .uname = "CORE", .udesc = "Uops not delivered by IDQ when backend of the machine is not stalled [This event is alias to IDQ_BUBBLES.CORE]", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_0_UOPS_DELIV_CORE", .udesc = "Cycles when no uops are delivered by the IDQ when backend of the machine is not stalled [This event is alias to IDQ_BUBBLES.CYCLES_0_UOPS_DELIV.CORE]", .ucode = 0x0100ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_FE_WAS_OK", .udesc = "Cycles when optimal number of uops was delivered to the back-end when the back-end is not stalled [This event is alias to IDQ_BUBBLES.CYCLES_FE_WAS_OK]", .ucode = 0x0100ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t adl_glc_inst_decoded[]={ { .uname = "DECODERS", .udesc = "Instruction decoders utilized in a cycle", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_inst_retired[]={ { .uname = "ANY", .udesc = "Number of
instructions retired. Fixed Counter - architectural event", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY_P", .udesc = "Number of instructions retired. General Counter - architectural event", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "MACRO_FUSED", .udesc = "TBD", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NOP", .udesc = "Retired NOP instructions", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "PREC_DIST", .udesc = "Precise instruction retired with PEBS precise-distribution", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REP_ITERATION", .udesc = "Iterations of Repeat string retired instructions", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t adl_glc_int_misc[]={ { .uname = "CLEARS_COUNT", .udesc = "Clears speculative count", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "CLEAR_RESTEER_CYCLES", .udesc = "Counts cycles after recovery from a branch misprediction or machine clear till the first uop is issued from the resteered path", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RECOVERY_CYCLES", .udesc = "Core cycles the allocator was stalled due to recovery from earlier clear event for this thread", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UNKNOWN_BRANCH_CYCLES", .udesc = "Bubble cycles of BAClear (Unknown Branch)", .ucode = 0x0700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UOP_DROPPING", .udesc = "TMA slots where uops got dropped", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_int_vec_retired[]={ { .uname = "128BIT", .udesc = "TBD", .ucode = 0x1300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "256BIT", .udesc = "TBD", 
.ucode = 0xac00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ADD_128", .udesc = "integer ADD, SUB, SAD 128-bit vector instructions", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ADD_256", .udesc = "integer ADD, SUB, SAD 256-bit vector instructions", .ucode = 0x0c00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MUL_256", .udesc = "TBD", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SHUFFLES", .udesc = "TBD", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VNNI_128", .udesc = "TBD", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VNNI_256", .udesc = "TBD", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_itlb_misses[]={ { .uname = "STLB_HIT", .udesc = "Instruction fetch requests that miss the ITLB and hit the STLB", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_ACTIVE", .udesc = "Cycles when at least one PMH is busy with a page walk for code (instruction fetch) request", .ucode = 0x1000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "WALK_COMPLETED", .udesc = "Code miss in all TLB levels causes a page walk that completes. (All page sizes)", .ucode = 0x0e00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Code miss in all TLB levels causes a page walk that completes. (2M/4M)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Code miss in all TLB levels causes a page walk that completes. 
(4K)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_PENDING", .udesc = "Number of page walks outstanding for an outstanding code request in the PMH each cycle", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_l1d[]={ { .uname = "HWPF_MISS", .udesc = "TBD", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REPLACEMENT", .udesc = "Counts the number of cache lines replaced in L1 data cache", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_l1d_pend_miss[]={ { .uname = "FB_FULL", .udesc = "Number of cycles a demand request has waited due to L1D Fill Buffer (FB) unavailability", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FB_FULL_PERIODS", .udesc = "Number of phases a demand request has waited due to L1D Fill Buffer (FB) unavailability", .ucode = 0x0200ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "L2_STALL", .udesc = "This event is deprecated. 
Refer to new event L1D_PEND_MISS.L2_STALLS", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L2_STALLS", .udesc = "Number of cycles a demand request has waited due to L1D due to lack of L2 resources", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PENDING", .udesc = "Number of L1D misses that are outstanding", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PENDING_CYCLES", .udesc = "Cycles with L1D load Misses outstanding", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t adl_glc_l2_lines_in[]={ { .uname = "ALL", .udesc = "L2 cache lines filling L2", .ucode = 0x1f00ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_l2_lines_out[]={ { .uname = "USELESS_HWPF", .udesc = "Cache lines that have been L2 hardware prefetched but not used by demand accesses", .ucode = 0x0400ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_l2_rqsts[]={ { .uname = "ALL_CODE_RD", .udesc = "L2 code requests", .ucode = 0xe400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_DATA_RD", .udesc = "Demand Data Read access L2 cache", .ucode = 0xe100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_MISS", .udesc = "Demand requests that miss L2 cache", .ucode = 0x2700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_HWPF", .udesc = "TBD", .ucode = 0xf000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_RFO", .udesc = "RFO requests to L2 cache", .ucode = 0xe200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_HIT", .udesc = "L2 cache hits when fetching instructions, code reads", .ucode = 0xc400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_MISS", .udesc = "L2 cache misses when fetching instructions", .ucode = 0x2400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_HIT", .udesc = "Demand Data Read requests that hit L2 cache", .ucode = 0xc100ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_MISS", .udesc = "Demand Data Read miss L2 cache", .ucode = 0x2100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_MISS", .udesc = "TBD", .ucode = 0x3000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS", .udesc = "Read requests with true-miss in L2 cache. [This event is alias to L2_REQUEST.MISS]", .ucode = 0x3f00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REFERENCES", .udesc = "All accesses to L2 cache [This event is alias to L2_REQUEST.ALL]", .ucode = 0xff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT", .udesc = "RFO requests that hit L2 cache", .ucode = 0xc200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_MISS", .udesc = "RFO requests that miss L2 cache", .ucode = 0x2200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SWPF_HIT", .udesc = "SW prefetch requests that hit L2 cache", .ucode = 0xc800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SWPF_MISS", .udesc = "SW prefetch requests that miss L2 cache", .ucode = 0x2800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_ld_blocks[]={ { .uname = "ADDRESS_ALIAS", .udesc = "False dependencies in MOB due to partial compare on address", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_SR", .udesc = "The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use", .ucode = 0x8800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STORE_FORWARD", .udesc = "Loads blocked due to overlapping with a preceding store that cannot be forwarded", .ucode = 0x8200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_load_hit_prefetch[]={ { .uname = "SWPF", .udesc = "Counts the number of demand load dispatches that hit L1D fill buffer (FB) allocated for software prefetch", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_longest_lat_cache[]={ { .uname = "MISS", .udesc = "Core-originated 
cacheable requests that missed L3 (Except hardware prefetches to the L3)", .ucode = 0x4100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REFERENCE", .udesc = "Core-originated cacheable requests that refer to L3 (Except hardware prefetches to the L3)", .ucode = 0x4f00ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_lsd[]={ { .uname = "CYCLES_ACTIVE", .udesc = "Cycles Uops delivered by the LSD, but didn't come from the decoder", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_OK", .udesc = "Cycles optimal number of Uops delivered by the LSD, but did not come from the decoder", .ucode = 0x0100ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "UOPS", .udesc = "Number of Uops delivered by the LSD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_machine_clears[]={ { .uname = "COUNT", .udesc = "Number of machine clears (nukes) of any type", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "MEMORY_ORDERING", .udesc = "Number of machine clears due to memory ordering conflicts", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SMC", .udesc = "Self-modifying code (SMC) detected", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_memory_activity[]={ { .uname = "CYCLES_L1D_MISS", .udesc = "Cycles while L1 cache miss demand load is outstanding", .ucode = 0x0200ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L1D_MISS", .udesc = "Execution stalls while L1 cache miss demand load is outstanding", .ucode = 0x0300ull | (0x3 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L2_MISS", 
.udesc = "Execution stalls while L2 cache miss demand cacheable load request is outstanding", .ucode = 0x0500ull | (0x5 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L3_MISS", .udesc = "Execution stalls while L3 cache miss demand cacheable load request is outstanding", .ucode = 0x0900ull | (0x9 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t adl_glc_mem_inst_retired[]={ { .uname = "ALL_LOADS", .udesc = "Retired load instructions", .ucode = 0x8100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_STORES", .udesc = "Retired store instructions", .ucode = 0x8200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY", .udesc = "All retired memory instructions", .ucode = 0x8300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOCK_LOADS", .udesc = "Retired load instructions with locked access", .ucode = 0x2100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_LOADS", .udesc = "Retired load instructions that split across a cacheline boundary", .ucode = 0x4100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_STORES", .udesc = "Retired store instructions that split across a cacheline boundary", .ucode = 0x4200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS_LOADS", .udesc = "Retired load instructions that miss the STLB", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS_STORES", .udesc = "Retired store instructions that miss the STLB", .ucode = 0x1200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t adl_glc_mem_load_completed[]={ { .uname = "L1_MISS_ANY", .udesc = "Completed demand load uops that miss the L1 d-cache", .ucode = 0xfd00ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_mem_load_l3_hit_retired[]={ { .uname = "XSNP_FWD", 
.udesc = "Retired load instructions whose data sources were HitM responses from shared L3", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_HIT", .udesc = "Retired load instructions whose data sources were L3 and cross-core snoop hits in on-pkg core cache", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_HITM", .udesc = "Retired load instructions whose data sources were HitM responses from shared L3", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_MISS", .udesc = "Retired load instructions whose data sources were L3 hit and cross-core snoop missed in on-pkg core cache", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_NONE", .udesc = "Retired load instructions whose data sources were hits in L3 without snoops required", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_NO_FWD", .udesc = "Retired load instructions whose data sources were L3 and cross-core snoop hits in on-pkg core cache", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t adl_glc_mem_load_l3_miss_retired[]={ { .uname = "LOCAL_DRAM", .udesc = "Retired load instructions whose data sources missed L3 but were serviced from local DRAM", .ucode = 0x0100ull, .uflags = INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_mem_load_misc_retired[]={ { .uname = "UC", .udesc = "Retired instructions with at least 1 uncacheable load or lock", .ucode = 0x0400ull, .uflags = INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_mem_load_retired[]={ { .uname = "FB_HIT", .udesc = "Number of completed demand load requests that missed the L1, but hit the FB (fill buffer), because a preceding miss to the same cacheline initiated the line to be brought into L1, but data is not yet ready in L1", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO | 
INTEL_X86_PEBS, }, { .uname = "L1_HIT", .udesc = "Retired load instructions with L1 cache hits as data sources", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1_MISS", .udesc = "Retired load instructions missed L1 cache as data sources", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_HIT", .udesc = "Retired load instructions with L2 cache hits as data sources", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_MISS", .udesc = "Retired load instructions missed L2 cache as data sources", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_HIT", .udesc = "Retired load instructions with L3 cache hits as data sources", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_MISS", .udesc = "Retired load instructions missed L3 cache as data sources", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t adl_glc_mem_store_retired[]={ { .uname = "L2_HIT", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_mem_trans_retired[]={ { .uname = "LOAD_LATENCY", .udesc = "Memory load instructions retired above programmed clocks, minimum threshold value is 3 (Precise Event and ldlat required)", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT | INTEL_X86_DFL, }, { .uname = "LOAD_LATENCY_GT_128", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 128 cycles", .uequiv = "LOAD_LATENCY:ldlat=128", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT, }, { .uname = "LOAD_LATENCY_GT_16", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 16 cycles", .uequiv = "LOAD_LATENCY:ldlat=16", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname 
= "LOAD_LATENCY_GT_256", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 256 cycles", .uequiv = "LOAD_LATENCY:ldlat=256", .ucode = 0x10000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOAD_LATENCY_GT_32", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 32 cycles", .uequiv = "LOAD_LATENCY:ldlat=32", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOAD_LATENCY_GT_4", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 4 cycles", .uequiv = "LOAD_LATENCY:ldlat=4", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOAD_LATENCY_GT_512", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 512 cycles", .ucode = 0x20000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOAD_LATENCY_GT_64", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 64 cycles", .uequiv = "LOAD_LATENCY:ldlat=64", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOAD_LATENCY_GT_8", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 8 cycles", .uequiv = "LOAD_LATENCY:ldlat=8", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STORE_SAMPLE", .udesc = "Retired memory store access operations. 
A PDist event for PEBS Store Latency Facility", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t adl_glc_mem_uop_retired[]={ { .uname = "ANY", .udesc = "Retired memory uops for any access", .ucode = 0x0300ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_misc2_retired[]={ { .uname = "LFENCE", .udesc = "LFENCE instructions retired", .ucode = 0x2000ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_misc_retired[]={ { .uname = "LBR_INSERTS", .udesc = "Increments whenever there is an update to the LBR array", .ucode = 0x2000ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_ocr[]={ { .uname = "DEMAND_DATA_RD_ANY_RESPONSE", .udesc = "Counts demand data reads that have any type of response", .ucode = 0x1000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_DRAM", .udesc = "Counts demand data reads that were supplied by DRAM", .ucode = 0x18400000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_HITM", .udesc = "Counts demand data reads that resulted in a snoop hit in another core's caches, data forwarding is required as the data is modified", .ucode = 0x10003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_HIT_WITH_FWD", .udesc = "Counts demand data reads that resulted in a snoop hit in another core's caches which forwarded the unmodified data to the requesting core", .ucode = 0x8003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_MISS", .udesc = "Counts demand data reads that were not supplied by the L3 cache", .ucode = 0x3fbfc0000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_ANY_RESPONSE", .udesc = "Counts demand read for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that have any type of response", .ucode = 0x1000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_HIT_SNOOP_HITM", 
.udesc = "Counts demand read for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that resulted in a snoop hit in another core's caches, data forwarding is required as the data is modified", .ucode = 0x10003c000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_MISS", .udesc = "Counts demand read for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were not supplied by the L3 cache", .ucode = 0x3fbfc0000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STREAMING_WR_ANY_RESPONSE", .udesc = "Counts streaming stores that have any type of response", .ucode = 0x1080000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_offcore_requests[]={ { .uname = "ALL_REQUESTS", .udesc = "TBD", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA_RD", .udesc = "Demand and prefetch data reads", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD", .udesc = "Demand Data Read requests sent to uncore", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L3_MISS_DEMAND_DATA_RD", .udesc = "Counts demand data read requests that miss the L3 cache", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_offcore_requests_outstanding[]={ { .uname = "ALL_DATA_RD", .udesc = "This event is deprecated. 
Refer to new event OFFCORE_REQUESTS_OUTSTANDING.DATA_RD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_WITH_DATA_RD", .udesc = "TBD", .ucode = 0x0800ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_WITH_DEMAND_DATA_RD", .udesc = "Cycles where at least 1 outstanding demand data read request is pending", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_WITH_DEMAND_RFO", .udesc = "For every cycle where the core is waiting on at least 1 outstanding Demand RFO request, increments by 1", .ucode = 0x0400ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DATA_RD", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD", .udesc = "For every cycle, increments by the number of outstanding demand data read requests pending", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L3_MISS_DEMAND_DATA_RD", .udesc = "For every cycle, increments by the number of demand data read requests pending that are known to have missed the L3 cache", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_resource_stalls[]={ { .uname = "SB", .udesc = "Cycles stalled due to no store buffers available. 
(not including draining from sync)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCOREBOARD", .udesc = "Counts cycles where the pipeline is stalled due to serializing operations", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_rs[]={ { .uname = "EMPTY", .udesc = "Cycles when Reservation Station (RS) is empty for the thread", .ucode = 0x0700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EMPTY_COUNT", .udesc = "Counts end of periods where the Reservation Station (RS) was empty", .ucode = 0x0700ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, }; static const intel_x86_umask_t adl_glc_rs_empty[]={ { .uname = "COUNT", .udesc = "This event is deprecated. Refer to new event RS.EMPTY_COUNT", .ucode = 0x0700ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "CYCLES", .udesc = "This event is deprecated. 
Refer to new event RS.EMPTY", .ucode = 0x0700ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_sq_misc[]={ { .uname = "BUS_LOCK", .udesc = "Counts bus locks, accounts for cache line split locks and UC locks", .ucode = 0x1000ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_sw_prefetch_access[]={ { .uname = "NTA", .udesc = "Number of PREFETCHNTA instructions executed", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PREFETCHW", .udesc = "Number of PREFETCHW instructions executed", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "T0", .udesc = "Number of PREFETCHT0 instructions executed", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "T1_T2", .udesc = "Number of PREFETCHT1 or PREFETCHT2 instructions executed", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_topdown[]={ { .uname = "BACKEND_BOUND_SLOTS", .udesc = "TMA slots where no uops were being issued due to lack of back-end resources (Topdown L1)", .ucode = 0x8300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BAD_SPEC_SLOTS", .udesc = "TMA slots wasted due to incorrect speculations (Topdown L1)", .ucode = 0x8100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BR_MISPREDICT_SLOTS", .udesc = "TMA slots wasted due to incorrect speculation by branch mispredictions", .ucode = 0x8500ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEMORY_BOUND_SLOTS", .udesc = "TMA slots wasted due to memory accesses (TopdownL2)", .ucode = 0x8700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RETIRING_SLOTS", .udesc = "TMA slots where instructions are retiring (Topdown L1)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOTS", .udesc = "TMA slots available for an unhalted logical processor. Fixed counter - architectural event", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOTS_P", .udesc = "TMA slots available for an unhalted logical processor. 
General counter - architectural event", .ucode = 0x01a4ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, }; static const intel_x86_umask_t adl_glc_uops_decoded[]={ { .uname = "DEC0_UOPS", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_uops_dispatched[]={ { .uname = "PORT_0", .udesc = "Uops executed on port 0", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_1", .udesc = "Uops executed on port 1", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_2_3_10", .udesc = "Uops executed on ports 2, 3 and 10", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_4_9", .udesc = "Uops executed on ports 4 and 9", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_5_11", .udesc = "Uops executed on ports 5 and 11", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_6", .udesc = "Uops executed on port 6", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_7_8", .udesc = "Uops executed on ports 7 and 8", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_uops_executed[]={ { .uname = "CORE_CYCLES_GE_1", .udesc = "Cycles at least 1 micro-op is executed from any thread on physical core", .ucode = 0x0200ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_2", .udesc = "Cycles at least 2 micro-ops are executed from any thread on physical core", .ucode = 0x0200ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_3", .udesc = "Cycles at least 3 micro-ops are executed from any thread on physical core", .ucode = 0x0200ull | (0x3 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_4", .udesc = "Cycles at least 4 micro-ops are executed from any thread on physical core", .ucode = 0x0200ull | 
(0x4 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_1", .udesc = "Cycles where at least 1 uop was executed per-thread", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_2", .udesc = "Cycles where at least 2 uops were executed per-thread", .ucode = 0x0100ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_3", .udesc = "Cycles where at least 3 uops were executed per-thread", .ucode = 0x0100ull | (0x3 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_4", .udesc = "Cycles where at least 4 uops were executed per-thread", .ucode = 0x0100ull | (0x4 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS", .udesc = "Counts number of cycles no uops were dispatched to be executed on this thread", .ucode = 0x0100ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "STALL_CYCLES", .udesc = "This event is deprecated. 
Refer to new event UOPS_EXECUTED.STALLS", .ucode = 0x0100ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "THREAD", .udesc = "Counts the number of uops to be executed per-thread each cycle", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "X87", .udesc = "Counts the number of x87 uops dispatched", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_glc_uops_issued[]={ { .uname = "ANY", .udesc = "Uops that RAT issues to RS", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_glc_uops_retired[]={ { .uname = "CYCLES", .udesc = "Cycles with retired uop(s)", .ucode = 0x0200ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "HEAVY", .udesc = "Retired uops except the last uop of each instruction", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOTS", .udesc = "Retirement slots used", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS", .udesc = "Cycles without actually retired uops", .ucode = 0x0200ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "STALL_CYCLES", .udesc = "This event is deprecated. 
Refer to new event UOPS_RETIRED.STALLS", .ucode = 0x0200ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t adl_glc_xq[]={ { .uname = "FULL_CYCLES", .udesc = "Cycles the uncore cannot take further requests", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_DFL, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_entry_t intel_adl_glc_pe[]={ { .name = "ARITH", .desc = "This event is deprecated. Refer to new event ARITH.FPDIV_ACTIVE", .code = 0x00b0, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_arith), .umasks = adl_glc_arith, }, { .name = "ASSISTS", .desc = "Counts all microcode FP assists", .code = 0x00c1, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_assists), .umasks = adl_glc_assists, }, { .name = "BACLEARS", .desc = "Clears due to Unknown Branches", .code = 0x0060, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_baclears), .umasks = adl_glc_baclears, }, { .name = "BR_INST_RETIRED", .desc = "All branch instructions retired", .code = 0x00c4, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_br_inst_retired), .umasks = adl_glc_br_inst_retired, }, { .name = "BR_MISP_RETIRED", .desc = "All mispredicted branch instructions retired", .code = 0x00c5, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_br_misp_retired), .umasks = adl_glc_br_misp_retired, }, { .name = "CORE_POWER", .desc = "TBD", .code = 0x0028, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_core_power), .umasks = adl_glc_core_power, }, { .name = 
"CPU_CLK_UNHALTED", .desc = "Core cycles when the thread is not in halt state", .code = 0x003c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0x1ull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_cpu_clk_unhalted), .umasks = adl_glc_cpu_clk_unhalted, }, { .name = "CYCLE_ACTIVITY", .desc = "Cycles while L2 cache miss demand load is outstanding", .code = 0x00a3, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_cycle_activity), .umasks = adl_glc_cycle_activity, }, { .name = "DECODE", .desc = "Stalls caused by changing prefix length of the instruction", .code = 0x0087, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_decode), .umasks = adl_glc_decode, }, { .name = "DSB2MITE_SWITCHES", .desc = "DSB-to-MITE switch true penalty cycles", .code = 0x0061, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_dsb2mite_switches), .umasks = adl_glc_dsb2mite_switches, }, { .name = "DTLB_LOAD_MISSES", .desc = "Page walks completed due to a demand data load to a 4K page", .code = 0x0012, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_dtlb_load_misses), .umasks = adl_glc_dtlb_load_misses, }, { .name = "DTLB_STORE_MISSES", .desc = "Page walks completed due to a demand data store to a 4K page", .code = 0x0013, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_dtlb_store_misses), .umasks = adl_glc_dtlb_store_misses, }, { .name = "EXE_ACTIVITY", .desc = "Cycles total of 1 uop is executed on all ports and Reservation Station was not empty", .code = 0x00a6, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_exe_activity), .umasks = adl_glc_exe_activity, }, { .name = 
"FP_ARITH_DISPATCHED", .desc = "TBD", .code = 0x00b3, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_fp_arith_dispatched), .umasks = adl_glc_fp_arith_dispatched, }, { .name = "FP_ARITH_INST_RETIRED", .desc = "Counts number of SSE/AVX computational scalar double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 1 computational operation. Applies to SSE* and AVX* scalar double precision floating-point instructions", .code = 0x00c7, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_fp_arith_inst_retired), .umasks = adl_glc_fp_arith_inst_retired, }, { .name = "FRONTEND_RETIRED", .desc = "Retired instructions that experienced a critical DSB miss", .code = 0x01c6, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_FRONTEND | INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_frontend_retired), .umasks = adl_glc_frontend_retired, }, { .name = "ICACHE_DATA", .desc = "Cycles where a code fetch is stalled due to L1 instruction cache miss", .code = 0x0080, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_icache_data), .umasks = adl_glc_icache_data, }, { .name = "ICACHE_TAG", .desc = "Cycles where a code fetch is stalled due to L1 instruction cache tag miss", .code = 0x0083, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_icache_tag), .umasks = adl_glc_icache_tag, }, { .name = "IDQ", .desc = "Cycles MITE is delivering any Uop", .code = 0x0079, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_idq), .umasks = adl_glc_idq, }, { .name = "IDQ_BUBBLES", .desc = "Uops not delivered by IDQ when backend of the machine is not stalled", .equiv = "IDQ_UOPS_NOT_DELIVERED", .code = 0x009c, 
.modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_idq_uops_not_delivered), .umasks = adl_glc_idq_uops_not_delivered, }, { .name = "IDQ_UOPS_NOT_DELIVERED", .desc = "Uops not delivered by IDQ when backend of the machine is not stalled [This event is alias to IDQ_BUBBLES.CORE]", .code = 0x009c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_idq_uops_not_delivered), .umasks = adl_glc_idq_uops_not_delivered, }, { .name = "INST_DECODED", .desc = "Instruction decoders utilized in a cycle", .code = 0x0075, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_inst_decoded), .umasks = adl_glc_inst_decoded, }, { .name = "INST_RETIRED", .desc = "Number of instructions retired", .code = 0x00c0, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0x1ull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_inst_retired), .umasks = adl_glc_inst_retired, }, { .name = "INSTRUCTION_RETIRED", .desc = "Number of instructions at retirement", .modmsk = INTEL_V5_ATTRS, .cntmsk = 0x1000000ffull, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "Number of instructions at retirement", .modmsk = INTEL_V5_ATTRS, .equiv = "INSTRUCTION_RETIRED", .cntmsk = 0x1000000ffull, .code = 0xc0, }, { .name = "INT_MISC", .desc = "Core cycles the allocator was stalled due to recovery from earlier clear event for this thread", .code = 0x00ad, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_int_misc), .umasks = adl_glc_int_misc, }, { .name = "INT_VEC_RETIRED", .desc = "integer ADD, SUB, SAD 128-bit vector instructions", .code = 0x00e7, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_int_vec_retired), .umasks = adl_glc_int_vec_retired, }, { .name = "ITLB_MISSES", .desc = "Code miss in all 
TLB levels causes a page walk that completes. (4K)", .code = 0x0011, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_itlb_misses), .umasks = adl_glc_itlb_misses, }, { .name = "L1D", .desc = "Counts the number of cache lines replaced in L1 data cache", .code = 0x0051, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_l1d), .umasks = adl_glc_l1d, }, { .name = "L1D_PEND_MISS", .desc = "Number of L1D misses that are outstanding", .code = 0x0048, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_l1d_pend_miss), .umasks = adl_glc_l1d_pend_miss, }, { .name = "L2_LINES_IN", .desc = "L2 cache lines filling L2", .code = 0x0025, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_l2_lines_in), .umasks = adl_glc_l2_lines_in, }, { .name = "L2_LINES_OUT", .desc = "Cache lines that have been L2 hardware prefetched but not used by demand accesses", .code = 0x0026, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_l2_lines_out), .umasks = adl_glc_l2_lines_out, }, { .name = "L2_REQUEST", .desc = "Demand Data Read miss L2 cache", .equiv = "L2_RQSTS", .code = 0x0024, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_l2_rqsts), .umasks = adl_glc_l2_rqsts, }, { .name = "L2_RQSTS", .desc = "Demand Data Read miss L2 cache", .code = 0x0024, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_l2_rqsts), .umasks = adl_glc_l2_rqsts, }, { .name = "LD_BLOCKS", .desc = "False dependencies in MOB due to partial compare on address", .code = 0x0003, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= 
LIBPFM_ARRAY_SIZE(adl_glc_ld_blocks), .umasks = adl_glc_ld_blocks, }, { .name = "LOAD_HIT_PREFETCH", .desc = "Counts the number of demand load dispatches that hit L1D fill buffer (FB) allocated for software prefetch", .code = 0x004c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_load_hit_prefetch), .umasks = adl_glc_load_hit_prefetch, }, { .name = "LONGEST_LAT_CACHE", .desc = "Core-originated cacheable requests that missed L3 (Except hardware prefetches to the L3)", .code = 0x002e, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_longest_lat_cache), .umasks = adl_glc_longest_lat_cache, }, { .name = "LSD", .desc = "Cycles Uops delivered by the LSD, but didn't come from the decoder", .code = 0x00a8, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_lsd), .umasks = adl_glc_lsd, }, { .name = "MACHINE_CLEARS", .desc = "Number of machine clears (nukes) of any type", .code = 0x00c3, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_machine_clears), .umasks = adl_glc_machine_clears, }, { .name = "MEMORY_ACTIVITY", .desc = "Cycles while L1 cache miss demand load is outstanding", .code = 0x0047, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_memory_activity), .umasks = adl_glc_memory_activity, }, { .name = "MEM_INST_RETIRED", .desc = "Retired load instructions that miss the STLB", .code = 0x00d0, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_mem_inst_retired), .umasks = adl_glc_mem_inst_retired, }, { .name = "MEM_LOAD_COMPLETED", .desc = "Completed demand load uops that miss the L1 d-cache", .code = 0x0043, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 
INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_mem_load_completed), .umasks = adl_glc_mem_load_completed, }, { .name = "MEM_LOAD_L3_HIT_RETIRED", .desc = "Retired load instructions whose data sources were L3 hit and cross-core snoop missed in on-pkg core cache", .code = 0x00d2, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_mem_load_l3_hit_retired), .umasks = adl_glc_mem_load_l3_hit_retired, }, { .name = "MEM_LOAD_L3_MISS_RETIRED", .desc = "Retired load instructions which data sources missed L3 but serviced from local dram", .code = 0x00d3, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_mem_load_l3_miss_retired), .umasks = adl_glc_mem_load_l3_miss_retired, }, { .name = "MEM_LOAD_MISC_RETIRED", .desc = "Retired instructions with at least 1 uncacheable load or lock", .code = 0x00d4, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_mem_load_misc_retired), .umasks = adl_glc_mem_load_misc_retired, }, { .name = "MEM_LOAD_RETIRED", .desc = "Retired load instructions with L1 cache hits as data sources", .code = 0x00d1, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_mem_load_retired), .umasks = adl_glc_mem_load_retired, }, { .name = "MEM_STORE_RETIRED", .desc = "TBD", .code = 0x0044, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_mem_store_retired), .umasks = adl_glc_mem_store_retired, }, { .name = "MEM_TRANS_RETIRED", .desc = "Counts randomly selected loads when the latency from first dispatch to completion is greater latency threshold in ldlat=", .code = 0x01cd, .modmsk = INTEL_V4_ATTRS | _INTEL_X86_ATTR_LDLAT, .cntmsk = 0xfeull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_mem_trans_retired), .umasks = 
adl_glc_mem_trans_retired, }, { .name = "MEM_UOP_RETIRED", .desc = "Retired memory uops for any access", .code = 0x00e5, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_mem_uop_retired), .umasks = adl_glc_mem_uop_retired, }, { .name = "MISC2_RETIRED", .desc = "LFENCE instructions retired", .code = 0x00e0, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_misc2_retired), .umasks = adl_glc_misc2_retired, }, { .name = "MISC_RETIRED", .desc = "Increments whenever there is an update to the LBR array", .code = 0x00cc, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_misc_retired), .umasks = adl_glc_misc_retired, }, { .name = "OCR0", .desc = "Counts demand data reads that were not supplied by the L3 cache", .code = 0x012a, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_ocr), .umasks = adl_glc_ocr, }, { .name = "OCR1", .desc = "Counts demand data reads that were not supplied by the L3 cache", .code = 0x012b, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_ocr), .umasks = adl_glc_ocr, }, { .name = "OFFCORE_REQUESTS", .desc = "Demand Data Read requests sent to uncore", .code = 0x0021, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_offcore_requests), .umasks = adl_glc_offcore_requests, }, { .name = "OFFCORE_REQUESTS_OUTSTANDING", .desc = "Cycles where at least 1 outstanding demand data read request is pending", .code = 0x0020, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_offcore_requests_outstanding), .umasks = adl_glc_offcore_requests_outstanding, }, { .name = "RESOURCE_STALLS", .desc = "Counts cycles where the pipeline is stalled due to 
serializing operations", .code = 0x00a2, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_resource_stalls), .umasks = adl_glc_resource_stalls, }, { .name = "RS", .desc = "Cycles when Reservation Station (RS) is empty for the thread", .code = 0x00a5, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_rs), .umasks = adl_glc_rs, }, { .name = "SQ_MISC", .desc = "Counts bus locks, accounts for cache line split locks and UC locks", .code = 0x002c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_sq_misc), .umasks = adl_glc_sq_misc, }, { .name = "SW_PREFETCH_ACCESS", .desc = "Number of PREFETCHNTA instructions executed", .code = 0x0040, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_sw_prefetch_access), .umasks = adl_glc_sw_prefetch_access, }, { .name = "TOPDOWN", .desc = "Topdown events using PERF_METRICS support", .code = 0x0000, .modmsk = INTEL_FIXED2_ATTRS, .cntmsk = 0x800000000ull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_topdown), .umasks = adl_glc_topdown, }, { .name = "UNHALTED_CORE_CYCLES", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .modmsk = INTEL_V5_ATTRS, .cntmsk = 0x20000000full, .code = 0x3c, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles", .modmsk = INTEL_FIXED3_ATTRS, .cntmsk = 0x400000000ull, .code = 0x0300, /* pseudo encoding */ .flags = INTEL_X86_FIXED, }, { .name = "UOPS_DECODED", .desc = "TBD", .code = 0x0076, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_uops_decoded), .umasks = adl_glc_uops_decoded, }, { .name = "UOPS_DISPATCHED", .desc = "Uops executed on port 0", .code = 0x00b2, .modmsk = 
INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_uops_dispatched), .umasks = adl_glc_uops_dispatched, }, { .name = "UOPS_EXECUTED", .desc = "Cycles where at least 1 uop was executed per-thread", .code = 0x00b1, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_uops_executed), .umasks = adl_glc_uops_executed, }, { .name = "UOPS_ISSUED", .desc = "Uops that RAT issues to RS", .code = 0x00ae, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_uops_issued), .umasks = adl_glc_uops_issued, }, { .name = "UOPS_RETIRED", .desc = "Retired uops except the last uop of each instruction", .code = 0x00c2, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_uops_retired), .umasks = adl_glc_uops_retired, }, { .name = "XQ", .desc = "Cycles the uncore cannot take further requests", .code = 0x002d, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_glc_xq), .umasks = adl_glc_xq, }, }; /* 64 events available */ papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_adl_grt_events.h000066400000000000000000001303701502707512200244060ustar00rootroot00000000000000/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: adl_grt (Alderlake Gracemont E-Core) * Based on Intel JSON event table version : 1.24 * Based on Intel JSON event table published : 12/04/2023 */ static const intel_x86_umask_t adl_grt_baclears[]={ { .uname = "ANY", .udesc = "Counts the total number of BACLEARS due to all branch types including conditional and unconditional jumps, returns, and indirect branches", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_grt_br_inst_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "Counts the total number of branch instructions retired for all branch types", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "CALL", .udesc = "This event is deprecated. 
Refer to new event BR_INST_RETIRED.NEAR_CALL", .uequiv = "NEAR_CALL", .ucode = 0xf900ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND", .udesc = "Counts the number of retired JCC (Jump on Conditional Code) branch instructions retired, includes both taken and not taken branches", .ucode = 0x7e00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_TAKEN", .udesc = "Counts the number of taken JCC (Jump on Conditional Code) branch instructions retired", .ucode = 0xfe00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "FAR_BRANCH", .udesc = "Counts the number of far branch instructions retired, includes far jump, far call and return, and interrupt call and return", .ucode = 0xbf00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "INDIRECT", .udesc = "Counts the number of near indirect JMP and near indirect CALL branch instructions retired", .ucode = 0xeb00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "INDIRECT_CALL", .udesc = "Counts the number of near indirect CALL branch instructions retired", .ucode = 0xfb00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "IND_CALL", .udesc = "This event is deprecated. Refer to new event BR_INST_RETIRED.INDIRECT_CALL", .uequiv = "INDIRECT_CALL", .ucode = 0xfb00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "JCC", .udesc = "This event is deprecated. 
Refer to new event BR_INST_RETIRED.COND", .uequiv = "COND", .ucode = 0x7e00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_CALL", .udesc = "Counts the number of near CALL branch instructions retired", .ucode = 0xf900ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_RETURN", .udesc = "Counts the number of near RET branch instructions retired", .ucode = 0xf700ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_TAKEN", .udesc = "Counts the number of near taken branch instructions retired", .ucode = 0xc000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NON_RETURN_IND", .udesc = "This event is deprecated. Refer to new event BR_INST_RETIRED.INDIRECT", .uequiv = "INDIRECT", .ucode = 0xeb00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REL_CALL", .udesc = "Counts the number of near relative CALL branch instructions retired", .ucode = 0xfd00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RETURN", .udesc = "This event is deprecated. Refer to new event BR_INST_RETIRED.NEAR_RETURN", .uequiv = "NEAR_RETURN", .ucode = 0xf700ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "TAKEN_JCC", .udesc = "This event is deprecated. 
Refer to new event BR_INST_RETIRED.COND_TAKEN", .uequiv = "COND_TAKEN", .ucode = 0xfe00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t adl_grt_br_misp_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "Counts the total number of mispredicted branch instructions retired for all branch types", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "COND", .udesc = "Counts the number of mispredicted JCC (Jump on Conditional Code) branch instructions retired", .ucode = 0x7e00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_TAKEN", .udesc = "Counts the number of mispredicted taken JCC (Jump on Conditional Code) branch instructions retired", .ucode = 0xfe00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "INDIRECT", .udesc = "Counts the number of mispredicted near indirect JMP and near indirect CALL branch instructions retired", .ucode = 0xeb00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "INDIRECT_CALL", .udesc = "Counts the number of mispredicted near indirect CALL branch instructions retired", .ucode = 0xfb00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "IND_CALL", .udesc = "This event is deprecated. Refer to new event BR_MISP_RETIRED.INDIRECT_CALL", .uequiv = "INDIRECT_CALL", .ucode = 0xfb00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "JCC", .udesc = "This event is deprecated. Refer to new event BR_MISP_RETIRED.COND", .uequiv = "COND", .ucode = 0x7e00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_TAKEN", .udesc = "Counts the number of mispredicted near taken branch instructions retired", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NON_RETURN_IND", .udesc = "This event is deprecated. 
Refer to new event BR_MISP_RETIRED.INDIRECT", .uequiv = "INDIRECT", .ucode = 0xeb00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RETURN", .udesc = "Counts the number of mispredicted near RET branch instructions retired", .ucode = 0xf700ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "TAKEN_JCC", .udesc = "This event is deprecated. Refer to new event BR_MISP_RETIRED.COND_TAKEN", .uequiv = "COND_TAKEN", .ucode = 0xfe00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t adl_grt_cpu_clk_unhalted[]={ { .uname = "CORE", .udesc = "Counts the number of unhalted core clock cycles. (Fixed event)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "CORE_P", .udesc = "Counts the number of unhalted core clock cycles", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REF_TSC", .udesc = "Counts the number of unhalted reference clock cycles at TSC frequency. (Fixed event)", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "REF_TSC_P", .udesc = "Counts the number of unhalted reference clock cycles at TSC frequency", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "THREAD", .udesc = "Counts the number of unhalted core clock cycles. 
(Fixed event)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "THREAD_P", .udesc = "Counts the number of unhalted core clock cycles", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_grt_dtlb_load_misses[]={ { .uname = "WALK_COMPLETED", .udesc = "Counts the number of page walks completed due to load DTLB misses to any page size", .ucode = 0x0e00ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_grt_dtlb_store_misses[]={ { .uname = "WALK_COMPLETED", .udesc = "Counts the number of page walks completed due to store DTLB misses to any page size", .ucode = 0x0e00ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_grt_icache[]={ { .uname = "ACCESSES", .udesc = "Counts the number of requests to the instruction cache for one or more bytes of a cache line", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISSES", .udesc = "Counts the number of instruction cache misses", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_grt_inst_retired[]={ { .uname = "ANY", .udesc = "Counts the total number of instructions retired. 
(Fixed event)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_CODE_OVERRIDE, }, { .uname = "ANY_P", .udesc = "Counts the total number of instructions retired", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_grt_itlb_misses[]={ { .uname = "MISS_CAUSED_WALK", .udesc = "Counts the number of page walks initiated by an instruction fetch that missed the first and second level TLBs", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PDE_CACHE_MISS", .udesc = "Counts the number of page walks due to an instruction fetch that miss the PDE (Page Directory Entry) cache", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "Counts the number of page walks completed due to instruction fetch misses to any page size", .ucode = 0x0e00ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_grt_lbr_inserts[]={ { .uname = "ANY", .udesc = "This event is deprecated. [This event is alias to MISC_RETIRED.LBR_INSERTS]", .ucode = 0x0100ull, .uflags = INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_grt_ld_blocks[]={ { .uname = "4K_ALIAS", .udesc = "This event is deprecated. 
Refer to new event LD_BLOCKS.ADDRESS_ALIAS", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ADDRESS_ALIAS", .udesc = "Counts the number of retired loads that are blocked because it initially appears to be store forward blocked, but subsequently is shown not to be blocked based on 4K alias check", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "DATA_UNKNOWN", .udesc = "Counts the number of retired loads that are blocked because its address exactly matches an older store whose data is not ready", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t adl_grt_ld_head[]={ { .uname = "ANY_AT_RET", .udesc = "Counts the number of cycles that the head (oldest load) of the load buffer is stalled due to any number of reasons, including an L1 miss, WCB full, pagewalk, store address block or store data block, on a load that retires", .ucode = 0xff00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "DTLB_MISS_AT_RET", .udesc = "Counts the number of cycles that the head (oldest load) of the load buffer and retirement are both stalled due to a DTLB miss", .ucode = 0x9000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L1_BOUND_AT_RET", .udesc = "Counts the number of cycles that the head (oldest load) of the load buffer is stalled due to a core bound stall including a store address match, a DTLB miss or a page walk that detains the load from retiring", .ucode = 0xf400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L1_MISS_AT_RET", .udesc = "Counts the number of cycles that the head (oldest load) of the load buffer and retirement are both stalled due to a DL1 miss", .ucode = 0x8100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OTHER_AT_RET", .udesc = "Counts the number of cycles that the head (oldest load) of the load buffer and retirement are both stalled due to other block cases", .ucode = 0xc000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"PGWALK_AT_RET", .udesc = "Counts the number of cycles that the head (oldest load) of the load buffer and retirement are both stalled due to a pagewalk", .ucode = 0xa000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ST_ADDR_AT_RET", .udesc = "Counts the number of cycles that the head (oldest load) of the load buffer and retirement are both stalled due to a store address match", .ucode = 0x8400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_grt_longest_lat_cache[]={ { .uname = "MISS", .udesc = "Counts the number of cacheable memory requests that miss in the LLC. Counts on a per core basis", .ucode = 0x4100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REFERENCE", .udesc = "Counts the number of cacheable memory requests that access the LLC. Counts on a per core basis", .ucode = 0x4f00ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_grt_machine_clears[]={ { .uname = "DISAMBIGUATION", .udesc = "Counts the number of machine clears due to memory ordering in which an internal load passes an older store within the same CPU", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FP_ASSIST", .udesc = "Counts the number of floating point operations retired that required microcode assist", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEMORY_ORDERING", .udesc = "Counts the number of machine clears due to memory ordering caused by a snoop from an external agent. Does not count internally generated machine clears such as those due to memory disambiguation", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MRN_NUKE", .udesc = "Counts the number of machine clears due to memory renaming", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PAGE_FAULT", .udesc = "Counts the number of machine clears due to a page fault. Counts both I-Side and D-Side (Loads/Stores) page faults. 
A page fault occurs when either the page is not present, or an access violation occurs", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW", .udesc = "Counts the number of machine clears that flush the pipeline and restart the machine with the use of microcode due to SMC, MEMORY_ORDERING, FP_ASSISTS, PAGE_FAULT, DISAMBIGUATION, and FPC_VIRTUAL_TRAP", .ucode = 0x6f00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SMC", .udesc = "Counts the number of machine clears due to program modifying data (self modifying code) within 1K of a recently fetched code page", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_grt_mem_bound_stalls[]={ { .uname = "IFETCH", .udesc = "Counts the number of cycles the core is stalled due to an instruction cache or TLB miss which hit in the L2, LLC, DRAM or MMIO (Non-DRAM)", .ucode = 0x3800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IFETCH_DRAM_HIT", .udesc = "Counts the number of cycles the core is stalled due to an instruction cache or TLB miss which hit in DRAM or MMIO (Non-DRAM)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IFETCH_L2_HIT", .udesc = "Counts the number of cycles the core is stalled due to an instruction cache or TLB miss which hit in the L2 cache", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IFETCH_LLC_HIT", .udesc = "Counts the number of cycles the core is stalled due to an instruction cache or TLB miss which hit in the LLC or other core with HITE/F/M", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOAD", .udesc = "Counts the number of cycles the core is stalled due to a demand load miss which hit in the L2, LLC, DRAM or MMIO (Non-DRAM)", .ucode = 0x0700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOAD_DRAM_HIT", .udesc = "Counts the number of cycles the core is stalled due to a demand load miss which hit in DRAM or MMIO (Non-DRAM)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"LOAD_L2_HIT", .udesc = "Counts the number of cycles the core is stalled due to a demand load which hit in the L2 cache", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOAD_LLC_HIT", .udesc = "Counts the number of cycles the core is stalled due to a demand load which hit in the LLC or other core with HITE/F/M", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_grt_mem_load_uops_retired[]={ { .uname = "DRAM_HIT", .udesc = "Counts the number of load uops retired that hit in DRAM", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_HIT", .udesc = "Counts the number of load uops retired that hit in the L2 cache", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_HIT", .udesc = "Counts the number of load uops retired that hit in the L3 cache", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t adl_grt_mem_scheduler_block[]={ { .uname = "ALL", .udesc = "load buffer, store buffer or RSV full", .ucode = 0x0700ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "LD_BUF", .udesc = "Counts the number of cycles that uops are blocked due to a load buffer full condition", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSV", .udesc = "Counts the number of cycles that uops are blocked due to an RSV full condition", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ST_BUF", .udesc = "Counts the number of cycles that uops are blocked due to a store buffer full condition", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_grt_mem_uops_retired[]={ { .uname = "ALL_LOADS", .udesc = "Counts the number of load uops retired", .ucode = 0x8100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_STORES", .udesc = "Counts the number of store uops retired", .ucode = 0x8200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname 
= "LOAD_LATENCY", .udesc = "Memory load instructions retired above programmed clocks, minimum threshold value is 3 (Precise Event and ldlat required)", .ucode = 0x500ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT | INTEL_X86_DFL, }, { .uname = "LOAD_LATENCY_GT_128", .udesc = "Counts the number of tagged loads with an instruction latency that exceeds or equals the threshold of 128 cycles as defined in MEC_CR_PEBS_LD_LAT_THRESHOLD (3F6H). Only counts with PEBS enabled", .ucode = 0x8000ull, .uequiv = "LOAD_LATENCY:ldlat=128", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT, }, { .uname = "LOAD_LATENCY_GT_16", .udesc = "Counts the number of tagged loads with an instruction latency that exceeds or equals the threshold of 16 cycles as defined in MEC_CR_PEBS_LD_LAT_THRESHOLD (3F6H). Only counts with PEBS enabled", .ucode = 0x1000ull, .uequiv = "LOAD_LATENCY:ldlat=16", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT, }, { .uname = "LOAD_LATENCY_GT_256", .udesc = "Counts the number of tagged loads with an instruction latency that exceeds or equals the threshold of 256 cycles as defined in MEC_CR_PEBS_LD_LAT_THRESHOLD (3F6H). Only counts with PEBS enabled", .ucode = 0x10000ull, .uequiv = "LOAD_LATENCY:ldlat=256", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT, }, { .uname = "LOAD_LATENCY_GT_32", .udesc = "Counts the number of tagged loads with an instruction latency that exceeds or equals the threshold of 32 cycles as defined in MEC_CR_PEBS_LD_LAT_THRESHOLD (3F6H). Only counts with PEBS enabled", .ucode = 0x2000ull, .uequiv = "LOAD_LATENCY:ldlat=32", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOAD_LATENCY_GT_4", .udesc = "Counts the number of tagged loads with an instruction latency that exceeds or equals the threshold of 4 cycles as defined in MEC_CR_PEBS_LD_LAT_THRESHOLD (3F6H). 
Only counts with PEBS enabled", .ucode = 0x0400ull, .uequiv = "LOAD_LATENCY:ldlat=4", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT, }, { .uname = "LOAD_LATENCY_GT_512", .udesc = "Counts the number of tagged loads with an instruction latency that exceeds or equals the threshold of 512 cycles as defined in MEC_CR_PEBS_LD_LAT_THRESHOLD (3F6H). Only counts with PEBS enabled", .ucode = 0x20000ull, .uequiv = "LOAD_LATENCY:ldlat=512", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT, }, { .uname = "LOAD_LATENCY_GT_64", .udesc = "Counts the number of tagged loads with an instruction latency that exceeds or equals the threshold of 64 cycles as defined in MEC_CR_PEBS_LD_LAT_THRESHOLD (3F6H). Only counts with PEBS enabled", .ucode = 0x4000ull, .uequiv = "LOAD_LATENCY:ldlat=64", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOAD_LATENCY_GT_8", .udesc = "Counts the number of tagged loads with an instruction latency that exceeds or equals the threshold of 8 cycles as defined in MEC_CR_PEBS_LD_LAT_THRESHOLD (3F6H). Only counts with PEBS enabled", .ucode = 0x0800ull, .uequiv = "LOAD_LATENCY:ldlat=8", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT, }, { .uname = "SPLIT_LOADS", .udesc = "Counts the number of retired split load uops", .ucode = 0x4100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STORE_LATENCY", .udesc = "Counts the number of stores uops retired. Counts with or without PEBS enabled", .ucode = 0x0600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t adl_grt_misc_retired[]={ { .uname = "LBR_INSERTS", .udesc = "Counts the number of LBR entries recorded. Requires LBRs to be enabled in IA32_LBR_CTL. 
[This event is alias to LBR_INSERTS.ANY]", .ucode = 0x0100ull, .uflags = INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_grt_ocr[]={ { .uname = "COREWB_M_ANY_RESPONSE", .udesc = "Counts modified writebacks from L1 cache and L2 cache that have any type of response", .ucode = 0x1000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_ANY_RESPONSE", .udesc = "Counts demand data reads that have any type of response", .ucode = 0x1000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT", .udesc = "Counts demand data reads that were supplied by the L3 cache", .ucode = 0x3f803c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_HITM", .udesc = "Counts demand data reads that were supplied by the L3 cache where a snoop was sent, the snoop hit, and modified data was forwarded", .ucode = 0x10003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_HIT_NO_FWD", .udesc = "Counts demand data reads that were supplied by the L3 cache where a snoop was sent, the snoop hit, but no data was forwarded", .ucode = 0x4003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_HIT_WITH_FWD", .udesc = "Counts demand data reads that were supplied by the L3 cache where a snoop was sent, the snoop hit, and non-modified data was forwarded", .ucode = 0x8003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_MISS", .udesc = "Counts demand data reads that were not supplied by the L3 cache", .ucode = 0x3f8440000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_MISS_LOCAL", .udesc = "Counts demand data reads that were not supplied by the L3 cache. 
[L3_MISS_LOCAL is alias to L3_MISS]", .ucode = 0x3f8440000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_ANY_RESPONSE", .udesc = "Counts demand reads for ownership (RFO) and software prefetches for exclusive ownership (PREFETCHW) that have any type of response", .ucode = 0x1000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_HIT", .udesc = "Counts demand reads for ownership (RFO) and software prefetches for exclusive ownership (PREFETCHW) that were supplied by the L3 cache", .ucode = 0x3f803c000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_HIT_SNOOP_HITM", .udesc = "Counts demand reads for ownership (RFO) and software prefetches for exclusive ownership (PREFETCHW) that were supplied by the L3 cache where a snoop was sent, the snoop hit, and modified data was forwarded", .ucode = 0x10003c000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_MISS", .udesc = "Counts demand reads for ownership (RFO) and software prefetches for exclusive ownership (PREFETCHW) that were not supplied by the L3 cache", .ucode = 0x3f8440000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_MISS_LOCAL", .udesc = "Counts demand reads for ownership (RFO) and software prefetches for exclusive ownership (PREFETCHW) that were not supplied by the L3 cache. 
[L3_MISS_LOCAL is alias to L3_MISS]", .ucode = 0x3f8440000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STREAMING_WR_ANY_RESPONSE", .udesc = "Counts streaming stores that have any type of response", .ucode = 0x1080000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_grt_serialization[]={ { .uname = "NON_C01_MS_SCB", .udesc = "Counts the number of issue slots not consumed by the backend due to a micro-sequencer (MS) scoreboard, which stalls the front-end from issuing from the UROM until a specified older uop retires", .ucode = 0x0200ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_grt_topdown_bad_speculation[]={ { .uname = "ALL", .udesc = "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FASTNUKE", .udesc = "Counts the number of issue slots every cycle that were not consumed by the backend due to fast nukes such as memory ordering and memory disambiguation machine clears", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MACHINE_CLEARS", .udesc = "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a machine clear (nuke) of any kind including memory ordering and memory disambiguation", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISPREDICT", .udesc = "Counts the number of issue slots every cycle that were not consumed by the backend due to branch mispredicts", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NUKE", .udesc = "Counts the number of issue slots every cycle that were not consumed by the backend due to a machine clear (nuke)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_grt_topdown_be_bound[]={ { .uname = "ALL", .udesc = "Counts the total number of issue slots every cycle 
that were not consumed by the backend due to backend stalls", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALLOC_RESTRICTIONS", .udesc = "Counts the number of issue slots every cycle that were not consumed by the backend due to certain allocation restrictions", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_SCHEDULER", .udesc = "Counts the number of issue slots every cycle that were not consumed by the backend due to memory reservation stalls in which a scheduler is not able to accept uops", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NON_MEM_SCHEDULER", .udesc = "Counts the number of issue slots every cycle that were not consumed by the backend due to IEC or FPC RAT stalls, which can be due to FIQ or IEC reservation stalls in which the integer, floating point or SIMD scheduler is not able to accept uops", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGISTER", .udesc = "Counts the number of issue slots every cycle that were not consumed by the backend due to the physical register file unable to accept an entry (marble stalls)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REORDER_BUFFER", .udesc = "Counts the number of issue slots every cycle that were not consumed by the backend due to the reorder buffer being full (ROB stalls)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SERIALIZATION", .udesc = "Counts the number of issue slots every cycle that were not consumed by the backend due to scoreboards from the instruction queue (IQ), jump execution unit (JEU), or microcode sequencer (MS)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_grt_topdown_fe_bound[]={ { .uname = "ALL", .udesc = "Counts the total number of issue slots every cycle that were not consumed by the backend due to frontend stalls", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "BRANCH_DETECT", .udesc 
= "Counts the number of issue slots every cycle that were not delivered by the frontend due to BACLEARS", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BRANCH_RESTEER", .udesc = "Counts the number of issue slots every cycle that were not delivered by the frontend due to BTCLEARS", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CISC", .udesc = "Counts the number of issue slots every cycle that were not delivered by the frontend due to the microcode sequencer (MS)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DECODE", .udesc = "Counts the number of issue slots every cycle that were not delivered by the frontend due to decode stalls", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FRONTEND_BANDWIDTH", .udesc = "Counts the number of issue slots every cycle that were not delivered by the frontend due to frontend bandwidth restrictions due to decode, predecode, cisc, and other limitations", .ucode = 0x8d00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FRONTEND_LATENCY", .udesc = "Counts the number of issue slots every cycle that were not delivered by the frontend due to a latency related stalls including BACLEARs, BTCLEARs, ITLB misses, and ICache misses", .ucode = 0x7200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ICACHE", .udesc = "Counts the number of issue slots every cycle that were not delivered by the frontend due to instruction cache misses", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ITLB", .udesc = "Counts the number of issue slots every cycle that were not delivered by the frontend due to ITLB misses", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OTHER", .udesc = "Counts the number of issue slots every cycle that were not delivered by the frontend due to other common frontend stalls not categorized", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PREDECODE", .udesc = "Counts the number of issue slots every cycle that were not delivered 
by the frontend due to wrong predecodes", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t adl_grt_topdown_retiring[]={ { .uname = "ALL", .udesc = "Counts the total number of consumed retirement slots", .ucode = 0x0000ull, .uflags = INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t adl_grt_uops_retired[]={ { .uname = "ALL", .udesc = "Counts the total number of uops retired", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "FPDIV", .udesc = "Counts the number of floating point divide uops retired (x87 and SSE, including x87 sqrt)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "IDIV", .udesc = "Counts the number of integer divide uops retired", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "MS", .udesc = "Counts the number of uops that are from complex flows issued by the micro-sequencer (MS)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "X87", .udesc = "Counts the number of x87 uops retired, includes those in MS flows", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_entry_t intel_adl_grt_pe[]={ { .name = "BACLEARS", .desc = "Counts the total number of BACLEARS due to all branch types including conditional and unconditional jumps, returns, and indirect branches", .code = 0x00e6, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_baclears), .umasks = adl_grt_baclears, }, { .name = "BR_INST_RETIRED", .desc = "Counts the total number of branch instructions retired for all branch types", .code = 0x00c4, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_br_inst_retired), .umasks = adl_grt_br_inst_retired, }, { .name = "BR_MISP_RETIRED", .desc = "Counts the total number of mispredicted branch 
instructions retired for all branch types", .code = 0x00c5, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_br_misp_retired), .umasks = adl_grt_br_misp_retired, }, { .name = "CPU_CLK_UNHALTED", .desc = "Counts the number of unhalted core clock cycles", .code = 0x003c, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x1ull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_cpu_clk_unhalted), .umasks = adl_grt_cpu_clk_unhalted, }, { .name = "DTLB_LOAD_MISSES", .desc = "Counts the number of page walks completed due to load DTLB misses to any page size", .code = 0x0008, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_dtlb_load_misses), .umasks = adl_grt_dtlb_load_misses, }, { .name = "DTLB_STORE_MISSES", .desc = "Counts the number of page walks completed due to store DTLB misses to any page size", .code = 0x0049, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_dtlb_store_misses), .umasks = adl_grt_dtlb_store_misses, }, { .name = "ICACHE", .desc = "Counts the number of instruction cache misses", .code = 0x0080, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_icache), .umasks = adl_grt_icache, }, { .name = "INST_RETIRED", .desc = "Counts the total number of instructions retired. 
(Fixed event)", .code = 0x00c0, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x1ull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_inst_retired), .umasks = adl_grt_inst_retired, }, { .name = "ITLB_MISSES", .desc = "Counts the number of page walks initiated by a instruction fetch that missed the first and second level TLBs", .code = 0x0085, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_itlb_misses), .umasks = adl_grt_itlb_misses, }, { .name = "LBR_INSERTS", .desc = "This event is deprecated. [This event is alias to MISC_RETIRED.LBR_INSERTS]", .code = 0x00e4, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_PEBS | INTEL_X86_DEPRECATED, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_lbr_inserts), .umasks = adl_grt_lbr_inserts, }, { .name = "LD_BLOCKS", .desc = "Counts the number of retired loads that are blocked because its address exactly matches an older store whose data is not ready", .code = 0x0003, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_ld_blocks), .umasks = adl_grt_ld_blocks, }, { .name = "LD_HEAD", .desc = "Counts the number of cycles that the head (oldest load) of the load buffer and retirement are both stalled due to a DL1 miss", .code = 0x0005, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_ld_head), .umasks = adl_grt_ld_head, }, { .name = "LONGEST_LAT_CACHE", .desc = "Counts the number of cacheable memory requests that miss in the LLC. 
Counts on a per core basis", .code = 0x002e, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_longest_lat_cache), .umasks = adl_grt_longest_lat_cache, }, { .name = "MACHINE_CLEARS", .desc = "Counts the number of machine clears due to program modifying data (self modifying code) within 1K of a recently fetched code page", .code = 0x00c3, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_machine_clears), .umasks = adl_grt_machine_clears, }, { .name = "MEM_BOUND_STALLS", .desc = "Counts the number of cycles the core is stalled due to a demand load which hit in the L2 cache", .code = 0x0034, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_mem_bound_stalls), .umasks = adl_grt_mem_bound_stalls, }, { .name = "MEM_LOAD_UOPS_RETIRED", .desc = "Counts the number of load uops retired that hit in the L2 cache", .code = 0x00d1, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_mem_load_uops_retired), .umasks = adl_grt_mem_load_uops_retired, }, { .name = "MEM_SCHEDULER_BLOCK", .desc = "Counts the number of cycles that uops are blocked due to a store buffer full condition", .code = 0x0004, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_mem_scheduler_block), .umasks = adl_grt_mem_scheduler_block, }, { .name = "MEM_UOPS_RETIRED", .desc = "Counts the number of tagged loads with an instruction latency that exceeds or equals the threshold of 4 cycles as defined in MEC_CR_PEBS_LD_LAT_THRESHOLD (3F6H). 
Only counts with PEBS enabled", .code = 0x00d0, .modmsk = INTEL_V2_ATTRS | _INTEL_X86_ATTR_LDLAT, .cntmsk = 0x3ull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_mem_uops_retired), .umasks = adl_grt_mem_uops_retired, }, { .name = "MISC_RETIRED", .desc = "Counts the number of LBR entries recorded. Requires LBRs to be enabled in IA32_LBR_CTL. [This event is alias to LBR_INSERTS.ANY]", .code = 0x00e4, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_misc_retired), .umasks = adl_grt_misc_retired, }, { .name = "OCR0", .desc = "Counts demand data reads that have any type of response", .code = 0x01b7, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_ocr), .umasks = adl_grt_ocr, }, { .name = "OCR1", .desc = "Counts demand data reads that have any type of response", .code = 0x02b7, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_ocr), .umasks = adl_grt_ocr, }, { .name = "SERIALIZATION", .desc = "Counts the number of issue slots not consumed by the backend due to a micro-sequencer (MS) scoreboard, which stalls the front-end from issuing from the UROM until a specified older uop retires", .code = 0x0075, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_serialization), .umasks = adl_grt_serialization, }, { .name = "TOPDOWN_BAD_SPECULATION", .desc = "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear", .code = 0x0073, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_topdown_bad_speculation), .umasks = adl_grt_topdown_bad_speculation, }, { .name = "TOPDOWN_BE_BOUND", .desc = "Counts the total number of 
issue slots every cycle that were not consumed by the backend due to backend stalls", .code = 0x0074, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_topdown_be_bound), .umasks = adl_grt_topdown_be_bound, }, { .name = "TOPDOWN_FE_BOUND", .desc = "Counts the total number of issue slots every cycle that were not consumed by the backend due to frontend stalls", .code = 0x0071, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_topdown_fe_bound), .umasks = adl_grt_topdown_fe_bound, }, { .name = "TOPDOWN_RETIRING", .desc = "Counts the total number of consumed retirement slots", .code = 0x00c2, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_PEBS | INTEL_X86_CODE_DUP, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_topdown_retiring), .umasks = adl_grt_topdown_retiring, }, { .name = "UOPS_RETIRED", .desc = "Counts the total number of uops retired", .code = 0x00c2, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3full, .ngrp = 1, .flags = INTEL_X86_PEBS | INTEL_X86_CODE_DUP, .numasks= LIBPFM_ARRAY_SIZE(adl_grt_uops_retired), .umasks = adl_grt_uops_retired, }, }; /* 26 events available */

papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_atom_events.h

/* * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or
substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. * * PMU: atom (Intel Atom) */ static const intel_x86_umask_t atom_l2_reject_busq[]={ { .uname = "MESI", .udesc = "Any cacheline access", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "I_STATE", .udesc = "Invalid cacheline", .ucode = 0x100, .grpid = 0, }, { .uname = "S_STATE", .udesc = "Shared cacheline", .ucode = 0x200, .grpid = 0, }, { .uname = "E_STATE", .udesc = "Exclusive cacheline", .ucode = 0x400, .grpid = 0, }, { .uname = "M_STATE", .udesc = "Modified cacheline", .ucode = 0x800, .grpid = 0, }, { .uname = "SELF", .udesc = "This core", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, { .uname = "BOTH_CORES", .udesc = "Both cores", .ucode = 0xc000, .grpid = 1, }, { .uname = "ANY", .udesc = "All inclusive", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 2, }, { .uname = "PREFETCH", .udesc = "Hardware prefetch only", .ucode = 0x1000, .grpid = 2, }, }; static const intel_x86_umask_t atom_icache[]={ { .uname = "ACCESSES", .udesc = "Instruction fetches, including uncacheable fetches", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MISSES", .udesc = "Count all instruction fetches that miss the icache or produce memory requests. This includes uncacheable fetches.
Any instruction fetch miss is counted only once and not once for every cycle it is outstanding", .ucode = 0x200, }, }; static const intel_x86_umask_t atom_l2_lock[]={ { .uname = "MESI", .udesc = "Any cacheline access", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "I_STATE", .udesc = "Invalid cacheline", .ucode = 0x100, .grpid = 0, }, { .uname = "S_STATE", .udesc = "Shared cacheline", .ucode = 0x200, .grpid = 0, }, { .uname = "E_STATE", .udesc = "Exclusive cacheline", .ucode = 0x400, .grpid = 0, }, { .uname = "M_STATE", .udesc = "Modified cacheline", .ucode = 0x800, .grpid = 0, }, { .uname = "SELF", .udesc = "This core", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, { .uname = "BOTH_CORES", .udesc = "Both cores", .ucode = 0xc000, .grpid = 1, }, }; static const intel_x86_umask_t atom_uops_retired[]={ { .uname = "ANY", .udesc = "Micro-ops retired", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "STALLED_CYCLES", .udesc = "Cycles no micro-ops retired", .ucode = 0x1000 | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "STALLS", .udesc = "Periods no micro-ops retired", .ucode = 0x1000 | INTEL_X86_MOD_EDGE | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t atom_l2_m_lines_out[]={ { .uname = "SELF", .udesc = "This core", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "BOTH_CORES", .udesc = "Both cores", .ucode = 0xc000, .grpid = 0, }, { .uname = "ANY", .udesc = "All inclusive", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, { .uname = "PREFETCH", .udesc = "Hardware prefetch only", .ucode = 0x1000, .grpid = 1, }, }; static const intel_x86_umask_t atom_simd_comp_inst_retired[]={ { .uname = "PACKED_SINGLE", .udesc = "Retired computational Streaming SIMD Extensions (SSE) packed-single instructions", .ucode = 
0x100, }, { .uname = "SCALAR_SINGLE", .udesc = "Retired computational Streaming SIMD Extensions (SSE) scalar-single instructions", .ucode = 0x200, }, { .uname = "PACKED_DOUBLE", .udesc = "Retired computational Streaming SIMD Extensions 2 (SSE2) packed-double instructions", .ucode = 0x400, }, { .uname = "SCALAR_DOUBLE", .udesc = "Retired computational Streaming SIMD Extensions 2 (SSE2) scalar-double instructions", .ucode = 0x800, }, }; static const intel_x86_umask_t atom_simd_sat_uop_exec[]={ { .uname = "S", .udesc = "SIMD saturated arithmetic micro-ops executed", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO, }, { .uname = "AR", .udesc = "SIMD saturated arithmetic micro-ops retired", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t atom_inst_retired[]={ { .uname = "ANY_P", .udesc = "Instructions retired using generic counter (precise event)", .ucode = 0x0, .uflags= INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t atom_l1d_cache[]={ { .uname = "LD", .udesc = "L1 Cacheable Data Reads", .ucode = 0x2100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ST", .udesc = "L1 Cacheable Data Writes", .ucode = 0x2200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t atom_mul[]={ { .uname = "S", .udesc = "Multiply operations executed", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "AR", .udesc = "Multiply operations retired", .ucode = 0x8100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t atom_div[]={ { .uname = "S", .udesc = "Divide operations executed", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "AR", .udesc = "Divide operations retired", .ucode = 0x8100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t atom_bus_trans_p[]={ { .uname = "THIS_AGENT", .udesc = "This agent", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "ALL_AGENTS", .udesc = "Any agent on the bus", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = 
"SELF", .udesc = "This core", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, { .uname = "BOTH_CORES", .udesc = "Both cores", .ucode = 0xc000, .grpid = 1, }, }; static const intel_x86_umask_t atom_bus_io_wait[]={ { .uname = "SELF", .udesc = "This core", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "BOTH_CORES", .udesc = "Both cores", .ucode = 0xc000, }, }; static const intel_x86_umask_t atom_bus_hitm_drv[]={ { .uname = "THIS_AGENT", .udesc = "This agent", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL_AGENTS", .udesc = "Any agent on the bus", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t atom_itlb[]={ { .uname = "FLUSH", .udesc = "ITLB flushes", .ucode = 0x400, }, { .uname = "MISSES", .udesc = "ITLB misses", .ucode = 0x200, }, }; static const intel_x86_umask_t atom_simd_uop_type_exec[]={ { .uname = "MUL_S", .udesc = "SIMD packed multiply micro-ops executed", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MUL_AR", .udesc = "SIMD packed multiply micro-ops retired", .ucode = 0x8100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SHIFT_S", .udesc = "SIMD packed shift micro-ops executed", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SHIFT_AR", .udesc = "SIMD packed shift micro-ops retired", .ucode = 0x8200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACK_S", .udesc = "SIMD packed micro-ops executed", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACK_AR", .udesc = "SIMD packed micro-ops retired", .ucode = 0x8400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "UNPACK_S", .udesc = "SIMD unpacked micro-ops executed", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "UNPACK_AR", .udesc = "SIMD unpacked micro-ops retired", .ucode = 0x8800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOGICAL_S", .udesc = "SIMD packed logical micro-ops executed", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOGICAL_AR", .udesc = 
"SIMD packed logical micro-ops retired", .ucode = 0x9000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ARITHMETIC_S", .udesc = "SIMD packed arithmetic micro-ops executed", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ARITHMETIC_AR", .udesc = "SIMD packed arithmetic micro-ops retired", .ucode = 0xa000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t atom_simd_inst_retired[]={ { .uname = "PACKED_SINGLE", .udesc = "Retired Streaming SIMD Extensions (SSE) packed-single instructions", .ucode = 0x100, }, { .uname = "SCALAR_SINGLE", .udesc = "Retired Streaming SIMD Extensions (SSE) scalar-single instructions", .ucode = 0x200, }, { .uname = "PACKED_DOUBLE", .udesc = "Retired Streaming SIMD Extensions 2 (SSE2) packed-double instructions", .ucode = 0x400, }, { .uname = "SCALAR_DOUBLE", .udesc = "Retired Streaming SIMD Extensions 2 (SSE2) scalar-double instructions", .ucode = 0x800, }, { .uname = "VECTOR", .udesc = "Retired Streaming SIMD Extensions 2 (SSE2) vector instructions", .ucode = 0x1000, }, { .uname = "ANY", .udesc = "Retired Streaming SIMD instructions", .ucode = 0x1f00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t atom_prefetch[]={ { .uname = "PREFETCHT0", .udesc = "Streaming SIMD Extensions (SSE) PrefetchT0 instructions executed", .ucode = 0x100, }, { .uname = "SW_L2", .udesc = "Streaming SIMD Extensions (SSE) PrefetchT1 and PrefetchT2 instructions executed", .ucode = 0x600, }, { .uname = "PREFETCHNTA", .udesc = "Streaming SIMD Extensions (SSE) Prefetch NTA instructions executed", .ucode = 0x800, }, }; static const intel_x86_umask_t atom_l2_rqsts[]={ { .uname = "SELF", .udesc = "This core", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "BOTH_CORES", .udesc = "Both cores", .ucode = 0xc000, .grpid = 0, }, { .uname = "ANY", .udesc = "All inclusive", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, { .uname = "PREFETCH", .udesc = "Hardware 
prefetch only", .ucode = 0x1000, .grpid = 1, }, { .uname = "MESI", .udesc = "Any cacheline access", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 2, }, { .uname = "I_STATE", .udesc = "Invalid cacheline", .ucode = 0x100, .grpid = 2, }, { .uname = "S_STATE", .udesc = "Shared cacheline", .ucode = 0x200, .grpid = 2, }, { .uname = "E_STATE", .udesc = "Exclusive cacheline", .ucode = 0x400, .grpid = 2, }, { .uname = "M_STATE", .udesc = "Modified cacheline", .ucode = 0x800, .grpid = 2, }, }; static const intel_x86_umask_t atom_simd_uops_exec[]={ { .uname = "S", .udesc = "Number of SIMD saturated arithmetic micro-ops executed", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO, }, { .uname = "AR", .udesc = "Number of SIMD saturated arithmetic micro-ops retired", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t atom_machine_clears[]={ { .uname = "SMC", .udesc = "Self-Modifying Code detected", .ucode = 0x100, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t atom_br_inst_retired[]={ { .uname = "ANY", .udesc = "Retired branch instructions", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "PRED_NOT_TAKEN", .udesc = "Retired branch instructions that were predicted not-taken", .ucode = 0x100, }, { .uname = "MISPRED_NOT_TAKEN", .udesc = "Retired branch instructions that were mispredicted not-taken", .ucode = 0x200, }, { .uname = "PRED_TAKEN", .udesc = "Retired branch instructions that were predicted taken", .ucode = 0x400, }, { .uname = "MISPRED_TAKEN", .udesc = "Retired branch instructions that were mispredicted taken", .ucode = 0x800, }, { .uname = "MISPRED", .udesc = "Retired mispredicted branch instructions", .ucode = 0xa00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN", .udesc = "Retired taken branch instructions", .ucode = 0xc00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY1", .udesc = "Retired branch instructions", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t 
atom_macro_insts[]={ { .uname = "NON_CISC_DECODED", .udesc = "Non-CISC macro instructions decoded ", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DECODED", .udesc = "All Instructions decoded", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t atom_segment_reg_loads[]={ { .uname = "ANY", .udesc = "Number of segment register loads", .ucode = 0x8000, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t atom_baclears[]={ { .uname = "ANY", .udesc = "BACLEARS asserted", .ucode = 0x100, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t atom_cycles_int_masked[]={ { .uname = "CYCLES_INT_MASKED", .udesc = "Cycles during which interrupts are disabled", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CYCLES_INT_PENDING_AND_MASKED", .udesc = "Cycles during which interrupts are pending and disabled", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t atom_fp_assist[]={ { .uname = "S", .udesc = "Floating point assists for executed instructions", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "AR", .udesc = "Floating point assists for retired instructions", .ucode = 0x8100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t atom_data_tlb_misses[]={ { .uname = "DTLB_MISS", .udesc = "Memory accesses that missed the DTLB", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DTLB_MISS_LD", .udesc = "DTLB misses due to load operations", .ucode = 0x500, }, { .uname = "L0_DTLB_MISS_LD", .udesc = "L0 (micro-TLB) misses due to load operations", .ucode = 0x900, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DTLB_MISS_ST", .udesc = "DTLB misses due to store operations", .ucode = 0x600, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t atom_store_forwards[]={ { .uname = "GOOD", .udesc = "Good store forwards", .ucode = 0x8100, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t atom_cpu_clk_unhalted[]={ { .uname = "CORE_P", 
.udesc = "Core cycles when core is not halted", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "BUS", .udesc = "Bus cycles when core is not halted. This event can give a measurement of the elapsed time. This event has a constant ratio with the CPU_CLK_UNHALTED:REF event, which is the maximum bus to processor frequency ratio", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "NO_OTHER", .udesc = "Bus cycles when core is active and other is halted", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t atom_mem_load_retired[]={ { .uname = "L2_HIT", .udesc = "Retired loads that hit the L2 cache (precise event)", .ucode = 0x100, .uflags= INTEL_X86_PEBS, }, { .uname = "L2_MISS", .udesc = "Retired loads that miss the L2 cache (precise event)", .ucode = 0x200, .uflags= INTEL_X86_PEBS, }, { .uname = "DTLB_MISS", .udesc = "Retired loads that miss the DTLB (precise event)", .ucode = 0x400, .uflags= INTEL_X86_PEBS, }, }; static const intel_x86_umask_t atom_x87_comp_ops_exe[]={ { .uname = "ANY_S", .udesc = "Floating point computational micro-ops executed", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_AR", .udesc = "Floating point computational micro-ops retired", .ucode = 0x8100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t atom_page_walks[]={ { .uname = "WALKS", .udesc = "Number of page-walks executed", .uequiv = "CYCLES", .ucode = 0x300 | INTEL_X86_MOD_EDGE, .modhw = _INTEL_X86_ATTR_E, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CYCLES", .udesc = "Duration of page-walks in core cycles", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_atom_pe[]={ { .name = "UNHALTED_CORE_CYCLES", .desc = "Unhalted core cycles", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x200000003ull, .code = 0x3c, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles", .modmsk = INTEL_FIXED3_ATTRS, .cntmsk = 0x400000000ull, .code = 0x0300, /* pseudo encoding */ .flags 
= INTEL_X86_FIXED, }, { .name = "INSTRUCTION_RETIRED", .desc = "Instructions retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x100000003ull, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "This is an alias for INSTRUCTION_RETIRED", .modmsk = INTEL_V3_ATTRS, .equiv = "INSTRUCTION_RETIRED", .cntmsk = 0x10003, .code = 0xc0, }, { .name = "LLC_REFERENCES", .desc = "Last level of cache references", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x4f2e, }, { .name = "LAST_LEVEL_CACHE_REFERENCES", .desc = "This is an alias for LLC_REFERENCES", .modmsk = INTEL_V3_ATTRS, .equiv = "LLC_REFERENCES", .cntmsk = 0x3, .code = 0x4f2e, }, { .name = "LLC_MISSES", .desc = "Last level of cache misses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x412e, }, { .name = "LAST_LEVEL_CACHE_MISSES", .desc = "This is an alias for LLC_MISSES", .modmsk = INTEL_V3_ATTRS, .equiv = "LLC_MISSES", .cntmsk = 0x3, .code = 0x412e, }, { .name = "BRANCH_INSTRUCTIONS_RETIRED", .desc = "Branch instructions retired", .modmsk = INTEL_V3_ATTRS, .equiv = "BR_INST_RETIRED:ANY", .cntmsk = 0x3, .code = 0xc4, }, { .name = "MISPREDICTED_BRANCH_RETIRED", .desc = "Mispredicted branch instruction retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xc5, .flags= INTEL_X86_PEBS, }, { .name = "SIMD_INSTR_RETIRED", .desc = "SIMD Instructions retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xce, }, { .name = "L2_REJECT_BUSQ", .desc = "Rejected L2 cache requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x30, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_reject_busq), .ngrp = 3, .umasks = atom_l2_reject_busq, }, { .name = "SIMD_SAT_INSTR_RETIRED", .desc = "Saturated arithmetic instructions retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xcf, }, { .name = "ICACHE", .desc = "Instruction fetches", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x80, .numasks = LIBPFM_ARRAY_SIZE(atom_icache), .ngrp = 1, .umasks = atom_icache, }, { .name = "L2_LOCK", .desc = "L2 locked accesses", 
.modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x2b, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_lock), .ngrp = 2, .umasks = atom_l2_lock, }, { .name = "UOPS_RETIRED", .desc = "Micro-ops retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xc2, .numasks = LIBPFM_ARRAY_SIZE(atom_uops_retired), .ngrp = 1, .umasks = atom_uops_retired, }, { .name = "L2_M_LINES_OUT", .desc = "Modified lines evicted from the L2 cache", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x27, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_m_lines_out), .ngrp = 2, .umasks = atom_l2_m_lines_out, }, { .name = "SIMD_COMP_INST_RETIRED", .desc = "Retired computational Streaming SIMD Extensions (SSE) instructions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xca, .numasks = LIBPFM_ARRAY_SIZE(atom_simd_comp_inst_retired), .ngrp = 1, .umasks = atom_simd_comp_inst_retired, }, { .name = "SNOOP_STALL_DRV", .desc = "Bus stalled for snoops", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x7e, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_m_lines_out), .ngrp = 2, .umasks = atom_l2_m_lines_out, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_BURST", .desc = "Burst (full cache-line) bus transactions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x6e, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_m_lines_out), .ngrp = 2, .umasks = atom_l2_m_lines_out, /* identical to actual umasks list for this event */ }, { .name = "SIMD_SAT_UOP_EXEC", .desc = "SIMD saturated arithmetic micro-ops executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xb1, .numasks = LIBPFM_ARRAY_SIZE(atom_simd_sat_uop_exec), .ngrp = 1, .umasks = atom_simd_sat_uop_exec, }, { .name = "BUS_TRANS_IO", .desc = "IO bus transactions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x6c, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_m_lines_out), .ngrp = 2, .umasks = atom_l2_m_lines_out, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_RFO", .desc = "RFO bus transactions", .modmsk = INTEL_V3_ATTRS, 
.cntmsk = 0x3, .code = 0x66, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_m_lines_out), .ngrp = 2, .umasks = atom_l2_m_lines_out, /* identical to actual umasks list for this event */ }, { .name = "SIMD_ASSIST", .desc = "SIMD assists invoked", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xcd, }, { .name = "INST_RETIRED", .desc = "Instructions retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xc0, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(atom_inst_retired), .ngrp = 1, .umasks = atom_inst_retired, }, { .name = "L1D_CACHE", .desc = "L1 Cacheable Data Reads", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x40, .numasks = LIBPFM_ARRAY_SIZE(atom_l1d_cache), .ngrp = 1, .umasks = atom_l1d_cache, }, { .name = "MUL", .desc = "Multiply operations executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x12, .numasks = LIBPFM_ARRAY_SIZE(atom_mul), .ngrp = 1, .umasks = atom_mul, }, { .name = "DIV", .desc = "Divide operations executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x13, .numasks = LIBPFM_ARRAY_SIZE(atom_div), .ngrp = 1, .umasks = atom_div, }, { .name = "BUS_TRANS_P", .desc = "Partial bus transactions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x6b, .numasks = LIBPFM_ARRAY_SIZE(atom_bus_trans_p), .ngrp = 2, .umasks = atom_bus_trans_p, }, { .name = "BUS_IO_WAIT", .desc = "IO requests waiting in the bus queue", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x7f, .numasks = LIBPFM_ARRAY_SIZE(atom_bus_io_wait), .ngrp = 1, .umasks = atom_bus_io_wait, }, { .name = "L2_M_LINES_IN", .desc = "L2 cache line modifications", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x25, .numasks = LIBPFM_ARRAY_SIZE(atom_bus_io_wait), .ngrp = 1, .umasks = atom_bus_io_wait, /* identical to actual umasks list for this event */ }, { .name = "L2_LINES_IN", .desc = "L2 cache misses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x24, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_m_lines_out), .ngrp = 2, .umasks = atom_l2_m_lines_out, /* identical to 
actual umasks list for this event */ }, { .name = "BUSQ_EMPTY", .desc = "Bus queue is empty", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x7d, .numasks = LIBPFM_ARRAY_SIZE(atom_bus_io_wait), .ngrp = 1, .umasks = atom_bus_io_wait, /* identical to actual umasks list for this event */ }, { .name = "L2_IFETCH", .desc = "L2 cacheable instruction fetch requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x28, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_lock), .ngrp = 2, .umasks = atom_l2_lock, /* identical to actual umasks list for this event */ }, { .name = "BUS_HITM_DRV", .desc = "HITM signal asserted", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x7b, .numasks = LIBPFM_ARRAY_SIZE(atom_bus_hitm_drv), .ngrp = 1, .umasks = atom_bus_hitm_drv, }, { .name = "ITLB", .desc = "ITLB hits", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x82, .numasks = LIBPFM_ARRAY_SIZE(atom_itlb), .ngrp = 1, .umasks = atom_itlb, }, { .name = "BUS_TRANS_MEM", .desc = "Memory bus transactions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x6f, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_m_lines_out), .ngrp = 2, .umasks = atom_l2_m_lines_out, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_PWR", .desc = "Partial write bus transaction", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x6a, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_m_lines_out), .ngrp = 2, .umasks = atom_l2_m_lines_out, /* identical to actual umasks list for this event */ }, { .name = "BR_INST_DECODED", .desc = "Branch instructions decoded", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x1e0, }, { .name = "BUS_TRANS_INVAL", .desc = "Invalidate bus transactions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x69, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_m_lines_out), .ngrp = 2, .umasks = atom_l2_m_lines_out, /* identical to actual umasks list for this event */ }, { .name = "SIMD_UOP_TYPE_EXEC", .desc = "SIMD micro-ops executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xb3, .numasks 
= LIBPFM_ARRAY_SIZE(atom_simd_uop_type_exec), .ngrp = 1, .umasks = atom_simd_uop_type_exec, }, { .name = "SIMD_INST_RETIRED", .desc = "Retired Streaming SIMD Extensions (SSE) instructions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xc7, .numasks = LIBPFM_ARRAY_SIZE(atom_simd_inst_retired), .ngrp = 1, .umasks = atom_simd_inst_retired, }, { .name = "CYCLES_DIV_BUSY", .desc = "Cycles the divider is busy", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x14, }, { .name = "PREFETCH", .desc = "Streaming SIMD Extensions (SSE) PrefetchT0 instructions executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x7, .numasks = LIBPFM_ARRAY_SIZE(atom_prefetch), .ngrp = 1, .umasks = atom_prefetch, }, { .name = "L2_RQSTS", .desc = "L2 cache requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x2e, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_rqsts), .ngrp = 3, .umasks = atom_l2_rqsts, }, { .name = "SIMD_UOPS_EXEC", .desc = "SIMD micro-ops executed (excluding stores)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xb0, .numasks = LIBPFM_ARRAY_SIZE(atom_simd_uops_exec), .ngrp = 1, .umasks = atom_simd_uops_exec, }, { .name = "HW_INT_RCV", .desc = "Hardware interrupts received (warning overcounts by 2x)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x1c8, }, { .name = "BUS_TRANS_BRD", .desc = "Burst read bus transactions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x65, .numasks = LIBPFM_ARRAY_SIZE(atom_bus_trans_p), .ngrp = 2, .umasks = atom_bus_trans_p, /* identical to actual umasks list for this event */ }, { .name = "BOGUS_BR", .desc = "Bogus branches", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xe4, }, { .name = "BUS_DATA_RCV", .desc = "Bus cycles while processor receives data", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x64, .numasks = LIBPFM_ARRAY_SIZE(atom_bus_io_wait), .ngrp = 1, .umasks = atom_bus_io_wait, /* identical to actual umasks list for this event */ }, { .name = "MACHINE_CLEARS", .desc = "Self-Modifying Code detected", 
.modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xc3, .numasks = LIBPFM_ARRAY_SIZE(atom_machine_clears), .ngrp = 1, .umasks = atom_machine_clears, }, { .name = "BR_INST_RETIRED", .desc = "Retired branch instructions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xc4, .numasks = LIBPFM_ARRAY_SIZE(atom_br_inst_retired), .ngrp = 1, .umasks = atom_br_inst_retired, }, { .name = "L2_ADS", .desc = "Cycles L2 address bus is in use", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x21, .numasks = LIBPFM_ARRAY_SIZE(atom_bus_io_wait), .ngrp = 1, .umasks = atom_bus_io_wait, /* identical to actual umasks list for this event */ }, { .name = "EIST_TRANS", .desc = "Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x3a, }, { .name = "BUS_TRANS_WB", .desc = "Explicit writeback bus transactions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x67, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_m_lines_out), .ngrp = 2, .umasks = atom_l2_m_lines_out, /* identical to actual umasks list for this event */ }, { .name = "MACRO_INSTS", .desc = "Macro-instructions decoded", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xaa, .numasks = LIBPFM_ARRAY_SIZE(atom_macro_insts), .ngrp = 1, .umasks = atom_macro_insts, }, { .name = "L2_LINES_OUT", .desc = "L2 cache lines evicted. 
", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x26, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_m_lines_out), .ngrp = 2, .umasks = atom_l2_m_lines_out, /* identical to actual umasks list for this event */ }, { .name = "L2_LD", .desc = "L2 cache reads", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x29, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_rqsts), .ngrp = 3, .umasks = atom_l2_rqsts, /* identical to actual umasks list for this event */ }, { .name = "SEGMENT_REG_LOADS", .desc = "Number of segment register loads", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x6, .numasks = LIBPFM_ARRAY_SIZE(atom_segment_reg_loads), .ngrp = 1, .umasks = atom_segment_reg_loads, }, { .name = "L2_NO_REQ", .desc = "Cycles no L2 cache requests are pending", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x32, .numasks = LIBPFM_ARRAY_SIZE(atom_bus_io_wait), .ngrp = 1, .umasks = atom_bus_io_wait, /* identical to actual umasks list for this event */ }, { .name = "THERMAL_TRIP", .desc = "Number of thermal trips", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xc03b, }, { .name = "EXT_SNOOP", .desc = "External snoops", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x77, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_lock), .ngrp = 2, .umasks = atom_l2_lock, /* identical to actual umasks list for this event */ }, { .name = "BACLEARS", .desc = "Branch address calculator", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xe6, .numasks = LIBPFM_ARRAY_SIZE(atom_baclears), .ngrp = 1, .umasks = atom_baclears, }, { .name = "CYCLES_INT_MASKED", .desc = "Cycles during which interrupts are disabled", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xc6, .numasks = LIBPFM_ARRAY_SIZE(atom_cycles_int_masked), .ngrp = 1, .umasks = atom_cycles_int_masked, }, { .name = "FP_ASSIST", .desc = "Floating point assists", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x11, .numasks = LIBPFM_ARRAY_SIZE(atom_fp_assist), .ngrp = 1, .umasks = atom_fp_assist, }, { .name = "L2_ST", .desc = "L2 store requests", 
.modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x2a, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_lock), .ngrp = 2, .umasks = atom_l2_lock, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_DEF", .desc = "Deferred bus transactions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x6d, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_m_lines_out), .ngrp = 2, .umasks = atom_l2_m_lines_out, /* identical to actual umasks list for this event */ }, { .name = "DATA_TLB_MISSES", .desc = "Memory accesses that missed the DTLB", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x8, .numasks = LIBPFM_ARRAY_SIZE(atom_data_tlb_misses), .ngrp = 1, .umasks = atom_data_tlb_misses, }, { .name = "BUS_BNR_DRV", .desc = "Number of Bus Not Ready signals asserted", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x61, .numasks = LIBPFM_ARRAY_SIZE(atom_bus_hitm_drv), .ngrp = 1, .umasks = atom_bus_hitm_drv, /* identical to actual umasks list for this event */ }, { .name = "STORE_FORWARDS", .desc = "All store forwards", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x2, .numasks = LIBPFM_ARRAY_SIZE(atom_store_forwards), .ngrp = 1, .umasks = atom_store_forwards, }, { .name = "CPU_CLK_UNHALTED", .desc = "Core cycles when core is not halted", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x3c, .numasks = LIBPFM_ARRAY_SIZE(atom_cpu_clk_unhalted), .ngrp = 1, .umasks = atom_cpu_clk_unhalted, }, { .name = "BUS_TRANS_ANY", .desc = "All bus transactions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x70, .numasks = LIBPFM_ARRAY_SIZE(atom_l2_m_lines_out), .ngrp = 2, .umasks = atom_l2_m_lines_out, /* identical to actual umasks list for this event */ }, { .name = "MEM_LOAD_RETIRED", .desc = "Retired loads", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xcb, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(atom_mem_load_retired), .ngrp = 1, .umasks = atom_mem_load_retired, }, { .name = "X87_COMP_OPS_EXE", .desc = "Floating point computational micro-ops executed", .modmsk = 
INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x10, .numasks = LIBPFM_ARRAY_SIZE(atom_x87_comp_ops_exe), .ngrp = 1, .umasks = atom_x87_comp_ops_exe, }, { .name = "PAGE_WALKS", .desc = "Number of page-walks executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0xc, .numasks = LIBPFM_ARRAY_SIZE(atom_page_walks), .ngrp = 1, .umasks = atom_page_walks, }, { .name = "BUS_LOCK_CLOCKS", .desc = "Bus cycles when a LOCK signal is asserted", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x63, .numasks = LIBPFM_ARRAY_SIZE(atom_bus_trans_p), .ngrp = 2, .umasks = atom_bus_trans_p, /* identical to actual umasks list for this event */ }, { .name = "BUS_REQUEST_OUTSTANDING", .desc = "Outstanding cacheable data read bus requests duration", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x60, .numasks = LIBPFM_ARRAY_SIZE(atom_bus_trans_p), .ngrp = 2, .umasks = atom_bus_trans_p, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_IFETCH", .desc = "Instruction-fetch bus transactions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x68, .numasks = LIBPFM_ARRAY_SIZE(atom_bus_trans_p), .ngrp = 2, .umasks = atom_bus_trans_p, /* identical to actual umasks list for this event */ }, { .name = "BUS_HIT_DRV", .desc = "HIT signal asserted", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x7a, .numasks = LIBPFM_ARRAY_SIZE(atom_bus_hitm_drv), .ngrp = 1, .umasks = atom_bus_hitm_drv, /* identical to actual umasks list for this event */ }, { .name = "BUS_DRDY_CLOCKS", .desc = "Bus cycles when data is sent on the bus", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x62, .numasks = LIBPFM_ARRAY_SIZE(atom_bus_hitm_drv), .ngrp = 1, .umasks = atom_bus_hitm_drv, /* identical to actual umasks list for this event */ }, { .name = "L2_DBUS_BUSY", .desc = "Cycles the L2 cache data bus is busy", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x22, .numasks = LIBPFM_ARRAY_SIZE(atom_bus_io_wait), .ngrp = 1, .umasks = atom_bus_io_wait, /* identical to actual umasks list for 
this event */ }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_bdw_events.h000066400000000000000000003072471502707512200235570ustar00rootroot00000000000000/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: bdw (Intel Broadwell) */ static const intel_x86_umask_t bdw_baclears[]={ { .uname = "ANY", .udesc = "Number of front-end re-steers due to BPU misprediction", .ucode = 0x1f00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t bdw_br_inst_exec[]={ { .uname = "NONTAKEN_CONDITIONAL", .udesc = "All macro conditional nontaken branch instructions", .ucode = 0x4100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NONTAKEN_COND", .udesc = "All macro conditional nontaken branch instructions", .ucode = 0x4100, .uequiv = "NONTAKEN_CONDITIONAL", .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_CONDITIONAL", .udesc = "Taken speculative and retired macro-conditional branches", .ucode = 0x8100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_COND", .udesc = "Taken speculative and retired macro-conditional branches", .ucode = 0x8100, .uequiv = "TAKEN_CONDITIONAL", .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_DIRECT_JUMP", .udesc = "Taken speculative and retired macro-conditional branch instructions excluding calls and indirects", .ucode = 0x8200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_INDIRECT_JUMP_NON_CALL_RET", .udesc = "Taken speculative and retired indirect branches excluding calls and returns", .ucode = 0x8400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_INDIRECT_NEAR_RETURN", .udesc = "Taken speculative and retired indirect branches with return mnemonic", .ucode = 0x8800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_DIRECT_NEAR_CALL", .udesc = "Taken speculative and retired direct near calls", .ucode = 0x9000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_CONDITIONAL", .udesc = "Speculative and retired macro-conditional branches", .ucode = 0xc100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_COND", .udesc = "Speculative and retired macro-conditional branches", .ucode = 0xc100, .uequiv = "ALL_CONDITIONAL", .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY_COND", .udesc = "Speculative and retired macro-conditional 
branches", .ucode = 0xc100, .uequiv = "ALL_CONDITIONAL", .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DIRECT_JMP", .udesc = "Speculative and retired macro-unconditional branches excluding calls and indirects", .ucode = 0xc200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_INDIRECT_JUMP_NON_CALL_RET", .udesc = "Speculative and retired indirect branches excluding calls and returns", .ucode = 0xc400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_INDIRECT_NEAR_RETURN", .udesc = "Speculative and retired indirect return branches", .ucode = 0xc800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DIRECT_NEAR_CALL", .udesc = "Speculative and retired direct near calls", .ucode = 0xd000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_INDIRECT_NEAR_CALL", .udesc = "All indirect calls, including both register and memory indirect", .ucode = 0xa000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_BRANCHES", .udesc = "All branch instructions executed", .ucode = 0xff00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t bdw_br_inst_retired[]={ { .uname = "CONDITIONAL", .udesc = "Counts all taken and not taken macro conditional branch instructions", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND", .udesc = "Counts all taken and not taken macro conditional branch instructions", .ucode = 0x100, .uequiv = "CONDITIONAL", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_CALL", .udesc = "Counts all macro direct and indirect near calls", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_BRANCHES", .udesc = "Counts all taken and not taken macro branches including far branches (architectural event)", .ucode = 0x0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_PEBS, }, { .uname = "NEAR_RETURN", .udesc = "Counts the number of near ret instructions retired", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NOT_TAKEN", .udesc = "Counts all not 
taken macro branch instructions retired", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_TAKEN", .udesc = "Counts the number of near branch taken instructions retired", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "FAR_BRANCH", .udesc = "Counts the number of far branch instructions retired", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t bdw_br_misp_exec[]={ { .uname = "NONTAKEN_CONDITIONAL", .udesc = "Not taken speculative and retired mispredicted macro conditional branches", .ucode = 0x4100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NONTAKEN_COND", .udesc = "Not taken speculative and retired mispredicted macro conditional branches", .ucode = 0x4100, .uequiv = "NONTAKEN_CONDITIONAL", .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_CONDITIONAL", .udesc = "Taken speculative and retired mispredicted macro conditional branches", .ucode = 0x8100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_COND", .udesc = "Taken speculative and retired mispredicted macro conditional branches", .ucode = 0x8100, .uequiv = "TAKEN_CONDITIONAL", .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_INDIRECT_JUMP_NON_CALL_RET", .udesc = "Taken speculative and retired mispredicted indirect branches excluding calls and returns", .ucode = 0x8400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_CONDITIONAL", .udesc = "Speculative and retired mispredicted macro conditional branches", .ucode = 0xc100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY_COND", .udesc = "Speculative and retired mispredicted macro conditional branches", .ucode = 0xc100, .uequiv = "ALL_CONDITIONAL", .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_INDIRECT_JUMP_NON_CALL_RET", .udesc = "All mispredicted indirect branches that are not calls nor returns", .ucode = 0xc400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_BRANCHES", .udesc = "Speculative and retired mispredicted macro conditional branches", .ucode 
= 0xff00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "TAKEN_INDIRECT_NEAR_CALL", .udesc = "Taken speculative and retired mispredicted indirect calls", .ucode = 0xa000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_RETURN_NEAR", .udesc = "Taken speculative and retired mispredicted direct returns", .ucode = 0x8800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_br_misp_retired[]={ { .uname = "CONDITIONAL", .udesc = "All mispredicted macro conditional branch instructions", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND", .udesc = "All mispredicted macro conditional branch instructions", .ucode = 0x100, .uequiv = "CONDITIONAL", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_BRANCHES", .udesc = "All mispredicted macro branches (architectural event)", .ucode = 0x0, /* architectural encoding */ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "NEAR_TAKEN", .udesc = "Number of near branch instructions retired that were mispredicted and taken", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RET", .udesc = "Number of mispredicted ret instructions retired", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t bdw_cpl_cycles[]={ { .uname = "RING0", .udesc = "Unhalted core cycles when the thread is in ring 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RING123", .udesc = "Unhalted core cycles when thread is in rings 1, 2, or 3", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RING0_TRANS", .udesc = "Number of intervals between processor halts while thread is in ring 0", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t bdw_cpu_clk_thread_unhalted[]={ { .uname = "REF_XCLK", .udesc = "Count Xclk pulses (100Mhz) when the 
core is unhalted", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REF_XCLK_ANY", .udesc = "Count Xclk pulses (100Mhz) when the at least one thread on the physical core is unhalted", .ucode = 0x100 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "REF_XCLK:t", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "REF_P", .udesc = "Cycles when the core is unhalted (count at 100 Mhz)", .ucode = 0x100, .uequiv = "REF_XCLK", .uflags= INTEL_X86_NCOMBO, }, { .uname = "THREAD_P", .udesc = "Cycles when thread is not halted", .ucode = 0x000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ONE_THREAD_ACTIVE", .udesc = "Counts Xclk (100Mhz) pulses when this thread is unhalted and the other thread is halted", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_cycle_activity[]={ { .uname = "CYCLES_L2_PENDING", .udesc = "Cycles with pending L2 miss loads (must use with HT off only)", .ucode = 0x0100 | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, .ucntmsk= 0xf, }, { .uname = "CYCLES_LDM_PENDING", .udesc = "Cycles with pending memory loads", .ucode = 0x0200 | (0x2 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uequiv = "CYCLES_MEM_ANY", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_MEM_ANY", .udesc = "Cycles with pending memory loads", .ucode = 0x0200 | (0x2 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_L1D_PENDING", .udesc = "Cycles with pending L1D load cache misses", .ucode = 0x0800 | (0x8 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .ucntmsk= 0x4, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_LDM_PENDING", .udesc = "Executions stalls when there is at least one pending demand load request", .ucode = 0x0600 | (0x6 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .ucntmsk= 0x4, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_L1D_PENDING", .udesc = "Executions stalls while there is at least 
one L1D demand load outstanding", .ucode = 0x0c00 | (0xc << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .ucntmsk= 0x4, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_L2_PENDING", .udesc = "Execution stalls while there is at least one L2 demand load pending outstanding", .ucode = 0x0500 | (0x5 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .ucntmsk= 0xf, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_TOTAL", .udesc = "Cycles during which no instructions were executed in the execution stage of the pipeline", .ucode = 0x0400 | (0x4 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .ucntmsk= 0xf, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_NO_EXECUTE", .udesc = "Cycles during which no instructions were executed in the execution stage of the pipeline", .ucode = 0x0400 | (0x4 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uequiv = "STALLS_TOTAL", .ucntmsk= 0xf, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_dtlb_load_misses[]={ { .uname = "MISS_CAUSES_A_WALK", .udesc = "Misses in all DTLB levels that cause page walks", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Misses in all TLB levels causes a page walk that completes (4K)", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Misses in all TLB levels causes a page walk of 2MB/4MB page sizes that completes", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_1G", .udesc = "Misses in all TLB levels causes a page walk of 1GB page sizes that completes", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "Misses in all TLB levels causes a page walk of any page size that completes", .ucode = 0xe00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_DURATION", .udesc = "Cycles when PMH is busy with page walks", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT_4K", .udesc = "Misses that miss the DTLB and hit the 
STLB (4KB)", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT_2M", .udesc = "Misses that miss the DTLB and hit the STLB (2MB)", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT", .udesc = "Number of cache load STLB hits. No page walk", .ucode = 0x6000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_itlb_misses[]={ { .uname = "MISS_CAUSES_A_WALK", .udesc = "Misses in all DTLB levels that cause page walks", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Misses in all TLB levels causes a page walk that completes (4KB)", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Misses in all TLB levels causes a page walk that completes (2MB/4MB)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_1G", .udesc = "Misses in all TLB levels causes a page walk that completes (1GB)", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "Misses in all TLB levels causes a page walk of any page size that completes", .ucode = 0xe00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_DURATION", .udesc = "Cycles when PMH is busy with page walks", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT_4K", .udesc = "Misses that miss the DTLB and hit the STLB (4KB)", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT_2M", .udesc = "Misses that miss the DTLB and hit the STLB (2MB)", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT", .udesc = "Number of cache load STLB hits. 
No page walk", .ucode = 0x6000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_fp_assist[]={ { .uname = "X87_OUTPUT", .udesc = "Number of X87 FP assists due to output values", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "X87_INPUT", .udesc = "Number of X87 FP assists due to input values", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SIMD_OUTPUT", .udesc = "Number of SIMD FP assists due to output values", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SIMD_INPUT", .udesc = "Number of SIMD FP assists due to input values", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Cycles with any input/output SSE or FP assists", .ucode = 0x1e00 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ALL", .udesc = "Cycles with any input and output SSE or FP assist", .ucode = 0x1e00 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "ANY", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t bdw_icache[]={ { .uname = "MISSES", .udesc = "Number of Instruction Cache, Streaming Buffer and Victim Cache Misses. Includes Uncacheable accesses", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IFDATA_STALL", .udesc = "Number of cycles where a code fetch is stalled due to L1 miss", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HIT", .udesc = "Number of Instruction Cache, Streaming Buffer and Victim Cache Reads. 
Includes cacheable and uncacheable accesses and uncacheable fetches", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_idq[]={ { .uname = "EMPTY", .udesc = "Cycles the Instruction Decode Queue (IDQ) is empty", .ucode = 0x200, .ucntmsk= 0xf, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MITE_UOPS", .udesc = "Number of uops delivered to Instruction Decode Queue (IDQ) from MITE path", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DSB_UOPS", .udesc = "Number of uops delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_DSB_UOPS", .udesc = "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_MITE_UOPS", .udesc = "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_UOPS", .udesc = "Number of Uops delivered into Instruction Decode Queue (IDQ) from MS, initiated by Decode Stream Buffer (DSB) or MITE", .ucode = 0x3000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_UOPS_CYCLES", .udesc = "Number of cycles that Uops were delivered into Instruction Decode Queue (IDQ) when MS_Busy, initiated by Decode Stream Buffer (DSB) or MITE", .ucode = 0x3000 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "MS_UOPS:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MS_SWITCHES", .udesc = "Number of occurrences where Uops were delivered into Instruction Decode Queue (IDQ) when MS_Busy, initiated by Decode Stream Buffer (DSB) or MITE", .ucode = 0x3000 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "MS_UOPS:c=1:e", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, { .uname = "MITE_UOPS_CYCLES", 
.udesc = "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from MITE path", .ucode = 0x400 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "MITE_UOPS:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DSB_UOPS_CYCLES", .udesc = "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path", .ucode = 0x800 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "DSB_UOPS:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MS_DSB_UOPS_CYCLES", .udesc = "Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy", .ucode = 0x1000 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "MS_DSB_UOPS:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MS_DSB_OCCUR", .udesc = "Deliveries to Instruction Decode Queue (IDQ) initiated by Decode Stream Buffer (DSB) while Microcode Sequencer (MS) is busy", .ucode = 0x1000 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "MS_DSB_UOPS:c=1:e=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, { .uname = "ALL_DSB_CYCLES_4_UOPS", .udesc = "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops", .ucode = 0x1800 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ALL_DSB_CYCLES_ANY_UOPS", .udesc = "Cycles Decode Stream Buffer (DSB) is delivering any Uop", .ucode = 0x1800 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ALL_MITE_CYCLES_4_UOPS", .udesc = "Cycles MITE is delivering 4 Uops", .ucode = 0x2400 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ALL_MITE_CYCLES_ANY_UOPS", .udesc = "Cycles MITE is delivering any Uop", .ucode = 0x2400 | (1 << 
INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ALL_MITE_UOPS", .udesc = "Number of uops delivered to Instruction Decode Queue (IDQ) from any path", .ucode = 0x3c00, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_idq_uops_not_delivered[]={ { .uname = "CORE", .udesc = "Count number of non-delivered uops to Resource Allocation Table (RAT)", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_0_UOPS_DELIV_CORE", .udesc = "Cycles per thread when 4 or more uops are not delivered to the Resource Allocation Table (RAT) when backend is not stalled", .ucode = 0x100 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uflags = INTEL_X86_NCOMBO, .uequiv = "CORE:c=4", .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_LE_1_UOP_DELIV_CORE", .udesc = "Cycles per thread when 3 or more uops are not delivered to the Resource Allocation Table (RAT) when backend is not stalled", .ucode = 0x100 | (3 << INTEL_X86_CMASK_BIT), /* cnt=3 */ .uequiv = "CORE:c=3", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_LE_2_UOP_DELIV_CORE", .udesc = "Cycles with less than 2 uops delivered by the front end", .ucode = 0x100 | (2 << INTEL_X86_CMASK_BIT), /* cnt=2 */ .uequiv = "CORE:c=2", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_LE_3_UOP_DELIV_CORE", .udesc = "Cycles with less than 3 uops delivered by the front end", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "CORE:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_FE_WAS_OK", .udesc = "Cycles Front-End (FE) delivered 4 uops or Resource Allocation Table (RAT) was stalling FE", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 inv=1 */ .uequiv = "CORE:c=1:i", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_I, }, }; static const intel_x86_umask_t bdw_inst_retired[]={ { .uname = "ANY_P", .udesc = "Number 
of instructions retired. General Counter - architectural event", .ucode = 0x000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL", .udesc = "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution (Precise Event)", .ucode = 0x100, .uequiv = "PREC_DIST", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "TOTAL_CYCLES", .udesc = "Number of cycles using always true condition", .ucode = 0x100 | INTEL_X86_MOD_INV | (10 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=10 */ .uequiv = "PREC_DIST:i=1:c=10", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "PREC_DIST", .udesc = "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution (Precise event)", .ucode = 0x100, .ucntmsk= 0x2, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "X87", .udesc = "Number of FPU operations retired (instructions with no exceptions)", .ucode = 0x200, .ucntmsk= 0x2, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_int_misc[]={ { .uname = "RECOVERY_CYCLES", .udesc = "Cycles waiting for the checkpoints in Resource Allocation Table (RAT) to be recovered after Nuke due to all other cases except JEClear (e.g. whenever a ucode assist is needed like SSE exception, memory disambiguation, etc...)", .ucode = 0x300 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "RECOVERY_CYCLES_ANY", .udesc = "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. 
misprediction or memory nuke)", .ucode = 0x300 | (1 << INTEL_X86_CMASK_BIT) | INTEL_X86_MOD_ANY, /* cnt=1 any=1 */ .uequiv = "RECOVERY_CYCLES:t", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_T, }, { .uname = "RECOVERY_STALLS_COUNT", .udesc = "Number of occurrences waiting for Machine Clears", .ucode = 0x300 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, { .uname = "RAT_STALL_CYCLES", .udesc = "Cycles when the Resource Allocation Table (RAT) external stall event is sent to the Instruction Decode Queue (IDQ) for the thread. Also includes cycles when the allocator is serving another thread", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_itlb[]={ { .uname = "ITLB_FLUSH", .udesc = "Flushing of the Instruction TLB (ITLB) pages independent of page size", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t bdw_l1d[]={ { .uname = "REPLACEMENT", .udesc = "L1D Data line replacements", .ucode = 0x100, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t bdw_sq_misc[]={ { .uname = "SPLIT_LOCK", .udesc = "Number of split locks in the super queue (SQ)", .ucode = 0x1000, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t bdw_l1d_pend_miss[]={ { .uname = "PENDING", .udesc = "Cycles with L1D load misses outstanding", .ucode = 0x100, .ucntmsk = 0x4, .uflags = INTEL_X86_DFL, }, { .uname = "PENDING_CYCLES", .udesc = "Cycles with L1D load misses outstanding", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "PENDING:c=1", .ucntmsk = 0x4, .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "PENDING_CYCLES_ANY", .udesc = "Cycles with L1D load misses outstanding from any thread", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT) | INTEL_X86_MOD_ANY, /* cnt=1 any=1 */ .uequiv = "PENDING:c=1:t", .ucntmsk = 0x4, .uflags = 
INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_T, }, { .uname = "OCCURRENCES", .udesc = "Number L1D miss outstanding", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "PENDING:c=1:e=1", .ucntmsk = 0x4, .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, { .uname = "EDGE", .udesc = "Number L1D miss outstanding", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "PENDING:c=1:e=1", .ucntmsk = 0x4, .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, { .uname = "FB_FULL", .udesc = "Number of cycles a demand request was blocked due to Fill Buffer (FB) unavailability", .ucode = 0x200 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t bdw_l2_demand_rqsts[]={ { .uname = "WB_HIT", .udesc = "WB requests that hit L2 cache", .ucode = 0x5000, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t bdw_l2_lines_in[]={ { .uname = "I", .udesc = "L2 cache lines in I state filling L2", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S", .udesc = "L2 cache lines in S state filling L2", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "E", .udesc = "L2 cache lines in E state filling L2", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL", .udesc = "L2 cache lines filling L2", .ucode = 0x700, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ANY", .udesc = "L2 cache lines filling L2", .uequiv = "ALL", .ucode = 0x700, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_l2_lines_out[]={ { .uname = "DEMAND_CLEAN", .udesc = "Number of clean L2 cachelines evicted by demand", .ucode = 0x500, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t bdw_l2_rqsts[]={ { .uname = "DEMAND_DATA_RD_MISS", .udesc = "Demand Data Read requests that miss L2 cache", .ucode = 
0x2100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_HIT", .udesc = "Demand Data Read requests that hit L2 cache", .ucode = 0x4100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_MISS", .udesc = "RFO requests that miss L2 cache", .ucode = 0x2200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_MISS", .udesc = "RFO requests that miss L2 cache", .ucode = 0x2200, .uequiv = "DEMAND_RFO_MISS", .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_HIT", .udesc = "RFO requests that hit L2 cache", .ucode = 0x4200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT", .udesc = "RFO requests that hit L2 cache", .ucode = 0x4200, .uequiv = "DEMAND_RFO_HIT", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_MISS", .udesc = "L2 cache misses when fetching instructions", .ucode = 0x2400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_MISS", .udesc = "All demand requests that miss the L2 cache", .ucode = 0x2700, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_HIT", .udesc = "L2 cache hits when fetching instructions, code reads", .ucode = 0x4400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L2_PF_MISS", .udesc = "Requests from the L2 hardware prefetchers that miss L2 cache", .ucode = 0x3800, .uequiv = "PF_MISS", .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_MISS", .udesc = "Requests from the L2 hardware prefetchers that miss L2 cache", .ucode = 0x3800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS", .udesc = "All requests that miss the L2 cache", .ucode = 0x3f00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L2_PF_HIT", .udesc = "Requests from the L2 hardware prefetchers that hit L2 cache", .ucode = 0xd800, .uequiv = "PF_HIT", .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_HIT", .udesc = "Requests from the L2 hardware prefetchers that hit L2 cache", .ucode = 0xd800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_DATA_RD", .udesc = "Any data read request to L2 cache", .ucode = 0xe100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_RFO", .udesc = "Any 
data RFO request to L2 cache", .ucode = 0xe200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_CODE_RD", .udesc = "Any code read request to L2 cache", .ucode = 0xe400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_REFERENCES", .udesc = "All demand requests to L2 cache ", .ucode = 0xe700, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_PF", .udesc = "Any L2 HW prefetch request to L2 cache", .ucode = 0xf800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REFERENCES", .udesc = "All requests to L2 cache", .ucode = 0xff00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t bdw_l2_trans[]={ { .uname = "DEMAND_DATA_RD", .udesc = "Demand Data Read requests that access L2 cache", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO", .udesc = "RFO requests that access L2 cache", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_RD", .udesc = "L2 cache accesses when fetching instructions", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_PF", .udesc = "L2 or L3 HW prefetches that access L2 cache, including rejects", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L1D_WB", .udesc = "L1D writebacks that access L2 cache", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L2_FILL", .udesc = "L2 fill requests that access L2 cache", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L2_WB", .udesc = "L2 writebacks that access L2 cache", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_REQUESTS", .udesc = "Transactions accessing L2 pipe", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t bdw_ld_blocks[]={ { .uname = "STORE_FORWARD", .udesc = "Counts the number of loads blocked by overlapping with store buffer entries that cannot be forwarded", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_SR", .udesc = "number of times that split load operations are temporarily blocked because all resources for handling 
the split accesses are in use", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_ld_blocks_partial[]={ { .uname = "ADDRESS_ALIAS", .udesc = "False dependencies in MOB due to partial compare on address", .ucode = 0x100, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t bdw_load_hit_pre[]={ { .uname = "HW_PF", .udesc = "Non software-prefetch load dispatches that hit FB allocated for hardware prefetch", .ucode = 0x200, }, { .uname = "SW_PF", .udesc = "Non software-prefetch load dispatches that hit FB allocated for software prefetch", .ucode = 0x100, }, }; static const intel_x86_umask_t bdw_lock_cycles[]={ { .uname = "SPLIT_LOCK_UC_LOCK_DURATION", .udesc = "Cycles in which the L1D and L2 are locked, due to a UC lock or split lock", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CACHE_LOCK_DURATION", .udesc = "cycles that the L1D is locked", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_longest_lat_cache[]={ { .uname = "MISS", .udesc = "Core-originated cacheable demand requests missed LLC - architectural event", .ucode = 0x4100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REFERENCE", .udesc = "Core-originated cacheable demand requests that refer to LLC - architectural event", .ucode = 0x4f00, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_machine_clears[]={ { .uname = "CYCLES", .udesc = "Cycles there was a Nuke. 
Account for both thread-specific and All Thread Nukes", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEMORY_ORDERING", .udesc = "Number of Memory Ordering Machine Clears detected", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SMC", .udesc = "Number of Self-modifying code (SMC) Machine Clears detected", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MASKMOV", .udesc = "This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "COUNT", .udesc = "Number of machine clears (nukes) of any type", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "CYCLES:c=1:e", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t bdw_mem_load_uops_l3_hit_retired[]={ { .uname = "XSNP_MISS", .udesc = "Retired load uops which data sources were L3 hit and cross-core snoop missed in on-pkg core cache", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_HIT", .udesc = "Retired load uops which data sources were L3 and cross-core snoop hits in on-pkg core cache", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_HITM", .udesc = "Load had HitM Response from a core on same socket (shared L3). 
(Non PEBS)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_NONE", .udesc = "Retired load uops which data sources were hits in L3 without snoops required", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t bdw_mem_load_uops_l3_miss_retired[]={ { .uname = "LOCAL_DRAM", .udesc = "Retired load uops missing L3 cache but hitting local memory (Precise Event)", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS , }, { .uname = "REMOTE_DRAM", .udesc = "Number of retired load uops that missed L3 but were serviced by remote RAM, snoop not needed, snoop miss, snoop hit data not forwarded (Precise Event)", .ucode = 0x400, .umodel = PFM_PMU_INTEL_BDW_EP, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_HITM", .udesc = "Number of retired load uops whose data source was remote HITM (Precise Event)", .ucode = 0x1000, .umodel = PFM_PMU_INTEL_BDW_EP, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_FWD", .udesc = "Load uops that miss in the L3 whose data source was forwarded from a remote cache (Precise Event)", .ucode = 0x2000, .umodel = PFM_PMU_INTEL_BDW_EP, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t bdw_mem_load_uops_retired[]={ { .uname = "L1_HIT", .udesc = "Retired load uops with L1 cache hits as data source", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_HIT", .udesc = "Retired load uops with L2 cache hits as data source", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_HIT", .udesc = "Retired load uops with L3 cache hits as data source", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1_MISS", .udesc = "Retired load uops which missed the L1D", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_MISS", .udesc = "Retired load uops which missed the L2. 
Unknown data source excluded", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_MISS", .udesc = "Retired load uops which missed the L3", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "HIT_LFB", .udesc = "Retired load uops which missed L1 but hit line fill buffer (LFB)", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t bdw_mem_trans_retired[]={ { .uname = "LOAD_LATENCY", .udesc = "Memory load instructions retired above programmed clocks, minimum threshold value is 3 (Precise Event and ldlat required)", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT | INTEL_X86_DFL, }, { .uname = "LATENCY_ABOVE_THRESHOLD", .udesc = "Memory load instructions retired above programmed clocks, minimum threshold value is 3 (Precise Event and ldlat required)", .ucode = 0x100, .uequiv = "LOAD_LATENCY", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT | INTEL_X86_NO_AUTOENCODE, }, }; static const intel_x86_umask_t bdw_mem_uops_retired[]={ { .uname = "STLB_MISS_LOADS", .udesc = "Load uops with true STLB miss retired to architected path", .ucode = 0x1100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS_STORES", .udesc = "Store uops with true STLB miss retired to architected path", .ucode = 0x1200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOCK_LOADS", .udesc = "Load uops with locked access retired", .ucode = 0x2100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_LOADS", .udesc = "Line-split load uops retired", .ucode = 0x4100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_STORES", .udesc = "Line-split store uops retired", .ucode = 0x4200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_LOADS", .udesc = "All load uops retired", .ucode = 0x8100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_STORES", .udesc = "All store uops 
retired", .ucode = 0x8200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t bdw_misalign_mem_ref[]={ { .uname = "LOADS", .udesc = "Speculative cache-line split load uops dispatched to the L1D", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STORES", .udesc = "Speculative cache-line split store-address uops dispatched to L1D", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_move_elimination[]={ { .uname = "INT_ELIMINATED", .udesc = "Number of integer Move Elimination candidate uops that were eliminated", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SIMD_ELIMINATED", .udesc = "Number of SIMD Move Elimination candidate uops that were eliminated", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "INT_NOT_ELIMINATED", .udesc = "Number of integer Move Elimination candidate uops that were not eliminated", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SIMD_NOT_ELIMINATED", .udesc = "Number of SIMD Move Elimination candidate uops that were not eliminated", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_offcore_requests[]={ { .uname = "DEMAND_DATA_RD", .udesc = "Demand data read requests sent to uncore (use with HT off only)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD", .udesc = "Demand code read requests sent to uncore (use with HT off only)", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO", .udesc = "Demand RFO requests sent to uncore (use with HT off only)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DATA_RD", .udesc = "Data read requests sent to uncore (use with HT off only)", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_other_assists[]={ { .uname = "AVX_TO_SSE", .udesc = "Number of transitions from AVX-256 to legacy SSE when penalty applicable", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"SSE_TO_AVX", .udesc = "Number of transitions from legacy SSE to AVX-256 when penalty applicable", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY_WB_ASSIST", .udesc = "Number of times any microcode assist is invoked by HW upon uop writeback", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_resource_stalls[]={ { .uname = "ANY", .udesc = "Cycles Allocation is stalled due to Resource Related reason", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL", .udesc = "Cycles Allocation is stalled due to Resource Related reason", .ucode = 0x100, .uequiv = "ANY", .uflags = INTEL_X86_NCOMBO, }, { .uname = "RS", .udesc = "Stall cycles caused by absence of eligible entries in Reservation Station (RS)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SB", .udesc = "Cycles Allocator is stalled due to Store Buffer full (not including draining from synch)", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ROB", .udesc = "ROB full stall cycles", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_rob_misc_events[]={ { .uname = "LBR_INSERTS", .udesc = "Count each time a new Last Branch Record (LBR) is inserted", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t bdw_rs_events[]={ { .uname = "EMPTY_CYCLES", .udesc = "Cycles the Reservation Station (RS) is empty for this thread", .ucode = 0x100, .uflags = INTEL_X86_DFL, }, { .uname = "EMPTY_END", .udesc = "Number of times the reservation station (RS) was empty", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT) | INTEL_X86_MOD_EDGE, /* inv=1, cmask=1,edge=1 */ .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, }; static const intel_x86_umask_t bdw_tlb_flush[]={ { .uname = "DTLB_THREAD", .udesc = "Count number of DTLB flushes of thread-specific entries", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"STLB_ANY", .udesc = "Count number of any STLB flushes", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_uops_executed[]={ { .uname = "CORE", .udesc = "Number of uops executed from any thread", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "THREAD", .udesc = "Number of uops executed per thread each cycle", .ucode = 0x100, .uflags = INTEL_X86_DFL | INTEL_X86_NCOMBO, }, { .uname = "STALL_CYCLES", .udesc = "Number of cycles with no uops executed", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=1 */ .uequiv = "THREAD:c=1:i", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_1_UOP_EXEC", .udesc = "Cycles where at least 1 uop was executed per thread", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "THREAD:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_2_UOPS_EXEC", .udesc = "Cycles where at least 2 uops were executed per thread", .ucode = 0x100 | (2 << INTEL_X86_CMASK_BIT), /* cnt=2 */ .uequiv = "THREAD:c=2", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_3_UOPS_EXEC", .udesc = "Cycles where at least 3 uops were executed per thread", .ucode = 0x100 | (3 << INTEL_X86_CMASK_BIT), /* cnt=3 */ .uequiv = "THREAD:c=3", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_4_UOPS_EXEC", .udesc = "Cycles where at least 4 uops were executed per thread", .ucode = 0x100 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uequiv = "THREAD:c=4", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_1", .udesc = "Cycles where at least 1 uop was executed from any thread", .ucode = 0x200 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "CORE:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_2", .udesc = "Cycles where at least 2 uops were executed from any thread", .ucode 
= 0x200 | (2 << INTEL_X86_CMASK_BIT), /* cnt=2 */ .uequiv = "CORE:c=2", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_3", .udesc = "Cycles where at least 3 uops were executed from any thread", .ucode = 0x200 | (3 << INTEL_X86_CMASK_BIT), /* cnt=3 */ .uequiv = "CORE:c=3", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_4", .udesc = "Cycles where at least 4 uops were executed from any thread", .ucode = 0x200 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uequiv = "CORE:c=4", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_NONE", .udesc = "Cycles where no uop is executed on any thread", .ucode = 0x200 | INTEL_X86_MOD_INV, /* inv=1 */ .uequiv = "CORE:i", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I, }, }; static const intel_x86_umask_t bdw_uops_executed_port[]={ { .uname = "PORT_0", .udesc = "Cycles in which a uop is executed on port 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_1", .udesc = "Cycles in which a uop is executed on port 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_2", .udesc = "Cycles in which a uop is executed on port 2", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_3", .udesc = "Cycles in which a uop is executed on port 3", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_4", .udesc = "Cycles in which a uop is executed on port 4", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_5", .udesc = "Cycles in which a uop is executed on port 5", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_6", .udesc = "Cycles in which a uop is executed on port 6", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_7", .udesc = "Cycles in which a uop is executed on port 7", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_0_CORE", .udesc = "Cycles in which a uop is executed on port 0 from any thread", .ucode = 0x100 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_0:t=1", .uflags = 
INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_1_CORE", .udesc = "Cycles in which a uop is executed on port 1 from any thread", .ucode = 0x200 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_1:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_2_CORE", .udesc = "Cycles in which a uop is executed on port 2 from any thread", .ucode = 0x400 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_2:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_3_CORE", .udesc = "Cycles in which a uop is executed on port 3 from any thread", .ucode = 0x800 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_3:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_4_CORE", .udesc = "Cycles in which a uop is executed on port 4 from any thread", .ucode = 0x1000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_4:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_5_CORE", .udesc = "Cycles in which a uop is executed on port 5 from any thread", .ucode = 0x2000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_5:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_6_CORE", .udesc = "Cycles in which a uop is executed on port 6 from any thread", .ucode = 0x4000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_6:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_7_CORE", .udesc = "Cycles in which a uop is executed on port 7 from any thread", .ucode = 0x8000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_7:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, }; static const intel_x86_umask_t bdw_uops_issued[]={ { .uname = "ANY", .udesc = "Number of Uops issued by the Resource Allocation Table (RAT) to the Reservation Station (RS)", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL", .udesc = "Number of Uops issued by the Resource Allocation Table (RAT) to the Reservation Station (RS)", .ucode = 0x100, .uequiv = "ANY", .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLAGS_MERGE", .udesc = "Number of flags-merge uops being allocated. Such uops add delay", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_LEA", .udesc = "Number of slow LEA or similar uops allocated. Such a uop has 3 sources regardless of whether it is the result of a LEA instruction or not", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SINGLE_MUL", .udesc = "Number of Multiply packed/scalar single precision uops allocated", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALL_CYCLES", .udesc = "Counts the number of cycles no uops issued by this thread", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=1 */ .uequiv = "ANY:c=1:i=1", .uflags = INTEL_X86_NCOMBO, .ucntmsk = 0xf, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "CORE_STALL_CYCLES", .udesc = "Counts the number of cycles no uops issued on this core", .ucode = 0x100 | INTEL_X86_MOD_ANY | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* any=1 inv=1 cnt=1 */ .uequiv = "ANY:c=1:i=1:t=1", .ucntmsk = 0xf, .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T | _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t bdw_uops_retired[]={ { .uname = "ALL", .udesc = "All uops that actually retired", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "ANY", .udesc = "All uops that actually retired", .ucode = 0x100, .uequiv = "ALL", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RETIRE_SLOTS", .udesc = "Number of retirement slots used (non-PEBS)", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STALL_CYCLES", .udesc = "Cycles no executable uops retired (Precise Event)", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=1 */ .uequiv = "ALL:i=1:c=1", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "TOTAL_CYCLES", .udesc = "Number of cycles using always true condition applied to PEBS uops retired event", .ucode = 0x100 | INTEL_X86_MOD_INV | (10 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=10 */ .uequiv = "ALL:i=1:c=10", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I 
| _INTEL_X86_ATTR_C, }, { .uname = "CORE_STALL_CYCLES", .udesc = "Cycles no executable uops retired on core (Precise Event)", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=1 */ .uequiv = "ALL:i=1:c=1:t=1", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "STALL_OCCURRENCES", .udesc = "Number of transitions from stalled to unstalled execution (Precise Event)", .ucode = 0x100 | INTEL_X86_MOD_INV | INTEL_X86_MOD_EDGE| (1 << INTEL_X86_CMASK_BIT), /* inv=1 edge=1 cnt=1 */ .uequiv = "ALL:c=1:i=1:e=1", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, }; static const intel_x86_umask_t bdw_offcore_response[]={ { .uname = "DMND_DATA_RD", .udesc = "Request: number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches", .ucode = 1ULL << (0 + 8), .grpid = 0, }, { .uname = "DMND_RFO", .udesc = "Request: number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO prefetches", .ucode = 1ULL << (1 + 8), .grpid = 0, }, { .uname = "DMND_CODE_RD", .udesc = "Request: number of demand and DCU prefetch instruction cacheline reads. Does not count L2 code read prefetches", .ucode = 1ULL << (2 + 8), .grpid = 0, }, { .uname = "DMND_IFETCH", .udesc = "Request: number of demand and DCU prefetch instruction cacheline reads. 
Does not count L2 code read prefetches", .ucode = 1ULL << (2 + 8), .uequiv = "DMND_CODE_RD", .grpid = 0, }, { .uname = "WB", .udesc = "Request: number of writebacks (modified to exclusive) transactions", .ucode = 1ULL << (3 + 8), .grpid = 0, }, { .uname = "PF_DATA_RD", .udesc = "Request: number of data cacheline reads generated by L2 prefetchers", .ucode = 1ULL << (4 + 8), .grpid = 0, }, { .uname = "PF_RFO", .udesc = "Request: number of RFO requests generated by L2 prefetchers", .ucode = 1ULL << (5 + 8), .grpid = 0, }, { .uname = "PF_IFETCH", .udesc = "Request: number of code reads generated by L2 prefetchers", .ucode = 1ULL << (6 + 8), .grpid = 0, }, { .uname = "PF_LLC_DATA_RD", .udesc = "Request: number of L2 prefetcher requests to L3 for loads", .ucode = 1ULL << (7 + 8), .grpid = 0, }, { .uname = "PF_LLC_RFO", .udesc = "Request: number of RFO requests generated by L2 prefetcher", .ucode = 1ULL << (8 + 8), .grpid = 0, }, { .uname = "PF_LLC_IFETCH", .udesc = "Request: number of L2 prefetcher requests to L3 for instruction fetches", .ucode = 1ULL << (9 + 8), .grpid = 0, }, { .uname = "BUS_LOCKS", .udesc = "Request: number of bus lock and split lock requests", .ucode = 1ULL << (10 + 8), .grpid = 0, }, { .uname = "STRM_ST", .udesc = "Request: number of streaming store requests", .ucode = 1ULL << (11 + 8), .grpid = 0, }, { .uname = "OTHER", .udesc = "Request: counts one of the following transaction types, including L3 invalidate, I/O, full or partial writes, WC or non-temporal stores, CLFLUSH, Fences, lock, unlock, split lock", .ucode = 1ULL << (15+8), .grpid = 0, }, { .uname = "ANY_IFETCH", .udesc = "Request: combination of PF_IFETCH | DMND_IFETCH | PF_LLC_IFETCH", .uequiv = "PF_IFETCH:DMND_IFETCH:PF_LLC_IFETCH", .ucode = 0x24100, .grpid = 0, }, { .uname = "ANY_REQUEST", .udesc = "Request: combination of all request umasks", .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_IFETCH:WB:PF_DATA_RD:PF_RFO:PF_IFETCH:PF_LLC_DATA_RD:PF_LLC_RFO:PF_LLC_IFETCH:BUS_LOCKS:STRM_ST:OTHER", 
.ucode = 0x8fff00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "ANY_DATA", .udesc = "Request: combination of DMND_DATA | PF_DATA_RD | PF_LLC_DATA_RD", .uequiv = "DMND_DATA_RD:PF_DATA_RD:PF_LLC_DATA_RD", .ucode = 0x9100, .grpid = 0, }, { .uname = "ANY_RFO", .udesc = "Request: combination of DMND_RFO | PF_RFO | PF_LLC_RFO", .uequiv = "DMND_RFO:PF_RFO:PF_LLC_RFO", .ucode = 0x10300, .grpid = 0, }, { .uname = "ANY_RESPONSE", .udesc = "Response: count any response type", .ucode = 1ULL << (16+8), .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, .grpid = 1, }, { .uname = "NO_SUPP", .udesc = "Supplier: counts number of times supplier information is not available", .ucode = 1ULL << (17+8), .grpid = 1, }, { .uname = "L3_HITM", .udesc = "Supplier: counts L3 hits in M-state (initial lookup)", .ucode = 1ULL << (18+8), .grpid = 1, }, { .uname = "LLC_HITM", .udesc = "Supplier: counts L3 hits in M-state (initial lookup)", .ucode = 1ULL << (18+8), .uequiv = "L3_HITM", .grpid = 1, }, { .uname = "L3_HITE", .udesc = "Supplier: counts L3 hits in E-state", .ucode = 1ULL << (19+8), .grpid = 1, }, { .uname = "LLC_HITE", .udesc = "Supplier: counts L3 hits in E-state", .ucode = 1ULL << (19+8), .uequiv = "L3_HITE", .grpid = 1, }, { .uname = "L3_HITS", .udesc = "Supplier: counts L3 hits in S-state", .ucode = 1ULL << (20+8), .grpid = 1, }, { .uname = "LLC_HITS", .udesc = "Supplier: counts L3 hits in S-state", .ucode = 1ULL << (20+8), .uequiv = "L3_HITS", .grpid = 1, }, { .uname = "L3_HITF", .udesc = "Supplier: counts L3 hits in F-state", .ucode = 1ULL << (21+8), .grpid = 1, }, { .uname = "LLC_HITF", .udesc = "Supplier: counts L3 hits in F-state", .ucode = 1ULL << (21+8), .uequiv = "L3_HITF", .grpid = 1, }, { .uname = "L3_HITMESF", .udesc = "Supplier: counts L3 hits in any state (M, E, S, F)", .ucode = 0xfULL << (18+8), .uequiv = "L3_HITM:L3_HITE:L3_HITS:L3_HITF", .grpid = 1, }, { .uname = "LLC_HITMESF", .udesc = "Supplier: counts L3 hits in any state 
(M, E, S, F)", .ucode = 0xfULL << (18+8), .uequiv = "L3_HITMESF", .grpid = 1, }, { .uname = "L3_HIT", .udesc = "Alias for L3_HITMESF", .ucode = 0xfULL << (18+8), .uequiv = "L3_HITM:L3_HITE:L3_HITS:L3_HITF", .grpid = 1, }, { .uname = "LLC_HIT", .udesc = "Alias for LLC_HITMESF", .ucode = 0xfULL << (18+8), .uequiv = "L3_HITM:L3_HITE:L3_HITS:L3_HITF", .grpid = 1, }, { .uname = "L3_MISS_LOCAL", .udesc = "Supplier: counts L3 misses to local DRAM", .ucode = 1ULL << (26+8), .grpid = 1, }, { .uname = "LLC_MISS_LOCAL", .udesc = "Supplier: counts L3 misses to local DRAM", .ucode = 1ULL << (26+8), .uequiv = "L3_MISS_LOCAL", .grpid = 1, }, { .uname = "LLC_MISS_LOCAL_DRAM", .udesc = "Supplier: counts L3 misses to local DRAM", .ucode = 1ULL << (26+8), .uequiv = "L3_MISS_LOCAL", .grpid = 1, }, { .uname = "L3_MISS", .udesc = "Supplier: counts L3 misses to local DRAM", .ucode = 1ULL << (26+8), .uequiv = "L3_MISS_LOCAL", .grpid = 1, .umodel = PFM_PMU_INTEL_BDW, }, { .uname = "L3_MISS", .udesc = "Supplier: counts L3 misses to local or remote DRAM", .ucode = 0xfULL << (26+8), .uequiv = "L3_MISS_LOCAL:L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P", .umodel = PFM_PMU_INTEL_BDW_EP, .grpid = 1, }, { .uname = "L3_MISS_REMOTE_HOP0", .udesc = "Supplier: counts L3 misses to remote DRAM with 0 hop", .ucode = 0x1ULL << (27+8), .umodel = PFM_PMU_INTEL_BDW_EP, .grpid = 1, }, { .uname = "L3_MISS_REMOTE_HOP0_DRAM", .udesc = "Supplier: counts L3 misses to remote DRAM with 0 hop", .ucode = 0x1ULL << (27+8), .uequiv = "L3_MISS_REMOTE_HOP0", .umodel = PFM_PMU_INTEL_BDW_EP, .grpid = 1, }, { .uname = "L3_MISS_REMOTE_HOP1", .udesc = "Supplier: counts L3 misses to remote DRAM with 1 hop", .ucode = 0x1ULL << (28+8), .umodel = PFM_PMU_INTEL_BDW_EP, .grpid = 1, }, { .uname = "L3_MISS_REMOTE_HOP1_DRAM", .udesc = "Supplier: counts L3 misses to remote DRAM with 1 hop", .ucode = 0x1ULL << (28+8), .uequiv = "L3_MISS_REMOTE_HOP1", .umodel = PFM_PMU_INTEL_BDW_EP, .grpid = 1, }, { .uname = 
"L3_MISS_REMOTE_HOP2P", .udesc = "Supplier: counts L3 misses to remote DRAM with 2P hops", .ucode = 0x1ULL << (29+8), .umodel = PFM_PMU_INTEL_BDW_EP, .grpid = 1, }, { .uname = "L3_MISS_REMOTE_HOP2P_DRAM", .udesc = "Supplier: counts L3 misses to remote DRAM with 2P hops", .ucode = 0x1ULL << (29+8), .uequiv = "L3_MISS_REMOTE_HOP2P", .umodel = PFM_PMU_INTEL_BDW_EP, .grpid = 1, }, { .uname = "L3_MISS_REMOTE", .udesc = "Supplier: counts L3 misses to remote node", .uequiv = "L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P", .ucode = 0x7ULL << (27+8), .umodel = PFM_PMU_INTEL_BDW_EP, .grpid = 1, }, { .uname = "L3_MISS_REMOTE_DRAM", .udesc = "Supplier: counts L3 misses to remote node", .ucode = 0x7ULL << (27+8), .uequiv = "L3_MISS_REMOTE", .umodel = PFM_PMU_INTEL_BDW_EP, .grpid = 1, }, { .uname = "SPL_HIT", .udesc = "Supplier: counts L3 supplier hit", .ucode = 0x1ULL << (30+8), .grpid = 1, }, { .uname = "SNP_NONE", .udesc = "Snoop: counts number of times no snoop-related information is available", .ucode = 1ULL << (31+8), .grpid = 2, }, { .uname = "SNP_NOT_NEEDED", .udesc = "Snoop: counts the number of times no snoop was needed to satisfy the request", .ucode = 1ULL << (32+8), .grpid = 2, }, { .uname = "SNP_MISS", .udesc = "Snoop: counts number of times a snoop was needed and it missed all snooped caches", .ucode = 1ULL << (33+8), .grpid = 2, }, { .uname = "SNP_NO_FWD", .udesc = "Snoop: counts number of times a snoop was needed and it hit in at least one snooped cache", .ucode = 1ULL << (34+8), .grpid = 2, }, { .uname = "SNP_FWD", .udesc = "Snoop: counts number of times a snoop was needed and data was forwarded from a remote socket", .ucode = 1ULL << (35+8), .grpid = 2, }, { .uname = "HITM", .udesc = "Snoop: counts number of times a snoop was needed and it hitM-ed in local or remote cache", .ucode = 1ULL << (36+8), .uequiv = "SNP_HITM", .grpid = 2, }, { .uname = "SNP_HITM", .udesc = "Snoop: counts number of times a snoop was needed and it hitM-ed in local or 
remote cache", .ucode = 1ULL << (36+8), .grpid = 2, }, { .uname = "NON_DRAM", .udesc = "Snoop: counts number of times target was a non-DRAM system address. This includes MMIO transactions", .ucode = 1ULL << (37+8), .grpid = 2, }, { .uname = "SNP_ANY", .udesc = "Snoop: any snoop reason", .ucode = 0x7fULL << (31+8), .uequiv = "SNP_NONE:SNP_NOT_NEEDED:SNP_MISS:SNP_NO_FWD:SNP_FWD:HITM:NON_DRAM", .uflags = INTEL_X86_DFL, .grpid = 2, }, }; static const intel_x86_umask_t bdw_hle_retired[]={ { .uname = "START", .udesc = "Number of times an HLE execution started", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "COMMIT", .udesc = "Number of times an HLE execution successfully committed", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED", .udesc = "Number of times an HLE execution aborted due to any reasons (multiple categories may count as one) (Precise Event)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ABORTED_MISC1", .udesc = "Number of times an HLE execution aborted due to various memory events", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MISC2", .udesc = "Number of times an HLE execution aborted due to uncommon conditions", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MISC3", .udesc = "Number of times an HLE execution aborted due to HLE-unfriendly instructions", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MISC4", .udesc = "Number of times an HLE execution aborted due to incompatible memory type", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MISC5", .udesc = "Number of times an HLE execution aborted due to none of the other 4 reasons (e.g., interrupt)", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_rtm_retired[]={ { .uname = "START", .udesc = "Number of times an RTM execution started", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = 
"COMMIT", .udesc = "Number of times an RTM execution successfully committed", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED", .udesc = "Number of times an RTM execution aborted due to any reasons (multiple categories may count as one) (Precise Event)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ABORTED_MISC1", .udesc = "Number of times an RTM execution aborted due to various memory events", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MISC2", .udesc = "Number of times an RTM execution aborted due to uncommon conditions", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MISC3", .udesc = "Number of times an RTM execution aborted due to RTM-unfriendly instructions", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MISC4", .udesc = "Number of times an RTM execution aborted due to incompatible memory type", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MISC5", .udesc = "Number of times an RTM execution aborted due to none of the other 4 reasons (e.g., interrupt)", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_tx_mem[]={ { .uname = "ABORT_CONFLICT", .udesc = "Number of times a transactional abort was signaled due to data conflict on a transactionally accessed address", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_CAPACITY", .udesc = "Number of times a transactional abort was signaled due to data capacity limitation", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_STORE_TO_ELIDED_LOCK", .udesc = "Number of times a HLE transactional execution aborted due to a non xrelease prefixed instruction writing to an elided lock in the elision buffer", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_ELISION_BUFFER_NOT_EMPTY", .udesc = "Number of times a HLE transactional execution aborted due to NoAllocatedElisionBuffer being non-zero", .ucode = 
0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_ELISION_BUFFER_MISMATCH", .udesc = "Number of times a HLE transaction execution aborted due to xrelease lock not satisfying the address and value requirements in the elision buffer", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_ELISION_BUFFER_UNSUPPORTED_ALIGNMENT", .udesc = "Number of times a HLE transaction execution aborted due to an unsupported read alignment from the elision buffer", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_ELISION_BUFFER_FULL", .udesc = "Number of times a HLE lock could not be elided due to ElisionBufferAvailable being zero", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_tx_exec[]={ { .uname = "MISC1", .udesc = "Number of times a class of instructions that may cause a transactional abort was executed. Since this is the count of execution, it may not always cause a transactional abort", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISC2", .udesc = "Number of times a class of instructions that may cause a transactional abort was executed inside a transactional region", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISC3", .udesc = "Number of times an instruction execution caused the supported nest count to be exceeded", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISC4", .udesc = "Number of times a xbegin instruction was executed inside an HLE transactional region", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISC5", .udesc = "Number of times an instruction with HLE xacquire prefix was executed inside an RTM transactional region", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_offcore_requests_outstanding[]={ { .uname = "ALL_DATA_RD_CYCLES", .udesc = "Cycles with cacheable data read transactions in the superQ (use with HT off only)", .uequiv = "ALL_DATA_RD:c=1", .ucode = 0x800 | (0x1 << 
INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_CYCLES", .udesc = "Cycles with demand code read transactions in the superQ (use with HT off only)", .uequiv = "DEMAND_CODE_RD:c=1", .ucode = 0x200 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_CYCLES", .udesc = "Cycles with demand data read transactions in the superQ (use with HT off only)", .uequiv = "DEMAND_DATA_RD:c=1", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DATA_RD", .udesc = "Cacheable data read transactions in the superQ every cycle (use with HT off only)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD", .udesc = "Code read transactions in the superQ every cycle (use with HT off only)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD", .udesc = "Demand data read transactions in the superQ every cycle (use with HT off only)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_GE_6", .udesc = "Cycles with at least 6 offcore outstanding demand data read requests in the uncore queue", .uequiv = "DEMAND_DATA_RD:c=6", .ucode = 0x100 | (6 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DEMAND_RFO", .udesc = "Outstanding RFO (store) transactions in the superQ every cycle (use with HT off only)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_CYCLES", .udesc = "Cycles with outstanding RFO (store) transactions in the superQ (use with HT off only)", .uequiv = "DEMAND_RFO:c=1", .ucode = 0x400 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_ild_stall[]={ { .uname = "LCP", .udesc = "Stall caused by changing prefix length of the instruction", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t bdw_page_walker_loads[]={ { .uname = "DTLB_L1", .udesc = "Number of DTLB 
page walker loads that hit in the L1D and line fill buffer", .ucode = 0x1100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ITLB_L1", .udesc = "Number of ITLB page walker loads that hit in the L1I and line fill buffer", .ucode = 0x2100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DTLB_L2", .udesc = "Number of DTLB page walker loads that hit in the L2", .ucode = 0x1200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ITLB_L2", .udesc = "Number of ITLB page walker loads that hit in the L2", .ucode = 0x2200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DTLB_L3", .udesc = "Number of DTLB page walker loads that hit in the L3", .ucode = 0x1400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ITLB_L3", .udesc = "Number of ITLB page walker loads that hit in the L3", .ucode = 0x2400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DTLB_MEMORY", .udesc = "Number of DTLB page walker loads that hit memory", .ucode = 0x1800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t bdw_lsd[]={ { .uname = "UOPS", .udesc = "Number of uops delivered by the Loop Stream Detector (LSD)", .ucode = 0x100, .uflags= INTEL_X86_DFL, }, { .uname = "ACTIVE", .udesc = "Cycles with uops delivered by the LSD but which did not come from decoder", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "UOPS:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_4_UOPS", .udesc = "Cycles with 4 uops delivered by the LSD but which did not come from decoder", .ucode = 0x100 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uequiv = "UOPS:c=4", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t bdw_dsb2mite_switches[]={ { .uname = "PENALTY_CYCLES", .udesc = "Number of DSB to MITE switch true penalty cycles", .ucode = 0x0200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t bdw_ept[]={ { .uname = "WALK_CYCLES", .udesc = "Cycles for an extended page table walk", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | 
INTEL_X86_DFL, }, }; static const intel_x86_umask_t bdw_arith[]={ { .uname = "FPU_DIV_ACTIVE", .udesc = "Cycles when divider is busy executing divide operations", .ucode = 0x0100, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t bdw_fp_arith[]={ { .uname = "SCALAR_DOUBLE", .udesc = "Number of scalar double precision floating-point arithmetic instructions (multiply by 1 to get flops)", .ucode = 0x0100, }, { .uname = "SCALAR_SINGLE", .udesc = "Number of scalar single precision floating-point arithmetic instructions (multiply by 1 to get flops)", .ucode = 0x0200, }, { .uname = "SCALAR", .udesc = "Number of SSE/AVX computational scalar floating-point instructions retired. Applies to SSE* and AVX* scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RSQRT RCP SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element", .ucode = 0x0300, .uequiv = "SCALAR_DOUBLE:SCALAR_SINGLE", }, { .uname = "128B_PACKED_DOUBLE", .udesc = "Number of 128-bit packed double precision floating-point arithmetic instructions (multiply by 2 to get flops)", .ucode = 0x0400, }, { .uname = "128B_PACKED_SINGLE", .udesc = "Number of 128-bit packed single precision floating-point arithmetic instructions (multiply by 4 to get flops)", .ucode = 0x0800, }, { .uname = "256B_PACKED_DOUBLE", .udesc = "Number of 256-bit packed double precision floating-point arithmetic instructions (multiply by 4 to get flops)", .ucode = 0x1000, }, { .uname = "256B_PACKED_SINGLE", .udesc = "Number of 256-bit packed single precision floating-point arithmetic instructions (multiply by 8 to get flops)", .ucode = 0x2000, }, { .uname = "PACKED", .udesc = "Number of SSE/AVX computational packed floating-point instructions retired. Applies to SSE* and AVX*, packed, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RSQRT RCP SQRT DPP FM(N)ADD/SUB. 
DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element", .ucode = 0x3c00, .uequiv = "128B_PACKED_DOUBLE:128B_PACKED_SINGLE:256B_PACKED_SINGLE:256B_PACKED_DOUBLE", }, { .uname = "SINGLE", .udesc = "Number of SSE/AVX computational single precision floating-point instructions retired. Applies to SSE* and AVX* scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element", .ucode = 0x2a00, .uequiv = "256B_PACKED_SINGLE:128B_PACKED_SINGLE:SCALAR_SINGLE", }, { .uname = "DOUBLE", .udesc = "Number of SSE/AVX computational double precision floating-point instructions retired. Applies to SSE* and AVX* scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element", .ucode = 0x1500, .uequiv = "SCALAR_DOUBLE:128B_PACKED_DOUBLE:256B_PACKED_DOUBLE", }, }; static const intel_x86_umask_t bdw_offcore_requests_buffer[]={ { .uname = "SQ_FULL", .udesc = "Number of cycles the offcore requests buffer is full", .ucode = 0x0100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t bdw_uops_dispatches_cancelled[]={ { .uname = "SIMD_PRF", .udesc = "Number of uops cancelled after they were dispatched from the scheduler to the execution units when the total number of physical register read ports exceeds the read bandwidth of the register file. 
This umask applies to instructions: VDPPS, DPPS, VPCMPESTRI, PCMPESTRI, VPCMPESTRM, PCMPESTRM, VFMADD*, VFMADDSUB*, VFMSUB*, VFMSUBADD*, VFNMADD*, VFNMSUB*", .ucode = 0x0300, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_entry_t intel_bdw_pe[]={ { .name = "UNHALTED_CORE_CYCLES", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0x20000000full, .code = 0x3c, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles", .modmsk = INTEL_FIXED3_ATTRS, .cntmsk = 0x400000000ull, .code = 0x0300, /* pseudo encoding */ .flags = INTEL_X86_FIXED, }, { .name = "INSTRUCTION_RETIRED", .desc = "Number of instructions at retirement", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0x10000000full, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "This is an alias for INSTRUCTION_RETIRED", .modmsk = INTEL_V4_ATTRS, .equiv = "INSTRUCTION_RETIRED", .cntmsk = 0x10000000full, .code = 0xc0, }, { .name = "BRANCH_INSTRUCTIONS_RETIRED", .desc = "Count branch instructions at retirement. Specifically, this event counts the retirement of the last micro-op of a branch instruction", .modmsk = INTEL_V4_ATTRS, .equiv = "BR_INST_RETIRED:ALL_BRANCHES", .cntmsk = 0xff, .code = 0xc4, }, { .name = "MISPREDICTED_BRANCH_RETIRED", .desc = "Count mispredicted branch instructions at retirement. 
Specifically, this event counts, at retirement, the last micro-op of a branch instruction that was in the architectural path of execution and experienced misprediction in the branch prediction hardware", .modmsk = INTEL_V4_ATTRS, .equiv = "BR_MISP_RETIRED:ALL_BRANCHES", .cntmsk = 0xff, .code = 0xc5, }, { .name = "BACLEARS", .desc = "Branch re-steered", .code = 0xe6, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_baclears), .umasks = bdw_baclears }, { .name = "BR_INST_EXEC", .desc = "Branch instructions executed", .code = 0x88, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_br_inst_exec), .umasks = bdw_br_inst_exec }, { .name = "BR_INST_RETIRED", .desc = "Branch instructions retired (Precise Event)", .code = 0xc4, .cntmsk = 0xff, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_br_inst_retired), .umasks = bdw_br_inst_retired }, { .name = "BR_MISP_EXEC", .desc = "Mispredicted branches executed", .code = 0x89, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_br_misp_exec), .umasks = bdw_br_misp_exec }, { .name = "BR_MISP_RETIRED", .desc = "Mispredicted retired branches (Precise Event)", .code = 0xc5, .cntmsk = 0xff, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_br_misp_retired), .umasks = bdw_br_misp_retired }, { .name = "CPL_CYCLES", .desc = "Unhalted core cycles at a specific ring level", .code = 0x5c, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_cpl_cycles), .umasks = bdw_cpl_cycles }, { .name = "CPU_CLK_THREAD_UNHALTED", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .code = 0x3c, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_cpu_clk_thread_unhalted), .umasks = bdw_cpu_clk_thread_unhalted }, { .name = "CPU_CLK_UNHALTED", 
.desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .code = 0x3c, .cntmsk = 0xff, .modmsk = INTEL_V4_ATTRS, .equiv = "CPU_CLK_THREAD_UNHALTED", }, { .name = "CYCLE_ACTIVITY", .desc = "Stalled cycles", .code = 0xa3, .cntmsk = 0xf, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_cycle_activity), .umasks = bdw_cycle_activity }, { .name = "DTLB_LOAD_MISSES", .desc = "Data TLB load misses", .code = 0x8, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_dtlb_load_misses), .umasks = bdw_dtlb_load_misses }, { .name = "DTLB_STORE_MISSES", .desc = "Data TLB store misses", .code = 0x49, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_dtlb_load_misses), .umasks = bdw_dtlb_load_misses /* shared */ }, { .name = "FP_ASSIST", .desc = "X87 floating-point assists", .code = 0xca, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_fp_assist), .umasks = bdw_fp_assist }, { .name = "HLE_RETIRED", .desc = "HLE execution (Precise Event)", .code = 0xc8, .cntmsk = 0xff, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_hle_retired), .umasks = bdw_hle_retired }, { .name = "ICACHE", .desc = "Instruction Cache", .code = 0x80, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_icache), .umasks = bdw_icache }, { .name = "IDQ", .desc = "IDQ operations", .code = 0x79, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_idq), .umasks = bdw_idq }, { .name = "IDQ_UOPS_NOT_DELIVERED", .desc = "Uops not delivered", .code = 0x9c, .cntmsk = 0xf, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_idq_uops_not_delivered), .umasks = bdw_idq_uops_not_delivered }, { .name = "INST_RETIRED", .desc = "Number of instructions retired (Precise Event)", .code = 0xc0, .cntmsk = 0xff, .ngrp = 1, .flags = 
INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_inst_retired), .umasks = bdw_inst_retired }, { .name = "INT_MISC", .desc = "Miscellaneous interruptions", .code = 0xd, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_int_misc), .umasks = bdw_int_misc }, { .name = "ITLB", .desc = "Instruction TLB", .code = 0xae, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_itlb), .umasks = bdw_itlb }, { .name = "ITLB_MISSES", .desc = "Instruction TLB misses", .code = 0x85, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_itlb_misses), .umasks = bdw_itlb_misses }, { .name = "L1D", .desc = "L1D cache", .code = 0x51, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_l1d), .umasks = bdw_l1d }, { .name = "L1D_PEND_MISS", .desc = "L1D pending misses", .code = 0x48, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_l1d_pend_miss), .umasks = bdw_l1d_pend_miss }, { .name = "L2_DEMAND_RQSTS", .desc = "Demand Data Read requests to L2", .code = 0x27, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_l2_demand_rqsts), .umasks = bdw_l2_demand_rqsts }, { .name = "L2_LINES_IN", .desc = "L2 lines allocated", .code = 0xf1, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_l2_lines_in), .umasks = bdw_l2_lines_in }, { .name = "L2_LINES_OUT", .desc = "L2 lines evicted", .code = 0xf2, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_l2_lines_out), .umasks = bdw_l2_lines_out }, { .name = "L2_RQSTS", .desc = "L2 requests", .code = 0x24, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_l2_rqsts), .umasks = bdw_l2_rqsts }, { .name = "L2_TRANS", .desc = "L2 transactions", .code = 0xf0, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = 
LIBPFM_ARRAY_SIZE(bdw_l2_trans), .umasks = bdw_l2_trans }, { .name = "LD_BLOCKS", .desc = "Blocking loads", .code = 0x3, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_ld_blocks), .umasks = bdw_ld_blocks }, { .name = "LD_BLOCKS_PARTIAL", .desc = "Partial load blocks", .code = 0x7, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_ld_blocks_partial), .umasks = bdw_ld_blocks_partial }, { .name = "LOAD_HIT_PRE", .desc = "Load dispatches", .code = 0x4c, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_load_hit_pre), .umasks = bdw_load_hit_pre }, { .name = "LOCK_CYCLES", .desc = "Locked cycles in L1D and L2", .code = 0x63, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_lock_cycles), .umasks = bdw_lock_cycles }, { .name = "LONGEST_LAT_CACHE", .desc = "L3 cache", .code = 0x2e, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_longest_lat_cache), .umasks = bdw_longest_lat_cache }, { .name = "MACHINE_CLEARS", .desc = "Machine clear asserted", .code = 0xc3, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_machine_clears), .umasks = bdw_machine_clears }, { .name = "MEM_LOAD_UOPS_L3_HIT_RETIRED", .desc = "L3 hit load uops retired (Precise Event)", .code = 0xd2, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_mem_load_uops_l3_hit_retired), .umasks = bdw_mem_load_uops_l3_hit_retired }, { .name = "MEM_LOAD_UOPS_LLC_HIT_RETIRED", .desc = "L3 hit load uops retired (Precise Event)", .equiv = "MEM_LOAD_UOPS_L3_HIT_RETIRED", .code = 0xd2, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_mem_load_uops_l3_hit_retired), .umasks = bdw_mem_load_uops_l3_hit_retired }, { .name = "MEM_LOAD_UOPS_L3_MISS_RETIRED", .desc = "Load uops retired that missed 
the L3 (Precise Event)", .code = 0xd3, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_mem_load_uops_l3_miss_retired), .umasks = bdw_mem_load_uops_l3_miss_retired }, { .name = "MEM_LOAD_UOPS_LLC_MISS_RETIRED", .desc = "Load uops retired that missed the L3 (Precise Event)", .equiv = "MEM_LOAD_UOPS_L3_MISS_RETIRED", .code = 0xd3, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_mem_load_uops_l3_miss_retired), .umasks = bdw_mem_load_uops_l3_miss_retired }, { .name = "MEM_LOAD_UOPS_RETIRED", .desc = "Retired load uops (Precise Event)", .code = 0xd1, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_mem_load_uops_retired), .umasks = bdw_mem_load_uops_retired }, { .name = "MEM_TRANS_RETIRED", .desc = "Memory transactions retired (Precise Event)", .code = 0xcd, .cntmsk = 0x8, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS | _INTEL_X86_ATTR_LDLAT, .numasks = LIBPFM_ARRAY_SIZE(bdw_mem_trans_retired), .umasks = bdw_mem_trans_retired }, { .name = "MEM_UOPS_RETIRED", .desc = "Memory uops retired (Precise Event)", .code = 0xd0, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_mem_uops_retired), .umasks = bdw_mem_uops_retired }, { .name = "MISALIGN_MEM_REF", .desc = "Misaligned memory references", .code = 0x5, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_misalign_mem_ref), .umasks = bdw_misalign_mem_ref }, { .name = "MOVE_ELIMINATION", .desc = "Move Elimination", .code = 0x58, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_move_elimination), .umasks = bdw_move_elimination }, { .name = "OFFCORE_REQUESTS", .desc = "Demand Data Read requests sent to uncore", .code = 0xb0, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = 
LIBPFM_ARRAY_SIZE(bdw_offcore_requests), .umasks = bdw_offcore_requests }, { .name = "OTHER_ASSISTS", .desc = "Software assist", .code = 0xc1, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_other_assists), .umasks = bdw_other_assists }, { .name = "RESOURCE_STALLS", .desc = "Cycles Allocation is stalled due to Resource Related reason", .code = 0xa2, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_resource_stalls), .umasks = bdw_resource_stalls }, { .name = "ROB_MISC_EVENTS", .desc = "ROB miscellaneous events", .code = 0xcc, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_rob_misc_events), .umasks = bdw_rob_misc_events }, { .name = "RS_EVENTS", .desc = "Reservation Station", .code = 0x5e, .cntmsk = 0xf, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_rs_events), .umasks = bdw_rs_events }, { .name = "RTM_RETIRED", .desc = "Restricted Transactional Memory execution (Precise Event)", .code = 0xc9, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_rtm_retired), .umasks = bdw_rtm_retired }, { .name = "TLB_FLUSH", .desc = "TLB flushes", .code = 0xbd, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_tlb_flush), .umasks = bdw_tlb_flush }, { .name = "UOPS_EXECUTED", .desc = "Uops executed", .code = 0xb1, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_uops_executed), .umasks = bdw_uops_executed }, { .name = "LSD", .desc = "Loop stream detector", .code = 0xa8, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_lsd), .umasks = bdw_lsd, }, { .name = "UOPS_EXECUTED_PORT", .desc = "Uops dispatched to specific ports", .code = 0xa1, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_uops_executed_port), .umasks = bdw_uops_executed_port }, { .name = 
"UOPS_ISSUED", .desc = "Uops issued", .code = 0xe, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_uops_issued), .umasks = bdw_uops_issued }, { .name = "ARITH", .desc = "Arithmetic uop", .code = 0x14, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_arith), .umasks = bdw_arith }, { .name = "UOPS_RETIRED", .desc = "Uops retired (Precise Event)", .code = 0xc2, .cntmsk = 0xff, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_uops_retired), .umasks = bdw_uops_retired }, { .name = "TX_MEM", .desc = "Transactional memory aborts", .code = 0x54, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_tx_mem), .umasks = bdw_tx_mem, }, { .name = "TX_EXEC", .desc = "Transactional execution", .code = 0x5d, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(bdw_tx_exec), .umasks = bdw_tx_exec }, { .name = "OFFCORE_REQUESTS_OUTSTANDING", .desc = "Outstanding offcore requests", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0x60, .numasks = LIBPFM_ARRAY_SIZE(bdw_offcore_requests_outstanding), .ngrp = 1, .umasks = bdw_offcore_requests_outstanding, }, { .name = "ILD_STALL", .desc = "Instruction Length Decoder stalls", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0x87, .numasks = LIBPFM_ARRAY_SIZE(bdw_ild_stall), .ngrp = 1, .umasks = bdw_ild_stall, }, { .name = "PAGE_WALKER_LOADS", .desc = "Page walker loads", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0xbc, .numasks = LIBPFM_ARRAY_SIZE(bdw_page_walker_loads), .ngrp = 1, .umasks = bdw_page_walker_loads, }, { .name = "DSB2MITE_SWITCHES", .desc = "Number of DSB to MITE switches", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0xab, .numasks = LIBPFM_ARRAY_SIZE(bdw_dsb2mite_switches), .ngrp = 1, .umasks = bdw_dsb2mite_switches, }, { .name = "EPT", .desc = "Extended page table", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0x4f, 
.numasks = LIBPFM_ARRAY_SIZE(bdw_ept), .ngrp = 1, .umasks = bdw_ept, }, { .name = "FP_ARITH", .desc = "Floating-point instructions retired", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0xc7, .numasks = LIBPFM_ARRAY_SIZE(bdw_fp_arith), .ngrp = 1, .umasks = bdw_fp_arith, .equiv = "FP_ARITH_INST_RETIRED", }, { .name = "FP_ARITH_INST_RETIRED", .desc = "Floating-point instructions retired", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0xc7, .numasks = LIBPFM_ARRAY_SIZE(bdw_fp_arith), .ngrp = 1, .umasks = bdw_fp_arith, }, { .name = "OFFCORE_REQUESTS_BUFFER", .desc = "Offcore request buffer", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0xb2, .numasks = LIBPFM_ARRAY_SIZE(bdw_offcore_requests_buffer), .ngrp = 1, .umasks = bdw_offcore_requests_buffer, }, { .name = "UOPS_DISPATCHES_CANCELLED", .desc = "Micro-ops cancelled", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0xa0, .numasks = LIBPFM_ARRAY_SIZE(bdw_uops_dispatches_cancelled), .ngrp = 1, .umasks = bdw_uops_dispatches_cancelled, }, { .name = "SQ_MISC", .desc = "SuperQueue miscellaneous", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0xf4, .numasks = LIBPFM_ARRAY_SIZE(bdw_sq_misc), .ngrp = 1, .umasks = bdw_sq_misc, }, { .name = "OFFCORE_RESPONSE_0", .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0x1b7, .flags= INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(bdw_offcore_response), .ngrp = 3, .umasks = bdw_offcore_response, }, { .name = "OFFCORE_RESPONSE_1", .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0x1bb, .flags= INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(bdw_offcore_response), .ngrp = 3, .umasks = bdw_offcore_response, /* identical to actual umasks list for this event */ }, };
/* papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_bdx_unc_cbo_events.h */ /* * Copyright (c) 2017 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: bdx_unc_cbo */ #define CBO_FILT_MESIF(a, b, c, d) \ { .uname = "STATE_"#a,\ .udesc = #b" cacheline state",\ .ufilters[0] = 1ULL << (17 + (c)),\ .grpid = d, \ } #define CBO_FILT_MESIFS(d) \ CBO_FILT_MESIF(I, Invalid, 0, d), \ CBO_FILT_MESIF(S, Shared, 1, d), \ CBO_FILT_MESIF(E, Exclusive, 2, d), \ CBO_FILT_MESIF(M, Modified, 3, d), \ CBO_FILT_MESIF(F, Forward, 4, d), \ CBO_FILT_MESIF(D, Debug, 5, d), \ { .uname = "STATE_MP",\ .udesc = "Cacheline is modified but never written, was forwarded in modified state",\ .ufilters[0] = 0x1ULL << (17+6),\ .grpid = d, \ .uflags = INTEL_X86_NCOMBO, \ }, \ { .uname = "STATE_MESIFD",\ .udesc = "Any cache line state",\ .ufilters[0] = 0x7fULL << 17,\ .grpid = d, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, \ } #define CBO_FILT_OPC(d) \ { .uname = "OPC_RFO",\ .udesc = "Demand data RFO (combine with any OPCODE umask)",\ .ufilters[1] = 0x180ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_CRD",\ .udesc = "Demand code read (combine with any OPCODE umask)",\ .ufilters[1] = 0x181ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_DRD",\ .udesc = "Demand data read (combine with any OPCODE umask)",\ .ufilters[1] = 0x182ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PRD",\ .udesc = "Partial reads (UC) (combine with any OPCODE umask)",\ .ufilters[1] = 0x187ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WCILF",\ .udesc = "Full Stream store (combine with any OPCODE umask)", \ .ufilters[1] = 0x18cULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WCIL",\ .udesc = "Partial Stream store (combine with any OPCODE umask)", \ .ufilters[1] = 0x18dULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WIL",\ .udesc = "Write Invalidate Line (Partial) (combine with any OPCODE umask)", \ .ufilters[1] = 0x18fULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = 
"OPC_PF_RFO",\ .udesc = "Prefetch RFO into LLC but do not pass to L2 (includes hints) (combine with any OPCODE umask)", \ .ufilters[1] = 0x190ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PF_CODE",\ .udesc = "Prefetch code into LLC but do not pass to L2 (includes hints) (combine with any OPCODE umask)", \ .ufilters[1] = 0x191ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PF_DATA",\ .udesc = "Prefetch data into LLC but do not pass to L2 (includes hints) (combine with any OPCODE umask)", \ .ufilters[1] = 0x192ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCIWIL",\ .udesc = "PCIe write (partial, non-allocating) - partial line MMIO write transactions from IIO (P2P). Not used for coherent transactions. Uncacheable. (combine with any OPCODE umask)", \ .ufilters[1] = 0x193ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCIWIF",\ .udesc = "PCIe write (full, non-allocating) - full line MMIO write transactions from IIO (P2P). Not used for coherent transactions. Uncacheable. 
(combine with any OPCODE umask)", \ .ufilters[1] = 0x194ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCIITOM",\ .udesc = "PCIe write (allocating) (combine with any OPCODE umask)", \ .ufilters[1] = 0x19cULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCIRDCUR",\ .udesc = "PCIe read current (combine with any OPCODE umask)", \ .ufilters[1] = 0x19eULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WBMTOI",\ .udesc = "Request writeback modified invalidate line (combine with any OPCODE umask)", \ .ufilters[1] = 0x1c4ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WBMTOE",\ .udesc = "Request writeback modified set to exclusive (combine with any OPCODE umask)", \ .ufilters[1] = 0x1c5ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_ITOM",\ .udesc = "Request invalidate line. Request exclusive ownership of the line (combine with any OPCODE umask)", \ .ufilters[1] = 0x1c8ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCINSRD",\ .udesc = "PCIe non-snoop read (combine with any OPCODE umask)", \ .ufilters[1] = 0x1e4ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCINSWR",\ .udesc = "PCIe non-snoop write (partial) (combine with any OPCODE umask)", \ .ufilters[1] = 0x1e5ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCINSWRF",\ .udesc = "PCIe non-snoop write (full) (combine with any OPCODE umask)", \ .ufilters[1] = 0x1e6ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ } static intel_x86_umask_t bdx_unc_c_llc_lookup[]={ { .uname = "ANY", .ucode = 0x1100, .udesc = "Cache Lookups -- Any Request", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "DATA_READ", .ucode = 0x300, .udesc = "Cache Lookups -- Data Read Request", .grpid = 0, }, { .uname = "NID", .ucode = 0x4100, .udesc = "Cache Lookups -- Lookups that Match NID", 
.umodmsk_req = _SNBEP_UNC_ATTR_NF1, .grpid = 1, .uflags = INTEL_X86_GRP_DFL_NONE }, { .uname = "READ", .ucode = 0x2100, .udesc = "Cache Lookups -- Any Read Request", .grpid = 0, }, { .uname = "REMOTE_SNOOP", .ucode = 0x900, .udesc = "Cache Lookups -- External Snoop Request", .grpid = 0, }, { .uname = "WRITE", .ucode = 0x500, .udesc = "Cache Lookups -- Write Requests", .grpid = 0, }, CBO_FILT_MESIFS(2), }; static intel_x86_umask_t bdx_unc_c_llc_victims[]={ { .uname = "F_STATE", .ucode = 0x800, .udesc = "Lines in Forward state", .grpid = 0, }, { .uname = "I_STATE", .ucode = 0x400, .udesc = "Lines in I state", .grpid = 0, }, { .uname = "S_STATE", .ucode = 0x400, .udesc = "Lines in S state", .grpid = 0, }, { .uname = "E_STATE", .ucode = 0x200, .udesc = "Lines in E state", .grpid = 0, }, { .uname = "M_STATE", .ucode = 0x100, .udesc = "Lines in M state", .grpid = 0, }, { .uname = "MISS", .ucode = 0x1000, .udesc = "Lines Victimized", .grpid = 0, }, { .uname = "NID", .ucode = 0x4000, .udesc = "Lines Victimized -- Victimized Lines that Match NID", .uflags = INTEL_X86_GRP_DFL_NONE, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .grpid = 1, }, }; static intel_x86_umask_t bdx_unc_c_misc[]={ { .uname = "CVZERO_PREFETCH_MISS", .ucode = 0x2000, .udesc = "Cbo Misc -- DRd hitting non-M with raw CV=0", }, { .uname = "CVZERO_PREFETCH_VICTIM", .ucode = 0x1000, .udesc = "Cbo Misc -- Clean Victim with raw CV=0", }, { .uname = "RFO_HIT_S", .ucode = 0x800, .udesc = "Cbo Misc -- RFO HitS", }, { .uname = "RSPI_WAS_FSE", .ucode = 0x100, .udesc = "Cbo Misc -- Silent Snoop Eviction", }, { .uname = "STARTED", .ucode = 0x400, .udesc = "Cbo Misc -- ", }, { .uname = "WC_ALIASING", .ucode = 0x200, .udesc = "Cbo Misc -- Write Combining Aliasing", }, }; static intel_x86_umask_t bdx_unc_c_ring_ad_used[]={ { .uname = "ALL", .ucode = 0xf00, .udesc = "AD Ring In Use -- All", }, { .uname = "CCW", .ucode = 0xc00, .udesc = "AD Ring In Use -- Down", }, { .uname = "CW", .ucode = 0x300, .udesc = "AD Ring In Use -- Up", 
}, { .uname = "DOWN_EVEN", .ucode = 0x400, .udesc = "AD Ring In Use -- Down and Even", }, { .uname = "DOWN_ODD", .ucode = 0x800, .udesc = "AD Ring In Use -- Down and Odd", }, { .uname = "UP_EVEN", .ucode = 0x100, .udesc = "AD Ring In Use -- Up and Even", }, { .uname = "UP_ODD", .ucode = 0x200, .udesc = "AD Ring In Use -- Up and Odd", }, }; static intel_x86_umask_t bdx_unc_c_ring_ak_used[]={ { .uname = "ALL", .ucode = 0xf00, .udesc = "AK Ring In Use -- All", }, { .uname = "CCW", .ucode = 0xc00, .udesc = "AK Ring In Use -- Down", }, { .uname = "CW", .ucode = 0x300, .udesc = "AK Ring In Use -- Up", }, { .uname = "DOWN_EVEN", .ucode = 0x400, .udesc = "AK Ring In Use -- Down and Even", }, { .uname = "DOWN_ODD", .ucode = 0x800, .udesc = "AK Ring In Use -- Down and Odd", }, { .uname = "UP_EVEN", .ucode = 0x100, .udesc = "AK Ring In Use -- Up and Even", }, { .uname = "UP_ODD", .ucode = 0x200, .udesc = "AK Ring In Use -- Up and Odd", }, }; static intel_x86_umask_t bdx_unc_c_ring_bl_used[]={ { .uname = "ALL", .ucode = 0xf00, .udesc = "BL Ring in Use -- All", }, { .uname = "CCW", .ucode = 0xc00, .udesc = "BL Ring in Use -- Down", }, { .uname = "CW", .ucode = 0x300, .udesc = "BL Ring in Use -- Up", }, { .uname = "DOWN_EVEN", .ucode = 0x400, .udesc = "BL Ring in Use -- Down and Even", }, { .uname = "DOWN_ODD", .ucode = 0x800, .udesc = "BL Ring in Use -- Down and Odd", }, { .uname = "UP_EVEN", .ucode = 0x100, .udesc = "BL Ring in Use -- Up and Even", }, { .uname = "UP_ODD", .ucode = 0x200, .udesc = "BL Ring in Use -- Up and Odd", }, }; static intel_x86_umask_t bdx_unc_c_ring_bounces[]={ { .uname = "AD", .ucode = 0x100, .udesc = "Number of LLC responses that bounced on the Ring. -- AD", }, { .uname = "AK", .ucode = 0x200, .udesc = "Number of LLC responses that bounced on the Ring. -- AK", }, { .uname = "BL", .ucode = 0x400, .udesc = "Number of LLC responses that bounced on the Ring. 
-- BL", }, { .uname = "IV", .ucode = 0x1000, .udesc = "Number of LLC responses that bounced on the Ring. -- Snoops of processor caches.", }, }; static intel_x86_umask_t bdx_unc_c_ring_iv_used[]={ { .uname = "ANY", .ucode = 0xf00, .udesc = "BL Ring in Use -- Any", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "DN", .ucode = 0xc00, .udesc = "BL Ring in Use -- Down", .uflags = INTEL_X86_NCOMBO, }, { .uname = "DOWN", .ucode = 0xcc00, .udesc = "BL Ring in Use -- Down", .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP", .ucode = 0x300, .udesc = "BL Ring in Use -- Up", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_c_rxr_ext_starved[]={ { .uname = "IPQ", .ucode = 0x200, .udesc = "Ingress Arbiter Blocking Cycles -- IPQ", }, { .uname = "IRQ", .ucode = 0x100, .udesc = "Ingress Arbiter Blocking Cycles -- IRQ", }, { .uname = "ISMQ_BIDS", .ucode = 0x800, .udesc = "Ingress Arbiter Blocking Cycles -- ISMQ_BID", }, { .uname = "PRQ", .ucode = 0x400, .udesc = "Ingress Arbiter Blocking Cycles -- PRQ", }, }; static intel_x86_umask_t bdx_unc_c_rxr_inserts[]={ { .uname = "IPQ", .ucode = 0x400, .udesc = "Ingress Allocations -- IPQ", }, { .uname = "IRQ", .ucode = 0x100, .udesc = "Ingress Allocations -- IRQ", }, { .uname = "IRQ_REJ", .ucode = 0x200, .udesc = "Ingress Allocations -- IRQ Rejected", }, { .uname = "PRQ", .ucode = 0x1000, .udesc = "Ingress Allocations -- PRQ", }, { .uname = "PRQ_REJ", .ucode = 0x2000, .udesc = "Ingress Allocations -- PRQ Rejected", }, }; static intel_x86_umask_t bdx_unc_c_rxr_ipq_retry[]={ { .uname = "ADDR_CONFLICT", .ucode = 0x400, .udesc = "Probe Queue Retries -- Address Conflict", }, { .uname = "ANY", .ucode = 0x100, .udesc = "Probe Queue Retries -- Any Reject", .uflags = INTEL_X86_DFL, }, { .uname = "FULL", .ucode = 0x200, .udesc = "Probe Queue Retries -- No Egress Credits", }, { .uname = "QPI_CREDITS", .ucode = 0x1000, .udesc = "Probe Queue Retries -- No QPI Credits", }, }; static intel_x86_umask_t bdx_unc_c_rxr_ipq_retry2[]={ { .uname 
= "AD_SBO", .ucode = 0x100, .udesc = "Probe Queue Retries -- No AD Sbo Credits", }, { .uname = "TARGET", .ucode = 0x4000, .udesc = "Probe Queue Retries -- Target Node Filter", }, }; static intel_x86_umask_t bdx_unc_c_rxr_irq_retry[]={ { .uname = "ADDR_CONFLICT", .ucode = 0x400, .udesc = "Ingress Request Queue Rejects -- Address Conflict", }, { .uname = "ANY", .ucode = 0x100, .udesc = "Ingress Request Queue Rejects -- Any Reject", .uflags = INTEL_X86_DFL, }, { .uname = "FULL", .ucode = 0x200, .udesc = "Ingress Request Queue Rejects -- No Egress Credits", }, { .uname = "IIO_CREDITS", .ucode = 0x2000, .udesc = "Ingress Request Queue Rejects -- No IIO Credits", }, { .uname = "NID", .ucode = 0x4000, .udesc = "Ingress Request Queue Rejects -- ", }, { .uname = "QPI_CREDITS", .ucode = 0x1000, .udesc = "Ingress Request Queue Rejects -- No QPI Credits", }, { .uname = "RTID", .ucode = 0x800, .udesc = "Ingress Request Queue Rejects -- No RTIDs", }, }; static intel_x86_umask_t bdx_unc_c_rxr_irq_retry2[]={ { .uname = "AD_SBO", .ucode = 0x100, .udesc = "Ingress Request Queue Rejects -- No AD Sbo Credits", }, { .uname = "BL_SBO", .ucode = 0x200, .udesc = "Ingress Request Queue Rejects -- No BL Sbo Credits", }, { .uname = "TARGET", .ucode = 0x4000, .udesc = "Ingress Request Queue Rejects -- Target Node Filter", }, }; static intel_x86_umask_t bdx_unc_c_rxr_ismq_retry[]={ { .uname = "ANY", .ucode = 0x100, .udesc = "ISMQ Retries -- Any Reject", .uflags = INTEL_X86_DFL, }, { .uname = "FULL", .ucode = 0x200, .udesc = "ISMQ Retries -- No Egress Credits", }, { .uname = "IIO_CREDITS", .ucode = 0x2000, .udesc = "ISMQ Retries -- No IIO Credits", }, { .uname = "NID", .ucode = 0x4000, .udesc = "ISMQ Retries -- ", }, { .uname = "QPI_CREDITS", .ucode = 0x1000, .udesc = "ISMQ Retries -- No QPI Credits", }, { .uname = "RTID", .ucode = 0x800, .udesc = "ISMQ Retries -- No RTIDs", }, { .uname = "WB_CREDITS", .ucode = 0x8000, .udesc = "ISMQ Retries -- ", }, }; static intel_x86_umask_t 
bdx_unc_c_rxr_ismq_retry2[]={ { .uname = "AD_SBO", .ucode = 0x100, .udesc = "ISMQ Request Queue Rejects -- No AD Sbo Credits", }, { .uname = "BL_SBO", .ucode = 0x200, .udesc = "ISMQ Request Queue Rejects -- No BL Sbo Credits", }, { .uname = "TARGET", .ucode = 0x4000, .udesc = "ISMQ Request Queue Rejects -- Target Node Filter", }, }; static intel_x86_umask_t bdx_unc_c_rxr_occupancy[]={ { .uname = "IPQ", .ucode = 0x400, .udesc = "Ingress Occupancy -- IPQ", .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ", .ucode = 0x100, .udesc = "Ingress Occupancy -- IRQ", .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ_REJ", .ucode = 0x200, .udesc = "Ingress Occupancy -- IRQ Rejected", .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ_REJ", .ucode = 0x2000, .udesc = "Ingress Occupancy -- PRQ Rejects", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_c_sbo_credits_acquired[]={ { .uname = "AD", .ucode = 0x100, .udesc = "SBo Credits Acquired -- For AD Ring", }, { .uname = "BL", .ucode = 0x200, .udesc = "SBo Credits Acquired -- For BL Ring", }, }; static intel_x86_umask_t bdx_unc_c_sbo_credit_occupancy[]={ { .uname = "AD", .ucode = 0x100, .udesc = "SBo Credits Occupancy -- For AD Ring", }, { .uname = "BL", .ucode = 0x200, .udesc = "SBo Credits Occupancy -- For BL Ring", }, }; static intel_x86_umask_t bdx_unc_c_tor_inserts[]={ { .uname = "ALL", .ucode = 0x800, .udesc = "All", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, .grpid = 0, }, { .uname = "EVICTION", .ucode = 0x400, .udesc = "Evictions", .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, .grpid = 0, }, { .uname = "LOCAL", .ucode = 0x2800, .udesc = "Local Memory", .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, .grpid = 0, }, { .uname = "LOCAL_OPCODE", .ucode = 0x2100, .udesc = "Local Memory - Opcode Matched", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "MISS_LOCAL", .ucode = 0x2a00, .udesc = "Misses to Local Memory", .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, .grpid = 0, 
}, { .uname = "MISS_LOCAL_OPCODE", .ucode = 0x2300, .udesc = "Misses to Local Memory - Opcode Matched", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "MISS_OPCODE", .ucode = 0x300, .udesc = "Miss Opcode Match", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "MISS_REMOTE", .ucode = 0x8a00, .udesc = "Misses to Remote Memory", .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, .grpid = 0, }, { .uname = "MISS_REMOTE_OPCODE", .ucode = 0x8300, .udesc = "Misses to Remote Memory - Opcode Matched", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NID_ALL", .ucode = 0x4800, .udesc = "NID Matched", .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, .grpid = 0, }, { .uname = "NID_EVICTION", .ucode = 0x4400, .udesc = "NID Matched Evictions", .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, .grpid = 0, }, { .uname = "NID_MISS_ALL", .ucode = 0x4a00, .udesc = "NID Matched Miss All", .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, .grpid = 0, }, { .uname = "NID_MISS_OPCODE", .ucode = 0x4300, .udesc = "NID and Opcode Matched Miss", .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NID_OPCODE", .ucode = 0x4100, .udesc = "NID and Opcode Matched", .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NID_WB", .ucode = 0x5000, .udesc = "NID Matched Writebacks", .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, .grpid = 0, }, { .uname = "OPCODE", .ucode = 0x100, .udesc = "Opcode Match", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "REMOTE", .ucode = 0x8800, .udesc = "Remote Memory", .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, .grpid = 0, }, { .uname = "REMOTE_OPCODE", .ucode = 0x8100, .udesc = "Remote Memory - Opcode Matched", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "WB", .ucode = 0x1000, .udesc = 
"Writebacks", .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, .grpid = 0, }, CBO_FILT_OPC(1) }; static intel_x86_umask_t bdx_unc_c_tor_occupancy[]={ { .uname = "ALL", .ucode = 0x800, .udesc = "Any", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, .grpid = 0, }, { .uname = "EVICTION", .ucode = 0x400, .udesc = "Evictions", .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "LOCAL", .ucode = 0x2800, .udesc = "Number of transactions in the TOR that are satisfied by locally homed memory", .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "LOCAL_OPCODE", .ucode = 0x2100, .udesc = "Local Memory - Opcode Matched", .grpid = 0, }, { .uname = "MISS_ALL", .ucode = 0xa00, .udesc = "Miss All", .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_LOCAL", .ucode = 0x2a00, .udesc = "Number of miss transactions in the TOR that are satisfied by locally homed memory", .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_LOCAL_OPCODE", .ucode = 0x2300, .udesc = "Number of miss opcode-matched transactions inserted into the TOR that are satisfied by locally homed memory", .grpid = 0, }, { .uname = "MISS_OPCODE", .ucode = 0x300, .udesc = "Number of miss transactions inserted into the TOR that match an opcode (must provide opc_* umask)", .grpid = 0, }, { .uname = "MISS_REMOTE_OPCODE", .ucode = 0x8300, .udesc = "Number of miss opcode-matched transactions inserted into the TOR that are satisfied by remote caches or memory", .grpid = 0, }, { .uname = "NID_ALL", .ucode = 0x4800, .udesc = "Number of NID-matched transactions inserted into the TOR (must provide nf=X modifier)", .grpid = 0, }, { .uname = "NID_EVICTION", .ucode = 0x4400, .udesc = "Number of NID-matched eviction transactions inserted into the TOR (must provide nf=X modifier)", .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = 
"NID_MISS_ALL", .ucode = 0x4a00, .udesc = "Number of NID-matched miss transactions that were inserted into the TOR (must provide nf=X modifier)", .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_MISS_OPCODE", .ucode = 0x4300, .udesc = "Number of NID and opcode matched miss transactions inserted into the TOR (must provide opc_* umask and nf=X modifier)", .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NID_OPCODE", .ucode = 0x4100, .udesc = "Number of transactions inserted into the TOR that match a NID and opcode (must provide opc_* umask and nf=X modifier)", .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NID_WB", .ucode = 0x5000, .udesc = "Number of NID-matched write back transactions inserted into the TOR (must provide nf=X modifier)", .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "OPCODE", .ucode = 0x100, .udesc = "Number of transactions inserted into the TOR that match an opcode (must provide opc_* umask)", .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE", .ucode = 0x8800, .udesc = "Number of transactions inserted into the TOR that are satisfied by remote caches or memory", .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "REMOTE_OPCODE", .ucode = 0x8100, .udesc = "Number of opcode-matched transactions inserted into the TOR that are satisfied by remote caches or memory", .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WB", .ucode = 0x1000, .udesc = "Number of write transactions inserted into the TOR", .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_REMOTE", .ucode = 0x8a00, .udesc = "Number of miss transactions inserted into the TOR that are satisfied by remote caches or memory", .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, 
CBO_FILT_OPC(1) }; static intel_x86_umask_t bdx_unc_c_txr_ads_used[]={ { .uname = "AD", .ucode = 0x100, .udesc = "Onto AD Ring", }, { .uname = "AK", .ucode = 0x200, .udesc = "Onto AK Ring", }, { .uname = "BL", .ucode = 0x400, .udesc = "Onto BL Ring", }, }; static intel_x86_umask_t bdx_unc_c_txr_inserts[]={ { .uname = "AD_CACHE", .ucode = 0x100, .udesc = "Egress Allocations -- AD - Cachebo", }, { .uname = "AD_CORE", .ucode = 0x1000, .udesc = "Egress Allocations -- AD - Corebo", }, { .uname = "AK_CACHE", .ucode = 0x200, .udesc = "Egress Allocations -- AK - Cachebo", }, { .uname = "AK_CORE", .ucode = 0x2000, .udesc = "Egress Allocations -- AK - Corebo", }, { .uname = "BL_CACHE", .ucode = 0x400, .udesc = "Egress Allocations -- BL - Cachebo", }, { .uname = "BL_CORE", .ucode = 0x4000, .udesc = "Egress Allocations -- BL - Corebo", }, { .uname = "IV_CACHE", .ucode = 0x800, .udesc = "Egress Allocations -- IV - Cachebo", }, }; static intel_x86_entry_t intel_bdx_unc_c_pe[]={ { .name = "UNC_C_BOUNCE_CONTROL", .code = 0xa, .desc = "TBD", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_C_CLOCKTICKS", .code = 0x0, .desc = "Clock ticks", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_C_COUNTER0_OCCUPANCY", .code = 0x1f, .desc = "Since occupancy counts can only be captured in the Cbo's counter 0, this event allows a user to capture occupancy related information by filtering the Cbo 0 occupancy count captured in Counter 0. The filtering available is found in the control register - threshold, invert and edge detect. E.g. setting threshold to 1 can effectively monitor how many cycles the monitored queue has an entry.", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_C_FAST_ASSERTED", .code = 0x9, .desc = "Counts the number of cycles either the local distress or incoming distress signals are asserted.
Incoming distress includes both up and dn.", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_C_LLC_LOOKUP", .code = 0x34, .desc = "Counts the number of times the LLC was accessed - this includes code, data, prefetches and hints coming from L2. This has numerous filters available. Note the non-standard filtering equation. This event will count requests that lookup the cache multiple times with multiple increments. One must ALWAYS set umask bit 0 and select a state or states to match. Otherwise, the event will count nothing. CBoGlCtrl[22:18] bits correspond to [FMESI] state.", .modmsk = BDX_UNC_CBO_NID_ATTRS, .flags = INTEL_X86_NO_AUTOENCODE, .cntmsk = 0xf, .ngrp = 3, .umasks = bdx_unc_c_llc_lookup, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_llc_lookup), }, { .name = "UNC_C_LLC_VICTIMS", .code = 0x37, .desc = "Counts the number of lines that were victimized on a fill. This can be filtered by the state that the line was in.", .modmsk = BDX_UNC_CBO_NID_ATTRS, .flags = INTEL_X86_NO_AUTOENCODE, .cntmsk = 0xf, .ngrp = 2, .umasks = bdx_unc_c_llc_victims, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_llc_victims), }, { .name = "UNC_C_MISC", .code = 0x39, .desc = "Miscellaneous events in the Cbo.", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_misc, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_misc), }, { .name = "UNC_C_RING_AD_USED", .code = 0x1b, .desc = "Counts the number of cycles that the AD ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings in BDX -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. 
In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_ring_ad_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_ring_ad_used), }, { .name = "UNC_C_RING_AK_USED", .code = 0x1c, .desc = "Counts the number of cycles that the AK ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings in BDX -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_ring_ak_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_ring_ak_used), }, { .name = "UNC_C_RING_BL_USED", .code = 0x1d, .desc = "Counts the number of cycles that the BL ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings in BDX -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring.
In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_ring_bl_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_ring_bl_used), }, { .name = "UNC_C_RING_BOUNCES", .code = 0x5, .desc = "TBD", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_ring_bounces, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_ring_bounces), }, { .name = "UNC_C_RING_IV_USED", .code = 0x1e, .desc = "Counts the number of cycles that the IV ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. There is only 1 IV ring in BDX. Therefore, if one wants to monitor the Even ring, they should select both UP_EVEN and DN_EVEN. To monitor the Odd ring, they should select both UP_ODD and DN_ODD.", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_ring_iv_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_ring_iv_used), }, { .name = "UNC_C_RING_SRC_THRTL", .code = 0x7, .desc = "TBD", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_C_RXR_EXT_STARVED", .code = 0x12, .desc = "Counts cycles in external starvation. This occurs when one of the ingress queues is being starved by the other queues.", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_rxr_ext_starved, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_ext_starved), }, { .name = "UNC_C_RXR_INSERTS", .code = 0x13, .desc = "Counts number of allocations per cycle into the specified Ingress queue.", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_rxr_inserts, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_inserts), }, { .name = "UNC_C_RXR_IPQ_RETRY", .code = 0x31, .desc = "Number of times a snoop (probe) request had to retry.
Filters exist to cover some of the common retry cases.", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_rxr_ipq_retry, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_ipq_retry), }, { .name = "UNC_C_RXR_IPQ_RETRY2", .code = 0x28, .desc = "Number of times a snoop (probe) request had to retry. Filters exist to cover some of the common retry cases.", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_rxr_ipq_retry2, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_ipq_retry2), }, { .name = "UNC_C_RXR_IRQ_RETRY", .code = 0x32, .desc = "TBD", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_rxr_irq_retry, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_irq_retry), }, { .name = "UNC_C_RXR_IRQ_RETRY2", .code = 0x29, .desc = "TBD", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_rxr_irq_retry2, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_irq_retry2), }, { .name = "UNC_C_RXR_ISMQ_RETRY", .code = 0x33, .desc = "Number of times a transaction flowing through the ISMQ had to retry. Transactions pass through the ISMQ as responses for requests that already exist in the Cbo.
Some examples include: when data is returned or when snoop responses come back from the cores.", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_rxr_ismq_retry, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_ismq_retry), }, { .name = "UNC_C_RXR_ISMQ_RETRY2", .code = 0x2a, .desc = "TBD", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_rxr_ismq_retry2, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_ismq_retry2), }, { .name = "UNC_C_RXR_OCCUPANCY", .code = 0x11, .desc = "Counts number of entries in the specified Ingress queue in each cycle.", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0x1, .ngrp = 1, .umasks = bdx_unc_c_rxr_occupancy, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_rxr_occupancy), }, { .name = "UNC_C_SBO_CREDITS_ACQUIRED", .code = 0x3d, .desc = "Number of Sbo credits acquired in a given cycle, per ring. Each Cbo is assigned an Sbo it can communicate with.", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_sbo_credits_acquired, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_sbo_credits_acquired), }, { .name = "UNC_C_SBO_CREDIT_OCCUPANCY", .code = 0x3e, .desc = "Number of Sbo credits in use in a given cycle, per ring. Each Cbo is assigned an Sbo it can communicate with.", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0x1, .ngrp = 1, .umasks = bdx_unc_c_sbo_credit_occupancy, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_sbo_credit_occupancy), }, { .name = "UNC_C_TOR_INSERTS", .code = 0x35, .desc = "Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. There are a number of subevent filters but only a subset of the subevent combinations are valid. Subevents that require an opcode or NID match require the Cn_MSR_PMON_BOX_FILTER.{opc, nid} field to be set. 
If, for example, one wanted to count DRD Local Misses, one should select MISS_OPC_MATCH and set Cn_MSR_PMON_BOX_FILTER.opc to DRD (0x182).", .modmsk = BDX_UNC_CBO_NID_ATTRS | _SNBEP_UNC_ATTR_ISOC | _SNBEP_UNC_ATTR_NC, .flags = INTEL_X86_NO_AUTOENCODE, .cntmsk = 0xf, .ngrp = 2, .umasks = bdx_unc_c_tor_inserts, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_tor_inserts), }, { .name = "UNC_C_TOR_OCCUPANCY", .code = 0x36, .desc = "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. There are a number of subevent filters but only a subset of the subevent combinations are valid. Subevents that require an opcode or NID match require the Cn_MSR_PMON_BOX_FILTER.{opc, nid} field to be set. If, for example, one wanted to count DRD Local Misses, one should select MISS_OPC_MATCH and set Cn_MSR_PMON_BOX_FILTER.opc to DRD (0x182).", .modmsk = BDX_UNC_CBO_NID_ATTRS | _SNBEP_UNC_ATTR_ISOC | _SNBEP_UNC_ATTR_NC, .flags = INTEL_X86_NO_AUTOENCODE, .cntmsk = 0x1, .ngrp = 2, .umasks = bdx_unc_c_tor_occupancy, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_tor_occupancy), }, { .name = "UNC_C_TXR_ADS_USED", .code = 0x4, .desc = "TBD", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_txr_ads_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_txr_ads_used), }, { .name = "UNC_C_TXR_INSERTS", .code = 0x2, .desc = "Number of allocations into the Cbo Egress. The Egress is used to queue up requests destined for the ring.", .modmsk = BDX_UNC_CBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_c_txr_inserts, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_c_txr_inserts), }, };
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_bdx_unc_ha_events.h
/* * Copyright (c) 2017 Google Inc.
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: bdx_unc_ha */ static intel_x86_umask_t bdx_unc_h_bypass_imc[]={ { .uname = "NOT_TAKEN", .ucode = 0x200, .udesc = "HA to iMC Bypass -- Not Taken", }, { .uname = "TAKEN", .ucode = 0x100, .udesc = "HA to iMC Bypass -- Taken", }, }; static intel_x86_umask_t bdx_unc_h_directory_lookup[]={ { .uname = "NO_SNP", .ucode = 0x200, .udesc = "Directory Lookups -- Snoop Not Needed", }, { .uname = "SNP", .ucode = 0x100, .udesc = "Directory Lookups -- Snoop Needed", }, }; static intel_x86_umask_t bdx_unc_h_directory_update[]={ { .uname = "ANY", .ucode = 0x300, .udesc = "Directory Updates -- Any Directory Update", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CLEAR", .ucode = 0x200, .udesc = "Directory Updates -- Directory Clear", }, { .uname = "SET", .ucode = 0x100, .udesc = "Directory Updates -- Directory Set", }, }; static intel_x86_umask_t bdx_unc_h_hitme_hit[]={ { .uname = "ACKCNFLTWBI", .ucode = 0x400, .udesc = "Counts Number of Hits in HitMe Cache -- op is AckCnfltWbI", }, { .uname = "ALL", .ucode = 0xff00, .udesc = "Counts Number of Hits in HitMe Cache -- All Requests", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALLOCS", .ucode = 0x7000, .udesc = "Counts Number of Hits in HitMe Cache -- Allocations", .uflags = INTEL_X86_NCOMBO, }, { .uname = "EVICTS", .ucode = 0x4200, .udesc = "Counts Number of Hits in HitMe Cache -- Evictions", .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM", .ucode = 0xf00, .udesc = "Counts Number of Hits in HitMe Cache -- HOM Requests", .uflags = INTEL_X86_NCOMBO, }, { .uname = "INVALS", .ucode = 0x2600, .udesc = "Counts Number of Hits in HitMe Cache -- Invalidations", .uflags = INTEL_X86_NCOMBO, }, { .uname = "READ_OR_INVITOE", .ucode = 0x100, .udesc = "Counts Number of Hits in HitMe Cache -- op is RdCode, RdData, RdDataMigratory, RdInvOwn, RdCur or InvItoE", }, { .uname = "RSP", .ucode = 0x8000, .udesc = "Counts Number of Hits in HitMe Cache -- op is RspI, RspIWb, RspS, RspSWb, RspCnflt or RspCnfltWbI", }, {
.uname = "RSPFWDI_LOCAL", .ucode = 0x2000, .udesc = "Counts Number of Hits in HitMe Cache -- op is RspIFwd or RspIFwdWb for a local request", }, { .uname = "RSPFWDI_REMOTE", .ucode = 0x1000, .udesc = "Counts Number of Hits in HitMe Cache -- op is RspIFwd or RspIFwdWb for a remote request", }, { .uname = "RSPFWDS", .ucode = 0x4000, .udesc = "Counts Number of Hits in HitMe Cache -- op is RspSFwd or RspSFwdWb", }, { .uname = "WBMTOE_OR_S", .ucode = 0x800, .udesc = "Counts Number of Hits in HitMe Cache -- op is WbMtoE or WbMtoS", }, { .uname = "WBMTOI", .ucode = 0x200, .udesc = "Counts Number of Hits in HitMe Cache -- op is WbMtoI", }, }; static intel_x86_umask_t bdx_unc_h_hitme_hit_pv_bits_set[]={ { .uname = "ACKCNFLTWBI", .ucode = 0x400, .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- op is AckCnfltWbI", }, { .uname = "ALL", .ucode = 0xff00, .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- All Requests", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "HOM", .ucode = 0xf00, .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- HOM Requests", }, { .uname = "READ_OR_INVITOE", .ucode = 0x100, .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- op is RdCode, RdData, RdDataMigratory, RdInvOwn, RdCur or InvItoE", }, { .uname = "RSP", .ucode = 0x8000, .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- op is RspI, RspIWb, RspS, RspSWb, RspCnflt or RspCnfltWbI", }, { .uname = "RSPFWDI_LOCAL", .ucode = 0x2000, .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- op is RspIFwd or RspIFwdWb for a local request", }, { .uname = "RSPFWDI_REMOTE", .ucode = 0x1000, .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- op is RspIFwd or RspIFwdWb for a remote request", }, { .uname = "RSPFWDS", .ucode = 0x4000, .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- op is RspSFwd or RspSFwdWb", }, { .uname = "WBMTOE_OR_S", .ucode = 0x800, .udesc = "Accumulates
Number of PV bits set on HitMe Cache Hits -- op is WbMtoE or WbMtoS", }, { .uname = "WBMTOI", .ucode = 0x200, .udesc = "Accumulates Number of PV bits set on HitMe Cache Hits -- op is WbMtoI", }, }; static intel_x86_umask_t bdx_unc_h_hitme_lookup[]={ { .uname = "ACKCNFLTWBI", .ucode = 0x400, .udesc = "Counts Number of times HitMe Cache is accessed -- op is AckCnfltWbI", }, { .uname = "ALL", .ucode = 0xff00, .udesc = "Counts Number of times HitMe Cache is accessed -- All Requests", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALLOCS", .ucode = 0x7000, .udesc = "Counts Number of times HitMe Cache is accessed -- Allocations", }, { .uname = "HOM", .ucode = 0xf00, .udesc = "Counts Number of times HitMe Cache is accessed -- HOM Requests", .uflags = INTEL_X86_NCOMBO, }, { .uname = "INVALS", .ucode = 0x2600, .udesc = "Counts Number of times HitMe Cache is accessed -- Invalidations", .uflags = INTEL_X86_NCOMBO, }, { .uname = "READ_OR_INVITOE", .ucode = 0x100, .udesc = "Counts Number of times HitMe Cache is accessed -- op is RdCode, RdData, RdDataMigratory, RdInvOwn, RdCur or InvItoE", }, { .uname = "RSP", .ucode = 0x8000, .udesc = "Counts Number of times HitMe Cache is accessed -- op is RspI, RspIWb, RspS, RspSWb, RspCnflt or RspCnfltWbI", }, { .uname = "RSPFWDI_LOCAL", .ucode = 0x2000, .udesc = "Counts Number of times HitMe Cache is accessed -- op is RspIFwd or RspIFwdWb for a local request", }, { .uname = "RSPFWDI_REMOTE", .ucode = 0x1000, .udesc = "Counts Number of times HitMe Cache is accessed -- op is RspIFwd or RspIFwdWb for a remote request", }, { .uname = "RSPFWDS", .ucode = 0x4000, .udesc = "Counts Number of times HitMe Cache is accessed -- op is RspSFwd or RspSFwdWb", }, { .uname = "WBMTOE_OR_S", .ucode = 0x800, .udesc = "Counts Number of times HitMe Cache is accessed -- op is WbMtoE or WbMtoS", }, { .uname = "WBMTOI", .ucode = 0x200, .udesc = "Counts Number of times HitMe Cache is accessed -- op is WbMtoI", }, }; static intel_x86_umask_t
bdx_unc_h_igr_no_credit_cycles[]={ { .uname = "AD_QPI0", .ucode = 0x100, .udesc = "Cycles without QPI Ingress Credits -- AD to QPI Link 0", }, { .uname = "AD_QPI1", .ucode = 0x200, .udesc = "Cycles without QPI Ingress Credits -- AD to QPI Link 1", }, { .uname = "AD_QPI2", .ucode = 0x1000, .udesc = "Cycles without QPI Ingress Credits -- AD to QPI Link 2", }, { .uname = "BL_QPI0", .ucode = 0x400, .udesc = "Cycles without QPI Ingress Credits -- BL to QPI Link 0", }, { .uname = "BL_QPI1", .ucode = 0x800, .udesc = "Cycles without QPI Ingress Credits -- BL to QPI Link 1", }, { .uname = "BL_QPI2", .ucode = 0x2000, .udesc = "Cycles without QPI Ingress Credits -- BL to QPI Link 2", }, }; static intel_x86_umask_t bdx_unc_h_imc_reads[]={ { .uname = "NORMAL", .ucode = 0x100, .udesc = "HA to iMC Normal Priority Reads Issued -- Normal Priority", .uflags = INTEL_X86_DFL, }, }; static intel_x86_umask_t bdx_unc_h_imc_writes[]={ { .uname = "ALL", .ucode = 0xf00, .udesc = "HA to iMC Full Line Writes Issued -- All Writes", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FULL", .ucode = 0x100, .udesc = "HA to iMC Full Line Writes Issued -- Full Line Non-ISOCH", }, { .uname = "FULL_ISOCH", .ucode = 0x400, .udesc = "HA to iMC Full Line Writes Issued -- ISOCH Full Line", }, { .uname = "PARTIAL", .ucode = 0x200, .udesc = "HA to iMC Full Line Writes Issued -- Partial Non-ISOCH", }, { .uname = "PARTIAL_ISOCH", .ucode = 0x800, .udesc = "HA to iMC Full Line Writes Issued -- ISOCH Partial", }, }; static intel_x86_umask_t bdx_unc_h_osb[]={ { .uname = "CANCELLED", .ucode = 0x1000, .udesc = "OSB Snoop Broadcast -- Cancelled", }, { .uname = "INVITOE_LOCAL", .ucode = 0x400, .udesc = "OSB Snoop Broadcast -- Local InvItoE", }, { .uname = "READS_LOCAL", .ucode = 0x200, .udesc = "OSB Snoop Broadcast -- Local Reads", }, { .uname = "READS_LOCAL_USEFUL", .ucode = 0x2000, .udesc = "OSB Snoop Broadcast -- Reads Local - Useful", }, { .uname = "REMOTE", .ucode = 0x800, .udesc = "OSB Snoop Broadcast
-- Remote", }, { .uname = "REMOTE_USEFUL", .ucode = 0x4000, .udesc = "OSB Snoop Broadcast -- Remote - Useful", }, }; static intel_x86_umask_t bdx_unc_h_osb_edr[]={ { .uname = "ALL", .ucode = 0x100, .udesc = "OSB Early Data Return -- All", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "READS_LOCAL_I", .ucode = 0x200, .udesc = "OSB Early Data Return -- Reads to Local I", }, { .uname = "READS_LOCAL_S", .ucode = 0x800, .udesc = "OSB Early Data Return -- Reads to Local S", }, { .uname = "READS_REMOTE_I", .ucode = 0x400, .udesc = "OSB Early Data Return -- Reads to Remote I", }, { .uname = "READS_REMOTE_S", .ucode = 0x1000, .udesc = "OSB Early Data Return -- Reads to Remote S", }, }; static intel_x86_umask_t bdx_unc_h_requests[]={ { .uname = "INVITOE_LOCAL", .ucode = 0x1000, .udesc = "Read and Write Requests -- Local InvItoEs", }, { .uname = "INVITOE_REMOTE", .ucode = 0x2000, .udesc = "Read and Write Requests -- Remote InvItoEs", }, { .uname = "READS", .ucode = 0x300, .udesc = "Read and Write Requests -- Reads", .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_LOCAL", .ucode = 0x100, .udesc = "Read and Write Requests -- Local Reads", }, { .uname = "READS_REMOTE", .ucode = 0x200, .udesc = "Read and Write Requests -- Remote Reads", }, { .uname = "WRITES", .ucode = 0xc00, .udesc = "Read and Write Requests -- Writes", .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITES_LOCAL", .ucode = 0x400, .udesc = "Read and Write Requests -- Local Writes", }, { .uname = "WRITES_REMOTE", .ucode = 0x800, .udesc = "Read and Write Requests -- Remote Writes", }, }; static intel_x86_umask_t bdx_unc_h_ring_ad_used[]={ { .uname = "CCW", .ucode = 0xc00, .udesc = "Counterclockwise", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CCW_EVEN", .ucode = 0x400, .udesc = "Counterclockwise and Even", }, { .uname = "CCW_ODD", .ucode = 0x800, .udesc = "Counterclockwise and Odd", }, { .uname = "CW", .ucode = 0x300, .udesc = "Clockwise", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CW_EVEN", .ucode = 
0x100, .udesc = "Clockwise and Even", }, { .uname = "CW_ODD", .ucode = 0x200, .udesc = "Clockwise and Odd", }, }; static intel_x86_umask_t bdx_unc_h_rpq_cycles_no_reg_credits[]={ { .uname = "CHN0", .ucode = 0x100, .udesc = "iMC RPQ Credits Empty - Regular -- Channel 0", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN1", .ucode = 0x200, .udesc = "iMC RPQ Credits Empty - Regular -- Channel 1", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN2", .ucode = 0x400, .udesc = "iMC RPQ Credits Empty - Regular -- Channel 2", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN3", .ucode = 0x800, .udesc = "iMC RPQ Credits Empty - Regular -- Channel 3", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_h_sbo0_credits_acquired[]={ { .uname = "AD", .ucode = 0x100, .udesc = "For AD Ring", }, { .uname = "BL", .ucode = 0x200, .udesc = "For BL Ring", }, }; static intel_x86_umask_t bdx_unc_h_snoops_rsp_after_data[]={ { .uname = "LOCAL", .ucode = 0x100, .udesc = "Data beat the Snoop Responses -- Local Requests", .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE", .ucode = 0x200, .udesc = "Data beat the Snoop Responses -- Remote Requests", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_h_snoop_cycles_ne[]={ { .uname = "ALL", .ucode = 0x300, .udesc = "Cycles with Snoops Outstanding -- All Requests", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "LOCAL", .ucode = 0x100, .udesc = "Cycles with Snoops Outstanding -- Local Requests", }, { .uname = "REMOTE", .ucode = 0x200, .udesc = "Cycles with Snoops Outstanding -- Remote Requests", }, }; static intel_x86_umask_t bdx_unc_h_snoop_occupancy[]={ { .uname = "LOCAL", .ucode = 0x100, .udesc = "Tracker Snoops Outstanding Accumulator -- Local Requests", .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE", .ucode = 0x200, .udesc = "Tracker Snoops Outstanding Accumulator -- Remote Requests", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_h_snoop_resp[]={ { .uname = "RSPCNFLCT", .ucode = 
0x4000, .udesc = "Snoop Responses Received -- RSPCNFLCT*", }, { .uname = "RSPI", .ucode = 0x100, .udesc = "Snoop Responses Received -- RspI", }, { .uname = "RSPIFWD", .ucode = 0x400, .udesc = "Snoop Responses Received -- RspIFwd", }, { .uname = "RSPS", .ucode = 0x200, .udesc = "Snoop Responses Received -- RspS", }, { .uname = "RSPSFWD", .ucode = 0x800, .udesc = "Snoop Responses Received -- RspSFwd", }, { .uname = "RSP_FWD_WB", .ucode = 0x2000, .udesc = "Snoop Responses Received -- Rsp*Fwd*WB", }, { .uname = "RSP_WB", .ucode = 0x1000, .udesc = "Snoop Responses Received -- Rsp*WB", }, }; static intel_x86_umask_t bdx_unc_h_snp_resp_recv_local[]={ { .uname = "OTHER", .ucode = 0x8000, .udesc = "Snoop Responses Received Local -- Other", }, { .uname = "RSPCNFLCT", .ucode = 0x4000, .udesc = "Snoop Responses Received Local -- RspCnflct", }, { .uname = "RSPI", .ucode = 0x100, .udesc = "Snoop Responses Received Local -- RspI", }, { .uname = "RSPIFWD", .ucode = 0x400, .udesc = "Snoop Responses Received Local -- RspIFwd", }, { .uname = "RSPS", .ucode = 0x200, .udesc = "Snoop Responses Received Local -- RspS", }, { .uname = "RSPSFWD", .ucode = 0x800, .udesc = "Snoop Responses Received Local -- RspSFwd", }, { .uname = "RSPxFWDxWB", .ucode = 0x2000, .udesc = "Snoop Responses Received Local -- Rsp*FWD*WB", }, { .uname = "RSPxWB", .ucode = 0x1000, .udesc = "Snoop Responses Received Local -- Rsp*WB", }, }; static intel_x86_umask_t bdx_unc_h_stall_no_sbo_credit[]={ { .uname = "SBO0_AD", .ucode = 0x100, .udesc = "Stall on No Sbo Credits -- For SBo0, AD Ring", }, { .uname = "SBO0_BL", .ucode = 0x400, .udesc = "Stall on No Sbo Credits -- For SBo0, BL Ring", }, { .uname = "SBO1_AD", .ucode = 0x200, .udesc = "Stall on No Sbo Credits -- For SBo1, AD Ring", }, { .uname = "SBO1_BL", .ucode = 0x800, .udesc = "Stall on No Sbo Credits -- For SBo1, BL Ring", }, }; static intel_x86_umask_t bdx_unc_h_tad_requests_g0[]={ { .uname = "REGION0", .ucode = 0x100, .udesc = "HA Requests to a TAD Region - 
Group 0 -- TAD Region 0", .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION1", .ucode = 0x200, .udesc = "HA Requests to a TAD Region - Group 0 -- TAD Region 1", .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION2", .ucode = 0x400, .udesc = "HA Requests to a TAD Region - Group 0 -- TAD Region 2", .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION3", .ucode = 0x800, .udesc = "HA Requests to a TAD Region - Group 0 -- TAD Region 3", .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION4", .ucode = 0x1000, .udesc = "HA Requests to a TAD Region - Group 0 -- TAD Region 4", .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION5", .ucode = 0x2000, .udesc = "HA Requests to a TAD Region - Group 0 -- TAD Region 5", .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION6", .ucode = 0x4000, .udesc = "HA Requests to a TAD Region - Group 0 -- TAD Region 6", .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION7", .ucode = 0x8000, .udesc = "HA Requests to a TAD Region - Group 0 -- TAD Region 7", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_h_tad_requests_g1[]={ { .uname = "REGION10", .ucode = 0x400, .udesc = "HA Requests to a TAD Region - Group 1 -- TAD Region 10", .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION11", .ucode = 0x800, .udesc = "HA Requests to a TAD Region - Group 1 -- TAD Region 11", .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION8", .ucode = 0x100, .udesc = "HA Requests to a TAD Region - Group 1 -- TAD Region 8", .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION9", .ucode = 0x200, .udesc = "HA Requests to a TAD Region - Group 1 -- TAD Region 9", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_h_tracker_cycles_full[]={ { .uname = "ALL", .ucode = 0x200, .udesc = "Tracker Cycles Full -- Cycles Completely Used", .uflags = INTEL_X86_DFL, }, { .uname = "GP", .ucode = 0x100, .udesc = "Tracker Cycles Full -- Cycles GP Completely Used", }, }; static intel_x86_umask_t bdx_unc_h_tracker_cycles_ne[]={ { .uname = "ALL", .ucode = 0x300, .udesc = "Tracker 
Cycles Not Empty -- All Requests", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "LOCAL", .ucode = 0x100, .udesc = "Tracker Cycles Not Empty -- Local Requests", }, { .uname = "REMOTE", .ucode = 0x200, .udesc = "Tracker Cycles Not Empty -- Remote Requests", }, }; static intel_x86_umask_t bdx_unc_h_tracker_occupancy[]={ { .uname = "INVITOE_LOCAL", .ucode = 0x4000, .udesc = "Tracker Occupancy Accumulator -- Local InvItoE Requests", }, { .uname = "INVITOE_REMOTE", .ucode = 0x8000, .udesc = "Tracker Occupancy Accumulator -- Remote InvItoE Requests", }, { .uname = "READS_LOCAL", .ucode = 0x400, .udesc = "Tracker Occupancy Accumulator -- Local Read Requests", }, { .uname = "READS_REMOTE", .ucode = 0x800, .udesc = "Tracker Occupancy Accumulator -- Remote Read Requests", }, { .uname = "WRITES_LOCAL", .ucode = 0x1000, .udesc = "Tracker Occupancy Accumulator -- Local Write Requests", }, { .uname = "WRITES_REMOTE", .ucode = 0x2000, .udesc = "Tracker Occupancy Accumulator -- Remote Write Requests", }, }; static intel_x86_umask_t bdx_unc_h_tracker_pending_occupancy[]={ { .uname = "LOCAL", .ucode = 0x100, .udesc = "Data Pending Occupancy Accumulator -- Local Requests", .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE", .ucode = 0x200, .udesc = "Data Pending Occupancy Accumulator -- Remote Requests", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_h_txr_ad_cycles_full[]={ { .uname = "ALL", .ucode = 0x300, .udesc = "All", .uflags = INTEL_X86_DFL, }, { .uname = "SCHED0", .ucode = 0x100, .udesc = "Scheduler 0", }, { .uname = "SCHED1", .ucode = 0x200, .udesc = "Scheduler 1", }, }; static intel_x86_umask_t bdx_unc_h_txr_bl[]={ { .uname = "DRS_CACHE", .ucode = 0x100, .udesc = "Outbound DRS Ring Transactions to Cache -- Data to Cache", }, { .uname = "DRS_CORE", .ucode = 0x200, .udesc = "Outbound DRS Ring Transactions to Cache -- Data to Core", }, { .uname = "DRS_QPI", .ucode = 0x400, .udesc = "Outbound DRS Ring Transactions to Cache -- Data to QPI", }, }; static
intel_x86_umask_t bdx_unc_h_txr_starved[]={ { .uname = "AK", .ucode = 0x100, .udesc = "Injection Starvation -- For AK Ring", }, { .uname = "BL", .ucode = 0x200, .udesc = "Injection Starvation -- For BL Ring", }, }; static intel_x86_umask_t bdx_unc_h_wpq_cycles_no_reg_credits[]={ { .uname = "CHN0", .ucode = 0x100, .udesc = "HA iMC CHN0 WPQ Credits Empty - Regular -- Channel 0", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN1", .ucode = 0x200, .udesc = "HA iMC CHN0 WPQ Credits Empty - Regular -- Channel 1", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN2", .ucode = 0x400, .udesc = "HA iMC CHN0 WPQ Credits Empty - Regular -- Channel 2", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN3", .ucode = 0x800, .udesc = "HA iMC CHN0 WPQ Credits Empty - Regular -- Channel 3", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_entry_t intel_bdx_unc_h_pe[]={ /* ADDR_OPC_MATCH not supported (linux kernel has no support for HA OPC yet) */ { .name = "UNC_H_BT_CYCLES_NE", .code = 0x42, .desc = "Cycles the Backup Tracker (BT) is not empty. The BT is the actual HOM tracker in IVT.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_H_BT_OCCUPANCY", .code = 0x43, .desc = "Accumulates the occupancy of the HA BT pool in every cycle. This can be used with the 'not empty' stat to calculate the average queue occupancy or the 'allocations' stat to calculate average queue latency. HA BTs are allocated as soon as a request enters the HA and are released after the snoop response and data return and the response is returned to the ring.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_H_BYPASS_IMC", .code = 0x14, .desc = "Counts the number of times an HA bypass was attempted. This is a latency optimization for situations when there is light loading on the memory subsystem. 
This can be filtered by when the bypass was taken and when it was not.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_bypass_imc, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_bypass_imc), }, { .name = "UNC_H_CONFLICT_CYCLES", .code = 0xb, .desc = "TBD", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_H_CLOCKTICKS", .code = 0x0, .desc = "Counts the number of uclks in the HA. This will be slightly different than the count in the Ubox because of enable/freeze delays. The HA is on the other side of the die from the fixed Ubox uclk counter, so the drift could be somewhat larger than in units that are closer like the QPI Agent.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_H_DIRECT2CORE_COUNT", .code = 0x11, .desc = "Number of Direct2Core messages sent", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_H_DIRECT2CORE_CYCLES_DISABLED", .code = 0x12, .desc = "Number of cycles in which Direct2Core was disabled", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_H_DIRECT2CORE_TXN_OVERRIDE", .code = 0x13, .desc = "Number of Reads where Direct2Core was overridden", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_H_DIRECTORY_LAT_OPT", .code = 0x41, .desc = "Directory Latency Optimization Data Return Path Taken. When directory mode is enabled and the directory returned for a read is Dir=I, then data can be returned using a faster path if certain conditions are met (credits, free pipeline, etc).", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_H_DIRECTORY_LOOKUP", .code = 0xc, .desc = "Counts the number of transactions that looked up the directory. 
Can be filtered by requests that had to snoop and those that did not have to.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_directory_lookup, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_directory_lookup), }, { .name = "UNC_H_DIRECTORY_UPDATE", .code = 0xd, .desc = "Counts the number of directory updates that were required. These result in writes to the memory controller. This can be filtered by directory sets and directory clears.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_directory_update, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_directory_update), }, { .name = "UNC_H_HITME_HIT", .code = 0x71, .desc = "TBD", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_hitme_hit, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_hitme_hit), }, { .name = "UNC_H_HITME_HIT_PV_BITS_SET", .code = 0x72, .desc = "TBD", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_hitme_hit_pv_bits_set, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_hitme_hit_pv_bits_set), }, { .name = "UNC_H_HITME_LOOKUP", .code = 0x70, .desc = "TBD", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_hitme_lookup, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_hitme_lookup), }, { .name = "UNC_H_IGR_NO_CREDIT_CYCLES", .code = 0x22, .desc = "Counts the number of cycles when the HA does not have credits to send messages to the QPI Agent. This can be filtered by the different credit pools and the different links.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_igr_no_credit_cycles, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_igr_no_credit_cycles), }, { .name = "UNC_H_IMC_READS", .code = 0x17, .desc = "Count of the number of reads issued to any of the memory controller channels. 
This can be filtered by the priority of the reads.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_imc_reads, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_imc_reads), }, { .name = "UNC_H_IMC_RETRY", .code = 0x1e, .desc = "TBD", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_H_IMC_WRITES", .code = 0x1a, .desc = "Counts the total number of full line writes issued from the HA into the memory controller. This counts for all four channels. It can be filtered by full/partial and ISOCH/non-ISOCH.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_imc_writes, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_imc_writes), }, { .name = "UNC_H_OSB", .code = 0x53, .desc = "Count of OSB snoop broadcasts. Counts by 1 per request causing OSB snoops to be broadcast. Does not count all the snoops generated by OSB.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_osb, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_osb), }, { .name = "UNC_H_OSB_EDR", .code = 0x54, .desc = "Counts the number of transactions that broadcast snoop due to OSB, but found clean data in memory and was able to do early data return", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_osb_edr, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_osb_edr), }, { .name = "UNC_H_REQUESTS", .code = 0x1, .desc = "Counts the total number of read requests made into the Home Agent. Reads include all read opcodes (including RFO). Writes include all writes (streaming, evictions, HitM, etc).", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_requests, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_requests), }, { .name = "UNC_H_RING_AD_USED", .code = 0x3e, .desc = "Counts the number of cycles that the AD ring is being used at this ring stop. 
This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_ring_ad_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_ring_ad_used), }, { .name = "UNC_H_RING_AK_USED", .code = 0x3f, .desc = "Counts the number of cycles that the AK ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_ring_ad_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_ring_ad_used), }, { .name = "UNC_H_RING_BL_USED", .code = 0x40, .desc = "Counts the number of cycles that the BL ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_ring_ad_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_ring_ad_used), }, { .name = "UNC_H_RPQ_CYCLES_NO_REG_CREDITS", .code = 0x15, .desc = "Counts the number of cycles when there are no regular credits available for posting reads from the HA into the iMC. In order to send reads into the memory controller, the HA must first acquire a credit for the iMC's RPQ (read pending queue). This queue is broken into regular credits/buffers that are used by general reads, and special requests such as ISOCH reads. This count only tracks the regular credits. Common high bandwidth workloads should be able to make use of all of the regular buffers, but it will be difficult (and uncommon) to make use of both the regular and special buffers at the same time. One can filter based on the memory controller channel. 
One or more channels can be tracked at a given time.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_rpq_cycles_no_reg_credits, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_rpq_cycles_no_reg_credits), }, { .name = "UNC_H_SBO0_CREDITS_ACQUIRED", .code = 0x68, .desc = "Number of Sbo 0 credits acquired in a given cycle, per ring.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_sbo0_credits_acquired, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_sbo0_credits_acquired), }, { .name = "UNC_H_SBO0_CREDIT_OCCUPANCY", .code = 0x6a, .desc = "Number of Sbo 0 credits in use in a given cycle, per ring.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_sbo0_credits_acquired, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_sbo0_credits_acquired), }, { .name = "UNC_H_SBO1_CREDITS_ACQUIRED", .code = 0x69, .desc = "Number of Sbo 1 credits acquired in a given cycle, per ring.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_sbo0_credits_acquired, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_sbo0_credits_acquired), }, { .name = "UNC_H_SBO1_CREDIT_OCCUPANCY", .code = 0x6b, .desc = "Number of Sbo 1 credits in use in a given cycle, per ring.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_sbo0_credits_acquired, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_sbo0_credits_acquired), }, { .name = "UNC_H_SNOOPS_RSP_AFTER_DATA", .code = 0xa, .desc = "Counts the number of reads when the snoop was on the critical path to the data return.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_snoops_rsp_after_data, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_snoops_rsp_after_data), }, { .name = "UNC_H_SNOOP_CYCLES_NE", .code = 0x8, .desc = "Counts cycles when one or more snoops are outstanding.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_snoop_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_snoop_cycles_ne), }, { .name = "UNC_H_SNOOP_OCCUPANCY", 
.code = 0x9, .desc = "Accumulates the occupancy of the local HA tracker pool entries that have snoops pending in every cycle. This can be used in conjunction with the not empty stat to calculate average queue occupancy or the allocations stat in order to calculate average queue latency. HA trackers are allocated as soon as a request enters the HA if an HT (HomeTracker) entry is available and this occupancy is decremented when all the snoop responses have returned.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_snoop_occupancy, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_snoop_occupancy), }, { .name = "UNC_H_SNOOP_RESP", .code = 0x21, .desc = "Counts the total number of RspI snoop responses received. Whenever snoops are issued, one or more snoop responses will be returned depending on the topology of the system. In systems larger than 2S, when multiple snoops are returned this will count all the snoops that are received. For example, if 3 snoops were issued and returned RspI, RspS, and RspSFwd; then each of these sub-events would increment by 1.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_snoop_resp, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_snoop_resp), }, { .name = "UNC_H_SNP_RESP_RECV_LOCAL", .code = 0x60, .desc = "Number of snoop responses received for a Local request", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_snp_resp_recv_local, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_snp_resp_recv_local), }, { .name = "UNC_H_STALL_NO_SBO_CREDIT", .code = 0x6c, .desc = "Number of cycles Egress is stalled waiting for an Sbo credit to become available. Per Sbo, per Ring.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_stall_no_sbo_credit, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_stall_no_sbo_credit), }, { .name = "UNC_H_TAD_REQUESTS_G0", .code = 0x1b, .desc = "Counts the number of HA requests to a given TAD region. 
There are up to 11 TAD (target address decode) regions in each home agent. All requests destined for the memory controller must first be decoded to determine which TAD region they are in. This event is filtered based on the TAD region ID, and covers regions 0 to 7. This event is useful for understanding how applications are using the memory that is spread across the different memory regions. It is particularly useful for Monroe systems that use the TAD to enable individual channels to enter self-refresh to save power.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_tad_requests_g0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_tad_requests_g0), }, { .name = "UNC_H_TAD_REQUESTS_G1", .code = 0x1c, .desc = "Counts the number of HA requests to a given TAD region. There are up to 11 TAD (target address decode) regions in each home agent. All requests destined for the memory controller must first be decoded to determine which TAD region they are in. This event is filtered based on the TAD region ID, and covers regions 8 to 10. This event is useful for understanding how applications are using the memory that is spread across the different memory regions. It is particularly useful for Monroe systems that use the TAD to enable individual channels to enter self-refresh to save power.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_tad_requests_g1, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_tad_requests_g1), }, { .name = "UNC_H_TRACKER_CYCLES_FULL", .code = 0x2, .desc = "Counts the number of cycles when the local HA tracker pool is completely used. This can be used with edge detect to identify the number of situations when the pool became fully utilized. This should not be confused with RTID credit usage -- which must be tracked inside each cbo individually -- but represents the actual tracker buffer structure. In other words, the system could be starved for RTIDs but not fill up the HA trackers. 
HA trackers are allocated as soon as a request enters the HA and are released after the snoop response and data return (or post in the case of a write) and the response is returned on the ring.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_tracker_cycles_full, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_tracker_cycles_full), }, { .name = "UNC_H_TRACKER_CYCLES_NE", .code = 0x3, .desc = "Counts the number of cycles when the local HA tracker pool is not empty. This can be used with edge detect to identify the number of situations when the pool became empty. This should not be confused with RTID credit usage -- which must be tracked inside each cbo individually -- but represents the actual tracker buffer structure. In other words, this buffer could be completely empty, but there may still be credits in use by the CBos. This stat can be used in conjunction with the occupancy accumulation stat in order to calculate average queue occupancy. HA trackers are allocated as soon as a request enters the HA if an HT (Home Tracker) entry is available and are released after the snoop response and data return (or post in the case of a write) and the response is returned on the ring.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_tracker_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_tracker_cycles_ne), }, { .name = "UNC_H_TRACKER_OCCUPANCY", .code = 0x4, .desc = "Accumulates the occupancy of the local HA tracker pool in every cycle. This can be used in conjunction with the not empty stat to calculate average queue occupancy or the allocations stat in order to calculate average queue latency. 
HA trackers are allocated as soon as a request enters the HA if an HT (Home Tracker) entry is available and are released after the snoop response and data return (or post in the case of a write) and the response is returned on the ring.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_tracker_occupancy, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_tracker_occupancy), }, { .name = "UNC_H_TRACKER_PENDING_OCCUPANCY", .code = 0x5, .desc = "Accumulates the number of transactions that have data from the memory controller until they get scheduled to the Egress. This can be used to calculate the queuing latency for two things. (1) If the system is waiting for snoops, this will increase. (2) If the system cannot schedule to the Egress because of either (a) Egress Credits or (b) QPI BL IGR credits for remote requests.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_tracker_pending_occupancy, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_tracker_pending_occupancy), }, { .name = "UNC_H_TXR_AD_CYCLES_FULL", .code = 0x2a, .desc = "AD Egress Full", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_txr_ad_cycles_full, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_txr_ad_cycles_full), }, { .name = "UNC_H_TXR_AK_CYCLES_FULL", .code = 0x32, .desc = "AK Egress Full", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_txr_ad_cycles_full, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_txr_ad_cycles_full), /* shared */ }, { .name = "UNC_H_TXR_BL", .code = 0x10, .desc = "Counts the number of DRS messages sent out on the BL ring. 
This can be filtered by the destination.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_txr_bl, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_txr_bl), }, { .name = "UNC_H_TXR_BL_CYCLES_FULL", .code = 0x36, .desc = "BL Egress Full", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_txr_ad_cycles_full, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_txr_ad_cycles_full), /* shared */ }, { .name = "UNC_H_TXR_STARVED", .code = 0x6d, .desc = "Counts injection starvation. This starvation is triggered when the Egress cannot send a transaction onto the ring for a long period of time.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_txr_starved, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_txr_starved), }, { .name = "UNC_H_WPQ_CYCLES_NO_REG_CREDITS", .code = 0x18, .desc = "Counts the number of cycles when there are no regular credits available for posting writes from the HA into the iMC. In order to send writes into the memory controller, the HA must first acquire a credit for the iMC's WPQ (write pending queue). This queue is broken into regular credits/buffers that are used by general writes, and special requests such as ISOCH writes. This count only tracks the regular credits. Common high bandwidth workloads should be able to make use of all of the regular buffers, but it will be difficult (and uncommon) to make use of both the regular and special buffers at the same time. One can filter based on the memory controller channel. One or more channels can be tracked at a given time.", .modmsk = BDX_UNC_HA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_h_wpq_cycles_no_reg_credits, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_h_wpq_cycles_no_reg_credits), }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_bdx_unc_imc_events.h /* * Copyright (c) 2017 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: bdx_unc_imc */ static intel_x86_umask_t bdx_unc_m_act_count[]={ { .uname = "BYP", .ucode = 0x800, .udesc = "DRAM Activate Count -- Activate due to Bypass", }, { .uname = "RD", .ucode = 0x100, .udesc = "DRAM Activate Count -- Activate due to Read", }, { .uname = "WR", .ucode = 0x200, .udesc = "DRAM Activate Count -- Activate due to Write", }, }; static intel_x86_umask_t bdx_unc_m_byp_cmds[]={ { .uname = "ACT", .ucode = 0x100, .udesc = "ACT command issued by 2 cycle bypass", }, { .uname = "CAS", .ucode = 0x200, .udesc = "CAS command issued by 2 cycle bypass", }, { .uname = "PRE", .ucode = 0x400, .udesc = "PRE command issued by 2 cycle bypass", }, }; static intel_x86_umask_t bdx_unc_m_cas_count[]={ { .uname = "ALL", .ucode = 0xf00, .udesc = "DRAM RD_CAS and WR_CAS Commands. All DRAM WR_CAS (w/ and w/out auto-pre)", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "RD", .ucode = 0x300, .udesc = "DRAM RD_CAS and WR_CAS Commands. All DRAM Reads (RD_CAS + Underfills)", .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_REG", .ucode = 0x100, .udesc = "DRAM RD_CAS and WR_CAS Commands. All DRAM RD_CAS (w/ and w/out auto-pre)", }, { .uname = "RD_RMM", .ucode = 0x2000, .udesc = "DRAM RD_CAS and WR_CAS Commands. Read CAS issued in RMM", }, { .uname = "RD_UNDERFILL", .ucode = 0x200, .udesc = "DRAM RD_CAS and WR_CAS Commands. Underfill Read Issued", }, { .uname = "RD_WMM", .ucode = 0x1000, .udesc = "DRAM RD_CAS and WR_CAS Commands. Read CAS issued in WMM", }, { .uname = "WR", .ucode = 0xc00, .udesc = "DRAM RD_CAS and WR_CAS Commands. All DRAM WR_CAS (both Modes)", .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_RMM", .ucode = 0x800, .udesc = "DRAM RD_CAS and WR_CAS Commands. DRAM WR_CAS (w/ and w/out auto-pre) in Read Major Mode", }, { .uname = "WR_WMM", .ucode = 0x400, .udesc = "DRAM RD_CAS and WR_CAS Commands. 
DRAM WR_CAS (w/ and w/out auto-pre) in Write Major Mode", }, }; static intel_x86_umask_t bdx_unc_m_dram_refresh[]={ { .uname = "HIGH", .ucode = 0x400, .udesc = "Number of DRAM Refreshes Issued", }, { .uname = "PANIC", .ucode = 0x200, .udesc = "Number of DRAM Refreshes Issued", }, }; static intel_x86_umask_t bdx_unc_m_major_modes[]={ { .uname = "ISOCH", .ucode = 0x800, .udesc = "Cycles in a Major Mode -- Isoch Major Mode", }, { .uname = "PARTIAL", .ucode = 0x400, .udesc = "Cycles in a Major Mode -- Partial Major Mode", }, { .uname = "READ", .ucode = 0x100, .udesc = "Cycles in a Major Mode -- Read Major Mode", }, { .uname = "WRITE", .ucode = 0x200, .udesc = "Cycles in a Major Mode -- Write Major Mode", }, }; static intel_x86_umask_t bdx_unc_m_power_cke_cycles[]={ { .uname = "RANK0", .ucode = 0x100, .udesc = "Rank0 -- DIMM ID", .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK1", .ucode = 0x200, .udesc = "Rank1 -- DIMM ID", .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK2", .ucode = 0x400, .udesc = "Rank2 -- DIMM ID", .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK3", .ucode = 0x800, .udesc = "Rank3 -- DIMM ID", .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK4", .ucode = 0x1000, .udesc = "Rank4 -- DIMM ID", .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK5", .ucode = 0x2000, .udesc = "Rank5 -- DIMM ID", .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK6", .ucode = 0x4000, .udesc = "Rank6 -- DIMM ID", .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK7", .ucode = 0x8000, .udesc = "Rank7 -- DIMM ID", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_m_preemption[]={ { .uname = "RD_PREEMPT_RD", .ucode = 0x100, .udesc = "Read Preemption Count -- Read over Read Preemption", }, { .uname = "RD_PREEMPT_WR", .ucode = 0x200, .udesc = "Read Preemption Count -- Read over Write Preemption", }, }; static intel_x86_umask_t bdx_unc_m_pre_count[]={ { .uname = "BYP", .ucode = 0x1000, .udesc = "DRAM Precharge commands. 
-- Precharge due to bypass", }, { .uname = "PAGE_CLOSE", .ucode = 0x200, .udesc = "DRAM Precharge commands. -- Precharge due to timer expiration", }, { .uname = "PAGE_MISS", .ucode = 0x100, .udesc = "DRAM Precharge commands. -- Precharges due to page miss", }, { .uname = "RD", .ucode = 0x400, .udesc = "DRAM Precharge commands. -- Precharge due to read", }, { .uname = "WR", .ucode = 0x800, .udesc = "DRAM Precharge commands. -- Precharge due to write", }, }; static intel_x86_umask_t bdx_unc_m_rd_cas_prio[]={ { .uname = "HIGH", .ucode = 0x400, .udesc = "Read CAS issued with HIGH priority", }, { .uname = "LOW", .ucode = 0x100, .udesc = "Read CAS issued with LOW priority", }, { .uname = "MED", .ucode = 0x200, .udesc = "Read CAS issued with MEDIUM priority", }, { .uname = "PANIC", .ucode = 0x800, .udesc = "Read CAS issued with PANIC NON ISOCH priority (starved)", }, }; static intel_x86_umask_t bdx_unc_m_rd_cas_rank0[]={ { .uname = "ALLBANKS", .ucode = 0x1000, .udesc = "Access to Rank 0 -- All Banks", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "BANK0", .ucode = 0x0, .udesc = "Access to Rank 0 -- Bank 0", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK1", .ucode = 0x100, .udesc = "Access to Rank 0 -- Bank 1", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK10", .ucode = 0xa00, .udesc = "Access to Rank 0 -- Bank 10", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK11", .ucode = 0xb00, .udesc = "Access to Rank 0 -- Bank 11", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK12", .ucode = 0xc00, .udesc = "Access to Rank 0 -- Bank 12", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK13", .ucode = 0xd00, .udesc = "Access to Rank 0 -- Bank 13", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK14", .ucode = 0xe00, .udesc = "Access to Rank 0 -- Bank 14", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK15", .ucode = 0xf00, .udesc = "Access to Rank 0 -- Bank 15", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK2", .ucode = 0x200, .udesc = "Access to Rank 0 -- Bank 2", .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "BANK3", .ucode = 0x300, .udesc = "Access to Rank 0 -- Bank 3", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK4", .ucode = 0x400, .udesc = "Access to Rank 0 -- Bank 4", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK5", .ucode = 0x500, .udesc = "Access to Rank 0 -- Bank 5", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK6", .ucode = 0x600, .udesc = "Access to Rank 0 -- Bank 6", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK7", .ucode = 0x700, .udesc = "Access to Rank 0 -- Bank 7", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK8", .ucode = 0x800, .udesc = "Access to Rank 0 -- Bank 8", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK9", .ucode = 0x900, .udesc = "Access to Rank 0 -- Bank 9", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANKG0", .ucode = 0x1100, .udesc = "Access to Rank 0 -- Bank Group 0 (Banks 0-3)", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANKG1", .ucode = 0x1200, .udesc = "Access to Rank 0 -- Bank Group 1 (Banks 4-7)", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANKG2", .ucode = 0x1300, .udesc = "Access to Rank 0 -- Bank Group 2 (Banks 8-11)", .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANKG3", .ucode = 0x1400, .udesc = "Access to Rank 0 -- Bank Group 3 (Banks 12-15)", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_m_rd_cas_rank2[]={ { .uname = "BANK0", .ucode = 0x0, .udesc = "RD_CAS Access to Rank 2 -- Bank 0", .uflags = INTEL_X86_DFL, }, }; static intel_x86_umask_t bdx_unc_m_vmse_wr_push[]={ { .uname = "RMM", .ucode = 0x200, .udesc = "VMSE WR PUSH issued -- VMSE write PUSH issued in RMM", }, { .uname = "WMM", .ucode = 0x100, .udesc = "VMSE WR PUSH issued -- VMSE write PUSH issued in WMM", }, }; static intel_x86_umask_t bdx_unc_m_wmm_to_rmm[]={ { .uname = "LOW_THRESH", .ucode = 0x100, .udesc = "Transition from WMM to RMM because of low threshold -- Transition from WMM to RMM because of starve counter", }, { .uname = "STARVE", .ucode = 0x200, .udesc = "Transition from WMM to RMM because of low 
threshold -- Transition from WMM to RMM because of starve counter", }, { .uname = "VMSE_RETRY", .ucode = 0x400, .udesc = "Transition from WMM to RMM because of low threshold -- Transition from WMM to RMM because of VMSE retry", }, }; static intel_x86_entry_t intel_bdx_unc_m_pe[]={ { .name = "UNC_M_CLOCKTICKS", .desc = "IMC Uncore clockticks (fixed counter)", .modmsk = 0x0, .cntmsk = 0x100000000ull, .code = 0xff, /* perf pseudo encoding for fixed counter */ .flags = INTEL_X86_FIXED, }, { .name = "UNC_M_ACT_COUNT", .code = 0x1, .desc = "Counts the number of DRAM Activate commands sent on this channel. Activate commands are issued to open up a page on the DRAM devices so that it can be read or written to with a CAS. One can calculate the number of Page Misses by subtracting the number of Page Miss precharges from the number of Activates.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_act_count, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_act_count), }, { .name = "UNC_M_BYP_CMDS", .code = 0xa1, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_byp_cmds, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_byp_cmds), }, { .name = "UNC_M_CAS_COUNT", .code = 0x4, .desc = "DRAM RD_CAS and WR_CAS Commands", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_cas_count, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_cas_count), }, { .name = "UNC_M_DCLOCKTICKS", .code = 0x0, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_DRAM_PRE_ALL", .code = 0x6, .desc = "Counts the number of times that the precharge all command was sent.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_DRAM_REFRESH", .code = 0x5, .desc = "Counts the number of refreshes issued.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_dram_refresh, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_dram_refresh), }, { .name = "UNC_M_ECC_CORRECTABLE_ERRORS", .code = 0x9, .desc = "Counts the number of ECC errors detected and corrected by the iMC on this channel. 
This counter is only useful with ECC DRAM devices. This count will increment one time for each correction regardless of the number of bits corrected. The iMC can correct up to 4 bit errors in independent channel mode and 8 bit errors in lockstep mode.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_MAJOR_MODES", .code = 0x7, .desc = "Counts the total number of cycles spent in a major mode (selected by a filter) on the given channel. Major modes are channel-wide, and not a per-rank (or dimm or bank) mode.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_major_modes, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_major_modes), }, { .name = "UNC_M_POWER_CHANNEL_DLLOFF", .code = 0x84, .desc = "Number of cycles when all the ranks in the channel are in CKE Slow (DLLOFF) mode.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_POWER_CHANNEL_PPD", .code = 0x85, .desc = "Number of cycles when all the ranks in the channel are in PPD mode. If IBT=off is enabled, then this can be used to count those cycles. If it is not enabled, then this can count the number of cycles when that could have been taken advantage of.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_POWER_CKE_CYCLES", .code = 0x83, .desc = "Number of cycles spent in CKE ON mode. The filter allows you to select a rank to monitor. If multiple ranks are in CKE ON mode at one time, the counter will ONLY increment by one rather than doing accumulation. Multiple counters will need to be used to track multiple ranks simultaneously. There is no distinction between the different CKE modes (APD, PPDS, PPDF). This can be determined based on the system programming. These events should commonly be used with Invert to get the number of cycles in power saving mode. Edge Detect is also useful here.
Make sure that you do NOT use Invert with Edge Detect (this just confuses the system and is not necessary).", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_power_cke_cycles, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_power_cke_cycles), }, { .name = "UNC_M_POWER_CRITICAL_THROTTLE_CYCLES", .code = 0x86, .desc = "Counts the number of cycles when the iMC is in critical thermal throttling. When this happens, all traffic is blocked. This should be rare unless something bad is going on in the platform. There is no filtering by rank for this event.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_POWER_PCU_THROTTLING", .code = 0x42, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_POWER_SELF_REFRESH", .code = 0x43, .desc = "Counts the number of cycles when the iMC is in self-refresh and the iMC still has a clock. This happens in some package C-states. For example, the PCU may ask the iMC to enter self-refresh even though some of the cores are still processing. One use of this is for Monroe technology. Self-refresh is required during package C3 and C6, but there is no clock in the iMC at this time, so it is not possible to count these cases.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_POWER_THROTTLE_CYCLES", .code = 0x41, .desc = "Counts the number of cycles while the iMC is being throttled by either thermal constraints or by the PCU throttling. It is not possible to distinguish between the two. This can be filtered by rank. If multiple ranks are selected and are being throttled at the same time, the counter will only increment by 1.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_power_cke_cycles, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_power_cke_cycles), }, { .name = "UNC_M_PREEMPTION", .code = 0x8, .desc = "Counts the number of times a read in the iMC preempts another read or write. 
Generally reads to an open page are issued ahead of requests to closed pages. This improves the page hit rate of the system. However, high priority requests can cause pages of active requests to be closed in order to get them out. This will reduce the latency of the high-priority request at the expense of lower bandwidth and increased overall average latency.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_preemption, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_preemption), }, { .name = "UNC_M_PRE_COUNT", .code = 0x2, .desc = "Counts the number of DRAM Precharge commands sent on this channel.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_pre_count, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_pre_count), }, { .name = "UNC_M_RD_CAS_PRIO", .code = 0xa0, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_rd_cas_prio, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_prio), }, { .name = "UNC_M_RD_CAS_RANK0", .code = 0xb0, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_rd_cas_rank0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), }, { .name = "UNC_M_RD_CAS_RANK1", .code = 0xb1, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_rd_cas_rank0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ }, { .name = "UNC_M_RD_CAS_RANK2", .code = 0xb2, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_rd_cas_rank2, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank2), }, { .name = "UNC_M_RD_CAS_RANK4", .code = 0xb4, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_rd_cas_rank0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ }, { .name = "UNC_M_RD_CAS_RANK5", .code = 0xb5, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_rd_cas_rank0, .numasks= 
LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ }, { .name = "UNC_M_RD_CAS_RANK6", .code = 0xb6, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_rd_cas_rank0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ }, { .name = "UNC_M_RD_CAS_RANK7", .code = 0xb7, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_rd_cas_rank0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ }, { .name = "UNC_M_RPQ_CYCLES_NE", .code = 0x11, .desc = "Counts the number of cycles that the Read Pending Queue is not empty. This can then be used to calculate the average occupancy (in conjunction with the Read Pending Queue Occupancy count). The RPQ is used to schedule reads out to the memory controller and to track the requests. Requests allocate into the RPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the HA to the iMC. They deallocate after the CAS command has been issued to memory. This filter is to be used in conjunction with the occupancy filter so that one can correctly track the average occupancies for schedulable entries and scheduled requests.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_RPQ_INSERTS", .code = 0x10, .desc = "Counts the number of allocations into the Read Pending Queue. This queue is used to schedule reads out to the memory controller and to track the requests. Requests allocate into the RPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the HA to the iMC. They deallocate after the CAS command has been issued to memory. 
This includes both ISOCH and non-ISOCH requests.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_VMSE_MXB_WR_OCCUPANCY", .code = 0x91, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_VMSE_WR_PUSH", .code = 0x90, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_vmse_wr_push, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_vmse_wr_push), }, { .name = "UNC_M_WMM_TO_RMM", .code = 0xc0, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_wmm_to_rmm, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_wmm_to_rmm), }, { .name = "UNC_M_WPQ_CYCLES_FULL", .code = 0x22, .desc = "Counts the number of cycles when the Write Pending Queue is full. When the WPQ is full, the HA will not be able to issue any additional read requests into the iMC. This count should be similar to the count in the HA which tracks the number of cycles that the HA has no WPQ credits, just somewhat smaller to account for the credit return overhead.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_WPQ_CYCLES_NE", .code = 0x21, .desc = "Counts the number of cycles that the Write Pending Queue is not empty. This can then be used to calculate the average queue occupancy (in conjunction with the WPQ Occupancy Accumulation count). The WPQ is used to schedule writes out to the memory controller and to track the writes. Requests allocate into the WPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the HA to the iMC. They deallocate after being issued to DRAM. Write requests themselves are able to complete (from the perspective of the rest of the system) as soon as they have posted to the iMC. This is not to be confused with actually performing the write to DRAM.
Therefore, the average latency for this queue is actually not useful for deconstructing intermediate write latencies.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_WPQ_READ_HIT", .code = 0x23, .desc = "Counts the number of times a request hits in the WPQ (write-pending queue). The iMC allows writes and reads to pass up other writes to different addresses. Before a read or a write is issued, it will first CAM the WPQ to see if there is a write pending to that address. When reads hit, they are able to directly pull their data from the WPQ instead of going to memory. Writes that hit will overwrite the existing data. Partial writes that hit will not need to do underfill reads and will simply update their relevant sections.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_WPQ_WRITE_HIT", .code = 0x24, .desc = "Counts the number of times a request hits in the WPQ (write-pending queue). The iMC allows writes and reads to pass up other writes to different addresses. Before a read or a write is issued, it will first CAM the WPQ to see if there is a write pending to that address. When reads hit, they are able to directly pull their data from the WPQ instead of going to memory. Writes that hit will overwrite the existing data.
Partial writes that hit will not need to do underfill reads and will simply update their relevant sections.", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_WRONG_MM", .code = 0xc1, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_WR_CAS_RANK0", .code = 0xb8, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_rd_cas_rank0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), }, { .name = "UNC_M_WR_CAS_RANK1", .code = 0xb9, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_rd_cas_rank0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ }, { .name = "UNC_M_WR_CAS_RANK4", .code = 0xbc, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_rd_cas_rank0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ }, { .name = "UNC_M_WR_CAS_RANK5", .code = 0xbd, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_rd_cas_rank0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ }, { .name = "UNC_M_WR_CAS_RANK6", .code = 0xbe, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_rd_cas_rank0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ }, { .name = "UNC_M_WR_CAS_RANK7", .code = 0xbf, .desc = "TBD", .modmsk = BDX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_m_rd_cas_rank0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_m_rd_cas_rank0), /* shared */ }, };

/* papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_bdx_unc_irp_events.h */

/* * Copyright (c) 2017 Google Inc.
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: bdx_unc_irp */ static intel_x86_umask_t bdx_unc_i_cache_total_occupancy[]={ { .uname = "ANY", .ucode = 0x100, .udesc = "Total Write Cache Occupancy -- Any Source", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "SOURCE", .ucode = 0x200, .udesc = "Total Write Cache Occupancy -- Select Source", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_i_coherent_ops[]={ { .uname = "CLFLUSH", .ucode = 0x8000, .udesc = "Coherent Ops -- CLFlush", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CRD", .ucode = 0x200, .udesc = "Coherent Ops -- CRd", .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRD", .ucode = 0x400, .udesc = "Coherent Ops -- DRd", .uflags = INTEL_X86_NCOMBO, }, { .uname = "PCIDCAHINT", .ucode = 0x2000, .udesc = "Coherent Ops -- PCIDCAHint", .uflags = INTEL_X86_NCOMBO, }, { .uname = "PCIRDCUR", .ucode = 0x100, .udesc = "Coherent Ops -- PCIRdCur", .uflags = INTEL_X86_NCOMBO, }, { .uname = "PCITOM", .ucode = 0x1000, .udesc = "Coherent Ops -- PCIItoM", .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO", .ucode = 0x800, .udesc = "Coherent Ops -- RFO", .uflags = INTEL_X86_NCOMBO, }, { .uname = "WBMTOI", .ucode = 0x4000, .udesc = "Coherent Ops -- WbMtoI", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_i_misc0[]={ { .uname = "2ND_ATOMIC_INSERT", .ucode = 0x1000, .udesc = "Misc Events - Set 0 -- Cache Inserts of Atomic Transactions as Secondary", .uflags = INTEL_X86_NCOMBO, }, { .uname = "2ND_RD_INSERT", .ucode = 0x400, .udesc = "Misc Events - Set 0 -- Cache Inserts of Read Transactions as Secondary", .uflags = INTEL_X86_NCOMBO, }, { .uname = "2ND_WR_INSERT", .ucode = 0x800, .udesc = "Misc Events - Set 0 -- Cache Inserts of Write Transactions as Secondary", .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAST_REJ", .ucode = 0x200, .udesc = "Misc Events - Set 0 -- Fastpath Rejects", .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAST_REQ", .ucode = 0x100, .udesc = "Misc Events - Set 0 -- Fastpath Requests", .uflags = INTEL_X86_NCOMBO,
}, { .uname = "FAST_XFER", .ucode = 0x2000, .udesc = "Misc Events - Set 0 -- Fastpath Transfers From Primary to Secondary", .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_ACK_HINT", .ucode = 0x4000, .udesc = "Misc Events - Set 0 -- Prefetch Ack Hints From Primary to Secondary", .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_TIMEOUT", .ucode = 0x8000, .udesc = "Misc Events - Set 0 -- Prefetch TimeOut", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_i_misc1[]={ { .uname = "DATA_THROTTLE", .ucode = 0x8000, .udesc = "Misc Events - Set 1 -- Data Throttled", .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOST_FWD", .ucode = 0x1000, .udesc = "Misc Events - Set 1 -- ", .uflags = INTEL_X86_NCOMBO, }, { .uname = "SEC_RCVD_INVLD", .ucode = 0x2000, .udesc = "Misc Events - Set 1 -- Received Invalid", .uflags = INTEL_X86_NCOMBO, }, { .uname = "SEC_RCVD_VLD", .ucode = 0x4000, .udesc = "Misc Events - Set 1 -- Received Valid", .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_I", .ucode = 0x100, .udesc = "Misc Events - Set 1 -- Slow Transfer of I Line", .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_S", .ucode = 0x200, .udesc = "Misc Events - Set 1 -- Slow Transfer of S Line", .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_E", .ucode = 0x400, .udesc = "Misc Events - Set 1 -- Slow Transfer of E Line", .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_M", .ucode = 0x800, .udesc = "Misc Events - Set 1 -- Slow Transfer of M Line", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_i_snoop_resp[]={ { .uname = "HIT_ES", .ucode = 0x400, .udesc = "Snoop Responses -- Hit E or S", }, { .uname = "HIT_I", .ucode = 0x200, .udesc = "Snoop Responses -- Hit I", }, { .uname = "HIT_M", .ucode = 0x800, .udesc = "Snoop Responses -- Hit M", }, { .uname = "MISS", .ucode = 0x100, .udesc = "Snoop Responses -- Miss", }, { .uname = "SNPCODE", .ucode = 0x1000, .udesc = "Snoop Responses -- SnpCode", }, { .uname = "SNPDATA", .ucode = 0x2000, .udesc = "Snoop Responses -- SnpData", }, 
{ .uname = "SNPINV", .ucode = 0x4000, .udesc = "Snoop Responses -- SnpInv", }, }; static intel_x86_umask_t bdx_unc_i_transactions[]={ { .uname = "ATOMIC", .ucode = 0x1000, .udesc = "Inbound Transaction Count -- Atomic", }, { .uname = "ORDERINGQ", .ucode = 0x4000, .udesc = "Inbound Transaction Count -- Select Source via IRP orderingQ register", }, { .uname = "OTHER", .ucode = 0x2000, .udesc = "Inbound Transaction Count -- Other", }, { .uname = "RD_PREF", .ucode = 0x400, .udesc = "Inbound Transaction Count -- Read Prefetches", }, { .uname = "READS", .ucode = 0x100, .udesc = "Inbound Transaction Count -- Reads", }, { .uname = "WRITES", .ucode = 0x200, .udesc = "Inbound Transaction Count -- Writes", }, { .uname = "WR_PREF", .ucode = 0x800, .udesc = "Inbound Transaction Count -- Write Prefetches", }, }; static intel_x86_entry_t intel_bdx_unc_i_pe[]={ { .name = "UNC_I_CACHE_TOTAL_OCCUPANCY", .code = 0x12, .desc = "Accumulates the number of reads and writes that are outstanding in the uncore in each cycle. 
This is effectively the sum of the READ_OCCUPANCY and WRITE_OCCUPANCY events.", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_i_cache_total_occupancy, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_i_cache_total_occupancy), }, { .name = "UNC_I_CLOCKTICKS", .code = 0x0, .desc = "Number of clocks in the IRP.", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_COHERENT_OPS", .code = 0x13, .desc = "Counts the number of coherency related operations serviced by the IRP", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_i_coherent_ops, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_i_coherent_ops), }, { .name = "UNC_I_MISC0", .code = 0x14, .desc = "TBD", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_i_misc0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_i_misc0), }, { .name = "UNC_I_MISC1", .code = 0x15, .desc = "TBD", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_i_misc1, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_i_misc1), }, { .name = "UNC_I_RXR_AK_INSERTS", .code = 0xa, .desc = "Counts the number of allocations into the AK Ingress. This queue is where the IRP receives responses from R2PCIe (the ring).", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_RXR_BL_DRS_CYCLES_FULL", .code = 0x4, .desc = "Counts the number of cycles when the BL Ingress is full. This queue is where the IRP receives data from R2PCIe (the ring). It is used for data returns from read requests as well as outbound MMIO writes.", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_RXR_BL_DRS_INSERTS", .code = 0x1, .desc = "Counts the number of allocations into the BL Ingress. This queue is where the IRP receives data from R2PCIe (the ring). It is used for data returns from read requests as well as outbound MMIO writes.", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_RXR_BL_DRS_OCCUPANCY", .code = 0x7, .desc = "Accumulates the occupancy of the BL Ingress in each cycle.
This queue is where the IRP receives data from R2PCIe (the ring). It is used for data returns from read requests as well as outbound MMIO writes.", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_RXR_BL_NCB_CYCLES_FULL", .code = 0x5, .desc = "Counts the number of cycles when the BL Ingress is full. This queue is where the IRP receives data from R2PCIe (the ring). It is used for data returns from read requests as well as outbound MMIO writes.", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_RXR_BL_NCB_INSERTS", .code = 0x2, .desc = "Counts the number of allocations into the BL Ingress. This queue is where the IRP receives data from R2PCIe (the ring). It is used for data returns from read requests as well as outbound MMIO writes.", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_RXR_BL_NCB_OCCUPANCY", .code = 0x8, .desc = "Accumulates the occupancy of the BL Ingress in each cycle. This queue is where the IRP receives data from R2PCIe (the ring). It is used for data returns from read requests as well as outbound MMIO writes.", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_RXR_BL_NCS_CYCLES_FULL", .code = 0x6, .desc = "Counts the number of cycles when the BL Ingress is full. This queue is where the IRP receives data from R2PCIe (the ring). It is used for data returns from read requests as well as outbound MMIO writes.", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_RXR_BL_NCS_INSERTS", .code = 0x3, .desc = "Counts the number of allocations into the BL Ingress. This queue is where the IRP receives data from R2PCIe (the ring). It is used for data returns from read requests as well as outbound MMIO writes.", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_RXR_BL_NCS_OCCUPANCY", .code = 0x9, .desc = "Accumulates the occupancy of the BL Ingress in each cycle. This queue is where the IRP receives data from R2PCIe (the ring).
It is used for data returns from read requests as well as outbound MMIO writes.", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_SNOOP_RESP", .code = 0x17, .desc = "TBD", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_i_snoop_resp, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_i_snoop_resp), }, { .name = "UNC_I_TRANSACTIONS", .code = 0x16, .desc = "Counts the number of Inbound transactions from the IRP to the Uncore. This can be filtered based on request type in addition to the source queue. Note the special filtering equation. We do OR-reduction on the request type. If the SOURCE bit is set, then we also do AND qualification based on the source portItID.", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_i_transactions, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_i_transactions), }, { .name = "UNC_I_TXR_AD_STALL_CREDIT_CYCLES", .code = 0x18, .desc = "Counts the number of times when it is not possible to issue a request to the R2PCIe because there are no AD Egress Credits available.", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXR_BL_STALL_CREDIT_CYCLES", .code = 0x19, .desc = "Counts the number of times when it is not possible to issue data to the R2PCIe because there are no BL Egress Credits available.", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXR_DATA_INSERTS_NCB", .code = 0xe, .desc = "Counts the number of requests issued to the switch (towards the devices).", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXR_DATA_INSERTS_NCS", .code = 0xf, .desc = "Counts the number of requests issued to the switch (towards the devices).", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXR_REQUEST_OCCUPANCY", .code = 0xd, .desc = "Accumulates the number of outstanding outbound requests from the IRP to the switch (towards the devices).
This can be used in conjunction with the allocations event in order to calculate average latency of outbound requests.", .modmsk = BDX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, };

/* papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_bdx_unc_pcu_events.h */

/* * Copyright (c) 2017 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux.
* * PMU: bdx_unc_pcu */ static intel_x86_umask_t bdx_unc_p_power_state_occupancy[]={ { .uname = "CORES_C0", .ucode = 0x4000, .udesc = "Number of cores in C-State -- C0 and C1", }, { .uname = "CORES_C3", .ucode = 0x8000, .udesc = "Number of cores in C-State -- C3", }, { .uname = "CORES_C6", .ucode = 0xc000, .udesc = "Number of cores in C-State -- C6 and C7", }, }; static intel_x86_entry_t intel_bdx_unc_p_pe[]={ { .name = "UNC_P_CLOCKTICKS", .code = 0x0, .desc = "The PCU runs off a fixed 1 GHz clock. This event counts the number of pclk cycles measured while the counter was enabled. The pclk, like the Memory Controller's dclk, counts at a constant rate making it a good measure of actual wall time.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE0_TRANSITION_CYCLES", .code = 0x60, .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE10_TRANSITION_CYCLES", .code = 0x6a, .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE11_TRANSITION_CYCLES", .code = 0x6b, .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE12_TRANSITION_CYCLES", .code = 0x6c, .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE13_TRANSITION_CYCLES", .code = 0x6d, .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE14_TRANSITION_CYCLES", .code = 0x6e, .desc = "Number of cycles spent performing core C state transitions.
There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE15_TRANSITION_CYCLES", .code = 0x6f, .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE16_TRANSITION_CYCLES", .code = 0x70, .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE17_TRANSITION_CYCLES", .code = 0x71, .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE1_TRANSITION_CYCLES", .code = 0x61, .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE2_TRANSITION_CYCLES", .code = 0x62, .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE3_TRANSITION_CYCLES", .code = 0x63, .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE4_TRANSITION_CYCLES", .code = 0x64, .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE5_TRANSITION_CYCLES", .code = 0x65, .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE6_TRANSITION_CYCLES", .code = 0x66, .desc = "Number of cycles spent performing core C state transitions. 
There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE7_TRANSITION_CYCLES", .code = 0x67, .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE8_TRANSITION_CYCLES", .code = 0x68, .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE9_TRANSITION_CYCLES", .code = 0x69, .desc = "Number of cycles spent performing core C state transitions. There is one event per core.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE0", .code = 0x30, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE1", .code = 0x31, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE10", .code = 0x3a, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE11", .code = 0x3b, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE12", .code = 0x3c, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE13", .code = 0x3d, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE14", .code = 0x3e, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name =
"UNC_P_DEMOTIONS_CORE15", .code = 0x3f, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE16", .code = 0x40, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE17", .code = 0x41, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE2", .code = 0x32, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE3", .code = 0x33, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE4", .code = 0x34, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE5", .code = 0x35, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE6", .code = 0x36, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE7", .code = 0x37, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE8", .code = 0x38, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS_CORE9", .code = 0x39, .desc = "Counts the number of times when a configurable core had a C-state demotion", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk =
0xf, }, { .name = "UNC_P_FREQ_MAX_LIMIT_THERMAL_CYCLES", .code = 0x4, .desc = "Counts the number of cycles when thermal conditions are the upper limit on frequency. This is related to the THERMAL_THROTTLE CYCLES_ABOVE_TEMP event, which always counts cycles when we are above the thermal temperature. This event (STRONGEST_UPPER_LIMIT) is sampled at the output of the algorithm that determines the actual frequency, while THERMAL_THROTTLE looks at the input.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_FREQ_MAX_OS_CYCLES", .code = 0x6, .desc = "Counts the number of cycles when the OS is the upper limit on frequency.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_FREQ_MAX_POWER_CYCLES", .code = 0x5, .desc = "Counts the number of cycles when power is the upper limit on frequency.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_FREQ_MIN_IO_P_CYCLES", .code = 0x73, .desc = "Counts the number of cycles when IO P Limit is preventing us from dropping the frequency lower. This algorithm monitors the needs of the IO subsystem on both local and remote sockets and will maintain a frequency high enough to maintain good IO BW. This is necessary for when all the IA cores on a socket are idle but a user still would like to maintain high IO Bandwidth.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_FREQ_TRANS_CYCLES", .code = 0x74, .desc = "Counts the number of cycles when the system is changing frequency. This cannot be filtered by thread ID. One can also use it with the occupancy counter that monitors the number of threads in C0 to estimate the performance impact that frequency transitions had on the system.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_MEMORY_PHASE_SHEDDING_CYCLES", .code = 0x2f, .desc = "Counts the number of cycles that the PCU has triggered memory phase shedding. 
This is a mode that can be run in the iMC physicals that saves power at the expense of additional latency.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_POWER_STATE_OCCUPANCY", .code = 0x80, .desc = "This is an occupancy event that tracks the number of cores that are in the chosen C-State. It can be used by itself to get the average number of cores in that C-state with thresholding to generate histograms, or with other PCU events and occupancy triggering to capture other details.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_p_power_state_occupancy, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_p_power_state_occupancy), }, { .name = "UNC_P_PROCHOT_EXTERNAL_CYCLES", .code = 0xa, .desc = "Counts the number of cycles that we are in external PROCHOT mode. This mode is triggered when a sensor off the die determines that something off-die (like DRAM) is too hot and must throttle to avoid damaging the chip.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_PROCHOT_INTERNAL_CYCLES", .code = 0x9, .desc = "Counts the number of cycles that we are in internal PROCHOT mode. 
This mode is triggered when a sensor on the die determines that we are too hot and must throttle to avoid damaging the chip.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_TOTAL_TRANSITION_CYCLES", .code = 0x72, .desc = "Number of cycles spent performing core C state transitions across all cores.", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_UFS_BANDWIDTH_MAX_RANGE", .code = 0x7e, .desc = "TBD", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_UFS_TRANSITIONS_DOWN", .code = 0x7c, .desc = "Ring GV down due to low traffic", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_UFS_TRANSITIONS_IO_P_LIMIT", .code = 0x7d, .desc = "TBD", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_UFS_TRANSITIONS_NO_CHANGE", .code = 0x79, .desc = "Ring GV with same final and initial frequency", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_UFS_TRANSITIONS_UP_RING", .code = 0x7a, .desc = "Ring GV up due to high ring traffic", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_UFS_TRANSITIONS_UP_STALL", .code = 0x7b, .desc = "Ring GV up due to high core stalls", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_VR_HOT_CYCLES", .code = 0x42, .desc = "TBD", .modmsk = BDX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_FREQ_BAND0_CYCLES", .desc = "Frequency Residency", .code = 0xb, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = BDX_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_BAND1_CYCLES", .desc = "Frequency Residency", .code = 0xc, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = BDX_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_BAND2_CYCLES", .desc = "Frequency Residency", .code = 0xd, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = BDX_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_BAND3_CYCLES", .desc = "Frequency Residency", 
.code = 0xe, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = BDX_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FIVR_PS_PS0_CYCLES", .desc = "Cycles spent in phase-shedding power state 0", .code = 0x75, .cntmsk = 0xf, .modmsk = BDX_UNC_PCU_ATTRS, }, { .name = "UNC_P_FIVR_PS_PS1_CYCLES", .desc = "Cycles spent in phase-shedding power state 1", .code = 0x76, .cntmsk = 0xf, .modmsk = BDX_UNC_PCU_ATTRS, }, { .name = "UNC_P_FIVR_PS_PS2_CYCLES", .desc = "Cycles spent in phase-shedding power state 2", .code = 0x77, .cntmsk = 0xf, .modmsk = BDX_UNC_PCU_ATTRS, }, { .name = "UNC_P_FIVR_PS_PS3_CYCLES", .desc = "Cycles spent in phase-shedding power state 3", .code = 0x78, .cntmsk = 0xf, .modmsk = BDX_UNC_PCU_ATTRS, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_bdx_unc_qpi_events.h000066400000000000000000001217211502707512200252650ustar00rootroot00000000000000/* * Copyright (c) 2017 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: bdx_unc_qpi */ static intel_x86_umask_t bdx_unc_q_direct2core[]={ { .uname = "FAILURE_CREDITS", .ucode = 0x200, .udesc = "Direct 2 Core Spawning -- Spawn Failure - Egress Credits", }, { .uname = "FAILURE_CREDITS_MISS", .ucode = 0x2000, .udesc = "Direct 2 Core Spawning -- Spawn Failure - Egress and RBT Miss", }, { .uname = "FAILURE_CREDITS_RBT", .ucode = 0x800, .udesc = "Direct 2 Core Spawning -- Spawn Failure - Egress and RBT Invalid", }, { .uname = "FAILURE_CREDITS_RBT_MISS", .ucode = 0x8000, .udesc = "Direct 2 Core Spawning -- Spawn Failure - Egress and RBT Miss, Invalid", }, { .uname = "FAILURE_MISS", .ucode = 0x1000, .udesc = "Direct 2 Core Spawning -- Spawn Failure - RBT Miss", }, { .uname = "FAILURE_RBT_HIT", .ucode = 0x400, .udesc = "Direct 2 Core Spawning -- Spawn Failure - RBT Invalid", }, { .uname = "FAILURE_RBT_MISS", .ucode = 0x4000, .udesc = "Direct 2 Core Spawning -- Spawn Failure - RBT Miss and Invalid", }, { .uname = "SUCCESS_RBT_HIT", .ucode = 0x100, .udesc = "Direct 2 Core Spawning -- Spawn Success", }, }; static intel_x86_umask_t bdx_unc_q_rxl_credits_consumed_vn0[]={ { .uname = "DRS", .ucode = 0x100, .udesc = "VN0 Credit Consumed -- DRS", }, { .uname = "HOM", .ucode = 0x800, .udesc = "VN0 Credit Consumed -- HOM", }, { .uname = "NCB", .ucode = 0x200, .udesc = "VN0 Credit Consumed -- NCB", }, { .uname = "NCS", .ucode = 0x400, .udesc = "VN0 Credit Consumed -- NCS", }, { .uname = "NDR", .ucode = 0x2000, .udesc = "VN0 Credit Consumed -- NDR", }, { .uname = "SNP", .ucode = 0x1000, .udesc = "VN0 Credit Consumed -- SNP", }, }; static intel_x86_umask_t bdx_unc_q_rxl_flits_g1[]={ { 
.uname = "DRS", .ucode = 0x1800, .udesc = "Flits Received - Group 1 -- DRS Flits (both Header and Data)", .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_DATA", .ucode = 0x800, .udesc = "Flits Received - Group 1 -- DRS Data Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_NONDATA", .ucode = 0x1000, .udesc = "Flits Received - Group 1 -- DRS Header Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM", .ucode = 0x600, .udesc = "Flits Received - Group 1 -- HOM Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM_NONREQ", .ucode = 0x400, .udesc = "Flits Received - Group 1 -- HOM Non-Request Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM_REQ", .ucode = 0x200, .udesc = "Flits Received - Group 1 -- HOM Request Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP", .ucode = 0x100, .udesc = "Flits Received - Group 1 -- SNP Flits", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_q_rxl_flits_g2[]={ { .uname = "NCB", .ucode = 0xc00, .udesc = "Flits Received - Group 2 -- Non-Coherent Rx Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB_DATA", .ucode = 0x400, .udesc = "Flits Received - Group 2 -- Non-Coherent data Rx Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB_NONDATA", .ucode = 0x800, .udesc = "Flits Received - Group 2 -- Non-Coherent non-data Rx Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS", .ucode = 0x1000, .udesc = "Flits Received - Group 2 -- Non-Coherent standard Rx Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "NDR_AD", .ucode = 0x100, .udesc = "Flits Received - Group 2 -- Non-Data Response Rx Flits - AD", .uflags = INTEL_X86_NCOMBO, }, { .uname = "NDR_AK", .ucode = 0x200, .udesc = "Flits Received - Group 2 -- Non-Data Response Rx Flits - AK", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_q_rxl_inserts_drs[]={ { .uname = "VN0", .ucode = 0x100, .udesc = "for VN0", }, { .uname = "VN1", .ucode = 0x200, .udesc = "for VN1", }, }; static const intel_x86_umask_t bdx_unc_q_rxl_flits_g0[]={ { 
.uname = "IDLE", .udesc = "Number of data flits over QPI that do not hold payload. When QPI is not in a power saving state, it continuously transmits flits across the link. When there are no protocol flits to send, it will send IDLE and NULL flits across", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA", .udesc = "Number of data flits over QPI", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NON_DATA", .udesc = "Number of non-NULL non-data flits over QPI", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_q_txl_flits_g0[]={ { .uname = "DATA", .ucode = 0x200, .udesc = "Flits Transferred - Group 0 -- Data Tx Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "NON_DATA", .ucode = 0x400, .udesc = "Flits Transferred - Group 0 -- Non-Data protocol Tx Flits", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_q_txl_flits_g1[]={ { .uname = "DRS", .ucode = 0x1800, .udesc = "Flits Transferred - Group 1 -- DRS Flits (both Header and Data)", .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_DATA", .ucode = 0x800, .udesc = "Flits Transferred - Group 1 -- DRS Data Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_NONDATA", .ucode = 0x1000, .udesc = "Flits Transferred - Group 1 -- DRS Header Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM", .ucode = 0x600, .udesc = "Flits Transferred - Group 1 -- HOM Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM_NONREQ", .ucode = 0x400, .udesc = "Flits Transferred - Group 1 -- HOM Non-Request Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM_REQ", .ucode = 0x200, .udesc = "Flits Transferred - Group 1 -- HOM Request Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP", .ucode = 0x100, .udesc = "Flits Transferred - Group 1 -- SNP Flits", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_q_txl_flits_g2[]={ { .uname = "NCB", .ucode = 0xc00, .udesc = "Flits Transferred - Group 2 -- Non-Coherent Bypass Tx Flits", .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "NCB_DATA", .ucode = 0x400, .udesc = "Flits Transferred - Group 2 -- Non-Coherent data Tx Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB_NONDATA", .ucode = 0x800, .udesc = "Flits Transferred - Group 2 -- Non-Coherent non-data Tx Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS", .ucode = 0x1000, .udesc = "Flits Transferred - Group 2 -- Non-Coherent standard Tx Flits", .uflags = INTEL_X86_NCOMBO, }, { .uname = "NDR_AD", .ucode = 0x100, .udesc = "Flits Transferred - Group 2 -- Non-Data Response Tx Flits - AD", .uflags = INTEL_X86_NCOMBO, }, { .uname = "NDR_AK", .ucode = 0x200, .udesc = "Flits Transferred - Group 2 -- Non-Data Response Tx Flits - AK", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_q_txr_bl_drs_credit_acquired[]={ { .uname = "VN0", .ucode = 0x100, .udesc = "R3QPI Egress Credit Occupancy - DRS -- for VN0", .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN1", .ucode = 0x200, .udesc = "R3QPI Egress Credit Occupancy - DRS -- for VN1", .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN_SHR", .ucode = 0x400, .udesc = "R3QPI Egress Credit Occupancy - DRS -- for Shared VN", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_entry_t intel_bdx_unc_q_pe[]={ { .name = "UNC_Q_CLOCKTICKS", .code = 0x14, .desc = "Counts the number of clocks in the QPI LL. This clock runs at 1/4th the GT/s speed of the QPI link. For example, a 4GT/s link will have a qfclk of 1GHz. BDX does not support dynamic link speeds, so this frequency is fixed.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_CTO_COUNT", .code = 0x38 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Counts the number of CTO (cluster trigger outs) events that were asserted across the two slots. If both slots trigger in a given cycle, the event will increment by 2. 
You can use edge detect to count the number of cases when both events triggered.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_DIRECT2CORE", .code = 0x13, .desc = "Counts the number of DRS packets that we attempted to do direct2core on. There are 4 mutually exclusive filters. Filter [0] can be used to get successful spawns, while [1:3] provide the different failure cases. Note that this does not count packets that are not candidates for Direct2Core. The only candidates for Direct2Core are DRS packets destined for Cbos.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_direct2core, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_direct2core), }, { .name = "UNC_Q_L1_POWER_CYCLES", .code = 0x12, .desc = "Number of QPI qfclk cycles spent in L1 power mode. L1 is a mode that totally shuts down a QPI link. Use edge detect to count the number of instances when the QPI link entered L1. Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another. Because L1 totally shuts down the link, it takes a good amount of time to exit this mode.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_RXL0P_POWER_CYCLES", .code = 0x10, .desc = "Number of QPI qfclk cycles spent in L0p power mode. L0p is a mode where we disable 1/2 of the QPI lanes, decreasing our bandwidth in order to save power. It increases snoop and data transfer latencies and decreases overall bandwidth. This mode can be very useful in NUMA optimized workloads that largely only utilize QPI for snoops and their responses. Use edge detect to count the number of instances when the QPI link entered L0p. Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_RXL0_POWER_CYCLES", .code = 0xf, .desc = "Number of QPI qfclk cycles spent in L0 power mode in the Link Layer. 
L0 is the default mode which provides the highest performance with the most power. Use edge detect to count the number of instances that the link entered L0. Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another. The phy layer sometimes leaves L0 for training, which will not be captured by this event.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_RXL_BYPASSED", .code = 0x9, .desc = "Counts the number of times that an incoming flit was able to bypass the flit buffer and pass directly across the BGF and into the Egress. This is a latency optimization, and should generally be the common case. If this value is less than the number of flits transferred, it implies that there was queueing getting onto the ring, and thus the transactions saw higher latency.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_RXL_CREDITS_CONSUMED_VN0", .code = 0x1e | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Counts the number of times that an RxQ VN0 credit was consumed (i.e. message uses a VN0 credit for the Rx Buffer). This includes packets that went through the RxQ and those that were bypassed.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_credits_consumed_vn0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_credits_consumed_vn0), }, { .name = "UNC_Q_RXL_CREDITS_CONSUMED_VN1", .code = 0x39 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Counts the number of times that an RxQ VN1 credit was consumed (i.e. message uses a VN1 credit for the Rx Buffer). 
This includes packets that went through the RxQ and those that were bypassed.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_credits_consumed_vn0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_credits_consumed_vn0), }, { .name = "UNC_Q_RXL_CREDITS_CONSUMED_VNA", .code = 0x1d | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Counts the number of times that an RxQ VNA credit was consumed (i.e. message uses a VNA credit for the Rx Buffer). This includes packets that went through the RxQ and those that were bypassed.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_RXL_CYCLES_NE", .code = 0xa, .desc = "Counts the number of cycles that the QPI RxQ was not empty. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Occupancy Accumulator event to calculate the average occupancy.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_RXL_FLITS_G0", .code = 0x1, .desc = "Counts the number of flits received from the QPI Link.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_flits_g0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_flits_g0), }, { .name = "UNC_Q_RXL_FLITS_G1", .code = 0x2 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Counts the number of flits received from the QPI Link. This is one of three groups that allow us to track flits. It includes filters for SNP, HOM, and DRS message classes. Each flit is made up of 80 bits of information (in addition to some ECC data). In full-width (L0) mode, flits are made up of four fits, each of which contains 20 bits of data (along with some additional ECC data). In half-width (L0p) mode, the fits are only 10 bits, and therefore it takes twice as many fits to transmit a flit. 
When one talks about QPI speed (for example, 8.0 GT/s), the transfers here refer to fits. Therefore, in L0, the system will transfer 1 flit at the rate of 1/4th the QPI speed. One can calculate the bandwidth of the link by taking: flits*80b/time. Note that this is not the same as data bandwidth. For example, when we are transferring a 64B cacheline across QPI, we will break it into 9 flits -- 1 with header information and 8 with 64 bits of actual data and an additional 16 bits of other information. To calculate data bandwidth, one should therefore do: data flits * 8B / time.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_flits_g1, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_flits_g1), }, { .name = "UNC_Q_RXL_FLITS_G2", .code = 0x3 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Counts the number of flits received from the QPI Link. This is one of three groups that allow us to track flits. It includes filters for NDR, NCB, and NCS message classes. Each flit is made up of 80 bits of information (in addition to some ECC data). In full-width (L0) mode, flits are made up of four fits, each of which contains 20 bits of data (along with some additional ECC data). In half-width (L0p) mode, the fits are only 10 bits, and therefore it takes twice as many fits to transmit a flit. When one talks about QPI speed (for example, 8.0 GT/s), the transfers here refer to fits. Therefore, in L0, the system will transfer 1 flit at the rate of 1/4th the QPI speed. One can calculate the bandwidth of the link by taking: flits*80b/time. Note that this is not the same as data bandwidth. For example, when we are transferring a 64B cacheline across QPI, we will break it into 9 flits -- 1 with header information and 8 with 64 bits of actual data and an additional 16 bits of other information. 
To calculate data bandwidth, one should therefore do: data flits * 8B / time.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_flits_g2, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_flits_g2), }, { .name = "UNC_Q_RXL_INSERTS", .code = 0x8, .desc = "Number of allocations into the QPI Rx Flit Buffer. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_RXL_INSERTS_DRS", .code = 0x9 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Number of allocations into the QPI Rx Flit Buffer. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime. This monitors only DRS flits.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_RXL_INSERTS_HOM", .code = 0xc | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Number of allocations into the QPI Rx Flit Buffer. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime. 
This monitors only HOM flits.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_RXL_INSERTS_NCB", .code = 0xa | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Number of allocations into the QPI Rx Flit Buffer. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime. This monitors only NCB flits.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_RXL_INSERTS_NCS", .code = 0xb | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Number of allocations into the QPI Rx Flit Buffer. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime. This monitors only NCS flits.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_RXL_INSERTS_NDR", .code = 0xe | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Number of allocations into the QPI Rx Flit Buffer. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. 
This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime. This monitors only NDR flits.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_RXL_INSERTS_SNP", .code = 0xd | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Number of allocations into the QPI Rx Flit Buffer. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime. This monitors only SNP flits.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_RXL_OCCUPANCY", .code = 0xb, .desc = "Accumulates the number of elements in the QPI RxQ in each cycle. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Not Empty event to calculate average occupancy, or with the Flit Buffer Allocations event to track average lifetime.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_RXL_OCCUPANCY_DRS", .code = 0x15 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Accumulates the number of elements in the QPI RxQ in each cycle. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. 
If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Not Empty event to calculate average occupancy, or with the Flit Buffer Allocations event to track average lifetime. This monitors DRS flits only.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_RXL_OCCUPANCY_HOM", .code = 0x18 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Accumulates the number of elements in the QPI RxQ in each cycle. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Not Empty event to calculate average occupancy, or with the Flit Buffer Allocations event to track average lifetime. This monitors HOM flits only.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_RXL_OCCUPANCY_NCB", .code = 0x16 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Accumulates the number of elements in the QPI RxQ in each cycle. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Not Empty event to calculate average occupancy, or with the Flit Buffer Allocations event to track average lifetime. 
This monitors NCB flits only.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_RXL_OCCUPANCY_NCS", .code = 0x17 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Accumulates the number of elements in the QPI RxQ in each cycle. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Not Empty event to calculate average occupancy, or with the Flit Buffer Allocations event to track average lifetime. This monitors NCS flits only.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_RXL_OCCUPANCY_NDR", .code = 0x1a | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Accumulates the number of elements in the QPI RxQ in each cycle. Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Not Empty event to calculate average occupancy, or with the Flit Buffer Allocations event to track average lifetime. This monitors NDR flits only.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_RXL_OCCUPANCY_SNP", .code = 0x19 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Accumulates the number of elements in the QPI RxQ in each cycle. 
Generally, when data is transmitted across QPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Not Empty event to calculate average occupancy, or with the Flit Buffer Allocations event to track average lifetime. This monitors SNP flits only.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_TXL0P_POWER_CYCLES", .code = 0xd, .desc = "Number of QPI qfclk cycles spent in L0p power mode. L0p is a mode where we disable 1/2 of the QPI lanes, decreasing our bandwidth in order to save power. It increases snoop and data transfer latencies and decreases overall bandwidth. This mode can be very useful in NUMA optimized workloads that largely only utilize QPI for snoops and their responses. Use edge detect to count the number of instances when the QPI link entered L0p. Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_TXL0_POWER_CYCLES", .code = 0xc, .desc = "Number of QPI qfclk cycles spent in L0 power mode in the Link Layer. L0 is the default mode which provides the highest performance with the most power. Use edge detect to count the number of instances that the link entered L0. Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another. 
The phy layer sometimes leaves L0 for training, which will not be captured by this event.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_TXL_BYPASSED", .code = 0x5, .desc = "Counts the number of times that an incoming flit was able to bypass the Tx flit buffer and pass directly out the QPI Link. Generally, when data is transmitted across QPI, it will bypass the TxQ and pass directly to the link. However, the TxQ will be used with L0p and when LLR occurs, increasing latency to transfer out to the link.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_TXL_CYCLES_NE", .code = 0x6, .desc = "Counts the number of cycles when the TxQ is not empty. Generally, when data is transmitted across QPI, it will bypass the TxQ and pass directly to the link. However, the TxQ will be used with L0p and when LLR occurs, increasing latency to transfer out to the link.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_TXL_FLITS_G0", .code = 0x0, .desc = "Counts the number of flits transmitted across the QPI Link. It includes filters for Idle, protocol, and Data Flits. Each flit is made up of 80 bits of information (in addition to some ECC data). In full-width (L0) mode, flits are made up of four fits, each of which contains 20 bits of data (along with some additional ECC data). In half-width (L0p) mode, the fits are only 10 bits, and therefore it takes twice as many fits to transmit a flit. When one talks about QPI speed (for example, 8.0 GT/s), the transfers here refer to fits. Therefore, in L0, the system will transfer 1 flit at the rate of 1/4th the QPI speed. One can calculate the bandwidth of the link by taking: flits*80b/time. Note that this is not the same as data bandwidth. For example, when we are transferring a 64B cacheline across QPI, we will break it into 9 flits -- 1 with header information and 8 with 64 bits of actual data and an additional 16 bits of other information. 
To calculate data bandwidth, one should therefore do: data flits * 8B / time (for L0) or 4B instead of 8B for L0p.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_txl_flits_g0, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_txl_flits_g0), }, { .name = "UNC_Q_TXL_FLITS_G1", .code = 0x0 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Counts the number of flits transmitted across the QPI Link. It includes filters for Idle, protocol, and Data Flits. Each flit is made up of 80 bits of information (in addition to some ECC data). In full-width (L0) mode, flits are made up of four fits, each of which contains 20 bits of data (along with some additional ECC data). In half-width (L0p) mode, the fits are only 10 bits, and therefore it takes twice as many fits to transmit a flit. When one talks about QPI speed (for example, 8.0 GT/s), the transfers here refer to fits. Therefore, in L0, the system will transfer 1 flit at the rate of 1/4th the QPI speed. One can calculate the bandwidth of the link by taking: flits*80b/time. Note that this is not the same as data bandwidth. For example, when we are transferring a 64B cacheline across QPI, we will break it into 9 flits -- 1 with header information and 8 with 64 bits of actual data and an additional 16 bits of other information. To calculate data bandwidth, one should therefore do: data flits * 8B / time (for L0) or 4B instead of 8B for L0p.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_txl_flits_g1, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_txl_flits_g1), }, { .name = "UNC_Q_TXL_FLITS_G2", .code = 0x1 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Counts the number of flits transmitted across the QPI Link. This is one of three groups that allow us to track flits. It includes filters for NDR, NCB, and NCS message classes. Each flit is made up of 80 bits of information (in addition to some ECC data). 
In full-width (L0) mode, flits are made up of four fits, each of which contains 20 bits of data (along with some additional ECC data). In half-width (L0p) mode, the fits are only 10 bits, and therefore it takes twice as many fits to transmit a flit. When one talks about QPI speed (for example, 8.0 GT/s), the transfers here refer to fits. Therefore, in L0, the system will transfer 1 flit at the rate of 1/4th the QPI speed. One can calculate the bandwidth of the link by taking: flits*80b/time. Note that this is not the same as data bandwidth. For example, when we are transferring a 64B cacheline across QPI, we will break it into 9 flits -- 1 with header information and 8 with 64 bits of actual data and an additional 16 bits of other information. To calculate data bandwidth, one should therefore do: data flits * 8B / time.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_txl_flits_g2, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_txl_flits_g2), }, { .name = "UNC_Q_TXL_INSERTS", .code = 0x4, .desc = "Number of allocations into the QPI Tx Flit Buffer. Generally, when data is transmitted across QPI, it will bypass the TxQ and pass directly to the link. However, the TxQ will be used with L0p and when LLR occurs, increasing latency to transfer out to the link. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_TXL_OCCUPANCY", .code = 0x7, .desc = "Accumulates the number of flits in the TxQ. Generally, when data is transmitted across QPI, it will bypass the TxQ and pass directly to the link. However, the TxQ will be used with L0p and when LLR occurs, increasing latency to transfer out to the link. 
This can be used with the cycles not empty event to track average occupancy, or the allocations event to track average lifetime in the TxQ.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_TXR_AD_HOM_CREDIT_ACQUIRED", .code = 0x26 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Number of link layer credits into the R3 (for transactions across the BGF) acquired each cycle. Flow Control FIFO for Home messages on AD.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_TXR_AD_HOM_CREDIT_OCCUPANCY", .code = 0x22 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Occupancy event that tracks the number of link layer credits into the R3 (for transactions across the BGF) available in each cycle. Flow Control FIFO for HOM messages on AD.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_TXR_AD_NDR_CREDIT_ACQUIRED", .code = 0x28 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Number of link layer credits into the R3 (for transactions across the BGF) acquired each cycle. Flow Control FIFO for NDR messages on AD.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_TXR_AD_NDR_CREDIT_OCCUPANCY", .code = 0x24 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Occupancy event that tracks the number of link layer credits into the R3 (for transactions across the BGF) available in each cycle. 
Flow Control FIFO for NDR messages on AD.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_TXR_AD_SNP_CREDIT_ACQUIRED", .code = 0x27 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Number of link layer credits into the R3 (for transactions across the BGF) acquired each cycle. Flow Control FIFO for Snoop messages on AD.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_TXR_AD_SNP_CREDIT_OCCUPANCY", .code = 0x23 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Occupancy event that tracks the number of link layer credits into the R3 (for transactions across the BGF) available in each cycle. Flow Control FIFO for Snoop messages on AD.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_TXR_AK_NDR_CREDIT_ACQUIRED", .code = 0x29 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Number of credits into the R3 (for transactions across the BGF) acquired each cycle. Local NDR message class to AK Egress.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_TXR_AK_NDR_CREDIT_OCCUPANCY", .code = 0x25 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Occupancy event that tracks the number of credits into the R3 (for transactions across the BGF) available in each cycle. Local NDR message class to AK Egress.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_TXR_BL_DRS_CREDIT_ACQUIRED", .code = 0x2a | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Number of credits into the R3 (for transactions across the BGF) acquired each cycle. 
DRS message class to BL Egress.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_txr_bl_drs_credit_acquired, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_txr_bl_drs_credit_acquired), }, { .name = "UNC_Q_TXR_BL_DRS_CREDIT_OCCUPANCY", .code = 0x1f | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Occupancy event that tracks the number of credits into the R3 (for transactions across the BGF) available in each cycle. DRS message class to BL Egress.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_txr_bl_drs_credit_acquired, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_txr_bl_drs_credit_acquired), }, { .name = "UNC_Q_TXR_BL_NCB_CREDIT_ACQUIRED", .code = 0x2b | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Number of credits into the R3 (for transactions across the BGF) acquired each cycle. NCB message class to BL Egress.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_TXR_BL_NCB_CREDIT_OCCUPANCY", .code = 0x20 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Occupancy event that tracks the number of credits into the R3 (for transactions across the BGF) available in each cycle. NCB message class to BL Egress.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_TXR_BL_NCS_CREDIT_ACQUIRED", .code = 0x2c | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Number of credits into the R3 (for transactions across the BGF) acquired each cycle. 
NCS message class to BL Egress.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_TXR_BL_NCS_CREDIT_OCCUPANCY", .code = 0x21 | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Occupancy event that tracks the number of credits into the R3 (for transactions across the BGF) available in each cycle. NCS message class to BL Egress.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_q_rxl_inserts_drs, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_q_rxl_inserts_drs), }, { .name = "UNC_Q_VNA_CREDIT_RETURNS", .code = 0x1c | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Number of VNA credits returned.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_Q_VNA_CREDIT_RETURN_OCCUPANCY", .code = 0x1b | (1 << 21), /* extra ev_sel_ext bit set */ .desc = "Number of VNA credits in the Rx side that are waiting to be returned back across the link.", .modmsk = BDX_UNC_QPI_ATTRS, .cntmsk = 0xf, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_bdx_unc_r2pcie_events.h /* * Copyright (c) 2017 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: bdx_unc_r2pcie */ static intel_x86_umask_t bdx_unc_r2_iio_credit[]={ { .uname = "ISOCH_QPI0", .ucode = 0x400, .udesc = "TBD", }, { .uname = "ISOCH_QPI1", .ucode = 0x800, .udesc = "TBD", }, { .uname = "PRQ_QPI0", .ucode = 0x100, .udesc = "TBD", }, { .uname = "PRQ_QPI1", .ucode = 0x200, .udesc = "TBD", }, }; static intel_x86_umask_t bdx_unc_r2_ring_ad_used[]={ { .uname = "CCW", .ucode = 0xc00, .udesc = "Counterclockwise", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CCW_EVEN", .ucode = 0x400, .udesc = "Counterclockwise and Even", }, { .uname = "CCW_ODD", .ucode = 0x800, .udesc = "Counterclockwise and Odd", }, { .uname = "CW", .ucode = 0x300, .udesc = "Clockwise", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CW_EVEN", .ucode = 0x100, .udesc = "Clockwise and Even", }, { .uname = "CW_ODD", .ucode = 0x200, .udesc = "Clockwise and Odd", }, }; static intel_x86_umask_t bdx_unc_r2_ring_ak_bounces[]={ { .uname = "DN", .ucode = 0x200, .udesc = "AK Ingress Bounced -- Dn", }, { .uname = "UP", .ucode = 0x100, .udesc = "AK Ingress Bounced -- Up", }, }; static intel_x86_umask_t bdx_unc_r2_ring_iv_used[]={ { .uname = "ANY", .ucode = 0xf00, .udesc = "Any directions", .uflags = INTEL_X86_DFL, }, { .uname = "CCW", .ucode = 0xc00, .udesc = "Counterclockwise", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CW", .ucode = 0x300, .udesc = "Clockwise", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t 
bdx_unc_r2_rxr_cycles_ne[]={ { .uname = "NCB", .ucode = 0x1000, .udesc = "NCB", }, { .uname = "NCS", .ucode = 0x2000, .udesc = "NCS", }, }; static intel_x86_umask_t bdx_unc_r2_rxr_occupancy[]={ { .uname = "DRS", .ucode = 0x800, .udesc = "Ingress Occupancy Accumulator -- DRS", .uflags = INTEL_X86_DFL, }, }; static intel_x86_umask_t bdx_unc_r2_sbo0_credits_acquired[]={ { .uname = "AD", .ucode = 0x100, .udesc = "SBo0 Credits Acquired -- For AD Ring", }, { .uname = "BL", .ucode = 0x200, .udesc = "SBo0 Credits Acquired -- For BL Ring", }, }; static intel_x86_umask_t bdx_unc_r2_stall_no_sbo_credit[]={ { .uname = "SBO0_AD", .ucode = 0x100, .udesc = "Stall on No Sbo Credits -- For SBo0, AD Ring", }, { .uname = "SBO0_BL", .ucode = 0x400, .udesc = "Stall on No Sbo Credits -- For SBo0, BL Ring", }, { .uname = "SBO1_AD", .ucode = 0x200, .udesc = "Stall on No Sbo Credits -- For SBo1, AD Ring", }, { .uname = "SBO1_BL", .ucode = 0x800, .udesc = "Stall on No Sbo Credits -- For SBo1, BL Ring", }, }; static intel_x86_umask_t bdx_unc_r2_txr_cycles_full[]={ { .uname = "AD", .ucode = 0x100, .udesc = "Egress Cycles Full -- AD", }, { .uname = "AK", .ucode = 0x200, .udesc = "Egress Cycles Full -- AK", }, { .uname = "BL", .ucode = 0x400, .udesc = "Egress Cycles Full -- BL", }, }; static intel_x86_umask_t bdx_unc_r2_txr_cycles_ne[]={ { .uname = "AD", .ucode = 0x100, .udesc = "Egress Cycles Not Empty -- AD", }, { .uname = "AK", .ucode = 0x200, .udesc = "Egress Cycles Not Empty -- AK", }, { .uname = "BL", .ucode = 0x400, .udesc = "Egress Cycles Not Empty -- BL", }, }; static intel_x86_umask_t bdx_unc_r2_txr_nack_cw[]={ { .uname = "DN_AD", .ucode = 0x100, .udesc = "Egress CCW NACK -- AD CCW", }, { .uname = "DN_AK", .ucode = 0x400, .udesc = "Egress CCW NACK -- AK CCW", }, { .uname = "DN_BL", .ucode = 0x200, .udesc = "Egress CCW NACK -- BL CCW", }, { .uname = "UP_AD", .ucode = 0x800, .udesc = "Egress CCW NACK -- AK CCW", }, { .uname = "UP_AK", .ucode = 0x2000, .udesc = "Egress CCW NACK -- BL 
CW", }, { .uname = "UP_BL", .ucode = 0x1000, .udesc = "Egress CCW NACK -- BL CCW", }, }; static intel_x86_entry_t intel_bdx_unc_r2_pe[]={ { .name = "UNC_R2_CLOCKTICKS", .code = 0x1, .desc = "Counts the number of uclks in the R2PCIe uclk domain. This could be slightly different than the count in the Ubox because of enable/freeze delays. However, because the R2PCIe is close to the Ubox, they generally should not diverge by more than a handful of cycles.", .modmsk = BDX_UNC_R2PCIE_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_R2_IIO_CREDIT", .code = 0x2d, .desc = "TBD", .modmsk = BDX_UNC_R2PCIE_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r2_iio_credit, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_iio_credit), }, { .name = "UNC_R2_RING_AD_USED", .code = 0x7, .desc = "Counts the number of cycles that the AD ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", .modmsk = BDX_UNC_R2PCIE_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_r2_ring_ad_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_ring_ad_used), }, { .name = "UNC_R2_RING_AK_BOUNCES", .code = 0x12, .desc = "Counts the number of times when a request destined for the AK ingress bounced.", .modmsk = BDX_UNC_R2PCIE_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_r2_ring_ak_bounces, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_ring_ak_bounces), }, { .name = "UNC_R2_RING_AK_USED", .code = 0x8, .desc = "Counts the number of cycles that the AK ring is being used at this ring stop. 
This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", .modmsk = BDX_UNC_R2PCIE_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_r2_ring_ad_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_ring_ad_used), }, { .name = "UNC_R2_RING_BL_USED", .code = 0x9, .desc = "Counts the number of cycles that the BL ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", .modmsk = BDX_UNC_R2PCIE_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_r2_ring_ad_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_ring_ad_used), }, { .name = "UNC_R2_RING_IV_USED", .code = 0xa, .desc = "Counts the number of cycles that the IV ring is being used at this ring stop. This includes when packets are passing by and when packets are being sent, but does not include when packets are being sunk into the ring stop.", .modmsk = BDX_UNC_R2PCIE_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_r2_ring_iv_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_ring_iv_used), }, { .name = "UNC_R2_RXR_CYCLES_NE", .code = 0x10, .desc = "Counts the number of cycles when the R2PCIe Ingress is not empty. This tracks one of the three rings that are used by the R2PCIe agent. This can be used in conjunction with the R2PCIe Ingress Occupancy Accumulator event in order to calculate average queue occupancy. Multiple ingress buffers can be tracked at a given time using multiple counters.", .modmsk = BDX_UNC_R2PCIE_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r2_rxr_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_rxr_cycles_ne), }, { .name = "UNC_R2_RXR_INSERTS", .code = 0x11, .desc = "Counts the number of allocations into the R2PCIe Ingress. This tracks one of the three rings that are used by the R2PCIe agent. 
This can be used in conjunction with the R2PCIe Ingress Occupancy Accumulator event in order to calculate average queue latency. Multiple ingress buffers can be tracked at a given time using multiple counters.", .modmsk = BDX_UNC_R2PCIE_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r2_rxr_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_rxr_cycles_ne), }, { .name = "UNC_R2_RXR_OCCUPANCY", .code = 0x13, .desc = "Accumulates the occupancy of a given R2PCIe Ingress queue in each cycle. This tracks one of the three ring Ingress buffers. This can be used with the R2PCIe Ingress Not Empty event to calculate average occupancy or the R2PCIe Ingress Allocations event in order to calculate average queuing latency.", .modmsk = BDX_UNC_R2PCIE_ATTRS, .cntmsk = 0x1, .ngrp = 1, .umasks = bdx_unc_r2_rxr_occupancy, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_rxr_occupancy), }, { .name = "UNC_R2_SBO0_CREDITS_ACQUIRED", .code = 0x28, .desc = "Number of Sbo 0 credits acquired in a given cycle, per ring.", .modmsk = BDX_UNC_R2PCIE_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r2_sbo0_credits_acquired, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_sbo0_credits_acquired), }, { .name = "UNC_R2_STALL_NO_SBO_CREDIT", .code = 0x2c, .desc = "Number of cycles Egress is stalled waiting for an Sbo credit to become available. Per Sbo, per Ring.", .modmsk = BDX_UNC_R2PCIE_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r2_stall_no_sbo_credit, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_stall_no_sbo_credit), }, { .name = "UNC_R2_TXR_CYCLES_FULL", .code = 0x25, .desc = "Counts the number of cycles when the R2PCIe Egress buffer is full.", .modmsk = BDX_UNC_R2PCIE_ATTRS, .cntmsk = 0x1, .ngrp = 1, .umasks = bdx_unc_r2_txr_cycles_full, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_txr_cycles_full), }, { .name = "UNC_R2_TXR_CYCLES_NE", .code = 0x23, .desc = "Counts the number of cycles when the R2PCIe Egress is not empty. This tracks one of the three rings that are used by the R2PCIe agent. 
This can be used in conjunction with the R2PCIe Egress Occupancy Accumulator event in order to calculate average queue occupancy. Only a single Egress queue can be tracked at any given time. It is not possible to filter based on direction or polarity.", .modmsk = BDX_UNC_R2PCIE_ATTRS, .cntmsk = 0x1, .ngrp = 1, .umasks = bdx_unc_r2_txr_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_txr_cycles_ne), }, { .name = "UNC_R2_TXR_NACK_CW", .code = 0x26, .desc = "TBD", .modmsk = BDX_UNC_R2PCIE_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r2_txr_nack_cw, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r2_txr_nack_cw), }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_bdx_unc_r3qpi_events.h /* * Copyright (c) 2017 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: bdx_unc_r3qpi */ static intel_x86_umask_t bdx_unc_r3_c_hi_ad_credits_empty[]={ { .uname = "CBO10", .ucode = 0x400, .udesc = "CBox AD Credits Empty", }, { .uname = "CBO11", .ucode = 0x800, .udesc = "CBox AD Credits Empty", }, { .uname = "CBO12", .ucode = 0x1000, .udesc = "CBox AD Credits Empty", }, { .uname = "CBO13", .ucode = 0x2000, .udesc = "CBox AD Credits Empty", }, { .uname = "CBO14_16", .ucode = 0x4000, .udesc = "CBox AD Credits Empty", }, { .uname = "CBO8", .ucode = 0x100, .udesc = "CBox AD Credits Empty", }, { .uname = "CBO9", .ucode = 0x200, .udesc = "CBox AD Credits Empty", }, { .uname = "CBO_15_17", .ucode = 0x8000, .udesc = "CBox AD Credits Empty", }, }; static intel_x86_umask_t bdx_unc_r3_c_lo_ad_credits_empty[]={ { .uname = "CBO0", .ucode = 0x100, .udesc = "CBox AD Credits Empty", }, { .uname = "CBO1", .ucode = 0x200, .udesc = "CBox AD Credits Empty", }, { .uname = "CBO2", .ucode = 0x400, .udesc = "CBox AD Credits Empty", }, { .uname = "CBO3", .ucode = 0x800, .udesc = "CBox AD Credits Empty", }, { .uname = "CBO4", .ucode = 0x1000, .udesc = "CBox AD Credits Empty", }, { .uname = "CBO5", .ucode = 0x2000, .udesc = "CBox AD Credits Empty", }, { .uname = "CBO6", .ucode = 0x4000, .udesc = "CBox AD Credits Empty", }, { .uname = "CBO7", .ucode = 0x8000, .udesc = "CBox AD Credits Empty", }, }; static intel_x86_umask_t bdx_unc_r3_ha_r2_bl_credits_empty[]={ { .uname = "HA0", .ucode = 0x100, .udesc = "HA/R2 AD Credits Empty", }, { .uname = "HA1", .ucode = 0x200, .udesc = "HA/R2 AD Credits Empty", }, { .uname = "R2_NCB", .ucode = 0x400, .udesc = "HA/R2 AD Credits Empty", }, { .uname = "R2_NCS", .ucode = 0x800, .udesc = "HA/R2 AD Credits Empty", }, }; static intel_x86_umask_t bdx_unc_r3_qpi0_ad_credits_empty[]={ { .uname = "VN0_HOM", .ucode = 0x200, .udesc = "VN0 HOM messages", }, { .uname = "VN0_NDR", .ucode = 0x800, .udesc = "VN0 NDR messages", }, { 
.uname = "VN0_SNP", .ucode = 0x400, .udesc = "VN0 SNP messages", }, { .uname = "VN1_HOM", .ucode = 0x1000, .udesc = "VN1 HOM messages", }, { .uname = "VN1_NDR", .ucode = 0x4000, .udesc = "VN1 NDR messages", }, { .uname = "VN1_SNP", .ucode = 0x2000, .udesc = "VN1 SNP messages", }, { .uname = "VNA", .ucode = 0x100, .udesc = "VNA messages", }, }; static intel_x86_umask_t bdx_unc_r3_qpi0_bl_credits_empty[]={ { .uname = "VN1_HOM", .ucode = 0x1000, .udesc = "QPIx BL Credits Empty", }, { .uname = "VN1_NDR", .ucode = 0x4000, .udesc = "QPIx BL Credits Empty", }, { .uname = "VN1_SNP", .ucode = 0x2000, .udesc = "QPIx BL Credits Empty", }, { .uname = "VNA", .ucode = 0x100, .udesc = "QPIx BL Credits Empty", }, }; static intel_x86_umask_t bdx_unc_r3_ring_ad_used[]={ { .uname = "CCW", .ucode = 0xc00, .udesc = "Counterclockwise", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CCW_EVEN", .ucode = 0x400, .udesc = "Counterclockwise and Even", }, { .uname = "CCW_ODD", .ucode = 0x800, .udesc = "Counterclockwise and Odd", }, { .uname = "CW", .ucode = 0x300, .udesc = "Clockwise", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CW_EVEN", .ucode = 0x100, .udesc = "Clockwise and Even", }, { .uname = "CW_ODD", .ucode = 0x200, .udesc = "Clockwise and Odd", }, }; static intel_x86_umask_t bdx_unc_r3_ring_iv_used[]={ { .uname = "ANY", .ucode = 0xf00, .udesc = "Any", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CW", .ucode = 0x300, .udesc = "Clockwise", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_r3_ring_sink_starved[]={ { .uname = "AK", .ucode = 0x200, .udesc = "AK", .uflags = INTEL_X86_DFL, }, }; static intel_x86_umask_t bdx_unc_r3_rxr_cycles_ne[]={ { .uname = "HOM", .ucode = 0x100, .udesc = "Ingress Cycles Not Empty -- HOM", }, { .uname = "NDR", .ucode = 0x400, .udesc = "Ingress Cycles Not Empty -- NDR", }, { .uname = "SNP", .ucode = 0x200, .udesc = "Ingress Cycles Not Empty -- SNP", }, }; static intel_x86_umask_t bdx_unc_r3_rxr_cycles_ne_vn1[]={ { .uname = "DRS", 
.ucode = 0x800, .udesc = "VN1 Ingress Cycles Not Empty -- DRS", }, { .uname = "HOM", .ucode = 0x100, .udesc = "VN1 Ingress Cycles Not Empty -- HOM", }, { .uname = "NCB", .ucode = 0x1000, .udesc = "VN1 Ingress Cycles Not Empty -- NCB", }, { .uname = "NCS", .ucode = 0x2000, .udesc = "VN1 Ingress Cycles Not Empty -- NCS", }, { .uname = "NDR", .ucode = 0x400, .udesc = "VN1 Ingress Cycles Not Empty -- NDR", }, { .uname = "SNP", .ucode = 0x200, .udesc = "VN1 Ingress Cycles Not Empty -- SNP", }, }; static intel_x86_umask_t bdx_unc_r3_rxr_inserts[]={ { .uname = "DRS", .ucode = 0x800, .udesc = "Ingress Allocations -- DRS", }, { .uname = "HOM", .ucode = 0x100, .udesc = "Ingress Allocations -- HOM", }, { .uname = "NCB", .ucode = 0x1000, .udesc = "Ingress Allocations -- NCB", }, { .uname = "NCS", .ucode = 0x2000, .udesc = "Ingress Allocations -- NCS", }, { .uname = "NDR", .ucode = 0x400, .udesc = "Ingress Allocations -- NDR", }, { .uname = "SNP", .ucode = 0x200, .udesc = "Ingress Allocations -- SNP", }, }; static intel_x86_umask_t bdx_unc_r3_sbo0_credits_acquired[]={ { .uname = "AD", .ucode = 0x100, .udesc = "SBo0 Credits Acquired -- For AD Ring", }, { .uname = "BL", .ucode = 0x200, .udesc = "SBo0 Credits Acquired -- For BL Ring", }, }; static intel_x86_umask_t bdx_unc_r3_sbo1_credits_acquired[]={ { .uname = "AD", .ucode = 0x100, .udesc = "SBo1 Credits Acquired -- For AD Ring", }, { .uname = "BL", .ucode = 0x200, .udesc = "SBo1 Credits Acquired -- For BL Ring", }, }; static intel_x86_umask_t bdx_unc_r3_stall_no_sbo_credit[]={ { .uname = "SBO0_AD", .ucode = 0x100, .udesc = "Stall on No Sbo Credits -- For SBo0, AD Ring", }, { .uname = "SBO0_BL", .ucode = 0x400, .udesc = "Stall on No Sbo Credits -- For SBo0, BL Ring", }, { .uname = "SBO1_AD", .ucode = 0x200, .udesc = "Stall on No Sbo Credits -- For SBo1, AD Ring", }, { .uname = "SBO1_BL", .ucode = 0x800, .udesc = "Stall on No Sbo Credits -- For SBo1, BL Ring", }, }; static intel_x86_umask_t bdx_unc_r3_txr_nack[]={ { .uname = 
"DN_AD", .ucode = 0x100, .udesc = "Egress CCW NACK -- AD CCW", }, { .uname = "DN_AK", .ucode = 0x400, .udesc = "Egress CCW NACK -- AK CCW", }, { .uname = "DN_BL", .ucode = 0x200, .udesc = "Egress CCW NACK -- BL CCW", }, { .uname = "UP_AD", .ucode = 0x800, .udesc = "Egress CCW NACK -- AK CCW", }, { .uname = "UP_AK", .ucode = 0x2000, .udesc = "Egress CCW NACK -- BL CW", }, { .uname = "UP_BL", .ucode = 0x1000, .udesc = "Egress CCW NACK -- BL CCW", }, }; static intel_x86_umask_t bdx_unc_r3_vn0_credits_reject[]={ { .uname = "DRS", .ucode = 0x800, .udesc = "VN0 Credit Acquisition Failed on DRS -- DRS Message Class", }, { .uname = "HOM", .ucode = 0x100, .udesc = "VN0 Credit Acquisition Failed on DRS -- HOM Message Class", }, { .uname = "NCB", .ucode = 0x1000, .udesc = "VN0 Credit Acquisition Failed on DRS -- NCB Message Class", }, { .uname = "NCS", .ucode = 0x2000, .udesc = "VN0 Credit Acquisition Failed on DRS -- NCS Message Class", }, { .uname = "NDR", .ucode = 0x400, .udesc = "VN0 Credit Acquisition Failed on DRS -- NDR Message Class", }, { .uname = "SNP", .ucode = 0x200, .udesc = "VN0 Credit Acquisition Failed on DRS -- SNP Message Class", }, }; static intel_x86_umask_t bdx_unc_r3_vn0_credits_used[]={ { .uname = "DRS", .ucode = 0x800, .udesc = "VN0 Credit Used -- DRS Message Class", }, { .uname = "HOM", .ucode = 0x100, .udesc = "VN0 Credit Used -- HOM Message Class", }, { .uname = "NCB", .ucode = 0x1000, .udesc = "VN0 Credit Used -- NCB Message Class", }, { .uname = "NCS", .ucode = 0x2000, .udesc = "VN0 Credit Used -- NCS Message Class", }, { .uname = "NDR", .ucode = 0x400, .udesc = "VN0 Credit Used -- NDR Message Class", }, { .uname = "SNP", .ucode = 0x200, .udesc = "VN0 Credit Used -- SNP Message Class", }, }; static intel_x86_umask_t bdx_unc_r3_vn1_credits_reject[]={ { .uname = "DRS", .ucode = 0x800, .udesc = "VN1 Credit Acquisition Failed on DRS -- DRS Message Class", }, { .uname = "HOM", .ucode = 0x100, .udesc = "VN1 Credit Acquisition Failed on DRS -- HOM 
Message Class", }, { .uname = "NCB", .ucode = 0x1000, .udesc = "VN1 Credit Acquisition Failed on DRS -- NCB Message Class", }, { .uname = "NCS", .ucode = 0x2000, .udesc = "VN1 Credit Acquisition Failed on DRS -- NCS Message Class", }, { .uname = "NDR", .ucode = 0x400, .udesc = "VN1 Credit Acquisition Failed on DRS -- NDR Message Class", }, { .uname = "SNP", .ucode = 0x200, .udesc = "VN1 Credit Acquisition Failed on DRS -- SNP Message Class", }, }; static intel_x86_umask_t bdx_unc_r3_vn1_credits_used[]={ { .uname = "DRS", .ucode = 0x800, .udesc = "VN1 Credit Used -- DRS Message Class", }, { .uname = "HOM", .ucode = 0x100, .udesc = "VN1 Credit Used -- HOM Message Class", }, { .uname = "NCB", .ucode = 0x1000, .udesc = "VN1 Credit Used -- NCB Message Class", }, { .uname = "NCS", .ucode = 0x2000, .udesc = "VN1 Credit Used -- NCS Message Class", }, { .uname = "NDR", .ucode = 0x400, .udesc = "VN1 Credit Used -- NDR Message Class", }, { .uname = "SNP", .ucode = 0x200, .udesc = "VN1 Credit Used -- SNP Message Class", }, }; static intel_x86_umask_t bdx_unc_r3_vna_credits_acquired[]={ { .uname = "AD", .ucode = 0x100, .udesc = "VNA credit Acquisitions -- HOM Message Class", }, { .uname = "BL", .ucode = 0x400, .udesc = "VNA credit Acquisitions -- HOM Message Class", }, }; static intel_x86_umask_t bdx_unc_r3_vna_credits_reject[]={ { .uname = "DRS", .ucode = 0x800, .udesc = "VNA Credit Reject -- DRS Message Class", }, { .uname = "HOM", .ucode = 0x100, .udesc = "VNA Credit Reject -- HOM Message Class", }, { .uname = "NCB", .ucode = 0x1000, .udesc = "VNA Credit Reject -- NCB Message Class", }, { .uname = "NCS", .ucode = 0x2000, .udesc = "VNA Credit Reject -- NCS Message Class", }, { .uname = "NDR", .ucode = 0x400, .udesc = "VNA Credit Reject -- NDR Message Class", }, { .uname = "SNP", .ucode = 0x200, .udesc = "VNA Credit Reject -- SNP Message Class", }, }; static intel_x86_entry_t intel_bdx_unc_r3_pe[]={ { .name = "UNC_R3_CLOCKTICKS", .code = 0x1, .desc = "Counts the number of 
uclks in the QPI uclk domain. This could be slightly different than the count in the Ubox because of enable/freeze delays. However, because the QPI Agent is close to the Ubox, they generally should not diverge by more than a handful of cycles.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x7, }, { .name = "UNC_R3_C_HI_AD_CREDITS_EMPTY", .code = 0x1f, .desc = "No credits available to send to Cbox on the AD Ring (covers higher CBoxes)", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_c_hi_ad_credits_empty, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_c_hi_ad_credits_empty), }, { .name = "UNC_R3_C_LO_AD_CREDITS_EMPTY", .code = 0x22, .desc = "No credits available to send to Cbox on the AD Ring (covers lower CBoxes)", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_c_lo_ad_credits_empty, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_c_lo_ad_credits_empty), }, { .name = "UNC_R3_HA_R2_BL_CREDITS_EMPTY", .code = 0x2d, .desc = "No credits available to send to either HA or R2 on the BL Ring", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_ha_r2_bl_credits_empty, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_ha_r2_bl_credits_empty), }, { .name = "UNC_R3_QPI0_AD_CREDITS_EMPTY", .code = 0x20, .desc = "No credits available to send to QPI0 on the AD Ring", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_qpi0_ad_credits_empty, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_qpi0_ad_credits_empty), }, { .name = "UNC_R3_QPI0_BL_CREDITS_EMPTY", .code = 0x21, .desc = "No credits available to send to QPI0 on the BL Ring", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_qpi0_bl_credits_empty, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_qpi0_bl_credits_empty), }, { .name = "UNC_R3_QPI1_AD_CREDITS_EMPTY", .code = 0x2e, .desc = "No credits available to send to QPI1 on the AD Ring", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = 
bdx_unc_r3_qpi0_ad_credits_empty, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_qpi0_ad_credits_empty), }, { .name = "UNC_R3_QPI1_BL_CREDITS_EMPTY", .code = 0x2f, .desc = "No credits available to send to QPI1 on the BL Ring", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_qpi0_ad_credits_empty, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_qpi0_ad_credits_empty), }, { .name = "UNC_R3_RING_AD_USED", .code = 0x7, .desc = "Counts the number of cycles that the AD ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x7, .ngrp = 1, .umasks = bdx_unc_r3_ring_ad_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_ring_ad_used), }, { .name = "UNC_R3_RING_AK_USED", .code = 0x8, .desc = "Counts the number of cycles that the AK ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x7, .ngrp = 1, .umasks = bdx_unc_r3_ring_ad_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_ring_ad_used), }, { .name = "UNC_R3_RING_BL_USED", .code = 0x9, .desc = "Counts the number of cycles that the BL ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x7, .ngrp = 1, .umasks = bdx_unc_r3_ring_ad_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_ring_ad_used), }, { .name = "UNC_R3_RING_IV_USED", .code = 0xa, .desc = "Counts the number of cycles that the IV ring is being used at this ring stop. 
This includes when packets are passing by and when packets are being sent, but does not include when packets are being sunk into the ring stop.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x7, .ngrp = 1, .umasks = bdx_unc_r3_ring_iv_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_ring_iv_used), }, { .name = "UNC_R3_RING_SINK_STARVED", .code = 0xe, .desc = "Number of cycles the ringstop is in starvation (per ring)", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x7, .ngrp = 1, .umasks = bdx_unc_r3_ring_sink_starved, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_ring_sink_starved), }, { .name = "UNC_R3_RXR_CYCLES_NE", .code = 0x10, .desc = "Counts the number of cycles when the QPI Ingress is not empty. This tracks one of the three rings that are used by the QPI agent. This can be used in conjunction with the QPI Ingress Occupancy Accumulator event in order to calculate average queue occupancy. Multiple ingress buffers can be tracked at a given time using multiple counters.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_rxr_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_rxr_cycles_ne), }, { .name = "UNC_R3_RXR_CYCLES_NE_VN1", .code = 0x14, .desc = "Counts the number of cycles when the QPI VN1 Ingress is not empty. This tracks one of the three rings that are used by the QPI agent. This can be used in conjunction with the QPI VN1 Ingress Occupancy Accumulator event in order to calculate average queue occupancy. Multiple ingress buffers can be tracked at a given time using multiple counters.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_rxr_cycles_ne_vn1, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_rxr_cycles_ne_vn1), }, { .name = "UNC_R3_RXR_INSERTS", .code = 0x11, .desc = "Counts the number of allocations into the QPI Ingress. This tracks one of the three rings that are used by the QPI agent. This can be used in conjunction with the QPI Ingress Occupancy Accumulator event in order to calculate average queue latency. 
Multiple ingress buffers can be tracked at a given time using multiple counters.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_rxr_inserts, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_rxr_inserts), }, { .name = "UNC_R3_RXR_INSERTS_VN1", .code = 0x15, .desc = "Counts the number of allocations into the QPI VN1 Ingress. This tracks one of the three rings that are used by the QPI agent. This can be used in conjunction with the QPI VN1 Ingress Occupancy Accumulator event in order to calculate average queue latency. Multiple ingress buffers can be tracked at a given time using multiple counters.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_rxr_inserts, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_rxr_inserts), }, { .name = "UNC_R3_RXR_OCCUPANCY_VN1", .code = 0x13, .desc = "Accumulates the occupancy of a given QPI VN1 Ingress queue in each cycle. This tracks one of the three ring Ingress buffers. This can be used with the QPI VN1 Ingress Not Empty event to calculate average occupancy or the QPI VN1 Ingress Allocations event in order to calculate average queuing latency.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x1, .ngrp = 1, .umasks = bdx_unc_r3_rxr_inserts, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_rxr_inserts), }, { .name = "UNC_R3_SBO0_CREDITS_ACQUIRED", .code = 0x28, .desc = "Number of Sbo 0 credits acquired in a given cycle, per ring.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_sbo0_credits_acquired, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_sbo0_credits_acquired), }, { .name = "UNC_R3_SBO1_CREDITS_ACQUIRED", .code = 0x29, .desc = "Number of Sbo 1 credits acquired in a given cycle, per ring.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_sbo1_credits_acquired, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_sbo1_credits_acquired), }, { .name = "UNC_R3_STALL_NO_SBO_CREDIT", .code = 0x2c, .desc = "Number of cycles Egress is stalled waiting for an Sbo
credit to become available. Per Sbo, per Ring.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_stall_no_sbo_credit, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_stall_no_sbo_credit), }, { .name = "UNC_R3_TXR_NACK", .code = 0x26, .desc = "TBD", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_txr_nack, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_txr_nack), }, { .name = "UNC_R3_VN0_CREDITS_REJECT", .code = 0x37, .desc = "Number of times a request failed to acquire a DRS VN0 credit. In order for a request to be transferred across QPI, it must be guaranteed to have a flit buffer on the remote socket to sink into. There are two credit pools, VNA and VN0. VNA is a shared pool used to achieve high performance. The VN0 pool has reserved entries for each message class and is used to prevent deadlock. Requests first attempt to acquire a VNA credit, and then fall back to VN0 if they fail. This therefore counts the number of times when a request failed to acquire either a VNA or VN0 credit and is delayed. This should generally be a rare situation.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_vn0_credits_reject, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_vn0_credits_reject), }, { .name = "UNC_R3_VN0_CREDITS_USED", .code = 0x36, .desc = "Number of times a VN0 credit was used on the DRS message channel. In order for a request to be transferred across QPI, it must be guaranteed to have a flit buffer on the remote socket to sink into. There are two credit pools, VNA and VN0. VNA is a shared pool used to achieve high performance. The VN0 pool has reserved entries for each message class and is used to prevent deadlock. Requests first attempt to acquire a VNA credit, and then fall back to VN0 if they fail. This counts the number of times a VN0 credit was used. Note that a single VN0 credit holds access to potentially multiple flit buffers. 
For example, a transfer that uses VNA could use 9 flit buffers and in that case uses 9 credits. A transfer on VN0 will only count a single credit even though it may use multiple buffers.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_vn0_credits_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_vn0_credits_used), }, { .name = "UNC_R3_VN1_CREDITS_REJECT", .code = 0x39, .desc = "Number of times a request failed to acquire a VN1 credit. In order for a request to be transferred across QPI, it must be guaranteed to have a flit buffer on the remote socket to sink into. There are two credit pools, VNA and VN1. VNA is a shared pool used to achieve high performance. The VN1 pool has reserved entries for each message class and is used to prevent deadlock. Requests first attempt to acquire a VNA credit, and then fall back to VN1 if they fail. This therefore counts the number of times when a request failed to acquire either a VNA or VN1 credit and is delayed. This should generally be a rare situation.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_vn1_credits_reject, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_vn1_credits_reject), }, { .name = "UNC_R3_VN1_CREDITS_USED", .code = 0x38, .desc = "Number of times a VN1 credit was used on the DRS message channel. In order for a request to be transferred across QPI, it must be guaranteed to have a flit buffer on the remote socket to sink into. There are two credit pools, VNA and VN1. VNA is a shared pool used to achieve high performance. The VN1 pool has reserved entries for each message class and is used to prevent deadlock. Requests first attempt to acquire a VNA credit, and then fall back to VN1 if they fail. This counts the number of times a VN1 credit was used. Note that a single VN1 credit holds access to potentially multiple flit buffers. For example, a transfer that uses VNA could use 9 flit buffers and in that case uses 9 credits. 
A transfer on VN1 will only count a single credit even though it may use multiple buffers.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_vn1_credits_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_vn1_credits_used), }, { .name = "UNC_R3_VNA_CREDITS_ACQUIRED", .code = 0x33, .desc = "Number of QPI VNA Credit acquisitions. This event can be used in conjunction with the VNA In-Use Accumulator to calculate the average lifetime of a credit holder. VNA credits are used by all message classes in order to communicate across QPI. If a packet is unable to acquire credits, it will then attempt to use credits from the VN0 pool. Note that a single packet may require multiple flit buffers (i.e. when data is being transferred). Therefore, this event will increment by the number of credits acquired in each cycle. Filtering based on message class is not provided. One can count the number of packets transferred in a given message class using a qfclk event.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_vna_credits_acquired, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_vna_credits_acquired), }, { .name = "UNC_R3_VNA_CREDITS_REJECT", .code = 0x34, .desc = "Number of attempted VNA credit acquisitions that were rejected because the VNA credit pool was full (or almost full). It is possible to filter this event by message class. Some packets use more than one flit buffer, and therefore must acquire multiple credits. Therefore, one could get a reject even if the VNA credits were not fully used up. The VNA pool is generally used to provide the bulk of the QPI bandwidth (as opposed to the VN0 pool which is used to guarantee forward progress). VNA credits can run out if the flit buffer on the receiving side starts to queue up substantially.
This can happen if the rest of the uncore is unable to drain the requests fast enough.", .modmsk = BDX_UNC_R3QPI_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_r3_vna_credits_reject, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_r3_vna_credits_reject), }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_bdx_unc_sbo_events.h000066400000000000000000000317361502707512200252650ustar00rootroot00000000000000/* * Copyright (c) 2017 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: bdx_unc_sbo */ static intel_x86_umask_t bdx_unc_s_ring_ad_used[]={ { .uname = "DOWN_EVEN", .ucode = 0x400, .udesc = "Down and Even", }, { .uname = "DOWN_ODD", .ucode = 0x800, .udesc = "Down and Odd", }, { .uname = "UP_EVEN", .ucode = 0x100, .udesc = "Up and Even", }, { .uname = "UP_ODD", .ucode = 0x200, .udesc = "Up and Odd", }, { .uname = "UP", .ucode = 0x300, .udesc = "Up", .uflags= INTEL_X86_NCOMBO, }, { .uname = "DOWN", .ucode = 0xcc00, .udesc = "Down", .uflags= INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_s_ring_bounces[]={ { .uname = "AD_CACHE", .ucode = 0x100, .udesc = "Number of LLC responses that bounced on the Ring. -- ", }, { .uname = "AK_CORE", .ucode = 0x200, .udesc = "Number of LLC responses that bounced on the Ring. -- Acknowledgements to core", }, { .uname = "BL_CORE", .ucode = 0x400, .udesc = "Number of LLC responses that bounced on the Ring. -- Data Responses to core", }, { .uname = "IV_CORE", .ucode = 0x800, .udesc = "Number of LLC responses that bounced on the Ring.
-- Snoops of processor's cache.", }, }; static intel_x86_umask_t bdx_unc_s_ring_iv_used[]={ { .uname = "DN", .ucode = 0xc00, .udesc = "IV Ring in Use -- Down", .uflags= INTEL_X86_NCOMBO, }, { .uname = "UP", .ucode = 0x300, .udesc = "IV Ring in Use -- Up", .uflags= INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_s_rxr_bypass[]={ { .uname = "AD_BNC", .ucode = 0x200, .udesc = "Bypass -- AD - Bounces", .uflags= INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .ucode = 0x100, .udesc = "Bypass -- AD - Credits", .uflags= INTEL_X86_NCOMBO, }, { .uname = "AK", .ucode = 0x1000, .udesc = "Bypass -- AK", .uflags= INTEL_X86_NCOMBO, }, { .uname = "BL_BNC", .ucode = 0x800, .udesc = "Bypass -- BL - Bounces", .uflags= INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .ucode = 0x400, .udesc = "Bypass -- BL - Credits", .uflags= INTEL_X86_NCOMBO, }, { .uname = "IV", .ucode = 0x2000, .udesc = "Bypass -- IV", .uflags= INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_s_rxr_inserts[]={ { .uname = "AD_BNC", .ucode = 0x200, .udesc = "Ingress Allocations -- AD - Bounces", }, { .uname = "AD_CRD", .ucode = 0x100, .udesc = "Ingress Allocations -- AD - Credits", }, { .uname = "AK", .ucode = 0x1000, .udesc = "Ingress Allocations -- AK", }, { .uname = "BL_BNC", .ucode = 0x800, .udesc = "Ingress Allocations -- BL - Bounces", }, { .uname = "BL_CRD", .ucode = 0x400, .udesc = "Ingress Allocations -- BL - Credits", }, { .uname = "IV", .ucode = 0x2000, .udesc = "Ingress Allocations -- IV", }, }; static intel_x86_umask_t bdx_unc_s_rxr_occupancy[]={ { .uname = "AD_BNC", .ucode = 0x200, .udesc = "Ingress Occupancy -- AD - Bounces", .uflags= INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .ucode = 0x100, .udesc = "Ingress Occupancy -- AD - Credits", .uflags= INTEL_X86_NCOMBO, }, { .uname = "AK", .ucode = 0x1000, .udesc = "Ingress Occupancy -- AK", .uflags= INTEL_X86_NCOMBO, }, { .uname = "BL_BNC", .ucode = 0x800, .udesc = "Ingress Occupancy -- BL - Bounces", .uflags= INTEL_X86_NCOMBO, }, { .uname = "BL_CRD",
.ucode = 0x400, .udesc = "Ingress Occupancy -- BL - Credits", .uflags= INTEL_X86_NCOMBO, }, { .uname = "IV", .ucode = 0x2000, .udesc = "Ingress Occupancy -- IV", .uflags= INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t bdx_unc_s_txr_ads_used[]={ { .uname = "AD", .ucode = 0x100, .udesc = "TBD", }, { .uname = "AK", .ucode = 0x200, .udesc = "TBD", }, { .uname = "BL", .ucode = 0x400, .udesc = "TBD", }, }; static intel_x86_umask_t bdx_unc_s_txr_inserts[]={ { .uname = "AD_BNC", .ucode = 0x200, .udesc = "Egress Allocations -- AD - Bounces", }, { .uname = "AD_CRD", .ucode = 0x100, .udesc = "Egress Allocations -- AD - Credits", }, { .uname = "AK", .ucode = 0x1000, .udesc = "Egress Allocations -- AK", }, { .uname = "BL_BNC", .ucode = 0x800, .udesc = "Egress Allocations -- BL - Bounces", }, { .uname = "BL_CRD", .ucode = 0x400, .udesc = "Egress Allocations -- BL - Credits", }, { .uname = "IV", .ucode = 0x2000, .udesc = "Egress Allocations -- IV", }, }; static intel_x86_umask_t bdx_unc_s_txr_occupancy[]={ { .uname = "AD_BNC", .ucode = 0x200, .udesc = "Egress Occupancy -- AD - Bounces", }, { .uname = "AD_CRD", .ucode = 0x100, .udesc = "Egress Occupancy -- AD - Credits", }, { .uname = "AK", .ucode = 0x1000, .udesc = "Egress Occupancy -- AK", }, { .uname = "BL_BNC", .ucode = 0x800, .udesc = "Egress Occupancy -- BL - Bounces", }, { .uname = "BL_CRD", .ucode = 0x400, .udesc = "Egress Occupancy -- BL - Credits", }, { .uname = "IV", .ucode = 0x2000, .udesc = "Egress Occupancy -- IV", }, }; static intel_x86_umask_t bdx_unc_s_txr_ordering[]={ { .uname = "IVSNOOPGO_UP", .ucode = 0x100, .udesc = "TBD", }, { .uname = "IVSNOOP_DN", .ucode = 0x200, .udesc = "TBD", }, { .uname = "AK_U2C_UP_EVEN", .ucode = 0x400, .udesc = "TBD", }, { .uname = "AK_U2C_UP_ODD", .ucode = 0x800, .udesc = "TBD", }, { .uname = "AK_U2C_DN_EVEN", .ucode = 0x1000, .udesc = "TBD", }, { .uname = "AK_U2C_DN_ODD", .ucode = 0x2000, .udesc = "TBD", }, }; static intel_x86_entry_t intel_bdx_unc_s_pe[]={ { .name = 
"UNC_S_BOUNCE_CONTROL", .code = 0xa, .desc = "TBD", .modmsk = BDX_UNC_SBO_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_S_CLOCKTICKS", .code = 0x0, .desc = "TBD", .modmsk = BDX_UNC_SBO_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_S_FAST_ASSERTED", .code = 0x9, .desc = "Counts the number of cycles either the local or incoming distress signals are asserted. Incoming distress includes up, dn and across.", .modmsk = BDX_UNC_SBO_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_S_RING_AD_USED", .code = 0x1b, .desc = "Counts the number of cycles that the AD ring is being used at this ring stop. This includes when packets are passing by and when packets are being sent, but does not include when packets are being sunk into the ring stop. We really have two rings in BDX -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the rhe ring.", .modmsk = BDX_UNC_SBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_s_ring_ad_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_ring_ad_used), }, { .name = "UNC_S_RING_AK_USED", .code = 0x1c, .desc = "Counts the number of cycles that the AK ring is being used at this ring stop. This includes when packets are passing by and when packets are being sent, but does not include when packets are being sunk into the ring stop. We really have two rings in BDX -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. 
The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = BDX_UNC_SBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_s_ring_ad_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_ring_ad_used), }, { .name = "UNC_S_RING_BL_USED", .code = 0x1d, .desc = "Counts the number of cycles that the BL ring is being used at this ring stop. This includes when packets are passing by and when packets are being sent, but does not include when packets are being sunk into the ring stop. We really have two rings in BDX -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = BDX_UNC_SBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_s_ring_ad_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_ring_ad_used), }, { .name = "UNC_S_RING_BOUNCES", .code = 0x5, .desc = "TBD", .modmsk = BDX_UNC_SBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_s_ring_bounces, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_ring_bounces), }, { .name = "UNC_S_RING_IV_USED", .code = 0x1e, .desc = "Counts the number of cycles that the IV ring is being used at this ring stop. This includes when packets are passing by and when packets are being sent, but does not include when packets are being sunk into the ring stop. There is only 1 IV ring in BDX. Therefore, if one wants to monitor the Even ring, they should select both UP_EVEN and DN_EVEN.
To monitor the Odd ring, they should select both UP_ODD and DN_ODD.", .modmsk = BDX_UNC_SBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_s_ring_iv_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_ring_iv_used), }, { .name = "UNC_S_RXR_BYPASS", .code = 0x12, .desc = "Bypass the Sbo Ingress.", .modmsk = BDX_UNC_SBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_s_rxr_bypass, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_rxr_bypass), }, { .name = "UNC_S_RXR_INSERTS", .code = 0x13, .desc = "Number of allocations into the Sbo Ingress. The Ingress is used to queue up requests received from the ring.", .modmsk = BDX_UNC_SBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_s_rxr_inserts, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_rxr_inserts), }, { .name = "UNC_S_RXR_OCCUPANCY", .code = 0x11, .desc = "Occupancy event for the Ingress buffers in the Sbo. The Ingress is used to queue up requests received from the ring.", .modmsk = BDX_UNC_SBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_s_rxr_occupancy, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_rxr_occupancy), }, { .name = "UNC_S_TXR_ADS_USED", .code = 0x4, .desc = "TBD", .modmsk = BDX_UNC_SBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_s_txr_ads_used, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_txr_ads_used), }, { .name = "UNC_S_TXR_INSERTS", .code = 0x2, .desc = "Number of allocations into the Sbo Egress. The Egress is used to queue up requests destined for the ring.", .modmsk = BDX_UNC_SBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_s_txr_inserts, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_txr_inserts), }, { .name = "UNC_S_TXR_OCCUPANCY", .code = 0x1, .desc = "Occupancy event for the Egress buffers in the Sbo.
The egress is used to queue up requests destined for the ring.", .modmsk = BDX_UNC_SBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_s_txr_occupancy, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_txr_occupancy), }, { .name = "UNC_S_TXR_ORDERING", .code = 0x7, .desc = "TBD", .modmsk = BDX_UNC_SBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = bdx_unc_s_txr_ordering, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_s_txr_ordering), }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_bdx_unc_ubo_events.h000066400000000000000000000047161502707512200252650ustar00rootroot00000000000000/* * Copyright (c) 2017 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux.
* * PMU: bdx_unc_ubo */ static intel_x86_umask_t bdx_unc_u_event_msg[]={ { .uname = "DOORBELL_RCVD", .ucode = 0x800, .udesc = "VLW Received", .uflags = INTEL_X86_DFL, }, }; static intel_x86_umask_t bdx_unc_u_phold_cycles[]={ { .uname = "ASSERT_TO_ACK", .ucode = 0x100, .udesc = "Cycles PHOLD Assert to Ack. Assert to ACK", .uflags = INTEL_X86_DFL, }, }; static intel_x86_entry_t intel_bdx_unc_u_pe[]={ { .name = "UNC_U_EVENT_MSG", .code = 0x42, .desc = "Virtual Logical Wire (legacy) messages were received from uncore", .modmsk = BDX_UNC_UBO_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_u_event_msg, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_u_event_msg), }, { .name = "UNC_U_PHOLD_CYCLES", .code = 0x45, .desc = "PHOLD cycles. Filter from source CoreID.", .modmsk = BDX_UNC_UBO_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = bdx_unc_u_phold_cycles, .numasks= LIBPFM_ARRAY_SIZE(bdx_unc_u_phold_cycles), }, { .name = "UNC_U_RACU_REQUESTS", .code = 0x46, .desc = "Number of outstanding register requests within message channel tracker", .modmsk = BDX_UNC_UBO_ATTRS, .cntmsk = 0x3, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_core_events.h000066400000000000000000001474401502707512200237260ustar00rootroot00000000000000/* * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software.
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. * * PMU: core (Intel Core) */ static const intel_x86_umask_t core_rs_uops_dispatched_cycles[]={ { .uname = "PORT_0", .udesc = "On port 0", .ucode = 0x100, }, { .uname = "PORT_1", .udesc = "On port 1", .ucode = 0x200, }, { .uname = "PORT_2", .udesc = "On port 2", .ucode = 0x400, }, { .uname = "PORT_3", .udesc = "On port 3", .ucode = 0x800, }, { .uname = "PORT_4", .udesc = "On port 4", .ucode = 0x1000, }, { .uname = "PORT_5", .udesc = "On port 5", .ucode = 0x2000, }, { .uname = "ANY", .udesc = "On any port", .uequiv = "PORT_0:PORT_1:PORT_2:PORT_3:PORT_4:PORT_5", .ucode = 0x3f00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t core_load_block[]={ { .uname = "STA", .udesc = "Loads blocked by a preceding store with unknown address", .ucode = 0x200, }, { .uname = "STD", .udesc = "Loads blocked by a preceding store with unknown data", .ucode = 0x400, }, { .uname = "OVERLAP_STORE", .udesc = "Loads that partially overlap an earlier store, or 4K aliased with a previous store", .ucode = 0x800, }, { .uname = "UNTIL_RETIRE", .udesc = "Loads blocked until retirement", .ucode = 0x1000, }, { .uname = "L1D", .udesc = "Loads blocked by the L1 data cache", .ucode = 0x2000, }, }; static const intel_x86_umask_t core_store_block[]={ { .uname = "ORDER", .udesc = "Cycles while store is waiting for a preceding store to be globally observed",
.ucode = 0x200, }, { .uname = "SNOOP", .udesc = "A store is blocked due to a conflict with an external or internal snoop", .ucode = 0x800, }, }; static const intel_x86_umask_t core_sse_pre_exec[]={ { .uname = "NTA", .udesc = "Streaming SIMD Extensions (SSE) Prefetch NTA instructions executed", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L1", .udesc = "Streaming SIMD Extensions (SSE) PrefetchT0 instructions executed", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L2", .udesc = "Streaming SIMD Extensions (SSE) PrefetchT1 and PrefetchT2 instructions executed", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "STORES", .udesc = "Streaming SIMD Extensions (SSE) Weakly-ordered store instructions executed", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t core_dtlb_misses[]={ { .uname = "ANY", .udesc = "Any memory access that missed the DTLB", .ucode = 0x100, .uflags= INTEL_X86_DFL, }, { .uname = "MISS_LD", .udesc = "DTLB misses due to load operations", .ucode = 0x200, }, { .uname = "L0_MISS_LD", .udesc = "L0 DTLB misses due to load operations", .ucode = 0x400, }, { .uname = "MISS_ST", .udesc = "DTLB misses due to store operations", .ucode = 0x800, }, }; static const intel_x86_umask_t core_memory_disambiguation[]={ { .uname = "RESET", .udesc = "Memory disambiguation reset cycles", .ucode = 0x100, }, { .uname = "SUCCESS", .udesc = "Number of loads that were successfully disambiguated", .ucode = 0x200, }, }; static const intel_x86_umask_t core_page_walks[]={ { .uname = "COUNT", .udesc = "Number of page-walks executed", .ucode = 0x100, }, { .uname = "CYCLES", .udesc = "Duration of page-walks in core cycles", .ucode = 0x200, }, }; static const intel_x86_umask_t core_delayed_bypass[]={ { .uname = "FP", .udesc = "Delayed bypass to FP operation", .ucode = 0x0, }, { .uname = "SIMD", .udesc = "Delayed bypass to SIMD operation", .ucode = 0x100, }, { .uname = "LOAD", .udesc = "Delayed bypass to load operation", .ucode = 
0x200, }, }; static const intel_x86_umask_t core_l2_ads[]={ { .uname = "SELF", .udesc = "This core", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "BOTH_CORES", .udesc = "Both cores", .ucode = 0xc000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t core_l2_lines_in[]={ { .uname = "SELF", .udesc = "This core", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "BOTH_CORES", .udesc = "Both cores", .ucode = 0xc000, .uflags= INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "ANY", .udesc = "All inclusive", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, { .uname = "PREFETCH", .udesc = "Hardware prefetch only", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, .grpid = 1, }, { .uname = "EXCL_PREFETCH", .udesc = "Exclude hardware prefetch", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO, .grpid = 1, }, }; static const intel_x86_umask_t core_l2_ifetch[]={ { .uname = "SELF", .udesc = "This core", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "BOTH_CORES", .udesc = "Both cores", .ucode = 0xc000, .uflags= INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "MESI", .udesc = "Any cacheline access", .uequiv = "M_STATE:E_STATE:S_STATE:I_STATE", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, { .uname = "I_STATE", .udesc = "Invalid cacheline", .ucode = 0x100, .grpid = 1, }, { .uname = "S_STATE", .udesc = "Shared cacheline", .ucode = 0x200, .grpid = 1, }, { .uname = "E_STATE", .udesc = "Exclusive cacheline", .ucode = 0x400, .grpid = 1, }, { .uname = "M_STATE", .udesc = "Modified cacheline", .ucode = 0x800, .grpid = 1, }, }; static const intel_x86_umask_t core_l2_ld[]={ { .uname = "SELF", .udesc = "This core", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "BOTH_CORES", .udesc = "Both cores", .ucode = 0xc000, .uflags= INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "ANY", .udesc = "All inclusive", .ucode 
= 0x3000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, { .uname = "PREFETCH", .udesc = "Hardware prefetch only", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, .grpid = 1, }, { .uname = "EXCL_PREFETCH", .udesc = "Exclude hardware prefetch", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO, .grpid = 1, }, { .uname = "MESI", .udesc = "Any cacheline access", .uequiv = "M_STATE:E_STATE:S_STATE:I_STATE", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 2, }, { .uname = "I_STATE", .udesc = "Invalid cacheline", .ucode = 0x100, .grpid = 2, }, { .uname = "S_STATE", .udesc = "Shared cacheline", .ucode = 0x200, .grpid = 2, }, { .uname = "E_STATE", .udesc = "Exclusive cacheline", .ucode = 0x400, .grpid = 2, }, { .uname = "M_STATE", .udesc = "Modified cacheline", .ucode = 0x800, .grpid = 2, }, }; static const intel_x86_umask_t core_cpu_clk_unhalted[]={ { .uname = "CORE_P", .udesc = "Core cycles when core is not halted", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "BUS", .udesc = "Bus cycles when core is not halted. This event can give a measurement of the elapsed time. 
This event has a constant ratio with CPU_CLK_UNHALTED:REF event, which is the maximum bus to processor frequency ratio", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "NO_OTHER", .udesc = "Bus cycles when core is active and the other is halted", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t core_l1d_cache_ld[]={ { .uname = "MESI", .udesc = "Any cacheline access", .uequiv = "M_STATE:E_STATE:S_STATE:I_STATE", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "I_STATE", .udesc = "Invalid cacheline", .ucode = 0x100, }, { .uname = "S_STATE", .udesc = "Shared cacheline", .ucode = 0x200, }, { .uname = "E_STATE", .udesc = "Exclusive cacheline", .ucode = 0x400, }, { .uname = "M_STATE", .udesc = "Modified cacheline", .ucode = 0x800, }, }; static const intel_x86_umask_t core_l1d_split[]={ { .uname = "LOADS", .udesc = "Cache line split loads from the L1 data cache", .ucode = 0x100, }, { .uname = "STORES", .udesc = "Cache line split stores to the L1 data cache", .ucode = 0x200, }, }; static const intel_x86_umask_t core_sse_pre_miss[]={ { .uname = "NTA", .udesc = "Streaming SIMD Extensions (SSE) Prefetch NTA instructions missing all cache levels", .ucode = 0x0, }, { .uname = "L1", .udesc = "Streaming SIMD Extensions (SSE) PrefetchT0 instructions missing all cache levels", .ucode = 0x100, }, { .uname = "L2", .udesc = "Streaming SIMD Extensions (SSE) PrefetchT1 and PrefetchT2 instructions missing all cache levels", .ucode = 0x200, }, }; static const intel_x86_umask_t core_l1d_prefetch[]={ { .uname = "REQUESTS", .udesc = "L1 data cache prefetch requests", .ucode = 0x1000, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t core_bus_request_outstanding[]={ { .uname = "SELF", .udesc = "This core", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "BOTH_CORES", .udesc = "Both cores", .ucode = 0xc000, .uflags= INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "THIS_AGENT", .udesc
= "This agent", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, { .uname = "ALL_AGENTS", .udesc = "Any agent on the bus", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, .grpid = 1, }, }; static const intel_x86_umask_t core_bus_bnr_drv[]={ { .uname = "THIS_AGENT", .udesc = "This agent", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL_AGENTS", .udesc = "Any agent on the bus", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t core_ext_snoop[]={ { .uname = "ANY", .udesc = "Any external snoop response", .ucode = 0xb00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "CLEAN", .udesc = "External snoop CLEAN response", .ucode = 0x100, .grpid = 0, }, { .uname = "HIT", .udesc = "External snoop HIT response", .ucode = 0x200, .grpid = 0, }, { .uname = "HITM", .udesc = "External snoop HITM response", .ucode = 0x800, .grpid = 0, }, { .uname = "THIS_AGENT", .udesc = "This agent", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, { .uname = "ALL_AGENTS", .udesc = "Any agent on the bus", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, .grpid = 1, }, }; static const intel_x86_umask_t core_cmp_snoop[]={ { .uname = "ANY", .udesc = "L1 data cache is snooped by other core", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "SHARE", .udesc = "L1 data cache is snooped for sharing by other core", .ucode = 0x100, .grpid = 0, }, { .uname = "INVALIDATE", .udesc = "L1 data cache is snooped for Invalidation by other core", .ucode = 0x200, .grpid = 0, }, { .uname = "SELF", .udesc = "This core", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, { .uname = "BOTH_CORES", .udesc = "Both cores", .ucode = 0xc000, .uflags= INTEL_X86_NCOMBO, .grpid = 1, }, }; static const intel_x86_umask_t core_itlb[]={ { .uname = "SMALL_MISS", .udesc = "ITLB small page misses", .ucode = 0x200, }, { .uname = "LARGE_MISS", .udesc = "ITLB large 
page misses", .ucode = 0x1000, }, { .uname = "FLUSH", .udesc = "ITLB flushes", .ucode = 0x4000, }, { .uname = "MISSES", .udesc = "ITLB misses", .ucode = 0x1200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t core_inst_queue[]={ { .uname = "FULL", .udesc = "Cycles during which the instruction queue is full", .ucode = 0x200, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t core_macro_insts[]={ { .uname = "DECODED", .udesc = "Instructions decoded", .ucode = 0x100, }, { .uname = "CISC_DECODED", .udesc = "CISC instructions decoded", .ucode = 0x800, }, }; static const intel_x86_umask_t core_esp[]={ { .uname = "SYNCH", .udesc = "ESP register content synchronization", .ucode = 0x100, }, { .uname = "ADDITIONS", .udesc = "ESP register automatic additions", .ucode = 0x200, }, }; static const intel_x86_umask_t core_simd_uop_type_exec[]={ { .uname = "MUL", .udesc = "SIMD packed multiply micro-ops executed", .ucode = 0x100, }, { .uname = "SHIFT", .udesc = "SIMD packed shift micro-ops executed", .ucode = 0x200, }, { .uname = "PACK", .udesc = "SIMD pack micro-ops executed", .ucode = 0x400, }, { .uname = "UNPACK", .udesc = "SIMD unpack micro-ops executed", .ucode = 0x800, }, { .uname = "LOGICAL", .udesc = "SIMD packed logical micro-ops executed", .ucode = 0x1000, }, { .uname = "ARITHMETIC", .udesc = "SIMD packed arithmetic micro-ops executed", .ucode = 0x2000, }, }; static const intel_x86_umask_t core_inst_retired[]={ { .uname = "ANY_P", .udesc = "Instructions retired (Precise Event)", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "LOADS", .udesc = "Instructions retired, which contain a load", .ucode = 0x100, }, { .uname = "STORES", .udesc = "Instructions retired, which contain a store", .ucode = 0x200, }, { .uname = "OTHER", .udesc = "Instructions retired, with no load or store operation", .ucode = 0x400, }, }; static const intel_x86_umask_t core_x87_ops_retired[]={ { .uname = "FXCH", .udesc = "FXCH 
instructions retired", .ucode = 0x100, }, { .uname = "ANY", .udesc = "Retired floating-point computational operations (Precise Event)", .ucode = 0xfe00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t core_uops_retired[]={ { .uname = "LD_IND_BR", .udesc = "Fused load+op or load+indirect branch retired", .ucode = 0x100, }, { .uname = "STD_STA", .udesc = "Fused store address + data retired", .ucode = 0x200, }, { .uname = "MACRO_FUSION", .udesc = "Retired instruction pairs fused into one micro-op", .ucode = 0x400, }, { .uname = "NON_FUSED", .udesc = "Non-fused micro-ops retired", .ucode = 0x800, }, { .uname = "FUSED", .udesc = "Fused micro-ops retired", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Micro-ops retired", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t core_machine_nukes[]={ { .uname = "SMC", .udesc = "Self-Modifying Code detected", .ucode = 0x100, }, { .uname = "MEM_ORDER", .udesc = "Execution pipeline restart due to memory ordering conflict or memory disambiguation misprediction", .ucode = 0x400, }, }; static const intel_x86_umask_t core_br_inst_retired[]={ { .uname = "ANY", .udesc = "Retired branch instructions", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "PRED_NOT_TAKEN", .udesc = "Retired branch instructions that were predicted not-taken", .ucode = 0x100, }, { .uname = "MISPRED_NOT_TAKEN", .udesc = "Retired branch instructions that were mispredicted not-taken", .ucode = 0x200, }, { .uname = "PRED_TAKEN", .udesc = "Retired branch instructions that were predicted taken", .ucode = 0x400, }, { .uname = "MISPRED_TAKEN", .udesc = "Retired branch instructions that were mispredicted taken", .ucode = 0x800, }, { .uname = "TAKEN", .udesc = "Retired taken branch instructions", .ucode = 0xc00, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t core_simd_inst_retired[]={ { .uname = "PACKED_SINGLE", 
.udesc = "Retired Streaming SIMD Extensions (SSE) packed-single instructions", .ucode = 0x100, }, { .uname = "SCALAR_SINGLE", .udesc = "Retired Streaming SIMD Extensions (SSE) scalar-single instructions", .ucode = 0x200, }, { .uname = "PACKED_DOUBLE", .udesc = "Retired Streaming SIMD Extensions 2 (SSE2) packed-double instructions", .ucode = 0x400, }, { .uname = "SCALAR_DOUBLE", .udesc = "Retired Streaming SIMD Extensions 2 (SSE2) scalar-double instructions", .ucode = 0x800, }, { .uname = "VECTOR", .udesc = "Retired Streaming SIMD Extensions 2 (SSE2) vector integer instructions", .ucode = 0x1000, }, { .uname = "ANY", .udesc = "Retired Streaming SIMD instructions (Precise Event)", .ucode = 0x1f00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t core_simd_comp_inst_retired[]={ { .uname = "PACKED_SINGLE", .udesc = "Retired computational Streaming SIMD Extensions (SSE) packed-single instructions", .ucode = 0x100, }, { .uname = "SCALAR_SINGLE", .udesc = "Retired computational Streaming SIMD Extensions (SSE) scalar-single instructions", .ucode = 0x200, }, { .uname = "PACKED_DOUBLE", .udesc = "Retired computational Streaming SIMD Extensions 2 (SSE2) packed-double instructions", .ucode = 0x400, }, { .uname = "SCALAR_DOUBLE", .udesc = "Retired computational Streaming SIMD Extensions 2 (SSE2) scalar-double instructions", .ucode = 0x800, }, }; static const intel_x86_umask_t core_mem_load_retired[]={ { .uname = "L1D_MISS", .udesc = "Retired loads that miss the L1 data cache (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_PEBS, }, { .uname = "L1D_LINE_MISS", .udesc = "L1 data cache line missed by retired loads (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_PEBS, }, { .uname = "L2_MISS", .udesc = "Retired loads that miss the L2 cache (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_PEBS, }, { .uname = "L2_LINE_MISS", .udesc = "L2 cache line missed by retired loads (Precise Event)", .ucode = 0x800, .uflags= 
INTEL_X86_PEBS, }, { .uname = "DTLB_MISS", .udesc = "Retired loads that miss the DTLB (Precise Event)", .ucode = 0x1000, .uflags= INTEL_X86_PEBS, }, }; static const intel_x86_umask_t core_fp_mmx_trans[]={ { .uname = "TO_FP", .udesc = "Transitions from MMX (TM) Instructions to Floating Point Instructions", .ucode = 0x200, }, { .uname = "TO_MMX", .udesc = "Transitions from Floating Point to MMX (TM) Instructions", .ucode = 0x100, }, }; static const intel_x86_umask_t core_rat_stalls[]={ { .uname = "ROB_READ_PORT", .udesc = "ROB read port stalls cycles", .ucode = 0x100, }, { .uname = "PARTIAL_CYCLES", .udesc = "Partial register stall cycles", .ucode = 0x200, }, { .uname = "FLAGS", .udesc = "Flag stall cycles", .ucode = 0x400, }, { .uname = "FPSW", .udesc = "FPU status word stall", .ucode = 0x800, }, { .uname = "ANY", .udesc = "All RAT stall cycles", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t core_seg_rename_stalls[]={ { .uname = "ES", .udesc = "Segment rename stalls - ES ", .ucode = 0x100, }, { .uname = "DS", .udesc = "Segment rename stalls - DS", .ucode = 0x200, }, { .uname = "FS", .udesc = "Segment rename stalls - FS", .ucode = 0x400, }, { .uname = "GS", .udesc = "Segment rename stalls - GS", .ucode = 0x800, }, { .uname = "ANY", .udesc = "Any (ES/DS/FS/GS) segment rename stall", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t core_seg_reg_renames[]={ { .uname = "ES", .udesc = "Segment renames - ES", .ucode = 0x100, }, { .uname = "DS", .udesc = "Segment renames - DS", .ucode = 0x200, }, { .uname = "FS", .udesc = "Segment renames - FS", .ucode = 0x400, }, { .uname = "GS", .udesc = "Segment renames - GS", .ucode = 0x800, }, { .uname = "ANY", .udesc = "Any (ES/DS/FS/GS) segment rename", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t core_resource_stalls[]={ { .uname = "ROB_FULL", .udesc = "Cycles during which the ROB is 
full", .ucode = 0x100, }, { .uname = "RS_FULL", .udesc = "Cycles during which the RS is full", .ucode = 0x200, }, { .uname = "LD_ST", .udesc = "Cycles during which the pipeline has exceeded load or store limit or waiting to commit all stores", .ucode = 0x400, }, { .uname = "FPCW", .udesc = "Cycles stalled due to FPU control word write", .ucode = 0x800, }, { .uname = "BR_MISS_CLEAR", .udesc = "Cycles stalled due to branch misprediction", .ucode = 0x1000, }, { .uname = "ANY", .udesc = "Resource related stalls", .ucode = 0x1f00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_entry_t intel_core_pe[]={ { .name = "UNHALTED_CORE_CYCLES", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x200000003ull, .code = 0x3c, }, { .name = "INSTRUCTION_RETIRED", .desc = "Count the number of instructions at retirement", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x100000003ull, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "This is an alias for INSTRUCTION_RETIRED", .modmsk = INTEL_X86_ATTRS, .equiv = "INSTRUCTION_RETIRED", .cntmsk = 0x100000003ull, .code = 0xc0, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles", .modmsk = INTEL_FIXED2_ATTRS, .cntmsk = 0x400000000ull, .code = 0x0300, /* pseudo encoding */ .flags = INTEL_X86_FIXED, }, { .name = "LLC_REFERENCES", .desc = "Count each request originating from the core to reference a cache line in the last level cache. The count may include speculation, but excludes cache line fills due to hardware prefetch. Alias to L2_RQSTS:SELF_DEMAND_MESI", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x4f2e, }, { .name = "LAST_LEVEL_CACHE_REFERENCES", .desc = "This is an alias for LLC_REFERENCES", .modmsk = INTEL_X86_ATTRS, .equiv = "LLC_REFERENCES", .cntmsk = 0x3, .code = 0x4f2e, }, { .name = "LLC_MISSES", .desc = "Count each cache miss condition for references to the last level cache.
The event count may include speculation, but excludes cache line fills due to hardware prefetch. Alias to event L2_RQSTS:SELF_DEMAND_I_STATE", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x412e, }, { .name = "LAST_LEVEL_CACHE_MISSES", .desc = "This is an alias for LLC_MISSES", .modmsk = INTEL_X86_ATTRS, .equiv = "LLC_MISSES", .cntmsk = 0x3, .code = 0x412e, }, { .name = "BRANCH_INSTRUCTIONS_RETIRED", .desc = "Count branch instructions at retirement. Specifically, this event counts the retirement of the last micro-op of a branch instruction.", .modmsk = INTEL_X86_ATTRS, .equiv = "BR_INST_RETIRED:ANY", .cntmsk = 0x3, .code = 0xc4, }, { .name = "MISPREDICTED_BRANCH_RETIRED", .desc = "Count mispredicted branch instructions at retirement. Specifically, this event counts at retirement of the last micro-op of a branch instruction in the architectural path of the execution and experienced misprediction in the branch prediction hardware.", .modmsk = INTEL_X86_ATTRS, .equiv = "BR_INST_RETIRED_MISPRED", .cntmsk = 0x3, .code = 0xc5, .flags= INTEL_X86_PEBS, }, { .name = "RS_UOPS_DISPATCHED_CYCLES", .desc = "Cycles micro-ops dispatched for execution", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0xa1, .numasks = LIBPFM_ARRAY_SIZE(core_rs_uops_dispatched_cycles), .ngrp = 1, .umasks = core_rs_uops_dispatched_cycles, }, { .name = "RS_UOPS_DISPATCHED", .desc = "Number of micro-ops dispatched for execution", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xa0, }, { .name = "RS_UOPS_DISPATCHED_NONE", .desc = "Number of cycles in which no micro-ops are dispatched for execution", .modmsk = 0x0, .equiv = "RS_UOPS_DISPATCHED:i=1:c=1", .cntmsk = 0x3, .code = 0xa0 | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), }, { .name = "LOAD_BLOCK", .desc = "Loads blocked", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x3, .numasks = LIBPFM_ARRAY_SIZE(core_load_block), .ngrp = 1, .umasks = core_load_block, }, { .name = "SB_DRAIN_CYCLES", .desc = "Cycles while stores are blocked
due to store buffer drain", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x104, }, { .name = "STORE_BLOCK", .desc = "Cycles while store is waiting", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x4, .numasks = LIBPFM_ARRAY_SIZE(core_store_block), .ngrp = 1, .umasks = core_store_block, }, { .name = "SEGMENT_REG_LOADS", .desc = "Number of segment register loads", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6, }, { .name = "SSE_PRE_EXEC", .desc = "Streaming SIMD Extensions (SSE) Prefetch instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7, .numasks = LIBPFM_ARRAY_SIZE(core_sse_pre_exec), .ngrp = 1, .umasks = core_sse_pre_exec, }, { .name = "DTLB_MISSES", .desc = "Memory accesses that missed the DTLB", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8, .numasks = LIBPFM_ARRAY_SIZE(core_dtlb_misses), .ngrp = 1, .umasks = core_dtlb_misses, }, { .name = "MEMORY_DISAMBIGUATION", .desc = "Memory disambiguation", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x9, .numasks = LIBPFM_ARRAY_SIZE(core_memory_disambiguation), .ngrp = 1, .umasks = core_memory_disambiguation, }, { .name = "PAGE_WALKS", .desc = "Number of page-walks executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc, .numasks = LIBPFM_ARRAY_SIZE(core_page_walks), .ngrp = 1, .umasks = core_page_walks, }, { .name = "FP_COMP_OPS_EXE", .desc = "Floating point computational micro-ops executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0x10, }, { .name = "FP_ASSIST", .desc = "Floating point assists", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x11, }, { .name = "MUL", .desc = "Multiply operations executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x12, }, { .name = "DIV", .desc = "Divide operations executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x13, }, { .name = "CYCLES_DIV_BUSY", .desc = "Cycles the divider is busy", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0x14, }, { .name = "IDLE_DURING_DIV", .desc = 
"Cycles the divider is busy and all other execution units are idle", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0x18, }, { .name = "DELAYED_BYPASS", .desc = "Delayed bypass", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x19, .numasks = LIBPFM_ARRAY_SIZE(core_delayed_bypass), .ngrp = 1, .umasks = core_delayed_bypass, }, { .name = "L2_ADS", .desc = "Cycles L2 address bus is in use", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x21, .numasks = LIBPFM_ARRAY_SIZE(core_l2_ads), .ngrp = 1, .umasks = core_l2_ads, }, { .name = "L2_DBUS_BUSY_RD", .desc = "Cycles the L2 transfers data to the core", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x23, .numasks = LIBPFM_ARRAY_SIZE(core_l2_ads), .ngrp = 1, .umasks = core_l2_ads, /* identical to actual umasks list for this event */ }, { .name = "L2_LINES_IN", .desc = "L2 cache misses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x24, .numasks = LIBPFM_ARRAY_SIZE(core_l2_lines_in), .ngrp = 2, .umasks = core_l2_lines_in, }, { .name = "L2_M_LINES_IN", .desc = "L2 cache line modifications", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x25, .numasks = LIBPFM_ARRAY_SIZE(core_l2_ads), .ngrp = 1, .umasks = core_l2_ads, /* identical to actual umasks list for this event */ }, { .name = "L2_LINES_OUT", .desc = "L2 cache lines evicted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x26, .numasks = LIBPFM_ARRAY_SIZE(core_l2_lines_in), .ngrp = 2, .umasks = core_l2_lines_in, /* identical to actual umasks list for this event */ }, { .name = "L2_M_LINES_OUT", .desc = "Modified lines evicted from the L2 cache", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x27, .numasks = LIBPFM_ARRAY_SIZE(core_l2_lines_in), .ngrp = 2, .umasks = core_l2_lines_in, /* identical to actual umasks list for this event */ }, { .name = "L2_IFETCH", .desc = "L2 cacheable instruction fetch requests", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x28, .numasks = LIBPFM_ARRAY_SIZE(core_l2_ifetch), .ngrp = 2, .umasks = 
core_l2_ifetch, }, { .name = "L2_LD", .desc = "L2 cache reads", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x29, .numasks = LIBPFM_ARRAY_SIZE(core_l2_ld), .ngrp = 3, .umasks = core_l2_ld, }, { .name = "L2_ST", .desc = "L2 store requests", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x2a, .numasks = LIBPFM_ARRAY_SIZE(core_l2_ifetch), .ngrp = 2, .umasks = core_l2_ifetch, /* identical to actual umasks list for this event */ }, { .name = "L2_LOCK", .desc = "L2 locked accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x2b, .numasks = LIBPFM_ARRAY_SIZE(core_l2_ifetch), .ngrp = 2, .umasks = core_l2_ifetch, /* identical to actual umasks list for this event */ }, { .name = "L2_RQSTS", .desc = "L2 cache requests", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x2e, .numasks = LIBPFM_ARRAY_SIZE(core_l2_ld), .ngrp = 3, .umasks = core_l2_ld, /* identical to actual umasks list for this event */ }, { .name = "L2_REJECT_BUSQ", .desc = "Rejected L2 cache requests", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x30, .numasks = LIBPFM_ARRAY_SIZE(core_l2_ld), .ngrp = 3, .umasks = core_l2_ld, /* identical to actual umasks list for this event */ }, { .name = "L2_NO_REQ", .desc = "Cycles no L2 cache requests are pending", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x32, .numasks = LIBPFM_ARRAY_SIZE(core_l2_ads), .ngrp = 1, .umasks = core_l2_ads, /* identical to actual umasks list for this event */ }, { .name = "EIST_TRANS", .desc = "Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x3a, }, { .name = "THERMAL_TRIP", .desc = "Number of thermal trips", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc03b, }, { .name = "CPU_CLK_UNHALTED", .desc = "Core cycles when core is not halted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x3c, .numasks = LIBPFM_ARRAY_SIZE(core_cpu_clk_unhalted), .ngrp = 1, .umasks = core_cpu_clk_unhalted, }, { .name = "L1D_CACHE_LD", .desc = "L1 
cacheable data reads", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x40, .numasks = LIBPFM_ARRAY_SIZE(core_l1d_cache_ld), .ngrp = 1, .umasks = core_l1d_cache_ld, }, { .name = "L1D_CACHE_ST", .desc = "L1 cacheable data writes", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x41, .numasks = LIBPFM_ARRAY_SIZE(core_l1d_cache_ld), .ngrp = 1, .umasks = core_l1d_cache_ld, /* identical to actual umasks list for this event */ }, { .name = "L1D_CACHE_LOCK", .desc = "L1 data cacheable locked reads", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x42, .numasks = LIBPFM_ARRAY_SIZE(core_l1d_cache_ld), .ngrp = 1, .umasks = core_l1d_cache_ld, /* identical to actual umasks list for this event */ }, { .name = "L1D_ALL_REF", .desc = "All references to the L1 data cache", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x143, }, { .name = "L1D_ALL_CACHE_REF", .desc = "L1 Data cacheable reads and writes", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x243, }, { .name = "L1D_REPL", .desc = "Cache lines allocated in the L1 data cache", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xf45, }, { .name = "L1D_M_REPL", .desc = "Modified cache lines allocated in the L1 data cache", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x46, }, { .name = "L1D_M_EVICT", .desc = "Modified cache lines evicted from the L1 data cache", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x47, }, { .name = "L1D_PEND_MISS", .desc = "Total number of outstanding L1 data cache misses at any cycle", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x48, }, { .name = "L1D_SPLIT", .desc = "Cache line split from L1 data cache", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x49, .numasks = LIBPFM_ARRAY_SIZE(core_l1d_split), .ngrp = 1, .umasks = core_l1d_split, }, { .name = "SSE_PRE_MISS", .desc = "Streaming SIMD Extensions (SSE) instructions missing all cache levels", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x4b, .numasks = LIBPFM_ARRAY_SIZE(core_sse_pre_miss), .ngrp = 1, 
.umasks = core_sse_pre_miss, }, { .name = "LOAD_HIT_PRE", .desc = "Load operations conflicting with a software prefetch to the same address", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x4c, }, { .name = "L1D_PREFETCH", .desc = "L1 data cache prefetch", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x4e, .numasks = LIBPFM_ARRAY_SIZE(core_l1d_prefetch), .ngrp = 1, .umasks = core_l1d_prefetch, }, { .name = "BUS_REQUEST_OUTSTANDING", .desc = "Number of pending full cache line read transactions on the bus occurring in each cycle", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x60, .numasks = LIBPFM_ARRAY_SIZE(core_bus_request_outstanding), .ngrp = 2, .umasks = core_bus_request_outstanding, }, { .name = "BUS_BNR_DRV", .desc = "Number of Bus Not Ready signals asserted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x61, .numasks = LIBPFM_ARRAY_SIZE(core_bus_bnr_drv), .ngrp = 1, .umasks = core_bus_bnr_drv, }, { .name = "BUS_DRDY_CLOCKS", .desc = "Bus cycles when data is sent on the bus", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x62, .numasks = LIBPFM_ARRAY_SIZE(core_bus_bnr_drv), .ngrp = 1, .umasks = core_bus_bnr_drv, /* identical to actual umasks list for this event */ }, { .name = "BUS_LOCK_CLOCKS", .desc = "Bus cycles when a LOCK signal is asserted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x63, .numasks = LIBPFM_ARRAY_SIZE(core_bus_request_outstanding), .ngrp = 2, .umasks = core_bus_request_outstanding, /* identical to actual umasks list for this event */ }, { .name = "BUS_DATA_RCV", .desc = "Bus cycles while processor receives data", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x64, .numasks = LIBPFM_ARRAY_SIZE(core_l2_ads), .ngrp = 1, .umasks = core_l2_ads, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_BRD", .desc = "Burst read bus transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x65, .numasks = LIBPFM_ARRAY_SIZE(core_bus_request_outstanding), .ngrp = 2, .umasks = 
core_bus_request_outstanding, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_RFO", .desc = "RFO bus transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x66, .numasks = LIBPFM_ARRAY_SIZE(core_bus_request_outstanding), .ngrp = 2, .umasks = core_bus_request_outstanding, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_WB", .desc = "Explicit writeback bus transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x67, .numasks = LIBPFM_ARRAY_SIZE(core_bus_request_outstanding), .ngrp = 2, .umasks = core_bus_request_outstanding, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_IFETCH", .desc = "Instruction-fetch bus transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x68, .numasks = LIBPFM_ARRAY_SIZE(core_bus_request_outstanding), .ngrp = 2, .umasks = core_bus_request_outstanding, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_INVAL", .desc = "Invalidate bus transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x69, .numasks = LIBPFM_ARRAY_SIZE(core_bus_request_outstanding), .ngrp = 2, .umasks = core_bus_request_outstanding, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_PWR", .desc = "Partial write bus transaction", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6a, .numasks = LIBPFM_ARRAY_SIZE(core_bus_request_outstanding), .ngrp = 2, .umasks = core_bus_request_outstanding, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_P", .desc = "Partial bus transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6b, .numasks = LIBPFM_ARRAY_SIZE(core_bus_request_outstanding), .ngrp = 2, .umasks = core_bus_request_outstanding, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_IO", .desc = "IO bus transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6c, .numasks = 
LIBPFM_ARRAY_SIZE(core_bus_request_outstanding), .ngrp = 2, .umasks = core_bus_request_outstanding, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_DEF", .desc = "Deferred bus transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6d, .numasks = LIBPFM_ARRAY_SIZE(core_bus_request_outstanding), .ngrp = 2, .umasks = core_bus_request_outstanding, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_BURST", .desc = "Burst (full cache-line) bus transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6e, .numasks = LIBPFM_ARRAY_SIZE(core_bus_request_outstanding), .ngrp = 2, .umasks = core_bus_request_outstanding, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_MEM", .desc = "Memory bus transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6f, .numasks = LIBPFM_ARRAY_SIZE(core_bus_request_outstanding), .ngrp = 2, .umasks = core_bus_request_outstanding, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_ANY", .desc = "All bus transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x70, .numasks = LIBPFM_ARRAY_SIZE(core_bus_request_outstanding), .ngrp = 2, .umasks = core_bus_request_outstanding, /* identical to actual umasks list for this event */ }, { .name = "EXT_SNOOP", .desc = "External snoops responses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x77, .numasks = LIBPFM_ARRAY_SIZE(core_ext_snoop), .ngrp = 2, .umasks = core_ext_snoop, }, { .name = "CMP_SNOOP", .desc = "L1 data cache is snooped by other core", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x78, .numasks = LIBPFM_ARRAY_SIZE(core_cmp_snoop), .ngrp = 2, .umasks = core_cmp_snoop, }, { .name = "BUS_HIT_DRV", .desc = "HIT signal asserted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7a, .numasks = LIBPFM_ARRAY_SIZE(core_bus_bnr_drv), .ngrp = 1, .umasks = core_bus_bnr_drv, /* identical to actual umasks list for this event */ }, { .name = 
"BUS_HITM_DRV", .desc = "HITM signal asserted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7b, .numasks = LIBPFM_ARRAY_SIZE(core_bus_bnr_drv), .ngrp = 1, .umasks = core_bus_bnr_drv, /* identical to actual umasks list for this event */ }, { .name = "BUSQ_EMPTY", .desc = "Bus queue is empty", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7d, .numasks = LIBPFM_ARRAY_SIZE(core_bus_bnr_drv), .ngrp = 1, .umasks = core_bus_bnr_drv, /* identical to actual umasks list for this event */ }, { .name = "SNOOP_STALL_DRV", .desc = "Bus stalled for snoops", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7e, .numasks = LIBPFM_ARRAY_SIZE(core_bus_request_outstanding), .ngrp = 2, .umasks = core_bus_request_outstanding, /* identical to actual umasks list for this event */ }, { .name = "BUS_IO_WAIT", .desc = "IO requests waiting in the bus queue", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7f, .numasks = LIBPFM_ARRAY_SIZE(core_l2_ads), .ngrp = 1, .umasks = core_l2_ads, /* identical to actual umasks list for this event */ }, { .name = "L1I_READS", .desc = "Instruction fetches", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x80, }, { .name = "L1I_MISSES", .desc = "Instruction Fetch Unit misses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x81, }, { .name = "ITLB", .desc = "ITLB small page misses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x82, .numasks = LIBPFM_ARRAY_SIZE(core_itlb), .ngrp = 1, .umasks = core_itlb, }, { .name = "INST_QUEUE", .desc = "Cycles during which the instruction queue is full", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x83, .numasks = LIBPFM_ARRAY_SIZE(core_inst_queue), .ngrp = 1, .umasks = core_inst_queue, }, { .name = "CYCLES_L1I_MEM_STALLED", .desc = "Cycles during which instruction fetches are stalled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x86, }, { .name = "ILD_STALL", .desc = "Instruction Length Decoder stall cycles due to a length changing prefix", .modmsk = INTEL_X86_ATTRS, .cntmsk = 
0x3, .code = 0x87, }, { .name = "BR_INST_EXEC", .desc = "Branch instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x88, }, { .name = "BR_MISSP_EXEC", .desc = "Mispredicted branch instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x89, }, { .name = "BR_BAC_MISSP_EXEC", .desc = "Branch instructions mispredicted at decoding", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8a, }, { .name = "BR_CND_EXEC", .desc = "Conditional branch instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8b, }, { .name = "BR_CND_MISSP_EXEC", .desc = "Mispredicted conditional branch instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8c, }, { .name = "BR_IND_EXEC", .desc = "Indirect branch instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8d, }, { .name = "BR_IND_MISSP_EXEC", .desc = "Mispredicted indirect branch instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8e, }, { .name = "BR_RET_EXEC", .desc = "RET instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8f, }, { .name = "BR_RET_MISSP_EXEC", .desc = "Mispredicted RET instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x90, }, { .name = "BR_RET_BAC_MISSP_EXEC", .desc = "RET instructions executed mispredicted at decoding", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x91, }, { .name = "BR_CALL_EXEC", .desc = "CALL instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x92, }, { .name = "BR_CALL_MISSP_EXEC", .desc = "Mispredicted CALL instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x93, }, { .name = "BR_IND_CALL_EXEC", .desc = "Indirect CALL instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x94, }, { .name = "BR_TKN_BUBBLE_1", .desc = "Branch predicted taken with bubble I", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x97, }, { .name = "BR_TKN_BUBBLE_2", .desc = 
"Branch predicted taken with bubble II", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x98, }, { .name = "MACRO_INSTS", .desc = "Instructions decoded", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xaa, .numasks = LIBPFM_ARRAY_SIZE(core_macro_insts), .ngrp = 1, .umasks = core_macro_insts, }, { .name = "ESP", .desc = "ESP register content synchronization", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xab, .numasks = LIBPFM_ARRAY_SIZE(core_esp), .ngrp = 1, .umasks = core_esp, }, { .name = "SIMD_UOPS_EXEC", .desc = "SIMD micro-ops executed (excluding stores)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xb0, }, { .name = "SIMD_SAT_UOP_EXEC", .desc = "SIMD saturated arithmetic micro-ops executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xb1, }, { .name = "SIMD_UOP_TYPE_EXEC", .desc = "SIMD packed multiply micro-ops executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xb3, .numasks = LIBPFM_ARRAY_SIZE(core_simd_uop_type_exec), .ngrp = 1, .umasks = core_simd_uop_type_exec, }, { .name = "INST_RETIRED", .desc = "Instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc0, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(core_inst_retired), .ngrp = 1, .umasks = core_inst_retired, }, { .name = "X87_OPS_RETIRED", .desc = "FXCH instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc1, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(core_x87_ops_retired), .ngrp = 1, .umasks = core_x87_ops_retired, }, { .name = "UOPS_RETIRED", .desc = "Fused load+op or load+indirect branch retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc2, .numasks = LIBPFM_ARRAY_SIZE(core_uops_retired), .ngrp = 1, .umasks = core_uops_retired, }, { .name = "MACHINE_NUKES", .desc = "Self-Modifying Code detected", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc3, .numasks = LIBPFM_ARRAY_SIZE(core_machine_nukes), .ngrp = 1, .umasks = core_machine_nukes, }, { .name = "BR_INST_RETIRED", .desc = "Retired 
branch instructions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc4, .numasks = LIBPFM_ARRAY_SIZE(core_br_inst_retired), .ngrp = 1, .umasks = core_br_inst_retired, }, { .name = "BR_INST_RETIRED_MISPRED", .desc = "Retired mispredicted branch instructions (Precise_Event)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc5, .flags= INTEL_X86_PEBS, }, { .name = "CYCLES_INT_MASKED", .desc = "Cycles during which interrupts are disabled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x1c6, }, { .name = "CYCLES_INT_PENDING_AND_MASKED", .desc = "Cycles during which interrupts are pending and disabled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x2c6, }, { .name = "SIMD_INST_RETIRED", .desc = "Retired Streaming SIMD Extensions (SSE) packed-single instructions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc7, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(core_simd_inst_retired), .ngrp = 1, .umasks = core_simd_inst_retired, }, { .name = "HW_INT_RCV", .desc = "Hardware interrupts received", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc8, }, { .name = "ITLB_MISS_RETIRED", .desc = "Retired instructions that missed the ITLB", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc9, }, { .name = "SIMD_COMP_INST_RETIRED", .desc = "Retired computational Streaming SIMD Extensions (SSE) packed-single instructions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xca, .numasks = LIBPFM_ARRAY_SIZE(core_simd_comp_inst_retired), .ngrp = 1, .umasks = core_simd_comp_inst_retired, }, { .name = "MEM_LOAD_RETIRED", .desc = "Retired loads that miss the L1 data cache", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0xcb, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(core_mem_load_retired), .ngrp = 1, .umasks = core_mem_load_retired, }, { .name = "FP_MMX_TRANS", .desc = "Transitions from MMX (TM) Instructions to Floating Point Instructions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xcc, .numasks = 
LIBPFM_ARRAY_SIZE(core_fp_mmx_trans), .ngrp = 1, .umasks = core_fp_mmx_trans, }, { .name = "SIMD_ASSIST", .desc = "SIMD assists invoked", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xcd, }, { .name = "SIMD_INSTR_RETIRED", .desc = "SIMD Instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xce, }, { .name = "SIMD_SAT_INSTR_RETIRED", .desc = "Saturated arithmetic instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xcf, }, { .name = "RAT_STALLS", .desc = "ROB read port stalls cycles", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd2, .numasks = LIBPFM_ARRAY_SIZE(core_rat_stalls), .ngrp = 1, .umasks = core_rat_stalls, }, { .name = "SEG_RENAME_STALLS", .desc = "Segment rename stalls - ES ", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd4, .numasks = LIBPFM_ARRAY_SIZE(core_seg_rename_stalls), .ngrp = 1, .umasks = core_seg_rename_stalls, }, { .name = "SEG_REG_RENAMES", .desc = "Segment renames - ES", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd5, .numasks = LIBPFM_ARRAY_SIZE(core_seg_reg_renames), .ngrp = 1, .umasks = core_seg_reg_renames, }, { .name = "RESOURCE_STALLS", .desc = "Cycles during which the ROB is full", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xdc, .numasks = LIBPFM_ARRAY_SIZE(core_resource_stalls), .ngrp = 1, .umasks = core_resource_stalls, }, { .name = "BR_INST_DECODED", .desc = "Branch instructions decoded", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe0, }, { .name = "BOGUS_BR", .desc = "Bogus branches", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe4, }, { .name = "BACLEARS", .desc = "BACLEARS asserted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe6, }, { .name = "PREF_RQSTS_UP", .desc = "Upward prefetches issued from the DPL", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xf0, }, { .name = "PREF_RQSTS_DN", .desc = "Downward prefetches issued from the DPL", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xf8, }, }; 
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_coreduo_events.h
/*
 * Copyright (c) 2011 Google, Inc
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
 *
 * This file has been automatically generated.
* * PMU: coreduo (Intel Core Duo/Core Solo) */ static const intel_x86_umask_t coreduo_sse_prefetch[]={ { .uname = "NTA", .udesc = "Streaming SIMD Extensions (SSE) Prefetch NTA instructions executed", .ucode = 0x0, }, { .uname = "T1", .udesc = "SSE software prefetch instruction PREFETCHT1 retired", .ucode = 0x100, }, { .uname = "T2", .udesc = "SSE software prefetch instruction PREFETCHT2 retired", .ucode = 0x200, }, }; static const intel_x86_umask_t coreduo_l2_ads[]={ { .uname = "SELF", .udesc = "This core", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "BOTH_CORES", .udesc = "Both cores", .ucode = 0xc000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t coreduo_l2_lines_in[]={ { .uname = "SELF", .udesc = "This core", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "BOTH_CORES", .udesc = "Both cores", .ucode = 0xc000, .uflags= INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "ANY", .udesc = "All inclusive", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, { .uname = "PREFETCH", .udesc = "Hardware prefetch only", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, .grpid = 1, }, }; static const intel_x86_umask_t coreduo_l2_ifetch[]={ { .uname = "MESI", .udesc = "Any cacheline access", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "I_STATE", .udesc = "Invalid cacheline", .ucode = 0x100, .grpid = 0, }, { .uname = "S_STATE", .udesc = "Shared cacheline", .ucode = 0x200, .grpid = 0, }, { .uname = "E_STATE", .udesc = "Exclusive cacheline", .ucode = 0x400, .grpid = 0, }, { .uname = "M_STATE", .udesc = "Modified cacheline", .ucode = 0x800, .grpid = 0, }, { .uname = "SELF", .udesc = "This core", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, { .uname = "BOTH_CORES", .udesc = "Both cores", .ucode = 0xc000, .uflags= INTEL_X86_NCOMBO, .grpid = 1, }, }; static const intel_x86_umask_t coreduo_l2_rqsts[]={ { .uname = 
"MESI", .udesc = "Any cacheline access", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "I_STATE", .udesc = "Invalid cacheline", .ucode = 0x100, .grpid = 0, }, { .uname = "S_STATE", .udesc = "Shared cacheline", .ucode = 0x200, .grpid = 0, }, { .uname = "E_STATE", .udesc = "Exclusive cacheline", .ucode = 0x400, .grpid = 0, }, { .uname = "M_STATE", .udesc = "Modified cacheline", .ucode = 0x800, .grpid = 0, }, { .uname = "SELF", .udesc = "This core", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, { .uname = "BOTH_CORES", .udesc = "Both cores", .ucode = 0xc000, .uflags= INTEL_X86_NCOMBO, .grpid = 1, }, { .uname = "ANY", .udesc = "All inclusive", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 2, }, { .uname = "PREFETCH", .udesc = "Hardware prefetch only", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, .grpid = 2, }, }; static const intel_x86_umask_t coreduo_thermal_trip[]={ { .uname = "CYCLES", .udesc = "Duration in a thermal trip based on the current core clock", .ucode = 0xc000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TRIPS", .udesc = "Number of thermal trips", .ucode = 0xc000 | INTEL_X86_MOD_EDGE, .modhw = _INTEL_X86_ATTR_E, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t coreduo_cpu_clk_unhalted[]={ { .uname = "CORE_P", .udesc = "Unhalted core cycles", .ucode = 0x0, }, { .uname = "NONHLT_REF_CYCLES", .udesc = "Non-halted bus cycles", .ucode = 0x100, }, { .uname = "SERIAL_EXECUTION_CYCLES", .udesc = "Non-halted bus cycles of this core executing code while the other core is halted", .ucode = 0x200, }, }; static const intel_x86_umask_t coreduo_dcache_cache_ld[]={ { .uname = "MESI", .udesc = "Any cacheline access", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "I_STATE", .udesc = "Invalid cacheline", .ucode = 0x100, }, { .uname = "S_STATE", .udesc = "Shared cacheline", .ucode = 0x200, }, { .uname = "E_STATE", .udesc = "Exclusive cacheline", 
.ucode = 0x400, }, { .uname = "M_STATE", .udesc = "Modified cacheline", .ucode = 0x800, }, }; static const intel_x86_umask_t coreduo_sse_pre_miss[]={ { .uname = "NTA_MISS", .udesc = "PREFETCHNTA missed all caches", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO, }, { .uname = "T1_MISS", .udesc = "PREFETCHT1 missed all caches", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "T2_MISS", .udesc = "PREFETCHT2 missed all caches", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "STORES_MISS", .udesc = "SSE streaming store instruction missed all caches", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t coreduo_bus_drdy_clocks[]={ { .uname = "THIS_AGENT", .udesc = "This agent", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL_AGENTS", .udesc = "Any agent on the bus", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t coreduo_simd_int_instructions[]={ { .uname = "MUL", .udesc = "Number of SIMD Integer packed multiply instructions executed", .ucode = 0x100, }, { .uname = "SHIFT", .udesc = "Number of SIMD Integer packed shift instructions executed", .ucode = 0x200, }, { .uname = "PACK", .udesc = "Number of SIMD Integer pack operations instruction executed", .ucode = 0x400, }, { .uname = "UNPACK", .udesc = "Number of SIMD Integer unpack instructions executed", .ucode = 0x800, }, { .uname = "LOGICAL", .udesc = "Number of SIMD Integer packed logical instructions executed", .ucode = 0x1000, }, { .uname = "ARITHMETIC", .udesc = "Number of SIMD Integer packed arithmetic instructions executed", .ucode = 0x2000, }, }; static const intel_x86_umask_t coreduo_mmx_fp_trans[]={ { .uname = "TO_FP", .udesc = "Number of transitions from MMX to X87", .ucode = 0x0, }, { .uname = "TO_MMX", .udesc = "Number of transitions from X87 to MMX", .ucode = 0x100, }, }; static const intel_x86_umask_t coreduo_sse_instructions_retired[]={ { .uname = "SINGLE", .udesc = "Number of SSE/SSE2 single precision 
instructions retired (packed and scalar)", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SCALAR_SINGLE", .udesc = "Number of SSE/SSE2 scalar single precision instructions retired", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_DOUBLE", .udesc = "Number of SSE/SSE2 packed double precision instructions retired", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DOUBLE", .udesc = "Number of SSE/SSE2 scalar double precision instructions retired", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO, }, { .uname = "INT_128", .udesc = "Number of SSE2 128 bit integer instructions retired", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t coreduo_sse_comp_instructions_retired[]={ { .uname = "PACKED_SINGLE", .udesc = "Number of SSE/SSE2 packed single precision compute instructions retired (does not include AND, OR, XOR)", .ucode = 0x0, }, { .uname = "SCALAR_SINGLE", .udesc = "Number of SSE/SSE2 scalar single precision compute instructions retired (does not include AND, OR, XOR)", .ucode = 0x100, }, { .uname = "PACKED_DOUBLE", .udesc = "Number of SSE/SSE2 packed double precision compute instructions retired (does not include AND, OR, XOR)", .ucode = 0x200, }, { .uname = "SCALAR_DOUBLE", .udesc = "Number of SSE/SSE2 scalar double precision compute instructions retired (does not include AND, OR, XOR)", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t coreduo_fused_uops[]={ { .uname = "ALL", .udesc = "All fused uops retired", .ucode = 0x0, }, { .uname = "LOADS", .udesc = "Fused load uops retired", .ucode = 0x100, }, { .uname = "STORES", .udesc = "Fused store uops retired", .ucode = 0x200, }, }; static const intel_x86_umask_t coreduo_est_trans[]={ { .uname = "ANY", .udesc = "Any Intel Enhanced SpeedStep(R) Technology transitions", .ucode = 0x0, }, { .uname = "FREQ", .udesc = "Intel Enhanced SpeedStep Technology frequency transitions", .ucode = 0x1000, }, }; static const intel_x86_entry_t 
intel_coreduo_pe[]={ { .name = "UNHALTED_CORE_CYCLES", .desc = "Unhalted core cycles", .modmsk = INTEL_X86_ATTRS, .equiv = "CPU_CLK_UNHALTED:CORE_P", .cntmsk = 0x3, .code = 0x3c, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles. Measures bus cycles", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x13c, .flags = INTEL_X86_FIXED, }, { .name = "INSTRUCTION_RETIRED", .desc = "Instructions retired", .modmsk = INTEL_X86_ATTRS, .equiv = "INSTR_RET", .cntmsk = 0x3, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "This is an alias for INSTRUCTION_RETIRED", .modmsk = INTEL_X86_ATTRS, .equiv = "INSTRUCTION_RETIRED", .cntmsk = 0x3, .code = 0xc0, }, { .name = "LLC_REFERENCES", .desc = "Last level of cache references", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x4f2e, }, { .name = "LAST_LEVEL_CACHE_REFERENCES", .desc = "This is an alias for LLC_REFERENCES", .modmsk = INTEL_X86_ATTRS, .equiv = "LLC_REFERENCES", .cntmsk = 0x3, .code = 0x4f2e, }, { .name = "LLC_MISSES", .desc = "Last level of cache misses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x412e, }, { .name = "LAST_LEVEL_CACHE_MISSES", .desc = "This is an alias for LLC_MISSES", .modmsk = INTEL_X86_ATTRS, .equiv = "LLC_MISSES", .cntmsk = 0x3, .code = 0x412e, }, { .name = "BRANCH_INSTRUCTIONS_RETIRED", .desc = "Branch instructions retired", .modmsk = INTEL_X86_ATTRS, .equiv = "BR_INSTR_RET", .cntmsk = 0x3, .code = 0xc4, }, { .name = "MISPREDICTED_BRANCH_RETIRED", .desc = "Mispredicted branch instruction retired", .modmsk = INTEL_X86_ATTRS, .equiv = "BR_MISPRED_RET", .cntmsk = 0x3, .code = 0xc5, }, { .name = "LD_BLOCKS", .desc = "Load operations delayed due to store buffer blocks. 
The preceding store may be blocked due to unknown address, unknown data, or conflict due to partial overlap between the load and store.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x3, }, { .name = "SD_DRAINS", .desc = "Cycles while draining store buffers", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x4, }, { .name = "MISALIGN_MEM_REF", .desc = "Misaligned data memory references (MOB splits of loads and stores).", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x5, }, { .name = "SEG_REG_LOADS", .desc = "Segment register loads", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6, }, { .name = "SSE_PREFETCH", .desc = "Streaming SIMD Extensions (SSE) Prefetch instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7, .numasks = LIBPFM_ARRAY_SIZE(coreduo_sse_prefetch), .ngrp = 1, .umasks = coreduo_sse_prefetch, }, { .name = "SSE_NTSTORES_RET", .desc = "SSE streaming store instruction retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x307, }, { .name = "FP_COMPS_OP_EXE", .desc = "FP computational instructions executed. FADD, FSUB, FCOM, FMULs, MUL, IMUL, FDIVs, DIV, IDIV, FPREMs, FSQRT are included; but exclude FADD or FMUL used in the middle of a transcendental instruction.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x10, }, { .name = "FP_ASSIST", .desc = "FP exceptions requiring microcode assists", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x11, }, { .name = "MUL", .desc = "Multiply operations (a speculative count, including FP and integer multiplies).", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x12, }, { .name = "DIV", .desc = "Divide operations (a speculative count, including FP and integer divides). 
", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x13, }, { .name = "CYCLES_DIV_BUSY", .desc = "Cycles the divider is busy ", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0x14, }, { .name = "L2_ADS", .desc = "L2 Address strobes ", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x21, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_ads), .ngrp = 1, .umasks = coreduo_l2_ads, }, { .name = "DBUS_BUSY", .desc = "Core cycle during which data bus was busy (increments by 4)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x22, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_ads), .ngrp = 1, .umasks = coreduo_l2_ads, /* identical to actual umasks list for this event */ }, { .name = "DBUS_BUSY_RD", .desc = "Cycles data bus is busy transferring data to a core (increments by 4) ", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x23, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_ads), .ngrp = 1, .umasks = coreduo_l2_ads, /* identical to actual umasks list for this event */ }, { .name = "L2_LINES_IN", .desc = "L2 cache lines allocated", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x24, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_lines_in), .ngrp = 2, .umasks = coreduo_l2_lines_in, }, { .name = "L2_M_LINES_IN", .desc = "L2 Modified-state cache lines allocated", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x25, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_ads), .ngrp = 1, .umasks = coreduo_l2_ads, /* identical to actual umasks list for this event */ }, { .name = "L2_LINES_OUT", .desc = "L2 cache lines evicted ", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x26, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_lines_in), .ngrp = 2, .umasks = coreduo_l2_lines_in, /* identical to actual umasks list for this event */ }, { .name = "L2_M_LINES_OUT", .desc = "L2 Modified-state cache lines evicted ", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x27, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_lines_in), .ngrp = 2, .umasks = coreduo_l2_lines_in, /* identical to actual umasks list for this 
event */ }, { .name = "L2_IFETCH", .desc = "L2 instruction fetches from instruction fetch unit (includes speculative fetches) ", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x28, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_ifetch), .ngrp = 2, .umasks = coreduo_l2_ifetch, }, { .name = "L2_LD", .desc = "L2 cache reads (includes speculation) ", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x29, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_ifetch), .ngrp = 2, .umasks = coreduo_l2_ifetch, /* identical to actual umasks list for this event */ }, { .name = "L2_ST", .desc = "L2 cache writes (includes speculation)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x2a, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_ifetch), .ngrp = 2, .umasks = coreduo_l2_ifetch, /* identical to actual umasks list for this event */ }, { .name = "L2_RQSTS", .desc = "L2 cache reference requests ", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x2e, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_rqsts), .ngrp = 3, .umasks = coreduo_l2_rqsts, }, { .name = "L2_REJECT_CYCLES", .desc = "Cycles L2 is busy and rejecting new requests.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x30, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_rqsts), .ngrp = 3, .umasks = coreduo_l2_rqsts, /* identical to actual umasks list for this event */ }, { .name = "L2_NO_REQUEST_CYCLES", .desc = "Cycles there is no request to access L2.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x32, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_rqsts), .ngrp = 3, .umasks = coreduo_l2_rqsts, /* identical to actual umasks list for this event */ }, { .name = "EST_TRANS", .desc = "Intel Enhanced SpeedStep(R) Technology transitions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x3a, .numasks= LIBPFM_ARRAY_SIZE(coreduo_est_trans), .ngrp = 1, .umasks = coreduo_est_trans, }, { .name = "THERMAL_TRIP", .desc = "Duration in a thermal trip based on the current core clock ", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x3b, .numasks = 
LIBPFM_ARRAY_SIZE(coreduo_thermal_trip), .ngrp = 1, .umasks = coreduo_thermal_trip, }, { .name = "CPU_CLK_UNHALTED", .desc = "Core cycles when core is not halted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x3c, .numasks = LIBPFM_ARRAY_SIZE(coreduo_cpu_clk_unhalted), .ngrp = 1, .umasks = coreduo_cpu_clk_unhalted, }, { .name = "DCACHE_CACHE_LD", .desc = "L1 cacheable data read operations", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x40, .numasks = LIBPFM_ARRAY_SIZE(coreduo_dcache_cache_ld), .ngrp = 1, .umasks = coreduo_dcache_cache_ld, }, { .name = "DCACHE_CACHE_ST", .desc = "L1 cacheable data write operations", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x41, .numasks = LIBPFM_ARRAY_SIZE(coreduo_dcache_cache_ld), .ngrp = 1, .umasks = coreduo_dcache_cache_ld, /* identical to actual umasks list for this event */ }, { .name = "DCACHE_CACHE_LOCK", .desc = "L1 cacheable lock read operations to invalid state", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x42, .numasks = LIBPFM_ARRAY_SIZE(coreduo_dcache_cache_ld), .ngrp = 1, .umasks = coreduo_dcache_cache_ld, /* identical to actual umasks list for this event */ }, { .name = "DATA_MEM_REF", .desc = "L1 data read and writes of cacheable and non-cacheable types", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x143, }, { .name = "DATA_MEM_CACHE_REF", .desc = "L1 data cacheable read and write operations.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x244, }, { .name = "DCACHE_REPL", .desc = "L1 data cache line replacements", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xf45, }, { .name = "DCACHE_M_REPL", .desc = "L1 data M-state cache line allocated", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x46, }, { .name = "DCACHE_M_EVICT", .desc = "L1 data M-state cache line evicted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x47, }, { .name = "DCACHE_PEND_MISS", .desc = "Weighted cycles of L1 miss outstanding", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x48, }, { 
.name = "DTLB_MISS", .desc = "Data references that missed TLB", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x49, }, { .name = "SSE_PRE_MISS", .desc = "Streaming SIMD Extensions (SSE) instructions missing all cache levels", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x4b, .numasks = LIBPFM_ARRAY_SIZE(coreduo_sse_pre_miss), .ngrp = 1, .umasks = coreduo_sse_pre_miss, }, { .name = "L1_PREF_REQ", .desc = "L1 prefetch requests due to DCU cache misses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x4f, }, { .name = "BUS_REQ_OUTSTANDING", .desc = "Weighted cycles of cacheable bus data read requests. This event counts full-line read requests from DCU or HW prefetcher, but not RFO, write, instruction fetches, or others.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x60, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_lines_in), .ngrp = 2, .umasks = coreduo_l2_lines_in, /* identical to actual umasks list for this event */ }, { .name = "BUS_BNR_CLOCKS", .desc = "External bus cycles while BNR asserted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x61, }, { .name = "BUS_DRDY_CLOCKS", .desc = "External bus cycles while DRDY asserted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x62, .numasks = LIBPFM_ARRAY_SIZE(coreduo_bus_drdy_clocks), .ngrp = 1, .umasks = coreduo_bus_drdy_clocks, }, { .name = "BUS_LOCKS_CLOCKS", .desc = "External bus cycles while bus lock signal asserted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x63, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_ads), .ngrp = 1, .umasks = coreduo_l2_ads, /* identical to actual umasks list for this event */ }, { .name = "BUS_DATA_RCV", .desc = "Bus cycles while processor receives data", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x4064, }, { .name = "BUS_TRANS_BRD", .desc = "Burst read bus transactions (data or code)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x65, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_ads), .ngrp = 1, .umasks = coreduo_l2_ads, /* identical to 
actual umasks list for this event */ }, { .name = "BUS_TRANS_RFO", .desc = "Completed read for ownership ", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x66, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_lines_in), .ngrp = 2, .umasks = coreduo_l2_lines_in, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_IFETCH", .desc = "Completed instruction fetch transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x68, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_lines_in), .ngrp = 2, .umasks = coreduo_l2_lines_in, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_INVAL", .desc = "Completed invalidate transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x69, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_lines_in), .ngrp = 2, .umasks = coreduo_l2_lines_in, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_PWR", .desc = "Completed partial write transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6a, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_lines_in), .ngrp = 2, .umasks = coreduo_l2_lines_in, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_P", .desc = "Completed partial transactions (include partial read + partial write + line write)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6b, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_lines_in), .ngrp = 2, .umasks = coreduo_l2_lines_in, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_IO", .desc = "Completed I/O transactions (read and write)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6c, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_lines_in), .ngrp = 2, .umasks = coreduo_l2_lines_in, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_DEF", .desc = "Completed defer transactions ", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x206d, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_ads), .ngrp = 1, .umasks = coreduo_l2_ads, /* identical to 
actual umasks list for this event */ }, { .name = "BUS_TRANS_WB", .desc = "Completed writeback transactions from DCU (does not include L2 writebacks)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc067, .numasks = LIBPFM_ARRAY_SIZE(coreduo_bus_drdy_clocks), .ngrp = 1, .umasks = coreduo_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_BURST", .desc = "Completed burst transactions (full line transactions include reads, write, RFO, and writebacks) ", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc06e, .numasks = LIBPFM_ARRAY_SIZE(coreduo_bus_drdy_clocks), .ngrp = 1, .umasks = coreduo_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_MEM", .desc = "Completed memory transactions. This includes Bus_Trans_Burst + Bus_Trans_P + Bus_Trans_Inval.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc06f, .numasks = LIBPFM_ARRAY_SIZE(coreduo_bus_drdy_clocks), .ngrp = 1, .umasks = coreduo_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_ANY", .desc = "Any completed bus transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc070, .numasks = LIBPFM_ARRAY_SIZE(coreduo_bus_drdy_clocks), .ngrp = 1, .umasks = coreduo_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_SNOOPS", .desc = "External bus cycles while bus lock signal asserted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x77, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_ifetch), .ngrp = 2, .umasks = coreduo_l2_ifetch, /* identical to actual umasks list for this event */ }, { .name = "DCU_SNOOP_TO_SHARE", .desc = "DCU snoops to share-state L1 cache line due to L1 misses ", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x178, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_ads), .ngrp = 1, .umasks = coreduo_l2_ads, /* identical to actual umasks list for this event */ }, { .name = "BUS_NOT_IN_USE", .desc = "Number of cycles there is no 
transaction from the core", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7d, .numasks = LIBPFM_ARRAY_SIZE(coreduo_l2_ads), .ngrp = 1, .umasks = coreduo_l2_ads, /* identical to actual umasks list for this event */ }, { .name = "BUS_SNOOP_STALL", .desc = "Number of bus cycles while bus snoop is stalled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7e, }, { .name = "ICACHE_READS", .desc = "Number of instruction fetches from ICache, streaming buffers (both cacheable and uncacheable fetches)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x80, }, { .name = "ICACHE_MISSES", .desc = "Number of instruction fetch misses from ICache, streaming buffers.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x81, }, { .name = "ITLB_MISSES", .desc = "Number of ITLB misses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x85, }, { .name = "IFU_MEM_STALL", .desc = "Cycles IFU is stalled while waiting for data from memory", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x86, }, { .name = "ILD_STALL", .desc = "Number of instruction length decoder stalls (Counts number of LCP stalls)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x87, }, { .name = "BR_INST_EXEC", .desc = "Branch instructions executed (includes speculation)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x88, }, { .name = "BR_MISSP_EXEC", .desc = "Branch instructions executed and mispredicted at execution (includes branches that do not have prediction or mispredicted)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x89, }, { .name = "BR_BAC_MISSP_EXEC", .desc = "Branch instructions executed that were mispredicted at front end", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8a, }, { .name = "BR_CND_EXEC", .desc = "Conditional branch instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8b, }, { .name = "BR_CND_MISSP_EXEC", .desc = "Conditional branch instructions executed that were mispredicted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code =
0x8c, }, { .name = "BR_IND_EXEC", .desc = "Indirect branch instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8d, }, { .name = "BR_IND_MISSP_EXEC", .desc = "Indirect branch instructions executed that were mispredicted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8e, }, { .name = "BR_RET_EXEC", .desc = "Return branch instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8f, }, { .name = "BR_RET_MISSP_EXEC", .desc = "Return branch instructions executed that were mispredicted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x90, }, { .name = "BR_RET_BAC_MISSP_EXEC", .desc = "Return branch instructions executed that were mispredicted at the front end", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x91, }, { .name = "BR_CALL_EXEC", .desc = "Return call instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x92, }, { .name = "BR_CALL_MISSP_EXEC", .desc = "Return call instructions executed that were mispredicted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x93, }, { .name = "BR_IND_CALL_EXEC", .desc = "Indirect call branch instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x94, }, { .name = "RESOURCE_STALL", .desc = "Cycles while there is a resource related stall (renaming, buffer entries) as seen by allocator", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xa2, }, { .name = "MMX_INSTR_EXEC", .desc = "Number of MMX instructions executed (does not include MOVQ and MOVD stores)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xb0, }, { .name = "SIMD_INT_SAT_EXEC", .desc = "Number of SIMD Integer saturating instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xb1, }, { .name = "SIMD_INT_INSTRUCTIONS", .desc = "Number of SIMD Integer instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xb3, .numasks = LIBPFM_ARRAY_SIZE(coreduo_simd_int_instructions), .ngrp = 1, .umasks = coreduo_simd_int_instructions, }, { 
.name = "INSTR_RET", .desc = "Number of instructions retired (macro-fused instructions count as 2)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc0, }, { .name = "FP_COMP_INSTR_RET", .desc = "Number of FP compute instructions retired (X87 instructions or instructions that contain X87 operations)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0xc1, }, { .name = "UOPS_RET", .desc = "Number of micro-ops retired (includes fused uops)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc2, }, { .name = "SMC_DETECTED", .desc = "Number of times self-modifying code condition detected", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc3, }, { .name = "BR_INSTR_RET", .desc = "Number of branch instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc4, }, { .name = "BR_MISPRED_RET", .desc = "Number of mispredicted branch instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc5, }, { .name = "CYCLES_INT_MASKED", .desc = "Cycles while interrupt is disabled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc6, }, { .name = "CYCLES_INT_PENDING_MASKED", .desc = "Cycles while interrupt is disabled and interrupts are pending", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc7, }, { .name = "HW_INT_RX", .desc = "Number of hardware interrupts received", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc8, }, { .name = "BR_TAKEN_RET", .desc = "Number of taken branch instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc9, }, { .name = "BR_MISPRED_TAKEN_RET", .desc = "Number of taken and mispredicted branch instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xca, }, { .name = "MMX_FP_TRANS", .desc = "Transitions from MMX (TM) Instructions to Floating Point Instructions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xcc, .numasks = LIBPFM_ARRAY_SIZE(coreduo_mmx_fp_trans), .ngrp = 1, .umasks = coreduo_mmx_fp_trans, }, { .name = "MMX_ASSIST", .desc = "Number of EMMS executed",
.modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xcd, }, { .name = "MMX_INSTR_RET", .desc = "Number of MMX instruction retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xce, }, { .name = "INSTR_DECODED", .desc = "Number of instruction decoded", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd0, }, { .name = "ESP_UOPS", .desc = "Number of ESP folding instruction decoded", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd7, }, { .name = "SSE_INSTRUCTIONS_RETIRED", .desc = "Number of SSE/SSE2 instructions retired (packed and scalar)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd8, .numasks = LIBPFM_ARRAY_SIZE(coreduo_sse_instructions_retired), .ngrp = 1, .umasks = coreduo_sse_instructions_retired, }, { .name = "SSE_COMP_INSTRUCTIONS_RETIRED", .desc = "Number of computational SSE/SSE2 instructions retired (does not include AND, OR, XOR)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd9, .numasks = LIBPFM_ARRAY_SIZE(coreduo_sse_comp_instructions_retired), .ngrp = 1, .umasks = coreduo_sse_comp_instructions_retired, }, { .name = "FUSED_UOPS", .desc = "Fused uops retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xda, .numasks = LIBPFM_ARRAY_SIZE(coreduo_fused_uops), .ngrp = 1, .umasks = coreduo_fused_uops, }, { .name = "UNFUSION", .desc = "Number of unfusion events in the ROB (due to exception)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xdb, }, { .name = "BR_INSTR_DECODED", .desc = "Branch instructions decoded", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe0, }, { .name = "BTB_MISSES", .desc = "Number of branches the BTB did not produce a prediction", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe2, }, { .name = "BR_BOGUS", .desc = "Number of bogus branches", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe4, }, { .name = "BACLEARS", .desc = "Number of BAClears asserted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe6, }, { .name = "PREF_RQSTS_UP", .desc = "Number of hardware prefetch 
requests issued in forward streams", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xf0, }, { .name = "PREF_RQSTS_DN", .desc = "Number of hardware prefetch requests issued in backward streams", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xf8, }, };

/* ==== File: papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_glm_events.h ==== */

/*
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
 * FILE AUTOMATICALLY GENERATED from download.01.org/perfmon/GLM/Goldmont_core_V6.json
 * PMU: glm (Intel Goldmont)
 */
static const intel_x86_umask_t glm_icache[]={ { .uname = "HIT", .udesc = "References per ICache line that are available in the ICache (hit).
This event counts differently than Intel processors based on Silvermont microarchitecture", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "MISSES", .udesc = "References per ICache line that are not available in the ICache (miss). This event counts differently than Intel processors based on Silvermont microarchitecture", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "ACCESSES", .udesc = "References per ICache line. This event counts differently than Intel processors based on Silvermont microarchitecture", .ucode = 0x0300, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_l2_reject_xq[]={ { .uname = "ALL", .udesc = "Requests rejected by the XQ", .ucode = 0x0000, .uflags = INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_hw_interrupts[]={ { .uname = "RECEIVED", .udesc = "Hardware interrupts received", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "PENDING_AND_MASKED", .udesc = "Cycles pending interrupts are masked", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_br_misp_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "Retired mispredicted branch instructions (Precise Event)", .ucode = 0x0000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "JCC", .udesc = "Retired mispredicted conditional branch instructions (Precise Event)", .ucode = 0x7e00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "TAKEN_JCC", .udesc = "Retired mispredicted conditional branch instructions that were taken (Precise Event)", .ucode = 0xfe00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "IND_CALL", .udesc = "Retired mispredicted near indirect call instructions (Precise 
Event)", .ucode = 0xfb00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "RETURN", .udesc = "Retired mispredicted near return instructions (Precise Event)", .ucode = 0xf700, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "NON_RETURN_IND", .udesc = "Retired mispredicted instructions of near indirect Jmp or near indirect call (Precise Event)", .ucode = 0xeb00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_decode_restriction[]={ { .uname = "PREDECODE_WRONG", .udesc = "Decode restrictions due to predicting wrong instruction length", .ucode = 0x0100, .uflags = INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_misalign_mem_ref[]={ { .uname = "LOAD_PAGE_SPLIT", .udesc = "Load uops that split a page (Precise Event)", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "STORE_PAGE_SPLIT", .udesc = "Store uops that split a page (Precise Event)", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_inst_retired[]={ { .uname = "ANY_P", .udesc = "Counts the number of instructions that retire execution. For instructions that consist of multiple uops, this event counts the retirement of the last uop of the instruction. The event continues counting during hardware interrupts, traps, and inside interrupt handlers. This is an architectural performance event. This event uses a (_P)rogrammable general purpose performance counter. *This event is Precise Event capable: The EventingRIP field in the PEBS record is precise to the address of the instruction which caused the event. 
Note: Because PEBS records can be collected only on IA32_PMC0, only one event can use the PEBS facility at a time.", .ucode = 0x0000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_issue_slots_not_consumed[]={ { .uname = "RESOURCE_FULL", .udesc = "Unfilled issue slots per cycle because of a full resource in the backend", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "RECOVERY", .udesc = "Unfilled issue slots per cycle to recover", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "ANY", .udesc = "Unfilled issue slots per cycle", .ucode = 0x0000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_itlb[]={ { .uname = "MISS", .udesc = "ITLB misses", .ucode = 0x0400, .uflags = INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_longest_lat_cache[]={ { .uname = "REFERENCE", .udesc = "L2 cache requests", .ucode = 0x4f00, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "MISS", .udesc = "L2 cache request misses", .ucode = 0x4100, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_mem_load_uops_retired[]={ { .uname = "L1_HIT", .udesc = "Load uops retired that hit L1 data cache (Precise Event)", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "L1_MISS", .udesc = "Load uops retired that missed L1 data cache (Precise Event)", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "L2_HIT", .udesc = "Load uops retired that hit L2 (Precise Event)", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "L2_MISS", .udesc = "Load uops retired that missed L2 (Precise Event)", .ucode = 
0x1000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "HITM", .udesc = "Memory uop retired where cross core or cross module HITM occurred (Precise Event)", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "WCB_HIT", .udesc = "Loads retired that hit WCB (Precise Event)", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "DRAM_HIT", .udesc = "Loads retired that came from DRAM (Precise Event)", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_ld_blocks[]={ { .uname = "ALL_BLOCK", .udesc = "Loads blocked (Precise Event)", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "UTLB_MISS", .udesc = "Loads blocked because address is not in the UTLB (Precise Event)", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "STORE_FORWARD", .udesc = "Loads blocked due to store forward restriction (Precise Event)", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "DATA_UNKNOWN", .udesc = "Loads blocked due to store data not ready (Precise Event)", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "4K_ALIAS", .udesc = "Loads blocked because address has 4k partial address false dependence (Precise Event)", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_dl1[]={ { .uname = "DIRTY_EVICTION", .udesc = "L1 Cache evictions for dirty data", .ucode = 0x0100, .uflags = INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_cycles_div_busy[]={ { .uname = "ALL", .udesc = "Cycles a divider is busy", .ucode = 0x0000,
.uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "IDIV", .udesc = "Cycles the integer divide unit is busy", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "FPDIV", .udesc = "Cycles the FP divide unit is busy", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_ms_decoded[]={ { .uname = "MS_ENTRY", .udesc = "MS decode starts", .ucode = 0x0100, .uflags = INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_uops_retired[]={ { .uname = "ANY", .udesc = "Uops retired (Precise Event)", .ucode = 0x0000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "MS", .udesc = "MS uops retired (Precise Event)", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_offcore_response_0[]={ { .uname = "DMND_DATA_RD", .udesc = "Request: number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches", .ucode = 1ULL << (0 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "DMND_RFO", .udesc = "Request: number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO prefetches", .ucode = 1ULL << (1 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "DMND_CODE_RD", .udesc = "Request: number of demand and DCU prefetch instruction cacheline reads. 
Does not count L2 code read prefetches", .ucode = 1ULL << (2 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "WB", .udesc = "Request: number of writebacks (modified to exclusive) transactions", .ucode = 1ULL << (3 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "PF_DATA_RD", .udesc = "Request: number of data cacheline reads generated by L2 prefetcher", .ucode = 1ULL << (4 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "PF_RFO", .udesc = "Request: number of RFO requests generated by L2 prefetcher", .ucode = 1ULL << (5 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "PARTIAL_READS", .udesc = "Request: number of partial reads", .ucode = 1ULL << (7 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "PARTIAL_WRITES", .udesc = "Request: number of partial writes", .ucode = 1ULL << (8 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "UC_CODE_READS", .udesc = "Request: number of uncached code reads", .ucode = 1ULL << (9 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "BUS_LOCKS", .udesc = "Request: number of bus lock and split lock requests", .ucode = 1ULL << (10 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "FULL_STRM_ST", .udesc = "Request: number of streaming store requests for full cacheline", .ucode = 1ULL << (11 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "SW_PF", .udesc = "Request: number of cacheline requests due to software prefetch", .ucode = 1ULL << (12 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "PF_L1_DATA_RD", .udesc = "Request: number of data cacheline reads generated by L1 data prefetcher", .ucode = 1ULL << (13 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "PARTIAL_STRM_ST", .udesc = "Request: number of streaming store requests for partial cacheline", .ucode = 1ULL << (14 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "STRM_ST", .udesc = "Request: number of streaming store requests for partial or full cacheline", .ucode = (1ULL << (14 + 8)) | (1ULL << (11+8)), .uequiv = "FULL_STRM_ST:PARTIAL_STRM_ST",
.grpid = 0, .ucntmsk = 0xffull, }, { .uname = "ANY_REQUEST", .udesc = "Request: combination of all request umasks", .ucode = 1ULL << (15 + 8), .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "ANY_PF_DATA_RD", .udesc = "Request: number of prefetch data reads", .ucode = (1ULL << (4+8)) | (1ULL << (12+8)) | (1ULL << (13+8)), .grpid = 0, .ucntmsk = 0xffull, .uequiv = "PF_DATA_RD:SW_PF:PF_L1_DATA_RD", }, { .uname = "ANY_RFO", .udesc = "Request: number of RFO", .ucode = (1ULL << (1+8)) | (1ULL << (5+8)), .grpid = 0, .ucntmsk = 0xffull, .uequiv = "DMND_RFO:PF_RFO", }, { .uname = "ANY_RESPONSE", .udesc = "Response: any response type", .ucode = 1ULL << (16 + 8), .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, .grpid = 1, .ucntmsk = 0xffull, }, { .uname = "L2_HIT", .udesc = "Supplier: counts L2 hits", .ucode = 1ULL << (18 + 8), .grpid = 1, .ucntmsk = 0xffull, }, { .uname = "L2_MISS_SNP_MISS_OR_NO_SNOOP_NEEDED", .udesc = "Snoop: counts number true misses to this processor module for which a snoop request missed the other processor module or no snoop was needed", .ucode = 1ULL << (33 + 8), .grpid = 2, .ucntmsk = 0xffull, }, { .uname = "L2_MISS_HIT_OTHER_CORE_NO_FWD", .udesc = "Snoop: counts number of times a snoop request hits the other processor module but no data forwarding is needed", .ucode = 1ULL << (34 + 8), .grpid = 2, .ucntmsk = 0xffull, }, { .uname = "L2_MISS_HITM_OTHER_CORE", .udesc = "Snoop: counts number of times a snoop request hits in the other processor module or other core's L1 where a modified copy (M-state) is found", .ucode = 1ULL << (36 + 8), .grpid = 2, .ucntmsk = 0xffull, }, { .uname = "L2_MISS_SNP_NON_DRAM", .udesc = "Snoop: counts number of times target was a non-DRAM system address. 
This includes MMIO transactions", .ucode = 1ULL << (37 + 8), .grpid = 2, .ucntmsk = 0xffull, }, { .uname = "L2_MISS_SNP_ANY", .udesc = "Snoop: any snoop reason", .ucode = 0x1bULL << (33 + 8), .uflags = INTEL_X86_DFL, .uequiv = "L2_MISS_SNP_MISS_OR_NO_SNOOP_NEEDED:L2_MISS_HIT_OTHER_CORE_NO_FWD:L2_MISS_HITM_OTHER_CORE:L2_MISS_SNP_NON_DRAM", .grpid = 2, .ucntmsk = 0xffull, }, { .uname = "OUTSTANDING", .udesc = "Outstanding request: counts weighted cycles of outstanding offcore requests of the request type specified in the bits 15:0 of offcore_response from the time the XQ receives the request and any response received. Bits 37:16 must be set to 0. This is only available for offcore_response_0", .ucode = 1ULL << (38 + 8), .uflags = INTEL_X86_GRP_DFL_NONE | INTEL_X86_EXCL_GRP_BUT_0, /* can only be combined with request type bits (grpid = 0) */ .grpid = 3, .ucntmsk = 0xffull, }, }; static const intel_x86_umask_t glm_offcore_response_1[]={ { .uname = "DMND_DATA_RD", .udesc = "Request: number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches", .ucode = 1ULL << (0 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "DMND_RFO", .udesc = "Request: number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO prefetches", .ucode = 1ULL << (1 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "DMND_CODE_RD", .udesc = "Request: number of demand and DCU prefetch instruction cacheline reads. 
Does not count L2 code read prefetches", .ucode = 1ULL << (2 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "WB", .udesc = "Request: number of writebacks (modified to exclusive) transactions", .ucode = 1ULL << (3 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "PF_DATA_RD", .udesc = "Request: number of data cacheline reads generated by L2 prefetcher", .ucode = 1ULL << (4 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "PF_RFO", .udesc = "Request: number of RFO requests generated by L2 prefetcher", .ucode = 1ULL << (5 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "PARTIAL_READS", .udesc = "Request: number of partial reads", .ucode = 1ULL << (7 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "PARTIAL_WRITES", .udesc = "Request: number of partial writes", .ucode = 1ULL << (8 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "UC_CODE_READS", .udesc = "Request: number of uncached code reads", .ucode = 1ULL << (9 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "BUS_LOCKS", .udesc = "Request: number of bus lock and split lock requests", .ucode = 1ULL << (10 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "FULL_STRM_ST", .udesc = "Request: number of streaming store requests for full cacheline", .ucode = 1ULL << (11 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "SW_PF", .udesc = "Request: number of cacheline requests due to software prefetch", .ucode = 1ULL << (12 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "PF_L1_DATA_RD", .udesc = "Request: number of data cacheline reads generated by L1 data prefetcher", .ucode = 1ULL << (13 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "PARTIAL_STRM_ST", .udesc = "Request: number of streaming store requests for partial cacheline", .ucode = 1ULL << (14 + 8), .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "STRM_ST", .udesc = "Request: number of streaming store requests for partial or full cacheline", .ucode = (1ULL << (14 + 8)) | (1ULL << (11+8)), .uequiv = "FULL_STRM_ST:PARTIAL_STRM_ST",
.grpid = 0, .ucntmsk = 0xffull, }, { .uname = "ANY_REQUEST", .udesc = "Request: combination of all request umasks", .ucode = 1ULL << (15 + 8), .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xffull, }, { .uname = "ANY_PF_DATA_RD", .udesc = "Request: number of prefetch data reads", .ucode = (1ULL << (4+8)) | (1ULL << (12+8)) | (1ULL << (13+8)), .grpid = 0, .ucntmsk = 0xffull, .uequiv = "PF_DATA_RD:SW_PF:PF_L1_DATA_RD", }, { .uname = "ANY_RFO", .udesc = "Request: number of RFO", .ucode = (1ULL << (1+8)) | (1ULL << (5+8)), .grpid = 0, .ucntmsk = 0xffull, .uequiv = "DMND_RFO:PF_RFO", }, { .uname = "ANY_RESPONSE", .udesc = "Response: any response type", .ucode = 1ULL << (16 + 8), .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, .grpid = 1, .ucntmsk = 0xffull, }, { .uname = "L2_HIT", .udesc = "Supplier: counts L2 hits", .ucode = 1ULL << (18 + 8), .grpid = 1, .ucntmsk = 0xffull, }, { .uname = "L2_MISS_SNP_MISS_OR_NO_SNOOP_NEEDED", .udesc = "Snoop: counts number true misses to this processor module for which a snoop request missed the other processor module or no snoop was needed", .ucode = 1ULL << (33 + 8), .grpid = 2, .ucntmsk = 0xffull, }, { .uname = "L2_MISS_HIT_OTHER_CORE_NO_FWD", .udesc = "Snoop: counts number of times a snoop request hits the other processor module but no data forwarding is needed", .ucode = 1ULL << (34 + 8), .grpid = 2, .ucntmsk = 0xffull, }, { .uname = "L2_MISS_HITM_OTHER_CORE", .udesc = "Snoop: counts number of times a snoop request hits in the other processor module or other core's L1 where a modified copy (M-state) is found", .ucode = 1ULL << (36 + 8), .grpid = 2, .ucntmsk = 0xffull, }, { .uname = "L2_MISS_SNP_NON_DRAM", .udesc = "Snoop: counts number of times target was a non-DRAM system address. 
This includes MMIO transactions", .ucode = 1ULL << (37 + 8), .grpid = 2, .ucntmsk = 0xffull, }, { .uname = "L2_MISS_SNP_ANY", .udesc = "Snoop: any snoop reason", .ucode = 0xfULL << (33 + 8), .uflags = INTEL_X86_DFL, .grpid = 2, .ucntmsk = 0xffull, .uequiv = "L2_MISS_SNP_MISS_OR_NO_SNOOP_NEEDED:L2_MISS_HIT_OTHER_CORE_NO_FWD:L2_MISS_HITM_OTHER_CORE:L2_MISS_SNP_NON_DRAM", }, }; static const intel_x86_umask_t glm_machine_clears[]={ { .uname = "SMC", .udesc = "Self-Modifying Code detected", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "MEMORY_ORDERING", .udesc = "Machine clears due to memory ordering issue", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "FP_ASSIST", .udesc = "Machine clears due to FP assists", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "DISAMBIGUATION", .udesc = "Machine clears due to memory disambiguation", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "ALL", .udesc = "All machine clears", .ucode = 0x0000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_br_inst_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "Retired branch instructions (Precise Event)", .ucode = 0x0000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "ALL_TAKEN_BRANCHES", .udesc = "Retired taken branch instructions (Precise Event)", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "JCC", .udesc = "Retired conditional branch instructions (Precise Event)", .ucode = 0x7e00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "TAKEN_JCC", .udesc = "Retired conditional branch instructions that were taken (Precise Event)", .ucode = 0xfe00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, 
.ucntmsk = 0xfull, }, { .uname = "CALL", .udesc = "Retired near call instructions (Precise Event)", .ucode = 0xf900, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "REL_CALL", .udesc = "Retired near relative call instructions (Precise Event)", .ucode = 0xfd00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "IND_CALL", .udesc = "Retired near indirect call instructions (Precise Event)", .ucode = 0xfb00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "RETURN", .udesc = "Retired near return instructions (Precise Event)", .ucode = 0xf700, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "NON_RETURN_IND", .udesc = "Retired instructions of near indirect Jmp or call (Precise Event)", .ucode = 0xeb00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "FAR_BRANCH", .udesc = "Retired far branch instructions (Precise Event)", .ucode = 0xbf00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_fetch_stall[]={ { .uname = "ICACHE_FILL_PENDING_CYCLES", .udesc = "Cycles where code-fetch is stalled and an ICache miss is outstanding. 
This is not the same as an ICache Miss", .ucode = 0x0200, .uflags = INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_uops_not_delivered[]={ { .uname = "ANY", .udesc = "Uops requested but not-delivered to the back-end per cycle", .ucode = 0x0000, .uflags = INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_mem_uops_retired[]={ { .uname = "ALL_LOADS", .udesc = "Load uops retired (Precise Event)", .ucode = 0x8100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "ALL_STORES", .udesc = "Store uops retired (Precise Event)", .ucode = 0x8200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "ALL", .udesc = "Memory uops retired (Precise Event)", .ucode = 0x8300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "DTLB_MISS_LOADS", .udesc = "Load uops retired that missed the DTLB (Precise Event)", .ucode = 0x1100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "DTLB_MISS_STORES", .udesc = "Store uops retired that missed the DTLB (Precise Event)", .ucode = 0x1200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "DTLB_MISS", .udesc = "Memory uops retired that missed the DTLB (Precise Event)", .ucode = 0x1300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "LOCK_LOADS", .udesc = "Locked load uops retired (Precise Event)", .ucode = 0x2100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "SPLIT_LOADS", .udesc = "Load uops retired that split a cache-line (Precise Event)", .ucode = 0x4100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "SPLIT_STORES", .udesc = "Stores uops retired that split a cache-line (Precise Event)", .ucode = 0x4200, .uflags = INTEL_X86_NCOMBO | 
INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "SPLIT", .udesc = "Memory uops retired that split a cache-line (Precise Event)", .ucode = 0x4300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_uops_issued[]={ { .uname = "ANY", .udesc = "Uops issued to the back end per cycle", .ucode = 0x0000, .uflags = INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_core_reject_l2q[]={ { .uname = "ALL", .udesc = "Requests rejected by the L2Q ", .ucode = 0x0000, .uflags = INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_page_walks[]={ { .uname = "D_SIDE_CYCLES", .udesc = "Duration of D-side page-walks in cycles", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "I_SIDE_CYCLES", .udesc = "Duration of I-side pagewalks in cycles", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "CYCLES", .udesc = "Duration of page-walks in cycles", .ucode = 0x0300, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_baclears[]={ { .uname = "ALL", .udesc = "BACLEARs asserted for any branch type", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "RETURN", .udesc = "BACLEARs asserted for return branch", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "COND", .udesc = "BACLEARs asserted for conditional branch", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_umask_t glm_cpu_clk_unhalted[]={ { .uname = "CORE", .udesc = "Core cycles when core is not halted (Fixed event)", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0x200000000ull, }, { .uname = "REF_TSC", .udesc = "Reference cycles when core is not halted (Fixed event)", .ucode = 0x0300, .uflags 
= INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0x400000000ull, }, { .uname = "CORE_P", .udesc = "Core cycles when core is not halted", .ucode = 0x0000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "REF", .udesc = "Reference cycles when core is not halted", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, .grpid = 0, .ucntmsk = 0xfull, }, }; static const intel_x86_entry_t intel_glm_pe[]={ { .name = "ICACHE", .desc = "References per ICache line that are available in the ICache (hit). This event counts differently than Intel processors based on Silvermont microarchitecture", .code = 0x80, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_icache), .umasks = glm_icache, }, { .name = "L2_REJECT_XQ", .desc = "Requests rejected by the XQ", .code = 0x30, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_l2_reject_xq), .umasks = glm_l2_reject_xq, }, { .name = "HW_INTERRUPTS", .desc = "Hardware interrupts received", .code = 0xcb, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_hw_interrupts), .umasks = glm_hw_interrupts, }, { .name = "BR_MISP_RETIRED", .desc = "Retired mispredicted branch instructions (Precise Event)", .code = 0xc5, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .flags = INTEL_X86_PEBS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_br_misp_retired), .umasks = glm_br_misp_retired, }, { .name = "DECODE_RESTRICTION", .desc = "Decode restrictions due to predicting wrong instruction length", .code = 0xe9, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_decode_restriction), .umasks = glm_decode_restriction, }, { .name = "MISALIGN_MEM_REF", .desc = "Load uops that split a page (Precise Event)", .code = 0x13, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .flags = INTEL_X86_PEBS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_misalign_mem_ref), .umasks = glm_misalign_mem_ref, }, { .name = 
"INST_RETIRED", .desc = "Instructions retired (Precise Event)", .code = 0xc0, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x10000000full, .flags = INTEL_X86_PEBS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_inst_retired), .umasks = glm_inst_retired, }, { .name = "INSTRUCTION_RETIRED", .desc = "Number of instructions retired", .code = 0xc0, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x100000ffull, .ngrp = 0, }, { .name = "ISSUE_SLOTS_NOT_CONSUMED", .desc = "Unfilled issue slots per cycle because of a full resource in the backend", .code = 0xca, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_issue_slots_not_consumed), .umasks = glm_issue_slots_not_consumed, }, { .name = "ITLB", .desc = "ITLB misses", .code = 0x81, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_itlb), .umasks = glm_itlb, }, { .name = "LONGEST_LAT_CACHE", .desc = "L2 cache requests", .code = 0x2e, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_longest_lat_cache), .umasks = glm_longest_lat_cache, }, { .name = "MEM_LOAD_UOPS_RETIRED", .desc = "Load uops retired that hit L1 data cache (Precise Event)", .code = 0xd1, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .flags = INTEL_X86_PEBS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_mem_load_uops_retired), .umasks = glm_mem_load_uops_retired, }, { .name = "LD_BLOCKS", .desc = "Loads blocked (Precise Event)", .code = 0x03, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .flags = INTEL_X86_PEBS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_ld_blocks), .umasks = glm_ld_blocks, }, { .name = "DL1", .desc = "L1 Cache evictions for dirty data", .code = 0x51, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_dl1), .umasks = glm_dl1, }, { .name = "CYCLES_DIV_BUSY", .desc = "Cycles a divider is busy", .code = 0xcd, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_cycles_div_busy), .umasks = 
glm_cycles_div_busy, }, { .name = "MS_DECODED", .desc = "MS decode starts", .code = 0xe7, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_ms_decoded), .umasks = glm_ms_decoded, }, { .name = "UOPS_RETIRED", .desc = "Uops retired (Precise Event)", .code = 0xc2, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .flags = INTEL_X86_PEBS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_uops_retired), .umasks = glm_uops_retired, }, { .name = "OFFCORE_RESPONSE_1", .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", .code = 0x2b7, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xffull, .flags = INTEL_X86_NHM_OFFCORE, .ngrp = 3, .numasks = LIBPFM_ARRAY_SIZE(glm_offcore_response_1), .umasks = glm_offcore_response_1, }, { .name = "MACHINE_CLEARS", .desc = "Self-Modifying Code detected", .code = 0xc3, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_machine_clears), .umasks = glm_machine_clears, }, { .name = "BR_INST_RETIRED", .desc = "Retired branch instructions (Precise Event)", .code = 0xc4, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .flags = INTEL_X86_PEBS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_br_inst_retired), .umasks = glm_br_inst_retired, }, { .name = "FETCH_STALL", .desc = "Cycles where code-fetch is stalled and an ICache miss is outstanding. 
This is not the same as an ICache Miss", .code = 0x86, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_fetch_stall), .umasks = glm_fetch_stall, }, { .name = "UOPS_NOT_DELIVERED", .desc = "Uops requested but not-delivered to the back-end per cycle", .code = 0x9c, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_uops_not_delivered), .umasks = glm_uops_not_delivered, }, { .name = "MISPREDICTED_BRANCH_RETIRED", .desc = "Number of mispredicted branch instructions retired", .code = 0xc5, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xffull, .equiv = "BR_MISP_RETIRED:ALL_BRANCHES", .ngrp = 0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "Number of instructions retired", .code = 0xc0, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x100000ffull, .equiv = "INSTRUCTION_RETIRED", .ngrp = 0, }, { .name = "MEM_UOPS_RETIRED", .desc = "Load uops retired (Precise Event)", .code = 0xd0, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .flags = INTEL_X86_PEBS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_mem_uops_retired), .umasks = glm_mem_uops_retired, }, { .name = "UOPS_ISSUED", .desc = "Uops issued to the back end per cycle", .code = 0x0e, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_uops_issued), .umasks = glm_uops_issued, }, { .name = "OFFCORE_RESPONSE_0", .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", .code = 0x1b7, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xffull, .flags = INTEL_X86_NHM_OFFCORE, .ngrp = 4, .numasks = LIBPFM_ARRAY_SIZE(glm_offcore_response_0), .umasks = glm_offcore_response_0, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles. 
Ticks at constant reference frequency", .code = 0x0300, .modmsk = INTEL_FIXED2_ATTRS, .cntmsk = 0x40000000ull, .flags = INTEL_X86_FIXED, .ngrp = 0, }, { .name = "BRANCH_INSTRUCTIONS_RETIRED", .desc = "Number of branch instructions retired", .code = 0xc4, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xffull, .equiv = "BR_INST_RETIRED:ALL_BRANCHES", .ngrp = 0, }, { .name = "CORE_REJECT_L2Q", .desc = "Requests rejected by the L2Q ", .code = 0x31, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_core_reject_l2q), .umasks = glm_core_reject_l2q, }, { .name = "PAGE_WALKS", .desc = "Duration of D-side page-walks in cycles", .code = 0x05, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_page_walks), .umasks = glm_page_walks, }, { .name = "BACLEARS", .desc = "BACLEARs asserted for any branch type", .code = 0xe6, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_baclears), .umasks = glm_baclears, }, { .name = "CPU_CLK_UNHALTED", .desc = "Core cycles when core is not halted (Fixed event)", .code = 0x00, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x60000000full, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(glm_cpu_clk_unhalted), .umasks = glm_cpu_clk_unhalted, }, { .name = "UNHALTED_CORE_CYCLES", .desc = "Core clock cycles whenever the clock signal on the specific core is running (not halted)", .code = 0x3c, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x20000000ull, .ngrp = 0, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_gnr_events.h000066400000000000000000003335641502707512200235720ustar00rootroot00000000000000/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, 
and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: gnr (GraniteRapids) * Based on Intel JSON event table version : 1.06 * Based on Intel JSON event table published : 01/17/2025 */ static const intel_x86_umask_t intel_gnr_arith[]={ { .uname = "DIV_ACTIVE", .udesc = "Cycles when divide unit is busy executing divide or square root operations.", .ucode = 0x0900ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "FPDIV_ACTIVE", .udesc = "This event counts the cycles the floating point divider is busy.", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "IDIV_ACTIVE", .udesc = "This event counts the cycles the integer divider is busy.", .ucode = 0x0800ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_gnr_assists[]={ { .uname = "ANY", .udesc = "Number of occurrences where a microcode assist is invoked by hardware.", .ucode = 0x1b00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FP", .udesc = "Counts all microcode FP assists.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"HARDWARE", .udesc = "Count all other hardware assists or traps that are not necessarily architecturally exposed (through a software handler) beyond FP; SSE-AVX mix and A/D assists who are counted by dedicated sub-events. The event also counts for Machine Ordering count.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PAGE_FAULT", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SSE_AVX_MIX", .udesc = "TBD", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_baclears[]={ { .uname = "ANY", .udesc = "Clears due to Unknown Branches.", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_gnr_br_inst_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "All branch instructions retired.", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "COND", .udesc = "Conditional branch instructions retired.", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_NTAKEN", .udesc = "Not taken branch instructions retired.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_TAKEN", .udesc = "Taken conditional branch instructions retired.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "FAR_BRANCH", .udesc = "Far branch instructions retired.", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "INDIRECT", .udesc = "Indirect near branch instructions retired (excluding returns)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_CALL", .udesc = "Direct and indirect near call instructions retired.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_RETURN", .udesc = "Return instructions retired.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_TAKEN", .udesc = "Taken branch instructions retired.", 
.ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_gnr_br_misp_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "All mispredicted branch instructions retired.", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "ALL_BRANCHES_COST", .udesc = "All mispredicted branch instructions retired. This precise event may be used to get the misprediction cost via the Retire_Latency field of PEBS. It fires on the instruction that immediately follows the mispredicted branch.", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND", .udesc = "Mispredicted conditional branch instructions retired.", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_COST", .udesc = "Mispredicted conditional branch instructions retired. This precise event may be used to get the misprediction cost via the Retire_Latency field of PEBS. It fires on the instruction that immediately follows the mispredicted branch.", .ucode = 0x5100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_NTAKEN", .udesc = "Mispredicted non-taken conditional branch instructions retired.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_NTAKEN_COST", .udesc = "Mispredicted non-taken conditional branch instructions retired. This precise event may be used to get the misprediction cost via the Retire_Latency field of PEBS. It fires on the instruction that immediately follows the mispredicted branch.", .ucode = 0x5000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_TAKEN", .udesc = "number of branch instructions retired that were mispredicted and taken.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_TAKEN_COST", .udesc = "Mispredicted taken conditional branch instructions retired. 
This precise event may be used to get the misprediction cost via the Retire_Latency field of PEBS. It fires on the instruction that immediately follows the mispredicted branch.", .ucode = 0x4100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "INDIRECT", .udesc = "Mispredicted near indirect branch instructions retired (excluding returns)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "INDIRECT_CALL", .udesc = "Mispredicted indirect CALL retired.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "INDIRECT_CALL_COST", .udesc = "Mispredicted indirect CALL retired. This precise event may be used to get the misprediction cost via the Retire_Latency field of PEBS. It fires on the instruction that immediately follows the mispredicted branch.", .ucode = 0x4200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "INDIRECT_COST", .udesc = "Mispredicted near indirect branch instructions retired (excluding returns). This precise event may be used to get the misprediction cost via the Retire_Latency field of PEBS. It fires on the instruction that immediately follows the mispredicted branch.", .ucode = 0xc000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_TAKEN", .udesc = "Number of near branch instructions retired that were mispredicted and taken.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_TAKEN_COST", .udesc = "Mispredicted taken near branch instructions retired. This precise event may be used to get the misprediction cost via the Retire_Latency field of PEBS. It fires on the instruction that immediately follows the mispredicted branch.", .ucode = 0x6000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RET", .udesc = "This event counts the number of mispredicted ret instructions retired. 
Non PEBS", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RET_COST", .udesc = "Mispredicted ret instructions retired. This precise event may be used to get the misprediction cost via the Retire_Latency field of PEBS. It fires on the instruction that immediately follows the mispredicted branch.", .ucode = 0x4800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_gnr_cpu_clk_unhalted[]={ { .uname = "C01", .udesc = "Core clocks when the thread is in the C0.1 light-weight slower wakeup time but more power saving optimized state.", .ucode = 0x10ecull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "C02", .udesc = "Core clocks when the thread is in the C0.2 light-weight faster wakeup time but less power saving optimized state.", .ucode = 0x20ecull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "C0_WAIT", .udesc = "Core clocks when the thread is in the C0.1 or C0.2 or running a PAUSE in C0 ACPI state.", .ucode = 0x70ecull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "DISTRIBUTED", .udesc = "Cycle counts are evenly distributed between active threads in the Core.", .ucode = 0x02ecull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "ONE_THREAD_ACTIVE", .udesc = "Core crystal clock cycles when this thread is unhalted and the other thread is halted.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PAUSE", .udesc = "TBD", .ucode = 0x40ecull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "PAUSE_INST", .udesc = "TBD", .ucode = 0x40ecull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "REF_DISTRIBUTED", .udesc = "Core crystal clock cycles. 
Cycle counts are evenly distributed between active threads in the Core.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REF_TSC", .udesc = "Reference cycles when the core is not in halt state.", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "REF_TSC_P", .udesc = "Reference cycles when the core is not in halt state.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "THREAD", .udesc = "Core cycles when the thread is not in halt state", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, /* fixed counter encoding */ }, { .uname = "THREAD_P", .udesc = "Thread cycles when thread is not in halt state", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_cycle_activity[]={ { .uname = "CYCLES_L1D_MISS", .udesc = "Cycles while L1 cache miss demand load is outstanding.", .ucode = 0x0800ull | (0x8 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_L2_MISS", .udesc = "Cycles while L2 cache miss demand load is outstanding.", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_L3_MISS", .udesc = "Cycles while L3 cache miss demand load is outstanding.", .ucode = 0x0200ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_MEM_ANY", .udesc = "Cycles while memory subsystem has an outstanding load.", .ucode = 0x1000ull | (0x10 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L1D_MISS", .udesc = "Execution stalls while L1 cache miss demand load is outstanding.", .ucode = 0x0c00ull | (0xc << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L2_MISS", .udesc = "Execution stalls while L2 cache miss demand load is outstanding.", .ucode = 0x0500ull | (0x5 << 
INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L3_MISS", .udesc = "Execution stalls while L3 cache miss demand load is outstanding.", .ucode = 0x0600ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_TOTAL", .udesc = "Total execution stalls.", .ucode = 0x0400ull | (0x4 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_gnr_decode[]={ { .uname = "LCP", .udesc = "Stalls caused by changing prefix length of the instruction.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_BUSY", .udesc = "Cycles the Microcode Sequencer is busy.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_dsb2mite_switches[]={ { .uname = "PENALTY_CYCLES", .udesc = "DSB-to-MITE switch true penalty cycles.", .ucode = 0x0200ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_gnr_dtlb_load_misses[]={ { .uname = "STLB_HIT", .udesc = "Loads that miss the DTLB and hit the STLB.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_ACTIVE", .udesc = "Cycles when at least one PMH is busy with a page walk for a demand load.", .ucode = 0x1000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "WALK_COMPLETED", .udesc = "Load miss in all TLB levels causes a page walk that completes. 
(All page sizes)", .ucode = 0x0e00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_1G", .udesc = "Page walks completed due to a demand data load to a 1G page.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Page walks completed due to a demand data load to a 2M/4M page.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Page walks completed due to a demand data load to a 4K page.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_PENDING", .udesc = "Number of page walks outstanding for a demand load in the PMH each cycle.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_dtlb_store_misses[]={ { .uname = "STLB_HIT", .udesc = "Stores that miss the DTLB and hit the STLB.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_ACTIVE", .udesc = "Cycles when at least one PMH is busy with a page walk for a store.", .ucode = 0x1000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "WALK_COMPLETED", .udesc = "Store misses in all TLB levels causes a page walk that completes. 
(All page sizes)", .ucode = 0x0e00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_1G", .udesc = "Page walks completed due to a demand data store to a 1G page.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Page walks completed due to a demand data store to a 2M/4M page.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Page walks completed due to a demand data store to a 4K page.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_PENDING", .udesc = "Number of page walks outstanding for a store in the PMH each cycle.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_exe[]={ { .uname = "AMX_BUSY", .udesc = "Counts the cycles where the AMX (Advance Matrix Extension) unit is busy performing an operation.", .ucode = 0x0200ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_gnr_exe_activity[]={ { .uname = "1_PORTS_UTIL", .udesc = "Cycles total of 1 uop is executed on all ports and Reservation Station was not empty.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "2_3_PORTS_UTIL", .udesc = "Cycles total of 2 or 3 uops are executed on all ports and Reservation Station (RS) was not empty.", .ucode = 0x0c00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "2_PORTS_UTIL", .udesc = "Cycles total of 2 uops are executed on all ports and Reservation Station was not empty.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "3_PORTS_UTIL", .udesc = "Cycles total of 3 uops are executed on all ports and Reservation Station was not empty.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "4_PORTS_UTIL", .udesc = "Cycles total of 4 uops are executed on all ports and Reservation Station was not empty.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BOUND_ON_LOADS", .udesc = "Execution stalls while memory subsystem has an outstanding 
load.", .ucode = 0x2100ull | (0x5 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "BOUND_ON_STORES", .udesc = "Cycles where the Store Buffer was full and no loads caused an execution stall.", .ucode = 0x4000ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "EXE_BOUND_0_PORTS", .udesc = "Cycles no uop executed while RS was not empty, the SB was not full and there was no outstanding load.", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_fp_arith_dispatched[]={ { .uname = "PORT_0", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_1", .udesc = "TBD", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_5", .udesc = "TBD", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "V0", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "V1", .udesc = "TBD", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "V2", .udesc = "TBD", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_fp_arith_inst_retired[]={ { .uname = "128B_PACKED_DOUBLE", .udesc = "ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "128B_PACKED_SINGLE", .udesc = "ADD SUB MUL DIV MIN MAX RCP14 RSQRT14 SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "256B_PACKED_DOUBLE", .udesc = "ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT FM(N)ADD/SUB. 
FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "256B_PACKED_SINGLE", .udesc = "ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT RSQRT RCP DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "4_FLOPS", .udesc = "ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX RCP14 RSQRT14 SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB count twice as they perform 2 calculations per element.", .ucode = 0x1800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "512B_PACKED_DOUBLE", .udesc = "ADD SUB MUL DIV MIN MAX SQRT RSQRT14 RCP14 FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "512B_PACKED_SINGLE", .udesc = "ADD SUB MUL DIV MIN MAX SQRT RSQRT14 RCP14 FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "8_FLOPS", .udesc = "ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT RSQRT RSQRT14 RCP RCP14 DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB count twice as they perform 2 calculations per element.", .ucode = 0x6000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR", .udesc = "ADD SUB MUL DIV MIN MAX RCP14 RSQRT14 RANGE SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR_DOUBLE", .udesc = "ADD SUB MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR_SINGLE", .udesc = "ADD SUB MUL DIV MIN MAX SQRT RSQRT RCP FM(N)ADD/SUB. 
FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VECTOR", .udesc = "Number of any Vector retired FP arithmetic instructions", .ucode = 0xfc00ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_fp_arith_inst_retired2[]={ { .uname = "128B_PACKED_HALF", .udesc = "TBD", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "256B_PACKED_HALF", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "512B_PACKED_HALF", .udesc = "TBD", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "COMPLEX_SCALAR_HALF", .udesc = "TBD", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR", .udesc = "Number of all Scalar Half-Precision FP arithmetic instructions(1) retired - regular and complex.", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR_HALF", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VECTOR", .udesc = "Number of all Vector (also called packed) Half-Precision FP arithmetic instructions(1) retired.", .ucode = 0x1c00ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_frontend_retired[]={ { .uname = "ANY_ANT", .udesc = "Retired ANT branches", .ucode = 0x0900ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY_DSB_MISS", .udesc = "Retired Instructions who experienced DSB miss.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "DSB_MISS", .udesc = "Retired Instructions who experienced a critical DSB miss.", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ITLB_MISS", .udesc = "Retired Instructions who experienced iTLB true miss.", .ucode = 0x1400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1I_MISS", .udesc = "Retired Instructions who experienced Instruction L1 Cache true miss.", .ucode = 0x1200ull, .uflags = 
INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_MISS", .udesc = "Retired Instructions who experienced Instruction L2 Cache true miss.", .ucode = 0x1300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LATENCY_GE_1", .udesc = "Retired instructions after front-end starvation of at least 1 cycle", .ucode = 0x60010600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_128", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 128 cycles which was not interrupted by a back-end stall.", .ucode = 0x60800600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_16", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 16 cycles which was not interrupted by a back-end stall.", .ucode = 0x60100600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_2", .udesc = "Retired instructions after front-end starvation of at least 2 cycles", .ucode = 0x60020600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_256", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 256 cycles which was not interrupted by a back-end stall.", .ucode = 0x61000600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_2_BUBBLES_GE_1", .udesc = "Retired instructions that are fetched after an interval where the front-end had at least 1 bubble-slot for a period of 2 cycles which was not interrupted by a back-end stall.", .ucode = 0x10020600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = 
_INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_32", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 32 cycles which was not interrupted by a back-end stall.", .ucode = 0x60200600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_4", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 4 cycles which was not interrupted by a back-end stall.", .ucode = 0x60040600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_512", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 512 cycles which was not interrupted by a back-end stall.", .ucode = 0x62000600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_64", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 64 cycles which was not interrupted by a back-end stall.", .ucode = 0x60400600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_8", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 8 cycles which was not interrupted by a back-end stall.", .ucode = 0x60080600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATE_SWPF", .udesc = "I-Cache miss too close to Code Prefetch Instruction", .ucode = 0x0a00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "MISP_ANT", .udesc = "Mispredicted Retired ANT branches", .ucode = 0x0900ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "MS_FLOWS", .udesc = 
"TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS", .udesc = "Retired Instructions who experienced STLB (2nd level TLB) true miss.", .ucode = 0x1500ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "UNKNOWN_BRANCH", .udesc = "TBD", .ucode = 0x1700ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_gnr_icache_data[]={ { .uname = "STALLS", .udesc = "Cycles where a code fetch is stalled due to L1 instruction cache miss.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "STALL_PERIODS", .udesc = "TBD", .ucode = 0x0400ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, }; static const intel_x86_umask_t intel_gnr_icache_tag[]={ { .uname = "STALLS", .udesc = "Cycles where a code fetch is stalled due to L1 instruction cache tag miss.", .ucode = 0x0400ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_gnr_idq[]={ { .uname = "DSB_CYCLES_ANY", .udesc = "Cycles Decode Stream Buffer (DSB) is delivering any Uop", .ucode = 0x0800ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DSB_CYCLES_OK", .udesc = "Cycles DSB is delivering optimal number of Uops", .ucode = 0x0800ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DSB_UOPS", .udesc = "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MITE_CYCLES_ANY", .udesc = "Cycles MITE is delivering any Uop", .ucode = 0x0400ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MITE_CYCLES_OK", .udesc = "Cycles MITE is delivering optimal number of Uops", .ucode = 0x0400ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, 
.modhw = _INTEL_X86_ATTR_C, }, { .uname = "MITE_UOPS", .udesc = "Uops delivered to Instruction Decode Queue (IDQ) from MITE path", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_CYCLES_ANY", .udesc = "Cycles when uops are being delivered to IDQ while MS is busy", .ucode = 0x2000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MS_SWITCHES", .udesc = "Number of switches from DSB or MITE to the MS", .ucode = 0x2000ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "MS_UOPS", .udesc = "Uops initiated by MITE or Decode Stream Buffer (DSB) and delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_idq_bubbles[]={ { .uname = "CORE", .udesc = "This event counts a subset of the Topdown Slots event that when no operation was delivered to the back-end pipeline due to instruction fetch limitations when the back-end could have accepted more operations. 
Common examples include instruction cache misses or x86 instruction decode limitations.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_0_UOPS_DELIV_CORE", .udesc = "Cycles when no uops are delivered by the IDQ when backend of the machine is not stalled [This event is alias to IDQ_UOPS_NOT_DELIVERED.CYCLES_0_UOPS_DELIV.CORE]", .ucode = 0x0100ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_FE_WAS_OK", .udesc = "Cycles when optimal number of uops was delivered to the back-end when the back-end is not stalled [This event is alias to IDQ_UOPS_NOT_DELIVERED.CYCLES_FE_WAS_OK]", .ucode = 0x0100ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_gnr_idq_uops_not_delivered[]={ { .uname = "CORE", .udesc = "Uops not delivered by IDQ when backend of the machine is not stalled", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_0_UOPS_DELIV_CORE", .udesc = "Cycles when no uops are delivered by the IDQ when backend of the machine is not stalled [This event is alias to IDQ_BUBBLES.CYCLES_0_UOPS_DELIV.CORE]", .ucode = 0x0100ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_FE_WAS_OK", .udesc = "Cycles when optimal number of uops was delivered to the back-end when the back-end is not stalled [This event is alias to IDQ_BUBBLES.CYCLES_FE_WAS_OK]", .ucode = 0x0100ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_gnr_inst_decoded[]={ { .uname = "DECODERS", .udesc = "Instruction decoders utilized in a cycle", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_gnr_inst_retired[]={ { .uname = "ANY", .udesc = 
"Number of instructions retired. Fixed Counter - architectural event", .ucode = 0x0100ull, .ucntmsk = 0x100000000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_CODE_OVERRIDE | INTEL_X86_FIXED | INTEL_X86_DFL, /* * because this encoding is for a fixed counter, not all modifiers are available. Given that we do not have per umask modmsk, we use * the hardcoded modifiers field instead. We mark all unavailable modifiers as set (to 0) so the user cannot modify them */ .modhw = _INTEL_X86_ATTR_INTX | _INTEL_X86_ATTR_INTXCP | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_E, }, { .uname = "ANY_P", .udesc = "Number of instructions retired. General Counter - architectural event", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "MACRO_FUSED", .udesc = "TBD", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NOP", .udesc = "Retired NOP instructions.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "PREC_DIST", .udesc = "Precise instruction retired with PEBS precise-distribution", .ucode = 0x0100ull, .ucntmsk = 0x100000000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE | INTEL_X86_FIXED | INTEL_X86_PEBS, /* * because this encoding is for a fixed counter, not all modifiers are available. Given that we do not have per umask modmsk, we use * the hardcoded modifiers field instead. 
We mark all unavailable modifiers as set (to 0) so the user cannot modify them */ .modhw = _INTEL_X86_ATTR_INTX | _INTEL_X86_ATTR_INTXCP | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_I, }, { .uname = "REP_ITERATION", .udesc = "Iterations of Repeat string retired instructions.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_gnr_int_misc[]={ { .uname = "CLEARS_COUNT", .udesc = "Clears speculative count", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "CLEAR_RESTEER_CYCLES", .udesc = "Counts cycles after recovery from a branch misprediction or machine clear till the first uop is issued from the resteered path.", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MBA_STALLS", .udesc = "TBD", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RECOVERY_CYCLES", .udesc = "Core cycles the allocator was stalled due to recovery from earlier clear event for this thread", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UNKNOWN_BRANCH_CYCLES", .udesc = "Bubble cycles of BAClear (Unknown Branch).", .ucode = 0x0700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UOP_DROPPING", .udesc = "TMA slots where uops got dropped", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_int_vec_retired[]={ { .uname = "128BIT", .udesc = "TBD", .ucode = 0x1300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "256BIT", .udesc = "TBD", .ucode = 0xac00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ADD_128", .udesc = "Integer ADD, SUB, SAD 128-bit vector instructions.", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ADD_256", .udesc = "Integer ADD, SUB, SAD 256-bit vector instructions.", .ucode = 0x0c00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MUL_256", .udesc = "TBD", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { 
.uname = "SHUFFLES", .udesc = "TBD", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VNNI_128", .udesc = "TBD", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VNNI_256", .udesc = "TBD", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_itlb_misses[]={ { .uname = "STLB_HIT", .udesc = "Instruction fetch requests that miss the ITLB and hit the STLB.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_ACTIVE", .udesc = "Cycles when at least one PMH is busy with a page walk for code (instruction fetch) request.", .ucode = 0x1000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "WALK_COMPLETED", .udesc = "Code miss in all TLB levels causes a page walk that completes. (All page sizes)", .ucode = 0x0e00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Code miss in all TLB levels causes a page walk that completes. (2M/4M)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Code miss in all TLB levels causes a page walk that completes. 
(4K)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_PENDING", .udesc = "Number of page walks outstanding for an outstanding code request in the PMH each cycle.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_l1d[]={ { .uname = "HWPF_MISS", .udesc = "TBD", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REPLACEMENT", .udesc = "Counts the number of cache lines replaced in L1 data cache.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_l1d_pend_miss[]={ { .uname = "FB_FULL", .udesc = "Number of cycles a demand request has waited due to L1D Fill Buffer (FB) unavailability.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FB_FULL_PERIODS", .udesc = "Number of phases a demand request has waited due to L1D Fill Buffer (FB) unavailability.", .ucode = 0x0200ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "L2_STALLS", .udesc = "Number of cycles a demand request has waited due to L1D due to lack of L2 resources.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PENDING", .udesc = "Number of L1D misses that are outstanding", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PENDING_CYCLES", .udesc = "Cycles with L1D load Misses outstanding.", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_gnr_l2_lines_in[]={ { .uname = "ALL", .udesc = "L2 cache lines filling L2", .ucode = 0x1f00ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_gnr_l2_lines_out[]={ { .uname = "NON_SILENT", .udesc = "Modified cache lines that are evicted by L2 cache when triggered by an L2 cache fill.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SILENT", .udesc = "Non-modified cache 
lines that are silently dropped by L2 cache when triggered by an L2 cache fill.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "USELESS_HWPF", .udesc = "Cache lines that have been L2 hardware prefetched but not used by demand accesses", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_l2_rqsts[]={ { .uname = "ALL_CODE_RD", .udesc = "L2 code requests", .ucode = 0xe400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_DATA_RD", .udesc = "Demand Data Read access L2 cache", .ucode = 0xe100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_MISS", .udesc = "Demand requests that miss L2 cache", .ucode = 0x2700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_REFERENCES", .udesc = "Demand requests to L2 cache", .ucode = 0xe700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_HWPF", .udesc = "TBD", .ucode = 0xf000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_RFO", .udesc = "RFO requests to L2 cache", .ucode = 0xe200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_HIT", .udesc = "L2 cache hits when fetching instructions, code reads.", .ucode = 0xc400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_MISS", .udesc = "L2 cache misses when fetching instructions", .ucode = 0x2400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_HIT", .udesc = "Demand Data Read requests that hit L2 cache", .ucode = 0xc100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_MISS", .udesc = "Demand Data Read miss L2 cache", .ucode = 0x2100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HIT", .udesc = "All requests that hit L2 cache. 
[This event is alias to L2_REQUEST.HIT]", .ucode = 0xdf00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_MISS", .udesc = "TBD", .ucode = 0x3000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS", .udesc = "Read requests with true-miss in L2 cache [This event is alias to L2_REQUEST.MISS]", .ucode = 0x3f00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REFERENCES", .udesc = "All accesses to L2 cache [This event is alias to L2_REQUEST.ALL]", .ucode = 0xff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT", .udesc = "RFO requests that hit L2 cache", .ucode = 0xc200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_MISS", .udesc = "RFO requests that miss L2 cache", .ucode = 0x2200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SWPF_HIT", .udesc = "SW prefetch requests that hit L2 cache.", .ucode = 0xc800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SWPF_MISS", .udesc = "SW prefetch requests that miss L2 cache.", .ucode = 0x2800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_l2_trans[]={ { .uname = "L2_WB", .udesc = "L2 writebacks that access L2 cache", .ucode = 0x4000ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_gnr_ld_blocks[]={ { .uname = "ADDRESS_ALIAS", .udesc = "False dependencies in MOB due to partial compare on address.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_SR", .udesc = "The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use.", .ucode = 0x8800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STORE_FORWARD", .udesc = "Loads blocked due to overlapping with a preceding store that cannot be forwarded.", .ucode = 0x8200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_load_hit_prefetch[]={ { .uname = "SWPF", .udesc = "Counts the number of demand load dispatches that hit L1D fill buffer (FB) allocated for software prefetch.", .ucode = 0x0100ull, .uflags = 
INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_gnr_longest_lat_cache[]={ { .uname = "MISS", .udesc = "Core-originated cacheable requests that missed L3 (Except hardware prefetches to the L3)", .ucode = 0x4100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REFERENCE", .udesc = "Core-originated cacheable requests that refer to L3 (Except hardware prefetches to the L3)", .ucode = 0x4f00ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_lsd[]={ { .uname = "CYCLES_ACTIVE", .udesc = "Cycles Uops delivered by the LSD, but didn't come from the decoder.", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_OK", .udesc = "Cycles optimal number of Uops delivered by the LSD, but did not come from the decoder.", .ucode = 0x0100ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "UOPS", .udesc = "Number of Uops delivered by the LSD.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_machine_clears[]={ { .uname = "COUNT", .udesc = "Number of machine clears (nukes) of any type.", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "MEMORY_ORDERING", .udesc = "Number of machine clears due to memory ordering conflicts.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SMC", .udesc = "Self-modifying code (SMC) detected.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_memory_activity[]={ { .uname = "CYCLES_L1D_MISS", .udesc = "Cycles while L1 cache miss demand load is outstanding.", .ucode = 0x0200ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L1D_MISS", .udesc = "Execution stalls while L1 cache miss demand load is 
outstanding.", .ucode = 0x0300ull | (0x3 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L2_MISS", .udesc = "Execution stalls while L2 cache miss demand cacheable load request is outstanding.", .ucode = 0x0500ull | (0x5 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L3_MISS", .udesc = "Execution stalls while L3 cache miss demand cacheable load request is outstanding.", .ucode = 0x0900ull | (0x9 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_gnr_mem_inst_retired[]={ { .uname = "ALL_LOADS", .udesc = "Retired load instructions.", .ucode = 0x8100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_STORES", .udesc = "Retired store instructions.", .ucode = 0x8200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY", .udesc = "All retired memory instructions.", .ucode = 0x8300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "LOCK_LOADS", .udesc = "Retired load instructions with locked access.", .ucode = 0x2100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_LOADS", .udesc = "Retired load instructions that split across a cacheline boundary.", .ucode = 0x4100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_STORES", .udesc = "Retired store instructions that split across a cacheline boundary.", .ucode = 0x4200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_HIT_LOADS", .udesc = "Retired load instructions that hit the STLB.", .ucode = 0x0900ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_HIT_STORES", .udesc = "Retired store instructions that hit the STLB.", .ucode = 0x0a00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS_LOADS", .udesc = "Retired load instructions that miss the STLB.", .ucode = 0x1100ull, .uflags = 
INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS_STORES", .udesc = "Retired store instructions that miss the STLB.", .ucode = 0x1200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_gnr_mem_load_completed[]={ { .uname = "L1_MISS_ANY", .udesc = "Completed demand load uops that miss the L1 d-cache.", .ucode = 0xfd00ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_gnr_mem_load_l3_hit_retired[]={ { .uname = "XSNP_FWD", .udesc = "Retired load instructions whose data sources were HitM responses from shared L3", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_MISS", .udesc = "Retired load instructions whose data sources were L3 hit and cross-core snoop missed in on-pkg core cache.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_NONE", .udesc = "Retired load instructions whose data sources were hits in L3 without snoops required", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_NO_FWD", .udesc = "Retired load instructions whose data sources were L3 and cross-core snoop hits in on-pkg core cache", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_gnr_mem_load_l3_miss_retired[]={ { .uname = "LOCAL_DRAM", .udesc = "Retired load instructions which data sources missed L3 but serviced from local dram", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_DRAM", .udesc = "TBD", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_FWD", .udesc = "Retired load instructions whose data sources was forwarded from a remote cache", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_HITM", .udesc = "TBD", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_gnr_mem_load_misc_retired[]={ 
{ .uname = "UC", .udesc = "Retired instructions with at least 1 uncacheable load or lock.", .ucode = 0x0400ull, .uflags = INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_gnr_mem_load_retired[]={ { .uname = "FB_HIT", .udesc = "Number of completed demand load requests that missed the L1, but hit the FB(fill buffer), because a preceding miss to the same cacheline initiated the line to be brought into L1, but data is not yet ready in L1.", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1_HIT", .udesc = "Retired load instructions with L1 cache hits as data sources", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1_MISS", .udesc = "Retired load instructions missed L1 cache as data sources", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_HIT", .udesc = "Retired load instructions with L2 cache hits as data sources", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_MISS", .udesc = "Retired load instructions missed L2 cache as data sources", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_HIT", .udesc = "Retired load instructions with L3 cache hits as data sources", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_MISS", .udesc = "Retired load instructions missed L3 cache as data sources", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_gnr_mem_store_retired[]={ { .uname = "L2_HIT", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_gnr_mem_trans_retired[]={ { .uname = "LOAD_LATENCY", .udesc = "Memory load instructions retired above programmed clocks, minimum threshold value is 3 (Precise Event and ldlat required)", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT, }, { .uname = "LOAD_LATENCY_GT_1024", 
.udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 1024 cycles.", .ucode = 0x100ull, .uequiv = "LOAD_LATENCY:ldlat=1024", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOAD_LATENCY_GT_128", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 128 cycles.", .ucode = 0x100ull, .uequiv = "LOAD_LATENCY:ldlat=128", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOAD_LATENCY_GT_16", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 16 cycles.", .ucode = 0x100ull, .uequiv = "LOAD_LATENCY:ldlat=16", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOAD_LATENCY_GT_2048", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 2048 cycles.", .ucode = 0x100ull, .uequiv = "LOAD_LATENCY:ldlat=2048", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOAD_LATENCY_GT_256", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 256 cycles.", .ucode = 0x100ull, .uequiv = "LOAD_LATENCY:ldlat=256", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOAD_LATENCY_GT_32", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 32 cycles.", .ucode = 0x100ull, .uequiv = "LOAD_LATENCY:ldlat=32", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOAD_LATENCY_GT_4", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 4 cycles.", .ucode = 0x100ull, .uequiv = "LOAD_LATENCY:ldlat=4", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOAD_LATENCY_GT_512", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 512 cycles.", .ucode = 0x100ull, .uequiv = "LOAD_LATENCY:ldlat=512", .uflags =
INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOAD_LATENCY_GT_64", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 64 cycles.", .ucode = 0x100ull, .uequiv = "LOAD_LATENCY:ldlat=64", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOAD_LATENCY_GT_8", .udesc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 8 cycles.", .ucode = 0x100ull, .uequiv = "LOAD_LATENCY:ldlat=8", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STORE_SAMPLE", .udesc = "Retired memory store access operations. A PDist event for PEBS Store Latency Facility.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_gnr_mem_uop_retired[]={ { .uname = "ANY", .udesc = "Retired memory uops for any access", .ucode = 0x0300ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_gnr_misc2_retired[]={ { .uname = "LFENCE", .udesc = "LFENCE instructions retired", .ucode = 0x2000ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_gnr_misc_retired[]={ { .uname = "LBR_INSERTS", .udesc = "Increments whenever there is an update to the LBR array.", .ucode = 0x2000ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_gnr_ocr[]={ { .uname = "DEMAND_CODE_RD_ANY_RESPONSE", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that have any type of response.", .ucode = 0x1000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_DRAM", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that were supplied by DRAM.", .ucode = 0x73c00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_L3_HIT", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that hit in the L3 or were snooped from another core's caches on the same socket.", .ucode = 0x3f803c000400ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_L3_HIT_SNOOP_HITM", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that resulted in a snoop hit a modified line in another core's caches which forwarded the data.", .ucode = 0x10003c000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_L3_MISS", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that were not supplied by the local socket's L1, L2, or L3 caches.", .ucode = 0x3fbfc0000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_LOCAL_DRAM", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that were supplied by DRAM attached to this socket, unless in Sub NUMA Cluster(SNC) Mode. In SNC Mode counts only those DRAM accesses that are controlled by the close SNC Cluster.", .ucode = 0x10400000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_ANY_RESPONSE", .udesc = "Counts demand data reads that have any type of response.", .ucode = 0x1000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_DRAM", .udesc = "Counts demand data reads that were supplied by DRAM.", .ucode = 0x73c00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT", .udesc = "Counts demand data reads that hit in the L3 or were snooped from another core's caches on the same socket.", .ucode = 0x3f803c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_HITM", .udesc = "Counts demand data reads that resulted in a snoop hit a modified line in another core's caches which forwarded the data.", .ucode = 0x10003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_HIT_NO_FWD", .udesc = "Counts demand data reads that resulted in a snoop that hit in another core, which did not forward the data.", .ucode = 0x4003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_HIT_WITH_FWD", .udesc = "Counts demand data 
reads that resulted in a snoop hit in another core's caches which forwarded the unmodified data to the requesting core.", .ucode = 0x8003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_MISS", .udesc = "Counts demand data reads that were not supplied by the local socket's L1, L2, or L3 caches.", .ucode = 0x3fbfc0000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_LOCAL_DRAM", .udesc = "Counts demand data reads that were supplied by DRAM attached to this socket, unless in Sub NUMA Cluster(SNC) Mode. In SNC Mode counts only those DRAM accesses that are controlled by the close SNC Cluster.", .ucode = 0x10400000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_REMOTE_CACHE_SNOOP_HITM", .udesc = "Counts demand data reads that were supplied by a cache on a remote socket where a snoop hit a modified line in another core's caches which forwarded the data.", .ucode = 0x103000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_REMOTE_CACHE_SNOOP_HIT_WITH_FWD", .udesc = "Counts demand data reads that were supplied by a cache on a remote socket where a snoop hit in another core's caches which forwarded the unmodified data to the requesting core.", .ucode = 0x83000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_REMOTE_DRAM", .udesc = "Counts demand data reads that were supplied by DRAM attached to another socket.", .ucode = 0x73000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_ANY_RESPONSE", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that have any type of response.", .ucode = 0x3f3ffc000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_DRAM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were supplied by DRAM.", .ucode = 0x73c00000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"READS_TO_CORE_SNC_CACHE_HITM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that hit a modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x100800447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_SNC_CACHE_HIT_WITH_FWD", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that either hit a non-modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x80800447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_SNC_DRAM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM on a distant memory controller of this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x70800447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_HIT", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit in the L3 or were snooped from another core's caches on the same socket.", .ucode = 0x3f803c000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_HIT_SNOOP_HITM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that resulted in a snoop hit a modified line in another core's caches which forwarded the data.", .ucode = 0x10003c000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_MISS", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were not supplied by the local socket's L1, L2, or L3 caches.", 
.ucode = 0x3f3fc0000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_LOCAL_DRAM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were supplied by DRAM attached to this socket, unless in Sub NUMA Cluster(SNC) Mode. In SNC Mode counts only those DRAM accesses that are controlled by the close SNC Cluster.", .ucode = 0x10400000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MODIFIED_WRITE_ANY_RESPONSE", .udesc = "Counts writebacks of modified cachelines and streaming stores that have any type of response.", .ucode = 0x1080800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_ANY_RESPONSE", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that have any type of response.", .ucode = 0x3f3ffc447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_DRAM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM.", .ucode = 0x73c00447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_HIT", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that hit in the L3 or were snooped from another core's caches on the same socket.", .ucode = 0x3f003c447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_HIT_SNOOP_HITM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that resulted in a snoop hit a modified line in another core's caches which forwarded the data.", .ucode = 0x10003c447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_MISS", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) 
that were not supplied by the local socket's L1, L2, or L3 caches.", .ucode = 0x3f3fc0447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_MISS_LOCAL", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were not supplied by the local socket's L1, L2, or L3 caches and the cacheline is homed locally.", .ucode = 0x3f04c0447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_MISS_LOCAL_SOCKET", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that missed the L3 Cache and were supplied by the local socket (DRAM or PMM), whether or not in Sub NUMA Cluster(SNC) Mode. In SNC Mode counts PMM or DRAM accesses that are controlled by the close or distant SNC Cluster. It does not count misses to the L3 which go to Local CXL Type 2 Memory or Local Non DRAM.", .ucode = 0x70cc0447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_LOCAL_DRAM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM attached to this socket, unless in Sub NUMA Cluster(SNC) Mode. In SNC Mode counts only those DRAM accesses that are controlled by the close SNC Cluster.", .ucode = 0x10400447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_LOCAL_SOCKET_DRAM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM attached to this socket, whether or not in Sub NUMA Cluster(SNC) Mode. 
In SNC Mode counts DRAM accesses that are controlled by the close or distant SNC Cluster.", .ucode = 0x70c00447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were not supplied by the local socket's L1, L2, or L3 caches and were supplied by a remote socket.", .ucode = 0x3f3300447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_CACHE_SNOOP_FWD", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by a cache on a remote socket where a snoop was sent and data was returned (Modified or Not Modified).", .ucode = 0x183000447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_CACHE_SNOOP_HITM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by a cache on a remote socket where a snoop hit a modified line in another core's caches which forwarded the data.", .ucode = 0x103000447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_CACHE_SNOOP_HIT_WITH_FWD", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by a cache on a remote socket where a snoop hit in another core's caches which forwarded the unmodified data to the requesting core.", .ucode = 0x83000447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_DRAM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM attached to another socket.", .ucode = 0x73000447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_MEMORY", .udesc = "Counts all 
(cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM or PMM attached to another socket.", .ucode = 0x73300447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_TO_CORE_L3_HIT_M", .udesc = "Counts demand reads for ownership (RFO), hardware prefetch RFOs (which bring data to L2), and software prefetches for exclusive ownership (PREFETCHW) that hit to a (M)odified cacheline in the L3 or snoop filter.", .ucode = 0x1f8004002200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STREAMING_WR_ANY_RESPONSE", .udesc = "Counts streaming stores that have any type of response.", .ucode = 0x1080000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITE_ESTIMATE_MEMORY", .udesc = "Counts Demand RFOs, ItoM's, PREFECTHW's, Hardware RFO Prefetches to the L1/L2 and Streaming stores that likely resulted in a store to Memory (DRAM or PMM)", .ucode = 0xfbff8082200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_offcore_requests[]={ { .uname = "ALL_REQUESTS", .udesc = "Any memory transaction that reached the SQ.", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA_RD", .udesc = "Demand and prefetch data reads", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD", .udesc = "Cacheable and Non-Cacheable code read requests", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD", .udesc = "Demand Data Read requests sent to uncore", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO", .udesc = "Demand RFO requests including regular RFOs, locks, ItoM", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L3_MISS_DEMAND_DATA_RD", .udesc = "Counts demand data read requests that miss the L3 cache.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_offcore_requests_outstanding[]={ { .uname = "CYCLES_WITH_DATA_RD", .udesc = "Cycles when 
offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore.", .ucode = 0x0800ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_WITH_DEMAND_CODE_RD", .udesc = "Cycles with offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore.", .ucode = 0x0200ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_WITH_DEMAND_DATA_RD", .udesc = "Cycles where at least 1 outstanding demand data read request is pending.", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_WITH_DEMAND_RFO", .udesc = "Cycles with offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore.", .ucode = 0x0400ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_WITH_L3_MISS_DEMAND_DATA_RD", .udesc = "Cycles where data return is pending for a Demand Data Read request that missed the L3 cache.", .ucode = 0x1000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DATA_RD", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD", .udesc = "Offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore, every cycle.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD", .udesc = "For every cycle, increments by the number of outstanding demand data read requests pending.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO", .udesc = "Store Read transactions pending for off-core.
Highly correlated.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L3_MISS_DEMAND_DATA_RD", .udesc = "For every cycle, increments by the number of demand data read requests pending that are known to have missed the L3 cache.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_resource_stalls[]={ { .uname = "SB", .udesc = "Cycles stalled due to no store buffers available. (not including draining from sync).", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCOREBOARD", .udesc = "Counts cycles where the pipeline is stalled due to serializing operations.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_rs[]={ { .uname = "EMPTY", .udesc = "Cycles when Reservation Station (RS) is empty for the thread.", .ucode = 0x0700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EMPTY_COUNT", .udesc = "Counts end of periods where the Reservation Station (RS) was empty.", .ucode = 0x0700ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "EMPTY_RESOURCE", .udesc = "Cycles when RS was empty and a resource allocation stall is asserted", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_rtm_retired[]={ { .uname = "ABORTED", .udesc = "Number of times an RTM execution aborted.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ABORTED_EVENTS", .udesc = "Number of times an RTM execution aborted due to none of the previous 3 categories (e.g. interrupt)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MEM", .udesc = "Number of times an RTM execution aborted due to various memory events (e.g.
read/write capacity and conflicts)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MEMTYPE", .udesc = "Number of times an RTM execution aborted due to incompatible memory type", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_UNFRIENDLY", .udesc = "Number of times an RTM execution aborted due to HLE-unfriendly instructions", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "COMMIT", .udesc = "Number of times an RTM execution successfully committed", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "START", .udesc = "Number of times an RTM execution started.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_sq_misc[]={ { .uname = "BUS_LOCK", .udesc = "Counts bus locks, accounts for cache line split locks and UC locks.", .ucode = 0x1000ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_gnr_sw_prefetch_access[]={ { .uname = "ANY", .udesc = "Counts the number of PREFETCHNTA, PREFETCHW, PREFETCHT0, PREFETCHT1 or PREFETCHT2 instructions executed.", .ucode = 0x0f00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "NTA", .udesc = "Number of PREFETCHNTA instructions executed.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PREFETCHW", .udesc = "Number of PREFETCHW instructions executed.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "T0", .udesc = "Number of PREFETCHT0 instructions executed.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "T1_T2", .udesc = "Number of PREFETCHT1 or PREFETCHT2 instructions executed.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_topdown[]={ { .uname = "BACKEND_BOUND_SLOTS", .udesc = "This event counts a subset of the Topdown Slots event that were not consumed by the back-end pipeline due to lack of back-end resources, as a result of memory subsystem delays, execution units limitations, or 
other conditions.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BAD_SPEC_SLOTS", .udesc = "TMA slots wasted due to incorrect speculations.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BR_MISPREDICT_SLOTS", .udesc = "TMA slots wasted due to incorrect speculation by branch mispredictions", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEMORY_BOUND_SLOTS", .udesc = "TBD", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOTS", .udesc = "TMA slots available for an unhalted logical processor. Fixed counter - architectural event", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOTS_P", .udesc = "TMA slots available for an unhalted logical processor. General counter - architectural event", .ucode = 0x01a4ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, }; static const intel_x86_umask_t intel_gnr_topdown_m[]={ { .uname = "BACKEND_BOUND", .udesc = "TMA slots where no uops were being issued due to lack of back-end resources", .ucode = 0x8300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "BAD_SPEC", .udesc = "TMA slots wasted due to incorrect speculations.", .ucode = 0x8100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "BR_MISPREDICT", .udesc = "TMA slots wasted due to incorrect speculation by branch mispredictions", .ucode = 0x8500ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "FRONTEND_BOUND", .udesc = "TMA slots where no uops were being issued due to lack of front-end resources.", .ucode = 0x8200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "FETCH_LAT", .udesc = "TMA slots wasted due to lack of uops to decode due to code fetch latencies.", .ucode = 0x8600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "HEAVY_OPS", .udesc = "TMA slots where instructions with 2+ uops retired.", .ucode = 0x8400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "MEMORY_BOUND", 
.udesc = "TMA slots wasted due to back-end waiting for memory.", .ucode = 0x8700ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "RETIRING", .udesc = "TMA slots where instructions are retiring", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "SLOTS", .udesc = "TMA slots available for an unhalted logical processor. Fixed counter - architectural event", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_tx_mem[]={ { .uname = "ABORT_CAPACITY_READ", .udesc = "Speculatively counts the number of TSX aborts due to a data capacity limitation for transactional reads", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_CAPACITY_WRITE", .udesc = "Speculatively counts the number of TSX aborts due to a data capacity limitation for transactional writes.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_CONFLICT", .udesc = "Number of times a transactional abort was signaled due to a data conflict on a transactionally accessed address", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_uops_decoded[]={ { .uname = "DEC0_UOPS", .udesc = "Number of non dec-by-all uops decoded by decoder", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_gnr_uops_dispatched[]={ { .uname = "PORT_0", .udesc = "Uops executed on port 0", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_1", .udesc = "Uops executed on port 1", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_2_3_10", .udesc = "Uops executed on ports 2, 3 and 10", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_4_9", .udesc = "Uops executed on ports 4 and 9", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_5_11", .udesc = "Uops executed on ports 5 and 11", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_6", .udesc = "Uops 
executed on port 6", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_7_8", .udesc = "Uops executed on ports 7 and 8", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_uops_executed[]={ { .uname = "CORE", .udesc = "Number of uops executed on the core.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORE_CYCLES_GE_1", .udesc = "Cycles where at least 1 micro-op is executed from any thread on physical core.", .ucode = 0x0200ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_2", .udesc = "Cycles where at least 2 micro-ops are executed from any thread on physical core.", .ucode = 0x0200ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_3", .udesc = "Cycles where at least 3 micro-ops are executed from any thread on physical core.", .ucode = 0x0200ull | (0x3 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_4", .udesc = "Cycles where at least 4 micro-ops are executed from any thread on physical core.", .ucode = 0x0200ull | (0x4 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_1", .udesc = "Cycles where at least 1 uop was executed per-thread", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_2", .udesc = "Cycles where at least 2 uops were executed per-thread", .ucode = 0x0100ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_3", .udesc = "Cycles where at least 3 uops were executed per-thread", .ucode = 0x0100ull | (0x3 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_4", .udesc = "Cycles where at least 4 uops were executed per-thread", .ucode = 0x0100ull | (0x4 <<
INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS", .udesc = "Counts number of cycles no uops were dispatched to be executed on this thread.", .ucode = 0x0100ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "THREAD", .udesc = "Counts the number of uops to be executed per-thread each cycle.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "X87", .udesc = "Counts the number of x87 uops dispatched.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_gnr_uops_issued[]={ { .uname = "ANY", .udesc = "Uops that RAT issues to RS", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CYCLES", .udesc = "TBD", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_gnr_uops_retired[]={ { .uname = "CYCLES", .udesc = "Cycles with retired uop(s).", .ucode = 0x0200ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "HEAVY", .udesc = "Retired uops except the last uop of each instruction.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOTS", .udesc = "This event counts a subset of the Topdown Slots event that are utilized by operations that eventually get retired (committed) by the processor pipeline. 
Usually, this event positively correlates with higher performance for example, as measured by the instructions-per-cycle metric.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS", .udesc = "Cycles without actually retired uops.", .ucode = 0x0200ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_gnr_xq[]={ { .uname = "FULL_CYCLES", .udesc = "Cycles the uncore cannot take further requests", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_DFL, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_entry_t intel_gnr_pe[]={ { .name = "UNHALTED_CORE_CYCLES", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x20000000full, .code = 0x3c, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles", .modmsk = INTEL_FIXED3_ATTRS, .cntmsk = 0x400000000ull, .code = 0x0300, /* pseudo encoding */ .flags = INTEL_X86_FIXED, }, { .name = "INSTRUCTION_RETIRED", .desc = "Number of instructions at retirement", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x1000000ffull, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "Number of instructions at retirement", .modmsk = INTEL_V2_ATTRS, .equiv = "INSTRUCTION_RETIRED", .cntmsk = 0x1000000ffull, .code = 0xc0, }, { .name = "ARITH", .desc = "Arithmetic operations.", .code = 0x00b0, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_arith), .umasks = intel_gnr_arith, }, { .name = "ASSISTS", .desc = "Counts all hardware assists.", .code = 0x00c1, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_assists), .umasks = intel_gnr_assists, }, { .name = "BACLEARS", .desc = "Clears due to Unknown Branches.", .code = 0x0060, .modmsk 
= INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_baclears), .umasks = intel_gnr_baclears, }, { .name = "BR_INST_RETIRED", .desc = "Branch instructions retired.", .code = 0x00c4, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_br_inst_retired), .umasks = intel_gnr_br_inst_retired, }, { .name = "BR_MISP_RETIRED", .desc = "Mispredicted branch instructions retired.", .code = 0x00c5, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_br_misp_retired), .umasks = intel_gnr_br_misp_retired, }, { .name = "CPU_CLK_UNHALTED", .desc = "Core cycles when the thread is not in halt state", .code = 0x003c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0x1ull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_cpu_clk_unhalted), .umasks = intel_gnr_cpu_clk_unhalted, }, { .name = "CYCLE_ACTIVITY", .desc = "Stalled cycles.", .code = 0x00a3, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_cycle_activity), .umasks = intel_gnr_cycle_activity, }, { .name = "DECODE", .desc = "Decoder activity.", .code = 0x0087, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_decode), .umasks = intel_gnr_decode, }, { .name = "DSB2MITE_SWITCHES", .desc = "DSB to MITE switches.", .code = 0x0061, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_dsb2mite_switches), .umasks = intel_gnr_dsb2mite_switches, }, { .name = "DTLB_LOAD_MISSES", .desc = "Data TLB load misses.", .code = 0x0012, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_dtlb_load_misses), .umasks = intel_gnr_dtlb_load_misses, }, { .name = "DTLB_STORE_MISSES", .desc = 
"Data TLB store misses.", .code = 0x0013, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_dtlb_store_misses), .umasks = intel_gnr_dtlb_store_misses, }, { .name = "EXE", .desc = "Execution cycles.", .code = 0x01b7, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC | INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_exe), .umasks = intel_gnr_exe, }, { .name = "EXE_ACTIVITY", .desc = "Execution activity.", .code = 0x00a6, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_exe_activity), .umasks = intel_gnr_exe_activity, }, { .name = "FP_ARITH_DISPATCHED", .desc = "TBD", .code = 0x00b3, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_fp_arith_dispatched), .umasks = intel_gnr_fp_arith_dispatched, }, { .name = "FP_ARITH_INST_RETIRED", .desc = "Counts number of arithmetic operations.", .code = 0x00c7, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_fp_arith_inst_retired), .umasks = intel_gnr_fp_arith_inst_retired, }, { .name = "FP_ARITH_INST_RETIRED2", .desc = "TBD", .code = 0x00cf, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_fp_arith_inst_retired2), .umasks = intel_gnr_fp_arith_inst_retired2, }, { .name = "FRONTEND_RETIRED", .desc = "Precise frontend retired events", .code = 0x03c6, .modmsk = INTEL_SKL_FE_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_FRONTEND | INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_frontend_retired), .umasks = intel_gnr_frontend_retired, }, { .name = "ICACHE_DATA", .desc = "Instruction cache.", .code = 0x0080, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_icache_data), .umasks = intel_gnr_icache_data, }, { .name = 
"ICACHE_TAG", .desc = "Instruction cache tagging.", .code = 0x0083, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_icache_tag), .umasks = intel_gnr_icache_tag, }, { .name = "IDQ", .desc = "IDQ (Instruction Decoded Queue) operations.", .code = 0x0079, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_idq), .umasks = intel_gnr_idq, }, { .name = "IDQ_BUBBLES", .desc = "IDQ (Instruction Decoded Queue) bubbles.", .code = 0x009c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_idq_bubbles), .umasks = intel_gnr_idq_bubbles, }, { .name = "IDQ_UOPS_NOT_DELIVERED", .desc = "Uops not delivered.", .code = 0x009c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_idq_uops_not_delivered), .umasks = intel_gnr_idq_uops_not_delivered, }, { .name = "INST_DECODED", .desc = "Instruction decoded.", .code = 0x0075, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_inst_decoded), .umasks = intel_gnr_inst_decoded, }, { .name = "INST_RETIRED", .desc = "Number of instructions retired.", .code = 0x00c0, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0x1ull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_inst_retired), .umasks = intel_gnr_inst_retired, }, { .name = "INT_MISC", .desc = "Miscellaneous interruptions.", .code = 0x00ad, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_int_misc), .umasks = intel_gnr_int_misc, }, { .name = "INT_VEC_RETIRED", .desc = "Integer ADD, SUB, SAD 128-bit vector instructions.", .code = 0x00e7, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_int_vec_retired), .umasks = 
intel_gnr_int_vec_retired, }, { .name = "ITLB_MISSES", .desc = "Instruction TLB misses.", .code = 0x0011, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_itlb_misses), .umasks = intel_gnr_itlb_misses, }, { .name = "L1D", .desc = "L1D cache.", .code = 0x0051, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_l1d), .umasks = intel_gnr_l1d, }, { .name = "L1D_PEND_MISS", .desc = "L1D pending misses.", .code = 0x0048, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_l1d_pend_miss), .umasks = intel_gnr_l1d_pend_miss, }, { .name = "L2_LINES_IN", .desc = "L2 lines allocated.", .code = 0x0025, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_l2_lines_in), .umasks = intel_gnr_l2_lines_in, }, { .name = "L2_LINES_OUT", .desc = "L2 lines evicted.", .code = 0x0026, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_l2_lines_out), .umasks = intel_gnr_l2_lines_out, }, { .name = "L2_RQSTS", .desc = "L2 requests.", .code = 0x0024, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_l2_rqsts), .umasks = intel_gnr_l2_rqsts, }, { .name = "L2_TRANS", .desc = "L2 transactions.", .code = 0x0023, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_l2_trans), .umasks = intel_gnr_l2_trans, }, { .name = "LD_BLOCKS", .desc = "Blocking loads.", .code = 0x0003, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_ld_blocks), .umasks = intel_gnr_ld_blocks, }, { .name = "LOAD_HIT_PREFETCH", .desc = "Load dispatches.", .code = 0x004c, .modmsk = INTEL_V5_ATTRS, .cntmsk 
= 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_load_hit_prefetch), .umasks = intel_gnr_load_hit_prefetch, }, { .name = "LONGEST_LAT_CACHE", .desc = "L3 cache.", .code = 0x002e, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_longest_lat_cache), .umasks = intel_gnr_longest_lat_cache, }, { .name = "LSD", .desc = "LSD (Loop Stream Detector) operations.", .code = 0x00a8, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_lsd), .umasks = intel_gnr_lsd, }, { .name = "MACHINE_CLEARS", .desc = "Number of machine clears (nukes) of any type.", .code = 0x00c3, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_machine_clears), .umasks = intel_gnr_machine_clears, }, { .name = "MEMORY_ACTIVITY", .desc = "Memory activity.", .code = 0x0047, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_memory_activity), .umasks = intel_gnr_memory_activity, }, { .name = "MEM_INST_RETIRED", .desc = "Retired memory instructions.", .code = 0x00d0, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_mem_inst_retired), .umasks = intel_gnr_mem_inst_retired, }, { .name = "MEM_LOAD_COMPLETED", .desc = "Completed demand load.", .code = 0x0043, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_mem_load_completed), .umasks = intel_gnr_mem_load_completed, }, { .name = "MEM_LOAD_L3_HIT_RETIRED", .desc = "Retired load instructions whose data sources were L3 hit.", .code = 0x00d2, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_mem_load_l3_hit_retired), .umasks = intel_gnr_mem_load_l3_hit_retired, }, { 
.name = "MEM_LOAD_L3_MISS_RETIRED", .desc = "Retired load instructions whose data sources missed L3.", .code = 0x00d3, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_mem_load_l3_miss_retired), .umasks = intel_gnr_mem_load_l3_miss_retired, }, { .name = "MEM_LOAD_MISC_RETIRED", .desc = "Miscellaneous retired load instructions.", .code = 0x00d4, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_mem_load_misc_retired), .umasks = intel_gnr_mem_load_misc_retired, }, { .name = "MEM_LOAD_RETIRED", .desc = "Retired load operations.", .code = 0x00d1, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_mem_load_retired), .umasks = intel_gnr_mem_load_retired, }, { .name = "MEM_STORE_RETIRED", .desc = "Retired store operations", .code = 0x0044, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_mem_store_retired), .umasks = intel_gnr_mem_store_retired, }, { .name = "MEM_TRANS_RETIRED", .desc = "Memory transactions retired.", .code = 0x01cd, .modmsk = INTEL_V5_ATTRS | _INTEL_X86_ATTR_LDLAT, .cntmsk = 0xfeull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_mem_trans_retired), .umasks = intel_gnr_mem_trans_retired, }, { .name = "MEM_UOP_RETIRED", .desc = "Retired memory uops.", .code = 0x00e5, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_mem_uop_retired), .umasks = intel_gnr_mem_uop_retired, }, { .name = "MISC2_RETIRED", .desc = "Miscellaneous retired operations.", .code = 0x00e0, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_misc2_retired), .umasks = intel_gnr_misc2_retired, }, { .name = "MISC_RETIRED", .desc = "Miscellaneous retired operations.", .code = 0x00cc, .modmsk = INTEL_V5_ATTRS, 
.cntmsk = 0xffull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_misc_retired), .umasks = intel_gnr_misc_retired, }, { .name = "OCR", .desc = "Offcore response event.", .code = 0x012a, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC | INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_ocr), .umasks = intel_gnr_ocr, }, { .name = "OFFCORE_RESPONSE_0", .desc = "Offcore response event.", .code = 0x012a, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .equiv = "OCR", .flags = INTEL_X86_SPEC | INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_ocr), .umasks = intel_gnr_ocr, }, { .name = "OFFCORE_RESPONSE_1", .desc = "Offcore response event.", .code = 0x012b, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .equiv = "OCR", .flags = INTEL_X86_SPEC | INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_ocr), .umasks = intel_gnr_ocr, }, { .name = "OFFCORE_REQUESTS", .desc = "Offcore requests.", .code = 0x0021, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_offcore_requests), .umasks = intel_gnr_offcore_requests, }, { .name = "OFFCORE_REQUESTS_OUTSTANDING", .desc = "Outstanding offcore requests.", .code = 0x0020, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_offcore_requests_outstanding), .umasks = intel_gnr_offcore_requests_outstanding, }, { .name = "RESOURCE_STALLS", .desc = "Cycles where Allocation is stalled due to Resource Related reasons.", .code = 0x00a2, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_resource_stalls), .umasks = intel_gnr_resource_stalls, }, { .name = "RS", .desc = "Reservation Station (RS) activity.", .code = 0x00a5, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_rs), .umasks = intel_gnr_rs, }, { 
.name = "RTM_RETIRED", .desc = "Number of times an RTM execution started.", .code = 0x00c9, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_rtm_retired), .umasks = intel_gnr_rtm_retired, }, { .name = "SQ_MISC", .desc = "Miscellaneous SQ activity.", .code = 0x002c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_sq_misc), .umasks = intel_gnr_sq_misc, }, { .name = "SW_PREFETCH_ACCESS", .desc = "Software prefetch accesses.", .code = 0x0040, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_sw_prefetch_access), .umasks = intel_gnr_sw_prefetch_access, }, { .name = "TOPDOWN", .desc = "Topdown event.", .code = 0x0000, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0x1ull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_topdown), .umasks = intel_gnr_topdown, }, { .name = "TOPDOWN_M", .desc = "Topdown events via PERF_METRICS MSR (Linux only). All events must be in a Linux perf_events group and SLOTS must be the first event for the kernel to program the events onto the PERF_METRICS MSR. 
Only SLOTS umask accepts modifiers", .cntmsk = 0x1000000000ull, .modmsk = INTEL_FIXED2_ATTRS, .ngrp = 1, .flags = INTEL_X86_SPEC | INTEL_X86_FIXED, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_topdown_m), .umasks = intel_gnr_topdown_m, }, { .name = "TX_MEM", .desc = "Number of times a transactional abort was signaled due to a data conflict on a transactionally accessed address", .code = 0x0054, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_tx_mem), .umasks = intel_gnr_tx_mem, }, { .name = "UOPS_DECODED", .desc = "Uops decoded.", .code = 0x0076, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_uops_decoded), .umasks = intel_gnr_uops_decoded, }, { .name = "UOPS_DISPATCHED", .desc = "Uops dispatched.", .code = 0x00b2, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_uops_dispatched), .umasks = intel_gnr_uops_dispatched, }, { .name = "UOPS_EXECUTED", .desc = "Uops executed.", .code = 0x00b1, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_uops_executed), .umasks = intel_gnr_uops_executed, }, { .name = "UOPS_ISSUED", .desc = "Uops issued.", .code = 0x00ae, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_uops_issued), .umasks = intel_gnr_uops_issued, }, { .name = "UOPS_RETIRED", .desc = "Uops retired.", .code = 0x00c2, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_uops_retired), .umasks = intel_gnr_uops_retired, }, { .name = "XQ", .desc = "Xq activity.", .code = 0x002d, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_gnr_xq), .umasks = intel_gnr_xq, }, }; /* 67 events available */ 
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_gnr_unc_imc_events.h
/*
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
 *
 * PMU: gnr_unc_imc (GraniteRapids Uncore IMC)
 * Based on Intel JSON event table version : 1.06
 * Based on Intel JSON event table published : 01/17/2025
 */
static const intel_x86_umask_t gnr_imc_unc_m_act_count[]={ { .uname = "ALL", .udesc = "Counts the number of DRAM Activate commands sent on this channel. Activate commands are issued to open up a page on the DRAM devices so that it can be read or written to with a CAS. 
One can calculate the number of Page Misses by subtracting the number of Page Miss precharges from the number of Activates.", .ucode = 0xf700ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "RD", .udesc = "Read transaction on Page Empty or Page Miss : Counts the number of DRAM Activate commands sent on this channel. Activate commands are issued to open up a page on the DRAM devices so that it can be read or written to with a CAS. One can calculate the number of Page Misses by subtracting the number of Page Miss precharges from the number of Activates. (experimental)", .ucode = 0xf100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UFILL", .udesc = "Underfill Read transaction on Page Empty or Page Miss : Counts the number of DRAM Activate commands sent on this channel. Activate commands are issued to open up a page on the DRAM devices so that it can be read or written to with a CAS. One can calculate the number of Page Misses by subtracting the number of Page Miss precharges from the number of Activates. (experimental)", .ucode = 0xf400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR", .udesc = "Write transaction on Page Empty or Page Miss : Counts the number of DRAM Activate commands sent on this channel. Activate commands are issued to open up a page on the DRAM devices so that it can be read or written to with a CAS. One can calculate the number of Page Misses by subtracting the number of Page Miss precharges from the number of Activates. 
(experimental)", .ucode = 0xf200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t gnr_imc_unc_m_cas_count_sch0[]={ { .uname = "ALL", .udesc = "CAS count for SubChannel 0, all CAS operations", .ucode = 0xff00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "RD", .udesc = "CAS count for SubChannel 0, all reads", .ucode = 0xcf00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_REG", .udesc = "CAS count for SubChannel 0 regular reads", .ucode = 0xc100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_UNDERFILL", .udesc = "CAS count for SubChannel 0 underfill reads", .ucode = 0xc400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR", .udesc = "CAS count for SubChannel 0, all writes", .ucode = 0xf000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_NONPRE", .udesc = "CAS count for SubChannel 0 regular writes (experimental)", .ucode = 0xd000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_PRE", .udesc = "CAS count for SubChannel 0 auto-precharge writes (experimental)", .ucode = 0xe000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t gnr_imc_unc_m_cas_count_sch1[]={ { .uname = "ALL", .udesc = "CAS count for SubChannel 1, all CAS operations", .ucode = 0xff00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "RD", .udesc = "CAS count for SubChannel 1, all reads", .ucode = 0xcf00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_REG", .udesc = "CAS count for SubChannel 1 regular reads", .ucode = 0xc100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_UNDERFILL", .udesc = "CAS count for SubChannel 1 underfill reads", .ucode = 0xc400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR", .udesc = "CAS count for SubChannel 1, all writes", .ucode = 0xf000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_NONPRE", .udesc = "CAS count for SubChannel 1 regular writes (experimental)", .ucode = 0xd000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_PRE", .udesc = "CAS count for SubChannel 1 auto-precharge writes 
(experimental)", .ucode = 0xe000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t gnr_imc_unc_m_powerdown_cycles[]={ { .uname = "SCH0_RANK0", .udesc = "Number of cycles a given rank is in Power Down Mode (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCH0_RANK1", .udesc = "Number of cycles a given rank is in Power Down Mode (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCH0_RANK2", .udesc = "Number of cycles a given rank is in Power Down Mode (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCH0_RANK3", .udesc = "Number of cycles a given rank is in Power Down Mode (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCH1_RANK0", .udesc = "Number of cycles a given rank is in Power Down Mode (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCH1_RANK1", .udesc = "Number of cycles a given rank is in Power Down Mode (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCH1_RANK2", .udesc = "Number of cycles a given rank is in Power Down Mode (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCH1_RANK3", .udesc = "Number of cycles a given rank is in Power Down Mode (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t gnr_imc_unc_m_pre_count[]={ { .uname = "ALL", .udesc = "Counts the number of DRAM Precharge commands sent on this channel.", .ucode = 0xff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PGT", .udesc = "Precharge due to (?) : Counts the number of DRAM Precharge commands sent on this channel.", .ucode = 0xf800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD", .udesc = "Counts the number of DRAM Precharge commands sent on this channel. 
(experimental)", .ucode = 0xf100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UFILL", .udesc = "Counts the number of DRAM Precharge commands sent on this channel. (experimental)", .ucode = 0xf400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR", .udesc = "Counts the number of DRAM Precharge commands sent on this channel. (experimental)", .ucode = 0xf200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t gnr_imc_unc_m_rdb_inserts[]={ { .uname = "SCH0", .udesc = "Read buffer inserts on subchannel 0", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCH1", .udesc = "Read buffer inserts on subchannel 1", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t gnr_imc_unc_m_rpq_inserts[]={ { .uname = "PCH0", .udesc = "Counts the number of allocations into the Read Pending Queue. This queue is used to schedule reads out to the memory controller and to track the requests. Requests allocate into the RPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the HA to the iMC. They deallocate after the CAS command has been issued to memory. This includes both ISOCH and non-ISOCH requests. (experimental)", .ucode = 0x5000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PCH1", .udesc = "Counts the number of allocations into the Read Pending Queue. This queue is used to schedule reads out to the memory controller and to track the requests. Requests allocate into the RPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the HA to the iMC. They deallocate after the CAS command has been issued to memory. This includes both ISOCH and non-ISOCH requests. 
(experimental)", .ucode = 0xa000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCH0_PCH0", .udesc = "Read Pending Queue inserts for subchannel 0, pseudochannel 0", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCH0_PCH1", .udesc = "Read Pending Queue inserts for subchannel 0, pseudochannel 1", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCH1_PCH0", .udesc = "Read Pending Queue inserts for subchannel 1, pseudochannel 0", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCH1_PCH1", .udesc = "Read Pending Queue inserts for subchannel 1, pseudochannel 1", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t gnr_imc_unc_m_self_refresh[]={ { .uname = "ENTER_SUCCESS", .udesc = "Number of cycles all ranks were in SR subevent1 - # of times all ranks went into SR subevent2 - # of times ps_sr_active asserted (SRE) subevent3 - # of times ps_sr_active deasserted (SRX) subevent4 - # of times PS->Refresh ps_sr_req asserted (SRE) subevent5 - # of times PS->Refresh ps_sr_req deasserted (SRX) subevent6 - # of cycles PSCtrlr FSM was in FATAL (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ENTER_SUCCESS_CYCLES", .udesc = "Number of cycles all ranks were in SR (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t gnr_imc_unc_m_wpq_inserts[]={ { .uname = "PCH0", .udesc = "Write Pending Queue Allocations (experimental)", .ucode = 0x5000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PCH1", .udesc = "Write Pending Queue Allocations (experimental)", .ucode = 0xa000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCH0_PCH0", .udesc = "Write Pending Queue inserts for subchannel 0, pseudochannel 0", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCH0_PCH1", .udesc = "Write Pending Queue inserts for subchannel 0, pseudochannel 1", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCH1_PCH0", .udesc = 
"Write Pending Queue inserts for subchannel 1, pseudochannel 0", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCH1_PCH1", .udesc = "Write Pending Queue inserts for subchannel 1, pseudochannel 1", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_gnr_unc_imc_pe[]={ { .name = "UNC_M_ACT_COUNT", .desc = "DRAM Activate Count", .code = 0x0002, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(gnr_imc_unc_m_act_count), .umasks = gnr_imc_unc_m_act_count, }, { .name = "UNC_M_CAS_COUNT_SCH0", .desc = "CAS count for SubChannel 0 regular reads", .code = 0x0005, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(gnr_imc_unc_m_cas_count_sch0), .umasks = gnr_imc_unc_m_cas_count_sch0, }, { .name = "UNC_M_CAS_COUNT_SCH1", .desc = "CAS count for SubChannel 1 regular reads", .code = 0x0006, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(gnr_imc_unc_m_cas_count_sch1), .umasks = gnr_imc_unc_m_cas_count_sch1, }, { .name = "UNC_M_CLOCKTICKS", .desc = "Number of DRAM DCLK clock cycles while the event is enabled. 
DCLK is 1/4 of DRAM data rate.", .code = 0x0001, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_HCLOCKTICKS", .desc = "Number of DRAM HCLK clock cycles while the event is enabled (experimental)", .code = 0x0001, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_POWERDOWN_CYCLES", .desc = "Number of cycles a given rank is in Power Down Mode", .code = 0x0047, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(gnr_imc_unc_m_powerdown_cycles), .umasks = gnr_imc_unc_m_powerdown_cycles, }, { .name = "UNC_M_POWER_CHANNEL_PPD_CYCLES", .desc = "Number of cycles a given rank is in Power Down Mode and all pages are closed (experimental)", .code = 0x0088, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_PRE_COUNT", .desc = "DRAM Precharge commands.", .code = 0x0003, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(gnr_imc_unc_m_pre_count), .umasks = gnr_imc_unc_m_pre_count, }, { .name = "UNC_M_RDB_INSERTS", .desc = "Read buffer inserts on subchannel 0", .code = 0x0017, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(gnr_imc_unc_m_rdb_inserts), .umasks = gnr_imc_unc_m_rdb_inserts, }, { .name = "UNC_M_RDB_OCCUPANCY_SCH0", .desc = "Read buffer occupancy on subchannel 0", .code = 0x001a, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_RDB_OCCUPANCY_SCH1", .desc = "Read buffer occupancy on subchannel 1", .code = 0x001b, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_RPQ_INSERTS", .desc = "Read Pending Queue inserts for subchannel 0, pseudochannel 0", .code = 0x0010, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(gnr_imc_unc_m_rpq_inserts), .umasks = gnr_imc_unc_m_rpq_inserts, }, { .name = "UNC_M_RPQ_OCCUPANCY_SCH0_PCH0", .desc = "Read pending queue occupancy for subchannel 0, pseudochannel 0", .code = 0x0080, .modmsk = GNR_UNC_IMC_ATTRS, 
.cntmsk = 0xfull, }, { .name = "UNC_M_RPQ_OCCUPANCY_SCH0_PCH1", .desc = "Read pending queue occupancy for subchannel 0, pseudochannel 1", .code = 0x0081, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_RPQ_OCCUPANCY_SCH1_PCH0", .desc = "Read pending queue occupancy for subchannel 1, pseudochannel 0", .code = 0x0082, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_RPQ_OCCUPANCY_SCH1_PCH1", .desc = "Read pending queue occupancy for subchannel 1, pseudochannel 1", .code = 0x0083, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_SELF_REFRESH", .desc = "Number of cycles all ranks were in SR", .code = 0x0043, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(gnr_imc_unc_m_self_refresh), .umasks = gnr_imc_unc_m_self_refresh, }, { .name = "UNC_M_WPQ_INSERTS", .desc = "Write Pending Queue inserts for subchannel 0, pseudochannel 0", .code = 0x0022, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(gnr_imc_unc_m_wpq_inserts), .umasks = gnr_imc_unc_m_wpq_inserts, }, { .name = "UNC_M_WPQ_OCCUPANCY_SCH0_PCH0", .desc = "Write pending queue occupancy for subchannel 0, pseudochannel 0", .code = 0x0084, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_WPQ_OCCUPANCY_SCH0_PCH1", .desc = "Write pending queue occupancy for subchannel 0, pseudochannel 1", .code = 0x0085, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_WPQ_OCCUPANCY_SCH1_PCH0", .desc = "Write pending queue occupancy for subchannel 1, pseudochannel 0", .code = 0x0086, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_WPQ_OCCUPANCY_SCH1_PCH1", .desc = "Write pending queue occupancy for subchannel 1, pseudochannel 1", .code = 0x0087, .modmsk = GNR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, }; /* 22 events available */ 
/* src/libpfm4/lib/events/intel_hsw_events.h */
/*
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
* * PMU: hsw (Intel Haswell) */ static const intel_x86_umask_t hsw_baclears[]={ { .uname = "ANY", .udesc = "Counts the number of times the front end is re-steered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end", .ucode = 0x1f00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t hsw_br_inst_exec[]={ { .uname = "NONTAKEN_CONDITIONAL", .udesc = "All macro conditional nontaken branch instructions", .ucode = 0x4100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NONTAKEN_COND", .udesc = "All macro conditional nontaken branch instructions", .ucode = 0x4100, .uequiv = "NONTAKEN_CONDITIONAL", .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_CONDITIONAL", .udesc = "Taken speculative and retired macro-conditional branches", .ucode = 0x8100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_COND", .udesc = "Taken speculative and retired macro-conditional branches", .ucode = 0x8100, .uequiv = "TAKEN_CONDITIONAL", .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_DIRECT_JUMP", .udesc = "Taken speculative and retired macro-conditional branch instructions excluding calls and indirects", .ucode = 0x8200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_INDIRECT_JUMP_NON_CALL_RET", .udesc = "Taken speculative and retired indirect branches excluding calls and returns", .ucode = 0x8400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_INDIRECT_NEAR_RETURN", .udesc = "Taken speculative and retired indirect branches with return mnemonic", .ucode = 0x8800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_DIRECT_NEAR_CALL", .udesc = "Taken speculative and retired direct near calls", .ucode = 0x9000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_CONDITIONAL", .udesc = "Speculative and retired macro-conditional branches", .ucode = 0xc100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_COND", .udesc = "Speculative and retired macro-conditional branches", .ucode = 0xc100, .uequiv = 
"ALL_CONDITIONAL", .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY_COND", .udesc = "Speculative and retired macro-conditional branches", .ucode = 0xc100, .uequiv = "ALL_CONDITIONAL", .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DIRECT_JMP", .udesc = "Speculative and retired macro-unconditional branches excluding calls and indirects", .ucode = 0xc200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_INDIRECT_JUMP_NON_CALL_RET", .udesc = "Speculative and retired indirect branches excluding calls and returns", .ucode = 0xc400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_INDIRECT_NEAR_RETURN", .udesc = "Speculative and retired indirect return branches", .ucode = 0xc800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DIRECT_NEAR_CALL", .udesc = "Speculative and retired direct near calls", .ucode = 0xd000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_INDIRECT_NEAR_CALL", .udesc = "All indirect calls, including both register and memory indirect", .ucode = 0xa000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_BRANCHES", .udesc = "All branch instructions executed", .ucode = 0xff00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t hsw_br_inst_retired[]={ { .uname = "CONDITIONAL", .udesc = "Counts all taken and not taken macro conditional branch instructions", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND", .udesc = "Counts all taken and not taken macro conditional branch instructions", .ucode = 0x100, .uequiv = "CONDITIONAL", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_CALL", .udesc = "Counts all macro direct and indirect near calls", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_BRANCHES", .udesc = "Counts all taken and not taken macro branches including far branches (architectural event)", .ucode = 0x0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_PEBS, }, { .uname = "NEAR_RETURN", .udesc = "Counts the number of near ret instructions 
retired", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NOT_TAKEN", .udesc = "Counts all not taken macro branch instructions retired", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_TAKEN", .udesc = "Counts the number of near branch taken instructions retired", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "FAR_BRANCH", .udesc = "Counts the number of far branch instructions retired", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t hsw_br_misp_exec[]={ { .uname = "NONTAKEN_CONDITIONAL", .udesc = "Not taken speculative and retired mispredicted macro conditional branches", .ucode = 0x4100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NONTAKEN_COND", .udesc = "Not taken speculative and retired mispredicted macro conditional branches", .ucode = 0x4100, .uequiv = "NONTAKEN_CONDITIONAL", .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_CONDITIONAL", .udesc = "Taken speculative and retired mispredicted macro conditional branches", .ucode = 0x8100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_COND", .udesc = "Taken speculative and retired mispredicted macro conditional branches", .ucode = 0x8100, .uequiv = "TAKEN_CONDITIONAL", .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_INDIRECT_JUMP_NON_CALL_RET", .udesc = "Taken speculative and retired mispredicted indirect branches excluding calls and returns", .ucode = 0x8400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN_RETURN_NEAR", .udesc = "Taken speculative and retired mispredicted indirect branches with return mnemonic", .ucode = 0x8800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_CONDITIONAL", .udesc = "Speculative and retired mispredicted macro conditional branches", .ucode = 0xc100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY_COND", .udesc = "Speculative and retired mispredicted macro conditional branches", .ucode = 0xc100, .uequiv = "ALL_CONDITIONAL", .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "ALL_INDIRECT_JUMP_NON_CALL_RET", .udesc = "All mispredicted indirect branches that are not calls nor returns", .ucode = 0xc400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_BRANCHES", .udesc = "Speculative and retired mispredicted macro conditional branches", .ucode = 0xff00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "TAKEN_INDIRECT_NEAR_CALL", .udesc = "Taken speculative and retired mispredicted indirect calls", .ucode = 0xa000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_br_misp_retired[]={ { .uname = "CONDITIONAL", .udesc = "All mispredicted macro conditional branch instructions", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND", .udesc = "All mispredicted macro conditional branch instructions", .ucode = 0x100, .uequiv = "CONDITIONAL", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_BRANCHES", .udesc = "All mispredicted macro branches (architectural event)", .ucode = 0x0, /* architectural encoding */ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "NEAR_TAKEN", .udesc = "number of near branch instructions retired that were mispredicted and taken", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t hsw_cpl_cycles[]={ { .uname = "RING0", .udesc = "Unhalted core cycles when the thread is in ring 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RING123", .udesc = "Unhalted core cycles when thread is in rings 1, 2, or 3", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RING0_TRANS", .udesc = "Number of intervals between processor halts while thread is in ring 0", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t hsw_cpu_clk_thread_unhalted[]={ { .uname = "REF_XCLK", .udesc = "Count Xclk pulses 
(100Mhz) when the core is unhalted", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REF_XCLK_ANY", .udesc = "Count Xclk pulses (100Mhz) when at least one thread on the physical core is unhalted", .ucode = 0x100 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "REF_XCLK:t", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "REF_P", .udesc = "Cycles when the core is unhalted (count at 100 Mhz)", .ucode = 0x100, .uequiv = "REF_XCLK", .uflags = INTEL_X86_NCOMBO, }, { .uname = "THREAD_P", .udesc = "Cycles when thread is not halted", .ucode = 0x000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ONE_THREAD_ACTIVE", .udesc = "Counts Xclk (100Mhz) pulses when this thread is unhalted and the other thread is halted", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_cycle_activity[]={ { .uname = "CYCLES_L2_PENDING", .udesc = "Cycles with pending L2 miss loads (must use with HT off only)", .ucode = 0x0100 | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, .ucntmsk= 0xf, }, { .uname = "CYCLES_LDM_PENDING", .udesc = "Cycles with pending memory loads", .ucode = 0x0200 | (0x2 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_L1D_PENDING", .udesc = "Cycles with pending L1D load cache misses", .ucode = 0x0800 | (0x8 << INTEL_X86_CMASK_BIT), .ucntmsk= 0x4, .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_L1D_PENDING", .udesc = "Execution stalls due to pending L1D load cache misses", .ucode = 0x0c00 | (0xc << INTEL_X86_CMASK_BIT), .ucntmsk= 0x4, .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_L2_PENDING", .udesc = "Execution stalls due to L2 pending loads (must use with HT off only)", .ucode = 0x0500 | (0x5 << INTEL_X86_CMASK_BIT), .ucntmsk= 0xf, .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_LDM_PENDING", .udesc = "Execution
stalls due to memory subsystem", .ucode = 0x0600 | (0x6 << INTEL_X86_CMASK_BIT), .ucntmsk= 0xf, .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_NO_EXECUTE", .udesc = "Cycles during which no instructions were executed in the execution stage of the pipeline", .ucode = 0x0400 | (0x4 << INTEL_X86_CMASK_BIT), .ucntmsk= 0xf, .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_dtlb_load_misses[]={ { .uname = "MISS_CAUSES_A_WALK", .udesc = "Misses in all DTLB levels that cause page walks", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Misses in all TLB levels causes a page walk that completes (4K)", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Misses in all TLB levels causes a page walk that completes (2M/4M)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "Misses in all TLB levels causes a page walk of any page size that completes", .ucode = 0xe00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_DURATION", .udesc = "Cycles when PMH is busy with page walks", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT_4K", .udesc = "Misses that miss the DTLB and hit the STLB (4K)", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT_2M", .udesc = "Misses that miss the DTLB and hit the STLB (2M)", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT", .udesc = "Number of cache load STLB hits. 
No page walk", .ucode = 0x6000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PDE_CACHE_MISS", .udesc = "DTLB misses with low part of linear-to-physical address translation missed", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_itlb_misses[]={ { .uname = "MISS_CAUSES_A_WALK", .udesc = "Misses in all ITLB levels that cause page walks", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Misses in all TLB levels causes a page walk that completes (4K)", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Misses in all TLB levels causes a page walk that completes (2M/4M)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "Misses in all TLB levels causes a page walk of any page size that completes", .ucode = 0xe00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_DURATION", .udesc = "Cycles when PMH is busy with page walks", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT_4K", .udesc = "Misses that miss the ITLB and hit the STLB (4K)", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT_2M", .udesc = "Misses that miss the ITLB and hit the STLB (2M)", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT", .udesc = "Number of cache load STLB hits.
No page walk", .ucode = 0x6000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_fp_assist[]={ { .uname = "X87_OUTPUT", .udesc = "Number of X87 FP assists due to output values", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "X87_INPUT", .udesc = "Number of X87 FP assists due to input values", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SIMD_OUTPUT", .udesc = "Number of SIMD FP assists due to output values", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SIMD_INPUT", .udesc = "Number of SIMD FP assists due to input values", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Cycles with any input/output SSE or FP assists", .ucode = 0x1e00 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ALL", .udesc = "Cycles with any input and output SSE or FP assist", .ucode = 0x1e00 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "ANY", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t hsw_icache[]={ { .uname = "MISSES", .udesc = "Number of Instruction Cache, Streaming Buffer and Victim Cache Misses. Includes Uncacheable accesses", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HIT", .udesc = "Number of Instruction Cache, Streaming Buffer and Victim Cache Reads.
Includes cacheable and uncacheable accesses and uncacheable fetches", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IFETCH_STALL", .udesc = "Number of cycles where a code-fetch stalled due to L1 instruction cache miss or an iTLB miss", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_idq[]={ { .uname = "EMPTY", .udesc = "Cycles the Instruction Decode Queue (IDQ) is empty", .ucode = 0x200, .ucntmsk= 0xf, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MITE_UOPS", .udesc = "Number of uops delivered to Instruction Decode Queue (IDQ) from MITE path", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DSB_UOPS", .udesc = "Number of uops delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_DSB_UOPS", .udesc = "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_MITE_UOPS", .udesc = "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_UOPS", .udesc = "Number of Uops were delivered into Instruction Decode Queue (IDQ) from MS, initiated by Decode Stream Buffer (DSB) or MITE", .ucode = 0x3000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_UOPS_CYCLES", .udesc = "Number of cycles that Uops were delivered into Instruction Decode Queue (IDQ) when MS_Busy, initiated by Decode Stream Buffer (DSB) or MITE", .ucode = 0x3000 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "MS_UOPS:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MS_SWITCHES", .udesc = "Number of cycles that Uops were delivered into Instruction Decode Queue (IDQ) when MS_Busy, initiated by Decode Stream Buffer (DSB) or MITE", .ucode = 0x3000 | INTEL_X86_MOD_EDGE | (1 << 
INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "MS_UOPS:c=1:e", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, { .uname = "MITE_UOPS_CYCLES", .udesc = "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from MITE path", .ucode = 0x400 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "MITE_UOPS:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DSB_UOPS_CYCLES", .udesc = "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path", .ucode = 0x800 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "DSB_UOPS:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MS_DSB_UOPS_CYCLES", .udesc = "Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy", .ucode = 0x1000 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "MS_DSB_UOPS:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MS_DSB_OCCUR", .udesc = "Deliveries to Instruction Decode Queue (IDQ) initiated by Decode Stream Buffer (DSB) while Microcode Sequencer (MS) is busy", .ucode = 0x1000 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "MS_DSB_UOPS:c=1:e=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, { .uname = "ALL_DSB_CYCLES_4_UOPS", .udesc = "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops", .ucode = 0x1800 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ALL_DSB_CYCLES_ANY_UOPS", .udesc = "Cycles Decode Stream Buffer (DSB) is delivering any Uop", .ucode = 0x1800 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ALL_MITE_CYCLES_4_UOPS", .udesc = "Cycles MITE is delivering 4 Uops", .ucode = 0x2400 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ 
.uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ALL_MITE_CYCLES_ANY_UOPS", .udesc = "Cycles MITE is delivering any Uop", .ucode = 0x2400 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ALL_MITE_UOPS", .udesc = "Number of uops delivered to Instruction Decode Queue (IDQ) from any path", .ucode = 0x3c00, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_idq_uops_not_delivered[]={ { .uname = "CORE", .udesc = "Count number of non-delivered uops to Resource Allocation Table (RAT)", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CYCLES_0_UOPS_DELIV_CORE", .udesc = "Cycles per thread when 4 or more uops are not delivered to the Resource Allocation Table (RAT) when backend is not stalled", .ucode = 0x100 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uflags = INTEL_X86_NCOMBO, .uequiv = "CORE:c=4", .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_LE_1_UOP_DELIV_CORE", .udesc = "Cycles per thread when 3 or more uops are not delivered to the Resource Allocation Table (RAT) when backend is not stalled", .ucode = 0x100 | (3 << INTEL_X86_CMASK_BIT), /* cnt=3 */ .uequiv = "CORE:c=3", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_LE_2_UOP_DELIV_CORE", .udesc = "Cycles with less than 2 uops delivered by the front end", .ucode = 0x100 | (2 << INTEL_X86_CMASK_BIT), /* cnt=2 */ .uequiv = "CORE:c=2", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_LE_3_UOP_DELIV_CORE", .udesc = "Cycles with less than 3 uops delivered by the front end", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "CORE:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_FE_WAS_OK", .udesc = "Cycles Front-End (FE) delivered 4 uops or Resource Allocation Table (RAT) was stalling FE", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 inv=1 */ .uequiv = 
"CORE:c=1:i", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_I, } }; static const intel_x86_umask_t hsw_inst_retired[]={ { .uname = "ANY_P", .udesc = "Number of instructions retired. General Counter - architectural event", .ucode = 0x000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL", .udesc = "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution (Precise Event)", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "TOTAL_CYCLES", .udesc = "Number of cycles using always true condition", .ucode = 0x100 | INTEL_X86_MOD_INV | (10 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=10 */ .uequiv = "ALL:i=1:c=10", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "PREC_DIST", .udesc = "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution", .ucode = 0x100, .uequiv = "ALL", .ucntmsk= 0x2, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "X87", .udesc = "X87 FP operations retired with no exceptions. Also counts flows that have several X87 or flows that use X87 uops in the exception handling", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t hsw_int_misc[]={ { .uname = "RECOVERY_CYCLES", .udesc = "Cycles waiting for the checkpoints in Resource Allocation Table (RAT) to be recovered after Nuke due to all other cases except JEClear (e.g. whenever a ucode assist is needed like SSE exception, memory disambiguation, etc...)", .ucode = 0x300 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "RECOVERY_CYCLES_ANY", .udesc = "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. 
misprediction or memory nuke)", .ucode = 0x300 | (1 << INTEL_X86_CMASK_BIT) | INTEL_X86_MOD_ANY, /* cnt=1 any=1 */ .uequiv = "RECOVERY_CYCLES:t", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_T, }, { .uname = "RECOVERY_STALLS_COUNT", .udesc = "Number of occurrences waiting for Machine Clears", .ucode = 0x300 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t hsw_itlb[]={ { .uname = "ITLB_FLUSH", .udesc = "Flushing of the Instruction TLB (ITLB) pages independent of page size", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t hsw_l1d[]={ { .uname = "REPLACEMENT", .udesc = "L1D Data line replacements", .ucode = 0x100, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t hsw_l1d_pend_miss[]={ { .uname = "PENDING", .udesc = "Cycles with L1D load misses outstanding", .ucode = 0x100, .ucntmsk = 0x4, .uflags = INTEL_X86_DFL, }, { .uname = "PENDING_CYCLES", .udesc = "Cycles with L1D load misses outstanding", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "PENDING:c=1", .ucntmsk = 0x4, .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "OCCURRENCES", .udesc = "Number L1D miss outstanding", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "PENDING:c=1:e=1", .ucntmsk = 0x4, .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, { .uname = "EDGE", .udesc = "Number L1D miss outstanding", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "PENDING:c=1:e=1", .ucntmsk = 0x4, .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, { .uname = "REQUEST_FB_FULL", .udesc = "Number of times a demand request was blocked due to Fill Buffer (FB) unavailability", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, 
}, { .uname = "FB_FULL", .udesc = "Number of cycles a demand request was blocked due to Fill Buffer (FB) unavailability", .ucode = 0x200 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "REQUEST_FB_FULL:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t hsw_l2_demand_rqsts[]={ { .uname = "WB_HIT", .udesc = "WB requests that hit L2 cache", .ucode = 0x5000, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t hsw_l2_lines_in[]={ { .uname = "I", .udesc = "L2 cache lines in I state filling L2", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S", .udesc = "L2 cache lines in S state filling L2", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "E", .udesc = "L2 cache lines in E state filling L2", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL", .udesc = "L2 cache lines filling L2", .ucode = 0x700, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ANY", .udesc = "L2 cache lines filling L2", .uequiv = "ALL", .ucode = 0x700, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_l2_lines_out[]={ { .uname = "DEMAND_CLEAN", .udesc = "Number of clean L2 cachelines evicted by demand", .ucode = 0x500, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DIRTY", .udesc = "Number of dirty L2 cachelines evicted by demand", .ucode = 0x600, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_l2_rqsts[]={ { .uname = "DEMAND_DATA_RD_MISS", .udesc = "Demand Data Read requests that miss L2 cache", .ucode = 0x2100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_HIT", .udesc = "Demand Data Read requests that hit L2 cache", .ucode = 0x4100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_MISS", .udesc = "RFO requests that miss L2 cache", .ucode = 0x2200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_MISS", .udesc = "RFO requests that miss L2 cache", .ucode = 0x2200, .uequiv = "DEMAND_RFO_MISS", .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"DEMAND_RFO_HIT", .udesc = "RFO requests that hit L2 cache", .ucode = 0x4200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT", .udesc = "RFO requests that hit L2 cache", .ucode = 0x4200, .uequiv = "DEMAND_RFO_HIT", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_MISS", .udesc = "L2 cache misses when fetching instructions", .ucode = 0x2400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_MISS", .udesc = "All demand requests that miss the L2 cache", .ucode = 0x2700, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_HIT", .udesc = "L2 cache hits when fetching instructions, code reads", .ucode = 0x4400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L2_PF_MISS", .udesc = "Requests from the L2 hardware prefetchers that miss L2 cache", .ucode = 0x3800, .uequiv = "PF_MISS", .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_MISS", .udesc = "Requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that miss L2 cache", .ucode = 0x3800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS", .udesc = "All requests that miss the L2 cache", .ucode = 0x3f00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L2_PF_HIT", .udesc = "Requests from the L2 hardware prefetchers that hit L2 cache", .ucode = 0xd800, .uequiv = "PF_HIT", .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_HIT", .udesc = "Requests from the L2 hardware prefetchers that hit L2 cache", .ucode = 0xd800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_DATA_RD", .udesc = "Any data read request to L2 cache", .ucode = 0xe100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_RFO", .udesc = "Any data RFO request to L2 cache", .ucode = 0xe200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_CODE_RD", .udesc = "Any code read request to L2 cache", .ucode = 0xe400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_REFERENCES", .udesc = "All demand requests to L2 cache ", .ucode = 0xe700, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_PF", .udesc = "Any L2 HW prefetch request to L2 cache", .ucode = 0xf800, 
.uflags = INTEL_X86_NCOMBO, }, { .uname = "REFERENCES", .udesc = "All requests to L2 cache", .ucode = 0xff00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t hsw_l2_trans[]={ { .uname = "DEMAND_DATA_RD", .udesc = "Demand Data Read requests that access L2 cache", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO", .udesc = "RFO requests that access L2 cache", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_RD", .udesc = "L2 cache accesses when fetching instructions", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_PF", .udesc = "L2 or L3 HW prefetches that access L2 cache, including rejects", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L1D_WB", .udesc = "L1D writebacks that access L2 cache", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L2_FILL", .udesc = "L2 fill requests that access L2 cache", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L2_WB", .udesc = "L2 writebacks that access L2 cache", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_REQUESTS", .udesc = "Transactions accessing L2 pipe", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t hsw_ld_blocks[]={ { .uname = "STORE_FORWARD", .udesc = "Counts the number of loads blocked by overlapping with store buffer entries that cannot be forwarded", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_SR", .udesc = "number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_ld_blocks_partial[]={ { .uname = "ADDRESS_ALIAS", .udesc = "False dependencies in MOB due to partial compare on address", .ucode = 0x100, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t hsw_load_hit_pre[]={ { .uname = "SW_PF", .udesc = "Non software-prefetch load dispatches that hit FB 
allocated for software prefetch", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HW_PF", .udesc = "Non software-prefetch load dispatches that hit FB allocated for hardware prefetch", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_lock_cycles[]={ { .uname = "SPLIT_LOCK_UC_LOCK_DURATION", .udesc = "Cycles in which the L1D and L2 are locked, due to a UC lock or split lock", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CACHE_LOCK_DURATION", .udesc = "cycles that the L1D is locked", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_longest_lat_cache[]={ { .uname = "MISS", .udesc = "Core-originated cacheable demand requests missed LLC - architectural event", .ucode = 0x4100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REFERENCE", .udesc = "Core-originated cacheable demand requests that refer to LLC - architectural event", .ucode = 0x4f00, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_machine_clears[]={ { .uname = "CYCLES", .udesc = "Cycles there was a Nuke. 
Account for both thread-specific and All Thread Nukes", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEMORY_ORDERING", .udesc = "Number of Memory Ordering Machine Clears detected", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SMC", .udesc = "Number of Self-modifying code (SMC) Machine Clears detected", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MASKMOV", .udesc = "This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "COUNT", .udesc = "Number of machine clears (nukes) of any type", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "CYCLES:c=1:e", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t hsw_mem_load_uops_l3_hit_retired[]={ { .uname = "XSNP_MISS", .udesc = "Retired load uops which data sources were L3 hit and cross-core snoop missed in on-pkg core cache", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_HIT", .udesc = "Retired load uops which data sources were L3 and cross-core snoop hits in on-pkg core cache", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_HITM", .udesc = "Load had HitM Response from a core on same socket (shared L3). 
(Non PEBS)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_NONE", .udesc = "Retired load uops which data sources were hits in L3 without snoops required", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t hsw_mem_load_uops_l3_miss_retired[]={ { .uname = "LOCAL_DRAM", .udesc = "Retired load uops missing L3 cache but hitting local memory", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "REMOTE_DRAM", .udesc = "Number of retired load uops that missed L3 but were serviced by remote RAM, snoop not needed, snoop miss, snoop hit data not forwarded (Precise Event)", .ucode = 0x400, .umodel = PFM_PMU_INTEL_HSW_EP, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_HITM", .udesc = "Number of retired load uops whose data source was remote HITM (Precise Event)", .ucode = 0x1000, .umodel = PFM_PMU_INTEL_HSW_EP, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_FWD", .udesc = "Load uops that miss in the L3 whose data source was forwarded from a remote cache (Precise Event)", .ucode = 0x2000, .umodel = PFM_PMU_INTEL_HSW_EP, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t hsw_mem_load_uops_retired[]={ { .uname = "L1_HIT", .udesc = "Retired load uops with L1 cache hits as data source", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_HIT", .udesc = "Retired load uops with L2 cache hits as data source", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_HIT", .udesc = "Retired load uops with L3 cache hits as data source", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1_MISS", .udesc = "Retired load uops which missed the L1D", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_MISS", .udesc = "Retired load uops which missed the L2. 
Unknown data source excluded", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_MISS", .udesc = "Retired load uops which missed the L3", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "HIT_LFB", .udesc = "Retired load uops which missed L1 but hit line fill buffer (LFB)", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t hsw_mem_trans_retired[]={ { .uname = "LOAD_LATENCY", .udesc = "Memory load instructions retired above programmed clocks, minimum threshold value is 3 (Precise Event and ldlat required)", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT | INTEL_X86_DFL, }, { .uname = "LATENCY_ABOVE_THRESHOLD", .udesc = "Memory load instructions retired above programmed clocks, minimum threshold value is 3 (Precise Event and ldlat required)", .ucode = 0x100, .uequiv = "LOAD_LATENCY", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT | INTEL_X86_NO_AUTOENCODE, }, }; static const intel_x86_umask_t hsw_mem_uops_retired[]={ { .uname = "STLB_MISS_LOADS", .udesc = "Load uops with true STLB miss retired to architected path", .ucode = 0x1100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS_STORES", .udesc = "Store uops with true STLB miss retired to architected path", .ucode = 0x1200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOCK_LOADS", .udesc = "Load uops with locked access retired", .ucode = 0x2100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_LOADS", .udesc = "Line-split load uops retired", .ucode = 0x4100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_STORES", .udesc = "Line-split store uops retired", .ucode = 0x4200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_LOADS", .udesc = "All load uops retired", .ucode = 0x8100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_STORES", .udesc = "All store uops 
retired", .ucode = 0x8200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t hsw_misalign_mem_ref[]={ { .uname = "LOADS", .udesc = "Speculative cache-line split load uops dispatched to the L1D", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STORES", .udesc = "Speculative cache-line split store-address uops dispatched to L1D", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_move_elimination[]={ { .uname = "INT_ELIMINATED", .udesc = "Number of integer Move Elimination candidate uops that were eliminated", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SIMD_ELIMINATED", .udesc = "Number of SIMD Move Elimination candidate uops that were eliminated", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "INT_NOT_ELIMINATED", .udesc = "Number of integer Move Elimination candidate uops that were not eliminated", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SIMD_NOT_ELIMINATED", .udesc = "Number of SIMD Move Elimination candidate uops that were not eliminated", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_offcore_requests[]={ { .uname = "DEMAND_DATA_RD", .udesc = "Demand data read requests sent to uncore (use with HT off only)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD", .udesc = "Demand code read requests sent to uncore (use with HT off only)", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO", .udesc = "Demand RFOs requests sent to uncore (use with HT off only)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DATA_RD", .udesc = "Data read requests sent to uncore (use with HT off only)", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_other_assists[]={ { .uname = "AVX_TO_SSE", .udesc = "Number of transitions from AVX-256 to legacy SSE when penalty applicable", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"SSE_TO_AVX", .udesc = "Number of transitions from legacy SSE to AVX-256 when penalty applicable", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY_WB_ASSIST", .udesc = "Number of times any microcode assist is invoked by HW upon uop writeback", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_resource_stalls[]={ { .uname = "ANY", .udesc = "Cycles Allocation is stalled due to Resource Related reason", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL", .udesc = "Cycles Allocation is stalled due to Resource Related reason", .ucode = 0x100, .uequiv = "ANY", .uflags = INTEL_X86_NCOMBO, }, { .uname = "RS", .udesc = "Stall cycles caused by absence of eligible entries in Reservation Station (RS)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SB", .udesc = "Cycles Allocator is stalled due to Store Buffer full (not including draining from synch)", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ROB", .udesc = "ROB full stall cycles", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_rob_misc_events[]={ { .uname = "LBR_INSERTS", .udesc = "Count each time a new Last Branch Record (LBR) is inserted", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t hsw_rs_events[]={ { .uname = "EMPTY_CYCLES", .udesc = "Cycles the Reservation Station (RS) is empty for this thread", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "EMPTY_END", .udesc = "Counts number of times the Reservation Station (RS) goes from empty to non-empty", .ucode = 0x100 | INTEL_X86_MOD_INV | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* inv=1 edge=1 cnt=1 */ .uequiv = "EMPTY_CYCLES:c=1:e:i", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t hsw_tlb_flush[]={ { .uname = "DTLB_THREAD", .udesc = "Count number of DTLB 
flushes of thread-specific entries", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_ANY", .udesc = "Count number of any STLB flushes", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_uops_executed[]={ { .uname = "CORE", .udesc = "Number of uops executed from any thread", .ucode = 0x200, .uflags = INTEL_X86_DFL, }, { .uname = "STALL_CYCLES", .udesc = "Number of cycles with no uops executed", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_1_UOP_EXEC", .udesc = "Cycles where at least 1 uop was executed per thread", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_2_UOPS_EXEC", .udesc = "Cycles where at least 2 uops were executed per thread", .ucode = 0x100 | (2 << INTEL_X86_CMASK_BIT), /* cnt=2 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_3_UOPS_EXEC", .udesc = "Cycles where at least 3 uops were executed per thread", .ucode = 0x100 | (3 << INTEL_X86_CMASK_BIT), /* cnt=3 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_4_UOPS_EXEC", .udesc = "Cycles where at least 4 uops were executed per thread", .ucode = 0x100 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_1", .udesc = "Cycles where at least 1 uop was executed from any thread", .ucode = 0x200 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "CORE:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_2", .udesc = "Cycles where at least 2 uops were executed from any thread", .ucode = 0x200 | (2 << INTEL_X86_CMASK_BIT), /* cnt=2 */ .uequiv = "CORE:c=2", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_3", .udesc = 
"Cycles where at least 3 uops were executed from any thread", .ucode = 0x200 | (3 << INTEL_X86_CMASK_BIT), /* cnt=3 */ .uequiv = "CORE:c=3", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_4", .udesc = "Cycles where at least 4 uops were executed from any thread", .ucode = 0x200 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uequiv = "CORE:c=4", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_NONE", .udesc = "Cycles where no uop is executed on any thread", .ucode = 0x200 | INTEL_X86_MOD_INV, /* inv=1 */ .uequiv = "CORE:i", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I, }, }; static const intel_x86_umask_t hsw_uops_executed_port[]={ { .uname = "PORT_0", .udesc = "Cycles which a Uop is executed on port 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_1", .udesc = "Cycles which a Uop is executed on port 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_2", .udesc = "Cycles which a Uop is executed on port 2", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_3", .udesc = "Cycles which a Uop is executed on port 3", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_4", .udesc = "Cycles which a Uop is executed on port 4", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_5", .udesc = "Cycles which a Uop is executed on port 5", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_6", .udesc = "Cycles which a Uop is executed on port 6", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_7", .udesc = "Cycles which a Uop is executed on port 7", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_0_CORE", .udesc = "tbd", .ucode = 0x100 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_0:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_1_CORE", .udesc = "tbd", .ucode = 0x200 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_1:t=1", .uflags = 
INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_2_CORE", .udesc = "tbd", .ucode = 0x400 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_2:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_3_CORE", .udesc = "tbd", .ucode = 0x800 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_3:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_4_CORE", .udesc = "tbd", .ucode = 0x1000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_4:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_5_CORE", .udesc = "tbd", .ucode = 0x2000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_5:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_6_CORE", .udesc = "tbd", .ucode = 0x4000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_6:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_7_CORE", .udesc = "tbd", .ucode = 0x8000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_7:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, }; static const intel_x86_umask_t hsw_uops_issued[]={ { .uname = "ANY", .udesc = "Number of Uops issued by the Resource Allocation Table (RAT) to the Reservation Station (RS)", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL", .udesc = "Number of Uops issued by the Resource Allocation Table (RAT) to the Reservation Station (RS)", .ucode = 0x100, .uequiv = "ANY", .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLAGS_MERGE", .udesc = "Number of flags-merge uops being allocated. Such uops add delay", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_LEA", .udesc = "Number of slow LEA or similar uops allocated. 
Such uop has 3 sources regardless if result of LEA instruction or not", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SINGLE_MUL", .udesc = "Number of Multiply packed/scalar single precision uops allocated", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALL_CYCLES", .udesc = "Counts the number of cycles no uops issued by this thread", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=1 */ .uequiv = "ANY:c=1:i=1", .uflags = INTEL_X86_NCOMBO, .ucntmsk = 0xf, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "CORE_STALL_CYCLES", .udesc = "Counts the number of cycles no uops issued on this core", .ucode = 0x100 | INTEL_X86_MOD_ANY | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* any=1 inv=1 cnt=1 */ .uequiv = "ANY:c=1:i=1:t=1", .ucntmsk = 0xf, .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T | _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t hsw_uops_retired[]={ { .uname = "ALL", .udesc = "All uops that actually retired", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "ANY", .udesc = "All uops that actually retired", .ucode = 0x100, .uequiv = "ALL", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RETIRE_SLOTS", .udesc = "number of retirement slots used non PEBS", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STALL_CYCLES", .udesc = "Cycles no executable uops retired (Precise Event)", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=1 */ .uequiv = "ALL:i=1:c=1", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "TOTAL_CYCLES", .udesc = "Number of cycles using always true condition applied to PEBS uops retired event", .ucode = 0x100 | INTEL_X86_MOD_INV | (10 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=10 */ .uequiv = "ALL:i=1:c=10", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I 
| _INTEL_X86_ATTR_C, }, { .uname = "CORE_STALL_CYCLES", .udesc = "Cycles no executable uops retired on core (Precise Event)", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=1 */ .uequiv = "ALL:i=1:c=1:t=1", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "STALL_OCCURRENCES", .udesc = "Number of transitions from stalled to unstalled execution (Precise Event)", .ucode = 0x100 | INTEL_X86_MOD_INV | INTEL_X86_MOD_EDGE| (1 << INTEL_X86_CMASK_BIT), /* inv=1 edge=1 cnt=1 */ .uequiv = "ALL:c=1:i=1:e=1", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, }; static const intel_x86_umask_t hsw_offcore_response[]={ { .uname = "DMND_DATA_RD", .udesc = "Request: number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches", .ucode = 1ULL << (0 + 8), .grpid = 0, }, { .uname = "DMND_RFO", .udesc = "Request: number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO prefetches", .ucode = 1ULL << (1 + 8), .grpid = 0, }, { .uname = "DMND_CODE_RD", .udesc = "Request: number of demand and DCU prefetch instruction cacheline reads. Does not count L2 code read prefetches", .ucode = 1ULL << (2 + 8), .grpid = 0, }, { .uname = "DMND_IFETCH", .udesc = "Request: number of demand and DCU prefetch instruction cacheline reads. 
Does not count L2 code read prefetches", .ucode = 1ULL << (2 + 8), .uequiv = "DMND_CODE_RD", .grpid = 0, }, { .uname = "WB", .udesc = "Request: number of writebacks (modified to exclusive) transactions", .ucode = 1ULL << (3 + 8), .grpid = 0, }, { .uname = "PF_DATA_RD", .udesc = "Request: number of data cacheline reads generated by L2 prefetchers", .ucode = 1ULL << (4 + 8), .grpid = 0, }, { .uname = "PF_RFO", .udesc = "Request: number of RFO requests generated by L2 prefetchers", .ucode = 1ULL << (5 + 8), .grpid = 0, }, { .uname = "PF_CODE_RD", .udesc = "Request: number of code reads generated by L2 prefetchers", .ucode = 1ULL << (6 + 8), .grpid = 0, }, { .uname = "PF_IFETCH", .udesc = "Request: number of code reads generated by L2 prefetchers", .ucode = 1ULL << (6 + 8), .grpid = 0, .uequiv= "PF_CODE_RD", }, { .uname = "PF_L3_DATA_RD", .udesc = "Request: number of L2 prefetcher requests to L3 for loads", .ucode = 1ULL << (7 + 8), .grpid = 0, }, { .uname = "PF_L3_RFO", .udesc = "Request: number of RFO requests generated by L2 prefetcher", .ucode = 1ULL << (8 + 8), .grpid = 0, }, { .uname = "PF_L3_CODE_RD", .udesc = "Request: number of L2 prefetcher requests to L3 for instruction fetches", .ucode = 1ULL << (9 + 8), .grpid = 0, }, { .uname = "PF_L3_IFETCH", .udesc = "Request: number of L2 prefetcher requests to L3 for instruction fetches", .ucode = 1ULL << (9 + 8), .grpid = 0, .uequiv= "PF_L3_CODE_RD", }, { .uname = "SPLIT_LOCK_UC_LOCK", .udesc = "Request: number of bus lock and split lock requests", .ucode = 1ULL << (10 + 8), .grpid = 0, }, { .uname = "BUS_LOCKS", .udesc = "Request: number of bus lock and split lock requests", .ucode = 1ULL << (10 + 8), .grpid = 0, .uequiv= "SPLIT_LOCK_UC_LOCK", }, { .uname = "STRM_ST", .udesc = "Request: number of streaming store requests", .ucode = 1ULL << (11 + 8), .grpid = 0, }, { .uname = "OTHER", .udesc = "Request: counts one of the following transaction types, including L3 invalidate, I/O, full or partial writes, WC or 
non-temporal stores, CLFLUSH, Fences, lock, unlock, split lock", .ucode = 1ULL << (15+8), .grpid = 0, }, { .uname = "ANY_CODE_RD", .udesc = "Request: combination of PF_CODE_RD | DMND_CODE_RD | PF_L3_CODE_RD", .uequiv = "PF_CODE_RD:DMND_CODE_RD:PF_L3_CODE_RD", .ucode = 0x24400, .grpid = 0, }, { .uname = "ANY_IFETCH", .udesc = "Request: combination of PF_CODE_RD | PF_L3_CODE_RD", .ucode = 0x24000, .grpid = 0, }, { .uname = "ANY_REQUEST", .udesc = "Request: combination of all request umasks", .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:WB:PF_DATA_RD:PF_RFO:PF_CODE_RD:PF_L3_DATA_RD:PF_L3_RFO:PF_L3_CODE_RD:SPLIT_LOCK_UC_LOCK:STRM_ST:OTHER", .ucode = 0x8fff00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "ANY_DATA", .udesc = "Request: combination of DMND_DATA | PF_DATA_RD | PF_L3_DATA_RD", .uequiv = "DMND_DATA_RD:PF_DATA_RD:PF_L3_DATA_RD", .ucode = 0x9100, .grpid = 0, }, { .uname = "ANY_RFO", .udesc = "Request: combination of DMND_RFO | PF_RFO | PF_L3_RFO", .uequiv = "DMND_RFO:PF_RFO:PF_L3_RFO", .ucode = 0x12200, .grpid = 0, }, { .uname = "ANY_RESPONSE", .udesc = "Response: count any response type", .ucode = 1ULL << (16+8), .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, .grpid = 1, }, { .uname = "NO_SUPP", .udesc = "Supplier: counts number of times supplier information is not available", .ucode = 1ULL << (17+8), .grpid = 1, }, { .uname = "L3_HITM", .udesc = "Supplier: counts L3 hits in M-state (initial lookup)", .ucode = 1ULL << (18+8), .grpid = 1, }, { .uname = "L3_HITE", .udesc = "Supplier: counts L3 hits in E-state", .ucode = 1ULL << (19+8), .grpid = 1, }, { .uname = "L3_HITS", .udesc = "Supplier: counts L3 hits in S-state", .ucode = 1ULL << (20+8), .grpid = 1, }, { .uname = "L3_HITF", .udesc = "Supplier: counts L3 hits in F-state", .ucode = 1ULL << (21+8), .umodel = PFM_PMU_INTEL_HSW_EP, .grpid = 1, }, { .uname = "L3_HIT", .udesc = "Supplier: counts L3 hits in any state (M, E, S)", .ucode = 0x7ULL << (18+8), .uequiv = 
"L3_HITM:L3_HITE:L3_HITS", .umodel = PFM_PMU_INTEL_HSW, .grpid = 1, }, { .uname = "L3_HIT", .udesc = "Supplier: counts L3 hits in any state (M, E, S, F)", .ucode = 0xfULL << (18+8), .uequiv = "L3_HITM:L3_HITE:L3_HITS:L3_HITF", .umodel = PFM_PMU_INTEL_HSW_EP, .grpid = 1, }, { .uname = "L3_MISS_LOCAL", .udesc = "Supplier: counts L3 misses to local DRAM", .ucode = 1ULL << (22+8), .grpid = 1, }, { .uname = "L3_MISS_REMOTE_HOP0", .udesc = "Supplier: counts L3 misses to remote DRAM with 0 hop", .ucode = 0x1ULL << (27+8), .umodel = PFM_PMU_INTEL_HSW_EP, .grpid = 1, }, { .uname = "L3_MISS_REMOTE_HOP1", .udesc = "Supplier: counts L3 misses to remote DRAM with 1 hop", .ucode = 0x1ULL << (28+8), .umodel = PFM_PMU_INTEL_HSW_EP, .grpid = 1, }, { .uname = "L3_MISS_REMOTE_HOP2P", .udesc = "Supplier: counts L3 misses to remote DRAM with 2P hops", .ucode = 0x1ULL << (29+8), .umodel = PFM_PMU_INTEL_HSW_EP, .grpid = 1, }, { .uname = "L3_MISS", .udesc = "Supplier: counts L3 misses to local DRAM", .ucode = 0x1ULL << (22+8), .uequiv = "L3_MISS_LOCAL", .grpid = 1, .umodel = PFM_PMU_INTEL_HSW, }, { .uname = "L3_MISS", .udesc = "Supplier: counts L3 misses to local or remote DRAM", .ucode = 0x7ULL << (27+8) | 0x1ULL << (22+8), .uequiv = "L3_MISS_LOCAL:L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P", .umodel = PFM_PMU_INTEL_HSW_EP, .grpid = 1, }, { .uname = "L3_MISS_REMOTE", .udesc = "Supplier: counts L3 misses to remote node", .ucode = 0x7ULL << (27+8), .uequiv = "L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P", .umodel = PFM_PMU_INTEL_HSW_EP, .grpid = 1, }, { .uname = "L3_MISS_REMOTE_DRAM", .udesc = "Supplier: counts L3 misses to remote node", .ucode = 0x7ULL << (27+8), .uequiv = "L3_MISS_REMOTE", .umodel = PFM_PMU_INTEL_HSW_EP, .grpid = 1, }, { .uname = "SPL_HIT", .udesc = "Supplier: counts L3 supplier hit", .ucode = 0x1ULL << (30+8), .grpid = 1, }, { .uname = "SNP_NONE", .udesc = "Snoop: counts number of times no snoop-related information is available", .ucode 
= 1ULL << (31+8), .grpid = 2, }, { .uname = "SNP_NOT_NEEDED", .udesc = "Snoop: counts the number of times no snoop was needed to satisfy the request", .ucode = 1ULL << (32+8), .grpid = 2, }, { .uname = "SNP_MISS", .udesc = "Snoop: counts number of times a snoop was needed and it missed all snooped caches", .ucode = 1ULL << (33+8), .grpid = 2, }, { .uname = "SNP_NO_FWD", .udesc = "Snoop: counts number of times a snoop was needed and it hit in at least one snooped cache", .ucode = 1ULL << (34+8), .grpid = 2, }, { .uname = "SNP_FWD", .udesc = "Snoop: counts number of times a snoop was needed and data was forwarded from a remote socket", .ucode = 1ULL << (35+8), .grpid = 2, }, { .uname = "SNP_HITM", .udesc = "Snoop: counts number of times a snoop was needed and it hitM-ed in local or remote cache", .ucode = 1ULL << (36+8), .grpid = 2, }, { .uname = "SNP_NON_DRAM", .udesc = "Snoop: counts number of times target was a non-DRAM system address. This includes MMIO transactions", .ucode = 1ULL << (37+8), .grpid = 2, }, { .uname = "SNP_ANY", .udesc = "Snoop: any snoop reason", .ucode = 0x7fULL << (31+8), .uequiv = "SNP_NONE:SNP_NOT_NEEDED:SNP_MISS:SNP_NO_FWD:SNP_FWD:SNP_HITM:SNP_NON_DRAM", .uflags= INTEL_X86_DFL, .grpid = 2, }, }; static const intel_x86_umask_t hsw_hle_retired[]={ { .uname = "START", .udesc = "Number of times an HLE execution started", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "COMMIT", .udesc = "Number of times an HLE execution successfully committed", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED", .udesc = "Number of times an HLE execution aborted due to any reasons (multiple categories may count as one) (Precise Event)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ABORTED_MISC1", .udesc = "Number of times an HLE execution aborted due to various memory events", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MISC2", .udesc = "Number of times an HLE 
execution aborted due to uncommon conditions", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MISC3", .udesc = "Number of times an HLE execution aborted due to HLE-unfriendly instructions", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MISC4", .udesc = "Number of times an HLE execution aborted due to incompatible memory type", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MISC5", .udesc = "Number of times an HLE execution aborted due to none of the other 4 reasons (e.g., interrupt)", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_rtm_retired[]={ { .uname = "START", .udesc = "Number of times an RTM execution started", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "COMMIT", .udesc = "Number of times an RTM execution successfully committed", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED", .udesc = "Number of times an RTM execution aborted due to any reasons (multiple categories may count as one) (Precise Event)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ABORTED_MISC1", .udesc = "Number of times an RTM execution aborted due to various memory events", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MISC2", .udesc = "Number of times an RTM execution aborted due to uncommon conditions", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MISC3", .udesc = "Number of times an RTM execution aborted due to RTM-unfriendly instructions", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MISC4", .udesc = "Number of times an RTM execution aborted due to incompatible memory type", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MISC5", .udesc = "Number of times an RTM execution aborted due to none of the other 4 reasons (e.g., interrupt)", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t 
hsw_tx_mem[]={ { .uname = "ABORT_CONFLICT", .udesc = "Number of times a transactional abort was signaled due to data conflict on a transactionally accessed address", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_CAPACITY_WRITE", .udesc = "Number of times a transactional abort was signaled due to data capacity limitation for transactional writes", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_STORE_TO_ELIDED_LOCK", .udesc = "Number of times an HLE transactional execution aborted due to a non xrelease prefixed instruction writing to an elided lock in the elision buffer", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_ELISION_BUFFER_NOT_EMPTY", .udesc = "Number of times an HLE transactional execution aborted due to NoAllocatedElisionBuffer being non-zero", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_ELISION_BUFFER_MISMATCH", .udesc = "Number of times an HLE transactional execution aborted due to xrelease lock not satisfying the address and value requirements in the elision buffer", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_ELISION_BUFFER_UNSUPPORTED_ALIGNMENT", .udesc = "Number of times an HLE transactional execution aborted due to an unsupported read alignment from the elision buffer", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_ELISION_BUFFER_FULL", .udesc = "Number of times an HLE lock could not be elided due to ElisionBufferAvailable being zero", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_tx_exec[]={ { .uname = "MISC1", .udesc = "Number of times a class of instructions that may cause a transactional abort was executed. 
Since this is the count of execution, it may not always cause a transactional abort", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISC2", .udesc = "Number of times a class of instructions that may cause a transactional abort was executed inside a transactional region", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISC3", .udesc = "Number of times an instruction execution caused the supported nest count to be exceeded", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISC4", .udesc = "Number of times an instruction with HLE xbegin prefix was executed inside a RTM transactional region", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISC5", .udesc = "Number of times an instruction with HLE xacquire prefix was executed inside a RTM transactional region", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_offcore_requests_outstanding[]={ { .uname = "ALL_DATA_RD_CYCLES", .udesc = "Cycles with cacheable data read transactions in the superQ (use with HT off only)", .uequiv = "ALL_DATA_RD:c=1", .ucode = 0x800 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_CYCLES", .udesc = "Cycles with demand code reads transactions in the superQ (use with HT off only)", .uequiv = "DEMAND_CODE_RD:c=1", .ucode = 0x200 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_CYCLES", .udesc = "Cycles with demand data read transactions in the superQ (use with HT off only)", .uequiv = "DEMAND_DATA_RD:c=1", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DATA_RD", .udesc = "Cacheable data read transactions in the superQ every cycle (use with HT off only)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD", .udesc = "Code read transactions in the superQ every cycle (use with HT off only)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD", .udesc 
= "Demand data read transactions in the superQ every cycle (use with HT off only)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_GE_6", .udesc = "Cycles with at least 6 outstanding offcore demand data read requests in the uncore queue", .uequiv = "DEMAND_DATA_RD:c=6", .ucode = 0x100 | (6 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DEMAND_RFO", .udesc = "Outstanding RFO (store) transactions in the superQ every cycle (use with HT off only)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_CYCLES", .udesc = "Cycles with outstanding RFO (store) transactions in the superQ (use with HT off only)", .uequiv = "DEMAND_RFO:c=1", .ucode = 0x400 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_ild_stall[]={ { .uname = "LCP", .udesc = "Stall caused by changing prefix length of the instruction", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "IQ_FULL", .udesc = "Stall cycles due to IQ full", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_page_walker_loads[]={ { .uname = "DTLB_L1", .udesc = "Number of DTLB page walker loads that hit in the L1D and line fill buffer", .ucode = 0x1100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ITLB_L1", .udesc = "Number of ITLB page walker loads that hit in the L1I and line fill buffer", .ucode = 0x2100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DTLB_L2", .udesc = "Number of DTLB page walker loads that hit in the L2", .ucode = 0x1200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ITLB_L2", .udesc = "Number of ITLB page walker loads that hit in the L2", .ucode = 0x2200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DTLB_L3", .udesc = "Number of DTLB page walker loads that hit in the L3", .ucode = 0x1400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ITLB_L3", .udesc = "Number of ITLB page walker loads that hit in the L3", .ucode = 0x2400, .uflags= INTEL_X86_NCOMBO, }, 
{ .uname = "EPT_DTLB_L1", .udesc = "Number of extended page table walks from the DTLB that hit in the L1D and line fill buffer", .ucode = 0x4100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "EPT_ITLB_L1", .udesc = "Number of extended page table walks from the ITLB that hit in the L1D and line fill buffer", .ucode = 0x8100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "EPT_DTLB_L2", .udesc = "Number of extended page table walks from the DTLB that hit in the L2", .ucode = 0x4200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "EPT_ITLB_L2", .udesc = "Number of extended page table walks from the ITLB that hit in the L2", .ucode = 0x8200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "EPT_DTLB_L3", .udesc = "Number of extended page table walks from the DTLB that hit in the L3", .ucode = 0x4400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "EPT_ITLB_L3", .udesc = "Number of extended page table walks from the ITLB that hit in the L3", .ucode = 0x8400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DTLB_MEMORY", .udesc = "Number of DTLB page walker loads that hit memory", .ucode = 0x1800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ITLB_MEMORY", .udesc = "Number of ITLB page walker loads that hit memory", .ucode = 0x2800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "EPT_DTLB_MEMORY", .udesc = "Number of extended page table walks from the DTLB that hit memory", .ucode = 0x4800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "EPT_ITLB_MEMORY", .udesc = "Number of extended page table walks from the ITLB that hit memory", .ucode = 0x8800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hsw_lsd[]={ { .uname = "UOPS", .udesc = "Number of uops delivered by the Loop Stream Detector (LSD)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ACTIVE", .udesc = "Cycles with uops delivered by the LSD but which did not come from decoder", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "UOPS:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = 
"CYCLES_4_UOPS", .udesc = "Cycles with 4 uops delivered by the LSD but which did not come from decoder", .ucode = 0x100 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uequiv = "UOPS:c=4", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t hsw_dsb2mite_switches[]={ { .uname = "PENALTY_CYCLES", .udesc = "Number of DSB to MITE switch true penalty cycles", .ucode = 0x0200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t hsw_ept[]={ { .uname = "WALK_CYCLES", .udesc = "Cycles for an extended page table walk", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t hsw_arith[]={ { .uname = "DIVIDER_UOPS", .udesc = "Number of uops executed by divider", .ucode = 0x0200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t hsw_offcore_requests_buffer[]={ { .uname = "SQ_FULL", .udesc = "Number of cycles the offcore requests buffer is full", .ucode = 0x0100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t hsw_avx[]={ { .uname = "ALL", .udesc = "Approximate counts of AVX and AVX2 256-bit instructions, including non-arithmetic instructions, loads, and stores. 
May count non-AVX instructions using 256-bit operations", .ucode = 0x0700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t hsw_sq_misc[]={ { .uname = "SPLIT_LOCK", .udesc = "Number of split locks in the super queue (SQ)", .ucode = 0x1000, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_entry_t intel_hsw_pe[]={ { .name = "UNHALTED_CORE_CYCLES", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0x20000000full, .code = 0x3c, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles", .modmsk = INTEL_FIXED3_ATTRS, .cntmsk = 0x400000000ull, .code = 0x0300, /* pseudo encoding */ .flags = INTEL_X86_FIXED, }, { .name = "INSTRUCTION_RETIRED", .desc = "Number of instructions at retirement", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0x10000000full, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "This is an alias for INSTRUCTION_RETIRED", .modmsk = INTEL_V4_ATTRS, .equiv = "INSTRUCTION_RETIRED", .cntmsk = 0x10000000full, .code = 0xc0, }, { .name = "BRANCH_INSTRUCTIONS_RETIRED", .desc = "Count branch instructions at retirement. Specifically, this event counts the retirement of the last micro-op of a branch instruction", .modmsk = INTEL_V4_ATTRS, .equiv = "BR_INST_RETIRED:ALL_BRANCHES", .cntmsk = 0xff, .code = 0xc4, }, { .name = "MISPREDICTED_BRANCH_RETIRED", .desc = "Count mispredicted branch instructions at retirement. 
Specifically, this event counts at retirement of the last micro-op of a branch instruction in the architectural path of the execution and experienced misprediction in the branch prediction hardware", .modmsk = INTEL_V4_ATTRS, .equiv = "BR_MISP_RETIRED:ALL_BRANCHES", .cntmsk = 0xff, .code = 0xc5, }, { .name = "BACLEARS", .desc = "Branch re-steered", .code = 0xe6, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_baclears), .umasks = hsw_baclears }, { .name = "BR_INST_EXEC", .desc = "Branch instructions executed", .code = 0x88, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_br_inst_exec), .umasks = hsw_br_inst_exec }, { .name = "BR_INST_RETIRED", .desc = "Branch instructions retired (Precise Event)", .code = 0xc4, .cntmsk = 0xff, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_br_inst_retired), .umasks = hsw_br_inst_retired }, { .name = "BR_MISP_EXEC", .desc = "Mispredicted branches executed", .code = 0x89, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_br_misp_exec), .umasks = hsw_br_misp_exec }, { .name = "BR_MISP_RETIRED", .desc = "Mispredicted retired branches (Precise Event)", .code = 0xc5, .cntmsk = 0xff, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_br_misp_retired), .umasks = hsw_br_misp_retired }, { .name = "CPL_CYCLES", .desc = "Unhalted core cycles at a specific ring level", .code = 0x5c, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_cpl_cycles), .umasks = hsw_cpl_cycles }, { .name = "CPU_CLK_THREAD_UNHALTED", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .code = 0x3c, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_cpu_clk_thread_unhalted), .umasks = hsw_cpu_clk_thread_unhalted }, { .name = "CPU_CLK_UNHALTED", 
.desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .code = 0x3c, .cntmsk = 0xff, .modmsk = INTEL_V4_ATTRS, .equiv = "CPU_CLK_THREAD_UNHALTED", }, { .name = "CYCLE_ACTIVITY", .desc = "Stalled cycles", .code = 0xa3, .cntmsk = 0xf, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_cycle_activity), .umasks = hsw_cycle_activity }, { .name = "DTLB_LOAD_MISSES", .desc = "Data TLB load misses", .code = 0x8, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_dtlb_load_misses), .umasks = hsw_dtlb_load_misses }, { .name = "DTLB_STORE_MISSES", .desc = "Data TLB store misses", .code = 0x49, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_dtlb_load_misses), .umasks = hsw_dtlb_load_misses /* shared */ }, { .name = "FP_ASSIST", .desc = "X87 floating-point assists", .code = 0xca, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_fp_assist), .umasks = hsw_fp_assist }, { .name = "HLE_RETIRED", .desc = "HLE execution (Precise Event)", .code = 0xc8, .cntmsk = 0xff, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_hle_retired), .umasks = hsw_hle_retired }, { .name = "ICACHE", .desc = "Instruction Cache", .code = 0x80, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_icache), .umasks = hsw_icache }, { .name = "IDQ", .desc = "IDQ operations", .code = 0x79, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_idq), .umasks = hsw_idq }, { .name = "IDQ_UOPS_NOT_DELIVERED", .desc = "Uops not delivered", .code = 0x9c, .cntmsk = 0xf, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_idq_uops_not_delivered), .umasks = hsw_idq_uops_not_delivered }, { .name = "INST_RETIRED", .desc = "Number of instructions retired (Precise Event)", .code = 0xc0, .cntmsk = 0xff, .ngrp = 1, .flags = 
INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_inst_retired), .umasks = hsw_inst_retired }, { .name = "INT_MISC", .desc = "Miscellaneous interruptions", .code = 0xd, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_int_misc), .umasks = hsw_int_misc }, { .name = "ITLB", .desc = "Instruction TLB", .code = 0xae, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_itlb), .umasks = hsw_itlb }, { .name = "ITLB_MISSES", .desc = "Instruction TLB misses", .code = 0x85, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_itlb_misses), .umasks = hsw_itlb_misses }, { .name = "L1D", .desc = "L1D cache", .code = 0x51, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_l1d), .umasks = hsw_l1d }, { .name = "L1D_PEND_MISS", .desc = "L1D pending misses", .code = 0x48, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_l1d_pend_miss), .umasks = hsw_l1d_pend_miss }, { .name = "L2_DEMAND_RQSTS", .desc = "Demand Data Read requests to L2", .code = 0x27, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_l2_demand_rqsts), .umasks = hsw_l2_demand_rqsts }, { .name = "L2_LINES_IN", .desc = "L2 lines allocated", .code = 0xf1, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_l2_lines_in), .umasks = hsw_l2_lines_in }, { .name = "L2_LINES_OUT", .desc = "L2 lines evicted", .code = 0xf2, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_l2_lines_out), .umasks = hsw_l2_lines_out }, { .name = "L2_RQSTS", .desc = "L2 requests", .code = 0x24, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_l2_rqsts), .umasks = hsw_l2_rqsts }, { .name = "L2_TRANS", .desc = "L2 transactions", .code = 0xf0, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = 
LIBPFM_ARRAY_SIZE(hsw_l2_trans), .umasks = hsw_l2_trans }, { .name = "LD_BLOCKS", .desc = "Blocking loads", .code = 0x3, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_ld_blocks), .umasks = hsw_ld_blocks }, { .name = "LD_BLOCKS_PARTIAL", .desc = "Partial load blocks", .code = 0x7, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_ld_blocks_partial), .umasks = hsw_ld_blocks_partial }, { .name = "LOAD_HIT_PRE", .desc = "Load dispatches", .code = 0x4c, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_load_hit_pre), .umasks = hsw_load_hit_pre }, { .name = "LOCK_CYCLES", .desc = "Locked cycles in L1D and L2", .code = 0x63, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_lock_cycles), .umasks = hsw_lock_cycles }, { .name = "LONGEST_LAT_CACHE", .desc = "L3 cache", .code = 0x2e, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_longest_lat_cache), .umasks = hsw_longest_lat_cache }, { .name = "MACHINE_CLEARS", .desc = "Machine clear asserted", .code = 0xc3, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_machine_clears), .umasks = hsw_machine_clears }, { .name = "MEM_LOAD_UOPS_L3_HIT_RETIRED", .desc = "L3 hit load uops retired (Precise Event)", .code = 0xd2, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_mem_load_uops_l3_hit_retired), .umasks = hsw_mem_load_uops_l3_hit_retired }, { .name = "MEM_LOAD_UOPS_LLC_HIT_RETIRED", .desc = "L3 hit load uops retired (Precise Event)", .equiv = "MEM_LOAD_UOPS_L3_HIT_RETIRED", .code = 0xd2, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_mem_load_uops_l3_hit_retired), .umasks = hsw_mem_load_uops_l3_hit_retired }, { .name = "MEM_LOAD_UOPS_L3_MISS_RETIRED", .desc = "Load uops retired that missed 
the L3 (Precise Event)", .code = 0xd3, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_mem_load_uops_l3_miss_retired), .umasks = hsw_mem_load_uops_l3_miss_retired }, { .name = "MEM_LOAD_UOPS_LLC_MISS_RETIRED", .desc = "Load uops retired that missed the L3 (Precise Event)", .equiv = "MEM_LOAD_UOPS_L3_MISS_RETIRED", .code = 0xd3, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_mem_load_uops_l3_miss_retired), .umasks = hsw_mem_load_uops_l3_miss_retired }, { .name = "MEM_LOAD_UOPS_RETIRED", .desc = "Retired load uops (Precise Event)", .code = 0xd1, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_mem_load_uops_retired), .umasks = hsw_mem_load_uops_retired }, { .name = "MEM_TRANS_RETIRED", .desc = "Memory transactions retired (Precise Event)", .code = 0xcd, .cntmsk = 0x8, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS | _INTEL_X86_ATTR_LDLAT, .numasks = LIBPFM_ARRAY_SIZE(hsw_mem_trans_retired), .umasks = hsw_mem_trans_retired }, { .name = "MEM_UOPS_RETIRED", .desc = "Memory uops retired (Precise Event)", .code = 0xd0, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_mem_uops_retired), .umasks = hsw_mem_uops_retired }, { .name = "MISALIGN_MEM_REF", .desc = "Misaligned memory references", .code = 0x5, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_misalign_mem_ref), .umasks = hsw_misalign_mem_ref }, { .name = "MOVE_ELIMINATION", .desc = "Move Elimination", .code = 0x58, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_move_elimination), .umasks = hsw_move_elimination }, { .name = "OFFCORE_REQUESTS", .desc = "Demand Data Read requests sent to uncore", .code = 0xb0, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = 
LIBPFM_ARRAY_SIZE(hsw_offcore_requests), .umasks = hsw_offcore_requests }, { .name = "OTHER_ASSISTS", .desc = "Software assist", .code = 0xc1, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_other_assists), .umasks = hsw_other_assists }, { .name = "RESOURCE_STALLS", .desc = "Cycles Allocation is stalled due to Resource Related reason", .code = 0xa2, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_resource_stalls), .umasks = hsw_resource_stalls }, { .name = "ROB_MISC_EVENTS", .desc = "ROB miscellaneous events", .code = 0xcc, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_rob_misc_events), .umasks = hsw_rob_misc_events }, { .name = "RS_EVENTS", .desc = "Reservation Station", .code = 0x5e, .cntmsk = 0xf, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_rs_events), .umasks = hsw_rs_events }, { .name = "RTM_RETIRED", .desc = "Restricted Transactional Memory execution (Precise Event)", .code = 0xc9, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_rtm_retired), .umasks = hsw_rtm_retired }, { .name = "TLB_FLUSH", .desc = "TLB flushes", .code = 0xbd, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_tlb_flush), .umasks = hsw_tlb_flush }, { .name = "UOPS_EXECUTED", .desc = "Uops executed", .code = 0xb1, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_uops_executed), .umasks = hsw_uops_executed }, { .name = "LSD", .desc = "Loop stream detector", .code = 0xa8, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_lsd), .umasks = hsw_lsd, }, { .name = "UOPS_EXECUTED_PORT", .desc = "Uops dispatched to specific ports", .code = 0xa1, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_uops_executed_port), .umasks = hsw_uops_executed_port }, { .name = 
"UOPS_DISPATCHED_PORT", .desc = "Uops dispatched to specific ports", .code = 0xa1, .cntmsk = 0xff, .ngrp = 1, .equiv = "UOPS_EXECUTED_PORT", .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_uops_executed_port), .umasks = hsw_uops_executed_port }, { .name = "UOPS_ISSUED", .desc = "Uops issued", .code = 0xe, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_uops_issued), .umasks = hsw_uops_issued }, { .name = "UOPS_RETIRED", .desc = "Uops retired (Precise Event)", .code = 0xc2, .cntmsk = 0xff, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_uops_retired), .umasks = hsw_uops_retired }, { .name = "TX_MEM", .desc = "Transactional memory aborts", .code = 0x54, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_tx_mem), .umasks = hsw_tx_mem, }, { .name = "TX_EXEC", .desc = "Transactional execution", .code = 0x5d, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hsw_tx_exec), .umasks = hsw_tx_exec }, { .name = "OFFCORE_REQUESTS_OUTSTANDING", .desc = "Outstanding offcore requests", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0x60, .numasks = LIBPFM_ARRAY_SIZE(hsw_offcore_requests_outstanding), .ngrp = 1, .umasks = hsw_offcore_requests_outstanding, }, { .name = "ILD_STALL", .desc = "Instruction Length Decoder stalls", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0x87, .numasks = LIBPFM_ARRAY_SIZE(hsw_ild_stall), .ngrp = 1, .umasks = hsw_ild_stall, }, { .name = "PAGE_WALKER_LOADS", .desc = "Page walker loads", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0xbc, .numasks = LIBPFM_ARRAY_SIZE(hsw_page_walker_loads), .ngrp = 1, .umasks = hsw_page_walker_loads, }, { .name = "DSB2MITE_SWITCHES", .desc = "Number of DSB to MITE switches", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0xab, .numasks = LIBPFM_ARRAY_SIZE(hsw_dsb2mite_switches), .ngrp = 1, .umasks = hsw_dsb2mite_switches, }, { .name = "EPT", 
.desc = "Extended page table", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0x4f, .numasks = LIBPFM_ARRAY_SIZE(hsw_ept), .ngrp = 1, .umasks = hsw_ept, }, { .name = "ARITH", .desc = "Counts arithmetic multiply operations", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0x14, .numasks = LIBPFM_ARRAY_SIZE(hsw_arith), .ngrp = 1, .umasks = hsw_arith, }, { .name = "AVX", .desc = "Counts AVX instructions", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0xc6, .numasks = LIBPFM_ARRAY_SIZE(hsw_avx), .ngrp = 1, .umasks = hsw_avx, }, { .name = "SQ_MISC", .desc = "SuperQueue miscellaneous", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0xf4, .numasks = LIBPFM_ARRAY_SIZE(hsw_sq_misc), .ngrp = 1, .umasks = hsw_sq_misc, }, { .name = "OFFCORE_REQUESTS_BUFFER", .desc = "Offcore request buffer", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0xb2, .numasks = LIBPFM_ARRAY_SIZE(hsw_offcore_requests_buffer), .ngrp = 1, .umasks = hsw_offcore_requests_buffer, }, { .name = "OFFCORE_RESPONSE_0", .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0x1b7, .flags= INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(hsw_offcore_response), .ngrp = 3, .umasks = hsw_offcore_response, }, { .name = "OFFCORE_RESPONSE_1", .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0x1bb, .flags= INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(hsw_offcore_response), .ngrp = 3, .umasks = hsw_offcore_response, /* identical to actual umasks list for this event */ }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_hswep_unc_cbo_events.h000066400000000000000000001055641502707512200256160ustar00rootroot00000000000000/* * Copyright (c) 2014 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: hswep_unc_cbo (Intel Haswell-EP C-Box uncore PMU) */ #define CBO_FILT_MESIF(a, b, c, d) \ { .uname = "STATE_"#a,\ .udesc = #b" cacheline state",\ .ufilters[0] = 1ULL << (17 + (c)),\ .grpid = d, \ } #define CBO_FILT_MESIFS(d) \ CBO_FILT_MESIF(I, Invalid, 0, d), \ CBO_FILT_MESIF(S, Shared, 1, d), \ CBO_FILT_MESIF(E, Exclusive, 2, d), \ CBO_FILT_MESIF(M, Modified, 3, d), \ CBO_FILT_MESIF(F, Forward, 4, d), \ CBO_FILT_MESIF(D, Debug, 5, d), \ { .uname = "STATE_MP",\ .udesc = "Cacheline is modified but never written, was forwarded in modified state",\ .ufilters[0] = 0x1ULL << (17+6),\ .grpid = d, \ .uflags = INTEL_X86_NCOMBO, \ }, \ { .uname = "STATE_MESIFD",\ .udesc = "Any cache line state",\ .ufilters[0] = 0x7fULL << 17,\ .grpid = d, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, \ } #define CBO_FILT_OPC(d) \ { .uname = "OPC_RFO",\ .udesc = "Demand data RFO (combine with any OPCODE umask)",\ .ufilters[1] = 0x180ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_CRD",\ .udesc = "Demand code read (combine with any OPCODE umask)",\ .ufilters[1] = 0x181ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_DRD",\ .udesc = "Demand data read (combine with any OPCODE umask)",\ .ufilters[1] = 0x182ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PRD",\ .udesc = "Partial reads (UC) (combine with any OPCODE umask)",\ .ufilters[1] = 0x187ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WCILF",\ .udesc = "Full Stream store (combine with any OPCODE umask)", \ .ufilters[1] = 0x18cULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WCIL",\ .udesc = "Partial Stream store (combine with any OPCODE umask)", \ .ufilters[1] = 0x18dULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WIL",\ .udesc = "Write Invalidate Line (Partial) (combine with any OPCODE umask)", \ .ufilters[1] = 0x18fULL << 20, \ .uflags = 
INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PF_RFO",\ .udesc = "Prefetch RFO into LLC but do not pass to L2 (includes hints) (combine with any OPCODE umask)", \ .ufilters[1] = 0x190ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PF_CODE",\ .udesc = "Prefetch code into LLC but do not pass to L2 (includes hints) (combine with any OPCODE umask)", \ .ufilters[1] = 0x191ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PF_DATA",\ .udesc = "Prefetch data into LLC but do not pass to L2 (includes hints) (combine with any OPCODE umask)", \ .ufilters[1] = 0x192ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCIWIL",\ .udesc = "PCIe write (partial, non-allocating) - partial line MMIO write transactions from IIO (P2P). Not used for coherent transactions. Uncacheable. (combine with any OPCODE umask)", \ .ufilters[1] = 0x193ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCIWIF",\ .udesc = "PCIe write (full, non-allocating) - full line MMIO write transactions from IIO (P2P). Not used for coherent transactions. Uncacheable. 
(combine with any OPCODE umask)", \ .ufilters[1] = 0x194ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCIITOM",\ .udesc = "PCIe write (allocating) (combine with any OPCODE umask)", \ .ufilters[1] = 0x19cULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCIRDCUR",\ .udesc = "PCIe read current (combine with any OPCODE umask)", \ .ufilters[1] = 0x19eULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WBMTOI",\ .udesc = "Request writeback modified invalidate line (combine with any OPCODE umask)", \ .ufilters[1] = 0x1c4ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WBMTOE",\ .udesc = "Request writeback modified set to exclusive (combine with any OPCODE umask)", \ .ufilters[1] = 0x1c5ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_ITOM",\ .udesc = "Request invalidate line. Request exclusive ownership of the line (combine with any OPCODE umask)", \ .ufilters[1] = 0x1c8ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCINSRD",\ .udesc = "PCIe non-snoop read (combine with any OPCODE umask)", \ .ufilters[1] = 0x1e4ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCINSWR",\ .udesc = "PCIe non-snoop write (partial) (combine with any OPCODE umask)", \ .ufilters[1] = 0x1e5ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCINSWRF",\ .udesc = "PCIe non-snoop write (full) (combine with any OPCODE umask)", \ .ufilters[1] = 0x1e6ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ } static const intel_x86_umask_t hswep_unc_c_llc_lookup[]={ { .uname = "DATA_READ", .udesc = "Data read requests", .grpid = 0, .ucode = 0x300, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITE", .udesc = "Write requests. 
Includes all write transactions (cached, uncached)", .grpid = 0, .ucode = 0x500, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_SNOOP", .udesc = "External snoop request", .grpid = 0, .ucode = 0x900, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Any request", .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .ucode = 0x1100, }, { .uname = "NID", .udesc = "Match a given RTID destination NID (must provide nf=X modifier)", .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .grpid = 1, .ucode = 0x4100, .uflags = INTEL_X86_GRP_DFL_NONE }, CBO_FILT_MESIFS(2), }; static const intel_x86_umask_t hswep_unc_c_llc_victims[]={ { .uname = "STATE_M", .udesc = "Lines in M state", .ucode = 0x100, .grpid = 0, }, { .uname = "STATE_E", .udesc = "Lines in E state", .ucode = 0x200, .grpid = 0, }, { .uname = "STATE_S", .udesc = "Lines in S state", .ucode = 0x400, .grpid = 0, }, { .uname = "STATE_F", .udesc = "Lines in F state", .ucode = 0x800, .grpid = 0, }, { .uname = "MISS", .udesc = "TBD", .ucode = 0x1000, .grpid = 0, }, { .uname = "NID", .udesc = "Victimized Lines matching the NID filter (must provide nf=X modifier)", .ucode = 0x4000, .uflags = INTEL_X86_GRP_DFL_NONE, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .grpid = 1, }, }; static const intel_x86_umask_t hswep_unc_c_ring_ad_used[]={ { .uname = "UP_EVEN", .udesc = "Up and Even ring polarity filter", .ucode = 0x100, }, { .uname = "UP_ODD", .udesc = "Up and odd ring polarity filter", .ucode = 0x200, }, { .uname = "DOWN_EVEN", .udesc = "Down and even ring polarity filter", .ucode = 0x400, }, { .uname = "DOWN_ODD", .udesc = "Down and odd ring polarity filter", .ucode = 0x800, }, { .uname = "UP", .udesc = "Up ring polarity filter", .ucode = 0x3300, }, { .uname = "DOWN", .udesc = "Down ring polarity filter", .ucode = 0xcc00, }, { .uname = "ALL", .udesc = "up or down ring polarity filter", .ucode = 0xcc00, }, }; static const intel_x86_umask_t hswep_unc_c_ring_bounces[]={ { .uname = "AD_IRQ", .udesc = "TBD", .ucode = 0x200, }, { .uname = 
"AK", .udesc = "Acknowledgments to core", .ucode = 0x400, }, { .uname = "BL", .udesc = "Data responses to core", .ucode = 0x800, }, { .uname = "IV", .udesc = "Snoops of processor cache", .ucode = 0x1000, }, }; static const intel_x86_umask_t hswep_unc_c_ring_iv_used[]={ { .uname = "ANY", .udesc = "Any filter", .ucode = 0x0f00, .uflags = INTEL_X86_DFL, }, { .uname = "UP", .udesc = "Filter on any up polarity", .ucode = 0x0300, }, { .uname = "DN", .udesc = "Filter on any down polarity", .ucode = 0x0c00, }, { .uname = "DOWN", .udesc = "Filter on any down polarity", .ucode = 0xcc00, }, }; static const intel_x86_umask_t hswep_unc_c_rxr_ext_starved[]={ { .uname = "IRQ", .udesc = "Irq externally starved, therefore blocking the IPQ", .ucode = 0x100, }, { .uname = "IPQ", .udesc = "IPQ externally starved, therefore blocking the IRQ", .ucode = 0x200, }, { .uname = "PRQ", .udesc = "IRQ is blocking the ingress queue and causing starvation", .ucode = 0x400, }, { .uname = "ISMQ_BIDS", .udesc = "Number of times the ISMQ bids", .ucode = 0x800, }, }; static const intel_x86_umask_t hswep_unc_c_rxr_inserts[]={ { .uname = "IRQ", .udesc = "IRQ", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ_REJECTED", .udesc = "IRQ rejected", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IPQ", .udesc = "IPQ", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ", .udesc = "PRQ", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ_REJECTED", .udesc = "PRQ rejected", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_c_rxr_ipq_retry[]={ { .uname = "ADDR_CONFLICT", .udesc = "Address conflict", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Any Reject", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FULL", .udesc = "No Egress credits", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "QPI_CREDITS", .udesc = "No QPI credits", .ucode = 0x1000, .uflags = 
INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_c_rxr_ipq_retry2[]={ { .uname = "AD_SBO", .udesc = "Count number of time that a request from the IPQ was retried because it lacked credits to send an AD packet to SBO", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TARGET", .udesc = "Count number of times that a request from the IPQ was retried filtered by the target NodeId", .ucode = 0x100, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_c_rxr_irq_retry[]={ { .uname = "ADDR_CONFLICT", .udesc = "Address conflict", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Any reject", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FULL", .udesc = "No Egress credits", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "QPI_CREDITS", .udesc = "No QPI credits", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RTID", .udesc = "No RTIDs", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IIO_CREDITS", .udesc = "No IIO Credits", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_c_rxr_irq_retry2[]={ { .uname = "AD_SBO", .udesc = "Count number of time that a request from the IRQ was retried because it lacked credits to send an AD packet to SBO", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_SBO", .udesc = "Count number of time that a request from the IRQ was retried because it lacked credits to send an BL packet to SBO", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TARGET", .udesc = "Count number of times that a request from the IRQ was retried filtered by the target NodeId", .ucode = 0x100, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_c_rxr_ismq_retry[]={ { .uname = "ANY", .udesc = "Any reject", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FULL", 
.udesc = "No Egress credits", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IIO_CREDITS", .udesc = "No IIO credits", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "QPI_CREDITS", .udesc = "NO QPI credits", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RTID", .udesc = "No RTIDs", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WB_CREDITS", .udesc = "No WB credits", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_c_rxr_ismq_retry2[]={ { .uname = "AD_SBO", .udesc = "Count number of time that a request from the ISMQ was retried because it lacked credits to send an AD packet to SBO", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_SBO", .udesc = "Count number of time that a request from the ISMQ was retried because it lacked credits to send an BL packet to SBO", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TARGET", .udesc = "Count number of times that a request from the ISMQ was retried filtered by the target NodeId", .ucode = 0x100, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_c_tor_inserts[]={ { .uname = "OPCODE", .udesc = "Number of transactions inserted into the TOR that match an opcode (must provide opc_* umask)", .ucode = 0x100, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_OPCODE", .udesc = "Number of miss transactions inserted into the TOR that match an opcode (must provide opc_* umask)", .ucode = 0x300, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EVICTION", .udesc = "Number of Evictions transactions inserted into TOR", .ucode = 0x400, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "ALL", .udesc = "Number of transactions inserted in TOR", .ucode = 0x800, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, }, { .uname = "WB", .udesc = "Number of write transactions inserted into the TOR", .ucode = 
0x1000, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "LOCAL_OPCODE", .udesc = "Number of opcode-matched transactions inserted into the TOR that are satisfied by locally homed memory", .ucode = 0x2100, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_LOCAL_OPCODE", .udesc = "Number of miss opcode-matched transactions inserted into the TOR that are satisfied by locally homed memory", .ucode = 0x2300, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL", .udesc = "Number of transactions inserted into the TOR that are satisfied by locally homed memory", .ucode = 0x2800, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_LOCAL", .udesc = "Number of miss transactions inserted into the TOR that are satisfied by locally homed memory", .ucode = 0x2a00, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_OPCODE", .udesc = "Number of transactions inserted into the TOR that match a NID and opcode (must provide opc_* umask and nf=X modifier)", .ucode = 0x4100, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NID_MISS_OPCODE", .udesc = "Number of NID and opcode matched miss transactions inserted into the TOR (must provide opc_* umask and nf=X modifier)", .ucode = 0x4300, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NID_EVICTION", .udesc = "Number of NID-matched eviction transactions inserted into the TOR (must provide nf=X modifier)", .ucode = 0x4400, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_ALL", .udesc = "Number of NID-matched transactions inserted into the TOR (must provide nf=X modifier)", .ucode = 0x4800, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_MISS_ALL", .udesc = "Number of NID-matched miss transactions that were 
inserted into the TOR (must provide nf=X modifier)", .ucode = 0x4a00, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_WB", .udesc = "Number of NID-matched write back transactions inserted into the TOR (must provide nf=X modifier)", .ucode = 0x5000, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "REMOTE_OPCODE", .udesc = "Number of opcode-matched transactions inserted into the TOR that are satisfied by remote caches or memory", .ucode = 0x8100, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_REMOTE_OPCODE", .udesc = "Number of miss opcode-matched transactions inserted into the TOR that are satisfied by remote caches or memory", .ucode = 0x8300, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE", .udesc = "Number of transactions inserted into the TOR that are satisfied by remote caches or memory", .ucode = 0x8800, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_REMOTE", .udesc = "Number of miss transactions inserted into the TOR that are satisfied by remote caches or memory", .ucode = 0x8a00, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, CBO_FILT_OPC(1) }; static const intel_x86_umask_t hswep_unc_c_tor_occupancy[]={ { .uname = "OPCODE", .udesc = "Number of TOR entries that match an opcode (must provide opc_* umask)", .ucode = 0x100, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_OPCODE", .udesc = "Number of outstanding miss requests in the TOR that match an opcode (must provide opc_* umask)", .ucode = 0x300, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EVICTION", .udesc = "Number of outstanding eviction transactions in the TOR", .ucode = 0x400, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "ALL", .udesc = "All valid TOR entries", .ucode = 0x800, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | 
INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_ALL", .udesc = "Number of outstanding miss requests in the TOR", .ucode = 0xa00, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "WB", .udesc = "Number of write transactions in the TOR. Does not include RFO, but actual operations that contain data being sent from the core", .ucode = 0x1000, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "LOCAL_OPCODE", .udesc = "Number of opcode-matched transactions in the TOR that are satisfied by locally homed memory", .ucode = 0x2100, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_LOCAL_OPCODE", .udesc = "Number of miss opcode-matched transactions in the TOR that are satisfied by locally homed memory", .ucode = 0x2300, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL", .udesc = "Number of transactions in the TOR that are satisfied by locally homed memory", .ucode = 0x2800, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_LOCAL", .udesc = "Number of miss transactions in the TOR that are satisfied by locally homed memory", .ucode = 0x2a00, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_OPCODE", .udesc = "Number of NID-matched TOR entries that match an opcode (must provide nf=X modifier and opc_* umask)", .ucode = 0x4100, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NID_MISS_OPCODE", .udesc = "Number of NID-matched outstanding miss requests in the TOR that match an opcode (must provide nf=X modifier and opc_* umask)", .ucode = 0x4300, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NID_EVICTION", .udesc = "Number of NID-matched outstanding requests in the TOR (must provide a nf=X modifier)", .ucode = 0x4400, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_ALL", .udesc = "Number of 
NID-matched outstanding requests in the TOR (must provide nf=X modifier)", .ucode = 0x4800, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_MISS_ALL", .udesc = "Number of NID-matched outstanding miss requests in the TOR (must provide a nf=X modifier)", .ucode = 0x4a00, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_WB", .udesc = "Number of NID-matched write transactions in the TOR (must provide a nf=X modifier)", .ucode = 0x5000, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "REMOTE_OPCODE", .udesc = "Number of opcode-matched transactions in the TOR that are satisfied by remote caches or memory", .ucode = 0x8100, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_REMOTE_OPCODE", .udesc = "Number of miss opcode-matched transactions in the TOR that are satisfied by remote caches or memory", .ucode = 0x8300, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE", .udesc = "Number of transactions in the TOR that are satisfied by remote caches or memory", .ucode = 0x8800, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_REMOTE", .udesc = "Number of miss transactions inserted into the TOR that are satisfied by remote caches or memory", .ucode = 0x8a00, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, CBO_FILT_OPC(1) }; static const intel_x86_umask_t hswep_unc_c_txr_inserts[]={ { .uname = "AD_CACHE", .udesc = "Counts the number of ring transactions from Cachebo to AD ring", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_CACHE", .udesc = "Counts the number of ring transactions from Cachebo to AK ring", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CACHE", .udesc = "Counts the number of ring transactions from Cachebo to BL ring", .ucode = 0x400, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "IV_CACHE", .udesc = "Counts the number of ring transactions from Cachebo to IV ring", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CORE", .udesc = "Counts the number of ring transactions from Corebo to AD ring", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_CORE", .udesc = "Counts the number of ring transactions from Corebo to AK ring", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CORE", .udesc = "Counts the number of ring transactions from Corebo to BL ring", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_c_txr_ads_used[]={ { .uname = "AD", .udesc = "Onto AD ring", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "Onto AK ring", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "Onto BL ring", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, } }; static const intel_x86_umask_t hswep_unc_c_misc[]={ { .uname = "RSPI_WAS_FSE", .udesc = "Counts the number of times when a snoop hit in FSE states and triggered a silent eviction. This is useful because this information is lost in the PRE encodings", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WC_ALIASING", .udesc = "Counts the number of times a USWC write (WCIL(F)) transaction hits in the LLC in M state, triggering a WBMTOI followed by the USWC write. This occurs when there is WC aliasing", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STARTED", .udesc = "TBD", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT_S", .udesc = "Counts the number of times that an RFO hits in S state. 
This is useful for determining if it might be good for a workload to use RSPIWB instead of RSPSWB", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CVZERO_PREFETCH_VICTIM", .udesc = "Counts the number of clean victims with raw CV=0 (core valid)", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CVZERO_PREFETCH_MISS", .udesc = "Counts the number of Demand Data Read requests hitting non-modified state lines with raw CV=0 (core valid)", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_c_sbo_credits_acquired[]={ { .uname = "AD", .udesc = "for AD ring", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "for BL ring", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, } }; static const intel_x86_entry_t intel_hswep_unc_c_pe[]={ { .name = "UNC_C_CLOCKTICKS", .desc = "C-box Uncore clockticks", .modmsk = 0x0, .cntmsk = 0xf, .code = 0x00, .flags = INTEL_X86_FIXED, }, { .name = "UNC_C_COUNTER0_OCCUPANCY", .desc = "Counter 0 occupancy. Counts the occupancy related information by filtering CB0 occupancy count captured in counter 0.", .modmsk = HSWEP_UNC_CBO_ATTRS, .cntmsk = 0xe, .code = 0x1f, }, { .name = "UNC_C_LLC_LOOKUP", .desc = "Cache lookups", .modmsk = HSWEP_UNC_CBO_NID_ATTRS, .cntmsk = 0xf, .code = 0x34, .ngrp = 3, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_llc_lookup), .umasks = hswep_unc_c_llc_lookup, }, { .name = "UNC_C_LLC_VICTIMS", .desc = "Lines victimized", .modmsk = HSWEP_UNC_CBO_NID_ATTRS, .cntmsk = 0xf, .code = 0x37, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_llc_victims), .ngrp = 2, .umasks = hswep_unc_c_llc_victims, }, { .name = "UNC_C_MISC", .desc = "Miscellaneous C-Box events", .modmsk = HSWEP_UNC_CBO_ATTRS, .cntmsk = 0xf, .code = 0x39, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_misc), .ngrp = 1, .umasks = hswep_unc_c_misc, }, { .name = "UNC_C_RING_AD_USED", .desc = "Address ring in use. 
Counts number of cycles ring is being used at this ring stop", .modmsk = HSWEP_UNC_CBO_ATTRS, .cntmsk = 0xf, .code = 0x1b, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_ring_ad_used), .ngrp = 1, .umasks = hswep_unc_c_ring_ad_used, }, { .name = "UNC_C_RING_AK_USED", .desc = "Acknowledgement ring in use. Counts number of cycles ring is being used at this ring stop", .modmsk = HSWEP_UNC_CBO_ATTRS, .cntmsk = 0xf, .code = 0x1c, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_ring_ad_used), /* identical to RING_AD_USED */ .ngrp = 1, .umasks = hswep_unc_c_ring_ad_used, }, { .name = "UNC_C_RING_BL_USED", .desc = "Bus or Data ring in use. Counts number of cycles ring is being used at this ring stop", .modmsk = HSWEP_UNC_CBO_ATTRS, .cntmsk = 0xf, .code = 0x1d, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_ring_ad_used), /* identical to RING_AD_USED */ .ngrp = 1, .umasks = hswep_unc_c_ring_ad_used, }, { .name = "UNC_C_RING_BOUNCES", .desc = "Number of LLC responses that bounced in the ring", .modmsk = HSWEP_UNC_CBO_ATTRS, .cntmsk = 0xf, .code = 0x05, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_ring_bounces), .ngrp = 1, .umasks = hswep_unc_c_ring_bounces, }, { .name = "UNC_C_FAST_ASSERTED", .desc = "Number of cycles in which the local distress or incoming distress signals are asserted (FaST). Incoming distress includes both up and down", .modmsk = HSWEP_UNC_CBO_ATTRS, .cntmsk = 0x3, .code = 0x09, }, { .name = "UNC_C_BOUNCE_CONTROL", .desc = "Bounce control", .modmsk = HSWEP_UNC_CBO_ATTRS, .cntmsk = 0xf, .code = 0x0a, }, { .name = "UNC_C_RING_IV_USED", .desc = "Invalidate ring in use. 
Counts number of cycles ring is being used at this ring stop", .modmsk = HSWEP_UNC_CBO_ATTRS, .cntmsk = 0xf, .code = 0x1e, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_ring_iv_used), .ngrp = 1, .umasks = hswep_unc_c_ring_iv_used, }, { .name = "UNC_C_RING_SRC_THRTL", .desc = "TBD", .modmsk = HSWEP_UNC_CBO_ATTRS, .cntmsk = 0xf, .code = 0x07, }, { .name = "UNC_C_RXR_EXT_STARVED", .desc = "Ingress arbiter blocking cycles", .modmsk = HSWEP_UNC_CBO_ATTRS, .cntmsk = 0xf, .code = 0x12, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_rxr_ext_starved), .ngrp = 1, .umasks = hswep_unc_c_rxr_ext_starved, }, { .name = "UNC_C_RXR_INSERTS", .desc = "Ingress Allocations", .code = 0x13, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_rxr_inserts), .umasks = hswep_unc_c_rxr_inserts }, { .name = "UNC_C_RXR_IPQ_RETRY", .desc = "Probe Queue Retries", .code = 0x31, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_rxr_ipq_retry), .umasks = hswep_unc_c_rxr_ipq_retry }, { .name = "UNC_C_RXR_IPQ_RETRY2", .desc = "Probe Queue Retries", .code = 0x28, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_CBO_NID_ATTRS, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_rxr_ipq_retry2), .umasks = hswep_unc_c_rxr_ipq_retry2 }, { .name = "UNC_C_RXR_IRQ_RETRY", .desc = "Ingress Request Queue Rejects", .code = 0x32, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_rxr_irq_retry), .umasks = hswep_unc_c_rxr_irq_retry }, { .name = "UNC_C_RXR_IRQ_RETRY2", .desc = "Ingress Request Queue Rejects", .code = 0x29, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_CBO_NID_ATTRS, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_rxr_irq_retry2), .umasks = hswep_unc_c_rxr_irq_retry2 }, { .name = "UNC_C_RXR_ISMQ_RETRY", .desc = "ISMQ Retries", .code = 0x33, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_CBO_ATTRS, .numasks = 
LIBPFM_ARRAY_SIZE(hswep_unc_c_rxr_ismq_retry), .umasks = hswep_unc_c_rxr_ismq_retry }, { .name = "UNC_C_RXR_ISMQ_RETRY2", .desc = "ISMQ Retries", .code = 0x2a, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_CBO_NID_ATTRS, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_rxr_ismq_retry2), .umasks = hswep_unc_c_rxr_ismq_retry2 }, { .name = "UNC_C_RXR_OCCUPANCY", .desc = "Ingress Occupancy", .code = 0x11, .cntmsk = 0x1, .ngrp = 1, .modmsk = HSWEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_rxr_inserts), .umasks = hswep_unc_c_rxr_inserts, /* identical to hswep_unc_c_rxr_inserts */ }, { .name = "UNC_C_TOR_INSERTS", .desc = "TOR Inserts", .code = 0x35, .cntmsk = 0xf, .ngrp = 2, .modmsk = HSWEP_UNC_CBO_NID_ATTRS | _SNBEP_UNC_ATTR_ISOC | _SNBEP_UNC_ATTR_NC, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_tor_inserts), .umasks = hswep_unc_c_tor_inserts }, { .name = "UNC_C_TOR_OCCUPANCY", .desc = "TOR Occupancy", .code = 0x36, .cntmsk = 0x1, .ngrp = 2, .modmsk = HSWEP_UNC_CBO_NID_ATTRS | _SNBEP_UNC_ATTR_ISOC | _SNBEP_UNC_ATTR_NC, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_tor_occupancy), .umasks = hswep_unc_c_tor_occupancy }, { .name = "UNC_C_TXR_ADS_USED", .desc = "Egress events", .code = 0x04, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_txr_ads_used), .umasks = hswep_unc_c_txr_ads_used }, { .name = "UNC_C_TXR_INSERTS", .desc = "Egress allocations", .code = 0x02, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_txr_inserts), .umasks = hswep_unc_c_txr_inserts }, { .name = "UNC_C_SBO_CREDITS_ACQUIRED", .desc = "SBO credits acquired", .code = 0x3d, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_sbo_credits_acquired), .umasks = hswep_unc_c_sbo_credits_acquired }, { .name = "UNC_C_SBO_CREDITS_OCCUPANCY", .desc = "SBO credits 
occupancy", .code = 0x3e, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_c_sbo_credits_acquired), /* shared */ .umasks = hswep_unc_c_sbo_credits_acquired }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_hswep_unc_ha_events.h000066400000000000000000001020741502707512200254350ustar00rootroot00000000000000/* * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
 * * PMU: hswep_unc_ha (Intel Haswell-EP HA uncore PMU) */ static const intel_x86_umask_t hswep_unc_h_directory_lookup[]={ { .uname = "NO_SNP", .udesc = "Snoop not needed", .ucode = 0x200, }, { .uname = "SNOOP", .udesc = "Snoop needed", .ucode = 0x100, }, }; static const intel_x86_umask_t hswep_unc_h_bypass_imc[]={ { .uname = "TAKEN", .udesc = "Bypass taken", .ucode = 0x100, }, { .uname = "NOT_TAKEN", .udesc = "Bypass not taken", .ucode = 0x200, }, }; static const intel_x86_umask_t hswep_unc_h_directory_update[]={ { .uname = "ANY", .udesc = "Counts any directory update", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CLEAR", .udesc = "Directory clears", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SET", .udesc = "Directory set", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_h_hitme_hit[]={ { .uname = "ALL", .udesc = "All requests", .ucode = 0xff00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "READ_OR_INVITOE", .udesc = "Number of hits with opcode RdCode, RdData, RdDataMigratory, RdInvOwn, RdCur or InvToE", .ucode = 0x100, }, { .uname = "WBMTOI", .udesc = "Number of hits with opcode WbMtoI", .ucode = 0x200, }, { .uname = "ACKCNFLTWBI", .udesc = "Number of hits with opcode AckCnfltWbI", .ucode = 0x400, }, { .uname = "WBMTOE_OR_S", .udesc = "Number of hits with opcode WbMtoE or WbMtoS", .ucode = 0x800, }, { .uname = "HOM", .udesc = "Number of hits with HOM requests", .ucode = 0xf00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPFWDI_REMOTE", .udesc = "Number of hits with opcode RspIFwd, RspIFwdWb for remote requests", .ucode = 0x1000, }, { .uname = "RSPFWDI_LOCAL", .udesc = "Number of hits with opcode RspIFwd, RspIFwdWb for local requests", .ucode = 0x2000, }, { .uname = "INVALS", .udesc = "Number of hits for invalidations", .ucode = 0x2600, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPFWDS", .udesc = "Number of hits with opcode RspSFwd, RspSFwdWb", .ucode = 0x4000, 
}, { .uname = "EVICTS", .udesc = "Number of hits for evictions", .ucode = 0x4200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALLOCS", .udesc = "Number of hits for allocations", .ucode = 0x7000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSP", .udesc = "Number of hits with opcode RspI, RspIWb, RspSWb, RspCnflt, RspCnfltWbI", .ucode = 0x8000, } }; static const intel_x86_umask_t hswep_unc_h_hitme_hit_pv_bits_set[]={ { .uname = "ALL", .udesc = "All requests", .ucode = 0xff00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "READ_OR_INVITOE", .udesc = "Number of hits with opcode RdCode, RdData, RdDataMigratory, RdInvOwn, RdCur or InvToE", .ucode = 0x100, }, { .uname = "WBMTOI", .udesc = "Number of hits with opcode WbMtoI", .ucode = 0x200, }, { .uname = "ACKCNFLTWBI", .udesc = "Number of hits with opcode AckCnfltWbI", .ucode = 0x400, }, { .uname = "WBMTOE_OR_S", .udesc = "Number of hits with opcode WbMtoE or WbMtoS", .ucode = 0x800, }, { .uname = "HOM", .udesc = "Number of hits with HOM requests", .ucode = 0xf00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPFWDI_REMOTE", .udesc = "Number of hits with opcode RspIFwd, RspIFwdWb for remote requests", .ucode = 0x1000, }, { .uname = "RSPFWDI_LOCAL", .udesc = "Number of hits with opcode RspIFwd, RspIFwdWb for local requests", .ucode = 0x2000, }, { .uname = "RSPFWDS", .udesc = "Number of hits with opcode RspSFwd, RspSFwdWb", .ucode = 0x4000, }, { .uname = "RSP", .udesc = "Number of hits with opcode RspI, RspIWb, RspSWb, RspCnflt, RspCnfltWbI", .ucode = 0x8000, } }; static const intel_x86_umask_t hswep_unc_h_igr_no_credit_cycles[]={ { .uname = "AD_QPI0", .udesc = "AD to QPI link 0", .ucode = 0x100, }, { .uname = "AD_QPI1", .udesc = "AD to QPI link 1", .ucode = 0x200, }, { .uname = "BL_QPI0", .udesc = "BL to QPI link 0", .ucode = 0x400, }, { .uname = "BL_QPI1", .udesc = "BL to QPI link 1", .ucode = 0x800, }, { .uname = "AD_QPI2", .udesc = "AD to QPI link 2", .ucode = 0x1000, }, { .uname = "BL_QPI2", .udesc = "BL to 
QPI link 2", .ucode = 0x2000, }, }; static const intel_x86_umask_t hswep_unc_h_imc_writes[]={ { .uname = "ALL", .udesc = "Counts all writes", .ucode = 0xf00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FULL", .udesc = "Counts full line non ISOCH", .ucode = 0x100, }, { .uname = "PARTIAL", .udesc = "Counts partial non-ISOCH", .ucode = 0x200, }, { .uname = "FULL_ISOCH", .udesc = "Counts ISOCH full line", .ucode = 0x400, }, { .uname = "PARTIAL_ISOCH", .udesc = "Counts ISOCH partial", .ucode = 0x800, }, }; static const intel_x86_umask_t hswep_unc_h_imc_reads[]={ { .uname = "NORMAL", .udesc = "Normal priority", .ucode = 0x100, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t hswep_unc_h_requests[]={ { .uname = "READS", .udesc = "Counts incoming read requests. Good proxy for LLC read misses, incl. RFOs", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_LOCAL", .udesc = "Counts incoming read requests coming from local socket. Good proxy for LLC read misses, incl. RFOs from the local socket", .ucode = 0x100, }, { .uname = "READS_REMOTE", .udesc = "Counts incoming read requests coming from remote socket. Good proxy for LLC read misses, incl. 
RFOs from the remote socket", .ucode = 0x200, }, { .uname = "WRITES", .udesc = "Counts incoming writes", .ucode = 0xc00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITES_LOCAL", .udesc = "Counts incoming writes from local socket", .ucode = 0x400, }, { .uname = "WRITES_REMOTE", .udesc = "Counts incoming writes from remote socket", .ucode = 0x800, }, { .uname = "INVITOE_LOCAL", .udesc = "Counts InvItoE coming from local socket", .ucode = 0x1000, }, { .uname = "INVITOE_REMOTE", .udesc = "Counts InvItoE coming from remote socket", .ucode = 0x2000, } }; static const intel_x86_umask_t hswep_unc_h_rpq_cycles_no_reg_credits[]={ { .uname = "CHN0", .udesc = "Channel 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN1", .udesc = "Channel 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN2", .udesc = "Channel 2", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN3", .udesc = "Channel 3", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_h_tad_requests_g0[]={ { .uname = "REGION0", .udesc = "Counts for TAD Region 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION1", .udesc = "Counts for TAD Region 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION2", .udesc = "Counts for TAD Region 2", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION3", .udesc = "Counts for TAD Region 3", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION4", .udesc = "Counts for TAD Region 4", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION5", .udesc = "Counts for TAD Region 5", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION6", .udesc = "Counts for TAD Region 6", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION7", .udesc = "Counts for TAD Region 7", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_h_tad_requests_g1[]={ { .uname = "REGION8", .udesc = "Counts for 
TAD Region 8", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION9", .udesc = "Counts for TAD Region 9", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION10", .udesc = "Counts for TAD Region 10", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION11", .udesc = "Counts for TAD Region 11", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_h_snoop_resp[]={ { .uname = "RSPI", .udesc = "Filters for snoop responses of RspI. RspI is returned when the remote cache does not have the data or when the remote cache silently evicts data (e.g. RFO hit non-modified line)", .ucode = 0x100, }, { .uname = "RSPS", .udesc = "Filters for snoop responses of RspS. RspS is returned when the remote cache has the data but is not forwarding it. It is a way to let the requesting socket know that it cannot allocate the data in E-state", .ucode = 0x200, }, { .uname = "RSPIFWD", .udesc = "Filters for snoop responses of RspIFwd. RspIFwd is returned when the remote cache agent forwards data and the requesting agent is able to acquire the data in E or M state. This is commonly returned with RFO transactions. It can be either HitM or HitFE", .ucode = 0x400, }, { .uname = "RSPSFWD", .udesc = "Filters for snoop responses of RspSFwd. RspSFwd is returned when the remote cache agent forwards data but holds on to its current copy. This is common for data and code reads that hit in a remote socket in E or F state", .ucode = 0x800, }, { .uname = "RSP_WB", .udesc = "Filters for snoop responses of RspIWB or RspSWB. This is returned when a non-RFO request hits in M-state. Data and code reads can return either RspIWB or RspSWB depending on how the system has been configured. InvItoE transactions will also return RspIWB because they must acquire ownership", .ucode = 0x1000, }, { .uname = "RSP_FWD_WB", .udesc = "Filters for snoop responses of RspxFwdxWB. This snoop response is only used in 4s systems. 
It is used when a snoop HITM in a remote caching agent and it directly forwards data to a requester and simultaneously returns data to the home to be written back to memory", .ucode = 0x2000, }, { .uname = "RSPCNFLCT", .udesc = "Filters for snoop responses of RspConflict. This is returned when a snoop finds an existing outstanding transaction in a remote caching agent when it CAMs that caching agent. This triggers the conflict resolution hardware. This covers both RspCnflct and RspCnflctWBI", .ucode = 0x4000, }, }; static const intel_x86_umask_t hswep_unc_h_txr_ad_cycles_full[]={ { .uname = "ALL", .udesc = "Counts cycles full from both schedulers", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "SCHED0", .udesc = "Counts cycles full from scheduler bank 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCHED1", .udesc = "Counts cycles full from scheduler bank 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_h_txr_bl_occupancy[]={ { .uname = "SCHED0", .udesc = "Counts cycles full from scheduler bank 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCHED1", .udesc = "Counts cycles full from scheduler bank 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_h_txr_ak_cycles_full[]={ { .uname = "ALL", .udesc = "Counts cycles from both schedulers", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "SCHED0", .udesc = "Counts cycles from scheduler bank 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCHED1", .udesc = "Counts cycles from scheduler bank 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_h_txr_bl[]={ { .uname = "DRS_CACHE", .udesc = "Counts data being sent to the cache", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_CORE", .udesc = "Counts data being sent directly to the requesting core", .ucode = 0x200, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "DRS_QPI", .udesc = "Counts data being sent to a remote socket over QPI", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_h_osb[]={ { .uname = "REMOTE", .udesc = "Remote", .ucode = 0x800, }, { .uname = "READS_LOCAL", .udesc = "Local reads", .ucode = 0x200, }, { .uname = "INVITOE_LOCAL", .udesc = "Local InvItoE", .ucode = 0x400, }, { .uname = "CANCELLED", .udesc = "Cancelled due to D2C or Other", .ucode = 0x1000, }, { .uname = "READS_LOCAL_USEFUL", .udesc = "Local reads - useful", .ucode = 0x2000, }, { .uname = "REMOTE_USEFUL", .udesc = "Remote - useful", .ucode = 0x4000, } }; static const intel_x86_umask_t hswep_unc_h_osb_edr[]={ { .uname = "ALL", .udesc = "All data returns", .ucode = 0x100, .uflags = INTEL_X86_DFL | INTEL_X86_NCOMBO, }, { .uname = "READS_LOCAL_I", .udesc = "Reads to local I", .ucode = 0x200, }, { .uname = "READS_REMOTE_I", .udesc = "Reads to remote I", .ucode = 0x400, }, { .uname = "READS_LOCAL_S", .udesc = "Reads to local S", .ucode = 0x800, }, { .uname = "READS_REMOTE_S", .udesc = "Reads to remote S", .ucode = 0x1000, } }; static const intel_x86_umask_t hswep_unc_h_ring_ad_used[]={ { .uname = "CCW_EVEN", .udesc = "Counter-clockwise and even ring polarity", .ucode = 0x400, }, { .uname = "CCW_ODD", .udesc = "Counter-clockwise and odd ring polarity", .ucode = 0x800, }, { .uname = "CW_EVEN", .udesc = "Clockwise and even ring polarity", .ucode = 0x100, }, { .uname = "CW_ODD", .udesc = "Clockwise and odd ring polarity", .ucode = 0x200, }, { .uname = "CW", .udesc = "Clockwise with any polarity", .ucode = 0x3300, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CCW", .udesc = "Counter-clockwise with any polarity", .ucode = 0xcc00, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_h_snp_resp_recv_local[]={ { .uname = "RSPI", .udesc = "Filters for snoop responses of RspI. 
RspI is returned when the remote cache does not have the data or when the remote cache silently evicts data (e.g. RFO hit non-modified line)", .ucode = 0x100, }, { .uname = "RSPS", .udesc = "Filters for snoop responses of RspS. RspS is returned when the remote cache has the data but is not forwarding it. It is a way to let the requesting socket know that it cannot allocate the data in E-state", .ucode = 0x200, }, { .uname = "RSPIFWD", .udesc = "Filters for snoop responses of RspIFwd. RspIFwd is returned when the remote cache agent forwards data and the requesting agent is able to acquire the data in E or M state. This is commonly returned with RFO transactions. It can be either HitM or HitFE", .ucode = 0x400, }, { .uname = "RSPSFWD", .udesc = "Filters for snoop responses of RspSFwd. RspSFwd is returned when the remote cache agent forwards data but holds on to its current copy. This is common for data and code reads that hit in a remote socket in E or F state", .ucode = 0x800, }, { .uname = "RSP_WB", .udesc = "Filters for snoop responses of RspIWB or RspSWB. This is returned when a non-RFO request hits in M-state. Data and code reads can return either RspIWB or RspSWB depending on how the system has been configured. InvItoE transactions will also return RspIWB because they must acquire ownership", .ucode = 0x1000, }, { .uname = "RSP_FWD_WB", .udesc = "Filters for snoop responses of RspxFwdxWB. This snoop response is only used in 4s systems. It is used when a snoop HITM in a remote caching agent and it directly forwards data to a requester and simultaneously returns data to the home to be written back to memory", .ucode = 0x2000, }, { .uname = "RSPCNFLCT", .udesc = "Filters for snoop responses of RspConflict. This is returned when a snoop finds an existing outstanding transaction in a remote caching agent when it CAMs that caching agent. This triggers the conflict resolution hardware.
This covers both RspConflct and RspCnflctWBI", .ucode = 0x4000, }, { .uname = "OTHER", .udesc = "Filters all other snoop responses", .ucode = 0x8000, }, }; static const intel_x86_umask_t hswep_unc_h_sbo0_credits_acquired[]={ { .uname = "AD", .udesc = "For AD ring", .ucode = 0x100, }, { .uname = "BL", .udesc = "For BL ring", .ucode = 0x200, }, }; static const intel_x86_umask_t hswep_unc_h_snoops_rsp_after_data[]={ { .uname = "LOCAL", .udesc = "Local", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE", .udesc = "Remote", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_h_snoops_cycles_ne[]={ { .uname = "ALL", .udesc = "Local and remote", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "LOCAL", .udesc = "Local", .ucode = 0x100, }, { .uname = "REMOTE", .udesc = "Remote", .ucode = 0x200, }, }; static const intel_x86_umask_t hswep_unc_h_txr_ak[]={ { .uname = "NDR", .udesc = "Number of outbound NDR (non-data response) transactions sent on the AK ring.
AK NDR is used for messages to the local socket", .ucode = 0x100, }, { .uname = "CRD_CBO", .udesc = "Number of outbound CRD transactions sent on the AK ring to CBO", .ucode = 0x200, }, { .uname = "CRD_QPI", .udesc = "Number of outbound CRD transactions sent on the AK ring to QPI", .ucode = 0x400, }, }; static const intel_x86_umask_t hswep_unc_h_stall_no_sbo_credit[]={ { .uname = "SBO0_AD", .udesc = "No credit for SBO0 AD Ring", .ucode = 0x100, }, { .uname = "SBO1_AD", .udesc = "No credit for SBO1 AD Ring", .ucode = 0x200, }, { .uname = "SBO0_BL", .udesc = "No credit for SBO0 BL Ring", .ucode = 0x400, }, { .uname = "SBO1_BL", .udesc = "No credit for SBO1 BL Ring", .ucode = 0x800, }, }; static const intel_x86_umask_t hswep_unc_h_tracker_occupancy[]={ { .uname = "READS_LOCAL", .udesc = "Local read requests", .ucode = 0x400, }, { .uname = "READS_REMOTE", .udesc = "Remote read requests", .ucode = 0x800, }, { .uname = "WRITES_LOCAL", .udesc = "Local write requests", .ucode = 0x1000, }, { .uname = "WRITES_REMOTE", .udesc = "Remote write requests", .ucode = 0x2000, }, { .uname = "INVITOE_LOCAL", .udesc = "Local InvItoE requests", .ucode = 0x4000, }, { .uname = "INVITOE_REMOTE", .udesc = "Remote InvItoE requests", .ucode = 0x8000, } }; static const intel_x86_umask_t hswep_unc_h_txr_starved[]={ { .uname = "AK", .udesc = "For AK ring", .ucode = 0x100, }, { .uname = "BL", .udesc = "For BL ring", .ucode = 0x200, }, }; static const intel_x86_entry_t intel_hswep_unc_h_pe[]={ { .name = "UNC_H_CLOCKTICKS", .desc = "HA Uncore clockticks", .modmsk = HSWEP_UNC_HA_ATTRS, .cntmsk = 0xf, .code = 0x00, }, { .name = "UNC_H_CONFLICT_CYCLES", .desc = "Conflict Checks", .code = 0xb, .cntmsk = 0x2, .modmsk = HSWEP_UNC_HA_ATTRS, }, { .name = "UNC_H_DIRECT2CORE_COUNT", .desc = "Direct2Core Messages Sent", .code = 0x11, .cntmsk = 0xf, .modmsk = HSWEP_UNC_HA_ATTRS, }, { .name = "UNC_H_DIRECT2CORE_CYCLES_DISABLED", .desc = "Cycles when Direct2Core was Disabled", .code = 0x12, .cntmsk = 0xf, .modmsk
= HSWEP_UNC_HA_ATTRS, }, { .name = "UNC_H_DIRECT2CORE_TXN_OVERRIDE", .desc = "Number of Reads that had Direct2Core Overridden", .code = 0x13, .cntmsk = 0xf, .modmsk = HSWEP_UNC_HA_ATTRS, }, { .name = "UNC_H_DIRECTORY_LOOKUP", .desc = "Directory Lookups", .code = 0xc, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_directory_lookup), .umasks = hswep_unc_h_directory_lookup }, { .name = "UNC_H_DIRECTORY_UPDATE", .desc = "Directory Updates", .code = 0xd, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_directory_update), .umasks = hswep_unc_h_directory_update }, { .name = "UNC_H_IGR_NO_CREDIT_CYCLES", .desc = "Cycles without QPI Ingress Credits", .code = 0x22, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_igr_no_credit_cycles), .umasks = hswep_unc_h_igr_no_credit_cycles }, { .name = "UNC_H_IMC_RETRY", .desc = "Retry Events", .code = 0x1e, .cntmsk = 0xf, .modmsk = HSWEP_UNC_HA_ATTRS, }, { .name = "UNC_H_IMC_WRITES", .desc = "HA to IMC Full Line Writes Issued", .code = 0x1a, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_imc_writes), .umasks = hswep_unc_h_imc_writes }, { .name = "UNC_H_IMC_READS", .desc = "HA to IMC normal priority reads issued", .code = 0x17, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_imc_reads), .umasks = hswep_unc_h_imc_reads }, { .name = "UNC_H_REQUESTS", .desc = "Read and Write Requests", .code = 0x1, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_requests), .umasks = hswep_unc_h_requests }, { .name = "UNC_H_RPQ_CYCLES_NO_REG_CREDITS", .desc = "IMC RPQ Credits Empty", .code = 0x15, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_rpq_cycles_no_reg_credits), .umasks = hswep_unc_h_rpq_cycles_no_reg_credits 
}, { .name = "UNC_H_TAD_REQUESTS_G0", .desc = "HA Requests to a TAD Region", .code = 0x1b, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_tad_requests_g0), .umasks = hswep_unc_h_tad_requests_g0 }, { .name = "UNC_H_TAD_REQUESTS_G1", .desc = "HA Requests to a TAD Region", .code = 0x1c, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_tad_requests_g1), .umasks = hswep_unc_h_tad_requests_g1 }, { .name = "UNC_H_TXR_AD_CYCLES_FULL", .desc = "AD Egress Full", .code = 0x2a, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_txr_ad_cycles_full), .umasks = hswep_unc_h_txr_ad_cycles_full }, { .name = "UNC_H_TXR_AK_CYCLES_FULL", .desc = "AK Egress Full", .code = 0x32, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_txr_ak_cycles_full), .umasks = hswep_unc_h_txr_ak_cycles_full }, { .name = "UNC_H_TXR_AK", .desc = "Outbound Ring Transactions on AK", .code = 0xe, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_txr_ak), .umasks = hswep_unc_h_txr_ak }, { .name = "UNC_H_TXR_BL", .desc = "Outbound DRS Ring Transactions to Cache", .code = 0x10, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_txr_bl), .umasks = hswep_unc_h_txr_bl }, { .name = "UNC_H_TXR_BL_CYCLES_FULL", .desc = "BL Egress Full", .code = 0x36, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_txr_ak_cycles_full), .umasks = hswep_unc_h_txr_ak_cycles_full, /* identical to snbep_unc_h_txr_ak_cycles_full */ }, { .name = "UNC_H_WPQ_CYCLES_NO_REG_CREDITS", .desc = "HA IMC CHN0 WPQ Credits Empty", .code = 0x18, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_rpq_cycles_no_reg_credits), .umasks = hswep_unc_h_rpq_cycles_no_reg_credits, /* shared */ 
}, { .name = "UNC_H_BT_BYPASS", .desc = "Backup Tracker bypass", .code = 0x52, .cntmsk = 0xf, .modmsk = HSWEP_UNC_HA_ATTRS, }, { .name = "UNC_H_BYPASS_IMC", .desc = "HA to IMC bypass", .code = 0x14, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_bypass_imc), .umasks = hswep_unc_h_bypass_imc, }, { .name = "UNC_H_BT_CYCLES_NE", .desc = "Backup Tracker cycles not empty", .code = 0x42, .cntmsk = 0xf, .modmsk = HSWEP_UNC_HA_ATTRS, }, { .name = "UNC_H_BT_OCCUPANCY", .desc = "Backup Tracker inserts", .code = 0x43, .cntmsk = 0xf, .modmsk = HSWEP_UNC_HA_ATTRS, }, { .name = "UNC_H_OSB", .desc = "OSB snoop broadcast", .code = 0x53, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_osb), .umasks = hswep_unc_h_osb, }, { .name = "UNC_H_OSB_EDR", .desc = "OSB early data return", .code = 0x54, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_osb_edr), .umasks = hswep_unc_h_osb_edr, }, { .name = "UNC_H_RING_AD_USED", .desc = "AD ring in use", .code = 0x3e, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_ring_ad_used), .umasks = hswep_unc_h_ring_ad_used, }, { .name = "UNC_H_RING_AK_USED", .desc = "AK ring in use", .code = 0x3f, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_ring_ad_used), /* shared */ .umasks = hswep_unc_h_ring_ad_used, }, { .name = "UNC_H_RING_BL_USED", .desc = "BL ring in use", .code = 0x40, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_ring_ad_used), /* shared */ .umasks = hswep_unc_h_ring_ad_used, }, { .name = "UNC_H_DIRECTORY_LAT_OPT", .desc = "Directory latency optimization data return path taken", .code = 0x41, .cntmsk = 0xf, .modmsk = HSWEP_UNC_HA_ATTRS, }, { .name = "UNC_H_SNOOP_RESP_RECV_LOCAL", .desc = "Snoop responses received local", .code = 0x60, .cntmsk = 0xf, 
.ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_snp_resp_recv_local), .umasks = hswep_unc_h_snp_resp_recv_local, }, { .name = "UNC_H_SNP_RESP_RECV_LOCAL", .desc = "Snoop responses received local", .code = 0x60, .cntmsk = 0xf, .ngrp = 1, .equiv = "UNC_H_SNOOP_RESP_RECV_LOCAL", .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_snp_resp_recv_local), .umasks = hswep_unc_h_snp_resp_recv_local, }, { .name = "UNC_H_TXR_BL_OCCUPANCY", .desc = "BL Egress occupancy", .code = 0x34, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_txr_bl_occupancy), .umasks = hswep_unc_h_txr_bl_occupancy, }, { .name = "UNC_H_SNOOP_RESP", .desc = "Snoop responses received", .code = 0x21, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_snoop_resp), .umasks = hswep_unc_h_snoop_resp }, { .name = "UNC_H_HITME_HIT", .desc = "Hits in the HitMe cache", .code = 0x71, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_hitme_hit), .umasks = hswep_unc_h_hitme_hit }, { .name = "UNC_H_HITME_HIT_PV_BITS_SET", .desc = "Number of PV bits set on HitMe cache hits", .code = 0x72, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_hitme_hit_pv_bits_set), .umasks = hswep_unc_h_hitme_hit_pv_bits_set }, { .name = "UNC_H_HITME_LOOKUP", .desc = "Number of accesses to HitMe cache", .code = 0x70, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_hitme_hit), /* shared with hswep_unc_h_hitme_hit */ .umasks = hswep_unc_h_hitme_hit }, { .name = "UNC_H_SBO0_CREDIT_ACQUIRED", .desc = "SBO0 credits acquired", .code = 0x68, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_sbo0_credits_acquired), .umasks = hswep_unc_h_sbo0_credits_acquired, }, { .name = "UNC_H_SBO0_CREDIT_OCCUPANCY", .desc = "SBO0 
credits occupancy", .code = 0x6a, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_sbo0_credits_acquired), .umasks = hswep_unc_h_sbo0_credits_acquired, /* shared with hswep_unc_h_sbo0_credits_acquired */ }, { .name = "UNC_H_SBO1_CREDIT_ACQUIRED", .desc = "SBO1 credits acquired", .code = 0x69, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_sbo0_credits_acquired), .umasks = hswep_unc_h_sbo0_credits_acquired,/* shared with hswep_unc_h_sbo0_credits_acquired */ }, { .name = "UNC_H_SBO1_CREDIT_OCCUPANCY", .desc = "SBO1 credits occupancy", .code = 0x6b, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_sbo0_credits_acquired), .umasks = hswep_unc_h_sbo0_credits_acquired, }, { .name = "UNC_H_SNOOPS_RSP_AFTER_DATA", .desc = "Number of reads when the snoop was on the critical path to the data return", .code = 0xa, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_snoops_rsp_after_data), .umasks = hswep_unc_h_snoops_rsp_after_data, }, { .name = "UNC_H_SNOOPS_CYCLES_NE", .desc = "Number of cycles when one or more snoops are outstanding", .code = 0x8, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_snoops_cycles_ne), .umasks = hswep_unc_h_snoops_cycles_ne, }, { .name = "UNC_H_SNOOPS_OCCUPANCY", .desc = "Tracker snoops outstanding accumulator", .code = 0x9, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_snoops_rsp_after_data), .umasks = hswep_unc_h_snoops_rsp_after_data, /* shared */ }, { .name = "UNC_H_STALL_NO_SBO_CREDIT", .desc = "Stalls on no SBO credits", .code = 0x6c, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_stall_no_sbo_credit), .umasks = hswep_unc_h_stall_no_sbo_credit, }, { .name = "UNC_H_TRACKER_CYCLES_NE", .desc = "Tracker
cycles not empty", .code = 0x3, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_snoops_cycles_ne), .umasks = hswep_unc_h_snoops_cycles_ne, /* shared */ }, { .name = "UNC_H_TRACKER_OCCUPANCY", .desc = "Tracker occupancy accumulator", .code = 0x4, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_tracker_occupancy), .umasks = hswep_unc_h_tracker_occupancy, }, { .name = "UNC_H_TRACKER_PENDING_OCCUPANCY", .desc = "Data pending occupancy accumulator", .code = 0x5, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_snoops_rsp_after_data), .umasks = hswep_unc_h_snoops_rsp_after_data, /* shared */ }, { .name = "UNC_H_TXR_STARVED", .desc = "Injection starvation", .code = 0x6d, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_h_txr_starved), .umasks = hswep_unc_h_txr_starved, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_hswep_unc_imc_events.h000066400000000000000000000472111502707512200256160ustar00rootroot00000000000000/* * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: hswep_unc_imc (Intel Haswell-EP IMC uncore PMU) */ static const intel_x86_umask_t hswep_unc_m_cas_count[]={ { .uname = "ALL", .udesc = "Counts total number of DRAM CAS commands issued on this channel", .ucode = 0xf00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "RD", .udesc = "Counts all DRAM reads on this channel, incl. underfills", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_REG", .udesc = "Counts number of DRAM read CAS commands issued on this channel, incl. regular read CAS and those with implicit precharge", .ucode = 0x100, }, { .uname = "RD_UNDERFILL", .udesc = "Counts number of underfill reads issued by the memory controller", .ucode = 0x200, }, { .uname = "WR", .udesc = "Counts number of DRAM write CAS commands on this channel", .ucode = 0xc00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_RMM", .udesc = "Counts number of opportunistic DRAM write CAS commands issued on this channel", .ucode = 0x800, }, { .uname = "WR_WMM", .udesc = "Counts number of DRAM write CAS commands issued on this channel while in Write-Major mode", .ucode = 0x400, }, { .uname = "RD_RMM", .udesc = "Counts number of opportunistic DRAM read CAS commands issued on this channel", .ucode = 0x1000, }, { .uname = "RD_WMM", .udesc = "Counts number of DRAM read CAS commands issued on this channel while in Write-Major mode", .ucode = 0x2000, }, }; static const intel_x86_umask_t hswep_unc_m_dram_refresh[]={ { .uname = "HIGH", .udesc = "High", .ucode = 0x400, }, { .uname = "PANIC", .udesc = "Panic", .ucode = 0x200, }, }; static const intel_x86_umask_t hswep_unc_m_major_modes[]={ { .uname =
"ISOCH", .udesc = "Counts cycles in ISOCH Major mode", .ucode = 0x800, }, { .uname = "PARTIAL", .udesc = "Counts cycles in Partial Major mode", .ucode = 0x400, }, { .uname = "READ", .udesc = "Counts cycles in Read Major mode", .ucode = 0x100, }, { .uname = "WRITE", .udesc = "Counts cycles in Write Major mode", .ucode = 0x200, }, }; static const intel_x86_umask_t hswep_unc_m_power_cke_cycles[]={ { .uname = "RANK0", .udesc = "Count cycles for rank 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK1", .udesc = "Count cycles for rank 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK2", .udesc = "Count cycles for rank 2", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK3", .udesc = "Count cycles for rank 3", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK4", .udesc = "Count cycles for rank 4", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK5", .udesc = "Count cycles for rank 5", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK6", .udesc = "Count cycles for rank 6", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK7", .udesc = "Count cycles for rank 7", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_m_preemption[]={ { .uname = "RD_PREEMPT_RD", .udesc = "Counts read over read preemptions", .ucode = 0x100, }, { .uname = "RD_PREEMPT_WR", .udesc = "Counts read over write preemptions", .ucode = 0x200, }, }; static const intel_x86_umask_t hswep_unc_m_pre_count[]={ { .uname = "PAGE_CLOSE", .udesc = "Counts number of DRAM precharge commands sent on this channel as a result of the page close counter expiring", .ucode = 0x200, }, { .uname = "PAGE_MISS", .udesc = "Counts number of DRAM precharge commands sent on this channel as a result of page misses", .ucode = 0x100, }, { .uname = "RD", .udesc = "Precharge due to read", .ucode = 0x400, }, { .uname = "WR", .udesc = "Precharge due to write", .ucode = 0x800, }, { .uname = 
"BYP", .udesc = "Precharge due to bypass", .ucode = 0x1000, }, }; static const intel_x86_umask_t hswep_unc_m_act_count[]={ { .uname = "RD", .udesc = "Activate due to read", .ucode = 0x100, }, { .uname = "WR", .udesc = "Activate due to write", .ucode = 0x200, }, { .uname = "BYP", .udesc = "Activate due to bypass", .ucode = 0x800, }, }; static const intel_x86_umask_t hswep_unc_m_byp_cmds[]={ { .uname = "ACT", .udesc = "ACT command issued by 2 cycle bypass", .ucode = 0x100, }, { .uname = "CAS", .udesc = "CAS command issued by 2 cycle bypass", .ucode = 0x200, }, { .uname = "PRE", .udesc = "PRE command issued by 2 cycle bypass", .ucode = 0x400, }, }; static const intel_x86_umask_t hswep_unc_m_rd_cas_prio[]={ { .uname = "LOW", .udesc = "Read CAS issued with low priority", .ucode = 0x100, }, { .uname = "MED", .udesc = "Read CAS issued with medium priority", .ucode = 0x200, }, { .uname = "HIGH", .udesc = "Read CAS issued with high priority", .ucode = 0x400, }, { .uname = "PANIC", .udesc = "Read CAS issued with panic non isoch priority (starved)", .ucode = 0x800, }, }; static const intel_x86_umask_t hswep_unc_m_rd_cas_rank0[]={ { .uname = "BANK0", .udesc = "Bank 0", .ucode = 0x0000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK1", .udesc = "Bank 1", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK2", .udesc = "Bank 2", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK3", .udesc = "Bank 3", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK4", .udesc = "Bank 4", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK5", .udesc = "Bank 5", .ucode = 0x500, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK6", .udesc = "Bank 6", .ucode = 0x600, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK7", .udesc = "Bank 7", .ucode = 0x700, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK8", .udesc = "Bank 8", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK9", .udesc = "Bank 9", .ucode = 0x900, .uflags = INTEL_X86_NCOMBO, }, 
{ .uname = "BANK10", .udesc = "Bank 10", .ucode = 0xa00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK11", .udesc = "Bank 11", .ucode = 0xb00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK12", .udesc = "Bank 12", .ucode = 0xc00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK13", .udesc = "Bank 13", .ucode = 0xd00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK14", .udesc = "Bank 14", .ucode = 0xe00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANK15", .udesc = "Bank 15", .ucode = 0xf00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALLBANKS", .udesc = "All banks", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "BANKG0", .udesc = "Bank Group 0 (bank 0-3)", .ucode = 0x1100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANKG1", .udesc = "Bank Group 1 (bank 4-7)", .ucode = 0x1200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANKG2", .udesc = "Bank Group 2 (bank 8-11)", .ucode = 0x1300, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BANKG3", .udesc = "Bank Group 3 (bank 12-15)", .ucode = 0x1400, .uflags = INTEL_X86_NCOMBO, } }; static const intel_x86_umask_t hswep_unc_m_vmse_wr_push[]={ { .uname = "WMM", .udesc = "VMSE write push issued in WMM", .ucode = 0x100, }, { .uname = "RMM", .udesc = "VMSE write push issued in RMM", .ucode = 0x200, } }; static const intel_x86_umask_t hswep_unc_m_wmm_to_rmm[]={ { .uname = "LOW_THRES", .udesc = "Transition from WMM to RMM because of starve counter", .ucode = 0x100, }, { .uname = "STARVE", .udesc = "Starve", .ucode = 0x200, }, { .uname = "VMSE_RETRY", .udesc = "VMSE retry", .ucode = 0x400, } }; static const intel_x86_entry_t intel_hswep_unc_m_pe[]={ { .name = "UNC_M_CLOCKTICKS", .desc = "IMC Uncore clockticks (fixed counter)", .modmsk = 0x0, .cntmsk = 0x100000000ull, .code = 0xff, /* perf pseudo encoding for fixed counter */ .flags = INTEL_X86_FIXED, }, { .name = "UNC_M_DCLOCKTICKS", .desc = "IMC Uncore clockticks (generic counters)", .modmsk = HSWEP_UNC_IMC_ATTRS, .cntmsk = 0xf, .code = 0x00, /*encoding for
generic counters */ }, { .name = "UNC_M_ACT_COUNT", .desc = "DRAM Activate Count", .code = 0x1, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_act_count), .umasks = hswep_unc_m_act_count }, { .name = "UNC_M_CAS_COUNT", .desc = "DRAM RD_CAS and WR_CAS Commands.", .code = 0x4, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_cas_count), .umasks = hswep_unc_m_cas_count }, { .name = "UNC_M_DRAM_PRE_ALL", .desc = "DRAM Precharge All Commands", .code = 0x6, .cntmsk = 0xf, .modmsk = HSWEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_DRAM_REFRESH", .desc = "Number of DRAM Refreshes Issued", .code = 0x5, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_dram_refresh), .umasks = hswep_unc_m_dram_refresh }, { .name = "UNC_M_ECC_CORRECTABLE_ERRORS", .desc = "ECC Correctable Errors", .code = 0x9, .cntmsk = 0xf, .modmsk = HSWEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_MAJOR_MODES", .desc = "Cycles in a Major Mode", .code = 0x7, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_major_modes), .umasks = hswep_unc_m_major_modes }, { .name = "UNC_M_POWER_CHANNEL_DLLOFF", .desc = "Channel DLLOFF Cycles", .code = 0x84, .cntmsk = 0xf, .modmsk = HSWEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_POWER_CHANNEL_PPD", .desc = "Channel PPD Cycles", .code = 0x85, .cntmsk = 0xf, .modmsk = HSWEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_POWER_CKE_CYCLES", .desc = "CKE_ON_CYCLES by Rank", .code = 0x83, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_power_cke_cycles), .umasks = hswep_unc_m_power_cke_cycles }, { .name = "UNC_M_POWER_CRITICAL_THROTTLE_CYCLES", .desc = "Critical Throttle Cycles", .code = 0x86, .cntmsk = 0xf, .modmsk = HSWEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_POWER_SELF_REFRESH", .desc = "Clock-Enabled Self-Refresh", .code = 0x43, .cntmsk = 0xf, .modmsk = 
HSWEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_POWER_THROTTLE_CYCLES", .desc = "Throttle Cycles", .code = 0x41, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_power_cke_cycles), .umasks = hswep_unc_m_power_cke_cycles /* identical to snbep_unc_m_power_cke_cycles */ }, { .name = "UNC_M_POWER_PCU_THROTTLING", .desc = "PCU throttling", .code = 0x42, .cntmsk = 0xf, .modmsk = HSWEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_PREEMPTION", .desc = "Read Preemption Count", .code = 0x8, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_preemption), .umasks = hswep_unc_m_preemption }, { .name = "UNC_M_PRE_COUNT", .desc = "DRAM Precharge commands.", .code = 0x2, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_pre_count), .umasks = hswep_unc_m_pre_count }, { .name = "UNC_M_RPQ_CYCLES_NE", .desc = "Read Pending Queue Not Empty", .code = 0x11, .cntmsk = 0xf, .modmsk = HSWEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_RPQ_INSERTS", .desc = "Read Pending Queue Allocations", .code = 0x10, .cntmsk = 0xf, .modmsk = HSWEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WPQ_CYCLES_FULL", .desc = "Write Pending Queue Full Cycles", .code = 0x22, .cntmsk = 0xf, .modmsk = HSWEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WPQ_CYCLES_NE", .desc = "Write Pending Queue Not Empty", .code = 0x21, .cntmsk = 0xf, .modmsk = HSWEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WPQ_READ_HIT", .desc = "Write Pending Queue CAM Match", .code = 0x23, .cntmsk = 0xf, .modmsk = HSWEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WPQ_WRITE_HIT", .desc = "Write Pending Queue CAM Match", .code = 0x24, .cntmsk = 0xf, .modmsk = HSWEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_BYP_CMDS", .desc = "Bypass command event", .code = 0xa1, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_byp_cmds), .umasks = hswep_unc_m_byp_cmds }, { .name = "UNC_M_RD_CAS_PRIO", .desc = "Read CAS priority", .code = 
0xa0, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_prio), .umasks = hswep_unc_m_rd_cas_prio }, { .name = "UNC_M_RD_CAS_RANK0", .desc = "Read CAS access to Rank 0", .code = 0xb0, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_rank0), .umasks = hswep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_RD_CAS_RANK1", .desc = "Read CAS access to Rank 1", .code = 0xb1, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_rank0), /* shared */ .umasks = hswep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_RD_CAS_RANK2", .desc = "Read CAS access to Rank 2", .code = 0xb2, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_rank0), /* shared */ .umasks = hswep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_RD_CAS_RANK3", .desc = "Read CAS access to Rank 3", .code = 0xb3, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_rank0), /* shared */ .umasks = hswep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_RD_CAS_RANK4", .desc = "Read CAS access to Rank 4", .code = 0xb4, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_rank0), /* shared */ .umasks = hswep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_RD_CAS_RANK5", .desc = "Read CAS access to Rank 5", .code = 0xb5, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_rank0), /* shared */ .umasks = hswep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_RD_CAS_RANK6", .desc = "Read CAS access to Rank 6", .code = 0xb6, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_rank0), /* shared */ .umasks = hswep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_RD_CAS_RANK7", .desc = "Read CAS access to Rank 7", .code = 0xb7, .cntmsk = 0xf, .ngrp = 1, .modmsk = 
HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_rank0), /* shared */ .umasks = hswep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_VMSE_MXB_WR_OCCUPANCY", .desc = "VMSE MXB write buffer occupancy", .code = 0x91, .cntmsk = 0xf, .modmsk = HSWEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_VMSE_WR_PUSH", .desc = "VMSE WR push issued", .code = 0x90, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_vmse_wr_push), .umasks = hswep_unc_m_vmse_wr_push }, { .name = "UNC_M_WMM_TO_RMM", .desc = "Transitions from WMM to RMM because of low threshold", .code = 0xc0, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_wmm_to_rmm), .umasks = hswep_unc_m_wmm_to_rmm }, { .name = "UNC_M_WRONG_MM", .desc = "Not getting the requested major mode", .code = 0xc1, .cntmsk = 0xf, .modmsk = HSWEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WR_CAS_RANK0", .desc = "Write CAS access to Rank 0", .code = 0xb8, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_rank0), /* shared */ .umasks = hswep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_WR_CAS_RANK1", .desc = "Write CAS access to Rank 1", .code = 0xb9, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_rank0), /* shared */ .umasks = hswep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_WR_CAS_RANK2", .desc = "Write CAS access to Rank 2", .code = 0xba, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_rank0), /* shared */ .umasks = hswep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_WR_CAS_RANK3", .desc = "Write CAS access to Rank 3", .code = 0xbb, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_rank0), /* shared */ .umasks = hswep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_WR_CAS_RANK4", .desc = "Write CAS access to Rank 4", .code = 0xbc, .cntmsk = 0xf, .ngrp = 1, 
.modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_rank0), /* shared */ .umasks = hswep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_WR_CAS_RANK5", .desc = "Write CAS access to Rank 5", .code = 0xbd, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_rank0), /* shared */ .umasks = hswep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_WR_CAS_RANK6", .desc = "Write CAS access to Rank 6", .code = 0xbe, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_rank0), /* shared */ .umasks = hswep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_WR_CAS_RANK7", .desc = "Write CAS access to Rank 7", .code = 0xbf, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_m_rd_cas_rank0), /* shared */ .umasks = hswep_unc_m_rd_cas_rank0 }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_hswep_unc_irp_events.h000066400000000000000000000223321502707512200256350ustar00rootroot00000000000000/* * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: hswep_unc_irp (Intel Haswell-EP IRP uncore) */ static const intel_x86_umask_t hswep_unc_i_cache_ack_pending_occupancy[]={ { .uname = "ANY", .udesc = "Any source", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "SOURCE", .udesc = "Track all requests from any source port", .ucode = 0x200, }, }; static const intel_x86_umask_t hswep_unc_i_coherent_ops[]={ { .uname = "PCIRDCUR", .udesc = "PCI read current", .ucode = 0x100, }, { .uname = "CRD", .udesc = "CRD", .ucode = 0x200, }, { .uname = "DRD", .udesc = "DRD", .ucode = 0x400, }, { .uname = "RFO", .udesc = "RFO", .ucode = 0x800, }, { .uname = "PCITOM", .udesc = "DRITOM", .ucode = 0x1000, }, { .uname = "PCIDCAHINT", .udesc = "PCIDCAHINT", .ucode = 0x2000, }, { .uname = "WBMTOI", .udesc = "WBMTOI", .ucode = 0x4000, }, { .uname = "CFLUSH", .udesc = "CFLUSH", .ucode = 0x8000, }, }; static const intel_x86_umask_t hswep_unc_i_misc0[]={ { .uname = "FAST_REQ", .udesc = "Fastpath requests", .ucode = 0x100, }, { .uname = "FAST_REJ", .udesc = "Fastpath rejects", .ucode = 0x200, }, { .uname = "2ND_RD_INSERT", .udesc = "Cache insert of read transaction as secondary", .ucode = 0x400, }, { .uname = "2ND_WR_INSERT", .udesc = "Cache insert of write transaction as secondary", .ucode = 0x800, }, { .uname = "2ND_ATOMIC_INSERT", .udesc = "Cache insert of atomic transaction as secondary", .ucode = 0x1000, }, { .uname = "FAST_XFER", .udesc = "Fastpath transfers from primary to secondary", .ucode = 0x2000, }, { .uname = "PF_ACK_HINT", .udesc = "Prefetch ack hints from primary to secondary", .ucode = 0x4000, }, { .uname = "PF_TIMEOUT", .udesc = 
"Prefetch timeout", .ucode = 0x8000, }, }; static const intel_x86_umask_t hswep_unc_i_misc1[]={ { .uname = "SLOW_I", .udesc = "Slow transfer of I-state cacheline", .ucode = 0x100, }, { .uname = "SLOW_S", .udesc = "Slow transfer of S-state cacheline", .ucode = 0x200, }, { .uname = "SLOW_E", .udesc = "Slow transfer of E-state cacheline", .ucode = 0x400, }, { .uname = "SLOW_M", .udesc = "Slow transfer of M-state cacheline", .ucode = 0x800, }, { .uname = "LOST_FWD", .udesc = "LOST forwards", .ucode = 0x1000, }, { .uname = "SEC_RCVD_INVLD", .udesc = "Received Invalid", .ucode = 0x2000, }, { .uname = "SEC_RCVD_VLD", .udesc = "Received Valid", .ucode = 0x4000, }, { .uname = "DATA_THROTTLE", .udesc = "Data throttled", .ucode = 0x8000, }, }; static const intel_x86_umask_t hswep_unc_i_snoop_resp[]={ { .uname = "MISS", .udesc = "Miss", .ucode = 0x100, }, { .uname = "HIT_I", .udesc = "Hit in Invalid state", .ucode = 0x200, }, { .uname = "HIT_ES", .udesc = "Hit in Exclusive or Shared state", .ucode = 0x400, }, { .uname = "HIT_M", .udesc = "Hit in Modified state", .ucode = 0x800, }, { .uname = "SNPCODE", .udesc = "Snoop Code", .ucode = 0x1000, }, { .uname = "SNPDATA", .udesc = "Snoop Data", .ucode = 0x2000, }, { .uname = "SNPINV", .udesc = "Snoop Invalid", .ucode = 0x4000, }, }; static const intel_x86_umask_t hswep_unc_i_transactions[]={ { .uname = "READS", .udesc = "Reads (not including prefetches)", .ucode = 0x100, }, { .uname = "WRITES", .udesc = "Writes", .ucode = 0x200, }, { .uname = "RD_PREF", .udesc = "Read prefetches", .ucode = 0x400, }, { .uname = "WR_PREF", .udesc = "Write prefetches", .ucode = 0x800, }, { .uname = "ATOMIC", .udesc = "Atomic transactions", .ucode = 0x1000, }, { .uname = "OTHER", .udesc = "Other kinds of transactions", .ucode = 0x2000, }, { .uname = "ORDERINGQ", .udesc = "Track requests coming from port designated in IRP OrderingQ filter", .ucode = 0x4000, }, }; static const intel_x86_entry_t intel_hswep_unc_i_pe[]={ { .name = "UNC_I_CLOCKTICKS", .desc = 
"Number of uclks in domain", .code = 0x0, .cntmsk = 0x3, .modmsk = HSWEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_SNOOP_RESP", .desc = "Snoop responses", .code = 0x17, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_IRP_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_i_snoop_resp), .umasks = hswep_unc_i_snoop_resp, }, { .name = "UNC_I_MISC0", .desc = "Miscellaneous events", .code = 0x14, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_IRP_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_i_misc0), .umasks = hswep_unc_i_misc0, }, { .name = "UNC_I_COHERENT_OPS", .desc = "Coherent operations", .code = 0x13, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_IRP_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_i_coherent_ops), .umasks = hswep_unc_i_coherent_ops, }, { .name = "UNC_I_CACHE_TOTAL_OCCUPANCY", .desc = "Total write cache occupancy", .code = 0x12, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_IRP_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_i_cache_ack_pending_occupancy), .umasks = hswep_unc_i_cache_ack_pending_occupancy /* shared */ }, { .name = "UNC_I_RXR_AK_INSERTS", .desc = "Egress cycles full", .code = 0xa, .cntmsk = 0x3, .modmsk = HSWEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_DRS_CYCLES_FULL", .desc = "TBD", .code = 0x4, .cntmsk = 0x3, .modmsk = HSWEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_DRS_INSERTS", .desc = "BL Ingress occupancy DRS", .code = 0x1, .cntmsk = 0x3, .modmsk = HSWEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_DRS_OCCUPANCY", .desc = "TBD", .code = 0x7, .cntmsk = 0x3, .modmsk = HSWEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_NCB_CYCLES_FULL", .desc = "TBD", .code = 0x5, .cntmsk = 0x3, .modmsk = HSWEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_NCB_INSERTS", .desc = "BL Ingress occupancy NCB", .code = 0x2, .cntmsk = 0x3, .modmsk = HSWEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_NCB_OCCUPANCY", .desc = "TBD", .code = 0x8, .cntmsk = 0x3, .modmsk = HSWEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_NCS_CYCLES_FULL", .desc = "TBD", .code = 0x6, .cntmsk = 0x3, .modmsk 
= HSWEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_NCS_INSERTS", .desc = "BL Ingress Occupancy NCS", .code = 0x3, .cntmsk = 0x3, .modmsk = HSWEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_NCS_OCCUPANCY", .desc = "TBD", .code = 0x9, .cntmsk = 0x3, .modmsk = HSWEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_TRANSACTIONS", .desc = "Inbound transactions", .code = 0x16, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_IRP_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_i_transactions), .umasks = hswep_unc_i_transactions, }, { .name = "UNC_I_MISC1", .desc = "Misc events", .code = 0x15, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_IRP_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_i_misc1), .umasks = hswep_unc_i_misc1, }, { .name = "UNC_I_TXR_AD_STALL_CREDIT_CYCLES", .desc = "No AD Egress credit stalls", .code = 0x18, .cntmsk = 0x3, .modmsk = HSWEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_TXR_BL_STALL_CREDIT_CYCLES", .desc = "No BL Egress credit stalls", .code = 0x19, .cntmsk = 0x3, .modmsk = HSWEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_TXR_DATA_INSERTS_NCB", .desc = "Outbound read requests", .code = 0xe, .cntmsk = 0x3, .modmsk = HSWEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_TXR_DATA_INSERTS_NCS", .desc = "Outbound read requests", .code = 0xf, .cntmsk = 0x3, .modmsk = HSWEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_TXR_REQUEST_OCCUPANCY", .desc = "Outbound request queue occupancy", .code = 0xd, .cntmsk = 0x3, .modmsk = HSWEP_UNC_IRP_ATTRS, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_hswep_unc_pcu_events.h000066400000000000000000000267411502707512200256420ustar00rootroot00000000000000/* * Copyright (c) 2014 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: hswep_unc_pcu (Intel Haswell-EP PCU uncore) */ static const intel_x86_umask_t hswep_unc_p_power_state_occupancy[]={ { .uname = "CORES_C0", .udesc = "Counts number of cores in C0", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORES_C3", .udesc = "Counts number of cores in C3", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORES_C6", .udesc = "Counts number of cores in C6", .ucode = 0xc000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_hswep_unc_p_pe[]={ { .name = "UNC_P_CLOCKTICKS", .desc = "PCU Uncore clockticks", .modmsk = HSWEP_UNC_PCU_ATTRS, .cntmsk = 0xf, .code = 0x00, }, { .name = "UNC_P_CORE0_TRANSITION_CYCLES", .desc = "Core 0 C State Transition Cycles", .code = 0x60, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE1_TRANSITION_CYCLES", .desc = "Core 1 C State Transition Cycles", .code = 0x61, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE2_TRANSITION_CYCLES", .desc = "Core 2 C State Transition Cycles", .code = 0x62, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE3_TRANSITION_CYCLES", .desc = "Core 3 C State Transition Cycles", .code = 0x63, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE4_TRANSITION_CYCLES", .desc = "Core 4 C State Transition Cycles", .code = 0x64, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE5_TRANSITION_CYCLES", .desc = "Core 5 C State Transition Cycles", .code = 0x65, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE6_TRANSITION_CYCLES", .desc = "Core 6 C State Transition Cycles", .code = 0x66, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE7_TRANSITION_CYCLES", .desc = "Core 7 C State Transition Cycles", .code = 0x67, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE8_TRANSITION_CYCLES", .desc = "Core 8 C State Transition Cycles", .code = 0x68, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { 
.name = "UNC_P_CORE9_TRANSITION_CYCLES", .desc = "Core 9 C State Transition Cycles", .code = 0x69, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE10_TRANSITION_CYCLES", .desc = "Core 10 C State Transition Cycles", .code = 0x6a, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE11_TRANSITION_CYCLES", .desc = "Core 11 C State Transition Cycles", .code = 0x6b, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE12_TRANSITION_CYCLES", .desc = "Core 12 C State Transition Cycles", .code = 0x6c, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE13_TRANSITION_CYCLES", .desc = "Core 13 C State Transition Cycles", .code = 0x6d, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE14_TRANSITION_CYCLES", .desc = "Core 14 C State Transition Cycles", .code = 0x6e, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE15_TRANSITION_CYCLES", .desc = "Core 15 C State Transition Cycles", .code = 0x6f, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE16_TRANSITION_CYCLES", .desc = "Core 16 C State Transition Cycles", .code = 0x70, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE17_TRANSITION_CYCLES", .desc = "Core 17 C State Transition Cycles", .code = 0x71, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE0", .desc = "Core 0 C State Demotions", .code = 0x30, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE1", .desc = "Core 1 C State Demotions", .code = 0x31, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE2", .desc = "Core 2 C State Demotions", .code = 0x32, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE3", .desc = "Core 3 C State Demotions", .code = 0x33, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE4", .desc = "Core 4 C State Demotions", .code = 0x34, .cntmsk = 0xf, 
.modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE5", .desc = "Core 5 C State Demotions", .code = 0x35, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE6", .desc = "Core 6 C State Demotions", .code = 0x36, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE7", .desc = "Core 7 C State Demotions", .code = 0x37, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE8", .desc = "Core 8 C State Demotions", .code = 0x38, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE9", .desc = "Core 9 C State Demotions", .code = 0x39, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE10", .desc = "Core 10 C State Demotions", .code = 0x3a, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE11", .desc = "Core 11 C State Demotions", .code = 0x3b, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE12", .desc = "Core 12 C State Demotions", .code = 0x3c, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE13", .desc = "Core 13 C State Demotions", .code = 0x3d, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE14", .desc = "Core 14 C State Demotions", .code = 0x3e, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE15", .desc = "Core 15 C State Demotions", .code = 0x3f, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE16", .desc = "Core 16 C State Demotions", .code = 0x40, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE17", .desc = "Core 17 C State Demotions", .code = 0x41, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_BAND0_CYCLES", .desc = "Frequency Residency", .code = 0xb, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = HSWEP_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { 
.name = "UNC_P_FREQ_BAND1_CYCLES", .desc = "Frequency Residency", .code = 0xc, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = HSWEP_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_BAND2_CYCLES", .desc = "Frequency Residency", .code = 0xd, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = HSWEP_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_BAND3_CYCLES", .desc = "Frequency Residency", .code = 0xe, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = HSWEP_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_MAX_LIMIT_THERMAL_CYCLES", .desc = "Thermal Strongest Upper Limit Cycles", .code = 0x4, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_MAX_OS_CYCLES", .desc = "OS Strongest Upper Limit Cycles", .code = 0x6, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_MAX_POWER_CYCLES", .desc = "Power Strongest Upper Limit Cycles", .code = 0x5, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_MIN_IO_P_CYCLES", .desc = "IO P Limit Strongest Lower Limit Cycles", .code = 0x73, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_TRANS_CYCLES", .desc = "Cycles spent changing Frequency", .code = 0x74, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_PKG_RESIDENCY_C0_CYCLES", .desc = "Package C State residency - C0", .code = 0x2a, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_PKG_RESIDENCY_C1E_CYCLES", .desc = "Package C State residency - C1E", .code = 0x4e, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_PKG_RESIDENCY_C2E_CYCLES", .desc = "Package C State residency - C2E", .code = 0x2b, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_PKG_RESIDENCY_C3_CYCLES", .desc = "Package C State residency - C3", .code = 0x2c, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_PKG_RESIDENCY_C6_CYCLES", .desc 
= "Package C State residency - C6", .code = 0x2d, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_PKG_RESIDENCY_C7_CYCLES", .desc = "Package C State residency - C7", .code = 0x2e, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_MEMORY_PHASE_SHEDDING_CYCLES", .desc = "Memory Phase Shedding Cycles", .code = 0x2f, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_POWER_STATE_OCCUPANCY", .desc = "Number of cores in C0", .code = 0x80, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_PCU_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_p_power_state_occupancy), .umasks = hswep_unc_p_power_state_occupancy }, { .name = "UNC_P_PROCHOT_EXTERNAL_CYCLES", .desc = "External Prochot", .code = 0xa, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_PROCHOT_INTERNAL_CYCLES", .desc = "Internal Prochot", .code = 0x9, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_TOTAL_TRANSITION_CYCLES", .desc = "Total Core C State Transition Cycles", .code = 0x72, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_VR_HOT_CYCLES", .desc = "VR Hot", .code = 0x42, .cntmsk = 0xf, .modmsk = HSWEP_UNC_PCU_ATTRS, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_hswep_unc_qpi_events.h000066400000000000000000000521541502707512200256410ustar00rootroot00000000000000/* * Copyright (c) 2014 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: hswep_unc_qpi (Intel Haswell-EP QPI uncore) */ static const intel_x86_umask_t hswep_unc_q_direct2core[]={ { .uname = "FAILURE_CREDITS", .udesc = "Number of spawn failures due to lack of Egress credits", .ucode = 0x200, }, { .uname = "FAILURE_CREDITS_RBT", .udesc = "Number of spawn failures due to lack of Egress credit and route-back table (RBT) bit was not set", .ucode = 0x800, }, { .uname = "FAILURE_RBT_HIT", .udesc = "Number of spawn failures because route-back table (RBT) specified that the transaction should not trigger a direct2core transaction", .ucode = 0x400, }, { .uname = "SUCCESS_RBT_HIT", .udesc = "Number of spawn successes", .ucode = 0x100, }, { .uname = "FAILURE_MISS", .udesc = "Number of spawn failures due to RBT tag not matching although the valid bit was set and there were enough Egress credits", .ucode = 0x1000, }, { .uname = "FAILURE_CREDITS_MISS", .udesc = "Number of spawn failures due to RBT tag not matching and there were not enough Egress credits. The valid bit was set", .ucode = 0x2000, }, { .uname = "FAILURE_RBT_MISS", .udesc = "Number of spawn failures due to RBT tag not matching, the valid bit was not set but there were enough Egress credits", .ucode = 0x4000, }, { .uname = "FAILURE_CREDITS_RBT_MISS", .udesc = "Number of spawn failures due to RBT tag not matching, the valid bit was not set and there were not enough Egress credits", .ucode = 0x8000, }, }; static const intel_x86_umask_t hswep_unc_q_rxl_credits_consumed_vn0[]={ { .uname = "DRS", .udesc = "Number of times VN0 consumed for DRS message class", .ucode = 0x100, }, { .uname = "HOM", .udesc = "Number of times VN0 consumed for HOM message class", .ucode = 0x800, }, { .uname = "NCB", .udesc = "Number of times VN0 consumed for NCB message class", .ucode = 0x200, }, { .uname = "NCS", .udesc = "Number of times VN0 consumed for NCS message class", .ucode = 0x400, }, { .uname = "NDR", .udesc = "Number of times VN0 consumed for NDR message class", .ucode = 0x2000, }, { .uname = 
"SNP", .udesc = "Number of times VN0 consumed for SNP message class", .ucode = 0x1000, }, }; static const intel_x86_umask_t hswep_unc_q_rxl_credits_consumed_vn1[]={ { .uname = "DRS", .udesc = "Number of times VN1 consumed for DRS message class", .ucode = 0x100, }, { .uname = "HOM", .udesc = "Number of times VN1 consumed for HOM message class", .ucode = 0x800, }, { .uname = "NCB", .udesc = "Number of times VN1 consumed for NCB message class", .ucode = 0x200, }, { .uname = "NCS", .udesc = "Number of times VN1 consumed for NCS message class", .ucode = 0x400, }, { .uname = "NDR", .udesc = "Number of times VN1 consumed for NDR message class", .ucode = 0x2000, }, { .uname = "SNP", .udesc = "Number of times VN1 consumed for SNP message class", .ucode = 0x1000, }, }; static const intel_x86_umask_t hswep_unc_q_txl_flits_g0[]={ { .uname = "DATA", .udesc = "Number of data flits over QPI", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NON_DATA", .udesc = "Number of non-NULL non-data flits over QPI", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_q_rxl_flits_g1[]={ { .uname = "DRS", .udesc = "Number of flits over QPI on the Data Response (DRS) channel", .ucode = 0x1800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_DATA", .udesc = "Number of data flits over QPI on the Data Response (DRS) channel", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_NONDATA", .udesc = "Number of protocol flits over QPI on the Data Response (DRS) channel", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM", .udesc = "Number of flits over QPI on the home channel", .ucode = 0x600, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM_NONREQ", .udesc = "Number of non-request flits over QPI on the home channel", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM_REQ", .udesc = "Number of data requests over QPI on the home channel", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP", .udesc = "Number 
of snoop request flits over QPI", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_q_rxl_flits_g2[]={ { .uname = "NCB", .udesc = "Number of non-coherent bypass flits", .ucode = 0xc00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB_DATA", .udesc = "Number of non-coherent data flits", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB_NONDATA", .udesc = "Number of bypass non-data flits", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS", .udesc = "Number of non-coherent standard (NCS) flits", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NDR_AD", .udesc = "Number of flits received over Non-data response (NDR) channel", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NDR_AK", .udesc = "Number of flits received on the Non-data response (NDR) channel", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_q_txr_ad_hom_credit_acquired[]={ { .uname = "VN0", .udesc = "for VN0", .ucode = 0x100, }, { .uname = "VN1", .udesc = "for VN1", .ucode = 0x200, }, }; static const intel_x86_umask_t hswep_unc_q_txr_bl_drs_credit_acquired[]={ { .uname = "VN0", .udesc = "for VN0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN1", .udesc = "for VN1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN_SHR", .udesc = "for shared VN", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_hswep_unc_q_pe[]={ { .name = "UNC_Q_CLOCKTICKS", .desc = "Number of qfclks", .code = 0x14, .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_CTO_COUNT", .desc = "Count of CTO Events", .code = 0x38 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_DIRECT2CORE", .desc = "Direct 2 Core Spawning", .code = 0x13, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_direct2core), .umasks = hswep_unc_q_direct2core }, { .name 
= "UNC_Q_L1_POWER_CYCLES", .desc = "Cycles in L1", .code = 0x12, .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL0P_POWER_CYCLES", .desc = "Cycles in L0p", .code = 0x10, .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL0_POWER_CYCLES", .desc = "Cycles in L0", .code = 0xf, .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_BYPASSED", .desc = "Rx Flit Buffer Bypassed", .code = 0x9, .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_CREDITS_CONSUMED_VN0", .desc = "VN0 Credit Consumed", .code = 0x1e | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_rxl_credits_consumed_vn0), .umasks = hswep_unc_q_rxl_credits_consumed_vn0 }, { .name = "UNC_Q_RXL_CREDITS_CONSUMED_VN1", .desc = "VN1 Credit Consumed", .code = 0x39 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_rxl_credits_consumed_vn1), .umasks = hswep_unc_q_rxl_credits_consumed_vn1 }, { .name = "UNC_Q_RXL_CREDITS_CONSUMED_VNA", .desc = "VNA Credit Consumed", .code = 0x1d | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_CYCLES_NE", .desc = "RxQ Cycles Not Empty", .code = 0xa, .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_FLITS_G1", .desc = "Flits Received - Group 1", .code = 0x2 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_rxl_flits_g1), .umasks = hswep_unc_q_rxl_flits_g1 }, { .name = "UNC_Q_RXL_FLITS_G2", .desc = "Flits Received - Group 2", .code = 0x3 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_rxl_flits_g2), .umasks = hswep_unc_q_rxl_flits_g2 }, { .name = "UNC_Q_RXL_INSERTS", .desc = "Rx Flit Buffer Allocations", .code = 0x8, .cntmsk = 0xf, .modmsk 
= HSWEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_INSERTS_DRS", .desc = "Rx Flit Buffer Allocations - DRS", .code = 0x9 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_INSERTS_HOM", .desc = "Rx Flit Buffer Allocations - HOM", .code = 0xc | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_INSERTS_NCB", .desc = "Rx Flit Buffer Allocations - NCB", .code = 0xa | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_INSERTS_NCS", .desc = "Rx Flit Buffer Allocations - NCS", .code = 0xb | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_INSERTS_NDR", .desc = "Rx Flit Buffer Allocations - NDR", .code = 0xe | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_INSERTS_SNP", .desc = "Rx Flit Buffer Allocations - SNP", .code = 0xd | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_OCCUPANCY", .desc = "RxQ Occupancy - All Packets", .code = 0xb, .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = 
"UNC_Q_RXL_OCCUPANCY_DRS", .desc = "RxQ Occupancy - DRS", .code = 0x15 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_OCCUPANCY_HOM", .desc = "RxQ Occupancy - HOM", .code = 0x18 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_OCCUPANCY_NCB", .desc = "RxQ Occupancy - NCB", .code = 0x16 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_OCCUPANCY_NCS", .desc = "RxQ Occupancy - NCS", .code = 0x17 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_OCCUPANCY_NDR", .desc = "RxQ Occupancy - NDR", .code = 0x1a | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_OCCUPANCY_SNP", .desc = "RxQ Occupancy - SNP", .code = 0x19 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXL0P_POWER_CYCLES", .desc = "Cycles in L0p", .code = 0xd, .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_TXL0_POWER_CYCLES", .desc = "Cycles in L0", .code = 0xc, .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = 
"UNC_Q_TXL_BYPASSED", .desc = "Tx Flit Buffer Bypassed", .code = 0x5, .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_TXL_CYCLES_NE", .desc = "Tx Flit Buffer Cycles not Empty", .code = 0x6, .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_TXL_FLITS_G0", .desc = "Flits Transferred - Group 0", .code = 0x0, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txl_flits_g0), .umasks = hswep_unc_q_txl_flits_g0 }, { .name = "UNC_Q_TXL_FLITS_G1", .desc = "Flits Transferred - Group 1", .code = 0x0 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_rxl_flits_g1), .umasks = hswep_unc_q_rxl_flits_g1 /* shared with rxl_flits_g1 */ }, { .name = "UNC_Q_TXL_FLITS_G2", .desc = "Flits Transferred - Group 2", .code = 0x1 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_rxl_flits_g2), .umasks = hswep_unc_q_rxl_flits_g2 /* shared with rxl_flits_g2 */ }, { .name = "UNC_Q_TXL_INSERTS", .desc = "Tx Flit Buffer Allocations", .code = 0x4, .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_TXL_OCCUPANCY", .desc = "Tx Flit Buffer Occupancy", .code = 0x7, .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_VNA_CREDIT_RETURNS", .desc = "VNA Credits Returned", .code = 0x1c | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_VNA_CREDIT_RETURN_OCCUPANCY", .desc = "VNA Credits Pending Return - Occupancy", .code = 0x1b | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = HSWEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_TXR_AD_HOM_CREDIT_ACQUIRED", .desc = "R3QPI Egress credit occupancy AD HOM", .code = 0x26 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = 
hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_AD_HOM_CREDIT_OCCUPANCY", .desc = "R3QPI Egress credit occupancy AD HOM", .code = 0x22 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), /* shared */ .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_AD_NDR_CREDIT_ACQUIRED", .desc = "R3QPI Egress credit occupancy AD NDR", .code = 0x28 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_AD_NDR_CREDIT_OCCUPANCY", .desc = "R3QPI Egress credit occupancy AD NDR", .code = 0x24 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), /* shared */ .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_AD_SNP_CREDIT_ACQUIRED", .desc = "R3QPI Egress credit occupancy AD SNP", .code = 0x27 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_AD_SNP_CREDIT_OCCUPANCY", .desc = "R3QPI Egress credit occupancy AD SNP", .code = 0x23 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), /* shared */ .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_AK_NDR_CREDIT_ACQUIRED", .desc = "R3QPI Egress credit occupancy AK NDR", .code = 0x29 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = 
"UNC_Q_TXR_AK_NDR_CREDIT_OCCUPANCY", .desc = "R3QPI Egress credit occupancy AD NDR", .code = 0x25 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), /* shared */ .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_BL_DRS_CREDIT_ACQUIRED", .desc = "R3QPI Egress credit occupancy BL DRS", .code = 0x2a | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_bl_drs_credit_acquired), .umasks = hswep_unc_q_txr_bl_drs_credit_acquired, }, { .name = "UNC_Q_TXR_BL_DRS_CREDIT_OCCUPANCY", .desc = "R3QPI Egress credit occupancy BL DRS", .code = 0x1f | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_bl_drs_credit_acquired), /* shared */ .umasks = hswep_unc_q_txr_bl_drs_credit_acquired, }, { .name = "UNC_Q_TXR_BL_NCB_CREDIT_ACQUIRED", .desc = "R3QPI Egress credit occupancy BL NCB", .code = 0x2b | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_BL_NCB_CREDIT_OCCUPANCY", .desc = "R3QPI Egress credit occupancy BL NCB", .code = 0x20 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), /* shared */ .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_BL_NCS_CREDIT_ACQUIRED", .desc = "R3QPI Egress credit occupancy BL NCS", .code = 0x2c | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_BL_NCS_CREDIT_OCCUPANCY", .desc = "R3QPI 
Egress credit occupancy BL NCS", .code = 0x21 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_q_txr_ad_hom_credit_acquired), /* shared */ .umasks = hswep_unc_q_txr_ad_hom_credit_acquired, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_hswep_unc_r2pcie_events.h000066400000000000000000000204651502707512200262340ustar00rootroot00000000000000/* * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: hswep_unc_r2pcie (Intel Haswell-EP R2PCIe uncore) */ static const intel_x86_umask_t hswep_unc_r2_ring_ad_used[]={ { .uname = "CCW_EVEN", .udesc = "Counter-clockwise and even ring polarity on virtual ring", .ucode = 0x400, }, { .uname = "CCW_ODD", .udesc = "Counter-clockwise and odd ring polarity on virtual ring", .ucode = 0x800, }, { .uname = "CW_EVEN", .udesc = "Clockwise and even ring polarity on virtual ring", .ucode = 0x100, }, { .uname = "CW_ODD", .udesc = "Clockwise and odd ring polarity on virtual ring", .ucode = 0x200, }, { .uname = "CW", .udesc = "Clockwise with any polarity on either virtual rings", .ucode = 0x0300, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CCW", .udesc = "Counter-clockwise with any polarity on either virtual rings", .ucode = 0x0c00, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_r2_rxr_ak_bounces[]={ { .uname = "UP", .udesc = "Up", .ucode = 0x100, }, { .uname = "DOWN", .udesc = "Down", .ucode = 0x200, }, }; static const intel_x86_umask_t hswep_unc_r2_rxr_occupancy[]={ { .uname = "DRS", .udesc = "DRS Ingress queue", .ucode = 0x800, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t hswep_unc_r2_ring_iv_used[]={ { .uname = "CW", .udesc = "Clockwise with any polarity on virtual ring", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CCW", .udesc = "Counter-clockwise with any polarity on virtual ring", .ucode = 0xc00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "any direction and any polarity on virtual ring", .ucode = 0xff00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t hswep_unc_r2_rxr_cycles_ne[]={ { .uname = "NCB", .udesc = "NCB Ingress queue", .ucode = 0x1000, }, { .uname = "NCS", .udesc = "NCS Ingress queue", .ucode = 0x2000, }, }; static const intel_x86_umask_t hswep_unc_r2_sbo0_credits_acquired[]={ { .uname = "AD", .udesc = "For ring AD", .ucode = 0x100, }, { .uname = "BL", .udesc = "For ring BL", .ucode = 0x200, }, }; static 
const intel_x86_umask_t hswep_unc_r2_iio_credit[]={ { .uname = "PRQ_QPI0", .udesc = "QPI0", .ucode = 0x100, }, { .uname = "PRQ_QPI1", .udesc = "QPI1", .ucode = 0x200, }, { .uname = "ISOCH_QPI0", .udesc = "Isochronous QPI0", .ucode = 0x400, }, { .uname = "ISOCH_QPI1", .udesc = "Isochronous QPI1", .ucode = 0x800, }, }; static const intel_x86_umask_t hswep_unc_r2_txr_nack_cw[]={ { .uname = "DN_AD", .udesc = "AD counter clockwise Egress queue", .ucode = 0x100, }, { .uname = "DN_BL", .udesc = "BL counter clockwise Egress queue", .ucode = 0x200, }, { .uname = "DN_AK", .udesc = "AK counter clockwise Egress queue", .ucode = 0x400, }, { .uname = "UP_AD", .udesc = "AD clockwise Egress queue", .ucode = 0x800, }, { .uname = "UP_BL", .udesc = "BL clockwise Egress queue", .ucode = 0x1000, }, { .uname = "UP_AK", .udesc = "AK clockwise Egress queue", .ucode = 0x2000, }, }; static const intel_x86_umask_t hswep_unc_r2_stall_no_sbo_credit[]={ { .uname = "SBO0_AD", .udesc = "For SBO0, AD ring", .ucode = 0x100, }, { .uname = "SBO1_AD", .udesc = "For SBO1, AD ring", .ucode = 0x200, }, { .uname = "SBO0_BL", .udesc = "For SBO0, BL ring", .ucode = 0x400, }, { .uname = "SBO1_BL", .udesc = "For SBO1, BL ring", .ucode = 0x800, }, }; static const intel_x86_entry_t intel_hswep_unc_r2_pe[]={ { .name = "UNC_R2_CLOCKTICKS", .desc = "Number of uclks in domain", .code = 0x1, .cntmsk = 0xf, .modmsk = HSWEP_UNC_R2PCIE_ATTRS, }, { .name = "UNC_R2_RING_AD_USED", .desc = "R2 AD Ring in Use", .code = 0x7, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r2_ring_ad_used), .umasks = hswep_unc_r2_ring_ad_used }, { .name = "UNC_R2_RING_AK_USED", .desc = "R2 AK Ring in Use", .code = 0x8, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r2_ring_ad_used), .umasks = hswep_unc_r2_ring_ad_used /* shared */ }, { .name = "UNC_R2_RING_BL_USED", .desc = "R2 BL Ring in Use", .code = 0x9, .cntmsk = 0xf, .ngrp = 1, .modmsk
= HSWEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r2_ring_ad_used), .umasks = hswep_unc_r2_ring_ad_used /* shared */ }, { .name = "UNC_R2_RING_IV_USED", .desc = "R2 IV Ring in Use", .code = 0xa, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r2_ring_iv_used), .umasks = hswep_unc_r2_ring_iv_used }, { .name = "UNC_R2_RXR_AK_BOUNCES", .desc = "AK Ingress Bounced", .code = 0x12, .cntmsk = 0xf, .modmsk = HSWEP_UNC_R2PCIE_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r2_rxr_ak_bounces), .umasks = hswep_unc_r2_rxr_ak_bounces }, { .name = "UNC_R2_RXR_OCCUPANCY", .desc = "Ingress occupancy accumulator", .code = 0x13, .cntmsk = 0x1, .modmsk = HSWEP_UNC_R2PCIE_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r2_rxr_occupancy), .umasks = hswep_unc_r2_rxr_occupancy }, { .name = "UNC_R2_RXR_CYCLES_NE", .desc = "Ingress Cycles Not Empty", .code = 0x10, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r2_rxr_cycles_ne), .umasks = hswep_unc_r2_rxr_cycles_ne }, { .name = "UNC_R2_RXR_INSERTS", .desc = "Ingress inserts", .code = 0x11, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r2_rxr_cycles_ne), .umasks = hswep_unc_r2_rxr_cycles_ne, /* shared */ }, { .name = "UNC_R2_TXR_NACK_CW", .desc = "Egress clockwise BACK", .code = 0x26, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r2_txr_nack_cw), .umasks = hswep_unc_r2_txr_nack_cw, }, { .name = "UNC_R2_SBO0_CREDITS_ACQUIRED", .desc = "SBO0 credits acquired", .code = 0x28, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r2_sbo0_credits_acquired), .umasks = hswep_unc_r2_sbo0_credits_acquired, }, { .name = "UNC_R2_STALL_NO_SBO_CREDIT", .desc = "Stall on No SBo Credits", .code = 0x2c, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R2PCIE_ATTRS, .numasks = 
LIBPFM_ARRAY_SIZE(hswep_unc_r2_stall_no_sbo_credit), .umasks = hswep_unc_r2_stall_no_sbo_credit }, { .name = "UNC_R2_IIO_CREDIT", .desc = "Egress counter-clockwise BACK", .code = 0x2d, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r2_iio_credit), .umasks = hswep_unc_r2_iio_credit, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_hswep_unc_r3qpi_events.h000066400000000000000000000356241502707512200261110ustar00rootroot00000000000000/* * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: hswep_unc_r3qpi (Intel Haswell-EP R3QPI uncore) */ static const intel_x86_umask_t hswep_unc_r3_ring_ad_used[]={ { .uname = "CCW_EVEN", .udesc = "Counter-Clockwise and even ring polarity", .ucode = 0x400, }, { .uname = "CCW_ODD", .udesc = "Counter-Clockwise and odd ring polarity", .ucode = 0x800, }, { .uname = "CW_EVEN", .udesc = "Clockwise and even ring polarity", .ucode = 0x100, }, { .uname = "CW_ODD", .udesc = "Clockwise and odd ring polarity", .ucode = 0x200, }, { .uname = "CW", .udesc = "Clockwise with any polarity on either virtual rings", .ucode = 0x300, }, { .uname = "CCW", .udesc = "Counter-clockwise with any polarity on either virtual rings", .ucode = 0xc00, }, }; static const intel_x86_umask_t hswep_unc_r3_ring_iv_used[]={ { .uname = "CW", .udesc = "Clockwise with any polarity on either virtual rings", .ucode = 0x300, }, { .uname = "ANY", .udesc = "Counter-clockwise with any polarity on either virtual rings", .ucode = 0xf00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t hswep_unc_r3_rxr_cycles_ne[]={ { .uname = "HOM", .udesc = "HOM Ingress queue", .ucode = 0x100, }, { .uname = "SNP", .udesc = "SNP Ingress queue", .ucode = 0x200, }, { .uname = "NDR", .udesc = "NDR Ingress queue", .ucode = 0x400, }, }; static const intel_x86_umask_t hswep_unc_r3_rxr_inserts[]={ { .uname = "DRS", .udesc = "DRS Ingress queue", .ucode = 0x800, }, { .uname = "HOM", .udesc = "HOM Ingress queue", .ucode = 0x100, }, { .uname = "NCB", .udesc = "NCB Ingress queue", .ucode = 0x1000, }, { .uname = "NCS", .udesc = "NCS Ingress queue", .ucode = 0x2000, }, { .uname = "NDR", .udesc = "NDR Ingress queue", .ucode = 0x400, }, { .uname = "SNP", .udesc = "SNP Ingress queue", .ucode = 0x200, }, }; static const intel_x86_umask_t hswep_unc_r3_vn0_credits_used[]={ { .uname = "HOM", .udesc = "Filter HOM message class", .ucode = 0x100, }, { .uname = "SNP", .udesc = "Filter SNP message class", .ucode = 0x200, }, { .uname = "NDR", .udesc = "Filter NDR 
message class", .ucode = 0x400, }, { .uname = "DRS", .udesc = "Filter DRS message class", .ucode = 0x800, }, { .uname = "NCB", .udesc = "Filter NCB message class", .ucode = 0x1000, }, { .uname = "NCS", .udesc = "Filter NCS message class", .ucode = 0x2000, }, }; static const intel_x86_umask_t hswep_unc_r3_c_lo_ad_credits_empty[]={ { .uname = "CBO0", .udesc = "CBox 0", .ucode = 0x100, }, { .uname = "CBO1", .udesc = "CBox 1", .ucode = 0x200, }, { .uname = "CBO2", .udesc = "CBox 2", .ucode = 0x400, }, { .uname = "CBO3", .udesc = "CBox 3", .ucode = 0x800, }, { .uname = "CBO4", .udesc = "CBox 4", .ucode = 0x1000, }, { .uname = "CBO5", .udesc = "CBox 5", .ucode = 0x2000, }, { .uname = "CBO6", .udesc = "CBox 6", .ucode = 0x4000, }, { .uname = "CBO7", .udesc = "CBox 7", .ucode = 0x8000, } }; static const intel_x86_umask_t hswep_unc_r3_c_hi_ad_credits_empty[]={ { .uname = "CBO8", .udesc = "CBox 8", .ucode = 0x100, }, { .uname = "CBO9", .udesc = "CBox 9", .ucode = 0x200, }, { .uname = "CBO10", .udesc = "CBox 10", .ucode = 0x400, }, { .uname = "CBO11", .udesc = "CBox 11", .ucode = 0x800, }, { .uname = "CBO12", .udesc = "CBox 12", .ucode = 0x1000, }, { .uname = "CBO13", .udesc = "CBox 13", .ucode = 0x2000, }, { .uname = "CBO14_16", .udesc = "CBox 14 and CBox 16", .ucode = 0x4000, }, { .uname = "CBO15_17", .udesc = "CBox 15 and CBox 17", .ucode = 0x8000, } }; static const intel_x86_umask_t hswep_unc_r3_ha_r2_bl_credits_empty[]={ { .uname = "HA0", .udesc = "HA0", .ucode = 0x100, }, { .uname = "HA1", .udesc = "HA1", .ucode = 0x200, }, { .uname = "R2_NCB", .udesc = "R2 NCB messages", .ucode = 0x400, }, { .uname = "R2_NCS", .udesc = "R2 NCS messages", .ucode = 0x800, } }; static const intel_x86_umask_t hswep_unc_r3_qpi0_ad_credits_empty[]={ { .uname = "VNA", .udesc = "VNA", .ucode = 0x100, }, { .uname = "VN0_HOM", .udesc = "VN0 HOM messages", .ucode = 0x200, }, { .uname = "VN0_SNP", .udesc = "VN0 SNP messages", .ucode = 0x400, }, { .uname = "VN0_NDR", .udesc = "VN0 NDR messages", 
.ucode = 0x800, }, { .uname = "VN1_HOM", .udesc = "VN1 HOM messages", .ucode = 0x1000, }, { .uname = "VN1_SNP", .udesc = "VN1 SNP messages", .ucode = 0x2000, }, { .uname = "VN1_NDR", .udesc = "VN1 NDR messages", .ucode = 0x4000, }, }; static const intel_x86_umask_t hswep_unc_r3_sbo0_credits_acquired[]={ { .uname = "AD", .udesc = "For AD ring", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "For BL ring", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_r3_txr_nack[]={ { .uname = "AD", .udesc = "AD clockwise Egress queue", .ucode = 0x100, }, { .uname = "AK", .udesc = "AD counter-clockwise Egress queue", .ucode = 0x200, }, { .uname = "BL", .udesc = "BL clockwise Egress queue", .ucode = 0x400, }, }; static const intel_x86_umask_t hswep_unc_r3_vna_credits_acquired[]={ { .uname = "AD", .udesc = "For AD ring", .ucode = 0x100, }, { .uname = "BL", .udesc = "For BL ring", .ucode = 0x400, }, }; static const intel_x86_umask_t hswep_unc_r3_stall_no_sbo_credit[]={ { .uname = "SBO0_AD", .udesc = "For SBO0, AD ring", .ucode = 0x100, }, { .uname = "SBO1_AD", .udesc = "For SBO1, AD ring", .ucode = 0x100, }, { .uname = "SBO0_BL", .udesc = "For SBO0, BL ring", .ucode = 0x100, }, { .uname = "SBO1_BL", .udesc = "For SBO1, BL ring", .ucode = 0x100, }, }; static const intel_x86_umask_t hswep_unc_r3_ring_sink_starved[]={ { .uname = "AK", .udesc = "For AJ ring", .ucode = 0x200, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_entry_t intel_hswep_unc_r3_pe[]={ { .name = "UNC_R3_CLOCKTICKS", .desc = "Number of uclks in domain", .code = 0x1, .cntmsk = 0x7, .modmsk = HSWEP_UNC_R3QPI_ATTRS, }, { .name = "UNC_R3_RING_AD_USED", .desc = "R3 AD Ring in Use", .code = 0x7, .cntmsk = 0x7, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_ring_ad_used), .umasks = hswep_unc_r3_ring_ad_used }, { .name = "UNC_R3_RING_AK_USED", .desc = "R3 AK Ring in Use", .code = 0x8, .cntmsk = 0x7, .ngrp = 1, 
.modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_ring_ad_used), .umasks = hswep_unc_r3_ring_ad_used /* shared */ }, { .name = "UNC_R3_RING_BL_USED", .desc = "R3 BL Ring in Use", .code = 0x9, .cntmsk = 0x7, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_ring_ad_used), .umasks = hswep_unc_r3_ring_ad_used /* shared */ }, { .name = "UNC_R3_RING_IV_USED", .desc = "R3 IV Ring in Use", .code = 0xa, .cntmsk = 0x7, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_ring_iv_used), .umasks = hswep_unc_r3_ring_iv_used }, { .name = "UNC_R3_RING_SINK_STARVED", .desc = "R3 Ring stop starved", .code = 0xe, .cntmsk = 0x7, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_ring_sink_starved), .umasks = hswep_unc_r3_ring_sink_starved }, { .name = "UNC_R3_RXR_CYCLES_NE", .desc = "Ingress Cycles Not Empty", .code = 0x10, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_rxr_cycles_ne), .umasks = hswep_unc_r3_rxr_cycles_ne }, { .name = "UNC_R3_RXR_CYCLES_NE_VN1", .desc = "VN1 Ingress Cycles Not Empty", .code = 0x14, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_rxr_inserts), .umasks = hswep_unc_r3_rxr_inserts }, { .name = "UNC_R3_RXR_INSERTS", .desc = "Ingress Allocations", .code = 0x11, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_rxr_inserts), .umasks = hswep_unc_r3_rxr_inserts }, { .name = "UNC_R3_RXR_INSERTS_VN1", .desc = "VN1 Ingress Allocations", .code = 0x15, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_rxr_inserts), .umasks = hswep_unc_r3_rxr_inserts }, { .name = "UNC_R3_RXR_OCCUPANCY_VN1", .desc = "VN1 Ingress Occupancy Accumulator", .code = 0x13, .cntmsk = 0x1, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = 
LIBPFM_ARRAY_SIZE(hswep_unc_r3_rxr_inserts), .umasks = hswep_unc_r3_rxr_inserts/* shared */ }, { .name = "UNC_R3_VN0_CREDITS_REJECT", .desc = "VN0 Credit Acquisition Failed", .code = 0x37, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_vn0_credits_used), .umasks = hswep_unc_r3_vn0_credits_used }, { .name = "UNC_R3_VN0_CREDITS_USED", .desc = "VN0 Credit Used", .code = 0x36, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_vn0_credits_used), .umasks = hswep_unc_r3_vn0_credits_used }, { .name = "UNC_R3_VNA_CREDITS_ACQUIRED", .desc = "VNA credit Acquisitions", .code = 0x33, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_vna_credits_acquired), .umasks = hswep_unc_r3_vna_credits_acquired }, { .name = "UNC_R3_VNA_CREDITS_REJECT", .desc = "VNA Credit Reject", .code = 0x34, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_vn0_credits_used), .umasks = hswep_unc_r3_vn0_credits_used /* shared */ }, { .name = "UNC_R3_STALL_NO_SBO_CREDIT", .desc = "Stall no SBO credit", .code = 0x2c, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_stall_no_sbo_credit), .umasks = hswep_unc_r3_stall_no_sbo_credit, }, { .name = "UNC_R3_C_LO_AD_CREDITS_EMPTY", .desc = "Cbox AD credits empty", .code = 0x22, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_c_lo_ad_credits_empty), .umasks = hswep_unc_r3_c_lo_ad_credits_empty }, { .name = "UNC_R3_C_HI_AD_CREDITS_EMPTY", .desc = "Cbox AD credits empty", .code = 0x1f, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_c_hi_ad_credits_empty), .umasks = hswep_unc_r3_c_hi_ad_credits_empty }, { .name = "UNC_R3_QPI0_AD_CREDITS_EMPTY", .desc = "QPI0 AD credits empty", .code = 0x20, .cntmsk = 0x3, 
.ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_qpi0_ad_credits_empty), .umasks = hswep_unc_r3_qpi0_ad_credits_empty }, { .name = "UNC_R3_QPI0_BL_CREDITS_EMPTY", .desc = "QPI0 BL credits empty", .code = 0x21, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_qpi0_ad_credits_empty), .umasks = hswep_unc_r3_qpi0_ad_credits_empty }, { .name = "UNC_R3_QPI1_BL_CREDITS_EMPTY", .desc = "QPI1 BL credits empty", .code = 0x2f, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_qpi0_ad_credits_empty), .umasks = hswep_unc_r3_qpi0_ad_credits_empty }, { .name = "UNC_R3_HA_R2_BL_CREDITS_EMPTY", .desc = "HA/R2 BL credits empty", .code = 0x2d, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_ha_r2_bl_credits_empty), .umasks = hswep_unc_r3_ha_r2_bl_credits_empty }, { .name = "UNC_R3_SBO0_CREDITS_ACQUIRED", .desc = "SBO0 credits acquired", .code = 0x28, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_sbo0_credits_acquired), .umasks = hswep_unc_r3_sbo0_credits_acquired, }, { .name = "UNC_R3_SBO1_CREDITS_ACQUIRED", .desc = "SBO1 credits acquired", .code = 0x29, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_sbo0_credits_acquired), .umasks = hswep_unc_r3_sbo0_credits_acquired, }, { .name = "UNC_R3_TXR_NACK", .desc = "Egress NACK", .code = 0x26, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_txr_nack), .umasks = hswep_unc_r3_txr_nack }, { .name = "UNC_R3_VN1_CREDITS_REJECT", .desc = "VN1 Credit Acquisition Failed", .code = 0x39, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_vn0_credits_used), /* shared */ .umasks = hswep_unc_r3_vn0_credits_used }, { .name = "UNC_R3_VN1_CREDITS_USED", .desc
= "VN0 Credit Used", .code = 0x38, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_r3_vn0_credits_used), /* shared */ .umasks = hswep_unc_r3_vn0_credits_used }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_hswep_unc_sbo_events.h000066400000000000000000000173221502707512200256310ustar00rootroot00000000000000/* * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: hswep_unc_sbo (Intel Haswell-EP S-Box uncore PMU) */ static const intel_x86_umask_t hswep_unc_s_ring_ad_used[]={ { .uname = "UP_EVEN", .udesc = "Up and Even ring polarity filter", .ucode = 0x100, }, { .uname = "UP_ODD", .udesc = "Up and odd ring polarity filter", .ucode = 0x200, }, { .uname = "DOWN_EVEN", .udesc = "Down and even ring polarity filter", .ucode = 0x400, }, { .uname = "DOWN_ODD", .udesc = "Down and odd ring polarity filter", .ucode = 0x800, }, { .uname = "UP", .udesc = "Up ring polarity filter", .ucode = 0x3300, }, { .uname = "DOWN", .udesc = "Down ring polarity filter", .ucode = 0xcc00, }, }; static const intel_x86_umask_t hswep_unc_s_ring_bounces[]={ { .uname = "AD_CACHE", .udesc = "AD_CACHE", .ucode = 0x100, }, { .uname = "AK_CORE", .udesc = "Acknowledgments to core", .ucode = 0x200, }, { .uname = "BL_CORE", .udesc = "Data responses to core", .ucode = 0x400, }, { .uname = "IV_CORE", .udesc = "Snoops of processor cache", .ucode = 0x800, }, }; static const intel_x86_umask_t hswep_unc_s_ring_iv_used[]={ { .uname = "ANY", .udesc = "Any filter", .ucode = 0x0f00, .uflags = INTEL_X86_DFL, }, { .uname = "UP", .udesc = "Filter on any up polarity", .ucode = 0x0300, }, { .uname = "DOWN", .udesc = "Filter on any down polarity", .ucode = 0xcc00, }, }; static const intel_x86_umask_t hswep_unc_s_rxr_bypass[]={ { .uname = "AD_CRD", .udesc = "AD credits", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_BNC", .udesc = "AD bounces", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL credits", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_BNC", .udesc = "BL bounces", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t hswep_unc_s_txr_ads_used[]={ { .uname = "AD", .udesc = "onto AD ring", .ucode = 0x100, .uflags =
INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "Onto AK ring", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "Onto BL ring", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, } }; static const intel_x86_entry_t intel_hswep_unc_s_pe[]={ { .name = "UNC_S_CLOCKTICKS", .desc = "S-box Uncore clockticks", .modmsk = HSWEP_UNC_SBO_ATTRS, .cntmsk = 0xf, .code = 0x00, }, { .name = "UNC_S_RING_AD_USED", .desc = "Address ring in use. Counts number of cycles ring is being used at this ring stop", .modmsk = HSWEP_UNC_SBO_ATTRS, .cntmsk = 0xf, .code = 0x1b, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_s_ring_ad_used), .ngrp = 1, .umasks = hswep_unc_s_ring_ad_used, }, { .name = "UNC_S_RING_AK_USED", .desc = "Acknowledgement ring in use. Counts number of cycles ring is being used at this ring stop", .modmsk = HSWEP_UNC_SBO_ATTRS, .cntmsk = 0xf, .code = 0x1c, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_s_ring_ad_used), /* identical to RING_AD_USED */ .ngrp = 1, .umasks = hswep_unc_s_ring_ad_used, }, { .name = "UNC_S_RING_BL_USED", .desc = "Bus or Data ring in use. Counts number of cycles ring is being used at this ring stop", .modmsk = HSWEP_UNC_SBO_ATTRS, .cntmsk = 0xf, .code = 0x1d, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_s_ring_ad_used), /* identical to RING_AD_USED */ .ngrp = 1, .umasks = hswep_unc_s_ring_ad_used, }, { .name = "UNC_S_RING_IV_USED", .desc = "Invalidate ring in use. 
Counts number of cycles ring is being used at this ring stop", .modmsk = HSWEP_UNC_SBO_ATTRS, .cntmsk = 0xf, .code = 0x1e, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_s_ring_iv_used), .ngrp = 1, .umasks = hswep_unc_s_ring_iv_used, }, { .name = "UNC_S_RING_BOUNCES", .desc = "Number of LLC responses that bounced in the ring", .modmsk = HSWEP_UNC_SBO_ATTRS, .cntmsk = 0xf, .code = 0x05, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_s_ring_bounces), .ngrp = 1, .umasks = hswep_unc_s_ring_bounces, }, { .name = "UNC_S_FAST_ASSERTED", .desc = "Number of cycles in which the local distress or incoming distress signals are asserted (FaST). Incoming distress includes both up and down", .modmsk = HSWEP_UNC_SBO_ATTRS, .cntmsk = 0xf, .code = 0x09, }, { .name = "UNC_S_BOUNCE_CONTROL", .desc = "Bounce control", .modmsk = HSWEP_UNC_SBO_ATTRS, .cntmsk = 0xf, .code = 0x0a, }, { .name = "UNC_S_RXR_OCCUPANCY", .desc = "Ingress Occupancy", .code = 0x11, .cntmsk = 0x1, .ngrp = 1, .modmsk = HSWEP_UNC_SBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_s_rxr_bypass), /* shared with rxr_bypass */ .umasks = hswep_unc_s_rxr_bypass, }, { .name = "UNC_S_RXR_BYPASS", .desc = "Ingress Bypass", .code = 0x12, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_SBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_s_rxr_bypass), .umasks = hswep_unc_s_rxr_bypass }, { .name = "UNC_S_RXR_INSERTS", .desc = "Ingress Allocations", .code = 0x13, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_SBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_s_rxr_bypass), /* shared with rxr_bypass */ .umasks = hswep_unc_s_rxr_bypass }, { .name = "UNC_S_TXR_ADS_USED", .desc = "Egress events", .code = 0x04, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_SBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_s_txr_ads_used), .umasks = hswep_unc_s_txr_ads_used }, { .name = "UNC_S_TXR_INSERTS", .desc = "Egress allocations", .code = 0x02, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_SBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_s_rxr_bypass), /* 
shared with rxr_bypass */ .umasks = hswep_unc_s_rxr_bypass }, { .name = "UNC_S_TXR_OCCUPANCY", .desc = "Egress occupancy", .code = 0x01, .cntmsk = 0xf, .ngrp = 1, .modmsk = HSWEP_UNC_SBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_s_rxr_bypass), /* shared with rxr_bypass */ .umasks = hswep_unc_s_rxr_bypass }, };
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_hswep_unc_ubo_events.h
/* * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux.
* * PMU: hswep_unc_ubo (Intel Haswell-EP U-Box uncore PMU) */ static const intel_x86_umask_t hswep_unc_u_event_msg[]={ { .uname = "DOORBELL_RCVD", .udesc = "TBD", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t hswep_unc_u_phold_cycles[]={ { .uname = "ASSERT_TO_ACK", .udesc = "Number of cycles asserted to ACK", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_entry_t intel_hswep_unc_u_pe[]={ { .name = "UNC_U_EVENT_MSG", .desc = "VLW Received", .code = 0x42, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_UBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_u_event_msg), .umasks = hswep_unc_u_event_msg }, { .name = "UNC_U_PHOLD_CYCLES", .desc = "Cycles PHOLD asserts to Ack", .code = 0x45, .cntmsk = 0x3, .ngrp = 1, .modmsk = HSWEP_UNC_UBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(hswep_unc_u_phold_cycles), .umasks = hswep_unc_u_phold_cycles }, { .name = "UNC_U_RACU_REQUESTS", .desc = "RACU requests", .code = 0x46, .cntmsk = 0x3, .modmsk = HSWEP_UNC_UBO_ATTRS, }, };
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_icl_events.h
/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software.
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * * PMU: intel_icl (Intel Icelake) * Based on Intel JSON event table version : 1.19 * Based on Intel JSON event table published : 02/16/2023 */ static const intel_x86_umask_t intel_icl_ocr[]={ { .uname = "OTHER_LOCAL_DRAM", .udesc = "Counts miscellaneous requests, such as I/O and un-cacheable accesses that DRAM supplied the request.", .ucode = 0x18400800000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STREAMING_WR_LOCAL_DRAM", .udesc = "Counts streaming stores that DRAM supplied the request.", .ucode = 0x18400080000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L1D_AND_SWPF_LOCAL_DRAM", .udesc = "Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that DRAM supplied the request.", .ucode = 0x18400040000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_RFO_LOCAL_DRAM", .udesc = "Counts hardware prefetch RFOs (which bring data to L2) that DRAM supplied the request.", .ucode = 0x18400002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_DATA_RD_LOCAL_DRAM", .udesc = "Counts hardware prefetch data reads (which bring data to L2) that DRAM supplied the request.", .ucode = 0x18400001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_LOCAL_DRAM", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that DRAM supplied the request.", .ucode = 0x18400000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"DEMAND_RFO_LOCAL_DRAM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that DRAM supplied the request.", .ucode = 0x18400000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_LOCAL_DRAM", .udesc = "Counts demand data reads that DRAM supplied the request.", .ucode = 0x18400000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OTHER_L3_MISS", .udesc = "Counts miscellaneous requests, such as I/O and un-cacheable accesses that were not supplied by the L3 cache.", .ucode = 0x3fffc0800000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STREAMING_WR_L3_MISS", .udesc = "Counts streaming stores that were not supplied by the L3 cache.", .ucode = 0x3fffc0080000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L1D_AND_SWPF_L3_MISS", .udesc = "Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that were not supplied by the L3 cache.", .ucode = 0x3fffc0040000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_RFO_L3_MISS", .udesc = "Counts hardware prefetch RFOs (which bring data to L2) that were not supplied by the L3 cache.", .ucode = 0x3fffc0002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_DATA_RD_L3_MISS", .udesc = "Counts hardware prefetch data reads (which bring data to L2) that were not supplied by the L3 cache.", .ucode = 0x3fffc0001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_L3_MISS", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that were not supplied by the L3 cache.", .ucode = 0x3fffc0000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_MISS", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were not supplied by the L3 cache.", .ucode = 0x3fffc0000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_MISS", .udesc = "Counts demand data reads that were not supplied by the L3 
cache.", .ucode = 0x3fffc0000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OTHER_DRAM", .udesc = "Counts miscellaneous requests, such as I/O and un-cacheable accesses that DRAM supplied the request.", .ucode = 0x18400800000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STREAMING_WR_DRAM", .udesc = "Counts streaming stores that DRAM supplied the request.", .ucode = 0x18400080000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L1D_AND_SWPF_DRAM", .udesc = "Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that DRAM supplied the request.", .ucode = 0x18400040000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_RFO_DRAM", .udesc = "Counts hardware prefetch RFOs (which bring data to L2) that DRAM supplied the request.", .ucode = 0x18400002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_DATA_RD_DRAM", .udesc = "Counts hardware prefetch data reads (which bring data to L2) that DRAM supplied the request.", .ucode = 0x18400001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_DRAM", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that DRAM supplied the request.", .ucode = 0x18400000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_DRAM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that DRAM supplied the request.", .ucode = 0x18400000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_DRAM", .udesc = "Counts demand data reads that DRAM supplied the request.", .ucode = 0x18400000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OTHER_L3_HIT_SNOOP_SENT", .udesc = "Counts miscellaneous requests, such as I/O and un-cacheable accesses that hit a cacheline in the L3 where a snoop was sent.", .ucode = 0x1e003c800000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_RFO_L3_HIT_SNOOP_SENT", .udesc = "Counts hardware prefetch RFOs (which bring data to L2) that hit a cacheline 
in the L3 where a snoop was sent.", .ucode = 0x1e003c002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_DATA_RD_L3_HIT_SNOOP_SENT", .udesc = "Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop was sent.", .ucode = 0x1e003c001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_L3_HIT_SNOOP_SENT", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop was sent.", .ucode = 0x1e003c000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_HIT_SNOOP_SENT", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop was sent.", .ucode = 0x1e003c000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_SENT", .udesc = "Counts demand data reads that hit a cacheline in the L3 where a snoop was sent.", .ucode = 0x1e003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OTHER_ANY_RESPONSE", .udesc = "Counts miscellaneous requests, such as I/O and un-cacheable accesses that have any type of response.", .ucode = 0x1800000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STREAMING_WR_ANY_RESPONSE", .udesc = "Counts streaming stores that have any type of response.", .ucode = 0x1080000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L1D_AND_SWPF_ANY_RESPONSE", .udesc = "Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that have any type of response.", .ucode = 0x1040000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_RFO_ANY_RESPONSE", .udesc = "Counts hardware prefetch RFOs (which bring data to L2) that have any type of response.", .ucode = 0x1002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_DATA_RD_ANY_RESPONSE", .udesc = "Counts hardware prefetch data reads (which bring data to L2) that have any type of response.", .ucode = 0x1001000ull, 
.uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_ANY_RESPONSE", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that have any type of response.", .ucode = 0x1000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_ANY_RESPONSE", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that have any type of response.", .ucode = 0x1000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_ANY_RESPONSE", .udesc = "Counts demand data reads that have any type of response.", .ucode = 0x1000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L3_L3_HIT_ANY", .udesc = "Counts hardware prefetches to the L3 only that hit a cacheline in the L3 where a snoop was sent or not.", .ucode = 0x3fc03c238000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OTHER_L3_HIT_SNOOP_HIT_NO_FWD", .udesc = "Counts miscellaneous requests, such as I/O and un-cacheable accesses that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required.", .ucode = 0x4003c800000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OTHER_L3_HIT_SNOOP_MISS", .udesc = "Counts miscellaneous requests, such as I/O and un-cacheable accesses that hit a cacheline in the L3 where a snoop was sent but no other cores had the data.", .ucode = 0x2003c800000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OTHER_L3_HIT_SNOOP_NOT_NEEDED", .udesc = "Counts miscellaneous requests, such as I/O and un-cacheable accesses that hit a cacheline in the L3 where a snoop was not needed to satisfy the request.", .ucode = 0x1003c800000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STREAMING_WR_L3_HIT_ANY", .udesc = "Counts streaming stores that hit a cacheline in the L3 where a snoop was sent or not.", .ucode = 0x3fc03c080000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L1D_AND_SWPF_L3_HIT_ANY", .udesc = "Counts L1 data cache prefetch requests and software prefetches (except 
PREFETCHW) that hit a cacheline in the L3 where a snoop was sent or not.", .ucode = 0x3fc03c040000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L1D_AND_SWPF_L3_HIT_SNOOP_MISS", .udesc = "Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that hit a cacheline in the L3 where a snoop was sent but no other cores had the data.", .ucode = 0x2003c040000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L1D_AND_SWPF_L3_HIT_SNOOP_NOT_NEEDED", .udesc = "Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that hit a cacheline in the L3 where a snoop was not needed to satisfy the request.", .ucode = 0x1003c040000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_RFO_L3_HIT_ANY", .udesc = "Counts hardware prefetch RFOs (which bring data to L2) that hit a cacheline in the L3 where a snoop was sent or not.", .ucode = 0x3fc03c002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_RFO_L3_HIT_SNOOP_HITM", .udesc = "Counts hardware prefetch RFOs (which bring data to L2) that hit a cacheline in the L3 where a snoop hit in another core's caches, data forwarding is required as the data is modified.", .ucode = 0x10003c002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_RFO_L3_HIT_SNOOP_HIT_NO_FWD", .udesc = "Counts hardware prefetch RFOs (which bring data to L2) that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required.", .ucode = 0x4003c002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_RFO_L3_HIT_SNOOP_MISS", .udesc = "Counts hardware prefetch RFOs (which bring data to L2) that hit a cacheline in the L3 where a snoop was sent but no other cores had the data.", .ucode = 0x2003c002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_RFO_L3_HIT_SNOOP_NOT_NEEDED", .udesc = "Counts hardware prefetch RFOs (which bring data to L2) that hit a cacheline in the L3 where a snoop was not needed to satisfy the request.", .ucode = 0x1003c002000ull, 
.uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_DATA_RD_L3_HIT_ANY", .udesc = "Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop was sent or not.", .ucode = 0x3fc03c001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_DATA_RD_L3_HIT_SNOOP_HITM", .udesc = "Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop hit in another core's caches, data forwarding is required as the data is modified.", .ucode = 0x10003c001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_DATA_RD_L3_HIT_SNOOP_HIT_NO_FWD", .udesc = "Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required.", .ucode = 0x4003c001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_DATA_RD_L3_HIT_SNOOP_MISS", .udesc = "Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop was sent but no other cores had the data.", .ucode = 0x2003c001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_DATA_RD_L3_HIT_SNOOP_NOT_NEEDED", .udesc = "Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop was not needed to satisfy the request.", .ucode = 0x1003c001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_L3_HIT_ANY", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop was sent or not.", .ucode = 0x3fc03c000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_L3_HIT_SNOOP_HITM", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop hit in another core's caches, data forwarding is required as the data is modified.", .ucode = 0x10003c000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"DEMAND_CODE_RD_L3_HIT_SNOOP_HIT_NO_FWD", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required.", .ucode = 0x4003c000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_L3_HIT_SNOOP_MISS", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop was sent but no other cores had the data.", .ucode = 0x2003c000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_L3_HIT_SNOOP_NOT_NEEDED", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop was not needed to satisfy the request.", .ucode = 0x1003c000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_HIT_ANY", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop was sent or not.", .ucode = 0x3fc03c000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_HIT_SNOOP_HITM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop hit in another core's caches, data forwarding is required as the data is modified.", .ucode = 0x10003c000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_HIT_SNOOP_HIT_NO_FWD", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required.", .ucode = 0x4003c000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_HIT_SNOOP_MISS", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop was 
sent but no other cores had the data.", .ucode = 0x2003c000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_HIT_SNOOP_NOT_NEEDED", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop was not needed to satisfy the request.", .ucode = 0x1003c000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_ANY", .udesc = "Counts demand data reads that hit a cacheline in the L3 where a snoop was sent or not.", .ucode = 0x3fc03c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_HITM", .udesc = "Counts demand data reads that hit a cacheline in the L3 where a snoop hit in another core's caches, data forwarding is required as the data is modified.", .ucode = 0x10003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_HIT_NO_FWD", .udesc = "Counts demand data reads that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required.", .ucode = 0x4003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_MISS", .udesc = "Counts demand data reads that hit a cacheline in the L3 where a snoop was sent but no other cores had the data.", .ucode = 0x2003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_NOT_NEEDED", .udesc = "Counts demand data reads that hit a cacheline in the L3 where a snoop was not needed to satisfy the request.", .ucode = 0x1003c000100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icx_ocr[]={ { .uname = "WRITE_ESTIMATE_MEMORY", .udesc = "Counts Demand RFOs, ItoM's, PREFETCHW's, Hardware RFO Prefetches to the L1/L2 and Streaming stores that likely resulted in a store to Memory (DRAM or PMM)", .ucode = 0xfbff8082200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_MEMORY", .udesc = "Counts all (cacheable) data read, code read and 
RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM or PMM attached to another socket.", .ucode = 0x73180047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_ANY_RESPONSE", .udesc = "Counts hardware prefetches (which bring data to L2) that have any type of response.", .ucode = 0x1007000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_LOCAL_SOCKET_PMM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by PMM attached to this socket, whether or not in Sub NUMA Cluster (SNC) Mode. In SNC Mode counts PMM accesses that are controlled by the close or distant SNC Cluster.", .ucode = 0x700c0047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_LOCAL_SOCKET_DRAM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM attached to this socket, whether or not in Sub NUMA Cluster (SNC) Mode. In SNC Mode counts DRAM accesses that are controlled by the close or distant SNC Cluster.", .ucode = 0x70c00047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_MISS_LOCAL_SOCKET", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that missed the L3 Cache and were supplied by the local socket (DRAM or PMM), whether or not in Sub NUMA Cluster (SNC) Mode. 
In SNC Mode counts PMM or DRAM accesses that are controlled by the close or distant SNC Cluster.", .ucode = 0x70cc0047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_SNC_CACHE_HITM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that hit a modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x100800047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_SNC_CACHE_HITM", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that hit a modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x100800000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_SNC_CACHE_HITM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x100800000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_SNC_CACHE_HITM", .udesc = "Counts demand data reads that hit a modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x100800000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_SNC_CACHE_HIT_WITH_FWD", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that either hit a non-modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x80800047700ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_SNC_CACHE_HIT_WITH_FWD", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that either hit a non-modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x80800000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_SNC_CACHE_HIT_WITH_FWD", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that either hit a non-modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x80800000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_SNC_CACHE_HIT_WITH_FWD", .udesc = "Counts demand data reads that either hit a non-modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x80800000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PREFETCHES_L3_HIT", .udesc = "Counts hardware and software prefetches to all cache levels that hit in the L3 or were snooped from another core's caches on the same socket.", .ucode = 0x3f803c27f000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PREFETCHES_L3_MISS_LOCAL", .udesc = "Counts hardware and software prefetches to all cache levels that were not supplied by the local socket's L1, L2, or L3 caches and the cacheline is homed locally.", .ucode = 0x3f844027f000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L3_L3_MISS", .udesc = "Counts hardware prefetches to the L3 only that missed the local socket's L1, L2, and L3 caches.", .ucode = 0x9400238000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STREAMING_WR_L3_MISS", .udesc = "Counts streaming stores that missed the local socket's L1, L2, and L3 caches.", .ucode = 0x9400080000ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_MISS", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were not supplied by the local socket's L1, L2, or L3 caches.", .ucode = 0x3f3fc0047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_MISS", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were not supplied by the local socket's L1, L2, or L3 caches.", .ucode = 0x3f3fc0000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ITOM_REMOTE", .udesc = "Counts full cacheline writes (ItoM) that were not supplied by the local socket's L1, L2, or L3 caches and the cacheline was homed in a remote socket.", .ucode = 0x9000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L3_REMOTE", .udesc = "Counts hardware prefetches to the L3 only that were not supplied by the local socket's L1, L2, or L3 caches and the cacheline was homed in a remote socket.", .ucode = 0x9000238000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were not supplied by the local socket's L1, L2, or L3 caches and were supplied by a remote socket.", .ucode = 0x3f3300047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ITOM_L3_MISS_LOCAL", .udesc = "Counts full cacheline writes (ItoM) that were not supplied by the local socket's L1, L2, or L3 caches and the cacheline is homed locally.", .ucode = 0x8400000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L3_L3_MISS_LOCAL", .udesc = "Counts hardware prefetches to the L3 only that were not supplied by the local socket's L1, L2, or L3 caches and the cacheline is homed locally.", .ucode = 0x8400238000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STREAMING_WR_L3_MISS_LOCAL", .udesc = "Counts streaming stores 
that were not supplied by the local socket's L1, L2, or L3 caches and the cacheline is homed locally.", .ucode = 0x8400080000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_MISS_LOCAL", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were not supplied by the local socket's L1, L2, or L3 caches and were supplied by the local socket.", .ucode = 0x3f0440047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_MISS_LOCAL", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were not supplied by the local socket's L1, L2, or L3 caches and were supplied by the local socket.", .ucode = 0x3f0440000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_HIT", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that hit in the L3 or were snooped from another core's caches on the same socket.", .ucode = 0x3f003c047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L3_L3_HIT", .udesc = "Counts hardware prefetches to the L3 only that hit in the L3 or were snooped from another core's caches on the same socket.", .ucode = 0x8008238000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STREAMING_WR_L3_HIT", .udesc = "Counts streaming stores that hit in the L3 or were snooped from another core's caches on the same socket.", .ucode = 0x8008080000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_ANY_RESPONSE", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that have any type of response.", .ucode = 0x3f3ffc047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_ANY_RESPONSE", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that 
have any type of response.", .ucode = 0x3f3ffc000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_SNC_DRAM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM on a distant memory controller of this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x70800047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_SNC_DRAM", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that were supplied by DRAM on a distant memory controller of this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x70800000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_SNC_DRAM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were supplied by DRAM on a distant memory controller of this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x70800000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_SNC_DRAM", .udesc = "Counts demand data reads that were supplied by DRAM on a distant memory controller of this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x70800000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_SNC_PMM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by PMM on a distant memory controller of this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x70080047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_LOCAL_PMM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by PMM attached to this socket, unless in Sub NUMA Cluster(SNC) Mode. 
In SNC Mode counts only those PMM accesses that are controlled by the close SNC Cluster.", .ucode = 0x10040047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_PMM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by PMM attached to another socket.", .ucode = 0x70300047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_DRAM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM attached to another socket.", .ucode = 0x73000047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_LOCAL_DRAM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM attached to this socket, unless in Sub NUMA Cluster(SNC) Mode. In SNC Mode counts only those DRAM accesses that are controlled by the close SNC Cluster.", .ucode = 0x10400047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_DRAM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM.", .ucode = 0x73c00047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_CACHE_SNOOP_FWD", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by a cache on a remote socket where a snoop was sent and data was returned (Modified or Not Modified).", .ucode = 0x183000047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_CACHE_SNOOP_HIT_WITH_FWD", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by a cache on a remote 
socket where a snoop hit in another core's caches which forwarded the unmodified data to the requesting core.", .ucode = 0x83000047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_CACHE_SNOOP_HITM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by a cache on a remote socket where a snoop hit a modified line in another core's caches which forwarded the data.", .ucode = 0x103000047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_HIT_SNOOP_HIT_WITH_FWD", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that resulted in a snoop hit in another core's caches which forwarded the unmodified data to the requesting core.", .ucode = 0x8003c047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_HIT_SNOOP_HITM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that resulted in a snoop hit a modified line in another core's caches which forwarded the data.", .ucode = 0x10003c047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_HIT_SNOOP_HIT_NO_FWD", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that resulted in a snoop that hit in another core, which did not forward the data.", .ucode = 0x4003c047700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L1D_AND_SWPF_L3_HIT", .udesc = "Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that hit in the L3 or were snooped from another core's caches on the same socket.", .ucode = 0x3f803c040000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_L3_HIT", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that hit in the L3 or were 
snooped from another core's caches on the same socket.", .ucode = 0x3f803c000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_HIT", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit in the L3 or were snooped from another core's caches on the same socket.", .ucode = 0x3f803c000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT", .udesc = "Counts demand data reads that hit in the L3 or were snooped from another core's caches on the same socket.", .ucode = 0x3f803c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_SNC_PMM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were supplied by PMM on a distant memory controller of this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x70080000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_SNC_PMM", .udesc = "Counts demand data reads that were supplied by PMM on a distant memory controller of this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x70080000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OTHER_L3_MISS_LOCAL", .udesc = "Counts miscellaneous requests, such as I/O and un-cacheable accesses that were not supplied by the local socket's L1, L2, or L3 caches and the cacheline is homed locally.", .ucode = 0x3f8440800000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L1D_AND_SWPF_L3_MISS_LOCAL", .udesc = "Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that were not supplied by the local socket's L1, L2, or L3 caches and the cacheline is homed locally.", .ucode = 0x3f8440040000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L1D_AND_SWPF_DRAM", .udesc = "Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that were supplied by DRAM.", .ucode = 0x73c00040000ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_L3_MISS_LOCAL", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that were not supplied by the local socket's L1, L2, or L3 caches and the cacheline is homed locally.", .ucode = 0x3f8440000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_DRAM", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that were supplied by DRAM.", .ucode = 0x73c00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_PMM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were supplied by PMM.", .ucode = 0x703c0000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_LOCAL_PMM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were supplied by PMM attached to this socket, unless in Sub NUMA Cluster(SNC) Mode. In SNC Mode counts only those PMM accesses that are controlled by the close SNC Cluster.", .ucode = 0x10040000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_REMOTE_PMM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were supplied by PMM attached to another socket.", .ucode = 0x70300000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_DRAM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were supplied by DRAM.", .ucode = 0x73c00000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_PMM", .udesc = "Counts demand data reads that were supplied by PMM.", .ucode = 0x703c0000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_LOCAL_PMM", .udesc = "Counts demand data reads that were supplied by PMM attached to this socket, unless in Sub NUMA Cluster(SNC) Mode. 
In SNC Mode counts only those PMM accesses that are controlled by the close SNC Cluster.", .ucode = 0x10040000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_MISS_LOCAL", .udesc = "Counts demand data reads that were not supplied by the local socket's L1, L2, or L3 caches and the cacheline is homed locally.", .ucode = 0x3f8440000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_REMOTE_PMM", .udesc = "Counts demand data reads that were supplied by PMM attached to another socket.", .ucode = 0x70300000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_DRAM", .udesc = "Counts demand data reads that were supplied by DRAM.", .ucode = 0x73c00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_REMOTE_CACHE_SNOOP_HIT_WITH_FWD", .udesc = "Counts demand data reads that were supplied by a cache on a remote socket where a snoop hit in another core's caches which forwarded the unmodified data to the requesting core.", .ucode = 0x83000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_REMOTE_CACHE_SNOOP_HITM", .udesc = "Counts demand data reads that were supplied by a cache on a remote socket where a snoop hit a modified line in another core's caches which forwarded the data.", .ucode = 0x103000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_REMOTE_DRAM", .udesc = "Counts demand data reads that were supplied by DRAM attached to another socket.", .ucode = 0x73000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L3_ANY_RESPONSE", .udesc = "Counts hardware prefetches to the L3 only that have any type of response.", .ucode = 0x1238000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OTHER_L3_MISS", .udesc = "Counts miscellaneous requests, such as I/O and un-cacheable accesses that were not supplied by the local socket's L1, L2, or L3 caches.", .ucode = 0x3fbfc0800000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OTHER_ANY_RESPONSE", .udesc = "Counts miscellaneous 
requests, such as I/O and un-cacheable accesses that have any type of response.", .ucode = 0x1800000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STREAMING_WR_ANY_RESPONSE", .udesc = "Counts streaming stores that have any type of response.", .ucode = 0x1080000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L1D_AND_SWPF_LOCAL_DRAM", .udesc = "Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that were supplied by DRAM attached to this socket, unless in Sub NUMA Cluster(SNC) Mode. In SNC Mode counts only those DRAM accesses that are controlled by the close SNC Cluster.", .ucode = 0x10400040000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L1D_AND_SWPF_L3_MISS", .udesc = "Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that were not supplied by the local socket's L1, L2, or L3 caches.", .ucode = 0x3fbfc0040000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_LOCAL_DRAM", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that were supplied by DRAM attached to this socket, unless in Sub NUMA Cluster(SNC) Mode. In SNC Mode counts only those DRAM accesses that are controlled by the close SNC Cluster.", .ucode = 0x10400000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_L3_MISS", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that were not supplied by the local socket's L1, L2, or L3 caches.", .ucode = 0x3fbfc0000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_ANY_RESPONSE", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that have any type of response.", .ucode = 0x1000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_LOCAL_DRAM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were supplied by DRAM attached to this socket, unless in Sub NUMA Cluster(SNC) Mode. 
In SNC Mode counts only those DRAM accesses that are controlled by the close SNC Cluster.", .ucode = 0x10400000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_LOCAL_DRAM", .udesc = "Counts demand data reads that were supplied by DRAM attached to this socket, unless in Sub NUMA Cluster(SNC) Mode. In SNC Mode counts only those DRAM accesses that are controlled by the close SNC Cluster.", .ucode = 0x10400000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_MISS", .udesc = "Counts demand data reads that were not supplied by the local socket's L1, L2, or L3 caches.", .ucode = 0x3fbfc0000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_ANY_RESPONSE", .udesc = "Counts demand data reads that have any type of response.", .ucode = 0x1000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_L3_HIT_SNOOP_HITM", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that resulted in a snoop hit a modified line in another core's caches which forwarded the data.", .ucode = 0x10003c000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_HIT_SNOOP_HITM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that resulted in a snoop hit a modified line in another core's caches which forwarded the data.", .ucode = 0x10003c000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_HIT_WITH_FWD", .udesc = "Counts demand data reads that resulted in a snoop hit in another core's caches which forwarded the unmodified data to the requesting core.", .ucode = 0x8003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_HITM", .udesc = "Counts demand data reads that resulted in a snoop hit a modified line in another core's caches which forwarded the data.", .ucode = 0x10003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_HIT_NO_FWD", .udesc = 
"Counts demand data reads that resulted in a snoop that hit in another core, which did not forward the data.", .ucode = 0x4003c000100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_sq_misc[]={ { .uname = "SQ_FULL", .udesc = "Cycles the thread is active and superQ cannot take any more entries.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BUS_LOCK", .udesc = "Counts bus locks, accounts for cache line split locks and UC locks.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_l2_lines_out[]={ { .uname = "USELESS_HWPF", .udesc = "Cache lines that have been L2 hardware prefetched but not used by demand accesses", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NON_SILENT", .udesc = "Modified cache lines that are evicted by L2 cache when triggered by an L2 cache fill.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SILENT", .udesc = "Non-modified cache lines that are silently dropped by L2 cache when triggered by an L2 cache fill.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_l2_lines_in[]={ { .uname = "ALL", .udesc = "L2 cache lines filling L2", .ucode = 0x1f00ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_icl_l2_trans[]={ { .uname = "L2_WB", .udesc = "L2 writebacks that access L2 cache", .ucode = 0x4000ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_icl_baclears[]={ { .uname = "ANY", .udesc = "Counts the total number of times the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_icl_mem_load_l3_miss_retired[]={ { .uname = "REMOTE_PMM", .udesc = "Retired load instructions with remote Intel Optane DC persistent memory as the data source where the data 
request missed all caches.", .ucode = 0x1000ull, .umodel = PFM_PMU_INTEL_ICX, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_FWD", .udesc = "Retired load instructions whose data source was forwarded from a remote cache", .ucode = 0x0800ull, .umodel = PFM_PMU_INTEL_ICX, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_HITM", .udesc = "Retired load instructions whose data source was a remote HITM", .ucode = 0x0400ull, .umodel = PFM_PMU_INTEL_ICX, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_DRAM", .udesc = "Retired load instructions whose data sources missed L3 but were serviced from remote DRAM", .ucode = 0x0200ull, .umodel = PFM_PMU_INTEL_ICX, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOCAL_DRAM", .udesc = "Retired load instructions whose data sources missed L3 but were serviced from local DRAM", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_icl_mem_load_l3_hit_retired[]={ { .uname = "XSNP_NONE", .udesc = "Retired load instructions whose data sources were hits in L3 without snoops required", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_HITM", .udesc = "Retired load instructions whose data sources were HitM responses from shared L3", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_HIT", .udesc = "Retired load instructions whose data sources were L3 and cross-core snoop hits in on-pkg core cache", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_MISS", .udesc = "Retired load instructions whose data sources were L3 hit and cross-core snoop missed in on-pkg core cache.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_icl_mem_load_retired[]={ { .uname = "FB_HIT", .udesc = "Number of completed demand load requests that missed the L1, but hit the FB (fill buffer), because a 
preceding miss to the same cacheline initiated the line to be brought into L1, but data is not yet ready in L1.", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_MISS", .udesc = "Retired load instructions missed L3 cache as data sources", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_MISS", .udesc = "Retired load instructions missed L2 cache as data sources", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1_MISS", .udesc = "Retired load instructions missed L1 cache as data sources", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_HIT", .udesc = "Retired load instructions with L3 cache hits as data sources", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_HIT", .udesc = "Retired load instructions with L2 cache hits as data sources", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1_HIT", .udesc = "Retired load instructions with L1 cache hits as data sources", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOCAL_PMM", .udesc = "Retired load instructions with local Intel Optane DC persistent memory as the data source where the data request missed all caches.", .ucode = 0x8000ull, .umodel = PFM_PMU_INTEL_ICX, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_icl_mem_inst_retired[]={ { .uname = "ALL_STORES", .udesc = "All retired store instructions.", .ucode = 0x8200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_LOADS", .udesc = "All retired load instructions.", .ucode = 0x8100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_STORES", .udesc = "Retired store instructions that split across a cacheline boundary.", .ucode = 0x4200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_LOADS", .udesc = "Retired load instructions that split 
across a cacheline boundary.", .ucode = 0x4100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOCK_LOADS", .udesc = "Retired load instructions with locked access.", .ucode = 0x2100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS_STORES", .udesc = "Retired store instructions that miss the STLB.", .ucode = 0x1200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS_LOADS", .udesc = "Retired load instructions that miss the STLB.", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY", .udesc = "All retired memory instructions.", .ucode = 0x8300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_icl_mem_trans_retired[]={ { .uname = "LOAD_LATENCY", .udesc = "Memory load instructions retired above programmed clocks, minimum threshold value is 3 (Precise Event and ldlat required)", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT | INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_icl_misc_retired[]={ { .uname = "PAUSE_INST", .udesc = "Number of retired PAUSE instructions.", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LBR_INSERTS", .udesc = "Increments whenever there is an update to the LBR array.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_rtm_retired[]={ { .uname = "ABORTED_EVENTS", .udesc = "Number of times an RTM execution aborted due to none of the previous 4 categories (e.g. 
interrupt)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MEMTYPE", .udesc = "Number of times an RTM execution aborted due to incompatible memory type", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_UNFRIENDLY", .udesc = "Number of times an RTM execution aborted due to HLE-unfriendly instructions", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MEM", .udesc = "Number of times an RTM execution aborted due to various memory events (e.g. read/write capacity and conflicts)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED", .udesc = "Number of times an RTM execution aborted.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "COMMIT", .udesc = "Number of times an RTM execution successfully committed", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "START", .udesc = "Number of times an RTM execution started.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_hle_retired[]={ { .uname = "ABORTED_EVENTS", .udesc = "Number of times an HLE execution aborted due to unfriendly events (such as interrupts).", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_UNFRIENDLY", .udesc = "Number of times an HLE execution aborted due to HLE-unfriendly instructions and certain unfriendly events (such as AD assists etc.).", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MEM", .udesc = "Number of times an HLE execution aborted due to various memory events (e.g., read/write capacity and conflicts).", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED", .udesc = "Number of times an HLE execution aborted due to any reasons (multiple categories may count as one).", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "COMMIT", .udesc = "Number of times an HLE execution successfully committed", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname 
= "START", .udesc = "Number of times an HLE execution started.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_fp_arith_inst_retired[]={ { .uname = "512B_PACKED_SINGLE", .udesc = "Counts number of SSE/AVX computational 512-bit packed single precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 16 computation operations, one for each element. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT RSQRT14 RCP14 FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element. The DAZ and FTZ flags in the MXCSR register need to be set when using this event.", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "512B_PACKED_DOUBLE", .udesc = "Counts number of SSE/AVX computational 512-bit packed double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 8 computation operations, one for each element. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT RSQRT14 RCP14 FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element. The DAZ and FTZ flags in the MXCSR register need to be set when using this event.", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "256B_PACKED_SINGLE", .udesc = "Counts number of SSE/AVX computational 256-bit packed single precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 8 computation operations, one for each element. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT RSQRT RCP DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element. 
The DAZ and FTZ flags in the MXCSR register need to be set when using this event.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "256B_PACKED_DOUBLE", .udesc = "Counts number of SSE/AVX computational 256-bit packed double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 4 computation operations, one for each element. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element. The DAZ and FTZ flags in the MXCSR register need to be set when using this event.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "128B_PACKED_SINGLE", .udesc = "Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 4 computation operations, one for each element. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP14 RSQRT14 SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element. The DAZ and FTZ flags in the MXCSR register need to be set when using this event.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "128B_PACKED_DOUBLE", .udesc = "Counts number of SSE/AVX computational 128-bit packed double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 2 computation operations, one for each element. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element. 
The DAZ and FTZ flags in the MXCSR register need to be set when using this event.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR_SINGLE", .udesc = "Counts number of SSE/AVX computational scalar single precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 1 computational operation. Applies to SSE* and AVX* scalar single precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT RSQRT RCP FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element. The DAZ and FTZ flags in the MXCSR register need to be set when using this event.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR_DOUBLE", .udesc = "Counts number of SSE/AVX computational scalar double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 1 computational operation. Applies to SSE* and AVX* scalar double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element. The DAZ and FTZ flags in the MXCSR register need to be set when using this event.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR", .udesc = "Number of SSE/AVX computational scalar floating-point instructions retired; some instructions will count twice as noted below. Applies to SSE* and AVX* scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RCP14 RSQRT14 SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "4_FLOPS", .udesc = "Number of SSE/AVX computational 128-bit packed single and 256-bit packed double precision FP instructions retired; some instructions will count twice as noted below. 
Each count represents 2 or/and 4 computation operations, 1 for each element. Applies to SSE* and AVX* packed single precision and packed double precision FP instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX RCP14 RSQRT14 SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB count twice as they perform 2 calculations per element.", .ucode = 0x1800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "8_FLOPS", .udesc = "Number of SSE/AVX computational 256-bit packed single precision and 512-bit packed double precision FP instructions retired; some instructions will count twice as noted below. Each count represents 8 computation operations, 1 for each element. Applies to SSE* and AVX* packed single precision and double precision FP instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT RSQRT RSQRT14 RCP RCP14 DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB count twice as they perform 2 calculations per element.", .ucode = 0x6000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VECTOR", .udesc = "Number of any Vector retired FP arithmetic instructions", .ucode = 0xfc00ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_frontend_retired[]={ { .uname = "LATENCY_GE_1", .udesc = "Retired instructions after front-end starvation of at least 1 cycle", .ucode = 0x50010600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_2_BUBBLES_GE_1", .udesc = "Retired instructions that are fetched after an interval where the front-end had at least 1 bubble-slot for a period of 2 cycles which was not interrupted by a back-end stall.", .ucode = 0x10020600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_512", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 512 cycles which was not interrupted by a back-end stall.", .ucode = 0x52000600ull, .uflags = 
INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_256", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 256 cycles which was not interrupted by a back-end stall.", .ucode = 0x51000600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_128", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 128 cycles which was not interrupted by a back-end stall.", .ucode = 0x50800600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_64", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 64 cycles which was not interrupted by a back-end stall.", .ucode = 0x50400600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_32", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 32 cycles which was not interrupted by a back-end stall.", .ucode = 0x50200600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_16", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 16 cycles which was not interrupted by a back-end stall.", .ucode = 0x50100600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_8", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 8 cycles which was not interrupted by a back-end stall.", .ucode = 0x50080600ull, .uflags = INTEL_X86_NCOMBO | 
INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_4", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 4 cycles which was not interrupted by a back-end stall.", .ucode = 0x50040600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_2", .udesc = "Retired instructions after front-end starvation of at least 2 cycles", .ucode = 0x50020600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "STLB_MISS", .udesc = "Retired instructions that experienced an STLB (2nd level TLB) true miss.", .ucode = 0x1500ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ITLB_MISS", .udesc = "Retired instructions that experienced an iTLB true miss.", .ucode = 0x1400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_MISS", .udesc = "Retired instructions that experienced an instruction L2 cache true miss.", .ucode = 0x1300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1I_MISS", .udesc = "Retired instructions that experienced an instruction L1 cache true miss.", .ucode = 0x1200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "DSB_MISS", .udesc = "Retired instructions experiencing a critical DSB miss.", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY_DSB_MISS", .udesc = "Retired instructions experiencing a DSB miss.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "IDQ_4_BUBBLES", .udesc = "Retired instructions after an interval where the front-end did not deliver any uops (4 bubbles) for a period determined by the fe_thres modifier (set to 1 cycle by default) and which was not interrupted by a back-end stall", .ucode = (4 << 20 | 0x6) << 8, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, { .uname = "IDQ_3_BUBBLES", 
.udesc = "Counts instructions retired after an interval where the front-end did not deliver more than 1 uop (3 bubbles) for a period determined by the fe_thres modifier (set to 1 cycle by default) and which was not interrupted by a back-end stall", .ucode = (3 << 20 | 0x6) << 8, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, { .uname = "IDQ_2_BUBBLES", .udesc = "Counts instructions retired after an interval where the front-end did not deliver more than 2 uops (2 bubbles) for a period determined by the fe_thres modifier (set to 1 cycle by default) and which was not interrupted by a back-end stall", .ucode = (2 << 20 | 0x6) << 8, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, { .uname = "IDQ_1_BUBBLE", .udesc = "Counts instructions retired after an interval where the front-end did not deliver more than 3 uops (1 bubble) for a period determined by the fe_thres modifier (set to 1 cycle by default) and which was not interrupted by a back-end stall", .ucode = (1 << 20 | 0x6) << 8, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_icl_br_misp_retired[]={ { .uname = "INDIRECT", .udesc = "All mispredicted indirect branch instructions retired (excluding RETs. 
TSX aborts are considered indirect branches).", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_TAKEN", .udesc = "Number of near branch instructions retired that were mispredicted and taken.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND", .udesc = "Mispredicted conditional branch instructions retired.", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_NTAKEN", .udesc = "Mispredicted non-taken conditional branch instructions retired.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "INDIRECT_CALL", .udesc = "Mispredicted indirect CALL instructions retired.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_TAKEN", .udesc = "Number of branch instructions retired that were mispredicted and taken.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_BRANCHES", .udesc = "All mispredicted branch instructions retired.", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_icl_br_inst_retired[]={ { .uname = "INDIRECT", .udesc = "Indirect near branch instructions retired (excluding returns)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "FAR_BRANCH", .udesc = "Far branch instructions retired.", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_TAKEN", .udesc = "Taken branch instructions retired.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND", .udesc = "Conditional branch instructions retired.", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_NTAKEN", .udesc = "Not taken branch instructions retired.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_RETURN", .udesc = "Return instructions retired.", .ucode = 0x0800ull, 
.uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_CALL", .udesc = "Direct and indirect near call instructions retired.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_TAKEN", .udesc = "Taken conditional branch instructions retired.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_BRANCHES", .udesc = "All branch instructions retired.", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_icl_machine_clears[]={ { .uname = "SMC", .udesc = "Self-modifying code (SMC) detected.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEMORY_ORDERING", .udesc = "Number of machine clears due to memory ordering conflicts.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "COUNT", .udesc = "Number of machine clears (nukes) of any type.", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, }; static const intel_x86_umask_t intel_icl_uops_retired[]={ { .uname = "SLOTS", .udesc = "Retirement slots used.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TOTAL_CYCLES", .udesc = "Cycles with less than 10 actually retired uops.", .ucode = 0x0200ull | (0x1 << INTEL_X86_INV_BIT) | (0xa << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "STALL_CYCLES", .udesc = "Cycles without actually retired uops.", .ucode = 0x0200ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_icl_assists[]={ { .uname = "ANY", .udesc = "Number of occurrences where a microcode assist is invoked by hardware.", .ucode = 0x0700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FP", .udesc = "Counts all microcode FP 
assists.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_tlb_flush[]={ { .uname = "STLB_ANY", .udesc = "STLB flush attempts", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DTLB_THREAD", .udesc = "DTLB flush attempts of the thread-specific entries", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_uops_executed[]={ { .uname = "X87", .udesc = "Counts the number of x87 uops dispatched.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORE_CYCLES_GE_4", .udesc = "Cycles where at least 4 micro-ops are executed from any thread on physical core.", .ucode = 0x0200ull | (0x4 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_3", .udesc = "Cycles where at least 3 micro-ops are executed from any thread on physical core.", .ucode = 0x0200ull | (0x3 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_2", .udesc = "Cycles where at least 2 micro-ops are executed from any thread on physical core.", .ucode = 0x0200ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_1", .udesc = "Cycles where at least 1 micro-op is executed from any thread on physical core.", .ucode = 0x0200ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE", .udesc = "Number of uops executed on the core.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_GE_4", .udesc = "Cycles where at least 4 uops were executed per-thread", .ucode = 0x0100ull | (0x4 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_3", .udesc = "Cycles where at least 3 uops were executed per-thread", .ucode = 0x0100ull | (0x3 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = 
"CYCLES_GE_2", .udesc = "Cycles where at least 2 uops were executed per-thread", .ucode = 0x0100ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_1", .udesc = "Cycles where at least 1 uop was executed per-thread", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALL_CYCLES", .udesc = "Counts number of cycles no uops were dispatched to be executed on this thread.", .ucode = 0x0100ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "THREAD", .udesc = "Counts the number of uops to be executed per-thread each cycle.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_offcore_requests[]={ { .uname = "ALL_REQUESTS", .udesc = "Any memory transaction that reached the SQ.", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "L3_MISS_DEMAND_DATA_RD", .udesc = "Demand Data Read requests who miss L3 cache", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DATA_RD", .udesc = "Demand and prefetch data reads", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO", .udesc = "Demand RFO requests including regular RFOs, locks, ItoM", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD", .udesc = "Demand Data Read requests sent to uncore", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD", .udesc = "Counts cacheable and non-cacheable code reads to the core.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_dsb2mite_switches[]={ { .uname = "COUNT", .udesc = "DSB-to-MITE transitions count.", .ucode = 0x0200ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | 
_INTEL_X86_ATTR_E, }, { .uname = "PENALTY_CYCLES", .udesc = "DSB-to-MITE switch true penalty cycles.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_lsd[]={ { .uname = "CYCLES_OK", .udesc = "Cycles optimal number of Uops delivered by the LSD, but did not come from the decoder.", .ucode = 0x0100ull | (0x5 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_ACTIVE", .udesc = "Cycles Uops delivered by the LSD, but didn't come from the decoder.", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "UOPS", .udesc = "Number of Uops delivered by the LSD.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_exe_activity[]={ { .uname = "EXE_BOUND_0_PORTS", .udesc = "Cycles where no uops were executed, the Reservation Station was not empty, the Store Buffer was full and there was no outstanding load.", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BOUND_ON_STORES", .udesc = "Cycles where the Store Buffer was full and no loads caused an execution stall.", .ucode = 0x4000ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "4_PORTS_UTIL", .udesc = "Cycles total of 4 uops are executed on all ports and Reservation Station was not empty.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "3_PORTS_UTIL", .udesc = "Cycles total of 3 uops are executed on all ports and Reservation Station was not empty.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "2_PORTS_UTIL", .udesc = "Cycles total of 2 uops are executed on all ports and Reservation Station was not empty.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "1_PORTS_UTIL", .udesc = "Cycles total of 1 uop is executed on all ports and Reservation Station was not empty.", .ucode = 0x0200ull, .uflags = 
INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_cycle_activity[]={ { .uname = "STALLS_MEM_ANY", .udesc = "Execution stalls while memory subsystem has an outstanding load.", .ucode = 0x1400ull | (0x14 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_MEM_ANY", .udesc = "Cycles while memory subsystem has an outstanding load.", .ucode = 0x1000ull | (0x10 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L1D_MISS", .udesc = "Execution stalls while L1 cache miss demand load is outstanding.", .ucode = 0x0c00ull | (0xc << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_L1D_MISS", .udesc = "Cycles while L1 cache miss demand load is outstanding.", .ucode = 0x0800ull | (0x8 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L3_MISS", .udesc = "Execution stalls while L3 cache miss demand load is outstanding.", .ucode = 0x0600ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L2_MISS", .udesc = "Execution stalls while L2 cache miss demand load is outstanding.", .ucode = 0x0500ull | (0x5 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_TOTAL", .udesc = "Total execution stalls.", .ucode = 0x0400ull | (0x4 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_L3_MISS", .udesc = "Cycles while L3 cache miss demand load is outstanding.", .ucode = 0x0200ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_L2_MISS", .udesc = "Cycles while L2 cache miss demand load is outstanding.", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t 
intel_icl_resource_stalls[]={ { .uname = "SB", .udesc = "Cycles stalled due to no store buffers available. (not including draining from sync).", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCOREBOARD", .udesc = "Counts cycles where the pipeline is stalled due to serializing operations.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_uops_dispatched[]={ { .uname = "PORT_7_8", .udesc = "Number of uops executed on ports 7 and 8", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_6", .udesc = "Number of uops executed on port 6", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_5", .udesc = "Number of uops executed on port 5", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_4_9", .udesc = "Number of uops executed on ports 4 and 9", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_2_3", .udesc = "Number of uops executed on ports 2 and 3", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_1", .udesc = "Number of uops executed on port 1", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_0", .udesc = "Number of uops executed on port 0", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_idq_uops_not_delivered[]={ { .uname = "CYCLES_FE_WAS_OK", .udesc = "Cycles when optimal number of uops was delivered to the back-end when the back-end is not stalled", .ucode = 0x0100ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_0_UOPS_DELIV_CORE", .udesc = "Cycles when no uops are delivered by the IDQ when backend of the machine is not stalled", .ucode = 0x0100ull | (0x5 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE", .udesc = "Uops not delivered by IDQ when backend of the machine is not stalled", 
.ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_ild_stall[]={ { .uname = "LCP", .udesc = "Stalls caused by changing prefix length of the instruction.", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_icl_itlb_misses[]={ { .uname = "STLB_HIT", .udesc = "Instruction fetch requests that miss the ITLB and hit the STLB.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_ACTIVE", .udesc = "Cycles when at least one PMH is busy with a page walk for code (instruction fetch) request.", .ucode = 0x1000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "WALK_PENDING", .udesc = "Number of page walks outstanding for an outstanding code request in the PMH each cycle.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "Code miss in all TLB levels causes a page walk that completes. (All page sizes)", .ucode = 0x0e00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Code miss in all TLB levels causes a page walk that completes. (2M/4M)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Code miss in all TLB levels causes a page walk that completes. (4K)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_icache_64b[]={ { .uname = "IFTAG_STALL", .udesc = "Cycles where a code fetch is stalled due to L1 instruction cache tag miss.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IFTAG_MISS", .udesc = "Instruction fetch tag lookups that miss in the instruction cache (L1I). Counts at 64-byte cache-line granularity.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IFTAG_HIT", .udesc = "Instruction fetch tag lookups that hit in the instruction cache (L1I). 
Counts at 64-byte cache-line granularity.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_icache_16b[]={ { .uname = "IFDATA_STALL", .udesc = "Cycles where a code fetch is stalled due to L1 instruction cache miss.", .ucode = 0x0400ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_icl_idq[]={ { .uname = "MS_CYCLES_ANY", .udesc = "Cycles when uops are being delivered to IDQ while MS is busy", .ucode = 0x3000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MS_UOPS", .udesc = "Uops delivered to IDQ while MS is busy", .ucode = 0x3000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_SWITCHES", .udesc = "Number of switches from DSB or MITE to the MS", .ucode = 0x3000ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "DSB_CYCLES_ANY", .udesc = "Cycles Decode Stream Buffer (DSB) is delivering any Uop", .ucode = 0x0800ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DSB_CYCLES_OK", .udesc = "Cycles DSB is delivering optimal number of Uops", .ucode = 0x0800ull | (0x5 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DSB_UOPS", .udesc = "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MITE_CYCLES_ANY", .udesc = "Cycles MITE is delivering any Uop", .ucode = 0x0400ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MITE_CYCLES_OK", .udesc = "Cycles MITE is delivering optimal number of Uops", .ucode = 0x0400ull | (0x5 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MITE_UOPS", .udesc = "Uops delivered to Instruction Decode Queue (IDQ) from 
MITE path", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_rs_events[]={ { .uname = "EMPTY_END", .udesc = "Counts end of periods where the Reservation Station (RS) was empty.", .ucode = 0x0100ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "EMPTY_CYCLES", .udesc = "Cycles when Reservation Station (RS) is empty for the thread", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_tx_exec[]={ { .uname = "MISC3", .udesc = "Number of times an instruction execution caused the transactional nest count supported to be exceeded", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISC2", .udesc = "Counts the number of times a class of instructions that may cause a transactional abort was executed inside a transactional region", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_tx_mem[]={ { .uname = "ABORT_CAPACITY_READ", .udesc = "Speculatively counts the number of TSX aborts due to a data capacity limitation for transactional reads", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HLE_ELISION_BUFFER_FULL", .udesc = "Number of times HLE lock could not be elided due to ElisionBufferAvailable being zero.", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_ELISION_BUFFER_UNSUPPORTED_ALIGNMENT", .udesc = "Number of times an HLE transactional execution aborted due to an unsupported read alignment from the elision buffer.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_ELISION_BUFFER_MISMATCH", .udesc = "Number of times an HLE transactional execution aborted due to XRELEASE lock not satisfying the address and value requirements in the elision buffer", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"ABORT_HLE_ELISION_BUFFER_NOT_EMPTY", .udesc = "Number of times an HLE transactional execution aborted due to NoAllocatedElisionBuffer being non-zero.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_STORE_TO_ELIDED_LOCK", .udesc = "Number of times a HLE transactional region aborted due to a non XRELEASE prefixed instruction writing to an elided lock in the elision buffer", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_CAPACITY_WRITE", .udesc = "Speculatively counts the number of TSX aborts due to a data capacity limitation for transactional writes.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_CONFLICT", .udesc = "Number of times a transactional abort was signaled due to a data conflict on a transactionally accessed address", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_l1d[]={ { .uname = "REPLACEMENT", .udesc = "Counts the number of cache lines replaced in L1 data cache.", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_icl_load_hit_prefetch[]={ { .uname = "SWPF", .udesc = "Counts the number of demand load dispatches that hit L1D fill buffer (FB) allocated for software prefetch.", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_icl_dtlb_store_misses[]={ { .uname = "STLB_HIT", .udesc = "Stores that miss the DTLB and hit the STLB.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_ACTIVE", .udesc = "Cycles when at least one PMH is busy with a page walk for a store.", .ucode = 0x1000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "WALK_PENDING", .udesc = "Number of page walks outstanding for a store in the PMH each cycle.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "Store misses in all TLB levels causes a page walk that completes. 
(All page sizes)", .ucode = 0x0e00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Page walks completed due to a demand data store to a 2M/4M page.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Page walks completed due to a demand data store to a 4K page.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_l1d_pend_miss[]={ { .uname = "L2_STALL", .udesc = "Number of cycles a demand request has waited due to L1D due to lack of L2 resources.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FB_FULL_PERIODS", .udesc = "Number of phases a demand request has waited due to L1D Fill Buffer (FB) unavailability.", .ucode = 0x0200ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "FB_FULL", .udesc = "Number of cycles a demand request has waited due to L1D Fill Buffer (FB) unavailability.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PENDING_CYCLES", .udesc = "Cycles with L1D load Misses outstanding.", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "PENDING", .udesc = "Number of L1D misses that are outstanding", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_sw_prefetch_access[]={ { .uname = "PREFETCHW", .udesc = "Number of PREFETCHW instructions executed.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "T1_T2", .udesc = "Number of PREFETCHT1 or PREFETCHT2 instructions executed.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "T0", .udesc = "Number of PREFETCHT0 instructions executed.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NTA", .udesc = "Number of PREFETCHNTA instructions executed.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; 
static const intel_x86_umask_t intel_icl_longest_lat_cache[]={ { .uname = "MISS", .udesc = "Core-originated cacheable demand requests missed L3 (except hardware prefetches to L3).", .ucode = 0x4100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REFERENCES", .udesc = "Core-originated cacheable requests that refer to L3 (Except hardware prefetches to the L3).", .ucode = 0x4f00ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_core_power[]={ { .uname = "LVL2_TURBO_LICENSE", .udesc = "Core cycles where the core was running in a manner where Turbo may be clipped to the AVX512 turbo schedule.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LVL1_TURBO_LICENSE", .udesc = "Core cycles where the core was running in a manner where Turbo may be clipped to the AVX2 turbo schedule.", .ucode = 0x1800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LVL0_TURBO_LICENSE", .udesc = "Core cycles where the core was running in a manner where Turbo may be clipped to the Non-AVX turbo schedule.", .ucode = 0x0700ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_l2_rqsts[]={ { .uname = "ALL_DEMAND_REFERENCES", .udesc = "Demand requests to L2 cache", .ucode = 0xe700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_CODE_RD", .udesc = "L2 code requests", .ucode = 0xe400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_RFO", .udesc = "RFO requests to L2 cache", .ucode = 0xe200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_DATA_RD", .udesc = "Demand Data Read requests", .ucode = 0xe100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SWPF_HIT", .udesc = "SW prefetch requests that hit L2 cache. 
Accounts for PREFETCHNTA and PREFETCH0/1/2 instructions when FB is not full.", .ucode = 0xc800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_HIT", .udesc = "L2 cache hits when fetching instructions, code reads.", .ucode = 0xc400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT", .udesc = "RFO requests that hit L2 cache", .ucode = 0xc200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_HIT", .udesc = "Demand Data Read requests that hit L2 cache", .ucode = 0xc100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SWPF_MISS", .udesc = "SW prefetch requests that miss L2 cache. Accounts for PREFETCHNTA and PREFETCH0/1/2 instructions when FB is not full.", .ucode = 0x2800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_MISS", .udesc = "Demand requests that miss L2 cache", .ucode = 0x2700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_MISS", .udesc = "L2 cache misses when fetching instructions", .ucode = 0x2400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_MISS", .udesc = "RFO requests that miss L2 cache", .ucode = 0x2200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_MISS", .udesc = "Demand Data Read miss L2, no rejects", .ucode = 0x2100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_arith[]={ { .uname = "DIVIDER_ACTIVE", .udesc = "Cycles when divide unit is busy executing divide or square root operations.", .ucode = 0x0900ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_DFL, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_icl_uops_issued[]={ { .uname = "STALL_CYCLES", .udesc = "Cycles when RAT does not issue Uops to RS for the thread", .ucode = 0x0100ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "VECTOR_WIDTH_MISMATCH", .udesc = "Uops inserted at issue-stage in order to preserve upper bits of vector registers.", .ucode = 0x0200ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Uops that RAT issues to RS", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_int_misc[]={ { .uname = "CLEAR_RESTEER_CYCLES", .udesc = "Counts cycles after recovery from a branch misprediction or machine clear till the first uop is issued from the resteered path.", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UOP_DROPPING", .udesc = "TMA slots where uops got dropped", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_RECOVERY_CYCLES", .udesc = "Cycles the Backend cluster is recovering after a miss-speculation or a Store Buffer or Load Buffer drain stall.", .ucode = 0x0300ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "RECOVERY_CYCLES", .udesc = "Core cycles the allocator was stalled due to recovery from earlier clear event for this thread", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CLEARS_COUNT", .udesc = "Clears speculative count", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, }; static const intel_x86_umask_t intel_icl_dtlb_load_misses[]={ { .uname = "STLB_HIT", .udesc = "Loads that miss the DTLB and hit the STLB.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_ACTIVE", .udesc = "Cycles when at least one PMH is busy with a page walk for a demand load.", .ucode = 0x1000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "WALK_PENDING", .udesc = "Number of page walks outstanding for a demand load in the PMH each cycle.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "Load miss in all TLB levels causes a page walk that completes (All page sizes).", .ucode = 0x0e00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"WALK_COMPLETED_2M_4M", .udesc = "Page walks completed due to a demand data load to a 2M/4M page.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Page walks completed due to a demand data load to a 4K page.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_ld_blocks_partial[]={ { .uname = "ADDRESS_ALIAS", .udesc = "False dependencies in MOB due to partial compare on address.", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_icl_ld_blocks[]={ { .uname = "NO_SR", .udesc = "The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STORE_FORWARD", .udesc = "Loads blocked due to overlapping with a preceding store that cannot be forwarded.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_topdown[]={ { .uname = "BR_MISPREDICT_SLOTS", .udesc = "TMA slots wasted due to incorrect speculation by branch mispredictions", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BACKEND_BOUND_SLOTS", .udesc = "TMA slots where no uops were being issued due to lack of back-end resources.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOTS_P", .udesc = "TMA slots available for an unhalted logical processor. General counter - architectural event", .ucode = 0x01a4ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "SLOTS", .udesc = "TMA slots available for an unhalted logical processor. 
Fixed counter - architectural event", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_topdown_m[]={ { .uname = "BACKEND_BOUND", .udesc = "TMA slots where no uops were being issued due to lack of back-end resources", .ucode = 0x8300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "BAD_SPEC", .udesc = "TMA slots wasted due to incorrect speculations.", .ucode = 0x8100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "FRONTEND_BOUND", .udesc = "TMA slots where no uops were being issued due to lack of front-end resources.", .ucode = 0x8200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "RETIRING", .udesc = "TMA slots where instructions are retiring", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "SLOTS", .udesc = "TMA slots available for an unhalted logical processor. Fixed counter - architectural event", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_cpu_clk_unhalted[]={ { .uname = "DISTRIBUTED", .udesc = "Cycle counts are evenly distributed between active threads in the Core.", .ucode = 0x02ecull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "REF_DISTRIBUTED", .udesc = "Core crystal clock cycles. 
Cycle counts are evenly distributed between active threads in the Core.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ONE_THREAD_ACTIVE", .udesc = "Core crystal clock cycles when this thread is unhalted and the other thread is halted.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REF_XCLK", .udesc = "Core crystal clock cycles when the thread is unhalted.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "THREAD_P", .udesc = "Thread cycles when thread is not in halt state", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REF_TSC", .udesc = "Reference cycles when the core is not in halt state.", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, }; static const intel_x86_umask_t intel_icl_inst_retired[]={ { .uname = "STALL_CYCLES", .udesc = "Cycles without actually retired instructions.", .ucode = 0x0100ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "ANY_P", .udesc = "Number of instructions retired. General Counter - architectural event", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PREC_DIST", .udesc = "Precise instruction retired event with a reduced effect of PEBS shadow in IP distribution (Fixed counter 0 only. c, e, i, intx, intxcp modifiers not available)", .ucode = 0x0100ull, .ucntmsk = 0x100000000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE | INTEL_X86_FIXED | INTEL_X86_PEBS, /* * because this encoding is for a fixed counter, not all modifiers are available. Given that we do not have per umask modmsk, we use * the hardcoded modifiers field instead. We mark all unavailable modifiers as set (to 0) so the user cannot modify them */ .modhw = _INTEL_X86_ATTR_INTX | _INTEL_X86_ATTR_INTXCP | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_I, }, { .uname = "ANY", .udesc = "Number of instructions retired. 
Fixed Counter - architectural event (c, e, i, intx, intxcp modifiers not available)", .ucode = 0x0100ull, .ucntmsk = 0x100000000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE | INTEL_X86_FIXED, /* * because this encoding is for a fixed counter, not all modifiers are available. Given that we do not have per umask modmsk, we use * the hardcoded modifiers field instead. We mark all unavailable modifiers as set (to 0) so the user cannot modify them */ .modhw = _INTEL_X86_ATTR_INTX | _INTEL_X86_ATTR_INTXCP | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_E, }, { .uname = "NOP", .udesc = "Number of retired NOP instructions.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_icl_uops_decoded[]={ { .uname = "DEC0", .udesc = "Number of uops decoded out of instructions exclusively fetched by decoder 0", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_icl_mem_load_misc_retired[]={ { .uname = "UC", .udesc = "Retired instructions with at least 1 uncacheable load or Bus Lock.", .ucode = 0x0400ull, .uflags = INTEL_X86_DFL | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_icl_core_snoop_response[]={ { .uname = "MISS", .udesc = "Number of lines not found in snoop replies.", .ucode = 0x0100ull, }, { .uname = "I_HIT_FSE", .udesc = "Hit snoop reply without sending the data, line invalidated.", .ucode = 0x0200ull, }, { .uname = "S_HIT_FSE", .udesc = "Hit snoop reply without sending the data, line kept in Shared state.", .ucode = 0x0400ull, }, { .uname = "S_FWD_M", .udesc = "HitM snoop reply with data, line kept in Shared state", .ucode = 0x0800ull, }, { .uname = "I_FWD_M", .udesc = "HitM snoop reply with data, line invalidated.", .ucode = 0x1000ull, }, { .uname = "I_FWD_FE", .udesc = "Hit snoop reply with data, line invalidated.", .ucode = 0x2000ull, }, { .uname = "S_FWD_FE", .udesc = "Hit snoop reply with data, line kept in Shared state.", .ucode = 0x4000ull, }, }; static 
const intel_x86_umask_t intel_icl_offcore_requests_outstanding[]={ { .uname = "DEMAND_DATA_RD", .udesc = "For every cycle, increments by the number of outstanding demand data read requests pending.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD", .udesc = "For every cycle, increments by the number of outstanding code read requests pending.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_WITH_DEMAND_CODE_RD", .udesc = "Cycles with outstanding code read requests pending.", .ucode = 0x0200ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_WITH_DEMAND_RFO", .udesc = "Cycles where at least 1 outstanding Demand RFO request is pending.", .ucode = 0x0400ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ALL_DATA_RD", .udesc = "For every cycle, increments by the number of outstanding data read requests pending.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_WITH_DATA_RD", .udesc = "Cycles where at least 1 outstanding data read request is pending.", .ucode = 0x0800ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "L3_MISS_DEMAND_DATA_RD", .udesc = "For every cycle, increments by the number of demand data read requests pending that are known to have missed the L3 cache.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_WITH_L3_MISS_DEMAND_DATA_RD", .udesc = "Cycles where at least one demand data read request known to have missed the L3 cache is pending.", .ucode = 0x1000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "L3_MISS_DEMAND_DATA_RD_GE_6", .udesc = "Cycles where the core is waiting on at least 6 outstanding demand data read requests known to have missed the L3 cache.", .ucode = 0x1000ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = 
INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_icl_inst_decoded[]={ { .uname = "DECODERS", .udesc = "Number of decoders utilized in a cycle when the MITE (legacy decode pipeline) fetches instructions.", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_entry_t intel_icl_pe[]={ { .name = "UNHALTED_CORE_CYCLES", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x20000000full, .code = 0x3c, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles", .modmsk = INTEL_FIXED3_ATTRS, .cntmsk = 0x400000000ull, .code = 0x0300, /* pseudo encoding */ .flags = INTEL_X86_FIXED, }, { .name = "INSTRUCTION_RETIRED", .desc = "Number of instructions at retirement", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x1000000ffull, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "Number of instructions at retirement", .modmsk = INTEL_V2_ATTRS, .equiv = "INSTRUCTION_RETIRED", .cntmsk = 0x1000000ffull, .code = 0xc0, }, { .name = "SQ_MISC", .desc = "SuperQueue miscellaneous.", .code = 0x00f4, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_sq_misc), .umasks = intel_icl_sq_misc, }, { .name = "L2_LINES_OUT", .desc = "L2 lines evicted.", .code = 0x00f2, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_l2_lines_out), .umasks = intel_icl_l2_lines_out, }, { .name = "L2_LINES_IN", .desc = "L2 lines allocated.", .code = 0x00f1, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_l2_lines_in), .umasks = intel_icl_l2_lines_in, }, { .name = "L2_TRANS", .desc = "L2 transactions.", .code = 0x00f0, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_l2_trans), 
.umasks = intel_icl_l2_trans, }, { .name = "BACLEARS", .desc = "Branch re-steers.", .code = 0x00e6, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_baclears), .umasks = intel_icl_baclears, }, { .name = "MEM_LOAD_L3_HIT_RETIRED", .desc = "L3 hit load uops retired.", .code = 0x00d2, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_mem_load_l3_hit_retired), .umasks = intel_icl_mem_load_l3_hit_retired, }, { .name = "MEM_LOAD_RETIRED", .desc = "Retired load uops.", .code = 0x00d1, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_mem_load_retired), .umasks = intel_icl_mem_load_retired, }, { .name = "MEM_INST_RETIRED", .desc = "Memory instructions retired.", .code = 0x00d0, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_mem_inst_retired), .umasks = intel_icl_mem_inst_retired, }, { .name = "MEM_LOAD_L3_MISS_RETIRED", .desc = "Retired load instructions which data sources missed L3 but serviced from local dram", .code = 0x00d3, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_mem_load_l3_miss_retired), .umasks = intel_icl_mem_load_l3_miss_retired, }, { .name = "MEM_TRANS_RETIRED", .desc = "Memory transactions retired.", .code = 0x00cd, .modmsk = INTEL_V5_ATTRS | _INTEL_X86_ATTR_LDLAT, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_mem_trans_retired), .umasks = intel_icl_mem_trans_retired, }, { .name = "MISC_RETIRED", .desc = "Miscellaneous retired events.", .code = 0x00cc, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_misc_retired), .umasks = intel_icl_misc_retired, }, { .name = "RTM_RETIRED", .desc = "RTM (Restricted Transactional
Memory) execution.", .code = 0x00c9, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_rtm_retired), .umasks = intel_icl_rtm_retired, }, { .name = "HLE_RETIRED", .desc = "HLE (Hardware Lock Elision) execution.", .code = 0x00c8, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_hle_retired), .umasks = intel_icl_hle_retired, }, { .name = "FP_ARITH_INST_RETIRED", .desc = "Floating-point instructions retired.", .code = 0x00c7, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_fp_arith_inst_retired), .umasks = intel_icl_fp_arith_inst_retired, }, { .name = "FP_ARITH", .desc = "Floating-point instructions retired.", .equiv = "FP_ARITH_INST_RETIRED", .code = 0x00c7, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_fp_arith_inst_retired), .umasks = intel_icl_fp_arith_inst_retired, }, { .name = "FRONTEND_RETIRED", .desc = "Precise frontend retired events.", .code = 0x01c6, .modmsk = INTEL_SKL_FE_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_FRONTEND | INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_frontend_retired), .umasks = intel_icl_frontend_retired, }, { .name = "BR_MISP_RETIRED", .desc = "Mispredicted branch instructions retired.", .code = 0x00c5, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_br_misp_retired), .umasks = intel_icl_br_misp_retired, }, { .name = "BR_INST_RETIRED", .desc = "Branch instructions retired.", .code = 0x00c4, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_br_inst_retired), .umasks = intel_icl_br_inst_retired, }, { .name = "MACHINE_CLEARS", .desc = "Machine clear asserted.", .code = 0x00c3, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = 
INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_machine_clears), .umasks = intel_icl_machine_clears, }, { .name = "UOPS_RETIRED", .desc = "Retired uops.", .code = 0x00c2, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_uops_retired), .umasks = intel_icl_uops_retired, }, { .name = "ASSISTS", .desc = "Software assist.", .code = 0x00c1, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_assists), .umasks = intel_icl_assists, }, { .name = "TLB_FLUSH", .desc = "Data TLB flushes.", .code = 0x00bd, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_tlb_flush), .umasks = intel_icl_tlb_flush, }, { .name = "UOPS_EXECUTED", .desc = "Uops executed.", .code = 0x00b1, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_uops_executed), .umasks = intel_icl_uops_executed, }, { .name = "OFFCORE_REQUESTS", .desc = "Requests sent to uncore.", .code = 0x00b0, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_offcore_requests), .umasks = intel_icl_offcore_requests, }, { .name = "DSB2MITE_SWITCHES", .desc = "Number of DSB to MITE switches.", .code = 0x00ab, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_dsb2mite_switches), .umasks = intel_icl_dsb2mite_switches, }, { .name = "LSD", .desc = "LSD (Loop stream detector) operations.", .code = 0x00a8, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_lsd), .umasks = intel_icl_lsd, }, { .name = "EXE_ACTIVITY", .desc = "Execution activity.", .code = 0x00a6, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks=
LIBPFM_ARRAY_SIZE(intel_icl_exe_activity), .umasks = intel_icl_exe_activity, }, { .name = "CYCLE_ACTIVITY", .desc = "Stalled cycles.", .code = 0x00a3, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_cycle_activity), .umasks = intel_icl_cycle_activity, }, { .name = "RESOURCE_STALLS", .desc = "Cycles where Allocation is stalled due to Resource Related reasons.", .code = 0x00a2, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_resource_stalls), .umasks = intel_icl_resource_stalls, }, { .name = "UOPS_DISPATCHED", .desc = "Uops dispatched to specific ports", .code = 0x00a1, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_uops_dispatched), .umasks = intel_icl_uops_dispatched, }, { .name = "UOPS_DISPATCHED_PORT", .desc = "Uops dispatched to specific ports", .equiv = "UOPS_DISPATCHED", .code = 0x00a1, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V5_ATTRS, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_uops_dispatched), .umasks = intel_icl_uops_dispatched, .flags = INTEL_X86_SPEC, }, { .name = "IDQ_UOPS_NOT_DELIVERED", .desc = "Uops not delivered.", .code = 0x009c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_idq_uops_not_delivered), .umasks = intel_icl_idq_uops_not_delivered, }, { .name = "ILD_STALL", .desc = "ILD (Instruction Length Decoder) stalls.", .code = 0x0087, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_ild_stall), .umasks = intel_icl_ild_stall, }, { .name = "ITLB_MISSES", .desc = "Instruction TLB misses.", .code = 0x0085, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_itlb_misses), .umasks = intel_icl_itlb_misses, }, { .name = "ICACHE_64B", .desc = 
"Instruction Cache.", .code = 0x0083, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_icache_64b), .umasks = intel_icl_icache_64b, }, { .name = "ICACHE_16B", .desc = "Instruction Cache.", .code = 0x0080, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_icache_16b), .umasks = intel_icl_icache_16b, }, { .name = "IDQ", .desc = "IDQ (Instruction Decoded Queue) operations", .code = 0x0079, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_idq), .umasks = intel_icl_idq, }, { .name = "OFFCORE_REQUESTS_OUTSTANDING", .desc = "Outstanding offcore requests.", .code = 0x0060, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_offcore_requests_outstanding), .umasks = intel_icl_offcore_requests_outstanding, }, { .name = "RS_EVENTS", .desc = "Reservation Station.", .code = 0x005e, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_rs_events), .umasks = intel_icl_rs_events, }, { .name = "TX_EXEC", .desc = "Transactional execution.", .code = 0x005d, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_tx_exec), .umasks = intel_icl_tx_exec, }, { .name = "TX_MEM", .desc = "Transactional memory.", .code = 0x0054, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_tx_mem), .umasks = intel_icl_tx_mem, }, { .name = "L1D", .desc = "L1D cache.", .code = 0x0051, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_l1d), .umasks = intel_icl_l1d, }, { .name = "LOAD_HIT_PREFETCH", .desc = "Load dispatches.", .code = 0x004c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 
0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_load_hit_prefetch), .umasks = intel_icl_load_hit_prefetch, }, { .name = "LOAD_HIT_PRE", .desc = "Load dispatches.", .equiv = "LOAD_HIT_PREFETCH", .code = 0x004c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_load_hit_prefetch), .umasks = intel_icl_load_hit_prefetch, }, { .name = "DTLB_STORE_MISSES", .desc = "Data TLB store misses.", .code = 0x0049, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_dtlb_store_misses), .umasks = intel_icl_dtlb_store_misses, }, { .name = "L1D_PEND_MISS", .desc = "L1D pending misses.", .code = 0x0048, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_l1d_pend_miss), .umasks = intel_icl_l1d_pend_miss, }, { .name = "SW_PREFETCH_ACCESS", .desc = "Software prefetches.", .code = 0x0032, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_sw_prefetch_access), .umasks = intel_icl_sw_prefetch_access, }, { .name = "SW_PREFETCH", .desc = "Software prefetches.", .equiv = "SW_PREFETCH_ACCESS", .code = 0x0032, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_sw_prefetch_access), .umasks = intel_icl_sw_prefetch_access, }, { .name = "LONGEST_LAT_CACHE", .desc = "L3 cache.", .code = 0x002e, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_longest_lat_cache), .umasks = intel_icl_longest_lat_cache, }, { .name = "CORE_POWER", .desc = "Core power cycles.", .code = 0x0028, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_core_power), .umasks = intel_icl_core_power, }, { .name = "L2_RQSTS", .desc = "L2 requests.",
.code = 0x0024, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_l2_rqsts), .umasks = intel_icl_l2_rqsts, }, { .name = "ARITH", .desc = "Arithmetic uops.", .code = 0x0014, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_arith), .umasks = intel_icl_arith, }, { .name = "UOPS_ISSUED", .desc = "Uops issued.", .code = 0x000e, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_uops_issued), .umasks = intel_icl_uops_issued, }, { .name = "INT_MISC", .desc = "Miscellaneous interruptions.", .code = 0x000d, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_int_misc), .umasks = intel_icl_int_misc, }, { .name = "DTLB_LOAD_MISSES", .desc = "Data TLB load misses.", .code = 0x0008, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_dtlb_load_misses), .umasks = intel_icl_dtlb_load_misses, }, { .name = "LD_BLOCKS_PARTIAL", .desc = "Partial load blocks.", .code = 0x0007, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_ld_blocks_partial), .umasks = intel_icl_ld_blocks_partial, }, { .name = "LD_BLOCKS", .desc = "Blocking loads.", .code = 0x0003, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_ld_blocks), .umasks = intel_icl_ld_blocks, }, { .name = "TOPDOWN", .desc = "TMA slots available for an unhalted logical processor.", .code = 0x0000, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0x800000000ull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_topdown), .umasks = intel_icl_topdown, }, { .name = "TOPDOWN_M", .desc = "Topdown events via PERF_METRICS MSR (Linux only). 
All events must be in a Linux perf_events group and SLOTS must be the first event for the kernel to program the events onto the PERF_METRICS MSR. Only SLOTS umask accepts modifiers", .cntmsk = 0x1000000000ull, .modmsk = INTEL_FIXED2_ATTRS, .ngrp = 1, .flags = INTEL_X86_SPEC | INTEL_X86_FIXED, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_topdown_m), .umasks = intel_icl_topdown_m, }, { .name = "CPU_CLK_UNHALTED", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted).", .code = 0x003c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0x200000000ull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_cpu_clk_unhalted), .umasks = intel_icl_cpu_clk_unhalted, }, { .name = "INST_RETIRED", .desc = "Number of instructions retired", .code = 0xc0, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_inst_retired), .umasks = intel_icl_inst_retired, }, { .name = "UOPS_DECODED", .desc = "Number of uops decoded", .code = 0x56, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_uops_decoded), .umasks = intel_icl_uops_decoded, }, { .name = "MEM_LOAD_MISC_RETIRED", .desc = "Miscellaneous loads retired", .code = 0xd4, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_mem_load_misc_retired), .umasks = intel_icl_mem_load_misc_retired, }, { .name = "INST_DECODED", .desc = "Instruction decoders", .code = 0x55, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_inst_decoded), .umasks = intel_icl_inst_decoded, }, { .name = "OFFCORE_RESPONSE_0", .desc = "Offcore response event", .code = 0x01b7, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC | INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_icx_ocr), .umasks = intel_icx_ocr,
.model = PFM_PMU_INTEL_ICX, }, { .name = "OFFCORE_RESPONSE_0", .desc = "Offcore response event", .code = 0x01b7, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC | INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_ocr), .umasks = intel_icl_ocr, .model = PFM_PMU_INTEL_ICL, }, { .name = "OFFCORE_RESPONSE_1", .desc = "Offcore response event", .code = 0x01bb, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC | INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_icx_ocr), .umasks = intel_icx_ocr, .model = PFM_PMU_INTEL_ICX, }, { .name = "OFFCORE_RESPONSE_1", .desc = "Offcore response event", .code = 0x01bb, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC | INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_ocr), .umasks = intel_icl_ocr, .model = PFM_PMU_INTEL_ICL, }, { .name = "OCR", .desc = "Offcore response event", .equiv = "OFFCORE_RESPONSE_0", .code = 0x01b7, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC | INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_icx_ocr), .umasks = intel_icx_ocr, .model = PFM_PMU_INTEL_ICX, }, { .name = "OCR", .desc = "Offcore response event", .equiv = "OFFCORE_RESPONSE_0", .code = 0x01b7, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC | INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_icl_ocr), .model = PFM_PMU_INTEL_ICL, .umasks = intel_icl_ocr, }, }; /* 56 events available */ papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_icx_unc_cha_events.h000066400000000000000000005306541502707512200252460ustar00rootroot00000000000000/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, 
sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: icx_unc_cha (IcelakeX Uncore CHA) * Based on Intel JSON event table version : 1.21 * Based on Intel JSON event table published : 06/06/2023 */ static const intel_x86_umask_t icx_unc_cha_2lm_nm_setconflicts[]={ { .uname = "LLC", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TOR", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_2lm_nm_setconflicts2[]={ { .uname = "MEMWR", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEMWRNI", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_ag0_ad_crd_occupancy0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For 
Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_ag0_ad_crd_occupancy1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_ag0_bl_crd_occupancy0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", 
.ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_ag0_bl_crd_occupancy1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_ag1_ad_crd_occupancy0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_ag1_ad_crd_occupancy1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_ag1_bl_crd_acquired0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 
(experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_ag1_bl_crd_occupancy1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_bypass_cha_imc[]={ { .uname = "INTERMEDIATE", .udesc = "Intermediate bypass Taken (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NOT_TAKEN", .udesc = "Not Taken (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN", .udesc = "Taken (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_core_snp[]={ { .uname = "ANY_GTONE", .udesc = "Any Cycle with Multiple Snoops (experimental)", .ucode = 0xf200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY_ONE", .udesc = "Any Single Snoop (experimental)", .ucode = 0xf100ull, .uflags = INTEL_X86_NCOMBO, 
}, { .uname = "CORE_GTONE", .udesc = "Multiple Core Requests (experimental)", .ucode = 0x4200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORE_ONE", .udesc = "Single Core Requests (experimental)", .ucode = 0x4100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EVICT_GTONE", .udesc = "Multiple Eviction (experimental)", .ucode = 0x8200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EVICT_ONE", .udesc = "Single Eviction (experimental)", .ucode = 0x8100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EXT_GTONE", .udesc = "Multiple External Snoops (experimental)", .ucode = 0x2200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EXT_ONE", .udesc = "Single External Snoops (experimental)", .ucode = 0x2100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_GTONE", .udesc = "Multiple Snoop Targets from Remote (experimental)", .ucode = 0x1200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_ONE", .udesc = "Single Snoop Target from Remote (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_direct_go[]={ { .uname = "HA_SUPPRESS_DRD", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HA_SUPPRESS_NO_D2C", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HA_TOR_DEALLOC", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_direct_go_opc[]={ { .uname = "EXTCMP", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAST_GO", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAST_GO_PULL", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "GO", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "GO_PULL", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { 
.uname = "IDLE_DUE_SUPPRESS", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NOP", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PULL", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_dir_lookup[]={ { .uname = "NO_SNP", .udesc = "Snoop Not Needed (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP", .udesc = "Snoop Needed (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_dir_update[]={ { .uname = "HA", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TOR", .udesc = "TBD", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_distress_asserted[]={ { .uname = "DPT_LOCAL", .udesc = "DPT Local (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DPT_NONLOCAL", .udesc = "DPT Remote (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DPT_STALL_IV", .udesc = "DPT Stalled - IV (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DPT_STALL_NOCRD", .udesc = "DPT Stalled - No Credit (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HORZ", .udesc = "Horizontal (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_LOCAL", .udesc = "PMM Local (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_NONLOCAL", .udesc = "PMM Remote (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VERT", .udesc = "Vertical (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_egress_ordering[]={ { .uname = "IV_SNOOPGO_DN", .udesc = "Down (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { 
.uname = "IV_SNOOPGO_UP", .udesc = "Up (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_hitme_hit[]={ { .uname = "EX_RDS", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SHARED_OWNREQ", .udesc = "Remote socket ownership read requests that hit in S state. (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WBMTOE", .udesc = "Remote socket WBMtoE requests (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WBMTOI_OR_S", .udesc = "Remote socket writeback to I or S requests (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_hitme_lookup[]={ { .uname = "READ", .udesc = "Remote socket read requests (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITE", .udesc = "Remote socket write (i.e. writeback) requests (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_hitme_miss[]={ { .uname = "NOTSHARED_RDINVOWN", .udesc = "Remote socket RdInvOwn requests that are not to shared line (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READ_OR_INV", .udesc = "Remote socket read or invalidate requests (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SHARED_RDINVOWN", .udesc = "Remote socket RdInvOwn requests to shared line (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_hitme_update[]={ { .uname = "DEALLOCATE", .udesc = "Deallocate HitME$ on Reads without RspFwdI* (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEALLOCATE_RSPFWDI_LOC", .udesc = "op is RspIFwd or RspIFwdWb for a local request (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RDINVOWN", .udesc = "Update HitMe Cache 
on RdInvOwn even if not RspFwdI* (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPFWDI_REM", .udesc = "op is RspIFwd or RspIFwdWb for a remote request (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SHARED", .udesc = "Update HitMe Cache to SHARed (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_horz_ring_akc_in_use[]={ { .uname = "LEFT_EVEN", .udesc = "Left and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LEFT_ODD", .udesc = "Left and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_EVEN", .udesc = "Right and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_ODD", .udesc = "Right and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_horz_ring_bl_in_use[]={ { .uname = "LEFT_EVEN", .udesc = "Left and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LEFT_ODD", .udesc = "Left and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_EVEN", .udesc = "Right and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_ODD", .udesc = "Right and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_horz_ring_iv_in_use[]={ { .uname = "LEFT", .udesc = "Left (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT", .udesc = "Right (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_imc_reads_count[]={ { .uname = "NORMAL", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRIORITY", .udesc = "ISOCH (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static 
const intel_x86_umask_t icx_unc_cha_imc_writes_count[]={ { .uname = "FULL", .udesc = "Full Line Non-ISOCH", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FULL_PRIORITY", .udesc = "ISOCH Full Line (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PARTIAL", .udesc = "Partial Non-ISOCH (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PARTIAL_PRIORITY", .udesc = "ISOCH Partial (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_llc_lookup[]={ { .uname = "ALL", .udesc = "TBD (experimental)", .ucode = 0x1fff0000ff00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL_REMOTE", .udesc = "All transactions from Remote Agents (experimental)", .ucode = 0x1e200000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY_F", .udesc = "All Request Filter (experimental)", .ucode = 0x2000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE", .udesc = "TBD (experimental)", .ucode = 0x1bd00000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_LOCAL", .udesc = "TBD (experimental)", .ucode = 0x19d00000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_READ", .udesc = "Code Reads (experimental)", .ucode = 0x1bd00000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_READ_F", .udesc = "CRd Request Filter (experimental)", .ucode = 0x1000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_READ_LOCAL", .udesc = "CRd Requests that come from the local socket (usually the core) (experimental)", .ucode = 0x19d00000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_READ_MISS", .udesc = "Code Read Misses (experimental)", .ucode = 0x1bd000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_READ_REMOTE", .udesc = "CRd Requests that come from a Remote socket. 
(experimental)", .ucode = 0x1a100000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_REMOTE", .udesc = "TBD (experimental)", .ucode = 0x1a100000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "COREPREF_OR_DMND_LOCAL_F", .udesc = "Local request Filter (experimental)", .ucode = 0x4000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA_RD", .udesc = "TBD (experimental)", .ucode = 0x1bc10000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA_READ", .udesc = "TBD", .ucode = 0x1bc10000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA_READ_ALL", .udesc = "TBD (experimental)", .ucode = 0x1fc10000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA_READ_F", .udesc = "Data Read Request Filter (experimental)", .ucode = 0x100000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA_READ_LOCAL", .udesc = "TBD (experimental)", .ucode = 0x19c10000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA_READ_MISS", .udesc = "Data Read Misses (experimental)", .ucode = 0x1bc100000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA_READ_REMOTE", .udesc = "TBD (experimental)", .ucode = 0x1a010000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DMND_READ_LOCAL", .udesc = "TBD (experimental)", .ucode = 0x8410000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "E", .udesc = "E State (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "F", .udesc = "F State (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLUSH_INV", .udesc = "Flush or Invalidate Requests (experimental)", .ucode = 0x1a440000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLUSH_INV_LOCAL", .udesc = "Flush or Invalidate Requests that come from the local socket (usually the core) (experimental)", .ucode = 0x18440000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLUSH_INV_REMOTE", .udesc = "Flush or Invalidate requests that come from a Remote socket. 
(experimental)", .ucode = 0x1a040000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLUSH_OR_INV_F", .udesc = "Flush or Invalidate Filter (experimental)", .ucode = 0x400000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "I", .udesc = "I State (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLCPREF_LOCAL", .udesc = "TBD (experimental)", .ucode = 0x189d0000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLCPREF_LOCAL_F", .udesc = "Local LLC prefetch requests (from LLC) Filter (experimental)", .ucode = 0x8000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLC_PF_LOCAL", .udesc = "TBD (experimental)", .ucode = 0x189d0000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCALLY_HOMED_ADDRESS", .udesc = "TBD (experimental)", .ucode = 0xbdf0000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_F", .udesc = "Transactions homed locally Filter (experimental)", .ucode = 0x80000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOC_HOM", .udesc = "Transactions homed locally (experimental)", .ucode = 0xbdf0000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M", .udesc = "M State (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_ALL", .udesc = "All Misses (experimental)", .ucode = 0x1fe000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OTHER_REQ_F", .udesc = "Write Request Filter (experimental)", .ucode = 0x200000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PREF_OR_DMND_REMOTE_F", .udesc = "Remote non-snoop request Filter (experimental)", .ucode = 0x20000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READ", .udesc = "Reads (experimental)", .ucode = 0x1bd90000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READ_LOCAL_LOC_HOM", .udesc = "Locally Requested Reads that are Locally HOMed (experimental)", .ucode = 0x9d90000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READ_LOCAL_REM_HOM", .udesc = "Locally Requested Reads that are Remotely HOMed 
(experimental)", .ucode = 0x11d90000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READ_MISS", .udesc = "Read Misses (experimental)", .ucode = 0x1bd900000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READ_MISS_LOC_HOM", .udesc = "Locally HOMed Read Misses (experimental)", .ucode = 0xbd900000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READ_MISS_REM_HOM", .udesc = "Remotely HOMed Read Misses (experimental)", .ucode = 0x13d900000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READ_OR_SNOOP_REMOTE_MISS_REM_HOM", .udesc = "Remotely requested Read or Snoop Misses that are Remotely HOMed (experimental)", .ucode = 0x161900000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READ_REMOTE_LOC_HOM", .udesc = "Remotely Requested Reads that are Locally HOMed (experimental)", .ucode = 0xa190000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READ_SF_HIT", .udesc = "Reads that Hit the Snoop Filter (experimental)", .ucode = 0x1bd900000e00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTELY_HOMED_ADDRESS", .udesc = "TBD (experimental)", .ucode = 0x15df0000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_F", .udesc = "Transactions homed remotely Filter (experimental)", .ucode = 0x100000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_SNOOP_F", .udesc = "Remote snoop request Filter (experimental)", .ucode = 0x40000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_SNP", .udesc = "TBD (experimental)", .ucode = 0x1c190000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REM_HOM", .udesc = "Transactions homed remotely (experimental)", .ucode = 0x15df0000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO", .udesc = "RFO Requests (experimental)", .ucode = 0x1bc80000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_F", .udesc = "RFO Request Filter (experimental)", .ucode = 0x800000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_LOCAL", .udesc = "RFO Requests that come from the local socket (usually the core) 
(experimental)", .ucode = 0x19c80000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_MISS", .udesc = "RFO Misses (experimental)", .ucode = 0x1bc800000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_PREF_LOCAL", .udesc = "TBD (experimental)", .ucode = 0x8880000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_REMOTE", .udesc = "RFO Requests that come from a Remote socket. (experimental)", .ucode = 0x1a080000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S", .udesc = "S State (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF_E", .udesc = "SnoopFilter - E State (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF_H", .udesc = "SnoopFilter - H State (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF_S", .udesc = "SnoopFilter - S State (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITES_AND_OTHER", .udesc = "Filters Requests for those that write info into the cache (experimental)", .ucode = 0x1a420000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITE_LOCAL", .udesc = "TBD (experimental)", .ucode = 0x8420000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITE_REMOTE", .udesc = "TBD (experimental)", .ucode = 0x17c20000ff00ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_llc_victims[]={ { .uname = "ALL", .udesc = "All Lines Victimized", .ucode = 0x0f00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "E_STATE", .udesc = "Lines in E state (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_ALL", .udesc = "Local - All Lines (experimental)", .ucode = 0x2000000f00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_E", .udesc = "Local - Lines in E State (experimental)", .ucode = 0x2000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_M", .udesc = "Local - Lines in M State (experimental)", .ucode = 0x2000000100ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "LOCAL_ONLY", .udesc = "Local Only (experimental)", .ucode = 0x2000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_S", .udesc = "Local - Lines in S State (experimental)", .ucode = 0x2000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M_STATE", .udesc = "Lines in M state (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_ALL", .udesc = "Remote - All Lines (experimental)", .ucode = 0x8000000f00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_E", .udesc = "Remote - Lines in E State (experimental)", .ucode = 0x8000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_M", .udesc = "Remote - Lines in M State (experimental)", .ucode = 0x8000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_ONLY", .udesc = "Remote Only (experimental)", .ucode = 0x8000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_S", .udesc = "Remote - Lines in S State (experimental)", .ucode = 0x8000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S_STATE", .udesc = "Lines in S State (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_misc[]={ { .uname = "CV0_PREF_MISS", .udesc = "CV0 Prefetch Miss (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CV0_PREF_VIC", .udesc = "CV0 Prefetch Victim (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT_S", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPI_WAS_FSE", .udesc = "Silent Snoop Eviction (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WC_ALIASING", .udesc = "Write Combining Aliasing (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_misc_external[]={ { .uname = "MBE_INST0", .udesc = "Number of cycles MBE is high for MS2IDI0 (experimental)", .ucode = 0x0100ull, 
.uflags = INTEL_X86_NCOMBO, }, { .uname = "MBE_INST1", .udesc = "Number of cycles MBE is high for MS2IDI1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_osb[]={ { .uname = "LOCAL_INVITOE", .udesc = "Local InvItoE (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_READ", .udesc = "Local Rd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OFF_PWRHEURISTIC", .udesc = "Off (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_READ", .udesc = "Remote Rd (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_READINVITOE", .udesc = "Remote Rd InvItoE (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_HITS_SNP_BCAST", .udesc = "RFO HitS Snoop Broadcast (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_pmm_memmode_nm_invitox[]={ { .uname = "LOCAL", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SETCONFLICT", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_pmm_memmode_nm_setconflicts[]={ { .uname = "LLC", .udesc = "Counts the number of times CHA saw NM Set conflict in SF/LLC (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF", .udesc = "Counts the number of times CHA saw NM Set conflict in SF/LLC (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TOR", .udesc = "Counts the number of times CHA saw NM Set conflict in TOR (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_pmm_memmode_nm_setconflicts2[]={ { .uname = "IODC", .udesc = "TBD 
(experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEMWR", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEMWRNI", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_pmm_qos[]={ { .uname = "DDR4_FAST_INSERT", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REJ_IRQ", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOWTORQ_SKIP", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_INSERT", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "THROTTLE", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "THROTTLE_IRQ", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "THROTTLE_PRQ", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_pmm_qos_occupancy[]={ { .uname = "DDR_FAST_FIFO", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DDR_SLOW_FIFO", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_requests[]={ { .uname = "INVITOE", .udesc = "TBD (experimental)", .ucode = 0x3000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "INVITOE_LOCAL", .udesc = "TBD", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "INVITOE_REMOTE", .udesc = "TBD", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS", .udesc = "TBD", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_LOCAL", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_REMOTE", .udesc = "TBD", .ucode = 0x0200ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "WRITES", .udesc = "TBD", .ucode = 0x0c00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITES_LOCAL", .udesc = "TBD", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITES_REMOTE", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_ring_bounces_horz[]={ { .uname = "AD", .udesc = "AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "BL (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_ring_sink_starved_horz[]={ { .uname = "AD", .udesc = "AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "Acknowledgements to Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "BL (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_ring_sink_starved_vert[]={ { .uname = "AD", .udesc = "AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "Acknowledgements to core (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "Data Responses to core (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "Snoops of processor's cache. 
(experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_rxc_inserts[]={ { .uname = "IPQ", .udesc = "IPQ (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ", .udesc = "IRQ (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ_REJ", .udesc = "IRQ Rejected (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ", .udesc = "PRQ (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ_REJ", .udesc = "PRQ (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RRQ", .udesc = "RRQ (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WBQ", .udesc = "WBQ (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_rxc_irq0_reject[]={ { .uname = "AD_REQ_VN0", .udesc = "AD REQ on VN0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP_VN0", .udesc = "AD RSP on VN0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_NON_UPI", .udesc = "Non UPI AK Request (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB_VN0", .udesc = "BL NCB on VN0 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS_VN0", .udesc = "BL NCS on VN0 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP_VN0", .udesc = "BL RSP on VN0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB_VN0", .udesc = "BL WB on VN0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_NON_UPI", .udesc = "Non UPI IV Request (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_rxc_irq1_reject[]={ { .uname = "ALLOW_SNP", .udesc = "Allow Snoop (experimental)", .ucode = 
0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY0", .udesc = "ANY0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HA", .udesc = "HA (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLC_OR_SF_WAY", .udesc = "LLC or SF Way (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLC_VICTIM", .udesc = "LLC Victim (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PA_MATCH", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF_VICTIM", .udesc = "SF Victim (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VICTIM", .udesc = "Victim (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_rxc_ismq0_retry[]={ { .uname = "AD_REQ_VN0", .udesc = "AD REQ on VN0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP_VN0", .udesc = "AD RSP on VN0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_NON_UPI", .udesc = "Non UPI AK Request (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB_VN0", .udesc = "BL NCB on VN0 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS_VN0", .udesc = "BL NCS on VN0 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP_VN0", .udesc = "BL RSP on VN0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB_VN0", .udesc = "BL WB on VN0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_NON_UPI", .udesc = "Non UPI IV Request (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_rxc_ismq1_retry[]={ { .uname = "ANY0", .udesc = "ANY0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HA", 
.udesc = "HA (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_rxc_occupancy[]={ { .uname = "IPQ", .udesc = "IPQ (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ", .udesc = "IRQ (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RRQ", .udesc = "RRQ (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WBQ", .udesc = "WBQ (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_rxc_other1_retry[]={ { .uname = "ALLOW_SNP", .udesc = "Allow Snoop (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY0", .udesc = "ANY0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HA", .udesc = "HA (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLC_OR_SF_WAY", .udesc = "LLC OR SF Way (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLC_VICTIM", .udesc = "LLC Victim (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PA_MATCH", .udesc = "PhyAddr Match (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF_VICTIM", .udesc = "SF Victim (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VICTIM", .udesc = "Victim (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_rxc_prq0_reject[]={ { .uname = "AD_REQ_VN0", .udesc = "AD REQ on VN0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP_VN0", .udesc = "AD RSP on VN0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_NON_UPI", .udesc = "Non UPI AK Request (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB_VN0", .udesc = "BL NCB on VN0 (experimental)", .ucode = 
0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS_VN0", .udesc = "BL NCS on VN0 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP_VN0", .udesc = "BL RSP on VN0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB_VN0", .udesc = "BL WB on VN0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_NON_UPI", .udesc = "Non UPI IV Request (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_rxc_req_q1_retry[]={ { .uname = "ALLOW_SNP", .udesc = "Allow Snoop (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY0", .udesc = "ANY0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HA", .udesc = "HA (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLC_OR_SF_WAY", .udesc = "LLC OR SF Way (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLC_VICTIM", .udesc = "LLC Victim (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PA_MATCH", .udesc = "PhyAddr Match (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF_VICTIM", .udesc = "SF Victim (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VICTIM", .udesc = "Victim (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_rxc_rrq0_reject[]={ { .uname = "AD_REQ_VN0", .udesc = "AD REQ on VN0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP_VN0", .udesc = "AD RSP on VN0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_NON_UPI", .udesc = "Non UPI AK Request (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB_VN0", .udesc = "BL NCB on VN0 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, 
{ .uname = "BL_NCS_VN0", .udesc = "BL NCS on VN0 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP_VN0", .udesc = "BL RSP on VN0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB_VN0", .udesc = "BL WB on VN0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_NON_UPI", .udesc = "Non UPI IV Request (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_rxc_wbq0_reject[]={ { .uname = "AD_REQ_VN0", .udesc = "AD REQ on VN0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP_VN0", .udesc = "AD RSP on VN0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_NON_UPI", .udesc = "Non UPI AK Request (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB_VN0", .udesc = "BL NCB on VN0 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS_VN0", .udesc = "BL NCS on VN0 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP_VN0", .udesc = "BL RSP on VN0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB_VN0", .udesc = "BL WB on VN0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_NON_UPI", .udesc = "Non UPI IV Request (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_rxc_wbq1_reject[]={ { .uname = "ALLOW_SNP", .udesc = "Allow Snoop (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY0", .udesc = "ANY0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HA", .udesc = "HA (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLC_OR_SF_WAY", .udesc = "LLC OR SF Way (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"LLC_VICTIM", .udesc = "LLC Victim (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PA_MATCH", .udesc = "PhyAddr Match (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF_VICTIM", .udesc = "SF Victim (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VICTIM", .udesc = "Victim (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_rxr_crd_starved[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IFV", .udesc = "IFV - Credited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_rxr_inserts[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_rxr_occupancy[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_sf_eviction[]={ { .uname = "E_STATE", .udesc = "TBD", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M_STATE", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S_STATE", .udesc = "TBD", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_snoops_sent[]={ { .uname 
= "ALL", .udesc = "All (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BCST_LOCAL", .udesc = "Broadcast snoops for Local Requests (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BCST_REMOTE", .udesc = "Broadcast snoops for Remote Requests (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DIRECT_LOCAL", .udesc = "Directed snoops for Local Requests (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DIRECT_REMOTE", .udesc = "Directed snoops for Remote Requests (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL", .udesc = "Snoops sent for Local Requests (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE", .udesc = "Snoops sent for Remote Requests (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_snoop_resp[]={ { .uname = "RSPCNFLCT", .udesc = "RSPCNFLCT* (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPFWD", .udesc = "RspFwd (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPFWDWB", .udesc = "Rsp*Fwd*WB (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPI", .udesc = "RspI (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPIFWD", .udesc = "RspIFwd (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPS", .udesc = "RspS (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPSFWD", .udesc = "RspSFwd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPWB", .udesc = "Rsp*WB (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_snoop_resp_local[]={ { .uname = "RSPCNFLCT", .udesc = "RspCnflct (experimental)", .ucode = 0x4000ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "RSPFWD", .udesc = "RspFwd (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPFWDWB", .udesc = "Rsp*FWD*WB (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPI", .udesc = "RspI (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPIFWD", .udesc = "RspIFwd (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPS", .udesc = "RspS (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPSFWD", .udesc = "RspSFwd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPWB", .udesc = "Rsp*WB (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_snoop_rsp_misc[]={ { .uname = "MTOI_RSPDATAM", .udesc = "MtoI RspIDataM (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MTOI_RSPIFWDM", .udesc = "MtoI RspIFwdM (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PULLDATAPTL_HITLLC", .udesc = "Pull Data Partial - Hit LLC (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PULLDATAPTL_HITSF", .udesc = "Pull Data Partial - Hit SF (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPIFWDMPTL_HITLLC", .udesc = "RspIFwdPtl Hit LLC (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPIFWDMPTL_HITSF", .udesc = "RspIFwdPtl Hit SF (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_stall0_no_txr_horz_crd_ad_ag0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, 
.uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_stall0_no_txr_horz_crd_bl_ag0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_stall0_no_txr_horz_crd_bl_ag1[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_stall1_no_txr_horz_crd_ad_ag1_1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_stall1_no_txr_horz_crd_bl_ag1_1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_tor_inserts[]={ { .uname = "ALL", .udesc = "All (experimental)", .ucode = 0xc001ff0000ff00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "DDR", .udesc = "DDR4 Access (experimental)", .ucode = 0x400000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DDR4", .udesc = "TBD (experimental)", .ucode = 0x400000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EVICT", .udesc = "SF/LLC Evictions (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HIT", .udesc = "Just 
Hits (experimental)", .ucode = 0x100000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA", .udesc = "All requests from iA Cores", .ucode = 0xc001ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_CLFLUSH", .udesc = "CLFlushes issued by iA Cores", .ucode = 0xc8c7ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_CLFLUSHOPT", .udesc = "CLFlushOpts issued by iA Cores (experimental)", .ucode = 0xc8d7ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_CRD", .udesc = "CRDs issued by iA Cores", .ucode = 0xc80fff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_CRD_PREF", .udesc = "TBD (experimental)", .ucode = 0xc88fff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_DRD", .udesc = "DRds issued by iA Cores (experimental)", .ucode = 0xc817ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_DRDPTE", .udesc = "DRd PTEs issued by iA Cores (experimental)", .ucode = 0xc837ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_DRD_OPT", .udesc = "DRd_Opts issued by iA Cores (experimental)", .ucode = 0xc827ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_DRD_OPT_PREF", .udesc = "DRd_Opt_Prefs issued by iA Cores (experimental)", .ucode = 0xc8a7ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_DRD_PREF", .udesc = "DRd_Prefs issued by iA Cores", .ucode = 0xc897ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT", .udesc = "All requests from iA Cores that Hit the LLC", .ucode = 0xc001fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_CRD", .udesc = "CRds issued by iA Cores that Hit the LLC", .ucode = 0xc80ffd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_CRD_PREF", .udesc = "CRd_Prefs issued by iA Cores that hit the LLC", .ucode = 0xc88ffd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_DRD", .udesc = "DRds issued by iA Cores that Hit the LLC", .ucode = 0xc817fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_DRDPTE", .udesc = 
"DRd PTEs issued by iA Cores that Hit the LLC (experimental)", .ucode = 0xc837fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_DRD_OPT", .udesc = "DRd_Opts issued by iA Cores that hit the LLC (experimental)", .ucode = 0xc827fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_DRD_OPT_PREF", .udesc = "DRd_Opt_Prefs issued by iA Cores that hit the LLC (experimental)", .ucode = 0xc8a7fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_DRD_PREF", .udesc = "DRd_Prefs issued by iA Cores that Hit the LLC", .ucode = 0xc897fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_ITOM", .udesc = "ItoMs issued by iA Cores that Hit LLC (experimental)", .ucode = 0xcc47fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_LLCPREFCODE", .udesc = "LLCPrefCode issued by iA Cores that hit the LLC (experimental)", .ucode = 0xcccffd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_LLCPREFCRD", .udesc = "TBD (experimental)", .ucode = 0xcccffd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_LLCPREFDATA", .udesc = "LLCPrefData issued by iA Cores that hit the LLC (experimental)", .ucode = 0xccd7fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_LLCPREFDRD", .udesc = "TBD (experimental)", .ucode = 0xccd7fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_LLCPREFRFO", .udesc = "LLCPrefRFO issued by iA Cores that hit the LLC", .ucode = 0xccc7fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_RFO", .udesc = "RFOs issued by iA Cores that Hit the LLC", .ucode = 0xc807fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_RFO_PREF", .udesc = "RFO_Prefs issued by iA Cores that Hit the LLC", .ucode = 0xc887fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_SPECITOM", .udesc = "SpecItoMs issued by iA Cores that hit in the LLC (experimental)", .ucode = 0xcc57fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_ITOM", .udesc = "ItoMs issued by 
iA Cores (experimental)", .ucode = 0xcc47ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_ITOMCACHENEAR", .udesc = "ItoMCacheNears issued by iA Cores (experimental)", .ucode = 0xcd47ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_LLCPREFCODE", .udesc = "LLCPrefCode issued by iA Cores (experimental)", .ucode = 0xcccfff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_LLCPREFDATA", .udesc = "LLCPrefData issued by iA Cores", .ucode = 0xccd7ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_LLCPREFRFO", .udesc = "LLCPrefRFO issued by iA Cores", .ucode = 0xccc7ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS", .udesc = "All requests from iA Cores that Missed the LLC", .ucode = 0xc001fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD", .udesc = "CRds issued by iA Cores that Missed the LLC", .ucode = 0xc80ffe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD_LOCAL", .udesc = "CRd issued by iA Cores that Missed the LLC - HOMed locally (experimental)", .ucode = 0xc80efe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD_PREF", .udesc = "CRd_Prefs issued by iA Cores that Missed the LLC", .ucode = 0xc88ffe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD_PREF_LOCAL", .udesc = "CRd_Prefs issued by iA Cores that Missed the LLC - HOMed locally (experimental)", .ucode = 0xc88efe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD_PREF_REMOTE", .udesc = "CRd_Prefs issued by iA Cores that Missed the LLC - HOMed remotely (experimental)", .ucode = 0xc88f7e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD_REMOTE", .udesc = "CRd issued by iA Cores that Missed the LLC - HOMed remotely (experimental)", .ucode = 0xc80f7e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD", .udesc = "DRds issued by iA Cores that Missed the LLC", .ucode = 0xc817fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"IA_MISS_DRDPTE", .udesc = "DRd PTEs issued by iA Cores that Missed the LLC (experimental)", .ucode = 0xc837fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_DDR", .udesc = "DRds issued by iA Cores targeting DDR Mem that Missed the LLC", .ucode = 0xc8178600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_LOCAL", .udesc = "DRds issued by iA Cores that Missed the LLC - HOMed locally", .ucode = 0xc816fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_LOCAL_DDR", .udesc = "DRds issued by iA Cores targeting DDR Mem that Missed the LLC - HOMed locally", .ucode = 0xc8168600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_LOCAL_PMM", .udesc = "DRds issued by iA Cores targeting PMM Mem that Missed the LLC - HOMed locally", .ucode = 0xc8168a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_OPT", .udesc = "DRd_Opt issued by iA Cores that missed the LLC (experimental)", .ucode = 0xc827fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_OPT_PREF", .udesc = "DRd_Opt_Prefs issued by iA Cores that missed the LLC (experimental)", .ucode = 0xc8a7fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PMM", .udesc = "DRds issued by iA Cores targeting PMM Mem that Missed the LLC", .ucode = 0xc8178a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF", .udesc = "DRd_Prefs issued by iA Cores that Missed the LLC", .ucode = 0xc897fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_DDR", .udesc = "DRd_Prefs issued by iA Cores targeting DDR Mem that Missed the LLC (experimental)", .ucode = 0xc8978600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_LOCAL", .udesc = "TBD", .ucode = 0xc896fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_LOCAL_DDR", .udesc = "DRd_Prefs issued by iA Cores targeting DDR Mem that Missed the LLC - HOMed locally (experimental)", .ucode = 
0xc8968600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_LOCAL_PMM", .udesc = "DRd_Prefs issued by iA Cores targeting PMM Mem that Missed the LLC - HOMed locally (experimental)", .ucode = 0xc8968a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_PMM", .udesc = "DRd_Prefs issued by iA Cores targeting PMM Mem that Missed the LLC (experimental)", .ucode = 0xc8978a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_REMOTE", .udesc = "TBD", .ucode = 0xc8977e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_REMOTE_DDR", .udesc = "DRd_Prefs issued by iA Cores targeting DDR Mem that Missed the LLC - HOMed remotely (experimental)", .ucode = 0xc8970600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_REMOTE_PMM", .udesc = "DRd_Prefs issued by iA Cores targeting PMM Mem that Missed the LLC - HOMed remotely (experimental)", .ucode = 0xc8970a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_REMOTE", .udesc = "DRds issued by iA Cores that Missed the LLC - HOMed remotely", .ucode = 0xc8177e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_REMOTE_DDR", .udesc = "DRds issued by iA Cores targeting DDR Mem that Missed the LLC - HOMed remotely", .ucode = 0xc8170600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_REMOTE_PMM", .udesc = "DRds issued by iA Cores targeting PMM Mem that Missed the LLC - HOMed remotely", .ucode = 0xc8170a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR", .udesc = "TBD", .ucode = 0xc867fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR_DDR", .udesc = "TBD (experimental)", .ucode = 0xc8678600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR_DRAM", .udesc = "TBD (experimental)", .ucode = 0xc8678600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR_LOCAL_DDR", .udesc = "TBD 
(experimental)", .ucode = 0xc8668600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR_LOCAL_DRAM", .udesc = "TBD (experimental)", .ucode = 0xc8668600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR_LOCAL_PMM", .udesc = "TBD (experimental)", .ucode = 0xc8668a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR_PMM", .udesc = "TBD (experimental)", .ucode = 0xc8678a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR_REMOTE_DDR", .udesc = "TBD (experimental)", .ucode = 0xc8670600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR_REMOTE_DRAM", .udesc = "TBD (experimental)", .ucode = 0xc8670600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR_REMOTE_PMM", .udesc = "TBD (experimental)", .ucode = 0xc8670a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_ITOM", .udesc = "ItoMs issued by iA Cores that Missed LLC (experimental)", .ucode = 0xcc47fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LLCPREFCODE", .udesc = "LLCPrefCode issued by iA Cores that missed the LLC (experimental)", .ucode = 0xcccffe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LLCPREFDATA", .udesc = "LLCPrefData issued by iA Cores that missed the LLC", .ucode = 0xccd7fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LLCPREFRFO", .udesc = "LLCPrefRFO issued by iA Cores that missed the LLC", .ucode = 0xccc7fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LOCAL_WCILF_DDR", .udesc = "WCiLFs issued by iA Cores targeting DDR that missed the LLC - HOMed locally (experimental)", .ucode = 0xc8668600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LOCAL_WCILF_PMM", .udesc = "WCiLFs issued by iA Cores targeting PMM that missed the LLC - HOMed locally (experimental)", .ucode = 0xc8668a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"IA_MISS_LOCAL_WCIL_DDR", .udesc = "WCiLs issued by iA Cores targeting DDR that missed the LLC - HOMed locally (experimental)", .ucode = 0xc86e8600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LOCAL_WCIL_PMM", .udesc = "WCiLs issued by iA Cores targeting PMM that missed the LLC - HOMed locally (experimental)", .ucode = 0xc86e8a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR", .udesc = "TBD", .ucode = 0xc86ffe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR_DDR", .udesc = "TBD (experimental)", .ucode = 0xc86f8600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR_DRAM", .udesc = "TBD (experimental)", .ucode = 0xc86f8600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR_LOCAL_DDR", .udesc = "TBD (experimental)", .ucode = 0xc86e8600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR_LOCAL_DRAM", .udesc = "TBD (experimental)", .ucode = 0xc86e8600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR_LOCAL_PMM", .udesc = "TBD (experimental)", .ucode = 0xc86e8a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR_PMM", .udesc = "TBD (experimental)", .ucode = 0xc86f8a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR_REMOTE_DDR", .udesc = "TBD (experimental)", .ucode = 0xc86f0600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR_REMOTE_DRAM", .udesc = "TBD (experimental)", .ucode = 0xc86f0600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR_REMOTE_PMM", .udesc = "TBD (experimental)", .ucode = 0xc86f0a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_REMOTE_WCILF_DDR", .udesc = "WCiLFs issued by iA Cores targeting DDR that missed the LLC - HOMed remotely (experimental)", .ucode = 0xc8670600000100ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_REMOTE_WCILF_PMM", .udesc = "WCiLFs issued by iA Cores targeting PMM that missed the LLC - HOMed remote memory (experimental)", .ucode = 0xc8670a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_REMOTE_WCIL_DDR", .udesc = "WCiLs issued by iA Cores targeting DDR that missed the LLC - HOMed remotely (experimental)", .ucode = 0xc86f0600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_REMOTE_WCIL_PMM", .udesc = "WCiLs issued by iA Cores targeting PMM that missed the LLC - HOMed remotely (experimental)", .ucode = 0xc86f0a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO", .udesc = "RFOs issued by iA Cores that Missed the LLC", .ucode = 0xc807fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_LOCAL", .udesc = "RFOs issued by iA Cores that Missed the LLC - HOMed locally", .ucode = 0xc806fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_PREF", .udesc = "RFO_Prefs issued by iA Cores that Missed the LLC", .ucode = 0xc887fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_PREF_LOCAL", .udesc = "RFO_Prefs issued by iA Cores that Missed the LLC - HOMed locally", .ucode = 0xc886fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_PREF_REMOTE", .udesc = "RFO_Prefs issued by iA Cores that Missed the LLC - HOMed remotely", .ucode = 0xc8877e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_REMOTE", .udesc = "RFOs issued by iA Cores that Missed the LLC - HOMed remotely", .ucode = 0xc8077e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_SPECITOM", .udesc = "SpecItoMs issued by iA Cores that missed the LLC (experimental)", .ucode = 0xcc57fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_UCRDF", .udesc = "UCRdFs issued by iA Cores that Missed LLC (experimental)", .ucode = 0xc877de00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCIL", .udesc = "WCiLs issued by iA 
Cores that Missed the LLC (experimental)", .ucode = 0xc86ffe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCILF", .udesc = "WCiLF issued by iA Cores that Missed the LLC (experimental)", .ucode = 0xc867fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCILF_DDR", .udesc = "WCiLFs issued by iA Cores targeting DDR that missed the LLC (experimental)", .ucode = 0xc8678600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCILF_PMM", .udesc = "WCiLFs issued by iA Cores targeting PMM that missed the LLC (experimental)", .ucode = 0xc8678a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCIL_DDR", .udesc = "WCiLs issued by iA Cores targeting DDR that missed the LLC (experimental)", .ucode = 0xc86f8600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCIL_PMM", .udesc = "WCiLs issued by iA Cores targeting PMM that missed the LLC (experimental)", .ucode = 0xc86f8a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WIL", .udesc = "WiLs issued by iA Cores that Missed LLC (experimental)", .ucode = 0xc87fde00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_RFO", .udesc = "RFOs issued by iA Cores", .ucode = 0xc807ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_RFO_PREF", .udesc = "RFO_Prefs issued by iA Cores", .ucode = 0xc887ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_SPECITOM", .udesc = "SpecItoMs issued by iA Cores", .ucode = 0xcc57ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WBEFTOE", .udesc = "WBEFtoEs issued by an IA Core. Non Modified Write Backs (experimental)", .ucode = 0xcc3fff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WBEFTOI", .udesc = "WBEFtoIs issued by an IA Core. Non Modified Write Backs (experimental)", .ucode = 0xcc37ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WBMTOE", .udesc = "WBMtoEs issued by an IA Core. 
Non Modified Write Backs (experimental)", .ucode = 0xcc2fff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WBMTOI", .udesc = "WbMtoIs issued by an iA Cores. Modified Write Backs (experimental)", .ucode = 0xcc27ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WBSTOI", .udesc = "WBStoIs issued by an IA Core. Non Modified Write Backs (experimental)", .ucode = 0xcc67ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WCIL", .udesc = "WCiLs issued by iA Cores (experimental)", .ucode = 0xc86fff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WCILF", .udesc = "WCiLF issued by iA Cores (experimental)", .ucode = 0xc867ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO", .udesc = "All requests from IO Devices", .ucode = 0xc001ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_CLFLUSH", .udesc = "CLFlushes issued by IO Devices (experimental)", .ucode = 0xc8c3ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_HIT", .udesc = "All requests from IO Devices that hit the LLC", .ucode = 0xc001fd00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_HIT_ITOM", .udesc = "ItoMs issued by IO Devices that Hit the LLC", .ucode = 0xcc43fd00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_HIT_ITOMCACHENEAR", .udesc = "ItoMCacheNears, indicating a partial write request, from IO Devices that hit the LLC", .ucode = 0xcd43fd00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_HIT_PCIRDCUR", .udesc = "PCIRdCurs issued by IO Devices that hit the LLC", .ucode = 0xc8f3fd00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_HIT_RFO", .udesc = "RFOs issued by IO Devices that hit the LLC (experimental)", .ucode = 0xc803fd00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_ITOM", .udesc = "ItoMs issued by IO Devices", .ucode = 0xcc43ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_ITOMCACHENEAR", .udesc = "ItoMCacheNears, indicating a partial write request, from IO Devices", .ucode = 
0xcd43ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_ITOMCACHENEAR_LOCAL", .udesc = "ItoMCacheNears, indicating a partial write request, from IO Devices to locally HOMed memory", .ucode = 0xcd42ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_ITOMCACHENEAR_REMOTE", .udesc = "ItoMCacheNears, indicating a partial write request, from IO Devices to remotely HOMed memory", .ucode = 0xcd437f00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_ITOM_LOCAL", .udesc = "ItoMs issued by IO Devices to locally HOMed memory", .ucode = 0xcc42ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_ITOM_REMOTE", .udesc = "ItoMs issued by IO Devices to remotely HOMed memory", .ucode = 0xcc437f00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_MISS", .udesc = "All requests from IO Devices that missed the LLC", .ucode = 0xc001fe00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_MISS_ITOM", .udesc = "ItoMs issued by IO Devices that missed the LLC", .ucode = 0xcc43fe00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_MISS_ITOMCACHENEAR", .udesc = "ItoMCacheNears, indicating a partial write request, from IO Devices that missed the LLC", .ucode = 0xcd43fe00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_MISS_PCIRDCUR", .udesc = "PCIRdCurs issued by IO Devices that missed the LLC", .ucode = 0xc8f3fe00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_MISS_RFO", .udesc = "RFOs issued by IO Devices that missed the LLC (experimental)", .ucode = 0xc803fe00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_PCIRDCUR", .udesc = "PCIRdCurs issued by IO Devices", .ucode = 0xc8f3ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_RFO", .udesc = "RFOs issued by IO Devices (experimental)", .ucode = 0xc803ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WBMTOI", .udesc = "WbMtoIs issued by IO Devices (experimental)", .ucode = 0xcc23ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IPQ", .udesc = 
"IPQ (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ_IA", .udesc = "IRQ - iA (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ_NON_IA", .udesc = "IRQ - Non iA (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ISOC", .udesc = "Just ISOC (experimental)", .ucode = 0x200000000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_TGT", .udesc = "Just Local Targets (experimental)", .ucode = 0x8000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOC_ALL", .udesc = "All from Local iA and IO (experimental)", .ucode = 0xc000ff00000500ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOC_IA", .udesc = "All from Local iA (experimental)", .ucode = 0xc000ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOC_IO", .udesc = "All from Local IO (experimental)", .ucode = 0xc000ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MATCH_OPC", .udesc = "Match the Opcode in b[29:19] of the extended umask field (experimental)", .ucode = 0x20000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS", .udesc = "Just Misses (experimental)", .ucode = 0x200000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MMCFG", .udesc = "MMCFG Access (experimental)", .ucode = 0x2000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NEARMEM", .udesc = "Just NearMem (experimental)", .ucode = 0x40000000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NONCOH", .udesc = "Just NonCoherent (experimental)", .ucode = 0x100000000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NOT_NEARMEM", .udesc = "Just NotNearMem (experimental)", .ucode = 0x80000000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM", .udesc = "PMM Access (experimental)", .ucode = 0x800000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PREMORPH_OPC", .udesc = "Match the PreMorphed Opcode in b[29:19] of the extended umask field (experimental)", .ucode = 0x40000000000ull, .uflags = INTEL_X86_NCOMBO, }, 
{ .uname = "PRQ_IOSF", .udesc = "PRQ - IOSF (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ_NON_IOSF", .udesc = "PRQ - Non IOSF (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_TGT", .udesc = "Just Remote Targets (experimental)", .ucode = 0x10000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RRQ", .udesc = "RRQ (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WBQ", .udesc = "WBQ (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_tor_occupancy[]={ { .uname = "DDR", .udesc = "DDR4 Access (experimental)", .ucode = 0x400000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EVICT", .udesc = "SF/LLC Evictions (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HIT", .udesc = "Just Hits (experimental)", .ucode = 0x100000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA", .udesc = "All requests from iA Cores", .ucode = 0xc001ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_CLFLUSH", .udesc = "CLFlushes issued by iA Cores (experimental)", .ucode = 0xc8c7ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_CLFLUSHOPT", .udesc = "CLFlushOpts issued by iA Cores (experimental)", .ucode = 0xc8d7ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_CRD", .udesc = "CRDs issued by iA Cores", .ucode = 0xc80fff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_CRD_PREF", .udesc = "TBD (experimental)", .ucode = 0xc88fff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_DRD", .udesc = "DRds issued by iA Cores", .ucode = 0xc817ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_DRDPTE", .udesc = "DRdPte issued by iA Cores due to a page walk (experimental)", .ucode = 0xc837ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_DRD_OPT", .udesc = "DRd_Opts issued by iA Cores (experimental)", .ucode = 0xc827ff00000100ull, .uflags 
= INTEL_X86_NCOMBO, }, { .uname = "IA_DRD_OPT_PREF", .udesc = "DRd_Opt_Prefs issued by iA Cores (experimental)", .ucode = 0xc8a7ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_DRD_PREF", .udesc = "DRd_Prefs issued by iA Cores (experimental)", .ucode = 0xc897ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT", .udesc = "All requests from iA Cores that Hit the LLC", .ucode = 0xc001fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_CRD", .udesc = "CRds issued by iA Cores that Hit the LLC (experimental)", .ucode = 0xc80ffd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_CRD_PREF", .udesc = "CRd_Prefs issued by iA Cores that hit the LLC (experimental)", .ucode = 0xc88ffd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_DRD", .udesc = "DRds issued by iA Cores that Hit the LLC (experimental)", .ucode = 0xc817fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_DRDPTE", .udesc = "DRdPte issued by iA Cores due to a page walk that hit the LLC (experimental)", .ucode = 0xc837fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_DRD_OPT", .udesc = "DRd_Opts issued by iA Cores that hit the LLC (experimental)", .ucode = 0xc827fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_DRD_OPT_PREF", .udesc = "DRd_Opt_Prefs issued by iA Cores that hit the LLC (experimental)", .ucode = 0xc8a7fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_DRD_PREF", .udesc = "DRd_Prefs issued by iA Cores that Hit the LLC (experimental)", .ucode = 0xc897fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_ITOM", .udesc = "ItoMs issued by iA Cores that Hit LLC (experimental)", .ucode = 0xcc47fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_LLCPREFCODE", .udesc = "LLCPrefCode issued by iA Cores that hit the LLC (experimental)", .ucode = 0xcccffd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_LLCPREFDATA", .udesc = "LLCPrefData issued by iA Cores 
that hit the LLC (experimental)", .ucode = 0xccd7fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_LLCPREFRFO", .udesc = "LLCPrefRFO issued by iA Cores that hit the LLC (experimental)", .ucode = 0xccc7fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_RFO", .udesc = "RFOs issued by iA Cores that Hit the LLC (experimental)", .ucode = 0xc807fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_RFO_PREF", .udesc = "RFO_Prefs issued by iA Cores that Hit the LLC (experimental)", .ucode = 0xc887fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_ITOM", .udesc = "ItoMs issued by iA Cores (experimental)", .ucode = 0xcc47ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_ITOMCACHENEAR", .udesc = "ItoMCacheNears issued by iA Cores (experimental)", .ucode = 0xcd47ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_LLCPREFCODE", .udesc = "LLCPrefCode issued by iA Cores (experimental)", .ucode = 0xcccfff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_LLCPREFDATA", .udesc = "LLCPrefData issued by iA Cores (experimental)", .ucode = 0xccd7ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_LLCPREFRFO", .udesc = "LLCPrefRFO issued by iA Cores (experimental)", .ucode = 0xccc7ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS", .udesc = "All requests from iA Cores that Missed the LLC", .ucode = 0xc001fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD", .udesc = "CRds issued by iA Cores that Missed the LLC", .ucode = 0xc80ffe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD_LOCAL", .udesc = "CRd issued by iA Cores that Missed the LLC - HOMed locally (experimental)", .ucode = 0xc80efe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD_PREF", .udesc = "CRd_Prefs issued by iA Cores that Missed the LLC (experimental)", .ucode = 0xc88ffe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD_PREF_LOCAL", .udesc = "CRd_Prefs 
issued by iA Cores that Missed the LLC - HOMed locally (experimental)", .ucode = 0xc88efe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD_PREF_REMOTE", .udesc = "CRd_Prefs issued by iA Cores that Missed the LLC - HOMed remotely (experimental)", .ucode = 0xc88f7e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD_REMOTE", .udesc = "CRd issued by iA Cores that Missed the LLC - HOMed remotely (experimental)", .ucode = 0xc80f7e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD", .udesc = "DRds issued by iA Cores that Missed the LLC", .ucode = 0xc817fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRDPTE", .udesc = "DRdPte issued by iA Cores due to a page walk that missed the LLC (experimental)", .ucode = 0xc837fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_DDR", .udesc = "DRds issued by iA Cores targeting DDR Mem that Missed the LLC", .ucode = 0xc8178600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_LOCAL", .udesc = "DRds issued by iA Cores that Missed the LLC - HOMed locally", .ucode = 0xc816fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_LOCAL_DDR", .udesc = "DRds issued by iA Cores targeting DDR Mem that Missed the LLC - HOMed locally (experimental)", .ucode = 0xc8168600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_LOCAL_PMM", .udesc = "DRds issued by iA Cores targeting PMM Mem that Missed the LLC - HOMed locally (experimental)", .ucode = 0xc8168a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_OPT", .udesc = "DRd_Opt issued by iA Cores that missed the LLC (experimental)", .ucode = 0xc827fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_OPT_PREF", .udesc = "DRd_Opt_Prefs issued by iA Cores that missed the LLC (experimental)", .ucode = 0xc8a7fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PMM", .udesc = "DRds issued by iA Cores targeting PMM Mem that Missed 
the LLC", .ucode = 0xc8178a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF", .udesc = "DRd_Prefs issued by iA Cores that Missed the LLC (experimental)", .ucode = 0xc897fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_DDR", .udesc = "DRd_Prefs issued by iA Cores targeting DDR Mem that Missed the LLC (experimental)", .ucode = 0xc8978600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_LOCAL", .udesc = "TBD (experimental)", .ucode = 0xc896fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_LOCAL_DDR", .udesc = "DRd_Prefs issued by iA Cores targeting DDR Mem that Missed the LLC - HOMed locally (experimental)", .ucode = 0xc8968600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_LOCAL_PMM", .udesc = "DRd_Prefs issued by iA Cores targeting PMM Mem that Missed the LLC - HOMed locally (experimental)", .ucode = 0xc8968a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_PMM", .udesc = "DRd_Prefs issued by iA Cores targeting PMM Mem that Missed the LLC (experimental)", .ucode = 0xc8978a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_REMOTE", .udesc = "TBD (experimental)", .ucode = 0xc8977e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_REMOTE_DDR", .udesc = "DRd_Prefs issued by iA Cores targeting DDR Mem that Missed the LLC - HOMed remotely (experimental)", .ucode = 0xc8970600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_REMOTE_PMM", .udesc = "DRd_Prefs issued by iA Cores targeting PMM Mem that Missed the LLC - HOMed remotely (experimental)", .ucode = 0xc8970a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_REMOTE", .udesc = "DRds issued by iA Cores that Missed the LLC - HOMed remotely", .ucode = 0xc8177e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_REMOTE_DDR", .udesc = "DRds issued by iA Cores targeting DDR Mem that Missed 
the LLC - HOMed remotely (experimental)", .ucode = 0xc8170600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_REMOTE_PMM", .udesc = "DRds issued by iA Cores targeting PMM Mem that Missed the LLC - HOMed remotely (experimental)", .ucode = 0xc8170a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR", .udesc = "TBD (experimental)", .ucode = 0xc867fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR_DDR", .udesc = "TBD (experimental)", .ucode = 0xc8678600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR_LOCAL_DDR", .udesc = "TBD (experimental)", .ucode = 0xc8668600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR_LOCAL_PMM", .udesc = "TBD (experimental)", .ucode = 0xc8668a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR_PMM", .udesc = "TBD (experimental)", .ucode = 0xc8678a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR_REMOTE_DDR", .udesc = "TBD (experimental)", .ucode = 0xc8670600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_FULL_STREAMING_WR_REMOTE_PMM", .udesc = "TBD (experimental)", .ucode = 0xc8670a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_ITOM", .udesc = "ItoMs issued by iA Cores that Missed LLC (experimental)", .ucode = 0xcc47fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LLCPREFCODE", .udesc = "LLCPrefCode issued by iA Cores that missed the LLC (experimental)", .ucode = 0xcccffe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LLCPREFDATA", .udesc = "LLCPrefData issued by iA Cores that missed the LLC (experimental)", .ucode = 0xccd7fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LLCPREFRFO", .udesc = "LLCPrefRFO issued by iA Cores that missed the LLC (experimental)", .ucode = 0xccc7fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LOCAL_WCILF_DDR", .udesc 
= "WCiLFs issued by iA Cores targeting DDR that missed the LLC - HOMed locally (experimental)", .ucode = 0xc8668600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LOCAL_WCILF_PMM", .udesc = "WCiLFs issued by iA Cores targeting PMM that missed the LLC - HOMed locally (experimental)", .ucode = 0xc8668a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LOCAL_WCIL_DDR", .udesc = "WCiLs issued by iA Cores targeting DDR that missed the LLC - HOMed locally (experimental)", .ucode = 0xc86e8600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LOCAL_WCIL_PMM", .udesc = "WCiLs issued by iA Cores targeting PMM that missed the LLC - HOMed locally (experimental)", .ucode = 0xc86e8a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR", .udesc = "TBD (experimental)", .ucode = 0xc86ffe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR_DDR", .udesc = "TBD (experimental)", .ucode = 0xc86f8600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR_LOCAL_DDR", .udesc = "TBD (experimental)", .ucode = 0xc86e8600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR_LOCAL_PMM", .udesc = "TBD (experimental)", .ucode = 0xc86e8a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR_PMM", .udesc = "TBD (experimental)", .ucode = 0xc86f8a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR_REMOTE_DDR", .udesc = "TBD (experimental)", .ucode = 0xc86f0600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_PARTIAL_STREAMING_WR_REMOTE_PMM", .udesc = "TBD (experimental)", .ucode = 0xc86f0a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_REMOTE_WCILF_DDR", .udesc = "WCiLFs issued by iA Cores targeting DDR that missed the LLC - HOMed remotely (experimental)", .ucode = 0xc8670600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_REMOTE_WCILF_PMM", 
.udesc = "WCiLFs issued by iA Cores targeting PMM that missed the LLC - HOMed remotely (experimental)", .ucode = 0xc8670a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_REMOTE_WCIL_DDR", .udesc = "WCiLs issued by iA Cores targeting DDR that missed the LLC - HOMed remotely (experimental)", .ucode = 0xc86f0600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_REMOTE_WCIL_PMM", .udesc = "WCiLs issued by iA Cores targeting PMM that missed the LLC - HOMed remotely (experimental)", .ucode = 0xc86f0a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO", .udesc = "RFOs issued by iA Cores that Missed the LLC", .ucode = 0xc807fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_LOCAL", .udesc = "RFOs issued by iA Cores that Missed the LLC - HOMed locally (experimental)", .ucode = 0xc806fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_PREF", .udesc = "RFO_Prefs issued by iA Cores that Missed the LLC (experimental)", .ucode = 0xc887fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_PREF_LOCAL", .udesc = "RFO_Prefs issued by iA Cores that Missed the LLC - HOMed locally (experimental)", .ucode = 0xc886fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_PREF_REMOTE", .udesc = "RFO_Prefs issued by iA Cores that Missed the LLC - HOMed remotely (experimental)", .ucode = 0xc8877e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_REMOTE", .udesc = "RFOs issued by iA Cores that Missed the LLC - HOMed remotely (experimental)", .ucode = 0xc8077e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_SPECITOM", .udesc = "SpecItoMs issued by iA Cores that missed the LLC (experimental)", .ucode = 0xcc57fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_UCRDF", .udesc = "UCRdFs issued by iA Cores that Missed LLC (experimental)", .ucode = 0xc877de00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCIL", .udesc = "WCiLs issued 
by iA Cores that Missed the LLC (experimental)", .ucode = 0xc86ffe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCILF", .udesc = "WCiLF issued by iA Cores that Missed the LLC (experimental)", .ucode = 0xc867fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCILF_DDR", .udesc = "WCiLFs issued by iA Cores targeting DDR that missed the LLC (experimental)", .ucode = 0xc8678600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCILF_PMM", .udesc = "WCiLFs issued by iA Cores targeting PMM that missed the LLC (experimental)", .ucode = 0xc8678a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCIL_DDR", .udesc = "WCiLs issued by iA Cores targeting DDR that missed the LLC (experimental)", .ucode = 0xc86f8600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCIL_PMM", .udesc = "WCiLs issued by iA Cores targeting PMM that missed the LLC (experimental)", .ucode = 0xc86f8a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WIL", .udesc = "WiLs issued by iA Cores that Missed LLC (experimental)", .ucode = 0xc87fde00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_RFO", .udesc = "RFOs issued by iA Cores", .ucode = 0xc807ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_RFO_PREF", .udesc = "RFO_Prefs issued by iA Cores (experimental)", .ucode = 0xc887ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_SPECITOM", .udesc = "SpecItoMs issued by iA Cores (experimental)", .ucode = 0xcc57ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WBMTOI", .udesc = "WbMtoIs issued by iA Cores (experimental)", .ucode = 0xcc27ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WCIL", .udesc = "WCiLs issued by iA Cores (experimental)", .ucode = 0xc86fff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WCILF", .udesc = "WCiLF issued by iA Cores (experimental)", .ucode = 0xc867ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO", .udesc = "All requests from 
IO Devices", .ucode = 0xc001ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_CLFLUSH", .udesc = "CLFlushes issued by IO Devices (experimental)", .ucode = 0xc8c3ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_HIT", .udesc = "All requests from IO Devices that hit the LLC", .ucode = 0xc001fd00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_HIT_ITOM", .udesc = "ItoMs issued by IO Devices that Hit the LLC (experimental)", .ucode = 0xcc43fd00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_HIT_ITOMCACHENEAR", .udesc = "ItoMCacheNears, indicating a partial write request, from IO Devices that hit the LLC (experimental)", .ucode = 0xcd43fd00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_HIT_PCIRDCUR", .udesc = "PCIRdCurs issued by IO Devices that hit the LLC (experimental)", .ucode = 0xc8f3fd00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_HIT_RFO", .udesc = "RFOs issued by IO Devices that hit the LLC (experimental)", .ucode = 0xc803fd00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_ITOM", .udesc = "ItoMs issued by IO Devices (experimental)", .ucode = 0xcc43ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_ITOMCACHENEAR", .udesc = "ItoMCacheNears, indicating a partial write request, from IO Devices (experimental)", .ucode = 0xcd43ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_MISS", .udesc = "All requests from IO Devices that missed the LLC", .ucode = 0xc001fe00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_MISS_ITOM", .udesc = "ItoMs issued by IO Devices that missed the LLC (experimental)", .ucode = 0xcc43fe00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_MISS_ITOMCACHENEAR", .udesc = "ItoMCacheNears, indicating a partial write request, from IO Devices that missed the LLC (experimental)", .ucode = 0xcd43fe00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_MISS_PCIRDCUR", .udesc = "PCIRdCurs issued by IO Devices that missed the LLC", .ucode = 
0xc8f3fe00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_MISS_RFO", .udesc = "RFOs issued by IO Devices that missed the LLC (experimental)", .ucode = 0xc803fe00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_PCIRDCUR", .udesc = "PCIRdCurs issued by IO Devices", .ucode = 0xc8f3ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_RFO", .udesc = "RFOs issued by IO Devices (experimental)", .ucode = 0xc803ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WBMTOI", .udesc = "WbMtoIs issued by IO Devices (experimental)", .ucode = 0xcc23ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IPQ", .udesc = "IPQ (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ_IA", .udesc = "IRQ - iA (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ_NON_IA", .udesc = "IRQ - Non iA (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ISOC", .udesc = "Just ISOC (experimental)", .ucode = 0x200000000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_TGT", .udesc = "Just Local Targets (experimental)", .ucode = 0x8000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOC_ALL", .udesc = "All from Local iA and IO (experimental)", .ucode = 0xc000ff00000500ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOC_IA", .udesc = "All from Local iA (experimental)", .ucode = 0xc000ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOC_IO", .udesc = "All from Local IO (experimental)", .ucode = 0xc000ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MATCH_OPC", .udesc = "Match the Opcode in b[29:19] of the extended umask field (experimental)", .ucode = 0x20000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS", .udesc = "Just Misses (experimental)", .ucode = 0x200000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MMCFG", .udesc = "MMCFG Access (experimental)", .ucode = 0x2000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NEARMEM", 
.udesc = "Just NearMem (experimental)", .ucode = 0x40000000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NONCOH", .udesc = "Just NonCoherent (experimental)", .ucode = 0x100000000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NOT_NEARMEM", .udesc = "Just NotNearMem (experimental)", .ucode = 0x80000000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM", .udesc = "PMM Access (experimental)", .ucode = 0x800000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PREMORPH_OPC", .udesc = "Match the PreMorphed Opcode in b[29:19] of the extended umask field (experimental)", .ucode = 0x40000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ", .udesc = "PRQ - IOSF (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ_NON_IOSF", .udesc = "PRQ - Non IOSF (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_TGT", .udesc = "Just Remote Targets (experimental)", .ucode = 0x10000000000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_txr_horz_ads_used[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_txr_horz_cycles_full[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode 
= 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_txr_horz_inserts[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_txr_horz_occupancy[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", 
.ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_txr_horz_starved[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_txr_vert_ads_used[]={ { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_AG1", .udesc = "AD - Agent 1 
(experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_txr_vert_bypass[]={ { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG0", .udesc = "AK - Agent 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "AK - Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_AG1", .udesc = "IV - Agent 1 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_txr_vert_cycles_full1[]={ { .uname = "AKC_AG0", .udesc = "AKC - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_AG1", .udesc = "AKC - Agent 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_txr_vert_cycles_ne0[]={ { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG0", .udesc = "AK - Agent 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "AK - Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG0", .udesc = "BL - 
Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_AG0", .udesc = "IV - Agent 0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_txr_vert_inserts1[]={ { .uname = "AKC_AG0", .udesc = "AKC - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_AG1", .udesc = "AKC - Agent 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_txr_vert_occupancy0[]={ { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG0", .udesc = "AK - Agent 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "AK - Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_AG0", .udesc = "IV - Agent 0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_txr_vert_occupancy1[]={ { .uname = "AKC_AG0", .udesc = "AKC - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_AG1", .udesc = "AKC - Agent 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_txr_vert_starved0[]={ { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 
0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG0", .udesc = "AK - Agent 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "AK - Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_AG0", .udesc = "IV - Agent 0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_txr_vert_starved1[]={ { .uname = "AKC_AG0", .udesc = "AKC - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_AG1", .udesc = "AKC - Agent 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGC", .udesc = "TGC (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_vert_ring_akc_in_use[]={ { .uname = "DN_EVEN", .udesc = "Down and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DN_ODD", .udesc = "Down and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_EVEN", .udesc = "Up and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_ODD", .udesc = "Up and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_vert_ring_bl_in_use[]={ { .uname = "DN_EVEN", .udesc = "Down and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DN_ODD", .udesc = "Down and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_EVEN", .udesc = "Up and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_ODD", .udesc = "Up and Odd (experimental)", .ucode = 
0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_vert_ring_iv_in_use[]={ { .uname = "DN", .udesc = "Down (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP", .udesc = "Up (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_vert_ring_tgc_in_use[]={ { .uname = "DN_EVEN", .udesc = "Down and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DN_ODD", .udesc = "Down and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_EVEN", .udesc = "Up and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_ODD", .udesc = "Up and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_wb_push_mtoi[]={ { .uname = "LLC", .udesc = "Pushed to LLC (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM", .udesc = "Pushed to Memory (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_write_no_credits[]={ { .uname = "MC0", .udesc = "MC0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MC1", .udesc = "MC1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MC10", .udesc = "MC10 (experimental)", .ucode = 0x400000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MC11", .udesc = "MC11 (experimental)", .ucode = 0x800000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MC12", .udesc = "MC12 (experimental)", .ucode = 0x1000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MC13", .udesc = "MC13 (experimental)", .ucode = 0x2000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MC2", .udesc = "MC2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MC3", .udesc = "MC3 (experimental)", .ucode = 0x0800ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "MC4", .udesc = "MC4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MC5", .udesc = "MC5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MC6", .udesc = "MC6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MC7", .udesc = "MC7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MC8", .udesc = "MC8 (experimental)", .ucode = 0x100000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MC9", .udesc = "MC9 (experimental)", .ucode = 0x200000000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_cha_xpt_pref[]={ { .uname = "DROP0_CONFLICT", .udesc = "Dropped (on 0?) - Conflict (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DROP0_NOCRD", .udesc = "Dropped (on 0?) - No Credits (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DROP1_CONFLICT", .udesc = "Dropped (on 1?) - Conflict (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DROP1_NOCRD", .udesc = "Dropped (on 1?) - No Credits (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SENT0", .udesc = "Sent (on 0?) (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SENT1", .udesc = "Sent (on 1?) (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_icx_unc_cha_pe[]={ { .name = "UNC_CHA_2LM_NM_INVITOX", .desc = "This event is deprecated. Refer to new event UNC_CHA_PMM_MEMMODE_NM_INVITOX.LOCAL", .code = 0x0065, .equiv = "UNC_CHA_PMM_MEMMODE_NM_INVITOX", .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_pmm_memmode_nm_invitox), /* shared */ .umasks = icx_unc_cha_pmm_memmode_nm_invitox, }, { .name = "UNC_CHA_2LM_NM_SETCONFLICTS", .desc = "This event is deprecated. 
Refer to new event UNC_CHA_PMM_MEMMODE_NM_SETCONFLICTS.TOR", .code = 0x0064, .equiv = "UNC_CHA_PMM_MEMMODE_NM_SETCONFLICTS", .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_2lm_nm_setconflicts), .umasks = icx_unc_cha_2lm_nm_setconflicts, }, { .name = "UNC_CHA_2LM_NM_SETCONFLICTS2", .desc = "This event is deprecated. Refer to new event UNC_CHA_PMM_MEMMODE_NM_SETCONFLICTS2.MEMWR", .code = 0x0070, .equiv = "UNC_CHA_PMM_MEMMODE_NM_SETCONFLICTS2", .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_2lm_nm_setconflicts2), .umasks = icx_unc_cha_2lm_nm_setconflicts2, }, { .name = "UNC_CHA_AG0_AD_CRD_ACQUIRED0", .desc = "CMS Agent0 AD Credits Acquired", .code = 0x0080, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ag0_ad_crd_occupancy0), /* shared */ .umasks = icx_unc_cha_ag0_ad_crd_occupancy0, }, { .name = "UNC_CHA_AG0_AD_CRD_ACQUIRED1", .desc = "CMS Agent0 AD Credits Acquired", .code = 0x0081, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ag0_ad_crd_occupancy1), /* shared */ .umasks = icx_unc_cha_ag0_ad_crd_occupancy1, }, { .name = "UNC_CHA_AG0_AD_CRD_OCCUPANCY0", .desc = "CMS Agent0 AD Credits Occupancy", .code = 0x0082, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ag0_ad_crd_occupancy0), .umasks = icx_unc_cha_ag0_ad_crd_occupancy0, }, { .name = "UNC_CHA_AG0_AD_CRD_OCCUPANCY1", .desc = "CMS Agent0 AD Credits Occupancy", .code = 0x0083, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ag0_ad_crd_occupancy1), .umasks = icx_unc_cha_ag0_ad_crd_occupancy1, }, { .name = "UNC_CHA_AG0_BL_CRD_ACQUIRED0", .desc = "CMS Agent0 BL Credits Acquired", .code = 0x0088, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ag0_bl_crd_occupancy0), /* 
shared */ .umasks = icx_unc_cha_ag0_bl_crd_occupancy0, }, { .name = "UNC_CHA_AG0_BL_CRD_ACQUIRED1", .desc = "CMS Agent0 BL Credits Acquired", .code = 0x0089, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ag0_bl_crd_occupancy1), /* shared */ .umasks = icx_unc_cha_ag0_bl_crd_occupancy1, }, { .name = "UNC_CHA_AG0_BL_CRD_OCCUPANCY0", .desc = "CMS Agent0 BL Credits Occupancy", .code = 0x008a, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ag0_bl_crd_occupancy0), .umasks = icx_unc_cha_ag0_bl_crd_occupancy0, }, { .name = "UNC_CHA_AG0_BL_CRD_OCCUPANCY1", .desc = "CMS Agent0 BL Credits Occupancy", .code = 0x008b, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ag0_bl_crd_occupancy1), .umasks = icx_unc_cha_ag0_bl_crd_occupancy1, }, { .name = "UNC_CHA_AG1_AD_CRD_ACQUIRED0", .desc = "CMS Agent1 AD Credits Acquired", .code = 0x0084, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ag1_ad_crd_occupancy0), /* shared */ .umasks = icx_unc_cha_ag1_ad_crd_occupancy0, }, { .name = "UNC_CHA_AG1_AD_CRD_ACQUIRED1", .desc = "CMS Agent1 AD Credits Acquired", .code = 0x0085, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ag1_ad_crd_occupancy1), /* shared */ .umasks = icx_unc_cha_ag1_ad_crd_occupancy1, }, { .name = "UNC_CHA_AG1_AD_CRD_OCCUPANCY0", .desc = "CMS Agent1 AD Credits Occupancy", .code = 0x0086, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ag1_ad_crd_occupancy0), .umasks = icx_unc_cha_ag1_ad_crd_occupancy0, }, { .name = "UNC_CHA_AG1_AD_CRD_OCCUPANCY1", .desc = "CMS Agent1 AD Credits Occupancy", .code = 0x0087, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ag1_ad_crd_occupancy1), .umasks = icx_unc_cha_ag1_ad_crd_occupancy1, }, { 
.name = "UNC_CHA_AG1_BL_CRD_ACQUIRED0", .desc = "CMS Agent1 BL Credits Acquired", .code = 0x008c, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ag1_bl_crd_acquired0), .umasks = icx_unc_cha_ag1_bl_crd_acquired0, }, { .name = "UNC_CHA_AG1_BL_CRD_ACQUIRED1", .desc = "CMS Agent1 BL Credits Acquired", .code = 0x008d, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ag1_bl_crd_occupancy1), /* shared */ .umasks = icx_unc_cha_ag1_bl_crd_occupancy1, }, { .name = "UNC_CHA_AG1_BL_CRD_OCCUPANCY0", .desc = "CMS Agent1 BL Credits Occupancy", .code = 0x008e, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_stall0_no_txr_horz_crd_ad_ag0), /* shared */ .umasks = icx_unc_cha_stall0_no_txr_horz_crd_ad_ag0, }, { .name = "UNC_CHA_AG1_BL_CRD_OCCUPANCY1", .desc = "CMS Agent1 BL Credits Occupancy", .code = 0x008f, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ag1_bl_crd_occupancy1), .umasks = icx_unc_cha_ag1_bl_crd_occupancy1, }, { .name = "UNC_CHA_BYPASS_CHA_IMC", .desc = "CHA to iMC Bypass", .code = 0x0057, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_bypass_cha_imc), .umasks = icx_unc_cha_bypass_cha_imc, }, { .name = "UNC_CHA_CLOCKTICKS", .desc = "Clockticks of the uncore caching and home agent (CHA)", .code = 0x0000, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_CHA_CMS_CLOCKTICKS", .desc = "CMS Clockticks", .code = 0x00c0, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_CHA_CORE_SNP", .desc = "Core Cross Snoops Issued", .code = 0x0033, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_core_snp), .umasks = icx_unc_cha_core_snp, }, { .name = "UNC_CHA_COUNTER0_OCCUPANCY", .desc = "Counter 0 Occupancy (experimental)", .code = 0x001f, .modmsk = 
ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_CHA_DIRECT_GO", .desc = "Direct GO", .code = 0x006e, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_direct_go), .umasks = icx_unc_cha_direct_go, }, { .name = "UNC_CHA_DIRECT_GO_OPC", .desc = "Direct GO", .code = 0x006d, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_direct_go_opc), .umasks = icx_unc_cha_direct_go_opc, }, { .name = "UNC_CHA_DIR_LOOKUP", .desc = "Multi-socket cacheline directory state lookups", .code = 0x0053, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_dir_lookup), .umasks = icx_unc_cha_dir_lookup, }, { .name = "UNC_CHA_DIR_UPDATE", .desc = "Multi-socket cacheline directory state updates; memory write due to directory update from the home agent (HA) pipe", .code = 0x0054, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_dir_update), .umasks = icx_unc_cha_dir_update, }, { .name = "UNC_CHA_DISTRESS_ASSERTED", .desc = "Distress signal asserted", .code = 0x00af, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_distress_asserted), .umasks = icx_unc_cha_distress_asserted, }, { .name = "UNC_CHA_EGRESS_ORDERING", .desc = "Egress Blocking due to Ordering requirements", .code = 0x00ba, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_egress_ordering), .umasks = icx_unc_cha_egress_ordering, }, { .name = "UNC_CHA_HITME_HIT", .desc = "Read request from a remote socket which hit in the HitMe Cache to a line In the E state", .code = 0x005f, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_hitme_hit), .umasks = icx_unc_cha_hitme_hit, }, { .name = "UNC_CHA_HITME_LOOKUP", .desc = "Counts Number of times HitMe Cache is accessed", .code = 0x005e, .modmsk = 
ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_hitme_lookup), .umasks = icx_unc_cha_hitme_lookup, }, { .name = "UNC_CHA_HITME_MISS", .desc = "Counts Number of Misses in HitMe Cache", .code = 0x0060, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_hitme_miss), .umasks = icx_unc_cha_hitme_miss, }, { .name = "UNC_CHA_HITME_UPDATE", .desc = "Counts the number of Allocate/Update to HitMe Cache", .code = 0x0061, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_hitme_update), .umasks = icx_unc_cha_hitme_update, }, { .name = "UNC_CHA_HORZ_RING_AD_IN_USE", .desc = "Horizontal AD Ring In Use", .code = 0x00b6, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_horz_ring_akc_in_use), /* shared */ .umasks = icx_unc_cha_horz_ring_akc_in_use, }, { .name = "UNC_CHA_HORZ_RING_AKC_IN_USE", .desc = "Horizontal AK Ring In Use", .code = 0x00bb, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_horz_ring_akc_in_use), .umasks = icx_unc_cha_horz_ring_akc_in_use, }, { .name = "UNC_CHA_HORZ_RING_AK_IN_USE", .desc = "Horizontal AK Ring In Use", .code = 0x00b7, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_horz_ring_bl_in_use), /* shared */ .umasks = icx_unc_cha_horz_ring_bl_in_use, }, { .name = "UNC_CHA_HORZ_RING_BL_IN_USE", .desc = "Horizontal BL Ring in Use", .code = 0x00b8, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_horz_ring_bl_in_use), .umasks = icx_unc_cha_horz_ring_bl_in_use, }, { .name = "UNC_CHA_HORZ_RING_IV_IN_USE", .desc = "Horizontal IV Ring in Use", .code = 0x00b9, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_horz_ring_iv_in_use), .umasks = icx_unc_cha_horz_ring_iv_in_use, }, { .name = 
"UNC_CHA_IMC_READS_COUNT", .desc = "Normal priority reads issued to the memory controller from the CHA", .code = 0x0059, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_imc_reads_count), .umasks = icx_unc_cha_imc_reads_count, }, { .name = "UNC_CHA_IMC_WRITES_COUNT", .desc = "CHA to iMC Full Line Writes Issued", .code = 0x005b, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_imc_writes_count), .umasks = icx_unc_cha_imc_writes_count, }, { .name = "UNC_CHA_LLC_LOOKUP", .desc = "Cache and Snoop Filter Lookups; Data Read Request", .code = 0x0034, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_llc_lookup), .umasks = icx_unc_cha_llc_lookup, }, { .name = "UNC_CHA_LLC_VICTIMS", .desc = "Lines Victimized", .code = 0x0037, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_llc_victims), .umasks = icx_unc_cha_llc_victims, }, { .name = "UNC_CHA_MISC", .desc = "Number of times that an RFO hit in S state.", .code = 0x0039, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_misc), .umasks = icx_unc_cha_misc, }, { .name = "UNC_CHA_MISC_EXTERNAL", .desc = "Miscellaneous Events (mostly from MS2IDI)", .code = 0x00e6, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_misc_external), .umasks = icx_unc_cha_misc_external, }, { .name = "UNC_CHA_OSB", .desc = "OSB Snoop Broadcast", .code = 0x0055, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_osb), .umasks = icx_unc_cha_osb, }, { .name = "UNC_CHA_PMM_MEMMODE_NM_INVITOX", .desc = "UNC_CHA_PMM_MEMMODE_NM_INVITOX.LOCAL", .code = 0x0065, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_pmm_memmode_nm_invitox), .umasks = icx_unc_cha_pmm_memmode_nm_invitox, }, 
{ .name = "UNC_CHA_PMM_MEMMODE_NM_SETCONFLICTS", .desc = "PMM Memory Mode related events", .code = 0x0064, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_pmm_memmode_nm_setconflicts), .umasks = icx_unc_cha_pmm_memmode_nm_setconflicts, }, { .name = "UNC_CHA_PMM_MEMMODE_NM_SETCONFLICTS2", .desc = "UNC_CHA_PMM_MEMMODE_NM_SETCONFLICTS2.IODC", .code = 0x0070, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_pmm_memmode_nm_setconflicts2), .umasks = icx_unc_cha_pmm_memmode_nm_setconflicts2, }, { .name = "UNC_CHA_PMM_QOS", .desc = "UNC_CHA_PMM_QOS.SLOW_INSERT", .code = 0x0066, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_pmm_qos), .umasks = icx_unc_cha_pmm_qos, }, { .name = "UNC_CHA_PMM_QOS_OCCUPANCY", .desc = "UNC_CHA_PMM_QOS_OCCUPANCY.DDR_SLOW_FIFO", .code = 0x0067, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_pmm_qos_occupancy), .umasks = icx_unc_cha_pmm_qos_occupancy, }, { .name = "UNC_CHA_READ_NO_CREDITS", .desc = "CHA iMC CHNx READ Credits Empty", .code = 0x0058, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_write_no_credits), /* shared */ .umasks = icx_unc_cha_write_no_credits, }, { .name = "UNC_CHA_REQUESTS", .desc = "Local INVITOE requests (exclusive ownership of a cache line without receiving data) that miss the SF/LLC and are sent to the CHA's home agent", .code = 0x0050, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_requests), .umasks = icx_unc_cha_requests, }, { .name = "UNC_CHA_RING_BOUNCES_HORZ", .desc = "Messages that bounced on the Horizontal Ring.", .code = 0x00ac, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ring_bounces_horz), .umasks = icx_unc_cha_ring_bounces_horz, }, { .name = 
"UNC_CHA_RING_BOUNCES_VERT", .desc = "Messages that bounced on the Vertical Ring.", .code = 0x00aa, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ring_sink_starved_vert), /* shared */ .umasks = icx_unc_cha_ring_sink_starved_vert, }, { .name = "UNC_CHA_RING_SINK_STARVED_HORZ", .desc = "Sink Starvation on Horizontal Ring", .code = 0x00ad, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ring_sink_starved_horz), .umasks = icx_unc_cha_ring_sink_starved_horz, }, { .name = "UNC_CHA_RING_SINK_STARVED_VERT", .desc = "Sink Starvation on Vertical Ring", .code = 0x00ab, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_ring_sink_starved_vert), .umasks = icx_unc_cha_ring_sink_starved_vert, }, { .name = "UNC_CHA_RING_SRC_THRTL", .desc = "Source Throttle (experimental)", .code = 0x00ae, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_CHA_RxC_INSERTS", .desc = "Ingress (from CMS) Allocations", .code = 0x0013, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_inserts), .umasks = icx_unc_cha_rxc_inserts, }, { .name = "UNC_CHA_RxC_IPQ0_REJECT", .desc = "IPQ Requests (from CMS) Rejected - Set 0", .code = 0x0022, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_irq0_reject), /* shared */ .umasks = icx_unc_cha_rxc_irq0_reject, }, { .name = "UNC_CHA_RxC_IPQ1_REJECT", .desc = "IPQ Requests (from CMS) Rejected - Set 1", .code = 0x0023, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_other1_retry), /* shared */ .umasks = icx_unc_cha_rxc_other1_retry, }, { .name = "UNC_CHA_RxC_IRQ0_REJECT", .desc = "IRQ Requests (from CMS) Rejected - Set 0", .code = 0x0018, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_irq0_reject), 
.umasks = icx_unc_cha_rxc_irq0_reject, }, { .name = "UNC_CHA_RxC_IRQ1_REJECT", .desc = "Ingress (from CMS) Request Queue Rejects; PhyAddr Match", .code = 0x0019, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_irq1_reject), .umasks = icx_unc_cha_rxc_irq1_reject, }, { .name = "UNC_CHA_RxC_ISMQ0_REJECT", .desc = "ISMQ Rejects - Set 0", .code = 0x0024, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_ismq0_retry), /* shared */ .umasks = icx_unc_cha_rxc_ismq0_retry, }, { .name = "UNC_CHA_RxC_ISMQ0_RETRY", .desc = "ISMQ Retries - Set 0", .code = 0x002c, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_ismq0_retry), .umasks = icx_unc_cha_rxc_ismq0_retry, }, { .name = "UNC_CHA_RxC_ISMQ1_REJECT", .desc = "ISMQ Rejects - Set 1", .code = 0x0025, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_ismq1_retry), /* shared */ .umasks = icx_unc_cha_rxc_ismq1_retry, }, { .name = "UNC_CHA_RxC_ISMQ1_RETRY", .desc = "ISMQ Retries - Set 1", .code = 0x002d, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_ismq1_retry), .umasks = icx_unc_cha_rxc_ismq1_retry, }, { .name = "UNC_CHA_RxC_OCCUPANCY", .desc = "Ingress (from CMS) Occupancy", .code = 0x0011, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0x1ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_occupancy), .umasks = icx_unc_cha_rxc_occupancy, }, { .name = "UNC_CHA_RxC_OTHER0_RETRY", .desc = "Other Retries - Set 0", .code = 0x002e, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_prq0_reject), /* shared */ .umasks = icx_unc_cha_rxc_prq0_reject, }, { .name = "UNC_CHA_RxC_OTHER1_RETRY", .desc = "Other Retries - Set 1", .code = 0x002f, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_other1_retry), .umasks = icx_unc_cha_rxc_other1_retry, }, { .name = "UNC_CHA_RxC_PRQ0_REJECT", .desc = "PRQ Requests (from CMS) Rejected - Set 0", .code = 0x0020, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_prq0_reject), .umasks = icx_unc_cha_rxc_prq0_reject, }, { .name = "UNC_CHA_RxC_PRQ1_REJECT", .desc = "PRQ Requests (from CMS) Rejected - Set 1", .code = 0x0021, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_req_q1_retry), /* shared */ .umasks = icx_unc_cha_rxc_req_q1_retry, }, { .name = "UNC_CHA_RxC_REQ_Q0_RETRY", .desc = "Request Queue Retries - Set 0", .code = 0x002a, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_rrq0_reject), /* shared */ .umasks = icx_unc_cha_rxc_rrq0_reject, }, { .name = "UNC_CHA_RxC_REQ_Q1_RETRY", .desc = "Request Queue Retries - Set 1", .code = 0x002b, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_req_q1_retry), .umasks = icx_unc_cha_rxc_req_q1_retry, }, { .name = "UNC_CHA_RxC_RRQ0_REJECT", .desc = "RRQ Rejects - Set 0", .code = 0x0026, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_rrq0_reject), .umasks = icx_unc_cha_rxc_rrq0_reject, }, { .name = "UNC_CHA_RxC_RRQ1_REJECT", .desc = "RRQ Rejects - Set 1", .code = 0x0027, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_wbq1_reject), /* shared */ .umasks = icx_unc_cha_rxc_wbq1_reject, }, { .name = "UNC_CHA_RxC_WBQ0_REJECT", .desc = "WBQ Rejects - Set 0", .code = 0x0028, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_wbq0_reject), .umasks = icx_unc_cha_rxc_wbq0_reject, }, { .name = "UNC_CHA_RxC_WBQ1_REJECT", .desc = "WBQ Rejects - Set 1", .code = 0x0029, .modmsk = 
ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxc_wbq1_reject), .umasks = icx_unc_cha_rxc_wbq1_reject, }, { .name = "UNC_CHA_RxR_BUSY_STARVED", .desc = "Transgress Injection Starvation", .code = 0x00e5, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_horz_ads_used), /* shared */ .umasks = icx_unc_cha_txr_horz_ads_used, }, { .name = "UNC_CHA_RxR_BYPASS", .desc = "Transgress Ingress Bypass", .code = 0x00e2, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxr_inserts), /* shared */ .umasks = icx_unc_cha_rxr_inserts, }, { .name = "UNC_CHA_RxR_CRD_STARVED", .desc = "Transgress Injection Starvation", .code = 0x00e3, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxr_crd_starved), .umasks = icx_unc_cha_rxr_crd_starved, }, { .name = "UNC_CHA_RxR_CRD_STARVED_1", .desc = "Transgress Injection Starvation (experimental)", .code = 0x00e4, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_CHA_RxR_INSERTS", .desc = "Transgress Ingress Allocations", .code = 0x00e1, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxr_inserts), .umasks = icx_unc_cha_rxr_inserts, }, { .name = "UNC_CHA_RxR_OCCUPANCY", .desc = "Transgress Ingress Occupancy", .code = 0x00e0, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_rxr_occupancy), .umasks = icx_unc_cha_rxr_occupancy, }, { .name = "UNC_CHA_SF_EVICTION", .desc = "Snoop filter capacity evictions for E-state entries.", .code = 0x003d, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_sf_eviction), .umasks = icx_unc_cha_sf_eviction, }, { .name = "UNC_CHA_SNOOPS_SENT", .desc = "Snoops Sent", .code = 0x0051, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(icx_unc_cha_snoops_sent), .umasks = icx_unc_cha_snoops_sent, }, { .name = "UNC_CHA_SNOOP_RESP", .desc = "Snoop Responses Received", .code = 0x005c, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_snoop_resp), .umasks = icx_unc_cha_snoop_resp, }, { .name = "UNC_CHA_SNOOP_RESP_LOCAL", .desc = "Snoop Responses Received Local", .code = 0x005d, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_snoop_resp_local), .umasks = icx_unc_cha_snoop_resp_local, }, { .name = "UNC_CHA_SNOOP_RSP_MISC", .desc = "Misc Snoop Responses Received", .code = 0x006b, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_snoop_rsp_misc), .umasks = icx_unc_cha_snoop_rsp_misc, }, { .name = "UNC_CHA_STALL0_NO_TxR_HORZ_CRD_AD_AG0", .desc = "Stall on No AD Agent0 Transgress Credits", .code = 0x00d0, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_stall0_no_txr_horz_crd_ad_ag0), .umasks = icx_unc_cha_stall0_no_txr_horz_crd_ad_ag0, }, { .name = "UNC_CHA_STALL0_NO_TxR_HORZ_CRD_AD_AG1", .desc = "Stall on No AD Agent1 Transgress Credits", .code = 0x00d2, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_stall0_no_txr_horz_crd_bl_ag0), /* shared */ .umasks = icx_unc_cha_stall0_no_txr_horz_crd_bl_ag0, }, { .name = "UNC_CHA_STALL0_NO_TxR_HORZ_CRD_BL_AG0", .desc = "Stall on No BL Agent0 Transgress Credits", .code = 0x00d4, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_stall0_no_txr_horz_crd_bl_ag0), .umasks = icx_unc_cha_stall0_no_txr_horz_crd_bl_ag0, }, { .name = "UNC_CHA_STALL0_NO_TxR_HORZ_CRD_BL_AG1", .desc = "Stall on No BL Agent1 Transgress Credits", .code = 0x00d6, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_stall0_no_txr_horz_crd_bl_ag1), 
.umasks = icx_unc_cha_stall0_no_txr_horz_crd_bl_ag1, }, { .name = "UNC_CHA_STALL1_NO_TxR_HORZ_CRD_AD_AG0", .desc = "Stall on No AD Agent0 Transgress Credits", .code = 0x00d1, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_stall1_no_txr_horz_crd_ad_ag1_1), /* shared */ .umasks = icx_unc_cha_stall1_no_txr_horz_crd_ad_ag1_1, }, { .name = "UNC_CHA_STALL1_NO_TxR_HORZ_CRD_AD_AG1_1", .desc = "Stall on No AD Agent1 Transgress Credits", .code = 0x00d3, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_stall1_no_txr_horz_crd_ad_ag1_1), .umasks = icx_unc_cha_stall1_no_txr_horz_crd_ad_ag1_1, }, { .name = "UNC_CHA_STALL1_NO_TxR_HORZ_CRD_BL_AG0_1", .desc = "Stall on No BL Agent0 Transgress Credits", .code = 0x00d5, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_stall1_no_txr_horz_crd_bl_ag1_1), /* shared */ .umasks = icx_unc_cha_stall1_no_txr_horz_crd_bl_ag1_1, }, { .name = "UNC_CHA_STALL1_NO_TxR_HORZ_CRD_BL_AG1_1", .desc = "Stall on No BL Agent1 Transgress Credits", .code = 0x00d7, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_stall1_no_txr_horz_crd_bl_ag1_1), .umasks = icx_unc_cha_stall1_no_txr_horz_crd_bl_ag1_1, }, { .name = "UNC_CHA_TOR_INSERTS", .desc = "TOR Inserts", .code = 0x0035, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_tor_inserts), .umasks = icx_unc_cha_tor_inserts, }, { .name = "UNC_CHA_TOR_OCCUPANCY", .desc = "TOR Occupancy", .code = 0x0036, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0x1ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_tor_occupancy), .umasks = icx_unc_cha_tor_occupancy, }, { .name = "UNC_CHA_TxR_HORZ_ADS_USED", .desc = "CMS Horizontal ADS Used", .code = 0x00a6, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_horz_ads_used), .umasks = 
icx_unc_cha_txr_horz_ads_used, }, { .name = "UNC_CHA_TxR_HORZ_BYPASS", .desc = "CMS Horizontal Bypass Used", .code = 0x00a7, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_horz_cycles_full), /* shared */ .umasks = icx_unc_cha_txr_horz_cycles_full, }, { .name = "UNC_CHA_TxR_HORZ_CYCLES_FULL", .desc = "Cycles CMS Horizontal Egress Queue is Full", .code = 0x00a2, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_horz_cycles_full), .umasks = icx_unc_cha_txr_horz_cycles_full, }, { .name = "UNC_CHA_TxR_HORZ_CYCLES_NE", .desc = "Cycles CMS Horizontal Egress Queue is Not Empty", .code = 0x00a3, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_horz_inserts), /* shared */ .umasks = icx_unc_cha_txr_horz_inserts, }, { .name = "UNC_CHA_TxR_HORZ_INSERTS", .desc = "CMS Horizontal Egress Inserts", .code = 0x00a1, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_horz_inserts), .umasks = icx_unc_cha_txr_horz_inserts, }, { .name = "UNC_CHA_TxR_HORZ_NACK", .desc = "CMS Horizontal Egress NACKs", .code = 0x00a4, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_horz_occupancy), /* shared */ .umasks = icx_unc_cha_txr_horz_occupancy, }, { .name = "UNC_CHA_TxR_HORZ_OCCUPANCY", .desc = "CMS Horizontal Egress Occupancy", .code = 0x00a0, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_horz_occupancy), .umasks = icx_unc_cha_txr_horz_occupancy, }, { .name = "UNC_CHA_TxR_HORZ_STARVED", .desc = "CMS Horizontal Egress Injection Starvation", .code = 0x00a5, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_horz_starved), .umasks = icx_unc_cha_txr_horz_starved, }, { .name = "UNC_CHA_TxR_VERT_ADS_USED", .desc = "CMS Vertical ADS Used", 
.code = 0x009c, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_vert_ads_used), .umasks = icx_unc_cha_txr_vert_ads_used, }, { .name = "UNC_CHA_TxR_VERT_BYPASS", .desc = "CMS Vertical ADS Used", .code = 0x009d, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_vert_bypass), .umasks = icx_unc_cha_txr_vert_bypass, }, { .name = "UNC_CHA_TxR_VERT_BYPASS_1", .desc = "CMS Vertical ADS Used", .code = 0x009e, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_vert_cycles_full1), /* shared */ .umasks = icx_unc_cha_txr_vert_cycles_full1, }, { .name = "UNC_CHA_TxR_VERT_CYCLES_FULL0", .desc = "Cycles CMS Vertical Egress Queue Is Full", .code = 0x0094, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_vert_cycles_ne0), /* shared */ .umasks = icx_unc_cha_txr_vert_cycles_ne0, }, { .name = "UNC_CHA_TxR_VERT_CYCLES_FULL1", .desc = "Cycles CMS Vertical Egress Queue Is Full", .code = 0x0095, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_vert_cycles_full1), .umasks = icx_unc_cha_txr_vert_cycles_full1, }, { .name = "UNC_CHA_TxR_VERT_CYCLES_NE0", .desc = "Cycles CMS Vertical Egress Queue Is Not Empty", .code = 0x0096, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_vert_cycles_ne0), .umasks = icx_unc_cha_txr_vert_cycles_ne0, }, { .name = "UNC_CHA_TxR_VERT_CYCLES_NE1", .desc = "Cycles CMS Vertical Egress Queue Is Not Empty", .code = 0x0097, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_vert_inserts1), /* shared */ .umasks = icx_unc_cha_txr_vert_inserts1, }, { .name = "UNC_CHA_TxR_VERT_INSERTS0", .desc = "CMS Vert Egress Allocations", .code = 0x0092, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_vert_occupancy0), /* shared */ .umasks = icx_unc_cha_txr_vert_occupancy0, }, { .name = "UNC_CHA_TxR_VERT_INSERTS1", .desc = "CMS Vert Egress Allocations", .code = 0x0093, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_vert_inserts1), .umasks = icx_unc_cha_txr_vert_inserts1, }, { .name = "UNC_CHA_TxR_VERT_NACK0", .desc = "CMS Vertical Egress NACKs", .code = 0x0098, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_vert_starved0), /* shared */ .umasks = icx_unc_cha_txr_vert_starved0, }, { .name = "UNC_CHA_TxR_VERT_NACK1", .desc = "CMS Vertical Egress NACKs", .code = 0x0099, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_vert_occupancy1), /* shared */ .umasks = icx_unc_cha_txr_vert_occupancy1, }, { .name = "UNC_CHA_TxR_VERT_OCCUPANCY0", .desc = "CMS Vert Egress Occupancy", .code = 0x0090, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_vert_occupancy0), .umasks = icx_unc_cha_txr_vert_occupancy0, }, { .name = "UNC_CHA_TxR_VERT_OCCUPANCY1", .desc = "CMS Vert Egress Occupancy", .code = 0x0091, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_vert_occupancy1), .umasks = icx_unc_cha_txr_vert_occupancy1, }, { .name = "UNC_CHA_TxR_VERT_STARVED0", .desc = "CMS Vertical Egress Injection Starvation", .code = 0x009a, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_vert_starved0), .umasks = icx_unc_cha_txr_vert_starved0, }, { .name = "UNC_CHA_TxR_VERT_STARVED1", .desc = "CMS Vertical Egress Injection Starvation", .code = 0x009b, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_txr_vert_starved1), .umasks = icx_unc_cha_txr_vert_starved1, }, { .name = "UNC_CHA_VERT_RING_AD_IN_USE", 
.desc = "Vertical AD Ring In Use", .code = 0x00b0, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_vert_ring_akc_in_use), /* shared */ .umasks = icx_unc_cha_vert_ring_akc_in_use, }, { .name = "UNC_CHA_VERT_RING_AKC_IN_USE", .desc = "Vertical AKC Ring In Use", .code = 0x00b4, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_vert_ring_akc_in_use), .umasks = icx_unc_cha_vert_ring_akc_in_use, }, { .name = "UNC_CHA_VERT_RING_AK_IN_USE", .desc = "Vertical AK Ring In Use", .code = 0x00b1, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_vert_ring_bl_in_use), /* shared */ .umasks = icx_unc_cha_vert_ring_bl_in_use, }, { .name = "UNC_CHA_VERT_RING_BL_IN_USE", .desc = "Vertical BL Ring in Use", .code = 0x00b2, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_vert_ring_bl_in_use), .umasks = icx_unc_cha_vert_ring_bl_in_use, }, { .name = "UNC_CHA_VERT_RING_IV_IN_USE", .desc = "Vertical IV Ring in Use", .code = 0x00b3, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_vert_ring_iv_in_use), .umasks = icx_unc_cha_vert_ring_iv_in_use, }, { .name = "UNC_CHA_VERT_RING_TGC_IN_USE", .desc = "Vertical TGC Ring In Use", .code = 0x00b5, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_vert_ring_tgc_in_use), .umasks = icx_unc_cha_vert_ring_tgc_in_use, }, { .name = "UNC_CHA_WB_PUSH_MTOI", .desc = "WbPushMtoI", .code = 0x0056, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_wb_push_mtoi), .umasks = icx_unc_cha_wb_push_mtoi, }, { .name = "UNC_CHA_WRITE_NO_CREDITS", .desc = "CHA iMC CHNx WRITE Credits Empty", .code = 0x005a, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_write_no_credits), .umasks = 
icx_unc_cha_write_no_credits, }, { .name = "UNC_CHA_XPT_PREF", .desc = "XPT Prefetches", .code = 0x006f, .modmsk = ICX_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_cha_xpt_pref), .umasks = icx_unc_cha_xpt_pref, }, }; /* 132 events available */

/* file: src/libpfm4/lib/events/intel_icx_unc_iio_events.h */
/*
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
* * PMU: icx_unc_iio (IcelakeX Uncore IIO) * Based on Intel JSON event table version : 1.21 * Based on Intel JSON event table published : 06/06/2023 */ static const intel_x86_umask_t icx_unc_iio_bandwidth_in[]={ { .uname = "PART0_FREERUN", .udesc = "TBD (experimental)", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PART1_FREERUN", .udesc = "TBD (experimental)", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PART2_FREERUN", .udesc = "TBD (experimental)", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PART3_FREERUN", .udesc = "TBD (experimental)", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PART4_FREERUN", .udesc = "TBD (experimental)", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PART5_FREERUN", .udesc = "TBD (experimental)", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PART6_FREERUN", .udesc = "TBD (experimental)", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PART7_FREERUN", .udesc = "TBD (experimental)", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_iio_bandwidth_out[]={ { .uname = "PART0_FREERUN", .udesc = "TBD (experimental)", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PART1_FREERUN", .udesc = "TBD (experimental)", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PART2_FREERUN", .udesc = "TBD (experimental)", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PART3_FREERUN", .udesc = "TBD (experimental)", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PART4_FREERUN", .udesc = "TBD (experimental)", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PART5_FREERUN", .udesc = "TBD (experimental)", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PART6_FREERUN", .udesc = "TBD (experimental)", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PART7_FREERUN", .udesc = "TBD (experimental)", .ucode = 
0x0000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_iio_comp_buf_occupancy[]={ { .uname = "CMPD_ALL", .udesc = "Part 0-7 (experimental)", .ucode = 0x400000000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMPD_ALL_PARTS", .udesc = "Part 0-7", .ucode = 0x400000000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMPD_PART0", .udesc = "Part 0", .ucode = 0x4000000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMPD_PART1", .udesc = "Part 1", .ucode = 0x4000000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMPD_PART2", .udesc = "Part 2", .ucode = 0x4000000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMPD_PART3", .udesc = "Part 3", .ucode = 0x4000000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMPD_PART4", .udesc = "Part 4", .ucode = 0x4000000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMPD_PART5", .udesc = "Part 5", .ucode = 0x4000000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMPD_PART6", .udesc = "Part 6", .ucode = 0x4000000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMPD_PART7", .udesc = "Part 7", .ucode = 0x4000000008000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_iio_data_req_by_cpu[]={ { .uname = "CFG_READ_IOMMU0", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7100000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_READ_IOMMU1", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7200000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_READ_PART0", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7001000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_READ_PART1", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7002000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_READ_PART2", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7004000004000ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "CFG_READ_PART3", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7008000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_READ_PART4", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7010000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_READ_PART5", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7020000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_READ_PART6", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7040000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_READ_PART7", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7080000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_IOMMU0", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7100000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_IOMMU1", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7200000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_PART0", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7001000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_PART1", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7002000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_PART2", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7004000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_PART3", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7008000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_PART4", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7010000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_PART5", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7020000001000ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_PART6", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7040000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_PART7", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7080000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_IOMMU0", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7100000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_IOMMU1", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7200000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_PART0", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7001000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_PART1", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7002000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_PART2", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7004000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_PART3", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7008000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_PART4", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7010000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_PART5", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7020000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_PART6", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7040000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_PART7", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7080000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_IOMMU0", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7100000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_IOMMU1", 
.udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7200000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_PART0", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7001000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_PART1", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7002000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_PART2", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7004000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_PART3", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7008000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_PART4", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7010000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_PART5", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7020000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_PART6", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7040000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_PART7", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7080000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_IOMMU0", .udesc = "Core reporting completion of Card read from Core DRAM (experimental)", .ucode = 0x7100000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_IOMMU1", .udesc = "Core reporting completion of Card read from Core DRAM (experimental)", .ucode = 0x7200000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART0", .udesc = "Core reporting completion of Card read from Core DRAM", .ucode = 0x7001000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART1", .udesc = "Core reporting completion of Card read from Core DRAM", .ucode = 0x7002000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART2", .udesc = "Core 
reporting completion of Card read from Core DRAM", .ucode = 0x7004000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART3", .udesc = "Core reporting completion of Card read from Core DRAM", .ucode = 0x7008000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART4", .udesc = "Core reporting completion of Card read from Core DRAM", .ucode = 0x7010000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART5", .udesc = "Core reporting completion of Card read from Core DRAM", .ucode = 0x7020000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART6", .udesc = "Core reporting completion of Card read from Core DRAM", .ucode = 0x7040000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART7", .udesc = "Core reporting completion of Card read from Core DRAM", .ucode = 0x7080000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_IOMMU0", .udesc = "Core writing to Card's MMIO space (experimental)", .ucode = 0x7100000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_IOMMU1", .udesc = "Core writing to Card's MMIO space (experimental)", .ucode = 0x7200000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART0", .udesc = "Core writing to Card's MMIO space", .ucode = 0x7001000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART1", .udesc = "Core writing to Card's MMIO space", .ucode = 0x7002000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART2", .udesc = "Core writing to Card's MMIO space", .ucode = 0x7004000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART3", .udesc = "Core writing to Card's MMIO space", .ucode = 0x7008000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART4", .udesc = "Core writing to Card's MMIO space", .ucode = 0x7010000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART5", .udesc = "Core writing to Card's MMIO space", .ucode = 0x7020000000100ull, .uflags = 
INTEL_X86_NCOMBO, },
{ .uname = "MEM_WRITE_PART6", .udesc = "Core writing to Card's MMIO space", .ucode = 0x7040000000100ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_WRITE_PART7", .udesc = "Core writing to Card's MMIO space", .ucode = 0x7080000000100ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_IOMMU0", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7100000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_IOMMU1", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7200000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_PART0", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7001000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_PART1", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7002000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_PART2", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7004000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_PART3", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7008000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_PART4", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7010000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_PART5", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7020000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_PART6", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7040000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_PART7", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7080000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_IOMMU0", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7100000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_IOMMU1", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7200000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_PART0", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7001000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_PART1", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7002000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_PART2", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7004000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_PART3", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7008000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_PART4", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7010000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_PART5", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7020000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_PART6", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7040000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_PART7", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7080000000200ull, .uflags = INTEL_X86_NCOMBO, },
};
static const intel_x86_umask_t icx_unc_iio_data_req_of_cpu[]={
{ .uname = "ATOMIC_IOMMU0", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7100000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_IOMMU1", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7200000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_PART0", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7001000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_PART1", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7002000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_PART2", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7004000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_PART3", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7008000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_PART4", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7010000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_PART5", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7020000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_PART6", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7040000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_PART7", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7080000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_IOMMU0", .udesc = "CmpD - device sending completion to CPU request (experimental)", .ucode = 0x7100000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_IOMMU1", .udesc = "CmpD - device sending completion to CPU request (experimental)", .ucode = 0x7200000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_PART0", .udesc = "CmpD - device sending completion to CPU request", .ucode = 0x7001000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_PART1", .udesc = "CmpD - device sending completion to CPU request", .ucode = 0x7002000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_PART2", .udesc = "CmpD - device sending completion to CPU request", .ucode = 0x7004000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_PART3", .udesc = "CmpD - device sending completion to CPU request", .ucode = 0x7008000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_PART4", .udesc = "CmpD - device sending completion to CPU request", .ucode = 0x7010000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_PART5", .udesc = "CmpD - device sending completion to CPU request", .ucode = 0x7020000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_PART6", .udesc = "CmpD - device sending completion to CPU request", .ucode = 0x7040000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_PART7", .udesc = "CmpD - device sending completion to CPU request", .ucode = 0x7080000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_READ_IOMMU0", .udesc = "Card reading from DRAM (experimental)", .ucode = 0x7100000000400ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_READ_IOMMU1", .udesc = "Card reading from DRAM (experimental)", .ucode = 0x7200000000400ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_READ_PART0", .udesc = "Card reading from DRAM", .ucode = 0x7001000000400ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_READ_PART1", .udesc = "Card reading from DRAM", .ucode = 0x7002000000400ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_READ_PART2", .udesc = "Card reading from DRAM", .ucode = 0x7004000000400ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_READ_PART3", .udesc = "Card reading from DRAM", .ucode = 0x7008000000400ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_READ_PART4", .udesc = "Card reading from DRAM", .ucode = 0x7010000000400ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname =
"MEM_READ_PART5", .udesc = "Card reading from DRAM", .ucode = 0x7020000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART6", .udesc = "Card reading from DRAM", .ucode = 0x7040000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART7", .udesc = "Card reading from DRAM", .ucode = 0x7080000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_IOMMU0", .udesc = "Card writing to DRAM (experimental)", .ucode = 0x7100000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_IOMMU1", .udesc = "Card writing to DRAM (experimental)", .ucode = 0x7200000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART0", .udesc = "Card writing to DRAM", .ucode = 0x7001000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART1", .udesc = "Card writing to DRAM", .ucode = 0x7002000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART2", .udesc = "Card writing to DRAM", .ucode = 0x7004000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART3", .udesc = "Card writing to DRAM", .ucode = 0x7008000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART4", .udesc = "Card writing to DRAM", .ucode = 0x7010000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART5", .udesc = "Card writing to DRAM", .ucode = 0x7020000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART6", .udesc = "Card writing to DRAM", .ucode = 0x7040000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART7", .udesc = "Card writing to DRAM", .ucode = 0x7080000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_IOMMU0", .udesc = "Messages (experimental)", .ucode = 0x7100000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_IOMMU1", .udesc = "Messages (experimental)", .ucode = 0x7200000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_PART0", .udesc = "Messages (experimental)", .ucode = 0x7001000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"MSG_PART1", .udesc = "Messages (experimental)", .ucode = 0x7002000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_PART2", .udesc = "Messages (experimental)", .ucode = 0x7004000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_PART3", .udesc = "Messages (experimental)", .ucode = 0x7008000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_PART4", .udesc = "Messages (experimental)", .ucode = 0x7010000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_PART5", .udesc = "Messages (experimental)", .ucode = 0x7020000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_PART6", .udesc = "Messages (experimental)", .ucode = 0x7040000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_PART7", .udesc = "Messages (experimental)", .ucode = 0x7080000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_IOMMU0", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7100000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_IOMMU1", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7200000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_PART0", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7001000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_PART1", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7002000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_PART2", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7004000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_PART3", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7008000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_PART4", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 
0x7010000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_PART5", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7020000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_PART6", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7040000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_PART7", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7080000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_IOMMU0", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7100000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_IOMMU1", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7200000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_PART0", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7001000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_PART1", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7002000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_PART2", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7004000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_PART3", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7008000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_PART4", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7010000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_PART5", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7020000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_PART6", .udesc = 
"Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7040000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_PART7", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7080000000200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_iio_inbound_arb_won[]={ { .uname = "DATA", .udesc = "Passing data to be written (experimental)", .ucode = 0x70ff000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FINAL_RD_WR", .udesc = "Issuing final read or write of line (experimental)", .ucode = 0x70ff000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IOMMU_HIT", .udesc = "Processing response from IOMMU (experimental)", .ucode = 0x70ff000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IOMMU_REQ", .udesc = "Issuing to IOMMU (experimental)", .ucode = 0x70ff000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REQ_OWN", .udesc = "Request Ownership (experimental)", .ucode = 0x70ff000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR", .udesc = "Writing line (experimental)", .ucode = 0x70ff000001000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_iio_iommu0[]={ { .uname = "1G_HITS", .udesc = "IOTLB Hits to a 1G Page (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "2M_HITS", .udesc = "IOTLB Hits to a 2M Page (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "4K_HITS", .udesc = "IOTLB Hits to a 4K Page (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_LOOKUPS", .udesc = "IOTLB lookups all (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CTXT_CACHE_HITS", .udesc = "Context cache hits (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CTXT_CACHE_LOOKUPS", .udesc = "Context cache lookups (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"FIRST_LOOKUPS", .udesc = "IOTLB lookups first (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISSES", .udesc = "IOTLB Fills (same as IOTLB miss) (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_iio_iommu1[]={ { .uname = "CYC_PWT_FULL", .udesc = "Cycles PWT full (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NUM_MEM_ACCESSES", .udesc = "IOMMU memory access (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PWC_1G_HITS", .udesc = "PWC Hit to a 1G page (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PWC_2M_HITS", .udesc = "PWC Hit to a 2M page (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PWC_4K_HITS", .udesc = "PWC Hit to a 4K page (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PWC_512G_HITS", .udesc = "PWT Hit to a 256T page (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PWC_CACHE_FILLS", .udesc = "PageWalk cache fill (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PWT_CACHE_LOOKUPS", .udesc = "PageWalk cache lookup (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_iio_iommu3[]={ { .uname = "INT_CACHE_HITS", .udesc = "Interrupt Entry cache hit (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "INT_CACHE_LOOKUPS", .udesc = "Interrupt Entry cache lookup (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NUM_CTXT_CACHE_INVAL_DEVICE", .udesc = "Device-selective Context cache invalidation cycles (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NUM_CTXT_CACHE_INVAL_DOMAIN", .udesc = "Domain-selective Context cache invalidation cycles (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"NUM_CTXT_CACHE_INVAL_GBL", .udesc = "Context cache global invalidation cycles (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NUM_INVAL_DOMAIN", .udesc = "Domain-selective IOTLB invalidation cycles (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NUM_INVAL_GBL", .udesc = "Global IOTLB invalidation cycles (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NUM_INVAL_PAGE", .udesc = "Page-selective IOTLB invalidation cycles (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_iio_mask_match_or[]={ { .uname = "BUS0", .udesc = "Non-PCIE bus (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BUS0_BUS1", .udesc = "Non-PCIE bus and PCIE bus (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BUS0_NOT_BUS1", .udesc = "Non-PCIE bus and !(PCIE bus) (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BUS1", .udesc = "PCIE bus (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NOT_BUS0_BUS1", .udesc = "!(Non-PCIE bus) and PCIE bus (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NOT_BUS0_NOT_BUS1", .udesc = "!(Non-PCIE bus) and !(PCIE bus) (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_iio_num_oustanding_req_from_cpu[]={ { .uname = "TO_IO", .udesc = "To device (experimental)", .ucode = 0x70ff000000800ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t icx_unc_iio_num_outstanding_req_of_cpu[]={ { .uname = "DATA", .udesc = "Passing data to be written (experimental)", .ucode = 0x70ff000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FINAL_RD_WR", .udesc = "Issuing final read or write of line (experimental)", .ucode = 0x70ff000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IOMMU_HIT", .udesc = 
"Processing response from IOMMU (experimental)", .ucode = 0x70ff000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IOMMU_REQ", .udesc = "Issuing to IOMMU (experimental)", .ucode = 0x70ff000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REQ_OWN", .udesc = "Request Ownership (experimental)", .ucode = 0x70ff000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR", .udesc = "Writing line (experimental)", .ucode = 0x70ff000001000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_iio_num_req_from_cpu[]={ { .uname = "IRP", .udesc = "From IRP (experimental)", .ucode = 0x70ff000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ITC", .udesc = "From ITC (experimental)", .ucode = 0x70ff000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PREALLOC", .udesc = "Completion allocations (experimental)", .ucode = 0x70ff000000400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_iio_num_req_of_cpu[]={ { .uname = "ALL_DROP", .udesc = "Drop request (experimental)", .ucode = 0x70ff000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "COMMIT_ALL", .udesc = "All", .ucode = 0x70ff000000100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_iio_num_req_of_cpu_by_tgt[]={ { .uname = "ABORT", .udesc = "Abort (experimental)", .ucode = 0x70ff000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CONFINED_P2P", .udesc = "Confined P2P (experimental)", .ucode = 0x70ff000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOC_P2P", .udesc = "Local P2P (experimental)", .ucode = 0x70ff000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MCAST", .udesc = "Multi-cast (experimental)", .ucode = 0x70ff000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM", .udesc = "Memory (experimental)", .ucode = 0x70ff000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSGB", .udesc = "MsgB (experimental)", .ucode = 0x70ff000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REM_P2P", .udesc 
= "Remote P2P (experimental)", .ucode = 0x70ff000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "UBOX", .udesc = "Ubox (experimental)", .ucode = 0x70ff000000400ull, .uflags = INTEL_X86_NCOMBO, },
};
static const intel_x86_umask_t icx_unc_iio_outbound_cl_reqs_issued[]={
{ .uname = "TO_IO", .udesc = "64B requests issued to device (experimental)", .ucode = 0x70ff000000800ull, .uflags = INTEL_X86_DFL, },
};
static const intel_x86_umask_t icx_unc_iio_outbound_tlp_reqs_issued[]={
{ .uname = "TO_IO", .udesc = "To device (experimental)", .ucode = 0x70ff000000800ull, .uflags = INTEL_X86_DFL, },
};
static const intel_x86_umask_t icx_unc_iio_req_from_pcie_cmpl[]={
{ .uname = "DATA", .udesc = "Passing data to be written (experimental)", .ucode = 0x70ff000002000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "FINAL_RD_WR", .udesc = "Issuing final read or write of line (experimental)", .ucode = 0x70ff000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "IOMMU_HIT", .udesc = "Processing response from IOMMU (experimental)", .ucode = 0x70ff000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "IOMMU_REQ", .udesc = "Issuing to IOMMU (experimental)", .ucode = 0x70ff000000100ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "REQ_OWN", .udesc = "Request Ownership (experimental)", .ucode = 0x70ff000000400ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "WR", .udesc = "Writing line (experimental)", .ucode = 0x70ff000001000ull, .uflags = INTEL_X86_NCOMBO, },
};
static const intel_x86_umask_t icx_unc_iio_req_from_pcie_pass_cmpl[]={
{ .uname = "DATA", .udesc = "Passing data to be written (experimental)", .ucode = 0x70ff000002000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "FINAL_RD_WR", .udesc = "Issuing final read or write of line (experimental)", .ucode = 0x70ff000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "REQ_OWN", .udesc = "Request Ownership (experimental)", .ucode = 0x70ff000000400ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "WR", .udesc = "Writing line (experimental)", .ucode = 0x70ff000001000ull, .uflags = INTEL_X86_NCOMBO, },
};
static const intel_x86_umask_t icx_unc_iio_txn_req_by_cpu[]={
{ .uname = "CFG_READ_IOMMU0", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7100000004000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CFG_READ_IOMMU1", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7200000004000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CFG_READ_PART0", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7001000004000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CFG_READ_PART1", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7002000004000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CFG_READ_PART2", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7004000004000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CFG_READ_PART3", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7008000004000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CFG_READ_PART4", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7010000004000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CFG_READ_PART5", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7020000004000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CFG_READ_PART6", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7040000004000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CFG_READ_PART7", .udesc = "Core reading from Card's PCICFG space (experimental)", .ucode = 0x7080000004000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CFG_WRITE_IOMMU0", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7100000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CFG_WRITE_IOMMU1", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7200000001000ull, .uflags = INTEL_X86_NCOMBO, },
{
.uname = "CFG_WRITE_PART0", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7001000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_PART1", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7002000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_PART2", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7004000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_PART3", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7008000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_PART4", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7010000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_PART5", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7020000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_PART6", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7040000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CFG_WRITE_PART7", .udesc = "Core writing to Card's PCICFG space (experimental)", .ucode = 0x7080000001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_IOMMU0", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7100000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_IOMMU1", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7200000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_PART0", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7001000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_PART1", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7002000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_PART2", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7004000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_PART3", 
.udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7008000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_PART4", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7010000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_PART5", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7020000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_PART6", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7040000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_READ_PART7", .udesc = "Core reading from Card's IO space (experimental)", .ucode = 0x7080000008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_IOMMU0", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7100000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_IOMMU1", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7200000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_PART0", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7001000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_PART1", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7002000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_PART2", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7004000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_PART3", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7008000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_PART4", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7010000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_PART5", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7020000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_PART6", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 
0x7040000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WRITE_PART7", .udesc = "Core writing to Card's IO space (experimental)", .ucode = 0x7080000002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_IOMMU0", .udesc = "Core reading from Card's MMIO space (experimental)", .ucode = 0x7100000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_IOMMU1", .udesc = "Core reading from Card's MMIO space (experimental)", .ucode = 0x7200000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART0", .udesc = "Core reading from Card's MMIO space", .ucode = 0x7001000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART1", .udesc = "Core reading from Card's MMIO space", .ucode = 0x7002000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART2", .udesc = "Core reading from Card's MMIO space", .ucode = 0x7004000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART3", .udesc = "Core reading from Card's MMIO space", .ucode = 0x7008000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART4", .udesc = "Core reading from Card's MMIO space", .ucode = 0x7010000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART5", .udesc = "Core reading from Card's MMIO space", .ucode = 0x7020000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART6", .udesc = "Core reading from Card's MMIO space", .ucode = 0x7040000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART7", .udesc = "Core reading from Card's MMIO space", .ucode = 0x7080000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_IOMMU0", .udesc = "Core writing to Card's MMIO space (experimental)", .ucode = 0x7100000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_IOMMU1", .udesc = "Core writing to Card's MMIO space (experimental)", .ucode = 0x7200000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART0", .udesc = "Core writing to Card's MMIO space", .ucode = 
0x7001000000100ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_WRITE_PART1", .udesc = "Core writing to Card's MMIO space", .ucode = 0x7002000000100ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_WRITE_PART2", .udesc = "Core writing to Card's MMIO space", .ucode = 0x7004000000100ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_WRITE_PART3", .udesc = "Core writing to Card's MMIO space", .ucode = 0x7008000000100ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_WRITE_PART4", .udesc = "Core writing to Card's MMIO space", .ucode = 0x7010000000100ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_WRITE_PART5", .udesc = "Core writing to Card's MMIO space", .ucode = 0x7020000000100ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_WRITE_PART6", .udesc = "Core writing to Card's MMIO space", .ucode = 0x7040000000100ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_WRITE_PART7", .udesc = "Core writing to Card's MMIO space", .ucode = 0x7080000000100ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_IOMMU0", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7100000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_IOMMU1", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7200000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_PART0", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7001000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_PART1", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7002000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_PART2", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7004000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_PART3", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7008000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_PART4", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7010000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_PART5", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7020000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_PART6", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7040000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_READ_PART7", .udesc = "Another card (different IIO stack) reading from this card. (experimental)", .ucode = 0x7080000000800ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_IOMMU0", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7100000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_IOMMU1", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7200000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_PART0", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7001000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_PART1", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7002000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_PART2", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7004000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_PART3", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7008000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_PART4", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7010000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_PART5", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7020000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_PART6", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7040000000200ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "PEER_WRITE_PART7", .udesc = "Another card (different IIO stack) writing to this card. (experimental)", .ucode = 0x7080000000200ull, .uflags = INTEL_X86_NCOMBO, },
};
static const intel_x86_umask_t icx_unc_iio_txn_req_of_cpu[]={
{ .uname = "ATOMIC_IOMMU0", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7100000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_IOMMU1", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7200000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_PART0", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7001000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_PART1", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7002000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_PART2", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7004000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_PART3", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7008000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_PART4", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7010000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_PART5", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7020000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_PART6", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7040000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "ATOMIC_PART7", .udesc = "Atomic requests targeting DRAM (experimental)", .ucode = 0x7080000001000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_IOMMU0", .udesc = "CmpD - device sending completion to CPU request (experimental)", .ucode = 0x7100000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_IOMMU1", .udesc = "CmpD - device sending completion to CPU request (experimental)", .ucode = 0x7200000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_PART0", .udesc = "CmpD - device sending completion to CPU request", .ucode = 0x7001000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_PART1", .udesc = "CmpD - device sending completion to CPU request", .ucode = 0x7002000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_PART2", .udesc = "CmpD - device sending completion to CPU request", .ucode = 0x7004000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_PART3", .udesc = "CmpD - device sending completion to CPU request", .ucode = 0x7008000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_PART4", .udesc = "CmpD - device sending completion to CPU request", .ucode = 0x7010000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_PART5", .udesc = "CmpD - device sending completion to CPU request", .ucode = 0x7020000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_PART6", .udesc = "CmpD - device sending completion to CPU request", .ucode = 0x7040000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "CMPD_PART7", .udesc = "CmpD - device sending completion to CPU request", .ucode = 0x7080000008000ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_READ_IOMMU0", .udesc = "Card reading from DRAM (experimental)", .ucode = 0x7100000000400ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_READ_IOMMU1", .udesc = "Card reading from DRAM (experimental)", .ucode = 0x7200000000400ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_READ_PART0", .udesc = "Card reading from DRAM", .ucode = 0x7001000000400ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_READ_PART1", .udesc = "Card reading from DRAM", .ucode = 0x7002000000400ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MEM_READ_PART2", .udesc =
"Card reading from DRAM", .ucode = 0x7004000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART3", .udesc = "Card reading from DRAM", .ucode = 0x7008000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART4", .udesc = "Card reading from DRAM", .ucode = 0x7010000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART5", .udesc = "Card reading from DRAM", .ucode = 0x7020000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART6", .udesc = "Card reading from DRAM", .ucode = 0x7040000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_READ_PART7", .udesc = "Card reading from DRAM", .ucode = 0x7080000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_IOMMU0", .udesc = "Card writing to DRAM (experimental)", .ucode = 0x7100000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_IOMMU1", .udesc = "Card writing to DRAM (experimental)", .ucode = 0x7200000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART0", .udesc = "Card writing to DRAM", .ucode = 0x7001000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART1", .udesc = "Card writing to DRAM", .ucode = 0x7002000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART2", .udesc = "Card writing to DRAM", .ucode = 0x7004000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART3", .udesc = "Card writing to DRAM", .ucode = 0x7008000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART4", .udesc = "Card writing to DRAM", .ucode = 0x7010000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART5", .udesc = "Card writing to DRAM", .ucode = 0x7020000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART6", .udesc = "Card writing to DRAM", .ucode = 0x7040000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM_WRITE_PART7", .udesc = "Card writing to DRAM", .ucode = 0x7080000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_IOMMU0", .udesc 
= "Messages (experimental)", .ucode = 0x7100000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_IOMMU1", .udesc = "Messages (experimental)", .ucode = 0x7200000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_PART0", .udesc = "Messages (experimental)", .ucode = 0x7001000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_PART1", .udesc = "Messages (experimental)", .ucode = 0x7002000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_PART2", .udesc = "Messages (experimental)", .ucode = 0x7004000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_PART3", .udesc = "Messages (experimental)", .ucode = 0x7008000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_PART4", .udesc = "Messages (experimental)", .ucode = 0x7010000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_PART5", .udesc = "Messages (experimental)", .ucode = 0x7020000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_PART6", .udesc = "Messages (experimental)", .ucode = 0x7040000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG_PART7", .udesc = "Messages (experimental)", .ucode = 0x7080000004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_IOMMU0", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7100000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_IOMMU1", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7200000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_PART0", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7001000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_PART1", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7002000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_PART2", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 
0x7004000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_PART3", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7008000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_PART4", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7010000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_PART5", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7020000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_PART6", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7040000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_READ_PART7", .udesc = "Card reading from another Card (same or different stack) (experimental)", .ucode = 0x7080000000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_IOMMU0", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7100000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_IOMMU1", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7200000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_PART0", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7001000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_PART1", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7002000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_PART2", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7004000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_PART3", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7008000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_PART4", .udesc 
= "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7010000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_PART5", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7020000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_PART6", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7040000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PEER_WRITE_PART7", .udesc = "Card writing to another Card (same or different stack) (experimental)", .ucode = 0x7080000000200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_icx_unc_iio_pe[]={ { .name = "UNC_IIO_BANDWIDTH_IN", .desc = "Free running counter that increments for every 32 bytes of data sent from the IO agent to the SOC", .code = 0x0000, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0x2ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_bandwidth_in), .umasks = icx_unc_iio_bandwidth_in, }, { .name = "UNC_IIO_BANDWIDTH_OUT", .desc = "Free running counter that increments for every 32 bytes of data sent from the IO agent to the SOC", .code = 0x0000, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0x200ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_bandwidth_out), .umasks = icx_unc_iio_bandwidth_out, }, { .name = "UNC_IIO_CLOCKTICKS", .desc = "Clockticks of the integrated IO (IIO) traffic controller", .code = 0x0001, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_IIO_CLOCKTICKS_FREERUN", .desc = "Free running counter that increments for IIO clocktick", .code = 0x0000, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0x1ull, }, { .name = "UNC_IIO_COMP_BUF_OCCUPANCY", .desc = "PCIe Completion Buffer Occupancy of completions with data", .code = 0x00d5, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xcull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_comp_buf_occupancy), .umasks = icx_unc_iio_comp_buf_occupancy, }, { .name = 
"UNC_IIO_DATA_REQ_BY_CPU", .desc = "Data requested by the CPU", .code = 0x00c0, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xcull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_data_req_by_cpu), .umasks = icx_unc_iio_data_req_by_cpu, }, { .name = "UNC_IIO_DATA_REQ_OF_CPU", .desc = "Four byte data request of the CPU", .code = 0x0083, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_data_req_of_cpu), .umasks = icx_unc_iio_data_req_of_cpu, }, { .name = "UNC_IIO_INBOUND_ARB_REQ", .desc = "Incoming arbitration requests", .code = 0x0086, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_inbound_arb_won), /* shared */ .umasks = icx_unc_iio_inbound_arb_won, }, { .name = "UNC_IIO_INBOUND_ARB_WON", .desc = "Incoming arbitration requests granted", .code = 0x0087, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_inbound_arb_won), .umasks = icx_unc_iio_inbound_arb_won, }, { .name = "UNC_IIO_IOMMU0", .desc = "TBD", .code = 0x0040, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_iommu0), .umasks = icx_unc_iio_iommu0, }, { .name = "UNC_IIO_IOMMU1", .desc = "TBD", .code = 0x0041, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_iommu1), .umasks = icx_unc_iio_iommu1, }, { .name = "UNC_IIO_IOMMU3", .desc = "TBD", .code = 0x0043, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_iommu3), .umasks = icx_unc_iio_iommu3, }, { .name = "UNC_IIO_MASK_MATCH_AND", .desc = "AND Mask/match for debug bus", .code = 0x0002, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_mask_match_or), /* shared */ .umasks = icx_unc_iio_mask_match_or, }, { .name = "UNC_IIO_MASK_MATCH_OR", .desc = "OR Mask/match for debug bus", .code = 0x0003, .modmsk = ICX_UNC_IIO_ATTRS, 
.cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_mask_match_or), .umasks = icx_unc_iio_mask_match_or, }, { .name = "UNC_IIO_NOTHING", .desc = "Counting disabled (experimental)", .code = 0x0080, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_IIO_NUM_OUSTANDING_REQ_FROM_CPU", .desc = "Occupancy of outbound request queue", .code = 0x00c5, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xcull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_num_oustanding_req_from_cpu), .umasks = icx_unc_iio_num_oustanding_req_from_cpu, }, { .name = "UNC_IIO_NUM_OUTSTANDING_REQ_OF_CPU", .desc = "TBD", .code = 0x0088, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xcull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_num_outstanding_req_of_cpu), .umasks = icx_unc_iio_num_outstanding_req_of_cpu, }, { .name = "UNC_IIO_NUM_REQ_FROM_CPU", .desc = "Number requests sent to PCIe from main die", .code = 0x00c2, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_num_req_from_cpu), .umasks = icx_unc_iio_num_req_from_cpu, }, { .name = "UNC_IIO_NUM_REQ_OF_CPU", .desc = "Number requests PCIe makes of the main die", .code = 0x0085, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_num_req_of_cpu), .umasks = icx_unc_iio_num_req_of_cpu, }, { .name = "UNC_IIO_NUM_REQ_OF_CPU_BY_TGT", .desc = "Num requests sent by PCIe - by target", .code = 0x008e, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_num_req_of_cpu_by_tgt), .umasks = icx_unc_iio_num_req_of_cpu_by_tgt, }, { .name = "UNC_IIO_NUM_TGT_MATCHED_REQ_OF_CPU", .desc = "ITC address map 1 (experimental)", .code = 0x008f, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_IIO_OUTBOUND_CL_REQS_ISSUED", .desc = "Outbound cacheline requests issued", .code = 0x00d0, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(icx_unc_iio_outbound_cl_reqs_issued), .umasks = icx_unc_iio_outbound_cl_reqs_issued, }, { .name = "UNC_IIO_OUTBOUND_TLP_REQS_ISSUED", .desc = "Outbound TLP (transaction layer packet) requests issued", .code = 0x00d1, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_outbound_tlp_reqs_issued), .umasks = icx_unc_iio_outbound_tlp_reqs_issued, }, { .name = "UNC_IIO_PWT_OCCUPANCY", .desc = "PWT occupancy (experimental)", .code = 0x0042, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_IIO_REQ_FROM_PCIE_CL_CMPL", .desc = "PCIe Request - cacheline complete", .code = 0x0091, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_req_from_pcie_pass_cmpl), /* shared */ .umasks = icx_unc_iio_req_from_pcie_pass_cmpl, }, { .name = "UNC_IIO_REQ_FROM_PCIE_CMPL", .desc = "PCIe Request complete", .code = 0x0092, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_req_from_pcie_cmpl), .umasks = icx_unc_iio_req_from_pcie_cmpl, }, { .name = "UNC_IIO_REQ_FROM_PCIE_PASS_CMPL", .desc = "PCIe Request - pass complete", .code = 0x0090, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_req_from_pcie_pass_cmpl), .umasks = icx_unc_iio_req_from_pcie_pass_cmpl, }, { .name = "UNC_IIO_SYMBOL_TIMES", .desc = "Symbol Times on Link (experimental)", .code = 0x0082, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_IIO_TXN_REQ_BY_CPU", .desc = "Number Transactions requested by the CPU", .code = 0x00c1, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_txn_req_by_cpu), .umasks = icx_unc_iio_txn_req_by_cpu, }, { .name = "UNC_IIO_TXN_REQ_OF_CPU", .desc = "Number Transactions requested of the CPU", .code = 0x0084, .modmsk = ICX_UNC_IIO_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_iio_txn_req_of_cpu), 
.umasks = icx_unc_iio_txn_req_of_cpu, }, }; /* 31 events available */ papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_icx_unc_imc_events.h /* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux.
* * PMU: icx_unc_imc (IcelakeX Uncore IMC) * Based on Intel JSON event table version : 1.21 * Based on Intel JSON event table published : 06/06/2023 */ static const intel_x86_umask_t icx_unc_m_act_count[]={ { .uname = "ALL", .udesc = "All Activates", .ucode = 0x0b00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "BYP", .udesc = "Activate due to Bypass (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_cas_count[]={ { .uname = "ALL", .udesc = "TBD", .ucode = 0x3f00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "RD", .udesc = "TBD", .ucode = 0x0f00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_PRE_REG", .udesc = "DRAM RD_CAS commands w/auto-pre (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_PRE_UNDERFILL", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_REG", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_UNDERFILL", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR", .udesc = "TBD", .ucode = 0x3000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_NONPRE", .udesc = "DRAM WR_CAS commands w/o auto-pre (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_PRE", .udesc = "DRAM WR_CAS commands w/ auto-pre (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_dram_refresh[]={ { .uname = "HIGH", .udesc = "TBD", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OPPORTUNISTIC", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PANIC", .udesc = "TBD", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_pcls[]={ { .uname = "RD", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TOTAL", .udesc 
= "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_pmm_cmd1[]={ { .uname = "ALL", .udesc = "All", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "MISC", .udesc = "Misc Commands (error, flow ACKs) (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISC_GNT", .udesc = "Misc GNTs (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD", .udesc = "Reads - RPQ", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RPQ_GNTS", .udesc = "RPQ GNTs (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UFILL_RD", .udesc = "Underfill reads", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WPQ_GNTS", .udesc = "Underfill GNTs (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR", .udesc = "Writes", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_pmm_cmd2[]={ { .uname = "NODATA_EXP", .udesc = "Expected No data packet (ERID matched NDP encoding) (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NODATA_UNEXP", .udesc = "Unexpected No data packet (ERID matched a Read, but data was a NDP) (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OPP_RD", .udesc = "Opportunistic Reads (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_ECC_ERROR", .udesc = "ECC Errors (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_ERID_ERROR", .udesc = "ERID detectable parity error (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_ERID_STARVED", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REQS_SLOT0", .udesc = "Read Requests 
- Slot 0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REQS_SLOT1", .udesc = "Read Requests - Slot 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_pmm_rpq_occupancy[]={ { .uname = "ALL", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "GNT_WAIT", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_GNT", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_pmm_wpq_occupancy[]={ { .uname = "ALL", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CAS", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PWR", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_power_cke_cycles[]={ { .uname = "LOW_0", .udesc = "DIMM ID (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOW_1", .udesc = "DIMM ID (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOW_2", .udesc = "DIMM ID (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOW_3", .udesc = "DIMM ID (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_power_throttle_cycles[]={ { .uname = "SLOT0", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT1", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_pre_count[]={ { .uname = "ALL", .udesc = "TBD", .ucode = 0x1c00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "PAGE_MISS", .udesc = "Precharge due to page miss (experimental)", .ucode = 0x0c00ull, .uflags = INTEL_X86_NCOMBO, }, { 
.uname = "PGT", .udesc = "Precharge due to page table", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD", .udesc = "Precharge due to read", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR", .udesc = "Precharge due to write", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_rpq_inserts[]={ { .uname = "PCH0", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PCH1", .udesc = "TBD", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_sb_accesses[]={ { .uname = "ACCEPTS", .udesc = "Scoreboard Accesses Accepted (experimental)", .ucode = 0x0500ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FMRD_CMPS", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FMWR_CMPS", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_RD_CMPS", .udesc = "FM read completions (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_WR_CMPS", .udesc = "FM write completions (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NMRD_CMPS", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NMWR_CMPS", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_RD_CMPS", .udesc = "NM read completions (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_WR_CMPS", .udesc = "NM write completions (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_ACCEPTS", .udesc = "Read Accepts (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_REJECTS", .udesc = "Read Rejects (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REJECTS", .udesc = "Scoreboard Accesses Rejected (experimental)", .ucode = 0x0a00ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "WR_ACCEPTS", .udesc = "Write Accepts (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_REJECTS", .udesc = "Write Rejects (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_sb_canary[]={ { .uname = "ALLOC", .udesc = "Alloc (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEALLOC", .udesc = "Dealloc (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FMRD_STARVED", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FMTGRWR_STARVED", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FMWR_STARVED", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_RD_STARVED", .udesc = "Far Mem Read Starved (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_TGR_WR_STARVED", .udesc = "Far Mem Target Write Starved (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_WR_STARVED", .udesc = "Far Mem Write Starved (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NMRD_STARVED", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NMWR_STARVED", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_RD_STARVED", .udesc = "Near Mem Read Starved (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_WR_STARVED", .udesc = "Near Mem Write Starved (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VLD", .udesc = "Valid (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_sb_inserts[]={ { .uname = "BLOCK_RDS", .udesc = "Block region reads (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BLOCK_WRS",
.udesc = "Block region writes (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_RDS", .udesc = "Persistent Mem reads (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_WRS", .udesc = "Persistent Mem writes (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RDS", .udesc = "Reads (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRS", .udesc = "Writes (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_sb_occupancy[]={ { .uname = "BLOCK_RDS", .udesc = "Block region reads (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BLOCK_WRS", .udesc = "Block region writes (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_RDS", .udesc = "Persistent Mem reads (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_WRS", .udesc = "Persistent Mem writes (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RDS", .udesc = "Reads (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_sb_pref_inserts[]={ { .uname = "ALL", .udesc = "All (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "DDR", .udesc = "DDR4 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM", .udesc = "Persistent Mem (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_sb_pref_occupancy[]={ { .uname = "ALL", .udesc = "All (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "DDR", .udesc = "DDR4 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMEM", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM", .udesc = 
"Persistent Mem (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_sb_reject[]={ { .uname = "CANARY", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DDR_EARLY_CMP", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_ADDR_CNFLT", .udesc = "FM requests rejected due to full address conflict (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_SET_CNFLT", .udesc = "NM requests rejected due to set conflict (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PATROL_SET_CNFLT", .udesc = "Patrol requests rejected due to set conflict (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_sb_strv_dealloc[]={ { .uname = "FMRD", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FMTGR", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FMWR", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_RD", .udesc = "Far Mem Read - Set (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_TGR", .udesc = "Near Mem Read - Clear (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_WR", .udesc = "Far Mem Write - Set (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NMRD", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NMWR", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_RD", .udesc = "Near Mem Read - Set (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_WR", .udesc = "Near Mem Write - Set (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const 
intel_x86_umask_t icx_unc_m_sb_strv_occ[]={ { .uname = "FMRD", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FMTGR", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FMWR", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_RD", .udesc = "Far Mem Read (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_TGR", .udesc = "Near Mem Read - Clear (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_WR", .udesc = "Far Mem Write (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NMRD", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NMWR", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_RD", .udesc = "Near Mem Read (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_WR", .udesc = "Near Mem Write (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_sb_tagged[]={ { .uname = "DDR4_CMP", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NEW", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OCC", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM0_CMP", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM1_CMP", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM2_CMP", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_HIT", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_MISS", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = 
INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_tagchk[]={ { .uname = "HIT", .udesc = "Hit in Near Memory Cache", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_CLEAN", .udesc = "Miss, no data in this line", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_DIRTY", .udesc = "Miss, existing data may be evicted to Far Memory", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_RD_HIT", .udesc = "Read Hit in Near Memory Cache", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_WR_HIT", .udesc = "Write Hit in Near Memory Cache", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_wpq_inserts[]={ { .uname = "PCH0", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PCH1", .udesc = "TBD", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m_wpq_write_hit[]={ { .uname = "PCH0", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PCH1", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_icx_unc_imc_pe[]={ { .name = "UNC_M_ACT_COUNT", .desc = "DRAM Activate Count", .code = 0x0001, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_act_count), .umasks = icx_unc_m_act_count, }, { .name = "UNC_M_CAS_COUNT", .desc = "All DRAM read CAS commands issued (including underfills)", .code = 0x0004, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_cas_count), .umasks = icx_unc_m_cas_count, }, { .name = "UNC_M_CLOCKTICKS", .desc = "DRAM Clockticks", .code = 0x0000, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_CLOCKTICKS_FREERUN", .desc = "Free running counter that increments for the Memory Controller (experimental)", .code = 0x0000, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk 
= 0x10ull, }, { .name = "UNC_M_DRAM_PRE_ALL", .desc = "DRAM Precharge All Commands (experimental)", .code = 0x0044, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_DRAM_REFRESH", .desc = "Number of DRAM Refreshes Issued", .code = 0x0045, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_dram_refresh), .umasks = icx_unc_m_dram_refresh, }, { .name = "UNC_M_HCLOCKTICKS", .desc = "Half clockticks for IMC", .code = 0x0000, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0x1ull, }, { .name = "UNC_M_PARITY_ERRORS", .desc = "UNC_M_PARITY_ERRORS (experimental)", .code = 0x002c, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_PCLS", .desc = "UNC_M_PCLS.RD", .code = 0x00a0, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_pcls), .umasks = icx_unc_m_pcls, }, { .name = "UNC_M_PMM_CMD1", .desc = "PMM Commands", .code = 0x00ea, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_pmm_cmd1), .umasks = icx_unc_m_pmm_cmd1, }, { .name = "UNC_M_PMM_CMD2", .desc = "PMM Commands - Part 2", .code = 0x00eb, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_pmm_cmd2), .umasks = icx_unc_m_pmm_cmd2, }, { .name = "UNC_M_PMM_RPQ_CYCLES_FULL", .desc = "PMM Read Queue Cycles Full (experimental)", .code = 0x00e2, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_PMM_RPQ_CYCLES_NE", .desc = "PMM Read Queue Cycles Not Empty (experimental)", .code = 0x00e1, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_PMM_RPQ_INSERTS", .desc = "PMM Read Queue Inserts", .code = 0x00e3, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_PMM_RPQ_OCCUPANCY", .desc = "PMM Read Pending Queue Occupancy", .code = 0x00e0, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_pmm_rpq_occupancy), .umasks = 
icx_unc_m_pmm_rpq_occupancy, }, { .name = "UNC_M_PMM_WPQ_CYCLES_FULL", .desc = "PMM Write Queue Cycles Full (experimental)", .code = 0x00e6, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_PMM_WPQ_CYCLES_NE", .desc = "PMM Write Queue Cycles Not Empty (experimental)", .code = 0x00e5, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_PMM_WPQ_FLUSH", .desc = "UNC_M_PMM_WPQ_FLUSH (experimental)", .code = 0x00e8, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_PMM_WPQ_FLUSH_CYC", .desc = "UNC_M_PMM_WPQ_FLUSH_CYC (experimental)", .code = 0x00e9, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_PMM_WPQ_INSERTS", .desc = "PMM Write Queue Inserts", .code = 0x00e7, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_PMM_WPQ_OCCUPANCY", .desc = "PMM Write Pending Queue Occupancy", .code = 0x00e4, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_pmm_wpq_occupancy), .umasks = icx_unc_m_pmm_wpq_occupancy, }, { .name = "UNC_M_POWER_CHANNEL_PPD", .desc = "Channel PPD Cycles (experimental)", .code = 0x0085, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_POWER_CKE_CYCLES", .desc = "CKE_ON_CYCLES by Rank", .code = 0x0047, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_power_cke_cycles), .umasks = icx_unc_m_power_cke_cycles, }, { .name = "UNC_M_POWER_CRIT_THROTTLE_CYCLES", .desc = "Throttle Cycles for Rank 0", .code = 0x0086, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_power_throttle_cycles), /* shared */ .umasks = icx_unc_m_power_throttle_cycles, }, { .name = "UNC_M_POWER_SELF_REFRESH", .desc = "Clock-Enabled Self-Refresh (experimental)", .code = 0x0043, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_POWER_THROTTLE_CYCLES", .desc = "Throttle Cycles for Rank 0", .code = 0x0046, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk 
= 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_power_throttle_cycles), .umasks = icx_unc_m_power_throttle_cycles, }, { .name = "UNC_M_PRE_COUNT", .desc = "DRAM Precharge commands.", .code = 0x0002, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_pre_count), .umasks = icx_unc_m_pre_count, }, { .name = "UNC_M_RDB_FULL", .desc = "Read Data Buffer Full (experimental)", .code = 0x0019, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_RDB_INSERTS", .desc = "Read Data Buffer Inserts (experimental)", .code = 0x0017, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_RDB_NOT_EMPTY", .desc = "Read Data Buffer Not Empty (experimental)", .code = 0x0018, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_RDB_OCCUPANCY", .desc = "Read Data Buffer Occupancy (experimental)", .code = 0x001a, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_RPQ_CYCLES_FULL_PCH0", .desc = "Read Pending Queue Full Cycles (experimental)", .code = 0x0012, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_RPQ_CYCLES_FULL_PCH1", .desc = "Read Pending Queue Full Cycles (experimental)", .code = 0x0015, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_RPQ_CYCLES_NE", .desc = "Read Pending Queue Not Empty", .code = 0x0011, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_rpq_inserts), /* shared */ .umasks = icx_unc_m_rpq_inserts, }, { .name = "UNC_M_RPQ_INSERTS", .desc = "Read Pending Queue Allocations", .code = 0x0010, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_rpq_inserts), .umasks = icx_unc_m_rpq_inserts, }, { .name = "UNC_M_RPQ_OCCUPANCY_PCH0", .desc = "Read Pending Queue Occupancy", .code = 0x0080, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_RPQ_OCCUPANCY_PCH1", .desc = "Read Pending Queue Occupancy", .code = 0x0081, 
.modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_SB_ACCESSES", .desc = "Scoreboard Accesses", .code = 0x00d2, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_sb_accesses), .umasks = icx_unc_m_sb_accesses, }, { .name = "UNC_M_SB_CANARY", .desc = "TBD", .code = 0x00d9, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_sb_canary), .umasks = icx_unc_m_sb_canary, }, { .name = "UNC_M_SB_CYCLES_FULL", .desc = "Scoreboard Cycles Full (experimental)", .code = 0x00d1, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_SB_CYCLES_NE", .desc = "Scoreboard Cycles Not-Empty (experimental)", .code = 0x00d0, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_SB_INSERTS", .desc = "Scoreboard Inserts", .code = 0x00d6, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_sb_inserts), .umasks = icx_unc_m_sb_inserts, }, { .name = "UNC_M_SB_OCCUPANCY", .desc = "Scoreboard Occupancy", .code = 0x00d5, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_sb_occupancy), .umasks = icx_unc_m_sb_occupancy, }, { .name = "UNC_M_SB_PREF_INSERTS", .desc = "Scoreboard Prefetch Inserts", .code = 0x00da, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_sb_pref_inserts), .umasks = icx_unc_m_sb_pref_inserts, }, { .name = "UNC_M_SB_PREF_OCCUPANCY", .desc = "Scoreboard Prefetch Occupancy", .code = 0x00db, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_sb_pref_occupancy), .umasks = icx_unc_m_sb_pref_occupancy, }, { .name = "UNC_M_SB_REJECT", .desc = "Number of Scoreboard Requests Rejected", .code = 0x00d4, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_sb_reject), .umasks = icx_unc_m_sb_reject, }, { .name = "UNC_M_SB_STRV_ALLOC", .desc = 
"This event is deprecated. Refer to new event UNC_M_SB_STRV_ALLOC.NM_RD", .code = 0x00d7, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_sb_strv_dealloc), /* shared */ .umasks = icx_unc_m_sb_strv_dealloc, }, { .name = "UNC_M_SB_STRV_DEALLOC", .desc = "This event is deprecated. Refer to new event UNC_M_SB_STRV_DEALLOC.NM_RD", .code = 0x00de, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_sb_strv_dealloc), .umasks = icx_unc_m_sb_strv_dealloc, }, { .name = "UNC_M_SB_STRV_OCC", .desc = "This event is deprecated. Refer to new event UNC_M_SB_STRV_OCC.NM_RD", .code = 0x00d8, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_sb_strv_occ), .umasks = icx_unc_m_sb_strv_occ, }, { .name = "UNC_M_SB_TAGGED", .desc = "UNC_M_SB_TAGGED.NEW", .code = 0x00dd, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_sb_tagged), .umasks = icx_unc_m_sb_tagged, }, { .name = "UNC_M_TAGCHK", .desc = "2LM Tag Check", .code = 0x00d3, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_tagchk), .umasks = icx_unc_m_tagchk, }, { .name = "UNC_M_WPQ_CYCLES_FULL_PCH0", .desc = "Write Pending Queue Full Cycles (experimental)", .code = 0x0022, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_WPQ_CYCLES_FULL_PCH1", .desc = "Write Pending Queue Full Cycles (experimental)", .code = 0x0016, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_WPQ_CYCLES_NE", .desc = "Write Pending Queue Not Empty", .code = 0x0021, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_wpq_inserts), /* shared */ .umasks = icx_unc_m_wpq_inserts, }, { .name = "UNC_M_WPQ_INSERTS", .desc = "Write Pending Queue Allocations", .code = 0x0020, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(icx_unc_m_wpq_inserts), .umasks = icx_unc_m_wpq_inserts, }, { .name = "UNC_M_WPQ_OCCUPANCY_PCH0", .desc = "Write Pending Queue Occupancy", .code = 0x0082, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_WPQ_OCCUPANCY_PCH1", .desc = "Write Pending Queue Occupancy", .code = 0x0083, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_WPQ_READ_HIT", .desc = "Write Pending Queue CAM Match", .code = 0x0023, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_wpq_write_hit), /* shared */ .umasks = icx_unc_m_wpq_write_hit, }, { .name = "UNC_M_WPQ_WRITE_HIT", .desc = "Write Pending Queue CAM Match", .code = 0x0024, .modmsk = ICX_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m_wpq_write_hit), .umasks = icx_unc_m_wpq_write_hit, }, }; /* 59 events available */

/* src/libpfm4/lib/events/intel_icx_unc_irp_events.h */

/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: icx_unc_irp (IcelakeX Uncore IRP) * Based on Intel JSON event table version : 1.21 * Based on Intel JSON event table published : 06/06/2023 */ static const intel_x86_umask_t icx_unc_i_cache_total_occupancy[]={ { .uname = "ANY", .udesc = "Any Source (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "IV_Q", .udesc = "Snoops (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM", .udesc = "TBD", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_i_coherent_ops[]={ { .uname = "CLFLUSH", .udesc = "CLFlush (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PCITOM", .udesc = "TBD", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WBMTOI", .udesc = "WbMtoI", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_i_irp_all[]={ { .uname = "EVICTS", .udesc = "All Inserts Outbound (BL, AK, Snoops) (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "INBOUND_INSERTS", .udesc = "All Inserts Inbound (p2p + faf + cset)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OUTBOUND_INSERTS", .udesc = "All Inserts Outbound (BL, AK, Snoops) (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_i_misc0[]={ { .uname = "2ND_ATOMIC_INSERT", .udesc = "Cache Inserts of Atomic Transactions as Secondary (experimental)", .ucode = 
0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "2ND_RD_INSERT", .udesc = "Cache Inserts of Read Transactions as Secondary (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "2ND_WR_INSERT", .udesc = "Cache Inserts of Write Transactions as Secondary (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAST_REJ", .udesc = "Fastpath Rejects (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAST_REQ", .udesc = "Fastpath Requests (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAST_XFER", .udesc = "Fastpath Transfers From Primary to Secondary (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_ACK_HINT", .udesc = "Prefetch Ack Hints From Primary to Secondary (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOWPATH_FWPF_NO_PRF", .udesc = "Slow path fwpf didn't find prefetch (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_i_misc1[]={ { .uname = "LOST_FWD", .udesc = "Lost Forward", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SEC_RCVD_INVLD", .udesc = "Received Invalid (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SEC_RCVD_VLD", .udesc = "Received Valid (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_E", .udesc = "Slow Transfer of E Line (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_I", .udesc = "Slow Transfer of I Line (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_M", .udesc = "Slow Transfer of M Line (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_S", .udesc = "Slow Transfer of S Line (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_i_p2p_transactions[]={ { .uname = 
"CMPL", .udesc = "P2P completions (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOC", .udesc = "match if local only (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOC_AND_TGT_MATCH", .udesc = "match if local and target matches (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSG", .udesc = "P2P Message (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD", .udesc = "P2P reads (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REM", .udesc = "Match if remote only (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REM_AND_TGT_MATCH", .udesc = "match if remote and target matches (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR", .udesc = "P2P Writes (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_i_snoop_resp[]={ { .uname = "ALL_HIT", .udesc = "TBD (experimental)", .ucode = 0x7e00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_HIT_ES", .udesc = "TBD (experimental)", .ucode = 0x7400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_HIT_I", .udesc = "TBD (experimental)", .ucode = 0x7200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_HIT_M", .udesc = "TBD", .ucode = 0x7800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_MISS", .udesc = "TBD (experimental)", .ucode = 0x7100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HIT_ES", .udesc = "Hit E or S (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HIT_I", .udesc = "Hit I (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HIT_M", .udesc = "Hit M (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS", .udesc = "Miss (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNPCODE", .udesc = "SnpCode (experimental)", .ucode 
= 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNPDATA", .udesc = "SnpData (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNPINV", .udesc = "SnpInv (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_i_transactions[]={ { .uname = "ATOMIC", .udesc = "Atomic (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ORDERINGQ", .udesc = "Select Source (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OTHER", .udesc = "Other (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITES", .udesc = "Writes (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_PREF", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_icx_unc_irp_pe[]={ { .name = "UNC_I_CACHE_TOTAL_OCCUPANCY", .desc = "Total IRP occupancy of inbound read and write requests to coherent memory.", .code = 0x000f, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_i_cache_total_occupancy), .umasks = icx_unc_i_cache_total_occupancy, }, { .name = "UNC_I_CLOCKTICKS", .desc = "Clockticks of the IO coherency tracker (IRP)", .code = 0x0001, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_COHERENT_OPS", .desc = "PCIITOM request issued by the IRP unit to the mesh with the intention of writing a full cacheline.", .code = 0x0010, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_i_coherent_ops), .umasks = icx_unc_i_coherent_ops, }, { .name = "UNC_I_FAF_FULL", .desc = "FAF RF full", .code = 0x0017, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_FAF_INSERTS", .desc = "Inbound read requests received by the IRP and inserted into the FAF queue.", .code = 0x0018, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_FAF_OCCUPANCY", 
.desc = "Occupancy of the IRP FAF queue.", .code = 0x0019, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_FAF_TRANSACTIONS", .desc = "FAF allocation -- sent to ADQ", .code = 0x0016, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_IRP_ALL", .desc = "TBD", .code = 0x0020, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_i_irp_all), .umasks = icx_unc_i_irp_all, }, { .name = "UNC_I_MISC0", .desc = "Counts Timeouts - Set 0", .code = 0x001e, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_i_misc0), .umasks = icx_unc_i_misc0, }, { .name = "UNC_I_MISC1", .desc = "Misc Events - Set 1", .code = 0x001f, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_i_misc1), .umasks = icx_unc_i_misc1, }, { .name = "UNC_I_P2P_INSERTS", .desc = "P2P Requests (experimental)", .code = 0x0014, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_P2P_OCCUPANCY", .desc = "P2P Occupancy (experimental)", .code = 0x0015, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_P2P_TRANSACTIONS", .desc = "P2P Transactions", .code = 0x0013, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_i_p2p_transactions), .umasks = icx_unc_i_p2p_transactions, }, { .name = "UNC_I_SNOOP_RESP", .desc = "Responses to snoops of any type that hit M line in the IIO cache", .code = 0x0012, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_i_snoop_resp), .umasks = icx_unc_i_snoop_resp, }, { .name = "UNC_I_TRANSACTIONS", .desc = "Inbound write (fast path) requests received by the IRP.", .code = 0x0011, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_i_transactions), .umasks = icx_unc_i_transactions, }, { .name = "UNC_I_TxC_AK_INSERTS", .desc = "AK Egress Allocations (experimental)", .code = 
0x000b, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_TxC_BL_DRS_CYCLES_FULL", .desc = "BL DRS Egress Cycles Full (experimental)", .code = 0x0005, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_TxC_BL_DRS_INSERTS", .desc = "BL DRS Egress Inserts (experimental)", .code = 0x0002, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_TxC_BL_DRS_OCCUPANCY", .desc = "BL DRS Egress Occupancy (experimental)", .code = 0x0008, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_TxC_BL_NCB_CYCLES_FULL", .desc = "BL NCB Egress Cycles Full (experimental)", .code = 0x0006, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_TxC_BL_NCB_INSERTS", .desc = "BL NCB Egress Inserts (experimental)", .code = 0x0003, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_TxC_BL_NCB_OCCUPANCY", .desc = "BL NCB Egress Occupancy (experimental)", .code = 0x0009, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_TxC_BL_NCS_CYCLES_FULL", .desc = "BL NCS Egress Cycles Full (experimental)", .code = 0x0007, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_TxC_BL_NCS_INSERTS", .desc = "BL NCS Egress Inserts (experimental)", .code = 0x0004, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_TxC_BL_NCS_OCCUPANCY", .desc = "BL NCS Egress Occupancy (experimental)", .code = 0x000a, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_TxR2_AD01_STALL_CREDIT_CYCLES", .desc = "UNC_I_TxR2_AD01_STALL_CREDIT_CYCLES (experimental)", .code = 0x001c, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_TxR2_AD0_STALL_CREDIT_CYCLES", .desc = "No AD0 Egress Credits Stalls (experimental)", .code = 0x001a, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_TxR2_AD1_STALL_CREDIT_CYCLES", .desc = "No AD1 Egress Credits Stalls (experimental)", .code = 0x001b, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = 
"UNC_I_TxR2_BL_STALL_CREDIT_CYCLES", .desc = "No BL Egress Credit Stalls (experimental)", .code = 0x001d, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_TxS_DATA_INSERTS_NCB", .desc = "Outbound Read Requests (experimental)", .code = 0x000d, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_TxS_DATA_INSERTS_NCS", .desc = "Outbound Read Requests (experimental)", .code = 0x000e, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_I_TxS_REQUEST_OCCUPANCY", .desc = "Outbound Request Queue Occupancy (experimental)", .code = 0x000c, .modmsk = ICX_UNC_IRP_ATTRS, .cntmsk = 0x3ull, }, }; /* 32 events available */

/* src/libpfm4/lib/events/intel_icx_unc_m2m_events.h */

/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: icx_unc_m2m (IcelakeX Uncore M2M) * Based on Intel JSON event table version : 1.21 * Based on Intel JSON event table published : 06/06/2023 */ static const intel_x86_umask_t icx_unc_m2m_ag0_ad_crd_occupancy0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_ag0_ad_crd_occupancy1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_ag0_bl_crd_occupancy0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 
(experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_ag0_bl_crd_occupancy1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_ag1_ad_crd_occupancy0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 
0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_ag1_ad_crd_occupancy1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_ag1_bl_crd_acquired0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_ag1_bl_crd_occupancy1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_bypass_m2m_ingress[]={ { .uname = "NOT_TAKEN", .udesc = "Not Taken (experimental)", .ucode = 
0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN", .udesc = "Taken (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_directory_lookup[]={ { .uname = "ANY", .udesc = "Found in any state", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STATE_A", .udesc = "Found in A state", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STATE_I", .udesc = "Found in I state", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STATE_S", .udesc = "Found in S state", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_directory_miss[]={ { .uname = "CLEAN_A", .udesc = "On NonDirty Line in A State (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CLEAN_I", .udesc = "On NonDirty Line in I State (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CLEAN_P", .udesc = "On NonDirty Line in L State (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CLEAN_S", .udesc = "On NonDirty Line in S State (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DIRTY_A", .udesc = "On Dirty Line in A State (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DIRTY_I", .udesc = "On Dirty Line in I State (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DIRTY_P", .udesc = "On Dirty Line in L State (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DIRTY_S", .udesc = "On Dirty Line in S State (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_directory_update[]={ { .uname = "ANY", .udesc = "From/to any state. 
Note: event counts are incorrect in 2LM mode.", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t icx_unc_m2m_distress_asserted[]={ { .uname = "DPT_LOCAL", .udesc = "DPT Local (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DPT_NONLOCAL", .udesc = "DPT Remote (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DPT_STALL_IV", .udesc = "DPT Stalled - IV (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DPT_STALL_NOCRD", .udesc = "DPT Stalled - No Credit (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HORZ", .udesc = "Horizontal (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_LOCAL", .udesc = "PMM Local (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_NONLOCAL", .udesc = "PMM Remote (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VERT", .udesc = "Vertical (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_egress_ordering[]={ { .uname = "IV_SNOOPGO_DN", .udesc = "Down (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_SNOOPGO_UP", .udesc = "Up (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_horz_ring_akc_in_use[]={ { .uname = "LEFT_EVEN", .udesc = "Left and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LEFT_ODD", .udesc = "Left and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_EVEN", .udesc = "Right and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_ODD", .udesc = "Right and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_horz_ring_bl_in_use[]={ { .uname = 
"LEFT_EVEN", .udesc = "Left and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LEFT_ODD", .udesc = "Left and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_EVEN", .udesc = "Right and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_ODD", .udesc = "Right and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_horz_ring_iv_in_use[]={ { .uname = "LEFT", .udesc = "Left (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT", .udesc = "Right (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_imc_reads[]={ { .uname = "ALL", .udesc = "All, regardless of priority. - All Channels (experimental)", .ucode = 0x700000400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CH0_ALL", .udesc = "All, regardless of priority. - Ch0 (experimental)", .ucode = 0x100000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_FROM_TGR", .udesc = "From TGR - Ch0 (experimental)", .ucode = 0x100004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_ISOCH", .udesc = "Critical Priority - Ch0 (experimental)", .ucode = 0x100000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_NORMAL", .udesc = "Normal Priority - Ch0 (experimental)", .ucode = 0x100000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_TO_DDR_AS_CACHE", .udesc = "DDR, acting as Cache - Ch0 (experimental)", .ucode = 0x100001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_TO_DDR_AS_MEM", .udesc = "DDR - Ch0 (experimental)", .ucode = 0x100000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_TO_PMM", .udesc = "PMM - Ch0 (experimental)", .ucode = 0x100002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_ALL", .udesc = "All, regardless of priority. 
- Ch1 (experimental)", .ucode = 0x200000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_FROM_TGR", .udesc = "From TGR - Ch1 (experimental)", .ucode = 0x200004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_ISOCH", .udesc = "Critical Priority - Ch1 (experimental)", .ucode = 0x200000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_NORMAL", .udesc = "Normal Priority - Ch1 (experimental)", .ucode = 0x200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_TO_DDR_AS_CACHE", .udesc = "DDR, acting as Cache - Ch1 (experimental)", .ucode = 0x200001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_TO_DDR_AS_MEM", .udesc = "DDR - Ch1 (experimental)", .ucode = 0x200000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_TO_PMM", .udesc = "PMM - Ch1 (experimental)", .ucode = 0x200002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2_FROM_TGR", .udesc = "From TGR - Ch2 (experimental)", .ucode = 0x400004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FROM_TGR", .udesc = "From TGR - All Channels (experimental)", .ucode = 0x700004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ISOCH", .udesc = "Critical Priority - All Channels (experimental)", .ucode = 0x700000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NORMAL", .udesc = "Normal Priority - All Channels (experimental)", .ucode = 0x700000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TO_DDR_AS_CACHE", .udesc = "DDR, acting as Cache - All Channels (experimental)", .ucode = 0x700001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TO_DDR_AS_MEM", .udesc = "DDR - All Channels (experimental)", .ucode = 0x700000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TO_PMM", .udesc = "PMM - All Channels", .ucode = 0x700002000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_imc_writes[]={ { .uname = "ALL", .udesc = "All Writes - All Channels (experimental)", .ucode = 0x1c00001000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CH0_ALL", .udesc 
= "All Writes - Ch0 (experimental)", .ucode = 0x400001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_FROM_TGR", .udesc = "From TGR - Ch0 (experimental)", .ucode = 0x500000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_FULL", .udesc = "Full Line Non-ISOCH - Ch0 (experimental)", .ucode = 0x400000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_FULL_ISOCH", .udesc = "ISOCH Full Line - Ch0 (experimental)", .ucode = 0x400000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_NI", .udesc = "Non-Inclusive - Ch0 (experimental)", .ucode = 0x600000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_NI_MISS", .udesc = "Non-Inclusive Miss - Ch0 (experimental)", .ucode = 0x2000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_PARTIAL", .udesc = "Partial Non-ISOCH - Ch0 (experimental)", .ucode = 0x400000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_PARTIAL_ISOCH", .udesc = "ISOCH Partial - Ch0 (experimental)", .ucode = 0x400000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_TO_DDR_AS_CACHE", .udesc = "DDR, acting as Cache - Ch0 (experimental)", .ucode = 0x400004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_TO_DDR_AS_MEM", .udesc = "DDR - Ch0 (experimental)", .ucode = 0x400002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_TO_PMM", .udesc = "PMM - Ch0 (experimental)", .ucode = 0x400008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_ALL", .udesc = "All Writes - Ch1 (experimental)", .ucode = 0x800001000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_FROM_TGR", .udesc = "From TGR - Ch1 (experimental)", .ucode = 0x900000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_FULL", .udesc = "Full Line Non-ISOCH - Ch1 (experimental)", .ucode = 0x800000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_FULL_ISOCH", .udesc = "ISOCH Full Line - Ch1 (experimental)", .ucode = 0x800000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_NI", .udesc = "Non-Inclusive - Ch1 (experimental)", .ucode = 
0xa00000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_NI_MISS", .udesc = "Non-Inclusive Miss - Ch1 (experimental)", .ucode = 0xc00000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_PARTIAL", .udesc = "Partial Non-ISOCH - Ch1 (experimental)", .ucode = 0x800000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_PARTIAL_ISOCH", .udesc = "ISOCH Partial - Ch1 (experimental)", .ucode = 0x800000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_TO_DDR_AS_CACHE", .udesc = "DDR, acting as Cache - Ch1 (experimental)", .ucode = 0x800004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_TO_DDR_AS_MEM", .udesc = "DDR - Ch1 (experimental)", .ucode = 0x800002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_TO_PMM", .udesc = "PMM - Ch1 (experimental)", .ucode = 0x800008000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FROM_TGR", .udesc = "From TGR - All Channels (experimental)", .ucode = 0x1d00000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FULL", .udesc = "Full Line Non-ISOCH - All Channels (experimental)", .ucode = 0x1c00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FULL_ISOCH", .udesc = "ISOCH Full Line - All Channels (experimental)", .ucode = 0x1c00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NI", .udesc = "Non-Inclusive - All Channels (experimental)", .ucode = 0x1e00000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NI_MISS", .udesc = "Non-Inclusive Miss - All Channels (experimental)", .ucode = 0x1c00000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PARTIAL", .udesc = "Partial Non-ISOCH - All Channels (experimental)", .ucode = 0x1c00000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PARTIAL_ISOCH", .udesc = "ISOCH Partial - All Channels (experimental)", .ucode = 0x1c00000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TO_DDR_AS_CACHE", .udesc = "DDR, acting as Cache - All Channels (experimental)", .ucode = 0x1c00004000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TO_DDR_AS_MEM", .udesc = "DDR - All 
Channels (experimental)", .ucode = 0x1c00002000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TO_PMM", .udesc = "PMM - All Channels", .ucode = 0x1c00008000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_misc_external[]={ { .uname = "MBE_INST0", .udesc = "Number of cycles MBE is high for MS2IDI0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MBE_INST1", .udesc = "Number of cycles MBE is high for MS2IDI1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_pkt_match[]={ { .uname = "MC", .udesc = "MC Match (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MESH", .udesc = "Mesh Match (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_prefcam_cycles_ne[]={ { .uname = "ALLCH", .udesc = "All Channels (experimental)", .ucode = 0x0700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0", .udesc = "Channel 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1", .udesc = "Channel 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2", .udesc = "Channel 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_prefcam_deallocs[]={ { .uname = "CH0_HITA0_INVAL", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_HITA1_INVAL", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_MISS_INVAL", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_RSP_PDRESET", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_HITA0_INVAL", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_HITA1_INVAL", .udesc = "TBD (experimental)", 
.ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_MISS_INVAL", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_RSP_PDRESET", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2_HITA0_INVAL", .udesc = "TBD (experimental)", .ucode = 0x100000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2_HITA1_INVAL", .udesc = "TBD (experimental)", .ucode = 0x200000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2_MISS_INVAL", .udesc = "TBD (experimental)", .ucode = 0x400000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2_RSP_PDRESET", .udesc = "TBD (experimental)", .ucode = 0x800000000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_prefcam_demand_merge[]={ { .uname = "CH0_XPTUPI", .udesc = "XPT & UPI - Ch 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_XPTUPI", .udesc = "XPT & UPI - Ch 1 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2_XPTUPI", .udesc = "XPT & UPI - Ch 2 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "XPTUPI_ALLCH", .udesc = "XPT & UPI - All Channels (experimental)", .ucode = 0x1500ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_prefcam_demand_no_merge[]={ { .uname = "CH0_XPTUPI", .udesc = "XPT & UPI - Ch 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_XPTUPI", .udesc = "XPT & UPI - Ch 1 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2_XPTUPI", .udesc = "XPT & UPI - Ch 2 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "XPTUPI_ALLCH", .udesc = "XPT & UPI - All Channels (experimental)", .ucode = 0x1500ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_prefcam_drop_reasons_ch1[]={ { .uname = "ERRORBLK_RxC", .udesc = "TBD (experimental)",
.ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NOT_PF_SAD_REGION", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_AD_CRD", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_CAM_FULL", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_CAM_HIT", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_SECURE_DROP", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RPQ_PROXY", .udesc = "TBD (experimental)", .ucode = 0x100000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STOP_B2B", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI_THRESH", .udesc = "TBD (experimental)", .ucode = 0x400000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WPQ_PROXY", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "XPT_THRESH", .udesc = "TBD (experimental)", .ucode = 0x200000000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_prefcam_drop_reasons_ch2[]={ { .uname = "ERRORBLK_RxC", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NOT_PF_SAD_REGION", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_AD_CRD", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_CAM_FULL", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_CAM_HIT", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_SECURE_DROP", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RPQ_PROXY", .udesc = "TBD (experimental)", .ucode = 0x100000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"STOP_B2B", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI_THRESH", .udesc = "TBD (experimental)", .ucode = 0x400000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WPQ_PROXY", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "XPT_THRESH", .udesc = "TBD (experimental)", .ucode = 0x200000000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_prefcam_inserts[]={ { .uname = "CH0_UPI", .udesc = "UPI - Ch 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0_XPT", .udesc = "XPT - Ch 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_UPI", .udesc = "UPI - Ch 1 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1_XPT", .udesc = "XPT - Ch 1 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2_UPI", .udesc = "UPI - Ch 2 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2_XPT", .udesc = "XPT - Ch 2 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI_ALLCH", .udesc = "UPI - All Channels (experimental)", .ucode = 0x2a00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "XPT_ALLCH", .udesc = "XPT - All Channels (experimental)", .ucode = 0x1500ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_prefcam_resp_miss[]={ { .uname = "ALLCH", .udesc = "All Channels (experimental)", .ucode = 0x0700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH0", .udesc = "Channel 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1", .udesc = "Channel 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2", .udesc = "Channel 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_prefcam_rxc_deallocs[]={ { .uname = "1LM_POSTED", .udesc = 
"TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CIS", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_MEMMODE_ACCEPT", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SQUASHED", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_ring_bounces_horz[]={ { .uname = "AD", .udesc = "AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "BL (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_ring_sink_starved_horz[]={ { .uname = "AD", .udesc = "AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "Acknowledgements to Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "BL (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_ring_sink_starved_vert[]={ { .uname = "AD", .udesc = "AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "Acknowledgements to core (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "Data Responses to core (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "Snoops of 
processor's cache. (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_rpq_no_spec_crd[]={ { .uname = "CH0", .udesc = "Channel 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1", .udesc = "Channel 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2", .udesc = "Channel 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_rxr_crd_starved[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IFV", .udesc = "IFV - Credited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_rxr_inserts[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_rxr_occupancy[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_stall0_no_txr_horz_crd_ad_ag0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 
(experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_stall0_no_txr_horz_crd_bl_ag0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_stall0_no_txr_horz_crd_bl_ag1[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 
(experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_stall1_no_txr_horz_crd_ad_ag1_1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_stall1_no_txr_horz_crd_bl_ag1_1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_tag_hit[]={ { .uname = "NM_RD_HIT_CLEAN", .udesc = "Clean NearMem Read Hit", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_RD_HIT_DIRTY", .udesc = "Dirty NearMem Read Hit", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_UFILL_HIT_CLEAN", .udesc = "Clean NearMem Underfill Hit (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_UFILL_HIT_DIRTY", .udesc = "Dirty NearMem Underfill Hit 
(experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_tracker_inserts[]={ { .uname = "CH0", .udesc = "Channel 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1", .udesc = "Channel 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2", .udesc = "Channel 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_tracker_occupancy[]={ { .uname = "CH0", .udesc = "Channel 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1", .udesc = "Channel 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2", .udesc = "Channel 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txc_ak[]={ { .uname = "CRD_CBO", .udesc = "CRD Transactions to Cbo (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NDR", .udesc = "NDR Transactions (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txc_ak_cycles_full[]={ { .uname = "ALL", .udesc = "All (experimental)", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CMS0", .udesc = "Common Mesh Stop - Near Side (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMS1", .udesc = "Common Mesh Stop - Far Side (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RDCRD0", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RDCRD1", .udesc = "TBD (experimental)", .ucode = 0x8800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRCMP0", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRCMP1", .udesc = "TBD (experimental)", .ucode = 0xa000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"WRCRD0", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRCRD1", .udesc = "TBD (experimental)", .ucode = 0x9000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txc_ak_inserts[]={ { .uname = "ALL", .udesc = "All (experimental)", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CMS0", .udesc = "Common Mesh Stop - Near Side (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMS1", .udesc = "Common Mesh Stop - Far Side (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PREF_RD_CAM_HIT", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RDCRD", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRCMP", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRCRD", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txc_ak_no_credit_cycles[]={ { .uname = "CMS0", .udesc = "Common Mesh Stop - Near Side (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMS1", .udesc = "Common Mesh Stop - Far Side (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txc_ak_occupancy[]={ { .uname = "ALL", .udesc = "All (experimental)", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CMS0", .udesc = "Common Mesh Stop - Near Side (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMS1", .udesc = "Common Mesh Stop - Far Side (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RDCRD", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRCMP", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "WRCRD", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txc_bl[]={ { .uname = "DRS_CACHE", .udesc = "Data to Cache (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_CORE", .udesc = "Data to Core (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_UPI", .udesc = "Data to QPI (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txc_bl_credits_acquired[]={ { .uname = "CMS0", .udesc = "Common Mesh Stop - Near Side (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMS1", .udesc = "Common Mesh Stop - Far Side (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txc_bl_cycles_ne[]={ { .uname = "ALL", .udesc = "All (experimental)", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CMS0", .udesc = "Common Mesh Stop - Near Side (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMS1", .udesc = "Common Mesh Stop - Far Side (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txc_bl_inserts[]={ { .uname = "ALL", .udesc = "All (experimental)", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CMS0", .udesc = "Common Mesh Stop - Near Side (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMS1", .udesc = "Common Mesh Stop - Far Side (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txc_bl_no_credit_stalled[]={ { .uname = "CMS0", .udesc = "Common Mesh Stop - Near Side (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CMS1", .udesc = "Common Mesh Stop - Far Side (experimental)", .ucode = 
0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txr_horz_ads_used[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txr_horz_cycles_full[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txr_horz_inserts[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags 
= INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txr_horz_occupancy[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t 
icx_unc_m2m_txr_horz_starved[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txr_vert_ads_used[]={ { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txr_vert_bypass[]={ { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG0", .udesc = "AK - Agent 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "AK - Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG1", .udesc = "BL - Agent 1 
(experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_AG1", .udesc = "IV - Agent 1 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txr_vert_cycles_full1[]={ { .uname = "AKC_AG0", .udesc = "AKC - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_AG1", .udesc = "AKC - Agent 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txr_vert_cycles_ne0[]={ { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG0", .udesc = "AK - Agent 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "AK - Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_AG0", .udesc = "IV - Agent 0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txr_vert_inserts1[]={ { .uname = "AKC_AG0", .udesc = "AKC - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_AG1", .udesc = "AKC - Agent 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txr_vert_occupancy0[]={ { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG0", .udesc = "AK - Agent 0 (experimental)", .ucode = 
0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "AK - Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_AG0", .udesc = "IV - Agent 0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txr_vert_occupancy1[]={ { .uname = "AKC_AG0", .udesc = "AKC - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_AG1", .udesc = "AKC - Agent 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txr_vert_starved0[]={ { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG0", .udesc = "AK - Agent 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "AK - Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_AG0", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_txr_vert_starved1[]={ { .uname = "AKC_AG0", .udesc = "AKC - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_AG1", .udesc = "AKC - Agent 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGC", .udesc = "AKC - Agent 0 (experimental)", .ucode = 
0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_vert_ring_akc_in_use[]={ { .uname = "DN_EVEN", .udesc = "Down and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DN_ODD", .udesc = "Down and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_EVEN", .udesc = "Up and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_ODD", .udesc = "Up and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_vert_ring_bl_in_use[]={ { .uname = "DN_EVEN", .udesc = "Down and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DN_ODD", .udesc = "Down and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_EVEN", .udesc = "Up and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_ODD", .udesc = "Up and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_vert_ring_iv_in_use[]={ { .uname = "DN", .udesc = "Down (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP", .udesc = "Up (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_vert_ring_tgc_in_use[]={ { .uname = "DN_EVEN", .udesc = "Down and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DN_ODD", .udesc = "Down and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_EVEN", .udesc = "Up and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_ODD", .udesc = "Up and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_wpq_no_reg_crd[]={ { .uname = "CHN0", .udesc = "Channel 0 (experimental)", .ucode = 
0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN1", .udesc = "Channel 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN2", .udesc = "Channel 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_wpq_no_spec_crd[]={ { .uname = "CHN0", .udesc = "Channel 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN1", .udesc = "Channel 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN2", .udesc = "Channel 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_wr_tracker_full[]={ { .uname = "CH0", .udesc = "Channel 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1", .udesc = "Channel 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2", .udesc = "Channel 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MIRR", .udesc = "Mirror (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_wr_tracker_inserts[]={ { .uname = "CH0", .udesc = "Channel 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1", .udesc = "Channel 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2", .udesc = "Channel 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_wr_tracker_nonposted_occupancy[]={ { .uname = "CH0", .udesc = "Channel 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1", .udesc = "Channel 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2", .udesc = "Channel 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_wr_tracker_occupancy[]={ { .uname = 
"CH0", .udesc = "Channel 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1", .udesc = "Channel 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2", .udesc = "Channel 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MIRR", .udesc = "Mirror (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MIRR_NONTGR", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MIRR_PWR", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2m_wr_tracker_posted_occupancy[]={ { .uname = "CH0", .udesc = "Channel 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH1", .udesc = "Channel 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CH2", .udesc = "Channel 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_icx_unc_m2m_pe[]={ { .name = "UNC_M2M_AG0_AD_CRD_ACQUIRED0", .desc = "CMS Agent0 AD Credits Acquired", .code = 0x0080, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ag0_ad_crd_occupancy0), /* shared */ .umasks = icx_unc_m2m_ag0_ad_crd_occupancy0, }, { .name = "UNC_M2M_AG0_AD_CRD_ACQUIRED1", .desc = "CMS Agent0 AD Credits Acquired", .code = 0x0081, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ag0_ad_crd_occupancy1), /* shared */ .umasks = icx_unc_m2m_ag0_ad_crd_occupancy1, }, { .name = "UNC_M2M_AG0_AD_CRD_OCCUPANCY0", .desc = "CMS Agent0 AD Credits Occupancy", .code = 0x0082, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ag0_ad_crd_occupancy0), .umasks = icx_unc_m2m_ag0_ad_crd_occupancy0, }, { .name = "UNC_M2M_AG0_AD_CRD_OCCUPANCY1", .desc = "CMS Agent0 AD Credits 
Occupancy", .code = 0x0083, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ag0_ad_crd_occupancy1), .umasks = icx_unc_m2m_ag0_ad_crd_occupancy1, }, { .name = "UNC_M2M_AG0_BL_CRD_ACQUIRED0", .desc = "CMS Agent0 BL Credits Acquired", .code = 0x0088, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ag0_bl_crd_occupancy0), /* shared */ .umasks = icx_unc_m2m_ag0_bl_crd_occupancy0, }, { .name = "UNC_M2M_AG0_BL_CRD_ACQUIRED1", .desc = "CMS Agent0 BL Credits Acquired", .code = 0x0089, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ag0_bl_crd_occupancy1), /* shared */ .umasks = icx_unc_m2m_ag0_bl_crd_occupancy1, }, { .name = "UNC_M2M_AG0_BL_CRD_OCCUPANCY0", .desc = "CMS Agent0 BL Credits Occupancy", .code = 0x008a, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ag0_bl_crd_occupancy0), .umasks = icx_unc_m2m_ag0_bl_crd_occupancy0, }, { .name = "UNC_M2M_AG0_BL_CRD_OCCUPANCY1", .desc = "CMS Agent0 BL Credits Occupancy", .code = 0x008b, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ag0_bl_crd_occupancy1), .umasks = icx_unc_m2m_ag0_bl_crd_occupancy1, }, { .name = "UNC_M2M_AG1_AD_CRD_ACQUIRED0", .desc = "CMS Agent1 AD Credits Acquired", .code = 0x0084, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ag1_ad_crd_occupancy0), /* shared */ .umasks = icx_unc_m2m_ag1_ad_crd_occupancy0, }, { .name = "UNC_M2M_AG1_AD_CRD_ACQUIRED1", .desc = "CMS Agent1 AD Credits Acquired", .code = 0x0085, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ag1_ad_crd_occupancy1), /* shared */ .umasks = icx_unc_m2m_ag1_ad_crd_occupancy1, }, { .name = "UNC_M2M_AG1_AD_CRD_OCCUPANCY0", .desc = "CMS Agent1 AD Credits Occupancy", .code = 0x0086, .modmsk = 
ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ag1_ad_crd_occupancy0), .umasks = icx_unc_m2m_ag1_ad_crd_occupancy0, }, { .name = "UNC_M2M_AG1_AD_CRD_OCCUPANCY1", .desc = "CMS Agent1 AD Credits Occupancy", .code = 0x0087, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ag1_ad_crd_occupancy1), .umasks = icx_unc_m2m_ag1_ad_crd_occupancy1, }, { .name = "UNC_M2M_AG1_BL_CRD_ACQUIRED0", .desc = "CMS Agent1 BL Credits Acquired", .code = 0x008c, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ag1_bl_crd_acquired0), .umasks = icx_unc_m2m_ag1_bl_crd_acquired0, }, { .name = "UNC_M2M_AG1_BL_CRD_ACQUIRED1", .desc = "CMS Agent1 BL Credits Acquired", .code = 0x008d, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ag1_bl_crd_occupancy1), /* shared */ .umasks = icx_unc_m2m_ag1_bl_crd_occupancy1, }, { .name = "UNC_M2M_AG1_BL_CRD_OCCUPANCY0", .desc = "CMS Agent1 BL Credits Occupancy", .code = 0x008e, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_stall0_no_txr_horz_crd_ad_ag0), /* shared */ .umasks = icx_unc_m2m_stall0_no_txr_horz_crd_ad_ag0, }, { .name = "UNC_M2M_AG1_BL_CRD_OCCUPANCY1", .desc = "CMS Agent1 BL Credits Occupancy", .code = 0x008f, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ag1_bl_crd_occupancy1), .umasks = icx_unc_m2m_ag1_bl_crd_occupancy1, }, { .name = "UNC_M2M_BYPASS_M2M_EGRESS", .desc = "M2M to iMC Bypass", .code = 0x0022, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_bypass_m2m_ingress), /* shared */ .umasks = icx_unc_m2m_bypass_m2m_ingress, }, { .name = "UNC_M2M_BYPASS_M2M_INGRESS", .desc = "M2M to iMC Bypass", .code = 0x0021, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(icx_unc_m2m_bypass_m2m_ingress), .umasks = icx_unc_m2m_bypass_m2m_ingress, }, { .name = "UNC_M2M_CLOCKTICKS", .desc = "Clockticks of the mesh to memory (M2M)", .code = 0x0000, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_CMS_CLOCKTICKS", .desc = "CMS Clockticks", .code = 0x00c0, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_DIRECT2CORE_NOT_TAKEN_DIRSTATE", .desc = "Cycles when direct to core mode, which bypasses the CHA, was disabled (experimental)", .code = 0x0024, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_DIRECT2CORE_NOT_TAKEN_NOTFORKED", .desc = "UNC_M2M_DIRECT2CORE_NOT_TAKEN_NOTFORKED (experimental)", .code = 0x0060, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_DIRECT2CORE_TXN_OVERRIDE", .desc = "Number of reads in which direct to core transaction was overridden (experimental)", .code = 0x0025, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_DIRECT2UPI_NOT_TAKEN_CREDITS", .desc = "Number of reads in which direct to Intel UPI transactions were overridden (experimental)", .code = 0x0028, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_DIRECT2UPI_NOT_TAKEN_DIRSTATE", .desc = "Cycles when Direct2UPI was Disabled (experimental)", .code = 0x0027, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_DIRECT2UPI_TXN_OVERRIDE", .desc = "Number of reads that a message sent direct2 Intel UPI was overridden (experimental)", .code = 0x0029, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_DIRECTORY_HIT", .desc = "Directory Hit", .code = 0x002a, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_directory_miss), /* shared */ .umasks = icx_unc_m2m_directory_miss, }, { .name = "UNC_M2M_DIRECTORY_LOOKUP", .desc = "Multi-socket cacheline Directory Lookups", .code = 0x002d, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(icx_unc_m2m_directory_lookup), .umasks = icx_unc_m2m_directory_lookup, }, { .name = "UNC_M2M_DIRECTORY_MISS", .desc = "Directory Miss", .code = 0x002b, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_directory_miss), .umasks = icx_unc_m2m_directory_miss, }, { .name = "UNC_M2M_DIRECTORY_UPDATE", .desc = "Multi-socket cacheline Directory Updates", .code = 0x002e, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_directory_update), .umasks = icx_unc_m2m_directory_update, }, { .name = "UNC_M2M_DISTRESS_ASSERTED", .desc = "Distress signal asserted", .code = 0x00af, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_distress_asserted), .umasks = icx_unc_m2m_distress_asserted, }, { .name = "UNC_M2M_DISTRESS_PMM", .desc = "UNC_M2M_DISTRESS_PMM (experimental)", .code = 0x00f2, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_DISTRESS_PMM_MEMMODE", .desc = "UNC_M2M_DISTRESS_PMM_MEMMODE (experimental)", .code = 0x00f1, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_EGRESS_ORDERING", .desc = "Egress Blocking due to Ordering requirements", .code = 0x00ba, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_egress_ordering), .umasks = icx_unc_m2m_egress_ordering, }, { .name = "UNC_M2M_HORZ_RING_AD_IN_USE", .desc = "Horizontal AD Ring In Use", .code = 0x00b6, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_horz_ring_akc_in_use), /* shared */ .umasks = icx_unc_m2m_horz_ring_akc_in_use, }, { .name = "UNC_M2M_HORZ_RING_AKC_IN_USE", .desc = "Horizontal AK Ring In Use", .code = 0x00bb, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_horz_ring_akc_in_use), .umasks = icx_unc_m2m_horz_ring_akc_in_use, }, { .name = 
"UNC_M2M_HORZ_RING_AK_IN_USE", .desc = "Horizontal AK Ring In Use", .code = 0x00b7, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_horz_ring_bl_in_use), /* shared */ .umasks = icx_unc_m2m_horz_ring_bl_in_use, }, { .name = "UNC_M2M_HORZ_RING_BL_IN_USE", .desc = "Horizontal BL Ring in Use", .code = 0x00b8, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_horz_ring_bl_in_use), .umasks = icx_unc_m2m_horz_ring_bl_in_use, }, { .name = "UNC_M2M_HORZ_RING_IV_IN_USE", .desc = "Horizontal IV Ring in Use", .code = 0x00b9, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_horz_ring_iv_in_use), .umasks = icx_unc_m2m_horz_ring_iv_in_use, }, { .name = "UNC_M2M_IMC_READS", .desc = "M2M Reads Issued to iMC", .code = 0x0037, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_imc_reads), .umasks = icx_unc_m2m_imc_reads, }, { .name = "UNC_M2M_IMC_WRITES", .desc = "M2M Writes Issued to iMC", .code = 0x0038, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_imc_writes), .umasks = icx_unc_m2m_imc_writes, }, { .name = "UNC_M2M_MIRR_WRQ_INSERTS", .desc = "Write Tracker Inserts (experimental)", .code = 0x0064, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_MIRR_WRQ_OCCUPANCY", .desc = "Write Tracker Occupancy (experimental)", .code = 0x0065, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_MISC_EXTERNAL", .desc = "Miscellaneous Events (mostly from MS2IDI)", .code = 0x00e6, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_misc_external), .umasks = icx_unc_m2m_misc_external, }, { .name = "UNC_M2M_PKT_MATCH", .desc = "Number Packet Header Matches", .code = 0x004c, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(icx_unc_m2m_pkt_match), .umasks = icx_unc_m2m_pkt_match, }, { .name = "UNC_M2M_PREFCAM_CIS_DROPS", .desc = "UNC_M2M_PREFCAM_CIS_DROPS (experimental)", .code = 0x0073, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_PREFCAM_CYCLES_FULL", .desc = "Prefetch CAM Cycles Full", .code = 0x006b, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_prefcam_cycles_ne), /* shared */ .umasks = icx_unc_m2m_prefcam_cycles_ne, }, { .name = "UNC_M2M_PREFCAM_CYCLES_NE", .desc = "Prefetch CAM Cycles Not Empty", .code = 0x006c, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_prefcam_cycles_ne), .umasks = icx_unc_m2m_prefcam_cycles_ne, }, { .name = "UNC_M2M_PREFCAM_DEALLOCS", .desc = "Prefetch CAM Deallocs", .code = 0x006e, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_prefcam_deallocs), .umasks = icx_unc_m2m_prefcam_deallocs, }, { .name = "UNC_M2M_PREFCAM_DEMAND_DROPS", .desc = "Data Prefetches Dropped", .code = 0x006f, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_prefcam_inserts), /* shared */ .umasks = icx_unc_m2m_prefcam_inserts, }, { .name = "UNC_M2M_PREFCAM_DEMAND_MERGE", .desc = "Demands Merged with CAMed Prefetches", .code = 0x0074, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_prefcam_demand_merge), .umasks = icx_unc_m2m_prefcam_demand_merge, }, { .name = "UNC_M2M_PREFCAM_DEMAND_NO_MERGE", .desc = "Demands Not Merged with CAMed Prefetches", .code = 0x0075, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_prefcam_demand_no_merge), .umasks = icx_unc_m2m_prefcam_demand_no_merge, }, { .name = "UNC_M2M_PREFCAM_DROP_REASONS_CH0", .desc = "Data Prefetches Dropped Ch0 - Reasons", .code = 0x0070, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 
1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_prefcam_drop_reasons_ch1), /* shared */ .umasks = icx_unc_m2m_prefcam_drop_reasons_ch1, }, { .name = "UNC_M2M_PREFCAM_DROP_REASONS_CH1", .desc = "Data Prefetches Dropped Ch1 - Reasons", .code = 0x0071, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_prefcam_drop_reasons_ch1), .umasks = icx_unc_m2m_prefcam_drop_reasons_ch1, }, { .name = "UNC_M2M_PREFCAM_DROP_REASONS_CH2", .desc = "Data Prefetches Dropped Ch2 - Reasons", .code = 0x0072, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_prefcam_drop_reasons_ch2), .umasks = icx_unc_m2m_prefcam_drop_reasons_ch2, }, { .name = "UNC_M2M_PREFCAM_INSERTS", .desc = "Prefetch CAM Inserts", .code = 0x006d, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_prefcam_inserts), .umasks = icx_unc_m2m_prefcam_inserts, }, { .name = "UNC_M2M_PREFCAM_OCCUPANCY", .desc = "Prefetch CAM Occupancy", .code = 0x006a, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_prefcam_resp_miss), /* shared */ .umasks = icx_unc_m2m_prefcam_resp_miss, }, { .name = "UNC_M2M_PREFCAM_RESP_MISS", .desc = "TBD", .code = 0x0076, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_prefcam_resp_miss), .umasks = icx_unc_m2m_prefcam_resp_miss, }, { .name = "UNC_M2M_PREFCAM_RxC_CYCLES_NE", .desc = "UNC_M2M_PREFCAM_RxC_CYCLES_NE (experimental)", .code = 0x0079, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_PREFCAM_RxC_DEALLOCS", .desc = "UNC_M2M_PREFCAM_RxC_DEALLOCS.SQUASHED", .code = 0x007a, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_prefcam_rxc_deallocs), .umasks = icx_unc_m2m_prefcam_rxc_deallocs, }, { .name = "UNC_M2M_PREFCAM_RxC_INSERTS", .desc = "UNC_M2M_PREFCAM_RxC_INSERTS (experimental)", .code = 0x0078, 
.modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_RING_BOUNCES_HORZ", .desc = "Messages that bounced on the Horizontal Ring.", .code = 0x00ac, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ring_bounces_horz), .umasks = icx_unc_m2m_ring_bounces_horz, }, { .name = "UNC_M2M_RING_BOUNCES_VERT", .desc = "Messages that bounced on the Vertical Ring.", .code = 0x00aa, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ring_sink_starved_vert), /* shared */ .umasks = icx_unc_m2m_ring_sink_starved_vert, }, { .name = "UNC_M2M_RING_SINK_STARVED_HORZ", .desc = "Sink Starvation on Horizontal Ring", .code = 0x00ad, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ring_sink_starved_horz), .umasks = icx_unc_m2m_ring_sink_starved_horz, }, { .name = "UNC_M2M_RING_SINK_STARVED_VERT", .desc = "Sink Starvation on Vertical Ring", .code = 0x00ab, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_ring_sink_starved_vert), .umasks = icx_unc_m2m_ring_sink_starved_vert, }, { .name = "UNC_M2M_RING_SRC_THRTL", .desc = "Source Throttle (experimental)", .code = 0x00ae, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_RPQ_NO_REG_CRD", .desc = "M2M to iMC RPQ Cycles w/Credits - Regular", .code = 0x0043, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_rpq_no_spec_crd), /* shared */ .umasks = icx_unc_m2m_rpq_no_spec_crd, }, { .name = "UNC_M2M_RPQ_NO_REG_CRD_PMM", .desc = "M2M->iMC RPQ Cycles w/Credits - PMM", .code = 0x004f, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_wpq_no_reg_crd), /* shared */ .umasks = icx_unc_m2m_wpq_no_reg_crd, }, { .name = "UNC_M2M_RPQ_NO_SPEC_CRD", .desc = "M2M to iMC RPQ Cycles w/Credits - Special", .code = 0x0044, .modmsk = ICX_UNC_M2M_ATTRS, 
.cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_rpq_no_spec_crd), .umasks = icx_unc_m2m_rpq_no_spec_crd, }, { .name = "UNC_M2M_RxC_AD_CYCLES_FULL", .desc = "AD Ingress (from CMS) Full (experimental)", .code = 0x0004, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_RxC_AD_CYCLES_NE", .desc = "AD Ingress (from CMS) Not Empty (experimental)", .code = 0x0003, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_RxC_AD_INSERTS", .desc = "AD Ingress (from CMS) Allocations (experimental)", .code = 0x0001, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_RxC_AD_OCCUPANCY", .desc = "AD Ingress (from CMS) Occupancy (experimental)", .code = 0x0002, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_RxC_AD_PREF_OCCUPANCY", .desc = "AD Ingress (from CMS) Occupancy - Prefetches (experimental)", .code = 0x0077, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_RxC_AK_WR_CMP", .desc = "AK Egress (to CMS) Allocations (experimental)", .code = 0x005c, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_RxC_BL_CYCLES_FULL", .desc = "BL Ingress (from CMS) Full (experimental)", .code = 0x0008, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_RxC_BL_CYCLES_NE", .desc = "BL Ingress (from CMS) Not Empty (experimental)", .code = 0x0007, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_RxC_BL_INSERTS", .desc = "BL Ingress (from CMS) Allocations (experimental)", .code = 0x0005, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_RxC_BL_OCCUPANCY", .desc = "BL Ingress (from CMS) Occupancy (experimental)", .code = 0x0006, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_RxR_BUSY_STARVED", .desc = "Transgress Injection Starvation", .code = 0x00e5, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_horz_ads_used), /* shared */ .umasks = 
icx_unc_m2m_txr_horz_ads_used, }, { .name = "UNC_M2M_RxR_BYPASS", .desc = "Transgress Ingress Bypass", .code = 0x00e2, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_rxr_inserts), /* shared */ .umasks = icx_unc_m2m_rxr_inserts, }, { .name = "UNC_M2M_RxR_CRD_STARVED", .desc = "Transgress Injection Starvation", .code = 0x00e3, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_rxr_crd_starved), .umasks = icx_unc_m2m_rxr_crd_starved, }, { .name = "UNC_M2M_RxR_CRD_STARVED_1", .desc = "Transgress Injection Starvation (experimental)", .code = 0x00e4, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_RxR_INSERTS", .desc = "Transgress Ingress Allocations", .code = 0x00e1, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_rxr_inserts), .umasks = icx_unc_m2m_rxr_inserts, }, { .name = "UNC_M2M_RxR_OCCUPANCY", .desc = "Transgress Ingress Occupancy", .code = 0x00e0, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_rxr_occupancy), .umasks = icx_unc_m2m_rxr_occupancy, }, { .name = "UNC_M2M_SCOREBOARD_AD_RETRY_ACCEPTS", .desc = "UNC_M2M_SCOREBOARD_AD_RETRY_ACCEPTS (experimental)", .code = 0x0033, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_SCOREBOARD_AD_RETRY_REJECTS", .desc = "UNC_M2M_SCOREBOARD_AD_RETRY_REJECTS (experimental)", .code = 0x0034, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_SCOREBOARD_BL_RETRY_ACCEPTS", .desc = "Retry - Mem Mirroring Mode (experimental)", .code = 0x0035, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_SCOREBOARD_BL_RETRY_REJECTS", .desc = "Retry - Mem Mirroring Mode (experimental)", .code = 0x0036, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_SCOREBOARD_RD_ACCEPTS", .desc = "Scoreboard Accepts (experimental)", .code = 0x002f, .modmsk = 
ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_SCOREBOARD_RD_REJECTS", .desc = "Scoreboard Rejects (experimental)", .code = 0x0030, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_SCOREBOARD_WR_ACCEPTS", .desc = "Scoreboard Accepts (experimental)", .code = 0x0031, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_SCOREBOARD_WR_REJECTS", .desc = "Scoreboard Rejects (experimental)", .code = 0x0032, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_STALL0_NO_TxR_HORZ_CRD_AD_AG0", .desc = "Stall on No AD Agent0 Transgress Credits", .code = 0x00d0, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_stall0_no_txr_horz_crd_ad_ag0), .umasks = icx_unc_m2m_stall0_no_txr_horz_crd_ad_ag0, }, { .name = "UNC_M2M_STALL0_NO_TxR_HORZ_CRD_AD_AG1", .desc = "Stall on No AD Agent1 Transgress Credits", .code = 0x00d2, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_stall0_no_txr_horz_crd_bl_ag0), /* shared */ .umasks = icx_unc_m2m_stall0_no_txr_horz_crd_bl_ag0, }, { .name = "UNC_M2M_STALL0_NO_TxR_HORZ_CRD_BL_AG0", .desc = "Stall on No BL Agent0 Transgress Credits", .code = 0x00d4, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_stall0_no_txr_horz_crd_bl_ag0), .umasks = icx_unc_m2m_stall0_no_txr_horz_crd_bl_ag0, }, { .name = "UNC_M2M_STALL0_NO_TxR_HORZ_CRD_BL_AG1", .desc = "Stall on No BL Agent1 Transgress Credits", .code = 0x00d6, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_stall0_no_txr_horz_crd_bl_ag1), .umasks = icx_unc_m2m_stall0_no_txr_horz_crd_bl_ag1, }, { .name = "UNC_M2M_STALL1_NO_TxR_HORZ_CRD_AD_AG0", .desc = "Stall on No AD Agent0 Transgress Credits", .code = 0x00d1, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_stall1_no_txr_horz_crd_ad_ag1_1), /* shared */ 
.umasks = icx_unc_m2m_stall1_no_txr_horz_crd_ad_ag1_1, }, { .name = "UNC_M2M_STALL1_NO_TxR_HORZ_CRD_AD_AG1_1", .desc = "Stall on No AD Agent1 Transgress Credits", .code = 0x00d3, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_stall1_no_txr_horz_crd_ad_ag1_1), .umasks = icx_unc_m2m_stall1_no_txr_horz_crd_ad_ag1_1, }, { .name = "UNC_M2M_STALL1_NO_TxR_HORZ_CRD_BL_AG0_1", .desc = "Stall on No BL Agent0 Transgress Credits", .code = 0x00d5, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_stall1_no_txr_horz_crd_bl_ag1_1), /* shared */ .umasks = icx_unc_m2m_stall1_no_txr_horz_crd_bl_ag1_1, }, { .name = "UNC_M2M_STALL1_NO_TxR_HORZ_CRD_BL_AG1_1", .desc = "Stall on No BL Agent1 Transgress Credits", .code = 0x00d7, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_stall1_no_txr_horz_crd_bl_ag1_1), .umasks = icx_unc_m2m_stall1_no_txr_horz_crd_bl_ag1_1, }, { .name = "UNC_M2M_TAG_HIT", .desc = "Tag Hit", .code = 0x002c, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_tag_hit), .umasks = icx_unc_m2m_tag_hit, }, { .name = "UNC_M2M_TAG_MISS", .desc = "Tag Miss (experimental)", .code = 0x0061, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_TGR_AD_CREDITS", .desc = "Number AD Ingress Credits (experimental)", .code = 0x0041, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_TGR_BL_CREDITS", .desc = "Number BL Ingress Credits (experimental)", .code = 0x0042, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_TRACKER_FULL", .desc = "Tracker Cycles Full", .code = 0x0045, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_tracker_inserts), /* shared */ .umasks = icx_unc_m2m_tracker_inserts, }, { .name = "UNC_M2M_TRACKER_INSERTS", .desc = "Tracker Inserts", .code = 0x0049, .modmsk = 
ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_tracker_inserts), .umasks = icx_unc_m2m_tracker_inserts, }, { .name = "UNC_M2M_TRACKER_NE", .desc = "Tracker Cycles Not Empty", .code = 0x0046, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_tracker_occupancy), /* shared */ .umasks = icx_unc_m2m_tracker_occupancy, }, { .name = "UNC_M2M_TRACKER_OCCUPANCY", .desc = "Tracker Occupancy", .code = 0x0047, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_tracker_occupancy), .umasks = icx_unc_m2m_tracker_occupancy, }, { .name = "UNC_M2M_TxC_AD_CREDITS_ACQUIRED", .desc = "AD Egress (to CMS) Credit Acquired (experimental)", .code = 0x000d, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_TxC_AD_CREDIT_OCCUPANCY", .desc = "AD Egress (to CMS) Credits Occupancy (experimental)", .code = 0x000e, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_TxC_AD_CYCLES_FULL", .desc = "AD Egress (to CMS) Full (experimental)", .code = 0x000c, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_TxC_AD_CYCLES_NE", .desc = "AD Egress (to CMS) Not Empty (experimental)", .code = 0x000b, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_TxC_AD_INSERTS", .desc = "AD Egress (to CMS) Allocations (experimental)", .code = 0x0009, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_TxC_AD_NO_CREDIT_CYCLES", .desc = "Cycles with No AD Egress (to CMS) Credits (experimental)", .code = 0x000f, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_TxC_AD_NO_CREDIT_STALLED", .desc = "Cycles Stalled with No AD Egress (to CMS) Credits (experimental)", .code = 0x0010, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_TxC_AD_OCCUPANCY", .desc = "AD Egress (to CMS) Occupancy (experimental)", .code = 0x000a, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, 
{ .name = "UNC_M2M_TxC_AK", .desc = "Outbound Ring Transactions on AK", .code = 0x0039, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txc_ak), .umasks = icx_unc_m2m_txc_ak, }, { .name = "UNC_M2M_TxC_AKC_CREDITS", .desc = "AKC Credits (experimental)", .code = 0x005f, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2M_TxC_AK_CREDITS_ACQUIRED", .desc = "AK Egress (to CMS) Credit Acquired", .code = 0x001d, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txc_ak_no_credit_cycles), /* shared */ .umasks = icx_unc_m2m_txc_ak_no_credit_cycles, }, { .name = "UNC_M2M_TxC_AK_CYCLES_FULL", .desc = "AK Egress (to CMS) Full", .code = 0x0014, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txc_ak_cycles_full), .umasks = icx_unc_m2m_txc_ak_cycles_full, }, { .name = "UNC_M2M_TxC_AK_CYCLES_NE", .desc = "AK Egress (to CMS) Not Empty", .code = 0x0013, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txc_ak_occupancy), /* shared */ .umasks = icx_unc_m2m_txc_ak_occupancy, }, { .name = "UNC_M2M_TxC_AK_INSERTS", .desc = "AK Egress (to CMS) Allocations", .code = 0x0011, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txc_ak_inserts), .umasks = icx_unc_m2m_txc_ak_inserts, }, { .name = "UNC_M2M_TxC_AK_NO_CREDIT_CYCLES", .desc = "Cycles with No AK Egress (to CMS) Credits", .code = 0x001f, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txc_ak_no_credit_cycles), .umasks = icx_unc_m2m_txc_ak_no_credit_cycles, }, { .name = "UNC_M2M_TxC_AK_NO_CREDIT_STALLED", .desc = "Cycles Stalled with No AK Egress (to CMS) Credits", .code = 0x0020, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txc_bl_credits_acquired), /* shared */ .umasks = 
icx_unc_m2m_txc_bl_credits_acquired, }, { .name = "UNC_M2M_TxC_AK_OCCUPANCY", .desc = "AK Egress (to CMS) Occupancy", .code = 0x0012, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txc_ak_occupancy), .umasks = icx_unc_m2m_txc_ak_occupancy, }, { .name = "UNC_M2M_TxC_BL", .desc = "Outbound DRS Ring Transactions to Cache", .code = 0x0040, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txc_bl), .umasks = icx_unc_m2m_txc_bl, }, { .name = "UNC_M2M_TxC_BL_CREDITS_ACQUIRED", .desc = "BL Egress (to CMS) Credit Acquired", .code = 0x0019, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txc_bl_credits_acquired), .umasks = icx_unc_m2m_txc_bl_credits_acquired, }, { .name = "UNC_M2M_TxC_BL_CYCLES_FULL", .desc = "BL Egress (to CMS) Full", .code = 0x0018, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txc_bl_cycles_ne), /* shared */ .umasks = icx_unc_m2m_txc_bl_cycles_ne, }, { .name = "UNC_M2M_TxC_BL_CYCLES_NE", .desc = "BL Egress (to CMS) Not Empty", .code = 0x0017, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txc_bl_cycles_ne), .umasks = icx_unc_m2m_txc_bl_cycles_ne, }, { .name = "UNC_M2M_TxC_BL_INSERTS", .desc = "BL Egress (to CMS) Allocations", .code = 0x0015, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txc_bl_inserts), .umasks = icx_unc_m2m_txc_bl_inserts, }, { .name = "UNC_M2M_TxC_BL_NO_CREDIT_CYCLES", .desc = "Cycles with No BL Egress (to CMS) Credits", .code = 0x001b, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txc_bl_no_credit_stalled), /* shared */ .umasks = icx_unc_m2m_txc_bl_no_credit_stalled, }, { .name = "UNC_M2M_TxC_BL_NO_CREDIT_STALLED", .desc = "Cycles Stalled with No BL Egress (to CMS) Credits", .code = 
0x001c, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txc_bl_no_credit_stalled), .umasks = icx_unc_m2m_txc_bl_no_credit_stalled, }, { .name = "UNC_M2M_TxR_HORZ_ADS_USED", .desc = "CMS Horizontal ADS Used", .code = 0x00a6, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_horz_ads_used), .umasks = icx_unc_m2m_txr_horz_ads_used, }, { .name = "UNC_M2M_TxR_HORZ_BYPASS", .desc = "CMS Horizontal Bypass Used", .code = 0x00a7, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_horz_cycles_full), /* shared */ .umasks = icx_unc_m2m_txr_horz_cycles_full, }, { .name = "UNC_M2M_TxR_HORZ_CYCLES_FULL", .desc = "Cycles CMS Horizontal Egress Queue is Full", .code = 0x00a2, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_horz_cycles_full), .umasks = icx_unc_m2m_txr_horz_cycles_full, }, { .name = "UNC_M2M_TxR_HORZ_CYCLES_NE", .desc = "Cycles CMS Horizontal Egress Queue is Not Empty", .code = 0x00a3, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_horz_inserts), /* shared */ .umasks = icx_unc_m2m_txr_horz_inserts, }, { .name = "UNC_M2M_TxR_HORZ_INSERTS", .desc = "CMS Horizontal Egress Inserts", .code = 0x00a1, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_horz_inserts), .umasks = icx_unc_m2m_txr_horz_inserts, }, { .name = "UNC_M2M_TxR_HORZ_NACK", .desc = "CMS Horizontal Egress NACKs", .code = 0x00a4, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_horz_occupancy), /* shared */ .umasks = icx_unc_m2m_txr_horz_occupancy, }, { .name = "UNC_M2M_TxR_HORZ_OCCUPANCY", .desc = "CMS Horizontal Egress Occupancy", .code = 0x00a0, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_horz_occupancy), .umasks = icx_unc_m2m_txr_horz_occupancy, }, { .name = "UNC_M2M_TxR_HORZ_STARVED", .desc = "CMS Horizontal Egress Injection Starvation", .code = 0x00a5, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_horz_starved), .umasks = icx_unc_m2m_txr_horz_starved, }, { .name = "UNC_M2M_TxR_VERT_ADS_USED", .desc = "CMS Vertical ADS Used", .code = 0x009c, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_vert_ads_used), .umasks = icx_unc_m2m_txr_vert_ads_used, }, { .name = "UNC_M2M_TxR_VERT_BYPASS", .desc = "CMS Vertical Bypass Used", .code = 0x009d, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_vert_bypass), .umasks = icx_unc_m2m_txr_vert_bypass, }, { .name = "UNC_M2M_TxR_VERT_BYPASS_1", .desc = "CMS Vertical Bypass Used", .code = 0x009e, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_vert_cycles_full1), /* shared */ .umasks = icx_unc_m2m_txr_vert_cycles_full1, }, { .name = "UNC_M2M_TxR_VERT_CYCLES_FULL0", .desc = "Cycles CMS Vertical Egress Queue Is Full", .code = 0x0094, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_vert_cycles_ne0), /* shared */ .umasks = icx_unc_m2m_txr_vert_cycles_ne0, }, { .name = "UNC_M2M_TxR_VERT_CYCLES_FULL1", .desc = "Cycles CMS Vertical Egress Queue Is Full", .code = 0x0095, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_vert_cycles_full1), .umasks = icx_unc_m2m_txr_vert_cycles_full1, }, { .name = "UNC_M2M_TxR_VERT_CYCLES_NE0", .desc = "Cycles CMS Vertical Egress Queue Is Not Empty", .code = 0x0096, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_vert_cycles_ne0), .umasks = icx_unc_m2m_txr_vert_cycles_ne0, }, { .name = 
"UNC_M2M_TxR_VERT_CYCLES_NE1", .desc = "Cycles CMS Vertical Egress Queue Is Not Empty", .code = 0x0097, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_vert_inserts1), /* shared */ .umasks = icx_unc_m2m_txr_vert_inserts1, }, { .name = "UNC_M2M_TxR_VERT_INSERTS0", .desc = "CMS Vert Egress Allocations", .code = 0x0092, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_vert_occupancy0), /* shared */ .umasks = icx_unc_m2m_txr_vert_occupancy0, }, { .name = "UNC_M2M_TxR_VERT_INSERTS1", .desc = "CMS Vert Egress Allocations", .code = 0x0093, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_vert_inserts1), .umasks = icx_unc_m2m_txr_vert_inserts1, }, { .name = "UNC_M2M_TxR_VERT_NACK0", .desc = "CMS Vertical Egress NACKs", .code = 0x0098, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_vert_starved0), /* shared */ .umasks = icx_unc_m2m_txr_vert_starved0, }, { .name = "UNC_M2M_TxR_VERT_NACK1", .desc = "CMS Vertical Egress NACKs", .code = 0x0099, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_vert_occupancy1), /* shared */ .umasks = icx_unc_m2m_txr_vert_occupancy1, }, { .name = "UNC_M2M_TxR_VERT_OCCUPANCY0", .desc = "CMS Vert Egress Occupancy", .code = 0x0090, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_vert_occupancy0), .umasks = icx_unc_m2m_txr_vert_occupancy0, }, { .name = "UNC_M2M_TxR_VERT_OCCUPANCY1", .desc = "CMS Vert Egress Occupancy", .code = 0x0091, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_vert_occupancy1), .umasks = icx_unc_m2m_txr_vert_occupancy1, }, { .name = "UNC_M2M_TxR_VERT_STARVED0", .desc = "CMS Vertical Egress Injection Starvation", .code = 0x009a, .modmsk = ICX_UNC_M2M_ATTRS, 
.cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_vert_starved0), .umasks = icx_unc_m2m_txr_vert_starved0, }, { .name = "UNC_M2M_TxR_VERT_STARVED1", .desc = "CMS Vertical Egress Injection Starvation", .code = 0x009b, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_txr_vert_starved1), .umasks = icx_unc_m2m_txr_vert_starved1, }, { .name = "UNC_M2M_VERT_RING_AD_IN_USE", .desc = "Vertical AD Ring In Use", .code = 0x00b0, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_vert_ring_akc_in_use), /* shared */ .umasks = icx_unc_m2m_vert_ring_akc_in_use, }, { .name = "UNC_M2M_VERT_RING_AKC_IN_USE", .desc = "Vertical AKC Ring In Use", .code = 0x00b4, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_vert_ring_akc_in_use), .umasks = icx_unc_m2m_vert_ring_akc_in_use, }, { .name = "UNC_M2M_VERT_RING_AK_IN_USE", .desc = "Vertical AK Ring In Use", .code = 0x00b1, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_vert_ring_bl_in_use), /* shared */ .umasks = icx_unc_m2m_vert_ring_bl_in_use, }, { .name = "UNC_M2M_VERT_RING_BL_IN_USE", .desc = "Vertical BL Ring in Use", .code = 0x00b2, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_vert_ring_bl_in_use), .umasks = icx_unc_m2m_vert_ring_bl_in_use, }, { .name = "UNC_M2M_VERT_RING_IV_IN_USE", .desc = "Vertical IV Ring in Use", .code = 0x00b3, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_vert_ring_iv_in_use), .umasks = icx_unc_m2m_vert_ring_iv_in_use, }, { .name = "UNC_M2M_VERT_RING_TGC_IN_USE", .desc = "Vertical TGC Ring In Use", .code = 0x00b5, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_vert_ring_tgc_in_use), .umasks = icx_unc_m2m_vert_ring_tgc_in_use, }, { .name = 
"UNC_M2M_WPQ_FLUSH", .desc = "WPQ Flush", .code = 0x0058, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_wr_tracker_inserts), /* shared */ .umasks = icx_unc_m2m_wr_tracker_inserts, }, { .name = "UNC_M2M_WPQ_NO_REG_CRD", .desc = "M2M->iMC WPQ Cycles w/Credits - Regular", .code = 0x004d, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_wpq_no_reg_crd), .umasks = icx_unc_m2m_wpq_no_reg_crd, }, { .name = "UNC_M2M_WPQ_NO_REG_CRD_PMM", .desc = "M2M->iMC WPQ Cycles w/Credits - PMM", .code = 0x0051, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_wpq_no_spec_crd), /* shared */ .umasks = icx_unc_m2m_wpq_no_spec_crd, }, { .name = "UNC_M2M_WPQ_NO_SPEC_CRD", .desc = "M2M->iMC WPQ Cycles w/Credits - Special", .code = 0x004e, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_wpq_no_spec_crd), .umasks = icx_unc_m2m_wpq_no_spec_crd, }, { .name = "UNC_M2M_WR_TRACKER_FULL", .desc = "Write Tracker Cycles Full", .code = 0x004a, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_wr_tracker_full), .umasks = icx_unc_m2m_wr_tracker_full, }, { .name = "UNC_M2M_WR_TRACKER_INSERTS", .desc = "Write Tracker Inserts", .code = 0x0056, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_wr_tracker_inserts), .umasks = icx_unc_m2m_wr_tracker_inserts, }, { .name = "UNC_M2M_WR_TRACKER_NE", .desc = "Write Tracker Cycles Not Empty", .code = 0x004b, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_wr_tracker_occupancy), /* shared */ .umasks = icx_unc_m2m_wr_tracker_occupancy, }, { .name = "UNC_M2M_WR_TRACKER_NONPOSTED_INSERTS", .desc = "Write Tracker Non-Posted Inserts", .code = 0x0063, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(icx_unc_m2m_wr_tracker_nonposted_occupancy), /* shared */ .umasks = icx_unc_m2m_wr_tracker_nonposted_occupancy, }, { .name = "UNC_M2M_WR_TRACKER_NONPOSTED_OCCUPANCY", .desc = "Write Tracker Non-Posted Occupancy", .code = 0x0062, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_wr_tracker_nonposted_occupancy), .umasks = icx_unc_m2m_wr_tracker_nonposted_occupancy, }, { .name = "UNC_M2M_WR_TRACKER_OCCUPANCY", .desc = "Write Tracker Occupancy", .code = 0x0055, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_wr_tracker_occupancy), .umasks = icx_unc_m2m_wr_tracker_occupancy, }, { .name = "UNC_M2M_WR_TRACKER_POSTED_INSERTS", .desc = "Write Tracker Posted Inserts", .code = 0x005e, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_wr_tracker_posted_occupancy), /* shared */ .umasks = icx_unc_m2m_wr_tracker_posted_occupancy, }, { .name = "UNC_M2M_WR_TRACKER_POSTED_OCCUPANCY", .desc = "Write Tracker Posted Occupancy", .code = 0x005d, .modmsk = ICX_UNC_M2M_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2m_wr_tracker_posted_occupancy), .umasks = icx_unc_m2m_wr_tracker_posted_occupancy, }, }; /* 175 events available */
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_icx_unc_m2pcie_events.h
/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission 
notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: icx_unc_m2pcie (IcelakeX Uncore M2PCIE) * Based on Intel JSON event table version : 1.21 * Based on Intel JSON event table published : 06/06/2023 */ static const intel_x86_umask_t icx_unc_m2p_ag0_ad_crd_occupancy0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_ag0_ad_crd_occupancy1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_ag0_bl_crd_occupancy0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_ag0_bl_crd_occupancy1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_ag1_ad_crd_occupancy0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, 
.uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_ag1_ad_crd_occupancy1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_ag1_bl_crd_acquired0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, 
}; static const intel_x86_umask_t icx_unc_m2p_ag1_bl_crd_occupancy1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_distress_asserted[]={ { .uname = "DPT_LOCAL", .udesc = "DPT Local (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DPT_NONLOCAL", .udesc = "DPT Remote (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DPT_STALL_IV", .udesc = "DPT Stalled - IV (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DPT_STALL_NOCRD", .udesc = "DPT Stalled - No Credit (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HORZ", .udesc = "Horizontal (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_LOCAL", .udesc = "PMM Local (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_NONLOCAL", .udesc = "PMM Remote (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VERT", .udesc = "Vertical (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_egress_ordering[]={ { .uname = "IV_SNOOPGO_DN", .udesc = "Down (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_SNOOPGO_UP", .udesc = "Up (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_horz_ring_akc_in_use[]={ { .uname = "LEFT_EVEN", .udesc = "Left and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LEFT_ODD", .udesc = "Left and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { 
.uname = "RIGHT_EVEN", .udesc = "Right and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_ODD", .udesc = "Right and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_horz_ring_bl_in_use[]={ { .uname = "LEFT_EVEN", .udesc = "Left and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LEFT_ODD", .udesc = "Left and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_EVEN", .udesc = "Right and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_ODD", .udesc = "Right and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_horz_ring_iv_in_use[]={ { .uname = "LEFT", .udesc = "Left (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT", .udesc = "Right (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_iio_credits_acquired[]={ { .uname = "DRS_0", .udesc = "DRS (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_1", .udesc = "DRS (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB_0", .udesc = "NCB (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB_1", .udesc = "NCB (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS_0", .udesc = "NCS (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS_1", .udesc = "NCS (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_iio_credits_reject[]={ { .uname = "DRS", .udesc = "DRS (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB", .udesc = "NCB (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { 
.uname = "NCS", .udesc = "NCS (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_iio_credits_used[]={ { .uname = "DRS_0", .udesc = "DRS to CMS Port 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_1", .udesc = "DRS to CMS Port 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB_0", .udesc = "NCB to CMS Port 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB_1", .udesc = "NCB to CMS Port 1 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS_0", .udesc = "NCS to CMS Port 0 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS_1", .udesc = "NCS to CMS Port 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_local_p2p_ded_returned_0[]={ { .uname = "MS2IOSF0_NCB", .udesc = "M2IOSF0 - NCB (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS2IOSF0_NCS", .udesc = "M2IOSF0 - NCS (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS2IOSF1_NCB", .udesc = "M2IOSF1 - NCB (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS2IOSF1_NCS", .udesc = "M2IOSF1 - NCS (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS2IOSF2_NCB", .udesc = "M2IOSF2 - NCB (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS2IOSF2_NCS", .udesc = "M2IOSF2 - NCS (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS2IOSF3_NCB", .udesc = "M2IOSF3 - NCB (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS2IOSF3_NCS", .udesc = "M2IOSF3 - NCS (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_local_p2p_ded_returned_1[]={ { .uname = "MS2IOSF4_NCB", .udesc = 
"M2IOSF4 - NCB (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS2IOSF4_NCS", .udesc = "M2IOSF4 - NCS (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS2IOSF5_NCB", .udesc = "M2IOSF5 - NCB (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS2IOSF5_NCS", .udesc = "M2IOSF5 - NCS (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_local_shar_p2p_crd_returned[]={ { .uname = "AGENT_0", .udesc = "Agent0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AGENT_1", .udesc = "Agent1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AGENT_2", .udesc = "Agent2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AGENT_3", .udesc = "Agent3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AGENT_4", .udesc = "Agent4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AGENT_5", .udesc = "Agent5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_local_shar_p2p_crd_taken_0[]={ { .uname = "M2IOSF0_NCB", .udesc = "M2IOSF0 - NCB (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF0_NCS", .udesc = "M2IOSF0 - NCS (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF1_NCB", .udesc = "M2IOSF1 - NCB (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF1_NCS", .udesc = "M2IOSF1 - NCS (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF2_NCB", .udesc = "M2IOSF2 - NCB (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF2_NCS", .udesc = "M2IOSF2 - NCS (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF3_NCB", .udesc = "M2IOSF3 - NCB 
(experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF3_NCS", .udesc = "M2IOSF3 - NCS (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_local_shar_p2p_crd_taken_1[]={ { .uname = "M2IOSF4_NCB", .udesc = "M2IOSF4 - NCB (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF4_NCS", .udesc = "M2IOSF4 - NCS (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF5_NCB", .udesc = "M2IOSF5 - NCB (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF5_NCS", .udesc = "M2IOSF5 - NCS (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_local_shar_p2p_crd_wait_0[]={ { .uname = "M2IOSF0_NCB", .udesc = "M2IOSF0 - NCB (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF0_NCS", .udesc = "M2IOSF0 - NCS (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF1_NCB", .udesc = "M2IOSF1 - NCB (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF1_NCS", .udesc = "M2IOSF1 - NCS (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF2_NCB", .udesc = "M2IOSF2 - NCB (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF2_NCS", .udesc = "M2IOSF2 - NCS (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF3_NCB", .udesc = "M2IOSF3 - NCB (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF3_NCS", .udesc = "M2IOSF3 - NCS (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_local_shar_p2p_crd_wait_1[]={ { .uname = "M2IOSF4_NCB", .udesc = "M2IOSF4 - NCB (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF4_NCS", .udesc = "M2IOSF4 - NCS 
(experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF5_NCB", .udesc = "M2IOSF5 - NCB (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M2IOSF5_NCS", .udesc = "M2IOSF5 - NCS (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_misc_external[]={ { .uname = "MBE_INST0", .udesc = "Number of cycles MBE is high for MS2IDI0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MBE_INST1", .udesc = "Number of cycles MBE is high for MS2IDI1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_p2p_crd_occupancy[]={ { .uname = "ALL", .udesc = "All (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "LOCAL_NCB", .udesc = "Local NCB (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_NCS", .udesc = "Local NCS (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_NCB", .udesc = "Remote NCB (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_NCS", .udesc = "Remote NCS (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_p2p_shar_received[]={ { .uname = "ALL", .udesc = "All (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "LOCAL_NCB", .udesc = "Local NCB (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_NCS", .udesc = "Local NCS (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_NCB", .udesc = "Remote NCB (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_NCS", .udesc = "Remote NCS (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_remote_p2p_ded_returned[]={ 
{ .uname = "UPI0_NCB", .udesc = "UPI0 - NCB (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI0_NCS", .udesc = "UPI0 - NCS (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI1_NCB", .udesc = "UPI1 - NCB (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI1_NCS", .udesc = "UPI1 - NCS (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI2_NCB", .udesc = "UPI2 - NCB (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI2_NCS", .udesc = "UPI2 - NCS (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_remote_p2p_shar_returned[]={ { .uname = "AGENT_0", .udesc = "Agent0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AGENT_1", .udesc = "Agent1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AGENT_2", .udesc = "Agent2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_remote_shar_p2p_crd_returned[]={ { .uname = "AGENT_0", .udesc = "Agent0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AGENT_1", .udesc = "Agent1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AGENT_2", .udesc = "Agent2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_remote_shar_p2p_crd_taken_0[]={ { .uname = "UPI0_DRS", .udesc = "UPI0 - DRS (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI0_NCB", .udesc = "UPI0 - NCB (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI0_NCS", .udesc = "UPI0 - NCS (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI1_DRS", .udesc = "UPI1 - DRS (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { 
.uname = "UPI1_NCB", .udesc = "UPI1 - NCB (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI1_NCS", .udesc = "UPI1 - NCS (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_remote_shar_p2p_crd_taken_1[]={ { .uname = "UPI2_DRS", .udesc = "UPI2 - DRS (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI2_NCB", .udesc = "UPI2 - NCB (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI2_NCS", .udesc = "UPI2 - NCS (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_remote_shar_p2p_crd_wait_0[]={ { .uname = "UPI0_DRS", .udesc = "UPI0 - DRS (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI0_NCB", .udesc = "UPI0 - NCB (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI0_NCS", .udesc = "UPI0 - NCS (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI1_DRS", .udesc = "UPI1 - DRS (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI1_NCB", .udesc = "UPI1 - NCB (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI1_NCS", .udesc = "UPI1 - NCS (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_remote_shar_p2p_crd_wait_1[]={ { .uname = "UPI2_DRS", .udesc = "UPI2 - DRS (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI2_NCB", .udesc = "UPI2 - NCB (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI2_NCS", .udesc = "UPI2 - NCS (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_ring_bounces_horz[]={ { .uname = "AD", .udesc = "AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK 
(experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "BL (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_ring_sink_starved_horz[]={ { .uname = "AD", .udesc = "AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "Acknowledgements to Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "BL (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_ring_sink_starved_vert[]={ { .uname = "AD", .udesc = "AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "Acknowledgements to core (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "Data Responses to core (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "Snoops of processor's cache. 
(experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_rxc_inserts[]={ { .uname = "ALL", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CHA_IDI", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHA_NCB", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHA_NCS", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IIO_NCB", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IIO_NCS", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI_NCB", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UPI_NCS", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_rxr_crd_starved[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IFV", .udesc = "IFV - Credited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags 
= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_rxr_inserts[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_rxr_occupancy[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 
0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_stall0_no_txr_horz_crd_ad_ag0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_stall0_no_txr_horz_crd_bl_ag0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_stall0_no_txr_horz_crd_bl_ag1[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_stall1_no_txr_horz_crd_ad_ag1_1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_stall1_no_txr_horz_crd_bl_ag1_1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const 
intel_x86_umask_t icx_unc_m2p_txc_credits[]={ { .uname = "PMM", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_txc_cycles_full[]={ { .uname = "AD_0", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_1", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_0", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_1", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_0", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_1", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_BLOCK_0", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_BLOCK_1", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_txc_cycles_ne[]={ { .uname = "AD_0", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_1", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_0", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_1", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_0", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_1", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_DISTRESS_0", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_DISTRESS_1", .udesc = "TBD (experimental)", .ucode = 0x0800ull, 
.uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_txc_inserts[]={ { .uname = "AD_0", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_1", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_CRD_0", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_CRD_1", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_0", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_1", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_txr_horz_ads_used[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_txr_horz_cycles_full[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - 
Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_txr_horz_inserts[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_txr_horz_occupancy[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", 
.udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_txr_horz_starved[]={ { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_txr_vert_ads_used[]={ { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const 
intel_x86_umask_t icx_unc_m2p_txr_vert_bypass[]={ { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG0", .udesc = "AK - Agent 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "AK - Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_AG1", .udesc = "IV - Agent 1 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_txr_vert_cycles_full1[]={ { .uname = "AKC_AG0", .udesc = "AKC - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_AG1", .udesc = "AKC - Agent 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_txr_vert_cycles_ne0[]={ { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG0", .udesc = "AK - Agent 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "AK - Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_AG0", .udesc = "IV - Agent 0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static 
const intel_x86_umask_t icx_unc_m2p_txr_vert_inserts1[]={ { .uname = "AKC_AG0", .udesc = "AKC - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_AG1", .udesc = "AKC - Agent 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_txr_vert_occupancy0[]={ { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG0", .udesc = "AK - Agent 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "AK - Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_AG0", .udesc = "IV - Agent 0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_txr_vert_occupancy1[]={ { .uname = "AKC_AG0", .udesc = "AKC - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_AG1", .udesc = "AKC - Agent 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_txr_vert_starved0[]={ { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG0", .udesc = "AK - Agent 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "AK - Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG0", .udesc = "BL - Agent 0 
(experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_AG0", .udesc = "IV - Agent 0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_txr_vert_starved1[]={ { .uname = "AKC_AG0", .udesc = "AKC - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC_AG1", .udesc = "AKC - Agent 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGC", .udesc = "TGC (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_vert_ring_akc_in_use[]={ { .uname = "DN_EVEN", .udesc = "Down and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DN_ODD", .udesc = "Down and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_EVEN", .udesc = "Up and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_ODD", .udesc = "Up and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_vert_ring_bl_in_use[]={ { .uname = "DN_EVEN", .udesc = "Down and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DN_ODD", .udesc = "Down and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_EVEN", .udesc = "Up and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_ODD", .udesc = "Up and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m2p_vert_ring_iv_in_use[]={ { .uname = "DN", .udesc = "Down (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP", .udesc = "Up (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, };
static const intel_x86_umask_t icx_unc_m2p_vert_ring_tgc_in_use[]={ { .uname = "DN_EVEN", .udesc = "Down and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DN_ODD", .udesc = "Down and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_EVEN", .udesc = "Up and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_ODD", .udesc = "Up and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_icx_unc_m2pcie_pe[]={ { .name = "UNC_M2P_AG0_AD_CRD_ACQUIRED0", .desc = "CMS Agent0 AD Credits Acquired", .code = 0x0080, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ag0_ad_crd_occupancy0), /* shared */ .umasks = icx_unc_m2p_ag0_ad_crd_occupancy0, }, { .name = "UNC_M2P_AG0_AD_CRD_ACQUIRED1", .desc = "CMS Agent0 AD Credits Acquired", .code = 0x0081, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ag0_ad_crd_occupancy1), /* shared */ .umasks = icx_unc_m2p_ag0_ad_crd_occupancy1, }, { .name = "UNC_M2P_AG0_AD_CRD_OCCUPANCY0", .desc = "CMS Agent0 AD Credits Occupancy", .code = 0x0082, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ag0_ad_crd_occupancy0), .umasks = icx_unc_m2p_ag0_ad_crd_occupancy0, }, { .name = "UNC_M2P_AG0_AD_CRD_OCCUPANCY1", .desc = "CMS Agent0 AD Credits Occupancy", .code = 0x0083, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ag0_ad_crd_occupancy1), .umasks = icx_unc_m2p_ag0_ad_crd_occupancy1, }, { .name = "UNC_M2P_AG0_BL_CRD_ACQUIRED0", .desc = "CMS Agent0 BL Credits Acquired", .code = 0x0088, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ag0_bl_crd_occupancy0), /* shared */ .umasks = icx_unc_m2p_ag0_bl_crd_occupancy0, }, { .name = 
"UNC_M2P_AG0_BL_CRD_ACQUIRED1", .desc = "CMS Agent0 BL Credits Acquired", .code = 0x0089, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ag0_bl_crd_occupancy1), /* shared */ .umasks = icx_unc_m2p_ag0_bl_crd_occupancy1, }, { .name = "UNC_M2P_AG0_BL_CRD_OCCUPANCY0", .desc = "CMS Agent0 BL Credits Occupancy", .code = 0x008a, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ag0_bl_crd_occupancy0), .umasks = icx_unc_m2p_ag0_bl_crd_occupancy0, }, { .name = "UNC_M2P_AG0_BL_CRD_OCCUPANCY1", .desc = "CMS Agent0 BL Credits Occupancy", .code = 0x008b, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ag0_bl_crd_occupancy1), .umasks = icx_unc_m2p_ag0_bl_crd_occupancy1, }, { .name = "UNC_M2P_AG1_AD_CRD_ACQUIRED0", .desc = "CMS Agent1 AD Credits Acquired", .code = 0x0084, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ag1_ad_crd_occupancy0), /* shared */ .umasks = icx_unc_m2p_ag1_ad_crd_occupancy0, }, { .name = "UNC_M2P_AG1_AD_CRD_ACQUIRED1", .desc = "CMS Agent1 AD Credits Acquired", .code = 0x0085, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ag1_ad_crd_occupancy1), /* shared */ .umasks = icx_unc_m2p_ag1_ad_crd_occupancy1, }, { .name = "UNC_M2P_AG1_AD_CRD_OCCUPANCY0", .desc = "CMS Agent1 AD Credits Occupancy", .code = 0x0086, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ag1_ad_crd_occupancy0), .umasks = icx_unc_m2p_ag1_ad_crd_occupancy0, }, { .name = "UNC_M2P_AG1_AD_CRD_OCCUPANCY1", .desc = "CMS Agent1 AD Credits Occupancy", .code = 0x0087, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ag1_ad_crd_occupancy1), .umasks = icx_unc_m2p_ag1_ad_crd_occupancy1, }, { .name = "UNC_M2P_AG1_BL_CRD_ACQUIRED0", .desc = 
"CMS Agent1 BL Credits Acquired", .code = 0x008c, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ag1_bl_crd_acquired0), .umasks = icx_unc_m2p_ag1_bl_crd_acquired0, }, { .name = "UNC_M2P_AG1_BL_CRD_ACQUIRED1", .desc = "CMS Agent1 BL Credits Acquired", .code = 0x008d, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ag1_bl_crd_occupancy1), /* shared */ .umasks = icx_unc_m2p_ag1_bl_crd_occupancy1, }, { .name = "UNC_M2P_AG1_BL_CRD_OCCUPANCY0", .desc = "CMS Agent1 BL Credits Occupancy", .code = 0x008e, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_stall0_no_txr_horz_crd_ad_ag0), /* shared */ .umasks = icx_unc_m2p_stall0_no_txr_horz_crd_ad_ag0, }, { .name = "UNC_M2P_AG1_BL_CRD_OCCUPANCY1", .desc = "CMS Agent1 BL Credits Occupancy", .code = 0x008f, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ag1_bl_crd_occupancy1), .umasks = icx_unc_m2p_ag1_bl_crd_occupancy1, }, { .name = "UNC_M2P_CLOCKTICKS", .desc = "Clockticks of the mesh to PCI (M2P)", .code = 0x0001, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2P_CMS_CLOCKTICKS", .desc = "CMS Clockticks", .code = 0x00c0, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2P_DISTRESS_ASSERTED", .desc = "Distress signal asserted", .code = 0x00af, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_distress_asserted), .umasks = icx_unc_m2p_distress_asserted, }, { .name = "UNC_M2P_EGRESS_ORDERING", .desc = "Egress Blocking due to Ordering requirements", .code = 0x00ba, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_egress_ordering), .umasks = icx_unc_m2p_egress_ordering, }, { .name = "UNC_M2P_HORZ_RING_AD_IN_USE", .desc = "Horizontal AD Ring In Use", .code = 0x00b6, .modmsk = 
ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_horz_ring_akc_in_use), /* shared */ .umasks = icx_unc_m2p_horz_ring_akc_in_use, }, { .name = "UNC_M2P_HORZ_RING_AKC_IN_USE", .desc = "Horizontal AKC Ring In Use", .code = 0x00bb, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_horz_ring_akc_in_use), .umasks = icx_unc_m2p_horz_ring_akc_in_use, }, { .name = "UNC_M2P_HORZ_RING_AK_IN_USE", .desc = "Horizontal AK Ring In Use", .code = 0x00b7, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_horz_ring_bl_in_use), /* shared */ .umasks = icx_unc_m2p_horz_ring_bl_in_use, }, { .name = "UNC_M2P_HORZ_RING_BL_IN_USE", .desc = "Horizontal BL Ring in Use", .code = 0x00b8, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_horz_ring_bl_in_use), .umasks = icx_unc_m2p_horz_ring_bl_in_use, }, { .name = "UNC_M2P_HORZ_RING_IV_IN_USE", .desc = "Horizontal IV Ring in Use", .code = 0x00b9, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_horz_ring_iv_in_use), .umasks = icx_unc_m2p_horz_ring_iv_in_use, }, { .name = "UNC_M2P_IIO_CREDITS_ACQUIRED", .desc = "M2PCIe IIO Credit Acquired", .code = 0x0033, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_iio_credits_acquired), .umasks = icx_unc_m2p_iio_credits_acquired, }, { .name = "UNC_M2P_IIO_CREDITS_REJECT", .desc = "M2PCIe IIO Failed to Acquire a Credit", .code = 0x0034, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_iio_credits_reject), .umasks = icx_unc_m2p_iio_credits_reject, }, { .name = "UNC_M2P_IIO_CREDITS_USED", .desc = "M2PCIe IIO Credits in Use", .code = 0x0032, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_iio_credits_used), .umasks 
= icx_unc_m2p_iio_credits_used, }, { .name = "UNC_M2P_LOCAL_DED_P2P_CRD_TAKEN_0", .desc = "Local Dedicated P2P Credit Taken - 0", .code = 0x0046, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_local_shar_p2p_crd_taken_0), /* shared */ .umasks = icx_unc_m2p_local_shar_p2p_crd_taken_0, }, { .name = "UNC_M2P_LOCAL_DED_P2P_CRD_TAKEN_1", .desc = "Local Dedicated P2P Credit Taken - 1", .code = 0x0047, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_local_shar_p2p_crd_taken_1), /* shared */ .umasks = icx_unc_m2p_local_shar_p2p_crd_taken_1, }, { .name = "UNC_M2P_LOCAL_P2P_DED_RETURNED_0", .desc = "Local P2P Dedicated Credits Returned - 0", .code = 0x0019, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_local_p2p_ded_returned_0), .umasks = icx_unc_m2p_local_p2p_ded_returned_0, }, { .name = "UNC_M2P_LOCAL_P2P_DED_RETURNED_1", .desc = "Local P2P Dedicated Credits Returned - 1", .code = 0x001a, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_local_p2p_ded_returned_1), .umasks = icx_unc_m2p_local_p2p_ded_returned_1, }, { .name = "UNC_M2P_LOCAL_P2P_SHAR_RETURNED", .desc = "Local P2P Shared Credits Returned", .code = 0x0017, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_remote_p2p_shar_returned), /* shared */ .umasks = icx_unc_m2p_remote_p2p_shar_returned, }, { .name = "UNC_M2P_LOCAL_SHAR_P2P_CRD_RETURNED", .desc = "Local Shared P2P Credit Returned to credit ring", .code = 0x0044, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_local_shar_p2p_crd_returned), .umasks = icx_unc_m2p_local_shar_p2p_crd_returned, }, { .name = "UNC_M2P_LOCAL_SHAR_P2P_CRD_TAKEN_0", .desc = "Local Shared P2P Credit Taken - 0", .code = 0x0040, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 
0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_local_shar_p2p_crd_taken_0), .umasks = icx_unc_m2p_local_shar_p2p_crd_taken_0, }, { .name = "UNC_M2P_LOCAL_SHAR_P2P_CRD_TAKEN_1", .desc = "Local Shared P2P Credit Taken - 1", .code = 0x0041, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_local_shar_p2p_crd_taken_1), .umasks = icx_unc_m2p_local_shar_p2p_crd_taken_1, }, { .name = "UNC_M2P_LOCAL_SHAR_P2P_CRD_WAIT_0", .desc = "Waiting on Local Shared P2P Credit - 0", .code = 0x004a, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_local_shar_p2p_crd_wait_0), .umasks = icx_unc_m2p_local_shar_p2p_crd_wait_0, }, { .name = "UNC_M2P_LOCAL_SHAR_P2P_CRD_WAIT_1", .desc = "Waiting on Local Shared P2P Credit - 1", .code = 0x004b, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_local_shar_p2p_crd_wait_1), .umasks = icx_unc_m2p_local_shar_p2p_crd_wait_1, }, { .name = "UNC_M2P_MISC_EXTERNAL", .desc = "Miscellaneous Events (mostly from MS2IDI)", .code = 0x00e6, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_misc_external), .umasks = icx_unc_m2p_misc_external, }, { .name = "UNC_M2P_P2P_CRD_OCCUPANCY", .desc = "P2P Credit Occupancy", .code = 0x0014, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_p2p_crd_occupancy), .umasks = icx_unc_m2p_p2p_crd_occupancy, }, { .name = "UNC_M2P_P2P_DED_RECEIVED", .desc = "Dedicated Credits Received", .code = 0x0016, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_p2p_shar_received), /* shared */ .umasks = icx_unc_m2p_p2p_shar_received, }, { .name = "UNC_M2P_P2P_SHAR_RECEIVED", .desc = "Shared Credits Received", .code = 0x0015, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(icx_unc_m2p_p2p_shar_received), .umasks = icx_unc_m2p_p2p_shar_received, }, { .name = "UNC_M2P_REMOTE_DED_P2P_CRD_TAKEN_0", .desc = "Remote Dedicated P2P Credit Taken - 0", .code = 0x0048, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_remote_shar_p2p_crd_taken_0), /* shared */ .umasks = icx_unc_m2p_remote_shar_p2p_crd_taken_0, }, { .name = "UNC_M2P_REMOTE_DED_P2P_CRD_TAKEN_1", .desc = "Remote Dedicated P2P Credit Taken - 1", .code = 0x0049, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_remote_shar_p2p_crd_taken_1), /* shared */ .umasks = icx_unc_m2p_remote_shar_p2p_crd_taken_1, }, { .name = "UNC_M2P_REMOTE_P2P_DED_RETURNED", .desc = "Remote P2P Dedicated Credits Returned", .code = 0x001b, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_remote_p2p_ded_returned), .umasks = icx_unc_m2p_remote_p2p_ded_returned, }, { .name = "UNC_M2P_REMOTE_P2P_SHAR_RETURNED", .desc = "Remote P2P Shared Credits Returned", .code = 0x0018, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_remote_p2p_shar_returned), .umasks = icx_unc_m2p_remote_p2p_shar_returned, }, { .name = "UNC_M2P_REMOTE_SHAR_P2P_CRD_RETURNED", .desc = "Remote Shared P2P Credit Returned to credit ring", .code = 0x0045, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_remote_shar_p2p_crd_returned), .umasks = icx_unc_m2p_remote_shar_p2p_crd_returned, }, { .name = "UNC_M2P_REMOTE_SHAR_P2P_CRD_TAKEN_0", .desc = "Remote Shared P2P Credit Taken - 0", .code = 0x0042, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_remote_shar_p2p_crd_taken_0), .umasks = icx_unc_m2p_remote_shar_p2p_crd_taken_0, }, { .name = "UNC_M2P_REMOTE_SHAR_P2P_CRD_TAKEN_1", .desc = "Remote Shared P2P Credit Taken - 1", .code 
= 0x0043, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_remote_shar_p2p_crd_taken_1), .umasks = icx_unc_m2p_remote_shar_p2p_crd_taken_1, }, { .name = "UNC_M2P_REMOTE_SHAR_P2P_CRD_WAIT_0", .desc = "Waiting on Remote Shared P2P Credit - 0", .code = 0x004c, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_remote_shar_p2p_crd_wait_0), .umasks = icx_unc_m2p_remote_shar_p2p_crd_wait_0, }, { .name = "UNC_M2P_REMOTE_SHAR_P2P_CRD_WAIT_1", .desc = "Waiting on Remote Shared P2P Credit - 1", .code = 0x004d, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_remote_shar_p2p_crd_wait_1), .umasks = icx_unc_m2p_remote_shar_p2p_crd_wait_1, }, { .name = "UNC_M2P_RING_BOUNCES_HORZ", .desc = "Messages that bounced on the Horizontal Ring.", .code = 0x00ac, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ring_bounces_horz), .umasks = icx_unc_m2p_ring_bounces_horz, }, { .name = "UNC_M2P_RING_BOUNCES_VERT", .desc = "Messages that bounced on the Vertical Ring.", .code = 0x00aa, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ring_sink_starved_vert), /* shared */ .umasks = icx_unc_m2p_ring_sink_starved_vert, }, { .name = "UNC_M2P_RING_SINK_STARVED_HORZ", .desc = "Sink Starvation on Horizontal Ring", .code = 0x00ad, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ring_sink_starved_horz), .umasks = icx_unc_m2p_ring_sink_starved_horz, }, { .name = "UNC_M2P_RING_SINK_STARVED_VERT", .desc = "Sink Starvation on Vertical Ring", .code = 0x00ab, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_ring_sink_starved_vert), .umasks = icx_unc_m2p_ring_sink_starved_vert, }, { .name = "UNC_M2P_RING_SRC_THRTL", .desc = "Source Throttle 
(experimental)", .code = 0x00ae, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2P_RxC_CYCLES_NE", .desc = "Ingress (from CMS) Queue Cycles Not Empty", .code = 0x0010, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_rxc_inserts), /* shared */ .umasks = icx_unc_m2p_rxc_inserts, }, { .name = "UNC_M2P_RxC_INSERTS", .desc = "Ingress (from CMS) Queue Inserts", .code = 0x0011, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_rxc_inserts), .umasks = icx_unc_m2p_rxc_inserts, }, { .name = "UNC_M2P_RxR_BUSY_STARVED", .desc = "Transgress Injection Starvation", .code = 0x00e5, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_horz_ads_used), /* shared */ .umasks = icx_unc_m2p_txr_horz_ads_used, }, { .name = "UNC_M2P_RxR_BYPASS", .desc = "Transgress Ingress Bypass", .code = 0x00e2, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_rxr_inserts), /* shared */ .umasks = icx_unc_m2p_rxr_inserts, }, { .name = "UNC_M2P_RxR_CRD_STARVED", .desc = "Transgress Injection Starvation", .code = 0x00e3, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_rxr_crd_starved), .umasks = icx_unc_m2p_rxr_crd_starved, }, { .name = "UNC_M2P_RxR_CRD_STARVED_1", .desc = "Transgress Injection Starvation (experimental)", .code = 0x00e4, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M2P_RxR_INSERTS", .desc = "Transgress Ingress Allocations", .code = 0x00e1, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_rxr_inserts), .umasks = icx_unc_m2p_rxr_inserts, }, { .name = "UNC_M2P_RxR_OCCUPANCY", .desc = "Transgress Ingress Occupancy", .code = 0x00e0, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(icx_unc_m2p_rxr_occupancy), .umasks = icx_unc_m2p_rxr_occupancy, }, { .name = "UNC_M2P_STALL0_NO_TxR_HORZ_CRD_AD_AG0", .desc = "Stall on No AD Agent0 Transgress Credits", .code = 0x00d0, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_stall0_no_txr_horz_crd_ad_ag0), .umasks = icx_unc_m2p_stall0_no_txr_horz_crd_ad_ag0, }, { .name = "UNC_M2P_STALL0_NO_TxR_HORZ_CRD_AD_AG1", .desc = "Stall on No AD Agent1 Transgress Credits", .code = 0x00d2, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_stall0_no_txr_horz_crd_bl_ag0), /* shared */ .umasks = icx_unc_m2p_stall0_no_txr_horz_crd_bl_ag0, }, { .name = "UNC_M2P_STALL0_NO_TxR_HORZ_CRD_BL_AG0", .desc = "Stall on No BL Agent0 Transgress Credits", .code = 0x00d4, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_stall0_no_txr_horz_crd_bl_ag0), .umasks = icx_unc_m2p_stall0_no_txr_horz_crd_bl_ag0, }, { .name = "UNC_M2P_STALL0_NO_TxR_HORZ_CRD_BL_AG1", .desc = "Stall on No BL Agent1 Transgress Credits", .code = 0x00d6, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_stall0_no_txr_horz_crd_bl_ag1), .umasks = icx_unc_m2p_stall0_no_txr_horz_crd_bl_ag1, }, { .name = "UNC_M2P_STALL1_NO_TxR_HORZ_CRD_AD_AG0", .desc = "Stall on No AD Agent0 Transgress Credits", .code = 0x00d1, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_stall1_no_txr_horz_crd_ad_ag1_1), /* shared */ .umasks = icx_unc_m2p_stall1_no_txr_horz_crd_ad_ag1_1, }, { .name = "UNC_M2P_STALL1_NO_TxR_HORZ_CRD_AD_AG1_1", .desc = "Stall on No AD Agent1 Transgress Credits", .code = 0x00d3, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_stall1_no_txr_horz_crd_ad_ag1_1), .umasks = icx_unc_m2p_stall1_no_txr_horz_crd_ad_ag1_1, }, { .name = 
"UNC_M2P_STALL1_NO_TxR_HORZ_CRD_BL_AG0_1", .desc = "Stall on No BL Agent0 Transgress Credits", .code = 0x00d5, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_stall1_no_txr_horz_crd_bl_ag1_1), /* shared */ .umasks = icx_unc_m2p_stall1_no_txr_horz_crd_bl_ag1_1, }, { .name = "UNC_M2P_STALL1_NO_TxR_HORZ_CRD_BL_AG1_1", .desc = "Stall on No BL Agent1 Transgress Credits", .code = 0x00d7, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_stall1_no_txr_horz_crd_bl_ag1_1), .umasks = icx_unc_m2p_stall1_no_txr_horz_crd_bl_ag1_1, }, { .name = "UNC_M2P_TxC_CREDITS", .desc = "UNC_M2P_TxC_CREDITS.PRQ", .code = 0x002d, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txc_credits), .umasks = icx_unc_m2p_txc_credits, }, { .name = "UNC_M2P_TxC_CYCLES_FULL", .desc = "Egress (to CMS) Cycles Full", .code = 0x0025, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txc_cycles_full), .umasks = icx_unc_m2p_txc_cycles_full, }, { .name = "UNC_M2P_TxC_CYCLES_NE", .desc = "Egress (to CMS) Cycles Not Empty", .code = 0x0023, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txc_cycles_ne), .umasks = icx_unc_m2p_txc_cycles_ne, }, { .name = "UNC_M2P_TxC_INSERTS", .desc = "Egress (to CMS) Ingress", .code = 0x0024, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txc_inserts), .umasks = icx_unc_m2p_txc_inserts, }, { .name = "UNC_M2P_TxR_HORZ_ADS_USED", .desc = "CMS Horizontal ADS Used", .code = 0x00a6, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_horz_ads_used), .umasks = icx_unc_m2p_txr_horz_ads_used, }, { .name = "UNC_M2P_TxR_HORZ_BYPASS", .desc = "CMS Horizontal Bypass Used", .code = 0x00a7, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 
0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_horz_cycles_full), /* shared */ .umasks = icx_unc_m2p_txr_horz_cycles_full, }, { .name = "UNC_M2P_TxR_HORZ_CYCLES_FULL", .desc = "Cycles CMS Horizontal Egress Queue is Full", .code = 0x00a2, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_horz_cycles_full), .umasks = icx_unc_m2p_txr_horz_cycles_full, }, { .name = "UNC_M2P_TxR_HORZ_CYCLES_NE", .desc = "Cycles CMS Horizontal Egress Queue is Not Empty", .code = 0x00a3, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_horz_inserts), /* shared */ .umasks = icx_unc_m2p_txr_horz_inserts, }, { .name = "UNC_M2P_TxR_HORZ_INSERTS", .desc = "CMS Horizontal Egress Inserts", .code = 0x00a1, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_horz_inserts), .umasks = icx_unc_m2p_txr_horz_inserts, }, { .name = "UNC_M2P_TxR_HORZ_NACK", .desc = "CMS Horizontal Egress NACKs", .code = 0x00a4, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_horz_occupancy), /* shared */ .umasks = icx_unc_m2p_txr_horz_occupancy, }, { .name = "UNC_M2P_TxR_HORZ_OCCUPANCY", .desc = "CMS Horizontal Egress Occupancy", .code = 0x00a0, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_horz_occupancy), .umasks = icx_unc_m2p_txr_horz_occupancy, }, { .name = "UNC_M2P_TxR_HORZ_STARVED", .desc = "CMS Horizontal Egress Injection Starvation", .code = 0x00a5, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_horz_starved), .umasks = icx_unc_m2p_txr_horz_starved, }, { .name = "UNC_M2P_TxR_VERT_ADS_USED", .desc = "CMS Vertical ADS Used", .code = 0x009c, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_vert_ads_used), .umasks = 
icx_unc_m2p_txr_vert_ads_used, }, { .name = "UNC_M2P_TxR_VERT_BYPASS", .desc = "CMS Vertical Bypass Used", .code = 0x009d, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_vert_bypass), .umasks = icx_unc_m2p_txr_vert_bypass, }, { .name = "UNC_M2P_TxR_VERT_BYPASS_1", .desc = "CMS Vertical Bypass Used", .code = 0x009e, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_vert_cycles_full1), /* shared */ .umasks = icx_unc_m2p_txr_vert_cycles_full1, }, { .name = "UNC_M2P_TxR_VERT_CYCLES_FULL0", .desc = "Cycles CMS Vertical Egress Queue Is Full", .code = 0x0094, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_vert_cycles_ne0), /* shared */ .umasks = icx_unc_m2p_txr_vert_cycles_ne0, }, { .name = "UNC_M2P_TxR_VERT_CYCLES_FULL1", .desc = "Cycles CMS Vertical Egress Queue Is Full", .code = 0x0095, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_vert_cycles_full1), .umasks = icx_unc_m2p_txr_vert_cycles_full1, }, { .name = "UNC_M2P_TxR_VERT_CYCLES_NE0", .desc = "Cycles CMS Vertical Egress Queue Is Not Empty", .code = 0x0096, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_vert_cycles_ne0), .umasks = icx_unc_m2p_txr_vert_cycles_ne0, }, { .name = "UNC_M2P_TxR_VERT_CYCLES_NE1", .desc = "Cycles CMS Vertical Egress Queue Is Not Empty", .code = 0x0097, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_vert_inserts1), /* shared */ .umasks = icx_unc_m2p_txr_vert_inserts1, }, { .name = "UNC_M2P_TxR_VERT_INSERTS0", .desc = "CMS Vert Egress Allocations", .code = 0x0092, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_vert_occupancy0), /* shared */ .umasks = icx_unc_m2p_txr_vert_occupancy0, }, { .name = 
"UNC_M2P_TxR_VERT_INSERTS1", .desc = "CMS Vert Egress Allocations", .code = 0x0093, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_vert_inserts1), .umasks = icx_unc_m2p_txr_vert_inserts1, }, { .name = "UNC_M2P_TxR_VERT_NACK0", .desc = "CMS Vertical Egress NACKs", .code = 0x0098, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_vert_starved0), /* shared */ .umasks = icx_unc_m2p_txr_vert_starved0, }, { .name = "UNC_M2P_TxR_VERT_NACK1", .desc = "CMS Vertical Egress NACKs", .code = 0x0099, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_vert_occupancy1), /* shared */ .umasks = icx_unc_m2p_txr_vert_occupancy1, }, { .name = "UNC_M2P_TxR_VERT_OCCUPANCY0", .desc = "CMS Vert Egress Occupancy", .code = 0x0090, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_vert_occupancy0), .umasks = icx_unc_m2p_txr_vert_occupancy0, }, { .name = "UNC_M2P_TxR_VERT_OCCUPANCY1", .desc = "CMS Vert Egress Occupancy", .code = 0x0091, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_vert_occupancy1), .umasks = icx_unc_m2p_txr_vert_occupancy1, }, { .name = "UNC_M2P_TxR_VERT_STARVED0", .desc = "CMS Vertical Egress Injection Starvation", .code = 0x009a, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_vert_starved0), .umasks = icx_unc_m2p_txr_vert_starved0, }, { .name = "UNC_M2P_TxR_VERT_STARVED1", .desc = "CMS Vertical Egress Injection Starvation", .code = 0x009b, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_txr_vert_starved1), .umasks = icx_unc_m2p_txr_vert_starved1, }, { .name = "UNC_M2P_VERT_RING_AD_IN_USE", .desc = "Vertical AD Ring In Use", .code = 0x00b0, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, 
.ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_vert_ring_akc_in_use), /* shared */ .umasks = icx_unc_m2p_vert_ring_akc_in_use, }, { .name = "UNC_M2P_VERT_RING_AKC_IN_USE", .desc = "Vertical AKC Ring In Use", .code = 0x00b4, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_vert_ring_akc_in_use), .umasks = icx_unc_m2p_vert_ring_akc_in_use, }, { .name = "UNC_M2P_VERT_RING_AK_IN_USE", .desc = "Vertical AK Ring In Use", .code = 0x00b1, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_vert_ring_bl_in_use), /* shared */ .umasks = icx_unc_m2p_vert_ring_bl_in_use, }, { .name = "UNC_M2P_VERT_RING_BL_IN_USE", .desc = "Vertical BL Ring in Use", .code = 0x00b2, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_vert_ring_bl_in_use), .umasks = icx_unc_m2p_vert_ring_bl_in_use, }, { .name = "UNC_M2P_VERT_RING_IV_IN_USE", .desc = "Vertical IV Ring in Use", .code = 0x00b3, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_vert_ring_iv_in_use), .umasks = icx_unc_m2p_vert_ring_iv_in_use, }, { .name = "UNC_M2P_VERT_RING_TGC_IN_USE", .desc = "Vertical TGC Ring In Use", .code = 0x00b5, .modmsk = ICX_UNC_M2PCIE_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m2p_vert_ring_tgc_in_use), .umasks = icx_unc_m2p_vert_ring_tgc_in_use, }, }; /* 105 events available */
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_icx_unc_m3upi_events.h
/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 
* of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: icx_unc_m3upi (IcelakeX Uncore M3UPI) * Based on Intel JSON event table version : 1.21 * Based on Intel JSON event table published : 06/06/2023 */ static const intel_x86_umask_t icx_unc_m3upi_ag0_ad_crd_occupancy0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const 
intel_x86_umask_t icx_unc_m3upi_ag0_ad_crd_occupancy1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_ag0_bl_crd_occupancy0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_ag0_bl_crd_occupancy1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_ag1_ad_crd_occupancy0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { 
.uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_ag1_ad_crd_occupancy1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_ag1_bl_crd_acquired0[]={ { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = 
"For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_ag1_bl_crd_occupancy1[]={ { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_cha_ad_credits_empty[]={ { .uname = "REQ", .udesc = "Requests (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP", .udesc = "Snoops (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VNA", .udesc = "VNA Messages (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WB", .udesc = "Writebacks (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_distress_asserted[]={ { .uname = "DPT_LOCAL", .udesc = "DPT Local (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DPT_NONLOCAL", .udesc = "DPT Remote (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DPT_STALL_IV", .udesc = "DPT Stalled - IV (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DPT_STALL_NOCRD", .udesc = "DPT Stalled - No Credit (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HORZ", .udesc = "Horizontal (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_LOCAL", .udesc = "PMM Local (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_NONLOCAL", .udesc = "PMM Remote (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO,
}, { .uname = "VERT", .udesc = "Vertical (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_egress_ordering[]={ { .uname = "IV_SNOOPGO_DN", .udesc = "Down (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_SNOOPGO_UP", .udesc = "Up (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_horz_ring_akc_in_use[]={ { .uname = "LEFT_EVEN", .udesc = "Left and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LEFT_ODD", .udesc = "Left and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_EVEN", .udesc = "Right and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_ODD", .udesc = "Right and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_horz_ring_bl_in_use[]={ { .uname = "LEFT_EVEN", .udesc = "Left and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LEFT_ODD", .udesc = "Left and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_EVEN", .udesc = "Right and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_ODD", .udesc = "Right and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_horz_ring_iv_in_use[]={ { .uname = "LEFT", .udesc = "Left (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT", .udesc = "Right (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_m2_bl_credits_empty[]={ { .uname = "IIO1_NCB", .udesc = "IIO0 and IIO1 share the same ring destination. 
(1 VN0 credit only) (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IIO2_NCB", .udesc = "IIO2 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IIO3_NCB", .udesc = "IIO3 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IIO4_NCB", .udesc = "IIO4 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IIO5_NCB", .udesc = "IIO5 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS", .udesc = "All IIO targets for NCS are in single mask. ORs them together (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS_SEL", .udesc = "Selected M2p BL NCS credits (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UBOX_NCB", .udesc = "UBOX (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_misc_external[]={ { .uname = "MBE_INST0", .udesc = "Number of cycles MBE is high for MS2IDI0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MBE_INST1", .udesc = "Number of cycles MBE is high for MS2IDI1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_multi_slot_rcvd[]={ { .uname = "AD_SLOT0", .udesc = "AD - Slot 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_SLOT1", .udesc = "AD - Slot 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_SLOT2", .udesc = "AD - Slot 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_SLOT0", .udesc = "AK - Slot 0 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_SLOT2", .udesc = "AK - Slot 2 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_SLOT0", .udesc = "BL - Slot 0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO,
}, }; static const intel_x86_umask_t icx_unc_m3upi_ring_bounces_horz[]={ { .uname = "AD", .udesc = "AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "BL (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_ring_sink_starved_horz[]={ { .uname = "AD", .udesc = "AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "Acknowledgements to Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "BL (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_ring_sink_starved_vert[]={ { .uname = "AD", .udesc = "AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "Acknowledgements to core (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AKC", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "Data Responses to core (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = "Snoops of processor's cache. 
(experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_arb_lost_vn1[]={ { .uname = "AD_REQ", .udesc = "REQ on AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP", .udesc = "RSP on AD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_SNP", .udesc = "SNP on AD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB", .udesc = "NCB on BL (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS", .udesc = "NCS on BL (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP", .udesc = "RSP on BL (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB", .udesc = "WB on BL (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_arb_misc[]={ { .uname = "ADBL_PARALLEL_WIN_VN0", .udesc = "AD, BL Parallel Win VN0 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ADBL_PARALLEL_WIN_VN1", .udesc = "AD, BL Parallel Win VN1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_PARALLEL_WIN", .udesc = "Max Parallel Win (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_PROG_AD_VN0", .udesc = "No Progress on Pending AD VN0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_PROG_AD_VN1", .udesc = "No Progress on Pending AD VN1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_PROG_BL_VN0", .udesc = "No Progress on Pending BL VN0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_PROG_BL_VN1", .udesc = "No Progress on Pending BL VN1 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN01_PARALLEL_WIN", .udesc = "VN0, VN1 Parallel Win (experimental)", 
.ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_arb_nocrd_vn1[]={ { .uname = "AD_REQ", .udesc = "REQ on AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP", .udesc = "RSP on AD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_SNP", .udesc = "SNP on AD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB", .udesc = "NCB on BL (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS", .udesc = "NCS on BL (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP", .udesc = "RSP on BL (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB", .udesc = "WB on BL (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_arb_noreq_vn1[]={ { .uname = "AD_REQ", .udesc = "REQ on AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP", .udesc = "RSP on AD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_SNP", .udesc = "SNP on AD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB", .udesc = "NCB on BL (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS", .udesc = "NCS on BL (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP", .udesc = "RSP on BL (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB", .udesc = "WB on BL (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_bypassed[]={ { .uname = "AD_S0_BL_ARB", .udesc = "AD to Slot 0 on BL Arb (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_S0_IDLE", .udesc = "AD to Slot 0 on Idle (experimental)", .ucode = 0x0100ull, 
.uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_S1_BL_SLOT", .udesc = "AD + BL to Slot 1 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_S2_BL_SLOT", .udesc = "AD + BL to Slot 2 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_crd_misc[]={ { .uname = "ANY_BGF_FIFO", .udesc = "Any In BGF FIFO (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY_BGF_PATH", .udesc = "Any in BGF Path (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LT1_FOR_D2K", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LT2_FOR_D2K", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN0_NO_D2K_FOR_ARB", .udesc = "No D2K For Arb (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN1_NO_D2K_FOR_ARB", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_crd_occ[]={ { .uname = "CONSUMED", .udesc = "Credits Consumed (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "D2K_CRD", .udesc = "D2K Credits (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLITS_IN_FIFO", .udesc = "Packets in BGF FIFO (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLITS_IN_PATH", .udesc = "Packets in BGF Path (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "P1P_FIFO", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "P1P_TOTAL", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TxQ_CRD", .udesc = "Transmit Credits (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VNA_IN_USE", .udesc = "VNA In Use (experimental)", .ucode = 0x0100ull, 
.uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_cycles_ne_vn1[]={ { .uname = "AD_REQ", .udesc = "REQ on AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP", .udesc = "RSP on AD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_SNP", .udesc = "SNP on AD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB", .udesc = "NCB on BL (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS", .udesc = "NCS on BL (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP", .udesc = "RSP on BL (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB", .udesc = "WB on BL (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_data_flits_not_sent[]={ { .uname = "ALL", .udesc = "All (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "NO_BGF", .udesc = "No BGF Credits (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_TXQ", .udesc = "No TxQ Credits (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TSV_HI", .udesc = "TSV High (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VALID_FOR_FLIT", .udesc = "Cycle valid for Flit (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_flits_gen_bl[]={ { .uname = "P0_WAIT", .udesc = "Wait on Pump 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "P1P_AT_LIMIT", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "P1P_BUSY", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "P1P_FIFO_FULL", .udesc = "TBD (experimental)", .ucode = 0x4000ull, 
.uflags = INTEL_X86_NCOMBO, }, { .uname = "P1P_HOLD_P0", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "P1P_TO_LIMBO", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "P1_WAIT", .udesc = "Wait on Pump 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_flits_misc[]={ { .uname = "S2REQ_IN_HOLDOFF", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S2REQ_IN_SERVICE", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S2REQ_RECEIVED", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S2REQ_WITHDRAWN", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_flits_slot_bl[]={ { .uname = "ALL", .udesc = "All (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NEED_DATA", .udesc = "Needs Data Flit (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "P0_WAIT", .udesc = "Wait on Pump 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "P1_NOT_REQ", .udesc = "Don't Need Pump 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "P1_NOT_REQ_BUT_BUBBLE", .udesc = "Don't Need Pump 1 - Bubble (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "P1_NOT_REQ_NOT_AVAIL", .udesc = "Don't Need Pump 1 - Not Avail (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "P1_WAIT", .udesc = "Wait on Pump 1 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_flit_gen_hdr1[]={ { .uname = "ACCUM", .udesc = "Accumulate (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"ACCUM_READ", .udesc = "Accumulate Ready (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ACCUM_WASTED", .udesc = "Accumulate Wasted (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AHEAD_BLOCKED", .udesc = "Run-Ahead - Blocked (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AHEAD_MSG1_AFTER", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AHEAD_MSG1_DURING", .udesc = "Run-Ahead - Message (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AHEAD_MSG2_AFTER", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AHEAD_MSG2_SENT", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_flit_gen_hdr2[]={ { .uname = "PAR", .udesc = "Parallel Ok (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PAR_FLIT", .udesc = "Parallel Flit Finished (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PAR_MSG", .udesc = "Parallel Message (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RMSTALL", .udesc = "Rate-matching Stall (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RMSTALL_NOMSG", .udesc = "Rate-matching Stall - No Message (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_hdr_flits_sent[]={ { .uname = "1_MSG", .udesc = "One Message (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "1_MSG_VNX", .udesc = "One Message in non-VNA (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "2_MSGS", .udesc = "Two Messages (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "3_MSGS", .udesc = "Three Messages (experimental)", .ucode = 
0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOTS_1", .udesc = "One Slot Taken (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOTS_2", .udesc = "Two Slots Taken (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOTS_3", .udesc = "All Slots Taken (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_hdr_flit_not_sent[]={ { .uname = "ALL", .udesc = "All (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_BGF_CRD", .udesc = "No BGF Credits (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_BGF_NO_MSG", .udesc = "No BGF Credits + No Extra Message Slotted (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_TXQ_CRD", .udesc = "No TxQ Credits (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_TXQ_NO_MSG", .udesc = "No TxQ Credits + No Extra Message Slotted (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TSV_HI", .udesc = "TSV High (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VALID_FOR_FLIT", .udesc = "Cycle valid for Flit (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_held[]={ { .uname = "CANT_SLOT_AD", .udesc = "Can't Slot AD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CANT_SLOT_BL", .udesc = "Can't Slot BL (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PARALLEL_ATTEMPT", .udesc = "Parallel Attempt (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PARALLEL_SUCCESS", .udesc = "Parallel Success (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN0", .udesc = "VN0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN1", 
.udesc = "VN1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_inserts_vn1[]={ { .uname = "AD_REQ", .udesc = "REQ on AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP", .udesc = "RSP on AD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_SNP", .udesc = "SNP on AD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB", .udesc = "NCB on BL (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS", .udesc = "NCS on BL (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP", .udesc = "RSP on BL (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB", .udesc = "WB on BL (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_occupancy_vn1[]={ { .uname = "AD_REQ", .udesc = "REQ on AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP", .udesc = "RSP on AD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_SNP", .udesc = "SNP on AD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB", .udesc = "NCB on BL (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS", .udesc = "NCS on BL (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP", .udesc = "RSP on BL (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB", .udesc = "WB on BL (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_packing_miss_vn1[]={ { .uname = "AD_REQ", .udesc = "REQ on AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP", .udesc = "RSP on AD (experimental)", .ucode = 
0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_SNP", .udesc = "SNP on AD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB", .udesc = "NCB on BL (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS", .udesc = "NCS on BL (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP", .udesc = "RSP on BL (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB", .udesc = "WB on BL (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_vna_crd[]={ { .uname = "ANY_IN_USE", .udesc = "Any In Use (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORRECTED", .udesc = "Corrected (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LT1", .udesc = "Level < 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LT10", .udesc = "Level < 10 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LT4", .udesc = "Level < 4 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LT5", .udesc = "Level < 5 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_m3upi_rxc_vna_crd_misc[]={ { .uname = "REQ_ADBL_ALLOC_L5", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REQ_VN01_ALLOC_LT10", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN0_JUST_AD", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN0_JUST_BL", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN0_ONLY", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN1_JUST_AD", .udesc = "TBD (experimental)", .ucode = 0x4000ull, 
.uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_JUST_BL", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_ONLY", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_rxr_crd_starved[]={
  { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "IFV", .udesc = "IFV - Credited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_rxr_inserts[]={
  { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_rxr_occupancy[]={
  { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_stall0_no_txr_horz_crd_ad_ag0[]={
  { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_stall0_no_txr_horz_crd_bl_ag0[]={
  { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_stall0_no_txr_horz_crd_bl_ag1[]={
  { .uname = "TGR0", .udesc = "For Transgress 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR1", .udesc = "For Transgress 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR2", .udesc = "For Transgress 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR3", .udesc = "For Transgress 3 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR4", .udesc = "For Transgress 4 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR5", .udesc = "For Transgress 5 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR6", .udesc = "For Transgress 6 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR7", .udesc = "For Transgress 7 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_stall1_no_txr_horz_crd_ad_ag1_1[]={
  { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_stall1_no_txr_horz_crd_bl_ag1_1[]={
  { .uname = "TGR10", .udesc = "For Transgress 10 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR8", .udesc = "For Transgress 8 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGR9", .udesc = "For Transgress 9 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txc_ad_flq_bypass[]={
  { .uname = "AD_SLOT0", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_SLOT1", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_SLOT2", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_EARLY_RSP", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txc_ad_flq_cycles_ne[]={
  { .uname = "VN0_REQ", .udesc = "VN0 REQ Messages (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_RSP", .udesc = "VN0 RSP Messages (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_SNP", .udesc = "VN0 SNP Messages (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_WB", .udesc = "VN0 WB Messages (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_REQ", .udesc = "VN1 REQ Messages (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_RSP", .udesc = "VN1 RSP Messages (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_SNP", .udesc = "VN1 SNP Messages (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_WB", .udesc = "VN1 WB Messages (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txc_ad_flq_inserts[]={
  { .uname = "VN0_REQ", .udesc = "VN0 REQ Messages (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_RSP", .udesc = "VN0 RSP Messages (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_SNP", .udesc = "VN0 SNP Messages (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_WB", .udesc = "VN0 WB Messages (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_REQ", .udesc = "VN1 REQ Messages (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_RSP", .udesc = "VN1 RSP Messages (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_SNP", .udesc = "VN1 SNP Messages (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txc_ad_flq_occupancy[]={
  { .uname = "VN0_REQ", .udesc = "VN0 REQ Messages (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_RSP", .udesc = "VN0 RSP Messages (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_SNP", .udesc = "VN0 SNP Messages (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_WB", .udesc = "VN0 WB Messages (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_REQ", .udesc = "VN1 REQ Messages (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_RSP", .udesc = "VN1 RSP Messages (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_SNP", .udesc = "VN1 SNP Messages (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txc_bl_arb_fail[]={
  { .uname = "VN0_NCB", .udesc = "VN0 NCB Messages (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_NCS", .udesc = "VN0 NCS Messages (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_RSP", .udesc = "VN0 RSP Messages (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_WB", .udesc = "VN0 WB Messages (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_NCB", .udesc = "VN1 NCB Messages (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_NCS", .udesc = "VN1 NCS Messages (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_RSP", .udesc = "VN1 RSP Messages (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_WB", .udesc = "VN1 WB Messages (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txc_bl_flq_cycles_ne[]={
  { .uname = "VN0_REQ", .udesc = "VN0 REQ Messages (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_RSP", .udesc = "VN0 RSP Messages (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_SNP", .udesc = "VN0 SNP Messages (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_WB", .udesc = "VN0 WB Messages (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_REQ", .udesc = "VN1 REQ Messages (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_RSP", .udesc = "VN1 RSP Messages (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_SNP", .udesc = "VN1 SNP Messages (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_WB", .udesc = "VN1 WB Messages (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txc_bl_flq_inserts[]={
  { .uname = "VN0_NCB", .udesc = "VN0 RSP Messages (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_NCS", .udesc = "VN0 WB Messages (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_RSP", .udesc = "VN0 NCS Messages (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_WB", .udesc = "VN0 NCB Messages (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_NCB", .udesc = "VN1 RSP Messages (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_NCS", .udesc = "VN1 WB Messages (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_RSP", .udesc = "VN1_NCB Messages (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_WB", .udesc = "VN1_NCS Messages (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txc_bl_flq_occupancy[]={
  { .uname = "VN0_NCB", .udesc = "VN0 NCB Messages (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_NCS", .udesc = "VN0 NCS Messages (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_RSP", .udesc = "VN0 RSP Messages (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_WB", .udesc = "VN0 WB Messages (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_NCB", .udesc = "VN1 NCB Messages (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_NCS", .udesc = "VN1 NCS Messages (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_RSP", .udesc = "VN1 RSP Messages (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_WB", .udesc = "VN1 WB Messages (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txc_bl_wb_flq_occupancy[]={
  { .uname = "VN0_LOCAL", .udesc = "VN0 RSP Messages (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_THROUGH", .udesc = "VN0 WB Messages (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_WRPULL", .udesc = "VN0 NCB Messages (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_LOCAL", .udesc = "VN1 RSP Messages (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_THROUGH", .udesc = "VN1 WB Messages (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_WRPULL", .udesc = "VN1_NCS Messages (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txr_horz_ads_used[]={
  { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txr_horz_cycles_full[]={
  { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txr_horz_inserts[]={
  { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txr_horz_occupancy[]={
  { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_CRD", .udesc = "AD - Credited (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_CRD", .udesc = "BL - Credited (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txr_horz_starved[]={
  { .uname = "AD_ALL", .udesc = "AD - All (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_UNCRD", .udesc = "AD - Uncredited (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AK", .udesc = "AK (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AKC_UNCRD", .udesc = "AKC - Uncredited (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_ALL", .udesc = "BL - All (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_UNCRD", .udesc = "BL - Uncredited (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "IV", .udesc = "IV (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txr_vert_ads_used[]={
  { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txr_vert_bypass[]={
  { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AK_AG0", .udesc = "AK - Agent 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AK_AG1", .udesc = "AK - Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "IV_AG1", .udesc = "IV - Agent 1 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txr_vert_cycles_full1[]={
  { .uname = "AKC_AG0", .udesc = "AKC - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AKC_AG1", .udesc = "AKC - Agent 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txr_vert_cycles_ne0[]={
  { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AK_AG0", .udesc = "AK - Agent 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AK_AG1", .udesc = "AK - Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "IV_AG0", .udesc = "IV - Agent 0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txr_vert_inserts1[]={
  { .uname = "AKC_AG0", .udesc = "AKC - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AKC_AG1", .udesc = "AKC - Agent 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txr_vert_occupancy0[]={
  { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AK_AG0", .udesc = "AK - Agent 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AK_AG1", .udesc = "AK - Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "IV_AG0", .udesc = "IV - Agent 0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txr_vert_occupancy1[]={
  { .uname = "AKC_AG0", .udesc = "AKC - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AKC_AG1", .udesc = "AKC - Agent 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txr_vert_starved0[]={
  { .uname = "AD_AG0", .udesc = "AD - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AD_AG1", .udesc = "AD - Agent 1 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AK_AG0", .udesc = "AK - Agent 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AK_AG1", .udesc = "AK - Agent 1 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_AG0", .udesc = "BL - Agent 0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BL_AG1", .udesc = "BL - Agent 1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "IV_AG0", .udesc = "IV - Agent 0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_txr_vert_starved1[]={
  { .uname = "AKC_AG0", .udesc = "AKC - Agent 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "AKC_AG1", .udesc = "AKC - Agent 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "TGC", .udesc = "TGC (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_upi_peer_ad_credits_empty[]={
  { .uname = "VN0_REQ", .udesc = "VN0 REQ Messages (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_RSP", .udesc = "VN0 RSP Messages (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_SNP", .udesc = "VN0 SNP Messages (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_REQ", .udesc = "VN1 REQ Messages (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_RSP", .udesc = "VN1 RSP Messages (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_SNP", .udesc = "VN1 SNP Messages (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VNA", .udesc = "VNA (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_upi_peer_bl_credits_empty[]={
  { .uname = "VN0_NCS_NCB", .udesc = "VN0 NCS/NCB Messages (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_RSP", .udesc = "VN0 RSP Messages (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN0_WB", .udesc = "VN0 WB Messages (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_NCS_NCB", .udesc = "VN1 NCS/NCB Messages (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_RSP", .udesc = "VN1 RSP Messages (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VN1_WB", .udesc = "VN1 WB Messages (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "VNA", .udesc = "VNA (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_vert_ring_akc_in_use[]={
  { .uname = "DN_EVEN", .udesc = "Down and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "DN_ODD", .udesc = "Down and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "UP_EVEN", .udesc = "Up and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "UP_ODD", .udesc = "Up and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_vert_ring_bl_in_use[]={
  { .uname = "DN_EVEN", .udesc = "Down and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "DN_ODD", .udesc = "Down and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "UP_EVEN", .udesc = "Up and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "UP_ODD", .udesc = "Up and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_vert_ring_iv_in_use[]={
  { .uname = "DN", .udesc = "Down (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "UP", .udesc = "Up (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_vert_ring_tgc_in_use[]={
  { .uname = "DN_EVEN", .udesc = "Down and Even (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "DN_ODD", .udesc = "Down and Odd (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "UP_EVEN", .udesc = "Up and Even (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "UP_ODD", .udesc = "Up and Odd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_vn0_no_credits[]={
  { .uname = "NCB", .udesc = "WB on BL (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "NCS", .udesc = "NCB on BL (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "REQ", .udesc = "REQ on AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "RSP", .udesc = "RSP on AD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "SNP", .udesc = "SNP on AD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "WB", .udesc = "RSP on BL (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_vn1_no_credits[]={
  { .uname = "NCB", .udesc = "WB on BL (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "NCS", .udesc = "NCB on BL (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "REQ", .udesc = "REQ on AD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "RSP", .udesc = "RSP on AD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "SNP", .udesc = "SNP on AD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "WB", .udesc = "RSP on BL (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_wb_occ_compare[]={
  { .uname = "BOTHNONZERO_RT_EQ_LOCALDEST_VN0", .udesc = "TBD (experimental)", .ucode = 0x8200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BOTHNONZERO_RT_EQ_LOCALDEST_VN1", .udesc = "TBD (experimental)", .ucode = 0xa000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BOTHNONZERO_RT_GT_LOCALDEST_VN0", .udesc = "TBD (experimental)", .ucode = 0x8100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BOTHNONZERO_RT_GT_LOCALDEST_VN1", .udesc = "TBD (experimental)", .ucode = 0x9000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BOTHNONZERO_RT_LT_LOCALDEST_VN0", .udesc = "TBD (experimental)", .ucode = 0x8400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BOTHNONZERO_RT_LT_LOCALDEST_VN1", .udesc = "TBD (experimental)", .ucode = 0xc000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "RT_EQ_LOCALDEST_VN0", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "RT_EQ_LOCALDEST_VN1", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "RT_GT_LOCALDEST_VN0", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "RT_GT_LOCALDEST_VN1", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "RT_LT_LOCALDEST_VN0", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "RT_LT_LOCALDEST_VN1", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_wb_pending[]={
  { .uname = "LOCALDEST_VN0", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "LOCALDEST_VN1", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "LOCAL_AND_RT_VN0", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "LOCAL_AND_RT_VN1", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "ROUTETHRU_VN0", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "ROUTETHRU_VN1", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "WAITING4PULL_VN0", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "WAITING4PULL_VN1", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_umask_t icx_unc_m3upi_xpt_pftch[]={
  { .uname = "ARB", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "ARRIVED", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "BYPASS", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "FLITTED", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "LOST_ARB", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "LOST_OLD", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
  { .uname = "LOST_QFULL", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, },
};

static const intel_x86_entry_t intel_icx_unc_m3upi_pe[]={
  { .name   = "UNC_M3UPI_AG0_AD_CRD_ACQUIRED0",
    .desc   = "CMS Agent0 AD Credits Acquired",
    .code   = 0x0080,
    .modmsk = ICX_UNC_M3UPI_ATTRS,
    .cntmsk = 0xfull,
    .ngrp   = 1,
    .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ag0_ad_crd_occupancy0), /* shared */
    .umasks = icx_unc_m3upi_ag0_ad_crd_occupancy0,
  },
  { .name   = "UNC_M3UPI_AG0_AD_CRD_ACQUIRED1",
    .desc   = "CMS Agent0 AD Credits Acquired",
    .code   = 0x0081,
    .modmsk = ICX_UNC_M3UPI_ATTRS,
    .cntmsk = 0xfull,
    .ngrp   = 1,
    .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ag0_ad_crd_occupancy1), /* shared */
    .umasks = icx_unc_m3upi_ag0_ad_crd_occupancy1,
  },
  { .name   = "UNC_M3UPI_AG0_AD_CRD_OCCUPANCY0",
    .desc   = "CMS Agent0 AD Credits Occupancy",
    .code   = 0x0082,
    .modmsk = ICX_UNC_M3UPI_ATTRS,
    .cntmsk = 0xfull,
    .ngrp   = 1,
    .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ag0_ad_crd_occupancy0),
    .umasks = icx_unc_m3upi_ag0_ad_crd_occupancy0,
  },
  { .name   = "UNC_M3UPI_AG0_AD_CRD_OCCUPANCY1",
    .desc   = "CMS Agent0 AD Credits Occupancy",
    .code   = 0x0083,
    .modmsk = ICX_UNC_M3UPI_ATTRS,
    .cntmsk = 0xfull,
    .ngrp   = 1,
    .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ag0_ad_crd_occupancy1),
    .umasks = icx_unc_m3upi_ag0_ad_crd_occupancy1,
  },
  { .name   = "UNC_M3UPI_AG0_BL_CRD_ACQUIRED0",
    .desc   = "CMS Agent0 BL Credits Acquired",
    .code   = 0x0088,
    .modmsk = ICX_UNC_M3UPI_ATTRS,
    .cntmsk = 0xfull,
    .ngrp   = 1,
    .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ag0_bl_crd_occupancy0), /* shared */
    .umasks = icx_unc_m3upi_ag0_bl_crd_occupancy0,
  },
  { .name   = "UNC_M3UPI_AG0_BL_CRD_ACQUIRED1",
    .desc   = "CMS Agent0 BL Credits Acquired",
    .code   = 0x0089,
    .modmsk = ICX_UNC_M3UPI_ATTRS,
    .cntmsk = 0xfull,
    .ngrp   = 1,
    .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ag0_bl_crd_occupancy1), /* shared */
    .umasks = icx_unc_m3upi_ag0_bl_crd_occupancy1,
  },
  { .name   = "UNC_M3UPI_AG0_BL_CRD_OCCUPANCY0",
    .desc   = "CMS Agent0 BL Credits Occupancy",
    .code   = 0x008a,
    .modmsk = ICX_UNC_M3UPI_ATTRS,
    .cntmsk = 0xfull,
    .ngrp   = 1,
    .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ag0_bl_crd_occupancy0),
    .umasks = icx_unc_m3upi_ag0_bl_crd_occupancy0,
  },
  { .name   = "UNC_M3UPI_AG0_BL_CRD_OCCUPANCY1",
    .desc   = "CMS Agent0 BL Credits Occupancy",
    .code   = 0x008b,
    .modmsk = ICX_UNC_M3UPI_ATTRS,
    .cntmsk = 0xfull,
    .ngrp   = 1,
    .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ag0_bl_crd_occupancy1),
    .umasks = icx_unc_m3upi_ag0_bl_crd_occupancy1,
  },
  { .name   = "UNC_M3UPI_AG1_AD_CRD_ACQUIRED0",
    .desc   = "CMS Agent1 AD Credits Acquired",
    .code   = 0x0084,
    .modmsk = ICX_UNC_M3UPI_ATTRS,
    .cntmsk = 0xfull,
    .ngrp   = 1,
    .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ag1_ad_crd_occupancy0), /* shared */
    .umasks = icx_unc_m3upi_ag1_ad_crd_occupancy0,
  },
  { .name   = "UNC_M3UPI_AG1_AD_CRD_ACQUIRED1",
    .desc   = "CMS Agent1 AD Credits Acquired",
    .code   = 0x0085,
    .modmsk = ICX_UNC_M3UPI_ATTRS,
    .cntmsk = 0xfull,
    .ngrp   = 1,
    .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ag1_ad_crd_occupancy1), /* shared */
    .umasks = icx_unc_m3upi_ag1_ad_crd_occupancy1,
  },
  { .name   = "UNC_M3UPI_AG1_AD_CRD_OCCUPANCY0",
    .desc   = "CMS Agent1 AD Credits Occupancy",
    .code   = 0x0086,
    .modmsk = ICX_UNC_M3UPI_ATTRS,
    .cntmsk = 0xfull,
    .ngrp   = 1,
    .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ag1_ad_crd_occupancy0),
    .umasks = icx_unc_m3upi_ag1_ad_crd_occupancy0,
  },
  { .name   = "UNC_M3UPI_AG1_AD_CRD_OCCUPANCY1",
    .desc   = "CMS Agent1 AD Credits Occupancy",
    .code   = 0x0087,
    .modmsk = ICX_UNC_M3UPI_ATTRS,
    .cntmsk = 0xfull,
    .ngrp   = 1,
    .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ag1_ad_crd_occupancy1),
    .umasks = icx_unc_m3upi_ag1_ad_crd_occupancy1,
  },
  { .name   = "UNC_M3UPI_AG1_BL_CRD_ACQUIRED0",
    .desc   = "CMS Agent1 BL Credits Acquired",
    .code   = 0x008c,
    .modmsk = ICX_UNC_M3UPI_ATTRS,
    .cntmsk = 0xfull,
    .ngrp   = 1,
    .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ag1_bl_crd_acquired0),
    .umasks = icx_unc_m3upi_ag1_bl_crd_acquired0,
  },
  { .name   = "UNC_M3UPI_AG1_BL_CRD_ACQUIRED1",
    .desc   = "CMS Agent1 BL Credits Acquired",
    .code   = 0x008d,
    .modmsk = ICX_UNC_M3UPI_ATTRS,
    .cntmsk = 0xfull,
    .ngrp   = 1,
    .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ag1_bl_crd_occupancy1), /* shared */
    .umasks = icx_unc_m3upi_ag1_bl_crd_occupancy1,
  },
  { .name   = "UNC_M3UPI_AG1_BL_CRD_OCCUPANCY0",
    .desc   = "CMS Agent1 BL Credits Occupancy",
    .code   = 0x008e,
    .modmsk = ICX_UNC_M3UPI_ATTRS,
    .cntmsk = 0xfull,
    .ngrp   = 1,
    .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_stall0_no_txr_horz_crd_ad_ag0), /* shared */
.umasks = icx_unc_m3upi_stall0_no_txr_horz_crd_ad_ag0, }, { .name = "UNC_M3UPI_AG1_BL_CRD_OCCUPANCY1", .desc = "CMS Agent1 BL Credits Occupancy", .code = 0x008f, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ag1_bl_crd_occupancy1), .umasks = icx_unc_m3upi_ag1_bl_crd_occupancy1, }, { .name = "UNC_M3UPI_CHA_AD_CREDITS_EMPTY", .desc = "CBox AD Credits Empty", .code = 0x0022, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_cha_ad_credits_empty), .umasks = icx_unc_m3upi_cha_ad_credits_empty, }, { .name = "UNC_M3UPI_CLOCKTICKS", .desc = "Clockticks of the mesh to UPI (M3UPI)", .code = 0x0001, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M3UPI_CMS_CLOCKTICKS", .desc = "CMS Clockticks (experimental)", .code = 0x00c0, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M3UPI_D2C_SENT", .desc = "D2C Sent (experimental)", .code = 0x002b, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M3UPI_D2U_SENT", .desc = "D2U Sent (experimental)", .code = 0x002a, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M3UPI_DISTRESS_ASSERTED", .desc = "Distress signal asserted", .code = 0x00af, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_distress_asserted), .umasks = icx_unc_m3upi_distress_asserted, }, { .name = "UNC_M3UPI_EGRESS_ORDERING", .desc = "Egress Blocking due to Ordering requirements", .code = 0x00ba, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_egress_ordering), .umasks = icx_unc_m3upi_egress_ordering, }, { .name = "UNC_M3UPI_HORZ_RING_AD_IN_USE", .desc = "Horizontal AD Ring In Use", .code = 0x00b6, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_horz_ring_akc_in_use), /* shared */ .umasks = icx_unc_m3upi_horz_ring_akc_in_use, }, { .name 
= "UNC_M3UPI_HORZ_RING_AKC_IN_USE", .desc = "Horizontal AK Ring In Use", .code = 0x00bb, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_horz_ring_akc_in_use), .umasks = icx_unc_m3upi_horz_ring_akc_in_use, }, { .name = "UNC_M3UPI_HORZ_RING_AK_IN_USE", .desc = "Horizontal AK Ring In Use", .code = 0x00b7, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_horz_ring_bl_in_use), /* shared */ .umasks = icx_unc_m3upi_horz_ring_bl_in_use, }, { .name = "UNC_M3UPI_HORZ_RING_BL_IN_USE", .desc = "Horizontal BL Ring in Use", .code = 0x00b8, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_horz_ring_bl_in_use), .umasks = icx_unc_m3upi_horz_ring_bl_in_use, }, { .name = "UNC_M3UPI_HORZ_RING_IV_IN_USE", .desc = "Horizontal IV Ring in Use", .code = 0x00b9, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_horz_ring_iv_in_use), .umasks = icx_unc_m3upi_horz_ring_iv_in_use, }, { .name = "UNC_M3UPI_M2_BL_CREDITS_EMPTY", .desc = "M2 BL Credits Empty", .code = 0x0023, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_m2_bl_credits_empty), .umasks = icx_unc_m3upi_m2_bl_credits_empty, }, { .name = "UNC_M3UPI_MISC_EXTERNAL", .desc = "Miscellaneous Events (mostly from MS2IDI)", .code = 0x00e6, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_misc_external), .umasks = icx_unc_m3upi_misc_external, }, { .name = "UNC_M3UPI_MULTI_SLOT_RCVD", .desc = "Multi Slot Flit Received", .code = 0x003e, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_multi_slot_rcvd), .umasks = icx_unc_m3upi_multi_slot_rcvd, }, { .name = "UNC_M3UPI_RING_BOUNCES_HORZ", .desc = "Messages that bounced on the Horizontal Ring.", .code = 0x00ac, .modmsk = 
ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ring_bounces_horz), .umasks = icx_unc_m3upi_ring_bounces_horz, }, { .name = "UNC_M3UPI_RING_BOUNCES_VERT", .desc = "Messages that bounced on the Vertical Ring.", .code = 0x00aa, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ring_sink_starved_vert), /* shared */ .umasks = icx_unc_m3upi_ring_sink_starved_vert, }, { .name = "UNC_M3UPI_RING_SINK_STARVED_HORZ", .desc = "Sink Starvation on Horizontal Ring", .code = 0x00ad, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ring_sink_starved_horz), .umasks = icx_unc_m3upi_ring_sink_starved_horz, }, { .name = "UNC_M3UPI_RING_SINK_STARVED_VERT", .desc = "Sink Starvation on Vertical Ring", .code = 0x00ab, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_ring_sink_starved_vert), .umasks = icx_unc_m3upi_ring_sink_starved_vert, }, { .name = "UNC_M3UPI_RING_SRC_THRTL", .desc = "Source Throttle (experimental)", .code = 0x00ae, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M3UPI_RxC_ARB_LOST_VN0", .desc = "Lost Arb for VN0", .code = 0x004b, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_arb_lost_vn1), /* shared */ .umasks = icx_unc_m3upi_rxc_arb_lost_vn1, }, { .name = "UNC_M3UPI_RxC_ARB_LOST_VN1", .desc = "Lost Arb for VN1", .code = 0x004c, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_arb_lost_vn1), .umasks = icx_unc_m3upi_rxc_arb_lost_vn1, }, { .name = "UNC_M3UPI_RxC_ARB_MISC", .desc = "Arb Miscellaneous", .code = 0x004d, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_arb_misc), .umasks = icx_unc_m3upi_rxc_arb_misc, }, { .name = "UNC_M3UPI_RxC_ARB_NOCRD_VN0", .desc = "No Credits to Arb 
for VN0", .code = 0x0047, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_arb_nocrd_vn1), /* shared */ .umasks = icx_unc_m3upi_rxc_arb_nocrd_vn1, }, { .name = "UNC_M3UPI_RxC_ARB_NOCRD_VN1", .desc = "No Credits to Arb for VN1", .code = 0x0048, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_arb_nocrd_vn1), .umasks = icx_unc_m3upi_rxc_arb_nocrd_vn1, }, { .name = "UNC_M3UPI_RxC_ARB_NOREQ_VN0", .desc = "Can't Arb for VN0", .code = 0x0049, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_arb_noreq_vn1), /* shared */ .umasks = icx_unc_m3upi_rxc_arb_noreq_vn1, }, { .name = "UNC_M3UPI_RxC_ARB_NOREQ_VN1", .desc = "Can't Arb for VN1", .code = 0x004a, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_arb_noreq_vn1), .umasks = icx_unc_m3upi_rxc_arb_noreq_vn1, }, { .name = "UNC_M3UPI_RxC_BYPASSED", .desc = "Ingress Queue Bypasses", .code = 0x0040, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0x7ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_bypassed), .umasks = icx_unc_m3upi_rxc_bypassed, }, { .name = "UNC_M3UPI_RxC_CRD_MISC", .desc = "Miscellaneous Credit Events", .code = 0x005f, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_crd_misc), .umasks = icx_unc_m3upi_rxc_crd_misc, }, { .name = "UNC_M3UPI_RxC_CRD_OCC", .desc = "Credit Occupancy", .code = 0x0060, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_crd_occ), .umasks = icx_unc_m3upi_rxc_crd_occ, }, { .name = "UNC_M3UPI_RxC_CYCLES_NE_VN0", .desc = "VN0 Ingress (from CMS) Queue - Cycles Not Empty", .code = 0x0043, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_cycles_ne_vn1), /* shared */ .umasks = 
icx_unc_m3upi_rxc_cycles_ne_vn1, }, { .name = "UNC_M3UPI_RxC_CYCLES_NE_VN1", .desc = "VN1 Ingress (from CMS) Queue - Cycles Not Empty", .code = 0x0044, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_cycles_ne_vn1), .umasks = icx_unc_m3upi_rxc_cycles_ne_vn1, }, { .name = "UNC_M3UPI_RxC_DATA_FLITS_NOT_SENT", .desc = "Data Flit Not Sent", .code = 0x0055, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_data_flits_not_sent), .umasks = icx_unc_m3upi_rxc_data_flits_not_sent, }, { .name = "UNC_M3UPI_RxC_FLITS_GEN_BL", .desc = "Generating BL Data Flit Sequence", .code = 0x0057, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_flits_gen_bl), .umasks = icx_unc_m3upi_rxc_flits_gen_bl, }, { .name = "UNC_M3UPI_RxC_FLITS_MISC", .desc = "UNC_M3UPI_RxC_FLITS_MISC.S2REQ_RECEIVED", .code = 0x0058, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_flits_misc), .umasks = icx_unc_m3upi_rxc_flits_misc, }, { .name = "UNC_M3UPI_RxC_FLITS_SLOT_BL", .desc = "Slotting BL Message Into Header Flit", .code = 0x0056, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_flits_slot_bl), .umasks = icx_unc_m3upi_rxc_flits_slot_bl, }, { .name = "UNC_M3UPI_RxC_FLIT_GEN_HDR1", .desc = "Flit Gen - Header 1", .code = 0x0051, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_flit_gen_hdr1), .umasks = icx_unc_m3upi_rxc_flit_gen_hdr1, }, { .name = "UNC_M3UPI_RxC_FLIT_GEN_HDR2", .desc = "Flit Gen - Header 2", .code = 0x0052, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_flit_gen_hdr2), .umasks = icx_unc_m3upi_rxc_flit_gen_hdr2, }, { .name = "UNC_M3UPI_RxC_HDR_FLITS_SENT", .desc = "Sent Header Flit", .code = 0x0054, 
.modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_hdr_flits_sent), .umasks = icx_unc_m3upi_rxc_hdr_flits_sent, }, { .name = "UNC_M3UPI_RxC_HDR_FLIT_NOT_SENT", .desc = "Header Not Sent", .code = 0x0053, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_hdr_flit_not_sent), .umasks = icx_unc_m3upi_rxc_hdr_flit_not_sent, }, { .name = "UNC_M3UPI_RxC_HELD", .desc = "Message Held", .code = 0x0050, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0x7ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_held), .umasks = icx_unc_m3upi_rxc_held, }, { .name = "UNC_M3UPI_RxC_INSERTS_VN0", .desc = "VN0 Ingress (from CMS) Queue - Inserts", .code = 0x0041, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_inserts_vn1), /* shared */ .umasks = icx_unc_m3upi_rxc_inserts_vn1, }, { .name = "UNC_M3UPI_RxC_INSERTS_VN1", .desc = "VN1 Ingress (from CMS) Queue - Inserts", .code = 0x0042, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_inserts_vn1), .umasks = icx_unc_m3upi_rxc_inserts_vn1, }, { .name = "UNC_M3UPI_RxC_OCCUPANCY_VN0", .desc = "VN0 Ingress (from CMS) Queue - Occupancy", .code = 0x0045, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_occupancy_vn1), /* shared */ .umasks = icx_unc_m3upi_rxc_occupancy_vn1, }, { .name = "UNC_M3UPI_RxC_OCCUPANCY_VN1", .desc = "VN1 Ingress (from CMS) Queue - Occupancy", .code = 0x0046, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_occupancy_vn1), .umasks = icx_unc_m3upi_rxc_occupancy_vn1, }, { .name = "UNC_M3UPI_RxC_PACKING_MISS_VN0", .desc = "VN0 message can't slot into flit", .code = 0x004e, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0x7ull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_packing_miss_vn1), /* shared */ .umasks = icx_unc_m3upi_rxc_packing_miss_vn1, }, { .name = "UNC_M3UPI_RxC_PACKING_MISS_VN1", .desc = "VN1 message can't slot into flit", .code = 0x004f, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0x7ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_packing_miss_vn1), .umasks = icx_unc_m3upi_rxc_packing_miss_vn1, }, { .name = "UNC_M3UPI_RxC_VNA_CRD", .desc = "Remote VNA Credits", .code = 0x005a, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_vna_crd), .umasks = icx_unc_m3upi_rxc_vna_crd, }, { .name = "UNC_M3UPI_RxC_VNA_CRD_MISC", .desc = "UNC_M3UPI_RxC_VNA_CRD_MISC.REQ_VN01_ALLOC_LT10", .code = 0x0059, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxc_vna_crd_misc), .umasks = icx_unc_m3upi_rxc_vna_crd_misc, }, { .name = "UNC_M3UPI_RxR_BUSY_STARVED", .desc = "Transgress Injection Starvation", .code = 0x00e5, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_horz_ads_used), /* shared */ .umasks = icx_unc_m3upi_txr_horz_ads_used, }, { .name = "UNC_M3UPI_RxR_BYPASS", .desc = "Transgress Ingress Bypass", .code = 0x00e2, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxr_inserts), /* shared */ .umasks = icx_unc_m3upi_rxr_inserts, }, { .name = "UNC_M3UPI_RxR_CRD_STARVED", .desc = "Transgress Injection Starvation", .code = 0x00e3, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxr_crd_starved), .umasks = icx_unc_m3upi_rxr_crd_starved, }, { .name = "UNC_M3UPI_RxR_CRD_STARVED_1", .desc = "Transgress Injection Starvation (experimental)", .code = 0x00e4, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M3UPI_RxR_INSERTS", .desc = "Transgress Ingress Allocations", .code = 0x00e1, .modmsk = 
ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxr_inserts), .umasks = icx_unc_m3upi_rxr_inserts, }, { .name = "UNC_M3UPI_RxR_OCCUPANCY", .desc = "Transgress Ingress Occupancy", .code = 0x00e0, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_rxr_occupancy), .umasks = icx_unc_m3upi_rxr_occupancy, }, { .name = "UNC_M3UPI_STALL0_NO_TxR_HORZ_CRD_AD_AG0", .desc = "Stall on No AD Agent0 Transgress Credits", .code = 0x00d0, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_stall0_no_txr_horz_crd_ad_ag0), .umasks = icx_unc_m3upi_stall0_no_txr_horz_crd_ad_ag0, }, { .name = "UNC_M3UPI_STALL0_NO_TxR_HORZ_CRD_AD_AG1", .desc = "Stall on No AD Agent1 Transgress Credits", .code = 0x00d2, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_stall0_no_txr_horz_crd_bl_ag0), /* shared */ .umasks = icx_unc_m3upi_stall0_no_txr_horz_crd_bl_ag0, }, { .name = "UNC_M3UPI_STALL0_NO_TxR_HORZ_CRD_BL_AG0", .desc = "Stall on No BL Agent0 Transgress Credits", .code = 0x00d4, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_stall0_no_txr_horz_crd_bl_ag0), .umasks = icx_unc_m3upi_stall0_no_txr_horz_crd_bl_ag0, }, { .name = "UNC_M3UPI_STALL0_NO_TxR_HORZ_CRD_BL_AG1", .desc = "Stall on No BL Agent1 Transgress Credits", .code = 0x00d6, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_stall0_no_txr_horz_crd_bl_ag1), .umasks = icx_unc_m3upi_stall0_no_txr_horz_crd_bl_ag1, }, { .name = "UNC_M3UPI_STALL1_NO_TxR_HORZ_CRD_AD_AG0", .desc = "Stall on No AD Agent0 Transgress Credits", .code = 0x00d1, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_stall1_no_txr_horz_crd_ad_ag1_1), /* shared */ .umasks = icx_unc_m3upi_stall1_no_txr_horz_crd_ad_ag1_1, }, { 
.name = "UNC_M3UPI_STALL1_NO_TxR_HORZ_CRD_AD_AG1_1", .desc = "Stall on No AD Agent1 Transgress Credits", .code = 0x00d3, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_stall1_no_txr_horz_crd_ad_ag1_1), .umasks = icx_unc_m3upi_stall1_no_txr_horz_crd_ad_ag1_1, }, { .name = "UNC_M3UPI_STALL1_NO_TxR_HORZ_CRD_BL_AG0_1", .desc = "Stall on No BL Agent0 Transgress Credits", .code = 0x00d5, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_stall1_no_txr_horz_crd_bl_ag1_1), /* shared */ .umasks = icx_unc_m3upi_stall1_no_txr_horz_crd_bl_ag1_1, }, { .name = "UNC_M3UPI_STALL1_NO_TxR_HORZ_CRD_BL_AG1_1", .desc = "Stall on No BL Agent1 Transgress Credits", .code = 0x00d7, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_stall1_no_txr_horz_crd_bl_ag1_1), .umasks = icx_unc_m3upi_stall1_no_txr_horz_crd_bl_ag1_1, }, { .name = "UNC_M3UPI_TxC_AD_ARB_FAIL", .desc = "Failed ARB for AD", .code = 0x0030, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txc_ad_flq_cycles_ne), /* shared */ .umasks = icx_unc_m3upi_txc_ad_flq_cycles_ne, }, { .name = "UNC_M3UPI_TxC_AD_FLQ_BYPASS", .desc = "AD FlowQ Bypass", .code = 0x002c, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txc_ad_flq_bypass), .umasks = icx_unc_m3upi_txc_ad_flq_bypass, }, { .name = "UNC_M3UPI_TxC_AD_FLQ_CYCLES_NE", .desc = "AD Flow Q Not Empty", .code = 0x0027, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txc_ad_flq_cycles_ne), .umasks = icx_unc_m3upi_txc_ad_flq_cycles_ne, }, { .name = "UNC_M3UPI_TxC_AD_FLQ_INSERTS", .desc = "AD Flow Q Inserts", .code = 0x002d, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txc_ad_flq_inserts), .umasks = 
icx_unc_m3upi_txc_ad_flq_inserts, }, { .name = "UNC_M3UPI_TxC_AD_FLQ_OCCUPANCY", .desc = "AD Flow Q Occupancy", .code = 0x001c, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0x1ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txc_ad_flq_occupancy), .umasks = icx_unc_m3upi_txc_ad_flq_occupancy, }, { .name = "UNC_M3UPI_TxC_AK_FLQ_INSERTS", .desc = "AK Flow Q Inserts (experimental)", .code = 0x002f, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M3UPI_TxC_AK_FLQ_OCCUPANCY", .desc = "AK Flow Q Occupancy (experimental)", .code = 0x001e, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0x1ull, }, { .name = "UNC_M3UPI_TxC_BL_ARB_FAIL", .desc = "Failed ARB for BL", .code = 0x0035, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txc_bl_arb_fail), .umasks = icx_unc_m3upi_txc_bl_arb_fail, }, { .name = "UNC_M3UPI_TxC_BL_FLQ_CYCLES_NE", .desc = "BL Flow Q Not Empty", .code = 0x0028, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txc_bl_flq_cycles_ne), .umasks = icx_unc_m3upi_txc_bl_flq_cycles_ne, }, { .name = "UNC_M3UPI_TxC_BL_FLQ_INSERTS", .desc = "BL Flow Q Inserts", .code = 0x002e, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txc_bl_flq_inserts), .umasks = icx_unc_m3upi_txc_bl_flq_inserts, }, { .name = "UNC_M3UPI_TxC_BL_FLQ_OCCUPANCY", .desc = "BL Flow Q Occupancy", .code = 0x001d, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0x1ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txc_bl_flq_occupancy), .umasks = icx_unc_m3upi_txc_bl_flq_occupancy, }, { .name = "UNC_M3UPI_TxC_BL_WB_FLQ_OCCUPANCY", .desc = "BL Flow Q Occupancy", .code = 0x001f, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0x1ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txc_bl_wb_flq_occupancy), .umasks = icx_unc_m3upi_txc_bl_wb_flq_occupancy, }, { .name = "UNC_M3UPI_TxR_HORZ_ADS_USED", .desc = "CMS Horizontal ADS Used", 
.code = 0x00a6, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_horz_ads_used), .umasks = icx_unc_m3upi_txr_horz_ads_used, }, { .name = "UNC_M3UPI_TxR_HORZ_BYPASS", .desc = "CMS Horizontal Bypass Used", .code = 0x00a7, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_horz_cycles_full), /* shared */ .umasks = icx_unc_m3upi_txr_horz_cycles_full, }, { .name = "UNC_M3UPI_TxR_HORZ_CYCLES_FULL", .desc = "Cycles CMS Horizontal Egress Queue is Full", .code = 0x00a2, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_horz_cycles_full), .umasks = icx_unc_m3upi_txr_horz_cycles_full, }, { .name = "UNC_M3UPI_TxR_HORZ_CYCLES_NE", .desc = "Cycles CMS Horizontal Egress Queue is Not Empty", .code = 0x00a3, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_horz_inserts), /* shared */ .umasks = icx_unc_m3upi_txr_horz_inserts, }, { .name = "UNC_M3UPI_TxR_HORZ_INSERTS", .desc = "CMS Horizontal Egress Inserts", .code = 0x00a1, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_horz_inserts), .umasks = icx_unc_m3upi_txr_horz_inserts, }, { .name = "UNC_M3UPI_TxR_HORZ_NACK", .desc = "CMS Horizontal Egress NACKs", .code = 0x00a4, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_horz_occupancy), /* shared */ .umasks = icx_unc_m3upi_txr_horz_occupancy, }, { .name = "UNC_M3UPI_TxR_HORZ_OCCUPANCY", .desc = "CMS Horizontal Egress Occupancy", .code = 0x00a0, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_horz_occupancy), .umasks = icx_unc_m3upi_txr_horz_occupancy, }, { .name = "UNC_M3UPI_TxR_HORZ_STARVED", .desc = "CMS Horizontal Egress Injection Starvation", .code = 0x00a5, .modmsk = ICX_UNC_M3UPI_ATTRS, 
.cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_horz_starved), .umasks = icx_unc_m3upi_txr_horz_starved, }, { .name = "UNC_M3UPI_TxR_VERT_ADS_USED", .desc = "CMS Vertical ADS Used", .code = 0x009c, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_vert_ads_used), .umasks = icx_unc_m3upi_txr_vert_ads_used, }, { .name = "UNC_M3UPI_TxR_VERT_BYPASS", .desc = "CMS Vertical ADS Used", .code = 0x009d, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_vert_bypass), .umasks = icx_unc_m3upi_txr_vert_bypass, }, { .name = "UNC_M3UPI_TxR_VERT_BYPASS_1", .desc = "CMS Vertical ADS Used", .code = 0x009e, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_vert_cycles_full1), /* shared */ .umasks = icx_unc_m3upi_txr_vert_cycles_full1, }, { .name = "UNC_M3UPI_TxR_VERT_CYCLES_FULL0", .desc = "Cycles CMS Vertical Egress Queue Is Full", .code = 0x0094, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_vert_cycles_ne0), /* shared */ .umasks = icx_unc_m3upi_txr_vert_cycles_ne0, }, { .name = "UNC_M3UPI_TxR_VERT_CYCLES_FULL1", .desc = "Cycles CMS Vertical Egress Queue Is Full", .code = 0x0095, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_vert_cycles_full1), .umasks = icx_unc_m3upi_txr_vert_cycles_full1, }, { .name = "UNC_M3UPI_TxR_VERT_CYCLES_NE0", .desc = "Cycles CMS Vertical Egress Queue Is Not Empty", .code = 0x0096, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_vert_cycles_ne0), .umasks = icx_unc_m3upi_txr_vert_cycles_ne0, }, { .name = "UNC_M3UPI_TxR_VERT_CYCLES_NE1", .desc = "Cycles CMS Vertical Egress Queue Is Not Empty", .code = 0x0097, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_vert_inserts1), /* shared */ .umasks = icx_unc_m3upi_txr_vert_inserts1, }, { .name = "UNC_M3UPI_TxR_VERT_INSERTS0", .desc = "CMS Vert Egress Allocations", .code = 0x0092, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_vert_occupancy0), /* shared */ .umasks = icx_unc_m3upi_txr_vert_occupancy0, }, { .name = "UNC_M3UPI_TxR_VERT_INSERTS1", .desc = "CMS Vert Egress Allocations", .code = 0x0093, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_vert_inserts1), .umasks = icx_unc_m3upi_txr_vert_inserts1, }, { .name = "UNC_M3UPI_TxR_VERT_NACK0", .desc = "CMS Vertical Egress NACKs", .code = 0x0098, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_vert_starved0), /* shared */ .umasks = icx_unc_m3upi_txr_vert_starved0, }, { .name = "UNC_M3UPI_TxR_VERT_NACK1", .desc = "CMS Vertical Egress NACKs", .code = 0x0099, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_vert_occupancy1), /* shared */ .umasks = icx_unc_m3upi_txr_vert_occupancy1, }, { .name = "UNC_M3UPI_TxR_VERT_OCCUPANCY0", .desc = "CMS Vert Egress Occupancy", .code = 0x0090, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_vert_occupancy0), .umasks = icx_unc_m3upi_txr_vert_occupancy0, }, { .name = "UNC_M3UPI_TxR_VERT_OCCUPANCY1", .desc = "CMS Vert Egress Occupancy", .code = 0x0091, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_vert_occupancy1), .umasks = icx_unc_m3upi_txr_vert_occupancy1, }, { .name = "UNC_M3UPI_TxR_VERT_STARVED0", .desc = "CMS Vertical Egress Injection Starvation", .code = 0x009a, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_vert_starved0), .umasks = 
icx_unc_m3upi_txr_vert_starved0, }, { .name = "UNC_M3UPI_TxR_VERT_STARVED1", .desc = "CMS Vertical Egress Injection Starvation", .code = 0x009b, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_txr_vert_starved1), .umasks = icx_unc_m3upi_txr_vert_starved1, }, { .name = "UNC_M3UPI_UPI_PEER_AD_CREDITS_EMPTY", .desc = "UPI0 AD Credits Empty", .code = 0x0020, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_upi_peer_ad_credits_empty), .umasks = icx_unc_m3upi_upi_peer_ad_credits_empty, }, { .name = "UNC_M3UPI_UPI_PEER_BL_CREDITS_EMPTY", .desc = "UPI0 BL Credits Empty", .code = 0x0021, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_upi_peer_bl_credits_empty), .umasks = icx_unc_m3upi_upi_peer_bl_credits_empty, }, { .name = "UNC_M3UPI_UPI_PREFETCH_SPAWN", .desc = "FlowQ Generated Prefetch (experimental)", .code = 0x0029, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M3UPI_VERT_RING_AD_IN_USE", .desc = "Vertical AD Ring In Use", .code = 0x00b0, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_vert_ring_akc_in_use), /* shared */ .umasks = icx_unc_m3upi_vert_ring_akc_in_use, }, { .name = "UNC_M3UPI_VERT_RING_AKC_IN_USE", .desc = "Vertical AKC Ring In Use", .code = 0x00b4, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_vert_ring_akc_in_use), .umasks = icx_unc_m3upi_vert_ring_akc_in_use, }, { .name = "UNC_M3UPI_VERT_RING_AK_IN_USE", .desc = "Vertical AK Ring In Use", .code = 0x00b1, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_vert_ring_bl_in_use), /* shared */ .umasks = icx_unc_m3upi_vert_ring_bl_in_use, }, { .name = "UNC_M3UPI_VERT_RING_BL_IN_USE", .desc = "Vertical BL Ring in Use", .code = 0x00b2, .modmsk = ICX_UNC_M3UPI_ATTRS, 
.cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_vert_ring_bl_in_use), .umasks = icx_unc_m3upi_vert_ring_bl_in_use, }, { .name = "UNC_M3UPI_VERT_RING_IV_IN_USE", .desc = "Vertical IV Ring in Use", .code = 0x00b3, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_vert_ring_iv_in_use), .umasks = icx_unc_m3upi_vert_ring_iv_in_use, }, { .name = "UNC_M3UPI_VERT_RING_TGC_IN_USE", .desc = "Vertical TGC Ring In Use", .code = 0x00b5, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_vert_ring_tgc_in_use), .umasks = icx_unc_m3upi_vert_ring_tgc_in_use, }, { .name = "UNC_M3UPI_VN0_CREDITS_USED", .desc = "VN0 Credit Used", .code = 0x005b, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_vn0_no_credits), /* shared */ .umasks = icx_unc_m3upi_vn0_no_credits, }, { .name = "UNC_M3UPI_VN0_NO_CREDITS", .desc = "VN0 No Credits", .code = 0x005d, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_vn0_no_credits), .umasks = icx_unc_m3upi_vn0_no_credits, }, { .name = "UNC_M3UPI_VN1_CREDITS_USED", .desc = "VN1 Credit Used", .code = 0x005c, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_vn1_no_credits), /* shared */ .umasks = icx_unc_m3upi_vn1_no_credits, }, { .name = "UNC_M3UPI_VN1_NO_CREDITS", .desc = "VN1 No Credits", .code = 0x005e, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_vn1_no_credits), .umasks = icx_unc_m3upi_vn1_no_credits, }, { .name = "UNC_M3UPI_WB_OCC_COMPARE", .desc = "UNC_M3UPI_WB_OCC_COMPARE.RT_GT_LOCALDEST_VN0", .code = 0x007e, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_wb_occ_compare), .umasks = icx_unc_m3upi_wb_occ_compare, }, { .name = "UNC_M3UPI_WB_PENDING", .desc = 
"UNC_M3UPI_WB_PENDING.LOCALDEST_VN0", .code = 0x007d, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_wb_pending), .umasks = icx_unc_m3upi_wb_pending, }, { .name = "UNC_M3UPI_XPT_PFTCH", .desc = "UNC_M3UPI_XPT_PFTCH.ARRIVED", .code = 0x0061, .modmsk = ICX_UNC_M3UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_m3upi_xpt_pftch), .umasks = icx_unc_m3upi_xpt_pftch, }, }; /* 130 events available */

papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_icx_unc_pcu_events.h

/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux.
* * PMU: icx_unc_pcu (IcelakeX Uncore PCU) * Based on Intel JSON event table version : 1.21 * Based on Intel JSON event table published : 06/06/2023 */ static const intel_x86_umask_t icx_unc_p_power_state_occupancy[]={ { .uname = "CORES_C0", .udesc = "C0 and C1 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORES_C3", .udesc = "C3 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORES_C6", .udesc = "C6 and C7 (experimental)", .ucode = 0xc000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_icx_unc_pcu_pe[]={ { .name = "UNC_P_CLOCKTICKS", .desc = "Clockticks of the power control unit (PCU)", .code = 0x0000, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_CORE_TRANSITION_CYCLES", .desc = "UNC_P_CORE_TRANSITION_CYCLES (experimental)", .code = 0x0060, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_DEMOTIONS", .desc = "UNC_P_DEMOTIONS (experimental)", .code = 0x0030, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_FIVR_PS_PS0_CYCLES", .desc = "Phase Shed 0 Cycles (experimental)", .code = 0x0075, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_FIVR_PS_PS1_CYCLES", .desc = "Phase Shed 1 Cycles (experimental)", .code = 0x0076, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_FIVR_PS_PS2_CYCLES", .desc = "Phase Shed 2 Cycles (experimental)", .code = 0x0077, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_FIVR_PS_PS3_CYCLES", .desc = "Phase Shed 3 Cycles (experimental)", .code = 0x0078, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_FREQ_CLIP_AVX256", .desc = "AVX256 Frequency Clipping (experimental)", .code = 0x0049, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_FREQ_CLIP_AVX512", .desc = "AVX512 Frequency Clipping (experimental)", .code = 0x004a, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = 
"UNC_P_FREQ_MAX_LIMIT_THERMAL_CYCLES", .desc = "Thermal Strongest Upper Limit Cycles (experimental)", .code = 0x0004, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_FREQ_MAX_POWER_CYCLES", .desc = "Power Strongest Upper Limit Cycles (experimental)", .code = 0x0005, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_FREQ_MIN_IO_P_CYCLES", .desc = "IO P Limit Strongest Lower Limit Cycles (experimental)", .code = 0x0073, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_FREQ_TRANS_CYCLES", .desc = "Cycles spent changing Frequency (experimental)", .code = 0x0074, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_MEMORY_PHASE_SHEDDING_CYCLES", .desc = "Memory Phase Shedding Cycles (experimental)", .code = 0x002f, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_PKG_RESIDENCY_C0_CYCLES", .desc = "Package C State Residency - C0 (experimental)", .code = 0x002a, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_PKG_RESIDENCY_C2E_CYCLES", .desc = "Package C State Residency - C2E (experimental)", .code = 0x002b, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_PKG_RESIDENCY_C3_CYCLES", .desc = "Package C State Residency - C3 (experimental)", .code = 0x002c, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_PKG_RESIDENCY_C6_CYCLES", .desc = "Package C State Residency - C6 (experimental)", .code = 0x002d, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_PMAX_THROTTLED_CYCLES", .desc = "UNC_P_PMAX_THROTTLED_CYCLES (experimental)", .code = 0x0006, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_POWER_STATE_OCCUPANCY", .desc = "Number of cores in C-State", .code = 0x0080, .modmsk = ICX_UNC_PCU_OCC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_p_power_state_occupancy), .umasks = icx_unc_p_power_state_occupancy, }, { .name = "UNC_P_PROCHOT_EXTERNAL_CYCLES", .desc = "External Prochot 
(experimental)", .code = 0x000a, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_PROCHOT_INTERNAL_CYCLES", .desc = "Internal Prochot (experimental)", .code = 0x0009, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_TOTAL_TRANSITION_CYCLES", .desc = "Total Core C State Transition Cycles (experimental)", .code = 0x0072, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_P_VR_HOT_CYCLES", .desc = "VR Hot (experimental)", .code = 0x0042, .modmsk = ICX_UNC_PCU_ATTRS, .cntmsk = 0xfull, }, }; /* 24 events available */
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_icx_unc_ubox_events.h
/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: icx_unc_ubox (IcelakeX Uncore UBOX) * Based on Intel JSON event table version : 1.21 * Based on Intel JSON event table published : 06/06/2023 */ static const intel_x86_umask_t icx_unc_u_event_msg[]={ { .uname = "DOORBELL_RCVD", .udesc = "Doorbell (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "INT_PRIO", .udesc = "Interrupt (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IPI_RCVD", .udesc = "IPI (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSI_RCVD", .udesc = "MSI (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VLW_RCVD", .udesc = "VLW (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_u_m2u_misc1[]={ { .uname = "RxC_CYCLES_NE_CBO_NCB", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RxC_CYCLES_NE_CBO_NCS", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RxC_CYCLES_NE_UPI_NCB", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RxC_CYCLES_NE_UPI_NCS", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TxC_CYCLES_CRD_OVF_CBO_NCB", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TxC_CYCLES_CRD_OVF_CBO_NCS", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TxC_CYCLES_CRD_OVF_UPI_NCB", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TxC_CYCLES_CRD_OVF_UPI_NCS", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_u_m2u_misc2[]={ { .uname = "RxC_CYCLES_EMPTY_BL", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RxC_CYCLES_FULL_BL", .udesc = "TBD 
(experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TxC_CYCLES_CRD_OVF_VN0_NCB", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TxC_CYCLES_CRD_OVF_VN0_NCS", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TxC_CYCLES_EMPTY_AK", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TxC_CYCLES_EMPTY_AKC", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TxC_CYCLES_EMPTY_BL", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TxC_CYCLES_FULL_BL", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_u_m2u_misc3[]={ { .uname = "TxC_CYCLES_FULL_AK", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TxC_CYCLES_FULL_AKC", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_u_phold_cycles[]={ { .uname = "ASSERT_TO_ACK", .udesc = "Assert to ACK (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t icx_unc_u_racu_drng[]={ { .uname = "PFTCH_BUF_EMPTY", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RDRAND", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RDSEED", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_icx_unc_ubox_pe[]={ { .name = "UNC_U_CLOCKTICKS", .desc = "Clockticks in the UBOX using a dedicated 48-bit Fixed Counter", .code = 0x0000, .modmsk = ICX_UNC_UBO_ATTRS, .cntmsk = 0x1ull, }, { .name = "UNC_U_EVENT_MSG", .desc = "Message Received", .code = 0x0042, .modmsk = ICX_UNC_UBO_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(icx_unc_u_event_msg), .umasks = icx_unc_u_event_msg, }, { .name = "UNC_U_LOCK_CYCLES", .desc = "IDI Lock/SplitLock Cycles (experimental)", .code = 0x0044, .modmsk = ICX_UNC_UBO_ATTRS, .cntmsk = 0x3ull, }, { .name = "UNC_U_M2U_MISC1", .desc = "TBD", .code = 0x004d, .modmsk = ICX_UNC_UBO_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_u_m2u_misc1), .umasks = icx_unc_u_m2u_misc1, }, { .name = "UNC_U_M2U_MISC2", .desc = "TBD", .code = 0x004e, .modmsk = ICX_UNC_UBO_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_u_m2u_misc2), .umasks = icx_unc_u_m2u_misc2, }, { .name = "UNC_U_M2U_MISC3", .desc = "TBD", .code = 0x004f, .modmsk = ICX_UNC_UBO_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_u_m2u_misc3), .umasks = icx_unc_u_m2u_misc3, }, { .name = "UNC_U_PHOLD_CYCLES", .desc = "Cycles PHOLD Assert to Ack", .code = 0x0045, .modmsk = ICX_UNC_UBO_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_u_phold_cycles), .umasks = icx_unc_u_phold_cycles, }, { .name = "UNC_U_RACU_DRNG", .desc = "TBD", .code = 0x004c, .modmsk = ICX_UNC_UBO_ATTRS, .cntmsk = 0x3ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_u_racu_drng), .umasks = icx_unc_u_racu_drng, }, { .name = "UNC_U_RACU_REQUESTS", .desc = "RACU Request (experimental)", .code = 0x0046, .modmsk = ICX_UNC_UBO_ATTRS, .cntmsk = 0x3ull, }, }; /* 9 events available */
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_icx_unc_upi_events.h
/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software 
is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: icx_unc_upi (IcelakeX Uncore UPI) * Based on Intel JSON event table version : 1.21 * Based on Intel JSON event table published : 06/06/2023 */ static const intel_x86_umask_t icx_unc_upi_direct_attempts[]={ { .uname = "D2C", .udesc = "D2C (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "D2K", .udesc = "D2K (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_upi_flowq_no_vna_crd[]={ { .uname = "AD_VNA_EQ0", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_VNA_EQ1", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_VNA_EQ2", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_VNA_EQ0", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_VNA_EQ1", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_VNA_EQ2", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_VNA_EQ3", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "BL_VNA_EQ0", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_upi_m3_byp_blocked[]={ { .uname = "BGF_CRD", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLOWQ_AD_VNA_LE2", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLOWQ_AK_VNA_LE3", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLOWQ_BL_VNA_EQ0", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "GV_BLOCK", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_upi_m3_rxq_blocked[]={ { .uname = "BGF_CRD", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLOWQ_AD_VNA_BTW_2_THRESH", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLOWQ_AD_VNA_LE2", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLOWQ_AK_VNA_LE3", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLOWQ_BL_VNA_BTW_0_THRESH", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLOWQ_BL_VNA_EQ0", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "GV_BLOCK", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_upi_req_slot2_from_m3[]={ { .uname = "ACK", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN0", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN1", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VNA", .udesc = "TBD 
(experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_upi_rxl_flits[]={ { .uname = "ALL_DATA", .udesc = "All Data", .ucode = 0x0f00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_NULL", .udesc = "Null FLITs received from any slot", .ucode = 0x2700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA", .udesc = "Data (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IDLE", .udesc = "Null FLITs received from any slot (experimental)", .ucode = 0x4700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLCRD", .udesc = "LLCRD Not Empty (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLCTRL", .udesc = "LLCTRL (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NON_DATA", .udesc = "All Non Data", .ucode = 0x9700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NULL", .udesc = "Slot NULL or LLCRD Empty (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PROTHDR", .udesc = "Protocol Header (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT0", .udesc = "Slot 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT1", .udesc = "Slot 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT2", .udesc = "Slot 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_upi_rxl_inserts[]={ { .uname = "SLOT0", .udesc = "Slot 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT1", .udesc = "Slot 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT2", .udesc = "Slot 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_upi_rxl_occupancy[]={ { .uname = "SLOT0", .udesc = "Slot 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { 
.uname = "SLOT1", .udesc = "Slot 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT2", .udesc = "Slot 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_upi_rxl_slot_bypass[]={ { .uname = "S0_RXQ1", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S0_RXQ2", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S1_RXQ0", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S1_RXQ2", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S2_RXQ0", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S2_RXQ1", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_upi_txl0p_clk_active[]={ { .uname = "CFG_CTL", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DFX", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RETRY", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RXQ", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RXQ_BYPASS", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RXQ_CRED", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SPARE", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TXQ", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_upi_txl_basic_hdr_match[]={ { .uname = "NCB", .udesc = "Non-Coherent Bypass (experimental)", .ucode = 0x0e00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB_OPC", .udesc 
= "Non-Coherent Bypass, Match Opcode (experimental)", .ucode = 0x100000e00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS", .udesc = "Non-Coherent Standard (experimental)", .ucode = 0x0f00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS_OPC", .udesc = "Non-Coherent Standard, Match Opcode (experimental)", .ucode = 0x100000f00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REQ", .udesc = "Request (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REQ_OPC", .udesc = "Request, Match Opcode (experimental)", .ucode = 0x100000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPCNFLT", .udesc = "Response - Conflict (experimental)", .ucode = 0x10000aa00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPI", .udesc = "Response - Invalid (experimental)", .ucode = 0x100002a00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSP_DATA", .udesc = "Response - Data (experimental)", .ucode = 0x0c00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSP_DATA_OPC", .udesc = "Response - Data, Match Opcode (experimental)", .ucode = 0x100000c00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSP_NODATA", .udesc = "Response - No Data (experimental)", .ucode = 0x0a00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSP_NODATA_OPC", .udesc = "Response - No Data, Match Opcode (experimental)", .ucode = 0x100000a00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP", .udesc = "Snoop (experimental)", .ucode = 0x0900ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP_OPC", .udesc = "Snoop, Match Opcode (experimental)", .ucode = 0x100000900ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WB", .udesc = "Writeback (experimental)", .ucode = 0x0d00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WB_OPC", .udesc = "Writeback, Match Opcode (experimental)", .ucode = 0x100000d00ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t icx_unc_upi_txl_flits[]={ { .uname = "ALL_DATA", .udesc = "All Data", .ucode = 0x0f00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"ALL_NULL", .udesc = "Null FLITs transmitted to any slot", .ucode = 0x2700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA", .udesc = "Data (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IDLE", .udesc = "Idle (experimental)", .ucode = 0x4700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLCRD", .udesc = "LLCRD Not Empty (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLCTRL", .udesc = "LLCTRL (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NON_DATA", .udesc = "All Non Data", .ucode = 0x9700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NULL", .udesc = "Slot NULL or LLCRD Empty (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PROTHDR", .udesc = "Protocol Header (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT0", .udesc = "Slot 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT1", .udesc = "Slot 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT2", .udesc = "Slot 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_icx_unc_upi_ll_pe[]={ { .name = "UNC_UPI_CLOCKTICKS", .desc = "Number of kfclks", .code = 0x0001, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_DIRECT_ATTEMPTS", .desc = "Direct packet attempts", .code = 0x0012, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_upi_direct_attempts), .umasks = icx_unc_upi_direct_attempts, }, { .name = "UNC_UPI_FLOWQ_NO_VNA_CRD", .desc = "UNC_UPI_FLOWQ_NO_VNA_CRD.AD_VNA_EQ0", .code = 0x0018, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_upi_flowq_no_vna_crd), .umasks = icx_unc_upi_flowq_no_vna_crd, }, { .name = "UNC_UPI_L1_POWER_CYCLES", .desc = "Cycles in L1", .code = 0x0021, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 
0xfull, }, { .name = "UNC_UPI_M3_BYP_BLOCKED", .desc = "UNC_UPI_M3_BYP_BLOCKED.FLOWQ_AD_VNA_LE2", .code = 0x0014, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_upi_m3_byp_blocked), .umasks = icx_unc_upi_m3_byp_blocked, }, { .name = "UNC_UPI_M3_CRD_RETURN_BLOCKED", .desc = "UNC_UPI_M3_CRD_RETURN_BLOCKED (experimental)", .code = 0x0016, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_M3_RXQ_BLOCKED", .desc = "UNC_UPI_M3_RXQ_BLOCKED.FLOWQ_AD_VNA_LE2", .code = 0x0015, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_upi_m3_rxq_blocked), .umasks = icx_unc_upi_m3_rxq_blocked, }, { .name = "UNC_UPI_PHY_INIT_CYCLES", .desc = "Cycles where phy is not in L0, L0c, L0p, L1 (experimental)", .code = 0x0020, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_POWER_L1_NACK", .desc = "L1 Req Nack (experimental)", .code = 0x0023, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_POWER_L1_REQ", .desc = "L1 Req (same as L1 Ack). 
(experimental)", .code = 0x0022, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_REQ_SLOT2_FROM_M3", .desc = "UNC_UPI_REQ_SLOT2_FROM_M3.VNA", .code = 0x0046, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_upi_req_slot2_from_m3), .umasks = icx_unc_upi_req_slot2_from_m3, }, { .name = "UNC_UPI_RxL0P_POWER_CYCLES", .desc = "Cycles in L0p (experimental)", .code = 0x0025, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_RxL0_POWER_CYCLES", .desc = "Cycles in L0 (experimental)", .code = 0x0024, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_RxL_BASIC_HDR_MATCH", .desc = "Matches on Receive path of a UPI Port", .code = 0x0005, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_upi_txl_basic_hdr_match), /* shared */ .umasks = icx_unc_upi_txl_basic_hdr_match, }, { .name = "UNC_UPI_RxL_BYPASSED", .desc = "RxQ Flit Buffer Bypassed", .code = 0x0031, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_upi_rxl_inserts), /* shared */ .umasks = icx_unc_upi_rxl_inserts, }, { .name = "UNC_UPI_RxL_CRC_ERRORS", .desc = "CRC Errors Detected (experimental)", .code = 0x000b, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_RxL_CRC_LLR_REQ_TRANSMIT", .desc = "LLR Requests Sent (experimental)", .code = 0x0008, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_RxL_CREDITS_CONSUMED_VN0", .desc = "VN0 Credit Consumed (experimental)", .code = 0x0039, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_RxL_CREDITS_CONSUMED_VN1", .desc = "VN1 Credit Consumed (experimental)", .code = 0x003a, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_RxL_CREDITS_CONSUMED_VNA", .desc = "VNA Credit Consumed (experimental)", .code = 0x0038, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_RxL_FLITS", .desc = "Valid Flits 
Received", .code = 0x0003, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_upi_rxl_flits), .umasks = icx_unc_upi_rxl_flits, }, { .name = "UNC_UPI_RxL_INSERTS", .desc = "RxQ Flit Buffer Allocations", .code = 0x0030, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_upi_rxl_inserts), .umasks = icx_unc_upi_rxl_inserts, }, { .name = "UNC_UPI_RxL_OCCUPANCY", .desc = "RxQ Occupancy - All Packets", .code = 0x0032, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_upi_rxl_occupancy), .umasks = icx_unc_upi_rxl_occupancy, }, { .name = "UNC_UPI_RxL_SLOT_BYPASS", .desc = "UNC_UPI_RxL_SLOT_BYPASS.S0_RXQ1", .code = 0x0033, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_upi_rxl_slot_bypass), .umasks = icx_unc_upi_rxl_slot_bypass, }, { .name = "UNC_UPI_TxL0P_CLK_ACTIVE", .desc = "UNC_UPI_TxL0P_CLK_ACTIVE.CFG_CTL", .code = 0x002a, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_upi_txl0p_clk_active), .umasks = icx_unc_upi_txl0p_clk_active, }, { .name = "UNC_UPI_TxL0P_POWER_CYCLES", .desc = "Cycles in L0p", .code = 0x0027, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_TxL0P_POWER_CYCLES_LL_ENTER", .desc = "UNC_UPI_TxL0P_POWER_CYCLES_LL_ENTER (experimental)", .code = 0x0028, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_TxL0P_POWER_CYCLES_M3_EXIT", .desc = "UNC_UPI_TxL0P_POWER_CYCLES_M3_EXIT (experimental)", .code = 0x0029, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_TxL0_POWER_CYCLES", .desc = "Cycles in L0 (experimental)", .code = 0x0026, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_TxL_BASIC_HDR_MATCH", .desc = "Matches on Transmit path of a UPI Port", .code = 0x0004, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(icx_unc_upi_txl_basic_hdr_match), .umasks = icx_unc_upi_txl_basic_hdr_match, }, { .name = "UNC_UPI_TxL_BYPASSED", .desc = "Tx Flit Buffer Bypassed (experimental)", .code = 0x0041, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_TxL_FLITS", .desc = "Valid Flits Sent", .code = 0x0002, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(icx_unc_upi_txl_flits), .umasks = icx_unc_upi_txl_flits, }, { .name = "UNC_UPI_TxL_INSERTS", .desc = "Tx Flit Buffer Allocations (experimental)", .code = 0x0040, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_TxL_OCCUPANCY", .desc = "Tx Flit Buffer Occupancy (experimental)", .code = 0x0042, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_VNA_CREDIT_RETURN_BLOCKED_VN01", .desc = "UNC_UPI_VNA_CREDIT_RETURN_BLOCKED_VN01 (experimental)", .code = 0x0045, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_VNA_CREDIT_RETURN_OCCUPANCY", .desc = "VNA Credits Pending Return - Occupancy (experimental)", .code = 0x0044, .modmsk = ICX_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, }; /* 36 events available */
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_ivb_events.h
/* * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. * * PMU: ivb (Intel Ivy Bridge) * PMU: ivb_ep (Intel Ivy Bridge EP) */ static const intel_x86_umask_t ivb_arith[]={ { .uname = "FPU_DIV_ACTIVE", .udesc = "Cycles that the divider is active, includes integer and floating point", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "FPU_DIV", .udesc = "Number of cycles the divider is activated, includes integer and floating point", .ucode = 0x400 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t ivb_br_inst_exec[]={ { .uname = "NONTAKEN_COND", .udesc = "All macro conditional non-taken branch instructions", .ucode = 0x4100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_COND", .udesc = "All macro conditional taken branch instructions", .ucode = 0x8100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_DIRECT_JUMP", .udesc = "All macro unconditional taken branch instructions, excluding calls and indirects", .ucode = 0x8200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_INDIRECT_JUMP_NON_CALL_RET", .udesc = "All taken indirect branches that are not calls nor returns", .ucode = 0x8400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_NEAR_RETURN", .udesc = "All taken indirect branches that have a return mnemonic", .ucode = 0x8800, .uflags= INTEL_X86_NCOMBO, }, { .uname = 
"TAKEN_DIRECT_NEAR_CALL", .udesc = "All taken non-indirect calls", .ucode = 0x9000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_INDIRECT_NEAR_CALL", .udesc = "All taken indirect calls, including both register and memory indirect", .ucode = 0xa000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_BRANCHES", .udesc = "All near branch instructions executed (not necessarily retired)", .ucode = 0xff00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL_COND", .udesc = "All macro conditional branch instructions", .ucode = 0xc100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_COND", .udesc = "All macro conditional branch instructions", .ucode = 0xc100, .uequiv = "ALL_COND", .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_INDIRECT_JUMP_NON_CALL_RET", .udesc = "All indirect branches that are not calls nor returns", .ucode = 0xc400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_DIRECT_NEAR_CALL", .udesc = "All non-indirect calls", .ucode = 0xd000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_DIRECT_JUMP", .udesc = "All direct jumps", .ucode = 0xc200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_INDIRECT_NEAR_RET", .udesc = "All indirect near returns", .ucode = 0xc800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_br_inst_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "All taken and not taken macro branches including far branches (Precise Event)", .ucode = 0x0000, /* architectural encoding */ .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "COND", .udesc = "All taken and not taken macro conditional branch instructions (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "FAR_BRANCH", .udesc = "Number of far branch instructions retired (Precise Event)", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_CALL", .udesc = "All macro direct and indirect near calls, does not count far calls (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO |
INTEL_X86_PEBS, }, { .uname = "NEAR_RETURN", .udesc = "Number of near ret instructions retired (Precise Event)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_TAKEN", .udesc = "Number of near branch taken instructions retired (Precise Event)", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NOT_TAKEN", .udesc = "All not taken macro branch instructions retired (Precise Event)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t ivb_br_misp_exec[]={ { .uname = "NONTAKEN_COND", .udesc = "All non-taken mispredicted macro conditional branch instructions", .ucode = 0x4100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_COND", .udesc = "All taken mispredicted macro conditional branch instructions", .ucode = 0x8100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_INDIRECT_JUMP_NON_CALL_RET", .udesc = "All taken mispredicted indirect branches that are not calls nor returns", .ucode = 0x8400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_NEAR_RETURN", .udesc = "All taken mispredicted indirect branches that have a return mnemonic", .ucode = 0x8800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_RETURN_NEAR", .udesc = "All taken mispredicted indirect branches that have a return mnemonic", .ucode = 0x8800, .uequiv ="TAKEN_NEAR_RETURN", .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_INDIRECT_NEAR_CALL", .udesc = "All taken mispredicted indirect calls, including both register and memory indirect", .ucode = 0xa000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_COND", .udesc = "All mispredicted macro conditional branch instructions", .ucode = 0xc100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_INDIRECT_JUMP_NON_CALL_RET", .udesc = "All mispredicted indirect branches that are not calls nor returns", .ucode = 0xc400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_BRANCHES", .udesc = "All mispredicted branch instructions", .ucode = 0xff00, .uflags= INTEL_X86_NCOMBO | 
INTEL_X86_DFL, }, }; static const intel_x86_umask_t ivb_br_misp_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "All mispredicted macro branches (Precise Event)", .ucode = 0x0000, /* architectural encoding */ .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "COND", .udesc = "All mispredicted macro conditional branch instructions (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "CONDITIONAL", .udesc = "All mispredicted macro conditional branch instructions (Precise Event)", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .uequiv = "COND", }, { .uname = "NEAR_TAKEN", .udesc = "Number of branch instructions retired that were mispredicted and taken (Precise Event)", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t ivb_lock_cycles[]={ { .uname = "SPLIT_LOCK_UC_LOCK_DURATION", .udesc = "Cycles in which the L1D and L2 are locked, due to a UC lock or split lock", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CACHE_LOCK_DURATION", .udesc = "Cycles in which the L1D is locked", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_cpl_cycles[]={ { .uname = "RING0", .udesc = "Unhalted core cycles the thread was in ring 0", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RING0_TRANS", .udesc = "Transitions from rings 1, 2, or 3 to ring 0", .uequiv = "RING0:c=1:e=1", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "RING123", .udesc = "Unhalted core cycles the thread was in rings 1, 2, or 3", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_cpu_clk_unhalted[]={ { .uname = "REF_P", .udesc = "Cycles when the core is unhalted (count at 100 Mhz)", .ucode = 0x100, .uequiv = "REF_XCLK", .uflags= INTEL_X86_NCOMBO, }, { .uname = "REF_XCLK", .udesc = "Count Xclk pulses (100Mhz) when the core is unhalted", .ucode = 
0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "REF_XCLK_ANY", .udesc = "Count Xclk pulses (100Mhz) when at least one thread on the physical core is unhalted", .ucode = 0x100 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "REF_XCLK:t", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "THREAD_P", .udesc = "Cycles when thread is not halted", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ONE_THREAD_ACTIVE", .udesc = "Counts Xclk (100Mhz) pulses when this thread is unhalted and the other thread is halted", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_dsb2mite_switches[]={ { .uname = "COUNT", .udesc = "Number of DSB to MITE switches", .ucode = 0x0100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PENALTY_CYCLES", .udesc = "Number of DSB to MITE switch true penalty cycles", .ucode = 0x0200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t ivb_dsb_fill[]={ { .uname = "EXCEED_DSB_LINES", .udesc = "DSB Fill encountered > 3 DSB lines", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t ivb_dtlb_load_misses[]={ { .uname = "MISS_CAUSES_A_WALK", .udesc = "Demand load miss in all TLB levels which causes a page walk of any page size", .ucode = 0x8100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "Demand load miss in all TLB levels which causes a page walk that completes for any page size", .ucode = 0x8200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_DURATION", .udesc = "Cycles PMH is busy with a walk due to demand loads", .ucode = 0x8400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_LD_MISS_CAUSES_A_WALK", .udesc = "Demand load miss in all TLB levels which causes a page walk of any page size", .ucode = 0x8100, .uequiv = "MISS_CAUSES_A_WALK", .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_LD_WALK_COMPLETED", .udesc = "Demand load miss in all TLB levels which causes a page walk that completes
for any page size", .ucode = 0x8200, .uequiv = "WALK_COMPLETED", .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_LD_WALK_DURATION", .udesc = "Cycles PMH is busy with a walk due to demand loads", .ucode = 0x8400, .uequiv = "WALK_DURATION", .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT", .udesc = "Number of load operations that missed L1TLB but hit L2TLB", .ucode = 0x45f, /* override event code */ .uflags= INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "LARGE_WALK_COMPLETED", .udesc = "Number of large page walks completed for demand loads", .ucode = 0x8800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_itlb_misses[]={ { .uname = "MISS_CAUSES_A_WALK", .udesc = "Miss in all TLB levels that causes a page walk of any page size (4K/2M/4M/1G)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CAUSES_A_WALK", .udesc = "Miss in all TLB levels that causes a page walk of any page size (4K/2M/4M/1G)", .ucode = 0x100, .uequiv = "MISS_CAUSES_A_WALK", .uflags= INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT", .udesc = "First level miss but second level hit; no page walk. 
Only relevant if multiple levels", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "Miss in all TLB levels that causes a page walk that completes of any page size (4K/2M/4M/1G)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_DURATION", .udesc = "Cycles PMH is busy with this walk", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LARGE_PAGE_WALK_COMPLETED", .udesc = "Number of completed page walks in ITLB due to STLB load misses for large pages", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_dtlb_store_misses[]={ { .uname = "MISS_CAUSES_A_WALK", .udesc = "Miss in all TLB levels that causes a page walk of any page size (4K/2M/4M/1G)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CAUSES_A_WALK", .udesc = "Miss in all TLB levels that causes a page walk of any page size (4K/2M/4M/1G)", .ucode = 0x100, .uequiv = "MISS_CAUSES_A_WALK", .uflags= INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT", .udesc = "First level miss but second level hit; no page walk. 
Only relevant if multiple levels", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "Miss in all TLB levels that causes a page walk that completes of any page size (4K/2M/4M/1G)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_DURATION", .udesc = "Cycles PMH is busy with this walk", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_fp_assist[]={ { .uname = "ANY", .udesc = "Cycles with any input/output SSE or FP assists", .ucode = 0x1e00 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "SIMD_INPUT", .udesc = "Number of SIMD FP assists due to input values", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SIMD_OUTPUT", .udesc = "Number of SIMD FP assists due to output values", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "X87_INPUT", .udesc = "Number of X87 assists due to input value", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "X87_OUTPUT", .udesc = "Number of X87 assists due to output value", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_icache[]={ { .uname = "MISSES", .udesc = "Number of Instruction Cache, Streaming Buffer and Victim Cache Misses. Includes UC accesses", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "IFETCH_STALL", .udesc = "Number of cycles where a code-fetch stalled due to L1 instruction cache miss or iTLB miss", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "HIT", .udesc = "Number of Instruction Cache, Streaming Buffer and Victim Cache Reads.
Includes cacheable and uncacheable accesses and uncacheable fetches", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_idq[]={ { .uname = "EMPTY", .udesc = "Cycles IDQ is empty", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MITE_UOPS", .udesc = "Number of uops delivered to IDQ from MITE path", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DSB_UOPS", .udesc = "Number of uops delivered to IDQ from DSB path", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_DSB_UOPS", .udesc = "Number of uops delivered to IDQ when MS busy by DSB", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_MITE_UOPS", .udesc = "Number of uops delivered to IDQ when MS busy by MITE", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_UOPS", .udesc = "Number of uops delivered to IDQ from MS by either DSB or MITE", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MITE_UOPS_CYCLES", .udesc = "Cycles where uops are delivered to IDQ from MITE (MITE active)", .uequiv = "MITE_UOPS:c=1", .ucode = 0x400 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "DSB_UOPS_CYCLES", .udesc = "Cycles where uops are delivered to IDQ from DSB (DSB active)", .ucode = 0x800 | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_DSB_UOPS_CYCLES", .udesc = "Cycles where uops delivered to IDQ when MS busy by DSB", .uequiv = "MS_DSB_UOPS:c=1", .ucode = 0x1000 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_MITE_UOPS_CYCLES", .udesc = "Cycles where uops delivered to IDQ when MS busy by MITE", .uequiv = "MS_MITE_UOPS:c=1", .ucode = 0x2000 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_UOPS_CYCLES", .udesc = "Cycles where uops delivered to IDQ from MS by either DSB or MITE", .uequiv = "MS_UOPS:c=1", .ucode = 0x3000 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname =
"MS_SWITCHES", .udesc = "Number of cycles that Uops were delivered into Instruction Decode Queue (IDQ) when MS_Busy, initiated by Decode Stream Buffer (DSB) or MITE", .ucode = 0x3000 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "MS_UOPS:c=1:e", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, { .uname = "ALL_DSB_UOPS", .udesc = "Number of uops delivered from either DSB paths", .ucode = 0x1800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DSB_CYCLES", .udesc = "Cycles MITE/MS delivered anything", .ucode = 0x1800 | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DSB_CYCLES_4_UOPS", .udesc = "Cycles MITE/MS delivered 4 uops", .ucode = 0x1800 | (0x4 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_MITE_UOPS", .udesc = "Number of uops delivered from either MITE paths", .ucode = 0x2400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_MITE_CYCLES", .udesc = "Cycles DSB/MS delivered anything", .ucode = 0x2400 | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_MITE_CYCLES_4_UOPS", .udesc = "Cycles MITE is delivering 4 uops", .ucode = 0x2400 | (0x4 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_UOPS", .udesc = "Number of uops delivered to IDQ from any path", .ucode = 0x3c00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_DSB_UOPS_OCCUR", .udesc = "Occurrences of DSB MS going active", .uequiv = "MS_DSB_UOPS:c=1:e=1", .ucode = 0x1000 | INTEL_X86_MOD_EDGE | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_idq_uops_not_delivered[]={ { .uname = "CORE", .udesc = "Number of non-delivered uops to RAT (use cmask to qualify further)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CYCLES_0_UOPS_DELIV_CORE", .udesc = "Cycles per thread when 
4 or more uops are not delivered to the Resource Allocation Table (RAT) when backend is not stalled", .ucode = 0x100 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uflags = INTEL_X86_NCOMBO, .uequiv = "CORE:c=4", .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_LE_1_UOP_DELIV_CORE", .udesc = "Cycles per thread when 3 or more uops are not delivered to the Resource Allocation Table (RAT) when backend is not stalled", .ucode = 0x100 | (3 << INTEL_X86_CMASK_BIT), /* cnt=3 */ .uequiv = "CORE:c=3", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_LE_2_UOP_DELIV_CORE", .udesc = "Cycles with less than 2 uops delivered by the front end", .ucode = 0x100 | (2 << INTEL_X86_CMASK_BIT), /* cnt=2 */ .uequiv = "CORE:c=2", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_LE_3_UOP_DELIV_CORE", .udesc = "Cycles with less than 3 uops delivered by the front end", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "CORE:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_FE_WAS_OK", .udesc = "Cycles Front-End (FE) delivered 4 uops or Resource Allocation Table (RAT) was stalling FE", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 inv=1 */ .uequiv = "CORE:c=1:i", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_I, }, }; static const intel_x86_umask_t ivb_ild_stall[]={ { .uname = "LCP", .udesc = "Stall caused by changing prefix length of the instruction", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "IQ_FULL", .udesc = "Stall cycles due to IQ full", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_inst_retired[]={ { .uname = "ANY_P", .udesc = "Number of instructions retired", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL", .udesc = "Precise instruction retired event to reduce effect of PEBS shadow IP distribution (Precise Event)", .ucntmsk = 0x2, .ucode = 0x100, 
.uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "PREC_DIST", .udesc = "Precise instruction retired event to reduce effect of PEBS shadow IP distribution (Precise Event)", .ucntmsk = 0x2, .uequiv = "ALL", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t ivb_itlb[]={ { .uname = "ITLB_FLUSH", .udesc = "Number of ITLB flushes, includes 4k/2M/4M pages", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FLUSH", .udesc = "Number of ITLB flushes, includes 4k/2M/4M pages", .ucode = 0x100, .uequiv = "ITLB_FLUSH", .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_l1d[]={ { .uname = "REPLACEMENT", .udesc = "Number of cache lines brought into the L1D cache", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t ivb_move_elimination[]={ { .uname = "INT_NOT_ELIMINATED", .udesc = "Number of integer Move Elimination candidate uops that were not eliminated", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SIMD_NOT_ELIMINATED", .udesc = "Number of SIMD Move Elimination candidate uops that were not eliminated", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "INT_ELIMINATED", .udesc = "Number of integer Move Elimination candidate uops that were eliminated", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SIMD_ELIMINATED", .udesc = "Number of SIMD Move Elimination candidate uops that were eliminated", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_l1d_pend_miss[]={ { .uname = "OCCURRENCES", .udesc = "Occurrences of L1D_PEND_MISS going active", .uequiv = "PENDING:e=1:c=1", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "EDGE", .udesc = "Occurrences of L1D_PEND_MISS going active", .uequiv = "OCCURRENCES", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "PENDING", 
.udesc = "Number of L1D load misses outstanding every cycle", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PENDING_CYCLES", .udesc = "Cycles with L1D load misses outstanding", .uequiv = "PENDING:c=1", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "PENDING_CYCLES_ANY", .udesc = "Cycles with L1D load misses outstanding from any thread on the physical core", .uequiv = "PENDING:c=1:t", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT) | INTEL_X86_MOD_ANY, .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_T, }, { .uname = "FB_FULL", .udesc = "Number of cycles a demand request was blocked due to Fill Buffer (FB) unavailability", .ucode = 0x200 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t ivb_l2_l1d_wb_rqsts[]={ { .uname = "HIT_E", .udesc = "Non rejected writebacks from L1D to L2 cache lines in E state", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "HIT_M", .udesc = "Non rejected writebacks from L1D to L2 cache lines in M state", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MISS", .udesc = "Not rejected writebacks that missed LLC", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL", .udesc = "Not rejected writebacks from L1D to L2 cache lines in any state", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_l2_lines_in[]={ { .uname = "ANY", .udesc = "L2 cache lines filling (counting does not cover rejects)", .ucode = 0x700, .uequiv = "ALL", .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL", .udesc = "L2 cache lines filling (counting does not cover rejects)", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "E", .udesc = "L2 cache lines in E state (counting does not cover rejects)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "I", .udesc = "L2 cache lines in I state (counting does not cover rejects)", .ucode = 
0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "S", .udesc = "L2 cache lines in S state (counting does not cover rejects)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_l2_lines_out[]={ { .uname = "DEMAND_CLEAN", .udesc = "L2 clean line evicted by a demand", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DIRTY", .udesc = "L2 dirty line evicted by a demand", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_CLEAN", .udesc = "L2 clean line evicted by a prefetch", .ucode = 0x400, .uequiv = "PF_CLEAN", .uflags= INTEL_X86_NCOMBO, }, { .uname = "PF_CLEAN", .udesc = "L2 clean line evicted by a prefetch", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_DIRTY", .udesc = "L2 dirty line evicted by an MLC Prefetch", .ucode = 0x800, .uequiv = "PF_DIRTY", .uflags= INTEL_X86_NCOMBO, }, { .uname = "PF_DIRTY", .udesc = "L2 dirty line evicted by an MLC Prefetch", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DIRTY_ANY", .udesc = "Any L2 dirty line evicted (does not cover rejects)", .ucode = 0xa00, .uequiv = "DIRTY_ALL", .uflags= INTEL_X86_NCOMBO, }, { .uname = "DIRTY_ALL", .udesc = "Any L2 dirty line evicted (does not cover rejects)", .ucode = 0xa00, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_l2_rqsts[]={ { .uname = "ALL_CODE_RD", .udesc = "Any code request to L2 cache", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_HIT", .udesc = "L2 cache hits when fetching instructions", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_MISS", .udesc = "L2 cache misses when fetching instructions", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_DATA_RD", .udesc = "Demand data read requests to L2 cache", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_HIT", .udesc = "Demand data read requests that hit L2", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_PF", .udesc = "Any 
L2 HW prefetch request to L2 cache", .ucode = 0xc000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PF_HIT", .udesc = "Requests from the L2 hardware prefetchers that hit L2 cache", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PF_MISS", .udesc = "Requests from the L2 hardware prefetchers that miss L2 cache", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_RFO", .udesc = "Any RFO requests to L2 cache", .ucode = 0xc00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT", .udesc = "Store RFO requests that hit L2 cache", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_MISS", .udesc = "RFO requests that miss L2 cache", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_l2_store_lock_rqsts[]={ { .uname = "MISS", .udesc = "RFOs that miss cache (I state)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "HIT_M", .udesc = "RFOs that hit cache lines in M state", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL", .udesc = "RFOs that access cache lines in any state", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t ivb_l2_trans[]={ { .uname = "ALL", .udesc = "Transactions accessing the L2 pipe", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CODE_RD", .udesc = "L2 cache accesses when fetching instructions", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L1D_WB", .udesc = "L1D writebacks that access the L2 cache", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DMND_DATA_RD", .udesc = "Demand Data Read requests that access the L2 cache", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L2_FILL", .udesc = "L2 fill requests that access the L2 cache", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L2_WB", .udesc = "L2 writebacks that access the L2 cache", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_PREFETCH", .udesc = "L2 or L3 HW prefetches that access 
the L2 cache (including rejects)", .ucode = 0x800, .uequiv = "ALL_PF", .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_PF", .udesc = "L2 or L3 HW prefetches that access the L2 cache (including rejects)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO", .udesc = "RFO requests that access the L2 cache", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_ld_blocks[]={ { .uname = "STORE_FORWARD", .udesc = "Loads blocked by overlapping with store buffer that cannot be forwarded", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "NO_SR", .udesc = "Number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_ld_blocks_partial[]={ { .uname = "ADDRESS_ALIAS", .udesc = "False dependencies in MOB due to partial compare on address", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t ivb_load_hit_pre[]={ { .uname = "HW_PF", .udesc = "Non sw-prefetch load dispatches that hit the fill buffer allocated for HW prefetch", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SW_PF", .udesc = "Non sw-prefetch load dispatches that hit the fill buffer allocated for SW prefetch", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_l3_lat_cache[]={ { .uname = "MISS", .udesc = "Core-originated cacheable demand requests missed L3", .ucode = 0x4100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "REFERENCE", .udesc = "Core-originated cacheable demand requests that refer to L3", .ucode = 0x4f00, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_machine_clears[]={ { .uname = "MASKMOV", .udesc = "The number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MEMORY_ORDERING", 
.udesc = "Number of Memory Ordering Machine Clears detected", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SMC", .udesc = "Self-Modifying Code detected", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "COUNT", .udesc = "Number of machine clears (nukes) of any type", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t ivb_mem_load_uops_llc_hit_retired[]={ { .uname = "XSNP_HIT", .udesc = "Load LLC Hit and a cross-core Snoop hits in on-pkg core cache (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_HITM", .udesc = "Load had HitM Response from a core on same socket (shared LLC) (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_MISS", .udesc = "Load LLC Hit and a cross-core Snoop missed in on-pkg core cache (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_NONE", .udesc = "Load hit in last-level (L3) cache with no snoop needed (Precise Event)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t ivb_mem_load_uops_llc_miss_retired[]={ { .uname = "LOCAL_DRAM", .udesc = "Number of retired load uops that missed L3 but were serviced by local RAM.
Does not count hardware prefetches (Precise Event)", .ucode = 0x100, .uflags = INTEL_X86_DFL | INTEL_X86_PEBS, }, { .uname = "REMOTE_DRAM", .udesc = "Number of retired load uops that missed L3 but were serviced by remote RAM, snoop not needed, snoop miss, snoop hit data not forwarded (Precise Event)", .ucode = 0xc00, .umodel = PFM_PMU_INTEL_IVB_EP, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_HITM", .udesc = "Number of retired load uops whose data source was remote HITM (Precise Event)", .ucode = 0x1000, .umodel = PFM_PMU_INTEL_IVB_EP, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_FWD", .udesc = "Load uops that miss in the L3 whose data source was forwarded from a remote cache (Precise Event)", .ucode = 0x2000, .umodel = PFM_PMU_INTEL_IVB_EP, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t ivb_mem_load_uops_retired[]={ { .uname = "HIT_LFB", .udesc = "A load missed L1D but hit the Fill Buffer (Precise Event)", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1_MISS", .udesc = "Load miss in nearest-level (L1D) cache (Precise Event)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1_HIT", .udesc = "Load hit in nearest-level (L1D) cache (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_HIT", .udesc = "Load hit in mid-level (L2) cache (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_MISS", .udesc = "Load misses in mid-level (L2) cache (Precise Event)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_HIT", .udesc = "Load hit in last-level (L3) cache with no snoop needed (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_MISS", .udesc = "Load miss in last-level (L3) cache (Precise Event)", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const
intel_x86_umask_t ivb_mem_trans_retired[]={ { .uname = "LATENCY_ABOVE_THRESHOLD", .udesc = "Memory load instructions retired above programmed clocks, minimum threshold value is 3 (Precise Event and ldlat required)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT, }, { .uname = "PRECISE_STORE", .udesc = "Capture where stores occur, must use with PEBS (Precise Event required)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t ivb_mem_uops_retired[]={ { .uname = "ALL_LOADS", .udesc = "Any retired loads (Precise Event)", .ucode = 0x8100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY_LOADS", .udesc = "Any retired loads (Precise Event)", .ucode = 0x8100, .uequiv = "ALL_LOADS", .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_STORES", .udesc = "Any retired stores (Precise Event)", .ucode = 0x8200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOCK_LOADS", .udesc = "Locked retired loads (Precise Event)", .ucode = 0x2100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY_STORES", .udesc = "Any retired stores (Precise Event)", .ucode = 0x8200, .uequiv = "ALL_STORES", .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_LOADS", .udesc = "Retired loads causing cacheline splits (Precise Event)", .ucode = 0x4100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_STORES", .udesc = "Retired stores causing cacheline splits (Precise Event)", .ucode = 0x4200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS_LOADS", .udesc = "STLB misses due to retired loads (Precise Event)", .ucode = 0x1100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS_STORES", .udesc = "STLB misses due to retired stores (Precise Event)", .ucode = 0x1200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t ivb_misalign_mem_ref[]={ { .uname = "LOADS", .udesc = "Speculative cache-line
split load uops dispatched to the L1D", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "STORES", .udesc = "Speculative cache-line split Store-address uops dispatched to L1D", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_offcore_requests[]={ { .uname = "ALL_DATA_RD", .udesc = "Demand and prefetch read requests sent to uncore", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DATA_READ", .udesc = "Demand and prefetch read requests sent to uncore", .uequiv = "ALL_DATA_RD", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD", .udesc = "Offcore code read requests, including cacheable and un-cacheable", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD", .udesc = "Demand Data Read requests sent to uncore", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO", .udesc = "Offcore Demand RFOs, includes regular RFO, Locks, ItoM", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_offcore_requests_outstanding[]={ { .uname = "ALL_DATA_RD_CYCLES", .udesc = "Cycles with cacheable data read transactions in the superQ", .uequiv = "ALL_DATA_RD:c=1", .ucode = 0x800 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_CYCLES", .udesc = "Cycles with demand code read transactions in the superQ", .uequiv = "DEMAND_CODE_RD:c=1", .ucode = 0x200 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_CYCLES", .udesc = "Cycles with demand data read transactions in the superQ", .uequiv = "DEMAND_DATA_RD:c=1", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DATA_RD", .udesc = "Cacheable data read transactions in the superQ every cycle", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD", .udesc = "Code read transactions in the superQ every cycle", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname =
"DEMAND_DATA_RD", .udesc = "Demand data read transactions in the superQ every cycle", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_GE_6", .udesc = "Cycles with at least 6 offcore outstanding demand data read requests in the uncore queue", .uequiv = "DEMAND_DATA_RD:c=6", .ucode = 0x100 | (6 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DEMAND_RFO", .udesc = "Outstanding RFO (store) transactions in the superQ every cycle", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_CYCLES", .udesc = "Cycles with outstanding RFO (store) transactions in the superQ", .uequiv = "DEMAND_RFO:c=1", .ucode = 0x400 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_other_assists[]={ { .uname = "AVX_TO_SSE", .udesc = "Number of transitions from AVX-256 to legacy SSE when penalty applicable", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_TO_AVX", .udesc = "Number of transitions from legacy SSE to AVX-256 when penalty applicable", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "AVX_STORE", .udesc = "Number of assists associated with 256-bit AVX stores", .ucode = 0x0800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WB", .udesc = "Number of times the microcode assist is invoked by hardware upon uop writeback", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_resource_stalls[]={ { .uname = "ANY", .udesc = "Cycles stalled due to Resource Related reason", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "RS", .udesc = "Cycles stalled due to no eligible RS entry available", .ucode = 0x400, }, { .uname = "SB", .udesc = "Cycles stalled due to no store buffers available (not including draining from sync)", .ucode = 0x800, }, { .uname = "ROB", .udesc = "Cycles stalled due to re-order buffer full", .ucode = 0x1000, }, }; static const intel_x86_umask_t
ivb_rob_misc_events[]={ { .uname = "LBR_INSERTS", .udesc = "Count each time a new LBR record is saved by HW", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t ivb_rs_events[]={ { .uname = "EMPTY_CYCLES", .udesc = "Cycles the RS is empty for this thread", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "EMPTY_END", .udesc = "Counts number of times the Reservation Station (RS) goes from empty to non-empty", .ucode = 0x100 | INTEL_X86_MOD_INV | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* inv=1 edge=1 cnt=1 */ .uequiv = "EMPTY_CYCLES:c=1:e:i", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t ivb_tlb_access[]={ { .uname = "STLB_HIT", .udesc = "Number of load operations that missed L1TLB but hit L2TLB", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "LOAD_STLB_HIT", .udesc = "Number of load operations that missed L1TLB but hit L2TLB", .ucode = 0x400, .uequiv= "STLB_HIT", .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_tlb_flush[]={ { .uname = "DTLB_THREAD", .udesc = "Number of DTLB flushes of thread-specific entries", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "STLB_ANY", .udesc = "Number of STLB flushes", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t ivb_uops_executed[]={ { .uname = "CORE", .udesc = "Counts total number of uops executed from any thread per cycle", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "THREAD", .udesc = "Counts total number of uops executed per thread each cycle", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "STALL_CYCLES", .udesc = "Number of cycles with no uops executed", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname =
"CYCLES_GE_1_UOP_EXEC", .udesc = "Cycles where at least 1 uop was executed per thread", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_2_UOPS_EXEC", .udesc = "Cycles where at least 2 uops were executed per thread", .ucode = 0x100 | (2 << INTEL_X86_CMASK_BIT), /* cnt=2 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_3_UOPS_EXEC", .udesc = "Cycles where at least 3 uops were executed per thread", .ucode = 0x100 | (3 << INTEL_X86_CMASK_BIT), /* cnt=3 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_4_UOPS_EXEC", .udesc = "Cycles where at least 4 uops were executed per thread", .ucode = 0x100 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_1", .udesc = "Cycles where at least 1 uop was executed from any thread", .ucode = 0x200 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "CORE:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_2", .udesc = "Cycles where at least 2 uops were executed from any thread", .ucode = 0x200 | (2 << INTEL_X86_CMASK_BIT), /* cnt=2 */ .uequiv = "CORE:c=2", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_3", .udesc = "Cycles where at least 3 uops were executed from any thread", .ucode = 0x200 | (3 << INTEL_X86_CMASK_BIT), /* cnt=3 */ .uequiv = "CORE:c=3", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_4", .udesc = "Cycles where at least 4 uops were executed from any thread", .ucode = 0x200 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uequiv = "CORE:c=4", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_NONE", .udesc = "Cycles where no uop is executed on any thread", .ucode = 0x200 | INTEL_X86_MOD_INV, /* inv=1 */ .uequiv = "CORE:i", .uflags = 
INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I, }, }; static const intel_x86_umask_t ivb_uops_dispatched_port[]={ { .uname = "PORT_0", .udesc = "Cycles in which a uop is dispatched on port 0", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT_1", .udesc = "Cycles in which a uop is dispatched on port 1", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT_2", .udesc = "Cycles in which a uop is dispatched on port 2", .ucode = 0xc00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT_3", .udesc = "Cycles in which a uop is dispatched on port 3", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT_4", .udesc = "Cycles in which a uop is dispatched on port 4", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT_5", .udesc = "Cycles in which a uop is dispatched on port 5", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT_0_CORE", .udesc = "Cycles in which a uop is dispatched on port 0 for any thread", .ucode = 0x100 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_0:t", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_1_CORE", .udesc = "Cycles in which a uop is dispatched on port 1 for any thread", .ucode = 0x200 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_1:t", .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_2_CORE", .udesc = "Cycles in which a uop is dispatched on port 2 for any thread", .ucode = 0xc00 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_2:t", .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_3_CORE", .udesc = "Cycles in which a uop is dispatched on port 3 for any thread", .ucode = 0x3000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_3:t", .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_4_CORE", .udesc = "Cycles in which a uop is dispatched on port 4 for any thread", .ucode = 0x4000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_4:t", .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, 
}, { .uname = "PORT_5_CORE", .udesc = "Cycles in which a uop is dispatched on port 5 for any thread", .ucode = 0x8000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_5:t", .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, }; static const intel_x86_umask_t ivb_uops_issued[]={ { .uname = "ANY", .udesc = "Number of uops issued by the RAT to the Reservation Station (RS)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CORE_STALL_CYCLES", .udesc = "Cycles no uops issued on this core (by any thread)", .uequiv = "ANY:c=1:i=1:t=1", .ucode = 0x100 | INTEL_X86_MOD_ANY | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "STALL_CYCLES", .udesc = "Cycles no uops issued by this thread", .uequiv = "ANY:c=1:i=1", .ucode = 0x100 | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "FLAGS_MERGE", .udesc = "Number of flags-merge uops allocated. Such uops add delay", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SLOW_LEA", .udesc = "Number of slow LEA or similar uops allocated", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SINGLE_MUL", .udesc = "Number of multiply packed/scalar single precision uops allocated", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_uops_retired[]={ { .uname = "ALL", .udesc = "All uops that actually retired (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "ANY", .udesc = "All uops that actually retired (Precise Event)", .ucode = 0x100, .uequiv= "ALL", .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RETIRE_SLOTS", .udesc = "Number of retirement slots used (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STALL_CYCLES", .udesc = "Cycles no executable uop retired (Precise Event)", .uequiv = "ALL:c=1:i=1", .ucode = 0x100 | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT),
.uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "TOTAL_CYCLES", .udesc = "Total cycles using precise uop retired event (Precise Event)", .ucode = 0x100 | INTEL_X86_MOD_INV | (10 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=10 */ .uequiv = "ALL:c=10:i", .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t ivb_offcore_response[]={ { .uname = "DMND_DATA_RD", .udesc = "Request: number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches", .ucode = 1ULL << (0 + 8), .grpid = 0, }, { .uname = "DMND_RFO", .udesc = "Request: number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO prefetches", .ucode = 1ULL << (1 + 8), .grpid = 0, }, { .uname = "DMND_IFETCH", .udesc = "Request: number of demand and DCU prefetch instruction cacheline reads. 
Does not count L2 code read prefetches", .ucode = 1ULL << (2 + 8), .grpid = 0, }, { .uname = "WB", .udesc = "Request: number of writebacks (modified to exclusive) transactions", .ucode = 1ULL << (3 + 8), .grpid = 0, }, { .uname = "PF_DATA_RD", .udesc = "Request: number of data cacheline reads generated by L2 prefetchers", .ucode = 1ULL << (4 + 8), .grpid = 0, }, { .uname = "PF_RFO", .udesc = "Request: number of RFO requests generated by L2 prefetchers", .ucode = 1ULL << (5 + 8), .grpid = 0, }, { .uname = "PF_IFETCH", .udesc = "Request: number of code reads generated by L2 prefetchers", .ucode = 1ULL << (6 + 8), .grpid = 0, }, { .uname = "PF_LLC_DATA_RD", .udesc = "Request: number of L3 prefetcher requests to L2 for loads", .ucode = 1ULL << (7 + 8), .grpid = 0, }, { .uname = "PF_LLC_RFO", .udesc = "Request: number of RFO requests generated by L2 prefetcher", .ucode = 1ULL << (8 + 8), .grpid = 0, }, { .uname = "PF_LLC_IFETCH", .udesc = "Request: number of L2 prefetcher requests to L3 for instruction fetches", .ucode = 1ULL << (9 + 8), .grpid = 0, }, { .uname = "BUS_LOCKS", .udesc = "Request: number of bus lock and split lock requests", .ucode = 1ULL << (10 + 8), .grpid = 0, }, { .uname = "STRM_ST", .udesc = "Request: number of streaming store requests", .ucode = 1ULL << (11 + 8), .grpid = 0, }, { .uname = "OTHER", .udesc = "Request: counts one of the following transaction types, including L3 invalidate, I/O, full or partial writes, WC or non-temporal stores, CLFLUSH, Fences, lock, unlock, split lock", .ucode = 1ULL << (15+8), .grpid = 0, }, { .uname = "ANY_IFETCH", .udesc = "Request: combination of PF_IFETCH | DMND_IFETCH | PF_LLC_IFETCH", .uequiv = "PF_IFETCH:DMND_IFETCH:PF_LLC_IFETCH", .ucode = 0x24400, .grpid = 0, }, { .uname = "ANY_REQUEST", .udesc = "Request: combination of all request umasks", .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_IFETCH:WB:PF_DATA_RD:PF_RFO:PF_IFETCH:PF_LLC_DATA_RD:PF_LLC_RFO:PF_LLC_IFETCH:BUS_LOCKS:STRM_ST:OTHER", .ucode = 0x8fff00, .uflags=
INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "ANY_DATA", .udesc = "Request: combination of DMND_DATA | PF_DATA_RD | PF_LLC_DATA_RD", .uequiv = "DMND_DATA_RD:PF_DATA_RD:PF_LLC_DATA_RD", .ucode = 0x9100, .grpid = 0, }, { .uname = "ANY_RFO", .udesc = "Request: combination of DMND_RFO | PF_RFO | PF_LLC_RFO", .uequiv = "DMND_RFO:PF_RFO:PF_LLC_RFO", .ucode = 0x12200, .grpid = 0, }, { .uname = "ANY_RESPONSE", .udesc = "Response: count any response type", .ucode = 1ULL << (16+8), .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, .grpid = 1, }, { .uname = "NO_SUPP", .udesc = "Supplier: counts number of times supplier information is not available", .ucode = 1ULL << (17+8), .grpid = 1, }, { .uname = "LLC_HITM", .udesc = "Supplier: counts L3 hits in M-state (initial lookup)", .ucode = 1ULL << (18+8), .grpid = 1, }, { .uname = "LLC_HITE", .udesc = "Supplier: counts L3 hits in E-state", .ucode = 1ULL << (19+8), .grpid = 1, }, { .uname = "LLC_HITS", .udesc = "Supplier: counts L3 hits in S-state", .ucode = 1ULL << (20+8), .grpid = 1, }, { .uname = "LLC_HITF", .udesc = "Supplier: counts L3 hits in F-state", .ucode = 1ULL << (21+8), .grpid = 1, }, { .uname = "LLC_MISS_LOCAL", .udesc = "Supplier: counts L3 misses to local DRAM", .ucode = 1ULL << (22+8), .grpid = 1, }, { .uname = "LLC_MISS_REMOTE", .udesc = "Supplier: counts L3 misses to remote DRAM", .ucode = 0xffULL << (23+8), .uequiv = "LLC_MISS_REMOTE_DRAM", .umodel = PFM_PMU_INTEL_IVB_EP, .grpid = 1, }, { .uname = "L3_MISS", .udesc = "Supplier: counts L3 misses to local DRAM", .ucode = 0x1ULL << (22+8), .grpid = 1, .uequiv = "LLC_MISS_LOCAL", .umodel = PFM_PMU_INTEL_IVB, }, { .uname = "L3_MISS", .udesc = "Supplier: counts L3 misses to local or remote DRAM", .ucode = 0x3ULL << (22+8), .uequiv = "LLC_MISS_LOCAL:LLC_MISS_REMOTE", .umodel = PFM_PMU_INTEL_IVB_EP, .grpid = 1, }, { .uname = "LLC_MISS_REMOTE_DRAM", .udesc = "Supplier: counts L3 misses to remote DRAM", .ucode = 0xffULL << (23+8), .umodel = 
PFM_PMU_INTEL_IVB_EP, .grpid = 1, }, { .uname = "LLC_HITMESF", .udesc = "Supplier: counts L3 hits in any state (M, E, S, F)", .ucode = 0xfULL << (18+8), .uequiv = "LLC_HITM:LLC_HITE:LLC_HITS:LLC_HITF", .grpid = 1, }, { .uname = "SNP_NONE", .udesc = "Snoop: counts number of times no snoop-related information is available", .ucode = 1ULL << (31+8), .grpid = 2, }, { .uname = "SNP_NOT_NEEDED", .udesc = "Snoop: counts the number of times no snoop was needed to satisfy the request", .ucode = 1ULL << (32+8), .grpid = 2, }, { .uname = "SNP_MISS", .udesc = "Snoop: counts number of times a snoop was needed and it missed all snooped caches", .ucode = 1ULL << (33+8), .grpid = 2, }, { .uname = "SNP_NO_FWD", .udesc = "Snoop: counts number of times a snoop was needed and it hit in at least one snooped cache", .ucode = 1ULL << (34+8), .grpid = 2, }, { .uname = "SNP_FWD", .udesc = "Snoop: counts number of times a snoop was needed and data was forwarded from a remote socket", .ucode = 1ULL << (35+8), .grpid = 2, }, { .uname = "HITM", .udesc = "Snoop: counts number of times a snoop was needed and it hitM-ed in local or remote cache", .ucode = 1ULL << (36+8), .grpid = 2, }, { .uname = "NON_DRAM", .udesc = "Snoop: counts number of times target was a non-DRAM system address.
This includes MMIO transactions", .ucode = 1ULL << (37+8), .grpid = 2, }, { .uname = "SNP_ANY", .udesc = "Snoop: any snoop reason", .ucode = 0x7fULL << (31+8), .uequiv = "SNP_NONE:SNP_NOT_NEEDED:SNP_MISS:SNP_NO_FWD:SNP_FWD:HITM:NON_DRAM", .uflags= INTEL_X86_DFL, .grpid = 2, }, }; static const intel_x86_umask_t ivb_baclears[]={ { .uname = "ANY", .udesc = "Counts the number of times the front end is re-steered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end", .ucode = 0x1f00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t ivb_cycle_activity[]={ { .uname = "CYCLES_L2_PENDING", .udesc = "Cycles with pending L2 miss loads", .ucode = 0x0100 | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, .ucntmsk= 0xf, }, { .uname = "CYCLES_LDM_PENDING", .udesc = "Cycles with pending memory loads", .ucode = 0x0200 | (0x2 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, .ucntmsk= 0xf, }, { .uname = "CYCLES_L1D_PENDING", .udesc = "Cycles with pending L1D load cache misses", .ucode = 0x0800 | (0x8 << INTEL_X86_CMASK_BIT), .ucntmsk= 0x4, .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_NO_EXECUTE", .udesc = "Cycles of dispatch stalls", .ucode = 0x0400 | (0x4 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .ucntmsk= 0xf, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_L2_PENDING", .udesc = "Execution stalls due to L2 pending loads", .ucode = 0x0500 | (0x5 << INTEL_X86_CMASK_BIT), .ucntmsk= 0xf, .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_L1D_PENDING", .udesc = "Execution stalls due to L1D pending loads", .ucode = 0x0c00 | (0xc << INTEL_X86_CMASK_BIT), .ucntmsk= 0x4, .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_LDM_PENDING", .udesc = "Execution stalls due to memory loads", .ucode = 0x0600 | (0x6 << 
INTEL_X86_CMASK_BIT), .ucntmsk= 0xf, .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_fp_comp_ops_exe[]={ { .uname = "X87", .udesc = "Number of X87 uops executed", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_FP_PACKED_DOUBLE", .udesc = "Number of SSE or AVX-128 double precision FP packed uops executed", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_FP_SCALAR_SINGLE", .udesc = "Number of SSE or AVX-128 single precision FP scalar uops executed", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_PACKED_SINGLE", .udesc = "Number of SSE or AVX-128 single precision FP packed uops executed", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_SCALAR_DOUBLE", .udesc = "Number of SSE or AVX-128 double precision FP scalar uops executed", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_simd_fp_256[]={ { .uname = "PACKED_SINGLE", .udesc = "Counts 256-bit packed single-precision", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_DOUBLE", .udesc = "Counts 256-bit packed double-precision", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivb_lsd[]={ { .uname = "UOPS", .udesc = "Number of uops delivered by the Loop Stream Detector (LSD)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ACTIVE", .udesc = "Cycles with uops delivered by the LSD but which did not come from decoder", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "UOPS:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_4_UOPS", .udesc = "Cycles with 4 uops delivered by the LSD but which did not come from decoder", .ucode = 0x100 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uequiv = "UOPS:c=4", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t ivb_int_misc[]={ { .uname = "RECOVERY_CYCLES", .udesc = "Cycles waiting 
for the checkpoints in Resource Allocation Table (RAT) to be recovered after Nuke due to all other cases except JEClear (e.g. whenever a ucode assist is needed like SSE exception, memory disambiguation, etc...)", .ucode = 0x300 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .modhw = _INTEL_X86_ATTR_C, }, { .uname = "RECOVERY_CYCLES_ANY", .udesc = "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke)", .ucode = 0x300 | (1 << INTEL_X86_CMASK_BIT) | INTEL_X86_MOD_ANY, /* cnt=1 any=1 */ .uequiv = "RECOVERY_CYCLES:t", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_T, }, { .uname = "RECOVERY_STALLS_COUNT", .udesc = "Number of occurrences waiting for Machine Clears", .ucode = 0x300 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t ivb_ept[]={ { .uname = "WALK_CYCLES", .udesc = "Cycles for an extended page table walk", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t ivb_page_walks[]={ { .uname = "LLC_MISS", .udesc = "Number of page walks with a LLC miss", .ucode = 0x100, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t ivb_offcore_requests_buffer[]={ { .uname = "SQ_FULL", .udesc = "Number of cycles the offcore requests buffer is full", .ucode = 0x0100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t ivb_sq_misc[]={ { .uname = "SPLIT_LOCK", .udesc = "Number of split locks in the super queue (SQ)", .ucode = 0x1000, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_entry_t intel_ivb_pe[]={ { .name = "ARITH", .desc = "Counts arithmetic multiply operations", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x14, .numasks = LIBPFM_ARRAY_SIZE(ivb_arith), .ngrp = 1, .umasks = ivb_arith, }, { .name = "BACLEARS", .desc = "Branch 
re-steered", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xe6, .numasks = LIBPFM_ARRAY_SIZE(ivb_baclears), .ngrp = 1, .umasks = ivb_baclears, }, { .name = "BR_INST_EXEC", .desc = "Branch instructions executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x88, .numasks = LIBPFM_ARRAY_SIZE(ivb_br_inst_exec), .ngrp = 1, .umasks = ivb_br_inst_exec, }, { .name = "BR_INST_RETIRED", .desc = "Retired branch instructions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xc4, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(ivb_br_inst_retired), .ngrp = 1, .umasks = ivb_br_inst_retired, }, { .name = "BR_MISP_EXEC", .desc = "Mispredicted branches executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x89, .numasks = LIBPFM_ARRAY_SIZE(ivb_br_misp_exec), .ngrp = 1, .umasks = ivb_br_misp_exec, }, { .name = "BR_MISP_RETIRED", .desc = "Mispredicted retired branches", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xc5, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(ivb_br_misp_retired), .ngrp = 1, .umasks = ivb_br_misp_retired, }, { .name = "BRANCH_INSTRUCTIONS_RETIRED", .desc = "Count branch instructions at retirement. Specifically, this event counts the retirement of the last micro-op of a branch instruction", .modmsk = INTEL_V3_ATTRS, .equiv = "BR_INST_RETIRED:ALL_BRANCHES", .cntmsk = 0xff, .code = 0xc4, }, { .name = "MISPREDICTED_BRANCH_RETIRED", .desc = "Count mispredicted branch instructions at retirement. 
Specifically, this event counts at retirement of the last micro-op of a branch instruction in the architectural path of the execution and experienced misprediction in the branch prediction hardware", .modmsk = INTEL_V3_ATTRS, .equiv = "BR_MISP_RETIRED:ALL_BRANCHES", .cntmsk = 0xff, .code = 0xc5, }, { .name = "LOCK_CYCLES", .desc = "Locked cycles in L1D and L2", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x63, .numasks = LIBPFM_ARRAY_SIZE(ivb_lock_cycles), .ngrp = 1, .umasks = ivb_lock_cycles, }, { .name = "CPL_CYCLES", .desc = "Unhalted core cycles at a specific ring level", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x5c, .numasks = LIBPFM_ARRAY_SIZE(ivb_cpl_cycles), .ngrp = 1, .umasks = ivb_cpl_cycles, }, { .name = "CPU_CLK_UNHALTED", .desc = "Cycles when processor is not in halted state", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x3c, .numasks = LIBPFM_ARRAY_SIZE(ivb_cpu_clk_unhalted), .ngrp = 1, .umasks = ivb_cpu_clk_unhalted, }, { .name = "DSB2MITE_SWITCHES", .desc = "Number of DSB to MITE switches", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xab, .numasks = LIBPFM_ARRAY_SIZE(ivb_dsb2mite_switches), .ngrp = 1, .umasks = ivb_dsb2mite_switches, }, { .name = "DSB_FILL", .desc = "DSB fills", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xac, .numasks = LIBPFM_ARRAY_SIZE(ivb_dsb_fill), .ngrp = 1, .umasks = ivb_dsb_fill, }, { .name = "DTLB_LOAD_MISSES", .desc = "Data TLB load misses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x8, .numasks = LIBPFM_ARRAY_SIZE(ivb_dtlb_load_misses), .ngrp = 1, .umasks = ivb_dtlb_load_misses, }, { .name = "DTLB_STORE_MISSES", .desc = "Data TLB store misses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x49, .numasks = LIBPFM_ARRAY_SIZE(ivb_dtlb_store_misses), .ngrp = 1, .umasks = ivb_dtlb_store_misses, }, { .name = "FP_ASSIST", .desc = "X87 Floating point assists", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xca, .numasks = LIBPFM_ARRAY_SIZE(ivb_fp_assist), .ngrp = 1, .umasks = 
ivb_fp_assist, }, { .name = "ICACHE", .desc = "Instruction Cache accesses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x80, .numasks = LIBPFM_ARRAY_SIZE(ivb_icache), .ngrp = 1, .umasks = ivb_icache, }, { .name = "IDQ", .desc = "IDQ operations", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x79, .numasks = LIBPFM_ARRAY_SIZE(ivb_idq), .ngrp = 1, .umasks = ivb_idq, }, { .name = "IDQ_UOPS_NOT_DELIVERED", .desc = "Uops not delivered", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x9c, .numasks = LIBPFM_ARRAY_SIZE(ivb_idq_uops_not_delivered), .ngrp = 1, .umasks = ivb_idq_uops_not_delivered, }, { .name = "ILD_STALL", .desc = "Instruction Length Decoder stalls", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x87, .numasks = LIBPFM_ARRAY_SIZE(ivb_ild_stall), .ngrp = 1, .umasks = ivb_ild_stall, }, { .name = "INST_RETIRED", .desc = "Instructions retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xc0, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(ivb_inst_retired), .ngrp = 1, .umasks = ivb_inst_retired, }, { .name = "INSTRUCTION_RETIRED", .desc = "Number of instructions at retirement", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x10000000full, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "This is an alias for INSTRUCTION_RETIRED", .modmsk = INTEL_V3_ATTRS, .equiv = "INSTRUCTION_RETIRED", .cntmsk = 0x10000000full, .code = 0xc0, }, { .name = "ITLB", .desc = "Instruction TLB", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xae, .numasks = LIBPFM_ARRAY_SIZE(ivb_itlb), .ngrp = 1, .umasks = ivb_itlb, }, { .name = "ITLB_MISSES", .desc = "Instruction TLB misses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x85, .numasks = LIBPFM_ARRAY_SIZE(ivb_itlb_misses), .ngrp = 1, .umasks = ivb_itlb_misses, }, { .name = "L1D", .desc = "L1D cache", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x51, .numasks = LIBPFM_ARRAY_SIZE(ivb_l1d), .ngrp = 1, .umasks = ivb_l1d, }, { .name = "MOVE_ELIMINATION", .desc = "Move Elimination", .modmsk = 
INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x58, .numasks = LIBPFM_ARRAY_SIZE(ivb_move_elimination), .ngrp = 1, .umasks = ivb_move_elimination, }, { .name = "L1D_PEND_MISS", .desc = "L1D pending misses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x4, .code = 0x48, .numasks = LIBPFM_ARRAY_SIZE(ivb_l1d_pend_miss), .ngrp = 1, .umasks = ivb_l1d_pend_miss, }, { .name = "L2_L1D_WB_RQSTS", .desc = "Writeback requests from L1D to L2", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x28, .numasks = LIBPFM_ARRAY_SIZE(ivb_l2_l1d_wb_rqsts), .ngrp = 1, .umasks = ivb_l2_l1d_wb_rqsts, }, { .name = "L2_LINES_IN", .desc = "L2 lines allocated", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xf1, .numasks = LIBPFM_ARRAY_SIZE(ivb_l2_lines_in), .ngrp = 1, .umasks = ivb_l2_lines_in, }, { .name = "L2_LINES_OUT", .desc = "L2 lines evicted", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xf2, .numasks = LIBPFM_ARRAY_SIZE(ivb_l2_lines_out), .ngrp = 1, .umasks = ivb_l2_lines_out, }, { .name = "L2_RQSTS", .desc = "L2 requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x24, .numasks = LIBPFM_ARRAY_SIZE(ivb_l2_rqsts), .ngrp = 1, .umasks = ivb_l2_rqsts, }, { .name = "L2_STORE_LOCK_RQSTS", .desc = "L2 store lock requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x27, .numasks = LIBPFM_ARRAY_SIZE(ivb_l2_store_lock_rqsts), .ngrp = 1, .umasks = ivb_l2_store_lock_rqsts, }, { .name = "L2_TRANS", .desc = "L2 transactions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xf0, .numasks = LIBPFM_ARRAY_SIZE(ivb_l2_trans), .ngrp = 1, .umasks = ivb_l2_trans, }, { .name = "LAST_LEVEL_CACHE_MISSES", .desc = "This is an alias for L3_LAT_CACHE:MISS", .modmsk = INTEL_V3_ATTRS, .equiv = "L3_LAT_CACHE:MISS", .cntmsk = 0xff, .code = 0x412e, }, { .name = "LLC_MISSES", .desc = "Alias for LAST_LEVEL_CACHE_MISSES", .modmsk = INTEL_V3_ATTRS, .equiv = "LAST_LEVEL_CACHE_MISSES", .cntmsk = 0xff, .code = 0x412e, }, { .name = "LAST_LEVEL_CACHE_REFERENCES", .desc = "This is an alias for 
L3_LAT_CACHE:REFERENCE", .modmsk = INTEL_V3_ATTRS, .equiv = "L3_LAT_CACHE:REFERENCE", .cntmsk = 0xff, .code = 0x4f2e, }, { .name = "LLC_REFERENCES", .desc = "Alias for LAST_LEVEL_CACHE_REFERENCES", .modmsk = INTEL_V3_ATTRS, .equiv = "LAST_LEVEL_CACHE_REFERENCES", .cntmsk = 0xff, .code = 0x4f2e, }, { .name = "LD_BLOCKS", .desc = "Blocking loads", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x3, .numasks = LIBPFM_ARRAY_SIZE(ivb_ld_blocks), .ngrp = 1, .umasks = ivb_ld_blocks, }, { .name = "LD_BLOCKS_PARTIAL", .desc = "Partial load blocks", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x7, .numasks = LIBPFM_ARRAY_SIZE(ivb_ld_blocks_partial), .ngrp = 1, .umasks = ivb_ld_blocks_partial, }, { .name = "LOAD_HIT_PRE", .desc = "Load dispatches that hit fill buffer", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x4c, .numasks = LIBPFM_ARRAY_SIZE(ivb_load_hit_pre), .ngrp = 1, .umasks = ivb_load_hit_pre, }, { .name = "L3_LAT_CACHE", .desc = "Core-originated cacheable demand requests to L3", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x2e, .numasks = LIBPFM_ARRAY_SIZE(ivb_l3_lat_cache), .ngrp = 1, .umasks = ivb_l3_lat_cache, }, { .name = "LONGEST_LAT_CACHE", .desc = "Core-originated cacheable demand requests to L3", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x2e, .numasks = LIBPFM_ARRAY_SIZE(ivb_l3_lat_cache), .ngrp = 1, .equiv = "L3_LAT_CACHE", .umasks = ivb_l3_lat_cache, }, { .name = "MACHINE_CLEARS", .desc = "Machine clear asserted", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xc3, .numasks = LIBPFM_ARRAY_SIZE(ivb_machine_clears), .ngrp = 1, .umasks = ivb_machine_clears, }, { .name = "MEM_LOAD_UOPS_LLC_HIT_RETIRED", .desc = "L3 hit loads uops retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd2, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(ivb_mem_load_uops_llc_hit_retired), .ngrp = 1, .umasks = ivb_mem_load_uops_llc_hit_retired, }, { .name = "MEM_LOAD_LLC_HIT_RETIRED", .desc = "L3 hit loads uops retired (deprecated 
use MEM_LOAD_UOPS_LLC_HIT_RETIRED)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd2, .equiv = "MEM_LOAD_UOPS_LLC_HIT_RETIRED", .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(ivb_mem_load_uops_llc_hit_retired), .ngrp = 1, .umasks = ivb_mem_load_uops_llc_hit_retired, }, { .name = "MEM_LOAD_UOPS_LLC_MISS_RETIRED", .desc = "Load uops retired that missed the LLC", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd3, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(ivb_mem_load_uops_llc_miss_retired), .ngrp = 1, .umasks = ivb_mem_load_uops_llc_miss_retired, }, { .name = "MEM_LOAD_UOPS_RETIRED", .desc = "Memory loads uops retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd1, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(ivb_mem_load_uops_retired), .ngrp = 1, .umasks = ivb_mem_load_uops_retired, }, { .name = "MEM_LOAD_RETIRED", .desc = "Memory loads uops retired (deprecated use MEM_LOAD_UOPS_RETIRED)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd1, .equiv = "MEM_LOAD_UOPS_RETIRED", .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(ivb_mem_load_uops_retired), .ngrp = 1, .umasks = ivb_mem_load_uops_retired, }, { .name = "MEM_TRANS_RETIRED", .desc = "Memory transactions retired", .modmsk = INTEL_V3_ATTRS | _INTEL_X86_ATTR_LDLAT, .cntmsk = 0x8, .code = 0xcd, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(ivb_mem_trans_retired), .ngrp = 1, .umasks = ivb_mem_trans_retired, }, { .name = "MEM_UOPS_RETIRED", .desc = "Memory uops retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd0, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(ivb_mem_uops_retired), .ngrp = 1, .umasks = ivb_mem_uops_retired, }, { .name = "MEM_UOP_RETIRED", .desc = "Memory uops retired (deprecated use MEM_UOPS_RETIRED)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd0, .equiv = "MEM_UOPS_RETIRED", .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(ivb_mem_uops_retired), .ngrp = 1, .umasks = ivb_mem_uops_retired, }, { .name = 
"MISALIGN_MEM_REF", .desc = "Misaligned memory references", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x5, .numasks = LIBPFM_ARRAY_SIZE(ivb_misalign_mem_ref), .ngrp = 1, .umasks = ivb_misalign_mem_ref, }, { .name = "OFFCORE_REQUESTS", .desc = "Offcore requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xb0, .numasks = LIBPFM_ARRAY_SIZE(ivb_offcore_requests), .ngrp = 1, .umasks = ivb_offcore_requests, }, { .name = "OFFCORE_REQUESTS_OUTSTANDING", .desc = "Outstanding offcore requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x60, .numasks = LIBPFM_ARRAY_SIZE(ivb_offcore_requests_outstanding), .ngrp = 1, .umasks = ivb_offcore_requests_outstanding, }, { .name = "OTHER_ASSISTS", .desc = "Count hardware assists", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xc1, .numasks = LIBPFM_ARRAY_SIZE(ivb_other_assists), .ngrp = 1, .umasks = ivb_other_assists, }, { .name = "RESOURCE_STALLS", .desc = "Resource related stall cycles", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xa2, .numasks = LIBPFM_ARRAY_SIZE(ivb_resource_stalls), .ngrp = 1, .umasks = ivb_resource_stalls, }, { .name = "CYCLE_ACTIVITY", .desc = "Stalled cycles", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xa3, .numasks = LIBPFM_ARRAY_SIZE(ivb_cycle_activity), .ngrp = 1, .umasks = ivb_cycle_activity, }, { .name = "ROB_MISC_EVENTS", .desc = "Reorder buffer events", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xcc, .numasks = LIBPFM_ARRAY_SIZE(ivb_rob_misc_events), .ngrp = 1, .umasks = ivb_rob_misc_events, }, { .name = "RS_EVENTS", .desc = "Reservation station events", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x5e, .numasks = LIBPFM_ARRAY_SIZE(ivb_rs_events), .ngrp = 1, .umasks = ivb_rs_events, }, { .name = "DTLB_LOAD_ACCESS", .desc = "TLB access", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x5f, .numasks = LIBPFM_ARRAY_SIZE(ivb_tlb_access), .ngrp = 1, .umasks = ivb_tlb_access, }, { .name = "TLB_ACCESS", .desc = "TLB access", .modmsk = 
INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x5f, .numasks = LIBPFM_ARRAY_SIZE(ivb_tlb_access), .ngrp = 1, .equiv = "DTLB_LOAD_ACCESS", .umasks = ivb_tlb_access, }, { .name = "TLB_FLUSH", .desc = "TLB flushes", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xbd, .numasks = LIBPFM_ARRAY_SIZE(ivb_tlb_flush), .ngrp = 1, .umasks = ivb_tlb_flush, }, { .name = "UNHALTED_CORE_CYCLES", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x20000000full, .code = 0x3c, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles", .modmsk = INTEL_FIXED3_ATTRS, .cntmsk = 0x400000000ull, .code = 0x0300, /* pseudo encoding */ .flags = INTEL_X86_FIXED, }, { .name = "UOPS_EXECUTED", .desc = "Uops executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xb1, .numasks = LIBPFM_ARRAY_SIZE(ivb_uops_executed), .ngrp = 1, .umasks = ivb_uops_executed, }, { .name = "UOPS_DISPATCHED_PORT", .desc = "Uops dispatched to specific ports", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xa1, .numasks = LIBPFM_ARRAY_SIZE(ivb_uops_dispatched_port), .ngrp = 1, .umasks = ivb_uops_dispatched_port, }, { .name = "UOPS_ISSUED", .desc = "Uops issued", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xe, .numasks = LIBPFM_ARRAY_SIZE(ivb_uops_issued), .ngrp = 1, .umasks = ivb_uops_issued, }, { .name = "UOPS_RETIRED", .desc = "Uops retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xc2, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(ivb_uops_retired), .ngrp = 1, .umasks = ivb_uops_retired, }, { .name = "FP_COMP_OPS_EXE", .desc = "Counts number of floating point events", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x10, .numasks = LIBPFM_ARRAY_SIZE(ivb_fp_comp_ops_exe), .ngrp = 1, .umasks = ivb_fp_comp_ops_exe, }, { .name = "SIMD_FP_256", .desc = "Counts 256-bit packed floating point instructions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x11, .numasks =
LIBPFM_ARRAY_SIZE(ivb_simd_fp_256), .ngrp = 1, .umasks = ivb_simd_fp_256, }, { .name = "LSD", .desc = "Loop stream detector", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xa8, .numasks = LIBPFM_ARRAY_SIZE(ivb_lsd), .ngrp = 1, .umasks = ivb_lsd, }, { .name = "EPT", .desc = "Extended page table", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x4f, .numasks = LIBPFM_ARRAY_SIZE(ivb_ept), .ngrp = 1, .umasks = ivb_ept, }, { .name = "PAGE_WALKS", .desc = "Page walker", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xbe, .numasks = LIBPFM_ARRAY_SIZE(ivb_page_walks), .ngrp = 1, .umasks = ivb_page_walks, }, { .name = "INT_MISC", .desc = "Miscellaneous interruptions", .code = 0xd, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V3_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivb_int_misc), .umasks = ivb_int_misc }, { .name = "OFFCORE_REQUESTS_BUFFER", .desc = "Offcore request buffer", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xb2, .numasks = LIBPFM_ARRAY_SIZE(ivb_offcore_requests_buffer), .ngrp = 1, .umasks = ivb_offcore_requests_buffer, }, { .name = "SQ_MISC", .desc = "SuperQueue miscellaneous", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0xf4, .numasks = LIBPFM_ARRAY_SIZE(ivb_sq_misc), .ngrp = 1, .umasks = ivb_sq_misc, }, { .name = "OFFCORE_RESPONSE_0", .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1b7, .flags= INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(ivb_offcore_response), .ngrp = 3, .umasks = ivb_offcore_response, }, { .name = "OFFCORE_RESPONSE_1", .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1bb, .flags= INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(ivb_offcore_response), .ngrp = 3, .umasks = ivb_offcore_response, /* identical to actual umasks list for this
event */ }, };

/* src/libpfm4/lib/events/intel_ivbep_unc_cbo_events.h */

/* * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux.
* * PMU: ivbep_unc_cbo (Intel IvyBridge-EP C-Box uncore PMU) */ #define CBO_FILT_MESIF(a, b, c, d) \ { .uname = "STATE_"#a,\ .udesc = #b" cacheline state",\ .ufilters[0] = 1ULL << (17 + (c)),\ .grpid = d, \ } #define CBO_FILT_MESIFS(d) \ CBO_FILT_MESIF(I, Invalid, 0, d), \ CBO_FILT_MESIF(S, Shared, 1, d), \ CBO_FILT_MESIF(E, Exclusive, 2, d), \ CBO_FILT_MESIF(M, Modified, 3, d), \ CBO_FILT_MESIF(F, Forward, 4, d), \ { .uname = "STATE_MESIF",\ .udesc = "Any cache line state",\ .ufilters[0] = 0x3fULL << 17,\ .grpid = d, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, \ } #define CBO_FILT_OPC(d) \ { .uname = "OPC_RFO",\ .udesc = "Demand data RFO (combine with any OPCODE umask)",\ .ufilters[1] = 0x180ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_CRD",\ .udesc = "Demand code read (combine with any OPCODE umask)",\ .ufilters[1] = 0x181ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_DRD",\ .udesc = "Demand data read (combine with any OPCODE umask)",\ .ufilters[1] = 0x182ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PRD",\ .udesc = "Partial reads (UC) (combine with any OPCODE umask)",\ .ufilters[1] = 0x187ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WCILF",\ .udesc = "Full Stream store (combine with any OPCODE umask)", \ .ufilters[1] = 0x18cULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WCIL",\ .udesc = "Partial Stream store (combine with any OPCODE umask)", \ .ufilters[1] = 0x18dULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PF_RFO",\ .udesc = "Prefetch RFO into LLC but do not pass to L2 (includes hints) (combine with any OPCODE umask)", \ .ufilters[1] = 0x190ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PF_CODE",\ .udesc = "Prefetch code into LLC but do not pass to L2 (includes hints) (combine with any OPCODE umask)", \ .ufilters[1] = 0x191ULL << 20, \ 
.uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PF_DATA",\ .udesc = "Prefetch data into LLC but do not pass to L2 (includes hints) (combine with any OPCODE umask)", \ .ufilters[1] = 0x192ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCIWILF",\ .udesc = "PCIe write (non-allocating) (combine with any OPCODE umask)", \ .ufilters[1] = 0x194ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCIPRD",\ .udesc = "PCIe UC read (combine with any OPCODE umask)", \ .ufilters[1] = 0x195ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCIITOM",\ .udesc = "PCIe write (allocating) (combine with any OPCODE umask)", \ .ufilters[1] = 0x19cULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCIRDCUR",\ .udesc = "PCIe read current (combine with any OPCODE umask)", \ .ufilters[1] = 0x19eULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WBMTOI",\ .udesc = "Request writeback modified invalidate line (combine with any OPCODE umask)", \ .ufilters[1] = 0x1c4ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WBMTOE",\ .udesc = "Request writeback modified set to exclusive (combine with any OPCODE umask)", \ .ufilters[1] = 0x1c5ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_ITOM",\ .udesc = "Request invalidate line (combine with any OPCODE umask)", \ .ufilters[1] = 0x1c8ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCINSRD",\ .udesc = "PCIe non-snoop read (combine with any OPCODE umask)", \ .ufilters[1] = 0x1e4ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCINSWR",\ .udesc = "PCIe non-snoop write (partial) (combine with any OPCODE umask)", \ .ufilters[1] = 0x1e5ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCINSWRF",\ .udesc = "PCIe non-snoop write (full) (combine with any OPCODE 
umask)", \ .ufilters[1] = 0x1e6ULL << 20, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ } static const intel_x86_umask_t ivbep_unc_c_llc_lookup[]={ { .uname = "DATA_READ", .udesc = "Data read requests", .grpid = 0, .ucode = 0x300, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITE", .udesc = "Write requests. Includes all write transactions (cached, uncached)", .grpid = 0, .ucode = 0x500, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_SNOOP", .udesc = "External snoop request", .grpid = 0, .ucode = 0x900, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Any request", .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .ucode = 0x1100, }, { .uname = "NID", .udesc = "Match a given RTID destination NID (must provide nf=X modifier)", .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .grpid = 1, .ucode = 0x4100, .uflags = INTEL_X86_GRP_DFL_NONE }, CBO_FILT_MESIFS(2), }; static const intel_x86_umask_t ivbep_unc_c_llc_victims[]={ { .uname = "STATE_M", .udesc = "Lines in M state", .ucode = 0x100, .grpid = 0, }, { .uname = "STATE_E", .udesc = "Lines in E state", .ucode = 0x200, .grpid = 0, }, { .uname = "STATE_S", .udesc = "Lines in S state", .ucode = 0x400, .grpid = 0, }, { .uname = "MISS", .udesc = "TBD", .ucode = 0x800, .grpid = 0, }, { .uname = "NID", .udesc = "Victimized Lines matching the NID filter (must provide nf=X modifier)", .ucode = 0x4000, .uflags = INTEL_X86_GRP_DFL_NONE, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .grpid = 1, }, }; static const intel_x86_umask_t ivbep_unc_c_ring_ad_used[]={ { .uname = "UP_VR0_EVEN", .udesc = "Up and Even ring polarity filter on virtual ring 0", .ucode = 0x100, }, { .uname = "UP_VR0_ODD", .udesc = "Up and odd ring polarity filter on virtual ring 0", .ucode = 0x200, }, { .uname = "DOWN_VR0_EVEN", .udesc = "Down and even ring polarity filter on virtual ring 0", .ucode = 0x400, }, { .uname = "DOWN_VR0_ODD", .udesc = "Down and odd ring polarity filter on virtual ring 0", .ucode = 0x800, }, { .uname = "UP_VR1_EVEN", .udesc = "Up and Even 
ring polarity filter on virtual ring 1", .ucode = 0x1000, }, { .uname = "UP_VR1_ODD", .udesc = "Up and odd ring polarity filter on virtual ring 1", .ucode = 0x2000, }, { .uname = "DOWN_VR1_EVEN", .udesc = "Down and even ring polarity filter on virtual ring 1", .ucode = 0x4000, }, { .uname = "DOWN_VR1_ODD", .udesc = "Down and odd ring polarity filter on virtual ring 1", .ucode = 0x8000, }, { .uname = "UP", .udesc = "Up on any virtual ring", .ucode = 0x3300, }, { .uname = "DOWN", .udesc = "Down on any virtual ring", .ucode = 0xcc00, }, }; static const intel_x86_umask_t ivbep_unc_c_ring_bounces[]={ { .uname = "AD_IRQ", .udesc = "TBD", .ucode = 0x200, }, { .uname = "AK", .udesc = "Acknowledgments to core", .ucode = 0x400, }, { .uname = "BL", .udesc = "Data responses to core", .ucode = 0x800, }, { .uname = "IV", .udesc = "Snoops of processor cache", .ucode = 0x1000, }, }; static const intel_x86_umask_t ivbep_unc_c_ring_iv_used[]={ { .uname = "ANY", .udesc = "Any filter", .ucode = 0xf00, .uflags = INTEL_X86_DFL, }, { .uname = "UP", .udesc = "Filter on any up polarity", .ucode = 0x3300, }, { .uname = "DOWN", .udesc = "Filter on any down polarity", .ucode = 0xcc00, }, }; static const intel_x86_umask_t ivbep_unc_c_rxr_ext_starved[]={ { .uname = "IRQ", .udesc = "IRQ externally starved, therefore blocking the IPQ", .ucode = 0x100, }, { .uname = "IPQ", .udesc = "IPQ externally starved, therefore blocking the IRQ", .ucode = 0x200, }, { .uname = "PRQ", .udesc = "IRQ is blocking the ingress queue and causing starvation", .ucode = 0x400, }, { .uname = "ISMQ_BIDS", .udesc = "Number of times the ISMQ bids", .ucode = 0x800, }, }; static const intel_x86_umask_t ivbep_unc_c_rxr_inserts[]={ { .uname = "IPQ", .udesc = "IPQ", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ", .udesc = "IRQ", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ_REJECTED", .udesc = "IRQ rejected", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VFIFO", .udesc = "Counts the
number of allocations into the IRQ ordering FIFO", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_c_rxr_ipq_retry[]={ { .uname = "ADDR_CONFLICT", .udesc = "Address conflict", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Any reject", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FULL", .udesc = "No Egress credits", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "QPI_CREDITS", .udesc = "No QPI credits", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_c_rxr_irq_retry[]={ { .uname = "ADDR_CONFLICT", .udesc = "Address conflict", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Any reject", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FULL", .udesc = "No Egress credits", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "QPI_CREDITS", .udesc = "No QPI credits", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RTID", .udesc = "No RTIDs", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IIO_CREDITS", .udesc = "No IIO credits", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_c_rxr_ismq_retry[]={ { .uname = "ANY", .udesc = "Any reject", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FULL", .udesc = "No Egress credits", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IIO_CREDITS", .udesc = "No IIO credits", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "QPI_CREDITS", .udesc = "No QPI credits", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RTID", .udesc = "No RTIDs", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WB_CREDITS", .udesc = "No WB credits", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_c_tor_inserts[]={ { .uname = "OPCODE", .udesc = "Number of transactions
inserted into the TOR that match an opcode (must provide opc_* umask)", .ucode = 0x100, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_OPCODE", .udesc = "Number of miss transactions inserted into the TOR that match an opcode (must provide opc_* umask)", .ucode = 0x300, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EVICTION", .udesc = "Number of eviction transactions inserted into the TOR", .ucode = 0x400, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "ALL", .udesc = "Number of transactions inserted into the TOR", .ucode = 0x800, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, }, { .uname = "WB", .udesc = "Number of write transactions inserted into the TOR", .ucode = 0x1000, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "LOCAL_OPCODE", .udesc = "Number of opcode-matched transactions inserted into the TOR that are satisfied by locally homed memory", .ucode = 0x2100, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_LOCAL_OPCODE", .udesc = "Number of miss opcode-matched transactions inserted into the TOR that are satisfied by locally homed memory", .ucode = 0x2300, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL", .udesc = "Number of transactions inserted into the TOR that are satisfied by locally homed memory", .ucode = 0x2800, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_LOCAL", .udesc = "Number of miss transactions inserted into the TOR that are satisfied by locally homed memory", .ucode = 0x2a00, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_OPCODE", .udesc = "Number of transactions inserted into the TOR that match a NID and opcode (must provide opc_* umask and nf=X modifier)", .ucode = 0x4100, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NID_MISS_OPCODE", .udesc = "Number of NID and opcode matched miss
transactions inserted into the TOR (must provide opc_* umask and nf=X modifier)", .ucode = 0x4300, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NID_EVICTION", .udesc = "Number of NID-matched eviction transactions inserted into the TOR (must provide nf=X modifier)", .ucode = 0x4400, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_ALL", .udesc = "Number of NID-matched transactions inserted into the TOR (must provide nf=X modifier)", .ucode = 0x4800, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_MISS_ALL", .udesc = "Number of NID-matched miss transactions that were inserted into the TOR (must provide nf=X modifier)", .ucode = 0x4a00, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_WB", .udesc = "Number of NID-matched write back transactions inserted into the TOR (must provide nf=X modifier)", .ucode = 0x5000, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "REMOTE_OPCODE", .udesc = "Number of opcode-matched transactions inserted into the TOR that are satisfied by remote caches or memory", .ucode = 0x8100, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_REMOTE_OPCODE", .udesc = "Number of miss opcode-matched transactions inserted into the TOR that are satisfied by remote caches or memory", .ucode = 0x8300, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE", .udesc = "Number of transactions inserted into the TOR that are satisfied by remote caches or memory", .ucode = 0x8800, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_REMOTE", .udesc = "Number of miss transactions inserted into the TOR that are satisfied by remote caches or memory", .ucode = 0x8a00, .grpid = 0, .uflags = 
INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, CBO_FILT_OPC(1) }; static const intel_x86_umask_t ivbep_unc_c_tor_occupancy[]={ { .uname = "OPCODE", .udesc = "Number of TOR entries that match an opcode (must provide opc_* umask)", .ucode = 0x100, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_OPCODE", .udesc = "Number of miss TOR entries that match an opcode (must provide opc_* umask)", .ucode = 0x300, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EVICTION", .udesc = "Number of outstanding eviction transactions in the TOR", .ucode = 0x400, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "ALL", .udesc = "All valid TOR entries", .ucode = 0x800, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_ALL", .udesc = "Number of outstanding miss requests in the TOR", .ucode = 0xa00, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "WB", .udesc = "Number of write transactions in the TOR.
Does not include RFO, but actual operations that contain data being sent from the core", .ucode = 0x1000, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "LOCAL_OPCODE", .udesc = "Number of opcode-matched transactions in the TOR that are satisfied by locally homed memory", .ucode = 0x2100, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_LOCAL_OPCODE", .udesc = "Number of miss opcode-matched transactions in the TOR that are satisfied by locally homed memory", .ucode = 0x2300, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL", .udesc = "Number of transactions in the TOR that are satisfied by locally homed memory", .ucode = 0x2800, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_LOCAL", .udesc = "Number of miss transactions in the TOR that are satisfied by locally homed memory", .ucode = 0x2a00, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_OPCODE", .udesc = "Number of NID-matched TOR entries that match an opcode (must provide nf=X modifier and opc_* umask)", .ucode = 0x4100, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NID_MISS_OPCODE", .udesc = "Number of NID-matched outstanding miss requests in the TOR that match an opcode (must provide nf=X modifier and opc_* umask)", .ucode = 0x4300, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NID_EVICTION", .udesc = "Number of NID-matched outstanding eviction requests in the TOR (must provide a nf=X modifier)", .ucode = 0x4400, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_ALL", .udesc = "Number of NID-matched outstanding requests in the TOR (must provide nf=X modifier)", .ucode = 0x4800, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_MISS_ALL", .udesc = "Number of NID-matched outstanding
miss requests in the TOR (must provide a nf=X modifier)", .ucode = 0x4a00, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_WB", .udesc = "Number of NID-matched write transactions in the TOR (must provide a nf=X modifier)", .ucode = 0x5000, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "REMOTE_OPCODE", .udesc = "Number of opcode-matched transactions in the TOR that are satisfied by remote caches or memory", .ucode = 0x8100, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_REMOTE_OPCODE", .udesc = "Number of miss opcode-matched transactions in the TOR that are satisfied by remote caches or memory", .ucode = 0x8300, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE", .udesc = "Number of transactions in the TOR that are satisfied by remote caches or memory", .ucode = 0x8800, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_REMOTE", .udesc = "Number of miss transactions inserted into the TOR that are satisfied by remote caches or memory", .ucode = 0x8a00, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, CBO_FILT_OPC(1) }; static const intel_x86_umask_t ivbep_unc_c_txr_inserts[]={ { .uname = "AD_CACHE", .udesc = "Counts the number of ring transactions from Cachebo to AD ring", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_CACHE", .udesc = "Counts the number of ring transactions from Cachebo to AK ring", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CACHE", .udesc = "Counts the number of ring transactions from Cachebo to BL ring", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_CACHE", .udesc = "Counts the number of ring transactions from Cachebo to IV ring", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CORE", .udesc = "Counts the number of ring transactions from Corebo to AD ring", .ucode =
0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_CORE", .udesc = "Counts the number of ring transactions from Corebo to AK ring", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CORE", .udesc = "Counts the number of ring transactions from Corebo to BL ring", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_c_txr_ads_used[]={ { .uname = "AD", .udesc = "Onto AD ring", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "Onto AK ring", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "Onto BL ring", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, } }; static const intel_x86_umask_t ivbep_unc_c_misc[]={ { .uname = "RSPI_WAS_FSE", .udesc = "Counts the number of times when a snoop hit in FSE states and triggered a silent eviction. This is useful because this information is lost in the PRE encodings", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WC_ALIASING", .udesc = "Counts the number of times a USWC write (WCIL(F)) transaction hits in the LLC in M state, triggering a WBMTOI followed by the USWC write. This occurs when there is WC aliasing", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STARTED", .udesc = "TBD", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT_S", .udesc = "Counts the number of times that an RFO hits in S state. This is useful for determining if it might be good for a workload to use RSPIWB instead of RSPSWB", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_ivbep_unc_c_pe[]={ { .name = "UNC_C_CLOCKTICKS", .desc = "C-box Uncore clockticks", .modmsk = 0x0, .cntmsk = 0xf, .code = 0x00, .flags = INTEL_X86_FIXED, }, { .name = "UNC_C_COUNTER0_OCCUPANCY", .desc = "Counter 0 occupancy.
Counts the occupancy related information by filtering CB0 occupancy count captured in counter 0.", .modmsk = IVBEP_UNC_CBO_ATTRS, .cntmsk = 0xe, .code = 0x1f, }, { .name = "UNC_C_LLC_LOOKUP", .desc = "Cache lookups", .modmsk = IVBEP_UNC_CBO_NID_ATTRS, .cntmsk = 0x3, .code = 0x34, .ngrp = 3, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_llc_lookup), .umasks = ivbep_unc_c_llc_lookup, }, { .name = "UNC_C_LLC_VICTIMS", .desc = "Lines victimized", .modmsk = IVBEP_UNC_CBO_NID_ATTRS, .cntmsk = 0x3, .code = 0x37, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_llc_victims), .ngrp = 2, .umasks = ivbep_unc_c_llc_victims, }, { .name = "UNC_C_MISC", .desc = "Miscellaneous C-Box events", .modmsk = IVBEP_UNC_CBO_ATTRS, .cntmsk = 0x3, .code = 0x39, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_misc), .ngrp = 1, .umasks = ivbep_unc_c_misc, }, { .name = "UNC_C_RING_AD_USED", .desc = "Address ring in use. Counts number of cycles ring is being used at this ring stop", .modmsk = IVBEP_UNC_CBO_ATTRS, .cntmsk = 0xc, .code = 0x1b, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_ring_ad_used), .ngrp = 1, .umasks = ivbep_unc_c_ring_ad_used, }, { .name = "UNC_C_RING_AK_USED", .desc = "Acknowledgement ring in use. Counts number of cycles ring is being used at this ring stop", .modmsk = IVBEP_UNC_CBO_ATTRS, .cntmsk = 0xc, .code = 0x1c, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_ring_ad_used), /* identical to RING_AD_USED */ .ngrp = 1, .umasks = ivbep_unc_c_ring_ad_used, }, { .name = "UNC_C_RING_BL_USED", .desc = "Bus or Data ring in use. 
Counts number of cycles ring is being used at this ring stop", .modmsk = IVBEP_UNC_CBO_ATTRS, .cntmsk = 0xc, .code = 0x1d, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_ring_ad_used), /* identical to RING_AD_USED */ .ngrp = 1, .umasks = ivbep_unc_c_ring_ad_used, }, { .name = "UNC_C_RING_BOUNCES", .desc = "Number of LLC responses that bounced in the ring", .modmsk = IVBEP_UNC_CBO_ATTRS, .cntmsk = 0x3, .code = 0x05, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_ring_bounces), .ngrp = 1, .umasks = ivbep_unc_c_ring_bounces, }, { .name = "UNC_C_RING_IV_USED", .desc = "Invalidate ring in use. Counts number of cycles ring is being used at this ring stop", .modmsk = IVBEP_UNC_CBO_ATTRS, .cntmsk = 0xc, .code = 0x1e, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_ring_iv_used), .ngrp = 1, .umasks = ivbep_unc_c_ring_iv_used, }, { .name = "UNC_C_RING_SRC_THRTL", .desc = "TBD", .modmsk = IVBEP_UNC_CBO_ATTRS, .cntmsk = 0x3, .code = 0x07, }, { .name = "UNC_C_RXR_EXT_STARVED", .desc = "Ingress arbiter blocking cycles", .modmsk = IVBEP_UNC_CBO_ATTRS, .cntmsk = 0x3, .code = 0x12, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_rxr_ext_starved), .ngrp = 1, .umasks = ivbep_unc_c_rxr_ext_starved, }, { .name = "UNC_C_RXR_INSERTS", .desc = "Ingress Allocations", .code = 0x13, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_rxr_inserts), .umasks = ivbep_unc_c_rxr_inserts }, { .name = "UNC_C_RXR_IPQ_RETRY", .desc = "Probe Queue Retries", .code = 0x31, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_rxr_ipq_retry), .umasks = ivbep_unc_c_rxr_ipq_retry }, { .name = "UNC_C_RXR_IRQ_RETRY", .desc = "Ingress Request Queue Rejects", .code = 0x32, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_rxr_irq_retry), .umasks = ivbep_unc_c_rxr_irq_retry }, { .name = "UNC_C_RXR_ISMQ_RETRY", .desc = "ISMQ Retries", .code = 0x33, .cntmsk = 0x3, .ngrp = 1, .modmsk =
IVBEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_rxr_ismq_retry), .umasks = ivbep_unc_c_rxr_ismq_retry }, { .name = "UNC_C_RXR_OCCUPANCY", .desc = "Ingress Occupancy", .code = 0x11, .cntmsk = 0x1, .ngrp = 1, .modmsk = IVBEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_rxr_inserts), .umasks = ivbep_unc_c_rxr_inserts, /* identical to ivbep_unc_c_rxr_inserts */ }, { .name = "UNC_C_TOR_INSERTS", .desc = "TOR Inserts", .code = 0x35, .cntmsk = 0x3, .ngrp = 2, .modmsk = IVBEP_UNC_CBO_NID_ATTRS | _SNBEP_UNC_ATTR_ISOC | _SNBEP_UNC_ATTR_NC, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_tor_inserts), .umasks = ivbep_unc_c_tor_inserts }, { .name = "UNC_C_TOR_OCCUPANCY", .desc = "TOR Occupancy", .code = 0x36, .cntmsk = 0x1, .ngrp = 2, .modmsk = IVBEP_UNC_CBO_NID_ATTRS | _SNBEP_UNC_ATTR_ISOC | _SNBEP_UNC_ATTR_NC, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_tor_occupancy), .umasks = ivbep_unc_c_tor_occupancy }, { .name = "UNC_C_TXR_ADS_USED", .desc = "Egress events", .code = 0x04, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_txr_ads_used), .umasks = ivbep_unc_c_txr_ads_used }, { .name = "UNC_C_TXR_INSERTS", .desc = "Egress allocations", .code = 0x02, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_c_txr_inserts), .umasks = ivbep_unc_c_txr_inserts }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_ivbep_unc_ha_events.h000066400000000000000000000647371502707512200254310ustar00rootroot00000000000000/* * Copyright (c) 2014 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. * * PMU: ivbep_unc_ha (Intel IvyBridge-EP HA uncore PMU) */ static const intel_x86_umask_t ivbep_unc_h_conflict_cycles[]={ { .uname = "CONFLICT", .udesc = "Number of cycles that we are handling conflicts", .ucode = 0x200, }, { .uname = "LAST", .udesc = "Count every last conflictor in conflict chain. Can be used to compute average conflict chain length", .ucode = 0x400, }, { .uname = "CMP_FWDS", .udesc = "Count the number of cmp_fwd. 
This gives the number of late conflicts", .ucode = 0x1000, }, { .uname = "ACKCNFLTS", .udesc = "Count the number of Acknflts", .ucode = 0x800, }, }; static const intel_x86_umask_t ivbep_unc_h_directory_lookup[]={ { .uname = "NO_SNP", .udesc = "Snoop not needed", .ucode = 0x200, }, { .uname = "SNOOP", .udesc = "Snoop needed", .ucode = 0x100, }, }; static const intel_x86_umask_t ivbep_unc_h_bypass_imc[]={ { .uname = "TAKEN", .udesc = "Bypass taken", .ucode = 0x200, }, { .uname = "NOT_TAKEN", .udesc = "Bypass not taken", .ucode = 0x100, }, }; static const intel_x86_umask_t ivbep_unc_h_directory_update[]={ { .uname = "ANY", .udesc = "Counts any directory update", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CLEAR", .udesc = "Directory clears", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SET", .udesc = "Directory set", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_h_igr_no_credit_cycles[]={ { .uname = "AD_QPI0", .udesc = "AD to QPI link 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_QPI1", .udesc = "AD to QPI link 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_QPI0", .udesc = "BL to QPI link 0", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_QPI1", .udesc = "BL to QPI link 1", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_h_imc_writes[]={ { .uname = "ALL", .udesc = "Counts all writes", .ucode = 0xf00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FULL", .udesc = "Counts full line non ISOCH", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FULL_ISOCH", .udesc = "Counts ISOCH full line", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PARTIAL", .udesc = "Counts partial non-ISOCH", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PARTIAL_ISOCH", .udesc = "Counts ISOCH partial", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const
intel_x86_umask_t ivbep_unc_h_imc_reads[]={ { .uname = "NORMAL", .udesc = "Normal priority", .ucode = 0x100, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t ivbep_unc_h_requests[]={ { .uname = "READS", .udesc = "Counts incoming read requests. Good proxy for LLC read misses, incl. RFOs", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_LOCAL", .udesc = "Counts incoming read requests coming from local socket. Good proxy for LLC read misses, incl. RFOs from the local socket", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_REMOTE", .udesc = "Counts incoming read requests coming from remote socket. Good proxy for LLC read misses, incl. RFOs from the remote socket", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITES", .udesc = "Counts incoming writes", .ucode = 0xc00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITES_LOCAL", .udesc = "Counts incoming writes from local socket", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITES_REMOTE", .udesc = "Counts incoming writes from remote socket", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "INVITOE_LOCAL", .udesc = "Counts InvItoE coming from local socket", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "INVITOE_REMOTE", .udesc = "Counts InvItoE coming from remote socket", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, } }; static const intel_x86_umask_t ivbep_unc_h_rpq_cycles_no_reg_credits[]={ { .uname = "CHN0", .udesc = "Channel 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN1", .udesc = "Channel 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN2", .udesc = "Channel 2", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN3", .udesc = "Channel 3", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_h_tad_requests_g0[]={ { .uname = "REGION0", .udesc = "Counts for TAD Region 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname =
"REGION1", .udesc = "Counts for TAD Region 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION2", .udesc = "Counts for TAD Region 2", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION3", .udesc = "Counts for TAD Region 3", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION4", .udesc = "Counts for TAD Region 4", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION5", .udesc = "Counts for TAD Region 5", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION6", .udesc = "Counts for TAD Region 6", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION7", .udesc = "Counts for TAD Region 7", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_h_tad_requests_g1[]={ { .uname = "REGION8", .udesc = "Counts for TAD Region 8", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION9", .udesc = "Counts for TAD Region 9", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION10", .udesc = "Counts for TAD Region 10", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION11", .udesc = "Counts for TAD Region 11", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_h_snoop_resp[]={ { .uname = "RSPI", .udesc = "Filters for snoop responses of RspI. RspI is returned when the remote cache does not have the data or when the remote cache silently evicts data (e.g. RFO hit non-modified line)", .ucode = 0x100, }, { .uname = "RSPS", .udesc = "Filters for snoop responses of RspS. RspS is returned when the remote cache has the data but is not forwarding it. It is a way to let the requesting socket know that it cannot allocate the data in E-state", .ucode = 0x200, }, { .uname = "RSPIFWD", .udesc = "Filters for snoop responses of RspIFwd. RspIFwd is returned when the remote cache agent forwards data and the requesting agent is able to acquire the data in E or M state. 
This is commonly returned with RFO transactions. It can be either HitM or HitFE", .ucode = 0x400, }, { .uname = "RSPSFWD", .udesc = "Filters for snoop responses of RspSFwd. RspSFwd is returned when the remote cache agent forwards data but holds on to its current copy. This is common for data and code reads that hit in a remote socket in E or F state", .ucode = 0x800, }, { .uname = "RSP_WB", .udesc = "Filters for snoop responses of RspIWB or RspSWB. This is returned when a non-RFO request hits in M-state. Data and code reads can return either RspIWB or RspSWB depending on how the system has been configured. InvItoE transactions will also return RspIWB because they must acquire ownership", .ucode = 0x1000, }, { .uname = "RSP_FWD_WB", .udesc = "Filters for snoop responses of RspxFwdxWB. This snoop response is only used in 4s systems. It is used when a snoop HITM in a remote caching agent and it directly forwards data to a requester and simultaneously returns data to the home to be written back to memory", .ucode = 0x2000, }, { .uname = "RSPCNFLCT", .udesc = "Filters for snoop responses of RspConflict. This is returned when a snoop finds an existing outstanding transaction in a remote caching agent when it CMAs that caching agent. This triggers the conflict resolution hardware.
This covers both RspConflct and RspCnflctWBI", .ucode = 0x4000, }, }; static const intel_x86_umask_t ivbep_unc_h_txr_ad_cycles_full[]={ { .uname = "ALL", .udesc = "Counts cycles full from both schedulers", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "SCHED0", .udesc = "Counts cycles full from scheduler bank 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCHED1", .udesc = "Counts cycles full from scheduler bank 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_h_txr_bl_occupancy[]={ { .uname = "SCHED0", .udesc = "Counts cycles full from scheduler bank 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCHED1", .udesc = "Counts cycles full from scheduler bank 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_h_txr_ak_cycles_full[]={ { .uname = "ALL", .udesc = "Counts cycles from both schedulers", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "SCHED0", .udesc = "Counts cycles from scheduler bank 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCHED1", .udesc = "Counts cycles from scheduler bank 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_h_txr_bl[]={ { .uname = "DRS_CACHE", .udesc = "Counts data being sent to the cache", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_CORE", .udesc = "Counts data being sent directly to the requesting core", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_QPI", .udesc = "Counts data being sent to a remote socket over QPI", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, }; #if 0 static const intel_x86_umask_t ivbep_unc_h_addr_opc_match[]={ { .uname = "FILT", .udesc = "Number of addr and opcode matches (opc via opc= or address via addr= modifiers)", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_ADDR, }, }; #endif static const intel_x86_umask_t 
ivbep_unc_h_bt_occupancy[]={ { .uname = "LOCAL", .udesc = "Local", .ucode = 0x100, }, { .uname = "REMOTE", .udesc = "Remote", .ucode = 0x200, }, { .uname = "READS_REMOTE", .udesc = "Reads remote", .ucode = 0x800, }, { .uname = "WRITES_LOCAL", .udesc = "Writes local", .ucode = 0x1000, }, { .uname = "WRITES_REMOTE", .udesc = "Writes remote", .ucode = 0x2000, }, }; static const intel_x86_umask_t ivbep_unc_h_osb[]={ { .uname = "REMOTE", .udesc = "Remote", .ucode = 0x800, }, { .uname = "READS_LOCAL", .udesc = "Local reads", .ucode = 0x200, }, { .uname = "INVITOE_LOCAL", .udesc = "Local InvItoE", .ucode = 0x400, } }; static const intel_x86_umask_t ivbep_unc_h_osb_edr[]={ { .uname = "ALL", .udesc = "All data returns", .ucode = 0x100, .uflags = INTEL_X86_DFL | INTEL_X86_NCOMBO, }, { .uname = "READS_LOCAL_I", .udesc = "Reads to local I", .ucode = 0x200, }, { .uname = "READS_REMOTE_I", .udesc = "Reads to remote I", .ucode = 0x400, }, { .uname = "READS_LOCAL_S", .udesc = "Reads to local S", .ucode = 0x800, }, { .uname = "READS_REMOTE_S", .udesc = "Reads to remote S", .ucode = 0x1000, } }; static const intel_x86_umask_t ivbep_unc_h_ring_ad_used[]={ { .uname = "CCW_VR0_EVEN", .udesc = "Counter-clockwise and even ring polarity on virtual ring 0", .ucode = 0x400, }, { .uname = "CCW_VR0_ODD", .udesc = "Counter-clockwise and odd ring polarity on virtual ring 0", .ucode = 0x800, }, { .uname = "CW_VR0_EVEN", .udesc = "Clockwise and even ring polarity on virtual ring 0", .ucode = 0x100, }, { .uname = "CW_VR0_ODD", .udesc = "Clockwise and odd ring polarity on virtual ring 0", .ucode = 0x200, }, { .uname = "CCW_VR1_EVEN", .udesc = "Counter-clockwise and even ring polarity on virtual ring 1", .ucode = 0x400, }, { .uname = "CCW_VR1_ODD", .udesc = "Counter-clockwise and odd ring polarity on virtual ring 1", .ucode = 0x800, }, { .uname = "CW_VR1_EVEN", .udesc = "Clockwise and even ring polarity on virtual ring 1", .ucode = 0x100, }, { .uname = "CW_VR1_ODD", .udesc = "Clockwise and odd ring 
polarity on virtual ring 1", .ucode = 0x200, }, { .uname = "CW", .udesc = "Clockwise with any polarity on either virtual ring", .ucode = 0x3300, }, { .uname = "CCW", .udesc = "Counter-clockwise with any polarity on either virtual ring", .ucode = 0xcc00, }, }; static const intel_x86_umask_t ivbep_unc_h_snp_resp_recv_local[]={ { .uname = "RSPI", .udesc = "Filters for snoop responses of RspI. RspI is returned when the remote cache does not have the data or when the remote cache silently evicts data (e.g. RFO hit non-modified line)", .ucode = 0x100, }, { .uname = "RSPS", .udesc = "Filters for snoop responses of RspS. RspS is returned when the remote cache has the data but is not forwarding it. It is a way to let the requesting socket know that it cannot allocate the data in E-state", .ucode = 0x200, }, { .uname = "RSPIFWD", .udesc = "Filters for snoop responses of RspIFwd. RspIFwd is returned when the remote cache agent forwards data and the requesting agent is able to acquire the data in E or M state. This is commonly returned with RFO transactions. It can be either HitM or HitFE", .ucode = 0x400, }, { .uname = "RSPSFWD", .udesc = "Filters for snoop responses of RspSFwd. RspSFwd is returned when the remote cache agent forwards data but holds on to its current copy. This is common for data and code reads that hit in a remote socket in E or F state", .ucode = 0x800, }, { .uname = "RSP_WB", .udesc = "Filters for snoop responses of RspIWB or RspSWB. This is returned when a non-RFO request hits in M-state. Data and code reads can return either RspIWB or RspSWB depending on how the system has been configured. InvItoE transactions will also return RspIWB because they must acquire ownership", .ucode = 0x1000, }, { .uname = "RSP_FWD_WB", .udesc = "Filters for snoop responses of RspxFwdxWB. This snoop response is only used in 4s systems.
It is used when a snoop HITM in a remote caching agent and it directly forwards data to a requester and simultaneously returns data to the home to be written back to memory", .ucode = 0x2000, }, { .uname = "RSPCNFLCT", .udesc = "Filters for snoop responses of RspConflict. This is returned when a snoop finds an existing outstanding transaction in a remote caching agent when it CMAs that caching agent. This triggers the conflict resolution hardware. This covers both RspConflct and RspCnflctWBI", .ucode = 0x4000, }, { .uname = "OTHER", .udesc = "Filters all other snoop responses", .ucode = 0x8000, }, }; static const intel_x86_umask_t ivbep_unc_h_txr_ak[]={ { .uname = "NDR", .udesc = "Number of outbound NDR (non-data response) transactions sent on the AK ring. AK NDR is used for messages to the local socket", .ucode = 0x100, }, { .uname = "CRD_CBO", .udesc = "Number of outbound CRD transactions sent on the AK ring to CBO", .ucode = 0x200, }, { .uname = "CRD_QPI", .udesc = "Number of outbound CRD transactions sent on the AK ring to QPI", .ucode = 0x400, }, }; static const intel_x86_umask_t ivbep_unc_h_iodc_conflicts[]={ { .uname = "ANY", .udesc = "Any conflict", .ucode = 0x100, .uflags = INTEL_X86_DFL | INTEL_X86_NCOMBO, }, { .uname = "LAST", .udesc = "Last conflict", .ucode = 0x400, } }; static const intel_x86_entry_t intel_ivbep_unc_h_pe[]={ { .name = "UNC_H_CLOCKTICKS", .desc = "HA Uncore clockticks", .modmsk = IVBEP_UNC_HA_ATTRS, .cntmsk = 0xf, .code = 0x00, }, { .name = "UNC_H_CONFLICT_CYCLES", .desc = "Conflict Checks", .code = 0xb, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_conflict_cycles), .umasks = ivbep_unc_h_conflict_cycles, }, { .name = "UNC_H_DIRECT2CORE_COUNT", .desc = "Direct2Core Messages Sent", .code = 0x11, .cntmsk = 0xf, .modmsk = IVBEP_UNC_HA_ATTRS, }, { .name = "UNC_H_DIRECT2CORE_CYCLES_DISABLED", .desc = "Cycles when Direct2Core was Disabled", .code = 0x12, .cntmsk = 0xf, .modmsk =
IVBEP_UNC_HA_ATTRS, }, { .name = "UNC_H_DIRECT2CORE_TXN_OVERRIDE", .desc = "Number of Reads that had Direct2Core Overridden", .code = 0x13, .cntmsk = 0xf, .modmsk = IVBEP_UNC_HA_ATTRS, }, { .name = "UNC_H_DIRECTORY_LOOKUP", .desc = "Directory Lookups", .code = 0xc, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_directory_lookup), .umasks = ivbep_unc_h_directory_lookup }, { .name = "UNC_H_DIRECTORY_UPDATE", .desc = "Directory Updates", .code = 0xd, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_directory_update), .umasks = ivbep_unc_h_directory_update }, { .name = "UNC_H_IGR_NO_CREDIT_CYCLES", .desc = "Cycles without QPI Ingress Credits", .code = 0x22, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_igr_no_credit_cycles), .umasks = ivbep_unc_h_igr_no_credit_cycles }, { .name = "UNC_H_IMC_RETRY", .desc = "Retry Events", .code = 0x1e, .cntmsk = 0xf, .modmsk = IVBEP_UNC_HA_ATTRS, }, { .name = "UNC_H_IMC_WRITES", .desc = "HA to IMC Full Line Writes Issued", .code = 0x1a, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_imc_writes), .umasks = ivbep_unc_h_imc_writes }, { .name = "UNC_H_IMC_READS", .desc = "HA to IMC normal priority reads issued", .code = 0x17, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_imc_reads), .umasks = ivbep_unc_h_imc_reads }, { .name = "UNC_H_REQUESTS", .desc = "Read and Write Requests", .code = 0x1, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_requests), .umasks = ivbep_unc_h_requests }, { .name = "UNC_H_RPQ_CYCLES_NO_REG_CREDITS", .desc = "IMC RPQ Credits Empty", .code = 0x15, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_rpq_cycles_no_reg_credits), .umasks = ivbep_unc_h_rpq_cycles_no_reg_credits }, 
{ .name = "UNC_H_TAD_REQUESTS_G0", .desc = "HA Requests to a TAD Region", .code = 0x1b, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_tad_requests_g0), .umasks = ivbep_unc_h_tad_requests_g0 }, { .name = "UNC_H_TAD_REQUESTS_G1", .desc = "HA Requests to a TAD Region", .code = 0x1c, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_tad_requests_g1), .umasks = ivbep_unc_h_tad_requests_g1 }, { .name = "UNC_H_TXR_AD_CYCLES_FULL", .desc = "AD Egress Full", .code = 0x2a, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_txr_ad_cycles_full), .umasks = ivbep_unc_h_txr_ad_cycles_full }, { .name = "UNC_H_TXR_AK_CYCLES_FULL", .desc = "AK Egress Full", .code = 0x32, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_txr_ak_cycles_full), .umasks = ivbep_unc_h_txr_ak_cycles_full }, { .name = "UNC_H_TXR_AK", .desc = "Outbound Ring Transactions on AK", .code = 0xe, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_txr_ak), .umasks = ivbep_unc_h_txr_ak }, { .name = "UNC_H_TXR_BL", .desc = "Outbound DRS Ring Transactions to Cache", .code = 0x10, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_txr_bl), .umasks = ivbep_unc_h_txr_bl }, { .name = "UNC_H_TXR_BL_CYCLES_FULL", .desc = "BL Egress Full", .code = 0x36, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_txr_ak_cycles_full), .umasks = ivbep_unc_h_txr_ak_cycles_full, /* identical to snbep_unc_h_txr_ak_cycles_full */ }, { .name = "UNC_H_WPQ_CYCLES_NO_REG_CREDITS", .desc = "HA IMC CHN0 WPQ Credits Empty", .code = 0x18, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_rpq_cycles_no_reg_credits), .umasks = ivbep_unc_h_rpq_cycles_no_reg_credits, /* shared */ }, 
{ .name = "UNC_H_BT_BYPASS", .desc = "Backup Tracker bypass", .code = 0x52, .cntmsk = 0xf, .modmsk = IVBEP_UNC_HA_ATTRS, }, { .name = "UNC_H_BYPASS_IMC", .desc = "HA to IMC bypass", .code = 0x14, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_bypass_imc), .umasks = ivbep_unc_h_bypass_imc, }, { .name = "UNC_H_BT_CYCLES_NE", .desc = "Backup Tracker cycles not empty", .code = 0x42, .cntmsk = 0xf, .modmsk = IVBEP_UNC_HA_ATTRS, }, { .name = "UNC_H_BT_OCCUPANCY", .desc = "Backup Tracker inserts", .code = 0x43, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_bt_occupancy), .umasks = ivbep_unc_h_bt_occupancy, }, { .name = "UNC_H_IGR_AD_QPI2", .desc = "AD QPI Link 2 credit accumulator", .code = 0x59, .cntmsk = 0xf, .modmsk = IVBEP_UNC_HA_ATTRS, }, { .name = "UNC_H_IGR_BL_QPI2", .desc = "BL QPI Link 2 credit accumulator", .code = 0x5a, .cntmsk = 0xf, .modmsk = IVBEP_UNC_HA_ATTRS, }, { .name = "UNC_H_IODC_INSERTS", .desc = "IODC inserts", .code = 0x56, .cntmsk = 0xf, .modmsk = IVBEP_UNC_HA_ATTRS, }, { .name = "UNC_H_IODC_CONFLICTS", .desc = "IODC conflicts", .code = 0x57, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_iodc_conflicts), .umasks = ivbep_unc_h_iodc_conflicts, }, { .name = "UNC_H_IODC_OLEN_WBMTOI", .desc = "IODC zero length writes", .code = 0x58, .cntmsk = 0xf, .modmsk = IVBEP_UNC_HA_ATTRS, }, { .name = "UNC_H_OSB", .desc = "OSB snoop broadcast", .code = 0x53, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_osb), .umasks = ivbep_unc_h_osb, }, { .name = "UNC_H_OSB_EDR", .desc = "OSB early data return", .code = 0x54, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_osb_edr), .umasks = ivbep_unc_h_osb_edr, }, { .name = "UNC_H_RING_AD_USED", .desc = "AD ring in use", .code = 0x3e, .cntmsk = 0xf, .ngrp = 1, .modmsk = 
IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_ring_ad_used), .umasks = ivbep_unc_h_ring_ad_used, }, { .name = "UNC_H_RING_AK_USED", .desc = "AK ring in use", .code = 0x3f, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_ring_ad_used), /* shared */ .umasks = ivbep_unc_h_ring_ad_used, }, { .name = "UNC_H_RING_BL_USED", .desc = "BL ring in use", .code = 0x40, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_ring_ad_used), /* shared */ .umasks = ivbep_unc_h_ring_ad_used, }, { .name = "UNC_H_DIRECTORY_LAT_OPT", .desc = "Directory latency optimization data return path taken", .code = 0x41, .cntmsk = 0xf, .modmsk = IVBEP_UNC_HA_ATTRS, }, { .name = "UNC_H_SNP_RESP_RECV_LOCAL", .desc = "Snoop responses received local", .code = 0x60, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_snp_resp_recv_local), .umasks = ivbep_unc_h_snp_resp_recv_local, }, { .name = "UNC_H_TXR_BL_OCCUPANCY", .desc = "BL Egress occupancy", .code = 0x34, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_txr_bl_occupancy), .umasks = ivbep_unc_h_txr_bl_occupancy, }, { .name = "UNC_H_SNOOP_RESP", .desc = "Snoop responses received", .code = 0x21, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_h_snoop_resp), .umasks = ivbep_unc_h_snoop_resp }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_ivbep_unc_imc_events.h000066400000000000000000000437231502707512200256010ustar00rootroot00000000000000/* * Copyright (c) 2014 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: ivbep_unc_imc (Intel IvyBridge-EP IMC uncore PMU) */ static const intel_x86_umask_t ivbep_unc_m_cas_count[]={ { .uname = "ALL", .udesc = "Counts total number of DRAM CAS commands issued on this channel", .ucode = 0xf00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "RD", .udesc = "Counts all DRAM reads on this channel, incl. underfills", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_REG", .udesc = "Counts number of DRAM read CAS commands issued on this channel, incl. 
regular read CAS and those with implicit precharge", .ucode = 0x100, }, { .uname = "RD_UNDERFILL", .udesc = "Counts number of underfill reads issued by the memory controller", .ucode = 0x200, }, { .uname = "WR", .udesc = "Counts number of DRAM write CAS commands on this channel", .ucode = 0xc00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_RMM", .udesc = "Counts Number of opportunistic DRAM write CAS commands issued on this channel", .ucode = 0x800, }, { .uname = "WR_WMM", .udesc = "Counts number of DRAM write CAS commands issued on this channel while in Write-Major mode", .ucode = 0x400, }, { .uname = "RD_RMM", .udesc = "Counts Number of opportunistic DRAM read CAS commands issued on this channel", .ucode = 0x1000, }, { .uname = "RD_WMM", .udesc = "Counts number of DRAM read CAS commands issued on this channel while in Write-Major mode", .ucode = 0x2000, }, }; static const intel_x86_umask_t ivbep_unc_m_dram_refresh[]={ { .uname = "HIGH", .udesc = "TBD", .ucode = 0x400, }, { .uname = "PANIC", .udesc = "TBD", .ucode = 0x200, }, }; static const intel_x86_umask_t ivbep_unc_m_major_modes[]={ { .uname = "ISOCH", .udesc = "Counts cycles in ISOCH Major mode", .ucode = 0x800, }, { .uname = "PARTIAL", .udesc = "Counts cycles in Partial Major mode", .ucode = 0x400, }, { .uname = "READ", .udesc = "Counts cycles in Read Major mode", .ucode = 0x100, }, { .uname = "WRITE", .udesc = "Counts cycles in Write Major mode", .ucode = 0x200, }, }; static const intel_x86_umask_t ivbep_unc_m_power_cke_cycles[]={ { .uname = "RANK0", .udesc = "Count cycles for rank 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK1", .udesc = "Count cycles for rank 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK2", .udesc = "Count cycles for rank 2", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK3", .udesc = "Count cycles for rank 3", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK4", .udesc = "Count cycles for rank 4", .ucode = 0x1000, 
.uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK5", .udesc = "Count cycles for rank 5", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK6", .udesc = "Count cycles for rank 6", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK7", .udesc = "Count cycles for rank 7", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_m_preemption[]={ { .uname = "RD_PREEMPT_RD", .udesc = "Counts read over read preemptions", .ucode = 0x100, }, { .uname = "RD_PREEMPT_WR", .udesc = "Counts read over write preemptions", .ucode = 0x200, }, }; static const intel_x86_umask_t ivbep_unc_m_pre_count[]={ { .uname = "PAGE_CLOSE", .udesc = "Counts number of DRAM precharge commands sent on this channel as a result of the page close counter expiring", .ucode = 0x200, }, { .uname = "PAGE_MISS", .udesc = "Counts number of DRAM precharge commands sent on this channel as a result of page misses", .ucode = 0x100, }, { .uname = "RD", .udesc = "Precharge due to read", .ucode = 0x400, }, { .uname = "WR", .udesc = "Precharge due to write", .ucode = 0x800, }, { .uname = "BYP", .udesc = "Precharge due to bypass", .ucode = 0x1000, }, }; static const intel_x86_umask_t ivbep_unc_m_act_count[]={ { .uname = "RD", .udesc = "Activate due to read", .ucode = 0x100, }, { .uname = "WR", .udesc = "Activate due to write", .ucode = 0x200, }, { .uname = "BYP", .udesc = "Activate due to bypass", .ucode = 0x800, }, }; static const intel_x86_umask_t ivbep_unc_m_byp_cmds[]={ { .uname = "ACT", .udesc = "ACT command issued by 2 cycle bypass", .ucode = 0x100, }, { .uname = "CAS", .udesc = "CAS command issued by 2 cycle bypass", .ucode = 0x200, }, { .uname = "PRE", .udesc = "PRE command issued by 2 cycle bypass", .ucode = 0x400, }, }; static const intel_x86_umask_t ivbep_unc_m_rd_cas_prio[]={ { .uname = "LOW", .udesc = "Read CAS issued with low priority", .ucode = 0x100, }, { .uname = "MED", .udesc = "Read CAS issued with medium priority", .ucode = 0x200, }, { 
.uname = "HIGH", .udesc = "Read CAS issued with high priority", .ucode = 0x400, }, { .uname = "PANIC", .udesc = "Read CAS issued with panic non isoch priority (starved)", .ucode = 0x800, }, }; static const intel_x86_umask_t ivbep_unc_m_rd_cas_rank0[]={ { .uname = "BANK0", .udesc = "Bank 0", .ucode = 0x100, }, { .uname = "BANK1", .udesc = "Bank 1", .ucode = 0x200, }, { .uname = "BANK2", .udesc = "Bank 2", .ucode = 0x400, }, { .uname = "BANK3", .udesc = "Bank 3", .ucode = 0x800, }, { .uname = "BANK4", .udesc = "Bank 4", .ucode = 0x1000, }, { .uname = "BANK5", .udesc = "Bank 5", .ucode = 0x2000, }, { .uname = "BANK6", .udesc = "Bank 6", .ucode = 0x4000, }, { .uname = "BANK7", .udesc = "Bank 7", .ucode = 0x8000, } }; static const intel_x86_umask_t ivbep_unc_m_vmse_wr_push[]={ { .uname = "WMM", .udesc = "VMSE write push issued in WMM", .ucode = 0x100, }, { .uname = "RMM", .udesc = "VMSE write push issued in RMM", .ucode = 0x200, } }; static const intel_x86_umask_t ivbep_unc_m_wmm_to_rmm[]={ { .uname = "LOW_THRES", .udesc = "Transition from WMM to RMM because of starve counter", .ucode = 0x100, }, { .uname = "STARVE", .udesc = "TBD", .ucode = 0x200, }, { .uname = "VMSE_RETRY", .udesc = "TBD", .ucode = 0x400, } }; static const intel_x86_entry_t intel_ivbep_unc_m_pe[]={ { .name = "UNC_M_CLOCKTICKS", .desc = "IMC Uncore clockticks (fixed counter)", .modmsk = 0x0, .cntmsk = 0x100000000ull, .code = 0xff, /* perf pseudo encoding for fixed counter */ .flags = INTEL_X86_FIXED, }, { .name = "UNC_M_DCLOCKTICKS", .desc = "IMC Uncore clockticks (generic counters)", .modmsk = IVBEP_UNC_IMC_ATTRS, .cntmsk = 0xf, .code = 0x00, /*encoding for generic counters */ }, { .name = "UNC_M_ACT_COUNT", .desc = "DRAM Activate Count", .code = 0x1, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_act_count), .umasks = ivbep_unc_m_act_count }, { .name = "UNC_M_CAS_COUNT", .desc = "DRAM RD_CAS and WR_CAS Commands.", .code = 0x4, .cntmsk = 0xf, .ngrp = 
1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_cas_count), .umasks = ivbep_unc_m_cas_count }, { .name = "UNC_M_DRAM_PRE_ALL", .desc = "DRAM Precharge All Commands", .code = 0x6, .cntmsk = 0xf, .modmsk = IVBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_DRAM_REFRESH", .desc = "Number of DRAM Refreshes Issued", .code = 0x5, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_dram_refresh), .umasks = ivbep_unc_m_dram_refresh }, { .name = "UNC_M_ECC_CORRECTABLE_ERRORS", .desc = "ECC Correctable Errors", .code = 0x9, .cntmsk = 0xf, .modmsk = IVBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_MAJOR_MODES", .desc = "Cycles in a Major Mode", .code = 0x7, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_major_modes), .umasks = ivbep_unc_m_major_modes }, { .name = "UNC_M_POWER_CHANNEL_DLLOFF", .desc = "Channel DLLOFF Cycles", .code = 0x84, .cntmsk = 0xf, .modmsk = IVBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_POWER_CHANNEL_PPD", .desc = "Channel PPD Cycles", .code = 0x85, .cntmsk = 0xf, .modmsk = IVBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_POWER_CKE_CYCLES", .desc = "CKE_ON_CYCLES by Rank", .code = 0x83, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_power_cke_cycles), .umasks = ivbep_unc_m_power_cke_cycles }, { .name = "UNC_M_POWER_CRITICAL_THROTTLE_CYCLES", .desc = "Critical Throttle Cycles", .code = 0x86, .cntmsk = 0xf, .modmsk = IVBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_POWER_SELF_REFRESH", .desc = "Clock-Enabled Self-Refresh", .code = 0x43, .cntmsk = 0xf, .modmsk = IVBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_POWER_THROTTLE_CYCLES", .desc = "Throttle Cycles", .code = 0x41, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_power_cke_cycles), .umasks = ivbep_unc_m_power_cke_cycles /* identical to snbep_unc_m_power_cke_cycles */ }, { .name = "UNC_M_PREEMPTION", .desc = "Read Preemption 
Count", .code = 0x8, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_preemption), .umasks = ivbep_unc_m_preemption }, { .name = "UNC_M_PRE_COUNT", .desc = "DRAM Precharge commands.", .code = 0x2, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_pre_count), .umasks = ivbep_unc_m_pre_count }, { .name = "UNC_M_RPQ_CYCLES_NE", .desc = "Read Pending Queue Not Empty", .code = 0x11, .cntmsk = 0xf, .modmsk = IVBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_RPQ_INSERTS", .desc = "Read Pending Queue Allocations", .code = 0x10, .cntmsk = 0xf, .modmsk = IVBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WPQ_CYCLES_FULL", .desc = "Write Pending Queue Full Cycles", .code = 0x22, .cntmsk = 0xf, .modmsk = IVBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WPQ_CYCLES_NE", .desc = "Write Pending Queue Not Empty", .code = 0x21, .cntmsk = 0xf, .modmsk = IVBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WPQ_INSERTS", .desc = "Write Pending Queue Allocations", .code = 0x20, .cntmsk = 0xf, .modmsk = IVBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WPQ_READ_HIT", .desc = "Write Pending Queue CAM Match", .code = 0x23, .cntmsk = 0xf, .modmsk = IVBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WPQ_WRITE_HIT", .desc = "Write Pending Queue CAM Match", .code = 0x24, .cntmsk = 0xf, .modmsk = IVBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_BYP_CMDS", .desc = "Bypass command event", .code = 0xa1, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_byp_cmds), .umasks = ivbep_unc_m_byp_cmds }, { .name = "UNC_M_RD_CAS_PRIO", .desc = "Read CAS priority", .code = 0xa0, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_prio), .umasks = ivbep_unc_m_rd_cas_prio }, { .name = "UNC_M_RD_CAS_RANK0", .desc = "Read CAS access to Rank 0", .code = 0xb0, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_rank0), .umasks = 
ivbep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_RD_CAS_RANK1", .desc = "Read CAS access to Rank 1", .code = 0xb1, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_rank0), /* shared */ .umasks = ivbep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_RD_CAS_RANK2", .desc = "Read CAS access to Rank 2", .code = 0xb2, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_rank0), /* shared */ .umasks = ivbep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_RD_CAS_RANK3", .desc = "Read CAS access to Rank 3", .code = 0xb3, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_rank0), /* shared */ .umasks = ivbep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_RD_CAS_RANK4", .desc = "Read CAS access to Rank 4", .code = 0xb4, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_rank0), /* shared */ .umasks = ivbep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_RD_CAS_RANK5", .desc = "Read CAS access to Rank 5", .code = 0xb5, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_rank0), /* shared */ .umasks = ivbep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_RD_CAS_RANK6", .desc = "Read CAS access to Rank 6", .code = 0xb6, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_rank0), /* shared */ .umasks = ivbep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_RD_CAS_RANK7", .desc = "Read CAS access to Rank 7", .code = 0xb7, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_rank0), /* shared */ .umasks = ivbep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_VMSE_MXB_WR_OCCUPANCY", .desc = "VMSE MXB write buffer occupancy", .code = 0x91, .cntmsk = 0xf, .modmsk = IVBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_VMSE_WR_PUSH", .desc = "VMSE WR push issued", .code = 0x90, .cntmsk = 0xf, 
.ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_vmse_wr_push), .umasks = ivbep_unc_m_vmse_wr_push }, { .name = "UNC_M_WMM_TO_RMM", .desc = "Transitions from WMM to RMM because of low threshold", .code = 0xc0, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_wmm_to_rmm), .umasks = ivbep_unc_m_wmm_to_rmm }, { .name = "UNC_M_WRONG_MM", .desc = "Not getting the requested major mode", .code = 0xc1, .cntmsk = 0xf, .modmsk = IVBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WR_CAS_RANK0", .desc = "Write CAS access to Rank 0", .code = 0xb8, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_rank0), /* shared */ .umasks = ivbep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_WR_CAS_RANK1", .desc = "Write CAS access to Rank 1", .code = 0xb9, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_rank0), /* shared */ .umasks = ivbep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_WR_CAS_RANK2", .desc = "Write CAS access to Rank 2", .code = 0xba, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_rank0), /* shared */ .umasks = ivbep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_WR_CAS_RANK3", .desc = "Write CAS access to Rank 3", .code = 0xbb, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_rank0), /* shared */ .umasks = ivbep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_WR_CAS_RANK4", .desc = "Write CAS access to Rank 4", .code = 0xbc, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_rank0), /* shared */ .umasks = ivbep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_WR_CAS_RANK5", .desc = "Write CAS access to Rank 5", .code = 0xbd, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_rank0), /* shared */ .umasks = 
ivbep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_WR_CAS_RANK6", .desc = "Write CAS access to Rank 6", .code = 0xbe, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_rank0), /* shared */ .umasks = ivbep_unc_m_rd_cas_rank0 }, { .name = "UNC_M_WR_CAS_RANK7", .desc = "Write CAS access to Rank 7", .code = 0xbf, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_m_rd_cas_rank0), /* shared */ .umasks = ivbep_unc_m_rd_cas_rank0 }, };
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_ivbep_unc_irp_events.h
/* * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated.
* * PMU: ivbep_unc_irp (Intel IvyBridge-EP IRP uncore) */ static const intel_x86_umask_t ivbep_unc_i_address_match[]={ { .uname = "STALL_COUNT", .udesc = "Number of times when it is not possible to merge two conflicting requests, a stall event occurs", .ucode = 0x100, }, { .uname = "MERGE_COUNT", .udesc = "Number of times when two requests to the same address from the same source are received back to back, it is possible to merge them", .ucode = 0x200, }, }; static const intel_x86_umask_t ivbep_unc_i_cache_ack_pending_occupancy[]={ { .uname = "ANY", .udesc = "Any source", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "SOURCE", .udesc = "Track all requests from any source port", .ucode = 0x200, }, }; static const intel_x86_umask_t ivbep_unc_i_tickles[]={ { .uname = "LOST_OWNERSHIP", .udesc = "Number of requests that lost ownership as a result of a tickle", .ucode = 0x100, }, { .uname = "TOP_OF_QUEUE", .udesc = "Number of cases when a tickle was received but the request was at the head of the queue in the switch. In this case data is returned rather than releasing ownership", .ucode = 0x200, }, }; static const intel_x86_umask_t ivbep_unc_i_transactions[]={ { .uname = "READS", .udesc = "Number of read requests (not including read prefetches)", .ucode = 0x100, }, { .uname = "WRITES", .udesc = "Number of write requests.
Each write should have a prefetch, so there is no need to explicitly track these requests", .ucode = 0x200, }, { .uname = "RD_PREFETCHES", .udesc = "Number of read prefetches", .ucode = 0x400, }, }; static const intel_x86_entry_t intel_ivbep_unc_i_pe[]={ { .name = "UNC_I_CLOCKTICKS", .desc = "Number of uclks in domain", .code = 0x0, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_ADDRESS_MATCH", .desc = "Address match conflict count", .code = 0x17, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_IRP_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_i_address_match), .umasks = ivbep_unc_i_address_match }, { .name = "UNC_I_CACHE_ACK_PENDING_OCCUPANCY", .desc = "Write ACK pending occupancy", .code = 0x14, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_IRP_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_i_cache_ack_pending_occupancy), .umasks = ivbep_unc_i_cache_ack_pending_occupancy }, { .name = "UNC_I_CACHE_OWN_OCCUPANCY", .desc = "Outstanding write ownership occupancy", .code = 0x13, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_IRP_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_i_cache_ack_pending_occupancy), .umasks = ivbep_unc_i_cache_ack_pending_occupancy /* shared */ }, { .name = "UNC_I_CACHE_READ_OCCUPANCY", .desc = "Outstanding read occupancy", .code = 0x10, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_IRP_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_i_cache_ack_pending_occupancy), .umasks = ivbep_unc_i_cache_ack_pending_occupancy /* shared */ }, { .name = "UNC_I_CACHE_TOTAL_OCCUPANCY", .desc = "Total write cache occupancy", .code = 0x12, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_IRP_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_i_cache_ack_pending_occupancy), .umasks = ivbep_unc_i_cache_ack_pending_occupancy /* shared */ }, { .name = "UNC_I_CACHE_WRITE_OCCUPANCY", .desc = "Outstanding write occupancy", .code = 0x11, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_IRP_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_i_cache_ack_pending_occupancy), 
.umasks = ivbep_unc_i_cache_ack_pending_occupancy /* shared */ }, { .name = "UNC_I_RXR_AK_CYCLES_FULL", .desc = "TBD", .code = 0xb, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_AK_INSERTS", .desc = "Egress cycles full", .code = 0xa, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_AK_OCCUPANCY", .desc = "TBD", .code = 0x0c, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_DRS_CYCLES_FULL", .desc = "TBD", .code = 0x4, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_DRS_INSERTS", .desc = "BL Ingress occupancy DRS", .code = 0x1, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_DRS_OCCUPANCY", .desc = "TBD", .code = 0x7, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_NCB_CYCLES_FULL", .desc = "TBD", .code = 0x5, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_NCB_INSERTS", .desc = "BL Ingress occupancy NCB", .code = 0x2, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_NCB_OCCUPANCY", .desc = "TBD", .code = 0x8, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_NCS_CYCLES_FULL", .desc = "TBD", .code = 0x6, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_NCS_INSERTS", .desc = "BL Ingress Occupancy NCS", .code = 0x3, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_RXR_BL_NCS_OCCUPANCY", .desc = "TBD", .code = 0x9, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_TICKLES", .desc = "Tickle count", .code = 0x16, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_IRP_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_i_tickles), .umasks = ivbep_unc_i_tickles }, { .name = "UNC_I_TRANSACTIONS", .desc = "Inbound transaction count", .code = 0x15, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_IRP_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_i_transactions), .umasks = ivbep_unc_i_transactions }, { .name =
"UNC_I_TXR_AD_STALL_CREDIT_CYCLES", .desc = "No AD Egress credit stalls", .code = 0x18, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_TXR_BL_STALL_CREDIT_CYCLES", .desc = "No BL Egress credit stalls", .code = 0x19, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_TXR_DATA_INSERTS_NCB", .desc = "Outbound read requests", .code = 0xe, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_TXR_DATA_INSERTS_NCS", .desc = "Outbound read requests", .code = 0xf, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_TXR_REQUEST_OCCUPANCY", .desc = "Outbound request queue occupancy", .code = 0xd, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, { .name = "UNC_I_WRITE_ORDERING_STALL_CYCLES", .desc = "Write ordering stalls", .code = 0x1a, .cntmsk = 0x3, .modmsk = SNBEP_UNC_IRP_ATTRS, }, };
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_ivbep_unc_pcu_events.h
/* * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: ivbep_unc_pcu (Intel IvyBridge-EP PCU uncore) */ static const intel_x86_umask_t ivbep_unc_p_power_state_occupancy[]={ { .uname = "CORES_C0", .udesc = "Counts number of cores in C0", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORES_C3", .udesc = "Counts number of cores in C3", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORES_C6", .udesc = "Counts number of cores in C6", .ucode = 0xc000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_ivbep_unc_p_pe[]={ { .name = "UNC_P_CLOCKTICKS", .desc = "PCU Uncore clockticks", .modmsk = IVBEP_UNC_PCU_ATTRS, .cntmsk = 0xf, .code = 0x00, }, { .name = "UNC_P_CORE0_TRANSITION_CYCLES", .desc = "Core 0 C State Transition Cycles", .code = 0x70, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE1_TRANSITION_CYCLES", .desc = "Core 1 C State Transition Cycles", .code = 0x71, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE2_TRANSITION_CYCLES", .desc = "Core 2 C State Transition Cycles", .code = 0x72, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE3_TRANSITION_CYCLES", .desc = "Core 3 C State Transition Cycles", .code = 0x73, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE4_TRANSITION_CYCLES", .desc = "Core 4 C State Transition Cycles", .code = 0x74, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE5_TRANSITION_CYCLES", .desc = "Core 5 C State Transition Cycles", .code = 0x75, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE6_TRANSITION_CYCLES", .desc = "Core 6 C State Transition 
Cycles", .code = 0x76, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE7_TRANSITION_CYCLES", .desc = "Core 7 C State Transition Cycles", .code = 0x77, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE8_TRANSITION_CYCLES", .desc = "Core 8 C State Transition Cycles", .code = 0x78, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE9_TRANSITION_CYCLES", .desc = "Core 9 C State Transition Cycles", .code = 0x79, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE10_TRANSITION_CYCLES", .desc = "Core 10 C State Transition Cycles", .code = 0x7a, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE11_TRANSITION_CYCLES", .desc = "Core 11 C State Transition Cycles", .code = 0x7b, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE12_TRANSITION_CYCLES", .desc = "Core 12 C State Transition Cycles", .code = 0x7c, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE13_TRANSITION_CYCLES", .desc = "Core 13 C State Transition Cycles", .code = 0x7d, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE14_TRANSITION_CYCLES", .desc = "Core 14 C State Transition Cycles", .code = 0x7e, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DELAYED_C_STATE_ABORT_CORE0", .desc = "Deep C state rejection Core 0", .code = 0x17 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DELAYED_C_STATE_ABORT_CORE1", .desc = "Deep C state rejection Core 1", .code = 0x18 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DELAYED_C_STATE_ABORT_CORE2", .desc = "Deep C state rejection Core 2", .code = 0x19 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DELAYED_C_STATE_ABORT_CORE3", .desc = "Deep C state rejection Core 3", .code = 0x1a | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, 
}, { .name = "UNC_P_DELAYED_C_STATE_ABORT_CORE4", .desc = "Deep C state rejection Core 4", .code = 0x1b | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DELAYED_C_STATE_ABORT_CORE5", .desc = "Deep C state rejection Core 5", .code = 0x1c | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DELAYED_C_STATE_ABORT_CORE6", .desc = "Deep C state rejection Core 6", .code = 0x1d | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DELAYED_C_STATE_ABORT_CORE7", .desc = "Deep C state rejection Core 7", .code = 0x1e | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DELAYED_C_STATE_ABORT_CORE8", .desc = "Deep C state rejection Core 8", .code = 0x1f | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DELAYED_C_STATE_ABORT_CORE9", .desc = "Deep C state rejection Core 9", .code = 0x20 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DELAYED_C_STATE_ABORT_CORE10", .desc = "Deep C state rejection Core 10", .code = 0x21 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DELAYED_C_STATE_ABORT_CORE11", .desc = "Deep C state rejection Core 11", .code = 0x22 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DELAYED_C_STATE_ABORT_CORE12", .desc = "Deep C state rejection Core 12", .code = 0x23 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DELAYED_C_STATE_ABORT_CORE13", .desc = "Deep C state rejection Core 13", .code = 0x24 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DELAYED_C_STATE_ABORT_CORE14", .desc = "Deep C state rejection Core 14", .code = 0x25 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { 
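/*
 * Encoding note (illustrative addition, hedged): in these libpfm4
 * tables, .code carries the event-select value and each umask's .ucode
 * carries the unit-mask value shifted left by 8 (e.g. CORES_C0's .ucode
 * of 0x4000 programs unit-mask bits 0x40). The "(1ULL << 21)" OR-ed
 * into the UNC_P_DELAYED_C_STATE_ABORT_* codes above is the sel_ext
 * bit, which selects the PCU's extended event-select space, as the
 * adjacent sel_ext comments indicate.
 */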
.name = "UNC_P_DEMOTIONS_CORE0", .desc = "Core 0 C State Demotions", .code = 0x1e, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE1", .desc = "Core 1 C State Demotions", .code = 0x1f, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE2", .desc = "Core 2 C State Demotions", .code = 0x20, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE3", .desc = "Core 3 C State Demotions", .code = 0x21, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE4", .desc = "Core 4 C State Demotions", .code = 0x22, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE5", .desc = "Core 5 C State Demotions", .code = 0x23, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE6", .desc = "Core 6 C State Demotions", .code = 0x24, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE7", .desc = "Core 7 C State Demotions", .code = 0x25, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE8", .desc = "Core 8 C State Demotions", .code = 0x40, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE9", .desc = "Core 9 C State Demotions", .code = 0x41, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE10", .desc = "Core 10 C State Demotions", .code = 0x42, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE11", .desc = "Core 11 C State Demotions", .code = 0x43, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE12", .desc = "Core 12 C State Demotions", .code = 0x44, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE13", .desc = "Core 13 C State Demotions", .code = 0x45, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE14", .desc = "Core 14 C State Demotions", .code = 0x46, .cntmsk = 0xf, .modmsk = 
IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_BAND0_CYCLES", .desc = "Frequency Residency", .code = 0xb, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = IVBEP_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_BAND1_CYCLES", .desc = "Frequency Residency", .code = 0xc, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = IVBEP_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_BAND2_CYCLES", .desc = "Frequency Residency", .code = 0xd, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = IVBEP_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_BAND3_CYCLES", .desc = "Frequency Residency", .code = 0xe, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = IVBEP_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_MAX_CURRENT_CYCLES", .desc = "Current Strongest Upper Limit Cycles", .code = 0x7, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_MAX_LIMIT_THERMAL_CYCLES", .desc = "Thermal Strongest Upper Limit Cycles", .code = 0x4, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_MAX_OS_CYCLES", .desc = "OS Strongest Upper Limit Cycles", .code = 0x6, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_MAX_POWER_CYCLES", .desc = "Power Strongest Upper Limit Cycles", .code = 0x5, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_MIN_PERF_P_CYCLES", .desc = "Perf P Limit Strongest Lower Limit Cycles", .code = 0x02 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_MIN_IO_P_CYCLES", .desc = "IO P Limit Strongest Lower Limit Cycles", .code = 0x61, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_TRANS_CYCLES", .desc = "Cycles spent changing Frequency", .code = 0x60, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_MEMORY_PHASE_SHEDDING_CYCLES", .desc = "Memory Phase 
Shedding Cycles", .code = 0x2f, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_PKG_C_EXIT_LATENCY", .desc = "Package C state exit latency. Counts cycles the package is transitioning from C2 to C3", .code = 0x26 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_POWER_STATE_OCCUPANCY", .desc = "Number of cores in C0", .code = 0x80, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_PCU_ATTRS | _SNBEP_UNC_ATTR_I, /* available only on occ_invert */ .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_p_power_state_occupancy), .umasks = ivbep_unc_p_power_state_occupancy }, { .name = "UNC_P_PROCHOT_EXTERNAL_CYCLES", .desc = "External Prochot", .code = 0xa, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_PROCHOT_INTERNAL_CYCLES", .desc = "Internal Prochot", .code = 0x9, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_TOTAL_TRANSITION_CYCLES", .desc = "Total Core C State Transition Cycles", .code = 0x63, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_VOLT_TRANS_CYCLES_CHANGE", .desc = "Cycles Changing Voltage", .code = 0x3, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_VOLT_TRANS_CYCLES_DECREASE", .desc = "Cycles Decreasing Voltage", .code = 0x2, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_VOLT_TRANS_CYCLES_INCREASE", .desc = "Cycles Increasing Voltage", .code = 0x1, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_VR_HOT_CYCLES", .desc = "VR Hot", .code = 0x32, .cntmsk = 0xf, .modmsk = IVBEP_UNC_PCU_ATTRS, }, };
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_ivbep_unc_qpi_events.h
/* * Copyright (c) 2014 Google Inc.
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. 
* * PMU: ivbep_unc_qpi (Intel IvyBridge-EP QPI uncore) */ static const intel_x86_umask_t ivbep_unc_q_direct2core[]={ { .uname = "FAILURE_CREDITS", .udesc = "Number of spawn failures due to lack of Egress credits", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAILURE_CREDITS_RBT", .udesc = "Number of spawn failures due to lack of Egress credit and route-back table (RBT) bit was not set", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAILURE_RBT_HIT", .udesc = "Number of spawn failures because route-back table (RBT) specified that the transaction should not trigger a direct2core transaction", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SUCCESS_RBT_HIT", .udesc = "Number of spawn successes", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAILURE_MISS", .udesc = "Number of spawn failures due to RBT tag not matching although the valid bit was set and there were enough Egress credits", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAILURE_CREDITS_MISS", .udesc = "Number of spawn failures due to RBT tag not matching and there were not enough Egress credits.
The valid bit was set", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAILURE_RBT_MISS", .udesc = "Number of spawn failures due to RBT tag not matching, the valid bit was not set but there were enough Egress credits", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAILURE_CREDITS_RBT_MISS", .udesc = "Number of spawn failures due to RBT tag not matching, the valid bit was not set and there were not enough Egress credits", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_q_rxl_credits_consumed_vn0[]={ { .uname = "DRS", .udesc = "Number of times VN0 consumed for DRS message class", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM", .udesc = "Number of times VN0 consumed for HOM message class", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB", .udesc = "Number of times VN0 consumed for NCB message class", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS", .udesc = "Number of times VN0 consumed for NCS message class", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NDR", .udesc = "Number of times VN0 consumed for NDR message class", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP", .udesc = "Number of times VN0 consumed for SNP message class", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_q_rxl_credits_consumed_vn1[]={ { .uname = "DRS", .udesc = "Number of times VN1 consumed for DRS message class", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM", .udesc = "Number of times VN1 consumed for HOM message class", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB", .udesc = "Number of times VN1 consumed for NCB message class", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS", .udesc = "Number of times VN1 consumed for NCS message class", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NDR", .udesc = "Number of times VN1 consumed for NDR 
message class", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP", .udesc = "Number of times VN1 consumed for SNP message class", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_q_rxl_flits_g0[]={ { .uname = "DATA", .udesc = "Number of data flits over QPI", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IDLE", .udesc = "Number of flits over QPI that do not hold protocol payload", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NON_DATA", .udesc = "Number of non-NULL non-data flits over QPI", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_q_txl_flits_g0[]={ { .uname = "DATA", .udesc = "Number of data flits over QPI", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NON_DATA", .udesc = "Number of non-NULL non-data flits over QPI", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_q_rxl_flits_g1[]={ { .uname = "DRS", .udesc = "Number of flits over QPI on the Data Response (DRS) channel", .ucode = 0x1800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_DATA", .udesc = "Number of data flits over QPI on the Data Response (DRS) channel", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_NONDATA", .udesc = "Number of protocol flits over QPI on the Data Response (DRS) channel", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM", .udesc = "Number of flits over QPI on the home channel", .ucode = 0x600, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM_NONREQ", .udesc = "Number of non-request flits over QPI on the home channel", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM_REQ", .udesc = "Number of data requests over QPI on the home channel", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP", .udesc = "Number of snoop request flits over QPI", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t 
ivbep_unc_q_rxl_flits_g2[]={ { .uname = "NCB", .udesc = "Number of non-coherent bypass flits", .ucode = 0xc00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB_DATA", .udesc = "Number of non-coherent data flits", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB_NONDATA", .udesc = "Number of bypass non-data flits", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS", .udesc = "Number of non-coherent standard (NCS) flits", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NDR_AD", .udesc = "Number of flits received over Non-data response (NDR) channel", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NDR_AK", .udesc = "Number of flits received on the Non-data response (NDR) channel", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_q_txr_ad_hom_credit_acquired[]={ { .uname = "VN0", .udesc = "for VN0", .ucode = 0x100, }, { .uname = "VN1", .udesc = "for VN1", .ucode = 0x200, }, }; static const intel_x86_umask_t ivbep_unc_q_txr_bl_drs_credit_acquired[]={ { .uname = "VN0", .udesc = "for VN0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN1", .udesc = "for VN1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN_SHR", .udesc = "for shared VN", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_ivbep_unc_q_pe[]={ { .name = "UNC_Q_CLOCKTICKS", .desc = "Number of qfclks", .code = 0x14, .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_CTO_COUNT", .desc = "Count of CTO Events", .code = 0x38 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_DIRECT2CORE", .desc = "Direct 2 Core Spawning", .code = 0x13, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_direct2core), .umasks = ivbep_unc_q_direct2core }, { .name = "UNC_Q_L1_POWER_CYCLES", .desc = "Cycles in L1", .code = 0x12, .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { 
.name = "UNC_Q_RXL0P_POWER_CYCLES", .desc = "Cycles in L0p", .code = 0x10, .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL0_POWER_CYCLES", .desc = "Cycles in L0", .code = 0xf, .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_BYPASSED", .desc = "Rx Flit Buffer Bypassed", .code = 0x9, .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_CREDITS_CONSUMED_VN0", .desc = "VN0 Credit Consumed", .code = 0x1e | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_rxl_credits_consumed_vn0), .umasks = ivbep_unc_q_rxl_credits_consumed_vn0 }, { .name = "UNC_Q_RXL_CREDITS_CONSUMED_VN1", .desc = "VN1 Credit Consumed", .code = 0x39 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_rxl_credits_consumed_vn1), .umasks = ivbep_unc_q_rxl_credits_consumed_vn1 }, { .name = "UNC_Q_RXL_CREDITS_CONSUMED_VNA", .desc = "VNA Credit Consumed", .code = 0x1d | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_CYCLES_NE", .desc = "RxQ Cycles Not Empty", .code = 0xa, .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_FLITS_G0", .desc = "Flits Received - Group 0", .code = 0x1, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_rxl_flits_g0), .umasks = ivbep_unc_q_rxl_flits_g0 }, { .name = "UNC_Q_RXL_FLITS_G1", .desc = "Flits Received - Group 1", .code = 0x2 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_rxl_flits_g1), .umasks = ivbep_unc_q_rxl_flits_g1 }, { .name = "UNC_Q_RXL_FLITS_G2", .desc = "Flits Received - Group 2", .code = 0x3 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_rxl_flits_g2), .umasks = 
ivbep_unc_q_rxl_flits_g2 }, { .name = "UNC_Q_RXL_INSERTS", .desc = "Rx Flit Buffer Allocations", .code = 0x8, .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_INSERTS_DRS", .desc = "Rx Flit Buffer Allocations - DRS", .code = 0x9 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_INSERTS_HOM", .desc = "Rx Flit Buffer Allocations - HOM", .code = 0xc | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_INSERTS_NCB", .desc = "Rx Flit Buffer Allocations - NCB", .code = 0xa | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_INSERTS_NCS", .desc = "Rx Flit Buffer Allocations - NCS", .code = 0xb | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_INSERTS_NDR", .desc = "Rx Flit Buffer Allocations - NDR", .code = 0xe | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_INSERTS_SNP", .desc = "Rx Flit Buffer Allocations - SNP", .code = 0xd | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_OCCUPANCY", 
.desc = "RxQ Occupancy - All Packets", .code = 0xb, .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_OCCUPANCY_DRS", .desc = "RxQ Occupancy - DRS", .code = 0x15 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_OCCUPANCY_HOM", .desc = "RxQ Occupancy - HOM", .code = 0x18 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_OCCUPANCY_NCB", .desc = "RxQ Occupancy - NCB", .code = 0x16 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_OCCUPANCY_NCS", .desc = "RxQ Occupancy - NCS", .code = 0x17 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_OCCUPANCY_NDR", .desc = "RxQ Occupancy - NDR", .code = 0x1a | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_RXL_OCCUPANCY_SNP", .desc = "RxQ Occupancy - SNP", .code = 0x19 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXL0P_POWER_CYCLES", .desc = "Cycles in L0p", .code = 0xd, .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = 
"UNC_Q_TXL0_POWER_CYCLES", .desc = "Cycles in L0", .code = 0xc, .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_TXL_BYPASSED", .desc = "Tx Flit Buffer Bypassed", .code = 0x5, .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_TXL_CYCLES_NE", .desc = "Tx Flit Buffer Cycles not Empty", .code = 0x6, .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_TXL_FLITS_G0", .desc = "Flits Transferred - Group 0", .code = 0x0, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txl_flits_g0), .umasks = ivbep_unc_q_txl_flits_g0 }, { .name = "UNC_Q_TXL_FLITS_G1", .desc = "Flits Transferred - Group 1", .code = 0x0 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_rxl_flits_g1), .umasks = ivbep_unc_q_rxl_flits_g1 /* shared with rxl_flits_g1 */ }, { .name = "UNC_Q_TXL_FLITS_G2", .desc = "Flits Transferred - Group 2", .code = 0x1 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_rxl_flits_g2), .umasks = ivbep_unc_q_rxl_flits_g2 /* shared with rxl_flits_g2 */ }, { .name = "UNC_Q_TXL_INSERTS", .desc = "Tx Flit Buffer Allocations", .code = 0x4, .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_TXL_OCCUPANCY", .desc = "Tx Flit Buffer Occupancy", .code = 0x7, .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_VNA_CREDIT_RETURNS", .desc = "VNA Credits Returned", .code = 0x1c | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_VNA_CREDIT_RETURN_OCCUPANCY", .desc = "VNA Credits Pending Return - Occupancy", .code = 0x1b | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = IVBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_TXR_AD_HOM_CREDIT_ACQUIRED", .desc = "R3QPI Egress credit occupancy AD HOM", .code = 0x26 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = 
IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_AD_HOM_CREDIT_OCCUPANCY", .desc = "R3QPI Egress credit occupancy AD HOM", .code = 0x22 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), /* shared */ .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_AD_NDR_CREDIT_ACQUIRED", .desc = "R3QPI Egress credit occupancy AD NDR", .code = 0x28 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_AD_NDR_CREDIT_OCCUPANCY", .desc = "R3QPI Egress credit occupancy AD NDR", .code = 0x24 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), /* shared */ .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_AD_SNP_CREDIT_ACQUIRED", .desc = "R3QPI Egress credit occupancy AD SNP", .code = 0x27 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_AD_SNP_CREDIT_OCCUPANCY", .desc = "R3QPI Egress credit occupancy AD SNP", .code = 0x23 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), /* shared */ .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_AK_NDR_CREDIT_ACQUIRED", .desc = "R3QPI Egress credit occupancy AK NDR", .code = 0x29 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = 
LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_AK_NDR_CREDIT_OCCUPANCY", .desc = "R3QPI Egress credit occupancy AK NDR", .code = 0x25 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), /* shared */ .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_BL_DRS_CREDIT_ACQUIRED", .desc = "R3QPI Egress credit occupancy BL DRS", .code = 0x2a | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_bl_drs_credit_acquired), .umasks = ivbep_unc_q_txr_bl_drs_credit_acquired, }, { .name = "UNC_Q_TXR_BL_DRS_CREDIT_OCCUPANCY", .desc = "R3QPI Egress credit occupancy BL DRS", .code = 0x1f | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_bl_drs_credit_acquired), /* shared */ .umasks = ivbep_unc_q_txr_bl_drs_credit_acquired, }, { .name = "UNC_Q_TXR_BL_NCB_CREDIT_ACQUIRED", .desc = "R3QPI Egress credit occupancy BL NCB", .code = 0x2b | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_BL_NCB_CREDIT_OCCUPANCY", .desc = "R3QPI Egress credit occupancy BL NCB", .code = 0x20 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), /* shared */ .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_BL_NCS_CREDIT_ACQUIRED", .desc = "R3QPI Egress credit occupancy BL NCS", .code = 0x2c | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = 
LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, { .name = "UNC_Q_TXR_BL_NCS_CREDIT_OCCUPANCY", .desc = "R3QPI Egress credit occupancy BL NCS", .code = 0x21 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_q_txr_ad_hom_credit_acquired), /* shared */ .umasks = ivbep_unc_q_txr_ad_hom_credit_acquired, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_ivbep_unc_r2pcie_events.h000066400000000000000000000171541502707512200262140ustar00rootroot00000000000000/* * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. 
* * PMU: ivbep_unc_r2pcie (Intel IvyBridge-EP R2PCIe uncore) */ static const intel_x86_umask_t ivbep_unc_r2_ring_ad_used[]={ { .uname = "CCW_VR0_EVEN", .udesc = "Counter-clockwise and even ring polarity on virtual ring 0", .ucode = 0x400, }, { .uname = "CCW_VR0_ODD", .udesc = "Counter-clockwise and odd ring polarity on virtual ring 0", .ucode = 0x800, }, { .uname = "CW_VR0_EVEN", .udesc = "Clockwise and even ring polarity on virtual ring 0", .ucode = 0x100, }, { .uname = "CW_VR0_ODD", .udesc = "Clockwise and odd ring polarity on virtual ring 0", .ucode = 0x200, }, { .uname = "CCW_VR1_EVEN", .udesc = "Counter-clockwise and even ring polarity on virtual ring 1", .ucode = 0x400, }, { .uname = "CCW_VR1_ODD", .udesc = "Counter-clockwise and odd ring polarity on virtual ring 1", .ucode = 0x800, }, { .uname = "CW_VR1_EVEN", .udesc = "Clockwise and even ring polarity on virtual ring 1", .ucode = 0x100, }, { .uname = "CW_VR1_ODD", .udesc = "Clockwise and odd ring polarity on virtual ring 1", .ucode = 0x200, }, { .uname = "CW", .udesc = "Clockwise with any polarity on either virtual rings", .ucode = 0x3300, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CCW", .udesc = "Counter-clockwise with any polarity on either virtual rings", .ucode = 0xcc00, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_r2_rxr_ak_bounces[]={ { .uname = "CW", .udesc = "Clockwise", .ucode = 0x100, }, { .uname = "CCW", .udesc = "Counter-clockwise", .ucode = 0x200, }, }; static const intel_x86_umask_t ivbep_unc_r2_rxr_occupancy[]={ { .uname = "DRS", .udesc = "DRS Ingress queue", .ucode = 0x800, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t ivbep_unc_r2_ring_iv_used[]={ { .uname = "CW", .udesc = "Clockwise with any polarity on either virtual rings", .ucode = 0x3300, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CCW", .udesc = "Counter-clockwise with any polarity on either virtual rings", .ucode = 0xcc00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "any 
direction and any polarity on any virtual ring", .ucode = 0xff00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t ivbep_unc_r2_rxr_cycles_ne[]={ { .uname = "NCB", .udesc = "NCB Ingress queue", .ucode = 0x1000, }, { .uname = "NCS", .udesc = "NCS Ingress queue", .ucode = 0x2000, }, }; static const intel_x86_umask_t ivbep_unc_r2_txr_cycles_full[]={ { .uname = "AD", .udesc = "AD Egress queue", .ucode = 0x100, }, { .uname = "AK", .udesc = "AK Egress queue", .ucode = 0x200, }, { .uname = "BL", .udesc = "BL Egress queue", .ucode = 0x400, }, }; static const intel_x86_entry_t intel_ivbep_unc_r2_pe[]={ { .name = "UNC_R2_CLOCKTICKS", .desc = "Number of uclks in domain", .code = 0x1, .cntmsk = 0xf, .modmsk = IVBEP_UNC_R2PCIE_ATTRS, }, { .name = "UNC_R2_RING_AD_USED", .desc = "R2 AD Ring in Use", .code = 0x7, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r2_ring_ad_used), .umasks = ivbep_unc_r2_ring_ad_used }, { .name = "UNC_R2_RING_AK_USED", .desc = "R2 AK Ring in Use", .code = 0x8, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r2_ring_ad_used), .umasks = ivbep_unc_r2_ring_ad_used /* shared */ }, { .name = "UNC_R2_RING_BL_USED", .desc = "R2 BL Ring in Use", .code = 0x9, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r2_ring_ad_used), .umasks = ivbep_unc_r2_ring_ad_used /* shared */ }, { .name = "UNC_R2_RING_IV_USED", .desc = "R2 IV Ring in Use", .code = 0xa, .cntmsk = 0xf, .ngrp = 1, .modmsk = IVBEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r2_ring_iv_used), .umasks = ivbep_unc_r2_ring_iv_used }, { .name = "UNC_R2_RXR_AK_BOUNCES", .desc = "AK Ingress Bounced", .code = 0x12, .cntmsk = 0x1, .modmsk = IVBEP_UNC_R2PCIE_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r2_rxr_ak_bounces), .umasks = ivbep_unc_r2_rxr_ak_bounces }, { .name = "UNC_R2_RXR_OCCUPANCY", .desc = 
"Ingress occupancy accumulator", .code = 0x13, .cntmsk = 0x1, .modmsk = IVBEP_UNC_R2PCIE_ATTRS, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r2_rxr_occupancy), .umasks = ivbep_unc_r2_rxr_occupancy }, { .name = "UNC_R2_RXR_CYCLES_NE", .desc = "Ingress Cycles Not Empty", .code = 0x10, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r2_rxr_cycles_ne), .umasks = ivbep_unc_r2_rxr_cycles_ne }, { .name = "UNC_R2_RXR_INSERTS", .desc = "Ingress inserts", .code = 0x11, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r2_rxr_cycles_ne), .umasks = ivbep_unc_r2_rxr_cycles_ne, /* shared */ }, { .name = "UNC_R2_TXR_CYCLES_FULL", .desc = "Egress Cycles Full", .code = 0x25, .cntmsk = 0x1, .ngrp = 1, .modmsk = IVBEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r2_txr_cycles_full), .umasks = ivbep_unc_r2_txr_cycles_full }, { .name = "UNC_R2_TXR_CYCLES_NE", .desc = "Egress Cycles Not Empty", .code = 0x23, .cntmsk = 0x1, .ngrp = 1, .modmsk = IVBEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r2_txr_cycles_full), .umasks = ivbep_unc_r2_txr_cycles_full /* shared */ }, { .name = "UNC_R2_TXR_NACK_CCW", .desc = "Egress counter-clockwise NACK", .code = 0x28, .cntmsk = 0x1, .ngrp = 1, .modmsk = IVBEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r2_txr_cycles_full), .umasks = ivbep_unc_r2_txr_cycles_full /* shared */ }, { .name = "UNC_R2_TXR_NACK_CW", .desc = "Egress clockwise NACK", .code = 0x26, .cntmsk = 0x1, .ngrp = 1, .modmsk = IVBEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r2_txr_cycles_full), .umasks = ivbep_unc_r2_txr_cycles_full /* shared */ }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_ivbep_unc_r3qpi_events.h000066400000000000000000000344401502707512200260630ustar00rootroot00000000000000/* * Copyright (c) 2014 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. 
* * PMU: ivbep_unc_r3qpi (Intel IvyBridge-EP R3QPI uncore) */ static const intel_x86_umask_t ivbep_unc_r3_ring_ad_used[]={ { .uname = "CCW_VR0_EVEN", .udesc = "Counter-clockwise and even ring polarity on virtual ring 0", .ucode = 0x400, }, { .uname = "CCW_VR0_ODD", .udesc = "Counter-clockwise and odd ring polarity on virtual ring 0", .ucode = 0x800, }, { .uname = "CW_VR0_EVEN", .udesc = "Clockwise and even ring polarity on virtual ring 0", .ucode = 0x100, }, { .uname = "CW_VR0_ODD", .udesc = "Clockwise and odd ring polarity on virtual ring 0", .ucode = 0x200, }, { .uname = "CW", .udesc = "Clockwise with any polarity on either virtual rings", .ucode = 0x3300, }, { .uname = "CCW", .udesc = "Counter-clockwise with any polarity on either virtual rings", .ucode = 0xcc00, }, }; static const intel_x86_umask_t ivbep_unc_r3_ring_iv_used[]={ { .uname = "CW", .udesc = "Clockwise with any polarity on either virtual rings", .ucode = 0x3300, }, { .uname = "CCW", .udesc = "Counter-clockwise with any polarity on either virtual rings", .ucode = 0xcc00, }, { .uname = "ANY", .udesc = "Any direction and any polarity on either virtual rings", .ucode = 0xff00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t ivbep_unc_r3_rxr_cycles_ne[]={ { .uname = "HOM", .udesc = "HOM Ingress queue", .ucode = 0x100, }, { .uname = "SNP", .udesc = "SNP Ingress queue", .ucode = 0x200, }, { .uname = "NDR", .udesc = "NDR Ingress queue", .ucode = 0x400, }, }; static const intel_x86_umask_t ivbep_unc_r3_rxr_inserts[]={ { .uname = "DRS", .udesc = "DRS Ingress queue", .ucode = 0x800, }, { .uname = "HOM", .udesc = "HOM Ingress queue", .ucode = 0x100, }, { .uname = "NCB", .udesc = "NCB Ingress queue", .ucode = 0x1000, }, { .uname = "NCS", .udesc = "NCS Ingress queue", .ucode = 0x2000, }, { .uname = "NDR", .udesc = "NDR Ingress queue", .ucode = 0x400, }, { .uname = "SNP", .udesc = "SNP Ingress queue", .ucode = 0x200, }, }; static const intel_x86_umask_t 
ivbep_unc_r3_vn0_credits_used[]={ { .uname = "HOM", .udesc = "Filter HOM message class", .ucode = 0x100, }, { .uname = "SNP", .udesc = "Filter SNP message class", .ucode = 0x200, }, { .uname = "NDR", .udesc = "Filter NDR message class", .ucode = 0x400, }, { .uname = "DRS", .udesc = "Filter DRS message class", .ucode = 0x800, }, { .uname = "NCB", .udesc = "Filter NCB message class", .ucode = 0x1000, }, { .uname = "NCS", .udesc = "Filter NCS message class", .ucode = 0x2000, }, }; static const intel_x86_umask_t ivbep_unc_r3_c_hi_ad_credits_empty[]={ { .uname = "CBO8", .udesc = "CBox 8", .ucode = 0x100, }, { .uname = "CBO9", .udesc = "CBox 9", .ucode = 0x200, }, { .uname = "CBO10", .udesc = "CBox 10", .ucode = 0x400, }, { .uname = "CBO11", .udesc = "CBox 11", .ucode = 0x800, }, { .uname = "CBO12", .udesc = "CBox 12", .ucode = 0x1000, }, { .uname = "CBO13", .udesc = "CBox 13", .ucode = 0x2000, }, { .uname = "CBO14", .udesc = "CBox 14 & 16", .ucode = 0x4000, }, }; static const intel_x86_umask_t ivbep_unc_r3_c_lo_ad_credits_empty[]={ { .uname = "CBO0", .udesc = "CBox 0", .ucode = 0x100, }, { .uname = "CBO1", .udesc = "CBox 1", .ucode = 0x200, }, { .uname = "CBO2", .udesc = "CBox 2", .ucode = 0x400, }, { .uname = "CBO3", .udesc = "CBox 3", .ucode = 0x800, }, { .uname = "CBO4", .udesc = "CBox 4", .ucode = 0x1000, }, { .uname = "CBO5", .udesc = "CBox 5", .ucode = 0x2000, }, { .uname = "CBO6", .udesc = "CBox 6", .ucode = 0x4000, }, { .uname = "CBO7", .udesc = "CBox 7", .ucode = 0x8000, } }; static const intel_x86_umask_t ivbep_unc_r3_ha_r2_bl_credits_empty[]={ { .uname = "HA0", .udesc = "HA0", .ucode = 0x100, }, { .uname = "HA1", .udesc = "HA1", .ucode = 0x200, }, { .uname = "R2_NCB", .udesc = "R2 NCB messages", .ucode = 0x400, }, { .uname = "R2_NCS", .udesc = "R2 NCS messages", .ucode = 0x800, } }; static const intel_x86_umask_t ivbep_unc_r3_qpi0_ad_credits_empty[]={ { .uname = "VNA", .udesc = "VNA", .ucode = 0x100, }, { .uname = "VN0_HOM", .udesc = "VN0 HOM messages", 
.ucode = 0x200, }, { .uname = "VN0_SNP", .udesc = "VN0 SNP messages", .ucode = 0x400, }, { .uname = "VN0_NDR", .udesc = "VN0 NDR messages", .ucode = 0x800, }, { .uname = "VN1_HOM", .udesc = "VN1 HOM messages", .ucode = 0x1000, }, { .uname = "VN1_SNP", .udesc = "VN1 SNP messages", .ucode = 0x2000, }, { .uname = "VN1_NDR", .udesc = "VN1 NDR messages", .ucode = 0x4000, }, }; static const intel_x86_umask_t ivbep_unc_r3_txr_nack_ccw[]={ { .uname = "AD", .udesc = "AD counter-clockwise Egress queue", .ucode = 0x100, }, { .uname = "AK", .udesc = "AK counter-clockwise Egress queue", .ucode = 0x200, }, { .uname = "BL", .udesc = "BL counter-clockwise Egress queue", .ucode = 0x400, }, }; static const intel_x86_umask_t ivbep_unc_r3_txr_nack_cw[]={ { .uname = "AD", .udesc = "AD clockwise Egress queue", .ucode = 0x100, }, { .uname = "AK", .udesc = "AK clockwise Egress queue", .ucode = 0x200, }, { .uname = "BL", .udesc = "BL clockwise Egress queue", .ucode = 0x400, }, }; static const intel_x86_umask_t ivbep_unc_r3_vna_credits_acquired[]={ { .uname = "AD", .udesc = "For AD ring", .ucode = 0x100, }, { .uname = "BL", .udesc = "For BL ring", .ucode = 0x400, }, }; static const intel_x86_entry_t intel_ivbep_unc_r3_pe[]={ { .name = "UNC_R3_CLOCKTICKS", .desc = "Number of uclks in domain", .code = 0x1, .cntmsk = 0x7, .modmsk = IVBEP_UNC_R3QPI_ATTRS, }, { .name = "UNC_R3_RING_AD_USED", .desc = "R3 AD Ring in Use", .code = 0x7, .cntmsk = 0x7, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_ring_ad_used), .umasks = ivbep_unc_r3_ring_ad_used }, { .name = "UNC_R3_RING_AK_USED", .desc = "R3 AK Ring in Use", .code = 0x8, .cntmsk = 0x7, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_ring_ad_used), .umasks = ivbep_unc_r3_ring_ad_used /* shared */ }, { .name = "UNC_R3_RING_BL_USED", .desc = "R3 BL Ring in Use", .code = 0x9, .cntmsk = 0x7, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = 
LIBPFM_ARRAY_SIZE(ivbep_unc_r3_ring_ad_used), .umasks = ivbep_unc_r3_ring_ad_used /* shared */ }, { .name = "UNC_R3_RING_IV_USED", .desc = "R3 IV Ring in Use", .code = 0xa, .cntmsk = 0x7, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_ring_iv_used), .umasks = ivbep_unc_r3_ring_iv_used }, { .name = "UNC_R3_RXR_AD_BYPASSED", .desc = "Ingress Bypassed", .code = 0x12, .cntmsk = 0x3, .modmsk = IVBEP_UNC_R3QPI_ATTRS, }, { .name = "UNC_R3_RXR_CYCLES_NE", .desc = "Ingress Cycles Not Empty", .code = 0x10, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_rxr_cycles_ne), .umasks = ivbep_unc_r3_rxr_cycles_ne }, { .name = "UNC_R3_RXR_INSERTS", .desc = "Ingress Allocations", .code = 0x11, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_rxr_inserts), .umasks = ivbep_unc_r3_rxr_inserts }, { .name = "UNC_R3_RXR_OCCUPANCY", .desc = "Ingress Occupancy Accumulator", .code = 0x13, .cntmsk = 0x1, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_rxr_inserts), .umasks = ivbep_unc_r3_rxr_inserts/* shared */ }, { .name = "UNC_R3_TXR_CYCLES_FULL", .desc = "Egress cycles full", .code = 0x25, .cntmsk = 0x3, .modmsk = IVBEP_UNC_R3QPI_ATTRS, }, { .name = "UNC_R3_VN0_CREDITS_REJECT", .desc = "VN0 Credit Acquisition Failed", .code = 0x37, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_vn0_credits_used), .umasks = ivbep_unc_r3_vn0_credits_used }, { .name = "UNC_R3_VN0_CREDITS_USED", .desc = "VN0 Credit Used", .code = 0x36, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_vn0_credits_used), .umasks = ivbep_unc_r3_vn0_credits_used }, { .name = "UNC_R3_VNA_CREDITS_ACQUIRED", .desc = "VNA credit Acquisitions", .code = 0x33, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = 
LIBPFM_ARRAY_SIZE(ivbep_unc_r3_vna_credits_acquired), .umasks = ivbep_unc_r3_vna_credits_acquired }, { .name = "UNC_R3_VNA_CREDITS_REJECT", .desc = "VNA Credit Reject", .code = 0x34, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_vn0_credits_used), .umasks = ivbep_unc_r3_vn0_credits_used /* shared */ }, { .name = "UNC_R3_VNA_CREDIT_CYCLES_OUT", .desc = "Cycles with no VNA credits available", .code = 0x31, .cntmsk = 0x3, .modmsk = IVBEP_UNC_R3QPI_ATTRS, }, { .name = "UNC_R3_VNA_CREDIT_CYCLES_USED", .desc = "Cycles with 1 or more VNA credits in use", .code = 0x32, .cntmsk = 0x3, .modmsk = IVBEP_UNC_R3QPI_ATTRS, }, { .name = "UNC_R3_C_HI_AD_CREDITS_EMPTY", .desc = "Cbox AD credits empty", .code = 0x2c, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_c_hi_ad_credits_empty), .umasks = ivbep_unc_r3_c_hi_ad_credits_empty }, { .name = "UNC_R3_C_LO_AD_CREDITS_EMPTY", .desc = "Cbox AD credits empty", .code = 0x2b, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_c_lo_ad_credits_empty), .umasks = ivbep_unc_r3_c_lo_ad_credits_empty }, { .name = "UNC_R3_HA_R2_BL_CREDITS_EMPTY", .desc = "HA/R2 AD credits empty", .code = 0x2f, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_ha_r2_bl_credits_empty), .umasks = ivbep_unc_r3_ha_r2_bl_credits_empty }, { .name = "UNC_R3_QPI0_AD_CREDITS_EMPTY", .desc = "QPI0 AD credits empty", .code = 0x29, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_qpi0_ad_credits_empty), .umasks = ivbep_unc_r3_qpi0_ad_credits_empty }, { .name = "UNC_R3_QPI0_BL_CREDITS_EMPTY", .desc = "QPI0 BL credits empty", .code = 0x2d, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_qpi0_ad_credits_empty), /* shared */ .umasks = ivbep_unc_r3_qpi0_ad_credits_empty 
}, { .name = "UNC_R3_QPI1_AD_CREDITS_EMPTY", .desc = "QPI1 AD credits empty", .code = 0x2a, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_qpi0_ad_credits_empty), /* shared */ .umasks = ivbep_unc_r3_qpi0_ad_credits_empty }, { .name = "UNC_R3_QPI1_BL_CREDITS_EMPTY", .desc = "QPI1 BL credits empty", .code = 0x2e, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_qpi0_ad_credits_empty), /* shared */ .umasks = ivbep_unc_r3_qpi0_ad_credits_empty }, { .name = "UNC_R3_TXR_CYCLES_NE", .desc = "Egress cycles not empty", .code = 0x23, .cntmsk = 0x3, .modmsk = IVBEP_UNC_R3QPI_ATTRS, }, { .name = "UNC_R3_TXR_NACK_CCW", .desc = "Egress NACK counter-clockwise", .code = 0x28, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_txr_nack_ccw), .umasks = ivbep_unc_r3_txr_nack_ccw }, { .name = "UNC_R3_TXR_NACK_CW", .desc = "Egress NACK clockwise", .code = 0x26, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_txr_nack_cw), .umasks = ivbep_unc_r3_txr_nack_cw }, { .name = "UNC_R3_VN1_CREDITS_REJECT", .desc = "VN1 Credit Acquisition Failed", .code = 0x39, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_vn0_credits_used), /* shared */ .umasks = ivbep_unc_r3_vn0_credits_used }, { .name = "UNC_R3_VN1_CREDITS_USED", .desc = "VN1 Credit Used", .code = 0x38, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_r3_vn0_credits_used), /* shared */ .umasks = ivbep_unc_r3_vn0_credits_used }, };
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_ivbep_unc_ubo_events.h
/* * Copyright (c) 2014 Google Inc.
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: ivbep_unc_ubo (Intel IvyBridge-EP U-Box uncore PMU) */ static const intel_x86_umask_t ivbep_unc_u_event_msg[]={ { .uname = "DOORBELL_RCVD", .udesc = "TBD", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "INT_PRIO", .udesc = "TBD", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IPI_RCVD", .udesc = "TBD", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MSI_RCVD", .udesc = "TBD", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VLW_RCVD", .udesc = "TBD", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t ivbep_unc_u_phold_cycles[]={ { .uname = "ASSERT_TO_ACK", .udesc = "Number of cycles asserted to ACK", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ACK_TO_DEASSERT", .udesc = "Number of cycles ACK to deassert", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_ivbep_unc_u_pe[]={ { .name = "UNC_U_EVENT_MSG", .desc = "VLW Received", .code = 0x42, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_UBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_u_event_msg), .umasks = ivbep_unc_u_event_msg }, { .name = "UNC_U_LOCK_CYCLES", .desc = "IDI Lock/SplitLock Cycles", .code = 0x44, .cntmsk = 0x3, .modmsk = IVBEP_UNC_UBO_ATTRS, }, { .name = "UNC_U_PHOLD_CYCLES", .desc = "Cycles PHOLD asserts to Ack", .code = 0x45, .cntmsk = 0x3, .ngrp = 1, .modmsk = IVBEP_UNC_UBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(ivbep_unc_u_phold_cycles), .umasks = ivbep_unc_u_phold_cycles }, { .name = "UNC_U_RACU_REQUESTS", .desc = "RACU requests", .code = 0x46, .cntmsk = 0x3, .modmsk = IVBEP_UNC_UBO_ATTRS, }, };
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_knc_events.h
/* * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in
the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: knc (Intel Knights Corners) */ static const intel_x86_entry_t intel_knc_pe[]={ { .name = "BANK_CONFLICTS", .desc = "Number of actual bank conflicts", .code = 0xa, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "BRANCHES", .desc = "Number of taken and not taken branches, including: conditional branches, jumps, calls, returns, software interrupts, and interrupt returns", .code = 0x12, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "BRANCHES_MISPREDICTED", .desc = "Number of branch mispredictions that occurred on BTB hits. 
BTB misses are not considered branch mispredicts because no prediction exists for them yet.", .code = 0x2b, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "CODE_CACHE_MISS", .desc = "Number of instruction reads that miss the internal code cache; whether the read is cacheable or noncacheable", .code = 0xe, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "CODE_PAGE_WALK", .desc = "Number of code page walks", .code = 0xd, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "CODE_READ", .desc = "Number of instruction reads; whether the read is cacheable or noncacheable", .code = 0xc, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "CPU_CLK_UNHALTED", .desc = "Number of cycles during which the processor is not halted.", .code = 0x2a, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "DATA_CACHE_LINES_WRITTEN_BACK", .desc = "Number of dirty lines (all) that are written back, regardless of the cause", .code = 0x6, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "DATA_PAGE_WALK", .desc = "Number of data page walks", .code = 0x2, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "DATA_READ", .desc = "Number of successful memory data reads committed by the K-unit (L1). Cache accesses resulting from prefetch instructions are included for A0 stepping.", .code = 0x0, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "DATA_READ_MISS", .desc = "Number of memory read accesses that miss the internal data cache whether or not the access is cacheable or noncacheable. 
Cache accesses resulting from prefetch instructions are not included.", .code = 0x3, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "DATA_READ_MISS_OR_WRITE_MISS", .desc = "Number of memory read and/or write accesses that miss the internal data cache, whether or not the access is cacheable or noncacheable", .code = 0x29, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "DATA_READ_OR_WRITE", .desc = "Number of memory data reads and/or writes (internal data cache hit and miss combined). Read cache accesses resulting from prefetch instructions are included for A0 stepping.", .code = 0x28, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "DATA_WRITE", .desc = "Number of successful memory data writes committed by the K-unit (L1). Streaming stores (hit/miss L1), cacheable write partials, and UC promotions are all included.", .code = 0x1, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "DATA_WRITE_MISS", .desc = "Number of memory write accesses that miss the internal data cache whether or not the access is cacheable. Non-cacheable misses are not included.", .code = 0x4, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "EXEC_STAGE_CYCLES", .desc = "Number of E-stage cycles that were successfully completed. Includes cycles generated by multi-cycle E-stage instructions. For instructions destined for the FPU or VPU pipelines, this event only counts occupancy in the integer E-stage.", .code = 0x2e, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "FE_STALLED", .desc = "Number of cycles where the front-end could not advance. Any multi-cycle instructions which delay pipeline advance and apply backpressure to the front-end will be included, e.g. read-modify-write instructions. 
Includes cycles when the front-end did not have any instructions to issue.", .code = 0x2d, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "INSTRUCTIONS_EXECUTED", .desc = "Number of instructions executed (up to two per clock)", .code = 0x16, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "INSTRUCTIONS_EXECUTED_V_PIPE", .desc = "Number of instructions executed in the V_pipe. The event indicates the number of instructions that were paired.", .code = 0x17, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "L1_DATA_HIT_INFLIGHT_PF1", .desc = "Number of data requests which hit an in-flight vprefetch0. The in-flight vprefetch0 was not necessarily issued from the same thread as the data request.", .code = 0x20, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "L1_DATA_PF1", .desc = "Number of data vprefetch0 requests seen by the L1.", .code = 0x11, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "L1_DATA_PF1_DROP", .desc = "Number of data vprefetch0 requests seen by the L1 which were dropped for any reason. A vprefetch0 can be dropped if the requested address matches another in-flight request or if it has a UC memtype.", .code = 0x1e, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "L1_DATA_PF1_MISS", .desc = "Number of data vprefetch0 requests seen by the L1 which missed L1. Does not include vprefetch1 requests which are counted in L1_DATA_PF1_DROP.", .code = 0x1c, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "L1_DATA_PF2", .desc = "Number of data vprefetch1 requests seen by the L1. This is not necessarily the same number as seen by the L2 because this count includes requests that are dropped by the core. 
A vprefetch1 can be dropped by the core if the requested address matches another in-flight request or if it has a UC memtype.", .code = 0x37, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_CODE_READ_MISS_CACHE_FILL", .desc = "Number of code read accesses that missed the L2 cache and were satisfied by another L2 cache. Can include promoted read misses that started as DATA accesses.", .code = 0x10f0, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_CODE_READ_MISS_MEM_FILL", .desc = "Number of code read accesses that missed the L2 cache and were satisfied by main memory. Can include promoted read misses that started as DATA accesses.", .code = 0x10f5, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_DATA_HIT_INFLIGHT_PF2", .desc = "Number of data requests which hit an in-flight vprefetch1. The in-flight vprefetch1 was not necessarily issued from the same thread as the data request.", .code = 0x10ff, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_DATA_PF1_MISS", .desc = "Number of data vprefetch0 requests seen by the L2 which missed L2.", .code = 0x38, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_DATA_PF2", .desc = "Number of data vprefetch1 requests seen by the L2. Only counts vprefetch1 hits on A0 stepping.", .code = 0x10fc, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_DATA_PF2_DROP", .desc = "Number of data vprefetch1 requests seen by the L2 which were dropped for any reason.", .code = 0x10fd, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_DATA_PF2_MISS", .desc = "Number of data vprefetch1 requests seen by the L2 which missed L2. Does not include vprefetch2 requests which are counted in L2_DATA_PF2_DROP.", .code = 0x10fe, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_DATA_READ_MISS_CACHE_FILL", .desc = "Number of data read accesses that missed the L2 cache and were satisfied by another L2 cache. 
Can include promoted read misses that started as CODE accesses.", .code = 0x10f1, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_DATA_READ_MISS_MEM_FILL", .desc = "Number of data read accesses that missed the L2 cache and were satisfied by main memory. Can include promoted read misses that started as CODE accesses.", .code = 0x10f6, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_DATA_WRITE_MISS_CACHE_FILL", .desc = "Number of data write (RFO) accesses that missed the L2 cache and were satisfied by another L2 cache.", .code = 0x10f2, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_DATA_WRITE_MISS_MEM_FILL", .desc = "Number of data write (RFO) accesses that missed the L2 cache and were satisfied by main memory.", .code = 0x10f7, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_READ_HIT_E", .desc = "L2 Read Hit E State, may include prefetches on A0 stepping.", .code = 0x10c8, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_READ_HIT_M", .desc = "L2 Read Hit M State", .code = 0x10c9, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_READ_HIT_S", .desc = "L2 Read Hit S State", .code = 0x10ca, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_READ_MISS", .desc = "L2 Read Misses. Prefetch and demand requests to the same address will produce double counting.", .code = 0x10cb, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_VICTIM_REQ_WITH_DATA", .desc = "L2 received a victim request and responded with data", .code = 0x10d7, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "L2_WRITE_HIT", .desc = "L2 Write HIT, may undercount on A0 stepping.", .code = 0x10cc, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "LONG_CODE_PAGE_WALK", .desc = "Number of long code page walks, i.e. page walks that also missed the L2 uTLB. Subset of DATA_CODE_WALK event", .code = 0x3b, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "LONG_DATA_PAGE_WALK", .desc = "Number of long data page walks, i.e. 
page walks that also missed the L2 uTLB. Subset of DATA_PAGE_WALK event", .code = 0x3a, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "MEMORY_ACCESSES_IN_BOTH_PIPES", .desc = "Number of data memory reads or writes that are paired in both pipes of the pipeline", .code = 0x9, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "MICROCODE_CYCLES", .desc = "The number of cycles microcode is executing. While microcode is executing, all other threads are stalled.", .code = 0x2c, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "PIPELINE_AGI_STALLS", .desc = "Number of address generation interlock (AGI) stalls. An AGI occurring in both the U- and V- pipelines in the same clock signals this event twice.", .code = 0x1f, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "PIPELINE_FLUSHES", .desc = "Number of pipeline flushes that occur. Pipeline flushes are caused by BTB misses on taken branches, mispredictions, exceptions, interrupts, and some segment descriptor loads.", .code = 0x15, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "PIPELINE_SG_AGI_STALLS", .desc = "Number of address generation interlock (AGI) stalls due to vscatter* and vgather* instructions.", .code = 0x21, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "SNP_HITM_BUNIT", .desc = "Snoop HITM in BUNIT", .code = 0x10e3, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "SNP_HITM_L2", .desc = "Snoop HITM in L2", .code = 0x10e7, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "SNP_HIT_L2", .desc = "Snoop HIT in L2", .code = 0x10e6, .cntmsk = 0x1, .modmsk = INTEL_V3_ATTRS, }, { .name = "VPU_DATA_READ", .desc = "Number of read transactions that were issued. In general each read transaction will read 1 64B cacheline. If there are alignment issues, then reads against multiple cache lines will each be counted individually.", .code = 0x2000, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "VPU_DATA_READ_MISS", .desc = "VPU L1 data cache read miss.
Counts the number of occurrences.", .code = 0x2003, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "VPU_DATA_WRITE", .desc = "Number of write transactions that were issued. In general each write transaction will write 1 64B cacheline. If there are alignment issues, then writes against multiple cache lines will each be counted individually.", .code = 0x2001, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "VPU_DATA_WRITE_MISS", .desc = "VPU L1 data cache write miss. Counts the number of occurrences.", .code = 0x2004, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "VPU_ELEMENTS_ACTIVE", .desc = "Counts the cumulative number of elements active (via mask) for VPU instructions issued.", .code = 0x2018, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "VPU_INSTRUCTIONS_EXECUTED", .desc = "Counts the number of VPU instructions executed in both u- and v-pipes.", .code = 0x2016, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "VPU_INSTRUCTIONS_EXECUTED_V_PIPE", .desc = "Counts the number of VPU instructions that paired and executed in the v-pipe.", .code = 0x2017, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, { .name = "VPU_STALL_REG", .desc = "VPU stall on Register Dependency. Counts the number of occurrences. Dependencies will include RAW, WAW, WAR.", .code = 0x2005, .cntmsk = 0x3, .modmsk = INTEL_V3_ATTRS, }, };
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_knl_events.h
/* * Copyright (c) 2016 Intel Corp.
All rights reserved * Contributed by Peinan Zhang * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: knl (Intel Knights Landing) */ static const intel_x86_umask_t knl_icache[]={ { .uname = "HIT", .udesc = "Counts all instruction fetches that hit the instruction cache.", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISSES", .udesc = "Counts all instruction fetches that miss the instruction cache or produce memory requests. 
An instruction fetch miss is counted only once and not once for every cycle it is outstanding.", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ACCESSES", .udesc = "Counts all instruction fetches, including uncacheable fetches.", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t knl_uops_retired[]={ { .uname = "ALL", .udesc = "Counts the number of micro-ops retired.", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "MS", .udesc = "Counts the number of micro-ops retired that are from the complex flows issued by the micro-sequencer (MS).", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR_SIMD", .udesc = "Counts the number of scalar SSE, AVX, AVX2, AVX-512 micro-ops retired. More specifically, it counts scalar SSE, AVX, AVX2, AVX-512 micro-ops except for loads (memory-to-register mov-type micro ops), division, sqrt.", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PACKED_SIMD", .udesc = "Counts the number of vector SSE, AVX, AVX2, AVX-512 micro-ops retired. 
More specifically, it counts packed SSE, AVX, AVX2, AVX-512 micro-ops (both floating point and integer) except for loads (memory-to-register mov-type micro-ops), packed byte and word multiplies.", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_inst_retired[]={ { .uname = "ANY_P", .udesc = "Instructions retired using generic counter (precise event)", .ucode = 0x0, .uflags = INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "ANY", .udesc = "Instructions retired using generic counter (precise event)", .uequiv = "ANY_P", .ucode = 0x0, .uflags = INTEL_X86_PEBS, }, }; static const intel_x86_umask_t knl_l2_requests_reject[]={ { .uname = "ALL", .udesc = "Counts the number of MEC requests from the L2Q that reference a cache line excluding SW prefetches filling only to L2 cache and L1 evictions (automatically excludes L2HWP, UC, WC) that were rejected - Multiple repeated rejects should be counted multiple times.", .ucode = 0x000, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t knl_core_reject[]={ { .uname = "ALL", .udesc = "Counts the number of MEC requests that were not accepted into the L2Q because of any L2 queue reject condition. There is no concept of at-ret here.
It might include requests due to instructions in the speculative path", .ucode = 0x000, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t knl_machine_clears[]={ { .uname = "SMC", .udesc = "Counts the number of times that the machine clears due to program modifying data within 1K of a recently fetched code page.", .ucode = 0x0100, .uflags = INTEL_X86_DFL, }, { .uname = "MEMORY_ORDERING", .udesc = "Counts the number of times the machine clears due to memory ordering hazards", .ucode = 0x0200, }, { .uname = "FP_ASSIST", .udesc = "Counts the number of floating operations retired that required microcode assists", .ucode = 0x0400, }, { .uname = "ALL", .udesc = "Counts all nukes", .ucode = 0x0800, }, { .uname = "ANY", .udesc = "Counts all nukes", .uequiv = "ALL", .ucode = 0x0800, }, }; static const intel_x86_umask_t knl_br_inst_retired[]={ { .uname = "ANY", .udesc = "Counts the number of branch instructions retired (Precise Event)", .ucode = 0x0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_PEBS, }, { .uname = "ALL_BRANCHES", .udesc = "Counts the number of branch instructions retired", .uequiv = "ANY", .ucode = 0x0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "JCC", .udesc = "Counts the number of branch instructions retired that were conditional jumps.", .ucode = 0x7e00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "TAKEN_JCC", .udesc = "Counts the number of branch instructions retired that were conditional jumps and predicted taken.", .ucode = 0xfe00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "CALL", .udesc = "Counts the number of near CALL branch instructions retired.", .ucode = 0xf900, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REL_CALL", .udesc = "Counts the number of near relative CALL branch instructions retired.", .ucode = 0xfd00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "IND_CALL", .udesc = "Counts the number of near indirect CALL branch instructions retired. 
(Precise Event)", .ucode = 0xfb00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RETURN", .udesc = "Counts the number of near RET branch instructions retired. (Precise Event)", .ucode = 0xf700, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NON_RETURN_IND", .udesc = "Counts the number of branch instructions retired that were near indirect CALL or near indirect JMP. (Precise Event)", .ucode = 0xeb00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "FAR_BRANCH", .udesc = "Counts the number of far branch instructions retired. (Precise Event)", .uequiv = "FAR", .ucode = 0xbf00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "FAR", .udesc = "Counts the number of far branch instructions retired. (Precise Event)", .ucode = 0xbf00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t knl_fetch_stall[]={ { .uname = "ICACHE_FILL_PENDING_CYCLES", .udesc = "Counts the number of core cycles the fetch stalls because of an icache miss. 
This is a cumulative count of core cycles the fetch stalled for all icache misses", .ucode = 0x0400, .uflags = INTEL_X86_DFL | INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_baclears[]={ { .uname = "ALL", .udesc = "Counts the number of times the front end resteers for any branch as a result of another branch handling mechanism in the front end.", .ucode = 0x100, .uflags = INTEL_X86_DFL | INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Counts the number of times the front end resteers for any branch as a result of another branch handling mechanism in the front end.", .uequiv = "ALL", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RETURN", .udesc = "Counts the number of times the front end resteers for RET branches as a result of another branch handling mechanism in the front end.", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "COND", .udesc = "Counts the number of times the front end resteers for conditional branches as a result of another branch handling mechanism in the front end.", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_cpu_clk_unhalted[]={ { .uname = "THREAD_P", .udesc = "thread cycles when core is not halted", .ucode = 0x0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "BUS", .udesc = "Bus cycles when core is not halted. This event can give a measurement of the elapsed time. This event has a constant ratio with CPU_CLK_UNHALTED:REF event, which is the maximum bus to processor frequency ratio", .uequiv = "REF_P", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REF_P", .udesc = "Number of reference cycles that the cpu is not in a halted state. The core enters the halted state when it is running the HLT instruction. In mobile systems, the core frequency may change from time to time.
This event is not affected by core frequency changes but counts as if the core is running at the same maximum frequency all the time", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_mem_uops_retired[]={ { .uname = "L1_MISS_LOADS", .udesc = "Counts the number of load micro-ops retired that miss in L1 D cache.", .ucode = 0x100, }, { .uname = "LD_DCU_MISS", .udesc = "Counts the number of load micro-ops retired that miss in L1 D cache.", .uequiv = "L1_MISS_LOADS", .ucode = 0x100, }, { .uname = "L2_HIT_LOADS", .udesc = "Counts the number of load micro-ops retired that hit in the L2.", .ucode = 0x200, .uflags = INTEL_X86_PEBS, }, { .uname = "L2_MISS_LOADS", .udesc = "Counts the number of load micro-ops retired that miss in the L2.", .ucode = 0x400, .uflags = INTEL_X86_PEBS, }, { .uname = "LD_L2_MISS", .udesc = "Counts the number of load micro-ops retired that miss in the L2.", .uequiv = "L2_MISS_LOADS", .ucode = 0x400, .uflags = INTEL_X86_PEBS, }, { .uname = "DTLB_MISS_LOADS", .udesc = "Counts the number of load micro-ops retired that cause a DTLB miss.", .ucode = 0x800, .uflags = INTEL_X86_PEBS, }, { .uname = "UTLB_MISS_LOADS", .udesc = "Counts the number of load micro-ops retired that caused micro TLB miss.", .ucode = 0x1000, }, { .uname = "LD_UTLB_MISS", .udesc = "Counts the number of load micro-ops retired that caused micro TLB miss.", .uequiv = "UTLB_MISS_LOADS", .ucode = 0x1000, }, { .uname = "HITM", .udesc = "Counts the loads retired that get the data from the other core in the same tile in M state.", .ucode = 0x2000, .uflags = INTEL_X86_PEBS, }, { .uname = "ALL_LOADS", .udesc = "Counts all the load micro-ops retired.", .ucode = 0x4000, .uflags = INTEL_X86_DFL, }, { .uname = "ANY_LD", .udesc = "Counts all the load micro-ops retired.", .uequiv = "ALL_LOADS", .ucode = 0x4000, }, { .uname = "ALL_STORES", .udesc = "Counts all the store micro-ops retired.", .ucode = 0x8000, }, { .uname = "ANY_ST", .udesc = "Counts all the store
micro-ops retired.", .uequiv = "ALL_STORES", .ucode = 0x8000, }, }; static const intel_x86_umask_t knl_page_walks[]={ { .uname = "D_SIDE_CYCLES", .udesc = "Counts the total number of core cycles for all the D-side page walks. The cycles for page walks started in speculative path will also be included.", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "D_SIDE_WALKS", .udesc = "Counts the total D-side page walks that are completed or started. The page walks started in the speculative path will also be counted.", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (1ULL << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "I_SIDE_CYCLES", .udesc = "Counts the total number of core cycles for all the I-side page walks. The cycles for page walks started in speculative path will also be included.", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "I_SIDE_WALKS", .udesc = "Counts the total I-side page walks that are completed.", .ucode = 0x200 | INTEL_X86_MOD_EDGE | (1ULL << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES", .udesc = "Counts the total number of core cycles for all the page walks. The cycles for page walks started in speculative path will also be included.", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "WALKS", .udesc = "Counts the total page walks completed (I-side and D-side)", .ucode = 0x300 | INTEL_X86_MOD_EDGE | (1ULL << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_l2_rqsts[]={ { .uname = "MISS", .udesc = "Counts the number of L2 cache misses", .ucode = 0x4100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REFERENCE", .udesc = "Counts the total number of L2 cache references.", .ucode = 0x4f00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t knl_recycleq[]={ { .uname = "LD_BLOCK_ST_FORWARD", .udesc = "Counts the number of occurrences a retired load gets blocked because its address partially overlaps with a store (Precise Event).", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LD_BLOCK_STD_NOTREADY", .udesc = "Counts the number of occurrences a retired load gets blocked because its address overlaps with a store whose data is not ready.", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ST_SPLITS", .udesc = "Counts the number of occurrences a retired store that is a cache line split. Each split should be counted only once.", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LD_SPLITS", .udesc = "Counts the number of occurrences a retired load that is a cache line split. Each split should be counted only once (Precise Event).", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOCK", .udesc = "Counts all the retired locked loads.
It does not include stores because we would double count if we count stores.", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STA_FULL", .udesc = "Counts the store micro-ops retired that were pushed in the rehab queue because the store address buffer is full.", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY_LD", .udesc = "Counts any retired load that was pushed into the recycle queue for any reason.", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ANY_ST", .udesc = "Counts any retired store that was pushed into the recycle queue for any reason.", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_offcore_response_0[]={ { .uname = "DMND_DATA_RD", .udesc = "Counts demand cacheable data and L1 prefetch data reads", .ucode = 1ULL << (0 + 8), .grpid = 0, }, { .uname = "DMND_RFO", .udesc = "Counts Demand cacheable data writes", .ucode = 1ULL << (1 + 8), .grpid = 0, }, { .uname = "DMND_CODE_RD", .udesc = "Counts demand code reads and prefetch code reads", .ucode = 1ULL << (2 + 8), .grpid = 0, }, { .uname = "PF_L2_RFO", .udesc = "Counts L2 data RFO prefetches (includes PREFETCHW instruction)", .ucode = 1ULL << (5 + 8), .grpid = 0, }, { .uname = "PF_L2_CODE_RD", .udesc = "Request: number of code reads generated by L2 prefetchers", .ucode = 1ULL << (6 + 8), .grpid = 0, }, { .uname = "PARTIAL_READS", .udesc = "Counts Partial reads (UC or WC and is valid only for Outstanding response type).", .ucode = 1ULL << (7 + 8), .grpid = 0, }, { .uname = "PARTIAL_WRITES", .udesc = "Counts Partial writes (UC or WT or WP and should be programmed on PMC1)", .ucode = 1ULL << (8 + 8), .grpid = 0, }, { .uname = "UC_CODE_READS", .udesc = "Counts UC code reads (valid only for Outstanding response type)", .ucode = 1ULL << (9 + 8), .grpid = 0, }, { .uname = "BUS_LOCKS", .udesc = "Counts Bus locks and split lock requests", .ucode = 1ULL << (10 + 8), .grpid = 0, }, { .uname = "FULL_STREAMING_STORES", .udesc
= "Counts Full streaming stores (WC and should be programmed on PMC1)", .ucode = 1ULL << (11 + 8), .grpid = 0, }, { .uname = "PF_SOFTWARE", .udesc = "Counts Software prefetches", .ucode = 1ULL << (12 + 8), .grpid = 0, }, { .uname = "PF_L1_DATA_RD", .udesc = "Counts L1 data HW prefetches", .ucode = 1ULL << (13 + 8), .grpid = 0, }, { .uname = "PARTIAL_STREAMING_STORES", .udesc = "Counts Partial streaming stores (WC and should be programmed on PMC1)", .ucode = 1ULL << (14 + 8), .grpid = 0, }, { .uname = "STREAMING_STORES", .udesc = "Counts all streaming stores (WC and should be programmed on PMC1)", .ucode = (1ULL << 14 | 1ULL << 11) << 8, .uequiv = "PARTIAL_STREAMING_STORES:FULL_STREAMING_STORES", .grpid = 0, }, { .uname = "ANY_REQUEST", .udesc = "Counts any request", .ucode = 1ULL << (15 + 8), .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "ANY_DATA_RD", .udesc = "Counts Demand cacheable data and L1 prefetch data read requests", .ucode = (1ULL << 0 | 1ULL << 7 | 1ULL << 12 | 1ULL << 13) << 8, .uequiv = "DMND_DATA_RD:PARTIAL_READS:PF_SOFTWARE:PF_L1_DATA_RD", .grpid = 0, }, { .uname = "ANY_RFO", .udesc = "Counts Demand cacheable data write requests", .ucode = (1ULL << 1 | 1ULL << 5) << 8, .grpid = 0, }, { .uname = "ANY_CODE_RD", .udesc = "Counts Demand code reads and prefetch code read requests", .ucode = (1ULL << 2 | 1ULL << 6) << 8, .uequiv = "DMND_CODE_RD:PF_L2_CODE_RD", .grpid = 0, }, { .uname = "ANY_READ", .udesc = "Counts any Read request", .ucode = (1ULL << 0 | 1ULL << 1 | 1ULL << 2 | 1ULL << 5 | 1ULL << 6 | 1ULL << 7 | 1ULL << 9 | 1ULL << 12 | 1ULL << 13 ) << 8, .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:PF_L2_RFO:PF_L2_CODE_RD:PARTIAL_READS:UC_CODE_READS:PF_SOFTWARE:PF_L1_DATA_RD", .grpid = 0, }, { .uname = "ANY_PF_L2", .udesc = "Counts any Prefetch requests", .ucode = (1ULL << 5 | 1ULL << 6) << 8, .uequiv = "PF_L2_RFO:PF_L2_CODE_RD", .grpid = 0, }, { .uname = "ANY_RESPONSE", .udesc = "Accounts for any response", .ucode = (1ULL << 
16) << 8, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, .grpid = 1, }, { .uname = "DDR_NEAR", .udesc = "Accounts for data responses from DRAM Local.", .ucode = (1ULL << 31 | 1ULL << 23 ) << 8, .grpid = 1, }, { .uname = "DDR_FAR", .udesc = "Accounts for data responses from DRAM Far.", .ucode = (1ULL << 31 | 1ULL << 24 ) << 8, .grpid = 1, }, { .uname = "MCDRAM_NEAR", .udesc = "Accounts for data responses from MCDRAM Local.", .ucode = (1ULL << 31 | 1ULL << 21 ) << 8, .grpid = 1, }, { .uname = "MCDRAM_FAR", .udesc = "Accounts for data responses from MCDRAM Far or Other tile L2 hit far.", .ucode = (1ULL << 32 | 1ULL << 22 ) << 8, .grpid = 1, }, { .uname = "L2_HIT_NEAR_TILE_E_F", .udesc = "Accounts for responses from a snoop request hit with data forwarded from its Near-other tile's L2 in E/F state.", .ucode = (1ULL << 35 | 1ULL << 19 ) << 8, .grpid = 1, }, { .uname = "L2_HIT_NEAR_TILE_M", .udesc = "Accounts for responses from a snoop request hit with data forwarded from its Near-other tile's L2 in M state.", .ucode = (1ULL << 36 | 1ULL << 19 ) << 8, .grpid = 1, }, { .uname = "L2_HIT_FAR_TILE_E_F", .udesc = "Accounts for responses from a snoop request hit with data forwarded from its Far(not in the same quadrant as the request)-other tile's L2 in E/F state. Valid only for SNC4 cluster mode.", .ucode = (1ULL << 35 | 1ULL << 22 ) << 8, .grpid = 1, }, { .uname = "L2_HIT_FAR_TILE_M", .udesc = "Accounts for responses from a snoop request hit with data forwarded from its Far(not in the same quadrant as the request)-other tile's L2 in M state.", .ucode = (1ULL << 36 | 1ULL << 22 ) << 8, .grpid = 1, }, { .uname = "NON_DRAM", .udesc = "accounts for responses from any NON_DRAM system address. 
This includes MMIO transactions", .ucode = (1ULL << 37 | 1ULL << 17 ) << 8, .grpid = 1, }, { .uname = "MCDRAM", .udesc = "accounts for responses from MCDRAM (local and far)", .ucode = (1ULL << 32 | 1ULL << 31 | 1ULL << 22 | 1ULL << 21 ) << 8, .grpid = 1, }, { .uname = "DDR", .udesc = "accounts for responses from DDR (local and far)", .ucode = (1ULL << 32 | 1ULL << 31 | 1ULL << 24 | 1ULL << 23 ) << 8, .grpid = 1, }, { .uname = "L2_HIT_NEAR_TILE", .udesc = " accounts for responses from snoop request hit with data forwarded from its Near-other tile L2 in E/F/M state", .ucode = (1ULL << 36 | 1ULL << 35 | 1ULL << 20 | 1ULL << 19 ) << 8, .grpid = 1, }, { .uname = "L2_HIT_FAR_TILE", .udesc = "accounts for responses from snoop request hit with data forwarded from it Far(not in the same quadrant as the request)-other tile L2 in E/F/M state. Valid only in SNC4 Cluster mode.", .ucode = (1ULL << 36 | 1ULL << 35 | 1ULL << 22 ) << 8, .grpid = 1, }, { .uname = "OUTSTANDING", .udesc = "outstanding, per weighted cycle, from the time of the request to when any response is received. 
The outstanding response should be programmed only on PMC0.", .ucode = (1ULL << 38) << 8, .uflags = INTEL_X86_GRP_DFL_NONE | INTEL_X86_EXCL_GRP_BUT_0, /* can only be combined with request type bits (grpid = 0) */ .grpid = 2, }, }; static const intel_x86_umask_t knl_offcore_response_1[]={ { .uname = "DMND_DATA_RD", .udesc = "Counts demand cacheable data and L1 prefetch data reads", .ucode = 1ULL << (0 + 8), .grpid = 0, }, { .uname = "DMND_RFO", .udesc = "Counts Demand cacheable data writes", .ucode = 1ULL << (1 + 8), .grpid = 0, }, { .uname = "DMND_CODE_RD", .udesc = "Counts demand code reads and prefetch code reads", .ucode = 1ULL << (2 + 8), .grpid = 0, }, { .uname = "PF_L2_RFO", .udesc = "Counts L2 data RFO prefetches (includes PREFETCHW instruction)", .ucode = 1ULL << (5 + 8), .grpid = 0, }, { .uname = "PF_L2_CODE_RD", .udesc = "Request: number of code reads generated by L2 prefetchers", .ucode = 1ULL << (6 + 8), .grpid = 0, }, { .uname = "PARTIAL_READS", .udesc = "Counts Partial reads (UC or WC and is valid only for Outstanding response type).", .ucode = 1ULL << (7 + 8), .grpid = 0, }, { .uname = "PARTIAL_WRITES", .udesc = "Counts Partial writes (UC or WT or WP and should be programmed on PMC1)", .ucode = 1ULL << (8 + 8), .grpid = 0, }, { .uname = "UC_CODE_READS", .udesc = "Counts UC code reads (valid only for Outstanding response type)", .ucode = 1ULL << (9 + 8), .grpid = 0, }, { .uname = "BUS_LOCKS", .udesc = "Counts Bus locks and split lock requests", .ucode = 1ULL << (10 + 8), .grpid = 0, }, { .uname = "FULL_STREAMING_STORES", .udesc = "Counts Full streaming stores (WC and should be programmed on PMC1)", .ucode = 1ULL << (11 + 8), .grpid = 0, }, { .uname = "PF_SOFTWARE", .udesc = "Counts Software prefetches", .ucode = 1ULL << (12 + 8), .grpid = 0, }, { .uname = "PF_L1_DATA_RD", .udesc = "Counts L1 data HW prefetches", .ucode = 1ULL << (13 + 8), .grpid = 0, }, { .uname = "PARTIAL_STREAMING_STORES", .udesc = "Counts Partial streaming stores (WC and should be
programmed on PMC1)", .ucode = 1ULL << (14 + 8), .grpid = 0, }, { .uname = "STREAMING_STORES", .udesc = "Counts all streaming stores (WC and should be programmed on PMC1)", .ucode = (1ULL << 14 | 1ULL << 11) << 8, .uequiv = "PARTIAL_STREAMING_STORES:FULL_STREAMING_STORES", .grpid = 0, }, { .uname = "ANY_REQUEST", .udesc = "Counts any request", .ucode = 1ULL << (15 + 8), .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "ANY_DATA_RD", .udesc = "Counts Demand cacheable data and L1 prefetch data read requests", .ucode = (1ULL << 0 | 1ULL << 7 | 1ULL << 12 | 1ULL << 13) << 8, .uequiv = "DMND_DATA_RD:PARTIAL_READS:PF_SOFTWARE:PF_L1_DATA_RD", .grpid = 0, }, { .uname = "ANY_RFO", .udesc = "Counts Demand cacheable data write requests", .ucode = (1ULL << 1 | 1ULL << 5) << 8, .grpid = 0, }, { .uname = "ANY_CODE_RD", .udesc = "Counts Demand code reads and prefetch code read requests", .ucode = (1ULL << 2 | 1ULL << 6) << 8, .uequiv = "DMND_CODE_RD:PF_L2_CODE_RD", .grpid = 0, }, { .uname = "ANY_READ", .udesc = "Counts any Read request", .ucode = (1ULL << 0 | 1ULL << 1 | 1ULL << 2 | 1ULL << 5 | 1ULL << 6 | 1ULL << 7 | 1ULL << 9 | 1ULL << 12 | 1ULL << 13 ) << 8, .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:PF_L2_RFO:PF_L2_CODE_RD:PARTIAL_READS:UC_CODE_READS:PF_SOFTWARE:PF_L1_DATA_RD", .grpid = 0, }, { .uname = "ANY_PF_L2", .udesc = "Counts any Prefetch requests", .ucode = (1ULL << 5 | 1ULL << 6) << 8, .uequiv = "PF_L2_RFO:PF_L2_CODE_RD", .grpid = 0, }, { .uname = "ANY_RESPONSE", .udesc = "Accounts for any response", .ucode = (1ULL << 16) << 8, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, .grpid = 1, }, { .uname = "DDR_NEAR", .udesc = "Accounts for data responses from DRAM Local.", .ucode = (1ULL << 31 | 1ULL << 23 ) << 8, .grpid = 1, }, { .uname = "DDR_FAR", .udesc = "Accounts for data responses from DRAM Far.", .ucode = (1ULL << 31 | 1ULL << 24 ) << 8, .grpid = 1, }, { .uname = "MCDRAM_NEAR", .udesc = "Accounts for data responses from 
MCDRAM Local.", .ucode = (1ULL << 31 | 1ULL << 21 ) << 8, .grpid = 1, }, { .uname = "MCDRAM_FAR", .udesc = "Accounts for data responses from MCDRAM Far or Other tile L2 hit far.", .ucode = (1ULL << 32 | 1ULL << 22 ) << 8, .grpid = 1, }, { .uname = "L2_HIT_NEAR_TILE_E_F", .udesc = "Accounts for responses from a snoop request hit with data forwarded from its Near-other tile's L2 in E/F state.", .ucode = (1ULL << 35 | 1ULL << 19 ) << 8, .grpid = 1, }, { .uname = "L2_HIT_NEAR_TILE_M", .udesc = "Accounts for responses from a snoop request hit with data forwarded from its Near-other tile's L2 in M state.", .ucode = (1ULL << 36 | 1ULL << 19 ) << 8, .grpid = 1, }, { .uname = "L2_HIT_FAR_TILE_E_F", .udesc = "Accounts for responses from a snoop request hit with data forwarded from its Far(not in the same quadrant as the request)-other tile's L2 in E/F state. Valid only for SNC4 cluster mode.", .ucode = (1ULL << 35 | 1ULL << 22 ) << 8, .grpid = 1, }, { .uname = "L2_HIT_FAR_TILE_M", .udesc = "Accounts for responses from a snoop request hit with data forwarded from its Far(not in the same quadrant as the request)-other tile's L2 in M state.", .ucode = (1ULL << 36 | 1ULL << 22 ) << 8, .grpid = 1, }, { .uname = "NON_DRAM", .udesc = "accounts for responses from any NON_DRAM system address. 
This includes MMIO transactions", .ucode = (1ULL << 37 | 1ULL << 17 ) << 8, .grpid = 1, }, { .uname = "MCDRAM", .udesc = "accounts for responses from MCDRAM (local and far)", .ucode = (1ULL << 32 | 1ULL << 31 | 1ULL << 22 | 1ULL << 21 ) << 8, .grpid = 1, }, { .uname = "DDR", .udesc = "accounts for responses from DDR (local and far)", .ucode = (1ULL << 32 | 1ULL << 31 | 1ULL << 24 | 1ULL << 23 ) << 8, .grpid = 1, }, { .uname = "L2_HIT_NEAR_TILE", .udesc = " accounts for responses from snoop request hit with data forwarded from its Near-other tile L2 in E/F/M state", .ucode = (1ULL << 36 | 1ULL << 35 | 1ULL << 20 | 1ULL << 19 ) << 8, .grpid = 1, }, { .uname = "L2_HIT_FAR_TILE", .udesc = "accounts for responses from snoop request hit with data forwarded from it Far(not in the same quadrant as the request)-other tile L2 in E/F/M state. Valid only in SNC4 Cluster mode.", .ucode = (1ULL << 36 | 1ULL << 35 | 1ULL << 22 ) << 8, .grpid = 1, }, }; static const intel_x86_umask_t knl_br_misp_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "All mispredicted branches (Precise Event)", .uequiv = "ANY", .ucode = 0x0000, /* architectural encoding */ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY", .udesc = "All mispredicted branches (Precise Event)", .ucode = 0x0000, /* architectural encoding */ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "JCC", .udesc = "Number of mispredicted conditional branch instructions retired (Precise Event)", .ucode = 0x7e00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NON_RETURN_IND", .udesc = "Number of mispredicted non-return branch instructions retired (Precise Event)", .ucode = 0xeb00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RETURN", .udesc = "Number of mispredicted return branch instructions retired (Precise Event)", .ucode = 0xf700, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "IND_CALL", .udesc = "Number of mispredicted indirect call branch 
instructions retired (Precise Event)", .ucode = 0xfb00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "TAKEN_JCC", .udesc = "Number of mispredicted taken conditional branch instructions retired (Precise Event)", .ucode = 0xfe00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "CALL", .udesc = "Counts the number of mispredicted near CALL branch instructions retired.", .ucode = 0xf900, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REL_CALL", .udesc = "Counts the number of mispredicted near relative CALL branch instructions retired.", .ucode = 0xfd00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "FAR_BRANCH", .udesc = "Counts the number of mispredicted far branch instructions retired.", .ucode = 0xbf00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t knl_no_alloc_cycles[]={ { .uname = "ROB_FULL", .udesc = "Counts the number of core cycles when no micro-ops are allocated and the ROB is full", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISPREDICTS", .udesc = "Counts the number of core cycles when no micro-ops are allocated and the alloc pipe is stalled waiting for a mispredicted branch to retire.", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RAT_STALL", .udesc = "Counts the number of core cycles when no micro-ops are allocated and a RATstall (caused by reservation station full) is asserted.", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NOT_DELIVERED", .udesc = "Counts the number of core cycles when no micro-ops are allocated, the IQ is empty, and no other condition is blocking allocation.", .ucode = 0x9000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL", .udesc = "Counts the total number of core cycles when no micro-ops are allocated for any reason.", .ucode = 0x7f00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ANY", .udesc = "Counts the total number of core cycles when no micro-ops are allocated for any reason.", .uequiv 
= "ALL", .ucode = 0x7f00, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_rs_full_stall[]={ { .uname = "MEC", .udesc = "Counts the number of core cycles when allocation pipeline is stalled and is waiting for a free MEC reservation station entry.", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Counts the total number of core cycles the Alloc pipeline is stalled when any one of the reservation stations is full.", .ucode = 0x1f00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t knl_cycles_div_busy[]={ { .uname = "ALL", .udesc = "Counts the number of core cycles when divider is busy. Does not imply a stall waiting for the divider.", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t knl_ms_decoded[]={ { .uname = "ENTRY", .udesc = "Counts the number of times the MSROM starts a flow of uops.", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t knl_decode_restriction[]={ { .uname = "PREDECODE_WRONG", .udesc = "Number of times the prediction (from the predecode cache) for instruction length is incorrect", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_entry_t intel_knl_pe[]={ { .name = "UNHALTED_CORE_CYCLES", .desc = "Unhalted core cycles", .modmsk = INTEL_V3_ATTRS, /* any thread only supported in fixed counter */ .cntmsk = 0x200000003ull, .code = 0x3c, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycle", .modmsk = INTEL_FIXED3_ATTRS, .cntmsk = 0x400000000ull, .code = 0x0300, /* pseudo encoding */ .flags = INTEL_X86_FIXED, }, { .name = "INSTRUCTION_RETIRED", .desc = "Instructions retired (any thread modifier supported in fixed counter)", .modmsk = INTEL_V3_ATTRS, /* any thread only supported in fixed counter */ .cntmsk = 0x100000003ull, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "This is an alias for 
INSTRUCTION_RETIRED (any thread modifier supported in fixed counter)", .modmsk = INTEL_V3_ATTRS, /* any thread only supported in fixed counter */ .equiv = "INSTRUCTION_RETIRED", .cntmsk = 0x10003, .code = 0xc0, }, { .name = "LLC_REFERENCES", .desc = "Last level of cache references", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x4f2e, }, { .name = "LAST_LEVEL_CACHE_REFERENCES", .desc = "This is an alias for LLC_REFERENCES", .modmsk = INTEL_V2_ATTRS, .equiv = "LLC_REFERENCES", .cntmsk = 0x3, .code = 0x4f2e, }, { .name = "LLC_MISSES", .desc = "Last level of cache misses", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x412e, }, { .name = "LAST_LEVEL_CACHE_MISSES", .desc = "This is an alias for LLC_MISSES", .modmsk = INTEL_V2_ATTRS, .equiv = "LLC_MISSES", .cntmsk = 0x3, .code = 0x412e, }, { .name = "BRANCH_INSTRUCTIONS_RETIRED", .desc = "Branch instructions retired", .modmsk = INTEL_V2_ATTRS, .equiv = "BR_INST_RETIRED:ANY", .cntmsk = 0x3, .code = 0xc4, }, { .name = "MISPREDICTED_BRANCH_RETIRED", .desc = "Mispredicted branch instruction retired", .equiv = "BR_MISP_RETIRED:ANY", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xc5, .flags = INTEL_X86_PEBS, }, /* begin model specific events */ { .name = "ICACHE", .desc = "Instruction fetches", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x80, .numasks = LIBPFM_ARRAY_SIZE(knl_icache), .ngrp = 1, .umasks = knl_icache, }, { .name = "UOPS_RETIRED", .desc = "Micro-ops retired", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xc2, .numasks = LIBPFM_ARRAY_SIZE(knl_uops_retired), .ngrp = 1, .umasks = knl_uops_retired, }, { .name = "INST_RETIRED", .desc = "Instructions retired", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xc0, .flags = INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(knl_inst_retired), .ngrp = 1, .umasks = knl_inst_retired, }, { .name = "CYCLES_DIV_BUSY", .desc = "Counts the number of core cycles when divider is busy.", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xcd, .numasks = 
LIBPFM_ARRAY_SIZE(knl_cycles_div_busy), .ngrp = 1, .umasks = knl_cycles_div_busy, }, { .name = "RS_FULL_STALL", .desc = "Counts the number of core cycles when allocation pipeline is stalled.", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xcb, .numasks = LIBPFM_ARRAY_SIZE(knl_rs_full_stall), .ngrp = 1, .umasks = knl_rs_full_stall, }, { .name = "L2_REQUESTS", .desc = "L2 cache requests", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x2e, .numasks = LIBPFM_ARRAY_SIZE(knl_l2_rqsts), .ngrp = 1, .umasks = knl_l2_rqsts, }, { .name = "MACHINE_CLEARS", .desc = "Counts the number of times that the machine clears.", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xc3, .numasks = LIBPFM_ARRAY_SIZE(knl_machine_clears), .ngrp = 1, .umasks = knl_machine_clears, }, { .name = "BR_INST_RETIRED", .desc = "Retired branch instructions", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xc4, .numasks = LIBPFM_ARRAY_SIZE(knl_br_inst_retired), .flags = INTEL_X86_PEBS, .ngrp = 1, .umasks = knl_br_inst_retired, }, { .name = "BR_MISP_RETIRED", .desc = "Counts the number of mispredicted branch instructions retired.", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xc5, .flags = INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(knl_br_misp_retired), .ngrp = 1, .umasks = knl_br_misp_retired, }, { .name = "MS_DECODED", .desc = "Number of times the MSROM starts a flow of uops.", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xe7, .numasks = LIBPFM_ARRAY_SIZE(knl_ms_decoded), .ngrp = 1, .umasks = knl_ms_decoded, }, { .name = "FETCH_STALL", .desc = "Counts the number of core cycles the fetch stalls.", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x86, .numasks = LIBPFM_ARRAY_SIZE(knl_fetch_stall), .ngrp = 1, .umasks = knl_fetch_stall, }, { .name = "BACLEARS", .desc = "Branch address calculator", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xe6, .numasks = LIBPFM_ARRAY_SIZE(knl_baclears), .ngrp = 1, .umasks = knl_baclears, }, { .name = "NO_ALLOC_CYCLES", .desc = "Front-end 
allocation", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xca, .numasks = LIBPFM_ARRAY_SIZE(knl_no_alloc_cycles), .ngrp = 1, .umasks = knl_no_alloc_cycles, }, { .name = "CPU_CLK_UNHALTED", .desc = "Core cycles when core is not halted", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x3c, .numasks = LIBPFM_ARRAY_SIZE(knl_cpu_clk_unhalted), .ngrp = 1, .umasks = knl_cpu_clk_unhalted, }, { .name = "MEM_UOPS_RETIRED", .desc = "Counts the number of load micro-ops retired.", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x4, .flags = INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(knl_mem_uops_retired), .ngrp = 1, .umasks = knl_mem_uops_retired, }, { .name = "PAGE_WALKS", .desc = "Number of page-walks executed", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x5, .numasks = LIBPFM_ARRAY_SIZE(knl_page_walks), .ngrp = 1, .umasks = knl_page_walks, }, { .name = "L2_REQUESTS_REJECT", .desc = "Counts the number of MEC requests from the L2Q that reference a cache line and were rejected.", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x30, .numasks = LIBPFM_ARRAY_SIZE(knl_l2_requests_reject), .ngrp = 1, .umasks = knl_l2_requests_reject, }, { .name = "CORE_REJECT_L2Q", .desc = "Number of requests not accepted into the L2Q because of any L2 queue reject condition.", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x31, .numasks = LIBPFM_ARRAY_SIZE(knl_core_reject), .ngrp = 1, .umasks = knl_core_reject, }, { .name = "RECYCLEQ", .desc = "Counts the number of occurrences a retired load gets blocked.", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x03, .flags = INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(knl_recycleq), .ngrp = 1, .umasks = knl_recycleq, }, { .name = "OFFCORE_RESPONSE_0", .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xf, .code = 0x01b7, .flags = INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(knl_offcore_response_0), .ngrp =
3, .umasks = knl_offcore_response_0, }, { .name = "OFFCORE_RESPONSE_1", .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xf, .code = 0x02b7, .flags = INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(knl_offcore_response_1), .ngrp = 2, .umasks = knl_offcore_response_1, }, };

papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_knl_unc_cha_events.h

/* * Copyright (c) 2016 Intel Corp. All rights reserved * Contributed by Peinan Zhang * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for
* applications on Linux. * * PMU: knl_unc_cha (Intel Knights Landing CHA uncore PMU) */ static const intel_x86_umask_t knl_unc_cha_llc_lookup[]={ { .uname = "DATA_READ", .udesc = "Data read requests", .ucode = 0x0300, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITE", .udesc = "Write requests. Includes all write transactions (cached, uncached)", .ucode = 0x0500, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_SNOOP", .udesc = "External snoop request", .ucode = 0x0900, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Any request", .ucode = 0x1100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t knl_unc_cha_llc_victims[]={ { .uname = "M_STATE", .udesc = "Lines in M state", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "E_STATE", .udesc = "Lines in E state", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S_STATE", .udesc = "Lines in S state", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "F_STATE", .udesc = "Lines in F state", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL", .udesc = "Victimized Lines matching the NID filter.", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "REMOTE", .udesc = "Victimized Lines not matching the NID filter.", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_ingress_int_starved[]={ { .uname = "IRQ", .udesc = "Internal starved with IRQ.", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IPQ", .udesc = "Internal starved with IPQ.", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ISMQ", .udesc = "Internal starved with ISMQ.", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ", .udesc = "Internal starved with PRQ.", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_ingress_ext[]={ { .uname = "IRQ", .udesc = "IRQ", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ_REJ", .udesc = "IRQ rejected", .ucode =
0x0200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IPQ", .udesc = "IPQ", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ", .udesc = "PRQ", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ_REJ", .udesc = "PRQ rejected", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_ingress_entry_reject_q0[]={ { .uname = "AD_REQ_VN0", .udesc = "AD Request", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP_VN0", .udesc = "AD Response", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP_VN0", .udesc = "BL Response", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB_VN0", .udesc = "BL WB", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB_VN0", .udesc = "BL NCB", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS_VN0", .udesc = "BL NCS", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_NON_UPI", .udesc = "AK non upi", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_NON_UPI", .udesc = "IV non upi", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_ingress_entry_reject_q1[]={ { .uname = "ANY_REJECT", .udesc = "Any reject from request queue0", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "SF_VICTIM", .udesc = "SF victim", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF_WAY", .udesc = "SF way", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALLOW_SNP", .udesc = "allow snoop", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PA_MATCH", .udesc = "PA match", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_tor_subevent[]={ { .uname = "IRQ", .udesc = " -IRQ.", .ucode = 0x3100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EVICT", .udesc = " -SF/LLC Evictions.", .ucode = 0x3200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ", .udesc = " -PRQ.", .ucode = 
0x3400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IPQ", .udesc = " -IPQ.", .ucode = 0x3800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HIT", .udesc = " -Hit (Not a Miss).", .ucode = 0x1f00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS", .udesc = " -Miss.", .ucode = 0x2f00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ_HIT", .udesc = " -IRQ HIT.", .ucode = 0x1100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ_MISS", .udesc = " -IRQ MISS.", .ucode = 0x2100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ_HIT", .udesc = " -PRQ HIT.", .ucode = 0x1400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ_MISS", .udesc = " -PRQ MISS.", .ucode = 0x2400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IPQ_HIT", .udesc = " -IPQ HIT", .ucode = 0x1800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IPQ_MISS", .udesc = " -IPQ MISS", .ucode = 0x2800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_misc[]={ { .uname = "RSPI_WAS_FSE", .udesc = "Silent Snoop Eviction", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WC_ALIASING", .udesc = "Write Combining Aliasing.", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT_S", .udesc = "Counts the number of times that an RFO hits in S state.", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CV0_PREF_VIC", .udesc = "CV0 Prefetch Victim.", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CV0_PREF_MISS", .udesc = "CV0 Prefetch Miss.", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_tgr_ext[]={ { .uname = "TGR0", .udesc = "for Transgress 0", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR1", .udesc = "for Transgress 1", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR2", .udesc = "for Transgress 2", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR3", .udesc = "for Transgress 3", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR4", .udesc = "for Transgress 4", .ucode 
= 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR5", .udesc = "for Transgress 5", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR6", .udesc = "for Transgress 6", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TGR7", .udesc = "for Transgress 7", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_tgr_ext1[]={ { .uname = "TGR8", .udesc = "for Transgress 8", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY_OF_TGR0_THRU_TGR7", .udesc = "for Transgress 0-7", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_ring_type_agent[]={ { .uname = "AD_AG0", .udesc = "AD - Agent 0", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG0", .udesc = "AK - Agent 0", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG0", .udesc = "BL - Agent 0", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_AG0", .udesc = "IV - Agent 0", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_AG1", .udesc = "AD - Agent 1", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_AG1", .udesc = "AK - Agent 1", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_AG1", .udesc = "BL - Agent 1", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_ring_type[]={ { .uname = "AD", .udesc = " - AD ring", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = " - AK ring", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = " - BL ring", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV", .udesc = " - IV ring", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_dire_ext[]={ { .uname = "VERT", .udesc = " - vertical", .ucode = 0x0000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HORZ", .udesc = " - horizontal", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, }; static const 
intel_x86_umask_t knl_unc_cha_ring_use_vert[]={ { .uname = "UP_EVEN", .udesc = "UP_EVEN", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UP_ODD", .udesc = "UP_ODD", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DN_EVEN", .udesc = "DN_EVEN", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DN_ODD", .udesc = "DN_ODD", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_ring_use_hori[]={ { .uname = "LEFT_EVEN", .udesc = "LEFT_EVEN", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LEFT_ODD", .udesc = "LEFT_ODD", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_EVEN", .udesc = "RIGHT_EVEN", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT_ODD", .udesc = "RIGHT_ODD", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_ring_use_updn[]={ { .uname = "UP", .udesc = "up", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DN", .udesc = "down", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_ring_use_lfrt[]={ { .uname = "LEFT", .udesc = "left", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RIGHT", .udesc = "right", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_iv_snp[]={ { .uname = "IV_SNP_GO_UP", .udesc = "IV_SNP_GO_UP", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_SNP_GO_DN", .udesc = "IV_SNP_GO_DN", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_cms_ext[]={ { .uname = "AD_BNC", .udesc = "AD_BNC", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_BNC", .udesc = "AK_BNC", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_BNC", .udesc = "BL_BNC", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_BNC", .udesc = "IV_BNC", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc
= "AD_CRD", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL_CRD", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_cms_crd_starved[]={ { .uname = "AD_BNC", .udesc = "AD_BNC", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_BNC", .udesc = "AK_BNC", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_BNC", .udesc = "BL_BNC", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_BNC", .udesc = "IV_BNC", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD_CRD", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL_CRD", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IVF", .udesc = "IVF", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t knl_unc_cha_cms_busy_starved[]={ { .uname = "AD_BNC", .udesc = "AD_BNC", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_BNC", .udesc = "BL_BNC", .ucode = 0x0400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CRD", .udesc = "AD_CRD", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CRD", .udesc = "BL_CRD", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_knl_unc_cha_pe[]={ { .name = "UNC_H_U_CLOCKTICKS", .desc = "Uncore clockticks", .modmsk = 0x0, .cntmsk = 0xf, .code = 0x00, .flags = INTEL_X86_FIXED, }, { .name = "UNC_H_INGRESS_OCCUPANCY", .desc = "Ingress Occupancy. Counts number of entries in the specified Ingress queue in each cycle", .cntmsk = 0xf, .code = 0x11, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_ext), .umasks = knl_unc_cha_ingress_ext, }, { .name = "UNC_H_INGRESS_INSERTS", .desc = "Ingress Allocations.
Counts number of allocations per cycle into the specified Ingress queue", .cntmsk = 0xf, .code = 0x13, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_ext), .umasks = knl_unc_cha_ingress_ext, }, { .name = "UNC_H_INGRESS_INT_STARVED", .desc = "Cycles Internal Starvation", .cntmsk = 0xf, .code = 0x14, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_int_starved), .umasks = knl_unc_cha_ingress_int_starved, }, { .name = "UNC_H_INGRESS_RETRY_IRQ0_REJECT", .desc = "Ingress Request Queue Rejects", .cntmsk = 0xf, .code = 0x18, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q0), .umasks = knl_unc_cha_ingress_entry_reject_q0, }, { .name = "UNC_H_INGRESS_RETRY_IRQ01_REJECT", .desc = "Ingress Request Queue Rejects", .cntmsk = 0xf, .code = 0x19, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q1), .umasks = knl_unc_cha_ingress_entry_reject_q1, }, { .name = "UNC_H_INGRESS_RETRY_PRQ0_REJECT", .desc = "Ingress Request Queue Rejects", .cntmsk = 0xf, .code = 0x20, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q0), .umasks = knl_unc_cha_ingress_entry_reject_q0, }, { .name = "UNC_H_INGRESS_RETRY_PRQ1_REJECT", .desc = "Ingress Request Queue Rejects", .cntmsk = 0xf, .code = 0x21, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q1), .umasks = knl_unc_cha_ingress_entry_reject_q1, }, { .name = "UNC_H_INGRESS_RETRY_IPQ0_REJECT", .desc = "Ingress Request Queue Rejects", .cntmsk = 0xf, .code = 0x22, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q0), .umasks = knl_unc_cha_ingress_entry_reject_q0, }, { .name = "UNC_H_INGRESS_RETRY_IPQ1_REJECT", .desc = "Ingress Request Queue Rejects", .cntmsk = 0xf, .code = 
0x23, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q1), .umasks = knl_unc_cha_ingress_entry_reject_q1, }, { .name = "UNC_H_INGRESS_RETRY_ISMQ0_REJECT", .desc = "ISMQ Rejects", .cntmsk = 0xf, .code = 0x24, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q0), .umasks = knl_unc_cha_ingress_entry_reject_q0, }, { .name = "UNC_H_INGRESS_RETRY_REQ_Q0_RETRY", .desc = "REQUESTQ includes: IRQ, PRQ, IPQ, RRQ, WBQ (everything except for ISMQ)", .cntmsk = 0xf, .code = 0x2a, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q0), .umasks = knl_unc_cha_ingress_entry_reject_q0, }, { .name = "UNC_H_INGRESS_RETRY_REQ_Q1_RETRY", .desc = "REQUESTQ includes: IRQ, PRQ, IPQ, RRQ, WBQ (everything except for ISMQ)", .cntmsk = 0xf, .code = 0x2b, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q1), .umasks = knl_unc_cha_ingress_entry_reject_q1, }, { .name = "UNC_H_INGRESS_RETRY_ISMQ0_RETRY", .desc = "ISMQ retries", .cntmsk = 0xf, .code = 0x2c, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q0), .umasks = knl_unc_cha_ingress_entry_reject_q0, }, { .name = "UNC_H_INGRESS_RETRY_OTHER0_RETRY", .desc = "Other Queue Retries", .cntmsk = 0xf, .code = 0x2e, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q0), .umasks = knl_unc_cha_ingress_entry_reject_q0, }, { .name = "UNC_H_INGRESS_RETRY_OTHER1_RETRY", .desc = "Other Queue Retries", .cntmsk = 0xf, .code = 0x2f, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ingress_entry_reject_q1), .umasks = knl_unc_cha_ingress_entry_reject_q1, }, { .name = "UNC_H_SF_LOOKUP", .desc = "Cache Lookups. 
Counts the number of times the LLC was accessed.", .cntmsk = 0xf, .code = 0x34, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_llc_lookup), .umasks = knl_unc_cha_llc_lookup, }, { .name = "UNC_H_CACHE_LINES_VICTIMIZED", .desc = "Cache Lines Victimized. Counts the number of cache lines that were victimized (evicted) from the LLC.", .cntmsk = 0xf, .code = 0x37, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_llc_victims), .umasks = knl_unc_cha_llc_victims, }, { .name = "UNC_H_TOR_INSERTS", .desc = "Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent.", .modmsk = KNL_UNC_CHA_TOR_ATTRS, .cntmsk = 0xf, .code = 0x35, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tor_subevent), .umasks = knl_unc_cha_tor_subevent }, { .name = "UNC_H_TOR_OCCUPANCY", .desc = "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent", .modmsk = KNL_UNC_CHA_TOR_ATTRS, .cntmsk = 0xf, .code = 0x36, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tor_subevent), .umasks = knl_unc_cha_tor_subevent }, { .name = "UNC_H_MISC", .desc = "Miscellaneous events in the CHA", .cntmsk = 0xf, .code = 0x39, .ngrp = 1, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_misc), .umasks = knl_unc_cha_misc, }, { .name = "UNC_H_AG0_AD_CRD_ACQUIRED", .desc = "CMS Agent0 AD Credits Acquired.", .cntmsk = 0xf, .code = 0x80, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), .umasks = knl_unc_cha_tgr_ext, }, { .name = "UNC_H_AG0_AD_CRD_ACQUIRED_EXT", .desc = "CMS Agent0 AD Credits Acquired.", .cntmsk = 0xf, .code = 0x81, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), .umasks = knl_unc_cha_tgr_ext1, }, { .name = "UNC_H_AG0_AD_CRD_OCCUPANCY", .desc = "CMS Agent0 AD Credits Occupancy.", .cntmsk = 0xf, .code = 0x82, .ngrp = 1,
.numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), .umasks = knl_unc_cha_tgr_ext, }, { .name = "UNC_H_AG0_AD_CRD_OCCUPANCY_EXT", .desc = "CMS Agent0 AD Credits Acquired For Transgress.", .cntmsk = 0xf, .code = 0x83, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), .umasks = knl_unc_cha_tgr_ext1, }, { .name = "UNC_H_AG1_AD_CRD_ACQUIRED", .desc = "CMS Agent1 AD Credits Acquired .", .cntmsk = 0xf, .code = 0x84, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), .umasks = knl_unc_cha_tgr_ext, }, { .name = "UNC_H_AG1_AD_CRD_ACQUIRED_EXT", .desc = "CMS Agent1 AD Credits Acquired .", .cntmsk = 0xf, .code = 0x85, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), .umasks = knl_unc_cha_tgr_ext1, }, { .name = "UNC_H_AG1_AD_CRD_OCCUPANCY", .desc = "CMS Agent1 AD Credits Occupancy.", .cntmsk = 0xf, .code = 0x86, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), .umasks = knl_unc_cha_tgr_ext, }, { .name = "UNC_H_AG1_AD_CRD_OCCUPANCY_EXT", .desc = "CMS Agent1 AD Credits Occupancy.", .cntmsk = 0xf, .code = 0x87, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), .umasks = knl_unc_cha_tgr_ext1, }, { .name = "UNC_H_AG0_BL_CRD_ACQUIRED", .desc = "CMS Agent0 BL Credits Acquired.", .cntmsk = 0xf, .code = 0x88, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), .umasks = knl_unc_cha_tgr_ext, }, { .name = "UNC_H_AG0_BL_CRD_ACQUIRED_EXT", .desc = "CMS Agent0 BL Credits Acquired.", .cntmsk = 0xf, .code = 0x89, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), .umasks = knl_unc_cha_tgr_ext1, }, { .name = "UNC_H_AG0_BL_CRD_OCCUPANCY", .desc = "CMS Agent0 BL Credits Occupancy.", .cntmsk = 0xf, .code = 0x8a, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), .umasks = knl_unc_cha_tgr_ext, }, { .name = "UNC_H_AG0_BL_CRD_OCCUPANCY_EXT", .desc = "CMS Agent0 BL Credits Occupancy.", .cntmsk = 0xf, .code = 0x8b, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), .umasks = 
knl_unc_cha_tgr_ext1, }, { .name = "UNC_H_AG1_BL_CRD_ACQUIRED", .desc = "CMS Agent1 BL Credits Acquired.", .cntmsk = 0xf, .code = 0x8c, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), .umasks = knl_unc_cha_tgr_ext, }, { .name = "UNC_H_AG1_BL_CRD_ACQUIRED_EXT", .desc = "CMS Agent1 BL Credits Acquired.", .cntmsk = 0xf, .code = 0x8d, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), .umasks = knl_unc_cha_tgr_ext1, }, { .name = "UNC_H_AG1_BL_CRD_OCCUPANCY", .desc = "CMS Agent1 BL Credits Occupancy.", .cntmsk = 0xf, .code = 0x8e, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), .umasks = knl_unc_cha_tgr_ext, }, { .name = "UNC_H_AG1_BL_CRD_OCCUPANCY_EXT", .desc = "CMS Agent1 BL Credits Occupancy.", .cntmsk = 0xf, .code = 0x8f, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), .umasks = knl_unc_cha_tgr_ext1, }, { .name = "UNC_H_AG0_STALL_NO_CRD_EGRESS_HORZ_AD", .desc = "Stall on No AD Transgress Credits.", .cntmsk = 0xf, .code = 0xD0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), .umasks = knl_unc_cha_tgr_ext, }, { .name = "UNC_H_AG0_STALL_NO_CRD_EGRESS_HORZ_AD_EXT", .desc = "Stall on No AD Transgress Credits.", .cntmsk = 0xf, .code = 0xD1, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), .umasks = knl_unc_cha_tgr_ext1, }, { .name = "UNC_H_AG1_STALL_NO_CRD_EGRESS_HORZ_AD", .desc = "Stall on No AD Transgress Credits.", .cntmsk = 0xf, .code = 0xD2, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), .umasks = knl_unc_cha_tgr_ext, }, { .name = "UNC_H_AG1_STALL_NO_CRD_EGRESS_HORZ_AD_EXT", .desc = "Stall on No AD Transgress Credits.", .cntmsk = 0xf, .code = 0xD3, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), .umasks = knl_unc_cha_tgr_ext1, }, { .name = "UNC_H_AG0_STALL_NO_CRD_EGRESS_HORZ_BL", .desc = "Stall on No BL Transgress Credits.", .cntmsk = 0xf, .code = 0xD4, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), .umasks = knl_unc_cha_tgr_ext, }, { .name
= "UNC_H_AG0_STALL_NO_CRD_EGRESS_HORZ_BL_EXT", .desc = "Stall on No BL Transgress Credits.", .cntmsk = 0xf, .code = 0xD5, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), .umasks = knl_unc_cha_tgr_ext1, }, { .name = "UNC_H_AG1_STALL_NO_CRD_EGRESS_HORZ_BL", .desc = "Stall on No BL Transgress Credits.", .cntmsk = 0xf, .code = 0xD6, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext), .umasks = knl_unc_cha_tgr_ext, }, { .name = "UNC_H_AG1_STALL_NO_CRD_EGRESS_HORZ_BL_EXT", .desc = "Stall on No BL Transgress Credits.", .cntmsk = 0xf, .code = 0xD7, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_tgr_ext1), .umasks = knl_unc_cha_tgr_ext1, }, { .name = "UNC_H_EGRESS_VERT_OCCUPANCY", .desc = "CMS Vert Egress Occupancy.", .cntmsk = 0xf, .code = 0x90, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type_agent), .umasks = knl_unc_cha_ring_type_agent, }, { .name = "UNC_H_EGRESS_VERT_INSERTS", .desc = "CMS Vert Egress Allocations.", .cntmsk = 0xf, .code = 0x91, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type_agent), .umasks = knl_unc_cha_ring_type_agent, }, { .name = "UNC_H_EGRESS_VERT_CYCLES_FULL", .desc = "Cycles CMS Vertical Egress Queue Is Full.", .cntmsk = 0xf, .code = 0x92, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type_agent), .umasks = knl_unc_cha_ring_type_agent, }, { .name = "UNC_H_EGRESS_VERT_CYCLES_NE", .desc = "Cycles CMS Vertical Egress Queue Is Not Empty.", .cntmsk = 0xf, .code = 0x93, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type_agent), .umasks = knl_unc_cha_ring_type_agent, }, { .name = "UNC_H_EGRESS_VERT_NACK", .desc = "CMS Vertical Egress NACKs.", .cntmsk = 0xf, .code = 0x98, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type_agent), .umasks = knl_unc_cha_ring_type_agent, }, { .name = "UNC_H_EGRESS_VERT_STARVED", .desc = "CMS Vertical Egress Injection Starvation.", .cntmsk = 0xf, .code = 0x9a, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type_agent),
.umasks = knl_unc_cha_ring_type_agent, }, { .name = "UNC_H_EGRESS_VERT_ADS_USED", .desc = "CMS Vertical ADS Used.", .cntmsk = 0xf, .code = 0x9c, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type_agent), .umasks = knl_unc_cha_ring_type_agent, }, { .name = "UNC_H_EGRESS_VERT_BYPASS", .desc = "CMS Vertical Egress Bypass.", .cntmsk = 0xf, .code = 0x9e, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type_agent), .umasks = knl_unc_cha_ring_type_agent, }, { .name = "UNC_H_EGRESS_HORZ_OCCUPANCY", .desc = "CMS Horizontal Egress Occupancy.", .cntmsk = 0xf, .code = 0x94, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), .umasks = knl_unc_cha_ring_type, }, { .name = "UNC_H_EGRESS_HORZ_INSERTS", .desc = "CMS Horizontal Egress Inserts.", .cntmsk = 0xf, .code = 0x95, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), .umasks = knl_unc_cha_ring_type, }, { .name = "UNC_H_EGRESS_HORZ_CYCLES_FULL", .desc = "Cycles CMS Horizontal Egress Queue is Full.", .cntmsk = 0xf, .code = 0x96, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), .umasks = knl_unc_cha_ring_type, }, { .name = "UNC_H_EGRESS_HORZ_CYCLES_NE", .desc = "Cycles CMS Horizontal Egress Queue is Not Empty.", .cntmsk = 0xf, .code = 0x97, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), .umasks = knl_unc_cha_ring_type, }, { .name = "UNC_H_EGRESS_HORZ_NACK", .desc = "CMS Horizontal Egress NACKs.", .cntmsk = 0xf, .code = 0x99, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), .umasks = knl_unc_cha_ring_type, }, { .name = "UNC_H_EGRESS_HORZ_STARVED", .desc = "CMS Horizontal Egress Injection Starvation.", .cntmsk = 0xf, .code = 0x9b, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), .umasks = knl_unc_cha_ring_type, }, { .name = "UNC_H_EGRESS_HORZ_ADS_USED", .desc = "CMS Horizontal ADS Used.", .cntmsk = 0xf, .code = 0x9d, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), .umasks = knl_unc_cha_ring_type, }, { .name = 
"UNC_H_EGRESS_HORZ_BYPASS", .desc = "CMS Horizontal Egress Bypass.", .cntmsk = 0xf, .code = 0x9f, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), .umasks = knl_unc_cha_ring_type, }, { .name = "UNC_H_RING_BOUNCES_VERT", .desc = "Number of incoming messages from the Vertical ring that were bounced, by ring type.", .cntmsk = 0xf, .code = 0xa0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), .umasks = knl_unc_cha_ring_type, }, { .name = "UNC_H_RING_BOUNCES_HORZ", .desc = "Number of incoming messages from the Horizontal ring that were bounced, by ring type.", .cntmsk = 0xf, .code = 0xa1, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), .umasks = knl_unc_cha_ring_type, }, { .name = "UNC_H_RING_SINK_STARVED_VERT", .desc = "Vertical ring sink starvation count.", .cntmsk = 0xf, .code = 0xa2, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), .umasks = knl_unc_cha_ring_type, }, { .name = "UNC_H_RING_SINK_STARVED_HORZ", .desc = "Horizontal ring sink starvation count.", .cntmsk = 0xf, .code = 0xa3, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_type), .umasks = knl_unc_cha_ring_type, }, { .name = "UNC_H_RING_SRC_THRT", .desc = "Counts cycles in throttle mode.", .cntmsk = 0xf, .code = 0xa4, }, { .name = "UNC_H_FAST_ASSERTED", .desc = "Counts cycles source throttling is asserted", .cntmsk = 0xf, .code = 0xa5, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_dire_ext), .umasks = knl_unc_cha_dire_ext, }, { .name = "UNC_H_VERT_RING_AD_IN_USE", .desc = "Counts the number of cycles that the Vertical AD ring is being used at this ring stop.", .cntmsk = 0xf, .code = 0xa6, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_use_vert), .umasks = knl_unc_cha_ring_use_vert, }, { .name = "UNC_H_HORZ_RING_AD_IN_USE", .desc = "Counts the number of cycles that the Horizontal AD ring is being used at this ring stop.", .cntmsk = 0xf, .code = 0xa7, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_use_hori), 
.umasks = knl_unc_cha_ring_use_hori, }, { .name = "UNC_H_VERT_RING_AK_IN_USE", .desc = "Counts the number of cycles that the Vertical AK ring is being used at this ring stop.", .cntmsk = 0xf, .code = 0xa8, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_use_vert), .umasks = knl_unc_cha_ring_use_vert, }, { .name = "UNC_H_HORZ_RING_AK_IN_USE", .desc = "Counts the number of cycles that the Horizontal AK ring is being used at this ring stop.", .cntmsk = 0xf, .code = 0xa9, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_use_hori), .umasks = knl_unc_cha_ring_use_hori, }, { .name = "UNC_H_VERT_RING_BL_IN_USE", .desc = "Counts the number of cycles that the Vertical BL ring is being used at this ring stop.", .cntmsk = 0xf, .code = 0xaa, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_use_vert), .umasks = knl_unc_cha_ring_use_vert, }, { .name = "UNC_H_HORZ_RING_BL_IN_USE", .desc = "Counts the number of cycles that the Horizontal BL ring is being used at this ring stop.", .cntmsk = 0xf, .code = 0xab, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_use_hori), .umasks = knl_unc_cha_ring_use_hori, }, { .name = "UNC_H_VERT_RING_IV_IN_USE", .desc = "Counts the number of cycles that the Vertical IV ring is being used at this ring stop.", .cntmsk = 0xf, .code = 0xac, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_use_updn), .umasks = knl_unc_cha_ring_use_updn, }, { .name = "UNC_H_HORZ_RING_IV_IN_USE", .desc = "Counts the number of cycles that the Horizontal IV ring is being used at this ring stop.", .cntmsk = 0xf, .code = 0xad, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_ring_use_lfrt), .umasks = knl_unc_cha_ring_use_lfrt, }, { .name = "UNC_H_EGRESS_ORDERING", .desc = "Counts number of cycles IV was blocked in the TGR Egress due to SNP/GO Ordering requirements.", .cntmsk = 0xf, .code = 0xae, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_iv_snp), .umasks = knl_unc_cha_iv_snp, }, { .name = "UNC_H_TG_INGRESS_OCCUPANCY", .desc 
= "Transgress Ingress Occupancy. Occupancy event for the Ingress buffers in the CMS. The Ingress is used to queue up requests received from the mesh.", .cntmsk = 0xf, .code = 0xb0, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_cms_ext), .umasks = knl_unc_cha_cms_ext, }, { .name = "UNC_H_TG_INGRESS_INSERTS", .desc = "Transgress Ingress Allocations. Number of allocations into the CMS Ingress. The Ingress is used to queue up requests received from the mesh.", .cntmsk = 0xf, .code = 0xb1, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_cms_ext), .umasks = knl_unc_cha_cms_ext, }, { .name = "UNC_H_TG_INGRESS_BYPASS", .desc = "Transgress Ingress Bypass. Number of packets bypassing the CMS Ingress.", .cntmsk = 0xf, .code = 0xb2, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_cms_ext), .umasks = knl_unc_cha_cms_ext, }, { .name = "UNC_H_TG_INGRESS_CRD_STARVED", .desc = "Transgress Injection Starvation. Counts cycles under injection starvation mode. This starvation is triggered when the CMS Ingress cannot send a transaction onto the mesh for a long period of time. In this case, the Ingress is unable to forward to the Egress due to a lack of credit.", .cntmsk = 0xf, .code = 0xb3, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_cms_crd_starved), .umasks = knl_unc_cha_cms_crd_starved, }, { .name = "UNC_H_TG_INGRESS_BUSY_STARVED", .desc = "Transgress Injection Starvation. Counts cycles under injection starvation mode. This starvation is triggered when the CMS Ingress cannot send a transaction onto the mesh for a long period of time. In this case, a message from the other queue has higher priority.", .cntmsk = 0xf, .code = 0xb4, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_cha_cms_busy_starved), .umasks = knl_unc_cha_cms_busy_starved, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_knl_unc_edc_events.h000066400000000000000000000055161502707512200252410ustar00rootroot00000000000000/* * Copyright (c) 2016 Intel Corp.
All rights reserved * Contributed by Peinan Zhang * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: knl_unc_edc (Intel Knights Landing EDC_UCLK, EDC_ECLK uncore PMUs) */ static const intel_x86_umask_t knl_unc_edc_uclk_access_count[]={ { .uname = "HIT_CLEAN", .udesc = "Hit E", .ucode = 0x0100, }, { .uname = "HIT_DIRTY", .udesc = "Hit M", .ucode = 0x0200, }, { .uname = "MISS_CLEAN", .udesc = "Miss E", .ucode = 0x0400, }, { .uname = "MISS_DIRTY", .udesc = "Miss M", .ucode = 0x0800, }, { .uname = "MISS_INVALID", .udesc = "Miss I", .ucode = 0x1000, }, { .uname = "MISS_GARBAGE", .udesc = "Miss G", .ucode = 0x2000, }, }; static const intel_x86_entry_t intel_knl_unc_edc_uclk_pe[]={ { .name = "UNC_E_U_CLOCKTICKS", .desc = "EDC UCLK clockticks (generic counters)", .code = 0x00, /*encoding for generic counters */ .cntmsk = 0xf, }, { .name = "UNC_E_EDC_ACCESS", .desc = "Number of EDC Access Hits or Misses.", .code = 0x02, .cntmsk = 0xf, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_edc_uclk_access_count), .umasks = knl_unc_edc_uclk_access_count }, }; static const intel_x86_entry_t intel_knl_unc_edc_eclk_pe[]={ { .name = "UNC_E_E_CLOCKTICKS", .desc = "EDC ECLK clockticks (generic counters)", .code = 0x00, /*encoding for generic counters */ .cntmsk = 0xf, }, { .name = "UNC_E_RPQ_INSERTS", .desc = "Counts total number of EDC RPQ inserts", .code = 0x0101, .cntmsk = 0xf, }, { .name = "UNC_E_WPQ_INSERTS", .desc = "Counts total number of EDC WPQ inserts", .code = 0x0102, .cntmsk = 0xf, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_knl_unc_imc_events.h000066400000000000000000000046111502707512200252510ustar00rootroot00000000000000/* * Copyright (c) 2016 Intel Corp.
All rights reserved * Contributed by Peinan Zhang * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: knl_unc_imc (Intel Knights Landing IMC uncore PMU) */ static const intel_x86_umask_t knl_unc_m_cas_count[]={ { .uname = "ALL", .udesc = "Counts total number of DRAM CAS commands issued on this channel", .ucode = 0x0300, }, { .uname = "RD", .udesc = "Counts all DRAM reads on this channel, incl. 
underfills", .ucode = 0x0100, }, { .uname = "WR", .udesc = "Counts number of DRAM write CAS commands on this channel", .ucode = 0x0200, }, }; static const intel_x86_entry_t intel_knl_unc_imc_pe[]={ { .name = "UNC_M_D_CLOCKTICKS", .desc = "IMC Uncore DCLK counts", .code = 0x00, /*encoding for generic counters */ .cntmsk = 0xf, }, { .name = "UNC_M_CAS_COUNT", .desc = "DRAM RD_CAS and WR_CAS Commands.", .code = 0x03, .cntmsk = 0xf, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_m_cas_count), .umasks = knl_unc_m_cas_count, }, }; static const intel_x86_entry_t intel_knl_unc_imc_uclk_pe[]={ { .name = "UNC_M_U_CLOCKTICKS", .desc = "IMC UCLK counts", .code = 0x00, /*encoding for generic counters */ .cntmsk = 0xf, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_knl_unc_m2pcie_events.h000066400000000000000000000101761502707512200256630ustar00rootroot00000000000000/* * Copyright (c) 2016 Intel Corp. All rights reserved * Contributed by Peinan Zhang * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: knl_unc_m2pcie (Intel Knights Landing M2PCIe uncore) */ static const intel_x86_umask_t knl_unc_m2p_ingress_cycles_ne[]={ { .uname = "CBO_IDI", .udesc = "CBO_IDI", .ucode = 0x0100, }, { .uname = "CBO_NCB", .udesc = "CBO_NCB", .ucode = 0x0200, }, { .uname = "CBO_NCS", .udesc = "CBO_NCS", .ucode = 0x0400, }, { .uname = "ALL", .udesc = "All", .ucode = 0x0800, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t knl_unc_m2p_egress_cycles[]={ { .uname = "AD_0", .udesc = "AD_0", .ucode = 0x0100, }, { .uname = "AK_0", .udesc = "AK_0", .ucode = 0x0200, }, { .uname = "BL_0", .udesc = "BL_0", .ucode = 0x0400, }, { .uname = "AD_1", .udesc = "AD_1", .ucode = 0x0800, }, { .uname = "AK_1", .udesc = "AK_1", .ucode = 0x1000, }, { .uname = "BL_1", .udesc = "BL_1", .ucode = 0x2000, }, }; static const intel_x86_umask_t knl_unc_m2p_egress_inserts[]={ { .uname = "AD_0", .udesc = "AD_0", .ucode = 0x0100, }, { .uname = "AK_0", .udesc = "AK_0", .ucode = 0x0200, }, { .uname = "BL_0", .udesc = "BL_0", .ucode = 0x0400, }, { .uname = "AK_CRD_0", .udesc = "AK_CRD_0", .ucode = 0x0800, }, { .uname = "AD_1", .udesc = "AD_1", .ucode = 0x1000, }, { .uname = "AK_1", .udesc = "AK_1", .ucode = 0x2000, }, { .uname = "BL_1", .udesc = "BL_1", .ucode = 0x4000, }, { .uname = "AK_CRD_1", .udesc = "AK_CRD_1", .ucode = 0x8000, }, }; static const intel_x86_entry_t intel_knl_unc_m2pcie_pe[]={ { .name = "UNC_M2P_INGRESS_CYCLES_NE", .desc = "Ingress Queue Cycles Not Empty. 
Counts the number of cycles when the M2PCIe Ingress is not empty", .code = 0x10, .cntmsk = 0xf, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_m2p_ingress_cycles_ne), .umasks = knl_unc_m2p_ingress_cycles_ne }, { .name = "UNC_M2P_EGRESS_CYCLES_NE", .desc = "Egress (to CMS) Cycles Not Empty. Counts the number of cycles when the M2PCIe Egress is not empty", .code = 0x23, .cntmsk = 0x3, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_m2p_egress_cycles), .umasks = knl_unc_m2p_egress_cycles }, { .name = "UNC_M2P_EGRESS_INSERTS", .desc = "Egress (to CMS) Inserts. Counts the number of messages inserted into the M2PCIe Egress queue", .code = 0x24, .cntmsk = 0xf, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_m2p_egress_inserts), .umasks = knl_unc_m2p_egress_inserts }, { .name = "UNC_M2P_EGRESS_CYCLES_FULL", .desc = "Egress (to CMS) Cycles Full. Counts the number of cycles when the M2PCIe Egress is full", .code = 0x25, .cntmsk = 0xf, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(knl_unc_m2p_egress_cycles), .umasks = knl_unc_m2p_egress_cycles }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_netburst_events.h000066400000000000000000001132171502707512200246410ustar00rootroot00000000000000/* * Copyright (c) 2006 IBM Corp. * Contributed by Kevin Corry * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software.
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS * IN THE SOFTWARE. * * This header contains arrays to describe the Event-Selection-Control * Registers (ESCRs), Counter-Configuration-Control Registers (CCCRs), * and countable events on Pentium4/Xeon/EM64T systems. * * For more details, see: * - IA-32 Intel Architecture Software Developer's Manual, * Volume 3B: System Programming Guide, Part 2 * (available at: http://www.intel.com/design/Pentium4/manuals/253669.htm) * - Chapter 18.10: Performance Monitoring Overview * - Chapter 18.13: Performance Monitoring - Pentium4 and Xeon Processors * - Chapter 18.14: Performance Monitoring and Hyper-Threading Technology * - Appendix A.1: Pentium4 and Xeon Processor Performance-Monitoring Events * * This header also contains an array to describe how the Perfmon PMCs map to * the ESCRs and CCCRs. */ #ifndef _NETBURST_EVENTS_H_ #define _NETBURST_EVENTS_H_ /** * netburst_events * * Array of events that can be counted on Pentium4. 
**/ static const netburst_entry_t netburst_events[] = { /* 0 */ {.name = "TC_deliver_mode", .desc = "The duration (in clock cycles) of the operating modes of " "the trace cache and decode engine in the processor package", .event_select = 0x1, .escr_select = 0x1, .allowed_escrs = { 9, 32 }, .perf_code = P4_EVENT_TC_DELIVER_MODE, .event_masks = { {.name = "DD", .desc = "Both logical CPUs in deliver mode", .bit = 0, }, {.name = "DB", .desc = "Logical CPU 0 in deliver mode and " "logical CPU 1 in build mode", .bit = 1, }, {.name = "DI", .desc = "Logical CPU 0 in deliver mode and logical CPU 1 " "either halted, under machine clear condition, or " "transitioning to a long microcode flow", .bit = 2, }, {.name = "BD", .desc = "Logical CPU 0 in build mode and " "logical CPU 1 is in deliver mode", .bit = 3, }, {.name = "BB", .desc = "Both logical CPUs in build mode", .bit = 4, }, {.name = "BI", .desc = "Logical CPU 0 in build mode and logical CPU 1 " "either halted, under machine clear condition, or " "transitioning to a long microcode flow", .bit = 5, }, {.name = "ID", .desc = "Logical CPU 0 either halted, under machine clear " "condition, or transitioning to a long microcode " "flow, and logical CPU 1 in deliver mode", .bit = 6, }, {.name = "IB", .desc = "Logical CPU 0 either halted, under machine clear " "condition, or transitioning to a long microcode " "flow, and logical CPU 1 in build mode", .bit = 7, }, }, }, /* 1 */ {.name = "BPU_fetch_request", .desc = "Instruction fetch requests by the Branch Prediction Unit", .event_select = 0x3, .escr_select = 0x0, .allowed_escrs = { 0, 23 }, .perf_code = P4_EVENT_BPU_FETCH_REQUEST, .event_masks = { {.name = "TCMISS", .desc = "Trace cache lookup miss", .bit = 0, .flags = NETBURST_FL_DFL, }, }, }, /* 2 */ {.name = "ITLB_reference", .desc = "Translations using the Instruction " "Translation Look-Aside Buffer", .event_select = 0x18, .escr_select = 0x3, .allowed_escrs = { 3, 26 }, .perf_code = P4_EVENT_ITLB_REFERENCE, .event_masks = 
{ {.name = "HIT", .desc = "ITLB hit", .bit = 0, }, {.name = "MISS", .desc = "ITLB miss", .bit = 1, }, {.name = "HIT_UC", .desc = "Uncacheable ITLB hit", .bit = 2, }, }, }, /* 3 */ {.name = "memory_cancel", .desc = "Canceling of various types of requests in the " "Data cache Address Control unit (DAC)", .event_select = 0x2, .escr_select = 0x5, .allowed_escrs = { 15, 38 }, .perf_code = P4_EVENT_MEMORY_CANCEL, .event_masks = { {.name = "ST_RB_FULL", .desc = "Replayed because no store request " "buffer is available", .bit = 2, }, {.name = "64K_CONF", .desc = "Conflicts due to 64K aliasing", .bit = 3, }, }, }, /* 4 */ {.name = "memory_complete", .desc = "Completions of a load split, store split, " "uncacheable (UC) split, or UC load", .event_select = 0x8, .escr_select = 0x2, .allowed_escrs = { 13, 36 }, .perf_code = P4_EVENT_MEMORY_COMPLETE, .event_masks = { {.name = "LSC", .desc = "Load split completed, excluding UC/WC loads", .bit = 0, }, {.name = "SSC", .desc = "Any split stores completed", .bit = 1, }, }, }, /* 5 */ {.name = "load_port_replay", .desc = "Replayed events at the load port", .event_select = 0x4, .escr_select = 0x2, .allowed_escrs = { 13, 36 }, .perf_code = P4_EVENT_LOAD_PORT_REPLAY, .event_masks = { {.name = "SPLIT_LD", .desc = "Split load", .bit = 1, .flags = NETBURST_FL_DFL, }, }, }, /* 6 */ {.name = "store_port_replay", .desc = "Replayed events at the store port", .event_select = 0x5, .escr_select = 0x2, .allowed_escrs = { 13, 36 }, .perf_code = P4_EVENT_STORE_PORT_REPLAY, .event_masks = { {.name = "SPLIT_ST", .desc = "Split store", .bit = 1, .flags = NETBURST_FL_DFL, }, }, }, /* 7 */ {.name = "MOB_load_replay", .desc = "Count of times the memory order buffer (MOB) " "caused a load operation to be replayed", .event_select = 0x3, .escr_select = 0x2, .allowed_escrs = { 2, 25 }, .perf_code = P4_EVENT_MOB_LOAD_REPLAY, .event_masks = { {.name = "NO_STA", .desc = "Replayed because of unknown store address", .bit = 1, }, {.name = "NO_STD", .desc = "Replayed 
because of unknown store data", .bit = 3, }, {.name = "PARTIAL_DATA", .desc = "Replayed because of partially overlapped data " "access between the load and store operations", .bit = 4, }, {.name = "UNALGN_ADDR", .desc = "Replayed because the lower 4 bits of the " "linear address do not match between the " "load and store operations", .bit = 5, }, }, }, /* 8 */ {.name = "page_walk_type", .desc = "Page walks that the page miss handler (PMH) performs", .event_select = 0x1, .escr_select = 0x4, .allowed_escrs = { 4, 27 }, .perf_code = P4_EVENT_PAGE_WALK_TYPE, .event_masks = { {.name = "DTMISS", .desc = "Page walk for a data TLB miss (load or store)", .bit = 0, }, {.name = "ITMISS", .desc = "Page walk for an instruction TLB miss", .bit = 1, }, }, }, /* 9 */ {.name = "BSQ_cache_reference", .desc = "Cache references (2nd or 3rd level caches) as seen by the " "bus unit. Read types include both load and RFO, and write " "types include writebacks and evictions", .event_select = 0xC, .escr_select = 0x7, .allowed_escrs = { 7, 30 }, .perf_code = P4_EVENT_BSQ_CACHE_REFERENCE, .event_masks = { {.name = "RD_2ndL_HITS", .desc = "Read 2nd level cache hit Shared", .bit = 0, }, {.name = "RD_2ndL_HITE", .desc = "Read 2nd level cache hit Exclusive", .bit = 1, }, {.name = "RD_2ndL_HITM", .desc = "Read 2nd level cache hit Modified", .bit = 2, }, {.name = "RD_3rdL_HITS", .desc = "Read 3rd level cache hit Shared", .bit = 3, }, {.name = "RD_3rdL_HITE", .desc = "Read 3rd level cache hit Exclusive", .bit = 4, }, {.name = "RD_3rdL_HITM", .desc = "Read 3rd level cache hit Modified", .bit = 5, }, {.name = "RD_2ndL_MISS", .desc = "Read 2nd level cache miss", .bit = 8, }, {.name = "RD_3rdL_MISS", .desc = "Read 3rd level cache miss", .bit = 9, }, {.name = "WR_2ndL_MISS", .desc = "A writeback lookup from DAC misses the 2nd " "level cache (unlikely to happen)", .bit = 10, }, }, }, /* 10 */ {.name = "IOQ_allocation", .desc = "Count of various types of transactions on the bus. 
A count " "is generated each time a transaction is allocated into the " "IOQ that matches the specified mask bits. An allocated entry " "can be a sector (64 bytes) or a chunk of 8 bytes. Requests " "are counted once per retry. All 'TYPE_BIT*' event-masks " "together are treated as a single 5-bit value", .event_select = 0x3, .escr_select = 0x6, .allowed_escrs = { 6, 29 }, .perf_code = P4_EVENT_IOQ_ALLOCATION, .event_masks = { {.name = "TYPE_BIT0", .desc = "Bus request type (bit 0)", .bit = 0, }, {.name = "TYPE_BIT1", .desc = "Bus request type (bit 1)", .bit = 1, }, {.name = "TYPE_BIT2", .desc = "Bus request type (bit 2)", .bit = 2, }, {.name = "TYPE_BIT3", .desc = "Bus request type (bit 3)", .bit = 3, }, {.name = "TYPE_BIT4", .desc = "Bus request type (bit 4)", .bit = 4, }, {.name = "ALL_READ", .desc = "Count read entries", .bit = 5, }, {.name = "ALL_WRITE", .desc = "Count write entries", .bit = 6, }, {.name = "MEM_UC", .desc = "Count UC memory access entries", .bit = 7, }, {.name = "MEM_WC", .desc = "Count WC memory access entries", .bit = 8, }, {.name = "MEM_WT", .desc = "Count write-through (WT) memory access entries", .bit = 9, }, {.name = "MEM_WP", .desc = "Count write-protected (WP) memory access entries", .bit = 10, }, {.name = "MEM_WB", .desc = "Count WB memory access entries", .bit = 11, }, {.name = "OWN", .desc = "Count all store requests driven by processor, as " "opposed to other processor or DMA", .bit = 13, }, {.name = "OTHER", .desc = "Count all requests driven by other " "processors or DMA", .bit = 14, }, {.name = "PREFETCH", .desc = "Include HW and SW prefetch requests in the count", .bit = 15, }, }, }, /* 11 */ {.name = "IOQ_active_entries", .desc = "Number of entries (clipped at 15) in the IOQ that are " "active. An allocated entry can be a sector (64 bytes) " "or a chunk of 8 bytes. This event must be programmed in " "conjunction with IOQ_allocation. 
All 'TYPE_BIT*' event-masks " "together are treated as a single 5-bit value", .event_select = 0x1A, .escr_select = 0x6, .allowed_escrs = { 29, -1 }, .perf_code = P4_EVENT_IOQ_ACTIVE_ENTRIES, .event_masks = { {.name = "TYPE_BIT0", .desc = "Bus request type (bit 0)", .bit = 0, }, {.name = "TYPE_BIT1", .desc = "Bus request type (bit 1)", .bit = 1, }, {.name = "TYPE_BIT2", .desc = "Bus request type (bit 2)", .bit = 2, }, {.name = "TYPE_BIT3", .desc = "Bus request type (bit 3)", .bit = 3, }, {.name = "TYPE_BIT4", .desc = "Bus request type (bit 4)", .bit = 4, }, {.name = "ALL_READ", .desc = "Count read entries", .bit = 5, }, {.name = "ALL_WRITE", .desc = "Count write entries", .bit = 6, }, {.name = "MEM_UC", .desc = "Count UC memory access entries", .bit = 7, }, {.name = "MEM_WC", .desc = "Count WC memory access entries", .bit = 8, }, {.name = "MEM_WT", .desc = "Count write-through (WT) memory access entries", .bit = 9, }, {.name = "MEM_WP", .desc = "Count write-protected (WP) memory access entries", .bit = 10, }, {.name = "MEM_WB", .desc = "Count WB memory access entries", .bit = 11, }, {.name = "OWN", .desc = "Count all store requests driven by processor, as " "opposed to other processor or DMA", .bit = 13, }, {.name = "OTHER", .desc = "Count all requests driven by other " "processors or DMA", .bit = 14, }, {.name = "PREFETCH", .desc = "Include HW and SW prefetch requests in the count", .bit = 15, }, }, }, /* 12 */ {.name = "FSB_data_activity", .desc = "Count of DRDY or DBSY events that " "occur on the front side bus", .event_select = 0x17, .escr_select = 0x6, .allowed_escrs = { 6, 29 }, .perf_code = P4_EVENT_FSB_DATA_ACTIVITY, .event_masks = { {.name = "DRDY_DRV", .desc = "Count when this processor drives data onto the bus. " "Includes writes and implicit writebacks", .bit = 0, }, {.name = "DRDY_OWN", .desc = "Count when this processor reads data from the bus. " "Includes loads and some PIC transactions. Count " "DRDY events that we drive. 
Count DRDY events sampled " "that we own", .bit = 1, }, {.name = "DRDY_OTHER", .desc = "Count when data is on the bus but not being sampled " "by the processor. It may or may not be driven by " "this processor", .bit = 2, }, {.name = "DBSY_DRV", .desc = "Count when this processor reserves the bus for use " "in the next bus cycle in order to drive data", .bit = 3, }, {.name = "DBSY_OWN", .desc = "Count when some agent reserves the bus for use in " "the next bus cycle to drive data that this processor " "will sample", .bit = 4, }, {.name = "DBSY_OTHER", .desc = "Count when some agent reserves the bus for use in " "the next bus cycle to drive data that this processor " "will NOT sample. It may or may not be being driven " "by this processor", .bit = 5, }, }, }, /* 13 */ {.name = "BSQ_allocation", .desc = "Allocations in the Bus Sequence Unit (BSQ). The event mask " "bits consist of four sub-groups: request type, request " "length, memory type, and a sub-group consisting mostly of " "independent bits (5 through 10). 
Must specify a mask for " "each sub-group", .event_select = 0x5, .escr_select = 0x7, .allowed_escrs = { 7, -1 }, .perf_code = P4_EVENT_BSQ_ALLOCATION, .event_masks = { {.name = "REQ_TYPE0", .desc = "Along with REQ_TYPE1, request type encodings are: " "0 - Read (excludes read invalidate), 1 - Read " "invalidate, 2 - Write (other than writebacks), 3 - " "Writeback (evicted from cache)", .bit = 0, }, {.name = "REQ_TYPE1", .desc = "Along with REQ_TYPE0, request type encodings are: " "0 - Read (excludes read invalidate), 1 - Read " "invalidate, 2 - Write (other than writebacks), 3 - " "Writeback (evicted from cache)", .bit = 1, }, {.name = "REQ_LEN0", .desc = "Along with REQ_LEN1, request length encodings are: " "0 - zero chunks, 1 - one chunk, 3 - eight chunks", .bit = 2, }, {.name = "REQ_LEN1", .desc = "Along with REQ_LEN0, request length encodings are: " "0 - zero chunks, 1 - one chunk, 3 - eight chunks", .bit = 3, }, {.name = "REQ_IO_TYPE", .desc = "Request type is input or output", .bit = 5, }, {.name = "REQ_LOCK_TYPE", .desc = "Request type is bus lock", .bit = 6, }, {.name = "REQ_CACHE_TYPE", .desc = "Request type is cacheable", .bit = 7, }, {.name = "REQ_SPLIT_TYPE", .desc = "Request type is a bus 8-byte chunk split across " "an 8-byte boundary", .bit = 8, }, {.name = "REQ_DEM_TYPE", .desc = "0: Request type is HW.SW prefetch. 
" "1: Request type is a demand", .bit = 9, }, {.name = "REQ_ORD_TYPE", .desc = "Request is an ordered type", .bit = 10, }, {.name = "MEM_TYPE0", .desc = "Along with MEM_TYPE1 and MEM_TYPE2, " "memory type encodings are: 0 - UC, " "1 - USWC, 4- WT, 5 - WP, 6 - WB", .bit = 11, }, {.name = "MEM_TYPE1", .desc = "Along with MEM_TYPE0 and MEM_TYPE2, " "memory type encodings are: 0 - UC, " "1 - USWC, 4- WT, 5 - WP, 6 - WB", .bit = 12, }, {.name = "MEM_TYPE2", .desc = "Along with MEM_TYPE0 and MEM_TYPE1, " "memory type encodings are: 0 - UC, " "1 - USWC, 4- WT, 5 - WP, 6 - WB", .bit = 13, }, }, }, /* 14 */ {.name = "BSQ_active_entries", .desc = "Number of BSQ entries (clipped at 15) currently active " "(valid) which meet the subevent mask criteria during " "allocation in the BSQ. Active request entries are allocated " "on the BSQ until de-allocated. De-allocation of an entry " "does not necessarily imply the request is filled. This " "event must be programmed in conjunction with BSQ_allocation", .event_select = 0x6, .escr_select = 0x7, .allowed_escrs = { 30, -1 }, .perf_code = P4_EVENT_BSQ_ACTIVE_ENTRIES, .event_masks = { {.name = "REQ_TYPE0", .desc = "Along with REQ_TYPE1, request type encodings are: " "0 - Read (excludes read invalidate), 1 - Read " "invalidate, 2 - Write (other than writebacks), 3 - " "Writeback (evicted from cache)", .bit = 0, }, {.name = "REQ_TYPE1", .desc = "Along with REQ_TYPE0, request type encodings are: " "0 - Read (excludes read invalidate), 1 - Read " "invalidate, 2 - Write (other than writebacks), 3 - " "Writeback (evicted from cache)", .bit = 1, }, {.name = "REQ_LEN0", .desc = "Along with REQ_LEN1, request length encodings are: " "0 - zero chunks, 1 - one chunk, 3 - eight chunks", .bit = 2, }, {.name = "REQ_LEN1", .desc = "Along with REQ_LEN0, request length encodings are: " "0 - zero chunks, 1 - one chunk, 3 - eight chunks", .bit = 3, }, {.name = "REQ_IO_TYPE", .desc = "Request type is input or output", .bit = 5, }, {.name = "REQ_LOCK_TYPE", 
.desc = "Request type is bus lock", .bit = 6, }, {.name = "REQ_CACHE_TYPE", .desc = "Request type is cacheable", .bit = 7, }, {.name = "REQ_SPLIT_TYPE", .desc = "Request type is a bus 8-byte chunk split across " "an 8-byte boundary", .bit = 8, }, {.name = "REQ_DEM_TYPE", .desc = "0: Request type is HW.SW prefetch. " "1: Request type is a demand", .bit = 9, }, {.name = "REQ_ORD_TYPE", .desc = "Request is an ordered type", .bit = 10, }, {.name = "MEM_TYPE0", .desc = "Along with MEM_TYPE1 and MEM_TYPE2, " "memory type encodings are: 0 - UC, " "1 - USWC, 4- WT, 5 - WP, 6 - WB", .bit = 11, }, {.name = "MEM_TYPE1", .desc = "Along with MEM_TYPE0 and MEM_TYPE2, " "memory type encodings are: 0 - UC, " "1 - USWC, 4- WT, 5 - WP, 6 - WB", .bit = 12, }, {.name = "MEM_TYPE2", .desc = "Along with MEM_TYPE0 and MEM_TYPE1, " "memory type encodings are: 0 - UC, " "1 - USWC, 4- WT, 5 - WP, 6 - WB", .bit = 13, }, }, }, /* 15 */ {.name = "SSE_input_assist", .desc = "Number of times an assist is requested to handle problems " "with input operands for SSE/SSE2/SSE3 operations; most " "notably denormal source operands when the DAZ bit isn't set", .event_select = 0x34, .escr_select = 0x1, .allowed_escrs = { 12, 35 }, .perf_code = P4_EVENT_SSE_INPUT_ASSIST, .event_masks = { {.name = "ALL", .desc = "Count assists for SSE/SSE2/SSE3 uops", .bit = 15, .flags = NETBURST_FL_DFL, }, }, }, /* 16 */ {.name = "packed_SP_uop", .desc = "Number of packed single-precision uops", .event_select = 0x8, .escr_select = 0x1, .perf_code = P4_EVENT_PACKED_SP_UOP, .allowed_escrs = { 12, 35 }, .event_masks = { {.name = "ALL", .desc = "Count all uops operating on packed " "single-precisions operands", .bit = 15, .flags = NETBURST_FL_DFL, }, {.name = "TAG0", .desc = "Tag this event with tag bit 0 " "for retirement counting with execution_event", .bit = 16, }, {.name = "TAG1", .desc = "Tag this event with tag bit 1 " "for retirement counting with execution_event", .bit = 17, }, {.name = "TAG2", .desc = "Tag this 
event with tag bit 2 " "for retirement counting with execution_event", .bit = 18, }, {.name = "TAG3", .desc = "Tag this event with tag bit 3 " "for retirement counting with execution_event", .bit = 19, }, }, }, /* 17 */ {.name = "packed_DP_uop", .desc = "Number of packed double-precision uops", .event_select = 0xC, .escr_select = 0x1, .allowed_escrs = { 12, 35 }, .perf_code = P4_EVENT_PACKED_DP_UOP, .event_masks = { {.name = "ALL", .desc = "Count all uops operating on packed " "double-precisions operands", .bit = 15, .flags = NETBURST_FL_DFL, }, {.name = "TAG0", .desc = "Tag this event with tag bit 0 " "for retirement counting with execution_event", .bit = 16, }, {.name = "TAG1", .desc = "Tag this event with tag bit 1 " "for retirement counting with execution_event", .bit = 17, }, {.name = "TAG2", .desc = "Tag this event with tag bit 2 " "for retirement counting with execution_event", .bit = 18, }, {.name = "TAG3", .desc = "Tag this event with tag bit 3 " "for retirement counting with execution_event", .bit = 19, }, }, }, /* 18 */ {.name = "scalar_SP_uop", .desc = "Number of scalar single-precision uops", .event_select = 0xA, .escr_select = 0x1, .allowed_escrs = { 12, 35 }, .perf_code = P4_EVENT_SCALAR_SP_UOP, .event_masks = { {.name = "ALL", .desc = "Count all uops operating on scalar " "single-precisions operands", .bit = 15, .flags = NETBURST_FL_DFL, }, {.name = "TAG0", .desc = "Tag this event with tag bit 0 " "for retirement counting with execution_event", .bit = 16, }, {.name = "TAG1", .desc = "Tag this event with tag bit 1 " "for retirement counting with execution_event", .bit = 17, }, {.name = "TAG2", .desc = "Tag this event with tag bit 2 " "for retirement counting with execution_event", .bit = 18, }, {.name = "TAG3", .desc = "Tag this event with tag bit 3 " "for retirement counting with execution_event", .bit = 19, }, }, }, /* 19 */ {.name = "scalar_DP_uop", .desc = "Number of scalar double-precision uops", .event_select = 0xE, .escr_select = 0x1, 
.allowed_escrs = { 12, 35 }, .perf_code = P4_EVENT_SCALAR_DP_UOP, .event_masks = { {.name = "ALL", .desc = "Count all uops operating on scalar " "double-precisions operands", .bit = 15, .flags = NETBURST_FL_DFL, }, {.name = "TAG0", .desc = "Tag this event with tag bit 0 " "for retirement counting with execution_event", .bit = 16, }, {.name = "TAG1", .desc = "Tag this event with tag bit 1 " "for retirement counting with execution_event", .bit = 17, }, {.name = "TAG2", .desc = "Tag this event with tag bit 2 " "for retirement counting with execution_event", .bit = 18, }, {.name = "TAG3", .desc = "Tag this event with tag bit 3 " "for retirement counting with execution_event", .bit = 19, }, }, }, /* 20 */ {.name = "64bit_MMX_uop", .desc = "Number of MMX instructions which " "operate on 64-bit SIMD operands", .event_select = 0x2, .escr_select = 0x1, .allowed_escrs = { 12, 35 }, .perf_code = P4_EVENT_64BIT_MMX_UOP, .event_masks = { {.name = "ALL", .desc = "Count all uops operating on 64-bit SIMD integer " "operands in memory or MMX registers", .bit = 15, .flags = NETBURST_FL_DFL, }, {.name = "TAG0", .desc = "Tag this event with tag bit 0 " "for retirement counting with execution_event", .bit = 16, }, {.name = "TAG1", .desc = "Tag this event with tag bit 1 " "for retirement counting with execution_event", .bit = 17, }, {.name = "TAG2", .desc = "Tag this event with tag bit 2 " "for retirement counting with execution_event", .bit = 18, }, {.name = "TAG3", .desc = "Tag this event with tag bit 3 " "for retirement counting with execution_event", .bit = 19, }, }, }, /* 21 */ {.name = "128bit_MMX_uop", .desc = "Number of MMX instructions which " "operate on 128-bit SIMD operands", .event_select = 0x1A, .escr_select = 0x1, .allowed_escrs = { 12, 35 }, .perf_code = P4_EVENT_128BIT_MMX_UOP, .event_masks = { {.name = "ALL", .desc = "Count all uops operating on 128-bit SIMD integer " "operands in memory or MMX registers", .bit = 15, .flags = NETBURST_FL_DFL, }, {.name = "TAG0", .desc 
= "Tag this event with tag bit 0 " "for retirement counting with execution_event", .bit = 16, }, {.name = "TAG1", .desc = "Tag this event with tag bit 1 " "for retirement counting with execution_event", .bit = 17, }, {.name = "TAG2", .desc = "Tag this event with tag bit 2 " "for retirement counting with execution_event", .bit = 18, }, {.name = "TAG3", .desc = "Tag this event with tag bit 3 " "for retirement counting with execution_event", .bit = 19, }, }, }, /* 22 */ {.name = "x87_FP_uop", .desc = "Number of x87 floating-point uops", .event_select = 0x4, .escr_select = 0x1, .allowed_escrs = { 12, 35 }, .perf_code = P4_EVENT_X87_FP_UOP, .event_masks = { {.name = "ALL", .desc = "Count all x87 FP uops", .bit = 15, .flags = NETBURST_FL_DFL, }, {.name = "TAG0", .desc = "Tag this event with tag bit 0 " "for retirement counting with execution_event", .bit = 16, }, {.name = "TAG1", .desc = "Tag this event with tag bit 1 " "for retirement counting with execution_event", .bit = 17, }, {.name = "TAG2", .desc = "Tag this event with tag bit 2 " "for retirement counting with execution_event", .bit = 18, }, {.name = "TAG3", .desc = "Tag this event with tag bit 3 " "for retirement counting with execution_event", .bit = 19, }, }, }, /* 23 */ {.name = "TC_misc", .desc = "Miscellaneous events detected by the TC. 
The counter will " "count twice for each occurrence", .event_select = 0x6, .escr_select = 0x1, .allowed_escrs = { 9, 32 }, .perf_code = P4_EVENT_TC_MISC, .event_masks = { {.name = "FLUSH", .desc = "Number of flushes", .bit = 4, .flags = NETBURST_FL_DFL, }, }, }, /* 24 */ {.name = "global_power_events", .desc = "Counts the time during which a processor is not stopped", .event_select = 0x13, .escr_select = 0x6, .allowed_escrs = { 6, 29 }, .perf_code = P4_EVENT_GLOBAL_POWER_EVENTS, .event_masks = { {.name = "RUNNING", .desc = "The processor is active (includes the " "handling of HLT, STPCLK, and throttling)", .bit = 0, .flags = NETBURST_FL_DFL, }, }, }, /* 25 */ {.name = "tc_ms_xfer", .desc = "Number of times that uop delivery changed from TC to MS ROM", .event_select = 0x5, .escr_select = 0x0, .allowed_escrs = { 8, 31 }, .perf_code = P4_EVENT_TC_MS_XFER, .event_masks = { {.name = "CISC", .desc = "A TC to MS transfer occurred", .bit = 0, .flags = NETBURST_FL_DFL, }, }, }, /* 26 */ {.name = "uop_queue_writes", .desc = "Number of valid uops written to the uop queue", .event_select = 0x9, .escr_select = 0x0, .allowed_escrs = { 8, 31 }, .perf_code = P4_EVENT_UOP_QUEUE_WRITES, .event_masks = { {.name = "FROM_TC_BUILD", .desc = "The uops being written are from TC build mode", .bit = 0, }, {.name = "FROM_TC_DELIVER", .desc = "The uops being written are from TC deliver mode", .bit = 1, }, {.name = "FROM_ROM", .desc = "The uops being written are from microcode ROM", .bit = 2, }, }, }, /* 27 */ {.name = "retired_mispred_branch_type", .desc = "Number of retiring mispredicted branches by type", .event_select = 0x5, .escr_select = 0x2, .allowed_escrs = { 10, 33 }, .perf_code = P4_EVENT_RETIRED_MISPRED_BRANCH_TYPE, .event_masks = { {.name = "CONDITIONAL", .desc = "Conditional jumps", .bit = 1, }, {.name = "CALL", .desc = "Indirect call branches", .bit = 2, }, {.name = "RETURN", .desc = "Return branches", .bit = 3, }, {.name = "INDIRECT", .desc = "Returns, indirect calls, or indirect
jumps", .bit = 4, }, }, }, /* 28 */ {.name = "retired_branch_type", .desc = "Number of retiring branches by type", .event_select = 0x4, .escr_select = 0x2, .allowed_escrs = { 10, 33 }, .perf_code = P4_EVENT_RETIRED_BRANCH_TYPE, .event_masks = { {.name = "CONDITIONAL", .desc = "Conditional jumps", .bit = 1, }, {.name = "CALL", .desc = "Indirect call branches", .bit = 2, }, {.name = "RETURN", .desc = "Return branches", .bit = 3, }, {.name = "INDIRECT", .desc = "Returns, indirect calls, or indirect jumps", .bit = 4, }, }, }, /* 29 */ {.name = "resource_stall", .desc = "Occurrences of latency or stalls in the Allocator", .event_select = 0x1, .escr_select = 0x1, .allowed_escrs = { 17, 40 }, .perf_code = P4_EVENT_RESOURCE_STALL, .event_masks = { {.name = "SBFULL", .desc = "A stall due to lack of store buffers", .bit = 5, .flags = NETBURST_FL_DFL, }, }, }, /* 30 */ {.name = "WC_Buffer", .desc = "Number of Write Combining Buffer operations", .event_select = 0x5, .escr_select = 0x5, .allowed_escrs = { 15, 38 }, .perf_code = P4_EVENT_WC_BUFFER, .event_masks = { {.name = "WCB_EVICTS", .desc = "WC Buffer evictions of all causes", .bit = 0, }, {.name = "WCB_FULL_EVICT", .desc = "WC Buffer eviction; no WC buffer is available", .bit = 1, }, }, }, /* 31 */ {.name = "b2b_cycles", .desc = "Number of back-to-back bus cycles", .event_select = 0x16, .escr_select = 0x3, .allowed_escrs = { 6, 29 }, .perf_code = P4_EVENT_B2B_CYCLES, .event_masks = { {.name = "BIT1", .desc = "bit 1", .bit = 1, }, {.name = "BIT2", .desc = "bit 2", .bit = 2, }, {.name = "BIT3", .desc = "bit 3", .bit = 3, }, {.name = "BIT4", .desc = "bit 4", .bit = 4, }, {.name = "BIT5", .desc = "bit 5", .bit = 4, }, {.name = "BIT6", .desc = "bit 6", .bit = 4, }, }, }, /* 32 */ {.name = "bnr", .desc = "Number of bus-not-ready conditions", .event_select = 0x8, .escr_select = 0x3, .allowed_escrs = { 6, 29 }, .perf_code = P4_EVENT_BNR, .event_masks = { {.name = "BIT0", .desc = "bit 0", .bit = 0, }, {.name = "BIT1", .desc = "bit 
1", .bit = 1, }, {.name = "BIT2", .desc = "bit 2", .bit = 2, }, }, }, /* 33 */ {.name = "snoop", .desc = "Number of snoop hit modified bus traffic", .event_select = 0x6, .escr_select = 0x3, .allowed_escrs = { 6, 29 }, .perf_code = P4_EVENT_SNOOP, .event_masks = { {.name = "BIT2", .desc = "bit 2", .bit = 2, }, {.name = "BIT6", .desc = "bit 6", .bit = 6, }, {.name = "BIT7", .desc = "bit 7", .bit = 7, }, }, }, /* 34 */ {.name = "response", .desc = "Count of different types of responses", .event_select = 0x4, .escr_select = 0x3, .allowed_escrs = { 6, 29 }, .perf_code = P4_EVENT_RESPONSE, .event_masks = { {.name = "BIT1", .desc = "bit 1", .bit = 1, }, {.name = "BIT2", .desc = "bit 2", .bit = 2, }, {.name = "BIT8", .desc = "bit 8", .bit = 8, }, {.name = "BIT9", .desc = "bit 9", .bit = 9, }, }, }, /* 35 */ {.name = "front_end_event", .desc = "Number of retirements of tagged uops which are specified " "through the front-end tagging mechanism", .event_select = 0x8, .escr_select = 0x5, .allowed_escrs = { 21, 43 }, .perf_code = P4_EVENT_FRONT_END_EVENT, .event_masks = { {.name = "NBOGUS", .desc = "The marked uops are not bogus", .bit = 0, }, {.name = "BOGUS", .desc = "The marked uops are bogus", .bit = 1, }, }, }, /* 36 */ {.name = "execution_event", .desc = "Number of retirements of tagged uops which are specified " "through the execution tagging mechanism. 
The event-mask " "allows from one to four types of uops to be tagged", .event_select = 0xC, .escr_select = 0x5, .allowed_escrs = { 21, 43 }, .perf_code = P4_EVENT_EXECUTION_EVENT, .event_masks = { {.name = "NBOGUS0", .desc = "The marked uops are not bogus", .bit = 0, }, {.name = "NBOGUS1", .desc = "The marked uops are not bogus", .bit = 1, }, {.name = "NBOGUS2", .desc = "The marked uops are not bogus", .bit = 2, }, {.name = "NBOGUS3", .desc = "The marked uops are not bogus", .bit = 3, }, {.name = "BOGUS0", .desc = "The marked uops are bogus", .bit = 4, }, {.name = "BOGUS1", .desc = "The marked uops are bogus", .bit = 5, }, {.name = "BOGUS2", .desc = "The marked uops are bogus", .bit = 6, }, {.name = "BOGUS3", .desc = "The marked uops are bogus", .bit = 7, }, }, }, /* 37 */ {.name = "replay_event", .desc = "Number of retirements of tagged uops which are specified " "through the replay tagging mechanism", .event_select = 0x9, .escr_select = 0x5, .allowed_escrs = { 21, 43 }, .perf_code = P4_EVENT_REPLAY_EVENT, .event_masks = { {.name = "NBOGUS", .desc = "The marked uops are not bogus", .bit = 0, }, {.name = "BOGUS", .desc = "The marked uops are bogus", .bit = 1, }, {.name = "L1_LD_MISS", .desc = "Virtual mask for L1 cache load miss replays", .bit = 2, }, {.name = "L2_LD_MISS", .desc = "Virtual mask for L2 cache load miss replays", .bit = 3, }, {.name = "DTLB_LD_MISS", .desc = "Virtual mask for DTLB load miss replays", .bit = 4, }, {.name = "DTLB_ST_MISS", .desc = "Virtual mask for DTLB store miss replays", .bit = 5, }, {.name = "DTLB_ALL_MISS", .desc = "Virtual mask for all DTLB miss replays", .bit = 6, }, {.name = "BR_MSP", .desc = "Virtual mask for tagged mispredicted branch replays", .bit = 7, }, {.name = "MOB_LD_REPLAY", .desc = "Virtual mask for MOB load replays", .bit = 8, }, {.name = "SP_LD_RET", .desc = "Virtual mask for split load replays. Use with load_port_replay event", .bit = 9, }, {.name = "SP_ST_RET", .desc = "Virtual mask for split store replays. 
Use with store_port_replay event", .bit = 10, }, }, }, /* 38 */ {.name = "instr_retired", .desc = "Number of instructions retired during a clock cycle", .event_select = 0x2, .escr_select = 0x4, .allowed_escrs = { 20, 42 }, .perf_code = P4_EVENT_INSTR_RETIRED, .event_masks = { {.name = "NBOGUSNTAG", .desc = "Non-bogus instructions that are not tagged", .bit = 0, }, {.name = "NBOGUSTAG", .desc = "Non-bogus instructions that are tagged", .bit = 1, }, {.name = "BOGUSNTAG", .desc = "Bogus instructions that are not tagged", .bit = 2, }, {.name = "BOGUSTAG", .desc = "Bogus instructions that are tagged", .bit = 3, }, }, }, /* 39 */ {.name = "uops_retired", .desc = "Number of uops retired during a clock cycle", .event_select = 0x1, .escr_select = 0x4, .allowed_escrs = { 20, 42 }, .perf_code = P4_EVENT_UOPS_RETIRED, .event_masks = { {.name = "NBOGUS", .desc = "The marked uops are not bogus", .bit = 0, }, {.name = "BOGUS", .desc = "The marked uops are bogus", .bit = 1, }, }, }, /* 40 */ {.name = "uops_type", .desc = "This event is used in conjunction with the front-end " "mechanism to tag load and store uops", .event_select = 0x2, .escr_select = 0x2, .allowed_escrs = { 18, 41 }, .perf_code = P4_EVENT_UOP_TYPE, .event_masks = { {.name = "TAGLOADS", .desc = "The uop is a load operation", .bit = 1, }, {.name = "TAGSTORES", .desc = "The uop is a store operation", .bit = 2, }, }, }, /* 41 */ {.name = "branch_retired", .desc = "Number of retirements of a branch", .event_select = 0x6, .escr_select = 0x5, .allowed_escrs = { 21, 43 }, .perf_code = P4_EVENT_BRANCH_RETIRED, .event_masks = { {.name = "MMNP", .desc = "Branch not-taken predicted", .bit = 0, }, {.name = "MMNM", .desc = "Branch not-taken mispredicted", .bit = 1, }, {.name = "MMTP", .desc = "Branch taken predicted", .bit = 2, }, {.name = "MMTM", .desc = "Branch taken mispredicted", .bit = 3, }, }, }, /* 42 */ {.name = "mispred_branch_retired", .desc = "Number of retirements of mispredicted " "IA-32 branch instructions", 
.event_select = 0x3, .escr_select = 0x4, .allowed_escrs = { 20, 42 }, .perf_code = P4_EVENT_MISPRED_BRANCH_RETIRED, .event_masks = { {.name = "NBOGUS", .desc = "The retired instruction is not bogus", .bit = 0, .flags = NETBURST_FL_DFL, }, }, }, /* 43 */ {.name = "x87_assist", .desc = "Number of retirements of x87 instructions that required " "special handling", .event_select = 0x3, .escr_select = 0x5, .allowed_escrs = { 21, 43 }, .perf_code = P4_EVENT_X87_ASSIST, .event_masks = { {.name = "FPSU", .desc = "Handle FP stack underflow", .bit = 0, }, {.name = "FPSO", .desc = "Handle FP stack overflow", .bit = 1, }, {.name = "POAO", .desc = "Handle x87 output overflow", .bit = 2, }, {.name = "POAU", .desc = "Handle x87 output underflow", .bit = 3, }, {.name = "PREA", .desc = "Handle x87 input assist", .bit = 4, }, }, }, /* 44 */ {.name = "machine_clear", .desc = "Number of occurrences when the entire " "pipeline of the machine is cleared", .event_select = 0x2, .escr_select = 0x5, .allowed_escrs = { 21, 43 }, .perf_code = P4_EVENT_MACHINE_CLEAR, .event_masks = { {.name = "CLEAR", .desc = "Counts for a portion of the many cycles while the " "machine is cleared for any cause. 
Use edge-" "triggering for this bit only to get a count of " "occurrences versus a duration", .bit = 0, }, {.name = "MOCLEAR", .desc = "Increments each time the machine is cleared due to " "memory ordering issues", .bit = 2, }, {.name = "SMCLEAR", .desc = "Increments each time the machine is cleared due to " "self-modifying code issues", .bit = 6, }, }, }, /* 45 */ {.name = "instr_completed", .desc = "Instructions that have completed and " "retired during a clock cycle (models 3, 4, 6 only)", .event_select = 0x7, .escr_select = 0x4, .allowed_escrs = { 21, 42 }, .perf_code = P4_EVENT_INSTR_COMPLETED, .event_masks = { {.name = "NBOGUS", .desc = "Non-bogus instructions", .bit = 0, }, {.name = "BOGUS", .desc = "Bogus instructions", .bit = 1, }, }, }, }; #define PME_REPLAY_EVENT 37 #define NETBURST_EVENT_COUNT (sizeof(netburst_events)/sizeof(netburst_entry_t)) #endif papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_nhm_events.h000066400000000000000000002167711502707512200235660ustar00rootroot00000000000000/* * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. * * PMU: nhm (Intel Nehalem) */ static const intel_x86_umask_t nhm_arith[]={ { .uname = "CYCLES_DIV_BUSY", .udesc = "Counts the number of cycles the divider is busy executing divide or square root operations. The divide can be integer, X87 or Streaming SIMD Extensions (SSE). The square root operation can be either X87 or SSE.", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DIV", .udesc = "Counts the number of divide or square root operations. The divide can be integer, X87 or Streaming SIMD Extensions (SSE). The square root operation can be either X87 or SSE.", .uequiv = "CYCLES_DIV_BUSY:c=1:i=1:e=1", .ucode = 0x100 | INTEL_X86_MOD_EDGE | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "MUL", .udesc = "Counts the number of multiply operations executed. 
This includes integer as well as floating point multiply operations but excludes DPPS mul and MPSAD.", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_baclear[]={ { .uname = "BAD_TARGET", .udesc = "BACLEAR asserted with bad target address", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CLEAR", .udesc = "BACLEAR asserted, regardless of cause", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_bpu_clears[]={ { .uname = "EARLY", .udesc = "Early Branch Prediction Unit clears", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LATE", .udesc = "Late Branch Prediction Unit clears", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Count any Branch Prediction Unit clears", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_br_inst_exec[]={ { .uname = "ANY", .udesc = "Branch instructions executed", .ucode = 0x7f00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "COND", .udesc = "Conditional branch instructions executed", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DIRECT", .udesc = "Unconditional branches executed", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DIRECT_NEAR_CALL", .udesc = "Unconditional call branches executed", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "INDIRECT_NEAR_CALL", .udesc = "Indirect call branches executed", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "INDIRECT_NON_CALL", .udesc = "Indirect non call branches executed", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "NEAR_CALLS", .udesc = "Call branches executed", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "NON_CALLS", .udesc = "All non call branches executed", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RETURN_NEAR", .udesc = "Indirect return branches executed", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN", .udesc = 
"Taken branches executed", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_br_inst_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "Retired branch instructions (Precise Event)", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "CONDITIONAL", .udesc = "Retired conditional branch instructions (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_CALL", .udesc = "Retired near call instructions (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t nhm_br_misp_exec[]={ { .uname = "ANY", .udesc = "Mispredicted branches executed", .ucode = 0x7f00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "COND", .udesc = "Mispredicted conditional branches executed", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DIRECT", .udesc = "Mispredicted unconditional branches executed", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DIRECT_NEAR_CALL", .udesc = "Mispredicted non call branches executed", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "INDIRECT_NEAR_CALL", .udesc = "Mispredicted indirect call branches executed", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "INDIRECT_NON_CALL", .udesc = "Mispredicted indirect non call branches executed", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "NEAR_CALLS", .udesc = "Mispredicted call branches executed", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "NON_CALLS", .udesc = "Mispredicted non call branches executed", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RETURN_NEAR", .udesc = "Mispredicted return branches executed", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN", .udesc = "Mispredicted taken branches executed", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_br_misp_retired[]={ { .uname = "NEAR_CALL", .udesc = 
"Counts mispredicted direct and indirect near unconditional retired calls", .ucode = 0x200, .uflags= INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_cache_lock_cycles[]={ { .uname = "L1D", .udesc = "Cycles L1D locked", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L1D_L2", .udesc = "Cycles L1D and L2 locked", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_cpu_clk_unhalted[]={ { .uname = "THREAD_P", .udesc = "Cycles when thread is not halted (programmable counter)", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "REF_P", .udesc = "Reference base clock (133 Mhz) cycles when thread is not halted", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TOTAL_CYCLES", .udesc = "Total number of elapsed cycles. Does not work when C-state enabled", .uequiv = "THREAD_P:c=2:i=1", .ucode = 0x0 | INTEL_X86_MOD_INV | (0x2 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_dtlb_load_misses[]={ { .uname = "ANY", .udesc = "DTLB load misses", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "PDE_MISS", .udesc = "DTLB load miss caused by low part of address", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "DTLB load miss page walks complete", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT", .udesc = "DTLB second level hit", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PDP_MISS", .udesc = "Number of DTLB cache load misses where the high part of the linear to physical address translation was missed", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LARGE_WALK_COMPLETED", .udesc = "Counts number of completed large page walks due to load miss in the STLB", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_dtlb_misses[]={ { .uname = "ANY", .udesc = "DTLB misses", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | 
INTEL_X86_DFL, }, { .uname = "STLB_HIT", .udesc = "DTLB first level misses but second level hit", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "DTLB miss page walks", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PDE_MISS", .udesc = "Number of DTLB cache misses where the low part of the linear to physical address translation was missed", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PDP_MISS", .udesc = "Number of DTLB misses where the high part of the linear to physical address translation was missed", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LARGE_WALK_COMPLETED", .udesc = "Counts number of completed large page walks due to misses in the STLB", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_ept[]={ { .uname = "EPDE_MISS", .udesc = "Extended Page Directory Entry miss", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "EPDPE_MISS", .udesc = "Extended Page Directory Pointer miss", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "EPDPE_HIT", .udesc = "Extended Page Directory Pointer hit", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_fp_assist[]={ { .uname = "ALL", .udesc = "Floating point assists (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "INPUT", .udesc = "Floating point assists for invalid input value (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "OUTPUT", .udesc = "Floating point assists for invalid output value (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t nhm_fp_comp_ops_exe[]={ { .uname = "MMX", .udesc = "MMX Uops", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_DOUBLE_PRECISION", .udesc = "SSE* FP double precision Uops", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_FP", .udesc = 
"SSE and SSE2 FP Uops", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_FP_PACKED", .udesc = "SSE FP packed Uops", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_FP_SCALAR", .udesc = "SSE FP scalar Uops", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_SINGLE_PRECISION", .udesc = "SSE* FP single precision Uops", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE2_INTEGER", .udesc = "SSE2 integer Uops", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "X87", .udesc = "Computational floating-point operations executed", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_fp_mmx_trans[]={ { .uname = "ANY", .udesc = "All Floating Point to and from MMX transitions", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "TO_FP", .udesc = "Transitions from MMX to Floating Point instructions", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TO_MMX", .udesc = "Transitions from Floating Point to MMX instructions", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_ifu_ivc[]={ { .uname = "FULL", .udesc = "Instruction Fetch unit victim cache full", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L1I_EVICTION", .udesc = "L1 Instruction cache evictions", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_ild_stall[]={ { .uname = "ANY", .udesc = "Any Instruction Length Decoder stall cycles", .uequiv = "IQ_FULL:LCP:MRU:REGEN", .ucode = 0xf00, .uflags= INTEL_X86_DFL, }, { .uname = "IQ_FULL", .udesc = "Instruction Queue full stall cycles", .ucode = 0x400, }, { .uname = "LCP", .udesc = "Length Change Prefix stall cycles", .ucode = 0x100, }, { .uname = "MRU", .udesc = "Stall cycles due to BPU MRU bypass", .ucode = 0x200, }, { .uname = "REGEN", .udesc = "Regen stall cycles", .ucode = 0x800, }, }; static const intel_x86_umask_t nhm_inst_decoded[]={ { .uname = "DEC0", .udesc = 
"Instructions that must be decoded by decoder 0", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_inst_retired[]={ { .uname = "ANY_P", .udesc = "Instructions Retired (Precise Event)", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "X87", .udesc = "Retired floating-point operations (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t nhm_l1d[]={ { .uname = "M_EVICT", .udesc = "L1D cache lines replaced in M state", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "M_REPL", .udesc = "L1D cache lines allocated in the M state", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "M_SNOOP_EVICT", .udesc = "L1D snoop eviction of cache lines in M state", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "REPL", .udesc = "L1 data cache lines allocated", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_l1d_all_ref[]={ { .uname = "ANY", .udesc = "All references to the L1 data cache", .ucode = 0x100, .uflags= INTEL_X86_DFL, }, { .uname = "CACHEABLE", .udesc = "L1 data cacheable reads and writes", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_l1d_cache_ld[]={ { .uname = "E_STATE", .udesc = "L1 data cache read in E state", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "I_STATE", .udesc = "L1 data cache read in I state (misses)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "M_STATE", .udesc = "L1 data cache read in M state", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MESI", .udesc = "L1 data cache reads", .ucode = 0xf00, .uflags= INTEL_X86_DFL, }, { .uname = "S_STATE", .udesc = "L1 data cache read in S state", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_l1d_cache_lock[]={ { .uname = "E_STATE", .udesc = "L1 data cache load locks in E state", .ucode = 0x400, 
.uflags= INTEL_X86_NCOMBO, }, { .uname = "HIT", .udesc = "L1 data cache load lock hits", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "M_STATE", .udesc = "L1 data cache load locks in M state", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "S_STATE", .udesc = "L1 data cache load locks in S state", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_l1d_cache_st[]={ { .uname = "E_STATE", .udesc = "L1 data cache stores in E state", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "I_STATE", .udesc = "L1 data cache store in the I state", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "M_STATE", .udesc = "L1 data cache stores in M state", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "S_STATE", .udesc = "L1 data cache stores in S state", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MESI", .udesc = "L1 data cache store in all states", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_l1d_prefetch[]={ { .uname = "MISS", .udesc = "L1D hardware prefetch misses", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "REQUESTS", .udesc = "L1D hardware prefetch requests", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TRIGGERS", .udesc = "L1D hardware prefetch requests triggered", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_l1d_wb_l2[]={ { .uname = "E_STATE", .udesc = "L1 writebacks to L2 in E state", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "I_STATE", .udesc = "L1 writebacks to L2 in I state (misses)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "M_STATE", .udesc = "L1 writebacks to L2 in M state", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "S_STATE", .udesc = "L1 writebacks to L2 in S state", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MESI", .udesc = "All L1 writebacks to L2", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | 
INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_l1i[]={ { .uname = "CYCLES_STALLED", .udesc = "L1I instruction fetch stall cycles", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "HITS", .udesc = "L1I instruction fetch hits", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MISSES", .udesc = "L1I instruction fetch misses", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "READS", .udesc = "L1I Instruction fetches", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_l2_data_rqsts[]={ { .uname = "ANY", .udesc = "All L2 data requests", .ucode = 0xff00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "DEMAND_E_STATE", .udesc = "L2 data demand loads in E state", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_I_STATE", .udesc = "L2 data demand loads in I state (misses)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_M_STATE", .udesc = "L2 data demand loads in M state", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_MESI", .udesc = "L2 data demand requests", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_S_STATE", .udesc = "L2 data demand loads in S state", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_E_STATE", .udesc = "L2 data prefetches in E state", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_I_STATE", .udesc = "L2 data prefetches in the I state (misses)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_M_STATE", .udesc = "L2 data prefetches in M state", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_MESI", .udesc = "All L2 data prefetches", .ucode = 0xf000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_S_STATE", .udesc = "L2 data prefetches in the S state", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_l2_hw_prefetch[]={ { .uname = "HIT", .udesc = "Count L2 HW prefetcher detector hits", 
.ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALLOC", .udesc = "Count L2 HW prefetcher allocations", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DATA_TRIGGER", .udesc = "Count L2 HW data prefetcher triggered", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CODE_TRIGGER", .udesc = "Count L2 HW code prefetcher triggered", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DCA_TRIGGER", .udesc = "Count L2 HW DCA prefetcher triggered", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "KICK_START", .udesc = "Count L2 HW prefetcher kick started", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_l2_lines_in[]={ { .uname = "ANY", .udesc = "L2 lines allocated", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "E_STATE", .udesc = "L2 lines allocated in the E state", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "S_STATE", .udesc = "L2 lines allocated in the S state", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_l2_lines_out[]={ { .uname = "ANY", .udesc = "L2 lines evicted", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "DEMAND_CLEAN", .udesc = "L2 lines evicted by a demand request", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DIRTY", .udesc = "L2 modified lines evicted by a demand request", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_CLEAN", .udesc = "L2 lines evicted by a prefetch request", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_DIRTY", .udesc = "L2 modified lines evicted by a prefetch request", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_l2_rqsts[]={ { .uname = "MISS", .udesc = "All L2 misses", .ucode = 0xaa00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "REFERENCES", .udesc = "All L2 requests", .ucode = 0xff00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "IFETCH_HIT", .udesc = "L2 
instruction fetch hits", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "IFETCH_MISS", .udesc = "L2 instruction fetch misses", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "IFETCHES", .udesc = "L2 instruction fetches", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LD_HIT", .udesc = "L2 load hits", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LD_MISS", .udesc = "L2 load misses", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOADS", .udesc = "L2 requests", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_HIT", .udesc = "L2 prefetch hits", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_MISS", .udesc = "L2 prefetch misses", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCHES", .udesc = "All L2 prefetches", .ucode = 0xc000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT", .udesc = "L2 RFO hits", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_MISS", .udesc = "L2 RFO misses", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFOS", .udesc = "L2 RFO requests", .ucode = 0xc00, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_l2_transactions[]={ { .uname = "ANY", .udesc = "All L2 transactions", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FILL", .udesc = "L2 fill transactions", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "IFETCH", .udesc = "L2 instruction fetch transactions", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L1D_WB", .udesc = "L1D writeback to L2 transactions", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOAD", .udesc = "L2 Load transactions", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH", .udesc = "L2 prefetch transactions", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO", .udesc = "L2 RFO transactions", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WB", .udesc = "L2 writeback to 
LLC transactions", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_l2_write[]={ { .uname = "LOCK_E_STATE", .udesc = "L2 demand lock RFOs in E state", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOCK_I_STATE", .udesc = "L2 demand lock RFOs in I state (misses)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOCK_S_STATE", .udesc = "L2 demand lock RFOs in S state", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOCK_HIT", .udesc = "All demand L2 lock RFOs that hit the cache", .ucode = 0xe000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOCK_M_STATE", .udesc = "L2 demand lock RFOs in M state", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOCK_MESI", .udesc = "All demand L2 lock RFOs", .ucode = 0xf000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT", .udesc = "All L2 demand store RFOs that hit the cache", .ucode = 0xe00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_I_STATE", .udesc = "L2 demand store RFOs in I state (misses)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_E_STATE", .udesc = "L2 demand store RFOs in the E state (exclusive)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_M_STATE", .udesc = "L2 demand store RFOs in M state", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_MESI", .udesc = "All L2 demand store RFOs", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_S_STATE", .udesc = "L2 demand store RFOs in S state", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_large_itlb[]={ { .uname = "HIT", .udesc = "Large ITLB hit", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_load_dispatch[]={ { .uname = "ANY", .udesc = "All loads dispatched", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "MOB", .udesc = "Loads dispatched from the MOB", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RS", 
.udesc = "Loads dispatched that bypass the MOB", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RS_DELAYED", .udesc = "Loads dispatched from stage 305", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_longest_lat_cache[]={ { .uname = "REFERENCE", .udesc = "Longest latency cache reference", .ucode = 0x4f00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MISS", .udesc = "Longest latency cache miss", .ucode = 0x4100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_lsd[]={ { .uname = "ACTIVE", .udesc = "Cycles when uops were delivered by the LSD", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "INACTIVE", .udesc = "Cycles no uops were delivered by the LSD", .uequiv = "ACTIVE:i=1", .ucode = 0x100 | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), }, }; static const intel_x86_umask_t nhm_machine_clears[]={ { .uname = "SMC", .udesc = "Self-Modifying Code detected", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CYCLES", .udesc = "Cycles machine clear asserted", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MEM_ORDER", .udesc = "Execution pipeline restart due to Memory ordering conflicts", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "FUSION_ASSIST", .udesc = "Counts the number of macro-fusion assists", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_macro_insts[]={ { .uname = "DECODED", .udesc = "Instructions decoded", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "FUSIONS_DECODED", .udesc = "Macro-fused instructions decoded", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_memory_disambiguation[]={ { .uname = "RESET", .udesc = "Counts memory disambiguation reset cycles", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WATCHDOG", .udesc = "Counts the number of times the memory disambiguation watchdog 
kicked in", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WATCH_CYCLES", .udesc = "Counts the cycles that the memory disambiguation watchdog is active", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_mem_inst_retired[]={ { .uname = "LATENCY_ABOVE_THRESHOLD", .udesc = "Memory instructions retired above programmed clocks, minimum threshold value is 3 (Precise Event and ldlat required)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT, }, { .uname = "LOADS", .udesc = "Instructions retired which contain a load (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STORES", .udesc = "Instructions retired which contain a store (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t nhm_mem_load_retired[]={ { .uname = "DTLB_MISS", .udesc = "Retired loads that miss the DTLB (Precise Event)", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "HIT_LFB", .udesc = "Retired loads that miss L1D and hit a previously allocated LFB (Precise Event)", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1D_HIT", .udesc = "Retired loads that hit the L1 data cache (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_HIT", .udesc = "Retired loads that hit the L2 cache (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_MISS", .udesc = "Retired loads that miss the L3 cache (Precise Event)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LLC_MISS", .udesc = "This is an alias for L3_MISS", .uequiv = "L3_MISS", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_UNSHARED_HIT", .udesc = "Retired loads that hit valid versions in the L3 cache (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { 
.uname = "LLC_UNSHARED_HIT", .udesc = "This is an alias for L3_UNSHARED_HIT", .uequiv = "L3_UNSHARED_HIT", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "OTHER_CORE_L2_HIT_HITM", .udesc = "Retired loads that hit sibling core's L2 in modified or unmodified states (Precise Event)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t nhm_mem_store_retired[]={ { .uname = "DTLB_MISS", .udesc = "Retired stores that miss the DTLB (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_mem_uncore_retired[]={ { .uname = "OTHER_CORE_L2_HITM", .udesc = "Load instructions retired that HIT modified data in sibling core (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_CACHE_LOCAL_HOME_HIT", .udesc = "Load instructions retired remote cache HIT data source (Precise Event)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_DRAM", .udesc = "Load instructions retired remote DRAM and remote home-remote cache HITM (Precise Event)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOCAL_DRAM", .udesc = "Load instructions retired with a data source of local DRAM or locally homed remote hitm (Precise Event)", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_DATA_MISS_UNKNOWN", .udesc = "Load instructions retired where the memory reference missed L3 and data source is unknown (Model 46 only, Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, .umodel = PFM_PMU_INTEL_NHM_EX, }, { .uname = "UNCACHEABLE", .udesc = "Load instructions retired where the memory reference missed L1, L2, L3 caches and to perform I/O (Model 46 only, Precise Event)", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, .umodel = PFM_PMU_INTEL_NHM_EX, }, }; static const intel_x86_umask_t 
nhm_offcore_requests[]={ { .uname = "ANY", .udesc = "All offcore requests", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ANY_READ", .udesc = "Offcore read requests", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_RFO", .udesc = "Offcore RFO requests", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_READ_CODE", .udesc = "Counts number of offcore demand code read requests. Does not count L2 prefetch requests.", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_READ_DATA", .udesc = "Offcore demand data read requests", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO", .udesc = "Offcore demand RFO requests", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L1D_WRITEBACK", .udesc = "Offcore L1 data cache writebacks", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "UNCACHED_MEM", .udesc = "Counts number of offcore uncached memory requests", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_pic_accesses[]={ { .uname = "TPR_READS", .udesc = "Counts number of TPR reads", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TPR_WRITES", .udesc = "Counts number of TPR writes", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_rat_stalls[]={ { .uname = "FLAGS", .udesc = "Flag stall cycles", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "REGISTERS", .udesc = "Partial register stall cycles", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ROB_READ_PORT", .udesc = "ROB read port stalls cycles", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SCOREBOARD", .udesc = "Scoreboard stall cycles", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "All RAT stall cycles", .ucode = 0xf00, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_resource_stalls[]={ { .uname = "FPCW", .udesc = "FPU control word write stall cycles", .ucode = 
0x2000, }, { .uname = "LOAD", .udesc = "Load buffer stall cycles", .ucode = 0x200, }, { .uname = "MXCSR", .udesc = "MXCSR rename stall cycles", .ucode = 0x4000, }, { .uname = "RS_FULL", .udesc = "Reservation Station full stall cycles", .ucode = 0x400, }, { .uname = "STORE", .udesc = "Store buffer stall cycles", .ucode = 0x800, }, { .uname = "OTHER", .udesc = "Other Resource related stall cycles", .ucode = 0x8000, }, { .uname = "ROB_FULL", .udesc = "ROB full stall cycles", .ucode = 0x1000, }, { .uname = "ANY", .udesc = "Resource related stall cycles", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_simd_int_128[]={ { .uname = "PACK", .udesc = "128 bit SIMD integer pack operations", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_ARITH", .udesc = "128 bit SIMD integer arithmetic operations", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_LOGICAL", .udesc = "128 bit SIMD integer logical operations", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_MPY", .udesc = "128 bit SIMD integer multiply operations", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_SHIFT", .udesc = "128 bit SIMD integer shift operations", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SHUFFLE_MOVE", .udesc = "128 bit SIMD integer shuffle/move operations", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "UNPACK", .udesc = "128 bit SIMD integer unpack operations", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_simd_int_64[]={ { .uname = "PACK", .udesc = "SIMD integer 64 bit pack operations", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_ARITH", .udesc = "SIMD integer 64 bit arithmetic operations", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_LOGICAL", .udesc = "SIMD integer 64 bit logical operations", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_MPY", .udesc = 
"SIMD integer 64 bit packed multiply operations", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_SHIFT", .udesc = "SIMD integer 64 bit shift operations", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SHUFFLE_MOVE", .udesc = "SIMD integer 64 bit shuffle/move operations", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "UNPACK", .udesc = "SIMD integer 64 bit unpack operations", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_snoop_response[]={ { .uname = "HIT", .udesc = "Thread responded HIT to snoop", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "HITE", .udesc = "Thread responded HITE to snoop", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "HITM", .udesc = "Thread responded HITM to snoop", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_sq_misc[]={ { .uname = "PROMOTION", .udesc = "Counts the number of L2 secondary misses that hit the Super Queue", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PROMOTION_POST_GO", .udesc = "Counts the number of L2 secondary misses during the Super Queue filling L2", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LRU_HINTS", .udesc = "Counts number of Super Queue LRU hints sent to L3", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "FILL_DROPPED", .udesc = "Counts the number of SQ L2 fills dropped due to L2 busy", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SPLIT_LOCK", .udesc = "Super Queue lock splits across a cache line", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_sse_mem_exec[]={ { .uname = "NTA", .udesc = "Streaming SIMD L1D NTA prefetch miss", .ucode = 0x100, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_ssex_uops_retired[]={ { .uname = "PACKED_DOUBLE", .udesc = "SIMD Packed-Double Uops retired (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = 
"PACKED_SINGLE", .udesc = "SIMD Packed-Single Uops retired (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SCALAR_DOUBLE", .udesc = "SIMD Scalar-Double Uops retired (Precise Event)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SCALAR_SINGLE", .udesc = "SIMD Scalar-Single Uops retired (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "VECTOR_INTEGER", .udesc = "SIMD Vector Integer Uops retired (Precise Event)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t nhm_store_blocks[]={ { .uname = "AT_RET", .udesc = "Loads delayed with at-Retirement block code", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L1D_BLOCK", .udesc = "Cacheable loads delayed with L1D block code", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "NOT_STA", .udesc = "Loads delayed due to a store blocked for unknown data", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "STA", .udesc = "Loads delayed due to a store blocked for an unknown address", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_uops_decoded[]={ { .uname = "ESP_FOLDING", .udesc = "Stack pointer instructions decoded", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ESP_SYNC", .udesc = "Stack pointer sync operations", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS", .udesc = "Uops decoded by Microcode Sequencer", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_CYCLES_ACTIVE", .udesc = "Cycles in which at least one uop is decoded by Microcode Sequencer", .uequiv = "MS:c=1", .ucode = 0x200 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_uops_executed[]={ { .uname = "PORT0", .udesc = "Uops executed on port 0", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT1", .udesc = "Uops executed on port 1", .ucode = 
0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT2_CORE", .udesc = "Uops executed on port 2 on any thread (core count only)", .ucode = 0x400 | INTEL_X86_MOD_ANY, .modhw = _INTEL_X86_ATTR_T, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT3_CORE", .udesc = "Uops executed on port 3 on any thread (core count only)", .ucode = 0x800 | INTEL_X86_MOD_ANY, .modhw = _INTEL_X86_ATTR_T, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT4_CORE", .udesc = "Uops executed on port 4 on any thread (core count only)", .ucode = 0x1000 | INTEL_X86_MOD_ANY, .modhw = _INTEL_X86_ATTR_T, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT5", .udesc = "Uops executed on port 5", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT015", .udesc = "Uops issued on ports 0, 1 or 5", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT234_CORE", .udesc = "Uops issued on ports 2, 3 or 4 on any thread (core count only)", .ucode = 0x8000 | INTEL_X86_MOD_ANY, .modhw = _INTEL_X86_ATTR_T, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT015_STALL_CYCLES", .udesc = "Cycles no Uops issued on ports 0, 1 or 5", .uequiv = "PORT015:c=1:i=1", .ucode = 0x4000 | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_uops_issued[]={ { .uname = "ANY", .udesc = "Uops issued", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "STALLED_CYCLES", .udesc = "Cycles stalled no issued uops", .uequiv = "ANY:c=1:i=1", .ucode = 0x100 | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "FUSED", .udesc = "Fused Uops issued", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_uops_retired[]={ { .uname = "ANY", .udesc = "Uops retired (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "RETIRE_SLOTS", .udesc = "Retirement slots used (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | 
INTEL_X86_PEBS, }, { .uname = "ACTIVE_CYCLES", .udesc = "Cycles Uops are being retired (Precise Event)", .uequiv = "ANY:c=1", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STALL_CYCLES", .udesc = "Cycles No Uops retired (Precise Event)", .uequiv = "ANY:c=1:i=1", .ucode = 0x100 | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "MACRO_FUSED", .udesc = "Macro-fused Uops retired (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t nhm_offcore_response_0[]={ { .uname = "DMND_DATA_RD", .udesc = "Request: counts the number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches", .ucode = 0x100, .grpid = 0, }, { .uname = "DMND_RFO", .udesc = "Request: counts the number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO", .ucode = 0x200, .grpid = 0, }, { .uname = "DMND_IFETCH", .udesc = "Request: counts the number of demand and DCU prefetch instruction cacheline reads. 
Does not count L2 code read prefetches", .ucode = 0x400, .grpid = 0, }, { .uname = "WB", .udesc = "Request: counts the number of writeback (modified to exclusive) transactions", .ucode = 0x800, .grpid = 0, }, { .uname = "PF_DATA_RD", .udesc = "Request: counts the number of data cacheline reads generated by L2 prefetchers", .ucode = 0x1000, .grpid = 0, }, { .uname = "PF_RFO", .udesc = "Request: counts the number of RFO requests generated by L2 prefetchers", .ucode = 0x2000, .grpid = 0, }, { .uname = "PF_IFETCH", .udesc = "Request: counts the number of code reads generated by L2 prefetchers", .ucode = 0x4000, .grpid = 0, }, { .uname = "OTHER", .udesc = "Request: counts one of the following transaction types, including L3 invalidate, I/O, full or partial writes, WC or non-temporal stores, CLFLUSH, Fences, lock, unlock, split lock", .ucode = 0x8000, .grpid = 0, }, { .uname = "ANY_IFETCH", .udesc = "Request: combination of PF_IFETCH | DMND_IFETCH", .uequiv = "PF_IFETCH:DMND_IFETCH", .ucode = 0x4400, .grpid = 0, }, { .uname = "ANY_REQUEST", .udesc = "Request: combination of all requests umasks", .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_IFETCH:WB:PF_DATA_RD:PF_RFO:PF_IFETCH:OTHER", .ucode = 0xff00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "ANY_DATA", .udesc = "Request: any data read/write request", .uequiv = "DMND_DATA_RD:PF_DATA_RD:DMND_RFO:PF_RFO", .ucode = 0x3300, .grpid = 0, }, { .uname = "ANY_DATA_RD", .udesc = "Request: any data read in request", .uequiv = "DMND_DATA_RD:PF_DATA_RD", .ucode = 0x1100, .grpid = 0, }, { .uname = "ANY_RFO", .udesc = "Request: combination of DMND_RFO | PF_RFO", .uequiv = "DMND_RFO:PF_RFO", .ucode = 0x2200, .grpid = 0, }, { .uname = "UNCORE_HIT", .udesc = "Response: counts L3 Hit: local or remote home requests that hit L3 cache in the uncore with no coherency actions required (snooping)", .ucode = 0x10000, .grpid = 1, }, { .uname = "OTHER_CORE_HIT_SNP", .udesc = "Response: counts L3 Hit: local or remote home 
requests that hit L3 cache in the uncore and was serviced by another core with a cross core snoop where no modified copies were found (clean)", .ucode = 0x20000, .grpid = 1, }, { .uname = "OTHER_CORE_HITM", .udesc = "Response: counts L3 Hit: local or remote home requests that hit L3 cache in the uncore and was serviced by another core with a cross core snoop where modified copies were found (HITM)", .ucode = 0x40000, .grpid = 1, }, { .uname = "REMOTE_CACHE_HITM", .udesc = "Response: counts L3 Hit: local or remote home requests that hit a remote L3 cacheline in modified (HITM) state", .ucode = 0x80000, .grpid = 1, }, { .uname = "REMOTE_CACHE_FWD", .udesc = "Response: counts L3 Miss: local homed requests that missed the L3 cache and was serviced by forwarded data following a cross package snoop where no modified copies found. (Remote home requests are not counted)", .ucode = 0x100000, .grpid = 1, }, { .uname = "REMOTE_DRAM", .udesc = "Response: counts L3 Miss: remote home requests that missed the L3 cache and were serviced by remote DRAM", .ucode = 0x200000, .grpid = 1, }, { .uname = "LOCAL_DRAM", .udesc = "Response: counts L3 Miss: local home requests that missed the L3 cache and were serviced by local DRAM", .ucode = 0x400000, .grpid = 1, }, { .uname = "NON_DRAM", .udesc = "Response: Non-DRAM requests that were serviced by IOH", .ucode = 0x800000, .grpid = 1, }, { .uname = "ANY_CACHE_DRAM", .udesc = "Response: requests serviced by any source but IOH", .uequiv = "UNCORE_HIT:OTHER_CORE_HIT_SNP:OTHER_CORE_HITM:REMOTE_CACHE_FWD:REMOTE_CACHE_HITM:REMOTE_DRAM:LOCAL_DRAM", .ucode = 0x7f0000, .grpid = 1, }, { .uname = "ANY_DRAM", .udesc = "Response: requests serviced by local or remote DRAM", .uequiv = "REMOTE_DRAM:LOCAL_DRAM", .ucode = 0x600000, .grpid = 1, }, { .uname = "ANY_LLC_MISS", .udesc = "Response: requests that missed in L3", .uequiv = "REMOTE_CACHE_HITM:REMOTE_CACHE_FWD:REMOTE_DRAM:LOCAL_DRAM:NON_DRAM", .ucode = 0xf80000, .grpid = 1, }, { .uname = 
"LOCAL_CACHE_DRAM", .udesc = "Response: requests hit local core or uncore caches or local DRAM", .uequiv = "UNCORE_HIT:OTHER_CORE_HIT_SNP:OTHER_CORE_HITM:LOCAL_DRAM", .ucode = 0x470000, .grpid = 1, }, { .uname = "REMOTE_CACHE_DRAM", .udesc = "Response: requests that miss L3 and hit remote caches or DRAM", .uequiv = "REMOTE_CACHE_HITM:REMOTE_CACHE_FWD:REMOTE_DRAM", .ucode = 0x380000, .grpid = 1, }, { .uname = "ANY_RESPONSE", .udesc = "Response: combination of all response umasks", .uequiv = "UNCORE_HIT:OTHER_CORE_HIT_SNP:OTHER_CORE_HITM:REMOTE_CACHE_FWD:REMOTE_CACHE_HITM:REMOTE_DRAM:LOCAL_DRAM:NON_DRAM", .ucode = 0xff0000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, }; static const intel_x86_entry_t intel_nhm_pe[]={ { .name = "UNHALTED_CORE_CYCLES", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x20000000full, .code = 0x3c, }, { .name = "INSTRUCTION_RETIRED", .desc = "Count the number of instructions at retirement", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x10000000full, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "This is an alias for INSTRUCTION_RETIRED", .modmsk = INTEL_V3_ATTRS, .equiv = "INSTRUCTION_RETIRED", .cntmsk = 0x10000000full, .code = 0xc0, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles", .modmsk = INTEL_FIXED3_ATTRS, .cntmsk = 0x400000000ull, .code = 0x0300, /* pseudo encoding */ .flags = INTEL_X86_FIXED, }, { .name = "LLC_REFERENCES", .desc = "Count each request originating from the core to reference a cache line in the last level cache. The count may include speculation, but excludes cache line fills due to hardware prefetch. 
Alias to L2_RQSTS:SELF_DEMAND_MESI", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x4f2e, }, { .name = "LAST_LEVEL_CACHE_REFERENCES", .desc = "This is an alias for LLC_REFERENCES", .modmsk = INTEL_V3_ATTRS, .equiv = "LLC_REFERENCES", .cntmsk = 0xf, .code = 0x4f2e, }, { .name = "LLC_MISSES", .desc = "Count each cache miss condition for references to the last level cache. The event count may include speculation, but excludes cache line fills due to hardware prefetch. Alias to event L2_RQSTS:SELF_DEMAND_I_STATE", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x412e, }, { .name = "LAST_LEVEL_CACHE_MISSES", .desc = "This is an alias for LLC_MISSES", .modmsk = INTEL_V3_ATTRS, .equiv = "LLC_MISSES", .cntmsk = 0xf, .code = 0x412e, }, { .name = "BRANCH_INSTRUCTIONS_RETIRED", .desc = "Count branch instructions at retirement. Specifically, this event counts the retirement of the last micro-op of a branch instruction.", .modmsk = INTEL_V3_ATTRS, .equiv = "BR_INST_RETIRED:ALL_BRANCHES", .cntmsk = 0xf, .code = 0xc4, }, { .name = "ARITH", .desc = "Counts arithmetic multiply and divide operations", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x14, .numasks = LIBPFM_ARRAY_SIZE(nhm_arith), .ngrp = 1, .umasks = nhm_arith, }, { .name = "BACLEAR", .desc = "Branch address calculator", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xe6, .numasks = LIBPFM_ARRAY_SIZE(nhm_baclear), .ngrp = 1, .umasks = nhm_baclear, }, { .name = "BACLEAR_FORCE_IQ", .desc = "Instruction queue forced BACLEAR", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1a7, }, { .name = "BOGUS_BR", .desc = "Counts the number of bogus branches.", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1e4, }, { .name = "BPU_CLEARS", .desc = "Branch prediction unit clears", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xe8, .numasks = LIBPFM_ARRAY_SIZE(nhm_bpu_clears), .ngrp = 1, .umasks = nhm_bpu_clears, }, { .name = "BPU_MISSED_CALL_RET", .desc = "Branch prediction unit missed call or return", .modmsk = 
INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1e5, }, { .name = "BR_INST_DECODED", .desc = "Branch instructions decoded", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1e0, }, { .name = "BR_INST_EXEC", .desc = "Branch instructions executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x88, .numasks = LIBPFM_ARRAY_SIZE(nhm_br_inst_exec), .ngrp = 1, .umasks = nhm_br_inst_exec, }, { .name = "BR_INST_RETIRED", .desc = "Retired branch instructions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xc4, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(nhm_br_inst_retired), .ngrp = 1, .umasks = nhm_br_inst_retired, }, { .name = "BR_MISP_EXEC", .desc = "Mispredicted branches executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x89, .numasks = LIBPFM_ARRAY_SIZE(nhm_br_misp_exec), .ngrp = 1, .umasks = nhm_br_misp_exec, }, { .name = "BR_MISP_RETIRED", .desc = "Count Mispredicted Branch Activity", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xc5, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(nhm_br_misp_retired), .ngrp = 1, .umasks = nhm_br_misp_retired, }, { .name = "CACHE_LOCK_CYCLES", .desc = "Cache lock cycles", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x63, .numasks = LIBPFM_ARRAY_SIZE(nhm_cache_lock_cycles), .ngrp = 1, .umasks = nhm_cache_lock_cycles, }, { .name = "CPU_CLK_UNHALTED", .desc = "Cycles when processor is not in halted state", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x3c, .numasks = LIBPFM_ARRAY_SIZE(nhm_cpu_clk_unhalted), .ngrp = 1, .umasks = nhm_cpu_clk_unhalted, }, { .name = "DTLB_LOAD_MISSES", .desc = "Data TLB load misses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x8, .numasks = LIBPFM_ARRAY_SIZE(nhm_dtlb_load_misses), .ngrp = 1, .umasks = nhm_dtlb_load_misses, }, { .name = "DTLB_MISSES", .desc = "Data TLB misses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x49, .numasks = LIBPFM_ARRAY_SIZE(nhm_dtlb_misses), .ngrp = 1, .umasks = nhm_dtlb_misses, }, { .name = "EPT", .desc = "Extended Page 
Directory", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x4f, .numasks = LIBPFM_ARRAY_SIZE(nhm_ept), .ngrp = 1, .umasks = nhm_ept, }, { .name = "ES_REG_RENAMES", .desc = "ES segment renames", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1d5, }, { .name = "FP_ASSIST", .desc = "Floating point assists", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xf7, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(nhm_fp_assist), .ngrp = 1, .umasks = nhm_fp_assist, }, { .name = "FP_COMP_OPS_EXE", .desc = "Floating point computational micro-ops", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x10, .numasks = LIBPFM_ARRAY_SIZE(nhm_fp_comp_ops_exe), .ngrp = 1, .umasks = nhm_fp_comp_ops_exe, }, { .name = "FP_MMX_TRANS", .desc = "Floating Point to and from MMX transitions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xcc, .numasks = LIBPFM_ARRAY_SIZE(nhm_fp_mmx_trans), .ngrp = 1, .umasks = nhm_fp_mmx_trans, }, { .name = "IFU_IVC", .desc = "Instruction Fetch unit victim cache", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x81, .numasks = LIBPFM_ARRAY_SIZE(nhm_ifu_ivc), .ngrp = 1, .umasks = nhm_ifu_ivc, }, { .name = "ILD_STALL", .desc = "Instruction Length Decoder stalls", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x87, .numasks = LIBPFM_ARRAY_SIZE(nhm_ild_stall), .ngrp = 1, .umasks = nhm_ild_stall, }, { .name = "INST_DECODED", .desc = "Instructions decoded", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x18, .numasks = LIBPFM_ARRAY_SIZE(nhm_inst_decoded), .ngrp = 1, .umasks = nhm_inst_decoded, }, { .name = "INST_QUEUE_WRITES", .desc = "Instructions written to instruction queue.", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x117, }, { .name = "INST_QUEUE_WRITE_CYCLES", .desc = "Cycles instructions are written to the instruction queue", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x11e, }, { .name = "INST_RETIRED", .desc = "Instructions retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xc0, .flags= INTEL_X86_PEBS, .numasks = 
LIBPFM_ARRAY_SIZE(nhm_inst_retired), .ngrp = 1, .umasks = nhm_inst_retired, }, { .name = "IO_TRANSACTIONS", .desc = "I/O transactions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x16c, }, { .name = "ITLB_FLUSH", .desc = "Counts the number of ITLB flushes", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1ae, }, { .name = "ITLB_MISSES", .desc = "Instruction TLB misses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x85, .numasks = LIBPFM_ARRAY_SIZE(nhm_dtlb_misses), .ngrp = 1, .umasks = nhm_dtlb_misses, /* identical to actual umasks list for this event */ }, { .name = "ITLB_MISS_RETIRED", .desc = "Retired instructions that missed the ITLB (Precise Event)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x20c8, .flags= INTEL_X86_PEBS, }, { .name = "L1D", .desc = "L1D cache", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x51, .numasks = LIBPFM_ARRAY_SIZE(nhm_l1d), .ngrp = 1, .umasks = nhm_l1d, }, { .name = "L1D_ALL_REF", .desc = "L1D references", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x43, .numasks = LIBPFM_ARRAY_SIZE(nhm_l1d_all_ref), .ngrp = 1, .umasks = nhm_l1d_all_ref, }, { .name = "L1D_CACHE_LD", .desc = "L1D cacheable loads. 
WARNING: event may overcount loads", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x40, .numasks = LIBPFM_ARRAY_SIZE(nhm_l1d_cache_ld), .ngrp = 1, .umasks = nhm_l1d_cache_ld, }, { .name = "L1D_CACHE_LOCK", .desc = "L1 data cache load lock", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x42, .numasks = LIBPFM_ARRAY_SIZE(nhm_l1d_cache_lock), .ngrp = 1, .umasks = nhm_l1d_cache_lock, }, { .name = "L1D_CACHE_LOCK_FB_HIT", .desc = "L1D load lock accepted in fill buffer", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x153, }, { .name = "L1D_CACHE_PREFETCH_LOCK_FB_HIT", .desc = "L1D prefetch load lock accepted in fill buffer", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x152, }, { .name = "L1D_CACHE_ST", .desc = "L1 data cache stores", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x41, .numasks = LIBPFM_ARRAY_SIZE(nhm_l1d_cache_st), .ngrp = 1, .umasks = nhm_l1d_cache_st, }, { .name = "L1D_PREFETCH", .desc = "L1D hardware prefetch", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x4e, .numasks = LIBPFM_ARRAY_SIZE(nhm_l1d_prefetch), .ngrp = 1, .umasks = nhm_l1d_prefetch, }, { .name = "L1D_WB_L2", .desc = "L1 writebacks to L2", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x28, .numasks = LIBPFM_ARRAY_SIZE(nhm_l1d_wb_l2), .ngrp = 1, .umasks = nhm_l1d_wb_l2, }, { .name = "L1I", .desc = "L1I instruction fetches", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x80, .numasks = LIBPFM_ARRAY_SIZE(nhm_l1i), .ngrp = 1, .umasks = nhm_l1i, }, { .name = "L1I_OPPORTUNISTIC_HITS", .desc = "Opportunistic hits in streaming", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x183, }, { .name = "L2_DATA_RQSTS", .desc = "L2 data requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x26, .numasks = LIBPFM_ARRAY_SIZE(nhm_l2_data_rqsts), .ngrp = 1, .umasks = nhm_l2_data_rqsts, }, { .name = "L2_HW_PREFETCH", .desc = "L2 HW prefetches", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xf3, .numasks = LIBPFM_ARRAY_SIZE(nhm_l2_hw_prefetch), .ngrp = 1, .umasks = 
nhm_l2_hw_prefetch, }, { .name = "L2_LINES_IN", .desc = "L2 lines allocated", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xf1, .numasks = LIBPFM_ARRAY_SIZE(nhm_l2_lines_in), .ngrp = 1, .umasks = nhm_l2_lines_in, }, { .name = "L2_LINES_OUT", .desc = "L2 lines evicted", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xf2, .numasks = LIBPFM_ARRAY_SIZE(nhm_l2_lines_out), .ngrp = 1, .umasks = nhm_l2_lines_out, }, { .name = "L2_RQSTS", .desc = "L2 requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x24, .numasks = LIBPFM_ARRAY_SIZE(nhm_l2_rqsts), .ngrp = 1, .umasks = nhm_l2_rqsts, }, { .name = "L2_TRANSACTIONS", .desc = "L2 transactions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xf0, .numasks = LIBPFM_ARRAY_SIZE(nhm_l2_transactions), .ngrp = 1, .umasks = nhm_l2_transactions, }, { .name = "L2_WRITE", .desc = "L2 demand lock/store RFO", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x27, .numasks = LIBPFM_ARRAY_SIZE(nhm_l2_write), .ngrp = 1, .umasks = nhm_l2_write, }, { .name = "LARGE_ITLB", .desc = "Large instruction TLB", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x82, .numasks = LIBPFM_ARRAY_SIZE(nhm_large_itlb), .ngrp = 1, .umasks = nhm_large_itlb, }, { .name = "LOAD_DISPATCH", .desc = "Loads dispatched", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x13, .numasks = LIBPFM_ARRAY_SIZE(nhm_load_dispatch), .ngrp = 1, .umasks = nhm_load_dispatch, }, { .name = "LOAD_HIT_PRE", .desc = "Load operations conflicting with software prefetches", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x14c, }, { .name = "LONGEST_LAT_CACHE", .desc = "Longest latency cache reference", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x2e, .numasks = LIBPFM_ARRAY_SIZE(nhm_longest_lat_cache), .ngrp = 1, .umasks = nhm_longest_lat_cache, }, { .name = "LSD", .desc = "Loop stream detector", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xa8, .numasks = LIBPFM_ARRAY_SIZE(nhm_lsd), .ngrp = 1, .umasks = nhm_lsd, }, { .name = "MACHINE_CLEARS", .desc = 
"Machine Clear", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xc3, .numasks = LIBPFM_ARRAY_SIZE(nhm_machine_clears), .ngrp = 1, .umasks = nhm_machine_clears, }, { .name = "MACRO_INSTS", .desc = "Macro-fused instructions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xd0, .numasks = LIBPFM_ARRAY_SIZE(nhm_macro_insts), .ngrp = 1, .umasks = nhm_macro_insts, }, { .name = "MEMORY_DISAMBIGUATION", .desc = "Memory Disambiguation Activity", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x9, .numasks = LIBPFM_ARRAY_SIZE(nhm_memory_disambiguation), .ngrp = 1, .umasks = nhm_memory_disambiguation, }, { .name = "MEM_INST_RETIRED", .desc = "Memory instructions retired", .modmsk = INTEL_V3_ATTRS | _INTEL_X86_ATTR_LDLAT, .cntmsk = 0xf, .code = 0xb, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(nhm_mem_inst_retired), .ngrp = 1, .umasks = nhm_mem_inst_retired, }, { .name = "MEM_LOAD_RETIRED", .desc = "Retired loads", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xcb, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(nhm_mem_load_retired), .ngrp = 1, .umasks = nhm_mem_load_retired, }, { .name = "MEM_STORE_RETIRED", .desc = "Retired stores", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xc, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(nhm_mem_store_retired), .ngrp = 1, .umasks = nhm_mem_store_retired, }, { .name = "MEM_UNCORE_RETIRED", .desc = "Load instructions retired which hit offcore", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xf, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(nhm_mem_uncore_retired), .ngrp = 1, .umasks = nhm_mem_uncore_retired, }, { .name = "OFFCORE_REQUESTS", .desc = "Offcore memory requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xb0, .numasks = LIBPFM_ARRAY_SIZE(nhm_offcore_requests), .ngrp = 1, .umasks = nhm_offcore_requests, }, { .name = "OFFCORE_REQUESTS_SQ_FULL", .desc = "Counts cycles the Offcore Request buffer or Super Queue is full.", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1b2, 
}, { .name = "PARTIAL_ADDRESS_ALIAS", .desc = "False dependencies due to partial address forming", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x107, }, { .name = "PIC_ACCESSES", .desc = "Programmable interrupt controller", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xba, .numasks = LIBPFM_ARRAY_SIZE(nhm_pic_accesses), .ngrp = 1, .umasks = nhm_pic_accesses, }, { .name = "RAT_STALLS", .desc = "Register allocation table stalls", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xd2, .numasks = LIBPFM_ARRAY_SIZE(nhm_rat_stalls), .ngrp = 1, .umasks = nhm_rat_stalls, }, { .name = "RESOURCE_STALLS", .desc = "Processor stalls", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xa2, .numasks = LIBPFM_ARRAY_SIZE(nhm_resource_stalls), .ngrp = 1, .umasks = nhm_resource_stalls, }, { .name = "SEG_RENAME_STALLS", .desc = "Segment rename stall cycles", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1d4, }, { .name = "SEGMENT_REG_LOADS", .desc = "Counts number of segment register loads", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1f8, }, { .name = "SIMD_INT_128", .desc = "128 bit SIMD integer operations", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x12, .numasks = LIBPFM_ARRAY_SIZE(nhm_simd_int_128), .ngrp = 1, .umasks = nhm_simd_int_128, }, { .name = "SIMD_INT_64", .desc = "64 bit SIMD integer operations", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xfd, .numasks = LIBPFM_ARRAY_SIZE(nhm_simd_int_64), .ngrp = 1, .umasks = nhm_simd_int_64, }, { .name = "SNOOP_RESPONSE", .desc = "Snoop", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xb8, .numasks = LIBPFM_ARRAY_SIZE(nhm_snoop_response), .ngrp = 1, .umasks = nhm_snoop_response, }, { .name = "SQ_FULL_STALL_CYCLES", .desc = "Counts cycles the Offcore Request buffer or Super Queue is full and request(s) are outstanding.", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1f6, }, { .name = "SQ_MISC", .desc = "Super Queue Activity Related to L2 Cache Access", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, 
.code = 0xf4, .numasks = LIBPFM_ARRAY_SIZE(nhm_sq_misc), .ngrp = 1, .umasks = nhm_sq_misc, }, { .name = "SSE_MEM_EXEC", .desc = "Streaming SIMD executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x4b, .numasks = LIBPFM_ARRAY_SIZE(nhm_sse_mem_exec), .ngrp = 1, .umasks = nhm_sse_mem_exec, }, { .name = "SSEX_UOPS_RETIRED", .desc = "SIMD micro-ops retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xc7, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(nhm_ssex_uops_retired), .ngrp = 1, .umasks = nhm_ssex_uops_retired, }, { .name = "STORE_BLOCKS", .desc = "Delayed loads", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x6, .numasks = LIBPFM_ARRAY_SIZE(nhm_store_blocks), .ngrp = 1, .umasks = nhm_store_blocks, }, { .name = "TWO_UOP_INSTS_DECODED", .desc = "Two micro-ops instructions decoded", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x119, }, { .name = "UOPS_DECODED_DEC0", .desc = "Micro-ops decoded by decoder 0", .modmsk =0x0, .cntmsk = 0xf, .code = 0x13d, }, { .name = "UOPS_DECODED", .desc = "Micro-ops decoded", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xd1, .numasks = LIBPFM_ARRAY_SIZE(nhm_uops_decoded), .ngrp = 1, .umasks = nhm_uops_decoded, }, { .name = "UOPS_EXECUTED", .desc = "Micro-ops executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xb1, .numasks = LIBPFM_ARRAY_SIZE(nhm_uops_executed), .ngrp = 1, .umasks = nhm_uops_executed, }, { .name = "UOPS_ISSUED", .desc = "Micro-ops issued", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xe, .numasks = LIBPFM_ARRAY_SIZE(nhm_uops_issued), .ngrp = 1, .umasks = nhm_uops_issued, }, { .name = "UOPS_RETIRED", .desc = "Micro-ops retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xc2, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(nhm_uops_retired), .ngrp = 1, .umasks = nhm_uops_retired, }, { .name = "UOP_UNFUSION", .desc = "Micro-ops unfusions due to FP exceptions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1db, }, { .name = "OFFCORE_RESPONSE_0", .desc = 
"Offcore response 0 (must provide at least one request and one response umasks)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1b7, .flags= INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(nhm_offcore_response_0), .ngrp = 2, .umasks = nhm_offcore_response_0, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_nhm_unc_events.h000066400000000000000000001047771502707512200244350ustar00rootroot00000000000000/* * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. 
* * PMU: nhm_unc (Intel Nehalem uncore) */ static const intel_x86_umask_t nhm_unc_unc_dram_open[]={ { .uname = "CH0", .udesc = "DRAM Channel 0 open commands issued for read or write", .ucode = 0x100, }, { .uname = "CH1", .udesc = "DRAM Channel 1 open commands issued for read or write", .ucode = 0x200, }, { .uname = "CH2", .udesc = "DRAM Channel 2 open commands issued for read or write", .ucode = 0x400, }, }; static const intel_x86_umask_t nhm_unc_unc_dram_page_close[]={ { .uname = "CH0", .udesc = "DRAM Channel 0 page close", .ucode = 0x100, }, { .uname = "CH1", .udesc = "DRAM Channel 1 page close", .ucode = 0x200, }, { .uname = "CH2", .udesc = "DRAM Channel 2 page close", .ucode = 0x400, }, }; static const intel_x86_umask_t nhm_unc_unc_dram_page_miss[]={ { .uname = "CH0", .udesc = "DRAM Channel 0 page miss", .ucode = 0x100, }, { .uname = "CH1", .udesc = "DRAM Channel 1 page miss", .ucode = 0x200, }, { .uname = "CH2", .udesc = "DRAM Channel 2 page miss", .ucode = 0x400, }, }; static const intel_x86_umask_t nhm_unc_unc_dram_pre_all[]={ { .uname = "CH0", .udesc = "DRAM Channel 0 precharge all commands", .ucode = 0x100, }, { .uname = "CH1", .udesc = "DRAM Channel 1 precharge all commands", .ucode = 0x200, }, { .uname = "CH2", .udesc = "DRAM Channel 2 precharge all commands", .ucode = 0x400, }, }; static const intel_x86_umask_t nhm_unc_unc_dram_read_cas[]={ { .uname = "CH0", .udesc = "DRAM Channel 0 read CAS commands", .ucode = 0x100, }, { .uname = "AUTOPRE_CH0", .udesc = "DRAM Channel 0 read CAS auto page close commands", .ucode = 0x200, }, { .uname = "CH1", .udesc = "DRAM Channel 1 read CAS commands", .ucode = 0x400, }, { .uname = "AUTOPRE_CH1", .udesc = "DRAM Channel 1 read CAS auto page close commands", .ucode = 0x800, }, { .uname = "CH2", .udesc = "DRAM Channel 2 read CAS commands", .ucode = 0x1000, }, { .uname = "AUTOPRE_CH2", .udesc = "DRAM Channel 2 read CAS auto page close commands", .ucode = 0x2000, }, }; static const intel_x86_umask_t 
nhm_unc_unc_dram_refresh[]={ { .uname = "CH0", .udesc = "DRAM Channel 0 refresh commands", .ucode = 0x100, }, { .uname = "CH1", .udesc = "DRAM Channel 1 refresh commands", .ucode = 0x200, }, { .uname = "CH2", .udesc = "DRAM Channel 2 refresh commands", .ucode = 0x400, }, }; static const intel_x86_umask_t nhm_unc_unc_dram_write_cas[]={ { .uname = "CH0", .udesc = "DRAM Channel 0 write CAS commands", .ucode = 0x100, }, { .uname = "AUTOPRE_CH0", .udesc = "DRAM Channel 0 write CAS auto page close commands", .ucode = 0x200, }, { .uname = "CH1", .udesc = "DRAM Channel 1 write CAS commands", .ucode = 0x400, }, { .uname = "AUTOPRE_CH1", .udesc = "DRAM Channel 1 write CAS auto page close commands", .ucode = 0x800, }, { .uname = "CH2", .udesc = "DRAM Channel 2 write CAS commands", .ucode = 0x1000, }, { .uname = "AUTOPRE_CH2", .udesc = "DRAM Channel 2 write CAS auto page close commands", .ucode = 0x2000, }, }; static const intel_x86_umask_t nhm_unc_unc_gq_alloc[]={ { .uname = "READ_TRACKER", .udesc = "GQ read tracker requests", .ucode = 0x100, }, { .uname = "RT_LLC_MISS", .udesc = "GQ read tracker LLC misses", .ucode = 0x200, }, { .uname = "RT_TO_LLC_RESP", .udesc = "GQ read tracker LLC requests", .ucode = 0x400, }, { .uname = "RT_TO_RTID_ACQUIRED", .udesc = "GQ read tracker LLC miss to RTID acquired", .ucode = 0x800, }, { .uname = "WT_TO_RTID_ACQUIRED", .udesc = "GQ write tracker LLC miss to RTID acquired", .ucode = 0x1000, }, { .uname = "WRITE_TRACKER", .udesc = "GQ write tracker LLC misses", .ucode = 0x2000, }, { .uname = "PEER_PROBE_TRACKER", .udesc = "GQ peer probe tracker requests", .ucode = 0x4000, }, }; static const intel_x86_umask_t nhm_unc_unc_gq_cycles_full[]={ { .uname = "READ_TRACKER", .udesc = "Cycles GQ read tracker is full.", .ucode = 0x100, }, { .uname = "WRITE_TRACKER", .udesc = "Cycles GQ write tracker is full.", .ucode = 0x200, }, { .uname = "PEER_PROBE_TRACKER", .udesc = "Cycles GQ peer probe tracker is full.", .ucode = 0x400, }, }; static const 
intel_x86_umask_t nhm_unc_unc_gq_cycles_not_empty[]={ { .uname = "READ_TRACKER", .udesc = "Cycles GQ read tracker is busy", .ucode = 0x100, }, { .uname = "WRITE_TRACKER", .udesc = "Cycles GQ write tracker is busy", .ucode = 0x200, }, { .uname = "PEER_PROBE_TRACKER", .udesc = "Cycles GQ peer probe tracker is busy", .ucode = 0x400, }, }; static const intel_x86_umask_t nhm_unc_unc_gq_data_from[]={ { .uname = "QPI", .udesc = "Cycles GQ data is imported from Quickpath interface", .ucode = 0x100, }, { .uname = "QMC", .udesc = "Cycles GQ data is imported from Quickpath memory interface", .ucode = 0x200, }, { .uname = "LLC", .udesc = "Cycles GQ data is imported from LLC", .ucode = 0x400, }, { .uname = "CORES_02", .udesc = "Cycles GQ data is imported from Cores 0 and 2", .ucode = 0x800, }, { .uname = "CORES_13", .udesc = "Cycles GQ data is imported from Cores 1 and 3", .ucode = 0x1000, }, }; static const intel_x86_umask_t nhm_unc_unc_gq_data_to[]={ { .uname = "QPI_QMC", .udesc = "Cycles GQ data sent to the QPI or QMC", .ucode = 0x100, }, { .uname = "LLC", .udesc = "Cycles GQ data sent to LLC", .ucode = 0x200, }, { .uname = "CORES", .udesc = "Cycles GQ data sent to cores", .ucode = 0x400, }, }; static const intel_x86_umask_t nhm_unc_unc_llc_hits[]={ { .uname = "READ", .udesc = "Number of LLC read hits", .ucode = 0x100, }, { .uname = "WRITE", .udesc = "Number of LLC write hits", .ucode = 0x200, }, { .uname = "PROBE", .udesc = "Number of LLC peer probe hits", .ucode = 0x400, }, { .uname = "ANY", .udesc = "Number of LLC hits", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_unc_unc_llc_lines_in[]={ { .uname = "M_STATE", .udesc = "LLC lines allocated in M state", .ucode = 0x100, }, { .uname = "E_STATE", .udesc = "LLC lines allocated in E state", .ucode = 0x200, }, { .uname = "S_STATE", .udesc = "LLC lines allocated in S state", .ucode = 0x400, }, { .uname = "F_STATE", .udesc = "LLC lines allocated in F state", .ucode = 0x800, 
}, { .uname = "ANY", .udesc = "LLC lines allocated", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_unc_unc_llc_lines_out[]={ { .uname = "M_STATE", .udesc = "LLC lines victimized in M state", .ucode = 0x100, }, { .uname = "E_STATE", .udesc = "LLC lines victimized in E state", .ucode = 0x200, }, { .uname = "S_STATE", .udesc = "LLC lines victimized in S state", .ucode = 0x400, }, { .uname = "I_STATE", .udesc = "LLC lines victimized in I state", .ucode = 0x800, }, { .uname = "F_STATE", .udesc = "LLC lines victimized in F state", .ucode = 0x1000, }, { .uname = "ANY", .udesc = "LLC lines victimized", .ucode = 0x1f00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_unc_unc_llc_miss[]={ { .uname = "READ", .udesc = "Number of LLC read misses", .ucode = 0x100, }, { .uname = "WRITE", .udesc = "Number of LLC write misses", .ucode = 0x200, }, { .uname = "PROBE", .udesc = "Number of LLC peer probe misses", .ucode = 0x400, }, { .uname = "ANY", .udesc = "Number of LLC misses", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_unc_unc_qhl_address_conflicts[]={ { .uname = "2WAY", .udesc = "QHL 2 way address conflicts", .ucode = 0x200, }, { .uname = "3WAY", .udesc = "QHL 3 way address conflicts", .ucode = 0x400, }, }; static const intel_x86_umask_t nhm_unc_unc_qhl_conflict_cycles[]={ { .uname = "IOH", .udesc = "QHL IOH Tracker conflict cycles", .ucode = 0x100, }, { .uname = "REMOTE", .udesc = "QHL Remote Tracker conflict cycles", .ucode = 0x200, }, { .uname = "LOCAL", .udesc = "QHL Local Tracker conflict cycles", .ucode = 0x400, }, }; static const intel_x86_umask_t nhm_unc_unc_qhl_cycles_full[]={ { .uname = "REMOTE", .udesc = "Cycles QHL Remote Tracker is full", .ucode = 0x200, }, { .uname = "LOCAL", .udesc = "Cycles QHL Local Tracker is full", .ucode = 0x400, }, { .uname = "IOH", .udesc = "Cycles QHL IOH Tracker is full", .ucode = 0x100, }, }; 
static const intel_x86_umask_t nhm_unc_unc_qhl_cycles_not_empty[]={ { .uname = "IOH", .udesc = "Cycles QHL IOH is busy", .ucode = 0x100, }, { .uname = "REMOTE", .udesc = "Cycles QHL Remote Tracker is busy", .ucode = 0x200, }, { .uname = "LOCAL", .udesc = "Cycles QHL Local Tracker is busy", .ucode = 0x400, }, }; static const intel_x86_umask_t nhm_unc_unc_qhl_frc_ack_cnflts[]={ { .uname = "LOCAL", .udesc = "QHL FrcAckCnflts sent to local home", .ucode = 0x400, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_unc_unc_qhl_occupancy[]={ { .uname = "IOH", .udesc = "Cycles QHL IOH Tracker Allocate to Deallocate Read Occupancy", .ucode = 0x100, }, { .uname = "REMOTE", .udesc = "Cycles QHL Remote Tracker Allocate to Deallocate Read Occupancy", .ucode = 0x200, }, { .uname = "LOCAL", .udesc = "Cycles QHL Local Tracker Allocate to Deallocate Read Occupancy", .ucode = 0x400, }, }; static const intel_x86_umask_t nhm_unc_unc_qhl_requests[]={ { .uname = "LOCAL_READS", .udesc = "Quickpath Home Logic local read requests", .ucode = 0x1000, }, { .uname = "LOCAL_WRITES", .udesc = "Quickpath Home Logic local write requests", .ucode = 0x2000, }, { .uname = "REMOTE_READS", .udesc = "Quickpath Home Logic remote read requests", .ucode = 0x400, }, { .uname = "IOH_READS", .udesc = "Quickpath Home Logic IOH read requests", .ucode = 0x100, }, { .uname = "IOH_WRITES", .udesc = "Quickpath Home Logic IOH write requests", .ucode = 0x200, }, { .uname = "REMOTE_WRITES", .udesc = "Quickpath Home Logic remote write requests", .ucode = 0x800, }, }; static const intel_x86_umask_t nhm_unc_unc_qmc_busy[]={ { .uname = "READ_CH0", .udesc = "Cycles QMC channel 0 busy with a read request", .ucode = 0x100, }, { .uname = "READ_CH1", .udesc = "Cycles QMC channel 1 busy with a read request", .ucode = 0x200, }, { .uname = "READ_CH2", .udesc = "Cycles QMC channel 2 busy with a read request", .ucode = 0x400, }, { .uname = "WRITE_CH0", .udesc = "Cycles QMC channel 0 busy with a write request", .ucode = 
0x800, }, { .uname = "WRITE_CH1", .udesc = "Cycles QMC channel 1 busy with a write request", .ucode = 0x1000, }, { .uname = "WRITE_CH2", .udesc = "Cycles QMC channel 2 busy with a write request", .ucode = 0x2000, }, }; static const intel_x86_umask_t nhm_unc_unc_qmc_cancel[]={ { .uname = "CH0", .udesc = "QMC channel 0 cancels", .ucode = 0x100, }, { .uname = "CH1", .udesc = "QMC channel 1 cancels", .ucode = 0x200, }, { .uname = "CH2", .udesc = "QMC channel 2 cancels", .ucode = 0x400, }, { .uname = "ANY", .udesc = "QMC cancels", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_unc_unc_qmc_critical_priority_reads[]={ { .uname = "CH0", .udesc = "QMC channel 0 critical priority read requests", .ucode = 0x100, }, { .uname = "CH1", .udesc = "QMC channel 1 critical priority read requests", .ucode = 0x200, }, { .uname = "CH2", .udesc = "QMC channel 2 critical priority read requests", .ucode = 0x400, }, { .uname = "ANY", .udesc = "QMC critical priority read requests", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_unc_unc_qmc_high_priority_reads[]={ { .uname = "CH0", .udesc = "QMC channel 0 high priority read requests", .ucode = 0x100, }, { .uname = "CH1", .udesc = "QMC channel 1 high priority read requests", .ucode = 0x200, }, { .uname = "CH2", .udesc = "QMC channel 2 high priority read requests", .ucode = 0x400, }, { .uname = "ANY", .udesc = "QMC high priority read requests", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_unc_unc_qmc_isoc_full[]={ { .uname = "READ_CH0", .udesc = "Cycles DRAM channel 0 full with isochronous read requests", .ucode = 0x100, }, { .uname = "READ_CH1", .udesc = "Cycles DRAM channel 1 full with isochronous read requests", .ucode = 0x200, }, { .uname = "READ_CH2", .udesc = "Cycles DRAM channel 2 full with isochronous read requests", .ucode = 0x400, }, { .uname = "WRITE_CH0", .udesc = "Cycles 
DRAM channel 0 full with isochronous write requests", .ucode = 0x800, }, { .uname = "WRITE_CH1", .udesc = "Cycles DRAM channel 1 full with isochronous write requests", .ucode = 0x1000, }, { .uname = "WRITE_CH2", .udesc = "Cycles DRAM channel 2 full with isochronous write requests", .ucode = 0x2000, }, }; static const intel_x86_umask_t nhm_unc_unc_imc_isoc_occupancy[]={ { .uname = "CH0", .udesc = "IMC channel 0 isochronous read request occupancy", .ucode = 0x100, }, { .uname = "CH1", .udesc = "IMC channel 1 isochronous read request occupancy", .ucode = 0x200, }, { .uname = "CH2", .udesc = "IMC channel 2 isochronous read request occupancy", .ucode = 0x400, }, { .uname = "ANY", .udesc = "IMC isochronous read request occupancy", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_unc_unc_qmc_normal_full[]={ { .uname = "READ_CH0", .udesc = "Cycles DRAM channel 0 full with normal read requests", .ucode = 0x100, }, { .uname = "READ_CH1", .udesc = "Cycles DRAM channel 1 full with normal read requests", .ucode = 0x200, }, { .uname = "READ_CH2", .udesc = "Cycles DRAM channel 2 full with normal read requests", .ucode = 0x400, }, { .uname = "WRITE_CH0", .udesc = "Cycles DRAM channel 0 full with normal write requests", .ucode = 0x800, }, { .uname = "WRITE_CH1", .udesc = "Cycles DRAM channel 1 full with normal write requests", .ucode = 0x1000, }, { .uname = "WRITE_CH2", .udesc = "Cycles DRAM channel 2 full with normal write requests", .ucode = 0x2000, }, }; static const intel_x86_umask_t nhm_unc_unc_qmc_normal_reads[]={ { .uname = "CH0", .udesc = "QMC channel 0 normal read requests", .ucode = 0x100, }, { .uname = "CH1", .udesc = "QMC channel 1 normal read requests", .ucode = 0x200, }, { .uname = "CH2", .udesc = "QMC channel 2 normal read requests", .ucode = 0x400, }, { .uname = "ANY", .udesc = "QMC normal read requests", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t 
nhm_unc_unc_qmc_occupancy[]={ { .uname = "CH0", .udesc = "IMC channel 0 normal read request occupancy", .ucode = 0x100, }, { .uname = "CH1", .udesc = "IMC channel 1 normal read request occupancy", .ucode = 0x200, }, { .uname = "CH2", .udesc = "IMC channel 2 normal read request occupancy", .ucode = 0x400, }, }; static const intel_x86_umask_t nhm_unc_unc_qmc_priority_updates[]={ { .uname = "CH0", .udesc = "QMC channel 0 priority updates", .ucode = 0x100, }, { .uname = "CH1", .udesc = "QMC channel 1 priority updates", .ucode = 0x200, }, { .uname = "CH2", .udesc = "QMC channel 2 priority updates", .ucode = 0x400, }, { .uname = "ANY", .udesc = "QMC priority updates", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t nhm_unc_unc_qmc_writes[]={ { .uname = "FULL_CH0", .udesc = "QMC channel 0 full cache line writes", .ucode = 0x100, .grpid = 0, }, { .uname = "FULL_CH1", .udesc = "QMC channel 1 full cache line writes", .ucode = 0x200, .grpid = 0, }, { .uname = "FULL_CH2", .udesc = "QMC channel 2 full cache line writes", .ucode = 0x400, .grpid = 0, }, { .uname = "FULL_ANY", .udesc = "QMC full cache line writes", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "PARTIAL_CH0", .udesc = "QMC channel 0 partial cache line writes", .ucode = 0x800, .grpid = 1, }, { .uname = "PARTIAL_CH1", .udesc = "QMC channel 1 partial cache line writes", .ucode = 0x1000, .grpid = 1, }, { .uname = "PARTIAL_CH2", .udesc = "QMC channel 2 partial cache line writes", .ucode = 0x2000, .grpid = 1, }, { .uname = "PARTIAL_ANY", .udesc = "QMC partial cache line writes", .ucode = 0x3800, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, }; static const intel_x86_umask_t nhm_unc_unc_qpi_rx_no_ppt_credit[]={ { .uname = "STALLS_LINK_0", .udesc = "Link 0 snoop stalls due to no PPT entry", .ucode = 0x100, }, { .uname = "STALLS_LINK_1", .udesc = "Link 1 snoop stalls due to no PPT entry", .ucode = 0x200, }, }; static const 
intel_x86_umask_t nhm_unc_unc_qpi_tx_header[]={ { .uname = "BUSY_LINK_0", .udesc = "Cycles link 0 outbound header busy", .ucode = 0x200, }, { .uname = "BUSY_LINK_1", .udesc = "Cycles link 1 outbound header busy", .ucode = 0x800, }, }; static const intel_x86_umask_t nhm_unc_unc_qpi_tx_stalled_multi_flit[]={ { .uname = "DRS_LINK_0", .udesc = "Cycles QPI outbound link 0 DRS stalled", .ucode = 0x100, }, { .uname = "NCB_LINK_0", .udesc = "Cycles QPI outbound link 0 NCB stalled", .ucode = 0x200, }, { .uname = "NCS_LINK_0", .udesc = "Cycles QPI outbound link 0 NCS stalled", .ucode = 0x400, }, { .uname = "DRS_LINK_1", .udesc = "Cycles QPI outbound link 1 DRS stalled", .ucode = 0x800, }, { .uname = "NCB_LINK_1", .udesc = "Cycles QPI outbound link 1 NCB stalled", .ucode = 0x1000, }, { .uname = "NCS_LINK_1", .udesc = "Cycles QPI outbound link 1 NCS stalled", .ucode = 0x2000, }, { .uname = "LINK_0", .udesc = "Cycles QPI outbound link 0 multi flit stalled", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LINK_1", .udesc = "Cycles QPI outbound link 1 multi flit stalled", .ucode = 0x3800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_unc_unc_qpi_tx_stalled_single_flit[]={ { .uname = "HOME_LINK_0", .udesc = "Cycles QPI outbound link 0 HOME stalled", .ucode = 0x100, }, { .uname = "SNOOP_LINK_0", .udesc = "Cycles QPI outbound link 0 SNOOP stalled", .ucode = 0x200, }, { .uname = "NDR_LINK_0", .udesc = "Cycles QPI outbound link 0 NDR stalled", .ucode = 0x400, }, { .uname = "HOME_LINK_1", .udesc = "Cycles QPI outbound link 1 HOME stalled", .ucode = 0x800, }, { .uname = "SNOOP_LINK_1", .udesc = "Cycles QPI outbound link 1 SNOOP stalled", .ucode = 0x1000, }, { .uname = "NDR_LINK_1", .udesc = "Cycles QPI outbound link 1 NDR stalled", .ucode = 0x2000, }, { .uname = "LINK_0", .udesc = "Cycles QPI outbound link 0 single flit stalled", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LINK_1", .udesc = "Cycles QPI outbound link 1 single flit stalled", 
.ucode = 0x3800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t nhm_unc_unc_snp_resp_to_local_home[]={ { .uname = "I_STATE", .udesc = "Local home snoop response - LLC does not have cache line", .ucode = 0x100, }, { .uname = "S_STATE", .udesc = "Local home snoop response - LLC has cache line in S state", .ucode = 0x200, }, { .uname = "FWD_S_STATE", .udesc = "Local home snoop response - LLC forwarding cache line in S state.", .ucode = 0x400, }, { .uname = "FWD_I_STATE", .udesc = "Local home snoop response - LLC has forwarded a modified cache line", .ucode = 0x800, }, { .uname = "CONFLICT", .udesc = "Local home conflict snoop response", .ucode = 0x1000, }, { .uname = "WB", .udesc = "Local home snoop response - LLC has cache line in the M state", .ucode = 0x2000, }, }; static const intel_x86_umask_t nhm_unc_unc_snp_resp_to_remote_home[]={ { .uname = "I_STATE", .udesc = "Remote home snoop response - LLC does not have cache line", .ucode = 0x100, }, { .uname = "S_STATE", .udesc = "Remote home snoop response - LLC has cache line in S state", .ucode = 0x200, }, { .uname = "FWD_S_STATE", .udesc = "Remote home snoop response - LLC forwarding cache line in S state.", .ucode = 0x400, }, { .uname = "FWD_I_STATE", .udesc = "Remote home snoop response - LLC has forwarded a modified cache line", .ucode = 0x800, }, { .uname = "CONFLICT", .udesc = "Remote home conflict snoop response", .ucode = 0x1000, }, { .uname = "WB", .udesc = "Remote home snoop response - LLC has cache line in the M state", .ucode = 0x2000, }, { .uname = "HITM", .udesc = "Remote home snoop response - LLC HITM", .ucode = 0x2400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_nhm_unc_pe[]={ { .name = "UNC_CLK_UNHALTED", .desc = "Uncore clockticks.", .modmsk =0x0, .cntmsk = 0x100000, .code = 0xff, .flags = INTEL_X86_FIXED, }, { .name = "UNC_DRAM_OPEN", .desc = "DRAM open commands issued for read or write", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x60, .numasks 
= LIBPFM_ARRAY_SIZE(nhm_unc_unc_dram_open), .ngrp = 1, .umasks = nhm_unc_unc_dram_open, }, { .name = "UNC_DRAM_PAGE_CLOSE", .desc = "DRAM page close due to idle timer expiration", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x61, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_dram_page_close), .ngrp = 1, .umasks = nhm_unc_unc_dram_page_close, }, { .name = "UNC_DRAM_PAGE_MISS", .desc = "DRAM page miss", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x62, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_dram_page_miss), .ngrp = 1, .umasks = nhm_unc_unc_dram_page_miss, }, { .name = "UNC_DRAM_PRE_ALL", .desc = "DRAM precharge all commands", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x66, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_dram_pre_all), .ngrp = 1, .umasks = nhm_unc_unc_dram_pre_all, }, { .name = "UNC_DRAM_READ_CAS", .desc = "DRAM read CAS commands", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x63, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_dram_read_cas), .ngrp = 1, .umasks = nhm_unc_unc_dram_read_cas, }, { .name = "UNC_DRAM_REFRESH", .desc = "DRAM refresh commands", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x65, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_dram_refresh), .ngrp = 1, .umasks = nhm_unc_unc_dram_refresh, }, { .name = "UNC_DRAM_WRITE_CAS", .desc = "DRAM write CAS commands", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x64, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_dram_write_cas), .ngrp = 1, .umasks = nhm_unc_unc_dram_write_cas, }, { .name = "UNC_GQ_ALLOC", .desc = "GQ tracker requests", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x3, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_gq_alloc), .ngrp = 1, .umasks = nhm_unc_unc_gq_alloc, }, { .name = "UNC_GQ_CYCLES_FULL", .desc = "Cycles GQ tracker is full", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x0, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_gq_cycles_full), .ngrp = 1,
.umasks = nhm_unc_unc_gq_cycles_full, }, { .name = "UNC_GQ_CYCLES_NOT_EMPTY", .desc = "Cycles GQ tracker is not empty", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x1, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_gq_cycles_not_empty), .ngrp = 1, .umasks = nhm_unc_unc_gq_cycles_not_empty, }, { .name = "UNC_GQ_DATA_FROM", .desc = "Cycles GQ data is imported", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x4, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_gq_data_from), .ngrp = 1, .umasks = nhm_unc_unc_gq_data_from, }, { .name = "UNC_GQ_DATA_TO", .desc = "Cycles GQ data is exported", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x5, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_gq_data_to), .ngrp = 1, .umasks = nhm_unc_unc_gq_data_to, }, { .name = "UNC_LLC_HITS", .desc = "Number of LLC hits", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x8, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_llc_hits), .ngrp = 1, .umasks = nhm_unc_unc_llc_hits, }, { .name = "UNC_LLC_LINES_IN", .desc = "LLC lines allocated", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0xa, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_llc_lines_in), .ngrp = 1, .umasks = nhm_unc_unc_llc_lines_in, }, { .name = "UNC_LLC_LINES_OUT", .desc = "LLC lines victimized", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0xb, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_llc_lines_out), .ngrp = 1, .umasks = nhm_unc_unc_llc_lines_out, }, { .name = "UNC_LLC_MISS", .desc = "Number of LLC misses", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x9, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_llc_miss), .ngrp = 1, .umasks = nhm_unc_unc_llc_miss, }, { .name = "UNC_QHL_ADDRESS_CONFLICTS", .desc = "QHL address conflicts", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x24, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qhl_address_conflicts), .ngrp = 1, .umasks = nhm_unc_unc_qhl_address_conflicts, }, { .name = "UNC_QHL_CONFLICT_CYCLES", .desc = "QHL IOH
Tracker conflict cycles", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x25, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qhl_conflict_cycles), .ngrp = 1, .umasks = nhm_unc_unc_qhl_conflict_cycles, }, { .name = "UNC_QHL_CYCLES_FULL", .desc = "Cycles QHL Tracker is full", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x21, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qhl_cycles_full), .ngrp = 1, .umasks = nhm_unc_unc_qhl_cycles_full, }, { .name = "UNC_QHL_CYCLES_NOT_EMPTY", .desc = "Cycles QHL Tracker is not empty", .modmsk =0x0, .cntmsk = 0x1fe00000, .code = 0x22, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qhl_cycles_not_empty), .ngrp = 1, .umasks = nhm_unc_unc_qhl_cycles_not_empty, }, { .name = "UNC_QHL_FRC_ACK_CNFLTS", .desc = "QHL FrcAckCnflts sent to local home", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x33, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qhl_frc_ack_cnflts), .ngrp = 1, .umasks = nhm_unc_unc_qhl_frc_ack_cnflts, }, { .name = "UNC_QHL_OCCUPANCY", .desc = "Cycles QHL Tracker Allocate to Deallocate Read Occupancy", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x23, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qhl_occupancy), .ngrp = 1, .umasks = nhm_unc_unc_qhl_occupancy, }, { .name = "UNC_QHL_REQUESTS", .desc = "Quickpath Home Logic requests", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x20, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qhl_requests), .ngrp = 1, .umasks = nhm_unc_unc_qhl_requests, }, { .name = "UNC_QHL_TO_QMC_BYPASS", .desc = "Number of requests to QMC that bypass QHL", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x26, }, { .name = "UNC_QMC_BUSY", .desc = "Cycles QMC busy with a request", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x29, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qmc_busy), .ngrp = 1, .umasks = nhm_unc_unc_qmc_busy, }, { .name = "UNC_QMC_CANCEL", .desc = "QMC cancels", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x30, .numasks =
LIBPFM_ARRAY_SIZE(nhm_unc_unc_qmc_cancel), .ngrp = 1, .umasks = nhm_unc_unc_qmc_cancel, }, { .name = "UNC_QMC_CRITICAL_PRIORITY_READS", .desc = "QMC critical priority read requests", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x2e, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qmc_critical_priority_reads), .ngrp = 1, .umasks = nhm_unc_unc_qmc_critical_priority_reads, }, { .name = "UNC_QMC_HIGH_PRIORITY_READS", .desc = "QMC high priority read requests", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x2d, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qmc_high_priority_reads), .ngrp = 1, .umasks = nhm_unc_unc_qmc_high_priority_reads, }, { .name = "UNC_QMC_ISOC_FULL", .desc = "Cycles DRAM full with isochronous (ISOC) requests", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x28, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qmc_isoc_full), .ngrp = 1, .umasks = nhm_unc_unc_qmc_isoc_full, }, { .name = "UNC_IMC_ISOC_OCCUPANCY", .desc = "IMC isochronous (ISOC) Read Occupancy", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x2b, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_imc_isoc_occupancy), .ngrp = 1, .umasks = nhm_unc_unc_imc_isoc_occupancy, }, { .name = "UNC_QMC_NORMAL_FULL", .desc = "Cycles DRAM full with normal requests", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x27, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qmc_normal_full), .ngrp = 1, .umasks = nhm_unc_unc_qmc_normal_full, }, { .name = "UNC_QMC_NORMAL_READS", .desc = "QMC normal read requests", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x2c, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qmc_normal_reads), .ngrp = 1, .umasks = nhm_unc_unc_qmc_normal_reads, }, { .name = "UNC_QMC_OCCUPANCY", .desc = "QMC Occupancy", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x2a, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qmc_occupancy), .ngrp = 1, .umasks = nhm_unc_unc_qmc_occupancy, }, { .name = "UNC_QMC_PRIORITY_UPDATES", .desc = "QMC priority updates", .modmsk =
NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x31, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qmc_priority_updates), .ngrp = 1, .umasks = nhm_unc_unc_qmc_priority_updates, }, { .name = "UNC_QMC_WRITES", .desc = "QMC cache line writes", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x2f, .flags= INTEL_X86_GRP_EXCL, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qmc_writes), .ngrp = 2, .umasks = nhm_unc_unc_qmc_writes, }, { .name = "UNC_QPI_RX_NO_PPT_CREDIT", .desc = "Link 0 snoop stalls due to no PPT entry", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x43, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qpi_rx_no_ppt_credit), .ngrp = 1, .umasks = nhm_unc_unc_qpi_rx_no_ppt_credit, }, { .name = "UNC_QPI_TX_HEADER", .desc = "Cycles link 0 outbound header busy", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x42, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qpi_tx_header), .ngrp = 1, .umasks = nhm_unc_unc_qpi_tx_header, }, { .name = "UNC_QPI_TX_STALLED_MULTI_FLIT", .desc = "Cycles QPI outbound stalls", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x41, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qpi_tx_stalled_multi_flit), .ngrp = 1, .umasks = nhm_unc_unc_qpi_tx_stalled_multi_flit, }, { .name = "UNC_QPI_TX_STALLED_SINGLE_FLIT", .desc = "Cycles QPI outbound link stalls", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x40, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_qpi_tx_stalled_single_flit), .ngrp = 1, .umasks = nhm_unc_unc_qpi_tx_stalled_single_flit, }, { .name = "UNC_SNP_RESP_TO_LOCAL_HOME", .desc = "Local home snoop response", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x6, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_snp_resp_to_local_home), .ngrp = 1, .umasks = nhm_unc_unc_snp_resp_to_local_home, }, { .name = "UNC_SNP_RESP_TO_REMOTE_HOME", .desc = "Remote home snoop response", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x7, .numasks = LIBPFM_ARRAY_SIZE(nhm_unc_unc_snp_resp_to_remote_home), .ngrp = 1, .umasks = 
nhm_unc_unc_snp_resp_to_remote_home, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_p6_events.h000066400000000000000000000610601502707512200233160ustar00rootroot00000000000000/* * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. 
* * PMU: p6 (Intel P6 Processor Family) */ static const intel_x86_umask_t p6_l2_ifetch[]={ { .uname = "I", .udesc = "Invalid state", .ucode = 0x100, }, { .uname = "S", .udesc = "Shared state", .ucode = 0x200, }, { .uname = "E", .udesc = "Exclusive state", .ucode = 0x400, }, { .uname = "M", .udesc = "Modified state", .ucode = 0x800, }, }; static const intel_x86_umask_t p6_bus_drdy_clocks[]={ { .uname = "SELF", .udesc = "Clocks when processor is driving bus", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ANY", .udesc = "Clocks when any agent is driving bus", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t p6_mmx_instr_type_exec[]={ { .uname = "MUL", .udesc = "MMX packed multiply instructions executed", .ucode = 0x100, }, { .uname = "SHIFT", .udesc = "MMX packed shift instructions executed", .ucode = 0x200, }, { .uname = "PACK", .udesc = "MMX pack operation instructions executed", .ucode = 0x400, }, { .uname = "UNPACK", .udesc = "MMX unpack operation instructions executed", .ucode = 0x800, }, { .uname = "LOGICAL", .udesc = "MMX packed logical instructions executed", .ucode = 0x1000, }, { .uname = "ARITH", .udesc = "MMX packed arithmetic instructions executed", .ucode = 0x2000, }, }; static const intel_x86_umask_t p6_fp_mmx_trans[]={ { .uname = "TO_FP", .udesc = "From MMX instructions to floating-point instructions", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TO_MMX", .udesc = "From floating-point instructions to MMX instructions", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t p6_seg_rename_stalls[]={ { .uname = "ES", .udesc = "Segment register ES", .ucode = 0x100, }, { .uname = "DS", .udesc = "Segment register DS", .ucode = 0x200, }, { .uname = "FS", .udesc = "Segment register FS", .ucode = 0x400, }, { .uname = "GS", .udesc = "Segment register GS", .ucode = 0x800, }, }; static const intel_x86_umask_t p6_emon_kni_pref_dispatched[]={ { .uname = "NTA", .udesc = 
"Prefetch NTA", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO, }, { .uname = "T1", .udesc = "Prefetch T1", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "T2", .udesc = "Prefetch T2", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WEAK", .udesc = "Weakly ordered stores", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t p6_emon_kni_inst_retired[]={ { .uname = "PACKED_SCALAR", .udesc = "Packed and scalar instructions", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SCALAR", .udesc = "Scalar only", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_p6_pe[]={ { .name = "CPU_CLK_UNHALTED", .desc = "Number cycles during which the processor is not halted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x79, }, { .name = "INST_RETIRED", .desc = "Number of instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc0, }, { .name = "DATA_MEM_REFS", .desc = "All loads from any memory type. All stores to any memory typeEach part of a split is counted separately. The internal logic counts not only memory loads and stores but also internal retries. 80-bit floating point accesses are double counted, since they are decomposed into a 16-bit exponent load and a 64-bit mantissa load. Memory accesses are only counted when they are actually performed (such as a load that gets squashed because a previous cache miss is outstanding to the same address, and which finally gets performed, is only counted once). Does not include I/O accesses or other non-memory accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x43, }, { .name = "DCU_LINES_IN", .desc = "Total lines allocated in the DCU", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x45, }, { .name = "DCU_M_LINES_IN", .desc = "Number of M state lines allocated in the DCU", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x46, }, { .name = "DCU_M_LINES_OUT", .desc = "Number of M state lines evicted from the DCU. 
This includes evictions via snoop HITM, intervention or replacement", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x47, }, { .name = "DCU_MISS_OUTSTANDING", .desc = "Weighted number of cycle while a DCU miss is outstanding, incremented by the number of cache misses at any particular time. Cacheable read requests only are considered. Uncacheable requests are excluded Read-for-ownerships are counted, as well as line fills, invalidates, and stores", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x48, }, { .name = "IFU_IFETCH", .desc = "Number of instruction fetches, both cacheable and noncacheable including UC fetches", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x80, }, { .name = "IFU_IFETCH_MISS", .desc = "Number of instruction fetch misses. All instructions fetches that do not hit the IFU (i.e., that produce memory requests). Includes UC accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x81, }, { .name = "ITLB_MISS", .desc = "Number of ITLB misses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x85, }, { .name = "IFU_MEM_STALL", .desc = "Number of cycles instruction fetch is stalled for any reason. Includes IFU cache misses, ITLB misses, ITLB faults, and other minor stalls", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x86, }, { .name = "ILD_STALL", .desc = "Number of cycles that the instruction length decoder is stalled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x87, }, { .name = "L2_IFETCH", .desc = "Number of L2 instruction fetches. This event indicates that a normal instruction fetch was received by the L2. The count includes only L2 cacheable instruction fetches: it does not include UC instruction fetches It does not include ITLB miss accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x28, .numasks = LIBPFM_ARRAY_SIZE(p6_l2_ifetch), .ngrp = 1, .umasks = p6_l2_ifetch, }, { .name = "L2_ST", .desc = "Number of L2 data stores. 
This event indicates that a normal, unlocked, store memory access was received by the L2. Specifically, it indicates that the DCU sent a read-for ownership request to the L2. It also includes Invalid to Modified requests sent by the DCU to the L2. It includes only L2 cacheable memory accesses; it does not include I/O accesses, other non-memory accesses, or memory accesses such as UC/WT memory accesses. It does include L2 cacheable TLB miss memory accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x2a, .numasks = LIBPFM_ARRAY_SIZE(p6_l2_ifetch), .ngrp = 1, .umasks = p6_l2_ifetch, /* identical to actual umasks list for this event */ }, { .name = "L2_M_LINES_INM", .desc = "Number of modified lines allocated in the L2", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x25, }, { .name = "L2_RQSTS", .desc = "Total number of L2 requests", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x2e, .numasks = LIBPFM_ARRAY_SIZE(p6_l2_ifetch), .ngrp = 1, .umasks = p6_l2_ifetch, /* identical to actual umasks list for this event */ }, { .name = "L2_ADS", .desc = "Number of L2 address strobes", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x21, }, { .name = "L2_DBUS_BUSY", .desc = "Number of cycles during which the L2 cache data bus was busy", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x22, }, { .name = "L2_DBUS_BUSY_RD", .desc = "Number of cycles during which the data bus was busy transferring read data from L2 to the processor", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x23, }, { .name = "BUS_DRDY_CLOCKS", .desc = "Number of clocks during which DRDY# is asserted. 
Utilization of the external system data bus during data transfers", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x62, .numasks = LIBPFM_ARRAY_SIZE(p6_bus_drdy_clocks), .ngrp = 1, .umasks = p6_bus_drdy_clocks, }, { .name = "BUS_LOCK_CLOCKS", .desc = "Number of clocks during which LOCK# is asserted on the external system bus", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x63, .numasks = LIBPFM_ARRAY_SIZE(p6_bus_drdy_clocks), .ngrp = 1, .umasks = p6_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_REQ_OUTSTANDING", .desc = "Number of bus requests outstanding. This counter is incremented by the number of cacheable read bus requests outstanding in any given cycle", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x60, }, { .name = "BUS_TRANS_BRD", .desc = "Number of burst read transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x65, .numasks = LIBPFM_ARRAY_SIZE(p6_bus_drdy_clocks), .ngrp = 1, .umasks = p6_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_RFO", .desc = "Number of completed read for ownership transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x66, .numasks = LIBPFM_ARRAY_SIZE(p6_bus_drdy_clocks), .ngrp = 1, .umasks = p6_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_WB", .desc = "Number of completed write back transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x67, .numasks = LIBPFM_ARRAY_SIZE(p6_bus_drdy_clocks), .ngrp = 1, .umasks = p6_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_IFETCH", .desc = "Number of completed instruction fetch transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x68, .numasks = LIBPFM_ARRAY_SIZE(p6_bus_drdy_clocks), .ngrp = 1, .umasks = p6_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_INVAL", .desc = "Number of completed invalidate 
transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x69, .numasks = LIBPFM_ARRAY_SIZE(p6_bus_drdy_clocks), .ngrp = 1, .umasks = p6_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_PWR", .desc = "Number of completed partial write transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6a, .numasks = LIBPFM_ARRAY_SIZE(p6_bus_drdy_clocks), .ngrp = 1, .umasks = p6_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_P", .desc = "Number of completed partial transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6b, .numasks = LIBPFM_ARRAY_SIZE(p6_bus_drdy_clocks), .ngrp = 1, .umasks = p6_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_IO", .desc = "Number of completed I/O transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6c, .numasks = LIBPFM_ARRAY_SIZE(p6_bus_drdy_clocks), .ngrp = 1, .umasks = p6_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_DEF", .desc = "Number of completed deferred transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6d, .numasks = LIBPFM_ARRAY_SIZE(p6_bus_drdy_clocks), .ngrp = 1, .umasks = p6_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_BURST", .desc = "Number of completed burst transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6e, .numasks = LIBPFM_ARRAY_SIZE(p6_bus_drdy_clocks), .ngrp = 1, .umasks = p6_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_ANY", .desc = "Number of all completed bus transactions. Address bus utilization can be calculated knowing the minimum address bus occupancy. 
Includes special cycles, etc.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x70, .numasks = LIBPFM_ARRAY_SIZE(p6_bus_drdy_clocks), .ngrp = 1, .umasks = p6_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_MEM", .desc = "Number of completed memory transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6f, .numasks = LIBPFM_ARRAY_SIZE(p6_bus_drdy_clocks), .ngrp = 1, .umasks = p6_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_DATA_RECV", .desc = "Number of bus clock cycles during which this processor is receiving data", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x64, }, { .name = "BUS_BNR_DRV", .desc = "Number of bus clock cycles during which this processor is driving the BNR# pin", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x61, }, { .name = "BUS_HIT_DRV", .desc = "Number of bus clock cycles during which this processor is driving the HIT# pin", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7a, }, { .name = "BUS_HITM_DRV", .desc = "Number of bus clock cycles during which this processor is driving the HITM# pin", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7b, }, { .name = "BUS_SNOOP_STALL", .desc = "Number of clock cycles during which the bus is snoop stalled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7e, }, { .name = "FLOPS", .desc = "Number of computational floating-point operations retired. Excludes floating-point computational operations that cause traps or assists. Includes internal sub-operations for complex floating-point instructions like transcendentals. Excludes floating point loads and stores", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0xc1, }, { .name = "FP_COMP_OPS_EXE", .desc = "Number of computational floating-point operations executed. The number of FADD, FSUB, FCOM, FMULs, integer MULs and IMULs, FDIVs, FPREMs, FSQRTS, integer DIVs, and IDIVs. 
This number does not include the number of cycles, but the number of operations. This event does not distinguish an FADD used in the middle of a transcendental flow from a separate FADD instruction", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0x10, }, { .name = "FP_ASSIST", .desc = "Number of floating-point exception cases handled by microcode.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x11, }, { .name = "MUL", .desc = "Number of multiplies.This count includes integer as well as FP multiplies and is speculative", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x12, }, { .name = "DIV", .desc = "Number of divides.This count includes integer as well as FP divides and is speculative", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x13, }, { .name = "CYCLES_DIV_BUSY", .desc = "Number of cycles during which the divider is busy, and cannot accept new divides. This includes integer and FP divides, FPREM, FPSQRT, etc. and is speculative", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0x14, }, { .name = "LD_BLOCKS", .desc = "Number of load operations delayed due to store buffer blocks. Includes counts caused by preceding stores whose addresses are unknown, preceding stores whose addresses are known but whose data is unknown, and preceding stores that conflicts with the load but which incompletely overlap the load", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x3, }, { .name = "SB_DRAINS", .desc = "Number of store buffer drain cycles. Incremented every cycle the store buffer is draining. Draining is caused by serializing operations like CPUID, synchronizing operations like XCHG, interrupt acknowledgment, as well as other conditions (such as cache flushing).", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x4, }, { .name = "MISALIGN_MEM_REF", .desc = "Number of misaligned data memory references. 
Incremented by 1 every cycle during which, either the processor's load or store pipeline dispatches a misaligned micro-op Counting is performed if it is the first or second half or if it is blocked, squashed, or missed. In this context, misaligned means crossing a 64-bit boundary", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x5, }, { .name = "UOPS_RETIRED", .desc = "Number of micro-ops retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc2, }, { .name = "INST_DECODED", .desc = "Number of instructions decoded", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd0, }, { .name = "HW_INT_RX", .desc = "Number of hardware interrupts received", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc8, }, { .name = "CYCLES_INT_MASKED", .desc = "Number of processor cycles for which interrupts are disabled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc6, }, { .name = "CYCLES_INT_PENDING_AND_MASKED", .desc = "Number of processor cycles for which interrupts are disabled and interrupts are pending.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc7, }, { .name = "BR_INST_RETIRED", .desc = "Number of branch instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc4, }, { .name = "BR_MISS_PRED_RETIRED", .desc = "Number of mispredicted branches retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc5, }, { .name = "BR_TAKEN_RETIRED", .desc = "Number of taken branches retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc9, }, { .name = "BR_MISS_PRED_TAKEN_RET", .desc = "Number of taken mispredicted branches retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xca, }, { .name = "BR_INST_DECODED", .desc = "Number of branch instructions decoded", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe0, }, { .name = "BTB_MISSES", .desc = "Number of branches for which the BTB did not produce a prediction", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe2, }, { .name = "BR_BOGUS", .desc = "Number of bogus 
branches", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe4, }, { .name = "BACLEARS", .desc = "Number of times BACLEAR is asserted. This is the number of times that a static branch prediction was made, in which the branch decoder decided to make a branch prediction because the BTB did not", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe6, }, { .name = "RESOURCE_STALLS", .desc = "Incremented by 1 during every cycle for which there is a resource related stall. Includes register renaming buffer entries, memory buffer entries. Does not include stalls due to bus queue full, too many cache misses, etc. In addition to resource related stalls, this event counts some other events. Includes stalls arising during branch misprediction recovery, such as if retirement of the mispredicted branch is delayed and stalls arising while store buffer is draining from synchronizing operations", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xa2, }, { .name = "PARTIAL_RAT_STALLS", .desc = "Number of cycles or events for partial stalls. 
This includes flag partial stalls", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd2, }, { .name = "SEGMENT_REG_LOADS", .desc = "Number of segment register loads.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6, }, { .name = "MMX_SAT_INSTR_EXEC", .desc = "Number of MMX saturating instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xb1, }, { .name = "MMX_UOPS_EXEC", .desc = "Number of MMX micro-ops executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xb2, }, { .name = "MMX_INSTR_TYPE_EXEC", .desc = "Number of MMX instructions executed by type", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xb3, .numasks = LIBPFM_ARRAY_SIZE(p6_mmx_instr_type_exec), .ngrp = 1, .umasks = p6_mmx_instr_type_exec, }, { .name = "FP_MMX_TRANS", .desc = "Number of MMX transitions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xcc, .numasks = LIBPFM_ARRAY_SIZE(p6_fp_mmx_trans), .ngrp = 1, .umasks = p6_fp_mmx_trans, }, { .name = "MMX_ASSIST", .desc = "Number of MMX assists (EMMS instructions executed)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xcd, }, { .name = "SEG_RENAME_STALLS", .desc = "Number of Segment Register Renaming Stalls", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd4, .numasks = LIBPFM_ARRAY_SIZE(p6_seg_rename_stalls), .ngrp = 1, .umasks = p6_seg_rename_stalls, }, { .name = "SEG_REG_RENAMES", .desc = "Number of Segment Register Renames", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd5, .numasks = LIBPFM_ARRAY_SIZE(p6_seg_rename_stalls), .ngrp = 1, .umasks = p6_seg_rename_stalls, /* identical to actual umasks list for this event */ }, { .name = "RET_SEG_RENAMES", .desc = "Number of segment register rename events retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd6, }, { .name = "EMON_KNI_PREF_DISPATCHED", .desc = "Number of Streaming SIMD extensions prefetch/weakly-ordered instructions dispatched (speculative prefetches are included in counting). 
Pentium III and later", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7, .numasks = LIBPFM_ARRAY_SIZE(p6_emon_kni_pref_dispatched), .ngrp = 1, .umasks = p6_emon_kni_pref_dispatched, }, { .name = "EMON_KNI_PREF_MISS", .desc = "Number of prefetch/weakly-ordered instructions that miss all caches. Pentium III and later", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x4b, .numasks = LIBPFM_ARRAY_SIZE(p6_emon_kni_pref_dispatched), .ngrp = 1, .umasks = p6_emon_kni_pref_dispatched, /* identical to actual umasks list for this event */ }, { .name = "L2_LD", .desc = "Number of L2 data loads. This event indicates that a normal, unlocked, load memory access was received by the L2. It includes only L2 cacheable memory accesses; it does not include I/O accesses, other non-memory accesses, or memory accesses such as UC/WT memory accesses. It does include L2 cacheable TLB miss memory accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x29, .numasks = LIBPFM_ARRAY_SIZE(p6_l2_ifetch), .ngrp = 1, .umasks = p6_l2_ifetch, /* identical to actual umasks list for this event */ }, { .name = "L2_LINES_IN", .desc = "Number of lines allocated in the L2", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x24, }, { .name = "L2_LINES_OUT", .desc = "Number of lines removed from the L2 for any reason", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x26, }, { .name = "L2_M_LINES_OUTM", .desc = "Number of modified lines removed from the L2 for any reason", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x27, }, { .name = "EMON_KNI_INST_RETIRED", .desc = "Number of SSE instructions retired. Pentium III and later", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd8, .numasks = LIBPFM_ARRAY_SIZE(p6_emon_kni_inst_retired), .ngrp = 1, .umasks = p6_emon_kni_inst_retired, }, { .name = "EMON_KNI_COMP_INST_RET", .desc = "Number of SSE computation instructions retired. 
Pentium III and later", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd9, .numasks = LIBPFM_ARRAY_SIZE(p6_emon_kni_inst_retired), .ngrp = 1, .umasks = p6_emon_kni_inst_retired, /* identical to actual umasks list for this event */ }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_pii_events.h000066400000000000000000000553031502707512200235550ustar00rootroot00000000000000/* * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. 
* * PMU: pii (Intel Pentium II) */ static const intel_x86_umask_t pii_l2_ifetch[]={ { .uname = "I", .udesc = "Invalid state", .ucode = 0x100, }, { .uname = "S", .udesc = "Shared state", .ucode = 0x200, }, { .uname = "E", .udesc = "Exclusive state", .ucode = 0x400, }, { .uname = "M", .udesc = "Modified state", .ucode = 0x800, }, }; static const intel_x86_umask_t pii_bus_drdy_clocks[]={ { .uname = "SELF", .udesc = "Clocks when processor is driving bus", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ANY", .udesc = "Clocks when any agent is driving bus", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t pii_mmx_instr_type_exec[]={ { .uname = "MUL", .udesc = "MMX packed multiply instructions executed", .ucode = 0x100, }, { .uname = "SHIFT", .udesc = "MMX packed shift instructions executed", .ucode = 0x200, }, { .uname = "PACK", .udesc = "MMX pack operation instructions executed", .ucode = 0x400, }, { .uname = "UNPACK", .udesc = "MMX unpack operation instructions executed", .ucode = 0x800, }, { .uname = "LOGICAL", .udesc = "MMX packed logical instructions executed", .ucode = 0x1000, }, { .uname = "ARITH", .udesc = "MMX packed arithmetic instructions executed", .ucode = 0x2000, }, }; static const intel_x86_umask_t pii_fp_mmx_trans[]={ { .uname = "TO_FP", .udesc = "From MMX instructions to floating-point instructions", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TO_MMX", .udesc = "From floating-point instructions to MMX instructions", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t pii_seg_rename_stalls[]={ { .uname = "ES", .udesc = "Segment register ES", .ucode = 0x100, }, { .uname = "DS", .udesc = "Segment register DS", .ucode = 0x200, }, { .uname = "FS", .udesc = "Segment register FS", .ucode = 0x400, }, { .uname = "GS", .udesc = "Segment register GS", .ucode = 0x800, }, }; static const intel_x86_entry_t intel_pii_pe[]={ { .name = "CPU_CLK_UNHALTED", .desc = "Number 
cycles during which the processor is not halted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x79, }, { .name = "INST_RETIRED", .desc = "Number of instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc0, }, { .name = "DATA_MEM_REFS", .desc = "All loads from any memory type. All stores to any memory typeEach part of a split is counted separately. The internal logic counts not only memory loads and stores but also internal retries. 80-bit floating point accesses are double counted, since they are decomposed into a 16-bit exponent load and a 64-bit mantissa load. Memory accesses are only counted when they are actually performed (such as a load that gets squashed because a previous cache miss is outstanding to the same address, and which finally gets performed, is only counted once). Does not include I/O accesses or other non-memory accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x43, }, { .name = "DCU_LINES_IN", .desc = "Total lines allocated in the DCU", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x45, }, { .name = "DCU_M_LINES_IN", .desc = "Number of M state lines allocated in the DCU", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x46, }, { .name = "DCU_M_LINES_OUT", .desc = "Number of M state lines evicted from the DCU. This includes evictions via snoop HITM, intervention or replacement", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x47, }, { .name = "DCU_MISS_OUTSTANDING", .desc = "Weighted number of cycle while a DCU miss is outstanding, incremented by the number of cache misses at any particular time. Cacheable read requests only are considered. 
Uncacheable requests are excluded Read-for-ownerships are counted, as well as line fills, invalidates, and stores", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x48, }, { .name = "IFU_IFETCH", .desc = "Number of instruction fetches, both cacheable and noncacheable including UC fetches", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x80, }, { .name = "IFU_IFETCH_MISS", .desc = "Number of instruction fetch misses. All instructions fetches that do not hit the IFU (i.e., that produce memory requests). Includes UC accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x81, }, { .name = "ITLB_MISS", .desc = "Number of ITLB misses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x85, }, { .name = "IFU_MEM_STALL", .desc = "Number of cycles instruction fetch is stalled for any reason. Includes IFU cache misses, ITLB misses, ITLB faults, and other minor stalls", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x86, }, { .name = "ILD_STALL", .desc = "Number of cycles that the instruction length decoder is stalled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x87, }, { .name = "L2_IFETCH", .desc = "Number of L2 instruction fetches. This event indicates that a normal instruction fetch was received by the L2. The count includes only L2 cacheable instruction fetches: it does not include UC instruction fetches It does not include ITLB miss accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x28, .numasks = LIBPFM_ARRAY_SIZE(pii_l2_ifetch), .ngrp = 1, .umasks = pii_l2_ifetch, }, { .name = "L2_ST", .desc = "Number of L2 data stores. This event indicates that a normal, unlocked, store memory access was received by the L2. Specifically, it indicates that the DCU sent a read-for ownership request to the L2. It also includes Invalid to Modified requests sent by the DCU to the L2. It includes only L2 cacheable memory accesses; it does not include I/O accesses, other non-memory accesses, or memory accesses such as UC/WT memory accesses. 
It does include L2 cacheable TLB miss memory accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x2a, .numasks = LIBPFM_ARRAY_SIZE(pii_l2_ifetch), .ngrp = 1, .umasks = pii_l2_ifetch, /* identical to actual umasks list for this event */ }, { .name = "L2_M_LINES_INM", .desc = "Number of modified lines allocated in the L2", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x25, }, { .name = "L2_RQSTS", .desc = "Total number of L2 requests", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x2e, .numasks = LIBPFM_ARRAY_SIZE(pii_l2_ifetch), .ngrp = 1, .umasks = pii_l2_ifetch, /* identical to actual umasks list for this event */ }, { .name = "L2_ADS", .desc = "Number of L2 address strobes", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x21, }, { .name = "L2_DBUS_BUSY", .desc = "Number of cycles during which the L2 cache data bus was busy", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x22, }, { .name = "L2_DBUS_BUSY_RD", .desc = "Number of cycles during which the data bus was busy transferring read data from L2 to the processor", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x23, }, { .name = "BUS_DRDY_CLOCKS", .desc = "Number of clocks during which DRDY# is asserted. Utilization of the external system data bus during data transfers", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x62, .numasks = LIBPFM_ARRAY_SIZE(pii_bus_drdy_clocks), .ngrp = 1, .umasks = pii_bus_drdy_clocks, }, { .name = "BUS_LOCK_CLOCKS", .desc = "Number of clocks during which LOCK# is asserted on the external system bus", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x63, .numasks = LIBPFM_ARRAY_SIZE(pii_bus_drdy_clocks), .ngrp = 1, .umasks = pii_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_REQ_OUTSTANDING", .desc = "Number of bus requests outstanding. 
This counter is incremented by the number of cacheable read bus requests outstanding in any given cycle", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x60, }, { .name = "BUS_TRANS_BRD", .desc = "Number of burst read transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x65, .numasks = LIBPFM_ARRAY_SIZE(pii_bus_drdy_clocks), .ngrp = 1, .umasks = pii_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_RFO", .desc = "Number of completed read for ownership transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x66, .numasks = LIBPFM_ARRAY_SIZE(pii_bus_drdy_clocks), .ngrp = 1, .umasks = pii_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_WB", .desc = "Number of completed write back transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x67, .numasks = LIBPFM_ARRAY_SIZE(pii_bus_drdy_clocks), .ngrp = 1, .umasks = pii_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_IFETCH", .desc = "Number of completed instruction fetch transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x68, .numasks = LIBPFM_ARRAY_SIZE(pii_bus_drdy_clocks), .ngrp = 1, .umasks = pii_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_INVAL", .desc = "Number of completed invalidate transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x69, .numasks = LIBPFM_ARRAY_SIZE(pii_bus_drdy_clocks), .ngrp = 1, .umasks = pii_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_PWR", .desc = "Number of completed partial write transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6a, .numasks = LIBPFM_ARRAY_SIZE(pii_bus_drdy_clocks), .ngrp = 1, .umasks = pii_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_P", .desc = "Number of completed partial transactions", .modmsk = INTEL_X86_ATTRS, 
.cntmsk = 0x3, .code = 0x6b, .numasks = LIBPFM_ARRAY_SIZE(pii_bus_drdy_clocks), .ngrp = 1, .umasks = pii_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_IO", .desc = "Number of completed I/O transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6c, .numasks = LIBPFM_ARRAY_SIZE(pii_bus_drdy_clocks), .ngrp = 1, .umasks = pii_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_DEF", .desc = "Number of completed deferred transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6d, .numasks = LIBPFM_ARRAY_SIZE(pii_bus_drdy_clocks), .ngrp = 1, .umasks = pii_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_BURST", .desc = "Number of completed burst transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6e, .numasks = LIBPFM_ARRAY_SIZE(pii_bus_drdy_clocks), .ngrp = 1, .umasks = pii_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_ANY", .desc = "Number of all completed bus transactions. Address bus utilization can be calculated knowing the minimum address bus occupancy. 
Includes special cycles, etc.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x70, .numasks = LIBPFM_ARRAY_SIZE(pii_bus_drdy_clocks), .ngrp = 1, .umasks = pii_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_MEM", .desc = "Number of completed memory transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6f, .numasks = LIBPFM_ARRAY_SIZE(pii_bus_drdy_clocks), .ngrp = 1, .umasks = pii_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_DATA_RECV", .desc = "Number of bus clock cycles during which this processor is receiving data", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x64, }, { .name = "BUS_BNR_DRV", .desc = "Number of bus clock cycles during which this processor is driving the BNR# pin", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x61, }, { .name = "BUS_HIT_DRV", .desc = "Number of bus clock cycles during which this processor is driving the HIT# pin", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7a, }, { .name = "BUS_HITM_DRV", .desc = "Number of bus clock cycles during which this processor is driving the HITM# pin", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7b, }, { .name = "BUS_SNOOP_STALL", .desc = "Number of clock cycles during which the bus is snoop stalled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7e, }, { .name = "FLOPS", .desc = "Number of computational floating-point operations retired. Excludes floating-point computational operations that cause traps or assists. Includes internal sub-operations for complex floating-point instructions like transcendentals. Excludes floating point loads and stores", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0xc1, }, { .name = "FP_COMP_OPS_EXE", .desc = "Number of computational floating-point operations executed. The number of FADD, FSUB, FCOM, FMULs, integer MULs and IMULs, FDIVs, FPREMs, FSQRTS, integer DIVs, and IDIVs. 
This number does not include the number of cycles, but the number of operations. This event does not distinguish an FADD used in the middle of a transcendental flow from a separate FADD instruction", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0x10, }, { .name = "FP_ASSIST", .desc = "Number of floating-point exception cases handled by microcode.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x11, }, { .name = "MUL", .desc = "Number of multiplies. This count includes integer as well as FP multiplies and is speculative", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x12, }, { .name = "DIV", .desc = "Number of divides. This count includes integer as well as FP divides and is speculative", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x13, }, { .name = "CYCLES_DIV_BUSY", .desc = "Number of cycles during which the divider is busy, and cannot accept new divides. This includes integer and FP divides, FPREM, FPSQRT, etc. and is speculative", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0x14, }, { .name = "LD_BLOCKS", .desc = "Number of load operations delayed due to store buffer blocks. Includes counts caused by preceding stores whose addresses are unknown, preceding stores whose addresses are known but whose data is unknown, and preceding stores that conflict with the load but which incompletely overlap the load", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x3, }, { .name = "SB_DRAINS", .desc = "Number of store buffer drain cycles. Incremented every cycle the store buffer is draining. Draining is caused by serializing operations like CPUID, synchronizing operations like XCHG, interrupt acknowledgment, as well as other conditions (such as cache flushing).", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x4, }, { .name = "MISALIGN_MEM_REF", .desc = "Number of misaligned data memory references.
Incremented by 1 every cycle during which either the processor's load or store pipeline dispatches a misaligned micro-op. Counting is performed if it is the first or second half or if it is blocked, squashed, or missed. In this context, misaligned means crossing a 64-bit boundary", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x5, }, { .name = "UOPS_RETIRED", .desc = "Number of micro-ops retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc2, }, { .name = "INST_DECODED", .desc = "Number of instructions decoded", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd0, }, { .name = "HW_INT_RX", .desc = "Number of hardware interrupts received", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc8, }, { .name = "CYCLES_INT_MASKED", .desc = "Number of processor cycles for which interrupts are disabled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc6, }, { .name = "CYCLES_INT_PENDING_AND_MASKED", .desc = "Number of processor cycles for which interrupts are disabled and interrupts are pending.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc7, }, { .name = "BR_INST_RETIRED", .desc = "Number of branch instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc4, }, { .name = "BR_MISS_PRED_RETIRED", .desc = "Number of mispredicted branches retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc5, }, { .name = "BR_TAKEN_RETIRED", .desc = "Number of taken branches retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc9, }, { .name = "BR_MISS_PRED_TAKEN_RET", .desc = "Number of taken mispredicted branches retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xca, }, { .name = "BR_INST_DECODED", .desc = "Number of branch instructions decoded", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe0, }, { .name = "BTB_MISSES", .desc = "Number of branches for which the BTB did not produce a prediction", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe2, }, { .name = "BR_BOGUS", .desc = "Number of bogus
branches", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe4, }, { .name = "BACLEARS", .desc = "Number of times BACLEAR is asserted. This is the number of times that a static branch prediction was made, in which the branch decoder decided to make a branch prediction because the BTB did not", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe6, }, { .name = "RESOURCE_STALLS", .desc = "Incremented by 1 during every cycle for which there is a resource related stall. Includes register renaming buffer entries, memory buffer entries. Does not include stalls due to bus queue full, too many cache misses, etc. In addition to resource related stalls, this event counts some other events. Includes stalls arising during branch misprediction recovery, such as if retirement of the mispredicted branch is delayed and stalls arising while store buffer is draining from synchronizing operations", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xa2, }, { .name = "PARTIAL_RAT_STALLS", .desc = "Number of cycles or events for partial stalls. 
This includes flag partial stalls", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd2, }, { .name = "SEGMENT_REG_LOADS", .desc = "Number of segment register loads.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6, }, { .name = "MMX_INSTR_EXEC", .desc = "Number of MMX instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xb0, }, { .name = "MMX_INSTR_RET", .desc = "Number of MMX instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xce, }, { .name = "MMX_SAT_INSTR_EXEC", .desc = "Number of MMX saturating instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xb1, }, { .name = "MMX_UOPS_EXEC", .desc = "Number of MMX micro-ops executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xb2, }, { .name = "MMX_INSTR_TYPE_EXEC", .desc = "Number of MMX instructions executed by type", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xb3, .numasks = LIBPFM_ARRAY_SIZE(pii_mmx_instr_type_exec), .ngrp = 1, .umasks = pii_mmx_instr_type_exec, }, { .name = "FP_MMX_TRANS", .desc = "Number of MMX transitions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xcc, .numasks = LIBPFM_ARRAY_SIZE(pii_fp_mmx_trans), .ngrp = 1, .umasks = pii_fp_mmx_trans, }, { .name = "MMX_ASSIST", .desc = "Number of MMX assists (EMMS instructions executed)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xcd, }, { .name = "SEG_RENAME_STALLS", .desc = "Number of Segment Register Renaming Stalls", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd4, .numasks = LIBPFM_ARRAY_SIZE(pii_seg_rename_stalls), .ngrp = 1, .umasks = pii_seg_rename_stalls, }, { .name = "SEG_REG_RENAMES", .desc = "Number of Segment Register Renames", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd5, .numasks = LIBPFM_ARRAY_SIZE(pii_seg_rename_stalls), .ngrp = 1, .umasks = pii_seg_rename_stalls, /* identical to actual umasks list for this event */ }, { .name = "RET_SEG_RENAMES", .desc = "Number of segment register rename events retired", .modmsk = INTEL_X86_ATTRS,
.cntmsk = 0x3, .code = 0xd6, }, { .name = "L2_LD", .desc = "Number of L2 data loads. This event indicates that a normal, unlocked, load memory access was received by the L2. It includes only L2 cacheable memory accesses; it does not include I/O accesses, other non-memory accesses, or memory accesses such as UC/WT memory accesses. It does include L2 cacheable TLB miss memory accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x29, .numasks = LIBPFM_ARRAY_SIZE(pii_l2_ifetch), .ngrp = 1, .umasks = pii_l2_ifetch, /* identical to actual umasks list for this event */ }, { .name = "L2_LINES_IN", .desc = "Number of lines allocated in the L2", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x24, }, { .name = "L2_LINES_OUT", .desc = "Number of lines removed from the L2 for any reason", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x26, }, { .name = "L2_M_LINES_OUTM", .desc = "Number of modified lines removed from the L2 for any reason", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x27, }, };
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_pm_events.h
/* * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software.
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. * * PMU: pm (Intel Pentium M) */ static const intel_x86_umask_t pm_l2_ifetch[]={ { .uname = "I", .udesc = "Invalid state", .ucode = 0x100, }, { .uname = "S", .udesc = "Shared state", .ucode = 0x200, }, { .uname = "E", .udesc = "Exclusive state", .ucode = 0x400, }, { .uname = "M", .udesc = "Modified state", .ucode = 0x800, }, }; static const intel_x86_umask_t pm_bus_drdy_clocks[]={ { .uname = "SELF", .udesc = "Clocks when processor is driving bus", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ANY", .udesc = "Clocks when any agent is driving bus", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t pm_mmx_instr_type_exec[]={ { .uname = "MUL", .udesc = "MMX packed multiply instructions executed", .ucode = 0x100, }, { .uname = "SHIFT", .udesc = "MMX packed shift instructions executed", .ucode = 0x200, }, { .uname = "PACK", .udesc = "MMX pack operation instructions executed", .ucode = 0x400, }, { .uname = "UNPACK", .udesc = "MMX unpack operation instructions executed", .ucode = 0x800, }, { .uname = "LOGICAL", .udesc = "MMX packed logical instructions executed", .ucode = 0x1000, }, { .uname = "ARITH", .udesc = "MMX packed arithmetic instructions executed", .ucode = 0x2000, }, }; static const intel_x86_umask_t pm_fp_mmx_trans[]={ { .uname = "TO_FP", .udesc = "From MMX instructions to floating-point 
instructions", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TO_MMX", .udesc = "From floating-point instructions to MMX instructions", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t pm_seg_rename_stalls[]={ { .uname = "ES", .udesc = "Segment register ES", .ucode = 0x100, }, { .uname = "DS", .udesc = "Segment register DS", .ucode = 0x200, }, { .uname = "FS", .udesc = "Segment register FS", .ucode = 0x400, }, { .uname = "GS", .udesc = "Segment register GS", .ucode = 0x800, }, }; static const intel_x86_umask_t pm_emon_kni_pref_dispatched[]={ { .uname = "NTA", .udesc = "Prefetch NTA", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO, }, { .uname = "T1", .udesc = "Prefetch T1", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "T2", .udesc = "Prefetch T2", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WEAK", .udesc = "Weakly ordered stores", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t pm_emon_est_trans[]={ { .uname = "ALL", .udesc = "All transitions", .ucode = 0x0, }, { .uname = "FREQ", .udesc = "Only frequency transitions", .ucode = 0x200, }, }; static const intel_x86_umask_t pm_emon_fused_uops_ret[]={ { .uname = "ALL", .udesc = "All fused micro-ops", .ucode = 0x0, }, { .uname = "LD_OP", .udesc = "Only load+Op micro-ops", .ucode = 0x100, }, { .uname = "STD_STA", .udesc = "Only std+sta micro-ops", .ucode = 0x200, }, }; static const intel_x86_umask_t pm_emon_sse_sse2_inst_retired[]={ { .uname = "SSE_PACKED_SCALAR_SINGLE", .udesc = "SSE Packed Single and Scalar Single", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_SCALAR_SINGLE", .udesc = "SSE Scalar Single", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE2_PACKED_DOUBLE", .udesc = "SSE2 Packed Double", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE2_SCALAR_DOUBLE", .udesc = "SSE2 Scalar Double", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t pm_l2_ld[]={ { 
.uname = "I", .udesc = "Invalid state", .ucode = 0x100, }, { .uname = "S", .udesc = "Shared state", .ucode = 0x200, }, { .uname = "E", .udesc = "Exclusive state", .ucode = 0x400, }, { .uname = "M", .udesc = "Modified state", .ucode = 0x800, }, { .uname = "EXCL_HW_PREFETCH", .udesc = "Exclude hardware prefetched lines", .ucode = 0x0, }, { .uname = "ONLY_HW_PREFETCH", .udesc = "Only hardware prefetched lines", .ucode = 0x1000, }, { .uname = "NON_HW_PREFETCH", .udesc = "Non hardware prefetched lines", .ucode = 0x2000, }, }; static const intel_x86_entry_t intel_pm_pe[]={ { .name = "CPU_CLK_UNHALTED", .desc = "Number of cycles during which the processor is not halted and not in a thermal trip", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x79, }, { .name = "INST_RETIRED", .desc = "Number of instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc0, }, { .name = "DATA_MEM_REFS", .desc = "All loads from any memory type. All stores to any memory type. Each part of a split is counted separately. The internal logic counts not only memory loads and stores but also internal retries. 80-bit floating point accesses are double counted, since they are decomposed into a 16-bit exponent load and a 64-bit mantissa load. Memory accesses are only counted when they are actually performed (such as a load that gets squashed because a previous cache miss is outstanding to the same address, and which finally gets performed, is only counted once). Does not include I/O accesses or other non-memory accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x43, }, { .name = "DCU_LINES_IN", .desc = "Total lines allocated in the DCU", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x45, }, { .name = "DCU_M_LINES_IN", .desc = "Number of M state lines allocated in the DCU", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x46, }, { .name = "DCU_M_LINES_OUT", .desc = "Number of M state lines evicted from the DCU.
This includes evictions via snoop HITM, intervention or replacement", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x47, }, { .name = "DCU_MISS_OUTSTANDING", .desc = "Weighted number of cycles while a DCU miss is outstanding, incremented by the number of cache misses at any particular time. Cacheable read requests only are considered. Uncacheable requests are excluded. Read-for-ownerships are counted, as well as line fills, invalidates, and stores", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x48, }, { .name = "IFU_IFETCH", .desc = "Number of instruction fetches, both cacheable and noncacheable, including UC fetches", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x80, }, { .name = "IFU_IFETCH_MISS", .desc = "Number of instruction fetch misses. All instruction fetches that do not hit the IFU (i.e., that produce memory requests). Includes UC accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x81, }, { .name = "ITLB_MISS", .desc = "Number of ITLB misses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x85, }, { .name = "IFU_MEM_STALL", .desc = "Number of cycles instruction fetch is stalled for any reason. Includes IFU cache misses, ITLB misses, ITLB faults, and other minor stalls", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x86, }, { .name = "ILD_STALL", .desc = "Number of cycles that the instruction length decoder is stalled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x87, }, { .name = "L2_IFETCH", .desc = "Number of L2 instruction fetches. This event indicates that a normal instruction fetch was received by the L2. The count includes only L2 cacheable instruction fetches: it does not include UC instruction fetches. It does not include ITLB miss accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x28, .numasks = LIBPFM_ARRAY_SIZE(pm_l2_ifetch), .ngrp = 1, .umasks = pm_l2_ifetch, }, { .name = "L2_ST", .desc = "Number of L2 data stores.
This event indicates that a normal, unlocked, store memory access was received by the L2. Specifically, it indicates that the DCU sent a read-for ownership request to the L2. It also includes Invalid to Modified requests sent by the DCU to the L2. It includes only L2 cacheable memory accesses; it does not include I/O accesses, other non-memory accesses, or memory accesses such as UC/WT memory accesses. It does include L2 cacheable TLB miss memory accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x2a, .numasks = LIBPFM_ARRAY_SIZE(pm_l2_ifetch), .ngrp = 1, .umasks = pm_l2_ifetch, /* identical to actual umasks list for this event */ }, { .name = "L2_M_LINES_INM", .desc = "Number of modified lines allocated in the L2", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x25, }, { .name = "L2_RQSTS", .desc = "Total number of L2 requests", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x2e, .numasks = LIBPFM_ARRAY_SIZE(pm_l2_ifetch), .ngrp = 1, .umasks = pm_l2_ifetch, /* identical to actual umasks list for this event */ }, { .name = "L2_ADS", .desc = "Number of L2 address strobes", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x21, }, { .name = "L2_DBUS_BUSY", .desc = "Number of cycles during which the L2 cache data bus was busy", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x22, }, { .name = "L2_DBUS_BUSY_RD", .desc = "Number of cycles during which the data bus was busy transferring read data from L2 to the processor", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x23, }, { .name = "BUS_DRDY_CLOCKS", .desc = "Number of clocks during which DRDY# is asserted. 
Utilization of the external system data bus during data transfers", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x62, .numasks = LIBPFM_ARRAY_SIZE(pm_bus_drdy_clocks), .ngrp = 1, .umasks = pm_bus_drdy_clocks, }, { .name = "BUS_LOCK_CLOCKS", .desc = "Number of clocks during which LOCK# is asserted on the external system bus", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x63, .numasks = LIBPFM_ARRAY_SIZE(pm_bus_drdy_clocks), .ngrp = 1, .umasks = pm_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_REQ_OUTSTANDING", .desc = "Number of bus requests outstanding. This counter is incremented by the number of cacheable read bus requests outstanding in any given cycle", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x60, }, { .name = "BUS_TRANS_BRD", .desc = "Number of burst read transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x65, .numasks = LIBPFM_ARRAY_SIZE(pm_bus_drdy_clocks), .ngrp = 1, .umasks = pm_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_RFO", .desc = "Number of completed read for ownership transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x66, .numasks = LIBPFM_ARRAY_SIZE(pm_bus_drdy_clocks), .ngrp = 1, .umasks = pm_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_WB", .desc = "Number of completed write back transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x67, .numasks = LIBPFM_ARRAY_SIZE(pm_bus_drdy_clocks), .ngrp = 1, .umasks = pm_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_IFETCH", .desc = "Number of completed instruction fetch transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x68, .numasks = LIBPFM_ARRAY_SIZE(pm_bus_drdy_clocks), .ngrp = 1, .umasks = pm_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_INVAL", .desc = "Number of completed invalidate 
transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x69, .numasks = LIBPFM_ARRAY_SIZE(pm_bus_drdy_clocks), .ngrp = 1, .umasks = pm_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_PWR", .desc = "Number of completed partial write transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6a, .numasks = LIBPFM_ARRAY_SIZE(pm_bus_drdy_clocks), .ngrp = 1, .umasks = pm_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_P", .desc = "Number of completed partial transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6b, .numasks = LIBPFM_ARRAY_SIZE(pm_bus_drdy_clocks), .ngrp = 1, .umasks = pm_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_IO", .desc = "Number of completed I/O transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6c, .numasks = LIBPFM_ARRAY_SIZE(pm_bus_drdy_clocks), .ngrp = 1, .umasks = pm_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_DEF", .desc = "Number of completed deferred transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6d, .numasks = LIBPFM_ARRAY_SIZE(pm_bus_drdy_clocks), .ngrp = 1, .umasks = pm_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_BURST", .desc = "Number of completed burst transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6e, .numasks = LIBPFM_ARRAY_SIZE(pm_bus_drdy_clocks), .ngrp = 1, .umasks = pm_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_ANY", .desc = "Number of all completed bus transactions. Address bus utilization can be calculated knowing the minimum address bus occupancy. 
Includes special cycles, etc.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x70, .numasks = LIBPFM_ARRAY_SIZE(pm_bus_drdy_clocks), .ngrp = 1, .umasks = pm_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_MEM", .desc = "Number of completed memory transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6f, .numasks = LIBPFM_ARRAY_SIZE(pm_bus_drdy_clocks), .ngrp = 1, .umasks = pm_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_DATA_RECV", .desc = "Number of bus clock cycles during which this processor is receiving data", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x64, }, { .name = "BUS_BNR_DRV", .desc = "Number of bus clock cycles during which this processor is driving the BNR# pin", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x61, }, { .name = "BUS_HIT_DRV", .desc = "Number of bus clock cycles during which this processor is driving the HIT# pin", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7a, }, { .name = "BUS_HITM_DRV", .desc = "Number of bus clock cycles during which this processor is driving the HITM# pin", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7b, }, { .name = "BUS_SNOOP_STALL", .desc = "Number of clock cycles during which the bus is snoop stalled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7e, }, { .name = "FLOPS", .desc = "Number of computational floating-point operations retired. Excludes floating-point computational operations that cause traps or assists. Includes internal sub-operations for complex floating-point instructions like transcendentals. Excludes floating point loads and stores", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0xc1, }, { .name = "FP_COMP_OPS_EXE", .desc = "Number of computational floating-point operations executed. The number of FADD, FSUB, FCOM, FMULs, integer MULs and IMULs, FDIVs, FPREMs, FSQRTS, integer DIVs, and IDIVs. 
This number does not include the number of cycles, but the number of operations. This event does not distinguish an FADD used in the middle of a transcendental flow from a separate FADD instruction", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0x10, }, { .name = "FP_ASSIST", .desc = "Number of floating-point exception cases handled by microcode.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x11, }, { .name = "MUL", .desc = "Number of multiplies. This count includes integer as well as FP multiplies and is speculative", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x12, }, { .name = "DIV", .desc = "Number of divides. This count includes integer as well as FP divides and is speculative", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x13, }, { .name = "CYCLES_DIV_BUSY", .desc = "Number of cycles during which the divider is busy, and cannot accept new divides. This includes integer and FP divides, FPREM, FPSQRT, etc. and is speculative", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0x14, }, { .name = "LD_BLOCKS", .desc = "Number of load operations delayed due to store buffer blocks. Includes counts caused by preceding stores whose addresses are unknown, preceding stores whose addresses are known but whose data is unknown, and preceding stores that conflict with the load but which incompletely overlap the load", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x3, }, { .name = "SB_DRAINS", .desc = "Number of store buffer drain cycles. Incremented every cycle the store buffer is draining. Draining is caused by serializing operations like CPUID, synchronizing operations like XCHG, interrupt acknowledgment, as well as other conditions (such as cache flushing).", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x4, }, { .name = "MISALIGN_MEM_REF", .desc = "Number of misaligned data memory references.
Incremented by 1 every cycle during which either the processor's load or store pipeline dispatches a misaligned micro-op. Counting is performed if it is the first or second half or if it is blocked, squashed, or missed. In this context, misaligned means crossing a 64-bit boundary", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x5, }, { .name = "UOPS_RETIRED", .desc = "Number of micro-ops retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc2, }, { .name = "INST_DECODED", .desc = "Number of instructions decoded", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd0, }, { .name = "HW_INT_RX", .desc = "Number of hardware interrupts received", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc8, }, { .name = "CYCLES_INT_MASKED", .desc = "Number of processor cycles for which interrupts are disabled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc6, }, { .name = "CYCLES_INT_PENDING_AND_MASKED", .desc = "Number of processor cycles for which interrupts are disabled and interrupts are pending.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc7, }, { .name = "BR_INST_RETIRED", .desc = "Number of branch instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc4, }, { .name = "BR_MISS_PRED_RETIRED", .desc = "Number of mispredicted branches retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc5, }, { .name = "BR_TAKEN_RETIRED", .desc = "Number of taken branches retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc9, }, { .name = "BR_MISS_PRED_TAKEN_RET", .desc = "Number of taken mispredicted branches retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xca, }, { .name = "BR_INST_DECODED", .desc = "Number of branch instructions decoded", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe0, }, { .name = "BTB_MISSES", .desc = "Number of branches for which the BTB did not produce a prediction", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe2, }, { .name = "BR_BOGUS", .desc = "Number of bogus 
branches", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe4, }, { .name = "BACLEARS", .desc = "Number of times BACLEAR is asserted. This is the number of times that a static branch prediction was made, in which the branch decoder decided to make a branch prediction because the BTB did not", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe6, }, { .name = "RESOURCE_STALLS", .desc = "Incremented by 1 during every cycle for which there is a resource related stall. Includes register renaming buffer entries, memory buffer entries. Does not include stalls due to bus queue full, too many cache misses, etc. In addition to resource related stalls, this event counts some other events. Includes stalls arising during branch misprediction recovery, such as if retirement of the mispredicted branch is delayed and stalls arising while store buffer is draining from synchronizing operations", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xa2, }, { .name = "PARTIAL_RAT_STALLS", .desc = "Number of cycles or events for partial stalls. 
This includes flag partial stalls", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd2, }, { .name = "SEGMENT_REG_LOADS", .desc = "Number of segment register loads.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6, }, { .name = "MMX_SAT_INSTR_EXEC", .desc = "Number of MMX saturating instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xb1, }, { .name = "MMX_UOPS_EXEC", .desc = "Number of MMX micro-ops executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xb2, }, { .name = "MMX_INSTR_TYPE_EXEC", .desc = "Number of MMX instructions executed by type", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xb3, .numasks = LIBPFM_ARRAY_SIZE(pm_mmx_instr_type_exec), .ngrp = 1, .umasks = pm_mmx_instr_type_exec, }, { .name = "FP_MMX_TRANS", .desc = "Number of MMX transitions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xcc, .numasks = LIBPFM_ARRAY_SIZE(pm_fp_mmx_trans), .ngrp = 1, .umasks = pm_fp_mmx_trans, }, { .name = "MMX_ASSIST", .desc = "Number of MMX micro-ops executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xcd, }, { .name = "SEG_RENAME_STALLS", .desc = "Number of Segment Register Renaming Stalls", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd4, .numasks = LIBPFM_ARRAY_SIZE(pm_seg_rename_stalls), .ngrp = 1, .umasks = pm_seg_rename_stalls, }, { .name = "SEG_REG_RENAMES", .desc = "Number of Segment Register Renames", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd5, .numasks = LIBPFM_ARRAY_SIZE(pm_seg_rename_stalls), .ngrp = 1, .umasks = pm_seg_rename_stalls, /* identical to actual umasks list for this event */ }, { .name = "RET_SEG_RENAMES", .desc = "Number of segment register rename events retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd6, }, { .name = "EMON_KNI_PREF_DISPATCHED", .desc = "Number of Streaming SIMD extensions prefetch/weakly-ordered instructions dispatched (speculative prefetches are included in counting). 
Pentium III and later", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7, .numasks = LIBPFM_ARRAY_SIZE(pm_emon_kni_pref_dispatched), .ngrp = 1, .umasks = pm_emon_kni_pref_dispatched, }, { .name = "EMON_KNI_PREF_MISS", .desc = "Number of prefetch/weakly-ordered instructions that miss all caches. Pentium III and later", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x4b, .numasks = LIBPFM_ARRAY_SIZE(pm_emon_kni_pref_dispatched), .ngrp = 1, .umasks = pm_emon_kni_pref_dispatched, /* identical to actual umasks list for this event */ }, { .name = "EMON_EST_TRANS", .desc = "Number of Enhanced Intel SpeedStep technology transitions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x58, .numasks = LIBPFM_ARRAY_SIZE(pm_emon_est_trans), .ngrp = 1, .umasks = pm_emon_est_trans, }, { .name = "EMON_THERMAL_TRIP", .desc = "Duration/occurrences in thermal trip; to count the number of thermal trips; edge detect must be used", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x59, }, { .name = "BR_INST_EXEC", .desc = "Branch instructions executed (not necessarily retired)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x88, }, { .name = "BR_MISSP_EXEC", .desc = "Branch instructions executed that were mispredicted at execution", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x89, }, { .name = "BR_BAC_MISSP_EXEC", .desc = "Branch instructions executed that were mispredicted at Front End (BAC)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8a, }, { .name = "BR_CND_EXEC", .desc = "Conditional branch instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8b, }, { .name = "BR_CND_MISSP_EXEC", .desc = "Conditional branch instructions executed that were mispredicted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8c, }, { .name = "BR_IND_EXEC", .desc = "Indirect branch instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8d, }, { .name = "BR_IND_MISSP_EXEC", .desc = "Indirect branch instructions executed that were 
mispredicted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8e, }, { .name = "BR_RET_EXEC", .desc = "Return branch instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x8f, }, { .name = "BR_RET_MISSP_EXEC", .desc = "Return branch instructions executed that were mispredicted at Execution", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x90, }, { .name = "BR_RET_BAC_MISSP_EXEC", .desc = "Return branch instructions executed that were mispredicted at Front End (BAC)", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x91, }, { .name = "BR_CALL_EXEC", .desc = "CALL instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x92, }, { .name = "BR_CALL_MISSP_EXEC", .desc = "CALL instructions executed that were mispredicted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x93, }, { .name = "BR_IND_CALL_EXEC", .desc = "Indirect CALL instructions executed", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x94, }, { .name = "EMON_SIMD_INSTR_RETIRED", .desc = "Number of retired MMX instructions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xce, }, { .name = "EMON_SYNCH_UOPS", .desc = "Sync micro-ops", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd3, }, { .name = "EMON_ESP_UOPS", .desc = "Total number of micro-ops", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd7, }, { .name = "EMON_FUSED_UOPS_RET", .desc = "Total number of micro-ops", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xda, .numasks = LIBPFM_ARRAY_SIZE(pm_emon_fused_uops_ret), .ngrp = 1, .umasks = pm_emon_fused_uops_ret, }, { .name = "EMON_UNFUSION", .desc = "Number of unfusion events in the ROB, happened on a FP exception to a fused micro-op", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xdb, }, { .name = "EMON_PREF_RQSTS_UP", .desc = "Number of upward prefetches issued", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xf0, }, { .name = "EMON_PREF_RQSTS_DN", .desc = "Number of downward prefetches issued", .modmsk = 
INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xf8, }, { .name = "EMON_SSE_SSE2_INST_RETIRED", .desc = "Streaming SIMD extensions instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd8, .numasks = LIBPFM_ARRAY_SIZE(pm_emon_sse_sse2_inst_retired), .ngrp = 1, .umasks = pm_emon_sse_sse2_inst_retired, }, { .name = "EMON_SSE_SSE2_COMP_INST_RETIRED", .desc = "Computational SSE instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd9, .numasks = LIBPFM_ARRAY_SIZE(pm_emon_sse_sse2_inst_retired), .ngrp = 1, .umasks = pm_emon_sse_sse2_inst_retired, /* identical to actual umasks list for this event */ }, { .name = "L2_LD", .desc = "Number of L2 data loads", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x29, .numasks = LIBPFM_ARRAY_SIZE(pm_l2_ld), .ngrp = 1, .umasks = pm_l2_ld, }, { .name = "L2_LINES_IN", .desc = "Number of L2 lines allocated", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x24, .numasks = LIBPFM_ARRAY_SIZE(pm_l2_ld), .ngrp = 1, .umasks = pm_l2_ld, /* identical to actual umasks list for this event */ }, { .name = "L2_LINES_OUT", .desc = "Number of L2 lines evicted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x26, .numasks = LIBPFM_ARRAY_SIZE(pm_l2_ld), .ngrp = 1, .umasks = pm_l2_ld, /* identical to actual umasks list for this event */ }, { .name = "L2_M_LINES_OUT", .desc = "Number of L2 M-state lines evicted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x27, .numasks = LIBPFM_ARRAY_SIZE(pm_l2_ld), .ngrp = 1, .umasks = pm_l2_ld, /* identical to actual umasks list for this event */ }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_ppro_events.h000066400000000000000000000465711502707512200237630ustar00rootroot00000000000000/* * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including 
without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. * * PMU: ppro (Intel Pentium Pro) */ static const intel_x86_umask_t ppro_l2_ifetch[]={ { .uname = "I", .udesc = "Invalid state", .ucode = 0x100, }, { .uname = "S", .udesc = "Shared state", .ucode = 0x200, }, { .uname = "E", .udesc = "Exclusive state", .ucode = 0x400, }, { .uname = "M", .udesc = "Modified state", .ucode = 0x800, }, }; static const intel_x86_umask_t ppro_bus_drdy_clocks[]={ { .uname = "SELF", .udesc = "Clocks when processor is driving bus", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ANY", .udesc = "Clocks when any agent is driving bus", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_ppro_pe[]={ { .name = "CPU_CLK_UNHALTED", .desc = "Number of cycles during which the processor is not halted", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x79, }, { .name = "INST_RETIRED", .desc = "Number of instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc0, }, { .name = "DATA_MEM_REFS", .desc = 
"All loads from any memory type. All stores to any memory type. Each part of a split is counted separately. The internal logic counts not only memory loads and stores but also internal retries. 80-bit floating point accesses are double counted, since they are decomposed into a 16-bit exponent load and a 64-bit mantissa load. Memory accesses are only counted when they are actually performed (such as a load that gets squashed because a previous cache miss is outstanding to the same address, and which finally gets performed, is only counted once). Does not include I/O accesses or other non-memory accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x43, }, { .name = "DCU_LINES_IN", .desc = "Total lines allocated in the DCU", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x45, }, { .name = "DCU_M_LINES_IN", .desc = "Number of M state lines allocated in the DCU", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x46, }, { .name = "DCU_M_LINES_OUT", .desc = "Number of M state lines evicted from the DCU. This includes evictions via snoop HITM, intervention or replacement", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x47, }, { .name = "DCU_MISS_OUTSTANDING", .desc = "Weighted number of cycles while a DCU miss is outstanding, incremented by the number of cache misses at any particular time. Cacheable read requests only are considered. Uncacheable requests are excluded. Read-for-ownerships are counted, as well as line fills, invalidates, and stores", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x48, }, { .name = "IFU_IFETCH", .desc = "Number of instruction fetches, both cacheable and noncacheable including UC fetches", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x80, }, { .name = "IFU_IFETCH_MISS", .desc = "Number of instruction fetch misses. All instruction fetches that do not hit the IFU (i.e., that produce memory requests). 
Includes UC accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x81, }, { .name = "ITLB_MISS", .desc = "Number of ITLB misses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x85, }, { .name = "IFU_MEM_STALL", .desc = "Number of cycles instruction fetch is stalled for any reason. Includes IFU cache misses, ITLB misses, ITLB faults, and other minor stalls", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x86, }, { .name = "ILD_STALL", .desc = "Number of cycles that the instruction length decoder is stalled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x87, }, { .name = "L2_IFETCH", .desc = "Number of L2 instruction fetches. This event indicates that a normal instruction fetch was received by the L2. The count includes only L2 cacheable instruction fetches: it does not include UC instruction fetches. It does not include ITLB miss accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x28, .numasks = LIBPFM_ARRAY_SIZE(ppro_l2_ifetch), .ngrp = 1, .umasks = ppro_l2_ifetch, }, { .name = "L2_ST", .desc = "Number of L2 data stores. This event indicates that a normal, unlocked, store memory access was received by the L2. Specifically, it indicates that the DCU sent a read-for-ownership request to the L2. It also includes Invalid to Modified requests sent by the DCU to the L2. It includes only L2 cacheable memory accesses; it does not include I/O accesses, other non-memory accesses, or memory accesses such as UC/WT memory accesses. 
It does include L2 cacheable TLB miss memory accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x2a, .numasks = LIBPFM_ARRAY_SIZE(ppro_l2_ifetch), .ngrp = 1, .umasks = ppro_l2_ifetch, /* identical to actual umasks list for this event */ }, { .name = "L2_M_LINES_INM", .desc = "Number of modified lines allocated in the L2", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x25, }, { .name = "L2_RQSTS", .desc = "Total number of L2 requests", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x2e, .numasks = LIBPFM_ARRAY_SIZE(ppro_l2_ifetch), .ngrp = 1, .umasks = ppro_l2_ifetch, /* identical to actual umasks list for this event */ }, { .name = "L2_ADS", .desc = "Number of L2 address strobes", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x21, }, { .name = "L2_DBUS_BUSY", .desc = "Number of cycles during which the L2 cache data bus was busy", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x22, }, { .name = "L2_DBUS_BUSY_RD", .desc = "Number of cycles during which the data bus was busy transferring read data from L2 to the processor", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x23, }, { .name = "BUS_DRDY_CLOCKS", .desc = "Number of clocks during which DRDY# is asserted. Utilization of the external system data bus during data transfers", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x62, .numasks = LIBPFM_ARRAY_SIZE(ppro_bus_drdy_clocks), .ngrp = 1, .umasks = ppro_bus_drdy_clocks, }, { .name = "BUS_LOCK_CLOCKS", .desc = "Number of clocks during which LOCK# is asserted on the external system bus", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x63, .numasks = LIBPFM_ARRAY_SIZE(ppro_bus_drdy_clocks), .ngrp = 1, .umasks = ppro_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_REQ_OUTSTANDING", .desc = "Number of bus requests outstanding. 
This counter is incremented by the number of cacheable read bus requests outstanding in any given cycle", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x60, }, { .name = "BUS_TRANS_BRD", .desc = "Number of burst read transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x65, .numasks = LIBPFM_ARRAY_SIZE(ppro_bus_drdy_clocks), .ngrp = 1, .umasks = ppro_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_RFO", .desc = "Number of completed read for ownership transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x66, .numasks = LIBPFM_ARRAY_SIZE(ppro_bus_drdy_clocks), .ngrp = 1, .umasks = ppro_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_WB", .desc = "Number of completed write back transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x67, .numasks = LIBPFM_ARRAY_SIZE(ppro_bus_drdy_clocks), .ngrp = 1, .umasks = ppro_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_IFETCH", .desc = "Number of completed instruction fetch transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x68, .numasks = LIBPFM_ARRAY_SIZE(ppro_bus_drdy_clocks), .ngrp = 1, .umasks = ppro_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_INVAL", .desc = "Number of completed invalidate transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x69, .numasks = LIBPFM_ARRAY_SIZE(ppro_bus_drdy_clocks), .ngrp = 1, .umasks = ppro_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_PWR", .desc = "Number of completed partial write transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6a, .numasks = LIBPFM_ARRAY_SIZE(ppro_bus_drdy_clocks), .ngrp = 1, .umasks = ppro_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_P", .desc = "Number of completed partial transactions", .modmsk = 
INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6b, .numasks = LIBPFM_ARRAY_SIZE(ppro_bus_drdy_clocks), .ngrp = 1, .umasks = ppro_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRANS_IO", .desc = "Number of completed I/O transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6c, .numasks = LIBPFM_ARRAY_SIZE(ppro_bus_drdy_clocks), .ngrp = 1, .umasks = ppro_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_DEF", .desc = "Number of completed deferred transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6d, .numasks = LIBPFM_ARRAY_SIZE(ppro_bus_drdy_clocks), .ngrp = 1, .umasks = ppro_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_BURST", .desc = "Number of completed burst transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6e, .numasks = LIBPFM_ARRAY_SIZE(ppro_bus_drdy_clocks), .ngrp = 1, .umasks = ppro_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_ANY", .desc = "Number of all completed bus transactions. Address bus utilization can be calculated knowing the minimum address bus occupancy. 
Includes special cycles, etc.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x70, .numasks = LIBPFM_ARRAY_SIZE(ppro_bus_drdy_clocks), .ngrp = 1, .umasks = ppro_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_TRAN_MEM", .desc = "Number of completed memory transactions", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6f, .numasks = LIBPFM_ARRAY_SIZE(ppro_bus_drdy_clocks), .ngrp = 1, .umasks = ppro_bus_drdy_clocks, /* identical to actual umasks list for this event */ }, { .name = "BUS_DATA_RECV", .desc = "Number of bus clock cycles during which this processor is receiving data", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x64, }, { .name = "BUS_BNR_DRV", .desc = "Number of bus clock cycles during which this processor is driving the BNR# pin", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x61, }, { .name = "BUS_HIT_DRV", .desc = "Number of bus clock cycles during which this processor is driving the HIT# pin", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7a, }, { .name = "BUS_HITM_DRV", .desc = "Number of bus clock cycles during which this processor is driving the HITM# pin", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7b, }, { .name = "BUS_SNOOP_STALL", .desc = "Number of clock cycles during which the bus is snoop stalled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x7e, }, { .name = "FLOPS", .desc = "Number of computational floating-point operations retired. Excludes floating-point computational operations that cause traps or assists. Includes internal sub-operations for complex floating-point instructions like transcendentals. Excludes floating point loads and stores", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0xc1, }, { .name = "FP_COMP_OPS_EXE", .desc = "Number of computational floating-point operations executed. The number of FADD, FSUB, FCOM, FMULs, integer MULs and IMULs, FDIVs, FPREMs, FSQRTS, integer DIVs, and IDIVs. 
This number does not include the number of cycles, but the number of operations. This event does not distinguish an FADD used in the middle of a transcendental flow from a separate FADD instruction", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0x10, }, { .name = "FP_ASSIST", .desc = "Number of floating-point exception cases handled by microcode.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x11, }, { .name = "MUL", .desc = "Number of multiplies. This count includes integer as well as FP multiplies and is speculative", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x12, }, { .name = "DIV", .desc = "Number of divides. This count includes integer as well as FP divides and is speculative", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x2, .code = 0x13, }, { .name = "CYCLES_DIV_BUSY", .desc = "Number of cycles during which the divider is busy, and cannot accept new divides. This includes integer and FP divides, FPREM, FPSQRT, etc. and is speculative", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x1, .code = 0x14, }, { .name = "LD_BLOCKS", .desc = "Number of load operations delayed due to store buffer blocks. Includes counts caused by preceding stores whose addresses are unknown, preceding stores whose addresses are known but whose data is unknown, and preceding stores that conflict with the load but which incompletely overlap the load", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x3, }, { .name = "SB_DRAINS", .desc = "Number of store buffer drain cycles. Incremented every cycle the store buffer is draining. Draining is caused by serializing operations like CPUID, synchronizing operations like XCHG, interrupt acknowledgment, as well as other conditions (such as cache flushing).", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x4, }, { .name = "MISALIGN_MEM_REF", .desc = "Number of misaligned data memory references. 
Incremented by 1 every cycle during which either the processor's load or store pipeline dispatches a misaligned micro-op. Counting is performed if it is the first or second half or if it is blocked, squashed, or missed. In this context, misaligned means crossing a 64-bit boundary", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x5, }, { .name = "UOPS_RETIRED", .desc = "Number of micro-ops retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc2, }, { .name = "INST_DECODED", .desc = "Number of instructions decoded", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd0, }, { .name = "HW_INT_RX", .desc = "Number of hardware interrupts received", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc8, }, { .name = "CYCLES_INT_MASKED", .desc = "Number of processor cycles for which interrupts are disabled", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc6, }, { .name = "CYCLES_INT_PENDING_AND_MASKED", .desc = "Number of processor cycles for which interrupts are disabled and interrupts are pending.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc7, }, { .name = "BR_INST_RETIRED", .desc = "Number of branch instructions retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc4, }, { .name = "BR_MISS_PRED_RETIRED", .desc = "Number of mispredicted branches retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc5, }, { .name = "BR_TAKEN_RETIRED", .desc = "Number of taken branches retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xc9, }, { .name = "BR_MISS_PRED_TAKEN_RET", .desc = "Number of taken mispredicted branches retired", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xca, }, { .name = "BR_INST_DECODED", .desc = "Number of branch instructions decoded", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe0, }, { .name = "BTB_MISSES", .desc = "Number of branches for which the BTB did not produce a prediction", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe2, }, { .name = "BR_BOGUS", .desc = "Number of bogus 
branches", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe4, }, { .name = "BACLEARS", .desc = "Number of times BACLEAR is asserted. This is the number of times that a static branch prediction was made, in which the branch decoder decided to make a branch prediction because the BTB did not", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xe6, }, { .name = "RESOURCE_STALLS", .desc = "Incremented by 1 during every cycle for which there is a resource related stall. Includes register renaming buffer entries, memory buffer entries. Does not include stalls due to bus queue full, too many cache misses, etc. In addition to resource related stalls, this event counts some other events. Includes stalls arising during branch misprediction recovery, such as if retirement of the mispredicted branch is delayed and stalls arising while store buffer is draining from synchronizing operations", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xa2, }, { .name = "PARTIAL_RAT_STALLS", .desc = "Number of cycles or events for partial stalls. This includes flag partial stalls", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0xd2, }, { .name = "SEGMENT_REG_LOADS", .desc = "Number of segment register loads.", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x6, }, { .name = "L2_LD", .desc = "Number of L2 data loads. This event indicates that a normal, unlocked, load memory access was received by the L2. It includes only L2 cacheable memory accesses; it does not include I/O accesses, other non-memory accesses, or memory accesses such as UC/WT memory accesses. 
It does include L2 cacheable TLB miss memory accesses", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x29, .numasks = LIBPFM_ARRAY_SIZE(ppro_l2_ifetch), .ngrp = 1, .umasks = ppro_l2_ifetch, /* identical to actual umasks list for this event */ }, { .name = "L2_LINES_IN", .desc = "Number of lines allocated in the L2", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x24, }, { .name = "L2_LINES_OUT", .desc = "Number of lines removed from the L2 for any reason", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x26, }, { .name = "L2_M_LINES_OUTM", .desc = "Number of modified lines removed from the L2 for any reason", .modmsk = INTEL_X86_ATTRS, .cntmsk = 0x3, .code = 0x27, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_skl_events.h000066400000000000000000003243221502707512200235650ustar00rootroot00000000000000/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: skl (Intel SkyLake) */ static const intel_x86_umask_t skl_baclears[]={ { .uname = "ANY", .udesc = "Number of front-end re-steers due to BPU misprediction", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_br_inst_retired[]={ { .uname = "CONDITIONAL", .udesc = "Counts all taken and not taken macro conditional branch instructions", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND", .udesc = "Counts all taken and not taken macro conditional branch instructions", .ucode = 0x100, .uequiv = "CONDITIONAL", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_CALL", .udesc = "Counts all macro direct and indirect near calls", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_BRANCHES", .udesc = "Counts all taken and not taken macro branches including far branches (architectural event)", .ucode = 0x0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_PEBS, }, { .uname = "NEAR_RETURN", .udesc = "Counts the number of near ret instructions retired", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NOT_TAKEN", .udesc = "Counts all not taken macro branch instructions retired", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_TAKEN", .udesc = "Counts the number of near branch taken instructions retired", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "FAR_BRANCH", .udesc = "Counts the number of far branch instructions retired", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t skl_br_misp_retired[]={ { .uname = "CONDITIONAL", .udesc = "All mispredicted macro conditional branch instructions", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND", .udesc = "All mispredicted macro 
conditional branch instructions", .ucode = 0x100, .uequiv = "CONDITIONAL", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_BRANCHES", .udesc = "All mispredicted macro branches (architectural event)", .ucode = 0x0, /* architectural encoding */ .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "NEAR_TAKEN", .udesc = "Number of near branch instructions retired that were mispredicted and taken", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_CALL", .udesc = "Counts both taken and not taken retired mispredicted direct and indirect near calls, including both register and memory indirect.", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RET", .udesc = "This event counts the number of mispredicted ret instructions retired.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_RETURN", .udesc = "This event counts the number of mispredicted ret instructions retired.", .uequiv = "RET", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t skl_cpu_clk_thread_unhalted[]={ { .uname = "REF_XCLK", .udesc = "Count Xclk pulses (100Mhz) when the core is unhalted", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REF_XCLK_ANY", .udesc = "Count Xclk pulses (100Mhz) when the at least one thread on the physical core is unhalted", .ucode = 0x100 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "REF_XCLK:t", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "REF_P", .udesc = "Cycles when the core is unhalted (count at 100 Mhz)", .ucode = 0x100, .uequiv = "REF_XCLK", .uflags= INTEL_X86_NCOMBO, }, { .uname = "THREAD_P", .udesc = "Cycles when thread is not halted", .ucode = 0x000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ONE_THREAD_ACTIVE", .udesc = "Counts Xclk (100Mhz) pulses when this thread is unhalted and the other thread is halted", .ucode = 0x200, .uflags= 
INTEL_X86_NCOMBO, }, { .uname = "RING0_TRANS", .udesc = "Counts when the current privilege level transitions from ring 1, 2 or 3 to ring 0 (kernel)", .ucode = 0x000 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "THREAD_P:e:c=1", .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_cycle_activity[]={ { .uname = "CYCLES_L2_MISS", .udesc = "Cycles with pending L2 miss demand loads outstanding", .ucode = 0x0100 | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_L2_PENDING", .udesc = "Cycles with pending L2 miss demand loads outstanding", .ucode = 0x0100 | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, .uequiv = "CYCLES_L2_MISS", }, { .uname = "CYCLES_L3_MISS", .udesc = "Cycles with L3 cache miss demand loads outstanding", .ucode = 0x0200 | (0x2 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_LDM_PENDING", .udesc = "Cycles with L3 cache miss demand loads outstanding", .ucode = 0x0200 | (0x2 << INTEL_X86_CMASK_BIT), .uequiv = "CYCLES_L3_MISS", .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_L1D_MISS", .udesc = "Cycles with pending L1D load cache misses", .ucode = 0x0800 | (0x8 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_L1D_PENDING", .udesc = "Cycles with pending L1D load cache misses", .ucode = 0x0800 | (0x8 << INTEL_X86_CMASK_BIT), .uequiv = "CYCLES_L1D_MISS", .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_MEM_ANY", .udesc = "Cycles when memory subsystem has at least one outstanding load", .ucode = 0x1000 | (0x10 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_L1D_MISS", .udesc = "Execution stalls while at least one L1D demand load cache miss is outstanding", .ucode = 0x0c00 | (0xc << 
INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .ucntmsk= 0x4, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_L2_MISS", .udesc = "Execution stalls while at least one L2 demand load is outstanding", .ucode = 0x0500 | (0x5 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .ucntmsk= 0xf, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_L3_MISS", .udesc = "Execution stalls while at least one L3 demand load is outstanding", .ucode = 0x0600 | (0x6 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_MEM_ANY", .udesc = "Execution stalls while at least one demand load is outstanding in the memory subsystem", .ucode = 0x1400 | (20 << INTEL_X86_CMASK_BIT), /* cnt=20 */ .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_TOTAL", .udesc = "Total execution stalls in cycles", .ucode = 0x0400 | (0x4 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_dtlb_load_misses[]={ { .uname = "MISS_CAUSES_A_WALK", .udesc = "Misses in all DTLB levels that cause page walks", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "Number of misses in all TLB levels causing a page walk of any page size that completes", .ucode = 0xe00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Number of misses in all TLB levels causing a page walk of 4KB page size that completes", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Number of misses in all TLB levels causing a page walk of 2MB/4MB page size that completes", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_1G", .udesc = "Number of misses in all TLB levels causing a page walk of 1GB page size that completes", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_ACTIVE", .udesc = "Cycles with at least one hardware walker active for a load", .ucode = 0x1000 | (0x1 
<< INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_DURATION", .udesc = "Cycles when hardware page walker is busy with page walks", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_PENDING", .udesc = "Cycles when hardware page walker is busy with page walks", .ucode = 0x1000, .uequiv = "WALK_DURATION", .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT", .udesc = "Number of cache load STLB hits. No page walk", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_itlb_misses[]={ { .uname = "MISS_CAUSES_A_WALK", .udesc = "Misses in all ITLB levels that cause page walks", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "Number of misses in all TLB levels causing a page walk of any page size that completes", .ucode = 0xe00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Number of misses in all TLB levels causing a page walk of 4KB page size that completes", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Number of misses in all TLB levels causing a page walk of 2MB/4MB page size that completes", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_1G", .udesc = "Number of misses in all TLB levels causing a page walk of 1GB page size that completes", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_DURATION", .udesc = "Cycles when PMH is busy with page walks", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_PENDING", .udesc = "Cycles when PMH is busy with page walks", .ucode = 0x1000, .uequiv = "WALK_DURATION", .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_ACTIVE", .udesc = "Cycles when at least one page walker is busy with a page walk request.
EPT page walks are excluded", .ucode = 0x1000 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "WALK_PENDING:c=1", .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT", .udesc = "Number of cache load STLB hits. No page walk", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_fp_assist[]={ { .uname = "ANY", .udesc = "Cycles with any input/output SSE or FP assists", .ucode = 0x1e00 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t skl_icache_16b[]={ { .uname = "IFDATA_STALL", .udesc = "Cycles where a code fetch is stalled due to L1 instruction cache miss", .ucode = 0x400, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_icache_64b[]={ { .uname = "IFTAG_HIT", .udesc = "Number of instruction fetch tag lookups that hit in the instruction cache (L1I). Counts at 64-byte cache-line granularity", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IFTAG_MISS", .udesc = "Number of instruction fetch tag lookups that miss in the instruction cache (L1I).
Counts at 64-byte cache-line granularity", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IFTAG_STALL", .udesc = "Cycles where a code fetch is stalled due to L1 instruction cache tag miss", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_idq[]={ { .uname = "MITE_UOPS", .udesc = "Number of uops delivered to Instruction Decode Queue (IDQ) from MITE path", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DSB_UOPS", .udesc = "Number of uops delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_DSB_UOPS", .udesc = "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_MITE_UOPS", .udesc = "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_UOPS", .udesc = "Number of uops delivered into Instruction Decode Queue (IDQ) from MS, initiated by Decode Stream Buffer (DSB) or MITE", .ucode = 0x3000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_UOPS_CYCLES", .udesc = "Number of cycles that uops were delivered into Instruction Decode Queue (IDQ) while MS is busy, initiated by Decode Stream Buffer (DSB) or MITE", .ucode = 0x3000 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "MS_UOPS:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MS_SWITCHES", .udesc = "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer", .ucode = 0x3000 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "MS_UOPS:c=1:e", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, { .uname = "MITE_UOPS_CYCLES", .udesc = "Cycles when uops are being
delivered to Instruction Decode Queue (IDQ) from MITE path", .ucode = 0x400 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "MITE_UOPS:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DSB_UOPS_CYCLES", .udesc = "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path", .ucode = 0x800 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "DSB_UOPS:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MS_DSB_UOPS_CYCLES", .udesc = "Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy", .ucode = 0x1000 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "MS_DSB_UOPS:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MS_DSB_OCCUR", .udesc = "Deliveries to Instruction Decode Queue (IDQ) initiated by Decode Stream Buffer (DSB) while Microcode Sequencer (MS) is busy", .ucode = 0x1000 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "MS_DSB_UOPS:c=1:e=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, { .uname = "ALL_DSB_CYCLES_4_UOPS", .udesc = "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops", .ucode = 0x1800 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ALL_DSB_CYCLES_ANY_UOPS", .udesc = "Cycles Decode Stream Buffer (DSB) is delivering any Uop", .ucode = 0x1800 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ALL_MITE_CYCLES_4_UOPS", .udesc = "Cycles MITE is delivering 4 Uops", .ucode = 0x2400 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ALL_MITE_CYCLES_ANY_UOPS", .udesc = "Cycles MITE is delivering any Uop", .ucode = 0x2400 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = 
INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ALL_MITE_UOPS", .udesc = "Number of uops delivered to Instruction Decode Queue (IDQ) from any path", .ucode = 0x3c00, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_idq_uops_not_delivered[]={ { .uname = "CORE", .udesc = "Count number of non-delivered uops to Resource Allocation Table (RAT)", .ucode = 0x100, .uflags = INTEL_X86_DFL, }, { .uname = "CYCLES_0_UOPS_DELIV_CORE", .udesc = "Number of uops not delivered to Resource Allocation Table (RAT) per thread when backend is not stalled", .ucode = 0x100 | (4 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_FE_WAS_OK", .udesc = "Count cycles front-end (FE) delivered 4 uops or Resource Allocation Table (RAT) was stalling front-end", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 inv=1 */ .uequiv = "CORE:c=1:i", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_I, }, { .uname = "CYCLES_LE_1_UOPS_DELIV_CORE", .udesc = "Count cycles per thread when 3 or more uops are not delivered to Resource Allocation Table (RAT) when backend is not stalled", .ucode = 0x100 | (3 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_LE_2_UOPS_DELIV_CORE", .udesc = "Count cycles with less than 2 uops delivered by the front-end", .ucode = 0x100 | (2 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_LE_3_UOPS_DELIV_CORE", .udesc = "Count cycles with less than 3 uops delivered by the front-end", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_inst_retired[]={ { .uname = "ANY_P", .udesc = "Number of instructions retired.
General Counter - architectural event", .ucode = 0x000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL", .udesc = "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution (Precise Event)", .ucode = 0x100, .uequiv = "PREC_DIST", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "TOTAL_CYCLES", .udesc = "Number of cycles using always true condition", .ucode = 0x100 | INTEL_X86_MOD_INV | (10 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=10 */ .uequiv = "PREC_DIST:i=1:c=10", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "PREC_DIST", .udesc = "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution (Precise event)", .ucode = 0x100, .ucntmsk= 0x2, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t skl_int_misc[]={ { .uname = "RECOVERY_CYCLES", .udesc = "Cycles waiting for the checkpoints in Resource Allocation Table (RAT) to be recovered after Nuke due to all other cases except JEClear (e.g. whenever a ucode assist is needed like SSE exception, memory disambiguation, etc...)", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RECOVERY_CYCLES_ANY", .udesc = "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. 
misprediction or memory nuke)", .ucode = 0x100 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "RECOVERY_CYCLES:t", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "RECOVERY_STALLS_COUNT", .udesc = "Number of occurrences waiting for Machine Clears", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "RECOVERY_CYCLES:e:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E, }, { .uname = "CLEAR_RESTEER_CYCLES", .udesc = "Number of cycles the issue-stage is waiting for front-end to fetch from resteered path following branch misprediction or machine clear events", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_itlb[]={ { .uname = "ITLB_FLUSH", .udesc = "Flushing of the Instruction TLB (ITLB) pages independent of page size", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_l1d[]={ { .uname = "REPLACEMENT", .udesc = "L1D Data line replacements", .ucode = 0x100, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_sq_misc[]={ { .uname = "SPLIT_LOCK", .udesc = "Number of split locks in the super queue (SQ)", .ucode = 0x1000, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_l1d_pend_miss[]={ { .uname = "PENDING", .udesc = "L1D misses outstanding duration in core cycles", .ucode = 0x100, .ucntmsk = 0x4, .uflags = INTEL_X86_DFL, }, { .uname = "FB_FULL", .udesc = "Number of times a request needed a fill buffer (FB) entry but there was no entry available for it. That is, FB unavailability was the dominant reason for blocking the request.
A request includes cacheable/uncacheable demand load, store or SW prefetch", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PENDING_CYCLES", .udesc = "Cycles with L1D misses outstanding", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "PENDING:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "PENDING_CYCLES_ANY", .udesc = "Cycles with L1D load misses outstanding from any thread", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT) | INTEL_X86_MOD_ANY, /* cnt=1 any=1 */ .uequiv = "PENDING:c=1:t", .ucntmsk = 0x4, .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_T, }, { .uname = "OCCURRENCES", .udesc = "Number of L1D miss outstanding occurrences", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "PENDING:c=1:e=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, { .uname = "EDGE", .udesc = "Number of L1D miss outstanding occurrences", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "PENDING:c=1:e=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t skl_l2_lines_in[]={ { .uname = "ALL", .udesc = "L2 cache lines filling L2", .ucode = 0x1f00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ANY", .udesc = "L2 cache lines filling L2", .uequiv = "ALL", .ucode = 0x1f00, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_l2_lines_out[]={ { .uname = "NON_SILENT", .udesc = "Counts the number of lines that are evicted by L2 cache when triggered by an L2 cache fill. Those lines can be either in modified state or clean state. Modified lines may either be written back to L3 or directly written to memory and not allocated in L3.
Clean lines may either be allocated in L3 or dropped", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "USELESS_HWPREF", .udesc = "Counts the number of lines that have been hardware prefetched but not used and now evicted by L2 cache", .ucode = 0x400, .uequiv = "USELESS_HWPF", .uflags = INTEL_X86_NCOMBO, }, { .uname = "USELESS_HWPF", .udesc = "Counts the number of lines that have been hardware prefetched but not used and now evicted by L2 cache", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SILENT", .udesc = "Counts the number of lines that are silently dropped by L2 cache when triggered by an L2 cache fill. These lines are typically in Shared state. This is a per-core event.", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_l2_rqsts[]={ { .uname = "DEMAND_DATA_RD_MISS", .udesc = "Demand Data Read requests that miss L2 cache", .ucode = 0x2100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_HIT", .udesc = "Demand Data Read requests, initiated by load instructions, that hit L2 cache", .ucode = 0xc100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_MISS", .udesc = "RFO requests that miss L2 cache", .ucode = 0x2200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_MISS", .udesc = "RFO requests that miss L2 cache", .ucode = 0x2200, .uequiv = "DEMAND_RFO_MISS", .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_HIT", .udesc = "RFO requests that hit L2 cache", .ucode = 0xc200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT", .udesc = "RFO requests that hit L2 cache", .ucode = 0xc200, .uequiv = "DEMAND_RFO_HIT", .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_MISS", .udesc = "L2 cache misses when fetching instructions", .ucode = 0x2400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_MISS", .udesc = "All demand requests that miss the L2 cache", .ucode = 0x2700, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_HIT", .udesc = "L2 cache hits when fetching instructions, code reads",
.ucode = 0xc400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS", .udesc = "All requests that miss the L2 cache", .ucode = 0x3f00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_MISS", .udesc = "Requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that miss L2 cache", .ucode = 0x3800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_HIT", .udesc = "Requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that hit L2 cache", .ucode = 0xd800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_DATA_RD", .udesc = "Any data read request to L2 cache", .ucode = 0xe100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_RFO", .udesc = "Any data RFO request to L2 cache", .ucode = 0xe200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_CODE_RD", .udesc = "Any code read request to L2 cache", .ucode = 0xe400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_REFERENCES", .udesc = "All demand requests to L2 cache ", .ucode = 0xe700, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_PF", .udesc = "Any L2 HW prefetch request to L2 cache", .ucode = 0xf800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REFERENCES", .udesc = "All requests to L2 cache", .ucode = 0xff00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_l2_trans[]={ { .uname = "L2_WB", .udesc = "L2 writebacks that access L2 cache", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_ld_blocks[]={ { .uname = "STORE_FORWARD", .udesc = "Counts the number of loads blocked by overlapping with store buffer entries that cannot be forwarded", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_SR", .udesc = "number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_ld_blocks_partial[]={ { .uname = "ADDRESS_ALIAS", .udesc = "False 
dependencies in MOB due to partial compare on address", .ucode = 0x100, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_load_hit_pre[]={ { .uname = "SW_PF", .udesc = "Demand load dispatches that hit L1D fill buffer (FB) allocated for software prefetch", .ucode = 0x100, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_lock_cycles[]={ { .uname = "CACHE_LOCK_DURATION", .udesc = "cycles that the L1D is locked", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_longest_lat_cache[]={ { .uname = "MISS", .udesc = "Core-originated cacheable demand requests missed LLC - architectural event", .ucode = 0x4100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REFERENCE", .udesc = "Core-originated cacheable demand requests that refer to LLC - architectural event", .ucode = 0x4f00, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_machine_clears[]={ { .uname = "COUNT", .udesc = "Number of machine clears (Nukes) of any type", .ucode = 0x100| (1 << INTEL_X86_CMASK_BIT) | (1 << INTEL_X86_EDGE_BIT), .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEMORY_ORDERING", .udesc = "Number of Memory Ordering Machine Clears detected", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SMC", .udesc = "Number of Self-modifying code (SMC) Machine Clears detected", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_mem_load_l3_hit_retired[]={ { .uname = "XSNP_MISS", .udesc = "Retired load uops which data sources were L3 hit and cross-core snoop missed in on-pkg core cache", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_HIT", .udesc = "Retired load uops which data sources were L3 and cross-core snoop hits in on-pkg core cache", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_HITM", .udesc = "Load had HitM Response from a core on same socket (shared 
L3)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_NONE", .udesc = "Retired load uops which data sources were hits in L3 without snoops required", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t skl_mem_load_l3_miss_retired[]={ { .uname = "LOCAL_DRAM", .udesc = "Retired load instructions which data sources missed L3 but serviced from local dram", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_DRAM", .udesc = "Retired load instructions which data sources missed L3 but serviced from remote dram", .ucode = 0x200, .umodel = PFM_PMU_INTEL_SKX, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_HITM", .udesc = "Retired load instructions whose data source was remote HITM", .ucode = 0x400, .umodel = PFM_PMU_INTEL_SKX, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_FWD", .udesc = "Retired load instructions whose data source was forwarded from a remote cache", .ucode = 0x800, .umodel = PFM_PMU_INTEL_SKX, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_DRAM", .udesc = "Retired load instructions which data sources missed L3 but serviced from remote dram", .ucode = 0x200, .umodel = PFM_PMU_INTEL_CLX, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_HITM", .udesc = "Retired load instructions whose data source was remote HITM", .ucode = 0x400, .umodel = PFM_PMU_INTEL_CLX, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_FWD", .udesc = "Retired load instructions whose data source was forwarded from a remote cache", .ucode = 0x800, .umodel = PFM_PMU_INTEL_CLX, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_PMM", .udesc = "Retired load instructions with remote persistent memory as the data source which missed all caches", .ucode = 0x1000, .umodel = PFM_PMU_INTEL_CLX, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const
intel_x86_umask_t skl_mem_load_retired[]={ { .uname = "L1_HIT", .udesc = "Retired load uops with L1 cache hits as data source", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_HIT", .udesc = "Retired load uops with L2 cache hits as data source", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_HIT", .udesc = "Retired load uops with L3 cache hits as data source", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1_MISS", .udesc = "Retired load uops which missed the L1D", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_MISS", .udesc = "Retired load uops which missed the L2. Unknown data source excluded", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_MISS", .udesc = "Retired load uops which missed the L3", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "HIT_LFB", .udesc = "Retired load uops which missed L1 but hit line fill buffer (LFB)", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "FB_HIT", .udesc = "Retired load uops which missed L1 but hit line fill buffer (LFB)", .ucode = 0x4000, .uequiv = "HIT_LFB", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOCAL_PMM", .udesc = "Retired load instructions with local persistent memory as the data source where the request missed all the caches", .umodel = PFM_PMU_INTEL_CLX, .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t skl_mem_trans_retired[]={ { .uname = "LOAD_LATENCY", .udesc = "Memory load instructions retired above programmed clocks, minimum threshold value is 3 (Precise Event and ldlat required)", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT | INTEL_X86_DFL, }, { .uname = "LATENCY_ABOVE_THRESHOLD", .udesc = "Memory load instructions retired above programmed clocks, minimum threshold value is 3 (Precise Event and 
ldlat required)", .ucode = 0x100, .uequiv = "LOAD_LATENCY", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT | INTEL_X86_NO_AUTOENCODE, }, }; static const intel_x86_umask_t skl_mem_inst_retired[]={ { .uname = "STLB_MISS_LOADS", .udesc = "Load uops with true STLB miss retired to architected path", .ucode = 0x1100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS_STORES", .udesc = "Store uops with true STLB miss retired to architected path", .ucode = 0x1200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOCK_LOADS", .udesc = "Load uops with locked access retired", .ucode = 0x2100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_LOADS", .udesc = "Line-split load uops retired", .ucode = 0x4100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_STORES", .udesc = "Line-split store uops retired", .ucode = 0x4200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_LOADS", .udesc = "All load uops retired", .ucode = 0x8100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_STORES", .udesc = "All store uops retired", .ucode = 0x8200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY", .udesc = "All retired memory instructions", .ucode = 0x8300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_misalign_mem_ref[]={ { .uname = "LOADS", .udesc = "Speculative cache-line split load uops dispatched to the L1D", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STORES", .udesc = "Speculative cache-line split store-address uops dispatched to L1D", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_move_elimination[]={ { .uname = "INT_ELIMINATED", .udesc = "Number of integer Move Elimination candidate uops that were eliminated", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SIMD_ELIMINATED", .udesc = "Number of SIMD Move Elimination candidate uops that
were eliminated", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "INT_NOT_ELIMINATED", .udesc = "Number of integer Move Elimination candidate uops that were not eliminated", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SIMD_NOT_ELIMINATED", .udesc = "Number of SIMD Move Elimination candidate uops that were not eliminated", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_offcore_requests[]={ { .uname = "DEMAND_DATA_RD", .udesc = "Demand data read requests sent to uncore (use with HT off only)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD", .udesc = "Demand code read requests sent to uncore (use with HT off only)", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO", .udesc = "Demand RFOs requests sent to uncore (use with HT off only)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DATA_RD", .udesc = "Data read requests sent to uncore (use with HT off only)", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_REQUESTS", .udesc = "Number of memory transactions that reached the superqueue (SQ)", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L3_MISS_DEMAND_DATA_RD", .udesc = "Number of demand data read requests which missed the L3 cache", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_other_assists[]={ { .uname = "ANY", .udesc = "Number of times a microcode assist is invoked by HW other than FP-assist. 
Examples include AD (page Access Dirty) and AVX* related assists", .ucode = 0x3f00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_resource_stalls[]={ { .uname = "ANY", .udesc = "Cycles Allocation is stalled due to Resource Related reason", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL", .udesc = "Cycles Allocation is stalled due to Resource Related reason", .ucode = 0x100, .uequiv = "ANY", .uflags = INTEL_X86_NCOMBO, }, { .uname = "RS", .udesc = "Stall cycles caused by absence of eligible entries in Reservation Station (RS)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SB", .udesc = "Cycles Allocator is stalled due to Store Buffer full (not including draining from synch)", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ROB", .udesc = "ROB full stall cycles", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_rob_misc_events[]={ { .uname = "LBR_INSERTS", .udesc = "Count each time a new Last Branch Record (LBR) is inserted", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PAUSE_INST", .udesc = "Count number of retired PAUSE instructions (that do not end up with a VMEXIT to the VMM; TSX aborted instructions may be counted).
This event is not supported on first SKL and KBL processors", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_rs_events[]={ { .uname = "EMPTY_CYCLES", .udesc = "Cycles the Reservation Station (RS) is empty for this thread", .ucode = 0x100, .uflags = INTEL_X86_DFL, }, { .uname = "EMPTY_END", .udesc = "Number of times the reservation station (RS) was empty", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT) | INTEL_X86_MOD_EDGE, /* inv=1, cmask=1,edge=1 */ .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, }; static const intel_x86_umask_t skl_tlb_flush[]={ { .uname = "DTLB_THREAD", .udesc = "Count number of DTLB flushes of thread-specific entries", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STLB_ANY", .udesc = "Count number of any STLB flushes", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_uops_executed[]={ { .uname = "THREAD", .udesc = "Number of uops executed per thread in each cycle", .ucode = 0x100, .uflags = INTEL_X86_DFL, }, { .uname = "THREAD_CYCLES_GE_1", .udesc = "Number of cycles with at least 1 uop executed per thread", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "THREAD_CYCLES_GE_2", .udesc = "Number of cycles with at least 2 uops executed per thread", .ucode = 0x100 | (0x2 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO }, { .uname = "THREAD_CYCLES_GE_3", .udesc = "Number of cycles with at least 3 uops executed per thread", .ucode = 0x100 | (0x3 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "THREAD_CYCLES_GE_4", .udesc = "Number of cycles with at least 4 uops executed per thread", .ucode = 0x100 | (0x4 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORE", .udesc = "Number of uops executed from any thread in each
cycle", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORE_CYCLES_GE_1", .udesc = "Number of cycles with at least 1 uop executed for any thread", .ucode = 0x200 | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO }, { .uname = "CORE_CYCLES_GE_2", .udesc = "Number of cycles with at least 2 uops executed for any thread", .ucode = 0x200 | (0x2 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO }, { .uname = "CORE_CYCLES_GE_3", .udesc = "Number of cycles with at least 3 uops executed for any thread", .ucode = 0x200 | (0x3 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORE_CYCLES_GE_4", .udesc = "Number of cycles with at least 4 uops executed for any thread", .ucode = 0x200 | (0x4 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALL_CYCLES", .udesc = "Number of cycles with no uops executed by thread", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=1 */ .uequiv = "THREAD:c=1:i", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_NONE", .udesc = "Number of cycles with no uops executed from any thread", .ucode = 0x200 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=1 */ .uequiv = "CORE:c=1:i", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "X87", .udesc = "Number of x87 uops executed per thread", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO }, }; static const intel_x86_umask_t skl_uops_dispatched_port[]={ { .uname = "PORT_0", .udesc = "Cycles which a Uop is executed on port 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_1", .udesc = "Cycles which a Uop is executed on port 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_2", .udesc = "Cycles which a Uop is executed on port 2", .ucode = 0x400, .uflags =
INTEL_X86_NCOMBO, }, { .uname = "PORT_3", .udesc = "Cycles which a Uop is executed on port 3", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_4", .udesc = "Cycles which a Uop is executed on port 4", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_5", .udesc = "Cycles which a Uop is executed on port 5", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_6", .udesc = "Cycles which a Uop is executed on port 6", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_7", .udesc = "Cycles which a Uop is executed on port 7", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_0_CORE", .udesc = "Cycles which a Uop is executed on port 0 by any hardware thread", .ucode = 0x100 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_0:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_1_CORE", .udesc = "Cycles which a Uop is executed on port 1 by any hardware thread", .ucode = 0x200 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_1:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_2_CORE", .udesc = "Cycles which a Uop is executed on port 2 by any hardware thread", .ucode = 0x400 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_2:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_3_CORE", .udesc = "Cycles which a Uop is executed on port 3 by any hardware thread", .ucode = 0x800 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_3:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_4_CORE", .udesc = "Cycles which a Uop is executed on port 4 by any hardware thread", .ucode = 0x1000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_4:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_5_CORE", .udesc = "Cycles which a Uop is executed on port 5 by any hardware thread", .ucode = 0x2000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_5:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_6_CORE", .udesc = "Cycles which a Uop is executed on port 6 by any hardware thread", .ucode = 0x4000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_6:t=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_7_CORE", .udesc = "Cycles which a Uop is executed on port 7 by any hardware thread", .ucode = 0x8000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_7:t=1", .uflags = INTEL_X86_NCOMBO, .modhw =
_INTEL_X86_ATTR_T, }, }; static const intel_x86_umask_t skl_uops_issued[]={ { .uname = "ANY", .udesc = "Number of Uops issued by the Resource Allocation Table (RAT) to the Reservation Station (RS)", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL", .udesc = "Number of Uops issued by the Resource Allocation Table (RAT) to the Reservation Station (RS)", .ucode = 0x100, .uequiv = "ANY", .uflags = INTEL_X86_NCOMBO, }, { .uname = "VECTOR_WIDTH_MISMATCH", .udesc = "Number of blend uops issued by the Resource Allocation table (RAT) to the Reservation Station (RS) in order to preserve upper bits of vector registers", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLAGS_MERGE", .udesc = "Number of flags-merge uops being allocated. Such uops add delay", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_LEA", .udesc = "Number of slow LEA or similar uops allocated. Such a uop has 3 sources regardless of whether it is the result of a LEA instruction or not", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SINGLE_MUL", .udesc = "Number of Multiply packed/scalar single precision uops allocated", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALL_CYCLES", .udesc = "Counts the number of cycles no uops issued by this thread", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=1 */ .uequiv = "ANY:c=1:i=1", .uflags = INTEL_X86_NCOMBO, .ucntmsk = 0xf, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "CORE_STALL_CYCLES", .udesc = "Counts the number of cycles no uops issued on this core", .ucode = 0x100 | INTEL_X86_MOD_ANY | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* any=1 inv=1 cnt=1 */ .uequiv = "ANY:c=1:i=1:t=1", .ucntmsk = 0xf, .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T | _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t skl_uops_retired[]={ { .uname = "ALL", .udesc = "All uops that actually retired", .ucode = 0x100, .uflags =
INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "ANY", .udesc = "All uops that actually retired", .ucode = 0x100, .uequiv = "ALL", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RETIRE_SLOTS", .udesc = "Number of retirement slots used (non PEBS)", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STALL_CYCLES", .udesc = "Cycles no executable uops retired (Precise Event)", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=1 */ .uequiv = "ALL:c=1:i", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "TOTAL_CYCLES", .udesc = "Number of cycles using always true condition applied to PEBS uops retired event", .ucode = 0x100 | INTEL_X86_MOD_INV | (10 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=10 */ .uequiv = "ALL:c=10:i", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "CORE_STALL_CYCLES", .udesc = "Cycles no executable uops retired on core (Precise Event)", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=1 */ .uequiv = "ALL:c=1:i:t=1", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "STALL_OCCURRENCES", .udesc = "Number of transitions from stalled to unstalled execution (Precise Event)", .ucode = 0x100 | INTEL_X86_MOD_INV | INTEL_X86_MOD_EDGE| (1 << INTEL_X86_CMASK_BIT), /* inv=1 edge=1 cnt=1 */ .uequiv = "ALL:c=1:i=1:e=1", .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, }; static const intel_x86_umask_t skl_offcore_response[]={ { .uname = "DMND_DATA_RD", .udesc = "Request: number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads.
Does not count L2 data read prefetches or instruction fetches", .ucode = 1ULL << (0 + 8), .grpid = 0, }, { .uname = "DMND_RFO", .udesc = "Request: number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO prefetches", .ucode = 1ULL << (1 + 8), .grpid = 0, }, { .uname = "DMND_CODE_RD", .udesc = "Request: number of demand and DCU prefetch instruction cacheline reads. Does not count L2 code read prefetches", .ucode = 1ULL << (2 + 8), .grpid = 0, }, { .uname = "PF_L2_DATA_RD", .udesc = "Request: number of data prefetch requests to L2", .ucode = 1ULL << (4 + 8), .umodel = PFM_PMU_INTEL_SKX, .grpid = 0, }, { .uname = "PF_L2_RFO", .udesc = "Request: number of RFO prefetch requests to L2", .ucode = 1ULL << (5 + 8), .umodel = PFM_PMU_INTEL_SKX, .grpid = 0, }, { .uname = "PF_L3_DATA_RD", .udesc = "Request: number of data prefetch requests for loads that end up in L3", .ucode = 1ULL << (7 + 8), .umodel = PFM_PMU_INTEL_SKX, .grpid = 0, }, { .uname = "PF_L3_RFO", .udesc = "Request: number of RFO prefetch requests that end up in L3", .ucode = 1ULL << (8 + 8), .umodel = PFM_PMU_INTEL_SKX, .grpid = 0, }, { .uname = "PF_L1D_AND_SW", .udesc = "Request: number of L1 data cache hardware prefetch requests and software prefetch requests", .ucode = 1ULL << (10 + 8), .umodel = PFM_PMU_INTEL_SKX, .grpid = 0, }, { .uname = "PF_L2_DATA_RD", .udesc = "Request: number of data prefetch requests to L2", .ucode = 1ULL << (4 + 8), .umodel = PFM_PMU_INTEL_CLX, .grpid = 0, }, { .uname = "PF_L2_RFO", .udesc = "Request: number of RFO prefetch requests to L2", .ucode = 1ULL << (5 + 8), .umodel = PFM_PMU_INTEL_CLX, .grpid = 0, }, { .uname = "PF_L3_DATA_RD", .udesc = "Request: number of data prefetch requests for loads that end up in L3", .ucode = 1ULL << (7 + 8), .umodel = PFM_PMU_INTEL_CLX, .grpid = 0, }, { .uname = "PF_L3_RFO", .udesc = "Request: number of RFO prefetch requests that end up in L3", .ucode = 1ULL << (8 + 8), 
.umodel = PFM_PMU_INTEL_CLX, .grpid = 0, }, { .uname = "PF_L1D_AND_SW", .udesc = "Request: number of L1 data cache hardware prefetch requests and software prefetch requests", .ucode = 1ULL << (10 + 8), .umodel = PFM_PMU_INTEL_CLX, .grpid = 0, }, { .uname = "OTHER", .udesc = "Request: counts one of the following transaction types, including L3 invalidate, I/O, full or partial writes, WC or non-temporal stores, CLFLUSH, Fences, lock, unlock, split lock", .ucode = 1ULL << (15+8), .grpid = 0, }, { .uname = "ANY_REQUEST", .udesc = "Request: combination of all request umasks", .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:OTHER", .ucode = 0x1800700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .umodel = PFM_PMU_INTEL_SKL, .grpid = 0, }, { .uname = "ANY_REQUEST", .udesc = "Request: combination of all request umasks", .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:PF_L2_DATA_RD:PF_L2_RFO:PF_L3_DATA_RD:PF_L3_RFO:PF_L1D_AND_SW:OTHER", .ucode = 0x85b700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .umodel = PFM_PMU_INTEL_SKX, .grpid = 0, }, { .uname = "ANY_DATA_RD", .udesc = "Request: combination of DMND_DATA_RD | PF_L2_DATA_RD | PF_L3_DATA_RD | PF_L1D_AND_SW", .uequiv = "DMND_DATA_RD:PF_L2_DATA_RD:PF_L3_DATA_RD:PF_L1D_AND_SW", .ucode = 0x1049100, .umodel = PFM_PMU_INTEL_SKX, .grpid = 0, }, { .uname = "ANY_DATA", .udesc = "Request: combination of ANY_DATA_RD | PF_L2_RFO | PF_L3_RFO | DMND_RFO", .uequiv = "ANY_DATA_RD:DMND_RFO:PF_L2_RFO:PF_L3_RFO", .ucode = 0x105b300, .umodel = PFM_PMU_INTEL_SKX, .grpid = 0, }, { .uname = "ANY_DATA_PF", .udesc = "Request: combination of PF_L2_DATA_RD | PF_L3_DATA_RD | PF_L1D_AND_SW", .uequiv = "PF_L2_DATA_RD:PF_L3_DATA_RD:PF_L1D_AND_SW", .ucode = 0x1049000, .umodel = PFM_PMU_INTEL_SKX, .grpid = 0, }, { .uname = "ANY_RFO", .udesc = "Request: combination of DMND_RFO | PF_L2_RFO | PF_L3_RFO", .uequiv = "DMND_RFO:PF_L2_RFO:PF_L3_RFO", .ucode = 0x1012200, .umodel = PFM_PMU_INTEL_SKX, .grpid = 0, }, { .uname = "ANY_REQUEST", .udesc = "Request: 
combination of all request umasks", .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:PF_L2_DATA_RD:PF_L2_RFO:PF_L3_DATA_RD:PF_L3_RFO:PF_L1D_AND_SW:OTHER", .ucode = 0x85b700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .umodel = PFM_PMU_INTEL_CLX, .grpid = 0, }, { .uname = "ANY_DATA_RD", .udesc = "Request: combination of DMND_DATA_RD | PF_L2_DATA_RD | PF_L3_DATA_RD | PF_L1D_AND_SW", .uequiv = "DMND_DATA_RD:PF_L2_DATA_RD:PF_L3_DATA_RD:PF_L1D_AND_SW", .ucode = 0x1049100, .umodel = PFM_PMU_INTEL_CLX, .grpid = 0, }, { .uname = "ANY_DATA", .udesc = "Request: combination of ANY_DATA_RD | PF_L2_RFO | PF_L3_RFO | DMND_RFO", .uequiv = "ANY_DATA_RD:DMND_RFO:PF_L2_RFO:PF_L3_RFO", .ucode = 0x105b300, .umodel = PFM_PMU_INTEL_CLX, .grpid = 0, }, { .uname = "ANY_DATA_PF", .udesc = "Request: combination of PF_L2_DATA_RD | PF_L3_DATA_RD | PF_L1D_AND_SW", .uequiv = "PF_L2_DATA_RD:PF_L3_DATA_RD:PF_L1D_AND_SW", .ucode = 0x1049000, .umodel = PFM_PMU_INTEL_CLX, .grpid = 0, }, { .uname = "ANY_RFO", .udesc = "Request: combination of DMND_RFO | PF_L2_RFO | PF_L3_RFO", .uequiv = "DMND_RFO:PF_L2_RFO:PF_L3_RFO", .ucode = 0x1012200, .umodel = PFM_PMU_INTEL_CLX, .grpid = 0, }, { .uname = "ANY_RESPONSE", .udesc = "Response: count any response type", .ucode = 1ULL << (16+8), .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, .grpid = 1, }, { .uname = "SUPPLIER_NONE", .udesc = "Supplier: counts number of times supplier information is not available", .ucode = 1ULL << (17+8), .grpid = 1, }, { .uname = "NO_SUPP", .udesc = "Supplier: counts number of times supplier information is not available", .ucode = 1ULL << (17+8), .uequiv = "SUPPLIER_NONE", .grpid = 1, }, { .uname = "L3_HITM", .udesc = "Supplier: counts L3 hits in M-state (initial lookup)", .ucode = 1ULL << (18+8), .grpid = 1, }, { .uname = "L3_HITE", .udesc = "Supplier: counts L3 hits in E-state", .ucode = 1ULL << (19+8), .grpid = 1, }, { .uname = "L3_HITS", .udesc = "Supplier: counts L3 hits in S-state", .ucode = 1ULL << (20+8), 
.grpid = 1, }, { .uname = "L3_HITF", .udesc = "Supplier: counts L3 hits in F-state", .ucode = 1ULL << (21+8), .umodel = PFM_PMU_INTEL_SKX, .grpid = 1, }, { .uname = "L3_HITF", .udesc = "Supplier: counts L3 hits in F-state", .ucode = 1ULL << (21+8), .umodel = PFM_PMU_INTEL_CLX, .grpid = 1, }, { .uname = "L3_HITMES", .udesc = "Supplier: counts L3 hits in any state (M, E, S)", .ucode = 0x3ULL << (18+8), .uequiv = "L3_HITM:L3_HITE:L3_HITS", .umodel = PFM_PMU_INTEL_SKL, .grpid = 1, }, { .uname = "L3_HIT", .udesc = "Alias for L3_HITMES", .ucode = 0x3ULL << (18+8), .uequiv = "L3_HITMES", .umodel = PFM_PMU_INTEL_SKL, .grpid = 1, }, { .uname = "L3_HITMESF", .udesc = "Supplier: counts L3 hits in any state (M, E, S, F)", .ucode = 0xfULL << (18+8), .uequiv = "L3_HITM:L3_HITE:L3_HITS:L3_HITF", .umodel = PFM_PMU_INTEL_SKX, .grpid = 1, }, { .uname = "L3_HIT", .udesc = "Alias for L3_HITMESF", .ucode = 0xfULL << (18+8), .uequiv = "L3_HITMESF", .umodel = PFM_PMU_INTEL_SKX, .grpid = 1, }, { .uname = "L3_HITMESF", .udesc = "Supplier: counts L3 hits in any state (M, E, S, F)", .ucode = 0xfULL << (18+8), .uequiv = "L3_HITM:L3_HITE:L3_HITS:L3_HITF", .umodel = PFM_PMU_INTEL_CLX, .grpid = 1, }, { .uname = "L3_HIT", .udesc = "Alias for L3_HITMESF", .ucode = 0xfULL << (18+8), .uequiv = "L3_HITMESF", .umodel = PFM_PMU_INTEL_CLX, .grpid = 1, }, { .uname = "L4_HIT_LOCAL_L4", .udesc = "Supplier: L4 local hit", .ucode = 0x1ULL << (22+8), .umodel = PFM_PMU_INTEL_SKL, .grpid = 1, }, { .uname = "L3_MISS_LOCAL", .udesc = "Supplier: counts L3 misses to local DRAM", .ucode = 1ULL << (26+8), .grpid = 1, }, { .uname = "L3_MISS_REMOTE_HOP1_DRAM", .udesc = "Supplier: counts L3 misses to remote DRAM with 1 hop", .ucode = 1ULL << (28+8), .grpid = 1, }, { .uname = "L3_MISS", .udesc = "Supplier: counts L3 misses", .ucode = 0x1ULL << (26+8), .uequiv = "L3_MISS_LOCAL", .umodel = PFM_PMU_INTEL_SKL, .grpid = 1, }, { .uname = "L3_MISS", .udesc = "Supplier: counts L3 misses (local or remote)", .ucode = 0xfULL <<
(26+8), .uequiv = "L3_MISS_LOCAL", .umodel = PFM_PMU_INTEL_SKX, .grpid = 1, }, { .uname = "L3_MISS", .udesc = "Supplier: counts L3 misses (local or remote)", .ucode = 0x1ULL << (26+8), .uequiv = "L3_MISS_LOCAL", .umodel = PFM_PMU_INTEL_CLX, .grpid = 1, }, { .uname = "SPL_HIT", .udesc = "Snoop: counts L3 supplier hit", .ucode = 0x1ULL << (30+8), .umodel = PFM_PMU_INTEL_SKL, .grpid = 1, }, { .uname = "SNP_NONE", .udesc = "Snoop: counts number of times no snoop-related information is available", .ucode = 1ULL << (31+8), .grpid = 2, }, { .uname = "SNP_NOT_NEEDED", .udesc = "Snoop: counts the number of times no snoop was needed to satisfy the request", .ucode = 1ULL << (32+8), .grpid = 2, }, { .uname = "SNP_MISS", .udesc = "Snoop: counts number of times a snoop was needed and it missed all snooped caches", .ucode = 1ULL << (33+8), .grpid = 2, }, { .uname = "SNP_HIT_NO_FWD", .udesc = "Snoop: counts number of times a snoop was needed and it hit in at least one snooped cache", .ucode = 1ULL << (34+8), .grpid = 2, }, { .uname = "SNP_HIT_WITH_FWD", .udesc = "Snoop: counts number of times a snoop was needed and data was forwarded from a remote socket", .ucode = 1ULL << (35+8), .grpid = 2, }, { .uname = "SNP_HITM", .udesc = "Snoop: counts number of times a snoop was needed and it hitM-ed in local or remote cache", .ucode = 1ULL << (36+8), .grpid = 2, }, { .uname = "SNP_NON_DRAM", .udesc = "Snoop: counts number of times target was a non-DRAM system address.
This includes MMIO transactions", .ucode = 1ULL << (37+8), .grpid = 2, }, { .uname = "SNP_ANY", .udesc = "Snoop: any snoop reason", .ucode = 0x7fULL << (31+8), .uequiv = "SNP_NONE:SNP_NOT_NEEDED:SNP_MISS:SNP_HIT_NO_FWD:SNP_HIT_WITH_FWD:SNP_HITM:SNP_NON_DRAM", .uflags= INTEL_X86_DFL, .grpid = 2, }, }; static const intel_x86_umask_t skl_hle_retired[]={ { .uname = "START", .udesc = "Number of times an HLE execution started", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "COMMIT", .udesc = "Number of times an HLE execution successfully committed", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED", .udesc = "Number of times an HLE execution aborted due to any reasons (multiple categories may count as one) (Precise Event)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ABORTED_MEM", .udesc = "Number of times an HLE execution aborted due to various memory events", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_TMR", .udesc = "Number of times an HLE execution aborted due to hardware timer expiration", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_UNFRIENDLY", .udesc = "Number of times an HLE execution aborted due to HLE-unfriendly instructions and certain events such as AD-assists", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MEMTYPE", .udesc = "Number of times an HLE execution aborted due to incompatible memory type", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_EVENTS", .udesc = "Number of times an HLE execution aborted due to none of the other 4 reasons (e.g., interrupt)", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_rtm_retired[]={ { .uname = "START", .udesc = "Number of times an RTM execution started", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "COMMIT", .udesc = "Number of times an RTM execution successfully committed", .ucode = 
0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED", .udesc = "Number of times an RTM execution aborted due to any reasons (multiple categories may count as one) (Precise Event)", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ABORTED_MEM", .udesc = "Number of times an RTM execution aborted due to various memory events", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_TMR", .udesc = "Number of times an RTM execution aborted due to uncommon conditions", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_UNFRIENDLY", .udesc = "Number of times an RTM execution aborted due to RTM-unfriendly instructions", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MEMTYPE", .udesc = "Number of times an RTM execution aborted due to incompatible memory type", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_EVENTS", .udesc = "Number of times an RTM execution aborted due to none of the other 4 reasons (e.g., interrupt)", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_tx_mem[]={ { .uname = "ABORT_CONFLICT", .udesc = "Number of times a transactional abort was signaled due to data conflict on a transactionally accessed address", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_CAPACITY", .udesc = "Number of times a transactional abort was signaled due to data capacity limitation", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_STORE_TO_ELIDED_LOCK", .udesc = "Number of times a HLE transactional execution aborted due to a non xrelease prefixed instruction writing to an elided lock in the elision buffer", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_ELISION_BUFFER_NOT_EMPTY", .udesc = "Number of times a HLE transactional execution aborted due to NoAllocatedElisionBuffer being non-zero", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_ELISION_BUFFER_MISMATCH", 
.udesc = "Number of times a HLE transaction execution aborted due to xrelease lock not satisfying the address and value requirements in the elision buffer", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_ELISION_BUFFER_UNSUPPORTED_ALIGNMENT", .udesc = "Number of times a HLE transaction execution aborted due to an unsupported read alignment from the elision buffer", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_HLE_ELISION_BUFFER_FULL", .udesc = "Number of times a HLE lock could not be elided due to ElisionBufferAvailable being zero", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_tx_exec[]={ { .uname = "MISC1", .udesc = "Number of times a class of instructions that may cause a transactional abort was executed. Since this is the count of execution, it may not always cause a transactional abort", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISC2", .udesc = "Number of times a class of instructions that may cause a transactional abort was executed inside a transactional region", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISC3", .udesc = "Number of times an instruction execution caused the supported nest count to be exceeded", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISC4", .udesc = "Number of times an xbegin instruction was executed inside an HLE transactional region", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISC5", .udesc = "Number of times an instruction with HLE xacquire prefix was executed inside an RTM transactional region", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_offcore_requests_outstanding[]={ { .uname = "ALL_DATA_RD_CYCLES", .udesc = "Cycles with cacheable data read transactions in the superQ (use with HT off only)", .uequiv = "ALL_DATA_RD:c=1", .ucode = 0x800 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname =
"DEMAND_CODE_RD_CYCLES", .udesc = "Cycles with demand code read transactions in the superQ (use with HT off only)", .uequiv = "DEMAND_CODE_RD:c=1", .ucode = 0x200 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "CYCLES_WITH_DEMAND_CODE_RD", .udesc = "Cycles with demand code read transactions in the superQ (use with HT off only)", .uequiv = "DEMAND_CODE_RD:c=1", .ucode = 0x200 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_CYCLES", .udesc = "Cycles with demand data read transactions in the superQ (use with HT off only)", .uequiv = "DEMAND_DATA_RD:c=1", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "CYCLES_WITH_DEMAND_DATA_RD", .udesc = "Cycles with demand data read transactions in the superQ (use with HT off only)", .uequiv = "DEMAND_DATA_RD:c=1", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DATA_RD", .udesc = "Cacheable data read transactions in the superQ every cycle (use with HT off only)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD", .udesc = "Code read transactions in the superQ every cycle (use with HT off only)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD", .udesc = "Demand data read transactions in the superQ every cycle (use with HT off only)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_GE_6", .udesc = "Cycles with at least 6 offcore outstanding demand data read requests in the uncore queue", .uequiv = "DEMAND_DATA_RD:c=6", .ucode = 0x100 | (6 << INTEL_X86_CMASK_BIT), /* cnt=6 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DEMAND_RFO", .udesc = "Outstanding RFO (store) transactions in the superQ every cycle (use with HT off only)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_CYCLES", .udesc = "Cycles with outstanding RFO (store) transactions in the superQ (use
with HT off only)", .uequiv = "DEMAND_RFO:c=1", .ucode = 0x400 | (0x1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_WITH_DEMAND_RFO", .udesc = "Cycles with outstanding RFO (store) transactions in the superQ (use with HT off only)", .uequiv = "DEMAND_RFO:c=1", .ucode = 0x400 | (0x1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "L3_MISS_DEMAND_DATA_RD", .udesc = "Number of offcore outstanding demand data read requests missing the L3 cache every cycle", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L3_MISS_DEMAND_DATA_RD_GE_6", .udesc = "Number of cycles in which at least 6 demand data read requests are missing the L3", .ucode = 0x1000 | (0x6 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_WITH_L3_MISS_DEMAND_DATA_RD", .udesc = "Cycles with at least 1 Demand Data Read request that misses the L3 cache in the superQ", .ucode = 0x1000 | (0x1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t skl_ild_stall[]={ { .uname = "LCP", .udesc = "Stall caused by changing prefix length of the instruction", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_lsd[]={ { .uname = "UOPS", .udesc = "Number of uops delivered by the Loop Stream Detector (LSD)", .ucode = 0x100, .uflags= INTEL_X86_DFL | INTEL_X86_NCOMBO, }, { .uname = "CYCLES_4_UOPS", .udesc = "Number of cycles the LSD delivered 4 uops which did not come from the decoder", .ucode = 0x100| (0x4 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CYCLES_ACTIVE", .udesc = "Number of cycles the LSD delivered uops which did not come from the decoder", .ucode = 0x100| (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags= INTEL_X86_NCOMBO, }, }; static const
intel_x86_umask_t skl_dsb2mite_switches[]={ { .uname = "PENALTY_CYCLES", .udesc = "Number of DSB to MITE switch true penalty cycles", .ucode = 0x0200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_ept[]={ { .uname = "WALK_DURATION", .udesc = "Cycles for an extended page table walk of any type", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "WALK_PENDING", .udesc = "Cycles for an extended page table walk of any type", .ucode = 0x1000, .uequiv = "WALK_DURATION", .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_arith[]={ { .uname = "DIVIDER_ACTIVE", .udesc = "Cycles when divider is busy executing divide or square root operations on integers or floating-points", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_DFL | INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "FPU_DIV_ACTIVE", .udesc = "Cycles when divider is busy executing divide or square root operations on integers or floating-points", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), .uequiv = "DIVIDER_ACTIVE", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t skl_fp_arith[]={ { .uname = "SCALAR_DOUBLE", .udesc = "Number of scalar double precision floating-point arithmetic instructions (multiply by 1 to get flops)", .ucode = 0x0100, }, { .uname = "SCALAR_SINGLE", .udesc = "Number of scalar single precision floating-point arithmetic instructions (multiply by 1 to get flops)", .ucode = 0x0200, }, { .uname = "128B_PACKED_DOUBLE", .udesc = "Number of 128-bit packed double precision floating-point arithmetic instructions (multiply by 2 to get flops)", .ucode = 0x0400, }, { .uname = "128B_PACKED_SINGLE", .udesc = "Number of 128-bit packed single precision floating-point arithmetic instructions (multiply by 4 to get flops)", .ucode = 0x0800, }, { .uname = "256B_PACKED_DOUBLE", .udesc = "Number of 256-bit packed double precision floating-point
arithmetic instructions (multiply by 4 to get flops)", .ucode = 0x1000, }, { .uname = "256B_PACKED_SINGLE", .udesc = "Number of 256-bit packed single precision floating-point arithmetic instructions (multiply by 8 to get flops)", .ucode = 0x2000, }, { .uname = "512B_PACKED_DOUBLE", .udesc = "Number of 512-bit packed double precision floating-point arithmetic instructions (multiply by 8 to get flops)", .ucode = 0x4000, }, { .uname = "512B_PACKED_SINGLE", .udesc = "Number of 512-bit packed single precision floating-point arithmetic instructions (multiply by 16 to get flops)", .ucode = 0x8000, }, }; static const intel_x86_umask_t skl_exe_activity[]={ { .uname = "1_PORTS_UTIL", .udesc = "Cycles with 1 uop executing across all ports and Reservation Station is not empty", .ucode = 0x0200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "2_PORTS_UTIL", .udesc = "Cycles with 2 uops executing across all ports and Reservation Station is not empty", .ucode = 0x0400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "3_PORTS_UTIL", .udesc = "Cycles with 3 uops executing across all ports and Reservation Station is not empty", .ucode = 0x0800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "4_PORTS_UTIL", .udesc = "Cycles with 4 uops executing across all ports and Reservation Station is not empty", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "BOUND_ON_STORES", .udesc = "Cycles where the store buffer is full and no outstanding load", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "EXE_BOUND_0_PORTS", .udesc = "Cycles where no uop is executed and the Reservation Station was not empty", .ucode = 0x0100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_frontend_retired[]={ { .uname = "DSB_MISS", .udesc = "Retired instructions experiencing a critical decode stream buffer (DSB) miss.
A critical DSB miss can cause stalls in the backend", .ucode = 0x11 << 8, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY_DSB_MISS", .udesc = "Retired Instructions experiencing a decode stream buffer (DSB) miss.", .ucode = 0x1 << 8, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ITLB_MISS", .udesc = "Retired instructions experiencing ITLB true miss", .ucode = 0x14 << 8, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1I_MISS", .udesc = "Retired instructions experiencing L1I cache true miss", .ucode = 0x12 << 8, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_MISS", .udesc = "Retired instructions experiencing instruction L2 cache true miss", .ucode = 0x13 << 8, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS", .udesc = "Retired instructions experiencing STLB (2nd level TLB) true miss", .ucode = 0x15 << 8, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "IDQ_4_BUBBLES", .udesc = "Retired instructions after an interval where the front-end did not deliver any uops (4 bubbles) for a period determined by the fe_thres modifier and which was not interrupted by a back-end stall", .ucode = (4 << 20 | 0x6) << 8, .uflags= INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, { .uname = "IDQ_3_BUBBLES", .udesc = "Counts instructions retired after an interval where the front-end did not deliver more than 1 uop (3 bubbles) for a period determined by the fe_thres modifier and which was not interrupted by a back-end stall", .ucode = (3 << 20 | 0x6) << 8, .uflags= INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, { .uname = "IDQ_2_BUBBLES", .udesc = "Counts instructions retired after an interval where the front-end did not deliver more than 2 uops (2 bubbles) for a period determined by the fe_thres modifier and which was not interrupted by a back-end stall", .ucode = (2 << 20 | 0x6) << 8, .uflags= INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, { .uname = "IDQ_1_BUBBLE", .udesc = 
"Counts instructions retired after an interval where the front-end did not deliver more than 3 uops (1 bubble) for a period determined by the fe_thres modifier and which was not interrupted by a back-end stall", .ucode = (1 << 20 | 0x6) << 8, .uflags= INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t skl_hw_interrupts[]={ { .uname = "RECEIVED", .udesc = "Number of hardware interrupts received by the processor", .ucode = 0x100, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_offcore_requests_buffer[]={ { .uname = "SQ_FULL", .udesc = "Number of requests for which the offcore buffer (SQ) is full", .ucode = 0x100, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_mem_load_misc_retired[]={ { .uname = "UC", .udesc = "Number of uncached loads retired", .ucode = 0x400, .uflags = INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_idi_misc[]={ { .uname = "WB_UPGRADE", .udesc = "Counts number of cache lines that are allocated and written back to L3 with the intention that they are more likely to be reused shortly", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WB_DOWNGRADE", .udesc = "Counts number of cache lines that are dropped and not written back to L3 as they are deemed to be less likely to be reused shortly", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_core_power[]={ { .uname = "LVL0_TURBO_LICENSE", .udesc = "Number of core cycles where the core was running in a manner where Turbo may be clipped to the Non-AVX turbo schedule.", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LVL1_TURBO_LICENSE", .udesc = "Number of core cycles where the core was running in a manner where Turbo may be clipped to the AVX2 turbo schedule.", .ucode = 0x1800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LVL2_TURBO_LICENSE", .udesc = "Number of core cycles where the core was running in a manner where Turbo may be clipped to the AVX512 turbo
schedule.", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "THROTTLE", .udesc = "Number of core cycles where the core was throttled due to a pending power level request.", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_sw_prefetch[]={ { .uname = "NTA", .udesc = "Number of prefetch.nta instructions executed", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "T0", .udesc = "Number of prefetch.t0 instructions executed", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "T1_T2", .udesc = "Number of prefetch.t1 or prefetch.t2 instructions executed", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCHW", .udesc = "Number of prefetch.w instructions executed", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_core_snoop_response[]={ { .uname = "RSP_IHITI", .udesc = "TBD", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RSP_IHITFSE", .udesc = "TBD", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RSP_SHITFSE", .udesc = "TBD", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RSP_SFWDM", .udesc = "TBD", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RSP_IFWDM", .udesc = "TBD", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RSP_IFWDFE", .udesc = "TBD", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RSP_SFWDFE", .udesc = "TBD", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t skl_partial_rat_stalls[]={ { .uname = "SCOREBOARD", .udesc = "Count core cycles where the pipeline is stalled due to serialization operations", .ucode = 0x100, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t skl_br_misp_exec[]={ { .uname = "INDIRECT", .udesc = "Speculative mispredicted indirect branches", .ucode = 0xe400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_BRANCHES", .udesc = "Speculative and retired mispredicted macro conditional branches", .ucode = 0xff00, .uflags =
INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_entry_t intel_skl_pe[]={ { .name = "UNHALTED_CORE_CYCLES", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0x20000000full, .code = 0x3c, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles", .modmsk = INTEL_FIXED3_ATTRS, .cntmsk = 0x400000000ull, .code = 0x0300, /* pseudo encoding */ .flags = INTEL_X86_FIXED, }, { .name = "INSTRUCTION_RETIRED", .desc = "Number of instructions at retirement", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0x10000000full, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "This is an alias for INSTRUCTION_RETIRED", .modmsk = INTEL_V4_ATTRS, .equiv = "INSTRUCTION_RETIRED", .cntmsk = 0x10000000full, .code = 0xc0, }, { .name = "BRANCH_INSTRUCTIONS_RETIRED", .desc = "Count branch instructions at retirement. Specifically, this event counts the retirement of the last micro-op of a branch instruction", .modmsk = INTEL_V4_ATTRS, .equiv = "BR_INST_RETIRED:ALL_BRANCHES", .cntmsk = 0xff, .code = 0xc4, }, { .name = "MISPREDICTED_BRANCH_RETIRED", .desc = "Count mispredicted branch instructions at retirement. 
Specifically, this event counts at retirement of the last micro-op of a branch instruction in the architectural path of the execution and experienced misprediction in the branch prediction hardware", .modmsk = INTEL_V4_ATTRS, .equiv = "BR_MISP_RETIRED:ALL_BRANCHES", .cntmsk = 0xff, .code = 0xc5, }, { .name = "BACLEARS", .desc = "Branch re-steered", .code = 0xe6, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_baclears), .umasks = skl_baclears }, { .name = "BR_INST_RETIRED", .desc = "Branch instructions retired (Precise Event)", .code = 0xc4, .cntmsk = 0xff, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_br_inst_retired), .umasks = skl_br_inst_retired }, { .name = "BR_MISP_RETIRED", .desc = "Mispredicted retired branches (Precise Event)", .code = 0xc5, .cntmsk = 0xff, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_br_misp_retired), .umasks = skl_br_misp_retired }, { .name = "BR_MISP_EXEC", .desc = "Speculative mispredicted branches", .code = 0x89, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_br_misp_exec), .umasks = skl_br_misp_exec }, { .name = "CPU_CLK_THREAD_UNHALTED", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .code = 0x3c, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_cpu_clk_thread_unhalted), .umasks = skl_cpu_clk_thread_unhalted }, { .name = "CPU_CLK_UNHALTED", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .code = 0x3c, .cntmsk = 0xff, .modmsk = INTEL_V4_ATTRS, .equiv = "CPU_CLK_THREAD_UNHALTED", }, { .name = "CYCLE_ACTIVITY", .desc = "Stalled cycles", .code = 0xa3, .cntmsk = 0xf, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_cycle_activity), .umasks = skl_cycle_activity }, { .name = "DTLB_LOAD_MISSES", 
.desc = "Data TLB load misses", .code = 0x8, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_dtlb_load_misses), .umasks = skl_dtlb_load_misses }, { .name = "DTLB_STORE_MISSES", .desc = "Data TLB store misses", .code = 0x49, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_dtlb_load_misses), .umasks = skl_dtlb_load_misses /* shared */ }, { .name = "FP_ASSIST", .desc = "X87 floating-point assists", .code = 0xca, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_fp_assist), .umasks = skl_fp_assist }, { .name = "HLE_RETIRED", .desc = "HLE execution (Precise Event)", .code = 0xc8, .cntmsk = 0xff, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_hle_retired), .umasks = skl_hle_retired }, { .name = "ICACHE_16B", .desc = "Instruction Cache", .code = 0x80, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_icache_16b), .umasks = skl_icache_16b }, { .name = "ICACHE_64B", .desc = "Instruction Cache", .code = 0x83, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_icache_64b), .umasks = skl_icache_64b }, { .name = "IDQ", .desc = "IDQ operations", .code = 0x79, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_idq), .umasks = skl_idq }, { .name = "IDQ_UOPS_NOT_DELIVERED", .desc = "Uops not delivered", .code = 0x9c, .cntmsk = 0xf, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_idq_uops_not_delivered), .umasks = skl_idq_uops_not_delivered }, { .name = "INST_RETIRED", .desc = "Number of instructions retired (Precise Event)", .code = 0xc0, .cntmsk = 0xff, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_inst_retired), .umasks = skl_inst_retired }, { .name = "INT_MISC", .desc = "Miscellaneous interruptions", .code = 0xd, .cntmsk = 0xff, .ngrp = 1, .modmsk 
= INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_int_misc), .umasks = skl_int_misc }, { .name = "ITLB", .desc = "Instruction TLB", .code = 0xae, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_itlb), .umasks = skl_itlb }, { .name = "ITLB_MISSES", .desc = "Instruction TLB misses", .code = 0x85, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_itlb_misses), .umasks = skl_itlb_misses }, { .name = "L1D", .desc = "L1D cache", .code = 0x51, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_l1d), .umasks = skl_l1d }, { .name = "L1D_PEND_MISS", .desc = "L1D pending misses", .code = 0x48, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_l1d_pend_miss), .umasks = skl_l1d_pend_miss }, { .name = "L2_LINES_IN", .desc = "L2 lines allocated", .code = 0xf1, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_l2_lines_in), .umasks = skl_l2_lines_in }, { .name = "L2_LINES_OUT", .desc = "L2 lines evicted", .code = 0xf2, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_l2_lines_out), .umasks = skl_l2_lines_out }, { .name = "L2_RQSTS", .desc = "L2 requests", .code = 0x24, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_l2_rqsts), .umasks = skl_l2_rqsts }, { .name = "L2_TRANS", .desc = "L2 transactions", .code = 0xf0, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_l2_trans), .umasks = skl_l2_trans }, { .name = "LD_BLOCKS", .desc = "Blocking loads", .code = 0x3, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_ld_blocks), .umasks = skl_ld_blocks }, { .name = "LD_BLOCKS_PARTIAL", .desc = "Partial load blocks", .code = 0x7, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_ld_blocks_partial), .umasks = skl_ld_blocks_partial 
}, { .name = "LOAD_HIT_PRE", .desc = "Load dispatches", .code = 0x4c, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_load_hit_pre), .umasks = skl_load_hit_pre }, { .name = "LOCK_CYCLES", .desc = "Locked cycles in L1D and L2", .code = 0x63, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_lock_cycles), .umasks = skl_lock_cycles }, { .name = "LONGEST_LAT_CACHE", .desc = "L3 cache", .code = 0x2e, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_longest_lat_cache), .umasks = skl_longest_lat_cache }, { .name = "MACHINE_CLEARS", .desc = "Machine clear asserted", .code = 0xc3, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_machine_clears), .umasks = skl_machine_clears }, { .name = "MEM_LOAD_L3_HIT_RETIRED", .desc = "L3 hit load uops retired (Precise Event)", .code = 0xd2, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_mem_load_l3_hit_retired), .umasks = skl_mem_load_l3_hit_retired }, { .name = "MEM_LOAD_UOPS_L3_HIT_RETIRED", .desc = "L3 hit load uops retired (Precise Event)", .equiv = "MEM_LOAD_L3_HIT_RETIRED", .code = 0xd2, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_mem_load_l3_hit_retired), .umasks = skl_mem_load_l3_hit_retired }, { .name = "MEM_LOAD_UOPS_LLC_HIT_RETIRED", .desc = "L3 hit load uops retired (Precise Event)", .equiv = "MEM_LOAD_L3_HIT_RETIRED", .code = 0xd2, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_mem_load_l3_hit_retired), .umasks = skl_mem_load_l3_hit_retired }, { .name = "MEM_LOAD_L3_MISS_RETIRED", .desc = "L3 miss load uops retired (Precise Event)", .code = 0xd3, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_mem_load_l3_miss_retired), .umasks = 
skl_mem_load_l3_miss_retired }, { .name = "MEM_LOAD_UOPS_L3_MISS_RETIRED", .desc = "L3 miss load uops retired (Precise Event)", .equiv = "MEM_LOAD_L3_MISS_RETIRED", .code = 0xd3, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_mem_load_l3_miss_retired), .umasks = skl_mem_load_l3_miss_retired }, { .name = "MEM_LOAD_UOPS_LLC_MISS_RETIRED", .desc = "L3 miss load uops retired (Precise Event)", .equiv = "MEM_LOAD_L3_MISS_RETIRED", .code = 0xd3, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_mem_load_l3_miss_retired), .umasks = skl_mem_load_l3_miss_retired }, { .name = "MEM_LOAD_RETIRED", .desc = "Retired load uops (Precise Event)", .code = 0xd1, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_mem_load_retired), .umasks = skl_mem_load_retired }, { .name = "MEM_LOAD_UOPS_RETIRED", .desc = "Retired load uops (Precise Event)", .code = 0xd1, .equiv = "MEM_LOAD_RETIRED", .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_mem_load_retired), .umasks = skl_mem_load_retired }, { .name = "MEM_TRANS_RETIRED", .desc = "Memory transactions retired (Precise Event)", .code = 0xcd, .cntmsk = 0x8, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS | _INTEL_X86_ATTR_LDLAT, .numasks = LIBPFM_ARRAY_SIZE(skl_mem_trans_retired), .umasks = skl_mem_trans_retired }, { .name = "MEM_INST_RETIRED", .desc = "Memory instructions retired (Precise Event)", .code = 0xd0, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_mem_inst_retired), .umasks = skl_mem_inst_retired }, { .name = "MEM_UOPS_RETIRED", .desc = "Memory instructions retired (Precise Event)", .code = 0xd0, .cntmsk = 0xf, .equiv = "MEM_INST_RETIRED", .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = 
LIBPFM_ARRAY_SIZE(skl_mem_inst_retired), .umasks = skl_mem_inst_retired }, { .name = "MISALIGN_MEM_REF", .desc = "Misaligned memory references", .code = 0x5, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_misalign_mem_ref), .umasks = skl_misalign_mem_ref }, { .name = "MOVE_ELIMINATION", .desc = "Move Elimination", .code = 0x58, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_move_elimination), .umasks = skl_move_elimination }, { .name = "OFFCORE_REQUESTS", .desc = "Demand Data Read requests sent to uncore", .code = 0xb0, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_offcore_requests), .umasks = skl_offcore_requests }, { .name = "OTHER_ASSISTS", .desc = "Software assist", .code = 0xc1, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_other_assists), .umasks = skl_other_assists }, { .name = "RESOURCE_STALLS", .desc = "Cycles Allocation is stalled due to Resource Related reason", .code = 0xa2, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_resource_stalls), .umasks = skl_resource_stalls }, { .name = "ROB_MISC_EVENTS", .desc = "ROB miscellaneous events", .code = 0xcc, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_rob_misc_events), .umasks = skl_rob_misc_events }, { .name = "RS_EVENTS", .desc = "Reservation Station", .code = 0x5e, .cntmsk = 0xf, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_rs_events), .umasks = skl_rs_events }, { .name = "RTM_RETIRED", .desc = "Restricted Transactional Memory execution (Precise Event)", .code = 0xc9, .cntmsk = 0xf, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_rtm_retired), .umasks = skl_rtm_retired }, { .name = "TLB_FLUSH", .desc = "TLB flushes", .code = 0xbd, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks =
LIBPFM_ARRAY_SIZE(skl_tlb_flush), .umasks = skl_tlb_flush }, { .name = "UOPS_EXECUTED", .desc = "Uops executed", .code = 0xb1, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_uops_executed), .umasks = skl_uops_executed }, { .name = "LSD", .desc = "Loop stream detector", .code = 0xa8, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_lsd), .umasks = skl_lsd, }, { .name = "UOPS_DISPATCHED_PORT", .desc = "Uops dispatched to specific ports", .code = 0xa1, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_uops_dispatched_port), .umasks = skl_uops_dispatched_port, }, { .name = "UOPS_DISPATCHED", .desc = "Uops dispatched to specific ports", .equiv = "UOPS_DISPATCHED_PORT", .code = 0xa1, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_uops_dispatched_port), .umasks = skl_uops_dispatched_port, }, { .name = "UOPS_ISSUED", .desc = "Uops issued", .code = 0xe, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_uops_issued), .umasks = skl_uops_issued }, { .name = "ARITH", .desc = "Arithmetic uop", .code = 0x14, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_arith), .umasks = skl_arith }, { .name = "UOPS_RETIRED", .desc = "Uops retired (Precise Event)", .code = 0xc2, .cntmsk = 0xff, .ngrp = 1, .flags = INTEL_X86_PEBS, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_uops_retired), .umasks = skl_uops_retired }, { .name = "TX_MEM", .desc = "Transactional memory aborts", .code = 0x54, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_tx_mem), .umasks = skl_tx_mem, }, { .name = "TX_EXEC", .desc = "Transactional execution", .code = 0x5d, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(skl_tx_exec), .umasks = skl_tx_exec }, { .name = "OFFCORE_REQUESTS_OUTSTANDING", .desc = "Outstanding 
offcore requests", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0x60, .numasks = LIBPFM_ARRAY_SIZE(skl_offcore_requests_outstanding), .ngrp = 1, .umasks = skl_offcore_requests_outstanding, }, { .name = "ILD_STALL", .desc = "Instruction Length Decoder stalls", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0x87, .numasks = LIBPFM_ARRAY_SIZE(skl_ild_stall), .ngrp = 1, .umasks = skl_ild_stall, }, { .name = "DSB2MITE_SWITCHES", .desc = "Number of DSB to MITE switches", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0xab, .numasks = LIBPFM_ARRAY_SIZE(skl_dsb2mite_switches), .ngrp = 1, .umasks = skl_dsb2mite_switches, }, { .name = "EPT", .desc = "Extended page table", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0x4f, .numasks = LIBPFM_ARRAY_SIZE(skl_ept), .ngrp = 1, .umasks = skl_ept, }, { .name = "FP_ARITH", .desc = "Floating-point instructions retired", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0xc7, .numasks = LIBPFM_ARRAY_SIZE(skl_fp_arith), .ngrp = 1, .umasks = skl_fp_arith, .equiv = "FP_ARITH_INST_RETIRED", }, { .name = "FP_ARITH_INST_RETIRED", .desc = "Floating-point instructions retired", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0xc7, .numasks = LIBPFM_ARRAY_SIZE(skl_fp_arith), .ngrp = 1, .umasks = skl_fp_arith, }, { .name = "EXE_ACTIVITY", .desc = "Execution activity", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0xa6, .numasks = LIBPFM_ARRAY_SIZE(skl_exe_activity), .ngrp = 1, .umasks = skl_exe_activity, }, { .name = "FRONTEND_RETIRED", .desc = "Precise Front-End activity", .modmsk = INTEL_SKL_FE_ATTRS, .cntmsk = 0xf, .code = 0x1c6, .flags = INTEL_X86_FRONTEND | INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(skl_frontend_retired), .ngrp = 1, .umasks = skl_frontend_retired, }, { .name = "HW_INTERRUPTS", .desc = "Number of hardware interrupts received by the processor", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0xcb, .numasks = LIBPFM_ARRAY_SIZE(skl_hw_interrupts), .ngrp = 1, .umasks = skl_hw_interrupts, }, { .name = 
"SQ_MISC", .desc = "SuperQueue miscellaneous", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0xf4, .numasks = LIBPFM_ARRAY_SIZE(skl_sq_misc), .ngrp = 1, .umasks = skl_sq_misc, }, { .name = "MEM_LOAD_MISC_RETIRED", .desc = "Load retired miscellaneous", .modmsk = INTEL_V4_ATTRS, .flags = INTEL_X86_PEBS, .cntmsk = 0xf, .code = 0xd4, .numasks = LIBPFM_ARRAY_SIZE(skl_mem_load_misc_retired), .ngrp = 1, .umasks = skl_mem_load_misc_retired, }, { .name = "IDI_MISC", .desc = "Miscellaneous", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0xfe, .numasks = LIBPFM_ARRAY_SIZE(skl_idi_misc), .model = PFM_PMU_INTEL_SKX, .ngrp = 1, .umasks = skl_idi_misc, }, { .name = "IDI_MISC", .desc = "Miscellaneous", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0xfe, .numasks = LIBPFM_ARRAY_SIZE(skl_idi_misc), .model = PFM_PMU_INTEL_CLX, .ngrp = 1, .umasks = skl_idi_misc, }, { .name = "CORE_POWER", .desc = "Core power cycles", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0x28, .numasks = LIBPFM_ARRAY_SIZE(skl_core_power), .model = PFM_PMU_INTEL_SKX, .ngrp = 1, .umasks = skl_core_power, }, { .name = "CORE_POWER", .desc = "Core power cycles", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0x28, .numasks = LIBPFM_ARRAY_SIZE(skl_core_power), .model = PFM_PMU_INTEL_CLX, .ngrp = 1, .umasks = skl_core_power, }, { .name = "SW_PREFETCH", .desc = "Software prefetches", .modmsk = INTEL_V4_ATTRS, .equiv = "SW_PREFETCH_ACCESS", .cntmsk = 0xf, .code = 0x32, .numasks = LIBPFM_ARRAY_SIZE(skl_sw_prefetch), .ngrp = 1, .umasks = skl_sw_prefetch, }, { .name = "SW_PREFETCH_ACCESS", .desc = "Software prefetches", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0x32, .numasks = LIBPFM_ARRAY_SIZE(skl_sw_prefetch), .ngrp = 1, .umasks = skl_sw_prefetch, }, { .name = "CORE_SNOOP_RESPONSE", .desc = "Aggregated core snoops", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0xef, .numasks = LIBPFM_ARRAY_SIZE(skl_core_snoop_response), .ngrp = 1, .umasks = skl_core_snoop_response, }, { .name =
"PARTIAL_RAT_STALLS", .desc = "RAT stalls", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0x59, .numasks = LIBPFM_ARRAY_SIZE(skl_partial_rat_stalls), .ngrp = 1, .umasks = skl_partial_rat_stalls, }, { .name = "OFFCORE_REQUESTS_BUFFER", .desc = "Offcore requests buffer", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0xb2, .numasks = LIBPFM_ARRAY_SIZE(skl_offcore_requests_buffer), .ngrp = 1, .umasks = skl_offcore_requests_buffer, }, { .name = "OFFCORE_RESPONSE_0", .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0x1b7, .flags= INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(skl_offcore_response), .ngrp = 3, .umasks = skl_offcore_response, }, { .name = "OFFCORE_RESPONSE_1", .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xf, .code = 0x1bb, .flags= INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(skl_offcore_response), .ngrp = 3, .umasks = skl_offcore_response, /* identical to actual umasks list for this event */ }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_skx_unc_cha_events.h000066400000000000000000003672621502707512200252730ustar00rootroot00000000000000/* * Copyright (c) 2017 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the 
Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: skx_unc_cha */ #define CHA_FILT_MESIF(a, b, c, d) \ { .uname = "STATE_"#a,\ .udesc = #b" cacheline state",\ .ufilters[0] = 1ULL << (17 + (c)),\ .grpid = d, \ } #define CHA_FILT_MESIFS(d) \ CHA_FILT_MESIF(LLC_I, LLC Invalid, 0, d), \ CHA_FILT_MESIF(SF_S, SF Shared, 1, d), \ CHA_FILT_MESIF(SF_E, SF Exclusive, 2, d), \ CHA_FILT_MESIF(SF_H, SF H, 3, d), \ CHA_FILT_MESIF(LLC_S, LLC Shared, 4, d), \ CHA_FILT_MESIF(LLC_E, LLC Exclusive, 5, d), \ CHA_FILT_MESIF(LLC_M, LLC Modified, 6, d), \ CHA_FILT_MESIF(LLC_F, LLC Forward, 7, d), \ { .uname = "STATE_CACHE_ANY",\ .udesc = "Any cache line state",\ .ufilters[0] = 0x7fULL << 17,\ .grpid = d, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, \ } #define CHA_FILT1(d) \ { .uname = "FILT_REM", \ .udesc = "Filter1 matches on remote node target",\ .ufilters[1] = 0x1ULL << 0, \ .uflags = INTEL_X86_DFL, \ .grpid = d, \ }, \ { .uname = "FILT_LOC", \ .udesc = "Filter1 matches on local node target", \ .ufilters[1] = 0x1ULL << 1, \ .uflags = INTEL_X86_DFL, \ .grpid = d, \ },\ { .uname = "FILT_LOC_MEM", \ .udesc = "Filter1 matches on near memory",\ .ufilters[1] = 0x1ULL << 4, \ .uflags = INTEL_X86_DFL, \ .grpid = d, \ }, \ { .uname = "FILT_REM_MEM", \ .udesc = "Filter1 matches on remote memory", \ .ufilters[1] = 0x1ULL << 5, \ .uflags = INTEL_X86_DFL, \ .grpid = d, \ } #define CHA_FILT_OPC_IRQ(n, d, r, shift) \ { .uname = "OPC"#n"_RFO",\ .udesc = "IRQ 
Opcode: Demand data RFO (line to be cached in E state)",\ .ufilters[1] = 0x200ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_CRD",\ .udesc = "IRQ Opcode: Demand code read",\ .ufilters[1] = 0x201ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_DRD",\ .udesc = "IRQ Opcode: Demand data read (line to be cached in S or E states)",\ .ufilters[1] = 0x202ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_PRD",\ .udesc = "IRQ Opcode: Partial reads 0-32 bytes uncacheable (IIO can be up to 64 bytes)",\ .ufilters[1] = 0x207ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_WCILF",\ .udesc = "IRQ Opcode: Full cacheline streaming store", \ .ufilters[1] = 0x20cULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_WCIL",\ .udesc = "IRQ Opcode: Partial streaming store", \ .ufilters[1] = 0x20dULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_UCRDF",\ .udesc = "IRQ Opcode: Uncacheable Reads full cacheline", \ .ufilters[1] = 0x20eULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_WIL",\ .udesc = "IRQ Opcode: Write Invalidate Line (Partial)", \ .ufilters[1] = 0x20fULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_WB_PUSH_HINT",\
.udesc = "IRQ Opcode: TBD", \ .ufilters[1] = 0x243ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_WB_MTOI",\ .udesc = "IRQ Opcode: Request writeback modified invalidate line, evict fill M-state line from core", \ .ufilters[1] = 0x244ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_WB_MTOE",\ .udesc = "IRQ Opcode: Request writeback modified set to exclusive (combine with any OPCODE umask)", \ .ufilters[1] = 0x245ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_WB_EFTOI",\ .udesc = "IRQ Opcode: Request clean E or F state lines writeback, ownership gone when writeback completes", \ .ufilters[1] = 0x246ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_WB_EFTOE",\ .udesc = "IRQ Opcode: Request clean E or F state lines writeback, core may retain ownership when writeback completes", \ .ufilters[1] = 0x247ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_ITOM",\ .udesc = "IRQ Opcode: Request invalidate line. Request exclusive ownership of the line", \ .ufilters[1] = 0x248ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_LLC_PF_RFO",\ .udesc = "IRQ Opcode: LLC prefetch RFO, uncore first looks up the line in LLC. For a hit, the LRU is updated. 
For a miss, the RFO is initiated", \ .ufilters[1] = 0x258ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_LLC_PF_CODE",\ .udesc = "IRQ Opcode: LLC prefetch code, uncore first looks up the line in LLC. For a hit, the LRU is updated. For a miss, the CRd is initiated", \ .ufilters[1] = 0x259ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_LLC_PF_DATA",\ .udesc = "IRQ Opcode: LLC prefetch data, uncore first looks up the line in LLC. For a hit, the LRU is updated. For a miss, the DRd is initiated", \ .ufilters[1] = 0x25aULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_INT_LOG",\ .udesc = "IRQ Opcode: Interrupts logically addressed", \ .ufilters[1] = 0x279ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_INT_PHY",\ .udesc = "IRQ Opcode: Interrupts physically addressed", \ .ufilters[1] = 0x27aULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_PRI_UP",\ .udesc = "IRQ Opcode: Interrupt priority update", \ .ufilters[1] = 0x27bULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_SPLIT_LOCK",\ .udesc = "IRQ Opcode: Request to start split lock sequence", \ .ufilters[1] = 0x27eULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_LOCK",\ .udesc = "IRQ Opcode: Request to start IDI lock sequence", \ .ufilters[1] = 0x27fULL << shift, \ 
.uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ } #define CHA_FILT_OPC_IPQ(n, d, r, shift) \ { .uname = "OPC"#n"_SNP_CUR",\ .udesc = "IPQ Opcode: Snoop request to get uncacheable 'snapshot' of data", \ .ufilters[1] = 0x180ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_SNP_CODE",\ .udesc = "IPQ Opcode: Snoop request to get cacheline intended to be cached in S-state", \ .ufilters[1] = 0x181ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_SNP_DATA",\ .udesc = "IPQ Opcode: Snoop request to get cacheline intended to be cached in E or S-state", \ .ufilters[1] = 0x182ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_SNP_DATA_MIG",\ .udesc = "IPQ Opcode: Snoop request to get cacheline intended to be cached in M, E or S-state", \ .ufilters[1] = 0x183ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_SNP_INV_OWN",\ .udesc = "IPQ Opcode: Snoop invalidate own. To get cacheline in M or E-state", \ .ufilters[1] = 0x184ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_SNP_INV",\ .udesc = "IPQ Opcode: Snoop invalidate. To get cacheline intended to be cached in E-state", \ .ufilters[1] = 0x185ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ } #define CHA_FILT_OPC_PRQ(n, d, r, shift) \ { .uname = "OPC"#n"_RD_CUR",\ .udesc = "PRQ Opcode: Read current. Request cacheline in I-state. 
Used to obtain a coherent snapshot of an uncached line", \ .ufilters[1] = 0x080ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_RD_CODE",\ .udesc = "PRQ Opcode: Read code. Request cacheline in S-state", \ .ufilters[1] = 0x081ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_RD_DATA",\ .udesc = "PRQ Opcode: Read data. Request cacheline in E or S-state", \ .ufilters[1] = 0x082ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_RD_DATA_MIG",\ .udesc = "PRQ Opcode: Read data migratory. Request cacheline in E or S-state, except peer cache can forward cacheline in M-state without any writeback to memory", \ .ufilters[1] = 0x083ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_RD_INV_OWN",\ .udesc = "PRQ Opcode: Read invalidate own. 
Invalidate cacheline in M or E-state", \ .ufilters[1] = 0x084ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_RD_INV_XTOI",\ .udesc = "PRQ Opcode: Read invalidate X to I-state", \ .ufilters[1] = 0x085ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_RD_PUSH_HINT",\ .udesc = "PRQ Opcode: Read push hint", \ .ufilters[1] = 0x086ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_RD_INV_ITOE",\ .udesc = "PRQ Opcode: Read invalidate I to E-state", \ .ufilters[1] = 0x087ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_RD_INV",\ .udesc = "PRQ Opcode: Read invalidate. Request cacheline in E-state from home agent", \ .ufilters[1] = 0x08cULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_RD_INV_ITOM",\ .udesc = "PRQ Opcode: Read invalidate I to M-state",\ .ufilters[1] = 0x08fULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ } #define CHA_FILT_OPC_WBQ(n, d, r, shift) \ { .uname = "OPC"#n"_WB_MTOS",\ .udesc = "Writeback M to S-state. Write full cacheline in M-state to memory and transition it to S-state",\ .ufilters[1] = 0x001ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_NON_SNP_WR",\ .udesc = "Non snoop write. 
Write cacheline to memory",\ .ufilters[1] = 0x003ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_WB_MTOI_PARTIAL",\ .udesc = "Writeback M to I-state. Write full cacheline in M-state to memory according to byte-enable mask and transition to I-state",\ .ufilters[1] = 0x004ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_WB_MTOE_PARTIAL",\ .udesc = "Writeback M to E-state. Write full cacheline in M-state to memory according to byte-enable mask and transition to E-state",\ .ufilters[1] = 0x006ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_NON_SNP_WR_PARTIAL",\ .udesc = "Non snoop write. Write cacheline to memory according to byte-enable mask",\ .ufilters[1] = 0x007ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_WB_PUSH_MTOI",\ .udesc = "Writeback push M to I-state. Push cacheline in M-state to the home agent. Agent may push data to local cache in M-state or write to memory. Transition to I-state",\ .ufilters[1] = 0x008ULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_WB_FLUSH",\ .udesc = "Writeback flush. Hint for flushing writes to memory. No data is sent with the request",\ .ufilters[1] = 0x00bULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_EVICT_CLEAN",\ .udesc = "Evict clean. 
Notification to home node that a cacheline in E-state was invalidated",\ .ufilters[1] = 0x00cULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ }, \ { .uname = "OPC"#n"_NON_SNP_RD",\ .udesc = "Non snoop read. Request a read-only cacheline from memory",\ .ufilters[1] = 0x00dULL << shift, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE | INTEL_X86_GRP_REQ, \ .grpid = (d & 0xff) | ((r & 0xff) << 8), \ } static intel_x86_umask_t skx_unc_c_ag0_ad_crd_acquired[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "CMS Agent0 AD Credits Acquired -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "CMS Agent0 AD Credits Acquired -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "CMS Agent0 AD Credits Acquired -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "CMS Agent0 AD Credits Acquired -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "CMS Agent0 AD Credits Acquired -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "CMS Agent0 AD Credits Acquired -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_c_ag0_ad_crd_occupancy[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "CMS Agent0 AD Credits Occupancy -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "CMS Agent0 AD Credits Occupancy -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "CMS Agent0 AD Credits Occupancy -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "CMS Agent0 AD Credits Occupancy -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "CMS Agent0 AD Credits Occupancy -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "CMS Agent0 AD Credits Occupancy -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_c_ag0_bl_crd_acquired[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "CMS Agent0 BL Credits Acquired -- For Transgress 0", }, 
{ .uname = "TGR1", .ucode = 0x200, .udesc = "CMS Agent0 BL Credits Acquired -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "CMS Agent0 BL Credits Acquired -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "CMS Agent0 BL Credits Acquired -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "CMS Agent0 BL Credits Acquired -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "CMS Agent0 BL Credits Acquired -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_c_ag0_bl_crd_occupancy[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "CMS Agent0 BL Credits Occupancy -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "CMS Agent0 BL Credits Occupancy -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "CMS Agent0 BL Credits Occupancy -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "CMS Agent0 BL Credits Occupancy -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "CMS Agent0 BL Credits Occupancy -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "CMS Agent0 BL Credits Occupancy -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_c_ag1_ad_crd_acquired[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "CMS Agent1 AD Credits Acquired -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "CMS Agent1 AD Credits Acquired -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "CMS Agent1 AD Credits Acquired -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "CMS Agent1 AD Credits Acquired -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "CMS Agent1 AD Credits Acquired -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "CMS Agent1 AD Credits Acquired -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_c_ag1_ad_crd_occupancy[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "CMS Agent1 AD Credits Occupancy -- 
For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "CMS Agent1 AD Credits Occupancy -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "CMS Agent1 AD Credits Occupancy -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "CMS Agent1 AD Credits Occupancy -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "CMS Agent1 AD Credits Occupancy -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "CMS Agent1 AD Credits Occupancy -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_c_ag1_bl_crd_occupancy[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "CMS Agent1 BL Credits Occupancy -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "CMS Agent1 BL Credits Occupancy -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "CMS Agent1 BL Credits Occupancy -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "CMS Agent1 BL Credits Occupancy -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "CMS Agent1 BL Credits Occupancy -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "CMS Agent1 BL Credits Occupancy -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_c_ag1_bl_credits_acquired[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "CMS Agent1 BL Credits Acquired -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "CMS Agent1 BL Credits Acquired -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "CMS Agent1 BL Credits Acquired -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "CMS Agent1 BL Credits Acquired -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "CMS Agent1 BL Credits Acquired -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "CMS Agent1 BL Credits Acquired -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_c_bypass_cha_imc[]={ { .uname = "INTERMEDIATE", .ucode = 0x200, .udesc = "CHA 
to iMC Bypass -- Intermediate bypass Taken", }, { .uname = "NOT_TAKEN", .ucode = 0x400, .udesc = "CHA to iMC Bypass -- Not Taken", }, { .uname = "TAKEN", .ucode = 0x100, .udesc = "CHA to iMC Bypass -- Taken", }, }; static intel_x86_umask_t skx_unc_c_core_pma[]={ { .uname = "C1_STATE", .ucode = 0x100, .udesc = "Core PMA Events -- C1 State", }, { .uname = "C1_TRANSITION", .ucode = 0x200, .udesc = "Core PMA Events -- C1 Transition", }, { .uname = "C6_STATE", .ucode = 0x400, .udesc = "Core PMA Events -- C6 State", }, { .uname = "C6_TRANSITION", .ucode = 0x800, .udesc = "Core PMA Events -- C6 Transition", }, { .uname = "GV", .ucode = 0x1000, .udesc = "Core PMA Events -- GV", }, }; static intel_x86_umask_t skx_unc_c_core_snp[]={ { .uname = "ANY_GTONE", .ucode = 0xe200, .udesc = "Core Cross Snoops Issued -- Any Cycle with Multiple Snoops", }, { .uname = "ANY_ONE", .ucode = 0xe100, .udesc = "Core Cross Snoops Issued -- Any Single Snoop", }, { .uname = "ANY_REMOTE", .ucode = 0xe400, .udesc = "Core Cross Snoops Issued -- Any Snoop to Remote Node", }, { .uname = "CORE_GTONE", .ucode = 0x4200, .udesc = "Core Cross Snoops Issued -- Multiple Core Requests", }, { .uname = "CORE_ONE", .ucode = 0x4100, .udesc = "Core Cross Snoops Issued -- Single Core Requests", }, { .uname = "CORE_REMOTE", .ucode = 0x4400, .udesc = "Core Cross Snoops Issued -- Core Request to Remote Node", }, { .uname = "EVICT_GTONE", .ucode = 0x8200, .udesc = "Core Cross Snoops Issued -- Multiple Eviction", }, { .uname = "EVICT_ONE", .ucode = 0x8100, .udesc = "Core Cross Snoops Issued -- Single Eviction", }, { .uname = "EVICT_REMOTE", .ucode = 0x8400, .udesc = "Core Cross Snoops Issued -- Eviction to Remote Node", }, { .uname = "EXT_GTONE", .ucode = 0x2200, .udesc = "Core Cross Snoops Issued -- Multiple External Snoops", }, { .uname = "EXT_ONE", .ucode = 0x2100, .udesc = "Core Cross Snoops Issued -- Single External Snoops", }, { .uname = "EXT_REMOTE", .ucode = 0x2400, .udesc = "Core Cross Snoops Issued -- 
External Snoop to Remote Node", }, }; static intel_x86_umask_t skx_unc_c_dir_lookup[]={ { .uname = "NO_SNP", .ucode = 0x200, .udesc = "Directory Lookups -- Snoop Not Needed", }, { .uname = "SNP", .ucode = 0x100, .udesc = "Directory Lookups -- Snoop Needed", }, }; static intel_x86_umask_t skx_unc_c_dir_update[]={ { .uname = "HA", .ucode = 0x100, .udesc = "Directory Updates -- from HA pipe", }, { .uname = "TOR", .ucode = 0x200, .udesc = "Directory Updates -- from TOR pipe", }, }; static intel_x86_umask_t skx_unc_c_egress_ordering[]={ { .uname = "IV_SNOOPGO_DN", .ucode = 0x400, .udesc = "Egress Blocking due to Ordering requirements -- Down", }, { .uname = "IV_SNOOPGO_UP", .ucode = 0x100, .udesc = "Egress Blocking due to Ordering requirements -- Up", }, }; static intel_x86_umask_t skx_unc_c_fast_asserted[]={ { .uname = "HORZ", .ucode = 0x200, .udesc = "FaST wire asserted -- Horizontal", }, { .uname = "VERT", .ucode = 0x100, .udesc = "FaST wire asserted -- Vertical", }, }; static intel_x86_umask_t skx_unc_c_hitme_hit[]={ { .uname = "EX_RDS", .ucode = 0x100, .udesc = "Counts Number of Hits in HitMe Cache -- Exclusive hit and op is RdCode, RdData, RdDataMigratory, RdCur, RdInv*, Inv*", }, { .uname = "SHARED_OWNREQ", .ucode = 0x400, .udesc = "Counts Number of Hits in HitMe Cache -- Shared hit and op is RdInvOwn, RdInv, Inv*", }, { .uname = "WBMTOE", .ucode = 0x800, .udesc = "Counts Number of Hits in HitMe Cache -- op is WbMtoE", }, { .uname = "WBMTOI_OR_S", .ucode = 0x1000, .udesc = "Counts Number of Hits in HitMe Cache -- op is WbMtoI, WbPushMtoI, WbFlush, or WbMtoS", }, }; static intel_x86_umask_t skx_unc_c_hitme_lookup[]={ { .uname = "READ", .ucode = 0x100, .udesc = "Counts Number of times HitMe Cache is accessed -- op is RdCode, RdData, RdDataMigratory, RdCur, RdInvOwn, RdInv, Inv*", }, { .uname = "WRITE", .ucode = 0x200, .udesc = "Counts Number of times HitMe Cache is accessed -- op is WbMtoE, WbMtoI, WbPushMtoI, WbFlush, or WbMtoS", }, }; static intel_x86_umask_t 
skx_unc_c_hitme_miss[]={ { .uname = "NOTSHARED_RDINVOWN", .ucode = 0x4000, .udesc = "Counts Number of Misses in HitMe Cache -- No SF/LLC HitS/F and op is RdInvOwn", }, { .uname = "READ_OR_INV", .ucode = 0x8000, .udesc = "Counts Number of Misses in HitMe Cache -- op is RdCode, RdData, RdDataMigratory, RdCur, RdInv, Inv*", }, { .uname = "SHARED_RDINVOWN", .ucode = 0x2000, .udesc = "Counts Number of Misses in HitMe Cache -- SF/LLC HitS/F and op is RdInvOwn", }, }; static intel_x86_umask_t skx_unc_c_hitme_update[]={ { .uname = "DEALLOCATE", .ucode = 0x1000, .udesc = "Counts the number of Allocate/Update to HitMe Cache -- Deallocate HitMe Reads without RspFwdI*", }, { .uname = "DEALLOCATE_RSPFWDI_LOC", .ucode = 0x100, .udesc = "Counts the number of Allocate/Update to HitMe Cache -- op is RspIFwd or RspIFwdWb for a local request", }, { .uname = "RDINVOWN", .ucode = 0x800, .udesc = "Counts the number of Allocate/Update to HitMe Cache -- Update HitMe Cache on RdInvOwn even if not RspFwdI*", }, { .uname = "RSPFWDI_REM", .ucode = 0x200, .udesc = "Counts the number of Allocate/Update to HitMe Cache -- op is RspIFwd or RspIFwdWb for a remote request", }, { .uname = "SHARED", .ucode = 0x400, .udesc = "Counts the number of Allocate/Update to HitMe Cache -- Update HitMe Cache to SHARed", }, }; static intel_x86_umask_t skx_unc_c_horz_ring_ad_in_use[]={ { .uname = "LEFT_EVEN", .ucode = 0x100, .udesc = "Horizontal AD Ring In Use -- Left and Even", }, { .uname = "LEFT_ODD", .ucode = 0x200, .udesc = "Horizontal AD Ring In Use -- Left and Odd", }, { .uname = "RIGHT_EVEN", .ucode = 0x400, .udesc = "Horizontal AD Ring In Use -- Right and Even", }, { .uname = "RIGHT_ODD", .ucode = 0x800, .udesc = "Horizontal AD Ring In Use -- Right and Odd", }, }; static intel_x86_umask_t skx_unc_c_horz_ring_ak_in_use[]={ { .uname = "LEFT_EVEN", .ucode = 0x100, .udesc = "Horizontal AK Ring In Use -- Left and Even", }, { .uname = "LEFT_ODD", .ucode = 0x200, .udesc = "Horizontal AK Ring In Use -- Left and 
Odd", }, { .uname = "RIGHT_EVEN", .ucode = 0x400, .udesc = "Horizontal AK Ring In Use -- Right and Even", }, { .uname = "RIGHT_ODD", .ucode = 0x800, .udesc = "Horizontal AK Ring In Use -- Right and Odd", }, }; static intel_x86_umask_t skx_unc_c_horz_ring_bl_in_use[]={ { .uname = "LEFT_EVEN", .ucode = 0x100, .udesc = "Horizontal BL Ring in Use -- Left and Even", }, { .uname = "LEFT_ODD", .ucode = 0x200, .udesc = "Horizontal BL Ring in Use -- Left and Odd", }, { .uname = "RIGHT_EVEN", .ucode = 0x400, .udesc = "Horizontal BL Ring in Use -- Right and Even", }, { .uname = "RIGHT_ODD", .ucode = 0x800, .udesc = "Horizontal BL Ring in Use -- Right and Odd", }, }; static intel_x86_umask_t skx_unc_c_horz_ring_iv_in_use[]={ { .uname = "LEFT", .ucode = 0x100, .udesc = "Horizontal IV Ring in Use -- Left", }, { .uname = "RIGHT", .ucode = 0x400, .udesc = "Horizontal IV Ring in Use -- Right", }, }; static intel_x86_umask_t skx_unc_c_imc_reads_count[]={ { .uname = "NORMAL", .ucode = 0x100, .udesc = "HA to iMC Reads Issued -- Normal", }, { .uname = "PRIORITY", .ucode = 0x200, .udesc = "HA to iMC Reads Issued -- ISOCH", }, }; static intel_x86_umask_t skx_unc_c_imc_writes_count[]={ { .uname = "FULL", .ucode = 0x100, .udesc = "Writes Issued to the iMC by the HA -- Full Line Non-ISOCH", }, { .uname = "FULL_MIG", .ucode = 0x1000, .udesc = "Writes Issued to the iMC by the HA -- Full Line MIG", }, { .uname = "FULL_PRIORITY", .ucode = 0x400, .udesc = "Writes Issued to the iMC by the HA -- ISOCH Full Line", }, { .uname = "PARTIAL", .ucode = 0x200, .udesc = "Writes Issued to the iMC by the HA -- Partial Non-ISOCH", }, { .uname = "PARTIAL_MIG", .ucode = 0x2000, .udesc = "Writes Issued to the iMC by the HA -- Partial MIG", }, { .uname = "PARTIAL_PRIORITY", .ucode = 0x800, .udesc = "Writes Issued to the iMC by the HA -- ISOCH Partial", }, }; static intel_x86_umask_t skx_unc_c_iodc_alloc[]={ { .uname = "INVITOM", .ucode = 0x100, .udesc = "Counts Number of times IODC entry allocation is attempted 
-- Number of IODC allocations", }, { .uname = "IODCFULL", .ucode = 0x200, .udesc = "Counts Number of times IODC entry allocation is attempted -- Number of IODC allocations dropped due to IODC Full", }, { .uname = "OSBGATED", .ucode = 0x400, .udesc = "Counts Number of times IODC entry allocation is attempted -- Number of IODC allocations dropped due to OSB gate", }, }; static intel_x86_umask_t skx_unc_c_iodc_dealloc[]={ { .uname = "ALL", .ucode = 0x1000, .udesc = "Counts number of IODC deallocations -- IODC deallocated due to any reason", }, { .uname = "SNPOUT", .ucode = 0x800, .udesc = "Counts number of IODC deallocations -- IODC deallocated due to conflicting transaction", }, { .uname = "WBMTOE", .ucode = 0x100, .udesc = "Counts number of IODC deallocations -- IODC deallocated due to WbMtoE", }, { .uname = "WBMTOI", .ucode = 0x200, .udesc = "Counts number of IODC deallocations -- IODC deallocated due to WbMtoI", }, { .uname = "WBPUSHMTOI", .ucode = 0x400, .udesc = "Counts number of IODC deallocations -- IODC deallocated due to WbPushMtoI", }, }; static intel_x86_umask_t skx_unc_c_llc_lookup[]={ { .uname = "ANY", .ucode = 0x1100, .udesc = "Cache and Snoop Filter Lookups -- Any Request", .grpid = 0, .uflags= INTEL_X86_DFL, }, { .uname = "DATA_READ", .ucode = 0x300, .udesc = "Cache and Snoop Filter Lookups -- Data Read Request", .grpid = 0, }, { .uname = "LOCAL", .ucode = 0x3100, .udesc = "Cache and Snoop Filter Lookups -- Local", .grpid = 0, }, { .uname = "REMOTE", .ucode = 0x9100, .udesc = "Cache and Snoop Filter Lookups -- Remote", .grpid = 0, }, { .uname = "REMOTE_SNOOP", .ucode = 0x900, .udesc = "Cache and Snoop Filter Lookups -- External Snoop Request", .grpid = 0, }, { .uname = "WRITE", .ucode = 0x500, .udesc = "Cache and Snoop Filter Lookups -- Write Requests", .grpid = 0, }, CHA_FILT_MESIFS(1), }; static intel_x86_umask_t skx_unc_c_llc_victims[]={ { .uname = "LOCAL_ALL", .ucode = 0x2f00, .udesc = "Lines Victimized -- Local - All Lines", }, { .uname = 
"LOCAL_E", .ucode = 0x2200, .udesc = "Lines Victimized -- Local - Lines in E State", }, { .uname = "LOCAL_F", .ucode = 0x2800, .udesc = "Lines Victimized -- Local - Lines in F State", }, { .uname = "LOCAL_M", .ucode = 0x2100, .udesc = "Lines Victimized -- Local - Lines in M State", }, { .uname = "LOCAL_S", .ucode = 0x2400, .udesc = "Lines Victimized -- Local - Lines in S State", }, { .uname = "REMOTE_ALL", .ucode = 0x8f00, .udesc = "Lines Victimized -- Remote - All Lines", }, { .uname = "REMOTE_E", .ucode = 0x8200, .udesc = "Lines Victimized -- Remote - Lines in E State", }, { .uname = "REMOTE_F", .ucode = 0x8800, .udesc = "Lines Victimized -- Remote - Lines in F State", }, { .uname = "REMOTE_M", .ucode = 0x8100, .udesc = "Lines Victimized -- Remote - Lines in M State", }, { .uname = "REMOTE_S", .ucode = 0x8400, .udesc = "Lines Victimized -- Remote - Lines in S State", }, { .uname = "TOTAL_E", .ucode = 0xa200, .udesc = "Lines Victimized -- Lines in E State", }, { .uname = "TOTAL_F", .ucode = 0xa800, .udesc = "Lines Victimized -- Lines in F State", }, { .uname = "TOTAL_M", .ucode = 0xa100, .udesc = "Lines Victimized -- Lines in M State", }, { .uname = "TOTAL_S", .ucode = 0xa400, .udesc = "Lines Victimized -- Lines in S State", }, }; static intel_x86_umask_t skx_unc_c_misc[]={ { .uname = "CV0_PREF_MISS", .ucode = 0x2000, .udesc = "Cbo Misc -- CV0 Prefetch Miss", }, { .uname = "CV0_PREF_VIC", .ucode = 0x1000, .udesc = "Cbo Misc -- CV0 Prefetch Victim", }, { .uname = "RFO_HIT_S", .ucode = 0x800, .udesc = "Cbo Misc -- RFO HitS", }, { .uname = "RSPI_WAS_FSE", .ucode = 0x100, .udesc = "Cbo Misc -- Silent Snoop Eviction", }, { .uname = "WC_ALIASING", .ucode = 0x200, .udesc = "Cbo Misc -- Write Combining Aliasing", }, }; static intel_x86_umask_t skx_unc_c_read_no_credits[]={ { .uname = "EDC0_SMI2", .ucode = 0x400, .udesc = "CHA iMC CHNx READ Credits Empty -- EDC0_SMI2", }, { .uname = "EDC1_SMI3", .ucode = 0x800, .udesc = "CHA iMC CHNx READ Credits Empty -- EDC1_SMI3", }, { 
.uname = "EDC2_SMI4", .ucode = 0x1000, .udesc = "CHA iMC CHNx READ Credits Empty -- EDC2_SMI4", }, { .uname = "EDC3_SMI5", .ucode = 0x2000, .udesc = "CHA iMC CHNx READ Credits Empty -- EDC3_SMI5", }, { .uname = "MC0_SMI0", .ucode = 0x100, .udesc = "CHA iMC CHNx READ Credits Empty -- MC0_SMI0", }, { .uname = "MC1_SMI1", .ucode = 0x200, .udesc = "CHA iMC CHNx READ Credits Empty -- MC1_SMI1", }, }; static intel_x86_umask_t skx_unc_c_requests[]={ { .uname = "INVITOE_LOCAL", .ucode = 0x1000, .udesc = "Read and Write Requests -- InvalItoE Local", }, { .uname = "INVITOE_REMOTE", .ucode = 0x2000, .udesc = "Read and Write Requests -- InvalItoE Remote", }, { .uname = "READS", .ucode = 0x300, .udesc = "Read and Write Requests -- Reads", .uflags= INTEL_X86_DFL, }, { .uname = "READS_LOCAL", .ucode = 0x100, .udesc = "Read and Write Requests -- Reads Local", }, { .uname = "READS_REMOTE", .ucode = 0x200, .udesc = "Read and Write Requests -- Reads Remote", }, { .uname = "WRITES", .ucode = 0xc00, .udesc = "Read and Write Requests -- Writes", }, { .uname = "WRITES_LOCAL", .ucode = 0x400, .udesc = "Read and Write Requests -- Writes Local", }, { .uname = "WRITES_REMOTE", .ucode = 0x800, .udesc = "Read and Write Requests -- Writes Remote", }, }; static intel_x86_umask_t skx_unc_c_ring_bounces_horz[]={ { .uname = "AD", .ucode = 0x100, .udesc = "Messages that bounced on the Horizontal Ring. -- AD", }, { .uname = "AK", .ucode = 0x200, .udesc = "Messages that bounced on the Horizontal Ring. -- AK", }, { .uname = "BL", .ucode = 0x400, .udesc = "Messages that bounced on the Horizontal Ring. -- BL", }, { .uname = "IV", .ucode = 0x800, .udesc = "Messages that bounced on the Horizontal Ring. -- IV", }, }; static intel_x86_umask_t skx_unc_c_ring_bounces_vert[]={ { .uname = "AD", .ucode = 0x100, .udesc = "Messages that bounced on the Vertical Ring. -- AD", }, { .uname = "AK", .ucode = 0x200, .udesc = "Messages that bounced on the Vertical Ring. 
-- Acknowledgements to core", }, { .uname = "BL", .ucode = 0x400, .udesc = "Messages that bounced on the Vertical Ring. -- Data Responses to core", }, { .uname = "IV", .ucode = 0x800, .udesc = "Messages that bounced on the Vertical Ring. -- Snoops of processor's cache.", }, }; static intel_x86_umask_t skx_unc_c_ring_sink_starved_horz[]={ { .uname = "AD", .ucode = 0x100, .udesc = "Sink Starvation on Horizontal Ring -- AD", }, { .uname = "AK", .ucode = 0x200, .udesc = "Sink Starvation on Horizontal Ring -- AK", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "Sink Starvation on Horizontal Ring -- Acknowledgements to Agent 1", }, { .uname = "BL", .ucode = 0x400, .udesc = "Sink Starvation on Horizontal Ring -- BL", }, { .uname = "IV", .ucode = 0x800, .udesc = "Sink Starvation on Horizontal Ring -- IV", }, }; static intel_x86_umask_t skx_unc_c_ring_sink_starved_vert[]={ { .uname = "AD", .ucode = 0x100, .udesc = "Sink Starvation on Vertical Ring -- AD", }, { .uname = "AK", .ucode = 0x200, .udesc = "Sink Starvation on Vertical Ring -- Acknowledgements to core", }, { .uname = "BL", .ucode = 0x400, .udesc = "Sink Starvation on Vertical Ring -- Data Responses to core", }, { .uname = "IV", .ucode = 0x800, .udesc = "Sink Starvation on Vertical Ring -- Snoops of processor's cache.", }, }; static intel_x86_umask_t skx_unc_c_rxc_inserts[]={ { .uname = "IPQ", .ucode = 0x400, .udesc = "Ingress (from CMS) Allocations -- IPQ", }, { .uname = "IRQ", .ucode = 0x100, .udesc = "Ingress (from CMS) Allocations -- IRQ", }, { .uname = "IRQ_REJ", .ucode = 0x200, .udesc = "Ingress (from CMS) Allocations -- IRQ Rejected", }, { .uname = "PRQ", .ucode = 0x1000, .udesc = "Ingress (from CMS) Allocations -- PRQ", }, { .uname = "PRQ_REJ", .ucode = 0x2000, .udesc = "Ingress (from CMS) Allocations -- PRQ Rejected", }, { .uname = "RRQ", .ucode = 0x4000, .udesc = "Ingress (from CMS) Allocations -- RRQ", }, { .uname = "WBQ", .ucode = 0x8000, .udesc = "Ingress (from CMS) Allocations -- WBQ", }, }; static 
intel_x86_umask_t skx_unc_c_rxc_ipq0_reject[]={ { .uname = "AD_REQ_VN0", .ucode = 0x100, .udesc = "Ingress Probe Queue Rejects -- AD REQ on VN0", }, { .uname = "AD_RSP_VN0", .ucode = 0x200, .udesc = "Ingress Probe Queue Rejects -- AD RSP on VN0", }, { .uname = "AK_NON_UPI", .ucode = 0x4000, .udesc = "Ingress Probe Queue Rejects -- Non UPI AK Request", }, { .uname = "BL_NCB_VN0", .ucode = 0x1000, .udesc = "Ingress Probe Queue Rejects -- BL NCB on VN0", }, { .uname = "BL_NCS_VN0", .ucode = 0x2000, .udesc = "Ingress Probe Queue Rejects -- BL NCS on VN0", }, { .uname = "BL_RSP_VN0", .ucode = 0x400, .udesc = "Ingress Probe Queue Rejects -- BL RSP on VN0", }, { .uname = "BL_WB_VN0", .ucode = 0x800, .udesc = "Ingress Probe Queue Rejects -- BL WB on VN0", }, { .uname = "IV_NON_UPI", .ucode = 0x8000, .udesc = "Ingress Probe Queue Rejects -- Non UPI IV Request", }, }; static intel_x86_umask_t skx_unc_c_rxc_ipq1_reject[]={ { .uname = "ALLOW_SNP", .ucode = 0x4000, .udesc = "Ingress Probe Queue Rejects -- Allow Snoop", }, { .uname = "ANY0", .ucode = 0x100, .udesc = "Ingress Probe Queue Rejects -- ANY0", }, { .uname = "HA", .ucode = 0x200, .udesc = "Ingress Probe Queue Rejects -- HA", }, { .uname = "LLC_OR_SF_WAY", .ucode = 0x2000, .udesc = "Ingress Probe Queue Rejects -- Merging these two together to make room for ANY_REJECT_*0", }, { .uname = "LLC_VICTIM", .ucode = 0x400, .udesc = "Ingress Probe Queue Rejects -- LLC Victim", }, { .uname = "PA_MATCH", .ucode = 0x8000, .udesc = "Ingress Probe Queue Rejects -- PhyAddr Match", }, { .uname = "SF_VICTIM", .ucode = 0x800, .udesc = "Ingress Probe Queue Rejects -- SF Victim", }, { .uname = "VICTIM", .ucode = 0x1000, .udesc = "Ingress Probe Queue Rejects -- Victim", }, }; static intel_x86_umask_t skx_unc_c_rxc_irq0_reject[]={ { .uname = "AD_REQ_VN0", .ucode = 0x100, .udesc = "Ingress (from CMS) Request Queue Rejects -- AD REQ on VN0", }, { .uname = "AD_RSP_VN0", .ucode = 0x200, .udesc = "Ingress (from CMS) Request Queue Rejects -- AD 
RSP on VN0", }, { .uname = "AK_NON_UPI", .ucode = 0x4000, .udesc = "Ingress (from CMS) Request Queue Rejects -- Non UPI AK Request", }, { .uname = "BL_NCB_VN0", .ucode = 0x1000, .udesc = "Ingress (from CMS) Request Queue Rejects -- BL NCB on VN0", }, { .uname = "BL_NCS_VN0", .ucode = 0x2000, .udesc = "Ingress (from CMS) Request Queue Rejects -- BL NCS on VN0", }, { .uname = "BL_RSP_VN0", .ucode = 0x400, .udesc = "Ingress (from CMS) Request Queue Rejects -- BL RSP on VN0", }, { .uname = "BL_WB_VN0", .ucode = 0x800, .udesc = "Ingress (from CMS) Request Queue Rejects -- BL WB on VN0", }, { .uname = "IV_NON_UPI", .ucode = 0x8000, .udesc = "Ingress (from CMS) Request Queue Rejects -- Non UPI IV Request", }, }; static intel_x86_umask_t skx_unc_c_rxc_irq1_reject[]={ { .uname = "ALLOW_SNP", .ucode = 0x4000, .udesc = "Ingress (from CMS) Request Queue Rejects -- Allow Snoop", }, { .uname = "ANY0", .ucode = 0x100, .udesc = "Ingress (from CMS) Request Queue Rejects -- ANY0", }, { .uname = "HA", .ucode = 0x200, .udesc = "Ingress (from CMS) Request Queue Rejects -- HA", }, { .uname = "LLC_OR_SF_WAY", .ucode = 0x2000, .udesc = "Ingress (from CMS) Request Queue Rejects -- Merging these two together to make room for ANY_REJECT_*0", }, { .uname = "LLC_VICTIM", .ucode = 0x400, .udesc = "Ingress (from CMS) Request Queue Rejects -- LLC Victim", }, { .uname = "PA_MATCH", .ucode = 0x8000, .udesc = "Ingress (from CMS) Request Queue Rejects -- PhyAddr Match", }, { .uname = "SF_VICTIM", .ucode = 0x800, .udesc = "Ingress (from CMS) Request Queue Rejects -- SF Victim", }, { .uname = "VICTIM", .ucode = 0x1000, .udesc = "Ingress (from CMS) Request Queue Rejects -- Victim", }, }; static intel_x86_umask_t skx_unc_c_rxc_ismq0_reject[]={ { .uname = "AD_REQ_VN0", .ucode = 0x100, .udesc = "ISMQ Rejects -- AD REQ on VN0", }, { .uname = "AD_RSP_VN0", .ucode = 0x200, .udesc = "ISMQ Rejects -- AD RSP on VN0", }, { .uname = "AK_NON_UPI", .ucode = 0x4000, .udesc = "ISMQ Rejects -- Non UPI AK Request", }, { 
.uname = "BL_NCB_VN0", .ucode = 0x1000, .udesc = "ISMQ Rejects -- BL NCB on VN0", }, { .uname = "BL_NCS_VN0", .ucode = 0x2000, .udesc = "ISMQ Rejects -- BL NCS on VN0", }, { .uname = "BL_RSP_VN0", .ucode = 0x400, .udesc = "ISMQ Rejects -- BL RSP on VN0", }, { .uname = "BL_WB_VN0", .ucode = 0x800, .udesc = "ISMQ Rejects -- BL WB on VN0", }, { .uname = "IV_NON_UPI", .ucode = 0x8000, .udesc = "ISMQ Rejects -- Non UPI IV Request", }, }; static intel_x86_umask_t skx_unc_c_rxc_ismq0_retry[]={ { .uname = "AD_REQ_VN0", .ucode = 0x100, .udesc = "ISMQ Retries -- AD REQ on VN0", }, { .uname = "AD_RSP_VN0", .ucode = 0x200, .udesc = "ISMQ Retries -- AD RSP on VN0", }, { .uname = "AK_NON_UPI", .ucode = 0x4000, .udesc = "ISMQ Retries -- Non UPI AK Request", }, { .uname = "BL_NCB_VN0", .ucode = 0x1000, .udesc = "ISMQ Retries -- BL NCB on VN0", }, { .uname = "BL_NCS_VN0", .ucode = 0x2000, .udesc = "ISMQ Retries -- BL NCS on VN0", }, { .uname = "BL_RSP_VN0", .ucode = 0x400, .udesc = "ISMQ Retries -- BL RSP on VN0", }, { .uname = "BL_WB_VN0", .ucode = 0x800, .udesc = "ISMQ Retries -- BL WB on VN0", }, { .uname = "IV_NON_UPI", .ucode = 0x8000, .udesc = "ISMQ Retries -- Non UPI IV Request", }, }; static intel_x86_umask_t skx_unc_c_rxc_ismq1_reject[]={ { .uname = "ANY0", .ucode = 0x100, .udesc = "ISMQ Rejects -- ANY0", }, { .uname = "HA", .ucode = 0x200, .udesc = "ISMQ Rejects -- HA", }, }; static intel_x86_umask_t skx_unc_c_rxc_ismq1_retry[]={ { .uname = "ANY0", .ucode = 0x100, .udesc = "ISMQ Retries -- ANY0", }, { .uname = "HA", .ucode = 0x200, .udesc = "ISMQ Retries -- HA", }, }; static intel_x86_umask_t skx_unc_c_rxc_occupancy[]={ { .uname = "IPQ", .ucode = 0x400, .udesc = "Ingress (from CMS) Occupancy -- IPQ", }, { .uname = "IRQ", .ucode = 0x100, .udesc = "Ingress (from CMS) Occupancy -- IRQ", }, { .uname = "RRQ", .ucode = 0x4000, .udesc = "Ingress (from CMS) Occupancy -- RRQ", }, { .uname = "WBQ", .ucode = 0x8000, .udesc = "Ingress (from CMS) Occupancy -- WBQ", }, }; static 
intel_x86_umask_t skx_unc_c_rxc_other0_retry[]={ { .uname = "AD_REQ_VN0", .ucode = 0x100, .udesc = "Other Retries -- AD REQ on VN0", }, { .uname = "AD_RSP_VN0", .ucode = 0x200, .udesc = "Other Retries -- AD RSP on VN0", }, { .uname = "AK_NON_UPI", .ucode = 0x4000, .udesc = "Other Retries -- Non UPI AK Request", }, { .uname = "BL_NCB_VN0", .ucode = 0x1000, .udesc = "Other Retries -- BL NCB on VN0", }, { .uname = "BL_NCS_VN0", .ucode = 0x2000, .udesc = "Other Retries -- BL NCS on VN0", }, { .uname = "BL_RSP_VN0", .ucode = 0x400, .udesc = "Other Retries -- BL RSP on VN0", }, { .uname = "BL_WB_VN0", .ucode = 0x800, .udesc = "Other Retries -- BL WB on VN0", }, { .uname = "IV_NON_UPI", .ucode = 0x8000, .udesc = "Other Retries -- Non UPI IV Request", }, }; static intel_x86_umask_t skx_unc_c_rxc_other1_retry[]={ { .uname = "ALLOW_SNP", .ucode = 0x4000, .udesc = "Other Retries -- Allow Snoop", }, { .uname = "ANY0", .ucode = 0x100, .udesc = "Other Retries -- ANY0", }, { .uname = "HA", .ucode = 0x200, .udesc = "Other Retries -- HA", }, { .uname = "LLC_OR_SF_WAY", .ucode = 0x2000, .udesc = "Other Retries -- Merging these two together to make room for ANY_REJECT_*0", }, { .uname = "LLC_VICTIM", .ucode = 0x400, .udesc = "Other Retries -- LLC Victim", }, { .uname = "PA_MATCH", .ucode = 0x8000, .udesc = "Other Retries -- PhyAddr Match", }, { .uname = "SF_VICTIM", .ucode = 0x800, .udesc = "Other Retries -- SF Victim", }, { .uname = "VICTIM", .ucode = 0x1000, .udesc = "Other Retries -- Victim", }, }; static intel_x86_umask_t skx_unc_c_rxc_prq0_reject[]={ { .uname = "AD_REQ_VN0", .ucode = 0x100, .udesc = "Ingress (from CMS) Request Queue Rejects -- AD REQ on VN0", }, { .uname = "AD_RSP_VN0", .ucode = 0x200, .udesc = "Ingress (from CMS) Request Queue Rejects -- AD RSP on VN0", }, { .uname = "AK_NON_UPI", .ucode = 0x4000, .udesc = "Ingress (from CMS) Request Queue Rejects -- Non UPI AK Request", }, { .uname = "BL_NCB_VN0", .ucode = 0x1000, .udesc = "Ingress (from CMS) Request Queue 
Rejects -- BL NCB on VN0", }, { .uname = "BL_NCS_VN0", .ucode = 0x2000, .udesc = "Ingress (from CMS) Request Queue Rejects -- BL NCS on VN0", }, { .uname = "BL_RSP_VN0", .ucode = 0x400, .udesc = "Ingress (from CMS) Request Queue Rejects -- BL RSP on VN0", }, { .uname = "BL_WB_VN0", .ucode = 0x800, .udesc = "Ingress (from CMS) Request Queue Rejects -- BL WB on VN0", }, { .uname = "IV_NON_UPI", .ucode = 0x8000, .udesc = "Ingress (from CMS) Request Queue Rejects -- Non UPI IV Request", }, }; static intel_x86_umask_t skx_unc_c_rxc_prq1_reject[]={ { .uname = "ALLOW_SNP", .ucode = 0x4000, .udesc = "Ingress (from CMS) Request Queue Rejects -- Allow Snoop", }, { .uname = "ANY0", .ucode = 0x100, .udesc = "Ingress (from CMS) Request Queue Rejects -- ANY0", }, { .uname = "HA", .ucode = 0x200, .udesc = "Ingress (from CMS) Request Queue Rejects -- HA", }, { .uname = "LLC_OR_SF_WAY", .ucode = 0x2000, .udesc = "Ingress (from CMS) Request Queue Rejects -- LLC OR SF Way", }, { .uname = "LLC_VICTIM", .ucode = 0x400, .udesc = "Ingress (from CMS) Request Queue Rejects -- LLC Victim", }, { .uname = "PA_MATCH", .ucode = 0x8000, .udesc = "Ingress (from CMS) Request Queue Rejects -- PhyAddr Match", }, { .uname = "SF_VICTIM", .ucode = 0x800, .udesc = "Ingress (from CMS) Request Queue Rejects -- SF Victim", }, { .uname = "VICTIM", .ucode = 0x1000, .udesc = "Ingress (from CMS) Request Queue Rejects -- Victim", }, }; static intel_x86_umask_t skx_unc_c_rxc_req_q0_retry[]={ { .uname = "AD_REQ_VN0", .ucode = 0x100, .udesc = "Request Queue Retries -- AD REQ on VN0", }, { .uname = "AD_RSP_VN0", .ucode = 0x200, .udesc = "Request Queue Retries -- AD RSP on VN0", }, { .uname = "AK_NON_UPI", .ucode = 0x4000, .udesc = "Request Queue Retries -- Non UPI AK Request", }, { .uname = "BL_NCB_VN0", .ucode = 0x1000, .udesc = "Request Queue Retries -- BL NCB on VN0", }, { .uname = "BL_NCS_VN0", .ucode = 0x2000, .udesc = "Request Queue Retries -- BL NCS on VN0", }, { .uname = "BL_RSP_VN0", .ucode = 0x400, .udesc 
= "Request Queue Retries -- BL RSP on VN0", }, { .uname = "BL_WB_VN0", .ucode = 0x800, .udesc = "Request Queue Retries -- BL WB on VN0", }, { .uname = "IV_NON_UPI", .ucode = 0x8000, .udesc = "Request Queue Retries -- Non UPI IV Request", }, }; static intel_x86_umask_t skx_unc_c_rxc_req_q1_retry[]={ { .uname = "ALLOW_SNP", .ucode = 0x4000, .udesc = "Request Queue Retries -- Allow Snoop", }, { .uname = "ANY0", .ucode = 0x100, .udesc = "Request Queue Retries -- ANY0", }, { .uname = "HA", .ucode = 0x200, .udesc = "Request Queue Retries -- HA", }, { .uname = "LLC_OR_SF_WAY", .ucode = 0x2000, .udesc = "Request Queue Retries -- Merging these two together to make room for ANY_REJECT_*0", }, { .uname = "LLC_VICTIM", .ucode = 0x400, .udesc = "Request Queue Retries -- LLC Victim", }, { .uname = "PA_MATCH", .ucode = 0x8000, .udesc = "Request Queue Retries -- PhyAddr Match", }, { .uname = "SF_VICTIM", .ucode = 0x800, .udesc = "Request Queue Retries -- SF Victim", }, { .uname = "VICTIM", .ucode = 0x1000, .udesc = "Request Queue Retries -- Victim", }, }; static intel_x86_umask_t skx_unc_c_rxc_rrq0_reject[]={ { .uname = "AD_REQ_VN0", .ucode = 0x100, .udesc = "RRQ Rejects -- AD REQ on VN0", }, { .uname = "AD_RSP_VN0", .ucode = 0x200, .udesc = "RRQ Rejects -- AD RSP on VN0", }, { .uname = "AK_NON_UPI", .ucode = 0x4000, .udesc = "RRQ Rejects -- Non UPI AK Request", }, { .uname = "BL_NCB_VN0", .ucode = 0x1000, .udesc = "RRQ Rejects -- BL NCB on VN0", }, { .uname = "BL_NCS_VN0", .ucode = 0x2000, .udesc = "RRQ Rejects -- BL NCS on VN0", }, { .uname = "BL_RSP_VN0", .ucode = 0x400, .udesc = "RRQ Rejects -- BL RSP on VN0", }, { .uname = "BL_WB_VN0", .ucode = 0x800, .udesc = "RRQ Rejects -- BL WB on VN0", }, { .uname = "IV_NON_UPI", .ucode = 0x8000, .udesc = "RRQ Rejects -- Non UPI IV Request", }, }; static intel_x86_umask_t skx_unc_c_rxc_rrq1_reject[]={ { .uname = "ALLOW_SNP", .ucode = 0x4000, .udesc = "RRQ Rejects -- Allow Snoop", }, { .uname = "ANY0", .ucode = 0x100, .udesc = "RRQ 
Rejects -- ANY0", }, { .uname = "HA", .ucode = 0x200, .udesc = "RRQ Rejects -- HA", }, { .uname = "LLC_OR_SF_WAY", .ucode = 0x2000, .udesc = "RRQ Rejects -- Merging these two together to make room for ANY_REJECT_*0", }, { .uname = "LLC_VICTIM", .ucode = 0x400, .udesc = "RRQ Rejects -- LLC Victim", }, { .uname = "PA_MATCH", .ucode = 0x8000, .udesc = "RRQ Rejects -- PhyAddr Match", }, { .uname = "SF_VICTIM", .ucode = 0x800, .udesc = "RRQ Rejects -- SF Victim", }, { .uname = "VICTIM", .ucode = 0x1000, .udesc = "RRQ Rejects -- Victim", }, }; static intel_x86_umask_t skx_unc_c_rxc_wbq0_reject[]={ { .uname = "AD_REQ_VN0", .ucode = 0x100, .udesc = "WBQ Rejects -- AD REQ on VN0", }, { .uname = "AD_RSP_VN0", .ucode = 0x200, .udesc = "WBQ Rejects -- AD RSP on VN0", }, { .uname = "AK_NON_UPI", .ucode = 0x4000, .udesc = "WBQ Rejects -- Non UPI AK Request", }, { .uname = "BL_NCB_VN0", .ucode = 0x1000, .udesc = "WBQ Rejects -- BL NCB on VN0", }, { .uname = "BL_NCS_VN0", .ucode = 0x2000, .udesc = "WBQ Rejects -- BL NCS on VN0", }, { .uname = "BL_RSP_VN0", .ucode = 0x400, .udesc = "WBQ Rejects -- BL RSP on VN0", }, { .uname = "BL_WB_VN0", .ucode = 0x800, .udesc = "WBQ Rejects -- BL WB on VN0", }, { .uname = "IV_NON_UPI", .ucode = 0x8000, .udesc = "WBQ Rejects -- Non UPI IV Request", }, }; static intel_x86_umask_t skx_unc_c_rxc_wbq1_reject[]={ { .uname = "ALLOW_SNP", .ucode = 0x4000, .udesc = "WBQ Rejects -- Allow Snoop", }, { .uname = "ANY0", .ucode = 0x100, .udesc = "WBQ Rejects -- ANY0", }, { .uname = "HA", .ucode = 0x200, .udesc = "WBQ Rejects -- HA", }, { .uname = "LLC_OR_SF_WAY", .ucode = 0x2000, .udesc = "WBQ Rejects -- Merging these two together to make room for ANY_REJECT_*0", }, { .uname = "LLC_VICTIM", .ucode = 0x400, .udesc = "WBQ Rejects -- LLC Victim", }, { .uname = "PA_MATCH", .ucode = 0x8000, .udesc = "WBQ Rejects -- PhyAddr Match", }, { .uname = "SF_VICTIM", .ucode = 0x800, .udesc = "WBQ Rejects -- SF Victim", }, { .uname = "VICTIM", .ucode = 0x1000, .udesc = "WBQ 
Rejects -- Victim", }, }; static intel_x86_umask_t skx_unc_c_rxr_busy_starved[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Transgress Injection Starvation -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Transgress Injection Starvation -- AD - Credit", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Transgress Injection Starvation -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Transgress Injection Starvation -- BL - Credit", }, }; static intel_x86_umask_t skx_unc_c_rxr_bypass[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Transgress Ingress Bypass -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Transgress Ingress Bypass -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Transgress Ingress Bypass -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Transgress Ingress Bypass -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Transgress Ingress Bypass -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Transgress Ingress Bypass -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_c_rxr_crd_starved[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Transgress Injection Starvation -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Transgress Injection Starvation -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Transgress Injection Starvation -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Transgress Injection Starvation -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Transgress Injection Starvation -- BL - Credit", }, { .uname = "IFV", .ucode = 0x8000, .udesc = "Transgress Injection Starvation -- IFV - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Transgress Injection Starvation -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_c_rxr_inserts[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Transgress Ingress Allocations -- AD - Bounce", }, { 
.uname = "AD_CRD", .ucode = 0x1000, .udesc = "Transgress Ingress Allocations -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Transgress Ingress Allocations -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Transgress Ingress Allocations -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Transgress Ingress Allocations -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Transgress Ingress Allocations -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_c_rxr_occupancy[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Transgress Ingress Occupancy -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Transgress Ingress Occupancy -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Transgress Ingress Occupancy -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Transgress Ingress Occupancy -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Transgress Ingress Occupancy -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Transgress Ingress Occupancy -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_c_sf_eviction[]={ { .uname = "E_STATE", .ucode = 0x200, .udesc = "Snoop Filter Eviction -- E state", }, { .uname = "M_STATE", .ucode = 0x100, .udesc = "Snoop Filter Eviction -- M state", }, { .uname = "S_STATE", .ucode = 0x400, .udesc = "Snoop Filter Eviction -- S state", }, }; static intel_x86_umask_t skx_unc_c_snoops_sent[]={ { .uname = "ALL", .ucode = 0x100, .udesc = "Snoops Sent -- All", }, { .uname = "BCST_LOCAL", .ucode = 0x1000, .udesc = "Snoops Sent -- Broadcast snoop for Local Requests", }, { .uname = "BCST_REMOTE", .ucode = 0x2000, .udesc = "Snoops Sent -- Broadcast snoops for Remote Requests", }, { .uname = "DIRECT_LOCAL", .ucode = 0x4000, .udesc = "Snoops Sent -- Directed snoops for Local Requests", }, { .uname = "DIRECT_REMOTE", .ucode = 0x8000, .udesc = "Snoops Sent -- Directed snoops for Remote Requests", }, { 
.uname = "LOCAL", .ucode = 0x400, .udesc = "Snoops Sent -- Broadcast or directed Snoops sent for Local Requests", }, { .uname = "REMOTE", .ucode = 0x800, .udesc = "Snoops Sent -- Broadcast or directed Snoops sent for Remote Requests", }, }; static intel_x86_umask_t skx_unc_c_snoop_resp[]={ { .uname = "RSPCNFLCTS", .ucode = 0x4000, .udesc = "Snoop Responses Received -- RSPCNFLCT*", }, { .uname = "RSPFWD", .ucode = 0x8000, .udesc = "Snoop Responses Received -- RspFwd", }, { .uname = "RSPI", .ucode = 0x100, .udesc = "Snoop Responses Received -- RspI", }, { .uname = "RSPIFWD", .ucode = 0x400, .udesc = "Snoop Responses Received -- RspIFwd", }, { .uname = "RSPS", .ucode = 0x200, .udesc = "Snoop Responses Received -- RspS", }, { .uname = "RSPSFWD", .ucode = 0x800, .udesc = "Snoop Responses Received -- RspSFwd", }, { .uname = "RSP_FWD_WB", .ucode = 0x2000, .udesc = "Snoop Responses Received -- Rsp*Fwd*WB", }, { .uname = "RSP_WBWB", .ucode = 0x1000, .udesc = "Snoop Responses Received -- Rsp*WB", }, }; static intel_x86_umask_t skx_unc_c_snoop_resp_local[]={ { .uname = "RSPFWD", .ucode = 0x8000, .udesc = "Snoop Responses Received Local -- RspFwd", }, { .uname = "RSPI", .ucode = 0x100, .udesc = "Snoop Responses Received Local -- RspI", }, { .uname = "RSPIFWD", .ucode = 0x400, .udesc = "Snoop Responses Received Local -- RspIFwd", }, { .uname = "RSPS", .ucode = 0x200, .udesc = "Snoop Responses Received Local -- RspS", }, { .uname = "RSPSFWD", .ucode = 0x800, .udesc = "Snoop Responses Received Local -- RspSFwd", }, { .uname = "RSP_FWD_WB", .ucode = 0x2000, .udesc = "Snoop Responses Received Local -- Rsp*FWD*WB", }, { .uname = "RSP_WB", .ucode = 0x1000, .udesc = "Snoop Responses Received Local -- Rsp*WB", }, }; static intel_x86_umask_t skx_unc_c_stall_no_txr_horz_crd_ad_ag0[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "Stall on No AD Agent0 Transgress Credits -- For 
Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_c_stall_no_txr_horz_crd_ad_ag1[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_c_stall_no_txr_horz_crd_bl_ag0[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 5", }, }; static intel_x86_umask_t 
skx_unc_c_stall_no_txr_horz_crd_bl_ag1[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_c_tor_inserts[]={ { .uname = "ALL_HIT", .ucode = 0x1500, .udesc = "TOR Inserts -- Hits from Local", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "ALL_IO_IA", .ucode = 0x3500, .udesc = "TOR Inserts -- All from Local iA and IO", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "ALL_MISS", .ucode = 0x2500, .udesc = "TOR Inserts -- Misses from Local", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "EVICT", .ucode = 0x200, .udesc = "TOR Inserts -- SF/LLC Evictions", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "HIT", .ucode = 0x1000, .udesc = "TOR Inserts -- Hit (Not a Miss)", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "IA", .ucode = 0x3100, .udesc = "TOR Inserts -- All from Local iA", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "IA_HIT", .ucode = 0x1100, .udesc = "TOR Inserts -- Hits from Local iA", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "IA_MISS", .ucode = 0x2100, .udesc = "TOR Inserts -- Misses from Local iA", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "IO", .ucode = 0x3400, .udesc = "TOR Inserts -- All from Local IO", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "IO_HIT", .ucode = 0x1400, .udesc = "TOR 
Inserts -- Hits from Local IO", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "IO_MISS", .ucode = 0x2400, .udesc = "TOR Inserts -- Misses from Local IO", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "MISS", .ucode = 0x2000, .udesc = "TOR Inserts -- Miss", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "IPQ", .ucode = 0x800, .udesc = "TOR Inserts -- IPQ", .grpid = 1, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "IRQ", .ucode = 0x100, .udesc = "TOR Inserts -- IRQ", .grpid = 2, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "PRQ", .ucode = 0x400, .udesc = "TOR Inserts -- PRQ", .grpid = 3, .uflags= INTEL_X86_GRP_DFL_NONE, }, CHA_FILT_OPC_IPQ(0, 4, 1, 9), CHA_FILT_OPC_IPQ(1, 5, 1, 19), CHA_FILT_OPC_IRQ(0, 4, 2, 9), CHA_FILT_OPC_IRQ(1, 5, 2, 19), CHA_FILT_OPC_PRQ(0, 4, 3, 9), CHA_FILT_OPC_PRQ(1, 5, 3, 19), //CHA_FILT_OPC_WBQ(0, 1, 9), //CHA_FILT_OPC_WBQ(1, 2, 19), }; static intel_x86_umask_t skx_unc_c_tor_occupancy[]={ { .uname = "ALL", .ucode = 0x3700, .udesc = "TOR Occupancy -- All from Local", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "ALL_HIT", .ucode = 0x1700, .udesc = "TOR Occupancy -- Hits from Local", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "ALL_MISS", .ucode = 0x2700, .udesc = "TOR Occupancy -- Misses from Local", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "EVICT", .ucode = 0x200, .udesc = "TOR Occupancy -- SF/LLC Evictions", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "HIT", .ucode = 0x1000, .udesc = "TOR Occupancy -- Hit (Not a Miss)", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "IA", .ucode = 0x3100, .udesc = "TOR Occupancy -- All from Local iA", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "IA_HIT", .ucode = 0x1100, .udesc = "TOR Occupancy -- Hits from Local iA", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "IA_MISS", .ucode = 0x2100, .udesc = "TOR Occupancy -- Misses from Local iA", .grpid = 0, .uflags= 
INTEL_X86_GRP_DFL_NONE, }, { .uname = "IO", .ucode = 0x3400, .udesc = "TOR Occupancy -- All from Local IO", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "IO_HIT", .ucode = 0x1400, .udesc = "TOR Occupancy -- Hits from Local IO", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "IO_MISS", .ucode = 0x2400, .udesc = "TOR Occupancy -- Misses from Local IO", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "MISS", .ucode = 0x2000, .udesc = "TOR Occupancy -- Miss", .grpid = 0, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "IPQ", .ucode = 0x800, .udesc = "TOR Occupancy -- IPQ", .grpid = 1, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "IRQ", .ucode = 0x100, .udesc = "TOR Occupancy -- IRQ", .grpid = 2, .uflags= INTEL_X86_GRP_DFL_NONE, }, { .uname = "PRQ", .ucode = 0x400, .udesc = "TOR Occupancy -- PRQ", .grpid = 3, .uflags= INTEL_X86_GRP_DFL_NONE, }, CHA_FILT_OPC_IPQ(0, 4, 1, 9), CHA_FILT_OPC_IPQ(1, 5, 1, 19), CHA_FILT_OPC_IRQ(0, 4, 2, 9), CHA_FILT_OPC_IRQ(1, 5, 2, 19), CHA_FILT_OPC_PRQ(0, 4, 3, 9), CHA_FILT_OPC_PRQ(1, 5, 3, 19), //CHA_FILT_OPC_WBQ(0, 1, 9), //CHA_FILT_OPC_WBQ(1, 2, 19), }; static intel_x86_umask_t skx_unc_c_txr_horz_ads_used[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal ADS Used -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "CMS Horizontal ADS Used -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal ADS Used -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "CMS Horizontal ADS Used -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "CMS Horizontal ADS Used -- BL - Credit", }, }; static intel_x86_umask_t skx_unc_c_txr_horz_bypass[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal Bypass Used -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "CMS Horizontal Bypass Used -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal Bypass Used -- AK - Bounce", }, { .uname 
= "BL_BNC", .ucode = 0x400, .udesc = "CMS Horizontal Bypass Used -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "CMS Horizontal Bypass Used -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "CMS Horizontal Bypass Used -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_c_txr_horz_cycles_full[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_c_txr_horz_cycles_ne[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_c_txr_horz_inserts[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal Egress Inserts -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "CMS Horizontal Egress Inserts -- AD - 
Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal Egress Inserts -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "CMS Horizontal Egress Inserts -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "CMS Horizontal Egress Inserts -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "CMS Horizontal Egress Inserts -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_c_txr_horz_nack[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal Egress NACKs -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x2000, .udesc = "CMS Horizontal Egress NACKs -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal Egress NACKs -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "CMS Horizontal Egress NACKs -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "CMS Horizontal Egress NACKs -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "CMS Horizontal Egress NACKs -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_c_txr_horz_occupancy[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal Egress Occupancy -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "CMS Horizontal Egress Occupancy -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal Egress Occupancy -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "CMS Horizontal Egress Occupancy -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "CMS Horizontal Egress Occupancy -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "CMS Horizontal Egress Occupancy -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_c_txr_horz_starved[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal Egress Injection Starvation -- AD - Bounce", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal Egress Injection Starvation -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 
0x400, .udesc = "CMS Horizontal Egress Injection Starvation -- BL - Bounce", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "CMS Horizontal Egress Injection Starvation -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_c_txr_vert_ads_used[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vertical ADS Used -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vertical ADS Used -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vertical ADS Used -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vertical ADS Used -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vertical ADS Used -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vertical ADS Used -- BL - Agent 1", }, }; static intel_x86_umask_t skx_unc_c_txr_vert_bypass[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vertical Bypass Used -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vertical Bypass Used -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vertical Bypass Used -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vertical Bypass Used -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vertical Bypass Used -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vertical Bypass Used -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "CMS Vertical Bypass Used -- IV", }, }; static intel_x86_umask_t skx_unc_c_txr_vert_cycles_full[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode
= 0x400, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- IV", }, }; static intel_x86_umask_t skx_unc_c_txr_vert_cycles_ne[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- IV", }, }; static intel_x86_umask_t skx_unc_c_txr_vert_inserts[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vert Egress Allocations -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vert Egress Allocations -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vert Egress Allocations -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vert Egress Allocations -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vert Egress Allocations -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vert Egress Allocations -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "CMS Vert Egress Allocations -- IV", }, }; static intel_x86_umask_t skx_unc_c_txr_vert_nack[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vertical Egress NACKs -- AD - Agent 
0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vertical Egress NACKs -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vertical Egress NACKs -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vertical Egress NACKs -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vertical Egress NACKs -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vertical Egress NACKs -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "CMS Vertical Egress NACKs -- IV", }, }; static intel_x86_umask_t skx_unc_c_txr_vert_occupancy[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vert Egress Occupancy -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vert Egress Occupancy -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vert Egress Occupancy -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vert Egress Occupancy -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vert Egress Occupancy -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vert Egress Occupancy -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "CMS Vert Egress Occupancy -- IV", }, }; static intel_x86_umask_t skx_unc_c_txr_vert_starved[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vertical Egress Injection Starvation -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vertical Egress Injection Starvation -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vertical Egress Injection Starvation -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vertical Egress Injection Starvation -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vertical Egress Injection Starvation -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vertical Egress Injection Starvation -- BL - Agent 1", }, { .uname = 
"IV", .ucode = 0x800, .udesc = "CMS Vertical Egress Injection Starvation -- IV", }, }; static intel_x86_umask_t skx_unc_c_vert_ring_ad_in_use[]={ { .uname = "DN_EVEN", .ucode = 0x400, .udesc = "Vertical AD Ring In Use -- Down and Even", }, { .uname = "DN_ODD", .ucode = 0x800, .udesc = "Vertical AD Ring In Use -- Down and Odd", }, { .uname = "UP_EVEN", .ucode = 0x100, .udesc = "Vertical AD Ring In Use -- Up and Even", }, { .uname = "UP_ODD", .ucode = 0x200, .udesc = "Vertical AD Ring In Use -- Up and Odd", }, }; static intel_x86_umask_t skx_unc_c_vert_ring_ak_in_use[]={ { .uname = "DN_EVEN", .ucode = 0x400, .udesc = "Vertical AK Ring In Use -- Down and Even", }, { .uname = "DN_ODD", .ucode = 0x800, .udesc = "Vertical AK Ring In Use -- Down and Odd", }, { .uname = "UP_EVEN", .ucode = 0x100, .udesc = "Vertical AK Ring In Use -- Up and Even", }, { .uname = "UP_ODD", .ucode = 0x200, .udesc = "Vertical AK Ring In Use -- Up and Odd", }, }; static intel_x86_umask_t skx_unc_c_vert_ring_bl_in_use[]={ { .uname = "DN_EVEN", .ucode = 0x400, .udesc = "Vertical BL Ring in Use -- Down and Even", }, { .uname = "DN_ODD", .ucode = 0x800, .udesc = "Vertical BL Ring in Use -- Down and Odd", }, { .uname = "UP_EVEN", .ucode = 0x100, .udesc = "Vertical BL Ring in Use -- Up and Even", }, { .uname = "UP_ODD", .ucode = 0x200, .udesc = "Vertical BL Ring in Use -- Up and Odd", }, }; static intel_x86_umask_t skx_unc_c_vert_ring_iv_in_use[]={ { .uname = "DN", .ucode = 0x400, .udesc = "Vertical IV Ring in Use -- Down", }, { .uname = "UP", .ucode = 0x100, .udesc = "Vertical IV Ring in Use -- Up", }, }; static intel_x86_umask_t skx_unc_c_wb_push_mtoi[]={ { .uname = "LLC", .ucode = 0x100, .udesc = "WbPushMtoI -- Pushed to LLC", }, { .uname = "MEM", .ucode = 0x200, .udesc = "WbPushMtoI -- Pushed to Memory", }, }; static intel_x86_umask_t skx_unc_c_write_no_credits[]={ { .uname = "EDC0_SMI2", .ucode = 0x400, .udesc = "CHA iMC CHNx WRITE Credits Empty -- EDC0_SMI2", }, { .uname = "EDC1_SMI3", .ucode = 
0x800, .udesc = "CHA iMC CHNx WRITE Credits Empty -- EDC1_SMI3", }, { .uname = "EDC2_SMI4", .ucode = 0x1000, .udesc = "CHA iMC CHNx WRITE Credits Empty -- EDC2_SMI4", }, { .uname = "EDC3_SMI5", .ucode = 0x2000, .udesc = "CHA iMC CHNx WRITE Credits Empty -- EDC3_SMI5", }, { .uname = "MC0_SMI0", .ucode = 0x100, .udesc = "CHA iMC CHNx WRITE Credits Empty -- MC0_SMI0", }, { .uname = "MC1_SMI1", .ucode = 0x200, .udesc = "CHA iMC CHNx WRITE Credits Empty -- MC1_SMI1", }, }; static intel_x86_umask_t skx_unc_c_xsnp_resp[]={ { .uname = "ANY_RSPI_FWDFE", .ucode = 0xe400, .udesc = "Core Cross Snoop Responses -- Any RspIFwdFE", }, { .uname = "ANY_RSPS_FWDFE", .ucode = 0xe200, .udesc = "Core Cross Snoop Responses -- Any RspSFwdFE", }, { .uname = "ANY_RSPS_FWDM", .ucode = 0xe800, .udesc = "Core Cross Snoop Responses -- Any RspSFwdM", }, { .uname = "ANY_RSP_HITFSE", .ucode = 0xe100, .udesc = "Core Cross Snoop Responses -- Any RspHitFSE", }, { .uname = "CORE_RSPI_FWDFE", .ucode = 0x4400, .udesc = "Core Cross Snoop Responses -- Core RspIFwdFE", }, { .uname = "CORE_RSPI_FWDM", .ucode = 0x5000, .udesc = "Core Cross Snoop Responses -- Core RspIFwdM", }, { .uname = "CORE_RSPS_FWDFE", .ucode = 0x4200, .udesc = "Core Cross Snoop Responses -- Core RspSFwdFE", }, { .uname = "CORE_RSPS_FWDM", .ucode = 0x4800, .udesc = "Core Cross Snoop Responses -- Core RspSFwdM", }, { .uname = "CORE_RSP_HITFSE", .ucode = 0x4100, .udesc = "Core Cross Snoop Responses -- Core RspHitFSE", }, { .uname = "EVICT_RSPI_FWDFE", .ucode = 0x8400, .udesc = "Core Cross Snoop Responses -- Evict RspIFwdFE", }, { .uname = "EVICT_RSPI_FWDM", .ucode = 0x9000, .udesc = "Core Cross Snoop Responses -- Evict RspIFwdM", }, { .uname = "EVICT_RSPS_FWDFE", .ucode = 0x8200, .udesc = "Core Cross Snoop Responses -- Evict RspSFwdFE", }, { .uname = "EVICT_RSPS_FWDM", .ucode = 0x8800, .udesc = "Core Cross Snoop Responses -- Evict RspSFwdM", }, { .uname = "EVICT_RSP_HITFSE", .ucode = 0x8100, .udesc = "Core Cross Snoop Responses -- Evict 
RspHitFSE", }, { .uname = "EXT_RSPI_FWDFE", .ucode = 0x2400, .udesc = "Core Cross Snoop Responses -- External RspIFwdFE", }, { .uname = "EXT_RSPI_FWDM", .ucode = 0x3000, .udesc = "Core Cross Snoop Responses -- External RspIFwdM", }, { .uname = "EXT_RSPS_FWDFE", .ucode = 0x2200, .udesc = "Core Cross Snoop Responses -- External RspSFwdFE", }, { .uname = "EXT_RSPS_FWDM", .ucode = 0x2800, .udesc = "Core Cross Snoop Responses -- External RspSFwdM", }, { .uname = "EXT_RSP_HITFSE", .ucode = 0x2100, .udesc = "Core Cross Snoop Responses -- External RspHitFSE", }, }; static intel_x86_entry_t intel_skx_unc_c_pe[]={ { .name = "UNC_C_AG0_AD_CRD_ACQUIRED", .code = 0x80, .desc = "Number of CMS Agent 0 AD credits acquired in a given cycle, per transgress.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_ag0_ad_crd_acquired, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_ag0_ad_crd_acquired), }, { .name = "UNC_C_AG0_AD_CRD_OCCUPANCY", .code = 0x82, .desc = "Number of CMS Agent 0 AD credits in use in a given cycle, per transgress", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_ag0_ad_crd_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_ag0_ad_crd_occupancy), }, { .name = "UNC_C_AG0_BL_CRD_ACQUIRED", .code = 0x88, .desc = "Number of CMS Agent 0 BL credits acquired in a given cycle, per transgress.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_ag0_bl_crd_acquired, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_ag0_bl_crd_acquired), }, { .name = "UNC_C_AG0_BL_CRD_OCCUPANCY", .code = 0x8a, .desc = "Number of CMS Agent 0 BL credits in use in a given cycle, per transgress", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_ag0_bl_crd_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_ag0_bl_crd_occupancy), }, { .name = "UNC_C_AG1_AD_CRD_ACQUIRED", .code = 0x84, .desc = "Number of CMS Agent 1 AD credits acquired in a given cycle, per transgress.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 
1, .umasks = skx_unc_c_ag1_ad_crd_acquired, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_ag1_ad_crd_acquired), }, { .name = "UNC_C_AG1_AD_CRD_OCCUPANCY", .code = 0x86, .desc = "Number of CMS Agent 1 AD credits in use in a given cycle, per transgress", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_ag1_ad_crd_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_ag1_ad_crd_occupancy), }, { .name = "UNC_C_AG1_BL_CRD_OCCUPANCY", .code = 0x8e, .desc = "Number of CMS Agent 1 BL credits in use in a given cycle, per transgress", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_ag1_bl_crd_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_ag1_bl_crd_occupancy), }, { .name = "UNC_C_AG1_BL_CREDITS_ACQUIRED", .code = 0x8c, .desc = "Number of CMS Agent 1 BL credits acquired in a given cycle, per transgress.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_ag1_bl_credits_acquired, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_ag1_bl_credits_acquired), }, { .name = "UNC_C_BYPASS_CHA_IMC", .code = 0x57, .desc = "Counts the number of times when the CHA was able to bypass HA pipe on the way to iMC. This is a latency optimization for situations when there is light loading on the memory subsystem. 
This can be filtered by when the bypass was taken and when it was not.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_bypass_cha_imc, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_bypass_cha_imc), }, { .name = "UNC_C_CLOCKTICKS", .code = 0x0, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_C_CMS_CLOCKTICKS", .code = 0xc0, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_C_CORE_PMA", .code = 0x17, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_core_pma, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_core_pma), }, { .name = "UNC_C_CORE_SNP", .code = 0x33, .desc = "Counts the number of transactions that trigger a configurable number of cross snoops. Cores are snooped if the transaction looks up the cache and determines that it is necessary based on the operation type and what CoreValid bits are set. For example, if 2 CV bits are set on a data read, the cores must have the data in S state so it is not necessary to snoop them. However, if only 1 CV bit is set the core may have modified the data. If the transaction was an RFO, it would need to invalidate the lines. This event can be filtered based on who triggered the initial snoop(s).", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_core_snp, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_core_snp), }, { .name = "UNC_C_COUNTER0_OCCUPANCY", .code = 0x1f, .desc = "Since occupancy counts can only be captured in the Cbos 0 counter, this event allows a user to capture occupancy related information by filtering the Cb0 occupancy count captured in Counter 0. The filtering available is found in the control register - threshold, invert and edge detect. E.g. 
setting threshold to 1 can effectively monitor how many cycles the monitored queue has an entry.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_C_DIR_LOOKUP", .code = 0x53, .desc = "Counts the number of transactions that looked up the Home Agent directory. Can be filtered by requests that had to snoop and those that did not have to.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_dir_lookup, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_dir_lookup), }, { .name = "UNC_C_DIR_UPDATE", .code = 0x54, .desc = "Counts the number of directory updates that were required. These result in writes to the memory controller.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_dir_update, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_dir_update), }, { .name = "UNC_C_EGRESS_ORDERING", .code = 0xae, .desc = "Counts number of cycles IV was blocked in the TGR Egress due to SNP/GO Ordering requirements", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_egress_ordering, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_egress_ordering), }, { .name = "UNC_C_FAST_ASSERTED", .code = 0xa5, .desc = "Counts the number of cycles either the local or incoming distress signals are asserted. 
Incoming distress includes up, dn and across.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_fast_asserted, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_fast_asserted), }, { .name = "UNC_C_HITME_HIT", .code = 0x5f, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_hitme_hit, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_hitme_hit), }, { .name = "UNC_C_HITME_LOOKUP", .code = 0x5e, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_hitme_lookup, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_hitme_lookup), }, { .name = "UNC_C_HITME_MISS", .code = 0x60, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_hitme_miss, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_hitme_miss), }, { .name = "UNC_C_HITME_UPDATE", .code = 0x61, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_hitme_update, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_hitme_update), }, { .name = "UNC_C_HORZ_RING_AD_IN_USE", .code = 0xa7, .desc = "Counts the number of cycles that the Horizontal AD ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. 
In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_horz_ring_ad_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_horz_ring_ad_in_use), }, { .name = "UNC_C_HORZ_RING_AK_IN_USE", .code = 0xa9, .desc = "Counts the number of cycles that the Horizontal AK ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_horz_ring_ak_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_horz_ring_ak_in_use), }, { .name = "UNC_C_HORZ_RING_BL_IN_USE", .code = 0xab, .desc = "Counts the number of cycles that the Horizontal BL ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. 
In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_horz_ring_bl_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_horz_ring_bl_in_use), }, { .name = "UNC_C_HORZ_RING_IV_IN_USE", .code = 0xad, .desc = "Counts the number of cycles that the Horizontal IV ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. There is only 1 IV ring. Therefore, if one wants to monitor the Even ring, they should select both UP_EVEN and DN_EVEN. To monitor the Odd ring, they should select both UP_ODD and DN_ODD.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_horz_ring_iv_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_horz_ring_iv_in_use), }, { .name = "UNC_C_IMC_READS_COUNT", .code = 0x59, .desc = "Count of the number of reads issued to any of the memory controller channels. This can be filtered by the priority of the reads.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_imc_reads_count, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_imc_reads_count), }, { .name = "UNC_C_IMC_WRITES_COUNT", .code = 0x5b, .desc = "Counts the total number of writes issued from the HA into the memory controller. This counts for all four channels. 
It can be filtered by full/partial and ISOCH/non-ISOCH.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_imc_writes_count, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_imc_writes_count), }, { .name = "UNC_C_IODC_ALLOC", .code = 0x62, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_iodc_alloc, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_iodc_alloc), }, { .name = "UNC_C_IODC_DEALLOC", .code = 0x63, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_iodc_dealloc, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_iodc_dealloc), }, { .name = "UNC_C_LLC_LOOKUP", .code = 0x34, .desc = "Counts the number of times the LLC was accessed - this includes code, data, prefetches and hints coming from L2. This has numerous filters available. Note the non-standard filtering equation. This event will count requests that lookup the cache multiple times with multiple increments. One must ALWAYS set umask bit 0 and select a state or states to match. Otherwise, the event will count nothing. CHAFilter0[24:21,17] bits correspond to [FMESI] state.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 2, .umasks = skx_unc_c_llc_lookup, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_llc_lookup), }, { .name = "UNC_C_LLC_VICTIMS", .code = 0x37, .desc = "Counts the number of lines that were victimized on a fill. This can be filtered by the state that the line was in.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_llc_victims, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_llc_victims), }, { .name = "UNC_C_MISC", .code = 0x39, .desc = "Miscellaneous events in the CHA.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_misc, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_misc), }, { .name = "UNC_C_OSB", .code = 0x55, .desc = "Count of OSB snoop broadcasts. Counts by 1 per request causing OSB snoops to be broadcast. 
Does not count all the snoops generated by OSB.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_C_READ_NO_CREDITS", .code = 0x58, .desc = "Counts the number of times when there are no credits available for sending reads from the CHA into the iMC. In order to send reads into the memory controller, the HA must first acquire a credit for the iMC's AD Ingress queue.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_read_no_credits, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_read_no_credits), }, { .name = "UNC_C_REQUESTS", .code = 0x50, .desc = "Counts the total number of read requests made into the Home Agent. Reads include all read opcodes (including RFO). Writes include all writes (streaming, evictions, HitM, etc).", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_requests, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_requests), }, { .name = "UNC_C_RING_BOUNCES_HORZ", .code = 0xa1, .desc = "Number of cycles incoming messages from the Horizontal ring that were bounced, by ring type.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_ring_bounces_horz, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_ring_bounces_horz), }, { .name = "UNC_C_RING_BOUNCES_VERT", .code = 0xa0, .desc = "Number of cycles incoming messages from the Vertical ring that were bounced, by ring type.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_ring_bounces_vert, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_ring_bounces_vert), }, { .name = "UNC_C_RING_SINK_STARVED_HORZ", .code = 0xa3, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_ring_sink_starved_horz, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_ring_sink_starved_horz), }, { .name = "UNC_C_RING_SINK_STARVED_VERT", .code = 0xa2, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_ring_sink_starved_vert, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_ring_sink_starved_vert), }, { .name = 
"UNC_C_RING_SRC_THRTL", .code = 0xa4, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_C_RXC_INSERTS", .code = 0x13, .desc = "Counts number of allocations per cycle into the specified Ingress queue.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_inserts), }, { .name = "UNC_C_RXC_IPQ0_REJECT", .code = 0x22, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_ipq0_reject, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_ipq0_reject), }, { .name = "UNC_C_RXC_IPQ1_REJECT", .code = 0x23, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_ipq1_reject, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_ipq1_reject), }, { .name = "UNC_C_RXC_IRQ0_REJECT", .code = 0x18, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_irq0_reject, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_irq0_reject), }, { .name = "UNC_C_RXC_IRQ1_REJECT", .code = 0x19, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_irq1_reject, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_irq1_reject), }, { .name = "UNC_C_RXC_ISMQ0_REJECT", .code = 0x24, .desc = "Number of times a transaction flowing through the ISMQ had to retry. Transactions pass through the ISMQ as responses for requests that already exist in the Cbo. Some examples include: when data is returned or when snoop responses come back from the cores.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_ismq0_reject, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_ismq0_reject), }, { .name = "UNC_C_RXC_ISMQ0_RETRY", .code = 0x2c, .desc = "Number of times a transaction flowing through the ISMQ had to retry. Transactions pass through the ISMQ as responses for requests that already exist in the Cbo. 
Some examples include: when data is returned or when snoop responses come back from the cores.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_ismq0_retry, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_ismq0_retry), }, { .name = "UNC_C_RXC_ISMQ1_REJECT", .code = 0x25, .desc = "Number of times a transaction flowing through the ISMQ had to retry. Transactions pass through the ISMQ as responses for requests that already exist in the Cbo. Some examples include: when data is returned or when snoop responses come back from the cores.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_ismq1_reject, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_ismq1_reject), }, { .name = "UNC_C_RXC_ISMQ1_RETRY", .code = 0x2d, .desc = "Number of times a transaction flowing through the ISMQ had to retry. Transactions pass through the ISMQ as responses for requests that already exist in the Cbo. Some examples include: when data is returned or when snoop responses come back from the cores.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_ismq1_retry, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_ismq1_retry), }, { .name = "UNC_C_RXC_OCCUPANCY", .code = 0x11, .desc = "Counts number of entries in the specified Ingress queue in each cycle.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0x1, .ngrp = 1, .umasks = skx_unc_c_rxc_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_occupancy), }, { .name = "UNC_C_RXC_OTHER0_RETRY", .code = 0x2e, .desc = "Retry Queue Inserts of Transactions that were already in another Retry Q (sub-events encode the reason for the next reject)", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_other0_retry, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_other0_retry), }, { .name = "UNC_C_RXC_OTHER1_RETRY", .code = 0x2f, .desc = "Retry Queue Inserts of Transactions that were already in another Retry Q (sub-events encode the reason for the next reject)", .modmsk = 
SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_other1_retry, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_other1_retry), }, { .name = "UNC_C_RXC_PRQ0_REJECT", .code = 0x20, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_prq0_reject, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_prq0_reject), }, { .name = "UNC_C_RXC_PRQ1_REJECT", .code = 0x21, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_prq1_reject, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_prq1_reject), }, { .name = "UNC_C_RXC_REQ_Q0_RETRY", .code = 0x2a, .desc = "REQUESTQ includes: IRQ, PRQ, IPQ, RRQ, WBQ (everything except for ISMQ)", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_req_q0_retry, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_req_q0_retry), }, { .name = "UNC_C_RXC_REQ_Q1_RETRY", .code = 0x2b, .desc = "REQUESTQ includes: IRQ, PRQ, IPQ, RRQ, WBQ (everything except for ISMQ)", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_req_q1_retry, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_req_q1_retry), }, { .name = "UNC_C_RXC_RRQ0_REJECT", .code = 0x26, .desc = "Number of times a transaction flowing through the RRQ (Remote Response Queue) had to retry.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_rrq0_reject, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_rrq0_reject), }, { .name = "UNC_C_RXC_RRQ1_REJECT", .code = 0x27, .desc = "Number of times a transaction flowing through the RRQ (Remote Response Queue) had to retry.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_rrq1_reject, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_rrq1_reject), }, { .name = "UNC_C_RXC_WBQ0_REJECT", .code = 0x28, .desc = "Number of times a transaction flowing through the WBQ (Writeback Queue) had to retry.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_wbq0_reject, 
.numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_wbq0_reject), }, { .name = "UNC_C_RXC_WBQ1_REJECT", .code = 0x29, .desc = "Number of times a transaction flowing through the WBQ (Writeback Queue) had to retry.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxc_wbq1_reject, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxc_wbq1_reject), }, { .name = "UNC_C_RXR_BUSY_STARVED", .code = 0xb4, .desc = "Counts cycles under injection starvation mode. This starvation is triggered when the CMS Ingress cannot send a transaction onto the mesh for a long period of time. In this case, it is because a message from the other queue has higher priority.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxr_busy_starved, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxr_busy_starved), }, { .name = "UNC_C_RXR_BYPASS", .code = 0xb2, .desc = "Number of packets bypassing the CMS Ingress", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxr_bypass, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxr_bypass), }, { .name = "UNC_C_RXR_CRD_STARVED", .code = 0xb3, .desc = "Counts cycles under injection starvation mode. This starvation is triggered when the CMS Ingress cannot send a transaction onto the mesh for a long period of time. 
In this case, the Ingress is unable to forward to the Egress due to a lack of credit.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxr_crd_starved, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxr_crd_starved), }, { .name = "UNC_C_RXR_INSERTS", .code = 0xb1, .desc = "Number of allocations into the CMS Ingress. The Ingress is used to queue up requests received from the mesh.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxr_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxr_inserts), }, { .name = "UNC_C_RXR_OCCUPANCY", .code = 0xb0, .desc = "Occupancy event for the Ingress buffers in the CMS. The Ingress is used to queue up requests received from the mesh.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_rxr_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_rxr_occupancy), }, { .name = "UNC_C_SF_EVICTION", .code = 0x3d, .desc = "TBD", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_sf_eviction, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_sf_eviction), }, { .name = "UNC_C_SNOOPS_SENT", .code = 0x51, .desc = "Counts the number of snoops issued by the HA.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_snoops_sent, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_snoops_sent), }, { .name = "UNC_C_SNOOP_RESP", .code = 0x5c, .desc = "Counts the total number of RspI snoop responses received. Whenever snoops are issued, one or more snoop responses will be returned depending on the topology of the system. In systems larger than 2s, when multiple snoops are returned this will count all the snoops that are received. 
For example, if 3 snoops were issued and returned RspI, RspS, and RspSFwd; then each of these sub-events would increment by 1.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_snoop_resp, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_snoop_resp), }, { .name = "UNC_C_SNOOP_RESP_LOCAL", .code = 0x5d, .desc = "Number of snoop responses received for a Local request", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_snoop_resp_local, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_snoop_resp_local), }, { .name = "UNC_C_STALL_NO_TXR_HORZ_CRD_AD_AG0", .code = 0xd0, .desc = "Number of cycles the AD Agent 0 Egress Buffer is stalled waiting for a TGR credit to become available, per transgress.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_stall_no_txr_horz_crd_ad_ag0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_stall_no_txr_horz_crd_ad_ag0), }, { .name = "UNC_C_STALL_NO_TXR_HORZ_CRD_AD_AG1", .code = 0xd2, .desc = "Number of cycles the AD Agent 1 Egress Buffer is stalled waiting for a TGR credit to become available, per transgress.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_stall_no_txr_horz_crd_ad_ag1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_stall_no_txr_horz_crd_ad_ag1), }, { .name = "UNC_C_STALL_NO_TXR_HORZ_CRD_BL_AG0", .code = 0xd4, .desc = "Number of cycles the BL Agent 0 Egress Buffer is stalled waiting for a TGR credit to become available, per transgress.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_stall_no_txr_horz_crd_bl_ag0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_stall_no_txr_horz_crd_bl_ag0), }, { .name = "UNC_C_STALL_NO_TXR_HORZ_CRD_BL_AG1", .code = 0xd6, .desc = "Number of cycles the BL Agent 1 Egress Buffer is stalled waiting for a TGR credit to become available, per transgress.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_stall_no_txr_horz_crd_bl_ag1, .numasks= 
LIBPFM_ARRAY_SIZE(skx_unc_c_stall_no_txr_horz_crd_bl_ag1), }, { .name = "UNC_C_TOR_INSERTS", .code = 0x35, .desc = "Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent.", .modmsk = SKX_UNC_CHA_FILT1_ATTRS, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .ngrp = 6, .umasks = skx_unc_c_tor_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_tor_inserts), }, { .name = "UNC_C_TOR_OCCUPANCY", .code = 0x36, .desc = "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent.", .modmsk = SKX_UNC_CHA_FILT1_ATTRS, .cntmsk = 0x1, .flags = INTEL_X86_NO_AUTOENCODE, .ngrp = 6, .umasks = skx_unc_c_tor_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_tor_occupancy), }, { .name = "UNC_C_TXR_HORZ_ADS_USED", .code = 0x9d, .desc = "Number of packets using the Horizontal Anti-Deadlock Slot, broken down by ring type and CMS Agent.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_txr_horz_ads_used, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_txr_horz_ads_used), }, { .name = "UNC_C_TXR_HORZ_BYPASS", .code = 0x9f, .desc = "Number of packets bypassing the Horizontal Egress, broken down by ring type and CMS Agent.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_txr_horz_bypass, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_txr_horz_bypass), }, { .name = "UNC_C_TXR_HORZ_CYCLES_FULL", .code = 0x96, .desc = "Cycles the Transgress buffers in the Common Mesh Stop are Full. The egress is used to queue up requests destined for the Horizontal Ring on the Mesh.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_txr_horz_cycles_full, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_txr_horz_cycles_full), }, { .name = "UNC_C_TXR_HORZ_CYCLES_NE", .code = 0x97, .desc = "Cycles the Transgress buffers in the Common Mesh Stop are Not-Empty. 
The egress is used to queue up requests destined for the Horizontal Ring on the Mesh.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_txr_horz_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_txr_horz_cycles_ne), }, { .name = "UNC_C_TXR_HORZ_INSERTS", .code = 0x95, .desc = "Number of allocations into the Transgress buffers in the Common Mesh Stop. The egress is used to queue up requests destined for the Horizontal Ring on the Mesh.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_txr_horz_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_txr_horz_inserts), }, { .name = "UNC_C_TXR_HORZ_NACK", .code = 0x99, .desc = "Counts number of Egress packets NACKed on to the Horizontal Ring", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_txr_horz_nack, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_txr_horz_nack), }, { .name = "UNC_C_TXR_HORZ_OCCUPANCY", .code = 0x94, .desc = "Occupancy event for the Transgress buffers in the Common Mesh Stop. The egress is used to queue up requests destined for the Horizontal Ring on the Mesh.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_txr_horz_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_txr_horz_occupancy), }, { .name = "UNC_C_TXR_HORZ_STARVED", .code = 0x9b, .desc = "Counts injection starvation. 
This starvation is triggered when the CMS Transgress buffer cannot send a transaction onto the Horizontal ring for a long period of time.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_txr_horz_starved, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_txr_horz_starved), }, { .name = "UNC_C_TXR_VERT_ADS_USED", .code = 0x9c, .desc = "Number of packets using the Vertical Anti-Deadlock Slot, broken down by ring type and CMS Agent.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_txr_vert_ads_used, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_txr_vert_ads_used), }, { .name = "UNC_C_TXR_VERT_BYPASS", .code = 0x9e, .desc = "Number of packets bypassing the Vertical Egress, broken down by ring type and CMS Agent.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_txr_vert_bypass, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_txr_vert_bypass), }, { .name = "UNC_C_TXR_VERT_CYCLES_FULL", .code = 0x92, .desc = "Number of cycles the Common Mesh Stop Egress was Full. The Egress is used to queue up requests destined for the Vertical Ring on the Mesh.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_txr_vert_cycles_full, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_txr_vert_cycles_full), }, { .name = "UNC_C_TXR_VERT_CYCLES_NE", .code = 0x93, .desc = "Number of cycles the Common Mesh Stop Egress was Not Empty. The Egress is used to queue up requests destined for the Vertical Ring on the Mesh.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_txr_vert_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_txr_vert_cycles_ne), }, { .name = "UNC_C_TXR_VERT_INSERTS", .code = 0x91, .desc = "Number of allocations into the Common Mesh Stop Egress. 
The Egress is used to queue up requests destined for the Vertical Ring on the Mesh.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_txr_vert_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_txr_vert_inserts), }, { .name = "UNC_C_TXR_VERT_NACK", .code = 0x98, .desc = "Counts number of Egress packets NACKed on to the Vertical Ring", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_txr_vert_nack, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_txr_vert_nack), }, { .name = "UNC_C_TXR_VERT_OCCUPANCY", .code = 0x90, .desc = "Occupancy event for the Egress buffers in the Common Mesh Stop. The egress is used to queue up requests destined for the Vertical Ring on the Mesh.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_txr_vert_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_txr_vert_occupancy), }, { .name = "UNC_C_TXR_VERT_STARVED", .code = 0x9a, .desc = "Counts injection starvation. This starvation is triggered when the CMS Egress cannot send a transaction onto the Vertical ring for a long period of time.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_txr_vert_starved, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_txr_vert_starved), }, { .name = "UNC_C_VERT_RING_AD_IN_USE", .code = 0xa6, .desc = "Counts the number of cycles that the Vertical AD ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. 
In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_vert_ring_ad_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_vert_ring_ad_in_use), }, { .name = "UNC_C_VERT_RING_AK_IN_USE", .code = 0xa8, .desc = "Counts the number of cycles that the Vertical AK ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_vert_ring_ak_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_vert_ring_ak_in_use), }, { .name = "UNC_C_VERT_RING_BL_IN_USE", .code = 0xaa, .desc = "Counts the number of cycles that the Vertical BL ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. 
In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_vert_ring_bl_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_vert_ring_bl_in_use), }, { .name = "UNC_C_VERT_RING_IV_IN_USE", .code = 0xac, .desc = "Counts the number of cycles that the Vertical IV ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. There is only 1 IV ring. Therefore, if one wants to monitor the Even ring, they should select both UP_EVEN and DN_EVEN. To monitor the Odd ring, they should select both UP_ODD and DN_ODD.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_vert_ring_iv_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_vert_ring_iv_in_use), }, { .name = "UNC_C_WB_PUSH_MTOI", .code = 0x56, .desc = "Counts the number of times the CHA received WbPushMtoI", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_wb_push_mtoi, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_wb_push_mtoi), }, { .name = "UNC_C_WRITE_NO_CREDITS", .code = 0x5a, .desc = "Counts the number of times when there are no credits available for sending WRITEs from the CHA into the iMC. In order to send WRITEs into the memory controller, the HA must first acquire a credit for the iMCs BL Ingress queue.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_write_no_credits, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_write_no_credits), }, { .name = "UNC_C_XSNP_RESP", .code = 0x32, .desc = "Counts the number of core cross snoops. Cores are snooped if the transaction looks up the cache and determines that it is necessary based on the operation type. This event can be filtered based on who triggered the initial snoop(s): from Evictions, Core or External (i.e. 
from a remote node) Requests. And the event can be filtered based on the responses: RspX_Fwd/HitY where Y is the state prior to the snoop response and X is the state following.", .modmsk = SKX_UNC_CHA_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_c_xsnp_resp, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_c_xsnp_resp), }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_skx_unc_iio_events.h000066400000000000000000000611741502707512200253110ustar00rootroot00000000000000/* * Copyright (c) 2017 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: skx_unc_iio */ #define FC_MASK(g) \ { .uname = "FC_POSTED_REQ",\ .udesc = "Posted requests",\ .ucode = 0x100ULL << 36,\ .grpid = g, \ },\ { .uname = "FC_NON_POSTED_REQ",\ .udesc = "Non-Posted requests",\ .ucode = 0x200ULL << 36,\ .grpid = g, \ },\ { .uname = "FC_CMPL",\ .udesc = "Completion requests",\ .ucode = 0x400ULL << 36,\ .grpid = g, \ },\ { .uname = "FC_ANY", \ .udesc = "Any type of requests",\ .uequiv = "FC_POSTED_REQ:FC_NON_POSTED_REQ:FC_CMPL",\ .uflags =INTEL_X86_NCOMBO | INTEL_X86_DFL,\ .ucode = 0x700ULL << 36,\ .grpid = g, \ } #define CH_PORT_MASK(g) \ { .uname = "CH_P_PCIE_PORT0",\ .udesc = "PCIe Port 0",\ .ucode = 0x100,\ .grpid = g, \ },\ { .uname = "CH_P_PCIE_PORT1",\ .udesc = "PCIe Port 1",\ .ucode = 0x200,\ .grpid = g, \ },\ { .uname = "CH_P_PCIE_PORT2",\ .udesc = "PCIe Port 2",\ .ucode = 0x400,\ .grpid = g, \ },\ { .uname = "CH_P_PCIE_PORT3",\ .udesc = "PCIe Port 3",\ .ucode = 0x800,\ .grpid = g, \ },\ #define CH_P_MASK(g) \ CH_PORT_MASK(g),\ { .uname = "CH_P_INTEL_VTD",\ .udesc = "Intel VT-d",\ .ucode = 0x1000,\ .grpid = g, \ },\ { .uname = "CH_P_ANY", \ .udesc = "Any type of requests",\ .uequiv = "CH_P_PCIE_PORT0:CH_P_PCIE_PORT1:CH_P_PCIE_PORT2:CH_P_PCIE_PORT3:CH_P_INTEL_VTD",\ .uflags =INTEL_X86_NCOMBO | INTEL_X86_DFL,\ .ucode = 0x1f00,\ .grpid = g, \ } /* not yet used */ #define CH_C_MASK(g) \ { .uname = "CH_C_CBDMA",\ .udesc = "CBDMA",\ .ucode = 0x100,\ .grpid = g, \ },\ { .uname = "CH_C_DMI_VC0",\ .udesc = "DMI VC0",\ .ucode = 0x200,\ .grpid = g, \ },\ { .uname = "CH_C_DMI_VC1",\ .udesc = "DMI VC1",\ .ucode = 0x400,\ .grpid = g, \ },\ { .uname = "CH_C_DMI_VCN",\ .udesc = "DMI VCn",\ .ucode = 0x800,\ .grpid = g, \ },\ { .uname = "CH_C_INTEL_VTD_NO_ISOCH",\ .udesc = "Intel VT-d non-isochronous",\ .ucode = 0x1000,\ .grpid = g, \ },\ { .uname = "CH_C_INTEL_VTD_ISOCH",\ .udesc = "Intel VT-d isochronous",\ .ucode = 0x2000,\ .grpid = g, \ },\ { .uname = "CH_C_ANY", \ .udesc = "Any type of requests",\ .uequiv = 
"CH_C_CBDMA:CH_C_DMI_VC0:CH_C_DMI_VC1:CH_C_DMI_VCN:CH_C_INTEL_VTD_NO_ISOCH:CH_C_INTEL_VTD_ISOCH",\ .uflags =INTEL_X86_NCOMBO,\ .ucode = 0x3f00,\ .grpid = g, \ } static intel_x86_umask_t skx_unc_io_comp_buf_inserts[]={ { .uname = "PORT0", .ucode = 0x400ULL | 1ULL << 36, .udesc = "PCIe Completion Buffer Inserts -- Port 0", .grpid = 0, }, { .uname = "PORT1", .ucode = 0x400ULL | 2ULL << 36, .udesc = "PCIe Completion Buffer Inserts -- Port 1", .grpid = 0, }, { .uname = "PORT2", .ucode = 0x400ULL | 4ULL << 36, .udesc = "PCIe Completion Buffer Inserts -- Port 2", .grpid = 0, }, { .uname = "PORT3", .ucode = 0x400ULL | 8ULL << 36, .udesc = "PCIe Completion Buffer Inserts -- Port 3", .grpid = 0, }, { .uname = "ANY_PORT", .ucode = 0x400ULL | 0xfULL << 36, .udesc = "PCIe Completion Buffer Inserts -- Any port", .uequiv= "PORT0:PORT1:PORT2:PORT3", .uflags= INTEL_X86_DFL | INTEL_X86_NCOMBO, .grpid = 0, }, FC_MASK(1) }; static intel_x86_umask_t skx_unc_io_data_req_by_cpu[]={ { .uname = "CFG_READ_PART0", .ucode = 0x4000ULL | 0x1ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards PCICFG space", }, { .uname = "CFG_READ_PART1", .ucode = 0x4000ULL | 0x2ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards PCICFG space", }, { .uname = "CFG_READ_PART2", .ucode = 0x4000ULL | 0x4ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards PCICFG space", }, { .uname = "CFG_READ_PART3", .ucode = 0x4000ULL | 0x8ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards PCICFG space", }, { .uname = "CFG_READ_VTD0", .ucode = 0x4000ULL | 0x10ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards PCICFG space", }, { .uname = "CFG_READ_VTD1", .ucode = 0x4000ULL | 0x20ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards PCICFG space", }, { .uname = "CFG_WRITE_PART0", .ucode = 0x1000ULL | 0x1ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards PCICFG space", }, { 
.uname = "CFG_WRITE_PART1", .ucode = 0x1000ULL | 0x2ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards PCICFG space", }, { .uname = "CFG_WRITE_PART2", .ucode = 0x1000ULL | 0x4ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards PCICFG space", }, { .uname = "CFG_WRITE_PART3", .ucode = 0x1000ULL | 0x8ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards PCICFG space", }, { .uname = "CFG_WRITE_VTD0", .ucode = 0x1000ULL | 0x10ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards PCICFG space", }, { .uname = "CFG_WRITE_VTD1", .ucode = 0x1000ULL | 0x20ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards PCICFG space", }, { .uname = "IO_READ_PART0", .ucode = 0x8000ULL | 0x1ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards IO space", }, { .uname = "IO_READ_PART1", .ucode = 0x8000ULL | 0x2ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards IO space", }, { .uname = "IO_READ_PART2", .ucode = 0x8000ULL | 0x4ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards IO space", }, { .uname = "IO_READ_PART3", .ucode = 0x8000ULL | 0x8ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards IO space", }, { .uname = "IO_READ_VTD0", .ucode = 0x8000ULL | 0x10ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards IO space", }, { .uname = "IO_READ_VTD1", .ucode = 0x8000ULL | 0x20ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards IO space", }, { .uname = "IO_WRITE_PART0", .ucode = 0x2000ULL | 0x1ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards IO space", }, { .uname = "IO_WRITE_PART1", .ucode = 0x2000ULL | 0x2ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards IO space", }, { .uname = "IO_WRITE_PART2", .ucode = 0x2000ULL | 0x4ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards IO space", }, { .uname = 
"IO_WRITE_PART3", .ucode = 0x2000ULL | 0x8ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards IO space", }, { .uname = "IO_WRITE_VTD0", .ucode = 0x2000ULL | 0x10ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards IO space", }, { .uname = "IO_WRITE_VTD1", .ucode = 0x2000ULL | 0x20ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards IO space", }, { .uname = "MEM_READ_PART0", .ucode = 0x400ULL | 0x1ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards MMIO space", }, { .uname = "MEM_READ_PART1", .ucode = 0x400ULL | 0x2ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards MMIO space", }, { .uname = "MEM_READ_PART2", .ucode = 0x400ULL | 0x4ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards MMIO space", }, { .uname = "MEM_READ_PART3", .ucode = 0x400ULL | 0x8ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards MMIO space", }, { .uname = "MEM_READ_VTD0", .ucode = 0x400ULL | 0x10ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards MMIO space", }, { .uname = "MEM_READ_VTD1", .ucode = 0x400ULL | 0x20ULL << 36, .udesc = "Data requested by the CPU -- Core reading from Cards MMIO space", }, { .uname = "MEM_READ_ANY", .ucode = 0x400ULL | 0x3fULL << 36, .udesc = "Data requested by the CPU -- Core reading from any source", .uflags= INTEL_X86_DFL, .uequiv= "MEM_READ_PART0:MEM_READ_PART1:MEM_READ_PART2:MEM_READ_PART3:MEM_READ_VTD0:MEM_READ_VTD1", }, { .uname = "MEM_WRITE_PART0", .ucode = 0x100ULL | 1ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards MMIO space", }, { .uname = "MEM_WRITE_PART1", .ucode = 0x100ULL | 2ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards MMIO space", }, { .uname = "MEM_WRITE_PART2", .ucode = 0x100ULL | 4ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards MMIO space", }, { .uname = "MEM_WRITE_PART3", .ucode = 0x100ULL | 
8ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards MMIO space", }, { .uname = "MEM_WRITE_VTD0", .ucode = 0x100ULL | 0x10ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards MMIO space", }, { .uname = "MEM_WRITE_VTD1", .ucode = 0x100ULL | 0x20ULL << 36, .udesc = "Data requested by the CPU -- Core writing to Cards MMIO space", }, { .uname = "MEM_WRITE_ANY", .ucode = 0x100ULL | 0x3fULL << 36, .udesc = "Data requested by the CPU -- Core writing", .uequiv= "MEM_WRITE_PART0:MEM_WRITE_PART1:MEM_WRITE_PART2:MEM_WRITE_PART3:MEM_WRITE_VTD0:MEM_WRITE_VTD1", }, { .uname = "PEER_READ_PART0", .ucode = 0x800ULL | 0x1ULL << 36, .udesc = "Another card (different IIO stack) reading from this card.", }, { .uname = "PEER_READ_PART1", .ucode = 0x800ULL | 0x2ULL << 36, .udesc = "Another card (different IIO stack) reading from this card.", }, { .uname = "PEER_READ_PART2", .ucode = 0x800ULL | 0x4ULL << 36, .udesc = "Another card (different IIO stack) reading from this card.", }, { .uname = "PEER_READ_PART3", .ucode = 0x800ULL | 0x8ULL << 36, .udesc = "Another card (different IIO stack) reading from this card.", }, { .uname = "PEER_READ_VTD0", .ucode = 0x800ULL | 0x10ULL << 36, .udesc = "Another card (different IIO stack) reading from this card.", }, { .uname = "PEER_READ_VTD1", .ucode = 0x800ULL | 0x20ULL << 36, .udesc = "Another card (different IIO stack) reading from this card.", }, { .uname = "PEER_READ_ANY", .ucode = 0x800ULL | 0x3fULL << 36, .udesc = "Another card (different IIO stack) reading from this card.", .uequiv= "PEER_READ_PART0:PEER_READ_PART1:PEER_READ_PART2:PEER_READ_PART3:PEER_READ_VTD0:PEER_READ_VTD1", }, { .uname = "PEER_WRITE_PART0", .ucode = 0x200ULL | 0x1ULL << 36, .udesc = "Another card (different IIO stack) writing to this card.", }, { .uname = "PEER_WRITE_PART1", .ucode = 0x200ULL | 0x2ULL << 36, .udesc = "Another card (different IIO stack) writing to this card.", }, { .uname = "PEER_WRITE_PART2", .ucode = 0x200ULL | 
0x4ULL << 36, .udesc = "Another card (different IIO stack) writing to this card.", }, { .uname = "PEER_WRITE_PART3", .ucode = 0x200ULL | 0x8ULL << 36, .udesc = "Another card (different IIO stack) writing to this card.", }, { .uname = "PEER_WRITE_VTD0", .ucode = 0x200ULL | 0x10ULL << 36, .udesc = "Another card (different IIO stack) writing to this card.", }, { .uname = "PEER_WRITE_VTD1", .ucode = 0x200ULL | 0x20ULL << 36, .udesc = "Another card (different IIO stack) writing to this card.", }, { .uname = "PEER_WRITE_ANY", .ucode = 0x200ULL | 0x3fULL << 36, .udesc = "Another card (different IIO stack) writing to this card.", .uequiv= "PEER_WRITE_PART0:PEER_WRITE_PART1:PEER_WRITE_PART2:PEER_WRITE_PART3:PEER_WRITE_VTD0:PEER_WRITE_VTD1", }, FC_MASK(1) }; static intel_x86_umask_t skx_unc_io_data_req_of_cpu[]={ { .uname = "ATOMIC_PART0", .ucode = 0x1000ULL | 0x1ULL << 36, .udesc = "Data requested of the CPU -- Atomic requests targeting DRAM", }, { .uname = "ATOMIC_PART1", .ucode = 0x1000ULL | 0x2ULL << 36, .udesc = "Data requested of the CPU -- Atomic requests targeting DRAM", }, { .uname = "ATOMIC_PART2", .ucode = 0x1000ULL | 0x4ULL << 36, .udesc = "Data requested of the CPU -- Atomic requests targeting DRAM", }, { .uname = "ATOMIC_PART3", .ucode = 0x1000ULL | 0x8ULL << 36, .udesc = "Data requested of the CPU -- Atomic requests targeting DRAM", }, { .uname = "ATOMIC_VTD0", .ucode = 0x1000ULL | 0x10ULL << 36, .udesc = "Data requested of the CPU -- Atomic requests targeting DRAM", }, { .uname = "ATOMIC_VTD1", .ucode = 0x1000ULL | 0x20ULL << 36, .udesc = "Data requested of the CPU -- Atomic requests targeting DRAM", }, { .uname = "ATOMICCMP_PART0", .ucode = 0x2000ULL | 0x1ULL << 36, .udesc = "Data requested of the CPU -- Completion of atomic requests targeting DRAM", }, { .uname = "ATOMICCMP_PART1", .ucode = 0x2000ULL | 0x2ULL << 36, .udesc = "Data requested of the CPU -- Completion of atomic requests targeting DRAM", }, { .uname = "ATOMICCMP_PART2", .ucode = 0x2000ULL | 
0x4ULL << 36, .udesc = "Data requested of the CPU -- Completion of atomic requests targeting DRAM", }, { .uname = "ATOMICCMP_PART3", .ucode = 0x2000ULL | 0x8ULL << 36, .udesc = "Data requested of the CPU -- Completion of atomic requests targeting DRAM", }, { .uname = "MEM_READ_PART0", .ucode = 0x400ULL| 0x1ULL << 36, .udesc = "Data requested of the CPU -- Card reading from DRAM", }, { .uname = "MEM_READ_PART1", .ucode = 0x400ULL| 0x2ULL << 36, .udesc = "Data requested of the CPU -- Card reading from DRAM", }, { .uname = "MEM_READ_PART2", .ucode = 0x400ULL| 0x4ULL << 36, .udesc = "Data requested of the CPU -- Card reading from DRAM", }, { .uname = "MEM_READ_PART3", .ucode = 0x400ULL| 0x8ULL << 36, .udesc = "Data requested of the CPU -- Card reading from DRAM", }, { .uname = "MEM_READ_VTD0", .ucode = 0x400ULL| 0x10ULL << 36, .udesc = "Data requested of the CPU -- Card reading from DRAM", }, { .uname = "MEM_READ_VTD1", .ucode = 0x400ULL| 0x20ULL << 36, .udesc = "Data requested of the CPU -- Card reading from DRAM", }, { .uname = "MEM_READ_ANY", .ucode = 0x400ULL | 0x3fULL << 36, .udesc = "Data requested of the CPU -- Card reading from any DRAM source", .uflags= INTEL_X86_DFL, .uequiv= "MEM_READ_PART0:MEM_READ_PART1:MEM_READ_PART2:MEM_READ_PART3:MEM_READ_VTD0:MEM_READ_VTD1", }, { .uname = "MEM_WRITE_PART0", .ucode = 0x100ULL | 0x1ULL << 36, .udesc = "Data requested of the CPU -- Card writing to DRAM", }, { .uname = "MEM_WRITE_PART1", .ucode = 0x100ULL | 0x2ULL << 36, .udesc = "Data requested of the CPU -- Card writing to DRAM", }, { .uname = "MEM_WRITE_PART2", .ucode = 0x100ULL | 0x4ULL << 36, .udesc = "Data requested of the CPU -- Card writing to DRAM", }, { .uname = "MEM_WRITE_PART3", .ucode = 0x100ULL | 0x8ULL << 36, .udesc = "Data requested of the CPU -- Card writing to DRAM", }, { .uname = "MEM_WRITE_VTD0", .ucode = 0x100ULL | 0x10ULL << 36, .udesc = "Data requested of the CPU -- Card writing to DRAM", }, { .uname = "MEM_WRITE_VTD1", .ucode = 0x100ULL | 0x20ULL << 
36, .udesc = "Data requested of the CPU -- Card writing to DRAM", }, { .uname = "MSG_PART0", .ucode = 0x4000ULL | 0x1ULL << 36, .udesc = "Data requested of the CPU -- Messages", }, { .uname = "MSG_PART1", .ucode = 0x4000ULL | 0x2ULL << 36, .udesc = "Data requested of the CPU -- Messages", }, { .uname = "MSG_PART2", .ucode = 0x4000ULL | 0x4ULL << 36, .udesc = "Data requested of the CPU -- Messages", }, { .uname = "MSG_PART3", .ucode = 0x4000ULL | 0x8ULL << 36, .udesc = "Data requested of the CPU -- Messages", }, { .uname = "MSG_VTD0", .ucode = 0x4000ULL | 0x10ULL << 36, .udesc = "Data requested of the CPU -- Messages", }, { .uname = "MSG_VTD1", .ucode = 0x4000ULL | 0x20ULL << 36, .udesc = "Data requested of the CPU -- Messages", }, { .uname = "PEER_READ_PART0", .ucode = 0x800ULL | 0x1ULL << 36, .udesc = "Data requested of the CPU -- Card reading from another Card (same or different stack)", }, { .uname = "PEER_READ_PART1", .ucode = 0x800ULL | 0x2ULL << 36, .udesc = "Data requested of the CPU -- Card reading from another Card (same or different stack)", }, { .uname = "PEER_READ_PART2", .ucode = 0x800ULL | 0x4ULL << 36, .udesc = "Data requested of the CPU -- Card reading from another Card (same or different stack)", }, { .uname = "PEER_READ_PART3", .ucode = 0x800ULL | 0x8ULL << 36, .udesc = "Data requested of the CPU -- Card reading from another Card (same or different stack)", }, { .uname = "PEER_READ_VTD0", .ucode = 0x800ULL | 0x10ULL << 36, .udesc = "Data requested of the CPU -- Card reading from another Card (same or different stack)", }, { .uname = "PEER_READ_VTD1", .ucode = 0x800ULL | 0x20ULL << 36, .udesc = "Data requested of the CPU -- Card reading from another Card (same or different stack)", }, { .uname = "PEER_WRITE_PART0", .ucode = 0x200ULL | 0x1ULL << 36, .udesc = "Data requested of the CPU -- Card writing to another Card (same or different stack)", }, { .uname = "PEER_WRITE_PART1", .ucode = 0x200ULL | 0x2ULL << 36, .udesc = "Data requested of the CPU -- 
Card writing to another Card (same or different stack)", }, { .uname = "PEER_WRITE_PART2", .ucode = 0x200ULL | 0x4ULL << 36, .udesc = "Data requested of the CPU -- Card writing to another Card (same or different stack)", }, { .uname = "PEER_WRITE_PART3", .ucode = 0x200ULL | 0x8ULL << 36, .udesc = "Data requested of the CPU -- Card writing to another Card (same or different stack)", }, { .uname = "PEER_WRITE_VTD0", .ucode = 0x200ULL | 0x10ULL << 36, .udesc = "Data requested of the CPU -- Card writing to another Card (same or different stack)", }, { .uname = "PEER_WRITE_VTD1", .ucode = 0x200ULL | 0x20ULL << 36, .udesc = "Data requested of the CPU -- Card writing to another Card (same or different stack)", }, FC_MASK(1) }; static intel_x86_umask_t skx_unc_io_mask_match_and[]={ { .uname = "BUS0", .ucode = 0x100, .udesc = "AND Mask/match for debug bus -- Non-PCIE bus", }, { .uname = "BUS0_BUS1", .ucode = 0x800, .udesc = "AND Mask/match for debug bus -- Non-PCIE bus and PCIE bus", }, { .uname = "BUS0_NOT_BUS1", .ucode = 0x400, .udesc = "AND Mask/match for debug bus -- Non-PCIE bus and !(PCIE bus)", }, { .uname = "BUS1", .ucode = 0x200, .udesc = "AND Mask/match for debug bus -- PCIE bus", }, { .uname = "NOT_BUS0_BUS1", .ucode = 0x1000, .udesc = "AND Mask/match for debug bus -- !(Non-PCIE bus) and PCIE bus", }, { .uname = "NOT_BUS0_NOT_BUS1", .ucode = 0x2000, .udesc = "AND Mask/match for debug bus -- !(Non-PCIE bus) and !(PCIE bus)", }, }; static intel_x86_umask_t skx_unc_io_mask_match_or[]={ { .uname = "BUS0", .ucode = 0x100, .udesc = "OR Mask/match for debug bus -- Non-PCIE bus", }, { .uname = "BUS0_BUS1", .ucode = 0x800, .udesc = "OR Mask/match for debug bus -- Non-PCIE bus and PCIE bus", }, { .uname = "BUS0_NOT_BUS1", .ucode = 0x400, .udesc = "OR Mask/match for debug bus -- Non-PCIE bus and !(PCIE bus)", }, { .uname = "BUS1", .ucode = 0x200, .udesc = "OR Mask/match for debug bus -- PCIE bus", }, { .uname = "NOT_BUS0_BUS1", .ucode = 0x1000, .udesc = "OR Mask/match for debug bus -- !(Non-PCIE bus) and 
PCIE bus", }, { .uname = "NOT_BUS0_NOT_BUS1", .ucode = 0x2000, .udesc = "OR Mask/match for debug bus -- !(Non-PCIE bus) and !(PCIE bus)", }, }; static intel_x86_umask_t skx_unc_io_vtd_access[]={ { .uname = "CTXT_MISS", .ucode = 0x200, .udesc = "VTd Access -- context cache miss", }, { .uname = "L1_MISS", .ucode = 0x400, .udesc = "VTd Access -- L1 miss", }, { .uname = "L2_MISS", .ucode = 0x800, .udesc = "VTd Access -- L2 miss", }, { .uname = "L3_MISS", .ucode = 0x1000, .udesc = "VTd Access -- L3 miss", }, { .uname = "L4_PAGE_HIT", .ucode = 0x100, .udesc = "VTd Access -- Vtd hit", }, { .uname = "TLB1_MISS", .ucode = 0x8000, .udesc = "VTd Access -- TLB miss", }, { .uname = "TLB_FULL", .ucode = 0x4000, .udesc = "VTd Access -- TLB is full", }, { .uname = "TLB_MISS", .ucode = 0x2000, .udesc = "VTd Access -- TLB miss", }, }; static intel_x86_entry_t intel_skx_unc_iio_pe[]={ { .name = "UNC_IO_CLOCKTICKS", .code = 0x1, .desc = "IIO clockticks", .modmsk = SKX_UNC_IIO_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_IO_COMP_BUF_INSERTS", .code = 0xc2, .desc = "TBD", .modmsk = SKX_UNC_IIO_ATTRS, .cntmsk = 0xf, .ngrp = 2, .umasks = skx_unc_io_comp_buf_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_io_comp_buf_inserts), }, { .name = "UNC_IO_COMP_BUF_OCCUPANCY", .code = 0xd5, .desc = "TBD", .modmsk = SKX_UNC_IIO_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_IO_DATA_REQ_BY_CPU", .code = 0xc0, .desc = "Number of double word (4 bytes) requests initiated by the main die to the attached device.", .modmsk = SKX_UNC_IIO_ATTRS, .cntmsk = 0xc, .ngrp = 2, .umasks = skx_unc_io_data_req_by_cpu, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_io_data_req_by_cpu), }, { .name = "UNC_IO_DATA_REQ_OF_CPU", .code = 0x83, .desc = "Number of double word (4 bytes) requests the attached device made of the main die.", .modmsk = SKX_UNC_IIO_ATTRS, .cntmsk = 0x3, .ngrp = 2, .umasks = skx_unc_io_data_req_of_cpu, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_io_data_req_of_cpu), }, { .name = "UNC_IO_LINK_NUM_CORR_ERR", .code = 0xf, .desc = "TBD", 
.modmsk = SKX_UNC_IIO_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_IO_LINK_NUM_RETRIES", .code = 0xe, .desc = "TBD", .modmsk = SKX_UNC_IIO_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_IO_MASK_MATCH", .code = 0x21, .desc = "TBD", .modmsk = SKX_UNC_IIO_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_IO_MASK_MATCH_AND", .code = 0x2, .desc = "Asserted if all bits specified by mask match", .modmsk = SKX_UNC_IIO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_io_mask_match_and, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_io_mask_match_and), }, { .name = "UNC_IO_MASK_MATCH_OR", .code = 0x3, .desc = "Asserted if any bits specified by mask match", .modmsk = SKX_UNC_IIO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_io_mask_match_or, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_io_mask_match_or), }, { .name = "UNC_IO_NOTHING", .code = 0x0, .desc = "TBD", .modmsk = SKX_UNC_IIO_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_IO_SYMBOL_TIMES", .code = 0x82, .desc = "Gen1 - increment once every 4nS, Gen2 - increment once every 2nS, Gen3 - increment once every 1nS", .modmsk = SKX_UNC_IIO_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_IO_TXN_REQ_BY_CPU", .code = 0xc1, .desc = "Also known as Outbound. Number of requests, to the attached device, initiated by the main die.", .modmsk = SKX_UNC_IIO_ATTRS, .cntmsk = 0xf, .ngrp = 2, .umasks = skx_unc_io_data_req_by_cpu, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_io_data_req_by_cpu), }, { .name = "UNC_IO_TXN_REQ_OF_CPU", .code = 0x84, .desc = "Also known as Inbound. 
Number of 64 byte cache line requests initiated by the attached device.", .modmsk = SKX_UNC_IIO_ATTRS, .cntmsk = 0xf, .ngrp = 2, .umasks = skx_unc_io_data_req_of_cpu, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_io_data_req_of_cpu), }, { .name = "UNC_IO_VTD_ACCESS", .code = 0x41, .desc = "TBD", .modmsk = SKX_UNC_IIO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_io_vtd_access, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_io_vtd_access), }, { .name = "UNC_IO_VTD_OCCUPANCY", .code = 0x40, .desc = "TBD", .modmsk = SKX_UNC_IIO_ATTRS, .cntmsk = 0xf, }, }; /* file: papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_skx_unc_imc_events.h */ /* * Copyright (c) 2017 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux.
* * PMU: skx_unc_imc */ static intel_x86_umask_t skx_unc_m_act_count[]={ { .uname = "BYP", .ucode = 0x800, .udesc = "DRAM Activate Count -- Activate due to Bypass", }, { .uname = "RD", .ucode = 0x100, .udesc = "DRAM Activate Count -- Activate due to Read", }, { .uname = "WR", .ucode = 0x200, .udesc = "DRAM Activate Count -- Activate due to Write", }, }; static intel_x86_umask_t skx_unc_m_byp_cmds[]={ { .uname = "ACT", .ucode = 0x100, .udesc = "ACT command issued by 2 cycle bypass", }, { .uname = "CAS", .ucode = 0x200, .udesc = "CAS command issued by 2 cycle bypass", }, { .uname = "PRE", .ucode = 0x400, .udesc = "PRE command issued by 2 cycle bypass", }, }; static intel_x86_umask_t skx_unc_m_cas_count[]={ { .uname = "ALL", .ucode = 0xf00, .udesc = "DRAM CAS (Column Address Strobe) Commands. -- All CASes issued.", }, { .uname = "RD", .ucode = 0x300, .udesc = "DRAM CAS (Column Address Strobe) Commands. -- All DRAM Reads (includes underfills)", }, { .uname = "RD_ISOCH", .ucode = 0x4000, .udesc = "DRAM CAS (Column Address Strobe) Commands. -- Read CAS issued in Read ISOCH Mode", }, { .uname = "RD_REG", .ucode = 0x100, .udesc = "DRAM CAS (Column Address Strobe) Commands. -- All read CAS (w/ and w/out auto-pre)", }, { .uname = "RD_RMM", .ucode = 0x2000, .udesc = "DRAM CAS (Column Address Strobe) Commands. -- Read CAS issued in RMM", }, { .uname = "RD_UNDERFILL", .ucode = 0x200, .udesc = "DRAM CAS (Column Address Strobe) Commands. -- Underfill Read Issued", }, { .uname = "RD_WMM", .ucode = 0x1000, .udesc = "DRAM CAS (Column Address Strobe) Commands. -- Read CAS issued in WMM", }, { .uname = "WR", .ucode = 0xc00, .udesc = "DRAM CAS (Column Address Strobe) Commands. -- All DRAM WR_CAS (both Modes)", }, { .uname = "WR_ISOCH", .ucode = 0x8000, .udesc = "DRAM CAS (Column Address Strobe) Commands. -- Read CAS issued in Write ISOCH Mode", }, { .uname = "WR_RMM", .ucode = 0x800, .udesc = "DRAM CAS (Column Address Strobe) Commands. 
-- DRAM WR_CAS (w/ and w/out auto-pre) in Read Major Mode", }, { .uname = "WR_WMM", .ucode = 0x400, .udesc = "DRAM CAS (Column Address Strobe) Commands. -- DRAM WR_CAS (w/ and w/out auto-pre) in Write Major Mode", }, }; static intel_x86_umask_t skx_unc_m_dram_refresh[]={ { .uname = "HIGH", .ucode = 0x400, .udesc = "Number of DRAM Refreshes Issued -- ", }, { .uname = "PANIC", .ucode = 0x200, .udesc = "Number of DRAM Refreshes Issued -- ", }, }; static intel_x86_umask_t skx_unc_m_major_modes[]={ { .uname = "ISOCH", .ucode = 0x800, .udesc = "Cycles in a Major Mode -- Isoch Major Mode", }, { .uname = "PARTIAL", .ucode = 0x400, .udesc = "Cycles in a Major Mode -- Partial Major Mode", }, { .uname = "READ", .ucode = 0x100, .udesc = "Cycles in a Major Mode -- Read Major Mode", }, { .uname = "WRITE", .ucode = 0x200, .udesc = "Cycles in a Major Mode -- Write Major Mode", }, }; static intel_x86_umask_t skx_unc_m_power_cke_cycles[]={ { .uname = "RANK0", .ucode = 0x100, .udesc = "Rank0 -- DIMM ID", }, { .uname = "RANK1", .ucode = 0x200, .udesc = "Rank1 -- DIMM ID", }, { .uname = "RANK2", .ucode = 0x400, .udesc = "Rank2 -- DIMM ID", }, { .uname = "RANK3", .ucode = 0x800, .udesc = "Rank3 -- DIMM ID", }, { .uname = "RANK4", .ucode = 0x1000, .udesc = "Rank4 -- DIMM ID", }, { .uname = "RANK5", .ucode = 0x2000, .udesc = "Rank5 -- DIMM ID", }, { .uname = "RANK6", .ucode = 0x4000, .udesc = "Rank6 -- DIMM ID", }, { .uname = "RANK7", .ucode = 0x8000, .udesc = "Rank7 -- DIMM ID", }, }; static intel_x86_umask_t skx_unc_m_preemption[]={ { .uname = "RD_PREEMPT_RD", .ucode = 0x100, .udesc = "Read Preemption Count -- Read over Read Preemption", }, { .uname = "RD_PREEMPT_WR", .ucode = 0x200, .udesc = "Read Preemption Count -- Read over Write Preemption", }, }; static intel_x86_umask_t skx_unc_m_pre_count[]={ { .uname = "BYP", .ucode = 0x1000, .udesc = "DRAM Precharge commands. -- Precharge due to bypass", }, { .uname = "PAGE_CLOSE", .ucode = 0x200, .udesc = "DRAM Precharge commands. 
-- Precharge due to timer expiration", }, { .uname = "PAGE_MISS", .ucode = 0x100, .udesc = "DRAM Precharge commands. -- Precharges due to page miss", }, { .uname = "RD", .ucode = 0x400, .udesc = "DRAM Precharge commands. -- Precharge due to read", }, { .uname = "WR", .ucode = 0x800, .udesc = "DRAM Precharge commands. -- Precharge due to write", }, }; static intel_x86_umask_t skx_unc_m_rd_cas_prio[]={ { .uname = "HIGH", .ucode = 0x400, .udesc = " -- Read CAS issued with HIGH priority", }, { .uname = "LOW", .ucode = 0x100, .udesc = " -- Read CAS issued with LOW priority", }, { .uname = "MED", .ucode = 0x200, .udesc = " -- Read CAS issued with MEDIUM priority", }, { .uname = "PANIC", .ucode = 0x800, .udesc = " -- Read CAS issued with PANIC NON ISOCH priority (starved)", }, }; static intel_x86_umask_t skx_unc_m_rd_cas_rank0[]={ { .uname = "ALLBANKS", .ucode = 0x1000, .udesc = "Access to all banks", }, { .uname = "BANK0", .ucode = 0x0, .udesc = "Access to Bank 0", }, { .uname = "BANK1", .ucode = 0x100, .udesc = "Access to Bank 1", }, { .uname = "BANK2", .ucode = 0x200, .udesc = "Access to Bank 2", }, { .uname = "BANK3", .ucode = 0x300, .udesc = "Access to Bank 3", }, { .uname = "BANK4", .ucode = 0x400, .udesc = "Access to Bank 4", }, { .uname = "BANK5", .ucode = 0x500, .udesc = "Access to Bank 5", }, { .uname = "BANK6", .ucode = 0x600, .udesc = "Access to Bank 6", }, { .uname = "BANK7", .ucode = 0x700, .udesc = "Access to Bank 7", }, { .uname = "BANK8", .ucode = 0x800, .udesc = "Access to Bank 8", }, { .uname = "BANK9", .ucode = 0x900, .udesc = "Access to Bank 9", }, { .uname = "BANK10", .ucode = 0xa00, .udesc = "Access to Bank 10", }, { .uname = "BANK11", .ucode = 0xb00, .udesc = "Access to Bank 11", }, { .uname = "BANK12", .ucode = 0xc00, .udesc = "Access to Bank 12", }, { .uname = "BANK13", .ucode = 0xd00, .udesc = "Access to Bank 13", }, { .uname = "BANK14", .ucode = 0xe00, .udesc = "Access to Bank 14", }, { .uname = "BANK15", .ucode = 0xf00, .udesc = "Access to 
Bank 15", }, { .uname = "BANKG0", .ucode = 0x1100, .udesc = "Access to Bank Group 0 (Banks 0-3)", }, { .uname = "BANKG1", .ucode = 0x1200, .udesc = "Access to Bank Group 1 (Banks 4-7)", }, { .uname = "BANKG2", .ucode = 0x1300, .udesc = "Access to Bank Group 2 (Banks 8-11)", }, { .uname = "BANKG3", .ucode = 0x1400, .udesc = "Access to Bank Group 3 (Banks 12-15)", }, }; static intel_x86_umask_t skx_unc_m_wmm_to_rmm[]={ { .uname = "LOW_THRESH", .ucode = 0x100, .udesc = "Transition from WMM to RMM because of low threshold", }, { .uname = "STARVE", .ucode = 0x200, .udesc = "Transition from WMM to RMM because of starve counter", }, { .uname = "VMSE_RETRY", .ucode = 0x400, .udesc = "Transition from WMM to RMM because of VMSE retry", }, }; static intel_x86_entry_t intel_skx_unc_m_pe[]={ { .name = "UNC_M_ACT_COUNT", .code = 0x1, .desc = "Counts the number of DRAM Activate commands sent on this channel. Activate commands are issued to open up a page on the DRAM devices so that it can be read or written to with a CAS. One can calculate the number of Page Misses by subtracting the number of Page Miss precharges from the number of Activates.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_act_count, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_act_count), }, { .name = "UNC_M_BYP_CMDS", .code = 0xa1, .desc = "TBD", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_byp_cmds, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_byp_cmds), }, { .name = "UNC_M_CAS_COUNT", .code = 0x4, .desc = "TBD", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_cas_count, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_cas_count), }, { .name = "UNC_M_DCLOCKTICKS", .desc = "DRAM Clock ticks, fixed counter. Counts at half the DDR speed.
Speed never changes", .modmsk = 0x0, .cntmsk = 0x100000000ull, .code = 0xff, /* perf pseudo encoding for fixed counter */ }, { .name = "UNC_M_CLOCKTICKS", .code = 0x0, .desc = "DRAM Clock ticks, generic counters", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_DRAM_PRE_ALL", .code = 0x6, .desc = "Counts the number of times that the precharge all command was sent.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_DRAM_REFRESH", .code = 0x5, .desc = "Counts the number of refreshes issued.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_dram_refresh, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_dram_refresh), }, { .name = "UNC_M_ECC_CORRECTABLE_ERRORS", .code = 0x9, .desc = "Counts the number of ECC errors detected and corrected by the iMC on this channel. This counter is only useful with ECC DRAM devices. This count will increment one time for each correction regardless of the number of bits corrected. The iMC can correct up to 4 bit errors in independent channel mode and 8 bit errors in lockstep mode.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_MAJOR_MODES", .code = 0x7, .desc = "Counts the total number of cycles spent in a major mode (selected by a filter) on the given channel. Major modes are channel-wide, and not a per-rank (or dimm or bank) mode.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_major_modes, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_major_modes), }, { .name = "UNC_M_POWER_CHANNEL_DLLOFF", .code = 0x84, .desc = "Number of cycles when all the ranks in the channel are in CKE Slow (DLLOFF) mode.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_POWER_CHANNEL_PPD", .code = 0x85, .desc = "Number of cycles when all the ranks in the channel are in PPD mode. If IBT=off is enabled, then this can be used to count those cycles.
If it is not enabled, then this can count the number of cycles when that could have been taken advantage of.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_POWER_CKE_CYCLES", .code = 0x83, .desc = "Number of cycles spent in CKE ON mode. The filter allows you to select a rank to monitor. If multiple ranks are in CKE ON mode at one time, the counter will ONLY increment by one rather than doing accumulation. Multiple counters will need to be used to track multiple ranks simultaneously. There is no distinction between the different CKE modes (APD, PPDS, PPDF). This can be determined based on the system programming. These events should commonly be used with Invert to get the number of cycles in power saving mode. Edge Detect is also useful here. Make sure that you do NOT use Invert with Edge Detect (this just confuses the system and is not necessary).", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_power_cke_cycles, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_power_cke_cycles), }, { .name = "UNC_M_POWER_CRITICAL_THROTTLE_CYCLES", .code = 0x86, .desc = "Counts the number of cycles when the iMC is in critical thermal throttling. When this happens, all traffic is blocked. This should be rare unless something bad is going on in the platform. There is no filtering by rank for this event.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_POWER_PCU_THROTTLING", .code = 0x42, .desc = "TBD", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_POWER_SELF_REFRESH", .code = 0x43, .desc = "Counts the number of cycles when the iMC is in self-refresh and the iMC still has a clock. This happens in some package C-states. For example, the PCU may ask the iMC to enter self-refresh even though some of the cores are still processing. One use of this is for Monroe technology. 
Self-refresh is required during package C3 and C6, but there is no clock in the iMC at this time, so it is not possible to count these cases.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_POWER_THROTTLE_CYCLES", .code = 0x41, .desc = "Counts the number of cycles while the iMC is being throttled by either thermal constraints or by the PCU throttling. It is not possible to distinguish between the two. This can be filtered by rank. If multiple ranks are selected and are being throttled at the same time, the counter will only increment by 1.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_power_cke_cycles, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_power_cke_cycles), }, { .name = "UNC_M_PREEMPTION", .code = 0x8, .desc = "Counts the number of times a read in the iMC preempts another read or write. Generally reads to an open page are issued ahead of requests to closed pages. This improves the page hit rate of the system. However, high priority requests can cause pages of active requests to be closed in order to get them out. 
This will reduce the latency of the high-priority request at the expense of lower bandwidth and increased overall average latency.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_preemption, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_preemption), }, { .name = "UNC_M_PRE_COUNT", .code = 0x2, .desc = "Counts the number of DRAM Precharge commands sent on this channel.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_pre_count, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_pre_count), }, { .name = "UNC_M_RD_CAS_PRIO", .code = 0xa0, .desc = "TBD", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_prio, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_prio), }, { .name = "UNC_M_RD_CAS_RANK0", .code = 0xb0, .desc = "Read CAS Access to Rank", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_rank0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_rank0), }, { .name = "UNC_M_RD_CAS_RANK1", .code = 0xb1, .desc = "Read CAS Access to Rank", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_rank0, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_rank0), }, { .name = "UNC_M_RD_CAS_RANK2", .code = 0xb2, .desc = "Read CAS Access to Rank", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_rank0, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_rank0), }, { .name = "UNC_M_RD_CAS_RANK3", .code = 0xb3, .desc = "Read CAS Access to Rank", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_rank0, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_rank0), }, { .name = "UNC_M_RD_CAS_RANK4", .code = 0xb4, .desc = "Read CAS Access to Rank", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_rank0, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_rank0), }, { .name = "UNC_M_RD_CAS_RANK5", .code = 0xb5, .desc = "Read CAS Access to
Rank", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_rank0, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_rank0), }, { .name = "UNC_M_RD_CAS_RANK6", .code = 0xb6, .desc = "Read CAS Access to Rank", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_rank0, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_rank0), }, { .name = "UNC_M_RD_CAS_RANK7", .code = 0xb7, .desc = "Read CAS Access to Rank", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_rank0, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_rank0), }, { .name = "UNC_M_RPQ_CYCLES_FULL", .code = 0x12, .desc = "Counts the number of cycles when the Read Pending Queue is full. When the RPQ is full, the HA will not be able to issue any additional read requests into the iMC. This count should be similar to the count in the HA which tracks the number of cycles that the HA has no RPQ credits, just somewhat smaller to account for the credit return overhead. We generally do not expect to see RPQ become full except for potentially during Write Major Mode or while running with slow DRAM. This event only tracks non-ISOC queue entries.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_RPQ_CYCLES_NE", .code = 0x11, .desc = "Counts the number of cycles that the Read Pending Queue is not empty. This can then be used to calculate the average occupancy (in conjunction with the Read Pending Queue Occupancy count). The RPQ is used to schedule reads out to the memory controller and to track the requests. Requests allocate into the RPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the HA to the iMC. They deallocate after the CAS command has been issued to memory.
This filter is to be used in conjunction with the occupancy filter so that one can correctly track the average occupancies for schedulable entries and scheduled requests.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_RPQ_INSERTS", .code = 0x10, .desc = "Counts the number of allocations into the Read Pending Queue. This queue is used to schedule reads out to the memory controller and to track the requests. Requests allocate into the RPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the HA to the iMC. They deallocate after the CAS command has been issued to memory. This includes both ISOCH and non-ISOCH requests.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_RPQ_OCCUPANCY", .code = 0x80, .desc = "Accumulates the occupancies of the Read Pending Queue each cycle. This can then be used to calculate both the average occupancy (in conjunction with the number of cycles not empty) and the average latency (in conjunction with the number of allocations). The RPQ is used to schedule reads out to the memory controller and to track the requests. Requests allocate into the RPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the HA to the iMC. They deallocate after the CAS command has been issued to memory.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_WMM_TO_RMM", .code = 0xc0, .desc = "Transitions from WMM to RMM because of low threshold", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_wmm_to_rmm, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_wmm_to_rmm), }, { .name = "UNC_M_WPQ_CYCLES_FULL", .code = 0x22, .desc = "Counts the number of cycles when the Write Pending Queue is full. When the WPQ is full, the HA will not be able to issue any additional write requests into the iMC. 
This count should be similar to the count in the CHA which tracks the number of cycles that the CHA has no WPQ credits, just somewhat smaller to account for the credit return overhead.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_WPQ_CYCLES_NE", .code = 0x21, .desc = "Counts the number of cycles that the Write Pending Queue is not empty. This can then be used to calculate the average queue occupancy (in conjunction with the WPQ Occupancy Accumulation count). The WPQ is used to schedule writes out to the memory controller and to track the writes. Requests allocate into the WPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the CHA to the iMC. They deallocate after being issued to DRAM. Write requests themselves are able to complete (from the perspective of the rest of the system) as soon as they have posted to the iMC. This is not to be confused with actually performing the write to DRAM. Therefore, the average latency for this queue is actually not useful for deconstructing intermediate write latencies.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_WPQ_INSERTS", .code = 0x20, .desc = "Counts the number of allocations into the Write Pending Queue. This can then be used to calculate the average queuing latency (in conjunction with the WPQ occupancy count). The WPQ is used to schedule writes out to the memory controller and to track the writes. Requests allocate into the WPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the CHA to the iMC. They deallocate after being issued to DRAM. Write requests themselves are able to complete (from the perspective of the rest of the system) as soon as they have posted to the iMC.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_WPQ_READ_HIT", .code = 0x23, .desc = "Counts the number of times a request hits in the WPQ (write-pending queue).
The iMC allows writes and reads to pass up other writes to different addresses. Before a read or a write is issued, it will first CAM the WPQ to see if there is a write pending to that address. When reads hit, they are able to directly pull their data from the WPQ instead of going to memory. Writes that hit will overwrite the existing data. Partial writes that hit will not need to do underfill reads and will simply update their relevant sections.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_WPQ_WRITE_HIT", .code = 0x24, .desc = "Counts the number of times a request hits in the WPQ (write-pending queue). The iMC allows writes and reads to pass up other writes to different addresses. Before a read or a write is issued, it will first CAM the WPQ to see if there is a write pending to that address. When reads hit, they are able to directly pull their data from the WPQ instead of going to memory. Writes that hit will overwrite the existing data. Partial writes that hit will not need to do underfill reads and will simply update their relevant sections.", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_WRONG_MM", .code = 0xc1, .desc = "Number of times not getting the requested major mode", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M_WR_CAS_RANK0", .code = 0xb8, .desc = "Write CAS to Rank", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_rank0, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_rank0), }, { .name = "UNC_M_WR_CAS_RANK1", .code = 0xb9, .desc = "Write CAS to Rank", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_rank0 /* shared */, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_rank0), }, { .name = "UNC_M_WR_CAS_RANK2", .code = 0xba, .desc = "Write CAS to Rank", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_rank0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_rank0), }, { .name =
"UNC_M_WR_CAS_RANK3", .code = 0xbb, .desc = "Write CAS to Rank", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_rank0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_rank0), }, { .name = "UNC_M_WR_CAS_RANK4", .code = 0xbc, .desc = "Write CAS to Rank", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_rank0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_rank0), }, { .name = "UNC_M_WR_CAS_RANK5", .code = 0xbd, .desc = "Write CAS to Rank", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_rank0, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_rank0), }, { .name = "UNC_M_WR_CAS_RANK6", .code = 0xbe, .desc = "Write CAS to Rank", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_rank0, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_rank0), }, { .name = "UNC_M_WR_CAS_RANK7", .code = 0xbf, .desc = "Write CAS to Rank", .modmsk = SKX_UNC_IMC_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m_rd_cas_rank0, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m_rd_cas_rank0), }, }; /* file: papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_skx_unc_irp_events.h */ /* * Copyright (c) 2017 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software.
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: skx_unc_irp */ static intel_x86_umask_t skx_unc_i_cache_total_occupancy[]={ { .uname = "ANY", .ucode = 0x100, .udesc = "Total Write Cache Occupancy -- Any Source", .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_Q", .ucode = 0x200, .udesc = "Total Write Cache Occupancy -- Snoops", .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM", .ucode = 0x400, .udesc = "Total Write Cache Occupancy -- Mem", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t skx_unc_i_coherent_ops[]={ { .uname = "CLFLUSH", .ucode = 0x8000, .udesc = "Coherent Ops -- CLFlush", }, { .uname = "CRD", .ucode = 0x200, .udesc = "Coherent Ops -- CRd", }, { .uname = "DRD", .ucode = 0x400, .udesc = "Coherent Ops -- DRd", }, { .uname = "PCIDCAHINT", .ucode = 0x2000, .udesc = "Coherent Ops -- PCIDCAHint", }, { .uname = "PCIRDCUR", .ucode = 0x100, .udesc = "Coherent Ops -- PCIRdCur", }, { .uname = "PCITOM", .ucode = 0x1000, .udesc = "Coherent Ops -- PCIItoM", }, { .uname = "RFO", .ucode = 0x800, .udesc = "Coherent Ops -- RFO", }, { .uname = "WBMTOI", .ucode = 0x4000, .udesc = "Coherent Ops -- WbMtoI", }, }; static intel_x86_umask_t skx_unc_i_irp_all[]={ { .uname = "INBOUND_INSERTS", .ucode = 0x100, .udesc = " -- All Inserts Inbound (p2p + faf + cset)", }, { .uname = "OUTBOUND_INSERTS", .ucode = 0x200, .udesc = " -- All Inserts Outbound (BL, AK, Snoops)", }, }; static intel_x86_umask_t skx_unc_i_misc0[]={ { .uname =
"2ND_ATOMIC_INSERT", .ucode = 0x1000, .udesc = "Misc Events - Set 0 -- Cache Inserts of Atomic Transactions as Secondary", .uflags = INTEL_X86_NCOMBO, }, { .uname = "2ND_RD_INSERT", .ucode = 0x400, .udesc = "Misc Events - Set 0 -- Cache Inserts of Read Transactions as Secondary", .uflags = INTEL_X86_NCOMBO, }, { .uname = "2ND_WR_INSERT", .ucode = 0x800, .udesc = "Misc Events - Set 0 -- Cache Inserts of Write Transactions as Secondary", .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAST_REJ", .ucode = 0x200, .udesc = "Misc Events - Set 0 -- Fastpath Rejects", .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAST_REQ", .ucode = 0x100, .udesc = "Misc Events - Set 0 -- Fastpath Requests", .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAST_XFER", .ucode = 0x2000, .udesc = "Misc Events - Set 0 -- Fastpath Transfers From Primary to Secondary", .uflags = INTEL_X86_NCOMBO, }, { .uname = "PF_ACK_HINT", .ucode = 0x4000, .udesc = "Misc Events - Set 0 -- Prefetch Ack Hints From Primary to Secondary", .uflags = INTEL_X86_NCOMBO, }, { .uname = "UNKNOWN", .ucode = 0x8000, .udesc = "Misc Events - Set 0 -- ", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t skx_unc_i_misc1[]={ { .uname = "LOST_FWD", .ucode = 0x1000, .udesc = "Misc Events - Set 1 -- Lost Forward", .uflags = INTEL_X86_NCOMBO, }, { .uname = "SEC_RCVD_INVLD", .ucode = 0x2000, .udesc = "Misc Events - Set 1 -- Received Invalid", .uflags = INTEL_X86_NCOMBO, }, { .uname = "SEC_RCVD_VLD", .ucode = 0x4000, .udesc = "Misc Events - Set 1 -- Received Valid", .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_E", .ucode = 0x400, .udesc = "Misc Events - Set 1 -- Slow Transfer of E Line", .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_I", .ucode = 0x100, .udesc = "Misc Events - Set 1 -- Slow Transfer of I Line", .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_M", .ucode = 0x800, .udesc = "Misc Events - Set 1 -- Slow Transfer of M Line", .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_S", .ucode = 0x200, .udesc = "Misc Events - Set 1 -- 
Slow Transfer of S Line", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t skx_unc_i_p2p_transactions[]={ { .uname = "CMPL", .ucode = 0x800, .udesc = "P2P Transactions -- P2P completions", }, { .uname = "LOC", .ucode = 0x4000, .udesc = "P2P Transactions -- match if local only", }, { .uname = "LOC_AND_TGT_MATCH", .ucode = 0x8000, .udesc = "P2P Transactions -- match if local and target matches", }, { .uname = "MSG", .ucode = 0x400, .udesc = "P2P Transactions -- P2P Message", }, { .uname = "RD", .ucode = 0x100, .udesc = "P2P Transactions -- P2P reads", }, { .uname = "REM", .ucode = 0x1000, .udesc = "P2P Transactions -- Match if remote only", }, { .uname = "REM_AND_TGT_MATCH", .ucode = 0x2000, .udesc = "P2P Transactions -- match if remote and target matches", }, { .uname = "WR", .ucode = 0x200, .udesc = "P2P Transactions -- P2P Writes", }, }; static intel_x86_umask_t skx_unc_i_snoop_resp[]={ { .uname = "HIT_ES", .ucode = 0x400, .udesc = "Snoop Responses -- Hit E or S", }, { .uname = "HIT_I", .ucode = 0x200, .udesc = "Snoop Responses -- Hit I", }, { .uname = "HIT_M", .ucode = 0x800, .udesc = "Snoop Responses -- Hit M", }, { .uname = "MISS", .ucode = 0x100, .udesc = "Snoop Responses -- Miss", }, { .uname = "SNPCODE", .ucode = 0x1000, .udesc = "Snoop Responses -- SnpCode", }, { .uname = "SNPDATA", .ucode = 0x2000, .udesc = "Snoop Responses -- SnpData", }, { .uname = "SNPINV", .ucode = 0x4000, .udesc = "Snoop Responses -- SnpInv", }, }; static intel_x86_umask_t skx_unc_i_transactions[]={ { .uname = "ATOMIC", .ucode = 0x1000, .udesc = "Inbound Transaction Count -- Atomic", }, { .uname = "OTHER", .ucode = 0x2000, .udesc = "Inbound Transaction Count -- Other", }, { .uname = "RD_PREF", .ucode = 0x400, .udesc = "Inbound Transaction Count -- Read Prefetches", }, { .uname = "READS", .ucode = 0x100, .udesc = "Inbound Transaction Count -- Reads", }, { .uname = "WRITES", .ucode = 0x200, .udesc = "Inbound Transaction Count -- Writes", }, { .uname = "WR_PREF", .ucode = 
0x800, .udesc = "Inbound Transaction Count -- Write Prefetches", }, }; static intel_x86_entry_t intel_skx_unc_i_pe[]={ { .name = "UNC_I_CACHE_TOTAL_OCCUPANCY", .code = 0xf, .desc = "Accumulates the number of reads and writes that are outstanding in the uncore in each cycle. This is effectively the sum of the READ_OCCUPANCY and WRITE_OCCUPANCY events.", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = skx_unc_i_cache_total_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_i_cache_total_occupancy), }, { .name = "UNC_I_CLOCKTICKS", .code = 0x1, .desc = "IRP Clocks", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_I_COHERENT_OPS", .code = 0x10, .desc = "Counts the number of coherency related operations serviced by the IRP", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = skx_unc_i_coherent_ops, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_i_coherent_ops), }, { .name = "UNC_I_FAF_FULL", .code = 0x17, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_I_FAF_INSERTS", .code = 0x18, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_I_FAF_OCCUPANCY", .code = 0x19, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_I_FAF_TRANSACTIONS", .code = 0x16, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_I_IRP_ALL", .code = 0x1e, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_i_irp_all, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_i_irp_all), }, { .name = "UNC_I_MISC0", .code = 0x1c, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = skx_unc_i_misc0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_i_misc0), }, { .name = "UNC_I_MISC1", .code = 0x1d, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = skx_unc_i_misc1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_i_misc1), }, { .name = "UNC_I_P2P_INSERTS", .code = 0x14, .desc = "P2P requests from the ITC", .modmsk = 
SKX_UNC_IRP_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_I_P2P_OCCUPANCY", .code = 0x15, .desc = "P2P B & S Queue Occupancy", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_I_P2P_TRANSACTIONS", .code = 0x13, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_i_p2p_transactions, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_i_p2p_transactions), }, { .name = "UNC_I_SNOOP_RESP", .code = 0x12, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = skx_unc_i_snoop_resp, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_i_snoop_resp), }, { .name = "UNC_I_TRANSACTIONS", .code = 0x11, .desc = "Counts the number of Inbound transactions from the IRP to the Uncore. This can be filtered based on request type in addition to the source queue. Note the special filtering equation. We do OR-reduction on the request type. If the SOURCE bit is set, then we also do AND qualification based on the source portItID.", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = skx_unc_i_transactions, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_i_transactions), }, { .name = "UNC_I_TXC_AK_INSERTS", .code = 0xb, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXC_BL_DRS_CYCLES_FULL", .code = 0x5, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXC_BL_DRS_INSERTS", .code = 0x2, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXC_BL_DRS_OCCUPANCY", .code = 0x8, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXC_BL_NCB_CYCLES_FULL", .code = 0x6, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXC_BL_NCB_INSERTS", .code = 0x3, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXC_BL_NCB_OCCUPANCY", .code = 0x9, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXC_BL_NCS_CYCLES_FULL", .code = 0x7, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, 
.cntmsk = 0x3, }, { .name = "UNC_I_TXC_BL_NCS_INSERTS", .code = 0x4, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXC_BL_NCS_OCCUPANCY", .code = 0xa, .desc = "TBD", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXR2_AD_STALL_CREDIT_CYCLES", .code = 0x1a, .desc = "Counts the number of times when it is not possible to issue a request to the R2PCIe because there are no AD Egress Credits available.", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXR2_BL_STALL_CREDIT_CYCLES", .code = 0x1b, .desc = "Counts the number of times when it is not possible to issue data to the R2PCIe because there are no BL Egress Credits available.", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXS_DATA_INSERTS_NCB", .code = 0xd, .desc = "Counts the number of requests issued to the switch (towards the devices).", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXS_DATA_INSERTS_NCS", .code = 0xe, .desc = "Counts the number of requests issued to the switch (towards the devices).", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_I_TXS_REQUEST_OCCUPANCY", .code = 0xc, .desc = "Accumulates the number of outstanding outbound requests from the IRP to the switch (towards the devices). 
This can be used in conjunction with the allocations event in order to calculate average latency of outbound requests.", .modmsk = SKX_UNC_IRP_ATTRS, .cntmsk = 0x3, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_skx_unc_m2m_events.h
* * PMU: skx_unc_m2m */ static intel_x86_umask_t skx_unc_m2m_ag0_ad_crd_acquired[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "CMS Agent0 Credits Acquired -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "CMS Agent0 Credits Acquired -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "CMS Agent0 Credits Acquired -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "CMS Agent0 Credits Acquired -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "CMS Agent0 Credits Acquired -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "CMS Agent0 Credits Acquired -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_m2m_ag0_ad_crd_occupancy[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "CMS Agent0 Credits Occupancy -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "CMS Agent0 Credits Occupancy -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "CMS Agent0 Credits Occupancy -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "CMS Agent0 Credits Occupancy -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "CMS Agent0 Credits Occupancy -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "CMS Agent0 Credits Occupancy -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_m2m_bypass_m2m_egress[]={ { .uname = "NOT_TAKEN", .ucode = 0x200, .udesc = "M2M to iMC Bypass -- Not Taken", }, { .uname = "TAKEN", .ucode = 0x100, .udesc = "M2M to iMC Bypass -- Taken", }, }; static intel_x86_umask_t skx_unc_m2m_bypass_m2m_ingress[]={ { .uname = "NOT_TAKEN", .ucode = 0x200, .udesc = "M2M to iMC Bypass -- Not Taken", }, { .uname = "TAKEN", .ucode = 0x100, .udesc = "M2M to iMC Bypass -- Taken", }, }; static intel_x86_umask_t skx_unc_m2m_directory_hit[]={ { .uname = "CLEAN_A", .ucode = 0x8000, .udesc = "Directory Hit -- On NonDirty Line in A State", }, { .uname = "CLEAN_I", .ucode = 0x1000, .udesc = 
"Directory Hit -- On NonDirty Line in I State", }, { .uname = "CLEAN_P", .ucode = 0x4000, .udesc = "Directory Hit -- On NonDirty Line in L State", }, { .uname = "CLEAN_S", .ucode = 0x2000, .udesc = "Directory Hit -- On NonDirty Line in S State", }, { .uname = "DIRTY_A", .ucode = 0x800, .udesc = "Directory Hit -- On Dirty Line in A State", }, { .uname = "DIRTY_I", .ucode = 0x100, .udesc = "Directory Hit -- On Dirty Line in I State", }, { .uname = "DIRTY_P", .ucode = 0x400, .udesc = "Directory Hit -- On Dirty Line in L State", }, { .uname = "DIRTY_S", .ucode = 0x200, .udesc = "Directory Hit -- On Dirty Line in S State", }, }; static intel_x86_umask_t skx_unc_m2m_directory_lookup[]={ { .uname = "ANY", .ucode = 0x100, .udesc = "Directory Lookups -- Any state", .uflags= INTEL_X86_DFL, }, { .uname = "STATE_A", .ucode = 0x800, .udesc = "Directory Lookups -- A State", }, { .uname = "STATE_I", .ucode = 0x200, .udesc = "Directory Lookups -- I State", }, { .uname = "STATE_S", .ucode = 0x400, .udesc = "Directory Lookups -- S State", }, }; static intel_x86_umask_t skx_unc_m2m_directory_miss[]={ { .uname = "CLEAN_A", .ucode = 0x8000, .udesc = "Directory Miss -- On NonDirty Line in A State", }, { .uname = "CLEAN_I", .ucode = 0x1000, .udesc = "Directory Miss -- On NonDirty Line in I State", }, { .uname = "CLEAN_P", .ucode = 0x4000, .udesc = "Directory Miss -- On NonDirty Line in L State", }, { .uname = "CLEAN_S", .ucode = 0x2000, .udesc = "Directory Miss -- On NonDirty Line in S State", }, { .uname = "DIRTY_A", .ucode = 0x800, .udesc = "Directory Miss -- On Dirty Line in A State", }, { .uname = "DIRTY_I", .ucode = 0x100, .udesc = "Directory Miss -- On Dirty Line in I State", }, { .uname = "DIRTY_P", .ucode = 0x400, .udesc = "Directory Miss -- On Dirty Line in L State", }, { .uname = "DIRTY_S", .ucode = 0x200, .udesc = "Directory Miss -- On Dirty Line in S State", }, }; static intel_x86_umask_t skx_unc_m2m_directory_update[]={ { .uname = "A2I", .ucode = 0x2000, .udesc = "Directory 
Updates -- A2I", }, { .uname = "A2S", .ucode = 0x4000, .udesc = "Directory Updates -- A2S", }, { .uname = "ANY", .ucode = 0x100, .udesc = "Directory Updates -- Any", }, { .uname = "I2A", .ucode = 0x400, .udesc = "Directory Updates -- I2A", }, { .uname = "I2S", .ucode = 0x200, .udesc = "Directory Updates -- I2S", }, { .uname = "S2A", .ucode = 0x1000, .udesc = "Directory Updates -- S2A", }, { .uname = "S2I", .ucode = 0x800, .udesc = "Directory Updates -- S2I", }, }; static intel_x86_umask_t skx_unc_m2m_egress_ordering[]={ { .uname = "IV_SNOOPGO_DN", .ucode = 0x400, .udesc = "Egress Blocking due to Ordering requirements -- Down", }, { .uname = "IV_SNOOPGO_UP", .ucode = 0x100, .udesc = "Egress Blocking due to Ordering requirements -- Up", }, }; static intel_x86_umask_t skx_unc_m2m_fast_asserted[]={ { .uname = "HORZ", .ucode = 0x200, .udesc = "FaST wire asserted -- Horizontal", .uflags = INTEL_X86_NCOMBO, }, { .uname = "VERT", .ucode = 0x100, .udesc = "FaST wire asserted -- Vertical", .uflags = INTEL_X86_NCOMBO, }, }; static intel_x86_umask_t skx_unc_m2m_horz_ring_ad_in_use[]={ { .uname = "LEFT_EVEN", .ucode = 0x100, .udesc = "Horizontal AD Ring In Use -- Left and Even", }, { .uname = "LEFT_ODD", .ucode = 0x200, .udesc = "Horizontal AD Ring In Use -- Left and Odd", }, { .uname = "RIGHT_EVEN", .ucode = 0x400, .udesc = "Horizontal AD Ring In Use -- Right and Even", }, { .uname = "RIGHT_ODD", .ucode = 0x800, .udesc = "Horizontal AD Ring In Use -- Right and Odd", }, }; static intel_x86_umask_t skx_unc_m2m_horz_ring_ak_in_use[]={ { .uname = "LEFT_EVEN", .ucode = 0x100, .udesc = "Horizontal AK Ring In Use -- Left and Even", }, { .uname = "LEFT_ODD", .ucode = 0x200, .udesc = "Horizontal AK Ring In Use -- Left and Odd", }, { .uname = "RIGHT_EVEN", .ucode = 0x400, .udesc = "Horizontal AK Ring In Use -- Right and Even", }, { .uname = "RIGHT_ODD", .ucode = 0x800, .udesc = "Horizontal AK Ring In Use -- Right and Odd", }, }; static intel_x86_umask_t 
skx_unc_m2m_horz_ring_bl_in_use[]={ { .uname = "LEFT_EVEN", .ucode = 0x100, .udesc = "Horizontal BL Ring in Use -- Left and Even", }, { .uname = "LEFT_ODD", .ucode = 0x200, .udesc = "Horizontal BL Ring in Use -- Left and Odd", }, { .uname = "RIGHT_EVEN", .ucode = 0x400, .udesc = "Horizontal BL Ring in Use -- Right and Even", }, { .uname = "RIGHT_ODD", .ucode = 0x800, .udesc = "Horizontal BL Ring in Use -- Right and Odd", }, }; static intel_x86_umask_t skx_unc_m2m_horz_ring_iv_in_use[]={ { .uname = "LEFT", .ucode = 0x100, .udesc = "Horizontal IV Ring in Use -- Left", }, { .uname = "RIGHT", .ucode = 0x400, .udesc = "Horizontal IV Ring in Use -- Right", }, }; static intel_x86_umask_t skx_unc_m2m_imc_reads[]={ { .uname = "ALL", .ucode = 0x400, .udesc = "M2M Reads Issued to iMC -- All, regardless of priority.", }, { .uname = "FROM_TRANSGRESS", .ucode = 0x1000, .udesc = "M2M Reads Issued to iMC -- All, regardless of priority.", }, { .uname = "ISOCH", .ucode = 0x200, .udesc = "M2M Reads Issued to iMC -- Critical Priority", }, { .uname = "NORMAL", .ucode = 0x100, .udesc = "M2M Reads Issued to iMC -- Normal Priority", }, }; static intel_x86_umask_t skx_unc_m2m_imc_writes[]={ { .uname = "ALL", .ucode = 0x1000, .udesc = "M2M Writes Issued to iMC -- All Writes", }, { .uname = "FROM_TRANSGRESS", .ucode = 0x4000, .udesc = "M2M Writes Issued to iMC -- All, regardless of priority.", }, { .uname = "FULL", .ucode = 0x100, .udesc = "M2M Writes Issued to iMC -- Full Line Non-ISOCH", }, { .uname = "FULL_ISOCH", .ucode = 0x400, .udesc = "M2M Writes Issued to iMC -- ISOCH Full Line", }, { .uname = "NI", .ucode = 0x8000, .udesc = "M2M Writes Issued to iMC -- All, regardless of priority.", }, { .uname = "PARTIAL", .ucode = 0x200, .udesc = "M2M Writes Issued to iMC -- Partial Non-ISOCH", }, { .uname = "PARTIAL_ISOCH", .ucode = 0x800, .udesc = "M2M Writes Issued to iMC -- ISOCH Partial", }, }; static intel_x86_umask_t skx_unc_m2m_pkt_match[]={ { .uname = "MC", .ucode = 0x200, .udesc = 
"Number Packet Header Matches -- MC Match", }, { .uname = "MESH", .ucode = 0x100, .udesc = "Number Packet Header Matches -- Mesh Match", }, }; static intel_x86_umask_t skx_unc_m2m_ring_bounces_horz[]={ { .uname = "AD", .ucode = 0x100, .udesc = "Messages that bounced on the Horizontal Ring. -- AD", }, { .uname = "AK", .ucode = 0x200, .udesc = "Messages that bounced on the Horizontal Ring. -- AK", }, { .uname = "BL", .ucode = 0x400, .udesc = "Messages that bounced on the Horizontal Ring. -- BL", }, { .uname = "IV", .ucode = 0x800, .udesc = "Messages that bounced on the Horizontal Ring. -- IV", }, }; static intel_x86_umask_t skx_unc_m2m_ring_bounces_vert[]={ { .uname = "AD", .ucode = 0x100, .udesc = "Messages that bounced on the Vertical Ring. -- AD", }, { .uname = "AK", .ucode = 0x200, .udesc = "Messages that bounced on the Vertical Ring. -- Acknowledgements to core", }, { .uname = "BL", .ucode = 0x400, .udesc = "Messages that bounced on the Vertical Ring. -- Data Responses to core", }, { .uname = "IV", .ucode = 0x800, .udesc = "Messages that bounced on the Vertical Ring. 
-- Snoops of processor cache.", }, }; static intel_x86_umask_t skx_unc_m2m_ring_sink_starved_horz[]={ { .uname = "AD", .ucode = 0x100, .udesc = "Sink Starvation on Horizontal Ring -- AD", }, { .uname = "AK", .ucode = 0x200, .udesc = "Sink Starvation on Horizontal Ring -- AK", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "Sink Starvation on Horizontal Ring -- Acknowledgements to Agent 1", }, { .uname = "BL", .ucode = 0x400, .udesc = "Sink Starvation on Horizontal Ring -- BL", }, { .uname = "IV", .ucode = 0x800, .udesc = "Sink Starvation on Horizontal Ring -- IV", }, }; static intel_x86_umask_t skx_unc_m2m_ring_sink_starved_vert[]={ { .uname = "AD", .ucode = 0x100, .udesc = "Sink Starvation on Vertical Ring -- AD", }, { .uname = "AK", .ucode = 0x200, .udesc = "Sink Starvation on Vertical Ring -- Acknowledgements to core", }, { .uname = "BL", .ucode = 0x400, .udesc = "Sink Starvation on Vertical Ring -- Data Responses to core", }, { .uname = "IV", .ucode = 0x800, .udesc = "Sink Starvation on Vertical Ring -- Snoops of processor cache.", }, }; static intel_x86_umask_t skx_unc_m2m_rpq_cycles_reg_credits[]={ { .uname = "CHN0", .ucode = 0x100, .udesc = "M2M to iMC RPQ Cycles w/Credits - Regular -- Channel 0", }, { .uname = "CHN1", .ucode = 0x200, .udesc = "M2M to iMC RPQ Cycles w/Credits - Regular -- Channel 1", }, { .uname = "CHN2", .ucode = 0x400, .udesc = "M2M to iMC RPQ Cycles w/Credits - Regular -- Channel 2", }, }; static intel_x86_umask_t skx_unc_m2m_rpq_cycles_spec_credits[]={ { .uname = "CHN0", .ucode = 0x100, .udesc = "M2M to iMC RPQ Cycles w/Credits - Special -- Channel 0", }, { .uname = "CHN1", .ucode = 0x200, .udesc = "M2M to iMC RPQ Cycles w/Credits - Special -- Channel 1", }, { .uname = "CHN2", .ucode = 0x400, .udesc = "M2M to iMC RPQ Cycles w/Credits - Special -- Channel 2", }, }; static intel_x86_umask_t skx_unc_m2m_rxr_busy_starved[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Transgress Injection Starvation -- AD - Bounce", }, { .uname = 
"AD_CRD", .ucode = 0x1000, .udesc = "Transgress Injection Starvation -- AD - Credit", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Transgress Injection Starvation -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Transgress Injection Starvation -- BL - Credit", }, }; static intel_x86_umask_t skx_unc_m2m_rxr_bypass[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Transgress Ingress Bypass -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Transgress Ingress Bypass -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Transgress Ingress Bypass -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Transgress Ingress Bypass -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Transgress Ingress Bypass -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Transgress Ingress Bypass -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m2m_rxr_crd_starved[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Transgress Injection Starvation -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Transgress Injection Starvation -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Transgress Injection Starvation -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Transgress Injection Starvation -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Transgress Injection Starvation -- BL - Credit", }, { .uname = "IFV", .ucode = 0x8000, .udesc = "Transgress Injection Starvation -- IFV - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Transgress Injection Starvation -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m2m_rxr_inserts[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Transgress Ingress Allocations -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Transgress Ingress Allocations -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Transgress Ingress Allocations -- AK - 
Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Transgress Ingress Allocations -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Transgress Ingress Allocations -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Transgress Ingress Allocations -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m2m_rxr_occupancy[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Transgress Ingress Occupancy -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Transgress Ingress Occupancy -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Transgress Ingress Occupancy -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Transgress Ingress Occupancy -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Transgress Ingress Occupancy -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Transgress Ingress Occupancy -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m2m_stall_no_txr_horz_crd_ad_ag0[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_m2m_stall_no_txr_horz_crd_ad_ag1[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 1", }, { .uname = "TGR2", 
.ucode = 0x400, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_m2m_stall_no_txr_horz_crd_bl_ag0[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_m2m_stall_no_txr_horz_crd_bl_ag1[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_m2m_tracker_cycles_full[]={ { .uname = "CH0", .ucode = 
0x100, .udesc = "Tracker Cycles Full -- Channel 0", }, { .uname = "CH1", .ucode = 0x200, .udesc = "Tracker Cycles Full -- Channel 1", }, { .uname = "CH2", .ucode = 0x400, .udesc = "Tracker Cycles Full -- Channel 2", }, }; static intel_x86_umask_t skx_unc_m2m_tracker_cycles_ne[]={ { .uname = "CH0", .ucode = 0x100, .udesc = "Tracker Cycles Not Empty -- Channel 0", }, { .uname = "CH1", .ucode = 0x200, .udesc = "Tracker Cycles Not Empty -- Channel 1", }, { .uname = "CH2", .ucode = 0x400, .udesc = "Tracker Cycles Not Empty -- Channel 2", }, }; static intel_x86_umask_t skx_unc_m2m_tracker_inserts[]={ { .uname = "CH0", .ucode = 0x100, .udesc = "Tracker Inserts -- Channel 0", }, { .uname = "CH1", .ucode = 0x200, .udesc = "Tracker Inserts -- Channel 1", }, { .uname = "CH2", .ucode = 0x400, .udesc = "Tracker Inserts -- Channel 2", }, }; static intel_x86_umask_t skx_unc_m2m_tracker_occupancy[]={ { .uname = "CH0", .ucode = 0x100, .udesc = "Tracker Occupancy -- Channel 0", }, { .uname = "CH1", .ucode = 0x200, .udesc = "Tracker Occupancy -- Channel 1", }, { .uname = "CH2", .ucode = 0x400, .udesc = "Tracker Occupancy -- Channel 2", }, }; static intel_x86_umask_t skx_unc_m2m_txc_ak[]={ { .uname = "CRD_CBO", .ucode = 0x200, .udesc = "Outbound Ring Transactions on AK -- CRD Transactions to Cbo", }, { .uname = "NDR", .ucode = 0x100, .udesc = "Outbound Ring Transactions on AK -- NDR Transactions", }, }; static intel_x86_umask_t skx_unc_m2m_txc_ak_credits_acquired[]={ { .uname = "CMS0", .ucode = 0x100, .udesc = "AK Egress (to CMS) Credit Acquired -- Common Mesh Stop - Near Side", }, { .uname = "CMS1", .ucode = 0x200, .udesc = "AK Egress (to CMS) Credit Acquired -- Common Mesh Stop - Far Side", }, }; static intel_x86_umask_t skx_unc_m2m_txc_ak_credit_occupancy[]={ { .uname = "CMS0", .ucode = 0x100, .udesc = "AK Egress (to CMS) Credits Occupancy -- Common Mesh Stop - Near Side", }, { .uname = "CMS1", .ucode = 0x200, .udesc = "AK Egress (to CMS) Credits Occupancy -- Common Mesh Stop - Far 
Side", }, }; static intel_x86_umask_t skx_unc_m2m_txc_ak_cycles_full[]={ { .uname = "ALL", .ucode = 0x300, .udesc = "AK Egress (to CMS) Full -- All", }, { .uname = "CMS0", .ucode = 0x100, .udesc = "AK Egress (to CMS) Full -- Common Mesh Stop - Near Side", }, { .uname = "CMS1", .ucode = 0x200, .udesc = "AK Egress (to CMS) Full -- Common Mesh Stop - Far Side", }, { .uname = "RDCRD0", .ucode = 0x800, .udesc = "AK Egress (to CMS) Full -- Read Credit Request", }, { .uname = "RDCRD1", .ucode = 0x8800, .udesc = "AK Egress (to CMS) Full -- Read Credit Request", }, { .uname = "WRCMP0", .ucode = 0x2000, .udesc = "AK Egress (to CMS) Full -- Write Compare Request", }, { .uname = "WRCMP1", .ucode = 0xa000, .udesc = "AK Egress (to CMS) Full -- Write Compare Request", }, { .uname = "WRCRD0", .ucode = 0x1000, .udesc = "AK Egress (to CMS) Full -- Write Credit Request", }, { .uname = "WRCRD1", .ucode = 0x9000, .udesc = "AK Egress (to CMS) Full -- Write Credit Request", }, }; static intel_x86_umask_t skx_unc_m2m_txc_ak_cycles_ne[]={ { .uname = "ALL", .ucode = 0x300, .udesc = "AK Egress (to CMS) Not Empty -- All", }, { .uname = "CMS0", .ucode = 0x100, .udesc = "AK Egress (to CMS) Not Empty -- Common Mesh Stop - Near Side", }, { .uname = "CMS1", .ucode = 0x200, .udesc = "AK Egress (to CMS) Not Empty -- Common Mesh Stop - Far Side", }, { .uname = "RDCRD", .ucode = 0x800, .udesc = "AK Egress (to CMS) Not Empty -- Read Credit Request", }, { .uname = "WRCMP", .ucode = 0x2000, .udesc = "AK Egress (to CMS) Not Empty -- Write Compare Request", }, { .uname = "WRCRD", .ucode = 0x1000, .udesc = "AK Egress (to CMS) Not Empty -- Write Credit Request", }, }; static intel_x86_umask_t skx_unc_m2m_txc_ak_inserts[]={ { .uname = "ALL", .ucode = 0x300, .udesc = "AK Egress (to CMS) Allocations -- All", }, { .uname = "CMS0", .ucode = 0x100, .udesc = "AK Egress (to CMS) Allocations -- Common Mesh Stop - Near Side", }, { .uname = "CMS1", .ucode = 0x200, .udesc = "AK Egress (to CMS) Allocations -- Common Mesh 
Stop - Far Side", }, { .uname = "PREF_RD_CAM_HIT", .ucode = 0x4000, .udesc = "AK Egress (to CMS) Allocations -- Prefetch Read Cam Hit", }, { .uname = "RDCRD", .ucode = 0x800, .udesc = "AK Egress (to CMS) Allocations -- Read Credit Request", }, { .uname = "WRCMP", .ucode = 0x2000, .udesc = "AK Egress (to CMS) Allocations -- Write Compare Request", }, { .uname = "WRCRD", .ucode = 0x1000, .udesc = "AK Egress (to CMS) Allocations -- Write Credit Request", }, }; static intel_x86_umask_t skx_unc_m2m_txc_ak_no_credit_cycles[]={ { .uname = "CMS0", .ucode = 0x100, .udesc = "Cycles with No AK Egress (to CMS) Credits -- Common Mesh Stop - Near Side", }, { .uname = "CMS1", .ucode = 0x200, .udesc = "Cycles with No AK Egress (to CMS) Credits -- Common Mesh Stop - Far Side", }, }; static intel_x86_umask_t skx_unc_m2m_txc_ak_no_credit_stalled[]={ { .uname = "CMS0", .ucode = 0x100, .udesc = "Cycles Stalled with No AK Egress (to CMS) Credits -- Common Mesh Stop - Near Side", }, { .uname = "CMS1", .ucode = 0x200, .udesc = "Cycles Stalled with No AK Egress (to CMS) Credits -- Common Mesh Stop - Far Side", }, }; static intel_x86_umask_t skx_unc_m2m_txc_ak_occupancy[]={ { .uname = "ALL", .ucode = 0x300, .udesc = "AK Egress (to CMS) Occupancy -- All", }, { .uname = "CMS0", .ucode = 0x100, .udesc = "AK Egress (to CMS) Occupancy -- Common Mesh Stop - Near Side", }, { .uname = "CMS1", .ucode = 0x200, .udesc = "AK Egress (to CMS) Occupancy -- Common Mesh Stop - Far Side", }, { .uname = "RDCRD", .ucode = 0x800, .udesc = "AK Egress (to CMS) Occupancy -- Read Credit Request", }, { .uname = "WRCMP", .ucode = 0x2000, .udesc = "AK Egress (to CMS) Occupancy -- Write Compare Request", }, { .uname = "WRCRD", .ucode = 0x1000, .udesc = "AK Egress (to CMS) Occupancy -- Write Credit Request", }, }; static intel_x86_umask_t skx_unc_m2m_txc_ak_sideband[]={ { .uname = "RD", .ucode = 0x100, .udesc = "AK Egress (to CMS) Sideband -- ", }, { .uname = "WR", .ucode = 0x200, .udesc = "AK Egress (to CMS) Sideband 
-- ", }, }; static intel_x86_umask_t skx_unc_m2m_txc_bl[]={ { .uname = "DRS_CACHE", .ucode = 0x100, .udesc = "Outbound DRS Ring Transactions to Cache -- Data to Cache", }, { .uname = "DRS_CORE", .ucode = 0x200, .udesc = "Outbound DRS Ring Transactions to Cache -- Data to Core", }, { .uname = "DRS_UPI", .ucode = 0x400, .udesc = "Outbound DRS Ring Transactions to Cache -- Data to QPI", }, }; static intel_x86_umask_t skx_unc_m2m_txc_bl_credits_acquired[]={ { .uname = "CMS0", .ucode = 0x100, .udesc = "BL Egress (to CMS) Credit Acquired -- Common Mesh Stop - Near Side", }, { .uname = "CMS1", .ucode = 0x200, .udesc = "BL Egress (to CMS) Credit Acquired -- Common Mesh Stop - Far Side", }, }; static intel_x86_umask_t skx_unc_m2m_txc_bl_credit_occupancy[]={ { .uname = "CMS0", .ucode = 0x100, .udesc = "BL Egress (to CMS) Credits Occupancy -- Common Mesh Stop - Near Side", }, { .uname = "CMS1", .ucode = 0x200, .udesc = "BL Egress (to CMS) Credits Occupancy -- Common Mesh Stop - Far Side", }, }; static intel_x86_umask_t skx_unc_m2m_txc_bl_cycles_full[]={ { .uname = "ALL", .ucode = 0x300, .udesc = "BL Egress (to CMS) Full -- All", }, { .uname = "CMS0", .ucode = 0x100, .udesc = "BL Egress (to CMS) Full -- Common Mesh Stop - Near Side", }, { .uname = "CMS1", .ucode = 0x200, .udesc = "BL Egress (to CMS) Full -- Common Mesh Stop - Far Side", }, }; static intel_x86_umask_t skx_unc_m2m_txc_bl_cycles_ne[]={ { .uname = "ALL", .ucode = 0x300, .udesc = "BL Egress (to CMS) Not Empty -- All", }, { .uname = "CMS0", .ucode = 0x100, .udesc = "BL Egress (to CMS) Not Empty -- Common Mesh Stop - Near Side", }, { .uname = "CMS1", .ucode = 0x200, .udesc = "BL Egress (to CMS) Not Empty -- Common Mesh Stop - Far Side", }, }; static intel_x86_umask_t skx_unc_m2m_txc_bl_inserts[]={ { .uname = "ALL", .ucode = 0x300, .udesc = "BL Egress (to CMS) Allocations -- All", }, { .uname = "CMS0", .ucode = 0x100, .udesc = "BL Egress (to CMS) Allocations -- Common Mesh Stop - Near Side", }, { .uname = "CMS1", 
.ucode = 0x200, .udesc = "BL Egress (to CMS) Allocations -- Common Mesh Stop - Far Side", }, }; static intel_x86_umask_t skx_unc_m2m_txc_bl_no_credit_cycles[]={ { .uname = "CMS0", .ucode = 0x100, .udesc = "Cycles with No BL Egress (to CMS) Credits -- Common Mesh Stop - Near Side", }, { .uname = "CMS1", .ucode = 0x200, .udesc = "Cycles with No BL Egress (to CMS) Credits -- Common Mesh Stop - Far Side", }, }; static intel_x86_umask_t skx_unc_m2m_txc_bl_no_credit_stalled[]={ { .uname = "CMS0", .ucode = 0x100, .udesc = "Cycles Stalled with No BL Egress (to CMS) Credits -- Common Mesh Stop - Near Side", }, { .uname = "CMS1", .ucode = 0x200, .udesc = "Cycles Stalled with No BL Egress (to CMS) Credits -- Common Mesh Stop - Far Side", }, }; static intel_x86_umask_t skx_unc_m2m_txc_bl_occupancy[]={ { .uname = "ALL", .ucode = 0x300, .udesc = "BL Egress (to CMS) Occupancy -- All", }, { .uname = "CMS0", .ucode = 0x100, .udesc = "BL Egress (to CMS) Occupancy -- Common Mesh Stop - Near Side", }, { .uname = "CMS1", .ucode = 0x200, .udesc = "BL Egress (to CMS) Occupancy -- Common Mesh Stop - Far Side", }, }; static intel_x86_umask_t skx_unc_m2m_txr_horz_ads_used[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal ADS Used -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "CMS Horizontal ADS Used -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal ADS Used -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "CMS Horizontal ADS Used -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "CMS Horizontal ADS Used -- BL - Credit", }, }; static intel_x86_umask_t skx_unc_m2m_txr_horz_bypass[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal Bypass Used -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "CMS Horizontal Bypass Used -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal Bypass Used -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 
0x400, .udesc = "CMS Horizontal Bypass Used -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "CMS Horizontal Bypass Used -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "CMS Horizontal Bypass Used -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m2m_txr_horz_cycles_full[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m2m_txr_horz_cycles_ne[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m2m_txr_horz_inserts[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal Egress Inserts -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "CMS Horizontal Egress Inserts -- AD - Credit", }, { .uname 
= "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal Egress Inserts -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "CMS Horizontal Egress Inserts -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "CMS Horizontal Egress Inserts -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "CMS Horizontal Egress Inserts -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m2m_txr_horz_nack[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal Egress NACKs -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x2000, .udesc = "CMS Horizontal Egress NACKs -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal Egress NACKs -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "CMS Horizontal Egress NACKs -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "CMS Horizontal Egress NACKs -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "CMS Horizontal Egress NACKs -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m2m_txr_horz_occupancy[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal Egress Occupancy -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "CMS Horizontal Egress Occupancy -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal Egress Occupancy -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "CMS Horizontal Egress Occupancy -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "CMS Horizontal Egress Occupancy -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "CMS Horizontal Egress Occupancy -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m2m_txr_horz_starved[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal Egress Injection Starvation -- AD - Bounce", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal Egress Injection Starvation -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = 
"CMS Horizontal Egress Injection Starvation -- BL - Bounce", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "CMS Horizontal Egress Injection Starvation -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m2m_txr_vert_ads_used[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vertical ADS Used -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vertical ADS Used -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vertical ADS Used -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vertical ADS Used -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vertical ADS Used -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vertical ADS Used -- BL - Agent 1", }, }; static intel_x86_umask_t skx_unc_m2m_txr_vert_bypass[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vertical Bypass Used -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vertical Bypass Used -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vertical Bypass Used -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vertical Bypass Used -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vertical Bypass Used -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vertical Bypass Used -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "CMS Vertical Bypass Used -- IV", }, }; static intel_x86_umask_t skx_unc_m2m_txr_vert_cycles_full[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400,
.udesc = "Cycles CMS Vertical Egress Queue Is Full -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- IV", }, }; static intel_x86_umask_t skx_unc_m2m_txr_vert_cycles_ne[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- IV", }, }; static intel_x86_umask_t skx_unc_m2m_txr_vert_inserts[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vert Egress Allocations -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vert Egress Allocations -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vert Egress Allocations -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vert Egress Allocations -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vert Egress Allocations -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vert Egress Allocations -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "CMS Vert Egress Allocations -- IV", }, }; static intel_x86_umask_t skx_unc_m2m_txr_vert_nack[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vertical Egress NACKs -- AD - Agent 0", 
}, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vertical Egress NACKs -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vertical Egress NACKs -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vertical Egress NACKs -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vertical Egress NACKs -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vertical Egress NACKs -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "CMS Vertical Egress NACKs -- IV", }, }; static intel_x86_umask_t skx_unc_m2m_txr_vert_occupancy[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vert Egress Occupancy -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vert Egress Occupancy -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vert Egress Occupancy -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vert Egress Occupancy -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vert Egress Occupancy -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vert Egress Occupancy -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "CMS Vert Egress Occupancy -- IV", }, }; static intel_x86_umask_t skx_unc_m2m_txr_vert_starved[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vertical Egress Injection Starvation -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vertical Egress Injection Starvation -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vertical Egress Injection Starvation -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vertical Egress Injection Starvation -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vertical Egress Injection Starvation -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vertical Egress Injection Starvation -- BL - Agent 1", }, { .uname = 
"IV", .ucode = 0x800, .udesc = "CMS Vertical Egress Injection Starvation -- IV", }, }; static intel_x86_umask_t skx_unc_m2m_vert_ring_ad_in_use[]={ { .uname = "DN_EVEN", .ucode = 0x400, .udesc = "Vertical AD Ring In Use -- Down and Even", }, { .uname = "DN_ODD", .ucode = 0x800, .udesc = "Vertical AD Ring In Use -- Down and Odd", }, { .uname = "UP_EVEN", .ucode = 0x100, .udesc = "Vertical AD Ring In Use -- Up and Even", }, { .uname = "UP_ODD", .ucode = 0x200, .udesc = "Vertical AD Ring In Use -- Up and Odd", }, }; static intel_x86_umask_t skx_unc_m2m_vert_ring_ak_in_use[]={ { .uname = "DN_EVEN", .ucode = 0x400, .udesc = "Vertical AK Ring In Use -- Down and Even", }, { .uname = "DN_ODD", .ucode = 0x800, .udesc = "Vertical AK Ring In Use -- Down and Odd", }, { .uname = "UP_EVEN", .ucode = 0x100, .udesc = "Vertical AK Ring In Use -- Up and Even", }, { .uname = "UP_ODD", .ucode = 0x200, .udesc = "Vertical AK Ring In Use -- Up and Odd", }, }; static intel_x86_umask_t skx_unc_m2m_vert_ring_bl_in_use[]={ { .uname = "DN_EVEN", .ucode = 0x400, .udesc = "Vertical BL Ring in Use -- Down and Even", }, { .uname = "DN_ODD", .ucode = 0x800, .udesc = "Vertical BL Ring in Use -- Down and Odd", }, { .uname = "UP_EVEN", .ucode = 0x100, .udesc = "Vertical BL Ring in Use -- Up and Even", }, { .uname = "UP_ODD", .ucode = 0x200, .udesc = "Vertical BL Ring in Use -- Up and Odd", }, }; static intel_x86_umask_t skx_unc_m2m_vert_ring_iv_in_use[]={ { .uname = "DN", .ucode = 0x400, .udesc = "Vertical IV Ring in Use -- Down", }, { .uname = "UP", .ucode = 0x100, .udesc = "Vertical IV Ring in Use -- Up", }, }; static intel_x86_umask_t skx_unc_m2m_wpq_cycles_reg_credits[]={ { .uname = "CHN0", .ucode = 0x100, .udesc = "M2M->iMC WPQ Cycles w/Credits - Regular -- Channel 0", }, { .uname = "CHN1", .ucode = 0x200, .udesc = "M2M->iMC WPQ Cycles w/Credits - Regular -- Channel 1", }, { .uname = "CHN2", .ucode = 0x400, .udesc = "M2M->iMC WPQ Cycles w/Credits - Regular -- Channel 2", }, }; static 
intel_x86_umask_t skx_unc_m2m_wpq_cycles_spec_credits[]={ { .uname = "CHN0", .ucode = 0x100, .udesc = "M2M->iMC WPQ Cycles w/Credits - Special -- Channel 0", }, { .uname = "CHN1", .ucode = 0x200, .udesc = "M2M->iMC WPQ Cycles w/Credits - Special -- Channel 1", }, { .uname = "CHN2", .ucode = 0x400, .udesc = "M2M->iMC WPQ Cycles w/Credits - Special -- Channel 2", }, }; static intel_x86_umask_t skx_unc_m2m_write_tracker_cycles_full[]={ { .uname = "CH0", .ucode = 0x100, .udesc = "Write Tracker Cycles Full -- Channel 0", }, { .uname = "CH1", .ucode = 0x200, .udesc = "Write Tracker Cycles Full -- Channel 1", }, { .uname = "CH2", .ucode = 0x400, .udesc = "Write Tracker Cycles Full -- Channel 2", }, }; static intel_x86_umask_t skx_unc_m2m_write_tracker_cycles_ne[]={ { .uname = "CH0", .ucode = 0x100, .udesc = "Write Tracker Cycles Not Empty -- Channel 0", }, { .uname = "CH1", .ucode = 0x200, .udesc = "Write Tracker Cycles Not Empty -- Channel 1", }, { .uname = "CH2", .ucode = 0x400, .udesc = "Write Tracker Cycles Not Empty -- Channel 2", }, }; static intel_x86_umask_t skx_unc_m2m_write_tracker_inserts[]={ { .uname = "CH0", .ucode = 0x100, .udesc = "Write Tracker Inserts -- Channel 0", }, { .uname = "CH1", .ucode = 0x200, .udesc = "Write Tracker Inserts -- Channel 1", }, { .uname = "CH2", .ucode = 0x400, .udesc = "Write Tracker Inserts -- Channel 2", }, }; static intel_x86_umask_t skx_unc_m2m_write_tracker_occupancy[]={ { .uname = "CH0", .ucode = 0x100, .udesc = "Write Tracker Occupancy -- Channel 0", }, { .uname = "CH1", .ucode = 0x200, .udesc = "Write Tracker Occupancy -- Channel 1", }, { .uname = "CH2", .ucode = 0x400, .udesc = "Write Tracker Occupancy -- Channel 2", }, }; static intel_x86_entry_t intel_skx_unc_m2m_pe[]={ { .name = "UNC_M2_AG0_AD_CRD_ACQUIRED", .code = 0x80, .desc = "Number of CMS Agent 0 AD credits acquired in a given cycle, per transgress.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_ag0_ad_crd_acquired, .numasks= 
LIBPFM_ARRAY_SIZE(skx_unc_m2m_ag0_ad_crd_acquired), }, { .name = "UNC_M2_AG0_AD_CRD_OCCUPANCY", .code = 0x82, .desc = "Number of CMS Agent 0 AD credits in use in a given cycle, per transgress", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_ag0_ad_crd_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_ag0_ad_crd_occupancy), }, { .name = "UNC_M2_AG0_BL_CRD_ACQUIRED", .code = 0x88, .desc = "Number of CMS Agent 0 BL credits acquired in a given cycle, per transgress.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_ag0_ad_crd_acquired, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_ag0_ad_crd_acquired), }, { .name = "UNC_M2_AG0_BL_CRD_OCCUPANCY", .code = 0x8a, .desc = "Number of CMS Agent 0 BL credits in use in a given cycle, per transgress", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_ag0_ad_crd_occupancy, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_ag0_ad_crd_occupancy), }, { .name = "UNC_M2_AG1_AD_CRD_ACQUIRED", .code = 0x84, .desc = "Number of CMS Agent 1 AD credits acquired in a given cycle, per transgress.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_ag0_ad_crd_acquired, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_ag0_ad_crd_acquired), }, { .name = "UNC_M2_AG1_AD_CRD_OCCUPANCY", .code = 0x86, .desc = "Number of CMS Agent 1 AD credits in use in a given cycle, per transgress", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_ag0_ad_crd_occupancy, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_ag0_ad_crd_occupancy), }, { .name = "UNC_M2_AG1_BL_CRD_OCCUPANCY", .code = 0x8e, .desc = "Number of CMS Agent 1 BL credits in use in a given cycle, per transgress", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_ag0_ad_crd_occupancy, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_ag0_ad_crd_occupancy), }, { .name = "UNC_M2_AG1_BL_CREDITS_ACQUIRED", .code = 
0x8c, .desc = "Number of CMS Agent 1 BL credits acquired in a given cycle, per transgress.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_ag0_ad_crd_acquired, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_ag0_ad_crd_acquired), }, { .name = "UNC_M2_BYPASS_M2M_EGRESS", .code = 0x22, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_bypass_m2m_egress, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_bypass_m2m_egress), }, { .name = "UNC_M2_BYPASS_M2M_INGRESS", .code = 0x21, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_bypass_m2m_ingress, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_bypass_m2m_ingress), }, { .name = "UNC_M2_CLOCKTICKS", .code = 0x0, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_CMS_CLOCKTICKS", .code = 0xc0, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_DIRECT2CORE_NOT_TAKEN_DIRSTATE", .code = 0x24, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_DIRECT2CORE_TAKEN", .code = 0x23, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_DIRECT2CORE_TXN_OVERRIDE", .code = 0x25, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_DIRECT2UPI_NOT_TAKEN_CREDITS", .code = 0x28, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_DIRECT2UPI_NOT_TAKEN_DIRSTATE", .code = 0x27, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_DIRECT2UPI_TAKEN", .code = 0x26, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_DIRECT2UPI_TXN_OVERRIDE", .code = 0x29, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_DIRECTORY_HIT", .code = 0x2a, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_directory_hit, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_directory_hit), 
}, { .name = "UNC_M2_DIRECTORY_LOOKUP", .code = 0x2d, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_directory_lookup, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_directory_lookup), }, { .name = "UNC_M2_DIRECTORY_MISS", .code = 0x2b, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_directory_miss, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_directory_miss), }, { .name = "UNC_M2_DIRECTORY_UPDATE", .code = 0x2e, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_directory_update, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_directory_update), }, { .name = "UNC_M2_EGRESS_ORDERING", .code = 0xae, .desc = "Counts number of cycles IV was blocked in the TGR Egress due to SNP/GO Ordering requirements", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_egress_ordering, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_egress_ordering), }, { .name = "UNC_M2_FAST_ASSERTED", .code = 0xa5, .desc = "Counts the number of cycles either the local or incoming distress signals are asserted. Incoming distress includes up, dn and across.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_fast_asserted, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_fast_asserted), }, { .name = "UNC_M2_HORZ_RING_AD_IN_USE", .code = 0xa7, .desc = "Counts the number of cycles that the Horizontal AD ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. 
In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_horz_ring_ad_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_horz_ring_ad_in_use), }, { .name = "UNC_M2_HORZ_RING_AK_IN_USE", .code = 0xa9, .desc = "Counts the number of cycles that the Horizontal AK ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_horz_ring_ak_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_horz_ring_ak_in_use), }, { .name = "UNC_M2_HORZ_RING_BL_IN_USE", .code = 0xab, .desc = "Counts the number of cycles that the Horizontal BL ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring.
In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_horz_ring_bl_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_horz_ring_bl_in_use), }, { .name = "UNC_M2_HORZ_RING_IV_IN_USE", .code = 0xad, .desc = "Counts the number of cycles that the Horizontal IV ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. There is only 1 IV ring. Therefore, if one wants to monitor the Even ring, they should select both UP_EVEN and DN_EVEN. To monitor the Odd ring, they should select both UP_ODD and DN_ODD.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_horz_ring_iv_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_horz_ring_iv_in_use), }, { .name = "UNC_M2_IMC_READS", .code = 0x37, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_imc_reads, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_imc_reads), }, { .name = "UNC_M2_IMC_WRITES", .code = 0x38, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_imc_writes, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_imc_writes), }, { .name = "UNC_M2_PKT_MATCH", .code = 0x4c, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_pkt_match, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_pkt_match), }, { .name = "UNC_M2_PREFCAM_CYCLES_FULL", .code = 0x53, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_PREFCAM_CYCLES_NE", .code = 0x54, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_PREFCAM_DEMAND_PROMOTIONS", .code = 0x56, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_PREFCAM_INSERTS", .code = 0x57, .desc = "TBD", .modmsk =
SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_PREFCAM_OCCUPANCY", .code = 0x55, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_RING_BOUNCES_HORZ", .code = 0xa1, .desc = "Number of cycles incoming messages from the Horizontal ring that were bounced, by ring type.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_ring_bounces_horz, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_ring_bounces_horz), }, { .name = "UNC_M2_RING_BOUNCES_VERT", .code = 0xa0, .desc = "Number of cycles incoming messages from the Vertical ring that were bounced, by ring type.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_ring_bounces_vert, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_ring_bounces_vert), }, { .name = "UNC_M2_RING_SINK_STARVED_HORZ", .code = 0xa3, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_ring_sink_starved_horz, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_ring_sink_starved_horz), }, { .name = "UNC_M2_RING_SINK_STARVED_VERT", .code = 0xa2, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_ring_sink_starved_vert, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_ring_sink_starved_vert), }, { .name = "UNC_M2_RING_SRC_THRTL", .code = 0xa4, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_RPQ_CYCLES_REG_CREDITS", .code = 0x43, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_rpq_cycles_reg_credits, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_rpq_cycles_reg_credits), }, { .name = "UNC_M2_RPQ_CYCLES_SPEC_CREDITS", .code = 0x44, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_rpq_cycles_spec_credits, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_rpq_cycles_spec_credits), }, { .name = "UNC_M2_RXC_AD_CYCLES_FULL", .code = 0x4, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = 
"UNC_M2_RXC_AD_CYCLES_NE", .code = 0x3, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_RXC_AD_INSERTS", .code = 0x1, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_RXC_AD_OCCUPANCY", .code = 0x2, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_RXC_BL_CYCLES_FULL", .code = 0x8, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_RXC_BL_CYCLES_NE", .code = 0x7, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_RXC_BL_INSERTS", .code = 0x5, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_RXC_BL_OCCUPANCY", .code = 0x6, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_RXR_BUSY_STARVED", .code = 0xb4, .desc = "Counts cycles under injection starvation mode. This starvation is triggered when the CMS Ingress cannot send a transaction onto the mesh for a long period of time, in this case because a message from the other queue has higher priority.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_rxr_busy_starved, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_rxr_busy_starved), }, { .name = "UNC_M2_RXR_BYPASS", .code = 0xb2, .desc = "Number of packets bypassing the CMS Ingress", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_rxr_bypass, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_rxr_bypass), }, { .name = "UNC_M2_RXR_CRD_STARVED", .code = 0xb3, .desc = "Counts cycles under injection starvation mode. This starvation is triggered when the CMS Ingress cannot send a transaction onto the mesh for a long period of time.
In this case, the Ingress is unable to forward to the Egress due to a lack of credit.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_rxr_crd_starved, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_rxr_crd_starved), }, { .name = "UNC_M2_RXR_INSERTS", .code = 0xb1, .desc = "Number of allocations into the CMS Ingress. The Ingress is used to queue up requests received from the mesh.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_rxr_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_rxr_inserts), }, { .name = "UNC_M2_RXR_OCCUPANCY", .code = 0xb0, .desc = "Occupancy event for the Ingress buffers in the CMS. The Ingress is used to queue up requests received from the mesh.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_rxr_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_rxr_occupancy), }, { .name = "UNC_M2_STALL_NO_TXR_HORZ_CRD_AD_AG0", .code = 0xd0, .desc = "Number of cycles the AD Agent 0 Egress Buffer is stalled waiting for a TGR credit to become available, per transgress.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_stall_no_txr_horz_crd_ad_ag0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_stall_no_txr_horz_crd_ad_ag0), }, { .name = "UNC_M2_STALL_NO_TXR_HORZ_CRD_AD_AG1", .code = 0xd2, .desc = "Number of cycles the AD Agent 1 Egress Buffer is stalled waiting for a TGR credit to become available, per transgress.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_stall_no_txr_horz_crd_ad_ag1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_stall_no_txr_horz_crd_ad_ag1), }, { .name = "UNC_M2_STALL_NO_TXR_HORZ_CRD_BL_AG0", .code = 0xd4, .desc = "Number of cycles the BL Agent 0 Egress Buffer is stalled waiting for a TGR credit to become available, per transgress.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_stall_no_txr_horz_crd_bl_ag0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_stall_no_txr_horz_crd_bl_ag0),
}, { .name = "UNC_M2_STALL_NO_TXR_HORZ_CRD_BL_AG1", .code = 0xd6, .desc = "Number of cycles the BL Agent 1 Egress Buffer is stalled waiting for a TGR credit to become available, per transgress.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_stall_no_txr_horz_crd_bl_ag1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_stall_no_txr_horz_crd_bl_ag1), }, { .name = "UNC_M2_TGR_AD_CREDITS", .code = 0x41, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_TGR_BL_CREDITS", .code = 0x42, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_TRACKER_CYCLES_FULL", .code = 0x45, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_tracker_cycles_full, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_tracker_cycles_full), }, { .name = "UNC_M2_TRACKER_CYCLES_NE", .code = 0x46, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_tracker_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_tracker_cycles_ne), }, { .name = "UNC_M2_TRACKER_INSERTS", .code = 0x49, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_tracker_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_tracker_inserts), }, { .name = "UNC_M2_TRACKER_OCCUPANCY", .code = 0x47, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_tracker_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_tracker_occupancy), }, { .name = "UNC_M2_TRACKER_PENDING_OCCUPANCY", .code = 0x48, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_TXC_AD_CREDITS_ACQUIRED", .code = 0xd, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_TXC_AD_CREDIT_OCCUPANCY", .code = 0xe, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_TXC_AD_CYCLES_FULL", .code = 0xc, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = 
"UNC_M2_TXC_AD_CYCLES_NE", .code = 0xb, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_TXC_AD_INSERTS", .code = 0x9, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_TXC_AD_NO_CREDIT_CYCLES", .code = 0xf, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_TXC_AD_NO_CREDIT_STALLED", .code = 0x10, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_TXC_AD_OCCUPANCY", .code = 0xa, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M2_TXC_AK", .code = 0x39, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_ak, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_ak), }, { .name = "UNC_M2_TXC_AK_CREDITS_ACQUIRED", .code = 0x1d, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_ak_credits_acquired, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_ak_credits_acquired), }, { .name = "UNC_M2_TXC_AK_CREDIT_OCCUPANCY", .code = 0x1e, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_ak_credit_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_ak_credit_occupancy), }, { .name = "UNC_M2_TXC_AK_CYCLES_FULL", .code = 0x14, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_ak_cycles_full, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_ak_cycles_full), }, { .name = "UNC_M2_TXC_AK_CYCLES_NE", .code = 0x13, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_ak_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_ak_cycles_ne), }, { .name = "UNC_M2_TXC_AK_INSERTS", .code = 0x11, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_ak_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_ak_inserts), }, { .name = "UNC_M2_TXC_AK_NO_CREDIT_CYCLES", .code = 0x1f, .desc = "TBD", .modmsk = 
SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_ak_no_credit_cycles, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_ak_no_credit_cycles), }, { .name = "UNC_M2_TXC_AK_NO_CREDIT_STALLED", .code = 0x20, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_ak_no_credit_stalled, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_ak_no_credit_stalled), }, { .name = "UNC_M2_TXC_AK_OCCUPANCY", .code = 0x12, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_ak_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_ak_occupancy), }, { .name = "UNC_M2_TXC_AK_SIDEBAND", .code = 0x6b, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_ak_sideband, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_ak_sideband), }, { .name = "UNC_M2_TXC_BL", .code = 0x40, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_bl, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_bl), }, { .name = "UNC_M2_TXC_BL_CREDITS_ACQUIRED", .code = 0x19, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_bl_credits_acquired, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_bl_credits_acquired), }, { .name = "UNC_M2_TXC_BL_CREDIT_OCCUPANCY", .code = 0x1a, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_bl_credit_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_bl_credit_occupancy), }, { .name = "UNC_M2_TXC_BL_CYCLES_FULL", .code = 0x18, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_bl_cycles_full, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_bl_cycles_full), }, { .name = "UNC_M2_TXC_BL_CYCLES_NE", .code = 0x17, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_bl_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_bl_cycles_ne), }, { .name = 
"UNC_M2_TXC_BL_INSERTS", .code = 0x15, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_bl_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_bl_inserts), }, { .name = "UNC_M2_TXC_BL_NO_CREDIT_CYCLES", .code = 0x1b, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_bl_no_credit_cycles, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_bl_no_credit_cycles), }, { .name = "UNC_M2_TXC_BL_NO_CREDIT_STALLED", .code = 0x1c, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_bl_no_credit_stalled, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_bl_no_credit_stalled), }, { .name = "UNC_M2_TXC_BL_OCCUPANCY", .code = 0x16, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txc_bl_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txc_bl_occupancy), }, { .name = "UNC_M2_TXR_HORZ_ADS_USED", .code = 0x9d, .desc = "Number of packets using the Horizontal Anti-Deadlock Slot, broken down by ring type and CMS Agent.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txr_horz_ads_used, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txr_horz_ads_used), }, { .name = "UNC_M2_TXR_HORZ_BYPASS", .code = 0x9f, .desc = "Number of packets bypassing the Horizontal Egress, broken down by ring type and CMS Agent.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txr_horz_bypass, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txr_horz_bypass), }, { .name = "UNC_M2_TXR_HORZ_CYCLES_FULL", .code = 0x96, .desc = "Cycles the Transgress buffers in the Common Mesh Stop are Full. 
The egress is used to queue up requests destined for the Horizontal Ring on the Mesh.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txr_horz_cycles_full, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txr_horz_cycles_full), }, { .name = "UNC_M2_TXR_HORZ_CYCLES_NE", .code = 0x97, .desc = "Cycles the Transgress buffers in the Common Mesh Stop are Not-Empty. The egress is used to queue up requests destined for the Horizontal Ring on the Mesh.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txr_horz_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txr_horz_cycles_ne), }, { .name = "UNC_M2_TXR_HORZ_INSERTS", .code = 0x95, .desc = "Number of allocations into the Transgress buffers in the Common Mesh Stop. The egress is used to queue up requests destined for the Horizontal Ring on the Mesh.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txr_horz_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txr_horz_inserts), }, { .name = "UNC_M2_TXR_HORZ_NACK", .code = 0x99, .desc = "Counts number of Egress packets NACKed on to the Horizontal Ring", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txr_horz_nack, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txr_horz_nack), }, { .name = "UNC_M2_TXR_HORZ_OCCUPANCY", .code = 0x94, .desc = "Occupancy event for the Transgress buffers in the Common Mesh Stop. The egress is used to queue up requests destined for the Horizontal Ring on the Mesh.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txr_horz_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txr_horz_occupancy), }, { .name = "UNC_M2_TXR_HORZ_STARVED", .code = 0x9b, .desc = "Counts injection starvation.
This starvation is triggered when the CMS Transgress buffer cannot send a transaction onto the Horizontal ring for a long period of time.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txr_horz_starved, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txr_horz_starved), }, { .name = "UNC_M2_TXR_VERT_ADS_USED", .code = 0x9c, .desc = "Number of packets using the Vertical Anti-Deadlock Slot, broken down by ring type and CMS Agent.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txr_vert_ads_used, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txr_vert_ads_used), }, { .name = "UNC_M2_TXR_VERT_BYPASS", .code = 0x9e, .desc = "Number of packets bypassing the Vertical Egress, broken down by ring type and CMS Agent.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txr_vert_bypass, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txr_vert_bypass), }, { .name = "UNC_M2_TXR_VERT_CYCLES_FULL", .code = 0x92, .desc = "Number of cycles the Common Mesh Stop Egress was Full. The Egress is used to queue up requests destined for the Vertical Ring on the Mesh.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txr_vert_cycles_full, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txr_vert_cycles_full), }, { .name = "UNC_M2_TXR_VERT_CYCLES_NE", .code = 0x93, .desc = "Number of cycles the Common Mesh Stop Egress was Not Empty. The Egress is used to queue up requests destined for the Vertical Ring on the Mesh.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txr_vert_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txr_vert_cycles_ne), }, { .name = "UNC_M2_TXR_VERT_INSERTS", .code = 0x91, .desc = "Number of allocations into the Common Mesh Stop Egress.
The Egress is used to queue up requests destined for the Vertical Ring on the Mesh.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txr_vert_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txr_vert_inserts), }, { .name = "UNC_M2_TXR_VERT_NACK", .code = 0x98, .desc = "Counts number of Egress packets NACKed on to the Vertical Ring", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txr_vert_nack, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txr_vert_nack), }, { .name = "UNC_M2_TXR_VERT_OCCUPANCY", .code = 0x90, .desc = "Occupancy event for the Egress buffers in the Common Mesh Stop. The egress is used to queue up requests destined for the Vertical Ring on the Mesh.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txr_vert_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txr_vert_occupancy), }, { .name = "UNC_M2_TXR_VERT_STARVED", .code = 0x9a, .desc = "Counts injection starvation. This starvation is triggered when the CMS Egress cannot send a transaction onto the Vertical ring for a long period of time.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_txr_vert_starved, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_txr_vert_starved), }, { .name = "UNC_M2_VERT_RING_AD_IN_USE", .code = 0xa6, .desc = "Counts the number of cycles that the Vertical AD ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring.
In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_vert_ring_ad_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_vert_ring_ad_in_use), }, { .name = "UNC_M2_VERT_RING_AK_IN_USE", .code = 0xa8, .desc = "Counts the number of cycles that the Vertical AK ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_vert_ring_ak_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_vert_ring_ak_in_use), }, { .name = "UNC_M2_VERT_RING_BL_IN_USE", .code = 0xaa, .desc = "Counts the number of cycles that the Vertical BL ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring.
In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_vert_ring_bl_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_vert_ring_bl_in_use), }, { .name = "UNC_M2_VERT_RING_IV_IN_USE", .code = 0xac, .desc = "Counts the number of cycles that the Vertical IV ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. There is only 1 IV ring. Therefore, if one wants to monitor the Even ring, they should select both UP_EVEN and DN_EVEN. To monitor the Odd ring, they should select both UP_ODD and DN_ODD.", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_vert_ring_iv_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_vert_ring_iv_in_use), }, { .name = "UNC_M2_WPQ_CYCLES_REG_CREDITS", .code = 0x4d, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_wpq_cycles_reg_credits, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_wpq_cycles_reg_credits), }, { .name = "UNC_M2_WPQ_CYCLES_SPEC_CREDITS", .code = 0x4e, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_wpq_cycles_spec_credits, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_wpq_cycles_spec_credits), }, { .name = "UNC_M2_WRITE_TRACKER_CYCLES_FULL", .code = 0x4a, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_write_tracker_cycles_full, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_write_tracker_cycles_full), }, { .name = "UNC_M2_WRITE_TRACKER_CYCLES_NE", .code = 0x4b, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_write_tracker_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_write_tracker_cycles_ne), }, { .name = "UNC_M2_WRITE_TRACKER_INSERTS", .code = 0x61,
.desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_write_tracker_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_write_tracker_inserts), }, { .name = "UNC_M2_WRITE_TRACKER_OCCUPANCY", .code = 0x60, .desc = "TBD", .modmsk = SKX_UNC_M2M_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m2m_write_tracker_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m2m_write_tracker_occupancy), }, };
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_skx_unc_m3upi_events.h
/* * Copyright (c) 2017 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux.
* * PMU: skx_unc_m3upi */ static intel_x86_umask_t skx_unc_m3_ag0_ad_crd_acquired[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "CMS Agent0 Credits Acquired -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "CMS Agent0 Credits Acquired -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "CMS Agent0 Credits Acquired -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "CMS Agent0 Credits Acquired -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "CMS Agent0 Credits Acquired -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "CMS Agent0 Credits Acquired -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_m3_ag0_ad_crd_occupancy[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "CMS Agent0 Credits Occupancy -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "CMS Agent0 Credits Occupancy -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "CMS Agent0 Credits Occupancy -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "CMS Agent0 Credits Occupancy -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "CMS Agent0 Credits Occupancy -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "CMS Agent0 Credits Occupancy -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_m3_cha_ad_credits_empty[]={ { .uname = "REQ", .ucode = 0x400, .udesc = "CBox AD Credits Empty -- Requests", }, { .uname = "SNP", .ucode = 0x800, .udesc = "CBox AD Credits Empty -- Snoops", }, { .uname = "VNA", .ucode = 0x100, .udesc = "CBox AD Credits Empty -- VNA Messages", }, { .uname = "WB", .ucode = 0x200, .udesc = "CBox AD Credits Empty -- Writebacks", }, }; static intel_x86_umask_t skx_unc_m3_egress_ordering[]={ { .uname = "IV_SNOOPGO_DN", .ucode = 0x400, .udesc = "Egress Blocking due to Ordering requirements -- Down", }, { .uname = "IV_SNOOPGO_UP", .ucode = 0x100, .udesc = "Egress Blocking due to Ordering 
requirements -- Up", }, }; static intel_x86_umask_t skx_unc_m3_fast_asserted[]={ { .uname = "HORZ", .ucode = 0x200, .udesc = "FaST wire asserted -- Horizontal", }, { .uname = "VERT", .ucode = 0x100, .udesc = "FaST wire asserted -- Vertical", }, }; static intel_x86_umask_t skx_unc_m3_horz_ring_ad_in_use[]={ { .uname = "LEFT_EVEN", .ucode = 0x100, .udesc = "Horizontal AD Ring In Use -- Left and Even", }, { .uname = "LEFT_ODD", .ucode = 0x200, .udesc = "Horizontal AD Ring In Use -- Left and Odd", }, { .uname = "RIGHT_EVEN", .ucode = 0x400, .udesc = "Horizontal AD Ring In Use -- Right and Even", }, { .uname = "RIGHT_ODD", .ucode = 0x800, .udesc = "Horizontal AD Ring In Use -- Right and Odd", }, }; static intel_x86_umask_t skx_unc_m3_horz_ring_ak_in_use[]={ { .uname = "LEFT_EVEN", .ucode = 0x100, .udesc = "Horizontal AK Ring In Use -- Left and Even", }, { .uname = "LEFT_ODD", .ucode = 0x200, .udesc = "Horizontal AK Ring In Use -- Left and Odd", }, { .uname = "RIGHT_EVEN", .ucode = 0x400, .udesc = "Horizontal AK Ring In Use -- Right and Even", }, { .uname = "RIGHT_ODD", .ucode = 0x800, .udesc = "Horizontal AK Ring In Use -- Right and Odd", }, }; static intel_x86_umask_t skx_unc_m3_horz_ring_bl_in_use[]={ { .uname = "LEFT_EVEN", .ucode = 0x100, .udesc = "Horizontal BL Ring in Use -- Left and Even", }, { .uname = "LEFT_ODD", .ucode = 0x200, .udesc = "Horizontal BL Ring in Use -- Left and Odd", }, { .uname = "RIGHT_EVEN", .ucode = 0x400, .udesc = "Horizontal BL Ring in Use -- Right and Even", }, { .uname = "RIGHT_ODD", .ucode = 0x800, .udesc = "Horizontal BL Ring in Use -- Right and Odd", }, }; static intel_x86_umask_t skx_unc_m3_horz_ring_iv_in_use[]={ { .uname = "LEFT", .ucode = 0x100, .udesc = "Horizontal IV Ring in Use -- Left", }, { .uname = "RIGHT", .ucode = 0x400, .udesc = "Horizontal IV Ring in Use -- Right", }, }; static intel_x86_umask_t skx_unc_m3_m2_bl_credits_empty[]={ { .uname = "IIO0_IIO1_NCB", .ucode = 0x100, .udesc = "M2 BL Credits Empty -- IIO0 and IIO1 
share the same ring destination. (1 VN0 credit only)", }, { .uname = "IIO2_NCB", .ucode = 0x200, .udesc = "M2 BL Credits Empty -- IIO2", }, { .uname = "IIO3_NCB", .ucode = 0x400, .udesc = "M2 BL Credits Empty -- IIO3", }, { .uname = "IIO4_NCB", .ucode = 0x800, .udesc = "M2 BL Credits Empty -- IIO4", }, { .uname = "IIO5_NCB", .ucode = 0x1000, .udesc = "M2 BL Credits Empty -- IIO5", }, { .uname = "NCS", .ucode = 0x2000, .udesc = "M2 BL Credits Empty -- All IIO targets for NCS are in single mask. ORs them together", }, { .uname = "NCS_SEL", .ucode = 0x4000, .udesc = "M2 BL Credits Empty -- Selected M2p BL NCS credits", }, }; static intel_x86_umask_t skx_unc_m3_multi_slot_rcvd[]={ { .uname = "AD_SLOT0", .ucode = 0x100, .udesc = "Multi Slot Flit Received -- AD - Slot 0", }, { .uname = "AD_SLOT1", .ucode = 0x200, .udesc = "Multi Slot Flit Received -- AD - Slot 1", }, { .uname = "AD_SLOT2", .ucode = 0x400, .udesc = "Multi Slot Flit Received -- AD - Slot 2", }, { .uname = "AK_SLOT0", .ucode = 0x1000, .udesc = "Multi Slot Flit Received -- AK - Slot 0", }, { .uname = "AK_SLOT2", .ucode = 0x2000, .udesc = "Multi Slot Flit Received -- AK - Slot 2", }, { .uname = "BL_SLOT0", .ucode = 0x800, .udesc = "Multi Slot Flit Received -- BL - Slot 0", }, }; static intel_x86_umask_t skx_unc_m3_ring_bounces_horz[]={ { .uname = "AD", .ucode = 0x100, .udesc = "Messages that bounced on the Horizontal Ring. -- AD", }, { .uname = "AK", .ucode = 0x200, .udesc = "Messages that bounced on the Horizontal Ring. -- AK", }, { .uname = "BL", .ucode = 0x400, .udesc = "Messages that bounced on the Horizontal Ring. -- BL", }, { .uname = "IV", .ucode = 0x800, .udesc = "Messages that bounced on the Horizontal Ring. -- IV", }, }; static intel_x86_umask_t skx_unc_m3_ring_bounces_vert[]={ { .uname = "AD", .ucode = 0x100, .udesc = "Messages that bounced on the Vertical Ring. -- AD", }, { .uname = "AK", .ucode = 0x200, .udesc = "Messages that bounced on the Vertical Ring. 
-- Acknowledgements to core", }, { .uname = "BL", .ucode = 0x400, .udesc = "Messages that bounced on the Vertical Ring. -- Data Responses to core", }, { .uname = "IV", .ucode = 0x800, .udesc = "Messages that bounced on the Vertical Ring. -- Snoops of the processor's cache.", }, }; static intel_x86_umask_t skx_unc_m3_ring_sink_starved_horz[]={ { .uname = "AD", .ucode = 0x100, .udesc = "Sink Starvation on Horizontal Ring -- AD", }, { .uname = "AK", .ucode = 0x200, .udesc = "Sink Starvation on Horizontal Ring -- AK", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "Sink Starvation on Horizontal Ring -- Acknowledgements to Agent 1", }, { .uname = "BL", .ucode = 0x400, .udesc = "Sink Starvation on Horizontal Ring -- BL", }, { .uname = "IV", .ucode = 0x800, .udesc = "Sink Starvation on Horizontal Ring -- IV", }, }; static intel_x86_umask_t skx_unc_m3_ring_sink_starved_vert[]={ { .uname = "AD", .ucode = 0x100, .udesc = "Sink Starvation on Vertical Ring -- AD", }, { .uname = "AK", .ucode = 0x200, .udesc = "Sink Starvation on Vertical Ring -- Acknowledgements to core", }, { .uname = "BL", .ucode = 0x400, .udesc = "Sink Starvation on Vertical Ring -- Data Responses to core", }, { .uname = "IV", .ucode = 0x800, .udesc = "Sink Starvation on Vertical Ring -- Snoops of the processor's cache.", }, }; static intel_x86_umask_t skx_unc_m3_rxc_arb_lost_vn0[]={ { .uname = "AD_REQ", .ucode = 0x100, .udesc = "Lost Arb for VN0 -- REQ on AD", }, { .uname = "AD_RSP", .ucode = 0x400, .udesc = "Lost Arb for VN0 -- RSP on AD", }, { .uname = "AD_SNP", .ucode = 0x200, .udesc = "Lost Arb for VN0 -- SNP on AD", }, { .uname = "BL_NCB", .ucode = 0x2000, .udesc = "Lost Arb for VN0 -- NCB on BL", }, { .uname = "BL_NCS", .ucode = 0x4000, .udesc = "Lost Arb for VN0 -- NCS on BL", }, { .uname = "BL_RSP", .ucode = 0x800, .udesc = "Lost Arb for VN0 -- RSP on BL", }, { .uname = "BL_WB", .ucode = 0x1000, .udesc = "Lost Arb for VN0 -- WB on BL", }, }; static intel_x86_umask_t skx_unc_m3_rxc_arb_lost_vn1[]={ {
.uname = "AD_REQ", .ucode = 0x100, .udesc = "Lost Arb for VN1 -- REQ on AD", }, { .uname = "AD_RSP", .ucode = 0x400, .udesc = "Lost Arb for VN1 -- RSP on AD", }, { .uname = "AD_SNP", .ucode = 0x200, .udesc = "Lost Arb for VN1 -- SNP on AD", }, { .uname = "BL_NCB", .ucode = 0x2000, .udesc = "Lost Arb for VN1 -- NCB on BL", }, { .uname = "BL_NCS", .ucode = 0x4000, .udesc = "Lost Arb for VN1 -- NCS on BL", }, { .uname = "BL_RSP", .ucode = 0x800, .udesc = "Lost Arb for VN1 -- RSP on BL", }, { .uname = "BL_WB", .ucode = 0x1000, .udesc = "Lost Arb for VN1 -- WB on BL", }, }; static intel_x86_umask_t skx_unc_m3_rxc_arb_misc[]={ { .uname = "ADBL_PARALLEL_WIN", .ucode = 0x4000, .udesc = "Arb Miscellaneous -- AD, BL Parallel Win", }, { .uname = "NO_PROG_AD_VN0", .ucode = 0x400, .udesc = "Arb Miscellaneous -- No Progress on Pending AD VN0", }, { .uname = "NO_PROG_AD_VN1", .ucode = 0x800, .udesc = "Arb Miscellaneous -- No Progress on Pending AD VN1", }, { .uname = "NO_PROG_BL_VN0", .ucode = 0x1000, .udesc = "Arb Miscellaneous -- No Progress on Pending BL VN0", }, { .uname = "NO_PROG_BL_VN1", .ucode = 0x2000, .udesc = "Arb Miscellaneous -- No Progress on Pending BL VN1", }, { .uname = "PAR_BIAS_VN0", .ucode = 0x100, .udesc = "Arb Miscellaneous -- Parallel Bias to VN0", }, { .uname = "PAR_BIAS_VN1", .ucode = 0x200, .udesc = "Arb Miscellaneous -- Parallel Bias to VN1", }, }; static intel_x86_umask_t skx_unc_m3_rxc_arb_noad_req_vn0[]={ { .uname = "AD_REQ", .ucode = 0x100, .udesc = "Can't Arb for VN0 -- REQ on AD", }, { .uname = "AD_RSP", .ucode = 0x400, .udesc = "Can't Arb for VN0 -- RSP on AD", }, { .uname = "AD_SNP", .ucode = 0x200, .udesc = "Can't Arb for VN0 -- SNP on AD", }, { .uname = "BL_NCB", .ucode = 0x2000, .udesc = "Can't Arb for VN0 -- NCB on BL", }, { .uname = "BL_NCS", .ucode = 0x4000, .udesc = "Can't Arb for VN0 -- NCS on BL", }, { .uname = "BL_RSP", .ucode = 0x800, .udesc = "Can't Arb for VN0 -- RSP on BL", }, { .uname = "BL_WB", .ucode = 0x1000, .udesc = "Can't Arb for VN0 -- WB on BL", }, }; static intel_x86_umask_t skx_unc_m3_rxc_arb_noad_req_vn1[]={ { .uname = "AD_REQ", .ucode = 0x100, .udesc = "Can't Arb for VN1 -- REQ on AD", }, { .uname = "AD_RSP", .ucode = 0x400, .udesc = "Can't Arb for VN1 -- RSP on AD", }, { .uname = "AD_SNP", .ucode = 0x200, .udesc = "Can't Arb for VN1 -- SNP on AD", }, { .uname = "BL_NCB", .ucode = 0x2000, .udesc = "Can't Arb for VN1 -- NCB on BL", }, { .uname = "BL_NCS", .ucode = 0x4000, .udesc = "Can't Arb for VN1 -- NCS on BL", }, { .uname = "BL_RSP", .ucode = 0x800, .udesc = "Can't Arb for VN1 -- RSP on BL", }, { .uname = "BL_WB", .ucode = 0x1000, .udesc = "Can't Arb for VN1 -- WB on BL", }, }; static intel_x86_umask_t skx_unc_m3_rxc_arb_nocred_vn0[]={ { .uname = "AD_REQ", .ucode = 0x100, .udesc = "No Credits to Arb for VN0 -- REQ on AD", }, { .uname = "AD_RSP", .ucode = 0x400, .udesc = "No Credits to Arb for VN0 -- RSP on AD", }, { .uname = "AD_SNP", .ucode = 0x200, .udesc = "No Credits to Arb for VN0 -- SNP on AD", }, { .uname = "BL_NCB", .ucode = 0x2000, .udesc = "No Credits to Arb for VN0 -- NCB on BL", }, { .uname = "BL_NCS", .ucode = 0x4000, .udesc = "No Credits to Arb for VN0 -- NCS on BL", }, { .uname = "BL_RSP", .ucode = 0x800, .udesc = "No Credits to Arb for VN0 -- RSP on BL", }, { .uname = "BL_WB", .ucode = 0x1000, .udesc = "No Credits to Arb for VN0 -- WB on BL", }, }; static intel_x86_umask_t skx_unc_m3_rxc_arb_nocred_vn1[]={ { .uname = "AD_REQ", .ucode = 0x100, .udesc = "No Credits to Arb for VN1 -- REQ on AD", }, { .uname = "AD_RSP", .ucode = 0x400, .udesc = "No Credits to Arb for VN1 -- RSP on AD", }, { .uname = "AD_SNP", .ucode = 0x200, .udesc = "No Credits to Arb for VN1 -- SNP on AD", }, { .uname = "BL_NCB", .ucode = 0x2000, .udesc = "No Credits to Arb for VN1 -- NCB on BL", }, { .uname = "BL_NCS", .ucode = 0x4000, .udesc = "No Credits to Arb for VN1 -- NCS on BL", }, { .uname = "BL_RSP", .ucode = 0x800, .udesc = "No Credits to Arb for VN1 -- RSP on BL", }, { .uname =
"BL_WB", .ucode = 0x1000, .udesc = "No Credits to Arb for VN1 -- WB on BL", }, }; static intel_x86_umask_t skx_unc_m3_rxc_bypassed[]={ { .uname = "AD_S0_BL_ARB", .ucode = 0x200, .udesc = "Ingress Queue Bypasses -- AD to Slot 0 on BL Arb", }, { .uname = "AD_S0_IDLE", .ucode = 0x100, .udesc = "Ingress Queue Bypasses -- AD to Slot 0 on Idle", }, { .uname = "AD_S1_BL_SLOT", .ucode = 0x400, .udesc = "Ingress Queue Bypasses -- AD + BL to Slot 1", }, { .uname = "AD_S2_BL_SLOT", .ucode = 0x800, .udesc = "Ingress Queue Bypasses -- AD + BL to Slot 2", }, }; static intel_x86_umask_t skx_unc_m3_rxc_collision_vn0[]={ { .uname = "AD_REQ", .ucode = 0x100, .udesc = "VN0 message lost contest for flit -- REQ on AD", }, { .uname = "AD_RSP", .ucode = 0x400, .udesc = "VN0 message lost contest for flit -- RSP on AD", }, { .uname = "AD_SNP", .ucode = 0x200, .udesc = "VN0 message lost contest for flit -- SNP on AD", }, { .uname = "BL_NCB", .ucode = 0x2000, .udesc = "VN0 message lost contest for flit -- NCB on BL", }, { .uname = "BL_NCS", .ucode = 0x4000, .udesc = "VN0 message lost contest for flit -- NCS on BL", }, { .uname = "BL_RSP", .ucode = 0x800, .udesc = "VN0 message lost contest for flit -- RSP on BL", }, { .uname = "BL_WB", .ucode = 0x1000, .udesc = "VN0 message lost contest for flit -- WB on BL", }, }; static intel_x86_umask_t skx_unc_m3_rxc_collision_vn1[]={ { .uname = "AD_REQ", .ucode = 0x100, .udesc = "VN1 message lost contest for flit -- REQ on AD", }, { .uname = "AD_RSP", .ucode = 0x400, .udesc = "VN1 message lost contest for flit -- RSP on AD", }, { .uname = "AD_SNP", .ucode = 0x200, .udesc = "VN1 message lost contest for flit -- SNP on AD", }, { .uname = "BL_NCB", .ucode = 0x2000, .udesc = "VN1 message lost contest for flit -- NCB on BL", }, { .uname = "BL_NCS", .ucode = 0x4000, .udesc = "VN1 message lost contest for flit -- NCS on BL", }, { .uname = "BL_RSP", .ucode = 0x800, .udesc = "VN1 message lost contest for flit -- RSP on BL", }, { .uname = "BL_WB", .ucode = 0x1000, 
.udesc = "VN1 message lost contest for flit -- WB on BL", }, }; static intel_x86_umask_t skx_unc_m3_rxc_crd_misc[]={ { .uname = "ANY_BGF_FIFO", .ucode = 0x100, .udesc = "Miscellaneous Credit Events -- Any In BGF FIFO", }, { .uname = "ANY_BGF_PATH", .ucode = 0x200, .udesc = "Miscellaneous Credit Events -- Any in BGF Path", }, { .uname = "NO_D2K_FOR_ARB", .ucode = 0x400, .udesc = "Miscellaneous Credit Events -- No D2K For Arb", }, }; static intel_x86_umask_t skx_unc_m3_rxc_crd_occ[]={ { .uname = "D2K_CRD", .ucode = 0x1000, .udesc = "Credit Occupancy -- D2K Credits", }, { .uname = "FLITS_IN_FIFO", .ucode = 0x200, .udesc = "Credit Occupancy -- Packets in BGF FIFO", }, { .uname = "FLITS_IN_PATH", .ucode = 0x400, .udesc = "Credit Occupancy -- Packets in BGF Path", }, { .uname = "P1P_FIFO", .ucode = 0x4000, .udesc = "Credit Occupancy -- ", }, { .uname = "P1P_TOTAL", .ucode = 0x2000, .udesc = "Credit Occupancy -- ", }, { .uname = "TxQ_CRD", .ucode = 0x800, .udesc = "Credit Occupancy -- Transmit Credits", }, { .uname = "VNA_IN_USE", .ucode = 0x100, .udesc = "Credit Occupancy -- VNA In Use", }, }; static intel_x86_umask_t skx_unc_m3_rxc_cycles_ne_vn0[]={ { .uname = "AD_REQ", .ucode = 0x100, .udesc = "VN0 Ingress (from CMS) Queue - Cycles Not Empty -- REQ on AD", }, { .uname = "AD_RSP", .ucode = 0x400, .udesc = "VN0 Ingress (from CMS) Queue - Cycles Not Empty -- RSP on AD", }, { .uname = "AD_SNP", .ucode = 0x200, .udesc = "VN0 Ingress (from CMS) Queue - Cycles Not Empty -- SNP on AD", }, { .uname = "BL_NCB", .ucode = 0x2000, .udesc = "VN0 Ingress (from CMS) Queue - Cycles Not Empty -- NCB on BL", }, { .uname = "BL_NCS", .ucode = 0x4000, .udesc = "VN0 Ingress (from CMS) Queue - Cycles Not Empty -- NCS on BL", }, { .uname = "BL_RSP", .ucode = 0x800, .udesc = "VN0 Ingress (from CMS) Queue - Cycles Not Empty -- RSP on BL", }, { .uname = "BL_WB", .ucode = 0x1000, .udesc = "VN0 Ingress (from CMS) Queue - Cycles Not Empty -- WB on BL", }, }; static intel_x86_umask_t 
skx_unc_m3_rxc_cycles_ne_vn1[]={ { .uname = "AD_REQ", .ucode = 0x100, .udesc = "VN1 Ingress (from CMS) Queue - Cycles Not Empty -- REQ on AD", }, { .uname = "AD_RSP", .ucode = 0x400, .udesc = "VN1 Ingress (from CMS) Queue - Cycles Not Empty -- RSP on AD", }, { .uname = "AD_SNP", .ucode = 0x200, .udesc = "VN1 Ingress (from CMS) Queue - Cycles Not Empty -- SNP on AD", }, { .uname = "BL_NCB", .ucode = 0x2000, .udesc = "VN1 Ingress (from CMS) Queue - Cycles Not Empty -- NCB on BL", }, { .uname = "BL_NCS", .ucode = 0x4000, .udesc = "VN1 Ingress (from CMS) Queue - Cycles Not Empty -- NCS on BL", }, { .uname = "BL_RSP", .ucode = 0x800, .udesc = "VN1 Ingress (from CMS) Queue - Cycles Not Empty -- RSP on BL", }, { .uname = "BL_WB", .ucode = 0x1000, .udesc = "VN1 Ingress (from CMS) Queue - Cycles Not Empty -- WB on BL", }, }; static intel_x86_umask_t skx_unc_m3_rxc_flits_data_not_sent[]={ { .uname = "ALL", .ucode = 0x100, .udesc = "Data Flit Not Sent -- All", .uflags = INTEL_X86_DFL, }, { .uname = "NO_BGF", .ucode = 0x200, .udesc = "Data Flit Not Sent -- No BGF Credits", }, { .uname = "NO_TXQ", .ucode = 0x400, .udesc = "Data Flit Not Sent -- No TxQ Credits", }, }; static intel_x86_umask_t skx_unc_m3_rxc_flits_gen_bl[]={ { .uname = "P0_WAIT", .ucode = 0x100, .udesc = "Generating BL Data Flit Sequence -- Wait on Pump 0", }, { .uname = "P1P_AT_LIMIT", .ucode = 0x1000, .udesc = "Generating BL Data Flit Sequence -- ", }, { .uname = "P1P_BUSY", .ucode = 0x800, .udesc = "Generating BL Data Flit Sequence -- ", }, { .uname = "P1P_FIFO_FULL", .ucode = 0x4000, .udesc = "Generating BL Data Flit Sequence -- ", }, { .uname = "P1P_HOLD_P0", .ucode = 0x2000, .udesc = "Generating BL Data Flit Sequence -- ", }, { .uname = "P1P_TO_LIMBO", .ucode = 0x400, .udesc = "Generating BL Data Flit Sequence -- ", }, { .uname = "P1_WAIT", .ucode = 0x200, .udesc = "Generating BL Data Flit Sequence -- Wait on Pump 1", }, }; static intel_x86_umask_t skx_unc_m3_rxc_flits_sent[]={ { .uname = "1_MSG", .ucode = 
0x100, .udesc = "Sent Header Flit -- One Message", }, { .uname = "1_MSG_VNX", .ucode = 0x800, .udesc = "Sent Header Flit -- One Message in non-VNA", }, { .uname = "2_MSGS", .ucode = 0x200, .udesc = "Sent Header Flit -- Two Messages", }, { .uname = "3_MSGS", .ucode = 0x400, .udesc = "Sent Header Flit -- Three Messages", }, { .uname = "SLOTS_1", .ucode = 0x1000, .udesc = "Sent Header Flit -- ", }, { .uname = "SLOTS_2", .ucode = 0x2000, .udesc = "Sent Header Flit -- ", }, { .uname = "SLOTS_3", .ucode = 0x4000, .udesc = "Sent Header Flit -- ", }, }; static intel_x86_umask_t skx_unc_m3_rxc_flits_slot_bl[]={ { .uname = "ALL", .ucode = 0x100, .udesc = "Slotting BL Message Into Header Flit -- All", .uflags = INTEL_X86_DFL, }, { .uname = "NEED_DATA", .ucode = 0x200, .udesc = "Slotting BL Message Into Header Flit -- Needs Data Flit", }, { .uname = "P0_WAIT", .ucode = 0x400, .udesc = "Slotting BL Message Into Header Flit -- Wait on Pump 0", }, { .uname = "P1_NOT_REQ", .ucode = 0x1000, .udesc = "Slotting BL Message Into Header Flit -- Don't Need Pump 1", }, { .uname = "P1_NOT_REQ_BUT_BUBBLE", .ucode = 0x2000, .udesc = "Slotting BL Message Into Header Flit -- Don't Need Pump 1 - Bubble", }, { .uname = "P1_NOT_REQ_NOT_AVAIL", .ucode = 0x4000, .udesc = "Slotting BL Message Into Header Flit -- Don't Need Pump 1 - Not Avail", }, { .uname = "P1_WAIT", .ucode = 0x800, .udesc = "Slotting BL Message Into Header Flit -- Wait on Pump 1", }, }; static intel_x86_umask_t skx_unc_m3_rxc_flit_gen_hdr1[]={ { .uname = "ACCUM", .ucode = 0x100, .udesc = "Flit Gen - Header 1 -- Accumulate", }, { .uname = "ACCUM_READ", .ucode = 0x200, .udesc = "Flit Gen - Header 1 -- Accumulate Ready", }, { .uname = "ACCUM_WASTED", .ucode = 0x400, .udesc = "Flit Gen - Header 1 -- Accumulate Wasted", }, { .uname = "AHEAD_BLOCKED", .ucode = 0x800, .udesc = "Flit Gen - Header 1 -- Run-Ahead - Blocked", }, { .uname = "AHEAD_MSG", .ucode = 0x1000, .udesc = "Flit Gen - Header 1 -- Run-Ahead - Message", }, { .uname = 
"PAR", .ucode = 0x2000, .udesc = "Flit Gen - Header 1 -- Parallel Ok", }, { .uname = "PAR_FLIT", .ucode = 0x8000, .udesc = "Flit Gen - Header 1 -- Parallel Flit Finished", }, { .uname = "PAR_MSG", .ucode = 0x4000, .udesc = "Flit Gen - Header 1 -- Parallel Message", }, }; static intel_x86_umask_t skx_unc_m3_rxc_flit_gen_hdr2[]={ { .uname = "RMSTALL", .ucode = 0x100, .udesc = "Flit Gen - Header 2 -- Rate-matching Stall", }, { .uname = "RMSTALL_NOMSG", .ucode = 0x200, .udesc = "Flit Gen - Header 2 -- Rate-matching Stall - No Message", }, }; static intel_x86_umask_t skx_unc_m3_rxc_flit_not_sent[]={ { .uname = "ALL", .ucode = 0x100, .udesc = "Header Not Sent -- All", .uflags = INTEL_X86_DFL, }, { .uname = "NO_BGF_CRD", .ucode = 0x200, .udesc = "Header Not Sent -- No BGF Credits", }, { .uname = "NO_BGF_NO_MSG", .ucode = 0x800, .udesc = "Header Not Sent -- No BGF Credits + No Extra Message Slotted", }, { .uname = "NO_TXQ_CRD", .ucode = 0x400, .udesc = "Header Not Sent -- No TxQ Credits", }, { .uname = "NO_TXQ_NO_MSG", .ucode = 0x1000, .udesc = "Header Not Sent -- No TxQ Credits + No Extra Message Slotted", }, { .uname = "ONE_TAKEN", .ucode = 0x2000, .udesc = "Header Not Sent -- Sent - One Slot Taken", }, { .uname = "THREE_TAKEN", .ucode = 0x8000, .udesc = "Header Not Sent -- Sent - Three Slots Taken", }, { .uname = "TWO_TAKEN", .ucode = 0x4000, .udesc = "Header Not Sent -- Sent - Two Slots Taken", }, }; static intel_x86_umask_t skx_unc_m3_rxc_held[]={ { .uname = "CANT_SLOT_AD", .ucode = 0x4000, .udesc = "Message Held -- Cant Slot AAD", }, { .uname = "CANT_SLOT_BL", .ucode = 0x8000, .udesc = "Message Held -- Cant Slot BBL", }, { .uname = "PARALLEL_AD_LOST", .ucode = 0x1000, .udesc = "Message Held -- Parallel AD Lost", }, { .uname = "PARALLEL_ATTEMPT", .ucode = 0x400, .udesc = "Message Held -- Parallel Attempt", }, { .uname = "PARALLEL_BL_LOST", .ucode = 0x2000, .udesc = "Message Held -- Parallel BL Lost", }, { .uname = "PARALLEL_SUCCESS", .ucode = 0x800, .udesc = "Message 
Held -- Parallel Success", }, { .uname = "VN0", .ucode = 0x100, .udesc = "Message Held -- VN0", }, { .uname = "VN1", .ucode = 0x200, .udesc = "Message Held -- VN1", }, }; static intel_x86_umask_t skx_unc_m3_rxc_inserts_vn0[]={ { .uname = "AD_REQ", .ucode = 0x100, .udesc = "VN0 Ingress (from CMS) Queue - Inserts -- REQ on AD", }, { .uname = "AD_RSP", .ucode = 0x400, .udesc = "VN0 Ingress (from CMS) Queue - Inserts -- RSP on AD", }, { .uname = "AD_SNP", .ucode = 0x200, .udesc = "VN0 Ingress (from CMS) Queue - Inserts -- SNP on AD", }, { .uname = "BL_NCB", .ucode = 0x2000, .udesc = "VN0 Ingress (from CMS) Queue - Inserts -- NCB on BL", }, { .uname = "BL_NCS", .ucode = 0x4000, .udesc = "VN0 Ingress (from CMS) Queue - Inserts -- NCS on BL", }, { .uname = "BL_RSP", .ucode = 0x800, .udesc = "VN0 Ingress (from CMS) Queue - Inserts -- RSP on BL", }, { .uname = "BL_WB", .ucode = 0x1000, .udesc = "VN0 Ingress (from CMS) Queue - Inserts -- WB on BL", }, }; static intel_x86_umask_t skx_unc_m3_rxc_inserts_vn1[]={ { .uname = "AD_REQ", .ucode = 0x100, .udesc = "VN1 Ingress (from CMS) Queue - Inserts -- REQ on AD", }, { .uname = "AD_RSP", .ucode = 0x400, .udesc = "VN1 Ingress (from CMS) Queue - Inserts -- RSP on AD", }, { .uname = "AD_SNP", .ucode = 0x200, .udesc = "VN1 Ingress (from CMS) Queue - Inserts -- SNP on AD", }, { .uname = "BL_NCB", .ucode = 0x2000, .udesc = "VN1 Ingress (from CMS) Queue - Inserts -- NCB on BL", }, { .uname = "BL_NCS", .ucode = 0x4000, .udesc = "VN1 Ingress (from CMS) Queue - Inserts -- NCS on BL", }, { .uname = "BL_RSP", .ucode = 0x800, .udesc = "VN1 Ingress (from CMS) Queue - Inserts -- RSP on BL", }, { .uname = "BL_WB", .ucode = 0x1000, .udesc = "VN1 Ingress (from CMS) Queue - Inserts -- WB on BL", }, }; static intel_x86_umask_t skx_unc_m3_rxc_occupancy_vn0[]={ { .uname = "AD_REQ", .ucode = 0x100, .udesc = "VN0 Ingress (from CMS) Queue - Occupancy -- REQ on AD", }, { .uname = "AD_RSP", .ucode = 0x400, .udesc = "VN0 Ingress (from CMS) Queue - Occupancy 
-- RSP on AD", }, { .uname = "AD_SNP", .ucode = 0x200, .udesc = "VN0 Ingress (from CMS) Queue - Occupancy -- SNP on AD", }, { .uname = "BL_NCB", .ucode = 0x2000, .udesc = "VN0 Ingress (from CMS) Queue - Occupancy -- NCB on BL", }, { .uname = "BL_NCS", .ucode = 0x4000, .udesc = "VN0 Ingress (from CMS) Queue - Occupancy -- NCS on BL", }, { .uname = "BL_RSP", .ucode = 0x800, .udesc = "VN0 Ingress (from CMS) Queue - Occupancy -- RSP on BL", }, { .uname = "BL_WB", .ucode = 0x1000, .udesc = "VN0 Ingress (from CMS) Queue - Occupancy -- WB on BL", }, }; static intel_x86_umask_t skx_unc_m3_rxc_occupancy_vn1[]={ { .uname = "AD_REQ", .ucode = 0x100, .udesc = "VN1 Ingress (from CMS) Queue - Occupancy -- REQ on AD", }, { .uname = "AD_RSP", .ucode = 0x400, .udesc = "VN1 Ingress (from CMS) Queue - Occupancy -- RSP on AD", }, { .uname = "AD_SNP", .ucode = 0x200, .udesc = "VN1 Ingress (from CMS) Queue - Occupancy -- SNP on AD", }, { .uname = "BL_NCB", .ucode = 0x2000, .udesc = "VN1 Ingress (from CMS) Queue - Occupancy -- NCB on BL", }, { .uname = "BL_NCS", .ucode = 0x4000, .udesc = "VN1 Ingress (from CMS) Queue - Occupancy -- NCS on BL", }, { .uname = "BL_RSP", .ucode = 0x800, .udesc = "VN1 Ingress (from CMS) Queue - Occupancy -- RSP on BL", }, { .uname = "BL_WB", .ucode = 0x1000, .udesc = "VN1 Ingress (from CMS) Queue - Occupancy -- WB on BL", }, }; static intel_x86_umask_t skx_unc_m3_rxc_packing_miss_vn0[]={ { .uname = "AD_REQ", .ucode = 0x100, .udesc = "VN0 message cant slot into flit -- REQ on AAD", }, { .uname = "AD_RSP", .ucode = 0x400, .udesc = "VN0 message cant slot into flit -- RSP on AAD", }, { .uname = "AD_SNP", .ucode = 0x200, .udesc = "VN0 message cant slot into flit -- SNP on AAD", }, { .uname = "BL_NCB", .ucode = 0x2000, .udesc = "VN0 message cant slot into flit -- NCB on BBL", }, { .uname = "BL_NCS", .ucode = 0x4000, .udesc = "VN0 message cant slot into flit -- NCS on BBL", }, { .uname = "BL_RSP", .ucode = 0x800, .udesc = "VN0 message cant slot into flit -- RSP on 
BBL", }, { .uname = "BL_WB", .ucode = 0x1000, .udesc = "VN0 message cant slot into flit -- WB on BBL", }, }; static intel_x86_umask_t skx_unc_m3_rxc_packing_miss_vn1[]={ { .uname = "AD_REQ", .ucode = 0x100, .udesc = "VN1 message cant slot into flit -- REQ on AAD", }, { .uname = "AD_RSP", .ucode = 0x400, .udesc = "VN1 message cant slot into flit -- RSP on AAD", }, { .uname = "AD_SNP", .ucode = 0x200, .udesc = "VN1 message cant slot into flit -- SNP on AAD", }, { .uname = "BL_NCB", .ucode = 0x2000, .udesc = "VN1 message cant slot into flit -- NCB on BBL", }, { .uname = "BL_NCS", .ucode = 0x4000, .udesc = "VN1 message cant slot into flit -- NCS on BBL", }, { .uname = "BL_RSP", .ucode = 0x800, .udesc = "VN1 message cant slot into flit -- RSP on BBL", }, { .uname = "BL_WB", .ucode = 0x1000, .udesc = "VN1 message cant slot into flit -- WB on BBL", }, }; static intel_x86_umask_t skx_unc_m3_rxc_smi3_pftch[]={ { .uname = "ARB_LOST", .ucode = 0x200, .udesc = "SMI3 Prefetch Messages -- Lost Arbitration", }, { .uname = "ARRIVED", .ucode = 0x100, .udesc = "SMI3 Prefetch Messages -- Arrived", }, { .uname = "DROP_OLD", .ucode = 0x800, .udesc = "SMI3 Prefetch Messages -- Dropped - Old", }, { .uname = "DROP_WRAP", .ucode = 0x1000, .udesc = "SMI3 Prefetch Messages -- Dropped - Wrap", }, { .uname = "SLOTTED", .ucode = 0x400, .udesc = "SMI3 Prefetch Messages -- Slotted", }, }; static intel_x86_umask_t skx_unc_m3_rxc_vna_crd[]={ { .uname = "ANY_IN_USE", .ucode = 0x2000, .udesc = "Remote VNA Credits -- Any In Use", }, { .uname = "CORRECTED", .ucode = 0x200, .udesc = "Remote VNA Credits -- Corrected", }, { .uname = "LT1", .ucode = 0x400, .udesc = "Remote VNA Credits -- Level < 1", }, { .uname = "LT4", .ucode = 0x800, .udesc = "Remote VNA Credits -- Level < 4", }, { .uname = "LT5", .ucode = 0x1000, .udesc = "Remote VNA Credits -- Level < 5", }, { .uname = "USED", .ucode = 0x100, .udesc = "Remote VNA Credits -- Used", }, }; static intel_x86_umask_t skx_unc_m3_rxr_busy_starved[]={ { .uname 
= "AD_BNC", .ucode = 0x100, .udesc = "Transgress Injection Starvation -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Transgress Injection Starvation -- AD - Credit", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Transgress Injection Starvation -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Transgress Injection Starvation -- BL - Credit", }, }; static intel_x86_umask_t skx_unc_m3_rxr_bypass[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Transgress Ingress Bypass -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Transgress Ingress Bypass -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Transgress Ingress Bypass -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Transgress Ingress Bypass -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Transgress Ingress Bypass -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Transgress Ingress Bypass -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m3_rxr_crd_starved[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Transgress Injection Starvation -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Transgress Injection Starvation -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Transgress Injection Starvation -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Transgress Injection Starvation -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Transgress Injection Starvation -- BL - Credit", }, { .uname = "IFV", .ucode = 0x8000, .udesc = "Transgress Injection Starvation -- IFV - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Transgress Injection Starvation -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m3_rxr_inserts[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Transgress Ingress Allocations -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Transgress Ingress Allocations -- AD - 
Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Transgress Ingress Allocations -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Transgress Ingress Allocations -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Transgress Ingress Allocations -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Transgress Ingress Allocations -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m3_rxr_occupancy[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Transgress Ingress Occupancy -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Transgress Ingress Occupancy -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Transgress Ingress Occupancy -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Transgress Ingress Occupancy -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Transgress Ingress Occupancy -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Transgress Ingress Occupancy -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m3_stall_no_txr_horz_crd_ad_ag0[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "Stall on No AD Agent0 Transgress Credits -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_m3_stall_no_txr_horz_crd_ad_ag1[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, 
.udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "Stall on No AD Agent1 Transgress Credits -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_m3_stall_no_txr_horz_crd_bl_ag0[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "Stall on No BL Agent0 Transgress Credits -- For Transgress 5", }, }; static intel_x86_umask_t skx_unc_m3_stall_no_txr_horz_crd_bl_ag1[]={ { .uname = "TGR0", .ucode = 0x100, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 0", }, { .uname = "TGR1", .ucode = 0x200, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 1", }, { .uname = "TGR2", .ucode = 0x400, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 2", }, { .uname = "TGR3", .ucode = 0x800, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 3", }, { .uname = "TGR4", .ucode = 0x1000, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 4", }, { .uname = "TGR5", .ucode = 0x2000, .udesc = "Stall on No BL Agent1 Transgress Credits -- For Transgress 5", 
}, }; static intel_x86_umask_t skx_unc_m3_txc_ad_arb_fail[]={ { .uname = "VN0_REQ", .ucode = 0x100, .udesc = "Failed ARB for AD -- VN0 REQ Messages", }, { .uname = "VN0_RSP", .ucode = 0x400, .udesc = "Failed ARB for AD -- VN0 RSP Messages", }, { .uname = "VN0_SNP", .ucode = 0x200, .udesc = "Failed ARB for AD -- VN0 SNP Messages", }, { .uname = "VN0_WB", .ucode = 0x800, .udesc = "Failed ARB for AD -- VN0 WB Messages", }, { .uname = "VN1_REQ", .ucode = 0x1000, .udesc = "Failed ARB for AD -- VN1 REQ Messages", }, { .uname = "VN1_RSP", .ucode = 0x4000, .udesc = "Failed ARB for AD -- VN1 RSP Messages", }, { .uname = "VN1_SNP", .ucode = 0x2000, .udesc = "Failed ARB for AD -- VN1 SNP Messages", }, { .uname = "VN1_WB", .ucode = 0x8000, .udesc = "Failed ARB for AD -- VN1 WB Messages", }, }; static intel_x86_umask_t skx_unc_m3_txc_ad_flq_bypass[]={ { .uname = "AD_SLOT0", .ucode = 0x100, .udesc = "AD FlowQ Bypass -- ", }, { .uname = "AD_SLOT1", .ucode = 0x200, .udesc = "AD FlowQ Bypass -- ", }, { .uname = "AD_SLOT2", .ucode = 0x400, .udesc = "AD FlowQ Bypass -- ", }, { .uname = "BL_EARLY_RSP", .ucode = 0x800, .udesc = "AD FlowQ Bypass -- ", }, }; static intel_x86_umask_t skx_unc_m3_txc_ad_flq_cycles_ne[]={ { .uname = "VN0_REQ", .ucode = 0x100, .udesc = "AD Flow Q Not Empty -- VN0 REQ Messages", }, { .uname = "VN0_RSP", .ucode = 0x400, .udesc = "AD Flow Q Not Empty -- VN0 RSP Messages", }, { .uname = "VN0_SNP", .ucode = 0x200, .udesc = "AD Flow Q Not Empty -- VN0 SNP Messages", }, { .uname = "VN0_WB", .ucode = 0x800, .udesc = "AD Flow Q Not Empty -- VN0 WB Messages", }, { .uname = "VN1_REQ", .ucode = 0x1000, .udesc = "AD Flow Q Not Empty -- VN1 REQ Messages", }, { .uname = "VN1_RSP", .ucode = 0x4000, .udesc = "AD Flow Q Not Empty -- VN1 RSP Messages", }, { .uname = "VN1_SNP", .ucode = 0x2000, .udesc = "AD Flow Q Not Empty -- VN1 SNP Messages", }, { .uname = "VN1_WB", .ucode = 0x8000, .udesc = "AD Flow Q Not Empty -- VN1 WB Messages", }, }; static intel_x86_umask_t 
skx_unc_m3_txc_ad_flq_inserts[]={ { .uname = "VN0_REQ", .ucode = 0x100, .udesc = "AD Flow Q Inserts -- VN0 REQ Messages", }, { .uname = "VN0_RSP", .ucode = 0x400, .udesc = "AD Flow Q Inserts -- VN0 RSP Messages", }, { .uname = "VN0_SNP", .ucode = 0x200, .udesc = "AD Flow Q Inserts -- VN0 SNP Messages", }, { .uname = "VN0_WB", .ucode = 0x800, .udesc = "AD Flow Q Inserts -- VN0 WB Messages", }, { .uname = "VN1_REQ", .ucode = 0x1000, .udesc = "AD Flow Q Inserts -- VN1 REQ Messages", }, { .uname = "VN1_RSP", .ucode = 0x4000, .udesc = "AD Flow Q Inserts -- VN1 RSP Messages", }, { .uname = "VN1_SNP", .ucode = 0x2000, .udesc = "AD Flow Q Inserts -- VN1 SNP Messages", }, }; static intel_x86_umask_t skx_unc_m3_txc_ad_flq_occupancy[]={ { .uname = "VN0_REQ", .ucode = 0x100, .udesc = "AD Flow Q Occupancy -- VN0 REQ Messages", }, { .uname = "VN0_RSP", .ucode = 0x400, .udesc = "AD Flow Q Occupancy -- VN0 RSP Messages", }, { .uname = "VN0_SNP", .ucode = 0x200, .udesc = "AD Flow Q Occupancy -- VN0 SNP Messages", }, { .uname = "VN0_WB", .ucode = 0x800, .udesc = "AD Flow Q Occupancy -- VN0 WB Messages", }, { .uname = "VN1_REQ", .ucode = 0x1000, .udesc = "AD Flow Q Occupancy -- VN1 REQ Messages", }, { .uname = "VN1_RSP", .ucode = 0x4000, .udesc = "AD Flow Q Occupancy -- VN1 RSP Messages", }, { .uname = "VN1_SNP", .ucode = 0x2000, .udesc = "AD Flow Q Occupancy -- VN1 SNP Messages", }, }; static intel_x86_umask_t skx_unc_m3_txc_ad_snpf_grp1_vn1[]={ { .uname = "VN0_CHA", .ucode = 0x400, .udesc = "Number of Snoop Targets -- CHA on VN0", }, { .uname = "VN0_NON_IDLE", .ucode = 0x4000, .udesc = "Number of Snoop Targets -- Non Idle cycles on VN0", }, { .uname = "VN0_PEER_UPI0", .ucode = 0x100, .udesc = "Number of Snoop Targets -- Peer UPI0 on VN0", }, { .uname = "VN0_PEER_UPI1", .ucode = 0x200, .udesc = "Number of Snoop Targets -- Peer UPI1 on VN0", }, { .uname = "VN1_CHA", .ucode = 0x2000, .udesc = "Number of Snoop Targets -- CHA on VN1", }, { .uname = "VN1_NON_IDLE", .ucode = 0x8000, 
.udesc = "Number of Snoop Targets -- Non Idle cycles on VN1", }, { .uname = "VN1_PEER_UPI0", .ucode = 0x800, .udesc = "Number of Snoop Targets -- Peer UPI0 on VN1", }, { .uname = "VN1_PEER_UPI1", .ucode = 0x1000, .udesc = "Number of Snoop Targets -- Peer UPI1 on VN1", }, }; static intel_x86_umask_t skx_unc_m3_txc_ad_snpf_grp2_vn1[]={ { .uname = "VN0_SNPFP_NONSNP", .ucode = 0x100, .udesc = "Snoop Arbitration -- FlowQ Won", }, { .uname = "VN0_SNPFP_VN2SNP", .ucode = 0x400, .udesc = "Snoop Arbitration -- FlowQ SnpF Won", }, { .uname = "VN1_SNPFP_NONSNP", .ucode = 0x200, .udesc = "Snoop Arbitration -- FlowQ Won", }, { .uname = "VN1_SNPFP_VN0SNP", .ucode = 0x800, .udesc = "Snoop Arbitration -- FlowQ SnpF Won", }, }; static intel_x86_umask_t skx_unc_m3_txc_ad_spec_arb_crd_avail[]={ { .uname = "VN0_REQ", .ucode = 0x100, .udesc = "Speculative ARB for AD - Credit Available -- VN0 REQ Messages", }, { .uname = "VN0_SNP", .ucode = 0x200, .udesc = "Speculative ARB for AD - Credit Available -- VN0 SNP Messages", }, { .uname = "VN0_WB", .ucode = 0x800, .udesc = "Speculative ARB for AD - Credit Available -- VN0 WB Messages", }, { .uname = "VN1_REQ", .ucode = 0x1000, .udesc = "Speculative ARB for AD - Credit Available -- VN1 REQ Messages", }, { .uname = "VN1_SNP", .ucode = 0x2000, .udesc = "Speculative ARB for AD - Credit Available -- VN1 SNP Messages", }, { .uname = "VN1_WB", .ucode = 0x8000, .udesc = "Speculative ARB for AD - Credit Available -- VN1 WB Messages", }, }; static intel_x86_umask_t skx_unc_m3_txc_ad_spec_arb_new_msg[]={ { .uname = "VN0_REQ", .ucode = 0x100, .udesc = "Speculative ARB for AD - New Message -- VN0 REQ Messages", }, { .uname = "VN0_SNP", .ucode = 0x200, .udesc = "Speculative ARB for AD - New Message -- VN0 SNP Messages", }, { .uname = "VN0_WB", .ucode = 0x800, .udesc = "Speculative ARB for AD - New Message -- VN0 WB Messages", }, { .uname = "VN1_REQ", .ucode = 0x1000, .udesc = "Speculative ARB for AD - New Message -- VN1 REQ Messages", }, { .uname = 
"VN1_SNP", .ucode = 0x2000, .udesc = "Speculative ARB for AD - New Message -- VN1 SNP Messages", }, { .uname = "VN1_WB", .ucode = 0x8000, .udesc = "Speculative ARB for AD - New Message -- VN1 WB Messages", }, }; static intel_x86_umask_t skx_unc_m3_txc_ad_spec_arb_no_other_pend[]={ { .uname = "VN0_REQ", .ucode = 0x100, .udesc = "Speculative ARB for AD - No Credit -- VN0 REQ Messages", }, { .uname = "VN0_RSP", .ucode = 0x400, .udesc = "Speculative ARB for AD - No Credit -- VN0 RSP Messages", }, { .uname = "VN0_SNP", .ucode = 0x200, .udesc = "Speculative ARB for AD - No Credit -- VN0 SNP Messages", }, { .uname = "VN0_WB", .ucode = 0x800, .udesc = "Speculative ARB for AD - No Credit -- VN0 WB Messages", }, { .uname = "VN1_REQ", .ucode = 0x1000, .udesc = "Speculative ARB for AD - No Credit -- VN1 REQ Messages", }, { .uname = "VN1_RSP", .ucode = 0x4000, .udesc = "Speculative ARB for AD - No Credit -- VN1 RSP Messages", }, { .uname = "VN1_SNP", .ucode = 0x2000, .udesc = "Speculative ARB for AD - No Credit -- VN1 SNP Messages", }, { .uname = "VN1_WB", .ucode = 0x8000, .udesc = "Speculative ARB for AD - No Credit -- VN1 WB Messages", }, }; static intel_x86_umask_t skx_unc_m3_txc_bl_arb_fail[]={ { .uname = "VN0_NCB", .ucode = 0x400, .udesc = "Failed ARB for BL -- VN0 NCB Messages", }, { .uname = "VN0_NCS", .ucode = 0x800, .udesc = "Failed ARB for BL -- VN0 NCS Messages", }, { .uname = "VN0_RSP", .ucode = 0x100, .udesc = "Failed ARB for BL -- VN0 RSP Messages", }, { .uname = "VN0_WB", .ucode = 0x200, .udesc = "Failed ARB for BL -- VN0 WB Messages", }, { .uname = "VN1_NCB", .ucode = 0x4000, .udesc = "Failed ARB for BL -- VN1 NCS Messages", }, { .uname = "VN1_NCS", .ucode = 0x8000, .udesc = "Failed ARB for BL -- VN1 NCB Messages", }, { .uname = "VN1_RSP", .ucode = 0x1000, .udesc = "Failed ARB for BL -- VN1 RSP Messages", }, { .uname = "VN1_WB", .ucode = 0x2000, .udesc = "Failed ARB for BL -- VN1 WB Messages", }, }; static intel_x86_umask_t skx_unc_m3_txc_bl_flq_cycles_ne[]={ { 
.uname = "VN0_REQ", .ucode = 0x100, .udesc = "BL Flow Q Not Empty -- VN0 REQ Messages", }, { .uname = "VN0_RSP", .ucode = 0x400, .udesc = "BL Flow Q Not Empty -- VN0 RSP Messages", }, { .uname = "VN0_SNP", .ucode = 0x200, .udesc = "BL Flow Q Not Empty -- VN0 SNP Messages", }, { .uname = "VN0_WB", .ucode = 0x800, .udesc = "BL Flow Q Not Empty -- VN0 WB Messages", }, { .uname = "VN1_REQ", .ucode = 0x1000, .udesc = "BL Flow Q Not Empty -- VN1 REQ Messages", }, { .uname = "VN1_RSP", .ucode = 0x4000, .udesc = "BL Flow Q Not Empty -- VN1 RSP Messages", }, { .uname = "VN1_SNP", .ucode = 0x2000, .udesc = "BL Flow Q Not Empty -- VN1 SNP Messages", }, { .uname = "VN1_WB", .ucode = 0x8000, .udesc = "BL Flow Q Not Empty -- VN1 WB Messages", }, }; static intel_x86_umask_t skx_unc_m3_txc_bl_flq_inserts[]={ { .uname = "VN0_NCB", .ucode = 0x100, .udesc = "BL Flow Q Inserts -- VN0 RSP Messages", }, { .uname = "VN0_NCS", .ucode = 0x200, .udesc = "BL Flow Q Inserts -- VN0 WB Messages", }, { .uname = "VN0_RSP", .ucode = 0x800, .udesc = "BL Flow Q Inserts -- VN0 NCS Messages", }, { .uname = "VN0_WB", .ucode = 0x400, .udesc = "BL Flow Q Inserts -- VN0 NCB Messages", }, { .uname = "VN1_NCB", .ucode = 0x1000, .udesc = "BL Flow Q Inserts -- VN1 RSP Messages", }, { .uname = "VN1_NCS", .ucode = 0x2000, .udesc = "BL Flow Q Inserts -- VN1 WB Messages", }, { .uname = "VN1_RSP", .ucode = 0x8000, .udesc = "BL Flow Q Inserts -- VN1_NCB Messages", }, { .uname = "VN1_WB", .ucode = 0x4000, .udesc = "BL Flow Q Inserts -- VN1_NCS Messages", }, }; static intel_x86_umask_t skx_unc_m3_txc_bl_flq_occupancy[]={ { .uname = "VN0_NCB", .ucode = 0x400, .udesc = "BL Flow Q Occupancy -- VN0 NCB Messages", }, { .uname = "VN0_NCS", .ucode = 0x800, .udesc = "BL Flow Q Occupancy -- VN0 NCS Messages", }, { .uname = "VN0_RSP", .ucode = 0x100, .udesc = "BL Flow Q Occupancy -- VN0 RSP Messages", }, { .uname = "VN0_WB", .ucode = 0x200, .udesc = "BL Flow Q Occupancy -- VN0 WB Messages", }, { .uname = "VN1_NCB", .ucode = 
0x4000, .udesc = "BL Flow Q Occupancy -- VN1 NCB Messages", }, { .uname = "VN1_NCS", .ucode = 0x8000, .udesc = "BL Flow Q Occupancy -- VN1 NCS Messages", }, { .uname = "VN1_RSP", .ucode = 0x1000, .udesc = "BL Flow Q Occupancy -- VN1 RSP Messages", }, { .uname = "VN1_WB", .ucode = 0x2000, .udesc = "BL Flow Q Occupancy -- VN1 WB Messages", }, }; static intel_x86_umask_t skx_unc_m3_txc_bl_spec_arb_new_msg[]={ { .uname = "VN0_NCB", .ucode = 0x200, .udesc = "Speculative ARB for BL - New Message -- VN0 NCB Messages", }, { .uname = "VN0_NCS", .ucode = 0x800, .udesc = "Speculative ARB for BL - New Message -- VN0 NCS Messages", }, { .uname = "VN0_WB", .ucode = 0x100, .udesc = "Speculative ARB for BL - New Message -- VN0 WB Messages", }, { .uname = "VN1_NCB", .ucode = 0x2000, .udesc = "Speculative ARB for BL - New Message -- VN1 NCB Messages", }, { .uname = "VN1_NCS", .ucode = 0x8000, .udesc = "Speculative ARB for BL - New Message -- VN1 NCS Messages", }, { .uname = "VN1_WB", .ucode = 0x1000, .udesc = "Speculative ARB for BL - New Message -- VN1 WB Messages", }, }; static intel_x86_umask_t skx_unc_m3_txc_bl_spec_arb_no_other_pend[]={ { .uname = "VN0_NCB", .ucode = 0x400, .udesc = "Speculative ARB for AD Failed - No Credit -- VN0 NCB Messages", }, { .uname = "VN0_NCS", .ucode = 0x800, .udesc = "Speculative ARB for AD Failed - No Credit -- VN0 NCS Messages", }, { .uname = "VN0_RSP", .ucode = 0x100, .udesc = "Speculative ARB for AD Failed - No Credit -- VN0 RSP Messages", }, { .uname = "VN0_WB", .ucode = 0x200, .udesc = "Speculative ARB for AD Failed - No Credit -- VN0 WB Messages", }, { .uname = "VN1_NCB", .ucode = 0x4000, .udesc = "Speculative ARB for AD Failed - No Credit -- VN1 NCB Messages", }, { .uname = "VN1_NCS", .ucode = 0x8000, .udesc = "Speculative ARB for AD Failed - No Credit -- VN1 NCS Messages", }, { .uname = "VN1_RSP", .ucode = 0x1000, .udesc = "Speculative ARB for AD Failed - No Credit -- VN1 RSP Messages", }, { .uname = "VN1_WB", .ucode = 0x2000, .udesc = 
"Speculative ARB for AD Failed - No Credit -- VN1 WB Messages", }, }; static intel_x86_umask_t skx_unc_m3_txr_horz_ads_used[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal ADS Used -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "CMS Horizontal ADS Used -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal ADS Used -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "CMS Horizontal ADS Used -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "CMS Horizontal ADS Used -- BL - Credit", }, }; static intel_x86_umask_t skx_unc_m3_txr_horz_bypass[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal Bypass Used -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "CMS Horizontal Bypass Used -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal Bypass Used -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "CMS Horizontal Bypass Used -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "CMS Horizontal Bypass Used -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "CMS Horizontal Bypass Used -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m3_txr_horz_cycles_full[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Cycles CMS Horizontal Egress Queue is Full -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m3_txr_horz_cycles_ne[]={ { 
.uname = "AD_BNC", .ucode = 0x100, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "Cycles CMS Horizontal Egress Queue is Not Empty -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m3_txr_horz_inserts[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal Egress Inserts -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "CMS Horizontal Egress Inserts -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal Egress Inserts -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "CMS Horizontal Egress Inserts -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "CMS Horizontal Egress Inserts -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "CMS Horizontal Egress Inserts -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m3_txr_horz_nack[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal Egress NACKs -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x2000, .udesc = "CMS Horizontal Egress NACKs -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal Egress NACKs -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "CMS Horizontal Egress NACKs -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "CMS Horizontal Egress NACKs -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "CMS Horizontal Egress NACKs -- IV - Bounce", }, }; static intel_x86_umask_t 
skx_unc_m3_txr_horz_occupancy[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal Egress Occupancy -- AD - Bounce", }, { .uname = "AD_CRD", .ucode = 0x1000, .udesc = "CMS Horizontal Egress Occupancy -- AD - Credit", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal Egress Occupancy -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "CMS Horizontal Egress Occupancy -- BL - Bounce", }, { .uname = "BL_CRD", .ucode = 0x4000, .udesc = "CMS Horizontal Egress Occupancy -- BL - Credit", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "CMS Horizontal Egress Occupancy -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m3_txr_horz_starved[]={ { .uname = "AD_BNC", .ucode = 0x100, .udesc = "CMS Horizontal Egress Injection Starvation -- AD - Bounce", }, { .uname = "AK_BNC", .ucode = 0x200, .udesc = "CMS Horizontal Egress Injection Starvation -- AK - Bounce", }, { .uname = "BL_BNC", .ucode = 0x400, .udesc = "CMS Horizontal Egress Injection Starvation -- BL - Bounce", }, { .uname = "IV_BNC", .ucode = 0x800, .udesc = "CMS Horizontal Egress Injection Starvation -- IV - Bounce", }, }; static intel_x86_umask_t skx_unc_m3_txr_vert_ads_used[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vertical ADS Used -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vertical ADS Used -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vertical ADS Used -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vertical ADS Used -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vertical ADS Used -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vertical ADS Used -- BL - Agent 1", }, }; static intel_x86_umask_t skx_unc_m3_txr_vert_bypass[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vertical Bypass Used -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vertical Bypass Used -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vertical Bypass Used -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vertical Bypass Used -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vertical Bypass Used -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vertical Bypass Used -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "CMS Vertical Bypass Used -- IV", }, }; static intel_x86_umask_t skx_unc_m3_txr_vert_cycles_full[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "Cycles CMS Vertical Egress Queue Is Full -- IV", }, }; static intel_x86_umask_t skx_unc_m3_txr_vert_cycles_ne[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- BL - Agent 1", }, { .uname = "IV",
.ucode = 0x800, .udesc = "Cycles CMS Vertical Egress Queue Is Not Empty -- IV", }, }; static intel_x86_umask_t skx_unc_m3_txr_vert_inserts[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vert Egress Allocations -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vert Egress Allocations -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vert Egress Allocations -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vert Egress Allocations -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vert Egress Allocations -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vert Egress Allocations -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "CMS Vert Egress Allocations -- IV", }, }; static intel_x86_umask_t skx_unc_m3_txr_vert_nack[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vertical Egress NACKs -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vertical Egress NACKs -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vertical Egress NACKs -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vertical Egress NACKs -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vertical Egress NACKs -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vertical Egress NACKs -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "CMS Vertical Egress NACKs -- IV", }, }; static intel_x86_umask_t skx_unc_m3_txr_vert_occupancy[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vert Egress Occupancy -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vert Egress Occupancy -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vert Egress Occupancy -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vert Egress Occupancy -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS 
Vert Egress Occupancy -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vert Egress Occupancy -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "CMS Vert Egress Occupancy -- IV", }, }; static intel_x86_umask_t skx_unc_m3_txr_vert_starved[]={ { .uname = "AD_AG0", .ucode = 0x100, .udesc = "CMS Vertical Egress Injection Starvation -- AD - Agent 0", }, { .uname = "AD_AG1", .ucode = 0x1000, .udesc = "CMS Vertical Egress Injection Starvation -- AD - Agent 1", }, { .uname = "AK_AG0", .ucode = 0x200, .udesc = "CMS Vertical Egress Injection Starvation -- AK - Agent 0", }, { .uname = "AK_AG1", .ucode = 0x2000, .udesc = "CMS Vertical Egress Injection Starvation -- AK - Agent 1", }, { .uname = "BL_AG0", .ucode = 0x400, .udesc = "CMS Vertical Egress Injection Starvation -- BL - Agent 0", }, { .uname = "BL_AG1", .ucode = 0x4000, .udesc = "CMS Vertical Egress Injection Starvation -- BL - Agent 1", }, { .uname = "IV", .ucode = 0x800, .udesc = "CMS Vertical Egress Injection Starvation -- IV", }, }; static intel_x86_umask_t skx_unc_m3_upi_peer_ad_credits_empty[]={ { .uname = "VN0_REQ", .ucode = 0x200, .udesc = "UPI0 AD Credits Empty -- VN0 REQ Messages", }, { .uname = "VN0_RSP", .ucode = 0x800, .udesc = "UPI0 AD Credits Empty -- VN0 RSP Messages", }, { .uname = "VN0_SNP", .ucode = 0x400, .udesc = "UPI0 AD Credits Empty -- VN0 SNP Messages", }, { .uname = "VN1_REQ", .ucode = 0x1000, .udesc = "UPI0 AD Credits Empty -- VN1 REQ Messages", }, { .uname = "VN1_RSP", .ucode = 0x4000, .udesc = "UPI0 AD Credits Empty -- VN1 RSP Messages", }, { .uname = "VN1_SNP", .ucode = 0x2000, .udesc = "UPI0 AD Credits Empty -- VN1 SNP Messages", }, { .uname = "VNA", .ucode = 0x100, .udesc = "UPI0 AD Credits Empty -- VNA", }, }; static intel_x86_umask_t skx_unc_m3_upi_peer_bl_credits_empty[]={ { .uname = "VN0_NCS_NCB", .ucode = 0x400, .udesc = "UPI0 BL Credits Empty -- VN0 NCS/NCB Messages", }, { .uname = "VN0_RSP", .ucode = 0x200, .udesc = "UPI0 BL Credits Empty -- VN0 RSP Messages", }, { .uname = "VN0_WB", .ucode = 0x800, .udesc = "UPI0 BL Credits Empty -- VN0 WB Messages", }, { .uname = "VN1_NCS_NCB", .ucode = 0x2000, .udesc = "UPI0 BL Credits Empty -- VN1 NCS/NCB Messages", }, { .uname = "VN1_RSP", .ucode = 0x1000, .udesc = "UPI0 BL Credits Empty -- VN1 RSP Messages", }, { .uname = "VN1_WB", .ucode = 0x4000, .udesc = "UPI0 BL Credits Empty -- VN1 WB Messages", }, { .uname = "VNA", .ucode = 0x100, .udesc = "UPI0 BL Credits Empty -- VNA", }, }; static intel_x86_umask_t skx_unc_m3_vert_ring_ad_in_use[]={ { .uname = "DN_EVEN", .ucode = 0x400, .udesc = "Vertical AD Ring In Use -- Down and Even", }, { .uname = "DN_ODD", .ucode = 0x800, .udesc = "Vertical AD Ring In Use -- Down and Odd", }, { .uname = "UP_EVEN", .ucode = 0x100, .udesc = "Vertical AD Ring In Use -- Up and Even", }, { .uname = "UP_ODD", .ucode = 0x200, .udesc = "Vertical AD Ring In Use -- Up and Odd", }, }; static intel_x86_umask_t skx_unc_m3_vert_ring_ak_in_use[]={ { .uname = "DN_EVEN", .ucode = 0x400, .udesc = "Vertical AK Ring In Use -- Down and Even", }, { .uname = "DN_ODD", .ucode = 0x800, .udesc = "Vertical AK Ring In Use -- Down and Odd", }, { .uname = "UP_EVEN", .ucode = 0x100, .udesc = "Vertical AK Ring In Use -- Up and Even", }, { .uname = "UP_ODD", .ucode = 0x200, .udesc = "Vertical AK Ring In Use -- Up and Odd", }, }; static intel_x86_umask_t skx_unc_m3_vert_ring_bl_in_use[]={ { .uname = "DN_EVEN", .ucode = 0x400, .udesc = "Vertical BL Ring in Use -- Down and Even", }, { .uname = "DN_ODD", .ucode = 0x800, .udesc = "Vertical BL Ring in Use -- Down and Odd", }, { .uname = "UP_EVEN", .ucode = 0x100, .udesc = "Vertical BL Ring in Use -- Up and Even", }, { .uname = "UP_ODD", .ucode = 0x200, .udesc = "Vertical BL Ring in Use -- Up and Odd", }, }; static intel_x86_umask_t skx_unc_m3_vert_ring_iv_in_use[]={ { .uname = "DN", .ucode = 0x400, .udesc = "Vertical IV Ring in Use -- Down", }, { .uname = "UP", .ucode = 0x100, .udesc = "Vertical IV Ring in Use -- Up", }, };
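Several event descriptions in this table (for example the RxC inserts/occupancy/cycles-not-empty triples such as UNC_M3_RXC_INSERTS_VN0, UNC_M3_RXC_OCCUPANCY_VN0, and UNC_M3_RXC_CYCLES_NE_VN0) note that the occupancy accumulator can be combined with the allocation count to compute average queue latency, or with the not-empty cycle count to compute average occupancy. The sketch below is the editor's illustration of that post-processing, not part of the original table; the helper names are hypothetical and assume the raw counter values have already been read out.

```c
#include <stdint.h>

/* Derived metrics from an (occupancy accumulator, inserts, not-empty cycles)
 * event triple, as suggested by the event descriptions:
 *   average queue latency   = occupancy / inserts    (cycles per allocation)
 *   average queue occupancy = occupancy / cycles_ne  (entries while non-empty)
 */
static inline double
skx_unc_m3_avg_queue_latency(uint64_t occupancy, uint64_t inserts)
{
    /* Guard against division by zero when nothing was allocated. */
    return inserts ? (double)occupancy / (double)inserts : 0.0;
}

static inline double
skx_unc_m3_avg_queue_occupancy(uint64_t occupancy, uint64_t cycles_ne)
{
    /* Guard against division by zero when the queue was always empty. */
    return cycles_ne ? (double)occupancy / (double)cycles_ne : 0.0;
}
```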
static intel_x86_umask_t skx_unc_m3_vn0_credits_used[]={ { .uname = "NCB", .ucode = 0x1000, .udesc = "VN0 Credit Used -- NCB on BL", }, { .uname = "NCS", .ucode = 0x2000, .udesc = "VN0 Credit Used -- NCS on BL", }, { .uname = "REQ", .ucode = 0x100, .udesc = "VN0 Credit Used -- REQ on AD", }, { .uname = "RSP", .ucode = 0x400, .udesc = "VN0 Credit Used -- RSP on AD", }, { .uname = "SNP", .ucode = 0x200, .udesc = "VN0 Credit Used -- SNP on AD", }, { .uname = "WB", .ucode = 0x800, .udesc = "VN0 Credit Used -- WB on BL", }, }; static intel_x86_umask_t skx_unc_m3_vn0_no_credits[]={ { .uname = "NCB", .ucode = 0x1000, .udesc = "VN0 No Credits -- NCB on BL", }, { .uname = "NCS", .ucode = 0x2000, .udesc = "VN0 No Credits -- NCS on BL", }, { .uname = "REQ", .ucode = 0x100, .udesc = "VN0 No Credits -- REQ on AD", }, { .uname = "RSP", .ucode = 0x400, .udesc = "VN0 No Credits -- RSP on AD", }, { .uname = "SNP", .ucode = 0x200, .udesc = "VN0 No Credits -- SNP on AD", }, { .uname = "WB", .ucode = 0x800, .udesc = "VN0 No Credits -- WB on BL", }, }; static intel_x86_umask_t skx_unc_m3_vn1_credits_used[]={ { .uname = "NCB", .ucode = 0x1000, .udesc = "VN1 Credit Used -- NCB on BL", }, { .uname = "NCS", .ucode = 0x2000, .udesc = "VN1 Credit Used -- NCS on BL", }, { .uname = "REQ", .ucode = 0x100, .udesc = "VN1 Credit Used -- REQ on AD", }, { .uname = "RSP", .ucode = 0x400, .udesc = "VN1 Credit Used -- RSP on AD", }, { .uname = "SNP", .ucode = 0x200, .udesc = "VN1 Credit Used -- SNP on AD", }, { .uname = "WB", .ucode = 0x800, .udesc = "VN1 Credit Used -- WB on BL", }, }; static intel_x86_umask_t skx_unc_m3_vn1_no_credits[]={ { .uname = "NCB", .ucode = 0x1000, .udesc = "VN1 No Credits -- NCB on BL", }, { .uname = "NCS", .ucode = 0x2000, .udesc = "VN1 No Credits -- NCS on BL", }, { .uname = "REQ", .ucode = 0x100, .udesc = "VN1 No Credits -- REQ on AD", }, { .uname = "RSP", .ucode = 0x400, .udesc = "VN1 No Credits -- RSP on AD", }, { .uname = "SNP", .ucode = 0x200, .udesc = "VN1 No Credits
-- SNP on AD", }, { .uname = "WB", .ucode = 0x800, .udesc = "VN1 No Credits -- WB on BL", }, }; static intel_x86_entry_t intel_skx_unc_m3_pe[]={ { .name = "UNC_M3_AG0_AD_CRD_ACQUIRED", .code = 0x80, .desc = "Number of CMS Agent 0 AD credits acquired in a given cycle, per transgress.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_ag0_ad_crd_acquired, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_ag0_ad_crd_acquired), }, { .name = "UNC_M3_AG0_AD_CRD_OCCUPANCY", .code = 0x82, .desc = "Number of CMS Agent 0 AD credits in use in a given cycle, per transgress", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_ag0_ad_crd_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_ag0_ad_crd_occupancy), }, { .name = "UNC_M3_AG0_BL_CRD_ACQUIRED", .code = 0x88, .desc = "Number of CMS Agent 0 BL credits acquired in a given cycle, per transgress.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_ag0_ad_crd_acquired, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_ag0_ad_crd_acquired), }, { .name = "UNC_M3_AG0_BL_CRD_OCCUPANCY", .code = 0x8a, .desc = "Number of CMS Agent 0 BL credits in use in a given cycle, per transgress", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_ag0_ad_crd_occupancy, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_ag0_ad_crd_occupancy), }, { .name = "UNC_M3_AG1_AD_CRD_ACQUIRED", .code = 0x84, .desc = "Number of CMS Agent 1 AD credits acquired in a given cycle, per transgress.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_ag0_ad_crd_acquired, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_ag0_ad_crd_acquired), }, { .name = "UNC_M3_AG1_AD_CRD_OCCUPANCY", .code = 0x86, .desc = "Number of CMS Agent 1 AD credits in use in a given cycle, per transgress", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_ag0_ad_crd_occupancy, /* shared */ .numasks=
LIBPFM_ARRAY_SIZE(skx_unc_m3_ag0_ad_crd_occupancy), }, { .name = "UNC_M3_AG1_BL_CRD_OCCUPANCY", .code = 0x8e, .desc = "Number of CMS Agent 1 BL credits in use in a given cycle, per transgress", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_ag0_ad_crd_occupancy, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_ag0_ad_crd_occupancy), }, { .name = "UNC_M3_AG1_BL_CREDITS_ACQUIRED", .code = 0x8c, .desc = "Number of CMS Agent 1 BL credits acquired in a given cycle, per transgress.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_ag0_ad_crd_acquired, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_ag0_ad_crd_acquired), }, { .name = "UNC_M3_CHA_AD_CREDITS_EMPTY", .code = 0x22, .desc = "No credits available to send to Cbox on the AD Ring (covers higher CBoxes)", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_cha_ad_credits_empty, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_cha_ad_credits_empty), }, { .name = "UNC_M3_CLOCKTICKS", .code = 0x1, .desc = "Counts the number of uclks in the M3 uclk domain. This could be slightly different than the count in the Ubox because of enable/freeze delays. 
However, because the M3 is close to the Ubox, they generally should not diverge by more than a handful of cycles.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M3_CMS_CLOCKTICKS", .code = 0xc0, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M3_D2C_SENT", .code = 0x2b, .desc = "Count cases BL sends direct to core", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M3_D2U_SENT", .code = 0x2a, .desc = "Cases where SMI3 sends D2U command", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M3_EGRESS_ORDERING", .code = 0xae, .desc = "Counts number of cycles IV was blocked in the TGR Egress due to SNP/GO Ordering requirements", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_egress_ordering, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_egress_ordering), }, { .name = "UNC_M3_FAST_ASSERTED", .code = 0xa5, .desc = "Counts the number of cycles either the local or incoming distress signals are asserted. Incoming distress includes up, dn and across.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_fast_asserted, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_fast_asserted), }, { .name = "UNC_M3_HORZ_RING_AD_IN_USE", .code = 0xa7, .desc = "Counts the number of cycles that the Horizontal AD ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. 
In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_horz_ring_ad_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_horz_ring_ad_in_use), }, { .name = "UNC_M3_HORZ_RING_AK_IN_USE", .code = 0xa9, .desc = "Counts the number of cycles that the Horizontal AK ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_horz_ring_ak_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_horz_ring_ak_in_use), }, { .name = "UNC_M3_HORZ_RING_BL_IN_USE", .code = 0xab, .desc = "Counts the number of cycles that the Horizontal BL ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_horz_ring_bl_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_horz_ring_bl_in_use), }, { .name = "UNC_M3_HORZ_RING_IV_IN_USE", .code = 0xad, .desc = "Counts the number of cycles that the Horizontal IV ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. There is only 1 IV ring. Therefore, if one wants to monitor the Even ring, they should select both UP_EVEN and DN_EVEN. To monitor the Odd ring, they should select both UP_ODD and DN_ODD.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_horz_ring_iv_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_horz_ring_iv_in_use), }, { .name = "UNC_M3_M2_BL_CREDITS_EMPTY", .code = 0x23, .desc = "No vn0 and vna credits available to send to M2", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_m2_bl_credits_empty, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_m2_bl_credits_empty), }, { .name = "UNC_M3_MULTI_SLOT_RCVD", .code = 0x3e, .desc = "Multi slot flit received - S0, S1 and/or S2 populated (can use AK S0/S1 masks for AK allocations)", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_multi_slot_rcvd, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_multi_slot_rcvd), }, { .name = "UNC_M3_RING_BOUNCES_HORZ", .code = 0xa1, .desc = "Number of cycles incoming messages from the Horizontal ring that were bounced, by ring type.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_ring_bounces_horz, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_ring_bounces_horz), }, { .name = "UNC_M3_RING_BOUNCES_VERT", .code = 0xa0, .desc = "Number of cycles incoming messages from the Vertical ring that were
bounced, by ring type.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_ring_bounces_vert, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_ring_bounces_vert), }, { .name = "UNC_M3_RING_SINK_STARVED_HORZ", .code = 0xa3, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_ring_sink_starved_horz, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_ring_sink_starved_horz), }, { .name = "UNC_M3_RING_SINK_STARVED_VERT", .code = 0xa2, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_ring_sink_starved_vert, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_ring_sink_starved_vert), }, { .name = "UNC_M3_RING_SRC_THRTL", .code = 0xa4, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M3_RXC_ARB_LOST_VN0", .code = 0x4b, .desc = "VN0 message requested but lost arbitration", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_arb_lost_vn0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_arb_lost_vn0), }, { .name = "UNC_M3_RXC_ARB_LOST_VN1", .code = 0x4c, .desc = "VN1 message requested but lost arbitration", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_arb_lost_vn1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_arb_lost_vn1), }, { .name = "UNC_M3_RXC_ARB_MISC", .code = 0x4d, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_arb_misc, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_arb_misc), }, { .name = "UNC_M3_RXC_ARB_NOAD_REQ_VN0", .code = 0x49, .desc = "VN0 message was not able to request arbitration while some other message won arbitration", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_arb_noad_req_vn0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_arb_noad_req_vn0), }, { .name = "UNC_M3_RXC_ARB_NOAD_REQ_VN1", .code = 0x4a, .desc = "VN1 message was not able to request arbitration while some other message won arbitration", .modmsk = 
SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_arb_noad_req_vn1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_arb_noad_req_vn1), }, { .name = "UNC_M3_RXC_ARB_NOCRED_VN0", .code = 0x47, .desc = "VN0 message is blocked from requesting arbitration due to lack of remote UPI credits", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_arb_nocred_vn0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_arb_nocred_vn0), }, { .name = "UNC_M3_RXC_ARB_NOCRED_VN1", .code = 0x48, .desc = "VN1 message is blocked from requesting arbitration due to lack of remote UPI credits", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_arb_nocred_vn1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_arb_nocred_vn1), }, { .name = "UNC_M3_RXC_BYPASSED", .code = 0x40, .desc = "Number of times message is bypassed around the Ingress Queue", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0x7, .ngrp = 1, .umasks = skx_unc_m3_rxc_bypassed, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_bypassed), }, { .name = "UNC_M3_RXC_COLLISION_VN0", .code = 0x50, .desc = "Count cases where Ingress VN0 packets lost the contest for Flit Slot 0.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0x7, .ngrp = 1, .umasks = skx_unc_m3_rxc_collision_vn0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_collision_vn0), }, { .name = "UNC_M3_RXC_COLLISION_VN1", .code = 0x51, .desc = "Count cases where Ingress VN1 packets lost the contest for Flit Slot 0.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0x7, .ngrp = 1, .umasks = skx_unc_m3_rxc_collision_vn1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_collision_vn1), }, { .name = "UNC_M3_RXC_CRD_MISC", .code = 0x60, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_crd_misc, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_crd_misc), }, { .name = "UNC_M3_RXC_CRD_OCC", .code = 0x61, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_crd_occ,
.numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_crd_occ), }, { .name = "UNC_M3_RXC_CYCLES_NE_VN0", .code = 0x43, .desc = "Counts the number of cycles when the UPI Ingress is not empty. This tracks one of the three rings that are used by the UPI agent. This can be used in conjunction with the UPI Ingress Occupancy Accumulator event in order to calculate average queue occupancy. Multiple ingress buffers can be tracked at a given time using multiple counters.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_cycles_ne_vn0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_cycles_ne_vn0), }, { .name = "UNC_M3_RXC_CYCLES_NE_VN1", .code = 0x44, .desc = "Counts the number of cycles when the UPI VN1 Ingress is not empty. This tracks one of the three rings that are used by the UPI agent. This can be used in conjunction with the UPI VN1 Ingress Occupancy Accumulator event in order to calculate average queue occupancy. Multiple ingress buffers can be tracked at a given time using multiple counters.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_cycles_ne_vn1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_cycles_ne_vn1), }, { .name = "UNC_M3_RXC_FLITS_DATA_NOT_SENT", .code = 0x57, .desc = "Data flit is ready for transmission but could not be sent", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_flits_data_not_sent, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_flits_data_not_sent), }, { .name = "UNC_M3_RXC_FLITS_GEN_BL", .code = 0x59, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_flits_gen_bl, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_flits_gen_bl), }, { .name = "UNC_M3_RXC_FLITS_MISC", .code = 0x5a, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M3_RXC_FLITS_SENT", .code = 0x56, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_flits_sent, .numasks=
LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_flits_sent), }, { .name = "UNC_M3_RXC_FLITS_SLOT_BL", .code = 0x58, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_flits_slot_bl, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_flits_slot_bl), }, { .name = "UNC_M3_RXC_FLIT_GEN_HDR1", .code = 0x53, .desc = "Events related to Header Flit Generation - Set 1", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_flit_gen_hdr1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_flit_gen_hdr1), }, { .name = "UNC_M3_RXC_FLIT_GEN_HDR2", .code = 0x54, .desc = "Events related to Header Flit Generation - Set 2", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_flit_gen_hdr2, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_flit_gen_hdr2), }, { .name = "UNC_M3_RXC_FLIT_NOT_SENT", .code = 0x55, .desc = "header flit is ready for transmission but could not be sent", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_flit_not_sent, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_flit_not_sent), }, { .name = "UNC_M3_RXC_HELD", .code = 0x52, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0x7, .ngrp = 1, .umasks = skx_unc_m3_rxc_held, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_held), }, { .name = "UNC_M3_RXC_INSERTS_VN0", .code = 0x41, .desc = "Counts the number of allocations into the UPI Ingress. This tracks one of the three rings that are used by the UPI agent. This can be used in conjunction with the UPI Ingress Occupancy Accumulator event in order to calculate average queue latency. Multiple ingress buffers can be tracked at a given time using multiple counters.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_inserts_vn0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_inserts_vn0), }, { .name = "UNC_M3_RXC_INSERTS_VN1", .code = 0x42, .desc = "Counts the number of allocations into the UPI VN1 Ingress. 
This tracks one of the three rings that are used by the UPI agent. This can be used in conjunction with the UPI VN1 Ingress Occupancy Accumulator event in order to calculate average queue latency. Multiple ingress buffers can be tracked at a given time using multiple counters.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_inserts_vn1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_inserts_vn1), }, { .name = "UNC_M3_RXC_OCCUPANCY_VN0", .code = 0x45, .desc = "Accumulates the occupancy of a given UPI Ingress queue in each cycle. This tracks one of the three ring Ingress buffers. This can be used with the UPI Ingress Not Empty event to calculate average occupancy or the UPI Ingress Allocations event in order to calculate average queuing latency.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_occupancy_vn0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_occupancy_vn0), }, { .name = "UNC_M3_RXC_OCCUPANCY_VN1", .code = 0x46, .desc = "Accumulates the occupancy of a given UPI VN1 Ingress queue in each cycle. This tracks one of the three ring Ingress buffers.
This can be used with the UPI VN1 Ingress Not Empty event to calculate average occupancy or the UPI VN1 Ingress Allocations event in order to calculate average queuing latency.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_occupancy_vn1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_occupancy_vn1), }, { .name = "UNC_M3_RXC_PACKING_MISS_VN0", .code = 0x4e, .desc = "Count cases where Ingress has packets to send but did not have time to pack into flit before sending to Agent so slot was left NULL which could have been used.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0x7, .ngrp = 1, .umasks = skx_unc_m3_rxc_packing_miss_vn0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_packing_miss_vn0), }, { .name = "UNC_M3_RXC_PACKING_MISS_VN1", .code = 0x4f, .desc = "Count cases where Ingress has packets to send but did not have time to pack into flit before sending to Agent so slot was left NULL which could have been used.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0x7, .ngrp = 1, .umasks = skx_unc_m3_rxc_packing_miss_vn1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_packing_miss_vn1), }, { .name = "UNC_M3_RXC_SMI3_PFTCH", .code = 0x62, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_smi3_pftch, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_smi3_pftch), }, { .name = "UNC_M3_RXC_VNA_CRD", .code = 0x5b, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxc_vna_crd, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxc_vna_crd), }, { .name = "UNC_M3_RXR_BUSY_STARVED", .code = 0xb4, .desc = "Counts cycles under injection starvation mode. This starvation is triggered when the CMS Ingress cannot send a transaction onto the mesh for a long period of time. 
In this case, it is because a message from the other queue has higher priority.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxr_busy_starved, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxr_busy_starved), }, { .name = "UNC_M3_RXR_BYPASS", .code = 0xb2, .desc = "Number of packets bypassing the CMS Ingress", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxr_bypass, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxr_bypass), }, { .name = "UNC_M3_RXR_CRD_STARVED", .code = 0xb3, .desc = "Counts cycles under injection starvation mode. This starvation is triggered when the CMS Ingress cannot send a transaction onto the mesh for a long period of time. In this case, the Ingress is unable to forward to the Egress due to a lack of credit.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxr_crd_starved, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxr_crd_starved), }, { .name = "UNC_M3_RXR_INSERTS", .code = 0xb1, .desc = "Number of allocations into the CMS Ingress. The Ingress is used to queue up requests received from the mesh", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxr_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxr_inserts), }, { .name = "UNC_M3_RXR_OCCUPANCY", .code = 0xb0, .desc = "Occupancy event for the Ingress buffers in the CMS. The Ingress is used to queue up requests received from the mesh", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_rxr_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_rxr_occupancy), }, { .name = "UNC_M3_STALL_NO_TXR_HORZ_CRD_AD_AG0", .code = 0xd0, .desc = "Number of cycles the AD Agent 0 Egress Buffer is stalled waiting for a TGR credit to become available, per transgress.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_stall_no_txr_horz_crd_ad_ag0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_stall_no_txr_horz_crd_ad_ag0), }, { .name =
"UNC_M3_STALL_NO_TXR_HORZ_CRD_AD_AG1", .code = 0xd2, .desc = "Number of cycles the AD Agent 1 Egress Buffer is stalled waiting for a TGR credit to become available, per transgress.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_stall_no_txr_horz_crd_ad_ag1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_stall_no_txr_horz_crd_ad_ag1), }, { .name = "UNC_M3_STALL_NO_TXR_HORZ_CRD_BL_AG0", .code = 0xd4, .desc = "Number of cycles the BL Agent 0 Egress Buffer is stalled waiting for a TGR credit to become available, per transgress.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_stall_no_txr_horz_crd_bl_ag0, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_stall_no_txr_horz_crd_bl_ag0), }, { .name = "UNC_M3_STALL_NO_TXR_HORZ_CRD_BL_AG1", .code = 0xd6, .desc = "Number of cycles the BL Agent 1 Egress Buffer is stalled waiting for a TGR credit to become available, per transgress.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_stall_no_txr_horz_crd_bl_ag1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_stall_no_txr_horz_crd_bl_ag1), }, { .name = "UNC_M3_TXC_AD_ARB_FAIL", .code = 0x30, .desc = "AD arb but no win; arb request asserted but not won", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txc_ad_arb_fail, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txc_ad_arb_fail), }, { .name = "UNC_M3_TXC_AD_FLQ_BYPASS", .code = 0x2c, .desc = "Counts cases when the AD flowQ is bypassed (S0, S1 and S2 indicate which slot was bypassed with S0 having the highest priority and S2 the least)", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txc_ad_flq_bypass, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txc_ad_flq_bypass), }, { .name = "UNC_M3_TXC_AD_FLQ_CYCLES_NE", .code = 0x27, .desc = "Number of cycles the AD Egress queue is Not Empty", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txc_ad_flq_cycles_ne, .numasks= 
LIBPFM_ARRAY_SIZE(skx_unc_m3_txc_ad_flq_cycles_ne), }, { .name = "UNC_M3_TXC_AD_FLQ_INSERTS", .code = 0x2d, .desc = "Counts the number of allocations into the QPI FlowQ. This can be used in conjunction with the QPI FlowQ Occupancy Accumulator event in order to calculate average queue latency. Only a single FlowQ queue can be tracked at any given time. It is not possible to filter based on direction or polarity.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txc_ad_flq_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txc_ad_flq_inserts), }, { .name = "UNC_M3_TXC_AD_FLQ_OCCUPANCY", .code = 0x1c, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0x1, .ngrp = 1, .umasks = skx_unc_m3_txc_ad_flq_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txc_ad_flq_occupancy), }, { .name = "UNC_M3_TXC_AD_SNPF_GRP1_VN1", .code = 0x3c, .desc = "Number of snpfanout targets and non-idle cycles can be used to calculate average snpfanout latency", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0x1, .ngrp = 1, .umasks = skx_unc_m3_txc_ad_snpf_grp1_vn1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txc_ad_snpf_grp1_vn1), }, { .name = "UNC_M3_TXC_AD_SNPF_GRP2_VN1", .code = 0x3d, .desc = "Outcome of SnpF pending arbitration", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txc_ad_snpf_grp2_vn1, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txc_ad_snpf_grp2_vn1), }, { .name = "UNC_M3_TXC_AD_SPEC_ARB_CRD_AVAIL", .code = 0x34, .desc = "AD speculative arb request with prior cycle credit check complete and credit avail", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txc_ad_spec_arb_crd_avail, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txc_ad_spec_arb_crd_avail), }, { .name = "UNC_M3_TXC_AD_SPEC_ARB_NEW_MSG", .code = 0x33, .desc = "AD speculative arb request due to new message arriving on a specific channel (MC/VN)", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txc_ad_spec_arb_new_msg, 
.numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txc_ad_spec_arb_new_msg), }, { .name = "UNC_M3_TXC_AD_SPEC_ARB_NO_OTHER_PEND", .code = 0x32, .desc = "AD speculative arb request asserted due to no other channel being active (have a valid entry but don't have credits to send)", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txc_ad_spec_arb_no_other_pend, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txc_ad_spec_arb_no_other_pend), }, { .name = "UNC_M3_TXC_AK_FLQ_INSERTS", .code = 0x2f, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M3_TXC_AK_FLQ_OCCUPANCY", .code = 0x1e, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0x1, }, { .name = "UNC_M3_TXC_BL_ARB_FAIL", .code = 0x35, .desc = "BL arb but no win; arb request asserted but not won", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txc_bl_arb_fail, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txc_bl_arb_fail), }, { .name = "UNC_M3_TXC_BL_FLQ_CYCLES_NE", .code = 0x28, .desc = "Number of cycles the BL Egress queue is Not Empty", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txc_bl_flq_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txc_bl_flq_cycles_ne), }, { .name = "UNC_M3_TXC_BL_FLQ_INSERTS", .code = 0x2e, .desc = "Counts the number of allocations into the QPI FlowQ. This can be used in conjunction with the QPI FlowQ Occupancy Accumulator event in order to calculate average queue latency. Only a single FlowQ queue can be tracked at any given time. 
It is not possible to filter based on direction or polarity.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txc_bl_flq_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txc_bl_flq_inserts), }, { .name = "UNC_M3_TXC_BL_FLQ_OCCUPANCY", .code = 0x1d, .desc = "TBD", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0x1, .ngrp = 1, .umasks = skx_unc_m3_txc_bl_flq_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txc_bl_flq_occupancy), }, { .name = "UNC_M3_TXC_BL_SPEC_ARB_NEW_MSG", .code = 0x38, .desc = "BL speculative arb request due to new message arriving on a specific channel (MC/VN)", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txc_bl_spec_arb_new_msg, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txc_bl_spec_arb_new_msg), }, { .name = "UNC_M3_TXC_BL_SPEC_ARB_NO_OTHER_PEND", .code = 0x37, .desc = "BL speculative arb request asserted due to no other channel being active (have a valid entry but don't have credits to send)", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txc_bl_spec_arb_no_other_pend, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txc_bl_spec_arb_no_other_pend), }, { .name = "UNC_M3_TXR_HORZ_ADS_USED", .code = 0x9d, .desc = "Number of packets using the Horizontal Anti-Deadlock Slot, broken down by ring type and CMS Agent.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txr_horz_ads_used, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txr_horz_ads_used), }, { .name = "UNC_M3_TXR_HORZ_BYPASS", .code = 0x9f, .desc = "Number of packets bypassing the Horizontal Egress, broken down by ring type and CMS Agent.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txr_horz_bypass, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txr_horz_bypass), }, { .name = "UNC_M3_TXR_HORZ_CYCLES_FULL", .code = 0x96, .desc = "Cycles the Transgress buffers in the Common Mesh Stop are Full. 
The egress is used to queue up requests destined for the Horizontal Ring on the Mesh.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txr_horz_cycles_full, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txr_horz_cycles_full), }, { .name = "UNC_M3_TXR_HORZ_CYCLES_NE", .code = 0x97, .desc = "Cycles the Transgress buffers in the Common Mesh Stop are Not-Empty. The egress is used to queue up requests destined for the Horizontal Ring on the Mesh.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txr_horz_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txr_horz_cycles_ne), }, { .name = "UNC_M3_TXR_HORZ_INSERTS", .code = 0x95, .desc = "Number of allocations into the Transgress buffers in the Common Mesh Stop. The egress is used to queue up requests destined for the Horizontal Ring on the Mesh.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txr_horz_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txr_horz_inserts), }, { .name = "UNC_M3_TXR_HORZ_NACK", .code = 0x99, .desc = "Counts number of Egress packets NACKed on to the Horizontal Ring", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txr_horz_nack, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txr_horz_nack), }, { .name = "UNC_M3_TXR_HORZ_OCCUPANCY", .code = 0x94, .desc = "Occupancy event for the Transgress buffers in the Common Mesh Stop. The egress is used to queue up requests destined for the Horizontal Ring on the Mesh.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txr_horz_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txr_horz_occupancy), }, { .name = "UNC_M3_TXR_HORZ_STARVED", .code = 0x9b, .desc = "Counts injection starvation. 
This starvation is triggered when the CMS Transgress buffer cannot send a transaction onto the Horizontal ring for a long period of time.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txr_horz_starved, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txr_horz_starved), }, { .name = "UNC_M3_TXR_VERT_ADS_USED", .code = 0x9c, .desc = "Number of packets using the Vertical Anti-Deadlock Slot, broken down by ring type and CMS Agent.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txr_vert_ads_used, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txr_vert_ads_used), }, { .name = "UNC_M3_TXR_VERT_BYPASS", .code = 0x9e, .desc = "Number of packets bypassing the Vertical Egress, broken down by ring type and CMS Agent.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txr_vert_bypass, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txr_vert_bypass), }, { .name = "UNC_M3_TXR_VERT_CYCLES_FULL", .code = 0x92, .desc = "Number of cycles the Common Mesh Stop Egress was Full. The Egress is used to queue up requests destined for the Vertical Ring on the Mesh.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txr_vert_cycles_full, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txr_vert_cycles_full), }, { .name = "UNC_M3_TXR_VERT_CYCLES_NE", .code = 0x93, .desc = "Number of cycles the Common Mesh Stop Egress was Not Empty. The Egress is used to queue up requests destined for the Vertical Ring on the Mesh.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txr_vert_cycles_ne, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txr_vert_cycles_ne), }, { .name = "UNC_M3_TXR_VERT_INSERTS", .code = 0x91, .desc = "Number of allocations into the Common Mesh Stop Egress. 
The Egress is used to queue up requests destined for the Vertical Ring on the Mesh.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txr_vert_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txr_vert_inserts), }, { .name = "UNC_M3_TXR_VERT_NACK", .code = 0x98, .desc = "Counts number of Egress packets NACKed on to the Vertical Ring", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txr_vert_nack, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txr_vert_nack), }, { .name = "UNC_M3_TXR_VERT_OCCUPANCY", .code = 0x90, .desc = "Occupancy event for the Egress buffers in the Common Mesh Stop. The egress is used to queue up requests destined for the Vertical Ring on the Mesh.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txr_vert_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txr_vert_occupancy), }, { .name = "UNC_M3_TXR_VERT_STARVED", .code = 0x9a, .desc = "Counts injection starvation. This starvation is triggered when the CMS Egress cannot send a transaction onto the Vertical ring for a long period of time.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_txr_vert_starved, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_txr_vert_starved), }, { .name = "UNC_M3_UPI_PEER_AD_CREDITS_EMPTY", .code = 0x20, .desc = "No credits available to send to UPIs on the AD Ring", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_upi_peer_ad_credits_empty, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_upi_peer_ad_credits_empty), }, { .name = "UNC_M3_UPI_PEER_BL_CREDITS_EMPTY", .code = 0x21, .desc = "No credits available to send to UPI on the BL Ring (diff between non-SMI and SMI mode)", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_upi_peer_bl_credits_empty, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_upi_peer_bl_credits_empty), }, { .name = "UNC_M3_UPI_PREFETCH_SPAWN", .code = 0x29, .desc = "Count cases where FlowQ causes spawn of 
Prefetch to iMC/SMI3 target", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_M3_VERT_RING_AD_IN_USE", .code = 0xa6, .desc = "Counts the number of cycles that the Vertical AD ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_vert_ring_ad_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_vert_ring_ad_in_use), }, { .name = "UNC_M3_VERT_RING_AK_IN_USE", .code = 0xa8, .desc = "Counts the number of cycles that the Vertical AK ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. 
In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_vert_ring_ak_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_vert_ring_ak_in_use), }, { .name = "UNC_M3_VERT_RING_BL_IN_USE", .code = 0xaa, .desc = "Counts the number of cycles that the Vertical BL ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. We really have two rings -- a clockwise ring and a counter-clockwise ring. On the left side of the ring, the UP direction is on the clockwise ring and DN is on the counter-clockwise ring. On the right side of the ring, this is reversed. The first half of the CBos are on the left side of the ring, and the 2nd half are on the right side of the ring. In other words (for example), in a 4c part, Cbo 0 UP AD is NOT the same ring as CBo 2 UP AD because they are on opposite sides of the ring.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_vert_ring_bl_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_vert_ring_bl_in_use), }, { .name = "UNC_M3_VERT_RING_IV_IN_USE", .code = 0xac, .desc = "Counts the number of cycles that the Vertical IV ring is being used at this ring stop. This includes when packets are passing by and when packets are being sunk, but does not include when packets are being sent from the ring stop. There is only 1 IV ring. Therefore, if one wants to monitor the Even ring, they should select both UP_EVEN and DN_EVEN. 
To monitor the Odd ring, they should select both UP_ODD and DN_ODD.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_vert_ring_iv_in_use, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_vert_ring_iv_in_use), }, { .name = "UNC_M3_VN0_CREDITS_USED", .code = 0x5c, .desc = "Number of times a VN0 credit was used on the DRS message channel. In order for a request to be transferred across UPI, it must be guaranteed to have a flit buffer on the remote socket to sink into. There are two credit pools, VNA and VN0. VNA is a shared pool used to achieve high performance. The VN0 pool has reserved entries for each message class and is used to prevent deadlock. Requests first attempt to acquire a VNA credit, and then fall back to VN0 if they fail. This counts the number of times a VN0 credit was used. Note that a single VN0 credit holds access to potentially multiple flit buffers. For example, a transfer that uses VNA could use 9 flit buffers and in that case uses 9 credits. A transfer on VN0 will only count a single credit even though it may use multiple buffers.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_vn0_credits_used, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_vn0_credits_used), }, { .name = "UNC_M3_VN0_NO_CREDITS", .code = 0x5e, .desc = "Number of Cycles there were no VN0 Credits", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_vn0_no_credits, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_vn0_no_credits), }, { .name = "UNC_M3_VN1_CREDITS_USED", .code = 0x5d, .desc = "Number of times a VN1 credit was used on the WB message channel. In order for a request to be transferred across QPI, it must be guaranteed to have a flit buffer on the remote socket to sink into. There are two credit pools, VNA and VN1. VNA is a shared pool used to achieve high performance. The VN1 pool has reserved entries for each message class and is used to prevent deadlock. 
Requests first attempt to acquire a VNA credit, and then fall back to VN1 if they fail. This counts the number of times a VN1 credit was used. Note that a single VN1 credit holds access to potentially multiple flit buffers. For example, a transfer that uses VNA could use 9 flit buffers and in that case uses 9 credits. A transfer on VN1 will only count a single credit even though it may use multiple buffers.", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_vn1_credits_used, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_vn1_credits_used), }, { .name = "UNC_M3_VN1_NO_CREDITS", .code = 0x5f, .desc = "Number of Cycles there were no VN1 Credits", .modmsk = SKX_UNC_M3UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_m3_vn1_no_credits, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_m3_vn1_no_credits), }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_skx_unc_pcu_events.h000066400000000000000000000225071502707512200253150ustar00rootroot00000000000000/* * Copyright (c) 2017 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: skx_unc_pcu */ static intel_x86_umask_t skx_unc_p_power_state_occupancy[]={ { .uname = "CORES_C0", .ucode = 0x4000, .udesc = "Number of cores in C-State -- C0 and C1", }, { .uname = "CORES_C3", .ucode = 0x8000, .udesc = "Number of cores in C-State -- C3", }, { .uname = "CORES_C6", .ucode = 0xc000, .udesc = "Number of cores in C-State -- C6 and C7", }, }; static intel_x86_entry_t intel_skx_unc_p_pe[]={ { .name = "UNC_P_CLOCKTICKS", .code = 0x0, .desc = "The PCU runs off a fixed 1 GHz clock. This event counts the number of pclk cycles measured while the counter was enabled. The pclk, like the Memory Controller's dclk, counts at a constant rate making it a good measure of actual wall time.", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CORE_TRANSITION_CYCLES", .code = 0x60, .desc = "TBD", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CTS_EVENT0", .code = 0x11, .desc = "TBD", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_CTS_EVENT1", .code = 0x12, .desc = "TBD", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_DEMOTIONS", .code = 0x30, .desc = "TBD", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_FIVR_PS_PS0_CYCLES", .code = 0x75, .desc = "Cycles spent in phase-shedding power state 0", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_FIVR_PS_PS1_CYCLES", .code = 0x76, .desc = "Cycles spent in phase-shedding power state 1", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_FIVR_PS_PS2_CYCLES", .code = 0x77, .desc = "Cycles spent in phase-shedding power state 2", .modmsk = 
SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_FIVR_PS_PS3_CYCLES", .code = 0x78, .desc = "Cycles spent in phase-shedding power state 3", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_FREQ_MAX_LIMIT_THERMAL_CYCLES", .code = 0x4, .desc = "Counts the number of cycles when thermal conditions are the upper limit on frequency. This is related to the THERMAL_THROTTLE CYCLES_ABOVE_TEMP event, which always counts cycles when we are above the thermal temperature. This event (STRONGEST_UPPER_LIMIT) is sampled at the output of the algorithm that determines the actual frequency, while THERMAL_THROTTLE looks at the input.", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_FREQ_MAX_POWER_CYCLES", .code = 0x5, .desc = "Counts the number of cycles when power is the upper limit on frequency.", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_FREQ_MIN_IO_P_CYCLES", .code = 0x73, .desc = "Counts the number of cycles when IO P Limit is preventing us from dropping the frequency lower. This algorithm monitors the needs of the IO subsystem on both local and remote sockets and will maintain a frequency high enough to maintain good IO BW. This is necessary for when all the IA cores on a socket are idle but a user still would like to maintain high IO Bandwidth.", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_FREQ_TRANS_CYCLES", .code = 0x74, .desc = "Counts the number of cycles when the system is changing frequency. This cannot be filtered by thread ID. 
One can also use it with the occupancy counter that monitors number of threads in C0 to estimate the performance impact that frequency transitions had on the system.", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_MCP_PROCHOT_CYCLES", .code = 0x6, .desc = "TBD", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_MEMORY_PHASE_SHEDDING_CYCLES", .code = 0x2f, .desc = "Counts the number of cycles that the PCU has triggered memory phase shedding. This is a mode that can be run in the iMC physicals that saves power at the expense of additional latency.", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_PKG_RESIDENCY_C0_CYCLES", .code = 0x2a, .desc = "Counts the number of cycles when the package was in C0. This event can be used in conjunction with edge detect to count C0 entrances (or exits using invert). Residency events do not include transition times.", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_PKG_RESIDENCY_C2E_CYCLES", .code = 0x2b, .desc = "Counts the number of cycles when the package was in C2E. This event can be used in conjunction with edge detect to count C2E entrances (or exits using invert). Residency events do not include transition times.", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_PKG_RESIDENCY_C3_CYCLES", .code = 0x2c, .desc = "Counts the number of cycles when the package was in C3. This event can be used in conjunction with edge detect to count C3 entrances (or exits using invert). Residency events do not include transition times.", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_PKG_RESIDENCY_C6_CYCLES", .code = 0x2d, .desc = "Counts the number of cycles when the package was in C6. This event can be used in conjunction with edge detect to count C6 entrances (or exits using invert). 
Residency events do not include transition times.", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_PMAX_THROTTLED_CYCLES", .code = 0x7, .desc = "TBD", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_PROCHOT_EXTERNAL_CYCLES", .code = 0xa, .desc = "Counts the number of cycles that we are in external PROCHOT mode. This mode is triggered when a sensor off the die determines that something off-die (like DRAM) is too hot and must throttle to avoid damaging the chip.", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_PROCHOT_INTERNAL_CYCLES", .code = 0x9, .desc = "Counts the number of cycles that we are in internal PROCHOT mode. This mode is triggered when a sensor on the die determines that we are too hot and must throttle to avoid damaging the chip.", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_TOTAL_TRANSITION_CYCLES", .code = 0x72, .desc = "Number of cycles spent performing core C state transitions across all cores.", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_FREQ_BAND0_CYCLES", .desc = "Frequency Residency", .code = 0xb, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = SKX_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_BAND1_CYCLES", .desc = "Frequency Residency", .code = 0xc, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = SKX_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_BAND2_CYCLES", .desc = "Frequency Residency", .code = 0xd, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = SKX_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_BAND3_CYCLES", .desc = "Frequency Residency", .code = 0xe, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = SKX_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_VR_HOT_CYCLES", .code = 0x42, .desc = "TBD", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_P_POWER_STATE_OCCUPANCY", 
.code = 0x80, .desc = "This is an occupancy event that tracks the number of cores that are in the chosen C-State. It can be used by itself to get the average number of cores in that C-state with thresholding to generate histograms, or with other PCU events and occupancy triggering to capture other details.", .modmsk = SKX_UNC_PCU_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_p_power_state_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_p_power_state_occupancy), }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_skx_unc_ubo_events.h000066400000000000000000000066021502707512200253110ustar00rootroot00000000000000/* * Copyright (c) 2017 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: skx_unc_ubo */ static intel_x86_umask_t skx_unc_u_event_msg[]={ { .uname = "DOORBELL_RCVD", .ucode = 0x800, .udesc = "Message Received -- ", }, { .uname = "INT_PRIO", .ucode = 0x1000, .udesc = "Message Received -- ", }, { .uname = "IPI_RCVD", .ucode = 0x400, .udesc = "Message Received -- IPI", }, { .uname = "MSI_RCVD", .ucode = 0x200, .udesc = "Message Received -- MSI", }, { .uname = "VLW_RCVD", .ucode = 0x100, .udesc = "Message Received -- VLW", }, }; static intel_x86_umask_t skx_unc_u_phold_cycles[]={ { .uname = "ASSERT_TO_ACK", .ucode = 0x100, .udesc = "Cycles PHOLD Assert to Ack -- Assert to ACK", .uflags= INTEL_X86_DFL, }, }; static intel_x86_umask_t skx_unc_u_racu_drng[]={ { .uname = "PFTCH_BUF_EMPTY", .ucode = 0x400, .udesc = "TBD", }, { .uname = "RDRAND", .ucode = 0x100, .udesc = "TBD", }, { .uname = "RDSEED", .ucode = 0x200, .udesc = "TBD", }, }; static intel_x86_entry_t intel_skx_unc_u_pe[]={ { .name = "UNC_U_EVENT_MSG", .code = 0x42, .desc = "Virtual Logical Wire (legacy) message were received from Uncore.", .modmsk = SKX_UNC_UBO_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = skx_unc_u_event_msg, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_u_event_msg), }, { .name = "UNC_U_LOCK_CYCLES", .code = 0x44, .desc = "Number of times an IDI Lock/SplitLock sequence was started", .modmsk = SKX_UNC_UBO_ATTRS, .cntmsk = 0x3, }, { .name = "UNC_U_PHOLD_CYCLES", .code = 0x45, .desc = "PHOLD cycles.", .modmsk = SKX_UNC_UBO_ATTRS, .cntmsk = 0x3, .ngrp = 1, .umasks = skx_unc_u_phold_cycles, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_u_phold_cycles), }, { .name = "UNC_U_RACU_DRNG", .code = 0x4c, .desc = "TBD", .modmsk = SKX_UNC_UBO_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_u_racu_drng, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_u_racu_drng), }, { .name = "UNC_U_RACU_REQUESTS", .code = 0x46, .desc = "Number outstanding register requests within message channel tracker", .modmsk = SKX_UNC_UBO_ATTRS, .cntmsk = 0x3, }, }; 
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_skx_unc_upi_events.h000066400000000000000000001105461502707512200253240ustar00rootroot00000000000000/* * Copyright (c) 2017 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: skx_unc_upi */ static intel_x86_umask_t skx_unc_upi_direct_attempts[]={ { .uname = "D2C", .ucode = 0x100, .udesc = "Direct packet attempts -- Direct 2 Core", }, { .uname = "D2U", .ucode = 0x200, .udesc = "Direct packet attempts -- Direct 2 UPI", }, }; static intel_x86_umask_t skx_unc_upi_flowq_no_vna_crd[]={ { .uname = "AD_VNA_EQ0", .ucode = 0x100, .udesc = "TBD", }, { .uname = "AD_VNA_EQ1", .ucode = 0x200, .udesc = "TBD", }, { .uname = "AD_VNA_EQ2", .ucode = 0x400, .udesc = "TBD", }, { .uname = "AK_VNA_EQ0", .ucode = 0x1000, .udesc = "TBD", }, { .uname = "AK_VNA_EQ1", .ucode = 0x2000, .udesc = "TBD", }, { .uname = "AK_VNA_EQ2", .ucode = 0x4000, .udesc = "TBD", }, { .uname = "AK_VNA_EQ3", .ucode = 0x8000, .udesc = "TBD", }, { .uname = "BL_VNA_EQ0", .ucode = 0x800, .udesc = "TBD", }, }; static intel_x86_umask_t skx_unc_upi_m3_byp_blocked[]={ { .uname = "BGF_CRD", .ucode = 0x800, .udesc = "TBD", }, { .uname = "FLOWQ_AD_VNA_LE2", .ucode = 0x100, .udesc = "TBD", }, { .uname = "FLOWQ_AK_VNA_LE3", .ucode = 0x400, .udesc = "TBD", }, { .uname = "FLOWQ_BL_VNA_EQ0", .ucode = 0x200, .udesc = "TBD", }, { .uname = "GV_BLOCK", .ucode = 0x1000, .udesc = "TBD", }, }; static intel_x86_umask_t skx_unc_upi_m3_rxq_blocked[]={ { .uname = "BGF_CRD", .ucode = 0x2000, .udesc = "TBD", }, { .uname = "FLOWQ_AD_VNA_BTW_2_THRESH", .ucode = 0x200, .udesc = "TBD", }, { .uname = "FLOWQ_AD_VNA_LE2", .ucode = 0x100, .udesc = "TBD", }, { .uname = "FLOWQ_AK_VNA_LE3", .ucode = 0x1000, .udesc = "TBD", }, { .uname = "FLOWQ_BL_VNA_BTW_0_THRESH", .ucode = 0x800, .udesc = "TBD", }, { .uname = "FLOWQ_BL_VNA_EQ0", .ucode = 0x400, .udesc = "TBD", }, { .uname = "GV_BLOCK", .ucode = 0x4000, .udesc = "TBD", }, }; static intel_x86_umask_t skx_unc_upi_req_slot2_from_m3[]={ { .uname = "ACK", .ucode = 0x800, .udesc = "TBD", }, { .uname = "VN0", .ucode = 0x200, .udesc = "TBD", }, { .uname = "VN1", .ucode = 0x400, .udesc = "TBD", }, { .uname = "VNA", .ucode = 0x100, .udesc = "TBD", }, }; static 
intel_x86_umask_t skx_unc_upi_rxl_basic_hdr_match[]={ { .uname = "NCB", .ucode = 0xe00, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Bypass", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCB_OPC_NCWR", .ucode = 0x0e00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Bypass - NCWR", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCB_OPC_WCWR", .ucode = 0x1e00 | 1ULL << 32, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Bypass - WCWR", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCB_OPC_NCMSGB", .ucode = 0x8e00 | 1ULL << 32, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Bypass - NCMSGB", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCB_OPC_INTLOGICAL", .ucode = 0x9e00 | 1ULL << 32, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Bypass - INTLOGICAL", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCB_OPC_INTPHYSICAL", .ucode = 0xae00 | 1ULL << 32, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Bypass - INTPHYSICAL", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCB_OPC_INTPRIOUPD", .ucode = 0xbe00 | 1ULL << 32, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Bypass - INTPRIOUPD", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCB_OPC_NCWRPTL", .ucode = 0xce00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Bypass - NCWRPTL", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCB_OPC_NCP2PB", .ucode = 0xfe00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Bypass - NCP2PB", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCS", .ucode = 0xf00, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Standard", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCS_OPC_NCRD", 
.ucode = 0x0f00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Standard - NCRD", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCS_OPC_INTACK", .ucode = 0x1f00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Standard - INTACK", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCS_OPC_NCRDPTL", .ucode = 0x4f00 | 1ULL<<32, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Standard - NCRDPTL", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCS_OPC_NCCFGRD", .ucode = 0x5f00 | 1ULL<<32, .ufilters[0] = 1ULL, .udesc = "NCS - NCCFGRD", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCS_OPC_NCLTRD", .ucode = 0x6f00| 1ULL<<32, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Standard - NCLTRD", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCS_OPC_IORD", .ucode = 0x7f00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Standard - IORD", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCS_OPC_MSGS", .ucode = 0x8f00, .ufilters[0] = 1ULL, .udesc = "NCS - MSGS", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCS_OPC_CFGWR", .ucode = 0x9f00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Standard - CFGWR", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCS_OPC_LTWR", .ucode = 0xaf00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Standard - LTWR", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCS_OPC_NCIOWR", .ucode = 0xbf00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Standard - NCIOWR", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "NCS_OPC_NCP2PS", .ucode = 0xff00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Non-Coherent Standard - NCP2PS", .uflags = INTEL_X86_NCOMBO, .grpid 
= 0, }, { .uname = "REQ", .ucode = 0x800, .udesc = "Matches on Receive path of a UPI Port -- Request", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "REQ_OPC_INVITOE", .ucode = 0x7800, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Request Opcode - ITOE", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "REQ_OPC_RDINV", .ucode = 0xc800, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Request Opcode - ReadInv", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSPCNFLT", .ucode = 0xaa00, .udesc = "Matches on Receive path of a UPI Port -- Response - Conflict", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSPI", .ucode = 0x2a00, .udesc = "Matches on Receive path of a UPI Port -- Response - Invalid", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_DATA", .ucode = 0xc00, .udesc = "Matches on Receive path of a UPI Port -- Response - Data", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_DATA_OPC_DATA_M", .ucode = 0x0c00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Response - Data - DATA_M", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_DATA_OPC_DATA_E", .ucode = 0x1c00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Response - Data - DATA_E", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_DATA_OPC_DATA_SI", .ucode = 0x2c00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Response - Data - DATA_SI", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_DATA_OPC_DATA_M_CMPO", .ucode = 0x4c00, .ufilters[0] = 1ULL, .udesc = "RSP4 - DATA_M_CMPO", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_DATA_OPC_DATA_E_CMPO", .ucode = 0x5c00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Response - Data - DATA_E_CMPO", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_DATA_OPC_DATA_SI_CMPO", .ucode = 0x6c00, .ufilters[0] = 
1ULL, .udesc = "Matches on Receive path of a UPI Port -- Response - Data - DATA_SI_CMPO", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_DATA_OPC_RSPFWDIWB", .ucode = 0xac00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Response - Data - RSPFWDIWB", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_DATA_OPC_RSPFWDSWB", .ucode = 0xbc00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Response - Data - RSPFWDSWB", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_DATA_OPC_RSPIWB", .ucode = 0xcc00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Response - Data - RSPIWB", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_DATA_OPC_RSPSWB", .ucode = 0xdc00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Response - Data - RSPSWB", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_DATA_OPC_DEBUG_DATA", .ucode = 0xfc00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Response - Data - DEBUGDATA", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_NODATA", .ucode = 0xa00, .udesc = "Matches on Receive path of a UPI Port -- Response - No Data", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_NODATA_OPC_FWDS", .ucode = 0x6a00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Response - No Data - FWDS", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_NODATA_OPC_MIRCMPU", .ucode = 0x8a00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Response - No Data - MIRCMPU", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_NODATA_OPC_CNFLT", .ucode = 0xaa00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Response - No Data - CNFLT", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_NODATA_OPC_FWDCNFLTO", .ucode = 0xda00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Response 
- No Data - FWDCNFLTO", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "RSP_NODATA_OPC_CMPO", .ucode = 0xca00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Response - No Data - CMPO", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "SNP", .ucode = 0x900, .udesc = "Matches on Receive path of a UPI Port -- Snoop", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "SNP_OPC_FCUR", .ucode = 0x8900, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Snoop Opcode - FCUR", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "SNP_OPC_FCODE", .ucode = 0x9900, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Snoop Opcode - FCODE", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "SNP_OPC_FDATA", .ucode = 0xa900, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Snoop Opcode - FDATA", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "SNP_OPC_FDATAMIG", .ucode = 0xb900, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Snoop Opcode - FDATAMIG", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "SNP_OPC_FINVOWN", .ucode = 0xc900, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Snoop Opcode - FINVOWN", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "SNP_OPC_FINV", .ucode = 0xd900, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Snoop Opcode - FINV", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "WB", .ucode = 0xd00, .udesc = "Matches on Receive path of a UPI Port -- Writeback", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "WB_OPC_WBMTOI", .ucode = 0x0d00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Writeback - MTOI", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "WB_OPC_WBMTOS", .ucode = 0x1d00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Writeback - MTOS", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { 
.uname = "WB_OPC_WBMTOE", .ucode = 0x2d00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Writeback - MTOE", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "WB_OPC_NONSNPWR", .ucode = 0x3d00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Writeback - NONSNPWR", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "WB_OPC_MTOIPTL", .ucode = 0x4d00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Writeback - MTOIPTL", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "WB_OPC_MTOEPTL", .ucode = 0x6d00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Writeback - MTOEPTL", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "WB_OPC_NONSNPWRTL", .ucode = 0x6d00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Writeback - NONSNPWRTL", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "WB_OPC_PUSHMTOI", .ucode = 0x8d00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Writeback - PUSHMTOI", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "WB_OPC_FLUSH", .ucode = 0xbd00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Writeback - FLUSH", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "WB_OPC_EVCTCLN", .ucode = 0xcd00, .ufilters[0] = 1ULL, .udesc = "Matches on Receive path of a UPI Port -- Writeback - EVCTCLN", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "WB_OPC_NONSNPRD", .ucode = 0xdd00, .ufilters[0] = 1ULL, .udesc = "WB - NONSNPRD", .uflags = INTEL_X86_NCOMBO, .grpid = 0, }, { .uname = "FILT_NONE", .ucode = 0x0000, .udesc = "No extra filter", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, { .uname = "FILT_LOCAL", .ucode = 0x0000, .ufilters[0] = 1ULL << 1, .udesc = "Filter packets targeting this socket", .grpid = 1, }, { .uname = "FILT_REMOTE", .ucode = 0x0000, .ufilters[0] = 1ULL << 2, .udesc = "Filter packets targeting another socket", .grpid = 1, }, { 
.uname = "FILT_DATA", .ucode = 0x0000, .ufilters[0] = 1ULL << 3, .udesc = "Filter on Data packets (mutually exclusive with FILT_NON_DATA)", .grpid = 1, }, { .uname = "FILT_NON_DATA", .ucode = 0x0000, .ufilters[0] = 1ULL << 4, .udesc = "Filter on non-Data packets (mutually exclusive with FILT_DATA)", .grpid = 1, }, { .uname = "FILT_DUAL_SLOT", .ucode = 0x0000, .ufilters[0] = 1ULL << 5, .udesc = "Filter on dual-slot packets (mutually exclusive with FILT_SINGLE_SLOT)", .grpid = 1, }, { .uname = "FILT_SINGLE_SLOT", .ucode = 0x0000, .ufilters[0] = 1ULL << 6, .udesc = "Filter on single-slot packets (mutually exclusive with FILT_DUAL_SLOT)", .grpid = 1, }, { .uname = "FILT_ISOCH", .ucode = 0x0000, .ufilters[0] = 1ULL << 7, .udesc = "Filter on isochronous packets", .grpid = 1, }, { .uname = "FILT_SLOT0", .ucode = 0x0000, .ufilters[0] = 1ULL << 19, .udesc = "Filter on slot0 packets", .grpid = 1, }, { .uname = "FILT_SLOT1", .ucode = 0x0000, .ufilters[0] = 1ULL << 20, .udesc = "Filter on slot1 packets", .grpid = 1, }, { .uname = "FILT_SLOT2", .ucode = 0x0000, .ufilters[0] = 1ULL << 21, .udesc = "Filter on slot2 packets", .grpid = 1, }, { .uname = "FILT_LLCRD_NON_ZERO", .ucode = 0x0000, .ufilters[0] = 1ULL << 22, .udesc = "Filter on LLCRD nonzero (only applies to slot2 with opcode match)", .grpid = 1, }, { .uname = "FILT_IMPL_NULL", .ucode = 0x0000, .ufilters[0] = 1ULL << 23, .udesc = "Filter on implied NULL (only applies to slot2 with opcode match)", .grpid = 1, }, }; static intel_x86_umask_t skx_unc_upi_rxl_bypassed[]={ { .uname = "SLOT0", .ucode = 0x100, .udesc = "RxQ Flit Buffer Bypassed -- Slot 0", }, { .uname = "SLOT1", .ucode = 0x200, .udesc = "RxQ Flit Buffer Bypassed -- Slot 1", }, { .uname = "SLOT2", .ucode = 0x400, .udesc = "RxQ Flit Buffer Bypassed -- Slot 2", }, }; static intel_x86_umask_t skx_unc_upi_rxl_flits[]={ { .uname = "ALL_DATA", .ucode = 0xf00, .udesc = "Valid Flits Received -- All Data", .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = 
"ALL_NULL", .ucode = 0x2700, .udesc = "Valid Flits Received -- All Null Slots", .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA", .ucode = 0x800, .udesc = "Valid Flits Received -- Data", }, { .uname = "IDLE", .ucode = 0x4700, .udesc = "Valid Flits Received -- Idle", .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLCRD", .ucode = 0x1000, .udesc = "Valid Flits Received -- LLCRD Not Empty", .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLCTRL", .ucode = 0x4000, .udesc = "Valid Flits Received -- LLCTRL", .uflags = INTEL_X86_NCOMBO, }, { .uname = "NON_DATA", .ucode = 0x9700, .udesc = "Valid Flits Received -- All Non Data", .uflags = INTEL_X86_NCOMBO, }, { .uname = "NULL", .ucode = 0x2000, .udesc = "Valid Flits Received -- Slot NULL or LLCRD Empty", }, { .uname = "PROTHDR", .ucode = 0x8000, .udesc = "Valid Flits Received -- Protocol Header", }, { .uname = "SLOT0", .ucode = 0x100, .udesc = "Valid Flits Received -- Slot 0", }, { .uname = "SLOT1", .ucode = 0x200, .udesc = "Valid Flits Received -- Slot 1", }, { .uname = "SLOT2", .ucode = 0x400, .udesc = "Valid Flits Received -- Slot 2", }, }; static intel_x86_umask_t skx_unc_upi_rxl_inserts[]={ { .uname = "SLOT0", .ucode = 0x100, .udesc = "RxQ Flit Buffer Allocations -- Slot 0", }, { .uname = "SLOT1", .ucode = 0x200, .udesc = "RxQ Flit Buffer Allocations -- Slot 1", }, { .uname = "SLOT2", .ucode = 0x400, .udesc = "RxQ Flit Buffer Allocations -- Slot 2", }, }; static intel_x86_umask_t skx_unc_upi_rxl_occupancy[]={ { .uname = "SLOT0", .ucode = 0x100, .udesc = "RxQ Occupancy - All Packets -- Slot 0", }, { .uname = "SLOT1", .ucode = 0x200, .udesc = "RxQ Occupancy - All Packets -- Slot 1", }, { .uname = "SLOT2", .ucode = 0x400, .udesc = "RxQ Occupancy - All Packets -- Slot 2", }, }; static intel_x86_umask_t skx_unc_upi_rxl_slot_bypass[]={ { .uname = "S0_RXQ1", .ucode = 0x100, .udesc = "TBD", }, { .uname = "S0_RXQ2", .ucode = 0x200, .udesc = "TBD", }, { .uname = "S1_RXQ0", .ucode = 0x400, .udesc = "TBD", }, { .uname = "S1_RXQ2", .ucode = 
0x800, .udesc = "TBD", }, { .uname = "S2_RXQ0", .ucode = 0x1000, .udesc = "TBD", }, { .uname = "S2_RXQ1", .ucode = 0x2000, .udesc = "TBD", }, }; static intel_x86_umask_t skx_unc_upi_txl0p_clk_active[]={ { .uname = "CFG_CTL", .ucode = 0x100, .udesc = "TBD", }, { .uname = "DFX", .ucode = 0x4000, .udesc = "TBD", }, { .uname = "RETRY", .ucode = 0x2000, .udesc = "TBD", }, { .uname = "RXQ", .ucode = 0x200, .udesc = "TBD", }, { .uname = "RXQ_BYPASS", .ucode = 0x400, .udesc = "TBD", }, { .uname = "RXQ_CRED", .ucode = 0x800, .udesc = "TBD", }, { .uname = "SPARE", .ucode = 0x8000, .udesc = "TBD", }, { .uname = "TXQ", .ucode = 0x1000, .udesc = "TBD", }, }; static intel_x86_umask_t skx_unc_upi_txl_flits[]={ { .uname = "ALL_DATA", .ucode = 0xf00, .udesc = "Valid Flits Sent -- All Data", }, { .uname = "ALL_NULL", .ucode = 0x2700, .udesc = "Valid Flits Sent -- All Null Slots", }, { .uname = "DATA", .ucode = 0x800, .udesc = "Valid Flits Sent -- Data", }, { .uname = "IDLE", .ucode = 0x4700, .udesc = "Valid Flits Sent -- Idle", }, { .uname = "LLCRD", .ucode = 0x1000, .udesc = "Valid Flits Sent -- LLCRD Not Empty", }, { .uname = "LLCTRL", .ucode = 0x4000, .udesc = "Valid Flits Sent -- LLCTRL", }, { .uname = "NON_DATA", .ucode = 0x9700, .udesc = "Valid Flits Sent -- All Non Data", }, { .uname = "NULL", .ucode = 0x2000, .udesc = "Valid Flits Sent -- Slot NULL or LLCRD Empty", }, { .uname = "PROTHDR", .ucode = 0x8000, .udesc = "Valid Flits Sent -- Protocol Header", }, { .uname = "SLOT0", .ucode = 0x100, .udesc = "Valid Flits Sent -- Slot 0", }, { .uname = "SLOT1", .ucode = 0x200, .udesc = "Valid Flits Sent -- Slot 1", }, { .uname = "SLOT2", .ucode = 0x400, .udesc = "Valid Flits Sent -- Slot 2", }, }; static intel_x86_entry_t intel_skx_unc_upi_pe[]={ { .name = "UNC_UPI_CLOCKTICKS", .code = 0x1, .desc = "Counts the number of clocks in the UPI LL. This clock runs at 1/8th the GT/s speed of the UPI link. For example, an 8GT/s link will have a qfclk of 1GHz. 
Current products do not support dynamic link speeds, so this frequency is fixed.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_DIRECT_ATTEMPTS", .code = 0x12, .desc = "Counts the number of Data Response (DRS) packets UPI attempted to send directly to the core or to a different UPI link. Note: This only counts attempts on valid candidates such as DRS packets destined for CHAs.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_upi_direct_attempts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_upi_direct_attempts), }, { .name = "UNC_UPI_FLOWQ_NO_VNA_CRD", .code = 0x18, .desc = "TBD", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_upi_flowq_no_vna_crd, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_upi_flowq_no_vna_crd), }, { .name = "UNC_UPI_L1_POWER_CYCLES", .code = 0x21, .desc = "Number of UPI qfclk cycles spent in L1 power mode. L1 is a mode that totally shuts down a UPI link. Use edge detect to count the number of instances when the UPI link entered L1. Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another. 
Because L1 totally shuts down the link, it takes a good amount of time to exit this mode.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_M3_BYP_BLOCKED", .code = 0x14, .desc = "TBD", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_upi_m3_byp_blocked, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_upi_m3_byp_blocked), }, { .name = "UNC_UPI_M3_CRD_RETURN_BLOCKED", .code = 0x16, .desc = "TBD", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_M3_RXQ_BLOCKED", .code = 0x15, .desc = "TBD", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_upi_m3_rxq_blocked, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_upi_m3_rxq_blocked), }, { .name = "UNC_UPI_PHY_INIT_CYCLES", .code = 0x20, .desc = "TBD", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_POWER_L1_NACK", .code = 0x23, .desc = "Counts the number of times a link sends/receives a LinkReqNAck. When the UPI links would like to change power state, the Tx side initiates a request to the Rx side requesting to change states. These requests can either be accepted or denied. If the Rx side replies with an Ack, the power mode will change. If it replies with NAck, no change will take place. This can be filtered based on Rx and Tx. An Rx LinkReqNAck refers to receiving an NAck (meaning this agent's Tx originally requested the power change). A Tx LinkReqNAck refers to sending this command (meaning the peer agent's Tx originally requested the power change and this agent accepted it).", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_POWER_L1_REQ", .code = 0x22, .desc = "Counts the number of times a link sends/receives a LinkReqAck. When the UPI links would like to change power state, the Tx side initiates a request to the Rx side requesting to change states. These requests can either be accepted or denied. If the Rx side replies with an Ack, the power mode will change. If it replies with NAck, no change will take place. 
This can be filtered based on Rx and Tx. An Rx LinkReqAck refers to receiving an Ack (meaning this agent's Tx originally requested the power change). A Tx LinkReqAck refers to sending this command (meaning the peer agent's Tx originally requested the power change and this agent accepted it).", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_REQ_SLOT2_FROM_M3", .code = 0x46, .desc = "TBD", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_upi_req_slot2_from_m3, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_upi_req_slot2_from_m3), }, { .name = "UNC_UPI_RXL0P_POWER_CYCLES", .code = 0x25, .desc = "Number of UPI qfclk cycles spent in L0p power mode. L0p is a mode where we disable 60% of the UPI lanes, decreasing our bandwidth in order to save power. It increases snoop and data transfer latencies and decreases overall bandwidth. This mode can be very useful in NUMA optimized workloads that largely only utilize UPI for snoops and their responses. Use edge detect to count the number of instances when the UPI link entered L0p. Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_RXL0_POWER_CYCLES", .code = 0x24, .desc = "Number of UPI qfclk cycles spent in L0 power mode in the Link Layer. L0 is the default mode which provides the highest performance with the most power. Use edge detect to count the number of instances that the link entered L0. Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another. 
The phy layer sometimes leaves L0 for training, which will not be captured by this event.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_RXL_BASIC_HDR_MATCH", .code = 0x5, .desc = "TBD", .modmsk = SKX_UNC_UPI_OPC_ATTRS, .flags = INTEL_X86_FILT_UMASK | INTEL_X86_FORCE_FILT0, /* filter may be encoded in umask, filter encoding must be passed even if 0 */ .cntmsk = 0xf, .ngrp = 2, .umasks = skx_unc_upi_rxl_basic_hdr_match, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_upi_rxl_basic_hdr_match), }, { .name = "UNC_UPI_RXL_BYPASSED", .code = 0x31, .desc = "Counts the number of times that an incoming flit was able to bypass the flit buffer and pass directly into the Egress. This is a latency optimization, and should generally be the common case. If this value is less than the number of flits transferred, it implies that there was queueing getting onto the ring, and thus the transactions saw higher latency.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_upi_rxl_bypassed, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_upi_rxl_bypassed), }, { .name = "UNC_UPI_RXL_CREDITS_CONSUMED_VN0", .code = 0x39, .desc = "Counts the number of times that an RxQ VN0 credit was consumed (i.e. message uses a VN0 credit for the Rx Buffer). This includes packets that went through the RxQ and those that were bypassed.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_RXL_CREDITS_CONSUMED_VN1", .code = 0x3a, .desc = "Counts the number of times that an RxQ VN1 credit was consumed (i.e. message uses a VN1 credit for the Rx Buffer). This includes packets that went through the RxQ and those that were bypassed.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_RXL_CREDITS_CONSUMED_VNA", .code = 0x38, .desc = "Counts the number of times that an RxQ VNA credit was consumed (i.e. message uses a VNA credit for the Rx Buffer). 
This includes packets that went through the RxQ and those that were bypassed.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_RXL_FLITS", .code = 0x3, .desc = "Shows legal flit time (hides impact of L0p and L0c).", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_upi_rxl_flits, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_upi_rxl_flits), }, { .name = "UNC_UPI_RXL_INSERTS", .code = 0x30, .desc = "Number of allocations into the UPI Rx Flit Buffer. Generally, when data is transmitted across UPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_upi_rxl_inserts, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_upi_rxl_inserts), }, { .name = "UNC_UPI_RXL_OCCUPANCY", .code = 0x32, .desc = "Accumulates the number of elements in the UPI RxQ in each cycle. Generally, when data is transmitted across UPI, it will bypass the RxQ and pass directly to the ring interface. If things back up getting transmitted onto the ring, however, it may need to allocate into this buffer, thus increasing the latency. 
This event can be used in conjunction with the Flit Buffer Not Empty event to calculate average occupancy, or with the Flit Buffer Allocations event to track average lifetime.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_upi_rxl_occupancy, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_upi_rxl_occupancy), }, { .name = "UNC_UPI_RXL_SLOT_BYPASS", .code = 0x33, .desc = "TBD", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_upi_rxl_slot_bypass, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_upi_rxl_slot_bypass), }, { .name = "UNC_UPI_TXL0P_CLK_ACTIVE", .code = 0x2a, .desc = "TBD", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_upi_txl0p_clk_active, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_upi_txl0p_clk_active), }, { .name = "UNC_UPI_TXL0P_POWER_CYCLES", .code = 0x27, .desc = "Number of UPI qfclk cycles spent in L0p power mode. L0p is a mode where we disable 60% of the UPI lanes, decreasing our bandwidth in order to save power. It increases snoop and data transfer latencies and decreases overall bandwidth. This mode can be very useful in NUMA optimized workloads that largely only utilize UPI for snoops and their responses. Use edge detect to count the number of instances when the UPI link entered L0p. Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_TXL0P_POWER_CYCLES_LL_ENTER", .code = 0x28, .desc = "TBD", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_TXL0P_POWER_CYCLES_M3_EXIT", .code = 0x29, .desc = "TBD", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_TXL0_POWER_CYCLES", .code = 0x26, .desc = "Number of UPI qfclk cycles spent in L0 power mode in the Link Layer. L0 is the default mode which provides the highest performance with the most power. Use edge detect to count the number of instances that the link entered L0. 
Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another. The phy layer sometimes leaves L0 for training, which will not be captured by this event.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_TXL_BASIC_HDR_MATCH", .code = 0x4, .desc = "TBD", .modmsk = SKX_UNC_UPI_OPC_ATTRS, .flags = INTEL_X86_FILT_UMASK | INTEL_X86_FORCE_FILT0, /* filter may be encoded in umask, filter encoding must be passed even if 0 */ .cntmsk = 0xf, .ngrp = 2, .umasks = skx_unc_upi_rxl_basic_hdr_match, /* shared */ .numasks= LIBPFM_ARRAY_SIZE(skx_unc_upi_rxl_basic_hdr_match), }, { .name = "UNC_UPI_TXL_BYPASSED", .code = 0x41, .desc = "Counts the number of times that an incoming flit was able to bypass the Tx flit buffer and pass directly out the UPI Link. Generally, when data is transmitted across UPI, it will bypass the TxQ and pass directly to the link. However, the TxQ will be used with L0p and when LLR occurs, increasing latency to transfer out to the link.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_TXL_FLITS", .code = 0x2, .desc = "Shows legal flit time (hides impact of L0p and L0c).", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, .ngrp = 1, .umasks = skx_unc_upi_txl_flits, .numasks= LIBPFM_ARRAY_SIZE(skx_unc_upi_txl_flits), }, { .name = "UNC_UPI_TXL_INSERTS", .code = 0x40, .desc = "Number of allocations into the UPI Tx Flit Buffer. Generally, when data is transmitted across UPI, it will bypass the TxQ and pass directly to the link. However, the TxQ will be used with L0p and when LLR occurs, increasing latency to transfer out to the link. This event can be used in conjunction with the Flit Buffer Occupancy event in order to calculate the average flit buffer lifetime.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_TXL_OCCUPANCY", .code = 0x42, .desc = "Accumulates the number of flits in the TxQ. 
Generally, when data is transmitted across UPI, it will bypass the TxQ and pass directly to the link. However, the TxQ will be used with L0p and when LLR occurs, increasing latency to transfer out to the link. This can be used with the cycles not empty event to track average occupancy, or the allocations event to track average lifetime in the TxQ.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_VNA_CREDIT_RETURN_BLOCKED_VN01", .code = 0x45, .desc = "TBD", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, { .name = "UNC_UPI_VNA_CREDIT_RETURN_OCCUPANCY", .code = 0x44, .desc = "Number of VNA credits in the Rx side that are waiting to be returned back across the link.", .modmsk = SKX_UNC_UPI_ATTRS, .cntmsk = 0xf, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_slm_events.h000066400000000000000000000734011502707512200235660ustar00rootroot00000000000000/* * Copyright (c) 2013 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: slm (Intel Silvermont) */ static const intel_x86_umask_t slm_icache[]={ { .uname = "ACCESSES", .udesc = "Instruction fetches, including uncacheable fetches", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "MISSES", .udesc = "Count all instruction fetches that miss the icache or produce memory requests. This includes uncacheable fetches. Any instruction fetch miss is counted only once and not once for every cycle it is outstanding", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "HIT", .udesc = "Count all instruction fetches from the instruction cache", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t slm_uops_retired[]={ { .uname = "ANY", .udesc = "Micro-ops retired", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "MS", .udesc = "Micro-ops retired that were supplied from the MSROM", .ucode = 0x0100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "STALLED_CYCLES", .udesc = "Cycles with no micro-ops retired", .ucode = 0x1000 | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "STALLS", .udesc = "Periods with no micro-ops retired", .ucode = 0x1000 | INTEL_X86_MOD_EDGE | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, }; static const intel_x86_umask_t slm_inst_retired[]={ { .uname = "ANY_P", .udesc = "Instructions retired using generic counter (precise event)", .ucode = 0x0, .uflags= INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "ANY",
.udesc = "Instructions retired using generic counter (precise event)", .uequiv = "ANY_P", .ucode = 0x0, .uflags= INTEL_X86_PEBS, }, }; static const intel_x86_umask_t slm_l2_reject_xq[]={ { .uname = "ALL", .udesc = "Number of demand and prefetch transactions that the L2 XQ rejects due to a full or near full condition which likely indicates back pressure from the IDI link. The XQ may reject transactions from the L2Q (non-cacheable requests), BBS (L2 misses) and WOB (L2 write-back victims)", .ucode = 0x000, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t slm_machine_clears[]={ { .uname = "SMC", .udesc = "Self-Modifying Code detected", .ucode = 0x100, .uflags= INTEL_X86_DFL, }, { .uname = "MEMORY_ORDERING", .udesc = "Number of stalled cycles due to memory ordering", .ucode = 0x200, }, { .uname = "FP_ASSIST", .udesc = "Number of stalled cycles due to FPU assist", .ucode = 0x400, }, { .uname = "ALL", .udesc = "Count all machine clears", .ucode = 0x800, }, { .uname = "ANY", .udesc = "Count all machine clears", .uequiv = "ALL", .ucode = 0x800, }, }; static const intel_x86_umask_t slm_br_inst_retired[]={ { .uname = "ANY", .udesc = "Any retired branch instruction (Precise Event)", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_PEBS, }, { .uname = "ALL_BRANCHES", .udesc = "Any retired branch instruction (Precise Event)", .uequiv = "ANY", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_TAKEN_BRANCHES", .udesc = "Retired branch instructions (Precise Event)", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .grpid = 0, .ucntmsk = 0xfull, }, { .uname = "JCC", .udesc = "JCC instructions retired (Precise Event)", .ucode = 0x7e00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "TAKEN_JCC", .udesc = "Taken JCC instructions retired (Precise Event)", .ucode = 0xfe00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "CALL", .udesc = "Near call instructions retired (Precise Event)",
.ucode = 0xf900, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REL_CALL", .udesc = "Near relative call instructions retired (Precise Event)", .ucode = 0xfd00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "IND_CALL", .udesc = "Near indirect call instructions retired (Precise Event)", .ucode = 0xfb00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RETURN", .udesc = "Near ret instructions retired (Precise Event)", .ucode = 0xf700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NON_RETURN_IND", .udesc = "Number of near indirect jmp and near indirect call instructions retired (Precise Event)", .ucode = 0xeb00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "FAR_BRANCH", .udesc = "Far branch instructions retired (Precise Event)", .uequiv = "FAR", .ucode = 0xbf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "FAR", .udesc = "Far branch instructions retired (Precise Event)", .ucode = 0xbf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t slm_baclears[]={ { .uname = "ANY", .udesc = "BACLEARS asserted", .uequiv = "ALL", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL", .udesc = "BACLEARS asserted", .ucode = 0x100, .uflags= INTEL_X86_DFL | INTEL_X86_NCOMBO, }, { .uname = "RETURN", .udesc = "Number of baclears for return branches", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "COND", .udesc = "Number of baclears for conditional branches", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t slm_cpu_clk_unhalted[]={ { .uname = "CORE_P", .udesc = "Core cycles when core is not halted", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "BUS", .udesc = "Bus cycles when core is not halted. This event can give a measurement of the elapsed time. 
This event has a constant ratio with the CPU_CLK_UNHALTED:REF event, which is the maximum bus to processor frequency ratio", .uequiv = "REF", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REF", .udesc = "Number of reference cycles that the core is not in a halted state. The core enters the halted state when it is running the HLT instruction. In mobile systems, the core frequency may change from time to time. This event is not affected by core frequency changes but counts as if the core is running at the same maximum frequency all the time", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t slm_mem_uop_retired[]={ { .uname = "LD_DCU_MISS", .udesc = "Number of load uops retired that miss in L1 data cache. Note that prefetch misses will not be counted", .ucode = 0x100, }, { .uname = "LD_L2_HIT", .udesc = "Number of load uops retired that hit L2 (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_PEBS, }, { .uname = "LD_L2_MISS", .udesc = "Number of load uops retired that missed L2 (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_PEBS, }, { .uname = "LD_DTLB_MISS", .udesc = "Number of load uops retired that had a DTLB miss (Precise Event)", .ucode = 0x800, .uflags= INTEL_X86_PEBS, }, { .uname = "LD_UTLB_MISS", .udesc = "Number of load uops retired that had a UTLB miss", .ucode = 0x1000, }, { .uname = "HITM", .udesc = "Number of load uops retired that got data from the other core or from the other module and the line was modified (Precise Event)", .ucode = 0x2000, .uflags= INTEL_X86_PEBS, }, { .uname = "ANY_LD", .udesc = "Number of load uops retired", .ucode = 0x4000, }, { .uname = "ANY_ST", .udesc = "Number of store uops retired", .ucode = 0x8000, }, }; static const intel_x86_umask_t slm_llc_rqsts[]={ { .uname = "MISS", .udesc = "Number of L2 cache misses", .ucode = 0x4100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Number of L2 cache references", .ucode = 0x4f00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, },
}; static const intel_x86_umask_t slm_rehabq[]={ { .uname = "LD_BLOCK_ST_FORWARD", .udesc = "Number of retired loads that were prohibited from receiving forwarded data from the store because of address mismatch (Precise Event)", .ucode = 0x0100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LD_BLOCK_STD_NOTREADY", .udesc = "Number of times forward was technically possible but did not occur because the store data was not available at the right time", .ucode = 0x0200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ST_SPLITS", .udesc = "Number of retired stores that experienced cache line boundary splits", .ucode = 0x0400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LD_SPLITS", .udesc = "Number of retired loads that experienced cache line boundary splits (Precise Event)", .ucode = 0x0800, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOCK", .udesc = "Number of retired memory operations with lock semantics. These are either implicit locked instructions such as XCHG or instructions with an explicit LOCK prefix", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "STA_FULL", .udesc = "Number of retired stores that are delayed because there is not a store address buffer available", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_LD", .udesc = "Number of load uops reissued from RehabQ", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_ST", .udesc = "Number of store uops reissued from RehabQ", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t slm_offcore_response[]={ { .uname = "DMND_DATA_RD", .udesc = "Request: number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. 
Does not count L2 data read prefetches or instruction fetches", .ucode = 1ULL << (0 + 8), .grpid = 0, }, { .uname = "DMND_RFO", .udesc = "Request: number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO prefetches", .ucode = 1ULL << (1 + 8), .grpid = 0, }, { .uname = "DMND_IFETCH", .udesc = "Request: number of demand and DCU prefetch instruction cacheline reads. Does not count L2 code read prefetches", .ucode = 1ULL << (2 + 8), .grpid = 0, }, { .uname = "WB", .udesc = "Request: number of writebacks (modified to exclusive) transactions", .ucode = 1ULL << (3 + 8), .grpid = 0, }, { .uname = "PF_L2_DATA_RD", .udesc = "Request: number of data cacheline reads generated by L2 prefetchers", .ucode = 1ULL << (4 + 8), .grpid = 0, }, { .uname = "PF_RFO", .udesc = "Request: number of RFO requests generated by L2 prefetchers", .ucode = 1ULL << (5 + 8), .grpid = 0, }, { .uname = "PF_IFETCH", .udesc = "Request: number of code reads generated by L2 prefetchers", .ucode = 1ULL << (6 + 8), .grpid = 0, }, { .uname = "PARTIAL_READ", .udesc = "Request: number of demand reads of partial cachelines (including UC, WC)", .ucode = 1ULL << (7 + 8), .grpid = 0, }, { .uname = "PARTIAL_WRITE", .udesc = "Request: number of demand RFO requests to write to partial cache lines (includes UC, WT, WP)", .ucode = 1ULL << (8 + 8), .grpid = 0, }, { .uname = "UC_IFETCH", .udesc = "Request: number of UC instruction fetches", .ucode = 1ULL << (9 + 8), .grpid = 0, }, { .uname = "BUS_LOCKS", .udesc = "Request: number of bus lock and split lock requests", .ucode = 1ULL << (10 + 8), .grpid = 0, }, { .uname = "STRM_ST", .udesc = "Request: number of streaming store requests", .ucode = 1ULL << (11 + 8), .grpid = 0, }, { .uname = "SW_PREFETCH", .udesc = "Request: number of software prefetch requests", .ucode = 1ULL << (12 + 8), .grpid = 0, }, { .uname = "PF_L1_DATA_RD", .udesc = "Request: number of data cacheline reads generated by L1
prefetchers", .ucode = 1ULL << (13 + 8), .grpid = 0, }, { .uname = "PARTIAL_STRM_ST", .udesc = "Request: number of partial streaming store requests", .ucode = 1ULL << (14 + 8), .grpid = 0, }, { .uname = "OTHER", .udesc = "Request: counts any other request that crosses IDI, including I/O", .ucode = 1ULL << (15+8), .grpid = 0, }, { .uname = "ANY_IFETCH", .udesc = "Request: combination of PF_IFETCH | DMND_IFETCH | UC_IFETCH", .uequiv = "PF_IFETCH:DMND_IFETCH:UC_IFETCH", .ucode = (1ULL << 6 | 1ULL << 2 | 1ULL << 9) << 8, .grpid = 0, }, { .uname = "ANY_REQUEST", .udesc = "Request: combination of all request umasks", .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_IFETCH:WB:PF_L2_DATA_RD:PF_RFO:PF_IFETCH:PARTIAL_READ:PARTIAL_WRITE:UC_IFETCH:BUS_LOCKS:STRM_ST:SW_PREFETCH:PF_L1_DATA_RD:PARTIAL_STRM_ST:OTHER", .ucode = 0xffff00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "ANY_DATA", .udesc = "Request: combination of DMND_DATA | PF_L1_DATA_RD | PF_L2_DATA_RD", .uequiv = "DMND_DATA_RD:PF_L1_DATA_RD:PF_L2_DATA_RD", .ucode = (1ULL << 0 | 1ULL << 4 | 1ULL << 13) << 8, .grpid = 0, }, { .uname = "ANY_RFO", .udesc = "Request: combination of DMND_RFO | PF_RFO", .uequiv = "DMND_RFO:PF_RFO", .ucode = (1ULL << 1 | 1ULL << 5) << 8, .grpid = 0, }, { .uname = "ANY_RESPONSE", .udesc = "Response: count any response type", .ucode = 1ULL << (16+8), .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, .grpid = 1, }, { .uname = "L2_HIT", .udesc = "Supplier: counts L2 hits in M/E/S states", .ucode = 1ULL << (18+8), .grpid = 1, }, { .uname = "SNP_NONE", .udesc = "Snoop: counts number of times no snoop-related information is available", .ucode = 1ULL << (31+8), .grpid = 2, }, { .uname = "SNP_MISS", .udesc = "Snoop: counts number of times a snoop was needed and it missed all snooped caches", .ucode = 1ULL << (33+8), .grpid = 2, }, { .uname = "SNP_HIT", .udesc = "Snoop: counts number of times a snoop hits in the other module where no modified copies were found in
the L1 cache of the other core", .ucode = 1ULL << (34+8), .grpid = 2, }, { .uname = "SNP_HITM", .udesc = "Snoop: counts number of times a snoop hits in the other module where modified copies were found in the L1 cache of the other core", .ucode = 1ULL << (36+8), .grpid = 2, }, { .uname = "NON_DRAM", .udesc = "Snoop: counts number of times target was a non-DRAM system address. This includes MMIO transactions", .ucode = 1ULL << (37+8), .grpid = 2, }, { .uname = "SNP_ANY", .udesc = "Snoop: any snoop reason", .ucode = 0x7dULL << (31+8), .uequiv = "SNP_NONE:SNP_MISS:SNP_HIT:SNP_HITM:NON_DRAM", .uflags= INTEL_X86_DFL, .grpid = 2, }, }; static const intel_x86_umask_t slm_br_misp_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "All mispredicted branches (Precise Event)", .uequiv = "ANY", .ucode = 0x0000, /* architectural encoding */ .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY", .udesc = "All mispredicted branches (Precise Event)", .ucode = 0x0000, /* architectural encoding */ .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "JCC", .udesc = "Number of mispredicted conditional branch instructions retired (Precise Event)", .ucode = 0x7e00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NON_RETURN_IND", .udesc = "Number of mispredicted non-return branch instructions retired (Precise Event)", .ucode = 0xeb00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RETURN", .udesc = "Number of mispredicted return branch instructions retired (Precise Event)", .ucode = 0xf700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "IND_CALL", .udesc = "Number of mispredicted indirect call branch instructions retired (Precise Event)", .ucode = 0xfb00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "TAKEN_JCC", .udesc = "Number of mispredicted taken conditional branch instructions retired (Precise Event)", .ucode = 0xfe00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t 
slm_no_alloc_cycles[]={ { .uname = "ANY", .udesc = "Number of cycles when the front-end does not provide any instructions to be allocated for any reason", .ucode = 0x3f00, .uequiv = "ALL", .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL", .udesc = "Number of cycles when the front-end does not provide any instructions to be allocated for any reason", .ucode = 0x3f00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "NOT_DELIVERED", .udesc = "Number of cycles when the front-end does not provide any instructions to be allocated but the back-end is not stalled", .ucode = 0x5000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MISPREDICTS", .udesc = "Number of cycles when no uops are allocated and the alloc pipe is stalled waiting for a mispredicted jump to retire", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RAT_STALL", .udesc = "Number of cycles when no uops are allocated and a RAT stall is asserted", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ROB_FULL", .udesc = "Number of cycles when no uops are allocated and the ROB is full (less than 2 entries available)", .ucode = 0x0100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t slm_rs_full_stall[]={ { .uname = "MEC", .udesc = "Number of cycles when the allocation pipeline is stalled because the RS for the MEC cluster is full", .ucode = 0x0100, }, { .uname = "ALL", .udesc = "Number of cycles when the allocation pipeline is stalled because any one of the RS is full", .ucode = 0x1f00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ANY", .udesc = "Number of cycles when the allocation pipeline is stalled because any one of the RS is full", .ucode = 0x1f00, .uequiv = "ALL", .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t slm_cycles_div_busy[]={ { .uname = "ANY", .udesc = "Number of cycles the divider is busy", .ucode = 0x0100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t slm_ms_decoded[]={ { .uname = "ENTRY", .udesc =
"Number of times the MSROM starts a flow of uops", .ucode = 0x0100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t slm_decode_restriction[]={ { .uname = "PREDECODE_WRONG", .udesc = "Number of times the prediction (from the predecode cache) for instruction length is incorrect", .ucode = 0x0100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t slm_fetch_stall[]={ { .uname = "ICACHE_FILL_PENDING_CYCLES", .udesc = "Number of cycles the NIP stalls because of an icache miss. This is a cumulative count of cycles the NIP stalled for all icache misses", .ucode = 0x0400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t slm_core_reject_l2q[]={ { .uname = "ALL", .udesc = "Number of requests that were not accepted into the L2Q because the L2Q was FULL", .ucode = 0x0000, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t slm_page_walks[]={ { .uname = "CYCLES", .udesc = "Total cycles for all the page walks. (I-side and D-side)", .ucode = 0x0300, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALKS", .udesc = "Total number of page walks. 
(I-side and D-side)", .ucode = 0x0300 | INTEL_X86_MOD_EDGE, .uequiv = "D_SIDE_WALKS:I_SIDE_WALKS", .uflags = INTEL_X86_NCOMBO, }, { .uname = "D_SIDE_CYCLES", .udesc = "Number of cycles when a D-side page walk is in progress", .ucode = 0x0100, }, { .uname = "D_SIDE_WALKS", .udesc = "Number of D-side page walks", .ucode = 0x0100 | INTEL_X86_MOD_EDGE, .uequiv = "D_SIDE_CYCLES:e", }, { .uname = "I_SIDE_CYCLES", .udesc = "Number of cycles when an I-side page walk is in progress", .ucode = 0x0200, }, { .uname = "I_SIDE_WALKS", .udesc = "Number of I-side page walks", .ucode = 0x0200 | INTEL_X86_MOD_EDGE, .uequiv = "I_SIDE_CYCLES:e", }, }; static const intel_x86_entry_t intel_slm_pe[]={ { .name = "UNHALTED_CORE_CYCLES", .desc = "Unhalted core cycles", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x200000003ull, .code = 0x3c, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles", .modmsk = INTEL_FIXED3_ATTRS, .cntmsk = 0x400000000ull, .code = 0x0300, /* pseudo encoding */ .flags = INTEL_X86_FIXED, }, { .name = "INSTRUCTION_RETIRED", .desc = "Instructions retired", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x100000003ull, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "This is an alias for INSTRUCTION_RETIRED", .modmsk = INTEL_V2_ATTRS, .equiv = "INSTRUCTION_RETIRED", .cntmsk = 0x10003, .code = 0xc0, }, { .name = "LLC_REFERENCES", .desc = "Last level of cache references", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x4f2e, }, { .name = "LAST_LEVEL_CACHE_REFERENCES", .desc = "This is an alias for LLC_REFERENCES", .modmsk = INTEL_V2_ATTRS, .equiv = "LLC_REFERENCES", .cntmsk = 0x3, .code = 0x4f2e, }, { .name = "LLC_MISSES", .desc = "Last level of cache misses", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x412e, }, { .name = "LAST_LEVEL_CACHE_MISSES", .desc = "This is an alias for LLC_MISSES", .modmsk = INTEL_V2_ATTRS, .equiv = "LLC_MISSES", .cntmsk = 0x3, .code = 0x412e, }, { .name = "BRANCH_INSTRUCTIONS_RETIRED", .desc = "Branch instructions
retired", .modmsk = INTEL_V2_ATTRS, .equiv = "BR_INST_RETIRED:ANY", .cntmsk = 0x3, .code = 0xc4, }, { .name = "MISPREDICTED_BRANCH_RETIRED", .desc = "Mispredicted branch instruction retired", .equiv = "BR_MISP_RETIRED", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xc5, .flags= INTEL_X86_PEBS, }, /* begin model specific events */ { .name = "DECODE_RESTRICTION", .desc = "Instruction length prediction delay", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xe9, .ngrp = 1, .numasks = LIBPFM_ARRAY_SIZE(slm_decode_restriction), .umasks = slm_decode_restriction, }, { .name = "L2_REJECT_XQ", .desc = "Rejected L2 requests to XQ", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x30, .numasks = LIBPFM_ARRAY_SIZE(slm_l2_reject_xq), .ngrp = 1, .umasks = slm_l2_reject_xq, }, { .name = "ICACHE", .desc = "Instruction fetches", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x80, .numasks = LIBPFM_ARRAY_SIZE(slm_icache), .ngrp = 1, .umasks = slm_icache, }, { .name = "UOPS_RETIRED", .desc = "Micro-ops retired", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xc2, .numasks = LIBPFM_ARRAY_SIZE(slm_uops_retired), .ngrp = 1, .umasks = slm_uops_retired, }, { .name = "INST_RETIRED", .desc = "Instructions retired", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xc0, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(slm_inst_retired), .ngrp = 1, .umasks = slm_inst_retired, }, { .name = "CYCLES_DIV_BUSY", .desc = "Cycles the divider is busy", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xcd, .numasks = LIBPFM_ARRAY_SIZE(slm_cycles_div_busy), .ngrp = 1, .umasks = slm_cycles_div_busy, }, { .name = "RS_FULL_STALL", .desc = "RS full", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xcb, .numasks = LIBPFM_ARRAY_SIZE(slm_rs_full_stall), .ngrp = 1, .umasks = slm_rs_full_stall, }, { .name = "LLC_RQSTS", .desc = "L2 cache requests", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x2e, .numasks = LIBPFM_ARRAY_SIZE(slm_llc_rqsts), .ngrp = 1, .umasks = slm_llc_rqsts, }, { 
.name = "MACHINE_CLEARS", .desc = "Self-Modifying Code detected", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xc3, .numasks = LIBPFM_ARRAY_SIZE(slm_machine_clears), .ngrp = 1, .umasks = slm_machine_clears, }, { .name = "BR_INST_RETIRED", .desc = "Retired branch instructions", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xc4, .numasks = LIBPFM_ARRAY_SIZE(slm_br_inst_retired), .flags= INTEL_X86_PEBS, .ngrp = 1, .umasks = slm_br_inst_retired, }, { .name = "BR_MISP_RETIRED", .desc = "Mispredicted retired branch instructions (Precise Event)", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xc5, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(slm_br_misp_retired), .ngrp = 1, .umasks = slm_br_misp_retired, }, { .name = "BR_MISP_INST_RETIRED", /* for backward compatibility with older version */ .desc = "Mispredicted retired branch instructions (Precise Event)", .modmsk = INTEL_V2_ATTRS, .equiv = "BR_MISP_RETIRED", .cntmsk = 0x3, .code = 0xc5, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(slm_br_misp_retired), .ngrp = 1, .umasks = slm_br_misp_retired, }, { .name = "MS_DECODED", .desc = "MS decoder", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xe7, .numasks = LIBPFM_ARRAY_SIZE(slm_ms_decoded), .ngrp = 1, .umasks = slm_ms_decoded, }, { .name = "BACLEARS", .desc = "Branch address calculator", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xe6, .numasks = LIBPFM_ARRAY_SIZE(slm_baclears), .ngrp = 1, .umasks = slm_baclears, }, { .name = "NO_ALLOC_CYCLES", .desc = "Front-end allocation", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0xca, .numasks = LIBPFM_ARRAY_SIZE(slm_no_alloc_cycles), .ngrp = 1, .umasks = slm_no_alloc_cycles, }, { .name = "CPU_CLK_UNHALTED", .desc = "Core cycles when core is not halted", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x3c, .numasks = LIBPFM_ARRAY_SIZE(slm_cpu_clk_unhalted), .ngrp = 1, .umasks = slm_cpu_clk_unhalted, }, { .name = "MEM_UOP_RETIRED", .desc = "Retired loads micro-ops", .modmsk = 
INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x4, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(slm_mem_uop_retired), .ngrp = 1, .umasks = slm_mem_uop_retired, }, { .name = "CORE_REJECT_L2Q", .desc = "Demand and L1 prefetcher requests rejected by L2", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x31, .numasks = LIBPFM_ARRAY_SIZE(slm_core_reject_l2q), .ngrp = 1, .umasks = slm_core_reject_l2q, }, { .name = "REHABQ", .desc = "Memory reference queue", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x03, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(slm_rehabq), .ngrp = 1, .umasks = slm_rehabq, }, { .name = "FETCH_STALL", .desc = "Fetch stalls", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x86, .numasks = LIBPFM_ARRAY_SIZE(slm_fetch_stall), .ngrp = 1, .umasks = slm_fetch_stall, }, { .name = "PAGE_WALKS", .desc = "Page walker", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x3, .code = 0x5, .numasks = LIBPFM_ARRAY_SIZE(slm_page_walks), .ngrp = 1, .umasks = slm_page_walks, }, { .name = "OFFCORE_RESPONSE_0", .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xf, .code = 0x01b7, .flags= INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(slm_offcore_response), .ngrp = 3, .umasks = slm_offcore_response, }, { .name = "OFFCORE_RESPONSE_1", .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xf, .code = 0x02b7, .flags= INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(slm_offcore_response), .ngrp = 3, .umasks = slm_offcore_response, /* identical to actual umasks list for this event */ }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_snb_events.h /* * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby
granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. 
* * PMU: snb (Intel Sandy Bridge) */ static const intel_x86_umask_t snb_agu_bypass_cancel[]={ { .uname = "COUNT", .udesc = "This event counts executed load operations", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t snb_arith[]={ { .uname = "FPU_DIV_ACTIVE", .udesc = "Cycles that the divider is active, includes integer and floating point", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "FPU_DIV", .udesc = "Number of cycles the divider is activated, includes integer and floating point", .uequiv = "FPU_DIV_ACTIVE:c=1:e=1", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_br_inst_exec[]={ { .uname = "NONTAKEN_COND", .udesc = "All macro conditional non-taken branch instructions", .ucode = 0x4100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_COND", .udesc = "All macro conditional taken branch instructions", .ucode = 0x8100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_DIRECT_JUMP", .udesc = "All macro unconditional taken branch instructions, excluding calls and indirects", .ucode = 0x8200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_INDIRECT_JUMP_NON_CALL_RET", .udesc = "All taken indirect branches that are not calls nor returns", .ucode = 0x8400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_RETURN_NEAR", .udesc = "All taken indirect branches that have a return mnemonic", .ucode = 0x8800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_DIRECT_NEAR_CALL", .udesc = "All taken non-indirect calls", .ucode = 0x9000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_INDIRECT_NEAR_CALL", .udesc = "All taken indirect calls, including both register and memory indirect", .ucode = 0xa000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_BRANCHES", .udesc = "All near executed branch instructions (not necessarily retired)", .ucode = 0xff00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL_CONDITIONAL", .udesc = "All
macro conditional branch instructions", .ucode = 0xc100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_COND", .udesc = "All macro conditional branch instructions", .ucode = 0xc100, .uequiv = "ALL_CONDITIONAL", .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_INDIRECT_JUMP_NON_CALL_RET", .udesc = "All indirect branches that are not calls nor returns", .ucode = 0xc400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_DIRECT_NEAR_CALL", .udesc = "All non-indirect calls", .ucode = 0xd000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DIRECT_JMP", .udesc = "Speculative and retired macro-unconditional branches excluding calls and indirects", .ucode = 0xc200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_INDIRECT_NEAR_RETURN", .udesc = "Speculative and retired indirect return branches", .ucode = 0xc800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_br_inst_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "All taken and not taken macro branches including far branches (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "CONDITIONAL", .udesc = "All taken and not taken macro conditional branch instructions (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "FAR_BRANCH", .udesc = "Number of far branch instructions retired (Precise Event)", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_CALL", .udesc = "All macro direct and indirect near calls, does not count far calls (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_RETURN", .udesc = "Number of near ret instructions retired (Precise Event)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_TAKEN", .udesc = "Number of near branch taken instructions retired (Precise Event)", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NOT_TAKEN", .udesc = "All not taken macro branch 
instructions retired (Precise Event)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t snb_br_misp_exec[]={ { .uname = "NONTAKEN_COND", .udesc = "All non-taken mispredicted macro conditional branch instructions", .ucode = 0x4100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_COND", .udesc = "All taken mispredicted macro conditional branch instructions", .ucode = 0x8100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_INDIRECT_JUMP_NON_CALL_RET", .udesc = "All taken mispredicted indirect branches that are not calls nor returns", .ucode = 0x8400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_RETURN_NEAR", .udesc = "All taken mispredicted indirect branches that have a return mnemonic", .ucode = 0x8800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_DIRECT_NEAR_CALL", .udesc = "All taken mispredicted non-indirect calls", .ucode = 0x9000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN_INDIRECT_NEAR_CALL", .udesc = "All taken mispredicted indirect calls, including both register and memory indirect", .ucode = 0xa000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_COND", .udesc = "All mispredicted macro conditional branch instructions", .ucode = 0xc100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_DIRECT_NEAR_CALL", .udesc = "All mispredicted non-indirect calls", .ucode = 0xd000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_INDIRECT_JUMP_NON_CALL_RET", .udesc = "All mispredicted indirect branches that are not calls nor returns", .ucode = 0xc400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_BRANCHES", .udesc = "All mispredicted branch instructions", .ucode = 0xff00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t snb_br_misp_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "All mispredicted macro branches (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "CONDITIONAL", .udesc = "All mispredicted macro conditional branch 
instructions (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_CALL", .udesc = "All macro direct and indirect near calls (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NOT_TAKEN", .udesc = "Number of branch instructions retired that were mispredicted and not-taken (Precise Event)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "TAKEN", .udesc = "Number of branch instructions retired that were mispredicted and taken (Precise Event)", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t snb_lock_cycles[]={ { .uname = "SPLIT_LOCK_UC_LOCK_DURATION", .udesc = "Cycles in which the L1D and L2 are locked, due to a UC lock or split lock", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CACHE_LOCK_DURATION", .udesc = "Cycles in which the L1D is locked", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_cpl_cycles[]={ { .uname = "RING0", .udesc = "Unhalted core cycles the thread was in ring 0", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RING0_TRANS", .udesc = "Transitions from rings 1, 2, or 3 to ring 0", .uequiv = "RING0:c=1:e=1", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "RING123", .udesc = "Unhalted core cycles the thread was in rings 1, 2, or 3", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_cpu_clk_unhalted[]={ { .uname = "REF_P", .udesc = "Cycles when the core is unhalted (count at 100 Mhz)", .ucode = 0x100, .uequiv = "REF_XCLK", .uflags= INTEL_X86_NCOMBO, }, { .uname = "REF_XCLK", .udesc = "Count Xclk pulses (100Mhz) when the core is unhalted", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "REF_XCLK_ANY", .udesc = "Count Xclk pulses (100Mhz) when at least one thread on the physical core is unhalted", .ucode = 0x100 |
INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "REF_XCLK:t", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "THREAD_P", .udesc = "Cycles when thread is not halted", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ONE_THREAD_ACTIVE", .udesc = "Counts Xclk (100Mhz) pulses when this thread is unhalted and the other thread is halted", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_dsb2mite_switches[]={ { .uname = "COUNT", .udesc = "Number of DSB to MITE switches", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "PENALTY_CYCLES", .udesc = "Cycles DSB to MITE switches caused delay", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_dsb_fill[]={ { .uname = "ALL_CANCEL", .udesc = "Number of times a valid DSB fill has been cancelled for any reason", .ucode = 0xa00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "EXCEED_DSB_LINES", .udesc = "DSB Fill encountered > 3 DSB lines", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "OTHER_CANCEL", .udesc = "Number of times a valid DSB fill has been cancelled not because of exceeding way limit", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_dtlb_load_misses[]={ { .uname = "MISS_CAUSES_A_WALK", .udesc = "Demand load miss in all TLB levels which causes a page walk of any page size", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CAUSES_A_WALK", .udesc = "Demand load miss in all TLB levels which causes a page walk of any page size", .ucode = 0x100, .uequiv = "MISS_CAUSES_A_WALK", .uflags= INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT", .udesc = "Number of DTLB lookups for loads which missed first level DTLB but hit second level DTLB (STLB); No page walk.", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "Demand load miss in all TLB levels which causes a page walk that completes for any page size", .ucode = 0x200,
.uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_DURATION", .udesc = "Cycles PMH is busy with a walk", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_dtlb_store_misses[]={ { .uname = "MISS_CAUSES_A_WALK", .udesc = "Miss in all TLB levels that causes a page walk of any page size (4K/2M/4M/1G)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CAUSES_A_WALK", .udesc = "Miss in all TLB levels that causes a page walk of any page size (4K/2M/4M/1G)", .ucode = 0x100, .uequiv = "MISS_CAUSES_A_WALK", .uflags= INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT", .udesc = "First level miss but second level hit; no page walk. Only relevant if multiple levels", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "Miss in all TLB levels that causes a page walk that completes of any page size (4K/2M/4M/1G)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_DURATION", .udesc = "Cycles PMH is busy with this walk", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_fp_assist[]={ { .uname = "ANY", .udesc = "Cycles with any input/output SSE or FP assists", .ucode = 0x1e00 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "SIMD_INPUT", .udesc = "Number of SIMD FP assists due to input values", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SIMD_OUTPUT", .udesc = "Number of SIMD FP assists due to output values", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "X87_INPUT", .udesc = "Number of X87 assists due to input value", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "X87_OUTPUT", .udesc = "Number of X87 assists due to output value", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL", .udesc = "Cycles with any input and output SSE or FP assist", .ucode = 0x1e00 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "ANY", .uflags = INTEL_X86_NCOMBO, .modhw 
= _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t snb_fp_comp_ops_exe[]={ { .uname = "X87", .udesc = "Number of X87 uops executed", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_FP_PACKED_DOUBLE", .udesc = "Number of SSE double precision FP packed uops executed", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_FP_SCALAR_SINGLE", .udesc = "Number of SSE single precision FP scalar uops executed", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_PACKED_SINGLE", .udesc = "Number of SSE single precision FP packed uops executed", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_SCALAR_DOUBLE", .udesc = "Number of SSE double precision FP scalar uops executed", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_hw_pre_req[]={ { .uname = "L1D_MISS", .udesc = "Hardware prefetch requests that miss the L1D cache. A request is counted each time it accesses the cache and misses it, including if a block is applicable or if it hits the full buffer, for example. This accounts for both L1 streamer and IP-based Hw prefetchers", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t snb_icache[]={ { .uname = "MISSES", .udesc = "Number of Instruction Cache, Streaming Buffer and Victim Cache Misses. Includes UC accesses", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "HIT", .udesc = "Number of Instruction Cache, Streaming Buffer and Victim Cache Reads.
Includes cacheable and uncacheable accesses and uncacheable fetches", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_idq[]={ { .uname = "EMPTY", .udesc = "Cycles IDQ is empty", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MITE_UOPS", .udesc = "Number of uops delivered to IDQ from MITE path", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DSB_UOPS", .udesc = "Number of uops delivered to IDQ from DSB path", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_DSB_UOPS", .udesc = "Number of uops delivered to IDQ when MS busy by DSB", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_MITE_UOPS", .udesc = "Number of uops delivered to IDQ when MS busy by MITE", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_UOPS", .udesc = "Number of uops delivered to IDQ from MS by either DSB or MITE", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MITE_UOPS_CYCLES", .udesc = "Cycles where uops are delivered to IDQ from MITE (MITE active)", .uequiv = "MITE_UOPS:c=1", .ucode = 0x400 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_SWITCHES", .udesc = "Number of cycles that Uops were delivered into Instruction Decode Queue (IDQ) when MS_Busy, initiated by Decode Stream Buffer (DSB) or MITE", .ucode = 0x3000 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uequiv = "MS_UOPS:c=1:e", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, { .uname = "DSB_UOPS_CYCLES", .udesc = "Cycles where uops are delivered to IDQ from DSB (DSB active)", .ucode = 0x800 | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_DSB_UOPS_CYCLES", .udesc = "Cycles where uops delivered to IDQ when MS busy by DSB", .uequiv = "MS_DSB_UOPS:c=1", .ucode = 0x1000 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_MITE_UOPS_CYCLES", .udesc = "Cycles where uops
delivered to IDQ when MS busy by MITE", .uequiv = "MS_MITE_UOPS:c=1", .ucode = 0x2000 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_UOPS_CYCLES", .udesc = "Cycles where uops delivered to IDQ from MS by either DSB or MITE", .uequiv = "MS_UOPS:c=1", .ucode = 0x3000 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DSB_UOPS", .udesc = "Number of uops delivered from either DSB path", .ucode = 0x1800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DSB_CYCLES", .udesc = "Cycles MITE/MS deliver anything", .ucode = 0x1800 | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DSB_CYCLES_4_UOPS", .udesc = "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops", .ucode = 0x1800 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ALL_MITE_UOPS", .udesc = "Number of uops delivered from either MITE path", .ucode = 0x2400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_MITE_CYCLES", .udesc = "Cycles DSB/MS deliver anything", .ucode = 0x2400 | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_MITE_CYCLES_4_UOPS", .udesc = "Cycles MITE is delivering 4 Uops", .ucode = 0x2400 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "ANY_UOPS", .udesc = "Number of uops delivered to IDQ from any path", .ucode = 0x3c00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_DSB_UOPS_OCCUR", .udesc = "Occurrences of DSB MS going active", .uequiv = "MS_DSB_UOPS:c=1:e=1", .ucode = 0x1000 | INTEL_X86_MOD_EDGE | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_idq_uops_not_delivered[]={ { .uname = "CORE", .udesc = "Number of non-delivered uops to RAT (use cmask to qualify further)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname =
"CYCLES_0_UOPS_DELIV_CORE", .udesc = "Cycles per thread when 4 or more uops are not delivered to the Resource Allocation Table (RAT) when backend is not stalled", .ucode = 0x100 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uflags = INTEL_X86_NCOMBO, .uequiv = "CORE:c=4", .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_1_UOP_DELIV_CORE", .udesc = "Cycles per thread when 1 or more uops are delivered to the Resource Allocation Table (RAT) by the front end", .ucode = 0x100 | (4 << INTEL_X86_CMASK_BIT) | INTEL_X86_MOD_INV, /* cnt=4 inv=1 */ .uequiv = "CORE:c=4:i", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_I, }, { .uname = "CYCLES_LE_1_UOP_DELIV_CORE", .udesc = "Cycles per thread when 3 or more uops are not delivered to the Resource Allocation Table (RAT) when backend is not stalled", .ucode = 0x100 | (3 << INTEL_X86_CMASK_BIT), /* cnt=3 */ .uequiv = "CORE:c=3", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_LE_2_UOP_DELIV_CORE", .udesc = "Cycles with less than 2 uops delivered by the front end", .ucode = 0x100 | (2 << INTEL_X86_CMASK_BIT), /* cnt=2 */ .uequiv = "CORE:c=2", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_LE_3_UOP_DELIV_CORE", .udesc = "Cycles with less than 3 uops delivered by the front end", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "CORE:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_FE_WAS_OK", .udesc = "Cycles Front-End (FE) delivered 4 uops or Resource Allocation Table (RAT) was stalling FE", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 inv=1 */ .uequiv = "CORE:c=1:i", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_I, }, }; static const intel_x86_umask_t snb_ild_stall[]={ { .uname = "LCP", .udesc = "Stall caused by changing prefix length of the instruction", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "IQ_FULL", .udesc = "Stall 
cycles due to IQ full", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_insts_written_to_iq[]={ { .uname = "INSTS", .udesc = "Number of instructions written to IQ every cycle", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t snb_inst_retired[]={ { .uname = "ANY_P", .udesc = "Number of instructions retired", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "PREC_DIST", .udesc = "Precise instruction retired event to reduce effect of PEBS shadow IP distribution (Precise Event)", .ucntmsk = 0x2, .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t snb_int_misc[]={ { .uname = "RAT_STALL_CYCLES", .udesc = "Cycles RAT external stall is sent to IDQ for this thread", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RECOVERY_CYCLES", .udesc = "Cycles waiting to be recovered after Machine Clears due to all other cases except JEClear", .ucode = 0x300 | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RECOVERY_STALLS_COUNT", .udesc = "Number of times need to wait after Machine Clears due to all other cases except JEClear", .ucode = 0x300 | INTEL_X86_MOD_EDGE | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RECOVERY_CYCLES_ANY", .udesc = "Cycles during which the allocator was stalled due to recovery from earlier clear event for any thread (e.g. 
misprediction or memory nuke)", .ucode = 0x300 | (0x1 << INTEL_X86_CMASK_BIT) | INTEL_X86_MOD_ANY, /* cnt=1 any=1 */ .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_T, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_itlb[]={ { .uname = "ITLB_FLUSH", .udesc = "Number of ITLB flushes, includes 4k/2M/4M pages", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FLUSH", .udesc = "Number of ITLB flushes, includes 4k/2M/4M pages", .ucode = 0x100, .uequiv = "ITLB_FLUSH", .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_l1d[]={ { .uname = "ALLOCATED_IN_M", .udesc = "Number of allocations of L1D cache lines in modified (M) state", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_M_REPLACEMENT", .udesc = "Number of cache lines in M-state evicted of L1D due to snoop HITM or dirty line replacement", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "M_EVICT", .udesc = "Number of modified lines evicted from L1D due to replacement", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "REPLACEMENT", .udesc = "Number of cache lines brought into the L1D cache", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_l1d_blocks[]={ { .uname = "BANK_CONFLICT", .udesc = "Number of dispatched loads cancelled due to L1D bank conflicts with other load ports", .ucode = 0x500, .uflags= INTEL_X86_NCOMBO, }, { .uname = "BANK_CONFLICT_CYCLES", .udesc = "Cycles when dispatched loads are cancelled due to L1D bank conflicts with other load ports", .ucode = 0x500 | (0x1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "BANK_CONFLICT:c=1", .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t snb_l1d_pend_miss[]={ { .uname = "OCCURRENCES", .udesc = "Occurrences of L1D_PEND_MISS going active", .uequiv = "PENDING:e=1:c=1", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, .modhw = 
_INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "EDGE", .udesc = "Occurrences of L1D_PEND_MISS going active", .uequiv = "OCCURRENCES", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "PENDING", .udesc = "Number of L1D load misses outstanding every cycle", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PENDING_CYCLES", .udesc = "Cycles with L1D load misses outstanding", .uequiv = "PENDING:c=1", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "PENDING_CYCLES_ANY", .udesc = "Cycles with L1D load misses outstanding from any thread", .uequiv = "PENDING:c=1:t", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT) | INTEL_X86_MOD_ANY, .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_T, }, { .uname = "FB_FULL", .udesc = "Number of cycles a demand request was blocked due to Fill Buffer (FB) unavailability", .ucode = 0x200 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t snb_l2_l1d_wb_rqsts[]={ { .uname = "ALL", .udesc = "Non rejected writebacks from L1D to L2 cache lines in E state", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "HIT_E", .udesc = "Non rejected writebacks from L1D to L2 cache lines in E state", .ucode = 0x400, }, { .uname = "HIT_M", .udesc = "Non rejected writebacks from L1D to L2 cache lines in M state", .ucode = 0x800, }, { .uname = "HIT_S", .udesc = "Non rejected writebacks from L1D to L2 cache lines in S state", .ucode = 0x200, }, { .uname = "MISS", .udesc = "Number of modified lines evicted from L1 and missing L2 (non-rejected WB from DCU)", .ucode = 0x100, }, }; static const intel_x86_umask_t snb_l2_lines_in[]={ { .uname = "ANY", .udesc = "L2 cache lines filling (counting does not cover rejects)", .ucode = 0x700, .uflags= 
INTEL_X86_NCOMBO, }, { .uname = "E", .udesc = "L2 cache lines in E state (counting does not cover rejects)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "I", .udesc = "L2 cache lines in I state (counting does not cover rejects)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "S", .udesc = "L2 cache lines in S state (counting does not cover rejects)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_l2_lines_out[]={ { .uname = "DEMAND_CLEAN", .udesc = "L2 clean line evicted by a demand", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DIRTY", .udesc = "L2 dirty line evicted by a demand", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_CLEAN", .udesc = "L2 clean line evicted by a prefetch", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_DIRTY", .udesc = "L2 dirty line evicted by an MLC Prefetch", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DIRTY_ANY", .udesc = "Any L2 dirty line evicted (does not cover rejects)", .ucode = 0xa00, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_l2_rqsts[]={ { .uname = "ALL_CODE_RD", .udesc = "Any ifetch request to L2 cache", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_HIT", .udesc = "L2 cache hits when fetching instructions", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_MISS", .udesc = "L2 cache misses when fetching instructions", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_DATA_RD", .udesc = "Demand data read requests to L2 cache", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_RD_HIT", .udesc = "Demand data read requests that hit L2", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_PF", .udesc = "Any L2 HW prefetch request to L2 cache", .ucode = 0xc000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PF_HIT", .udesc = "Requests from the L2 hardware prefetchers that hit L2 cache", .ucode = 
0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PF_MISS", .udesc = "Requests from the L2 hardware prefetchers that miss L2 cache", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_ANY", .udesc = "Any RFO requests to L2 cache", .ucode = 0xc00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_HITS", .udesc = "RFO requests that hit L2 cache", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_MISS", .udesc = "RFO requests that miss L2 cache", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_l2_store_lock_rqsts[]={ { .uname = "HIT_E", .udesc = "RFOs that hit cache lines in E state", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MISS", .udesc = "RFOs that miss cache (I state)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "HIT_M", .udesc = "RFOs that hit cache lines in M state", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL", .udesc = "RFOs that access cache lines in any state", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t snb_l2_trans[]={ { .uname = "ALL", .udesc = "Transactions accessing MLC pipe", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CODE_RD", .udesc = "L2 cache accesses when fetching instructions", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L1D_WB", .udesc = "L1D writebacks that access L2 cache", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOAD", .udesc = "Demand Data Read* requests that access L2 cache", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L2_FILL", .udesc = "L2 fill requests that access L2 cache", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L2_WB", .udesc = "L2 writebacks that access L2 cache", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_PREFETCH", .udesc = "L2 or L3 HW prefetches that access L2 cache (including rejects)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO", .udesc = "RFO requests 
that access L2 cache", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_ld_blocks[]={ { .uname = "DATA_UNKNOWN", .udesc = "Blocked loads due to store buffer blocks with unknown data", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "STORE_FORWARD", .udesc = "Loads blocked by overlapping with store buffer that cannot be forwarded", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "NO_SR", .udesc = "Number of split loads blocked due to resource not available", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_BLOCK", .udesc = "Number of cases where any load is blocked but has no DCU miss", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_ld_blocks_partial[]={ { .uname = "ADDRESS_ALIAS", .udesc = "False dependencies in MOB due to partial compare on address", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_STA_BLOCK", .udesc = "Number of times that load operations are temporarily blocked because of older stores, with addresses that are not yet known.
A load operation may incur more than one block of this type", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_load_hit_pre[]={ { .uname = "HW_PF", .udesc = "Non sw-prefetch load dispatches that hit the fill buffer allocated for HW prefetch", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SW_PF", .udesc = "Non sw-prefetch load dispatches that hit the fill buffer allocated for SW prefetch", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_l3_lat_cache[]={ { .uname = "MISS", .udesc = "Core-originated cacheable demand requests missed L3", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "REFERENCE", .udesc = "Core-originated cacheable demand requests that refer to L3", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_machine_clears[]={ { .uname = "MASKMOV", .udesc = "The number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MEMORY_ORDERING", .udesc = "Number of Memory Ordering Machine Clears detected", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SMC", .udesc = "Self-Modifying Code detected", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "COUNT", .udesc = "Number of machine clears (nukes) of any type", .ucode = 0x100 | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* edge=1 cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t snb_mem_load_uops_llc_hit_retired[]={ { .uname = "XSNP_HIT", .udesc = "Load LLC Hit and a cross-core Snoop hits in on-pkg core cache (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_HITM", .udesc = "Load had HitM Response from a core on same socket (shared LLC) (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = 
"XSNP_MISS", .udesc = "Load LLC Hit and a cross-core Snoop missed in on-pkg core cache (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_NONE", .udesc = "Load hit in last-level (L3) cache with no snoop needed (Precise Event)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t snb_mem_load_uops_misc_retired[]={ { .uname = "LLC_MISS", .udesc = "Counts load driven L3 misses and some non simd split loads (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t snb_mem_load_uops_retired[]={ { .uname = "HIT_LFB", .udesc = "A load missed L1D but hit the Fill Buffer (Precise Event)", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1_HIT", .udesc = "Load hit in nearest-level (L1D) cache (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_HIT", .udesc = "Load hit in mid-level (L2) cache (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_HIT", .udesc = "Load hit in last-level (L3) cache with no snoop needed (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_MISS", .udesc = "Retired load uops which data sources were data missed LLC (excluding unknown data source)", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, .umodel = PFM_PMU_INTEL_SNB_EP, }, }; static const intel_x86_umask_t snb_mem_trans_retired[]={ { .uname = "LATENCY_ABOVE_THRESHOLD", .udesc = "Memory load instructions retired above programmed clocks, minimum threshold value is 3 (Precise Event and ldlat required)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT, }, { .uname = "PRECISE_STORE", .udesc = "Capture where stores occur, must use with PEBS (Precise Event required)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static 
const intel_x86_umask_t snb_mem_uops_retired[]={ { .uname = "ALL_LOADS", .udesc = "Any retired loads (Precise Event)", .ucode = 0x8100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY_LOADS", .udesc = "Any retired loads (Precise Event)", .ucode = 0x8100, .uequiv = "ALL_LOADS", .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_STORES", .udesc = "Any retired stores (Precise Event)", .ucode = 0x8200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY_STORES", .udesc = "Any retired stores (Precise Event)", .ucode = 0x8200, .uequiv = "ALL_STORES", .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOCK_LOADS", .udesc = "Locked retired loads (Precise Event)", .ucode = 0x2100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOCK_STORES", .udesc = "Locked retired stores (Precise Event)", .ucode = 0x2200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_LOADS", .udesc = "Retired loads causing cacheline splits (Precise Event)", .ucode = 0x4100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_STORES", .udesc = "Retired stores causing cacheline splits (Precise Event)", .ucode = 0x4200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS_LOADS", .udesc = "STLB misses due to retired loads (Precise Event)", .ucode = 0x1100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS_STORES", .udesc = "STLB misses due to retired stores (Precise Event)", .ucode = 0x1200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t snb_misalign_mem_ref[]={ { .uname = "LOADS", .udesc = "Speculative cache-line split load uops dispatched to the L1D", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "STORES", .udesc = "Speculative cache-line split Store-address uops dispatched to L1D", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_offcore_requests[]={ { .uname = "ALL_DATA_RD", .udesc = "Demand and
prefetch read requests sent to uncore", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DATA_READ", .udesc = "Demand and prefetch read requests sent to uncore", .uequiv = "ALL_DATA_RD", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD", .udesc = "Offcore code read requests, including cacheable and un-cacheables", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD", .udesc = "Demand Data Read requests sent to uncore", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO", .udesc = "Offcore Demand RFOs, includes regular RFO, Locks, ItoM", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_offcore_requests_buffer[]={ { .uname = "SQ_FULL", .udesc = "Offcore requests buffer cannot take more entries for this thread core", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t snb_offcore_requests_outstanding[]={ { .uname = "ALL_DATA_RD_CYCLES", .udesc = "Cycles with cacheable data read transactions in the superQ", .uequiv = "ALL_DATA_RD:c=1", .ucode = 0x800 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_CYCLES", .udesc = "Cycles with demand code reads transactions in the superQ", .uequiv = "DEMAND_CODE_RD:c=1", .ucode = 0x200 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_CYCLES", .udesc = "Cycles with demand data read transactions in the superQ", .uequiv = "DEMAND_DATA_RD:c=1", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "ALL_DATA_RD", .udesc = "Cacheable data read transactions in the superQ every cycle", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD", .udesc = "Code read transactions in the superQ every cycle", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD", .udesc = "Demand data read transactions in the superQ every cycle", .ucode = 0x100, 
.uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_GE_6", .udesc = "Cycles with at least 6 offcore outstanding demand data read requests in the uncore queue", .uequiv = "DEMAND_DATA_RD:c=6", .ucode = 0x100 | (6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DEMAND_RFO", .udesc = "Outstanding RFO (store) transactions in the superQ every cycle", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_CYCLES", .udesc = "Cycles with outstanding RFO (store) transactions in the superQ", .uequiv = "DEMAND_RFO:c=1", .ucode = 0x400 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_other_assists[]={ { .uname = "ITLB_MISS_RETIRED", .udesc = "Number of instructions that experienced an ITLB miss", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "AVX_TO_SSE", .udesc = "Number of transitions from AVX-256 to legacy SSE when penalty applicable", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_TO_AVX", .udesc = "Number of transitions from legacy SSE to AVX-256 when penalty applicable", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "AVX_STORE", .udesc = "Number of GSSE memory assist for stores. 
GSSE microcode assist is being invoked whenever the hardware is unable to properly handle GSSE-256b operations", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_partial_rat_stalls[]={ { .uname = "FLAGS_MERGE_UOP", .udesc = "Number of flags-merge uops in flight in each cycle", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CYCLES_FLAGS_MERGE_UOP", .udesc = "Cycles in which flags-merge uops are in flight", .uequiv = "FLAGS_MERGE_UOP:c=1", .ucode = 0x2000 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "MUL_SINGLE_UOP", .udesc = "Number of Multiply packed/scalar single precision uops allocated", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SLOW_LEA_WINDOW", .udesc = "Number of cycles with at least one slow LEA uop allocated", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_resource_stalls[]={ { .uname = "ANY", .udesc = "Cycles stalled due to Resource Related reason", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "LB", .udesc = "Cycles stalled due to lack of load buffers", .ucode = 0x200, }, { .uname = "RS", .udesc = "Cycles stalled due to no eligible RS entry available", .ucode = 0x400, }, { .uname = "SB", .udesc = "Cycles stalled due to no store buffers available (not including draining from sync)", .ucode = 0x800, }, { .uname = "ROB", .udesc = "Cycles stalled due to re-order buffer full", .ucode = 0x1000, }, { .uname = "FCSW", .udesc = "Cycles stalled due to writing the FPU control word", .ucode = 0x2000, }, { .uname = "MXCSR", .udesc = "Cycles stalled due to the MXCSR register rename occurring too close to a previous MXCSR rename", .ucode = 0x4000, }, { .uname = "MEM_RS", .udesc = "Cycles stalled due to LB, SB or RS being completely in use", .ucode = 0xe00, .uequiv = "LB:SB:RS", }, { .uname = "LD_SB", .udesc = "Resource stalls due to load or store buffers all being in use", .ucode = 0xa00, }, { .uname = "OOO_SRC", .udesc 
= "Resource stalls due to Rob being full, FCSW, MXCSR and OTHER", .ucode = 0xf000, }, }; static const intel_x86_umask_t snb_resource_stalls2[]={ { .uname = "ALL_FL_EMPTY", .udesc = "Cycles stalled due to free list empty", .ucode = 0xc00, }, { .uname = "ALL_PRF_CONTROL", .udesc = "Cycles stalled due to control structures full for physical registers", .ucode = 0xf00, }, { .uname = "ANY_PRF_CONTROL", .udesc = "Cycles stalled due to control structures full for physical registers", .ucode = 0xf00, .uequiv = "ALL_PRF_CONTROL", }, { .uname = "BOB_FULL", .udesc = "Cycles Allocator is stalled due to Branch Order Buffer", .ucode = 0x4000, }, { .uname = "OOO_RSRC", .udesc = "Cycles stalled due to out of order resources full", .ucode = 0x4f00, }, }; static const intel_x86_umask_t snb_rob_misc_events[]={ { .uname = "LBR_INSERTS", .udesc = "Count each time a new LBR record is saved by HW", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t snb_rs_events[]={ { .uname = "EMPTY_CYCLES", .udesc = "Cycles the RS is empty for this thread", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "EMPTY_END", .udesc = "Counts number of times the Reservation Station (RS) goes from empty to non-empty", .ucode = 0x100 | INTEL_X86_MOD_INV | INTEL_X86_MOD_EDGE | (1 << INTEL_X86_CMASK_BIT), /* inv=1 edge=1 cnt=1 */ .uequiv = "EMPTY_CYCLES:c=1:e:i", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t snb_simd_fp_256[]={ { .uname = "PACKED_SINGLE", .udesc = "Counts 256-bit packed single-precision", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_DOUBLE", .udesc = "Counts 256-bit packed double-precision", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_sq_misc[]={ { .uname = "SPLIT_LOCK", .udesc = "Split locks in SQ", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const 
intel_x86_umask_t snb_tlb_flush[]={ { .uname = "DTLB_THREAD", .udesc = "Number of DTLB flushes of thread-specific entries", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "STLB_ANY", .udesc = "Number of STLB flushes", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t snb_uops_dispatched_port[]={ { .uname = "PORT_0", .udesc = "Cycles in which a uop is dispatched on port 0", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT_1", .udesc = "Cycles in which a uop is dispatched on port 1", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT_2_LD", .udesc = "Cycles in which a load uop is dispatched on port 2", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT_2_STA", .udesc = "Cycles in which a store uop is dispatched on port 2", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT_2", .udesc = "Cycles in which a uop is dispatched on port 2", .ucode = 0xc00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT_3", .udesc = "Cycles in which a uop is dispatched on port 3", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT_4", .udesc = "Cycles in which a uop is dispatched on port 4", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT_5", .udesc = "Cycles in which a uop is dispatched on port 5", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT_0_CORE", .udesc = "Cycles in which a uop is dispatched on port 0 for any thread", .ucode = 0x100 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_0:t", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_1_CORE", .udesc = "Cycles in which a uop is dispatched on port 1 for any thread", .ucode = 0x200 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_1:t", .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_2_CORE", .udesc = "Cycles in which a uop is dispatched on port 2 for any thread", .ucode = 0xc00 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_2:t", .uflags= 
INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_3_CORE", .udesc = "Cycles in which a uop is dispatched on port 3 for any thread", .ucode = 0x3000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_3:t", .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_4_CORE", .udesc = "Cycles in which a uop is dispatched on port 4 for any thread", .ucode = 0x4000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_4:t", .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, { .uname = "PORT_5_CORE", .udesc = "Cycles in which a uop is dispatched on port 5 for any thread", .ucode = 0x8000 | INTEL_X86_MOD_ANY, /* any=1 */ .uequiv = "PORT_5:t", .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_T, }, }; static const intel_x86_umask_t snb_uops_issued[]={ { .uname = "ANY", .udesc = "Number of uops issued by the RAT to the Reservation Station (RS)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CORE_STALL_CYCLES", .udesc = "Cycles no uops issued on this core (by any thread)", .uequiv = "ANY:c=1:i=1:t=1", .ucode = 0x100 | INTEL_X86_MOD_ANY | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_T, }, { .uname = "STALL_CYCLES", .udesc = "Cycles no uops issued by this thread", .uequiv = "ANY:c=1:i=1", .ucode = 0x100 | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t snb_uops_retired[]={ { .uname = "ALL", .udesc = "All uops that actually retired (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "ANY", .udesc = "All uops that actually retired (Precise Event)", .ucode = 0x100, .uequiv= "ALL", .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RETIRE_SLOTS", .udesc = "Number of retirement slots used (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | 
INTEL_X86_PEBS, }, { .uname = "STALL_CYCLES", .udesc = "Cycles no executable uop retired (Precise Event)", .uequiv = "ALL:c=1:i", .ucode = 0x100 | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "TOTAL_CYCLES", .udesc = "Total cycles using precise uop retired event (Precise Event)", .uequiv = "ALL:c=10:i", .ucode = 0x100 | INTEL_X86_MOD_INV | (10 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t snb_offcore_response[]={ { .uname = "DMND_DATA_RD", .udesc = "Request: number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches", .ucode = 1ULL << (0 + 8), .grpid = 0, }, { .uname = "DMND_RFO", .udesc = "Request: number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO prefetches", .ucode = 1ULL << (1 + 8), .grpid = 0, }, { .uname = "DMND_IFETCH", .udesc = "Request: number of demand and DCU prefetch instruction cacheline reads. 
Does not count L2 code read prefetches", .ucode = 1ULL << (2 + 8), .grpid = 0, }, { .uname = "WB", .udesc = "Request: number of writebacks (modified to exclusive) transactions", .ucode = 1ULL << (3 + 8), .grpid = 0, }, { .uname = "PF_DATA_RD", .udesc = "Request: number of data cacheline reads generated by L2 prefetchers", .ucode = 1ULL << (4 + 8), .grpid = 0, }, { .uname = "PF_RFO", .udesc = "Request: number of RFO requests generated by L2 prefetchers", .ucode = 1ULL << (5 + 8), .grpid = 0, }, { .uname = "PF_IFETCH", .udesc = "Request: number of code reads generated by L2 prefetchers", .ucode = 1ULL << (6 + 8), .grpid = 0, }, { .uname = "PF_LLC_DATA_RD", .udesc = "Request: number of L3 prefetcher requests to L2 for loads", .ucode = 1ULL << (7 + 8), .grpid = 0, }, { .uname = "PF_LLC_RFO", .udesc = "Request: number of RFO requests generated by L2 prefetcher", .ucode = 1ULL << (8 + 8), .grpid = 0, }, { .uname = "PF_LLC_IFETCH", .udesc = "Request: number of L2 prefetcher requests to L3 for instruction fetches", .ucode = 1ULL << (9 + 8), .grpid = 0, }, { .uname = "BUS_LOCKS", .udesc = "Request: number of bus lock and split lock requests", .ucode = 1ULL << (10 + 8), .grpid = 0, }, { .uname = "STRM_ST", .udesc = "Request: number of streaming store requests", .ucode = 1ULL << (11 + 8), .grpid = 0, }, { .uname = "OTHER", .udesc = "Request: counts one of the following transaction types, including L3 invalidate, I/O, full or partial writes, WC or non-temporal stores, CLFLUSH, Fences, lock, unlock, split lock", .ucode = 1ULL << (15+8), .grpid = 0, }, { .uname = "ANY_IFETCH", .udesc = "Request: combination of PF_IFETCH | DMND_IFETCH | PF_LLC_IFETCH", .uequiv = "PF_IFETCH:DMND_IFETCH:PF_LLC_IFETCH", .ucode = 0x24400, .grpid = 0, }, { .uname = "ANY_REQUEST", .udesc = "Request: combination of all request umasks", .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_IFETCH:WB:PF_DATA_RD:PF_RFO:PF_IFETCH:PF_LLC_DATA_RD:PF_LLC_RFO:PF_LLC_IFETCH:BUS_LOCKS:STRM_ST:OTHER", .ucode = 0x8fff00, .uflags= 
INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "ANY_DATA", .udesc = "Request: combination of DMND_DATA | PF_DATA_RD | PF_LLC_DATA_RD", .uequiv = "DMND_DATA_RD:PF_DATA_RD:PF_LLC_DATA_RD", .ucode = 0x9100, .grpid = 0, }, { .uname = "ANY_RFO", .udesc = "Request: combination of DMND_RFO | PF_RFO | PF_LLC_RFO", .uequiv = "DMND_RFO:PF_RFO:PF_LLC_RFO", .ucode = 0x12200, .grpid = 0, }, { .uname = "ANY_RESPONSE", .udesc = "Response: count any response type", .ucode = 1ULL << (16+8), .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, .grpid = 1, }, { .uname = "NO_SUPP", .udesc = "Supplier: counts number of times supplier information is not available", .ucode = 1ULL << (17+8), .grpid = 1, }, { .uname = "LLC_HITM", .udesc = "Supplier: counts L3 hits in M-state (initial lookup)", .ucode = 1ULL << (18+8), .grpid = 1, }, { .uname = "LLC_HITE", .udesc = "Supplier: counts L3 hits in E-state", .ucode = 1ULL << (19+8), .grpid = 1, }, { .uname = "LLC_HITS", .udesc = "Supplier: counts L3 hits in S-state", .ucode = 1ULL << (20+8), .grpid = 1, }, { .uname = "LLC_HITF", .udesc = "Supplier: counts L3 hits in F-state", .ucode = 1ULL << (21+8), .grpid = 1, }, { .uname = "LLC_MISS_LOCAL_DRAM", .udesc = "Supplier: counts L3 misses to local DRAM", .ucode = 1ULL << (22+8), .grpid = 1, }, { .uname = "LLC_MISS_LOCAL", .udesc = "Supplier: counts L3 misses to local DRAM", .ucode = 1ULL << (22+8), .uequiv = "LLC_MISS_LOCAL_DRAM", .grpid = 1, }, { .uname = "L3_MISS", .udesc = "Supplier: counts L3 misses to local DRAM", .ucode = 0x1ULL << (22+8), .grpid = 1, .uequiv = "LLC_MISS_LOCAL", .umodel = PFM_PMU_INTEL_SNB, }, { .uname = "LLC_MISS_REMOTE", .udesc = "Supplier: counts L3 misses to remote DRAM", .ucode = 0xffULL << (23+8), .uequiv = "LLC_MISS_REMOTE_DRAM", .grpid = 1, .umodel = PFM_PMU_INTEL_SNB_EP, }, { .uname = "LLC_MISS_REMOTE_DRAM", .udesc = "Supplier: counts L3 misses to remote DRAM", .ucode = 0xffULL << (23+8), .grpid = 1, .umodel = PFM_PMU_INTEL_SNB_EP, }, { 
.uname = "L3_MISS", .udesc = "Supplier: counts L3 misses to local or remote DRAM", .ucode = 0x3ULL << (22+8), .uequiv = "LLC_MISS_LOCAL:LLC_MISS_REMOTE", .umodel = PFM_PMU_INTEL_SNB_EP, .grpid = 1, }, { .uname = "LLC_HITMESF", .udesc = "Supplier: counts L3 hits in any state (M, E, S, F)", .ucode = 0xfULL << (18+8), .uequiv = "LLC_HITM:LLC_HITE:LLC_HITS:LLC_HITF", .grpid = 1, }, { .uname = "SNP_NONE", .udesc = "Snoop: counts number of times no snoop-related information is available", .ucode = 1ULL << (31+8), .grpid = 2, }, { .uname = "SNP_NOT_NEEDED", .udesc = "Snoop: counts the number of times no snoop was needed to satisfy the request", .ucode = 1ULL << (32+8), .grpid = 2, }, { .uname = "NO_SNP_NEEDED", .udesc = "Snoop: counts the number of times no snoop was needed to satisfy the request", .ucode = 1ULL << (32+8), .uequiv = "SNP_NOT_NEEDED", .grpid = 2, }, { .uname = "SNP_MISS", .udesc = "Snoop: counts number of times a snoop was needed and it missed all snooped caches", .ucode = 1ULL << (33+8), .grpid = 2, }, { .uname = "SNP_NO_FWD", .udesc = "Snoop: counts number of times a snoop was needed and it hit in at least one snooped cache", .ucode = 1ULL << (34+8), .grpid = 2, }, { .uname = "SNP_FWD", .udesc = "Snoop: counts number of times a snoop was needed and data was forwarded from a remote socket", .ucode = 1ULL << (35+8), .grpid = 2, }, { .uname = "HITM", .udesc = "Snoop: counts number of times a snoop was needed and it hitM-ed in local or remote cache", .ucode = 1ULL << (36+8), .grpid = 2, }, { .uname = "NON_DRAM", .udesc = "Snoop: counts number of times target was a non-DRAM system address. 
This includes MMIO transactions", .ucode = 1ULL << (37+8), .grpid = 2, }, { .uname = "SNP_ANY", .udesc = "Snoop: any snoop reason", .ucode = 0x7fULL << (31+8), .uequiv = "SNP_NONE:SNP_NOT_NEEDED:SNP_MISS:SNP_NO_FWD:SNP_FWD:HITM:NON_DRAM", .uflags= INTEL_X86_DFL, .grpid = 2, }, }; static const intel_x86_umask_t snb_baclears[]={ { .uname = "ANY", .udesc = "Counts the number of times the front end is re-steered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end", .ucode = 0x1f00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t snb_cycle_activity[]={ { .uname = "CYCLES_L2_PENDING", .udesc = "Cycles with pending L2 miss loads", .ucode = 0x0100 | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, .ucntmsk= 0xf, }, { .uname = "CYCLES_L1D_PENDING", .udesc = "Cycles with pending L1D load cache misses", .ucode = 0x0200 | (0x2 << INTEL_X86_CMASK_BIT), .ucntmsk= 0x4, .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_NO_DISPATCH", .udesc = "Cycles of dispatch stalls", .ucode = 0x0400 | (0x4 << INTEL_X86_CMASK_BIT), .ucntmsk= 0xf, .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_L2_PENDING", .udesc = "Execution stalls due to L2 pending loads", .ucode = 0x0500 | (0x5 << INTEL_X86_CMASK_BIT), .ucntmsk= 0xf, .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS_L1D_PENDING", .udesc = "Execution stalls due to L1D pending loads", .ucode = 0x0600 | (0x6 << INTEL_X86_CMASK_BIT), .ucntmsk= 0x4, .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_ept[]={ { .uname = "WALK_CYCLES", .udesc = "Cycles for an extended page table walk", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t snb_lsd[]={ { .uname = "UOPS", .udesc = "Number of uops delivered by the Loop Stream Detector 
(LSD)", .ucode = 0x100, .uflags= INTEL_X86_DFL, }, { .uname = "ACTIVE", .udesc = "Cycles with uops delivered by the LSD but which did not come from decoder", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "UOPS:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_4_UOPS", .udesc = "Cycles with 4 uops delivered by the LSD but which did not come from decoder", .ucode = 0x100 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uequiv = "UOPS:c=4", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t snb_page_walks[]={ { .uname = "LLC_MISS", .udesc = "Number of page walks with a LLC miss", .ucode = 0x100, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t snb_uops_executed[]={ { .uname = "CORE", .udesc = "Counts total number of uops executed from any thread per cycle", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "THREAD", .udesc = "Counts total number of uops executed per thread each cycle", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "STALL_CYCLES", .udesc = "Number of cycles with no uops executed", .ucode = 0x100 | INTEL_X86_MOD_INV | (1 << INTEL_X86_CMASK_BIT), /* inv=1 cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_1_UOP_EXEC", .udesc = "Cycles where at least 1 uop was executed per thread", .ucode = 0x100 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_2_UOPS_EXEC", .udesc = "Cycles where at least 2 uops were executed per thread", .ucode = 0x100 | (2 << INTEL_X86_CMASK_BIT), /* cnt=2 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_3_UOPS_EXEC", .udesc = "Cycles where at least 3 uops were executed per thread", .ucode = 0x100 | (3 << INTEL_X86_CMASK_BIT), /* cnt=3 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_4_UOPS_EXEC", .udesc = "Cycles 
where at least 4 uops were executed per thread", .ucode = 0x100 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_1", .udesc = "Cycles where at least 1 uop was executed from any thread", .ucode = 0x200 | (1 << INTEL_X86_CMASK_BIT), /* cnt=1 */ .uequiv = "CORE:c=1", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_2", .udesc = "Cycles where at least 2 uops were executed from any thread", .ucode = 0x200 | (2 << INTEL_X86_CMASK_BIT), /* cnt=2 */ .uequiv = "CORE:c=2", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_3", .udesc = "Cycles where at least 3 uops were executed from any thread", .ucode = 0x200 | (3 << INTEL_X86_CMASK_BIT), /* cnt=3 */ .uequiv = "CORE:c=3", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_4", .udesc = "Cycles where at least 4 uops were executed from any thread", .ucode = 0x200 | (4 << INTEL_X86_CMASK_BIT), /* cnt=4 */ .uequiv = "CORE:c=4", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_NONE", .udesc = "Cycles where no uop is executed on any thread", .ucode = 0x200 | INTEL_X86_MOD_INV, /* inv=1 */ .uequiv = "CORE:i", .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I, }, }; static const intel_x86_umask_t snb_mem_load_uops_llc_miss_retired[]={ { .uname = "LOCAL_DRAM", .udesc = "Load uops that miss in the L3 and hit local DRAM", .ucode = 0x100, .umodel = PFM_PMU_INTEL_SNB_EP, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_DRAM", .udesc = "Load uops that miss in the L3 and hit remote DRAM", .ucode = 0x400, .umodel = PFM_PMU_INTEL_SNB_EP, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_entry_t intel_snb_pe[]={ { .name = "AGU_BYPASS_CANCEL", .desc = "Number of executed load operations with all the following traits: 1. addressing of the format [base + offset], 2. 
the offset is between 1 and 2047, 3. the address specified in the base register is in one page and the address [base+offset] is in another page", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xb6, .numasks = LIBPFM_ARRAY_SIZE(snb_agu_bypass_cancel), .ngrp = 1, .umasks = snb_agu_bypass_cancel, }, { .name = "ARITH", .desc = "Counts arithmetic multiply operations", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x14, .numasks = LIBPFM_ARRAY_SIZE(snb_arith), .ngrp = 1, .umasks = snb_arith, }, { .name = "BACLEARS", .desc = "Branch re-steered", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xe6, .numasks = LIBPFM_ARRAY_SIZE(snb_baclears), .ngrp = 1, .umasks = snb_baclears, }, { .name = "BR_INST_EXEC", .desc = "Branch instructions executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x88, .numasks = LIBPFM_ARRAY_SIZE(snb_br_inst_exec), .ngrp = 1, .umasks = snb_br_inst_exec, }, { .name = "BR_INST_RETIRED", .desc = "Retired branch instructions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xc4, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(snb_br_inst_retired), .ngrp = 1, .umasks = snb_br_inst_retired, }, { .name = "BR_MISP_EXEC", .desc = "Mispredicted branches executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x89, .numasks = LIBPFM_ARRAY_SIZE(snb_br_misp_exec), .ngrp = 1, .umasks = snb_br_misp_exec, }, { .name = "BR_MISP_RETIRED", .desc = "Mispredicted retired branches", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xc5, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(snb_br_misp_retired), .ngrp = 1, .umasks = snb_br_misp_retired, }, { .name = "BRANCH_INSTRUCTIONS_RETIRED", .desc = "Count branch instructions at retirement. Specifically, this event counts the retirement of the last micro-op of a branch instruction", .modmsk = INTEL_V3_ATTRS, .equiv = "BR_INST_RETIRED:ALL_BRANCHES", .cntmsk = 0xff, .code = 0xc4, }, { .name = "MISPREDICTED_BRANCH_RETIRED", .desc = "Count mispredicted branch instructions at retirement. 
Specifically, this event counts at retirement of the last micro-op of a branch instruction in the architectural path of the execution and experienced misprediction in the branch prediction hardware", .modmsk = INTEL_V3_ATTRS, .equiv = "BR_MISP_RETIRED:ALL_BRANCHES", .cntmsk = 0xff, .code = 0xc5, }, { .name = "LOCK_CYCLES", .desc = "Locked cycles in L1D and L2", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x63, .numasks = LIBPFM_ARRAY_SIZE(snb_lock_cycles), .ngrp = 1, .umasks = snb_lock_cycles, }, { .name = "CPL_CYCLES", .desc = "Unhalted core cycles at a specific ring level", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x5c, .numasks = LIBPFM_ARRAY_SIZE(snb_cpl_cycles), .ngrp = 1, .umasks = snb_cpl_cycles, }, { .name = "CPU_CLK_UNHALTED", .desc = "Cycles when processor is not in halted state", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x3c, .numasks = LIBPFM_ARRAY_SIZE(snb_cpu_clk_unhalted), .ngrp = 1, .umasks = snb_cpu_clk_unhalted, }, { .name = "DSB2MITE_SWITCHES", .desc = "Number of DSB to MITE switches", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xab, .numasks = LIBPFM_ARRAY_SIZE(snb_dsb2mite_switches), .ngrp = 1, .umasks = snb_dsb2mite_switches, }, { .name = "DSB_FILL", .desc = "DSB fills", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xac, .numasks = LIBPFM_ARRAY_SIZE(snb_dsb_fill), .ngrp = 1, .umasks = snb_dsb_fill, }, { .name = "DTLB_LOAD_MISSES", .desc = "Data TLB load misses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x8, .numasks = LIBPFM_ARRAY_SIZE(snb_dtlb_load_misses), .ngrp = 1, .umasks = snb_dtlb_load_misses, }, { .name = "DTLB_STORE_MISSES", .desc = "Data TLB store misses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x49, .numasks = LIBPFM_ARRAY_SIZE(snb_dtlb_store_misses), .ngrp = 1, .umasks = snb_dtlb_store_misses, }, { .name = "FP_ASSIST", .desc = "X87 Floating point assists", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xca, .numasks = LIBPFM_ARRAY_SIZE(snb_fp_assist), .ngrp = 1, .umasks = 
snb_fp_assist, }, { .name = "FP_COMP_OPS_EXE", .desc = "Counts number of floating point events", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x10, .numasks = LIBPFM_ARRAY_SIZE(snb_fp_comp_ops_exe), .ngrp = 1, .umasks = snb_fp_comp_ops_exe, }, { .name = "HW_PRE_REQ", .desc = "Hardware prefetch requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x4e, .numasks = LIBPFM_ARRAY_SIZE(snb_hw_pre_req), .ngrp = 1, .umasks = snb_hw_pre_req, }, { .name = "ICACHE", .desc = "Instruction Cache accesses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x80, .numasks = LIBPFM_ARRAY_SIZE(snb_icache), .ngrp = 1, .umasks = snb_icache, }, { .name = "IDQ", .desc = "IDQ operations", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x79, .numasks = LIBPFM_ARRAY_SIZE(snb_idq), .ngrp = 1, .umasks = snb_idq, }, { .name = "IDQ_UOPS_NOT_DELIVERED", .desc = "Uops not delivered", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x9c, .numasks = LIBPFM_ARRAY_SIZE(snb_idq_uops_not_delivered), .ngrp = 1, .umasks = snb_idq_uops_not_delivered, }, { .name = "ILD_STALL", .desc = "Instruction Length Decoder stalls", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x87, .numasks = LIBPFM_ARRAY_SIZE(snb_ild_stall), .ngrp = 1, .umasks = snb_ild_stall, }, { .name = "INSTS_WRITTEN_TO_IQ", .desc = "Instructions written to IQ", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x17, .numasks = LIBPFM_ARRAY_SIZE(snb_insts_written_to_iq), .ngrp = 1, .umasks = snb_insts_written_to_iq, }, { .name = "INST_RETIRED", .desc = "Instructions retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xc0, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(snb_inst_retired), .ngrp = 1, .umasks = snb_inst_retired, }, { .name = "INSTRUCTION_RETIRED", .desc = "Number of instructions at retirement", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x10000000full, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "This is an alias for INSTRUCTION_RETIRED", .modmsk = INTEL_V3_ATTRS, .equiv = 
"INSTRUCTION_RETIRED", .cntmsk = 0x10000000full, .code = 0xc0, }, { .name = "INT_MISC", .desc = "Miscellaneous internals", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd, .numasks = LIBPFM_ARRAY_SIZE(snb_int_misc), .ngrp = 1, .umasks = snb_int_misc, }, { .name = "ITLB", .desc = "Instruction TLB", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xae, .numasks = LIBPFM_ARRAY_SIZE(snb_itlb), .ngrp = 1, .umasks = snb_itlb, }, { .name = "ITLB_MISSES", .desc = "Instruction TLB misses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x85, .numasks = LIBPFM_ARRAY_SIZE(snb_dtlb_store_misses), .ngrp = 1, .umasks = snb_dtlb_store_misses, /* identical to actual umasks list for this event */ }, { .name = "L1D", .desc = "L1D cache", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x51, .numasks = LIBPFM_ARRAY_SIZE(snb_l1d), .ngrp = 1, .umasks = snb_l1d, }, { .name = "L1D_BLOCKS", .desc = "L1D is blocking", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xbf, .numasks = LIBPFM_ARRAY_SIZE(snb_l1d_blocks), .ngrp = 1, .umasks = snb_l1d_blocks, }, { .name = "L1D_PEND_MISS", .desc = "L1D pending misses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x4, .code = 0x48, .numasks = LIBPFM_ARRAY_SIZE(snb_l1d_pend_miss), .ngrp = 1, .umasks = snb_l1d_pend_miss, }, { .name = "L2_L1D_WB_RQSTS", .desc = "Writeback requests from L1D to L2", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x28, .numasks = LIBPFM_ARRAY_SIZE(snb_l2_l1d_wb_rqsts), .ngrp = 1, .umasks = snb_l2_l1d_wb_rqsts, }, { .name = "L2_LINES_IN", .desc = "L2 lines allocated", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xf1, .numasks = LIBPFM_ARRAY_SIZE(snb_l2_lines_in), .ngrp = 1, .umasks = snb_l2_lines_in, }, { .name = "L2_LINES_OUT", .desc = "L2 lines evicted", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xf2, .numasks = LIBPFM_ARRAY_SIZE(snb_l2_lines_out), .ngrp = 1, .umasks = snb_l2_lines_out, }, { .name = "L2_RQSTS", .desc = "L2 requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x24, 
.numasks = LIBPFM_ARRAY_SIZE(snb_l2_rqsts), .ngrp = 1, .umasks = snb_l2_rqsts, }, { .name = "L2_STORE_LOCK_RQSTS", .desc = "L2 store lock requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x27, .numasks = LIBPFM_ARRAY_SIZE(snb_l2_store_lock_rqsts), .ngrp = 1, .umasks = snb_l2_store_lock_rqsts, }, { .name = "L2_TRANS", .desc = "L2 transactions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xf0, .numasks = LIBPFM_ARRAY_SIZE(snb_l2_trans), .ngrp = 1, .umasks = snb_l2_trans, }, { .name = "LAST_LEVEL_CACHE_MISSES", .desc = "This is an alias for L3_LAT_CACHE:MISS", .modmsk = INTEL_V3_ATTRS, .equiv = "L3_LAT_CACHE:MISS", .cntmsk = 0xff, .code = 0x412e, }, { .name = "LLC_MISSES", .desc = "Alias for LAST_LEVEL_CACHE_MISSES", .modmsk = INTEL_V3_ATTRS, .equiv = "LAST_LEVEL_CACHE_MISSES", .cntmsk = 0xff, .code = 0x412e, }, { .name = "LAST_LEVEL_CACHE_REFERENCES", .desc = "This is an alias for L3_LAT_CACHE:REFERENCE", .modmsk = INTEL_V3_ATTRS, .equiv = "L3_LAT_CACHE:REFERENCE", .cntmsk = 0xff, .code = 0x4f2e, }, { .name = "LLC_REFERENCES", .desc = "Alias for LAST_LEVEL_CACHE_REFERENCES", .modmsk = INTEL_V3_ATTRS, .equiv = "LAST_LEVEL_CACHE_REFERENCES", .cntmsk = 0xff, .code = 0x4f2e, }, { .name = "LD_BLOCKS", .desc = "Blocking loads", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x3, .numasks = LIBPFM_ARRAY_SIZE(snb_ld_blocks), .ngrp = 1, .umasks = snb_ld_blocks, }, { .name = "LD_BLOCKS_PARTIAL", .desc = "Partial load blocks", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x7, .numasks = LIBPFM_ARRAY_SIZE(snb_ld_blocks_partial), .ngrp = 1, .umasks = snb_ld_blocks_partial, }, { .name = "LOAD_HIT_PRE", .desc = "Load dispatches that hit fill buffer", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x4c, .numasks = LIBPFM_ARRAY_SIZE(snb_load_hit_pre), .ngrp = 1, .umasks = snb_load_hit_pre, }, { .name = "L3_LAT_CACHE", .desc = "Core-originated cacheable demand requests to L3", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x2e, .numasks = 
LIBPFM_ARRAY_SIZE(snb_l3_lat_cache), .ngrp = 1, .umasks = snb_l3_lat_cache, }, { .name = "MACHINE_CLEARS", .desc = "Machine clear asserted", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xc3, .numasks = LIBPFM_ARRAY_SIZE(snb_machine_clears), .ngrp = 1, .umasks = snb_machine_clears, }, { .name = "MEM_LOAD_UOPS_LLC_HIT_RETIRED", .desc = "L3 hit loads uops retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd2, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(snb_mem_load_uops_llc_hit_retired), .ngrp = 1, .umasks = snb_mem_load_uops_llc_hit_retired, }, { .name = "MEM_LOAD_LLC_HIT_RETIRED", .desc = "L3 hit loads uops retired (deprecated use MEM_LOAD_UOPS_LLC_HIT_RETIRED)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd2, .equiv = "MEM_LOAD_UOPS_LLC_HIT_RETIRED", .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(snb_mem_load_uops_llc_hit_retired), .ngrp = 1, .umasks = snb_mem_load_uops_llc_hit_retired, }, { .name = "MEM_LOAD_UOPS_MISC_RETIRED", .desc = "Loads and some non simd split loads uops retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd4, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(snb_mem_load_uops_misc_retired), .ngrp = 1, .umasks = snb_mem_load_uops_misc_retired, }, { .name = "MEM_LOAD_MISC_RETIRED", .desc = "Loads and some non simd split loads uops retired (deprecated use MEM_LOAD_UOPS_MISC_RETIRED)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd4, .equiv = "MEM_LOAD_UOPS_MISC_RETIRED", .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(snb_mem_load_uops_misc_retired), .ngrp = 1, .umasks = snb_mem_load_uops_misc_retired, }, { .name = "MEM_LOAD_UOPS_RETIRED", .desc = "Memory loads uops retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd1, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(snb_mem_load_uops_retired), .ngrp = 1, .umasks = snb_mem_load_uops_retired, }, { .name = "MEM_LOAD_RETIRED", .desc = "Memory loads uops retired (deprecated use MEM_LOAD_UOPS_RETIRED)", .modmsk = 
INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd1, .equiv = "MEM_LOAD_UOPS_RETIRED", .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(snb_mem_load_uops_retired), .ngrp = 1, .umasks = snb_mem_load_uops_retired, }, { .name = "MEM_TRANS_RETIRED", .desc = "Memory transactions retired", .modmsk = INTEL_V3_ATTRS | _INTEL_X86_ATTR_LDLAT, .cntmsk = 0x8, .code = 0xcd, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(snb_mem_trans_retired), .ngrp = 1, .umasks = snb_mem_trans_retired, }, { .name = "MEM_UOPS_RETIRED", .desc = "Memory uops retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd0, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(snb_mem_uops_retired), .ngrp = 1, .umasks = snb_mem_uops_retired, }, { .name = "MEM_UOP_RETIRED", .desc = "Memory uops retired (deprecated use MEM_UOPS_RETIRED)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd0, .equiv = "MEM_UOPS_RETIRED", .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(snb_mem_uops_retired), .ngrp = 1, .umasks = snb_mem_uops_retired, }, { .name = "MISALIGN_MEM_REF", .desc = "Misaligned memory references", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x5, .numasks = LIBPFM_ARRAY_SIZE(snb_misalign_mem_ref), .ngrp = 1, .umasks = snb_misalign_mem_ref, }, { .name = "OFFCORE_REQUESTS", .desc = "Offcore requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xb0, .numasks = LIBPFM_ARRAY_SIZE(snb_offcore_requests), .ngrp = 1, .umasks = snb_offcore_requests, }, { .name = "OFFCORE_REQUESTS_BUFFER", .desc = "Offcore requests buffer", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xb2, .numasks = LIBPFM_ARRAY_SIZE(snb_offcore_requests_buffer), .ngrp = 1, .umasks = snb_offcore_requests_buffer, }, { .name = "OFFCORE_REQUESTS_OUTSTANDING", .desc = "Outstanding offcore requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x60, .numasks = LIBPFM_ARRAY_SIZE(snb_offcore_requests_outstanding), .ngrp = 1, .umasks = snb_offcore_requests_outstanding, }, { .name = "OTHER_ASSISTS", .desc = 
"Count hardware assists", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xc1, .numasks = LIBPFM_ARRAY_SIZE(snb_other_assists), .ngrp = 1, .umasks = snb_other_assists, }, { .name = "PARTIAL_RAT_STALLS", .desc = "Partial Register Allocation Table stalls", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x59, .numasks = LIBPFM_ARRAY_SIZE(snb_partial_rat_stalls), .ngrp = 1, .umasks = snb_partial_rat_stalls, }, { .name = "RESOURCE_STALLS", .desc = "Resource related stall cycles", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xa2, .numasks = LIBPFM_ARRAY_SIZE(snb_resource_stalls), .ngrp = 1, .umasks = snb_resource_stalls, }, { .name = "RESOURCE_STALLS2", .desc = "Resource related stall cycles", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x5b, .numasks = LIBPFM_ARRAY_SIZE(snb_resource_stalls2), .ngrp = 1, .umasks = snb_resource_stalls2, }, { .name = "ROB_MISC_EVENTS", .desc = "Reorder buffer events", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xcc, .numasks = LIBPFM_ARRAY_SIZE(snb_rob_misc_events), .ngrp = 1, .umasks = snb_rob_misc_events, }, { .name = "RS_EVENTS", .desc = "Reservation station events", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x5e, .numasks = LIBPFM_ARRAY_SIZE(snb_rs_events), .ngrp = 1, .umasks = snb_rs_events, }, { .name = "SIMD_FP_256", .desc = "Counts 256-bit packed floating point instructions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0x11, .numasks = LIBPFM_ARRAY_SIZE(snb_simd_fp_256), .ngrp = 1, .umasks = snb_simd_fp_256, }, { .name = "SQ_MISC", .desc = "SuperQ events", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xf4, .numasks = LIBPFM_ARRAY_SIZE(snb_sq_misc), .ngrp = 1, .umasks = snb_sq_misc, }, { .name = "TLB_FLUSH", .desc = "TLB flushes", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xbd, .numasks = LIBPFM_ARRAY_SIZE(snb_tlb_flush), .ngrp = 1, .umasks = snb_tlb_flush, }, { .name = "UNHALTED_CORE_CYCLES", .desc = "Count core clock cycles whenever the clock signal on the specific core is 
running (not halted)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x20000000full, .code = 0x3c, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles", .modmsk = INTEL_FIXED3_ATTRS, .cntmsk = 0x400000000ull, .code = 0x0300, /* pseudo encoding */ .flags = INTEL_X86_FIXED, }, { .name = "UOPS_EXECUTED", .desc = "Uops executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xb1, .numasks = LIBPFM_ARRAY_SIZE(snb_uops_executed), .ngrp = 1, .umasks = snb_uops_executed, }, { .name = "UOPS_DISPATCHED_PORT", .desc = "Uops dispatch to specific ports", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xa1, .numasks = LIBPFM_ARRAY_SIZE(snb_uops_dispatched_port), .ngrp = 1, .umasks = snb_uops_dispatched_port, }, { .name = "UOPS_ISSUED", .desc = "Uops issued", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xe, .numasks = LIBPFM_ARRAY_SIZE(snb_uops_issued), .ngrp = 1, .umasks = snb_uops_issued, }, { .name = "UOPS_RETIRED", .desc = "Uops retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xc2, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(snb_uops_retired), .ngrp = 1, .umasks = snb_uops_retired, }, { .name = "CYCLE_ACTIVITY", .desc = "Stalled cycles", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xa3, .numasks = LIBPFM_ARRAY_SIZE(snb_cycle_activity), .ngrp = 1, .umasks = snb_cycle_activity, }, { .name = "EPT", .desc = "Extended page table", .modmsk = INTEL_V4_ATTRS, .cntmsk = 0xff, .code = 0x4f, .numasks = LIBPFM_ARRAY_SIZE(snb_ept), .ngrp = 1, .umasks = snb_ept, }, { .name = "LSD", .desc = "Loop stream detector", .code = 0xa8, .cntmsk = 0xff, .ngrp = 1, .modmsk = INTEL_V4_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snb_lsd), .umasks = snb_lsd, }, { .name = "PAGE_WALKS", .desc = "page walker", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xbe, .numasks = LIBPFM_ARRAY_SIZE(snb_page_walks), .ngrp = 1, .umasks = snb_page_walks, }, { .name = "MEM_LOAD_UOPS_LLC_MISS_RETIRED", .desc = "Load uops retired which miss the L3 cache", .modmsk = 
INTEL_V3_ATTRS, .cntmsk = 0xff, .code = 0xd3, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(snb_mem_load_uops_llc_miss_retired), .ngrp = 1, .umasks = snb_mem_load_uops_llc_miss_retired, }, { .name = "OFFCORE_RESPONSE_0", .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1b7, .flags= INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(snb_offcore_response), .ngrp = 3, .umasks = snb_offcore_response, }, { .name = "OFFCORE_RESPONSE_1", .desc = "Offcore response event (must provide at least one request type and either any_response or any combination of supplier + snoop)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1bb, .flags= INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(snb_offcore_response), .ngrp = 3, .umasks = snb_offcore_response, /* identical to actual umasks list for this event */ }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_snb_unc_events.h000066400000000000000000000134231502707512200244200ustar00rootroot00000000000000/* * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: snb_unc (Intel Sandy Bridge uncore PMU) */ static const intel_x86_umask_t snb_unc_cbo_xsnp_response[]={ { .uname = "MISS", .udesc = "Number of snoop misses", .ucode = 0x100, .grpid = 0, }, { .uname = "INVAL", .udesc = "Number of snoop invalidates of a non-modified line", .ucode = 0x200, .grpid = 0, }, { .uname = "HIT", .udesc = "Number of snoop hits of a non-modified line", .ucode = 0x400, .grpid = 0, }, { .uname = "HITM", .udesc = "Number of snoop hits of a modified line", .ucode = 0x800, .grpid = 0, }, { .uname = "INVAL_M", .udesc = "Number of snoop invalidates of a modified line", .ucode = 0x1000, .grpid = 0, }, { .uname = "ANY_SNP", .udesc = "Number of snoops", .ucode = 0x1f00, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "EXTERNAL_FILTER", .udesc = "Filter on cross-core snoops initiated by this Cbox due to external snoop request", .ucode = 0x2000, .grpid = 1, .uflags = INTEL_X86_NCOMBO, }, { .uname = "XCORE_FILTER", .udesc = "Filter on cross-core snoops initiated by this Cbox due to processor core memory request", .ucode = 0x4000, .grpid = 1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "EVICTION_FILTER", .udesc = "Filter on cross-core snoops initiated by this Cbox due to LLC eviction", .ucode = 0x8000, .grpid = 1, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snb_unc_cbo_cache_lookup[]={ { .uname = "STATE_M", .udesc = "Number of LLC lookup requests for a line in modified state", .ucode = 0x100, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STATE_E", .udesc = "Number of LLC lookup requests for a line in exclusive state", 
.ucode = 0x200, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STATE_S", .udesc = "Number of LLC lookup requests for a line in shared state", .ucode = 0x400, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STATE_I", .udesc = "Number of LLC lookup requests for a line in invalid state", .ucode = 0x800, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STATE_MESI", .udesc = "Number of LLC lookup requests for a line", .ucode = 0xf00, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "READ_FILTER", .udesc = "Filter on processor core initiated cacheable read requests", .ucode = 0x1000, .grpid = 1, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITE_FILTER", .udesc = "Filter on processor core initiated cacheable write requests", .ucode = 0x2000, .grpid = 1, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EXTSNP_FILTER", .udesc = "Filter on external snoop requests", .ucode = 0x4000, .grpid = 1, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY_FILTER", .udesc = "Filter on any IRQ or IPQ initiated requests including uncacheable, non-coherent requests", .ucode = 0x8000, .grpid = 1, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_entry_t intel_snb_unc_cbo0_pe[]={ { .name = "UNC_CLOCKTICKS", .desc = "uncore clock ticks", .cntmsk = 1ULL << 32, .code = 0xff, /* perf_event pseudo encoding */ .flags = INTEL_X86_FIXED, }, { .name = "UNC_CBO_XSNP_RESPONSE", .desc = "Snoop responses", .modmsk = INTEL_SNB_UNC_ATTRS, .cntmsk = 0xff, .code = 0x22, .numasks = LIBPFM_ARRAY_SIZE(snb_unc_cbo_xsnp_response), .ngrp = 2, .umasks = snb_unc_cbo_xsnp_response, }, { .name = "UNC_CBO_CACHE_LOOKUP", .desc = "LLC cache lookups", .modmsk = INTEL_SNB_UNC_ATTRS, .cntmsk = 0xff, .code = 0x34, .numasks = LIBPFM_ARRAY_SIZE(snb_unc_cbo_cache_lookup), .ngrp = 2, .umasks = snb_unc_cbo_cache_lookup, }, }; static const intel_x86_entry_t intel_snb_unc_cbo_pe[]={ { .name = "UNC_CBO_XSNP_RESPONSE", .desc = "Snoop responses (must provide a snoop type and 
filter)", .modmsk = INTEL_SNB_UNC_ATTRS, .cntmsk = 0xff, .code = 0x22, .numasks = LIBPFM_ARRAY_SIZE(snb_unc_cbo_xsnp_response), .ngrp = 2, .umasks = snb_unc_cbo_xsnp_response, }, { .name = "UNC_CBO_CACHE_LOOKUP", .desc = "LLC cache lookups", .modmsk = INTEL_SNB_UNC_ATTRS, .cntmsk = 0xff, .code = 0x34, .numasks = LIBPFM_ARRAY_SIZE(snb_unc_cbo_cache_lookup), .ngrp = 2, .umasks = snb_unc_cbo_cache_lookup, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_snbep_unc_cbo_events.h000066400000000000000000000572051502707512200255760ustar00rootroot00000000000000/* * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: snbep_unc_cbo (Intel SandyBridge-EP C-Box uncore PMU) */ #define CBO_FILT_MESIF(a, b, c, d) \ { .uname = "STATE_"#a,\ .udesc = #b" cacheline state",\ .ufilters[0] = 1ULL << (18 + (c)),\ .grpid = d, \ } #define CBO_FILT_MESIFS(d) \ CBO_FILT_MESIF(I, Invalid, 0, d), \ CBO_FILT_MESIF(S, Shared, 1, d), \ CBO_FILT_MESIF(E, Exclusive, 2, d), \ CBO_FILT_MESIF(M, Modified, 3, d), \ CBO_FILT_MESIF(F, Forward, 4, d), \ { .uname = "STATE_MESIF",\ .udesc = "Any cache line state",\ .ufilters[0] = 0x1fULL << 18,\ .grpid = d, \ .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, \ } #define CBO_FILT_OPC(d) \ { .uname = "OPC_RFO",\ .udesc = "Demand data RFO (combine with any OPCODE umask)",\ .ufilters[0] = 0x180ULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_CRD",\ .udesc = "Demand code read (combine with any OPCODE umask)",\ .ufilters[0] = 0x181ULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_DRD",\ .udesc = "Demand data read (combine with any OPCODE umask)",\ .ufilters[0] = 0x182ULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PRD",\ .udesc = "Partial reads (UC) (combine with any OPCODE umask)",\ .ufilters[0] = 0x187ULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WCILF",\ .udesc = "Full Stream store (combine with any OPCODE umask)", \ .ufilters[0] = 0x18cULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WCIL",\ .udesc = "Partial Stream store (combine with any OPCODE umask)", \ .ufilters[0] = 0x18dULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PF_RFO",\ .udesc = "Prefetch RFO into LLC but do not pass to L2 (includes hints) (combine with any OPCODE umask)", \ .ufilters[0] = 0x190ULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PF_CODE",\ .udesc = "Prefetch code into LLC but do not pass to L2 (includes hints) (combine with any OPCODE umask)", \ .ufilters[0] = 0x191ULL << 23, \ 
.uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PF_DATA",\ .udesc = "Prefetch data into LLC but do not pass to L2 (includes hints) (combine with any OPCODE umask)", \ .ufilters[0] = 0x192ULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCIWILF",\ .udesc = "PCIe write (non-allocating) (combine with any OPCODE umask)", \ .ufilters[0] = 0x194ULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCIPRD",\ .udesc = "PCIe UC read (combine with any OPCODE umask)", \ .ufilters[0] = 0x195ULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCIITOM",\ .udesc = "PCIe write (allocating) (combine with any OPCODE umask)", \ .ufilters[0] = 0x19cULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCIRDCUR",\ .udesc = "PCIe read current (combine with any OPCODE umask)", \ .ufilters[0] = 0x19eULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WBMTOI",\ .udesc = "Request writeback modified invalidate line (combine with any OPCODE umask)", \ .ufilters[0] = 0x1c4ULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_WBMTOE",\ .udesc = "Request writeback modified set to exclusive (combine with any OPCODE umask)", \ .ufilters[0] = 0x1c5ULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_ITOM",\ .udesc = "Request invalidate line (combine with any OPCODE umask)", \ .ufilters[0] = 0x1c8ULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCINSRD",\ .udesc = "PCIe non-snoop read (combine with any OPCODE umask)", \ .ufilters[0] = 0x1e4ULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCINSWR",\ .udesc = "PCIe non-snoop write (partial) (combine with any OPCODE umask)", \ .ufilters[0] = 0x1e5ULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ }, \ { .uname = "OPC_PCINSWRF",\ .udesc = "PCIe non-snoop write (full) (combine with any OPCODE 
umask)", \ .ufilters[0] = 0x1e6ULL << 23, \ .uflags = INTEL_X86_NCOMBO, \ .grpid = d, \ } static const intel_x86_umask_t snbep_unc_c_llc_lookup[]={ { .uname = "ANY", .udesc = "Any request", .grpid = 0, .uflags = INTEL_X86_NCOMBO, .ucode = 0x1f00, }, { .uname = "DATA_READ", .udesc = "Data read requests", .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, .ucode = 0x300, }, { .uname = "WRITE", .udesc = "Write requests. Includes all write transactions (cached, uncached)", .grpid = 0, .uflags = INTEL_X86_NCOMBO, .ucode = 0x500, }, { .uname = "REMOTE_SNOOP", .udesc = "External snoop request", .grpid = 0, .uflags = INTEL_X86_NCOMBO, .ucode = 0x900, }, { .uname = "NID", .udesc = "Match a given RTID destination NID (must provide nf=X modifier)", .uflags = INTEL_X86_NCOMBO | INTEL_X86_GRP_DFL_NONE, .umodmsk_req = _SNBEP_UNC_ATTR_NF, .grpid = 1, .ucode = 0x4100, }, CBO_FILT_MESIFS(2), }; static const intel_x86_umask_t snbep_unc_c_llc_victims[]={ { .uname = "M_STATE", .udesc = "Lines in M state", .ucode = 0x100, }, { .uname = "E_STATE", .udesc = "Lines in E state", .ucode = 0x200, }, { .uname = "S_STATE", .udesc = "Lines in S state", .ucode = 0x400, }, { .uname = "MISS", .udesc = "TBD", .ucode = 0x800, }, { .uname = "NID", .udesc = "Victimized Lines matching the NID filter (must provide nf=X modifier)", .uflags = INTEL_X86_NCOMBO, .umodmsk_req = _SNBEP_UNC_ATTR_NF, .ucode = 0x4000, }, }; static const intel_x86_umask_t snbep_unc_c_misc[]={ { .uname = "RSPI_WAS_FSE", .udesc = "Silent snoop eviction", .ucode = 0x100, }, { .uname = "WC_ALIASING", .udesc = "Write combining aliasing", .ucode = 0x200, }, { .uname = "STARTED", .udesc = "TBD", .ucode = 0x400, }, { .uname = "RFO_HIT_S", .udesc = "RFO hits in S state", .ucode = 0x800, }, }; static const intel_x86_umask_t snbep_unc_c_ring_ad_used[]={ { .uname = "UP_EVEN", .udesc = "Up and Even ring polarity filter", .ucode = 0x100, }, { .uname = "UP_ODD", .udesc = "Up and odd ring polarity filter", .ucode = 0x200, }, { .uname = 
"DOWN_EVEN", .udesc = "Down and even ring polarity filter", .ucode = 0x400, }, { .uname = "DOWN_ODD", .udesc = "Down and odd ring polarity filter", .ucode = 0x800, }, }; static const intel_x86_umask_t snbep_unc_c_ring_bounces[]={ { .uname = "AK_CORE", .udesc = "Acknowledgment to core", .ucode = 0x200, }, { .uname = "BL_CORE", .udesc = "Data response to core", .ucode = 0x400, }, { .uname = "IV_CORE", .udesc = "Snoops of processor cache", .ucode = 0x800, }, }; static const intel_x86_umask_t snbep_unc_c_ring_iv_used[]={ { .uname = "ANY", .udesc = "Any filter", .ucode = 0xf00, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t snbep_unc_c_rxr_ext_starved[]={ { .uname = "IRQ", .udesc = "Irq externally starved, therefore blocking the IPQ", .ucode = 0x100, }, { .uname = "IPQ", .udesc = "IPQ externally starved, therefore blocking the IRQ", .ucode = 0x200, }, { .uname = "ISMQ", .udesc = "ISMQ externally starved, therefore blocking both IRQ and IPQ", .ucode = 0x400, }, { .uname = "ISMQ_BIDS", .udesc = "Number of time the ISMQ bids", .ucode = 0x800, }, }; static const intel_x86_umask_t snbep_unc_c_rxr_inserts[]={ { .uname = "IPQ", .udesc = "IPQ", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ", .udesc = "IRQ", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ_REJECTED", .udesc = "IRQ rejected", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VFIFO", .udesc = "Counts the number of allocated into the IRQ ordering FIFO", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_c_rxr_ipq_retry[]={ { .uname = "ADDR_CONFLICT", .udesc = "Address conflict", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Any Reject", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FULL", .udesc = "No Egress credits", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "QPI_CREDITS", .udesc = "No QPI credits", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, 
}; static const intel_x86_umask_t snbep_unc_c_rxr_irq_retry[]={ { .uname = "ADDR_CONFLICT", .udesc = "Address conflict", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Any reject", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FULL", .udesc = "No Egress credits", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "QPI_CREDITS", .udesc = "No QPI credits", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RTID", .udesc = "No RTIDs", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_c_rxr_ismq_retry[]={ { .uname = "ANY", .udesc = "Any reject", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FULL", .udesc = "No Egress credits", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IIO_CREDITS", .udesc = "No IIO credits", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "QPI_CREDITS", .udesc = "NO QPI credits", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RTID", .udesc = "No RTIDs", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_c_tor_inserts[]={ { .uname = "EVICTION", .udesc = "Number of Evictions transactions inserted into TOR", .ucode = 0x400, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_ALL", .udesc = "Number of miss requests inserted into the TOR", .ucode = 0xa00, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_OPCODE", .udesc = "Number of miss transactions inserted into the TOR that match an opcode (must provide opc_* umask)", .ucode = 0x300, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NID_ALL", .udesc = "Number of NID-matched transactions inserted into the TOR (must provide nf=X modifier)", .ucode = 0x4800, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_EVICTION", .udesc = "Number of NID-matched 
eviction transactions inserted into the TOR (must provide nf=X modifier)", .ucode = 0x4400, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_MISS_ALL", .udesc = "Number of NID-matched miss transactions that were inserted into the TOR (must provide nf=X modifier)", .ucode = 0x4a00, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_MISS_OPCODE", .udesc = "Number of NID and opcode matched miss transactions inserted into the TOR (must provide opc_* umask and nf=X modifier)", .ucode = 0x4300, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NID_OPCODE", .udesc = "Number of transactions inserted into the TOR that match a NID and opcode (must provide opc_* umask and nf=X modifier)", .ucode = 0x4100, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NID_WB", .udesc = "Number of NID-matched write back transactions inserted into the TOR (must provide nf=X modifier)", .ucode = 0x5000, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "OPCODE", .udesc = "Number of transactions inserted into the TOR that match an opcode (must provide opc_* umask)", .ucode = 0x100, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WB", .udesc = "Number of write transactions inserted into the TOR", .ucode = 0x1000, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, CBO_FILT_OPC(1) }; static const intel_x86_umask_t snbep_unc_c_tor_occupancy[]={ { .uname = "ALL", .udesc = "All valid TOR entries", .ucode = 0x800, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_EXCL_GRP_GT, }, { .uname = "EVICTION", .udesc = "Number of outstanding eviction transactions in the TOR", .ucode = 0x400, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_ALL", .udesc = "Number 
of outstanding miss requests in the TOR", .ucode = 0xa00, .grpid = 0, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "MISS_OPCODE", .udesc = "Number of TOR entries that match a NID and an opcode (must provide opc_* umask)", .ucode = 0x300, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NID_ALL", .udesc = "Number of NID-matched outstanding requests in the TOR (must provide nf=X modifier)", .ucode = 0x4800, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_EVICTION", .udesc = "Number of NID-matched outstanding requests in the TOR (must provide a nf=X modifier)", .ucode = 0x4400, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_MISS_ALL", .udesc = "Number of NID-matched outstanding miss requests in the TOR (must provide a nf=X modifier)", .ucode = 0x4a00, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF, .uflags = INTEL_X86_NCOMBO | INTEL_X86_EXCL_GRP_GT, }, { .uname = "NID_MISS_OPCODE", .udesc = "Number of NID-matched outstanding miss requests in the TOR that match an opcode (must provide nf=X modifier and opc_* umask)", .ucode = 0x4300, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NID_OPCODE", .udesc = "Number of NID-matched TOR entries that match an opcode (must provide nf=X modifier and opc_* umask)", .ucode = 0x4100, .grpid = 0, .umodmsk_req = _SNBEP_UNC_ATTR_NF, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OPCODE", .udesc = "Number of TOR entries that match an opcode (must provide opc_* umask)", .ucode = 0x100, .grpid = 0, .uflags = INTEL_X86_NCOMBO, }, CBO_FILT_OPC(1) }; static const intel_x86_umask_t snbep_unc_c_txr_inserts[]={ { .uname = "AD_CACHE", .udesc = "Counts the number of ring transactions from Cachebo to AD ring", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_CACHE", .udesc = "Counts the number of ring transactions from Cachebo to AK ring",
.ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CACHE", .udesc = "Counts the number of ring transactions from Cachebo to BL ring", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_CACHE", .udesc = "Counts the number of ring transactions from Cachebo to IV ring", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_CORE", .udesc = "Counts the number of ring transactions from Corebo to AD ring", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_CORE", .udesc = "Counts the number of ring transactions from Corebo to AK ring", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_CORE", .udesc = "Counts the number of ring transactions from Corebo to BL ring", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_snbep_unc_c_pe[]={ { .name = "UNC_C_CLOCKTICKS", .desc = "C-box Uncore clockticks", .modmsk = 0x0, .cntmsk = 0xf, .code = 0x00, .flags = INTEL_X86_FIXED, }, { .name = "UNC_C_COUNTER0_OCCUPANCY", .desc = "Counter 0 occupancy. Counts the occupancy related information by filtering CB0 occupancy count captured in counter 0.", .modmsk = SNBEP_UNC_CBO_ATTRS, .cntmsk = 0xe, .code = 0x1f, }, { .name = "UNC_C_ISMQ_DRD_MISS_OCC", .desc = "TBD", .modmsk = SNBEP_UNC_CBO_ATTRS, .cntmsk = 0x3, .code = 0x21, }, { .name = "UNC_C_LLC_LOOKUP", .desc = "Cache lookups. 
Counts number of times the LLC is accessed from L2 for code, data, prefetches (Must set filter mask bit 0 and select )", .modmsk = SNBEP_UNC_CBO_NID_ATTRS, .cntmsk = 0x3, .code = 0x34, .ngrp = 3, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_llc_lookup), .umasks = snbep_unc_c_llc_lookup, }, { .name = "UNC_C_LLC_VICTIMS", .desc = "Lines victimized", .modmsk = SNBEP_UNC_CBO_NID_ATTRS, .cntmsk = 0x3, .code = 0x37, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_llc_victims), .ngrp = 1, .umasks = snbep_unc_c_llc_victims, }, { .name = "UNC_C_MISC", .desc = "Miscellaneous C-Box events", .modmsk = SNBEP_UNC_CBO_ATTRS, .cntmsk = 0x3, .code = 0x39, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_misc), .ngrp = 1, .umasks = snbep_unc_c_misc, }, { .name = "UNC_C_RING_AD_USED", .desc = "Address ring in use. Counts number of cycles ring is being used at this ring stop", .modmsk = SNBEP_UNC_CBO_ATTRS, .cntmsk = 0xc, .code = 0x1b, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_ring_ad_used), .ngrp = 1, .umasks = snbep_unc_c_ring_ad_used, }, { .name = "UNC_C_RING_AK_USED", .desc = "Acknowledgment ring in use. Counts number of cycles ring is being used at this ring stop", .modmsk = SNBEP_UNC_CBO_ATTRS, .cntmsk = 0xc, .code = 0x1c, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_ring_ad_used), /* identical to RING_AD_USED */ .ngrp = 1, .umasks = snbep_unc_c_ring_ad_used, }, { .name = "UNC_C_RING_BL_USED", .desc = "Bus or Data ring in use. 
Counts number of cycles ring is being used at this ring stop", .modmsk = SNBEP_UNC_CBO_ATTRS, .cntmsk = 0xc, .code = 0x1d, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_ring_ad_used), /* identical to RING_AD_USED */ .ngrp = 1, .umasks = snbep_unc_c_ring_ad_used, }, { .name = "UNC_C_RING_BOUNCES", .desc = "Number of LLC responses that bounced in the ring", .modmsk = SNBEP_UNC_CBO_ATTRS, .cntmsk = 0x3, .code = 0x05, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_ring_bounces), .ngrp = 1, .umasks = snbep_unc_c_ring_bounces, }, { .name = "UNC_C_RING_IV_USED", .desc = "Invalidate ring in use. Counts number of cycles ring is being used at this ring stop", .modmsk = SNBEP_UNC_CBO_ATTRS, .cntmsk = 0xc, .code = 0x1e, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_ring_iv_used), .ngrp = 1, .umasks = snbep_unc_c_ring_iv_used, }, { .name = "UNC_C_RING_SRC_THRTL", .desc = "TBD", .modmsk = SNBEP_UNC_CBO_ATTRS, .cntmsk = 0x3, .code = 0x07, }, { .name = "UNC_C_RXR_EXT_STARVED", .desc = "Ingress arbiter blocking cycles", .modmsk = SNBEP_UNC_CBO_ATTRS, .cntmsk = 0x3, .code = 0x12, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_rxr_ext_starved), .ngrp = 1, .umasks = snbep_unc_c_rxr_ext_starved, }, { .name = "UNC_C_RXR_INSERTS", .desc = "Ingress Allocations", .code = 0x13, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_rxr_inserts), .umasks = snbep_unc_c_rxr_inserts }, { .name = "UNC_C_RXR_IPQ_RETRY", .desc = "Probe Queue Retries", .code = 0x31, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_rxr_ipq_retry), .umasks = snbep_unc_c_rxr_ipq_retry }, { .name = "UNC_C_RXR_IRQ_RETRY", .desc = "Ingress Request Queue Rejects", .code = 0x32, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_rxr_irq_retry), .umasks = snbep_unc_c_rxr_irq_retry }, { .name = "UNC_C_RXR_ISMQ_RETRY", .desc = "ISMQ Retries", .code = 0x33, .cntmsk = 0x3, .ngrp = 1, .modmsk = 
SNBEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_rxr_ismq_retry), .umasks = snbep_unc_c_rxr_ismq_retry }, { .name = "UNC_C_RXR_OCCUPANCY", .desc = "Ingress Occupancy", .code = 0x11, .cntmsk = 0x1, .ngrp = 1, .modmsk = SNBEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_rxr_inserts), .umasks = snbep_unc_c_rxr_inserts, /* identical to snbep_unc_c_rxr_inserts */ }, { .name = "UNC_C_TOR_INSERTS", .desc = "TOR Inserts", .code = 0x35, .cntmsk = 0x3, .ngrp = 2, .modmsk = SNBEP_UNC_CBO_NID_ATTRS, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_tor_inserts), .umasks = snbep_unc_c_tor_inserts }, { .name = "UNC_C_TOR_OCCUPANCY", .desc = "TOR Occupancy", .code = 0x36, .cntmsk = 0x1, .ngrp = 2, .modmsk = SNBEP_UNC_CBO_NID_ATTRS, .flags = INTEL_X86_NO_AUTOENCODE, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_tor_occupancy), .umasks = snbep_unc_c_tor_occupancy }, { .name = "UNC_C_TXR_ADS_USED", .desc = "Egress events", .code = 0x04, .cntmsk = 0x3, .modmsk = SNBEP_UNC_CBO_ATTRS, }, { .name = "UNC_C_TXR_INSERTS", .desc = "Egress allocations", .code = 0x02, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_CBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_c_txr_inserts), .umasks = snbep_unc_c_txr_inserts }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_snbep_unc_ha_events.h000066400000000000000000000337271502707512200254260ustar00rootroot00000000000000/* * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in 
all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. * * PMU: snbep_unc_ha (Intel SandyBridge-EP HA uncore PMU) */ static const intel_x86_umask_t snbep_unc_h_conflict_cycles[]={ { .uname = "CONFLICT", .udesc = "Number of cycles that we are handling conflicts", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_CONFLICT", .udesc = "Number of cycles that we are not handling conflicts", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_h_directory_lookup[]={ { .uname = "NO_SNP", .udesc = "Snoop not needed", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP", .udesc = "Snoop needed", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_h_directory_update[]={ { .uname = "ANY", .udesc = "Counts any directory update", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "CLEAR", .udesc = "Directory clears", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SET", .udesc = "Directory set", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_h_igr_no_credit_cycles[]={ { .uname = "AD_QPI0", .udesc = "AD to QPI link 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_QPI1", .udesc = "AD to QPI link 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_QPI0", .udesc = 
"BL to QPI link 0", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_QPI1", .udesc = "BL to QPI link 1", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_h_imc_writes[]={ { .uname = "ALL", .udesc = "Counts all writes", .ucode = 0xf00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FULL", .udesc = "Counts full line non ISOCH", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FULL_ISOCH", .udesc = "Counts ISOCH full line", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PARTIAL", .udesc = "Counts partial non-ISOCH", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PARTIAL_ISOCH", .udesc = "Counts ISOCH partial", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_h_requests[]={ { .uname = "READS", .udesc = "Counts incoming read requests. Good proxy for LLC read misses, incl. RFOs", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITES", .udesc = "Counts incoming writes", .ucode = 0xc00, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_h_rpq_cycles_no_reg_credits[]={ { .uname = "CHN0", .udesc = "Channel 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN1", .udesc = "Channel 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN2", .udesc = "channel 2", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CHN3", .udesc = "Chanell 3", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_h_tad_requests_g0[]={ { .uname = "REGION0", .udesc = "Counts for TAD Region 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION1", .udesc = "Counts for TAD Region 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION2", .udesc = "Counts for TAD Region 2", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION3", .udesc = "Counts for TAD Region 3", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { 
.uname = "REGION4", .udesc = "Counts for TAD Region 4", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION5", .udesc = "Counts for TAD Region 5", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION6", .udesc = "Counts for TAD Region 6", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION7", .udesc = "Counts for TAD Region 7", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_h_tad_requests_g1[]={ { .uname = "REGION8", .udesc = "Counts for TAD Region 8", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION9", .udesc = "Counts for TAD Region 9", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION10", .udesc = "Counts for TAD Region 10", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REGION11", .udesc = "Counts for TAD Region 11", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_h_tracker_inserts[]={ { .uname = "ALL", .udesc = "Counts all requests", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t snbep_unc_h_txr_ad[]={ { .uname = "NDR", .udesc = "Counts non-data responses", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP", .udesc = "Counts outbound snoops send on the ring", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_h_txr_ad_cycles_full[]={ { .uname = "ALL", .udesc = "Counts cycles full from both schedulers", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "SCHED0", .udesc = "Counts cycles full from scheduler bank 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCHED1", .udesc = "Counts cycles full from scheduler bank 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_h_txr_ak_cycles_full[]={ { .uname = "ALL", .udesc = "Counts cycles from both schedulers", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO | 
INTEL_X86_DFL, }, { .uname = "SCHED0", .udesc = "Counts cycles from scheduler bank 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCHED1", .udesc = "Counts cycles from scheduler bank 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_h_txr_bl[]={ { .uname = "DRS_CACHE", .udesc = "Counts data being sent to the cache", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_CORE", .udesc = "Counts data being sent directly to the requesting core", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_QPI", .udesc = "Counts data being sent to a remote socket over QPI", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, }; #if 0 static const intel_x86_umask_t snbep_unc_h_addr_opc_match[]={ { .uname = "FILT", .udesc = "Number of addr and opcode matches (opc via opc= or address via addr= modifiers)", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL | INTEL_X86_ADDR, }, }; #endif static const intel_x86_entry_t intel_snbep_unc_h_pe[]={ { .name = "UNC_H_CLOCKTICKS", .desc = "HA Uncore clockticks", .modmsk = SNBEP_UNC_HA_ATTRS, .cntmsk = 0xf, .code = 0x00, }, { .name = "UNC_H_CONFLICT_CYCLES", .desc = "Conflict Checks", .code = 0xb, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_h_conflict_cycles), .umasks = snbep_unc_h_conflict_cycles, }, { .name = "UNC_H_DIRECT2CORE_COUNT", .desc = "Direct2Core Messages Sent", .code = 0x11, .cntmsk = 0xf, .modmsk = SNBEP_UNC_HA_ATTRS, }, { .name = "UNC_H_DIRECT2CORE_CYCLES_DISABLED", .desc = "Cycles when Direct2Core was Disabled", .code = 0x12, .cntmsk = 0xf, .modmsk = SNBEP_UNC_HA_ATTRS, }, { .name = "UNC_H_DIRECT2CORE_TXN_OVERRIDE", .desc = "Number of Reads that had Direct2Core Overridden", .code = 0x13, .cntmsk = 0xf, .modmsk = SNBEP_UNC_HA_ATTRS, }, { .name = "UNC_H_DIRECTORY_LOOKUP", .desc = "Directory Lookups", .code = 0xc, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_HA_ATTRS, .numasks = 
LIBPFM_ARRAY_SIZE(snbep_unc_h_directory_lookup), .umasks = snbep_unc_h_directory_lookup }, { .name = "UNC_H_DIRECTORY_UPDATE", .desc = "Directory Updates", .code = 0xd, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_h_directory_update), .umasks = snbep_unc_h_directory_update }, { .name = "UNC_H_IGR_NO_CREDIT_CYCLES", .desc = "Cycles without QPI Ingress Credits", .code = 0x22, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_h_igr_no_credit_cycles), .umasks = snbep_unc_h_igr_no_credit_cycles }, { .name = "UNC_H_IMC_RETRY", .desc = "Retry Events", .code = 0x1e, .cntmsk = 0xf, .modmsk = SNBEP_UNC_HA_ATTRS, }, { .name = "UNC_H_IMC_WRITES", .desc = "HA to iMC Full Line Writes Issued", .code = 0x1a, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_h_imc_writes), .umasks = snbep_unc_h_imc_writes }, { .name = "UNC_H_REQUESTS", .desc = "Read and Write Requests", .code = 0x1, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_h_requests), .umasks = snbep_unc_h_requests }, { .name = "UNC_H_RPQ_CYCLES_NO_REG_CREDITS", .desc = "iMC RPQ Credits Empty - Regular", .code = 0x15, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_h_rpq_cycles_no_reg_credits), .umasks = snbep_unc_h_rpq_cycles_no_reg_credits }, { .name = "UNC_H_TAD_REQUESTS_G0", .desc = "HA Requests to a TAD Region - Group 0", .code = 0x1b, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_h_tad_requests_g0), .umasks = snbep_unc_h_tad_requests_g0 }, { .name = "UNC_H_TAD_REQUESTS_G1", .desc = "HA Requests to a TAD Region - Group 1", .code = 0x1c, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_h_tad_requests_g1), .umasks = snbep_unc_h_tad_requests_g1 }, { .name = "UNC_H_TRACKER_INSERTS", .desc = "Tracker 
Allocations", .code = 0x6, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_h_tracker_inserts), .umasks = snbep_unc_h_tracker_inserts }, { .name = "UNC_H_TXR_AD", .desc = "Outbound NDR Ring Transactions", .code = 0xf, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_h_txr_ad), .umasks = snbep_unc_h_txr_ad }, { .name = "UNC_H_TXR_AD_CYCLES_FULL", .desc = "AD Egress Full", .code = 0x2a, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_h_txr_ad_cycles_full), .umasks = snbep_unc_h_txr_ad_cycles_full }, { .name = "UNC_H_TXR_AK_CYCLES_FULL", .desc = "AK Egress Full", .code = 0x32, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_h_txr_ak_cycles_full), .umasks = snbep_unc_h_txr_ak_cycles_full }, { .name = "UNC_H_TXR_AK_NDR", .desc = "Outbound NDR Ring Transactions", .code = 0xe, .cntmsk = 0xf, .modmsk = SNBEP_UNC_HA_ATTRS, }, { .name = "UNC_H_TXR_BL", .desc = "Outbound DRS Ring Transactions to Cache", .code = 0x10, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_h_txr_bl), .umasks = snbep_unc_h_txr_bl }, { .name = "UNC_H_TXR_BL_CYCLES_FULL", .desc = "BL Egress Full", .code = 0x36, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_h_txr_ak_cycles_full), .umasks = snbep_unc_h_txr_ak_cycles_full, /* identical to snbep_unc_h_txr_ak_cycles_full */ }, { .name = "UNC_H_WPQ_CYCLES_NO_REG_CREDITS", .desc = "HA iMC CHN0 WPQ Credits Empty - Regular", .code = 0x18, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_HA_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_h_rpq_cycles_no_reg_credits), .umasks = snbep_unc_h_rpq_cycles_no_reg_credits , /* identical to snbep_unc_h_rpq_cycles_no_reg_credits */ }, #if 0 { .name = "UNC_H_ADDR_OPC_MATCH", .desc = "QPI address/opcode match", .code = 0x20, .cntmsk = 0xf, .ngrp = 1, 
.modmsk = SNBEP_UNC_HA_OPC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_h_addr_opc_match), .umasks = snbep_unc_h_addr_opc_match, }, #endif }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_snbep_unc_imc_events.h000066400000000000000000000237031502707512200255770ustar00rootroot00000000000000/* * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: snbep_unc_imc (Intel SandyBridge-EP IMC uncore PMU) */ static const intel_x86_umask_t snbep_unc_m_cas_count[]={ { .uname = "ALL", .udesc = "Counts total number of DRAM CAS commands issued on this channel", .ucode = 0xf00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "RD", .udesc = "Counts all DRAM reads on this channel, incl. 
underfills", .ucode = 0x300, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_REG", .udesc = "Counts number of DRAM read CAS commands issued on this channel, incl. regular read CAS and those with implicit precharge", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_UNDERFILL", .udesc = "Counts number of underfill reads issued by the memory controller", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR", .udesc = "Counts number of DRAM write CAS commands on this channel", .ucode = 0xc00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_RMM", .udesc = "Counts Number of opportunistic DRAM write CAS commands issued on this channel", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_WMM", .udesc = "Counts number of DRAM write CAS commands issued on this channel while in Write-Major mode", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_m_dram_refresh[]={ { .uname = "HIGH", .udesc = "TBD", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PANIC", .udesc = "TBD", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_m_major_modes[]={ { .uname = "ISOCH", .udesc = "Counts cycles in ISOCH Major mode", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PARTIAL", .udesc = "Counts cycles in Partial Major mode", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READ", .udesc = "Counts cycles in Read Major mode", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITE", .udesc = "Counts cycles in Write Major mode", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_m_power_cke_cycles[]={ { .uname = "RANK0", .udesc = "Count cycles for rank 0", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK1", .udesc = "Count cycles for rank 1", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK2", .udesc = "Count cycles for rank 2", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { 
.uname = "RANK3", .udesc = "Count cycles for rank 3", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK4", .udesc = "Count cycles for rank 4", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK5", .udesc = "Count cycles for rank 5", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK6", .udesc = "Count cycles for rank 6", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RANK7", .udesc = "Count cycles for rank 7", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_m_preemption[]={ { .uname = "RD_PREEMPT_RD", .udesc = "Counts read over read preemptions", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_PREEMPT_WR", .udesc = "Counts read over write preemptions", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_m_pre_count[]={ { .uname = "PAGE_CLOSE", .udesc = "Counts number of DRAM precharge commands sent on this channel as a result of the page close counter expiring", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PAGE_MISS", .udesc = "Counts number of DRAM precharge commands sent on this channel as a result of page misses", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_snbep_unc_m_pe[]={ { .name = "UNC_M_CLOCKTICKS", .desc = "IMC Uncore clockticks", .modmsk = 0x0, .cntmsk = 0x100000000ull, .code = 0xff, /* perf pseudo encoding for fixed counter */ .flags = INTEL_X86_FIXED, }, { .name = "UNC_M_ACT_COUNT", .desc = "DRAM Activate Count", .code = 0x1, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_CAS_COUNT", .desc = "DRAM RD_CAS and WR_CAS Commands.", .code = 0x4, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_m_cas_count), .umasks = snbep_unc_m_cas_count }, { .name = "UNC_M_DRAM_PRE_ALL", .desc = "DRAM Precharge All Commands", .code = 0x6, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, { .name = 
"UNC_M_DRAM_REFRESH", .desc = "Number of DRAM Refreshes Issued", .code = 0x5, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_m_dram_refresh), .umasks = snbep_unc_m_dram_refresh }, { .name = "UNC_M_ECC_CORRECTABLE_ERRORS", .desc = "ECC Correctable Errors", .code = 0x9, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_MAJOR_MODES", .desc = "Cycles in a Major Mode", .code = 0x7, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_m_major_modes), .umasks = snbep_unc_m_major_modes }, { .name = "UNC_M_POWER_CHANNEL_DLLOFF", .desc = "Channel DLLOFF Cycles", .code = 0x84, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_POWER_CHANNEL_PPD", .desc = "Channel PPD Cycles", .code = 0x85, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_POWER_CKE_CYCLES", .desc = "CKE_ON_CYCLES by Rank", .code = 0x83, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_m_power_cke_cycles), .umasks = snbep_unc_m_power_cke_cycles }, { .name = "UNC_M_POWER_CRITICAL_THROTTLE_CYCLES", .desc = "Critical Throttle Cycles", .code = 0x86, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_POWER_SELF_REFRESH", .desc = "Clock-Enabled Self-Refresh", .code = 0x43, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_POWER_THROTTLE_CYCLES", .desc = "Throttle Cycles for Rank 0", .code = 0x41, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_m_power_cke_cycles), .umasks = snbep_unc_m_power_cke_cycles /* identical to snbep_unc_m_power_cke_cycles */ }, { .name = "UNC_M_PREEMPTION", .desc = "Read Preemption Count", .code = 0x8, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_m_preemption), .umasks = snbep_unc_m_preemption }, { .name = "UNC_M_PRE_COUNT", .desc = "DRAM Precharge commands.", .code = 0x2, .cntmsk 
= 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_IMC_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_m_pre_count), .umasks = snbep_unc_m_pre_count }, { .name = "UNC_M_RPQ_CYCLES_FULL", .desc = "Read Pending Queue Full Cycles", .code = 0x12, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_RPQ_CYCLES_NE", .desc = "Read Pending Queue Not Empty", .code = 0x11, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_RPQ_INSERTS", .desc = "Read Pending Queue Allocations", .code = 0x10, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_RPQ_OCCUPANCY", .desc = "Read Pending Queue Occupancy", .code = 0x80, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WPQ_CYCLES_FULL", .desc = "Write Pending Queue Full Cycles", .code = 0x22, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WPQ_CYCLES_NE", .desc = "Write Pending Queue Not Empty", .code = 0x21, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WPQ_INSERTS", .desc = "Write Pending Queue Allocations", .code = 0x20, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WPQ_OCCUPANCY", .desc = "Write Pending Queue Occupancy", .code = 0x81, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WPQ_READ_HIT", .desc = "Write Pending Queue CAM Match", .code = 0x23, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, { .name = "UNC_M_WPQ_WRITE_HIT", .desc = "Write Pending Queue CAM Match", .code = 0x24, .cntmsk = 0xf, .modmsk = SNBEP_UNC_IMC_ATTRS, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_snbep_unc_pcu_events.h000066400000000000000000000223061502707512200256140ustar00rootroot00000000000000/* * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, 
merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: snbep_unc_pcu (Intel SandyBridge-EP PCU uncore) */ static const intel_x86_umask_t snbep_unc_p_power_state_occupancy[]={ { .uname = "CORES_C0", .udesc = "Counts number of cores in C0", .ucode = 0x4000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORES_C3", .udesc = "Counts number of cores in C3", .ucode = 0x8000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORES_C6", .udesc = "Counts number of cores in C6", .ucode = 0xc000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_p_occupancy_counters[]={ { .uname = "C0", .udesc = "Counts number of cores in C0", .ucode = 0x0100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "C3", .udesc = "Counts number of cores in C3", .ucode = 0x0200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "C6", .udesc = "Counts number of cores in C6", .ucode = 0x0300, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_snbep_unc_p_pe[]={ { .name = "UNC_P_CLOCKTICKS", .desc = "PCU Uncore clockticks", .modmsk = SNBEP_UNC_PCU_ATTRS, .cntmsk = 0xf, .code = 0x00, }, { .name = "UNC_P_CORE0_TRANSITION_CYCLES", .desc = "Core C State 
Transition Cycles", .code = 0x3 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE1_TRANSITION_CYCLES", .desc = "Core C State Transition Cycles", .code = 0x4 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE2_TRANSITION_CYCLES", .desc = "Core C State Transition Cycles", .code = 0x5 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE3_TRANSITION_CYCLES", .desc = "Core C State Transition Cycles", .code = 0x6 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE4_TRANSITION_CYCLES", .desc = "Core C State Transition Cycles", .code = 0x7 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE5_TRANSITION_CYCLES", .desc = "Core C State Transition Cycles", .code = 0x8 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE6_TRANSITION_CYCLES", .desc = "Core C State Transition Cycles", .code = 0x9 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_CORE7_TRANSITION_CYCLES", .desc = "Core C State Transition Cycles", .code = 0xa | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE0", .desc = "Core C State Demotions", .code = 0x1e, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE1", .desc = "Core C State Demotions", .code = 0x1f, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE2", .desc = "Core C State Demotions", .code = 0x20, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE3", .desc = "Core C State Demotions", .code = 0x21, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE4", .desc = "Core C State Demotions", .code = 0x22, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, 
}, { .name = "UNC_P_DEMOTIONS_CORE5", .desc = "Core C State Demotions", .code = 0x23, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE6", .desc = "Core C State Demotions", .code = 0x24, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_DEMOTIONS_CORE7", .desc = "Core C State Demotions", .code = 0x25, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_BAND0_CYCLES", .desc = "Frequency Residency", .code = 0xb, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = SNBEP_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_BAND1_CYCLES", .desc = "Frequency Residency", .code = 0xc, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = SNBEP_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_BAND2_CYCLES", .desc = "Frequency Residency", .code = 0xd, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = SNBEP_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_BAND3_CYCLES", .desc = "Frequency Residency", .code = 0xe, .cntmsk = 0xf, .flags = INTEL_X86_NO_AUTOENCODE, .modmsk = SNBEP_UNC_PCU_BAND_ATTRS, .modmsk_req = _SNBEP_UNC_ATTR_FF, }, { .name = "UNC_P_FREQ_MAX_CURRENT_CYCLES", .desc = "Current Strongest Upper Limit Cycles", .code = 0x7, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_MAX_LIMIT_THERMAL_CYCLES", .desc = "Thermal Strongest Upper Limit Cycles", .code = 0x4, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_MAX_OS_CYCLES", .desc = "OS Strongest Upper Limit Cycles", .code = 0x6, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_MAX_POWER_CYCLES", .desc = "Power Strongest Upper Limit Cycles", .code = 0x5, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_FREQ_MIN_IO_P_CYCLES", .desc = "IO P Limit Strongest Lower Limit Cycles", .code = 0x1 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = 
SNBEP_UNC_PCU_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_p_occupancy_counters), .umasks = snbep_unc_p_occupancy_counters }, { .name = "UNC_P_FREQ_MIN_PERF_P_CYCLES", .desc = "Perf P Limit Strongest Lower Limit Cycles", .code = 0x2 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_PCU_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_p_occupancy_counters), .umasks = snbep_unc_p_occupancy_counters }, { .name = "UNC_P_FREQ_TRANS_CYCLES", .desc = "Cycles spent changing Frequency", .code = 0x0 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_PCU_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_p_occupancy_counters), .umasks = snbep_unc_p_occupancy_counters }, { .name = "UNC_P_MEMORY_PHASE_SHEDDING_CYCLES", .desc = "Memory Phase Shedding Cycles", .code = 0x2f, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_POWER_STATE_OCCUPANCY", .desc = "Number of cores in C0", .code = 0x80, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_PCU_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_p_power_state_occupancy), .umasks = snbep_unc_p_power_state_occupancy }, { .name = "UNC_P_PROCHOT_EXTERNAL_CYCLES", .desc = "External Prochot", .code = 0xa, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_PROCHOT_INTERNAL_CYCLES", .desc = "Internal Prochot", .code = 0x9, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_TOTAL_TRANSITION_CYCLES", .desc = "Total Core C State Transition Cycles", .code = 0xb | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_VOLT_TRANS_CYCLES_CHANGE", .desc = "Cycles Changing Voltage", .code = 0x3, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_VOLT_TRANS_CYCLES_DECREASE", .desc = "Cycles Decreasing Voltage", .code = 0x2, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, { .name = "UNC_P_VOLT_TRANS_CYCLES_INCREASE", .desc = "Cycles Increasing Voltage", .code = 0x1, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, }, 
{ .name = "UNC_P_VR_HOT_CYCLES", .desc = "VR Hot", .code = 0x32, .cntmsk = 0xf, .modmsk = SNBEP_UNC_PCU_ATTRS, },
};
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_snbep_unc_qpi_events.h
/*
 * Copyright (c) 2012 Google, Inc
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
 *
 * This file has been automatically generated.
* * PMU: snbep_unc_qpi (Intel SandyBridge-EP QPI uncore) */ static const intel_x86_umask_t snbep_unc_q_direct2core[]={ { .uname = "FAILURE_CREDITS", .udesc = "Number of spawn failures due to lack of Egress credits", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAILURE_CREDITS_RBT", .udesc = "Number of spawn failures due to lack of Egress credit and route-back table (RBT) bit was not set", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAILURE_RBT", .udesc = "Number of spawn failures because route-back table (RBT) specified that the transaction should not trigger a direct2core transaction", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SUCCESS", .udesc = "Number of spawn successes", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_q_rxl_credits_consumed_vn0[]={ { .uname = "DRS", .udesc = "Number of times VN0 consumed for DRS message class", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM", .udesc = "Number of times VN0 consumed for HOM message class", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB", .udesc = "Number of times VN0 consumed for NCB message class", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS", .udesc = "Number of times VN0 consumed for NCS message class", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NDR", .udesc = "Number of times VN0 consumed for NDR message class", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP", .udesc = "Number of times VN0 consumed for SNP message class", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_q_rxl_flits_g0[]={ { .uname = "DATA", .udesc = "Number of data flits over QPI", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IDLE", .udesc = "Number of flits over QPI that do not hold protocol payload", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NON_DATA", .udesc = "Number of non-NULL non-data flits 
over QPI", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_q_rxl_flits_g1[]={ { .uname = "DRS", .udesc = "Number of flits over QPI on the Data Response (DRS) channel", .ucode = 0x1800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_DATA", .udesc = "Number of data flits over QPI on the Data Response (DRS) channel", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DRS_NONDATA", .udesc = "Number of protocol flits over QPI on the Data Response (DRS) channel", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM", .udesc = "Number of flits over QPI on the home channel", .ucode = 0x600, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM_NONREQ", .udesc = "Number of non-request flits over QPI on the home channel", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM_REQ", .udesc = "Number of data requests over QPI on the home channel", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP", .udesc = "Number of snoop requests flits over QPI", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_q_rxl_flits_g2[]={ { .uname = "NCB", .udesc = "Number of non-coherent bypass flits", .ucode = 0xc00, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB_DATA", .udesc = "Number of non-coherent data flits", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB_NONDATA", .udesc = "Number of bypass non-data flits", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS", .udesc = "Number of non-coherent standard (NCS) flits", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NDR_AD", .udesc = "Number of flits received over Non-data response (NDR) channel", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NDR_AK", .udesc = "Number of flits received on the Non-data response (NDR) channel)", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_snbep_unc_q_pe[]={ { .name = "UNC_Q_CLOCKTICKS", .desc = "Number 
of qfclks", .code = 0x14, .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_CTO_COUNT", .desc = "Count of CTO Events", .code = 0x38, .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_DIRECT2CORE", .desc = "Direct 2 Core Spawning", .code = 0x13, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_q_direct2core), .umasks = snbep_unc_q_direct2core }, { .name = "UNC_Q_L1_POWER_CYCLES", .desc = "Cycles in L1", .code = 0x12, .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL0P_POWER_CYCLES", .desc = "Cycles in L0p", .code = 0x10, .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL0_POWER_CYCLES", .desc = "Cycles in L0", .code = 0xf, .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_BYPASSED", .desc = "Rx Flit Buffer Bypassed", .code = 0x9, .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_CREDITS_CONSUMED_VN0", .desc = "VN0 Credit Consumed", .code = 0x1e | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_q_rxl_credits_consumed_vn0), .umasks = snbep_unc_q_rxl_credits_consumed_vn0 }, { .name = "UNC_Q_RXL_CREDITS_CONSUMED_VNA", .desc = "VNA Credit Consumed", .code = 0x1d | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_CYCLES_NE", .desc = "RxQ Cycles Not Empty", .code = 0xa, .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_FLITS_G0", .desc = "Flits Received - Group 0", .code = 0x1, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_q_rxl_flits_g0), .umasks = snbep_unc_q_rxl_flits_g0 }, { .name = "UNC_Q_RXL_FLITS_G1", .desc = "Flits Received - Group 1", .code = 0x2 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_q_rxl_flits_g1), .umasks = 
snbep_unc_q_rxl_flits_g1 }, { .name = "UNC_Q_RXL_FLITS_G2", .desc = "Flits Received - Group 2", .code = 0x3 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_q_rxl_flits_g2), .umasks = snbep_unc_q_rxl_flits_g2 }, { .name = "UNC_Q_RXL_INSERTS", .desc = "Rx Flit Buffer Allocations", .code = 0x8, .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_INSERTS_DRS", .desc = "Rx Flit Buffer Allocations - DRS", .code = 0x9 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_INSERTS_HOM", .desc = "Rx Flit Buffer Allocations - HOM", .code = 0xc | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_INSERTS_NCB", .desc = "Rx Flit Buffer Allocations - NCB", .code = 0xa | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_INSERTS_NCS", .desc = "Rx Flit Buffer Allocations - NCS", .code = 0xb | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_INSERTS_NDR", .desc = "Rx Flit Buffer Allocations - NDR", .code = 0xe | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_INSERTS_SNP", .desc = "Rx Flit Buffer Allocations - SNP", .code = 0xd | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_OCCUPANCY", .desc = "RxQ Occupancy - All Packets", .code = 0xb, .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_OCCUPANCY_DRS", .desc = "RxQ Occupancy - DRS", .code = 0x15 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_OCCUPANCY_HOM", .desc = "RxQ Occupancy - HOM", .code = 0x18 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_OCCUPANCY_NCB", .desc = "RxQ Occupancy - NCB", .code = 0x16 | (1ULL << 21), /* 
sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_OCCUPANCY_NCS", .desc = "RxQ Occupancy - NCS", .code = 0x17 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_OCCUPANCY_NDR", .desc = "RxQ Occupancy - NDR", .code = 0x1a | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_RXL_OCCUPANCY_SNP", .desc = "RxQ Occupancy - SNP", .code = 0x19 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_TXL0P_POWER_CYCLES", .desc = "Cycles in L0p", .code = 0xd, .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_TXL0_POWER_CYCLES", .desc = "Cycles in L0", .code = 0xc, .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_TXL_BYPASSED", .desc = "Tx Flit Buffer Bypassed", .code = 0x5, .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_TXL_CYCLES_NE", .desc = "Tx Flit Buffer Cycles not Empty", .code = 0x6, .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, }, { .name = "UNC_Q_TXL_FLITS_G0", .desc = "Flits Transferred - Group 0", .code = 0x0, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_q_rxl_flits_g0), .umasks = snbep_unc_q_rxl_flits_g0 /* shared with rxl_flits_g0 */ }, { .name = "UNC_Q_TXL_FLITS_G1", .desc = "Flits Transferred - Group 1", .code = 0x0 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_q_rxl_flits_g1), .umasks = snbep_unc_q_rxl_flits_g1 /* shared with rxl_flits_g1 */ }, { .name = "UNC_Q_TXL_FLITS_G2", .desc = "Flits Transferred - Group 2", .code = 0x1 | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_q_rxl_flits_g2), .umasks = snbep_unc_q_rxl_flits_g2 /* shared with rxl_flits_g2 */ }, { .name = "UNC_Q_TXL_INSERTS", .desc = "Tx Flit Buffer Allocations", .code = 0x4, 
.cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, },
{ .name = "UNC_Q_TXL_OCCUPANCY", .desc = "Tx Flit Buffer Occupancy", .code = 0x7, .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, },
{ .name = "UNC_Q_VNA_CREDIT_RETURNS", .desc = "VNA Credits Returned", .code = 0x1c | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, },
{ .name = "UNC_Q_VNA_CREDIT_RETURN_OCCUPANCY", .desc = "VNA Credits Pending Return - Occupancy", .code = 0x1b | (1ULL << 21), /* sel_ext */ .cntmsk = 0xf, .modmsk = SNBEP_UNC_QPI_ATTRS, },
};
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_snbep_unc_r2pcie_events.h
/*
 * Copyright (c) 2012 Google, Inc
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
 *
 * This file has been automatically generated.
* * PMU: snbep_unc_r2pcie (Intel SandyBridge-EP R2PCIe uncore) */ static const intel_x86_umask_t snbep_unc_r2_ring_ad_used[]={ { .uname = "CCW_EVEN", .udesc = "Counter-clockwise and even ring polarity", .ucode = 0x400, }, { .uname = "CCW_ODD", .udesc = "Counter-clockwise and odd ring polarity", .ucode = 0x800, }, { .uname = "CW_EVEN", .udesc = "Clockwise and even ring polarity", .ucode = 0x100, }, { .uname = "CW_ODD", .udesc = "Clockwise and odd ring polarity", .ucode = 0x200, }, { .uname = "CW_ANY", .udesc = "Clockwise with any polarity", .ucode = 0x300, }, { .uname = "CCW_ANY", .udesc = "Counter-clockwise with any polarity", .ucode = 0xc00, }, { .uname = "ANY", .udesc = "any direction and any polarity", .ucode = 0xf00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t snbep_unc_r2_ring_iv_used[]={ { .uname = "ANY", .udesc = "R2 IV Ring in Use", .ucode = 0xf00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t snbep_unc_r2_rxr_cycles_ne[]={ { .uname = "DRS", .udesc = "DRS Ingress queue", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB", .udesc = "NCB Ingress queue", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS", .udesc = "NCS Ingress queue", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_r2_txr_cycles_full[]={ { .uname = "AD", .udesc = "AD Egress queue", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK", .udesc = "AK Egress queue", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL", .udesc = "BL Egress queue", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_snbep_unc_r2_pe[]={ { .name = "UNC_R2_CLOCKTICKS", .desc = "Number of uclks in domain", .code = 0x1, .cntmsk = 0xf, .modmsk = SNBEP_UNC_R2PCIE_ATTRS, }, { .name = "UNC_R2_RING_AD_USED", .desc = "R2 AD Ring in Use", .code = 0x7, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_R2PCIE_ATTRS, .numasks = 
LIBPFM_ARRAY_SIZE(snbep_unc_r2_ring_ad_used), .umasks = snbep_unc_r2_ring_ad_used }, { .name = "UNC_R2_RING_AK_USED", .desc = "R2 AK Ring in Use", .code = 0x8, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r2_ring_ad_used), .umasks = snbep_unc_r2_ring_ad_used /* shared */ }, { .name = "UNC_R2_RING_BL_USED", .desc = "R2 BL Ring in Use", .code = 0x9, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r2_ring_ad_used), .umasks = snbep_unc_r2_ring_ad_used /* shared */ }, { .name = "UNC_R2_RING_IV_USED", .desc = "R2 IV Ring in Use", .code = 0xa, .cntmsk = 0xf, .ngrp = 1, .modmsk = SNBEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r2_ring_iv_used), .umasks = snbep_unc_r2_ring_iv_used }, { .name = "UNC_R2_RXR_AK_BOUNCES", .desc = "AK Ingress Bounced", .code = 0x12, .cntmsk = 0x1, .modmsk = SNBEP_UNC_R2PCIE_ATTRS, }, { .name = "UNC_R2_RXR_CYCLES_NE", .desc = "Ingress Cycles Not Empty", .code = 0x10, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r2_rxr_cycles_ne), .umasks = snbep_unc_r2_rxr_cycles_ne }, { .name = "UNC_R2_TXR_CYCLES_FULL", .desc = "Egress Cycles Full", .code = 0x25, .cntmsk = 0x1, .ngrp = 1, .modmsk = SNBEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r2_txr_cycles_full), .umasks = snbep_unc_r2_txr_cycles_full }, { .name = "UNC_R2_TXR_CYCLES_NE", .desc = "Egress Cycles Not Empty", .code = 0x23, .cntmsk = 0x1, .ngrp = 1, .modmsk = SNBEP_UNC_R2PCIE_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r2_txr_cycles_full), .umasks = snbep_unc_r2_txr_cycles_full /* shared */ }, { .name = "UNC_R2_TXR_INSERTS", .desc = "Egress allocations", .code = 0x24, .cntmsk = 0x1, .modmsk = SNBEP_UNC_R2PCIE_ATTRS, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_snbep_unc_r3qpi_events.h000066400000000000000000000223271502707512200260660ustar00rootroot00000000000000/* * Copyright (c) 2012 Google, Inc * 
Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. 
* * PMU: snbep_unc_r3qpi (Intel SandyBridge-EP R3QPI uncore) */ static const intel_x86_umask_t snbep_unc_r3_iio_credits_acquired[]={ { .uname = "DRS", .udesc = "DRS", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB", .udesc = "NCB", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS", .udesc = "NCS", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_r3_ring_ad_used[]={ { .uname = "CCW_EVEN", .udesc = "Counter-Clockwise and even ring polarity", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CCW_ODD", .udesc = "Counter-Clockwise and odd ring polarity", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CW_EVEN", .udesc = "Clockwise and even ring polarity", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CW_ODD", .udesc = "Clockwise and odd ring polarity", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_r3_ring_iv_used[]={ { .uname = "ANY", .udesc = "Any polarity", .ucode = 0xf00, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t snbep_unc_r3_rxr_bypassed[]={ { .uname = "AD", .udesc = "Ingress Bypassed", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t snbep_unc_r3_rxr_cycles_ne[]={ { .uname = "DRS", .udesc = "DRS Ingress queue", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM", .udesc = "HOM Ingress queue", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB", .udesc = "NCB Ingress queue", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS", .udesc = "NCS Ingress queue", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NDR", .udesc = "NDR Ingress queue", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP", .udesc = "SNP Ingress queue", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t snbep_unc_r3_vn0_credits_reject[]={ { .uname = "DRS", .udesc = "Filter 
DRS message class", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HOM", .udesc = "Filter HOM message class", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCB", .udesc = "Filter NCB message class", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NCS", .udesc = "Filter NCS message class", .ucode = 0x2000, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NDR", .udesc = "Filter NDR message class", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP", .udesc = "Filter SNP message class", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_snbep_unc_r3_pe[]={ { .name = "UNC_R3_CLOCKTICKS", .desc = "Number of uclks in domain", .code = 0x1, .cntmsk = 0x7, .modmsk = SNBEP_UNC_R3QPI_ATTRS, }, { .name = "UNC_R3_IIO_CREDITS_ACQUIRED", .desc = "to IIO BL Credit Acquired", .code = 0x20, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r3_iio_credits_acquired), .umasks = snbep_unc_r3_iio_credits_acquired }, { .name = "UNC_R3_IIO_CREDITS_REJECT", .desc = "to IIO BL Credit Rejected", .code = 0x21, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r3_iio_credits_acquired), .umasks = snbep_unc_r3_iio_credits_acquired /* shared */ }, { .name = "UNC_R3_IIO_CREDITS_USED", .desc = "to IIO BL Credit In Use", .code = 0x22, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r3_iio_credits_acquired), .umasks = snbep_unc_r3_iio_credits_acquired /* shared */ }, { .name = "UNC_R3_RING_AD_USED", .desc = "R3 AD Ring in Use", .code = 0x7, .cntmsk = 0x7, .ngrp = 1, .modmsk = SNBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r3_ring_ad_used), .umasks = snbep_unc_r3_ring_ad_used }, { .name = "UNC_R3_RING_AK_USED", .desc = "R3 AK Ring in Use", .code = 0x8, .cntmsk = 0x7, .ngrp = 1, .modmsk = SNBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r3_ring_ad_used), 
.umasks = snbep_unc_r3_ring_ad_used /* shared */ }, { .name = "UNC_R3_RING_BL_USED", .desc = "R3 BL Ring in Use", .code = 0x9, .cntmsk = 0x7, .ngrp = 1, .modmsk = SNBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r3_ring_ad_used), .umasks = snbep_unc_r3_ring_ad_used /* shared */ }, { .name = "UNC_R3_RING_IV_USED", .desc = "R3 IV Ring in Use", .code = 0xa, .cntmsk = 0x7, .ngrp = 1, .modmsk = SNBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r3_ring_iv_used), .umasks = snbep_unc_r3_ring_iv_used }, { .name = "UNC_R3_RXR_BYPASSED", .desc = "Ingress Bypassed", .code = 0x12, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r3_rxr_bypassed), .umasks = snbep_unc_r3_rxr_bypassed }, { .name = "UNC_R3_RXR_CYCLES_NE", .desc = "Ingress Cycles Not Empty", .code = 0x10, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r3_rxr_cycles_ne), .umasks = snbep_unc_r3_rxr_cycles_ne }, { .name = "UNC_R3_RXR_INSERTS", .desc = "Ingress Allocations", .code = 0x11, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r3_rxr_cycles_ne), .umasks = snbep_unc_r3_rxr_cycles_ne /* shared */ }, { .name = "UNC_R3_RXR_OCCUPANCY", .desc = "Ingress Occupancy Accumulator", .code = 0x13, .cntmsk = 0x1, .ngrp = 1, .modmsk = SNBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r3_rxr_cycles_ne), .umasks = snbep_unc_r3_rxr_cycles_ne /* shared */ }, { .name = "UNC_R3_TXR_CYCLES_FULL", .desc = "Egress cycles full", .code = 0x25, .cntmsk = 0x3, .modmsk = SNBEP_UNC_R3QPI_ATTRS, }, { .name = "UNC_R3_TXR_INSERTS", .desc = "Egress allocations", .code = 0x24, .cntmsk = 0x3, .modmsk = SNBEP_UNC_R3QPI_ATTRS, }, { .name = "UNC_R3_TXR_NACK", .desc = "Egress Nack", .code = 0x26, .cntmsk = 0x3, .modmsk = SNBEP_UNC_R3QPI_ATTRS, }, { .name = "UNC_R3_VN0_CREDITS_REJECT", .desc = "VN0 Credit Acquisition Failed on DRS", .code = 0x37, .cntmsk = 0x3, .ngrp 
= 1, .modmsk = SNBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r3_vn0_credits_reject), .umasks = snbep_unc_r3_vn0_credits_reject }, { .name = "UNC_R3_VN0_CREDITS_USED", .desc = "VN0 Credit Used", .code = 0x36, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r3_vn0_credits_reject), .umasks = snbep_unc_r3_vn0_credits_reject /* shared */ }, { .name = "UNC_R3_VNA_CREDITS_ACQUIRED", .desc = "VNA credit Acquisitions", .code = 0x33, .cntmsk = 0x3, .modmsk = SNBEP_UNC_R3QPI_ATTRS, }, { .name = "UNC_R3_VNA_CREDITS_REJECT", .desc = "VNA Credit Reject", .code = 0x34, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_R3QPI_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_r3_vn0_credits_reject), .umasks = snbep_unc_r3_vn0_credits_reject /* shared */ }, { .name = "UNC_R3_VNA_CREDIT_CYCLES_OUT", .desc = "Cycles with no VNA credits available", .code = 0x31, .cntmsk = 0x3, .modmsk = SNBEP_UNC_R3QPI_ATTRS, }, { .name = "UNC_R3_VNA_CREDIT_CYCLES_USED", .desc = "Cycles with 1 or more VNA credits in use", .code = 0x32, .cntmsk = 0x3, .modmsk = SNBEP_UNC_R3QPI_ATTRS, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_snbep_unc_ubo_events.h000066400000000000000000000045161502707512200256150ustar00rootroot00000000000000/* * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
 *
 * PMU: snbep_unc_ubo (Intel SandyBridge-EP U-Box uncore PMU)
 */
static const intel_x86_umask_t snbep_unc_u_event_msg[]={
{ .uname = "DOORBELL_RCVD", .udesc = "TBD", .ucode = 0x800, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "INT_PRIO", .udesc = "TBD", .ucode = 0x1000, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "IPI_RCVD", .udesc = "TBD", .ucode = 0x400, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "MSI_RCVD", .udesc = "TBD", .ucode = 0x200, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "VLW_RCVD", .udesc = "TBD", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO, },
};
static const intel_x86_entry_t intel_snbep_unc_u_pe[]={
{ .name = "UNC_U_EVENT_MSG", .desc = "VLW Received", .code = 0x42, .cntmsk = 0x3, .ngrp = 1, .modmsk = SNBEP_UNC_UBO_ATTRS, .numasks = LIBPFM_ARRAY_SIZE(snbep_unc_u_event_msg), .umasks = snbep_unc_u_event_msg },
{ .name = "UNC_U_LOCK_CYCLES", .desc = "IDI Lock/SplitLock Cycles", .code = 0x44, .cntmsk = 0x3, .modmsk = SNBEP_UNC_UBO_ATTRS, },
};
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_spr_events.h
/*
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
 *
 *
 * PMU: intel_spr (Intel SapphireRapids)
 * Based on Intel JSON event table version : 1.24
 * Based on Intel JSON event table published : 07/18/2024
 */
static const intel_x86_umask_t intel_spr_ocr[]={
{ .uname = "DEMAND_CODE_RD_ANY_RESPONSE", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that have any type of response.", .ucode = 0x1000400ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "DEMAND_CODE_RD_DRAM", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that were supplied by DRAM.", .ucode = 0x73c00000400ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "DEMAND_CODE_RD_L3_HIT", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that hit in the L3 or were snooped from another core's caches on the same socket.", .ucode = 0x3f803c000400ull, .uflags = INTEL_X86_NCOMBO, },
{ .uname = "DEMAND_CODE_RD_L3_HIT_SNOOP_HITM", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that resulted in a snoop hit a modified line in another core's caches which forwarded the data.", .ucode =
0x10003c000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_L3_MISS", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that were not supplied by the local socket's L1, L2, or L3 caches.", .ucode = 0x3fbfc0000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_LOCAL_DRAM", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that were supplied by DRAM attached to this socket, unless in Sub NUMA Cluster(SNC) Mode. In SNC Mode counts only those DRAM accesses that are controlled by the close SNC Cluster.", .ucode = 0x10400000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_SNC_CACHE_HITM", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that hit a modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x100800000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_SNC_CACHE_HIT_WITH_FWD", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that either hit a non-modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x80800000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD_SNC_DRAM", .udesc = "Counts demand instruction fetches and L1 instruction cache prefetches that were supplied by DRAM on a distant memory controller of this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x70800000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_ANY_RESPONSE", .udesc = "Counts demand data reads that have any type of response.", .ucode = 0x1000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_DRAM", .udesc = "Counts demand data reads that were supplied by DRAM.", .ucode = 0x73c00000100ull, .uflags = INTEL_X86_NCOMBO, }, { 
.uname = "DEMAND_DATA_RD_L3_HIT", .udesc = "Counts demand data reads that hit in the L3 or were snooped from another core's caches on the same socket.", .ucode = 0x3f803c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_HITM", .udesc = "Counts demand data reads that resulted in a snoop hit a modified line in another core's caches which forwarded the data.", .ucode = 0x10003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_HIT_NO_FWD", .udesc = "Counts demand data reads that resulted in a snoop that hit in another core, which did not forward the data.", .ucode = 0x4003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_HIT_SNOOP_HIT_WITH_FWD", .udesc = "Counts demand data reads that resulted in a snoop hit in another core's caches which forwarded the unmodified data to the requesting core.", .ucode = 0x8003c000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_MISS", .udesc = "Counts demand data reads that were not supplied by the local socket's L1, L2, or L3 caches.", .ucode = 0x3fbfc0000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_LOCAL_DRAM", .udesc = "Counts demand data reads that were supplied by DRAM attached to this socket, unless in Sub NUMA Cluster(SNC) Mode. In SNC Mode counts only those DRAM accesses that are controlled by the close SNC Cluster.", .ucode = 0x10400000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_LOCAL_SOCKET_PMM", .udesc = "Counts demand data reads that were supplied by PMM attached to this socket, whether or not in Sub NUMA Cluster(SNC) Mode. 
In SNC Mode counts PMM accesses that are controlled by the close or distant SNC Cluster.", .ucode = 0x700c0000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_PMM", .udesc = "Counts demand data reads that were supplied by PMM.", .ucode = 0x703c0000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_REMOTE_CACHE_SNOOP_HITM", .udesc = "Counts demand data reads that were supplied by a cache on a remote socket where a snoop hit a modified line in another core's caches which forwarded the data.", .ucode = 0x103000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_REMOTE_CACHE_SNOOP_HIT_WITH_FWD", .udesc = "Counts demand data reads that were supplied by a cache on a remote socket where a snoop hit in another core's caches which forwarded the unmodified data to the requesting core.", .ucode = 0x83000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_REMOTE_DRAM", .udesc = "Counts demand data reads that were supplied by DRAM attached to another socket.", .ucode = 0x73000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_REMOTE_PMM", .udesc = "Counts demand data reads that were supplied by PMM attached to another socket.", .ucode = 0x70300000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_SNC_CACHE_HITM", .udesc = "Counts demand data reads that hit a modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x100800000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_SNC_CACHE_HIT_WITH_FWD", .udesc = "Counts demand data reads that either hit a non-modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x80800000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_SNC_DRAM", .udesc = "Counts demand data reads that were supplied by 
DRAM on a distant memory controller of this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x70800000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_ANY_RESPONSE", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that have any type of response.", .ucode = 0x3f3ffc000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_DRAM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were supplied by DRAM.", .ucode = 0x73c00000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_HIT", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit in the L3 or were snooped from another core's caches on the same socket.", .ucode = 0x3f803c000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_HIT_SNOOP_HITM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that resulted in a snoop hit a modified line in another core's caches which forwarded the data.", .ucode = 0x10003c000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_L3_MISS", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were not supplied by the local socket's L1, L2, or L3 caches.", .ucode = 0x3f3fc0000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_LOCAL_DRAM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were supplied by DRAM attached to this socket, unless in Sub NUMA Cluster(SNC) Mode. 
In SNC Mode counts only those DRAM accesses that are controlled by the close SNC Cluster.", .ucode = 0x10400000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_SNC_CACHE_HITM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x100800000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_SNC_CACHE_HIT_WITH_FWD", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that either hit a non-modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x80800000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_SNC_DRAM", .udesc = "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were supplied by DRAM on a distant memory controller of this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x70800000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L1D_ANY_RESPONSE", .udesc = "Counts data load hardware prefetch requests to the L1 data cache that have any type of response.", .ucode = 0x1040000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L2_ANY_RESPONSE", .udesc = "Counts hardware prefetches (which bring data to L2) that have any type of response.", .ucode = 0x1007000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L3_ANY_RESPONSE", .udesc = "Counts hardware prefetches to the L3 only that have any type of response.", .ucode = 0x1238000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L3_L3_HIT", .udesc = "Counts hardware prefetches to the L3 only that hit in the L3 or were snooped from another core's caches on the same socket.", 
.ucode = 0x8008238000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L3_L3_MISS", .udesc = "Counts hardware prefetches to the L3 only that missed the local socket's L1, L2, and L3 caches.", .ucode = 0x9400238000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L3_L3_MISS_LOCAL", .udesc = "Counts hardware prefetches to the L3 only that were not supplied by the local socket's L1, L2, or L3 caches and the cacheline is homed locally.", .ucode = 0x8400238000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HWPF_L3_REMOTE", .udesc = "Counts hardware prefetches to the L3 only that were not supplied by the local socket's L1, L2, or L3 caches and the cacheline was homed in a remote socket.", .ucode = 0x9000238000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MODIFIED_WRITE_ANY_RESPONSE", .udesc = "Counts writebacks of modified cachelines and streaming stores that have any type of response.", .ucode = 0x1080800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_ANY_RESPONSE", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that have any type of response.", .ucode = 0x3f3ffc447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_DRAM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM.", .ucode = 0x73c00447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_HIT", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that hit in the L3 or were snooped from another core's caches on the same socket.", .ucode = 0x3f003c447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_HIT_SNOOP_HITM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that resulted in a snoop 
hit a modified line in another core's caches which forwarded the data.", .ucode = 0x10003c447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_HIT_SNOOP_HIT_NO_FWD", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that resulted in a snoop that hit in another core, which did not forward the data.", .ucode = 0x4003c447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_HIT_SNOOP_HIT_WITH_FWD", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that resulted in a snoop hit in another core's caches which forwarded the unmodified data to the requesting core.", .ucode = 0x8003c447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_MISS", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were not supplied by the local socket's L1, L2, or L3 caches.", .ucode = 0x3f3fc0447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_MISS_LOCAL", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were not supplied by the local socket's L1, L2, or L3 caches and the cacheline is homed locally.", .ucode = 0x3f04c0447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_L3_MISS_LOCAL_SOCKET", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that missed the L3 Cache and were supplied by the local socket (DRAM or PMM), whether or not in Sub NUMA Cluster(SNC) Mode. In SNC Mode counts PMM or DRAM accesses that are controlled by the close or distant SNC Cluster. 
It does not count misses to the L3 which go to Local CXL Type 2 Memory or Local Non DRAM.", .ucode = 0x70cc0447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_LOCAL_DRAM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM attached to this socket, unless in Sub NUMA Cluster(SNC) Mode. In SNC Mode counts only those DRAM accesses that are controlled by the close SNC Cluster.", .ucode = 0x10400447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_LOCAL_SOCKET_DRAM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM attached to this socket, whether or not in Sub NUMA Cluster(SNC) Mode. In SNC Mode counts DRAM accesses that are controlled by the close or distant SNC Cluster.", .ucode = 0x70c00447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_LOCAL_SOCKET_PMM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by PMM attached to this socket, whether or not in Sub NUMA Cluster(SNC) Mode. 
In SNC Mode counts PMM accesses that are controlled by the close or distant SNC Cluster.", .ucode = 0x700c0447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were not supplied by the local socket's L1, L2, or L3 caches and were supplied by a remote socket.", .ucode = 0x3f3300447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_CACHE_SNOOP_FWD", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by a cache on a remote socket where a snoop was sent and data was returned (Modified or Not Modified).", .ucode = 0x183000447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_CACHE_SNOOP_HITM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by a cache on a remote socket where a snoop hit a modified line in another core's caches which forwarded the data.", .ucode = 0x103000447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_CACHE_SNOOP_HIT_WITH_FWD", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by a cache on a remote socket where a snoop hit in another core's caches which forwarded the unmodified data to the requesting core.", .ucode = 0x83000447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_DRAM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM attached to another socket.", .ucode = 0x73000447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_MEMORY", .udesc = "Counts all 
(cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM or PMM attached to another socket.", .ucode = 0x73300447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_REMOTE_PMM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by PMM attached to another socket.", .ucode = 0x70300447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_SNC_CACHE_HITM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that hit a modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x100800447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_SNC_CACHE_HIT_WITH_FWD", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that either hit a non-modified line in a distant L3 Cache or were snooped from a distant core's L1/L2 caches on this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x80800447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_TO_CORE_SNC_DRAM", .udesc = "Counts all (cacheable) data read, code read and RFO requests including demands and prefetches to the core caches (L1 or L2) that were supplied by DRAM on a distant memory controller of this socket when the system is in SNC (sub-NUMA cluster) mode.", .ucode = 0x70800447700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_TO_CORE_L3_HIT_M", .udesc = "Counts demand reads for ownership (RFO), hardware prefetch RFOs (which bring data to L2), and software prefetches for exclusive ownership (PREFETCHW) that hit to a (M)odified cacheline in the L3 or snoop filter.", .ucode = 0x1f8004002200ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "STREAMING_WR_ANY_RESPONSE", .udesc = "Counts streaming stores that have any type of response.", .ucode = 0x1080000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STREAMING_WR_L3_HIT", .udesc = "Counts streaming stores that hit in the L3 or were snooped from another core's caches on the same socket.", .ucode = 0x8008080000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STREAMING_WR_L3_MISS", .udesc = "Counts streaming stores that missed the local socket's L1, L2, and L3 caches.", .ucode = 0x9400080000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STREAMING_WR_L3_MISS_LOCAL", .udesc = "Counts streaming stores that were not supplied by the local socket's L1, L2, or L3 caches and the cacheline is homed locally.", .ucode = 0x8400080000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITE_ESTIMATE_MEMORY", .udesc = "Counts Demand RFOs, ItoM's, PREFETCHW's, Hardware RFO Prefetches to the L1/L2 and Streaming stores that likely resulted in a store to Memory (DRAM or PMM)", .ucode = 0xfbff8082200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_int_vec_retired[]={ { .uname = "128BIT", .udesc = "TBD", .ucode = 0x1300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "256BIT", .udesc = "TBD", .ucode = 0xac00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ADD_128", .udesc = "integer ADD, SUB, SAD 128-bit vector instructions.", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ADD_256", .udesc = "integer ADD, SUB, SAD 256-bit vector instructions.", .ucode = 0x0c00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MUL_256", .udesc = "TBD", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SHUFFLES", .udesc = "TBD", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VNNI_128", .udesc = "TBD", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VNNI_256", .udesc = "TBD", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t 
intel_spr_mem_uop_retired[]={ { .uname = "ANY", .udesc = "Retired memory uops for any access", .ucode = 0x0300ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_spr_misc2_retired[]={ { .uname = "LFENCE", .udesc = "LFENCE instructions retired", .ucode = 0x2000ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_spr_mem_load_misc_retired[]={ { .uname = "UC", .udesc = "Retired instructions with at least 1 uncacheable load or lock.", .ucode = 0x0400ull, .uflags = INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_spr_mem_load_l3_miss_retired[]={ { .uname = "LOCAL_DRAM", .udesc = "Retired load instructions which data sources missed L3 but serviced from local dram", .ucode = 0x0100ull, .uflags = INTEL_X86_PEBS | INTEL_X86_NCOMBO, }, { .uname = "REMOTE_DRAM", .udesc = "TBD", .ucode = 0x0200ull, .uflags = INTEL_X86_PEBS | INTEL_X86_NCOMBO, }, { .uname = "REMOTE_FWD", .udesc = "Retired load instructions whose data sources was forwarded from a remote cache", .ucode = 0x0800ull, .uflags = INTEL_X86_PEBS | INTEL_X86_NCOMBO, }, { .uname = "REMOTE_HITM", .udesc = "TBD", .ucode = 0x0400ull, .uflags = INTEL_X86_PEBS | INTEL_X86_NCOMBO, }, { .uname = "REMOTE_PMM", .udesc = "Retired load instructions with remote Intel Optane DC persistent memory as the data source where the data request missed all caches.", .ucode = 0x1000ull, .uflags = INTEL_X86_PEBS | INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_mem_load_l3_hit_retired[]={ { .uname = "XSNP_FWD", .udesc = "Retired load instructions whose data sources were HitM responses from shared L3", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_MISS", .udesc = "Retired load instructions whose data sources were L3 hit and cross-core snoop missed in on-pkg core cache.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_NONE", .udesc = "Retired load instructions whose data sources were hits 
in L3 without snoops required", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "XSNP_NO_FWD", .udesc = "Retired load instructions whose data sources were L3 and cross-core snoop hits in on-pkg core cache", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_spr_mem_load_retired[]={ { .uname = "FB_HIT", .udesc = "Number of completed demand load requests that missed the L1, but hit the FB(fill buffer), because a preceding miss to the same cacheline initiated the line to be brought into L1, but data is not yet ready in L1.", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1_HIT", .udesc = "Retired load instructions with L1 cache hits as data sources", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1_MISS", .udesc = "Retired load instructions missed L1 cache as data sources", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_HIT", .udesc = "Retired load instructions with L2 cache hits as data sources", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_MISS", .udesc = "Retired load instructions missed L2 cache as data sources", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_HIT", .udesc = "Retired load instructions with L3 cache hits as data sources", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_MISS", .udesc = "Retired load instructions missed L3 cache as data sources", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LOCAL_PMM", .udesc = "Retired load instructions with local Intel Optane DC persistent memory as the data source where the data request missed all caches.", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_mem_inst_retired[]={ { .uname = "ALL_LOADS", .udesc = "All retired load 
instructions.", .ucode = 0x8100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_STORES", .udesc = "All retired store instructions.", .ucode = 0x8200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY", .udesc = "All retired memory instructions.", .ucode = 0x8300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "LOCK_LOADS", .udesc = "Retired load instructions with locked access.", .ucode = 0x2100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_LOADS", .udesc = "Retired load instructions that split across a cacheline boundary.", .ucode = 0x4100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SPLIT_STORES", .udesc = "Retired store instructions that split across a cacheline boundary.", .ucode = 0x4200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS_LOADS", .udesc = "Retired load instructions that miss the STLB.", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS_STORES", .udesc = "Retired store instructions that miss the STLB.", .ucode = 0x1200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_spr_fp_arith_inst_retired2[]={ { .uname = "128B_PACKED_HALF", .udesc = "TBD", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "256B_PACKED_HALF", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "512B_PACKED_HALF", .udesc = "TBD", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "COMPLEX_SCALAR_HALF", .udesc = "TBD", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR", .udesc = "Number of all Scalar Half-Precision FP arithmetic instructions(1) retired - regular and complex.", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR_HALF", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VECTOR", .udesc = "Number of all Vector (also called packed) 
Half-Precision FP arithmetic instructions(1) retired.", .ucode = 0x1c00ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_mem_trans_retired[]={ { .uname = "LOAD_LATENCY", .udesc = "Memory load instructions retired above programmed clocks, minimum threshold value is 3 (Precise Event and ldlat required)", .ucode = 0x100, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT, }, { .uname = "STORE_SAMPLE", .udesc = "Retired instructions with at least 1 store uop. This PEBS event is the trigger for stores sampled by the PEBS Store Facility.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_spr_misc_retired[]={ { .uname = "LBR_INSERTS", .udesc = "Increments whenever there is an update to the LBR array.", .ucode = 0x2000ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_spr_rtm_retired[]={ { .uname = "ABORTED", .udesc = "Number of times an RTM execution aborted.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ABORTED_EVENTS", .udesc = "Number of times an RTM execution aborted due to none of the previous 3 categories (e.g. interrupt)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MEM", .udesc = "Number of times an RTM execution aborted due to various memory events (e.g. 
read/write capacity and conflicts)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_MEMTYPE", .udesc = "Number of times an RTM execution aborted due to incompatible memory type", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORTED_UNFRIENDLY", .udesc = "Number of times an RTM execution aborted due to HLE-unfriendly instructions", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "COMMIT", .udesc = "Number of times an RTM execution successfully committed", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "START", .udesc = "Number of times an RTM execution started.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_fp_arith_inst_retired[]={ { .uname = "128B_PACKED_DOUBLE", .udesc = "Counts number of SSE/AVX computational 128-bit packed double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 2 computation operations, one for each element. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "128B_PACKED_SINGLE", .udesc = "Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 4 computation operations, one for each element. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP14 RSQRT14 SQRT DPP FM(N)ADD/SUB. 
DPP and FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "256B_PACKED_DOUBLE", .udesc = "Counts number of SSE/AVX computational 256-bit packed double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 4 computation operations, one for each element. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "256B_PACKED_SINGLE", .udesc = "Counts number of SSE/AVX computational 256-bit packed single precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 8 computation operations, one for each element. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT RSQRT RCP DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "4_FLOPS", .udesc = "Number of SSE/AVX computational 128-bit packed single and 256-bit packed double precision FP instructions retired; some instructions will count twice as noted below. Each count represents 2 or/and 4 computation operations, 1 for each element. Applies to SSE* and AVX* packed single precision and packed double precision FP instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX RCP14 RSQRT14 SQRT DPP FM(N)ADD/SUB. 
DPP and FM(N)ADD/SUB count twice as they perform 2 calculations per element.", .ucode = 0x1800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "512B_PACKED_DOUBLE", .udesc = "Counts number of SSE/AVX computational 512-bit packed double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 8 computation operations, one for each element. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT RSQRT14 RCP14 FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "512B_PACKED_SINGLE", .udesc = "Counts number of SSE/AVX computational 512-bit packed single precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 16 computation operations, one for each element. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT RSQRT14 RCP14 FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "8_FLOPS", .udesc = "Number of SSE/AVX computational 256-bit packed single precision and 512-bit packed double precision FP instructions retired; some instructions will count twice as noted below. Each count represents 8 computation operations, 1 for each element. Applies to SSE* and AVX* packed single precision and double precision FP instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT RSQRT RSQRT14 RCP RCP14 DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB count twice as they perform 2 calculations per element.", .ucode = 0x6000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR", .udesc = "Number of SSE/AVX computational scalar floating-point instructions retired; some instructions will count twice as noted below. 
Applies to SSE* and AVX* scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RCP14 RSQRT14 RANGE SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR_DOUBLE", .udesc = "Counts number of SSE/AVX computational scalar double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 1 computational operation. Applies to SSE* and AVX* scalar double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCALAR_SINGLE", .udesc = "Counts number of SSE/AVX computational scalar single precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 1 computational operation. Applies to SSE* and AVX* scalar single precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT RSQRT RCP FM(N)ADD/SUB. 
FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VECTOR", .udesc = "Number of any Vector retired FP arithmetic instructions", .ucode = 0xfc00ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_frontend_retired[]={ { .uname = "ANY_DSB_MISS", .udesc = "Retired instructions that experienced a DSB miss.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "DSB_MISS", .udesc = "Retired instructions that experienced a critical DSB miss.", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ITLB_MISS", .udesc = "Retired instructions that experienced an iTLB true miss.", .ucode = 0x1400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1I_MISS", .udesc = "Retired instructions that experienced an instruction L1 cache true miss.", .ucode = 0x1200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_MISS", .udesc = "Retired instructions that experienced an instruction L2 cache true miss.", .ucode = 0x1300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LATENCY_GE_1", .udesc = "Retired instructions after front-end starvation of at least 1 cycle", .ucode = 0x60010600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_128", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 128 cycles which was not interrupted by a back-end stall.", .ucode = 0x60800600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_16", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 16 cycles which was not interrupted by a back-end stall.", .ucode = 0x60100600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | 
INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_2", .udesc = "Retired instructions after front-end starvation of at least 2 cycles", .ucode = 0x60020600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_256", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 256 cycles which was not interrupted by a back-end stall.", .ucode = 0x61000600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_2_BUBBLES_GE_1", .udesc = "Retired instructions that are fetched after an interval where the front-end had at least 1 bubble-slot for a period of 2 cycles which was not interrupted by a back-end stall.", .ucode = 0x10020600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_32", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 32 cycles which was not interrupted by a back-end stall.", .ucode = 0x60200600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_4", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 4 cycles which was not interrupted by a back-end stall.", .ucode = 0x60040600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_512", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 512 cycles which was not interrupted by a back-end stall.", .ucode = 0x62000600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_64", .udesc = "Retired 
instructions that are fetched after an interval where the front-end delivered no uops for a period of 64 cycles which was not interrupted by a back-end stall.", .ucode = 0x60400600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "LATENCY_GE_8", .udesc = "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 8 cycles which was not interrupted by a back-end stall.", .ucode = 0x60080600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_FETHR | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_FETHR, }, { .uname = "MS_FLOWS", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STLB_MISS", .udesc = "Retired instructions that experienced an STLB (2nd level TLB) true miss.", .ucode = 0x1500ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "UNKNOWN_BRANCH", .udesc = "TBD", .ucode = 0x1700ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_spr_br_misp_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "All mispredicted branch instructions retired.", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "COND", .udesc = "Mispredicted conditional branch instructions retired.", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_NTAKEN", .udesc = "Mispredicted non-taken conditional branch instructions retired.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_TAKEN", .udesc = "Number of branch instructions retired that were mispredicted and taken.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "INDIRECT", .udesc = "Mispredicted near indirect branch instructions retired (excluding returns)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "INDIRECT_CALL", .udesc = "Mispredicted indirect CALL retired.", .ucode 
= 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_TAKEN", .udesc = "Number of near branch instructions retired that were mispredicted and taken.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RET", .udesc = "This event counts the number of mispredicted ret instructions retired. Non PEBS", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_spr_br_inst_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "All branch instructions retired.", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "COND", .udesc = "Conditional branch instructions retired.", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_NTAKEN", .udesc = "Not taken branch instructions retired.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "COND_TAKEN", .udesc = "Taken conditional branch instructions retired.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "FAR_BRANCH", .udesc = "Far branch instructions retired.", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "INDIRECT", .udesc = "Indirect near branch instructions retired (excluding returns)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_CALL", .udesc = "Direct and indirect near call instructions retired.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_RETURN", .udesc = "Return instructions retired.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_TAKEN", .udesc = "Taken branch instructions retired.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_spr_machine_clears[]={ { .uname = "COUNT", .udesc = "Number of machine clears (nukes) of any type.", .ucode = 0x0100ull | (0x1 << 
INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "MEMORY_ORDERING", .udesc = "Number of machine clears due to memory ordering conflicts.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SMC", .udesc = "Self-modifying code (SMC) detected.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_uops_retired[]={ { .uname = "CYCLES", .udesc = "Cycles with retired uop(s).", .ucode = 0x0200ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "HEAVY", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOTS", .udesc = "Retirement slots used.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALLS", .udesc = "Cycles without actually retired uops.", .ucode = 0x0200ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "STALL_CYCLES", .uequiv = "STALLS", .udesc = "Cycles without actually retired uops.", .ucode = 0x0200ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_spr_assists[]={ { .uname = "ANY", .udesc = "Number of occurrences where a microcode assist is invoked by hardware.", .ucode = 0x1b00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FP", .udesc = "Counts all microcode FP assists.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PAGE_FAULT", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SSE_AVX_MIX", .udesc = "TBD", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_exe[]={ { .uname = "AMX_BUSY", .udesc = "Counts the 
cycles where the AMX (Advanced Matrix Extensions) unit is busy performing an operation.", .ucode = 0x0200ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_spr_fp_arith_dispatched[]={ { .uname = "PORT_0", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_1", .udesc = "TBD", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_5", .udesc = "TBD", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_uops_dispatched[]={ { .uname = "PORT_0", .udesc = "Uops executed on port 0", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_1", .udesc = "Uops executed on port 1", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_2_3_10", .udesc = "Uops executed on ports 2, 3 and 10", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_4_9", .udesc = "Uops executed on ports 4 and 9", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_5_11", .udesc = "Uops executed on ports 5 and 11", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_6", .udesc = "Uops executed on port 6", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PORT_7_8", .udesc = "Uops executed on ports 7 and 8", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_uops_executed[]={ { .uname = "CORE", .udesc = "Number of uops executed on the core.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORE_CYCLES_GE_1", .udesc = "Cycles at least 1 micro-op is executed from any thread on physical core.", .ucode = 0x0200ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_2", .udesc = "Cycles at least 2 micro-ops are executed from any thread on physical core.", .ucode = 0x0200ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_3", 
.udesc = "Cycles at least 3 micro-ops are executed from any thread on physical core.", .ucode = 0x0200ull | (0x3 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CORE_CYCLES_GE_4", .udesc = "Cycles at least 4 micro-ops are executed from any thread on physical core.", .ucode = 0x0200ull | (0x4 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_1", .udesc = "Cycles where at least 1 uop was executed per-thread", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_2", .udesc = "Cycles where at least 2 uops were executed per-thread", .ucode = 0x0100ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_3", .udesc = "Cycles where at least 3 uops were executed per-thread", .ucode = 0x0100ull | (0x3 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_GE_4", .udesc = "Cycles where at least 4 uops were executed per-thread", .ucode = 0x0100ull | (0x4 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS", .udesc = "Counts number of cycles no uops were dispatched to be executed on this thread.", .ucode = 0x0100ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "STALL_CYCLES", .uequiv = "STALLS", .udesc = "Counts number of cycles no uops were dispatched to be executed on this thread.", .ucode = 0x0100ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "THREAD", .udesc = "Counts the number of uops to be executed per-thread each cycle.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "X87", .udesc = "Counts the number of x87 
uops dispatched.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_arith[]={ { .uname = "DIVIDER_ACTIVE", .uequiv = "DIV_ACTIVE", .udesc = "Cycles when divide unit is busy executing divide or square root operations.", .ucode = 0x0900ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DIV_ACTIVE", .udesc = "Cycles when divide unit is busy executing divide or square root operations.", .ucode = 0x0900ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "FPDIV_ACTIVE", .udesc = "TBD", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "FP_DIVIDER_ACTIVE", .uequiv = "FPDIV_ACTIVE", .udesc = "TBD", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "IDIV_ACTIVE", .udesc = "This event counts the cycles the integer divider is busy.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "INT_DIVIDER_ACTIVE", .uequiv = "IDIV_ACTIVE", .udesc = "This event counts the cycles the integer divider is busy.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_uops_issued[]={ { .uname = "ANY", .udesc = "Uops that RAT issues to RS", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, { .uname = "CYCLES", .udesc = "TBD", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_spr_int_misc[]={ { .uname = "CLEARS_COUNT", .udesc = "Clears speculative count", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "CLEAR_RESTEER_CYCLES", .udesc = "Counts cycles after recovery from a branch misprediction or machine clear till the first uop is 
issued from the resteered path.", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MBA_STALLS", .udesc = "TBD", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RECOVERY_CYCLES", .udesc = "Core cycles the allocator was stalled due to recovery from earlier clear event for this thread", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UNKNOWN_BRANCH_CYCLES", .udesc = "TBD", .ucode = 0x0700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UOP_DROPPING", .udesc = "TMA slots where uops got dropped", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_lsd[]={ { .uname = "CYCLES_ACTIVE", .udesc = "Cycles Uops delivered by the LSD, but didn't come from the decoder.", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_OK", .udesc = "Cycles optimal number of Uops delivered by the LSD, but did not come from the decoder.", .ucode = 0x0100ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "UOPS", .udesc = "Number of Uops delivered by the LSD.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_exe_activity[]={ { .uname = "1_PORTS_UTIL", .udesc = "Cycles total of 1 uop is executed on all ports and Reservation Station was not empty.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "2_PORTS_UTIL", .udesc = "Cycles total of 2 uops are executed on all ports and Reservation Station was not empty.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "3_PORTS_UTIL", .udesc = "Cycles total of 3 uops are executed on all ports and Reservation Station was not empty.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "4_PORTS_UTIL", .udesc = "Cycles total of 4 uops are executed on all ports and Reservation Station was not empty.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"BOUND_ON_LOADS", .udesc = "Execution stalls while memory subsystem has an outstanding load.", .ucode = 0x2100ull | (0x5 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "BOUND_ON_STORES", .udesc = "Cycles where the Store Buffer was full and no loads caused an execution stall.", .ucode = 0x4000ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_spr_rs[]={ { .uname = "EMPTY", .udesc = "Cycles when Reservation Station (RS) is empty for the thread.", .ucode = 0x0700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EMPTY_COUNT", .udesc = "Counts end of periods where the Reservation Station (RS) was empty.", .ucode = 0x0700ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, }; static const intel_x86_umask_t spr_rs_empty[]={ { .uname = "COUNT", .udesc = "This event is deprecated. Refer to new event RS.EMPTY_COUNT", .ucode = 0x0700ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "CYCLES", .udesc = "This event is deprecated. 
Refer to new event RS.EMPTY", .ucode = 0x0700ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_cycle_activity[]={ { .uname = "CYCLES_L1D_MISS", .udesc = "Cycles while L1 cache miss demand load is outstanding.", .ucode = 0x0800ull | (0x8 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_L2_MISS", .udesc = "Cycles while L2 cache miss demand load is outstanding.", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_MEM_ANY", .udesc = "Cycles while memory subsystem has an outstanding load.", .ucode = 0x1000ull | (0x10 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L1D_MISS", .udesc = "Execution stalls while L1 cache miss demand load is outstanding.", .ucode = 0x0c00ull | (0xc << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L2_MISS", .udesc = "Execution stalls while L2 cache miss demand load is outstanding.", .ucode = 0x0500ull | (0x5 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L3_MISS", .udesc = "Execution stalls while L3 cache miss demand load is outstanding.", .ucode = 0x0600ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_TOTAL", .udesc = "Total execution stalls.", .ucode = 0x0400ull | (0x4 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_spr_resource_stalls[]={ { .uname = "SB", .udesc = "Cycles stalled due to no store buffers available. 
(not including draining from sync).", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SCOREBOARD", .udesc = "Counts cycles where the pipeline is stalled due to serializing operations.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_idq_uops_not_delivered[]={ { .uname = "CORE", .udesc = "Uops not delivered by IDQ when backend of the machine is not stalled", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_0_UOPS_DELIV_CORE", .udesc = "Cycles when no uops are delivered by the IDQ when backend of the machine is not stalled", .ucode = 0x0100ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_FE_WAS_OK", .udesc = "Cycles when optimal number of uops was delivered to the back-end when the back-end is not stalled", .ucode = 0x0100ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_spr_decode[]={ { .uname = "LCP", .udesc = "Stalls caused by changing prefix length of the instruction.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_BUSY", .udesc = "Cycles the Microcode Sequencer is busy.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_icache_tag[]={ { .uname = "STALLS", .udesc = "Cycles where a code fetch is stalled due to L1 instruction cache tag miss.", .ucode = 0x0400ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_spr_icache_data[]={ { .uname = "STALLS", .udesc = "Cycles where a code fetch is stalled due to L1 instruction cache miss.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STALL_PERIODS", .udesc = "TBD", .ucode = 0x0400ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, }; static const 
intel_x86_umask_t intel_spr_idq[]={ { .uname = "DSB_CYCLES_ANY", .udesc = "Cycles Decode Stream Buffer (DSB) is delivering any Uop", .ucode = 0x0800ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DSB_CYCLES_OK", .udesc = "Cycles DSB is delivering optimal number of Uops", .ucode = 0x0800ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DSB_UOPS", .udesc = "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MITE_CYCLES_ANY", .udesc = "Cycles MITE is delivering any Uop", .ucode = 0x0400ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MITE_CYCLES_OK", .udesc = "Cycles MITE is delivering optimal number of Uops", .ucode = 0x0400ull | (0x6 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MITE_UOPS", .udesc = "Uops delivered to Instruction Decode Queue (IDQ) from MITE path", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MS_CYCLES_ANY", .udesc = "Cycles when uops are being delivered to IDQ while MS is busy", .ucode = 0x2000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "MS_SWITCHES", .udesc = "Number of switches from DSB or MITE to the MS", .ucode = 0x2000ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "MS_UOPS", .udesc = "Uops delivered to IDQ while MS is busy", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_uops_decoded[]={ { .uname = "DEC0_UOPS", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_spr_inst_decoded[]={ { .uname = "DECODERS", .udesc = "Instruction decoders 
utilized in a cycle", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_spr_dsb2mite_switches[]={ { .uname = "PENALTY_CYCLES", .udesc = "DSB-to-MITE switch true penalty cycles.", .ucode = 0x0200ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_spr_tx_mem[]={ { .uname = "ABORT_CAPACITY_READ", .udesc = "Speculatively counts the number of TSX aborts due to a data capacity limitation for transactional reads", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_CAPACITY_WRITE", .udesc = "Speculatively counts the number of TSX aborts due to a data capacity limitation for transactional writes.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ABORT_CONFLICT", .udesc = "Number of times a transactional abort was signaled due to a data conflict on a transactionally accessed address", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_l1d[]={ { .uname = "REPLACEMENT", .udesc = "Counts the number of cache lines replaced in L1 data cache.", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_spr_load_hit_prefetch[]={ { .uname = "SWPF", .udesc = "Counts the number of demand load dispatches that hit L1D fill buffer (FB) allocated for software prefetch.", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_spr_l1d_pend_miss[]={ { .uname = "FB_FULL", .udesc = "Number of cycles a demand request has waited due to L1D Fill Buffer (FB) unavailability.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FB_FULL_PERIODS", .udesc = "Number of phases a demand request has waited due to L1D Fill Buffer (FB) unavailability.", .ucode = 0x0200ull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "L2_STALL", .uequiv = "L2_STALLS", .udesc = "Number of cycles a demand request has 
waited in the L1D due to lack of L2 resources.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L2_STALLS", .udesc = "Number of cycles a demand request has waited in the L1D due to lack of L2 resources.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PENDING", .udesc = "Number of L1D misses that are outstanding", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PENDING_CYCLES", .udesc = "Cycles with L1D load Misses outstanding.", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_spr_memory_activity[]={ { .uname = "CYCLES_L1D_MISS", .udesc = "Cycles while L1 cache miss demand load is outstanding.", .ucode = 0x0200ull | (0x2 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L1D_MISS", .udesc = "Execution stalls while L1 cache miss demand load is outstanding.", .ucode = 0x0300ull | (0x3 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L2_MISS", .udesc = "Execution stalls while L2 cache miss demand cacheable load request is outstanding", .ucode = 0x0500ull | (0x5 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "STALLS_L3_MISS", .udesc = "Execution stalls while L3 cache miss demand cacheable load request is outstanding", .ucode = 0x0900ull | (0x9 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_spr_mem_store_retired[]={ { .uname = "L2_HIT", .udesc = "TBD", .ucode = 0x0100ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_spr_mem_load_completed[]={ { .uname = "L1_MISS_ANY", .udesc = "Completed demand load uops that miss the L1 d-cache.", .ucode = 0xfd00ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t spr_sq_misc[]={ { .uname = "BUS_LOCK", .udesc = "Counts bus 
locks, accounts for cache line split locks and UC locks.", .ucode = 0x1000ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_spr_sw_prefetch_access[]={ { .uname = "ANY", .udesc = "Counts the number of PREFETCHNTA, PREFETCHW, PREFETCHT0, PREFETCHT1 or PREFETCHT2 instructions executed.", .ucode = 0x0f00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "NTA", .udesc = "Number of PREFETCHNTA instructions executed.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PREFETCHW", .udesc = "Number of PREFETCHW instructions executed.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "T0", .udesc = "Number of PREFETCHT0 instructions executed.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "T1_T2", .udesc = "Number of PREFETCHT1 or PREFETCHT2 instructions executed.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_longest_lat_cache[]={ { .uname = "MISS", .udesc = "Core-originated cacheable requests that missed L3 (Except hardware prefetches to the L3)", .ucode = 0x4100ull, .uflags = INTEL_X86_DFL, }, { .uname = "REFERENCE", .udesc = "Core-originated cacheable requests that refer to L3 (Except hardware prefetches to the L3)", .ucode = 0x4f00ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_xq[]={ { .uname = "FULL_CYCLES", .udesc = "Cycles the uncore cannot take further requests", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_DFL, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t intel_spr_l2_lines_out[]={ { .uname = "NON_SILENT", .udesc = "Modified cache lines that are evicted by L2 cache when triggered by an L2 cache fill", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SILENT", .udesc = "Non-modified cache lines that are silently dropped by L2 cache when triggered by an L2 cache fill.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "USELESS_HWPF", 
.udesc = "Cache lines that have been L2 hardware prefetched but not used by demand accesses", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_l2_lines_in[]={ { .uname = "ALL", .udesc = "L2 cache lines filling L2", .ucode = 0x1f00ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_spr_l2_rqsts[]={ { .uname = "ALL_CODE_RD", .udesc = "L2 code requests", .ucode = 0xe400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_DATA_RD", .udesc = "Demand Data Read requests", .ucode = 0xe100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_MISS", .udesc = "Demand requests that miss L2 cache", .ucode = 0x2700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_DEMAND_REFERENCES", .udesc = "Demand requests to L2 cache", .ucode = 0xe700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_RFO", .udesc = "RFO requests to L2 cache", .ucode = 0xe200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_HIT", .udesc = "L2 cache hits when fetching instructions, code reads.", .ucode = 0xc400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE_RD_MISS", .udesc = "L2 cache misses when fetching instructions", .ucode = 0x2400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_HIT", .udesc = "Demand Data Read requests that hit L2 cache", .ucode = 0xc100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_MISS", .udesc = "Demand Data Read miss L2, no rejects", .ucode = 0x2100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT", .udesc = "RFO requests that hit L2 cache", .ucode = 0xc200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_MISS", .udesc = "RFO requests that miss L2 cache", .ucode = 0x2200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SWPF_HIT", .udesc = "SW prefetch requests that hit L2 cache.", .ucode = 0xc800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SWPF_MISS", .udesc = "SW prefetch requests that miss L2 cache.", .ucode = 0x2800ull, .uflags = INTEL_X86_NCOMBO, }, }; 
static const intel_x86_umask_t intel_spr_offcore_requests[]={ { .uname = "ALL_REQUESTS", .udesc = "TBD", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA_RD", .udesc = "Demand and prefetch data reads", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD", .udesc = "Cacheable and noncacheable code read requests", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD", .udesc = "Demand Data Read requests sent to uncore", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO", .udesc = "Demand RFO requests including regular RFOs, locks, ItoM", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L3_MISS_DEMAND_DATA_RD", .udesc = "Counts demand data read requests that miss the L3 cache.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_offcore_requests_outstanding[]={ { .uname = "ALL_DATA_RD", .uequiv = "DATA_RD", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CYCLES_WITH_DATA_RD", .udesc = "TBD", .ucode = 0x0800ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_WITH_DEMAND_CODE_RD", .udesc = "Cycles with offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore.", .ucode = 0x0200ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_WITH_DEMAND_DATA_RD", .udesc = "Cycles where at least 1 outstanding demand data read request is pending.", .ucode = 0x0100ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "CYCLES_WITH_DEMAND_RFO", .udesc = "TBD", .ucode = 0x0400ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "DATA_RD", .udesc = "TBD", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_CODE_RD", .udesc = "Offcore 
outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore, every cycle.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD", .udesc = "For every cycle, increments by the number of outstanding demand data read requests pending.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "L3_MISS_DEMAND_DATA_RD", .udesc = "For every cycle, increments by the number of demand data read requests pending that are known to have missed the L3 cache.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_dtlb_store_misses[]={ { .uname = "STLB_HIT", .udesc = "Stores that miss the DTLB and hit the STLB.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_ACTIVE", .udesc = "Cycles when at least one PMH is busy with a page walk for a store.", .ucode = 0x1000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "WALK_COMPLETED", .udesc = "Store misses in all TLB levels causes a page walk that completes. 
(All page sizes)", .ucode = 0x0e00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_1G", .udesc = "Page walks completed due to a demand data store to a 1G page.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Page walks completed due to a demand data store to a 2M/4M page.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Page walks completed due to a demand data store to a 4K page.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_PENDING", .udesc = "Number of page walks outstanding for a store in the PMH each cycle.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_dtlb_load_misses[]={ { .uname = "STLB_HIT", .udesc = "Loads that miss the DTLB and hit the STLB.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_ACTIVE", .udesc = "Cycles when at least one PMH is busy with a page walk for a demand load.", .ucode = 0x1000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "WALK_COMPLETED", .udesc = "Load miss in all TLB levels causes a page walk that completes. 
(All page sizes)", .ucode = 0x0e00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_1G", .udesc = "Page walks completed due to a demand data load to a 1G page.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Page walks completed due to a demand data load to a 2M/4M page.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Page walks completed due to a demand data load to a 4K page.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_PENDING", .udesc = "Number of page walks outstanding for a demand load in the PMH each cycle.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_itlb_misses[]={ { .uname = "STLB_HIT", .udesc = "Instruction fetch requests that miss the ITLB and hit the STLB.", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_ACTIVE", .udesc = "Cycles when at least one PMH is busy with a page walk for code (instruction fetch) request.", .ucode = 0x1000ull | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_C, }, { .uname = "WALK_COMPLETED", .udesc = "Code miss in all TLB levels causes a page walk that completes. (All page sizes)", .ucode = 0x0e00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Code miss in all TLB levels causes a page walk that completes. (2M/4M)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Code miss in all TLB levels causes a page walk that completes. 
(4K)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_PENDING", .udesc = "Number of page walks outstanding for an outstanding code request in the PMH each cycle.", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_l2_trans[]={ { .uname = "L2_WB", .udesc = "L2 writebacks that access L2 cache", .ucode = 0x4000ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_spr_ld_blocks[]={ { .uname = "ADDRESS_ALIAS", .udesc = "False dependencies in MOB due to partial compare on address.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_SR", .udesc = "The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use.", .ucode = 0x8800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "STORE_FORWARD", .udesc = "Loads blocked due to overlapping with a preceding store that cannot be forwarded.", .ucode = 0x8200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_topdown[]={ { .uname = "BACKEND_BOUND_SLOTS", .udesc = "TMA slots where no uops were being issued due to lack of back-end resources.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BAD_SPEC_SLOTS", .udesc = "TMA slots wasted due to incorrect speculations.", .ucode = 0x04a4ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "BR_MISPREDICT_SLOTS", .udesc = "TMA slots wasted due to incorrect speculation by branch mispredictions", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEMORY_BOUND_SLOTS", .udesc = "TBD", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOTS", .udesc = "TMA slots available for an unhalted logical processor. Fixed counter - architectural event", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOTS_P", .udesc = "TMA slots available for an unhalted logical processor. 
General counter - architectural event", .ucode = 0x01a4ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, }; static const intel_x86_umask_t intel_spr_topdown_m[]={ { .uname = "BACKEND_BOUND", .udesc = "TMA slots where no uops were being issued due to lack of back-end resources", .ucode = 0x8300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "BAD_SPEC", .udesc = "TMA slots wasted due to incorrect speculations.", .ucode = 0x8100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "BR_MISPREDICT", .udesc = "TMA slots wasted due to incorrect speculation by branch mispredictions", .ucode = 0x8500ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "FRONTEND_BOUND", .udesc = "TMA slots where no uops were being issued due to lack of front-end resources.", .ucode = 0x8200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "FETCH_LAT", .udesc = "TMA slots wasted due to lack of uops to decode due to code fetch latencies.", .ucode = 0x8600ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "HEAVY_OPS", .udesc = "TMA slots where instructions with 2+ uops retired.", .ucode = 0x8400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "MEMORY_BOUND", .udesc = "TMA slots wasted due to back-end waiting for memory.", .ucode = 0x8700ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "RETIRING", .udesc = "TMA slots where instructions are retiring", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_NO_MODS, }, { .uname = "SLOTS", .udesc = "TMA slots available for an unhalted logical processor. 
Fixed counter - architectural event", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_cpu_clk_unhalted[]={ { .uname = "C01", .udesc = "Core clocks when the thread is in the C0.1 light-weight slower wakeup time but more power saving optimized state.", .ucode = 0x10ecull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "C02", .udesc = "Core clocks when the thread is in the C0.2 light-weight faster wakeup time but less power saving optimized state.", .ucode = 0x20ecull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "C0_WAIT", .udesc = "Core clocks when the thread is in the C0.1 or C0.2 or running a PAUSE in C0 ACPI state.", .ucode = 0x70ecull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "DISTRIBUTED", .udesc = "Cycle counts are evenly distributed between active threads in the Core.", .ucode = 0x02ecull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "ONE_THREAD_ACTIVE", .udesc = "Core crystal clock cycles when this thread is unhalted and the other thread is halted.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PAUSE", .udesc = "TBD", .ucode = 0x40ecull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "PAUSE_INST", .udesc = "TBD", .ucode = 0x40ecull | (0x1 << INTEL_X86_CMASK_BIT) | (0x1 << INTEL_X86_EDGE_BIT), .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_E, }, { .uname = "REF_DISTRIBUTED", .udesc = "Core crystal clock cycles. 
Cycle counts are evenly distributed between active threads in the Core.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REF_TSC", .udesc = "Reference cycles when the core is not in halt state (Fixed Counter 2).", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, { .uname = "REF_TSC_P", .udesc = "Reference cycles when the core is not in halt state (Programmable Counter).", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "THREAD", .udesc = "Core cycles when the thread is not in halt state", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, /* fixed counter encoding */ }, { .uname = "THREAD_P", .udesc = "Thread cycles when thread is not in halt state", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_spr_inst_retired[]={ { .uname = "ANY", .udesc = "Number of instructions retired. Fixed Counter - architectural event (c, e, i, intx, intxcp modifiers not available)", .ucode = 0x0100ull, .ucntmsk = 0x100000000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_CODE_OVERRIDE | INTEL_X86_DFL, /* * because this encoding is for a fixed counter, not all modifiers are available. Given that we do not have per umask modmsk, we use * the hardcoded modifiers field instead. We mark all unavailable modifiers as set (to 0) so the user cannot modify them */ .modhw = _INTEL_X86_ATTR_INTX | _INTEL_X86_ATTR_INTXCP | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_E, }, { .uname = "STALL_CYCLES", .udesc = "Cycles without actually retired instructions.", .ucode = 0x0100ull | (0x1 << INTEL_X86_INV_BIT) | (0x1 << INTEL_X86_CMASK_BIT), .uflags = INTEL_X86_NCOMBO, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "ANY_P", .udesc = "Number of instructions retired. 
General Counter - architectural event", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "MACRO_FUSED", .udesc = "TBD", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NOP", .udesc = "Number of all retired NOP instructions.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "PREC_DIST", .udesc = "Precise instruction retired with PEBS precise-distribution", .ucode = 0x0100ull, .ucntmsk = 0x100000000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE | INTEL_X86_FIXED | INTEL_X86_PEBS, /* * because this encoding is for a fixed counter, not all modifiers are available. Given that we do not have per umask modmsk, we use * the hardcoded modifiers field instead. We mark all unavailable modifiers as set (to 0) so the user cannot modify them */ .modhw = _INTEL_X86_ATTR_INTX | _INTEL_X86_ATTR_INTXCP | _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_I, }, { .uname = "REP_ITERATION", .udesc = "Iterations of Repeat string retired instructions.", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_entry_t intel_spr_pe[]={ { .name = "UNHALTED_CORE_CYCLES", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x20000000full, .code = 0x3c, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles", .modmsk = INTEL_FIXED3_ATTRS, .cntmsk = 0x400000000ull, .code = 0x0300, /* pseudo encoding */ .flags = INTEL_X86_FIXED, }, { .name = "INSTRUCTION_RETIRED", .desc = "Number of instructions at retirement", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x1000000ffull, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "Number of instructions at retirement", .modmsk = INTEL_V2_ATTRS, .equiv = "INSTRUCTION_RETIRED", .cntmsk = 0x1000000ffull, .code = 0xc0, }, { .name = "INT_VEC_RETIRED", .desc = "integer ADD, SUB, SAD 128-bit vector instructions.", .code = 
0x00e7, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_int_vec_retired), .umasks = intel_spr_int_vec_retired, }, { .name = "MEM_UOP_RETIRED", .desc = "Retired memory uops for any access", .code = 0x00e5, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_mem_uop_retired), .umasks = intel_spr_mem_uop_retired, }, { .name = "MISC2_RETIRED", .desc = "TBD", .code = 0x00e0, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_misc2_retired), .umasks = intel_spr_misc2_retired, }, { .name = "MEM_LOAD_MISC_RETIRED", .desc = "Retired instructions with at least 1 uncacheable load or lock.", .code = 0x00d4, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_mem_load_misc_retired), .umasks = intel_spr_mem_load_misc_retired, }, { .name = "MEM_LOAD_L3_MISS_RETIRED", .desc = "Retired load instructions which data sources missed L3 but serviced from local dram", .code = 0x00d3, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_mem_load_l3_miss_retired), .umasks = intel_spr_mem_load_l3_miss_retired, }, { .name = "MEM_LOAD_L3_HIT_RETIRED", .desc = "Retired load instructions whose data sources were L3 hit and cross-core snoop missed in on-pkg core cache.", .code = 0x00d2, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_mem_load_l3_hit_retired), .umasks = intel_spr_mem_load_l3_hit_retired, }, { .name = "MEM_LOAD_RETIRED", .desc = "Retired load instructions with L1 cache hits as data sources", .code = 0x00d1, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_mem_load_retired), .umasks = intel_spr_mem_load_retired, }, { .name = "MEM_INST_RETIRED", .desc = 
"Retired load instructions that miss the STLB.", .code = 0x00d0, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_mem_inst_retired), .umasks = intel_spr_mem_inst_retired, }, { .name = "FP_ARITH_INST_RETIRED2", .desc = "TBD", .code = 0x00cf, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_fp_arith_inst_retired2), .umasks = intel_spr_fp_arith_inst_retired2, }, { .name = "MEM_TRANS_RETIRED", .desc = "Counts randomly selected loads when the latency from first dispatch to completion is greater than 128 cycles.", .code = 0x01cd, .modmsk = INTEL_V5_ATTRS | _INTEL_X86_ATTR_LDLAT, .cntmsk = 0xfeull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_mem_trans_retired), .umasks = intel_spr_mem_trans_retired, }, { .name = "MISC_RETIRED", .desc = "Increments whenever there is an update to the LBR array.", .code = 0x00cc, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_misc_retired), .umasks = intel_spr_misc_retired, }, { .name = "RTM_RETIRED", .desc = "Number of times an RTM execution started.", .code = 0x00c9, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_rtm_retired), .umasks = intel_spr_rtm_retired, }, { .name = "SQ_MISC", .desc = "Counts bus locks, accounts for cache line split locks and UC locks.", .code = 0x002c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(spr_sq_misc), .umasks = spr_sq_misc, }, { .name = "FP_ARITH_INST_RETIRED", .desc = "Counts number of SSE/AVX computational scalar double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 1 computational operation. 
Applies to SSE* and AVX* scalar double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.", .code = 0x00c7, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_fp_arith_inst_retired), .umasks = intel_spr_fp_arith_inst_retired, }, { .name = "FRONTEND_RETIRED", .desc = "Retired Instructions who experienced a critical DSB miss.", .code = 0x01c6, .modmsk = INTEL_SKL_FE_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_FRONTEND | INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_frontend_retired), .umasks = intel_spr_frontend_retired, }, { .name = "BR_MISP_RETIRED", .desc = "Mispredicted branch instructions retired.", .code = 0x00c5, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_br_misp_retired), .umasks = intel_spr_br_misp_retired, }, { .name = "BR_INST_RETIRED", .desc = "Branch instructions retired.", .code = 0x00c4, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_br_inst_retired), .umasks = intel_spr_br_inst_retired, }, { .name = "MACHINE_CLEARS", .desc = "Number of machine clears (nukes) of any type.", .code = 0x00c3, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_machine_clears), .umasks = intel_spr_machine_clears, }, { .name = "UOPS_RETIRED", .desc = "Retired uops.", .code = 0x00c2, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_uops_retired), .umasks = intel_spr_uops_retired, }, { .name = "ASSISTS", .desc = "Counts all microcode FP assists.", .code = 0x00c1, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_assists), .umasks = intel_spr_assists, }, { 
.name = "EXE", .desc = "Execution cycles.", .code = 0x02b7, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_exe), .umasks = intel_spr_exe, }, { .name = "FP_ARITH_DISPATCHED", .desc = "TBD", .code = 0x00b3, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_fp_arith_dispatched), .umasks = intel_spr_fp_arith_dispatched, }, { .name = "UOPS_DISPATCHED", .desc = "Uops dispatched.", .code = 0x00b2, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_uops_dispatched), .umasks = intel_spr_uops_dispatched, }, { .name = "UOPS_EXECUTED", .desc = "Uops executed.", .code = 0x00b1, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_uops_executed), .umasks = intel_spr_uops_executed, }, { .name = "ARITH", .desc = "Arithmetic operations.", .code = 0x00b0, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_arith), .umasks = intel_spr_arith, }, { .name = "UOPS_ISSUED", .desc = "Uops issued.", .code = 0x00ae, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_uops_issued), .umasks = intel_spr_uops_issued, }, { .name = "INT_MISC", .desc = "Miscellaneous interruptions.", .code = 0x00ad, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_int_misc), .umasks = intel_spr_int_misc, }, { .name = "LSD", .desc = "LSD (Loop Stream Detector) operations.", .code = 0x00a8, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_lsd), .umasks = intel_spr_lsd, }, { .name = "EXE_ACTIVITY", .desc = "Execution activity.", .code = 0x00a6, .modmsk = INTEL_V5_ATTRS, .cntmsk = 
0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_exe_activity), .umasks = intel_spr_exe_activity, }, { .name = "RS", .desc = "Reservation Station (RS) activity.", .code = 0x00a5, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_rs), .umasks = intel_spr_rs, }, { .name = "RS_EMPTY", .desc = "This event is deprecated. Refer to new event RS.EMPTY_COUNT", .code = 0x00a5, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(spr_rs_empty), .umasks = spr_rs_empty, }, { .name = "CYCLE_ACTIVITY", .desc = "Stalled cycles.", .code = 0x00a3, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_cycle_activity), .umasks = intel_spr_cycle_activity, }, { .name = "RESOURCE_STALLS", .desc = "Cycles where Allocation is stalled due to Resource Related reasons.", .code = 0x00a2, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_resource_stalls), .umasks = intel_spr_resource_stalls, }, { .name = "IDQ_UOPS_NOT_DELIVERED", .desc = "Uops not delivered.", .code = 0x009c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_idq_uops_not_delivered), .umasks = intel_spr_idq_uops_not_delivered, }, { .name = "IDQ_BUBBLES", .desc = "Uops not delivered.", .code = 0x009c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .equiv = "IDQ_UOPS_NOT_DELIVERED", .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_idq_uops_not_delivered), .umasks = intel_spr_idq_uops_not_delivered, }, { .name = "DECODE", .desc = "Decoder activity.", .code = 0x0087, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_decode), .umasks = intel_spr_decode, }, { .name = "ICACHE_TAG", .desc = 
"Instruction cache tagging.", .code = 0x0083, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_icache_tag), .umasks = intel_spr_icache_tag, }, { .name = "ICACHE_DATA", .desc = "Instruction cache.", .code = 0x0080, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_icache_data), .umasks = intel_spr_icache_data, }, { .name = "IDQ", .desc = "IDQ (Instruction Decoded Queue) operations.", .code = 0x0079, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_idq), .umasks = intel_spr_idq, }, { .name = "UOPS_DECODED", .desc = "Uops decoded.", .code = 0x0076, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_uops_decoded), .umasks = intel_spr_uops_decoded, }, { .name = "INST_DECODED", .desc = "Instruction decoded.", .code = 0x0075, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_inst_decoded), .umasks = intel_spr_inst_decoded, }, { .name = "DSB2MITE_SWITCHES", .desc = "DSB to MITE switches.", .code = 0x0061, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_dsb2mite_switches), .umasks = intel_spr_dsb2mite_switches, }, { .name = "TX_MEM", .desc = "Transactional memory.", .code = 0x0054, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_tx_mem), .umasks = intel_spr_tx_mem, }, { .name = "L1D", .desc = "L1D cache.", .code = 0x0051, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_l1d), .umasks = intel_spr_l1d, }, { .name = "LOAD_HIT_PREFETCH", .desc = "Load dispatches.", .code = 0x004c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 0, 
.numasks= LIBPFM_ARRAY_SIZE(intel_spr_load_hit_prefetch), .umasks = intel_spr_load_hit_prefetch, }, { .name = "L1D_PEND_MISS", .desc = "L1D pending misses.", .code = 0x0048, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_l1d_pend_miss), .umasks = intel_spr_l1d_pend_miss, }, { .name = "MEMORY_ACTIVITY", .desc = "Memory activity.", .code = 0x0047, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_memory_activity), .umasks = intel_spr_memory_activity, }, { .name = "MEM_STORE_RETIRED", .desc = "TBD", .code = 0x0044, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_mem_store_retired), .umasks = intel_spr_mem_store_retired, }, { .name = "MEM_LOAD_COMPLETED", .desc = "Completed demand load.", .code = 0x0043, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_mem_load_completed), .umasks = intel_spr_mem_load_completed, }, { .name = "SW_PREFETCH_ACCESS", .desc = "Software prefetches.", .code = 0x0040, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_sw_prefetch_access), .umasks = intel_spr_sw_prefetch_access, }, { .name = "LONGEST_LAT_CACHE", .desc = "L3 cache.", .code = 0x002e, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xffull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_longest_lat_cache), .umasks = intel_spr_longest_lat_cache, }, { .name = "XQ", .desc = "TBD", .code = 0x002d, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_xq), .umasks = intel_spr_xq, }, { .name = "L2_LINES_OUT", .desc = "L2 lines evicted.", .code = 0x0026, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_l2_lines_out), .umasks = intel_spr_l2_lines_out, }, { 
.name = "L2_LINES_IN", .desc = "L2 lines allocated.", .code = 0x0025, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_l2_lines_in), .umasks = intel_spr_l2_lines_in, }, { .name = "L2_RQSTS", .desc = "L2 requests.", .code = 0x0024, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_l2_rqsts), .umasks = intel_spr_l2_rqsts, }, { .name = "OFFCORE_REQUESTS", .desc = "Offcore requests.", .code = 0x0021, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_offcore_requests), .umasks = intel_spr_offcore_requests, }, { .name = "OFFCORE_REQUESTS_OUTSTANDING", .desc = "Outstanding offcore requests.", .code = 0x0020, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_offcore_requests_outstanding), .umasks = intel_spr_offcore_requests_outstanding, }, { .name = "DTLB_STORE_MISSES", .desc = "Data TLB store misses.", .code = 0x0013, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_dtlb_store_misses), .umasks = intel_spr_dtlb_store_misses, }, { .name = "DTLB_LOAD_MISSES", .desc = "Data TLB load misses.", .code = 0x0012, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_dtlb_load_misses), .umasks = intel_spr_dtlb_load_misses, }, { .name = "ITLB_MISSES", .desc = "Instruction TLB misses.", .code = 0x0011, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_itlb_misses), .umasks = intel_spr_itlb_misses, }, { .name = "L2_TRANS", .desc = "L2 writebacks that access L2 cache", .code = 0x0023, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(spr_l2_trans), .umasks = 
spr_l2_trans, }, { .name = "LD_BLOCKS", .desc = "Blocking loads.", .code = 0x0003, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_ld_blocks), .umasks = intel_spr_ld_blocks, }, { .name = "TOPDOWN", .desc = "Topdown events.", .cntmsk = 0x800000000ull, .modmsk = INTEL_V5_ATTRS, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_topdown), .umasks = intel_spr_topdown, }, { .name = "TOPDOWN_M", .desc = "Topdown events via PERF_METRICS MSR (Linux only). All events must be in a Linux perf_events group and SLOTS must be the first event for the kernel to program the events onto the PERF_METRICS MSR. Only SLOTS umask accepts modifiers", .cntmsk = 0x1000000000ull, .modmsk = INTEL_FIXED2_ATTRS, .ngrp = 1, .flags = INTEL_X86_SPEC | INTEL_X86_FIXED, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_topdown_m), .umasks = intel_spr_topdown_m, }, { .name = "CPU_CLK_UNHALTED", .desc = "Cycles in unhalted state.", .code = 0x003c, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0x1ull, .ngrp = 1, .flags = INTEL_X86_SPEC, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_cpu_clk_unhalted), .umasks = intel_spr_cpu_clk_unhalted, }, { .name = "INST_RETIRED", .desc = "Number of instructions retired.", .code = 0x00c0, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0x1ull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_inst_retired), .umasks = intel_spr_inst_retired, }, { .name = "OCR", .desc = "Counts demand data reads that have any type of response.", .code = 0x012a, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_SPEC | INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_ocr), .umasks = intel_spr_ocr, }, { .name = "OFFCORE_RESPONSE_0", .desc = "Offcore response event", .code = 0x012a, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .equiv = "OCR", .flags = INTEL_X86_SPEC | INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_ocr), .umasks = intel_spr_ocr, }, { .name = 
"OFFCORE_RESPONSE_1", .desc = "Offcore response event", .code = 0x012b, .modmsk = INTEL_V5_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .equiv = "OCR", .flags = INTEL_X86_SPEC | INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_spr_ocr), .umasks = intel_spr_ocr, }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_spr_unc_cha_events.h000066400000000000000000002604101502707512200252550ustar00rootroot00000000000000/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
* * PMU: spr_unc_cha (SapphireRapids Uncore CHA) * Based on Intel JSON event table version : 1.17 * Based on Intel JSON event table published : 11/09/2023 */ static const intel_x86_umask_t spr_unc_cha_bypass_cha_imc[]={ { .uname = "INTERMEDIATE", .udesc = "Intermediate bypass Taken (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NOT_TAKEN", .udesc = "Not Taken (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TAKEN", .udesc = "Taken (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_core_snp[]={ { .uname = "ANY_GTONE", .udesc = "Any Cycle with Multiple Snoops (experimental)", .ucode = 0xf200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY_ONE", .udesc = "Any Single Snoop (experimental)", .ucode = 0xf100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORE_GTONE", .udesc = "Multiple Core Requests (experimental)", .ucode = 0x4200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORE_ONE", .udesc = "Single Core Requests (experimental)", .ucode = 0x4100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EVICT_GTONE", .udesc = "Multiple Eviction (experimental)", .ucode = 0x8200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EVICT_ONE", .udesc = "Single Eviction (experimental)", .ucode = 0x8100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EXT_GTONE", .udesc = "Multiple External Snoops (experimental)", .ucode = 0x2200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EXT_ONE", .udesc = "Single External Snoops (experimental)", .ucode = 0x2100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_GTONE", .udesc = "Multiple Snoop Targets from Remote (experimental)", .ucode = 0x1200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_ONE", .udesc = "Single Snoop Target from Remote (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_direct_go[]={ { .uname = "HA_SUPPRESS_DRD", .udesc = "Direct GO 
(experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HA_SUPPRESS_NO_D2C", .udesc = "Direct GO (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HA_TOR_DEALLOC", .udesc = "Direct GO (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_direct_go_opc[]={ { .uname = "EXTCMP", .udesc = "Direct GO (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAST_GO", .udesc = "Direct GO (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FAST_GO_PULL", .udesc = "Direct GO (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "GO", .udesc = "Direct GO (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "GO_PULL", .udesc = "Direct GO (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IDLE_DUE_SUPPRESS", .udesc = "Direct GO (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NOP", .udesc = "Direct GO (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PULL", .udesc = "Direct GO (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_dir_lookup[]={ { .uname = "NO_SNP", .udesc = "Snoop Not Needed (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNP", .udesc = "Snoop Needed (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_dir_update[]={ { .uname = "HA", .udesc = "Directory Updated memory write from the HA pipe", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TOR", .udesc = "Directory Updated memory write from TOR pipe", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_egress_ordering[]={ { .uname = "IV_SNOOPGO_DN", .udesc = "Down (experimental)", .ucode = 0x0400ull, 
.uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_SNOOPGO_UP", .udesc = "Up (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_hitme_hit[]={ { .uname = "EX_RDS", .udesc = "Read request from a remote socket which hit in the HitMe Cache to a line In the E state (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SHARED_OWNREQ", .udesc = "Shared hit and op is RdInvOwn, RdInv, Inv* (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WBMTOE", .udesc = "Op is WbMtoE (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WBMTOI_OR_S", .udesc = "Op is WbMtoI, WbPushMtoI, WbFlush, or WbMtoS (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_hitme_lookup[]={ { .uname = "READ", .udesc = "Op is RdCode, RdData, RdDataMigratory, RdCur, RdInvOwn, RdInv, Inv* (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITE", .udesc = "Op is WbMtoE, WbMtoI, WbPushMtoI, WbFlush, or WbMtoS (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_hitme_miss[]={ { .uname = "NOTSHARED_RDINVOWN", .udesc = "No SF/LLC HitS/F and op is RdInvOwn (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READ_OR_INV", .udesc = "Op is RdCode, RdData, RdDataMigratory, RdCur, RdInv, Inv* (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SHARED_RDINVOWN", .udesc = "SF/LLC HitS/F and op is RdInvOwn (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_hitme_update[]={ { .uname = "DEALLOCATE", .udesc = "Deallocate HitME$ on Reads without RspFwdI* (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEALLOCATE_RSPFWDI_LOC", .udesc = "op is RspIFwd or RspIFwdWb for a local request 
(experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RDINVOWN", .udesc = "Update HitMe Cache on RdInvOwn even if not RspFwdI* (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPFWDI_REM", .udesc = "op is RspIFwd or RspIFwdWb for a remote request (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SHARED", .udesc = "Update HitMe Cache to SHARed (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_imc_reads_count[]={ { .uname = "NORMAL", .udesc = "Normal priority reads issued to the memory controller from the CHA", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRIORITY", .udesc = "ISOCH (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_imc_writes_count[]={ { .uname = "FULL", .udesc = "CHA to iMC Full Line Writes Issued; Full Line Non-ISOCH", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FULL_PRIORITY", .udesc = "ISOCH Full Line (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PARTIAL", .udesc = "Partial Non-ISOCH (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PARTIAL_PRIORITY", .udesc = "ISOCH Partial (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_llc_lookup[]={ { .uname = "ALL", .udesc = "Cache and Snoop Filter Lookups; Any Request (experimental)", .ucode = 0x1fff0000ff00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL_REMOTE", .udesc = "All transactions from Remote Agents (experimental)", .ucode = 0x17e00000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY_F", .udesc = "All Requests (experimental)", .ucode = 0x2000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CODE", .udesc = "CRd Requests (experimental)", .ucode = 0x1bd00000ff00ull, .uflags = INTEL_X86_NCOMBO, }, 
{ .uname = "CODE_READ_F", .udesc = "CRd Requests (experimental)", .ucode = 0x1000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "COREPREF_OR_DMND_LOCAL_F", .udesc = "Local non-prefetch requests (experimental)", .ucode = 0x4000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA_RD", .udesc = "Cache and Snoop Filter Lookups; Data Read Request (experimental)", .ucode = 0x1bc10000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA_READ_ALL", .udesc = "Data Reads (experimental)", .ucode = 0x1fc10000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA_READ_F", .udesc = "Data Read Request (experimental)", .ucode = 0x100000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA_READ_LOCAL", .udesc = "Demand Data Reads, Core and LLC prefetches (experimental)", .ucode = 0x8410000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA_READ_MISS", .udesc = "Data Read Misses (experimental)", .ucode = 0x1fc100000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "E_STATE", .udesc = "E State (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "F_STATE", .udesc = "F State (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLUSH_INV", .udesc = "Flush or Invalidate Requests (experimental)", .ucode = 0x1a440000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLUSH_OR_INV_F", .udesc = "Flush (experimental)", .ucode = 0x400000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "I_STATE", .udesc = "I State (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLCPREF_LOCAL_F", .udesc = "Local LLC prefetch requests (from LLC) (experimental)", .ucode = 0x8000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCALLY_HOMED_ADDRESS", .udesc = "Transactions homed locally (experimental)", .ucode = 0xbdf0000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_CODE", .udesc = "CRd Requests that come from the local socket (usually the core) (experimental)", .ucode = 0x19d00000ff00ull, 
.uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_DATA_RD", .udesc = "Cache and Snoop Filter Lookups; Data Read Request that come from the local socket (usually the core) (experimental)", .ucode = 0x19c10000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_DMND_CODE", .udesc = "Demand CRd Requests that come from the local socket (usually the core) (experimental)", .ucode = 0x18500000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_DMND_DATA_RD", .udesc = "Cache and Snoop Filter Lookups; Demand Data Reads that come from the local socket (usually the core) (experimental)", .ucode = 0x18410000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_DMND_RFO", .udesc = "Demand RFO Requests that come from the local socket (usually the core) (experimental)", .ucode = 0x18480000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_F", .udesc = "Transactions homed locally (experimental)", .ucode = 0x80000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_FLUSH_INV", .udesc = "Flush or Invalidate Requests that come from the local socket (usually the core) (experimental)", .ucode = 0x18440000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_LLC_PF", .udesc = "Cache and Snoop Filter Lookups; Prefetch requests to the LLC that come from the local socket (usually the core) (experimental)", .ucode = 0x189d0000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_PF", .udesc = "Cache and Snoop Filter Lookups; Data Read Prefetches that come from the local socket (usually the core) (experimental)", .ucode = 0x199d0000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_PF_CODE", .udesc = "CRd Prefetches that come from the local socket (usually the core) (experimental)", .ucode = 0x19100000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_PF_DATA_RD", .udesc = "Cache and Snoop Filter Lookups; Data Read Prefetches that come from the local socket (usually the core) (experimental)", .ucode = 0x19810000ff00ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "LOCAL_PF_RFO", .udesc = "RFO Prefetches that come from the local socket (usually the core) (experimental)", .ucode = 0x19080000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_RFO", .udesc = "RFO Requests that come from the local socket (usually the core) (experimental)", .ucode = 0x19c80000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M_STATE", .udesc = "M State (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_ALL", .udesc = "All Misses (experimental)", .ucode = 0x1fe000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OTHER_REQ_F", .udesc = "Write Requests (experimental)", .ucode = 0x200000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PREF_OR_DMND_REMOTE_F", .udesc = "Remote non-snoop requests (experimental)", .ucode = 0x20000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTELY_HOMED_ADDRESS", .udesc = "Transactions homed remotely (experimental)", .ucode = 0x15df0000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_CODE", .udesc = "CRd Requests that come from a Remote socket. (experimental)", .ucode = 0x1a100000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_DATA_RD", .udesc = "Cache and Snoop Filter Lookups; Data Read Requests that come from a Remote socket (experimental)", .ucode = 0x1a010000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_F", .udesc = "Transactions homed remotely (experimental)", .ucode = 0x100000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_FLUSH_INV", .udesc = "Flush or Invalidate requests that come from a Remote socket. (experimental)", .ucode = 0x1a040000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_OTHER", .udesc = "Filters Requests for those that write info into the cache that come from a remote socket (experimental)", .ucode = 0x1a020000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_RFO", .udesc = "RFO Requests that come from a Remote socket. 
(experimental)", .ucode = 0x1a080000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_SNOOP_F", .udesc = "Remote snoop requests (experimental)", .ucode = 0x40000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_SNP", .udesc = "Cache and Snoop Filter Lookups; Snoop Requests from a Remote Socket (experimental)", .ucode = 0x1c190000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO", .udesc = "RFO Requests (experimental)", .ucode = 0x1bc80000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_F", .udesc = "RFO Request Filter (experimental)", .ucode = 0x800000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_LOCAL", .udesc = "Locally HOMed RFOs - Demand and Prefetches (experimental)", .ucode = 0x9c80000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S_STATE", .udesc = "S State (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF_E", .udesc = "SnoopFilter - E State (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF_H", .udesc = "SnoopFilter - H State (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF_S", .udesc = "SnoopFilter - S State (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITE_LOCAL", .udesc = "Writes (experimental)", .ucode = 0x8420000ff00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITE_REMOTE", .udesc = "Remote Writes (experimental)", .ucode = 0x17c20000ff00ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_llc_victims[]={ { .uname = "E_STATE", .udesc = "Lines in E state (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA", .udesc = "IA traffic (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO", .udesc = "IO traffic (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_E", .udesc = "All LLC lines in E state that are victimized on a fill from an IO device 
(experimental)", .ucode = 0x1200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_FS", .udesc = "All LLC lines in F or S state that are victimized on a fill from an IO device (experimental)", .ucode = 0x1c00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_M", .udesc = "All LLC lines in M state that are victimized on a fill from an IO device (experimental)", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_MESF", .udesc = "All LLC lines in any state that are victimized on a fill from an IO device (experimental)", .ucode = 0x1f00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_ALL", .udesc = "Lines Victimized; Local - All Lines (experimental)", .ucode = 0x2000000f00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_E", .udesc = "Lines Victimized (experimental)", .ucode = 0x2000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_M", .udesc = "Lines Victimized (experimental)", .ucode = 0x2000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_ONLY", .udesc = "Local Only (experimental)", .ucode = 0x2000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_S", .udesc = "Lines Victimized (experimental)", .ucode = 0x2000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "M_STATE", .udesc = "Lines in M state (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_ALL", .udesc = "Lines Victimized; Remote - All Lines (experimental)", .ucode = 0x8000000f00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_E", .udesc = "Lines Victimized (experimental)", .ucode = 0x8000000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_M", .udesc = "Lines Victimized (experimental)", .ucode = 0x8000000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_ONLY", .udesc = "Remote Only (experimental)", .ucode = 0x8000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_S", .udesc = "Lines Victimized (experimental)", .ucode = 0x8000000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"S_STATE", .udesc = "Lines in S State (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TOTAL_E", .udesc = "All LLC lines in E state that are victimized on a fill (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TOTAL_M", .udesc = "All LLC lines in M state that are victimized on a fill (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TOTAL_S", .udesc = "All LLC lines in S state that are victimized on a fill (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_misc[]={ { .uname = "CV0_PREF_MISS", .udesc = "CV0 Prefetch Miss (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CV0_PREF_VIC", .udesc = "CV0 Prefetch Victim (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT_S", .udesc = "Number of times that an RFO hit in S state. (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPI_WAS_FSE", .udesc = "Silent Snoop Eviction (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WC_ALIASING", .udesc = "Write Combining Aliasing (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_osb[]={ { .uname = "LOCAL_INVITOE", .udesc = "Local InvItoE (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_READ", .udesc = "Local Rd (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OFF_PWRHEURISTIC", .udesc = "Off (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_READ", .udesc = "Remote Rd (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_READINVITOE", .udesc = "Remote Rd InvItoE (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_HITS_SNP_BCAST", .udesc = "RFO HitS Snoop Broadcast 
(experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_pmm_memmode_nm_invitox[]={ { .uname = "LOCAL", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SETCONFLICT", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_pmm_memmode_nm_setconflicts[]={ { .uname = "LLC", .udesc = "Memory Mode related events; Counts the number of times CHA saw a Near Memory set conflict in SF/LLC (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF", .udesc = "Memory Mode related events; Counts the number of times CHA saw a Near memory set conflict in SF/LLC (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TOR", .udesc = "Memory Mode related events; Counts the number of times CHA saw a Near Memory set conflict in TOR (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_pmm_memmode_nm_setconflicts2[]={ { .uname = "IODC", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEMWR", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEMWRNI", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_pmm_qos[]={ { .uname = "DDR4_FAST_INSERT", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REJ_IRQ", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOWTORQ_SKIP", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOW_INSERT", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { 
.uname = "THROTTLE", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "THROTTLE_IRQ", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "THROTTLE_PRQ", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_pmm_qos_occupancy[]={ { .uname = "DDR_FAST_FIFO", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DDR_SLOW_FIFO", .udesc = "Number of SLOW TOR Request inserted to ha_pmm_tor_req_fifo (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_requests[]={ { .uname = "INVITOE", .udesc = "Requests for exclusive ownership of a cache line without receiving data (experimental)", .ucode = 0x3000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "INVITOE_LOCAL", .udesc = "Local requests for exclusive ownership of a cache line without receiving data", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "INVITOE_REMOTE", .udesc = "Remote requests for exclusive ownership of a cache line without receiving data", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS", .udesc = "Read requests made into the CHA", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_LOCAL", .udesc = "Read requests from a unit on this socket", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READS_REMOTE", .udesc = "Read requests from a remote socket", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITES", .udesc = "Write requests made into the CHA", .ucode = 0x0c00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITES_LOCAL", .udesc = "Write Requests from a unit on this socket", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRITES_REMOTE", .udesc = "Read and Write Requests; Writes Remote", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const 
intel_x86_umask_t spr_unc_cha_rxc_inserts[]={ { .uname = "IPQ", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ_REJ", .udesc = "IRQ Rejected (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ_REJ", .udesc = "PRQ Rejected (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RRQ", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WBQ", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_rxc_irq0_reject[]={ { .uname = "AD_REQ_VN0", .udesc = "AD REQ on VN0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP_VN0", .udesc = "AD RSP on VN0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_NON_UPI", .udesc = "Non UPI AK Request (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB_VN0", .udesc = "BL NCB on VN0 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS_VN0", .udesc = "BL NCS on VN0 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP_VN0", .udesc = "BL RSP on VN0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB_VN0", .udesc = "BL WB on VN0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_NON_UPI", .udesc = "Non UPI IV Request (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_rxc_irq1_reject[]={ { .uname = "ALLOW_SNP", .udesc = "Allow Snoop (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY0", .udesc = "ANY0 
(experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HA", .udesc = "HA (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLC_OR_SF_WAY", .udesc = "LLC or SF Way (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLC_VICTIM", .udesc = "LLC Victim (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PA_MATCH", .udesc = "Ingress (from CMS) Request Queue Rejects; PhyAddr Match (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF_VICTIM", .udesc = "SF Victim (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VICTIM", .udesc = "Victim (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_rxc_ismq0_retry[]={ { .uname = "AD_REQ_VN0", .udesc = "AD REQ on VN0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP_VN0", .udesc = "AD RSP on VN0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_NON_UPI", .udesc = "Non UPI AK Request (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB_VN0", .udesc = "BL NCB on VN0 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS_VN0", .udesc = "BL NCS on VN0 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP_VN0", .udesc = "BL RSP on VN0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB_VN0", .udesc = "BL WB on VN0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_NON_UPI", .udesc = "Non UPI IV Request (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_rxc_ismq1_retry[]={ { .uname = "ANY0", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HA", .udesc = "TBD (experimental)", 
.ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_rxc_occupancy[]={ { .uname = "IPQ", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RRQ", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WBQ", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_rxc_other1_retry[]={ { .uname = "ALLOW_SNP", .udesc = "Allow Snoop (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY0", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HA", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLC_OR_SF_WAY", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLC_VICTIM", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF_VICTIM", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VICTIM", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_rxc_prq0_reject[]={ { .uname = "AD_REQ_VN0", .udesc = "AD REQ on VN0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP_VN0", .udesc = "AD RSP on VN0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_NON_UPI", .udesc = "Non UPI AK Request (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB_VN0", .udesc = "BL NCB on VN0 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS_VN0", .udesc = "BL NCS on VN0 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP_VN0", .udesc = "BL RSP on VN0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { 
.uname = "BL_WB_VN0", .udesc = "BL WB on VN0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_NON_UPI", .udesc = "Non UPI IV Request (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_rxc_req_q1_retry[]={ { .uname = "ALLOW_SNP", .udesc = "Allow Snoop (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY0", .udesc = "ANY0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HA", .udesc = "HA (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLC_OR_SF_WAY", .udesc = "LLC OR SF Way (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLC_VICTIM", .udesc = "LLC Victim (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF_VICTIM", .udesc = "SF Victim (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VICTIM", .udesc = "Victim (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_rxc_rrq0_reject[]={ { .uname = "AD_REQ_VN0", .udesc = "AD REQ on VN0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP_VN0", .udesc = "AD RSP on VN0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_NON_UPI", .udesc = "Non UPI AK Request (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB_VN0", .udesc = "BL NCB on VN0 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS_VN0", .udesc = "BL NCS on VN0 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP_VN0", .udesc = "BL RSP on VN0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB_VN0", .udesc = "BL WB on VN0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_NON_UPI", .udesc = "Non UPI 
IV Request (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_rxc_wbq0_reject[]={ { .uname = "AD_REQ_VN0", .udesc = "AD REQ on VN0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_RSP_VN0", .udesc = "AD RSP on VN0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_NON_UPI", .udesc = "Non UPI AK Request (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCB_VN0", .udesc = "BL NCB on VN0 (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_NCS_VN0", .udesc = "BL NCS on VN0 (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_RSP_VN0", .udesc = "BL RSP on VN0 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_WB_VN0", .udesc = "BL WB on VN0 (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IV_NON_UPI", .udesc = "Non UPI IV Request (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_rxc_wbq1_reject[]={ { .uname = "ALLOW_SNP", .udesc = "Allow Snoop (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY0", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HA", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLC_OR_SF_WAY", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLC_VICTIM", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SF_VICTIM", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VICTIM", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_snoops_sent[]={ { .uname = "ALL", .udesc = "All 
(experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "BCST_LOCAL", .udesc = "Broadcast snoop for Local Requests (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BCST_REMOTE", .udesc = "Broadcast snoops for Remote Requests (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DIRECT_LOCAL", .udesc = "Directed snoops for Local Requests (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DIRECT_REMOTE", .udesc = "Directed snoops for Remote Requests (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL", .udesc = "Broadcast or directed Snoops sent for Local Requests (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE", .udesc = "Broadcast or directed Snoops sent for Remote Requests (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_snoop_resp[]={ { .uname = "RSPCNFLCT", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPFWD", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPFWDWB", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPI", .udesc = "RspI Snoop Responses Received (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPIFWD", .udesc = "RspIFwd Snoop Responses Received (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPS", .udesc = "RspS Snoop Responses Received (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPSFWD", .udesc = "RspSFwd Snoop Responses Received (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPWB", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t 
spr_unc_cha_snoop_resp_local[]={ { .uname = "RSPCNFLCT", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPFWD", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPFWDWB", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPI", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPIFWD", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPS", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPSFWD", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPWB", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_snoop_rsp_misc[]={ { .uname = "MTOI_RSPDATAM", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MTOI_RSPIFWDM", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PULLDATAPTL_HITLLC", .udesc = "Pull Data Partial - Hit LLC (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PULLDATAPTL_HITSF", .udesc = "Pull Data Partial - Hit SF (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPIFWDMPTL_HITLLC", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RSPIFWDMPTL_HITSF", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_tor_inserts[]={ { .uname = "ALL", .udesc = "All (experimental)", .ucode = 0xc001ff0000ff00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "DDR", .udesc = "DDR Access (experimental)", .ucode = 0x400000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "EVICT", .udesc = "SF/LLC Evictions 
(experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "HIT", .udesc = "TBD (experimental)", .ucode = 0x100000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA", .udesc = "All from Local IA", .ucode = 0xc001ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_CLFLUSH", .udesc = "CLFlush from Local IA", .ucode = 0xc8c7ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_CLFLUSHOPT", .udesc = "CLFlushOpt from Local IA (experimental)", .ucode = 0xc8d7ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_CRD", .udesc = "CRd from local IA", .ucode = 0xc80fff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_CRD_PREF", .udesc = "Rd Pref from local IA (experimental)", .ucode = 0xc88fff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_DRD", .udesc = "Rd from local IA", .ucode = 0xc817ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_DRDPTE", .udesc = "DRd PTEs issued by iA Cores (experimental)", .ucode = 0xc837ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_DRD_OPT", .udesc = "DRd Opt from local IA (experimental)", .ucode = 0xc827ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_DRD_OPT_PREF", .udesc = "DRd Opt Pref from local IA (experimental)", .ucode = 0xc8a7ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_DRD_PREF", .udesc = "DRd Pref from local IA", .ucode = 0xc897ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT", .udesc = "Hits from Local IA", .ucode = 0xc001fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_CRD", .udesc = "CRd hits from local IA", .ucode = 0xc80ffd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_CRD_PREF", .udesc = "CRd Pref hits from local IA", .ucode = 0xc88ffd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_CXL_ACC", .udesc = "All requests issued from IA cores to CXL accelerator memory regions that hit the LLC. 
(experimental)", .ucode = 0x10c0018100000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_CXL_ACC_LOCAL", .udesc = "TBD (experimental)", .ucode = 0x10c0008100000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_DRD", .udesc = "DRd hits from local IA", .ucode = 0xc817fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_DRDPTE", .udesc = "DRd PTEs issued by iA Cores that Hit the LLC (experimental)", .ucode = 0xc837fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_DRD_OPT", .udesc = "DRd Opt hits from local IA (experimental)", .ucode = 0xc827fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_DRD_OPT_PREF", .udesc = "DRd Opt Pref hits from local IA (experimental)", .ucode = 0xc8a7fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_DRD_PREF", .udesc = "DRd Pref hits from local IA", .ucode = 0xc897fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_ITOM", .udesc = "ItoMs issued by iA Cores that Hit LLC (experimental)", .ucode = 0xcc47fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_LLCPREFCODE", .udesc = "LLCPrefCode hits from local IA (experimental)", .ucode = 0xcccffd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_LLCPREFDATA", .udesc = "LLCPrefData hits from local IA (experimental)", .ucode = 0xccd7fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_LLCPREFRFO", .udesc = "LLCPrefRFO hits from local IA", .ucode = 0xccc7fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_RFO", .udesc = "RFO hits from local IA", .ucode = 0xc807fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_HIT_RFO_PREF", .udesc = "RFO Pref hits from local IA", .ucode = 0xc887fd00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_ITOM", .udesc = "ItoM from Local IA (experimental)", .ucode = 0xcc47ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_ITOMCACHENEAR", .udesc = "ItoMCacheNears issued by iA Cores (experimental)", .ucode = 
0xcd47ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_LLCPREFCODE", .udesc = "LLCPrefCode from local IA (experimental)", .ucode = 0xcccfff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_LLCPREFDATA", .udesc = "LLCPrefData from local IA", .ucode = 0xccd7ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_LLCPREFRFO", .udesc = "LLCPrefRFO from local IA", .ucode = 0xccc7ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS", .udesc = "misses from Local IA", .ucode = 0xc001fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD", .udesc = "CRd misses from local IA", .ucode = 0xc80ffe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRDMORPH_CXL_ACC", .udesc = "CRds and equivalent opcodes issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator. (experimental)", .ucode = 0x10c80b8200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD_LOCAL", .udesc = "CRd issued by iA Cores that Missed the LLC - HOMed locally (experimental)", .ucode = 0xc80efe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD_PREF", .udesc = "CRd Pref misses from local IA", .ucode = 0xc88ffe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD_PREF_LOCAL", .udesc = "CRd_Prefs issued by iA Cores that Missed the LLC - HOMed locally (experimental)", .ucode = 0xc88efe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD_PREF_REMOTE", .udesc = "CRd_Prefs issued by iA Cores that Missed the LLC - HOMed remotely (experimental)", .ucode = 0xc88f7e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CRD_REMOTE", .udesc = "CRd issued by iA Cores that Missed the LLC - HOMed remotely (experimental)", .ucode = 0xc80f7e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CXL_ACC", .udesc = "All requests issued from IA cores to CXL accelerator memory regions that miss the LLC. 
(experimental)", .ucode = 0x10c0018200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_CXL_ACC_LOCAL", .udesc = "TBD (experimental)", .ucode = 0x10c0008200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD", .udesc = "DRd misses from local IA", .ucode = 0xc817fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRDMORPH_CXL_ACC", .udesc = "DRds and equivalent opcodes issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator. (experimental)", .ucode = 0x10c8138200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRDPTE", .udesc = "DRd PTEs issued by iA Cores that Missed the LLC (experimental)", .ucode = 0xc837fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_CXL_ACC", .udesc = "DRds issued from an IA core which miss the L3 and target memory in a CXL type 2 memory expander card. (experimental)", .ucode = 0x10c8178200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_CXL_ACC_LOCAL", .udesc = "TBD (experimental)", .ucode = 0x10c8168200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_DDR", .udesc = "DRds issued by IA Cores targeting DDR Mem that Missed the LLC", .ucode = 0xc8178600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_LOCAL", .udesc = "DRd misses from local IA targeting local memory", .ucode = 0xc816fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_LOCAL_DDR", .udesc = "DRds issued by iA Cores targeting DDR Mem that Missed the LLC - HOMed locally", .ucode = 0xc8168600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_LOCAL_PMM", .udesc = "DRds issued by iA Cores targeting PMM Mem that Missed the LLC - HOMed locally", .ucode = 0xc8168a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_OPT", .udesc = "DRd Opt misses from local IA (experimental)", .ucode = 0xc827fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_OPT_CXL_ACC_LOCAL", .udesc = 
"TBD (experimental)", .ucode = 0x10c8268200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_OPT_PREF", .udesc = "DRd Opt Pref misses from local IA (experimental)", .ucode = 0xc8a7fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_OPT_PREF_CXL_ACC_LOCAL", .udesc = "TBD (experimental)", .ucode = 0x10c8a68200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PMM", .udesc = "DRds issued by iA Cores targeting PMM Mem that Missed the LLC", .ucode = 0xc8178a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF", .udesc = "DRd Pref misses from local IA", .ucode = 0xc897fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_CXL_ACC", .udesc = "L2 data prefetches issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator. (experimental)", .ucode = 0x10c8978200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_CXL_ACC_LOCAL", .udesc = "TBD (experimental)", .ucode = 0x10c8968200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_DDR", .udesc = "DRd_Prefs issued by iA Cores targeting DDR Mem that Missed the LLC (experimental)", .ucode = 0xc8978600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_LOCAL", .udesc = "DRd Pref misses from local IA targeting local memory", .ucode = 0xc896fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_LOCAL_DDR", .udesc = "DRd_Prefs issued by iA Cores targeting DDR Mem that Missed the LLC - HOMed locally (experimental)", .ucode = 0xc8968600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_LOCAL_PMM", .udesc = "DRd_Prefs issued by iA Cores targeting PMM Mem that Missed the LLC - HOMed locally (experimental)", .ucode = 0xc8968a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_PMM", .udesc = "DRd_Prefs issued by iA Cores targeting PMM Mem that Missed the LLC (experimental)", .ucode = 0xc8978a00000100ull, 
.uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_REMOTE", .udesc = "DRd Pref misses from local IA targeting remote memory", .ucode = 0xc8977e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_REMOTE_DDR", .udesc = "DRd_Prefs issued by iA Cores targeting DDR Mem that Missed the LLC - HOMed remotely (experimental)", .ucode = 0xc8970600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_PREF_REMOTE_PMM", .udesc = "DRd_Prefs issued by iA Cores targeting PMM Mem that Missed the LLC - HOMed remotely (experimental)", .ucode = 0xc8970a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_REMOTE", .udesc = "DRd misses from local IA targeting remote memory", .ucode = 0xc8177e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_REMOTE_DDR", .udesc = "DRds issued by iA Cores targeting DDR Mem that Missed the LLC - HOMed remotely", .ucode = 0xc8170600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_DRD_REMOTE_PMM", .udesc = "DRds issued by iA Cores targeting PMM Mem that Missed the LLC - HOMed remotely", .ucode = 0xc8170a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_ITOM", .udesc = "ItoMs issued by iA Cores that Missed LLC (experimental)", .ucode = 0xcc47fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LLCPREFCODE", .udesc = "LLCPrefCode misses from local IA (experimental)", .ucode = 0xcccffe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LLCPREFCODE_CXL_ACC", .udesc = "LLC Prefetch Code transactions issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator. 
(experimental)", .ucode = 0x10cccf8200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LLCPREFDATA", .udesc = "LLCPrefData misses from local IA", .ucode = 0xccd7fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LLCPREFDATA_CXL_ACC", .udesc = "LLC data prefetches issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator. (experimental)", .ucode = 0x10ccd78200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LLCPREFDATA_CXL_ACC_LOCAL", .udesc = "TBD (experimental)", .ucode = 0x10ccd68200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LLCPREFRFO", .udesc = "LLCPrefRFO misses from local IA", .ucode = 0xccc7fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LLCPREFRFO_CXL_ACC", .udesc = "L2 RFO prefetches issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator. (experimental)", .ucode = 0x10c8878200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LLCPREFRFO_CXL_ACC_LOCAL", .udesc = "TBD (experimental)", .ucode = 0x10c8868200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LOCAL_WCILF_DDR", .udesc = "WCiLFs issued by iA Cores targeting DDR that missed the LLC - HOMed locally (experimental)", .ucode = 0xc8668600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LOCAL_WCILF_PMM", .udesc = "WCiLFs issued by iA Cores targeting PMM that missed the LLC - HOMed locally (experimental)", .ucode = 0xc8668a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LOCAL_WCIL_DDR", .udesc = "WCiLs issued by iA Cores targeting DDR that missed the LLC - HOMed locally (experimental)", .ucode = 0xc86e8600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_LOCAL_WCIL_PMM", .udesc = "WCiLs issued by iA Cores targeting PMM that missed the LLC - HOMed locally (experimental)", .ucode = 0xc86e8a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_REMOTE_WCILF_DDR", .udesc = "WCiLFs issued by iA Cores 
targeting DDR that missed the LLC - HOMed remotely (experimental)", .ucode = 0xc8670600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_REMOTE_WCILF_PMM", .udesc = "WCiLFs issued by iA Cores targeting PMM that missed the LLC - HOMed remotely (experimental)", .ucode = 0xc8670a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_REMOTE_WCIL_DDR", .udesc = "WCiLs issued by iA Cores targeting DDR that missed the LLC - HOMed remotely (experimental)", .ucode = 0xc86f0600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_REMOTE_WCIL_PMM", .udesc = "WCiLs issued by iA Cores targeting PMM that missed the LLC - HOMed remotely (experimental)", .ucode = 0xc86f0a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO", .udesc = "RFO misses from local IA", .ucode = 0xc807fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFOMORPH_CXL_ACC", .udesc = "RFO and L2 RFO prefetches issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator. (experimental)", .ucode = 0x10c8038200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_CXL_ACC", .udesc = "RFOs issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator. (experimental)", .ucode = 0x10c8078200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_CXL_ACC_LOCAL", .udesc = "TBD (experimental)", .ucode = 0x10c8068200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_LOCAL", .udesc = "RFO misses from local IA", .ucode = 0xc806fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_PREF", .udesc = "RFO pref misses from local IA", .ucode = 0xc887fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_PREF_CXL_ACC", .udesc = "LLC RFO prefetches issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator. 
(experimental)", .ucode = 0x10ccc78200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_PREF_CXL_ACC_LOCAL", .udesc = "TBD (experimental)", .ucode = 0x10ccc68200000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_PREF_LOCAL", .udesc = "RFO prefetch misses from local IA", .ucode = 0xc886fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_PREF_REMOTE", .udesc = "RFO prefetch misses from local IA", .ucode = 0xc8877e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_RFO_REMOTE", .udesc = "RFO misses from local IA", .ucode = 0xc8077e00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_UCRDF", .udesc = "UCRdFs issued by iA Cores that Missed LLC (experimental)", .ucode = 0xc877de00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCIL", .udesc = "WCiLs issued by iA Cores that Missed the LLC (experimental)", .ucode = 0xc86ffe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCILF", .udesc = "WCiLF issued by iA Cores that Missed the LLC (experimental)", .ucode = 0xc867fe00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCILF_DDR", .udesc = "WCiLFs issued by iA Cores targeting DDR that missed the LLC (experimental)", .ucode = 0xc8678600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCILF_PMM", .udesc = "WCiLFs issued by iA Cores targeting PMM that missed the LLC (experimental)", .ucode = 0xc8678a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCIL_DDR", .udesc = "WCiLs issued by iA Cores targeting DDR that missed the LLC (experimental)", .ucode = 0xc86f8600000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WCIL_PMM", .udesc = "WCiLs issued by iA Cores targeting PMM that missed the LLC (experimental)", .ucode = 0xc86f8a00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_MISS_WIL", .udesc = "WiLs issued by iA Cores that Missed LLC (experimental)", .ucode = 0xc87fde00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = 
"IA_RFO", .udesc = "RFO from local IA", .ucode = 0xc807ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_RFO_PREF", .udesc = "RFO pref from local IA", .ucode = 0xc887ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_SPECITOM", .udesc = "SpecItoM from Local IA", .ucode = 0xcc57ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WBEFTOE", .udesc = "WBEFtoEs issued by an IA Core. Non Modified Write Backs (experimental)", .ucode = 0xcc3fff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WBEFTOI", .udesc = "WBEFtoIs issued by an IA Core. Non Modified Write Backs (experimental)", .ucode = 0xcc37ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WBMTOE", .udesc = "WbMtoEs issued by an IA Core. Modified Write Backs (experimental)", .ucode = 0xcc2fff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WBMTOI", .udesc = "WbMtoIs issued by an iA Cores. Modified Write Backs (experimental)", .ucode = 0xcc27ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WBSTOI", .udesc = "WbStoIs issued by an IA Core. 
Non Modified Write Backs (experimental)", .ucode = 0xcc67ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WCIL", .udesc = "WCiLs issued by iA Cores (experimental)", .ucode = 0xc86fff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IA_WCILF", .udesc = "WCiLF issued by iA Cores (experimental)", .ucode = 0xc867ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO", .udesc = "All from local IO", .ucode = 0xc001ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_CLFLUSH", .udesc = "CLFlushes issued by IO Devices", .ucode = 0xc8c3ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_HIT", .udesc = "Hits from local IO", .ucode = 0xc001fd00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_HIT_ITOM", .udesc = "ItoM hits from local IO", .ucode = 0xcc43fd00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_HIT_ITOMCACHENEAR", .udesc = "ItoMCacheNears, indicating a partial write request, from IO Devices that hit the LLC", .ucode = 0xcd43fd00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_HIT_PCIRDCUR", .udesc = "RdCur and FsRdCur hits from local IO", .ucode = 0xc8f3fd00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_HIT_RFO", .udesc = "RFO hits from local IO", .ucode = 0xc803fd00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_ITOM", .udesc = "ItoM from local IO", .ucode = 0xcc43ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_ITOMCACHENEAR", .udesc = "ItoMCacheNears from IO devices.", .ucode = 0xcd43ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_MISS", .udesc = "Misses from local IO", .ucode = 0xc001fe00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_MISS_ITOM", .udesc = "ItoM misses from local IO", .ucode = 0xcc43fe00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_MISS_ITOMCACHENEAR", .udesc = "ItoMCacheNears, indicating a partial write request, from IO Devices that missed the LLC", .ucode = 0xcd43fe00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname 
= "IO_MISS_PCIRDCUR", .udesc = "RdCur and FsRdCur misses from local IO", .ucode = 0xc8f3fe00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_MISS_RFO", .udesc = "RFO misses from local IO", .ucode = 0xc803fe00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_PCIRDCUR", .udesc = "RdCur from local IO", .ucode = 0xc8f3ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_RFO", .udesc = "RFO from local IO", .ucode = 0xc803ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IO_WBMTOI", .udesc = "WbMtoIs issued by IO Devices", .ucode = 0xcc23ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IPQ", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ_IA", .udesc = "IRQ - iA (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IRQ_NON_IA", .udesc = "IRQ - Non iA (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ISOC", .udesc = "TBD (experimental)", .ucode = 0x200000000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOCAL_TGT", .udesc = "Local Targets (experimental)", .ucode = 0x8000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOC_ALL", .udesc = "All from Local iA and IO (experimental)", .ucode = 0xc000ff00000500ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOC_IA", .udesc = "All from Local iA (experimental)", .ucode = 0xc000ff00000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOC_IO", .udesc = "All from Local IO (experimental)", .ucode = 0xc000ff00000400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS", .udesc = "Just Misses (experimental)", .ucode = 0x200000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MMCFG", .udesc = "MMCFG Access (experimental)", .ucode = 0x2000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MMIO", .udesc = "MMIO Access (experimental)", .ucode = 0x4000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NEARMEM", .udesc = "Near Memory (experimental)", .ucode = 0x40000000000000ull, 
.uflags = INTEL_X86_NCOMBO, }, { .uname = "NONCOH", .udesc = "Non Coherent (experimental)", .ucode = 0x100000000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NOT_NEARMEM", .udesc = "Not Near Memory (experimental)", .ucode = 0x80000000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM", .udesc = "PMM Access (experimental)", .ucode = 0x800000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ_IOSF", .udesc = "PRQ - IOSF (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PRQ_NON_IOSF", .udesc = "PRQ - Non IOSF (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REMOTE_TGT", .udesc = "Remote Targets (experimental)", .ucode = 0x10000000000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REM_ALL", .udesc = "All from Remote (experimental)", .ucode = 0xc001ff0000c800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REM_SNPS", .udesc = "All Snoops from Remote (experimental)", .ucode = 0xc001ff00000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RRQ", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SNPS_FROM_REM", .udesc = "All Snoops from Remote (experimental)", .ucode = 0xc001ff00000800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WBQ", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_wb_push_mtoi[]={ { .uname = "LLC", .udesc = "Pushed to LLC (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MEM", .udesc = "Pushed to Memory (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_write_no_credits[]={ { .uname = "MC0", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MC1", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MC2", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = 
INTEL_X86_NCOMBO, }, { .uname = "MC3", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MC4", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MC5", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_cha_xpt_pref[]={ { .uname = "DROP0_CONFLICT", .udesc = "Dropped (on 0?) - Conflict (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DROP0_NOCRD", .udesc = "Dropped (on 0?) - No Credits (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DROP1_CONFLICT", .udesc = "Dropped (on 1?) - Conflict (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DROP1_NOCRD", .udesc = "Dropped (on 1?) - No Credits (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SENT0", .udesc = "Sent (on 0?) (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SENT1", .udesc = "Sent (on 1?) 
(experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_spr_unc_cha_pe[]={ { .name = "UNC_CHA_BYPASS_CHA_IMC", .desc = "CHA to iMC Bypass", .code = 0x0057, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_bypass_cha_imc), .umasks = spr_unc_cha_bypass_cha_imc, }, { .name = "UNC_CHA_CLOCKTICKS", .desc = "Clockticks", .code = 0x0001, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_CHA_CMS_CLOCKTICKS", .desc = "CMS Clockticks", .code = 0x00c0, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_CHA_CORE_SNP", .desc = "Core Cross Snoops Issued", .code = 0x0033, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_core_snp), .umasks = spr_unc_cha_core_snp, }, { .name = "UNC_CHA_DIRECT_GO", .desc = "Direct GO", .code = 0x006e, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_direct_go), .umasks = spr_unc_cha_direct_go, }, { .name = "UNC_CHA_DIRECT_GO_OPC", .desc = "Direct GO opcodes", .code = 0x006d, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_direct_go_opc), .umasks = spr_unc_cha_direct_go_opc, }, { .name = "UNC_CHA_DIR_LOOKUP", .desc = "Multi-socket cacheline Directory state lookups", .code = 0x0053, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_dir_lookup), .umasks = spr_unc_cha_dir_lookup, }, { .name = "UNC_CHA_DIR_UPDATE", .desc = "Multi-socket cacheline Directory state updates", .code = 0x0054, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_dir_update), .umasks = spr_unc_cha_dir_update, }, { .name = "UNC_CHA_EGRESS_ORDERING", .desc = "Egress Blocking due to Ordering requirements", .code = 0x00ba, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(spr_unc_cha_egress_ordering), .umasks = spr_unc_cha_egress_ordering, }, { .name = "UNC_CHA_HITME_HIT", .desc = "HitMe Cache hits", .code = 0x005f, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_hitme_hit), .umasks = spr_unc_cha_hitme_hit, }, { .name = "UNC_CHA_HITME_LOOKUP", .desc = "HitMe Cache accesses", .code = 0x005e, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_hitme_lookup), .umasks = spr_unc_cha_hitme_lookup, }, { .name = "UNC_CHA_HITME_MISS", .desc = "HitMe Cache misses", .code = 0x0060, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_hitme_miss), .umasks = spr_unc_cha_hitme_miss, }, { .name = "UNC_CHA_HITME_UPDATE", .desc = "HitMe Cache updates", .code = 0x0061, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_hitme_update), .umasks = spr_unc_cha_hitme_update, }, { .name = "UNC_CHA_IMC_READS_COUNT", .desc = "Memory controller reads from CHA", .code = 0x0059, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_imc_reads_count), .umasks = spr_unc_cha_imc_reads_count, }, { .name = "UNC_CHA_IMC_WRITES_COUNT", .desc = "Memory controller writes from CHA", .code = 0x005b, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_imc_writes_count), .umasks = spr_unc_cha_imc_writes_count, }, { .name = "UNC_CHA_LLC_LOOKUP", .desc = "LLC Cache Lookups", .code = 0x0034, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_llc_lookup), .umasks = spr_unc_cha_llc_lookup, }, { .name = "UNC_CHA_LLC_VICTIMS", .desc = "LLC cache victims", .code = 0x0037, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_llc_victims), .umasks = spr_unc_cha_llc_victims, }, { .name = "UNC_CHA_MISC", 
.desc = "Miscellaneous", .code = 0x0039, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_misc), .umasks = spr_unc_cha_misc, }, { .name = "UNC_CHA_OSB", .desc = "OSB", .code = 0x0055, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_osb), .umasks = spr_unc_cha_osb, }, { .name = "UNC_CHA_PMM_MEMMODE_NM_INVITOX", .desc = "TBD", .code = 0x0065, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_pmm_memmode_nm_invitox), .umasks = spr_unc_cha_pmm_memmode_nm_invitox, }, { .name = "UNC_CHA_PMM_MEMMODE_NM_SETCONFLICTS", .desc = "Memory Modes", .code = 0x0064, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_pmm_memmode_nm_setconflicts), .umasks = spr_unc_cha_pmm_memmode_nm_setconflicts, }, { .name = "UNC_CHA_PMM_MEMMODE_NM_SETCONFLICTS2", .desc = "TBD", .code = 0x0070, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_pmm_memmode_nm_setconflicts2), .umasks = spr_unc_cha_pmm_memmode_nm_setconflicts2, }, { .name = "UNC_CHA_PMM_QOS", .desc = "TBD", .code = 0x0066, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_pmm_qos), .umasks = spr_unc_cha_pmm_qos, }, { .name = "UNC_CHA_PMM_QOS_OCCUPANCY", .desc = "TBD", .code = 0x0067, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_pmm_qos_occupancy), .umasks = spr_unc_cha_pmm_qos_occupancy, }, { .name = "UNC_CHA_READ_NO_CREDITS", .desc = "CHA iMC CHNx READ Credits Empty", .code = 0x0058, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_write_no_credits), /* shared */ .umasks = spr_unc_cha_write_no_credits, }, { .name = "UNC_CHA_REQUESTS", .desc = "Requests", .code = 0x0050, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(spr_unc_cha_requests), .umasks = spr_unc_cha_requests, }, { .name = "UNC_CHA_RxC_INSERTS", .desc = "Ingress (from CMS) Allocations", .code = 0x0013, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_inserts), .umasks = spr_unc_cha_rxc_inserts, }, { .name = "UNC_CHA_RxC_IPQ0_REJECT", .desc = "IPQ Requests (from CMS) Rejected - Set 0", .code = 0x0022, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_irq0_reject), /* shared */ .umasks = spr_unc_cha_rxc_irq0_reject, }, { .name = "UNC_CHA_RxC_IPQ1_REJECT", .desc = "IPQ Requests (from CMS) Rejected - Set 1", .code = 0x0023, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_other1_retry), /* shared */ .umasks = spr_unc_cha_rxc_other1_retry, }, { .name = "UNC_CHA_RxC_IRQ0_REJECT", .desc = "IRQ Requests (from CMS) Rejected - Set 0", .code = 0x0018, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_irq0_reject), .umasks = spr_unc_cha_rxc_irq0_reject, }, { .name = "UNC_CHA_RxC_IRQ1_REJECT", .desc = "Ingress (from CMS) Request Queue Rejects", .code = 0x0019, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_irq1_reject), .umasks = spr_unc_cha_rxc_irq1_reject, }, { .name = "UNC_CHA_RxC_ISMQ0_REJECT", .desc = "ISMQ Rejects - Set 0", .code = 0x0024, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_ismq0_retry), /* shared */ .umasks = spr_unc_cha_rxc_ismq0_retry, }, { .name = "UNC_CHA_RxC_ISMQ0_RETRY", .desc = "ISMQ Retries - Set 0", .code = 0x002c, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_ismq0_retry), .umasks = spr_unc_cha_rxc_ismq0_retry, }, { .name = "UNC_CHA_RxC_ISMQ1_REJECT", .desc = "ISMQ Rejects - Set 1", .code = 0x0025, .modmsk = 
SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_ismq1_retry), /* shared */ .umasks = spr_unc_cha_rxc_ismq1_retry, }, { .name = "UNC_CHA_RxC_ISMQ1_RETRY", .desc = "ISMQ Retries - Set 1", .code = 0x002d, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_ismq1_retry), .umasks = spr_unc_cha_rxc_ismq1_retry, }, { .name = "UNC_CHA_RxC_OCCUPANCY", .desc = "Ingress (from CMS) Occupancy", .code = 0x0011, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0x1ull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_occupancy), .umasks = spr_unc_cha_rxc_occupancy, }, { .name = "UNC_CHA_RxC_OTHER0_RETRY", .desc = "Other Retries - Set 0", .code = 0x002e, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_prq0_reject), /* shared */ .umasks = spr_unc_cha_rxc_prq0_reject, }, { .name = "UNC_CHA_RxC_OTHER1_RETRY", .desc = "Other Retries - Set 1", .code = 0x002f, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_other1_retry), .umasks = spr_unc_cha_rxc_other1_retry, }, { .name = "UNC_CHA_RxC_PRQ0_REJECT", .desc = "PRQ Requests (from CMS) Rejected - Set 0", .code = 0x0020, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_prq0_reject), .umasks = spr_unc_cha_rxc_prq0_reject, }, { .name = "UNC_CHA_RxC_PRQ1_REJECT", .desc = "PRQ Requests (from CMS) Rejected - Set 1", .code = 0x0021, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_req_q1_retry), /* shared */ .umasks = spr_unc_cha_rxc_req_q1_retry, }, { .name = "UNC_CHA_RxC_REQ_Q0_RETRY", .desc = "Request Queue Retries - Set 0", .code = 0x002a, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_rrq0_reject), /* shared */ .umasks = spr_unc_cha_rxc_rrq0_reject, }, { .name = "UNC_CHA_RxC_REQ_Q1_RETRY", 
.desc = "Request Queue Retries - Set 1", .code = 0x002b, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_req_q1_retry), .umasks = spr_unc_cha_rxc_req_q1_retry, }, { .name = "UNC_CHA_RxC_RRQ0_REJECT", .desc = "RRQ Rejects - Set 0", .code = 0x0026, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_rrq0_reject), .umasks = spr_unc_cha_rxc_rrq0_reject, }, { .name = "UNC_CHA_RxC_RRQ1_REJECT", .desc = "RRQ Rejects - Set 1", .code = 0x0027, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_wbq1_reject), /* shared */ .umasks = spr_unc_cha_rxc_wbq1_reject, }, { .name = "UNC_CHA_RxC_WBQ0_REJECT", .desc = "WBQ Rejects - Set 0", .code = 0x0028, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_wbq0_reject), .umasks = spr_unc_cha_rxc_wbq0_reject, }, { .name = "UNC_CHA_RxC_WBQ1_REJECT", .desc = "WBQ Rejects - Set 1", .code = 0x0029, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_rxc_wbq1_reject), .umasks = spr_unc_cha_rxc_wbq1_reject, }, { .name = "UNC_CHA_SNOOPS_SENT", .desc = "Snoops Sent", .code = 0x0051, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_snoops_sent), .umasks = spr_unc_cha_snoops_sent, }, { .name = "UNC_CHA_SNOOP_RESP", .desc = "Snoop Responses Received", .code = 0x005c, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_snoop_resp), .umasks = spr_unc_cha_snoop_resp, }, { .name = "UNC_CHA_SNOOP_RESP_LOCAL", .desc = "Snoop Responses Received Local", .code = 0x005d, .modmsk = SPR_UNC_CHA_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_snoop_resp_local), .umasks = spr_unc_cha_snoop_resp_local, }, { .name = "UNC_CHA_SNOOP_RSP_MISC", .desc = "Misc Snoop Responses Received", .code = 
0x006b,
	  .modmsk = SPR_UNC_CHA_ATTRS,
	  .cntmsk = 0xfull,
	  .ngrp = 1,
	  .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_snoop_rsp_misc),
	  .umasks = spr_unc_cha_snoop_rsp_misc,
	},
	{ .name = "UNC_CHA_TOR_INSERTS",
	  .desc = "TOR Inserts",
	  .code = 0x0035,
	  .modmsk = SPR_UNC_CHA_ATTRS,
	  .cntmsk = 0xfull,
	  .ngrp = 1,
	  .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_tor_inserts),
	  .umasks = spr_unc_cha_tor_inserts,
	},
	{ .name = "UNC_CHA_TOR_OCCUPANCY",
	  .desc = "TOR Occupancy",
	  .code = 0x0036,
	  .modmsk = SPR_UNC_CHA_ATTRS,
	  .cntmsk = 0x1ull,
	  .ngrp = 1,
	  .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_tor_inserts), /* shared */
	  .umasks = spr_unc_cha_tor_inserts,
	},
	{ .name = "UNC_CHA_WB_PUSH_MTOI",
	  .desc = "WbPushMtoI",
	  .code = 0x0056,
	  .modmsk = SPR_UNC_CHA_ATTRS,
	  .cntmsk = 0xfull,
	  .ngrp = 1,
	  .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_wb_push_mtoi),
	  .umasks = spr_unc_cha_wb_push_mtoi,
	},
	{ .name = "UNC_CHA_WRITE_NO_CREDITS",
	  .desc = "CHA iMC CHNx WRITE Credits Empty",
	  .code = 0x005a,
	  .modmsk = SPR_UNC_CHA_ATTRS,
	  .cntmsk = 0xfull,
	  .ngrp = 1,
	  .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_write_no_credits),
	  .umasks = spr_unc_cha_write_no_credits,
	},
	{ .name = "UNC_CHA_XPT_PREF",
	  .desc = "XPT Prefetches",
	  .code = 0x006f,
	  .modmsk = SPR_UNC_CHA_ATTRS,
	  .cntmsk = 0xfull,
	  .ngrp = 1,
	  .numasks= LIBPFM_ARRAY_SIZE(spr_unc_cha_xpt_pref),
	  .umasks = spr_unc_cha_xpt_pref,
	},
};
/* 55 events available */
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_spr_unc_imc_events.h
/*
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
 *
 * PMU: spr_unc_imc (SapphireRapids Uncore IMC)
 * Based on Intel JSON event table version   : 1.17
 * Based on Intel JSON event table published : 11/09/2023
 */
static const intel_x86_umask_t spr_unc_m_act_count[]={
	{ .uname = "ALL",
	  .udesc = "Activate due to read, write, underfill, or bypass",
	  .ucode = 0xff00ull,
	  .uflags = INTEL_X86_DFL,
	},
};

static const intel_x86_umask_t spr_unc_m_cas_count[]={
	{ .uname = "ALL",
	  .udesc = "All DRAM CAS commands issued",
	  .ucode = 0xff00ull,
	  .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL,
	},
	{ .uname = "PCH0",
	  .udesc = "Pseudo Channel 0 (experimental)",
	  .ucode = 0x4000ull,
	  .uflags = INTEL_X86_NCOMBO,
	},
	{ .uname = "PCH1",
	  .udesc = "Pseudo Channel 1 (experimental)",
	  .ucode = 0x8000ull,
	  .uflags = INTEL_X86_NCOMBO,
	},
	{ .uname = "RD",
	  .udesc = "All DRAM read CAS commands issued (including underfills)",
	  .ucode = 0xcf00ull,
	  .uflags = INTEL_X86_NCOMBO,
	},
	{ .uname = "RD_PRE_REG",
	  .udesc = "DRAM RD_CAS and WR_CAS Commands. (experimental)",
	  .ucode = 0xc200ull,
	  .uflags = INTEL_X86_NCOMBO,
	},
	{ .uname = "RD_PRE_UNDERFILL", .udesc = "DRAM RD_CAS and WR_CAS Commands. 
(experimental)", .ucode = 0xc800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_REG", .udesc = "All DRAM read CAS commands issued (does not include underfills) (experimental)", .ucode = 0xc100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_UNDERFILL", .udesc = "DRAM underfill read CAS commands issued (experimental)", .ucode = 0xc400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR", .udesc = "All DRAM write CAS commands issued", .ucode = 0xf000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_NONPRE", .udesc = "DRAM WR_CAS commands w/o auto-pre (experimental)", .ucode = 0xd000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_PRE", .udesc = "DRAM RD_CAS and WR_CAS Commands. (experimental)", .ucode = 0xe000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_cas_issued_req_len[]={ { .uname = "PCH0", .udesc = "Pseudo Channel 0 (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PCH1", .udesc = "Pseudo Channel 1 (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_32B", .udesc = "Read CAS Command in Interleaved Mode (32B) (experimental)", .ucode = 0xc800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_64B", .udesc = "Read CAS Command in Regular Mode (64B) in Pseudochannel 0 (experimental)", .ucode = 0xc100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_UFILL_32B", .udesc = "Underfill Read CAS Command in Interleaved Mode (32B) (experimental)", .ucode = 0xd000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_UFILL_64B", .udesc = "Underfill Read CAS Command in Regular Mode (64B) in Pseudochannel 1 (experimental)", .ucode = 0xc200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_32B", .udesc = "Write CAS Command in Interleaved Mode (32B) (experimental)", .ucode = 0xe000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_64B", .udesc = "Write CAS Command in Regular Mode (64B) in Pseudochannel 0 (experimental)", .ucode = 0xc400ull, .uflags = INTEL_X86_NCOMBO, }, }; static 
const intel_x86_umask_t spr_unc_m_pcls[]={ { .uname = "RD", .udesc = "TBD (experimental)", .ucode = 0x0500ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TOTAL", .udesc = "TBD (experimental)", .ucode = 0x0f00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "WR", .udesc = "TBD (experimental)", .ucode = 0x0a00ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_pmm_rpq_occupancy[]={ { .uname = "ALL_SCH0", .udesc = "PMM Read Pending Queue occupancy", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_SCH1", .udesc = "PMM Read Pending Queue occupancy", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "GNT_WAIT_SCH0", .udesc = "PMM Read Pending Queue Occupancy (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "GNT_WAIT_SCH1", .udesc = "PMM Read Pending Queue Occupancy (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_GNT_SCH0", .udesc = "PMM Read Pending Queue Occupancy (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NO_GNT_SCH1", .udesc = "PMM Read Pending Queue Occupancy (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_pmm_wpq_occupancy[]={ { .uname = "ALL", .udesc = "PMM Write Pending Queue Occupancy", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL_SCH0", .udesc = "PMM Write Pending Queue Occupancy", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_SCH1", .udesc = "PMM Write Pending Queue Occupancy", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CAS", .udesc = "PMM (for IXP) Write Pending Queue Occupancy (experimental)", .ucode = 0x0c00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PWR", .udesc = "PMM (for IXP) Write Pending Queue Occupancy (experimental)", .ucode = 0x3000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_power_cke_cycles[]={ { 
.uname = "LOW_0", .udesc = "DIMM ID (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOW_1", .udesc = "DIMM ID (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOW_2", .udesc = "DIMM ID (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LOW_3", .udesc = "DIMM ID (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_power_crit_throttle_cycles[]={ { .uname = "SLOT0", .udesc = "Throttle Cycles for Rank 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT1", .udesc = "Throttle Cycles for Rank 0 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_pre_count[]={ { .uname = "ALL", .udesc = "Precharge due to read, write, underfill, or PGT.", .ucode = 0xff00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "PGT", .udesc = "DRAM Precharge commands", .ucode = 0x8800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PGT_PCH0", .udesc = "Precharges from Page Table (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PGT_PCH1", .udesc = "DRAM Precharge commands. (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD", .udesc = "Precharge due to read on page miss", .ucode = 0x1100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_PCH0", .udesc = "Precharge due to read (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_PCH1", .udesc = "DRAM Precharge commands. (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UFILL", .udesc = "DRAM Precharge commands. (experimental)", .ucode = 0x4400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UFILL_PCH0", .udesc = "DRAM Precharge commands. (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "UFILL_PCH1", .udesc = "DRAM Precharge commands. 
(experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR", .udesc = "Precharge due to write on page miss", .ucode = 0x2200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_PCH0", .udesc = "Precharge due to write (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_PCH1", .udesc = "DRAM Precharge commands. (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_rdb_inserts[]={ { .uname = "PCH0", .udesc = "Pseudo-channel 0", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PCH1", .udesc = "Pseudo-channel 1", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_rdb_ne[]={ { .uname = "PCH0", .udesc = "Pseudo-channel 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PCH1", .udesc = "Pseudo-channel 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ANY", .udesc = "Read Data Buffer Not Empty any channel (experimental)", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t spr_unc_m_rpq_inserts[]={ { .uname = "PCH0", .udesc = "Pseudo-channel 0", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PCH1", .udesc = "Pseudo-channel 1", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_sb_accesses[]={ { .uname = "ACCEPTS", .udesc = "Scoreboard accepts (experimental)", .ucode = 0x0500ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_RD_CMPS", .udesc = "Write Accepts (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_WR_CMPS", .udesc = "Write Rejects (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_RD_CMPS", .udesc = "FM read completions (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_WR_CMPS", .udesc = "FM write completions (experimental)", .ucode = 
0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_ACCEPTS", .udesc = "Read Accepts (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_REJECTS", .udesc = "Read Rejects (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REJECTS", .udesc = "Scoreboard rejects (experimental)", .ucode = 0x0a00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_ACCEPTS", .udesc = "NM read completions (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WR_REJECTS", .udesc = "NM write completions (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_sb_canary[]={ { .uname = "ALLOC", .udesc = "Alloc (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEALLOC", .udesc = "Dealloc (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_RD_STARVED", .udesc = "Near Mem Write Starved (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_TGR_WR_STARVED", .udesc = "Far Mem Write Starved (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_WR_STARVED", .udesc = "Far Mem Read Starved (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_RD_STARVED", .udesc = "Valid (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_WR_STARVED", .udesc = "Near Mem Read Starved (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VLD", .udesc = "Reject (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_sb_inserts[]={ { .uname = "BLOCK_RDS", .udesc = "Block region reads (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BLOCK_WRS", .udesc = "Block region writes (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_RDS", .udesc = "Persistent Mem reads 
(experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_WRS", .udesc = "Persistent Mem writes (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RDS", .udesc = "Reads (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WRS", .udesc = "Writes (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_sb_occupancy[]={ { .uname = "BLOCK_RDS", .udesc = "Block region reads (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BLOCK_WRS", .udesc = "Block region writes (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_RDS", .udesc = "Persistent Mem reads (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM_WRS", .udesc = "Persistent Mem writes (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RDS", .udesc = "Reads (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_sb_pref_inserts[]={ { .uname = "ALL", .udesc = "All (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "DDR", .udesc = "DDR4 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM", .udesc = "PMM (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_sb_pref_occupancy[]={ { .uname = "ALL", .udesc = "All (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DDR", .udesc = "DDR4 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM", .udesc = "Persistent Mem (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_sb_reject[]={ { .uname = "CANARY", .udesc = "Number of Scoreboard Requests Rejected (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, 
{ .uname = "DDR_EARLY_CMP", .udesc = "Number of Scoreboard Requests Rejected (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_ADDR_CNFLT", .udesc = "FM requests rejected due to full address conflict (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_SET_CNFLT", .udesc = "NM requests rejected due to set conflict (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PATROL_SET_CNFLT", .udesc = "Patrol requests rejected due to set conflict (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_sb_strv_dealloc[]={ { .uname = "FM_RD", .udesc = "Far Mem Read - Set (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_TGR", .udesc = "Near Mem Read - Clear (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_WR", .udesc = "Far Mem Write - Set (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_RD", .udesc = "Near Mem Read - Set (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_WR", .udesc = "Near Mem Write - Set (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_sb_strv_occ[]={ { .uname = "FM_RD", .udesc = "Far Mem Read (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_TGR", .udesc = "Near Mem Read - Clear (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FM_WR", .udesc = "Far Mem Write (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_RD", .udesc = "Near Mem Read (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_WR", .udesc = "Near Mem Write (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_sb_tagged[]={ { .uname = "DDR4_CMP", .udesc = "TBD (experimental)", 
.ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NEW", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "OCC", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM0_CMP", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM1_CMP", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PMM2_CMP", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_HIT", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RD_MISS", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_tagchk[]={ { .uname = "HIT", .udesc = "2LM Tag check hit in near memory cache (DDR4)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_CLEAN", .udesc = "2LM Tag check miss, no data at this line", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS_DIRTY", .udesc = "2LM Tag check miss, existing data may be evicted to PMM", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_RD_HIT", .udesc = "2LM Tag check hit due to memory read", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NM_WR_HIT", .udesc = "2LM Tag check hit due to memory write", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_m_wpq_inserts[]={ { .uname = "PCH0", .udesc = "Pseudo-channel 0", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PCH1", .udesc = "Pseudo-channel 1", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; #define RDB_E(n) \ { .uname = "RDB_"#n"_ELEM", \ .udesc = "Read buffer has "#n" element(s)", \ .ucode = 1ull << (n+8), \ .uflags = INTEL_X86_NCOMBO, \ } static const intel_x86_umask_t spr_unc_m_rdb_full[]={ RDB_E(1), RDB_E(2), RDB_E(3), RDB_E(4), RDB_E(5), 
RDB_E(6), RDB_E(7), RDB_E(8), RDB_E(9), RDB_E(10), RDB_E(11), RDB_E(12), RDB_E(13), RDB_E(14), RDB_E(15), RDB_E(16), RDB_E(17), RDB_E(18), RDB_E(19), RDB_E(20), RDB_E(21), RDB_E(22), RDB_E(23), RDB_E(24), }; static const intel_x86_entry_t intel_spr_unc_imc_pe[]={ { .name = "UNC_M_ACT_COUNT", .desc = "Count Activation", .code = 0x0002, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_act_count), .umasks = spr_unc_m_act_count, }, { .name = "UNC_M_CAS_COUNT", .desc = "DRAM CAS commands issued", .code = 0x0005, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_cas_count), .umasks = spr_unc_m_cas_count, }, { .name = "UNC_M_CAS_ISSUED_REQ_LEN", .desc = "CAS Command in Regular Mode issued", .code = 0x0006, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_cas_issued_req_len), .umasks = spr_unc_m_cas_issued_req_len, }, { .name = "UNC_M_CLOCKTICKS", .desc = "IMC Clockticks at DCLK frequency", .code = 0x0101, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_DRAM_PRE_ALL", .desc = "DRAM Precharge All Commands (experimental)", .code = 0x0044, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_HCLOCKTICKS", .desc = "IMC Clockticks at HCLK frequency", .code = 0x0001, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_PCLS", .desc = "TBD", .code = 0x00a0, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_pcls), .umasks = spr_unc_m_pcls, }, { .name = "UNC_M_PMM_RPQ_INSERTS", .desc = "PMM Read Pending Queue inserts", .code = 0x00e3, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_PMM_RPQ_OCCUPANCY", .desc = "PMM Read Pending Queue occupancy", .code = 0x00e0, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_pmm_rpq_occupancy), .umasks = spr_unc_m_pmm_rpq_occupancy, }, { .name = 
"UNC_M_PMM_WPQ_CYCLES_NE", .desc = "PMM (for IXP) Write Queue Cycles Not Empty (experimental)", .code = 0x00e5, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_PMM_WPQ_INSERTS", .desc = "PMM Write Pending Queue inserts", .code = 0x00e7, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_PMM_WPQ_OCCUPANCY", .desc = "PMM Write Pending Queue Occupancy", .code = 0x00e4, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_pmm_wpq_occupancy), .umasks = spr_unc_m_pmm_wpq_occupancy, }, { .name = "UNC_M_POWER_CHANNEL_PPD", .desc = "Channel PPD Cycles (experimental)", .code = 0x0085, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_POWER_CKE_CYCLES", .desc = "Cycles in CKE mode", .code = 0x0047, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_power_cke_cycles), .umasks = spr_unc_m_power_cke_cycles, }, { .name = "UNC_M_POWER_CRIT_THROTTLE_CYCLES", .desc = "Throttled Cycles", .code = 0x0086, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_power_crit_throttle_cycles), .umasks = spr_unc_m_power_crit_throttle_cycles, }, { .name = "UNC_M_POWER_SELF_REFRESH", .desc = "Clock-Enabled Self-Refresh (experimental)", .code = 0x0043, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_PRE_COUNT", .desc = "Count number of DRAM Precharge", .code = 0x0003, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_pre_count), .umasks = spr_unc_m_pre_count, }, { .name = "UNC_M_RDB_FULL", .desc = "Counts the number of cycles where the read buffer has greater than UMASK elements. This includes reads to both DDR and PMEM. NOTE: Umask must be set to the maximum number of elements in the queue (24 entries for SPR). 
(experimental)", .code = 0x0019, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_rdb_full), .umasks = spr_unc_m_rdb_full, }, { .name = "UNC_M_RDB_INSERTS", .desc = "Read Data Buffer Inserts", .code = 0x0017, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_rdb_inserts), .umasks = spr_unc_m_rdb_inserts, }, { .name = "UNC_M_RDB_NE", .desc = "Counts the number of cycles where there is at least one element in the read buffer. This includes reads to both DDR and PMEM.", .code = 0x0018, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_rdb_ne), .umasks = spr_unc_m_rdb_ne, }, { .name = "UNC_M_RDB_OCCUPANCY", .desc = "Counts the number of elements in the read buffer, including reads to both DDR and PMEM. (experimental)", .code = 0x001a, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_RPQ_INSERTS", .desc = "Read Pending Queue Allocations", .code = 0x0010, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_rpq_inserts), .umasks = spr_unc_m_rpq_inserts, }, { .name = "UNC_M_RPQ_OCCUPANCY_PCH0", .desc = "Read Pending Queue Occupancy pseudo-channel 0", .code = 0x0080, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_RPQ_OCCUPANCY_PCH1", .desc = "Read Pending Queue Occupancy pseudo-channel 1", .code = 0x0081, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_SB_ACCESSES", .desc = "Scoreboard accesses", .code = 0x00d2, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_sb_accesses), .umasks = spr_unc_m_sb_accesses, }, { .name = "UNC_M_SB_CANARY", .desc = "TBD", .code = 0x00d9, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_sb_canary), .umasks = spr_unc_m_sb_canary, }, { .name = "UNC_M_SB_CYCLES_FULL", .desc = "Scoreboard Cycles Full (experimental)", 
.code = 0x00d1, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_SB_CYCLES_NE", .desc = "Scoreboard Cycles Not-Empty (experimental)", .code = 0x00d0, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_SB_INSERTS", .desc = "Scoreboard Inserts", .code = 0x00d6, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_sb_inserts), .umasks = spr_unc_m_sb_inserts, }, { .name = "UNC_M_SB_OCCUPANCY", .desc = "Scoreboard Occupancy", .code = 0x00d5, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_sb_occupancy), .umasks = spr_unc_m_sb_occupancy, }, { .name = "UNC_M_SB_PREF_INSERTS", .desc = "Scoreboard Prefetch Inserts", .code = 0x00da, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_sb_pref_inserts), .umasks = spr_unc_m_sb_pref_inserts, }, { .name = "UNC_M_SB_PREF_OCCUPANCY", .desc = "Scoreboard Prefetch Occupancy", .code = 0x00db, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_sb_pref_occupancy), .umasks = spr_unc_m_sb_pref_occupancy, }, { .name = "UNC_M_SB_REJECT", .desc = "Number of Scoreboard Requests Rejected", .code = 0x00d4, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_sb_reject), .umasks = spr_unc_m_sb_reject, }, { .name = "UNC_M_SB_STRV_ALLOC", .desc = "TBD", .code = 0x00d7, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_sb_strv_dealloc), /* shared */ .umasks = spr_unc_m_sb_strv_dealloc, }, { .name = "UNC_M_SB_STRV_DEALLOC", .desc = "TBD", .code = 0x00de, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_sb_strv_dealloc), .umasks = spr_unc_m_sb_strv_dealloc, }, { .name = "UNC_M_SB_STRV_OCC", .desc = "TBD", .code = 0x00d8, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= 
LIBPFM_ARRAY_SIZE(spr_unc_m_sb_strv_occ), .umasks = spr_unc_m_sb_strv_occ, }, { .name = "UNC_M_SB_TAGGED", .desc = "TBD", .code = 0x00dd, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_sb_tagged), .umasks = spr_unc_m_sb_tagged, }, { .name = "UNC_M_TAGCHK", .desc = "2LM Tag check", .code = 0x00d3, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_tagchk), .umasks = spr_unc_m_tagchk, }, { .name = "UNC_M_WPQ_INSERTS", .desc = "Write Pending Queue Allocations", .code = 0x0020, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_m_wpq_inserts), .umasks = spr_unc_m_wpq_inserts, }, { .name = "UNC_M_WPQ_OCCUPANCY_PCH0", .desc = "Write Pending Queue Occupancy pseudo channel 0", .code = 0x0082, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_WPQ_OCCUPANCY_PCH1", .desc = "Write Pending Queue Occupancy pseudo channel 1", .code = 0x0083, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_WPQ_READ_HIT", .desc = "Write Pending Queue CAM Match (experimental)", .code = 0x0023, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_M_WPQ_WRITE_HIT", .desc = "Write Pending Queue CAM Match (experimental)", .code = 0x0024, .modmsk = SPR_UNC_IMC_ATTRS, .cntmsk = 0xfull, }, }; /* 44 events available */ papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_spr_unc_upi_events.h000066400000000000000000000476751502707512200253370ustar00rootroot00000000000000/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following 
conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: spr_unc_upi_ll (SapphireRapids Uncore UPI) * Based on Intel JSON event table version : 1.17 * Based on Intel JSON event table published : 11/09/2023 */ static const intel_x86_umask_t spr_unc_upi_direct_attempts[]={ { .uname = "D2C", .udesc = "D2C (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "D2K", .udesc = "D2K (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_upi_flowq_no_vna_crd[]={ { .uname = "AD_VNA_EQ0", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_VNA_EQ1", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AD_VNA_EQ2", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_VNA_EQ0", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_VNA_EQ1", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_VNA_EQ2", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "AK_VNA_EQ3", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "BL_VNA_EQ0", .udesc = 
"TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_upi_m3_byp_blocked[]={ { .uname = "BGF_CRD", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLOWQ_AD_VNA_LE2", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLOWQ_AK_VNA_LE3", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLOWQ_BL_VNA_EQ0", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "GV_BLOCK", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_upi_m3_rxq_blocked[]={ { .uname = "BGF_CRD", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLOWQ_AD_VNA_BTW_2_THRESH", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLOWQ_AD_VNA_LE2", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLOWQ_AK_VNA_LE3", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLOWQ_BL_VNA_BTW_0_THRESH", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "FLOWQ_BL_VNA_EQ0", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "GV_BLOCK", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_upi_req_slot2_from_m3[]={ { .uname = "ACK", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN0", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VN1", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "VNA", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = 
INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_upi_rxl_flits[]={ { .uname = "ALL_DATA", .udesc = "All Data", .ucode = 0x0f00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL_NULL", .udesc = "Null FLITs received from any slot", .ucode = 0x2700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA", .udesc = "Data (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IDLE", .udesc = "Idle (experimental)", .ucode = 0x4700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLCRD", .udesc = "LLCRD Not Empty (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLCTRL", .udesc = "LLCTRL (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NON_DATA", .udesc = "All Non Data", .ucode = 0x9700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NULL", .udesc = "Slot NULL or LLCRD Empty (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PROTHDR", .udesc = "Protocol Header (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT0", .udesc = "Slot 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT1", .udesc = "Slot 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT2", .udesc = "Slot 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_upi_rxl_inserts[]={ { .uname = "SLOT0", .udesc = "Slot 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT1", .udesc = "Slot 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT2", .udesc = "Slot 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_upi_rxl_occupancy[]={ { .uname = "SLOT0", .udesc = "Slot 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT1", .udesc = "Slot 1 (experimental)", .ucode = 
0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT2", .udesc = "Slot 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_upi_rxl_slot_bypass[]={ { .uname = "S0_RXQ1", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S0_RXQ2", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S1_RXQ0", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S1_RXQ2", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S2_RXQ0", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "S2_RXQ1", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_upi_txl0p_clk_active[]={ { .uname = "CFG_CTL", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DFX", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RETRY", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RXQ", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RXQ_BYPASS", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RXQ_CRED", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SPARE", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "TXQ", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_upi_txl_any_flits[]={ { .uname = "DATA", .udesc = "TBD (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLCRD", .udesc = "TBD (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname 
= "LLCTRL", .udesc = "TBD (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NULL", .udesc = "TBD (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PROTHDR", .udesc = "TBD (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT0", .udesc = "TBD (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT1", .udesc = "TBD (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT2", .udesc = "TBD (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t spr_unc_upi_txl_flits[]={ { .uname = "ALL_DATA", .udesc = "All Data", .ucode = 0x0f00ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ALL_LLCRD", .udesc = "All LLCRD Not Empty (experimental)", .ucode = 0x1700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_LLCTRL", .udesc = "All LLCTRL (experimental)", .ucode = 0x4700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_NULL", .udesc = "All Null Flits", .ucode = 0x2700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "ALL_PROTHDR", .udesc = "All Protocol Header (experimental)", .ucode = 0x8700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DATA", .udesc = "Data (experimental)", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "IDLE", .udesc = "Idle (experimental)", .ucode = 0x4700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLCRD", .udesc = "LLCRD Not Empty (experimental)", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "LLCTRL", .udesc = "LLCTRL (experimental)", .ucode = 0x4000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NON_DATA", .udesc = "All Non Data", .ucode = 0x9700ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "NULL", .udesc = "Slot NULL or LLCRD Empty (experimental)", .ucode = 0x2000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "PROTHDR", .udesc = "Protocol Header (experimental)", .ucode = 0x8000ull, .uflags = INTEL_X86_NCOMBO, }, { 
.uname = "SLOT0", .udesc = "Slot 0 (experimental)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT1", .udesc = "Slot 1 (experimental)", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "SLOT2", .udesc = "Slot 2 (experimental)", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_entry_t intel_spr_unc_upi_ll_pe[]={ { .name = "UNC_UPI_CLOCKTICKS", .desc = "UPI Clockticks", .code = 0x0001, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_DIRECT_ATTEMPTS", .desc = "Direct packet attempts", .code = 0x0012, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_upi_direct_attempts), .umasks = spr_unc_upi_direct_attempts, }, { .name = "UNC_UPI_FLOWQ_NO_VNA_CRD", .desc = "TBD", .code = 0x0018, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_upi_flowq_no_vna_crd), .umasks = spr_unc_upi_flowq_no_vna_crd, }, { .name = "UNC_UPI_L1_POWER_CYCLES", .desc = "Cycles in L1", .code = 0x0021, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_M3_BYP_BLOCKED", .desc = "TBD", .code = 0x0014, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_upi_m3_byp_blocked), .umasks = spr_unc_upi_m3_byp_blocked, }, { .name = "UNC_UPI_M3_CRD_RETURN_BLOCKED", .desc = "TBD (experimental)", .code = 0x0016, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_M3_RXQ_BLOCKED", .desc = "TBD", .code = 0x0015, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_upi_m3_rxq_blocked), .umasks = spr_unc_upi_m3_rxq_blocked, }, { .name = "UNC_UPI_PHY_INIT_CYCLES", .desc = "PHY Cycles (experimental)", .code = 0x0020, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_POWER_L1_NACK", .desc = "L1 Req Nack (experimental)", .code = 0x0023, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_POWER_L1_REQ", 
.desc = "L1 Req (same as L1 Ack). (experimental)", .code = 0x0022, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_REQ_SLOT2_FROM_M3", .desc = "TBD", .code = 0x0046, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_upi_req_slot2_from_m3), .umasks = spr_unc_upi_req_slot2_from_m3, }, { .name = "UNC_UPI_RxL0P_POWER_CYCLES", .desc = "Cycles in L0p (experimental)", .code = 0x0025, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_RxL0_POWER_CYCLES", .desc = "Cycles in L0 (experimental)", .code = 0x0024, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_RxL_ANY_FLITS", .desc = "TBD", .code = 0x004b, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_upi_txl_any_flits), /* shared */ .umasks = spr_unc_upi_txl_any_flits, }, { .name = "UNC_UPI_RxL_BYPASSED", .desc = "RxQ Flit Buffer Bypassed", .code = 0x0031, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_upi_rxl_inserts), /* shared */ .umasks = spr_unc_upi_rxl_inserts, }, { .name = "UNC_UPI_RxL_CRC_ERRORS", .desc = "CRC Errors Detected (experimental)", .code = 0x000b, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_RxL_CRC_LLR_REQ_TRANSMIT", .desc = "LLR Requests Sent (experimental)", .code = 0x0008, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_RxL_CREDITS_CONSUMED_VN0", .desc = "VN0 Credit Consumed (experimental)", .code = 0x0039, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_RxL_CREDITS_CONSUMED_VN1", .desc = "VN1 Credit Consumed (experimental)", .code = 0x003a, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_RxL_CREDITS_CONSUMED_VNA", .desc = "VNA Credit Consumed (experimental)", .code = 0x0038, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_RxL_FLITS", .desc = "Flits Received", .code = 0x0003, .modmsk = 
SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_upi_rxl_flits), .umasks = spr_unc_upi_rxl_flits, }, { .name = "UNC_UPI_RxL_INSERTS", .desc = "Receive Flit Buffer Allocations", .code = 0x0030, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_upi_rxl_inserts), .umasks = spr_unc_upi_rxl_inserts, }, { .name = "UNC_UPI_RxL_OCCUPANCY", .desc = "RxQ Occupancy", .code = 0x0032, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_upi_rxl_occupancy), .umasks = spr_unc_upi_rxl_occupancy, }, { .name = "UNC_UPI_RxL_SLOT_BYPASS", .desc = "TBD", .code = 0x0033, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_upi_rxl_slot_bypass), .umasks = spr_unc_upi_rxl_slot_bypass, }, { .name = "UNC_UPI_TxL0P_CLK_ACTIVE", .desc = "TBD", .code = 0x002a, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_upi_txl0p_clk_active), .umasks = spr_unc_upi_txl0p_clk_active, }, { .name = "UNC_UPI_TxL0P_POWER_CYCLES", .desc = "Cycles in L0p (experimental)", .code = 0x0027, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_TxL0P_POWER_CYCLES_LL_ENTER", .desc = "TBD (experimental)", .code = 0x0028, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_TxL0P_POWER_CYCLES_M3_EXIT", .desc = "TBD (experimental)", .code = 0x0029, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_TxL0_POWER_CYCLES", .desc = "Cycles in L0 (experimental)", .code = 0x0026, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_TxL_ANY_FLITS", .desc = "TBD", .code = 0x004a, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_upi_txl_any_flits), .umasks = spr_unc_upi_txl_any_flits, }, { .name = "UNC_UPI_TxL_BYPASSED", .desc = "Transmitted Flit Buffer Bypassed (experimental)", .code = 0x0041, .modmsk = SPR_UNC_UPI_ATTRS, 
.cntmsk = 0xfull, }, { .name = "UNC_UPI_TxL_FLITS", .desc = "Flits transmitted", .code = 0x0002, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .numasks= LIBPFM_ARRAY_SIZE(spr_unc_upi_txl_flits), .umasks = spr_unc_upi_txl_flits, }, { .name = "UNC_UPI_TxL_INSERTS", .desc = "Transmitted Flit Buffer Allocations (experimental)", .code = 0x0040, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_TxL_OCCUPANCY", .desc = "Transmitted Flit Buffer Occupancy (experimental)", .code = 0x0042, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_VNA_CREDIT_RETURN_BLOCKED_VN01", .desc = "TBD (experimental)", .code = 0x0045, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, { .name = "UNC_UPI_VNA_CREDIT_RETURN_OCCUPANCY", .desc = "VNA Credits Pending Return - Occupancy (experimental)", .code = 0x0044, .modmsk = SPR_UNC_UPI_ATTRS, .cntmsk = 0xfull, }, }; /* 36 events available */ papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_tmt_events.h000066400000000000000000000355071502707512200236040ustar00rootroot00000000000000/* * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * PMU: intel_tmt (Intel Tremont) */ static const intel_x86_umask_t intel_tmt_ocr[]={ { .uname = "DEMAND_RFO_L3_MISS", .udesc = "Counts all demand reads for ownership (RFO) requests and software based prefetches for exclusive ownership (PREFETCHW) that were not supplied by the L3 cache.", .ucode = 0x3f0400000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO_ANY_RESPONSE", .udesc = "Counts all demand reads for ownership (RFO) requests and software based prefetches for exclusive ownership (PREFETCHW) that have any response type.", .ucode = 0x1000200ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_L3_MISS", .udesc = "Counts demand data reads that were not supplied by the L3 cache.", .ucode = 0x3f0400000100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DATA_RD_ANY_RESPONSE", .udesc = "Counts demand data reads that have any response type.", .ucode = 0x1000100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_tmt_mem_load_uops_retired[]={ { .uname = "L2_MISS", .udesc = "Counts the number of load uops retired that miss in the level 2 cache", .ucode = 0x1000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1_MISS", .udesc = "Counts the number of load uops retired that miss in the level 1 data cache", .ucode = 0x0800ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_HIT", .udesc = "Counts the number of load uops retired that hit in the level 3 cache", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_HIT", .udesc = "Counts the number of load uops retired that hit in the level 2 cache", 
.ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1_HIT", .udesc = "Counts the number of load uops retired that hit the level 1 data cache", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_tmt_mem_uops_retired[]={ { .uname = "ALL_STORES", .udesc = "Counts the number of store uops retired.", .ucode = 0x8200ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ALL_LOADS", .udesc = "Counts the number of load uops retired.", .ucode = 0x8100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t intel_tmt_cycles_div_busy[]={ { .uname = "ANY", .udesc = "Counts cycles the floating point divider or integer divider or both are busy. Does not imply a stall waiting for either divider.", .ucode = 0x0000ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_tmt_br_misp_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "Counts the number of mispredicted branch instructions retired.", .ucode = 0x0000ull, .uflags = INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_tmt_br_inst_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "Counts the number of branch instructions retired for all branch types.", .ucode = 0x0000ull, .uflags = INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_tmt_machine_clears[]={ { .uname = "ANY", .udesc = "Counts all machine clears due to, but not limited to memory ordering, memory disambiguation, SMC, page faults and FP assist.", .ucode = 0x0000ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_tmt_itlb_misses[]={ { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Page walk completed due to an instruction fetch in a 2M or 4M page.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Page walk completed due to an instruction fetch in a 4K page.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static 
const intel_x86_umask_t intel_tmt_itlb[]={ { .uname = "FILLS", .udesc = "Counts the number of times there was an ITLB miss and a new translation was filled into the ITLB.", .ucode = 0x0400ull, .uflags = INTEL_X86_DFL, }, }; static const intel_x86_umask_t intel_tmt_icache[]={ { .uname = "ACCESSES", .udesc = "Counts requests to the Instruction Cache (ICache) for one or more bytes in a cache line.", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISSES", .udesc = "Counts requests to the Instruction Cache (ICache) for one or more bytes in a cache line that do not hit in the ICache (miss).", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_tmt_dtlb_store_misses[]={ { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Page walk completed due to a demand data store to a 2M or 4M page.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Page walk completed due to a demand data store to a 4K page.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_tmt_longest_lat_cache[]={ { .uname = "REFERENCE", .udesc = "Counts memory requests originating from the core that reference a cache line in the last level cache. If the platform has an L3 cache, last level cache is the L3, otherwise it is the L2.", .ucode = 0x4f00ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "MISS", .udesc = "Counts memory requests originating from the core that miss in the last level cache. 
If the platform has an L3 cache, last level cache is the L3, otherwise it is the L2.", .ucode = 0x4100ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_tmt_dtlb_load_misses[]={ { .uname = "WALK_COMPLETED_2M_4M", .udesc = "Page walk completed due to a demand load to a 2M or 4M page.", .ucode = 0x0400ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED_4K", .udesc = "Page walk completed due to a demand load to a 4K page.", .ucode = 0x0200ull, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t intel_tmt_cpu_clk_unhalted[]={ { .uname = "REF", .udesc = "Counts the number of unhalted reference clock cycles at TSC frequency.", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORE_P", .udesc = "Counts the number of unhalted core clock cycles.", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORE", .udesc = "Counts the number of unhalted core clock cycles.", .uequiv = "CORE_P", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO, }, { .uname = "REF_TSC", .udesc = "Counts the number of unhalted reference clock cycles at TSC frequency. (Fixed event)", .ucode = 0x0300ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_CODE_OVERRIDE, }, }; static const intel_x86_umask_t intel_tmt_inst_retired[]={ { .uname = "ANY_P", .udesc = "Counts the number of instructions retired.", .ucode = 0x0000ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY", .udesc = "Counts the number of instructions retired. 
(Fixed event)", .ucode = 0x0100ull, .uflags = INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_entry_t intel_tmt_pe[]={ { .name = "UNHALTED_CORE_CYCLES", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted)", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x20000000full, .code = 0x3c, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles", .modmsk = INTEL_FIXED3_ATTRS, .cntmsk = 0x400000000ull, .code = 0x0300, /* pseudo encoding */ .flags = INTEL_X86_FIXED, }, { .name = "INSTRUCTION_RETIRED", .desc = "Number of instructions at retirement", .modmsk = INTEL_V2_ATTRS, .cntmsk = 0x10000000full, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "This is an alias for INSTRUCTION_RETIRED", .modmsk = INTEL_V2_ATTRS, .equiv = "INSTRUCTION_RETIRED", .cntmsk = 0x10000000full, .code = 0xc0, }, { .name = "OCR", .desc = "Counts demand data reads that have any response type.", .equiv = "OFFCORE_RESPONSE_0", .code = 0x01b7, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_tmt_ocr), .umasks = intel_tmt_ocr, }, { .name = "OFFCORE_RESPONSE_0", .desc = "Counts demand data reads that have any response type.", .code = 0x01b7, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_tmt_ocr), .umasks = intel_tmt_ocr, }, { .name = "OFFCORE_RESPONSE_1", .desc = "Counts demand data reads that have any response type.", .code = 0x02b7, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_NHM_OFFCORE, .numasks= LIBPFM_ARRAY_SIZE(intel_tmt_ocr), .umasks = intel_tmt_ocr, }, { .name = "MEM_LOAD_UOPS_RETIRED", .desc = "Counts the number of load uops retired that hit the level 1 data cache", .code = 0x00d1, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_tmt_mem_load_uops_retired), .umasks = 
intel_tmt_mem_load_uops_retired, }, { .name = "MEM_UOPS_RETIRED", .desc = "Counts the number of load uops retired.", .code = 0x00d0, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_tmt_mem_uops_retired), .umasks = intel_tmt_mem_uops_retired, }, { .name = "CYCLES_DIV_BUSY", .desc = "Counts cycles the floating point divider or integer divider or both are busy. Does not imply a stall waiting for either divider.", .code = 0x00cd, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_tmt_cycles_div_busy), .umasks = intel_tmt_cycles_div_busy, }, { .name = "BR_MISP_RETIRED", .desc = "Counts the number of mispredicted branch instructions retired.", .code = 0x00c5, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_tmt_br_misp_retired), .umasks = intel_tmt_br_misp_retired, }, { .name = "BR_INST_RETIRED", .desc = "Counts the number of branch instructions retired for all branch types.", .code = 0x00c4, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = INTEL_X86_PEBS, .numasks= LIBPFM_ARRAY_SIZE(intel_tmt_br_inst_retired), .umasks = intel_tmt_br_inst_retired, }, { .name = "MACHINE_CLEARS", .desc = "Counts all machine clears due to, but not limited to memory ordering, memory disambiguation, SMC, page faults and FP assist.", .code = 0x00c3, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_tmt_machine_clears), .umasks = intel_tmt_machine_clears, }, { .name = "ITLB_MISSES", .desc = "Page walk completed due to an instruction fetch in a 4K page.", .code = 0x0085, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_tmt_itlb_misses), .umasks = intel_tmt_itlb_misses, }, { .name = "ITLB", .desc = "Counts the number of times there was an ITLB miss and a new translation was filled into the ITLB.", .code 
= 0x0081, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_tmt_itlb), .umasks = intel_tmt_itlb, }, { .name = "ICACHE", .desc = "Counts requests to the Instruction Cache (ICache) for one or more bytes in a cache line that do not hit in the ICache (miss).", .code = 0x0080, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_tmt_icache), .umasks = intel_tmt_icache, }, { .name = "DTLB_STORE_MISSES", .desc = "Page walk completed due to a demand data store to a 4K page.", .code = 0x0049, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_tmt_dtlb_store_misses), .umasks = intel_tmt_dtlb_store_misses, }, { .name = "LONGEST_LAT_CACHE", .desc = "Counts memory requests originating from the core that miss in the last level cache. If the platform has an L3 cache, last level cache is the L3, otherwise it is the L2.", .code = 0x002e, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_tmt_longest_lat_cache), .umasks = intel_tmt_longest_lat_cache, }, { .name = "DTLB_LOAD_MISSES", .desc = "Page walk completed due to a demand load to a 4K page.", .code = 0x0008, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_tmt_dtlb_load_misses), .umasks = intel_tmt_dtlb_load_misses, }, { .name = "CPU_CLK_UNHALTED", .desc = "Counts the number of unhalted core clock cycles. 
(Fixed event)", .code = 0x003c, .modmsk = INTEL_V2_ATTRS, .cntmsk = 0xfull, .ngrp = 1, .flags = 0, .numasks= LIBPFM_ARRAY_SIZE(intel_tmt_cpu_clk_unhalted), .umasks = intel_tmt_cpu_clk_unhalted, }, }; /* 15 events available */ papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_wsm_events.h000066400000000000000000002333321502707512200236020ustar00rootroot00000000000000/* * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. 
* * PMU: wsm (Intel Westmere (single-socket)) */ static const intel_x86_umask_t wsm_uops_decoded[]={ { .uname = "ESP_FOLDING", .udesc = "Stack pointer instructions decoded", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ESP_SYNC", .udesc = "Stack pointer sync operations", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS", .udesc = "Counts the number of uops decoded by the Microcode Sequencer (MS). The MS delivers uops when the instruction is more than 4 uops long or a microcode assist is occurring.", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MS_CYCLES_ACTIVE", .udesc = "Uops decoded by Microcode Sequencer", .uequiv = "MS:c=1", .ucode = 0x200 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "STALL_CYCLES", .udesc = "Cycles no Uops are decoded", .ucode = 0x100 | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_bpu_clears[]={ { .uname = "EARLY", .udesc = "Early Branch Prediction Unit clears", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LATE", .udesc = "Late Branch Prediction Unit clears", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_uops_retired[]={ { .uname = "ANY", .udesc = "Uops retired (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "MACRO_FUSED", .udesc = "Macro-fused Uops retired (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "RETIRE_SLOTS", .udesc = "Retirement slots used (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STALL_CYCLES", .udesc = "Cycles Uops are not retiring (Precise Event)", .uequiv = "ANY:c=1:i=1", .ucode = 0x100 | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "TOTAL_CYCLES", .udesc = "Total cycles using 
precise uop retired event (Precise Event)", .uequiv = "ANY:c=16:i=1", .ucode = 0x100 | INTEL_X86_MOD_INV | (0x10 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, }, { .uname = "ACTIVE_CYCLES", .udesc = "Alias for TOTAL_CYCLES (Precise Event)", .uequiv = "ANY:c=1", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, .modhw = _INTEL_X86_ATTR_C, }, }; static const intel_x86_umask_t wsm_br_misp_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "Mispredicted retired branch instructions (Precise Event)", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_CALL", .udesc = "Mispredicted near retired calls (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "CONDITIONAL", .udesc = "Mispredicted conditional branches retired (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t wsm_ept[]={ { .uname = "WALK_CYCLES", .udesc = "Extended Page Table walk cycles", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_uops_executed[]={ { .uname = "PORT0", .udesc = "Uops executed on port 0 (integer arithmetic, SIMD and FP add uops)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT1", .udesc = "Uops executed on port 1 (integer arithmetic, SIMD, integer shift, FP multiply, FP divide uops)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT2_CORE", .udesc = "Uops executed on port 2 on any thread (load uops) (core count only)", .ucode = 0x400 | INTEL_X86_MOD_ANY, .modhw = _INTEL_X86_ATTR_T, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT3_CORE", .udesc = "Uops executed on port 3 on any thread (store uops) (core count only)", .ucode = 0x800 | INTEL_X86_MOD_ANY, .modhw = _INTEL_X86_ATTR_T, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT4_CORE", .udesc = "Uops executed on port 4 on any 
thread (handle store values for stores on port 3) (core count only)", .ucode = 0x1000 | INTEL_X86_MOD_ANY, .modhw = _INTEL_X86_ATTR_T, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT5", .udesc = "Uops executed on port 5", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT015", .udesc = "Uops issued on ports 0, 1 or 5", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT234_CORE", .udesc = "Uops issued on ports 2, 3 or 4 on any thread (core count only)", .ucode = 0x8000 | INTEL_X86_MOD_ANY, .modhw = _INTEL_X86_ATTR_T, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PORT015_STALL_CYCLES", .udesc = "Cycles no Uops issued on ports 0, 1 or 5", .uequiv = "PORT015:c=1:i=1", .ucode = 0x4000 | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "CORE_ACTIVE_CYCLES_NO_PORT5", .udesc = "Cycles in which uops are executed only on port0-4 on any thread (core count only)", .ucode = 0x1f00 | INTEL_X86_MOD_ANY | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_T, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CORE_ACTIVE_CYCLES", .udesc = "Cycles in which uops are executed on any port any thread (core count only)", .ucode = 0x3f00 | INTEL_X86_MOD_ANY | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_T, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CORE_STALL_CYCLES", .udesc = "Cycles in which no uops are executed on any port any thread (core count only)", .ucode = 0x3f00 | INTEL_X86_MOD_ANY | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_T | _INTEL_X86_ATTR_I, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CORE_STALL_CYCLES_NO_PORT5", .udesc = "Cycles in which no uops are executed on any port0-4 on any thread (core count only)", .ucode = 0x1f00 | INTEL_X86_MOD_ANY | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_C | _INTEL_X86_ATTR_T | _INTEL_X86_ATTR_I, .uflags= INTEL_X86_NCOMBO, }, { .uname = 
"CORE_STALL_COUNT", .udesc = "Number of transitions from stalled to uops to execute on any port any thread (core count only)", .uequiv = "CORE_STALL_CYCLES:e:t:i:c=1", .ucode = 0x3f00 | INTEL_X86_MOD_EDGE | INTEL_X86_MOD_ANY | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .modhw = _INTEL_X86_ATTR_E | _INTEL_X86_ATTR_T | _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "CORE_STALL_COUNT_NO_PORT5", .udesc = "Number of transitions from stalled to uops to execute on ports 0-4 on any thread (core count only)", .uequiv = "CORE_STALL_CYCLES_NO_PORT5:e:t:i:c=1", .ucode = 0x1f00 | INTEL_X86_MOD_EDGE | INTEL_X86_MOD_ANY | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_inst_retired[]={ { .uname = "ANY_P", .udesc = "Instructions Retired (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "ANY", .udesc = "Instructions Retired (Precise Event)", .ucode = 0x100, .uequiv = "ANY_P", .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "X87", .udesc = "Retired floating-point operations (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "MMX", .udesc = "Retired MMX instructions (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "TOTAL_CYCLES", .udesc = "Total cycles (Precise Event)", .uequiv = "ANY_P:c=16:i=1", .ucode = 0x100 | INTEL_X86_MOD_INV | (0x10 << INTEL_X86_CMASK_BIT), /* inv=1, cmask=16 */ .modhw = _INTEL_X86_ATTR_I | _INTEL_X86_ATTR_C, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t wsm_ild_stall[]={ { .uname = "ANY", .udesc = "Any Instruction Length Decoder stall cycles", .uequiv = "IQ_FULL:LCP:MRU:REGEN", .ucode = 0xf00, .uflags= INTEL_X86_DFL, }, { .uname = "IQ_FULL", .udesc = "Instruction Queue full stall cycles", .ucode = 0x400, }, { .uname = "LCP", .udesc = "Length Change Prefix stall 
cycles", .ucode = 0x100, }, { .uname = "MRU", .udesc = "Stall cycles due to BPU MRU bypass", .ucode = 0x200, }, { .uname = "REGEN", .udesc = "Regen stall cycles", .ucode = 0x800, }, }; static const intel_x86_umask_t wsm_dtlb_load_misses[]={ { .uname = "ANY", .udesc = "DTLB load misses", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "PDE_MISS", .udesc = "DTLB load miss caused by low part of address", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT", .udesc = "DTLB second level hit", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "DTLB load miss page walks complete", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_CYCLES", .udesc = "DTLB load miss page walk cycles", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LARGE_WALK_COMPLETED", .udesc = "DTLB load miss large page walk cycles", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_l2_lines_in[]={ { .uname = "ANY", .udesc = "L2 lines allocated", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "E_STATE", .udesc = "L2 lines allocated in the E state", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "S_STATE", .udesc = "L2 lines allocated in the S state", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_ssex_uops_retired[]={ { .uname = "PACKED_DOUBLE", .udesc = "SIMD Packed-Double Uops retired (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "PACKED_SINGLE", .udesc = "SIMD Packed-Single Uops retired (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SCALAR_DOUBLE", .udesc = "SIMD Scalar-Double Uops retired (Precise Event)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "SCALAR_SINGLE", .udesc = "SIMD Scalar-Single Uops retired (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | 
INTEL_X86_PEBS, }, { .uname = "VECTOR_INTEGER", .udesc = "SIMD Vector Integer Uops retired (Precise Event)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t wsm_store_blocks[]={ { .uname = "AT_RET", .udesc = "Loads delayed with at-Retirement block code", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L1D_BLOCK", .udesc = "Cacheable loads delayed with L1D block code", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_fp_mmx_trans[]={ { .uname = "ANY", .udesc = "All Floating Point to and from MMX transitions", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "TO_FP", .udesc = "Transitions from MMX to Floating Point instructions", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TO_MMX", .udesc = "Transitions from Floating Point to MMX instructions", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_cache_lock_cycles[]={ { .uname = "L1D", .udesc = "Cycles L1D locked", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L1D_L2", .udesc = "Cycles L1D and L2 locked", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_l3_lat_cache[]={ { .uname = "MISS", .udesc = "Last level cache miss", .ucode = 0x4100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "REFERENCE", .udesc = "Last level cache reference", .ucode = 0x4f00, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_simd_int_64[]={ { .uname = "PACK", .udesc = "SIMD integer 64 bit pack operations", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_ARITH", .udesc = "SIMD integer 64 bit arithmetic operations", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_LOGICAL", .udesc = "SIMD integer 64 bit logical operations", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_MPY", .udesc = "SIMD integer 64 bit packed multiply operations", .ucode = 0x100, .uflags= 
INTEL_X86_NCOMBO, }, { .uname = "PACKED_SHIFT", .udesc = "SIMD integer 64 bit shift operations", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SHUFFLE_MOVE", .udesc = "SIMD integer 64 bit shuffle/move operations", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "UNPACK", .udesc = "SIMD integer 64 bit unpack operations", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_br_misp_exec[]={ { .uname = "ANY", .udesc = "Mispredicted branches executed", .ucode = 0x7f00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "COND", .udesc = "Mispredicted conditional branches executed", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DIRECT", .udesc = "Mispredicted unconditional branches executed", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DIRECT_NEAR_CALL", .udesc = "Mispredicted direct near call branches executed", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "INDIRECT_NEAR_CALL", .udesc = "Mispredicted indirect call branches executed", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "INDIRECT_NON_CALL", .udesc = "Mispredicted indirect non call branches executed", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "NEAR_CALLS", .udesc = "Mispredicted call branches executed", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "NON_CALLS", .udesc = "Mispredicted non call branches executed", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RETURN_NEAR", .udesc = "Mispredicted return branches executed", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN", .udesc = "Mispredicted taken branches executed", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_baclear[]={ { .uname = "BAD_TARGET", .udesc = "BACLEAR asserted with bad target address", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CLEAR", .udesc = "BACLEAR asserted, regardless of cause", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const 
intel_x86_umask_t wsm_dtlb_misses[]={ { .uname = "ANY", .udesc = "DTLB misses", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "LARGE_WALK_COMPLETED", .udesc = "DTLB miss large page walks", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT", .udesc = "DTLB first level misses but second level hit", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_COMPLETED", .udesc = "DTLB miss page walks", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_CYCLES", .udesc = "DTLB miss page walk cycles", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PDE_MISS", .udesc = "DTLB miss caused by low part of address", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_mem_inst_retired[]={ { .uname = "LATENCY_ABOVE_THRESHOLD", .udesc = "Memory instructions retired above programmed clocks, minimum threshold value is 3, (Precise Event and ldlat required)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_LDLAT, }, { .uname = "LOADS", .udesc = "Instructions retired which contains a load (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "STORES", .udesc = "Instructions retired which contains a store (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t wsm_uops_issued[]={ { .uname = "ANY", .udesc = "Uops issued", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "STALL_CYCLES", .udesc = "Cycles stalled no issued uops", .uequiv = "ANY:c=1:i=1", .ucode = 0x100 | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "FUSED", .udesc = "Fused Uops issued", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CYCLES_ALL_THREADS", .udesc = "Cycles uops issued on either threads (core count)", .uequiv = "ANY:c=1:t=1", .ucode = 0x100 | INTEL_X86_MOD_ANY | (0x1 << INTEL_X86_CMASK_BIT), .uflags= 
INTEL_X86_NCOMBO, }, { .uname = "CORE_STALL_CYCLES", .udesc = "Cycles no uops issued on any threads (core count)", .uequiv = "ANY:c=1:i=1:t=1", .ucode = 0x100 | INTEL_X86_MOD_ANY | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_l2_rqsts[]={ { .uname = "IFETCH_HIT", .udesc = "L2 instruction fetch hits", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "IFETCH_MISS", .udesc = "L2 instruction fetch misses", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "IFETCHES", .udesc = "L2 instruction fetches", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LD_HIT", .udesc = "L2 load hits", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LD_MISS", .udesc = "L2 load misses", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOADS", .udesc = "L2 requests", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MISS", .udesc = "All L2 misses", .ucode = 0xaa00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_HIT", .udesc = "L2 prefetch hits", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_MISS", .udesc = "L2 prefetch misses", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCHES", .udesc = "All L2 prefetches", .ucode = 0xc000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "REFERENCES", .udesc = "All L2 requests", .ucode = 0xff00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT", .udesc = "L2 RFO hits", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_MISS", .udesc = "L2 RFO misses", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFOS", .udesc = "L2 RFO requests", .ucode = 0xc00, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_load_dispatch[]={ { .uname = "ANY", .udesc = "All loads dispatched", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "RS", .udesc = "Number of loads dispatched from the Reservation Station (RS) that bypass the Memory Order Buffer", 
.ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RS_DELAYED", .udesc = "Number of delayed RS dispatches at the stage latch", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MOB", .udesc = "Number of loads dispatched from the Memory Order Buffer (MOB)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_snoopq_requests[]={ { .uname = "CODE", .udesc = "Snoop code requests", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DATA", .udesc = "Snoop data requests", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "INVALIDATE", .udesc = "Snoop invalidate requests", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_offcore_requests[]={ { .uname = "ANY", .udesc = "All offcore requests", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "ANY_READ", .udesc = "Offcore read requests", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_RFO", .udesc = "Offcore RFO requests", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_READ_CODE", .udesc = "Offcore demand code read requests", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_READ_DATA", .udesc = "Offcore demand data read requests", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO", .udesc = "Offcore demand RFO requests", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L1D_WRITEBACK", .udesc = "Offcore L1 data cache writebacks", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_load_block[]={ { .uname = "OVERLAP_STORE", .udesc = "Loads that partially overlap an earlier store", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_misalign_memory[]={ { .uname = "STORE", .udesc = "Store referenced with misaligned address", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_machine_clears[]={ { .uname = 
"MEM_ORDER", .udesc = "Execution pipeline restart due to Memory ordering conflicts ", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CYCLES", .udesc = "Cycles machine clear is asserted", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SMC", .udesc = "Self-modifying code detected", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_fp_comp_ops_exe[]={ { .uname = "MMX", .udesc = "MMX Uops", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_DOUBLE_PRECISION", .udesc = "SSE FP double precision Uops", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_FP", .udesc = "SSE and SSE2 FP Uops", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_FP_PACKED", .udesc = "SSE FP packed Uops", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_FP_SCALAR", .udesc = "SSE FP scalar Uops", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE_SINGLE_PRECISION", .udesc = "SSE FP single precision Uops", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SSE2_INTEGER", .udesc = "SSE2 integer Uops", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "X87", .udesc = "Computational floating-point operations executed", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_br_inst_retired[]={ { .uname = "ALL_BRANCHES", .udesc = "Retired branch instructions (Precise Event)", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "CONDITIONAL", .udesc = "Retired conditional branch instructions (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "NEAR_CALL", .udesc = "Retired near call instructions (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t wsm_large_itlb[]={ { .uname = "HIT", .udesc = "Large ITLB hit", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t 
wsm_lsd[]={ { .uname = "UOPS", .udesc = "Counts the number of micro-ops delivered by the LSD", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ACTIVE", .udesc = "Cycles in which at least one micro-op is delivered by the LSD", .uequiv = "UOPS:c=1", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "INACTIVE", .udesc = "Cycles in which no micro-op is delivered by the LSD", .uequiv = "UOPS:c=1:i=1", .ucode = 0x100 | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_l2_lines_out[]={ { .uname = "ANY", .udesc = "L2 lines evicted", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "DEMAND_CLEAN", .udesc = "L2 lines evicted by a demand request", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_DIRTY", .udesc = "L2 modified lines evicted by a demand request", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_CLEAN", .udesc = "L2 lines evicted by a prefetch request", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_DIRTY", .udesc = "L2 modified lines evicted by a prefetch request", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_itlb_misses[]={ { .uname = "ANY", .udesc = "ITLB miss", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "WALK_COMPLETED", .udesc = "ITLB miss page walks", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WALK_CYCLES", .udesc = "ITLB miss page walk cycles", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LARGE_WALK_COMPLETED", .udesc = "Number of completed large page walks due to misses in the STLB", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "STLB_HIT", .udesc = "ITLB misses hitting second level TLB", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_l1d_prefetch[]={ { .uname = "MISS", .udesc = "L1D hardware prefetch misses", .ucode = 0x200, .uflags= 
INTEL_X86_NCOMBO, }, { .uname = "REQUESTS", .udesc = "L1D hardware prefetch requests", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TRIGGERS", .udesc = "L1D hardware prefetch requests triggered", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_sq_misc[]={ { .uname = "LRU_HINTS", .udesc = "Super Queue LRU hints sent to LLC", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SPLIT_LOCK", .udesc = "Super Queue lock splits across a cache line", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_fp_assist[]={ { .uname = "ALL", .udesc = "All X87 Floating point assists (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, { .uname = "INPUT", .udesc = "X87 Floating point assists for invalid input value (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "OUTPUT", .udesc = "X87 Floating point assists for invalid output value (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t wsm_simd_int_128[]={ { .uname = "PACK", .udesc = "128 bit SIMD integer pack operations", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_ARITH", .udesc = "128 bit SIMD integer arithmetic operations", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_LOGICAL", .udesc = "128 bit SIMD integer logical operations", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_MPY", .udesc = "128 bit SIMD integer multiply operations", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PACKED_SHIFT", .udesc = "128 bit SIMD integer shift operations", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "SHUFFLE_MOVE", .udesc = "128 bit SIMD integer shuffle/move operations", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "UNPACK", .udesc = "128 bit SIMD integer unpack operations", .ucode = 0x800, .uflags= 
INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_offcore_requests_outstanding[]={ { .uname = "ANY_READ", .udesc = "Outstanding offcore reads", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_READ_CODE", .udesc = "Outstanding offcore demand code reads", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_READ_DATA", .udesc = "Outstanding offcore demand data reads", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_RFO", .udesc = "Outstanding offcore demand RFOs", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "ANY_READ_NOT_EMPTY", .udesc = "Number of cycles with offcore reads busy", .uequiv = "ANY_READ:c=1", .ucode = 0x800 | (0x1 << INTEL_X86_CMASK_BIT), /* cmask=1 */ .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READ_DATA_NOT_EMPTY", .udesc = "Number of cycles with offcore demand data reads busy", .uequiv = "DEMAND_READ_DATA:c=1", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT), /* cmask=1 */ .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "READ_CODE_NOT_EMPTY", .udesc = "Number of cycles with offcore code reads busy", .uequiv = "DEMAND_READ_CODE:c=1", .ucode = 0x200 | (0x1 << INTEL_X86_CMASK_BIT), /* cmask=1 */ .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, { .uname = "RFO_NOT_EMPTY", .udesc = "Number of cycles with offcore RFOs busy", .uequiv = "DEMAND_RFO:c=1", .ucode = 0x400 | (0x1 << INTEL_X86_CMASK_BIT), /* cmask=1 */ .modhw = _INTEL_X86_ATTR_C, .uflags = INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_mem_store_retired[]={ { .uname = "DTLB_MISS", .udesc = "Retired stores that miss the DTLB (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_inst_decoded[]={ { .uname = "DEC0", .udesc = "Instructions that must be decoded by decoder 0", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_macro_insts[]={ 
{ .uname = "DECODED", .udesc = "Instructions decoded", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_arith[]={ { .uname = "CYCLES_DIV_BUSY", .udesc = "Counts the number of cycles the divider is busy executing divide or square root operations. The divide can be integer, X87 or Streaming SIMD Extensions (SSE). The square root operation can be either X87 or SSE. Count may be incorrect when HT is on", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DIV", .udesc = "Counts the number of divide or square root operations. The divide can be integer, X87 or Streaming SIMD Extensions (SSE). The square root operation can be either X87 or SSE. Count may be incorrect when HT is on", .uequiv = "CYCLES_DIV_BUSY:c=1:i=1:e=1", .ucode = 0x100 | INTEL_X86_MOD_EDGE | INTEL_X86_MOD_INV | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "MUL", .udesc = "Counts the number of multiply operations executed. This includes integer as well as floating point multiply operations but excludes DPPS mul and MPSAD. 
Count may be incorrect when HT is on", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_l2_transactions[]={ { .uname = "ANY", .udesc = "All L2 transactions", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FILL", .udesc = "L2 fill transactions", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "IFETCH", .udesc = "L2 instruction fetch transactions", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "L1D_WB", .udesc = "L1D writeback to L2 transactions", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOAD", .udesc = "L2 Load transactions", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH", .udesc = "L2 prefetch transactions", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO", .udesc = "L2 RFO transactions", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "WB", .udesc = "L2 writeback to LLC transactions", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_sb_drain[]={ { .uname = "ANY", .udesc = "All Store buffer stall cycles", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_mem_uncore_retired[]={ { .uname = "LOCAL_HITM", .udesc = "Load instructions retired that HIT modified data in sibling core (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, .umodel = PFM_PMU_INTEL_WSM_DP, }, { .uname = "LOCAL_DRAM_AND_REMOTE_CACHE_HIT", .udesc = "Load instructions retired local dram and remote cache HIT data sources (Precise Event)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, .umodel = PFM_PMU_INTEL_WSM_DP, }, { .uname = "REMOTE_DRAM", .udesc = "Load instructions retired remote DRAM and remote home-remote cache HITM (Precise Event)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, .umodel = PFM_PMU_INTEL_WSM_DP, }, { .uname = "UNCACHEABLE", .udesc = "Load instructions retired IO (Precise Event)", .ucode = 0x8000, 
.uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "REMOTE_HITM", .udesc = "Retired loads that hit remote socket in modified state (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, .umodel = PFM_PMU_INTEL_WSM_DP, }, { .uname = "OTHER_LLC_MISS", .udesc = "Load instructions retired other LLC miss (Precise Event)", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, .umodel = PFM_PMU_INTEL_WSM_DP, }, { .uname = "UNKNOWN_SOURCE", .udesc = "Load instructions retired unknown LLC miss (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, .umodel = PFM_PMU_INTEL_WSM_DP, }, { .uname = "LOCAL_DRAM", .udesc = "Retired loads with a data source of local DRAM or locally homed remote cache HITM (Precise Event)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, .umodel = PFM_PMU_INTEL_WSM, }, { .uname = "OTHER_CORE_L2_HITM", .udesc = "Retired load instructions that hit modified data in a sibling core (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, .umodel = PFM_PMU_INTEL_WSM, }, { .uname = "REMOTE_CACHE_LOCAL_HOME_HIT", .udesc = "Retired load instructions with a remote cache HIT data source (Precise Event)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, .umodel = PFM_PMU_INTEL_WSM, }, { .uname = "REMOTE_DRAM", .udesc = "Retired load instructions with remote DRAM and remote home-remote cache HITM data sources (Precise Event)", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, .umodel = PFM_PMU_INTEL_WSM, }, }; static const intel_x86_umask_t wsm_l2_data_rqsts[]={ { .uname = "ANY", .udesc = "All L2 data requests", .ucode = 0xff00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "DEMAND_E_STATE", .udesc = "L2 data demand loads in E state", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_I_STATE", .udesc = "L2 data demand loads in I state (misses)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_M_STATE", .udesc = "L2 data
demand loads in M state", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_MESI", .udesc = "L2 data demand requests", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DEMAND_S_STATE", .udesc = "L2 data demand loads in S state", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_E_STATE", .udesc = "L2 data prefetches in E state", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_I_STATE", .udesc = "L2 data prefetches in the I state (misses)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_M_STATE", .udesc = "L2 data prefetches in M state", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_MESI", .udesc = "All L2 data prefetches", .ucode = 0xf000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "PREFETCH_S_STATE", .udesc = "L2 data prefetches in the S state", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_br_inst_exec[]={ { .uname = "ANY", .udesc = "Branch instructions executed", .ucode = 0x7f00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "COND", .udesc = "Conditional branch instructions executed", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DIRECT", .udesc = "Unconditional branches executed", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DIRECT_NEAR_CALL", .udesc = "Unconditional call branches executed", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "INDIRECT_NEAR_CALL", .udesc = "Indirect call branches executed", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "INDIRECT_NON_CALL", .udesc = "Indirect non call branches executed", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "NEAR_CALLS", .udesc = "Call branches executed", .ucode = 0x3000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "NON_CALLS", .udesc = "All non call branches executed", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RETURN_NEAR", .udesc = "Indirect return branches executed", .ucode = 0x800, 
.uflags= INTEL_X86_NCOMBO, }, { .uname = "TAKEN", .udesc = "Taken branches executed", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_snoopq_requests_outstanding[]={ { .uname = "CODE", .udesc = "Outstanding snoop code requests", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "CODE_NOT_EMPTY", .udesc = "Cycles snoop code requests queue not empty", .uequiv = "CODE:c=1", .ucode = 0x400 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "DATA", .udesc = "Outstanding snoop data requests", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "DATA_NOT_EMPTY", .udesc = "Cycles snoop data requests queue not empty", .uequiv = "DATA:c=1", .ucode = 0x100 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, { .uname = "INVALIDATE", .udesc = "Outstanding snoop invalidate requests", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "INVALIDATE_NOT_EMPTY", .udesc = "Cycles snoop invalidate requests queue not empty", .uequiv = "INVALIDATE:c=1", .ucode = 0x200 | (0x1 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_mem_load_retired[]={ { .uname = "DTLB_MISS", .udesc = "Retired loads that miss the DTLB (Precise Event)", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "HIT_LFB", .udesc = "Retired loads that miss L1D and hit a previously allocated LFB (Precise Event)", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L1D_HIT", .udesc = "Retired loads that hit the L1 data cache (Precise Event)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L2_HIT", .udesc = "Retired loads that hit the L2 cache (Precise Event)", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_MISS", .udesc = "Retired loads that miss the LLC cache (Precise Event)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LLC_MISS", .udesc = "This is an alias
for L3_MISS", .uequiv = "L3_MISS", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "L3_UNSHARED_HIT", .udesc = "Retired loads that hit valid versions in the LLC cache (Precise Event)", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "LLC_UNSHARED_HIT", .udesc = "This is an alias for L3_UNSHARED_HIT", .uequiv = "L3_UNSHARED_HIT", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, { .uname = "OTHER_CORE_L2_HIT_HITM", .udesc = "Retired loads that hit sibling core's L2 in modified or unmodified states (Precise Event)", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO | INTEL_X86_PEBS, }, }; static const intel_x86_umask_t wsm_l1i[]={ { .uname = "CYCLES_STALLED", .udesc = "L1I instruction fetch stall cycles", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "HITS", .udesc = "L1I instruction fetch hits", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MISSES", .udesc = "L1I instruction fetch misses", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "READS", .udesc = "L1I Instruction fetches", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_l2_write[]={ { .uname = "LOCK_E_STATE", .udesc = "L2 demand lock RFOs in E state", .ucode = 0x4000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOCK_HIT", .udesc = "All demand L2 lock RFOs that hit the cache", .ucode = 0xe000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOCK_I_STATE", .udesc = "L2 demand lock RFOs in I state (misses)", .ucode = 0x1000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOCK_M_STATE", .udesc = "L2 demand lock RFOs in M state", .ucode = 0x8000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOCK_MESI", .udesc = "All demand L2 lock RFOs", .ucode = 0xf000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LOCK_S_STATE", .udesc = "L2 demand lock RFOs in S state", .ucode = 0x2000, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_HIT", .udesc = "All L2 demand store RFOs that hit the cache", .ucode = 0xe00, .uflags= 
INTEL_X86_NCOMBO, }, { .uname = "RFO_I_STATE", .udesc = "L2 demand store RFOs in I state (misses)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_M_STATE", .udesc = "L2 demand store RFOs in M state", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_MESI", .udesc = "All L2 demand store RFOs", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "RFO_S_STATE", .udesc = "L2 demand store RFOs in S state", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_snoop_response[]={ { .uname = "HIT", .udesc = "Thread responded HIT to snoop", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "HITE", .udesc = "Thread responded HITE to snoop", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "HITM", .udesc = "Thread responded HITM to snoop", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_l1d[]={ { .uname = "M_EVICT", .udesc = "L1D cache lines replaced in M state ", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "M_REPL", .udesc = "L1D cache lines allocated in the M state", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, { .uname = "M_SNOOP_EVICT", .udesc = "L1D snoop eviction of cache lines in M state", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "REPL", .udesc = "L1 data cache lines allocated", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_resource_stalls[]={ { .uname = "ANY", .udesc = "Resource related stall cycles", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FPCW", .udesc = "FPU control word write stall cycles", .ucode = 0x2000, }, { .uname = "LOAD", .udesc = "Load buffer stall cycles", .ucode = 0x200, }, { .uname = "MXCSR", .udesc = "MXCSR rename stall cycles", .ucode = 0x4000, }, { .uname = "OTHER", .udesc = "Other Resource related stall cycles", .ucode = 0x8000, }, { .uname = "ROB_FULL", .udesc = "ROB full stall cycles", .ucode = 0x1000, }, { .uname = "RS_FULL", .udesc = 
"Reservation Station full stall cycles", .ucode = 0x400, }, { .uname = "STORE", .udesc = "Store buffer stall cycles", .ucode = 0x800, }, }; static const intel_x86_umask_t wsm_rat_stalls[]={ { .uname = "ANY", .udesc = "All RAT stall cycles", .uequiv = "FLAGS:REGISTERS:ROB_READ_PORT:SCOREBOARD", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, { .uname = "FLAGS", .udesc = "Flag stall cycles", .ucode = 0x100, }, { .uname = "REGISTERS", .udesc = "Partial register stall cycles", .ucode = 0x200, }, { .uname = "ROB_READ_PORT", .udesc = "ROB read port stalls cycles", .ucode = 0x400, }, { .uname = "SCOREBOARD", .udesc = "Scoreboard stall cycles", .ucode = 0x800, }, }; static const intel_x86_umask_t wsm_cpu_clk_unhalted[]={ { .uname = "THREAD_P", .udesc = "Cycles when thread is not halted (programmable counter)", .ucode = 0x0, .uflags= INTEL_X86_NCOMBO, }, { .uname = "REF_P", .udesc = "Reference base clock (133 Mhz) cycles when thread is not halted", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "TOTAL_CYCLES", .udesc = "Total number of elapsed cycles. 
Does not work when C-state enabled", .uequiv = "THREAD_P:c=2:i=1", .ucode = 0x0 | INTEL_X86_MOD_INV | (0x2 << INTEL_X86_CMASK_BIT), .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_l1d_wb_l2[]={ { .uname = "E_STATE", .udesc = "L1 writebacks to L2 in E state", .ucode = 0x400, .uflags= INTEL_X86_NCOMBO, }, { .uname = "I_STATE", .udesc = "L1 writebacks to L2 in I state (misses)", .ucode = 0x100, .uflags= INTEL_X86_NCOMBO, }, { .uname = "M_STATE", .udesc = "L1 writebacks to L2 in M state", .ucode = 0x800, .uflags= INTEL_X86_NCOMBO, }, { .uname = "MESI", .udesc = "All L1 writebacks to L2", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO, }, { .uname = "S_STATE", .udesc = "L1 writebacks to L2 in S state", .ucode = 0x200, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_offcore_response_0[]={ { .uname = "DMND_DATA_RD", .udesc = "Request: counts the number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches", .ucode = 0x100, .grpid = 0, }, { .uname = "DMND_RFO", .udesc = "Request: counts the number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO", .ucode = 0x200, .grpid = 0, }, { .uname = "DMND_IFETCH", .udesc = "Request: counts the number of demand and DCU prefetch instruction cacheline reads. 
Does not count L2 code read prefetches", .ucode = 0x400, .grpid = 0, }, { .uname = "WB", .udesc = "Request: counts the number of writeback (modified to exclusive) transactions", .ucode = 0x800, .grpid = 0, }, { .uname = "PF_DATA_RD", .udesc = "Request: counts the number of data cacheline reads generated by L2 prefetchers", .ucode = 0x1000, .grpid = 0, }, { .uname = "PF_RFO", .udesc = "Request: counts the number of RFO requests generated by L2 prefetchers", .ucode = 0x2000, .grpid = 0, }, { .uname = "PF_IFETCH", .udesc = "Request: counts the number of code reads generated by L2 prefetchers", .ucode = 0x4000, .grpid = 0, }, { .uname = "OTHER", .udesc = "Request: counts one of the following transaction types, including L3 invalidate, I/O, full or partial writes, WC or non-temporal stores, CLFLUSH, Fences, lock, unlock, split lock", .ucode = 0x8000, .grpid = 0, }, { .uname = "ANY_IFETCH", .udesc = "Request: combination of PF_IFETCH | DMND_IFETCH", .uequiv = "PF_IFETCH:DMND_IFETCH", .ucode = 0x4400, .grpid = 0, }, { .uname = "ANY_REQUEST", .udesc = "Request: combination of all requests umasks", .uequiv = "DMND_DATA_RD:DMND_RFO:DMND_IFETCH:WB:PF_DATA_RD:PF_RFO:PF_IFETCH:OTHER", .ucode = 0xff00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "ANY_DATA", .udesc = "Request: any data read/write request", .uequiv = "DMND_DATA_RD:PF_DATA_RD:DMND_RFO:PF_RFO", .ucode = 0x3300, .grpid = 0, }, { .uname = "ANY_DATA_RD", .udesc = "Request: any data read in request", .uequiv = "DMND_DATA_RD:PF_DATA_RD", .ucode = 0x1100, .grpid = 0, }, { .uname = "ANY_RFO", .udesc = "Request: combination of DMND_RFO | PF_RFO", .uequiv = "DMND_RFO:PF_RFO", .ucode = 0x2200, .grpid = 0, }, { .uname = "UNCORE_HIT", .udesc = "Response: counts L3 Hit: local or remote home requests that hit L3 cache in the uncore with no coherency actions required (snooping)", .ucode = 0x10000, .grpid = 1, }, { .uname = "OTHER_CORE_HIT_SNP", .udesc = "Response: counts L3 Hit: local or remote home 
requests that hit L3 cache in the uncore and were serviced by another core with a cross core snoop where no modified copies were found (clean)", .ucode = 0x20000, .grpid = 1, }, { .uname = "OTHER_CORE_HITM", .udesc = "Response: counts L3 Hit: local or remote home requests that hit L3 cache in the uncore and were serviced by another core with a cross core snoop where modified copies were found (HITM)", .ucode = 0x40000, .grpid = 1, }, { .uname = "REMOTE_CACHE_HITM", .udesc = "Response: counts L3 Hit: local or remote home requests that hit a remote L3 cacheline in modified (HITM) state", .ucode = 0x80000, .grpid = 1, }, { .uname = "REMOTE_CACHE_FWD", .udesc = "Response: counts L3 Miss: local homed requests that missed the L3 cache and were serviced by forwarded data following a cross package snoop where no modified copies were found. (Remote home requests are not counted)", .ucode = 0x100000, .grpid = 1, .umodel = PFM_PMU_INTEL_WSM, }, { .uname = "LOCAL_DRAM_AND_REMOTE_CACHE_HIT", .udesc = "Response: counts L3 Miss: local home requests that missed the L3 cache and were serviced by local DRAM or a remote cache", .ucode = 0x100000, .grpid = 1, .umodel = PFM_PMU_INTEL_WSM_DP, }, { .uname = "REMOTE_DRAM", .udesc = "Response: counts L3 Miss: remote home requests that missed the L3 cache and were serviced by remote DRAM", .ucode = 0x200000, .grpid = 1, .umodel = PFM_PMU_INTEL_WSM_DP, }, { .uname = "LOCAL_DRAM", .udesc = "Response: counts L3 Miss: local home requests that missed the L3 cache and were serviced by local DRAM", .ucode = 0x200000, .grpid = 1, .umodel = PFM_PMU_INTEL_WSM, }, { .uname = "REMOTE_DRAM", .udesc = "Response: counts L3 Miss: remote home requests that missed the L3 cache and were serviced by remote DRAM", .ucode = 0x400000, .grpid = 1, .umodel = PFM_PMU_INTEL_WSM, }, { .uname = "OTHER_LLC_MISS", .udesc = "Response: counts L3 Miss: remote home requests that missed the L3 cache", .ucode = 0x400000, .grpid = 1, .umodel = PFM_PMU_INTEL_WSM_DP, }, { .uname =
"NON_DRAM", .udesc = "Response: Non-DRAM requests that were serviced by IOH", .ucode = 0x800000, .grpid = 1, }, { .uname = "ANY_CACHE_DRAM", .udesc = "Response: requests serviced by any source but IOH", .uequiv = "UNCORE_HIT:OTHER_CORE_HIT_SNP:OTHER_CORE_HITM:REMOTE_CACHE_FWD:REMOTE_CACHE_HITM:REMOTE_DRAM:LOCAL_DRAM", .ucode = 0x7f0000, .grpid = 1, .umodel = PFM_PMU_INTEL_WSM, }, { .uname = "ANY_CACHE_DRAM", .udesc = "Response: requests serviced by any source but IOH", .uequiv = "UNCORE_HIT:OTHER_CORE_HIT_SNP:OTHER_CORE_HITM:REMOTE_CACHE_HITM:OTHER_LLC_MISS:REMOTE_DRAM:LOCAL_DRAM_AND_REMOTE_CACHE_HIT", .ucode = 0x7f0000, .grpid = 1, .umodel = PFM_PMU_INTEL_WSM_DP, }, { .uname = "ANY_DRAM", .udesc = "Response: requests serviced by local or remote DRAM", .uequiv = "REMOTE_DRAM:LOCAL_DRAM", .ucode = 0x600000, .umodel = PFM_PMU_INTEL_WSM, .grpid = 1, }, { .uname = "ANY_LLC_MISS", .udesc = "Response: requests that missed in L3", .uequiv = "REMOTE_CACHE_HITM:REMOTE_CACHE_FWD:REMOTE_DRAM:LOCAL_DRAM:NON_DRAM", .ucode = 0xf80000, .grpid = 1, .umodel = PFM_PMU_INTEL_WSM, }, { .uname = "ANY_LLC_MISS", .udesc = "Response: requests that missed in L3", .uequiv = "REMOTE_CACHE_HITM:REMOTE_DRAM:OTHER_LLC_MISS:LOCAL_DRAM_AND_REMOTE_CACHE_HIT:NON_DRAM", .ucode = 0xf80000, .grpid = 1, .umodel = PFM_PMU_INTEL_WSM_DP, }, { .uname = "LOCAL_CACHE_DRAM", .udesc = "Response: requests hit local core or uncore caches or local DRAM", .uequiv = "UNCORE_HIT:OTHER_CORE_HIT_SNP:OTHER_CORE_HITM:LOCAL_DRAM", .ucode = 0x270000, .umodel = PFM_PMU_INTEL_WSM, .grpid = 1, }, { .uname = "REMOTE_CACHE_DRAM", .udesc = "Response: requests that miss L3 and hit remote caches or DRAM", .uequiv = "REMOTE_CACHE_HITM:REMOTE_CACHE_FWD:REMOTE_DRAM", .ucode = 0x580000, .grpid = 1, .umodel = PFM_PMU_INTEL_WSM, }, { .uname = "LOCAL_CACHE", .udesc = "Response: any local (core and socket) caches", .uequiv = "UNCORE_HIT:OTHER_CORE_HIT_SNP:OTHER_CORE_HITM", .ucode = 0x70000, .grpid = 1, }, { .uname = "ANY_RESPONSE", 
.udesc = "Response: combination of all response umasks", .uequiv = "UNCORE_HIT:OTHER_CORE_HIT_SNP:OTHER_CORE_HITM:REMOTE_CACHE_HITM:REMOTE_CACHE_FWD:REMOTE_DRAM:LOCAL_DRAM:NON_DRAM", .ucode = 0xff0000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, .umodel = PFM_PMU_INTEL_WSM, }, { .uname = "ANY_RESPONSE", .udesc = "Response: combination of all response umasks", .uequiv = "UNCORE_HIT:OTHER_CORE_HIT_SNP:OTHER_CORE_HITM:REMOTE_CACHE_HITM:REMOTE_DRAM:OTHER_LLC_MISS:LOCAL_DRAM_AND_REMOTE_CACHE_HIT:NON_DRAM", .ucode = 0xff0000, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, .umodel = PFM_PMU_INTEL_WSM_DP, }, }; static const intel_x86_entry_t intel_wsm_pe[]={ { .name = "UNHALTED_CORE_CYCLES", .desc = "Count core clock cycles whenever the clock signal on the specific core is running (not halted).", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x20000000full, .code = 0x3c, }, { .name = "INSTRUCTION_RETIRED", .desc = "Count the number of instructions at retirement.", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x10000000full, .code = 0xc0, }, { .name = "INSTRUCTIONS_RETIRED", .desc = "This is an alias for INSTRUCTION_RETIRED", .modmsk = INTEL_V3_ATTRS, .equiv = "INSTRUCTION_RETIRED", .cntmsk = 0x10000000full, .code = 0xc0, }, { .name = "UNHALTED_REFERENCE_CYCLES", .desc = "Unhalted reference cycles", .modmsk = INTEL_FIXED3_ATTRS, .cntmsk = 0x400000000ull, .code = 0x0300, /* pseudo encoding */ .flags = INTEL_X86_FIXED, }, { .name = "LLC_REFERENCES", .desc = "Count each request originating from the core to reference a cache line in the last level cache. 
The count may include speculation, but excludes cache line fills due to hardware prefetch (Alias for L3_LAT_CACHE:REFERENCE).", .modmsk = INTEL_V3_ATTRS, .equiv = "L3_LAT_CACHE:REFERENCE", .cntmsk = 0xf, .code = 0x4f2e, }, { .name = "LAST_LEVEL_CACHE_REFERENCES", .desc = "This is an alias for L3_LAT_CACHE:REFERENCE", .modmsk = INTEL_V3_ATTRS, .equiv = "L3_LAT_CACHE:REFERENCE", .cntmsk = 0xf, .code = 0x4f2e, }, { .name = "LLC_MISSES", .desc = "Count each cache miss condition for references to the last level cache. The event count may include speculation, but excludes cache line fills due to hardware prefetch (Alias for L3_LAT_CACHE:MISS)", .modmsk = INTEL_V3_ATTRS, .equiv = "L3_LAT_CACHE:MISS", .cntmsk = 0xf, .code = 0x412e, }, { .name = "LAST_LEVEL_CACHE_MISSES", .desc = "This is an alias for L3_LAT_CACHE:MISS", .modmsk = INTEL_V3_ATTRS, .equiv = "L3_LAT_CACHE:MISS", .cntmsk = 0xf, .code = 0x412e, }, { .name = "BRANCH_INSTRUCTIONS_RETIRED", .desc = "Count branch instructions at retirement. 
Specifically, this event counts the retirement of the last micro-op of a branch instruction.", .modmsk = INTEL_V3_ATTRS, .equiv = "BR_INST_RETIRED:ALL_BRANCHES", .cntmsk = 0xf, .code = 0x4c4, }, { .name = "UOPS_DECODED", .desc = "Micro-ops decoded", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xd1, .numasks = LIBPFM_ARRAY_SIZE(wsm_uops_decoded), .ngrp = 1, .umasks = wsm_uops_decoded, }, { .name = "L1D_CACHE_LOCK_FB_HIT", .desc = "L1D cacheable load lock speculated or retired accepted into the fill buffer", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x152, }, { .name = "BPU_CLEARS", .desc = "Branch Prediction Unit clears", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xe8, .numasks = LIBPFM_ARRAY_SIZE(wsm_bpu_clears), .ngrp = 1, .umasks = wsm_bpu_clears, }, { .name = "UOPS_RETIRED", .desc = "Cycles Uops are being retired (Precise Event)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xc2, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(wsm_uops_retired), .ngrp = 1, .umasks = wsm_uops_retired, }, { .name = "BR_MISP_RETIRED", .desc = "Mispredicted retired branches (Precise Event)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xc5, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(wsm_br_misp_retired), .ngrp = 1, .umasks = wsm_br_misp_retired, }, { .name = "EPT", .desc = "Extended Page Table", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x4f, .numasks = LIBPFM_ARRAY_SIZE(wsm_ept), .ngrp = 1, .umasks = wsm_ept, }, { .name = "UOPS_EXECUTED", .desc = "Micro-ops executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xb1, .numasks = LIBPFM_ARRAY_SIZE(wsm_uops_executed), .ngrp = 1, .umasks = wsm_uops_executed, }, { .name = "IO_TRANSACTIONS", .desc = "I/O transactions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x16c, }, { .name = "ES_REG_RENAMES", .desc = "ES segment renames", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1d5, }, { .name = "INST_RETIRED", .desc = "Instructions retired (Precise Event)", .modmsk = 
INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xc0, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(wsm_inst_retired), .ngrp = 1, .umasks = wsm_inst_retired, }, { .name = "ILD_STALL", .desc = "Instruction Length Decoder stalls", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x87, .numasks = LIBPFM_ARRAY_SIZE(wsm_ild_stall), .ngrp = 1, .umasks = wsm_ild_stall, }, { .name = "DTLB_LOAD_MISSES", .desc = "DTLB load misses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x8, .numasks = LIBPFM_ARRAY_SIZE(wsm_dtlb_load_misses), .ngrp = 1, .umasks = wsm_dtlb_load_misses, }, { .name = "L2_LINES_IN", .desc = "L2 lines allocated", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xf1, .numasks = LIBPFM_ARRAY_SIZE(wsm_l2_lines_in), .ngrp = 1, .umasks = wsm_l2_lines_in, }, { .name = "SSEX_UOPS_RETIRED", .desc = "SIMD micro-ops retired (Precise Event)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xc7, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(wsm_ssex_uops_retired), .ngrp = 1, .umasks = wsm_ssex_uops_retired, }, { .name = "STORE_BLOCKS", .desc = "Load delayed by block code", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x6, .numasks = LIBPFM_ARRAY_SIZE(wsm_store_blocks), .ngrp = 1, .umasks = wsm_store_blocks, }, { .name = "FP_MMX_TRANS", .desc = "Floating Point to and from MMX transitions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xcc, .numasks = LIBPFM_ARRAY_SIZE(wsm_fp_mmx_trans), .ngrp = 1, .umasks = wsm_fp_mmx_trans, }, { .name = "CACHE_LOCK_CYCLES", .desc = "Cache locked", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x63, .numasks = LIBPFM_ARRAY_SIZE(wsm_cache_lock_cycles), .ngrp = 1, .umasks = wsm_cache_lock_cycles, }, { .name = "OFFCORE_REQUESTS_SQ_FULL", .desc = "Offcore requests blocked due to Super Queue full", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1b2, }, { .name = "LONGEST_LAT_CACHE", .desc = "Last level cache accesses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x2e, .numasks = 
LIBPFM_ARRAY_SIZE(wsm_l3_lat_cache), .ngrp = 1, .umasks = wsm_l3_lat_cache, }, { .name = "L3_LAT_CACHE", .desc = "Last level cache accesses", .equiv = "LONGEST_LAT_CACHE", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x2e, .numasks = LIBPFM_ARRAY_SIZE(wsm_l3_lat_cache), .ngrp = 1, .umasks = wsm_l3_lat_cache, }, { .name = "SIMD_INT_64", .desc = "SIMD 64-bit integer operations", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xfd, .numasks = LIBPFM_ARRAY_SIZE(wsm_simd_int_64), .ngrp = 1, .umasks = wsm_simd_int_64, }, { .name = "BR_INST_DECODED", .desc = "Branch instructions decoded", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1e0, }, { .name = "BR_MISP_EXEC", .desc = "Mispredicted branches executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x89, .numasks = LIBPFM_ARRAY_SIZE(wsm_br_misp_exec), .ngrp = 1, .umasks = wsm_br_misp_exec, }, { .name = "SQ_FULL_STALL_CYCLES", .desc = "Super Queue full stall cycles", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1f6, }, { .name = "BACLEAR", .desc = "Branch address calculator clears", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xe6, .numasks = LIBPFM_ARRAY_SIZE(wsm_baclear), .ngrp = 1, .umasks = wsm_baclear, }, { .name = "DTLB_MISSES", .desc = "Data TLB misses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x49, .numasks = LIBPFM_ARRAY_SIZE(wsm_dtlb_misses), .ngrp = 1, .umasks = wsm_dtlb_misses, }, { .name = "MEM_INST_RETIRED", .desc = "Memory instructions retired (Precise Event)", .modmsk = INTEL_V3_ATTRS | _INTEL_X86_ATTR_LDLAT, .cntmsk = 0xf, .code = 0xb, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(wsm_mem_inst_retired), .ngrp = 1, .umasks = wsm_mem_inst_retired, }, { .name = "UOPS_ISSUED", .desc = "Uops issued", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xe, .numasks = LIBPFM_ARRAY_SIZE(wsm_uops_issued), .ngrp = 1, .umasks = wsm_uops_issued, }, { .name = "L2_RQSTS", .desc = "L2 requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x24, .numasks = 
LIBPFM_ARRAY_SIZE(wsm_l2_rqsts), .ngrp = 1, .umasks = wsm_l2_rqsts, }, { .name = "TWO_UOP_INSTS_DECODED", .desc = "Two Uop instructions decoded", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x119, }, { .name = "LOAD_DISPATCH", .desc = "Loads dispatched", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x13, .numasks = LIBPFM_ARRAY_SIZE(wsm_load_dispatch), .ngrp = 1, .umasks = wsm_load_dispatch, }, { .name = "BACLEAR_FORCE_IQ", .desc = "BACLEAR forced by Instruction queue", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1a7, }, { .name = "SNOOPQ_REQUESTS", .desc = "Snoopq requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xb4, .numasks = LIBPFM_ARRAY_SIZE(wsm_snoopq_requests), .ngrp = 1, .umasks = wsm_snoopq_requests, }, { .name = "OFFCORE_REQUESTS", .desc = "Offcore requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xb0, .numasks = LIBPFM_ARRAY_SIZE(wsm_offcore_requests), .ngrp = 1, .umasks = wsm_offcore_requests, }, { .name = "LOAD_BLOCK", .desc = "Loads blocked", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x3, .numasks = LIBPFM_ARRAY_SIZE(wsm_load_block), .ngrp = 1, .umasks = wsm_load_block, }, { .name = "MISALIGN_MEMORY", .desc = "Misaligned accesses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x5, .numasks = LIBPFM_ARRAY_SIZE(wsm_misalign_memory), .ngrp = 1, .umasks = wsm_misalign_memory, }, { .name = "INST_QUEUE_WRITE_CYCLES", .desc = "Cycles instructions are written to the instruction queue", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x11e, }, { .name = "LSD_OVERFLOW", .desc = "Number of loops that cannot stream from the instruction queue.", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x120, }, { .name = "MACHINE_CLEARS", .desc = "Machine clear asserted", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xc3, .numasks = LIBPFM_ARRAY_SIZE(wsm_machine_clears), .ngrp = 1, .umasks = wsm_machine_clears, }, { .name = "FP_COMP_OPS_EXE", .desc = "SSE/MMX micro-ops", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code 
= 0x10, .numasks = LIBPFM_ARRAY_SIZE(wsm_fp_comp_ops_exe), .ngrp = 1, .umasks = wsm_fp_comp_ops_exe, }, { .name = "ITLB_FLUSH", .desc = "ITLB flushes", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1ae, }, { .name = "BR_INST_RETIRED", .desc = "Retired branch instructions (Precise Event)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xc4, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(wsm_br_inst_retired), .ngrp = 1, .umasks = wsm_br_inst_retired, }, { .name = "L1D_CACHE_PREFETCH_LOCK_FB_HIT", .desc = "L1D prefetch load lock accepted in fill buffer", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x152, }, { .name = "LARGE_ITLB", .desc = "Large ITLB accesses", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x82, .numasks = LIBPFM_ARRAY_SIZE(wsm_large_itlb), .ngrp = 1, .umasks = wsm_large_itlb, }, { .name = "LSD", .desc = "Loop stream detector", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xa8, .numasks = LIBPFM_ARRAY_SIZE(wsm_lsd), .ngrp = 1, .umasks = wsm_lsd, }, { .name = "L2_LINES_OUT", .desc = "L2 lines evicted", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xf2, .numasks = LIBPFM_ARRAY_SIZE(wsm_l2_lines_out), .ngrp = 1, .umasks = wsm_l2_lines_out, }, { .name = "ITLB_MISSES", .desc = "ITLB miss", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x85, .numasks = LIBPFM_ARRAY_SIZE(wsm_itlb_misses), .ngrp = 1, .umasks = wsm_itlb_misses, }, { .name = "L1D_PREFETCH", .desc = "L1D hardware prefetch", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x4e, .numasks = LIBPFM_ARRAY_SIZE(wsm_l1d_prefetch), .ngrp = 1, .umasks = wsm_l1d_prefetch, }, { .name = "SQ_MISC", .desc = "Super Queue miscellaneous", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xf4, .numasks = LIBPFM_ARRAY_SIZE(wsm_sq_misc), .ngrp = 1, .umasks = wsm_sq_misc, }, { .name = "SEG_RENAME_STALLS", .desc = "Segment rename stall cycles", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1d4, }, { .name = "FP_ASSIST", .desc = "X87 Floating point assists (Precise Event)", 
.modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xf7, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(wsm_fp_assist), .ngrp = 1, .umasks = wsm_fp_assist, }, { .name = "SIMD_INT_128", .desc = "128 bit SIMD operations", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x12, .numasks = LIBPFM_ARRAY_SIZE(wsm_simd_int_128), .ngrp = 1, .umasks = wsm_simd_int_128, }, { .name = "OFFCORE_REQUESTS_OUTSTANDING", .desc = "Outstanding offcore requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x1, .code = 0x60, .numasks = LIBPFM_ARRAY_SIZE(wsm_offcore_requests_outstanding), .ngrp = 1, .umasks = wsm_offcore_requests_outstanding, }, { .name = "MEM_STORE_RETIRED", .desc = "Retired stores", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xc, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(wsm_mem_store_retired), .ngrp = 1, .umasks = wsm_mem_store_retired, }, { .name = "INST_DECODED", .desc = "Instructions decoded", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x18, .numasks = LIBPFM_ARRAY_SIZE(wsm_inst_decoded), .ngrp = 1, .umasks = wsm_inst_decoded, }, { .name = "MACRO_INSTS_FUSIONS_DECODED", .desc = "Count the number of instructions decoded that are macro-fused but not necessarily executed or retired", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1a6, }, { .name = "MACRO_INSTS", .desc = "Macro-instructions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xd0, .numasks = LIBPFM_ARRAY_SIZE(wsm_macro_insts), .ngrp = 1, .umasks = wsm_macro_insts, }, { .name = "PARTIAL_ADDRESS_ALIAS", .desc = "False dependencies due to partial address aliasing", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x107, }, { .name = "ARITH", .desc = "Counts arithmetic multiply and divide operations", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x14, .numasks = LIBPFM_ARRAY_SIZE(wsm_arith), .ngrp = 1, .umasks = wsm_arith, }, { .name = "L2_TRANSACTIONS", .desc = "L2 transactions", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xf0, .numasks =
LIBPFM_ARRAY_SIZE(wsm_l2_transactions), .ngrp = 1, .umasks = wsm_l2_transactions, }, { .name = "INST_QUEUE_WRITES", .desc = "Instructions written to instruction queue.", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x117, }, { .name = "SB_DRAIN", .desc = "Store buffer", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x4, .numasks = LIBPFM_ARRAY_SIZE(wsm_sb_drain), .ngrp = 1, .umasks = wsm_sb_drain, }, { .name = "LOAD_HIT_PRE", .desc = "Load operations conflicting with software prefetches", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x14c, }, { .name = "MEM_UNCORE_RETIRED", .desc = "Load instructions retired (Precise Event)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xf, .flags= INTEL_X86_PEBS, .numasks = LIBPFM_ARRAY_SIZE(wsm_mem_uncore_retired), .ngrp = 1, .umasks = wsm_mem_uncore_retired, }, { .name = "L2_DATA_RQSTS", .desc = "All L2 data requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x26, .numasks = LIBPFM_ARRAY_SIZE(wsm_l2_data_rqsts), .ngrp = 1, .umasks = wsm_l2_data_rqsts, }, { .name = "BR_INST_EXEC", .desc = "Branch instructions executed", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x88, .numasks = LIBPFM_ARRAY_SIZE(wsm_br_inst_exec), .ngrp = 1, .umasks = wsm_br_inst_exec, }, { .name = "ITLB_MISS_RETIRED", .desc = "Retired instructions that missed the ITLB (Precise Event)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x20c8, .flags= INTEL_X86_PEBS, }, { .name = "BPU_MISSED_CALL_RET", .desc = "Branch prediction unit missed call or return", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1e5, }, { .name = "SNOOPQ_REQUESTS_OUTSTANDING", .desc = "Outstanding snoop requests", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x1, .code = 0xb3, .numasks = LIBPFM_ARRAY_SIZE(wsm_snoopq_requests_outstanding), .ngrp = 1, .umasks = wsm_snoopq_requests_outstanding, }, { .name = "MEM_LOAD_RETIRED", .desc = "Memory loads retired (Precise Event)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xcb, .flags= INTEL_X86_PEBS, .numasks = 
LIBPFM_ARRAY_SIZE(wsm_mem_load_retired), .ngrp = 1, .umasks = wsm_mem_load_retired, }, { .name = "L1I", .desc = "L1I instruction fetch", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x80, .numasks = LIBPFM_ARRAY_SIZE(wsm_l1i), .ngrp = 1, .umasks = wsm_l1i, }, { .name = "L2_WRITE", .desc = "L2 demand lock/store RFO", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x27, .numasks = LIBPFM_ARRAY_SIZE(wsm_l2_write), .ngrp = 1, .umasks = wsm_l2_write, }, { .name = "SNOOP_RESPONSE", .desc = "Snoop", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xb8, .numasks = LIBPFM_ARRAY_SIZE(wsm_snoop_response), .ngrp = 1, .umasks = wsm_snoop_response, }, { .name = "L1D", .desc = "L1D cache", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0x3, .code = 0x51, .numasks = LIBPFM_ARRAY_SIZE(wsm_l1d), .ngrp = 1, .umasks = wsm_l1d, }, { .name = "RESOURCE_STALLS", .desc = "Resource related stall cycles", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xa2, .numasks = LIBPFM_ARRAY_SIZE(wsm_resource_stalls), .ngrp = 1, .umasks = wsm_resource_stalls, }, { .name = "RAT_STALLS", .desc = "All RAT stall cycles", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0xd2, .numasks = LIBPFM_ARRAY_SIZE(wsm_rat_stalls), .ngrp = 1, .umasks = wsm_rat_stalls, }, { .name = "CPU_CLK_UNHALTED", .desc = "Cycles when processor is not in halted state", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x3c, .numasks = LIBPFM_ARRAY_SIZE(wsm_cpu_clk_unhalted), .ngrp = 1, .umasks = wsm_cpu_clk_unhalted, }, { .name = "L1D_WB_L2", .desc = "L1D writebacks to L2", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x28, .numasks = LIBPFM_ARRAY_SIZE(wsm_l1d_wb_l2), .ngrp = 1, .umasks = wsm_l1d_wb_l2, }, { .name = "MISPREDICTED_BRANCH_RETIRED", .desc = "Count mispredicted branch instructions at retirement. 
Specifically, this event counts at retirement of the last micro-op of a branch instruction in the architectural path of the execution and experienced misprediction in the branch prediction hardware", .modmsk = INTEL_V3_ATTRS, .equiv = "BR_MISP_RETIRED:ALL_BRANCHES", .cntmsk = 0xf, .code = 0xc5, }, { .name = "THREAD_ACTIVE", .desc = "Cycles thread is active", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1ec, }, { .name = "UOP_UNFUSION", .desc = "Counts unfusion events due to floating point exception to a fused uop", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1db, }, { .name = "OFFCORE_RESPONSE_0", .desc = "Offcore response 0 (must provide at least one request and one response umasks)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1b7, .flags= INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(wsm_offcore_response_0), .ngrp = 2, .umasks = wsm_offcore_response_0, }, { .name = "OFFCORE_RESPONSE_1", .desc = "Offcore response 1 (must provide at least one request and one response umasks)", .modmsk = INTEL_V3_ATTRS, .cntmsk = 0xf, .code = 0x1bb, .flags= INTEL_X86_NHM_OFFCORE, .numasks = LIBPFM_ARRAY_SIZE(wsm_offcore_response_0), .ngrp = 2, .umasks = wsm_offcore_response_0, /* identical to actual umasks list for this event */ }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_wsm_unc_events.h000066400000000000000000001144411502707512200244460ustar00rootroot00000000000000/* * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included 
in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * This file has been automatically generated. * * PMU: wsm_unc (Intel Westmere uncore) */ static const intel_x86_umask_t wsm_unc_unc_dram_open[]={ { .uname = "CH0", .udesc = "DRAM Channel 0 open commands issued for read or write", .ucode = 0x100, }, { .uname = "CH1", .udesc = "DRAM Channel 1 open commands issued for read or write", .ucode = 0x200, }, { .uname = "CH2", .udesc = "DRAM Channel 2 open commands issued for read or write", .ucode = 0x400, }, }; static const intel_x86_umask_t wsm_unc_unc_gc_occupancy[]={ { .uname = "READ_TRACKER", .udesc = "In the read tracker", .ucode = 0x100, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_unc_unc_dram_page_close[]={ { .uname = "CH0", .udesc = "DRAM Channel 0 page close", .ucode = 0x100, }, { .uname = "CH1", .udesc = "DRAM Channel 1 page close", .ucode = 0x200, }, { .uname = "CH2", .udesc = "DRAM Channel 2 page close", .ucode = 0x400, }, }; static const intel_x86_umask_t wsm_unc_unc_dram_page_miss[]={ { .uname = "CH0", .udesc = "DRAM Channel 0 page miss", .ucode = 0x100, }, { .uname = "CH1", .udesc = "DRAM Channel 1 page miss", .ucode = 0x200, }, { .uname = "CH2", .udesc = "DRAM Channel 2 page miss", .ucode = 0x400, }, }; static const intel_x86_umask_t wsm_unc_unc_dram_pre_all[]={ { .uname = "CH0", .udesc = "DRAM Channel 0 precharge all commands", .ucode = 0x100, }, { .uname = 
"CH1", .udesc = "DRAM Channel 1 precharge all commands", .ucode = 0x200, }, { .uname = "CH2", .udesc = "DRAM Channel 2 precharge all commands", .ucode = 0x400, }, }; static const intel_x86_umask_t wsm_unc_unc_dram_read_cas[]={ { .uname = "CH0", .udesc = "DRAM Channel 0 read CAS commands", .ucode = 0x100, }, { .uname = "AUTOPRE_CH0", .udesc = "DRAM Channel 0 read CAS auto page close commands", .ucode = 0x200, }, { .uname = "CH1", .udesc = "DRAM Channel 1 read CAS commands", .ucode = 0x400, }, { .uname = "AUTOPRE_CH1", .udesc = "DRAM Channel 1 read CAS auto page close commands", .ucode = 0x800, }, { .uname = "CH2", .udesc = "DRAM Channel 2 read CAS commands", .ucode = 0x1000, }, { .uname = "AUTOPRE_CH2", .udesc = "DRAM Channel 2 read CAS auto page close commands", .ucode = 0x2000, }, }; static const intel_x86_umask_t wsm_unc_unc_dram_refresh[]={ { .uname = "CH0", .udesc = "DRAM Channel 0 refresh commands", .ucode = 0x100, }, { .uname = "CH1", .udesc = "DRAM Channel 1 refresh commands", .ucode = 0x200, }, { .uname = "CH2", .udesc = "DRAM Channel 2 refresh commands", .ucode = 0x400, }, }; static const intel_x86_umask_t wsm_unc_unc_dram_write_cas[]={ { .uname = "CH0", .udesc = "DRAM Channel 0 write CAS commands", .ucode = 0x100, }, { .uname = "AUTOPRE_CH0", .udesc = "DRAM Channel 0 write CAS auto page close commands", .ucode = 0x200, }, { .uname = "CH1", .udesc = "DRAM Channel 1 write CAS commands", .ucode = 0x400, }, { .uname = "AUTOPRE_CH1", .udesc = "DRAM Channel 1 write CAS auto page close commands", .ucode = 0x800, }, { .uname = "CH2", .udesc = "DRAM Channel 2 write CAS commands", .ucode = 0x1000, }, { .uname = "AUTOPRE_CH2", .udesc = "DRAM Channel 2 write CAS auto page close commands", .ucode = 0x2000, }, }; static const intel_x86_umask_t wsm_unc_unc_gq_alloc[]={ { .uname = "READ_TRACKER", .udesc = "GQ read tracker requests", .ucode = 0x100, }, { .uname = "RT_LLC_MISS", .udesc = "GQ read tracker LLC misses", .ucode = 0x200, }, { .uname = "RT_TO_LLC_RESP", .udesc = 
"GQ read tracker LLC requests", .ucode = 0x400, }, { .uname = "RT_TO_RTID_ACQUIRED", .udesc = "GQ read tracker LLC miss to RTID acquired", .ucode = 0x800, }, { .uname = "WT_TO_RTID_ACQUIRED", .udesc = "GQ write tracker LLC miss to RTID acquired", .ucode = 0x1000, }, { .uname = "WRITE_TRACKER", .udesc = "GQ write tracker LLC misses", .ucode = 0x2000, }, { .uname = "PEER_PROBE_TRACKER", .udesc = "GQ peer probe tracker requests", .ucode = 0x4000, }, }; static const intel_x86_umask_t wsm_unc_unc_gq_cycles_full[]={ { .uname = "READ_TRACKER", .udesc = "Cycles GQ read tracker is full.", .ucode = 0x100, }, { .uname = "WRITE_TRACKER", .udesc = "Cycles GQ write tracker is full.", .ucode = 0x200, }, { .uname = "PEER_PROBE_TRACKER", .udesc = "Cycles GQ peer probe tracker is full.", .ucode = 0x400, }, }; static const intel_x86_umask_t wsm_unc_unc_gq_cycles_not_empty[]={ { .uname = "READ_TRACKER", .udesc = "Cycles GQ read tracker is busy", .ucode = 0x100, }, { .uname = "WRITE_TRACKER", .udesc = "Cycles GQ write tracker is busy", .ucode = 0x200, }, { .uname = "PEER_PROBE_TRACKER", .udesc = "Cycles GQ peer probe tracker is busy", .ucode = 0x400, }, }; static const intel_x86_umask_t wsm_unc_unc_gq_data_from[]={ { .uname = "QPI", .udesc = "Cycles GQ data is imported from Quickpath interface", .ucode = 0x100, }, { .uname = "QMC", .udesc = "Cycles GQ data is imported from Quickpath memory interface", .ucode = 0x200, }, { .uname = "LLC", .udesc = "Cycles GQ data is imported from LLC", .ucode = 0x400, }, { .uname = "CORES_02", .udesc = "Cycles GQ data is imported from Cores 0 and 2", .ucode = 0x800, }, { .uname = "CORES_13", .udesc = "Cycles GQ data is imported from Cores 1 and 3", .ucode = 0x1000, }, }; static const intel_x86_umask_t wsm_unc_unc_gq_data_to[]={ { .uname = "QPI_QMC", .udesc = "Cycles GQ data sent to the QPI or QMC", .ucode = 0x100, }, { .uname = "LLC", .udesc = "Cycles GQ data sent to LLC", .ucode = 0x200, }, { .uname = "CORES", .udesc = "Cycles GQ data sent to cores", 
.ucode = 0x400, }, }; static const intel_x86_umask_t wsm_unc_unc_llc_hits[]={ { .uname = "READ", .udesc = "Number of LLC read hits", .ucode = 0x100, }, { .uname = "WRITE", .udesc = "Number of LLC write hits", .ucode = 0x200, }, { .uname = "PROBE", .udesc = "Number of LLC peer probe hits", .ucode = 0x400, }, { .uname = "ANY", .udesc = "Number of LLC hits", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_unc_unc_llc_lines_in[]={ { .uname = "M_STATE", .udesc = "LLC lines allocated in M state", .ucode = 0x100, }, { .uname = "E_STATE", .udesc = "LLC lines allocated in E state", .ucode = 0x200, }, { .uname = "S_STATE", .udesc = "LLC lines allocated in S state", .ucode = 0x400, }, { .uname = "F_STATE", .udesc = "LLC lines allocated in F state", .ucode = 0x800, }, { .uname = "ANY", .udesc = "LLC lines allocated", .ucode = 0xf00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_unc_unc_llc_lines_out[]={ { .uname = "M_STATE", .udesc = "LLC lines victimized in M state", .ucode = 0x100, }, { .uname = "E_STATE", .udesc = "LLC lines victimized in E state", .ucode = 0x200, }, { .uname = "S_STATE", .udesc = "LLC lines victimized in S state", .ucode = 0x400, }, { .uname = "I_STATE", .udesc = "LLC lines victimized in I state", .ucode = 0x800, }, { .uname = "F_STATE", .udesc = "LLC lines victimized in F state", .ucode = 0x1000, }, { .uname = "ANY", .udesc = "LLC lines victimized", .ucode = 0x1f00, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_unc_unc_llc_miss[]={ { .uname = "READ", .udesc = "Number of LLC read misses", .ucode = 0x100, }, { .uname = "WRITE", .udesc = "Number of LLC write misses", .ucode = 0x200, }, { .uname = "PROBE", .udesc = "Number of LLC peer probe misses", .ucode = 0x400, }, { .uname = "ANY", .udesc = "Number of LLC misses", .ucode = 0x300, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t 
wsm_unc_unc_qhl_address_conflicts[]={ { .uname = "2WAY", .udesc = "QHL 2 way address conflicts", .ucode = 0x200, }, { .uname = "3WAY", .udesc = "QHL 3 way address conflicts", .ucode = 0x400, }, }; static const intel_x86_umask_t wsm_unc_unc_qhl_conflict_cycles[]={ { .uname = "IOH", .udesc = "QHL IOH Tracker conflict cycles", .ucode = 0x100, }, { .uname = "REMOTE", .udesc = "QHL Remote Tracker conflict cycles", .ucode = 0x200, }, { .uname = "LOCAL", .udesc = "QHL Local Tracker conflict cycles", .ucode = 0x400, }, }; static const intel_x86_umask_t wsm_unc_unc_qhl_cycles_full[]={ { .uname = "REMOTE", .udesc = "Cycles QHL Remote Tracker is full", .ucode = 0x200, }, { .uname = "LOCAL", .udesc = "Cycles QHL Local Tracker is full", .ucode = 0x400, }, { .uname = "IOH", .udesc = "Cycles QHL IOH Tracker is full", .ucode = 0x100, }, }; static const intel_x86_umask_t wsm_unc_unc_qhl_cycles_not_empty[]={ { .uname = "IOH", .udesc = "Cycles QHL IOH is busy", .ucode = 0x100, }, { .uname = "REMOTE", .udesc = "Cycles QHL Remote Tracker is busy", .ucode = 0x200, }, { .uname = "LOCAL", .udesc = "Cycles QHL Local Tracker is busy", .ucode = 0x400, }, }; static const intel_x86_umask_t wsm_unc_unc_qhl_frc_ack_cnflts[]={ { .uname = "LOCAL", .udesc = "QHL FrcAckCnflts sent to local home", .ucode = 0x400, .uflags= INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_unc_unc_qhl_sleeps[]={ { .uname = "IOH_ORDER", .udesc = "Due to IOH ordering (write after read) conflicts", .ucode = 0x100, }, { .uname = "REMOTE_ORDER", .udesc = "Due to remote socket ordering (write after read) conflicts", .ucode = 0x200, }, { .uname = "LOCAL_ORDER", .udesc = "Due to local socket ordering (write after read) conflicts", .ucode = 0x400, }, { .uname = "IOH_CONFLICT", .udesc = "Due to IOH address conflicts", .ucode = 0x800, }, { .uname = "REMOTE_CONFLICT", .udesc = "Due to remote socket address conflicts", .ucode = 0x1000, }, { .uname = "LOCAL_CONFLICT", .udesc = "Due to local socket address conflicts", .ucode = 
0x2000, }, }; static const intel_x86_umask_t wsm_unc_unc_qhl_occupancy[]={ { .uname = "IOH", .udesc = "Cycles QHL IOH Tracker Allocate to Deallocate Read Occupancy", .ucode = 0x100, }, { .uname = "REMOTE", .udesc = "Cycles QHL Remote Tracker Allocate to Deallocate Read Occupancy", .ucode = 0x200, }, { .uname = "LOCAL", .udesc = "Cycles QHL Local Tracker Allocate to Deallocate Read Occupancy", .ucode = 0x400, }, }; static const intel_x86_umask_t wsm_unc_unc_qhl_requests[]={ { .uname = "LOCAL_READS", .udesc = "Quickpath Home Logic local read requests", .ucode = 0x1000, }, { .uname = "LOCAL_WRITES", .udesc = "Quickpath Home Logic local write requests", .ucode = 0x2000, }, { .uname = "REMOTE_READS", .udesc = "Quickpath Home Logic remote read requests", .ucode = 0x400, }, { .uname = "IOH_READS", .udesc = "Quickpath Home Logic IOH read requests", .ucode = 0x100, }, { .uname = "IOH_WRITES", .udesc = "Quickpath Home Logic IOH write requests", .ucode = 0x200, }, { .uname = "REMOTE_WRITES", .udesc = "Quickpath Home Logic remote write requests", .ucode = 0x800, }, }; static const intel_x86_umask_t wsm_unc_unc_qmc_busy[]={ { .uname = "READ_CH0", .udesc = "Cycles QMC channel 0 busy with a read request", .ucode = 0x100, }, { .uname = "READ_CH1", .udesc = "Cycles QMC channel 1 busy with a read request", .ucode = 0x200, }, { .uname = "READ_CH2", .udesc = "Cycles QMC channel 2 busy with a read request", .ucode = 0x400, }, { .uname = "WRITE_CH0", .udesc = "Cycles QMC channel 0 busy with a write request", .ucode = 0x800, }, { .uname = "WRITE_CH1", .udesc = "Cycles QMC channel 1 busy with a write request", .ucode = 0x1000, }, { .uname = "WRITE_CH2", .udesc = "Cycles QMC channel 2 busy with a write request", .ucode = 0x2000, }, }; static const intel_x86_umask_t wsm_unc_unc_qmc_cancel[]={ { .uname = "CH0", .udesc = "QMC channel 0 cancels", .ucode = 0x100, }, { .uname = "CH1", .udesc = "QMC channel 1 cancels", .ucode = 0x200, }, { .uname = "CH2", .udesc = "QMC channel 2 cancels", .ucode 
= 0x400, }, { .uname = "ANY", .udesc = "QMC cancels", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_unc_unc_qmc_critical_priority_reads[]={ { .uname = "CH0", .udesc = "QMC channel 0 critical priority read requests", .ucode = 0x100, }, { .uname = "CH1", .udesc = "QMC channel 1 critical priority read requests", .ucode = 0x200, }, { .uname = "CH2", .udesc = "QMC channel 2 critical priority read requests", .ucode = 0x400, }, { .uname = "ANY", .udesc = "QMC critical priority read requests", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_unc_unc_qmc_high_priority_reads[]={ { .uname = "CH0", .udesc = "QMC channel 0 high priority read requests", .ucode = 0x100, }, { .uname = "CH1", .udesc = "QMC channel 1 high priority read requests", .ucode = 0x200, }, { .uname = "CH2", .udesc = "QMC channel 2 high priority read requests", .ucode = 0x400, }, { .uname = "ANY", .udesc = "QMC high priority read requests", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_unc_unc_qmc_isoc_full[]={ { .uname = "READ_CH0", .udesc = "Cycles DRAM channel 0 full with isochronous read requests", .ucode = 0x100, }, { .uname = "READ_CH1", .udesc = "Cycles DRAM channel 1 full with isochronous read requests", .ucode = 0x200, }, { .uname = "READ_CH2", .udesc = "Cycles DRAM channel 2 full with isochronous read requests", .ucode = 0x400, }, { .uname = "WRITE_CH0", .udesc = "Cycles DRAM channel 0 full with isochronous write requests", .ucode = 0x800, }, { .uname = "WRITE_CH1", .udesc = "Cycles DRAM channel 1 full with isochronous write requests", .ucode = 0x1000, }, { .uname = "WRITE_CH2", .udesc = "Cycles DRAM channel 2 full with isochronous write requests", .ucode = 0x2000, }, }; static const intel_x86_umask_t wsm_unc_unc_imc_isoc_occupancy[]={ { .uname = "CH0", .udesc = "IMC channel 0 isochronous read request occupancy", .ucode = 0x100, }, { .uname = 
"CH1", .udesc = "IMC channel 1 isochronous read request occupancy", .ucode = 0x200, }, { .uname = "CH2", .udesc = "IMC channel 2 isochronous read request occupancy", .ucode = 0x400, }, { .uname = "ANY", .udesc = "IMC isochronous read request occupancy", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_unc_unc_qmc_normal_reads[]={ { .uname = "CH0", .udesc = "QMC channel 0 normal read requests", .ucode = 0x100, }, { .uname = "CH1", .udesc = "QMC channel 1 normal read requests", .ucode = 0x200, }, { .uname = "CH2", .udesc = "QMC channel 2 normal read requests", .ucode = 0x400, }, { .uname = "ANY", .udesc = "QMC normal read requests", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_unc_unc_qmc_occupancy[]={ { .uname = "CH0", .udesc = "IMC channel 0 normal read request occupancy", .ucode = 0x100, }, { .uname = "CH1", .udesc = "IMC channel 1 normal read request occupancy", .ucode = 0x200, }, { .uname = "CH2", .udesc = "IMC channel 2 normal read request occupancy", .ucode = 0x400, }, }; static const intel_x86_umask_t wsm_unc_unc_qmc_priority_updates[]={ { .uname = "CH0", .udesc = "QMC channel 0 priority updates", .ucode = 0x100, }, { .uname = "CH1", .udesc = "QMC channel 1 priority updates", .ucode = 0x200, }, { .uname = "CH2", .udesc = "QMC channel 2 priority updates", .ucode = 0x400, }, { .uname = "ANY", .udesc = "QMC priority updates", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_unc_unc_imc_retry[]={ { .uname = "CH0", .udesc = "Channel 0", .ucode = 0x100, }, { .uname = "CH1", .udesc = "Channel 1", .ucode = 0x200, }, { .uname = "CH2", .udesc = "Channel 2", .ucode = 0x400, }, { .uname = "ANY", .udesc = "Any channel", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, }, }; static const intel_x86_umask_t wsm_unc_unc_qmc_writes[]={ { .uname = "FULL_CH0", .udesc = "QMC channel 0 full cache line writes", .ucode 
= 0x100, .grpid = 0, }, { .uname = "FULL_CH1", .udesc = "QMC channel 1 full cache line writes", .ucode = 0x200, .grpid = 0, }, { .uname = "FULL_CH2", .udesc = "QMC channel 2 full cache line writes", .ucode = 0x400, .grpid = 0, }, { .uname = "FULL_ANY", .udesc = "QMC full cache line writes", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 0, }, { .uname = "PARTIAL_CH0", .udesc = "QMC channel 0 partial cache line writes", .ucode = 0x800, .grpid = 1, }, { .uname = "PARTIAL_CH1", .udesc = "QMC channel 1 partial cache line writes", .ucode = 0x1000, .grpid = 1, }, { .uname = "PARTIAL_CH2", .udesc = "QMC channel 2 partial cache line writes", .ucode = 0x2000, .grpid = 1, }, { .uname = "PARTIAL_ANY", .udesc = "QMC partial cache line writes", .ucode = 0x3800, .uflags= INTEL_X86_NCOMBO | INTEL_X86_DFL, .grpid = 1, }, }; static const intel_x86_umask_t wsm_unc_unc_qpi_rx_no_ppt_credit[]={ { .uname = "STALLS_LINK_0", .udesc = "Link 0 snoop stalls due to no PPT entry", .ucode = 0x100, }, { .uname = "STALLS_LINK_1", .udesc = "Link 1 snoop stalls due to no PPT entry", .ucode = 0x200, }, }; static const intel_x86_umask_t wsm_unc_unc_qpi_tx_header[]={ { .uname = "BUSY_LINK_0", .udesc = "Cycles link 0 outbound header busy", .ucode = 0x200, }, { .uname = "BUSY_LINK_1", .udesc = "Cycles link 1 outbound header busy", .ucode = 0x800, }, }; static const intel_x86_umask_t wsm_unc_unc_qpi_tx_stalled_multi_flit[]={ { .uname = "DRS_LINK_0", .udesc = "Cycles QPI outbound link 0 DRS stalled", .ucode = 0x100, }, { .uname = "NCB_LINK_0", .udesc = "Cycles QPI outbound link 0 NCB stalled", .ucode = 0x200, }, { .uname = "NCS_LINK_0", .udesc = "Cycles QPI outbound link 0 NCS stalled", .ucode = 0x400, }, { .uname = "DRS_LINK_1", .udesc = "Cycles QPI outbound link 1 DRS stalled", .ucode = 0x800, }, { .uname = "NCB_LINK_1", .udesc = "Cycles QPI outbound link 1 NCB stalled", .ucode = 0x1000, }, { .uname = "NCS_LINK_1", .udesc = "Cycles QPI outbound link 1 NCS stalled", .ucode = 0x2000, 
}, { .uname = "LINK_0", .udesc = "Cycles QPI outbound link 0 multi flit stalled", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LINK_1", .udesc = "Cycles QPI outbound link 1 multi flit stalled", .ucode = 0x3800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_unc_unc_qpi_tx_stalled_single_flit[]={ { .uname = "HOME_LINK_0", .udesc = "Cycles QPI outbound link 0 HOME stalled", .ucode = 0x100, }, { .uname = "SNOOP_LINK_0", .udesc = "Cycles QPI outbound link 0 SNOOP stalled", .ucode = 0x200, }, { .uname = "NDR_LINK_0", .udesc = "Cycles QPI outbound link 0 NDR stalled", .ucode = 0x400, }, { .uname = "HOME_LINK_1", .udesc = "Cycles QPI outbound link 1 HOME stalled", .ucode = 0x800, }, { .uname = "SNOOP_LINK_1", .udesc = "Cycles QPI outbound link 1 SNOOP stalled", .ucode = 0x1000, }, { .uname = "NDR_LINK_1", .udesc = "Cycles QPI outbound link 1 NDR stalled", .ucode = 0x2000, }, { .uname = "LINK_0", .udesc = "Cycles QPI outbound link 0 single flit stalled", .ucode = 0x700, .uflags= INTEL_X86_NCOMBO, }, { .uname = "LINK_1", .udesc = "Cycles QPI outbound link 1 single flit stalled", .ucode = 0x3800, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_unc_unc_snp_resp_to_local_home[]={ { .uname = "I_STATE", .udesc = "Local home snoop response - LLC does not have cache line", .ucode = 0x100, }, { .uname = "S_STATE", .udesc = "Local home snoop response - LLC has cache line in S state", .ucode = 0x200, }, { .uname = "FWD_S_STATE", .udesc = "Local home snoop response - LLC forwarding cache line in S state.", .ucode = 0x400, }, { .uname = "FWD_I_STATE", .udesc = "Local home snoop response - LLC has forwarded a modified cache line", .ucode = 0x800, }, { .uname = "CONFLICT", .udesc = "Local home conflict snoop response", .ucode = 0x1000, }, { .uname = "WB", .udesc = "Local home snoop response - LLC has cache line in the M state", .ucode = 0x2000, }, }; static const intel_x86_umask_t wsm_unc_unc_snp_resp_to_remote_home[]={ { .uname = 
"I_STATE", .udesc = "Remote home snoop response - LLC does not have cache line", .ucode = 0x100, }, { .uname = "S_STATE", .udesc = "Remote home snoop response - LLC has cache line in S state", .ucode = 0x200, }, { .uname = "FWD_S_STATE", .udesc = "Remote home snoop response - LLC forwarding cache line in S state.", .ucode = 0x400, }, { .uname = "FWD_I_STATE", .udesc = "Remote home snoop response - LLC has forwarded a modified cache line", .ucode = 0x800, }, { .uname = "CONFLICT", .udesc = "Remote home conflict snoop response", .ucode = 0x1000, }, { .uname = "WB", .udesc = "Remote home snoop response - LLC has cache line in the M state", .ucode = 0x2000, }, { .uname = "HITM", .udesc = "Remote home snoop response - LLC HITM", .ucode = 0x2400, .uflags= INTEL_X86_NCOMBO, }, }; static const intel_x86_umask_t wsm_unc_unc_thermal_throttling_temp[]={ { .uname = "CORE_0", .udesc = "Core 0", .ucode = 0x100, }, { .uname = "CORE_1", .udesc = "Core 1", .ucode = 0x200, }, { .uname = "CORE_2", .udesc = "Core 2", .ucode = 0x400, }, { .uname = "CORE_3", .udesc = "Core 3", .ucode = 0x800, }, }; static const intel_x86_entry_t intel_wsm_unc_pe[]={ { .name = "UNC_CLK_UNHALTED", .desc = "Uncore clockticks.", .modmsk =0x0, .cntmsk = 0x100000, .code = 0xff, .flags = INTEL_X86_FIXED, }, { .name = "UNC_DRAM_OPEN", .desc = "DRAM open commands issued for read or write", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x60, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_dram_open), .ngrp = 1, .umasks = wsm_unc_unc_dram_open, }, { .name = "UNC_GC_OCCUPANCY", .desc = "Number of queue entries", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x2, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_gc_occupancy), .ngrp = 1, .umasks = wsm_unc_unc_gc_occupancy, }, { .name = "UNC_DRAM_PAGE_CLOSE", .desc = "DRAM page close due to idle timer expiration", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x61, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_dram_page_close), .ngrp = 1, .umasks = 
wsm_unc_unc_dram_page_close, }, { .name = "UNC_DRAM_PAGE_MISS", .desc = "DRAM Channel 0 page miss", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x62, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_dram_page_miss), .ngrp = 1, .umasks = wsm_unc_unc_dram_page_miss, }, { .name = "UNC_DRAM_PRE_ALL", .desc = "DRAM Channel 0 precharge all commands", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x66, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_dram_pre_all), .ngrp = 1, .umasks = wsm_unc_unc_dram_pre_all, }, { .name = "UNC_DRAM_THERMAL_THROTTLED", .desc = "Uncore cycles DRAM was throttled due to its temperature being above thermal throttling threshold", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x67, }, { .name = "UNC_DRAM_READ_CAS", .desc = "DRAM Channel 0 read CAS commands", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x63, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_dram_read_cas), .ngrp = 1, .umasks = wsm_unc_unc_dram_read_cas, }, { .name = "UNC_DRAM_REFRESH", .desc = "DRAM Channel 0 refresh commands", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x65, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_dram_refresh), .ngrp = 1, .umasks = wsm_unc_unc_dram_refresh, }, { .name = "UNC_DRAM_WRITE_CAS", .desc = "DRAM Channel 0 write CAS commands", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x64, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_dram_write_cas), .ngrp = 1, .umasks = wsm_unc_unc_dram_write_cas, }, { .name = "UNC_GQ_ALLOC", .desc = "GQ read tracker requests", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x3, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_gq_alloc), .ngrp = 1, .umasks = wsm_unc_unc_gq_alloc, }, { .name = "UNC_GQ_CYCLES_FULL", .desc = "Cycles GQ read tracker is full.", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x0, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_gq_cycles_full), .ngrp = 1, .umasks = wsm_unc_unc_gq_cycles_full, }, { .name = "UNC_GQ_CYCLES_NOT_EMPTY", .desc = "Cycles GQ read tracker is 
busy", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x1, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_gq_cycles_not_empty), .ngrp = 1, .umasks = wsm_unc_unc_gq_cycles_not_empty, }, { .name = "UNC_GQ_DATA_FROM", .desc = "Cycles GQ data is imported", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x4, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_gq_data_from), .ngrp = 1, .umasks = wsm_unc_unc_gq_data_from, }, { .name = "UNC_GQ_DATA_TO", .desc = "Cycles GQ data is exported", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x5, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_gq_data_to), .ngrp = 1, .umasks = wsm_unc_unc_gq_data_to, }, { .name = "UNC_LLC_HITS", .desc = "Number of LLC read hits", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x8, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_llc_hits), .ngrp = 1, .umasks = wsm_unc_unc_llc_hits, }, { .name = "UNC_LLC_LINES_IN", .desc = "LLC lines allocated in M state", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0xa, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_llc_lines_in), .ngrp = 1, .umasks = wsm_unc_unc_llc_lines_in, }, { .name = "UNC_LLC_LINES_OUT", .desc = "LLC lines victimized in M state", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0xb, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_llc_lines_out), .ngrp = 1, .umasks = wsm_unc_unc_llc_lines_out, }, { .name = "UNC_LLC_MISS", .desc = "Number of LLC read misses", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x9, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_llc_miss), .ngrp = 1, .umasks = wsm_unc_unc_llc_miss, }, { .name = "UNC_QHL_ADDRESS_CONFLICTS", .desc = "QHL 2 way address conflicts", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x24, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qhl_address_conflicts), .ngrp = 1, .umasks = wsm_unc_unc_qhl_address_conflicts, }, { .name = "UNC_QHL_CONFLICT_CYCLES", .desc = "QHL IOH Tracker conflict cycles", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x25, .numasks = 
LIBPFM_ARRAY_SIZE(wsm_unc_unc_qhl_conflict_cycles), .ngrp = 1, .umasks = wsm_unc_unc_qhl_conflict_cycles, }, { .name = "UNC_QHL_CYCLES_FULL", .desc = "Cycles QHL Remote Tracker is full", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x21, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qhl_cycles_full), .ngrp = 1, .umasks = wsm_unc_unc_qhl_cycles_full, }, { .name = "UNC_QHL_CYCLES_NOT_EMPTY", .desc = "Cycles QHL Tracker is not empty", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x22, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qhl_cycles_not_empty), .ngrp = 1, .umasks = wsm_unc_unc_qhl_cycles_not_empty, }, { .name = "UNC_QHL_FRC_ACK_CNFLTS", .desc = "QHL FrcAckCnflts sent to local home", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x33, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qhl_frc_ack_cnflts), .ngrp = 1, .umasks = wsm_unc_unc_qhl_frc_ack_cnflts, }, { .name = "UNC_QHL_SLEEPS", .desc = "Number of occurrences a request was put to sleep", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x34, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qhl_sleeps), .ngrp = 1, .umasks = wsm_unc_unc_qhl_sleeps, }, { .name = "UNC_QHL_OCCUPANCY", .desc = "Cycles QHL Tracker Allocate to Deallocate Read Occupancy", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x23, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qhl_occupancy), .ngrp = 1, .umasks = wsm_unc_unc_qhl_occupancy, }, { .name = "UNC_QHL_REQUESTS", .desc = "Quickpath Home Logic local read requests", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x20, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qhl_requests), .ngrp = 1, .umasks = wsm_unc_unc_qhl_requests, }, { .name = "UNC_QHL_TO_QMC_BYPASS", .desc = "Number of requests to QMC that bypass QHL", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x26, }, { .name = "UNC_QMC_BUSY", .desc = "Cycles QMC busy with a read request", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x29, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qmc_busy), .ngrp = 1, .umasks
= wsm_unc_unc_qmc_busy, }, { .name = "UNC_QMC_CANCEL", .desc = "QMC cancels", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x30, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qmc_cancel), .ngrp = 1, .umasks = wsm_unc_unc_qmc_cancel, }, { .name = "UNC_QMC_CRITICAL_PRIORITY_READS", .desc = "QMC critical priority read requests", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x2e, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qmc_critical_priority_reads), .ngrp = 1, .umasks = wsm_unc_unc_qmc_critical_priority_reads, }, { .name = "UNC_QMC_HIGH_PRIORITY_READS", .desc = "QMC high priority read requests", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x2d, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qmc_high_priority_reads), .ngrp = 1, .umasks = wsm_unc_unc_qmc_high_priority_reads, }, { .name = "UNC_QMC_ISOC_FULL", .desc = "Cycles DRAM full with isochronous (ISOC) read requests", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x28, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qmc_isoc_full), .ngrp = 1, .umasks = wsm_unc_unc_qmc_isoc_full, }, { .name = "UNC_IMC_ISOC_OCCUPANCY", .desc = "IMC isochronous (ISOC) Read Occupancy", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x2b, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_imc_isoc_occupancy), .ngrp = 1, .umasks = wsm_unc_unc_imc_isoc_occupancy, }, { .name = "UNC_QMC_NORMAL_READS", .desc = "QMC normal read requests", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x2c, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qmc_normal_reads), .ngrp = 1, .umasks = wsm_unc_unc_qmc_normal_reads, }, { .name = "UNC_QMC_OCCUPANCY", .desc = "QMC Occupancy", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x2a, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qmc_occupancy), .ngrp = 1, .umasks = wsm_unc_unc_qmc_occupancy, }, { .name = "UNC_QMC_PRIORITY_UPDATES", .desc = "QMC priority updates", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x31, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qmc_priority_updates), 
.ngrp = 1, .umasks = wsm_unc_unc_qmc_priority_updates, }, { .name = "UNC_IMC_RETRY", .desc = "Number of IMC DRAM channel retries (retries occur in RAS mode only)", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x32, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_imc_retry), .ngrp = 1, .umasks = wsm_unc_unc_imc_retry, }, { .name = "UNC_QMC_WRITES", .desc = "QMC cache line writes", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x2f, .flags= INTEL_X86_GRP_EXCL, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qmc_writes), .ngrp = 2, .umasks = wsm_unc_unc_qmc_writes, }, { .name = "UNC_QPI_RX_NO_PPT_CREDIT", .desc = "Link 0 snoop stalls due to no PPT entry", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x43, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qpi_rx_no_ppt_credit), .ngrp = 1, .umasks = wsm_unc_unc_qpi_rx_no_ppt_credit, }, { .name = "UNC_QPI_TX_HEADER", .desc = "Cycles link 0 outbound header busy", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x42, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qpi_tx_header), .ngrp = 1, .umasks = wsm_unc_unc_qpi_tx_header, }, { .name = "UNC_QPI_TX_STALLED_MULTI_FLIT", .desc = "Cycles QPI outbound stalls", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x41, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qpi_tx_stalled_multi_flit), .ngrp = 1, .umasks = wsm_unc_unc_qpi_tx_stalled_multi_flit, }, { .name = "UNC_QPI_TX_STALLED_SINGLE_FLIT", .desc = "Cycles QPI outbound link stalls", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x40, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_qpi_tx_stalled_single_flit), .ngrp = 1, .umasks = wsm_unc_unc_qpi_tx_stalled_single_flit, }, { .name = "UNC_SNP_RESP_TO_LOCAL_HOME", .desc = "Local home snoop response", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x6, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_snp_resp_to_local_home), .ngrp = 1, .umasks = wsm_unc_unc_snp_resp_to_local_home, }, { .name = "UNC_SNP_RESP_TO_REMOTE_HOME", .desc = "Remote home snoop response", 
.modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x7, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_snp_resp_to_remote_home), .ngrp = 1, .umasks = wsm_unc_unc_snp_resp_to_remote_home, }, { .name = "UNC_THERMAL_THROTTLING_TEMP", .desc = "Uncore cycles that the PCU records core temperature above threshold", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x80, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_thermal_throttling_temp), .ngrp = 1, .umasks = wsm_unc_unc_thermal_throttling_temp, }, { .name = "UNC_THERMAL_THROTTLED_TEMP", .desc = "Uncore cycles that the PCU records that core is in power throttled state due to temperature being above threshold", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x81, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_thermal_throttling_temp), .ngrp = 1, .umasks = wsm_unc_unc_thermal_throttling_temp, /* identical to actual umasks list for this event */ }, { .name = "UNC_PROCHOT_ASSERTION", .desc = "Number of system assertions of PROCHOT indicating the entire processor has exceeded the thermal limit", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x82, }, { .name = "UNC_THERMAL_THROTTLING_PROCHOT", .desc = "Uncore cycles that the PCU records that core is in power throttled state due to PROCHOT assertions", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x83, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_thermal_throttling_temp), .ngrp = 1, .umasks = wsm_unc_unc_thermal_throttling_temp, /* identical to actual umasks list for this event */ }, { .name = "UNC_TURBO_MODE", .desc = "Uncore cycles that a core is operating in turbo mode", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x84, .numasks = LIBPFM_ARRAY_SIZE(wsm_unc_unc_thermal_throttling_temp), .ngrp = 1, .umasks = wsm_unc_unc_thermal_throttling_temp, /* identical to actual umasks list for this event */ }, { .name = "UNC_CYCLES_UNHALTED_L3_FLL_ENABLE", .desc = "Uncore cycles where at least one core is unhalted and all L3 ways are enabled", .modmsk =
NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x85, }, { .name = "UNC_CYCLES_UNHALTED_L3_FLL_DISABLE", .desc = "Uncore cycles where at least one core is unhalted and all L3 ways are disabled", .modmsk = NHM_UNC_ATTRS, .cntmsk = 0x1fe00000, .code = 0x86, }, };
papi-papi-7-2-0-t/src/libpfm4/lib/events/intel_x86_arch_events.h
/* * Copyright (c) 2006-2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux.
*/ /* * architected events for architectural perfmon v1 and v2 as defined by the IA-32 developer's manual * Vol 3B, table 18-6 (May 2007) */ static intel_x86_entry_t intel_x86_arch_pe[]={ {.name = "UNHALTED_CORE_CYCLES", .code = 0x003c, .cntmsk = 0x200000000ull, /* temporary */ .desc = "count core clock cycles whenever the clock signal on the specific core is running (not halted)" }, {.name = "INSTRUCTION_RETIRED", .code = 0x00c0, .cntmsk = 0x100000000ull, /* temporary */ .desc = "count the number of instructions at retirement. For instructions that consist of multiple micro-ops, this event counts the retirement of the last micro-op of the instruction", }, {.name = "UNHALTED_REFERENCE_CYCLES", .code = 0x013c, .cntmsk = 0x400000000ull, /* temporary */ .desc = "count reference clock cycles while the clock signal on the specific core is running. The reference clock operates at a fixed frequency, irrespective of core frequency changes due to performance state transitions", }, {.name = "LLC_REFERENCES", .code = 0x4f2e, .desc = "count each request originating from the core to reference a cache line in the last level cache. The count may include speculation, but excludes cache line fills due to hardware prefetch", }, {.name = "LLC_MISSES", .code = 0x412e, .desc = "count each cache miss condition for references to the last level cache. The event count may include speculation, but excludes cache line fills due to hardware prefetch", }, {.name = "BRANCH_INSTRUCTIONS_RETIRED", .code = 0x00c4, .desc = "count branch instructions at retirement. Specifically, this event counts the retirement of the last micro-op of a branch instruction", }, {.name = "MISPREDICTED_BRANCH_RETIRED", .code = 0x00c5, .desc = "count mispredicted branch instructions at retirement.
Specifically, this event counts at retirement of the last micro-op of a branch instruction in the architectural path of the execution and experienced misprediction in the branch prediction hardware", } };
papi-papi-7-2-0-t/src/libpfm4/lib/events/itanium2_events.h
/* * Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ /* * This file is generated automatically * !! DO NOT CHANGE !!
*/ static pme_ita2_entry_t itanium2_pe []={ #define PME_ITA2_ALAT_CAPACITY_MISS_ALL 0 { "ALAT_CAPACITY_MISS_ALL", {0x30058}, 0xf0, 2, {0xf00007}, "ALAT Entry Replaced -- both integer and floating point instructions"}, #define PME_ITA2_ALAT_CAPACITY_MISS_FP 1 { "ALAT_CAPACITY_MISS_FP", {0x20058}, 0xf0, 2, {0xf00007}, "ALAT Entry Replaced -- only floating point instructions"}, #define PME_ITA2_ALAT_CAPACITY_MISS_INT 2 { "ALAT_CAPACITY_MISS_INT", {0x10058}, 0xf0, 2, {0xf00007}, "ALAT Entry Replaced -- only integer instructions"}, #define PME_ITA2_BACK_END_BUBBLE_ALL 3 { "BACK_END_BUBBLE_ALL", {0x0}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe -- Front-end, RSE, EXE, FPU/L1D stall or a pipeline flush due to an exception/branch misprediction"}, #define PME_ITA2_BACK_END_BUBBLE_FE 4 { "BACK_END_BUBBLE_FE", {0x10000}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe -- front-end"}, #define PME_ITA2_BACK_END_BUBBLE_L1D_FPU_RSE 5 { "BACK_END_BUBBLE_L1D_FPU_RSE", {0x20000}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe -- L1D_FPU or RSE."}, #define PME_ITA2_BE_BR_MISPRED_DETAIL_ANY 6 { "BE_BR_MISPRED_DETAIL_ANY", {0x61}, 0xf0, 1, {0xf00003}, "BE Branch Misprediction Detail -- any back-end (be) mispredictions"}, #define PME_ITA2_BE_BR_MISPRED_DETAIL_PFS 7 { "BE_BR_MISPRED_DETAIL_PFS", {0x30061}, 0xf0, 1, {0xf00003}, "BE Branch Misprediction Detail -- only back-end pfs mispredictions for taken branches"}, #define PME_ITA2_BE_BR_MISPRED_DETAIL_ROT 8 { "BE_BR_MISPRED_DETAIL_ROT", {0x20061}, 0xf0, 1, {0xf00003}, "BE Branch Misprediction Detail -- only back-end rotate mispredictions"}, #define PME_ITA2_BE_BR_MISPRED_DETAIL_STG 9 { "BE_BR_MISPRED_DETAIL_STG", {0x10061}, 0xf0, 1, {0xf00003}, "BE Branch Misprediction Detail -- only back-end stage mispredictions"}, #define PME_ITA2_BE_EXE_BUBBLE_ALL 10 { "BE_EXE_BUBBLE_ALL", {0x2}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe"}, #define 
PME_ITA2_BE_EXE_BUBBLE_ARCR 11 { "BE_EXE_BUBBLE_ARCR", {0x40002}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to AR or CR dependency"}, #define PME_ITA2_BE_EXE_BUBBLE_ARCR_PR_CANCEL_BANK 12 { "BE_EXE_BUBBLE_ARCR_PR_CANCEL_BANK", {0x80002}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- ARCR, PR, CANCEL or BANK_SWITCH"}, #define PME_ITA2_BE_EXE_BUBBLE_BANK_SWITCH 13 { "BE_EXE_BUBBLE_BANK_SWITCH", {0x70002}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to bank switching."}, #define PME_ITA2_BE_EXE_BUBBLE_CANCEL 14 { "BE_EXE_BUBBLE_CANCEL", {0x60002}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to a canceled load"}, #define PME_ITA2_BE_EXE_BUBBLE_FRALL 15 { "BE_EXE_BUBBLE_FRALL", {0x20002}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to FR/FR or FR/load dependency"}, #define PME_ITA2_BE_EXE_BUBBLE_GRALL 16 { "BE_EXE_BUBBLE_GRALL", {0x10002}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to GR/GR or GR/load dependency"}, #define PME_ITA2_BE_EXE_BUBBLE_GRGR 17 { "BE_EXE_BUBBLE_GRGR", {0x50002}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to GR/GR dependency"}, #define PME_ITA2_BE_EXE_BUBBLE_PR 18 { "BE_EXE_BUBBLE_PR", {0x30002}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to PR dependency"}, #define PME_ITA2_BE_FLUSH_BUBBLE_ALL 19 { "BE_FLUSH_BUBBLE_ALL", {0x4}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Flushes. 
-- Back-end was stalled due to either an exception/interruption or branch misprediction flush"}, #define PME_ITA2_BE_FLUSH_BUBBLE_BRU 20 { "BE_FLUSH_BUBBLE_BRU", {0x10004}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Flushes. -- Back-end was stalled due to a branch misprediction flush"}, #define PME_ITA2_BE_FLUSH_BUBBLE_XPN 21 { "BE_FLUSH_BUBBLE_XPN", {0x20004}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to Flushes. -- Back-end was stalled due to an exception/interruption flush"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_ALL 22 { "BE_L1D_FPU_BUBBLE_ALL", {0xca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D or FPU"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_FPU 23 { "BE_L1D_FPU_BUBBLE_FPU", {0x100ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by FPU."}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D 24 { "BE_L1D_FPU_BUBBLE_L1D", {0x200ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D. 
This includes all stalls caused by the L1 pipeline (created in the L1D stage of the L1 pipeline which corresponds to the DET stage of the main pipe)."}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_DCS 25 { "BE_L1D_FPU_BUBBLE_L1D_DCS", {0x800ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to DCS requiring a stall"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_DCURECIR 26 { "BE_L1D_FPU_BUBBLE_L1D_DCURECIR", {0x400ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to DCU recirculating"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_FILLCONF 27 { "BE_L1D_FPU_BUBBLE_L1D_FILLCONF", {0x700ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to a store in conflict with a returning fill."}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_FULLSTBUF 28 { "BE_L1D_FPU_BUBBLE_L1D_FULLSTBUF", {0x300ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to store buffer being full"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_HPW 29 { "BE_L1D_FPU_BUBBLE_L1D_HPW", {0x500ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to Hardware Page Walker"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_L2BPRESS 30 { "BE_L1D_FPU_BUBBLE_L1D_L2BPRESS", {0x900ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to L2 Back Pressure"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_LDCHK 31 { "BE_L1D_FPU_BUBBLE_L1D_LDCHK", {0xc00ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to architectural ordering conflict"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_LDCONF 32 { "BE_L1D_FPU_BUBBLE_L1D_LDCONF", {0xb00ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache --
Back-end was stalled by L1D due to architectural ordering conflict"}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_NAT 33 { "BE_L1D_FPU_BUBBLE_L1D_NAT", {0xd00ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to L1D data return needing recirculated NaT generation."}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_NATCONF 34 { "BE_L1D_FPU_BUBBLE_L1D_NATCONF", {0xf00ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to ld8.fill conflict with st8.spill not written to unat."}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_STBUFRECIR 35 { "BE_L1D_FPU_BUBBLE_L1D_STBUFRECIR", {0xe00ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to store buffer cancel needing recirculate."}, #define PME_ITA2_BE_L1D_FPU_BUBBLE_L1D_TLB 36 { "BE_L1D_FPU_BUBBLE_L1D_TLB", {0xa00ca}, 0xf0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to L2DTLB to L1DTLB transfer"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_ALL 37 { "BE_LOST_BW_DUE_TO_FE_ALL", {0x72}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- count regardless of cause"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_BI 38 { "BE_LOST_BW_DUE_TO_FE_BI", {0x90072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by branch initialization stall"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_BRQ 39 { "BE_LOST_BW_DUE_TO_FE_BRQ", {0xa0072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by branch retirement queue stall"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_BR_ILOCK 40 { "BE_LOST_BW_DUE_TO_FE_BR_ILOCK", {0xc0072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. 
-- only if caused by branch interlock stall"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_BUBBLE 41 { "BE_LOST_BW_DUE_TO_FE_BUBBLE", {0xd0072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by branch resteer bubble stall"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_FEFLUSH 42 { "BE_LOST_BW_DUE_TO_FE_FEFLUSH", {0x10072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by a front-end flush"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_FILL_RECIRC 43 { "BE_LOST_BW_DUE_TO_FE_FILL_RECIRC", {0x80072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by a recirculate for a cache line fill operation"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_IBFULL 44 { "BE_LOST_BW_DUE_TO_FE_IBFULL", {0x50072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- (* meaningless for this event *)"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_IMISS 45 { "BE_LOST_BW_DUE_TO_FE_IMISS", {0x60072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by instruction cache miss stall"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_PLP 46 { "BE_LOST_BW_DUE_TO_FE_PLP", {0xb0072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by perfect loop prediction stall"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_TLBMISS 47 { "BE_LOST_BW_DUE_TO_FE_TLBMISS", {0x70072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by TLB stall"}, #define PME_ITA2_BE_LOST_BW_DUE_TO_FE_UNREACHED 48 { "BE_LOST_BW_DUE_TO_FE_UNREACHED", {0x40072}, 0xf0, 2, {0xf00000}, "Invalid Bundles if BE Not Stalled for Other Reasons. 
-- only if caused by unreachable bundle"}, #define PME_ITA2_BE_RSE_BUBBLE_ALL 49 { "BE_RSE_BUBBLE_ALL", {0x1}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE"}, #define PME_ITA2_BE_RSE_BUBBLE_AR_DEP 50 { "BE_RSE_BUBBLE_AR_DEP", {0x20001}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to AR dependencies"}, #define PME_ITA2_BE_RSE_BUBBLE_BANK_SWITCH 51 { "BE_RSE_BUBBLE_BANK_SWITCH", {0x10001}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to bank switching"}, #define PME_ITA2_BE_RSE_BUBBLE_LOADRS 52 { "BE_RSE_BUBBLE_LOADRS", {0x50001}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to loadrs calculations"}, #define PME_ITA2_BE_RSE_BUBBLE_OVERFLOW 53 { "BE_RSE_BUBBLE_OVERFLOW", {0x30001}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to need to spill"}, #define PME_ITA2_BE_RSE_BUBBLE_UNDERFLOW 54 { "BE_RSE_BUBBLE_UNDERFLOW", {0x40001}, 0xf0, 1, {0xf00000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to need to fill"}, #define PME_ITA2_BRANCH_EVENT 55 { "BRANCH_EVENT", {0x111}, 0xf0, 1, {0xf00003}, "Branch Event Captured"}, #define PME_ITA2_BR_MISPRED_DETAIL_ALL_ALL_PRED 56 { "BR_MISPRED_DETAIL_ALL_ALL_PRED", {0x5b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- All branch types regardless of prediction result"}, #define PME_ITA2_BR_MISPRED_DETAIL_ALL_CORRECT_PRED 57 { "BR_MISPRED_DETAIL_ALL_CORRECT_PRED", {0x1005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- All branch types, correctly predicted branches (outcome and target)"}, #define PME_ITA2_BR_MISPRED_DETAIL_ALL_WRONG_PATH 58 { "BR_MISPRED_DETAIL_ALL_WRONG_PATH", {0x2005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- All branch types, mispredicted branches due 
to wrong branch direction"}, #define PME_ITA2_BR_MISPRED_DETAIL_ALL_WRONG_TARGET 59 { "BR_MISPRED_DETAIL_ALL_WRONG_TARGET", {0x3005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- All branch types, mispredicted branches due to wrong target for taken branches"}, #define PME_ITA2_BR_MISPRED_DETAIL_IPREL_ALL_PRED 60 { "BR_MISPRED_DETAIL_IPREL_ALL_PRED", {0x4005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only IP relative branches, regardless of prediction result"}, #define PME_ITA2_BR_MISPRED_DETAIL_IPREL_CORRECT_PRED 61 { "BR_MISPRED_DETAIL_IPREL_CORRECT_PRED", {0x5005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only IP relative branches, correctly predicted branches (outcome and target)"}, #define PME_ITA2_BR_MISPRED_DETAIL_IPREL_WRONG_PATH 62 { "BR_MISPRED_DETAIL_IPREL_WRONG_PATH", {0x6005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only IP relative branches, mispredicted branches due to wrong branch direction"}, #define PME_ITA2_BR_MISPRED_DETAIL_IPREL_WRONG_TARGET 63 { "BR_MISPRED_DETAIL_IPREL_WRONG_TARGET", {0x7005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only IP relative branches, mispredicted branches due to wrong target for taken branches"}, #define PME_ITA2_BR_MISPRED_DETAIL_NRETIND_ALL_PRED 64 { "BR_MISPRED_DETAIL_NRETIND_ALL_PRED", {0xc005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only non-return indirect branches, regardless of prediction result"}, #define PME_ITA2_BR_MISPRED_DETAIL_NRETIND_CORRECT_PRED 65 { "BR_MISPRED_DETAIL_NRETIND_CORRECT_PRED", {0xd005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only non-return indirect branches, correctly predicted branches (outcome and target)"}, #define PME_ITA2_BR_MISPRED_DETAIL_NRETIND_WRONG_PATH 66 { "BR_MISPRED_DETAIL_NRETIND_WRONG_PATH", {0xe005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only non-return indirect branches, mispredicted branches due to wrong branch direction"}, #define
PME_ITA2_BR_MISPRED_DETAIL_NRETIND_WRONG_TARGET 67 { "BR_MISPRED_DETAIL_NRETIND_WRONG_TARGET", {0xf005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only non-return indirect branches, mispredicted branches due to wrong target for taken branches"}, #define PME_ITA2_BR_MISPRED_DETAIL_RETURN_ALL_PRED 68 { "BR_MISPRED_DETAIL_RETURN_ALL_PRED", {0x8005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only return type branches, regardless of prediction result"}, #define PME_ITA2_BR_MISPRED_DETAIL_RETURN_CORRECT_PRED 69 { "BR_MISPRED_DETAIL_RETURN_CORRECT_PRED", {0x9005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only return type branches, correctly predicted branches (outcome and target)"}, #define PME_ITA2_BR_MISPRED_DETAIL_RETURN_WRONG_PATH 70 { "BR_MISPRED_DETAIL_RETURN_WRONG_PATH", {0xa005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only return type branches, mispredicted branches due to wrong branch direction"}, #define PME_ITA2_BR_MISPRED_DETAIL_RETURN_WRONG_TARGET 71 { "BR_MISPRED_DETAIL_RETURN_WRONG_TARGET", {0xb005b}, 0xf0, 3, {0xf00003}, "FE Branch Mispredict Detail -- Only return type branches, mispredicted branches due to wrong target for taken branches"}, #define PME_ITA2_BR_MISPRED_DETAIL2_ALL_ALL_UNKNOWN_PRED 72 { "BR_MISPRED_DETAIL2_ALL_ALL_UNKNOWN_PRED", {0x68}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- All branch types, branches with unknown path prediction"}, #define PME_ITA2_BR_MISPRED_DETAIL2_ALL_UNKNOWN_PATH_CORRECT_PRED 73 { "BR_MISPRED_DETAIL2_ALL_UNKNOWN_PATH_CORRECT_PRED", {0x10068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- All branch types, branches with unknown path prediction and correctly predicted branch (outcome & target)"}, #define PME_ITA2_BR_MISPRED_DETAIL2_ALL_UNKNOWN_PATH_WRONG_PATH 74 { "BR_MISPRED_DETAIL2_ALL_UNKNOWN_PATH_WRONG_PATH", {0x20068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component)
-- All branch types, branches with unknown path prediction and wrong branch direction"}, #define PME_ITA2_BR_MISPRED_DETAIL2_IPREL_ALL_UNKNOWN_PRED 75 { "BR_MISPRED_DETAIL2_IPREL_ALL_UNKNOWN_PRED", {0x40068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only IP relative branches, branches with unknown path prediction"}, #define PME_ITA2_BR_MISPRED_DETAIL2_IPREL_UNKNOWN_PATH_CORRECT_PRED 76 { "BR_MISPRED_DETAIL2_IPREL_UNKNOWN_PATH_CORRECT_PRED", {0x50068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only IP relative branches, branches with unknown path prediction and correct predicted branch (outcome & target)"}, #define PME_ITA2_BR_MISPRED_DETAIL2_IPREL_UNKNOWN_PATH_WRONG_PATH 77 { "BR_MISPRED_DETAIL2_IPREL_UNKNOWN_PATH_WRONG_PATH", {0x60068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only IP relative branches, branches with unknown path prediction and wrong branch direction"}, #define PME_ITA2_BR_MISPRED_DETAIL2_NRETIND_ALL_UNKNOWN_PRED 78 { "BR_MISPRED_DETAIL2_NRETIND_ALL_UNKNOWN_PRED", {0xc0068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only non-return indirect branches, branches with unknown path prediction"}, #define PME_ITA2_BR_MISPRED_DETAIL2_NRETIND_UNKNOWN_PATH_CORRECT_PRED 79 { "BR_MISPRED_DETAIL2_NRETIND_UNKNOWN_PATH_CORRECT_PRED", {0xd0068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only non-return indirect branches, branches with unknown path prediction and correct predicted branch (outcome & target)"}, #define PME_ITA2_BR_MISPRED_DETAIL2_NRETIND_UNKNOWN_PATH_WRONG_PATH 80 { "BR_MISPRED_DETAIL2_NRETIND_UNKNOWN_PATH_WRONG_PATH", {0xe0068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only non-return indirect branches, branches with unknown path prediction and wrong branch direction"}, #define PME_ITA2_BR_MISPRED_DETAIL2_RETURN_ALL_UNKNOWN_PRED 81 { 
"BR_MISPRED_DETAIL2_RETURN_ALL_UNKNOWN_PRED", {0x80068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only return type branches, branches with unknown path prediction"}, #define PME_ITA2_BR_MISPRED_DETAIL2_RETURN_UNKNOWN_PATH_CORRECT_PRED 82 { "BR_MISPRED_DETAIL2_RETURN_UNKNOWN_PATH_CORRECT_PRED", {0x90068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only return type branches, branches with unknown path prediction and correct predicted branch (outcome & target)"}, #define PME_ITA2_BR_MISPRED_DETAIL2_RETURN_UNKNOWN_PATH_WRONG_PATH 83 { "BR_MISPRED_DETAIL2_RETURN_UNKNOWN_PATH_WRONG_PATH", {0xa0068}, 0xf0, 2, {0xf00003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only return type branches, branches with unknown path prediction and wrong branch direction"}, #define PME_ITA2_BR_PATH_PRED_ALL_MISPRED_NOTTAKEN 84 { "BR_PATH_PRED_ALL_MISPRED_NOTTAKEN", {0x54}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- All branch types, incorrectly predicted path and not taken branch"}, #define PME_ITA2_BR_PATH_PRED_ALL_MISPRED_TAKEN 85 { "BR_PATH_PRED_ALL_MISPRED_TAKEN", {0x10054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- All branch types, incorrectly predicted path and taken branch"}, #define PME_ITA2_BR_PATH_PRED_ALL_OKPRED_NOTTAKEN 86 { "BR_PATH_PRED_ALL_OKPRED_NOTTAKEN", {0x20054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- All branch types, correctly predicted path and not taken branch"}, #define PME_ITA2_BR_PATH_PRED_ALL_OKPRED_TAKEN 87 { "BR_PATH_PRED_ALL_OKPRED_TAKEN", {0x30054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- All branch types, correctly predicted path and taken branch"}, #define PME_ITA2_BR_PATH_PRED_IPREL_MISPRED_NOTTAKEN 88 { "BR_PATH_PRED_IPREL_MISPRED_NOTTAKEN", {0x40054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only IP relative branches, incorrectly predicted path and not taken branch"}, #define 
PME_ITA2_BR_PATH_PRED_IPREL_MISPRED_TAKEN 89 { "BR_PATH_PRED_IPREL_MISPRED_TAKEN", {0x50054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only IP relative branches, incorrectly predicted path and taken branch"}, #define PME_ITA2_BR_PATH_PRED_IPREL_OKPRED_NOTTAKEN 90 { "BR_PATH_PRED_IPREL_OKPRED_NOTTAKEN", {0x60054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only IP relative branches, correctly predicted path and not taken branch"}, #define PME_ITA2_BR_PATH_PRED_IPREL_OKPRED_TAKEN 91 { "BR_PATH_PRED_IPREL_OKPRED_TAKEN", {0x70054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only IP relative branches, correctly predicted path and taken branch"}, #define PME_ITA2_BR_PATH_PRED_NRETIND_MISPRED_NOTTAKEN 92 { "BR_PATH_PRED_NRETIND_MISPRED_NOTTAKEN", {0xc0054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only non-return indirect branches, incorrectly predicted path and not taken branch"}, #define PME_ITA2_BR_PATH_PRED_NRETIND_MISPRED_TAKEN 93 { "BR_PATH_PRED_NRETIND_MISPRED_TAKEN", {0xd0054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only non-return indirect branches, incorrectly predicted path and taken branch"}, #define PME_ITA2_BR_PATH_PRED_NRETIND_OKPRED_NOTTAKEN 94 { "BR_PATH_PRED_NRETIND_OKPRED_NOTTAKEN", {0xe0054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only non-return indirect branches, correctly predicted path and not taken branch"}, #define PME_ITA2_BR_PATH_PRED_NRETIND_OKPRED_TAKEN 95 { "BR_PATH_PRED_NRETIND_OKPRED_TAKEN", {0xf0054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only non-return indirect branches, correctly predicted path and taken branch"}, #define PME_ITA2_BR_PATH_PRED_RETURN_MISPRED_NOTTAKEN 96 { "BR_PATH_PRED_RETURN_MISPRED_NOTTAKEN", {0x80054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only return type branches, incorrectly predicted path and not taken branch"}, #define PME_ITA2_BR_PATH_PRED_RETURN_MISPRED_TAKEN 97 { 
"BR_PATH_PRED_RETURN_MISPRED_TAKEN", {0x90054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only return type branches, incorrectly predicted path and taken branch"}, #define PME_ITA2_BR_PATH_PRED_RETURN_OKPRED_NOTTAKEN 98 { "BR_PATH_PRED_RETURN_OKPRED_NOTTAKEN", {0xa0054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only return type branches, correctly predicted path and not taken branch"}, #define PME_ITA2_BR_PATH_PRED_RETURN_OKPRED_TAKEN 99 { "BR_PATH_PRED_RETURN_OKPRED_TAKEN", {0xb0054}, 0xf0, 3, {0xf00003}, "FE Branch Path Prediction Detail -- Only return type branches, correctly predicted path and taken branch"}, #define PME_ITA2_BR_PATH_PRED2_ALL_UNKNOWNPRED_NOTTAKEN 100 { "BR_PATH_PRED2_ALL_UNKNOWNPRED_NOTTAKEN", {0x6a}, 0xf0, 2, {0xf00003}, "FE Branch Path Prediction Detail (Unknown pred component) -- All branch types, unknown predicted path and not taken branch (which impacts OKPRED_NOTTAKEN)"}, #define PME_ITA2_BR_PATH_PRED2_ALL_UNKNOWNPRED_TAKEN 101 { "BR_PATH_PRED2_ALL_UNKNOWNPRED_TAKEN", {0x1006a}, 0xf0, 2, {0xf00003}, "FE Branch Path Prediction Detail (Unknown pred component) -- All branch types, unknown predicted path and taken branch (which impacts MISPRED_TAKEN)"}, #define PME_ITA2_BR_PATH_PRED2_IPREL_UNKNOWNPRED_NOTTAKEN 102 { "BR_PATH_PRED2_IPREL_UNKNOWNPRED_NOTTAKEN", {0x4006a}, 0xf0, 2, {0xf00003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only IP relative branches, unknown predicted path and not taken branch (which impacts OKPRED_NOTTAKEN)"}, #define PME_ITA2_BR_PATH_PRED2_IPREL_UNKNOWNPRED_TAKEN 103 { "BR_PATH_PRED2_IPREL_UNKNOWNPRED_TAKEN", {0x5006a}, 0xf0, 2, {0xf00003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only IP relative branches, unknown predicted path and taken branch (which impacts MISPRED_TAKEN)"}, #define PME_ITA2_BR_PATH_PRED2_NRETIND_UNKNOWNPRED_NOTTAKEN 104 { "BR_PATH_PRED2_NRETIND_UNKNOWNPRED_NOTTAKEN", {0xc006a}, 0xf0, 2, {0xf00003}, "FE Branch Path 
Prediction Detail (Unknown pred component) -- Only non-return indirect branches, unknown predicted path and not taken branch (which impacts OKPRED_NOTTAKEN)"}, #define PME_ITA2_BR_PATH_PRED2_NRETIND_UNKNOWNPRED_TAKEN 105 { "BR_PATH_PRED2_NRETIND_UNKNOWNPRED_TAKEN", {0xd006a}, 0xf0, 2, {0xf00003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only non-return indirect branches, unknown predicted path and taken branch (which impacts MISPRED_TAKEN)"}, #define PME_ITA2_BR_PATH_PRED2_RETURN_UNKNOWNPRED_NOTTAKEN 106 { "BR_PATH_PRED2_RETURN_UNKNOWNPRED_NOTTAKEN", {0x8006a}, 0xf0, 2, {0xf00003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only return type branches, unknown predicted path and not taken branch (which impacts OKPRED_NOTTAKEN)"}, #define PME_ITA2_BR_PATH_PRED2_RETURN_UNKNOWNPRED_TAKEN 107 { "BR_PATH_PRED2_RETURN_UNKNOWNPRED_TAKEN", {0x9006a}, 0xf0, 2, {0xf00003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only return type branches, unknown predicted path and taken branch (which impacts MISPRED_TAKEN)"}, #define PME_ITA2_BUS_ALL_ANY 108 { "BUS_ALL_ANY", {0x30087}, 0xf0, 1, {0xf00000}, "Bus Transactions -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_ALL_IO 109 { "BUS_ALL_IO", {0x10087}, 0xf0, 1, {0xf00000}, "Bus Transactions -- non-CPU priority agents"}, #define PME_ITA2_BUS_ALL_SELF 110 { "BUS_ALL_SELF", {0x20087}, 0xf0, 1, {0xf00000}, "Bus Transactions -- local processor"}, #define PME_ITA2_BUS_BACKSNP_REQ_THIS 111 { "BUS_BACKSNP_REQ_THIS", {0x1008e}, 0xf0, 1, {0xf00000}, "Bus Back Snoop Requests -- Counts the number of bus back snoop me requests"}, #define PME_ITA2_BUS_BRQ_LIVE_REQ_HI 112 { "BUS_BRQ_LIVE_REQ_HI", {0x9c}, 0xf0, 2, {0xf00000}, "BRQ Live Requests (upper 2 bits)"}, #define PME_ITA2_BUS_BRQ_LIVE_REQ_LO 113 { "BUS_BRQ_LIVE_REQ_LO", {0x9b}, 0xf0, 7, {0xf00000}, "BRQ Live Requests (lower 3 bits)"}, #define PME_ITA2_BUS_BRQ_REQ_INSERTED 114 { "BUS_BRQ_REQ_INSERTED", {0x9d}, 0xf0, 1, 
{0xf00000}, "BRQ Requests Inserted"}, #define PME_ITA2_BUS_DATA_CYCLE 115 { "BUS_DATA_CYCLE", {0x88}, 0xf0, 1, {0xf00000}, "Valid Data Cycle on the Bus"}, #define PME_ITA2_BUS_HITM 116 { "BUS_HITM", {0x84}, 0xf0, 1, {0xf00000}, "Bus Hit Modified Line Transactions"}, #define PME_ITA2_BUS_IO_ANY 117 { "BUS_IO_ANY", {0x30090}, 0xf0, 1, {0xf00000}, "IA-32 Compatible IO Bus Transactions -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_IO_IO 118 { "BUS_IO_IO", {0x10090}, 0xf0, 1, {0xf00000}, "IA-32 Compatible IO Bus Transactions -- non-CPU priority agents"}, #define PME_ITA2_BUS_IO_SELF 119 { "BUS_IO_SELF", {0x20090}, 0xf0, 1, {0xf00000}, "IA-32 Compatible IO Bus Transactions -- local processor"}, #define PME_ITA2_BUS_IOQ_LIVE_REQ_HI 120 { "BUS_IOQ_LIVE_REQ_HI", {0x98}, 0xf0, 2, {0xf00000}, "Inorder Bus Queue Requests (upper 2 bits)"}, #define PME_ITA2_BUS_IOQ_LIVE_REQ_LO 121 { "BUS_IOQ_LIVE_REQ_LO", {0x97}, 0xf0, 3, {0xf00000}, "Inorder Bus Queue Requests (lower 2 bits)"}, #define PME_ITA2_BUS_LOCK_ANY 122 { "BUS_LOCK_ANY", {0x30093}, 0xf0, 1, {0xf00000}, "IA-32 Compatible Bus Lock Transactions -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_LOCK_SELF 123 { "BUS_LOCK_SELF", {0x20093}, 0xf0, 1, {0xf00000}, "IA-32 Compatible Bus Lock Transactions -- local processor"}, #define PME_ITA2_BUS_MEMORY_ALL_ANY 124 { "BUS_MEMORY_ALL_ANY", {0xf008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- All bus transactions from CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_MEMORY_ALL_IO 125 { "BUS_MEMORY_ALL_IO", {0xd008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- All bus transactions from non-CPU priority agents"}, #define PME_ITA2_BUS_MEMORY_ALL_SELF 126 { "BUS_MEMORY_ALL_SELF", {0xe008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- All bus transactions from local processor"}, #define PME_ITA2_BUS_MEMORY_EQ_128BYTE_ANY 127 { "BUS_MEMORY_EQ_128BYTE_ANY", {0x7008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- number of full 
cache line transactions (BRL, BRIL, BWL) from CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_MEMORY_EQ_128BYTE_IO 128 { "BUS_MEMORY_EQ_128BYTE_IO", {0x5008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- number of full cache line transactions (BRL, BRIL, BWL) from non-CPU priority agents"}, #define PME_ITA2_BUS_MEMORY_EQ_128BYTE_SELF 129 { "BUS_MEMORY_EQ_128BYTE_SELF", {0x6008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- number of full cache line transactions (BRL, BRIL, BWL) from local processor"}, #define PME_ITA2_BUS_MEMORY_LT_128BYTE_ANY 130 { "BUS_MEMORY_LT_128BYTE_ANY", {0xb008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- number of less than full cache line transactions (BRP, BWP) CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_MEMORY_LT_128BYTE_IO 131 { "BUS_MEMORY_LT_128BYTE_IO", {0x9008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- number of less than full cache line transactions (BRP, BWP) from non-CPU priority agents"}, #define PME_ITA2_BUS_MEMORY_LT_128BYTE_SELF 132 { "BUS_MEMORY_LT_128BYTE_SELF", {0xa008a}, 0xf0, 1, {0xf00000}, "Bus Memory Transactions -- number of less than full cache line transactions (BRP, BWP) local processor"}, #define PME_ITA2_BUS_MEM_READ_ALL_ANY 133 { "BUS_MEM_READ_ALL_ANY", {0xf008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- All memory read transactions from CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_MEM_READ_ALL_IO 134 { "BUS_MEM_READ_ALL_IO", {0xd008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- All memory read transactions from non-CPU priority agents"}, #define PME_ITA2_BUS_MEM_READ_ALL_SELF 135 { "BUS_MEM_READ_ALL_SELF", {0xe008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- All memory read transactions from local processor"}, #define PME_ITA2_BUS_MEM_READ_BIL_ANY 136 { "BUS_MEM_READ_BIL_ANY", {0x3008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I 
Memory RD, RD Invalidate, and BRIL -- Number of BIL 0-byte memory read invalidate transactions from CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_MEM_READ_BIL_IO 137 { "BUS_MEM_READ_BIL_IO", {0x1008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of BIL 0-byte memory read invalidate transactions from non-CPU priority agents"}, #define PME_ITA2_BUS_MEM_READ_BIL_SELF 138 { "BUS_MEM_READ_BIL_SELF", {0x2008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of BIL 0-byte memory read invalidate transactions from local processor"}, #define PME_ITA2_BUS_MEM_READ_BRIL_ANY 139 { "BUS_MEM_READ_BRIL_ANY", {0xb008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read invalidate transactions from CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_MEM_READ_BRIL_IO 140 { "BUS_MEM_READ_BRIL_IO", {0x9008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read invalidate transactions from non-CPU priority agents"}, #define PME_ITA2_BUS_MEM_READ_BRIL_SELF 141 { "BUS_MEM_READ_BRIL_SELF", {0xa008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read invalidate transactions from local processor"}, #define PME_ITA2_BUS_MEM_READ_BRL_ANY 142 { "BUS_MEM_READ_BRL_ANY", {0x7008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read transactions from CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_MEM_READ_BRL_IO 143 { "BUS_MEM_READ_BRL_IO", {0x5008b}, 0xf0, 1, {0xf00000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read transactions from non-CPU priority agents"}, #define PME_ITA2_BUS_MEM_READ_BRL_SELF 144 { "BUS_MEM_READ_BRL_SELF", {0x6008b}, 0xf0, 1, {0xf00000}, "Full 
Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read transactions from local processor"}, #define PME_ITA2_BUS_MEM_READ_OUT_HI 145 { "BUS_MEM_READ_OUT_HI", {0x94}, 0xf0, 2, {0xf00000}, "Outstanding Memory Read Transactions (upper 2 bits)"}, #define PME_ITA2_BUS_MEM_READ_OUT_LO 146 { "BUS_MEM_READ_OUT_LO", {0x95}, 0xf0, 7, {0xf00000}, "Outstanding Memory Read Transactions (lower 3 bits)"}, #define PME_ITA2_BUS_OOQ_LIVE_REQ_HI 147 { "BUS_OOQ_LIVE_REQ_HI", {0x9a}, 0xf0, 2, {0xf00000}, "Out-of-order Bus Queue Requests (upper 2 bits)"}, #define PME_ITA2_BUS_OOQ_LIVE_REQ_LO 148 { "BUS_OOQ_LIVE_REQ_LO", {0x99}, 0xf0, 7, {0xf00000}, "Out-of-order Bus Queue Requests (lower 3 bits)"}, #define PME_ITA2_BUS_RD_DATA_ANY 149 { "BUS_RD_DATA_ANY", {0x3008c}, 0xf0, 1, {0xf00000}, "Bus Read Data Transactions -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_RD_DATA_IO 150 { "BUS_RD_DATA_IO", {0x1008c}, 0xf0, 1, {0xf00000}, "Bus Read Data Transactions -- non-CPU priority agents"}, #define PME_ITA2_BUS_RD_DATA_SELF 151 { "BUS_RD_DATA_SELF", {0x2008c}, 0xf0, 1, {0xf00000}, "Bus Read Data Transactions -- local processor"}, #define PME_ITA2_BUS_RD_HIT 152 { "BUS_RD_HIT", {0x80}, 0xf0, 1, {0xf00000}, "Bus Read Hit Clean Non-local Cache Transactions"}, #define PME_ITA2_BUS_RD_HITM 153 { "BUS_RD_HITM", {0x81}, 0xf0, 1, {0xf00000}, "Bus Read Hit Modified Non-local Cache Transactions"}, #define PME_ITA2_BUS_RD_INVAL_ALL_HITM 154 { "BUS_RD_INVAL_ALL_HITM", {0x83}, 0xf0, 1, {0xf00000}, "Bus BRIL Burst Transaction Results in HITM"}, #define PME_ITA2_BUS_RD_INVAL_HITM 155 { "BUS_RD_INVAL_HITM", {0x82}, 0xf0, 1, {0xf00000}, "Bus BIL Transaction Results in HITM"}, #define PME_ITA2_BUS_RD_IO_ANY 156 { "BUS_RD_IO_ANY", {0x30091}, 0xf0, 1, {0xf00000}, "IA-32 Compatible IO Read Transactions -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_RD_IO_IO 157 { "BUS_RD_IO_IO", {0x10091}, 0xf0, 1, {0xf00000}, "IA-32 Compatible IO Read Transactions 
-- non-CPU priority agents"}, #define PME_ITA2_BUS_RD_IO_SELF 158 { "BUS_RD_IO_SELF", {0x20091}, 0xf0, 1, {0xf00000}, "IA-32 Compatible IO Read Transactions -- local processor"}, #define PME_ITA2_BUS_RD_PRTL_ANY 159 { "BUS_RD_PRTL_ANY", {0x3008d}, 0xf0, 1, {0xf00000}, "Bus Read Partial Transactions -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_RD_PRTL_IO 160 { "BUS_RD_PRTL_IO", {0x1008d}, 0xf0, 1, {0xf00000}, "Bus Read Partial Transactions -- non-CPU priority agents"}, #define PME_ITA2_BUS_RD_PRTL_SELF 161 { "BUS_RD_PRTL_SELF", {0x2008d}, 0xf0, 1, {0xf00000}, "Bus Read Partial Transactions -- local processor"}, #define PME_ITA2_BUS_SNOOPQ_REQ 162 { "BUS_SNOOPQ_REQ", {0x96}, 0xf0, 7, {0xf00000}, "Bus Snoop Queue Requests"}, #define PME_ITA2_BUS_SNOOPS_ANY 163 { "BUS_SNOOPS_ANY", {0x30086}, 0xf0, 1, {0xf00000}, "Bus Snoops Total -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_SNOOPS_IO 164 { "BUS_SNOOPS_IO", {0x10086}, 0xf0, 1, {0xf00000}, "Bus Snoops Total -- non-CPU priority agents"}, #define PME_ITA2_BUS_SNOOPS_SELF 165 { "BUS_SNOOPS_SELF", {0x20086}, 0xf0, 1, {0xf00000}, "Bus Snoops Total -- local processor"}, #define PME_ITA2_BUS_SNOOPS_HITM_ANY 166 { "BUS_SNOOPS_HITM_ANY", {0x30085}, 0xf0, 1, {0xf00000}, "Bus Snoops HIT Modified Cache Line -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_SNOOPS_HITM_SELF 167 { "BUS_SNOOPS_HITM_SELF", {0x20085}, 0xf0, 1, {0xf00000}, "Bus Snoops HIT Modified Cache Line -- local processor"}, #define PME_ITA2_BUS_SNOOP_STALL_CYCLES_ANY 168 { "BUS_SNOOP_STALL_CYCLES_ANY", {0x3008f}, 0xf0, 1, {0xf00000}, "Bus Snoop Stall Cycles (from any agent) -- CPU or non-CPU (all transactions)."}, #define PME_ITA2_BUS_SNOOP_STALL_CYCLES_SELF 169 { "BUS_SNOOP_STALL_CYCLES_SELF", {0x2008f}, 0xf0, 1, {0xf00000}, "Bus Snoop Stall Cycles (from any agent) -- local processor"}, #define PME_ITA2_BUS_WR_WB_ALL_ANY 170 { "BUS_WR_WB_ALL_ANY", {0xf0092}, 0xf0, 1, {0xf00000}, "Bus Write Back Transactions -- CPU or 
non-CPU (all transactions)."}, #define PME_ITA2_BUS_WR_WB_ALL_IO 171 { "BUS_WR_WB_ALL_IO", {0xd0092}, 0xf0, 1, {0xf00000}, "Bus Write Back Transactions -- non-CPU priority agents"}, #define PME_ITA2_BUS_WR_WB_ALL_SELF 172 { "BUS_WR_WB_ALL_SELF", {0xe0092}, 0xf0, 1, {0xf00000}, "Bus Write Back Transactions -- local processor"}, #define PME_ITA2_BUS_WR_WB_CCASTOUT_ANY 173 { "BUS_WR_WB_CCASTOUT_ANY", {0xb0092}, 0xf0, 1, {0xf00000}, "Bus Write Back Transactions -- CPU or non-CPU (all transactions)/Only 0-byte transactions with write back attribute (clean cast outs) will be counted"}, #define PME_ITA2_BUS_WR_WB_CCASTOUT_SELF 174 { "BUS_WR_WB_CCASTOUT_SELF", {0xa0092}, 0xf0, 1, {0xf00000}, "Bus Write Back Transactions -- local processor/Only 0-byte transactions with write back attribute (clean cast outs) will be counted"}, #define PME_ITA2_BUS_WR_WB_EQ_128BYTE_ANY 175 { "BUS_WR_WB_EQ_128BYTE_ANY", {0x70092}, 0xf0, 1, {0xf00000}, "Bus Write Back Transactions -- CPU or non-CPU (all transactions)./Only cache line transactions with write back or write coalesce attributes will be counted."}, #define PME_ITA2_BUS_WR_WB_EQ_128BYTE_IO 176 { "BUS_WR_WB_EQ_128BYTE_IO", {0x50092}, 0xf0, 1, {0xf00000}, "Bus Write Back Transactions -- non-CPU priority agents/Only cache line transactions with write back or write coalesce attributes will be counted."}, #define PME_ITA2_BUS_WR_WB_EQ_128BYTE_SELF 177 { "BUS_WR_WB_EQ_128BYTE_SELF", {0x60092}, 0xf0, 1, {0xf00000}, "Bus Write Back Transactions -- local processor/Only cache line transactions with write back or write coalesce attributes will be counted."}, #define PME_ITA2_CPU_CPL_CHANGES 178 { "CPU_CPL_CHANGES", {0x13}, 0xf0, 1, {0xf00000}, "Privilege Level Changes"}, #define PME_ITA2_CPU_CYCLES 179 { "CPU_CYCLES", {0x12}, 0xf0, 1, {0xf00000}, "CPU Cycles"}, #define PME_ITA2_DATA_DEBUG_REGISTER_FAULT 180 { "DATA_DEBUG_REGISTER_FAULT", {0x52}, 0xf0, 1, {0xf00000}, "Fault Due to Data Debug Reg. 
Match to Load/Store Instruction"}, #define PME_ITA2_DATA_DEBUG_REGISTER_MATCHES 181 { "DATA_DEBUG_REGISTER_MATCHES", {0xc6}, 0xf0, 1, {0xf00007}, "Data Debug Register Matches Data Address of Memory Reference."}, #define PME_ITA2_DATA_EAR_ALAT 182 { "DATA_EAR_ALAT", {0x6c8}, 0xf0, 1, {0xf00007}, "Data EAR ALAT"}, #define PME_ITA2_DATA_EAR_CACHE_LAT1024 183 { "DATA_EAR_CACHE_LAT1024", {0x805c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 1024 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT128 184 { "DATA_EAR_CACHE_LAT128", {0x505c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 128 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT16 185 { "DATA_EAR_CACHE_LAT16", {0x205c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 16 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT2048 186 { "DATA_EAR_CACHE_LAT2048", {0x905c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 2048 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT256 187 { "DATA_EAR_CACHE_LAT256", {0x605c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 256 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT32 188 { "DATA_EAR_CACHE_LAT32", {0x305c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 32 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT4 189 { "DATA_EAR_CACHE_LAT4", {0x5c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 4 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT4096 190 { "DATA_EAR_CACHE_LAT4096", {0xa05c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 4096 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT512 191 { "DATA_EAR_CACHE_LAT512", {0x705c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 512 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT64 192 { "DATA_EAR_CACHE_LAT64", {0x405c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 64 Cycles"}, #define PME_ITA2_DATA_EAR_CACHE_LAT8 193 { "DATA_EAR_CACHE_LAT8", {0x105c8}, 0xf0, 1, {0xf00007}, "Data EAR Cache -- >= 8 Cycles"}, #define PME_ITA2_DATA_EAR_EVENTS 194 { "DATA_EAR_EVENTS", {0xc8}, 0xf0, 1, {0xf00007}, "L1 Data Cache EAR Events"}, #define PME_ITA2_DATA_EAR_TLB_ALL 195 { "DATA_EAR_TLB_ALL", 
{0xe04c8}, 0xf0, 1, {0xf00007}, "Data EAR TLB -- All L1 DTLB Misses"}, #define PME_ITA2_DATA_EAR_TLB_FAULT 196 { "DATA_EAR_TLB_FAULT", {0x804c8}, 0xf0, 1, {0xf00007}, "Data EAR TLB -- DTLB Misses which produce a software fault"}, #define PME_ITA2_DATA_EAR_TLB_L2DTLB 197 { "DATA_EAR_TLB_L2DTLB", {0x204c8}, 0xf0, 1, {0xf00007}, "Data EAR TLB -- L1 DTLB Misses which hit L2 DTLB"}, #define PME_ITA2_DATA_EAR_TLB_L2DTLB_OR_FAULT 198 { "DATA_EAR_TLB_L2DTLB_OR_FAULT", {0xa04c8}, 0xf0, 1, {0xf00007}, "Data EAR TLB -- L1 DTLB Misses which hit L2 DTLB or produce a software fault"}, #define PME_ITA2_DATA_EAR_TLB_L2DTLB_OR_VHPT 199 { "DATA_EAR_TLB_L2DTLB_OR_VHPT", {0x604c8}, 0xf0, 1, {0xf00007}, "Data EAR TLB -- L1 DTLB Misses which hit L2 DTLB or VHPT"}, #define PME_ITA2_DATA_EAR_TLB_VHPT 200 { "DATA_EAR_TLB_VHPT", {0x404c8}, 0xf0, 1, {0xf00007}, "Data EAR TLB -- L1 DTLB Misses which hit VHPT"}, #define PME_ITA2_DATA_EAR_TLB_VHPT_OR_FAULT 201 { "DATA_EAR_TLB_VHPT_OR_FAULT", {0xc04c8}, 0xf0, 1, {0xf00007}, "Data EAR TLB -- L1 DTLB Misses which hit VHPT or produce a software fault"}, #define PME_ITA2_DATA_REFERENCES_SET0 202 { "DATA_REFERENCES_SET0", {0xc3}, 0xf0, 4, {0x5010007}, "Data Memory References Issued to Memory Pipeline"}, #define PME_ITA2_DATA_REFERENCES_SET1 203 { "DATA_REFERENCES_SET1", {0xc5}, 0xf0, 4, {0x5110007}, "Data Memory References Issued to Memory Pipeline"}, #define PME_ITA2_DISP_STALLED 204 { "DISP_STALLED", {0x49}, 0xf0, 1, {0xf00000}, "Number of Cycles Dispersal Stalled"}, #define PME_ITA2_DTLB_INSERTS_HPW 205 { "DTLB_INSERTS_HPW", {0xc9}, 0xf0, 4, {0xf00007}, "Hardware Page Walker Installs to DTLB"}, #define PME_ITA2_DTLB_INSERTS_HPW_RETIRED 206 { "DTLB_INSERTS_HPW_RETIRED", {0x2c}, 0xf0, 4, {0xf00007}, "VHPT Entries Inserted into DTLB by the Hardware Page Walker"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_ALL_ALL_PRED 207 { "ENCBR_MISPRED_DETAIL_ALL_ALL_PRED", {0x63}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- All encoded branches 
regardless of prediction result"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_ALL_CORRECT_PRED 208 { "ENCBR_MISPRED_DETAIL_ALL_CORRECT_PRED", {0x10063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- All encoded branches, correctly predicted branches (outcome and target)"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_ALL_WRONG_PATH 209 { "ENCBR_MISPRED_DETAIL_ALL_WRONG_PATH", {0x20063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- All encoded branches, mispredicted branches due to wrong branch direction"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_ALL_WRONG_TARGET 210 { "ENCBR_MISPRED_DETAIL_ALL_WRONG_TARGET", {0x30063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- All encoded branches, mispredicted branches due to wrong target for taken branches"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_ALL2_ALL_PRED 211 { "ENCBR_MISPRED_DETAIL_ALL2_ALL_PRED", {0xc0063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- Only non-return indirect branches, regardless of prediction result"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_ALL2_CORRECT_PRED 212 { "ENCBR_MISPRED_DETAIL_ALL2_CORRECT_PRED", {0xd0063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- Only non-return indirect branches, correctly predicted branches (outcome and target)"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_ALL2_WRONG_PATH 213 { "ENCBR_MISPRED_DETAIL_ALL2_WRONG_PATH", {0xe0063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- Only non-return indirect branches, mispredicted branches due to wrong branch direction"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_ALL2_WRONG_TARGET 214 { "ENCBR_MISPRED_DETAIL_ALL2_WRONG_TARGET", {0xf0063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- Only non-return indirect branches, mispredicted branches due to wrong target for taken branches"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_OVERSUB_ALL_PRED 215 { "ENCBR_MISPRED_DETAIL_OVERSUB_ALL_PRED", {0x80063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- 
Only return type branches, regardless of prediction result"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_OVERSUB_CORRECT_PRED 216 { "ENCBR_MISPRED_DETAIL_OVERSUB_CORRECT_PRED", {0x90063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- Only return type branches, correctly predicted branches (outcome and target)"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_OVERSUB_WRONG_PATH 217 { "ENCBR_MISPRED_DETAIL_OVERSUB_WRONG_PATH", {0xa0063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- Only return type branches, mispredicted branches due to wrong branch direction"}, #define PME_ITA2_ENCBR_MISPRED_DETAIL_OVERSUB_WRONG_TARGET 218 { "ENCBR_MISPRED_DETAIL_OVERSUB_WRONG_TARGET", {0xb0063}, 0xf0, 3, {0xf00003}, "Number of Encoded Branches Retired -- Only return type branches, mispredicted branches due to wrong target for taken branches"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_ALL 219 { "EXTERN_DP_PINS_0_TO_3_ALL", {0xf009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin3 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN0 220 { "EXTERN_DP_PINS_0_TO_3_PIN0", {0x1009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin0 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN1 221 { "EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN1", {0x3009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin0 or pin1 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN1_OR_PIN2 222 { "EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN1_OR_PIN2", {0x7009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin0 or pin1 or pin2 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN1_OR_PIN3 223 { "EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN1_OR_PIN3", {0xb009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin0 or pin1 or pin3 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN2 224 { "EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN2", {0x5009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin0 or pin2 assertion"}, #define 
PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN2_OR_PIN3 225 { "EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN2_OR_PIN3", {0xd009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin0 or pin2 or pin3 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN3 226 { "EXTERN_DP_PINS_0_TO_3_PIN0_OR_PIN3", {0x9009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin0 or pin3 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN1 227 { "EXTERN_DP_PINS_0_TO_3_PIN1", {0x2009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin1 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN1_OR_PIN2 228 { "EXTERN_DP_PINS_0_TO_3_PIN1_OR_PIN2", {0x6009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin1 or pin2 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN1_OR_PIN2_OR_PIN3 229 { "EXTERN_DP_PINS_0_TO_3_PIN1_OR_PIN2_OR_PIN3", {0xe009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin1 or pin2 or pin3 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN1_OR_PIN3 230 { "EXTERN_DP_PINS_0_TO_3_PIN1_OR_PIN3", {0xa009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin1 or pin3 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN2 231 { "EXTERN_DP_PINS_0_TO_3_PIN2", {0x4009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin2 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN2_OR_PIN3 232 { "EXTERN_DP_PINS_0_TO_3_PIN2_OR_PIN3", {0xc009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin2 or pin3 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_0_TO_3_PIN3 233 { "EXTERN_DP_PINS_0_TO_3_PIN3", {0x8009e}, 0xf0, 1, {0xf00000}, "DP Pins 0-3 Asserted -- include pin3 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_4_TO_5_ALL 234 { "EXTERN_DP_PINS_4_TO_5_ALL", {0x3009f}, 0xf0, 1, {0xf00000}, "DP Pins 4-5 Asserted -- include pin5 assertion"}, #define PME_ITA2_EXTERN_DP_PINS_4_TO_5_PIN4 235 { "EXTERN_DP_PINS_4_TO_5_PIN4", {0x1009f}, 0xf0, 1, {0xf00000}, "DP Pins 4-5 Asserted -- include pin4 assertion"}, #define 
PME_ITA2_EXTERN_DP_PINS_4_TO_5_PIN5 236 { "EXTERN_DP_PINS_4_TO_5_PIN5", {0x2009f}, 0xf0, 1, {0xf00000}, "DP Pins 4-5 Asserted -- include pin5 assertion"}, #define PME_ITA2_FE_BUBBLE_ALL 237 { "FE_BUBBLE_ALL", {0x71}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- count regardless of cause"}, #define PME_ITA2_FE_BUBBLE_ALLBUT_FEFLUSH_BUBBLE 238 { "FE_BUBBLE_ALLBUT_FEFLUSH_BUBBLE", {0xb0071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- ALL except FEFLUSH and BUBBLE"}, #define PME_ITA2_FE_BUBBLE_ALLBUT_IBFULL 239 { "FE_BUBBLE_ALLBUT_IBFULL", {0xc0071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- ALL except IBFULL"}, #define PME_ITA2_FE_BUBBLE_BRANCH 240 { "FE_BUBBLE_BRANCH", {0x90071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- only if caused by any of 4 branch recirculates"}, #define PME_ITA2_FE_BUBBLE_BUBBLE 241 { "FE_BUBBLE_BUBBLE", {0xd0071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- only if caused by branch bubble stall"}, #define PME_ITA2_FE_BUBBLE_FEFLUSH 242 { "FE_BUBBLE_FEFLUSH", {0x10071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- only if caused by a front-end flush"}, #define PME_ITA2_FE_BUBBLE_FILL_RECIRC 243 { "FE_BUBBLE_FILL_RECIRC", {0x80071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- only if caused by a recirculate for a cache line fill operation"}, #define PME_ITA2_FE_BUBBLE_GROUP1 244 { "FE_BUBBLE_GROUP1", {0x30071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- BUBBLE or BRANCH"}, #define PME_ITA2_FE_BUBBLE_GROUP2 245 { "FE_BUBBLE_GROUP2", {0x40071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- IMISS or TLBMISS"}, #define PME_ITA2_FE_BUBBLE_GROUP3 246 { "FE_BUBBLE_GROUP3", {0xa0071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- FILL_RECIRC or BRANCH"}, #define PME_ITA2_FE_BUBBLE_IBFULL 247 { "FE_BUBBLE_IBFULL", {0x50071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- only if caused by instruction buffer full stall"}, #define PME_ITA2_FE_BUBBLE_IMISS 248 { "FE_BUBBLE_IMISS", {0x60071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- only if caused by 
instruction cache miss stall"}, #define PME_ITA2_FE_BUBBLE_TLBMISS 249 { "FE_BUBBLE_TLBMISS", {0x70071}, 0xf0, 1, {0xf00000}, "Bubbles Seen by FE -- only if caused by TLB stall"}, #define PME_ITA2_FE_LOST_BW_ALL 250 { "FE_LOST_BW_ALL", {0x70}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- count regardless of cause"}, #define PME_ITA2_FE_LOST_BW_BI 251 { "FE_LOST_BW_BI", {0x90070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by branch initialization stall"}, #define PME_ITA2_FE_LOST_BW_BRQ 252 { "FE_LOST_BW_BRQ", {0xa0070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by branch retirement queue stall"}, #define PME_ITA2_FE_LOST_BW_BR_ILOCK 253 { "FE_LOST_BW_BR_ILOCK", {0xc0070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by branch interlock stall"}, #define PME_ITA2_FE_LOST_BW_BUBBLE 254 { "FE_LOST_BW_BUBBLE", {0xd0070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by branch resteer bubble stall"}, #define PME_ITA2_FE_LOST_BW_FEFLUSH 255 { "FE_LOST_BW_FEFLUSH", {0x10070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by a front-end flush"}, #define PME_ITA2_FE_LOST_BW_FILL_RECIRC 256 { "FE_LOST_BW_FILL_RECIRC", {0x80070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by a recirculate for a cache line fill operation"}, #define PME_ITA2_FE_LOST_BW_IBFULL 257 { "FE_LOST_BW_IBFULL", {0x50070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by instruction buffer full stall"}, #define PME_ITA2_FE_LOST_BW_IMISS 258 { "FE_LOST_BW_IMISS", {0x60070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by instruction cache miss stall"}, #define PME_ITA2_FE_LOST_BW_PLP 259 { "FE_LOST_BW_PLP", {0xb0070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by perfect loop prediction 
stall"}, #define PME_ITA2_FE_LOST_BW_TLBMISS 260 { "FE_LOST_BW_TLBMISS", {0x70070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by TLB stall"}, #define PME_ITA2_FE_LOST_BW_UNREACHED 261 { "FE_LOST_BW_UNREACHED", {0x40070}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Entrance to IB -- only if caused by unreachable bundle"}, #define PME_ITA2_FP_FAILED_FCHKF 262 { "FP_FAILED_FCHKF", {0x6}, 0xf0, 1, {0xf00001}, "Failed fchkf"}, #define PME_ITA2_FP_FALSE_SIRSTALL 263 { "FP_FALSE_SIRSTALL", {0x5}, 0xf0, 1, {0xf00001}, "SIR Stall Without a Trap"}, #define PME_ITA2_FP_FLUSH_TO_ZERO 264 { "FP_FLUSH_TO_ZERO", {0xb}, 0xf0, 2, {0xf00001}, "FP Result Flushed to Zero"}, #define PME_ITA2_FP_OPS_RETIRED 265 { "FP_OPS_RETIRED", {0x9}, 0xf0, 4, {0xf00001}, "Retired FP Operations"}, #define PME_ITA2_FP_TRUE_SIRSTALL 266 { "FP_TRUE_SIRSTALL", {0x3}, 0xf0, 1, {0xf00001}, "SIR stall asserted and leads to a trap"}, #define PME_ITA2_HPW_DATA_REFERENCES 267 { "HPW_DATA_REFERENCES", {0x2d}, 0xf0, 4, {0xf00007}, "Data Memory References to VHPT"}, #define PME_ITA2_IA32_INST_RETIRED 268 { "IA32_INST_RETIRED", {0x59}, 0xf0, 2, {0xf00000}, "IA-32 Instructions Retired"}, #define PME_ITA2_IA32_ISA_TRANSITIONS 269 { "IA32_ISA_TRANSITIONS", {0x7}, 0xf0, 1, {0xf00000}, "IA-64 to/from IA-32 ISA Transitions"}, #define PME_ITA2_IA64_INST_RETIRED 270 { "IA64_INST_RETIRED", {0x8}, 0xf0, 6, {0xf00003}, "Retired IA-64 Instructions, alias to IA64_INST_RETIRED_THIS"}, #define PME_ITA2_IA64_INST_RETIRED_THIS 271 { "IA64_INST_RETIRED_THIS", {0x8}, 0xf0, 6, {0xf00003}, "Retired IA-64 Instructions -- Retired IA-64 Instructions"}, #define PME_ITA2_IA64_TAGGED_INST_RETIRED_IBRP0_PMC8 272 { "IA64_TAGGED_INST_RETIRED_IBRP0_PMC8", {0x8}, 0xf0, 6, {0xf00003}, "Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 0 and opcode matcher PMC8. 
Code executed with PSR.is=1 is included."}, #define PME_ITA2_IA64_TAGGED_INST_RETIRED_IBRP1_PMC9 273 { "IA64_TAGGED_INST_RETIRED_IBRP1_PMC9", {0x10008}, 0xf0, 6, {0xf00003}, "Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 1 and opcode matcher PMC9. Code executed with PSR.is=1 is included."}, #define PME_ITA2_IA64_TAGGED_INST_RETIRED_IBRP2_PMC8 274 { "IA64_TAGGED_INST_RETIRED_IBRP2_PMC8", {0x20008}, 0xf0, 6, {0xf00003}, "Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 2 and opcode matcher PMC8. Code executed with PSR.is=1 is not included."}, #define PME_ITA2_IA64_TAGGED_INST_RETIRED_IBRP3_PMC9 275 { "IA64_TAGGED_INST_RETIRED_IBRP3_PMC9", {0x30008}, 0xf0, 6, {0xf00003}, "Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 3 and opcode matcher PMC9. Code executed with PSR.is=1 is not included."}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_ALL 276 { "IDEAL_BE_LOST_BW_DUE_TO_FE_ALL", {0x73}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- count regardless of cause"}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_BI 277 { "IDEAL_BE_LOST_BW_DUE_TO_FE_BI", {0x90073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by branch initialization stall"}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_BRQ 278 { "IDEAL_BE_LOST_BW_DUE_TO_FE_BRQ", {0xa0073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by branch retirement queue stall"}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_BR_ILOCK 279 { "IDEAL_BE_LOST_BW_DUE_TO_FE_BR_ILOCK", {0xc0073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by branch interlock stall"}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_BUBBLE 280 { "IDEAL_BE_LOST_BW_DUE_TO_FE_BUBBLE", {0xd0073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by branch resteer bubble stall"}, #define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_FEFLUSH 281 { 
"IDEAL_BE_LOST_BW_DUE_TO_FE_FEFLUSH", {0x10073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by a front-end flush"},
#define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_FILL_RECIRC 282
{ "IDEAL_BE_LOST_BW_DUE_TO_FE_FILL_RECIRC", {0x80073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by a recirculate for a cache line fill operation"},
#define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_IBFULL 283
{ "IDEAL_BE_LOST_BW_DUE_TO_FE_IBFULL", {0x50073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- (* meaningless for this event *)"},
#define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_IMISS 284
{ "IDEAL_BE_LOST_BW_DUE_TO_FE_IMISS", {0x60073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by instruction cache miss stall"},
#define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_PLP 285
{ "IDEAL_BE_LOST_BW_DUE_TO_FE_PLP", {0xb0073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by perfect loop prediction stall"},
#define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_TLBMISS 286
{ "IDEAL_BE_LOST_BW_DUE_TO_FE_TLBMISS", {0x70073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by TLB stall"},
#define PME_ITA2_IDEAL_BE_LOST_BW_DUE_TO_FE_UNREACHED 287
{ "IDEAL_BE_LOST_BW_DUE_TO_FE_UNREACHED", {0x40073}, 0xf0, 2, {0xf00000}, "Invalid Bundles at the Exit from IB -- only if caused by unreachable bundle"},
#define PME_ITA2_INST_CHKA_LDC_ALAT_ALL 288
{ "INST_CHKA_LDC_ALAT_ALL", {0x30056}, 0xf0, 2, {0xf00007}, "Retired chk.a and ld.c Instructions -- both integer and floating point instructions"},
#define PME_ITA2_INST_CHKA_LDC_ALAT_FP 289
{ "INST_CHKA_LDC_ALAT_FP", {0x20056}, 0xf0, 2, {0xf00007}, "Retired chk.a and ld.c Instructions -- only floating point instructions"},
#define PME_ITA2_INST_CHKA_LDC_ALAT_INT 290
{ "INST_CHKA_LDC_ALAT_INT", {0x10056}, 0xf0, 2, {0xf00007}, "Retired chk.a and ld.c Instructions -- only integer instructions"},
#define PME_ITA2_INST_DISPERSED 291
{ "INST_DISPERSED", {0x4d}, 0xf0, 6, {0xf00001}, "Syllables Dispersed from REN to REG stage"},
#define PME_ITA2_INST_FAILED_CHKA_LDC_ALAT_ALL 292
{ "INST_FAILED_CHKA_LDC_ALAT_ALL", {0x30057}, 0xf0, 1, {0xf00007}, "Failed chk.a and ld.c Instructions -- both integer and floating point instructions"},
#define PME_ITA2_INST_FAILED_CHKA_LDC_ALAT_FP 293
{ "INST_FAILED_CHKA_LDC_ALAT_FP", {0x20057}, 0xf0, 1, {0xf00007}, "Failed chk.a and ld.c Instructions -- only floating point instructions"},
#define PME_ITA2_INST_FAILED_CHKA_LDC_ALAT_INT 294
{ "INST_FAILED_CHKA_LDC_ALAT_INT", {0x10057}, 0xf0, 1, {0xf00007}, "Failed chk.a and ld.c Instructions -- only integer instructions"},
#define PME_ITA2_INST_FAILED_CHKS_RETIRED_ALL 295
{ "INST_FAILED_CHKS_RETIRED_ALL", {0x30055}, 0xf0, 1, {0xf00000}, "Failed chk.s Instructions -- both integer and floating point instructions"},
#define PME_ITA2_INST_FAILED_CHKS_RETIRED_FP 296
{ "INST_FAILED_CHKS_RETIRED_FP", {0x20055}, 0xf0, 1, {0xf00000}, "Failed chk.s Instructions -- only floating point instructions"},
#define PME_ITA2_INST_FAILED_CHKS_RETIRED_INT 297
{ "INST_FAILED_CHKS_RETIRED_INT", {0x10055}, 0xf0, 1, {0xf00000}, "Failed chk.s Instructions -- only integer instructions"},
#define PME_ITA2_ISB_BUNPAIRS_IN 298
{ "ISB_BUNPAIRS_IN", {0x46}, 0xf0, 1, {0xf00001}, "Bundle Pairs Written from L2 into FE"},
#define PME_ITA2_ITLB_MISSES_FETCH_ALL 299
{ "ITLB_MISSES_FETCH_ALL", {0x30047}, 0xf0, 1, {0xf00001}, "ITLB Misses Demand Fetch -- All TLB misses will be counted. Note that this is not equal to the sum of the L1ITLB and L2ITLB umasks because any access could be a miss in L1ITLB and L2ITLB."},
#define PME_ITA2_ITLB_MISSES_FETCH_L1ITLB 300
{ "ITLB_MISSES_FETCH_L1ITLB", {0x10047}, 0xf0, 1, {0xf00001}, "ITLB Misses Demand Fetch -- All misses in L1ITLB will be counted. Even if L1ITLB is not updated for an access (Uncacheable/nat page/not present page/faulting/some flushed), it will be counted here."},
#define PME_ITA2_ITLB_MISSES_FETCH_L2ITLB 301
{ "ITLB_MISSES_FETCH_L2ITLB", {0x20047}, 0xf0, 1, {0xf00001}, "ITLB Misses Demand Fetch -- All misses in L1ITLB which also missed in L2ITLB will be counted."},
#define PME_ITA2_L1DTLB_TRANSFER 302
{ "L1DTLB_TRANSFER", {0xc0}, 0xf0, 1, {0x5010007}, "L1DTLB Misses That Hit in the L2DTLB for Accesses Counted in L1D_READS"},
#define PME_ITA2_L1D_READS_SET0 303
{ "L1D_READS_SET0", {0xc2}, 0xf0, 2, {0x5010007}, "L1 Data Cache Reads"},
#define PME_ITA2_L1D_READS_SET1 304
{ "L1D_READS_SET1", {0xc4}, 0xf0, 2, {0x5110007}, "L1 Data Cache Reads"},
#define PME_ITA2_L1D_READ_MISSES_ALL 305
{ "L1D_READ_MISSES_ALL", {0xc7}, 0xf0, 2, {0x5110007}, "L1 Data Cache Read Misses -- all L1D read misses will be counted."},
#define PME_ITA2_L1D_READ_MISSES_RSE_FILL 306
{ "L1D_READ_MISSES_RSE_FILL", {0x100c7}, 0xf0, 2, {0x5110007}, "L1 Data Cache Read Misses -- only L1D read misses caused by RSE fills will be counted"},
#define PME_ITA2_L1ITLB_INSERTS_HPW 307
{ "L1ITLB_INSERTS_HPW", {0x48}, 0xf0, 1, {0xf00001}, "L1ITLB Hardware Page Walker Inserts"},
#define PME_ITA2_L1I_EAR_CACHE_LAT0 308
{ "L1I_EAR_CACHE_LAT0", {0x400343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- > 0 Cycles (All L1 Misses)"},
#define PME_ITA2_L1I_EAR_CACHE_LAT1024 309
{ "L1I_EAR_CACHE_LAT1024", {0xc00343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- >= 1024 Cycles"},
#define PME_ITA2_L1I_EAR_CACHE_LAT128 310
{ "L1I_EAR_CACHE_LAT128", {0xf00343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- >= 128 Cycles"},
#define PME_ITA2_L1I_EAR_CACHE_LAT16 311
{ "L1I_EAR_CACHE_LAT16", {0xfc0343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- >= 16 Cycles"},
#define PME_ITA2_L1I_EAR_CACHE_LAT256 312
{ "L1I_EAR_CACHE_LAT256", {0xe00343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- >= 256 Cycles"},
#define PME_ITA2_L1I_EAR_CACHE_LAT32 313
{ "L1I_EAR_CACHE_LAT32", {0xf80343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- >= 32 Cycles"},
#define PME_ITA2_L1I_EAR_CACHE_LAT4 314
{ "L1I_EAR_CACHE_LAT4", {0xff0343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- >= 4 Cycles"},
#define PME_ITA2_L1I_EAR_CACHE_LAT4096 315
{ "L1I_EAR_CACHE_LAT4096", {0x800343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- >= 4096 Cycles"},
#define PME_ITA2_L1I_EAR_CACHE_LAT8 316
{ "L1I_EAR_CACHE_LAT8", {0xfe0343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- >= 8 Cycles"},
#define PME_ITA2_L1I_EAR_CACHE_RAB 317
{ "L1I_EAR_CACHE_RAB", {0x343}, 0xf0, 1, {0xf00001}, "L1I EAR Cache -- RAB HIT"},
#define PME_ITA2_L1I_EAR_EVENTS 318
{ "L1I_EAR_EVENTS", {0x43}, 0xf0, 1, {0xf00001}, "Instruction EAR Events"},
#define PME_ITA2_L1I_EAR_TLB_ALL 319
{ "L1I_EAR_TLB_ALL", {0x70243}, 0xf0, 1, {0xf00001}, "L1I EAR TLB -- All L1 ITLB Misses"},
#define PME_ITA2_L1I_EAR_TLB_FAULT 320
{ "L1I_EAR_TLB_FAULT", {0x40243}, 0xf0, 1, {0xf00001}, "L1I EAR TLB -- ITLB Misses which produced a fault"},
#define PME_ITA2_L1I_EAR_TLB_L2TLB 321
{ "L1I_EAR_TLB_L2TLB", {0x10243}, 0xf0, 1, {0xf00001}, "L1I EAR TLB -- L1 ITLB Misses which hit L2 ITLB"},
#define PME_ITA2_L1I_EAR_TLB_L2TLB_OR_FAULT 322
{ "L1I_EAR_TLB_L2TLB_OR_FAULT", {0x50243}, 0xf0, 1, {0xf00001}, "L1I EAR TLB -- L1 ITLB Misses which hit L2 ITLB or produce a software fault"},
#define PME_ITA2_L1I_EAR_TLB_L2TLB_OR_VHPT 323
{ "L1I_EAR_TLB_L2TLB_OR_VHPT", {0x30243}, 0xf0, 1, {0xf00001}, "L1I EAR TLB -- L1 ITLB Misses which hit L2 ITLB or VHPT"},
#define PME_ITA2_L1I_EAR_TLB_VHPT 324
{ "L1I_EAR_TLB_VHPT", {0x20243}, 0xf0, 1, {0xf00001}, "L1I EAR TLB -- L1 ITLB Misses which hit VHPT"},
#define PME_ITA2_L1I_EAR_TLB_VHPT_OR_FAULT 325
{ "L1I_EAR_TLB_VHPT_OR_FAULT", {0x60243}, 0xf0, 1, {0xf00001}, "L1I EAR TLB -- L1 ITLB Misses which hit VHPT or produce a software fault"},
#define PME_ITA2_L1I_FETCH_ISB_HIT 326
{ "L1I_FETCH_ISB_HIT", {0x66}, 0xf0, 1, {0xf00001}, "\"Just-In-Time\" Instruction Fetch Hitting in and Being Bypassed from ISB"},
#define PME_ITA2_L1I_FETCH_RAB_HIT 327
{
"L1I_FETCH_RAB_HIT", {0x65}, 0xf0, 1, {0xf00001}, "Instruction Fetch Hitting in RAB"},
#define PME_ITA2_L1I_FILLS 328
{ "L1I_FILLS", {0x41}, 0xf0, 1, {0xf00001}, "L1 Instruction Cache Fills"},
#define PME_ITA2_L1I_PREFETCHES 329
{ "L1I_PREFETCHES", {0x44}, 0xf0, 1, {0xf00001}, "L1 Instruction Prefetch Requests"},
#define PME_ITA2_L1I_PREFETCH_STALL_ALL 330
{ "L1I_PREFETCH_STALL_ALL", {0x30067}, 0xf0, 1, {0xf00000}, "Prefetch Pipeline Stalls -- Number of clocks prefetch pipeline is stalled"},
#define PME_ITA2_L1I_PREFETCH_STALL_FLOW 331
{ "L1I_PREFETCH_STALL_FLOW", {0x20067}, 0xf0, 1, {0xf00000}, "Prefetch Pipeline Stalls -- Number of clocks flow is not asserted"},
#define PME_ITA2_L1I_PURGE 332
{ "L1I_PURGE", {0x4b}, 0xf0, 1, {0xf00001}, "L1ITLB Purges Handled by L1I"},
#define PME_ITA2_L1I_PVAB_OVERFLOW 333
{ "L1I_PVAB_OVERFLOW", {0x69}, 0xf0, 1, {0xf00000}, "PVAB Overflow"},
#define PME_ITA2_L1I_RAB_ALMOST_FULL 334
{ "L1I_RAB_ALMOST_FULL", {0x64}, 0xf0, 1, {0xf00000}, "Is RAB Almost Full?"},
#define PME_ITA2_L1I_RAB_FULL 335
{ "L1I_RAB_FULL", {0x60}, 0xf0, 1, {0xf00000}, "Is RAB Full?"},
#define PME_ITA2_L1I_READS 336
{ "L1I_READS", {0x40}, 0xf0, 1, {0xf00001}, "L1 Instruction Cache Reads"},
#define PME_ITA2_L1I_SNOOP 337
{ "L1I_SNOOP", {0x4a}, 0xf0, 1, {0xf00007}, "Snoop Requests Handled by L1I"},
#define PME_ITA2_L1I_STRM_PREFETCHES 338
{ "L1I_STRM_PREFETCHES", {0x5f}, 0xf0, 1, {0xf00001}, "L1 Instruction Cache Line Prefetch Requests"},
#define PME_ITA2_L2DTLB_MISSES 339
{ "L2DTLB_MISSES", {0xc1}, 0xf0, 4, {0x5010007}, "L2DTLB Misses"},
#define PME_ITA2_L2_BAD_LINES_SELECTED_ANY 340
{ "L2_BAD_LINES_SELECTED_ANY", {0xb9}, 0xf0, 4, {0x4320007}, "Valid Line Replaced When Invalid Line Is Available -- Valid line replaced when invalid line is available"},
#define PME_ITA2_L2_BYPASS_L2_DATA1 341
{ "L2_BYPASS_L2_DATA1", {0xb8}, 0xf0, 1, {0x4320007}, "Count L2 Bypasses -- Count only L2 data bypasses (L1D to L2A)"},
#define PME_ITA2_L2_BYPASS_L2_DATA2 342
{ "L2_BYPASS_L2_DATA2", {0x100b8}, 0xf0, 1, {0x4320007}, "Count L2 Bypasses -- Count only L2 data bypasses (L1W to L2I)"},
#define PME_ITA2_L2_BYPASS_L2_INST1 343
{ "L2_BYPASS_L2_INST1", {0x400b8}, 0xf0, 1, {0x4320007}, "Count L2 Bypasses -- Count only L2 instruction bypasses (L1D to L2A)"},
#define PME_ITA2_L2_BYPASS_L2_INST2 344
{ "L2_BYPASS_L2_INST2", {0x500b8}, 0xf0, 1, {0x4320007}, "Count L2 Bypasses -- Count only L2 instruction bypasses (L1W to L2I)"},
#define PME_ITA2_L2_BYPASS_L3_DATA1 345
{ "L2_BYPASS_L3_DATA1", {0x200b8}, 0xf0, 1, {0x4320007}, "Count L2 Bypasses -- Count only L3 data bypasses (L1D to L2A)"},
#define PME_ITA2_L2_BYPASS_L3_INST1 346
{ "L2_BYPASS_L3_INST1", {0x600b8}, 0xf0, 1, {0x4320007}, "Count L2 Bypasses -- Count only L3 instruction bypasses (L1D to L2A)"},
#define PME_ITA2_L2_DATA_REFERENCES_L2_ALL 347
{ "L2_DATA_REFERENCES_L2_ALL", {0x300b2}, 0xf0, 4, {0x4120007}, "Data Read/Write Access to L2 -- count both read and write operations (semaphores will count as 2)"},
#define PME_ITA2_L2_DATA_REFERENCES_L2_DATA_READS 348
{ "L2_DATA_REFERENCES_L2_DATA_READS", {0x100b2}, 0xf0, 4, {0x4120007}, "Data Read/Write Access to L2 -- count only data read and semaphore operations."},
#define PME_ITA2_L2_DATA_REFERENCES_L2_DATA_WRITES 349
{ "L2_DATA_REFERENCES_L2_DATA_WRITES", {0x200b2}, 0xf0, 4, {0x4120007}, "Data Read/Write Access to L2 -- count only data write and semaphore operations"},
#define PME_ITA2_L2_FILLB_FULL_THIS 350
{ "L2_FILLB_FULL_THIS", {0xbf}, 0xf0, 1, {0x4520000}, "L2D Fill Buffer Is Full -- L2 Fill buffer is full"},
#define PME_ITA2_L2_FORCE_RECIRC_ANY 351
{ "L2_FORCE_RECIRC_ANY", {0xb4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count forced recirculates regardless of cause. SMC_HIT, TRAN_PREF & SNP_OR_L3 will not be included here."},
#define PME_ITA2_L2_FORCE_RECIRC_FILL_HIT 352
{ "L2_FORCE_RECIRC_FILL_HIT", {0x900b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count only those caused by an L2 miss which hit in the fill buffer."},
#define PME_ITA2_L2_FORCE_RECIRC_FRC_RECIRC 353
{ "L2_FORCE_RECIRC_FRC_RECIRC", {0xe00b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- caused by an L2 miss when a force recirculate already existed"},
#define PME_ITA2_L2_FORCE_RECIRC_IPF_MISS 354
{ "L2_FORCE_RECIRC_IPF_MISS", {0xa00b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- caused by L2 miss when instruction prefetch buffer miss already existed"},
#define PME_ITA2_L2_FORCE_RECIRC_L1W 355
{ "L2_FORCE_RECIRC_L1W", {0x200b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count only those caused by forced limbo"},
#define PME_ITA2_L2_FORCE_RECIRC_OZQ_MISS 356
{ "L2_FORCE_RECIRC_OZQ_MISS", {0xc00b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- caused by an L2 miss when an OZQ miss already existed"},
#define PME_ITA2_L2_FORCE_RECIRC_SAME_INDEX 357
{ "L2_FORCE_RECIRC_SAME_INDEX", {0xd00b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- caused by an L2 miss when a miss to the same index already existed"},
#define PME_ITA2_L2_FORCE_RECIRC_SMC_HIT 358
{ "L2_FORCE_RECIRC_SMC_HIT", {0x100b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count only those caused by SMC hits due to an ifetch and load to same cache line or a pending WT store"},
#define PME_ITA2_L2_FORCE_RECIRC_SNP_OR_L3 359
{ "L2_FORCE_RECIRC_SNP_OR_L3", {0x600b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count only those caused by a snoop or L3 issue"},
#define PME_ITA2_L2_FORCE_RECIRC_TAG_NOTOK 360
{ "L2_FORCE_RECIRC_TAG_NOTOK", {0x400b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count only those caused by L2 hits caused by in flight snoops, stores with a sibling miss to the same index, sibling probe to the same line or pending sync.ia instructions."},
#define PME_ITA2_L2_FORCE_RECIRC_TRAN_PREF 361
{ "L2_FORCE_RECIRC_TRAN_PREF", {0x500b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count only those caused by transforms to prefetches"},
#define PME_ITA2_L2_FORCE_RECIRC_VIC_BUF_FULL 362
{ "L2_FORCE_RECIRC_VIC_BUF_FULL", {0xb00b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count only those caused by an L2 miss with victim buffer full"},
#define PME_ITA2_L2_FORCE_RECIRC_VIC_PEND 363
{ "L2_FORCE_RECIRC_VIC_PEND", {0x800b4}, 0x10, 4, {0x4220007}, "Forced Recirculates -- count only those caused by an L2 miss with pending victim"},
#define PME_ITA2_L2_GOT_RECIRC_IFETCH_ANY 364
{ "L2_GOT_RECIRC_IFETCH_ANY", {0x800ba}, 0xf0, 1, {0x4420007}, "Instruction Fetch Recirculates Received by L2D -- Instruction fetch recirculates received by L2"},
#define PME_ITA2_L2_GOT_RECIRC_OZQ_ACC 365
{ "L2_GOT_RECIRC_OZQ_ACC", {0xb6}, 0xf0, 1, {0x4220007}, "Counts Number of OZQ Accesses Recirculated to L1D"},
#define PME_ITA2_L2_IFET_CANCELS_ANY 366
{ "L2_IFET_CANCELS_ANY", {0xa1}, 0xf0, 1, {0x4020007}, "Instruction Fetch Cancels by the L2 -- total instruction fetch cancels by L2"},
#define PME_ITA2_L2_IFET_CANCELS_BYPASS 367
{ "L2_IFET_CANCELS_BYPASS", {0x200a1}, 0xf0, 1, {0x4020007}, "Instruction Fetch Cancels by the L2 -- ifetch cancels due to bypassing"},
#define PME_ITA2_L2_IFET_CANCELS_CHG_PRIO 368
{ "L2_IFET_CANCELS_CHG_PRIO", {0xc00a1}, 0xf0, 1, {0x4020007}, "Instruction Fetch Cancels by the L2 -- ifetch cancels due to change priority"},
#define PME_ITA2_L2_IFET_CANCELS_DATA_RD 369
{ "L2_IFET_CANCELS_DATA_RD", {0x700a1}, 0xf0, 1, {0x4020007}, "Instruction Fetch Cancels by the L2 -- ifetch/prefetch cancels due to a data read"},
#define PME_ITA2_L2_IFET_CANCELS_DIDNT_RECIR 370
{ "L2_IFET_CANCELS_DIDNT_RECIR", {0x400a1}, 0xf0, 1, {0x4020007}, "Instruction Fetch Cancels by the L2 -- ifetch cancels because it did not recirculate"},
#define PME_ITA2_L2_IFET_CANCELS_IFETCH_BYP 371
{ "L2_IFET_CANCELS_IFETCH_BYP", {0xd00a1}, 0xf0, 1,
{0x4020007}, "Instruction Fetch Cancels by the L2 -- due to ifetch bypass during last clock"},
#define PME_ITA2_L2_IFET_CANCELS_PREEMPT 372
{ "L2_IFET_CANCELS_PREEMPT", {0x800a1}, 0xf0, 1, {0x4020007}, "Instruction Fetch Cancels by the L2 -- ifetch cancels due to preempts"},
#define PME_ITA2_L2_IFET_CANCELS_RECIR_OVER_SUB 373
{ "L2_IFET_CANCELS_RECIR_OVER_SUB", {0x500a1}, 0xf0, 1, {0x4020007}, "Instruction Fetch Cancels by the L2 -- ifetch cancels because of recirculate oversubscription"},
#define PME_ITA2_L2_IFET_CANCELS_ST_FILL_WB 374
{ "L2_IFET_CANCELS_ST_FILL_WB", {0x600a1}, 0xf0, 1, {0x4020007}, "Instruction Fetch Cancels by the L2 -- ifetch cancels due to a store or fill or write back"},
#define PME_ITA2_L2_INST_DEMAND_READS 375
{ "L2_INST_DEMAND_READS", {0x42}, 0xf0, 1, {0xf00001}, "L2 Instruction Demand Fetch Requests"},
#define PME_ITA2_L2_INST_PREFETCHES 376
{ "L2_INST_PREFETCHES", {0x45}, 0xf0, 1, {0xf00001}, "L2 Instruction Prefetch Requests"},
#define PME_ITA2_L2_ISSUED_RECIRC_IFETCH_ANY 377
{ "L2_ISSUED_RECIRC_IFETCH_ANY", {0x800b9}, 0xf0, 1, {0x4420007}, "Instruction Fetch Recirculates Issued by L2 -- Instruction fetch recirculates issued by L2"},
#define PME_ITA2_L2_ISSUED_RECIRC_OZQ_ACC 378
{ "L2_ISSUED_RECIRC_OZQ_ACC", {0xb5}, 0xf0, 1, {0x4220007}, "Count Number of Times a Recirculate Issue Was Attempted and Not Preempted"},
#define PME_ITA2_L2_L3ACCESS_CANCEL_ANY 379
{ "L2_L3ACCESS_CANCEL_ANY", {0x900b0}, 0x10, 1, {0x4120007}, "Canceled L3 Accesses -- count cancels due to any reason. This umask will count more than the sum of all the other umasks. It will count things that weren't committed accesses when they reached L1w, but the L2 attempted to bypass them to the L3 anyway (speculatively). This will include accesses made repeatedly while the main pipeline is stalled and the L1d is attempting to recirculate an access down the L1d pipeline. Thus, an access could get counted many times before it really does get bypassed to the L3. It is a measure of how many times we asserted a request to the L3 but didn't confirm it."},
#define PME_ITA2_L2_L3ACCESS_CANCEL_DFETCH 380
{ "L2_L3ACCESS_CANCEL_DFETCH", {0xa00b0}, 0x10, 1, {0x4120007}, "Canceled L3 Accesses -- data fetches"},
#define PME_ITA2_L2_L3ACCESS_CANCEL_EBL_REJECT 381
{ "L2_L3ACCESS_CANCEL_EBL_REJECT", {0x800b0}, 0x10, 1, {0x4120007}, "Canceled L3 Accesses -- ebl rejects"},
#define PME_ITA2_L2_L3ACCESS_CANCEL_FILLD_FULL 382
{ "L2_L3ACCESS_CANCEL_FILLD_FULL", {0x200b0}, 0x10, 1, {0x4120007}, "Canceled L3 Accesses -- filld being full"},
#define PME_ITA2_L2_L3ACCESS_CANCEL_IFETCH 383
{ "L2_L3ACCESS_CANCEL_IFETCH", {0xb00b0}, 0xf0, 1, {0x4120007}, "Canceled L3 Accesses -- instruction fetches"},
#define PME_ITA2_L2_L3ACCESS_CANCEL_INV_L3_BYP 384
{ "L2_L3ACCESS_CANCEL_INV_L3_BYP", {0x600b0}, 0x10, 1, {0x4120007}, "Canceled L3 Accesses -- invalid L3 bypasses"},
#define PME_ITA2_L2_L3ACCESS_CANCEL_SPEC_L3_BYP 385
{ "L2_L3ACCESS_CANCEL_SPEC_L3_BYP", {0x100b0}, 0x10, 1, {0x4120007}, "Canceled L3 Accesses -- speculative L3 bypasses"},
#define PME_ITA2_L2_L3ACCESS_CANCEL_UC_BLOCKED 386
{ "L2_L3ACCESS_CANCEL_UC_BLOCKED", {0x500b0}, 0x10, 1, {0x4120007}, "Canceled L3 Accesses -- Uncacheable blocked L3 Accesses"},
#define PME_ITA2_L2_MISSES 387
{ "L2_MISSES", {0xcb}, 0xf0, 1, {0xf00007}, "L2 Misses"},
#define PME_ITA2_L2_OPS_ISSUED_FP_LOAD 388
{ "L2_OPS_ISSUED_FP_LOAD", {0x900b8}, 0xf0, 4, {0x4420007}, "Different Operations Issued by L2D -- Count only valid floating point loads"},
#define PME_ITA2_L2_OPS_ISSUED_INT_LOAD 389
{ "L2_OPS_ISSUED_INT_LOAD", {0x800b8}, 0xf0, 4, {0x4420007}, "Different Operations Issued by L2D -- Count only valid integer loads"},
#define PME_ITA2_L2_OPS_ISSUED_NST_NLD 390
{ "L2_OPS_ISSUED_NST_NLD", {0xc00b8}, 0xf0, 4, {0x4420007}, "Different Operations Issued by L2D -- Count only valid non-load, no-store accesses"},
#define PME_ITA2_L2_OPS_ISSUED_RMW 391
{ "L2_OPS_ISSUED_RMW", {0xa00b8}, 0xf0, 4, {0x4420007}, "Different Operations Issued by L2D -- Count only valid read_modify_write stores"},
#define PME_ITA2_L2_OPS_ISSUED_STORE 392
{ "L2_OPS_ISSUED_STORE", {0xb00b8}, 0xf0, 4, {0x4420007}, "Different Operations Issued by L2D -- Count only valid non-read_modify_write stores"},
#define PME_ITA2_L2_OZDB_FULL_THIS 393
{ "L2_OZDB_FULL_THIS", {0xbd}, 0xf0, 1, {0x4520000}, "L2 OZ Data Buffer Is Full -- L2 OZ Data Buffer is full"},
#define PME_ITA2_L2_OZQ_ACQUIRE 394
{ "L2_OZQ_ACQUIRE", {0xa2}, 0xf0, 1, {0x4020000}, "Clocks With Acquire Ordering Attribute Existed in L2 OZQ"},
#define PME_ITA2_L2_OZQ_CANCELS0_ANY 395
{ "L2_OZQ_CANCELS0_ANY", {0xa0}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Late or Any) -- counts the total OZ Queue cancels"},
#define PME_ITA2_L2_OZQ_CANCELS0_LATE_ACQUIRE 396
{ "L2_OZQ_CANCELS0_LATE_ACQUIRE", {0x300a0}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Late or Any) -- counts the late cancels caused by acquires"},
#define PME_ITA2_L2_OZQ_CANCELS0_LATE_BYP_EFFRELEASE 397
{ "L2_OZQ_CANCELS0_LATE_BYP_EFFRELEASE", {0x400a0}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Late or Any) -- counts the late cancels caused by L1D to L2A bypass effective releases"},
#define PME_ITA2_L2_OZQ_CANCELS0_LATE_RELEASE 398
{ "L2_OZQ_CANCELS0_LATE_RELEASE", {0x200a0}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Late or Any) -- counts the late cancels caused by releases"},
#define PME_ITA2_L2_OZQ_CANCELS0_LATE_SPEC_BYP 399
{ "L2_OZQ_CANCELS0_LATE_SPEC_BYP", {0x100a0}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Late or Any) -- counts the late cancels caused by speculative bypasses"},
#define PME_ITA2_L2_OZQ_CANCELS1_BANK_CONF 400
{ "L2_OZQ_CANCELS1_BANK_CONF", {0x100ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- bank conflicts"},
#define PME_ITA2_L2_OZQ_CANCELS1_CANC_L2M_ST 401
{ "L2_OZQ_CANCELS1_CANC_L2M_ST", {0x600ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- caused by a canceled store in L2M"},
#define PME_ITA2_L2_OZQ_CANCELS1_CCV 402
{ "L2_OZQ_CANCELS1_CCV",
{0x900ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- a ccv"},
#define PME_ITA2_L2_OZQ_CANCELS1_ECC 403
{ "L2_OZQ_CANCELS1_ECC", {0xf00ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- ECC hardware detecting a problem"},
#define PME_ITA2_L2_OZQ_CANCELS1_HPW_IFETCH_CONF 404
{ "L2_OZQ_CANCELS1_HPW_IFETCH_CONF", {0x500ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- an ifetch conflict (canceling HPW?)"},
#define PME_ITA2_L2_OZQ_CANCELS1_L1DF_L2M 405
{ "L2_OZQ_CANCELS1_L1DF_L2M", {0xe00ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- L1D fill in L2M"},
#define PME_ITA2_L2_OZQ_CANCELS1_L1_FILL_CONF 406
{ "L2_OZQ_CANCELS1_L1_FILL_CONF", {0x700ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- an L1 fill conflict"},
#define PME_ITA2_L2_OZQ_CANCELS1_L2A_ST_MAT 407
{ "L2_OZQ_CANCELS1_L2A_ST_MAT", {0xd00ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- a store match in L2A"},
#define PME_ITA2_L2_OZQ_CANCELS1_L2D_ST_MAT 408
{ "L2_OZQ_CANCELS1_L2D_ST_MAT", {0x200ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- a store match in L2D"},
#define PME_ITA2_L2_OZQ_CANCELS1_L2M_ST_MAT 409
{ "L2_OZQ_CANCELS1_L2M_ST_MAT", {0xb00ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- a store match in L2M"},
#define PME_ITA2_L2_OZQ_CANCELS1_MFA 410
{ "L2_OZQ_CANCELS1_MFA", {0xc00ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- a memory fence instruction"},
#define PME_ITA2_L2_OZQ_CANCELS1_REL 411
{ "L2_OZQ_CANCELS1_REL", {0xac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- caused by release"},
#define PME_ITA2_L2_OZQ_CANCELS1_SEM 412
{ "L2_OZQ_CANCELS1_SEM", {0xa00ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- a semaphore"},
#define PME_ITA2_L2_OZQ_CANCELS1_ST_FILL_CONF 413
{ "L2_OZQ_CANCELS1_ST_FILL_CONF", {0x800ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- a store fill conflict"},
#define PME_ITA2_L2_OZQ_CANCELS1_SYNC 414
{ "L2_OZQ_CANCELS1_SYNC", {0x400ac}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 1) -- caused by sync.i"},
#define PME_ITA2_L2_OZQ_CANCELS2_ACQ 415
{ "L2_OZQ_CANCELS2_ACQ", {0x400a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- caused by an acquire"},
#define PME_ITA2_L2_OZQ_CANCELS2_CANC_L2C_ST 416
{ "L2_OZQ_CANCELS2_CANC_L2C_ST", {0x100a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- caused by a canceled store in L2C"},
#define PME_ITA2_L2_OZQ_CANCELS2_CANC_L2D_ST 417
{ "L2_OZQ_CANCELS2_CANC_L2D_ST", {0xd00a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- caused by a canceled store in L2D"},
#define PME_ITA2_L2_OZQ_CANCELS2_DIDNT_RECIRC 418
{ "L2_OZQ_CANCELS2_DIDNT_RECIRC", {0x900a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- caused because it did not recirculate"},
#define PME_ITA2_L2_OZQ_CANCELS2_D_IFET 419
{ "L2_OZQ_CANCELS2_D_IFET", {0xf00a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- a demand ifetch"},
#define PME_ITA2_L2_OZQ_CANCELS2_L2C_ST_MAT 420
{ "L2_OZQ_CANCELS2_L2C_ST_MAT", {0x200a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- a store match in L2C"},
#define PME_ITA2_L2_OZQ_CANCELS2_L2FILL_ST_CONF 421
{ "L2_OZQ_CANCELS2_L2FILL_ST_CONF", {0x800a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- an L2fill and store conflict in L2C"},
#define PME_ITA2_L2_OZQ_CANCELS2_OVER_SUB 422
{ "L2_OZQ_CANCELS2_OVER_SUB", {0xc00a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- oversubscription"},
#define PME_ITA2_L2_OZQ_CANCELS2_OZ_DATA_CONF 423
{ "L2_OZQ_CANCELS2_OZ_DATA_CONF", {0x600a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- an OZ data conflict"},
#define PME_ITA2_L2_OZQ_CANCELS2_READ_WB_CONF 424
{ "L2_OZQ_CANCELS2_READ_WB_CONF", {0x500a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- a write back conflict (canceling read?)"},
#define PME_ITA2_L2_OZQ_CANCELS2_RECIRC_OVER_SUB 425
{ "L2_OZQ_CANCELS2_RECIRC_OVER_SUB", {0xa8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- caused by a recirculate oversubscription"},
#define PME_ITA2_L2_OZQ_CANCELS2_SCRUB 426
{ "L2_OZQ_CANCELS2_SCRUB", {0x300a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- 32/64 byte HPW/L2D fill which needs scrub"},
#define PME_ITA2_L2_OZQ_CANCELS2_WEIRD 427
{ "L2_OZQ_CANCELS2_WEIRD", {0xa00a8}, 0xf0, 4, {0x4020007}, "L2 OZQ Cancels (Specific Reason Set 2) -- counts the cancels caused by attempted 5-cycle bypasses for non-aligned accesses and bypasses blocking recirculates for too long"},
#define PME_ITA2_L2_OZQ_FULL_THIS 428
{ "L2_OZQ_FULL_THIS", {0xbc}, 0xf0, 1, {0x4520000}, "L2D OZQ Is Full -- L2D OZQ is full"},
#define PME_ITA2_L2_OZQ_RELEASE 429
{ "L2_OZQ_RELEASE", {0xa3}, 0xf0, 1, {0x4020000}, "Clocks With Release Ordering Attribute Existed in L2 OZQ"},
#define PME_ITA2_L2_REFERENCES 430
{ "L2_REFERENCES", {0xb1}, 0xf0, 4, {0x4120007}, "Requests Made To L2"},
#define PME_ITA2_L2_STORE_HIT_SHARED_ANY 431
{ "L2_STORE_HIT_SHARED_ANY", {0xba}, 0xf0, 2, {0x4320007}, "Store Hit a Shared Line -- Store hit a shared line"},
#define PME_ITA2_L2_SYNTH_PROBE 432
{ "L2_SYNTH_PROBE", {0xb7}, 0xf0, 1, {0x4220007}, "Synthesized Probe"},
#define PME_ITA2_L2_VICTIMB_FULL_THIS 433
{ "L2_VICTIMB_FULL_THIS", {0xbe}, 0xf0, 1, {0x4520000}, "L2D Victim Buffer Is Full -- L2D victim buffer is full"},
#define PME_ITA2_L3_LINES_REPLACED 434
{ "L3_LINES_REPLACED", {0xdf}, 0xf0, 1, {0xf00000}, "L3 Cache Lines Replaced"},
#define PME_ITA2_L3_MISSES 435
{ "L3_MISSES", {0xdc}, 0xf0, 1, {0xf00007}, "L3 Misses"},
#define PME_ITA2_L3_READS_ALL_ALL 436
{ "L3_READS_ALL_ALL", {0xf00dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Read References"},
#define PME_ITA2_L3_READS_ALL_HIT 437
{ "L3_READS_ALL_HIT", {0xd00dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Read Hits"},
#define PME_ITA2_L3_READS_ALL_MISS 438
{ "L3_READS_ALL_MISS", {0xe00dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Read Misses"},
#define PME_ITA2_L3_READS_DATA_READ_ALL 439
{ "L3_READS_DATA_READ_ALL", {0xb00dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Load References (excludes reads for ownership used to satisfy stores)"},
#define PME_ITA2_L3_READS_DATA_READ_HIT 440
{ "L3_READS_DATA_READ_HIT", {0x900dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Load Hits (excludes reads for ownership used to satisfy stores)"},
#define PME_ITA2_L3_READS_DATA_READ_MISS 441
{ "L3_READS_DATA_READ_MISS", {0xa00dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Load Misses (excludes reads for ownership used to satisfy stores)"},
#define PME_ITA2_L3_READS_DINST_FETCH_ALL 442
{ "L3_READS_DINST_FETCH_ALL", {0x300dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Demand Instruction References"},
#define PME_ITA2_L3_READS_DINST_FETCH_HIT 443
{ "L3_READS_DINST_FETCH_HIT", {0x100dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Demand Instruction Fetch Hits"},
#define PME_ITA2_L3_READS_DINST_FETCH_MISS 444
{ "L3_READS_DINST_FETCH_MISS", {0x200dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Demand Instruction Fetch Misses"},
#define PME_ITA2_L3_READS_INST_FETCH_ALL 445
{ "L3_READS_INST_FETCH_ALL", {0x700dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Instruction Fetch and Prefetch References"},
#define PME_ITA2_L3_READS_INST_FETCH_HIT 446
{ "L3_READS_INST_FETCH_HIT", {0x500dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Instruction Fetch and Prefetch Hits"},
#define PME_ITA2_L3_READS_INST_FETCH_MISS 447
{ "L3_READS_INST_FETCH_MISS", {0x600dd}, 0xf0, 1, {0xf00007}, "L3 Reads -- L3 Instruction Fetch and Prefetch Misses"},
#define PME_ITA2_L3_REFERENCES 448
{ "L3_REFERENCES", {0xdb}, 0xf0, 1, {0xf00007}, "L3 References"},
#define PME_ITA2_L3_WRITES_ALL_ALL 449
{ "L3_WRITES_ALL_ALL", {0xf00de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L3 Write References"},
#define PME_ITA2_L3_WRITES_ALL_HIT 450
{ "L3_WRITES_ALL_HIT", {0xd00de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L3 Write Hits"},
#define PME_ITA2_L3_WRITES_ALL_MISS 451
{ "L3_WRITES_ALL_MISS", {0xe00de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L3 Write Misses"},
#define PME_ITA2_L3_WRITES_DATA_WRITE_ALL 452
{ "L3_WRITES_DATA_WRITE_ALL", {0x700de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L3 Store References (excludes L2 write backs, includes L3 read for ownership requests that satisfy stores)"},
#define PME_ITA2_L3_WRITES_DATA_WRITE_HIT 453
{ "L3_WRITES_DATA_WRITE_HIT", {0x500de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L3 Store Hits (excludes L2 write backs, includes L3 read for ownership requests that satisfy stores)"},
#define PME_ITA2_L3_WRITES_DATA_WRITE_MISS 454
{ "L3_WRITES_DATA_WRITE_MISS", {0x600de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L3 Store Misses (excludes L2 write backs, includes L3 read for ownership requests that satisfy stores)"},
#define PME_ITA2_L3_WRITES_L2_WB_ALL 455
{ "L3_WRITES_L2_WB_ALL", {0xb00de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L2 Write Back References"},
#define PME_ITA2_L3_WRITES_L2_WB_HIT 456
{ "L3_WRITES_L2_WB_HIT", {0x900de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L2 Write Back Hits"},
#define PME_ITA2_L3_WRITES_L2_WB_MISS 457
{ "L3_WRITES_L2_WB_MISS", {0xa00de}, 0xf0, 1, {0xf00007}, "L3 Writes -- L2 Write Back Misses"},
#define PME_ITA2_LOADS_RETIRED 458
{ "LOADS_RETIRED", {0xcd}, 0xf0, 4, {0x5310007}, "Retired Loads"},
#define PME_ITA2_MEM_READ_CURRENT_ANY 459
{ "MEM_READ_CURRENT_ANY", {0x30089}, 0xf0, 1, {0xf00000}, "Current Mem Read Transactions On Bus -- CPU or non-CPU (all transactions)."},
#define PME_ITA2_MEM_READ_CURRENT_IO 460
{ "MEM_READ_CURRENT_IO", {0x10089}, 0xf0, 1, {0xf00000}, "Current Mem Read Transactions On Bus -- non-CPU priority agents"},
#define PME_ITA2_MISALIGNED_LOADS_RETIRED 461
{ "MISALIGNED_LOADS_RETIRED", {0xce}, 0xf0, 4, {0x5310007}, "Retired Misaligned Load Instructions"},
#define PME_ITA2_MISALIGNED_STORES_RETIRED 462
{ "MISALIGNED_STORES_RETIRED", {0xd2}, 0xf0, 2, {0x5410007}, "Retired Misaligned Store
Instructions"}, #define PME_ITA2_NOPS_RETIRED 463 { "NOPS_RETIRED", {0x50}, 0xf0, 6, {0xf00003}, "Retired NOP Instructions"}, #define PME_ITA2_PREDICATE_SQUASHED_RETIRED 464 { "PREDICATE_SQUASHED_RETIRED", {0x51}, 0xf0, 6, {0xf00003}, "Instructions Squashed Due to Predicate Off"}, #define PME_ITA2_RSE_CURRENT_REGS_2_TO_0 465 { "RSE_CURRENT_REGS_2_TO_0", {0x2b}, 0xf0, 7, {0xf00000}, "Current RSE Registers (Bits 2:0)"}, #define PME_ITA2_RSE_CURRENT_REGS_5_TO_3 466 { "RSE_CURRENT_REGS_5_TO_3", {0x2a}, 0xf0, 7, {0xf00000}, "Current RSE Registers (Bits 5:3)"}, #define PME_ITA2_RSE_CURRENT_REGS_6 467 { "RSE_CURRENT_REGS_6", {0x26}, 0xf0, 1, {0xf00000}, "Current RSE Registers (Bit 6)"}, #define PME_ITA2_RSE_DIRTY_REGS_2_TO_0 468 { "RSE_DIRTY_REGS_2_TO_0", {0x29}, 0xf0, 7, {0xf00000}, "Dirty RSE Registers (Bits 2:0)"}, #define PME_ITA2_RSE_DIRTY_REGS_5_TO_3 469 { "RSE_DIRTY_REGS_5_TO_3", {0x28}, 0xf0, 7, {0xf00000}, "Dirty RSE Registers (Bits 5:3)"}, #define PME_ITA2_RSE_DIRTY_REGS_6 470 { "RSE_DIRTY_REGS_6", {0x24}, 0xf0, 1, {0xf00000}, "Dirty RSE Registers (Bit 6)"}, #define PME_ITA2_RSE_EVENT_RETIRED 471 { "RSE_EVENT_RETIRED", {0x32}, 0xf0, 1, {0xf00000}, "Retired RSE operations"}, #define PME_ITA2_RSE_REFERENCES_RETIRED_ALL 472 { "RSE_REFERENCES_RETIRED_ALL", {0x30020}, 0xf0, 2, {0xf00007}, "RSE Accesses -- Both RSE loads and stores will be counted."}, #define PME_ITA2_RSE_REFERENCES_RETIRED_LOAD 473 { "RSE_REFERENCES_RETIRED_LOAD", {0x10020}, 0xf0, 2, {0xf00007}, "RSE Accesses -- Only RSE loads will be counted."}, #define PME_ITA2_RSE_REFERENCES_RETIRED_STORE 474 { "RSE_REFERENCES_RETIRED_STORE", {0x20020}, 0xf0, 2, {0xf00007}, "RSE Accesses -- Only RSE stores will be counted."}, #define PME_ITA2_SERIALIZATION_EVENTS 475 { "SERIALIZATION_EVENTS", {0x53}, 0xf0, 1, {0xf00000}, "Number of srlz.i Instructions"}, #define PME_ITA2_STORES_RETIRED 476 { "STORES_RETIRED", {0xd1}, 0xf0, 2, {0x5410007}, "Retired Stores"}, #define PME_ITA2_SYLL_NOT_DISPERSED_ALL 477 { 
"SYLL_NOT_DISPERSED_ALL", {0xf004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Counts all syllables not dispersed. NOTE: Any combination of b0000-b1111 is valid."}, #define PME_ITA2_SYLL_NOT_DISPERSED_EXPL 478 { "SYLL_NOT_DISPERSED_EXPL", {0x1004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit stop bits. These consist of programmer specified architected S-bit and templates 1 and 5. Dispersal takes a 6-syllable (3-syllable) hit for every template 1/5 in bundle 0(1). Dispersal takes a 3-syllable (0 syllable) hit for every S-bit in bundle 0(1)"}, #define PME_ITA2_SYLL_NOT_DISPERSED_EXPL_OR_FE 479 { "SYLL_NOT_DISPERSED_EXPL_OR_FE", {0x5004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit stop bits or front-end not providing valid bundles or providing valid illegal templates."}, #define PME_ITA2_SYLL_NOT_DISPERSED_EXPL_OR_FE_OR_MLI 480 { "SYLL_NOT_DISPERSED_EXPL_OR_FE_OR_MLI", {0xd004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit stop bits or due to front-end not providing valid bundles or providing valid illegal templates or due to MLI bundle and resteers to non-0 syllable."}, #define PME_ITA2_SYLL_NOT_DISPERSED_EXPL_OR_IMPL 481 { "SYLL_NOT_DISPERSED_EXPL_OR_IMPL", {0x3004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit/implicit stop bits."}, #define PME_ITA2_SYLL_NOT_DISPERSED_EXPL_OR_IMPL_OR_FE 482 { "SYLL_NOT_DISPERSED_EXPL_OR_IMPL_OR_FE", {0x7004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit or implicit stop bits or due to front-end not providing valid bundles or providing valid illegal template."}, #define PME_ITA2_SYLL_NOT_DISPERSED_EXPL_OR_IMPL_OR_MLI 483 { "SYLL_NOT_DISPERSED_EXPL_OR_IMPL_OR_MLI", {0xb004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit or 
implicit stop bits or due to MLI bundle and resteers to non-0 syllable."}, #define PME_ITA2_SYLL_NOT_DISPERSED_EXPL_OR_MLI 484 { "SYLL_NOT_DISPERSED_EXPL_OR_MLI", {0x9004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit stop bits or to MLI bundle and resteers to non-0 syllable."}, #define PME_ITA2_SYLL_NOT_DISPERSED_FE 485 { "SYLL_NOT_DISPERSED_FE", {0x4004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to front-end not providing valid bundles or providing valid illegal templates. Dispersal takes a 3-syllable hit for every invalid bundle or valid illegal template from front-end. Bundle 1 with front-end fault, is counted here (3-syllable hit).."}, #define PME_ITA2_SYLL_NOT_DISPERSED_FE_OR_MLI 486 { "SYLL_NOT_DISPERSED_FE_OR_MLI", {0xc004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to MLI bundle and resteers to non-0 syllable or due to front-end not providing valid bundles or providing valid illegal templates."}, #define PME_ITA2_SYLL_NOT_DISPERSED_IMPL 487 { "SYLL_NOT_DISPERSED_IMPL", {0x2004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to implicit stop bits. These consist of all of the non-architected stop bits (asymmetry, oversubscription, implicit). 
Dispersal takes a 6-syllable(3-syllable) hit for every implicit stop bits in bundle 0(1)."}, #define PME_ITA2_SYLL_NOT_DISPERSED_IMPL_OR_FE 488 { "SYLL_NOT_DISPERSED_IMPL_OR_FE", {0x6004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to implicit stop bits or to front-end not providing valid bundles or providing valid illegal templates."}, #define PME_ITA2_SYLL_NOT_DISPERSED_IMPL_OR_FE_OR_MLI 489 { "SYLL_NOT_DISPERSED_IMPL_OR_FE_OR_MLI", {0xe004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to implicit stop bits or due to front-end not providing valid bundles or providing valid illegal templates or due to MLI bundle and resteers to non-0 syllable."}, #define PME_ITA2_SYLL_NOT_DISPERSED_IMPL_OR_MLI 490 { "SYLL_NOT_DISPERSED_IMPL_OR_MLI", {0xa004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to implicit stop bits or to MLI bundle and resteers to non-0 syllable."}, #define PME_ITA2_SYLL_NOT_DISPERSED_MLI 491 { "SYLL_NOT_DISPERSED_MLI", {0x8004e}, 0xf0, 5, {0xf00001}, "Syllables Not Dispersed -- Count syllables not dispersed due to MLI bundle and resteers to non-0 syllable. Dispersal takes a 1 syllable hit for each MLI bundle . Dispersal could take 0-2 syllable hit depending on which syllable we resteer to. 
Bundle 1 with front-end fault which is split, is counted here (0-2 syllable hit)."}, #define PME_ITA2_SYLL_OVERCOUNT_ALL 492 { "SYLL_OVERCOUNT_ALL", {0x3004f}, 0xf0, 2, {0xf00001}, "Syllables Overcounted -- syllables overcounted in implicit & explicit bucket"}, #define PME_ITA2_SYLL_OVERCOUNT_EXPL 493 { "SYLL_OVERCOUNT_EXPL", {0x1004f}, 0xf0, 2, {0xf00001}, "Syllables Overcounted -- Only syllables overcounted in the explicit bucket"}, #define PME_ITA2_SYLL_OVERCOUNT_IMPL 494 { "SYLL_OVERCOUNT_IMPL", {0x2004f}, 0xf0, 2, {0xf00001}, "Syllables Overcounted -- Only syllables overcounted in the implicit bucket"}, #define PME_ITA2_UC_LOADS_RETIRED 495 { "UC_LOADS_RETIRED", {0xcf}, 0xf0, 4, {0x5310007}, "Retired Uncacheable Loads"}, #define PME_ITA2_UC_STORES_RETIRED 496 { "UC_STORES_RETIRED", {0xd0}, 0xf0, 2, {0x5410007}, "Retired Uncacheable Stores"}, }; #define PME_ITA2_EVENT_COUNT 497
papi-papi-7-2-0-t/src/libpfm4/lib/events/itanium_events.h
/* * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux/ia64. */ /* * This file is generated automatically * !! DO NOT CHANGE !! */ /* * Events table for the Itanium PMU family */ static pme_ita_entry_t itanium_pe []={ #define PME_ITA_ALAT_INST_CHKA_LDC_ALL 0 { "ALAT_INST_CHKA_LDC_ALL", {0x30036} , 0xf0, 2, {0xffff0003}, NULL}, #define PME_ITA_ALAT_INST_CHKA_LDC_FP 1 { "ALAT_INST_CHKA_LDC_FP", {0x10036} , 0xf0, 2, {0xffff0003}, NULL}, #define PME_ITA_ALAT_INST_CHKA_LDC_INT 2 { "ALAT_INST_CHKA_LDC_INT", {0x20036} , 0xf0, 2, {0xffff0003}, NULL}, #define PME_ITA_ALAT_INST_FAILED_CHKA_LDC_ALL 3 { "ALAT_INST_FAILED_CHKA_LDC_ALL", {0x30037} , 0xf0, 2, {0xffff0003}, NULL}, #define PME_ITA_ALAT_INST_FAILED_CHKA_LDC_FP 4 { "ALAT_INST_FAILED_CHKA_LDC_FP", {0x10037} , 0xf0, 2, {0xffff0003}, NULL}, #define PME_ITA_ALAT_INST_FAILED_CHKA_LDC_INT 5 { "ALAT_INST_FAILED_CHKA_LDC_INT", {0x20037} , 0xf0, 2, {0xffff0003}, NULL}, #define PME_ITA_ALAT_REPLACEMENT_ALL 6 { "ALAT_REPLACEMENT_ALL", {0x30038} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_ALAT_REPLACEMENT_FP 7 { "ALAT_REPLACEMENT_FP", {0x10038} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_ALAT_REPLACEMENT_INT 8 { "ALAT_REPLACEMENT_INT", {0x20038} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_ALL_STOPS_DISPERSED 9 { "ALL_STOPS_DISPERSED", {0x2f} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_BRANCH_EVENT 10 { "BRANCH_EVENT", {0x811} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_ALL_PATHS_ALL_PREDICTIONS 11 { "BRANCH_MULTIWAY_ALL_PATHS_ALL_PREDICTIONS", {0xe} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_ALL_PATHS_CORRECT_PREDICTIONS 12 { 
"BRANCH_MULTIWAY_ALL_PATHS_CORRECT_PREDICTIONS", {0x1000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_ALL_PATHS_WRONG_PATH 13 { "BRANCH_MULTIWAY_ALL_PATHS_WRONG_PATH", {0x2000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_ALL_PATHS_WRONG_TARGET 14 { "BRANCH_MULTIWAY_ALL_PATHS_WRONG_TARGET", {0x3000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_NOT_TAKEN_ALL_PREDICTIONS 15 { "BRANCH_MULTIWAY_NOT_TAKEN_ALL_PREDICTIONS", {0x8000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_NOT_TAKEN_CORRECT_PREDICTIONS 16 { "BRANCH_MULTIWAY_NOT_TAKEN_CORRECT_PREDICTIONS", {0x9000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_NOT_TAKEN_WRONG_PATH 17 { "BRANCH_MULTIWAY_NOT_TAKEN_WRONG_PATH", {0xa000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_NOT_TAKEN_WRONG_TARGET 18 { "BRANCH_MULTIWAY_NOT_TAKEN_WRONG_TARGET", {0xb000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_TAKEN_ALL_PREDICTIONS 19 { "BRANCH_MULTIWAY_TAKEN_ALL_PREDICTIONS", {0xc000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_TAKEN_CORRECT_PREDICTIONS 20 { "BRANCH_MULTIWAY_TAKEN_CORRECT_PREDICTIONS", {0xd000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_TAKEN_WRONG_PATH 21 { "BRANCH_MULTIWAY_TAKEN_WRONG_PATH", {0xe000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_MULTIWAY_TAKEN_WRONG_TARGET 22 { "BRANCH_MULTIWAY_TAKEN_WRONG_TARGET", {0xf000e} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_NOT_TAKEN 23 { "BRANCH_NOT_TAKEN", {0x8000d} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_1ST_STAGE_NT_OUTCOMES_CORRECTLY_PREDICTED 24 { "BRANCH_PATH_1ST_STAGE_NT_OUTCOMES_CORRECTLY_PREDICTED", {0x6000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_1ST_STAGE_NT_OUTCOMES_INCORRECTLY_PREDICTED 25 { "BRANCH_PATH_1ST_STAGE_NT_OUTCOMES_INCORRECTLY_PREDICTED", {0x4000f} , 0xf0, 1, {0xffff0003}, NULL}, #define 
PME_ITA_BRANCH_PATH_1ST_STAGE_TK_OUTCOMES_CORRECTLY_PREDICTED 26 { "BRANCH_PATH_1ST_STAGE_TK_OUTCOMES_CORRECTLY_PREDICTED", {0x7000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_1ST_STAGE_TK_OUTCOMES_INCORRECTLY_PREDICTED 27 { "BRANCH_PATH_1ST_STAGE_TK_OUTCOMES_INCORRECTLY_PREDICTED", {0x5000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_2ND_STAGE_NT_OUTCOMES_CORRECTLY_PREDICTED 28 { "BRANCH_PATH_2ND_STAGE_NT_OUTCOMES_CORRECTLY_PREDICTED", {0xa000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_2ND_STAGE_NT_OUTCOMES_INCORRECTLY_PREDICTED 29 { "BRANCH_PATH_2ND_STAGE_NT_OUTCOMES_INCORRECTLY_PREDICTED", {0x8000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_2ND_STAGE_TK_OUTCOMES_CORRECTLY_PREDICTED 30 { "BRANCH_PATH_2ND_STAGE_TK_OUTCOMES_CORRECTLY_PREDICTED", {0xb000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_2ND_STAGE_TK_OUTCOMES_INCORRECTLY_PREDICTED 31 { "BRANCH_PATH_2ND_STAGE_TK_OUTCOMES_INCORRECTLY_PREDICTED", {0x9000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_3RD_STAGE_NT_OUTCOMES_CORRECTLY_PREDICTED 32 { "BRANCH_PATH_3RD_STAGE_NT_OUTCOMES_CORRECTLY_PREDICTED", {0xe000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_3RD_STAGE_NT_OUTCOMES_INCORRECTLY_PREDICTED 33 { "BRANCH_PATH_3RD_STAGE_NT_OUTCOMES_INCORRECTLY_PREDICTED", {0xc000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_3RD_STAGE_TK_OUTCOMES_CORRECTLY_PREDICTED 34 { "BRANCH_PATH_3RD_STAGE_TK_OUTCOMES_CORRECTLY_PREDICTED", {0xf000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_3RD_STAGE_TK_OUTCOMES_INCORRECTLY_PREDICTED 35 { "BRANCH_PATH_3RD_STAGE_TK_OUTCOMES_INCORRECTLY_PREDICTED", {0xd000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_ALL_NT_OUTCOMES_CORRECTLY_PREDICTED 36 { "BRANCH_PATH_ALL_NT_OUTCOMES_CORRECTLY_PREDICTED", {0x2000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_ALL_NT_OUTCOMES_INCORRECTLY_PREDICTED 37 { 
"BRANCH_PATH_ALL_NT_OUTCOMES_INCORRECTLY_PREDICTED", {0xf} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_ALL_TK_OUTCOMES_CORRECTLY_PREDICTED 38 { "BRANCH_PATH_ALL_TK_OUTCOMES_CORRECTLY_PREDICTED", {0x3000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PATH_ALL_TK_OUTCOMES_INCORRECTLY_PREDICTED 39 { "BRANCH_PATH_ALL_TK_OUTCOMES_INCORRECTLY_PREDICTED", {0x1000f} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_1ST_STAGE_ALL_PREDICTIONS 40 { "BRANCH_PREDICTOR_1ST_STAGE_ALL_PREDICTIONS", {0x40010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_1ST_STAGE_CORRECT_PREDICTIONS 41 { "BRANCH_PREDICTOR_1ST_STAGE_CORRECT_PREDICTIONS", {0x50010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_1ST_STAGE_WRONG_PATH 42 { "BRANCH_PREDICTOR_1ST_STAGE_WRONG_PATH", {0x60010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_1ST_STAGE_WRONG_TARGET 43 { "BRANCH_PREDICTOR_1ST_STAGE_WRONG_TARGET", {0x70010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_2ND_STAGE_ALL_PREDICTIONS 44 { "BRANCH_PREDICTOR_2ND_STAGE_ALL_PREDICTIONS", {0x80010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_2ND_STAGE_CORRECT_PREDICTIONS 45 { "BRANCH_PREDICTOR_2ND_STAGE_CORRECT_PREDICTIONS", {0x90010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_2ND_STAGE_WRONG_PATH 46 { "BRANCH_PREDICTOR_2ND_STAGE_WRONG_PATH", {0xa0010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_2ND_STAGE_WRONG_TARGET 47 { "BRANCH_PREDICTOR_2ND_STAGE_WRONG_TARGET", {0xb0010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_3RD_STAGE_ALL_PREDICTIONS 48 { "BRANCH_PREDICTOR_3RD_STAGE_ALL_PREDICTIONS", {0xc0010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_3RD_STAGE_CORRECT_PREDICTIONS 49 { "BRANCH_PREDICTOR_3RD_STAGE_CORRECT_PREDICTIONS", {0xd0010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_3RD_STAGE_WRONG_PATH 50 { 
"BRANCH_PREDICTOR_3RD_STAGE_WRONG_PATH", {0xe0010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_3RD_STAGE_WRONG_TARGET 51 { "BRANCH_PREDICTOR_3RD_STAGE_WRONG_TARGET", {0xf0010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_ALL_ALL_PREDICTIONS 52 { "BRANCH_PREDICTOR_ALL_ALL_PREDICTIONS", {0x10} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_ALL_CORRECT_PREDICTIONS 53 { "BRANCH_PREDICTOR_ALL_CORRECT_PREDICTIONS", {0x10010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_ALL_WRONG_PATH 54 { "BRANCH_PREDICTOR_ALL_WRONG_PATH", {0x20010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_PREDICTOR_ALL_WRONG_TARGET 55 { "BRANCH_PREDICTOR_ALL_WRONG_TARGET", {0x30010} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_TAKEN_SLOT_0 56 { "BRANCH_TAKEN_SLOT_0", {0x1000d} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_TAKEN_SLOT_1 57 { "BRANCH_TAKEN_SLOT_1", {0x2000d} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BRANCH_TAKEN_SLOT_2 58 { "BRANCH_TAKEN_SLOT_2", {0x4000d} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_BUS_ALL_ANY 59 { "BUS_ALL_ANY", {0x10047} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_ALL_IO 60 { "BUS_ALL_IO", {0x40047} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_ALL_SELF 61 { "BUS_ALL_SELF", {0x20047} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_BRQ_LIVE_REQ_HI 62 { "BUS_BRQ_LIVE_REQ_HI", {0x5c} , 0xf0, 2, {0xffff0000}, NULL}, #define PME_ITA_BUS_BRQ_LIVE_REQ_LO 63 { "BUS_BRQ_LIVE_REQ_LO", {0x5b} , 0xf0, 2, {0xffff0000}, NULL}, #define PME_ITA_BUS_BRQ_REQ_INSERTED 64 { "BUS_BRQ_REQ_INSERTED", {0x5d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_BURST_ANY 65 { "BUS_BURST_ANY", {0x10049} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_BURST_IO 66 { "BUS_BURST_IO", {0x40049} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_BURST_SELF 67 { "BUS_BURST_SELF", {0x20049} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_HITM 68 { 
"BUS_HITM", {0x44} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_IO_ANY 69 { "BUS_IO_ANY", {0x10050} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_IOQ_LIVE_REQ_HI 70 { "BUS_IOQ_LIVE_REQ_HI", {0x58} , 0xf0, 3, {0xffff0000}, NULL}, #define PME_ITA_BUS_IOQ_LIVE_REQ_LO 71 { "BUS_IOQ_LIVE_REQ_LO", {0x57} , 0xf0, 3, {0xffff0000}, NULL}, #define PME_ITA_BUS_IO_SELF 72 { "BUS_IO_SELF", {0x20050} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_LOCK_ANY 73 { "BUS_LOCK_ANY", {0x10053} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_LOCK_CYCLES_ANY 74 { "BUS_LOCK_CYCLES_ANY", {0x10054} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_LOCK_CYCLES_SELF 75 { "BUS_LOCK_CYCLES_SELF", {0x20054} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_LOCK_SELF 76 { "BUS_LOCK_SELF", {0x20053} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_MEMORY_ANY 77 { "BUS_MEMORY_ANY", {0x1004a} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_MEMORY_IO 78 { "BUS_MEMORY_IO", {0x4004a} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_MEMORY_SELF 79 { "BUS_MEMORY_SELF", {0x2004a} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_PARTIAL_ANY 80 { "BUS_PARTIAL_ANY", {0x10048} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_PARTIAL_IO 81 { "BUS_PARTIAL_IO", {0x40048} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_PARTIAL_SELF 82 { "BUS_PARTIAL_SELF", {0x20048} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_ALL_ANY 83 { "BUS_RD_ALL_ANY", {0x1004b} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_ALL_IO 84 { "BUS_RD_ALL_IO", {0x4004b} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_ALL_SELF 85 { "BUS_RD_ALL_SELF", {0x2004b} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_DATA_ANY 86 { "BUS_RD_DATA_ANY", {0x1004c} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_DATA_IO 87 { "BUS_RD_DATA_IO", {0x4004c} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_DATA_SELF 88 { "BUS_RD_DATA_SELF", {0x2004c} , 0xf0, 1, {0xffff0000}, NULL}, 
#define PME_ITA_BUS_RD_HIT 89 { "BUS_RD_HIT", {0x40} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_HITM 90 { "BUS_RD_HITM", {0x41} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_INVAL_ANY 91 { "BUS_RD_INVAL_ANY", {0x1004e} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_INVAL_BST_ANY 92 { "BUS_RD_INVAL_BST_ANY", {0x1004f} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_INVAL_BST_HITM 93 { "BUS_RD_INVAL_BST_HITM", {0x43} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_INVAL_BST_IO 94 { "BUS_RD_INVAL_BST_IO", {0x4004f} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_INVAL_BST_SELF 95 { "BUS_RD_INVAL_BST_SELF", {0x2004f} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_INVAL_HITM 96 { "BUS_RD_INVAL_HITM", {0x42} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_INVAL_IO 97 { "BUS_RD_INVAL_IO", {0x4004e} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_INVAL_SELF 98 { "BUS_RD_INVAL_SELF", {0x2004e} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_IO_ANY 99 { "BUS_RD_IO_ANY", {0x10051} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_IO_SELF 100 { "BUS_RD_IO_SELF", {0x20051} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_PRTL_ANY 101 { "BUS_RD_PRTL_ANY", {0x1004d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_PRTL_IO 102 { "BUS_RD_PRTL_IO", {0x4004d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_RD_PRTL_SELF 103 { "BUS_RD_PRTL_SELF", {0x2004d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_SNOOPQ_REQ 104 { "BUS_SNOOPQ_REQ", {0x56} , 0x30, 3, {0xffff0000}, NULL}, #define PME_ITA_BUS_SNOOPS_ANY 105 { "BUS_SNOOPS_ANY", {0x10046} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_SNOOPS_HITM_ANY 106 { "BUS_SNOOPS_HITM_ANY", {0x10045} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_SNOOP_STALL_CYCLES_ANY 107 { "BUS_SNOOP_STALL_CYCLES_ANY", {0x10055} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_SNOOP_STALL_CYCLES_SELF 108 { "BUS_SNOOP_STALL_CYCLES_SELF", {0x20055} 
, 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_WR_WB_ANY 109 { "BUS_WR_WB_ANY", {0x10052} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_WR_WB_IO 110 { "BUS_WR_WB_IO", {0x40052} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_BUS_WR_WB_SELF 111 { "BUS_WR_WB_SELF", {0x20052} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_CPU_CPL_CHANGES 112 { "CPU_CPL_CHANGES", {0x34} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_CPU_CYCLES 113 { "CPU_CYCLES", {0x12} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_DATA_ACCESS_CYCLE 114 { "DATA_ACCESS_CYCLE", {0x3} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_DATA_EAR_CACHE_LAT1024 115 { "DATA_EAR_CACHE_LAT1024", {0x90367} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_DATA_EAR_CACHE_LAT128 116 { "DATA_EAR_CACHE_LAT128", {0x50367} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_DATA_EAR_CACHE_LAT16 117 { "DATA_EAR_CACHE_LAT16", {0x20367} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_DATA_EAR_CACHE_LAT2048 118 { "DATA_EAR_CACHE_LAT2048", {0xa0367} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_DATA_EAR_CACHE_LAT256 119 { "DATA_EAR_CACHE_LAT256", {0x60367} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_DATA_EAR_CACHE_LAT32 120 { "DATA_EAR_CACHE_LAT32", {0x30367} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_DATA_EAR_CACHE_LAT4 121 { "DATA_EAR_CACHE_LAT4", {0x367} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_DATA_EAR_CACHE_LAT512 122 { "DATA_EAR_CACHE_LAT512", {0x80367} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_DATA_EAR_CACHE_LAT64 123 { "DATA_EAR_CACHE_LAT64", {0x40367} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_DATA_EAR_CACHE_LAT8 124 { "DATA_EAR_CACHE_LAT8", {0x10367} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_DATA_EAR_CACHE_LAT_NONE 125 { "DATA_EAR_CACHE_LAT_NONE", {0xf0367} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_DATA_EAR_EVENTS 126 { "DATA_EAR_EVENTS", {0x67} , 0xf0, 1, {0xffff0007}, NULL}, #define PME_ITA_DATA_EAR_TLB_L2 127 { "DATA_EAR_TLB_L2", {0x20767} , 0xf0, 1, {0xffff0003}, 
NULL}, #define PME_ITA_DATA_EAR_TLB_SW 128 { "DATA_EAR_TLB_SW", {0x80767} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_DATA_EAR_TLB_VHPT 129 { "DATA_EAR_TLB_VHPT", {0x40767} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_DATA_REFERENCES_RETIRED 130 { "DATA_REFERENCES_RETIRED", {0x63} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_DEPENDENCY_ALL_CYCLE 131 { "DEPENDENCY_ALL_CYCLE", {0x6} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_DEPENDENCY_SCOREBOARD_CYCLE 132 { "DEPENDENCY_SCOREBOARD_CYCLE", {0x2} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_DTC_MISSES 133 { "DTC_MISSES", {0x60} , 0xf0, 1, {0xffff0007}, NULL}, #define PME_ITA_DTLB_INSERTS_HPW 134 { "DTLB_INSERTS_HPW", {0x62} , 0xf0, 1, {0xffff0007}, NULL}, #define PME_ITA_DTLB_MISSES 135 { "DTLB_MISSES", {0x61} , 0xf0, 1, {0xffff0007}, NULL}, #define PME_ITA_EXPL_STOPBITS 136 { "EXPL_STOPBITS", {0x2e} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_FP_FLUSH_TO_ZERO 137 { "FP_FLUSH_TO_ZERO", {0xb} , 0xf0, 2, {0xffff0003}, NULL}, #define PME_ITA_FP_OPS_RETIRED_HI 138 { "FP_OPS_RETIRED_HI", {0xa} , 0xf0, 3, {0xffff0003}, NULL}, #define PME_ITA_FP_OPS_RETIRED_LO 139 { "FP_OPS_RETIRED_LO", {0x9} , 0xf0, 3, {0xffff0003}, NULL}, #define PME_ITA_FP_SIR_FLUSH 140 { "FP_SIR_FLUSH", {0xc} , 0xf0, 2, {0xffff0003}, NULL}, #define PME_ITA_IA32_INST_RETIRED 141 { "IA32_INST_RETIRED", {0x15} , 0xf0, 2, {0xffff0000}, NULL}, #define PME_ITA_IA64_INST_RETIRED 142 { "IA64_INST_RETIRED", {0x8} , 0x30, 6, {0xffff0003}, NULL}, #define PME_ITA_IA64_TAGGED_INST_RETIRED_PMC8 143 { "IA64_TAGGED_INST_RETIRED_PMC8", {0x30008} , 0x30, 6, {0xffff0003}, NULL}, #define PME_ITA_IA64_TAGGED_INST_RETIRED_PMC9 144 { "IA64_TAGGED_INST_RETIRED_PMC9", {0x20008} , 0x30, 6, {0xffff0003}, NULL}, #define PME_ITA_INST_ACCESS_CYCLE 145 { "INST_ACCESS_CYCLE", {0x1} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_INST_DISPERSED 146 { "INST_DISPERSED", {0x2d} , 0x30, 6, {0xffff0001}, NULL}, #define PME_ITA_INST_FAILED_CHKS_RETIRED_ALL 147 { 
"INST_FAILED_CHKS_RETIRED_ALL", {0x30035} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_INST_FAILED_CHKS_RETIRED_FP 148 { "INST_FAILED_CHKS_RETIRED_FP", {0x20035} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_INST_FAILED_CHKS_RETIRED_INT 149 { "INST_FAILED_CHKS_RETIRED_INT", {0x10035} , 0xf0, 1, {0xffff0003}, NULL}, #define PME_ITA_INSTRUCTION_EAR_CACHE_LAT1024 150 { "INSTRUCTION_EAR_CACHE_LAT1024", {0x80123} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_INSTRUCTION_EAR_CACHE_LAT128 151 { "INSTRUCTION_EAR_CACHE_LAT128", {0x50123} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_INSTRUCTION_EAR_CACHE_LAT16 152 { "INSTRUCTION_EAR_CACHE_LAT16", {0x20123} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_INSTRUCTION_EAR_CACHE_LAT2048 153 { "INSTRUCTION_EAR_CACHE_LAT2048", {0x90123} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_INSTRUCTION_EAR_CACHE_LAT256 154 { "INSTRUCTION_EAR_CACHE_LAT256", {0x60123} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_INSTRUCTION_EAR_CACHE_LAT32 155 { "INSTRUCTION_EAR_CACHE_LAT32", {0x30123} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_INSTRUCTION_EAR_CACHE_LAT4096 156 { "INSTRUCTION_EAR_CACHE_LAT4096", {0xa0123} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_INSTRUCTION_EAR_CACHE_LAT4 157 { "INSTRUCTION_EAR_CACHE_LAT4", {0x123} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_INSTRUCTION_EAR_CACHE_LAT512 158 { "INSTRUCTION_EAR_CACHE_LAT512", {0x70123} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_INSTRUCTION_EAR_CACHE_LAT64 159 { "INSTRUCTION_EAR_CACHE_LAT64", {0x40123} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_INSTRUCTION_EAR_CACHE_LAT8 160 { "INSTRUCTION_EAR_CACHE_LAT8", {0x10123} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_INSTRUCTION_EAR_CACHE_LAT_NONE 161 { "INSTRUCTION_EAR_CACHE_LAT_NONE", {0xf0123} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_INSTRUCTION_EAR_EVENTS 162 { "INSTRUCTION_EAR_EVENTS", {0x23} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_INSTRUCTION_EAR_TLB_SW 163 { "INSTRUCTION_EAR_TLB_SW", 
{0x80523} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_INSTRUCTION_EAR_TLB_VHPT 164 { "INSTRUCTION_EAR_TLB_VHPT", {0x40523} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_ISA_TRANSITIONS 165 { "ISA_TRANSITIONS", {0x14} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_ISB_LINES_IN 166 { "ISB_LINES_IN", {0x26} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_ITLB_INSERTS_HPW 167 { "ITLB_INSERTS_HPW", {0x28} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_ITLB_MISSES_FETCH 168 { "ITLB_MISSES_FETCH", {0x27} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_L1D_READ_FORCED_MISSES_RETIRED 169 { "L1D_READ_FORCED_MISSES_RETIRED", {0x6b} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_L1D_READ_MISSES_RETIRED 170 { "L1D_READ_MISSES_RETIRED", {0x66} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_L1D_READS_RETIRED 171 { "L1D_READS_RETIRED", {0x64} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_L1I_DEMAND_READS 172 { "L1I_DEMAND_READS", {0x20} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_L1I_FILLS 173 { "L1I_FILLS", {0x21} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L1I_PREFETCH_READS 174 { "L1I_PREFETCH_READS", {0x24} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_L1_OUTSTANDING_REQ_HI 175 { "L1_OUTSTANDING_REQ_HI", {0x79} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L1_OUTSTANDING_REQ_LO 176 { "L1_OUTSTANDING_REQ_LO", {0x78} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L2_DATA_REFERENCES_ALL 177 { "L2_DATA_REFERENCES_ALL", {0x30069} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_L2_DATA_REFERENCES_READS 178 { "L2_DATA_REFERENCES_READS", {0x10069} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_L2_DATA_REFERENCES_WRITES 179 { "L2_DATA_REFERENCES_WRITES", {0x20069} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_L2_FLUSH_DETAILS_ADDR_CONFLICT 180 { "L2_FLUSH_DETAILS_ADDR_CONFLICT", {0x20077} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L2_FLUSH_DETAILS_ALL 181 { "L2_FLUSH_DETAILS_ALL", {0xf0077} , 0xf0, 1, {0xffff0000}, NULL}, #define 
PME_ITA_L2_FLUSH_DETAILS_BUS_REJECT 182 { "L2_FLUSH_DETAILS_BUS_REJECT", {0x40077} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L2_FLUSH_DETAILS_FULL_FLUSH 183 { "L2_FLUSH_DETAILS_FULL_FLUSH", {0x80077} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L2_FLUSH_DETAILS_ST_BUFFER 184 { "L2_FLUSH_DETAILS_ST_BUFFER", {0x10077} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L2_FLUSHES 185 { "L2_FLUSHES", {0x76} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L2_INST_DEMAND_READS 186 { "L2_INST_DEMAND_READS", {0x22} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_L2_INST_PREFETCH_READS 187 { "L2_INST_PREFETCH_READS", {0x25} , 0xf0, 1, {0xffff0001}, NULL}, #define PME_ITA_L2_MISSES 188 { "L2_MISSES", {0x6a} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_L2_REFERENCES 189 { "L2_REFERENCES", {0x68} , 0xf0, 3, {0xffff0007}, NULL}, #define PME_ITA_L3_LINES_REPLACED 190 { "L3_LINES_REPLACED", {0x7f} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_MISSES 191 { "L3_MISSES", {0x7c} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_READS_ALL_READS_ALL 192 { "L3_READS_ALL_READS_ALL", {0xf007d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_READS_ALL_READS_HIT 193 { "L3_READS_ALL_READS_HIT", {0xd007d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_READS_ALL_READS_MISS 194 { "L3_READS_ALL_READS_MISS", {0xe007d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_READS_DATA_READS_ALL 195 { "L3_READS_DATA_READS_ALL", {0xb007d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_READS_DATA_READS_HIT 196 { "L3_READS_DATA_READS_HIT", {0x9007d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_READS_DATA_READS_MISS 197 { "L3_READS_DATA_READS_MISS", {0xa007d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_READS_INST_READS_ALL 198 { "L3_READS_INST_READS_ALL", {0x7007d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_READS_INST_READS_HIT 199 { "L3_READS_INST_READS_HIT", {0x5007d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_READS_INST_READS_MISS 200 { 
"L3_READS_INST_READS_MISS", {0x6007d} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_REFERENCES 201 { "L3_REFERENCES", {0x7b} , 0xf0, 1, {0xffff0007}, NULL}, #define PME_ITA_L3_WRITES_ALL_WRITES_ALL 202 { "L3_WRITES_ALL_WRITES_ALL", {0xf007e} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_WRITES_ALL_WRITES_HIT 203 { "L3_WRITES_ALL_WRITES_HIT", {0xd007e} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_WRITES_ALL_WRITES_MISS 204 { "L3_WRITES_ALL_WRITES_MISS", {0xe007e} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_WRITES_DATA_WRITES_ALL 205 { "L3_WRITES_DATA_WRITES_ALL", {0x7007e} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_WRITES_DATA_WRITES_HIT 206 { "L3_WRITES_DATA_WRITES_HIT", {0x5007e} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_WRITES_DATA_WRITES_MISS 207 { "L3_WRITES_DATA_WRITES_MISS", {0x6007e} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_WRITES_L2_WRITEBACK_ALL 208 { "L3_WRITES_L2_WRITEBACK_ALL", {0xb007e} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_WRITES_L2_WRITEBACK_HIT 209 { "L3_WRITES_L2_WRITEBACK_HIT", {0x9007e} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_L3_WRITES_L2_WRITEBACK_MISS 210 { "L3_WRITES_L2_WRITEBACK_MISS", {0xa007e} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_LOADS_RETIRED 211 { "LOADS_RETIRED", {0x6c} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_MEMORY_CYCLE 212 { "MEMORY_CYCLE", {0x7} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_MISALIGNED_LOADS_RETIRED 213 { "MISALIGNED_LOADS_RETIRED", {0x70} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_MISALIGNED_STORES_RETIRED 214 { "MISALIGNED_STORES_RETIRED", {0x71} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_NOPS_RETIRED 215 { "NOPS_RETIRED", {0x30} , 0x30, 6, {0xffff0003}, NULL}, #define PME_ITA_PIPELINE_ALL_FLUSH_CYCLE 216 { "PIPELINE_ALL_FLUSH_CYCLE", {0x4} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_PIPELINE_BACKEND_FLUSH_CYCLE 217 { "PIPELINE_BACKEND_FLUSH_CYCLE", {0x0} , 0xf0, 1, {0xffff0000}, NULL}, #define 
PME_ITA_PIPELINE_FLUSH_ALL 218 { "PIPELINE_FLUSH_ALL", {0xf0033} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_PIPELINE_FLUSH_DTC_FLUSH 219 { "PIPELINE_FLUSH_DTC_FLUSH", {0x40033} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_PIPELINE_FLUSH_IEU_FLUSH 220 { "PIPELINE_FLUSH_IEU_FLUSH", {0x80033} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_PIPELINE_FLUSH_L1D_WAYMP_FLUSH 221 { "PIPELINE_FLUSH_L1D_WAYMP_FLUSH", {0x20033} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_PIPELINE_FLUSH_OTHER_FLUSH 222 { "PIPELINE_FLUSH_OTHER_FLUSH", {0x10033} , 0xf0, 1, {0xffff0000}, NULL}, #define PME_ITA_PREDICATE_SQUASHED_RETIRED 223 { "PREDICATE_SQUASHED_RETIRED", {0x31} , 0x30, 6, {0xffff0003}, NULL}, #define PME_ITA_RSE_LOADS_RETIRED 224 { "RSE_LOADS_RETIRED", {0x72} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_RSE_REFERENCES_RETIRED 225 { "RSE_REFERENCES_RETIRED", {0x65} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_STORES_RETIRED 226 { "STORES_RETIRED", {0x6d} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_UC_LOADS_RETIRED 227 { "UC_LOADS_RETIRED", {0x6e} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_UC_STORES_RETIRED 228 { "UC_STORES_RETIRED", {0x6f} , 0xf0, 2, {0xffff0007}, NULL}, #define PME_ITA_UNSTALLED_BACKEND_CYCLE 229 { "UNSTALLED_BACKEND_CYCLE", {0x5} , 0xf0, 1, {0xffff0000}, NULL}}; #define PME_ITA_EVENT_COUNT 230 papi-papi-7-2-0-t/src/libpfm4/lib/events/mips_74k_events.h000066400000000000000000000471301502707512200232350ustar00rootroot00000000000000/* * Copyright (c) 2011 Samara Technology Group, Inc * Contributed by Philip Mucci * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the 
following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * Based on: * MIPS32 74KTM Processor Core Family Software Users' Manual * Document Number: MD00519 Revision 01.05 March 30, 2011 */ static const mips_entry_t mips_74k_pe []={ { .name = "CYCLES", /* BOTH */ .code = 0x0, .desc = "Cycles", }, { .name = "INSTRUCTIONS", /* BOTH */ .code = 0x1, .desc = "Instructions graduated", }, { .name = "PREDICTED_JR_31", .code = 0x2, .desc = "jr $31 (return) instructions whose target is predicted", }, { .name = "JR_31_MISPREDICTIONS", .code = 0x82, .desc = "jr $31 (return) predicted but guessed wrong", }, { .name = "REDIRECT_STALLS", .code = 0x3, .desc = "Cycles where no instruction is fetched because it has no next address candidate. This includes stalls due to register indirect jumps such as jr, stalls following a wait or eret and stalls due to exceptions from instruction fetch", }, { .name = "JR_31_NO_PREDICTIONS", .code = 0x83, .desc = "jr $31 (return) instructions fetched and not predicted using RPS", }, { .name = "ITLB_ACCESSES", .code = 0x4, .desc = "ITLB accesses", }, { .name = "ITLB_MISSES", .code = 0x84, .desc = "ITLB misses, which result in a JTLB access", }, { .name = "JTLB_INSN_MISSES", .code = 0x85, .desc = "JTLB instruction access misses (will lead to an exception)", }, { .name = "ICACHE_ACCESSES", .code = 0x6, .desc = "Instruction cache accesses. 
74K cores have a 128-bit connection to the I-cache and fetch 4 instructions every access. This counts every such access, including accesses for instructions which are eventually discarded. For example, following a branch which is incorrectly predicted, the 74K core will continue to fetch instructions, which will eventually get thrown away", }, { .name = "ICACHE_MISSES", .code = 0x86, .desc = "I-cache misses. Includes misses resulting from fetch-ahead and speculation", }, { .name = "ICACHE_MISS_STALLS", .code = 0x7, .desc = "Cycles where no instruction is fetched because we missed in the I-cache", }, { .name = "UNCACHED_IFETCH_STALLS", .code = 0x8, .desc = "Cycles where no instruction is fetched because we're waiting for an I-fetch from uncached memory", }, { .name = "PDTRACE_BACK_STALLS", .code = 0x88, .desc = "PDTrace back stalls", }, { .name = "IFU_REPLAYS", .code = 0x9, .desc = "Number of times the instruction fetch pipeline is flushed and replayed because the IFU buffers are full and unable to accept any instructions", }, { .name = "KILLED_FETCH_SLOTS", .code = 0x89, .desc = "Valid fetch slots killed due to taken branches/jumps or stalling instructions", }, { .name = "DDQ0_FULL_DR_STALLS", .code = 0xd, .desc = "Cycles where no instructions are brought into the IDU because the ALU instruction candidate pool is full", }, { .name = "DDQ1_FULL_DR_STALLS", .code = 0x8d, .desc = "Cycles where no instructions are brought into the IDU because the AGEN instruction candidate pool is full", }, { .name = "ALCB_FULL_DR_STALLS", .code = 0xe, .desc = "Cycles where no instructions can be added to the issue pool, because we have run out of ALU completion buffers (CBs)", }, { .name = "AGCB_FULL_DR_STALLS", .code = 0x8e, .desc = "Cycles where no instructions can be added to the issue pool, because we have run out of AGEN completion buffers (CBs)", }, { .name = "CLDQ_FULL_DR_STALLS", .code = 0xf, .desc = "Cycles where no instructions can be added to the issue pool, because we've 
used all the FIFO entries in the CLDQ which keep track of data coming back from the FPU", }, { .name = "IODQ_FULL_DR_STALLS", .code = 0x8f, .desc = "Cycles where no instructions can be added to the issue pool, because we've filled the in-order FIFO used for coprocessor 1 instructions (IOIQ)", }, { .name = "ALU_EMPTY_CYCLES", .code = 0x10, .desc = "Cycles with no ALU-pipe issue; no instructions available", }, { .name = "AGEN_EMPTY_CYCLES", .code = 0x90, .desc = "Cycles with no AGEN-pipe issue; no instructions available", }, { .name = "ALU_OPERANDS_NOT_READY_CYCLES", .code = 0x11, .desc = "Cycles with no ALU-pipe issue; we have instructions, but operands not ready", }, { .name = "AGEN_OPERANDS_NOT_READY_CYCLES", .code = 0x91, .desc = "Cycles with no AGEN-pipe issue; we have instructions, but operands not ready", }, { .name = "ALU_NO_ISSUE_CYCLES", .code = 0x12, .desc = "Cycles with no ALU-pipe issue; we have instructions, but some resource is unavailable. This includes: operands are not ready (same as event 17), div in progress inhibits MDU instructions, CorExtend resource limitation", }, { .name = "AGEN_NO_ISSUE_CYCLES", .code = 0x92, .desc = "Cycles with no AGEN-pipe issue; we have instructions, but some resource is unavailable. This includes: operands are not ready (same as event 17), Non-issued stores blocking ready to issue loads, issued cacheops blocking ready to issue loads", }, { .name = "ALU_BUBBLE_CYCLES", .code = 0x13, .desc = "ALU-pipe bubble issued. The resulting empty pipe stage guarantees that some resource will be unused for a cycle, sometime soon. Used, for example, to guarantee an opportunity to write mfc1 data into a CB", }, { .name = "AGEN_BUBBLE_CYCLES", .code = 0x93, .desc = "AGEN-pipe bubble issued. The resulting empty pipe stage guarantees that some resource will be unused for a cycle, sometime soon. 
Used, for example, to allow access to the data cache for refill or eviction", }, { .name = "SINGLE_ISSUE_CYCLES", .code = 0x14, .desc = "Cycles when one instruction is issued", }, { .name = "DUAL_ISSUE_CYCLES", .code = 0x94, .desc = "Cycles when two instructions are issued (one ALU, one AGEN)", }, { .name = "OOO_ALU_ISSUE_CYCLES", .code = 0x15, .desc = "Cycles when instructions are issued out of order into the ALU pipe. i.e. instruction issued is not the oldest in the pool", }, { .name = "OOO_AGEN_ISSUE_CYCLES", .code = 0x95, .desc = "Cycles when instructions are issued out of order into the AGEN pipe. i.e. instruction issued is not the oldest in the pool", }, { .name = "JALR_JALR_HB_INSNS", .code = 0x16, .desc = "Graduated JALR/JALR.HB", }, { .name = "DCACHE_LINE_REFILL_REQUESTS", .code = 0x96, .desc = "D-Cache line refill (not LD/ST misses)", }, { .name = "DCACHE_LOAD_ACCESSES", .code = 0x17, .desc = "Cacheable loads - Counts all accesses to the D-cache caused by load instructions. This count includes instructions that do not graduate", }, { .name = "DCACHE_ACCESSES", .code = 0x97, .desc = "All D-cache accesses (loads, stores, prefetch, cacheop etc). This count includes instructions that do not graduate", }, { .name = "DCACHE_WRITEBACKS", .code = 0x18, .desc = "D-Cache writebacks", }, { .name = "DCACHE_MISSES", .code = 0x98, .desc = "D-cache misses. This count is per instruction at graduation and includes load, store, prefetch, synci and address based cacheops", }, { .name = "JTLB_DATA_ACCESSES", .code = 0x19, .desc = "JTLB d-side (data side as opposed to instruction side) accesses", }, { .name = "JTLB_DATA_MISSES", .code = 0x99, .desc = "JTLB translation fails on d-side (data side as opposed to instruction side) accesses. 
This count includes instructions that do not graduate", }, { .name = "LOAD_STORE_REPLAYS", .code = 0x1a, .desc = "Load/store instruction redirects, which happen when the load/store follows too closely on a possibly matching cacheop", }, { .name = "DCACHE_VTAG_MISMATCH", .code = 0x9a, .desc = "The 74K core's D-cache has an auxiliary virtual tag, used to pick the right line early. When (occasionally) the physical tag match and virtual tag match do not line up, it is treated as a cache miss - in processing the miss the virtual tag is corrected for future accesses. This event counts those bogus misses", }, { .name = "L2_CACHE_WRITEBACKS", .code = 0x1c, .desc = "L2 cache writebacks", }, { .name = "L2_CACHE_ACCESSES", .code = 0x9c, .desc = "L2 cache accesses", }, { .name = "L2_CACHE_MISSES", .code = 0x1d, .desc = "L2 cache misses", }, { .name = "L2_CACHE_MISS_CYCLES", .code = 0x9d, .desc = "L2 cache miss cycles", }, { .name = "FSB_FULL_STALLS", .code = 0x1e, .desc = "Cycles Fill Store Buffer (FSB) are full and cause a pipe stall", }, { .name = "FSB_OVER_50_FULL", .code = 0x9e, .desc = "Cycles Fill Store Buffer (FSB) > 1/2 full", }, { .name = "LDQ_FULL_STALLS", .code = 0x1f, .desc = "Cycles Load Data Queue (LDQ) are full and cause a pipe stall", }, { .name = "LDQ_OVER_50_FULL", .code = 0x9f, .desc = "Cycles Load Data Queue (LDQ) > 1/2 full", }, { .name = "WBB_FULL_STALLS", .code = 0x20, .desc = "Cycles Writeback Buffer (WBB) are full and cause a pipe stall", }, { .name = "WBB_OVER_50_FULL", .code = 0xa0, .desc = "Cycles Writeback Buffer (WBB) > 1/2 full", }, { .name = "LOAD_MISS_CONSUMER_REPLAYS", .code = 0x23, .desc = "Replays following optimistic issue of instruction dependent on load which missed. 
Counted only when the dependent instruction graduates", }, { .name = "FPU_LOAD_INSNS", .code = 0xa3, .desc = "Floating Point Load instructions graduated", }, { .name = "JR_NON_31_INSNS", .code = 0x24, .desc = "jr (not $31) instructions graduated", }, { .name = "MISPREDICTED_JR_31_INSNS", .code = 0xa4, .desc = "jr $31 mispredicted at graduation", }, { .name = "INT_BRANCH_INSNS", .code = 0x25, .desc = "Integer branch instructions graduated", }, { .name = "FPU_BRANCH_INSNS", .code = 0xa5, .desc = "Floating point branch instructions graduated", }, { .name = "BRANCH_LIKELY_INSNS", .code = 0x26, .desc = "Branch-likely instructions graduated", }, { .name = "MISPREDICTED_BRANCH_LIKELY_INSNS", .code = 0xa6, .desc = "Mispredicted branch-likely instructions graduated", }, { .name = "COND_BRANCH_INSNS", .code = 0x27, .desc = "Conditional branches graduated", }, { .name = "MISPREDICTED_BRANCH_INSNS", .code = 0xa7, .desc = "Mispredicted conditional branches graduated", }, { .name = "INTEGER_INSNS", .code = 0x28, .desc = "Integer instructions graduated (includes nop, ssnop, ehb as well as all arithmetic, logical, shift and extract type operations)", }, { .name = "FPU_INSNS", .code = 0xa8, .desc = "Floating point instructions graduated (but not counting floating point load/store)", }, { .name = "LOAD_INSNS", .code = 0x29, .desc = "Loads graduated (includes floating point)", }, { .name = "STORE_INSNS", .code = 0xa9, .desc = "Stores graduated (includes floating point). 
Of sc instructions, only successful ones are counted", }, { .name = "J_JAL_INSNS", .code = 0x2a, .desc = "j/jal graduated", }, { .name = "MIPS16_INSNS", .code = 0xaa, .desc = "MIPS16e instructions graduated", }, { .name = "NOP_INSNS", .code = 0x2b, .desc = "no-ops graduated - included (sll, nop, ssnop, ehb)", }, { .name = "NT_MUL_DIV_INSNS", .code = 0xab, .desc = "integer multiply/divides graduated", }, { .name = "DSP_INSNS", .code = 0x2c, .desc = "DSP instructions graduated", }, { .name = "ALU_DSP_SATURATION_INSNS", .code = 0xac, .desc = "ALU-DSP instructions graduated, result was saturated", }, { .name = "DSP_BRANCH_INSNS", .code = 0x2d, .desc = "DSP branch instructions graduated", }, { .name = "MDU_DSP_SATURATION_INSNS", .code = 0xad, .desc = "MDU-DSP instructions graduated, result was saturated", }, { .name = "UNCACHED_LOAD_INSNS", .code = 0x2e, .desc = "Uncached loads graduated", }, { .name = "UNCACHED_STORE_INSNS", .code = 0xae, .desc = "Uncached stores graduated", }, { .name = "EJTAG_INSN_TRIGGERS", .code = 0x31, .desc = "EJTAG instruction triggers", }, { .name = "EJTAG_DATA_TRIGGERS", .code = 0xb1, .desc = "EJTAG data triggers", }, { .name = "CP1_BRANCH_MISPREDICTIONS", .code = 0x32, .desc = "CP1 branches mispredicted", }, { .name = "SC_INSNS", .code = 0x33, .desc = "sc instructions graduated", }, { .name = "FAILED_SC_INSNS", .code = 0xb3, .desc = "sc instructions failed", }, { .name = "PREFETCH_INSNS", .code = 0x34, .desc = "prefetch instructions graduated at the top of LSGB", }, { .name = "CACHE_HIT_PREFETCH_INSNS", .code = 0xb4, .desc = "prefetch instructions which did nothing, because they hit in the cache", }, { .name = "NO_INSN_CYCLES", .code = 0x35, .desc = "Cycles where no instructions graduated", }, { .name = "LOAD_MISS_INSNS", .code = 0xb5, .desc = "Load misses graduated. 
Includes floating point loads", }, { .name = "ONE_INSN_CYCLES", .code = 0x36, .desc = "Cycles where one instruction graduated", }, { .name = "TWO_INSNS_CYCLES", .code = 0xb6, .desc = "Cycles where two instructions graduated", }, { .name = "GFIFO_BLOCKED_CYCLES", .code = 0x37, .desc = "GFifo blocked cycles", }, { .name = "FPU_STORE_INSNS", .code = 0xb7, .desc = "Floating point stores graduated", }, { .name = "GFIFO_BLOCKED_TLB_CACHE", .code = 0x38, .desc = "GFifo blocked due to TLB or Cacheop", }, { .name = "NO_INSTRUCTIONS_FROM_REPLAY_CYCLES", .code = 0xb8, .desc = "Number of cycles no instructions graduated from the time the pipe was flushed because of a replay until the first new instruction graduates. This is an indicator of the graduation bandwidth loss due to replay. Oftentimes this replay is a result of event 25 and therefore an indicator of bandwidth lost due to cache misses", }, { .name = "MISPREDICTION_BRANCH_NODELAY_CYCLES", .code = 0x39, /* even counters event 57 (raw 57) */ .desc = "Slot 0 misprediction branch instruction graduation cycles without the delay slot" }, { .name = "MISPREDICTION_BRANCH_DELAY_WAIT_CYCLES", .code = 0xb9, /* even counters event 57 (raw 57) */ .desc = "Cycles waiting for delay slot to graduate on a mispredicted branch", }, { .name = "EXCEPTIONS_TAKEN", .code = 0x3a, .desc = "Exceptions taken", }, { .name = "GRADUATION_REPLAYS", .code = 0xba, .desc = "Replays initiated from graduation", }, { .name = "COREEXTEND_EVENTS", .code = 0x3b, .desc = "Implementation-specific CorExtend event. The integrator of this core may connect the core pin UDI_perfcnt_event to an event to be counted. This is intended for use with the CorExtend interface", }, { .name = "DSPRAM_EVENTS", .code = 0xbe, .desc = "Implementation-specific DSPRAM event. 
The integrator of this core may connect the core pin SP_prf_c13_e62_xx to the event to be counted", }, { .name = "L2_CACHE_SINGLE_BIT_ERRORS", .code = 0x3f, .desc = "L2 single-bit errors which were detected", }, { .name = "SYSTEM_EVENT_0", .code = 0x40, .desc = "SI_Event[0] - Implementation-specific system event. The integrator of this core may connect the core pin SI_PCEvent[0] to an event to be counted", }, { .name = "SYSTEM_EVENT_1", .code = 0xc0, .desc = "SI_Event[1] - Implementation-specific system event. The integrator of this core may connect the core pin SI_PCEvent[1] to an event to be counted", }, { .name = "SYSTEM_EVENT_2", .code = 0x41, .desc = "SI_Event[2] - Implementation-specific system event. The integrator of this core may connect the core pin SI_PCEvent[2] to an event to be counted", }, { .name = "SYSTEM_EVENT_3", .code = 0xc1, .desc = "SI_Event[3] - Implementation-specific system event. The integrator of this core may connect the core pin SI_PCEvent[3] to an event to be counted", }, { .name = "SYSTEM_EVENT_4", .code = 0x42, .desc = "SI_Event[4] - Implementation-specific system event. The integrator of this core may connect the core pin SI_PCEvent[4] to an event to be counted", }, { .name = "SYSTEM_EVENT_5", .code = 0xc2, .desc = "SI_Event[5] - Implementation-specific system event. The integrator of this core may connect the core pin SI_PCEvent[5] to an event to be counted", }, { .name = "SYSTEM_EVENT_6", .code = 0x43, .desc = "SI_Event[6] - Implementation-specific system event. The integrator of this core may connect the core pin SI_PCEvent[6] to an event to be counted", }, { .name = "SYSTEM_EVENT_7", .code = 0xc3, .desc = "SI_Event[7] - Implementation-specific system event. 
The integrator of this core may connect the core pin SI_PCEvent[7] to an event to be counted", }, { .name = "OCP_ALL_REQUESTS", .code = 0x44, .desc = "All OCP requests accepted", }, { .name = "OCP_ALL_CACHEABLE_REQUESTS", .code = 0xc4, .desc = "All OCP cacheable requests accepted", }, { .name = "OCP_READ_REQUESTS", .code = 0x45, .desc = "OCP read requests accepted", }, { .name = "OCP_READ_CACHEABLE_REQUESTS", .code = 0xc5, .desc = "OCP cacheable read requests accepted", }, { .name = "OCP_WRITE_REQUESTS", .code = 0x46, .desc = "OCP write requests accepted", }, { .name = "OCP_WRITE_CACHEABLE_REQUESTS", .code = 0xc6, .desc = "OCP cacheable write requests accepted", }, { .name = "OCP_WRITE_DATA_SENT", .code = 0xc7, .desc = "OCP write data sent", }, { .name = "OCP_READ_DATA_RECEIVED", .code = 0xc8, .desc = "OCP read data received", }, { .name = "FSB_LESS_25_FULL", .code = 0x4a, .desc = "Cycles fill store buffer (FSB) < 1/4 full", }, { .name = "FSB_25_50_FULL", .code = 0xca, .desc = "Cycles fill store buffer (FSB) 1/4 to 1/2 full", }, { .name = "LDQ_LESS_25_FULL", .code = 0x4b, .desc = "Cycles load data queue (LDQ) < 1/4 full", }, { .name = "LDQ_25_50_FULL", .code = 0xcb, .desc = "Cycles load data queue (LDQ) 1/4 to 1/2 full", }, { .name = "WBB_LESS_25_FULL", .code = 0x4c, .desc = "Cycles writeback buffer (WBB) < 1/4 full", }, { .name = "WBB_25_50_FULL", .code = 0xcc, .desc = "Cycles writeback buffer (WBB) 1/4 to 1/2 full", }, }; papi-papi-7-2-0-t/src/libpfm4/lib/events/montecito_events.h000066400000000000000000003721371502707512200236110ustar00rootroot00000000000000/* * Copyright (c) 2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ /* * This file is generated automatically * !! DO NOT CHANGE !! 
*/ static pme_mont_entry_t montecito_pe []={ #define PME_MONT_ALAT_CAPACITY_MISS_ALL 0 { "ALAT_CAPACITY_MISS_ALL", {0x30058}, 0xfff0, 2, {0xffff0007}, "ALAT Entry Replaced -- both integer and floating point instructions"}, #define PME_MONT_ALAT_CAPACITY_MISS_FP 1 { "ALAT_CAPACITY_MISS_FP", {0x20058}, 0xfff0, 2, {0xffff0007}, "ALAT Entry Replaced -- only floating point instructions"}, #define PME_MONT_ALAT_CAPACITY_MISS_INT 2 { "ALAT_CAPACITY_MISS_INT", {0x10058}, 0xfff0, 2, {0xffff0007}, "ALAT Entry Replaced -- only integer instructions"}, #define PME_MONT_BACK_END_BUBBLE_ALL 3 { "BACK_END_BUBBLE_ALL", {0x0}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe -- Front-end, RSE, EXE, FPU/L1D stall or a pipeline flush due to an exception/branch misprediction"}, #define PME_MONT_BACK_END_BUBBLE_FE 4 { "BACK_END_BUBBLE_FE", {0x10000}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe -- front-end"}, #define PME_MONT_BACK_END_BUBBLE_L1D_FPU_RSE 5 { "BACK_END_BUBBLE_L1D_FPU_RSE", {0x20000}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe -- L1D_FPU or RSE."}, #define PME_MONT_BE_BR_MISPRED_DETAIL_ANY 6 { "BE_BR_MISPRED_DETAIL_ANY", {0x61}, 0xfff0, 1, {0xffff0003}, "BE Branch Misprediction Detail -- any back-end (be) mispredictions"}, #define PME_MONT_BE_BR_MISPRED_DETAIL_PFS 7 { "BE_BR_MISPRED_DETAIL_PFS", {0x30061}, 0xfff0, 1, {0xffff0003}, "BE Branch Misprediction Detail -- only back-end pfs mispredictions for taken branches"}, #define PME_MONT_BE_BR_MISPRED_DETAIL_ROT 8 { "BE_BR_MISPRED_DETAIL_ROT", {0x20061}, 0xfff0, 1, {0xffff0003}, "BE Branch Misprediction Detail -- only back-end rotate mispredictions"}, #define PME_MONT_BE_BR_MISPRED_DETAIL_STG 9 { "BE_BR_MISPRED_DETAIL_STG", {0x10061}, 0xfff0, 1, {0xffff0003}, "BE Branch Misprediction Detail -- only back-end stage mispredictions"}, #define PME_MONT_BE_EXE_BUBBLE_ALL 10 { "BE_EXE_BUBBLE_ALL", {0x2}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- 
Back-end was stalled by exe"}, #define PME_MONT_BE_EXE_BUBBLE_ARCR 11 { "BE_EXE_BUBBLE_ARCR", {0x40002}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to AR or CR dependency"}, #define PME_MONT_BE_EXE_BUBBLE_ARCR_PR_CANCEL_BANK 12 { "BE_EXE_BUBBLE_ARCR_PR_CANCEL_BANK", {0x80002}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- ARCR, PR, CANCEL or BANK_SWITCH"}, #define PME_MONT_BE_EXE_BUBBLE_BANK_SWITCH 13 { "BE_EXE_BUBBLE_BANK_SWITCH", {0x70002}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to bank switching."}, #define PME_MONT_BE_EXE_BUBBLE_CANCEL 14 { "BE_EXE_BUBBLE_CANCEL", {0x60002}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to a canceled load"}, #define PME_MONT_BE_EXE_BUBBLE_FRALL 15 { "BE_EXE_BUBBLE_FRALL", {0x20002}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to FR/FR or FR/load dependency"}, #define PME_MONT_BE_EXE_BUBBLE_GRALL 16 { "BE_EXE_BUBBLE_GRALL", {0x10002}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to GR/GR or GR/load dependency"}, #define PME_MONT_BE_EXE_BUBBLE_GRGR 17 { "BE_EXE_BUBBLE_GRGR", {0x50002}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to GR/GR dependency"}, #define PME_MONT_BE_EXE_BUBBLE_PR 18 { "BE_EXE_BUBBLE_PR", {0x30002}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Execution Unit Stalls -- Back-end was stalled by exe due to PR dependency"}, #define PME_MONT_BE_FLUSH_BUBBLE_ALL 19 { "BE_FLUSH_BUBBLE_ALL", {0x4}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Flushes. 
-- Back-end was stalled due to either an exception/interruption or branch misprediction flush"}, #define PME_MONT_BE_FLUSH_BUBBLE_BRU 20 { "BE_FLUSH_BUBBLE_BRU", {0x10004}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Flushes. -- Back-end was stalled due to a branch misprediction flush"}, #define PME_MONT_BE_FLUSH_BUBBLE_XPN 21 { "BE_FLUSH_BUBBLE_XPN", {0x20004}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to Flushes. -- Back-end was stalled due to an exception/interruption flush"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_ALL 22 { "BE_L1D_FPU_BUBBLE_ALL", {0xca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D or FPU"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_FPU 23 { "BE_L1D_FPU_BUBBLE_FPU", {0x100ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by FPU."}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D 24 { "BE_L1D_FPU_BUBBLE_L1D", {0x200ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D. 
This includes all stalls caused by the L1 pipeline (created in the L1D stage of the L1 pipeline which corresponds to the DET stage of the main pipe)."}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_AR_CR 25 { "BE_L1D_FPU_BUBBLE_L1D_AR_CR", {0x800ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to ar/cr requiring a stall"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_FILLCONF 26 { "BE_L1D_FPU_BUBBLE_L1D_FILLCONF", {0x700ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to a store in conflict with a returning fill."}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_FULLSTBUF 27 { "BE_L1D_FPU_BUBBLE_L1D_FULLSTBUF", {0x300ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to store buffer being full"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_HPW 28 { "BE_L1D_FPU_BUBBLE_L1D_HPW", {0x500ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to Hardware Page Walker"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_L2BPRESS 29 { "BE_L1D_FPU_BUBBLE_L1D_L2BPRESS", {0x900ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to L2 Back Pressure"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_LDCHK 30 { "BE_L1D_FPU_BUBBLE_L1D_LDCHK", {0xc00ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to architectural ordering conflict"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_LDCONF 31 { "BE_L1D_FPU_BUBBLE_L1D_LDCONF", {0xb00ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to architectural ordering conflict"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_NAT 32 { "BE_L1D_FPU_BUBBLE_L1D_NAT", {0xd00ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due 
to FPU or L1D Cache -- Back-end was stalled by L1D due to L1D data return needing recirculated NaT generation."}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_NATCONF 33 { "BE_L1D_FPU_BUBBLE_L1D_NATCONF", {0xf00ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to ld8.fill conflict with st8.spill not written to unat."}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_PIPE_RECIRC 34 { "BE_L1D_FPU_BUBBLE_L1D_PIPE_RECIRC", {0x400ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to recirculate"}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_STBUFRECIR 35 { "BE_L1D_FPU_BUBBLE_L1D_STBUFRECIR", {0xe00ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to store buffer cancel needing recirculate."}, #define PME_MONT_BE_L1D_FPU_BUBBLE_L1D_TLB 36 { "BE_L1D_FPU_BUBBLE_L1D_TLB", {0xa00ca}, 0xfff0, 1, {0x5210000}, "Full Pipe Bubbles in Main Pipe due to FPU or L1D Cache -- Back-end was stalled by L1D due to L2DTLB to L1DTLB transfer"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_ALL 37 { "BE_LOST_BW_DUE_TO_FE_ALL", {0x72}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- count regardless of cause"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_BI 38 { "BE_LOST_BW_DUE_TO_FE_BI", {0x90072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by branch initialization stall"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_BRQ 39 { "BE_LOST_BW_DUE_TO_FE_BRQ", {0xa0072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by branch retirement queue stall"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_BR_ILOCK 40 { "BE_LOST_BW_DUE_TO_FE_BR_ILOCK", {0xc0072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. 
-- only if caused by branch interlock stall"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_BUBBLE 41 { "BE_LOST_BW_DUE_TO_FE_BUBBLE", {0xd0072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by branch resteer bubble stall"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_FEFLUSH 42 { "BE_LOST_BW_DUE_TO_FE_FEFLUSH", {0x10072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by a front-end flush"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_FILL_RECIRC 43 { "BE_LOST_BW_DUE_TO_FE_FILL_RECIRC", {0x80072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by a recirculate for a cache line fill operation"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_IBFULL 44 { "BE_LOST_BW_DUE_TO_FE_IBFULL", {0x50072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- (* meaningless for this event *)"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_IMISS 45 { "BE_LOST_BW_DUE_TO_FE_IMISS", {0x60072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by instruction cache miss stall"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_PLP 46 { "BE_LOST_BW_DUE_TO_FE_PLP", {0xb0072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by perfect loop prediction stall"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_TLBMISS 47 { "BE_LOST_BW_DUE_TO_FE_TLBMISS", {0x70072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. -- only if caused by TLB stall"}, #define PME_MONT_BE_LOST_BW_DUE_TO_FE_UNREACHED 48 { "BE_LOST_BW_DUE_TO_FE_UNREACHED", {0x40072}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles if BE Not Stalled for Other Reasons. 
-- only if caused by unreachable bundle"}, #define PME_MONT_BE_RSE_BUBBLE_ALL 49 { "BE_RSE_BUBBLE_ALL", {0x1}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE"}, #define PME_MONT_BE_RSE_BUBBLE_AR_DEP 50 { "BE_RSE_BUBBLE_AR_DEP", {0x20001}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to AR dependencies"}, #define PME_MONT_BE_RSE_BUBBLE_BANK_SWITCH 51 { "BE_RSE_BUBBLE_BANK_SWITCH", {0x10001}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to bank switching"}, #define PME_MONT_BE_RSE_BUBBLE_LOADRS 52 { "BE_RSE_BUBBLE_LOADRS", {0x50001}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to loadrs calculations"}, #define PME_MONT_BE_RSE_BUBBLE_OVERFLOW 53 { "BE_RSE_BUBBLE_OVERFLOW", {0x30001}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to need to spill"}, #define PME_MONT_BE_RSE_BUBBLE_UNDERFLOW 54 { "BE_RSE_BUBBLE_UNDERFLOW", {0x40001}, 0xfff0, 1, {0xffff0000}, "Full Pipe Bubbles in Main Pipe due to RSE Stalls -- Back-end was stalled by RSE due to need to fill"}, #define PME_MONT_BR_MISPRED_DETAIL_ALL_ALL_PRED 55 { "BR_MISPRED_DETAIL_ALL_ALL_PRED", {0x5b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- All branch types regardless of prediction result"}, #define PME_MONT_BR_MISPRED_DETAIL_ALL_CORRECT_PRED 56 { "BR_MISPRED_DETAIL_ALL_CORRECT_PRED", {0x1005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- All branch types, correctly predicted branches (outcome and target)"}, #define PME_MONT_BR_MISPRED_DETAIL_ALL_WRONG_PATH 57 { "BR_MISPRED_DETAIL_ALL_WRONG_PATH", {0x2005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- All branch types, mispredicted branches due to wrong branch direction"}, #define 
PME_MONT_BR_MISPRED_DETAIL_ALL_WRONG_TARGET 58 { "BR_MISPRED_DETAIL_ALL_WRONG_TARGET", {0x3005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- All branch types, mispredicted branches due to wrong target for taken branches"}, #define PME_MONT_BR_MISPRED_DETAIL_IPREL_ALL_PRED 59 { "BR_MISPRED_DETAIL_IPREL_ALL_PRED", {0x4005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only IP relative branches, regardless of prediction result"}, #define PME_MONT_BR_MISPRED_DETAIL_IPREL_CORRECT_PRED 60 { "BR_MISPRED_DETAIL_IPREL_CORRECT_PRED", {0x5005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only IP relative branches, correctly predicted branches (outcome and target)"}, #define PME_MONT_BR_MISPRED_DETAIL_IPREL_WRONG_PATH 61 { "BR_MISPRED_DETAIL_IPREL_WRONG_PATH", {0x6005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only IP relative branches, mispredicted branches due to wrong branch direction"}, #define PME_MONT_BR_MISPRED_DETAIL_IPREL_WRONG_TARGET 62 { "BR_MISPRED_DETAIL_IPREL_WRONG_TARGET", {0x7005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only IP relative branches, mispredicted branches due to wrong target for taken branches"}, #define PME_MONT_BR_MISPRED_DETAIL_NRETIND_ALL_PRED 63 { "BR_MISPRED_DETAIL_NRETIND_ALL_PRED", {0xc005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only non-return indirect branches, regardless of prediction result"}, #define PME_MONT_BR_MISPRED_DETAIL_NRETIND_CORRECT_PRED 64 { "BR_MISPRED_DETAIL_NRETIND_CORRECT_PRED", {0xd005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only non-return indirect branches, correctly predicted branches (outcome and target)"}, #define PME_MONT_BR_MISPRED_DETAIL_NRETIND_WRONG_PATH 65 { "BR_MISPRED_DETAIL_NRETIND_WRONG_PATH", {0xe005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only non-return indirect branches, mispredicted branches due to wrong branch direction"}, #define 
PME_MONT_BR_MISPRED_DETAIL_NRETIND_WRONG_TARGET 66 { "BR_MISPRED_DETAIL_NRETIND_WRONG_TARGET", {0xf005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only non-return indirect branches, mispredicted branches due to wrong target for taken branches"}, #define PME_MONT_BR_MISPRED_DETAIL_RETURN_ALL_PRED 67 { "BR_MISPRED_DETAIL_RETURN_ALL_PRED", {0x8005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only return type branches, regardless of prediction result"}, #define PME_MONT_BR_MISPRED_DETAIL_RETURN_CORRECT_PRED 68 { "BR_MISPRED_DETAIL_RETURN_CORRECT_PRED", {0x9005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only return type branches, correctly predicted branches (outcome and target)"}, #define PME_MONT_BR_MISPRED_DETAIL_RETURN_WRONG_PATH 69 { "BR_MISPRED_DETAIL_RETURN_WRONG_PATH", {0xa005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only return type branches, mispredicted branches due to wrong branch direction"}, #define PME_MONT_BR_MISPRED_DETAIL_RETURN_WRONG_TARGET 70 { "BR_MISPRED_DETAIL_RETURN_WRONG_TARGET", {0xb005b}, 0xfff0, 3, {0xffff0003}, "FE Branch Mispredict Detail -- Only return type branches, mispredicted branches due to wrong target for taken branches"}, #define PME_MONT_BR_MISPRED_DETAIL2_ALL_ALL_UNKNOWN_PRED 71 { "BR_MISPRED_DETAIL2_ALL_ALL_UNKNOWN_PRED", {0x68}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- All branch types, branches with unknown path prediction"}, #define PME_MONT_BR_MISPRED_DETAIL2_ALL_UNKNOWN_PATH_CORRECT_PRED 72 { "BR_MISPRED_DETAIL2_ALL_UNKNOWN_PATH_CORRECT_PRED", {0x10068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- All branch types, branches with unknown path prediction and correctly predicted branch (outcome & target)"}, #define PME_MONT_BR_MISPRED_DETAIL2_ALL_UNKNOWN_PATH_WRONG_PATH 73 { "BR_MISPRED_DETAIL2_ALL_UNKNOWN_PATH_WRONG_PATH", {0x20068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict 
Detail (Unknown Path Component) -- All branch types, branches with unknown path prediction and wrong branch direction"}, #define PME_MONT_BR_MISPRED_DETAIL2_IPREL_ALL_UNKNOWN_PRED 74 { "BR_MISPRED_DETAIL2_IPREL_ALL_UNKNOWN_PRED", {0x40068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only IP relative branches, branches with unknown path prediction"}, #define PME_MONT_BR_MISPRED_DETAIL2_IPREL_UNKNOWN_PATH_CORRECT_PRED 75 { "BR_MISPRED_DETAIL2_IPREL_UNKNOWN_PATH_CORRECT_PRED", {0x50068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only IP relative branches, branches with unknown path prediction and correct predicted branch (outcome & target)"}, #define PME_MONT_BR_MISPRED_DETAIL2_IPREL_UNKNOWN_PATH_WRONG_PATH 76 { "BR_MISPRED_DETAIL2_IPREL_UNKNOWN_PATH_WRONG_PATH", {0x60068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only IP relative branches, branches with unknown path prediction and wrong branch direction"}, #define PME_MONT_BR_MISPRED_DETAIL2_NRETIND_ALL_UNKNOWN_PRED 77 { "BR_MISPRED_DETAIL2_NRETIND_ALL_UNKNOWN_PRED", {0xc0068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only non-return indirect branches, branches with unknown path prediction"}, #define PME_MONT_BR_MISPRED_DETAIL2_NRETIND_UNKNOWN_PATH_CORRECT_PRED 78 { "BR_MISPRED_DETAIL2_NRETIND_UNKNOWN_PATH_CORRECT_PRED", {0xd0068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only non-return indirect branches, branches with unknown path prediction and correct predicted branch (outcome & target)"}, #define PME_MONT_BR_MISPRED_DETAIL2_NRETIND_UNKNOWN_PATH_WRONG_PATH 79 { "BR_MISPRED_DETAIL2_NRETIND_UNKNOWN_PATH_WRONG_PATH", {0xe0068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only non-return indirect branches, branches with unknown path prediction and wrong branch direction"}, #define 
PME_MONT_BR_MISPRED_DETAIL2_RETURN_ALL_UNKNOWN_PRED 80 { "BR_MISPRED_DETAIL2_RETURN_ALL_UNKNOWN_PRED", {0x80068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only return type branches, branches with unknown path prediction"}, #define PME_MONT_BR_MISPRED_DETAIL2_RETURN_UNKNOWN_PATH_CORRECT_PRED 81 { "BR_MISPRED_DETAIL2_RETURN_UNKNOWN_PATH_CORRECT_PRED", {0x90068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only return type branches, branches with unknown path prediction and correct predicted branch (outcome & target)"}, #define PME_MONT_BR_MISPRED_DETAIL2_RETURN_UNKNOWN_PATH_WRONG_PATH 82 { "BR_MISPRED_DETAIL2_RETURN_UNKNOWN_PATH_WRONG_PATH", {0xa0068}, 0xfff0, 2, {0xffff0003}, "FE Branch Mispredict Detail (Unknown Path Component) -- Only return type branches, branches with unknown path prediction and wrong branch direction"}, #define PME_MONT_BR_PATH_PRED_ALL_MISPRED_NOTTAKEN 83 { "BR_PATH_PRED_ALL_MISPRED_NOTTAKEN", {0x54}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- All branch types, incorrectly predicted path and not taken branch"}, #define PME_MONT_BR_PATH_PRED_ALL_MISPRED_TAKEN 84 { "BR_PATH_PRED_ALL_MISPRED_TAKEN", {0x10054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- All branch types, incorrectly predicted path and taken branch"}, #define PME_MONT_BR_PATH_PRED_ALL_OKPRED_NOTTAKEN 85 { "BR_PATH_PRED_ALL_OKPRED_NOTTAKEN", {0x20054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- All branch types, correctly predicted path and not taken branch"}, #define PME_MONT_BR_PATH_PRED_ALL_OKPRED_TAKEN 86 { "BR_PATH_PRED_ALL_OKPRED_TAKEN", {0x30054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- All branch types, correctly predicted path and taken branch"}, #define PME_MONT_BR_PATH_PRED_IPREL_MISPRED_NOTTAKEN 87 { "BR_PATH_PRED_IPREL_MISPRED_NOTTAKEN", {0x40054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only 
IP relative branches, incorrectly predicted path and not taken branch"}, #define PME_MONT_BR_PATH_PRED_IPREL_MISPRED_TAKEN 88 { "BR_PATH_PRED_IPREL_MISPRED_TAKEN", {0x50054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only IP relative branches, incorrectly predicted path and taken branch"}, #define PME_MONT_BR_PATH_PRED_IPREL_OKPRED_NOTTAKEN 89 { "BR_PATH_PRED_IPREL_OKPRED_NOTTAKEN", {0x60054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only IP relative branches, correctly predicted path and not taken branch"}, #define PME_MONT_BR_PATH_PRED_IPREL_OKPRED_TAKEN 90 { "BR_PATH_PRED_IPREL_OKPRED_TAKEN", {0x70054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only IP relative branches, correctly predicted path and taken branch"}, #define PME_MONT_BR_PATH_PRED_NRETIND_MISPRED_NOTTAKEN 91 { "BR_PATH_PRED_NRETIND_MISPRED_NOTTAKEN", {0xc0054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only non-return indirect branches, incorrectly predicted path and not taken branch"}, #define PME_MONT_BR_PATH_PRED_NRETIND_MISPRED_TAKEN 92 { "BR_PATH_PRED_NRETIND_MISPRED_TAKEN", {0xd0054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only non-return indirect branches, incorrectly predicted path and taken branch"}, #define PME_MONT_BR_PATH_PRED_NRETIND_OKPRED_NOTTAKEN 93 { "BR_PATH_PRED_NRETIND_OKPRED_NOTTAKEN", {0xe0054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only non-return indirect branches, correctly predicted path and not taken branch"}, #define PME_MONT_BR_PATH_PRED_NRETIND_OKPRED_TAKEN 94 { "BR_PATH_PRED_NRETIND_OKPRED_TAKEN", {0xf0054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only non-return indirect branches, correctly predicted path and taken branch"}, #define PME_MONT_BR_PATH_PRED_RETURN_MISPRED_NOTTAKEN 95 { "BR_PATH_PRED_RETURN_MISPRED_NOTTAKEN", {0x80054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only return type branches, 
incorrectly predicted path and not taken branch"}, #define PME_MONT_BR_PATH_PRED_RETURN_MISPRED_TAKEN 96 { "BR_PATH_PRED_RETURN_MISPRED_TAKEN", {0x90054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only return type branches, incorrectly predicted path and taken branch"}, #define PME_MONT_BR_PATH_PRED_RETURN_OKPRED_NOTTAKEN 97 { "BR_PATH_PRED_RETURN_OKPRED_NOTTAKEN", {0xa0054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only return type branches, correctly predicted path and not taken branch"}, #define PME_MONT_BR_PATH_PRED_RETURN_OKPRED_TAKEN 98 { "BR_PATH_PRED_RETURN_OKPRED_TAKEN", {0xb0054}, 0xfff0, 3, {0xffff0003}, "FE Branch Path Prediction Detail -- Only return type branches, correctly predicted path and taken branch"}, #define PME_MONT_BR_PATH_PRED2_ALL_UNKNOWNPRED_NOTTAKEN 99 { "BR_PATH_PRED2_ALL_UNKNOWNPRED_NOTTAKEN", {0x6a}, 0xfff0, 2, {0xffff0003}, "FE Branch Path Prediction Detail (Unknown pred component) -- All branch types, unknown predicted path and not taken branch (which impacts OKPRED_NOTTAKEN)"}, #define PME_MONT_BR_PATH_PRED2_ALL_UNKNOWNPRED_TAKEN 100 { "BR_PATH_PRED2_ALL_UNKNOWNPRED_TAKEN", {0x1006a}, 0xfff0, 2, {0xffff0003}, "FE Branch Path Prediction Detail (Unknown pred component) -- All branch types, unknown predicted path and taken branch (which impacts MISPRED_TAKEN)"}, #define PME_MONT_BR_PATH_PRED2_IPREL_UNKNOWNPRED_NOTTAKEN 101 { "BR_PATH_PRED2_IPREL_UNKNOWNPRED_NOTTAKEN", {0x4006a}, 0xfff0, 2, {0xffff0003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only IP relative branches, unknown predicted path and not taken branch (which impacts OKPRED_NOTTAKEN)"}, #define PME_MONT_BR_PATH_PRED2_IPREL_UNKNOWNPRED_TAKEN 102 { "BR_PATH_PRED2_IPREL_UNKNOWNPRED_TAKEN", {0x5006a}, 0xfff0, 2, {0xffff0003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only IP relative branches, unknown predicted path and taken branch (which impacts MISPRED_TAKEN)"}, #define 
PME_MONT_BR_PATH_PRED2_NRETIND_UNKNOWNPRED_NOTTAKEN 103 { "BR_PATH_PRED2_NRETIND_UNKNOWNPRED_NOTTAKEN", {0xc006a}, 0xfff0, 2, {0xffff0003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only non-return indirect branches, unknown predicted path and not taken branch (which impacts OKPRED_NOTTAKEN)"}, #define PME_MONT_BR_PATH_PRED2_NRETIND_UNKNOWNPRED_TAKEN 104 { "BR_PATH_PRED2_NRETIND_UNKNOWNPRED_TAKEN", {0xd006a}, 0xfff0, 2, {0xffff0003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only non-return indirect branches, unknown predicted path and taken branch (which impacts MISPRED_TAKEN)"}, #define PME_MONT_BR_PATH_PRED2_RETURN_UNKNOWNPRED_NOTTAKEN 105 { "BR_PATH_PRED2_RETURN_UNKNOWNPRED_NOTTAKEN", {0x8006a}, 0xfff0, 2, {0xffff0003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only return type branches, unknown predicted path and not taken branch (which impacts OKPRED_NOTTAKEN)"}, #define PME_MONT_BR_PATH_PRED2_RETURN_UNKNOWNPRED_TAKEN 106 { "BR_PATH_PRED2_RETURN_UNKNOWNPRED_TAKEN", {0x9006a}, 0xfff0, 2, {0xffff0003}, "FE Branch Path Prediction Detail (Unknown pred component) -- Only return type branches, unknown predicted path and taken branch (which impacts MISPRED_TAKEN)"}, #define PME_MONT_BUS_ALL_ANY 107 { "BUS_ALL_ANY", {0x31887}, 0x03f0, 1, {0xffff0000}, "Bus Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_ALL_EITHER 108 { "BUS_ALL_EITHER", {0x1887}, 0x03f0, 1, {0xffff0000}, "Bus Transactions -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_ALL_IO 109 { "BUS_ALL_IO", {0x11887}, 0x03f0, 1, {0xffff0000}, "Bus Transactions -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_ALL_SELF 110 { "BUS_ALL_SELF", {0x21887}, 0x03f0, 1, {0xffff0000}, "Bus Transactions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_B2B_DATA_CYCLES_ANY 111 { "BUS_B2B_DATA_CYCLES_ANY", {0x31093}, 0x03f0, 1, {0xffff0000}, "Back to Back Data Cycles on 
the Bus -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_B2B_DATA_CYCLES_EITHER 112 { "BUS_B2B_DATA_CYCLES_EITHER", {0x1093}, 0x03f0, 1, {0xffff0000}, "Back to Back Data Cycles on the Bus -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_B2B_DATA_CYCLES_IO 113 { "BUS_B2B_DATA_CYCLES_IO", {0x11093}, 0x03f0, 1, {0xffff0000}, "Back to Back Data Cycles on the Bus -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_B2B_DATA_CYCLES_SELF 114 { "BUS_B2B_DATA_CYCLES_SELF", {0x21093}, 0x03f0, 1, {0xffff0000}, "Back to Back Data Cycles on the Bus -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_DATA_CYCLE_ANY 115 { "BUS_DATA_CYCLE_ANY", {0x31088}, 0x03f0, 1, {0xffff0000}, "Valid Data Cycle on the Bus -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_DATA_CYCLE_EITHER 116 { "BUS_DATA_CYCLE_EITHER", {0x1088}, 0x03f0, 1, {0xffff0000}, "Valid Data Cycle on the Bus -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_DATA_CYCLE_IO 117 { "BUS_DATA_CYCLE_IO", {0x11088}, 0x03f0, 1, {0xffff0000}, "Valid Data Cycle on the Bus -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_DATA_CYCLE_SELF 118 { "BUS_DATA_CYCLE_SELF", {0x21088}, 0x03f0, 1, {0xffff0000}, "Valid Data Cycle on the Bus -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_HITM_ANY 119 { "BUS_HITM_ANY", {0x31884}, 0x03f0, 1, {0xffff0000}, "Bus Hit Modified Line Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_HITM_EITHER 120 { "BUS_HITM_EITHER", {0x1884}, 0x03f0, 1, {0xffff0000}, "Bus Hit Modified Line Transactions -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_HITM_IO 121 { "BUS_HITM_IO", {0x11884}, 0x03f0, 1, {0xffff0000}, "Bus Hit Modified Line Transactions -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_HITM_SELF 122 { "BUS_HITM_SELF", {0x21884}, 0x03f0, 1, {0xffff0000}, "Bus Hit Modified Line 
Transactions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_IO_ANY 123 { "BUS_IO_ANY", {0x31890}, 0x03f0, 1, {0xffff0000}, "IA-32 Compatible IO Bus Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_IO_EITHER 124 { "BUS_IO_EITHER", {0x1890}, 0x03f0, 1, {0xffff0000}, "IA-32 Compatible IO Bus Transactions -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_IO_IO 125 { "BUS_IO_IO", {0x11890}, 0x03f0, 1, {0xffff0000}, "IA-32 Compatible IO Bus Transactions -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_IO_SELF 126 { "BUS_IO_SELF", {0x21890}, 0x03f0, 1, {0xffff0000}, "IA-32 Compatible IO Bus Transactions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_MEMORY_ALL_ANY 127 { "BUS_MEMORY_ALL_ANY", {0xf188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- All bus transactions from CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_MEMORY_ALL_EITHER 128 { "BUS_MEMORY_ALL_EITHER", {0xc188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- All bus transactions from non-CPU priority agents"}, #define PME_MONT_BUS_MEMORY_ALL_IO 129 { "BUS_MEMORY_ALL_IO", {0xd188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- All bus transactions from 'this' local processor"}, #define PME_MONT_BUS_MEMORY_ALL_SELF 130 { "BUS_MEMORY_ALL_SELF", {0xe188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- All bus transactions from CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_MEMORY_EQ_128BYTE_ANY 131 { "BUS_MEMORY_EQ_128BYTE_ANY", {0x7188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- number of less than full cache line transactions (BRP, BWP, BIL) from either local processor"}, #define PME_MONT_BUS_MEMORY_EQ_128BYTE_EITHER 132 { "BUS_MEMORY_EQ_128BYTE_EITHER", {0x4188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- number of full cache line transactions (BRL, BRIL, BWL, BRC, BCR, BCCL) from non-CPU priority agents"}, #define 
PME_MONT_BUS_MEMORY_EQ_128BYTE_IO 133 { "BUS_MEMORY_EQ_128BYTE_IO", {0x5188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- number of full cache line transactions (BRL, BRIL, BWL, BRC, BCR, BCCL) from 'this' processor"}, #define PME_MONT_BUS_MEMORY_EQ_128BYTE_SELF 134 { "BUS_MEMORY_EQ_128BYTE_SELF", {0x6188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- number of full cache line transactions (BRL, BRIL, BWL, BRC, BCR, BCCL) from CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_MEMORY_LT_128BYTE_ANY 135 { "BUS_MEMORY_LT_128BYTE_ANY", {0xb188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- All bus transactions from either local processor"}, #define PME_MONT_BUS_MEMORY_LT_128BYTE_EITHER 136 { "BUS_MEMORY_LT_128BYTE_EITHER", {0x8188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- number of less than full cache line transactions (BRP, BWP, BIL) from non-CPU priority agents"}, #define PME_MONT_BUS_MEMORY_LT_128BYTE_IO 137 { "BUS_MEMORY_LT_128BYTE_IO", {0x9188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- number of less than full cache line transactions (BRP, BWP, BIL) from 'this' processor"}, #define PME_MONT_BUS_MEMORY_LT_128BYTE_SELF 138 { "BUS_MEMORY_LT_128BYTE_SELF", {0xa188a}, 0x03f0, 1, {0xffff0000}, "Bus Memory Transactions -- number of less than full cache line transactions (BRP, BWP, BIL) CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_MEM_READ_ALL_ANY 139 { "BUS_MEM_READ_ALL_ANY", {0xf188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- All memory read transactions from CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_MEM_READ_ALL_EITHER 140 { "BUS_MEM_READ_ALL_EITHER", {0xc188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- All memory read transactions from either local processor"}, #define PME_MONT_BUS_MEM_READ_ALL_IO 141 { "BUS_MEM_READ_ALL_IO", {0xd188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory 
RD, RD Invalidate, and BRIL -- All memory read transactions from non-CPU priority agents"}, #define PME_MONT_BUS_MEM_READ_ALL_SELF 142 { "BUS_MEM_READ_ALL_SELF", {0xe188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- All memory read transactions from local processor"}, #define PME_MONT_BUS_MEM_READ_BIL_ANY 143 { "BUS_MEM_READ_BIL_ANY", {0x3188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of BIL 0-byte memory read invalidate transactions from CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_MEM_READ_BIL_EITHER 144 { "BUS_MEM_READ_BIL_EITHER", {0x188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of BIL 0-byte memory read invalidate transactions from either local processor"}, #define PME_MONT_BUS_MEM_READ_BIL_IO 145 { "BUS_MEM_READ_BIL_IO", {0x1188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of BIL 0-byte memory read invalidate transactions from non-CPU priority agents"}, #define PME_MONT_BUS_MEM_READ_BIL_SELF 146 { "BUS_MEM_READ_BIL_SELF", {0x2188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of BIL 0-byte memory read invalidate transactions from local processor"}, #define PME_MONT_BUS_MEM_READ_BRIL_ANY 147 { "BUS_MEM_READ_BRIL_ANY", {0xb188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read invalidate transactions from CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_MEM_READ_BRIL_EITHER 148 { "BUS_MEM_READ_BRIL_EITHER", {0x8188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read invalidate transactions from either local processor"}, #define PME_MONT_BUS_MEM_READ_BRIL_IO 149 { "BUS_MEM_READ_BRIL_IO", {0x9188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD 
Invalidate, and BRIL -- Number of full cache line memory read invalidate transactions from non-CPU priority agents"}, #define PME_MONT_BUS_MEM_READ_BRIL_SELF 150 { "BUS_MEM_READ_BRIL_SELF", {0xa188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read invalidate transactions from local processor"}, #define PME_MONT_BUS_MEM_READ_BRL_ANY 151 { "BUS_MEM_READ_BRL_ANY", {0x7188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read transactions from CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_MEM_READ_BRL_EITHER 152 { "BUS_MEM_READ_BRL_EITHER", {0x4188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read transactions from either local processor"}, #define PME_MONT_BUS_MEM_READ_BRL_IO 153 { "BUS_MEM_READ_BRL_IO", {0x5188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read transactions from non-CPU priority agents"}, #define PME_MONT_BUS_MEM_READ_BRL_SELF 154 { "BUS_MEM_READ_BRL_SELF", {0x6188b}, 0x03f0, 1, {0xffff0000}, "Full Cache Line D/I Memory RD, RD Invalidate, and BRIL -- Number of full cache line memory read transactions from local processor"}, #define PME_MONT_BUS_RD_DATA_ANY 155 { "BUS_RD_DATA_ANY", {0x3188c}, 0x03f0, 1, {0xffff0000}, "Bus Read Data Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_RD_DATA_EITHER 156 { "BUS_RD_DATA_EITHER", {0x188c}, 0x03f0, 1, {0xffff0000}, "Bus Read Data Transactions -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_RD_DATA_IO 157 { "BUS_RD_DATA_IO", {0x1188c}, 0x03f0, 1, {0xffff0000}, "Bus Read Data Transactions -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_RD_DATA_SELF 158 { "BUS_RD_DATA_SELF", {0x2188c}, 0x03f0, 1, {0xffff0000}, "Bus Read Data 
Transactions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_RD_HIT_ANY 159 { "BUS_RD_HIT_ANY", {0x31880}, 0x03f0, 1, {0xffff0000}, "Bus Read Hit Clean Non-local Cache Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_RD_HIT_EITHER 160 { "BUS_RD_HIT_EITHER", {0x1880}, 0x03f0, 1, {0xffff0000}, "Bus Read Hit Clean Non-local Cache Transactions -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_RD_HIT_IO 161 { "BUS_RD_HIT_IO", {0x11880}, 0x03f0, 1, {0xffff0000}, "Bus Read Hit Clean Non-local Cache Transactions -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_RD_HIT_SELF 162 { "BUS_RD_HIT_SELF", {0x21880}, 0x03f0, 1, {0xffff0000}, "Bus Read Hit Clean Non-local Cache Transactions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_RD_HITM_ANY 163 { "BUS_RD_HITM_ANY", {0x31881}, 0x03f0, 1, {0xffff0000}, "Bus Read Hit Modified Non-local Cache Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_RD_HITM_EITHER 164 { "BUS_RD_HITM_EITHER", {0x1881}, 0x03f0, 1, {0xffff0000}, "Bus Read Hit Modified Non-local Cache Transactions -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_RD_HITM_IO 165 { "BUS_RD_HITM_IO", {0x11881}, 0x03f0, 1, {0xffff0000}, "Bus Read Hit Modified Non-local Cache Transactions -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_RD_HITM_SELF 166 { "BUS_RD_HITM_SELF", {0x21881}, 0x03f0, 1, {0xffff0000}, "Bus Read Hit Modified Non-local Cache Transactions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_RD_INVAL_BST_HITM_ANY 167 { "BUS_RD_INVAL_BST_HITM_ANY", {0x31883}, 0x03f0, 1, {0xffff0000}, "Bus BRIL Transaction Results in HITM -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_RD_INVAL_BST_HITM_EITHER 168 { "BUS_RD_INVAL_BST_HITM_EITHER", {0x1883}, 0x03f0, 1, {0xffff0000}, "Bus BRIL Transaction Results in HITM -- transactions initiated by either cpu 
core"}, #define PME_MONT_BUS_RD_INVAL_BST_HITM_IO 169 { "BUS_RD_INVAL_BST_HITM_IO", {0x11883}, 0x03f0, 1, {0xffff0000}, "Bus BRIL Transaction Results in HITM -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_RD_INVAL_BST_HITM_SELF 170 { "BUS_RD_INVAL_BST_HITM_SELF", {0x21883}, 0x03f0, 1, {0xffff0000}, "Bus BRIL Transaction Results in HITM -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_RD_INVAL_HITM_ANY 171 { "BUS_RD_INVAL_HITM_ANY", {0x31882}, 0x03f0, 1, {0xffff0000}, "Bus BIL Transaction Results in HITM -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_RD_INVAL_HITM_EITHER 172 { "BUS_RD_INVAL_HITM_EITHER", {0x1882}, 0x03f0, 1, {0xffff0000}, "Bus BIL Transaction Results in HITM -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_RD_INVAL_HITM_IO 173 { "BUS_RD_INVAL_HITM_IO", {0x11882}, 0x03f0, 1, {0xffff0000}, "Bus BIL Transaction Results in HITM -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_RD_INVAL_HITM_SELF 174 { "BUS_RD_INVAL_HITM_SELF", {0x21882}, 0x03f0, 1, {0xffff0000}, "Bus BIL Transaction Results in HITM -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_RD_IO_ANY 175 { "BUS_RD_IO_ANY", {0x31891}, 0x03f0, 1, {0xffff0000}, "IA-32 Compatible IO Read Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_RD_IO_EITHER 176 { "BUS_RD_IO_EITHER", {0x1891}, 0x03f0, 1, {0xffff0000}, "IA-32 Compatible IO Read Transactions -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_RD_IO_IO 177 { "BUS_RD_IO_IO", {0x11891}, 0x03f0, 1, {0xffff0000}, "IA-32 Compatible IO Read Transactions -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_RD_IO_SELF 178 { "BUS_RD_IO_SELF", {0x21891}, 0x03f0, 1, {0xffff0000}, "IA-32 Compatible IO Read Transactions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_RD_PRTL_ANY 179 { "BUS_RD_PRTL_ANY", {0x3188d}, 0x03f0, 1, {0xffff0000}, "Bus 
Read Partial Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_RD_PRTL_EITHER 180 { "BUS_RD_PRTL_EITHER", {0x188d}, 0x03f0, 1, {0xffff0000}, "Bus Read Partial Transactions -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_RD_PRTL_IO 181 { "BUS_RD_PRTL_IO", {0x1188d}, 0x03f0, 1, {0xffff0000}, "Bus Read Partial Transactions -- transactions initiated by non-CPU priority agents"}, #define PME_MONT_BUS_RD_PRTL_SELF 182 { "BUS_RD_PRTL_SELF", {0x2188d}, 0x03f0, 1, {0xffff0000}, "Bus Read Partial Transactions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_BUS_SNOOP_STALL_CYCLES_ANY 183 { "BUS_SNOOP_STALL_CYCLES_ANY", {0x3188f}, 0x03f0, 1, {0xffff0000}, "Bus Snoop Stall Cycles (from any agent) -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_SNOOP_STALL_CYCLES_EITHER 184 { "BUS_SNOOP_STALL_CYCLES_EITHER", {0x188f}, 0x03f0, 1, {0xffff0000}, "Bus Snoop Stall Cycles (from any agent) -- transactions initiated by either cpu core"}, #define PME_MONT_BUS_SNOOP_STALL_CYCLES_SELF 185 { "BUS_SNOOP_STALL_CYCLES_SELF", {0x2188f}, 0x03f0, 1, {0xffff0000}, "Bus Snoop Stall Cycles (from any agent) -- local processor"}, #define PME_MONT_BUS_WR_WB_ALL_ANY 186 { "BUS_WR_WB_ALL_ANY", {0xf1892}, 0x03f0, 1, {0xffff0000}, "Bus Write Back Transactions -- CPU or non-CPU (all transactions)."}, #define PME_MONT_BUS_WR_WB_ALL_IO 187 { "BUS_WR_WB_ALL_IO", {0xd1892}, 0x03f0, 1, {0xffff0000}, "Bus Write Back Transactions -- non-CPU priority agents"}, #define PME_MONT_BUS_WR_WB_ALL_SELF 188 { "BUS_WR_WB_ALL_SELF", {0xe1892}, 0x03f0, 1, {0xffff0000}, "Bus Write Back Transactions -- 'this' processor"}, #define PME_MONT_BUS_WR_WB_CCASTOUT_ANY 189 { "BUS_WR_WB_CCASTOUT_ANY", {0xb1892}, 0x03f0, 1, {0xffff0000}, "Bus Write Back Transactions -- CPU or non-CPU (all transactions)/Only 0-byte transactions with write back attribute (clean cast outs) will be counted"}, #define PME_MONT_BUS_WR_WB_CCASTOUT_SELF 190 { "BUS_WR_WB_CCASTOUT_SELF",
{0xa1892}, 0x03f0, 1, {0xffff0000}, "Bus Write Back Transactions -- 'this' processor/Only 0-byte transactions with write back attribute (clean cast outs) will be counted"}, #define PME_MONT_BUS_WR_WB_EQ_128BYTE_ANY 191 { "BUS_WR_WB_EQ_128BYTE_ANY", {0x71892}, 0x03f0, 1, {0xffff0000}, "Bus Write Back Transactions -- CPU or non-CPU (all transactions)./Only cache line transactions with write back or write coalesce attributes will be counted."}, #define PME_MONT_BUS_WR_WB_EQ_128BYTE_IO 192 { "BUS_WR_WB_EQ_128BYTE_IO", {0x51892}, 0x03f0, 1, {0xffff0000}, "Bus Write Back Transactions -- non-CPU priority agents/Only cache line transactions with write back or write coalesce attributes will be counted."}, #define PME_MONT_BUS_WR_WB_EQ_128BYTE_SELF 193 { "BUS_WR_WB_EQ_128BYTE_SELF", {0x61892}, 0x03f0, 1, {0xffff0000}, "Bus Write Back Transactions -- 'this' processor/Only cache line transactions with write back or write coalesce attributes will be counted."}, #define PME_MONT_CPU_CPL_CHANGES_ALL 194 { "CPU_CPL_CHANGES_ALL", {0xf0013}, 0xfff0, 1, {0xffff0000}, "Privilege Level Changes -- All changes in cpl counted"}, #define PME_MONT_CPU_CPL_CHANGES_LVL0 195 { "CPU_CPL_CHANGES_LVL0", {0x10013}, 0xfff0, 1, {0xffff0000}, "Privilege Level Changes -- All changes to/from privilege level0 are counted"}, #define PME_MONT_CPU_CPL_CHANGES_LVL1 196 { "CPU_CPL_CHANGES_LVL1", {0x20013}, 0xfff0, 1, {0xffff0000}, "Privilege Level Changes -- All changes to/from privilege level1 are counted"}, #define PME_MONT_CPU_CPL_CHANGES_LVL2 197 { "CPU_CPL_CHANGES_LVL2", {0x40013}, 0xfff0, 1, {0xffff0000}, "Privilege Level Changes -- All changes to/from privilege level2 are counted"}, #define PME_MONT_CPU_CPL_CHANGES_LVL3 198 { "CPU_CPL_CHANGES_LVL3", {0x80013}, 0xfff0, 1, {0xffff0000}, "Privilege Level Changes -- All changes to/from privilege level3 are counted"}, #define PME_MONT_CPU_OP_CYCLES_ALL 199 { "CPU_OP_CYCLES_ALL", {0x1012}, 0xfff0, 1, {0xffff0000}, "CPU Operating Cycles -- All CPU cycles
counted"}, #define PME_MONT_CPU_OP_CYCLES_QUAL 200 { "CPU_OP_CYCLES_QUAL", {0x11012}, 0xfff0, 1, {0xffff0003}, "CPU Operating Cycles -- Qualified cycles only"}, #define PME_MONT_CPU_OP_CYCLES_HALTED 201 { "CPU_OP_CYCLES_HALTED", {0x1018}, 0x0400, 7, {0xffff0000}, "CPU Operating Cycles Halted"}, #define PME_MONT_DATA_DEBUG_REGISTER_FAULT 202 { "DATA_DEBUG_REGISTER_FAULT", {0x52}, 0xfff0, 1, {0xffff0000}, "Fault Due to Data Debug Reg. Match to Load/Store Instruction"}, #define PME_MONT_DATA_DEBUG_REGISTER_MATCHES 203 { "DATA_DEBUG_REGISTER_MATCHES", {0xc6}, 0xfff0, 1, {0xffff0007}, "Data Debug Register Matches Data Address of Memory Reference."}, #define PME_MONT_DATA_EAR_ALAT 204 { "DATA_EAR_ALAT", {0xec8}, 0xfff0, 1, {0xffff0007}, "Data EAR ALAT"}, #define PME_MONT_DATA_EAR_CACHE_LAT1024 205 { "DATA_EAR_CACHE_LAT1024", {0x80dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 1024 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT128 206 { "DATA_EAR_CACHE_LAT128", {0x50dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 128 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT16 207 { "DATA_EAR_CACHE_LAT16", {0x20dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 16 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT2048 208 { "DATA_EAR_CACHE_LAT2048", {0x90dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 2048 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT256 209 { "DATA_EAR_CACHE_LAT256", {0x60dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 256 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT32 210 { "DATA_EAR_CACHE_LAT32", {0x30dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 32 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT4 211 { "DATA_EAR_CACHE_LAT4", {0xdc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 4 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT4096 212 { "DATA_EAR_CACHE_LAT4096", {0xa0dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 4096 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT512 213 { "DATA_EAR_CACHE_LAT512", {0x70dc8}, 0xfff0, 1, {0xffff0007}, 
"Data EAR Cache -- >= 512 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT64 214 { "DATA_EAR_CACHE_LAT64", {0x40dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 64 Cycles"}, #define PME_MONT_DATA_EAR_CACHE_LAT8 215 { "DATA_EAR_CACHE_LAT8", {0x10dc8}, 0xfff0, 1, {0xffff0007}, "Data EAR Cache -- >= 8 Cycles"}, #define PME_MONT_DATA_EAR_EVENTS 216 { "DATA_EAR_EVENTS", {0x8c8}, 0xfff0, 1, {0xffff0007}, "L1 Data Cache EAR Events"}, #define PME_MONT_DATA_EAR_TLB_ALL 217 { "DATA_EAR_TLB_ALL", {0xe0cc8}, 0xfff0, 1, {0xffff0007}, "Data EAR TLB -- All L1 DTLB Misses"}, #define PME_MONT_DATA_EAR_TLB_FAULT 218 { "DATA_EAR_TLB_FAULT", {0x80cc8}, 0xfff0, 1, {0xffff0007}, "Data EAR TLB -- DTLB Misses which produce a software fault"}, #define PME_MONT_DATA_EAR_TLB_L2DTLB 219 { "DATA_EAR_TLB_L2DTLB", {0x20cc8}, 0xfff0, 1, {0xffff0007}, "Data EAR TLB -- L1 DTLB Misses which hit L2 DTLB"}, #define PME_MONT_DATA_EAR_TLB_L2DTLB_OR_FAULT 220 { "DATA_EAR_TLB_L2DTLB_OR_FAULT", {0xa0cc8}, 0xfff0, 1, {0xffff0007}, "Data EAR TLB -- L1 DTLB Misses which hit L2 DTLB or produce a software fault"}, #define PME_MONT_DATA_EAR_TLB_L2DTLB_OR_VHPT 221 { "DATA_EAR_TLB_L2DTLB_OR_VHPT", {0x60cc8}, 0xfff0, 1, {0xffff0007}, "Data EAR TLB -- L1 DTLB Misses which hit L2 DTLB or VHPT"}, #define PME_MONT_DATA_EAR_TLB_VHPT 222 { "DATA_EAR_TLB_VHPT", {0x40cc8}, 0xfff0, 1, {0xffff0007}, "Data EAR TLB -- L1 DTLB Misses which hit VHPT"}, #define PME_MONT_DATA_EAR_TLB_VHPT_OR_FAULT 223 { "DATA_EAR_TLB_VHPT_OR_FAULT", {0xc0cc8}, 0xfff0, 1, {0xffff0007}, "Data EAR TLB -- L1 DTLB Misses which hit VHPT or produce a software fault"}, #define PME_MONT_DATA_REFERENCES_SET0 224 { "DATA_REFERENCES_SET0", {0xc3}, 0xfff0, 4, {0x5010007}, "Data Memory References Issued to Memory Pipeline"}, #define PME_MONT_DATA_REFERENCES_SET1 225 { "DATA_REFERENCES_SET1", {0xc5}, 0xfff0, 4, {0x5110007}, "Data Memory References Issued to Memory Pipeline"}, #define PME_MONT_DISP_STALLED 226 { "DISP_STALLED", {0x49}, 0xfff0, 1, {0xffff0000}, 
"Number of Cycles Dispersal Stalled"}, #define PME_MONT_DTLB_INSERTS_HPW 227 { "DTLB_INSERTS_HPW", {0x8c9}, 0xfff0, 4, {0xffff0000}, "Hardware Page Walker Installs to DTLB"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_ALL_ALL_PRED 228 { "ENCBR_MISPRED_DETAIL_ALL_ALL_PRED", {0x63}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- All encoded branches regardless of prediction result"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_ALL_CORRECT_PRED 229 { "ENCBR_MISPRED_DETAIL_ALL_CORRECT_PRED", {0x10063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- All encoded branches, correctly predicted branches (outcome and target)"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_ALL_WRONG_PATH 230 { "ENCBR_MISPRED_DETAIL_ALL_WRONG_PATH", {0x20063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- All encoded branches, mispredicted branches due to wrong branch direction"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_ALL_WRONG_TARGET 231 { "ENCBR_MISPRED_DETAIL_ALL_WRONG_TARGET", {0x30063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- All encoded branches, mispredicted branches due to wrong target for taken branches"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_ALL2_ALL_PRED 232 { "ENCBR_MISPRED_DETAIL_ALL2_ALL_PRED", {0xc0063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- Only non-return indirect branches, regardless of prediction result"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_ALL2_CORRECT_PRED 233 { "ENCBR_MISPRED_DETAIL_ALL2_CORRECT_PRED", {0xd0063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- Only non-return indirect branches, correctly predicted branches (outcome and target)"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_ALL2_WRONG_PATH 234 { "ENCBR_MISPRED_DETAIL_ALL2_WRONG_PATH", {0xe0063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- Only non-return indirect branches, mispredicted branches due to wrong branch direction"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_ALL2_WRONG_TARGET 235 { 
"ENCBR_MISPRED_DETAIL_ALL2_WRONG_TARGET", {0xf0063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- Only non-return indirect branches, mispredicted branches due to wrong target for taken branches"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_OVERSUB_ALL_PRED 236 { "ENCBR_MISPRED_DETAIL_OVERSUB_ALL_PRED", {0x80063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- Only return type branches, regardless of prediction result"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_OVERSUB_CORRECT_PRED 237 { "ENCBR_MISPRED_DETAIL_OVERSUB_CORRECT_PRED", {0x90063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- Only return type branches, correctly predicted branches (outcome and target)"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_OVERSUB_WRONG_PATH 238 { "ENCBR_MISPRED_DETAIL_OVERSUB_WRONG_PATH", {0xa0063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- Only return type branches, mispredicted branches due to wrong branch direction"}, #define PME_MONT_ENCBR_MISPRED_DETAIL_OVERSUB_WRONG_TARGET 239 { "ENCBR_MISPRED_DETAIL_OVERSUB_WRONG_TARGET", {0xb0063}, 0xfff0, 3, {0xffff0003}, "Number of Encoded Branches Retired -- Only return type branches, mispredicted branches due to wrong target for taken branches"}, #define PME_MONT_ER_BKSNP_ME_ACCEPTED 240 { "ER_BKSNP_ME_ACCEPTED", {0x10bb}, 0x03f0, 2, {0xffff0000}, "Backsnoop Me Accepted"}, #define PME_MONT_ER_BRQ_LIVE_REQ_HI 241 { "ER_BRQ_LIVE_REQ_HI", {0x10b8}, 0x03f0, 2, {0xffff0000}, "BRQ Live Requests (upper 2 bits)"}, #define PME_MONT_ER_BRQ_LIVE_REQ_LO 242 { "ER_BRQ_LIVE_REQ_LO", {0x10b9}, 0x03f0, 7, {0xffff0000}, "BRQ Live Requests (lower 3 bits)"}, #define PME_MONT_ER_BRQ_REQ_INSERTED 243 { "ER_BRQ_REQ_INSERTED", {0x8ba}, 0x03f0, 1, {0xffff0000}, "BRQ Requests Inserted"}, #define PME_MONT_ER_MEM_READ_OUT_HI 244 { "ER_MEM_READ_OUT_HI", {0x8b4}, 0x03f0, 2, {0xffff0000}, "Outstanding Memory Read Transactions (upper 2 bits)"}, #define PME_MONT_ER_MEM_READ_OUT_LO 245 { 
"ER_MEM_READ_OUT_LO", {0x8b5}, 0x03f0, 7, {0xffff0000}, "Outstanding Memory Read Transactions (lower 3 bits)"}, #define PME_MONT_ER_REJECT_ALL_L1D_REQ 246 { "ER_REJECT_ALL_L1D_REQ", {0x10bd}, 0x03f0, 1, {0xffff0000}, "Reject All L1D Requests"}, #define PME_MONT_ER_REJECT_ALL_L1I_REQ 247 { "ER_REJECT_ALL_L1I_REQ", {0x10be}, 0x03f0, 1, {0xffff0000}, "Reject All L1I Requests"}, #define PME_MONT_ER_REJECT_ALL_L1_REQ 248 { "ER_REJECT_ALL_L1_REQ", {0x10bc}, 0x03f0, 1, {0xffff0000}, "Reject All L1 Requests"}, #define PME_MONT_ER_SNOOPQ_REQ_HI 249 { "ER_SNOOPQ_REQ_HI", {0x10b6}, 0x03f0, 2, {0xffff0000}, "Outstanding Snoops (upper bit)"}, #define PME_MONT_ER_SNOOPQ_REQ_LO 250 { "ER_SNOOPQ_REQ_LO", {0x10b7}, 0x03f0, 7, {0xffff0000}, "Outstanding Snoops (lower 3 bits)"}, #define PME_MONT_ETB_EVENT 251 { "ETB_EVENT", {0x111}, 0xfff0, 1, {0xffff0003}, "Execution Trace Buffer Event Captured"}, #define PME_MONT_FE_BUBBLE_ALL 252 { "FE_BUBBLE_ALL", {0x71}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- count regardless of cause"}, #define PME_MONT_FE_BUBBLE_ALLBUT_FEFLUSH_BUBBLE 253 { "FE_BUBBLE_ALLBUT_FEFLUSH_BUBBLE", {0xb0071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- ALL except FEFLUSH and BUBBLE"}, #define PME_MONT_FE_BUBBLE_ALLBUT_IBFULL 254 { "FE_BUBBLE_ALLBUT_IBFULL", {0xc0071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- ALL except IBFULL"}, #define PME_MONT_FE_BUBBLE_BRANCH 255 { "FE_BUBBLE_BRANCH", {0x90071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- only if caused by any of 4 branch recirculates"}, #define PME_MONT_FE_BUBBLE_BUBBLE 256 { "FE_BUBBLE_BUBBLE", {0xd0071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- only if caused by branch bubble stall"}, #define PME_MONT_FE_BUBBLE_FEFLUSH 257 { "FE_BUBBLE_FEFLUSH", {0x10071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- only if caused by a front-end flush"}, #define PME_MONT_FE_BUBBLE_FILL_RECIRC 258 { "FE_BUBBLE_FILL_RECIRC", {0x80071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- only if
caused by a recirculate for a cache line fill operation"}, #define PME_MONT_FE_BUBBLE_GROUP1 259 { "FE_BUBBLE_GROUP1", {0x30071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- BUBBLE or BRANCH"}, #define PME_MONT_FE_BUBBLE_GROUP2 260 { "FE_BUBBLE_GROUP2", {0x40071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- IMISS or TLBMISS"}, #define PME_MONT_FE_BUBBLE_GROUP3 261 { "FE_BUBBLE_GROUP3", {0xa0071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- FILL_RECIRC or BRANCH"}, #define PME_MONT_FE_BUBBLE_IBFULL 262 { "FE_BUBBLE_IBFULL", {0x50071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- only if caused by instruction buffer full stall"}, #define PME_MONT_FE_BUBBLE_IMISS 263 { "FE_BUBBLE_IMISS", {0x60071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- only if caused by instruction cache miss stall"}, #define PME_MONT_FE_BUBBLE_TLBMISS 264 { "FE_BUBBLE_TLBMISS", {0x70071}, 0xfff0, 1, {0xffff0000}, "Bubbles Seen by FE -- only if caused by TLB stall"}, #define PME_MONT_FE_LOST_BW_ALL 265 { "FE_LOST_BW_ALL", {0x70}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- count regardless of cause"}, #define PME_MONT_FE_LOST_BW_BI 266 { "FE_LOST_BW_BI", {0x90070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by branch initialization stall"}, #define PME_MONT_FE_LOST_BW_BRQ 267 { "FE_LOST_BW_BRQ", {0xa0070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by branch retirement queue stall"}, #define PME_MONT_FE_LOST_BW_BR_ILOCK 268 { "FE_LOST_BW_BR_ILOCK", {0xc0070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by branch interlock stall"}, #define PME_MONT_FE_LOST_BW_BUBBLE 269 { "FE_LOST_BW_BUBBLE", {0xd0070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by branch resteer bubble stall"}, #define PME_MONT_FE_LOST_BW_FEFLUSH 270 { "FE_LOST_BW_FEFLUSH", {0x10070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the 
Entrance to IB -- only if caused by a front-end flush"}, #define PME_MONT_FE_LOST_BW_FILL_RECIRC 271 { "FE_LOST_BW_FILL_RECIRC", {0x80070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by a recirculate for a cache line fill operation"}, #define PME_MONT_FE_LOST_BW_IBFULL 272 { "FE_LOST_BW_IBFULL", {0x50070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by instruction buffer full stall"}, #define PME_MONT_FE_LOST_BW_IMISS 273 { "FE_LOST_BW_IMISS", {0x60070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by instruction cache miss stall"}, #define PME_MONT_FE_LOST_BW_PLP 274 { "FE_LOST_BW_PLP", {0xb0070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by perfect loop prediction stall"}, #define PME_MONT_FE_LOST_BW_TLBMISS 275 { "FE_LOST_BW_TLBMISS", {0x70070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by TLB stall"}, #define PME_MONT_FE_LOST_BW_UNREACHED 276 { "FE_LOST_BW_UNREACHED", {0x40070}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Entrance to IB -- only if caused by unreachable bundle"}, #define PME_MONT_FP_FAILED_FCHKF 277 { "FP_FAILED_FCHKF", {0x6}, 0xfff0, 1, {0xffff0001}, "Failed fchkf"}, #define PME_MONT_FP_FALSE_SIRSTALL 278 { "FP_FALSE_SIRSTALL", {0x5}, 0xfff0, 1, {0xffff0001}, "SIR Stall Without a Trap"}, #define PME_MONT_FP_FLUSH_TO_ZERO_FTZ_POSS 279 { "FP_FLUSH_TO_ZERO_FTZ_POSS", {0x1000b}, 0xfff0, 2, {0xffff0001}, "FP Result Flushed to Zero -- "}, #define PME_MONT_FP_FLUSH_TO_ZERO_FTZ_REAL 280 { "FP_FLUSH_TO_ZERO_FTZ_REAL", {0xb}, 0xfff0, 2, {0xffff0001}, "FP Result Flushed to Zero -- Times FTZ"}, #define PME_MONT_FP_OPS_RETIRED 281 { "FP_OPS_RETIRED", {0x9}, 0xfff0, 6, {0xffff0001}, "Retired FP Operations"}, #define PME_MONT_FP_TRUE_SIRSTALL 282 { "FP_TRUE_SIRSTALL", {0x3}, 0xfff0, 1, {0xffff0001}, "SIR stall asserted and leads to a trap"}, #define 
PME_MONT_HPW_DATA_REFERENCES 283 { "HPW_DATA_REFERENCES", {0x2d}, 0xfff0, 4, {0xffff0000}, "Data Memory References to VHPT"}, #define PME_MONT_IA64_INST_RETIRED_THIS 284 { "IA64_INST_RETIRED_THIS", {0x8}, 0xfff0, 6, {0xffff0003}, "Retired IA-64 Instructions -- Retired IA-64 Instructions"}, #define PME_MONT_IA64_TAGGED_INST_RETIRED_IBRP0_PMC32_33 285 { "IA64_TAGGED_INST_RETIRED_IBRP0_PMC32_33", {0x8}, 0xfff0, 6, {0xffff0003}, "Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 0 and the opcode matcher pair PMC32 and PMC33."}, #define PME_MONT_IA64_TAGGED_INST_RETIRED_IBRP1_PMC34_35 286 { "IA64_TAGGED_INST_RETIRED_IBRP1_PMC34_35", {0x10008}, 0xfff0, 6, {0xffff0003}, "Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 1 and the opcode matcher pair PMC34 and PMC35."}, #define PME_MONT_IA64_TAGGED_INST_RETIRED_IBRP2_PMC32_33 287 { "IA64_TAGGED_INST_RETIRED_IBRP2_PMC32_33", {0x20008}, 0xfff0, 6, {0xffff0003}, "Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 2 and the opcode matcher pair PMC32 and PMC33."}, #define PME_MONT_IA64_TAGGED_INST_RETIRED_IBRP3_PMC34_35 288 { "IA64_TAGGED_INST_RETIRED_IBRP3_PMC34_35", {0x30008}, 0xfff0, 6, {0xffff0003}, "Retired Tagged Instructions -- Instruction tagged by Instruction Breakpoint Pair 3 and the opcode matcher pair PMC34 and PMC35."}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_ALL 289 { "IDEAL_BE_LOST_BW_DUE_TO_FE_ALL", {0x73}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- count regardless of cause"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_BI 290 { "IDEAL_BE_LOST_BW_DUE_TO_FE_BI", {0x90073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by branch initialization stall"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_BRQ 291 { "IDEAL_BE_LOST_BW_DUE_TO_FE_BRQ", {0xa0073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by branch retirement queue stall"}, 
#define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_BR_ILOCK 292 { "IDEAL_BE_LOST_BW_DUE_TO_FE_BR_ILOCK", {0xc0073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by branch interlock stall"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_BUBBLE 293 { "IDEAL_BE_LOST_BW_DUE_TO_FE_BUBBLE", {0xd0073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by branch resteer bubble stall"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_FEFLUSH 294 { "IDEAL_BE_LOST_BW_DUE_TO_FE_FEFLUSH", {0x10073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by a front-end flush"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_FILL_RECIRC 295 { "IDEAL_BE_LOST_BW_DUE_TO_FE_FILL_RECIRC", {0x80073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by a recirculate for a cache line fill operation"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_IBFULL 296 { "IDEAL_BE_LOST_BW_DUE_TO_FE_IBFULL", {0x50073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- (* meaningless for this event *)"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_IMISS 297 { "IDEAL_BE_LOST_BW_DUE_TO_FE_IMISS", {0x60073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by instruction cache miss stall"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_PLP 298 { "IDEAL_BE_LOST_BW_DUE_TO_FE_PLP", {0xb0073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by perfect loop prediction stall"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_TLBMISS 299 { "IDEAL_BE_LOST_BW_DUE_TO_FE_TLBMISS", {0x70073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by TLB stall"}, #define PME_MONT_IDEAL_BE_LOST_BW_DUE_TO_FE_UNREACHED 300 { "IDEAL_BE_LOST_BW_DUE_TO_FE_UNREACHED", {0x40073}, 0xfff0, 2, {0xffff0000}, "Invalid Bundles at the Exit from IB -- only if caused by unreachable bundle"}, #define PME_MONT_INST_CHKA_LDC_ALAT_ALL 301 { 
"INST_CHKA_LDC_ALAT_ALL", {0x30056}, 0xfff0, 2, {0xffff0007}, "Retired chk.a and ld.c Instructions -- both integer and floating point instructions"}, #define PME_MONT_INST_CHKA_LDC_ALAT_FP 302 { "INST_CHKA_LDC_ALAT_FP", {0x20056}, 0xfff0, 2, {0xffff0007}, "Retired chk.a and ld.c Instructions -- only floating point instructions"}, #define PME_MONT_INST_CHKA_LDC_ALAT_INT 303 { "INST_CHKA_LDC_ALAT_INT", {0x10056}, 0xfff0, 2, {0xffff0007}, "Retired chk.a and ld.c Instructions -- only integer instructions"}, #define PME_MONT_INST_DISPERSED 304 { "INST_DISPERSED", {0x4d}, 0xfff0, 6, {0xffff0001}, "Syllables Dispersed from REN to REG stage"}, #define PME_MONT_INST_FAILED_CHKA_LDC_ALAT_ALL 305 { "INST_FAILED_CHKA_LDC_ALAT_ALL", {0x30057}, 0xfff0, 1, {0xffff0007}, "Failed chk.a and ld.c Instructions -- both integer and floating point instructions"}, #define PME_MONT_INST_FAILED_CHKA_LDC_ALAT_FP 306 { "INST_FAILED_CHKA_LDC_ALAT_FP", {0x20057}, 0xfff0, 1, {0xffff0007}, "Failed chk.a and ld.c Instructions -- only floating point instructions"}, #define PME_MONT_INST_FAILED_CHKA_LDC_ALAT_INT 307 { "INST_FAILED_CHKA_LDC_ALAT_INT", {0x10057}, 0xfff0, 1, {0xffff0007}, "Failed chk.a and ld.c Instructions -- only integer instructions"}, #define PME_MONT_INST_FAILED_CHKS_RETIRED_ALL 308 { "INST_FAILED_CHKS_RETIRED_ALL", {0x30055}, 0xfff0, 1, {0xffff0000}, "Failed chk.s Instructions -- both integer and floating point instructions"}, #define PME_MONT_INST_FAILED_CHKS_RETIRED_FP 309 { "INST_FAILED_CHKS_RETIRED_FP", {0x20055}, 0xfff0, 1, {0xffff0000}, "Failed chk.s Instructions -- only floating point instructions"}, #define PME_MONT_INST_FAILED_CHKS_RETIRED_INT 310 { "INST_FAILED_CHKS_RETIRED_INT", {0x10055}, 0xfff0, 1, {0xffff0000}, "Failed chk.s Instructions -- only integer instructions"}, #define PME_MONT_ISB_BUNPAIRS_IN 311 { "ISB_BUNPAIRS_IN", {0x46}, 0xfff0, 1, {0xffff0001}, "Bundle Pairs Written from L2I into FE"}, #define PME_MONT_ITLB_MISSES_FETCH_ALL 312 { 
"ITLB_MISSES_FETCH_ALL", {0x30047}, 0xfff0, 1, {0xffff0001}, "ITLB Misses Demand Fetch -- All TLB misses will be counted. Note that this is not equal to the sum of the L1ITLB and L2ITLB umasks because any access could be a miss in L1ITLB and L2ITLB."}, #define PME_MONT_ITLB_MISSES_FETCH_L1ITLB 313 { "ITLB_MISSES_FETCH_L1ITLB", {0x10047}, 0xfff0, 1, {0xffff0001}, "ITLB Misses Demand Fetch -- All misses in L1ITLB will be counted. Even if L1ITLB is not updated for an access (Uncacheable/nat page/not present page/faulting/some flushed), it will be counted here."}, #define PME_MONT_ITLB_MISSES_FETCH_L2ITLB 314 { "ITLB_MISSES_FETCH_L2ITLB", {0x20047}, 0xfff0, 1, {0xffff0001}, "ITLB Misses Demand Fetch -- All misses in L1ITLB which also missed in L2ITLB will be counted."}, #define PME_MONT_L1DTLB_TRANSFER 315 { "L1DTLB_TRANSFER", {0xc0}, 0xfff0, 1, {0x5010007}, "L1DTLB Misses That Hit in the L2DTLB for Accesses Counted in L1D_READS"}, #define PME_MONT_L1D_READS_SET0 316 { "L1D_READS_SET0", {0xc2}, 0xfff0, 2, {0x5010007}, "L1 Data Cache Reads"}, #define PME_MONT_L1D_READS_SET1 317 { "L1D_READS_SET1", {0xc4}, 0xfff0, 2, {0x5110007}, "L1 Data Cache Reads"}, #define PME_MONT_L1D_READ_MISSES_ALL 318 { "L1D_READ_MISSES_ALL", {0xc7}, 0xfff0, 2, {0x5110007}, "L1 Data Cache Read Misses -- all L1D read misses will be counted."}, #define PME_MONT_L1D_READ_MISSES_RSE_FILL 319 { "L1D_READ_MISSES_RSE_FILL", {0x100c7}, 0xfff0, 2, {0x5110007}, "L1 Data Cache Read Misses -- only L1D read misses caused by RSE fills will be counted"}, #define PME_MONT_L1ITLB_INSERTS_HPW 320 { "L1ITLB_INSERTS_HPW", {0x48}, 0xfff0, 1, {0xffff0001}, "L1ITLB Hardware Page Walker Inserts"}, #define PME_MONT_L1I_EAR_CACHE_LAT0 321 { "L1I_EAR_CACHE_LAT0", {0x400b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- > 0 Cycles (All L1 Misses)"}, #define PME_MONT_L1I_EAR_CACHE_LAT1024 322 { "L1I_EAR_CACHE_LAT1024", {0xc00b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- >= 1024 Cycles"}, #define PME_MONT_L1I_EAR_CACHE_LAT128
323 { "L1I_EAR_CACHE_LAT128", {0xf00b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- >= 128 Cycles"}, #define PME_MONT_L1I_EAR_CACHE_LAT16 324 { "L1I_EAR_CACHE_LAT16", {0xfc0b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- >= 16 Cycles"}, #define PME_MONT_L1I_EAR_CACHE_LAT256 325 { "L1I_EAR_CACHE_LAT256", {0xe00b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- >= 256 Cycles"}, #define PME_MONT_L1I_EAR_CACHE_LAT32 326 { "L1I_EAR_CACHE_LAT32", {0xf80b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- >= 32 Cycles"}, #define PME_MONT_L1I_EAR_CACHE_LAT4 327 { "L1I_EAR_CACHE_LAT4", {0xff0b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- >= 4 Cycles"}, #define PME_MONT_L1I_EAR_CACHE_LAT4096 328 { "L1I_EAR_CACHE_LAT4096", {0x800b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- >= 4096 Cycles"}, #define PME_MONT_L1I_EAR_CACHE_LAT8 329 { "L1I_EAR_CACHE_LAT8", {0xfe0b43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- >= 8 Cycles"}, #define PME_MONT_L1I_EAR_CACHE_RAB 330 { "L1I_EAR_CACHE_RAB", {0xb43}, 0xfff0, 1, {0xffff0001}, "L1I EAR Cache -- RAB HIT"}, #define PME_MONT_L1I_EAR_EVENTS 331 { "L1I_EAR_EVENTS", {0x843}, 0xfff0, 1, {0xffff0001}, "Instruction EAR Events"}, #define PME_MONT_L1I_EAR_TLB_ALL 332 { "L1I_EAR_TLB_ALL", {0x70a43}, 0xfff0, 1, {0xffff0001}, "L1I EAR TLB -- All L1 ITLB Misses"}, #define PME_MONT_L1I_EAR_TLB_FAULT 333 { "L1I_EAR_TLB_FAULT", {0x40a43}, 0xfff0, 1, {0xffff0001}, "L1I EAR TLB -- ITLB Misses which produced a fault"}, #define PME_MONT_L1I_EAR_TLB_L2TLB 334 { "L1I_EAR_TLB_L2TLB", {0x10a43}, 0xfff0, 1, {0xffff0001}, "L1I EAR TLB -- L1 ITLB Misses which hit L2 ITLB"}, #define PME_MONT_L1I_EAR_TLB_L2TLB_OR_FAULT 335 { "L1I_EAR_TLB_L2TLB_OR_FAULT", {0x50a43}, 0xfff0, 1, {0xffff0001}, "L1I EAR TLB -- L1 ITLB Misses which hit L2 ITLB or produce a software fault"}, #define PME_MONT_L1I_EAR_TLB_L2TLB_OR_VHPT 336 { "L1I_EAR_TLB_L2TLB_OR_VHPT", {0x30a43}, 0xfff0, 1, {0xffff0001}, "L1I EAR TLB -- L1 ITLB Misses which hit L2 ITLB or VHPT"}, #define 
PME_MONT_L1I_EAR_TLB_VHPT 337 { "L1I_EAR_TLB_VHPT", {0x20a43}, 0xfff0, 1, {0xffff0001}, "L1I EAR TLB -- L1 ITLB Misses which hit VHPT"}, #define PME_MONT_L1I_EAR_TLB_VHPT_OR_FAULT 338 { "L1I_EAR_TLB_VHPT_OR_FAULT", {0x60a43}, 0xfff0, 1, {0xffff0001}, "L1I EAR TLB -- L1 ITLB Misses which hit VHPT or produce a software fault"}, #define PME_MONT_L1I_FETCH_ISB_HIT 339 { "L1I_FETCH_ISB_HIT", {0x66}, 0xfff0, 1, {0xffff0001}, "\"Just-In-Time\" Instruction Fetch Hitting in and Being Bypassed from ISB"}, #define PME_MONT_L1I_FETCH_RAB_HIT 340 { "L1I_FETCH_RAB_HIT", {0x65}, 0xfff0, 1, {0xffff0001}, "Instruction Fetch Hitting in RAB"}, #define PME_MONT_L1I_FILLS 341 { "L1I_FILLS", {0x841}, 0xfff0, 1, {0xffff0001}, "L1 Instruction Cache Fills"}, #define PME_MONT_L1I_PREFETCHES 342 { "L1I_PREFETCHES", {0x44}, 0xfff0, 1, {0xffff0001}, "L1 Instruction Prefetch Requests"}, #define PME_MONT_L1I_PREFETCH_STALL_ALL 343 { "L1I_PREFETCH_STALL_ALL", {0x30067}, 0xfff0, 1, {0xffff0000}, "Prefetch Pipeline Stalls -- Number of clocks prefetch pipeline is stalled"}, #define PME_MONT_L1I_PREFETCH_STALL_FLOW 344 { "L1I_PREFETCH_STALL_FLOW", {0x20067}, 0xfff0, 1, {0xffff0000}, "Prefetch Pipeline Stalls -- Asserted when the streaming prefetcher is working close to the instructions being fetched for demand reads, and is not asserted when the streaming prefetcher is ranging way ahead of the demand reads."}, #define PME_MONT_L1I_PURGE 345 { "L1I_PURGE", {0x104b}, 0xfff0, 1, {0xffff0001}, "L1ITLB Purges Handled by L1I"}, #define PME_MONT_L1I_PVAB_OVERFLOW 346 { "L1I_PVAB_OVERFLOW", {0x69}, 0xfff0, 1, {0xffff0000}, "PVAB Overflow"}, #define PME_MONT_L1I_RAB_ALMOST_FULL 347 { "L1I_RAB_ALMOST_FULL", {0x1064}, 0xfff0, 1, {0xffff0000}, "Is RAB Almost Full?"}, #define PME_MONT_L1I_RAB_FULL 348 { "L1I_RAB_FULL", {0x1060}, 0xfff0, 1, {0xffff0000}, "Is RAB Full?"}, #define PME_MONT_L1I_READS 349 { "L1I_READS", {0x40}, 0xfff0, 1, {0xffff0001}, "L1 Instruction Cache Reads"}, #define PME_MONT_L1I_SNOOP 350 { 
"L1I_SNOOP", {0x104a}, 0xfff0, 1, {0xffff0007}, "Snoop Requests Handled by L1I"}, #define PME_MONT_L1I_STRM_PREFETCHES 351 { "L1I_STRM_PREFETCHES", {0x5f}, 0xfff0, 1, {0xffff0001}, "L1 Instruction Cache Line Prefetch Requests"}, #define PME_MONT_L2DTLB_MISSES 352 { "L2DTLB_MISSES", {0xc1}, 0xfff0, 4, {0x5010007}, "L2DTLB Misses"}, #define PME_MONT_L2D_BAD_LINES_SELECTED_ANY 353 { "L2D_BAD_LINES_SELECTED_ANY", {0x8ec}, 0xfff0, 4, {0x4520007}, "Valid Line Replaced When Invalid Line Is Available -- Valid line replaced when invalid line is available"}, #define PME_MONT_L2D_BYPASS_L2_DATA1 354 { "L2D_BYPASS_L2_DATA1", {0x8e4}, 0xfff0, 1, {0x4120007}, "Count L2D Bypasses -- Count only L2D data bypasses (L1D to L2A)"}, #define PME_MONT_L2D_BYPASS_L2_DATA2 355 { "L2D_BYPASS_L2_DATA2", {0x108e4}, 0xfff0, 1, {0x4120007}, "Count L2D Bypasses -- Count only L2D data bypasses (L1W to L2I)"}, #define PME_MONT_L2D_BYPASS_L3_DATA1 356 { "L2D_BYPASS_L3_DATA1", {0x208e4}, 0xfff0, 1, {0x4120007}, "Count L2D Bypasses -- Count only L3 data bypasses (L1D to L2A)"}, #define PME_MONT_L2D_FILLB_FULL_THIS 357 { "L2D_FILLB_FULL_THIS", {0x8f1}, 0xfff0, 1, {0x4720000}, "L2D Fill Buffer Is Full -- L2D Fill buffer is full"}, #define PME_MONT_L2D_FILL_MESI_STATE_E 358 { "L2D_FILL_MESI_STATE_E", {0x108f2}, 0xfff0, 1, {0x4820000}, "L2D Cache Fills with MESI state -- "}, #define PME_MONT_L2D_FILL_MESI_STATE_I 359 { "L2D_FILL_MESI_STATE_I", {0x308f2}, 0xfff0, 1, {0x4820000}, "L2D Cache Fills with MESI state -- "}, #define PME_MONT_L2D_FILL_MESI_STATE_M 360 { "L2D_FILL_MESI_STATE_M", {0x8f2}, 0xfff0, 1, {0x4820000}, "L2D Cache Fills with MESI state -- "}, #define PME_MONT_L2D_FILL_MESI_STATE_P 361 { "L2D_FILL_MESI_STATE_P", {0x408f2}, 0xfff0, 1, {0x4820000}, "L2D Cache Fills with MESI state -- "}, #define PME_MONT_L2D_FILL_MESI_STATE_S 362 { "L2D_FILL_MESI_STATE_S", {0x208f2}, 0xfff0, 1, {0x4820000}, "L2D Cache Fills with MESI state -- "}, #define PME_MONT_L2D_FORCE_RECIRC_FILL_HIT 363 { 
"L2D_FORCE_RECIRC_FILL_HIT", {0x808ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Count only those caused by an L2D miss which hit in the fill buffer."}, #define PME_MONT_L2D_FORCE_RECIRC_FRC_RECIRC 364 { "L2D_FORCE_RECIRC_FRC_RECIRC", {0x908ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Caused by an L2D miss when a force recirculate already existed in the Ozq."}, #define PME_MONT_L2D_FORCE_RECIRC_L1W 365 { "L2D_FORCE_RECIRC_L1W", {0xc08ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Count only those caused by an L2D miss one cycle ahead of the current op."}, #define PME_MONT_L2D_FORCE_RECIRC_LIMBO 366 { "L2D_FORCE_RECIRC_LIMBO", {0x108ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Count operations that went into the LIMBO Ozq state. This state is entered when the op sees a FILL_HIT or OZQ_MISS event."}, #define PME_MONT_L2D_FORCE_RECIRC_OZQ_MISS 367 { "L2D_FORCE_RECIRC_OZQ_MISS", {0xb08ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Caused by an L2D miss when an L2D miss was already in the OZQ."}, #define PME_MONT_L2D_FORCE_RECIRC_RECIRC 368 { "L2D_FORCE_RECIRC_RECIRC", {0x8ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Counts inserts into OzQ due to a recirculate.
The recirculate is due to secondary misses or various other conflicts"}, #define PME_MONT_L2D_FORCE_RECIRC_SAME_INDEX 369 { "L2D_FORCE_RECIRC_SAME_INDEX", {0xa08ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Caused by an L2D miss when a miss to the same index was in the same issue group."}, #define PME_MONT_L2D_FORCE_RECIRC_SECONDARY_ALL 370 { "L2D_FORCE_RECIRC_SECONDARY_ALL", {0xf08ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Caused by any L2D op that saw a miss to the same address in OZQ, L2 fill buffer, or one cycle ahead in the main pipeline."}, #define PME_MONT_L2D_FORCE_RECIRC_SECONDARY_READ 371 { "L2D_FORCE_RECIRC_SECONDARY_READ", {0xd08ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Caused by L2D read op that saw a miss to the same address in OZQ, L2 fill buffer, or one cycle ahead in the main pipeline."}, #define PME_MONT_L2D_FORCE_RECIRC_SECONDARY_WRITE 372 { "L2D_FORCE_RECIRC_SECONDARY_WRITE", {0xe08ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Caused by L2D write op that saw a miss to the same address in OZQ, L2 fill buffer, or one cycle ahead in the main pipeline."}, #define PME_MONT_L2D_FORCE_RECIRC_SNP_OR_L3 373 { "L2D_FORCE_RECIRC_SNP_OR_L3", {0x608ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Count only those caused by a snoop or L3 issue."}, #define PME_MONT_L2D_FORCE_RECIRC_TAG_NOTOK 374 { "L2D_FORCE_RECIRC_TAG_NOTOK", {0x408ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Count only those caused by L2D hits caused by in flight snoops, stores with a sibling miss to the same index, sibling probe to the same line or a pending mf.a instruction. This count can usually be ignored since its events are rare, unpredictable, and/or show up in one of the other events."}, #define PME_MONT_L2D_FORCE_RECIRC_TAG_OK 375 { "L2D_FORCE_RECIRC_TAG_OK", {0x708ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Count operations that inserted to Ozq as a hit. Thus it was NOT forced to recirculate. 
Likely identical to L2D_INSERT_HITS."}, #define PME_MONT_L2D_FORCE_RECIRC_TRAN_PREF 376 { "L2D_FORCE_RECIRC_TRAN_PREF", {0x508ea}, 0xfff0, 4, {0x4420007}, "Forced Recirculates -- Count only those caused by L2D miss requests that transformed to prefetches"}, #define PME_MONT_L2D_INSERT_HITS 377 { "L2D_INSERT_HITS", {0x8b1}, 0xfff0, 4, {0xffff0007}, "Count Number of Times an Inserting Data Request Hit in the L2D."}, #define PME_MONT_L2D_INSERT_MISSES 378 { "L2D_INSERT_MISSES", {0x8b0}, 0xfff0, 4, {0xffff0007}, "Count Number of Times an Inserting Data Request Missed the L2D."}, #define PME_MONT_L2D_ISSUED_RECIRC_OZQ_ACC 379 { "L2D_ISSUED_RECIRC_OZQ_ACC", {0x8eb}, 0xfff0, 1, {0x4420007}, "Count Number of Times a Recirculate Issue Was Attempted and Not Preempted"}, #define PME_MONT_L2D_L3ACCESS_CANCEL_ANY 380 { "L2D_L3ACCESS_CANCEL_ANY", {0x208e8}, 0xfff0, 1, {0x4320007}, "L2D Access Cancelled by L2D -- count cancels due to any reason. This umask will count more than the sum of all the other umasks. It will count things that weren't committed accesses when they reached L1w, but the L2D attempted to bypass them to the L3 anyway (speculatively). This will include accesses made repeatedly while the main pipeline is stalled and the L1D is attempting to recirculate an access down the L1D pipeline. Thus, an access could get counted many times before it really does get bypassed to the L3. 
It is a measure of how many times we asserted a request to the L3 but didn't confirm it."}, #define PME_MONT_L2D_L3ACCESS_CANCEL_ER_REJECT 381 { "L2D_L3ACCESS_CANCEL_ER_REJECT", {0x308e8}, 0xfff0, 1, {0x4320007}, "L2D Access Cancelled by L2D -- Count only requests that were rejected by ER"}, #define PME_MONT_L2D_L3ACCESS_CANCEL_INV_L3_BYP 382 { "L2D_L3ACCESS_CANCEL_INV_L3_BYP", {0x8e8}, 0xfff0, 1, {0x4320007}, "L2D Access Cancelled by L2D -- L2D cancelled a bypass because it did not commit, or was not a valid opcode to bypass, or was not a true miss of L2D (either hit, recirc, or limbo)."}, #define PME_MONT_L2D_L3ACCESS_CANCEL_P2_COV_SNP_FILL_NOSNP 383 { "L2D_L3ACCESS_CANCEL_P2_COV_SNP_FILL_NOSNP", {0x608e8}, 0xfff0, 1, {0x4320007}, "L2D Access Cancelled by L2D -- A snoop and a fill to the same address reached the L2D within a 3 cycle window of each other or a snoop hit a nosnoops entry in Ozq."}, #define PME_MONT_L2D_L3ACCESS_CANCEL_P2_COV_SNP_TEM 384 { "L2D_L3ACCESS_CANCEL_P2_COV_SNP_TEM", {0x408e8}, 0xfff0, 1, {0x4320007}, "L2D Access Cancelled by L2D -- A snoop saw an L2D tag error and missed."}, #define PME_MONT_L2D_L3ACCESS_CANCEL_P2_COV_SNP_VIC 385 { "L2D_L3ACCESS_CANCEL_P2_COV_SNP_VIC", {0x508e8}, 0xfff0, 1, {0x4320007}, "L2D Access Cancelled by L2D -- A snoop hit in the L1D victim buffer"}, #define PME_MONT_L2D_L3ACCESS_CANCEL_SPEC_L3_BYP 386 { "L2D_L3ACCESS_CANCEL_SPEC_L3_BYP", {0x108e8}, 0xfff0, 1, {0x4320007}, "L2D Access Cancelled by L2D -- L2D cancelled speculative L3 bypasses because it was not a WB memory attribute or it was an effective release."}, #define PME_MONT_L2D_L3ACCESS_CANCEL_TAIL_TRANS_DIS 387 { "L2D_L3ACCESS_CANCEL_TAIL_TRANS_DIS", {0x708e8}, 0xfff0, 1, {0x4320007}, "L2D Access Cancelled by L2D -- Count the number of cycles that either transform to prefetches or Ozq tail collapse have been dynamically disabled. 
This would indicate that memory contention has led the L2D to throttle requests to prevent livelock scenarios."}, #define PME_MONT_L2D_MISSES 388 { "L2D_MISSES", {0x8cb}, 0xfff0, 1, {0xffff0007}, "L2 Misses"}, #define PME_MONT_L2D_OPS_ISSUED_FP_LOAD 389 { "L2D_OPS_ISSUED_FP_LOAD", {0x108f0}, 0xfff0, 4, {0xffff0007}, "Operations Issued By L2D -- Count only valid floating-point loads"}, #define PME_MONT_L2D_OPS_ISSUED_INT_LOAD 390 { "L2D_OPS_ISSUED_INT_LOAD", {0x8f0}, 0xfff0, 4, {0xffff0007}, "Operations Issued By L2D -- Count only valid integer loads, including ld16."}, #define PME_MONT_L2D_OPS_ISSUED_LFETCH 391 { "L2D_OPS_ISSUED_LFETCH", {0x408f0}, 0xfff0, 4, {0xffff0007}, "Operations Issued By L2D -- Count only lfetch operations."}, #define PME_MONT_L2D_OPS_ISSUED_OTHER 392 { "L2D_OPS_ISSUED_OTHER", {0x508f0}, 0xfff0, 4, {0xffff0007}, "Operations Issued By L2D -- Count only valid non-load, non-store accesses that are not in any of the above sections."}, #define PME_MONT_L2D_OPS_ISSUED_RMW 393 { "L2D_OPS_ISSUED_RMW", {0x208f0}, 0xfff0, 4, {0xffff0007}, "Operations Issued By L2D -- Count only valid read_modify_write stores and semaphores including cmp8xchg16."}, #define PME_MONT_L2D_OPS_ISSUED_STORE 394 { "L2D_OPS_ISSUED_STORE", {0x308f0}, 0xfff0, 4, {0xffff0007}, "Operations Issued By L2D -- Count only valid non-read_modify_write stores, including st16."}, #define PME_MONT_L2D_OZDB_FULL_THIS 395 { "L2D_OZDB_FULL_THIS", {0x8e9}, 0xfff0, 1, {0x4320000}, "L2D OZ Data Buffer Is Full -- L2 OZ Data Buffer is full"}, #define PME_MONT_L2D_OZQ_ACQUIRE 396 { "L2D_OZQ_ACQUIRE", {0x8ef}, 0xfff0, 1, {0x4620000}, "Acquire Ordering Attribute Exists in L2D OZQ"}, #define PME_MONT_L2D_OZQ_CANCELS0_ACQ 397 { "L2D_OZQ_CANCELS0_ACQ", {0x608e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- caused by an acquire somewhere in Ozq or ER."}, #define PME_MONT_L2D_OZQ_CANCELS0_BANK_CONF 398 { "L2D_OZQ_CANCELS0_BANK_CONF", {0x808e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ 
Cancels (Specific Reason Set 0) -- a bypassed L2D hit operation had a bank conflict with an older sibling bypass or an older operation in the L2D pipeline."}, #define PME_MONT_L2D_OZQ_CANCELS0_CANC_L2M_TO_L2C_ST 399 { "L2D_OZQ_CANCELS0_CANC_L2M_TO_L2C_ST", {0x108e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- caused by a canceled store in L2M, L2D or L2C. This is the combination of following subevents that were available separately in Itanium2: CANC_L2M_ST=caused by canceled store in L2M, CANC_L2D_ST=caused by canceled store in L2D, CANC_L2C_ST=caused by canceled store in L2C"}, #define PME_MONT_L2D_OZQ_CANCELS0_FILL_ST_CONF 400 { "L2D_OZQ_CANCELS0_FILL_ST_CONF", {0xe08e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- an OZQ store conflicted with a returning L2D fill"}, #define PME_MONT_L2D_OZQ_CANCELS0_L2A_ST_MAT 401 { "L2D_OZQ_CANCELS0_L2A_ST_MAT", {0x208e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- canceled due to an uncanceled store match in L2A"}, #define PME_MONT_L2D_OZQ_CANCELS0_L2C_ST_MAT 402 { "L2D_OZQ_CANCELS0_L2C_ST_MAT", {0x508e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- canceled due to an uncanceled store match in L2C"}, #define PME_MONT_L2D_OZQ_CANCELS0_L2D_ST_MAT 403 { "L2D_OZQ_CANCELS0_L2D_ST_MAT", {0x408e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- canceled due to an uncanceled store match in L2D"}, #define PME_MONT_L2D_OZQ_CANCELS0_L2M_ST_MAT 404 { "L2D_OZQ_CANCELS0_L2M_ST_MAT", {0x308e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- canceled due to an uncanceled store match in L2M"}, #define PME_MONT_L2D_OZQ_CANCELS0_MISC_ORDER 405 { "L2D_OZQ_CANCELS0_MISC_ORDER", {0xd08e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- a sync.i or mf.a. 
This is the combination of following subevents that were available separately in Itanium2: SYNC=caused by sync.i, MFA=a memory fence instruction"}, #define PME_MONT_L2D_OZQ_CANCELS0_OVER_SUB 406 { "L2D_OZQ_CANCELS0_OVER_SUB", {0xa08e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- a high Ozq issue rate resulted in the L2D having to cancel due to hardware restrictions. This is the combination of following subevents that were available separately in Itanium2: OVER_SUB=oversubscription, L1DF_L2M=L1D fill in L2M"}, #define PME_MONT_L2D_OZQ_CANCELS0_OZDATA_CONF 407 { "L2D_OZQ_CANCELS0_OZDATA_CONF", {0xf08e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- an OZQ operation that needed to read the OZQ data buffer conflicted with a fill return that needed to do the same."}, #define PME_MONT_L2D_OZQ_CANCELS0_OZQ_PREEMPT 408 { "L2D_OZQ_CANCELS0_OZQ_PREEMPT", {0xb08e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- an L2D fill return conflicted with, and cancelled, an ozq request for various reasons. Formerly known as L1_FILL_CONF."}, #define PME_MONT_L2D_OZQ_CANCELS0_RECIRC 409 { "L2D_OZQ_CANCELS0_RECIRC", {0x8e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- a recirculate was cancelled due to h/w limitations on recirculate issue rate. 
This is the combination of following subevents that were available separately in Itanium2: RECIRC_OVER_SUB=caused by a recirculate oversubscription, DIDNT_RECIRC=caused because it did not recirculate, WEIRD=counts the cancels caused by attempted 5-cycle bypasses for non-aligned accesses and bypasses blocking recirculates for too long"}, #define PME_MONT_L2D_OZQ_CANCELS0_REL 410 { "L2D_OZQ_CANCELS0_REL", {0x708e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- a release was cancelled due to some other operation"}, #define PME_MONT_L2D_OZQ_CANCELS0_SEMA 411 { "L2D_OZQ_CANCELS0_SEMA", {0x908e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- a semaphore op was cancelled for various ordering or h/w restriction reasons. This is the combination of following subevents that were available separately in Itanium 2: SEM=a semaphore, CCV=a CCV"}, #define PME_MONT_L2D_OZQ_CANCELS0_WB_CONF 412 { "L2D_OZQ_CANCELS0_WB_CONF", {0xc08e0}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Specific Reason Set 0) -- an OZQ request conflicted with an L2D data array read for a writeback. 
This is the combination of following subevents that were available separately in Itanium2: READ_WB_CONF=a write back conflict, ST_FILL_CONF=a store fill conflict"}, #define PME_MONT_L2D_OZQ_CANCELS1_ANY 413 { "L2D_OZQ_CANCELS1_ANY", {0x8e2}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Late or Any) -- counts the total OZ Queue cancels"}, #define PME_MONT_L2D_OZQ_CANCELS1_LATE_BYP_EFFRELEASE 414 { "L2D_OZQ_CANCELS1_LATE_BYP_EFFRELEASE", {0x308e2}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Late or Any) -- counts the late cancels caused by L1D to L2A bypass effective releases"}, #define PME_MONT_L2D_OZQ_CANCELS1_LATE_SPEC_BYP 415 { "L2D_OZQ_CANCELS1_LATE_SPEC_BYP", {0x108e2}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Late or Any) -- counts the late cancels caused by speculative bypasses"}, #define PME_MONT_L2D_OZQ_CANCELS1_SIBLING_ACQ_REL 416 { "L2D_OZQ_CANCELS1_SIBLING_ACQ_REL", {0x208e2}, 0xfff0, 4, {0x4020007}, "L2D OZQ Cancels (Late or Any) -- counts the late cancels caused by releases and acquires in the same issue group. 
This is the combination of following subevents that were available separately in Itanium2: LATE_ACQUIRE=late cancels caused by acquires, LATE_RELEASE=late cancels caused by releases"}, #define PME_MONT_L2D_OZQ_FULL_THIS 417 { "L2D_OZQ_FULL_THIS", {0x8bc}, 0xfff0, 1, {0x4520000}, "L2D OZQ Is Full -- L2D OZQ is full"}, #define PME_MONT_L2D_OZQ_RELEASE 418 { "L2D_OZQ_RELEASE", {0x8e5}, 0xfff0, 1, {0x4120000}, "Release Ordering Attribute Exists in L2D OZQ"}, #define PME_MONT_L2D_REFERENCES_ALL 419 { "L2D_REFERENCES_ALL", {0x308e6}, 0xfff0, 4, {0x4220007}, "Data Read/Write Access to L2D -- count both read and write operations (semaphores will count as 2)"}, #define PME_MONT_L2D_REFERENCES_READS 420 { "L2D_REFERENCES_READS", {0x108e6}, 0xfff0, 4, {0x4220007}, "Data Read/Write Access to L2D -- count only data read and semaphore operations."}, #define PME_MONT_L2D_REFERENCES_WRITES 421 { "L2D_REFERENCES_WRITES", {0x208e6}, 0xfff0, 4, {0x4220007}, "Data Read/Write Access to L2D -- count only data write and semaphore operations"}, #define PME_MONT_L2D_STORE_HIT_SHARED_ANY 422 { "L2D_STORE_HIT_SHARED_ANY", {0x8ed}, 0xfff0, 2, {0x4520007}, "Store Hit a Shared Line -- Store hit a shared line"}, #define PME_MONT_L2D_VICTIMB_FULL_THIS 423 { "L2D_VICTIMB_FULL_THIS", {0x8f3}, 0xfff0, 1, {0x4820000}, "L2D Victim Buffer Is Full -- L2D victim buffer is full"}, #define PME_MONT_L2I_DEMAND_READS 424 { "L2I_DEMAND_READS", {0x42}, 0xfff0, 1, {0xffff0001}, "L2 Instruction Demand Fetch Requests"}, #define PME_MONT_L2I_HIT_CONFLICTS_ALL_ALL 425 { "L2I_HIT_CONFLICTS_ALL_ALL", {0xf087d}, 0xfff0, 1, {0xffff0001}, "L2I hit conflicts -- All fetches that reference L2I are counted"}, #define PME_MONT_L2I_HIT_CONFLICTS_ALL_DMND 426 { "L2I_HIT_CONFLICTS_ALL_DMND", {0xd087d}, 0xfff0, 1, {0xffff0001}, "L2I hit conflicts -- Only demand fetches that reference L2I are counted"}, #define PME_MONT_L2I_HIT_CONFLICTS_ALL_PFTCH 427 { "L2I_HIT_CONFLICTS_ALL_PFTCH", {0xe087d}, 0xfff0, 1, {0xffff0001}, "L2I hit 
conflicts -- Only prefetches that reference L2I are counted"}, #define PME_MONT_L2I_HIT_CONFLICTS_HIT_ALL 428 { "L2I_HIT_CONFLICTS_HIT_ALL", {0x7087d}, 0xfff0, 1, {0xffff0001}, "L2I hit conflicts -- All fetches that hit in L2I counted"}, #define PME_MONT_L2I_HIT_CONFLICTS_HIT_DMND 429 { "L2I_HIT_CONFLICTS_HIT_DMND", {0x5087d}, 0xfff0, 1, {0xffff0001}, "L2I hit conflicts -- Only demand fetches that hit in L2I are counted"}, #define PME_MONT_L2I_HIT_CONFLICTS_HIT_PFTCH 430 { "L2I_HIT_CONFLICTS_HIT_PFTCH", {0x6087d}, 0xfff0, 1, {0xffff0001}, "L2I hit conflicts -- Only prefetches that hit in L2I are counted"}, #define PME_MONT_L2I_HIT_CONFLICTS_MISS_ALL 431 { "L2I_HIT_CONFLICTS_MISS_ALL", {0xb087d}, 0xfff0, 1, {0xffff0001}, "L2I hit conflicts -- All fetches that miss in L2I are counted"}, #define PME_MONT_L2I_HIT_CONFLICTS_MISS_DMND 432 { "L2I_HIT_CONFLICTS_MISS_DMND", {0x9087d}, 0xfff0, 1, {0xffff0001}, "L2I hit conflicts -- Only demand fetches that miss in L2I are counted"}, #define PME_MONT_L2I_HIT_CONFLICTS_MISS_PFTCH 433 { "L2I_HIT_CONFLICTS_MISS_PFTCH", {0xa087d}, 0xfff0, 1, {0xffff0001}, "L2I hit conflicts -- Only prefetches that miss in L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_ALL_ALL 434 { "L2I_L3_REJECTS_ALL_ALL", {0xf087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- All fetches that reference L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_ALL_DMND 435 { "L2I_L3_REJECTS_ALL_DMND", {0xd087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- Only demand fetches that reference L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_ALL_PFTCH 436 { "L2I_L3_REJECTS_ALL_PFTCH", {0xe087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- Only prefetches that reference L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_HIT_ALL 437 { "L2I_L3_REJECTS_HIT_ALL", {0x7087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- All fetches that hit in L2I counted"}, #define PME_MONT_L2I_L3_REJECTS_HIT_DMND 438 { "L2I_L3_REJECTS_HIT_DMND", {0x5087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- Only 
demand fetches that hit in L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_HIT_PFTCH 439 { "L2I_L3_REJECTS_HIT_PFTCH", {0x6087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- Only prefetches that hit in L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_MISS_ALL 440 { "L2I_L3_REJECTS_MISS_ALL", {0xb087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- All fetches that miss in L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_MISS_DMND 441 { "L2I_L3_REJECTS_MISS_DMND", {0x9087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- Only demand fetches that miss in L2I are counted"}, #define PME_MONT_L2I_L3_REJECTS_MISS_PFTCH 442 { "L2I_L3_REJECTS_MISS_PFTCH", {0xa087c}, 0xfff0, 1, {0xffff0001}, "L3 rejects -- Only prefetches that miss in L2I are counted"}, #define PME_MONT_L2I_PREFETCHES 443 { "L2I_PREFETCHES", {0x45}, 0xfff0, 1, {0xffff0001}, "L2 Instruction Prefetch Requests"}, #define PME_MONT_L2I_READS_ALL_ALL 444 { "L2I_READS_ALL_ALL", {0xf0878}, 0xfff0, 1, {0xffff0001}, "L2I Cacheable Reads -- All fetches that reference L2I are counted"}, #define PME_MONT_L2I_READS_ALL_DMND 445 { "L2I_READS_ALL_DMND", {0xd0878}, 0xfff0, 1, {0xffff0001}, "L2I Cacheable Reads -- Only demand fetches that reference L2I are counted"}, #define PME_MONT_L2I_READS_ALL_PFTCH 446 { "L2I_READS_ALL_PFTCH", {0xe0878}, 0xfff0, 1, {0xffff0001}, "L2I Cacheable Reads -- Only prefetches that reference L2I are counted"}, #define PME_MONT_L2I_READS_HIT_ALL 447 { "L2I_READS_HIT_ALL", {0x70878}, 0xfff0, 1, {0xffff0001}, "L2I Cacheable Reads -- All fetches that hit in L2I counted"}, #define PME_MONT_L2I_READS_HIT_DMND 448 { "L2I_READS_HIT_DMND", {0x50878}, 0xfff0, 1, {0xffff0001}, "L2I Cacheable Reads -- Only demand fetches that hit in L2I are counted"}, #define PME_MONT_L2I_READS_HIT_PFTCH 449 { "L2I_READS_HIT_PFTCH", {0x60878}, 0xfff0, 1, {0xffff0001}, "L2I Cacheable Reads -- Only prefetches that hit in L2I are counted"}, #define PME_MONT_L2I_READS_MISS_ALL 450 { "L2I_READS_MISS_ALL", {0xb0878}, 0xfff0, 1, 
{0xffff0001}, "L2I Cacheable Reads -- All fetches that miss in L2I are counted"}, #define PME_MONT_L2I_READS_MISS_DMND 451 { "L2I_READS_MISS_DMND", {0x90878}, 0xfff0, 1, {0xffff0001}, "L2I Cacheable Reads -- Only demand fetches that miss in L2I are counted"}, #define PME_MONT_L2I_READS_MISS_PFTCH 452 { "L2I_READS_MISS_PFTCH", {0xa0878}, 0xfff0, 1, {0xffff0001}, "L2I Cacheable Reads -- Only prefetches that miss in L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_ALL_ALL 453 { "L2I_RECIRCULATES_ALL_ALL", {0xf087b}, 0xfff0, 1, {0xffff0001}, "L2I recirculates -- All fetches that reference L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_ALL_DMND 454 { "L2I_RECIRCULATES_ALL_DMND", {0xd087b}, 0xfff0, 1, {0xffff0001}, "L2I recirculates -- Only demand fetches that reference L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_ALL_PFTCH 455 { "L2I_RECIRCULATES_ALL_PFTCH", {0xe087b}, 0xfff0, 1, {0xffff0001}, "L2I recirculates -- Only prefetches that reference L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_HIT_ALL 456 { "L2I_RECIRCULATES_HIT_ALL", {0x7087b}, 0xfff0, 1, {0xffff0001}, "L2I recirculates -- All fetches that hit in L2I counted"}, #define PME_MONT_L2I_RECIRCULATES_HIT_DMND 457 { "L2I_RECIRCULATES_HIT_DMND", {0x5087b}, 0xfff0, 1, {0xffff0001}, "L2I recirculates -- Only demand fetches that hit in L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_HIT_PFTCH 458 { "L2I_RECIRCULATES_HIT_PFTCH", {0x6087b}, 0xfff0, 1, {0xffff0001}, "L2I recirculates -- Only prefetches that hit in L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_MISS_ALL 459 { "L2I_RECIRCULATES_MISS_ALL", {0xb087b}, 0xfff0, 1, {0xffff0001}, "L2I recirculates -- All fetches that miss in L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_MISS_DMND 460 { "L2I_RECIRCULATES_MISS_DMND", {0x9087b}, 0xfff0, 1, {0xffff0001}, "L2I recirculates -- Only demand fetches that miss in L2I are counted"}, #define PME_MONT_L2I_RECIRCULATES_MISS_PFTCH 461 { "L2I_RECIRCULATES_MISS_PFTCH", {0xa087b}, 
0xfff0, 1, {0xffff0001}, "L2I recirculates -- Only prefetches that miss in L2I are counted"}, #define PME_MONT_L2I_SNOOP_HITS 462 { "L2I_SNOOP_HITS", {0x107f}, 0xfff0, 1, {0xffff0000}, "L2I snoop hits"}, #define PME_MONT_L2I_SPEC_ABORTS 463 { "L2I_SPEC_ABORTS", {0x87e}, 0xfff0, 1, {0xffff0001}, "L2I speculative aborts"}, #define PME_MONT_L2I_UC_READS_ALL_ALL 464 { "L2I_UC_READS_ALL_ALL", {0xf0879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- All fetches that reference L2I are counted"}, #define PME_MONT_L2I_UC_READS_ALL_DMND 465 { "L2I_UC_READS_ALL_DMND", {0xd0879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- Only demand fetches that reference L2I are counted"}, #define PME_MONT_L2I_UC_READS_ALL_PFTCH 466 { "L2I_UC_READS_ALL_PFTCH", {0xe0879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- Only prefetches that reference L2I are counted"}, #define PME_MONT_L2I_UC_READS_HIT_ALL 467 { "L2I_UC_READS_HIT_ALL", {0x70879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- All fetches that hit in L2I counted"}, #define PME_MONT_L2I_UC_READS_HIT_DMND 468 { "L2I_UC_READS_HIT_DMND", {0x50879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- Only demand fetches that hit in L2I are counted"}, #define PME_MONT_L2I_UC_READS_HIT_PFTCH 469 { "L2I_UC_READS_HIT_PFTCH", {0x60879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- Only prefetches that hit in L2I are counted"}, #define PME_MONT_L2I_UC_READS_MISS_ALL 470 { "L2I_UC_READS_MISS_ALL", {0xb0879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- All fetches that miss in L2I are counted"}, #define PME_MONT_L2I_UC_READS_MISS_DMND 471 { "L2I_UC_READS_MISS_DMND", {0x90879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- Only demand fetches that miss in L2I are counted"}, #define PME_MONT_L2I_UC_READS_MISS_PFTCH 472 { "L2I_UC_READS_MISS_PFTCH", {0xa0879}, 0xfff0, 1, {0xffff0001}, "L2I Uncacheable reads -- Only prefetches that miss in L2I are counted"}, #define PME_MONT_L2I_VICTIMIZATION 473 { 
"L2I_VICTIMIZATION", {0x87a}, 0xfff0, 1, {0xffff0001}, "L2I victimizations"}, #define PME_MONT_L3_INSERTS 474 { "L3_INSERTS", {0x8da}, 0xfff0, 1, {0xffff0017}, "L3 Cache Lines inserts"}, #define PME_MONT_L3_LINES_REPLACED 475 { "L3_LINES_REPLACED", {0x8df}, 0xfff0, 1, {0xffff0010}, "L3 Cache Lines Replaced"}, #define PME_MONT_L3_MISSES 476 { "L3_MISSES", {0x8dc}, 0xfff0, 1, {0xffff0007}, "L3 Misses"}, #define PME_MONT_L3_READS_ALL_ALL 477 { "L3_READS_ALL_ALL", {0xf08dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Read References"}, #define PME_MONT_L3_READS_ALL_HIT 478 { "L3_READS_ALL_HIT", {0xd08dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Read Hits"}, #define PME_MONT_L3_READS_ALL_MISS 479 { "L3_READS_ALL_MISS", {0xe08dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Read Misses"}, #define PME_MONT_L3_READS_DATA_READ_ALL 480 { "L3_READS_DATA_READ_ALL", {0xb08dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Load References (excludes reads for ownership used to satisfy stores)"}, #define PME_MONT_L3_READS_DATA_READ_HIT 481 { "L3_READS_DATA_READ_HIT", {0x908dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Load Hits (excludes reads for ownership used to satisfy stores)"}, #define PME_MONT_L3_READS_DATA_READ_MISS 482 { "L3_READS_DATA_READ_MISS", {0xa08dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Load Misses (excludes reads for ownership used to satisfy stores)"}, #define PME_MONT_L3_READS_DINST_FETCH_ALL 483 { "L3_READS_DINST_FETCH_ALL", {0x308dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Demand Instruction References"}, #define PME_MONT_L3_READS_DINST_FETCH_HIT 484 { "L3_READS_DINST_FETCH_HIT", {0x108dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Demand Instruction Fetch Hits"}, #define PME_MONT_L3_READS_DINST_FETCH_MISS 485 { "L3_READS_DINST_FETCH_MISS", {0x208dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Demand Instruction Fetch Misses"}, #define PME_MONT_L3_READS_INST_FETCH_ALL 486 { "L3_READS_INST_FETCH_ALL", {0x708dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Instruction 
Fetch and Prefetch References"}, #define PME_MONT_L3_READS_INST_FETCH_HIT 487 { "L3_READS_INST_FETCH_HIT", {0x508dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Instruction Fetch and Prefetch Hits"}, #define PME_MONT_L3_READS_INST_FETCH_MISS 488 { "L3_READS_INST_FETCH_MISS", {0x608dd}, 0xfff0, 1, {0xffff0017}, "L3 Reads -- L3 Instruction Fetch and Prefetch Misses"}, #define PME_MONT_L3_REFERENCES 489 { "L3_REFERENCES", {0x8db}, 0xfff0, 1, {0xffff0007}, "L3 References"}, #define PME_MONT_L3_WRITES_ALL_ALL 490 { "L3_WRITES_ALL_ALL", {0xf08de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L3 Write References"}, #define PME_MONT_L3_WRITES_ALL_HIT 491 { "L3_WRITES_ALL_HIT", {0xd08de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L3 Write Hits"}, #define PME_MONT_L3_WRITES_ALL_MISS 492 { "L3_WRITES_ALL_MISS", {0xe08de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L3 Write Misses"}, #define PME_MONT_L3_WRITES_DATA_WRITE_ALL 493 { "L3_WRITES_DATA_WRITE_ALL", {0x708de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L3 Store References (excludes L2 write backs, includes L3 read for ownership requests that satisfy stores)"}, #define PME_MONT_L3_WRITES_DATA_WRITE_HIT 494 { "L3_WRITES_DATA_WRITE_HIT", {0x508de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L3 Store Hits (excludes L2 write backs, includes L3 read for ownership requests that satisfy stores)"}, #define PME_MONT_L3_WRITES_DATA_WRITE_MISS 495 { "L3_WRITES_DATA_WRITE_MISS", {0x608de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L3 Store Misses (excludes L2 write backs, includes L3 read for ownership requests that satisfy stores)"}, #define PME_MONT_L3_WRITES_L2_WB_ALL 496 { "L3_WRITES_L2_WB_ALL", {0xb08de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L2 Write Back References"}, #define PME_MONT_L3_WRITES_L2_WB_HIT 497 { "L3_WRITES_L2_WB_HIT", {0x908de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L2 Write Back Hits"}, #define PME_MONT_L3_WRITES_L2_WB_MISS 498 { "L3_WRITES_L2_WB_MISS", {0xa08de}, 0xfff0, 1, {0xffff0017}, "L3 Writes -- L2 Write Back Misses"}, 
#define PME_MONT_LOADS_RETIRED 499 { "LOADS_RETIRED", {0xcd}, 0xfff0, 4, {0x5310007}, "Retired Loads"}, #define PME_MONT_LOADS_RETIRED_INTG 500 { "LOADS_RETIRED_INTG", {0xd8}, 0xfff0, 2, {0x5610007}, "Integer loads retired"}, #define PME_MONT_MEM_READ_CURRENT_ANY 501 { "MEM_READ_CURRENT_ANY", {0x31089}, 0xfff0, 1, {0xffff0000}, "Current Mem Read Transactions On Bus -- CPU or non-CPU (all transactions)."}, #define PME_MONT_MEM_READ_CURRENT_IO 502 { "MEM_READ_CURRENT_IO", {0x11089}, 0xfff0, 1, {0xffff0000}, "Current Mem Read Transactions On Bus -- non-CPU priority agents"}, #define PME_MONT_MISALIGNED_LOADS_RETIRED 503 { "MISALIGNED_LOADS_RETIRED", {0xce}, 0xfff0, 4, {0x5310007}, "Retired Misaligned Load Instructions"}, #define PME_MONT_MISALIGNED_STORES_RETIRED 504 { "MISALIGNED_STORES_RETIRED", {0xd2}, 0xfff0, 2, {0x5410007}, "Retired Misaligned Store Instructions"}, #define PME_MONT_NOPS_RETIRED 505 { "NOPS_RETIRED", {0x50}, 0xfff0, 6, {0xffff0003}, "Retired NOP Instructions"}, #define PME_MONT_PREDICATE_SQUASHED_RETIRED 506 { "PREDICATE_SQUASHED_RETIRED", {0x51}, 0xfff0, 6, {0xffff0003}, "Instructions Squashed Due to Predicate Off"}, #define PME_MONT_RSE_CURRENT_REGS_2_TO_0 507 { "RSE_CURRENT_REGS_2_TO_0", {0x2b}, 0xfff0, 7, {0xffff0000}, "Current RSE Registers (Bits 2:0)"}, #define PME_MONT_RSE_CURRENT_REGS_5_TO_3 508 { "RSE_CURRENT_REGS_5_TO_3", {0x2a}, 0xfff0, 7, {0xffff0000}, "Current RSE Registers (Bits 5:3)"}, #define PME_MONT_RSE_CURRENT_REGS_6 509 { "RSE_CURRENT_REGS_6", {0x26}, 0xfff0, 1, {0xffff0000}, "Current RSE Registers (Bit 6)"}, #define PME_MONT_RSE_DIRTY_REGS_2_TO_0 510 { "RSE_DIRTY_REGS_2_TO_0", {0x29}, 0xfff0, 7, {0xffff0000}, "Dirty RSE Registers (Bits 2:0)"}, #define PME_MONT_RSE_DIRTY_REGS_5_TO_3 511 { "RSE_DIRTY_REGS_5_TO_3", {0x28}, 0xfff0, 7, {0xffff0000}, "Dirty RSE Registers (Bits 5:3)"}, #define PME_MONT_RSE_DIRTY_REGS_6 512 { "RSE_DIRTY_REGS_6", {0x24}, 0xfff0, 1, {0xffff0000}, "Dirty RSE Registers (Bit 6)"}, #define 
PME_MONT_RSE_EVENT_RETIRED 513 { "RSE_EVENT_RETIRED", {0x32}, 0xfff0, 1, {0xffff0000}, "Retired RSE operations"}, #define PME_MONT_RSE_REFERENCES_RETIRED_ALL 514 { "RSE_REFERENCES_RETIRED_ALL", {0x30020}, 0xfff0, 2, {0xffff0007}, "RSE Accesses -- Both RSE loads and stores will be counted."}, #define PME_MONT_RSE_REFERENCES_RETIRED_LOAD 515 { "RSE_REFERENCES_RETIRED_LOAD", {0x10020}, 0xfff0, 2, {0xffff0007}, "RSE Accesses -- Only RSE loads will be counted."}, #define PME_MONT_RSE_REFERENCES_RETIRED_STORE 516 { "RSE_REFERENCES_RETIRED_STORE", {0x20020}, 0xfff0, 2, {0xffff0007}, "RSE Accesses -- Only RSE stores will be counted."}, #define PME_MONT_SERIALIZATION_EVENTS 517 { "SERIALIZATION_EVENTS", {0x53}, 0xfff0, 1, {0xffff0000}, "Number of srlz.i Instructions"}, #define PME_MONT_SI_CCQ_COLLISIONS_EITHER 518 { "SI_CCQ_COLLISIONS_EITHER", {0x10a8}, 0xfff0, 2, {0xffff0000}, "Clean Castout Queue Collisions -- transactions initiated by either cpu core"}, #define PME_MONT_SI_CCQ_COLLISIONS_SELF 519 { "SI_CCQ_COLLISIONS_SELF", {0x110a8}, 0xfff0, 2, {0xffff0000}, "Clean Castout Queue Collisions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_CCQ_INSERTS_EITHER 520 { "SI_CCQ_INSERTS_EITHER", {0x18a5}, 0xfff0, 2, {0xffff0000}, "Clean Castout Queue Insertions -- transactions initiated by either cpu core"}, #define PME_MONT_SI_CCQ_INSERTS_SELF 521 { "SI_CCQ_INSERTS_SELF", {0x118a5}, 0xfff0, 2, {0xffff0000}, "Clean Castout Queue Insertions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_CCQ_LIVE_REQ_HI_EITHER 522 { "SI_CCQ_LIVE_REQ_HI_EITHER", {0x10a7}, 0xfff0, 1, {0xffff0000}, "Clean Castout Queue Requests (upper bit) -- transactions initiated by either cpu core"}, #define PME_MONT_SI_CCQ_LIVE_REQ_HI_SELF 523 { "SI_CCQ_LIVE_REQ_HI_SELF", {0x110a7}, 0xfff0, 1, {0xffff0000}, "Clean Castout Queue Requests (upper bit) -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_CCQ_LIVE_REQ_LO_EITHER 524 { "SI_CCQ_LIVE_REQ_LO_EITHER", 
{0x10a6}, 0xfff0, 7, {0xffff0000}, "Clean Castout Queue Requests (lower three bits) -- transactions initiated by either cpu core"}, #define PME_MONT_SI_CCQ_LIVE_REQ_LO_SELF 525 { "SI_CCQ_LIVE_REQ_LO_SELF", {0x110a6}, 0xfff0, 7, {0xffff0000}, "Clean Castout Queue Requests (lower three bits) -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_CYCLES 526 { "SI_CYCLES", {0x108e}, 0xfff0, 1, {0xffff0000}, "SI Cycles"}, #define PME_MONT_SI_IOQ_COLLISIONS 527 { "SI_IOQ_COLLISIONS", {0x10aa}, 0xfff0, 2, {0xffff0000}, "In Order Queue Collisions"}, #define PME_MONT_SI_IOQ_LIVE_REQ_HI 528 { "SI_IOQ_LIVE_REQ_HI", {0x1098}, 0xfff0, 2, {0xffff0000}, "Inorder Bus Queue Requests (upper bit)"}, #define PME_MONT_SI_IOQ_LIVE_REQ_LO 529 { "SI_IOQ_LIVE_REQ_LO", {0x1097}, 0xfff0, 3, {0xffff0000}, "Inorder Bus Queue Requests (lower three bits)"}, #define PME_MONT_SI_RQ_INSERTS_EITHER 530 { "SI_RQ_INSERTS_EITHER", {0x189e}, 0xfff0, 2, {0xffff0000}, "Request Queue Insertions -- transactions initiated by either cpu core"}, #define PME_MONT_SI_RQ_INSERTS_SELF 531 { "SI_RQ_INSERTS_SELF", {0x1189e}, 0xfff0, 2, {0xffff0000}, "Request Queue Insertions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_RQ_LIVE_REQ_HI_EITHER 532 { "SI_RQ_LIVE_REQ_HI_EITHER", {0x10a0}, 0xfff0, 1, {0xffff0000}, "Request Queue Requests (upper bit) -- transactions initiated by either cpu core"}, #define PME_MONT_SI_RQ_LIVE_REQ_HI_SELF 533 { "SI_RQ_LIVE_REQ_HI_SELF", {0x110a0}, 0xfff0, 1, {0xffff0000}, "Request Queue Requests (upper bit) -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_RQ_LIVE_REQ_LO_EITHER 534 { "SI_RQ_LIVE_REQ_LO_EITHER", {0x109f}, 0xfff0, 7, {0xffff0000}, "Request Queue Requests (lower three bits) -- transactions initiated by either cpu core"}, #define PME_MONT_SI_RQ_LIVE_REQ_LO_SELF 535 { "SI_RQ_LIVE_REQ_LO_SELF", {0x1109f}, 0xfff0, 7, {0xffff0000}, "Request Queue Requests (lower three bits) -- transactions initiated by 'this' cpu core"}, #define 
PME_MONT_SI_SCB_INSERTS_ALL_EITHER 536 { "SI_SCB_INSERTS_ALL_EITHER", {0xc10ab}, 0xfff0, 4, {0xffff0000}, "Snoop Coalescing Buffer Insertions -- count all snoop signoffs (plus backsnoop inserts) from either cpu core"}, #define PME_MONT_SI_SCB_INSERTS_ALL_SELF 537 { "SI_SCB_INSERTS_ALL_SELF", {0xd10ab}, 0xfff0, 4, {0xffff0000}, "Snoop Coalescing Buffer Insertions -- count all snoop signoffs (plus backsnoop inserts) from 'this' cpu core"}, #define PME_MONT_SI_SCB_INSERTS_HIT_EITHER 538 { "SI_SCB_INSERTS_HIT_EITHER", {0x410ab}, 0xfff0, 4, {0xffff0000}, "Snoop Coalescing Buffer Insertions -- count HIT snoop signoffs from either cpu core"}, #define PME_MONT_SI_SCB_INSERTS_HIT_SELF 539 { "SI_SCB_INSERTS_HIT_SELF", {0x510ab}, 0xfff0, 4, {0xffff0000}, "Snoop Coalescing Buffer Insertions -- count HIT snoop signoffs from 'this' cpu core"}, #define PME_MONT_SI_SCB_INSERTS_HITM_EITHER 540 { "SI_SCB_INSERTS_HITM_EITHER", {0x810ab}, 0xfff0, 4, {0xffff0000}, "Snoop Coalescing Buffer Insertions -- count HITM snoop signoffs from either cpu core"}, #define PME_MONT_SI_SCB_INSERTS_HITM_SELF 541 { "SI_SCB_INSERTS_HITM_SELF", {0x910ab}, 0xfff0, 4, {0xffff0000}, "Snoop Coalescing Buffer Insertions -- count HITM snoop signoffs from 'this' cpu core"}, #define PME_MONT_SI_SCB_INSERTS_MISS_EITHER 542 { "SI_SCB_INSERTS_MISS_EITHER", {0x10ab}, 0xfff0, 4, {0xffff0000}, "Snoop Coalescing Buffer Insertions -- count MISS snoop signoffs (plus backsnoop inserts) from either cpu core"}, #define PME_MONT_SI_SCB_INSERTS_MISS_SELF 543 { "SI_SCB_INSERTS_MISS_SELF", {0x110ab}, 0xfff0, 4, {0xffff0000}, "Snoop Coalescing Buffer Insertions -- count MISS snoop signoffs (plus backsnoop inserts) from 'this' cpu core"}, #define PME_MONT_SI_SCB_LIVE_REQ_HI_EITHER 544 { "SI_SCB_LIVE_REQ_HI_EITHER", {0x10ad}, 0xfff0, 1, {0xffff0000}, "Snoop Coalescing Buffer Requests (upper bit) -- transactions initiated by either cpu core"}, #define PME_MONT_SI_SCB_LIVE_REQ_HI_SELF 545 { "SI_SCB_LIVE_REQ_HI_SELF", {0x110ad}, 
0xfff0, 1, {0xffff0000}, "Snoop Coalescing Buffer Requests (upper bit) -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_SCB_LIVE_REQ_LO_EITHER 546 { "SI_SCB_LIVE_REQ_LO_EITHER", {0x10ac}, 0xfff0, 7, {0xffff0000}, "Snoop Coalescing Buffer Requests (lower three bits) -- transactions initiated by either cpu core"}, #define PME_MONT_SI_SCB_LIVE_REQ_LO_SELF 547 { "SI_SCB_LIVE_REQ_LO_SELF", {0x110ac}, 0xfff0, 7, {0xffff0000}, "Snoop Coalescing Buffer Requests (lower three bits) -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_SCB_SIGNOFFS_ALL 548 { "SI_SCB_SIGNOFFS_ALL", {0xc10ae}, 0xfff0, 1, {0xffff0000}, "Snoop Coalescing Buffer Coherency Signoffs -- count all snoop signoffs"}, #define PME_MONT_SI_SCB_SIGNOFFS_HIT 549 { "SI_SCB_SIGNOFFS_HIT", {0x410ae}, 0xfff0, 1, {0xffff0000}, "Snoop Coalescing Buffer Coherency Signoffs -- count HIT snoop signoffs"}, #define PME_MONT_SI_SCB_SIGNOFFS_HITM 550 { "SI_SCB_SIGNOFFS_HITM", {0x810ae}, 0xfff0, 1, {0xffff0000}, "Snoop Coalescing Buffer Coherency Signoffs -- count HITM snoop signoffs"}, #define PME_MONT_SI_SCB_SIGNOFFS_MISS 551 { "SI_SCB_SIGNOFFS_MISS", {0x10ae}, 0xfff0, 1, {0xffff0000}, "Snoop Coalescing Buffer Coherency Signoffs -- count MISS snoop signoffs"}, #define PME_MONT_SI_WAQ_COLLISIONS_EITHER 552 { "SI_WAQ_COLLISIONS_EITHER", {0x10a4}, 0xfff0, 1, {0xffff0000}, "Write Address Queue Collisions -- transactions initiated by either cpu core"}, #define PME_MONT_SI_WAQ_COLLISIONS_SELF 553 { "SI_WAQ_COLLISIONS_SELF", {0x110a4}, 0xfff0, 1, {0xffff0000}, "Write Address Queue Collisions -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_WDQ_ECC_ERRORS_ALL_EITHER 554 { "SI_WDQ_ECC_ERRORS_ALL_EITHER", {0x810af}, 0xfff0, 2, {0xffff0000}, "Write Data Queue ECC Errors -- count all ECC errors from either cpu core"}, #define PME_MONT_SI_WDQ_ECC_ERRORS_ALL_SELF 555 { "SI_WDQ_ECC_ERRORS_ALL_SELF", {0x910af}, 0xfff0, 2, {0xffff0000}, "Write Data Queue ECC Errors -- count all ECC 
errors from 'this' cpu core"}, #define PME_MONT_SI_WDQ_ECC_ERRORS_DBL_EITHER 556 { "SI_WDQ_ECC_ERRORS_DBL_EITHER", {0x410af}, 0xfff0, 2, {0xffff0000}, "Write Data Queue ECC Errors -- count double-bit ECC errors from either cpu core"}, #define PME_MONT_SI_WDQ_ECC_ERRORS_DBL_SELF 557 { "SI_WDQ_ECC_ERRORS_DBL_SELF", {0x510af}, 0xfff0, 2, {0xffff0000}, "Write Data Queue ECC Errors -- count double-bit ECC errors from 'this' cpu core"}, #define PME_MONT_SI_WDQ_ECC_ERRORS_SGL_EITHER 558 { "SI_WDQ_ECC_ERRORS_SGL_EITHER", {0x10af}, 0xfff0, 2, {0xffff0000}, "Write Data Queue ECC Errors -- count single-bit ECC errors from either cpu core"}, #define PME_MONT_SI_WDQ_ECC_ERRORS_SGL_SELF 559 { "SI_WDQ_ECC_ERRORS_SGL_SELF", {0x110af}, 0xfff0, 2, {0xffff0000}, "Write Data Queue ECC Errors -- count single-bit ECC errors from 'this' cpu core"}, #define PME_MONT_SI_WRITEQ_INSERTS_ALL_EITHER 560 { "SI_WRITEQ_INSERTS_ALL_EITHER", {0x18a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_ALL_SELF 561 { "SI_WRITEQ_INSERTS_ALL_SELF", {0x118a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_EWB_EITHER 562 { "SI_WRITEQ_INSERTS_EWB_EITHER", {0x418a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_EWB_SELF 563 { "SI_WRITEQ_INSERTS_EWB_SELF", {0x518a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_IWB_EITHER 564 { "SI_WRITEQ_INSERTS_IWB_EITHER", {0x218a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_IWB_SELF 565 { "SI_WRITEQ_INSERTS_IWB_SELF", {0x318a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_NEWB_EITHER 566 { "SI_WRITEQ_INSERTS_NEWB_EITHER", {0xc18a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_NEWB_SELF 567 { "SI_WRITEQ_INSERTS_NEWB_SELF", {0xd18a1}, 0xfff0, 2, 
{0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_WC16_EITHER 568 { "SI_WRITEQ_INSERTS_WC16_EITHER", {0x818a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_WC16_SELF 569 { "SI_WRITEQ_INSERTS_WC16_SELF", {0x918a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_WC1_8A_EITHER 570 { "SI_WRITEQ_INSERTS_WC1_8A_EITHER", {0x618a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_WC1_8A_SELF 571 { "SI_WRITEQ_INSERTS_WC1_8A_SELF", {0x718a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_WC1_8B_EITHER 572 { "SI_WRITEQ_INSERTS_WC1_8B_EITHER", {0xe18a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_WC1_8B_SELF 573 { "SI_WRITEQ_INSERTS_WC1_8B_SELF", {0xf18a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_WC32_EITHER 574 { "SI_WRITEQ_INSERTS_WC32_EITHER", {0xa18a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_INSERTS_WC32_SELF 575 { "SI_WRITEQ_INSERTS_WC32_SELF", {0xb18a1}, 0xfff0, 2, {0xffff0000}, "Write Queue Insertions -- "}, #define PME_MONT_SI_WRITEQ_LIVE_REQ_HI_EITHER 576 { "SI_WRITEQ_LIVE_REQ_HI_EITHER", {0x10a3}, 0xfff0, 1, {0xffff0000}, "Write Queue Requests (upper bit) -- transactions initiated by either cpu core"}, #define PME_MONT_SI_WRITEQ_LIVE_REQ_HI_SELF 577 { "SI_WRITEQ_LIVE_REQ_HI_SELF", {0x110a3}, 0xfff0, 1, {0xffff0000}, "Write Queue Requests (upper bit) -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SI_WRITEQ_LIVE_REQ_LO_EITHER 578 { "SI_WRITEQ_LIVE_REQ_LO_EITHER", {0x10a2}, 0xfff0, 7, {0xffff0000}, "Write Queue Requests (lower three bits) -- transactions initiated by either cpu core"}, #define PME_MONT_SI_WRITEQ_LIVE_REQ_LO_SELF 579 { "SI_WRITEQ_LIVE_REQ_LO_SELF", {0x110a2}, 0xfff0, 7, {0xffff0000}, "Write 
Queue Requests (lower three bits) -- transactions initiated by 'this' cpu core"}, #define PME_MONT_SPEC_LOADS_NATTED_ALL 580 { "SPEC_LOADS_NATTED_ALL", {0xd9}, 0xfff0, 2, {0xffff0005}, "Number of speculative integer loads that are NaTd -- Count all NaT'd loads"}, #define PME_MONT_SPEC_LOADS_NATTED_DEF_PSR_ED 581 { "SPEC_LOADS_NATTED_DEF_PSR_ED", {0x500d9}, 0xfff0, 2, {0xffff0005}, "Number of speculative integer loads that are NaTd -- Only loads NaT'd due to effect of PSR.ed"}, #define PME_MONT_SPEC_LOADS_NATTED_DEF_TLB_FAULT 582 { "SPEC_LOADS_NATTED_DEF_TLB_FAULT", {0x300d9}, 0xfff0, 2, {0xffff0005}, "Number of speculative integer loads that are NaTd -- Only loads NaT'd due to deferred TLB faults"}, #define PME_MONT_SPEC_LOADS_NATTED_DEF_TLB_MISS 583 { "SPEC_LOADS_NATTED_DEF_TLB_MISS", {0x200d9}, 0xfff0, 2, {0xffff0005}, "Number of speculative integer loads that are NaTd -- Only loads NaT'd due to deferred TLB misses"}, #define PME_MONT_SPEC_LOADS_NATTED_NAT_CNSM 584 { "SPEC_LOADS_NATTED_NAT_CNSM", {0x400d9}, 0xfff0, 2, {0xffff0005}, "Number of speculative integer loads that are NaTd -- Only loads NaT'd due to NaT consumption"}, #define PME_MONT_SPEC_LOADS_NATTED_VHPT_MISS 585 { "SPEC_LOADS_NATTED_VHPT_MISS", {0x100d9}, 0xfff0, 2, {0xffff0005}, "Number of speculative integer loads that are NaTd -- Only loads NaT'd due to VHPT miss"}, #define PME_MONT_STORES_RETIRED 586 { "STORES_RETIRED", {0xd1}, 0xfff0, 2, {0x5410007}, "Retired Stores"}, #define PME_MONT_SYLL_NOT_DISPERSED_ALL 587 { "SYLL_NOT_DISPERSED_ALL", {0xf004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Counts all syllables not dispersed. NOTE: Any combination of b0000-b1111 is valid."}, #define PME_MONT_SYLL_NOT_DISPERSED_EXPL 588 { "SYLL_NOT_DISPERSED_EXPL", {0x1004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit stop bits. These consist of programmer specified architected S-bit and templates 1 and 5. 
Dispersal takes a 6-syllable (3-syllable) hit for every template 1/5 in bundle 0(1). Dispersal takes a 3-syllable (0 syllable) hit for every S-bit in bundle 0(1)"}, #define PME_MONT_SYLL_NOT_DISPERSED_EXPL_OR_FE 589 { "SYLL_NOT_DISPERSED_EXPL_OR_FE", {0x5004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit stop bits or front-end not providing valid bundles or providing valid illegal templates."}, #define PME_MONT_SYLL_NOT_DISPERSED_EXPL_OR_FE_OR_MLX 590 { "SYLL_NOT_DISPERSED_EXPL_OR_FE_OR_MLX", {0xd004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit stop bits or due to front-end not providing valid bundles or providing valid illegal templates or due to MLX bundle and resteers to non-0 syllable."}, #define PME_MONT_SYLL_NOT_DISPERSED_EXPL_OR_IMPL 591 { "SYLL_NOT_DISPERSED_EXPL_OR_IMPL", {0x3004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit/implicit stop bits."}, #define PME_MONT_SYLL_NOT_DISPERSED_EXPL_OR_IMPL_OR_FE 592 { "SYLL_NOT_DISPERSED_EXPL_OR_IMPL_OR_FE", {0x7004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit or implicit stop bits or due to front-end not providing valid bundles or providing valid illegal template."}, #define PME_MONT_SYLL_NOT_DISPERSED_EXPL_OR_IMPL_OR_MLX 593 { "SYLL_NOT_DISPERSED_EXPL_OR_IMPL_OR_MLX", {0xb004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit or implicit stop bits or due to MLX bundle and resteers to non-0 syllable."}, #define PME_MONT_SYLL_NOT_DISPERSED_EXPL_OR_MLX 594 { "SYLL_NOT_DISPERSED_EXPL_OR_MLX", {0x9004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to explicit stop bits or to MLX bundle and resteers to non-0 syllable."}, #define PME_MONT_SYLL_NOT_DISPERSED_FE 595 { "SYLL_NOT_DISPERSED_FE", {0x4004e}, 0xfff0, 
5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to front-end not providing valid bundles or providing valid illegal templates. Dispersal takes a 3-syllable hit for every invalid bundle or valid illegal template from front-end. Bundle 1 with front-end fault is counted here (3-syllable hit)."}, #define PME_MONT_SYLL_NOT_DISPERSED_FE_OR_MLX 596 { "SYLL_NOT_DISPERSED_FE_OR_MLX", {0xc004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to MLX bundle and resteers to non-0 syllable or due to front-end not providing valid bundles or providing valid illegal templates."}, #define PME_MONT_SYLL_NOT_DISPERSED_IMPL 597 { "SYLL_NOT_DISPERSED_IMPL", {0x2004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to implicit stop bits. These consist of all of the non-architected stop bits (asymmetry, oversubscription, implicit). Dispersal takes a 6-syllable (3-syllable) hit for every implicit stop bit in bundle 0(1)."}, #define PME_MONT_SYLL_NOT_DISPERSED_IMPL_OR_FE 598 { "SYLL_NOT_DISPERSED_IMPL_OR_FE", {0x6004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to implicit stop bits or to front-end not providing valid bundles or providing valid illegal templates."}, #define PME_MONT_SYLL_NOT_DISPERSED_IMPL_OR_FE_OR_MLX 599 { "SYLL_NOT_DISPERSED_IMPL_OR_FE_OR_MLX", {0xe004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to implicit stop bits or due to front-end not providing valid bundles or providing valid illegal templates or due to MLX bundle and resteers to non-0 syllable."}, #define PME_MONT_SYLL_NOT_DISPERSED_IMPL_OR_MLX 600 { "SYLL_NOT_DISPERSED_IMPL_OR_MLX", {0xa004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to implicit stop bits or to MLX bundle and resteers to non-0 syllable."}, #define PME_MONT_SYLL_NOT_DISPERSED_MLX 601 { 
"SYLL_NOT_DISPERSED_MLX", {0x8004e}, 0xfff0, 5, {0xffff0001}, "Syllables Not Dispersed -- Count syllables not dispersed due to MLX bundle and resteers to non-0 syllable. Dispersal takes a 1 syllable hit for each MLX bundle. Dispersal could take 0-2 syllable hit depending on which syllable we resteer to. Bundle 1 with front-end fault which is split, is counted here (0-2 syllable hit)."}, #define PME_MONT_SYLL_OVERCOUNT_ALL 602 { "SYLL_OVERCOUNT_ALL", {0x3004f}, 0xfff0, 2, {0xffff0001}, "Syllables Overcounted -- syllables overcounted in implicit & explicit bucket"}, #define PME_MONT_SYLL_OVERCOUNT_EXPL 603 { "SYLL_OVERCOUNT_EXPL", {0x1004f}, 0xfff0, 2, {0xffff0001}, "Syllables Overcounted -- Only syllables overcounted in the explicit bucket"}, #define PME_MONT_SYLL_OVERCOUNT_IMPL 604 { "SYLL_OVERCOUNT_IMPL", {0x2004f}, 0xfff0, 2, {0xffff0001}, "Syllables Overcounted -- Only syllables overcounted in the implicit bucket"}, #define PME_MONT_THREAD_SWITCH_CYCLE_ALL_GATED 605 { "THREAD_SWITCH_CYCLE_ALL_GATED", {0x6000e}, 0xfff0, 1, {0xffff0000}, "Thread switch overhead cycles. -- Cycles TSs are gated due to any reason"}, #define PME_MONT_THREAD_SWITCH_CYCLE_ANYSTALL 606 { "THREAD_SWITCH_CYCLE_ANYSTALL", {0x3000e}, 0xfff0, 1, {0xffff0000}, "Thread switch overhead cycles. -- Cycles TSs are stalled due to any reason"}, #define PME_MONT_THREAD_SWITCH_CYCLE_CRAB 607 { "THREAD_SWITCH_CYCLE_CRAB", {0x1000e}, 0xfff0, 1, {0xffff0000}, "Thread switch overhead cycles. -- Cycles TSs are stalled due to CRAB operation"}, #define PME_MONT_THREAD_SWITCH_CYCLE_L2D 608 { "THREAD_SWITCH_CYCLE_L2D", {0x2000e}, 0xfff0, 1, {0xffff0000}, "Thread switch overhead cycles. -- Cycles TSs are stalled due to L2D return operation"}, #define PME_MONT_THREAD_SWITCH_CYCLE_PCR 609 { "THREAD_SWITCH_CYCLE_PCR", {0x4000e}, 0xfff0, 1, {0xffff0000}, "Thread switch overhead cycles. 
-- Cycles we run with PCR.sd set"}, #define PME_MONT_THREAD_SWITCH_CYCLE_TOTAL 610 { "THREAD_SWITCH_CYCLE_TOTAL", {0x7000e}, 0xfff0, 1, {0xffff0000}, "Thread switch overhead cycles. -- Total time from TS opportunity is seized to TS happens."}, #define PME_MONT_THREAD_SWITCH_EVENTS_ALL 611 { "THREAD_SWITCH_EVENTS_ALL", {0x7000c}, 0xfff0, 1, {0xffff0000}, "Thread switch events. -- All taken TSs"}, #define PME_MONT_THREAD_SWITCH_EVENTS_DBG 612 { "THREAD_SWITCH_EVENTS_DBG", {0x5000c}, 0xfff0, 1, {0xffff0000}, "Thread switch events. -- TSs due to debug operations"}, #define PME_MONT_THREAD_SWITCH_EVENTS_HINT 613 { "THREAD_SWITCH_EVENTS_HINT", {0x3000c}, 0xfff0, 1, {0xffff0000}, "Thread switch events. -- TSs due to hint instruction"}, #define PME_MONT_THREAD_SWITCH_EVENTS_L3MISS 614 { "THREAD_SWITCH_EVENTS_L3MISS", {0x1000c}, 0xfff0, 1, {0xffff0000}, "Thread switch events. -- TSs due to L3 miss"}, #define PME_MONT_THREAD_SWITCH_EVENTS_LP 615 { "THREAD_SWITCH_EVENTS_LP", {0x4000c}, 0xfff0, 1, {0xffff0000}, "Thread switch events. -- TSs due to low power operation"}, #define PME_MONT_THREAD_SWITCH_EVENTS_MISSED 616 { "THREAD_SWITCH_EVENTS_MISSED", {0xc}, 0xfff0, 1, {0xffff0000}, "Thread switch events. -- TS opportunities missed"}, #define PME_MONT_THREAD_SWITCH_EVENTS_TIMER 617 { "THREAD_SWITCH_EVENTS_TIMER", {0x2000c}, 0xfff0, 1, {0xffff0000}, "Thread switch events. 
-- TSs due to time out"}, #define PME_MONT_THREAD_SWITCH_GATED_ALL 618 { "THREAD_SWITCH_GATED_ALL", {0x7000d}, 0xfff0, 1, {0xffff0000}, "Thread switches gated -- TSs gated for any reason"}, #define PME_MONT_THREAD_SWITCH_GATED_FWDPRO 619 { "THREAD_SWITCH_GATED_FWDPRO", {0x5000d}, 0xfff0, 1, {0xffff0000}, "Thread switches gated -- Gated due to forward progress reasons"}, #define PME_MONT_THREAD_SWITCH_GATED_LP 620 { "THREAD_SWITCH_GATED_LP", {0x1000d}, 0xfff0, 1, {0xffff0000}, "Thread switches gated -- TSs gated due to LP"}, #define PME_MONT_THREAD_SWITCH_GATED_PIPE 621 { "THREAD_SWITCH_GATED_PIPE", {0x4000d}, 0xfff0, 1, {0xffff0000}, "Thread switches gated -- Gated due to pipeline operations"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_1024 622 { "THREAD_SWITCH_STALL_GTE_1024", {0x8000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 1024 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_128 623 { "THREAD_SWITCH_STALL_GTE_128", {0x5000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 128 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_16 624 { "THREAD_SWITCH_STALL_GTE_16", {0x2000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 16 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_2048 625 { "THREAD_SWITCH_STALL_GTE_2048", {0x9000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 2048 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_256 626 { "THREAD_SWITCH_STALL_GTE_256", {0x6000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 256 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_32 627 { "THREAD_SWITCH_STALL_GTE_32", {0x3000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 32 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_4 628 { "THREAD_SWITCH_STALL_GTE_4", {0xf}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 4 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_4096 629 { 
"THREAD_SWITCH_STALL_GTE_4096", {0xa000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 4096 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_512 630 { "THREAD_SWITCH_STALL_GTE_512", {0x7000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 512 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_64 631 { "THREAD_SWITCH_STALL_GTE_64", {0x4000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 64 cycles"}, #define PME_MONT_THREAD_SWITCH_STALL_GTE_8 632 { "THREAD_SWITCH_STALL_GTE_8", {0x1000f}, 0xfff0, 1, {0xffff0000}, "Thread switch stall -- Thread switch stall >= 8 cycles"}, #define PME_MONT_UC_LOADS_RETIRED 633 { "UC_LOADS_RETIRED", {0xcf}, 0xfff0, 4, {0x5310007}, "Retired Uncacheable Loads"}, #define PME_MONT_UC_STORES_RETIRED 634 { "UC_STORES_RETIRED", {0xd0}, 0xfff0, 2, {0x5410007}, "Retired Uncacheable Stores"}, #define PME_MONT_IA64_INST_RETIRED 635 { "IA64_INST_RETIRED", {0x8}, 0xfff0, 6, {0xffff0003}, "Retired IA-64 Instructions -- Retired IA-64 Instructions -- Alias to IA64_INST_RETIRED_THIS"}, #define PME_MONT_BRANCH_EVENT 636 { "BRANCH_EVENT", {0x111}, 0xfff0, 1, {0xffff0003}, "Execution Trace Buffer Event Captured. 
Alias to ETB_EVENT"}, }; #define PME_MONT_EVENT_COUNT (sizeof(montecito_pe)/sizeof(pme_mont_entry_t)) papi-papi-7-2-0-t/src/libpfm4/lib/events/perf_events.h000066400000000000000000000276731502707512200225460ustar00rootroot00000000000000/* * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
*/ #define CACHE_ST_ACCESS(n, d, e) \ {\ .name = #n"-STORES",\ .desc = d" store accesses",\ .id = PERF_COUNT_HW_CACHE_##e,\ .type = PERF_TYPE_HW_CACHE,\ .modmsk = PERF_ATTR_HW,\ .umask_ovfl_idx = ~0UL,\ .equiv = "PERF_COUNT_HW_CACHE_"#e":WRITE:ACCESS"\ },\ {\ .name = #n"-STORE-MISSES",\ .desc = d" store misses",\ .id = PERF_COUNT_HW_CACHE_##e,\ .type = PERF_TYPE_HW_CACHE,\ .modmsk = PERF_ATTR_HW,\ .umask_ovfl_idx = ~0UL,\ .equiv = "PERF_COUNT_HW_CACHE_"#e":WRITE:MISS"\ } #define CACHE_PF_ACCESS(n, d, e) \ {\ .name = #n"-PREFETCHES",\ .desc = d" prefetch accesses",\ .id = PERF_COUNT_HW_CACHE_##e,\ .type = PERF_TYPE_HW_CACHE,\ .modmsk = PERF_ATTR_HW,\ .umask_ovfl_idx = ~0UL,\ .equiv = "PERF_COUNT_HW_CACHE_"#e":PREFETCH:ACCESS"\ },\ {\ .name = #n"-PREFETCH-MISSES",\ .desc = d" prefetch misses",\ .id = PERF_COUNT_HW_CACHE_##e,\ .type = PERF_TYPE_HW_CACHE,\ .modmsk = PERF_ATTR_HW,\ .umask_ovfl_idx = ~0UL,\ .equiv = "PERF_COUNT_HW_CACHE_"#e":PREFETCH:MISS"\ } #define CACHE_LD_ACCESS(n, d, e) \ {\ .name = #n"-LOADS",\ .desc = d" load accesses",\ .id = PERF_COUNT_HW_CACHE_##e,\ .type = PERF_TYPE_HW_CACHE,\ .modmsk = PERF_ATTR_HW,\ .umask_ovfl_idx = ~0UL,\ .equiv = "PERF_COUNT_HW_CACHE_"#e":READ:ACCESS"\ },\ {\ .name = #n"-LOAD-MISSES",\ .desc = d" load misses",\ .id = PERF_COUNT_HW_CACHE_##e,\ .type = PERF_TYPE_HW_CACHE,\ .modmsk = PERF_ATTR_HW,\ .umask_ovfl_idx = ~0UL,\ .equiv = "PERF_COUNT_HW_CACHE_"#e":READ:MISS"\ } #define CACHE_ACCESS(n, d, e) \ CACHE_LD_ACCESS(n, d, e), \ CACHE_ST_ACCESS(n, d, e), \ CACHE_PF_ACCESS(n, d, e) #define ICACHE_ACCESS(n, d, e) \ CACHE_LD_ACCESS(n, d, e), \ CACHE_PF_ACCESS(n, d, e) static perf_event_t perf_static_events[]={ PCL_EVT_HW_FL(CPU_CYCLES, PERF_FL_PRECISE), PCL_EVT_AHW(CYCLES, CPU_CYCLES), PCL_EVT_AHW(CPU-CYCLES, CPU_CYCLES), PCL_EVT_HW(INSTRUCTIONS), PCL_EVT_AHW(INSTRUCTIONS, INSTRUCTIONS), PCL_EVT_HW(CACHE_REFERENCES), PCL_EVT_AHW(CACHE-REFERENCES, CACHE_REFERENCES), PCL_EVT_HW(CACHE_MISSES), 
PCL_EVT_AHW(CACHE-MISSES,CACHE_MISSES), PCL_EVT_HW(BRANCH_INSTRUCTIONS), PCL_EVT_AHW(BRANCH-INSTRUCTIONS, BRANCH_INSTRUCTIONS), PCL_EVT_AHW(BRANCHES, BRANCH_INSTRUCTIONS), PCL_EVT_HW(BRANCH_MISSES), PCL_EVT_AHW(BRANCH-MISSES, BRANCH_MISSES), PCL_EVT_HW(BUS_CYCLES), PCL_EVT_AHW(BUS-CYCLES, BUS_CYCLES), PCL_EVT_HW(STALLED_CYCLES_FRONTEND), PCL_EVT_AHW(STALLED-CYCLES-FRONTEND, STALLED_CYCLES_FRONTEND), PCL_EVT_AHW(IDLE-CYCLES-FRONTEND, STALLED_CYCLES_FRONTEND), PCL_EVT_HW(STALLED_CYCLES_BACKEND), PCL_EVT_AHW(STALLED-CYCLES-BACKEND, STALLED_CYCLES_BACKEND), PCL_EVT_AHW(IDLE-CYCLES-BACKEND, STALLED_CYCLES_BACKEND), PCL_EVT_HW(REF_CPU_CYCLES), PCL_EVT_AHW(REF-CYCLES,REF_CPU_CYCLES), PCL_EVT_SW(CPU_CLOCK), PCL_EVT_ASW(CPU-CLOCK, CPU_CLOCK), PCL_EVT_SW(TASK_CLOCK), PCL_EVT_ASW(TASK-CLOCK, TASK_CLOCK), PCL_EVT_SW(PAGE_FAULTS), PCL_EVT_ASW(PAGE-FAULTS, PAGE_FAULTS), PCL_EVT_ASW(FAULTS, PAGE_FAULTS), PCL_EVT_SW(CONTEXT_SWITCHES), PCL_EVT_ASW(CONTEXT-SWITCHES, CONTEXT_SWITCHES), PCL_EVT_ASW(CS, CONTEXT_SWITCHES), PCL_EVT_SW(CPU_MIGRATIONS), PCL_EVT_ASW(CPU-MIGRATIONS, CPU_MIGRATIONS), PCL_EVT_ASW(MIGRATIONS, CPU_MIGRATIONS), PCL_EVT_SW(PAGE_FAULTS_MIN), PCL_EVT_ASW(MINOR-FAULTS, PAGE_FAULTS_MIN), PCL_EVT_SW(PAGE_FAULTS_MAJ), PCL_EVT_ASW(MAJOR-FAULTS, PAGE_FAULTS_MAJ), PCL_EVT_SW(CGROUP_SWITCHES), PCL_EVT_ASW(CGROUP-SWITCHES, CGROUP_SWITCHES), { .name = "PERF_COUNT_HW_CACHE_L1D", .desc = "L1 data cache", .id = PERF_COUNT_HW_CACHE_L1D, .type = PERF_TYPE_HW_CACHE, .numasks = 5, .modmsk = PERF_ATTR_HW, .umask_ovfl_idx = ~0UL, .ngrp = 2, .umasks = { { .uname = "READ", .udesc = "read access", .uid = PERF_COUNT_HW_CACHE_OP_READ << 8, .uflags= PERF_FL_DEFAULT, .grpid = 0, }, { .uname = "WRITE", .udesc = "write access", .uid = PERF_COUNT_HW_CACHE_OP_WRITE << 8, .grpid = 0, }, { .uname = "PREFETCH", .udesc = "prefetch access", .uid = PERF_COUNT_HW_CACHE_OP_PREFETCH << 8, .grpid = 0, }, { .uname = "ACCESS", .udesc = "hit access", .uid = PERF_COUNT_HW_CACHE_RESULT_ACCESS << 16, .grpid = 
1, }, { .uname = "MISS", .udesc = "miss access", .uid = PERF_COUNT_HW_CACHE_RESULT_MISS << 16, .uflags= PERF_FL_DEFAULT, .grpid = 1, } } }, CACHE_ACCESS(L1-DCACHE, "L1 cache", L1D), { .name = "PERF_COUNT_HW_CACHE_L1I", .desc = "L1 instruction cache", .id = PERF_COUNT_HW_CACHE_L1I, .type = PERF_TYPE_HW_CACHE, .numasks = 4, .modmsk = PERF_ATTR_HW, .umask_ovfl_idx = ~0UL, .ngrp = 2, .umasks = { { .uname = "READ", .udesc = "read access", .uid = PERF_COUNT_HW_CACHE_OP_READ << 8, .uflags= PERF_FL_DEFAULT, .grpid = 0, }, { .uname = "PREFETCH", .udesc = "prefetch access", .uid = PERF_COUNT_HW_CACHE_OP_PREFETCH << 8, .grpid = 0, }, { .uname = "ACCESS", .udesc = "hit access", .uid = PERF_COUNT_HW_CACHE_RESULT_ACCESS << 16, .grpid = 1, }, { .uname = "MISS", .udesc = "miss access", .uid = PERF_COUNT_HW_CACHE_RESULT_MISS << 16, .uflags= PERF_FL_DEFAULT, .grpid = 1, } } }, ICACHE_ACCESS(L1-ICACHE, "L1I cache", L1I), { .name = "PERF_COUNT_HW_CACHE_LL", .desc = "Last level cache", .id = PERF_COUNT_HW_CACHE_LL, .type = PERF_TYPE_HW_CACHE, .numasks = 5, .modmsk = PERF_ATTR_HW, .umask_ovfl_idx = ~0UL, .ngrp = 2, .umasks = { { .uname = "READ", .udesc = "read access", .uid = PERF_COUNT_HW_CACHE_OP_READ << 8, .uflags= PERF_FL_DEFAULT, .grpid = 0, }, { .uname = "WRITE", .udesc = "write access", .uid = PERF_COUNT_HW_CACHE_OP_WRITE << 8, .grpid = 0, }, { .uname = "PREFETCH", .udesc = "prefetch access", .uid = PERF_COUNT_HW_CACHE_OP_PREFETCH << 8, .grpid = 0, }, { .uname = "ACCESS", .udesc = "hit access", .uid = PERF_COUNT_HW_CACHE_RESULT_ACCESS << 16, .grpid = 1, }, { .uname = "MISS", .udesc = "miss access", .uid = PERF_COUNT_HW_CACHE_RESULT_MISS << 16, .uflags= PERF_FL_DEFAULT, .grpid = 1, } } }, CACHE_ACCESS(LLC, "Last level cache", LL), { .name = "PERF_COUNT_HW_CACHE_DTLB", .desc = "Data Translation Lookaside Buffer", .id = PERF_COUNT_HW_CACHE_DTLB, .type = PERF_TYPE_HW_CACHE, .numasks = 5, .modmsk = PERF_ATTR_HW, .umask_ovfl_idx = ~0UL, .ngrp = 2, .umasks = { { .uname = "READ", .udesc 
= "read access", .uid = PERF_COUNT_HW_CACHE_OP_READ << 8, .uflags= PERF_FL_DEFAULT, .grpid = 0, }, { .uname = "WRITE", .udesc = "write access", .uid = PERF_COUNT_HW_CACHE_OP_WRITE << 8, .grpid = 0, }, { .uname = "PREFETCH", .udesc = "prefetch access", .uid = PERF_COUNT_HW_CACHE_OP_PREFETCH << 8, .grpid = 0, }, { .uname = "ACCESS", .udesc = "hit access", .uid = PERF_COUNT_HW_CACHE_RESULT_ACCESS << 16, .grpid = 1, }, { .uname = "MISS", .udesc = "miss access", .uid = PERF_COUNT_HW_CACHE_RESULT_MISS << 16, .uflags= PERF_FL_DEFAULT, .grpid = 1, } } }, CACHE_ACCESS(DTLB, "Data TLB", DTLB), { .name = "PERF_COUNT_HW_CACHE_ITLB", .desc = "Instruction Translation Lookaside Buffer", .id = PERF_COUNT_HW_CACHE_ITLB, .type = PERF_TYPE_HW_CACHE, .numasks = 3, .modmsk = PERF_ATTR_HW, .umask_ovfl_idx = ~0UL, .ngrp = 2, .umasks = { { .uname = "READ", .udesc = "read access", .uid = PERF_COUNT_HW_CACHE_OP_READ << 8, .uflags= PERF_FL_DEFAULT, .grpid = 0, }, { .uname = "ACCESS", .udesc = "hit access", .uid = PERF_COUNT_HW_CACHE_RESULT_ACCESS << 16, .grpid = 1, }, { .uname = "MISS", .udesc = "miss access", .uid = PERF_COUNT_HW_CACHE_RESULT_MISS << 16, .uflags= PERF_FL_DEFAULT, .grpid = 1, } } }, CACHE_LD_ACCESS(ITLB, "Instruction TLB", ITLB), { .name = "PERF_COUNT_HW_CACHE_BPU", .desc = "Branch Prediction Unit", .id = PERF_COUNT_HW_CACHE_BPU, .type = PERF_TYPE_HW_CACHE, .numasks = 3, .modmsk = PERF_ATTR_HW, .umask_ovfl_idx = ~0UL, .ngrp = 2, .umasks = { { .uname = "READ", .udesc = "read access", .uid = PERF_COUNT_HW_CACHE_OP_READ << 8, .uflags= PERF_FL_DEFAULT, .grpid = 0, }, { .uname = "ACCESS", .udesc = "hit access", .uid = PERF_COUNT_HW_CACHE_RESULT_ACCESS << 16, .grpid = 1, }, { .uname = "MISS", .udesc = "miss access", .uid = PERF_COUNT_HW_CACHE_RESULT_MISS << 16, .uflags= PERF_FL_DEFAULT, .grpid = 1, } } }, CACHE_LD_ACCESS(BRANCH, "Branch ", BPU), { .name = "PERF_COUNT_HW_CACHE_NODE", .desc = "Node memory access", .id = PERF_COUNT_HW_CACHE_NODE, .type = PERF_TYPE_HW_CACHE, .numasks 
= 5, .modmsk = PERF_ATTR_HW, .umask_ovfl_idx = ~0UL, .ngrp = 2, .umasks = { { .uname = "READ", .udesc = "read access", .uid = PERF_COUNT_HW_CACHE_OP_READ << 8, .uflags= PERF_FL_DEFAULT, .grpid = 0, }, { .uname = "WRITE", .udesc = "write access", .uid = PERF_COUNT_HW_CACHE_OP_WRITE << 8, .grpid = 0, }, { .uname = "PREFETCH", .udesc = "prefetch access", .uid = PERF_COUNT_HW_CACHE_OP_PREFETCH << 8, .grpid = 0, }, { .uname = "ACCESS", .udesc = "hit access", .uid = PERF_COUNT_HW_CACHE_RESULT_ACCESS << 16, .grpid = 1, }, { .uname = "MISS", .udesc = "miss access", .uid = PERF_COUNT_HW_CACHE_RESULT_MISS << 16, .uflags= PERF_FL_DEFAULT, .grpid = 1, } }, }, CACHE_ACCESS(NODE, "Node ", NODE) }; #define PME_PERF_EVENT_COUNT (sizeof(perf_static_events)/sizeof(perf_event_t)) /* * the following events depend on the kernel exporting them. They may be dependent on hardware features */ static perf_event_t perf_optional_events[]={ PCL_EVT_RAW(slots, 0x00, 0x04, "issue slots per logical CPU (used for topdown toplevel computation, must be first event in the group)"), PCL_EVT_RAW(topdown-retiring, 0x00, 0x80, "topdown useful slots retiring uops (must be used in a group with the other topdown- events with slots as leader)"), PCL_EVT_RAW(topdown-bad-spec, 0x00, 0x81, "topdown wasted slots due to bad speculation (must be used in a group with the other topdown- events with slots as leader)"), PCL_EVT_RAW(topdown-fe-bound, 0x00, 0x82, "topdown wasted slots due to frontend (must be used in a group with the other topdown- events with slots as leader)"), PCL_EVT_RAW(topdown-be-bound, 0x00, 0x83, "topdown wasted slots due to backend (must be used in a group with the other topdown- events with slots as leader)"), }; #define PME_PERF_EVENT_OPT_COUNT (sizeof(perf_optional_events)/sizeof(perf_event_t)) papi-papi-7-2-0-t/src/libpfm4/lib/events/power10_events.h000066400000000000000000014370541502707512200231060ustar00rootroot00000000000000/* * File: power10_events.h * (C) Copyright IBM Corporation, 
2023-2024. All Rights Reserved. * Author: Will Schmidt * will_schmidt@vnet.ibm.com * Author: Carl Love * cel@us.ibm.com # * Content reworked Aug 12, 2024, - Sachin Monga, Jeevitha P. * This file was automatically generated from event lists as * provided by the IBM PowerPC PMU team. Any manual * updates should be clearly marked so they are not lost in * any subsequent automatic rebuilds or refreshes of this file. * * Documentation on the PMU events for Power10 can be found * in Appendix E of the Power10 Users Manual. * The Power10 manual is at * https://ibm.ent.box.com/v/power10usermanual * This and other PowerPC related documents can be found at * https://www-50.ibm.com/systems/power/openpower/ */ #ifndef __POWER10_EVENTS_H__ #define __POWER10_EVENTS_H__ static const pme_power_entry_t power10_pe[] = { {.pme_name = "PM_1FLOP_CMPL", .pme_code = 0x45050, .pme_short_desc = "floating point;One floating point instruction completed (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg)", .pme_long_desc = "floating point;One floating point instruction completed (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg)", }, {.pme_name = "PM_1PLUS_PPC_CMPL", .pme_code = 0x100F2, .pme_short_desc = "frontend;Cycles in which at least one instruction is completed by this thread", .pme_long_desc = "frontend;Cycles in which at least one instruction is completed by this thread", }, {.pme_name = "PM_1PLUS_PPC_DISP", .pme_code = 0x400F2, .pme_short_desc = "pipeline;Cycles at least one Instr Dispatched", .pme_long_desc = "pipeline;Cycles at least one Instr Dispatched", }, {.pme_name = "PM_2FLOP_CMPL", .pme_code = 0x4D052, .pme_short_desc = "floating point;Double Precision vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg completed.", .pme_long_desc = "floating point;Double Precision vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg completed.", }, {.pme_name = "PM_4FLOP_CMPL", .pme_code = 0x45052, .pme_short_desc = 
"floating point;Four floating point instruction completed (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg)", .pme_long_desc = "floating point;Four floating point instruction completed (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg)", }, {.pme_name = "PM_8FLOP_CMPL", .pme_code = 0x4D054, .pme_short_desc = "floating point;Four Double Precision vector instruction completed.", .pme_long_desc = "floating point;Four Double Precision vector instruction completed.", }, {.pme_name = "PM_ADJUNCT_CYC", .pme_code = 0x10066, .pme_short_desc = "Cycles in which the thread is in Adjunct state.", .pme_long_desc = "Cycles in which the thread is in Adjunct state. MSR[S HV PR] bits = 011", }, {.pme_name = "PM_ADJUNCT_INST_CMPL", .pme_code = 0x2E010, .pme_short_desc = "PowerPC instruction completed while the thread was in Adjunct state.", .pme_long_desc = "PowerPC instruction completed while the thread was in Adjunct state.", }, {.pme_name = "PM_BR_CMPL", .pme_code = 0x4D05E, .pme_short_desc = "empty;A branch completed.", .pme_long_desc = "empty;A branch completed. All branches are included.", }, {.pme_name = "PM_BR_FIN", .pme_code = 0x10068, .pme_short_desc = "pipeline;A branch instruction finished.", .pme_long_desc = "pipeline;A branch instruction finished. Includes predicted/mispredicted/unconditional", }, {.pme_name = "PM_BR_FIN_ALT2", .pme_code = 0x2F04A, .pme_short_desc = "pipeline;A branch instruction finished.", .pme_long_desc = "pipeline;A branch instruction finished. Includes predicted/mispredicted/unconditional", }, {.pme_name = "PM_BR_MPRED_CMPL", .pme_code = 0x400F6, .pme_short_desc = "cache;A mispredicted branch completed.", .pme_long_desc = "cache;A mispredicted branch completed. 
Includes direction and target.", }, {.pme_name = "PM_BR_MPRED_NTKN_COND_DIR_GBHT", .pme_code = 0x000000E880, .pme_short_desc = "NA;A conditional branch finished with mispredicted direction using the Global Branch History Table.", .pme_long_desc = "NA;A conditional branch finished with mispredicted direction using the Global Branch History Table. Resolved not taken", }, {.pme_name = "PM_BR_COND_CMPL", .pme_code = 0x4E058, .pme_short_desc = "frontend;A conditional branch completed.", .pme_long_desc = "frontend;A conditional branch completed.", }, {.pme_name = "PM_BR_MPRED_NTKN_COND_DIR_LBHT_GSEL", .pme_code = 0x000000E080, .pme_short_desc = "NA;A conditional branch finished with mispredicted direction using the Local Branch History Table selected with the global selector.", .pme_long_desc = "NA;A conditional branch finished with mispredicted direction using the Local Branch History Table selected with the global selector. Resolved not taken", }, {.pme_name = "PM_BR_TKN_FIN", .pme_code = 0x00000040B4, .pme_short_desc = "frontend; A taken branch (conditional or unconditional) finished", .pme_long_desc = "frontend;A taken branch (conditional or unconditional) finished", }, {.pme_name = "PM_BR_MPRED_NTKN_COND_DIR_LBHT_LSEL", .pme_code = 0x00000058BC, .pme_short_desc = "NA;A conditional branch finished with mispredicted direction using the Local Branch History Table selected by the local selector.", .pme_long_desc = "NA;A conditional branch finished with mispredicted direction using the Local Branch History Table selected by the local selector. Resolved not taken", }, {.pme_name = "PM_BR_MPRED_NTKN_COND_DIR_TAGE", .pme_code = 0x000000E084, .pme_short_desc = "NA;A conditional branch finished with mispredicted direction using a TAGE override.", .pme_long_desc = "NA;A conditional branch finished with mispredicted direction using a TAGE override. 
Resolved not taken", }, {.pme_name = "PM_BR_MPRED_NTKN_COND_DIR_TOP", .pme_code = 0x000000E884, .pme_short_desc = "NA;A conditional branch finished with mispredicted direction using a TOP override to the BHT.", .pme_long_desc = "NA;A conditional branch finished with mispredicted direction using a TOP override to the BHT. Resolved not taken", }, {.pme_name = "PM_BR_MPRED_NTKN_SWHINT", .pme_code = 0x000000E0A0, .pme_short_desc = "NA;A software hinted branch finished and the branch resolved not taken and the hint was incorrect.", .pme_long_desc = "NA;A software hinted branch finished and the branch resolved not taken and the hint was incorrect.", }, {.pme_name = "PM_BR_MPRED_TKN_COND_DIR_GBHT", .pme_code = 0x00000050B0, .pme_short_desc = "NA;A conditional branch finished with mispredicted direction using the Global Branch History Table.", .pme_long_desc = "NA;A conditional branch finished with mispredicted direction using the Global Branch History Table. Resolved taken", }, {.pme_name = "PM_BR_MPRED_TKN_COND_DIR_LBHT_GSEL", .pme_code = 0x00000058AC, .pme_short_desc = "NA;A conditional branch finished with mispredicted direction using the Local Branch History Table selected with the global selector.", .pme_long_desc = "NA;A conditional branch finished with mispredicted direction using the Local Branch History Table selected with the global selector. Resolved taken", }, {.pme_name = "PM_BR_MPRED_TKN_COND_DIR_LBHT_LSEL", .pme_code = 0x00000050AC, .pme_short_desc = "NA;A conditional branch finished with mispredicted direction using the Local Branch History Table selected by the local selector.", .pme_long_desc = "NA;A conditional branch finished with mispredicted direction using the Local Branch History Table selected by the local selector. 
Resolved taken", }, {.pme_name = "PM_BR_MPRED_TKN_COND_DIR_TAGE", .pme_code = 0x00000058B0, .pme_short_desc = "NA;A conditional branch finished with mispredicted direction using a TAGE override.", .pme_long_desc = "NA;A conditional branch finished with mispredicted direction using a TAGE override. Resolved taken", }, {.pme_name = "PM_BR_MPRED_TKN_COND_DIR_TOP", .pme_code = 0x00000050B4, .pme_short_desc = "NA;A conditional branch finished with mispredicted direction using a TOP override to the BHT.", .pme_long_desc = "NA;A conditional branch finished with mispredicted direction using a TOP override to the BHT. Resolved taken", }, {.pme_name = "PM_BR_MPRED_TKN_SWHINT", .pme_code = 0x000000E89C, .pme_short_desc = "NA;A software hinted branch finished and the branch resolved taken and the hint was incorrect.", .pme_long_desc = "NA;A software hinted branch finished and the branch resolved taken and the hint was incorrect.", }, {.pme_name = "PM_BR_TAKEN_CMPL", .pme_code = 0x200FA, .pme_short_desc = "frontend;Branch Taken instruction completed", .pme_long_desc = "frontend;Branch Taken instruction completed", }, {.pme_name = "PM_BR_TKN_UNCOND_FIN", .pme_code = 0x00000048B4, .pme_short_desc = "NA;An unconditional branch finished.", .pme_long_desc = "NA;An unconditional branch finished. 
All unconditional branches are taken.", }, {.pme_name = "PM_CMPL_STALL_EXCEPTION", .pme_code = 0x3003A, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was not allowed to complete because it was interrupted by ANY exception, which has to be serviced before the instruction can complete", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was not allowed to complete because it was interrupted by ANY exception, which has to be serviced before the instruction can complete", }, {.pme_name = "PM_CMPL_STALL_HWSYNC", .pme_code = 0x4D01A, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a hwsync waiting for response from L2 before completing.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a hwsync waiting for response from L2 before completing.", }, {.pme_name = "PM_CMPL_STALL_LWSYNC", .pme_code = 0x1E05A, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a lwsync waiting to complete.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a lwsync waiting to complete.", }, {.pme_name = "PM_CMPL_STALL_MEM_ECC", .pme_code = 0x30028, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for the non-speculative finish of either a STCX waiting for its result or a load waiting for non-critical sectors of data and ECC.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for the non-speculative finish of either a STCX waiting for its result or a load waiting for non-critical sectors of data and ECC.", }, {.pme_name = "PM_CMPL_STALL_SPECIAL", .pme_code = 0x2C014, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline required special handling before completing.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline required special 
handling before completing.", }, {.pme_name = "PM_CMPL_STALL_STCX", .pme_code = 0x2D01C, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a stcx waiting for resolution from the nest before completing.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a stcx waiting for resolution from the nest before completing.", }, {.pme_name = "PM_CMPL_STALL", .pme_code = 0x4C018, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline cannot complete because the thread was blocked for any reason.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline cannot complete because the thread was blocked for any reason.", }, {.pme_name = "PM_CYC", .pme_code = 0x100F0, .pme_short_desc = "pmc;Processor cycles", .pme_long_desc = "pmc;Processor cycles", }, {.pme_name = "PM_CYC_ALT2", .pme_code = 0x2001E, .pme_short_desc = "pmc;Processor cycles", .pme_long_desc = "pmc;Processor cycles", }, {.pme_name = "PM_CYC_ALT3", .pme_code = 0x3001E, .pme_short_desc = "pmc;Processor cycles", .pme_long_desc = "pmc;Processor cycles", }, {.pme_name = "PM_CYC_ALT4", .pme_code = 0x4001E, .pme_short_desc = "pmc;Processor cycles", .pme_long_desc = "pmc;Processor cycles", }, {.pme_name = "PM_DATA_FROM_DL2L3_MOD", .pme_code = 0x0E4240000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 or L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 or L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL2L3_MOD_ALT2", .pme_code = 0x0E4240000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 or L3 from a distant chip due to a demand miss.", 
.pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 or L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL2L3_MOD_ALT3", .pme_code = 0x0E4240000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 or L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 or L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL2L3_MOD_ALT4", .pme_code = 0x0E4240000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 or L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 or L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL2L3_SHR", .pme_code = 0x0E0240000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 or L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 or L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL2L3_SHR_ALT2", .pme_code = 0x0E0240000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 or L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another 
core's L2 or L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL2L3_SHR_ALT3", .pme_code = 0x0E0240000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 or L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 or L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL2L3_SHR_ALT4", .pme_code = 0x0E0240000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 or L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 or L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL2_MOD", .pme_code = 0x0E4040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL2_MOD_ALT2", .pme_code = 0x0E4040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL2_MOD_ALT3", .pme_code = 0x0E4040000003C040, .pme_short_desc = 
"Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL2_MOD_ALT4", .pme_code = 0x0E4040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL2_SHR", .pme_code = 0x0E0040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL2_SHR_ALT2", .pme_code = 0x0E0040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL2_SHR_ALT3", .pme_code = 0x0E0040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", 
.pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL2_SHR_ALT4", .pme_code = 0x0E0040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL3_MOD", .pme_code = 0x0EC040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL3_MOD_ALT2", .pme_code = 0x0EC040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL3_MOD_ALT3", .pme_code = 0x0EC040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = 
"PM_DATA_FROM_DL3_MOD_ALT4", .pme_code = 0x0EC040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL3_SHR", .pme_code = 0x0E8040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL3_SHR_ALT2", .pme_code = 0x0E8040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL3_SHR_ALT3", .pme_code = 0x0E8040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DL3_SHR_ALT4", .pme_code = 0x0E8040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a 
valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DMEM", .pme_code = 0x0F4040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from distant memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DMEM_ALT2", .pme_code = 0x0F4040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from distant memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DMEM_ALT3", .pme_code = 0x0F4040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from distant memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_DMEM_ALT4", .pme_code = 0x0F4040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from distant memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_D_OC_CACHE", .pme_code = 0x0F8040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = 
"PM_DATA_FROM_D_OC_CACHE_ALT2", .pme_code = 0x0F8040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_D_OC_CACHE_ALT3", .pme_code = 0x0F8040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_D_OC_CACHE_ALT4", .pme_code = 0x0F8040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_D_OC_MEM", .pme_code = 0x0FC040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_D_OC_MEM_ALT2", .pme_code = 0x0FC040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_D_OC_MEM_ALT3", .pme_code = 0x0FC040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache 
was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_D_OC_MEM_ALT4", .pme_code = 0x0FC040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L21_NON_REGENT_MOD", .pme_code = 0x0A4040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L21_NON_REGENT_MOD_ALT2", .pme_code = 0x0A4040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L21_NON_REGENT_MOD_ALT3", .pme_code = 0x0A4040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L21_NON_REGENT_MOD_ALT4", .pme_code = 0x0A4040000004C040, .pme_short_desc = "Data Source;The 
processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L21_NON_REGENT_SHR", .pme_code = 0x0A0040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L21_NON_REGENT_SHR_ALT2", .pme_code = 0x0A0040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L21_NON_REGENT_SHR_ALT3", .pme_code = 0x0A0040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L21_NON_REGENT_SHR_ALT4", .pme_code = 
0x0A0040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L21_REGENT_MOD", .pme_code = 0x084040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L21_REGENT_MOD_ALT2", .pme_code = 0x084040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L21_REGENT_MOD_ALT3", .pme_code = 0x084040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L21_REGENT_MOD_ALT4", .pme_code = 0x084040000004C040, .pme_short_desc = "Data Source;The 
processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L21_REGENT_SHR", .pme_code = 0x080040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L21_REGENT_SHR_ALT2", .pme_code = 0x080040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L21_REGENT_SHR_ALT3", .pme_code = 0x080040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L21_REGENT_SHR_ALT4", .pme_code = 0x080040000004C040, .pme_short_desc = "Data 
Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L1MISS", .pme_code = 0x003F40000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L1 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L1 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L1MISS_ALT2", .pme_code = 0x003F40000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L1 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L1 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L1MISS_ALT3", .pme_code = 0x003F40000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L1 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L1 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L1MISS_ALT4", .pme_code = 0x003F40000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L1 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L1 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L2MISS", .pme_code = 0x0003C0000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L2 
due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L2MISS_ALT2", .pme_code = 0x200FE, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L2MISS_ALT3", .pme_code = 0x0003C0000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L2MISS_ALT4", .pme_code = 0x0003C0000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L2", .pme_code = 0x000340000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L2 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L2_ALT2", .pme_code = 0x000340000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L2 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L2_ALT3", .pme_code = 0x000340000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local 
core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L2 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L2_ALT4", .pme_code = 0x000340000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L2 due to a demand miss.", }, {.pme_name = "PM_ST_DATA_FROM_L2", .pme_code = 0x0C0000016080, .pme_short_desc = "Data Source;Store data line hit in the local L2. Includes cache-line states Sx, Tx, Mx.", .pme_long_desc = "Data Source;Store data line hit in the local L2. Includes cache-line states Sx, Tx, Mx.Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_DATA_FROM_L31_NON_REGENT_MOD", .pme_code = 0x0AC040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L31_NON_REGENT_MOD_ALT2", .pme_code = 0x0AC040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L31_NON_REGENT_MOD_ALT3", .pme_code = 0x0AC040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache 
was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L31_NON_REGENT_MOD_ALT4", .pme_code = 0x0AC040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L31_NON_REGENT_SHR", .pme_code = 0x0A8040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L31_NON_REGENT_SHR_ALT2", .pme_code = 0x0A8040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L31_NON_REGENT_SHR_ALT3", .pme_code = 0x0A8040000003C040, .pme_short_desc = "Data Source;The processor's L1 
data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L31_NON_REGENT_SHR_ALT4", .pme_code = 0x0A8040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L31_REGENT_MOD", .pme_code = 0x08C040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L31_REGENT_MOD_ALT2", .pme_code = 0x08C040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L31_REGENT_MOD_ALT3", .pme_code = 0x08C040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache 
was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L31_REGENT_MOD_ALT4", .pme_code = 0x08C040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L31_REGENT_SHR", .pme_code = 0x088040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L31_REGENT_SHR_ALT2", .pme_code = 0x088040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L31_REGENT_SHR_ALT3", .pme_code = 0x088040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid 
line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L31_REGENT_SHR_ALT4", .pme_code = 0x088040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L3_MEPF", .pme_code = 0x014040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L3_MEPF_ALT2", .pme_code = 0x014040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L3_MEPF_ALT3", .pme_code = 0x014040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss.", .pme_long_desc = "Data 
Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L3_MEPF_ALT4", .pme_code = 0x014040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L3MISS", .pme_code = 0x0007C0000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L3MISS_ALT2", .pme_code = 0x0007C0000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L3MISS_ALT3", .pme_code = 0x300FE, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L3MISS_ALT4", .pme_code = 0x0007C0000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L3", .pme_code = 0x010340000001C040, .pme_short_desc = 
"Data Source;The processor's L1 data cache was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L3_ALT2", .pme_code = 0x010340000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L3_ALT3", .pme_code = 0x010340000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L3_ALT4", .pme_code = 0x010340000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_ST_DATA_FROM_L3", .pme_code = 0x0C0000016880, .pme_short_desc = "Data Source;Store data line hit in the local L3. Includes cache-line states Tx and Mx.", .pme_long_desc = "Data Source;Store data line hit in the local L3. Includes cache-line states Tx and Mx. If the cache line is in the Sx state, the RC machine will send a RWITM command. 
Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_DATA_FROM_LMEM", .pme_code = 0x094040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's memory due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_LMEM_ALT2", .pme_code = 0x094040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's memory due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_LMEM_ALT3", .pme_code = 0x094040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's memory due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_LMEM_ALT4", .pme_code = 0x094040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's memory due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L_OC_CACHE", .pme_code = 0x098040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L_OC_CACHE_ALT2", .pme_code = 0x098040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI cache due to a demand miss.", 
.pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L_OC_CACHE_ALT3", .pme_code = 0x098040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L_OC_CACHE_ALT4", .pme_code = 0x098040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L_OC_MEM", .pme_code = 0x09C040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L_OC_MEM_ALT2", .pme_code = 0x09C040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L_OC_MEM_ALT3", .pme_code = 0x09C040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_L_OC_MEM_ALT4", .pme_code = 0x09C040000004C040, .pme_short_desc = "Data Source;The processor's 
L1 data cache was reloaded from the local chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_MEMORY", .pme_code = 0x400FE, .pme_short_desc = "The processor's data cache was reloaded from local, remote, or distant memory due to a demand miss", .pme_long_desc = "The processor's data cache was reloaded from local, remote, or distant memory due to a demand miss", }, {.pme_name = "PM_DATA_FROM_RL2L3_MOD", .pme_code = 0x0C4240000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 or L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 or L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL2L3_MOD_ALT2", .pme_code = 0x0C4240000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 or L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 or L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL2L3_MOD_ALT3", .pme_code = 0x0C4240000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 or L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 or L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL2L3_MOD_ALT4", .pme_code = 0x0C4240000004C040, .pme_short_desc = "Data Source;The processor's L1 data 
cache was reloaded with a line in the M (exclusive) state from another core's L2 or L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 or L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL2L3_SHR", .pme_code = 0x0C0240000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 or L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 or L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL2L3_SHR_ALT2", .pme_code = 0x0C0240000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 or L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 or L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL2L3_SHR_ALT3", .pme_code = 0x0C0240000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 or L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 or L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL2L3_SHR_ALT4", .pme_code = 0x0C0240000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's 
L2 or L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 or L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL2_MOD", .pme_code = 0x0C4040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL2_MOD_ALT2", .pme_code = 0x0C4040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL2_MOD_ALT3", .pme_code = 0x0C4040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL2_MOD_ALT4", .pme_code = 0x0C4040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name 
= "PM_DATA_FROM_RL2_SHR", .pme_code = 0x0C0040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL2_SHR_ALT2", .pme_code = 0x0C0040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL2_SHR_ALT3", .pme_code = 0x0C0040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL2_SHR_ALT4", .pme_code = 0x0C0040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL3_MOD", .pme_code = 0x0CC040000001C040, .pme_short_desc = "Data Source;The processor's L1 data 
cache was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL3_MOD_ALT2", .pme_code = 0x0CC040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL3_MOD_ALT3", .pme_code = 0x0CC040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL3_MOD_ALT4", .pme_code = 0x0CC040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL3_SHR", .pme_code = 0x0C8040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) 
state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL3_SHR_ALT2", .pme_code = 0x0C8040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL3_SHR_ALT3", .pme_code = 0x0C8040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RL3_SHR_ALT4", .pme_code = 0x0C8040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RMEM", .pme_code = 0x0D4040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from remote memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RMEM_ALT2", .pme_code = 0x0D4040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from remote memory (MC slow) due to a 
demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from remote memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RMEM_ALT3", .pme_code = 0x0D4040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from remote memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_RMEM_ALT4", .pme_code = 0x0D4040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from remote memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_R_OC_CACHE", .pme_code = 0x0D8040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_R_OC_CACHE_ALT2", .pme_code = 0x0D8040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_R_OC_CACHE_ALT3", .pme_code = 0x0D8040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_R_OC_CACHE_ALT4", .pme_code = 0x0D8040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a remote 
chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_R_OC_MEM", .pme_code = 0x0DC040000001C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_R_OC_MEM_ALT2", .pme_code = 0x0DC040000002C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_R_OC_MEM_ALT3", .pme_code = 0x0DC040000003C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DATA_FROM_R_OC_MEM_ALT4", .pme_code = 0x0DC040000004C040, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DC_RELOAD_COLLISIONS", .pme_code = 0x000000C0BC, .pme_short_desc = "NA;A load reading the L1 cache has a bank collision with another load reading the same bank, or due to a cache-line reload writing to that bank of the L1 cache.", .pme_long_desc = "NA;A load reading the L1 cache has a bank collision with another load reading the same bank, or due to a cache-line reload writing to that bank of the L1 cache.", }, 
{.pme_name = "PM_DC_STORE_WRITE_COLLISIONS", .pme_code = 0x000000C8BC, .pme_short_desc = "NA;A store writing the L1 cache at the same time as a reload or dkill writing the L1 cache that results in a bank collision.", .pme_long_desc = "NA;A store writing the L1 cache at the same time as a reload or dkill writing the L1 cache that results in a bank collision.", }, {.pme_name = "PM_DERAT_MISS_16G", .pme_code = 0x4C054, .pme_short_desc = "memory;Data ERAT Miss (Data TLB Access) page size 16G.", .pme_long_desc = "memory;Data ERAT Miss (Data TLB Access) page size 16G. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches", }, {.pme_name = "PM_DERAT_MISS_16M", .pme_code = 0x3C054, .pme_short_desc = "memory;Data ERAT Miss (Data TLB Access) page size 16M.", .pme_long_desc = "memory;Data ERAT Miss (Data TLB Access) page size 16M. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches", }, {.pme_name = "PM_DERAT_MISS_1G", .pme_code = 0x2C05A, .pme_short_desc = "memory;Data ERAT Miss (Data TLB Access) page size 1G.", .pme_long_desc = "memory;Data ERAT Miss (Data TLB Access) page size 1G. Implies radix translation. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches", }, {.pme_name = "PM_DERAT_MISS_2M", .pme_code = 0x1C05A, .pme_short_desc = "pipeline;Data ERAT Miss (Data TLB Access) page size 2M.", .pme_long_desc = "pipeline;Data ERAT Miss (Data TLB Access) page size 2M. Implies radix translation. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. 
When MMCR1[16]=1 this event includes demand misses and prefetches", }, {.pme_name = "PM_DERAT_MISS_4K", .pme_code = 0x1C056, .pme_short_desc = "memory;Data ERAT Miss (Data TLB Access) page size 4K.", .pme_long_desc = "memory;Data ERAT Miss (Data TLB Access) page size 4K. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches", }, {.pme_name = "PM_DERAT_MISS_64K", .pme_code = 0x2C054, .pme_short_desc = "memory;Data ERAT Miss (Data TLB Access) page size 64K.", .pme_long_desc = "memory;Data ERAT Miss (Data TLB Access) page size 64K. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches", }, {.pme_name = "PM_DERAT_MISS", .pme_code = 0x200F6, .pme_short_desc = "memory;DERAT Reloaded to satisfy a DERAT miss.", .pme_long_desc = "memory;DERAT Reloaded to satisfy a DERAT miss. All page sizes are counted by this event. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. 
When MMCR1[16]=1 this event includes demand misses and prefetches", }, {.pme_name = "PM_DISP_SS0_2_INSTR_CYC", .pme_code = 0x1F056, .pme_short_desc = "Cycles in which Superslice 0 dispatches either 1 or 2 instructions", .pme_long_desc = "Cycles in which Superslice 0 dispatches either 1 or 2 instructions", }, {.pme_name = "PM_DISP_SS0_4_INSTR_CYC", .pme_code = 0x3F054, .pme_short_desc = "Cycles in which Superslice 0 dispatches either 3 or 4 instructions", .pme_long_desc = "Cycles in which Superslice 0 dispatches either 3 or 4 instructions", }, {.pme_name = "PM_DISP_SS0_8_INSTR_CYC", .pme_code = 0x3F056, .pme_short_desc = "Cycles in which Superslice 0 dispatches either 5, 6, 7 or 8 instructions", .pme_long_desc = "Cycles in which Superslice 0 dispatches either 5, 6, 7 or 8 instructions", }, {.pme_name = "PM_DISP_SS1_2_INSTR_CYC", .pme_code = 0x2F054, .pme_short_desc = "Cycles in which Superslice 1 dispatches either 1 or 2 instructions", .pme_long_desc = "Cycles in which Superslice 1 dispatches either 1 or 2 instructions", }, {.pme_name = "PM_DISP_SS1_4_INSTR_CYC", .pme_code = 0x2F056, .pme_short_desc = "Cycles in which Superslice 1 dispatches either 3 or 4 instructions", .pme_long_desc = "Cycles in which Superslice 1 dispatches either 3 or 4 instructions", }, {.pme_name = "PM_DISP_STALL_BR_MPRED_IC_L2", .pme_code = 0x1003A, .pme_short_desc = "pipeline;Cycles when dispatch was stalled while the instruction was fetched from the local L2 after suffering a branch mispredict.", .pme_long_desc = "pipeline;Cycles when dispatch was stalled while the instruction was fetched from the local L2 after suffering a branch mispredict.", }, {.pme_name = "PM_DISP_STALL_BR_MPRED_IC_L3MISS", .pme_code = 0x4C010, .pme_short_desc = "pipeline;Cycles when dispatch was stalled while the instruction was fetched from sources beyond the local L3 after suffering a mispredicted branch.", .pme_long_desc = "pipeline;Cycles when dispatch was stalled while the instruction was fetched from sources 
beyond the local L3 after suffering a mispredicted branch.", }, {.pme_name = "PM_DISP_STALL_BR_MPRED_IC_L3", .pme_code = 0x2C01E, .pme_short_desc = "pipeline;Cycles when dispatch was stalled while the instruction was fetched from the local L3 after suffering a branch mispredict.", .pme_long_desc = "pipeline;Cycles when dispatch was stalled while the instruction was fetched from the local L3 after suffering a branch mispredict.", }, {.pme_name = "PM_DISP_STALL_BR_MPRED_ICMISS", .pme_code = 0x34058, .pme_short_desc = "pipeline;Cycles when dispatch was stalled after a mispredicted branch resulted in an instruction cache miss.", .pme_long_desc = "pipeline;Cycles when dispatch was stalled after a mispredicted branch resulted in an instruction cache miss.", }, {.pme_name = "PM_DISP_STALL_BR_MPRED", .pme_code = 0x4D01E, .pme_short_desc = "pipeline;Cycles when dispatch was stalled for this thread due to a mispredicted branch.", .pme_long_desc = "pipeline;Cycles when dispatch was stalled for this thread due to a mispredicted branch.", }, {.pme_name = "PM_DISP_STALL_CYC", .pme_code = 0x100F8, .pme_short_desc = "pipeline;Cycles the ICT has no itags assigned to this thread (no instructions were dispatched during these cycles).", .pme_long_desc = "pipeline;Cycles the ICT has no itags assigned to this thread (no instructions were dispatched during these cycles).", }, {.pme_name = "PM_DISP_STALL_FETCH", .pme_code = 0x2E018, .pme_short_desc = "pipeline;Cycles when dispatch was stalled for this thread because Fetch was being held", .pme_long_desc = "pipeline;Cycles when dispatch was stalled for this thread because Fetch was being held", }, {.pme_name = "PM_DISP_STALL_FLUSH", .pme_code = 0x30004, .pme_short_desc = "pipeline;Cycles when dispatch was stalled because of a flush that happened to an instruction(s) that was not yet next-to-complete (NTC).", .pme_long_desc = "pipeline;Cycles when dispatch was stalled because of a flush that happened to an instruction(s) that was not yet 
next-to-complete (NTC). PM_EXEC_STALL_NTC_FLUSH only includes instructions that were flushed after becoming NTC", }, {.pme_name = "PM_DISP_STALL_HELD_CYC", .pme_code = 0x4E01A, .pme_short_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch for any reason", .pme_long_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch for any reason", }, {.pme_name = "PM_DISP_STALL_HELD_HALT_CYC", .pme_code = 0x1D05E, .pme_short_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch because of power management", .pme_long_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch because of power management", }, {.pme_name = "PM_DISP_STALL_HELD_ISSQ_FULL_CYC", .pme_code = 0x20006, .pme_short_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch due to Issue queue full.", .pme_long_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch due to Issue queue full. Includes issue queue and branch queue", }, {.pme_name = "PM_DISP_STALL_HELD_OTHER_CYC", .pme_code = 0x10006, .pme_short_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch for any other reason", .pme_long_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch for any other reason", }, {.pme_name = "PM_DISP_STALL_HELD_RENAME_CYC", .pme_code = 0x3D05C, .pme_short_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch because the mapper/SRB was full.", .pme_long_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch because the mapper/SRB was full. 
Includes GPR (count, link, tar), VSR, VMR, FPR and XVFC", }, {.pme_name = "PM_DISP_STALL_HELD_SCOREBOARD_CYC", .pme_code = 0x30018, .pme_short_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch while waiting on the Scoreboard.", .pme_long_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch while waiting on the Scoreboard. This event combines VSCR and FPSCR together", }, {.pme_name = "PM_DISP_STALL_HELD_STF_MAPPER_CYC", .pme_code = 0x1E050, .pme_short_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch because the STF mapper/SRB was full.", .pme_long_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch because the STF mapper/SRB was full. Includes GPR (count, link, tar), VSR, VMR, FPR", }, {.pme_name = "PM_DISP_STALL_HELD_SYNC_CYC", .pme_code = 0x4003C, .pme_short_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch because of a synchronizing instruction that requires the ICT to be empty before dispatch", .pme_long_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch because of a synchronizing instruction that requires the ICT to be empty before dispatch", }, {.pme_name = "PM_DISP_STALL_HELD_XVFC_MAPPER_CYC", .pme_code = 0x2E01A, .pme_short_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch because the XVFC mapper/SRB was full", .pme_long_desc = "pipeline;Cycles in which the next-to-complete (NTC) instruction is held at dispatch because the XVFC mapper/SRB was full", }, {.pme_name = "PM_DISP_STALL_IC_L2", .pme_code = 0x10064, .pme_short_desc = "pipeline;Cycles when dispatch was stalled while the instruction was fetched from the local L2.", .pme_long_desc = "pipeline;Cycles when dispatch was stalled while the instruction was fetched from the local L2.", }, {.pme_name = 
"PM_DISP_STALL_IC_L3MISS", .pme_code = 0x4E010, .pme_short_desc = "pipeline;Cycles when dispatch was stalled while the instruction was fetched from any source beyond the local L3.", .pme_long_desc = "pipeline;Cycles when dispatch was stalled while the instruction was fetched from any source beyond the local L3.", }, {.pme_name = "PM_DISP_STALL_IC_L3", .pme_code = 0x3E052, .pme_short_desc = "pipeline;Cycles when dispatch was stalled while the instruction was fetched from the local L3.", .pme_long_desc = "pipeline;Cycles when dispatch was stalled while the instruction was fetched from the local L3.", }, {.pme_name = "PM_DISP_STALL_IC_MISS", .pme_code = 0x2D01A, .pme_short_desc = "pipeline;Cycles when dispatch was stalled for this thread due to an instruction cache miss.", .pme_long_desc = "pipeline;Cycles when dispatch was stalled for this thread due to an instruction cache miss.", }, {.pme_name = "PM_DISP_STALL_IERAT_ONLY_MISS", .pme_code = 0x2C016, .pme_short_desc = "pipeline;Cycles when dispatch was stalled while waiting to resolve an instruction ERAT miss", .pme_long_desc = "pipeline;Cycles when dispatch was stalled while waiting to resolve an instruction ERAT miss", }, {.pme_name = "PM_DISP_STALL_ITLB_MISS", .pme_code = 0x3000A, .pme_short_desc = "frontend;Cycles when dispatch was stalled while waiting to resolve an instruction TLB miss.", .pme_long_desc = "frontend;Cycles when dispatch was stalled while waiting to resolve an instruction TLB miss.", }, {.pme_name = "PM_DISP_STALL_TRANSLATION", .pme_code = 0x10038, .pme_short_desc = "pipeline;Cycles when dispatch was stalled for this thread because the MMU was handling a translation miss.", .pme_long_desc = "pipeline;Cycles when dispatch was stalled for this thread because the MMU was handling a translation miss.", }, {.pme_name = "PM_DPP_FLOP_CMPL", .pme_code = 0x4D05C, .pme_short_desc = "floating point;Double-Precision or Quad-Precision instruction completed", .pme_long_desc = "floating point;Double-Precision 
or Quad-Precision instruction completed", }, {.pme_name = "PM_DPTEG_FROM_DL2_MOD", .pme_code = 0x0E4060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DL2_MOD_ALT2", .pme_code = 0x0E4060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DL2_MOD_ALT3", .pme_code = 0x0E4060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DL2_MOD_ALT4", .pme_code = 0x0E4060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DL2_SHR", .pme_code = 0x0E0060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was 
reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DL2_SHR_ALT2", .pme_code = 0x0E0060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DL2_SHR_ALT3", .pme_code = 0x0E0060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DL2_SHR_ALT4", .pme_code = 0x0E0060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DL3_MOD", .pme_code = 0x0EC060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M 
(exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DL3_MOD_ALT2", .pme_code = 0x0EC060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DL3_MOD_ALT3", .pme_code = 0x0EC060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DL3_MOD_ALT4", .pme_code = 0x0EC060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DL3_SHR", .pme_code = 0x0E8060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a 
valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DL3_SHR_ALT2", .pme_code = 0x0E8060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DL3_SHR_ALT3", .pme_code = 0x0E8060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DL3_SHR_ALT4", .pme_code = 0x0E8060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DMEM", .pme_code = 0x0F4060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from distant memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DMEM_ALT2", .pme_code = 
0x0F4060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from distant memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DMEM_ALT3", .pme_code = 0x0F4060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from distant memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_DMEM_ALT4", .pme_code = 0x0F4060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from distant memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_D_OC_CACHE", .pme_code = 0x0F8060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_D_OC_CACHE_ALT2", .pme_code = 0x0F8060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_D_OC_CACHE_ALT3", .pme_code = 0x0F8060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table 
entry was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_D_OC_CACHE_ALT4", .pme_code = 0x0F8060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_D_OC_MEM", .pme_code = 0x0FC060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_D_OC_MEM_ALT2", .pme_code = 0x0FC060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_D_OC_MEM_ALT3", .pme_code = 0x0FC060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_D_OC_MEM_ALT4", .pme_code = 0x0FC060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L21_NON_REGENT_MOD", .pme_code = 0x0A4060000001C040, 
.pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L21_NON_REGENT_MOD_ALT2", .pme_code = 0x0A4060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L21_NON_REGENT_MOD_ALT3", .pme_code = 0x0A4060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L21_NON_REGENT_MOD_ALT4", .pme_code = 0x0A4060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L21_NON_REGENT_SHR", .pme_code = 
0x0A0060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L21_NON_REGENT_SHR_ALT2", .pme_code = 0x0A0060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L21_NON_REGENT_SHR_ALT3", .pme_code = 0x0A0060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L21_NON_REGENT_SHR_ALT4", .pme_code = 0x0A0060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) 
state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L21_REGENT_MOD", .pme_code = 0x084060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L21_REGENT_MOD_ALT2", .pme_code = 0x084060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L21_REGENT_MOD_ALT3", .pme_code = 0x084060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L21_REGENT_MOD_ALT4", .pme_code = 0x084060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on 
the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L21_REGENT_SHR", .pme_code = 0x080060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L21_REGENT_SHR_ALT2", .pme_code = 0x080060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L21_REGENT_SHR_ALT3", .pme_code = 0x080060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L21_REGENT_SHR_ALT4", .pme_code = 0x080060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's 
data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L2MISS", .pme_code = 0x0003E0000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a source beyond the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a source beyond the local core's L2 due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L2MISS_ALT2", .pme_code = 0x0003E0000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a source beyond the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a source beyond the local core's L2 due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L2MISS_ALT3", .pme_code = 0x0003E0000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a source beyond the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a source beyond the local core's L2 due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L2MISS_ALT4", .pme_code = 0x0003E0000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a source beyond the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a source beyond the local core's L2 due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L2", .pme_code = 0x000360000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local core's L2 due to a demand miss.", }, {.pme_name = 
"PM_DPTEG_FROM_L2_ALT2", .pme_code = 0x000360000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local core's L2 due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L2_ALT3", .pme_code = 0x000360000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local core's L2 due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L2_ALT4", .pme_code = 0x000360000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local core's L2 due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L31_NON_REGENT_MOD", .pme_code = 0x0AC060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L31_NON_REGENT_MOD_ALT2", .pme_code = 0x0AC060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name 
= "PM_DPTEG_FROM_L31_NON_REGENT_MOD_ALT3", .pme_code = 0x0AC060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L31_NON_REGENT_MOD_ALT4", .pme_code = 0x0AC060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L31_NON_REGENT_SHR", .pme_code = 0x0A8060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L31_NON_REGENT_SHR_ALT2", .pme_code = 0x0A8060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's 
L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L31_NON_REGENT_SHR_ALT3", .pme_code = 0x0A8060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L31_NON_REGENT_SHR_ALT4", .pme_code = 0x0A8060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L31_REGENT_MOD", .pme_code = 0x08C060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L31_REGENT_MOD_ALT2", .pme_code = 0x08C060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was 
reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L31_REGENT_MOD_ALT3", .pme_code = 0x08C060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L31_REGENT_MOD_ALT4", .pme_code = 0x08C060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L31_REGENT_SHR", .pme_code = 0x088060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L31_REGENT_SHR_ALT2", .pme_code = 0x088060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data 
page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L31_REGENT_SHR_ALT3", .pme_code = 0x088060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L31_REGENT_SHR_ALT4", .pme_code = 0x088060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L3MISS", .pme_code = 0x0007E0000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from beyond the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from beyond the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L3MISS_ALT2", .pme_code = 0x0007E0000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from beyond the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from beyond the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L3MISS_ALT3", .pme_code = 
0x0007E0000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from beyond the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from beyond the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L3MISS_ALT4", .pme_code = 0x0007E0000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from beyond the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from beyond the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L3", .pme_code = 0x010360000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L3_ALT2", .pme_code = 0x010360000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L3_ALT3", .pme_code = 0x010360000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L3_ALT4", .pme_code = 0x010360000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local core's L3 due to a demand miss.", }, {.pme_name = 
"PM_DPTEG_FROM_LMEM", .pme_code = 0x094060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's memory due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_LMEM_ALT2", .pme_code = 0x094060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's memory due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_LMEM_ALT3", .pme_code = 0x094060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's memory due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_LMEM_ALT4", .pme_code = 0x094060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's memory due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L_OC_CACHE", .pme_code = 0x098060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L_OC_CACHE_ALT2", .pme_code = 0x098060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page 
table entry was reloaded from the local chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L_OC_CACHE_ALT3", .pme_code = 0x098060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L_OC_CACHE_ALT4", .pme_code = 0x098060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L_OC_MEM", .pme_code = 0x09C060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L_OC_MEM_ALT2", .pme_code = 0x09C060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L_OC_MEM_ALT3", .pme_code = 0x09C060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_L_OC_MEM_ALT4", .pme_code = 0x09C060000004C040, 
.pme_short_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from the local chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RL2_MOD", .pme_code = 0x0C4060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RL2_MOD_ALT2", .pme_code = 0x0C4060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RL2_MOD_ALT3", .pme_code = 0x0C4060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RL2_MOD_ALT4", .pme_code = 0x0C4060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the 
M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RL2_SHR", .pme_code = 0x0C0060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RL2_SHR_ALT2", .pme_code = 0x0C0060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RL2_SHR_ALT3", .pme_code = 0x0C0060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RL2_SHR_ALT4", .pme_code = 0x0C0060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from 
a remote chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RL3_MOD", .pme_code = 0x0CC060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RL3_MOD_ALT2", .pme_code = 0x0CC060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RL3_MOD_ALT3", .pme_code = 0x0CC060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RL3_MOD_ALT4", .pme_code = 0x0CC060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RL3_SHR", .pme_code = 0x0C8060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a 
valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RL3_SHR_ALT2", .pme_code = 0x0C8060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RL3_SHR_ALT3", .pme_code = 0x0C8060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RL3_SHR_ALT4", .pme_code = 0x0C8060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RMEM", .pme_code = 0x0D4060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from remote memory (MC slow) due to a demand miss.", 
.pme_long_desc = "Data Source;The processor's data page table entry was reloaded from remote memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RMEM_ALT2", .pme_code = 0x0D4060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from remote memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RMEM_ALT3", .pme_code = 0x0D4060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from remote memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_RMEM_ALT4", .pme_code = 0x0D4060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from remote memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_R_OC_CACHE", .pme_code = 0x0D8060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_R_OC_CACHE_ALT2", .pme_code = 0x0D8060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_R_OC_CACHE_ALT3", .pme_code = 0x0D8060000003C040, .pme_short_desc = "Data 
Source;The processor's data page table entry was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_R_OC_CACHE_ALT4", .pme_code = 0x0D8060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_R_OC_MEM", .pme_code = 0x0DC060000001C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_R_OC_MEM_ALT2", .pme_code = 0x0DC060000002C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_R_OC_MEM_ALT3", .pme_code = 0x0DC060000003C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DPTEG_FROM_R_OC_MEM_ALT4", .pme_code = 0x0DC060000004C040, .pme_short_desc = "Data Source;The processor's data page table entry was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's data page table entry 
was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_DTLB_HIT", .pme_code = 0x1F054, .pme_short_desc = "frontend;The PTE required by the instruction was resident in the TLB (data TLB access).", .pme_long_desc = "frontend;The PTE required by the instruction was resident in the TLB (data TLB access). When MMCR1[16]=0 this event counts only demand hits. When MMCR1[16]=1 this event includes demand and prefetch. Applies to both HPT and RPT", }, {.pme_name = "PM_DTLB_MISS_16G", .pme_code = 0x1C058, .pme_short_desc = "memory;Data TLB reload (after a miss) page size 16G.", .pme_long_desc = "memory;Data TLB reload (after a miss) page size 16G. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches.", }, {.pme_name = "PM_DTLB_MISS_16M", .pme_code = 0x4C056, .pme_short_desc = "memory;Data TLB reload (after a miss) page size 16M.", .pme_long_desc = "memory;Data TLB reload (after a miss) page size 16M. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches", }, {.pme_name = "PM_DTLB_MISS_1G", .pme_code = 0x4C05A, .pme_short_desc = "memory;Data TLB reload (after a miss) page size 1G.", .pme_long_desc = "memory;Data TLB reload (after a miss) page size 1G. Implies radix translation was used. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches", }, {.pme_name = "PM_DTLB_MISS_2M", .pme_code = 0x1C05C, .pme_short_desc = "memory;Data TLB reload (after a miss) page size 2M.", .pme_long_desc = "memory;Data TLB reload (after a miss) page size 2M. Implies radix translation was used. When MMCR1[16]=0 this event counts only for demand misses. 
When MMCR1[16]=1 this event includes demand misses and prefetches", }, {.pme_name = "PM_DTLB_MISS_4K", .pme_code = 0x2C056, .pme_short_desc = "memory;Data TLB reload (after a miss) page size 4K.", .pme_long_desc = "memory;Data TLB reload (after a miss) page size 4K. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches", }, {.pme_name = "PM_DTLB_MISS_64K", .pme_code = 0x3C056, .pme_short_desc = "memory;Data TLB reload (after a miss) page size 64K.", .pme_long_desc = "memory;Data TLB reload (after a miss) page size 64K. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches", }, {.pme_name = "PM_DTLB_MISS", .pme_code = 0x300FC, .pme_short_desc = "memory;The DPTEG required for the load/store instruction in execution was missing from the TLB.", .pme_long_desc = "memory;The DPTEG required for the load/store instruction in execution was missing from the TLB. 
It includes pages of all sizes for demand and prefetch activity", }, {.pme_name = "PM_EXEC_STALL_BRU", .pme_code = 0x4D018, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was executing in the Branch unit.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was executing in the Branch unit.", }, {.pme_name = "PM_EXEC_STALL_DERAT_DTLB_MISS", .pme_code = 0x30016, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline suffered a TLB miss and waited for it to resolve.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline suffered a TLB miss and waited for it to resolve.", }, {.pme_name = "PM_EXEC_STALL_DERAT_ONLY_MISS", .pme_code = 0x4C012, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline suffered an ERAT miss and waited for it to resolve.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline suffered an ERAT miss and waited for it to resolve.", }, {.pme_name = "PM_EXEC_STALL_DMISS_L21_L31", .pme_code = 0x1E054, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from another core's L2 or L3 on the same chip.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from another core's L2 or L3 on the same chip.", }, {.pme_name = "PM_EXEC_STALL_DMISS_L2L3_CONFLICT", .pme_code = 0x4C016, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from the local L2 or local L3, with a dispatch conflict.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from the local L2 or local L3, with a dispatch conflict.", }, {.pme_name = "PM_EXEC_STALL_DMISS_L2L3_NOCONFLICT", .pme_code = 0x34054, .pme_short_desc = "pipeline;Cycles in which the
oldest instruction in the pipeline was waiting for a load miss to resolve from the local L2 or local L3, without a dispatch conflict.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from the local L2 or local L3, without a dispatch conflict.", }, {.pme_name = "PM_EXEC_STALL_DMISS_L2L3", .pme_code = 0x1003C, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from either the local L2 or local L3.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from either the local L2 or local L3.", }, {.pme_name = "PM_EXEC_STALL_DMISS_L3MISS", .pme_code = 0x2C018, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from a source beyond the local L2 or local L3.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from a source beyond the local L2 or local L3.", }, {.pme_name = "PM_EXEC_STALL_DMISS_LMEM", .pme_code = 0x30038, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from the local memory, local OpenCAPI cache, or local OpenCAPI memory.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from the local memory, local OpenCAPI cache, or local OpenCAPI memory.", }, {.pme_name = "PM_EXEC_STALL_DMISS_OFF_CHIP", .pme_code = 0x2C01C, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from a remote chip.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from a remote chip.", }, {.pme_name = "PM_EXEC_STALL_DMISS_OFF_NODE", .pme_code = 0x4C01A, 
.pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from a distant chip.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from a distant chip.", }, {.pme_name = "PM_EXEC_STALL_FIN_AT_DISP", .pme_code = 0x10058, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline finished at dispatch and did not require execution in the LSU, BRU or VSU.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline finished at dispatch and did not require execution in the LSU, BRU or VSU.", }, {.pme_name = "PM_EXEC_STALL_LOAD_FINISH", .pme_code = 0x34056, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was finishing a load after its data was reloaded from a data source beyond the local L1; cycles in which the LSU was processing an L1-hit; cycles in which the next-to-finish (NTF) instruction merged with another load in the LMQ; cycles in which the NTF instruction is waiting for a data reload for a load miss, but the data comes back with a non-NTF instruction.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was finishing a load after its data was reloaded from a data source beyond the local L1; cycles in which the LSU was processing an L1-hit; cycles in which the next-to-finish (NTF) instruction merged with another load in the LMQ; cycles in which the NTF instruction is waiting for a data reload for a load miss, but the data comes back with a non-NTF instruction.", }, {.pme_name = "PM_EXEC_STALL_LOAD", .pme_code = 0x4D014, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a load instruction executing in the Load Store Unit.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a load instruction executing in the Load Store Unit.", }, {.pme_name = 
"PM_EXEC_STALL_LSU", .pme_code = 0x2C010, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was executing in the Load Store Unit.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was executing in the Load Store Unit. This does not include simple fixed point instructions.", }, {.pme_name = "PM_EXEC_STALL_NTC_FLUSH", .pme_code = 0x2E01E, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was executing in any unit before it was flushed.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was executing in any unit before it was flushed. Note that if the flush of the oldest instruction happens after finish, the cycles from dispatch to issue will be included in PM_DISP_STALL and the cycles from issue to finish will be included in PM_EXEC_STALL and its corresponding children. This event will also count cycles when the previous next-to-finish (NTF) instruction is still completing and the new NTF instruction is stalled at dispatch.", }, {.pme_name = "PM_EXEC_STALL_PTESYNC", .pme_code = 0x4D016, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a PTESYNC instruction executing in the Load Store Unit.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a PTESYNC instruction executing in the Load Store Unit.", }, {.pme_name = "PM_EXEC_STALL_SIMPLE_FX", .pme_code = 0x30036, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a simple fixed point instruction executing in the Load Store Unit.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a simple fixed point instruction executing in the Load Store Unit.", }, {.pme_name = "PM_EXEC_STALL_STORE_MISS", .pme_code = 0x30026, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a store whose cache line was not resident in the 
L1 and was waiting for allocation of the missing line into the L1.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a store whose cache line was not resident in the L1 and was waiting for allocation of the missing line into the L1.", }, {.pme_name = "PM_EXEC_STALL_STORE_PIPE", .pme_code = 0x1E056, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was executing in the store unit.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was executing in the store unit. This does not include cycles spent handling store misses, PTESYNC instructions or TLBIE instructions.", }, {.pme_name = "PM_EXEC_STALL_STORE", .pme_code = 0x30014, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a store instruction executing in the Load Store Unit.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a store instruction executing in the Load Store Unit.", }, {.pme_name = "PM_EXEC_STALL_TLBIEL", .pme_code = 0x4D01C, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a TLBIEL instruction executing in the Load Store Unit.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a TLBIEL instruction executing in the Load Store Unit. 
TLBIEL instructions have lower overhead than TLBIE instructions because they don't get sent to the nest.", }, {.pme_name = "PM_EXEC_STALL_TLBIE", .pme_code = 0x2E01C, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a TLBIE instruction executing in the Load Store Unit.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was a TLBIE instruction executing in the Load Store Unit.", }, {.pme_name = "PM_EXEC_STALL_TRANSLATION", .pme_code = 0x10004, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline suffered a TLB miss or ERAT miss and waited for it to resolve.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline suffered a TLB miss or ERAT miss and waited for it to resolve.", }, {.pme_name = "PM_EXEC_STALL", .pme_code = 0x30008, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting to finish in one of the execution units (BRU, LSU, VSU).", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was waiting to finish in one of the execution units (BRU, LSU, VSU). Only cycles between issue and finish are counted in this category.", }, {.pme_name = "PM_EXEC_STALL_UNKNOWN", .pme_code = 0x4E012, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline completed without an ntf_type pulse.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline completed without an ntf_type pulse. 
The ntf_pulse was missed by the ISU because the next-to-finish (NTF) instruction's finish and completion came too close together.", }, {.pme_name = "PM_EXEC_STALL_VSU", .pme_code = 0x2D018, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was executing in the VSU (includes FXU, VSU, CRU).", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was executing in the VSU (includes FXU, VSU, CRU).", }, {.pme_name = "PM_EXT_INT", .pme_code = 0x200F8, .pme_short_desc = "pipeline;Cycles an external interrupt was active", .pme_long_desc = "pipeline;Cycles an external interrupt was active", }, {.pme_name = "PM_FLOP_CMPL", .pme_code = 0x100F4, .pme_short_desc = "floating point;Floating Point Operations Completed.", .pme_long_desc = "floating point;Floating Point Operations Completed. Includes any type. It counts once for each 1, 2, 4 or 8 flop instruction. Use PM_1|2|4|8_FLOP_CMPL events to count flops", }, {.pme_name = "PM_FLUSH_COMPLETION", .pme_code = 0x30012, .pme_short_desc = "frontend;The instruction that was next to complete (oldest in the pipeline) did not complete because it suffered a flush", .pme_long_desc = "frontend;The instruction that was next to complete (oldest in the pipeline) did not complete because it suffered a flush", }, {.pme_name = "PM_FLUSH_MPRED", .pme_code = 0x1005A, .pme_short_desc = "pipeline;A flush occurred due to a mispredicted branch.", .pme_long_desc = "pipeline;A flush occurred due to a mispredicted branch. Includes target and direction", }, {.pme_name = "PM_FLUSH", .pme_code = 0x400F8, .pme_short_desc = "pipeline;Flush (any type)", .pme_long_desc = "pipeline;Flush (any type)", }, {.pme_name = "PM_FMA_CMPL", .pme_code = 0x45054, .pme_short_desc = "empty;Two floating point instructions completed (FMA class of instructions: fmadd, fnmadd, fmsub, fnmsub).", .pme_long_desc = "empty;Two floating point instructions completed (FMA class of instructions: fmadd, fnmadd, fmsub, fnmsub). 
Scalar instructions only. ", }, {.pme_name = "PM_FX_LSU_FIN", .pme_code = 0x1006A, .pme_short_desc = "pipeline;Simple fixed point instruction issued to the store unit.", .pme_long_desc = "pipeline;Simple fixed point instruction issued to the store unit. Measured at finish time", }, {.pme_name = "PM_FXU_ISSUE", .pme_code = 0x40004, .pme_short_desc = "pipeline;A fixed point instruction was issued to the VSU.", .pme_long_desc = "pipeline;A fixed point instruction was issued to the VSU.", }, {.pme_name = "PM_HYPERVISOR_CYC", .pme_code = 0x2000A, .pme_short_desc = "pmc;Cycles when the thread is in Hypervisor state.", .pme_long_desc = "pmc;Cycles when the thread is in Hypervisor state. MSR[S HV PR]=010", }, {.pme_name = "PM_HYPERVISOR_INST_CMPL", .pme_code = 0x4D022, .pme_short_desc = "pmc;PowerPC instruction completed while the thread was in hypervisor state.", .pme_long_desc = "pmc;PowerPC instruction completed while the thread was in hypervisor state.", }, {.pme_name = "PM_ICBI_FIN", .pme_code = 0x2F04C, .pme_short_desc = "An ICBI instruction finished", .pme_long_desc = "An ICBI instruction finished", }, {.pme_name = "PM_IC_DEMAND_CYC", .pme_code = 0x10018, .pme_short_desc = "pipeline;Cycles in which an instruction reload is pending to satisfy a demand miss", .pme_long_desc = "pipeline;Cycles in which an instruction reload is pending to satisfy a demand miss", }, {.pme_name = "PM_IC_MISS_CMPL", .pme_code = 0x45058, .pme_short_desc = "pipeline;Non-speculative instruction cache miss, counted at completion", .pme_long_desc = "pipeline;Non-speculative instruction cache miss, counted at completion", }, {.pme_name = "PM_IERAT_MISS", .pme_code = 0x100F6, .pme_short_desc = "frontend;IERAT Reloaded to satisfy an IERAT miss.", .pme_long_desc = "frontend;IERAT Reloaded to satisfy an IERAT miss. 
All page sizes are counted by this event.", }, {.pme_name = "PM_INST_CMPL", .pme_code = 0x100FE, .pme_short_desc = "pmc;PowerPC instruction completed", .pme_long_desc = "pmc;PowerPC instruction completed", }, {.pme_name = "PM_INST_DISP", .pme_code = 0x200F2, .pme_short_desc = "frontend;PowerPC instruction dispatched", .pme_long_desc = "frontend;PowerPC instruction dispatched", }, {.pme_name = "PM_INST_DISP_ALT", .pme_code = 0x300F2, .pme_short_desc = "frontend;PowerPC instruction dispatched", .pme_long_desc = "frontend;PowerPC instruction dispatched", }, {.pme_name = "PM_INST_CMPL_ALT2", .pme_code = 0x20002, .pme_short_desc = "pmc;PowerPC instruction completed", .pme_long_desc = "pmc;PowerPC instruction completed", }, {.pme_name = "PM_INST_CMPL_ALT3", .pme_code = 0x30002, .pme_short_desc = "pmc;PowerPC instruction completed", .pme_long_desc = "pmc;PowerPC instruction completed", }, {.pme_name = "PM_INST_CMPL_ALT4", .pme_code = 0x40002, .pme_short_desc = "pmc;PowerPC instruction completed", .pme_long_desc = "pmc;PowerPC instruction completed", }, {.pme_name = "PM_INST_FIN", .pme_code = 0x40030, .pme_short_desc = "pmc;Instruction finished", .pme_long_desc = "pmc;Instruction finished", }, {.pme_name = "PM_INST_FROM_L1", .pme_code = 0x0000004080, .pme_short_desc = "NA;An instruction fetch hit in the L1.", .pme_long_desc = "NA;An instruction fetch hit in the L1. Each fetch group contains 8 instructions. 
The same line can hit 4 times if 32 sequential instructions are fetched.", }, {.pme_name = "PM_INST_FROM_L1MISS", .pme_code = 0x003F00000001C040, .pme_short_desc = "NA;The processor's instruction cache was reloaded from a source beyond the local core's L1 due to a demand miss.", .pme_long_desc = "NA;The processor's instruction cache was reloaded from a source beyond the local core's L1 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L1MISS_ALT2", .pme_code = 0x003F00000002C040, .pme_short_desc = "NA;The processor's instruction cache was reloaded from a source beyond the local core's L1 due to a demand miss.", .pme_long_desc = "NA;The processor's instruction cache was reloaded from a source beyond the local core's L1 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L1MISS_ALT3", .pme_code = 0x003F00000003C040, .pme_short_desc = "NA;The processor's instruction cache was reloaded from a source beyond the local core's L1 due to a demand miss.", .pme_long_desc = "NA;The processor's instruction cache was reloaded from a source beyond the local core's L1 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L1MISS_ALT4", .pme_code = 0x003F00000004C040, .pme_short_desc = "NA;The processor's instruction cache was reloaded from a source beyond the local core's L1 due to a demand miss.", .pme_long_desc = "NA;The processor's instruction cache was reloaded from a source beyond the local core's L1 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_DMEM", .pme_code = 0x0F4100000001C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from distant memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_INST_FROM_DMEM_ALT2", .pme_code = 0x0F4100000002C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = 
"Data Source;The processor's instruction cache was reloaded from distant memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_INST_FROM_DMEM_ALT3", .pme_code = 0x0F4100000003C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from distant memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_INST_FROM_DMEM_ALT4", .pme_code = 0x0F4100000004C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from distant memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L2MISS", .pme_code = 0x000380000001C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from a source beyond the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from a source beyond the local core's L2 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L2MISS_ALT2", .pme_code = 0x000380000002C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from a source beyond the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from a source beyond the local core's L2 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L2MISS_ALT3", .pme_code = 0x000380000003C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from a source beyond the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from a source beyond the local core's L2 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L2MISS_ALT4", .pme_code = 0x000380000004C040, .pme_short_desc = "Data Source;The processor's 
instruction cache was reloaded from a source beyond the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from a source beyond the local core's L2 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L2", .pme_code = 0x000300000001C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from the local core's L2 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L2_ALT2", .pme_code = 0x000300000002C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from the local core's L2 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L2_ALT3", .pme_code = 0x000300000003C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from the local core's L2 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L2_ALT4", .pme_code = 0x000300000004C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from the local core's L2 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L3MISS", .pme_code = 0x000780000001C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L3MISS_ALT2", .pme_code = 0x000780000002C040, .pme_short_desc = "Data Source;The processor's 
instruction cache was reloaded from beyond the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L3MISS_ALT3", .pme_code = 0x300FA, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L3MISS_ALT4", .pme_code = 0x000780000004C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L3", .pme_code = 0x010300000001C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L3_ALT2", .pme_code = 0x010300000002C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L3_ALT3", .pme_code = 0x010300000003C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_L3_ALT4", .pme_code = 0x010300000004C040, .pme_short_desc = "Data Source;The processor's instruction 
cache was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_INST_FROM_LMEM", .pme_code = 0x094100000001C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from the local chip's memory due to a demand miss.", }, {.pme_name = "PM_INST_FROM_LMEM_ALT2", .pme_code = 0x094100000002C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from the local chip's memory due to a demand miss.", }, {.pme_name = "PM_INST_FROM_LMEM_ALT3", .pme_code = 0x094100000003C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from the local chip's memory due to a demand miss.", }, {.pme_name = "PM_INST_FROM_LMEM_ALT4", .pme_code = 0x094100000004C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from the local chip's memory due to a demand miss.", }, {.pme_name = "PM_INST_FROM_RMEM", .pme_code = 0x0D4100000001C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from remote memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_INST_FROM_RMEM_ALT2", .pme_code = 0x0D4100000002C040, .pme_short_desc = "Data Source;The processor's instruction cache 
was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from remote memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_INST_FROM_RMEM_ALT3", .pme_code = 0x0D4100000003C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from remote memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_INST_FROM_RMEM_ALT4", .pme_code = 0x0D4100000004C040, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from remote memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_IOPS_DISP", .pme_code = 0x24050, .pme_short_desc = "frontend;Internal Operations dispatched.", .pme_long_desc = "frontend;Internal Operations dispatched. 
PM_IOPS_DISP / PM_INST_DISP will show the average number of internal operations per PowerPC instruction.", }, {.pme_name = "PM_IPTEG_FROM_DL2_MOD", .pme_code = 0x0E4020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DL2_MOD_ALT2", .pme_code = 0x0E4020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DL2_MOD_ALT3", .pme_code = 0x0E4020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DL2_MOD_ALT4", .pme_code = 0x0E4020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = 
"PM_IPTEG_FROM_DL2_SHR", .pme_code = 0x0E0020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DL2_SHR_ALT2", .pme_code = 0x0E0020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DL2_SHR_ALT3", .pme_code = 0x0E0020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DL2_SHR_ALT4", .pme_code = 0x0E0020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", 
}, {.pme_name = "PM_IPTEG_FROM_DL3_MOD", .pme_code = 0x0EC020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DL3_MOD_ALT2", .pme_code = 0x0EC020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DL3_MOD_ALT3", .pme_code = 0x0EC020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DL3_MOD_ALT4", .pme_code = 0x0EC020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DL3_SHR", .pme_code = 0x0E8020000001C040, .pme_short_desc = "Data Source;The processor's instruction page 
table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DL3_SHR_ALT2", .pme_code = 0x0E8020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DL3_SHR_ALT3", .pme_code = 0x0E8020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DL3_SHR_ALT4", .pme_code = 0x0E8020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DMEM", .pme_code = 0x0F4020000001C040, .pme_short_desc = "Data Source;The processor's 
instruction page table entry was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from distant memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DMEM_ALT2", .pme_code = 0x0F4020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from distant memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DMEM_ALT3", .pme_code = 0x0F4020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from distant memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_DMEM_ALT4", .pme_code = 0x0F4020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from distant memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_D_OC_CACHE", .pme_code = 0x0F8020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_D_OC_CACHE_ALT2", .pme_code = 0x0F8020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry 
was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_D_OC_CACHE_ALT3", .pme_code = 0x0F8020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_D_OC_CACHE_ALT4", .pme_code = 0x0F8020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_D_OC_MEM", .pme_code = 0x0FC020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_D_OC_MEM_ALT2", .pme_code = 0x0FC020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_D_OC_MEM_ALT3", .pme_code = 0x0FC020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = 
"PM_IPTEG_FROM_D_OC_MEM_ALT4", .pme_code = 0x0FC020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L21_NON_REGENT_MOD", .pme_code = 0x0A4020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L21_NON_REGENT_MOD_ALT2", .pme_code = 0x0A4020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L21_NON_REGENT_MOD_ALT3", .pme_code = 0x0A4020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L21_NON_REGENT_MOD_ALT4", .pme_code = 0x0A4020000004C040, 
.pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L21_NON_REGENT_SHR", .pme_code = 0x0A0020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L21_NON_REGENT_SHR_ALT2", .pme_code = 0x0A0020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L21_NON_REGENT_SHR_ALT3", .pme_code = 0x0A0020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state 
from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L21_NON_REGENT_SHR_ALT4", .pme_code = 0x0A0020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L21_REGENT_MOD", .pme_code = 0x084020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L21_REGENT_MOD_ALT2", .pme_code = 0x084020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L21_REGENT_MOD_ALT3", .pme_code = 0x084020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's 
instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L21_REGENT_MOD_ALT4", .pme_code = 0x084020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L21_REGENT_SHR", .pme_code = 0x080020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L21_REGENT_SHR_ALT2", .pme_code = 0x080020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L21_REGENT_SHR_ALT3", .pme_code = 0x080020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another 
core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L21_REGENT_SHR_ALT4", .pme_code = 0x080020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L2MISS", .pme_code = 0x0003A0000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a source beyond the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a source beyond the local core's L2 due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L2MISS_ALT2", .pme_code = 0x0003A0000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a source beyond the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a source beyond the local core's L2 due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L2MISS_ALT3", .pme_code = 0x0003A0000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a source beyond the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a source beyond the local core's L2 due to a demand miss.", }, 
{.pme_name = "PM_IPTEG_FROM_L2MISS_ALT4", .pme_code = 0x0003A0000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a source beyond the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a source beyond the local core's L2 due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L2", .pme_code = 0x000320000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local core's L2 due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L2_ALT2", .pme_code = 0x000320000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local core's L2 due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L2_ALT3", .pme_code = 0x000320000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local core's L2 due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L2_ALT4", .pme_code = 0x000320000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local core's L2 due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L31_NON_REGENT_MOD", .pme_code = 0x0AC020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's 
L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L31_NON_REGENT_MOD_ALT2", .pme_code = 0x0AC020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L31_NON_REGENT_MOD_ALT3", .pme_code = 0x0AC020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L31_NON_REGENT_MOD_ALT4", .pme_code = 0x0AC020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L31_NON_REGENT_SHR", .pme_code = 0x0A8020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded 
with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L31_NON_REGENT_SHR_ALT2", .pme_code = 0x0A8020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L31_NON_REGENT_SHR_ALT3", .pme_code = 0x0A8020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L31_NON_REGENT_SHR_ALT4", .pme_code = 0x0A8020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a 
different regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L31_REGENT_MOD", .pme_code = 0x08C020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L31_REGENT_MOD_ALT2", .pme_code = 0x08C020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L31_REGENT_MOD_ALT3", .pme_code = 0x08C020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L31_REGENT_MOD_ALT4", .pme_code = 0x08C020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 
on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L31_REGENT_SHR", .pme_code = 0x088020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L31_REGENT_SHR_ALT2", .pme_code = 0x088020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L31_REGENT_SHR_ALT3", .pme_code = 0x088020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L31_REGENT_SHR_ALT4", .pme_code = 0x088020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand 
miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L3MISS", .pme_code = 0x0007A0000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from beyond the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from beyond the local core's L3 due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L3MISS_ALT2", .pme_code = 0x0007A0000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from beyond the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from beyond the local core's L3 due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L3MISS_ALT3", .pme_code = 0x0007A0000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from beyond the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from beyond the local core's L3 due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L3MISS_ALT4", .pme_code = 0x0007A0000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from beyond the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from beyond the local core's L3 due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L3", .pme_code = 0x010320000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the 
local core's L3 due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L3_ALT2", .pme_code = 0x010320000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L3_ALT3", .pme_code = 0x010320000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L3_ALT4", .pme_code = 0x010320000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local core's L3 due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_LMEM", .pme_code = 0x094020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's memory due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_LMEM_ALT2", .pme_code = 0x094020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's memory due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_LMEM_ALT3", .pme_code = 0x094020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's memory due to a 
demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's memory due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_LMEM_ALT4", .pme_code = 0x094020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's memory due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L_OC_CACHE", .pme_code = 0x098020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L_OC_CACHE_ALT2", .pme_code = 0x098020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L_OC_CACHE_ALT3", .pme_code = 0x098020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L_OC_CACHE_ALT4", .pme_code = 0x098020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's OpenCAPI 
cache due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L_OC_MEM", .pme_code = 0x09C020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L_OC_MEM_ALT2", .pme_code = 0x09C020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L_OC_MEM_ALT3", .pme_code = 0x09C020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_L_OC_MEM_ALT4", .pme_code = 0x09C020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from the local chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RL2_MOD", .pme_code = 0x0C4020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand 
miss.", }, {.pme_name = "PM_IPTEG_FROM_RL2_MOD_ALT2", .pme_code = 0x0C4020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RL2_MOD_ALT3", .pme_code = 0x0C4020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RL2_MOD_ALT4", .pme_code = 0x0C4020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RL2_SHR", .pme_code = 0x0C0020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RL2_SHR_ALT2", .pme_code = 0x0C0020000002C040, .pme_short_desc = 
"Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RL2_SHR_ALT3", .pme_code = 0x0C0020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RL2_SHR_ALT4", .pme_code = 0x0C0020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RL3_MOD", .pme_code = 0x0CC020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RL3_MOD_ALT2", .pme_code = 0x0CC020000002C040, .pme_short_desc = "Data Source;The processor's 
instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RL3_MOD_ALT3", .pme_code = 0x0CC020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RL3_MOD_ALT4", .pme_code = 0x0CC020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RL3_SHR", .pme_code = 0x0C8020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RL3_SHR_ALT2", .pme_code = 0x0C8020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from 
another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RL3_SHR_ALT3", .pme_code = 0x0C8020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RL3_SHR_ALT4", .pme_code = 0x0C8020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RMEM", .pme_code = 0x0D4020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from remote memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RMEM_ALT2", .pme_code = 0x0D4020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from remote memory (MC slow) due to a demand miss.", }, 
{.pme_name = "PM_IPTEG_FROM_RMEM_ALT3", .pme_code = 0x0D4020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from remote memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_RMEM_ALT4", .pme_code = 0x0D4020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from remote memory (MC slow) due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_R_OC_CACHE", .pme_code = 0x0D8020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_R_OC_CACHE_ALT2", .pme_code = 0x0D8020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_R_OC_CACHE_ALT3", .pme_code = 0x0D8020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_R_OC_CACHE_ALT4", .pme_code = 0x0D8020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry 
was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_R_OC_MEM", .pme_code = 0x0DC020000001C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_R_OC_MEM_ALT2", .pme_code = 0x0DC020000002C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_R_OC_MEM_ALT3", .pme_code = 0x0DC020000003C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_IPTEG_FROM_R_OC_MEM_ALT4", .pme_code = 0x0DC020000004C040, .pme_short_desc = "Data Source;The processor's instruction page table entry was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;The processor's instruction page table entry was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", }, {.pme_name = "PM_ISSUE_CANCEL", .pme_code = 0x2405E, .pme_short_desc = "frontend;An instruction issued and the issue was later cancelled.", .pme_long_desc = "frontend;An instruction issued and the issue was later cancelled. 
Only one cancel per PowerPC instruction", }, {.pme_name = "PM_ISSUE_KILL", .pme_code = 0x40006, .pme_short_desc = "frontend;Cycles in which an instruction or group of instructions were cancelled after being issued.", .pme_long_desc = "frontend;Cycles in which an instruction or group of instructions were cancelled after being issued. This event increments once per occurrence, regardless of how many instructions are included in the issue group", }, {.pme_name = "PM_ISSUE_STALL", .pme_code = 0x20004, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was dispatched but not issued yet.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline was dispatched but not issued yet.", }, {.pme_name = "PM_ISU_FLUSH_DISP", .pme_code = 0x0000002084, .pme_short_desc = "NA;Dispatch flushes occur when one thread is causing other threads to stall", .pme_long_desc = "NA;Dispatch flushes occur when one thread is causing other threads to stall", }, {.pme_name = "PM_ISU_FLUSH", .pme_code = 0x0000002880, .pme_short_desc = "NA;All flushes initiated by the Instruction Sequencing Unit (ISU).", .pme_long_desc = "NA;All flushes initiated by the Instruction Sequencing Unit (ISU). Excludes LSU NTC+1 flushes", }, {.pme_name = "PM_ITLB_HIT_1G", .pme_code = 0x3F046, .pme_short_desc = "frontend;Instruction TLB hit (IERAT reload) page size 1G, which implies Radix Page Table translation is in use.", .pme_long_desc = "frontend;Instruction TLB hit (IERAT reload) page size 1G, which implies Radix Page Table translation is in use. When MMCR1[17]=0 this event counts only for demand misses. 
When MMCR1[17]=1 this event includes demand misses and prefetches.", }, {.pme_name = "PM_ITLB_HIT", .pme_code = 0x2001A, .pme_short_desc = "memory;The PTE required to translate the instruction address was resident in the TLB (instruction TLB access/IERAT reload).", .pme_long_desc = "memory;The PTE required to translate the instruction address was resident in the TLB (instruction TLB access/IERAT reload). Applies to both HPT and RPT. When MMCR1[17]=0 this event counts only for demand misses. When MMCR1[17]=1 this event includes demand misses and prefetches.", }, {.pme_name = "PM_ITLB_MISS", .pme_code = 0x400FC, .pme_short_desc = "frontend;Instruction TLB reload (after a miss), all page sizes.", .pme_long_desc = "frontend;Instruction TLB reload (after a miss), all page sizes. Includes only demand misses.", }, {.pme_name = "PM_L1_ICACHE_MISS", .pme_code = 0x200FD, .pme_short_desc = "Demand instruction cache miss", .pme_long_desc = "Demand instruction cache miss", }, {.pme_name = "PM_L1_ICACHE_RELOADED_ALL", .pme_code = 0x40012, .pme_short_desc = "Counts all instruction cache reloads includes demand, prefetch, prefetch turned into demand and demand turned into prefetch", .pme_long_desc = "Counts all instruction cache reloads includes demand, prefetch, prefetch turned into demand and demand turned into prefetch", }, {.pme_name = "PM_L1_ICACHE_RELOADED_PREF", .pme_code = 0x30068, .pme_short_desc = "Counts all instruction cache prefetch reloads (includes demand turned into prefetch)", .pme_long_desc = "Counts all instruction cache prefetch reloads (includes demand turned into prefetch)", }, {.pme_name = "PM_L2_CASTOUT_MOD", .pme_code = 0x010000016080, .pme_short_desc = "NA;A line in an Exclusive (M,Mu,Me) state is evicted from the L2.", .pme_long_desc = "NA;A line in an Exclusive (M,Mu,Me) state is evicted from the L2. 
Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_CASTOUT_SHR", .pme_code = 0x010000016880, .pme_short_desc = "NA;A line in a Shared (Tx,Sx) state is evicted from the L2.", .pme_long_desc = "NA;A line in a Shared (Tx,Sx) state is evicted from the L2. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_CO_USAGE", .pme_code = 0x060000026880, .pme_short_desc = "NA;Continuous 16 cycle (2to1) window where this signal rotates through sampling each CO machine busy.", .pme_long_desc = "NA;Continuous 16 cycle (2to1) window where this signal rotates through sampling each CO machine busy. PMU uses this wave to then do 16 cyc count to sample total number of machs running. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_DC_INV", .pme_code = 0x010000026880, .pme_short_desc = "NA;Data cache invalidates sent over the reload bus to the core.", .pme_long_desc = "NA;Data cache invalidates sent over the reload bus to the core. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_IC_INV", .pme_code = 0x010000026080, .pme_short_desc = "NA;Instruction cache invalidates sent over the reload bus to the core.", .pme_long_desc = "NA;Instruction cache invalidates sent over the reload bus to the core. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_INST", .pme_code = 0x000000036080, .pme_short_desc = "NA;All successful I-side-instruction-fetch (e.", .pme_long_desc = "NA;All successful I-side-instruction-fetch (e.g. i-demand, i-prefetch) dispatches for this thread. 
Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_INST_MISS", .pme_code = 0x000000036880, .pme_short_desc = "NA;All successful instruction (demand and prefetch) dispatches for this thread that missed in the L2.", .pme_long_desc = "NA;All successful instruction (demand and prefetch) dispatches for this thread that missed in the L2. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_INST_MISS_ALT", .pme_code = 0x0F0000046080, .pme_short_desc = "NA;All successful instruction (demand and prefetch) dispatches for this thread that missed in the L2.", .pme_long_desc = "NA;All successful instruction (demand and prefetch) dispatches for this thread that missed in the L2. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_ISIDE_DSIDE_ATTEMPT", .pme_code = 0x020000016080, .pme_short_desc = "NA;All D-side-Ld or I-side-instruction-fetch dispatch attempts for this thread.", .pme_long_desc = "NA;All D-side-Ld or I-side-instruction-fetch dispatch attempts for this thread. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_ISIDE_DSIDE_FAIL_ADDR", .pme_code = 0x020000016880, .pme_short_desc = "NA;All D-side-Ld or I-side-instruction-fetch dispatch attempts for this thread that failed due to an address collision conflicts with an L2 machine already working on this line (e.", .pme_long_desc = "NA;All D-side-Ld or I-side-instruction-fetch dispatch attempts for this thread that failed due to an address collision conflicts with an L2 machine already working on this line (e.g. ld-hit-stq or Read-claim/Castout/Snoop machines). 
Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_ISIDE_DSIDE_FAIL_OTHER", .pme_code = 0x020000026080, .pme_short_desc = "NA;All D-side-Ld or I-side-instruction-fetch dispatch attempts for this thread that failed due to reasons other than an address collision conflicts with an L2 machine (e.", .pme_long_desc = "NA;All D-side-Ld or I-side-instruction-fetch dispatch attempts for this thread that failed due to reasons other than an address collision conflicts with an L2 machine (e.g. Read-Claim/Snoop machine not available). Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_ISIDE_DSIDE", .pme_code = 0x010000036080, .pme_short_desc = "NA;All successful D-side-Ld or I-side-instruction-fetch dispatches for this thread.", .pme_long_desc = "NA;All successful D-side-Ld or I-side-instruction-fetch dispatches for this thread. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_LD_DISP", .pme_code = 0x0F0000016080, .pme_short_desc = "NA;All successful D-side-Ld or I-side-instruction-fetch dispatches for this thread.", .pme_long_desc = "NA;All successful D-side-Ld or I-side-instruction-fetch dispatches for this thread. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_LD_HIT", .pme_code = 0x0F0000026080, .pme_short_desc = "NA;All successful D-side-Ld or I-side-instruction-fetch dispatches for this thread that were L2 hits.", .pme_long_desc = "NA;All successful D-side-Ld or I-side-instruction-fetch dispatches for this thread that were L2 hits. 
Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_LD_MISS", .pme_code = 0x000000026080, .pme_short_desc = "NA;All successful D-Side Load dispatches for this thread that missed in the L2.", .pme_long_desc = "NA;All successful D-Side Load dispatches for this thread that missed in the L2. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_LD", .pme_code = 0x000000016080, .pme_short_desc = "NA;All successful D-side Load dispatches for this thread (L2 miss + L2 hits).", .pme_long_desc = "NA;All successful D-side Load dispatches for this thread (L2 miss + L2 hits). Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_LOC_GUESS_CORRECT", .pme_code = 0x040000016080, .pme_short_desc = "NA;L2 guess local (LNS) and guess was correct (i.e. data local).", .pme_long_desc = "NA;L2 guess local (LNS) and guess was correct (i.e. data local). Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_LOC_GUESS_WRONG", .pme_code = 0x040000016880, .pme_short_desc = "NA;L2 guess local (LNS) and guess was not correct (i.e. data not on chip).", .pme_long_desc = "NA;L2 guess local (LNS) and guess was not correct (i.e. data not on chip). Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_RC_USAGE", .pme_code = 0x060000016880, .pme_short_desc = "NA;Continuous 16 cycle (2to1) window where this signal rotates through sampling each RC machine busy.", .pme_long_desc = "NA;Continuous 16 cycle (2to1) window where this signal rotates through sampling each RC machine busy. 
PMU uses this wave to then do 16 cyc count to sample total number of machs running. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_SN_USAGE", .pme_code = 0x060000036880, .pme_short_desc = "NA;Continuous 16 cycle (2to1) window where this signal rotates through sampling each SN machine busy.", .pme_long_desc = "NA;Continuous 16 cycle (2to1) window where this signal rotates through sampling each SN machine busy. PMU uses this wave to then do 16 cyc count to sample total number of machs running. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_ST_ATTEMPT", .pme_code = 0x020000036080, .pme_short_desc = "NA;All D-side store dispatch attempts for this thread.", .pme_long_desc = "NA;All D-side store dispatch attempts for this thread. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_ST_DISP_FAIL_ADDR", .pme_code = 0x020000036880, .pme_short_desc = "NA;All D-side store dispatch attempts for this thread that failed due to address collision with L2 machine already working on this line (e.", .pme_long_desc = "NA;All D-side store dispatch attempts for this thread that failed due to address collision with L2 machine already working on this line (e.g. ld-hit-stq or Read-claim/Castout/Snoop machines). Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_ST_DISP_FAIL_OTHER", .pme_code = 0x020000046080, .pme_short_desc = "NA;All D-side store dispatch attempts for this thread that failed due to reasons other than address collision (e.", .pme_long_desc = "NA;All D-side store dispatch attempts for this thread that failed due to reasons other than address collision (e.g. 
Read-Claim/Snoop machine not available). Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_ST_DISP", .pme_code = 0x0F0000016880, .pme_short_desc = "NA;All successful D-side store dispatches for this thread.", .pme_long_desc = "NA;All successful D-side store dispatches for this thread. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_ST_MISS", .pme_code = 0x000000026880, .pme_short_desc = "NA;All successful D-Side Store dispatches for this thread that missed in the L2.", .pme_long_desc = "NA;All successful D-Side Store dispatches for this thread that missed in the L2. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_ST_HIT", .pme_code = 0x0F0000026880, .pme_short_desc = "NA;All successful D-side store dispatches for this thread that were L2 hits", .pme_long_desc = "NA;All successful D-side store dispatches for this thread that were L2 hits. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_ST", .pme_code = 0x000000016880, .pme_short_desc = "NA;All successful D-side store dispatches for this thread (L2 miss + L2 hits).", .pme_long_desc = "NA;All successful D-side store dispatches for this thread (L2 miss + L2 hits). Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L2_ST_ALT4", .pme_code = 0x010000046080, .pme_short_desc = "NA;All successful D-side store dispatches for this thread (L2 miss + L2 hits).", .pme_long_desc = "NA;All successful D-side store dispatches for this thread (L2 miss + L2 hits). 
Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L3_LD_HIT", .pme_code = 0x120000026080, .pme_short_desc = "NA;L3 Hits for loads, but not stores.", .pme_long_desc = "NA;L3 Hits for loads, but not stores. Any L2 load (but not store) that hits in the L3, including data load, instruction load or translate load. Since the event happens in the 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L3_LD_MISS", .pme_code = 0x120000026880, .pme_short_desc = "NA;L3 Misses for loads, but not stores.", .pme_long_desc = "NA;L3 Misses for loads, but not stores. Any L2 load (but not store) to the L3 that misses in the L3, including data load, instruction load or translate load. Since the event happens in the 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_L3_WI_USAGE", .pme_code = 0x140000016880, .pme_short_desc = "NA;Lifetime, sample of Write Inject machine 0 valid.", .pme_long_desc = "NA;Lifetime, sample of Write Inject machine 0 valid. Increments while Write Inject machine 0 is valid. Since the event happens in the 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2.", }, {.pme_name = "PM_LARX_FIN", .pme_code = 0x3C058, .pme_short_desc = "memory;Load and reserve instruction (LARX) finished.", .pme_long_desc = "memory;Load and reserve instruction (LARX) finished. 
LARX and STCX are instructions used to acquire a lock", }, {.pme_name = "PM_LD_CMPL", .pme_code = 0x4003E, .pme_short_desc = "memory;Load instruction completed", .pme_long_desc = "memory;Load instruction completed", }, {.pme_name = "PM_LD_DEMAND_MISS_L1_FIN", .pme_code = 0x400F0, .pme_short_desc = "empty;Load missed L1, counted at finish time", .pme_long_desc = "empty;Load missed L1, counted at finish time", }, {.pme_name = "PM_LD_DEMAND_MISS_L1", .pme_code = 0x300F6, .pme_short_desc = "The L1 cache was reloaded with a line that fulfills a demand miss request.", .pme_long_desc = "The L1 cache was reloaded with a line that fulfills a demand miss request. Counted at reload time, before finish.", }, {.pme_name = "PM_LD_HIT_L1", .pme_code = 0x1505E, .pme_short_desc = "Load finished without experiencing an L1 miss", .pme_long_desc = "Load finished without experiencing an L1 miss", }, {.pme_name = "PM_LD_L3MISS_PEND_CYC", .pme_code = 0x10062, .pme_short_desc = "memory;Cycles in which an L3 miss was pending for this thread", .pme_long_desc = "memory;Cycles in which an L3 miss was pending for this thread", }, {.pme_name = "PM_LD_MISS_L1", .pme_code = 0x3E054, .pme_short_desc = "frontend;Load missed L1, counted at finish time.", .pme_long_desc = "frontend;Load missed L1, counted at finish time. LMQ merges are not included in this count, i.e. if a load instruction misses on an address that is already allocated on the LMQ, this event will not increment for that load. 
Note that this count is per slice, so if a load spans multiple slices this event will increment multiple times for a single load.", }, {.pme_name = "PM_LD_PREFETCH_CACHE_LINE_MISS", .pme_code = 0x1002C, .pme_short_desc = "The L1 cache was reloaded with a line that fulfills a prefetch request", .pme_long_desc = "The L1 cache was reloaded with a line that fulfills a prefetch request", }, {.pme_name = "PM_LD_REF_L1", .pme_code = 0x100FC, .pme_short_desc = "empty;All L1 D cache load references counted at finish, gated by reject.", .pme_long_desc = "empty;All L1 D cache load references counted at finish, gated by reject. In P9 and earlier this event counted only cacheable loads but in P10 both cacheable and non-cacheable loads are included", }, {.pme_name = "PM_LSU_FIN", .pme_code = 0x30066, .pme_short_desc = "pipeline;LSU Finished an internal operation (up to 4 per cycle)", .pme_long_desc = "pipeline;LSU Finished an internal operation (up to 4 per cycle)", }, {.pme_name = "PM_LSU_LD0_FIN", .pme_code = 0x1000C, .pme_short_desc = "pipeline;LSU Finished an internal operation in LD0 port", .pme_long_desc = "pipeline;LSU Finished an internal operation in LD0 port", }, {.pme_name = "PM_LSU_LD1_FIN", .pme_code = 0x2000E, .pme_short_desc = "pipeline;LSU Finished an internal operation in LD1 port", .pme_long_desc = "pipeline;LSU Finished an internal operation in LD1 port", }, {.pme_name = "PM_LSU_ST0_FIN", .pme_code = 0x10012, .pme_short_desc = "pipeline;LSU Finished an internal operation in ST0 port", .pme_long_desc = "pipeline;LSU Finished an internal operation in ST0 port", }, {.pme_name = "PM_LSU_ST1_FIN", .pme_code = 0x2D010, .pme_short_desc = "pipeline;LSU Finished an internal operation in ST1 port", .pme_long_desc = "pipeline;LSU Finished an internal operation in ST1 port", }, {.pme_name = "PM_LSU_ST2_FIN", .pme_code = 0x3001A, .pme_short_desc = "pipeline;LSU Finished an internal operation in ST2 port", .pme_long_desc = "pipeline;LSU Finished an internal operation in ST2 
port", }, {.pme_name = "PM_LSU_ST3_FIN", .pme_code = 0x4C01E, .pme_short_desc = "pipeline;LSU Finished an internal operation in ST3 port", .pme_long_desc = "pipeline;LSU Finished an internal operation in ST3 port", }, {.pme_name = "PM_LSU_ST4_FIN", .pme_code = 0x10014, .pme_short_desc = "pipeline;LSU Finished an internal operation in ST4 port", .pme_long_desc = "pipeline;LSU Finished an internal operation in ST4 port", }, {.pme_name = "PM_LSU_ST5_FIN", .pme_code = 0x3F04A, .pme_short_desc = "LSU Finished an internal operation in ST5 port", .pme_long_desc = "LSU Finished an internal operation in ST5 port", }, {.pme_name = "PM_MATH_FLOP_CMPL", .pme_code = 0x4505C, .pme_short_desc = "floating point;Math floating point instruction completed", .pme_long_desc = "floating point;Math floating point instruction completed", }, {.pme_name = "PM_MEM_PREF", .pme_code = 0x2C058, .pme_short_desc = "Memory prefetch for this thread.", .pme_long_desc = "Memory prefetch for this thread. Includes instruction and data. This event count should be divided by two since the event is sourced from 2:1 clock domain.", }, {.pme_name = "PM_MEM_READ", .pme_code = 0x10056, .pme_short_desc = "pipeline;Reads from Memory from this thread (includes data/inst/xlate/l1prefetch/inst prefetch).", .pme_long_desc = "pipeline;Reads from Memory from this thread (includes data/inst/xlate/l1prefetch/inst prefetch). This event count should be divided by two since the event is sourced from 2:1 clock domain.", }, {.pme_name = "PM_MEM_RWITM", .pme_code = 0x3C05E, .pme_short_desc = "empty;Memory Read With Intent to Modify for this thread.", .pme_long_desc = "empty;Memory Read With Intent to Modify for this thread. 
This event count should be divided by two since the event is sourced from 2:1 clock domain.", }, {.pme_name = "PM_MMA_ISSUED", .pme_code = 0x1000E, .pme_short_desc = "pipeline;MMA instruction issued", .pme_long_desc = "pipeline;MMA instruction issued", }, {.pme_name = "PM_PRED_BR_TKN_COND_DIR", .pme_code = 0x00000040B8, .pme_short_desc = "frontend;A conditional branch finished with correctly predicted direction.", .pme_long_desc = "frontend;A conditional branch finished with correctly predicted direction. Resolved taken", }, {.pme_name = "PM_PRED_BR_NTKN_COND_DIR", .pme_code = 0x00000048B8, .pme_short_desc = "frontend;A conditional branch finished with correctly predicted direction.", .pme_long_desc = "frontend;A conditional branch finished with correctly predicted direction. Resolved not taken", }, {.pme_name = "PM_MPRED_BR_NTKN_COND_DIR", .pme_code = 0x00000048BC, .pme_short_desc = "NA;A conditional branch finished with mispredicted direction.", .pme_long_desc = "NA;A conditional branch finished with mispredicted direction. Resolved not taken", }, {.pme_name = "PM_MPRED_BR_TKN_COND_DIR", .pme_code = 0x00000040BC, .pme_short_desc = "NA;A conditional branch finished with mispredicted direction.", .pme_long_desc = "NA;A conditional branch finished with mispredicted direction. Resolved taken", }, {.pme_name = "PM_MRK_BR_CMPL", .pme_code = 0x2415C, .pme_short_desc = "marked;A marked branch completed.", .pme_long_desc = "marked;A marked branch completed. All branches are included.", }, {.pme_name = "PM_MRK_BR_MPRED_CMPL", .pme_code = 0x301E4, .pme_short_desc = "marked;Marked Branch Mispredicted.", .pme_long_desc = "marked;Marked Branch Mispredicted. 
Includes direction and target", }, {.pme_name = "PM_MRK_BR_TAKEN_CMPL", .pme_code = 0x101E2, .pme_short_desc = "marked;Marked Branch Taken instruction completed", .pme_long_desc = "marked;Marked Branch Taken instruction completed", }, {.pme_name = "PM_MRK_BRU_FIN", .pme_code = 0x2013A, .pme_short_desc = "marked;Marked Branch instruction finished", .pme_long_desc = "marked;Marked Branch instruction finished", }, {.pme_name = "PM_MRK_DATA_FROM_DL2_MOD_CYC", .pme_code = 0x0E4040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_DL2_MOD to obtain the average DL2_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DL2_MOD_CYC_ALT2", .pme_code = 0x0E4040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_DL2_MOD to obtain the average DL2_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DL2_MOD_CYC_ALT3", .pme_code = 0x0E4040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_DL2_MOD to obtain the average DL2_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DL2_MOD_CYC_ALT4", .pme_code = 0x0E4040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_DL2_MOD to obtain the average DL2_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DL2_MOD", .pme_code = 0x0E4040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DL2_MOD_ALT2", .pme_code = 0x0E4040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DL2_MOD_ALT3", .pme_code = 0x0E4040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DL2_MOD_ALT4", .pme_code = 0x0E4040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a 
distant chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DL2_SHR_CYC", .pme_code = 0x0E0040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_DL2_SHR to obtain the average DL2_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DL2_SHR_CYC_ALT2", .pme_code = 0x0E0040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_DL2_SHR to obtain the average DL2_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DL2_SHR_CYC_ALT3", .pme_code = 0x0E0040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_DL2_SHR to obtain the average DL2_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DL2_SHR_CYC_ALT4", .pme_code = 0x0E0040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_DL2_SHR to obtain the average DL2_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DL2_SHR", .pme_code = 0x0E0040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DL2_SHR_ALT2", .pme_code = 0x0E0040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DL2_SHR_ALT3", .pme_code = 0x0E0040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DL2_SHR_ALT4", .pme_code = 0x0E0040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss for a marked instruction.", 
.pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a distant chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DL3_MOD_CYC", .pme_code = 0x0EC040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_DL3_MOD to obtain the average DL3_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DL3_MOD_CYC_ALT2", .pme_code = 0x0EC040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_DL3_MOD to obtain the average DL3_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DL3_MOD_CYC_ALT3", .pme_code = 0x0EC040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_DL3_MOD to obtain the average DL3_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DL3_MOD_CYC_ALT4", .pme_code = 0x0EC040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_DL3_MOD to obtain the average DL3_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DL3_MOD", .pme_code = 0x0EC040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DL3_MOD_ALT2", .pme_code = 0x0EC040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DL3_MOD_ALT3", .pme_code = 0x0EC040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DL3_MOD_ALT4", .pme_code = 0x0EC040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a 
distant chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DL3_SHR_CYC", .pme_code = 0x0E8040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_DL3_SHR to obtain the average DL3_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DL3_SHR_CYC_ALT2", .pme_code = 0x0E8040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_DL3_SHR to obtain the average DL3_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DL3_SHR_CYC_ALT3", .pme_code = 0x0E8040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_DL3_SHR to obtain the average DL3_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DL3_SHR_CYC_ALT4", .pme_code = 0x0E8040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_DL3_SHR to obtain the average DL3_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DL3_SHR", .pme_code = 0x0E8040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DL3_SHR_ALT2", .pme_code = 0x0E8040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DL3_SHR_ALT3", .pme_code = 0x0E8040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DL3_SHR_ALT4", .pme_code = 0x0E8040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss for a marked instruction.", 
.pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a distant chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DMEM_CYC", .pme_code = 0x0F4040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from distant memory (MC slow) due to a demand miss. Divide this count by PM_MRK_DATA_FROM_DMEM to obtain the average DMEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DMEM_CYC_ALT2", .pme_code = 0x0F4040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from distant memory (MC slow) due to a demand miss. Divide this count by PM_MRK_DATA_FROM_DMEM to obtain the average DMEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DMEM_CYC_ALT3", .pme_code = 0x0F4040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from distant memory (MC slow) due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_DMEM to obtain the average DMEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DMEM_CYC_ALT4", .pme_code = 0x0F4040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from distant memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from distant memory (MC slow) due to a demand miss. Divide this count by PM_MRK_DATA_FROM_DMEM to obtain the average DMEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_DMEM", .pme_code = 0x0F4040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from distant memory (MC slow) due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from distant memory (MC slow) due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DMEM_ALT2", .pme_code = 0x0F4040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from distant memory (MC slow) due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from distant memory (MC slow) due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DMEM_ALT3", .pme_code = 0x0F4040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from distant memory (MC slow) due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from distant memory (MC slow) due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_DMEM_ALT4", .pme_code = 0x0F4040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from distant memory (MC slow) due to a demand miss for a marked 
instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from distant memory (MC slow) due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_D_OC_CACHE_CYC", .pme_code = 0x0F8040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a distant chip's OpenCAPI cache due to a demand miss. Divide this count by PM_MRK_DATA_FROM_D_OC_CACHE to obtain the average D_OC_CACHE latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_D_OC_CACHE_CYC_ALT2", .pme_code = 0x0F8040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a distant chip's OpenCAPI cache due to a demand miss. Divide this count by PM_MRK_DATA_FROM_D_OC_CACHE to obtain the average D_OC_CACHE latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_D_OC_CACHE_CYC_ALT3", .pme_code = 0x0F8040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a distant chip's OpenCAPI cache due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_D_OC_CACHE to obtain the average D_OC_CACHE latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_D_OC_CACHE_CYC_ALT4", .pme_code = 0x0F8040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a distant chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a distant chip's OpenCAPI cache due to a demand miss. Divide this count by PM_MRK_DATA_FROM_D_OC_CACHE to obtain the average D_OC_CACHE latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_D_OC_CACHE", .pme_code = 0x0F8040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI cache due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI cache due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_D_OC_CACHE_ALT2", .pme_code = 0x0F8040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI cache due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI cache due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_D_OC_CACHE_ALT3", .pme_code = 0x0F8040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI cache due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI cache due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_D_OC_CACHE_ALT4", .pme_code = 0x0F8040000004C142, .pme_short_desc = "Data 
Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI cache due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI cache due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_D_OC_MEM_CYC", .pme_code = 0x0FC040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a distant chip's OpenCAPI memory due to a demand miss. Divide this count by PM_MRK_DATA_FROM_D_OC_MEM to obtain the average D_OC_MEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_D_OC_MEM_CYC_ALT2", .pme_code = 0x0FC040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a distant chip's OpenCAPI memory due to a demand miss. Divide this count by PM_MRK_DATA_FROM_D_OC_MEM to obtain the average D_OC_MEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_D_OC_MEM_CYC_ALT3", .pme_code = 0x0FC040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a distant chip's OpenCAPI memory due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_D_OC_MEM to obtain the average D_OC_MEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_D_OC_MEM_CYC_ALT4", .pme_code = 0x0FC040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a distant chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a distant chip's OpenCAPI memory due to a demand miss. Divide this count by PM_MRK_DATA_FROM_D_OC_MEM to obtain the average D_OC_MEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_D_OC_MEM", .pme_code = 0x0FC040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI memory due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI memory due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_D_OC_MEM_ALT2", .pme_code = 0x0FC040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI memory due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI memory due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_D_OC_MEM_ALT3", .pme_code = 0x0FC040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI memory due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI memory due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_D_OC_MEM_ALT4", .pme_code = 0x0FC040000004C142, .pme_short_desc = "Data Source;The 
processor's L1 data cache was reloaded from a distant chip's OpenCAPI memory due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a distant chip's OpenCAPI memory due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_NON_REGENT_MOD_CYC", .pme_code = 0x0A4040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L21_NON_REGENT_MOD to obtain the average L21_NON_REGENT_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_NON_REGENT_MOD_CYC_ALT2", .pme_code = 0x0A4040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L21_NON_REGENT_MOD to obtain the average L21_NON_REGENT_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_NON_REGENT_MOD_CYC_ALT3", .pme_code = 0x0A4040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L21_NON_REGENT_MOD to obtain the average L21_NON_REGENT_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_NON_REGENT_MOD_CYC_ALT4", .pme_code = 0x0A4040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L21_NON_REGENT_MOD to obtain the average L21_NON_REGENT_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_NON_REGENT_MOD", .pme_code = 0x0A4040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_NON_REGENT_MOD_ALT2", .pme_code = 0x0A4040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_NON_REGENT_MOD_ALT3", .pme_code = 0x0A4040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_NON_REGENT_MOD_ALT4", .pme_code = 0x0A4040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different 
regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_NON_REGENT_SHR_CYC", .pme_code = 0x0A0040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L21_NON_REGENT_SHR to obtain the average L21_NON_REGENT_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_NON_REGENT_SHR_CYC_ALT2", .pme_code = 0x0A0040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L21_NON_REGENT_SHR to obtain the average L21_NON_REGENT_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_NON_REGENT_SHR_CYC_ALT3", .pme_code = 0x0A0040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L21_NON_REGENT_SHR to obtain the average L21_NON_REGENT_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_NON_REGENT_SHR_CYC_ALT4", .pme_code = 0x0A0040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L21_NON_REGENT_SHR to obtain the average L21_NON_REGENT_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_NON_REGENT_SHR", .pme_code = 0x0A0040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_NON_REGENT_SHR_ALT2", .pme_code = 0x0A0040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_NON_REGENT_SHR_ALT3", .pme_code = 0x0A0040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_NON_REGENT_SHR_ALT4", .pme_code = 0x0A0040000004C142, .pme_short_desc = "Data Source;The processor's L1 data 
cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in a different regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_REGENT_MOD_CYC", .pme_code = 0x084040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L21_REGENT_MOD to obtain the average L21_REGENT_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_REGENT_MOD_CYC_ALT2", .pme_code = 0x084040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L21_REGENT_MOD to obtain the average L21_REGENT_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_REGENT_MOD_CYC_ALT3", .pme_code = 0x084040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L21_REGENT_MOD to obtain the average L21_REGENT_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_REGENT_MOD_CYC_ALT4", .pme_code = 0x084040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L21_REGENT_MOD to obtain the average L21_REGENT_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_REGENT_MOD", .pme_code = 0x084040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_REGENT_MOD_ALT2", .pme_code = 0x084040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_REGENT_MOD_ALT3", .pme_code = 0x084040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_REGENT_MOD_ALT4", .pme_code = 0x084040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss for a marked 
instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_REGENT_SHR_CYC", .pme_code = 0x080040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L21_REGENT_SHR to obtain the average L21_REGENT_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_REGENT_SHR_CYC_ALT2", .pme_code = 0x080040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L21_REGENT_SHR to obtain the average L21_REGENT_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_REGENT_SHR_CYC_ALT3", .pme_code = 0x080040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L21_REGENT_SHR to obtain the average L21_REGENT_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_REGENT_SHR_CYC_ALT4", .pme_code = 0x080040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L21_REGENT_SHR to obtain the average L21_REGENT_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_REGENT_SHR", .pme_code = 0x080040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_REGENT_SHR_ALT2", .pme_code = 0x080040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_REGENT_SHR_ALT3", .pme_code = 0x080040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L21_REGENT_SHR_ALT4", .pme_code = 0x080040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that 
was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 on the same chip in the same regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L2_CYC", .pme_code = 0x000340000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local core's L2 due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L2 to obtain the average L2 latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L2_CYC_ALT2", .pme_code = 0x000340000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local core's L2 due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L2 to obtain the average L2 latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L2_CYC_ALT3", .pme_code = 0x000340000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local core's L2 due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L2 to obtain the average L2 latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L2_CYC_ALT4", .pme_code = 0x000340000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local core's L2 due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L2 to obtain the average L2 latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L2_MEPF_CYC", .pme_code = 0x004040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with data in the MEPF state without dispatch conflicts from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with data in the MEPF state without dispatch conflicts from the local core's L2 due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L2_MEPF to obtain the average L2_MEPF latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L2_MEPF_CYC_ALT2", .pme_code = 0x004040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with data in the MEPF state without dispatch conflicts from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with data in the MEPF state without dispatch conflicts from the local core's L2 due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L2_MEPF to obtain the average L2_MEPF latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L2_MEPF_CYC_ALT3", .pme_code = 0x004040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with data in the MEPF state without dispatch conflicts from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with data in the MEPF state without dispatch conflicts from the local core's L2 due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L2_MEPF to obtain the average L2_MEPF latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L2_MEPF_CYC_ALT4", .pme_code = 0x004040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with data in the MEPF state without dispatch conflicts from the local core's L2 due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with data in the MEPF state without dispatch conflicts from the local core's L2 due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L2_MEPF to obtain the average L2_MEPF latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L2_MEPF", .pme_code = 0x004040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L2 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L2 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L2_MEPF_ALT2", .pme_code = 0x004040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L2 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L2 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L2_MEPF_ALT3", .pme_code = 0x004040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L2 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L2 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L2_MEPF_ALT4", .pme_code = 0x004040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L2 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L2 due 
to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L2MISS", .pme_code = 0x0003C0000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L2MISS_ALT2", .pme_code = 0x0003C0000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L2MISS_ALT3", .pme_code = 0x0003C0000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L2MISS_ALT4", .pme_code = 0x401E8, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L2", .pme_code = 0x000340000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L2 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L2 due to a demand miss for 
a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L2_ALT2", .pme_code = 0x000340000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L2 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L2 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L2_ALT3", .pme_code = 0x000340000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L2 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L2 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L2_ALT4", .pme_code = 0x000340000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L2 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L2 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_NON_REGENT_MOD_CYC", .pme_code = 0x0AC040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L31_NON_REGENT_MOD to obtain the average L31_NON_REGENT_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_NON_REGENT_MOD_CYC_ALT2", .pme_code = 0x0AC040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L31_NON_REGENT_MOD to obtain the average L31_NON_REGENT_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_NON_REGENT_MOD_CYC_ALT3", .pme_code = 0x0AC040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L31_NON_REGENT_MOD to obtain the average L31_NON_REGENT_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_NON_REGENT_MOD_CYC_ALT4", .pme_code = 0x0AC040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L31_NON_REGENT_MOD to obtain the average L31_NON_REGENT_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_NON_REGENT_MOD", .pme_code = 0x0AC040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_NON_REGENT_MOD_ALT2", .pme_code = 0x0AC040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_NON_REGENT_MOD_ALT3", .pme_code = 
0x0AC040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_NON_REGENT_MOD_ALT4", .pme_code = 0x0AC040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_NON_REGENT_SHR_CYC", .pme_code = 0x0A8040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L31_NON_REGENT_SHR to obtain the average L31_NON_REGENT_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_NON_REGENT_SHR_CYC_ALT2", .pme_code = 0x0A8040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L31_NON_REGENT_SHR to obtain the average L31_NON_REGENT_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_NON_REGENT_SHR_CYC_ALT3", .pme_code = 0x0A8040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L31_NON_REGENT_SHR to obtain the average L31_NON_REGENT_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_NON_REGENT_SHR_CYC_ALT4", .pme_code = 0x0A8040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L31_NON_REGENT_SHR to obtain the average L31_NON_REGENT_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_NON_REGENT_SHR", .pme_code = 0x0A8040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_NON_REGENT_SHR_ALT2", .pme_code = 0x0A8040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a 
demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_NON_REGENT_SHR_ALT3", .pme_code = 0x0A8040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_NON_REGENT_SHR_ALT4", .pme_code = 0x0A8040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in a different regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_REGENT_MOD_CYC", .pme_code = 0x08C040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L31_REGENT_MOD to obtain the average L31_REGENT_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_REGENT_MOD_CYC_ALT2", .pme_code = 0x08C040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L31_REGENT_MOD to obtain the average L31_REGENT_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_REGENT_MOD_CYC_ALT3", .pme_code = 0x08C040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L31_REGENT_MOD to obtain the average L31_REGENT_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_REGENT_MOD_CYC_ALT4", .pme_code = 0x08C040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L31_REGENT_MOD to obtain the average L31_REGENT_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_REGENT_MOD", .pme_code = 0x08C040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_REGENT_MOD_ALT2", .pme_code = 0x08C040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_REGENT_MOD_ALT3", .pme_code = 0x08C040000003C142, .pme_short_desc = "Data Source;The 
processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_REGENT_MOD_ALT4", .pme_code = 0x08C040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_REGENT_SHR_CYC", .pme_code = 0x088040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L31_REGENT_SHR to obtain the average L31_REGENT_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_REGENT_SHR_CYC_ALT2", .pme_code = 0x088040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L31_REGENT_SHR to obtain the average L31_REGENT_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_REGENT_SHR_CYC_ALT3", .pme_code = 0x088040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L31_REGENT_SHR to obtain the average L31_REGENT_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_REGENT_SHR_CYC_ALT4", .pme_code = 0x088040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L31_REGENT_SHR to obtain the average L31_REGENT_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_REGENT_SHR", .pme_code = 0x088040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_REGENT_SHR_ALT2", .pme_code = 0x088040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss for a marked instruction.", }, 
{.pme_name = "PM_MRK_DATA_FROM_L31_REGENT_SHR_ALT3", .pme_code = 0x088040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L31_REGENT_SHR_ALT4", .pme_code = 0x088040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 on the same chip in the same regent due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L3_CYC", .pme_code = 0x010340000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local core's L3 due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L3 to obtain the average L3 latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L3_CYC_ALT2", .pme_code = 0x010340000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local core's L3 due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L3 to obtain the average L3 latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L3_CYC_ALT3", .pme_code = 0x010340000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local core's L3 due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L3 to obtain the average L3 latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L3_CYC_ALT4", .pme_code = 0x010340000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local core's L3 due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L3 to obtain the average L3 latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L3_MEPF_CYC", .pme_code = 0x014040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L3_MEPF to obtain the average L3_MEPF latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L3_MEPF_CYC_ALT2", .pme_code = 0x014040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L3_MEPF to obtain the average L3_MEPF latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L3_MEPF_CYC_ALT3", .pme_code = 0x014040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L3_MEPF to obtain the average L3_MEPF latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L3_MEPF_CYC_ALT4", .pme_code = 0x014040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L3_MEPF to obtain the average L3_MEPF latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L3_MEPF", .pme_code = 0x014040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L3_MEPF_ALT2", .pme_code = 0x014040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L3_MEPF_ALT3", .pme_code = 0x014040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss for a marked 
instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L3_MEPF_ALT4", .pme_code = 0x014040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with data in the MEPF state without dispatch conflicts from the local core's L3 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L3MISS", .pme_code = 0x0007C0000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L3MISS_ALT2", .pme_code = 0x201E4, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L3MISS_ALT3", .pme_code = 0x0007C0000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L3MISS_ALT4", .pme_code = 0x0007C0000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was 
reloaded from beyond the local core's L3 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L3", .pme_code = 0x010340000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L3 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L3 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L3_ALT2", .pme_code = 0x010340000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L3 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L3 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L3_ALT3", .pme_code = 0x010340000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L3 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L3 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L3_ALT4", .pme_code = 0x010340000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L3 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local core's L3 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_LMEM_CYC", .pme_code = 0x094040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = 
"Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's memory due to a demand miss. Divide this count by PM_MRK_DATA_FROM_LMEM to obtain the average LMEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_LMEM_CYC_ALT2", .pme_code = 0x094040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's memory due to a demand miss. Divide this count by PM_MRK_DATA_FROM_LMEM to obtain the average LMEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_LMEM_CYC_ALT3", .pme_code = 0x094040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's memory due to a demand miss. Divide this count by PM_MRK_DATA_FROM_LMEM to obtain the average LMEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_LMEM_CYC_ALT4", .pme_code = 0x094040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's memory due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's memory due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_LMEM to obtain the average LMEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_LMEM", .pme_code = 0x094040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's memory due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's memory due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_LMEM_ALT2", .pme_code = 0x094040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's memory due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's memory due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_LMEM_ALT3", .pme_code = 0x094040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's memory due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's memory due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_LMEM_ALT4", .pme_code = 0x094040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's memory due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's memory due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L_OC_CACHE_CYC", .pme_code = 0x098040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that 
was reloaded from the local chip's OpenCAPI cache due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L_OC_CACHE to obtain the average L_OC_CACHE latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L_OC_CACHE_CYC_ALT2", .pme_code = 0x098040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's OpenCAPI cache due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L_OC_CACHE to obtain the average L_OC_CACHE latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L_OC_CACHE_CYC_ALT3", .pme_code = 0x098040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's OpenCAPI cache due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L_OC_CACHE to obtain the average L_OC_CACHE latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L_OC_CACHE_CYC_ALT4", .pme_code = 0x098040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's OpenCAPI cache due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L_OC_CACHE to obtain the average L_OC_CACHE latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L_OC_CACHE", .pme_code = 0x098040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI cache due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI cache due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L_OC_CACHE_ALT2", .pme_code = 0x098040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI cache due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI cache due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L_OC_CACHE_ALT3", .pme_code = 0x098040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI cache due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI cache due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L_OC_CACHE_ALT4", .pme_code = 0x098040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI cache due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI cache due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L_OC_MEM_CYC", .pme_code = 0x09C040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's OpenCAPI memory due to a demand miss.", 
.pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's OpenCAPI memory due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L_OC_MEM to obtain the average L_OC_MEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L_OC_MEM_CYC_ALT2", .pme_code = 0x09C040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's OpenCAPI memory due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L_OC_MEM to obtain the average L_OC_MEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L_OC_MEM_CYC_ALT3", .pme_code = 0x09C040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's OpenCAPI memory due to a demand miss. Divide this count by PM_MRK_DATA_FROM_L_OC_MEM to obtain the average L_OC_MEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L_OC_MEM_CYC_ALT4", .pme_code = 0x09C040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from the local chip's OpenCAPI memory due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_L_OC_MEM to obtain the average L_OC_MEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_L_OC_MEM", .pme_code = 0x09C040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI memory due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI memory due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L_OC_MEM_ALT2", .pme_code = 0x09C040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI memory due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI memory due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L_OC_MEM_ALT3", .pme_code = 0x09C040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI memory due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI memory due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_L_OC_MEM_ALT4", .pme_code = 0x09C040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI memory due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from the local chip's OpenCAPI memory due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_MEMORY", .pme_code = 0x201E0, .pme_short_desc = "marked;The processor's data cache was reloaded from local, remote, or distant memory due to a demand miss for a marked load", .pme_long_desc = "marked;The processor's data cache was reloaded from local, 
remote, or distant memory due to a demand miss for a marked load", }, {.pme_name = "PM_MRK_DATA_FROM_RL2_MOD_CYC", .pme_code = 0x0C4040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_RL2_MOD to obtain the average RL2_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RL2_MOD_CYC_ALT2", .pme_code = 0x0C4040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_RL2_MOD to obtain the average RL2_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RL2_MOD_CYC_ALT3", .pme_code = 0x0C4040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_RL2_MOD to obtain the average RL2_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RL2_MOD_CYC_ALT4", .pme_code = 0x0C4040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_RL2_MOD to obtain the average RL2_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RL2_MOD", .pme_code = 0x0C4040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RL2_MOD_ALT2", .pme_code = 0x0C4040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RL2_MOD_ALT3", .pme_code = 0x0C4040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss for a marked 
instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RL2_MOD_ALT4", .pme_code = 0x0C4040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RL2_SHR_CYC", .pme_code = 0x0C0040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_RL2_SHR to obtain the average RL2_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RL2_SHR_CYC_ALT2", .pme_code = 0x0C0040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_RL2_SHR to obtain the average RL2_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RL2_SHR_CYC_ALT3", .pme_code = 0x0C0040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_RL2_SHR to obtain the average RL2_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RL2_SHR_CYC_ALT4", .pme_code = 0x0C0040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_RL2_SHR to obtain the average RL2_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RL2_SHR", .pme_code = 0x0C0040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RL2_SHR_ALT2", .pme_code = 0x0C0040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RL2_SHR_ALT3", .pme_code = 0x0C0040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RL2_SHR_ALT4", .pme_code = 0x0C0040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss for a marked instruction.", .pme_long_desc = 
"Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L2 from a remote chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RL3_MOD_CYC", .pme_code = 0x0CC040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_RL3_MOD to obtain the average RL3_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RL3_MOD_CYC_ALT2", .pme_code = 0x0CC040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_RL3_MOD to obtain the average RL3_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RL3_MOD_CYC_ALT3", .pme_code = 0x0CC040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_RL3_MOD to obtain the average RL3_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RL3_MOD_CYC_ALT4", .pme_code = 0x0CC040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_RL3_MOD to obtain the average RL3_MOD latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RL3_MOD", .pme_code = 0x0CC040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RL3_MOD_ALT2", .pme_code = 0x0CC040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RL3_MOD_ALT3", .pme_code = 0x0CC040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RL3_MOD_ALT4", .pme_code = 0x0CC040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a line in the M (exclusive) state from another core's L3 from a remote 
chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RL3_SHR_CYC", .pme_code = 0x0C8040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_RL3_SHR to obtain the average RL3_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RL3_SHR_CYC_ALT2", .pme_code = 0x0C8040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_RL3_SHR to obtain the average RL3_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RL3_SHR_CYC_ALT3", .pme_code = 0x0C8040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss. Divide this count by PM_MRK_DATA_FROM_RL3_SHR to obtain the average RL3_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RL3_SHR_CYC_ALT4", .pme_code = 0x0C8040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_RL3_SHR to obtain the average RL3_SHR latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RL3_SHR", .pme_code = 0x0C8040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RL3_SHR_ALT2", .pme_code = 0x0C8040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RL3_SHR_ALT3", .pme_code = 0x0C8040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RL3_SHR_ALT4", .pme_code = 0x0C8040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss for a marked instruction.", .pme_long_desc = 
"Data Source;The processor's L1 data cache was reloaded with a valid line that was not in the M (exclusive) state from another core's L3 from a remote chip due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RMEM_CYC", .pme_code = 0x0D4040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from remote memory (MC slow) due to a demand miss. Divide this count by PM_MRK_DATA_FROM_RMEM to obtain the average RMEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RMEM_CYC_ALT2", .pme_code = 0x0D4040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from remote memory (MC slow) due to a demand miss. Divide this count by PM_MRK_DATA_FROM_RMEM to obtain the average RMEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RMEM_CYC_ALT3", .pme_code = 0x0D4040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from remote memory (MC slow) due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_RMEM to obtain the average RMEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RMEM_CYC_ALT4", .pme_code = 0x0D4040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from remote memory (MC slow) due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from remote memory (MC slow) due to a demand miss. Divide this count by PM_MRK_DATA_FROM_RMEM to obtain the average RMEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_RMEM", .pme_code = 0x0D4040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from remote memory (MC slow) due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from remote memory (MC slow) due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RMEM_ALT2", .pme_code = 0x0D4040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from remote memory (MC slow) due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from remote memory (MC slow) due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RMEM_ALT3", .pme_code = 0x0D4040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from remote memory (MC slow) due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from remote memory (MC slow) due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_RMEM_ALT4", .pme_code = 0x0D4040000004C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from remote memory (MC slow) due to a demand miss for a marked 
instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from remote memory (MC slow) due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_R_OC_CACHE_CYC", .pme_code = 0x0D8040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a remote chip's OpenCAPI cache due to a demand miss. Divide this count by PM_MRK_DATA_FROM_R_OC_CACHE to obtain the average R_OC_CACHE latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_R_OC_CACHE_CYC_ALT2", .pme_code = 0x0D8040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a remote chip's OpenCAPI cache due to a demand miss. Divide this count by PM_MRK_DATA_FROM_R_OC_CACHE to obtain the average R_OC_CACHE latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_R_OC_CACHE_CYC_ALT3", .pme_code = 0x0D8040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a remote chip's OpenCAPI cache due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_R_OC_CACHE to obtain the average R_OC_CACHE latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_R_OC_CACHE_CYC_ALT4", .pme_code = 0x0D8040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a remote chip's OpenCAPI cache due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a remote chip's OpenCAPI cache due to a demand miss. Divide this count by PM_MRK_DATA_FROM_R_OC_CACHE to obtain the average R_OC_CACHE latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_R_OC_CACHE", .pme_code = 0x0D8040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI cache due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI cache due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_R_OC_CACHE_ALT2", .pme_code = 0x0D8040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI cache due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI cache due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_R_OC_CACHE_ALT3", .pme_code = 0x0D8040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI cache due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI cache due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_R_OC_CACHE_ALT4", .pme_code = 0x0D8040000004C142, .pme_short_desc = "Data Source;The 
processor's L1 data cache was reloaded from a remote chip's OpenCAPI cache due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI cache due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_R_OC_MEM_CYC", .pme_code = 0x0DC040000001C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a remote chip's OpenCAPI memory due to a demand miss. Divide this count by PM_MRK_DATA_FROM_R_OC_MEM to obtain the average R_OC_MEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_R_OC_MEM_CYC_ALT2", .pme_code = 0x0DC040000002C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a remote chip's OpenCAPI memory due to a demand miss. Divide this count by PM_MRK_DATA_FROM_R_OC_MEM to obtain the average R_OC_MEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_R_OC_MEM_CYC_ALT3", .pme_code = 0x0DC040000003C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a remote chip's OpenCAPI memory due to a demand miss. 
Divide this count by PM_MRK_DATA_FROM_R_OC_MEM to obtain the average R_OC_MEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_R_OC_MEM_CYC_ALT4", .pme_code = 0x0DC040000004C144, .pme_short_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a remote chip's OpenCAPI memory due to a demand miss.", .pme_long_desc = "Data Source;Number of cycles when a marked instruction was waiting for a data cache miss that was reloaded from a remote chip's OpenCAPI memory due to a demand miss. Divide this count by PM_MRK_DATA_FROM_R_OC_MEM to obtain the average R_OC_MEM latency for data reloads.", }, {.pme_name = "PM_MRK_DATA_FROM_R_OC_MEM", .pme_code = 0x0DC040000001C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI memory due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI memory due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_R_OC_MEM_ALT2", .pme_code = 0x0DC040000002C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI memory due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI memory due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_R_OC_MEM_ALT3", .pme_code = 0x0DC040000003C142, .pme_short_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI memory due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI memory due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DATA_FROM_R_OC_MEM_ALT4", .pme_code = 0x0DC040000004C142, .pme_short_desc = "Data Source;The processor's L1 
data cache was reloaded from a remote chip's OpenCAPI memory due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's L1 data cache was reloaded from a remote chip's OpenCAPI memory due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_DERAT_MISS_2M", .pme_code = 0x40164, .pme_short_desc = "marked;Data ERAT Miss (Data TLB Access) page size 2M for a marked instruction.", .pme_long_desc = "marked;Data ERAT Miss (Data TLB Access) page size 2M for a marked instruction. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches", }, {.pme_name = "PM_MRK_DERAT_MISS_64K", .pme_code = 0x2D154, .pme_short_desc = "marked;Data ERAT Miss (Data TLB Access) page size 64K for a marked instruction.", .pme_long_desc = "marked;Data ERAT Miss (Data TLB Access) page size 64K for a marked instruction. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches", }, {.pme_name = "PM_MRK_DFU_ISSUE", .pme_code = 0x20132, .pme_short_desc = "marked;The marked instruction was a decimal floating point operation issued to the VSU.", .pme_long_desc = "marked;The marked instruction was a decimal floating point operation issued to the VSU. Measured at issue time.", }, {.pme_name = "PM_MRK_DTLB_MISS_64K", .pme_code = 0x4C15E, .pme_short_desc = "marked;Marked Data TLB reload (after a miss) page size 64K.", .pme_long_desc = "marked;Marked Data TLB reload (after a miss) page size 64K. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches", }, {.pme_name = "PM_MRK_FX_LSU_FIN", .pme_code = 0x2013C, .pme_short_desc = "marked;The marked instruction was simple fixed point that was issued to the store unit.", .pme_long_desc = "marked;The marked instruction was simple fixed point that was issued to the store unit. 
Measured at finish time", }, {.pme_name = "PM_MRK_FXU_ISSUE", .pme_code = 0x20134, .pme_short_desc = "marked;The marked instruction was a fixed point operation issued to the VSU.", .pme_long_desc = "marked;The marked instruction was a fixed point operation issued to the VSU. Measured at issue time.", }, {.pme_name = "PM_MRK_INST_CMPL", .pme_code = 0x401E0, .pme_short_desc = "marked;Marked instruction completed", .pme_long_desc = "marked;Marked instruction completed", }, {.pme_name = "PM_MRK_INST_DECODED", .pme_code = 0x20130, .pme_short_desc = "marked;An instruction was marked at decode time.", .pme_long_desc = "marked;An instruction was marked at decode time. Random Instruction Sampling (RIS) only", }, {.pme_name = "PM_MRK_INST_DISP", .pme_code = 0x101E0, .pme_short_desc = "marked;The thread has dispatched a randomly sampled marked instruction", .pme_long_desc = "marked;The thread has dispatched a randomly sampled marked instruction", }, {.pme_name = "PM_MRK_INST_FIN", .pme_code = 0x30130, .pme_short_desc = "marked;Marked instruction finished.", .pme_long_desc = "marked;Marked instruction finished. Excludes instructions that finish at dispatch. 
Note that stores always finish twice since the address gets issued to the LSU and the data gets issued to the VSU.", }, {.pme_name = "PM_MRK_INST_FLUSHED", .pme_code = 0x4E15E, .pme_short_desc = "marked;The marked instruction was flushed", .pme_long_desc = "marked;The marked instruction was flushed", }, {.pme_name = "PM_MRK_INST_FROM_L3MISS", .pme_code = 0x000780000001C142, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_INST_FROM_L3MISS_ALT2", .pme_code = 0x000780000002C142, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_INST_FROM_L3MISS_ALT3", .pme_code = 0x000780000003C142, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_INST_FROM_L3MISS_ALT4", .pme_code = 0x401E6, .pme_short_desc = "Data Source;The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction.", .pme_long_desc = "Data Source;The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction.", }, {.pme_name = "PM_MRK_INST_ISSUED", .pme_code = 0x10132, .pme_short_desc = "marked;Marked instruction issued.", 
.pme_long_desc = "marked;Marked instruction issued. Note that stores always get issued twice, the address gets issued to the LSU and the data gets issued to the VSU. Also, issues can sometimes get killed/cancelled and cause multiple sequential issues for the same instruction.", }, {.pme_name = "PM_MRK_INST_TIMEO", .pme_code = 0x40134, .pme_short_desc = "pmc;Marked instruction finish timeout (instruction was lost)", .pme_long_desc = "pmc;Marked instruction finish timeout (instruction was lost)", }, {.pme_name = "PM_MRK_INST", .pme_code = 0x24158, .pme_short_desc = "marked;An instruction was marked.", .pme_long_desc = "marked;An instruction was marked. Includes both Random Instruction Sampling (RIS) at decode time and Random Event Sampling (RES) at the time the configured event happens", }, {.pme_name = "PM_MRK_ISSUE_DEPENDENT_LOAD", .pme_code = 0x30162, .pme_short_desc = "marked;The marked instruction was dependent on a load.", .pme_long_desc = "marked;The marked instruction was dependent on a load. 
It is eligible for issue kill", }, {.pme_name = "PM_MRK_L1_ICACHE_MISS", .pme_code = 0x101E4, .pme_short_desc = "marked;Marked instruction suffered an instruction cache miss", .pme_long_desc = "marked;Marked instruction suffered an instruction cache miss", }, {.pme_name = "PM_MRK_L1_RELOAD_VALID", .pme_code = 0x101EA, .pme_short_desc = "marked;Marked demand reload", .pme_long_desc = "marked;Marked demand reload", }, {.pme_name = "PM_MRK_L2_RC_DISP", .pme_code = 0x20114, .pme_short_desc = "marked;Marked instruction RC dispatched in L2", .pme_long_desc = "marked;Marked instruction RC dispatched in L2", }, {.pme_name = "PM_MRK_L2_RC_DONE", .pme_code = 0x3012A, .pme_short_desc = "marked;L2 RC machine completed the transaction for the marked instruction", .pme_long_desc = "marked;L2 RC machine completed the transaction for the marked instruction", }, {.pme_name = "PM_MRK_LARX_FIN", .pme_code = 0x40116, .pme_short_desc = "marked;Marked load and reserve instruction (LARX) finished.", .pme_long_desc = "marked;Marked load and reserve instruction (LARX) finished. LARX and STCX are instructions used to acquire a lock", }, {.pme_name = "PM_MRK_LD_CMPL", .pme_code = 0x34146, .pme_short_desc = "marked;Marked load instruction completed", .pme_long_desc = "marked;Marked load instruction completed", }, {.pme_name = "PM_MRK_LD_MISS_L1_CYC", .pme_code = 0x1D156, .pme_short_desc = "marked;Marked load latency.", .pme_long_desc = "marked;Marked load latency. 
The latency measured for this event counts between the original launch of a load L1 miss to the relaunch of that same load l1 miss once data is back from the nest and the load is going to finish", }, {.pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 0x201E2, .pme_short_desc = "marked;Marked demand data load miss counted at finish time", .pme_long_desc = "marked;Marked demand data load miss counted at finish time", }, {.pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x40132, .pme_short_desc = "marked;LSU marked instruction finish", .pme_long_desc = "marked;LSU marked instruction finish", }, {.pme_name = "PM_MRK_NTF_CYC", .pme_code = 0x2011C, .pme_short_desc = "marked;Cycles in which the marked instruction is the oldest in the pipeline (next-to-finish or next-to-complete)", .pme_long_desc = "marked;Cycles in which the marked instruction is the oldest in the pipeline (next-to-finish or next-to-complete)", }, {.pme_name = "PM_MRK_NTF_FIN", .pme_code = 0x20112, .pme_short_desc = "marked;The marked instruction became the oldest in the pipeline before it finished.", .pme_long_desc = "marked;The marked instruction became the oldest in the pipeline before it finished. It excludes instructions that finish at dispatch", }, {.pme_name = "PM_MRK_START_PROBE_NOP_CMPL", .pme_code = 0x1F15E, .pme_short_desc = "pmc;Marked Start probe nop (AND R0,R0,R0) completed.", .pme_long_desc = "pmc;Marked Start probe nop (AND R0,R0,R0) completed.", }, {.pme_name = "PM_MRK_START_PROBE_NOP_DISP", .pme_code = 0x40114, .pme_short_desc = "pmc;Marked Start probe nop dispatched.", .pme_long_desc = "pmc;Marked Start probe nop dispatched. Instruction AND R0,R0,R0", }, {.pme_name = "PM_MRK_ST_CMPL", .pme_code = 0x301E2, .pme_short_desc = "marked;Marked store completed and sent to nest.", .pme_long_desc = "marked;Marked store completed and sent to nest. 
Note that this count excludes cache-inhibited stores", }, {.pme_name = "PM_MRK_STCX_CORE_CYC", .pme_code = 0x44146, .pme_short_desc = "marked;Cycles spent in the core portion of a marked STCX instruction.", .pme_long_desc = "marked;Cycles spent in the core portion of a marked STCX instruction. It starts counting when the instruction is decoded and stops counting when it drains into the L2", }, {.pme_name = "PM_MRK_STCX_FAIL", .pme_code = 0x3E158, .pme_short_desc = "marked;Marked conditional store instruction (STCX) failed.", .pme_long_desc = "marked;Marked conditional store instruction (STCX) failed. LARX and STCX are instructions used to acquire a lock ", }, {.pme_name = "PM_MRK_STCX_FIN", .pme_code = 0x24156, .pme_short_desc = "marked;Marked conditional store instruction (STCX) finished.", .pme_long_desc = "marked;Marked conditional store instruction (STCX) finished. LARX and STCX are instructions used to acquire a lock", }, {.pme_name = "PM_MRK_STCX_L2_CYC", .pme_code = 0x1F15C, .pme_short_desc = "marked;Cycles spent in the nest portion of a marked Stcx instruction.", .pme_long_desc = "marked;Cycles spent in the nest portion of a marked Stcx instruction. 
It starts counting when the operation starts to drain to the L2 and it stops counting when the instruction retires from the Instruction Completion Table (ICT) in the Instruction Sequencing Unit (ISU)", }, {.pme_name = "PM_MRK_ST_DONE_L2", .pme_code = 0x10134, .pme_short_desc = "marked;Marked store completed in L2", .pme_long_desc = "marked;Marked store completed in L2", }, {.pme_name = "PM_MRK_ST_DRAIN_CYC", .pme_code = 0x3F150, .pme_short_desc = "marked;Cycles in which the marked store drained from the core to the L2", .pme_long_desc = "marked;Cycles in which the marked store drained from the core to the L2", }, {.pme_name = "PM_MRK_ST_FIN", .pme_code = 0x3E15A, .pme_short_desc = "marked;Marked store instruction finished", .pme_long_desc = "marked;Marked store instruction finished", }, {.pme_name = "PM_MRK_ST_L2_CYC", .pme_code = 0x1F150, .pme_short_desc = "marked;Cycles from L2 RC dispatch to L2 RC completion", .pme_long_desc = "marked;Cycles from L2 RC dispatch to L2 RC completion", }, {.pme_name = "PM_MRK_ST_NEST", .pme_code = 0x20138, .pme_short_desc = "marked;A store has been sampled/marked and is at the point of execution where it has completed in the core and can no longer be flushed.", .pme_long_desc = "marked;A store has been sampled/marked and is at the point of execution where it has completed in the core and can no longer be flushed. At this point the store is sent to the L2.", }, {.pme_name = "PM_MRK_VSU_FIN", .pme_code = 0x30132, .pme_short_desc = "marked;VSU marked instruction finished.", .pme_long_desc = "marked;VSU marked instruction finished. 
Excludes simple FX instructions issued to the Store Unit.", }, {.pme_name = "PM_MRK_XFER_FROM_SRC_CYC_PMC1", .pme_code = 0x1C144, .pme_short_desc = "marked;Cycles taken for a marked demand miss to reload a line from the source specified in MMCR3[0:12].", .pme_long_desc = "marked;Cycles taken for a marked demand miss to reload a line from the source specified in MMCR3[0:12].", }, {.pme_name = "PM_MRK_XFER_FROM_SRC_CYC_PMC2", .pme_code = 0x2C144, .pme_short_desc = "marked;Cycles taken for a marked demand miss to reload a line from the source specified in MMCR3[15:27].", .pme_long_desc = "marked;Cycles taken for a marked demand miss to reload a line from the source specified in MMCR3[15:27].", }, {.pme_name = "PM_MRK_XFER_FROM_SRC_CYC_PMC3", .pme_code = 0x3C144, .pme_short_desc = "marked;Cycles taken for a marked demand miss to reload a line from the source specified in MMCR3[30:42].", .pme_long_desc = "marked;Cycles taken for a marked demand miss to reload a line from the source specified in MMCR3[30:42].", }, {.pme_name = "PM_MRK_XFER_FROM_SRC_CYC_PMC4", .pme_code = 0x4C144, .pme_short_desc = "marked;Cycles taken for a marked demand miss to reload a line from the source specified in MMCR3[45:57].", .pme_long_desc = "marked;Cycles taken for a marked demand miss to reload a line from the source specified in MMCR3[45:57].", }, {.pme_name = "PM_MRK_XFER_FROM_SRC_PMC1", .pme_code = 0x1C142, .pme_short_desc = "marked;For a marked data transfer instruction, the processor's L1 data cache was reloaded from the source specified in MMCR3[0:12].", .pme_long_desc = "marked;For a marked data transfer instruction, the processor's L1 data cache was reloaded from the source specified in MMCR3[0:12]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. 
If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads.", }, {.pme_name = "PM_MRK_XFER_FROM_SRC_PMC2", .pme_code = 0x2C142, .pme_short_desc = "marked;For a marked data transfer instruction, the processor's L1 data cache was reloaded from the source specified in MMCR3[15:27].", .pme_long_desc = "marked;For a marked data transfer instruction, the processor's L1 data cache was reloaded from the source specified in MMCR3[15:27]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads.", }, {.pme_name = "PM_MRK_XFER_FROM_SRC_PMC3", .pme_code = 0x3C142, .pme_short_desc = "marked;For a marked data transfer instruction, the processor's L1 data cache was reloaded from the source specified in MMCR3[30:42].", .pme_long_desc = "marked;For a marked data transfer instruction, the processor's L1 data cache was reloaded from the source specified in MMCR3[30:42]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads.", }, {.pme_name = "PM_MRK_XFER_FROM_SRC_PMC4", .pme_code = 0x4C142, .pme_short_desc = "marked;For a marked data transfer instruction, the processor's L1 data cache was reloaded from the source specified in MMCR3[45:57].", .pme_long_desc = "marked;For a marked data transfer instruction, the processor's L1 data cache was reloaded from the source specified in MMCR3[45:57]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. 
If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads.", }, {.pme_name = "PM_NON_FMA_FLOP_CMPL", .pme_code = 0x4D056, .pme_short_desc = "floating point;Non FMA instruction completed", .pme_long_desc = "floating point;Non FMA instruction completed", }, {.pme_name = "PM_NON_MATH_FLOP_CMPL", .pme_code = 0x4D05A, .pme_short_desc = "floating point;Non Math instruction completed", .pme_long_desc = "floating point;Non Math instruction completed", }, {.pme_name = "PM_NTC_ALL_FIN", .pme_code = 0x40008, .pme_short_desc = "pipeline;Cycles in which both instructions in the ICT entry pair show as finished.", .pme_long_desc = "pipeline;Cycles in which both instructions in the ICT entry pair show as finished. These are the cycles between finish and completion for the oldest pair of instructions in the pipeline", }, {.pme_name = "PM_NTC_FIN", .pme_code = 0x2405A, .pme_short_desc = "pipeline;Cycles in which the oldest instruction in the pipeline (NTC) finishes.", .pme_long_desc = "pipeline;Cycles in which the oldest instruction in the pipeline (NTC) finishes. 
Note that instructions can finish out of order, therefore not all the instructions that finish have a Next-to-complete status.", }, {.pme_name = "PM_PMC1_OVERFLOW", .pme_code = 0x20010, .pme_short_desc = "pmc;The event selected for PMC1 caused the event counter to overflow.", .pme_long_desc = "pmc;The event selected for PMC1 caused the event counter to overflow.", }, {.pme_name = "PM_PMC1_REWIND", .pme_code = 0x4D02C, .pme_short_desc = "pmc;The speculative event selected for PMC1 rewinds and the counter for PMC1 is not charged.", .pme_long_desc = "pmc;The speculative event selected for PMC1 rewinds and the counter for PMC1 is not charged.", }, {.pme_name = "PM_PMC1_SAVED", .pme_code = 0x4D010, .pme_short_desc = "pmc;The conditions for the speculative event selected for PMC1 are met and PMC1 is charged.", .pme_long_desc = "pmc;The conditions for the speculative event selected for PMC1 are met and PMC1 is charged.", }, {.pme_name = "PM_PMC2_OVERFLOW", .pme_code = 0x30010, .pme_short_desc = "pmc;The event selected for PMC2 caused the event counter to overflow.", .pme_long_desc = "pmc;The event selected for PMC2 caused the event counter to overflow.", }, {.pme_name = "PM_PMC2_REWIND", .pme_code = 0x30020, .pme_short_desc = "pmc;The speculative event selected for PMC2 rewinds and the counter for PMC2 is not charged.", .pme_long_desc = "pmc;The speculative event selected for PMC2 rewinds and the counter for PMC2 is not charged.", }, {.pme_name = "PM_PMC2_SAVED", .pme_code = 0x10022, .pme_short_desc = "pmc;The conditions for the speculative event selected for PMC2 are met and PMC2 is charged.", .pme_long_desc = "pmc;The conditions for the speculative event selected for PMC2 are met and PMC2 is charged.", }, {.pme_name = "PM_PMC3_OVERFLOW", .pme_code = 0x40010, .pme_short_desc = "pmc;The event selected for PMC3 caused the event counter to overflow.", .pme_long_desc = "pmc;The event selected for PMC3 caused the event counter to overflow.", }, {.pme_name = "PM_PMC3_REWIND", 
.pme_code = 0x1000A, .pme_short_desc = "pmc;The speculative event selected for PMC3 rewinds and the counter for PMC3 is not charged.", .pme_long_desc = "pmc;The speculative event selected for PMC3 rewinds and the counter for PMC3 is not charged.", }, {.pme_name = "PM_PMC3_SAVED", .pme_code = 0x4D012, .pme_short_desc = "pmc;The conditions for the speculative event selected for PMC3 are met and PMC3 is charged.", .pme_long_desc = "pmc;The conditions for the speculative event selected for PMC3 are met and PMC3 is charged.", }, {.pme_name = "PM_PMC4_OVERFLOW", .pme_code = 0x10010, .pme_short_desc = "pmc;The event selected for PMC4 caused the event counter to overflow.", .pme_long_desc = "pmc;The event selected for PMC4 caused the event counter to overflow.", }, {.pme_name = "PM_PMC4_REWIND", .pme_code = 0x10020, .pme_short_desc = "pmc;The speculative event selected for PMC4 rewinds and the counter for PMC4 is not charged.", .pme_long_desc = "pmc;The speculative event selected for PMC4 rewinds and the counter for PMC4 is not charged.", }, {.pme_name = "PM_PMC4_SAVED", .pme_code = 0x30022, .pme_short_desc = "pmc;The conditions for the speculative event selected for PMC4 are met and PMC4 is charged.", .pme_long_desc = "pmc;The conditions for the speculative event selected for PMC4 are met and PMC4 is charged.", }, {.pme_name = "PM_PMC5_OVERFLOW", .pme_code = 0x10024, .pme_short_desc = "pmc;The event selected for PMC5 caused the event counter to overflow.", .pme_long_desc = "pmc;The event selected for PMC5 caused the event counter to overflow.", }, {.pme_name = "PM_PMC6_OVERFLOW", .pme_code = 0x30024, .pme_short_desc = "pmc;The event selected for PMC6 caused the event counter to overflow.", .pme_long_desc = "pmc;The event selected for PMC6 caused the event counter to overflow.", }, {.pme_name = "PM_PRIVILEGED_CYC", .pme_code = 0x4D028, .pme_short_desc = "pmc;Cycles when the thread is in Privileged state.", .pme_long_desc = "pmc;Cycles when the thread is in Privileged 
state. MSR[S HV PR]=x00", }, {.pme_name = "PM_PRIVILEGED_INST_CMPL", .pme_code = 0x3405A, .pme_short_desc = "PowerPC instruction completed while the thread was in Privileged state.", .pme_long_desc = "PowerPC instruction completed while the thread was in Privileged state.", }, {.pme_name = "PM_PTESYNC_FIN", .pme_code = 0x2003E, .pme_short_desc = "memory;Ptesync instruction finished in the store unit.", .pme_long_desc = "memory;Ptesync instruction finished in the store unit. Only one ptesync can finish at a time.", }, {.pme_name = "PM_RUN_CYC_SMT2_MODE", .pme_code = 0x3006C, .pme_short_desc = "pmc;Cycles when this thread's run latch is set and the core is in SMT2 mode", .pme_long_desc = "pmc;Cycles when this thread's run latch is set and the core is in SMT2 mode", }, {.pme_name = "PM_RUN_CYC_SMT4_MODE", .pme_code = 0x2006C, .pme_short_desc = "pmc;Cycles when this thread's run latch is set and the core is in SMT4 mode", .pme_long_desc = "pmc;Cycles when this thread's run latch is set and the core is in SMT4 mode", }, {.pme_name = "PM_RUN_CYC_ST_MODE", .pme_code = 0x1006C, .pme_short_desc = "pmc;Cycles when the run latch is set and the core is in ST mode", .pme_long_desc = "pmc;Cycles when the run latch is set and the core is in ST mode", }, {.pme_name = "PM_RUN_CYC", .pme_code = 0x200F4, .pme_short_desc = "pmc;Processor cycles gated by the run latch", .pme_long_desc = "pmc;Processor cycles gated by the run latch", }, {.pme_name = "PM_RUN_INST_CMPL_CONC", .pme_code = 0x300F4, .pme_short_desc = "cache;PowerPC instruction completed by this thread when all threads in the core had the run-latch set", .pme_long_desc = "cache;PowerPC instruction completed by this thread when all threads in the core had the run-latch set", }, {.pme_name = "PM_RUN_INST_CMPL", .pme_code = 0x400FA, .pme_short_desc = "pmc;PowerPC instruction completed while the run latch is set", .pme_long_desc = "pmc;PowerPC instruction completed while the run latch is set", }, {.pme_name = 
"PM_RUN_LATCH_ALL_THREADS_CYC", .pme_code = 0x2000C, .pme_short_desc = "pmc;Cycles when the run latch is set for all threads.", .pme_long_desc = "pmc;Cycles when the run latch is set for all threads.", }, {.pme_name = "PM_RUN_LATCH_ANY_THREAD_CYC", .pme_code = 0x100FA, .pme_short_desc = "pmc;Cycles when at least one thread has the run latch set", .pme_long_desc = "pmc;Cycles when at least one thread has the run latch set", }, {.pme_name = "PM_SCALAR_FLOP_CMPL", .pme_code = 0x45056, .pme_short_desc = "floating point;Scalar floating point instruction completed.", .pme_long_desc = "floating point;Scalar floating point instruction completed.", }, {.pme_name = "PM_SCALAR_FSQRT_FDIV_ISSUE", .pme_code = 0x3D058, .pme_short_desc = "pipeline;Scalar versions of four floating point operations: fdiv,fsqrt (xvdivdp, xvdivsp, xvsqrtdp, xvsqrtsp).", .pme_long_desc = "pipeline;Scalar versions of four floating point operations: fdiv,fsqrt (xvdivdp, xvdivsp, xvsqrtdp, xvsqrtsp).", }, {.pme_name = "PM_SP_FLOP_CMPL", .pme_code = 0x4505A, .pme_short_desc = "floating point;Single Precision floating point instruction completed.", .pme_long_desc = "floating point;Single Precision floating point instruction completed.", }, {.pme_name = "PM_ST_CMPL", .pme_code = 0x200F0, .pme_short_desc = "translation;Stores completed from S2Q (2nd-level store queue).", .pme_long_desc = "translation;Stores completed from S2Q (2nd-level store queue). This event includes regular stores, stcx and cache inhibited stores. The following operations are excluded (pteupdate, snoop tlbie complete, store atomics, miso, load atomic payloads, tlbie, tlbsync, slbieg, isync, msgsnd, slbiag, cpabort, copy, tcheck, tend, stsync, dcbst, icbi, dcbf, hwsync, lwsync, ptesync, eieio, msgsync)", }, {.pme_name = "PM_STCX_FAIL_FIN", .pme_code = 0x1E058, .pme_short_desc = "locks;Conditional store instruction (STCX) failed.", .pme_long_desc = "locks;Conditional store instruction (STCX) failed. 
LARX and STCX are instructions used to acquire a lock ", }, {.pme_name = "PM_STCX_FIN", .pme_code = 0x2E014, .pme_short_desc = "empty;Conditional store instruction (STCX) finished.", .pme_long_desc = "empty;Conditional store instruction (STCX) finished. LARX and STCX are instructions used to acquire a lock", }, {.pme_name = "PM_STCX_PASS_FIN", .pme_code = 0x4E050, .pme_short_desc = "locks;Conditional store instruction (STCX) passed.", .pme_long_desc = "locks;Conditional store instruction (STCX) passed. LARX and STCX are instructions used to acquire a lock ", }, {.pme_name = "PM_ST_FIN", .pme_code = 0x20016, .pme_short_desc = "translation;Store finish count.", .pme_long_desc = "translation;Store finish count. Includes speculative activity", }, {.pme_name = "PM_ST_FWD", .pme_code = 0x20018, .pme_short_desc = "translation;Store forwards that finished", .pme_long_desc = "translation;Store forwards that finished", }, {.pme_name = "PM_ST_MISS_L1", .pme_code = 0x300F0, .pme_short_desc = "translation;Store Missed L1", .pme_long_desc = "translation;Store Missed L1", }, {.pme_name = "PM_THRESH_EXC_1024", .pme_code = 0x301EA, .pme_short_desc = "pmc;Threshold counter exceeded a value of 1024", .pme_long_desc = "pmc;Threshold counter exceeded a value of 1024", }, {.pme_name = "PM_THRESH_EXC_128", .pme_code = 0x401EA, .pme_short_desc = "pmc;Threshold counter exceeded a value of 128", .pme_long_desc = "pmc;Threshold counter exceeded a value of 128", }, {.pme_name = "PM_THRESH_EXC_256", .pme_code = 0x101E8, .pme_short_desc = "pmc;Threshold counter exceeded a count of 256", .pme_long_desc = "pmc;Threshold counter exceeded a count of 256", }, {.pme_name = "PM_THRESH_EXC_32", .pme_code = 0x201E6, .pme_short_desc = "pmc;Threshold counter exceeded a value of 32", .pme_long_desc = "pmc;Threshold counter exceeded a value of 32", }, {.pme_name = "PM_THRESH_EXC_512", .pme_code = 0x201E8, .pme_short_desc = "pmc;Threshold counter exceeded a value of 512", .pme_long_desc = "pmc;Threshold 
counter exceeded a value of 512", }, {.pme_name = "PM_THRESH_EXC_64", .pme_code = 0x301E8, .pme_short_desc = "pmc;Threshold counter exceeded a value of 64", .pme_long_desc = "pmc;Threshold counter exceeded a value of 64", }, {.pme_name = "PM_THRESH_MET", .pme_code = 0x101EC, .pme_short_desc = "pmc;Threshold exceeded", .pme_long_desc = "pmc;Threshold exceeded", }, {.pme_name = "PM_TLBIE_FIN", .pme_code = 0x30058, .pme_short_desc = "pipeline;TLBIE instruction finished in the LSU.", .pme_long_desc = "pipeline;TLBIE instruction finished in the LSU. Two TLBIEs can finish each cycle. All will be counted", }, {.pme_name = "PM_ULTRAVISOR_CYC", .pme_code = 0x4D026, .pme_short_desc = "pmc;Cycles when the thread is in Ultravisor state.", .pme_long_desc = "pmc;Cycles when the thread is in Ultravisor state. MSR[S HV PR]=110", }, {.pme_name = "PM_ULTRAVISOR_INST_CMPL", .pme_code = 0x1001C, .pme_short_desc = "pmc;PowerPC instruction completed while the thread was in ultravisor state.", .pme_long_desc = "pmc;PowerPC instruction completed while the thread was in ultravisor state.", }, {.pme_name = "PM_VECTOR_FLOP_CMPL", .pme_code = 0x4D058, .pme_short_desc = "floating point;Vector floating point instruction completed", .pme_long_desc = "floating point;Vector floating point instruction completed", }, {.pme_name = "PM_VECTOR_LD_CMPL", .pme_code = 0x44054, .pme_short_desc = "empty;Vector load instruction completed", .pme_long_desc = "empty;Vector load instruction completed", }, {.pme_name = "PM_VECTOR_ST_CMPL", .pme_code = 0x44056, .pme_short_desc = "frontend;Vector store instruction completed", .pme_long_desc = "frontend;Vector store instruction completed", }, {.pme_name = "PM_VSU0_ISSUE", .pme_code = 0x10016, .pme_short_desc = "VSU instruction issued to VSU pipe 0", .pme_long_desc = "VSU instruction issued to VSU pipe 0", }, {.pme_name = "PM_VSU1_ISSUE", .pme_code = 0x2D012, .pme_short_desc = "pipeline;VSU instruction issued to VSU pipe 1", .pme_long_desc = "pipeline;VSU instruction 
issued to VSU pipe 1", }, {.pme_name = "PM_VSU2_ISSUE", .pme_code = 0x3F044, .pme_short_desc = "pipeline;VSU instruction issued to VSU pipe 2", .pme_long_desc = "pipeline;VSU instruction issued to VSU pipe 2", }, {.pme_name = "PM_VSU3_ISSUE", .pme_code = 0x4D020, .pme_short_desc = "pipeline;VSU instruction was issued to VSU pipe 3", .pme_long_desc = "pipeline;VSU instruction was issued to VSU pipe 3", }, {.pme_name = "PM_VSU_FIN", .pme_code = 0x4001C, .pme_short_desc = "empty;VSU instruction finished", .pme_long_desc = "empty;VSU instruction finished", }, {.pme_name = "PM_VSU_ISSUE", .pme_code = 0x2505C, .pme_short_desc = "empty;At least one VSU instruction was issued to one of the VSU pipes.", .pme_long_desc = "empty;At least one VSU instruction was issued to one of the VSU pipes. Up to 4 per cycle. Includes fixed point operations.", }, {.pme_name = "PM_VSU_NON_FLOP_CMPL", .pme_code = 0x4D050, .pme_short_desc = "floating point;Non-floating point VSU instruction completed", .pme_long_desc = "floating point;Non-floating point VSU instruction completed", }, {.pme_name = "PM_XFER_FROM_SRC_PMC1", .pme_code = 0x1C040, .pme_short_desc = "memory;The processor's L1 data cache was reloaded from the source specified in MMCR3[0:12].", .pme_long_desc = "memory;The processor's L1 data cache was reloaded from the source specified in MMCR3[0:12]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads.", }, {.pme_name = "PM_XFER_FROM_SRC_PMC2", .pme_code = 0x2C040, .pme_short_desc = "memory;The processor's L1 data cache was reloaded from the source specified in MMCR3[15:27].", .pme_long_desc = "memory;The processor's L1 data cache was reloaded from the source specified in MMCR3[15:27]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. 
If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads.", }, {.pme_name = "PM_XFER_FROM_SRC_PMC3", .pme_code = 0x3C040, .pme_short_desc = "memory;The processor's L1 data cache was reloaded from the source specified in MMCR3[30:42].", .pme_long_desc = "memory;The processor's L1 data cache was reloaded from the source specified in MMCR3[30:42]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads.", }, {.pme_name = "PM_XFER_FROM_SRC_PMC4", .pme_code = 0x4C040, .pme_short_desc = "memory;The processor's L1 data cache was reloaded from the source specified in MMCR3[45:57].", .pme_long_desc = "memory;The processor's L1 data cache was reloaded from the source specified in MMCR3[45:57]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads.", }, /* total 949 */ };
#endif

papi-papi-7-2-0-t/src/libpfm4/lib/events/power4_events.h

/****************************/
/* THIS IS OPEN SOURCE CODE */
/****************************/

#ifndef __POWER4_EVENTS_H__
#define __POWER4_EVENTS_H__

/*
 * File:    power4_events.h
 * CVS:
 * Author:  Corey Ashford
 *          cjashfor@us.ibm.com
 * Mods:
 *
 *
 * (C) Copyright IBM Corporation, 2009. All Rights Reserved.
 * Contributed by Corey Ashford
 *
 * Note: This code was automatically generated and should not be modified by
 * hand.
 *
 */
#define POWER4_PME_PM_MRK_LSU_SRQ_INST_VALID 0
#define POWER4_PME_PM_FPU1_SINGLE 1
#define POWER4_PME_PM_DC_PREF_OUT_STREAMS 2
#define POWER4_PME_PM_FPU0_STALL3 3
#define POWER4_PME_PM_TB_BIT_TRANS 4
#define POWER4_PME_PM_GPR_MAP_FULL_CYC 5
#define POWER4_PME_PM_MRK_ST_CMPL 6
#define POWER4_PME_PM_MRK_LSU_FLUSH_LRQ 7
#define POWER4_PME_PM_FPU0_STF 8
#define POWER4_PME_PM_FPU1_FMA 9
#define POWER4_PME_PM_L2SA_MOD_TAG 10
#define POWER4_PME_PM_MRK_DATA_FROM_L275_SHR 11
#define POWER4_PME_PM_1INST_CLB_CYC 12
#define POWER4_PME_PM_LSU1_FLUSH_ULD 13
#define POWER4_PME_PM_MRK_INST_FIN 14
#define POWER4_PME_PM_MRK_LSU0_FLUSH_UST 15
#define POWER4_PME_PM_FPU_FDIV 16
#define POWER4_PME_PM_LSU_LRQ_S0_ALLOC 17
#define POWER4_PME_PM_FPU0_FULL_CYC 18
#define POWER4_PME_PM_FPU_SINGLE 19
#define POWER4_PME_PM_FPU0_FMA 20
#define POWER4_PME_PM_MRK_LSU1_FLUSH_ULD 21
#define POWER4_PME_PM_LSU1_FLUSH_LRQ 22
#define POWER4_PME_PM_L2SA_ST_HIT 23
#define POWER4_PME_PM_L2SB_SHR_INV 24
#define POWER4_PME_PM_DTLB_MISS 25
#define POWER4_PME_PM_MRK_ST_MISS_L1 26
#define POWER4_PME_PM_EXT_INT 27
#define POWER4_PME_PM_MRK_LSU1_FLUSH_LRQ 28
#define POWER4_PME_PM_MRK_ST_GPS 29
#define POWER4_PME_PM_GRP_DISP_SUCCESS 30
#define POWER4_PME_PM_LSU1_LDF 31
#define POWER4_PME_PM_FAB_CMD_ISSUED 32
#define POWER4_PME_PM_LSU0_SRQ_STFWD 33
#define POWER4_PME_PM_CR_MAP_FULL_CYC 34
#define POWER4_PME_PM_MRK_LSU0_FLUSH_ULD 35
#define POWER4_PME_PM_LSU_DERAT_MISS 36
#define POWER4_PME_PM_FPU0_SINGLE 37
#define POWER4_PME_PM_FPU1_FDIV 38
#define POWER4_PME_PM_FPU1_FEST 39
#define POWER4_PME_PM_FPU0_FRSP_FCONV 40
#define POWER4_PME_PM_MRK_ST_CMPL_INT 41
#define POWER4_PME_PM_FXU_FIN 42
#define POWER4_PME_PM_FPU_STF 43
#define POWER4_PME_PM_DSLB_MISS 44
#define POWER4_PME_PM_DATA_FROM_L275_SHR 45
#define POWER4_PME_PM_FXLS1_FULL_CYC 46
#define POWER4_PME_PM_L3B0_DIR_MIS 47
#define POWER4_PME_PM_2INST_CLB_CYC 48
#define POWER4_PME_PM_MRK_STCX_FAIL 49
#define POWER4_PME_PM_LSU_LMQ_LHR_MERGE 50
#define POWER4_PME_PM_FXU0_BUSY_FXU1_IDLE 51
#define POWER4_PME_PM_L3B1_DIR_REF 52
#define POWER4_PME_PM_MRK_LSU_FLUSH_UST 53
#define POWER4_PME_PM_MRK_DATA_FROM_L25_SHR 54
#define POWER4_PME_PM_LSU_FLUSH_ULD 55
#define POWER4_PME_PM_MRK_BRU_FIN 56
#define POWER4_PME_PM_IERAT_XLATE_WR 57
#define POWER4_PME_PM_LSU0_BUSY 58
#define POWER4_PME_PM_L2SA_ST_REQ 59
#define POWER4_PME_PM_DATA_FROM_MEM 60
#define POWER4_PME_PM_FPR_MAP_FULL_CYC 61
#define POWER4_PME_PM_FPU1_FULL_CYC 62
#define POWER4_PME_PM_FPU0_FIN 63
#define POWER4_PME_PM_3INST_CLB_CYC 64
#define POWER4_PME_PM_DATA_FROM_L35 65
#define POWER4_PME_PM_L2SA_SHR_INV 66
#define POWER4_PME_PM_MRK_LSU_FLUSH_SRQ 67
#define POWER4_PME_PM_THRESH_TIMEO 68
#define POWER4_PME_PM_FPU_FSQRT 69
#define POWER4_PME_PM_MRK_LSU0_FLUSH_LRQ 70
#define POWER4_PME_PM_FXLS0_FULL_CYC 71
#define POWER4_PME_PM_DATA_TABLEWALK_CYC 72
#define POWER4_PME_PM_FPU0_ALL 73
#define POWER4_PME_PM_FPU0_FEST 74
#define POWER4_PME_PM_DATA_FROM_L25_MOD 75
#define POWER4_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 76
#define POWER4_PME_PM_FPU_FEST 77
#define POWER4_PME_PM_0INST_FETCH 78
#define POWER4_PME_PM_LARX_LSU1 79
#define POWER4_PME_PM_LD_MISS_L1_LSU0 80
#define POWER4_PME_PM_L1_PREF 81
#define POWER4_PME_PM_FPU1_STALL3 82
#define POWER4_PME_PM_BRQ_FULL_CYC 83
#define POWER4_PME_PM_LARX 84
#define POWER4_PME_PM_MRK_DATA_FROM_L35 85
#define POWER4_PME_PM_WORK_HELD 86
#define POWER4_PME_PM_MRK_LD_MISS_L1_LSU0 87
#define POWER4_PME_PM_FXU_IDLE 88
#define POWER4_PME_PM_INST_CMPL 89
#define POWER4_PME_PM_LSU1_FLUSH_UST 90
#define POWER4_PME_PM_LSU0_FLUSH_ULD 91
#define POWER4_PME_PM_INST_FROM_L2 92
#define POWER4_PME_PM_DATA_FROM_L3 93
#define POWER4_PME_PM_FPU0_DENORM 94
#define POWER4_PME_PM_FPU1_FMOV_FEST 95
#define POWER4_PME_PM_GRP_DISP_REJECT 96
#define POWER4_PME_PM_INST_FETCH_CYC 97
#define POWER4_PME_PM_LSU_LDF 98
#define POWER4_PME_PM_INST_DISP 99
#define POWER4_PME_PM_L2SA_MOD_INV 100
#define POWER4_PME_PM_DATA_FROM_L25_SHR 101
#define POWER4_PME_PM_FAB_CMD_RETRIED 102
#define POWER4_PME_PM_L1_DCACHE_RELOAD_VALID 103
#define POWER4_PME_PM_MRK_GRP_ISSUED 104
#define POWER4_PME_PM_FPU_FULL_CYC 105
#define POWER4_PME_PM_FPU_FMA 106
#define POWER4_PME_PM_MRK_CRU_FIN 107
#define POWER4_PME_PM_MRK_LSU1_FLUSH_UST 108
#define POWER4_PME_PM_MRK_FXU_FIN 109
#define POWER4_PME_PM_BR_ISSUED 110
#define POWER4_PME_PM_EE_OFF 111
#define POWER4_PME_PM_INST_FROM_L3 112
#define POWER4_PME_PM_ITLB_MISS 113
#define POWER4_PME_PM_FXLS_FULL_CYC 114
#define POWER4_PME_PM_FXU1_BUSY_FXU0_IDLE 115
#define POWER4_PME_PM_GRP_DISP_VALID 116
#define POWER4_PME_PM_L2SC_ST_HIT 117
#define POWER4_PME_PM_MRK_GRP_DISP 118
#define POWER4_PME_PM_L2SB_MOD_TAG 119
#define POWER4_PME_PM_INST_FROM_L25_L275 120
#define POWER4_PME_PM_LSU_FLUSH_UST 121
#define POWER4_PME_PM_L2SB_ST_HIT 122
#define POWER4_PME_PM_FXU1_FIN 123
#define POWER4_PME_PM_L3B1_DIR_MIS 124
#define POWER4_PME_PM_4INST_CLB_CYC 125
#define POWER4_PME_PM_GRP_CMPL 126
#define POWER4_PME_PM_DC_PREF_L2_CLONE_L3 127
#define POWER4_PME_PM_FPU_FRSP_FCONV 128
#define POWER4_PME_PM_5INST_CLB_CYC 129
#define POWER4_PME_PM_MRK_LSU0_FLUSH_SRQ 130
#define POWER4_PME_PM_MRK_LSU_FLUSH_ULD 131
#define POWER4_PME_PM_8INST_CLB_CYC 132
#define POWER4_PME_PM_LSU_LMQ_FULL_CYC 133
#define POWER4_PME_PM_ST_REF_L1_LSU0 134
#define POWER4_PME_PM_LSU0_DERAT_MISS 135
#define POWER4_PME_PM_LSU_SRQ_SYNC_CYC 136
#define POWER4_PME_PM_FPU_STALL3 137
#define POWER4_PME_PM_MRK_DATA_FROM_L2 138
#define POWER4_PME_PM_FPU0_FMOV_FEST 139
#define POWER4_PME_PM_LSU0_FLUSH_SRQ 140
#define POWER4_PME_PM_LD_REF_L1_LSU0 141
#define POWER4_PME_PM_L2SC_SHR_INV 142
#define POWER4_PME_PM_LSU1_FLUSH_SRQ 143
#define POWER4_PME_PM_LSU_LMQ_S0_ALLOC 144
#define POWER4_PME_PM_ST_REF_L1 145
#define POWER4_PME_PM_LSU_SRQ_EMPTY_CYC 146
#define POWER4_PME_PM_FPU1_STF 147
#define POWER4_PME_PM_L3B0_DIR_REF 148
#define POWER4_PME_PM_RUN_CYC 149
#define POWER4_PME_PM_LSU_LMQ_S0_VALID 150
#define POWER4_PME_PM_LSU_LRQ_S0_VALID 151
#define POWER4_PME_PM_LSU0_LDF 152
#define POWER4_PME_PM_MRK_IMR_RELOAD 153
#define POWER4_PME_PM_7INST_CLB_CYC 154
#define POWER4_PME_PM_MRK_GRP_TIMEO 155
#define POWER4_PME_PM_FPU_FMOV_FEST 156
#define POWER4_PME_PM_GRP_DISP_BLK_SB_CYC 157
#define POWER4_PME_PM_XER_MAP_FULL_CYC 158
#define POWER4_PME_PM_ST_MISS_L1 159
#define POWER4_PME_PM_STOP_COMPLETION 160
#define POWER4_PME_PM_MRK_GRP_CMPL 161
#define POWER4_PME_PM_ISLB_MISS 162
#define POWER4_PME_PM_CYC 163
#define POWER4_PME_PM_LD_MISS_L1_LSU1 164
#define POWER4_PME_PM_STCX_FAIL 165
#define POWER4_PME_PM_LSU1_SRQ_STFWD 166
#define POWER4_PME_PM_GRP_DISP 167
#define POWER4_PME_PM_DATA_FROM_L2 168
#define POWER4_PME_PM_L2_PREF 169
#define POWER4_PME_PM_FPU0_FPSCR 170
#define POWER4_PME_PM_FPU1_DENORM 171
#define POWER4_PME_PM_MRK_DATA_FROM_L25_MOD 172
#define POWER4_PME_PM_L2SB_ST_REQ 173
#define POWER4_PME_PM_L2SB_MOD_INV 174
#define POWER4_PME_PM_FPU0_FSQRT 175
#define POWER4_PME_PM_LD_REF_L1 176
#define POWER4_PME_PM_MRK_L1_RELOAD_VALID 177
#define POWER4_PME_PM_L2SB_SHR_MOD 178
#define POWER4_PME_PM_INST_FROM_L1 179
#define POWER4_PME_PM_1PLUS_PPC_CMPL 180
#define POWER4_PME_PM_EE_OFF_EXT_INT 181
#define POWER4_PME_PM_L2SC_SHR_MOD 182
#define POWER4_PME_PM_LSU_LRQ_FULL_CYC 183
#define POWER4_PME_PM_IC_PREF_INSTALL 184
#define POWER4_PME_PM_MRK_LSU1_FLUSH_SRQ 185
#define POWER4_PME_PM_GCT_FULL_CYC 186
#define POWER4_PME_PM_INST_FROM_MEM 187
#define POWER4_PME_PM_FXU_BUSY 188
#define POWER4_PME_PM_ST_REF_L1_LSU1 189
#define POWER4_PME_PM_MRK_LD_MISS_L1 190
#define POWER4_PME_PM_MRK_LSU1_INST_FIN 191
#define POWER4_PME_PM_L1_WRITE_CYC 192
#define POWER4_PME_PM_BIQ_IDU_FULL_CYC 193
#define POWER4_PME_PM_MRK_LSU0_INST_FIN 194
#define POWER4_PME_PM_L2SC_ST_REQ 195
#define POWER4_PME_PM_LSU1_BUSY 196
#define POWER4_PME_PM_FPU_ALL 197
#define POWER4_PME_PM_LSU_SRQ_S0_ALLOC 198
#define POWER4_PME_PM_GRP_MRK 199
#define POWER4_PME_PM_FPU1_FIN 200
#define POWER4_PME_PM_DC_PREF_STREAM_ALLOC 201
#define POWER4_PME_PM_BR_MPRED_CR 202
#define POWER4_PME_PM_BR_MPRED_TA 203
#define POWER4_PME_PM_CRQ_FULL_CYC 204
#define POWER4_PME_PM_INST_FROM_PREF 205
#define POWER4_PME_PM_LD_MISS_L1 206
#define POWER4_PME_PM_STCX_PASS 207
#define POWER4_PME_PM_DC_INV_L2 208
#define POWER4_PME_PM_LSU_SRQ_FULL_CYC 209
#define POWER4_PME_PM_LSU0_FLUSH_LRQ 210
#define POWER4_PME_PM_LSU_SRQ_S0_VALID 211
#define POWER4_PME_PM_LARX_LSU0 212
#define POWER4_PME_PM_GCT_EMPTY_CYC 213
#define POWER4_PME_PM_FPU1_ALL 214
#define POWER4_PME_PM_FPU1_FSQRT 215
#define POWER4_PME_PM_FPU_FIN 216
#define POWER4_PME_PM_L2SA_SHR_MOD 217
#define POWER4_PME_PM_MRK_LD_MISS_L1_LSU1 218
#define POWER4_PME_PM_LSU_SRQ_STFWD 219
#define POWER4_PME_PM_FXU0_FIN 220
#define POWER4_PME_PM_MRK_FPU_FIN 221
#define POWER4_PME_PM_LSU_BUSY 222
#define POWER4_PME_PM_INST_FROM_L35 223
#define POWER4_PME_PM_FPU1_FRSP_FCONV 224
#define POWER4_PME_PM_SNOOP_TLBIE 225
#define POWER4_PME_PM_FPU0_FDIV 226
#define POWER4_PME_PM_LD_REF_L1_LSU1 227
#define POWER4_PME_PM_MRK_DATA_FROM_L275_MOD 228
#define POWER4_PME_PM_HV_CYC 229
#define POWER4_PME_PM_6INST_CLB_CYC 230
#define POWER4_PME_PM_LR_CTR_MAP_FULL_CYC 231
#define POWER4_PME_PM_L2SC_MOD_INV 232
#define POWER4_PME_PM_FPU_DENORM 233
#define POWER4_PME_PM_DATA_FROM_L275_MOD 234
#define POWER4_PME_PM_LSU1_DERAT_MISS 235
#define POWER4_PME_PM_IC_PREF_REQ 236
#define POWER4_PME_PM_MRK_LSU_FIN 237
#define POWER4_PME_PM_MRK_DATA_FROM_L3 238
#define POWER4_PME_PM_MRK_DATA_FROM_MEM 239
#define POWER4_PME_PM_LSU0_FLUSH_UST 240
#define POWER4_PME_PM_LSU_FLUSH_LRQ 241
#define POWER4_PME_PM_LSU_FLUSH_SRQ 242
#define POWER4_PME_PM_L2SC_MOD_TAG 243
static const pme_power_entry_t power4_pe[] = { [ POWER4_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { .pme_name = "PM_MRK_LSU_SRQ_INST_VALID", .pme_code = 0x933, .pme_short_desc = "Marked instruction valid in SRQ", .pme_long_desc = "This signal is asserted every cycle when a marked request is resident in the Store Request Queue", }, [
POWER4_PME_PM_FPU1_SINGLE ] = { .pme_name = "PM_FPU1_SINGLE", .pme_code = 0x127, .pme_short_desc = "FPU1 executed single precision instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing single precision instruction.", }, [ POWER4_PME_PM_DC_PREF_OUT_STREAMS ] = { .pme_name = "PM_DC_PREF_OUT_STREAMS", .pme_code = 0xc36, .pme_short_desc = "Out of prefetch streams", .pme_long_desc = "A new prefetch stream was detected, but no more stream entries were available", }, [ POWER4_PME_PM_FPU0_STALL3 ] = { .pme_name = "PM_FPU0_STALL3", .pme_code = 0x121, .pme_short_desc = "FPU0 stalled in pipe3", .pme_long_desc = "This signal indicates that fp0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. ", }, [ POWER4_PME_PM_TB_BIT_TRANS ] = { .pme_name = "PM_TB_BIT_TRANS", .pme_code = 0x8005, .pme_short_desc = "Time Base bit transition", .pme_long_desc = "When the selected time base bit (as specified in MMCR0[TBSEL])transitions from 0 to 1 ", }, [ POWER4_PME_PM_GPR_MAP_FULL_CYC ] = { .pme_name = "PM_GPR_MAP_FULL_CYC", .pme_code = 0x235, .pme_short_desc = "Cycles GPR mapper full", .pme_long_desc = "The ISU sends a signal indicating that the gpr mapper cannot accept any more groups. Dispatch is stopped. 
Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", }, [ POWER4_PME_PM_MRK_ST_CMPL ] = { .pme_name = "PM_MRK_ST_CMPL", .pme_code = 0x1003, .pme_short_desc = "Marked store instruction completed", .pme_long_desc = "A sampled store has completed (data home)", }, [ POWER4_PME_PM_MRK_LSU_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_LRQ", .pme_code = 0x3910, .pme_short_desc = "Marked LRQ flushes", .pme_long_desc = "A marked load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER4_PME_PM_FPU0_STF ] = { .pme_name = "PM_FPU0_STF", .pme_code = 0x122, .pme_short_desc = "FPU0 executed store instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing a store instruction.", }, [ POWER4_PME_PM_FPU1_FMA ] = { .pme_name = "PM_FPU1_FMA", .pme_code = 0x105, .pme_short_desc = "FPU1 executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER4_PME_PM_L2SA_MOD_TAG ] = { .pme_name = "PM_L2SA_MOD_TAG", .pme_code = 0xf06, .pme_short_desc = "L2 slice A transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. 
The event is provided on each of the three slices A,B,and C.", }, [ POWER4_PME_PM_MRK_DATA_FROM_L275_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L275_SHR", .pme_code = 0x6c76, .pme_short_desc = "Marked data loaded from L2.75 shared", .pme_long_desc = "DL1 was reloaded with shared (T) data from the L2 of another MCM due to a marked demand load", }, [ POWER4_PME_PM_1INST_CLB_CYC ] = { .pme_name = "PM_1INST_CLB_CYC", .pme_code = 0x450, .pme_short_desc = "Cycles 1 instruction in CLB", .pme_long_desc = "The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue.", }, [ POWER4_PME_PM_LSU1_FLUSH_ULD ] = { .pme_name = "PM_LSU1_FLUSH_ULD", .pme_code = 0xc04, .pme_short_desc = "LSU1 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", }, [ POWER4_PME_PM_MRK_INST_FIN ] = { .pme_name = "PM_MRK_INST_FIN", .pme_code = 0x7005, .pme_short_desc = "Marked instruction finished", .pme_long_desc = "One of the execution units finished a marked instruction. Instructions that finish may not necessarily complete", }, [ POWER4_PME_PM_MRK_LSU0_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU0_FLUSH_UST", .pme_code = 0x911, .pme_short_desc = "LSU0 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 0 because it was unaligned", }, [ POWER4_PME_PM_FPU_FDIV ] = { .pme_name = "PM_FPU_FDIV", .pme_code = 0x1100, .pme_short_desc = "FPU executed FDIV instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when FPU is executing a divide instruction.
This could be fdiv, fdivs, fdiv. fdivs. Combined Unit 0 + Unit 1", }, [ POWER4_PME_PM_LSU_LRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LRQ_S0_ALLOC", .pme_code = 0xc26, .pme_short_desc = "LRQ slot 0 allocated", .pme_long_desc = "LRQ slot zero was allocated", }, [ POWER4_PME_PM_FPU0_FULL_CYC ] = { .pme_name = "PM_FPU0_FULL_CYC", .pme_code = 0x203, .pme_short_desc = "Cycles FPU0 issue queue full", .pme_long_desc = "The issue queue for FPU unit 0 cannot accept any more instructions. Issue is stopped", }, [ POWER4_PME_PM_FPU_SINGLE ] = { .pme_name = "PM_FPU_SINGLE", .pme_code = 0x5120, .pme_short_desc = "FPU executed single precision instruction", .pme_long_desc = "FPU is executing single precision instruction. Combined Unit 0 + Unit 1", }, [ POWER4_PME_PM_FPU0_FMA ] = { .pme_name = "PM_FPU0_FMA", .pme_code = 0x101, .pme_short_desc = "FPU0 executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER4_PME_PM_MRK_LSU1_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU1_FLUSH_ULD", .pme_code = 0x914, .pme_short_desc = "LSU1 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", }, [ POWER4_PME_PM_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_LSU1_FLUSH_LRQ", .pme_code = 0xc06, .pme_short_desc = "LSU1 LRQ flushes", .pme_long_desc = "A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER4_PME_PM_L2SA_ST_HIT ] = { .pme_name = "PM_L2SA_ST_HIT", .pme_code = 0xf11, .pme_short_desc = "L2 slice A store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. 
This event is provided on each of the three L2 slices A,B, and C.", }, [ POWER4_PME_PM_L2SB_SHR_INV ] = { .pme_name = "PM_L2SB_SHR_INV", .pme_code = 0xf21, .pme_short_desc = "L2 slice B transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted.", }, [ POWER4_PME_PM_DTLB_MISS ] = { .pme_name = "PM_DTLB_MISS", .pme_code = 0x904, .pme_short_desc = "Data TLB misses", .pme_long_desc = "A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction.", }, [ POWER4_PME_PM_MRK_ST_MISS_L1 ] = { .pme_name = "PM_MRK_ST_MISS_L1", .pme_code = 0x923, .pme_short_desc = "Marked L1 D cache store misses", .pme_long_desc = "A marked store missed the dcache", }, [ POWER4_PME_PM_EXT_INT ] = { .pme_name = "PM_EXT_INT", .pme_code = 0x8002, .pme_short_desc = "External interrupts", .pme_long_desc = "An external interrupt occurred", }, [ POWER4_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_LRQ", .pme_code = 0x916, .pme_short_desc = "LSU1 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER4_PME_PM_MRK_ST_GPS ] = { .pme_name = "PM_MRK_ST_GPS", .pme_code = 0x6003, .pme_short_desc = "Marked store sent to GPS", .pme_long_desc = "A sampled store has been sent to the memory subsystem", }, [ POWER4_PME_PM_GRP_DISP_SUCCESS ] = { 
.pme_name = "PM_GRP_DISP_SUCCESS", .pme_code = 0x5001, .pme_short_desc = "Group dispatch success", .pme_long_desc = "Number of groups successfully dispatched (not rejected)", }, [ POWER4_PME_PM_LSU1_LDF ] = { .pme_name = "PM_LSU1_LDF", .pme_code = 0x934, .pme_short_desc = "LSU1 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed from LSU unit 1", }, [ POWER4_PME_PM_FAB_CMD_ISSUED ] = { .pme_name = "PM_FAB_CMD_ISSUED", .pme_code = 0xf16, .pme_short_desc = "Fabric command issued", .pme_long_desc = "A bus command was issued on the MCM to MCM fabric from the local (this chip's) Fabric Bus Controller. This event is scaled to the fabric frequency and must be adjusted for a true count. i.e. if the fabric is running 2:1, divide the count by 2.", }, [ POWER4_PME_PM_LSU0_SRQ_STFWD ] = { .pme_name = "PM_LSU0_SRQ_STFWD", .pme_code = 0xc20, .pme_short_desc = "LSU0 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 0", }, [ POWER4_PME_PM_CR_MAP_FULL_CYC ] = { .pme_name = "PM_CR_MAP_FULL_CYC", .pme_code = 0x204, .pme_short_desc = "Cycles CR logical operation mapper full", .pme_long_desc = "The ISU sends a signal indicating that the cr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", }, [ POWER4_PME_PM_MRK_LSU0_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU0_FLUSH_ULD", .pme_code = 0x910, .pme_short_desc = "LSU0 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", }, [ POWER4_PME_PM_LSU_DERAT_MISS ] = { .pme_name = "PM_LSU_DERAT_MISS", .pme_code = 0x6900, .pme_short_desc = "DERAT misses", .pme_long_desc = "Total D-ERAT Misses (Unit 0 + Unit 1). Requests that miss the Derat are rejected and retried until the request hits in the Erat. 
This may result in multiple erat misses for the same instruction.", }, [ POWER4_PME_PM_FPU0_SINGLE ] = { .pme_name = "PM_FPU0_SINGLE", .pme_code = 0x123, .pme_short_desc = "FPU0 executed single precision instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing single precision instruction.", }, [ POWER4_PME_PM_FPU1_FDIV ] = { .pme_name = "PM_FPU1_FDIV", .pme_code = 0x104, .pme_short_desc = "FPU1 executed FDIV instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp1 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs.", }, [ POWER4_PME_PM_FPU1_FEST ] = { .pme_name = "PM_FPU1_FEST", .pme_code = 0x116, .pme_short_desc = "FPU1 executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. ", }, [ POWER4_PME_PM_FPU0_FRSP_FCONV ] = { .pme_name = "PM_FPU0_FRSP_FCONV", .pme_code = 0x111, .pme_short_desc = "FPU0 executed FRSP or FCONV instructions", .pme_long_desc = "This signal is active for one cycle when fp0 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER4_PME_PM_MRK_ST_CMPL_INT ] = { .pme_name = "PM_MRK_ST_CMPL_INT", .pme_code = 0x3003, .pme_short_desc = "Marked store completed with intervention", .pme_long_desc = "A marked store previously sent to the memory subsystem completed (data home) after requiring intervention", }, [ POWER4_PME_PM_FXU_FIN ] = { .pme_name = "PM_FXU_FIN", .pme_code = 0x3230, .pme_short_desc = "FXU produced a result", .pme_long_desc = "The fixed point unit (Unit 0 + Unit 1) finished a marked instruction.
Instructions that finish may not necessarily complete.", }, [ POWER4_PME_PM_FPU_STF ] = { .pme_name = "PM_FPU_STF", .pme_code = 0x6120, .pme_short_desc = "FPU executed store instruction", .pme_long_desc = "FPU is executing a store instruction. Combined Unit 0 + Unit 1", }, [ POWER4_PME_PM_DSLB_MISS ] = { .pme_name = "PM_DSLB_MISS", .pme_code = 0x905, .pme_short_desc = "Data SLB misses", .pme_long_desc = "A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve", }, [ POWER4_PME_PM_DATA_FROM_L275_SHR ] = { .pme_name = "PM_DATA_FROM_L275_SHR", .pme_code = 0x6c66, .pme_short_desc = "Data loaded from L2.75 shared", .pme_long_desc = "DL1 was reloaded with shared (T) data from the L2 of another MCM due to a demand load", }, [ POWER4_PME_PM_FXLS1_FULL_CYC ] = { .pme_name = "PM_FXLS1_FULL_CYC", .pme_code = 0x214, .pme_short_desc = "Cycles FXU1/LS1 queue full", .pme_long_desc = "The issue queue for FXU/LSU unit 1 cannot accept any more instructions. Issue is stopped", }, [ POWER4_PME_PM_L3B0_DIR_MIS ] = { .pme_name = "PM_L3B0_DIR_MIS", .pme_code = 0xf01, .pme_short_desc = "L3 bank 0 directory misses", .pme_long_desc = "A reference was made to the local L3 directory by a local CPU and it missed in the L3. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3", }, [ POWER4_PME_PM_2INST_CLB_CYC ] = { .pme_name = "PM_2INST_CLB_CYC", .pme_code = 0x451, .pme_short_desc = "Cycles 2 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue.
This signal gives a real time history of the number of instruction quads valid in the instruction queue.", }, [ POWER4_PME_PM_MRK_STCX_FAIL ] = { .pme_name = "PM_MRK_STCX_FAIL", .pme_code = 0x925, .pme_short_desc = "Marked STCX failed", .pme_long_desc = "A marked stcx (stwcx or stdcx) failed", }, [ POWER4_PME_PM_LSU_LMQ_LHR_MERGE ] = { .pme_name = "PM_LSU_LMQ_LHR_MERGE", .pme_code = 0x926, .pme_short_desc = "LMQ LHR merges", .pme_long_desc = "A dcache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry.", }, [ POWER4_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { .pme_name = "PM_FXU0_BUSY_FXU1_IDLE", .pme_code = 0x7002, .pme_short_desc = "FXU0 busy FXU1 idle", .pme_long_desc = "FXU0 is busy while FXU1 was idle", }, [ POWER4_PME_PM_L3B1_DIR_REF ] = { .pme_name = "PM_L3B1_DIR_REF", .pme_code = 0xf02, .pme_short_desc = "L3 bank 1 directory references", .pme_long_desc = "A reference was made to the local L3 directory by a local CPU. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. 
if the L3 is running 3:1, divide the count by 3", }, [ POWER4_PME_PM_MRK_LSU_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU_FLUSH_UST", .pme_code = 0x7910, .pme_short_desc = "Marked unaligned store flushes", .pme_long_desc = "A marked store was flushed because it was unaligned", }, [ POWER4_PME_PM_MRK_DATA_FROM_L25_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L25_SHR", .pme_code = 0x5c76, .pme_short_desc = "Marked data loaded from L2.5 shared", .pme_long_desc = "DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a marked demand load", }, [ POWER4_PME_PM_LSU_FLUSH_ULD ] = { .pme_name = "PM_LSU_FLUSH_ULD", .pme_code = 0x1c00, .pme_short_desc = "LRQ unaligned load flushes", .pme_long_desc = "A load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", }, [ POWER4_PME_PM_MRK_BRU_FIN ] = { .pme_name = "PM_MRK_BRU_FIN", .pme_code = 0x2005, .pme_short_desc = "Marked instruction BRU processing finished", .pme_long_desc = "The branch unit finished a marked instruction. Instructions that finish may not necessarily complete", }, [ POWER4_PME_PM_IERAT_XLATE_WR ] = { .pme_name = "PM_IERAT_XLATE_WR", .pme_code = 0x327, .pme_short_desc = "Translation written to ierat", .pme_long_desc = "This signal will be asserted each time the I-ERAT is written. This indicates that an ERAT miss has been serviced. ERAT misses will initiate a sequence resulting in the ERAT being written.
ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed, This should be a fairly accurate count of ERAT missed (best available).", }, [ POWER4_PME_PM_LSU0_BUSY ] = { .pme_name = "PM_LSU0_BUSY", .pme_code = 0xc33, .pme_short_desc = "LSU0 busy", .pme_long_desc = "LSU unit 0 is busy rejecting instructions", }, [ POWER4_PME_PM_L2SA_ST_REQ ] = { .pme_name = "PM_L2SA_ST_REQ", .pme_code = 0xf10, .pme_short_desc = "L2 slice A store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C.", }, [ POWER4_PME_PM_DATA_FROM_MEM ] = { .pme_name = "PM_DATA_FROM_MEM", .pme_code = 0x2c66, .pme_short_desc = "Data loaded from memory", .pme_long_desc = "DL1 was reloaded from memory due to a demand load", }, [ POWER4_PME_PM_FPR_MAP_FULL_CYC ] = { .pme_name = "PM_FPR_MAP_FULL_CYC", .pme_code = 0x201, .pme_short_desc = "Cycles FPR mapper full", .pme_long_desc = "The ISU sends a signal indicating that the FPR mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", }, [ POWER4_PME_PM_FPU1_FULL_CYC ] = { .pme_name = "PM_FPU1_FULL_CYC", .pme_code = 0x207, .pme_short_desc = "Cycles FPU1 issue queue full", .pme_long_desc = "The issue queue for FPU unit 1 cannot accept any more instructions. Issue is stopped", }, [ POWER4_PME_PM_FPU0_FIN ] = { .pme_name = "PM_FPU0_FIN", .pme_code = 0x113, .pme_short_desc = "FPU0 produced a result", .pme_long_desc = "fp0 finished, produced a result This only indicates finish, not completion. ", }, [ POWER4_PME_PM_3INST_CLB_CYC ] = { .pme_name = "PM_3INST_CLB_CYC", .pme_code = 0x452, .pme_short_desc = "Cycles 3 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. 
Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue.", }, [ POWER4_PME_PM_DATA_FROM_L35 ] = { .pme_name = "PM_DATA_FROM_L35", .pme_code = 0x3c66, .pme_short_desc = "Data loaded from L3.5", .pme_long_desc = "DL1 was reloaded from the L3 of another MCM due to a demand load", }, [ POWER4_PME_PM_L2SA_SHR_INV ] = { .pme_name = "PM_L2SA_SHR_INV", .pme_code = 0xf05, .pme_short_desc = "L2 slice A transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted.", }, [ POWER4_PME_PM_MRK_LSU_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_SRQ", .pme_code = 0x4910, .pme_short_desc = "Marked SRQ flushes", .pme_long_desc = "A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group.", }, [ POWER4_PME_PM_THRESH_TIMEO ] = { .pme_name = "PM_THRESH_TIMEO", .pme_code = 0x2003, .pme_short_desc = "Threshold timeout", .pme_long_desc = "The threshold timer expired", }, [ POWER4_PME_PM_FPU_FSQRT ] = { .pme_name = "PM_FPU_FSQRT", .pme_code = 0x6100, .pme_short_desc = "FPU executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when FPU is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
Combined Unit 0 + Unit 1", }, [ POWER4_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_LRQ", .pme_code = 0x912, .pme_short_desc = "LSU0 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER4_PME_PM_FXLS0_FULL_CYC ] = { .pme_name = "PM_FXLS0_FULL_CYC", .pme_code = 0x210, .pme_short_desc = "Cycles FXU0/LS0 queue full", .pme_long_desc = "The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped", }, [ POWER4_PME_PM_DATA_TABLEWALK_CYC ] = { .pme_name = "PM_DATA_TABLEWALK_CYC", .pme_code = 0x936, .pme_short_desc = "Cycles doing data tablewalks", .pme_long_desc = "This signal is asserted every cycle when a tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried.", }, [ POWER4_PME_PM_FPU0_ALL ] = { .pme_name = "PM_FPU0_ALL", .pme_code = 0x103, .pme_short_desc = "FPU0 executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo", }, [ POWER4_PME_PM_FPU0_FEST ] = { .pme_name = "PM_FPU0_FEST", .pme_code = 0x112, .pme_short_desc = "FPU0 executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. 
", }, [ POWER4_PME_PM_DATA_FROM_L25_MOD ] = { .pme_name = "PM_DATA_FROM_L25_MOD", .pme_code = 0x8c66, .pme_short_desc = "Data loaded from L2.5 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a demand load", }, [ POWER4_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", .pme_code = 0x2002, .pme_short_desc = "Cycles LMQ and SRQ empty", .pme_long_desc = "Cycles when both the LMQ and SRQ are empty (LSU is idle)", }, [ POWER4_PME_PM_FPU_FEST ] = { .pme_name = "PM_FPU_FEST", .pme_code = 0x3110, .pme_short_desc = "FPU executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1.", }, [ POWER4_PME_PM_0INST_FETCH ] = { .pme_name = "PM_0INST_FETCH", .pme_code = 0x8327, .pme_short_desc = "No instructions fetched", .pme_long_desc = "No instructions were fetched this cycle (due to IFU hold, redirect, or icache miss)", }, [ POWER4_PME_PM_LARX_LSU1 ] = { .pme_name = "PM_LARX_LSU1", .pme_code = 0xc77, .pme_short_desc = "Larx executed on LSU1", .pme_long_desc = "Invalid event, larx instructions are never executed on unit 1", }, [ POWER4_PME_PM_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_LD_MISS_L1_LSU0", .pme_code = 0xc12, .pme_short_desc = "LSU0 L1 D cache load misses", .pme_long_desc = "A load, executing on unit 0, missed the dcache", }, [ POWER4_PME_PM_L1_PREF ] = { .pme_name = "PM_L1_PREF", .pme_code = 0xc35, .pme_short_desc = "L1 cache data prefetches", .pme_long_desc = "A request to prefetch data into the L1 was made", }, [ POWER4_PME_PM_FPU1_STALL3 ] = { .pme_name = "PM_FPU1_STALL3", .pme_code = 0x125, .pme_short_desc = "FPU1 stalled in pipe3", .pme_long_desc = "This signal indicates that fp1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). 
This signal is active during the entire duration of the stall. ", }, [ POWER4_PME_PM_BRQ_FULL_CYC ] = { .pme_name = "PM_BRQ_FULL_CYC", .pme_code = 0x205, .pme_short_desc = "Cycles branch queue full", .pme_long_desc = "The ISU sends a signal indicating that the issue queue that feeds the ifu br unit cannot accept any more group (queue is full of groups).", }, [ POWER4_PME_PM_LARX ] = { .pme_name = "PM_LARX", .pme_code = 0x4c70, .pme_short_desc = "Larx executed", .pme_long_desc = "A Larx (lwarx or ldarx) was executed. This is the combined count from LSU0 + LSU1, but these instructions only execute on LSU0", }, [ POWER4_PME_PM_MRK_DATA_FROM_L35 ] = { .pme_name = "PM_MRK_DATA_FROM_L35", .pme_code = 0x3c76, .pme_short_desc = "Marked data loaded from L3.5", .pme_long_desc = "DL1 was reloaded from the L3 of another MCM due to a marked demand load", }, [ POWER4_PME_PM_WORK_HELD ] = { .pme_name = "PM_WORK_HELD", .pme_code = 0x2001, .pme_short_desc = "Work held", .pme_long_desc = "RAS Unit has signaled completion to stop and there are groups waiting to complete", }, [ POWER4_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU0", .pme_code = 0x920, .pme_short_desc = "LSU0 L1 D cache load misses", .pme_long_desc = "A marked load, executing on unit 0, missed the dcache", }, [ POWER4_PME_PM_FXU_IDLE ] = { .pme_name = "PM_FXU_IDLE", .pme_code = 0x5002, .pme_short_desc = "FXU idle", .pme_long_desc = "FXU0 and FXU1 are both idle", }, [ POWER4_PME_PM_INST_CMPL ] = { .pme_name = "PM_INST_CMPL", .pme_code = 0x8001, .pme_short_desc = "Instructions completed", .pme_long_desc = "Number of Eligible Instructions that completed. 
", }, [ POWER4_PME_PM_LSU1_FLUSH_UST ] = { .pme_name = "PM_LSU1_FLUSH_UST", .pme_code = 0xc05, .pme_short_desc = "LSU1 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 1 because it was unaligned (crossed a 4k boundary)", }, [ POWER4_PME_PM_LSU0_FLUSH_ULD ] = { .pme_name = "PM_LSU0_FLUSH_ULD", .pme_code = 0xc00, .pme_short_desc = "LSU0 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", }, [ POWER4_PME_PM_INST_FROM_L2 ] = { .pme_name = "PM_INST_FROM_L2", .pme_code = 0x3327, .pme_short_desc = "Instructions fetched from L2", .pme_long_desc = "An instruction fetch group was fetched from L2. Fetch Groups can contain up to 8 instructions", }, [ POWER4_PME_PM_DATA_FROM_L3 ] = { .pme_name = "PM_DATA_FROM_L3", .pme_code = 0x1c66, .pme_short_desc = "Data loaded from L3", .pme_long_desc = "DL1 was reloaded from the local L3 due to a demand load", }, [ POWER4_PME_PM_FPU0_DENORM ] = { .pme_name = "PM_FPU0_DENORM", .pme_code = 0x120, .pme_short_desc = "FPU0 received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized.", }, [ POWER4_PME_PM_FPU1_FMOV_FEST ] = { .pme_name = "PM_FPU1_FMOV_FEST", .pme_code = 0x114, .pme_short_desc = "FPU1 executing FMOV or FEST instructions", .pme_long_desc = "This signal is active for one cycle when fp1 is executing a move kind of instruction or one of the estimate instructions. 
This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ", }, [ POWER4_PME_PM_GRP_DISP_REJECT ] = { .pme_name = "PM_GRP_DISP_REJECT", .pme_code = 0x8003, .pme_short_desc = "Group dispatch rejected", .pme_long_desc = "A group that previously attempted dispatch was rejected.", }, [ POWER4_PME_PM_INST_FETCH_CYC ] = { .pme_name = "PM_INST_FETCH_CYC", .pme_code = 0x323, .pme_short_desc = "Cycles at least 1 instruction fetched", .pme_long_desc = "Asserted each cycle when the IFU sends at least one instruction to the IDU. ", }, [ POWER4_PME_PM_LSU_LDF ] = { .pme_name = "PM_LSU_LDF", .pme_code = 0x8930, .pme_short_desc = "LSU executed Floating Point load instruction", .pme_long_desc = "LSU executed Floating Point load instruction", }, [ POWER4_PME_PM_INST_DISP ] = { .pme_name = "PM_INST_DISP", .pme_code = 0x221, .pme_short_desc = "Instructions dispatched", .pme_long_desc = "The ISU sends the number of instructions dispatched.", }, [ POWER4_PME_PM_L2SA_MOD_INV ] = { .pme_name = "PM_L2SA_MOD_INV", .pme_code = 0xf07, .pme_short_desc = "L2 slice A transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C.", }, [ POWER4_PME_PM_DATA_FROM_L25_SHR ] = { .pme_name = "PM_DATA_FROM_L25_SHR", .pme_code = 0x5c66, .pme_short_desc = "Data loaded from L2.5 shared", .pme_long_desc = "DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a demand load", }, [ POWER4_PME_PM_FAB_CMD_RETRIED ] = { .pme_name = "PM_FAB_CMD_RETRIED", .pme_code = 0xf17, .pme_short_desc = "Fabric command retried", .pme_long_desc = "A bus command on the MCM to MCM fabric was retried. 
This event is the total count of all retried fabric commands for the local MCM (all four chips report the same value). This event is scaled to the fabric frequency and must be adjusted for a true count. i.e. if the fabric is running 2:1, divide the count by 2.", }, [ POWER4_PME_PM_L1_DCACHE_RELOAD_VALID ] = { .pme_name = "PM_L1_DCACHE_RELOAD_VALID", .pme_code = 0xc64, .pme_short_desc = "L1 reload data source valid", .pme_long_desc = "The data source information is valid", }, [ POWER4_PME_PM_MRK_GRP_ISSUED ] = { .pme_name = "PM_MRK_GRP_ISSUED", .pme_code = 0x6005, .pme_short_desc = "Marked group issued", .pme_long_desc = "A sampled instruction was issued", }, [ POWER4_PME_PM_FPU_FULL_CYC ] = { .pme_name = "PM_FPU_FULL_CYC", .pme_code = 0x5200, .pme_short_desc = "Cycles FPU issue queue full", .pme_long_desc = "Cycles when one or both FPU issue queues are full", }, [ POWER4_PME_PM_FPU_FMA ] = { .pme_name = "PM_FPU_FMA", .pme_code = 0x2100, .pme_short_desc = "FPU executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when FPU is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1", }, [ POWER4_PME_PM_MRK_CRU_FIN ] = { .pme_name = "PM_MRK_CRU_FIN", .pme_code = 0x4005, .pme_short_desc = "Marked instruction CRU processing finished", .pme_long_desc = "The Condition Register Unit finished a marked instruction. 
Instructions that finish may not necessarily complete", }, [ POWER4_PME_PM_MRK_LSU1_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU1_FLUSH_UST", .pme_code = 0x915, .pme_short_desc = "LSU1 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary)", }, [ POWER4_PME_PM_MRK_FXU_FIN ] = { .pme_name = "PM_MRK_FXU_FIN", .pme_code = 0x6004, .pme_short_desc = "Marked instruction FXU processing finished", .pme_long_desc = "One of the Fixed Point Units finished a marked instruction. Instructions that finish may not necessarily complete", }, [ POWER4_PME_PM_BR_ISSUED ] = { .pme_name = "PM_BR_ISSUED", .pme_code = 0x330, .pme_short_desc = "Branches issued", .pme_long_desc = "This signal will be asserted each time the ISU selects a branch instruction to issue.", }, [ POWER4_PME_PM_EE_OFF ] = { .pme_name = "PM_EE_OFF", .pme_code = 0x233, .pme_short_desc = "Cycles MSR(EE) bit off", .pme_long_desc = "The number of Cycles MSR(EE) bit was off.", }, [ POWER4_PME_PM_INST_FROM_L3 ] = { .pme_name = "PM_INST_FROM_L3", .pme_code = 0x5327, .pme_short_desc = "Instruction fetched from L3", .pme_long_desc = "An instruction fetch group was fetched from L3. 
Fetch Groups can contain up to 8 instructions", }, [ POWER4_PME_PM_ITLB_MISS ] = { .pme_name = "PM_ITLB_MISS", .pme_code = 0x900, .pme_short_desc = "Instruction TLB misses", .pme_long_desc = "A TLB miss for an Instruction Fetch has occurred", }, [ POWER4_PME_PM_FXLS_FULL_CYC ] = { .pme_name = "PM_FXLS_FULL_CYC", .pme_code = 0x8210, .pme_short_desc = "Cycles FXLS queue is full", .pme_long_desc = "Cycles when one or both FXU/LSU issue queue are full", }, [ POWER4_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { .pme_name = "PM_FXU1_BUSY_FXU0_IDLE", .pme_code = 0x4002, .pme_short_desc = "FXU1 busy FXU0 idle", .pme_long_desc = "FXU0 was idle while FXU1 was busy", }, [ POWER4_PME_PM_GRP_DISP_VALID ] = { .pme_name = "PM_GRP_DISP_VALID", .pme_code = 0x223, .pme_short_desc = "Group dispatch valid", .pme_long_desc = "Dispatch has been attempted for a valid group. Some groups may be rejected. The total number of successful dispatches is the number of dispatch valid minus dispatch reject.", }, [ POWER4_PME_PM_L2SC_ST_HIT ] = { .pme_name = "PM_L2SC_ST_HIT", .pme_code = 0xf15, .pme_short_desc = "L2 slice C store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C.", }, [ POWER4_PME_PM_MRK_GRP_DISP ] = { .pme_name = "PM_MRK_GRP_DISP", .pme_code = 0x1002, .pme_short_desc = "Marked group dispatched", .pme_long_desc = "A group containing a sampled instruction was dispatched", }, [ POWER4_PME_PM_L2SB_MOD_TAG ] = { .pme_name = "PM_L2SB_MOD_TAG", .pme_code = 0xf22, .pme_short_desc = "L2 slice B transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. 
The event is provided on each of the three slices A,B,and C.", }, [ POWER4_PME_PM_INST_FROM_L25_L275 ] = { .pme_name = "PM_INST_FROM_L25_L275", .pme_code = 0x2327, .pme_short_desc = "Instruction fetched from L2.5/L2.75", .pme_long_desc = "An instruction fetch group was fetched from the L2 of another chip. Fetch Groups can contain up to 8 instructions", }, [ POWER4_PME_PM_LSU_FLUSH_UST ] = { .pme_name = "PM_LSU_FLUSH_UST", .pme_code = 0x2c00, .pme_short_desc = "SRQ unaligned store flushes", .pme_long_desc = "A store was flushed because it was unaligned", }, [ POWER4_PME_PM_L2SB_ST_HIT ] = { .pme_name = "PM_L2SB_ST_HIT", .pme_code = 0xf13, .pme_short_desc = "L2 slice B store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A,B, and C.", }, [ POWER4_PME_PM_FXU1_FIN ] = { .pme_name = "PM_FXU1_FIN", .pme_code = 0x236, .pme_short_desc = "FXU1 produced a result", .pme_long_desc = "The Fixed Point unit 1 finished an instruction and produced a result", }, [ POWER4_PME_PM_L3B1_DIR_MIS ] = { .pme_name = "PM_L3B1_DIR_MIS", .pme_code = 0xf03, .pme_short_desc = "L3 bank 1 directory misses", .pme_long_desc = "A reference was made to the local L3 directory by a local CPU and it missed in the L3. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3", }, [ POWER4_PME_PM_4INST_CLB_CYC ] = { .pme_name = "PM_4INST_CLB_CYC", .pme_code = 0x453, .pme_short_desc = "Cycles 4 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. 
This signal gives a real time history of the number of instruction quads valid in the instruction queue.", }, [ POWER4_PME_PM_GRP_CMPL ] = { .pme_name = "PM_GRP_CMPL", .pme_code = 0x7003, .pme_short_desc = "Group completed", .pme_long_desc = "A group completed. Microcoded instructions that span multiple groups will generate this event once per group.", }, [ POWER4_PME_PM_DC_PREF_L2_CLONE_L3 ] = { .pme_name = "PM_DC_PREF_L2_CLONE_L3", .pme_code = 0xc27, .pme_short_desc = "L2 prefetch cloned with L3", .pme_long_desc = "A prefetch request was made to the L2 with a cloned request sent to the L3", }, [ POWER4_PME_PM_FPU_FRSP_FCONV ] = { .pme_name = "PM_FPU_FRSP_FCONV", .pme_code = 0x7110, .pme_short_desc = "FPU executed FRSP or FCONV instructions", .pme_long_desc = "This signal is active for one cycle when executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1", }, [ POWER4_PME_PM_5INST_CLB_CYC ] = { .pme_name = "PM_5INST_CLB_CYC", .pme_code = 0x454, .pme_short_desc = "Cycles 5 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. 
This signal gives a real time history of the number of instruction quads valid in the instruction queue.", }, [ POWER4_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_SRQ", .pme_code = 0x913, .pme_short_desc = "LSU0 marked SRQ flushes", .pme_long_desc = "A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group.", }, [ POWER4_PME_PM_MRK_LSU_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU_FLUSH_ULD", .pme_code = 0x8910, .pme_short_desc = "Marked unaligned load flushes", .pme_long_desc = "A marked load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", }, [ POWER4_PME_PM_8INST_CLB_CYC ] = { .pme_name = "PM_8INST_CLB_CYC", .pme_code = 0x457, .pme_short_desc = "Cycles 8 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue.", }, [ POWER4_PME_PM_LSU_LMQ_FULL_CYC ] = { .pme_name = "PM_LSU_LMQ_FULL_CYC", .pme_code = 0x927, .pme_short_desc = "Cycles LMQ full", .pme_long_desc = "The LMQ was full", }, [ POWER4_PME_PM_ST_REF_L1_LSU0 ] = { .pme_name = "PM_ST_REF_L1_LSU0", .pme_code = 0xc11, .pme_short_desc = "LSU0 L1 D cache store references", .pme_long_desc = "A store executed on unit 0", }, [ POWER4_PME_PM_LSU0_DERAT_MISS ] = { .pme_name = "PM_LSU0_DERAT_MISS", .pme_code = 0x902, .pme_short_desc = "LSU0 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. 
Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", }, [ POWER4_PME_PM_LSU_SRQ_SYNC_CYC ] = { .pme_name = "PM_LSU_SRQ_SYNC_CYC", .pme_code = 0x932, .pme_short_desc = "SRQ sync duration", .pme_long_desc = "This signal is asserted every cycle when a sync is in the SRQ.", }, [ POWER4_PME_PM_FPU_STALL3 ] = { .pme_name = "PM_FPU_STALL3", .pme_code = 0x2120, .pme_short_desc = "FPU stalled in pipe3", .pme_long_desc = "FPU has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. Combined Unit 0 + Unit 1", }, [ POWER4_PME_PM_MRK_DATA_FROM_L2 ] = { .pme_name = "PM_MRK_DATA_FROM_L2", .pme_code = 0x4c76, .pme_short_desc = "Marked data loaded from L2", .pme_long_desc = "DL1 was reloaded from the local L2 due to a marked demand load", }, [ POWER4_PME_PM_FPU0_FMOV_FEST ] = { .pme_name = "PM_FPU0_FMOV_FEST", .pme_code = 0x110, .pme_short_desc = "FPU0 executed FMOV or FEST instructions", .pme_long_desc = "This signal is active for one cycle when fp0 is executing a move kind of instruction or one of the estimate instructions. 
This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ", }, [ POWER4_PME_PM_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_LSU0_FLUSH_SRQ", .pme_code = 0xc03, .pme_short_desc = "LSU0 SRQ flushes", .pme_long_desc = "A store was flushed because younger load hits and older store that is already in the SRQ or in the same group.", }, [ POWER4_PME_PM_LD_REF_L1_LSU0 ] = { .pme_name = "PM_LD_REF_L1_LSU0", .pme_code = 0xc10, .pme_short_desc = "LSU0 L1 D cache load references", .pme_long_desc = "A load executed on unit 0", }, [ POWER4_PME_PM_L2SC_SHR_INV ] = { .pme_name = "PM_L2SC_SHR_INV", .pme_code = 0xf25, .pme_short_desc = "L2 slice C transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A,B,and C. NOTE: For this event to be useful the tablewalk duration event should also be counted.", }, [ POWER4_PME_PM_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_LSU1_FLUSH_SRQ", .pme_code = 0xc07, .pme_short_desc = "LSU1 SRQ flushes", .pme_long_desc = "A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. 
", }, [ POWER4_PME_PM_LSU_LMQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LMQ_S0_ALLOC", .pme_code = 0x935, .pme_short_desc = "LMQ slot 0 allocated", .pme_long_desc = "The first entry in the LMQ was allocated.", }, [ POWER4_PME_PM_ST_REF_L1 ] = { .pme_name = "PM_ST_REF_L1", .pme_code = 0x7c10, .pme_short_desc = "L1 D cache store references", .pme_long_desc = "Total DL1 Store references", }, [ POWER4_PME_PM_LSU_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_SRQ_EMPTY_CYC", .pme_code = 0x4003, .pme_short_desc = "Cycles SRQ empty", .pme_long_desc = "The Store Request Queue is empty", }, [ POWER4_PME_PM_FPU1_STF ] = { .pme_name = "PM_FPU1_STF", .pme_code = 0x126, .pme_short_desc = "FPU1 executed store instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing a store instruction.", }, [ POWER4_PME_PM_L3B0_DIR_REF ] = { .pme_name = "PM_L3B0_DIR_REF", .pme_code = 0xf00, .pme_short_desc = "L3 bank 0 directory references", .pme_long_desc = "A reference was made to the local L3 directory by a local CPU. Only requests from on-MCM CPUs are counted. This event is scaled to the L3 speed and the count must be scaled. i.e. if the L3 is running 3:1, divide the count by 3", }, [ POWER4_PME_PM_RUN_CYC ] = { .pme_name = "PM_RUN_CYC", .pme_code = 0x1005, .pme_short_desc = "Run cycles", .pme_long_desc = "Processor Cycles gated by the run latch", }, [ POWER4_PME_PM_LSU_LMQ_S0_VALID ] = { .pme_name = "PM_LSU_LMQ_S0_VALID", .pme_code = 0x931, .pme_short_desc = "LMQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ has eight entries that are allocated FIFO", }, [ POWER4_PME_PM_LSU_LRQ_S0_VALID ] = { .pme_name = "PM_LSU_LRQ_S0_VALID", .pme_code = 0xc22, .pme_short_desc = "LRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Load Request Queue slot zero is valid. 
The SRQ is 32 entries long and is allocated round-robin.", }, [ POWER4_PME_PM_LSU0_LDF ] = { .pme_name = "PM_LSU0_LDF", .pme_code = 0x930, .pme_short_desc = "LSU0 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed from LSU unit 0", }, [ POWER4_PME_PM_MRK_IMR_RELOAD ] = { .pme_name = "PM_MRK_IMR_RELOAD", .pme_code = 0x922, .pme_short_desc = "Marked IMR reloaded", .pme_long_desc = "A DL1 reload occurred due to marked load", }, [ POWER4_PME_PM_7INST_CLB_CYC ] = { .pme_name = "PM_7INST_CLB_CYC", .pme_code = 0x456, .pme_short_desc = "Cycles 7 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. This signal gives a real time history of the number of instruction quads valid in the instruction queue.", }, [ POWER4_PME_PM_MRK_GRP_TIMEO ] = { .pme_name = "PM_MRK_GRP_TIMEO", .pme_code = 0x5005, .pme_short_desc = "Marked group completion timeout", .pme_long_desc = "The sampling timeout expired indicating that the previously sampled instruction is no longer in the processor", }, [ POWER4_PME_PM_FPU_FMOV_FEST ] = { .pme_name = "PM_FPU_FMOV_FEST", .pme_code = 0x8110, .pme_short_desc = "FPU executing FMOV or FEST instructions", .pme_long_desc = "This signal is active for one cycle when executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ. 
Combined Unit 0 + Unit 1", }, [ POWER4_PME_PM_GRP_DISP_BLK_SB_CYC ] = { .pme_name = "PM_GRP_DISP_BLK_SB_CYC", .pme_code = 0x231, .pme_short_desc = "Cycles group dispatch blocked by scoreboard", .pme_long_desc = "The ISU sends a signal indicating that dispatch is blocked by scoreboard.", }, [ POWER4_PME_PM_XER_MAP_FULL_CYC ] = { .pme_name = "PM_XER_MAP_FULL_CYC", .pme_code = 0x202, .pme_short_desc = "Cycles XER mapper full", .pme_long_desc = "The ISU sends a signal indicating that the xer mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", }, [ POWER4_PME_PM_ST_MISS_L1 ] = { .pme_name = "PM_ST_MISS_L1", .pme_code = 0xc23, .pme_short_desc = "L1 D cache store misses", .pme_long_desc = "A store missed the dcache", }, [ POWER4_PME_PM_STOP_COMPLETION ] = { .pme_name = "PM_STOP_COMPLETION", .pme_code = 0x3001, .pme_short_desc = "Completion stopped", .pme_long_desc = "RAS Unit has signaled completion to stop", }, [ POWER4_PME_PM_MRK_GRP_CMPL ] = { .pme_name = "PM_MRK_GRP_CMPL", .pme_code = 0x4004, .pme_short_desc = "Marked group completed", .pme_long_desc = "A group containing a sampled instruction completed. 
Microcoded instructions that span multiple groups will generate this event once per group.", }, [ POWER4_PME_PM_ISLB_MISS ] = { .pme_name = "PM_ISLB_MISS", .pme_code = 0x901, .pme_short_desc = "Instruction SLB misses", .pme_long_desc = "An SLB miss for an instruction fetch has occurred", }, [ POWER4_PME_PM_CYC ] = { .pme_name = "PM_CYC", .pme_code = 0x7, .pme_short_desc = "Processor cycles", .pme_long_desc = "Processor cycles", }, [ POWER4_PME_PM_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_LD_MISS_L1_LSU1", .pme_code = 0xc16, .pme_short_desc = "LSU1 L1 D cache load misses", .pme_long_desc = "A load, executing on unit 1, missed the dcache", }, [ POWER4_PME_PM_STCX_FAIL ] = { .pme_name = "PM_STCX_FAIL", .pme_code = 0x921, .pme_short_desc = "STCX failed", .pme_long_desc = "A stcx (stwcx or stdcx) failed", }, [ POWER4_PME_PM_LSU1_SRQ_STFWD ] = { .pme_name = "PM_LSU1_SRQ_STFWD", .pme_code = 0xc24, .pme_short_desc = "LSU1 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 1", }, [ POWER4_PME_PM_GRP_DISP ] = { .pme_name = "PM_GRP_DISP", .pme_code = 0x2004, .pme_short_desc = "Group dispatches", .pme_long_desc = "A group was dispatched", }, [ POWER4_PME_PM_DATA_FROM_L2 ] = { .pme_name = "PM_DATA_FROM_L2", .pme_code = 0x4c66, .pme_short_desc = "Data loaded from L2", .pme_long_desc = "DL1 was reloaded from the local L2 due to a demand load", }, [ POWER4_PME_PM_L2_PREF ] = { .pme_name = "PM_L2_PREF", .pme_code = 0xc34, .pme_short_desc = "L2 cache prefetches", .pme_long_desc = "A request to prefetch data into L2 was made", }, [ POWER4_PME_PM_FPU0_FPSCR ] = { .pme_name = "PM_FPU0_FPSCR", .pme_code = 0x130, .pme_short_desc = "FPU0 executed FPSCR instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*. 
mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs", }, [ POWER4_PME_PM_FPU1_DENORM ] = { .pme_name = "PM_FPU1_DENORM", .pme_code = 0x124, .pme_short_desc = "FPU1 received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized.", }, [ POWER4_PME_PM_MRK_DATA_FROM_L25_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L25_MOD", .pme_code = 0x8c76, .pme_short_desc = "Marked data loaded from L2.5 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a marked demand load", }, [ POWER4_PME_PM_L2SB_ST_REQ ] = { .pme_name = "PM_L2SB_ST_REQ", .pme_code = 0xf12, .pme_short_desc = "L2 slice B store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A,B,and C.", }, [ POWER4_PME_PM_L2SB_MOD_INV ] = { .pme_name = "PM_L2SB_MOD_INV", .pme_code = 0xf23, .pme_short_desc = "L2 slice B transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C.", }, [ POWER4_PME_PM_FPU0_FSQRT ] = { .pme_name = "PM_FPU0_FSQRT", .pme_code = 0x102, .pme_short_desc = "FPU0 executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp0 is executing a square root instruction. 
This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER4_PME_PM_LD_REF_L1 ] = { .pme_name = "PM_LD_REF_L1", .pme_code = 0x8c10, .pme_short_desc = "L1 D cache load references", .pme_long_desc = "Total DL1 Load references", }, [ POWER4_PME_PM_MRK_L1_RELOAD_VALID ] = { .pme_name = "PM_MRK_L1_RELOAD_VALID", .pme_code = 0xc74, .pme_short_desc = "Marked L1 reload data source valid", .pme_long_desc = "The source information is valid and is for a marked load", }, [ POWER4_PME_PM_L2SB_SHR_MOD ] = { .pme_name = "PM_L2SB_SHR_MOD", .pme_code = 0xf20, .pme_short_desc = "L2 slice B transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. ", }, [ POWER4_PME_PM_INST_FROM_L1 ] = { .pme_name = "PM_INST_FROM_L1", .pme_code = 0x6327, .pme_short_desc = "Instruction fetched from L1", .pme_long_desc = "An instruction fetch group was fetched from L1. Fetch Groups can contain up to 8 instructions", }, [ POWER4_PME_PM_1PLUS_PPC_CMPL ] = { .pme_name = "PM_1PLUS_PPC_CMPL", .pme_code = 0x5003, .pme_short_desc = "One or more PPC instruction completed", .pme_long_desc = "A group containing at least one PPC instruction completed. 
For microcoded instructions that span multiple groups, this will only occur once.", }, [ POWER4_PME_PM_EE_OFF_EXT_INT ] = { .pme_name = "PM_EE_OFF_EXT_INT", .pme_code = 0x237, .pme_short_desc = "Cycles MSR(EE) bit off and external interrupt pending", .pme_long_desc = "Cycles MSR(EE) bit off and external interrupt pending", }, [ POWER4_PME_PM_L2SC_SHR_MOD ] = { .pme_name = "PM_L2SC_SHR_MOD", .pme_code = 0xf24, .pme_short_desc = "L2 slice C transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A,B,and C. ", }, [ POWER4_PME_PM_LSU_LRQ_FULL_CYC ] = { .pme_name = "PM_LSU_LRQ_FULL_CYC", .pme_code = 0x212, .pme_short_desc = "Cycles LRQ full", .pme_long_desc = "The isu sends this signal when the lrq is full.", }, [ POWER4_PME_PM_IC_PREF_INSTALL ] = { .pme_name = "PM_IC_PREF_INSTALL", .pme_code = 0x325, .pme_short_desc = "Instruction prefetched installed in prefetch buffer", .pme_long_desc = "This signal is asserted when a prefetch buffer entry (line) is allocated but the request is not a demand fetch.", }, [ POWER4_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_SRQ", .pme_code = 0x917, .pme_short_desc = "LSU1 marked SRQ flushes", .pme_long_desc = "A marked store was flushed because a younger load hits an older store that is already in the SRQ or in the same group.", }, [ POWER4_PME_PM_GCT_FULL_CYC ] = { .pme_name = "PM_GCT_FULL_CYC", .pme_code = 0x200, .pme_short_desc = "Cycles GCT full", .pme_long_desc = "The ISU sends a signal indicating the gct is full. 
", }, [ POWER4_PME_PM_INST_FROM_MEM ] = { .pme_name = "PM_INST_FROM_MEM", .pme_code = 0x1327, .pme_short_desc = "Instruction fetched from memory", .pme_long_desc = "An instruction fetch group was fetched from memory. Fetch Groups can contain up to 8 instructions", }, [ POWER4_PME_PM_FXU_BUSY ] = { .pme_name = "PM_FXU_BUSY", .pme_code = 0x6002, .pme_short_desc = "FXU busy", .pme_long_desc = "FXU0 and FXU1 are both busy", }, [ POWER4_PME_PM_ST_REF_L1_LSU1 ] = { .pme_name = "PM_ST_REF_L1_LSU1", .pme_code = 0xc15, .pme_short_desc = "LSU1 L1 D cache store references", .pme_long_desc = "A store executed on unit 1", }, [ POWER4_PME_PM_MRK_LD_MISS_L1 ] = { .pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 0x1920, .pme_short_desc = "Marked L1 D cache load misses", .pme_long_desc = "Marked L1 D cache load misses", }, [ POWER4_PME_PM_MRK_LSU1_INST_FIN ] = { .pme_name = "PM_MRK_LSU1_INST_FIN", .pme_code = 0xc32, .pme_short_desc = "LSU1 finished a marked instruction", .pme_long_desc = "LSU unit 1 finished a marked instruction", }, [ POWER4_PME_PM_L1_WRITE_CYC ] = { .pme_name = "PM_L1_WRITE_CYC", .pme_code = 0x333, .pme_short_desc = "Cycles writing to instruction L1", .pme_long_desc = "This signal is asserted each cycle a cache write is active.", }, [ POWER4_PME_PM_BIQ_IDU_FULL_CYC ] = { .pme_name = "PM_BIQ_IDU_FULL_CYC", .pme_code = 0x324, .pme_short_desc = "Cycles BIQ or IDU full", .pme_long_desc = "This signal will be asserted each time either the IDU is full or the BIQ is full.", }, [ POWER4_PME_PM_MRK_LSU0_INST_FIN ] = { .pme_name = "PM_MRK_LSU0_INST_FIN", .pme_code = 0xc31, .pme_short_desc = "LSU0 finished a marked instruction", .pme_long_desc = "LSU unit 0 finished a marked instruction", }, [ POWER4_PME_PM_L2SC_ST_REQ ] = { .pme_name = "PM_L2SC_ST_REQ", .pme_code = 0xf14, .pme_short_desc = "L2 slice C store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. 
The event is provided on each of the three slices A,B,and C.", }, [ POWER4_PME_PM_LSU1_BUSY ] = { .pme_name = "PM_LSU1_BUSY", .pme_code = 0xc37, .pme_short_desc = "LSU1 busy", .pme_long_desc = "LSU unit 1 is busy rejecting instructions ", }, [ POWER4_PME_PM_FPU_ALL ] = { .pme_name = "PM_FPU_ALL", .pme_code = 0x5100, .pme_short_desc = "FPU executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "This signal is active for one cycle when FPU is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo. Combined Unit 0 + Unit 1", }, [ POWER4_PME_PM_LSU_SRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_SRQ_S0_ALLOC", .pme_code = 0xc25, .pme_short_desc = "SRQ slot 0 allocated", .pme_long_desc = "SRQ Slot zero was allocated", }, [ POWER4_PME_PM_GRP_MRK ] = { .pme_name = "PM_GRP_MRK", .pme_code = 0x5004, .pme_short_desc = "Group marked in IDU", .pme_long_desc = "A group was sampled (marked)", }, [ POWER4_PME_PM_FPU1_FIN ] = { .pme_name = "PM_FPU1_FIN", .pme_code = 0x117, .pme_short_desc = "FPU1 produced a result", .pme_long_desc = "fp1 finished, produced a result. This only indicates finish, not completion. ", }, [ POWER4_PME_PM_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_DC_PREF_STREAM_ALLOC", .pme_code = 0x907, .pme_short_desc = "D cache new prefetch stream allocated", .pme_long_desc = "A new Prefetch Stream was allocated", }, [ POWER4_PME_PM_BR_MPRED_CR ] = { .pme_name = "PM_BR_MPRED_CR", .pme_code = 0x331, .pme_short_desc = "Branch mispredictions due to CR bit setting", .pme_long_desc = "This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. 
This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction.", }, [ POWER4_PME_PM_BR_MPRED_TA ] = { .pme_name = "PM_BR_MPRED_TA", .pme_code = 0x332, .pme_short_desc = "Branch mispredictions due to target address", .pme_long_desc = "Branch mispredict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction.", }, [ POWER4_PME_PM_CRQ_FULL_CYC ] = { .pme_name = "PM_CRQ_FULL_CYC", .pme_code = 0x211, .pme_short_desc = "Cycles CR issue queue full", .pme_long_desc = "The ISU sends a signal indicating that the issue queue that feeds the ifu cr unit cannot accept any more groups (queue is full of groups).", }, [ POWER4_PME_PM_INST_FROM_PREF ] = { .pme_name = "PM_INST_FROM_PREF", .pme_code = 0x7327, .pme_short_desc = "Instructions fetched from prefetch", .pme_long_desc = "An instruction fetch group was fetched from the prefetch buffer. 
Fetch Groups can contain up to 8 instructions", }, [ POWER4_PME_PM_LD_MISS_L1 ] = { .pme_name = "PM_LD_MISS_L1", .pme_code = 0x3c10, .pme_short_desc = "L1 D cache load misses", .pme_long_desc = "Total DL1 Load references that miss the DL1", }, [ POWER4_PME_PM_STCX_PASS ] = { .pme_name = "PM_STCX_PASS", .pme_code = 0xc75, .pme_short_desc = "Stcx passes", .pme_long_desc = "A stcx (stwcx or stdcx) instruction was successful", }, [ POWER4_PME_PM_DC_INV_L2 ] = { .pme_name = "PM_DC_INV_L2", .pme_code = 0xc17, .pme_short_desc = "L1 D cache entries invalidated from L2", .pme_long_desc = "A dcache invalidate was received from the L2 because a line in L2 was castout.", }, [ POWER4_PME_PM_LSU_SRQ_FULL_CYC ] = { .pme_name = "PM_LSU_SRQ_FULL_CYC", .pme_code = 0x213, .pme_short_desc = "Cycles SRQ full", .pme_long_desc = "The isu sends this signal when the srq is full.", }, [ POWER4_PME_PM_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_LSU0_FLUSH_LRQ", .pme_code = 0xc02, .pme_short_desc = "LSU0 LRQ flushes", .pme_long_desc = "A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER4_PME_PM_LSU_SRQ_S0_VALID ] = { .pme_name = "PM_LSU_SRQ_S0_VALID", .pme_code = 0xc21, .pme_short_desc = "SRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Store Request Queue slot zero is valid. 
The SRQ is 32 entries long and is allocated round-robin.", }, [ POWER4_PME_PM_LARX_LSU0 ] = { .pme_name = "PM_LARX_LSU0", .pme_code = 0xc73, .pme_short_desc = "Larx executed on LSU0", .pme_long_desc = "A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0)", }, [ POWER4_PME_PM_GCT_EMPTY_CYC ] = { .pme_name = "PM_GCT_EMPTY_CYC", .pme_code = 0x1004, .pme_short_desc = "Cycles GCT empty", .pme_long_desc = "The Global Completion Table is completely empty", }, [ POWER4_PME_PM_FPU1_ALL ] = { .pme_name = "PM_FPU1_ALL", .pme_code = 0x107, .pme_short_desc = "FPU1 executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo", }, [ POWER4_PME_PM_FPU1_FSQRT ] = { .pme_name = "PM_FPU1_FSQRT", .pme_code = 0x106, .pme_short_desc = "FPU1 executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp1 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER4_PME_PM_FPU_FIN ] = { .pme_name = "PM_FPU_FIN", .pme_code = 0x4110, .pme_short_desc = "FPU produced a result", .pme_long_desc = "FPU finished, produced a result. This only indicates finish, not completion. Combined Unit 0 + Unit 1", }, [ POWER4_PME_PM_L2SA_SHR_MOD ] = { .pme_name = "PM_L2SA_SHR_MOD", .pme_code = 0xf04, .pme_short_desc = "L2 slice A transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. 
The event is provided on each of the three slices A,B,and C. ", }, [ POWER4_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU1", .pme_code = 0x924, .pme_short_desc = "LSU1 L1 D cache load misses", .pme_long_desc = "A marked load, executing on unit 1, missed the dcache", }, [ POWER4_PME_PM_LSU_SRQ_STFWD ] = { .pme_name = "PM_LSU_SRQ_STFWD", .pme_code = 0x1c20, .pme_short_desc = "SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load", }, [ POWER4_PME_PM_FXU0_FIN ] = { .pme_name = "PM_FXU0_FIN", .pme_code = 0x232, .pme_short_desc = "FXU0 produced a result", .pme_long_desc = "The Fixed Point unit 0 finished an instruction and produced a result", }, [ POWER4_PME_PM_MRK_FPU_FIN ] = { .pme_name = "PM_MRK_FPU_FIN", .pme_code = 0x7004, .pme_short_desc = "Marked instruction FPU processing finished", .pme_long_desc = "One of the Floating Point Units finished a marked instruction. Instructions that finish may not necessarily complete", }, [ POWER4_PME_PM_LSU_BUSY ] = { .pme_name = "PM_LSU_BUSY", .pme_code = 0x4c30, .pme_short_desc = "LSU busy", .pme_long_desc = "LSU (unit 0 + unit 1) is busy rejecting instructions ", }, [ POWER4_PME_PM_INST_FROM_L35 ] = { .pme_name = "PM_INST_FROM_L35", .pme_code = 0x4327, .pme_short_desc = "Instructions fetched from L3.5", .pme_long_desc = "An instruction fetch group was fetched from the L3 of another module. Fetch Groups can contain up to 8 instructions", }, [ POWER4_PME_PM_FPU1_FRSP_FCONV ] = { .pme_name = "PM_FPU1_FRSP_FCONV", .pme_code = 0x115, .pme_short_desc = "FPU1 executed FRSP or FCONV instructions", .pme_long_desc = "This signal is active for one cycle when fp1 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER4_PME_PM_SNOOP_TLBIE ] = { .pme_name = "PM_SNOOP_TLBIE", .pme_code = 0x903, .pme_short_desc = "Snoop TLBIE", .pme_long_desc = "A TLB miss for a data request occurred. 
Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction.", }, [ POWER4_PME_PM_FPU0_FDIV ] = { .pme_name = "PM_FPU0_FDIV", .pme_code = 0x100, .pme_short_desc = "FPU0 executed FDIV instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp0 is executing a divide instruction. This could be fdiv, fdivs, fdiv., fdivs.", }, [ POWER4_PME_PM_LD_REF_L1_LSU1 ] = { .pme_name = "PM_LD_REF_L1_LSU1", .pme_code = 0xc14, .pme_short_desc = "LSU1 L1 D cache load references", .pme_long_desc = "A load executed on unit 1", }, [ POWER4_PME_PM_MRK_DATA_FROM_L275_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L275_MOD", .pme_code = 0x7c76, .pme_short_desc = "Marked data loaded from L2.75 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of another MCM due to a marked demand load. ", }, [ POWER4_PME_PM_HV_CYC ] = { .pme_name = "PM_HV_CYC", .pme_code = 0x3004, .pme_short_desc = "Hypervisor Cycles", .pme_long_desc = "Cycles when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0)", }, [ POWER4_PME_PM_6INST_CLB_CYC ] = { .pme_name = "PM_6INST_CLB_CYC", .pme_code = 0x455, .pme_short_desc = "Cycles 6 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is an 8-deep, 4-wide instruction buffer. Fullness is indicated in the 8 valid bits associated with each of the 4-wide slots with full(0) corresponding to the number of cycles there are 8 instructions in the queue and full (7) corresponding to the number of cycles there is 1 instruction in the queue. 
This signal gives a real time history of the number of instruction quads valid in the instruction queue.", }, [ POWER4_PME_PM_LR_CTR_MAP_FULL_CYC ] = { .pme_name = "PM_LR_CTR_MAP_FULL_CYC", .pme_code = 0x206, .pme_short_desc = "Cycles LR/CTR mapper full", .pme_long_desc = "The ISU sends a signal indicating that the lr/ctr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", }, [ POWER4_PME_PM_L2SC_MOD_INV ] = { .pme_name = "PM_L2SC_MOD_INV", .pme_code = 0xf27, .pme_short_desc = "L2 slice C transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C.", }, [ POWER4_PME_PM_FPU_DENORM ] = { .pme_name = "PM_FPU_DENORM", .pme_code = 0x1120, .pme_short_desc = "FPU received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized. Combined Unit 0 + Unit 1", }, [ POWER4_PME_PM_DATA_FROM_L275_MOD ] = { .pme_name = "PM_DATA_FROM_L275_MOD", .pme_code = 0x7c66, .pme_short_desc = "Data loaded from L2.75 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of another MCM due to a demand load. ", }, [ POWER4_PME_PM_LSU1_DERAT_MISS ] = { .pme_name = "PM_LSU1_DERAT_MISS", .pme_code = 0x906, .pme_short_desc = "LSU1 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. 
Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", }, [ POWER4_PME_PM_IC_PREF_REQ ] = { .pme_name = "PM_IC_PREF_REQ", .pme_code = 0x326, .pme_short_desc = "Instruction prefetch requests", .pme_long_desc = "Asserted when a non-canceled prefetch is made to the cache interface unit (CIU).", }, [ POWER4_PME_PM_MRK_LSU_FIN ] = { .pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x8004, .pme_short_desc = "Marked instruction LSU processing finished", .pme_long_desc = "One of the Load/Store Units finished a marked instruction. Instructions that finish may not necessarily complete", }, [ POWER4_PME_PM_MRK_DATA_FROM_L3 ] = { .pme_name = "PM_MRK_DATA_FROM_L3", .pme_code = 0x1c76, .pme_short_desc = "Marked data loaded from L3", .pme_long_desc = "DL1 was reloaded from the local L3 due to a marked demand load", }, [ POWER4_PME_PM_MRK_DATA_FROM_MEM ] = { .pme_name = "PM_MRK_DATA_FROM_MEM", .pme_code = 0x2c76, .pme_short_desc = "Marked data loaded from memory", .pme_long_desc = "DL1 was reloaded from memory due to a marked demand load", }, [ POWER4_PME_PM_LSU0_FLUSH_UST ] = { .pme_name = "PM_LSU0_FLUSH_UST", .pme_code = 0xc01, .pme_short_desc = "LSU0 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 0 because it was unaligned (crossed a 4k boundary)", }, [ POWER4_PME_PM_LSU_FLUSH_LRQ ] = { .pme_name = "PM_LSU_FLUSH_LRQ", .pme_code = 0x6c00, .pme_short_desc = "LRQ flushes", .pme_long_desc = "A load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER4_PME_PM_LSU_FLUSH_SRQ ] = { .pme_name = "PM_LSU_FLUSH_SRQ", .pme_code = 0x5c00, .pme_short_desc = "SRQ flushes", .pme_long_desc = "A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group.", }, [ POWER4_PME_PM_L2SC_MOD_TAG ] = { .pme_name = 
"PM_L2SC_MOD_TAG", .pme_code = 0xf26, .pme_short_desc = "L2 slice C transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A,B,and C.", } }; #endif papi-papi-7-2-0-t/src/libpfm4/lib/events/power5+_events.h000066400000000000000000005374471502707512200231130ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __POWER5p_EVENTS_H__ #define __POWER5p_EVENTS_H__ /* * File: power5+_events.h * CVS: * Author: Corey Ashford * cjashfor@us.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2009. All Rights Reserved. * Contributed by Corey Ashford * * Note: This code was automatically generated and should not be modified by * hand. * */ #define POWER5p_PME_PM_LSU_REJECT_RELOAD_CDF 0 #define POWER5p_PME_PM_FPU1_SINGLE 1 #define POWER5p_PME_PM_L3SB_REF 2 #define POWER5p_PME_PM_THRD_PRIO_DIFF_3or4_CYC 3 #define POWER5p_PME_PM_INST_FROM_L275_SHR 4 #define POWER5p_PME_PM_MRK_DATA_FROM_L375_MOD 5 #define POWER5p_PME_PM_DTLB_MISS_4K 6 #define POWER5p_PME_PM_CLB_FULL_CYC 7 #define POWER5p_PME_PM_MRK_ST_CMPL 8 #define POWER5p_PME_PM_LSU_FLUSH_LRQ_FULL 9 #define POWER5p_PME_PM_MRK_DATA_FROM_L275_SHR 10 #define POWER5p_PME_PM_1INST_CLB_CYC 11 #define POWER5p_PME_PM_MEM_SPEC_RD_CANCEL 12 #define POWER5p_PME_PM_MRK_DTLB_MISS_16M 13 #define POWER5p_PME_PM_FPU_FDIV 14 #define POWER5p_PME_PM_FPU_SINGLE 15 #define POWER5p_PME_PM_FPU0_FMA 16 #define POWER5p_PME_PM_SLB_MISS 17 #define POWER5p_PME_PM_LSU1_FLUSH_LRQ 18 #define POWER5p_PME_PM_L2SA_ST_HIT 19 #define POWER5p_PME_PM_DTLB_MISS 20 #define POWER5p_PME_PM_BR_PRED_TA 21 #define POWER5p_PME_PM_MRK_DATA_FROM_L375_MOD_CYC 22 #define POWER5p_PME_PM_CMPLU_STALL_FXU 23 #define POWER5p_PME_PM_EXT_INT 24 #define 
POWER5p_PME_PM_MRK_LSU1_FLUSH_LRQ 25 #define POWER5p_PME_PM_MRK_ST_GPS 26 #define POWER5p_PME_PM_LSU1_LDF 27 #define POWER5p_PME_PM_FAB_CMD_ISSUED 28 #define POWER5p_PME_PM_LSU0_SRQ_STFWD 29 #define POWER5p_PME_PM_CR_MAP_FULL_CYC 30 #define POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_RC_FULL 31 #define POWER5p_PME_PM_MRK_LSU0_FLUSH_ULD 32 #define POWER5p_PME_PM_LSU_FLUSH_SRQ_FULL 33 #define POWER5p_PME_PM_MEM_RQ_DISP_Q16to19 34 #define POWER5p_PME_PM_FLUSH_IMBAL 35 #define POWER5p_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC 36 #define POWER5p_PME_PM_DATA_FROM_L35_MOD 37 #define POWER5p_PME_PM_MEM_HI_PRIO_WR_CMPL 38 #define POWER5p_PME_PM_FPU1_FDIV 39 #define POWER5p_PME_PM_MEM_RQ_DISP 40 #define POWER5p_PME_PM_FPU0_FRSP_FCONV 41 #define POWER5p_PME_PM_LWSYNC_HELD 42 #define POWER5p_PME_PM_FXU_FIN 43 #define POWER5p_PME_PM_DSLB_MISS 44 #define POWER5p_PME_PM_DATA_FROM_L275_SHR 45 #define POWER5p_PME_PM_FXLS1_FULL_CYC 46 #define POWER5p_PME_PM_THRD_SEL_T0 47 #define POWER5p_PME_PM_PTEG_RELOAD_VALID 48 #define POWER5p_PME_PM_MRK_STCX_FAIL 49 #define POWER5p_PME_PM_LSU_LMQ_LHR_MERGE 50 #define POWER5p_PME_PM_2INST_CLB_CYC 51 #define POWER5p_PME_PM_FAB_PNtoVN_DIRECT 52 #define POWER5p_PME_PM_PTEG_FROM_L2MISS 53 #define POWER5p_PME_PM_CMPLU_STALL_LSU 54 #define POWER5p_PME_PM_MRK_DSLB_MISS 55 #define POWER5p_PME_PM_LSU_FLUSH_ULD 56 #define POWER5p_PME_PM_PTEG_FROM_LMEM 57 #define POWER5p_PME_PM_MRK_BRU_FIN 58 #define POWER5p_PME_PM_MEM_WQ_DISP_WRITE 59 #define POWER5p_PME_PM_MRK_DATA_FROM_L275_MOD_CYC 60 #define POWER5p_PME_PM_LSU1_NCLD 61 #define POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_OTHER 62 #define POWER5p_PME_PM_SNOOP_PW_RETRY_WQ_PWQ 63 #define POWER5p_PME_PM_FPU1_FULL_CYC 64 #define POWER5p_PME_PM_FPR_MAP_FULL_CYC 65 #define POWER5p_PME_PM_L3SA_ALL_BUSY 66 #define POWER5p_PME_PM_3INST_CLB_CYC 67 #define POWER5p_PME_PM_MEM_PWQ_DISP_Q2or3 68 #define POWER5p_PME_PM_L2SA_SHR_INV 69 #define POWER5p_PME_PM_THRESH_TIMEO 70 #define POWER5p_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL 71 #define 
POWER5p_PME_PM_THRD_SEL_OVER_GCT_IMBAL 72 #define POWER5p_PME_PM_FPU_FSQRT 73 #define POWER5p_PME_PM_PMC1_OVERFLOW 74 #define POWER5p_PME_PM_MRK_LSU0_FLUSH_LRQ 75 #define POWER5p_PME_PM_L3SC_SNOOP_RETRY 76 #define POWER5p_PME_PM_DATA_TABLEWALK_CYC 77 #define POWER5p_PME_PM_THRD_PRIO_6_CYC 78 #define POWER5p_PME_PM_FPU_FEST 79 #define POWER5p_PME_PM_FAB_M1toP1_SIDECAR_EMPTY 80 #define POWER5p_PME_PM_MRK_DATA_FROM_RMEM 81 #define POWER5p_PME_PM_MRK_DATA_FROM_L35_MOD_CYC 82 #define POWER5p_PME_PM_MEM_PWQ_DISP 83 #define POWER5p_PME_PM_FAB_P1toM1_SIDECAR_EMPTY 84 #define POWER5p_PME_PM_LD_MISS_L1_LSU0 85 #define POWER5p_PME_PM_SNOOP_PARTIAL_RTRY_QFULL 86 #define POWER5p_PME_PM_FPU1_STALL3 87 #define POWER5p_PME_PM_GCT_USAGE_80to99_CYC 88 #define POWER5p_PME_PM_WORK_HELD 89 #define POWER5p_PME_PM_INST_CMPL 90 #define POWER5p_PME_PM_LSU1_FLUSH_UST 91 #define POWER5p_PME_PM_FXU_IDLE 92 #define POWER5p_PME_PM_LSU0_FLUSH_ULD 93 #define POWER5p_PME_PM_LSU1_REJECT_LMQ_FULL 94 #define POWER5p_PME_PM_GRP_DISP_REJECT 95 #define POWER5p_PME_PM_PTEG_FROM_L25_SHR 96 #define POWER5p_PME_PM_L2SA_MOD_INV 97 #define POWER5p_PME_PM_FAB_CMD_RETRIED 98 #define POWER5p_PME_PM_L3SA_SHR_INV 99 #define POWER5p_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL 100 #define POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_ADDR 101 #define POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_RC_FULL 102 #define POWER5p_PME_PM_PTEG_FROM_L375_MOD 103 #define POWER5p_PME_PM_MRK_LSU1_FLUSH_UST 104 #define POWER5p_PME_PM_BR_ISSUED 105 #define POWER5p_PME_PM_MRK_GRP_BR_REDIR 106 #define POWER5p_PME_PM_EE_OFF 107 #define POWER5p_PME_PM_IERAT_XLATE_WR_LP 108 #define POWER5p_PME_PM_DTLB_REF_64K 109 #define POWER5p_PME_PM_MEM_RQ_DISP_Q4to7 110 #define POWER5p_PME_PM_MEM_FAST_PATH_RD_DISP 111 #define POWER5p_PME_PM_INST_FROM_L3 112 #define POWER5p_PME_PM_ITLB_MISS 113 #define POWER5p_PME_PM_FXU1_BUSY_FXU0_IDLE 114 #define POWER5p_PME_PM_DTLB_REF_4K 115 #define POWER5p_PME_PM_FXLS_FULL_CYC 116 #define POWER5p_PME_PM_GRP_DISP_VALID 117 #define 
POWER5p_PME_PM_LSU_FLUSH_UST 118 #define POWER5p_PME_PM_FXU1_FIN 119 #define POWER5p_PME_PM_THRD_PRIO_4_CYC 120 #define POWER5p_PME_PM_MRK_DATA_FROM_L35_MOD 121 #define POWER5p_PME_PM_4INST_CLB_CYC 122 #define POWER5p_PME_PM_MRK_DTLB_REF_16M 123 #define POWER5p_PME_PM_INST_FROM_L375_MOD 124 #define POWER5p_PME_PM_GRP_CMPL 125 #define POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_ADDR 126 #define POWER5p_PME_PM_FPU1_1FLOP 127 #define POWER5p_PME_PM_FPU_FRSP_FCONV 128 #define POWER5p_PME_PM_L3SC_REF 129 #define POWER5p_PME_PM_5INST_CLB_CYC 130 #define POWER5p_PME_PM_THRD_L2MISS_BOTH_CYC 131 #define POWER5p_PME_PM_MEM_PW_GATH 132 #define POWER5p_PME_PM_DTLB_REF_16G 133 #define POWER5p_PME_PM_FAB_DCLAIM_ISSUED 134 #define POWER5p_PME_PM_FAB_PNtoNN_SIDECAR 135 #define POWER5p_PME_PM_GRP_IC_MISS 136 #define POWER5p_PME_PM_INST_FROM_L35_SHR 137 #define POWER5p_PME_PM_LSU_LMQ_FULL_CYC 138 #define POWER5p_PME_PM_MRK_DATA_FROM_L2_CYC 139 #define POWER5p_PME_PM_LSU_SRQ_SYNC_CYC 140 #define POWER5p_PME_PM_LSU0_BUSY_REJECT 141 #define POWER5p_PME_PM_LSU_REJECT_ERAT_MISS 142 #define POWER5p_PME_PM_MRK_DATA_FROM_RMEM_CYC 143 #define POWER5p_PME_PM_DATA_FROM_L375_SHR 144 #define POWER5p_PME_PM_PTEG_FROM_L25_MOD 145 #define POWER5p_PME_PM_FPU0_FMOV_FEST 146 #define POWER5p_PME_PM_THRD_PRIO_7_CYC 147 #define POWER5p_PME_PM_LSU1_FLUSH_SRQ 148 #define POWER5p_PME_PM_LD_REF_L1_LSU0 149 #define POWER5p_PME_PM_L2SC_RCST_DISP 150 #define POWER5p_PME_PM_CMPLU_STALL_DIV 151 #define POWER5p_PME_PM_MEM_RQ_DISP_Q12to15 152 #define POWER5p_PME_PM_INST_FROM_L375_SHR 153 #define POWER5p_PME_PM_ST_REF_L1 154 #define POWER5p_PME_PM_L3SB_ALL_BUSY 155 #define POWER5p_PME_PM_FAB_P1toVNorNN_SIDECAR_EMPTY 156 #define POWER5p_PME_PM_MRK_DATA_FROM_L275_SHR_CYC 157 #define POWER5p_PME_PM_FAB_HOLDtoNN_EMPTY 158 #define POWER5p_PME_PM_DATA_FROM_LMEM 159 #define POWER5p_PME_PM_RUN_CYC 160 #define POWER5p_PME_PM_PTEG_FROM_RMEM 161 #define POWER5p_PME_PM_L2SC_RCLD_DISP 162 #define POWER5p_PME_PM_LSU_LRQ_S0_VALID 163 
#define POWER5p_PME_PM_LSU0_LDF 164 #define POWER5p_PME_PM_PMC3_OVERFLOW 165 #define POWER5p_PME_PM_MRK_IMR_RELOAD 166 #define POWER5p_PME_PM_MRK_GRP_TIMEO 167 #define POWER5p_PME_PM_ST_MISS_L1 168 #define POWER5p_PME_PM_STOP_COMPLETION 169 #define POWER5p_PME_PM_LSU_BUSY_REJECT 170 #define POWER5p_PME_PM_ISLB_MISS 171 #define POWER5p_PME_PM_CYC 172 #define POWER5p_PME_PM_THRD_ONE_RUN_CYC 173 #define POWER5p_PME_PM_GRP_BR_REDIR_NONSPEC 174 #define POWER5p_PME_PM_LSU1_SRQ_STFWD 175 #define POWER5p_PME_PM_L3SC_MOD_INV 176 #define POWER5p_PME_PM_L2_PREF 177 #define POWER5p_PME_PM_GCT_NOSLOT_BR_MPRED 178 #define POWER5p_PME_PM_MRK_DATA_FROM_L25_MOD 179 #define POWER5p_PME_PM_L2SB_ST_REQ 180 #define POWER5p_PME_PM_L2SB_MOD_INV 181 #define POWER5p_PME_PM_MRK_L1_RELOAD_VALID 182 #define POWER5p_PME_PM_L3SB_HIT 183 #define POWER5p_PME_PM_L2SB_SHR_MOD 184 #define POWER5p_PME_PM_EE_OFF_EXT_INT 185 #define POWER5p_PME_PM_1PLUS_PPC_CMPL 186 #define POWER5p_PME_PM_L2SC_SHR_MOD 187 #define POWER5p_PME_PM_PMC6_OVERFLOW 188 #define POWER5p_PME_PM_IC_PREF_INSTALL 189 #define POWER5p_PME_PM_LSU_LRQ_FULL_CYC 190 #define POWER5p_PME_PM_TLB_MISS 191 #define POWER5p_PME_PM_GCT_FULL_CYC 192 #define POWER5p_PME_PM_FXU_BUSY 193 #define POWER5p_PME_PM_MRK_DATA_FROM_L3_CYC 194 #define POWER5p_PME_PM_LSU_REJECT_LMQ_FULL 195 #define POWER5p_PME_PM_LSU_SRQ_S0_ALLOC 196 #define POWER5p_PME_PM_GRP_MRK 197 #define POWER5p_PME_PM_INST_FROM_L25_SHR 198 #define POWER5p_PME_PM_DC_PREF_STREAM_ALLOC 199 #define POWER5p_PME_PM_FPU1_FIN 200 #define POWER5p_PME_PM_BR_MPRED_TA 201 #define POWER5p_PME_PM_MRK_DTLB_REF_64K 202 #define POWER5p_PME_PM_RUN_INST_CMPL 203 #define POWER5p_PME_PM_CRQ_FULL_CYC 204 #define POWER5p_PME_PM_L2SA_RCLD_DISP 205 #define POWER5p_PME_PM_SNOOP_WR_RETRY_QFULL 206 #define POWER5p_PME_PM_MRK_DTLB_REF_4K 207 #define POWER5p_PME_PM_LSU_SRQ_S0_VALID 208 #define POWER5p_PME_PM_LSU0_FLUSH_LRQ 209 #define POWER5p_PME_PM_INST_FROM_L275_MOD 210 #define POWER5p_PME_PM_GCT_EMPTY_CYC 211 
#define POWER5p_PME_PM_LARX_LSU0 212 #define POWER5p_PME_PM_THRD_PRIO_DIFF_5or6_CYC 213 #define POWER5p_PME_PM_SNOOP_RETRY_1AHEAD 214 #define POWER5p_PME_PM_FPU1_FSQRT 215 #define POWER5p_PME_PM_MRK_LD_MISS_L1_LSU1 216 #define POWER5p_PME_PM_MRK_FPU_FIN 217 #define POWER5p_PME_PM_THRD_PRIO_5_CYC 218 #define POWER5p_PME_PM_MRK_DATA_FROM_LMEM 219 #define POWER5p_PME_PM_SNOOP_TLBIE 220 #define POWER5p_PME_PM_FPU1_FRSP_FCONV 221 #define POWER5p_PME_PM_DTLB_MISS_16G 222 #define POWER5p_PME_PM_L3SB_SNOOP_RETRY 223 #define POWER5p_PME_PM_FAB_VBYPASS_EMPTY 224 #define POWER5p_PME_PM_MRK_DATA_FROM_L275_MOD 225 #define POWER5p_PME_PM_L2SB_RCST_DISP 226 #define POWER5p_PME_PM_6INST_CLB_CYC 227 #define POWER5p_PME_PM_FLUSH 228 #define POWER5p_PME_PM_L2SC_MOD_INV 229 #define POWER5p_PME_PM_FPU_DENORM 230 #define POWER5p_PME_PM_L3SC_HIT 231 #define POWER5p_PME_PM_SNOOP_WR_RETRY_RQ 232 #define POWER5p_PME_PM_LSU1_REJECT_SRQ 233 #define POWER5p_PME_PM_L3SC_ALL_BUSY 234 #define POWER5p_PME_PM_IC_PREF_REQ 235 #define POWER5p_PME_PM_MRK_GRP_IC_MISS 236 #define POWER5p_PME_PM_GCT_NOSLOT_IC_MISS 237 #define POWER5p_PME_PM_MRK_DATA_FROM_L3 238 #define POWER5p_PME_PM_GCT_NOSLOT_SRQ_FULL 239 #define POWER5p_PME_PM_CMPLU_STALL_DCACHE_MISS 240 #define POWER5p_PME_PM_THRD_SEL_OVER_ISU_HOLD 241 #define POWER5p_PME_PM_LSU_FLUSH_LRQ 242 #define POWER5p_PME_PM_THRD_PRIO_2_CYC 243 #define POWER5p_PME_PM_L3SA_MOD_INV 244 #define POWER5p_PME_PM_LSU_FLUSH_SRQ 245 #define POWER5p_PME_PM_MRK_LSU_SRQ_INST_VALID 246 #define POWER5p_PME_PM_L3SA_REF 247 #define POWER5p_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL 248 #define POWER5p_PME_PM_FPU0_STALL3 249 #define POWER5p_PME_PM_TB_BIT_TRANS 250 #define POWER5p_PME_PM_GPR_MAP_FULL_CYC 251 #define POWER5p_PME_PM_MRK_LSU_FLUSH_LRQ 252 #define POWER5p_PME_PM_FPU0_STF 253 #define POWER5p_PME_PM_MRK_DTLB_MISS 254 #define POWER5p_PME_PM_FPU1_FMA 255 #define POWER5p_PME_PM_L2SA_MOD_TAG 256 #define POWER5p_PME_PM_LSU1_FLUSH_ULD 257 #define POWER5p_PME_PM_MRK_INST_FIN 258 
#define POWER5p_PME_PM_MRK_LSU0_FLUSH_UST 259
#define POWER5p_PME_PM_FPU0_FULL_CYC 260
#define POWER5p_PME_PM_LSU_LRQ_S0_ALLOC 261
#define POWER5p_PME_PM_MRK_LSU1_FLUSH_ULD 262
#define POWER5p_PME_PM_MRK_DTLB_REF 263
#define POWER5p_PME_PM_BR_UNCOND 264
#define POWER5p_PME_PM_THRD_SEL_OVER_L2MISS 265
#define POWER5p_PME_PM_L2SB_SHR_INV 266
#define POWER5p_PME_PM_MEM_LO_PRIO_WR_CMPL 267
#define POWER5p_PME_PM_MRK_DTLB_MISS_64K 268
#define POWER5p_PME_PM_MRK_ST_MISS_L1 269
#define POWER5p_PME_PM_L3SC_MOD_TAG 270
#define POWER5p_PME_PM_GRP_DISP_SUCCESS 271
#define POWER5p_PME_PM_THRD_PRIO_DIFF_1or2_CYC 272
#define POWER5p_PME_PM_IC_DEMAND_L2_BHT_REDIRECT 273
#define POWER5p_PME_PM_LSU_DERAT_MISS 274
#define POWER5p_PME_PM_MEM_WQ_DISP_Q8to15 275
#define POWER5p_PME_PM_FPU0_SINGLE 276
#define POWER5p_PME_PM_THRD_PRIO_1_CYC 277
#define POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_OTHER 278
#define POWER5p_PME_PM_SNOOP_RD_RETRY_RQ 279
#define POWER5p_PME_PM_FAB_HOLDtoVN_EMPTY 280
#define POWER5p_PME_PM_FPU1_FEST 281
#define POWER5p_PME_PM_SNOOP_DCLAIM_RETRY_QFULL 282
#define POWER5p_PME_PM_MRK_DATA_FROM_L25_SHR_CYC 283
#define POWER5p_PME_PM_MRK_ST_CMPL_INT 284
#define POWER5p_PME_PM_FLUSH_BR_MPRED 285
#define POWER5p_PME_PM_MRK_DTLB_MISS_16G 286
#define POWER5p_PME_PM_FPU_STF 287
#define POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_ADDR 288
#define POWER5p_PME_PM_CMPLU_STALL_FPU 289
#define POWER5p_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC 290
#define POWER5p_PME_PM_GCT_NOSLOT_CYC 291
#define POWER5p_PME_PM_FXU0_BUSY_FXU1_IDLE 292
#define POWER5p_PME_PM_PTEG_FROM_L35_SHR 293
#define POWER5p_PME_PM_MRK_DTLB_REF_16G 294
#define POWER5p_PME_PM_MRK_LSU_FLUSH_UST 295
#define POWER5p_PME_PM_MRK_DATA_FROM_L25_SHR 296
#define POWER5p_PME_PM_L3SA_HIT 297
#define POWER5p_PME_PM_MRK_DATA_FROM_L35_SHR 298
#define POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_ADDR 299
#define POWER5p_PME_PM_IERAT_XLATE_WR 300
#define POWER5p_PME_PM_L2SA_ST_REQ 301
#define POWER5p_PME_PM_INST_FROM_LMEM 302
#define POWER5p_PME_PM_THRD_SEL_T1 303
#define POWER5p_PME_PM_IC_DEMAND_L2_BR_REDIRECT 304
#define POWER5p_PME_PM_MRK_DATA_FROM_L35_SHR_CYC 305
#define POWER5p_PME_PM_FPU0_1FLOP 306
#define POWER5p_PME_PM_PTEG_FROM_L2 307
#define POWER5p_PME_PM_MEM_PW_CMPL 308
#define POWER5p_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC 309
#define POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_OTHER 310
#define POWER5p_PME_PM_MRK_DTLB_MISS_4K 311
#define POWER5p_PME_PM_FPU0_FIN 312
#define POWER5p_PME_PM_L3SC_SHR_INV 313
#define POWER5p_PME_PM_GRP_BR_REDIR 314
#define POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_RC_FULL 315
#define POWER5p_PME_PM_MRK_LSU_FLUSH_SRQ 316
#define POWER5p_PME_PM_PTEG_FROM_L275_SHR 317
#define POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_RC_FULL 318
#define POWER5p_PME_PM_SNOOP_RD_RETRY_WQ 319
#define POWER5p_PME_PM_FAB_DCLAIM_RETRIED 320
#define POWER5p_PME_PM_LSU0_NCLD 321
#define POWER5p_PME_PM_LSU1_BUSY_REJECT 322
#define POWER5p_PME_PM_FXLS0_FULL_CYC 323
#define POWER5p_PME_PM_DTLB_REF_16M 324
#define POWER5p_PME_PM_FPU0_FEST 325
#define POWER5p_PME_PM_GCT_USAGE_60to79_CYC 326
#define POWER5p_PME_PM_DATA_FROM_L25_MOD 327
#define POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_ADDR 328
#define POWER5p_PME_PM_LSU0_REJECT_ERAT_MISS 329
#define POWER5p_PME_PM_DATA_FROM_L375_MOD 330
#define POWER5p_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 331
#define POWER5p_PME_PM_DTLB_MISS_64K 332
#define POWER5p_PME_PM_LSU0_REJECT_RELOAD_CDF 333
#define POWER5p_PME_PM_0INST_FETCH 334
#define POWER5p_PME_PM_LSU1_REJECT_RELOAD_CDF 335
#define POWER5p_PME_PM_MEM_WQ_DISP_Q0to7 336
#define POWER5p_PME_PM_L1_PREF 337
#define POWER5p_PME_PM_MRK_DATA_FROM_LMEM_CYC 338
#define POWER5p_PME_PM_BRQ_FULL_CYC 339
#define POWER5p_PME_PM_GRP_IC_MISS_NONSPEC 340
#define POWER5p_PME_PM_PTEG_FROM_L275_MOD 341
#define POWER5p_PME_PM_MRK_LD_MISS_L1_LSU0 342
#define POWER5p_PME_PM_MRK_DATA_FROM_L375_SHR_CYC 343
#define POWER5p_PME_PM_DATA_FROM_L3 344
#define POWER5p_PME_PM_INST_FROM_L2 345
#define POWER5p_PME_PM_LSU_FLUSH 346
#define POWER5p_PME_PM_PMC2_OVERFLOW 347
#define POWER5p_PME_PM_FPU0_DENORM 348
#define POWER5p_PME_PM_FPU1_FMOV_FEST 349
#define POWER5p_PME_PM_INST_FETCH_CYC 350
#define POWER5p_PME_PM_INST_DISP 351
#define POWER5p_PME_PM_LSU_LDF 352
#define POWER5p_PME_PM_DATA_FROM_L25_SHR 353
#define POWER5p_PME_PM_L1_DCACHE_RELOAD_VALID 354
#define POWER5p_PME_PM_MEM_WQ_DISP_DCLAIM 355
#define POWER5p_PME_PM_MRK_GRP_ISSUED 356
#define POWER5p_PME_PM_FPU_FULL_CYC 357
#define POWER5p_PME_PM_INST_FROM_L35_MOD 358
#define POWER5p_PME_PM_FPU_FMA 359
#define POWER5p_PME_PM_THRD_PRIO_3_CYC 360
#define POWER5p_PME_PM_MRK_CRU_FIN 361
#define POWER5p_PME_PM_SNOOP_WR_RETRY_WQ 362
#define POWER5p_PME_PM_CMPLU_STALL_REJECT 363
#define POWER5p_PME_PM_MRK_FXU_FIN 364
#define POWER5p_PME_PM_LSU1_REJECT_ERAT_MISS 365
#define POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_OTHER 366
#define POWER5p_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY 367
#define POWER5p_PME_PM_PMC4_OVERFLOW 368
#define POWER5p_PME_PM_L3SA_SNOOP_RETRY 369
#define POWER5p_PME_PM_PTEG_FROM_L35_MOD 370
#define POWER5p_PME_PM_INST_FROM_L25_MOD 371
#define POWER5p_PME_PM_THRD_SMT_HANG 372
#define POWER5p_PME_PM_CMPLU_STALL_ERAT_MISS 373
#define POWER5p_PME_PM_L3SA_MOD_TAG 374
#define POWER5p_PME_PM_INST_FROM_L2MISS 375
#define POWER5p_PME_PM_FLUSH_SYNC 376
#define POWER5p_PME_PM_MRK_GRP_DISP 377
#define POWER5p_PME_PM_MEM_RQ_DISP_Q8to11 378
#define POWER5p_PME_PM_L2SC_ST_HIT 379
#define POWER5p_PME_PM_L2SB_MOD_TAG 380
#define POWER5p_PME_PM_CLB_EMPTY_CYC 381
#define POWER5p_PME_PM_L2SB_ST_HIT 382
#define POWER5p_PME_PM_MEM_NONSPEC_RD_CANCEL 383
#define POWER5p_PME_PM_BR_PRED_CR_TA 384
#define POWER5p_PME_PM_MRK_LSU0_FLUSH_SRQ 385
#define POWER5p_PME_PM_MRK_LSU_FLUSH_ULD 386
#define POWER5p_PME_PM_INST_DISP_ATTEMPT 387
#define POWER5p_PME_PM_INST_FROM_RMEM 388
#define POWER5p_PME_PM_ST_REF_L1_LSU0 389
#define POWER5p_PME_PM_LSU0_DERAT_MISS 390
#define POWER5p_PME_PM_FPU_STALL3 391
#define POWER5p_PME_PM_L2SB_RCLD_DISP 392
#define POWER5p_PME_PM_BR_PRED_CR 393
#define POWER5p_PME_PM_MRK_DATA_FROM_L2 394
#define POWER5p_PME_PM_LSU0_FLUSH_SRQ 395
#define POWER5p_PME_PM_FAB_PNtoNN_DIRECT 396
#define POWER5p_PME_PM_IOPS_CMPL 397
#define POWER5p_PME_PM_L2SA_RCST_DISP 398
#define POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_OTHER 399
#define POWER5p_PME_PM_L2SC_SHR_INV 400
#define POWER5p_PME_PM_SNOOP_RETRY_AB_COLLISION 401
#define POWER5p_PME_PM_FAB_PNtoVN_SIDECAR 402
#define POWER5p_PME_PM_LSU0_REJECT_LMQ_FULL 403
#define POWER5p_PME_PM_LSU_LMQ_S0_ALLOC 404
#define POWER5p_PME_PM_SNOOP_PW_RETRY_RQ 405
#define POWER5p_PME_PM_DTLB_REF 406
#define POWER5p_PME_PM_PTEG_FROM_L3 407
#define POWER5p_PME_PM_FAB_M1toVNorNN_SIDECAR_EMPTY 408
#define POWER5p_PME_PM_LSU_SRQ_EMPTY_CYC 409
#define POWER5p_PME_PM_FPU1_STF 410
#define POWER5p_PME_PM_LSU_LMQ_S0_VALID 411
#define POWER5p_PME_PM_GCT_USAGE_00to59_CYC 412
#define POWER5p_PME_PM_FPU_FMOV_FEST 413
#define POWER5p_PME_PM_DATA_FROM_L2MISS 414
#define POWER5p_PME_PM_XER_MAP_FULL_CYC 415
#define POWER5p_PME_PM_GRP_DISP_BLK_SB_CYC 416
#define POWER5p_PME_PM_FLUSH_SB 417
#define POWER5p_PME_PM_MRK_DATA_FROM_L375_SHR 418
#define POWER5p_PME_PM_MRK_GRP_CMPL 419
#define POWER5p_PME_PM_SUSPENDED 420
#define POWER5p_PME_PM_SNOOP_RD_RETRY_QFULL 421
#define POWER5p_PME_PM_GRP_IC_MISS_BR_REDIR_NONSPEC 422
#define POWER5p_PME_PM_DATA_FROM_L35_SHR 423
#define POWER5p_PME_PM_L3SB_MOD_INV 424
#define POWER5p_PME_PM_STCX_FAIL 425
#define POWER5p_PME_PM_LD_MISS_L1_LSU1 426
#define POWER5p_PME_PM_GRP_DISP 427
#define POWER5p_PME_PM_DC_PREF_DST 428
#define POWER5p_PME_PM_FPU1_DENORM 429
#define POWER5p_PME_PM_FPU0_FPSCR 430
#define POWER5p_PME_PM_DATA_FROM_L2 431
#define POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_ADDR 432
#define POWER5p_PME_PM_FPU_1FLOP 433
#define POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_OTHER 434
#define POWER5p_PME_PM_FPU0_FSQRT 435
#define POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_RC_FULL 436
#define POWER5p_PME_PM_LD_REF_L1 437
#define POWER5p_PME_PM_INST_FROM_L1 438
#define POWER5p_PME_PM_TLBIE_HELD 439
#define POWER5p_PME_PM_DC_PREF_OUT_OF_STREAMS 440 #define POWER5p_PME_PM_MRK_DATA_FROM_L25_MOD_CYC 441 #define POWER5p_PME_PM_MRK_LSU1_FLUSH_SRQ 442 #define POWER5p_PME_PM_MEM_RQ_DISP_Q0to3 443 #define POWER5p_PME_PM_ST_REF_L1_LSU1 444 #define POWER5p_PME_PM_MRK_LD_MISS_L1 445 #define POWER5p_PME_PM_L1_WRITE_CYC 446 #define POWER5p_PME_PM_L2SC_ST_REQ 447 #define POWER5p_PME_PM_CMPLU_STALL_FDIV 448 #define POWER5p_PME_PM_THRD_SEL_OVER_CLB_EMPTY 449 #define POWER5p_PME_PM_BR_MPRED_CR 450 #define POWER5p_PME_PM_L3SB_MOD_TAG 451 #define POWER5p_PME_PM_MRK_DATA_FROM_L2MISS 452 #define POWER5p_PME_PM_LSU_REJECT_SRQ 453 #define POWER5p_PME_PM_LD_MISS_L1 454 #define POWER5p_PME_PM_INST_FROM_PREF 455 #define POWER5p_PME_PM_STCX_PASS 456 #define POWER5p_PME_PM_DC_INV_L2 457 #define POWER5p_PME_PM_LSU_SRQ_FULL_CYC 458 #define POWER5p_PME_PM_FPU_FIN 459 #define POWER5p_PME_PM_LSU_SRQ_STFWD 460 #define POWER5p_PME_PM_L2SA_SHR_MOD 461 #define POWER5p_PME_PM_0INST_CLB_CYC 462 #define POWER5p_PME_PM_FXU0_FIN 463 #define POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_RC_FULL 464 #define POWER5p_PME_PM_THRD_GRP_CMPL_BOTH_CYC 465 #define POWER5p_PME_PM_PMC5_OVERFLOW 466 #define POWER5p_PME_PM_FPU0_FDIV 467 #define POWER5p_PME_PM_PTEG_FROM_L375_SHR 468 #define POWER5p_PME_PM_HV_CYC 469 #define POWER5p_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY 470 #define POWER5p_PME_PM_THRD_PRIO_DIFF_0_CYC 471 #define POWER5p_PME_PM_LR_CTR_MAP_FULL_CYC 472 #define POWER5p_PME_PM_L3SB_SHR_INV 473 #define POWER5p_PME_PM_DATA_FROM_RMEM 474 #define POWER5p_PME_PM_DATA_FROM_L275_MOD 475 #define POWER5p_PME_PM_LSU0_REJECT_SRQ 476 #define POWER5p_PME_PM_LSU1_DERAT_MISS 477 #define POWER5p_PME_PM_MRK_LSU_FIN 478 #define POWER5p_PME_PM_DTLB_MISS_16M 479 #define POWER5p_PME_PM_LSU0_FLUSH_UST 480 #define POWER5p_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY 481 #define POWER5p_PME_PM_L2SC_MOD_TAG 482 static const pme_power_entry_t power5p_pe[] = { [ POWER5p_PME_PM_LSU_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU_REJECT_RELOAD_CDF", .pme_code = 
0x2c4090, .pme_short_desc = "LSU reject due to reload CDF or tag update collision", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction the requires the result bus in the same cycle is rejected. Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same system when they are being updated. Combined Unit 0 + 1.", }, [ POWER5p_PME_PM_FPU1_SINGLE ] = { .pme_name = "PM_FPU1_SINGLE", .pme_code = 0x20e7, .pme_short_desc = "FPU1 executed single precision instruction", .pme_long_desc = "FPU1 has executed a single precision instruction.", }, [ POWER5p_PME_PM_L3SB_REF ] = { .pme_name = "PM_L3SB_REF", .pme_code = 0x701c4, .pme_short_desc = "L3 slice B references", .pme_long_desc = "Number of attempts made by this chip cores to find data in the L3. Reported per L3 slice", }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_3or4_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_3or4_CYC", .pme_code = 0x430e5, .pme_short_desc = "Cycles thread priority difference is 3 or 4", .pme_long_desc = "Cycles when this thread's priority is higher than the other thread's priority by 3 or 4.", }, [ POWER5p_PME_PM_INST_FROM_L275_SHR ] = { .pme_name = "PM_INST_FROM_L275_SHR", .pme_code = 0x322096, .pme_short_desc = "Instruction fetched from L2.75 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (T) data from the L2 on a different module than this processor is located. 
Fetch groups can contain up to 8 instructions", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L375_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L375_MOD", .pme_code = 0x1c70a7, .pme_short_desc = "Marked data loaded from L3.75 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on a different module than this processor is located due to a marked load.", }, [ POWER5p_PME_PM_DTLB_MISS_4K ] = { .pme_name = "PM_DTLB_MISS_4K", .pme_code = 0x1c208d, .pme_short_desc = "Data TLB miss for 4K page", .pme_long_desc = "Data TLB references to 4KB pages that missed the TLB. Page size is determined at TLB reload time.", }, [ POWER5p_PME_PM_CLB_FULL_CYC ] = { .pme_name = "PM_CLB_FULL_CYC", .pme_code = 0x220e5, .pme_short_desc = "Cycles CLB full", .pme_long_desc = "Cycles when both thread's CLB is full.", }, [ POWER5p_PME_PM_MRK_ST_CMPL ] = { .pme_name = "PM_MRK_ST_CMPL", .pme_code = 0x100003, .pme_short_desc = "Marked store instruction completed", .pme_long_desc = "A sampled store has completed (data home)", }, [ POWER5p_PME_PM_LSU_FLUSH_LRQ_FULL ] = { .pme_name = "PM_LSU_FLUSH_LRQ_FULL", .pme_code = 0x320e7, .pme_short_desc = "Flush caused by LRQ full", .pme_long_desc = "This thread was flushed at dispatch because its Load Request Queue was full. This allows the other thread to have more machine resources for it to make progress while this thread is stalled.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L275_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L275_SHR", .pme_code = 0x3c7097, .pme_short_desc = "Marked data loaded from L2.75 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T) data from the L2 on a different module than this processor is located due to a marked load.", }, [ POWER5p_PME_PM_1INST_CLB_CYC ] = { .pme_name = "PM_1INST_CLB_CYC", .pme_code = 0x400c1, .pme_short_desc = "Cycles 1 instruction in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. 
Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", }, [ POWER5p_PME_PM_MEM_SPEC_RD_CANCEL ] = { .pme_name = "PM_MEM_SPEC_RD_CANCEL", .pme_code = 0x721e6, .pme_short_desc = "Speculative memory read cancelled", .pme_long_desc = "Speculative memory read cancelled (i.e. cresp = sourced by L2/L3)", }, [ POWER5p_PME_PM_MRK_DTLB_MISS_16M ] = { .pme_name = "PM_MRK_DTLB_MISS_16M", .pme_code = 0x3c608d, .pme_short_desc = "Marked Data TLB misses for 16M page", .pme_long_desc = "Marked Data TLB misses for 16M page", }, [ POWER5p_PME_PM_FPU_FDIV ] = { .pme_name = "PM_FPU_FDIV", .pme_code = 0x100088, .pme_short_desc = "FPU executed FDIV instruction", .pme_long_desc = "The floating point unit has executed a divide instruction. This could be fdiv, fdivs, fdiv., fdivs.. Combined Unit 0 + Unit 1.", }, [ POWER5p_PME_PM_FPU_SINGLE ] = { .pme_name = "PM_FPU_SINGLE", .pme_code = 0x102090, .pme_short_desc = "FPU executed single precision instruction", .pme_long_desc = "FPU is executing single precision instruction. Combined Unit 0 + Unit 1.", }, [ POWER5p_PME_PM_FPU0_FMA ] = { .pme_name = "PM_FPU0_FMA", .pme_code = 0xc1, .pme_short_desc = "FPU0 executed multiply-add instruction", .pme_long_desc = "The floating point unit has executed a multiply-add kind of instruction. 
This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER5p_PME_PM_SLB_MISS ] = { .pme_name = "PM_SLB_MISS", .pme_code = 0x280088, .pme_short_desc = "SLB misses", .pme_long_desc = "Total of all Segment Lookaside Buffer (SLB) misses, Instructions + Data.", }, [ POWER5p_PME_PM_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_LSU1_FLUSH_LRQ", .pme_code = 0xc00c6, .pme_short_desc = "LSU1 LRQ flushes", .pme_long_desc = "A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER5p_PME_PM_L2SA_ST_HIT ] = { .pme_name = "PM_L2SA_ST_HIT", .pme_code = 0x733e0, .pme_short_desc = "L2 slice A store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A, B, and C.", }, [ POWER5p_PME_PM_DTLB_MISS ] = { .pme_name = "PM_DTLB_MISS", .pme_code = 0x800c4, .pme_short_desc = "Data TLB misses", .pme_long_desc = "Data TLB misses, all page sizes.", }, [ POWER5p_PME_PM_BR_PRED_TA ] = { .pme_name = "PM_BR_PRED_TA", .pme_code = 0x230e3, .pme_short_desc = "A conditional branch was predicted, target prediction", .pme_long_desc = "The target address of a branch instruction was predicted.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L375_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L375_MOD_CYC", .pme_code = 0x4c70a7, .pme_short_desc = "Marked load latency from L3.75 modified", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5p_PME_PM_CMPLU_STALL_FXU ] = { .pme_name = "PM_CMPLU_STALL_FXU", .pme_code = 0x211099, .pme_short_desc = "Completion stall caused by FXU instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a fixed point instruction.", }, [ POWER5p_PME_PM_EXT_INT ] = { .pme_name = "PM_EXT_INT", .pme_code = 0x400003, .pme_short_desc = "External interrupts", .pme_long_desc = "An interrupt due to an external exception occurred", }, [ POWER5p_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_LRQ", .pme_code = 0x810c6, .pme_short_desc = "LSU1 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER5p_PME_PM_MRK_ST_GPS ] = { .pme_name = "PM_MRK_ST_GPS", .pme_code = 0x200003, .pme_short_desc = "Marked store sent to GPS", .pme_long_desc = "A sampled store has been sent to the memory subsystem", }, [ POWER5p_PME_PM_LSU1_LDF ] = { .pme_name = "PM_LSU1_LDF", .pme_code = 0xc50c4, .pme_short_desc = "LSU1 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed by LSU1", }, [ POWER5p_PME_PM_FAB_CMD_ISSUED ] = { .pme_name = "PM_FAB_CMD_ISSUED", .pme_code = 0x700c7, .pme_short_desc = "Fabric command issued", .pme_long_desc = "Incremented when a chip issues a command on its SnoopA address bus. Each of the two address busses (SnoopA and SnoopB) is capable of one transaction per fabric cycle (one fabric cycle = 2 cpu cycles in normal 2:1 mode), but each chip can only drive the SnoopA bus, and can only drive one transaction every two fabric cycles (i.e., every four cpu cycles). 
In MCM-based systems, two chips interleave their accesses to each of the two fabric busses (SnoopA, SnoopB) to reach a peak capability of one transaction per cpu clock cycle. The two chips that drive SnoopB are wired so that the chips refer to the bus as SnoopA but it is connected to the other two chips as SnoopB. Note that this event will only be recorded by the FBC on the chip that sourced the operation. The signal is delivered at FBC speed and the count must be scaled.", }, [ POWER5p_PME_PM_LSU0_SRQ_STFWD ] = { .pme_name = "PM_LSU0_SRQ_STFWD", .pme_code = 0xc60e1, .pme_short_desc = "LSU0 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 0. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load that hits L1 but becomes a store forward, then it's not treated as a load miss.", }, [ POWER5p_PME_PM_CR_MAP_FULL_CYC ] = { .pme_name = "PM_CR_MAP_FULL_CYC", .pme_code = 0x100c4, .pme_short_desc = "Cycles CR logical operation mapper full", .pme_long_desc = "The Conditional Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. 
This event only indicates that the mapper was full, not that dispatch was prevented.", }, [ POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SA_RCST_DISP_FAIL_RC_FULL", .pme_code = 0x722e0, .pme_short_desc = "L2 slice A RC store dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a store failed because all RC machines are busy.", }, [ POWER5p_PME_PM_MRK_LSU0_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU0_FLUSH_ULD", .pme_code = 0x810c1, .pme_short_desc = "LSU0 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", }, [ POWER5p_PME_PM_LSU_FLUSH_SRQ_FULL ] = { .pme_name = "PM_LSU_FLUSH_SRQ_FULL", .pme_code = 0x330e0, .pme_short_desc = "Flush caused by SRQ full", .pme_long_desc = "This thread was flushed at dispatch because its Store Request Queue was full. This allows the other thread to have more machine resources for it to make progress while this thread is stalled.", }, [ POWER5p_PME_PM_MEM_RQ_DISP_Q16to19 ] = { .pme_name = "PM_MEM_RQ_DISP_Q16to19", .pme_code = 0x727e6, .pme_short_desc = "Memory read queue dispatched to queues 16-19", .pme_long_desc = "A memory operation was dispatched to read queue 16,17,18 or 19. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_FLUSH_IMBAL ] = { .pme_name = "PM_FLUSH_IMBAL", .pme_code = 0x330e3, .pme_short_desc = "Flush caused by thread GCT imbalance", .pme_long_desc = "This thread has been flushed at dispatch because it is stalled and a GCT imbalance exists. GCT thresholds are set in the TSCR register. 
This allows the other thread to have more machine resources for it to make progress while this thread is stalled.", }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus3or4_CYC", .pme_code = 0x430e1, .pme_short_desc = "Cycles thread priority difference is -3 or -4", .pme_long_desc = "Cycles when this thread's priority is lower than the other thread's priority by 3 or 4.", }, [ POWER5p_PME_PM_DATA_FROM_L35_MOD ] = { .pme_name = "PM_DATA_FROM_L35_MOD", .pme_code = 0x2c309e, .pme_short_desc = "Data loaded from L3.5 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on the same module as this processor is located due to a demand load.", }, [ POWER5p_PME_PM_MEM_HI_PRIO_WR_CMPL ] = { .pme_name = "PM_MEM_HI_PRIO_WR_CMPL", .pme_code = 0x726e6, .pme_short_desc = "High priority write completed", .pme_long_desc = "A memory write, which was upgraded to high priority, completed. Writes can be upgraded to high priority to ensure that read traffic does not lock out writes. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_FPU1_FDIV ] = { .pme_name = "PM_FPU1_FDIV", .pme_code = 0xc4, .pme_short_desc = "FPU1 executed FDIV instruction", .pme_long_desc = "FPU1 has executed a divide instruction. This could be fdiv, fdivs, fdiv. fdivs.", }, [ POWER5p_PME_PM_MEM_RQ_DISP ] = { .pme_name = "PM_MEM_RQ_DISP", .pme_code = 0x701c6, .pme_short_desc = "Memory read queue dispatched", .pme_long_desc = "A memory read was dispatched. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_FPU0_FRSP_FCONV ] = { .pme_name = "PM_FPU0_FRSP_FCONV", .pme_code = 0x10c1, .pme_short_desc = "FPU0 executed FRSP or FCONV instructions", .pme_long_desc = "FPU0 has executed a frsp or convert kind of instruction. 
This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER5p_PME_PM_LWSYNC_HELD ] = { .pme_name = "PM_LWSYNC_HELD", .pme_code = 0x130e0, .pme_short_desc = "LWSYNC held at dispatch", .pme_long_desc = "Cycles a LWSYNC instruction was held at dispatch. LWSYNC instructions are held at dispatch until all previous loads are done and all previous stores have issued. LWSYNC enters the Store Request Queue and is sent to the storage subsystem but does not wait for a response.", }, [ POWER5p_PME_PM_FXU_FIN ] = { .pme_name = "PM_FXU_FIN", .pme_code = 0x313088, .pme_short_desc = "FXU produced a result", .pme_long_desc = "The fixed point unit (Unit 0 + Unit 1) finished an instruction. Instructions that finish may not necessary complete.", }, [ POWER5p_PME_PM_DSLB_MISS ] = { .pme_name = "PM_DSLB_MISS", .pme_code = 0x800c5, .pme_short_desc = "Data SLB misses", .pme_long_desc = "A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve.", }, [ POWER5p_PME_PM_DATA_FROM_L275_SHR ] = { .pme_name = "PM_DATA_FROM_L275_SHR", .pme_code = 0x3c3097, .pme_short_desc = "Data loaded from L2.75 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T) data from the L2 on a different module than this processor is located due to a demand load.", }, [ POWER5p_PME_PM_FXLS1_FULL_CYC ] = { .pme_name = "PM_FXLS1_FULL_CYC", .pme_code = 0x110c4, .pme_short_desc = "Cycles FXU1/LS1 queue full", .pme_long_desc = "The issue queue that feeds the Fixed Point unit 1 / Load Store Unit 1 is full. This condition will prevent dispatch groups from being dispatched. 
This event only indicates that the queue was full, not that dispatch was prevented.", }, [ POWER5p_PME_PM_THRD_SEL_T0 ] = { .pme_name = "PM_THRD_SEL_T0", .pme_code = 0x410c0, .pme_short_desc = "Decode selected thread 0", .pme_long_desc = "Thread selection picked thread 0 for decode.", }, [ POWER5p_PME_PM_PTEG_RELOAD_VALID ] = { .pme_name = "PM_PTEG_RELOAD_VALID", .pme_code = 0x830e4, .pme_short_desc = "PTEG reload valid", .pme_long_desc = "A Page Table Entry was loaded into the TLB.", }, [ POWER5p_PME_PM_MRK_STCX_FAIL ] = { .pme_name = "PM_MRK_STCX_FAIL", .pme_code = 0x820e6, .pme_short_desc = "Marked STCX failed", .pme_long_desc = "A marked stcx (stwcx or stdcx) failed", }, [ POWER5p_PME_PM_LSU_LMQ_LHR_MERGE ] = { .pme_name = "PM_LSU_LMQ_LHR_MERGE", .pme_code = 0xc70e5, .pme_short_desc = "LMQ LHR merges", .pme_long_desc = "A data cache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry.", }, [ POWER5p_PME_PM_2INST_CLB_CYC ] = { .pme_name = "PM_2INST_CLB_CYC", .pme_code = 0x400c2, .pme_short_desc = "Cycles 2 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", }, [ POWER5p_PME_PM_FAB_PNtoVN_DIRECT ] = { .pme_name = "PM_FAB_PNtoVN_DIRECT", .pme_code = 0x723e7, .pme_short_desc = "PN to VN beat went straight to its destination", .pme_long_desc = "Fabric Data beats that the base chip takes the inbound PN data and passes it through to the outbound VN bus without going into a sidecar. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5p_PME_PM_PTEG_FROM_L2MISS ] = { .pme_name = "PM_PTEG_FROM_L2MISS", .pme_code = 0x38309b, .pme_short_desc = "PTEG loaded from L2 miss", .pme_long_desc = "A Page Table Entry was loaded into the TLB but not from the local L2.", }, [ POWER5p_PME_PM_CMPLU_STALL_LSU ] = { .pme_name = "PM_CMPLU_STALL_LSU", .pme_code = 0x211098, .pme_short_desc = "Completion stall caused by LSU instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a load/store instruction.", }, [ POWER5p_PME_PM_MRK_DSLB_MISS ] = { .pme_name = "PM_MRK_DSLB_MISS", .pme_code = 0xc50c7, .pme_short_desc = "Marked Data SLB misses", .pme_long_desc = "A Data SLB miss was caused by a marked instruction.", }, [ POWER5p_PME_PM_LSU_FLUSH_ULD ] = { .pme_name = "PM_LSU_FLUSH_ULD", .pme_code = 0x1c0088, .pme_short_desc = "LRQ unaligned load flushes", .pme_long_desc = "A load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1). Combined Unit 0 + 1.", }, [ POWER5p_PME_PM_PTEG_FROM_LMEM ] = { .pme_name = "PM_PTEG_FROM_LMEM", .pme_code = 0x283087, .pme_short_desc = "PTEG loaded from local memory", .pme_long_desc = "A Page Table Entry was loaded into the TLB from memory attached to the same module this proccessor is located on.", }, [ POWER5p_PME_PM_MRK_BRU_FIN ] = { .pme_name = "PM_MRK_BRU_FIN", .pme_code = 0x200005, .pme_short_desc = "Marked instruction BRU processing finished", .pme_long_desc = "The branch unit finished a marked instruction. Instructions that finish may not necessary complete.", }, [ POWER5p_PME_PM_MEM_WQ_DISP_WRITE ] = { .pme_name = "PM_MEM_WQ_DISP_WRITE", .pme_code = 0x703c6, .pme_short_desc = "Memory write queue dispatched due to write", .pme_long_desc = "A memory write was dispatched to a write queue. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L275_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L275_MOD_CYC", .pme_code = 0x4c70a3, .pme_short_desc = "Marked load latency from L2.75 modified", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5p_PME_PM_LSU1_NCLD ] = { .pme_name = "PM_LSU1_NCLD", .pme_code = 0xc50c5, .pme_short_desc = "LSU1 non-cacheable loads", .pme_long_desc = "A non-cacheable load was executed by Unit 0.", }, [ POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SA_RCLD_DISP_FAIL_OTHER", .pme_code = 0x731e0, .pme_short_desc = "L2 slice A RC load dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions.", }, [ POWER5p_PME_PM_SNOOP_PW_RETRY_WQ_PWQ ] = { .pme_name = "PM_SNOOP_PW_RETRY_WQ_PWQ", .pme_code = 0x717c6, .pme_short_desc = "Snoop partial-write retry due to collision with active write or partial-write queue", .pme_long_desc = "A snoop request for a partial write to memory was retried because it matched the cache line of an active write or partial write. When this happens the snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_FPU1_FULL_CYC ] = { .pme_name = "PM_FPU1_FULL_CYC", .pme_code = 0x100c7, .pme_short_desc = "Cycles FPU1 issue queue full", .pme_long_desc = "The issue queue for FPU1 cannot accept any more instructions. 
Dispatch to this issue queue is stopped", }, [ POWER5p_PME_PM_FPR_MAP_FULL_CYC ] = { .pme_name = "PM_FPR_MAP_FULL_CYC", .pme_code = 0x100c1, .pme_short_desc = "Cycles FPR mapper full", .pme_long_desc = "The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations.", }, [ POWER5p_PME_PM_L3SA_ALL_BUSY ] = { .pme_name = "PM_L3SA_ALL_BUSY", .pme_code = 0x721e3, .pme_short_desc = "L3 slice A active for every cycle all CI/CO machines busy", .pme_long_desc = "Cycles All Castin/Castout machines are busy.", }, [ POWER5p_PME_PM_3INST_CLB_CYC ] = { .pme_name = "PM_3INST_CLB_CYC", .pme_code = 0x400c3, .pme_short_desc = "Cycles 3 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", }, [ POWER5p_PME_PM_MEM_PWQ_DISP_Q2or3 ] = { .pme_name = "PM_MEM_PWQ_DISP_Q2or3", .pme_code = 0x734e6, .pme_short_desc = "Memory partial-write queue dispatched to Write Queue 2 or 3", .pme_long_desc = "Memory partial-write queue dispatched to Write Queue 2 or 3. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_L2SA_SHR_INV ] = { .pme_name = "PM_L2SA_SHR_INV", .pme_code = 0x710c0, .pme_short_desc = "L2 slice A transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. 
NOTE: For this event to be useful the tablewalk duration event should also be counted.", }, [ POWER5p_PME_PM_THRESH_TIMEO ] = { .pme_name = "PM_THRESH_TIMEO", .pme_code = 0x30000b, .pme_short_desc = "Threshold timeout", .pme_long_desc = "The threshold timer expired", }, [ POWER5p_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL ] = { .pme_name = "PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL", .pme_code = 0x713c0, .pme_short_desc = "L2 slice A RC dispatch attempt failed due to all CO busy", .pme_long_desc = "A Read/Claim dispatch was rejected because all Castout machines were busy.", }, [ POWER5p_PME_PM_THRD_SEL_OVER_GCT_IMBAL ] = { .pme_name = "PM_THRD_SEL_OVER_GCT_IMBAL", .pme_code = 0x410c4, .pme_short_desc = "Thread selection overrides caused by GCT imbalance", .pme_long_desc = "Thread selection was overridden because of a GCT imbalance.", }, [ POWER5p_PME_PM_FPU_FSQRT ] = { .pme_name = "PM_FPU_FSQRT", .pme_code = 0x200090, .pme_short_desc = "FPU executed FSQRT instruction", .pme_long_desc = "The floating point unit has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1.", }, [ POWER5p_PME_PM_PMC1_OVERFLOW ] = { .pme_name = "PM_PMC1_OVERFLOW", .pme_code = 0x20000a, .pme_short_desc = "PMC1 Overflow", .pme_long_desc = "Overflows from PMC1 are counted. This effectively widens the PMC. 
The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", }, [ POWER5p_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_LRQ", .pme_code = 0x810c2, .pme_short_desc = "LSU0 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER5p_PME_PM_L3SC_SNOOP_RETRY ] = { .pme_name = "PM_L3SC_SNOOP_RETRY", .pme_code = 0x731e5, .pme_short_desc = "L3 slice C snoop retries", .pme_long_desc = "Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b)", }, [ POWER5p_PME_PM_DATA_TABLEWALK_CYC ] = { .pme_name = "PM_DATA_TABLEWALK_CYC", .pme_code = 0x800c7, .pme_short_desc = "Cycles doing data tablewalks", .pme_long_desc = "Cycles a translation tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried.", }, [ POWER5p_PME_PM_THRD_PRIO_6_CYC ] = { .pme_name = "PM_THRD_PRIO_6_CYC", .pme_code = 0x420e5, .pme_short_desc = "Cycles thread running at priority level 6", .pme_long_desc = "Cycles this thread was running at priority level 6.", }, [ POWER5p_PME_PM_FPU_FEST ] = { .pme_name = "PM_FPU_FEST", .pme_code = 0x1010a8, .pme_short_desc = "FPU executed FEST instruction", .pme_long_desc = "The floating point unit has executed an estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1.", }, [ POWER5p_PME_PM_FAB_M1toP1_SIDECAR_EMPTY ] = { .pme_name = "PM_FAB_M1toP1_SIDECAR_EMPTY", .pme_code = 0x702c7, .pme_short_desc = "M1 to P1 sidecar empty", .pme_long_desc = "Fabric cycles when the Minus-1 hip/hop sidecars (sidecars for chip to chip data transfer) are empty. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_RMEM ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM", .pme_code = 0x1c70a1, .pme_short_desc = "Marked data loaded from remote memory", .pme_long_desc = "The processor's Data Cache was reloaded due to a marked load from memory attached to a different module than this processor is located on.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L35_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L35_MOD_CYC", .pme_code = 0x4c70a6, .pme_short_desc = "Marked load latency from L3.5 modified", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5p_PME_PM_MEM_PWQ_DISP ] = { .pme_name = "PM_MEM_PWQ_DISP", .pme_code = 0x704c6, .pme_short_desc = "Memory partial-write queue dispatched", .pme_long_desc = "Number of Partial Writes dispatched. The MC provides resources to gather partial cacheline writes (Partial line DMA writes & CI-stores) to up to four different cachelines at a time. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_FAB_P1toM1_SIDECAR_EMPTY ] = { .pme_name = "PM_FAB_P1toM1_SIDECAR_EMPTY", .pme_code = 0x701c7, .pme_short_desc = "P1 to M1 sidecar empty", .pme_long_desc = "Fabric cycles when the Plus-1 hip/hop sidecars (sidecars for chip to chip data transfer) are empty. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5p_PME_PM_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_LD_MISS_L1_LSU0", .pme_code = 0xc10c2, .pme_short_desc = "LSU0 L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache, by unit 0.", }, [ POWER5p_PME_PM_SNOOP_PARTIAL_RTRY_QFULL ] = { .pme_name = "PM_SNOOP_PARTIAL_RTRY_QFULL", .pme_code = 0x730e6, .pme_short_desc = "Snoop partial write retry due to partial-write queues full", .pme_long_desc = "A snoop request for a partial write to memory was retried because the write queues that handle partial writes were full. When this happens the active writes are changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_FPU1_STALL3 ] = { .pme_name = "PM_FPU1_STALL3", .pme_code = 0x20e5, .pme_short_desc = "FPU1 stalled in pipe3", .pme_long_desc = "FPU1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always).", }, [ POWER5p_PME_PM_GCT_USAGE_80to99_CYC ] = { .pme_name = "PM_GCT_USAGE_80to99_CYC", .pme_code = 0x30001f, .pme_short_desc = "Cycles GCT 80-99% full", .pme_long_desc = "Cycles when the Global Completion Table has between 80% and 99% of its slots used. 
The GCT has 20 entries shared between threads", }, [ POWER5p_PME_PM_WORK_HELD ] = { .pme_name = "PM_WORK_HELD", .pme_code = 0x40000c, .pme_short_desc = "Work held", .pme_long_desc = "RAS Unit has signaled completion to stop and there are groups waiting to complete", }, [ POWER5p_PME_PM_INST_CMPL ] = { .pme_name = "PM_INST_CMPL", .pme_code = 0x100009, .pme_short_desc = "Instructions completed", .pme_long_desc = "Number of PowerPC instructions that completed.", }, [ POWER5p_PME_PM_LSU1_FLUSH_UST ] = { .pme_name = "PM_LSU1_FLUSH_UST", .pme_code = 0xc00c5, .pme_short_desc = "LSU1 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 1 because it was unaligned (crossed a 4K boundary)", }, [ POWER5p_PME_PM_FXU_IDLE ] = { .pme_name = "PM_FXU_IDLE", .pme_code = 0x100012, .pme_short_desc = "FXU idle", .pme_long_desc = "FXU0 and FXU1 are both idle.", }, [ POWER5p_PME_PM_LSU0_FLUSH_ULD ] = { .pme_name = "PM_LSU0_FLUSH_ULD", .pme_code = 0xc00c0, .pme_short_desc = "LSU0 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 0 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1)", }, [ POWER5p_PME_PM_LSU1_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU1_REJECT_LMQ_FULL", .pme_code = 0xc40c5, .pme_short_desc = "LSU1 reject due to LMQ full or missed data coming", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. 
If all eight entries are full, subsequent load instructions are rejected.", }, [ POWER5p_PME_PM_GRP_DISP_REJECT ] = { .pme_name = "PM_GRP_DISP_REJECT", .pme_code = 0x120e4, .pme_short_desc = "Group dispatch rejected", .pme_long_desc = "A group that previously attempted dispatch was rejected.", }, [ POWER5p_PME_PM_PTEG_FROM_L25_SHR ] = { .pme_name = "PM_PTEG_FROM_L25_SHR", .pme_code = 0x183097, .pme_short_desc = "PTEG loaded from L2.5 shared", .pme_long_desc = "A Page Table Entry was loaded into the TLB with shared (T or SL) data from the L2 of a chip on the same module as this processor is located due to a demand load.", }, [ POWER5p_PME_PM_L2SA_MOD_INV ] = { .pme_name = "PM_L2SA_MOD_INV", .pme_code = 0x730e0, .pme_short_desc = "L2 slice A transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.", }, [ POWER5p_PME_PM_FAB_CMD_RETRIED ] = { .pme_name = "PM_FAB_CMD_RETRIED", .pme_code = 0x710c7, .pme_short_desc = "Fabric command retried", .pme_long_desc = "Incremented when a command issued by a chip on its SnoopA address bus is retried for any reason. The overwhelming majority of retries are due to running out of memory controller queues but retries can also be caused by trying to reference addresses that are in a transient cache state -- e.g. a line is transient after issuing a DCLAIM instruction to a shared line but before the associated store completes. Each chip reports its own counts. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5p_PME_PM_L3SA_SHR_INV ] = { .pme_name = "PM_L3SA_SHR_INV", .pme_code = 0x710c3, .pme_short_desc = "L3 slice A transition from shared to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched).", }, [ POWER5p_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL ] = { .pme_name = "PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL", .pme_code = 0x713c1, .pme_short_desc = "L2 slice B RC dispatch attempt failed due to all CO busy", .pme_long_desc = "A Read/Claim dispatch was rejected because all Castout machines were busy.", }, [ POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SA_RCST_DISP_FAIL_ADDR", .pme_code = 0x712c0, .pme_short_desc = "L2 slice A RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.", }, [ POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SA_RCLD_DISP_FAIL_RC_FULL", .pme_code = 0x721e0, .pme_short_desc = "L2 slice A RC load dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a load failed because all RC machines are busy.", }, [ POWER5p_PME_PM_PTEG_FROM_L375_MOD ] = { .pme_name = "PM_PTEG_FROM_L375_MOD", .pme_code = 0x1830a7, .pme_short_desc = "PTEG loaded from L3.75 modified", .pme_long_desc = "A Page Table Entry was loaded into the TLB with modified (M) data from the L3 of a chip on a different module than this processor is located, due to a demand load.", }, [ POWER5p_PME_PM_MRK_LSU1_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU1_FLUSH_UST", .pme_code = 0x810c5, .pme_short_desc = "LSU1 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary)", 
}, [ POWER5p_PME_PM_BR_ISSUED ] = { .pme_name = "PM_BR_ISSUED", .pme_code = 0x230e4, .pme_short_desc = "Branches issued", .pme_long_desc = "A branch instruction was issued to the branch unit. A branch that was incorrectly predicted may issue and execute multiple times.", }, [ POWER5p_PME_PM_MRK_GRP_BR_REDIR ] = { .pme_name = "PM_MRK_GRP_BR_REDIR", .pme_code = 0x212091, .pme_short_desc = "Group experienced marked branch redirect", .pme_long_desc = "A group containing a marked (sampled) instruction experienced a branch redirect.", }, [ POWER5p_PME_PM_EE_OFF ] = { .pme_name = "PM_EE_OFF", .pme_code = 0x130e3, .pme_short_desc = "Cycles MSR(EE) bit off", .pme_long_desc = "Cycles MSR(EE) bit was off indicating that interrupts due to external exceptions were masked.", }, [ POWER5p_PME_PM_IERAT_XLATE_WR_LP ] = { .pme_name = "PM_IERAT_XLATE_WR_LP", .pme_code = 0x210c6, .pme_short_desc = "Large page translation written to ierat", .pme_long_desc = "An entry was written into the IERAT as a result of an IERAT miss. This event can be used to count IERAT misses. An ERAT miss that is later ignored will not be counted unless the ERAT is written before the instruction stream is changed.", }, [ POWER5p_PME_PM_DTLB_REF_64K ] = { .pme_name = "PM_DTLB_REF_64K", .pme_code = 0x2c2086, .pme_short_desc = "Data TLB reference for 64K page", .pme_long_desc = "Data TLB references for 64KB pages. Includes hits + misses.", }, [ POWER5p_PME_PM_MEM_RQ_DISP_Q4to7 ] = { .pme_name = "PM_MEM_RQ_DISP_Q4to7", .pme_code = 0x712c6, .pme_short_desc = "Memory read queue dispatched to queues 4-7", .pme_long_desc = "A memory operation was dispatched to read queue 4,5,6 or 7. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_MEM_FAST_PATH_RD_DISP ] = { .pme_name = "PM_MEM_FAST_PATH_RD_DISP", .pme_code = 0x731e6, .pme_short_desc = "Fast path memory read dispatched", .pme_long_desc = "Fast path memory read dispatched", }, [ POWER5p_PME_PM_INST_FROM_L3 ] = { .pme_name = "PM_INST_FROM_L3", .pme_code = 0x12208d, .pme_short_desc = "Instruction fetched from L3", .pme_long_desc = "An instruction fetch group was fetched from the local L3. Fetch groups can contain up to 8 instructions", }, [ POWER5p_PME_PM_ITLB_MISS ] = { .pme_name = "PM_ITLB_MISS", .pme_code = 0x800c0, .pme_short_desc = "Instruction TLB misses", .pme_long_desc = "A TLB miss for an Instruction Fetch has occurred", }, [ POWER5p_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { .pme_name = "PM_FXU1_BUSY_FXU0_IDLE", .pme_code = 0x400012, .pme_short_desc = "FXU1 busy FXU0 idle", .pme_long_desc = "FXU0 was idle while FXU1 was busy.", }, [ POWER5p_PME_PM_DTLB_REF_4K ] = { .pme_name = "PM_DTLB_REF_4K", .pme_code = 0x1c2086, .pme_short_desc = "Data TLB reference for 4K page", .pme_long_desc = "Data TLB references for 4KB pages. Includes hits + misses.", }, [ POWER5p_PME_PM_FXLS_FULL_CYC ] = { .pme_name = "PM_FXLS_FULL_CYC", .pme_code = 0x1110a8, .pme_short_desc = "Cycles FXLS queue is full", .pme_long_desc = "Cycles when the issue queues for one or both FXU/LSU units is full. Use with caution since this is the sum of cycles when Unit 0 was full plus Unit 1 full. It does not indicate when both units were full.", }, [ POWER5p_PME_PM_GRP_DISP_VALID ] = { .pme_name = "PM_GRP_DISP_VALID", .pme_code = 0x120e3, .pme_short_desc = "Group dispatch valid", .pme_long_desc = "A group is available for dispatch. 
This does not mean it was successfully dispatched.", }, [ POWER5p_PME_PM_LSU_FLUSH_UST ] = { .pme_name = "PM_LSU_FLUSH_UST", .pme_code = 0x2c0088, .pme_short_desc = "SRQ unaligned store flushes", .pme_long_desc = "A store was flushed because it was unaligned (crossed a 4K boundary). Combined Unit 0 + 1.", }, [ POWER5p_PME_PM_FXU1_FIN ] = { .pme_name = "PM_FXU1_FIN", .pme_code = 0x130e6, .pme_short_desc = "FXU1 produced a result", .pme_long_desc = "The Fixed Point unit 1 finished an instruction and produced a result. Instructions that finish may not necessarily complete.", }, [ POWER5p_PME_PM_THRD_PRIO_4_CYC ] = { .pme_name = "PM_THRD_PRIO_4_CYC", .pme_code = 0x420e3, .pme_short_desc = "Cycles thread running at priority level 4", .pme_long_desc = "Cycles this thread was running at priority level 4.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L35_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L35_MOD", .pme_code = 0x2c709e, .pme_short_desc = "Marked data loaded from L3.5 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on the same module as this processor is located due to a marked load.", }, [ POWER5p_PME_PM_4INST_CLB_CYC ] = { .pme_name = "PM_4INST_CLB_CYC", .pme_code = 0x400c4, .pme_short_desc = "Cycles 4 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. 
Each thread has its own set of CLB; these events are thread specific.", }, [ POWER5p_PME_PM_MRK_DTLB_REF_16M ] = { .pme_name = "PM_MRK_DTLB_REF_16M", .pme_code = 0x3c6086, .pme_short_desc = "Marked Data TLB reference for 16M page", .pme_long_desc = "Data TLB references by a marked instruction for 16MB pages.", }, [ POWER5p_PME_PM_INST_FROM_L375_MOD ] = { .pme_name = "PM_INST_FROM_L375_MOD", .pme_code = 0x42209d, .pme_short_desc = "Instruction fetched from L3.75 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from the L3 of a chip on a different module than this processor is located. Fetch groups can contain up to 8 instructions", }, [ POWER5p_PME_PM_GRP_CMPL ] = { .pme_name = "PM_GRP_CMPL", .pme_code = 0x300013, .pme_short_desc = "Group completed", .pme_long_desc = "A group completed. Microcoded instructions that span multiple groups will generate this event once per group.", }, [ POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SC_RCST_DISP_FAIL_ADDR", .pme_code = 0x712c2, .pme_short_desc = "L2 slice C RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.", }, [ POWER5p_PME_PM_FPU1_1FLOP ] = { .pme_name = "PM_FPU1_1FLOP", .pme_code = 0xc7, .pme_short_desc = "FPU1 executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations.", }, [ POWER5p_PME_PM_FPU_FRSP_FCONV ] = { .pme_name = "PM_FPU_FRSP_FCONV", .pme_code = 0x2010a8, .pme_short_desc = "FPU executed FRSP or FCONV instructions", .pme_long_desc = "The floating point unit has executed a frsp or convert kind of instruction. 
This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1.", }, [ POWER5p_PME_PM_L3SC_REF ] = { .pme_name = "PM_L3SC_REF", .pme_code = 0x701c5, .pme_short_desc = "L3 slice C references", .pme_long_desc = "Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice.", }, [ POWER5p_PME_PM_5INST_CLB_CYC ] = { .pme_name = "PM_5INST_CLB_CYC", .pme_code = 0x400c5, .pme_short_desc = "Cycles 5 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", }, [ POWER5p_PME_PM_THRD_L2MISS_BOTH_CYC ] = { .pme_name = "PM_THRD_L2MISS_BOTH_CYC", .pme_code = 0x410c7, .pme_short_desc = "Cycles both threads in L2 misses", .pme_long_desc = "Cycles that both threads have L2 miss pending. If only one thread has a L2 miss pending the other thread is given priority at decode. If both threads have L2 miss pending decode priority is determined by the number of GCT entries used.", }, [ POWER5p_PME_PM_MEM_PW_GATH ] = { .pme_name = "PM_MEM_PW_GATH", .pme_code = 0x714c6, .pme_short_desc = "Memory partial-write gathered", .pme_long_desc = "Two or more partial-writes have been merged into a single memory write. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_DTLB_REF_16G ] = { .pme_name = "PM_DTLB_REF_16G", .pme_code = 0x4c2086, .pme_short_desc = "Data TLB reference for 16G page", .pme_long_desc = "Data TLB references for 16GB pages. 
Includes hits + misses.", }, [ POWER5p_PME_PM_FAB_DCLAIM_ISSUED ] = { .pme_name = "PM_FAB_DCLAIM_ISSUED", .pme_code = 0x720e7, .pme_short_desc = "dclaim issued", .pme_long_desc = "A DCLAIM command was issued. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5p_PME_PM_FAB_PNtoNN_SIDECAR ] = { .pme_name = "PM_FAB_PNtoNN_SIDECAR", .pme_code = 0x713c7, .pme_short_desc = "PN to NN beat went to sidecar first", .pme_long_desc = "Fabric Data beats that the base chip takes the inbound PN data and forwards it on to the outbound NN data bus after going into a sidecar first. The signal is delivered at FBC speed and the count must be scaled.", }, [ POWER5p_PME_PM_GRP_IC_MISS ] = { .pme_name = "PM_GRP_IC_MISS", .pme_code = 0x120e7, .pme_short_desc = "Group experienced I cache miss", .pme_long_desc = "Number of groups, counted at dispatch, that have encountered an icache miss redirect. Every group constructed from a fetch group that missed the instruction cache will count.", }, [ POWER5p_PME_PM_INST_FROM_L35_SHR ] = { .pme_name = "PM_INST_FROM_L35_SHR", .pme_code = 0x12209d, .pme_short_desc = "Instruction fetched from L3.5 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (S) data from the L3 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions", }, [ POWER5p_PME_PM_LSU_LMQ_FULL_CYC ] = { .pme_name = "PM_LSU_LMQ_FULL_CYC", .pme_code = 0xc30e7, .pme_short_desc = "Cycles LMQ full", .pme_long_desc = "The Load Miss Queue was full.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L2_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2_CYC", .pme_code = 0x2c70a0, .pme_short_desc = "Marked load latency from L2", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5p_PME_PM_LSU_SRQ_SYNC_CYC ] = { .pme_name = "PM_LSU_SRQ_SYNC_CYC", .pme_code = 0x830e5, .pme_short_desc = "SRQ sync duration", .pme_long_desc = "Cycles that a sync instruction is active in the Store Request Queue.", }, [ POWER5p_PME_PM_LSU0_BUSY_REJECT ] = { .pme_name = "PM_LSU0_BUSY_REJECT", .pme_code = 0xc20e1, .pme_short_desc = "LSU0 busy due to reject", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions.", }, [ POWER5p_PME_PM_LSU_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU_REJECT_ERAT_MISS", .pme_code = 0x1c4090, .pme_short_desc = "LSU reject due to ERAT miss", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions due to an ERAT miss. Combined unit 0 + 1. Requests that miss the Derat are rejected and retried until the request hits in the Erat.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM_CYC", .pme_code = 0x4c70a1, .pme_short_desc = "Marked load latency from remote memory", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5p_PME_PM_DATA_FROM_L375_SHR ] = { .pme_name = "PM_DATA_FROM_L375_SHR", .pme_code = 0x3c309e, .pme_short_desc = "Data loaded from L3.75 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on a different module than this processor is located due to a demand load.", }, [ POWER5p_PME_PM_PTEG_FROM_L25_MOD ] = { .pme_name = "PM_PTEG_FROM_L25_MOD", .pme_code = 0x283097, .pme_short_desc = "PTEG loaded from L2.5 modified", .pme_long_desc = "A Page Table Entry was loaded into the TLB with modified (M) data from the L2 of a chip on the same module as this processor is located due to a demand load.", }, [ POWER5p_PME_PM_FPU0_FMOV_FEST ] = { .pme_name = "PM_FPU0_FMOV_FEST", .pme_code = 0x10c0, .pme_short_desc = "FPU0 executed FMOV or FEST instructions", .pme_long_desc = "FPU0 has executed a move kind of instruction or one of the estimate instructions. 
This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ.", }, [ POWER5p_PME_PM_THRD_PRIO_7_CYC ] = { .pme_name = "PM_THRD_PRIO_7_CYC", .pme_code = 0x420e6, .pme_short_desc = "Cycles thread running at priority level 7", .pme_long_desc = "Cycles this thread was running at priority level 7.", }, [ POWER5p_PME_PM_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_LSU1_FLUSH_SRQ", .pme_code = 0xc00c7, .pme_short_desc = "LSU1 SRQ lhs flushes", .pme_long_desc = "A store was flushed because a younger load hit an older store that is already in the SRQ or in the same group.", }, [ POWER5p_PME_PM_LD_REF_L1_LSU0 ] = { .pme_name = "PM_LD_REF_L1_LSU0", .pme_code = 0xc10c0, .pme_short_desc = "LSU0 L1 D cache load references", .pme_long_desc = "Load references to Level 1 Data Cache, by unit 0.", }, [ POWER5p_PME_PM_L2SC_RCST_DISP ] = { .pme_name = "PM_L2SC_RCST_DISP", .pme_code = 0x702c2, .pme_short_desc = "L2 slice C RC store dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Store was attempted.", }, [ POWER5p_PME_PM_CMPLU_STALL_DIV ] = { .pme_name = "PM_CMPLU_STALL_DIV", .pme_code = 0x411099, .pme_short_desc = "Completion stall caused by DIV instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a fixed point divide instruction. This is a subset of PM_CMPLU_STALL_FXU.", }, [ POWER5p_PME_PM_MEM_RQ_DISP_Q12to15 ] = { .pme_name = "PM_MEM_RQ_DISP_Q12to15", .pme_code = 0x732e6, .pme_short_desc = "Memory read queue dispatched to queues 12-15", .pme_long_desc = "A memory operation was dispatched to read queue 12,13,14 or 15. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_INST_FROM_L375_SHR ] = { .pme_name = "PM_INST_FROM_L375_SHR", .pme_code = 0x32209d, .pme_short_desc = "Instruction fetched from L3.75 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (S) data from the L3 of a chip on a different module than this processor is located. Fetch groups can contain up to 8 instructions", }, [ POWER5p_PME_PM_ST_REF_L1 ] = { .pme_name = "PM_ST_REF_L1", .pme_code = 0x2c10a8, .pme_short_desc = "L1 D cache store references", .pme_long_desc = "Store references to the Data Cache. Combined Unit 0 + 1.", }, [ POWER5p_PME_PM_L3SB_ALL_BUSY ] = { .pme_name = "PM_L3SB_ALL_BUSY", .pme_code = 0x721e4, .pme_short_desc = "L3 slice B active for every cycle all CI/CO machines busy", .pme_long_desc = "Cycles All Castin/Castout machines are busy.", }, [ POWER5p_PME_PM_FAB_P1toVNorNN_SIDECAR_EMPTY ] = { .pme_name = "PM_FAB_P1toVNorNN_SIDECAR_EMPTY", .pme_code = 0x711c7, .pme_short_desc = "P1 to VN/NN sidecar empty", .pme_long_desc = "Fabric cycles when the Plus-1 jump sidecar (sidecars for mcm to mcm data transfer) is empty. The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L275_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L275_SHR_CYC", .pme_code = 0x2c70a3, .pme_short_desc = "Marked load latency from L2.75 shared", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5p_PME_PM_FAB_HOLDtoNN_EMPTY ] = { .pme_name = "PM_FAB_HOLDtoNN_EMPTY", .pme_code = 0x722e7, .pme_short_desc = "Hold buffer to NN empty", .pme_long_desc = "Fabric cycles when the Next Node out hold-buffers are empty. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5p_PME_PM_DATA_FROM_LMEM ] = { .pme_name = "PM_DATA_FROM_LMEM", .pme_code = 0x2c3087, .pme_short_desc = "Data loaded from local memory", .pme_long_desc = "The processor's Data Cache was reloaded from memory attached to the same module this processor is located on.", }, [ POWER5p_PME_PM_RUN_CYC ] = { .pme_name = "PM_RUN_CYC", .pme_code = 0x100005, .pme_short_desc = "Run cycles", .pme_long_desc = "Processor Cycles gated by the run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. Gating by the run latch filters out the idle loop.", }, [ POWER5p_PME_PM_PTEG_FROM_RMEM ] = { .pme_name = "PM_PTEG_FROM_RMEM", .pme_code = 0x1830a1, .pme_short_desc = "PTEG loaded from remote memory", .pme_long_desc = "A Page Table Entry was loaded into the TLB from memory attached to a different module than this processor is located on.", }, [ POWER5p_PME_PM_L2SC_RCLD_DISP ] = { .pme_name = "PM_L2SC_RCLD_DISP", .pme_code = 0x701c2, .pme_short_desc = "L2 slice C RC load dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Load was attempted", }, [ POWER5p_PME_PM_LSU_LRQ_S0_VALID ] = { .pme_name = "PM_LSU_LRQ_S0_VALID", .pme_code = 0xc60e6, .pme_short_desc = "LRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Load Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. In SMT mode the LRQ is split between the two threads (16 entries each).", }, [ POWER5p_PME_PM_LSU0_LDF ] = { .pme_name = "PM_LSU0_LDF", .pme_code = 0xc50c0, .pme_short_desc = "LSU0 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed by LSU0", }, [ POWER5p_PME_PM_PMC3_OVERFLOW ] = { .pme_name = "PM_PMC3_OVERFLOW", .pme_code = 0x40000a, .pme_short_desc = "PMC3 Overflow", .pme_long_desc = "Overflows from PMC3 are counted. 
This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", }, [ POWER5p_PME_PM_MRK_IMR_RELOAD ] = { .pme_name = "PM_MRK_IMR_RELOAD", .pme_code = 0x820e2, .pme_short_desc = "Marked IMR reloaded", .pme_long_desc = "A DL1 reload occurred due to marked load", }, [ POWER5p_PME_PM_MRK_GRP_TIMEO ] = { .pme_name = "PM_MRK_GRP_TIMEO", .pme_code = 0x40000b, .pme_short_desc = "Marked group completion timeout", .pme_long_desc = "The sampling timeout expired indicating that the previously sampled instruction is no longer in the processor", }, [ POWER5p_PME_PM_ST_MISS_L1 ] = { .pme_name = "PM_ST_MISS_L1", .pme_code = 0xc10c3, .pme_short_desc = "L1 D cache store misses", .pme_long_desc = "A store missed the dcache. Combined Unit 0 + 1.", }, [ POWER5p_PME_PM_STOP_COMPLETION ] = { .pme_name = "PM_STOP_COMPLETION", .pme_code = 0x300018, .pme_short_desc = "Completion stopped", .pme_long_desc = "RAS Unit has signaled completion to stop", }, [ POWER5p_PME_PM_LSU_BUSY_REJECT ] = { .pme_name = "PM_LSU_BUSY_REJECT", .pme_code = 0x2c2088, .pme_short_desc = "LSU busy due to reject", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions. Combined unit 0 + 1.", }, [ POWER5p_PME_PM_ISLB_MISS ] = { .pme_name = "PM_ISLB_MISS", .pme_code = 0x800c1, .pme_short_desc = "Instruction SLB misses", .pme_long_desc = "An SLB miss for an instruction fetch has occurred", }, [ POWER5p_PME_PM_CYC ] = { .pme_name = "PM_CYC", .pme_code = 0xf, .pme_short_desc = "Processor cycles", .pme_long_desc = "Processor cycles", }, [ POWER5p_PME_PM_THRD_ONE_RUN_CYC ] = { .pme_name = "PM_THRD_ONE_RUN_CYC", .pme_code = 0x10000b, .pme_short_desc = "One of the threads in run cycles", .pme_long_desc = "At least one thread has set its run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. 
This event does not respect FCWAIT.", }, [ POWER5p_PME_PM_GRP_BR_REDIR_NONSPEC ] = { .pme_name = "PM_GRP_BR_REDIR_NONSPEC", .pme_code = 0x112091, .pme_short_desc = "Group experienced non-speculative branch redirect", .pme_long_desc = "Number of groups, counted at completion, that have encountered a branch redirect.", }, [ POWER5p_PME_PM_LSU1_SRQ_STFWD ] = { .pme_name = "PM_LSU1_SRQ_STFWD", .pme_code = 0xc60e5, .pme_short_desc = "LSU1 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 1. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load hits L1 and becomes a store forward, it is not treated as a load miss.", }, [ POWER5p_PME_PM_L3SC_MOD_INV ] = { .pme_name = "PM_L3SC_MOD_INV", .pme_code = 0x730e5, .pme_short_desc = "L3 slice C transition from modified to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. 
L3 going M=>I). Mu|Me are not included since they are formed due to a previous read op. Tx is not included since it is considered shared at this point.", }, [ POWER5p_PME_PM_L2_PREF ] = { .pme_name = "PM_L2_PREF", .pme_code = 0xc50c3, .pme_short_desc = "L2 cache prefetches", .pme_long_desc = "A request to prefetch data into L2 was made", }, [ POWER5p_PME_PM_GCT_NOSLOT_BR_MPRED ] = { .pme_name = "PM_GCT_NOSLOT_BR_MPRED", .pme_code = 0x41009c, .pme_short_desc = "No slot in GCT caused by branch mispredict", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread because of a branch misprediction.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L25_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L25_MOD", .pme_code = 0x2c7097, .pme_short_desc = "Marked data loaded from L2.5 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L2 of a chip on the same module as this processor is located due to a marked load.", }, [ POWER5p_PME_PM_L2SB_ST_REQ ] = { .pme_name = "PM_L2SB_ST_REQ", .pme_code = 0x723e1, .pme_short_desc = "L2 slice B store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C.", }, [ POWER5p_PME_PM_L2SB_MOD_INV ] = { .pme_name = "PM_L2SB_MOD_INV", .pme_code = 0x730e1, .pme_short_desc = "L2 slice B transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. 
The event is provided on each of the three slices A, B, and C.", }, [ POWER5p_PME_PM_MRK_L1_RELOAD_VALID ] = { .pme_name = "PM_MRK_L1_RELOAD_VALID", .pme_code = 0xc70e4, .pme_short_desc = "Marked L1 reload data source valid", .pme_long_desc = "The source information is valid and is for a marked load", }, [ POWER5p_PME_PM_L3SB_HIT ] = { .pme_name = "PM_L3SB_HIT", .pme_code = 0x711c4, .pme_short_desc = "L3 slice B hits", .pme_long_desc = "Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice", }, [ POWER5p_PME_PM_L2SB_SHR_MOD ] = { .pme_name = "PM_L2SB_SHR_MOD", .pme_code = 0x700c1, .pme_short_desc = "L2 slice B transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C.", }, [ POWER5p_PME_PM_EE_OFF_EXT_INT ] = { .pme_name = "PM_EE_OFF_EXT_INT", .pme_code = 0x130e7, .pme_short_desc = "Cycles MSR(EE) bit off and external interrupt pending", .pme_long_desc = "Cycles when an interrupt due to an external exception is pending but external exceptions were masked.", }, [ POWER5p_PME_PM_1PLUS_PPC_CMPL ] = { .pme_name = "PM_1PLUS_PPC_CMPL", .pme_code = 0x100013, .pme_short_desc = "One or more PPC instruction completed", .pme_long_desc = "A group containing at least one PPC instruction completed. For microcoded instructions that span multiple groups, this will only occur once.", }, [ POWER5p_PME_PM_L2SC_SHR_MOD ] = { .pme_name = "PM_L2SC_SHR_MOD", .pme_code = 0x700c2, .pme_short_desc = "L2 slice C transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. 
This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C.", }, [ POWER5p_PME_PM_PMC6_OVERFLOW ] = { .pme_name = "PM_PMC6_OVERFLOW", .pme_code = 0x30001a, .pme_short_desc = "PMC6 Overflow", .pme_long_desc = "Overflows from PMC6 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", }, [ POWER5p_PME_PM_IC_PREF_INSTALL ] = { .pme_name = "PM_IC_PREF_INSTALL", .pme_code = 0x210c7, .pme_short_desc = "Instruction prefetched installed in prefetch buffer", .pme_long_desc = "A prefetch buffer entry (line) is allocated but the request is not a demand fetch.", }, [ POWER5p_PME_PM_LSU_LRQ_FULL_CYC ] = { .pme_name = "PM_LSU_LRQ_FULL_CYC", .pme_code = 0x110c2, .pme_short_desc = "Cycles LRQ full", .pme_long_desc = "Cycles when the LRQ is full.", }, [ POWER5p_PME_PM_TLB_MISS ] = { .pme_name = "PM_TLB_MISS", .pme_code = 0x180088, .pme_short_desc = "TLB misses", .pme_long_desc = "Total of Data TLB misses + Instruction TLB misses", }, [ POWER5p_PME_PM_GCT_FULL_CYC ] = { .pme_name = "PM_GCT_FULL_CYC", .pme_code = 0x100c0, .pme_short_desc = "Cycles GCT full", .pme_long_desc = "The Global Completion Table is completely full.", }, [ POWER5p_PME_PM_FXU_BUSY ] = { .pme_name = "PM_FXU_BUSY", .pme_code = 0x200012, .pme_short_desc = "FXU busy", .pme_long_desc = "Cycles when both FXU0 and FXU1 are busy.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L3_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L3_CYC", .pme_code = 0x2c70a4, .pme_short_desc = "Marked load latency from L3", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5p_PME_PM_LSU_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU_REJECT_LMQ_FULL", .pme_code = 0x2c4088, .pme_short_desc = "LSU reject due to LMQ full or missed data coming", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all the eight entries are full, subsequent load instructions are rejected. Combined unit 0 + 1.", }, [ POWER5p_PME_PM_LSU_SRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_SRQ_S0_ALLOC", .pme_code = 0xc20e7, .pme_short_desc = "SRQ slot 0 allocated", .pme_long_desc = "SRQ Slot zero was allocated", }, [ POWER5p_PME_PM_GRP_MRK ] = { .pme_name = "PM_GRP_MRK", .pme_code = 0x100014, .pme_short_desc = "Group marked in IDU", .pme_long_desc = "A group was sampled (marked). The group is called a marked group. One instruction within the group is tagged for detailed monitoring. The sampled instruction is called a marked instruction. Events associated with the marked instruction are annotated with the marked term.", }, [ POWER5p_PME_PM_INST_FROM_L25_SHR ] = { .pme_name = "PM_INST_FROM_L25_SHR", .pme_code = 0x122096, .pme_short_desc = "Instruction fetched from L2.5 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (T or SL) data from the L2 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions.", }, [ POWER5p_PME_PM_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_DC_PREF_STREAM_ALLOC", .pme_code = 0x830e7, .pme_short_desc = "D cache new prefetch stream allocated", .pme_long_desc = "A new Prefetch Stream was allocated.", }, [ POWER5p_PME_PM_FPU1_FIN ] = { .pme_name = "PM_FPU1_FIN", .pme_code = 0x10c7, .pme_short_desc = "FPU1 produced a result", .pme_long_desc = "FPU1 finished, produced a result. This only indicates finish, not completion. 
Floating Point Stores are included in this count but not Floating Point Loads.", }, [ POWER5p_PME_PM_BR_MPRED_TA ] = { .pme_name = "PM_BR_MPRED_TA", .pme_code = 0x230e6, .pme_short_desc = "Branch mispredictions due to target address", .pme_long_desc = "A branch instruction target was incorrectly predicted. This will result in a branch mispredict flush unless a flush is detected from an older instruction.", }, [ POWER5p_PME_PM_MRK_DTLB_REF_64K ] = { .pme_name = "PM_MRK_DTLB_REF_64K", .pme_code = 0x2c6086, .pme_short_desc = "Marked Data TLB reference for 64K page", .pme_long_desc = "Data TLB references by a marked instruction for 64KB pages.", }, [ POWER5p_PME_PM_RUN_INST_CMPL ] = { .pme_name = "PM_RUN_INST_CMPL", .pme_code = 0x500009, .pme_short_desc = "Run instructions completed", .pme_long_desc = "Number of run instructions completed.", }, [ POWER5p_PME_PM_CRQ_FULL_CYC ] = { .pme_name = "PM_CRQ_FULL_CYC", .pme_code = 0x110c1, .pme_short_desc = "Cycles CR issue queue full", .pme_long_desc = "The issue queue that feeds the Conditional Register unit is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented.", }, [ POWER5p_PME_PM_L2SA_RCLD_DISP ] = { .pme_name = "PM_L2SA_RCLD_DISP", .pme_code = 0x701c0, .pme_short_desc = "L2 slice A RC load dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Load was attempted", }, [ POWER5p_PME_PM_SNOOP_WR_RETRY_QFULL ] = { .pme_name = "PM_SNOOP_WR_RETRY_QFULL", .pme_code = 0x710c6, .pme_short_desc = "Snoop write retry due to write queue full", .pme_long_desc = "A snoop request for a write to memory was retried because the write queues were full. When this happens the snoop request is retried and the writes in the write reorder queue are changed to high priority. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_MRK_DTLB_REF_4K ] = { .pme_name = "PM_MRK_DTLB_REF_4K", .pme_code = 0x1c6086, .pme_short_desc = "Marked Data TLB reference for 4K page", .pme_long_desc = "Data TLB references by a marked instruction for 4KB pages.", }, [ POWER5p_PME_PM_LSU_SRQ_S0_VALID ] = { .pme_name = "PM_LSU_SRQ_S0_VALID", .pme_code = 0xc20e6, .pme_short_desc = "SRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. In SMT mode the SRQ is split between the two threads (16 entries each).", }, [ POWER5p_PME_PM_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_LSU0_FLUSH_LRQ", .pme_code = 0xc00c2, .pme_short_desc = "LSU0 LRQ flushes", .pme_long_desc = "A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER5p_PME_PM_INST_FROM_L275_MOD ] = { .pme_name = "PM_INST_FROM_L275_MOD", .pme_code = 0x422096, .pme_short_desc = "Instruction fetched from L2.75 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from the L2 on a different module than this processor is located. 
Fetch groups can contain up to 8 instructions", }, [ POWER5p_PME_PM_GCT_EMPTY_CYC ] = { .pme_name = "PM_GCT_EMPTY_CYC", .pme_code = 0x200004, .pme_short_desc = "Cycles GCT empty", .pme_long_desc = "The Global Completion Table is completely empty", }, [ POWER5p_PME_PM_LARX_LSU0 ] = { .pme_name = "PM_LARX_LSU0", .pme_code = 0x820e7, .pme_short_desc = "Larx executed on LSU0", .pme_long_desc = "A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0)", }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_5or6_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_5or6_CYC", .pme_code = 0x430e6, .pme_short_desc = "Cycles thread priority difference is 5 or 6", .pme_long_desc = "Cycles when this thread's priority is higher than the other thread's priority by 5 or 6.", }, [ POWER5p_PME_PM_SNOOP_RETRY_1AHEAD ] = { .pme_name = "PM_SNOOP_RETRY_1AHEAD", .pme_code = 0x725e6, .pme_short_desc = "Snoop retry due to one ahead collision", .pme_long_desc = "Snoop retry due to one ahead collision", }, [ POWER5p_PME_PM_FPU1_FSQRT ] = { .pme_name = "PM_FPU1_FSQRT", .pme_code = 0xc6, .pme_short_desc = "FPU1 executed FSQRT instruction", .pme_long_desc = "FPU1 has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER5p_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU1", .pme_code = 0x820e4, .pme_short_desc = "LSU1 marked L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache, by LSU1.", }, [ POWER5p_PME_PM_MRK_FPU_FIN ] = { .pme_name = "PM_MRK_FPU_FIN", .pme_code = 0x300014, .pme_short_desc = "Marked instruction FPU processing finished", .pme_long_desc = "One of the Floating Point Units finished a marked instruction. 
Instructions that finish may not necessarily complete", }, [ POWER5p_PME_PM_THRD_PRIO_5_CYC ] = { .pme_name = "PM_THRD_PRIO_5_CYC", .pme_code = 0x420e4, .pme_short_desc = "Cycles thread running at priority level 5", .pme_long_desc = "Cycles this thread was running at priority level 5.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_LMEM ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM", .pme_code = 0x2c7087, .pme_short_desc = "Marked data loaded from local memory", .pme_long_desc = "The processor's Data Cache was reloaded due to a marked load from memory attached to the same module this processor is located on.", }, [ POWER5p_PME_PM_SNOOP_TLBIE ] = { .pme_name = "PM_SNOOP_TLBIE", .pme_code = 0x800c3, .pme_short_desc = "Snoop TLBIE", .pme_long_desc = "A tlbie was snooped from another processor.", }, [ POWER5p_PME_PM_FPU1_FRSP_FCONV ] = { .pme_name = "PM_FPU1_FRSP_FCONV", .pme_code = 0x10c5, .pme_short_desc = "FPU1 executed FRSP or FCONV instructions", .pme_long_desc = "FPU1 has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER5p_PME_PM_DTLB_MISS_16G ] = { .pme_name = "PM_DTLB_MISS_16G", .pme_code = 0x4c208d, .pme_short_desc = "Data TLB miss for 16G page", .pme_long_desc = "Data TLB references to 16GB pages that missed the TLB. Page size is determined at TLB reload time.", }, [ POWER5p_PME_PM_L3SB_SNOOP_RETRY ] = { .pme_name = "PM_L3SB_SNOOP_RETRY", .pme_code = 0x731e4, .pme_short_desc = "L3 slice B snoop retries", .pme_long_desc = "Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b)", }, [ POWER5p_PME_PM_FAB_VBYPASS_EMPTY ] = { .pme_name = "PM_FAB_VBYPASS_EMPTY", .pme_code = 0x731e7, .pme_short_desc = "Vertical bypass buffer empty", .pme_long_desc = "Fabric cycles when the Middle Bypass sidecar is empty. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L275_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L275_MOD", .pme_code = 0x1c70a3, .pme_short_desc = "Marked data loaded from L2.75 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L2 on a different module than this processor is located due to a marked load.", }, [ POWER5p_PME_PM_L2SB_RCST_DISP ] = { .pme_name = "PM_L2SB_RCST_DISP", .pme_code = 0x702c1, .pme_short_desc = "L2 slice B RC store dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Store was attempted.", }, [ POWER5p_PME_PM_6INST_CLB_CYC ] = { .pme_name = "PM_6INST_CLB_CYC", .pme_code = 0x400c6, .pme_short_desc = "Cycles 6 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", }, [ POWER5p_PME_PM_FLUSH ] = { .pme_name = "PM_FLUSH", .pme_code = 0x110c7, .pme_short_desc = "Flushes", .pme_long_desc = "Flushes occurred including LSU and Branch flushes.", }, [ POWER5p_PME_PM_L2SC_MOD_INV ] = { .pme_name = "PM_L2SC_MOD_INV", .pme_code = 0x730e2, .pme_short_desc = "L2 slice C transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. 
The event is provided on each of the three slices A, B, and C.", }, [ POWER5p_PME_PM_FPU_DENORM ] = { .pme_name = "PM_FPU_DENORM", .pme_code = 0x102088, .pme_short_desc = "FPU received denormalized data", .pme_long_desc = "The floating point unit has encountered a denormalized operand. Combined Unit 0 + Unit 1.", }, [ POWER5p_PME_PM_L3SC_HIT ] = { .pme_name = "PM_L3SC_HIT", .pme_code = 0x711c5, .pme_short_desc = "L3 slice C hits", .pme_long_desc = "Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice", }, [ POWER5p_PME_PM_SNOOP_WR_RETRY_RQ ] = { .pme_name = "PM_SNOOP_WR_RETRY_RQ", .pme_code = 0x706c6, .pme_short_desc = "Snoop write/dclaim retry due to collision with active read queue", .pme_long_desc = "A snoop request for a write or dclaim to memory was retried because it matched the cacheline of an active read. This event is sent from the Memory Controller clock domain and must be scaled accordingly", }, [ POWER5p_PME_PM_LSU1_REJECT_SRQ ] = { .pme_name = "PM_LSU1_REJECT_SRQ", .pme_code = 0xc40c4, .pme_short_desc = "LSU1 SRQ lhs rejects", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions because of Load Hit Store conditions. 
Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue.", }, [ POWER5p_PME_PM_L3SC_ALL_BUSY ] = { .pme_name = "PM_L3SC_ALL_BUSY", .pme_code = 0x721e5, .pme_short_desc = "L3 slice C active for every cycle all CI/CO machines busy", .pme_long_desc = "Cycles All Castin/Castout machines are busy.", }, [ POWER5p_PME_PM_IC_PREF_REQ ] = { .pme_name = "PM_IC_PREF_REQ", .pme_code = 0x220e6, .pme_short_desc = "Instruction prefetch requests", .pme_long_desc = "An instruction prefetch request has been made.", }, [ POWER5p_PME_PM_MRK_GRP_IC_MISS ] = { .pme_name = "PM_MRK_GRP_IC_MISS", .pme_code = 0x412091, .pme_short_desc = "Group experienced marked I cache miss", .pme_long_desc = "A group containing a marked (sampled) instruction experienced an instruction cache miss.", }, [ POWER5p_PME_PM_GCT_NOSLOT_IC_MISS ] = { .pme_name = "PM_GCT_NOSLOT_IC_MISS", .pme_code = 0x21009c, .pme_short_desc = "No slot in GCT caused by I cache miss", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread because of an Instruction Cache miss.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L3 ] = { .pme_name = "PM_MRK_DATA_FROM_L3", .pme_code = 0x1c708e, .pme_short_desc = "Marked data loaded from L3", .pme_long_desc = "The processor's Data Cache was reloaded from the local L3 due to a marked load.", }, [ POWER5p_PME_PM_GCT_NOSLOT_SRQ_FULL ] = { .pme_name = "PM_GCT_NOSLOT_SRQ_FULL", .pme_code = 0x310084, .pme_short_desc = "No slot in GCT caused by SRQ full", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread because the Store Request Queue (SRQ) is full. This happens when the storage subsystem can not process the stores in the SRQ. 
Groups can not be dispatched until a SRQ entry is available.", }, [ POWER5p_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { .pme_name = "PM_CMPLU_STALL_DCACHE_MISS", .pme_code = 0x21109a, .pme_short_desc = "Completion stall caused by D cache miss", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered a Data Cache Miss. Data Cache Miss has higher priority than any other Load/Store delay, so if an instruction encounters multiple delays only the Data Cache Miss will be reported and the entire delay period will be charged to Data Cache Miss. This is a subset of PM_CMPLU_STALL_LSU.", }, [ POWER5p_PME_PM_THRD_SEL_OVER_ISU_HOLD ] = { .pme_name = "PM_THRD_SEL_OVER_ISU_HOLD", .pme_code = 0x410c5, .pme_short_desc = "Thread selection overrides caused by ISU holds", .pme_long_desc = "Thread selection was overridden because of an ISU hold.", }, [ POWER5p_PME_PM_LSU_FLUSH_LRQ ] = { .pme_name = "PM_LSU_FLUSH_LRQ", .pme_code = 0x2c0090, .pme_short_desc = "LRQ flushes", .pme_long_desc = "A load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. Combined Units 0 and 1.", }, [ POWER5p_PME_PM_THRD_PRIO_2_CYC ] = { .pme_name = "PM_THRD_PRIO_2_CYC", .pme_code = 0x420e1, .pme_short_desc = "Cycles thread running at priority level 2", .pme_long_desc = "Cycles this thread was running at priority level 2.", }, [ POWER5p_PME_PM_L3SA_MOD_INV ] = { .pme_name = "PM_L3SA_MOD_INV", .pme_code = 0x730e3, .pme_short_desc = "L3 slice A transition from modified to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I) Mu|Me are not included since they are formed due to a prev read op. 
Tx is not included since it is considered shared at this point.", }, [ POWER5p_PME_PM_LSU_FLUSH_SRQ ] = { .pme_name = "PM_LSU_FLUSH_SRQ", .pme_code = 0x1c0090, .pme_short_desc = "SRQ flushes", .pme_long_desc = "A store was flushed because a younger load hit an older store that is already in the SRQ or in the same group. Combined Unit 0 + 1.", }, [ POWER5p_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { .pme_name = "PM_MRK_LSU_SRQ_INST_VALID", .pme_code = 0xc70e6, .pme_short_desc = "Marked instruction valid in SRQ", .pme_long_desc = "This signal is asserted every cycle when a marked request is resident in the Store Request Queue", }, [ POWER5p_PME_PM_L3SA_REF ] = { .pme_name = "PM_L3SA_REF", .pme_code = 0x701c3, .pme_short_desc = "L3 slice A references", .pme_long_desc = "Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice", }, [ POWER5p_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL ] = { .pme_name = "PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL", .pme_code = 0x713c2, .pme_short_desc = "L2 slice C RC dispatch attempt failed due to all CO busy", .pme_long_desc = "A Read/Claim dispatch was rejected because all Castout machines were busy.", }, [ POWER5p_PME_PM_FPU0_STALL3 ] = { .pme_name = "PM_FPU0_STALL3", .pme_code = 0x20e1, .pme_short_desc = "FPU0 stalled in pipe3", .pme_long_desc = "FPU0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always).", }, [ POWER5p_PME_PM_TB_BIT_TRANS ] = { .pme_name = "PM_TB_BIT_TRANS", .pme_code = 0x100018, .pme_short_desc = "Time Base bit transition", .pme_long_desc = "When the selected time base bit (as specified in MMCR0[TBSEL]) transitions from 0 to 1", }, [ POWER5p_PME_PM_GPR_MAP_FULL_CYC ] = { .pme_name = "PM_GPR_MAP_FULL_CYC", .pme_code = 0x130e5, .pme_short_desc = "Cycles GPR mapper full", .pme_long_desc = "The General Purpose Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. 
This event only indicates that the mapper was full, not that dispatch was prevented.", }, [ POWER5p_PME_PM_MRK_LSU_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_LRQ", .pme_code = 0x381088, .pme_short_desc = "Marked LRQ flushes", .pme_long_desc = "A marked load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER5p_PME_PM_FPU0_STF ] = { .pme_name = "PM_FPU0_STF", .pme_code = 0x20e2, .pme_short_desc = "FPU0 executed store instruction", .pme_long_desc = "FPU0 has executed a Floating Point Store instruction.", }, [ POWER5p_PME_PM_MRK_DTLB_MISS ] = { .pme_name = "PM_MRK_DTLB_MISS", .pme_code = 0xc50c6, .pme_short_desc = "Marked Data TLB misses", .pme_long_desc = "Data TLB references by a marked instruction that missed the TLB (all page sizes).", }, [ POWER5p_PME_PM_FPU1_FMA ] = { .pme_name = "PM_FPU1_FMA", .pme_code = 0xc5, .pme_short_desc = "FPU1 executed multiply-add instruction", .pme_long_desc = "The floating point unit has executed a multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER5p_PME_PM_L2SA_MOD_TAG ] = { .pme_name = "PM_L2SA_MOD_TAG", .pme_code = 0x720e0, .pme_short_desc = "L2 slice A transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. 
The event is provided on each of the three slices A, B, and C.", }, [ POWER5p_PME_PM_LSU1_FLUSH_ULD ] = { .pme_name = "PM_LSU1_FLUSH_ULD", .pme_code = 0xc00c4, .pme_short_desc = "LSU1 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 1 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1).", }, [ POWER5p_PME_PM_MRK_INST_FIN ] = { .pme_name = "PM_MRK_INST_FIN", .pme_code = 0x300005, .pme_short_desc = "Marked instruction finished", .pme_long_desc = "One of the execution units finished a marked instruction. Instructions that finish may not necessarily complete", }, [ POWER5p_PME_PM_MRK_LSU0_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU0_FLUSH_UST", .pme_code = 0x810c0, .pme_short_desc = "LSU0 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 0 because it was unaligned", }, [ POWER5p_PME_PM_FPU0_FULL_CYC ] = { .pme_name = "PM_FPU0_FULL_CYC", .pme_code = 0x100c3, .pme_short_desc = "Cycles FPU0 issue queue full", .pme_long_desc = "The issue queue for FPU0 cannot accept any more instructions. Dispatch to this issue queue is stopped.", }, [ POWER5p_PME_PM_LSU_LRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LRQ_S0_ALLOC", .pme_code = 0xc60e7, .pme_short_desc = "LRQ slot 0 allocated", .pme_long_desc = "LRQ slot zero was allocated", }, [ POWER5p_PME_PM_MRK_LSU1_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU1_FLUSH_ULD", .pme_code = 0x810c4, .pme_short_desc = "LSU1 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 1 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1)", }, [ POWER5p_PME_PM_MRK_DTLB_REF ] = { .pme_name = "PM_MRK_DTLB_REF", .pme_code = 0xc60e4, .pme_short_desc = "Marked Data TLB reference", .pme_long_desc = "Total number of Data TLB references by a marked instruction for all page sizes. 
Page size is determined at TLB reload time.", }, [ POWER5p_PME_PM_BR_UNCOND ] = { .pme_name = "PM_BR_UNCOND", .pme_code = 0x123087, .pme_short_desc = "Unconditional branch", .pme_long_desc = "An unconditional branch was executed.", }, [ POWER5p_PME_PM_THRD_SEL_OVER_L2MISS ] = { .pme_name = "PM_THRD_SEL_OVER_L2MISS", .pme_code = 0x410c3, .pme_short_desc = "Thread selection overrides caused by L2 misses", .pme_long_desc = "Thread selection was overridden because one thread had an L2 miss pending.", }, [ POWER5p_PME_PM_L2SB_SHR_INV ] = { .pme_name = "PM_L2SB_SHR_INV", .pme_code = 0x710c1, .pme_short_desc = "L2 slice B transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful the tablewalk duration event should also be counted.", }, [ POWER5p_PME_PM_MEM_LO_PRIO_WR_CMPL ] = { .pme_name = "PM_MEM_LO_PRIO_WR_CMPL", .pme_code = 0x736e6, .pme_short_desc = "Low priority write completed", .pme_long_desc = "A memory write, which was not upgraded to high priority, completed. This event is sent from the Memory Controller clock domain and must be scaled accordingly", }, [ POWER5p_PME_PM_MRK_DTLB_MISS_64K ] = { .pme_name = "PM_MRK_DTLB_MISS_64K", .pme_code = 0x2c608d, .pme_short_desc = "Marked Data TLB misses for 64K page", .pme_long_desc = "Data TLB references to 64KB pages by a marked instruction that missed the TLB. 
Page size is determined at TLB reload time.", }, [ POWER5p_PME_PM_MRK_ST_MISS_L1 ] = { .pme_name = "PM_MRK_ST_MISS_L1", .pme_code = 0x820e3, .pme_short_desc = "Marked L1 D cache store misses", .pme_long_desc = "A marked store missed the dcache", }, [ POWER5p_PME_PM_L3SC_MOD_TAG ] = { .pme_name = "PM_L3SC_MOD_TAG", .pme_code = 0x720e5, .pme_short_desc = "L3 slice C transition from modified to TAG", .pme_long_desc = "L3 snooper detects someone doing a read to a line that is truly M in this L3 (i.e. L3 going M->T or M->I (go_Mu case); Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point.", }, [ POWER5p_PME_PM_GRP_DISP_SUCCESS ] = { .pme_name = "PM_GRP_DISP_SUCCESS", .pme_code = 0x300002, .pme_short_desc = "Group dispatch success", .pme_long_desc = "Number of groups successfully dispatched (not rejected)", }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_1or2_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_1or2_CYC", .pme_code = 0x430e4, .pme_short_desc = "Cycles thread priority difference is 1 or 2", .pme_long_desc = "Cycles when this thread's priority is higher than the other thread's priority by 1 or 2.", }, [ POWER5p_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BHT_REDIRECT", .pme_code = 0x230e0, .pme_short_desc = "L2 I cache demand request due to BHT redirect", .pme_long_desc = "A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (CR mispredict).", }, [ POWER5p_PME_PM_LSU_DERAT_MISS ] = { .pme_name = "PM_LSU_DERAT_MISS", .pme_code = 0x280090, .pme_short_desc = "DERAT misses", .pme_long_desc = "Total D-ERAT Misses. Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction. 
Combined Unit 0 + 1.", }, [ POWER5p_PME_PM_MEM_WQ_DISP_Q8to15 ] = { .pme_name = "PM_MEM_WQ_DISP_Q8to15", .pme_code = 0x733e6, .pme_short_desc = "Memory write queue dispatched to queues 8-15", .pme_long_desc = "A memory operation was dispatched to a write queue in the range between 8 and 15. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_FPU0_SINGLE ] = { .pme_name = "PM_FPU0_SINGLE", .pme_code = 0x20e3, .pme_short_desc = "FPU0 executed single precision instruction", .pme_long_desc = "FPU0 has executed a single precision instruction.", }, [ POWER5p_PME_PM_THRD_PRIO_1_CYC ] = { .pme_name = "PM_THRD_PRIO_1_CYC", .pme_code = 0x420e0, .pme_short_desc = "Cycles thread running at priority level 1", .pme_long_desc = "Cycles this thread was running at priority level 1. Priority level 1 is the lowest and indicates the thread is sleeping.", }, [ POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SC_RCST_DISP_FAIL_OTHER", .pme_code = 0x732e2, .pme_short_desc = "L2 slice C RC store dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted.", }, [ POWER5p_PME_PM_SNOOP_RD_RETRY_RQ ] = { .pme_name = "PM_SNOOP_RD_RETRY_RQ", .pme_code = 0x705c6, .pme_short_desc = "Snoop read retry due to collision with active read queue", .pme_long_desc = "A snoop request for a read from memory was retried because it matched the cache line of an active read. The snoop request is retried because the L2 may be able to source data via intervention for the 2nd read faster than the MC. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_FAB_HOLDtoVN_EMPTY ] = { .pme_name = "PM_FAB_HOLDtoVN_EMPTY", .pme_code = 0x721e7, .pme_short_desc = "Hold buffer to VN empty", .pme_long_desc = "Fabric cycles when the Vertical Node out hold-buffers are empty. The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5p_PME_PM_FPU1_FEST ] = { .pme_name = "PM_FPU1_FEST", .pme_code = 0x10c6, .pme_short_desc = "FPU1 executed FEST instruction", .pme_long_desc = "FPU1 has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ.", }, [ POWER5p_PME_PM_SNOOP_DCLAIM_RETRY_QFULL ] = { .pme_name = "PM_SNOOP_DCLAIM_RETRY_QFULL", .pme_code = 0x720e6, .pme_short_desc = "Snoop dclaim/flush retry due to write/dclaim queues full", .pme_long_desc = "A snoop dclaim or flush request was retried because the write/dclaim queues were full. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L25_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L25_SHR_CYC", .pme_code = 0x2c70a2, .pme_short_desc = "Marked load latency from L2.5 shared", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5p_PME_PM_MRK_ST_CMPL_INT ] = { .pme_name = "PM_MRK_ST_CMPL_INT", .pme_code = 0x300003, .pme_short_desc = "Marked store completed with intervention", .pme_long_desc = "A marked store previously sent to the memory subsystem completed (data home) after requiring intervention", }, [ POWER5p_PME_PM_FLUSH_BR_MPRED ] = { .pme_name = "PM_FLUSH_BR_MPRED", .pme_code = 0x110c6, .pme_short_desc = "Flush caused by branch mispredict", .pme_long_desc = "A flush was caused by a branch mispredict.", }, [ POWER5p_PME_PM_MRK_DTLB_MISS_16G ] = { .pme_name = "PM_MRK_DTLB_MISS_16G", .pme_code = 0x4c608d, .pme_short_desc = "Marked Data TLB misses for 16G page", .pme_long_desc = "Data TLB references to 16GB pages by a marked instruction that missed the TLB. Page size is determined at TLB reload time.", }, [ POWER5p_PME_PM_FPU_STF ] = { .pme_name = "PM_FPU_STF", .pme_code = 0x202090, .pme_short_desc = "FPU executed store instruction", .pme_long_desc = "FPU has executed a store instruction. Combined Unit 0 + Unit 1.", }, [ POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SB_RCLD_DISP_FAIL_ADDR", .pme_code = 0x711c1, .pme_short_desc = "L2 slice B RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a load failed because of an address conflict. 
Two RC machines will never both work on the same line or line in the same congruence class at the same time.", }, [ POWER5p_PME_PM_CMPLU_STALL_FPU ] = { .pme_name = "PM_CMPLU_STALL_FPU", .pme_code = 0x411098, .pme_short_desc = "Completion stall caused by FPU instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a floating point instruction.", }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus1or2_CYC", .pme_code = 0x430e2, .pme_short_desc = "Cycles thread priority difference is -1 or -2", .pme_long_desc = "Cycles when this thread's priority is lower than the other thread's priority by 1 or 2.", }, [ POWER5p_PME_PM_GCT_NOSLOT_CYC ] = { .pme_name = "PM_GCT_NOSLOT_CYC", .pme_code = 0x100004, .pme_short_desc = "Cycles no GCT slot allocated", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread.", }, [ POWER5p_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { .pme_name = "PM_FXU0_BUSY_FXU1_IDLE", .pme_code = 0x300012, .pme_short_desc = "FXU0 busy FXU1 idle", .pme_long_desc = "FXU0 was busy while FXU1 was idle", }, [ POWER5p_PME_PM_PTEG_FROM_L35_SHR ] = { .pme_name = "PM_PTEG_FROM_L35_SHR", .pme_code = 0x18309e, .pme_short_desc = "PTEG loaded from L3.5 shared", .pme_long_desc = "A Page Table Entry was loaded into the TLB with shared (S) data from the L3 of a chip on the same module as this processor is located, due to a demand load.", }, [ POWER5p_PME_PM_MRK_DTLB_REF_16G ] = { .pme_name = "PM_MRK_DTLB_REF_16G", .pme_code = 0x4c6086, .pme_short_desc = "Marked Data TLB reference for 16G page", .pme_long_desc = "Data TLB references by a marked instruction for 16GB pages.", }, [ POWER5p_PME_PM_MRK_LSU_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU_FLUSH_UST", .pme_code = 0x2810a8, .pme_short_desc = "Marked unaligned store flushes", .pme_long_desc = "A marked store was flushed because it was unaligned", }, [ 
POWER5p_PME_PM_MRK_DATA_FROM_L25_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L25_SHR", .pme_code = 0x1c7097, .pme_short_desc = "Marked data loaded from L2.5 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T or SL) data from the L2 of a chip on the same module as this processor is located due to a marked load.", }, [ POWER5p_PME_PM_L3SA_HIT ] = { .pme_name = "PM_L3SA_HIT", .pme_code = 0x711c3, .pme_short_desc = "L3 slice A hits", .pme_long_desc = "Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L35_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L35_SHR", .pme_code = 0x1c709e, .pme_short_desc = "Marked data loaded from L3.5 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on the same module as this processor is located due to a marked load.", }, [ POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SB_RCST_DISP_FAIL_ADDR", .pme_code = 0x712c1, .pme_short_desc = "L2 slice B RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.", }, [ POWER5p_PME_PM_IERAT_XLATE_WR ] = { .pme_name = "PM_IERAT_XLATE_WR", .pme_code = 0x220e7, .pme_short_desc = "Translation written to ierat", .pme_long_desc = "An entry was written into the IERAT as a result of an IERAT miss. This event can be used to count IERAT misses. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed.", }, [ POWER5p_PME_PM_L2SA_ST_REQ ] = { .pme_name = "PM_L2SA_ST_REQ", .pme_code = 0x723e0, .pme_short_desc = "L2 slice A store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. 
Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C.", }, [ POWER5p_PME_PM_INST_FROM_LMEM ] = { .pme_name = "PM_INST_FROM_LMEM", .pme_code = 0x222086, .pme_short_desc = "Instruction fetched from local memory", .pme_long_desc = "An instruction fetch group was fetched from memory attached to the same module this processor is located on. Fetch groups can contain up to 8 instructions", }, [ POWER5p_PME_PM_THRD_SEL_T1 ] = { .pme_name = "PM_THRD_SEL_T1", .pme_code = 0x410c1, .pme_short_desc = "Decode selected thread 1", .pme_long_desc = "Thread selection picked thread 1 for decode.", }, [ POWER5p_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BR_REDIRECT", .pme_code = 0x230e1, .pme_short_desc = "L2 I cache demand request due to branch redirect", .pme_long_desc = "A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (either ALL mispredicted or Target).", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L35_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L35_SHR_CYC", .pme_code = 0x2c70a6, .pme_short_desc = "Marked load latency from L3.5 shared", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5p_PME_PM_FPU0_1FLOP ] = { .pme_name = "PM_FPU0_1FLOP", .pme_code = 0xc3, .pme_short_desc = "FPU0 executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. 
These are single FLOP operations.", }, [ POWER5p_PME_PM_PTEG_FROM_L2 ] = { .pme_name = "PM_PTEG_FROM_L2", .pme_code = 0x183087, .pme_short_desc = "PTEG loaded from L2", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local L2 due to a demand load", }, [ POWER5p_PME_PM_MEM_PW_CMPL ] = { .pme_name = "PM_MEM_PW_CMPL", .pme_code = 0x724e6, .pme_short_desc = "Memory partial-write completed", .pme_long_desc = "Number of Partial Writes completed. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus5or6_CYC", .pme_code = 0x430e0, .pme_short_desc = "Cycles thread priority difference is -5 or -6", .pme_long_desc = "Cycles when this thread's priority is lower than the other thread's priority by 5 or 6.", }, [ POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SB_RCLD_DISP_FAIL_OTHER", .pme_code = 0x731e1, .pme_short_desc = "L2 slice B RC load dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions.", }, [ POWER5p_PME_PM_MRK_DTLB_MISS_4K ] = { .pme_name = "PM_MRK_DTLB_MISS_4K", .pme_code = 0x1c608d, .pme_short_desc = "Marked Data TLB misses for 4K page", .pme_long_desc = "Data TLB references to 4KB pages by a marked instruction that missed the TLB. Page size is determined at TLB reload time.", }, [ POWER5p_PME_PM_FPU0_FIN ] = { .pme_name = "PM_FPU0_FIN", .pme_code = 0x10c3, .pme_short_desc = "FPU0 produced a result", .pme_long_desc = "FPU0 finished, produced a result. This only indicates finish, not completion. 
Floating Point Stores are included in this count but not Floating Point Loads.", }, [ POWER5p_PME_PM_L3SC_SHR_INV ] = { .pme_name = "PM_L3SC_SHR_INV", .pme_code = 0x710c5, .pme_short_desc = "L3 slice C transition from shared to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is Sx in this L3 (i.e. invalidate hit SX and dispatched).", }, [ POWER5p_PME_PM_GRP_BR_REDIR ] = { .pme_name = "PM_GRP_BR_REDIR", .pme_code = 0x120e6, .pme_short_desc = "Group experienced branch redirect", .pme_long_desc = "Number of groups, counted at dispatch, that have encountered a branch redirect. Every group constructed from a fetch group that has been redirected will count.", }, [ POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SC_RCLD_DISP_FAIL_RC_FULL", .pme_code = 0x721e2, .pme_short_desc = "L2 slice C RC load dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a load failed because all RC machines are busy.", }, [ POWER5p_PME_PM_MRK_LSU_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_SRQ", .pme_code = 0x481088, .pme_short_desc = "Marked SRQ lhs flushes", .pme_long_desc = "A marked store was flushed because a younger load hit an older store that is already in the SRQ or in the same group.", }, [ POWER5p_PME_PM_PTEG_FROM_L275_SHR ] = { .pme_name = "PM_PTEG_FROM_L275_SHR", .pme_code = 0x383097, .pme_short_desc = "PTEG loaded from L2.75 shared", .pme_long_desc = "A Page Table Entry was loaded into the TLB with shared (T) data from the L2 on a different module than this processor is located due to a demand load.", }, [ POWER5p_PME_PM_L2SB_RCLD_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SB_RCLD_DISP_FAIL_RC_FULL", .pme_code = 0x721e1, .pme_short_desc = "L2 slice B RC load dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a load failed because all RC machines are busy.", }, [ POWER5p_PME_PM_SNOOP_RD_RETRY_WQ ] = { .pme_name = "PM_SNOOP_RD_RETRY_WQ", .pme_code = 
0x715c6, .pme_short_desc = "Snoop read retry due to collision with active write queue", .pme_long_desc = "A snoop request for a read from memory was retried because it matched the cache line of an active write. The snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_FAB_DCLAIM_RETRIED ] = { .pme_name = "PM_FAB_DCLAIM_RETRIED", .pme_code = 0x730e7, .pme_short_desc = "dclaim retried", .pme_long_desc = "A DCLAIM command was retried. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5p_PME_PM_LSU0_NCLD ] = { .pme_name = "PM_LSU0_NCLD", .pme_code = 0xc50c1, .pme_short_desc = "LSU0 non-cacheable loads", .pme_long_desc = "A non-cacheable load was executed by unit 0.", }, [ POWER5p_PME_PM_LSU1_BUSY_REJECT ] = { .pme_name = "PM_LSU1_BUSY_REJECT", .pme_code = 0xc20e5, .pme_short_desc = "LSU1 busy due to reject", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions.", }, [ POWER5p_PME_PM_FXLS0_FULL_CYC ] = { .pme_name = "PM_FXLS0_FULL_CYC", .pme_code = 0x110c0, .pme_short_desc = "Cycles FXU0/LS0 queue full", .pme_long_desc = "The issue queue that feeds the Fixed Point unit 0 / Load Store Unit 0 is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented.", }, [ POWER5p_PME_PM_DTLB_REF_16M ] = { .pme_name = "PM_DTLB_REF_16M", .pme_code = 0x3c2086, .pme_short_desc = "Data TLB reference for 16M page", .pme_long_desc = "Data TLB references for 16MB pages. Includes hits + misses.", }, [ POWER5p_PME_PM_FPU0_FEST ] = { .pme_name = "PM_FPU0_FEST", .pme_code = 0x10c2, .pme_short_desc = "FPU0 executed FEST instruction", .pme_long_desc = "FPU0 has executed an estimate instruction. 
This could be fres* or frsqrte* where XYZ* means XYZ or XYZ.", }, [ POWER5p_PME_PM_GCT_USAGE_60to79_CYC ] = { .pme_name = "PM_GCT_USAGE_60to79_CYC", .pme_code = 0x20001f, .pme_short_desc = "Cycles GCT 60-79% full", .pme_long_desc = "Cycles when the Global Completion Table has between 60% and 79% of its slots used. The GCT has 20 entries shared between threads.", }, [ POWER5p_PME_PM_DATA_FROM_L25_MOD ] = { .pme_name = "PM_DATA_FROM_L25_MOD", .pme_code = 0x2c3097, .pme_short_desc = "Data loaded from L2.5 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L2 of a chip on the same module as this processor is located due to a demand load.", }, [ POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SC_RCLD_DISP_FAIL_ADDR", .pme_code = 0x711c2, .pme_short_desc = "L2 slice C RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.", }, [ POWER5p_PME_PM_LSU0_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU0_REJECT_ERAT_MISS", .pme_code = 0xc40c3, .pme_short_desc = "LSU0 reject due to ERAT miss", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions due to an ERAT miss. 
Requests that miss the Derat are rejected and retried until the request hits in the Erat.", }, [ POWER5p_PME_PM_DATA_FROM_L375_MOD ] = { .pme_name = "PM_DATA_FROM_L375_MOD", .pme_code = 0x1c30a7, .pme_short_desc = "Data loaded from L3.75 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on a different module than this processor is located due to a demand load.", }, [ POWER5p_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", .pme_code = 0x200015, .pme_short_desc = "Cycles LMQ and SRQ empty", .pme_long_desc = "Cycles when both the LMQ and SRQ are empty (LSU is idle)", }, [ POWER5p_PME_PM_DTLB_MISS_64K ] = { .pme_name = "PM_DTLB_MISS_64K", .pme_code = 0x2c208d, .pme_short_desc = "Data TLB miss for 64K page", .pme_long_desc = "Data TLB references to 64KB pages that missed the TLB. Page size is determined at TLB reload time.", }, [ POWER5p_PME_PM_LSU0_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU0_REJECT_RELOAD_CDF", .pme_code = 0xc40c2, .pme_short_desc = "LSU0 reject due to reload CDF or tag update collision", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. 
Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same cycle when they are being updated.", }, [ POWER5p_PME_PM_0INST_FETCH ] = { .pme_name = "PM_0INST_FETCH", .pme_code = 0x42208d, .pme_short_desc = "No instructions fetched", .pme_long_desc = "No instructions were fetched this cycle (due to IFU hold, redirect, or icache miss)", }, [ POWER5p_PME_PM_LSU1_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU1_REJECT_RELOAD_CDF", .pme_code = 0xc40c6, .pme_short_desc = "LSU1 reject due to reload CDF or tag update collision", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same cycle when they are being updated.", }, [ POWER5p_PME_PM_MEM_WQ_DISP_Q0to7 ] = { .pme_name = "PM_MEM_WQ_DISP_Q0to7", .pme_code = 0x723e6, .pme_short_desc = "Memory write queue dispatched to queues 0-7", .pme_long_desc = "A memory operation was dispatched to a write queue in the range between 0 and 7. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_L1_PREF ] = { .pme_name = "PM_L1_PREF", .pme_code = 0xc70e7, .pme_short_desc = "L1 cache data prefetches", .pme_long_desc = "A request to prefetch data into the L1 was made", }, [ POWER5p_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM_CYC", .pme_code = 0x4c70a0, .pme_short_desc = "Marked load latency from local memory", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. 
Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5p_PME_PM_BRQ_FULL_CYC ] = { .pme_name = "PM_BRQ_FULL_CYC", .pme_code = 0x100c5, .pme_short_desc = "Cycles branch queue full", .pme_long_desc = "Cycles when the issue queue that feeds the branch unit is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented.", }, [ POWER5p_PME_PM_GRP_IC_MISS_NONSPEC ] = { .pme_name = "PM_GRP_IC_MISS_NONSPEC", .pme_code = 0x112099, .pme_short_desc = "Group experienced non-speculative I cache miss", .pme_long_desc = "Number of groups, counted at completion, that have encountered an instruction cache miss.", }, [ POWER5p_PME_PM_PTEG_FROM_L275_MOD ] = { .pme_name = "PM_PTEG_FROM_L275_MOD", .pme_code = 0x1830a3, .pme_short_desc = "PTEG loaded from L2.75 modified", .pme_long_desc = "A Page Table Entry was loaded into the TLB with modified (M) data from the L2 on a different module than this processor is located due to a demand load.", }, [ POWER5p_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU0", .pme_code = 0x820e0, .pme_short_desc = "LSU0 marked L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache, by LSU0.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L375_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L375_SHR_CYC", .pme_code = 0x2c70a7, .pme_short_desc = "Marked load latency from L3.75 shared", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5p_PME_PM_DATA_FROM_L3 ] = { .pme_name = "PM_DATA_FROM_L3", .pme_code = 0x1c308e, .pme_short_desc = "Data loaded from L3", .pme_long_desc = "The processor's Data Cache was reloaded from the local L3 due to a demand load.", }, [ POWER5p_PME_PM_INST_FROM_L2 ] = { .pme_name = "PM_INST_FROM_L2", .pme_code = 0x122086, .pme_short_desc = "Instruction fetched from L2", .pme_long_desc = "An instruction fetch group was fetched from L2. Fetch Groups can contain up to 8 instructions", }, [ POWER5p_PME_PM_LSU_FLUSH ] = { .pme_name = "PM_LSU_FLUSH", .pme_code = 0x110c5, .pme_short_desc = "Flush initiated by LSU", .pme_long_desc = "A flush was initiated by the Load Store Unit", }, [ POWER5p_PME_PM_PMC2_OVERFLOW ] = { .pme_name = "PM_PMC2_OVERFLOW", .pme_code = 0x30000a, .pme_short_desc = "PMC2 Overflow", .pme_long_desc = "Overflows from PMC2 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", }, [ POWER5p_PME_PM_FPU0_DENORM ] = { .pme_name = "PM_FPU0_DENORM", .pme_code = 0x20e0, .pme_short_desc = "FPU0 received denormalized data", .pme_long_desc = "FPU0 has encountered a denormalized operand.", }, [ POWER5p_PME_PM_FPU1_FMOV_FEST ] = { .pme_name = "PM_FPU1_FMOV_FEST", .pme_code = 0x10c4, .pme_short_desc = "FPU1 executed FMOV or FEST instructions", .pme_long_desc = "FPU1 has executed a move kind of instruction or one of the estimate instructions. 
This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ.", }, [ POWER5p_PME_PM_INST_FETCH_CYC ] = { .pme_name = "PM_INST_FETCH_CYC", .pme_code = 0x220e4, .pme_short_desc = "Cycles at least 1 instruction fetched", .pme_long_desc = "Cycles when at least one instruction was sent from the fetch unit to the decode unit.", }, [ POWER5p_PME_PM_INST_DISP ] = { .pme_name = "PM_INST_DISP", .pme_code = 0x300009, .pme_short_desc = "Instructions dispatched", .pme_long_desc = "Number of PowerPC instructions successfully dispatched.", }, [ POWER5p_PME_PM_LSU_LDF ] = { .pme_name = "PM_LSU_LDF", .pme_code = 0x1c50a8, .pme_short_desc = "LSU executed Floating Point load instruction", .pme_long_desc = "LSU executed Floating Point load instruction. Combined Unit 0 + 1.", }, [ POWER5p_PME_PM_DATA_FROM_L25_SHR ] = { .pme_name = "PM_DATA_FROM_L25_SHR", .pme_code = 0x1c3097, .pme_short_desc = "Data loaded from L2.5 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T or SL) data from the L2 of a chip on the same module as this processor is located due to a demand load.", }, [ POWER5p_PME_PM_L1_DCACHE_RELOAD_VALID ] = { .pme_name = "PM_L1_DCACHE_RELOAD_VALID", .pme_code = 0xc30e4, .pme_short_desc = "L1 reload data source valid", .pme_long_desc = "The data source information is valid, the data cache has been reloaded. Prior to POWER5+ this included data cache reloads due to prefetch activity. With POWER5+ this now only includes reloads due to demand loads.", }, [ POWER5p_PME_PM_MEM_WQ_DISP_DCLAIM ] = { .pme_name = "PM_MEM_WQ_DISP_DCLAIM", .pme_code = 0x713c6, .pme_short_desc = "Memory write queue dispatched due to dclaim/flush", .pme_long_desc = "A memory dclaim or flush operation was dispatched to a write queue. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_MRK_GRP_ISSUED ] = { .pme_name = "PM_MRK_GRP_ISSUED", .pme_code = 0x100015, .pme_short_desc = "Marked group issued", .pme_long_desc = "A sampled instruction was issued.", }, [ POWER5p_PME_PM_FPU_FULL_CYC ] = { .pme_name = "PM_FPU_FULL_CYC", .pme_code = 0x110090, .pme_short_desc = "Cycles FPU issue queue full", .pme_long_desc = "Cycles when one or both FPU issue queues are full. Combined Unit 0 + 1. Use with caution since this is the sum of cycles when Unit 0 was full plus Unit 1 full. It does not indicate when both units were full.", }, [ POWER5p_PME_PM_INST_FROM_L35_MOD ] = { .pme_name = "PM_INST_FROM_L35_MOD", .pme_code = 0x22209d, .pme_short_desc = "Instruction fetched from L3.5 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from the L3 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions", }, [ POWER5p_PME_PM_FPU_FMA ] = { .pme_name = "PM_FPU_FMA", .pme_code = 0x200088, .pme_short_desc = "FPU executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when FPU is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1.", }, [ POWER5p_PME_PM_THRD_PRIO_3_CYC ] = { .pme_name = "PM_THRD_PRIO_3_CYC", .pme_code = 0x420e2, .pme_short_desc = "Cycles thread running at priority level 3", .pme_long_desc = "Cycles this thread was running at priority level 3.", }, [ POWER5p_PME_PM_MRK_CRU_FIN ] = { .pme_name = "PM_MRK_CRU_FIN", .pme_code = 0x400005, .pme_short_desc = "Marked instruction CRU processing finished", .pme_long_desc = "The Condition Register Unit finished a marked instruction. 
Instructions that finish may not necessarily complete.", }, [ POWER5p_PME_PM_SNOOP_WR_RETRY_WQ ] = { .pme_name = "PM_SNOOP_WR_RETRY_WQ", .pme_code = 0x716c6, .pme_short_desc = "Snoop write/dclaim retry due to collision with active write queue", .pme_long_desc = "A snoop request for a write or dclaim to memory was retried because it matched the cache line of an active write. The snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_CMPLU_STALL_REJECT ] = { .pme_name = "PM_CMPLU_STALL_REJECT", .pme_code = 0x41109a, .pme_short_desc = "Completion stall caused by reject", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered a load/store reject. This is a subset of PM_CMPLU_STALL_LSU.", }, [ POWER5p_PME_PM_MRK_FXU_FIN ] = { .pme_name = "PM_MRK_FXU_FIN", .pme_code = 0x200014, .pme_short_desc = "Marked instruction FXU processing finished", .pme_long_desc = "One of the Fixed Point Units finished a marked instruction. Instructions that finish may not necessarily complete.", }, [ POWER5p_PME_PM_LSU1_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU1_REJECT_ERAT_MISS", .pme_code = 0xc40c7, .pme_short_desc = "LSU1 reject due to ERAT miss", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions due to an ERAT miss. Requests that miss the Derat are rejected and retried until the request hits in the Erat.", }, [ POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SB_RCST_DISP_FAIL_OTHER", .pme_code = 0x732e1, .pme_short_desc = "L2 slice B RC store dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. 
Rejected dispatches do not count because they have not yet been attempted.", }, [ POWER5p_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY ] = { .pme_name = "PM_L2SC_RC_DISP_FAIL_CO_BUSY", .pme_code = 0x703c2, .pme_short_desc = "L2 slice C RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy", .pme_long_desc = "A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss(re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access.", }, [ POWER5p_PME_PM_PMC4_OVERFLOW ] = { .pme_name = "PM_PMC4_OVERFLOW", .pme_code = 0x10000a, .pme_short_desc = "PMC4 Overflow", .pme_long_desc = "Overflows from PMC4 are counted. This effectively widens the PMC. 
The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", }, [ POWER5p_PME_PM_L3SA_SNOOP_RETRY ] = { .pme_name = "PM_L3SA_SNOOP_RETRY", .pme_code = 0x731e3, .pme_short_desc = "L3 slice A snoop retries", .pme_long_desc = "Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b)", }, [ POWER5p_PME_PM_PTEG_FROM_L35_MOD ] = { .pme_name = "PM_PTEG_FROM_L35_MOD", .pme_code = 0x28309e, .pme_short_desc = "PTEG loaded from L3.5 modified", .pme_long_desc = "A Page Table Entry was loaded into the TLB with modified (M) data from the L3 of a chip on the same module as this processor is located, due to a demand load.", }, [ POWER5p_PME_PM_INST_FROM_L25_MOD ] = { .pme_name = "PM_INST_FROM_L25_MOD", .pme_code = 0x222096, .pme_short_desc = "Instruction fetched from L2.5 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from the L2 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions.", }, [ POWER5p_PME_PM_THRD_SMT_HANG ] = { .pme_name = "PM_THRD_SMT_HANG", .pme_code = 0x330e7, .pme_short_desc = "SMT hang detected", .pme_long_desc = "A hung thread was detected", }, [ POWER5p_PME_PM_CMPLU_STALL_ERAT_MISS ] = { .pme_name = "PM_CMPLU_STALL_ERAT_MISS", .pme_code = 0x41109b, .pme_short_desc = "Completion stall caused by ERAT miss", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered an ERAT miss. This is a subset of PM_CMPLU_STALL_REJECT.", }, [ POWER5p_PME_PM_L3SA_MOD_TAG ] = { .pme_name = "PM_L3SA_MOD_TAG", .pme_code = 0x720e3, .pme_short_desc = "L3 slice A transition from modified to TAG", .pme_long_desc = "L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. 
L3 going M->T or M->I(go_Mu case) Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point.", }, [ POWER5p_PME_PM_INST_FROM_L2MISS ] = { .pme_name = "PM_INST_FROM_L2MISS", .pme_code = 0x12209b, .pme_short_desc = "Instruction fetched missed L2", .pme_long_desc = "An instruction fetch group was fetched from beyond the local L2.", }, [ POWER5p_PME_PM_FLUSH_SYNC ] = { .pme_name = "PM_FLUSH_SYNC", .pme_code = 0x330e1, .pme_short_desc = "Flush caused by sync", .pme_long_desc = "This thread has been flushed at dispatch due to a sync, lwsync, ptesync, or tlbsync instruction. This allows the other thread to have more machine resources for it to make progress until the sync finishes.", }, [ POWER5p_PME_PM_MRK_GRP_DISP ] = { .pme_name = "PM_MRK_GRP_DISP", .pme_code = 0x100002, .pme_short_desc = "Marked group dispatched", .pme_long_desc = "A group containing a sampled instruction was dispatched", }, [ POWER5p_PME_PM_MEM_RQ_DISP_Q8to11 ] = { .pme_name = "PM_MEM_RQ_DISP_Q8to11", .pme_code = 0x722e6, .pme_short_desc = "Memory read queue dispatched to queues 8-11", .pme_long_desc = "A memory operation was dispatched to read queue 8,9,10 or 11. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_L2SC_ST_HIT ] = { .pme_name = "PM_L2SC_ST_HIT", .pme_code = 0x733e2, .pme_short_desc = "L2 slice C store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. The event is provided on each of the three slices A, B, and C.", }, [ POWER5p_PME_PM_L2SB_MOD_TAG ] = { .pme_name = "PM_L2SB_MOD_TAG", .pme_code = 0x720e1, .pme_short_desc = "L2 slice B transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. 
The event is provided on each of the three slices A, B, and C.", }, [ POWER5p_PME_PM_CLB_EMPTY_CYC ] = { .pme_name = "PM_CLB_EMPTY_CYC", .pme_code = 0x410c6, .pme_short_desc = "Cycles CLB empty", .pme_long_desc = "Cycles when both threads' CLBs are completely empty.", }, [ POWER5p_PME_PM_L2SB_ST_HIT ] = { .pme_name = "PM_L2SB_ST_HIT", .pme_code = 0x733e1, .pme_short_desc = "L2 slice B store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A, B and C.", }, [ POWER5p_PME_PM_MEM_NONSPEC_RD_CANCEL ] = { .pme_name = "PM_MEM_NONSPEC_RD_CANCEL", .pme_code = 0x711c6, .pme_short_desc = "Non speculative memory read cancelled", .pme_long_desc = "A non-speculative read was cancelled because the combined response indicated it was sourced from another L2 or L3. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_BR_PRED_CR_TA ] = { .pme_name = "PM_BR_PRED_CR_TA", .pme_code = 0x423087, .pme_short_desc = "A conditional branch was predicted, CR and target prediction", .pme_long_desc = "Both the condition (taken or not taken) and the target address of a branch instruction were predicted.", }, [ POWER5p_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_SRQ", .pme_code = 0x810c3, .pme_short_desc = "LSU0 marked SRQ lhs flushes", .pme_long_desc = "A marked store was flushed because a younger load hit an older store that is already in the SRQ or in the same group.", }, [ POWER5p_PME_PM_MRK_LSU_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU_FLUSH_ULD", .pme_code = 0x1810a8, .pme_short_desc = "Marked unaligned load flushes", .pme_long_desc = "A marked load was flushed because it was unaligned (crossed a 64-byte boundary, or 32-byte if it missed the L1)", }, [ POWER5p_PME_PM_INST_DISP_ATTEMPT ] = { .pme_name = "PM_INST_DISP_ATTEMPT", .pme_code = 0x120e1, .pme_short_desc = "Instructions dispatch attempted", .pme_long_desc = "Number of PowerPC 
Instructions dispatched (attempted, not filtered by success).", }, [ POWER5p_PME_PM_INST_FROM_RMEM ] = { .pme_name = "PM_INST_FROM_RMEM", .pme_code = 0x422086, .pme_short_desc = "Instruction fetched from remote memory", .pme_long_desc = "An instruction fetch group was fetched from memory attached to a different module than this processor is located on. Fetch groups can contain up to 8 instructions", }, [ POWER5p_PME_PM_ST_REF_L1_LSU0 ] = { .pme_name = "PM_ST_REF_L1_LSU0", .pme_code = 0xc10c1, .pme_short_desc = "LSU0 L1 D cache store references", .pme_long_desc = "Store references to the Data Cache by LSU0.", }, [ POWER5p_PME_PM_LSU0_DERAT_MISS ] = { .pme_name = "PM_LSU0_DERAT_MISS", .pme_code = 0x800c2, .pme_short_desc = "LSU0 DERAT misses", .pme_long_desc = "Total D-ERAT Misses by LSU0. Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction.", }, [ POWER5p_PME_PM_FPU_STALL3 ] = { .pme_name = "PM_FPU_STALL3", .pme_code = 0x202088, .pme_short_desc = "FPU stalled in pipe3", .pme_long_desc = "FPU has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. 
Combined Unit 0 + Unit 1.", }, [ POWER5p_PME_PM_L2SB_RCLD_DISP ] = { .pme_name = "PM_L2SB_RCLD_DISP", .pme_code = 0x701c1, .pme_short_desc = "L2 slice B RC load dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Load was attempted", }, [ POWER5p_PME_PM_BR_PRED_CR ] = { .pme_name = "PM_BR_PRED_CR", .pme_code = 0x230e2, .pme_short_desc = "A conditional branch was predicted, CR prediction", .pme_long_desc = "A conditional branch instruction was predicted as taken or not taken.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L2 ] = { .pme_name = "PM_MRK_DATA_FROM_L2", .pme_code = 0x1c7087, .pme_short_desc = "Marked data loaded from L2", .pme_long_desc = "The processor's Data Cache was reloaded from the local L2 due to a marked load.", }, [ POWER5p_PME_PM_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_LSU0_FLUSH_SRQ", .pme_code = 0xc00c3, .pme_short_desc = "LSU0 SRQ lhs flushes", .pme_long_desc = "A store was flushed by unit 0 because a younger load hit an older store that is already in the SRQ or in the same group.", }, [ POWER5p_PME_PM_FAB_PNtoNN_DIRECT ] = { .pme_name = "PM_FAB_PNtoNN_DIRECT", .pme_code = 0x703c7, .pme_short_desc = "PN to NN beat went straight to its destination", .pme_long_desc = "Fabric Data beats that the base chip takes the inbound PN data and passes it through to the outbound NN bus without going into a sidecar. 
The signal is delivered at FBC speed and the count must be scaled.", }, [ POWER5p_PME_PM_IOPS_CMPL ] = { .pme_name = "PM_IOPS_CMPL", .pme_code = 0x1, .pme_short_desc = "Internal operations completed", .pme_long_desc = "Number of internal operations that completed.", }, [ POWER5p_PME_PM_L2SA_RCST_DISP ] = { .pme_name = "PM_L2SA_RCST_DISP", .pme_code = 0x702c0, .pme_short_desc = "L2 slice A RC store dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Store was attempted.", }, [ POWER5p_PME_PM_L2SA_RCST_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SA_RCST_DISP_FAIL_OTHER", .pme_code = 0x732e0, .pme_short_desc = "L2 slice A RC store dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted.", }, [ POWER5p_PME_PM_L2SC_SHR_INV ] = { .pme_name = "PM_L2SC_SHR_INV", .pme_code = 0x710c2, .pme_short_desc = "L2 slice C transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful the tablewalk duration event should also be counted.", }, [ POWER5p_PME_PM_SNOOP_RETRY_AB_COLLISION ] = { .pme_name = "PM_SNOOP_RETRY_AB_COLLISION", .pme_code = 0x735e6, .pme_short_desc = "Snoop retry due to a b collision", .pme_long_desc = "Snoop retry due to a b collision", }, [ POWER5p_PME_PM_FAB_PNtoVN_SIDECAR ] = { .pme_name = "PM_FAB_PNtoVN_SIDECAR", .pme_code = 0x733e7, .pme_short_desc = "PN to VN beat went to sidecar first", .pme_long_desc = "Fabric data beats that the base chip takes the inbound PN data and forwards it on to the outbound VN data bus after going into a sidecar first. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5p_PME_PM_LSU0_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU0_REJECT_LMQ_FULL", .pme_code = 0xc40c1, .pme_short_desc = "LSU0 reject due to LMQ full or missed data coming", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all eight entries are full, subsequent load instructions are rejected.", }, [ POWER5p_PME_PM_LSU_LMQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LMQ_S0_ALLOC", .pme_code = 0xc30e6, .pme_short_desc = "LMQ slot 0 allocated", .pme_long_desc = "The first entry in the LMQ was allocated.", }, [ POWER5p_PME_PM_SNOOP_PW_RETRY_RQ ] = { .pme_name = "PM_SNOOP_PW_RETRY_RQ", .pme_code = 0x707c6, .pme_short_desc = "Snoop partial-write retry due to collision with active read queue", .pme_long_desc = "A snoop request for a partial write to memory was retried because it matched the cache line of an active read. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_DTLB_REF ] = { .pme_name = "PM_DTLB_REF", .pme_code = 0xc20e4, .pme_short_desc = "Data TLB references", .pme_long_desc = "Total number of Data TLB references for all page sizes. Page size is determined at TLB reload time.", }, [ POWER5p_PME_PM_PTEG_FROM_L3 ] = { .pme_name = "PM_PTEG_FROM_L3", .pme_code = 0x18308e, .pme_short_desc = "PTEG loaded from L3", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local L3 due to a demand load.", }, [ POWER5p_PME_PM_FAB_M1toVNorNN_SIDECAR_EMPTY ] = { .pme_name = "PM_FAB_M1toVNorNN_SIDECAR_EMPTY", .pme_code = 0x712c7, .pme_short_desc = "M1 to VN/NN sidecar empty", .pme_long_desc = "Fabric cycles when the Minus-1 jump sidecar (sidecars for mcm to mcm data transfer) is empty. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5p_PME_PM_LSU_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_SRQ_EMPTY_CYC", .pme_code = 0x400015, .pme_short_desc = "Cycles SRQ empty", .pme_long_desc = "Cycles the Store Request Queue is empty", }, [ POWER5p_PME_PM_FPU1_STF ] = { .pme_name = "PM_FPU1_STF", .pme_code = 0x20e6, .pme_short_desc = "FPU1 executed store instruction", .pme_long_desc = "FPU1 has executed a Floating Point Store instruction.", }, [ POWER5p_PME_PM_LSU_LMQ_S0_VALID ] = { .pme_name = "PM_LSU_LMQ_S0_VALID", .pme_code = 0xc30e5, .pme_short_desc = "LMQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ has eight entries that are allocated FIFO.", }, [ POWER5p_PME_PM_GCT_USAGE_00to59_CYC ] = { .pme_name = "PM_GCT_USAGE_00to59_CYC", .pme_code = 0x10001f, .pme_short_desc = "Cycles GCT less than 60% full", .pme_long_desc = "Cycles when the Global Completion Table has fewer than 60% of its slots used. The GCT has 20 entries shared between threads.", }, [ POWER5p_PME_PM_FPU_FMOV_FEST ] = { .pme_name = "PM_FPU_FMOV_FEST", .pme_code = 0x301088, .pme_short_desc = "FPU executed FMOV or FEST instructions", .pme_long_desc = "The floating point unit has executed a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ.. Combined Unit 0 + Unit 1.", }, [ POWER5p_PME_PM_DATA_FROM_L2MISS ] = { .pme_name = "PM_DATA_FROM_L2MISS", .pme_code = 0x3c309b, .pme_short_desc = "Data loaded missed L2", .pme_long_desc = "The processor's Data Cache was reloaded but not from the local L2.", }, [ POWER5p_PME_PM_XER_MAP_FULL_CYC ] = { .pme_name = "PM_XER_MAP_FULL_CYC", .pme_code = 0x100c2, .pme_short_desc = "Cycles XER mapper full", .pme_long_desc = "The XER mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. 
This event only indicates that the mapper was full, not that dispatch was prevented.", }, [ POWER5p_PME_PM_GRP_DISP_BLK_SB_CYC ] = { .pme_name = "PM_GRP_DISP_BLK_SB_CYC", .pme_code = 0x130e1, .pme_short_desc = "Cycles group dispatch blocked by scoreboard", .pme_long_desc = "A scoreboard operation on a non-renamed resource has blocked dispatch.", }, [ POWER5p_PME_PM_FLUSH_SB ] = { .pme_name = "PM_FLUSH_SB", .pme_code = 0x330e2, .pme_short_desc = "Flush caused by scoreboard operation", .pme_long_desc = "This thread has been flushed at dispatch because its scoreboard bit is set indicating that a non-renamed resource is being updated. This allows the other thread to have more machine resources for it to make progress while this thread is stalled.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L375_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L375_SHR", .pme_code = 0x3c709e, .pme_short_desc = "Marked data loaded from L3.75 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on a different module than this processor is located due to a marked load.", }, [ POWER5p_PME_PM_MRK_GRP_CMPL ] = { .pme_name = "PM_MRK_GRP_CMPL", .pme_code = 0x400013, .pme_short_desc = "Marked group completed", .pme_long_desc = "A group containing a sampled instruction completed. Microcoded instructions that span multiple groups will generate this event once per group.", }, [ POWER5p_PME_PM_SUSPENDED ] = { .pme_name = "PM_SUSPENDED", .pme_code = 0x0, .pme_short_desc = "Suspended", .pme_long_desc = "The counter is suspended (does not count).", }, [ POWER5p_PME_PM_SNOOP_RD_RETRY_QFULL ] = { .pme_name = "PM_SNOOP_RD_RETRY_QFULL", .pme_code = 0x700c6, .pme_short_desc = "Snoop read retry due to read queue full", .pme_long_desc = "A snoop request for a read from memory was retried because the read queues were full. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_GRP_IC_MISS_BR_REDIR_NONSPEC ] = { .pme_name = "PM_GRP_IC_MISS_BR_REDIR_NONSPEC", .pme_code = 0x120e5, .pme_short_desc = "Group experienced non-speculative I cache miss or branch redirect", .pme_long_desc = "Group experienced non-speculative I cache miss or branch redirect", }, [ POWER5p_PME_PM_DATA_FROM_L35_SHR ] = { .pme_name = "PM_DATA_FROM_L35_SHR", .pme_code = 0x1c309e, .pme_short_desc = "Data loaded from L3.5 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on the same module as this processor is located due to a demand load.", }, [ POWER5p_PME_PM_L3SB_MOD_INV ] = { .pme_name = "PM_L3SB_MOD_INV", .pme_code = 0x730e4, .pme_short_desc = "L3 slice B transition from modified to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I). Mu|Me are not included since they are formed due to a prev read op. 
Tx is not included since it is considered shared at this point.", }, [ POWER5p_PME_PM_STCX_FAIL ] = { .pme_name = "PM_STCX_FAIL", .pme_code = 0x820e1, .pme_short_desc = "STCX failed", .pme_long_desc = "A stcx (stwcx or stdcx) failed", }, [ POWER5p_PME_PM_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_LD_MISS_L1_LSU1", .pme_code = 0xc10c5, .pme_short_desc = "LSU1 L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache, by unit 1.", }, [ POWER5p_PME_PM_GRP_DISP ] = { .pme_name = "PM_GRP_DISP", .pme_code = 0x200002, .pme_short_desc = "Group dispatches", .pme_long_desc = "A group was dispatched", }, [ POWER5p_PME_PM_DC_PREF_DST ] = { .pme_name = "PM_DC_PREF_DST", .pme_code = 0x830e6, .pme_short_desc = "DST (Data Stream Touch) stream start", .pme_long_desc = "A prefetch stream was started using the DST instruction.", }, [ POWER5p_PME_PM_FPU1_DENORM ] = { .pme_name = "PM_FPU1_DENORM", .pme_code = 0x20e4, .pme_short_desc = "FPU1 received denormalized data", .pme_long_desc = "FPU1 has encountered a denormalized operand.", }, [ POWER5p_PME_PM_FPU0_FPSCR ] = { .pme_name = "PM_FPU0_FPSCR", .pme_code = 0x30e0, .pme_short_desc = "FPU0 executed FPSCR instruction", .pme_long_desc = "FPU0 has executed FPSCR move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*, mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER5p_PME_PM_DATA_FROM_L2 ] = { .pme_name = "PM_DATA_FROM_L2", .pme_code = 0x1c3087, .pme_short_desc = "Data loaded from L2", .pme_long_desc = "The processor's Data Cache was reloaded from the local L2 due to a demand load.", }, [ POWER5p_PME_PM_L2SA_RCLD_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SA_RCLD_DISP_FAIL_ADDR", .pme_code = 0x711c0, .pme_short_desc = "L2 slice A RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a load failed because of an address conflict. 
Two RC machines will never both work on the same line or line in the same congruence class at the same time.", }, [ POWER5p_PME_PM_FPU_1FLOP ] = { .pme_name = "PM_FPU_1FLOP", .pme_code = 0x100090, .pme_short_desc = "FPU executed one flop instruction", .pme_long_desc = "The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations.", }, [ POWER5p_PME_PM_L2SC_RCLD_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SC_RCLD_DISP_FAIL_OTHER", .pme_code = 0x731e2, .pme_short_desc = "L2 slice C RC load dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions.", }, [ POWER5p_PME_PM_FPU0_FSQRT ] = { .pme_name = "PM_FPU0_FSQRT", .pme_code = 0xc2, .pme_short_desc = "FPU0 executed FSQRT instruction", .pme_long_desc = "FPU0 has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER5p_PME_PM_L2SC_RCST_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SC_RCST_DISP_FAIL_RC_FULL", .pme_code = 0x722e1, .pme_short_desc = "L2 slice C RC store dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a store failed because all RC machines are busy.", }, [ POWER5p_PME_PM_LD_REF_L1 ] = { .pme_name = "PM_LD_REF_L1", .pme_code = 0x1c10a8, .pme_short_desc = "L1 D cache load references", .pme_long_desc = "Load references to the Level 1 Data Cache. Combined unit 0 + 1.", }, [ POWER5p_PME_PM_INST_FROM_L1 ] = { .pme_name = "PM_INST_FROM_L1", .pme_code = 0x22208d, .pme_short_desc = "Instruction fetched from L1", .pme_long_desc = "An instruction fetch group was fetched from L1. 
Fetch Groups can contain up to 8 instructions", }, [ POWER5p_PME_PM_TLBIE_HELD ] = { .pme_name = "PM_TLBIE_HELD", .pme_code = 0x130e4, .pme_short_desc = "TLBIE held at dispatch", .pme_long_desc = "Cycles a TLBIE instruction was held at dispatch.", }, [ POWER5p_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { .pme_name = "PM_DC_PREF_OUT_OF_STREAMS", .pme_code = 0xc50c2, .pme_short_desc = "D cache out of prefetch streams", .pme_long_desc = "A new prefetch stream was detected but no more stream entries were available.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L25_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L25_MOD_CYC", .pme_code = 0x4c70a2, .pme_short_desc = "Marked load latency from L2.5 modified", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5p_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_SRQ", .pme_code = 0x810c7, .pme_short_desc = "LSU1 marked SRQ lhs flushes", .pme_long_desc = "A marked store was flushed because a younger load hit an older store that was already in the SRQ or in the same group.", }, [ POWER5p_PME_PM_MEM_RQ_DISP_Q0to3 ] = { .pme_name = "PM_MEM_RQ_DISP_Q0to3", .pme_code = 0x702c6, .pme_short_desc = "Memory read queue dispatched to queues 0-3", .pme_long_desc = "A memory operation was dispatched to read queue 0,1,2, or 3. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5p_PME_PM_ST_REF_L1_LSU1 ] = { .pme_name = "PM_ST_REF_L1_LSU1", .pme_code = 0xc10c4, .pme_short_desc = "LSU1 L1 D cache store references", .pme_long_desc = "Store references to the Data Cache by LSU1.", }, [ POWER5p_PME_PM_MRK_LD_MISS_L1 ] = { .pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 0x182088, .pme_short_desc = "Marked L1 D cache load misses", .pme_long_desc = "Marked L1 D cache load misses", }, [ POWER5p_PME_PM_L1_WRITE_CYC ] = { .pme_name = "PM_L1_WRITE_CYC", .pme_code = 0x230e7, .pme_short_desc = "Cycles writing to instruction L1", .pme_long_desc = "Cycles that a cache line was written to the instruction cache.", }, [ POWER5p_PME_PM_L2SC_ST_REQ ] = { .pme_name = "PM_L2SC_ST_REQ", .pme_code = 0x723e2, .pme_short_desc = "L2 slice C store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C.", }, [ POWER5p_PME_PM_CMPLU_STALL_FDIV ] = { .pme_name = "PM_CMPLU_STALL_FDIV", .pme_code = 0x21109b, .pme_short_desc = "Completion stall caused by FDIV or FQRT instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a floating point divide or square root instruction. 
This is a subset of PM_CMPLU_STALL_FPU.", }, [ POWER5p_PME_PM_THRD_SEL_OVER_CLB_EMPTY ] = { .pme_name = "PM_THRD_SEL_OVER_CLB_EMPTY", .pme_code = 0x410c2, .pme_short_desc = "Thread selection overrides caused by CLB empty", .pme_long_desc = "Thread selection was overridden because one thread's CLB was empty.", }, [ POWER5p_PME_PM_BR_MPRED_CR ] = { .pme_name = "PM_BR_MPRED_CR", .pme_code = 0x230e5, .pme_short_desc = "Branch mispredictions due to CR bit setting", .pme_long_desc = "A conditional branch instruction was incorrectly predicted as taken or not taken. The branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This will result in a branch redirect flush if not overfidden by a flush of an older instruction.", }, [ POWER5p_PME_PM_L3SB_MOD_TAG ] = { .pme_name = "PM_L3SB_MOD_TAG", .pme_code = 0x720e4, .pme_short_desc = "L3 slice B transition from modified to TAG", .pme_long_desc = "L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. L3 going M->T or M->I(go_Mu case); Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point.", }, [ POWER5p_PME_PM_MRK_DATA_FROM_L2MISS ] = { .pme_name = "PM_MRK_DATA_FROM_L2MISS", .pme_code = 0x3c709b, .pme_short_desc = "Marked data loaded missed L2", .pme_long_desc = "DL1 was reloaded from beyond L2 due to a marked demand load.", }, [ POWER5p_PME_PM_LSU_REJECT_SRQ ] = { .pme_name = "PM_LSU_REJECT_SRQ", .pme_code = 0x1c4088, .pme_short_desc = "LSU SRQ lhs rejects", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions because of Load Hit Store conditions. Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue. 
Combined Unit 0 + 1.", }, [ POWER5p_PME_PM_LD_MISS_L1 ] = { .pme_name = "PM_LD_MISS_L1", .pme_code = 0x3c1088, .pme_short_desc = "L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache. Combined unit 0 + 1.", }, [ POWER5p_PME_PM_INST_FROM_PREF ] = { .pme_name = "PM_INST_FROM_PREF", .pme_code = 0x32208d, .pme_short_desc = "Instruction fetched from prefetch", .pme_long_desc = "An instruction fetch group was fetched from the prefetch buffer. Fetch groups can contain up to 8 instructions", }, [ POWER5p_PME_PM_STCX_PASS ] = { .pme_name = "PM_STCX_PASS", .pme_code = 0x820e5, .pme_short_desc = "Stcx passes", .pme_long_desc = "A stcx (stwcx or stdcx) instruction was successful", }, [ POWER5p_PME_PM_DC_INV_L2 ] = { .pme_name = "PM_DC_INV_L2", .pme_code = 0xc10c7, .pme_short_desc = "L1 D cache entries invalidated from L2", .pme_long_desc = "A dcache invalidated was received from the L2 because a line in L2 was castout.", }, [ POWER5p_PME_PM_LSU_SRQ_FULL_CYC ] = { .pme_name = "PM_LSU_SRQ_FULL_CYC", .pme_code = 0x110c3, .pme_short_desc = "Cycles SRQ full", .pme_long_desc = "Cycles the Store Request Queue is full.", }, [ POWER5p_PME_PM_FPU_FIN ] = { .pme_name = "PM_FPU_FIN", .pme_code = 0x401088, .pme_short_desc = "FPU produced a result", .pme_long_desc = "FPU finished, produced a result. This only indicates finish, not completion. Combined Unit 0 + Unit 1. Floating Point Stores are included in this count but not Floating Point Loads., , , XYZs", }, [ POWER5p_PME_PM_LSU_SRQ_STFWD ] = { .pme_name = "PM_LSU_SRQ_STFWD", .pme_code = 0x2c6088, .pme_short_desc = "SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load that hits L1 but becomes a store forward, then it's not treated as a load miss. 
Combined Unit 0 + 1.", }, [ POWER5p_PME_PM_L2SA_SHR_MOD ] = { .pme_name = "PM_L2SA_SHR_MOD", .pme_code = 0x700c0, .pme_short_desc = "L2 slice A transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C.", }, [ POWER5p_PME_PM_0INST_CLB_CYC ] = { .pme_name = "PM_0INST_CLB_CYC", .pme_code = 0x400c0, .pme_short_desc = "Cycles no instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", }, [ POWER5p_PME_PM_FXU0_FIN ] = { .pme_name = "PM_FXU0_FIN", .pme_code = 0x130e2, .pme_short_desc = "FXU0 produced a result", .pme_long_desc = "The Fixed Point unit 0 finished an instruction and produced a result. 
Instructions that finish may not necessarily complete.", }, [ POWER5p_PME_PM_L2SB_RCST_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SB_RCST_DISP_FAIL_RC_FULL", .pme_code = 0x722e2, .pme_short_desc = "L2 slice B RC store dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a store failed because all RC machines are busy.", }, [ POWER5p_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { .pme_name = "PM_THRD_GRP_CMPL_BOTH_CYC", .pme_code = 0x200013, .pme_short_desc = "Cycles group completed by both threads", .pme_long_desc = "Cycles that both threads completed.", }, [ POWER5p_PME_PM_PMC5_OVERFLOW ] = { .pme_name = "PM_PMC5_OVERFLOW", .pme_code = 0x10001a, .pme_short_desc = "PMC5 Overflow", .pme_long_desc = "Overflows from PMC5 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", }, [ POWER5p_PME_PM_FPU0_FDIV ] = { .pme_name = "PM_FPU0_FDIV", .pme_code = 0xc0, .pme_short_desc = "FPU0 executed FDIV instruction", .pme_long_desc = "FPU0 has executed a divide instruction. This could be fdiv, fdivs, fdiv. 
fdivs.", }, [ POWER5p_PME_PM_PTEG_FROM_L375_SHR ] = { .pme_name = "PM_PTEG_FROM_L375_SHR", .pme_code = 0x38309e, .pme_short_desc = "PTEG loaded from L3.75 shared", .pme_long_desc = "A Page Table Entry was loaded into the TLB with shared (S) data from the L3 of a chip on a different module than this processor is located, due to a demand load.", }, [ POWER5p_PME_PM_HV_CYC ] = { .pme_name = "PM_HV_CYC", .pme_code = 0x20000b, .pme_short_desc = "Hypervisor Cycles", .pme_long_desc = "Cycles when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0)", }, [ POWER5p_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY ] = { .pme_name = "PM_L2SA_RC_DISP_FAIL_CO_BUSY", .pme_code = 0x703c0, .pme_short_desc = "L2 slice A RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy", .pme_long_desc = "A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss(re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access.", }, [ POWER5p_PME_PM_THRD_PRIO_DIFF_0_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_0_CYC", .pme_code = 0x430e3, .pme_short_desc = "Cycles no thread priority difference", .pme_long_desc = "Cycles when this thread's priority is equal to the other thread's priority.", }, [ POWER5p_PME_PM_LR_CTR_MAP_FULL_CYC ] = { .pme_name = "PM_LR_CTR_MAP_FULL_CYC", .pme_code = 0x100c6, .pme_short_desc = "Cycles LR/CTR mapper full", .pme_long_desc = "The LR/CTR mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. 
This event only indicates that the mapper was full, not that dispatch was prevented.", }, [ POWER5p_PME_PM_L3SB_SHR_INV ] = { .pme_name = "PM_L3SB_SHR_INV", .pme_code = 0x710c4, .pme_short_desc = "L3 slice B transition from shared to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is Sx in this L3 (i.e. invalidate hit SX and dispatched).", }, [ POWER5p_PME_PM_DATA_FROM_RMEM ] = { .pme_name = "PM_DATA_FROM_RMEM", .pme_code = 0x1c30a1, .pme_short_desc = "Data loaded from remote memory", .pme_long_desc = "The processor's Data Cache was reloaded from memory attached to a different module than this processor is located on.", }, [ POWER5p_PME_PM_DATA_FROM_L275_MOD ] = { .pme_name = "PM_DATA_FROM_L275_MOD", .pme_code = 0x1c30a3, .pme_short_desc = "Data loaded from L2.75 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L2 on a different module than this processor is located due to a demand load.", }, [ POWER5p_PME_PM_LSU0_REJECT_SRQ ] = { .pme_name = "PM_LSU0_REJECT_SRQ", .pme_code = 0xc40c0, .pme_short_desc = "LSU0 SRQ lhs rejects", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions because of Load Hit Store conditions. Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue.", }, [ POWER5p_PME_PM_LSU1_DERAT_MISS ] = { .pme_name = "PM_LSU1_DERAT_MISS", .pme_code = 0x800c6, .pme_short_desc = "LSU1 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. 
Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", }, [ POWER5p_PME_PM_MRK_LSU_FIN ] = { .pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x400014, .pme_short_desc = "Marked instruction LSU processing finished", .pme_long_desc = "One of the Load/Store Units finished a marked instruction. Instructions that finish may not necessarily complete.", }, [ POWER5p_PME_PM_DTLB_MISS_16M ] = { .pme_name = "PM_DTLB_MISS_16M", .pme_code = 0x3c208d, .pme_short_desc = "Data TLB miss for 16M page", .pme_long_desc = "Data TLB references to 16MB pages that missed the TLB. Page size is determined at TLB reload time.", }, [ POWER5p_PME_PM_LSU0_FLUSH_UST ] = { .pme_name = "PM_LSU0_FLUSH_UST", .pme_code = 0xc00c1, .pme_short_desc = "LSU0 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 0 because it was unaligned (crossed a 4K boundary).", }, [ POWER5p_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY ] = { .pme_name = "PM_L2SB_RC_DISP_FAIL_CO_BUSY", .pme_code = 0x703c1, .pme_short_desc = "L2 slice B RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy", .pme_long_desc = "A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss (re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. 
Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access.", }, [ POWER5p_PME_PM_L2SC_MOD_TAG ] = { .pme_name = "PM_L2SC_MOD_TAG", .pme_code = 0x720e2, .pme_short_desc = "L2 slice C transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.", } }; #endif papi-papi-7-2-0-t/src/libpfm4/lib/events/power5_events.h000066400000000000000000005300101502707512200230130ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __POWER5_EVENTS_H__ #define __POWER5_EVENTS_H__ /* * File: power5_events.h * CVS: * Author: Corey Ashford * cjashfor@us.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2009. All Rights Reserved. * Contributed by Corey Ashford * * Note: This code was automatically generated and should not be modified by * hand. 
* */ #define POWER5_PME_PM_LSU_REJECT_RELOAD_CDF 0 #define POWER5_PME_PM_FPU1_SINGLE 1 #define POWER5_PME_PM_L3SB_REF 2 #define POWER5_PME_PM_THRD_PRIO_DIFF_3or4_CYC 3 #define POWER5_PME_PM_INST_FROM_L275_SHR 4 #define POWER5_PME_PM_MRK_DATA_FROM_L375_MOD 5 #define POWER5_PME_PM_DTLB_MISS_4K 6 #define POWER5_PME_PM_CLB_FULL_CYC 7 #define POWER5_PME_PM_MRK_ST_CMPL 8 #define POWER5_PME_PM_LSU_FLUSH_LRQ_FULL 9 #define POWER5_PME_PM_MRK_DATA_FROM_L275_SHR 10 #define POWER5_PME_PM_1INST_CLB_CYC 11 #define POWER5_PME_PM_MEM_SPEC_RD_CANCEL 12 #define POWER5_PME_PM_MRK_DTLB_MISS_16M 13 #define POWER5_PME_PM_FPU_FDIV 14 #define POWER5_PME_PM_FPU_SINGLE 15 #define POWER5_PME_PM_FPU0_FMA 16 #define POWER5_PME_PM_SLB_MISS 17 #define POWER5_PME_PM_LSU1_FLUSH_LRQ 18 #define POWER5_PME_PM_L2SA_ST_HIT 19 #define POWER5_PME_PM_DTLB_MISS 20 #define POWER5_PME_PM_BR_PRED_TA 21 #define POWER5_PME_PM_MRK_DATA_FROM_L375_MOD_CYC 22 #define POWER5_PME_PM_CMPLU_STALL_FXU 23 #define POWER5_PME_PM_EXT_INT 24 #define POWER5_PME_PM_MRK_LSU1_FLUSH_LRQ 25 #define POWER5_PME_PM_LSU1_LDF 26 #define POWER5_PME_PM_MRK_ST_GPS 27 #define POWER5_PME_PM_FAB_CMD_ISSUED 28 #define POWER5_PME_PM_LSU0_SRQ_STFWD 29 #define POWER5_PME_PM_CR_MAP_FULL_CYC 30 #define POWER5_PME_PM_L2SA_RCST_DISP_FAIL_RC_FULL 31 #define POWER5_PME_PM_MRK_LSU0_FLUSH_ULD 32 #define POWER5_PME_PM_LSU_FLUSH_SRQ_FULL 33 #define POWER5_PME_PM_FLUSH_IMBAL 34 #define POWER5_PME_PM_MEM_RQ_DISP_Q16to19 35 #define POWER5_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC 36 #define POWER5_PME_PM_DATA_FROM_L35_MOD 37 #define POWER5_PME_PM_MEM_HI_PRIO_WR_CMPL 38 #define POWER5_PME_PM_FPU1_FDIV 39 #define POWER5_PME_PM_FPU0_FRSP_FCONV 40 #define POWER5_PME_PM_MEM_RQ_DISP 41 #define POWER5_PME_PM_LWSYNC_HELD 42 #define POWER5_PME_PM_FXU_FIN 43 #define POWER5_PME_PM_DSLB_MISS 44 #define POWER5_PME_PM_FXLS1_FULL_CYC 45 #define POWER5_PME_PM_DATA_FROM_L275_SHR 46 #define POWER5_PME_PM_THRD_SEL_T0 47 #define POWER5_PME_PM_PTEG_RELOAD_VALID 48 #define 
POWER5_PME_PM_LSU_LMQ_LHR_MERGE 49 #define POWER5_PME_PM_MRK_STCX_FAIL 50 #define POWER5_PME_PM_2INST_CLB_CYC 51 #define POWER5_PME_PM_FAB_PNtoVN_DIRECT 52 #define POWER5_PME_PM_PTEG_FROM_L2MISS 53 #define POWER5_PME_PM_CMPLU_STALL_LSU 54 #define POWER5_PME_PM_MRK_DSLB_MISS 55 #define POWER5_PME_PM_LSU_FLUSH_ULD 56 #define POWER5_PME_PM_PTEG_FROM_LMEM 57 #define POWER5_PME_PM_MRK_BRU_FIN 58 #define POWER5_PME_PM_MEM_WQ_DISP_WRITE 59 #define POWER5_PME_PM_MRK_DATA_FROM_L275_MOD_CYC 60 #define POWER5_PME_PM_LSU1_NCLD 61 #define POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_OTHER 62 #define POWER5_PME_PM_SNOOP_PW_RETRY_WQ_PWQ 63 #define POWER5_PME_PM_FPR_MAP_FULL_CYC 64 #define POWER5_PME_PM_FPU1_FULL_CYC 65 #define POWER5_PME_PM_L3SA_ALL_BUSY 66 #define POWER5_PME_PM_3INST_CLB_CYC 67 #define POWER5_PME_PM_MEM_PWQ_DISP_Q2or3 68 #define POWER5_PME_PM_L2SA_SHR_INV 69 #define POWER5_PME_PM_THRESH_TIMEO 70 #define POWER5_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL 71 #define POWER5_PME_PM_THRD_SEL_OVER_GCT_IMBAL 72 #define POWER5_PME_PM_FPU_FSQRT 73 #define POWER5_PME_PM_MRK_LSU0_FLUSH_LRQ 74 #define POWER5_PME_PM_PMC1_OVERFLOW 75 #define POWER5_PME_PM_L3SC_SNOOP_RETRY 76 #define POWER5_PME_PM_DATA_TABLEWALK_CYC 77 #define POWER5_PME_PM_THRD_PRIO_6_CYC 78 #define POWER5_PME_PM_FPU_FEST 79 #define POWER5_PME_PM_FAB_M1toP1_SIDECAR_EMPTY 80 #define POWER5_PME_PM_MRK_DATA_FROM_RMEM 81 #define POWER5_PME_PM_MRK_DATA_FROM_L35_MOD_CYC 82 #define POWER5_PME_PM_MEM_PWQ_DISP 83 #define POWER5_PME_PM_FAB_P1toM1_SIDECAR_EMPTY 84 #define POWER5_PME_PM_LD_MISS_L1_LSU0 85 #define POWER5_PME_PM_SNOOP_PARTIAL_RTRY_QFULL 86 #define POWER5_PME_PM_FPU1_STALL3 87 #define POWER5_PME_PM_GCT_USAGE_80to99_CYC 88 #define POWER5_PME_PM_WORK_HELD 89 #define POWER5_PME_PM_INST_CMPL 90 #define POWER5_PME_PM_LSU1_FLUSH_UST 91 #define POWER5_PME_PM_FXU_IDLE 92 #define POWER5_PME_PM_LSU0_FLUSH_ULD 93 #define POWER5_PME_PM_LSU1_REJECT_LMQ_FULL 94 #define POWER5_PME_PM_GRP_DISP_REJECT 95 #define POWER5_PME_PM_L2SA_MOD_INV 96 
#define POWER5_PME_PM_PTEG_FROM_L25_SHR 97 #define POWER5_PME_PM_FAB_CMD_RETRIED 98 #define POWER5_PME_PM_L3SA_SHR_INV 99 #define POWER5_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL 100 #define POWER5_PME_PM_L2SA_RCST_DISP_FAIL_ADDR 101 #define POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_RC_FULL 102 #define POWER5_PME_PM_PTEG_FROM_L375_MOD 103 #define POWER5_PME_PM_MRK_LSU1_FLUSH_UST 104 #define POWER5_PME_PM_BR_ISSUED 105 #define POWER5_PME_PM_MRK_GRP_BR_REDIR 106 #define POWER5_PME_PM_EE_OFF 107 #define POWER5_PME_PM_MEM_RQ_DISP_Q4to7 108 #define POWER5_PME_PM_MEM_FAST_PATH_RD_DISP 109 #define POWER5_PME_PM_INST_FROM_L3 110 #define POWER5_PME_PM_ITLB_MISS 111 #define POWER5_PME_PM_FXU1_BUSY_FXU0_IDLE 112 #define POWER5_PME_PM_FXLS_FULL_CYC 113 #define POWER5_PME_PM_DTLB_REF_4K 114 #define POWER5_PME_PM_GRP_DISP_VALID 115 #define POWER5_PME_PM_LSU_FLUSH_UST 116 #define POWER5_PME_PM_FXU1_FIN 117 #define POWER5_PME_PM_THRD_PRIO_4_CYC 118 #define POWER5_PME_PM_MRK_DATA_FROM_L35_MOD 119 #define POWER5_PME_PM_4INST_CLB_CYC 120 #define POWER5_PME_PM_MRK_DTLB_REF_16M 121 #define POWER5_PME_PM_INST_FROM_L375_MOD 122 #define POWER5_PME_PM_L2SC_RCST_DISP_FAIL_ADDR 123 #define POWER5_PME_PM_GRP_CMPL 124 #define POWER5_PME_PM_FPU1_1FLOP 125 #define POWER5_PME_PM_FPU_FRSP_FCONV 126 #define POWER5_PME_PM_5INST_CLB_CYC 127 #define POWER5_PME_PM_L3SC_REF 128 #define POWER5_PME_PM_THRD_L2MISS_BOTH_CYC 129 #define POWER5_PME_PM_MEM_PW_GATH 130 #define POWER5_PME_PM_FAB_PNtoNN_SIDECAR 131 #define POWER5_PME_PM_FAB_DCLAIM_ISSUED 132 #define POWER5_PME_PM_GRP_IC_MISS 133 #define POWER5_PME_PM_INST_FROM_L35_SHR 134 #define POWER5_PME_PM_LSU_LMQ_FULL_CYC 135 #define POWER5_PME_PM_MRK_DATA_FROM_L2_CYC 136 #define POWER5_PME_PM_LSU_SRQ_SYNC_CYC 137 #define POWER5_PME_PM_LSU0_BUSY_REJECT 138 #define POWER5_PME_PM_LSU_REJECT_ERAT_MISS 139 #define POWER5_PME_PM_MRK_DATA_FROM_RMEM_CYC 140 #define POWER5_PME_PM_DATA_FROM_L375_SHR 141 #define POWER5_PME_PM_FPU0_FMOV_FEST 142 #define 
POWER5_PME_PM_PTEG_FROM_L25_MOD 143 #define POWER5_PME_PM_LD_REF_L1_LSU0 144 #define POWER5_PME_PM_THRD_PRIO_7_CYC 145 #define POWER5_PME_PM_LSU1_FLUSH_SRQ 146 #define POWER5_PME_PM_L2SC_RCST_DISP 147 #define POWER5_PME_PM_CMPLU_STALL_DIV 148 #define POWER5_PME_PM_MEM_RQ_DISP_Q12to15 149 #define POWER5_PME_PM_INST_FROM_L375_SHR 150 #define POWER5_PME_PM_ST_REF_L1 151 #define POWER5_PME_PM_L3SB_ALL_BUSY 152 #define POWER5_PME_PM_FAB_P1toVNorNN_SIDECAR_EMPTY 153 #define POWER5_PME_PM_MRK_DATA_FROM_L275_SHR_CYC 154 #define POWER5_PME_PM_FAB_HOLDtoNN_EMPTY 155 #define POWER5_PME_PM_DATA_FROM_LMEM 156 #define POWER5_PME_PM_RUN_CYC 157 #define POWER5_PME_PM_PTEG_FROM_RMEM 158 #define POWER5_PME_PM_L2SC_RCLD_DISP 159 #define POWER5_PME_PM_LSU0_LDF 160 #define POWER5_PME_PM_LSU_LRQ_S0_VALID 161 #define POWER5_PME_PM_PMC3_OVERFLOW 162 #define POWER5_PME_PM_MRK_IMR_RELOAD 163 #define POWER5_PME_PM_MRK_GRP_TIMEO 164 #define POWER5_PME_PM_ST_MISS_L1 165 #define POWER5_PME_PM_STOP_COMPLETION 166 #define POWER5_PME_PM_LSU_BUSY_REJECT 167 #define POWER5_PME_PM_ISLB_MISS 168 #define POWER5_PME_PM_CYC 169 #define POWER5_PME_PM_THRD_ONE_RUN_CYC 170 #define POWER5_PME_PM_GRP_BR_REDIR_NONSPEC 171 #define POWER5_PME_PM_LSU1_SRQ_STFWD 172 #define POWER5_PME_PM_L3SC_MOD_INV 173 #define POWER5_PME_PM_L2_PREF 174 #define POWER5_PME_PM_GCT_NOSLOT_BR_MPRED 175 #define POWER5_PME_PM_MRK_DATA_FROM_L25_MOD 176 #define POWER5_PME_PM_L2SB_MOD_INV 177 #define POWER5_PME_PM_L2SB_ST_REQ 178 #define POWER5_PME_PM_MRK_L1_RELOAD_VALID 179 #define POWER5_PME_PM_L3SB_HIT 180 #define POWER5_PME_PM_L2SB_SHR_MOD 181 #define POWER5_PME_PM_EE_OFF_EXT_INT 182 #define POWER5_PME_PM_1PLUS_PPC_CMPL 183 #define POWER5_PME_PM_L2SC_SHR_MOD 184 #define POWER5_PME_PM_PMC6_OVERFLOW 185 #define POWER5_PME_PM_LSU_LRQ_FULL_CYC 186 #define POWER5_PME_PM_IC_PREF_INSTALL 187 #define POWER5_PME_PM_TLB_MISS 188 #define POWER5_PME_PM_GCT_FULL_CYC 189 #define POWER5_PME_PM_FXU_BUSY 190 #define POWER5_PME_PM_MRK_DATA_FROM_L3_CYC 
191 #define POWER5_PME_PM_LSU_REJECT_LMQ_FULL 192 #define POWER5_PME_PM_LSU_SRQ_S0_ALLOC 193 #define POWER5_PME_PM_GRP_MRK 194 #define POWER5_PME_PM_INST_FROM_L25_SHR 195 #define POWER5_PME_PM_FPU1_FIN 196 #define POWER5_PME_PM_DC_PREF_STREAM_ALLOC 197 #define POWER5_PME_PM_BR_MPRED_TA 198 #define POWER5_PME_PM_CRQ_FULL_CYC 199 #define POWER5_PME_PM_L2SA_RCLD_DISP 200 #define POWER5_PME_PM_SNOOP_WR_RETRY_QFULL 201 #define POWER5_PME_PM_MRK_DTLB_REF_4K 202 #define POWER5_PME_PM_LSU_SRQ_S0_VALID 203 #define POWER5_PME_PM_LSU0_FLUSH_LRQ 204 #define POWER5_PME_PM_INST_FROM_L275_MOD 205 #define POWER5_PME_PM_GCT_EMPTY_CYC 206 #define POWER5_PME_PM_LARX_LSU0 207 #define POWER5_PME_PM_THRD_PRIO_DIFF_5or6_CYC 208 #define POWER5_PME_PM_SNOOP_RETRY_1AHEAD 209 #define POWER5_PME_PM_FPU1_FSQRT 210 #define POWER5_PME_PM_MRK_LD_MISS_L1_LSU1 211 #define POWER5_PME_PM_MRK_FPU_FIN 212 #define POWER5_PME_PM_THRD_PRIO_5_CYC 213 #define POWER5_PME_PM_MRK_DATA_FROM_LMEM 214 #define POWER5_PME_PM_FPU1_FRSP_FCONV 215 #define POWER5_PME_PM_SNOOP_TLBIE 216 #define POWER5_PME_PM_L3SB_SNOOP_RETRY 217 #define POWER5_PME_PM_FAB_VBYPASS_EMPTY 218 #define POWER5_PME_PM_MRK_DATA_FROM_L275_MOD 219 #define POWER5_PME_PM_6INST_CLB_CYC 220 #define POWER5_PME_PM_L2SB_RCST_DISP 221 #define POWER5_PME_PM_FLUSH 222 #define POWER5_PME_PM_L2SC_MOD_INV 223 #define POWER5_PME_PM_FPU_DENORM 224 #define POWER5_PME_PM_L3SC_HIT 225 #define POWER5_PME_PM_SNOOP_WR_RETRY_RQ 226 #define POWER5_PME_PM_LSU1_REJECT_SRQ 227 #define POWER5_PME_PM_IC_PREF_REQ 228 #define POWER5_PME_PM_L3SC_ALL_BUSY 229 #define POWER5_PME_PM_MRK_GRP_IC_MISS 230 #define POWER5_PME_PM_GCT_NOSLOT_IC_MISS 231 #define POWER5_PME_PM_MRK_DATA_FROM_L3 232 #define POWER5_PME_PM_GCT_NOSLOT_SRQ_FULL 233 #define POWER5_PME_PM_THRD_SEL_OVER_ISU_HOLD 234 #define POWER5_PME_PM_CMPLU_STALL_DCACHE_MISS 235 #define POWER5_PME_PM_L3SA_MOD_INV 236 #define POWER5_PME_PM_LSU_FLUSH_LRQ 237 #define POWER5_PME_PM_THRD_PRIO_2_CYC 238 #define 
POWER5_PME_PM_LSU_FLUSH_SRQ 239 #define POWER5_PME_PM_MRK_LSU_SRQ_INST_VALID 240 #define POWER5_PME_PM_L3SA_REF 241 #define POWER5_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL 242 #define POWER5_PME_PM_FPU0_STALL3 243 #define POWER5_PME_PM_GPR_MAP_FULL_CYC 244 #define POWER5_PME_PM_TB_BIT_TRANS 245 #define POWER5_PME_PM_MRK_LSU_FLUSH_LRQ 246 #define POWER5_PME_PM_FPU0_STF 247 #define POWER5_PME_PM_MRK_DTLB_MISS 248 #define POWER5_PME_PM_FPU1_FMA 249 #define POWER5_PME_PM_L2SA_MOD_TAG 250 #define POWER5_PME_PM_LSU1_FLUSH_ULD 251 #define POWER5_PME_PM_MRK_LSU0_FLUSH_UST 252 #define POWER5_PME_PM_MRK_INST_FIN 253 #define POWER5_PME_PM_FPU0_FULL_CYC 254 #define POWER5_PME_PM_LSU_LRQ_S0_ALLOC 255 #define POWER5_PME_PM_MRK_LSU1_FLUSH_ULD 256 #define POWER5_PME_PM_MRK_DTLB_REF 257 #define POWER5_PME_PM_BR_UNCOND 258 #define POWER5_PME_PM_THRD_SEL_OVER_L2MISS 259 #define POWER5_PME_PM_L2SB_SHR_INV 260 #define POWER5_PME_PM_MEM_LO_PRIO_WR_CMPL 261 #define POWER5_PME_PM_L3SC_MOD_TAG 262 #define POWER5_PME_PM_MRK_ST_MISS_L1 263 #define POWER5_PME_PM_GRP_DISP_SUCCESS 264 #define POWER5_PME_PM_THRD_PRIO_DIFF_1or2_CYC 265 #define POWER5_PME_PM_IC_DEMAND_L2_BHT_REDIRECT 266 #define POWER5_PME_PM_MEM_WQ_DISP_Q8to15 267 #define POWER5_PME_PM_FPU0_SINGLE 268 #define POWER5_PME_PM_LSU_DERAT_MISS 269 #define POWER5_PME_PM_THRD_PRIO_1_CYC 270 #define POWER5_PME_PM_L2SC_RCST_DISP_FAIL_OTHER 271 #define POWER5_PME_PM_FPU1_FEST 272 #define POWER5_PME_PM_FAB_HOLDtoVN_EMPTY 273 #define POWER5_PME_PM_SNOOP_RD_RETRY_RQ 274 #define POWER5_PME_PM_SNOOP_DCLAIM_RETRY_QFULL 275 #define POWER5_PME_PM_MRK_DATA_FROM_L25_SHR_CYC 276 #define POWER5_PME_PM_MRK_ST_CMPL_INT 277 #define POWER5_PME_PM_FLUSH_BR_MPRED 278 #define POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_ADDR 279 #define POWER5_PME_PM_FPU_STF 280 #define POWER5_PME_PM_CMPLU_STALL_FPU 281 #define POWER5_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC 282 #define POWER5_PME_PM_GCT_NOSLOT_CYC 283 #define POWER5_PME_PM_FXU0_BUSY_FXU1_IDLE 284 #define 
POWER5_PME_PM_PTEG_FROM_L35_SHR 285 #define POWER5_PME_PM_MRK_LSU_FLUSH_UST 286 #define POWER5_PME_PM_L3SA_HIT 287 #define POWER5_PME_PM_MRK_DATA_FROM_L25_SHR 288 #define POWER5_PME_PM_L2SB_RCST_DISP_FAIL_ADDR 289 #define POWER5_PME_PM_MRK_DATA_FROM_L35_SHR 290 #define POWER5_PME_PM_IERAT_XLATE_WR 291 #define POWER5_PME_PM_L2SA_ST_REQ 292 #define POWER5_PME_PM_THRD_SEL_T1 293 #define POWER5_PME_PM_IC_DEMAND_L2_BR_REDIRECT 294 #define POWER5_PME_PM_INST_FROM_LMEM 295 #define POWER5_PME_PM_FPU0_1FLOP 296 #define POWER5_PME_PM_MRK_DATA_FROM_L35_SHR_CYC 297 #define POWER5_PME_PM_PTEG_FROM_L2 298 #define POWER5_PME_PM_MEM_PW_CMPL 299 #define POWER5_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC 300 #define POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_OTHER 301 #define POWER5_PME_PM_FPU0_FIN 302 #define POWER5_PME_PM_MRK_DTLB_MISS_4K 303 #define POWER5_PME_PM_L3SC_SHR_INV 304 #define POWER5_PME_PM_GRP_BR_REDIR 305 #define POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_RC_FULL 306 #define POWER5_PME_PM_MRK_LSU_FLUSH_SRQ 307 #define POWER5_PME_PM_PTEG_FROM_L275_SHR 308 #define POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_RC_FULL 309 #define POWER5_PME_PM_SNOOP_RD_RETRY_WQ 310 #define POWER5_PME_PM_LSU0_NCLD 311 #define POWER5_PME_PM_FAB_DCLAIM_RETRIED 312 #define POWER5_PME_PM_LSU1_BUSY_REJECT 313 #define POWER5_PME_PM_FXLS0_FULL_CYC 314 #define POWER5_PME_PM_FPU0_FEST 315 #define POWER5_PME_PM_DTLB_REF_16M 316 #define POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_ADDR 317 #define POWER5_PME_PM_LSU0_REJECT_ERAT_MISS 318 #define POWER5_PME_PM_DATA_FROM_L25_MOD 319 #define POWER5_PME_PM_GCT_USAGE_60to79_CYC 320 #define POWER5_PME_PM_DATA_FROM_L375_MOD 321 #define POWER5_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 322 #define POWER5_PME_PM_LSU0_REJECT_RELOAD_CDF 323 #define POWER5_PME_PM_0INST_FETCH 324 #define POWER5_PME_PM_LSU1_REJECT_RELOAD_CDF 325 #define POWER5_PME_PM_L1_PREF 326 #define POWER5_PME_PM_MEM_WQ_DISP_Q0to7 327 #define POWER5_PME_PM_MRK_DATA_FROM_LMEM_CYC 328 #define POWER5_PME_PM_BRQ_FULL_CYC 329 #define 
POWER5_PME_PM_GRP_IC_MISS_NONSPEC 330 #define POWER5_PME_PM_PTEG_FROM_L275_MOD 331 #define POWER5_PME_PM_MRK_LD_MISS_L1_LSU0 332 #define POWER5_PME_PM_MRK_DATA_FROM_L375_SHR_CYC 333 #define POWER5_PME_PM_LSU_FLUSH 334 #define POWER5_PME_PM_DATA_FROM_L3 335 #define POWER5_PME_PM_INST_FROM_L2 336 #define POWER5_PME_PM_PMC2_OVERFLOW 337 #define POWER5_PME_PM_FPU0_DENORM 338 #define POWER5_PME_PM_FPU1_FMOV_FEST 339 #define POWER5_PME_PM_INST_FETCH_CYC 340 #define POWER5_PME_PM_LSU_LDF 341 #define POWER5_PME_PM_INST_DISP 342 #define POWER5_PME_PM_DATA_FROM_L25_SHR 343 #define POWER5_PME_PM_L1_DCACHE_RELOAD_VALID 344 #define POWER5_PME_PM_MEM_WQ_DISP_DCLAIM 345 #define POWER5_PME_PM_FPU_FULL_CYC 346 #define POWER5_PME_PM_MRK_GRP_ISSUED 347 #define POWER5_PME_PM_THRD_PRIO_3_CYC 348 #define POWER5_PME_PM_FPU_FMA 349 #define POWER5_PME_PM_INST_FROM_L35_MOD 350 #define POWER5_PME_PM_MRK_CRU_FIN 351 #define POWER5_PME_PM_SNOOP_WR_RETRY_WQ 352 #define POWER5_PME_PM_CMPLU_STALL_REJECT 353 #define POWER5_PME_PM_LSU1_REJECT_ERAT_MISS 354 #define POWER5_PME_PM_MRK_FXU_FIN 355 #define POWER5_PME_PM_L2SB_RCST_DISP_FAIL_OTHER 356 #define POWER5_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY 357 #define POWER5_PME_PM_PMC4_OVERFLOW 358 #define POWER5_PME_PM_L3SA_SNOOP_RETRY 359 #define POWER5_PME_PM_PTEG_FROM_L35_MOD 360 #define POWER5_PME_PM_INST_FROM_L25_MOD 361 #define POWER5_PME_PM_THRD_SMT_HANG 362 #define POWER5_PME_PM_CMPLU_STALL_ERAT_MISS 363 #define POWER5_PME_PM_L3SA_MOD_TAG 364 #define POWER5_PME_PM_FLUSH_SYNC 365 #define POWER5_PME_PM_INST_FROM_L2MISS 366 #define POWER5_PME_PM_L2SC_ST_HIT 367 #define POWER5_PME_PM_MEM_RQ_DISP_Q8to11 368 #define POWER5_PME_PM_MRK_GRP_DISP 369 #define POWER5_PME_PM_L2SB_MOD_TAG 370 #define POWER5_PME_PM_CLB_EMPTY_CYC 371 #define POWER5_PME_PM_L2SB_ST_HIT 372 #define POWER5_PME_PM_MEM_NONSPEC_RD_CANCEL 373 #define POWER5_PME_PM_BR_PRED_CR_TA 374 #define POWER5_PME_PM_MRK_LSU0_FLUSH_SRQ 375 #define POWER5_PME_PM_MRK_LSU_FLUSH_ULD 376 #define 
POWER5_PME_PM_INST_DISP_ATTEMPT 377 #define POWER5_PME_PM_INST_FROM_RMEM 378 #define POWER5_PME_PM_ST_REF_L1_LSU0 379 #define POWER5_PME_PM_LSU0_DERAT_MISS 380 #define POWER5_PME_PM_L2SB_RCLD_DISP 381 #define POWER5_PME_PM_FPU_STALL3 382 #define POWER5_PME_PM_BR_PRED_CR 383 #define POWER5_PME_PM_MRK_DATA_FROM_L2 384 #define POWER5_PME_PM_LSU0_FLUSH_SRQ 385 #define POWER5_PME_PM_FAB_PNtoNN_DIRECT 386 #define POWER5_PME_PM_IOPS_CMPL 387 #define POWER5_PME_PM_L2SC_SHR_INV 388 #define POWER5_PME_PM_L2SA_RCST_DISP_FAIL_OTHER 389 #define POWER5_PME_PM_L2SA_RCST_DISP 390 #define POWER5_PME_PM_SNOOP_RETRY_AB_COLLISION 391 #define POWER5_PME_PM_FAB_PNtoVN_SIDECAR 392 #define POWER5_PME_PM_LSU_LMQ_S0_ALLOC 393 #define POWER5_PME_PM_LSU0_REJECT_LMQ_FULL 394 #define POWER5_PME_PM_SNOOP_PW_RETRY_RQ 395 #define POWER5_PME_PM_DTLB_REF 396 #define POWER5_PME_PM_PTEG_FROM_L3 397 #define POWER5_PME_PM_FAB_M1toVNorNN_SIDECAR_EMPTY 398 #define POWER5_PME_PM_LSU_SRQ_EMPTY_CYC 399 #define POWER5_PME_PM_FPU1_STF 400 #define POWER5_PME_PM_LSU_LMQ_S0_VALID 401 #define POWER5_PME_PM_GCT_USAGE_00to59_CYC 402 #define POWER5_PME_PM_DATA_FROM_L2MISS 403 #define POWER5_PME_PM_GRP_DISP_BLK_SB_CYC 404 #define POWER5_PME_PM_FPU_FMOV_FEST 405 #define POWER5_PME_PM_XER_MAP_FULL_CYC 406 #define POWER5_PME_PM_FLUSH_SB 407 #define POWER5_PME_PM_MRK_DATA_FROM_L375_SHR 408 #define POWER5_PME_PM_MRK_GRP_CMPL 409 #define POWER5_PME_PM_SUSPENDED 410 #define POWER5_PME_PM_GRP_IC_MISS_BR_REDIR_NONSPEC 411 #define POWER5_PME_PM_SNOOP_RD_RETRY_QFULL 412 #define POWER5_PME_PM_L3SB_MOD_INV 413 #define POWER5_PME_PM_DATA_FROM_L35_SHR 414 #define POWER5_PME_PM_LD_MISS_L1_LSU1 415 #define POWER5_PME_PM_STCX_FAIL 416 #define POWER5_PME_PM_DC_PREF_DST 417 #define POWER5_PME_PM_GRP_DISP 418 #define POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_ADDR 419 #define POWER5_PME_PM_FPU0_FPSCR 420 #define POWER5_PME_PM_DATA_FROM_L2 421 #define POWER5_PME_PM_FPU1_DENORM 422 #define POWER5_PME_PM_FPU_1FLOP 423 #define 
POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_OTHER 424 #define POWER5_PME_PM_L2SC_RCST_DISP_FAIL_RC_FULL 425 #define POWER5_PME_PM_FPU0_FSQRT 426 #define POWER5_PME_PM_LD_REF_L1 427 #define POWER5_PME_PM_INST_FROM_L1 428 #define POWER5_PME_PM_TLBIE_HELD 429 #define POWER5_PME_PM_DC_PREF_OUT_OF_STREAMS 430 #define POWER5_PME_PM_MRK_DATA_FROM_L25_MOD_CYC 431 #define POWER5_PME_PM_MRK_LSU1_FLUSH_SRQ 432 #define POWER5_PME_PM_MEM_RQ_DISP_Q0to3 433 #define POWER5_PME_PM_ST_REF_L1_LSU1 434 #define POWER5_PME_PM_MRK_LD_MISS_L1 435 #define POWER5_PME_PM_L1_WRITE_CYC 436 #define POWER5_PME_PM_L2SC_ST_REQ 437 #define POWER5_PME_PM_CMPLU_STALL_FDIV 438 #define POWER5_PME_PM_THRD_SEL_OVER_CLB_EMPTY 439 #define POWER5_PME_PM_BR_MPRED_CR 440 #define POWER5_PME_PM_L3SB_MOD_TAG 441 #define POWER5_PME_PM_MRK_DATA_FROM_L2MISS 442 #define POWER5_PME_PM_LSU_REJECT_SRQ 443 #define POWER5_PME_PM_LD_MISS_L1 444 #define POWER5_PME_PM_INST_FROM_PREF 445 #define POWER5_PME_PM_DC_INV_L2 446 #define POWER5_PME_PM_STCX_PASS 447 #define POWER5_PME_PM_LSU_SRQ_FULL_CYC 448 #define POWER5_PME_PM_FPU_FIN 449 #define POWER5_PME_PM_L2SA_SHR_MOD 450 #define POWER5_PME_PM_LSU_SRQ_STFWD 451 #define POWER5_PME_PM_0INST_CLB_CYC 452 #define POWER5_PME_PM_FXU0_FIN 453 #define POWER5_PME_PM_L2SB_RCST_DISP_FAIL_RC_FULL 454 #define POWER5_PME_PM_THRD_GRP_CMPL_BOTH_CYC 455 #define POWER5_PME_PM_PMC5_OVERFLOW 456 #define POWER5_PME_PM_FPU0_FDIV 457 #define POWER5_PME_PM_PTEG_FROM_L375_SHR 458 #define POWER5_PME_PM_LD_REF_L1_LSU1 459 #define POWER5_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY 460 #define POWER5_PME_PM_HV_CYC 461 #define POWER5_PME_PM_THRD_PRIO_DIFF_0_CYC 462 #define POWER5_PME_PM_LR_CTR_MAP_FULL_CYC 463 #define POWER5_PME_PM_L3SB_SHR_INV 464 #define POWER5_PME_PM_DATA_FROM_RMEM 465 #define POWER5_PME_PM_DATA_FROM_L275_MOD 466 #define POWER5_PME_PM_LSU0_REJECT_SRQ 467 #define POWER5_PME_PM_LSU1_DERAT_MISS 468 #define POWER5_PME_PM_MRK_LSU_FIN 469 #define POWER5_PME_PM_DTLB_MISS_16M 470 #define 
POWER5_PME_PM_LSU0_FLUSH_UST 471 #define POWER5_PME_PM_L2SC_MOD_TAG 472 #define POWER5_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY 473 static const pme_power_entry_t power5_pe[] = { [ POWER5_PME_PM_LSU_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU_REJECT_RELOAD_CDF", .pme_code = 0x2c6090, .pme_short_desc = "LSU reject due to reload CDF or tag update collision", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same cycle when they are being updated. Combined Unit 0 + 1.", }, [ POWER5_PME_PM_FPU1_SINGLE ] = { .pme_name = "PM_FPU1_SINGLE", .pme_code = 0x20e7, .pme_short_desc = "FPU1 executed single precision instruction", .pme_long_desc = "FPU1 has executed a single precision instruction.", }, [ POWER5_PME_PM_L3SB_REF ] = { .pme_name = "PM_L3SB_REF", .pme_code = 0x701c4, .pme_short_desc = "L3 slice B references", .pme_long_desc = "Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice ", }, [ POWER5_PME_PM_THRD_PRIO_DIFF_3or4_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_3or4_CYC", .pme_code = 0x430e5, .pme_short_desc = "Cycles thread priority difference is 3 or 4", .pme_long_desc = "Cycles when this thread's priority is higher than the other thread's priority by 3 or 4.", }, [ POWER5_PME_PM_INST_FROM_L275_SHR ] = { .pme_name = "PM_INST_FROM_L275_SHR", .pme_code = 0x322096, .pme_short_desc = "Instruction fetched from L2.75 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (T) data from the L2 on a different module than this processor is located. 
Fetch groups can contain up to 8 instructions", }, [ POWER5_PME_PM_MRK_DATA_FROM_L375_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L375_MOD", .pme_code = 0x1c70a7, .pme_short_desc = "Marked data loaded from L3.75 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on a different module than this processor is located due to a marked load.", }, [ POWER5_PME_PM_DTLB_MISS_4K ] = { .pme_name = "PM_DTLB_MISS_4K", .pme_code = 0xc40c0, .pme_short_desc = "Data TLB miss for 4K page", .pme_long_desc = "Data TLB references to 4KB pages that missed the TLB. Page size is determined at TLB reload time.", }, [ POWER5_PME_PM_CLB_FULL_CYC ] = { .pme_name = "PM_CLB_FULL_CYC", .pme_code = 0x220e5, .pme_short_desc = "Cycles CLB full", .pme_long_desc = "Cycles when both threads' CLBs are full.", }, [ POWER5_PME_PM_MRK_ST_CMPL ] = { .pme_name = "PM_MRK_ST_CMPL", .pme_code = 0x100003, .pme_short_desc = "Marked store instruction completed", .pme_long_desc = "A sampled store has completed (data home)", }, [ POWER5_PME_PM_LSU_FLUSH_LRQ_FULL ] = { .pme_name = "PM_LSU_FLUSH_LRQ_FULL", .pme_code = 0x320e7, .pme_short_desc = "Flush caused by LRQ full", .pme_long_desc = "This thread was flushed at dispatch because its Load Request Queue was full. This allows the other thread to have more machine resources for it to make progress while this thread is stalled.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L275_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L275_SHR", .pme_code = 0x3c7097, .pme_short_desc = "Marked data loaded from L2.75 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T) data from the L2 on a different module than this processor is located due to a marked load.", }, [ POWER5_PME_PM_1INST_CLB_CYC ] = { .pme_name = "PM_1INST_CLB_CYC", .pme_code = 0x400c1, .pme_short_desc = "Cycles 1 instruction in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. 
Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", }, [ POWER5_PME_PM_MEM_SPEC_RD_CANCEL ] = { .pme_name = "PM_MEM_SPEC_RD_CANCEL", .pme_code = 0x721e6, .pme_short_desc = "Speculative memory read cancelled", .pme_long_desc = "Speculative memory read cancelled (i.e. cresp = sourced by L2/L3)", }, [ POWER5_PME_PM_MRK_DTLB_MISS_16M ] = { .pme_name = "PM_MRK_DTLB_MISS_16M", .pme_code = 0xc40c5, .pme_short_desc = "Marked Data TLB misses for 16M page", .pme_long_desc = "Marked Data TLB misses for 16M page", }, [ POWER5_PME_PM_FPU_FDIV ] = { .pme_name = "PM_FPU_FDIV", .pme_code = 0x100088, .pme_short_desc = "FPU executed FDIV instruction", .pme_long_desc = "The floating point unit has executed a divide instruction. This could be fdiv, fdivs, fdiv., fdivs.. Combined Unit 0 + Unit 1.", }, [ POWER5_PME_PM_FPU_SINGLE ] = { .pme_name = "PM_FPU_SINGLE", .pme_code = 0x102090, .pme_short_desc = "FPU executed single precision instruction", .pme_long_desc = "FPU is executing single precision instruction. Combined Unit 0 + Unit 1.", }, [ POWER5_PME_PM_FPU0_FMA ] = { .pme_name = "PM_FPU0_FMA", .pme_code = 0xc1, .pme_short_desc = "FPU0 executed multiply-add instruction", .pme_long_desc = "The floating point unit has executed a multiply-add kind of instruction. 
This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER5_PME_PM_SLB_MISS ] = { .pme_name = "PM_SLB_MISS", .pme_code = 0x280088, .pme_short_desc = "SLB misses", .pme_long_desc = "Total of all Segment Lookaside Buffer (SLB) misses, Instructions + Data.", }, [ POWER5_PME_PM_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_LSU1_FLUSH_LRQ", .pme_code = 0xc00c6, .pme_short_desc = "LSU1 LRQ flushes", .pme_long_desc = "A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER5_PME_PM_L2SA_ST_HIT ] = { .pme_name = "PM_L2SA_ST_HIT", .pme_code = 0x733e0, .pme_short_desc = "L2 slice A store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A, B, and C.", }, [ POWER5_PME_PM_DTLB_MISS ] = { .pme_name = "PM_DTLB_MISS", .pme_code = 0x800c4, .pme_short_desc = "Data TLB misses", .pme_long_desc = "Data TLB misses, all page sizes.", }, [ POWER5_PME_PM_BR_PRED_TA ] = { .pme_name = "PM_BR_PRED_TA", .pme_code = 0x230e3, .pme_short_desc = "A conditional branch was predicted, target prediction", .pme_long_desc = "The target address of a branch instruction was predicted.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L375_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L375_MOD_CYC", .pme_code = 0x4c70a7, .pme_short_desc = "Marked load latency from L3.75 modified", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5_PME_PM_CMPLU_STALL_FXU ] = { .pme_name = "PM_CMPLU_STALL_FXU", .pme_code = 0x211099, .pme_short_desc = "Completion stall caused by FXU instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a fixed point instruction.", }, [ POWER5_PME_PM_EXT_INT ] = { .pme_name = "PM_EXT_INT", .pme_code = 0x400003, .pme_short_desc = "External interrupts", .pme_long_desc = "An interrupt due to an external exception occurred", }, [ POWER5_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_LRQ", .pme_code = 0x810c6, .pme_short_desc = "LSU1 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER5_PME_PM_LSU1_LDF ] = { .pme_name = "PM_LSU1_LDF", .pme_code = 0xc50c4, .pme_short_desc = "LSU1 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed by LSU1", }, [ POWER5_PME_PM_MRK_ST_GPS ] = { .pme_name = "PM_MRK_ST_GPS", .pme_code = 0x200003, .pme_short_desc = "Marked store sent to GPS", .pme_long_desc = "A sampled store has been sent to the memory subsystem", }, [ POWER5_PME_PM_FAB_CMD_ISSUED ] = { .pme_name = "PM_FAB_CMD_ISSUED", .pme_code = 0x700c7, .pme_short_desc = "Fabric command issued", .pme_long_desc = "Incremented when a chip issues a command on its SnoopA address bus. Each of the two address busses (SnoopA and SnoopB) is capable of one transaction per fabric cycle (one fabric cycle = 2 cpu cycles in normal 2:1 mode), but each chip can only drive the SnoopA bus, and can only drive one transaction every two fabric cycles (i.e., every four cpu cycles). 
In MCM-based systems, two chips interleave their accesses to each of the two fabric busses (SnoopA, SnoopB) to reach a peak capability of one transaction per cpu clock cycle. The two chips that drive SnoopB are wired so that the chips refer to the bus as SnoopA but it is connected to the other two chips as SnoopB. Note that this event will only be recorded by the FBC on the chip that sourced the operation. The signal is delivered at FBC speed and the count must be scaled.", }, [ POWER5_PME_PM_LSU0_SRQ_STFWD ] = { .pme_name = "PM_LSU0_SRQ_STFWD", .pme_code = 0xc20e0, .pme_short_desc = "LSU0 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 0. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load hits the L1 but becomes a store forward, it is not treated as a load miss.", }, [ POWER5_PME_PM_CR_MAP_FULL_CYC ] = { .pme_name = "PM_CR_MAP_FULL_CYC", .pme_code = 0x100c4, .pme_short_desc = "Cycles CR logical operation mapper full", .pme_long_desc = "The Conditional Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. 
This event only indicates that the mapper was full, not that dispatch was prevented.", }, [ POWER5_PME_PM_L2SA_RCST_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SA_RCST_DISP_FAIL_RC_FULL", .pme_code = 0x722e0, .pme_short_desc = "L2 slice A RC store dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a store failed because all RC machines are busy.", }, [ POWER5_PME_PM_MRK_LSU0_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU0_FLUSH_ULD", .pme_code = 0x810c0, .pme_short_desc = "LSU0 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", }, [ POWER5_PME_PM_LSU_FLUSH_SRQ_FULL ] = { .pme_name = "PM_LSU_FLUSH_SRQ_FULL", .pme_code = 0x330e0, .pme_short_desc = "Flush caused by SRQ full", .pme_long_desc = "This thread was flushed at dispatch because its Store Request Queue was full. This allows the other thread to have more machine resources for it to make progress while this thread is stalled.", }, [ POWER5_PME_PM_FLUSH_IMBAL ] = { .pme_name = "PM_FLUSH_IMBAL", .pme_code = 0x330e3, .pme_short_desc = "Flush caused by thread GCT imbalance", .pme_long_desc = "This thread has been flushed at dispatch because it is stalled and a GCT imbalance exists. GCT thresholds are set in the TSCR register. This allows the other thread to have more machine resources for it to make progress while this thread is stalled.", }, [ POWER5_PME_PM_MEM_RQ_DISP_Q16to19 ] = { .pme_name = "PM_MEM_RQ_DISP_Q16to19", .pme_code = 0x727e6, .pme_short_desc = "Memory read queue dispatched to queues 16-19", .pme_long_desc = "A memory operation was dispatched to read queue 16,17,18 or 19. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus3or4_CYC", .pme_code = 0x430e1, .pme_short_desc = "Cycles thread priority difference is -3 or -4", .pme_long_desc = "Cycles when this thread's priority is lower than the other thread's priority by 3 or 4.", }, [ POWER5_PME_PM_DATA_FROM_L35_MOD ] = { .pme_name = "PM_DATA_FROM_L35_MOD", .pme_code = 0x2c309e, .pme_short_desc = "Data loaded from L3.5 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on the same module as this processor is located due to a demand load.", }, [ POWER5_PME_PM_MEM_HI_PRIO_WR_CMPL ] = { .pme_name = "PM_MEM_HI_PRIO_WR_CMPL", .pme_code = 0x726e6, .pme_short_desc = "High priority write completed", .pme_long_desc = "A memory write, which was upgraded to high priority, completed. Writes can be upgraded to high priority to ensure that read traffic does not lock out writes. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_FPU1_FDIV ] = { .pme_name = "PM_FPU1_FDIV", .pme_code = 0xc4, .pme_short_desc = "FPU1 executed FDIV instruction", .pme_long_desc = "FPU1 has executed a divide instruction. This could be fdiv, fdivs, fdiv., fdivs.", }, [ POWER5_PME_PM_FPU0_FRSP_FCONV ] = { .pme_name = "PM_FPU0_FRSP_FCONV", .pme_code = 0x10c1, .pme_short_desc = "FPU0 executed FRSP or FCONV instructions", .pme_long_desc = "FPU0 has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER5_PME_PM_MEM_RQ_DISP ] = { .pme_name = "PM_MEM_RQ_DISP", .pme_code = 0x701c6, .pme_short_desc = "Memory read queue dispatched", .pme_long_desc = "A memory read was dispatched. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_LWSYNC_HELD ] = { .pme_name = "PM_LWSYNC_HELD", .pme_code = 0x130e0, .pme_short_desc = "LWSYNC held at dispatch", .pme_long_desc = "Cycles a LWSYNC instruction was held at dispatch. LWSYNC instructions are held at dispatch until all previous loads are done and all previous stores have issued. LWSYNC enters the Store Request Queue and is sent to the storage subsystem but does not wait for a response.", }, [ POWER5_PME_PM_FXU_FIN ] = { .pme_name = "PM_FXU_FIN", .pme_code = 0x313088, .pme_short_desc = "FXU produced a result", .pme_long_desc = "The fixed point unit (Unit 0 + Unit 1) finished an instruction. Instructions that finish may not necessarily complete.", }, [ POWER5_PME_PM_DSLB_MISS ] = { .pme_name = "PM_DSLB_MISS", .pme_code = 0x800c5, .pme_short_desc = "Data SLB misses", .pme_long_desc = "An SLB miss for a data request occurred. SLB misses trap to the operating system to resolve.", }, [ POWER5_PME_PM_FXLS1_FULL_CYC ] = { .pme_name = "PM_FXLS1_FULL_CYC", .pme_code = 0x110c4, .pme_short_desc = "Cycles FXU1/LS1 queue full", .pme_long_desc = "The issue queue that feeds the Fixed Point unit 1 / Load Store Unit 1 is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented.", }, [ POWER5_PME_PM_DATA_FROM_L275_SHR ] = { .pme_name = "PM_DATA_FROM_L275_SHR", .pme_code = 0x3c3097, .pme_short_desc = "Data loaded from L2.75 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T) data from the L2 on a different module than this processor is located due to a demand load. 
", }, [ POWER5_PME_PM_THRD_SEL_T0 ] = { .pme_name = "PM_THRD_SEL_T0", .pme_code = 0x410c0, .pme_short_desc = "Decode selected thread 0", .pme_long_desc = "Thread selection picked thread 0 for decode.", }, [ POWER5_PME_PM_PTEG_RELOAD_VALID ] = { .pme_name = "PM_PTEG_RELOAD_VALID", .pme_code = 0x830e4, .pme_short_desc = "PTEG reload valid", .pme_long_desc = "A Page Table Entry was loaded into the TLB.", }, [ POWER5_PME_PM_LSU_LMQ_LHR_MERGE ] = { .pme_name = "PM_LSU_LMQ_LHR_MERGE", .pme_code = 0xc70e5, .pme_short_desc = "LMQ LHR merges", .pme_long_desc = "A data cache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry.", }, [ POWER5_PME_PM_MRK_STCX_FAIL ] = { .pme_name = "PM_MRK_STCX_FAIL", .pme_code = 0x820e6, .pme_short_desc = "Marked STCX failed", .pme_long_desc = "A marked stcx (stwcx or stdcx) failed", }, [ POWER5_PME_PM_2INST_CLB_CYC ] = { .pme_name = "PM_2INST_CLB_CYC", .pme_code = 0x400c2, .pme_short_desc = "Cycles 2 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", }, [ POWER5_PME_PM_FAB_PNtoVN_DIRECT ] = { .pme_name = "PM_FAB_PNtoVN_DIRECT", .pme_code = 0x723e7, .pme_short_desc = "PN to VN beat went straight to its destination", .pme_long_desc = "Fabric Data beats that the base chip takes the inbound PN data and passes it through to the outbound VN bus without going into a sidecar. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5_PME_PM_PTEG_FROM_L2MISS ] = { .pme_name = "PM_PTEG_FROM_L2MISS", .pme_code = 0x38309b, .pme_short_desc = "PTEG loaded from L2 miss", .pme_long_desc = "A Page Table Entry was loaded into the TLB but not from the local L2.", }, [ POWER5_PME_PM_CMPLU_STALL_LSU ] = { .pme_name = "PM_CMPLU_STALL_LSU", .pme_code = 0x211098, .pme_short_desc = "Completion stall caused by LSU instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a load/store instruction.", }, [ POWER5_PME_PM_MRK_DSLB_MISS ] = { .pme_name = "PM_MRK_DSLB_MISS", .pme_code = 0xc50c7, .pme_short_desc = "Marked Data SLB misses", .pme_long_desc = "A Data SLB miss was caused by a marked instruction.", }, [ POWER5_PME_PM_LSU_FLUSH_ULD ] = { .pme_name = "PM_LSU_FLUSH_ULD", .pme_code = 0x1c0088, .pme_short_desc = "LRQ unaligned load flushes", .pme_long_desc = "A load was flushed because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1). Combined Unit 0 + 1.", }, [ POWER5_PME_PM_PTEG_FROM_LMEM ] = { .pme_name = "PM_PTEG_FROM_LMEM", .pme_code = 0x283087, .pme_short_desc = "PTEG loaded from local memory", .pme_long_desc = "A Page Table Entry was loaded into the TLB from memory attached to the same module this processor is located on.", }, [ POWER5_PME_PM_MRK_BRU_FIN ] = { .pme_name = "PM_MRK_BRU_FIN", .pme_code = 0x200005, .pme_short_desc = "Marked instruction BRU processing finished", .pme_long_desc = "The branch unit finished a marked instruction. Instructions that finish may not necessarily complete.", }, [ POWER5_PME_PM_MEM_WQ_DISP_WRITE ] = { .pme_name = "PM_MEM_WQ_DISP_WRITE", .pme_code = 0x703c6, .pme_short_desc = "Memory write queue dispatched due to write", .pme_long_desc = "A memory write was dispatched to a write queue. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L275_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L275_MOD_CYC", .pme_code = 0x4c70a3, .pme_short_desc = "Marked load latency from L2.75 modified", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5_PME_PM_LSU1_NCLD ] = { .pme_name = "PM_LSU1_NCLD", .pme_code = 0xc50c5, .pme_short_desc = "LSU1 non-cacheable loads", .pme_long_desc = "A non-cacheable load was executed by Unit 1.", }, [ POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SA_RCLD_DISP_FAIL_OTHER", .pme_code = 0x731e0, .pme_short_desc = "L2 slice A RC load dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions.", }, [ POWER5_PME_PM_SNOOP_PW_RETRY_WQ_PWQ ] = { .pme_name = "PM_SNOOP_PW_RETRY_WQ_PWQ", .pme_code = 0x717c6, .pme_short_desc = "Snoop partial-write retry due to collision with active write or partial-write queue", .pme_long_desc = "A snoop request for a partial write to memory was retried because it matched the cache line of an active write or partial write. When this happens the snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_FPR_MAP_FULL_CYC ] = { .pme_name = "PM_FPR_MAP_FULL_CYC", .pme_code = 0x100c1, .pme_short_desc = "Cycles FPR mapper full", .pme_long_desc = "The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations. 
", }, [ POWER5_PME_PM_FPU1_FULL_CYC ] = { .pme_name = "PM_FPU1_FULL_CYC", .pme_code = 0x100c7, .pme_short_desc = "Cycles FPU1 issue queue full", .pme_long_desc = "The issue queue for FPU1 cannot accept any more instructions. Dispatch to this issue queue is stopped", }, [ POWER5_PME_PM_L3SA_ALL_BUSY ] = { .pme_name = "PM_L3SA_ALL_BUSY", .pme_code = 0x721e3, .pme_short_desc = "L3 slice A active for every cycle all CI/CO machines busy", .pme_long_desc = "Cycles All Castin/Castout machines are busy.", }, [ POWER5_PME_PM_3INST_CLB_CYC ] = { .pme_name = "PM_3INST_CLB_CYC", .pme_code = 0x400c3, .pme_short_desc = "Cycles 3 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", }, [ POWER5_PME_PM_MEM_PWQ_DISP_Q2or3 ] = { .pme_name = "PM_MEM_PWQ_DISP_Q2or3", .pme_code = 0x734e6, .pme_short_desc = "Memory partial-write queue dispatched to Write Queue 2 or 3", .pme_long_desc = "Memory partial-write queue dispatched to Write Queue 2 or 3. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_L2SA_SHR_INV ] = { .pme_name = "PM_L2SA_SHR_INV", .pme_code = 0x710c0, .pme_short_desc = "L2 slice A transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. 
NOTE: For this event to be useful the tablewalk duration event should also be counted.", }, [ POWER5_PME_PM_THRESH_TIMEO ] = { .pme_name = "PM_THRESH_TIMEO", .pme_code = 0x30000b, .pme_short_desc = "Threshold timeout", .pme_long_desc = "The threshold timer expired", }, [ POWER5_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL ] = { .pme_name = "PM_L2SA_RC_DISP_FAIL_CO_BUSY_ALL", .pme_code = 0x713c0, .pme_short_desc = "L2 slice A RC dispatch attempt failed due to all CO busy", .pme_long_desc = "A Read/Claim dispatch was rejected because all Castout machines were busy.", }, [ POWER5_PME_PM_THRD_SEL_OVER_GCT_IMBAL ] = { .pme_name = "PM_THRD_SEL_OVER_GCT_IMBAL", .pme_code = 0x410c4, .pme_short_desc = "Thread selection overrides caused by GCT imbalance", .pme_long_desc = "Thread selection was overridden because of a GCT imbalance.", }, [ POWER5_PME_PM_FPU_FSQRT ] = { .pme_name = "PM_FPU_FSQRT", .pme_code = 0x200090, .pme_short_desc = "FPU executed FSQRT instruction", .pme_long_desc = "The floating point unit has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1.", }, [ POWER5_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_LRQ", .pme_code = 0x810c2, .pme_short_desc = "LSU0 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER5_PME_PM_PMC1_OVERFLOW ] = { .pme_name = "PM_PMC1_OVERFLOW", .pme_code = 0x20000a, .pme_short_desc = "PMC1 Overflow", .pme_long_desc = "Overflows from PMC1 are counted. This effectively widens the PMC. 
The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", }, [ POWER5_PME_PM_L3SC_SNOOP_RETRY ] = { .pme_name = "PM_L3SC_SNOOP_RETRY", .pme_code = 0x731e5, .pme_short_desc = "L3 slice C snoop retries", .pme_long_desc = "Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b)", }, [ POWER5_PME_PM_DATA_TABLEWALK_CYC ] = { .pme_name = "PM_DATA_TABLEWALK_CYC", .pme_code = 0x800c7, .pme_short_desc = "Cycles doing data tablewalks", .pme_long_desc = "Cycles a translation tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried.", }, [ POWER5_PME_PM_THRD_PRIO_6_CYC ] = { .pme_name = "PM_THRD_PRIO_6_CYC", .pme_code = 0x420e5, .pme_short_desc = "Cycles thread running at priority level 6", .pme_long_desc = "Cycles this thread was running at priority level 6.", }, [ POWER5_PME_PM_FPU_FEST ] = { .pme_name = "PM_FPU_FEST", .pme_code = 0x401090, .pme_short_desc = "FPU executed FEST instruction", .pme_long_desc = "The floating point unit has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1.", }, [ POWER5_PME_PM_FAB_M1toP1_SIDECAR_EMPTY ] = { .pme_name = "PM_FAB_M1toP1_SIDECAR_EMPTY", .pme_code = 0x702c7, .pme_short_desc = "M1 to P1 sidecar empty", .pme_long_desc = "Fabric cycles when the Minus-1 hip/hop sidecars (sidecars for chip to chip data transfer) are empty. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5_PME_PM_MRK_DATA_FROM_RMEM ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM", .pme_code = 0x1c70a1, .pme_short_desc = "Marked data loaded from remote memory", .pme_long_desc = "The processor's Data Cache was reloaded due to a marked load from memory attached to a different module than this processor is located on.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L35_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L35_MOD_CYC", .pme_code = 0x4c70a6, .pme_short_desc = "Marked load latency from L3.5 modified", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5_PME_PM_MEM_PWQ_DISP ] = { .pme_name = "PM_MEM_PWQ_DISP", .pme_code = 0x704c6, .pme_short_desc = "Memory partial-write queue dispatched", .pme_long_desc = "Number of Partial Writes dispatched. The MC provides resources to gather partial cacheline writes (Partial line DMA writes & CI-stores) to up to four different cachelines at a time. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_FAB_P1toM1_SIDECAR_EMPTY ] = { .pme_name = "PM_FAB_P1toM1_SIDECAR_EMPTY", .pme_code = 0x701c7, .pme_short_desc = "P1 to M1 sidecar empty", .pme_long_desc = "Fabric cycles when the Plus-1 hip/hop sidecars (sidecars for chip to chip data transfer) are empty. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5_PME_PM_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_LD_MISS_L1_LSU0", .pme_code = 0xc10c2, .pme_short_desc = "LSU0 L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache, by unit 0.", }, [ POWER5_PME_PM_SNOOP_PARTIAL_RTRY_QFULL ] = { .pme_name = "PM_SNOOP_PARTIAL_RTRY_QFULL", .pme_code = 0x730e6, .pme_short_desc = "Snoop partial write retry due to partial-write queues full", .pme_long_desc = "A snoop request for a partial write to memory was retried because the write queues that handle partial writes were full. When this happens the active writes are changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_FPU1_STALL3 ] = { .pme_name = "PM_FPU1_STALL3", .pme_code = 0x20e5, .pme_short_desc = "FPU1 stalled in pipe3", .pme_long_desc = "FPU1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always).", }, [ POWER5_PME_PM_GCT_USAGE_80to99_CYC ] = { .pme_name = "PM_GCT_USAGE_80to99_CYC", .pme_code = 0x30001f, .pme_short_desc = "Cycles GCT 80-99% full", .pme_long_desc = "Cycles when the Global Completion Table has between 80% and 99% of its slots used. The GCT has 20 entries shared between threads", }, [ POWER5_PME_PM_WORK_HELD ] = { .pme_name = "PM_WORK_HELD", .pme_code = 0x40000c, .pme_short_desc = "Work held", .pme_long_desc = "RAS Unit has signaled completion to stop and there are groups waiting to complete", }, [ POWER5_PME_PM_INST_CMPL ] = { .pme_name = "PM_INST_CMPL", .pme_code = 0x100009, .pme_short_desc = "Instructions completed", .pme_long_desc = "Number of PowerPC instructions that completed. 
", }, [ POWER5_PME_PM_LSU1_FLUSH_UST ] = { .pme_name = "PM_LSU1_FLUSH_UST", .pme_code = 0xc00c5, .pme_short_desc = "LSU1 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 1 because it was unaligned (crossed a 4K boundary)", }, [ POWER5_PME_PM_FXU_IDLE ] = { .pme_name = "PM_FXU_IDLE", .pme_code = 0x100012, .pme_short_desc = "FXU idle", .pme_long_desc = "FXU0 and FXU1 are both idle.", }, [ POWER5_PME_PM_LSU0_FLUSH_ULD ] = { .pme_name = "PM_LSU0_FLUSH_ULD", .pme_code = 0xc00c0, .pme_short_desc = "LSU0 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 0 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1)", }, [ POWER5_PME_PM_LSU1_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU1_REJECT_LMQ_FULL", .pme_code = 0xc60e5, .pme_short_desc = "LSU1 reject due to LMQ full or missed data coming", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all eight entries are full, subsequent load instructions are rejected.", }, [ POWER5_PME_PM_GRP_DISP_REJECT ] = { .pme_name = "PM_GRP_DISP_REJECT", .pme_code = 0x120e4, .pme_short_desc = "Group dispatch rejected", .pme_long_desc = "A group that previously attempted dispatch was rejected.", }, [ POWER5_PME_PM_L2SA_MOD_INV ] = { .pme_name = "PM_L2SA_MOD_INV", .pme_code = 0x730e0, .pme_short_desc = "L2 slice A transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. 
The event is provided on each of the three slices A, B, and C.", }, [ POWER5_PME_PM_PTEG_FROM_L25_SHR ] = { .pme_name = "PM_PTEG_FROM_L25_SHR", .pme_code = 0x183097, .pme_short_desc = "PTEG loaded from L2.5 shared", .pme_long_desc = "A Page Table Entry was loaded into the TLB with shared (T or SL) data from the L2 of a chip on the same module as this processor is located due to a demand load.", }, [ POWER5_PME_PM_FAB_CMD_RETRIED ] = { .pme_name = "PM_FAB_CMD_RETRIED", .pme_code = 0x710c7, .pme_short_desc = "Fabric command retried", .pme_long_desc = "Incremented when a command issued by a chip on its SnoopA address bus is retried for any reason. The overwhelming majority of retries are due to running out of memory controller queues but retries can also be caused by trying to reference addresses that are in a transient cache state -- e.g. a line is transient after issuing a DCLAIM instruction to a shared line but before the associated store completes. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5_PME_PM_L3SA_SHR_INV ] = { .pme_name = "PM_L3SA_SHR_INV", .pme_code = 0x710c3, .pme_short_desc = "L3 slice A transition from shared to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched).", }, [ POWER5_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL ] = { .pme_name = "PM_L2SB_RC_DISP_FAIL_CO_BUSY_ALL", .pme_code = 0x713c1, .pme_short_desc = "L2 slice B RC dispatch attempt failed due to all CO busy", .pme_long_desc = "A Read/Claim dispatch was rejected because all Castout machines were busy.", }, [ POWER5_PME_PM_L2SA_RCST_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SA_RCST_DISP_FAIL_ADDR", .pme_code = 0x712c0, .pme_short_desc = "L2 slice A RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a store failed because of an address conflict. 
Two RC machines will never both work on the same line or line in the same congruence class at the same time.", }, [ POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SA_RCLD_DISP_FAIL_RC_FULL", .pme_code = 0x721e0, .pme_short_desc = "L2 slice A RC load dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a load failed because all RC machines are busy.", }, [ POWER5_PME_PM_PTEG_FROM_L375_MOD ] = { .pme_name = "PM_PTEG_FROM_L375_MOD", .pme_code = 0x1830a7, .pme_short_desc = "PTEG loaded from L3.75 modified", .pme_long_desc = "A Page Table Entry was loaded into the TLB with modified (M) data from the L3 of a chip on a different module than this processor is located, due to a demand load.", }, [ POWER5_PME_PM_MRK_LSU1_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU1_FLUSH_UST", .pme_code = 0x810c5, .pme_short_desc = "LSU1 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary)", }, [ POWER5_PME_PM_BR_ISSUED ] = { .pme_name = "PM_BR_ISSUED", .pme_code = 0x230e4, .pme_short_desc = "Branches issued", .pme_long_desc = "A branch instruction was issued to the branch unit. 
A branch that was incorrectly predicted may issue and execute multiple times.", }, [ POWER5_PME_PM_MRK_GRP_BR_REDIR ] = { .pme_name = "PM_MRK_GRP_BR_REDIR", .pme_code = 0x212091, .pme_short_desc = "Group experienced marked branch redirect", .pme_long_desc = "A group containing a marked (sampled) instruction experienced a branch redirect.", }, [ POWER5_PME_PM_EE_OFF ] = { .pme_name = "PM_EE_OFF", .pme_code = 0x130e3, .pme_short_desc = "Cycles MSR(EE) bit off", .pme_long_desc = "Cycles MSR(EE) bit was off indicating that interrupts due to external exceptions were masked.", }, [ POWER5_PME_PM_MEM_RQ_DISP_Q4to7 ] = { .pme_name = "PM_MEM_RQ_DISP_Q4to7", .pme_code = 0x712c6, .pme_short_desc = "Memory read queue dispatched to queues 4-7", .pme_long_desc = "A memory operation was dispatched to read queue 4,5,6 or 7. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_MEM_FAST_PATH_RD_DISP ] = { .pme_name = "PM_MEM_FAST_PATH_RD_DISP", .pme_code = 0x713e6, .pme_short_desc = "Fast path memory read dispatched", .pme_long_desc = "Fast path memory read dispatched", }, [ POWER5_PME_PM_INST_FROM_L3 ] = { .pme_name = "PM_INST_FROM_L3", .pme_code = 0x12208d, .pme_short_desc = "Instruction fetched from L3", .pme_long_desc = "An instruction fetch group was fetched from the local L3. 
Fetch groups can contain up to 8 instructions", }, [ POWER5_PME_PM_ITLB_MISS ] = { .pme_name = "PM_ITLB_MISS", .pme_code = 0x800c0, .pme_short_desc = "Instruction TLB misses", .pme_long_desc = "A TLB miss for an Instruction Fetch has occurred", }, [ POWER5_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { .pme_name = "PM_FXU1_BUSY_FXU0_IDLE", .pme_code = 0x400012, .pme_short_desc = "FXU1 busy FXU0 idle", .pme_long_desc = "FXU0 was idle while FXU1 was busy.", }, [ POWER5_PME_PM_FXLS_FULL_CYC ] = { .pme_name = "PM_FXLS_FULL_CYC", .pme_code = 0x411090, .pme_short_desc = "Cycles FXLS queue is full", .pme_long_desc = "Cycles when the issue queues for one or both FXU/LSU units is full. Use with caution since this is the sum of cycles when Unit 0 was full plus Unit 1 full. It does not indicate when both units were full.", }, [ POWER5_PME_PM_DTLB_REF_4K ] = { .pme_name = "PM_DTLB_REF_4K", .pme_code = 0xc40c2, .pme_short_desc = "Data TLB reference for 4K page", .pme_long_desc = "Data TLB references for 4KB pages. Includes hits + misses.", }, [ POWER5_PME_PM_GRP_DISP_VALID ] = { .pme_name = "PM_GRP_DISP_VALID", .pme_code = 0x120e3, .pme_short_desc = "Group dispatch valid", .pme_long_desc = "A group is available for dispatch. This does not mean it was successfully dispatched.", }, [ POWER5_PME_PM_LSU_FLUSH_UST ] = { .pme_name = "PM_LSU_FLUSH_UST", .pme_code = 0x2c0088, .pme_short_desc = "SRQ unaligned store flushes", .pme_long_desc = "A store was flushed because it was unaligned (crossed a 4K boundary). Combined Unit 0 + 1.", }, [ POWER5_PME_PM_FXU1_FIN ] = { .pme_name = "PM_FXU1_FIN", .pme_code = 0x130e6, .pme_short_desc = "FXU1 produced a result", .pme_long_desc = "The Fixed Point unit 1 finished an instruction and produced a result. 
Instructions that finish may not necessarily complete.", }, [ POWER5_PME_PM_THRD_PRIO_4_CYC ] = { .pme_name = "PM_THRD_PRIO_4_CYC", .pme_code = 0x420e3, .pme_short_desc = "Cycles thread running at priority level 4", .pme_long_desc = "Cycles this thread was running at priority level 4.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L35_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L35_MOD", .pme_code = 0x2c709e, .pme_short_desc = "Marked data loaded from L3.5 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on the same module as this processor is located due to a marked load.", }, [ POWER5_PME_PM_4INST_CLB_CYC ] = { .pme_name = "PM_4INST_CLB_CYC", .pme_code = 0x400c4, .pme_short_desc = "Cycles 4 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", }, [ POWER5_PME_PM_MRK_DTLB_REF_16M ] = { .pme_name = "PM_MRK_DTLB_REF_16M", .pme_code = 0xc40c7, .pme_short_desc = "Marked Data TLB reference for 16M page", .pme_long_desc = "Data TLB references by a marked instruction for 16MB pages.", }, [ POWER5_PME_PM_INST_FROM_L375_MOD ] = { .pme_name = "PM_INST_FROM_L375_MOD", .pme_code = 0x42209d, .pme_short_desc = "Instruction fetched from L3.75 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from the L3 of a chip on a different module than this processor is located. 
Fetch groups can contain up to 8 instructions", }, [ POWER5_PME_PM_L2SC_RCST_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SC_RCST_DISP_FAIL_ADDR", .pme_code = 0x712c2, .pme_short_desc = "L2 slice C RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.", }, [ POWER5_PME_PM_GRP_CMPL ] = { .pme_name = "PM_GRP_CMPL", .pme_code = 0x300013, .pme_short_desc = "Group completed", .pme_long_desc = "A group completed. Microcoded instructions that span multiple groups will generate this event once per group.", }, [ POWER5_PME_PM_FPU1_1FLOP ] = { .pme_name = "PM_FPU1_1FLOP", .pme_code = 0xc7, .pme_short_desc = "FPU1 executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations.", }, [ POWER5_PME_PM_FPU_FRSP_FCONV ] = { .pme_name = "PM_FPU_FRSP_FCONV", .pme_code = 0x301090, .pme_short_desc = "FPU executed FRSP or FCONV instructions", .pme_long_desc = "The floating point unit has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1.", }, [ POWER5_PME_PM_5INST_CLB_CYC ] = { .pme_name = "PM_5INST_CLB_CYC", .pme_code = 0x400c5, .pme_short_desc = "Cycles 5 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. 
Each thread has its own set of CLB; these events are thread specific.", }, [ POWER5_PME_PM_L3SC_REF ] = { .pme_name = "PM_L3SC_REF", .pme_code = 0x701c5, .pme_short_desc = "L3 slice C references", .pme_long_desc = "Number of attempts made by this chip's cores to find data in the L3. Reported per L3 slice.", }, [ POWER5_PME_PM_THRD_L2MISS_BOTH_CYC ] = { .pme_name = "PM_THRD_L2MISS_BOTH_CYC", .pme_code = 0x410c7, .pme_short_desc = "Cycles both threads in L2 misses", .pme_long_desc = "Cycles that both threads have L2 miss pending. If only one thread has a L2 miss pending the other thread is given priority at decode. If both threads have L2 miss pending decode priority is determined by the number of GCT entries used.", }, [ POWER5_PME_PM_MEM_PW_GATH ] = { .pme_name = "PM_MEM_PW_GATH", .pme_code = 0x714c6, .pme_short_desc = "Memory partial-write gathered", .pme_long_desc = "Two or more partial-writes have been merged into a single memory write. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_FAB_PNtoNN_SIDECAR ] = { .pme_name = "PM_FAB_PNtoNN_SIDECAR", .pme_code = 0x713c7, .pme_short_desc = "PN to NN beat went to sidecar first", .pme_long_desc = "Fabric Data beats that the base chip takes the inbound PN data and forwards it on to the outbound NN data bus after going into a sidecar first. The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5_PME_PM_FAB_DCLAIM_ISSUED ] = { .pme_name = "PM_FAB_DCLAIM_ISSUED", .pme_code = 0x720e7, .pme_short_desc = "dclaim issued", .pme_long_desc = "A DCLAIM command was issued. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5_PME_PM_GRP_IC_MISS ] = { .pme_name = "PM_GRP_IC_MISS", .pme_code = 0x120e7, .pme_short_desc = "Group experienced I cache miss", .pme_long_desc = "Number of groups, counted at dispatch, that have encountered an icache miss redirect. 
Every group constructed from a fetch group that missed the instruction cache will count.", }, [ POWER5_PME_PM_INST_FROM_L35_SHR ] = { .pme_name = "PM_INST_FROM_L35_SHR", .pme_code = 0x12209d, .pme_short_desc = "Instruction fetched from L3.5 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (S) data from the L3 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions", }, [ POWER5_PME_PM_LSU_LMQ_FULL_CYC ] = { .pme_name = "PM_LSU_LMQ_FULL_CYC", .pme_code = 0xc30e7, .pme_short_desc = "Cycles LMQ full", .pme_long_desc = "The Load Miss Queue was full.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L2_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2_CYC", .pme_code = 0x2c70a0, .pme_short_desc = "Marked load latency from L2", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5_PME_PM_LSU_SRQ_SYNC_CYC ] = { .pme_name = "PM_LSU_SRQ_SYNC_CYC", .pme_code = 0x830e5, .pme_short_desc = "SRQ sync duration", .pme_long_desc = "Cycles that a sync instruction is active in the Store Request Queue.", }, [ POWER5_PME_PM_LSU0_BUSY_REJECT ] = { .pme_name = "PM_LSU0_BUSY_REJECT", .pme_code = 0xc20e3, .pme_short_desc = "LSU0 busy due to reject", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions. ", }, [ POWER5_PME_PM_LSU_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU_REJECT_ERAT_MISS", .pme_code = 0x1c6090, .pme_short_desc = "LSU reject due to ERAT miss", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions due to an ERAT miss. Combined unit 0 + 1. 
Requests that miss the Derat are rejected and retried until the request hits in the Erat.", }, [ POWER5_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM_CYC", .pme_code = 0x4c70a1, .pme_short_desc = "Marked load latency from remote memory", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5_PME_PM_DATA_FROM_L375_SHR ] = { .pme_name = "PM_DATA_FROM_L375_SHR", .pme_code = 0x3c309e, .pme_short_desc = "Data loaded from L3.75 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on a different module than this processor is located due to a demand load.", }, [ POWER5_PME_PM_FPU0_FMOV_FEST ] = { .pme_name = "PM_FPU0_FMOV_FEST", .pme_code = 0x10c0, .pme_short_desc = "FPU0 executed FMOV or FEST instructions", .pme_long_desc = "FPU0 has executed a move kind of instruction or one of the estimate instructions. 
This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ.", }, [ POWER5_PME_PM_PTEG_FROM_L25_MOD ] = { .pme_name = "PM_PTEG_FROM_L25_MOD", .pme_code = 0x283097, .pme_short_desc = "PTEG loaded from L2.5 modified", .pme_long_desc = "A Page Table Entry was loaded into the TLB with modified (M) data from the L2 of a chip on the same module as this processor is located due to a demand load.", }, [ POWER5_PME_PM_LD_REF_L1_LSU0 ] = { .pme_name = "PM_LD_REF_L1_LSU0", .pme_code = 0xc10c0, .pme_short_desc = "LSU0 L1 D cache load references", .pme_long_desc = "Load references to Level 1 Data Cache, by unit 0.", }, [ POWER5_PME_PM_THRD_PRIO_7_CYC ] = { .pme_name = "PM_THRD_PRIO_7_CYC", .pme_code = 0x420e6, .pme_short_desc = "Cycles thread running at priority level 7", .pme_long_desc = "Cycles this thread was running at priority level 7.", }, [ POWER5_PME_PM_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_LSU1_FLUSH_SRQ", .pme_code = 0xc00c7, .pme_short_desc = "LSU1 SRQ lhs flushes", .pme_long_desc = "A store was flushed because a younger load hit an older store that is already in the SRQ or in the same group.", }, [ POWER5_PME_PM_L2SC_RCST_DISP ] = { .pme_name = "PM_L2SC_RCST_DISP", .pme_code = 0x702c2, .pme_short_desc = "L2 slice C RC store dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Store was attempted.", }, [ POWER5_PME_PM_CMPLU_STALL_DIV ] = { .pme_name = "PM_CMPLU_STALL_DIV", .pme_code = 0x411099, .pme_short_desc = "Completion stall caused by DIV instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a fixed point divide instruction. This is a subset of PM_CMPLU_STALL_FXU.", }, [ POWER5_PME_PM_MEM_RQ_DISP_Q12to15 ] = { .pme_name = "PM_MEM_RQ_DISP_Q12to15", .pme_code = 0x732e6, .pme_short_desc = "Memory read queue dispatched to queues 12-15", .pme_long_desc = "A memory operation was dispatched to read queue 12,13,14 or 15. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_INST_FROM_L375_SHR ] = { .pme_name = "PM_INST_FROM_L375_SHR", .pme_code = 0x32209d, .pme_short_desc = "Instruction fetched from L3.75 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (S) data from the L3 of a chip on a different module than this processor is located. Fetch groups can contain up to 8 instructions", }, [ POWER5_PME_PM_ST_REF_L1 ] = { .pme_name = "PM_ST_REF_L1", .pme_code = 0x3c1090, .pme_short_desc = "L1 D cache store references", .pme_long_desc = "Store references to the Data Cache. Combined Unit 0 + 1.", }, [ POWER5_PME_PM_L3SB_ALL_BUSY ] = { .pme_name = "PM_L3SB_ALL_BUSY", .pme_code = 0x721e4, .pme_short_desc = "L3 slice B active for every cycle all CI/CO machines busy", .pme_long_desc = "Cycles All Castin/Castout machines are busy.", }, [ POWER5_PME_PM_FAB_P1toVNorNN_SIDECAR_EMPTY ] = { .pme_name = "PM_FAB_P1toVNorNN_SIDECAR_EMPTY", .pme_code = 0x711c7, .pme_short_desc = "P1 to VN/NN sidecar empty", .pme_long_desc = "Fabric cycles when the Plus-1 jump sidecar (sidecars for mcm to mcm data transfer) is empty. The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L275_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L275_SHR_CYC", .pme_code = 0x2c70a3, .pme_short_desc = "Marked load latency from L2.75 shared", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5_PME_PM_FAB_HOLDtoNN_EMPTY ] = { .pme_name = "PM_FAB_HOLDtoNN_EMPTY", .pme_code = 0x722e7, .pme_short_desc = "Hold buffer to NN empty", .pme_long_desc = "Fabric cyles when the Next Node out hold-buffers are empty. 
The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5_PME_PM_DATA_FROM_LMEM ] = { .pme_name = "PM_DATA_FROM_LMEM", .pme_code = 0x2c3087, .pme_short_desc = "Data loaded from local memory", .pme_long_desc = "The processor's Data Cache was reloaded from memory attached to the same module this processor is located on.", }, [ POWER5_PME_PM_RUN_CYC ] = { .pme_name = "PM_RUN_CYC", .pme_code = 0x100005, .pme_short_desc = "Run cycles", .pme_long_desc = "Processor Cycles gated by the run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. Gating by the run latch filters out the idle loop.", }, [ POWER5_PME_PM_PTEG_FROM_RMEM ] = { .pme_name = "PM_PTEG_FROM_RMEM", .pme_code = 0x1830a1, .pme_short_desc = "PTEG loaded from remote memory", .pme_long_desc = "A Page Table Entry was loaded into the TLB from memory attached to a different module than this processor is located on.", }, [ POWER5_PME_PM_L2SC_RCLD_DISP ] = { .pme_name = "PM_L2SC_RCLD_DISP", .pme_code = 0x701c2, .pme_short_desc = "L2 slice C RC load dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Load was attempted", }, [ POWER5_PME_PM_LSU0_LDF ] = { .pme_name = "PM_LSU0_LDF", .pme_code = 0xc50c0, .pme_short_desc = "LSU0 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed by LSU0", }, [ POWER5_PME_PM_LSU_LRQ_S0_VALID ] = { .pme_name = "PM_LSU_LRQ_S0_VALID", .pme_code = 0xc20e2, .pme_short_desc = "LRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Load Request Queue slot zero is valid. The LRQ is 32 entries long and is allocated round-robin. In SMT mode the LRQ is split between the two threads (16 entries each).", }, [ POWER5_PME_PM_PMC3_OVERFLOW ] = { .pme_name = "PM_PMC3_OVERFLOW", .pme_code = 0x40000a, .pme_short_desc = "PMC3 Overflow", .pme_long_desc = "Overflows from PMC3 are counted.
This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", }, [ POWER5_PME_PM_MRK_IMR_RELOAD ] = { .pme_name = "PM_MRK_IMR_RELOAD", .pme_code = 0x820e2, .pme_short_desc = "Marked IMR reloaded", .pme_long_desc = "A DL1 reload occurred due to a marked load", }, [ POWER5_PME_PM_MRK_GRP_TIMEO ] = { .pme_name = "PM_MRK_GRP_TIMEO", .pme_code = 0x40000b, .pme_short_desc = "Marked group completion timeout", .pme_long_desc = "The sampling timeout expired indicating that the previously sampled instruction is no longer in the processor", }, [ POWER5_PME_PM_ST_MISS_L1 ] = { .pme_name = "PM_ST_MISS_L1", .pme_code = 0xc10c3, .pme_short_desc = "L1 D cache store misses", .pme_long_desc = "A store missed the dcache. Combined Unit 0 + 1.", }, [ POWER5_PME_PM_STOP_COMPLETION ] = { .pme_name = "PM_STOP_COMPLETION", .pme_code = 0x300018, .pme_short_desc = "Completion stopped", .pme_long_desc = "RAS Unit has signaled completion to stop", }, [ POWER5_PME_PM_LSU_BUSY_REJECT ] = { .pme_name = "PM_LSU_BUSY_REJECT", .pme_code = 0x1c2090, .pme_short_desc = "LSU busy due to reject", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions. Combined unit 0 + 1.", }, [ POWER5_PME_PM_ISLB_MISS ] = { .pme_name = "PM_ISLB_MISS", .pme_code = 0x800c1, .pme_short_desc = "Instruction SLB misses", .pme_long_desc = "An SLB miss for an instruction fetch has occurred", }, [ POWER5_PME_PM_CYC ] = { .pme_name = "PM_CYC", .pme_code = 0xf, .pme_short_desc = "Processor cycles", .pme_long_desc = "Processor cycles", }, [ POWER5_PME_PM_THRD_ONE_RUN_CYC ] = { .pme_name = "PM_THRD_ONE_RUN_CYC", .pme_code = 0x10000b, .pme_short_desc = "One of the threads in run cycles", .pme_long_desc = "At least one thread has set its run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop.
This event does not respect FCWAIT.", }, [ POWER5_PME_PM_GRP_BR_REDIR_NONSPEC ] = { .pme_name = "PM_GRP_BR_REDIR_NONSPEC", .pme_code = 0x112091, .pme_short_desc = "Group experienced non-speculative branch redirect", .pme_long_desc = "Number of groups, counted at completion, that have encountered a branch redirect.", }, [ POWER5_PME_PM_LSU1_SRQ_STFWD ] = { .pme_name = "PM_LSU1_SRQ_STFWD", .pme_code = 0xc20e4, .pme_short_desc = "LSU1 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 1. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load hits L1 and becomes a store forward, it is not treated as a load miss.", }, [ POWER5_PME_PM_L3SC_MOD_INV ] = { .pme_name = "PM_L3SC_MOD_INV", .pme_code = 0x730e5, .pme_short_desc = "L3 slice C transition from modified to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e.
L3 going M=>I). Mu|Me are not included since they are formed due to a previous read op. Tx is not included since it is considered shared at this point.", }, [ POWER5_PME_PM_L2_PREF ] = { .pme_name = "PM_L2_PREF", .pme_code = 0xc50c3, .pme_short_desc = "L2 cache prefetches", .pme_long_desc = "A request to prefetch data into L2 was made", }, [ POWER5_PME_PM_GCT_NOSLOT_BR_MPRED ] = { .pme_name = "PM_GCT_NOSLOT_BR_MPRED", .pme_code = 0x41009c, .pme_short_desc = "No slot in GCT caused by branch mispredict", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread because of a branch misprediction.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L25_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L25_MOD", .pme_code = 0x2c7097, .pme_short_desc = "Marked data loaded from L2.5 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L2 of a chip on the same module as this processor is located due to a marked load.", }, [ POWER5_PME_PM_L2SB_MOD_INV ] = { .pme_name = "PM_L2SB_MOD_INV", .pme_code = 0x730e1, .pme_short_desc = "L2 slice B transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.", }, [ POWER5_PME_PM_L2SB_ST_REQ ] = { .pme_name = "PM_L2SB_ST_REQ", .pme_code = 0x723e1, .pme_short_desc = "L2 slice B store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues.
The event is provided on each of the three slices A, B, and C.", }, [ POWER5_PME_PM_MRK_L1_RELOAD_VALID ] = { .pme_name = "PM_MRK_L1_RELOAD_VALID", .pme_code = 0xc70e4, .pme_short_desc = "Marked L1 reload data source valid", .pme_long_desc = "The source information is valid and is for a marked load", }, [ POWER5_PME_PM_L3SB_HIT ] = { .pme_name = "PM_L3SB_HIT", .pme_code = 0x711c4, .pme_short_desc = "L3 slice B hits", .pme_long_desc = "Number of attempts made by this chip cores that resulted in an L3 hit. Reported per L3 slice", }, [ POWER5_PME_PM_L2SB_SHR_MOD ] = { .pme_name = "PM_L2SB_SHR_MOD", .pme_code = 0x700c1, .pme_short_desc = "L2 slice B transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C. ", }, [ POWER5_PME_PM_EE_OFF_EXT_INT ] = { .pme_name = "PM_EE_OFF_EXT_INT", .pme_code = 0x130e7, .pme_short_desc = "Cycles MSR(EE) bit off and external interrupt pending", .pme_long_desc = "Cycles when an interrupt due to an external exception is pending but external exceptions were masked.", }, [ POWER5_PME_PM_1PLUS_PPC_CMPL ] = { .pme_name = "PM_1PLUS_PPC_CMPL", .pme_code = 0x100013, .pme_short_desc = "One or more PPC instruction completed", .pme_long_desc = "A group containing at least one PPC instruction completed. For microcoded instructions that span multiple groups, this will only occur once.", }, [ POWER5_PME_PM_L2SC_SHR_MOD ] = { .pme_name = "PM_L2SC_SHR_MOD", .pme_code = 0x700c2, .pme_short_desc = "L2 slice C transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L , or Tagged) to the Modified state. 
This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C.", }, [ POWER5_PME_PM_PMC6_OVERFLOW ] = { .pme_name = "PM_PMC6_OVERFLOW", .pme_code = 0x30001a, .pme_short_desc = "PMC6 Overflow", .pme_long_desc = "Overflows from PMC6 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", }, [ POWER5_PME_PM_LSU_LRQ_FULL_CYC ] = { .pme_name = "PM_LSU_LRQ_FULL_CYC", .pme_code = 0x110c2, .pme_short_desc = "Cycles LRQ full", .pme_long_desc = "Cycles when the LRQ is full.", }, [ POWER5_PME_PM_IC_PREF_INSTALL ] = { .pme_name = "PM_IC_PREF_INSTALL", .pme_code = 0x210c7, .pme_short_desc = "Instruction prefetched installed in prefetch buffer", .pme_long_desc = "A prefetch buffer entry (line) is allocated but the request is not a demand fetch.", }, [ POWER5_PME_PM_TLB_MISS ] = { .pme_name = "PM_TLB_MISS", .pme_code = 0x180088, .pme_short_desc = "TLB misses", .pme_long_desc = "Total of Data TLB misses + Instruction TLB misses", }, [ POWER5_PME_PM_GCT_FULL_CYC ] = { .pme_name = "PM_GCT_FULL_CYC", .pme_code = 0x100c0, .pme_short_desc = "Cycles GCT full", .pme_long_desc = "The Global Completion Table is completely full.", }, [ POWER5_PME_PM_FXU_BUSY ] = { .pme_name = "PM_FXU_BUSY", .pme_code = 0x200012, .pme_short_desc = "FXU busy", .pme_long_desc = "Cycles when both FXU0 and FXU1 are busy.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L3_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L3_CYC", .pme_code = 0x2c70a4, .pme_short_desc = "Marked load latency from L3", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache.
To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5_PME_PM_LSU_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU_REJECT_LMQ_FULL", .pme_code = 0x2c6088, .pme_short_desc = "LSU reject due to LMQ full or missed data coming", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all the eight entries are full, subsequent load instructions are rejected. Combined unit 0 + 1.", }, [ POWER5_PME_PM_LSU_SRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_SRQ_S0_ALLOC", .pme_code = 0xc20e5, .pme_short_desc = "SRQ slot 0 allocated", .pme_long_desc = "SRQ Slot zero was allocated", }, [ POWER5_PME_PM_GRP_MRK ] = { .pme_name = "PM_GRP_MRK", .pme_code = 0x100014, .pme_short_desc = "Group marked in IDU", .pme_long_desc = "A group was sampled (marked). The group is called a marked group. One instruction within the group is tagged for detailed monitoring. The sampled instruction is called a marked instruction. Events associated with the marked instruction are annotated with the marked term.", }, [ POWER5_PME_PM_INST_FROM_L25_SHR ] = { .pme_name = "PM_INST_FROM_L25_SHR", .pme_code = 0x122096, .pme_short_desc = "Instruction fetched from L2.5 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (T or SL) data from the L2 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions.", }, [ POWER5_PME_PM_FPU1_FIN ] = { .pme_name = "PM_FPU1_FIN", .pme_code = 0x10c7, .pme_short_desc = "FPU1 produced a result", .pme_long_desc = "FPU1 finished, produced a result. This only indicates finish, not completion.
Floating Point Stores are included in this count but not Floating Point Loads.", }, [ POWER5_PME_PM_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_DC_PREF_STREAM_ALLOC", .pme_code = 0x830e7, .pme_short_desc = "D cache new prefetch stream allocated", .pme_long_desc = "A new Prefetch Stream was allocated.", }, [ POWER5_PME_PM_BR_MPRED_TA ] = { .pme_name = "PM_BR_MPRED_TA", .pme_code = 0x230e6, .pme_short_desc = "Branch mispredictions due to target address", .pme_long_desc = "A branch instruction target was incorrectly predicted. This will result in a branch mispredict flush unless a flush is detected from an older instruction.", }, [ POWER5_PME_PM_CRQ_FULL_CYC ] = { .pme_name = "PM_CRQ_FULL_CYC", .pme_code = 0x110c1, .pme_short_desc = "Cycles CR issue queue full", .pme_long_desc = "The issue queue that feeds the Conditional Register unit is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented.", }, [ POWER5_PME_PM_L2SA_RCLD_DISP ] = { .pme_name = "PM_L2SA_RCLD_DISP", .pme_code = 0x701c0, .pme_short_desc = "L2 slice A RC load dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Load was attempted", }, [ POWER5_PME_PM_SNOOP_WR_RETRY_QFULL ] = { .pme_name = "PM_SNOOP_WR_RETRY_QFULL", .pme_code = 0x710c6, .pme_short_desc = "Snoop write retry due to write queue full", .pme_long_desc = "A snoop request for a write to memory was retried because the write queues were full. When this happens the snoop request is retried and the writes in the write reorder queue are changed to high priority.
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_MRK_DTLB_REF_4K ] = { .pme_name = "PM_MRK_DTLB_REF_4K", .pme_code = 0xc40c3, .pme_short_desc = "Marked Data TLB reference for 4K page", .pme_long_desc = "Data TLB references by a marked instruction for 4KB pages.", }, [ POWER5_PME_PM_LSU_SRQ_S0_VALID ] = { .pme_name = "PM_LSU_SRQ_S0_VALID", .pme_code = 0xc20e1, .pme_short_desc = "SRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. In SMT mode the SRQ is split between the two threads (16 entries each).", }, [ POWER5_PME_PM_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_LSU0_FLUSH_LRQ", .pme_code = 0xc00c2, .pme_short_desc = "LSU0 LRQ flushes", .pme_long_desc = "A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER5_PME_PM_INST_FROM_L275_MOD ] = { .pme_name = "PM_INST_FROM_L275_MOD", .pme_code = 0x422096, .pme_short_desc = "Instruction fetched from L2.75 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from the L2 on a different module than this processor is located. 
Fetch groups can contain up to 8 instructions ", }, [ POWER5_PME_PM_GCT_EMPTY_CYC ] = { .pme_name = "PM_GCT_EMPTY_CYC", .pme_code = 0x200004, .pme_short_desc = "Cycles GCT empty", .pme_long_desc = "The Global Completion Table is completely empty", }, [ POWER5_PME_PM_LARX_LSU0 ] = { .pme_name = "PM_LARX_LSU0", .pme_code = 0x820e7, .pme_short_desc = "Larx executed on LSU0", .pme_long_desc = "A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0)", }, [ POWER5_PME_PM_THRD_PRIO_DIFF_5or6_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_5or6_CYC", .pme_code = 0x430e6, .pme_short_desc = "Cycles thread priority difference is 5 or 6", .pme_long_desc = "Cycles when this thread's priority is higher than the other thread's priority by 5 or 6.", }, [ POWER5_PME_PM_SNOOP_RETRY_1AHEAD ] = { .pme_name = "PM_SNOOP_RETRY_1AHEAD", .pme_code = 0x725e6, .pme_short_desc = "Snoop retry due to one ahead collision", .pme_long_desc = "Snoop retry due to one ahead collision", }, [ POWER5_PME_PM_FPU1_FSQRT ] = { .pme_name = "PM_FPU1_FSQRT", .pme_code = 0xc6, .pme_short_desc = "FPU1 executed FSQRT instruction", .pme_long_desc = "FPU1 has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER5_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU1", .pme_code = 0x820e4, .pme_short_desc = "LSU1 marked L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache, by LSU1.", }, [ POWER5_PME_PM_MRK_FPU_FIN ] = { .pme_name = "PM_MRK_FPU_FIN", .pme_code = 0x300014, .pme_short_desc = "Marked instruction FPU processing finished", .pme_long_desc = "One of the Floating Point Units finished a marked instruction. 
Instructions that finish may not necessarily complete", }, [ POWER5_PME_PM_THRD_PRIO_5_CYC ] = { .pme_name = "PM_THRD_PRIO_5_CYC", .pme_code = 0x420e4, .pme_short_desc = "Cycles thread running at priority level 5", .pme_long_desc = "Cycles this thread was running at priority level 5.", }, [ POWER5_PME_PM_MRK_DATA_FROM_LMEM ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM", .pme_code = 0x2c7087, .pme_short_desc = "Marked data loaded from local memory", .pme_long_desc = "The processor's Data Cache was reloaded due to a marked load from memory attached to the same module this processor is located on.", }, [ POWER5_PME_PM_FPU1_FRSP_FCONV ] = { .pme_name = "PM_FPU1_FRSP_FCONV", .pme_code = 0x10c5, .pme_short_desc = "FPU1 executed FRSP or FCONV instructions", .pme_long_desc = "FPU1 has executed a frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER5_PME_PM_SNOOP_TLBIE ] = { .pme_name = "PM_SNOOP_TLBIE", .pme_code = 0x800c3, .pme_short_desc = "Snoop TLBIE", .pme_long_desc = "A tlbie was snooped from another processor.", }, [ POWER5_PME_PM_L3SB_SNOOP_RETRY ] = { .pme_name = "PM_L3SB_SNOOP_RETRY", .pme_code = 0x731e4, .pme_short_desc = "L3 slice B snoop retries", .pme_long_desc = "Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b)", }, [ POWER5_PME_PM_FAB_VBYPASS_EMPTY ] = { .pme_name = "PM_FAB_VBYPASS_EMPTY", .pme_code = 0x731e7, .pme_short_desc = "Vertical bypass buffer empty", .pme_long_desc = "Fabric cycles when the Middle Bypass sidecar is empty.
The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L275_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L275_MOD", .pme_code = 0x1c70a3, .pme_short_desc = "Marked data loaded from L2.75 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L2 on a different module than this processor is located due to a marked load.", }, [ POWER5_PME_PM_6INST_CLB_CYC ] = { .pme_name = "PM_6INST_CLB_CYC", .pme_code = 0x400c6, .pme_short_desc = "Cycles 6 instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", }, [ POWER5_PME_PM_L2SB_RCST_DISP ] = { .pme_name = "PM_L2SB_RCST_DISP", .pme_code = 0x702c1, .pme_short_desc = "L2 slice B RC store dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Store was attempted.", }, [ POWER5_PME_PM_FLUSH ] = { .pme_name = "PM_FLUSH", .pme_code = 0x110c7, .pme_short_desc = "Flushes", .pme_long_desc = "Flushes occurred including LSU and Branch flushes.", }, [ POWER5_PME_PM_L2SC_MOD_INV ] = { .pme_name = "PM_L2SC_MOD_INV", .pme_code = 0x730e2, .pme_short_desc = "L2 slice C transition from modified to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Invalid state. This transition was caused by any RWITM snoop request that hit against a modified entry in the local L2. 
The event is provided on each of the three slices A, B, and C.", }, [ POWER5_PME_PM_FPU_DENORM ] = { .pme_name = "PM_FPU_DENORM", .pme_code = 0x102088, .pme_short_desc = "FPU received denormalized data", .pme_long_desc = "The floating point unit has encountered a denormalized operand. Combined Unit 0 + Unit 1.", }, [ POWER5_PME_PM_L3SC_HIT ] = { .pme_name = "PM_L3SC_HIT", .pme_code = 0x711c5, .pme_short_desc = "L3 slice C hits", .pme_long_desc = "Number of attempts made by this chip cores that resulted in an L3 hit. Reported per L3 Slice", }, [ POWER5_PME_PM_SNOOP_WR_RETRY_RQ ] = { .pme_name = "PM_SNOOP_WR_RETRY_RQ", .pme_code = 0x706c6, .pme_short_desc = "Snoop write/dclaim retry due to collision with active read queue", .pme_long_desc = "A snoop request for a write or dclaim to memory was retried because it matched the cacheline of an active read. This event is sent from the Memory Controller clock domain and must be scaled accordingly", }, [ POWER5_PME_PM_LSU1_REJECT_SRQ ] = { .pme_name = "PM_LSU1_REJECT_SRQ", .pme_code = 0xc60e4, .pme_short_desc = "LSU1 SRQ lhs rejects", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions because of Load Hit Store conditions. 
Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue.", }, [ POWER5_PME_PM_IC_PREF_REQ ] = { .pme_name = "PM_IC_PREF_REQ", .pme_code = 0x220e6, .pme_short_desc = "Instruction prefetch requests", .pme_long_desc = "An instruction prefetch request has been made.", }, [ POWER5_PME_PM_L3SC_ALL_BUSY ] = { .pme_name = "PM_L3SC_ALL_BUSY", .pme_code = 0x721e5, .pme_short_desc = "L3 slice C active for every cycle all CI/CO machines busy", .pme_long_desc = "Cycles All Castin/Castout machines are busy.", }, [ POWER5_PME_PM_MRK_GRP_IC_MISS ] = { .pme_name = "PM_MRK_GRP_IC_MISS", .pme_code = 0x412091, .pme_short_desc = "Group experienced marked I cache miss", .pme_long_desc = "A group containing a marked (sampled) instruction experienced an instruction cache miss.", }, [ POWER5_PME_PM_GCT_NOSLOT_IC_MISS ] = { .pme_name = "PM_GCT_NOSLOT_IC_MISS", .pme_code = 0x21009c, .pme_short_desc = "No slot in GCT caused by I cache miss", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread because of an Instruction Cache miss.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L3 ] = { .pme_name = "PM_MRK_DATA_FROM_L3", .pme_code = 0x1c708e, .pme_short_desc = "Marked data loaded from L3", .pme_long_desc = "The processor's Data Cache was reloaded from the local L3 due to a marked load.", }, [ POWER5_PME_PM_GCT_NOSLOT_SRQ_FULL ] = { .pme_name = "PM_GCT_NOSLOT_SRQ_FULL", .pme_code = 0x310084, .pme_short_desc = "No slot in GCT caused by SRQ full", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread because the Store Request Queue (SRQ) is full. This happens when the storage subsystem can not process the stores in the SRQ. 
Groups can not be dispatched until a SRQ entry is available.", }, [ POWER5_PME_PM_THRD_SEL_OVER_ISU_HOLD ] = { .pme_name = "PM_THRD_SEL_OVER_ISU_HOLD", .pme_code = 0x410c5, .pme_short_desc = "Thread selection overrides caused by ISU holds", .pme_long_desc = "Thread selection was overridden because of an ISU hold.", }, [ POWER5_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { .pme_name = "PM_CMPLU_STALL_DCACHE_MISS", .pme_code = 0x21109a, .pme_short_desc = "Completion stall caused by D cache miss", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered a Data Cache Miss. Data Cache Miss has higher priority than any other Load/Store delay, so if an instruction encounters multiple delays only the Data Cache Miss will be reported and the entire delay period will be charged to Data Cache Miss. This is a subset of PM_CMPLU_STALL_LSU.", }, [ POWER5_PME_PM_L3SA_MOD_INV ] = { .pme_name = "PM_L3SA_MOD_INV", .pme_code = 0x730e3, .pme_short_desc = "L3 slice A transition from modified to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I) Mu|Me are not included since they are formed due to a prev read op. Tx is not included since it is considered shared at this point.", }, [ POWER5_PME_PM_LSU_FLUSH_LRQ ] = { .pme_name = "PM_LSU_FLUSH_LRQ", .pme_code = 0x2c0090, .pme_short_desc = "LRQ flushes", .pme_long_desc = "A load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. 
Combined Units 0 and 1.", }, [ POWER5_PME_PM_THRD_PRIO_2_CYC ] = { .pme_name = "PM_THRD_PRIO_2_CYC", .pme_code = 0x420e1, .pme_short_desc = "Cycles thread running at priority level 2", .pme_long_desc = "Cycles this thread was running at priority level 2.", }, [ POWER5_PME_PM_LSU_FLUSH_SRQ ] = { .pme_name = "PM_LSU_FLUSH_SRQ", .pme_code = 0x1c0090, .pme_short_desc = "SRQ flushes", .pme_long_desc = "A store was flushed because a younger load hit an older store that was already in the SRQ or in the same group. Combined Unit 0 + 1.", }, [ POWER5_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { .pme_name = "PM_MRK_LSU_SRQ_INST_VALID", .pme_code = 0xc70e6, .pme_short_desc = "Marked instruction valid in SRQ", .pme_long_desc = "This signal is asserted every cycle when a marked request is resident in the Store Request Queue", }, [ POWER5_PME_PM_L3SA_REF ] = { .pme_name = "PM_L3SA_REF", .pme_code = 0x701c3, .pme_short_desc = "L3 slice A references", .pme_long_desc = "Number of attempts made by this chip cores to find data in the L3. Reported per L3 slice", }, [ POWER5_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL ] = { .pme_name = "PM_L2SC_RC_DISP_FAIL_CO_BUSY_ALL", .pme_code = 0x713c2, .pme_short_desc = "L2 slice C RC dispatch attempt failed due to all CO busy", .pme_long_desc = "A Read/Claim dispatch was rejected because all Castout machines were busy.", }, [ POWER5_PME_PM_FPU0_STALL3 ] = { .pme_name = "PM_FPU0_STALL3", .pme_code = 0x20e1, .pme_short_desc = "FPU0 stalled in pipe3", .pme_long_desc = "FPU0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always).", }, [ POWER5_PME_PM_GPR_MAP_FULL_CYC ] = { .pme_name = "PM_GPR_MAP_FULL_CYC", .pme_code = 0x130e5, .pme_short_desc = "Cycles GPR mapper full", .pme_long_desc = "The General Purpose Register mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched.
This event only indicates that the mapper was full, not that dispatch was prevented.", }, [ POWER5_PME_PM_TB_BIT_TRANS ] = { .pme_name = "PM_TB_BIT_TRANS", .pme_code = 0x100018, .pme_short_desc = "Time Base bit transition", .pme_long_desc = "When the selected time base bit (as specified in MMCR0[TBSEL]) transitions from 0 to 1", }, [ POWER5_PME_PM_MRK_LSU_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_LRQ", .pme_code = 0x381088, .pme_short_desc = "Marked LRQ flushes", .pme_long_desc = "A marked load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER5_PME_PM_FPU0_STF ] = { .pme_name = "PM_FPU0_STF", .pme_code = 0x20e2, .pme_short_desc = "FPU0 executed store instruction", .pme_long_desc = "FPU0 has executed a Floating Point Store instruction.", }, [ POWER5_PME_PM_MRK_DTLB_MISS ] = { .pme_name = "PM_MRK_DTLB_MISS", .pme_code = 0xc50c6, .pme_short_desc = "Marked Data TLB misses", .pme_long_desc = "Data TLB references by a marked instruction that missed the TLB (all page sizes).", }, [ POWER5_PME_PM_FPU1_FMA ] = { .pme_name = "PM_FPU1_FMA", .pme_code = 0xc5, .pme_short_desc = "FPU1 executed multiply-add instruction", .pme_long_desc = "The floating point unit has executed a multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER5_PME_PM_L2SA_MOD_TAG ] = { .pme_name = "PM_L2SA_MOD_TAG", .pme_code = 0x720e0, .pme_short_desc = "L2 slice A transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2.
The event is provided on each of the three slices A, B, and C.", }, [ POWER5_PME_PM_LSU1_FLUSH_ULD ] = { .pme_name = "PM_LSU1_FLUSH_ULD", .pme_code = 0xc00c4, .pme_short_desc = "LSU1 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 1 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1).", }, [ POWER5_PME_PM_MRK_LSU0_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU0_FLUSH_UST", .pme_code = 0x810c1, .pme_short_desc = "LSU0 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 0 because it was unaligned", }, [ POWER5_PME_PM_MRK_INST_FIN ] = { .pme_name = "PM_MRK_INST_FIN", .pme_code = 0x300005, .pme_short_desc = "Marked instruction finished", .pme_long_desc = "One of the execution units finished a marked instruction. Instructions that finish may not necessarily complete", }, [ POWER5_PME_PM_FPU0_FULL_CYC ] = { .pme_name = "PM_FPU0_FULL_CYC", .pme_code = 0x100c3, .pme_short_desc = "Cycles FPU0 issue queue full", .pme_long_desc = "The issue queue for FPU0 cannot accept any more instructions. Dispatch to this issue queue is stopped.", }, [ POWER5_PME_PM_LSU_LRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LRQ_S0_ALLOC", .pme_code = 0xc20e6, .pme_short_desc = "LRQ slot 0 allocated", .pme_long_desc = "LRQ slot zero was allocated", }, [ POWER5_PME_PM_MRK_LSU1_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU1_FLUSH_ULD", .pme_code = 0x810c4, .pme_short_desc = "LSU1 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 1 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1)", }, [ POWER5_PME_PM_MRK_DTLB_REF ] = { .pme_name = "PM_MRK_DTLB_REF", .pme_code = 0x1c4090, .pme_short_desc = "Marked Data TLB reference", .pme_long_desc = "Total number of Data TLB references by a marked instruction for all page sizes.
Page size is determined at TLB reload time.", }, [ POWER5_PME_PM_BR_UNCOND ] = { .pme_name = "PM_BR_UNCOND", .pme_code = 0x123087, .pme_short_desc = "Unconditional branch", .pme_long_desc = "An unconditional branch was executed.", }, [ POWER5_PME_PM_THRD_SEL_OVER_L2MISS ] = { .pme_name = "PM_THRD_SEL_OVER_L2MISS", .pme_code = 0x410c3, .pme_short_desc = "Thread selection overrides caused by L2 misses", .pme_long_desc = "Thread selection was overridden because one thread had an L2 miss pending.", }, [ POWER5_PME_PM_L2SB_SHR_INV ] = { .pme_name = "PM_L2SB_SHR_INV", .pme_code = 0x710c1, .pme_short_desc = "L2 slice B transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. NOTE: For this event to be useful the tablewalk duration event should also be counted.", }, [ POWER5_PME_PM_MEM_LO_PRIO_WR_CMPL ] = { .pme_name = "PM_MEM_LO_PRIO_WR_CMPL", .pme_code = 0x736e6, .pme_short_desc = "Low priority write completed", .pme_long_desc = "A memory write, which was not upgraded to high priority, completed. This event is sent from the Memory Controller clock domain and must be scaled accordingly", }, [ POWER5_PME_PM_L3SC_MOD_TAG ] = { .pme_name = "PM_L3SC_MOD_TAG", .pme_code = 0x720e5, .pme_short_desc = "L3 slice C transition from modified to TAG", .pme_long_desc = "L3 snooper detects someone doing a read to a line that is truly M in this L3 (i.e. L3 going M->T or M->I (go_Mu case); Mu|Me are not included since they are formed due to a previous read op).
Tx is not included since it is considered shared at this point.", }, [ POWER5_PME_PM_MRK_ST_MISS_L1 ] = { .pme_name = "PM_MRK_ST_MISS_L1", .pme_code = 0x820e3, .pme_short_desc = "Marked L1 D cache store misses", .pme_long_desc = "A marked store missed the dcache", }, [ POWER5_PME_PM_GRP_DISP_SUCCESS ] = { .pme_name = "PM_GRP_DISP_SUCCESS", .pme_code = 0x300002, .pme_short_desc = "Group dispatch success", .pme_long_desc = "Number of groups successfully dispatched (not rejected)", }, [ POWER5_PME_PM_THRD_PRIO_DIFF_1or2_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_1or2_CYC", .pme_code = 0x430e4, .pme_short_desc = "Cycles thread priority difference is 1 or 2", .pme_long_desc = "Cycles when this thread's priority is higher than the other thread's priority by 1 or 2.", }, [ POWER5_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BHT_REDIRECT", .pme_code = 0x230e0, .pme_short_desc = "L2 I cache demand request due to BHT redirect", .pme_long_desc = "A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (CR mispredict).", }, [ POWER5_PME_PM_MEM_WQ_DISP_Q8to15 ] = { .pme_name = "PM_MEM_WQ_DISP_Q8to15", .pme_code = 0x733e6, .pme_short_desc = "Memory write queue dispatched to queues 8-15", .pme_long_desc = "A memory operation was dispatched to a write queue in the range between 8 and 15. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_FPU0_SINGLE ] = { .pme_name = "PM_FPU0_SINGLE", .pme_code = 0x20e3, .pme_short_desc = "FPU0 executed single precision instruction", .pme_long_desc = "FPU0 has executed a single precision instruction.", }, [ POWER5_PME_PM_LSU_DERAT_MISS ] = { .pme_name = "PM_LSU_DERAT_MISS", .pme_code = 0x280090, .pme_short_desc = "DERAT misses", .pme_long_desc = "Total D-ERAT Misses. Requests that miss the Derat are rejected and retried until the request hits in the Erat. 
This may result in multiple erat misses for the same instruction. Combined Unit 0 + 1.", }, [ POWER5_PME_PM_THRD_PRIO_1_CYC ] = { .pme_name = "PM_THRD_PRIO_1_CYC", .pme_code = 0x420e0, .pme_short_desc = "Cycles thread running at priority level 1", .pme_long_desc = "Cycles this thread was running at priority level 1. Priority level 1 is the lowest and indicates the thread is sleeping.", }, [ POWER5_PME_PM_L2SC_RCST_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SC_RCST_DISP_FAIL_OTHER", .pme_code = 0x732e2, .pme_short_desc = "L2 slice C RC store dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted.", }, [ POWER5_PME_PM_FPU1_FEST ] = { .pme_name = "PM_FPU1_FEST", .pme_code = 0x10c6, .pme_short_desc = "FPU1 executed FEST instruction", .pme_long_desc = "FPU1 has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ.", }, [ POWER5_PME_PM_FAB_HOLDtoVN_EMPTY ] = { .pme_name = "PM_FAB_HOLDtoVN_EMPTY", .pme_code = 0x721e7, .pme_short_desc = "Hold buffer to VN empty", .pme_long_desc = "Fabric cycles when the Vertical Node out hold-buffers are empty. The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5_PME_PM_SNOOP_RD_RETRY_RQ ] = { .pme_name = "PM_SNOOP_RD_RETRY_RQ", .pme_code = 0x705c6, .pme_short_desc = "Snoop read retry due to collision with active read queue", .pme_long_desc = "A snoop request for a read from memory was retried because it matched the cache line of an active read. The snoop request is retried because the L2 may be able to source data via intervention for the 2nd read faster than the MC. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_SNOOP_DCLAIM_RETRY_QFULL ] = { .pme_name = "PM_SNOOP_DCLAIM_RETRY_QFULL", .pme_code = 0x720e6, .pme_short_desc = "Snoop dclaim/flush retry due to write/dclaim queues full", .pme_long_desc = "A snoop dclaim or flush request was retried because the memory controller's write/dclaim queues were full. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L25_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L25_SHR_CYC", .pme_code = 0x2c70a2, .pme_short_desc = "Marked load latency from L2.5 shared", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5_PME_PM_MRK_ST_CMPL_INT ] = { .pme_name = "PM_MRK_ST_CMPL_INT", .pme_code = 0x300003, .pme_short_desc = "Marked store completed with intervention", .pme_long_desc = "A marked store previously sent to the memory subsystem completed (data home) after requiring intervention", }, [ POWER5_PME_PM_FLUSH_BR_MPRED ] = { .pme_name = "PM_FLUSH_BR_MPRED", .pme_code = 0x110c6, .pme_short_desc = "Flush caused by branch mispredict", .pme_long_desc = "A flush was caused by a branch mispredict.", }, [ POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SB_RCLD_DISP_FAIL_ADDR", .pme_code = 0x711c1, .pme_short_desc = "L2 slice B RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a load failed because of an address conflict. 
Two RC machines will never both work on the same line or line in the same congruence class at the same time.", }, [ POWER5_PME_PM_FPU_STF ] = { .pme_name = "PM_FPU_STF", .pme_code = 0x202090, .pme_short_desc = "FPU executed store instruction", .pme_long_desc = "FPU has executed a store instruction. Combined Unit 0 + Unit 1.", }, [ POWER5_PME_PM_CMPLU_STALL_FPU ] = { .pme_name = "PM_CMPLU_STALL_FPU", .pme_code = 0x411098, .pme_short_desc = "Completion stall caused by FPU instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a floating point instruction.", }, [ POWER5_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus1or2_CYC", .pme_code = 0x430e2, .pme_short_desc = "Cycles thread priority difference is -1 or -2", .pme_long_desc = "Cycles when this thread's priority is lower than the other thread's priority by 1 or 2.", }, [ POWER5_PME_PM_GCT_NOSLOT_CYC ] = { .pme_name = "PM_GCT_NOSLOT_CYC", .pme_code = 0x100004, .pme_short_desc = "Cycles no GCT slot allocated", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread.", }, [ POWER5_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { .pme_name = "PM_FXU0_BUSY_FXU1_IDLE", .pme_code = 0x300012, .pme_short_desc = "FXU0 busy FXU1 idle", .pme_long_desc = "FXU0 was busy while FXU1 was idle", }, [ POWER5_PME_PM_PTEG_FROM_L35_SHR ] = { .pme_name = "PM_PTEG_FROM_L35_SHR", .pme_code = 0x18309e, .pme_short_desc = "PTEG loaded from L3.5 shared", .pme_long_desc = "A Page Table Entry was loaded into the TLB with shared (S) data from the L3 of a chip on the same module as this processor is located, due to a demand load.", }, [ POWER5_PME_PM_MRK_LSU_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU_FLUSH_UST", .pme_code = 0x381090, .pme_short_desc = "Marked unaligned store flushes", .pme_long_desc = "A marked store was flushed because it was unaligned", }, [ POWER5_PME_PM_L3SA_HIT ] = { .pme_name = 
"PM_L3SA_HIT", .pme_code = 0x711c3, .pme_short_desc = "L3 slice A hits", .pme_long_desc = "Number of attempts made by this chip's cores that resulted in an L3 hit. Reported per L3 slice", }, [ POWER5_PME_PM_MRK_DATA_FROM_L25_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L25_SHR", .pme_code = 0x1c7097, .pme_short_desc = "Marked data loaded from L2.5 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T or SL) data from the L2 of a chip on the same module as this processor is located due to a marked load.", }, [ POWER5_PME_PM_L2SB_RCST_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SB_RCST_DISP_FAIL_ADDR", .pme_code = 0x712c1, .pme_short_desc = "L2 slice B RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a store failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L35_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L35_SHR", .pme_code = 0x1c709e, .pme_short_desc = "Marked data loaded from L3.5 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on the same module as this processor is located due to a marked load.", }, [ POWER5_PME_PM_IERAT_XLATE_WR ] = { .pme_name = "PM_IERAT_XLATE_WR", .pme_code = 0x220e7, .pme_short_desc = "Translation written to ierat", .pme_long_desc = "An entry was written into the IERAT as a result of an IERAT miss. This event can be used to count IERAT misses. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed.", }, [ POWER5_PME_PM_L2SA_ST_REQ ] = { .pme_name = "PM_L2SA_ST_REQ", .pme_code = 0x723e0, .pme_short_desc = "L2 slice A store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. 
The event is provided on each of the three slices A, B, and C.", }, [ POWER5_PME_PM_THRD_SEL_T1 ] = { .pme_name = "PM_THRD_SEL_T1", .pme_code = 0x410c1, .pme_short_desc = "Decode selected thread 1", .pme_long_desc = "Thread selection picked thread 1 for decode.", }, [ POWER5_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BR_REDIRECT", .pme_code = 0x230e1, .pme_short_desc = "L2 I cache demand request due to branch redirect", .pme_long_desc = "A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (either ALL mispredicted or Target).", }, [ POWER5_PME_PM_INST_FROM_LMEM ] = { .pme_name = "PM_INST_FROM_LMEM", .pme_code = 0x222086, .pme_short_desc = "Instruction fetched from local memory", .pme_long_desc = "An instruction fetch group was fetched from memory attached to the same module this processor is located on. Fetch groups can contain up to 8 instructions", }, [ POWER5_PME_PM_FPU0_1FLOP ] = { .pme_name = "PM_FPU0_1FLOP", .pme_code = 0xc3, .pme_short_desc = "FPU0 executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L35_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L35_SHR_CYC", .pme_code = 0x2c70a6, .pme_short_desc = "Marked load latency from L3.5 shared", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5_PME_PM_PTEG_FROM_L2 ] = { .pme_name = "PM_PTEG_FROM_L2", .pme_code = 0x183087, .pme_short_desc = "PTEG loaded from L2", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local L2 due to a demand load", }, [ POWER5_PME_PM_MEM_PW_CMPL ] = { .pme_name = "PM_MEM_PW_CMPL", .pme_code = 0x724e6, .pme_short_desc = "Memory partial-write completed", .pme_long_desc = "Number of Partial Writes completed. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus5or6_CYC", .pme_code = 0x430e0, .pme_short_desc = "Cycles thread priority difference is -5 or -6", .pme_long_desc = "Cycles when this thread's priority is lower than the other thread's priority by 5 or 6.", }, [ POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SB_RCLD_DISP_FAIL_OTHER", .pme_code = 0x731e1, .pme_short_desc = "L2 slice B RC load dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions.", }, [ POWER5_PME_PM_FPU0_FIN ] = { .pme_name = "PM_FPU0_FIN", .pme_code = 0x10c3, .pme_short_desc = "FPU0 produced a result", .pme_long_desc = "FPU0 finished, produced a result. This only indicates finish, not completion. Floating Point Stores are included in this count but not Floating Point Loads.", }, [ POWER5_PME_PM_MRK_DTLB_MISS_4K ] = { .pme_name = "PM_MRK_DTLB_MISS_4K", .pme_code = 0xc40c1, .pme_short_desc = "Marked Data TLB misses for 4K page", .pme_long_desc = "Data TLB references to 4KB pages by a marked instruction that missed the TLB. 
Page size is determined at TLB reload time.", }, [ POWER5_PME_PM_L3SC_SHR_INV ] = { .pme_name = "PM_L3SC_SHR_INV", .pme_code = 0x710c5, .pme_short_desc = "L3 slice C transition from shared to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched).", }, [ POWER5_PME_PM_GRP_BR_REDIR ] = { .pme_name = "PM_GRP_BR_REDIR", .pme_code = 0x120e6, .pme_short_desc = "Group experienced branch redirect", .pme_long_desc = "Number of groups, counted at dispatch, that have encountered a branch redirect. Every group constructed from a fetch group that has been redirected will count.", }, [ POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SC_RCLD_DISP_FAIL_RC_FULL", .pme_code = 0x721e2, .pme_short_desc = "L2 slice C RC load dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a load failed because all RC machines are busy.", }, [ POWER5_PME_PM_MRK_LSU_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_SRQ", .pme_code = 0x481088, .pme_short_desc = "Marked SRQ lhs flushes", .pme_long_desc = "A marked store was flushed because a younger load hit an older store that is already in the SRQ or in the same group.", }, [ POWER5_PME_PM_PTEG_FROM_L275_SHR ] = { .pme_name = "PM_PTEG_FROM_L275_SHR", .pme_code = 0x383097, .pme_short_desc = "PTEG loaded from L2.75 shared", .pme_long_desc = "A Page Table Entry was loaded into the TLB with shared (T) data from the L2 on a different module than this processor is located due to a demand load.", }, [ POWER5_PME_PM_L2SB_RCLD_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SB_RCLD_DISP_FAIL_RC_FULL", .pme_code = 0x721e1, .pme_short_desc = "L2 slice B RC load dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a load failed because all RC machines are busy.", }, [ POWER5_PME_PM_SNOOP_RD_RETRY_WQ ] = { .pme_name = "PM_SNOOP_RD_RETRY_WQ", .pme_code = 0x715c6, .pme_short_desc = "Snoop read retry 
due to collision with active write queue", .pme_long_desc = "A snoop request for a read from memory was retried because it matched the cache line of an active write. The snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_LSU0_NCLD ] = { .pme_name = "PM_LSU0_NCLD", .pme_code = 0xc50c1, .pme_short_desc = "LSU0 non-cacheable loads", .pme_long_desc = "A non-cacheable load was executed by unit 0.", }, [ POWER5_PME_PM_FAB_DCLAIM_RETRIED ] = { .pme_name = "PM_FAB_DCLAIM_RETRIED", .pme_code = 0x730e7, .pme_short_desc = "dclaim retried", .pme_long_desc = "A DCLAIM command was retried. Each chip reports its own counts. The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5_PME_PM_LSU1_BUSY_REJECT ] = { .pme_name = "PM_LSU1_BUSY_REJECT", .pme_code = 0xc20e7, .pme_short_desc = "LSU1 busy due to reject", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions.", }, [ POWER5_PME_PM_FXLS0_FULL_CYC ] = { .pme_name = "PM_FXLS0_FULL_CYC", .pme_code = 0x110c0, .pme_short_desc = "Cycles FXU0/LS0 queue full", .pme_long_desc = "The issue queue that feeds the Fixed Point unit 0 / Load Store Unit 0 is full. This condition will prevent dispatch groups from being dispatched. This event only indicates that the queue was full, not that dispatch was prevented.", }, [ POWER5_PME_PM_FPU0_FEST ] = { .pme_name = "PM_FPU0_FEST", .pme_code = 0x10c2, .pme_short_desc = "FPU0 executed FEST instruction", .pme_long_desc = "FPU0 has executed an estimate instruction. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. ", }, [ POWER5_PME_PM_DTLB_REF_16M ] = { .pme_name = "PM_DTLB_REF_16M", .pme_code = 0xc40c6, .pme_short_desc = "Data TLB reference for 16M page", .pme_long_desc = "Data TLB references for 16MB pages. 
Includes hits + misses.", }, [ POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SC_RCLD_DISP_FAIL_ADDR", .pme_code = 0x711c2, .pme_short_desc = "L2 slice C RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.", }, [ POWER5_PME_PM_LSU0_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU0_REJECT_ERAT_MISS", .pme_code = 0xc60e3, .pme_short_desc = "LSU0 reject due to ERAT miss", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions due to an ERAT miss. Requests that miss the Derat are rejected and retried until the request hits in the Erat.", }, [ POWER5_PME_PM_DATA_FROM_L25_MOD ] = { .pme_name = "PM_DATA_FROM_L25_MOD", .pme_code = 0x2c3097, .pme_short_desc = "Data loaded from L2.5 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L2 of a chip on the same module as this processor is located due to a demand load.", }, [ POWER5_PME_PM_GCT_USAGE_60to79_CYC ] = { .pme_name = "PM_GCT_USAGE_60to79_CYC", .pme_code = 0x20001f, .pme_short_desc = "Cycles GCT 60-79% full", .pme_long_desc = "Cycles when the Global Completion Table has between 60% and 79% of its slots used. 
The GCT has 20 entries shared between threads.", }, [ POWER5_PME_PM_DATA_FROM_L375_MOD ] = { .pme_name = "PM_DATA_FROM_L375_MOD", .pme_code = 0x1c30a7, .pme_short_desc = "Data loaded from L3.75 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L3 of a chip on a different module than this processor is located due to a demand load.", }, [ POWER5_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", .pme_code = 0x200015, .pme_short_desc = "Cycles LMQ and SRQ empty", .pme_long_desc = "Cycles when both the LMQ and SRQ are empty (LSU is idle)", }, [ POWER5_PME_PM_LSU0_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU0_REJECT_RELOAD_CDF", .pme_code = 0xc60e2, .pme_short_desc = "LSU0 reject due to reload CDF or tag update collision", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions because of Critical Data Forward. When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same cycle when they are being updated.", }, [ POWER5_PME_PM_0INST_FETCH ] = { .pme_name = "PM_0INST_FETCH", .pme_code = 0x42208d, .pme_short_desc = "No instructions fetched", .pme_long_desc = "No instructions were fetched this cycle (due to IFU hold, redirect, or icache miss)", }, [ POWER5_PME_PM_LSU1_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU1_REJECT_RELOAD_CDF", .pme_code = 0xc60e6, .pme_short_desc = "LSU1 reject due to reload CDF or tag update collision", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions because of Critical Data Forward. 
When critical data arrives from the storage system it is formatted and immediately forwarded, bypassing the data cache, to the destination register using the result bus. Any instruction that requires the result bus in the same cycle is rejected. Tag update rejects are caused when an instruction requires access to the Dcache directory or ERAT in the same cycle when they are being updated.", }, [ POWER5_PME_PM_L1_PREF ] = { .pme_name = "PM_L1_PREF", .pme_code = 0xc70e7, .pme_short_desc = "L1 cache data prefetches", .pme_long_desc = "A request to prefetch data into the L1 was made", }, [ POWER5_PME_PM_MEM_WQ_DISP_Q0to7 ] = { .pme_name = "PM_MEM_WQ_DISP_Q0to7", .pme_code = 0x723e6, .pme_short_desc = "Memory write queue dispatched to queues 0-7", .pme_long_desc = "A memory operation was dispatched to a write queue in the range between 0 and 7. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM_CYC", .pme_code = 0x4c70a0, .pme_short_desc = "Marked load latency from local memory", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5_PME_PM_BRQ_FULL_CYC ] = { .pme_name = "PM_BRQ_FULL_CYC", .pme_code = 0x100c5, .pme_short_desc = "Cycles branch queue full", .pme_long_desc = "Cycles when the issue queue that feeds the branch unit is full. This condition will prevent dispatch groups from being dispatched. 
This event only indicates that the queue was full, not that dispatch was prevented.", }, [ POWER5_PME_PM_GRP_IC_MISS_NONSPEC ] = { .pme_name = "PM_GRP_IC_MISS_NONSPEC", .pme_code = 0x112099, .pme_short_desc = "Group experienced non-speculative I cache miss", .pme_long_desc = "Number of groups, counted at completion, that have encountered an instruction cache miss.", }, [ POWER5_PME_PM_PTEG_FROM_L275_MOD ] = { .pme_name = "PM_PTEG_FROM_L275_MOD", .pme_code = 0x1830a3, .pme_short_desc = "PTEG loaded from L2.75 modified", .pme_long_desc = "A Page Table Entry was loaded into the TLB with modified (M) data from the L2 on a different module than this processor is located due to a demand load. ", }, [ POWER5_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU0", .pme_code = 0x820e0, .pme_short_desc = "LSU0 marked L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache, by LSU0.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L375_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L375_SHR_CYC", .pme_code = 0x2c70a7, .pme_short_desc = "Marked load latency from L3.75 shared", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5_PME_PM_LSU_FLUSH ] = { .pme_name = "PM_LSU_FLUSH", .pme_code = 0x110c5, .pme_short_desc = "Flush initiated by LSU", .pme_long_desc = "A flush was initiated by the Load Store Unit", }, [ POWER5_PME_PM_DATA_FROM_L3 ] = { .pme_name = "PM_DATA_FROM_L3", .pme_code = 0x1c308e, .pme_short_desc = "Data loaded from L3", .pme_long_desc = "The processor's Data Cache was reloaded from the local L3 due to a demand load.", }, [ POWER5_PME_PM_INST_FROM_L2 ] = { .pme_name = "PM_INST_FROM_L2", .pme_code = 0x122086, .pme_short_desc = "Instruction fetched from L2", .pme_long_desc = "An instruction fetch group was fetched from L2. Fetch Groups can contain up to 8 instructions", }, [ POWER5_PME_PM_PMC2_OVERFLOW ] = { .pme_name = "PM_PMC2_OVERFLOW", .pme_code = 0x30000a, .pme_short_desc = "PMC2 Overflow", .pme_long_desc = "Overflows from PMC2 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", }, [ POWER5_PME_PM_FPU0_DENORM ] = { .pme_name = "PM_FPU0_DENORM", .pme_code = 0x20e0, .pme_short_desc = "FPU0 received denormalized data", .pme_long_desc = "FPU0 has encountered a denormalized operand. ", }, [ POWER5_PME_PM_FPU1_FMOV_FEST ] = { .pme_name = "PM_FPU1_FMOV_FEST", .pme_code = 0x10c4, .pme_short_desc = "FPU1 executed FMOV or FEST instructions", .pme_long_desc = "FPU1 has executed a move kind of instruction or one of the estimate instructions. 
This could be fmr*, fneg*, fabs*, fnabs*, fres* or frsqrte* where XYZ* means XYZ or XYZ.", }, [ POWER5_PME_PM_INST_FETCH_CYC ] = { .pme_name = "PM_INST_FETCH_CYC", .pme_code = 0x220e4, .pme_short_desc = "Cycles at least 1 instruction fetched", .pme_long_desc = "Cycles when at least one instruction was sent from the fetch unit to the decode unit.", }, [ POWER5_PME_PM_LSU_LDF ] = { .pme_name = "PM_LSU_LDF", .pme_code = 0x4c5090, .pme_short_desc = "LSU executed Floating Point load instruction", .pme_long_desc = "LSU executed Floating Point load instruction. Combined Unit 0 + 1.", }, [ POWER5_PME_PM_INST_DISP ] = { .pme_name = "PM_INST_DISP", .pme_code = 0x300009, .pme_short_desc = "Instructions dispatched", .pme_long_desc = "Number of PowerPC instructions successfully dispatched.", }, [ POWER5_PME_PM_DATA_FROM_L25_SHR ] = { .pme_name = "PM_DATA_FROM_L25_SHR", .pme_code = 0x1c3097, .pme_short_desc = "Data loaded from L2.5 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T or SL) data from the L2 of a chip on the same module as this processor is located due to a demand load.", }, [ POWER5_PME_PM_L1_DCACHE_RELOAD_VALID ] = { .pme_name = "PM_L1_DCACHE_RELOAD_VALID", .pme_code = 0xc30e4, .pme_short_desc = "L1 reload data source valid", .pme_long_desc = "The data source information is valid, the data cache has been reloaded. Prior to POWER5+ this included data cache reloads due to prefetch activity. With POWER5+ this now only includes reloads due to demand loads.", }, [ POWER5_PME_PM_MEM_WQ_DISP_DCLAIM ] = { .pme_name = "PM_MEM_WQ_DISP_DCLAIM", .pme_code = 0x713c6, .pme_short_desc = "Memory write queue dispatched due to dclaim/flush", .pme_long_desc = "A memory dclaim or flush operation was dispatched to a write queue. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_FPU_FULL_CYC ] = { .pme_name = "PM_FPU_FULL_CYC", .pme_code = 0x110090, .pme_short_desc = "Cycles FPU issue queue full", .pme_long_desc = "Cycles when one or both FPU issue queues are full. Combined Unit 0 + 1. Use with caution since this is the sum of cycles when Unit 0 was full plus Unit 1 full. It does not indicate when both units were full.", }, [ POWER5_PME_PM_MRK_GRP_ISSUED ] = { .pme_name = "PM_MRK_GRP_ISSUED", .pme_code = 0x100015, .pme_short_desc = "Marked group issued", .pme_long_desc = "A sampled instruction was issued.", }, [ POWER5_PME_PM_THRD_PRIO_3_CYC ] = { .pme_name = "PM_THRD_PRIO_3_CYC", .pme_code = 0x420e2, .pme_short_desc = "Cycles thread running at priority level 3", .pme_long_desc = "Cycles this thread was running at priority level 3.", }, [ POWER5_PME_PM_FPU_FMA ] = { .pme_name = "PM_FPU_FMA", .pme_code = 0x200088, .pme_short_desc = "FPU executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when FPU is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1.", }, [ POWER5_PME_PM_INST_FROM_L35_MOD ] = { .pme_name = "PM_INST_FROM_L35_MOD", .pme_code = 0x22209d, .pme_short_desc = "Instruction fetched from L3.5 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from the L3 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions", }, [ POWER5_PME_PM_MRK_CRU_FIN ] = { .pme_name = "PM_MRK_CRU_FIN", .pme_code = 0x400005, .pme_short_desc = "Marked instruction CRU processing finished", .pme_long_desc = "The Condition Register Unit finished a marked instruction. 
Instructions that finish may not necessarily complete.", }, [ POWER5_PME_PM_SNOOP_WR_RETRY_WQ ] = { .pme_name = "PM_SNOOP_WR_RETRY_WQ", .pme_code = 0x716c6, .pme_short_desc = "Snoop write/dclaim retry due to collision with active write queue", .pme_long_desc = "A snoop request for a write or dclaim to memory was retried because it matched the cache line of an active write. The snoop request is retried and the active write is changed to high priority. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_CMPLU_STALL_REJECT ] = { .pme_name = "PM_CMPLU_STALL_REJECT", .pme_code = 0x41109a, .pme_short_desc = "Completion stall caused by reject", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered a load/store reject. This is a subset of PM_CMPLU_STALL_LSU.", }, [ POWER5_PME_PM_LSU1_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU1_REJECT_ERAT_MISS", .pme_code = 0xc60e7, .pme_short_desc = "LSU1 reject due to ERAT miss", .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions due to an ERAT miss. Requests that miss the Derat are rejected and retried until the request hits in the Erat.", }, [ POWER5_PME_PM_MRK_FXU_FIN ] = { .pme_name = "PM_MRK_FXU_FIN", .pme_code = 0x200014, .pme_short_desc = "Marked instruction FXU processing finished", .pme_long_desc = "One of the Fixed Point Units finished a marked instruction. Instructions that finish may not necessarily complete.", }, [ POWER5_PME_PM_L2SB_RCST_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SB_RCST_DISP_FAIL_OTHER", .pme_code = 0x732e1, .pme_short_desc = "L2 slice B RC store dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. 
Rejected dispatches do not count because they have not yet been attempted.", }, [ POWER5_PME_PM_L2SC_RC_DISP_FAIL_CO_BUSY ] = { .pme_name = "PM_L2SC_RC_DISP_FAIL_CO_BUSY", .pme_code = 0x703c2, .pme_short_desc = "L2 slice C RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy", .pme_long_desc = "A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss(re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access.", }, [ POWER5_PME_PM_PMC4_OVERFLOW ] = { .pme_name = "PM_PMC4_OVERFLOW", .pme_code = 0x10000a, .pme_short_desc = "PMC4 Overflow", .pme_long_desc = "Overflows from PMC4 are counted. This effectively widens the PMC. 
The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", }, [ POWER5_PME_PM_L3SA_SNOOP_RETRY ] = { .pme_name = "PM_L3SA_SNOOP_RETRY", .pme_code = 0x731e3, .pme_short_desc = "L3 slice A snoop retries", .pme_long_desc = "Number of times an L3 retried a snoop because it got two in at the same time (one on snp_a, one on snp_b)", }, [ POWER5_PME_PM_PTEG_FROM_L35_MOD ] = { .pme_name = "PM_PTEG_FROM_L35_MOD", .pme_code = 0x28309e, .pme_short_desc = "PTEG loaded from L3.5 modified", .pme_long_desc = "A Page Table Entry was loaded into the TLB with modified (M) data from the L3 of a chip on the same module as this processor is located, due to a demand load.", }, [ POWER5_PME_PM_INST_FROM_L25_MOD ] = { .pme_name = "PM_INST_FROM_L25_MOD", .pme_code = 0x222096, .pme_short_desc = "Instruction fetched from L2.5 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from the L2 of a chip on the same module as this processor is located. Fetch groups can contain up to 8 instructions.", }, [ POWER5_PME_PM_THRD_SMT_HANG ] = { .pme_name = "PM_THRD_SMT_HANG", .pme_code = 0x330e7, .pme_short_desc = "SMT hang detected", .pme_long_desc = "A hung thread was detected", }, [ POWER5_PME_PM_CMPLU_STALL_ERAT_MISS ] = { .pme_name = "PM_CMPLU_STALL_ERAT_MISS", .pme_code = 0x41109b, .pme_short_desc = "Completion stall caused by ERAT miss", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered an ERAT miss. This is a subset of PM_CMPLU_STALL_REJECT.", }, [ POWER5_PME_PM_L3SA_MOD_TAG ] = { .pme_name = "PM_L3SA_MOD_TAG", .pme_code = 0x720e3, .pme_short_desc = "L3 slice A transition from modified to TAG", .pme_long_desc = "L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. 
L3 going M->T or M->I(go_Mu case) Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point.", }, [ POWER5_PME_PM_FLUSH_SYNC ] = { .pme_name = "PM_FLUSH_SYNC", .pme_code = 0x330e1, .pme_short_desc = "Flush caused by sync", .pme_long_desc = "This thread has been flushed at dispatch due to a sync, lwsync, ptesync, or tlbsync instruction. This allows the other thread to have more machine resources for it to make progress until the sync finishes.", }, [ POWER5_PME_PM_INST_FROM_L2MISS ] = { .pme_name = "PM_INST_FROM_L2MISS", .pme_code = 0x12209b, .pme_short_desc = "Instruction fetched missed L2", .pme_long_desc = "An instruction fetch group was fetched from beyond the local L2.", }, [ POWER5_PME_PM_L2SC_ST_HIT ] = { .pme_name = "PM_L2SC_ST_HIT", .pme_code = 0x733e2, .pme_short_desc = "L2 slice C store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. The event is provided on each of the three slices A, B, and C.", }, [ POWER5_PME_PM_MEM_RQ_DISP_Q8to11 ] = { .pme_name = "PM_MEM_RQ_DISP_Q8to11", .pme_code = 0x722e6, .pme_short_desc = "Memory read queue dispatched to queues 8-11", .pme_long_desc = "A memory operation was dispatched to read queue 8,9,10 or 11. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_MRK_GRP_DISP ] = { .pme_name = "PM_MRK_GRP_DISP", .pme_code = 0x100002, .pme_short_desc = "Marked group dispatched", .pme_long_desc = "A group containing a sampled instruction was dispatched", }, [ POWER5_PME_PM_L2SB_MOD_TAG ] = { .pme_name = "PM_L2SB_MOD_TAG", .pme_code = 0x720e1, .pme_short_desc = "L2 slice B transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. This transition was caused by a read snoop request that hit against a modified entry in the local L2. 
The event is provided on each of the three slices A, B, and C.", }, [ POWER5_PME_PM_CLB_EMPTY_CYC ] = { .pme_name = "PM_CLB_EMPTY_CYC", .pme_code = 0x410c6, .pme_short_desc = "Cycles CLB empty", .pme_long_desc = "Cycles when both threads' CLBs are completely empty.", }, [ POWER5_PME_PM_L2SB_ST_HIT ] = { .pme_name = "PM_L2SB_ST_HIT", .pme_code = 0x733e1, .pme_short_desc = "L2 slice B store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. This event is provided on each of the three L2 slices A, B and C.", }, [ POWER5_PME_PM_MEM_NONSPEC_RD_CANCEL ] = { .pme_name = "PM_MEM_NONSPEC_RD_CANCEL", .pme_code = 0x711c6, .pme_short_desc = "Non speculative memory read cancelled", .pme_long_desc = "A non-speculative read was cancelled because the combined response indicated it was sourced from another L2 or L3. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_BR_PRED_CR_TA ] = { .pme_name = "PM_BR_PRED_CR_TA", .pme_code = 0x423087, .pme_short_desc = "A conditional branch was predicted, CR and target prediction", .pme_long_desc = "Both the condition (taken or not taken) and the target address of a branch instruction were predicted.", }, [ POWER5_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_SRQ", .pme_code = 0x810c3, .pme_short_desc = "LSU0 marked SRQ lhs flushes", .pme_long_desc = "A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group.", }, [ POWER5_PME_PM_MRK_LSU_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU_FLUSH_ULD", .pme_code = 0x481090, .pme_short_desc = "Marked unaligned load flushes", .pme_long_desc = "A marked load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", }, [ POWER5_PME_PM_INST_DISP_ATTEMPT ] = { .pme_name = "PM_INST_DISP_ATTEMPT", .pme_code = 0x120e1, .pme_short_desc = "Instructions dispatch attempted", .pme_long_desc = "Number of PowerPC 
Instructions dispatched (attempted, not filtered by success).", }, [ POWER5_PME_PM_INST_FROM_RMEM ] = { .pme_name = "PM_INST_FROM_RMEM", .pme_code = 0x422086, .pme_short_desc = "Instruction fetched from remote memory", .pme_long_desc = "An instruction fetch group was fetched from memory attached to a different module than this processor is located on. Fetch groups can contain up to 8 instructions", }, [ POWER5_PME_PM_ST_REF_L1_LSU0 ] = { .pme_name = "PM_ST_REF_L1_LSU0", .pme_code = 0xc10c1, .pme_short_desc = "LSU0 L1 D cache store references", .pme_long_desc = "Store references to the Data Cache by LSU0.", }, [ POWER5_PME_PM_LSU0_DERAT_MISS ] = { .pme_name = "PM_LSU0_DERAT_MISS", .pme_code = 0x800c2, .pme_short_desc = "LSU0 DERAT misses", .pme_long_desc = "Total D-ERAT Misses by LSU0. Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction.", }, [ POWER5_PME_PM_L2SB_RCLD_DISP ] = { .pme_name = "PM_L2SB_RCLD_DISP", .pme_code = 0x701c1, .pme_short_desc = "L2 slice B RC load dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Load was attempted", }, [ POWER5_PME_PM_FPU_STALL3 ] = { .pme_name = "PM_FPU_STALL3", .pme_code = 0x202088, .pme_short_desc = "FPU stalled in pipe3", .pme_long_desc = "FPU has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. 
Combined Unit 0 + Unit 1.", }, [ POWER5_PME_PM_BR_PRED_CR ] = { .pme_name = "PM_BR_PRED_CR", .pme_code = 0x230e2, .pme_short_desc = "A conditional branch was predicted, CR prediction", .pme_long_desc = "A conditional branch instruction was predicted as taken or not taken.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L2 ] = { .pme_name = "PM_MRK_DATA_FROM_L2", .pme_code = 0x1c7087, .pme_short_desc = "Marked data loaded from L2", .pme_long_desc = "The processor's Data Cache was reloaded from the local L2 due to a marked load.", }, [ POWER5_PME_PM_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_LSU0_FLUSH_SRQ", .pme_code = 0xc00c3, .pme_short_desc = "LSU0 SRQ lhs flushes", .pme_long_desc = "A store was flushed by unit 0 because younger load hits and older store that is already in the SRQ or in the same group.", }, [ POWER5_PME_PM_FAB_PNtoNN_DIRECT ] = { .pme_name = "PM_FAB_PNtoNN_DIRECT", .pme_code = 0x703c7, .pme_short_desc = "PN to NN beat went straight to its destination", .pme_long_desc = "Fabric Data beats that the base chip takes the inbound PN data and passes it through to the outbound NN bus without going into a sidecar. The signal is delivered at FBC speed and the count must be scaled.", }, [ POWER5_PME_PM_IOPS_CMPL ] = { .pme_name = "PM_IOPS_CMPL", .pme_code = 0x1, .pme_short_desc = "Internal operations completed", .pme_long_desc = "Number of internal operations that completed.", }, [ POWER5_PME_PM_L2SC_SHR_INV ] = { .pme_name = "PM_L2SC_SHR_INV", .pme_code = 0x710c2, .pme_short_desc = "L2 slice C transition from shared to invalid", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Invalid state. This transition was caused by any external snoop request. The event is provided on each of the three slices A, B, and C. 
NOTE: For this event to be useful the tablewalk duration event should also be counted.", }, [ POWER5_PME_PM_L2SA_RCST_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SA_RCST_DISP_FAIL_OTHER", .pme_code = 0x732e0, .pme_short_desc = "L2 slice A RC store dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a store failed for some reason other than Full or Collision conditions. Rejected dispatches do not count because they have not yet been attempted.", }, [ POWER5_PME_PM_L2SA_RCST_DISP ] = { .pme_name = "PM_L2SA_RCST_DISP", .pme_code = 0x702c0, .pme_short_desc = "L2 slice A RC store dispatch attempt", .pme_long_desc = "A Read/Claim dispatch for a Store was attempted.", }, [ POWER5_PME_PM_SNOOP_RETRY_AB_COLLISION ] = { .pme_name = "PM_SNOOP_RETRY_AB_COLLISION", .pme_code = 0x735e6, .pme_short_desc = "Snoop retry due to a b collision", .pme_long_desc = "Snoop retry due to a b collision", }, [ POWER5_PME_PM_FAB_PNtoVN_SIDECAR ] = { .pme_name = "PM_FAB_PNtoVN_SIDECAR", .pme_code = 0x733e7, .pme_short_desc = "PN to VN beat went to sidecar first", .pme_long_desc = "Fabric data beats that the base chip takes the inbound PN data and forwards it on to the outbound VN data bus after going into a sidecar first. The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5_PME_PM_LSU_LMQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LMQ_S0_ALLOC", .pme_code = 0xc30e6, .pme_short_desc = "LMQ slot 0 allocated", .pme_long_desc = "The first entry in the LMQ was allocated.", }, [ POWER5_PME_PM_LSU0_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU0_REJECT_LMQ_FULL", .pme_code = 0xc60e1, .pme_short_desc = "LSU0 reject due to LMQ full or missed data coming", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. 
If all eight entries are full, subsequent load instructions are rejected.", }, [ POWER5_PME_PM_SNOOP_PW_RETRY_RQ ] = { .pme_name = "PM_SNOOP_PW_RETRY_RQ", .pme_code = 0x707c6, .pme_short_desc = "Snoop partial-write retry due to collision with active read queue", .pme_long_desc = "A snoop request for a partial write to memory was retried because it matched the cache line of an active read. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_DTLB_REF ] = { .pme_name = "PM_DTLB_REF", .pme_code = 0x2c4090, .pme_short_desc = "Data TLB references", .pme_long_desc = "Total number of Data TLB references for all page sizes. Page size is determined at TLB reload time.", }, [ POWER5_PME_PM_PTEG_FROM_L3 ] = { .pme_name = "PM_PTEG_FROM_L3", .pme_code = 0x18308e, .pme_short_desc = "PTEG loaded from L3", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local L3 due to a demand load.", }, [ POWER5_PME_PM_FAB_M1toVNorNN_SIDECAR_EMPTY ] = { .pme_name = "PM_FAB_M1toVNorNN_SIDECAR_EMPTY", .pme_code = 0x712c7, .pme_short_desc = "M1 to VN/NN sidecar empty", .pme_long_desc = "Fabric cycles when the Minus-1 jump sidecar (sidecars for mcm to mcm data transfer) is empty. The signal is delivered at FBC speed and the count must be scaled accordingly.", }, [ POWER5_PME_PM_LSU_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_SRQ_EMPTY_CYC", .pme_code = 0x400015, .pme_short_desc = "Cycles SRQ empty", .pme_long_desc = "Cycles the Store Request Queue is empty", }, [ POWER5_PME_PM_FPU1_STF ] = { .pme_name = "PM_FPU1_STF", .pme_code = 0x20e6, .pme_short_desc = "FPU1 executed store instruction", .pme_long_desc = "FPU1 has executed a Floating Point Store instruction.", }, [ POWER5_PME_PM_LSU_LMQ_S0_VALID ] = { .pme_name = "PM_LSU_LMQ_S0_VALID", .pme_code = 0xc30e5, .pme_short_desc = "LMQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle when the first entry in the LMQ is valid. 
The LMQ had eight entries that are allocated FIFO", }, [ POWER5_PME_PM_GCT_USAGE_00to59_CYC ] = { .pme_name = "PM_GCT_USAGE_00to59_CYC", .pme_code = 0x10001f, .pme_short_desc = "Cycles GCT less than 60% full", .pme_long_desc = "Cycles when the Global Completion Table has fewer than 60% of its slots used. The GCT has 20 entries shared between threads.", }, [ POWER5_PME_PM_DATA_FROM_L2MISS ] = { .pme_name = "PM_DATA_FROM_L2MISS", .pme_code = 0x3c309b, .pme_short_desc = "Data loaded missed L2", .pme_long_desc = "The processor's Data Cache was reloaded but not from the local L2.", }, [ POWER5_PME_PM_GRP_DISP_BLK_SB_CYC ] = { .pme_name = "PM_GRP_DISP_BLK_SB_CYC", .pme_code = 0x130e1, .pme_short_desc = "Cycles group dispatch blocked by scoreboard", .pme_long_desc = "A scoreboard operation on a non-renamed resource has blocked dispatch.", }, [ POWER5_PME_PM_FPU_FMOV_FEST ] = { .pme_name = "PM_FPU_FMOV_FEST", .pme_code = 0x301088, .pme_short_desc = "FPU executed FMOV or FEST instructions", .pme_long_desc = "The floating point unit has executed a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ.. Combined Unit 0 + Unit 1.", }, [ POWER5_PME_PM_XER_MAP_FULL_CYC ] = { .pme_name = "PM_XER_MAP_FULL_CYC", .pme_code = 0x100c2, .pme_short_desc = "Cycles XER mapper full", .pme_long_desc = "The XER mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented.", }, [ POWER5_PME_PM_FLUSH_SB ] = { .pme_name = "PM_FLUSH_SB", .pme_code = 0x330e2, .pme_short_desc = "Flush caused by scoreboard operation", .pme_long_desc = "This thread has been flushed at dispatch because its scoreboard bit is set indicating that a non-renamed resource is being updated. 
This allows the other thread to have more machine resources for it to make progress while this thread is stalled.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L375_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L375_SHR", .pme_code = 0x3c709e, .pme_short_desc = "Marked data loaded from L3.75 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on a different module than this processor is located due to a marked load.", }, [ POWER5_PME_PM_MRK_GRP_CMPL ] = { .pme_name = "PM_MRK_GRP_CMPL", .pme_code = 0x400013, .pme_short_desc = "Marked group completed", .pme_long_desc = "A group containing a sampled instruction completed. Microcoded instructions that span multiple groups will generate this event once per group.", }, [ POWER5_PME_PM_SUSPENDED ] = { .pme_name = "PM_SUSPENDED", .pme_code = 0x0, .pme_short_desc = "Suspended", .pme_long_desc = "The counter is suspended (does not count).", }, [ POWER5_PME_PM_GRP_IC_MISS_BR_REDIR_NONSPEC ] = { .pme_name = "PM_GRP_IC_MISS_BR_REDIR_NONSPEC", .pme_code = 0x120e5, .pme_short_desc = "Group experienced non-speculative I cache miss or branch redirect", .pme_long_desc = "Group experienced non-speculative I cache miss or branch redirect", }, [ POWER5_PME_PM_SNOOP_RD_RETRY_QFULL ] = { .pme_name = "PM_SNOOP_RD_RETRY_QFULL", .pme_code = 0x700c6, .pme_short_desc = "Snoop read retry due to read queue full", .pme_long_desc = "A snoop request for a read from memory was retried because the read queues were full. This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_L3SB_MOD_INV ] = { .pme_name = "PM_L3SB_MOD_INV", .pme_code = 0x730e4, .pme_short_desc = "L3 slice B transition from modified to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is truly M in this L3 (i.e. L3 going M=>I). Mu|Me are not included since they are formed due to a prev read op. 
Tx is not included since it is considered shared at this point.", }, [ POWER5_PME_PM_DATA_FROM_L35_SHR ] = { .pme_name = "PM_DATA_FROM_L35_SHR", .pme_code = 0x1c309e, .pme_short_desc = "Data loaded from L3.5 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (S) data from the L3 of a chip on the same module as this processor is located due to a demand load.", }, [ POWER5_PME_PM_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_LD_MISS_L1_LSU1", .pme_code = 0xc10c6, .pme_short_desc = "LSU1 L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache, by unit 1.", }, [ POWER5_PME_PM_STCX_FAIL ] = { .pme_name = "PM_STCX_FAIL", .pme_code = 0x820e1, .pme_short_desc = "STCX failed", .pme_long_desc = "A stcx (stwcx or stdcx) failed", }, [ POWER5_PME_PM_DC_PREF_DST ] = { .pme_name = "PM_DC_PREF_DST", .pme_code = 0x830e6, .pme_short_desc = "DST (Data Stream Touch) stream start", .pme_long_desc = "A prefetch stream was started using the DST instruction.", }, [ POWER5_PME_PM_GRP_DISP ] = { .pme_name = "PM_GRP_DISP", .pme_code = 0x200002, .pme_short_desc = "Group dispatches", .pme_long_desc = "A group was dispatched", }, [ POWER5_PME_PM_L2SA_RCLD_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2SA_RCLD_DISP_FAIL_ADDR", .pme_code = 0x711c0, .pme_short_desc = "L2 slice A RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "A Read/Claim dispatch for a load failed because of an address conflict. Two RC machines will never both work on the same line or line in the same congruence class at the same time.", }, [ POWER5_PME_PM_FPU0_FPSCR ] = { .pme_name = "PM_FPU0_FPSCR", .pme_code = 0x30e0, .pme_short_desc = "FPU0 executed FPSCR instruction", .pme_long_desc = "FPU0 has executed FPSCR move related instruction. 
This could be mtfsfi*, mtfsb0*, mtfsb1*, mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER5_PME_PM_DATA_FROM_L2 ] = { .pme_name = "PM_DATA_FROM_L2", .pme_code = 0x1c3087, .pme_short_desc = "Data loaded from L2", .pme_long_desc = "The processor's Data Cache was reloaded from the local L2 due to a demand load.", }, [ POWER5_PME_PM_FPU1_DENORM ] = { .pme_name = "PM_FPU1_DENORM", .pme_code = 0x20e4, .pme_short_desc = "FPU1 received denormalized data", .pme_long_desc = "FPU1 has encountered a denormalized operand.", }, [ POWER5_PME_PM_FPU_1FLOP ] = { .pme_name = "PM_FPU_1FLOP", .pme_code = 0x100090, .pme_short_desc = "FPU executed one flop instruction", .pme_long_desc = "The floating point unit has executed an add, mult, sub, compare, fsel, fneg, fabs, fnabs, fres, or frsqrte kind of instruction. These are single FLOP operations.", }, [ POWER5_PME_PM_L2SC_RCLD_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2SC_RCLD_DISP_FAIL_OTHER", .pme_code = 0x731e2, .pme_short_desc = "L2 slice C RC load dispatch attempt failed due to other reasons", .pme_long_desc = "A Read/Claim dispatch for a load failed for some reason other than Full or Collision conditions.", }, [ POWER5_PME_PM_L2SC_RCST_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SC_RCST_DISP_FAIL_RC_FULL", .pme_code = 0x722e2, .pme_short_desc = "L2 slice C RC store dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a store failed because all RC machines are busy.", }, [ POWER5_PME_PM_FPU0_FSQRT ] = { .pme_name = "PM_FPU0_FSQRT", .pme_code = 0xc2, .pme_short_desc = "FPU0 executed FSQRT instruction", .pme_long_desc = "FPU0 has executed a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER5_PME_PM_LD_REF_L1 ] = { .pme_name = "PM_LD_REF_L1", .pme_code = 0x4c1090, .pme_short_desc = "L1 D cache load references", .pme_long_desc = "Load references to the Level 1 Data Cache. 
Combined unit 0 + 1.", }, [ POWER5_PME_PM_INST_FROM_L1 ] = { .pme_name = "PM_INST_FROM_L1", .pme_code = 0x22208d, .pme_short_desc = "Instruction fetched from L1", .pme_long_desc = "An instruction fetch group was fetched from L1. Fetch Groups can contain up to 8 instructions", }, [ POWER5_PME_PM_TLBIE_HELD ] = { .pme_name = "PM_TLBIE_HELD", .pme_code = 0x130e4, .pme_short_desc = "TLBIE held at dispatch", .pme_long_desc = "Cycles a TLBIE instruction was held at dispatch.", }, [ POWER5_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { .pme_name = "PM_DC_PREF_OUT_OF_STREAMS", .pme_code = 0xc50c2, .pme_short_desc = "D cache out of prefetch streams", .pme_long_desc = "A new prefetch stream was detected but no more stream entries were available.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L25_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L25_MOD_CYC", .pme_code = 0x4c70a2, .pme_short_desc = "Marked load latency from L2.5 modified", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER5_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_SRQ", .pme_code = 0x810c7, .pme_short_desc = "LSU1 marked SRQ lhs flushes", .pme_long_desc = "A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group.", }, [ POWER5_PME_PM_MEM_RQ_DISP_Q0to3 ] = { .pme_name = "PM_MEM_RQ_DISP_Q0to3", .pme_code = 0x702c6, .pme_short_desc = "Memory read queue dispatched to queues 0-3", .pme_long_desc = "A memory operation was dispatched to read queue 0,1,2, or 3. 
This event is sent from the Memory Controller clock domain and must be scaled accordingly.", }, [ POWER5_PME_PM_ST_REF_L1_LSU1 ] = { .pme_name = "PM_ST_REF_L1_LSU1", .pme_code = 0xc10c5, .pme_short_desc = "LSU1 L1 D cache store references", .pme_long_desc = "Store references to the Data Cache by LSU1.", }, [ POWER5_PME_PM_MRK_LD_MISS_L1 ] = { .pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 0x182088, .pme_short_desc = "Marked L1 D cache load misses", .pme_long_desc = "Marked L1 D cache load misses", }, [ POWER5_PME_PM_L1_WRITE_CYC ] = { .pme_name = "PM_L1_WRITE_CYC", .pme_code = 0x230e7, .pme_short_desc = "Cycles writing to instruction L1", .pme_long_desc = "Cycles that a cache line was written to the instruction cache.", }, [ POWER5_PME_PM_L2SC_ST_REQ ] = { .pme_name = "PM_L2SC_ST_REQ", .pme_code = 0x723e2, .pme_short_desc = "L2 slice C store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. The event is provided on each of the three slices A, B, and C.", }, [ POWER5_PME_PM_CMPLU_STALL_FDIV ] = { .pme_name = "PM_CMPLU_STALL_FDIV", .pme_code = 0x21109b, .pme_short_desc = "Completion stall caused by FDIV or FQRT instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a floating point divide or square root instruction. 
This is a subset of PM_CMPLU_STALL_FPU.", }, [ POWER5_PME_PM_THRD_SEL_OVER_CLB_EMPTY ] = { .pme_name = "PM_THRD_SEL_OVER_CLB_EMPTY", .pme_code = 0x410c2, .pme_short_desc = "Thread selection overrides caused by CLB empty", .pme_long_desc = "Thread selection was overridden because one thread's CLB was empty.", }, [ POWER5_PME_PM_BR_MPRED_CR ] = { .pme_name = "PM_BR_MPRED_CR", .pme_code = 0x230e5, .pme_short_desc = "Branch mispredictions due to CR bit setting", .pme_long_desc = "A conditional branch instruction was incorrectly predicted as taken or not taken. The branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This will result in a branch redirect flush if not overridden by a flush of an older instruction.", }, [ POWER5_PME_PM_L3SB_MOD_TAG ] = { .pme_name = "PM_L3SB_MOD_TAG", .pme_code = 0x720e4, .pme_short_desc = "L3 slice B transition from modified to TAG", .pme_long_desc = "L3 snooper detects someone doing a read to a line that is truly M in this L3(i.e. L3 going M->T or M->I(go_Mu case); Mu|Me are not included since they are formed due to a prev read op). Tx is not included since it is considered shared at this point.", }, [ POWER5_PME_PM_MRK_DATA_FROM_L2MISS ] = { .pme_name = "PM_MRK_DATA_FROM_L2MISS", .pme_code = 0x3c709b, .pme_short_desc = "Marked data loaded missed L2", .pme_long_desc = "DL1 was reloaded from beyond L2 due to a marked demand load.", }, [ POWER5_PME_PM_LSU_REJECT_SRQ ] = { .pme_name = "PM_LSU_REJECT_SRQ", .pme_code = 0x1c6088, .pme_short_desc = "LSU SRQ lhs rejects", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions because of Load Hit Store conditions. Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue. 
Combined Unit 0 + 1.", }, [ POWER5_PME_PM_LD_MISS_L1 ] = { .pme_name = "PM_LD_MISS_L1", .pme_code = 0x3c1088, .pme_short_desc = "L1 D cache load misses", .pme_long_desc = "Load references that miss the Level 1 Data cache. Combined unit 0 + 1.", }, [ POWER5_PME_PM_INST_FROM_PREF ] = { .pme_name = "PM_INST_FROM_PREF", .pme_code = 0x32208d, .pme_short_desc = "Instruction fetched from prefetch", .pme_long_desc = "An instruction fetch group was fetched from the prefetch buffer. Fetch groups can contain up to 8 instructions", }, [ POWER5_PME_PM_DC_INV_L2 ] = { .pme_name = "PM_DC_INV_L2", .pme_code = 0xc10c7, .pme_short_desc = "L1 D cache entries invalidated from L2", .pme_long_desc = "A dcache invalidate was received from the L2 because a line in L2 was castout.", }, [ POWER5_PME_PM_STCX_PASS ] = { .pme_name = "PM_STCX_PASS", .pme_code = 0x820e5, .pme_short_desc = "Stcx passes", .pme_long_desc = "A stcx (stwcx or stdcx) instruction was successful", }, [ POWER5_PME_PM_LSU_SRQ_FULL_CYC ] = { .pme_name = "PM_LSU_SRQ_FULL_CYC", .pme_code = 0x110c3, .pme_short_desc = "Cycles SRQ full", .pme_long_desc = "Cycles the Store Request Queue is full.", }, [ POWER5_PME_PM_FPU_FIN ] = { .pme_name = "PM_FPU_FIN", .pme_code = 0x401088, .pme_short_desc = "FPU produced a result", .pme_long_desc = "FPU finished, produced a result. This only indicates finish, not completion. Combined Unit 0 + Unit 1. Floating Point Stores are included in this count but not Floating Point Loads.", }, [ POWER5_PME_PM_L2SA_SHR_MOD ] = { .pme_name = "PM_L2SA_SHR_MOD", .pme_code = 0x700c0, .pme_short_desc = "L2 slice A transition from shared to modified", .pme_long_desc = "A cache line in the local L2 directory made a state transition from Shared (Shared, Shared L, or Tagged) to the Modified state. This transition was caused by a store from either of the two local CPUs to a cache line in any of the Shared states. The event is provided on each of the three slices A, B, and C. 
", }, [ POWER5_PME_PM_LSU_SRQ_STFWD ] = { .pme_name = "PM_LSU_SRQ_STFWD", .pme_code = 0x1c2088, .pme_short_desc = "SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load hits L1 but becomes a store forward, it is not treated as a load miss. Combined Unit 0 + 1.", }, [ POWER5_PME_PM_0INST_CLB_CYC ] = { .pme_name = "PM_0INST_CLB_CYC", .pme_code = 0x400c0, .pme_short_desc = "Cycles no instructions in CLB", .pme_long_desc = "The cache line buffer (CLB) is a 6-deep, 4-wide instruction buffer. Fullness is reported on a cycle basis with each event representing the number of cycles the CLB had the corresponding number of entries occupied. These events give a real time history of the number of instruction buffers used, but not the number of PowerPC instructions within those buffers. Each thread has its own set of CLB; these events are thread specific.", }, [ POWER5_PME_PM_FXU0_FIN ] = { .pme_name = "PM_FXU0_FIN", .pme_code = 0x130e2, .pme_short_desc = "FXU0 produced a result", .pme_long_desc = "The Fixed Point unit 0 finished an instruction and produced a result. 
Instructions that finish may not necessarily complete.", }, [ POWER5_PME_PM_L2SB_RCST_DISP_FAIL_RC_FULL ] = { .pme_name = "PM_L2SB_RCST_DISP_FAIL_RC_FULL", .pme_code = 0x722e1, .pme_short_desc = "L2 slice B RC store dispatch attempt failed due to all RC full", .pme_long_desc = "A Read/Claim dispatch for a store failed because all RC machines are busy.", }, [ POWER5_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { .pme_name = "PM_THRD_GRP_CMPL_BOTH_CYC", .pme_code = 0x200013, .pme_short_desc = "Cycles group completed by both threads", .pme_long_desc = "Cycles that both threads completed.", }, [ POWER5_PME_PM_PMC5_OVERFLOW ] = { .pme_name = "PM_PMC5_OVERFLOW", .pme_code = 0x10001a, .pme_short_desc = "PMC5 Overflow", .pme_long_desc = "Overflows from PMC5 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", }, [ POWER5_PME_PM_FPU0_FDIV ] = { .pme_name = "PM_FPU0_FDIV", .pme_code = 0xc0, .pme_short_desc = "FPU0 executed FDIV instruction", .pme_long_desc = "FPU0 has executed a divide instruction. This could be fdiv, fdivs, fdiv., 
fdivs.", }, [ POWER5_PME_PM_PTEG_FROM_L375_SHR ] = { .pme_name = "PM_PTEG_FROM_L375_SHR", .pme_code = 0x38309e, .pme_short_desc = "PTEG loaded from L3.75 shared", .pme_long_desc = "A Page Table Entry was loaded into the TLB with shared (S) data from the L3 of a chip on a different module than this processor is located, due to a demand load.", }, [ POWER5_PME_PM_LD_REF_L1_LSU1 ] = { .pme_name = "PM_LD_REF_L1_LSU1", .pme_code = 0xc10c4, .pme_short_desc = "LSU1 L1 D cache load references", .pme_long_desc = "Load references to Level 1 Data Cache, by unit 1.", }, [ POWER5_PME_PM_L2SA_RC_DISP_FAIL_CO_BUSY ] = { .pme_name = "PM_L2SA_RC_DISP_FAIL_CO_BUSY", .pme_code = 0x703c0, .pme_short_desc = "L2 slice A RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy", .pme_long_desc = "A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss(re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. 
Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access.", }, [ POWER5_PME_PM_HV_CYC ] = { .pme_name = "PM_HV_CYC", .pme_code = 0x20000b, .pme_short_desc = "Hypervisor Cycles", .pme_long_desc = "Cycles when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0)", }, [ POWER5_PME_PM_THRD_PRIO_DIFF_0_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_0_CYC", .pme_code = 0x430e3, .pme_short_desc = "Cycles no thread priority difference", .pme_long_desc = "Cycles when this thread's priority is equal to the other thread's priority.", }, [ POWER5_PME_PM_LR_CTR_MAP_FULL_CYC ] = { .pme_name = "PM_LR_CTR_MAP_FULL_CYC", .pme_code = 0x100c6, .pme_short_desc = "Cycles LR/CTR mapper full", .pme_long_desc = "The LR/CTR mapper cannot accept any more groups. This condition will prevent dispatch groups from being dispatched. This event only indicates that the mapper was full, not that dispatch was prevented.", }, [ POWER5_PME_PM_L3SB_SHR_INV ] = { .pme_name = "PM_L3SB_SHR_INV", .pme_code = 0x710c4, .pme_short_desc = "L3 slice B transition from shared to invalid", .pme_long_desc = "L3 snooper detects someone doing a store to a line that is Sx in this L3(i.e. invalidate hit SX and dispatched).", }, [ POWER5_PME_PM_DATA_FROM_RMEM ] = { .pme_name = "PM_DATA_FROM_RMEM", .pme_code = 0x1c30a1, .pme_short_desc = "Data loaded from remote memory", .pme_long_desc = "The processor's Data Cache was reloaded from memory attached to a different module than this processor is located on.", }, [ POWER5_PME_PM_DATA_FROM_L275_MOD ] = { .pme_name = "PM_DATA_FROM_L275_MOD", .pme_code = 0x1c30a3, .pme_short_desc = "Data loaded from L2.75 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from the L2 on a different module than this processor is located due to a demand load. 
", }, [ POWER5_PME_PM_LSU0_REJECT_SRQ ] = { .pme_name = "PM_LSU0_REJECT_SRQ", .pme_code = 0xc60e0, .pme_short_desc = "LSU0 SRQ lhs rejects", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions because of Load Hit Store conditions. Loads are rejected when data is needed from a previous store instruction but store forwarding is not possible because the data is not fully contained in the Store Data Queue or is not yet available in the Store Data Queue.", }, [ POWER5_PME_PM_LSU1_DERAT_MISS ] = { .pme_name = "PM_LSU1_DERAT_MISS", .pme_code = 0x800c6, .pme_short_desc = "LSU1 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", }, [ POWER5_PME_PM_MRK_LSU_FIN ] = { .pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x400014, .pme_short_desc = "Marked instruction LSU processing finished", .pme_long_desc = "One of the Load/Store Units finished a marked instruction. Instructions that finish may not necessarily complete", }, [ POWER5_PME_PM_DTLB_MISS_16M ] = { .pme_name = "PM_DTLB_MISS_16M", .pme_code = 0xc40c4, .pme_short_desc = "Data TLB miss for 16M page", .pme_long_desc = "Data TLB references to 16MB pages that missed the TLB. Page size is determined at TLB reload time.", }, [ POWER5_PME_PM_LSU0_FLUSH_UST ] = { .pme_name = "PM_LSU0_FLUSH_UST", .pme_code = 0xc00c1, .pme_short_desc = "LSU0 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 0 because it was unaligned (crossed a 4K boundary).", }, [ POWER5_PME_PM_L2SC_MOD_TAG ] = { .pme_name = "PM_L2SC_MOD_TAG", .pme_code = 0x720e2, .pme_short_desc = "L2 slice C transition from modified to tagged", .pme_long_desc = "A cache line in the local L2 directory made a state transition from the Modified state to the Tagged state. 
This transition was caused by a read snoop request that hit against a modified entry in the local L2. The event is provided on each of the three slices A, B, and C.", }, [ POWER5_PME_PM_L2SB_RC_DISP_FAIL_CO_BUSY ] = { .pme_name = "PM_L2SB_RC_DISP_FAIL_CO_BUSY", .pme_code = 0x703c1, .pme_short_desc = "L2 slice B RC dispatch attempt failed due to RC/CO pair chosen was miss and CO already busy", .pme_long_desc = "A Read/Claim Dispatch was rejected at dispatch because the Castout Machine was busy. In the case of an RC starting up on a miss and the victim is valid, the CO machine must be available for the RC to process the access. If the CO is still busy working on an old castout, then the RC must not-ack the access if it is a miss(re-issued by the CIU). If it is a miss and the CO is available to process the castout, the RC will accept the access. Once the RC has finished, it can restart and process new accesses that result in a hit (or miss that doesn't need a CO) even though the CO is still processing a castout from a previous access.", } }; #endif papi-papi-7-2-0-t/src/libpfm4/lib/events/power6_events.h000066400000000000000000004414711502707512200230300ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __POWER6_EVENTS_H__ #define __POWER6_EVENTS_H__ /* * File: power6_events.h * CVS: * Author: Corey Ashford * cjashfor@us.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2009. All Rights Reserved. * Contributed by Corey Ashford * * Note: This code was automatically generated and should not be modified by * hand. 
* */ #define POWER6_PME_PM_LSU_REJECT_STQ_FULL 0 #define POWER6_PME_PM_DPU_HELD_FXU_MULTI 1 #define POWER6_PME_PM_VMX1_STALL 2 #define POWER6_PME_PM_PMC2_SAVED 3 #define POWER6_PME_PM_L2SB_IC_INV 4 #define POWER6_PME_PM_IERAT_MISS_64K 5 #define POWER6_PME_PM_THRD_PRIO_DIFF_3or4_CYC 6 #define POWER6_PME_PM_LD_REF_L1_BOTH 7 #define POWER6_PME_PM_FPU1_FCONV 8 #define POWER6_PME_PM_IBUF_FULL_COUNT 9 #define POWER6_PME_PM_MRK_LSU_DERAT_MISS 10 #define POWER6_PME_PM_MRK_ST_CMPL 11 #define POWER6_PME_PM_L2_CASTOUT_MOD 12 #define POWER6_PME_PM_FPU1_ST_FOLDED 13 #define POWER6_PME_PM_MRK_INST_TIMEO 14 #define POWER6_PME_PM_DPU_WT 15 #define POWER6_PME_PM_DPU_HELD_RESTART 16 #define POWER6_PME_PM_IERAT_MISS 17 #define POWER6_PME_PM_FPU_SINGLE 18 #define POWER6_PME_PM_MRK_PTEG_FROM_LMEM 19 #define POWER6_PME_PM_HV_COUNT 20 #define POWER6_PME_PM_L2SA_ST_HIT 21 #define POWER6_PME_PM_L2_LD_MISS_INST 22 #define POWER6_PME_PM_EXT_INT 23 #define POWER6_PME_PM_LSU1_LDF 24 #define POWER6_PME_PM_FAB_CMD_ISSUED 25 #define POWER6_PME_PM_PTEG_FROM_L21 26 #define POWER6_PME_PM_L2SA_MISS 27 #define POWER6_PME_PM_PTEG_FROM_RL2L3_MOD 28 #define POWER6_PME_PM_DPU_WT_COUNT 29 #define POWER6_PME_PM_MRK_PTEG_FROM_L25_MOD 30 #define POWER6_PME_PM_LD_HIT_L2 31 #define POWER6_PME_PM_PTEG_FROM_DL2L3_SHR 32 #define POWER6_PME_PM_MEM_DP_RQ_GLOB_LOC 33 #define POWER6_PME_PM_L3SA_MISS 34 #define POWER6_PME_PM_NO_ITAG_COUNT 35 #define POWER6_PME_PM_DSLB_MISS 36 #define POWER6_PME_PM_LSU_FLUSH_ALIGN 37 #define POWER6_PME_PM_DPU_HELD_FPU_CR 38 #define POWER6_PME_PM_PTEG_FROM_L2MISS 39 #define POWER6_PME_PM_MRK_DATA_FROM_DMEM 40 #define POWER6_PME_PM_PTEG_FROM_LMEM 41 #define POWER6_PME_PM_MRK_DERAT_REF_64K 42 #define POWER6_PME_PM_L2SA_LD_REQ_INST 43 #define POWER6_PME_PM_MRK_DERAT_MISS_16M 44 #define POWER6_PME_PM_DATA_FROM_DL2L3_MOD 45 #define POWER6_PME_PM_FPU0_FXMULT 46 #define POWER6_PME_PM_L3SB_MISS 47 #define POWER6_PME_PM_STCX_CANCEL 48 #define POWER6_PME_PM_L2SA_LD_MISS_DATA 49 #define 
POWER6_PME_PM_IC_INV_L2 50 #define POWER6_PME_PM_DPU_HELD 51 #define POWER6_PME_PM_PMC1_OVERFLOW 52 #define POWER6_PME_PM_THRD_PRIO_6_CYC 53 #define POWER6_PME_PM_MRK_PTEG_FROM_L3MISS 54 #define POWER6_PME_PM_MRK_LSU0_REJECT_UST 55 #define POWER6_PME_PM_MRK_INST_DISP 56 #define POWER6_PME_PM_LARX 57 #define POWER6_PME_PM_INST_CMPL 58 #define POWER6_PME_PM_FXU_IDLE 59 #define POWER6_PME_PM_MRK_DATA_FROM_DL2L3_MOD 60 #define POWER6_PME_PM_L2_LD_REQ_DATA 61 #define POWER6_PME_PM_LSU_DERAT_MISS_CYC 62 #define POWER6_PME_PM_DPU_HELD_POWER_COUNT 63 #define POWER6_PME_PM_INST_FROM_RL2L3_MOD 64 #define POWER6_PME_PM_DATA_FROM_DMEM_CYC 65 #define POWER6_PME_PM_DATA_FROM_DMEM 66 #define POWER6_PME_PM_LSU_REJECT_PARTIAL_SECTOR 67 #define POWER6_PME_PM_LSU_REJECT_DERAT_MPRED 68 #define POWER6_PME_PM_LSU1_REJECT_ULD 69 #define POWER6_PME_PM_DATA_FROM_L3_CYC 70 #define POWER6_PME_PM_FXU1_BUSY_FXU0_IDLE 71 #define POWER6_PME_PM_INST_FROM_MEM_DP 72 #define POWER6_PME_PM_LSU_FLUSH_DSI 73 #define POWER6_PME_PM_MRK_DERAT_REF_16G 74 #define POWER6_PME_PM_LSU_LDF_BOTH 75 #define POWER6_PME_PM_FPU1_1FLOP 76 #define POWER6_PME_PM_DATA_FROM_RMEM_CYC 77 #define POWER6_PME_PM_INST_PTEG_SECONDARY 78 #define POWER6_PME_PM_L1_ICACHE_MISS 79 #define POWER6_PME_PM_INST_DISP_LLA 80 #define POWER6_PME_PM_THRD_BOTH_RUN_CYC 81 #define POWER6_PME_PM_LSU_ST_CHAINED 82 #define POWER6_PME_PM_FPU1_FXDIV 83 #define POWER6_PME_PM_FREQ_UP 84 #define POWER6_PME_PM_FAB_RETRY_SYS_PUMP 85 #define POWER6_PME_PM_DATA_FROM_LMEM 86 #define POWER6_PME_PM_PMC3_OVERFLOW 87 #define POWER6_PME_PM_LSU0_REJECT_SET_MPRED 88 #define POWER6_PME_PM_LSU0_REJECT_DERAT_MPRED 89 #define POWER6_PME_PM_LSU1_REJECT_STQ_FULL 90 #define POWER6_PME_PM_MRK_BR_MPRED 91 #define POWER6_PME_PM_L2SA_ST_MISS 92 #define POWER6_PME_PM_LSU0_REJECT_EXTERN 93 #define POWER6_PME_PM_MRK_BR_TAKEN 94 #define POWER6_PME_PM_ISLB_MISS 95 #define POWER6_PME_PM_CYC 96 #define POWER6_PME_PM_FPU_FXDIV 97 #define POWER6_PME_PM_DPU_HELD_LLA_END 98 #define 
POWER6_PME_PM_MEM0_DP_CL_WR_LOC 99 #define POWER6_PME_PM_MRK_LSU_REJECT_ULD 100 #define POWER6_PME_PM_1PLUS_PPC_CMPL 101 #define POWER6_PME_PM_PTEG_FROM_DMEM 102 #define POWER6_PME_PM_DPU_WT_BR_MPRED_COUNT 103 #define POWER6_PME_PM_GCT_FULL_CYC 104 #define POWER6_PME_PM_INST_FROM_L25_SHR 105 #define POWER6_PME_PM_MRK_DERAT_MISS_4K 106 #define POWER6_PME_PM_DC_PREF_STREAM_ALLOC 107 #define POWER6_PME_PM_FPU1_FIN 108 #define POWER6_PME_PM_BR_MPRED_TA 109 #define POWER6_PME_PM_DPU_HELD_POWER 110 #define POWER6_PME_PM_RUN_INST_CMPL 111 #define POWER6_PME_PM_GCT_EMPTY_CYC 112 #define POWER6_PME_PM_LLA_COUNT 113 #define POWER6_PME_PM_LSU0_REJECT_NO_SCRATCH 114 #define POWER6_PME_PM_DPU_WT_IC_MISS 115 #define POWER6_PME_PM_DATA_FROM_L3MISS 116 #define POWER6_PME_PM_FPU_FPSCR 117 #define POWER6_PME_PM_VMX1_INST_ISSUED 118 #define POWER6_PME_PM_FLUSH 119 #define POWER6_PME_PM_ST_HIT_L2 120 #define POWER6_PME_PM_SYNC_CYC 121 #define POWER6_PME_PM_FAB_SYS_PUMP 122 #define POWER6_PME_PM_IC_PREF_REQ 123 #define POWER6_PME_PM_MEM0_DP_RQ_GLOB_LOC 124 #define POWER6_PME_PM_FPU_ISSUE_0 125 #define POWER6_PME_PM_THRD_PRIO_2_CYC 126 #define POWER6_PME_PM_VMX_SIMPLE_ISSUED 127 #define POWER6_PME_PM_MRK_FPU1_FIN 128 #define POWER6_PME_PM_DPU_HELD_CW 129 #define POWER6_PME_PM_L3SA_REF 130 #define POWER6_PME_PM_STCX 131 #define POWER6_PME_PM_L2SB_MISS 132 #define POWER6_PME_PM_LSU0_REJECT 133 #define POWER6_PME_PM_TB_BIT_TRANS 134 #define POWER6_PME_PM_THERMAL_MAX 135 #define POWER6_PME_PM_FPU0_STF 136 #define POWER6_PME_PM_FPU1_FMA 137 #define POWER6_PME_PM_LSU1_REJECT_LHS 138 #define POWER6_PME_PM_DPU_HELD_INT 139 #define POWER6_PME_PM_THRD_LLA_BOTH_CYC 140 #define POWER6_PME_PM_DPU_HELD_THERMAL_COUNT 141 #define POWER6_PME_PM_PMC4_REWIND 142 #define POWER6_PME_PM_DERAT_REF_16M 143 #define POWER6_PME_PM_FPU0_FCONV 144 #define POWER6_PME_PM_L2SA_LD_REQ_DATA 145 #define POWER6_PME_PM_DATA_FROM_MEM_DP 146 #define POWER6_PME_PM_MRK_VMX_FLOAT_ISSUED 147 #define 
POWER6_PME_PM_MRK_PTEG_FROM_L2MISS 148 #define POWER6_PME_PM_THRD_PRIO_DIFF_1or2_CYC 149 #define POWER6_PME_PM_VMX0_STALL 150 #define POWER6_PME_PM_IC_DEMAND_L2_BHT_REDIRECT 151 #define POWER6_PME_PM_LSU_DERAT_MISS 152 #define POWER6_PME_PM_FPU0_SINGLE 153 #define POWER6_PME_PM_FPU_ISSUE_STEERING 154 #define POWER6_PME_PM_THRD_PRIO_1_CYC 155 #define POWER6_PME_PM_VMX_COMPLEX_ISSUED 156 #define POWER6_PME_PM_FPU_ISSUE_ST_FOLDED 157 #define POWER6_PME_PM_DFU_FIN 158 #define POWER6_PME_PM_BR_PRED_CCACHE 159 #define POWER6_PME_PM_MRK_ST_CMPL_INT 160 #define POWER6_PME_PM_FAB_MMIO 161 #define POWER6_PME_PM_MRK_VMX_SIMPLE_ISSUED 162 #define POWER6_PME_PM_FPU_STF 163 #define POWER6_PME_PM_MEM1_DP_CL_WR_GLOB 164 #define POWER6_PME_PM_MRK_DATA_FROM_L3MISS 165 #define POWER6_PME_PM_GCT_NOSLOT_CYC 166 #define POWER6_PME_PM_L2_ST_REQ_DATA 167 #define POWER6_PME_PM_INST_TABLEWALK_COUNT 168 #define POWER6_PME_PM_PTEG_FROM_L35_SHR 169 #define POWER6_PME_PM_DPU_HELD_ISYNC 170 #define POWER6_PME_PM_MRK_DATA_FROM_L25_SHR 171 #define POWER6_PME_PM_L3SA_HIT 172 #define POWER6_PME_PM_DERAT_MISS_16G 173 #define POWER6_PME_PM_DATA_PTEG_2ND_HALF 174 #define POWER6_PME_PM_L2SA_ST_REQ 175 #define POWER6_PME_PM_INST_FROM_LMEM 176 #define POWER6_PME_PM_IC_DEMAND_L2_BR_REDIRECT 177 #define POWER6_PME_PM_PTEG_FROM_L2 178 #define POWER6_PME_PM_DATA_PTEG_1ST_HALF 179 #define POWER6_PME_PM_BR_MPRED_COUNT 180 #define POWER6_PME_PM_IERAT_MISS_4K 181 #define POWER6_PME_PM_THRD_BOTH_RUN_COUNT 182 #define POWER6_PME_PM_LSU_REJECT_ULD 183 #define POWER6_PME_PM_DATA_FROM_DL2L3_MOD_CYC 184 #define POWER6_PME_PM_MRK_PTEG_FROM_RL2L3_MOD 185 #define POWER6_PME_PM_FPU0_FLOP 186 #define POWER6_PME_PM_FPU0_FEST 187 #define POWER6_PME_PM_MRK_LSU0_REJECT_LHS 188 #define POWER6_PME_PM_VMX_RESULT_SAT_1 189 #define POWER6_PME_PM_NO_ITAG_CYC 190 #define POWER6_PME_PM_LSU1_REJECT_NO_SCRATCH 191 #define POWER6_PME_PM_0INST_FETCH 192 #define POWER6_PME_PM_DPU_WT_BR_MPRED 193 #define POWER6_PME_PM_L1_PREF 194 #define 
POWER6_PME_PM_VMX_FLOAT_MULTICYCLE 195 #define POWER6_PME_PM_DATA_FROM_L25_SHR_CYC 196 #define POWER6_PME_PM_DATA_FROM_L3 197 #define POWER6_PME_PM_PMC2_OVERFLOW 198 #define POWER6_PME_PM_VMX0_LD_WRBACK 199 #define POWER6_PME_PM_FPU0_DENORM 200 #define POWER6_PME_PM_INST_FETCH_CYC 201 #define POWER6_PME_PM_LSU_LDF 202 #define POWER6_PME_PM_LSU_REJECT_L2_CORR 203 #define POWER6_PME_PM_DERAT_REF_64K 204 #define POWER6_PME_PM_THRD_PRIO_3_CYC 205 #define POWER6_PME_PM_FPU_FMA 206 #define POWER6_PME_PM_INST_FROM_L35_MOD 207 #define POWER6_PME_PM_DFU_CONV 208 #define POWER6_PME_PM_INST_FROM_L25_MOD 209 #define POWER6_PME_PM_PTEG_FROM_L35_MOD 210 #define POWER6_PME_PM_MRK_VMX_ST_ISSUED 211 #define POWER6_PME_PM_VMX_FLOAT_ISSUED 212 #define POWER6_PME_PM_LSU0_REJECT_L2_CORR 213 #define POWER6_PME_PM_THRD_L2MISS 214 #define POWER6_PME_PM_FPU_FCONV 215 #define POWER6_PME_PM_FPU_FXMULT 216 #define POWER6_PME_PM_FPU1_FRSP 217 #define POWER6_PME_PM_MRK_DERAT_REF_16M 218 #define POWER6_PME_PM_L2SB_CASTOUT_SHR 219 #define POWER6_PME_PM_THRD_ONE_RUN_COUNT 220 #define POWER6_PME_PM_INST_FROM_RMEM 221 #define POWER6_PME_PM_LSU_BOTH_BUS 222 #define POWER6_PME_PM_FPU1_FSQRT_FDIV 223 #define POWER6_PME_PM_L2_LD_REQ_INST 224 #define POWER6_PME_PM_MRK_PTEG_FROM_L35_SHR 225 #define POWER6_PME_PM_BR_PRED_CR 226 #define POWER6_PME_PM_MRK_LSU0_REJECT_ULD 227 #define POWER6_PME_PM_LSU_REJECT 228 #define POWER6_PME_PM_LSU_REJECT_LHS_BOTH 229 #define POWER6_PME_PM_GXO_ADDR_CYC_BUSY 230 #define POWER6_PME_PM_LSU_SRQ_EMPTY_COUNT 231 #define POWER6_PME_PM_PTEG_FROM_L3 232 #define POWER6_PME_PM_VMX0_LD_ISSUED 233 #define POWER6_PME_PM_FXU_PIPELINED_MULT_DIV 234 #define POWER6_PME_PM_FPU1_STF 235 #define POWER6_PME_PM_DFU_ADD 236 #define POWER6_PME_PM_MEM_DP_CL_WR_GLOB 237 #define POWER6_PME_PM_MRK_LSU1_REJECT_ULD 238 #define POWER6_PME_PM_ITLB_REF 239 #define POWER6_PME_PM_LSU0_REJECT_L2MISS 240 #define POWER6_PME_PM_DATA_FROM_L35_SHR 241 #define POWER6_PME_PM_MRK_DATA_FROM_RL2L3_MOD 242 #define 
POWER6_PME_PM_FPU0_FPSCR 243 #define POWER6_PME_PM_DATA_FROM_L2 244 #define POWER6_PME_PM_DPU_HELD_XER 245 #define POWER6_PME_PM_FAB_NODE_PUMP 246 #define POWER6_PME_PM_VMX_RESULT_SAT_0_1 247 #define POWER6_PME_PM_LD_REF_L1 248 #define POWER6_PME_PM_TLB_REF 249 #define POWER6_PME_PM_DC_PREF_OUT_OF_STREAMS 250 #define POWER6_PME_PM_FLUSH_FPU 251 #define POWER6_PME_PM_MEM1_DP_CL_WR_LOC 252 #define POWER6_PME_PM_L2SB_LD_HIT 253 #define POWER6_PME_PM_FAB_DCLAIM 254 #define POWER6_PME_PM_MEM_DP_CL_WR_LOC 255 #define POWER6_PME_PM_BR_MPRED_CR 256 #define POWER6_PME_PM_LSU_REJECT_EXTERN 257 #define POWER6_PME_PM_DATA_FROM_RL2L3_MOD 258 #define POWER6_PME_PM_DPU_HELD_RU_WQ 259 #define POWER6_PME_PM_LD_MISS_L1 260 #define POWER6_PME_PM_DC_INV_L2 261 #define POWER6_PME_PM_MRK_PTEG_FROM_RMEM 262 #define POWER6_PME_PM_FPU_FIN 263 #define POWER6_PME_PM_FXU0_FIN 264 #define POWER6_PME_PM_DPU_HELD_FPQ 265 #define POWER6_PME_PM_GX_DMA_READ 266 #define POWER6_PME_PM_LSU1_REJECT_PARTIAL_SECTOR 267 #define POWER6_PME_PM_0INST_FETCH_COUNT 268 #define POWER6_PME_PM_PMC5_OVERFLOW 269 #define POWER6_PME_PM_L2SB_LD_REQ 270 #define POWER6_PME_PM_THRD_PRIO_DIFF_0_CYC 271 #define POWER6_PME_PM_DATA_FROM_RMEM 272 #define POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_BOTH_CYC 273 #define POWER6_PME_PM_ST_REF_L1_BOTH 274 #define POWER6_PME_PM_VMX_PERMUTE_ISSUED 275 #define POWER6_PME_PM_BR_TAKEN 276 #define POWER6_PME_PM_FAB_DMA 277 #define POWER6_PME_PM_GCT_EMPTY_COUNT 278 #define POWER6_PME_PM_FPU1_SINGLE 279 #define POWER6_PME_PM_L2SA_CASTOUT_SHR 280 #define POWER6_PME_PM_L3SB_REF 281 #define POWER6_PME_PM_FPU0_FRSP 282 #define POWER6_PME_PM_PMC4_SAVED 283 #define POWER6_PME_PM_L2SA_DC_INV 284 #define POWER6_PME_PM_GXI_ADDR_CYC_BUSY 285 #define POWER6_PME_PM_FPU0_FMA 286 #define POWER6_PME_PM_SLB_MISS 287 #define POWER6_PME_PM_MRK_ST_GPS 288 #define POWER6_PME_PM_DERAT_REF_4K 289 #define POWER6_PME_PM_L2_CASTOUT_SHR 290 #define POWER6_PME_PM_DPU_HELD_STCX_CR 291 #define POWER6_PME_PM_FPU0_ST_FOLDED 292 
#define POWER6_PME_PM_MRK_DATA_FROM_L21 293 #define POWER6_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC 294 #define POWER6_PME_PM_DATA_FROM_L35_MOD 295 #define POWER6_PME_PM_DATA_FROM_DL2L3_SHR 296 #define POWER6_PME_PM_GXI_DATA_CYC_BUSY 297 #define POWER6_PME_PM_LSU_REJECT_STEAL 298 #define POWER6_PME_PM_ST_FIN 299 #define POWER6_PME_PM_DPU_HELD_CR_LOGICAL 300 #define POWER6_PME_PM_THRD_SEL_T0 301 #define POWER6_PME_PM_PTEG_RELOAD_VALID 302 #define POWER6_PME_PM_L2_PREF_ST 303 #define POWER6_PME_PM_MRK_STCX_FAIL 304 #define POWER6_PME_PM_LSU0_REJECT_LHS 305 #define POWER6_PME_PM_DFU_EXP_EQ 306 #define POWER6_PME_PM_DPU_HELD_FP_FX_MULT 307 #define POWER6_PME_PM_L2_LD_MISS_DATA 308 #define POWER6_PME_PM_DATA_FROM_L35_MOD_CYC 309 #define POWER6_PME_PM_FLUSH_FXU 310 #define POWER6_PME_PM_FPU_ISSUE_1 311 #define POWER6_PME_PM_DATA_FROM_LMEM_CYC 312 #define POWER6_PME_PM_DPU_HELD_LSU_SOPS 313 #define POWER6_PME_PM_INST_PTEG_2ND_HALF 314 #define POWER6_PME_PM_THRESH_TIMEO 315 #define POWER6_PME_PM_LSU_REJECT_UST_BOTH 316 #define POWER6_PME_PM_LSU_REJECT_FAST 317 #define POWER6_PME_PM_DPU_HELD_THRD_PRIO 318 #define POWER6_PME_PM_L2_PREF_LD 319 #define POWER6_PME_PM_FPU_FEST 320 #define POWER6_PME_PM_MRK_DATA_FROM_RMEM 321 #define POWER6_PME_PM_LD_MISS_L1_CYC 322 #define POWER6_PME_PM_DERAT_MISS_4K 323 #define POWER6_PME_PM_DPU_HELD_COMPLETION 324 #define POWER6_PME_PM_FPU_ISSUE_STALL_ST 325 #define POWER6_PME_PM_L2SB_DC_INV 326 #define POWER6_PME_PM_PTEG_FROM_L25_SHR 327 #define POWER6_PME_PM_PTEG_FROM_DL2L3_MOD 328 #define POWER6_PME_PM_FAB_CMD_RETRIED 329 #define POWER6_PME_PM_BR_PRED_LSTACK 330 #define POWER6_PME_PM_GXO_DATA_CYC_BUSY 331 #define POWER6_PME_PM_DFU_SUBNORM 332 #define POWER6_PME_PM_FPU_ISSUE_OOO 333 #define POWER6_PME_PM_LSU_REJECT_ULD_BOTH 334 #define POWER6_PME_PM_L2SB_ST_MISS 335 #define POWER6_PME_PM_DATA_FROM_L25_MOD_CYC 336 #define POWER6_PME_PM_INST_PTEG_1ST_HALF 337 #define POWER6_PME_PM_DERAT_MISS_16M 338 #define POWER6_PME_PM_GX_DMA_WRITE 339 #define 
POWER6_PME_PM_MRK_PTEG_FROM_DL2L3_MOD 340 #define POWER6_PME_PM_MEM1_DP_RQ_GLOB_LOC 341 #define POWER6_PME_PM_L2SB_LD_REQ_DATA 342 #define POWER6_PME_PM_L2SA_LD_MISS_INST 343 #define POWER6_PME_PM_MRK_LSU0_REJECT_L2MISS 344 #define POWER6_PME_PM_MRK_IFU_FIN 345 #define POWER6_PME_PM_INST_FROM_L3 346 #define POWER6_PME_PM_FXU1_FIN 347 #define POWER6_PME_PM_THRD_PRIO_4_CYC 348 #define POWER6_PME_PM_MRK_DATA_FROM_L35_MOD 349 #define POWER6_PME_PM_LSU_REJECT_SET_MPRED 350 #define POWER6_PME_PM_MRK_DERAT_MISS_16G 351 #define POWER6_PME_PM_FPU0_FXDIV 352 #define POWER6_PME_PM_MRK_LSU1_REJECT_UST 353 #define POWER6_PME_PM_FPU_ISSUE_DIV_SQRT_OVERLAP 354 #define POWER6_PME_PM_INST_FROM_L35_SHR 355 #define POWER6_PME_PM_MRK_LSU_REJECT_LHS 356 #define POWER6_PME_PM_LSU_LMQ_FULL_CYC 357 #define POWER6_PME_PM_SYNC_COUNT 358 #define POWER6_PME_PM_MEM0_DP_RQ_LOC_GLOB 359 #define POWER6_PME_PM_L2SA_CASTOUT_MOD 360 #define POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_BOTH_COUNT 361 #define POWER6_PME_PM_PTEG_FROM_MEM_DP 362 #define POWER6_PME_PM_LSU_REJECT_SLOW 363 #define POWER6_PME_PM_PTEG_FROM_L25_MOD 364 #define POWER6_PME_PM_THRD_PRIO_7_CYC 365 #define POWER6_PME_PM_MRK_PTEG_FROM_RL2L3_SHR 366 #define POWER6_PME_PM_ST_REQ_L2 367 #define POWER6_PME_PM_ST_REF_L1 368 #define POWER6_PME_PM_FPU_ISSUE_STALL_THRD 369 #define POWER6_PME_PM_RUN_COUNT 370 #define POWER6_PME_PM_RUN_CYC 371 #define POWER6_PME_PM_PTEG_FROM_RMEM 372 #define POWER6_PME_PM_LSU0_LDF 373 #define POWER6_PME_PM_ST_MISS_L1 374 #define POWER6_PME_PM_INST_FROM_DL2L3_SHR 375 #define POWER6_PME_PM_L2SA_IC_INV 376 #define POWER6_PME_PM_THRD_ONE_RUN_CYC 377 #define POWER6_PME_PM_L2SB_LD_REQ_INST 378 #define POWER6_PME_PM_MRK_DATA_FROM_L25_MOD 379 #define POWER6_PME_PM_DPU_HELD_XTHRD 380 #define POWER6_PME_PM_L2SB_ST_REQ 381 #define POWER6_PME_PM_INST_FROM_L21 382 #define POWER6_PME_PM_INST_FROM_L3MISS 383 #define POWER6_PME_PM_L3SB_HIT 384 #define POWER6_PME_PM_EE_OFF_EXT_INT 385 #define POWER6_PME_PM_INST_FROM_DL2L3_MOD 386 #define 
POWER6_PME_PM_PMC6_OVERFLOW 387 #define POWER6_PME_PM_FPU_FLOP 388 #define POWER6_PME_PM_FXU_BUSY 389 #define POWER6_PME_PM_FPU1_FLOP 390 #define POWER6_PME_PM_IC_RELOAD_SHR 391 #define POWER6_PME_PM_INST_TABLEWALK_CYC 392 #define POWER6_PME_PM_DATA_FROM_RL2L3_MOD_CYC 393 #define POWER6_PME_PM_THRD_PRIO_DIFF_5or6_CYC 394 #define POWER6_PME_PM_IBUF_FULL_CYC 395 #define POWER6_PME_PM_L2SA_LD_REQ 396 #define POWER6_PME_PM_VMX1_LD_WRBACK 397 #define POWER6_PME_PM_MRK_FPU_FIN 398 #define POWER6_PME_PM_THRD_PRIO_5_CYC 399 #define POWER6_PME_PM_DFU_BACK2BACK 400 #define POWER6_PME_PM_MRK_DATA_FROM_LMEM 401 #define POWER6_PME_PM_LSU_REJECT_LHS 402 #define POWER6_PME_PM_DPU_HELD_SPR 403 #define POWER6_PME_PM_FREQ_DOWN 404 #define POWER6_PME_PM_DFU_ENC_BCD_DPD 405 #define POWER6_PME_PM_DPU_HELD_GPR 406 #define POWER6_PME_PM_LSU0_NCST 407 #define POWER6_PME_PM_MRK_INST_ISSUED 408 #define POWER6_PME_PM_INST_FROM_RL2L3_SHR 409 #define POWER6_PME_PM_FPU_DENORM 410 #define POWER6_PME_PM_PTEG_FROM_L3MISS 411 #define POWER6_PME_PM_RUN_PURR 412 #define POWER6_PME_PM_MRK_VMX0_LD_WRBACK 413 #define POWER6_PME_PM_L2_MISS 414 #define POWER6_PME_PM_MRK_DATA_FROM_L3 415 #define POWER6_PME_PM_MRK_LSU1_REJECT_LHS 416 #define POWER6_PME_PM_L2SB_LD_MISS_INST 417 #define POWER6_PME_PM_PTEG_FROM_RL2L3_SHR 418 #define POWER6_PME_PM_MRK_DERAT_MISS_64K 419 #define POWER6_PME_PM_LWSYNC 420 #define POWER6_PME_PM_FPU1_FXMULT 421 #define POWER6_PME_PM_MEM0_DP_CL_WR_GLOB 422 #define POWER6_PME_PM_LSU0_REJECT_PARTIAL_SECTOR 423 #define POWER6_PME_PM_INST_IMC_MATCH_CMPL 424 #define POWER6_PME_PM_DPU_HELD_THERMAL 425 #define POWER6_PME_PM_FPU_FRSP 426 #define POWER6_PME_PM_MRK_INST_FIN 427 #define POWER6_PME_PM_MRK_PTEG_FROM_DL2L3_SHR 428 #define POWER6_PME_PM_MRK_DTLB_REF 429 #define POWER6_PME_PM_MRK_PTEG_FROM_L25_SHR 430 #define POWER6_PME_PM_DPU_HELD_LSU 431 #define POWER6_PME_PM_FPU_FSQRT_FDIV 432 #define POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_COUNT 433 #define POWER6_PME_PM_DATA_PTEG_SECONDARY 434 #define 
POWER6_PME_PM_FPU1_FEST 435 #define POWER6_PME_PM_L2SA_LD_HIT 436 #define POWER6_PME_PM_DATA_FROM_MEM_DP_CYC 437 #define POWER6_PME_PM_BR_MPRED_CCACHE 438 #define POWER6_PME_PM_DPU_HELD_COUNT 439 #define POWER6_PME_PM_LSU1_REJECT_SET_MPRED 440 #define POWER6_PME_PM_FPU_ISSUE_2 441 #define POWER6_PME_PM_LSU1_REJECT_L2_CORR 442 #define POWER6_PME_PM_MRK_PTEG_FROM_DMEM 443 #define POWER6_PME_PM_MEM1_DP_RQ_LOC_GLOB 444 #define POWER6_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC 445 #define POWER6_PME_PM_THRD_PRIO_0_CYC 446 #define POWER6_PME_PM_FXU0_BUSY_FXU1_IDLE 447 #define POWER6_PME_PM_LSU1_REJECT_DERAT_MPRED 448 #define POWER6_PME_PM_MRK_VMX1_LD_WRBACK 449 #define POWER6_PME_PM_DATA_FROM_RL2L3_SHR_CYC 450 #define POWER6_PME_PM_IERAT_MISS_16M 451 #define POWER6_PME_PM_MRK_DATA_FROM_MEM_DP 452 #define POWER6_PME_PM_LARX_L1HIT 453 #define POWER6_PME_PM_L2_ST_MISS_DATA 454 #define POWER6_PME_PM_FPU_ST_FOLDED 455 #define POWER6_PME_PM_MRK_DATA_FROM_L35_SHR 456 #define POWER6_PME_PM_DPU_HELD_MULT_GPR 457 #define POWER6_PME_PM_FPU0_1FLOP 458 #define POWER6_PME_PM_IERAT_MISS_16G 459 #define POWER6_PME_PM_IC_PREF_WRITE 460 #define POWER6_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC 461 #define POWER6_PME_PM_FPU0_FIN 462 #define POWER6_PME_PM_DATA_FROM_L2_CYC 463 #define POWER6_PME_PM_DERAT_REF_16G 464 #define POWER6_PME_PM_BR_PRED 465 #define POWER6_PME_PM_VMX1_LD_ISSUED 466 #define POWER6_PME_PM_L2SB_CASTOUT_MOD 467 #define POWER6_PME_PM_INST_FROM_DMEM 468 #define POWER6_PME_PM_DATA_FROM_L35_SHR_CYC 469 #define POWER6_PME_PM_LSU0_NCLD 470 #define POWER6_PME_PM_FAB_RETRY_NODE_PUMP 471 #define POWER6_PME_PM_VMX0_INST_ISSUED 472 #define POWER6_PME_PM_DATA_FROM_L25_MOD 473 #define POWER6_PME_PM_DPU_HELD_ITLB_ISLB 474 #define POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 475 #define POWER6_PME_PM_THRD_CONC_RUN_INST 476 #define POWER6_PME_PM_MRK_PTEG_FROM_L2 477 #define POWER6_PME_PM_PURR 478 #define POWER6_PME_PM_DERAT_MISS_64K 479 #define POWER6_PME_PM_PMC2_REWIND 480 #define POWER6_PME_PM_INST_FROM_L2 
481 #define POWER6_PME_PM_INST_DISP 482 #define POWER6_PME_PM_DATA_FROM_L25_SHR 483 #define POWER6_PME_PM_L1_DCACHE_RELOAD_VALID 484 #define POWER6_PME_PM_LSU1_REJECT_UST 485 #define POWER6_PME_PM_FAB_ADDR_COLLISION 486 #define POWER6_PME_PM_MRK_FXU_FIN 487 #define POWER6_PME_PM_LSU0_REJECT_UST 488 #define POWER6_PME_PM_PMC4_OVERFLOW 489 #define POWER6_PME_PM_MRK_PTEG_FROM_L3 490 #define POWER6_PME_PM_INST_FROM_L2MISS 491 #define POWER6_PME_PM_L2SB_ST_HIT 492 #define POWER6_PME_PM_DPU_WT_IC_MISS_COUNT 493 #define POWER6_PME_PM_MRK_DATA_FROM_DL2L3_SHR 494 #define POWER6_PME_PM_MRK_PTEG_FROM_L35_MOD 495 #define POWER6_PME_PM_FPU1_FPSCR 496 #define POWER6_PME_PM_LSU_REJECT_UST 497 #define POWER6_PME_PM_LSU0_DERAT_MISS 498 #define POWER6_PME_PM_MRK_PTEG_FROM_MEM_DP 499 #define POWER6_PME_PM_MRK_DATA_FROM_L2 500 #define POWER6_PME_PM_FPU0_FSQRT_FDIV 501 #define POWER6_PME_PM_DPU_HELD_FXU_SOPS 502 #define POWER6_PME_PM_MRK_FPU0_FIN 503 #define POWER6_PME_PM_L2SB_LD_MISS_DATA 504 #define POWER6_PME_PM_LSU_SRQ_EMPTY_CYC 505 #define POWER6_PME_PM_1PLUS_PPC_DISP 506 #define POWER6_PME_PM_VMX_ST_ISSUED 507 #define POWER6_PME_PM_DATA_FROM_L2MISS 508 #define POWER6_PME_PM_LSU0_REJECT_ULD 509 #define POWER6_PME_PM_SUSPENDED 510 #define POWER6_PME_PM_DFU_ADD_SHIFTED_BOTH 511 #define POWER6_PME_PM_LSU_REJECT_NO_SCRATCH 512 #define POWER6_PME_PM_STCX_FAIL 513 #define POWER6_PME_PM_FPU1_DENORM 514 #define POWER6_PME_PM_GCT_NOSLOT_COUNT 515 #define POWER6_PME_PM_DATA_FROM_DL2L3_SHR_CYC 516 #define POWER6_PME_PM_DATA_FROM_L21 517 #define POWER6_PME_PM_FPU_1FLOP 518 #define POWER6_PME_PM_LSU1_REJECT 519 #define POWER6_PME_PM_IC_REQ 520 #define POWER6_PME_PM_MRK_DFU_FIN 521 #define POWER6_PME_PM_NOT_LLA_CYC 522 #define POWER6_PME_PM_INST_FROM_L1 523 #define POWER6_PME_PM_MRK_VMX_COMPLEX_ISSUED 524 #define POWER6_PME_PM_BRU_FIN 525 #define POWER6_PME_PM_LSU1_REJECT_EXTERN 526 #define POWER6_PME_PM_DATA_FROM_L21_CYC 527 #define POWER6_PME_PM_GXI_CYC_BUSY 528 #define 
POWER6_PME_PM_MRK_LD_MISS_L1 529 #define POWER6_PME_PM_L1_WRITE_CYC 530 #define POWER6_PME_PM_LLA_CYC 531 #define POWER6_PME_PM_MRK_DATA_FROM_L2MISS 532 #define POWER6_PME_PM_GCT_FULL_COUNT 533 #define POWER6_PME_PM_MEM_DP_RQ_LOC_GLOB 534 #define POWER6_PME_PM_DATA_FROM_RL2L3_SHR 535 #define POWER6_PME_PM_MRK_LSU_REJECT_UST 536 #define POWER6_PME_PM_MRK_VMX_PERMUTE_ISSUED 537 #define POWER6_PME_PM_MRK_PTEG_FROM_L21 538 #define POWER6_PME_PM_THRD_GRP_CMPL_BOTH_CYC 539 #define POWER6_PME_PM_BR_MPRED 540 #define POWER6_PME_PM_LD_REQ_L2 541 #define POWER6_PME_PM_FLUSH_ASYNC 542 #define POWER6_PME_PM_HV_CYC 543 #define POWER6_PME_PM_LSU1_DERAT_MISS 544 #define POWER6_PME_PM_DPU_HELD_SMT 545 #define POWER6_PME_PM_MRK_LSU_FIN 546 #define POWER6_PME_PM_MRK_DATA_FROM_RL2L3_SHR 547 #define POWER6_PME_PM_LSU0_REJECT_STQ_FULL 548 #define POWER6_PME_PM_MRK_DERAT_REF_4K 549 #define POWER6_PME_PM_FPU_ISSUE_STALL_FPR 550 #define POWER6_PME_PM_IFU_FIN 551 #define POWER6_PME_PM_GXO_CYC_BUSY 552 static const pme_power_entry_t power6_pe[] = { [ POWER6_PME_PM_LSU_REJECT_STQ_FULL ] = { .pme_name = "PM_LSU_REJECT_STQ_FULL", .pme_code = 0x1a0030, .pme_short_desc = "LSU reject due to store queue full", .pme_long_desc = "LSU reject due to store queue full", }, [ POWER6_PME_PM_DPU_HELD_FXU_MULTI ] = { .pme_name = "PM_DPU_HELD_FXU_MULTI", .pme_code = 0x210a6, .pme_short_desc = "DISP unit held due to FXU multicycle", .pme_long_desc = "DISP unit held due to FXU multicycle", }, [ POWER6_PME_PM_VMX1_STALL ] = { .pme_name = "PM_VMX1_STALL", .pme_code = 0xb008c, .pme_short_desc = "VMX1 stall", .pme_long_desc = "VMX1 stall", }, [ POWER6_PME_PM_PMC2_SAVED ] = { .pme_name = "PM_PMC2_SAVED", .pme_code = 0x100022, .pme_short_desc = "PMC2 rewind value saved", .pme_long_desc = "PMC2 rewind value saved", }, [ POWER6_PME_PM_L2SB_IC_INV ] = { .pme_name = "PM_L2SB_IC_INV", .pme_code = 0x5068c, .pme_short_desc = "L2 slice B I cache invalidate", .pme_long_desc = "L2 slice B I cache invalidate", }, [ 
POWER6_PME_PM_IERAT_MISS_64K ] = { .pme_name = "PM_IERAT_MISS_64K", .pme_code = 0x392076, .pme_short_desc = "IERAT misses for 64K page", .pme_long_desc = "IERAT misses for 64K page", }, [ POWER6_PME_PM_THRD_PRIO_DIFF_3or4_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_3or4_CYC", .pme_code = 0x323040, .pme_short_desc = "Cycles thread priority difference is 3 or 4", .pme_long_desc = "Cycles thread priority difference is 3 or 4", }, [ POWER6_PME_PM_LD_REF_L1_BOTH ] = { .pme_name = "PM_LD_REF_L1_BOTH", .pme_code = 0x180036, .pme_short_desc = "Both units L1 D cache load reference", .pme_long_desc = "Both units L1 D cache load reference", }, [ POWER6_PME_PM_FPU1_FCONV ] = { .pme_name = "PM_FPU1_FCONV", .pme_code = 0xd10a8, .pme_short_desc = "FPU1 executed FCONV instruction", .pme_long_desc = "FPU1 executed FCONV instruction", }, [ POWER6_PME_PM_IBUF_FULL_COUNT ] = { .pme_name = "PM_IBUF_FULL_COUNT", .pme_code = 0x40085, .pme_short_desc = "Periods instruction buffer full", .pme_long_desc = "Periods instruction buffer full", }, [ POWER6_PME_PM_MRK_LSU_DERAT_MISS ] = { .pme_name = "PM_MRK_LSU_DERAT_MISS", .pme_code = 0x400012, .pme_short_desc = "Marked DERAT miss", .pme_long_desc = "Marked DERAT miss", }, [ POWER6_PME_PM_MRK_ST_CMPL ] = { .pme_name = "PM_MRK_ST_CMPL", .pme_code = 0x100006, .pme_short_desc = "Marked store instruction completed", .pme_long_desc = "A sampled store has completed (data home)", }, [ POWER6_PME_PM_L2_CASTOUT_MOD ] = { .pme_name = "PM_L2_CASTOUT_MOD", .pme_code = 0x150630, .pme_short_desc = "L2 castouts - Modified (M, Mu, Me)", .pme_long_desc = "L2 castouts - Modified (M, Mu, Me)", }, [ POWER6_PME_PM_FPU1_ST_FOLDED ] = { .pme_name = "PM_FPU1_ST_FOLDED", .pme_code = 0xd10ac, .pme_short_desc = "FPU1 folded store", .pme_long_desc = "FPU1 folded store", }, [ POWER6_PME_PM_MRK_INST_TIMEO ] = { .pme_name = "PM_MRK_INST_TIMEO", .pme_code = 0x40003e, .pme_short_desc = "Marked Instruction finish timeout ", .pme_long_desc = "Marked Instruction finish timeout ", }, 
[ POWER6_PME_PM_DPU_WT ] = { .pme_name = "PM_DPU_WT", .pme_code = 0x300004, .pme_short_desc = "Cycles DISP unit is stalled waiting for instructions", .pme_long_desc = "Cycles DISP unit is stalled waiting for instructions", }, [ POWER6_PME_PM_DPU_HELD_RESTART ] = { .pme_name = "PM_DPU_HELD_RESTART", .pme_code = 0x30086, .pme_short_desc = "DISP unit held after restart coming", .pme_long_desc = "DISP unit held after restart coming", }, [ POWER6_PME_PM_IERAT_MISS ] = { .pme_name = "PM_IERAT_MISS", .pme_code = 0x420ce, .pme_short_desc = "IERAT miss count", .pme_long_desc = "IERAT miss count", }, [ POWER6_PME_PM_FPU_SINGLE ] = { .pme_name = "PM_FPU_SINGLE", .pme_code = 0x4c1030, .pme_short_desc = "FPU executed single precision instruction", .pme_long_desc = "FPU is executing single precision instruction. Combined Unit 0 + Unit 1", }, [ POWER6_PME_PM_MRK_PTEG_FROM_LMEM ] = { .pme_name = "PM_MRK_PTEG_FROM_LMEM", .pme_code = 0x412042, .pme_short_desc = "Marked PTEG loaded from local memory", .pme_long_desc = "Marked PTEG loaded from local memory", }, [ POWER6_PME_PM_HV_COUNT ] = { .pme_name = "PM_HV_COUNT", .pme_code = 0x200017, .pme_short_desc = "Hypervisor Periods", .pme_long_desc = "Periods when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0)", }, [ POWER6_PME_PM_L2SA_ST_HIT ] = { .pme_name = "PM_L2SA_ST_HIT", .pme_code = 0x50786, .pme_short_desc = "L2 slice A store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. 
This event is provided on each of the three L2 slices A,B, and C.", }, [ POWER6_PME_PM_L2_LD_MISS_INST ] = { .pme_name = "PM_L2_LD_MISS_INST", .pme_code = 0x250530, .pme_short_desc = "L2 instruction load misses", .pme_long_desc = "L2 instruction load misses", }, [ POWER6_PME_PM_EXT_INT ] = { .pme_name = "PM_EXT_INT", .pme_code = 0x2000f8, .pme_short_desc = "External interrupts", .pme_long_desc = "An external interrupt occurred", }, [ POWER6_PME_PM_LSU1_LDF ] = { .pme_name = "PM_LSU1_LDF", .pme_code = 0x8008c, .pme_short_desc = "LSU1 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed from LSU unit 1", }, [ POWER6_PME_PM_FAB_CMD_ISSUED ] = { .pme_name = "PM_FAB_CMD_ISSUED", .pme_code = 0x150130, .pme_short_desc = "Fabric command issued", .pme_long_desc = "Fabric command issued", }, [ POWER6_PME_PM_PTEG_FROM_L21 ] = { .pme_name = "PM_PTEG_FROM_L21", .pme_code = 0x213048, .pme_short_desc = "PTEG loaded from private L2 other core", .pme_long_desc = "PTEG loaded from private L2 other core", }, [ POWER6_PME_PM_L2SA_MISS ] = { .pme_name = "PM_L2SA_MISS", .pme_code = 0x50584, .pme_short_desc = "L2 slice A misses", .pme_long_desc = "L2 slice A misses", }, [ POWER6_PME_PM_PTEG_FROM_RL2L3_MOD ] = { .pme_name = "PM_PTEG_FROM_RL2L3_MOD", .pme_code = 0x11304c, .pme_short_desc = "PTEG loaded from remote L2 or L3 modified", .pme_long_desc = "PTEG loaded from remote L2 or L3 modified", }, [ POWER6_PME_PM_DPU_WT_COUNT ] = { .pme_name = "PM_DPU_WT_COUNT", .pme_code = 0x300005, .pme_short_desc = "Periods DISP unit is stalled waiting for instructions", .pme_long_desc = "Periods DISP unit is stalled waiting for instructions", }, [ POWER6_PME_PM_MRK_PTEG_FROM_L25_MOD ] = { .pme_name = "PM_MRK_PTEG_FROM_L25_MOD", .pme_code = 0x312046, .pme_short_desc = "Marked PTEG loaded from L2.5 modified", .pme_long_desc = "Marked PTEG loaded from L2.5 modified", }, [ POWER6_PME_PM_LD_HIT_L2 ] = { .pme_name = "PM_LD_HIT_L2", .pme_code = 0x250730, 
.pme_short_desc = "L2 D cache load hits", .pme_long_desc = "L2 D cache load hits", }, [ POWER6_PME_PM_PTEG_FROM_DL2L3_SHR ] = { .pme_name = "PM_PTEG_FROM_DL2L3_SHR", .pme_code = 0x31304c, .pme_short_desc = "PTEG loaded from distant L2 or L3 shared", .pme_long_desc = "PTEG loaded from distant L2 or L3 shared", }, [ POWER6_PME_PM_MEM_DP_RQ_GLOB_LOC ] = { .pme_name = "PM_MEM_DP_RQ_GLOB_LOC", .pme_code = 0x150230, .pme_short_desc = "Memory read queue marking cache line double pump state from global to local", .pme_long_desc = "Memory read queue marking cache line double pump state from global to local", }, [ POWER6_PME_PM_L3SA_MISS ] = { .pme_name = "PM_L3SA_MISS", .pme_code = 0x50084, .pme_short_desc = "L3 slice A misses", .pme_long_desc = "L3 slice A misses", }, [ POWER6_PME_PM_NO_ITAG_COUNT ] = { .pme_name = "PM_NO_ITAG_COUNT", .pme_code = 0x40089, .pme_short_desc = "Periods no ITAG available", .pme_long_desc = "Periods no ITAG available", }, [ POWER6_PME_PM_DSLB_MISS ] = { .pme_name = "PM_DSLB_MISS", .pme_code = 0x830e8, .pme_short_desc = "Data SLB misses", .pme_long_desc = "A SLB miss for a data request occurred. 
SLB misses trap to the operating system to resolve", }, [ POWER6_PME_PM_LSU_FLUSH_ALIGN ] = { .pme_name = "PM_LSU_FLUSH_ALIGN", .pme_code = 0x220cc, .pme_short_desc = "Flush caused by alignment exception", .pme_long_desc = "Flush caused by alignment exception", }, [ POWER6_PME_PM_DPU_HELD_FPU_CR ] = { .pme_name = "PM_DPU_HELD_FPU_CR", .pme_code = 0x210a0, .pme_short_desc = "DISP unit held due to FPU updating CR", .pme_long_desc = "DISP unit held due to FPU updating CR", }, [ POWER6_PME_PM_PTEG_FROM_L2MISS ] = { .pme_name = "PM_PTEG_FROM_L2MISS", .pme_code = 0x113028, .pme_short_desc = "PTEG loaded from L2 miss", .pme_long_desc = "PTEG loaded from L2 miss", }, [ POWER6_PME_PM_MRK_DATA_FROM_DMEM ] = { .pme_name = "PM_MRK_DATA_FROM_DMEM", .pme_code = 0x20304a, .pme_short_desc = "Marked data loaded from distant memory", .pme_long_desc = "Marked data loaded from distant memory", }, [ POWER6_PME_PM_PTEG_FROM_LMEM ] = { .pme_name = "PM_PTEG_FROM_LMEM", .pme_code = 0x41304a, .pme_short_desc = "PTEG loaded from local memory", .pme_long_desc = "PTEG loaded from local memory", }, [ POWER6_PME_PM_MRK_DERAT_REF_64K ] = { .pme_name = "PM_MRK_DERAT_REF_64K", .pme_code = 0x182044, .pme_short_desc = "Marked DERAT reference for 64K page", .pme_long_desc = "Marked DERAT reference for 64K page", }, [ POWER6_PME_PM_L2SA_LD_REQ_INST ] = { .pme_name = "PM_L2SA_LD_REQ_INST", .pme_code = 0x50580, .pme_short_desc = "L2 slice A instruction load requests", .pme_long_desc = "L2 slice A instruction load requests", }, [ POWER6_PME_PM_MRK_DERAT_MISS_16M ] = { .pme_name = "PM_MRK_DERAT_MISS_16M", .pme_code = 0x392044, .pme_short_desc = "Marked DERAT misses for 16M page", .pme_long_desc = "A marked data request (load or store) missed the ERAT for 16M page and resulted in an ERAT reload.", }, [ POWER6_PME_PM_DATA_FROM_DL2L3_MOD ] = { .pme_name = "PM_DATA_FROM_DL2L3_MOD", .pme_code = 0x40005c, .pme_short_desc = "Data loaded from distant L2 or L3 modified", .pme_long_desc = "Data loaded from distant 
L2 or L3 modified", }, [ POWER6_PME_PM_FPU0_FXMULT ] = { .pme_name = "PM_FPU0_FXMULT", .pme_code = 0xd0086, .pme_short_desc = "FPU0 executed fixed point multiplication", .pme_long_desc = "FPU0 executed fixed point multiplication", }, [ POWER6_PME_PM_L3SB_MISS ] = { .pme_name = "PM_L3SB_MISS", .pme_code = 0x5008c, .pme_short_desc = "L3 slice B misses", .pme_long_desc = "L3 slice B misses", }, [ POWER6_PME_PM_STCX_CANCEL ] = { .pme_name = "PM_STCX_CANCEL", .pme_code = 0x830ec, .pme_short_desc = "stcx cancel by core", .pme_long_desc = "stcx cancel by core", }, [ POWER6_PME_PM_L2SA_LD_MISS_DATA ] = { .pme_name = "PM_L2SA_LD_MISS_DATA", .pme_code = 0x50482, .pme_short_desc = "L2 slice A data load misses", .pme_long_desc = "L2 slice A data load misses", }, [ POWER6_PME_PM_IC_INV_L2 ] = { .pme_name = "PM_IC_INV_L2", .pme_code = 0x250632, .pme_short_desc = "L1 I cache entries invalidated from L2", .pme_long_desc = "L1 I cache entries invalidated from L2", }, [ POWER6_PME_PM_DPU_HELD ] = { .pme_name = "PM_DPU_HELD", .pme_code = 0x200004, .pme_short_desc = "DISP unit held", .pme_long_desc = "DISP unit held", }, [ POWER6_PME_PM_PMC1_OVERFLOW ] = { .pme_name = "PM_PMC1_OVERFLOW", .pme_code = 0x200014, .pme_short_desc = "PMC1 Overflow", .pme_long_desc = "PMC1 Overflow", }, [ POWER6_PME_PM_THRD_PRIO_6_CYC ] = { .pme_name = "PM_THRD_PRIO_6_CYC", .pme_code = 0x222046, .pme_short_desc = "Cycles thread running at priority level 6", .pme_long_desc = "Cycles thread running at priority level 6", }, [ POWER6_PME_PM_MRK_PTEG_FROM_L3MISS ] = { .pme_name = "PM_MRK_PTEG_FROM_L3MISS", .pme_code = 0x312054, .pme_short_desc = "Marked PTEG loaded from L3 miss", .pme_long_desc = "Marked PTEG loaded from L3 miss", }, [ POWER6_PME_PM_MRK_LSU0_REJECT_UST ] = { .pme_name = "PM_MRK_LSU0_REJECT_UST", .pme_code = 0x930e2, .pme_short_desc = "LSU0 marked unaligned store reject", .pme_long_desc = "LSU0 marked unaligned store reject", }, [ POWER6_PME_PM_MRK_INST_DISP ] = { .pme_name = "PM_MRK_INST_DISP", 
.pme_code = 0x10001a, .pme_short_desc = "Marked instruction dispatched", .pme_long_desc = "Marked instruction dispatched", }, [ POWER6_PME_PM_LARX ] = { .pme_name = "PM_LARX", .pme_code = 0x830ea, .pme_short_desc = "Larx executed", .pme_long_desc = "Larx executed", }, [ POWER6_PME_PM_INST_CMPL ] = { .pme_name = "PM_INST_CMPL", .pme_code = 0x2, .pme_short_desc = "Instructions completed", .pme_long_desc = "Number of PPC instructions completed. ", }, [ POWER6_PME_PM_FXU_IDLE ] = { .pme_name = "PM_FXU_IDLE", .pme_code = 0x100050, .pme_short_desc = "FXU idle", .pme_long_desc = "FXU0 and FXU1 are both idle", }, [ POWER6_PME_PM_MRK_DATA_FROM_DL2L3_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_MOD", .pme_code = 0x40304c, .pme_short_desc = "Marked data loaded from distant L2 or L3 modified", .pme_long_desc = "Marked data loaded from distant L2 or L3 modified", }, [ POWER6_PME_PM_L2_LD_REQ_DATA ] = { .pme_name = "PM_L2_LD_REQ_DATA", .pme_code = 0x150430, .pme_short_desc = "L2 data load requests", .pme_long_desc = "L2 data load requests", }, [ POWER6_PME_PM_LSU_DERAT_MISS_CYC ] = { .pme_name = "PM_LSU_DERAT_MISS_CYC", .pme_code = 0x1000fc, .pme_short_desc = "DERAT miss latency", .pme_long_desc = "DERAT miss latency", }, [ POWER6_PME_PM_DPU_HELD_POWER_COUNT ] = { .pme_name = "PM_DPU_HELD_POWER_COUNT", .pme_code = 0x20003d, .pme_short_desc = "Periods DISP unit held due to Power Management", .pme_long_desc = "Periods DISP unit held due to Power Management", }, [ POWER6_PME_PM_INST_FROM_RL2L3_MOD ] = { .pme_name = "PM_INST_FROM_RL2L3_MOD", .pme_code = 0x142044, .pme_short_desc = "Instruction fetched from remote L2 or L3 modified", .pme_long_desc = "Instruction fetched from remote L2 or L3 modified", }, [ POWER6_PME_PM_DATA_FROM_DMEM_CYC ] = { .pme_name = "PM_DATA_FROM_DMEM_CYC", .pme_code = 0x20002e, .pme_short_desc = "Load latency from distant memory", .pme_long_desc = "Load latency from distant memory", }, [ POWER6_PME_PM_DATA_FROM_DMEM ] = { .pme_name = "PM_DATA_FROM_DMEM", 
.pme_code = 0x20005e, .pme_short_desc = "Data loaded from distant memory", .pme_long_desc = "Data loaded from distant memory", }, [ POWER6_PME_PM_LSU_REJECT_PARTIAL_SECTOR ] = { .pme_name = "PM_LSU_REJECT_PARTIAL_SECTOR", .pme_code = 0x1a0032, .pme_short_desc = "LSU reject due to partial sector valid", .pme_long_desc = "LSU reject due to partial sector valid", }, [ POWER6_PME_PM_LSU_REJECT_DERAT_MPRED ] = { .pme_name = "PM_LSU_REJECT_DERAT_MPRED", .pme_code = 0x2a0030, .pme_short_desc = "LSU reject due to mispredicted DERAT", .pme_long_desc = "LSU reject due to mispredicted DERAT", }, [ POWER6_PME_PM_LSU1_REJECT_ULD ] = { .pme_name = "PM_LSU1_REJECT_ULD", .pme_code = 0x90088, .pme_short_desc = "LSU1 unaligned load reject", .pme_long_desc = "LSU1 unaligned load reject", }, [ POWER6_PME_PM_DATA_FROM_L3_CYC ] = { .pme_name = "PM_DATA_FROM_L3_CYC", .pme_code = 0x200022, .pme_short_desc = "Load latency from L3", .pme_long_desc = "Load latency from L3", }, [ POWER6_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { .pme_name = "PM_FXU1_BUSY_FXU0_IDLE", .pme_code = 0x400050, .pme_short_desc = "FXU1 busy FXU0 idle", .pme_long_desc = "FXU0 was idle while FXU1 was busy", }, [ POWER6_PME_PM_INST_FROM_MEM_DP ] = { .pme_name = "PM_INST_FROM_MEM_DP", .pme_code = 0x142042, .pme_short_desc = "Instruction fetched from double pump memory", .pme_long_desc = "Instruction fetched from double pump memory", }, [ POWER6_PME_PM_LSU_FLUSH_DSI ] = { .pme_name = "PM_LSU_FLUSH_DSI", .pme_code = 0x220ce, .pme_short_desc = "Flush caused by DSI", .pme_long_desc = "Flush caused by DSI", }, [ POWER6_PME_PM_MRK_DERAT_REF_16G ] = { .pme_name = "PM_MRK_DERAT_REF_16G", .pme_code = 0x482044, .pme_short_desc = "Marked DERAT reference for 16G page", .pme_long_desc = "Marked DERAT reference for 16G page", }, [ POWER6_PME_PM_LSU_LDF_BOTH ] = { .pme_name = "PM_LSU_LDF_BOTH", .pme_code = 0x180038, .pme_short_desc = "Both LSU units executed Floating Point load instruction", .pme_long_desc = "Both LSU units executed Floating 
Point load instruction", }, [ POWER6_PME_PM_FPU1_1FLOP ] = { .pme_name = "PM_FPU1_1FLOP", .pme_code = 0xc0088, .pme_short_desc = "FPU1 executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo", }, [ POWER6_PME_PM_DATA_FROM_RMEM_CYC ] = { .pme_name = "PM_DATA_FROM_RMEM_CYC", .pme_code = 0x40002c, .pme_short_desc = "Load latency from remote memory", .pme_long_desc = "Load latency from remote memory", }, [ POWER6_PME_PM_INST_PTEG_SECONDARY ] = { .pme_name = "PM_INST_PTEG_SECONDARY", .pme_code = 0x910ac, .pme_short_desc = "Instruction table walk matched in secondary PTEG", .pme_long_desc = "Instruction table walk matched in secondary PTEG", }, [ POWER6_PME_PM_L1_ICACHE_MISS ] = { .pme_name = "PM_L1_ICACHE_MISS", .pme_code = 0x100056, .pme_short_desc = "L1 I cache miss count", .pme_long_desc = "L1 I cache miss count", }, [ POWER6_PME_PM_INST_DISP_LLA ] = { .pme_name = "PM_INST_DISP_LLA", .pme_code = 0x310a2, .pme_short_desc = "Instruction dispatched under load look ahead", .pme_long_desc = "Instruction dispatched under load look ahead", }, [ POWER6_PME_PM_THRD_BOTH_RUN_CYC ] = { .pme_name = "PM_THRD_BOTH_RUN_CYC", .pme_code = 0x400004, .pme_short_desc = "Both threads in run cycles", .pme_long_desc = "Both threads in run cycles", }, [ POWER6_PME_PM_LSU_ST_CHAINED ] = { .pme_name = "PM_LSU_ST_CHAINED", .pme_code = 0x820ce, .pme_short_desc = "number of chained stores", .pme_long_desc = "number of chained stores", }, [ POWER6_PME_PM_FPU1_FXDIV ] = { .pme_name = "PM_FPU1_FXDIV", .pme_code = 0xc10a8, .pme_short_desc = "FPU1 executed fixed point division", .pme_long_desc = "FPU1 executed fixed point division", }, [ POWER6_PME_PM_FREQ_UP ] = { .pme_name = "PM_FREQ_UP", .pme_code = 0x40003c, .pme_short_desc = "Frequency is being slewed up due 
to Power Management", .pme_long_desc = "Frequency is being slewed up due to Power Management", }, [ POWER6_PME_PM_FAB_RETRY_SYS_PUMP ] = { .pme_name = "PM_FAB_RETRY_SYS_PUMP", .pme_code = 0x50182, .pme_short_desc = "Retry of a system pump, locally mastered ", .pme_long_desc = "Retry of a system pump, locally mastered ", }, [ POWER6_PME_PM_DATA_FROM_LMEM ] = { .pme_name = "PM_DATA_FROM_LMEM", .pme_code = 0x40005e, .pme_short_desc = "Data loaded from local memory", .pme_long_desc = "Data loaded from local memory", }, [ POWER6_PME_PM_PMC3_OVERFLOW ] = { .pme_name = "PM_PMC3_OVERFLOW", .pme_code = 0x400014, .pme_short_desc = "PMC3 Overflow", .pme_long_desc = "PMC3 Overflow", }, [ POWER6_PME_PM_LSU0_REJECT_SET_MPRED ] = { .pme_name = "PM_LSU0_REJECT_SET_MPRED", .pme_code = 0xa0084, .pme_short_desc = "LSU0 reject due to mispredicted set", .pme_long_desc = "LSU0 reject due to mispredicted set", }, [ POWER6_PME_PM_LSU0_REJECT_DERAT_MPRED ] = { .pme_name = "PM_LSU0_REJECT_DERAT_MPRED", .pme_code = 0xa0082, .pme_short_desc = "LSU0 reject due to mispredicted DERAT", .pme_long_desc = "LSU0 reject due to mispredicted DERAT", }, [ POWER6_PME_PM_LSU1_REJECT_STQ_FULL ] = { .pme_name = "PM_LSU1_REJECT_STQ_FULL", .pme_code = 0xa0088, .pme_short_desc = "LSU1 reject due to store queue full", .pme_long_desc = "LSU1 reject due to store queue full", }, [ POWER6_PME_PM_MRK_BR_MPRED ] = { .pme_name = "PM_MRK_BR_MPRED", .pme_code = 0x300052, .pme_short_desc = "Marked branch mispredicted", .pme_long_desc = "Marked branch mispredicted", }, [ POWER6_PME_PM_L2SA_ST_MISS ] = { .pme_name = "PM_L2SA_ST_MISS", .pme_code = 0x50486, .pme_short_desc = "L2 slice A store misses", .pme_long_desc = "L2 slice A store misses", }, [ POWER6_PME_PM_LSU0_REJECT_EXTERN ] = { .pme_name = "PM_LSU0_REJECT_EXTERN", .pme_code = 0xa10a4, .pme_short_desc = "LSU0 external reject request ", .pme_long_desc = "LSU0 external reject request ", }, [ POWER6_PME_PM_MRK_BR_TAKEN ] = { .pme_name = "PM_MRK_BR_TAKEN", .pme_code = 
0x100052, .pme_short_desc = "Marked branch taken", .pme_long_desc = "Marked branch taken", }, [ POWER6_PME_PM_ISLB_MISS ] = { .pme_name = "PM_ISLB_MISS", .pme_code = 0x830e0, .pme_short_desc = "Instruction SLB misses", .pme_long_desc = "A SLB miss for an instruction fetch has occurred", }, [ POWER6_PME_PM_CYC ] = { .pme_name = "PM_CYC", .pme_code = 0x1e, .pme_short_desc = "Processor cycles", .pme_long_desc = "Processor cycles", }, [ POWER6_PME_PM_FPU_FXDIV ] = { .pme_name = "PM_FPU_FXDIV", .pme_code = 0x1c1034, .pme_short_desc = "FPU executed fixed point division", .pme_long_desc = "FPU executed fixed point division", }, [ POWER6_PME_PM_DPU_HELD_LLA_END ] = { .pme_name = "PM_DPU_HELD_LLA_END", .pme_code = 0x30084, .pme_short_desc = "DISP unit held due to load look ahead ended", .pme_long_desc = "DISP unit held due to load look ahead ended", }, [ POWER6_PME_PM_MEM0_DP_CL_WR_LOC ] = { .pme_name = "PM_MEM0_DP_CL_WR_LOC", .pme_code = 0x50286, .pme_short_desc = "cacheline write setting dp to local side 0", .pme_long_desc = "cacheline write setting dp to local side 0", }, [ POWER6_PME_PM_MRK_LSU_REJECT_ULD ] = { .pme_name = "PM_MRK_LSU_REJECT_ULD", .pme_code = 0x193034, .pme_short_desc = "Marked unaligned load reject", .pme_long_desc = "Marked unaligned load reject", }, [ POWER6_PME_PM_1PLUS_PPC_CMPL ] = { .pme_name = "PM_1PLUS_PPC_CMPL", .pme_code = 0x100004, .pme_short_desc = "One or more PPC instruction completed", .pme_long_desc = "A group containing at least one PPC instruction completed. 
For microcoded instructions that span multiple groups, this will only occur once.", }, [ POWER6_PME_PM_PTEG_FROM_DMEM ] = { .pme_name = "PM_PTEG_FROM_DMEM", .pme_code = 0x21304a, .pme_short_desc = "PTEG loaded from distant memory", .pme_long_desc = "PTEG loaded from distant memory", }, [ POWER6_PME_PM_DPU_WT_BR_MPRED_COUNT ] = { .pme_name = "PM_DPU_WT_BR_MPRED_COUNT", .pme_code = 0x40000d, .pme_short_desc = "Periods DISP unit is stalled due to branch misprediction", .pme_long_desc = "Periods DISP unit is stalled due to branch misprediction", }, [ POWER6_PME_PM_GCT_FULL_CYC ] = { .pme_name = "PM_GCT_FULL_CYC", .pme_code = 0x40086, .pme_short_desc = "Cycles GCT full", .pme_long_desc = "The ISU sends a signal indicating the gct is full. ", }, [ POWER6_PME_PM_INST_FROM_L25_SHR ] = { .pme_name = "PM_INST_FROM_L25_SHR", .pme_code = 0x442046, .pme_short_desc = "Instruction fetched from L2.5 shared", .pme_long_desc = "Instruction fetched from L2.5 shared", }, [ POWER6_PME_PM_MRK_DERAT_MISS_4K ] = { .pme_name = "PM_MRK_DERAT_MISS_4K", .pme_code = 0x292044, .pme_short_desc = "Marked DERAT misses for 4K page", .pme_long_desc = "A marked data request (load or store) missed the ERAT for 4K page and resulted in an ERAT reload.", }, [ POWER6_PME_PM_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_DC_PREF_STREAM_ALLOC", .pme_code = 0x810a2, .pme_short_desc = "D cache new prefetch stream allocated", .pme_long_desc = "A new Prefetch Stream was allocated", }, [ POWER6_PME_PM_FPU1_FIN ] = { .pme_name = "PM_FPU1_FIN", .pme_code = 0xd0088, .pme_short_desc = "FPU1 produced a result", .pme_long_desc = "fp1 finished, produced a result. This only indicates finish, not completion. ", }, [ POWER6_PME_PM_BR_MPRED_TA ] = { .pme_name = "PM_BR_MPRED_TA", .pme_code = 0x410ac, .pme_short_desc = "Branch mispredictions due to target address", .pme_long_desc = "branch miss predict due to a target address prediction. 
This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction.", }, [ POWER6_PME_PM_DPU_HELD_POWER ] = { .pme_name = "PM_DPU_HELD_POWER", .pme_code = 0x20003c, .pme_short_desc = "DISP unit held due to Power Management", .pme_long_desc = "DISP unit held due to Power Management", }, [ POWER6_PME_PM_RUN_INST_CMPL ] = { .pme_name = "PM_RUN_INST_CMPL", .pme_code = 0x500009, .pme_short_desc = "Run instructions completed", .pme_long_desc = "Number of run instructions completed. ", }, [ POWER6_PME_PM_GCT_EMPTY_CYC ] = { .pme_name = "PM_GCT_EMPTY_CYC", .pme_code = 0x1000f8, .pme_short_desc = "Cycles GCT empty", .pme_long_desc = "The Global Completion Table is completely empty", }, [ POWER6_PME_PM_LLA_COUNT ] = { .pme_name = "PM_LLA_COUNT", .pme_code = 0xc01f, .pme_short_desc = "Transitions into Load Look Ahead mode", .pme_long_desc = "Transitions into Load Look Ahead mode", }, [ POWER6_PME_PM_LSU0_REJECT_NO_SCRATCH ] = { .pme_name = "PM_LSU0_REJECT_NO_SCRATCH", .pme_code = 0xa10a2, .pme_short_desc = "LSU0 reject due to scratch register not available", .pme_long_desc = "LSU0 reject due to scratch register not available", }, [ POWER6_PME_PM_DPU_WT_IC_MISS ] = { .pme_name = "PM_DPU_WT_IC_MISS", .pme_code = 0x20000c, .pme_short_desc = "Cycles DISP unit is stalled due to I cache miss", .pme_long_desc = "Cycles DISP unit is stalled due to I cache miss", }, [ POWER6_PME_PM_DATA_FROM_L3MISS ] = { .pme_name = "PM_DATA_FROM_L3MISS", .pme_code = 0x3000fe, .pme_short_desc = "Data loaded from private L3 miss", .pme_long_desc = "Data loaded from private L3 miss", }, [ POWER6_PME_PM_FPU_FPSCR ] = { .pme_name = "PM_FPU_FPSCR", .pme_code = 0x2d0032, .pme_short_desc = "FPU executed FPSCR instruction", .pme_long_desc = "FPU executed FPSCR instruction", }, [ 
POWER6_PME_PM_VMX1_INST_ISSUED ] = { .pme_name = "PM_VMX1_INST_ISSUED", .pme_code = 0x60088, .pme_short_desc = "VMX1 instruction issued", .pme_long_desc = "VMX1 instruction issued", }, [ POWER6_PME_PM_FLUSH ] = { .pme_name = "PM_FLUSH", .pme_code = 0x100010, .pme_short_desc = "Flushes", .pme_long_desc = "Flushes", }, [ POWER6_PME_PM_ST_HIT_L2 ] = { .pme_name = "PM_ST_HIT_L2", .pme_code = 0x150732, .pme_short_desc = "L2 D cache store hits", .pme_long_desc = "L2 D cache store hits", }, [ POWER6_PME_PM_SYNC_CYC ] = { .pme_name = "PM_SYNC_CYC", .pme_code = 0x920cc, .pme_short_desc = "Sync duration", .pme_long_desc = "Sync duration", }, [ POWER6_PME_PM_FAB_SYS_PUMP ] = { .pme_name = "PM_FAB_SYS_PUMP", .pme_code = 0x50180, .pme_short_desc = "System pump operation, locally mastered", .pme_long_desc = "System pump operation, locally mastered", }, [ POWER6_PME_PM_IC_PREF_REQ ] = { .pme_name = "PM_IC_PREF_REQ", .pme_code = 0x4008c, .pme_short_desc = "Instruction prefetch requests", .pme_long_desc = "Asserted when a non-canceled prefetch is made to the cache interface unit (CIU).", }, [ POWER6_PME_PM_MEM0_DP_RQ_GLOB_LOC ] = { .pme_name = "PM_MEM0_DP_RQ_GLOB_LOC", .pme_code = 0x50280, .pme_short_desc = "Memory read queue marking cache line double pump state from global to local side 0", .pme_long_desc = "Memory read queue marking cache line double pump state from global to local side 0", }, [ POWER6_PME_PM_FPU_ISSUE_0 ] = { .pme_name = "PM_FPU_ISSUE_0", .pme_code = 0x320c6, .pme_short_desc = "FPU issue 0 per cycle", .pme_long_desc = "FPU issue 0 per cycle", }, [ POWER6_PME_PM_THRD_PRIO_2_CYC ] = { .pme_name = "PM_THRD_PRIO_2_CYC", .pme_code = 0x322040, .pme_short_desc = "Cycles thread running at priority level 2", .pme_long_desc = "Cycles thread running at priority level 2", }, [ POWER6_PME_PM_VMX_SIMPLE_ISSUED ] = { .pme_name = "PM_VMX_SIMPLE_ISSUED", .pme_code = 0x70082, .pme_short_desc = "VMX instruction issued to simple", .pme_long_desc = "VMX instruction issued to 
simple", }, [ POWER6_PME_PM_MRK_FPU1_FIN ] = { .pme_name = "PM_MRK_FPU1_FIN", .pme_code = 0xd008a, .pme_short_desc = "Marked instruction FPU1 processing finished", .pme_long_desc = "Marked instruction FPU1 processing finished", }, [ POWER6_PME_PM_DPU_HELD_CW ] = { .pme_name = "PM_DPU_HELD_CW", .pme_code = 0x20084, .pme_short_desc = "DISP unit held due to cache writes ", .pme_long_desc = "DISP unit held due to cache writes ", }, [ POWER6_PME_PM_L3SA_REF ] = { .pme_name = "PM_L3SA_REF", .pme_code = 0x50080, .pme_short_desc = "L3 slice A references", .pme_long_desc = "L3 slice A references", }, [ POWER6_PME_PM_STCX ] = { .pme_name = "PM_STCX", .pme_code = 0x830e6, .pme_short_desc = "STCX executed", .pme_long_desc = "STCX executed", }, [ POWER6_PME_PM_L2SB_MISS ] = { .pme_name = "PM_L2SB_MISS", .pme_code = 0x5058c, .pme_short_desc = "L2 slice B misses", .pme_long_desc = "L2 slice B misses", }, [ POWER6_PME_PM_LSU0_REJECT ] = { .pme_name = "PM_LSU0_REJECT", .pme_code = 0xa10a6, .pme_short_desc = "LSU0 reject", .pme_long_desc = "LSU0 reject", }, [ POWER6_PME_PM_TB_BIT_TRANS ] = { .pme_name = "PM_TB_BIT_TRANS", .pme_code = 0x100026, .pme_short_desc = "Time Base bit transition", .pme_long_desc = "When the selected time base bit (as specified in MMCR0[TBSEL])transitions from 0 to 1 ", }, [ POWER6_PME_PM_THERMAL_MAX ] = { .pme_name = "PM_THERMAL_MAX", .pme_code = 0x30002a, .pme_short_desc = "Processor in thermal MAX", .pme_long_desc = "Processor in thermal MAX", }, [ POWER6_PME_PM_FPU0_STF ] = { .pme_name = "PM_FPU0_STF", .pme_code = 0xc10a4, .pme_short_desc = "FPU0 executed store instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing a store instruction.", }, [ POWER6_PME_PM_FPU1_FMA ] = { .pme_name = "PM_FPU1_FMA", .pme_code = 0xc008a, .pme_short_desc = "FPU1 executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. 
This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER6_PME_PM_LSU1_REJECT_LHS ] = { .pme_name = "PM_LSU1_REJECT_LHS", .pme_code = 0x9008e, .pme_short_desc = "LSU1 load hit store reject", .pme_long_desc = "LSU1 load hit store reject", }, [ POWER6_PME_PM_DPU_HELD_INT ] = { .pme_name = "PM_DPU_HELD_INT", .pme_code = 0x310a8, .pme_short_desc = "DISP unit held due to exception", .pme_long_desc = "DISP unit held due to exception", }, [ POWER6_PME_PM_THRD_LLA_BOTH_CYC ] = { .pme_name = "PM_THRD_LLA_BOTH_CYC", .pme_code = 0x400008, .pme_short_desc = "Both threads in Load Look Ahead", .pme_long_desc = "Both threads in Load Look Ahead", }, [ POWER6_PME_PM_DPU_HELD_THERMAL_COUNT ] = { .pme_name = "PM_DPU_HELD_THERMAL_COUNT", .pme_code = 0x10002b, .pme_short_desc = "Periods DISP unit held due to thermal condition", .pme_long_desc = "Periods DISP unit held due to thermal condition", }, [ POWER6_PME_PM_PMC4_REWIND ] = { .pme_name = "PM_PMC4_REWIND", .pme_code = 0x100020, .pme_short_desc = "PMC4 rewind event", .pme_long_desc = "PMC4 rewind event", }, [ POWER6_PME_PM_DERAT_REF_16M ] = { .pme_name = "PM_DERAT_REF_16M", .pme_code = 0x382070, .pme_short_desc = "DERAT reference for 16M page", .pme_long_desc = "DERAT reference for 16M page", }, [ POWER6_PME_PM_FPU0_FCONV ] = { .pme_name = "PM_FPU0_FCONV", .pme_code = 0xd10a0, .pme_short_desc = "FPU0 executed FCONV instruction", .pme_long_desc = "FPU0 executed FCONV instruction", }, [ POWER6_PME_PM_L2SA_LD_REQ_DATA ] = { .pme_name = "PM_L2SA_LD_REQ_DATA", .pme_code = 0x50480, .pme_short_desc = "L2 slice A data load requests", .pme_long_desc = "L2 slice A data load requests", }, [ POWER6_PME_PM_DATA_FROM_MEM_DP ] = { .pme_name = "PM_DATA_FROM_MEM_DP", .pme_code = 0x10005e, .pme_short_desc = "Data loaded from double pump memory", .pme_long_desc = "Data loaded from double pump memory", }, [ POWER6_PME_PM_MRK_VMX_FLOAT_ISSUED ] = { .pme_name = "PM_MRK_VMX_FLOAT_ISSUED", .pme_code = 0x70088, 
.pme_short_desc = "Marked VMX instruction issued to float", .pme_long_desc = "Marked VMX instruction issued to float", }, [ POWER6_PME_PM_MRK_PTEG_FROM_L2MISS ] = { .pme_name = "PM_MRK_PTEG_FROM_L2MISS", .pme_code = 0x412054, .pme_short_desc = "Marked PTEG loaded from L2 miss", .pme_long_desc = "Marked PTEG loaded from L2 miss", }, [ POWER6_PME_PM_THRD_PRIO_DIFF_1or2_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_1or2_CYC", .pme_code = 0x223040, .pme_short_desc = "Cycles thread priority difference is 1 or 2", .pme_long_desc = "Cycles thread priority difference is 1 or 2", }, [ POWER6_PME_PM_VMX0_STALL ] = { .pme_name = "PM_VMX0_STALL", .pme_code = 0xb0084, .pme_short_desc = "VMX0 stall", .pme_long_desc = "VMX0 stall", }, [ POWER6_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BHT_REDIRECT", .pme_code = 0x420ca, .pme_short_desc = "L2 I cache demand request due to BHT redirect", .pme_long_desc = "L2 I cache demand request due to BHT redirect", }, [ POWER6_PME_PM_LSU_DERAT_MISS ] = { .pme_name = "PM_LSU_DERAT_MISS", .pme_code = 0x20000e, .pme_short_desc = "DERAT misses", .pme_long_desc = "Total DERAT Misses (Unit 0 + Unit 1). Requests that miss the Derat are rejected and retried until the request hits in the Erat. 
This may result in multiple erat misses for the same instruction.", }, [ POWER6_PME_PM_FPU0_SINGLE ] = { .pme_name = "PM_FPU0_SINGLE", .pme_code = 0xc10a6, .pme_short_desc = "FPU0 executed single precision instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing single precision instruction.", }, [ POWER6_PME_PM_FPU_ISSUE_STEERING ] = { .pme_name = "PM_FPU_ISSUE_STEERING", .pme_code = 0x320c4, .pme_short_desc = "FPU issue steering", .pme_long_desc = "FPU issue steering", }, [ POWER6_PME_PM_THRD_PRIO_1_CYC ] = { .pme_name = "PM_THRD_PRIO_1_CYC", .pme_code = 0x222040, .pme_short_desc = "Cycles thread running at priority level 1", .pme_long_desc = "Cycles thread running at priority level 1", }, [ POWER6_PME_PM_VMX_COMPLEX_ISSUED ] = { .pme_name = "PM_VMX_COMPLEX_ISSUED", .pme_code = 0x70084, .pme_short_desc = "VMX instruction issued to complex", .pme_long_desc = "VMX instruction issued to complex", }, [ POWER6_PME_PM_FPU_ISSUE_ST_FOLDED ] = { .pme_name = "PM_FPU_ISSUE_ST_FOLDED", .pme_code = 0x320c2, .pme_short_desc = "FPU issue a folded store", .pme_long_desc = "FPU issue a folded store", }, [ POWER6_PME_PM_DFU_FIN ] = { .pme_name = "PM_DFU_FIN", .pme_code = 0xe0080, .pme_short_desc = "DFU instruction finish", .pme_long_desc = "DFU instruction finish", }, [ POWER6_PME_PM_BR_PRED_CCACHE ] = { .pme_name = "PM_BR_PRED_CCACHE", .pme_code = 0x410a4, .pme_short_desc = "Branch count cache prediction", .pme_long_desc = "Branch count cache prediction", }, [ POWER6_PME_PM_MRK_ST_CMPL_INT ] = { .pme_name = "PM_MRK_ST_CMPL_INT", .pme_code = 0x300006, .pme_short_desc = "Marked store completed with intervention", .pme_long_desc = "A marked store previously sent to the memory subsystem completed (data home) after requiring intervention", }, [ POWER6_PME_PM_FAB_MMIO ] = { .pme_name = "PM_FAB_MMIO", .pme_code = 0x50186, .pme_short_desc = "MMIO operation, locally mastered", .pme_long_desc = "MMIO operation, locally mastered", }, [ 
POWER6_PME_PM_MRK_VMX_SIMPLE_ISSUED ] = { .pme_name = "PM_MRK_VMX_SIMPLE_ISSUED", .pme_code = 0x7008a, .pme_short_desc = "Marked VMX instruction issued to simple", .pme_long_desc = "Marked VMX instruction issued to simple", }, [ POWER6_PME_PM_FPU_STF ] = { .pme_name = "PM_FPU_STF", .pme_code = 0x3c1030, .pme_short_desc = "FPU executed store instruction", .pme_long_desc = "FPU is executing a store instruction. Combined Unit 0 + Unit 1", }, [ POWER6_PME_PM_MEM1_DP_CL_WR_GLOB ] = { .pme_name = "PM_MEM1_DP_CL_WR_GLOB", .pme_code = 0x5028c, .pme_short_desc = "cacheline write setting dp to global side 1", .pme_long_desc = "cacheline write setting dp to global side 1", }, [ POWER6_PME_PM_MRK_DATA_FROM_L3MISS ] = { .pme_name = "PM_MRK_DATA_FROM_L3MISS", .pme_code = 0x303028, .pme_short_desc = "Marked data loaded from L3 miss", .pme_long_desc = "Marked data loaded from L3 miss", }, [ POWER6_PME_PM_GCT_NOSLOT_CYC ] = { .pme_name = "PM_GCT_NOSLOT_CYC", .pme_code = 0x100008, .pme_short_desc = "Cycles no GCT slot allocated", .pme_long_desc = "Cycles this thread does not have any slots allocated in the GCT.", }, [ POWER6_PME_PM_L2_ST_REQ_DATA ] = { .pme_name = "PM_L2_ST_REQ_DATA", .pme_code = 0x250432, .pme_short_desc = "L2 data store requests", .pme_long_desc = "L2 data store requests", }, [ POWER6_PME_PM_INST_TABLEWALK_COUNT ] = { .pme_name = "PM_INST_TABLEWALK_COUNT", .pme_code = 0x920cb, .pme_short_desc = "Periods doing instruction tablewalks", .pme_long_desc = "Periods doing instruction tablewalks", }, [ POWER6_PME_PM_PTEG_FROM_L35_SHR ] = { .pme_name = "PM_PTEG_FROM_L35_SHR", .pme_code = 0x21304e, .pme_short_desc = "PTEG loaded from L3.5 shared", .pme_long_desc = "PTEG loaded from L3.5 shared", }, [ POWER6_PME_PM_DPU_HELD_ISYNC ] = { .pme_name = "PM_DPU_HELD_ISYNC", .pme_code = 0x2008a, .pme_short_desc = "DISP unit held due to ISYNC ", .pme_long_desc = "DISP unit held due to ISYNC ", }, [ POWER6_PME_PM_MRK_DATA_FROM_L25_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L25_SHR", 
.pme_code = 0x40304e, .pme_short_desc = "Marked data loaded from L2.5 shared", .pme_long_desc = "DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a marked demand load", }, [ POWER6_PME_PM_L3SA_HIT ] = { .pme_name = "PM_L3SA_HIT", .pme_code = 0x50082, .pme_short_desc = "L3 slice A hits", .pme_long_desc = "L3 slice A hits", }, [ POWER6_PME_PM_DERAT_MISS_16G ] = { .pme_name = "PM_DERAT_MISS_16G", .pme_code = 0x492070, .pme_short_desc = "DERAT misses for 16G page", .pme_long_desc = "A data request (load or store) missed the ERAT for 16G page and resulted in an ERAT reload.", }, [ POWER6_PME_PM_DATA_PTEG_2ND_HALF ] = { .pme_name = "PM_DATA_PTEG_2ND_HALF", .pme_code = 0x910a2, .pme_short_desc = "Data table walk matched in second half primary PTEG", .pme_long_desc = "Data table walk matched in second half primary PTEG", }, [ POWER6_PME_PM_L2SA_ST_REQ ] = { .pme_name = "PM_L2SA_ST_REQ", .pme_code = 0x50484, .pme_short_desc = "L2 slice A store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. 
The event is provided on each of the three slices A, B, and C.", }, [ POWER6_PME_PM_INST_FROM_LMEM ] = { .pme_name = "PM_INST_FROM_LMEM", .pme_code = 0x442042, .pme_short_desc = "Instruction fetched from local memory", .pme_long_desc = "Instruction fetched from local memory", }, [ POWER6_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BR_REDIRECT", .pme_code = 0x420cc, .pme_short_desc = "L2 I cache demand request due to branch redirect", .pme_long_desc = "L2 I cache demand request due to branch redirect", }, [ POWER6_PME_PM_PTEG_FROM_L2 ] = { .pme_name = "PM_PTEG_FROM_L2", .pme_code = 0x113048, .pme_short_desc = "PTEG loaded from L2", .pme_long_desc = "PTEG loaded from L2", }, [ POWER6_PME_PM_DATA_PTEG_1ST_HALF ] = { .pme_name = "PM_DATA_PTEG_1ST_HALF", .pme_code = 0x910a0, .pme_short_desc = "Data table walk matched in first half primary PTEG", .pme_long_desc = "Data table walk matched in first half primary PTEG", }, [ POWER6_PME_PM_BR_MPRED_COUNT ] = { .pme_name = "PM_BR_MPRED_COUNT", .pme_code = 0x410aa, .pme_short_desc = "Branch misprediction due to count prediction", .pme_long_desc = "Branch misprediction due to count prediction", }, [ POWER6_PME_PM_IERAT_MISS_4K ] = { .pme_name = "PM_IERAT_MISS_4K", .pme_code = 0x492076, .pme_short_desc = "IERAT misses for 4K page", .pme_long_desc = "IERAT misses for 4K page", }, [ POWER6_PME_PM_THRD_BOTH_RUN_COUNT ] = { .pme_name = "PM_THRD_BOTH_RUN_COUNT", .pme_code = 0x400005, .pme_short_desc = "Periods both threads in run cycles", .pme_long_desc = "Periods both threads in run cycles", }, [ POWER6_PME_PM_LSU_REJECT_ULD ] = { .pme_name = "PM_LSU_REJECT_ULD", .pme_code = 0x190030, .pme_short_desc = "Unaligned load reject", .pme_long_desc = "Unaligned load reject", }, [ POWER6_PME_PM_DATA_FROM_DL2L3_MOD_CYC ] = { .pme_name = "PM_DATA_FROM_DL2L3_MOD_CYC", .pme_code = 0x40002a, .pme_short_desc = "Load latency from distant L2 or L3 modified", .pme_long_desc = "Load latency from distant L2 or L3 modified", }, [ 
POWER6_PME_PM_MRK_PTEG_FROM_RL2L3_MOD ] = { .pme_name = "PM_MRK_PTEG_FROM_RL2L3_MOD", .pme_code = 0x112044, .pme_short_desc = "Marked PTEG loaded from remote L2 or L3 modified", .pme_long_desc = "Marked PTEG loaded from remote L2 or L3 modified", }, [ POWER6_PME_PM_FPU0_FLOP ] = { .pme_name = "PM_FPU0_FLOP", .pme_code = 0xc0086, .pme_short_desc = "FPU0 executed 1FLOP, FMA, FSQRT or FDIV instruction", .pme_long_desc = "FPU0 executed 1FLOP, FMA, FSQRT or FDIV instruction", }, [ POWER6_PME_PM_FPU0_FEST ] = { .pme_name = "PM_FPU0_FEST", .pme_code = 0xd10a6, .pme_short_desc = "FPU0 executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. ", }, [ POWER6_PME_PM_MRK_LSU0_REJECT_LHS ] = { .pme_name = "PM_MRK_LSU0_REJECT_LHS", .pme_code = 0x930e6, .pme_short_desc = "LSU0 marked load hit store reject", .pme_long_desc = "LSU0 marked load hit store reject", }, [ POWER6_PME_PM_VMX_RESULT_SAT_1 ] = { .pme_name = "PM_VMX_RESULT_SAT_1", .pme_code = 0xb0086, .pme_short_desc = "VMX valid result with sat=1", .pme_long_desc = "VMX valid result with sat=1", }, [ POWER6_PME_PM_NO_ITAG_CYC ] = { .pme_name = "PM_NO_ITAG_CYC", .pme_code = 0x40088, .pme_short_desc = "Cycles no ITAG available", .pme_long_desc = "Cycles no ITAG available", }, [ POWER6_PME_PM_LSU1_REJECT_NO_SCRATCH ] = { .pme_name = "PM_LSU1_REJECT_NO_SCRATCH", .pme_code = 0xa10aa, .pme_short_desc = "LSU1 reject due to scratch register not available", .pme_long_desc = "LSU1 reject due to scratch register not available", }, [ POWER6_PME_PM_0INST_FETCH ] = { .pme_name = "PM_0INST_FETCH", .pme_code = 0x40080, .pme_short_desc = "No instructions fetched", .pme_long_desc = "No instructions were fetched this cycle (due to IFU hold, redirect, or icache miss)", }, [ POWER6_PME_PM_DPU_WT_BR_MPRED ] = { .pme_name = "PM_DPU_WT_BR_MPRED", .pme_code = 0x40000c, .pme_short_desc = "Cycles DISP unit is 
stalled due to branch misprediction", .pme_long_desc = "Cycles DISP unit is stalled due to branch misprediction", }, [ POWER6_PME_PM_L1_PREF ] = { .pme_name = "PM_L1_PREF", .pme_code = 0x810a4, .pme_short_desc = "L1 cache data prefetches", .pme_long_desc = "A request to prefetch data into the L1 was made", }, [ POWER6_PME_PM_VMX_FLOAT_MULTICYCLE ] = { .pme_name = "PM_VMX_FLOAT_MULTICYCLE", .pme_code = 0xb0082, .pme_short_desc = "VMX multi-cycle floating point instruction issued", .pme_long_desc = "VMX multi-cycle floating point instruction issued", }, [ POWER6_PME_PM_DATA_FROM_L25_SHR_CYC ] = { .pme_name = "PM_DATA_FROM_L25_SHR_CYC", .pme_code = 0x200024, .pme_short_desc = "Load latency from L2.5 shared", .pme_long_desc = "Load latency from L2.5 shared", }, [ POWER6_PME_PM_DATA_FROM_L3 ] = { .pme_name = "PM_DATA_FROM_L3", .pme_code = 0x300058, .pme_short_desc = "Data loaded from L3", .pme_long_desc = "DL1 was reloaded from the local L3 due to a demand load", }, [ POWER6_PME_PM_PMC2_OVERFLOW ] = { .pme_name = "PM_PMC2_OVERFLOW", .pme_code = 0x300014, .pme_short_desc = "PMC2 Overflow", .pme_long_desc = "PMC2 Overflow", }, [ POWER6_PME_PM_VMX0_LD_WRBACK ] = { .pme_name = "PM_VMX0_LD_WRBACK", .pme_code = 0x60084, .pme_short_desc = "VMX0 load writeback valid", .pme_long_desc = "VMX0 load writeback valid", }, [ POWER6_PME_PM_FPU0_DENORM ] = { .pme_name = "PM_FPU0_DENORM", .pme_code = 0xc10a2, .pme_short_desc = "FPU0 received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized.", }, [ POWER6_PME_PM_INST_FETCH_CYC ] = { .pme_name = "PM_INST_FETCH_CYC", .pme_code = 0x420c8, .pme_short_desc = "Cycles at least 1 instruction fetched", .pme_long_desc = "Asserted each cycle when the IFU sends at least one instruction to the IDU. 
", }, [ POWER6_PME_PM_LSU_LDF ] = { .pme_name = "PM_LSU_LDF", .pme_code = 0x280032, .pme_short_desc = "LSU executed Floating Point load instruction", .pme_long_desc = "LSU executed Floating Point load instruction", }, [ POWER6_PME_PM_LSU_REJECT_L2_CORR ] = { .pme_name = "PM_LSU_REJECT_L2_CORR", .pme_code = 0x1a1034, .pme_short_desc = "LSU reject due to L2 correctable error", .pme_long_desc = "LSU reject due to L2 correctable error", }, [ POWER6_PME_PM_DERAT_REF_64K ] = { .pme_name = "PM_DERAT_REF_64K", .pme_code = 0x282070, .pme_short_desc = "DERAT reference for 64K page", .pme_long_desc = "DERAT reference for 64K page", }, [ POWER6_PME_PM_THRD_PRIO_3_CYC ] = { .pme_name = "PM_THRD_PRIO_3_CYC", .pme_code = 0x422040, .pme_short_desc = "Cycles thread running at priority level 3", .pme_long_desc = "Cycles thread running at priority level 3", }, [ POWER6_PME_PM_FPU_FMA ] = { .pme_name = "PM_FPU_FMA", .pme_code = 0x2c0030, .pme_short_desc = "FPU executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when FPU is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
Combined Unit 0 + Unit 1", }, [ POWER6_PME_PM_INST_FROM_L35_MOD ] = { .pme_name = "PM_INST_FROM_L35_MOD", .pme_code = 0x142046, .pme_short_desc = "Instruction fetched from L3.5 modified", .pme_long_desc = "Instruction fetched from L3.5 modified", }, [ POWER6_PME_PM_DFU_CONV ] = { .pme_name = "PM_DFU_CONV", .pme_code = 0xe008e, .pme_short_desc = "DFU convert from fixed op", .pme_long_desc = "DFU convert from fixed op", }, [ POWER6_PME_PM_INST_FROM_L25_MOD ] = { .pme_name = "PM_INST_FROM_L25_MOD", .pme_code = 0x342046, .pme_short_desc = "Instruction fetched from L2.5 modified", .pme_long_desc = "Instruction fetched from L2.5 modified", }, [ POWER6_PME_PM_PTEG_FROM_L35_MOD ] = { .pme_name = "PM_PTEG_FROM_L35_MOD", .pme_code = 0x11304e, .pme_short_desc = "PTEG loaded from L3.5 modified", .pme_long_desc = "PTEG loaded from L3.5 modified", }, [ POWER6_PME_PM_MRK_VMX_ST_ISSUED ] = { .pme_name = "PM_MRK_VMX_ST_ISSUED", .pme_code = 0xb0088, .pme_short_desc = "Marked VMX store issued", .pme_long_desc = "Marked VMX store issued", }, [ POWER6_PME_PM_VMX_FLOAT_ISSUED ] = { .pme_name = "PM_VMX_FLOAT_ISSUED", .pme_code = 0x70080, .pme_short_desc = "VMX instruction issued to float", .pme_long_desc = "VMX instruction issued to float", }, [ POWER6_PME_PM_LSU0_REJECT_L2_CORR ] = { .pme_name = "PM_LSU0_REJECT_L2_CORR", .pme_code = 0xa10a0, .pme_short_desc = "LSU0 reject due to L2 correctable error", .pme_long_desc = "LSU0 reject due to L2 correctable error", }, [ POWER6_PME_PM_THRD_L2MISS ] = { .pme_name = "PM_THRD_L2MISS", .pme_code = 0x310a0, .pme_short_desc = "Thread in L2 miss", .pme_long_desc = "Thread in L2 miss", }, [ POWER6_PME_PM_FPU_FCONV ] = { .pme_name = "PM_FPU_FCONV", .pme_code = 0x1d1034, .pme_short_desc = "FPU executed FCONV instruction", .pme_long_desc = "FPU executed FCONV instruction", }, [ POWER6_PME_PM_FPU_FXMULT ] = { .pme_name = "PM_FPU_FXMULT", .pme_code = 0x1d0032, .pme_short_desc = "FPU executed fixed point multiplication", .pme_long_desc = "FPU executed 
fixed point multiplication", }, [ POWER6_PME_PM_FPU1_FRSP ] = { .pme_name = "PM_FPU1_FRSP", .pme_code = 0xd10aa, .pme_short_desc = "FPU1 executed FRSP instruction", .pme_long_desc = "FPU1 executed FRSP instruction", }, [ POWER6_PME_PM_MRK_DERAT_REF_16M ] = { .pme_name = "PM_MRK_DERAT_REF_16M", .pme_code = 0x382044, .pme_short_desc = "Marked DERAT reference for 16M page", .pme_long_desc = "Marked DERAT reference for 16M page", }, [ POWER6_PME_PM_L2SB_CASTOUT_SHR ] = { .pme_name = "PM_L2SB_CASTOUT_SHR", .pme_code = 0x5068a, .pme_short_desc = "L2 slice B castouts - Shared", .pme_long_desc = "L2 slice B castouts - Shared", }, [ POWER6_PME_PM_THRD_ONE_RUN_COUNT ] = { .pme_name = "PM_THRD_ONE_RUN_COUNT", .pme_code = 0x1000fb, .pme_short_desc = "Periods one of the threads in run cycles", .pme_long_desc = "Periods one of the threads in run cycles", }, [ POWER6_PME_PM_INST_FROM_RMEM ] = { .pme_name = "PM_INST_FROM_RMEM", .pme_code = 0x342042, .pme_short_desc = "Instruction fetched from remote memory", .pme_long_desc = "Instruction fetched from remote memory", }, [ POWER6_PME_PM_LSU_BOTH_BUS ] = { .pme_name = "PM_LSU_BOTH_BUS", .pme_code = 0x810aa, .pme_short_desc = "Both data return buses busy simultaneously", .pme_long_desc = "Both data return buses busy simultaneously", }, [ POWER6_PME_PM_FPU1_FSQRT_FDIV ] = { .pme_name = "PM_FPU1_FSQRT_FDIV", .pme_code = 0xc008c, .pme_short_desc = "FPU1 executed FSQRT or FDIV instruction", .pme_long_desc = "FPU1 executed FSQRT or FDIV instruction", }, [ POWER6_PME_PM_L2_LD_REQ_INST ] = { .pme_name = "PM_L2_LD_REQ_INST", .pme_code = 0x150530, .pme_short_desc = "L2 instruction load requests", .pme_long_desc = "L2 instruction load requests", }, [ POWER6_PME_PM_MRK_PTEG_FROM_L35_SHR ] = { .pme_name = "PM_MRK_PTEG_FROM_L35_SHR", .pme_code = 0x212046, .pme_short_desc = "Marked PTEG loaded from L3.5 shared", .pme_long_desc = "Marked PTEG loaded from L3.5 shared", }, [ POWER6_PME_PM_BR_PRED_CR ] = { .pme_name = "PM_BR_PRED_CR", .pme_code = 
0x410a2, .pme_short_desc = "A conditional branch was predicted, CR prediction", .pme_long_desc = "A conditional branch was predicted, CR prediction", }, [ POWER6_PME_PM_MRK_LSU0_REJECT_ULD ] = { .pme_name = "PM_MRK_LSU0_REJECT_ULD", .pme_code = 0x930e0, .pme_short_desc = "LSU0 marked unaligned load reject", .pme_long_desc = "LSU0 marked unaligned load reject", }, [ POWER6_PME_PM_LSU_REJECT ] = { .pme_name = "PM_LSU_REJECT", .pme_code = 0x4a1030, .pme_short_desc = "LSU reject", .pme_long_desc = "LSU reject", }, [ POWER6_PME_PM_LSU_REJECT_LHS_BOTH ] = { .pme_name = "PM_LSU_REJECT_LHS_BOTH", .pme_code = 0x290038, .pme_short_desc = "Load hit store reject both units", .pme_long_desc = "Load hit store reject both units", }, [ POWER6_PME_PM_GXO_ADDR_CYC_BUSY ] = { .pme_name = "PM_GXO_ADDR_CYC_BUSY", .pme_code = 0x50382, .pme_short_desc = "Outbound GX address utilization (# of cycles address out is valid)", .pme_long_desc = "Outbound GX address utilization (# of cycles address out is valid)", }, [ POWER6_PME_PM_LSU_SRQ_EMPTY_COUNT ] = { .pme_name = "PM_LSU_SRQ_EMPTY_COUNT", .pme_code = 0x40001d, .pme_short_desc = "Periods SRQ empty", .pme_long_desc = "The Store Request Queue is empty", }, [ POWER6_PME_PM_PTEG_FROM_L3 ] = { .pme_name = "PM_PTEG_FROM_L3", .pme_code = 0x313048, .pme_short_desc = "PTEG loaded from L3", .pme_long_desc = "PTEG loaded from L3", }, [ POWER6_PME_PM_VMX0_LD_ISSUED ] = { .pme_name = "PM_VMX0_LD_ISSUED", .pme_code = 0x60082, .pme_short_desc = "VMX0 load issued", .pme_long_desc = "VMX0 load issued", }, [ POWER6_PME_PM_FXU_PIPELINED_MULT_DIV ] = { .pme_name = "PM_FXU_PIPELINED_MULT_DIV", .pme_code = 0x210ae, .pme_short_desc = "Fix point multiply/divide pipelined", .pme_long_desc = "Fix point multiply/divide pipelined", }, [ POWER6_PME_PM_FPU1_STF ] = { .pme_name = "PM_FPU1_STF", .pme_code = 0xc10ac, .pme_short_desc = "FPU1 executed store instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing a store instruction.", }, [ 
POWER6_PME_PM_DFU_ADD ] = { .pme_name = "PM_DFU_ADD", .pme_code = 0xe008c, .pme_short_desc = "DFU add type instruction", .pme_long_desc = "DFU add type instruction", }, [ POWER6_PME_PM_MEM_DP_CL_WR_GLOB ] = { .pme_name = "PM_MEM_DP_CL_WR_GLOB", .pme_code = 0x250232, .pme_short_desc = "cache line write setting double pump state to global", .pme_long_desc = "cache line write setting double pump state to global", }, [ POWER6_PME_PM_MRK_LSU1_REJECT_ULD ] = { .pme_name = "PM_MRK_LSU1_REJECT_ULD", .pme_code = 0x930e8, .pme_short_desc = "LSU1 marked unaligned load reject", .pme_long_desc = "LSU1 marked unaligned load reject", }, [ POWER6_PME_PM_ITLB_REF ] = { .pme_name = "PM_ITLB_REF", .pme_code = 0x920c2, .pme_short_desc = "Instruction TLB reference", .pme_long_desc = "Instruction TLB reference", }, [ POWER6_PME_PM_LSU0_REJECT_L2MISS ] = { .pme_name = "PM_LSU0_REJECT_L2MISS", .pme_code = 0x90084, .pme_short_desc = "LSU0 L2 miss reject", .pme_long_desc = "LSU0 L2 miss reject", }, [ POWER6_PME_PM_DATA_FROM_L35_SHR ] = { .pme_name = "PM_DATA_FROM_L35_SHR", .pme_code = 0x20005a, .pme_short_desc = "Data loaded from L3.5 shared", .pme_long_desc = "Data loaded from L3.5 shared", }, [ POWER6_PME_PM_MRK_DATA_FROM_RL2L3_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_MOD", .pme_code = 0x10304c, .pme_short_desc = "Marked data loaded from remote L2 or L3 modified", .pme_long_desc = "Marked data loaded from remote L2 or L3 modified", }, [ POWER6_PME_PM_FPU0_FPSCR ] = { .pme_name = "PM_FPU0_FPSCR", .pme_code = 0xd0084, .pme_short_desc = "FPU0 executed FPSCR instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*. 
mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs", }, [ POWER6_PME_PM_DATA_FROM_L2 ] = { .pme_name = "PM_DATA_FROM_L2", .pme_code = 0x100058, .pme_short_desc = "Data loaded from L2", .pme_long_desc = "DL1 was reloaded from the local L2 due to a demand load", }, [ POWER6_PME_PM_DPU_HELD_XER ] = { .pme_name = "PM_DPU_HELD_XER", .pme_code = 0x20088, .pme_short_desc = "DISP unit held due to XER dependency", .pme_long_desc = "DISP unit held due to XER dependency", }, [ POWER6_PME_PM_FAB_NODE_PUMP ] = { .pme_name = "PM_FAB_NODE_PUMP", .pme_code = 0x50188, .pme_short_desc = "Node pump operation, locally mastered", .pme_long_desc = "Node pump operation, locally mastered", }, [ POWER6_PME_PM_VMX_RESULT_SAT_0_1 ] = { .pme_name = "PM_VMX_RESULT_SAT_0_1", .pme_code = 0xb008e, .pme_short_desc = "VMX valid result with sat bit is set (0->1)", .pme_long_desc = "VMX valid result with sat bit is set (0->1)", }, [ POWER6_PME_PM_LD_REF_L1 ] = { .pme_name = "PM_LD_REF_L1", .pme_code = 0x80082, .pme_short_desc = "L1 D cache load references", .pme_long_desc = "Total DL1 Load references", }, [ POWER6_PME_PM_TLB_REF ] = { .pme_name = "PM_TLB_REF", .pme_code = 0x920c8, .pme_short_desc = "TLB reference", .pme_long_desc = "TLB reference", }, [ POWER6_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { .pme_name = "PM_DC_PREF_OUT_OF_STREAMS", .pme_code = 0x810a0, .pme_short_desc = "D cache out of streams", .pme_long_desc = "out of streams", }, [ POWER6_PME_PM_FLUSH_FPU ] = { .pme_name = "PM_FLUSH_FPU", .pme_code = 0x230ec, .pme_short_desc = "Flush caused by FPU exception", .pme_long_desc = "Flush caused by FPU exception", }, [ POWER6_PME_PM_MEM1_DP_CL_WR_LOC ] = { .pme_name = "PM_MEM1_DP_CL_WR_LOC", .pme_code = 0x5028e, .pme_short_desc = "cacheline write setting dp to local side 1", .pme_long_desc = "cacheline write setting dp to local side 1", }, [ POWER6_PME_PM_L2SB_LD_HIT ] = { .pme_name = "PM_L2SB_LD_HIT", .pme_code = 0x5078a, .pme_short_desc = "L2 slice B load hits", .pme_long_desc = "L2 
slice B load hits", }, [ POWER6_PME_PM_FAB_DCLAIM ] = { .pme_name = "PM_FAB_DCLAIM", .pme_code = 0x50184, .pme_short_desc = "Dclaim operation, locally mastered", .pme_long_desc = "Dclaim operation, locally mastered", }, [ POWER6_PME_PM_MEM_DP_CL_WR_LOC ] = { .pme_name = "PM_MEM_DP_CL_WR_LOC", .pme_code = 0x150232, .pme_short_desc = "cache line write setting double pump state to local", .pme_long_desc = "cache line write setting double pump state to local", }, [ POWER6_PME_PM_BR_MPRED_CR ] = { .pme_name = "PM_BR_MPRED_CR", .pme_code = 0x410a8, .pme_short_desc = "Branch mispredictions due to CR bit setting", .pme_long_desc = "This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction.", }, [ POWER6_PME_PM_LSU_REJECT_EXTERN ] = { .pme_name = "PM_LSU_REJECT_EXTERN", .pme_code = 0x3a1030, .pme_short_desc = "LSU external reject request ", .pme_long_desc = "LSU external reject request ", }, [ POWER6_PME_PM_DATA_FROM_RL2L3_MOD ] = { .pme_name = "PM_DATA_FROM_RL2L3_MOD", .pme_code = 0x10005c, .pme_short_desc = "Data loaded from remote L2 or L3 modified", .pme_long_desc = "Data loaded from remote L2 or L3 modified", }, [ POWER6_PME_PM_DPU_HELD_RU_WQ ] = { .pme_name = "PM_DPU_HELD_RU_WQ", .pme_code = 0x2008e, .pme_short_desc = "DISP unit held due to RU FXU write queue full", .pme_long_desc = "DISP unit held due to RU FXU write queue full", }, [ POWER6_PME_PM_LD_MISS_L1 ] = { .pme_name = "PM_LD_MISS_L1", .pme_code = 0x80080, .pme_short_desc = "L1 D cache load misses", .pme_long_desc = "Total DL1 Load references that miss the DL1", }, [ POWER6_PME_PM_DC_INV_L2 ] = { .pme_name = "PM_DC_INV_L2", .pme_code = 0x150632, .pme_short_desc = "L1 D cache entries invalidated from L2", .pme_long_desc = "A dcache invalidate was received from the L2 because a line in
L2 was cast out.", }, [ POWER6_PME_PM_MRK_PTEG_FROM_RMEM ] = { .pme_name = "PM_MRK_PTEG_FROM_RMEM", .pme_code = 0x312042, .pme_short_desc = "Marked PTEG loaded from remote memory", .pme_long_desc = "Marked PTEG loaded from remote memory", }, [ POWER6_PME_PM_FPU_FIN ] = { .pme_name = "PM_FPU_FIN", .pme_code = 0x1d0030, .pme_short_desc = "FPU produced a result", .pme_long_desc = "FPU finished, produced a result. This only indicates finish, not completion. Combined Unit 0 + Unit 1", }, [ POWER6_PME_PM_FXU0_FIN ] = { .pme_name = "PM_FXU0_FIN", .pme_code = 0x300016, .pme_short_desc = "FXU0 produced a result", .pme_long_desc = "The Fixed Point unit 0 finished an instruction and produced a result", }, [ POWER6_PME_PM_DPU_HELD_FPQ ] = { .pme_name = "PM_DPU_HELD_FPQ", .pme_code = 0x20086, .pme_short_desc = "DISP unit held due to FPU issue queue full", .pme_long_desc = "DISP unit held due to FPU issue queue full", }, [ POWER6_PME_PM_GX_DMA_READ ] = { .pme_name = "PM_GX_DMA_READ", .pme_code = 0x5038c, .pme_short_desc = "DMA Read Request", .pme_long_desc = "DMA Read Request", }, [ POWER6_PME_PM_LSU1_REJECT_PARTIAL_SECTOR ] = { .pme_name = "PM_LSU1_REJECT_PARTIAL_SECTOR", .pme_code = 0xa008e, .pme_short_desc = "LSU1 reject due to partial sector valid", .pme_long_desc = "LSU1 reject due to partial sector valid", }, [ POWER6_PME_PM_0INST_FETCH_COUNT ] = { .pme_name = "PM_0INST_FETCH_COUNT", .pme_code = 0x40081, .pme_short_desc = "Periods with no instructions fetched", .pme_long_desc = "No instructions were fetched during these periods (due to IFU hold, redirect, or icache miss)", }, [ POWER6_PME_PM_PMC5_OVERFLOW ] = { .pme_name = "PM_PMC5_OVERFLOW", .pme_code = 0x100024, .pme_short_desc = "PMC5 Overflow", .pme_long_desc = "PMC5 Overflow", }, [ POWER6_PME_PM_L2SB_LD_REQ ] = { .pme_name = "PM_L2SB_LD_REQ", .pme_code = 0x50788, .pme_short_desc = "L2 slice B load requests ", .pme_long_desc = "L2 slice B load requests ", }, [ POWER6_PME_PM_THRD_PRIO_DIFF_0_CYC ] = { .pme_name =
"PM_THRD_PRIO_DIFF_0_CYC", .pme_code = 0x123040, .pme_short_desc = "Cycles no thread priority difference", .pme_long_desc = "Cycles no thread priority difference", }, [ POWER6_PME_PM_DATA_FROM_RMEM ] = { .pme_name = "PM_DATA_FROM_RMEM", .pme_code = 0x30005e, .pme_short_desc = "Data loaded from remote memory", .pme_long_desc = "Data loaded from remote memory", }, [ POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_BOTH_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_BOTH_CYC", .pme_code = 0x30001c, .pme_short_desc = "Cycles both threads LMQ and SRQ empty", .pme_long_desc = "Cycles both threads LMQ and SRQ empty", }, [ POWER6_PME_PM_ST_REF_L1_BOTH ] = { .pme_name = "PM_ST_REF_L1_BOTH", .pme_code = 0x280038, .pme_short_desc = "Both units L1 D cache store reference", .pme_long_desc = "Both units L1 D cache store reference", }, [ POWER6_PME_PM_VMX_PERMUTE_ISSUED ] = { .pme_name = "PM_VMX_PERMUTE_ISSUED", .pme_code = 0x70086, .pme_short_desc = "VMX instruction issued to permute", .pme_long_desc = "VMX instruction issued to permute", }, [ POWER6_PME_PM_BR_TAKEN ] = { .pme_name = "PM_BR_TAKEN", .pme_code = 0x200052, .pme_short_desc = "Branches taken", .pme_long_desc = "Branches taken", }, [ POWER6_PME_PM_FAB_DMA ] = { .pme_name = "PM_FAB_DMA", .pme_code = 0x5018c, .pme_short_desc = "DMA operation, locally mastered", .pme_long_desc = "DMA operation, locally mastered", }, [ POWER6_PME_PM_GCT_EMPTY_COUNT ] = { .pme_name = "PM_GCT_EMPTY_COUNT", .pme_code = 0x200009, .pme_short_desc = "Periods GCT empty", .pme_long_desc = "The Global Completion Table is completely empty.", }, [ POWER6_PME_PM_FPU1_SINGLE ] = { .pme_name = "PM_FPU1_SINGLE", .pme_code = 0xc10ae, .pme_short_desc = "FPU1 executed single precision instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing single precision instruction.", }, [ POWER6_PME_PM_L2SA_CASTOUT_SHR ] = { .pme_name = "PM_L2SA_CASTOUT_SHR", .pme_code = 0x50682, .pme_short_desc = "L2 slice A castouts - Shared", .pme_long_desc = "L2 slice 
A castouts - Shared", }, [ POWER6_PME_PM_L3SB_REF ] = { .pme_name = "PM_L3SB_REF", .pme_code = 0x50088, .pme_short_desc = "L3 slice B references", .pme_long_desc = "L3 slice B references", }, [ POWER6_PME_PM_FPU0_FRSP ] = { .pme_name = "PM_FPU0_FRSP", .pme_code = 0xd10a2, .pme_short_desc = "FPU0 executed FRSP instruction", .pme_long_desc = "FPU0 executed FRSP instruction", }, [ POWER6_PME_PM_PMC4_SAVED ] = { .pme_name = "PM_PMC4_SAVED", .pme_code = 0x300022, .pme_short_desc = "PMC4 rewind value saved", .pme_long_desc = "PMC4 rewind value saved", }, [ POWER6_PME_PM_L2SA_DC_INV ] = { .pme_name = "PM_L2SA_DC_INV", .pme_code = 0x50686, .pme_short_desc = "L2 slice A D cache invalidate", .pme_long_desc = "L2 slice A D cache invalidate", }, [ POWER6_PME_PM_GXI_ADDR_CYC_BUSY ] = { .pme_name = "PM_GXI_ADDR_CYC_BUSY", .pme_code = 0x50388, .pme_short_desc = "Inbound GX address utilization (# of cycle address is in valid)", .pme_long_desc = "Inbound GX address utilization (# of cycle address is in valid)", }, [ POWER6_PME_PM_FPU0_FMA ] = { .pme_name = "PM_FPU0_FMA", .pme_code = 0xc0082, .pme_short_desc = "FPU0 executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. 
This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ POWER6_PME_PM_SLB_MISS ] = { .pme_name = "PM_SLB_MISS", .pme_code = 0x183034, .pme_short_desc = "SLB misses", .pme_long_desc = "SLB misses", }, [ POWER6_PME_PM_MRK_ST_GPS ] = { .pme_name = "PM_MRK_ST_GPS", .pme_code = 0x200006, .pme_short_desc = "Marked store sent to GPS", .pme_long_desc = "A sampled store has been sent to the memory subsystem", }, [ POWER6_PME_PM_DERAT_REF_4K ] = { .pme_name = "PM_DERAT_REF_4K", .pme_code = 0x182070, .pme_short_desc = "DERAT reference for 4K page", .pme_long_desc = "DERAT reference for 4K page", }, [ POWER6_PME_PM_L2_CASTOUT_SHR ] = { .pme_name = "PM_L2_CASTOUT_SHR", .pme_code = 0x250630, .pme_short_desc = "L2 castouts - Shared (T, Te, Si, S)", .pme_long_desc = "L2 castouts - Shared (T, Te, Si, S)", }, [ POWER6_PME_PM_DPU_HELD_STCX_CR ] = { .pme_name = "PM_DPU_HELD_STCX_CR", .pme_code = 0x2008c, .pme_short_desc = "DISP unit held due to STCX updating CR ", .pme_long_desc = "DISP unit held due to STCX updating CR ", }, [ POWER6_PME_PM_FPU0_ST_FOLDED ] = { .pme_name = "PM_FPU0_ST_FOLDED", .pme_code = 0xd10a4, .pme_short_desc = "FPU0 folded store", .pme_long_desc = "FPU0 folded store", }, [ POWER6_PME_PM_MRK_DATA_FROM_L21 ] = { .pme_name = "PM_MRK_DATA_FROM_L21", .pme_code = 0x203048, .pme_short_desc = "Marked data loaded from private L2 other core", .pme_long_desc = "Marked data loaded from private L2 other core", }, [ POWER6_PME_PM_THRD_PRIO_DIFF_minus3or4_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus3or4_CYC", .pme_code = 0x323046, .pme_short_desc = "Cycles thread priority difference is -3 or -4", .pme_long_desc = "Cycles thread priority difference is -3 or -4", }, [ POWER6_PME_PM_DATA_FROM_L35_MOD ] = { .pme_name = "PM_DATA_FROM_L35_MOD", .pme_code = 0x10005a, .pme_short_desc = "Data loaded from L3.5 modified", .pme_long_desc = "Data loaded from L3.5 modified", }, [ POWER6_PME_PM_DATA_FROM_DL2L3_SHR ] = { .pme_name = 
"PM_DATA_FROM_DL2L3_SHR", .pme_code = 0x30005c, .pme_short_desc = "Data loaded from distant L2 or L3 shared", .pme_long_desc = "Data loaded from distant L2 or L3 shared", }, [ POWER6_PME_PM_GXI_DATA_CYC_BUSY ] = { .pme_name = "PM_GXI_DATA_CYC_BUSY", .pme_code = 0x5038a, .pme_short_desc = "Inbound GX Data utilization (# of cycle data in is valid)", .pme_long_desc = "Inbound GX Data utilization (# of cycle data in is valid)", }, [ POWER6_PME_PM_LSU_REJECT_STEAL ] = { .pme_name = "PM_LSU_REJECT_STEAL", .pme_code = 0x9008c, .pme_short_desc = "LSU reject due to steal", .pme_long_desc = "LSU reject due to steal", }, [ POWER6_PME_PM_ST_FIN ] = { .pme_name = "PM_ST_FIN", .pme_code = 0x100054, .pme_short_desc = "Store instructions finished", .pme_long_desc = "Store instructions finished", }, [ POWER6_PME_PM_DPU_HELD_CR_LOGICAL ] = { .pme_name = "PM_DPU_HELD_CR_LOGICAL", .pme_code = 0x3008e, .pme_short_desc = "DISP unit held due to CR, LR or CTR updated by CR logical, MTCRF, MTLR or MTCTR", .pme_long_desc = "DISP unit held due to CR, LR or CTR updated by CR logical, MTCRF, MTLR or MTCTR", }, [ POWER6_PME_PM_THRD_SEL_T0 ] = { .pme_name = "PM_THRD_SEL_T0", .pme_code = 0x310a6, .pme_short_desc = "Decode selected thread 0", .pme_long_desc = "Decode selected thread 0", }, [ POWER6_PME_PM_PTEG_RELOAD_VALID ] = { .pme_name = "PM_PTEG_RELOAD_VALID", .pme_code = 0x130e8, .pme_short_desc = "TLB reload valid", .pme_long_desc = "TLB reload valid", }, [ POWER6_PME_PM_L2_PREF_ST ] = { .pme_name = "PM_L2_PREF_ST", .pme_code = 0x810a8, .pme_short_desc = "L2 cache prefetches", .pme_long_desc = "L2 cache prefetches", }, [ POWER6_PME_PM_MRK_STCX_FAIL ] = { .pme_name = "PM_MRK_STCX_FAIL", .pme_code = 0x830e4, .pme_short_desc = "Marked STCX failed", .pme_long_desc = "A marked stcx (stwcx or stdcx) failed", }, [ POWER6_PME_PM_LSU0_REJECT_LHS ] = { .pme_name = "PM_LSU0_REJECT_LHS", .pme_code = 0x90086, .pme_short_desc = "LSU0 load hit store reject", .pme_long_desc = "LSU0 load hit store reject", 
}, [ POWER6_PME_PM_DFU_EXP_EQ ] = { .pme_name = "PM_DFU_EXP_EQ", .pme_code = 0xe0084, .pme_short_desc = "DFU operand exponents are equal for add type", .pme_long_desc = "DFU operand exponents are equal for add type", }, [ POWER6_PME_PM_DPU_HELD_FP_FX_MULT ] = { .pme_name = "PM_DPU_HELD_FP_FX_MULT", .pme_code = 0x210a8, .pme_short_desc = "DISP unit held due to non fixed multiply/divide after fixed multiply/divide", .pme_long_desc = "DISP unit held due to non fixed multiply/divide after fixed multiply/divide", }, [ POWER6_PME_PM_L2_LD_MISS_DATA ] = { .pme_name = "PM_L2_LD_MISS_DATA", .pme_code = 0x250430, .pme_short_desc = "L2 data load misses", .pme_long_desc = "L2 data load misses", }, [ POWER6_PME_PM_DATA_FROM_L35_MOD_CYC ] = { .pme_name = "PM_DATA_FROM_L35_MOD_CYC", .pme_code = 0x400026, .pme_short_desc = "Load latency from L3.5 modified", .pme_long_desc = "Load latency from L3.5 modified", }, [ POWER6_PME_PM_FLUSH_FXU ] = { .pme_name = "PM_FLUSH_FXU", .pme_code = 0x230ea, .pme_short_desc = "Flush caused by FXU exception", .pme_long_desc = "Flush caused by FXU exception", }, [ POWER6_PME_PM_FPU_ISSUE_1 ] = { .pme_name = "PM_FPU_ISSUE_1", .pme_code = 0x320c8, .pme_short_desc = "FPU issue 1 per cycle", .pme_long_desc = "FPU issue 1 per cycle", }, [ POWER6_PME_PM_DATA_FROM_LMEM_CYC ] = { .pme_name = "PM_DATA_FROM_LMEM_CYC", .pme_code = 0x20002c, .pme_short_desc = "Load latency from local memory", .pme_long_desc = "Load latency from local memory", }, [ POWER6_PME_PM_DPU_HELD_LSU_SOPS ] = { .pme_name = "PM_DPU_HELD_LSU_SOPS", .pme_code = 0x30080, .pme_short_desc = "DISP unit held due to LSU slow ops (sync, tlbie, stcx)", .pme_long_desc = "DISP unit held due to LSU slow ops (sync, tlbie, stcx)", }, [ POWER6_PME_PM_INST_PTEG_2ND_HALF ] = { .pme_name = "PM_INST_PTEG_2ND_HALF", .pme_code = 0x910aa, .pme_short_desc = "Instruction table walk matched in second half primary PTEG", .pme_long_desc = "Instruction table walk matched in second half primary PTEG", }, [
POWER6_PME_PM_THRESH_TIMEO ] = { .pme_name = "PM_THRESH_TIMEO", .pme_code = 0x300018, .pme_short_desc = "Threshold timeout", .pme_long_desc = "The threshold timer expired", }, [ POWER6_PME_PM_LSU_REJECT_UST_BOTH ] = { .pme_name = "PM_LSU_REJECT_UST_BOTH", .pme_code = 0x190036, .pme_short_desc = "Unaligned store reject both units", .pme_long_desc = "Unaligned store reject both units", }, [ POWER6_PME_PM_LSU_REJECT_FAST ] = { .pme_name = "PM_LSU_REJECT_FAST", .pme_code = 0x30003e, .pme_short_desc = "LSU fast reject", .pme_long_desc = "LSU fast reject", }, [ POWER6_PME_PM_DPU_HELD_THRD_PRIO ] = { .pme_name = "PM_DPU_HELD_THRD_PRIO", .pme_code = 0x3008a, .pme_short_desc = "DISP unit held due to lower priority thread", .pme_long_desc = "DISP unit held due to lower priority thread", }, [ POWER6_PME_PM_L2_PREF_LD ] = { .pme_name = "PM_L2_PREF_LD", .pme_code = 0x810a6, .pme_short_desc = "L2 cache prefetches", .pme_long_desc = "L2 cache prefetches", }, [ POWER6_PME_PM_FPU_FEST ] = { .pme_name = "PM_FPU_FEST", .pme_code = 0x4d1030, .pme_short_desc = "FPU executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. 
Combined Unit 0 + Unit 1.", }, [ POWER6_PME_PM_MRK_DATA_FROM_RMEM ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM", .pme_code = 0x30304a, .pme_short_desc = "Marked data loaded from remote memory", .pme_long_desc = "Marked data loaded from remote memory", }, [ POWER6_PME_PM_LD_MISS_L1_CYC ] = { .pme_name = "PM_LD_MISS_L1_CYC", .pme_code = 0x10000c, .pme_short_desc = "L1 data load miss cycles", .pme_long_desc = "L1 data load miss cycles", }, [ POWER6_PME_PM_DERAT_MISS_4K ] = { .pme_name = "PM_DERAT_MISS_4K", .pme_code = 0x192070, .pme_short_desc = "DERAT misses for 4K page", .pme_long_desc = "A data request (load or store) missed the ERAT for 4K page and resulted in an ERAT reload.", }, [ POWER6_PME_PM_DPU_HELD_COMPLETION ] = { .pme_name = "PM_DPU_HELD_COMPLETION", .pme_code = 0x210ac, .pme_short_desc = "DISP unit held due to completion holding dispatch ", .pme_long_desc = "DISP unit held due to completion holding dispatch ", }, [ POWER6_PME_PM_FPU_ISSUE_STALL_ST ] = { .pme_name = "PM_FPU_ISSUE_STALL_ST", .pme_code = 0x320ce, .pme_short_desc = "FPU issue stalled due to store", .pme_long_desc = "FPU issue stalled due to store", }, [ POWER6_PME_PM_L2SB_DC_INV ] = { .pme_name = "PM_L2SB_DC_INV", .pme_code = 0x5068e, .pme_short_desc = "L2 slice B D cache invalidate", .pme_long_desc = "L2 slice B D cache invalidate", }, [ POWER6_PME_PM_PTEG_FROM_L25_SHR ] = { .pme_name = "PM_PTEG_FROM_L25_SHR", .pme_code = 0x41304e, .pme_short_desc = "PTEG loaded from L2.5 shared", .pme_long_desc = "PTEG loaded from L2.5 shared", }, [ POWER6_PME_PM_PTEG_FROM_DL2L3_MOD ] = { .pme_name = "PM_PTEG_FROM_DL2L3_MOD", .pme_code = 0x41304c, .pme_short_desc = "PTEG loaded from distant L2 or L3 modified", .pme_long_desc = "PTEG loaded from distant L2 or L3 modified", }, [ POWER6_PME_PM_FAB_CMD_RETRIED ] = { .pme_name = "PM_FAB_CMD_RETRIED", .pme_code = 0x250130, .pme_short_desc = "Fabric command retried", .pme_long_desc = "Fabric command retried", }, [ POWER6_PME_PM_BR_PRED_LSTACK ] = { .pme_name = 
"PM_BR_PRED_LSTACK", .pme_code = 0x410a6, .pme_short_desc = "A conditional branch was predicted, link stack", .pme_long_desc = "A conditional branch was predicted, link stack", }, [ POWER6_PME_PM_GXO_DATA_CYC_BUSY ] = { .pme_name = "PM_GXO_DATA_CYC_BUSY", .pme_code = 0x50384, .pme_short_desc = "Outbound GX Data utilization (# of cycles data out is valid)", .pme_long_desc = "Outbound GX Data utilization (# of cycles data out is valid)", }, [ POWER6_PME_PM_DFU_SUBNORM ] = { .pme_name = "PM_DFU_SUBNORM", .pme_code = 0xe0086, .pme_short_desc = "DFU result is a subnormal", .pme_long_desc = "DFU result is a subnormal", }, [ POWER6_PME_PM_FPU_ISSUE_OOO ] = { .pme_name = "PM_FPU_ISSUE_OOO", .pme_code = 0x320c0, .pme_short_desc = "FPU issue out-of-order", .pme_long_desc = "FPU issue out-of-order", }, [ POWER6_PME_PM_LSU_REJECT_ULD_BOTH ] = { .pme_name = "PM_LSU_REJECT_ULD_BOTH", .pme_code = 0x290036, .pme_short_desc = "Unaligned load reject both units", .pme_long_desc = "Unaligned load reject both units", }, [ POWER6_PME_PM_L2SB_ST_MISS ] = { .pme_name = "PM_L2SB_ST_MISS", .pme_code = 0x5048e, .pme_short_desc = "L2 slice B store misses", .pme_long_desc = "L2 slice B store misses", }, [ POWER6_PME_PM_DATA_FROM_L25_MOD_CYC ] = { .pme_name = "PM_DATA_FROM_L25_MOD_CYC", .pme_code = 0x400024, .pme_short_desc = "Load latency from L2.5 modified", .pme_long_desc = "Load latency from L2.5 modified", }, [ POWER6_PME_PM_INST_PTEG_1ST_HALF ] = { .pme_name = "PM_INST_PTEG_1ST_HALF", .pme_code = 0x910a8, .pme_short_desc = "Instruction table walk matched in first half primary PTEG", .pme_long_desc = "Instruction table walk matched in first half primary PTEG", }, [ POWER6_PME_PM_DERAT_MISS_16M ] = { .pme_name = "PM_DERAT_MISS_16M", .pme_code = 0x392070, .pme_short_desc = "DERAT misses for 16M page", .pme_long_desc = "A data request (load or store) missed the ERAT for 16M page and resulted in an ERAT reload.", }, [ POWER6_PME_PM_GX_DMA_WRITE ] = { .pme_name = "PM_GX_DMA_WRITE", .pme_code = 
0x5038e, .pme_short_desc = "All DMA Write Requests (including dma wrt lgcy)", .pme_long_desc = "All DMA Write Requests (including dma wrt lgcy)", }, [ POWER6_PME_PM_MRK_PTEG_FROM_DL2L3_MOD ] = { .pme_name = "PM_MRK_PTEG_FROM_DL2L3_MOD", .pme_code = 0x412044, .pme_short_desc = "Marked PTEG loaded from distant L2 or L3 modified", .pme_long_desc = "Marked PTEG loaded from distant L2 or L3 modified", }, [ POWER6_PME_PM_MEM1_DP_RQ_GLOB_LOC ] = { .pme_name = "PM_MEM1_DP_RQ_GLOB_LOC", .pme_code = 0x50288, .pme_short_desc = "Memory read queue marking cache line double pump state from global to local side 1", .pme_long_desc = "Memory read queue marking cache line double pump state from global to local side 1", }, [ POWER6_PME_PM_L2SB_LD_REQ_DATA ] = { .pme_name = "PM_L2SB_LD_REQ_DATA", .pme_code = 0x50488, .pme_short_desc = "L2 slice B data load requests", .pme_long_desc = "L2 slice B data load requests", }, [ POWER6_PME_PM_L2SA_LD_MISS_INST ] = { .pme_name = "PM_L2SA_LD_MISS_INST", .pme_code = 0x50582, .pme_short_desc = "L2 slice A instruction load misses", .pme_long_desc = "L2 slice A instruction load misses", }, [ POWER6_PME_PM_MRK_LSU0_REJECT_L2MISS ] = { .pme_name = "PM_MRK_LSU0_REJECT_L2MISS", .pme_code = 0x930e4, .pme_short_desc = "LSU0 marked L2 miss reject", .pme_long_desc = "LSU0 marked L2 miss reject", }, [ POWER6_PME_PM_MRK_IFU_FIN ] = { .pme_name = "PM_MRK_IFU_FIN", .pme_code = 0x20000a, .pme_short_desc = "Marked instruction IFU processing finished", .pme_long_desc = "Marked instruction IFU processing finished", }, [ POWER6_PME_PM_INST_FROM_L3 ] = { .pme_name = "PM_INST_FROM_L3", .pme_code = 0x342040, .pme_short_desc = "Instruction fetched from L3", .pme_long_desc = "An instruction fetch group was fetched from L3. 
Fetch Groups can contain up to 8 instructions", }, [ POWER6_PME_PM_FXU1_FIN ] = { .pme_name = "PM_FXU1_FIN", .pme_code = 0x400016, .pme_short_desc = "FXU1 produced a result", .pme_long_desc = "The Fixed Point unit 1 finished an instruction and produced a result", }, [ POWER6_PME_PM_THRD_PRIO_4_CYC ] = { .pme_name = "PM_THRD_PRIO_4_CYC", .pme_code = 0x422046, .pme_short_desc = "Cycles thread running at priority level 4", .pme_long_desc = "Cycles thread running at priority level 4", }, [ POWER6_PME_PM_MRK_DATA_FROM_L35_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L35_MOD", .pme_code = 0x10304e, .pme_short_desc = "Marked data loaded from L3.5 modified", .pme_long_desc = "Marked data loaded from L3.5 modified", }, [ POWER6_PME_PM_LSU_REJECT_SET_MPRED ] = { .pme_name = "PM_LSU_REJECT_SET_MPRED", .pme_code = 0x2a0032, .pme_short_desc = "LSU reject due to mispredicted set", .pme_long_desc = "LSU reject due to mispredicted set", }, [ POWER6_PME_PM_MRK_DERAT_MISS_16G ] = { .pme_name = "PM_MRK_DERAT_MISS_16G", .pme_code = 0x492044, .pme_short_desc = "Marked DERAT misses for 16G page", .pme_long_desc = "A marked data request (load or store) missed the ERAT for 16G page and resulted in an ERAT reload.", }, [ POWER6_PME_PM_FPU0_FXDIV ] = { .pme_name = "PM_FPU0_FXDIV", .pme_code = 0xc10a0, .pme_short_desc = "FPU0 executed fixed point division", .pme_long_desc = "FPU0 executed fixed point division", }, [ POWER6_PME_PM_MRK_LSU1_REJECT_UST ] = { .pme_name = "PM_MRK_LSU1_REJECT_UST", .pme_code = 0x930ea, .pme_short_desc = "LSU1 marked unaligned store reject", .pme_long_desc = "LSU1 marked unaligned store reject", }, [ POWER6_PME_PM_FPU_ISSUE_DIV_SQRT_OVERLAP ] = { .pme_name = "PM_FPU_ISSUE_DIV_SQRT_OVERLAP", .pme_code = 0x320cc, .pme_short_desc = "FPU divide/sqrt overlapped with other divide/sqrt", .pme_long_desc = "FPU divide/sqrt overlapped with other divide/sqrt", }, [ POWER6_PME_PM_INST_FROM_L35_SHR ] = { .pme_name = "PM_INST_FROM_L35_SHR", .pme_code = 0x242046, .pme_short_desc = 
"Instruction fetched from L3.5 shared", .pme_long_desc = "Instruction fetched from L3.5 shared", }, [ POWER6_PME_PM_MRK_LSU_REJECT_LHS ] = { .pme_name = "PM_MRK_LSU_REJECT_LHS", .pme_code = 0x493030, .pme_short_desc = "Marked load hit store reject", .pme_long_desc = "Marked load hit store reject", }, [ POWER6_PME_PM_LSU_LMQ_FULL_CYC ] = { .pme_name = "PM_LSU_LMQ_FULL_CYC", .pme_code = 0x810ac, .pme_short_desc = "Cycles LMQ full", .pme_long_desc = "The LMQ was full", }, [ POWER6_PME_PM_SYNC_COUNT ] = { .pme_name = "PM_SYNC_COUNT", .pme_code = 0x920cd, .pme_short_desc = "SYNC instructions completed", .pme_long_desc = "SYNC instructions completed", }, [ POWER6_PME_PM_MEM0_DP_RQ_LOC_GLOB ] = { .pme_name = "PM_MEM0_DP_RQ_LOC_GLOB", .pme_code = 0x50282, .pme_short_desc = "Memory read queue marking cache line double pump state from local to global side 0", .pme_long_desc = "Memory read queue marking cache line double pump state from local to global side 0", }, [ POWER6_PME_PM_L2SA_CASTOUT_MOD ] = { .pme_name = "PM_L2SA_CASTOUT_MOD", .pme_code = 0x50680, .pme_short_desc = "L2 slice A castouts - Modified", .pme_long_desc = "L2 slice A castouts - Modified", }, [ POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_BOTH_COUNT ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_BOTH_COUNT", .pme_code = 0x30001d, .pme_short_desc = "Periods both threads LMQ and SRQ empty", .pme_long_desc = "Periods both threads LMQ and SRQ empty", }, [ POWER6_PME_PM_PTEG_FROM_MEM_DP ] = { .pme_name = "PM_PTEG_FROM_MEM_DP", .pme_code = 0x11304a, .pme_short_desc = "PTEG loaded from double pump memory", .pme_long_desc = "PTEG loaded from double pump memory", }, [ POWER6_PME_PM_LSU_REJECT_SLOW ] = { .pme_name = "PM_LSU_REJECT_SLOW", .pme_code = 0x20003e, .pme_short_desc = "LSU slow reject", .pme_long_desc = "LSU slow reject", }, [ POWER6_PME_PM_PTEG_FROM_L25_MOD ] = { .pme_name = "PM_PTEG_FROM_L25_MOD", .pme_code = 0x31304e, .pme_short_desc = "PTEG loaded from L2.5 modified", .pme_long_desc = "PTEG loaded from L2.5 modified", }, [ 
POWER6_PME_PM_THRD_PRIO_7_CYC ] = { .pme_name = "PM_THRD_PRIO_7_CYC", .pme_code = 0x122046, .pme_short_desc = "Cycles thread running at priority level 7", .pme_long_desc = "Cycles thread running at priority level 7", }, [ POWER6_PME_PM_MRK_PTEG_FROM_RL2L3_SHR ] = { .pme_name = "PM_MRK_PTEG_FROM_RL2L3_SHR", .pme_code = 0x212044, .pme_short_desc = "Marked PTEG loaded from remote L2 or L3 shared", .pme_long_desc = "Marked PTEG loaded from remote L2 or L3 shared", }, [ POWER6_PME_PM_ST_REQ_L2 ] = { .pme_name = "PM_ST_REQ_L2", .pme_code = 0x250732, .pme_short_desc = "L2 store requests", .pme_long_desc = "L2 store requests", }, [ POWER6_PME_PM_ST_REF_L1 ] = { .pme_name = "PM_ST_REF_L1", .pme_code = 0x80086, .pme_short_desc = "L1 D cache store references", .pme_long_desc = "Total DL1 Store references", }, [ POWER6_PME_PM_FPU_ISSUE_STALL_THRD ] = { .pme_name = "PM_FPU_ISSUE_STALL_THRD", .pme_code = 0x330e0, .pme_short_desc = "FPU issue stalled due to thread resource conflict", .pme_long_desc = "FPU issue stalled due to thread resource conflict", }, [ POWER6_PME_PM_RUN_COUNT ] = { .pme_name = "PM_RUN_COUNT", .pme_code = 0x10000b, .pme_short_desc = "Run Periods", .pme_long_desc = "Processor Periods gated by the run latch", }, [ POWER6_PME_PM_RUN_CYC ] = { .pme_name = "PM_RUN_CYC", .pme_code = 0x10000a, .pme_short_desc = "Run cycles", .pme_long_desc = "Processor Cycles gated by the run latch", }, [ POWER6_PME_PM_PTEG_FROM_RMEM ] = { .pme_name = "PM_PTEG_FROM_RMEM", .pme_code = 0x31304a, .pme_short_desc = "PTEG loaded from remote memory", .pme_long_desc = "PTEG loaded from remote memory", }, [ POWER6_PME_PM_LSU0_LDF ] = { .pme_name = "PM_LSU0_LDF", .pme_code = 0x80084, .pme_short_desc = "LSU0 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed from LSU unit 0", }, [ POWER6_PME_PM_ST_MISS_L1 ] = { .pme_name = "PM_ST_MISS_L1", .pme_code = 0x80088, .pme_short_desc = "L1 D cache store misses", .pme_long_desc = "A store missed the 
dcache", }, [ POWER6_PME_PM_INST_FROM_DL2L3_SHR ] = { .pme_name = "PM_INST_FROM_DL2L3_SHR", .pme_code = 0x342044, .pme_short_desc = "Instruction fetched from distant L2 or L3 shared", .pme_long_desc = "Instruction fetched from distant L2 or L3 shared", }, [ POWER6_PME_PM_L2SA_IC_INV ] = { .pme_name = "PM_L2SA_IC_INV", .pme_code = 0x50684, .pme_short_desc = "L2 slice A I cache invalidate", .pme_long_desc = "L2 slice A I cache invalidate", }, [ POWER6_PME_PM_THRD_ONE_RUN_CYC ] = { .pme_name = "PM_THRD_ONE_RUN_CYC", .pme_code = 0x100016, .pme_short_desc = "One of the threads in run cycles", .pme_long_desc = "One of the threads in run cycles", }, [ POWER6_PME_PM_L2SB_LD_REQ_INST ] = { .pme_name = "PM_L2SB_LD_REQ_INST", .pme_code = 0x50588, .pme_short_desc = "L2 slice B instruction load requests", .pme_long_desc = "L2 slice B instruction load requests", }, [ POWER6_PME_PM_MRK_DATA_FROM_L25_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L25_MOD", .pme_code = 0x30304e, .pme_short_desc = "Marked data loaded from L2.5 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a marked demand load", }, [ POWER6_PME_PM_DPU_HELD_XTHRD ] = { .pme_name = "PM_DPU_HELD_XTHRD", .pme_code = 0x30082, .pme_short_desc = "DISP unit held due to cross thread resource conflicts", .pme_long_desc = "DISP unit held due to cross thread resource conflicts", }, [ POWER6_PME_PM_L2SB_ST_REQ ] = { .pme_name = "PM_L2SB_ST_REQ", .pme_code = 0x5048c, .pme_short_desc = "L2 slice B store requests", .pme_long_desc = "A store request as seen at the L2 directory has been made from the core. Stores are counted after gathering in the L2 store queues. 
The event is provided on each of the three slices A,B,and C.", }, [ POWER6_PME_PM_INST_FROM_L21 ] = { .pme_name = "PM_INST_FROM_L21", .pme_code = 0x242040, .pme_short_desc = "Instruction fetched from private L2 other core", .pme_long_desc = "Instruction fetched from private L2 other core", }, [ POWER6_PME_PM_INST_FROM_L3MISS ] = { .pme_name = "PM_INST_FROM_L3MISS", .pme_code = 0x342054, .pme_short_desc = "Instruction fetched missed L3", .pme_long_desc = "Instruction fetched missed L3", }, [ POWER6_PME_PM_L3SB_HIT ] = { .pme_name = "PM_L3SB_HIT", .pme_code = 0x5008a, .pme_short_desc = "L3 slice B hits", .pme_long_desc = "L3 slice B hits", }, [ POWER6_PME_PM_EE_OFF_EXT_INT ] = { .pme_name = "PM_EE_OFF_EXT_INT", .pme_code = 0x230ee, .pme_short_desc = "Cycles MSR(EE) bit off and external interrupt pending", .pme_long_desc = "Cycles MSR(EE) bit off and external interrupt pending", }, [ POWER6_PME_PM_INST_FROM_DL2L3_MOD ] = { .pme_name = "PM_INST_FROM_DL2L3_MOD", .pme_code = 0x442044, .pme_short_desc = "Instruction fetched from distant L2 or L3 modified", .pme_long_desc = "Instruction fetched from distant L2 or L3 modified", }, [ POWER6_PME_PM_PMC6_OVERFLOW ] = { .pme_name = "PM_PMC6_OVERFLOW", .pme_code = 0x300024, .pme_short_desc = "PMC6 Overflow", .pme_long_desc = "PMC6 Overflow", }, [ POWER6_PME_PM_FPU_FLOP ] = { .pme_name = "PM_FPU_FLOP", .pme_code = 0x1c0032, .pme_short_desc = "FPU executed 1FLOP, FMA, FSQRT or FDIV instruction", .pme_long_desc = "FPU executed 1FLOP, FMA, FSQRT or FDIV instruction", }, [ POWER6_PME_PM_FXU_BUSY ] = { .pme_name = "PM_FXU_BUSY", .pme_code = 0x200050, .pme_short_desc = "FXU busy", .pme_long_desc = "FXU0 and FXU1 are both busy", }, [ POWER6_PME_PM_FPU1_FLOP ] = { .pme_name = "PM_FPU1_FLOP", .pme_code = 0xc008e, .pme_short_desc = "FPU1 executed 1FLOP, FMA, FSQRT or FDIV instruction", .pme_long_desc = "FPU1 executed 1FLOP, FMA, FSQRT or FDIV instruction", }, [ POWER6_PME_PM_IC_RELOAD_SHR ] = { .pme_name = "PM_IC_RELOAD_SHR", .pme_code = 
0x4008e, .pme_short_desc = "I cache line reloading to be shared by threads", .pme_long_desc = "I cache line reloading to be shared by threads", }, [ POWER6_PME_PM_INST_TABLEWALK_CYC ] = { .pme_name = "PM_INST_TABLEWALK_CYC", .pme_code = 0x920ca, .pme_short_desc = "Cycles doing instruction tablewalks", .pme_long_desc = "Cycles doing instruction tablewalks", }, [ POWER6_PME_PM_DATA_FROM_RL2L3_MOD_CYC ] = { .pme_name = "PM_DATA_FROM_RL2L3_MOD_CYC", .pme_code = 0x400028, .pme_short_desc = "Load latency from remote L2 or L3 modified", .pme_long_desc = "Load latency from remote L2 or L3 modified", }, [ POWER6_PME_PM_THRD_PRIO_DIFF_5or6_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_5or6_CYC", .pme_code = 0x423040, .pme_short_desc = "Cycles thread priority difference is 5 or 6", .pme_long_desc = "Cycles thread priority difference is 5 or 6", }, [ POWER6_PME_PM_IBUF_FULL_CYC ] = { .pme_name = "PM_IBUF_FULL_CYC", .pme_code = 0x40084, .pme_short_desc = "Cycles instruction buffer full", .pme_long_desc = "Cycles instruction buffer full", }, [ POWER6_PME_PM_L2SA_LD_REQ ] = { .pme_name = "PM_L2SA_LD_REQ", .pme_code = 0x50780, .pme_short_desc = "L2 slice A load requests ", .pme_long_desc = "L2 slice A load requests ", }, [ POWER6_PME_PM_VMX1_LD_WRBACK ] = { .pme_name = "PM_VMX1_LD_WRBACK", .pme_code = 0x6008c, .pme_short_desc = "VMX1 load writeback valid", .pme_long_desc = "VMX1 load writeback valid", }, [ POWER6_PME_PM_MRK_FPU_FIN ] = { .pme_name = "PM_MRK_FPU_FIN", .pme_code = 0x2d0030, .pme_short_desc = "Marked instruction FPU processing finished", .pme_long_desc = "One of the Floating Point Units finished a marked instruction. 
Instructions that finish may not necessarily complete", }, [ POWER6_PME_PM_THRD_PRIO_5_CYC ] = { .pme_name = "PM_THRD_PRIO_5_CYC", .pme_code = 0x322046, .pme_short_desc = "Cycles thread running at priority level 5", .pme_long_desc = "Cycles thread running at priority level 5", }, [ POWER6_PME_PM_DFU_BACK2BACK ] = { .pme_name = "PM_DFU_BACK2BACK", .pme_code = 0xe0082, .pme_short_desc = "DFU back to back operations executed", .pme_long_desc = "DFU back to back operations executed", }, [ POWER6_PME_PM_MRK_DATA_FROM_LMEM ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM", .pme_code = 0x40304a, .pme_short_desc = "Marked data loaded from local memory", .pme_long_desc = "Marked data loaded from local memory", }, [ POWER6_PME_PM_LSU_REJECT_LHS ] = { .pme_name = "PM_LSU_REJECT_LHS", .pme_code = 0x190032, .pme_short_desc = "Load hit store reject", .pme_long_desc = "Load hit store reject", }, [ POWER6_PME_PM_DPU_HELD_SPR ] = { .pme_name = "PM_DPU_HELD_SPR", .pme_code = 0x3008c, .pme_short_desc = "DISP unit held due to MTSPR/MFSPR", .pme_long_desc = "DISP unit held due to MTSPR/MFSPR", }, [ POWER6_PME_PM_FREQ_DOWN ] = { .pme_name = "PM_FREQ_DOWN", .pme_code = 0x30003c, .pme_short_desc = "Frequency is being slewed down due to Power Management", .pme_long_desc = "Frequency is being slewed down due to Power Management", }, [ POWER6_PME_PM_DFU_ENC_BCD_DPD ] = { .pme_name = "PM_DFU_ENC_BCD_DPD", .pme_code = 0xe008a, .pme_short_desc = "DFU Encode BCD to DPD", .pme_long_desc = "DFU Encode BCD to DPD", }, [ POWER6_PME_PM_DPU_HELD_GPR ] = { .pme_name = "PM_DPU_HELD_GPR", .pme_code = 0x20080, .pme_short_desc = "DISP unit held due to GPR dependencies", .pme_long_desc = "DISP unit held due to GPR dependencies", }, [ POWER6_PME_PM_LSU0_NCST ] = { .pme_name = "PM_LSU0_NCST", .pme_code = 0x820cc, .pme_short_desc = "LSU0 non-cacheable stores", .pme_long_desc = "LSU0 non-cacheable stores", }, [ POWER6_PME_PM_MRK_INST_ISSUED ] = { .pme_name = "PM_MRK_INST_ISSUED", .pme_code = 0x10001c, .pme_short_desc = 
"Marked instruction issued", .pme_long_desc = "Marked instruction issued", }, [ POWER6_PME_PM_INST_FROM_RL2L3_SHR ] = { .pme_name = "PM_INST_FROM_RL2L3_SHR", .pme_code = 0x242044, .pme_short_desc = "Instruction fetched from remote L2 or L3 shared", .pme_long_desc = "Instruction fetched from remote L2 or L3 shared", }, [ POWER6_PME_PM_FPU_DENORM ] = { .pme_name = "PM_FPU_DENORM", .pme_code = 0x2c1034, .pme_short_desc = "FPU received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized. Combined Unit 0 + Unit 1", }, [ POWER6_PME_PM_PTEG_FROM_L3MISS ] = { .pme_name = "PM_PTEG_FROM_L3MISS", .pme_code = 0x313028, .pme_short_desc = "PTEG loaded from L3 miss", .pme_long_desc = "PTEG loaded from L3 miss", }, [ POWER6_PME_PM_RUN_PURR ] = { .pme_name = "PM_RUN_PURR", .pme_code = 0x4000f4, .pme_short_desc = "Run PURR Event", .pme_long_desc = "Run PURR Event", }, [ POWER6_PME_PM_MRK_VMX0_LD_WRBACK ] = { .pme_name = "PM_MRK_VMX0_LD_WRBACK", .pme_code = 0x60086, .pme_short_desc = "Marked VMX0 load writeback valid", .pme_long_desc = "Marked VMX0 load writeback valid", }, [ POWER6_PME_PM_L2_MISS ] = { .pme_name = "PM_L2_MISS", .pme_code = 0x250532, .pme_short_desc = "L2 cache misses", .pme_long_desc = "L2 cache misses", }, [ POWER6_PME_PM_MRK_DATA_FROM_L3 ] = { .pme_name = "PM_MRK_DATA_FROM_L3", .pme_code = 0x303048, .pme_short_desc = "Marked data loaded from L3", .pme_long_desc = "DL1 was reloaded from the local L3 due to a marked demand load", }, [ POWER6_PME_PM_MRK_LSU1_REJECT_LHS ] = { .pme_name = "PM_MRK_LSU1_REJECT_LHS", .pme_code = 0x930ee, .pme_short_desc = "LSU1 marked load hit store reject", .pme_long_desc = "LSU1 marked load hit store reject", }, [ POWER6_PME_PM_L2SB_LD_MISS_INST ] = { .pme_name = "PM_L2SB_LD_MISS_INST", .pme_code = 0x5058a, .pme_short_desc = "L2 slice B instruction load misses", .pme_long_desc = "L2 slice B instruction load misses", }, [ POWER6_PME_PM_PTEG_FROM_RL2L3_SHR ] = { .pme_name = 
"PM_PTEG_FROM_RL2L3_SHR", .pme_code = 0x21304c, .pme_short_desc = "PTEG loaded from remote L2 or L3 shared", .pme_long_desc = "PTEG loaded from remote L2 or L3 shared", }, [ POWER6_PME_PM_MRK_DERAT_MISS_64K ] = { .pme_name = "PM_MRK_DERAT_MISS_64K", .pme_code = 0x192044, .pme_short_desc = "Marked DERAT misses for 64K page", .pme_long_desc = "A marked data request (load or store) missed the ERAT for 64K page and resulted in an ERAT reload.", }, [ POWER6_PME_PM_LWSYNC ] = { .pme_name = "PM_LWSYNC", .pme_code = 0x810ae, .pme_short_desc = "Lwsync instruction completed", .pme_long_desc = "Lwsync instruction completed", }, [ POWER6_PME_PM_FPU1_FXMULT ] = { .pme_name = "PM_FPU1_FXMULT", .pme_code = 0xd008e, .pme_short_desc = "FPU1 executed fixed point multiplication", .pme_long_desc = "FPU1 executed fixed point multiplication", }, [ POWER6_PME_PM_MEM0_DP_CL_WR_GLOB ] = { .pme_name = "PM_MEM0_DP_CL_WR_GLOB", .pme_code = 0x50284, .pme_short_desc = "cacheline write setting dp to global side 0", .pme_long_desc = "cacheline write setting dp to global side 0", }, [ POWER6_PME_PM_LSU0_REJECT_PARTIAL_SECTOR ] = { .pme_name = "PM_LSU0_REJECT_PARTIAL_SECTOR", .pme_code = 0xa0086, .pme_short_desc = "LSU0 reject due to partial sector valid", .pme_long_desc = "LSU0 reject due to partial sector valid", }, [ POWER6_PME_PM_INST_IMC_MATCH_CMPL ] = { .pme_name = "PM_INST_IMC_MATCH_CMPL", .pme_code = 0x1000f0, .pme_short_desc = "IMC matched instructions completed", .pme_long_desc = "Number of instructions resulting from the marked instructions expansion that completed.", }, [ POWER6_PME_PM_DPU_HELD_THERMAL ] = { .pme_name = "PM_DPU_HELD_THERMAL", .pme_code = 0x10002a, .pme_short_desc = "DISP unit held due to thermal condition", .pme_long_desc = "DISP unit held due to thermal condition", }, [ POWER6_PME_PM_FPU_FRSP ] = { .pme_name = "PM_FPU_FRSP", .pme_code = 0x2d1034, .pme_short_desc = "FPU executed FRSP instruction", .pme_long_desc = "FPU executed FRSP instruction", }, [ 
POWER6_PME_PM_MRK_INST_FIN ] = { .pme_name = "PM_MRK_INST_FIN", .pme_code = 0x30000a, .pme_short_desc = "Marked instruction finished", .pme_long_desc = "One of the execution units finished a marked instruction. Instructions that finish may not necessarily complete", }, [ POWER6_PME_PM_MRK_PTEG_FROM_DL2L3_SHR ] = { .pme_name = "PM_MRK_PTEG_FROM_DL2L3_SHR", .pme_code = 0x312044, .pme_short_desc = "Marked PTEG loaded from distant L2 or L3 shared", .pme_long_desc = "Marked PTEG loaded from distant L2 or L3 shared", }, [ POWER6_PME_PM_MRK_DTLB_REF ] = { .pme_name = "PM_MRK_DTLB_REF", .pme_code = 0x920c0, .pme_short_desc = "Marked Data TLB reference", .pme_long_desc = "Marked Data TLB reference", }, [ POWER6_PME_PM_MRK_PTEG_FROM_L25_SHR ] = { .pme_name = "PM_MRK_PTEG_FROM_L25_SHR", .pme_code = 0x412046, .pme_short_desc = "Marked PTEG loaded from L2.5 shared", .pme_long_desc = "Marked PTEG loaded from L2.5 shared", }, [ POWER6_PME_PM_DPU_HELD_LSU ] = { .pme_name = "PM_DPU_HELD_LSU", .pme_code = 0x210a2, .pme_short_desc = "DISP unit held due to LSU move or invalidate SLB and SR", .pme_long_desc = "DISP unit held due to LSU move or invalidate SLB and SR", }, [ POWER6_PME_PM_FPU_FSQRT_FDIV ] = { .pme_name = "PM_FPU_FSQRT_FDIV", .pme_code = 0x2c0032, .pme_short_desc = "FPU executed FSQRT or FDIV instruction", .pme_long_desc = "FPU executed FSQRT or FDIV instruction", }, [ POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_COUNT ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_COUNT", .pme_code = 0x20001d, .pme_short_desc = "Periods LMQ and SRQ empty", .pme_long_desc = "Periods when both the LMQ and SRQ are empty (LSU is idle)", }, [ POWER6_PME_PM_DATA_PTEG_SECONDARY ] = { .pme_name = "PM_DATA_PTEG_SECONDARY", .pme_code = 0x910a4, .pme_short_desc = "Data table walk matched in secondary PTEG", .pme_long_desc = "Data table walk matched in secondary PTEG", }, [ POWER6_PME_PM_FPU1_FEST ] = { .pme_name = "PM_FPU1_FEST", .pme_code = 0xd10ae, .pme_short_desc = "FPU1 executed FEST instruction", .pme_long_desc = 
"This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. ", }, [ POWER6_PME_PM_L2SA_LD_HIT ] = { .pme_name = "PM_L2SA_LD_HIT", .pme_code = 0x50782, .pme_short_desc = "L2 slice A load hits", .pme_long_desc = "L2 slice A load hits", }, [ POWER6_PME_PM_DATA_FROM_MEM_DP_CYC ] = { .pme_name = "PM_DATA_FROM_MEM_DP_CYC", .pme_code = 0x40002e, .pme_short_desc = "Load latency from double pump memory", .pme_long_desc = "Load latency from double pump memory", }, [ POWER6_PME_PM_BR_MPRED_CCACHE ] = { .pme_name = "PM_BR_MPRED_CCACHE", .pme_code = 0x410ae, .pme_short_desc = "Branch misprediction due to count cache prediction", .pme_long_desc = "Branch misprediction due to count cache prediction", }, [ POWER6_PME_PM_DPU_HELD_COUNT ] = { .pme_name = "PM_DPU_HELD_COUNT", .pme_code = 0x200005, .pme_short_desc = "Periods DISP unit held", .pme_long_desc = "Dispatch unit held", }, [ POWER6_PME_PM_LSU1_REJECT_SET_MPRED ] = { .pme_name = "PM_LSU1_REJECT_SET_MPRED", .pme_code = 0xa008c, .pme_short_desc = "LSU1 reject due to mispredicted set", .pme_long_desc = "LSU1 reject due to mispredicted set", }, [ POWER6_PME_PM_FPU_ISSUE_2 ] = { .pme_name = "PM_FPU_ISSUE_2", .pme_code = 0x320ca, .pme_short_desc = "FPU issue 2 per cycle", .pme_long_desc = "FPU issue 2 per cycle", }, [ POWER6_PME_PM_LSU1_REJECT_L2_CORR ] = { .pme_name = "PM_LSU1_REJECT_L2_CORR", .pme_code = 0xa10a8, .pme_short_desc = "LSU1 reject due to L2 correctable error", .pme_long_desc = "LSU1 reject due to L2 correctable error", }, [ POWER6_PME_PM_MRK_PTEG_FROM_DMEM ] = { .pme_name = "PM_MRK_PTEG_FROM_DMEM", .pme_code = 0x212042, .pme_short_desc = "Marked PTEG loaded from distant memory", .pme_long_desc = "Marked PTEG loaded from distant memory", }, [ POWER6_PME_PM_MEM1_DP_RQ_LOC_GLOB ] = { .pme_name = "PM_MEM1_DP_RQ_LOC_GLOB", .pme_code = 0x5028a, .pme_short_desc = "Memory read queue marking cache line double pump state from local to 
global side 1", .pme_long_desc = "Memory read queue marking cache line double pump state from local to global side 1", }, [ POWER6_PME_PM_THRD_PRIO_DIFF_minus1or2_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus1or2_CYC", .pme_code = 0x223046, .pme_short_desc = "Cycles thread priority difference is -1 or -2", .pme_long_desc = "Cycles thread priority difference is -1 or -2", }, [ POWER6_PME_PM_THRD_PRIO_0_CYC ] = { .pme_name = "PM_THRD_PRIO_0_CYC", .pme_code = 0x122040, .pme_short_desc = "Cycles thread running at priority level 0", .pme_long_desc = "Cycles thread running at priority level 0", }, [ POWER6_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { .pme_name = "PM_FXU0_BUSY_FXU1_IDLE", .pme_code = 0x300050, .pme_short_desc = "FXU0 busy FXU1 idle", .pme_long_desc = "FXU0 is busy while FXU1 was idle", }, [ POWER6_PME_PM_LSU1_REJECT_DERAT_MPRED ] = { .pme_name = "PM_LSU1_REJECT_DERAT_MPRED", .pme_code = 0xa008a, .pme_short_desc = "LSU1 reject due to mispredicted DERAT", .pme_long_desc = "LSU1 reject due to mispredicted DERAT", }, [ POWER6_PME_PM_MRK_VMX1_LD_WRBACK ] = { .pme_name = "PM_MRK_VMX1_LD_WRBACK", .pme_code = 0x6008e, .pme_short_desc = "Marked VMX1 load writeback valid", .pme_long_desc = "Marked VMX1 load writeback valid", }, [ POWER6_PME_PM_DATA_FROM_RL2L3_SHR_CYC ] = { .pme_name = "PM_DATA_FROM_RL2L3_SHR_CYC", .pme_code = 0x200028, .pme_short_desc = "Load latency from remote L2 or L3 shared", .pme_long_desc = "Load latency from remote L2 or L3 shared", }, [ POWER6_PME_PM_IERAT_MISS_16M ] = { .pme_name = "PM_IERAT_MISS_16M", .pme_code = 0x292076, .pme_short_desc = "IERAT misses for 16M page", .pme_long_desc = "IERAT misses for 16M page", }, [ POWER6_PME_PM_MRK_DATA_FROM_MEM_DP ] = { .pme_name = "PM_MRK_DATA_FROM_MEM_DP", .pme_code = 0x10304a, .pme_short_desc = "Marked data loaded from double pump memory", .pme_long_desc = "Marked data loaded from double pump memory", }, [ POWER6_PME_PM_LARX_L1HIT ] = { .pme_name = "PM_LARX_L1HIT", .pme_code = 0x830e2, .pme_short_desc = 
"larx hits in L1", .pme_long_desc = "larx hits in L1", }, [ POWER6_PME_PM_L2_ST_MISS_DATA ] = { .pme_name = "PM_L2_ST_MISS_DATA", .pme_code = 0x150432, .pme_short_desc = "L2 data store misses", .pme_long_desc = "L2 data store misses", }, [ POWER6_PME_PM_FPU_ST_FOLDED ] = { .pme_name = "PM_FPU_ST_FOLDED", .pme_code = 0x3d1030, .pme_short_desc = "FPU folded store", .pme_long_desc = "FPU folded store", }, [ POWER6_PME_PM_MRK_DATA_FROM_L35_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L35_SHR", .pme_code = 0x20304e, .pme_short_desc = "Marked data loaded from L3.5 shared", .pme_long_desc = "Marked data loaded from L3.5 shared", }, [ POWER6_PME_PM_DPU_HELD_MULT_GPR ] = { .pme_name = "PM_DPU_HELD_MULT_GPR", .pme_code = 0x210aa, .pme_short_desc = "DISP unit held due to multiply/divide GPR dependencies", .pme_long_desc = "DISP unit held due to multiply/divide GPR dependencies", }, [ POWER6_PME_PM_FPU0_1FLOP ] = { .pme_name = "PM_FPU0_1FLOP", .pme_code = 0xc0080, .pme_short_desc = "FPU0 executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. 
and XYZ** means XYZu, XYZo", }, [ POWER6_PME_PM_IERAT_MISS_16G ] = { .pme_name = "PM_IERAT_MISS_16G", .pme_code = 0x192076, .pme_short_desc = "IERAT misses for 16G page", .pme_long_desc = "IERAT misses for 16G page", }, [ POWER6_PME_PM_IC_PREF_WRITE ] = { .pme_name = "PM_IC_PREF_WRITE", .pme_code = 0x430e0, .pme_short_desc = "Instruction prefetch written into I cache", .pme_long_desc = "Instruction prefetch written into I cache", }, [ POWER6_PME_PM_THRD_PRIO_DIFF_minus5or6_CYC ] = { .pme_name = "PM_THRD_PRIO_DIFF_minus5or6_CYC", .pme_code = 0x423046, .pme_short_desc = "Cycles thread priority difference is -5 or -6", .pme_long_desc = "Cycles thread priority difference is -5 or -6", }, [ POWER6_PME_PM_FPU0_FIN ] = { .pme_name = "PM_FPU0_FIN", .pme_code = 0xd0080, .pme_short_desc = "FPU0 produced a result", .pme_long_desc = "fp0 finished, produced a result. This only indicates finish, not completion. ", }, [ POWER6_PME_PM_DATA_FROM_L2_CYC ] = { .pme_name = "PM_DATA_FROM_L2_CYC", .pme_code = 0x200020, .pme_short_desc = "Load latency from L2", .pme_long_desc = "Load latency from L2", }, [ POWER6_PME_PM_DERAT_REF_16G ] = { .pme_name = "PM_DERAT_REF_16G", .pme_code = 0x482070, .pme_short_desc = "DERAT reference for 16G page", .pme_long_desc = "DERAT reference for 16G page", }, [ POWER6_PME_PM_BR_PRED ] = { .pme_name = "PM_BR_PRED", .pme_code = 0x410a0, .pme_short_desc = "A conditional branch was predicted", .pme_long_desc = "A conditional branch was predicted", }, [ POWER6_PME_PM_VMX1_LD_ISSUED ] = { .pme_name = "PM_VMX1_LD_ISSUED", .pme_code = 0x6008a, .pme_short_desc = "VMX1 load issued", .pme_long_desc = "VMX1 load issued", }, [ POWER6_PME_PM_L2SB_CASTOUT_MOD ] = { .pme_name = "PM_L2SB_CASTOUT_MOD", .pme_code = 0x50688, .pme_short_desc = "L2 slice B castouts - Modified", .pme_long_desc = "L2 slice B castouts - Modified", }, [ POWER6_PME_PM_INST_FROM_DMEM ] = { .pme_name = "PM_INST_FROM_DMEM", .pme_code = 0x242042, .pme_short_desc = "Instruction fetched from distant 
memory", .pme_long_desc = "Instruction fetched from distant memory", }, [ POWER6_PME_PM_DATA_FROM_L35_SHR_CYC ] = { .pme_name = "PM_DATA_FROM_L35_SHR_CYC", .pme_code = 0x200026, .pme_short_desc = "Load latency from L3.5 shared", .pme_long_desc = "Load latency from L3.5 shared", }, [ POWER6_PME_PM_LSU0_NCLD ] = { .pme_name = "PM_LSU0_NCLD", .pme_code = 0x820ca, .pme_short_desc = "LSU0 non-cacheable loads", .pme_long_desc = "LSU0 non-cacheable loads", }, [ POWER6_PME_PM_FAB_RETRY_NODE_PUMP ] = { .pme_name = "PM_FAB_RETRY_NODE_PUMP", .pme_code = 0x5018a, .pme_short_desc = "Retry of a node pump, locally mastered", .pme_long_desc = "Retry of a node pump, locally mastered", }, [ POWER6_PME_PM_VMX0_INST_ISSUED ] = { .pme_name = "PM_VMX0_INST_ISSUED", .pme_code = 0x60080, .pme_short_desc = "VMX0 instruction issued", .pme_long_desc = "VMX0 instruction issued", }, [ POWER6_PME_PM_DATA_FROM_L25_MOD ] = { .pme_name = "PM_DATA_FROM_L25_MOD", .pme_code = 0x30005a, .pme_short_desc = "Data loaded from L2.5 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a demand load", }, [ POWER6_PME_PM_DPU_HELD_ITLB_ISLB ] = { .pme_name = "PM_DPU_HELD_ITLB_ISLB", .pme_code = 0x210a4, .pme_short_desc = "DISP unit held due to SLB or TLB invalidates ", .pme_long_desc = "DISP unit held due to SLB or TLB invalidates ", }, [ POWER6_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", .pme_code = 0x20001c, .pme_short_desc = "Cycles LMQ and SRQ empty", .pme_long_desc = "Cycles when both the LMQ and SRQ are empty (LSU is idle)", }, [ POWER6_PME_PM_THRD_CONC_RUN_INST ] = { .pme_name = "PM_THRD_CONC_RUN_INST", .pme_code = 0x300026, .pme_short_desc = "Concurrent run instructions", .pme_long_desc = "Concurrent run instructions", }, [ POWER6_PME_PM_MRK_PTEG_FROM_L2 ] = { .pme_name = "PM_MRK_PTEG_FROM_L2", .pme_code = 0x112040, .pme_short_desc = "Marked PTEG loaded from L2.5 modified", .pme_long_desc = "Marked PTEG loaded from 
L2.5 modified", }, [ POWER6_PME_PM_PURR ] = { .pme_name = "PM_PURR", .pme_code = 0x10000e, .pme_short_desc = "PURR Event", .pme_long_desc = "PURR Event", }, [ POWER6_PME_PM_DERAT_MISS_64K ] = { .pme_name = "PM_DERAT_MISS_64K", .pme_code = 0x292070, .pme_short_desc = "DERAT misses for 64K page", .pme_long_desc = "A data request (load or store) missed the ERAT for 64K page and resulted in an ERAT reload.", }, [ POWER6_PME_PM_PMC2_REWIND ] = { .pme_name = "PM_PMC2_REWIND", .pme_code = 0x300020, .pme_short_desc = "PMC2 rewind event", .pme_long_desc = "PMC2 rewind event", }, [ POWER6_PME_PM_INST_FROM_L2 ] = { .pme_name = "PM_INST_FROM_L2", .pme_code = 0x142040, .pme_short_desc = "Instructions fetched from L2", .pme_long_desc = "An instruction fetch group was fetched from L2. Fetch Groups can contain up to 8 instructions", }, [ POWER6_PME_PM_INST_DISP ] = { .pme_name = "PM_INST_DISP", .pme_code = 0x200012, .pme_short_desc = "Instructions dispatched", .pme_long_desc = "The ISU sends the number of instructions dispatched.", }, [ POWER6_PME_PM_DATA_FROM_L25_SHR ] = { .pme_name = "PM_DATA_FROM_L25_SHR", .pme_code = 0x40005a, .pme_short_desc = "Data loaded from L2.5 shared", .pme_long_desc = "DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a demand load", }, [ POWER6_PME_PM_L1_DCACHE_RELOAD_VALID ] = { .pme_name = "PM_L1_DCACHE_RELOAD_VALID", .pme_code = 0x3000f6, .pme_short_desc = "L1 reload data source valid", .pme_long_desc = "The data source information is valid", }, [ POWER6_PME_PM_LSU1_REJECT_UST ] = { .pme_name = "PM_LSU1_REJECT_UST", .pme_code = 0x9008a, .pme_short_desc = "LSU1 unaligned store reject", .pme_long_desc = "LSU1 unaligned store reject", }, [ POWER6_PME_PM_FAB_ADDR_COLLISION ] = { .pme_name = "PM_FAB_ADDR_COLLISION", .pme_code = 0x5018e, .pme_short_desc = "local node launch collision with off-node address", .pme_long_desc = "local node launch collision with off-node address", }, [ POWER6_PME_PM_MRK_FXU_FIN ] = { 
.pme_name = "PM_MRK_FXU_FIN", .pme_code = 0x20001a, .pme_short_desc = "Marked instruction FXU processing finished", .pme_long_desc = "The fixed point units (Unit 0 + Unit 1) finished a marked instruction. Instructions that finish may not necessarily complete.", }, [ POWER6_PME_PM_LSU0_REJECT_UST ] = { .pme_name = "PM_LSU0_REJECT_UST", .pme_code = 0x90082, .pme_short_desc = "LSU0 unaligned store reject", .pme_long_desc = "LSU0 unaligned store reject", }, [ POWER6_PME_PM_PMC4_OVERFLOW ] = { .pme_name = "PM_PMC4_OVERFLOW", .pme_code = 0x100014, .pme_short_desc = "PMC4 Overflow", .pme_long_desc = "PMC4 Overflow", }, [ POWER6_PME_PM_MRK_PTEG_FROM_L3 ] = { .pme_name = "PM_MRK_PTEG_FROM_L3", .pme_code = 0x312040, .pme_short_desc = "Marked PTEG loaded from L3", .pme_long_desc = "Marked PTEG loaded from L3", }, [ POWER6_PME_PM_INST_FROM_L2MISS ] = { .pme_name = "PM_INST_FROM_L2MISS", .pme_code = 0x442054, .pme_short_desc = "Instructions fetched missed L2", .pme_long_desc = "An instruction fetch group was fetched from beyond L2.", }, [ POWER6_PME_PM_L2SB_ST_HIT ] = { .pme_name = "PM_L2SB_ST_HIT", .pme_code = 0x5078e, .pme_short_desc = "L2 slice B store hits", .pme_long_desc = "A store request made from the core hit in the L2 directory. 
This event is provided on each of the three L2 slices A,B, and C.", }, [ POWER6_PME_PM_DPU_WT_IC_MISS_COUNT ] = { .pme_name = "PM_DPU_WT_IC_MISS_COUNT", .pme_code = 0x20000d, .pme_short_desc = "Periods DISP unit is stalled due to I cache miss", .pme_long_desc = "Periods DISP unit is stalled due to I cache miss", }, [ POWER6_PME_PM_MRK_DATA_FROM_DL2L3_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_SHR", .pme_code = 0x30304c, .pme_short_desc = "Marked data loaded from distant L2 or L3 shared", .pme_long_desc = "Marked data loaded from distant L2 or L3 shared", }, [ POWER6_PME_PM_MRK_PTEG_FROM_L35_MOD ] = { .pme_name = "PM_MRK_PTEG_FROM_L35_MOD", .pme_code = 0x112046, .pme_short_desc = "Marked PTEG loaded from L3.5 modified", .pme_long_desc = "Marked PTEG loaded from L3.5 modified", }, [ POWER6_PME_PM_FPU1_FPSCR ] = { .pme_name = "PM_FPU1_FPSCR", .pme_code = 0xd008c, .pme_short_desc = "FPU1 executed FPSCR instruction", .pme_long_desc = "FPU1 executed FPSCR instruction", }, [ POWER6_PME_PM_LSU_REJECT_UST ] = { .pme_name = "PM_LSU_REJECT_UST", .pme_code = 0x290030, .pme_short_desc = "Unaligned store reject", .pme_long_desc = "Unaligned store reject", }, [ POWER6_PME_PM_LSU0_DERAT_MISS ] = { .pme_name = "PM_LSU0_DERAT_MISS", .pme_code = 0x910a6, .pme_short_desc = "LSU0 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. 
Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", }, [ POWER6_PME_PM_MRK_PTEG_FROM_MEM_DP ] = { .pme_name = "PM_MRK_PTEG_FROM_MEM_DP", .pme_code = 0x112042, .pme_short_desc = "Marked PTEG loaded from double pump memory", .pme_long_desc = "Marked PTEG loaded from double pump memory", }, [ POWER6_PME_PM_MRK_DATA_FROM_L2 ] = { .pme_name = "PM_MRK_DATA_FROM_L2", .pme_code = 0x103048, .pme_short_desc = "Marked data loaded from L2", .pme_long_desc = "DL1 was reloaded from the local L2 due to a marked demand load", }, [ POWER6_PME_PM_FPU0_FSQRT_FDIV ] = { .pme_name = "PM_FPU0_FSQRT_FDIV", .pme_code = 0xc0084, .pme_short_desc = "FPU0 executed FSQRT or FDIV instruction", .pme_long_desc = "FPU0 executed FSQRT or FDIV instruction", }, [ POWER6_PME_PM_DPU_HELD_FXU_SOPS ] = { .pme_name = "PM_DPU_HELD_FXU_SOPS", .pme_code = 0x30088, .pme_short_desc = "DISP unit held due to FXU slow ops (mtmsr, scv, rfscv)", .pme_long_desc = "DISP unit held due to FXU slow ops (mtmsr, scv, rfscv)", }, [ POWER6_PME_PM_MRK_FPU0_FIN ] = { .pme_name = "PM_MRK_FPU0_FIN", .pme_code = 0xd0082, .pme_short_desc = "Marked instruction FPU0 processing finished", .pme_long_desc = "Marked instruction FPU0 processing finished", }, [ POWER6_PME_PM_L2SB_LD_MISS_DATA ] = { .pme_name = "PM_L2SB_LD_MISS_DATA", .pme_code = 0x5048a, .pme_short_desc = "L2 slice B data load misses", .pme_long_desc = "L2 slice B data load misses", }, [ POWER6_PME_PM_LSU_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_SRQ_EMPTY_CYC", .pme_code = 0x40001c, .pme_short_desc = "Cycles SRQ empty", .pme_long_desc = "The Store Request Queue is empty", }, [ POWER6_PME_PM_1PLUS_PPC_DISP ] = { .pme_name = "PM_1PLUS_PPC_DISP", .pme_code = 0x100012, .pme_short_desc = "Cycles at least one instruction dispatched", .pme_long_desc = "Cycles at least one instruction dispatched", }, [ POWER6_PME_PM_VMX_ST_ISSUED ] = { .pme_name = "PM_VMX_ST_ISSUED", .pme_code = 0xb0080, .pme_short_desc = "VMX store issued", 
.pme_long_desc = "VMX store issued", }, [ POWER6_PME_PM_DATA_FROM_L2MISS ] = { .pme_name = "PM_DATA_FROM_L2MISS", .pme_code = 0x2000fe, .pme_short_desc = "Data loaded missed L2", .pme_long_desc = "DL1 was reloaded from beyond L2.", }, [ POWER6_PME_PM_LSU0_REJECT_ULD ] = { .pme_name = "PM_LSU0_REJECT_ULD", .pme_code = 0x90080, .pme_short_desc = "LSU0 unaligned load reject", .pme_long_desc = "LSU0 unaligned load reject", }, [ POWER6_PME_PM_SUSPENDED ] = { .pme_name = "PM_SUSPENDED", .pme_code = 0x0, .pme_short_desc = "Suspended", .pme_long_desc = "Suspended", }, [ POWER6_PME_PM_DFU_ADD_SHIFTED_BOTH ] = { .pme_name = "PM_DFU_ADD_SHIFTED_BOTH", .pme_code = 0xe0088, .pme_short_desc = "DFU add type with both operands shifted", .pme_long_desc = "DFU add type with both operands shifted", }, [ POWER6_PME_PM_LSU_REJECT_NO_SCRATCH ] = { .pme_name = "PM_LSU_REJECT_NO_SCRATCH", .pme_code = 0x2a1034, .pme_short_desc = "LSU reject due to scratch register not available", .pme_long_desc = "LSU reject due to scratch register not available", }, [ POWER6_PME_PM_STCX_FAIL ] = { .pme_name = "PM_STCX_FAIL", .pme_code = 0x830ee, .pme_short_desc = "STCX failed", .pme_long_desc = "A stcx (stwcx or stdcx) failed", }, [ POWER6_PME_PM_FPU1_DENORM ] = { .pme_name = "PM_FPU1_DENORM", .pme_code = 0xc10aa, .pme_short_desc = "FPU1 received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized.", }, [ POWER6_PME_PM_GCT_NOSLOT_COUNT ] = { .pme_name = "PM_GCT_NOSLOT_COUNT", .pme_code = 0x100009, .pme_short_desc = "Periods no GCT slot allocated", .pme_long_desc = "Periods this thread does not have any slots allocated in the GCT.", }, [ POWER6_PME_PM_DATA_FROM_DL2L3_SHR_CYC ] = { .pme_name = "PM_DATA_FROM_DL2L3_SHR_CYC", .pme_code = 0x20002a, .pme_short_desc = "Load latency from distant L2 or L3 shared", .pme_long_desc = "Load latency from distant L2 or L3 shared", }, [ POWER6_PME_PM_DATA_FROM_L21 ] = { .pme_name = "PM_DATA_FROM_L21", 
.pme_code = 0x200058, .pme_short_desc = "Data loaded from private L2 other core", .pme_long_desc = "Data loaded from private L2 other core", }, [ POWER6_PME_PM_FPU_1FLOP ] = { .pme_name = "PM_FPU_1FLOP", .pme_code = 0x1c0030, .pme_short_desc = "FPU executed one flop instruction ", .pme_long_desc = "This event counts the number of one flop instructions. These could be fadd*, fmul*, fsub*, fneg+, fabs+, fnabs+, fres+, frsqrte+, fcmp**, or fsel where XYZ* means XYZ, XYZs, XYZ., XYZs., XYZ+ means XYZ, XYZ., and XYZ** means XYZu, XYZo.", }, [ POWER6_PME_PM_LSU1_REJECT ] = { .pme_name = "PM_LSU1_REJECT", .pme_code = 0xa10ae, .pme_short_desc = "LSU1 reject", .pme_long_desc = "LSU1 reject", }, [ POWER6_PME_PM_IC_REQ ] = { .pme_name = "PM_IC_REQ", .pme_code = 0x4008a, .pme_short_desc = "I cache demand of prefetch request", .pme_long_desc = "I cache demand of prefetch request", }, [ POWER6_PME_PM_MRK_DFU_FIN ] = { .pme_name = "PM_MRK_DFU_FIN", .pme_code = 0x300008, .pme_short_desc = "DFU marked instruction finish", .pme_long_desc = "DFU marked instruction finish", }, [ POWER6_PME_PM_NOT_LLA_CYC ] = { .pme_name = "PM_NOT_LLA_CYC", .pme_code = 0x401e, .pme_short_desc = "Load Look Ahead not Active", .pme_long_desc = "Load Look Ahead not Active", }, [ POWER6_PME_PM_INST_FROM_L1 ] = { .pme_name = "PM_INST_FROM_L1", .pme_code = 0x40082, .pme_short_desc = "Instruction fetched from L1", .pme_long_desc = "An instruction fetch group was fetched from L1. 
Fetch Groups can contain up to 8 instructions", }, [ POWER6_PME_PM_MRK_VMX_COMPLEX_ISSUED ] = { .pme_name = "PM_MRK_VMX_COMPLEX_ISSUED", .pme_code = 0x7008c, .pme_short_desc = "Marked VMX instruction issued to complex", .pme_long_desc = "Marked VMX instruction issued to complex", }, [ POWER6_PME_PM_BRU_FIN ] = { .pme_name = "PM_BRU_FIN", .pme_code = 0x430e6, .pme_short_desc = "BRU produced a result", .pme_long_desc = "BRU produced a result", }, [ POWER6_PME_PM_LSU1_REJECT_EXTERN ] = { .pme_name = "PM_LSU1_REJECT_EXTERN", .pme_code = 0xa10ac, .pme_short_desc = "LSU1 external reject request ", .pme_long_desc = "LSU1 external reject request ", }, [ POWER6_PME_PM_DATA_FROM_L21_CYC ] = { .pme_name = "PM_DATA_FROM_L21_CYC", .pme_code = 0x400020, .pme_short_desc = "Load latency from private L2 other core", .pme_long_desc = "Load latency from private L2 other core", }, [ POWER6_PME_PM_GXI_CYC_BUSY ] = { .pme_name = "PM_GXI_CYC_BUSY", .pme_code = 0x50386, .pme_short_desc = "Inbound GX bus utilizations (# of cycles in use)", .pme_long_desc = "Inbound GX bus utilizations (# of cycles in use)", }, [ POWER6_PME_PM_MRK_LD_MISS_L1 ] = { .pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 0x200056, .pme_short_desc = "Marked L1 D cache load misses", .pme_long_desc = "Marked L1 D cache load misses", }, [ POWER6_PME_PM_L1_WRITE_CYC ] = { .pme_name = "PM_L1_WRITE_CYC", .pme_code = 0x430e2, .pme_short_desc = "Cycles writing to instruction L1", .pme_long_desc = "This signal is asserted each cycle a cache write is active.", }, [ POWER6_PME_PM_LLA_CYC ] = { .pme_name = "PM_LLA_CYC", .pme_code = 0xc01e, .pme_short_desc = "Load Look Ahead Active", .pme_long_desc = "Load Look Ahead Active", }, [ POWER6_PME_PM_MRK_DATA_FROM_L2MISS ] = { .pme_name = "PM_MRK_DATA_FROM_L2MISS", .pme_code = 0x103028, .pme_short_desc = "Marked data loaded missed L2", .pme_long_desc = "DL1 was reloaded from beyond L2 due to a marked demand load.", }, [ POWER6_PME_PM_GCT_FULL_COUNT ] = { .pme_name = "PM_GCT_FULL_COUNT", 
.pme_code = 0x40087, .pme_short_desc = "Periods GCT full", .pme_long_desc = "The ISU sends a signal indicating the gct is full.", }, [ POWER6_PME_PM_MEM_DP_RQ_LOC_GLOB ] = { .pme_name = "PM_MEM_DP_RQ_LOC_GLOB", .pme_code = 0x250230, .pme_short_desc = "Memory read queue marking cache line double pump state from local to global", .pme_long_desc = "Memory read queue marking cache line double pump state from local to global", }, [ POWER6_PME_PM_DATA_FROM_RL2L3_SHR ] = { .pme_name = "PM_DATA_FROM_RL2L3_SHR", .pme_code = 0x20005c, .pme_short_desc = "Data loaded from remote L2 or L3 shared", .pme_long_desc = "Data loaded from remote L2 or L3 shared", }, [ POWER6_PME_PM_MRK_LSU_REJECT_UST ] = { .pme_name = "PM_MRK_LSU_REJECT_UST", .pme_code = 0x293034, .pme_short_desc = "Marked unaligned store reject", .pme_long_desc = "Marked unaligned store reject", }, [ POWER6_PME_PM_MRK_VMX_PERMUTE_ISSUED ] = { .pme_name = "PM_MRK_VMX_PERMUTE_ISSUED", .pme_code = 0x7008e, .pme_short_desc = "Marked VMX instruction issued to permute", .pme_long_desc = "Marked VMX instruction issued to permute", }, [ POWER6_PME_PM_MRK_PTEG_FROM_L21 ] = { .pme_name = "PM_MRK_PTEG_FROM_L21", .pme_code = 0x212040, .pme_short_desc = "Marked PTEG loaded from private L2 other core", .pme_long_desc = "Marked PTEG loaded from private L2 other core", }, [ POWER6_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { .pme_name = "PM_THRD_GRP_CMPL_BOTH_CYC", .pme_code = 0x200018, .pme_short_desc = "Cycles group completed by both threads", .pme_long_desc = "Cycles group completed by both threads", }, [ POWER6_PME_PM_BR_MPRED ] = { .pme_name = "PM_BR_MPRED", .pme_code = 0x400052, .pme_short_desc = "Branches incorrectly predicted", .pme_long_desc = "Branches incorrectly predicted", }, [ POWER6_PME_PM_LD_REQ_L2 ] = { .pme_name = "PM_LD_REQ_L2", .pme_code = 0x150730, .pme_short_desc = "L2 load requests ", .pme_long_desc = "L2 load requests ", }, [ POWER6_PME_PM_FLUSH_ASYNC ] = { .pme_name = "PM_FLUSH_ASYNC", .pme_code = 0x220ca, 
.pme_short_desc = "Flush caused by asynchronous exception", .pme_long_desc = "Flush caused by asynchronous exception", }, [ POWER6_PME_PM_HV_CYC ] = { .pme_name = "PM_HV_CYC", .pme_code = 0x200016, .pme_short_desc = "Hypervisor Cycles", .pme_long_desc = "Cycles when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0)", }, [ POWER6_PME_PM_LSU1_DERAT_MISS ] = { .pme_name = "PM_LSU1_DERAT_MISS", .pme_code = 0x910ae, .pme_short_desc = "LSU1 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", }, [ POWER6_PME_PM_DPU_HELD_SMT ] = { .pme_name = "PM_DPU_HELD_SMT", .pme_code = 0x20082, .pme_short_desc = "DISP unit held due to SMT conflicts ", .pme_long_desc = "DISP unit held due to SMT conflicts ", }, [ POWER6_PME_PM_MRK_LSU_FIN ] = { .pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x40001a, .pme_short_desc = "Marked instruction LSU processing finished", .pme_long_desc = "One of the Load/Store Units finished a marked instruction. 
Instructions that finish may not necessary complete", }, [ POWER6_PME_PM_MRK_DATA_FROM_RL2L3_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_SHR", .pme_code = 0x20304c, .pme_short_desc = "Marked data loaded from remote L2 or L3 shared", .pme_long_desc = "Marked data loaded from remote L2 or L3 shared", }, [ POWER6_PME_PM_LSU0_REJECT_STQ_FULL ] = { .pme_name = "PM_LSU0_REJECT_STQ_FULL", .pme_code = 0xa0080, .pme_short_desc = "LSU0 reject due to store queue full", .pme_long_desc = "LSU0 reject due to store queue full", }, [ POWER6_PME_PM_MRK_DERAT_REF_4K ] = { .pme_name = "PM_MRK_DERAT_REF_4K", .pme_code = 0x282044, .pme_short_desc = "Marked DERAT reference for 4K page", .pme_long_desc = "Marked DERAT reference for 4K page", }, [ POWER6_PME_PM_FPU_ISSUE_STALL_FPR ] = { .pme_name = "PM_FPU_ISSUE_STALL_FPR", .pme_code = 0x330e2, .pme_short_desc = "FPU issue stalled due to FPR dependencies", .pme_long_desc = "FPU issue stalled due to FPR dependencies", }, [ POWER6_PME_PM_IFU_FIN ] = { .pme_name = "PM_IFU_FIN", .pme_code = 0x430e4, .pme_short_desc = "IFU finished an instruction", .pme_long_desc = "IFU finished an instruction", }, [ POWER6_PME_PM_GXO_CYC_BUSY ] = { .pme_name = "PM_GXO_CYC_BUSY", .pme_code = 0x50380, .pme_short_desc = "Outbound GX bus utilizations (# of cycles in use)", .pme_long_desc = "Outbound GX bus utilizations (# of cycles in use)", } }; #endif papi-papi-7-2-0-t/src/libpfm4/lib/events/power7_events.h000066400000000000000000005076721502707512200230370ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __POWER7_EVENTS_H__ #define __POWER7_EVENTS_H__ /* * File: power7_events.h * CVS: * Author: Corey Ashford * cjashfor@us.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2009. All Rights Reserved. * Contributed by Corey Ashford * * Note: This code was automatically generated and should not be modified by * hand. 
* * Documentation on the PMU events can be found at: * http://www.power.org/documentation/comprehensive-pmu-event-reference-power7 */ #define POWER7_PME_PM_IC_DEMAND_L2_BR_ALL 0 #define POWER7_PME_PM_GCT_UTIL_7_TO_10_SLOTS 1 #define POWER7_PME_PM_PMC2_SAVED 2 #define POWER7_PME_PM_CMPLU_STALL_DFU 3 #define POWER7_PME_PM_VSU0_16FLOP 4 #define POWER7_PME_PM_MRK_LSU_DERAT_MISS 5 #define POWER7_PME_PM_MRK_ST_CMPL 6 #define POWER7_PME_PM_NEST_PAIR3_ADD 7 #define POWER7_PME_PM_L2_ST_DISP 8 #define POWER7_PME_PM_L2_CASTOUT_MOD 9 #define POWER7_PME_PM_ISEG 10 #define POWER7_PME_PM_MRK_INST_TIMEO 11 #define POWER7_PME_PM_L2_RCST_DISP_FAIL_ADDR 12 #define POWER7_PME_PM_LSU1_DC_PREF_STREAM_CONFIRM 13 #define POWER7_PME_PM_IERAT_WR_64K 14 #define POWER7_PME_PM_MRK_DTLB_MISS_16M 15 #define POWER7_PME_PM_IERAT_MISS 16 #define POWER7_PME_PM_MRK_PTEG_FROM_LMEM 17 #define POWER7_PME_PM_FLOP 18 #define POWER7_PME_PM_THRD_PRIO_4_5_CYC 19 #define POWER7_PME_PM_BR_PRED_TA 20 #define POWER7_PME_PM_CMPLU_STALL_FXU 21 #define POWER7_PME_PM_EXT_INT 22 #define POWER7_PME_PM_VSU_FSQRT_FDIV 23 #define POWER7_PME_PM_MRK_LD_MISS_EXPOSED_CYC 24 #define POWER7_PME_PM_LSU1_LDF 25 #define POWER7_PME_PM_IC_WRITE_ALL 26 #define POWER7_PME_PM_LSU0_SRQ_STFWD 27 #define POWER7_PME_PM_PTEG_FROM_RL2L3_MOD 28 #define POWER7_PME_PM_MRK_DATA_FROM_L31_SHR 29 #define POWER7_PME_PM_DATA_FROM_L21_MOD 30 #define POWER7_PME_PM_VSU1_SCAL_DOUBLE_ISSUED 31 #define POWER7_PME_PM_VSU0_8FLOP 32 #define POWER7_PME_PM_POWER_EVENT1 33 #define POWER7_PME_PM_DISP_CLB_HELD_BAL 34 #define POWER7_PME_PM_VSU1_2FLOP 35 #define POWER7_PME_PM_LWSYNC_HELD 36 #define POWER7_PME_PM_PTEG_FROM_DL2L3_SHR 37 #define POWER7_PME_PM_INST_FROM_L21_MOD 38 #define POWER7_PME_PM_IERAT_XLATE_WR_16MPLUS 39 #define POWER7_PME_PM_IC_REQ_ALL 40 #define POWER7_PME_PM_DSLB_MISS 41 #define POWER7_PME_PM_L3_MISS 42 #define POWER7_PME_PM_LSU0_L1_PREF 43 #define POWER7_PME_PM_VSU_SCALAR_SINGLE_ISSUED 44 #define 
POWER7_PME_PM_LSU1_DC_PREF_STREAM_CONFIRM_STRIDE 45 #define POWER7_PME_PM_L2_INST 46 #define POWER7_PME_PM_VSU0_FRSP 47 #define POWER7_PME_PM_FLUSH_DISP 48 #define POWER7_PME_PM_PTEG_FROM_L2MISS 49 #define POWER7_PME_PM_VSU1_DQ_ISSUED 50 #define POWER7_PME_PM_CMPLU_STALL_LSU 51 #define POWER7_PME_PM_MRK_DATA_FROM_DMEM 52 #define POWER7_PME_PM_LSU_FLUSH_ULD 53 #define POWER7_PME_PM_PTEG_FROM_LMEM 54 #define POWER7_PME_PM_MRK_DERAT_MISS_16M 55 #define POWER7_PME_PM_THRD_ALL_RUN_CYC 56 #define POWER7_PME_PM_MEM0_PREFETCH_DISP 57 #define POWER7_PME_PM_MRK_STALL_CMPLU_CYC_COUNT 58 #define POWER7_PME_PM_DATA_FROM_DL2L3_MOD 59 #define POWER7_PME_PM_VSU_FRSP 60 #define POWER7_PME_PM_MRK_DATA_FROM_L21_MOD 61 #define POWER7_PME_PM_PMC1_OVERFLOW 62 #define POWER7_PME_PM_VSU0_SINGLE 63 #define POWER7_PME_PM_MRK_PTEG_FROM_L3MISS 64 #define POWER7_PME_PM_MRK_PTEG_FROM_L31_SHR 65 #define POWER7_PME_PM_VSU0_VECTOR_SP_ISSUED 66 #define POWER7_PME_PM_VSU1_FEST 67 #define POWER7_PME_PM_MRK_INST_DISP 68 #define POWER7_PME_PM_VSU0_COMPLEX_ISSUED 69 #define POWER7_PME_PM_LSU1_FLUSH_UST 70 #define POWER7_PME_PM_INST_CMPL 71 #define POWER7_PME_PM_FXU_IDLE 72 #define POWER7_PME_PM_LSU0_FLUSH_ULD 73 #define POWER7_PME_PM_MRK_DATA_FROM_DL2L3_MOD 74 #define POWER7_PME_PM_LSU_LMQ_SRQ_EMPTY_ALL_CYC 75 #define POWER7_PME_PM_LSU1_REJECT_LMQ_FULL 76 #define POWER7_PME_PM_INST_PTEG_FROM_L21_MOD 77 #define POWER7_PME_PM_INST_FROM_RL2L3_MOD 78 #define POWER7_PME_PM_SHL_CREATED 79 #define POWER7_PME_PM_L2_ST_HIT 80 #define POWER7_PME_PM_DATA_FROM_DMEM 81 #define POWER7_PME_PM_L3_LD_MISS 82 #define POWER7_PME_PM_FXU1_BUSY_FXU0_IDLE 83 #define POWER7_PME_PM_DISP_CLB_HELD_RES 84 #define POWER7_PME_PM_L2_SN_SX_I_DONE 85 #define POWER7_PME_PM_GRP_CMPL 86 #define POWER7_PME_PM_STCX_CMPL 87 #define POWER7_PME_PM_VSU0_2FLOP 88 #define POWER7_PME_PM_L3_PREF_MISS 89 #define POWER7_PME_PM_LSU_SRQ_SYNC_CYC 90 #define POWER7_PME_PM_LSU_REJECT_ERAT_MISS 91 #define POWER7_PME_PM_L1_ICACHE_MISS 92 #define 
POWER7_PME_PM_LSU1_FLUSH_SRQ 93 #define POWER7_PME_PM_LD_REF_L1_LSU0 94 #define POWER7_PME_PM_VSU0_FEST 95 #define POWER7_PME_PM_VSU_VECTOR_SINGLE_ISSUED 96 #define POWER7_PME_PM_FREQ_UP 97 #define POWER7_PME_PM_DATA_FROM_LMEM 98 #define POWER7_PME_PM_LSU1_LDX 99 #define POWER7_PME_PM_PMC3_OVERFLOW 100 #define POWER7_PME_PM_MRK_BR_MPRED 101 #define POWER7_PME_PM_SHL_MATCH 102 #define POWER7_PME_PM_MRK_BR_TAKEN 103 #define POWER7_PME_PM_CMPLU_STALL_BRU 104 #define POWER7_PME_PM_ISLB_MISS 105 #define POWER7_PME_PM_CYC 106 #define POWER7_PME_PM_DISP_HELD_THERMAL 107 #define POWER7_PME_PM_INST_PTEG_FROM_RL2L3_SHR 108 #define POWER7_PME_PM_LSU1_SRQ_STFWD 109 #define POWER7_PME_PM_GCT_NOSLOT_BR_MPRED 110 #define POWER7_PME_PM_1PLUS_PPC_CMPL 111 #define POWER7_PME_PM_PTEG_FROM_DMEM 112 #define POWER7_PME_PM_VSU_2FLOP 113 #define POWER7_PME_PM_GCT_FULL_CYC 114 #define POWER7_PME_PM_MRK_DATA_FROM_L3_CYC 115 #define POWER7_PME_PM_LSU_SRQ_S0_ALLOC 116 #define POWER7_PME_PM_MRK_DERAT_MISS_4K 117 #define POWER7_PME_PM_BR_MPRED_TA 118 #define POWER7_PME_PM_INST_PTEG_FROM_L2MISS 119 #define POWER7_PME_PM_DPU_HELD_POWER 120 #define POWER7_PME_PM_RUN_INST_CMPL 121 #define POWER7_PME_PM_MRK_VSU_FIN 122 #define POWER7_PME_PM_LSU_SRQ_S0_VALID 123 #define POWER7_PME_PM_GCT_EMPTY_CYC 124 #define POWER7_PME_PM_IOPS_DISP 125 #define POWER7_PME_PM_RUN_SPURR 126 #define POWER7_PME_PM_PTEG_FROM_L21_MOD 127 #define POWER7_PME_PM_VSU0_1FLOP 128 #define POWER7_PME_PM_SNOOP_TLBIE 129 #define POWER7_PME_PM_DATA_FROM_L3MISS 130 #define POWER7_PME_PM_VSU_SINGLE 131 #define POWER7_PME_PM_DTLB_MISS_16G 132 #define POWER7_PME_PM_CMPLU_STALL_VECTOR 133 #define POWER7_PME_PM_FLUSH 134 #define POWER7_PME_PM_L2_LD_HIT 135 #define POWER7_PME_PM_NEST_PAIR2_AND 136 #define POWER7_PME_PM_VSU1_1FLOP 137 #define POWER7_PME_PM_IC_PREF_REQ 138 #define POWER7_PME_PM_L3_LD_HIT 139 #define POWER7_PME_PM_GCT_NOSLOT_IC_MISS 140 #define POWER7_PME_PM_DISP_HELD 141 #define POWER7_PME_PM_L2_LD 142 #define 
POWER7_PME_PM_LSU_FLUSH_SRQ 143 #define POWER7_PME_PM_BC_PLUS_8_CONV 144 #define POWER7_PME_PM_MRK_DATA_FROM_L31_MOD_CYC 145 #define POWER7_PME_PM_CMPLU_STALL_VECTOR_LONG 146 #define POWER7_PME_PM_L2_RCST_BUSY_RC_FULL 147 #define POWER7_PME_PM_TB_BIT_TRANS 148 #define POWER7_PME_PM_THERMAL_MAX 149 #define POWER7_PME_PM_LSU1_FLUSH_ULD 150 #define POWER7_PME_PM_LSU1_REJECT_LHS 151 #define POWER7_PME_PM_LSU_LRQ_S0_ALLOC 152 #define POWER7_PME_PM_L3_CO_L31 153 #define POWER7_PME_PM_POWER_EVENT4 154 #define POWER7_PME_PM_DATA_FROM_L31_SHR 155 #define POWER7_PME_PM_BR_UNCOND 156 #define POWER7_PME_PM_LSU1_DC_PREF_STREAM_ALLOC 157 #define POWER7_PME_PM_PMC4_REWIND 158 #define POWER7_PME_PM_L2_RCLD_DISP 159 #define POWER7_PME_PM_THRD_PRIO_2_3_CYC 160 #define POWER7_PME_PM_MRK_PTEG_FROM_L2MISS 161 #define POWER7_PME_PM_IC_DEMAND_L2_BHT_REDIRECT 162 #define POWER7_PME_PM_LSU_DERAT_MISS 163 #define POWER7_PME_PM_IC_PREF_CANCEL_L2 164 #define POWER7_PME_PM_MRK_FIN_STALL_CYC_COUNT 165 #define POWER7_PME_PM_BR_PRED_CCACHE 166 #define POWER7_PME_PM_GCT_UTIL_1_TO_2_SLOTS 167 #define POWER7_PME_PM_MRK_ST_CMPL_INT 168 #define POWER7_PME_PM_LSU_TWO_TABLEWALK_CYC 169 #define POWER7_PME_PM_MRK_DATA_FROM_L3MISS 170 #define POWER7_PME_PM_GCT_NOSLOT_CYC 171 #define POWER7_PME_PM_LSU_SET_MPRED 172 #define POWER7_PME_PM_FLUSH_DISP_TLBIE 173 #define POWER7_PME_PM_VSU1_FCONV 174 #define POWER7_PME_PM_DERAT_MISS_16G 175 #define POWER7_PME_PM_INST_FROM_LMEM 176 #define POWER7_PME_PM_IC_DEMAND_L2_BR_REDIRECT 177 #define POWER7_PME_PM_CMPLU_STALL_SCALAR_LONG 178 #define POWER7_PME_PM_INST_PTEG_FROM_L2 179 #define POWER7_PME_PM_PTEG_FROM_L2 180 #define POWER7_PME_PM_MRK_DATA_FROM_L21_SHR_CYC 181 #define POWER7_PME_PM_MRK_DTLB_MISS_4K 182 #define POWER7_PME_PM_VSU0_FPSCR 183 #define POWER7_PME_PM_VSU1_VECT_DOUBLE_ISSUED 184 #define POWER7_PME_PM_MRK_PTEG_FROM_RL2L3_MOD 185 #define POWER7_PME_PM_MEM0_RQ_DISP 186 #define POWER7_PME_PM_L2_LD_MISS 187 #define POWER7_PME_PM_VMX_RESULT_SAT_1 188 #define 
POWER7_PME_PM_L1_PREF 189 #define POWER7_PME_PM_MRK_DATA_FROM_LMEM_CYC 190 #define POWER7_PME_PM_GRP_IC_MISS_NONSPEC 191 #define POWER7_PME_PM_PB_NODE_PUMP 192 #define POWER7_PME_PM_SHL_MERGED 193 #define POWER7_PME_PM_NEST_PAIR1_ADD 194 #define POWER7_PME_PM_DATA_FROM_L3 195 #define POWER7_PME_PM_LSU_FLUSH 196 #define POWER7_PME_PM_LSU_SRQ_SYNC_COUNT 197 #define POWER7_PME_PM_PMC2_OVERFLOW 198 #define POWER7_PME_PM_LSU_LDF 199 #define POWER7_PME_PM_POWER_EVENT3 200 #define POWER7_PME_PM_DISP_WT 201 #define POWER7_PME_PM_CMPLU_STALL_REJECT 202 #define POWER7_PME_PM_IC_BANK_CONFLICT 203 #define POWER7_PME_PM_BR_MPRED_CR_TA 204 #define POWER7_PME_PM_L2_INST_MISS 205 #define POWER7_PME_PM_CMPLU_STALL_ERAT_MISS 206 #define POWER7_PME_PM_NEST_PAIR2_ADD 207 #define POWER7_PME_PM_MRK_LSU_FLUSH 208 #define POWER7_PME_PM_L2_LDST 209 #define POWER7_PME_PM_INST_FROM_L31_SHR 210 #define POWER7_PME_PM_VSU0_FIN 211 #define POWER7_PME_PM_LARX_LSU 212 #define POWER7_PME_PM_INST_FROM_RMEM 213 #define POWER7_PME_PM_DISP_CLB_HELD_TLBIE 214 #define POWER7_PME_PM_MRK_DATA_FROM_DMEM_CYC 215 #define POWER7_PME_PM_BR_PRED_CR 216 #define POWER7_PME_PM_LSU_REJECT 217 #define POWER7_PME_PM_GCT_UTIL_3_TO_6_SLOTS 218 #define POWER7_PME_PM_CMPLU_STALL_END_GCT_NOSLOT 219 #define POWER7_PME_PM_LSU0_REJECT_LMQ_FULL 220 #define POWER7_PME_PM_VSU_FEST 221 #define POWER7_PME_PM_NEST_PAIR0_AND 222 #define POWER7_PME_PM_PTEG_FROM_L3 223 #define POWER7_PME_PM_POWER_EVENT2 224 #define POWER7_PME_PM_IC_PREF_CANCEL_PAGE 225 #define POWER7_PME_PM_VSU0_FSQRT_FDIV 226 #define POWER7_PME_PM_MRK_GRP_CMPL 227 #define POWER7_PME_PM_VSU0_SCAL_DOUBLE_ISSUED 228 #define POWER7_PME_PM_GRP_DISP 229 #define POWER7_PME_PM_LSU0_LDX 230 #define POWER7_PME_PM_DATA_FROM_L2 231 #define POWER7_PME_PM_MRK_DATA_FROM_RL2L3_MOD 232 #define POWER7_PME_PM_LD_REF_L1 233 #define POWER7_PME_PM_VSU0_VECT_DOUBLE_ISSUED 234 #define POWER7_PME_PM_VSU1_2FLOP_DOUBLE 235 #define POWER7_PME_PM_THRD_PRIO_6_7_CYC 236 #define 
POWER7_PME_PM_BC_PLUS_8_RSLV_TAKEN 237 #define POWER7_PME_PM_BR_MPRED_CR 238 #define POWER7_PME_PM_L3_CO_MEM 239 #define POWER7_PME_PM_LD_MISS_L1 240 #define POWER7_PME_PM_DATA_FROM_RL2L3_MOD 241 #define POWER7_PME_PM_LSU_SRQ_FULL_CYC 242 #define POWER7_PME_PM_TABLEWALK_CYC 243 #define POWER7_PME_PM_MRK_PTEG_FROM_RMEM 244 #define POWER7_PME_PM_LSU_SRQ_STFWD 245 #define POWER7_PME_PM_INST_PTEG_FROM_RMEM 246 #define POWER7_PME_PM_FXU0_FIN 247 #define POWER7_PME_PM_LSU1_L1_SW_PREF 248 #define POWER7_PME_PM_PTEG_FROM_L31_MOD 249 #define POWER7_PME_PM_PMC5_OVERFLOW 250 #define POWER7_PME_PM_LD_REF_L1_LSU1 251 #define POWER7_PME_PM_INST_PTEG_FROM_L21_SHR 252 #define POWER7_PME_PM_CMPLU_STALL_THRD 253 #define POWER7_PME_PM_DATA_FROM_RMEM 254 #define POWER7_PME_PM_VSU0_SCAL_SINGLE_ISSUED 255 #define POWER7_PME_PM_BR_MPRED_LSTACK 256 #define POWER7_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC 257 #define POWER7_PME_PM_LSU0_FLUSH_UST 258 #define POWER7_PME_PM_LSU_NCST 259 #define POWER7_PME_PM_BR_TAKEN 260 #define POWER7_PME_PM_INST_PTEG_FROM_LMEM 261 #define POWER7_PME_PM_GCT_NOSLOT_BR_MPRED_IC_MISS 262 #define POWER7_PME_PM_DTLB_MISS_4K 263 #define POWER7_PME_PM_PMC4_SAVED 264 #define POWER7_PME_PM_VSU1_PERMUTE_ISSUED 265 #define POWER7_PME_PM_SLB_MISS 266 #define POWER7_PME_PM_LSU1_FLUSH_LRQ 267 #define POWER7_PME_PM_DTLB_MISS 268 #define POWER7_PME_PM_VSU1_FRSP 269 #define POWER7_PME_PM_VSU_VECTOR_DOUBLE_ISSUED 270 #define POWER7_PME_PM_L2_CASTOUT_SHR 271 #define POWER7_PME_PM_DATA_FROM_DL2L3_SHR 272 #define POWER7_PME_PM_VSU1_STF 273 #define POWER7_PME_PM_ST_FIN 274 #define POWER7_PME_PM_PTEG_FROM_L21_SHR 275 #define POWER7_PME_PM_L2_LOC_GUESS_WRONG 276 #define POWER7_PME_PM_MRK_STCX_FAIL 277 #define POWER7_PME_PM_LSU0_REJECT_LHS 278 #define POWER7_PME_PM_IC_PREF_CANCEL_HIT 279 #define POWER7_PME_PM_L3_PREF_BUSY 280 #define POWER7_PME_PM_MRK_BRU_FIN 281 #define POWER7_PME_PM_LSU1_NCLD 282 #define POWER7_PME_PM_INST_PTEG_FROM_L31_MOD 283 #define POWER7_PME_PM_LSU_NCLD 284 #define 
POWER7_PME_PM_LSU_LDX 285 #define POWER7_PME_PM_L2_LOC_GUESS_CORRECT 286 #define POWER7_PME_PM_THRESH_TIMEO 287 #define POWER7_PME_PM_L3_PREF_ST 288 #define POWER7_PME_PM_DISP_CLB_HELD_SYNC 289 #define POWER7_PME_PM_VSU_SIMPLE_ISSUED 290 #define POWER7_PME_PM_VSU1_SINGLE 291 #define POWER7_PME_PM_DATA_TABLEWALK_CYC 292 #define POWER7_PME_PM_L2_RC_ST_DONE 293 #define POWER7_PME_PM_MRK_PTEG_FROM_L21_MOD 294 #define POWER7_PME_PM_LARX_LSU1 295 #define POWER7_PME_PM_MRK_DATA_FROM_RMEM 296 #define POWER7_PME_PM_DISP_CLB_HELD 297 #define POWER7_PME_PM_DERAT_MISS_4K 298 #define POWER7_PME_PM_L2_RCLD_DISP_FAIL_ADDR 299 #define POWER7_PME_PM_SEG_EXCEPTION 300 #define POWER7_PME_PM_FLUSH_DISP_SB 301 #define POWER7_PME_PM_L2_DC_INV 302 #define POWER7_PME_PM_PTEG_FROM_DL2L3_MOD 303 #define POWER7_PME_PM_DSEG 304 #define POWER7_PME_PM_BR_PRED_LSTACK 305 #define POWER7_PME_PM_VSU0_STF 306 #define POWER7_PME_PM_LSU_FX_FIN 307 #define POWER7_PME_PM_DERAT_MISS_16M 308 #define POWER7_PME_PM_MRK_PTEG_FROM_DL2L3_MOD 309 #define POWER7_PME_PM_GCT_UTIL_11_PLUS_SLOTS 310 #define POWER7_PME_PM_INST_FROM_L3 311 #define POWER7_PME_PM_MRK_IFU_FIN 312 #define POWER7_PME_PM_ITLB_MISS 313 #define POWER7_PME_PM_VSU_STF 314 #define POWER7_PME_PM_LSU_FLUSH_UST 315 #define POWER7_PME_PM_L2_LDST_MISS 316 #define POWER7_PME_PM_FXU1_FIN 317 #define POWER7_PME_PM_SHL_DEALLOCATED 318 #define POWER7_PME_PM_L2_SN_M_WR_DONE 319 #define POWER7_PME_PM_LSU_REJECT_SET_MPRED 320 #define POWER7_PME_PM_L3_PREF_LD 321 #define POWER7_PME_PM_L2_SN_M_RD_DONE 322 #define POWER7_PME_PM_MRK_DERAT_MISS_16G 323 #define POWER7_PME_PM_VSU_FCONV 324 #define POWER7_PME_PM_ANY_THRD_RUN_CYC 325 #define POWER7_PME_PM_LSU_LMQ_FULL_CYC 326 #define POWER7_PME_PM_MRK_LSU_REJECT_LHS 327 #define POWER7_PME_PM_MRK_LD_MISS_L1_CYC 328 #define POWER7_PME_PM_MRK_DATA_FROM_L2_CYC 329 #define POWER7_PME_PM_INST_IMC_MATCH_DISP 330 #define POWER7_PME_PM_MRK_DATA_FROM_RMEM_CYC 331 #define POWER7_PME_PM_VSU0_SIMPLE_ISSUED 332 #define 
POWER7_PME_PM_CMPLU_STALL_DIV 333 #define POWER7_PME_PM_MRK_PTEG_FROM_RL2L3_SHR 334 #define POWER7_PME_PM_VSU_FMA_DOUBLE 335 #define POWER7_PME_PM_VSU_4FLOP 336 #define POWER7_PME_PM_VSU1_FIN 337 #define POWER7_PME_PM_NEST_PAIR1_AND 338 #define POWER7_PME_PM_INST_PTEG_FROM_RL2L3_MOD 339 #define POWER7_PME_PM_RUN_CYC 340 #define POWER7_PME_PM_PTEG_FROM_RMEM 341 #define POWER7_PME_PM_LSU_LRQ_S0_VALID 342 #define POWER7_PME_PM_LSU0_LDF 343 #define POWER7_PME_PM_FLUSH_COMPLETION 344 #define POWER7_PME_PM_ST_MISS_L1 345 #define POWER7_PME_PM_L2_NODE_PUMP 346 #define POWER7_PME_PM_INST_FROM_DL2L3_SHR 347 #define POWER7_PME_PM_MRK_STALL_CMPLU_CYC 348 #define POWER7_PME_PM_VSU1_DENORM 349 #define POWER7_PME_PM_MRK_DATA_FROM_L31_SHR_CYC 350 #define POWER7_PME_PM_NEST_PAIR0_ADD 351 #define POWER7_PME_PM_INST_FROM_L3MISS 352 #define POWER7_PME_PM_EE_OFF_EXT_INT 353 #define POWER7_PME_PM_INST_PTEG_FROM_DMEM 354 #define POWER7_PME_PM_INST_FROM_DL2L3_MOD 355 #define POWER7_PME_PM_PMC6_OVERFLOW 356 #define POWER7_PME_PM_VSU_2FLOP_DOUBLE 357 #define POWER7_PME_PM_TLB_MISS 358 #define POWER7_PME_PM_FXU_BUSY 359 #define POWER7_PME_PM_L2_RCLD_DISP_FAIL_OTHER 360 #define POWER7_PME_PM_LSU_REJECT_LMQ_FULL 361 #define POWER7_PME_PM_IC_RELOAD_SHR 362 #define POWER7_PME_PM_GRP_MRK 363 #define POWER7_PME_PM_MRK_ST_NEST 364 #define POWER7_PME_PM_VSU1_FSQRT_FDIV 365 #define POWER7_PME_PM_LSU0_FLUSH_LRQ 366 #define POWER7_PME_PM_LARX_LSU0 367 #define POWER7_PME_PM_IBUF_FULL_CYC 368 #define POWER7_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC 369 #define POWER7_PME_PM_LSU_DC_PREF_STREAM_ALLOC 370 #define POWER7_PME_PM_GRP_MRK_CYC 371 #define POWER7_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC 372 #define POWER7_PME_PM_L2_GLOB_GUESS_CORRECT 373 #define POWER7_PME_PM_LSU_REJECT_LHS 374 #define POWER7_PME_PM_MRK_DATA_FROM_LMEM 375 #define POWER7_PME_PM_INST_PTEG_FROM_L3 376 #define POWER7_PME_PM_FREQ_DOWN 377 #define POWER7_PME_PM_PB_RETRY_NODE_PUMP 378 #define POWER7_PME_PM_INST_FROM_RL2L3_SHR 379 #define 
POWER7_PME_PM_MRK_INST_ISSUED 380
#define POWER7_PME_PM_PTEG_FROM_L3MISS 381
#define POWER7_PME_PM_RUN_PURR 382
#define POWER7_PME_PM_MRK_GRP_IC_MISS 383
#define POWER7_PME_PM_MRK_DATA_FROM_L3 384
#define POWER7_PME_PM_CMPLU_STALL_DCACHE_MISS 385
#define POWER7_PME_PM_PTEG_FROM_RL2L3_SHR 386
#define POWER7_PME_PM_LSU_FLUSH_LRQ 387
#define POWER7_PME_PM_MRK_DERAT_MISS_64K 388
#define POWER7_PME_PM_INST_PTEG_FROM_DL2L3_MOD 389
#define POWER7_PME_PM_L2_ST_MISS 390
#define POWER7_PME_PM_MRK_PTEG_FROM_L21_SHR 391
#define POWER7_PME_PM_LWSYNC 392
#define POWER7_PME_PM_LSU0_DC_PREF_STREAM_CONFIRM_STRIDE 393
#define POWER7_PME_PM_MRK_LSU_FLUSH_LRQ 394
#define POWER7_PME_PM_INST_IMC_MATCH_CMPL 395
#define POWER7_PME_PM_NEST_PAIR3_AND 396
#define POWER7_PME_PM_PB_RETRY_SYS_PUMP 397
#define POWER7_PME_PM_MRK_INST_FIN 398
#define POWER7_PME_PM_MRK_PTEG_FROM_DL2L3_SHR 399
#define POWER7_PME_PM_INST_FROM_L31_MOD 400
#define POWER7_PME_PM_MRK_DTLB_MISS_64K 401
#define POWER7_PME_PM_LSU_FIN 402
#define POWER7_PME_PM_MRK_LSU_REJECT 403
#define POWER7_PME_PM_L2_CO_FAIL_BUSY 404
#define POWER7_PME_PM_MEM0_WQ_DISP 405
#define POWER7_PME_PM_DATA_FROM_L31_MOD 406
#define POWER7_PME_PM_THERMAL_WARN 407
#define POWER7_PME_PM_VSU0_4FLOP 408
#define POWER7_PME_PM_BR_MPRED_CCACHE 409
#define POWER7_PME_PM_CMPLU_STALL_IFU 410
#define POWER7_PME_PM_L1_DEMAND_WRITE 411
#define POWER7_PME_PM_FLUSH_BR_MPRED 412
#define POWER7_PME_PM_MRK_DTLB_MISS_16G 413
#define POWER7_PME_PM_MRK_PTEG_FROM_DMEM 414
#define POWER7_PME_PM_L2_RCST_DISP 415
#define POWER7_PME_PM_CMPLU_STALL 416
#define POWER7_PME_PM_LSU_PARTIAL_CDF 417
#define POWER7_PME_PM_DISP_CLB_HELD_SB 418
#define POWER7_PME_PM_VSU0_FMA_DOUBLE 419
#define POWER7_PME_PM_FXU0_BUSY_FXU1_IDLE 420
#define POWER7_PME_PM_IC_DEMAND_CYC 421
#define POWER7_PME_PM_MRK_DATA_FROM_L21_SHR 422
#define POWER7_PME_PM_MRK_LSU_FLUSH_UST 423
#define POWER7_PME_PM_INST_PTEG_FROM_L3MISS 424
#define POWER7_PME_PM_VSU_DENORM 425
#define POWER7_PME_PM_MRK_LSU_PARTIAL_CDF 426
#define POWER7_PME_PM_INST_FROM_L21_SHR 427
#define POWER7_PME_PM_IC_PREF_WRITE 428
#define POWER7_PME_PM_BR_PRED 429
#define POWER7_PME_PM_INST_FROM_DMEM 430
#define POWER7_PME_PM_IC_PREF_CANCEL_ALL 431
#define POWER7_PME_PM_LSU_DC_PREF_STREAM_CONFIRM 432
#define POWER7_PME_PM_MRK_LSU_FLUSH_SRQ 433
#define POWER7_PME_PM_MRK_FIN_STALL_CYC 434
#define POWER7_PME_PM_L2_RCST_DISP_FAIL_OTHER 435
#define POWER7_PME_PM_VSU1_DD_ISSUED 436
#define POWER7_PME_PM_PTEG_FROM_L31_SHR 437
#define POWER7_PME_PM_DATA_FROM_L21_SHR 438
#define POWER7_PME_PM_LSU0_NCLD 439
#define POWER7_PME_PM_VSU1_4FLOP 440
#define POWER7_PME_PM_VSU1_8FLOP 441
#define POWER7_PME_PM_VSU_8FLOP 442
#define POWER7_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 443
#define POWER7_PME_PM_DTLB_MISS_64K 444
#define POWER7_PME_PM_THRD_CONC_RUN_INST 445
#define POWER7_PME_PM_MRK_PTEG_FROM_L2 446
#define POWER7_PME_PM_PB_SYS_PUMP 447
#define POWER7_PME_PM_VSU_FIN 448
#define POWER7_PME_PM_MRK_DATA_FROM_L31_MOD 449
#define POWER7_PME_PM_THRD_PRIO_0_1_CYC 450
#define POWER7_PME_PM_DERAT_MISS_64K 451
#define POWER7_PME_PM_PMC2_REWIND 452
#define POWER7_PME_PM_INST_FROM_L2 453
#define POWER7_PME_PM_GRP_BR_MPRED_NONSPEC 454
#define POWER7_PME_PM_INST_DISP 455
#define POWER7_PME_PM_MEM0_RD_CANCEL_TOTAL 456
#define POWER7_PME_PM_LSU0_DC_PREF_STREAM_CONFIRM 457
#define POWER7_PME_PM_L1_DCACHE_RELOAD_VALID 458
#define POWER7_PME_PM_VSU_SCALAR_DOUBLE_ISSUED 459
#define POWER7_PME_PM_L3_PREF_HIT 460
#define POWER7_PME_PM_MRK_PTEG_FROM_L31_MOD 461
#define POWER7_PME_PM_CMPLU_STALL_STORE 462
#define POWER7_PME_PM_MRK_FXU_FIN 463
#define POWER7_PME_PM_PMC4_OVERFLOW 464
#define POWER7_PME_PM_MRK_PTEG_FROM_L3 465
#define POWER7_PME_PM_LSU0_LMQ_LHR_MERGE 466
#define POWER7_PME_PM_BTAC_HIT 467
#define POWER7_PME_PM_L3_RD_BUSY 468
#define POWER7_PME_PM_LSU0_L1_SW_PREF 469
#define POWER7_PME_PM_INST_FROM_L2MISS 470
#define POWER7_PME_PM_LSU0_DC_PREF_STREAM_ALLOC 471
#define POWER7_PME_PM_L2_ST 472
#define POWER7_PME_PM_VSU0_DENORM 473
#define POWER7_PME_PM_MRK_DATA_FROM_DL2L3_SHR 474
#define POWER7_PME_PM_BR_PRED_CR_TA 475
#define POWER7_PME_PM_VSU0_FCONV 476
#define POWER7_PME_PM_MRK_LSU_FLUSH_ULD 477
#define POWER7_PME_PM_BTAC_MISS 478
#define POWER7_PME_PM_MRK_LD_MISS_EXPOSED_CYC_COUNT 479
#define POWER7_PME_PM_MRK_DATA_FROM_L2 480
#define POWER7_PME_PM_LSU_DCACHE_RELOAD_VALID 481
#define POWER7_PME_PM_VSU_FMA 482
#define POWER7_PME_PM_LSU0_FLUSH_SRQ 483
#define POWER7_PME_PM_LSU1_L1_PREF 484
#define POWER7_PME_PM_IOPS_CMPL 485
#define POWER7_PME_PM_L2_SYS_PUMP 486
#define POWER7_PME_PM_L2_RCLD_BUSY_RC_FULL 487
#define POWER7_PME_PM_LSU_LMQ_S0_ALLOC 488
#define POWER7_PME_PM_FLUSH_DISP_SYNC 489
#define POWER7_PME_PM_MRK_DATA_FROM_DL2L3_MOD_CYC 490
#define POWER7_PME_PM_L2_IC_INV 491
#define POWER7_PME_PM_MRK_DATA_FROM_L21_MOD_CYC 492
#define POWER7_PME_PM_L3_PREF_LDST 493
#define POWER7_PME_PM_LSU_SRQ_EMPTY_CYC 494
#define POWER7_PME_PM_LSU_LMQ_S0_VALID 495
#define POWER7_PME_PM_FLUSH_PARTIAL 496
#define POWER7_PME_PM_VSU1_FMA_DOUBLE 497
#define POWER7_PME_PM_1PLUS_PPC_DISP 498
#define POWER7_PME_PM_DATA_FROM_L2MISS 499
#define POWER7_PME_PM_SUSPENDED 500
#define POWER7_PME_PM_VSU0_FMA 501
#define POWER7_PME_PM_CMPLU_STALL_SCALAR 502
#define POWER7_PME_PM_STCX_FAIL 503
#define POWER7_PME_PM_VSU0_FSQRT_FDIV_DOUBLE 504
#define POWER7_PME_PM_DC_PREF_DST 505
#define POWER7_PME_PM_VSU1_SCAL_SINGLE_ISSUED 506
#define POWER7_PME_PM_L3_HIT 507
#define POWER7_PME_PM_L2_GLOB_GUESS_WRONG 508
#define POWER7_PME_PM_MRK_DFU_FIN 509
#define POWER7_PME_PM_INST_FROM_L1 510
#define POWER7_PME_PM_BRU_FIN 511
#define POWER7_PME_PM_IC_DEMAND_REQ 512
#define POWER7_PME_PM_VSU1_FSQRT_FDIV_DOUBLE 513
#define POWER7_PME_PM_VSU1_FMA 514
#define POWER7_PME_PM_MRK_LD_MISS_L1 515
#define POWER7_PME_PM_VSU0_2FLOP_DOUBLE 516
#define POWER7_PME_PM_LSU_DC_PREF_STRIDED_STREAM_CONFIRM 517
#define POWER7_PME_PM_INST_PTEG_FROM_L31_SHR 518
#define POWER7_PME_PM_MRK_LSU_REJECT_ERAT_MISS 519
#define
POWER7_PME_PM_MRK_DATA_FROM_L2MISS 520
#define POWER7_PME_PM_DATA_FROM_RL2L3_SHR 521
#define POWER7_PME_PM_INST_FROM_PREF 522
#define POWER7_PME_PM_VSU1_SQ 523
#define POWER7_PME_PM_L2_LD_DISP 524
#define POWER7_PME_PM_L2_DISP_ALL 525
#define POWER7_PME_PM_THRD_GRP_CMPL_BOTH_CYC 526
#define POWER7_PME_PM_VSU_FSQRT_FDIV_DOUBLE 527
#define POWER7_PME_PM_BR_MPRED 528
#define POWER7_PME_PM_INST_PTEG_FROM_DL2L3_SHR 529
#define POWER7_PME_PM_VSU_1FLOP 530
#define POWER7_PME_PM_HV_CYC 531
#define POWER7_PME_PM_MRK_LSU_FIN 532
#define POWER7_PME_PM_MRK_DATA_FROM_RL2L3_SHR 533
#define POWER7_PME_PM_DTLB_MISS_16M 534
#define POWER7_PME_PM_LSU1_LMQ_LHR_MERGE 535
#define POWER7_PME_PM_IFU_FIN 536
#define POWER7_PME_PM_1THRD_CON_RUN_INSTR 537
#define POWER7_PME_PM_CMPLU_STALL_COUNT 538
#define POWER7_PME_PM_MEM0_PB_RD_CL 539
#define POWER7_PME_PM_THRD_1_RUN_CYC 540
#define POWER7_PME_PM_THRD_2_CONC_RUN_INSTR 541
#define POWER7_PME_PM_THRD_2_RUN_CYC 542
#define POWER7_PME_PM_THRD_3_CONC_RUN_INST 543
#define POWER7_PME_PM_THRD_3_RUN_CYC 544
#define POWER7_PME_PM_THRD_4_CONC_RUN_INST 545
#define POWER7_PME_PM_THRD_4_RUN_CYC 546

static const pme_power_entry_t power7_pe[] = {
    [ POWER7_PME_PM_IC_DEMAND_L2_BR_ALL ] = {
        .pme_name = "PM_IC_DEMAND_L2_BR_ALL",
        .pme_code = 0x4898,
        .pme_short_desc = " L2 I cache demand request due to BHT or redirect",
        .pme_long_desc = " L2 I cache demand request due to BHT or redirect",
    },
    [ POWER7_PME_PM_GCT_UTIL_7_TO_10_SLOTS ] = {
        .pme_name = "PM_GCT_UTIL_7_TO_10_SLOTS",
        .pme_code = 0x20a0,
        .pme_short_desc = "GCT Utilization 7-10 entries",
        .pme_long_desc = "GCT Utilization 7-10 entries",
    },
    [ POWER7_PME_PM_PMC2_SAVED ] = {
        .pme_name = "PM_PMC2_SAVED",
        .pme_code = 0x10022,
        .pme_short_desc = "PMC2 Rewind Value saved",
        .pme_long_desc = "PMC2 was counting speculatively. The speculative condition was met and the counter value was committed by copying it to the backup register.",
    },
    [ POWER7_PME_PM_CMPLU_STALL_DFU ] = {
        .pme_name = "PM_CMPLU_STALL_DFU",
        .pme_code = 0x2003c,
        .pme_short_desc = "Completion stall caused by Decimal Floating Point Unit",
        .pme_long_desc = "Completion stall caused by Decimal Floating Point Unit",
    },
    [ POWER7_PME_PM_VSU0_16FLOP ] = {
        .pme_name = "PM_VSU0_16FLOP",
        .pme_code = 0xa0a4,
        .pme_short_desc = "Sixteen flops operation (SP vector versions of fdiv,fsqrt)",
        .pme_long_desc = "Sixteen flops operation (SP vector versions of fdiv,fsqrt)",
    },
    [ POWER7_PME_PM_MRK_LSU_DERAT_MISS ] = {
        .pme_name = "PM_MRK_LSU_DERAT_MISS",
        .pme_code = 0x3d05a,
        .pme_short_desc = "Marked DERAT Miss",
        .pme_long_desc = "Marked DERAT Miss",
    },
    [ POWER7_PME_PM_MRK_ST_CMPL ] = {
        .pme_name = "PM_MRK_ST_CMPL",
        .pme_code = 0x10034,
        .pme_short_desc = "marked store finished (was complete)",
        .pme_long_desc = "A sampled store has completed (data home)",
    },
    [ POWER7_PME_PM_NEST_PAIR3_ADD ] = {
        .pme_name = "PM_NEST_PAIR3_ADD",
        .pme_code = 0x40881,
        .pme_short_desc = " Nest events (MC0/MC1/PB/GX), Pair3 ADD",
        .pme_long_desc = " Nest events (MC0/MC1/PB/GX), Pair3 ADD",
    },
    [ POWER7_PME_PM_L2_ST_DISP ] = {
        .pme_name = "PM_L2_ST_DISP",
        .pme_code = 0x46180,
        .pme_short_desc = "All successful store dispatches",
        .pme_long_desc = "All successful store dispatches",
    },
    [ POWER7_PME_PM_L2_CASTOUT_MOD ] = {
        .pme_name = "PM_L2_CASTOUT_MOD",
        .pme_code = 0x16180,
        .pme_short_desc = "L2 Castouts - Modified (M, Mu, Me)",
        .pme_long_desc = "An L2 line in the Modified state was castout. Total for all slices.",
    },
    [ POWER7_PME_PM_ISEG ] = {
        .pme_name = "PM_ISEG",
        .pme_code = 0x20a4,
        .pme_short_desc = "ISEG Exception",
        .pme_long_desc = "ISEG Exception",
    },
    [ POWER7_PME_PM_MRK_INST_TIMEO ] = {
        .pme_name = "PM_MRK_INST_TIMEO",
        .pme_code = 0x40034,
        .pme_short_desc = "marked Instruction finish timeout",
        .pme_long_desc = "The number of instructions finished since the last progress indicator from a marked instruction exceeded the threshold. The marked instruction was flushed.",
    },
    [ POWER7_PME_PM_L2_RCST_DISP_FAIL_ADDR ] = {
        .pme_name = "PM_L2_RCST_DISP_FAIL_ADDR",
        .pme_code = 0x36282,
        .pme_short_desc = " L2 RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ",
        .pme_long_desc = " L2 RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ",
    },
    [ POWER7_PME_PM_LSU1_DC_PREF_STREAM_CONFIRM ] = {
        .pme_name = "PM_LSU1_DC_PREF_STREAM_CONFIRM",
        .pme_code = 0xd0b6,
        .pme_short_desc = "LS1 Dcache prefetch stream confirmed",
        .pme_long_desc = "LS1 Dcache prefetch stream confirmed",
    },
    [ POWER7_PME_PM_IERAT_WR_64K ] = {
        .pme_name = "PM_IERAT_WR_64K",
        .pme_code = 0x40be,
        .pme_short_desc = "large page 64k",
        .pme_long_desc = "large page 64k",
    },
    [ POWER7_PME_PM_MRK_DTLB_MISS_16M ] = {
        .pme_name = "PM_MRK_DTLB_MISS_16M",
        .pme_code = 0x4d05e,
        .pme_short_desc = "Marked Data TLB misses for 16M page",
        .pme_long_desc = "Data TLB references to 16M pages by a marked instruction that missed the TLB. Page size is determined at TLB reload time.",
    },
    [ POWER7_PME_PM_IERAT_MISS ] = {
        .pme_name = "PM_IERAT_MISS",
        .pme_code = 0x100f6,
        .pme_short_desc = "IERAT Miss (Not implemented as DI on POWER6)",
        .pme_long_desc = "A translation request missed the Instruction Effective to Real Address Translation (ERAT) table",
    },
    [ POWER7_PME_PM_MRK_PTEG_FROM_LMEM ] = {
        .pme_name = "PM_MRK_PTEG_FROM_LMEM",
        .pme_code = 0x4d052,
        .pme_short_desc = "Marked PTEG loaded from local memory",
        .pme_long_desc = "A Page Table Entry was loaded into the ERAT from memory attached to the same module this processor is located on due to a marked load or store.",
    },
    [ POWER7_PME_PM_FLOP ] = {
        .pme_name = "PM_FLOP",
        .pme_code = 0x100f4,
        .pme_short_desc = "Floating Point Operation Finished",
        .pme_long_desc = "A floating point operation has completed",
    },
    [ POWER7_PME_PM_THRD_PRIO_4_5_CYC ] = {
        .pme_name = "PM_THRD_PRIO_4_5_CYC",
        .pme_code = 0x40b4,
        .pme_short_desc = " Cycles thread running at priority level 4 or 5",
        .pme_long_desc = " Cycles thread running at priority level 4 or 5",
    },
    [ POWER7_PME_PM_BR_PRED_TA ] = {
        .pme_name = "PM_BR_PRED_TA",
        .pme_code = 0x40aa,
        .pme_short_desc = "Branch predict - target address",
        .pme_long_desc = "The target address of a branch instruction was predicted.",
    },
    [ POWER7_PME_PM_CMPLU_STALL_FXU ] = {
        .pme_name = "PM_CMPLU_STALL_FXU",
        .pme_code = 0x20014,
        .pme_short_desc = "Completion stall caused by FXU instruction",
        .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a fixed point instruction.",
    },
    [ POWER7_PME_PM_EXT_INT ] = {
        .pme_name = "PM_EXT_INT",
        .pme_code = 0x200f8,
        .pme_short_desc = "external interrupt",
        .pme_long_desc = "An interrupt due to an external exception occurred",
    },
    [ POWER7_PME_PM_VSU_FSQRT_FDIV ] = {
        .pme_name = "PM_VSU_FSQRT_FDIV",
        .pme_code = 0xa888,
        .pme_short_desc = "four flops operation (fdiv,fsqrt) Scalar Instructions only!",
        .pme_long_desc = "DP vector versions of fdiv,fsqrt",
    },
    [ POWER7_PME_PM_MRK_LD_MISS_EXPOSED_CYC ] = {
        .pme_name = "PM_MRK_LD_MISS_EXPOSED_CYC",
        .pme_code = 0x1003e,
        .pme_short_desc = "Marked Load exposed Miss",
        .pme_long_desc = "Marked Load exposed Miss",
    },
    [ POWER7_PME_PM_LSU1_LDF ] = {
        .pme_name = "PM_LSU1_LDF",
        .pme_code = 0xc086,
        .pme_short_desc = "LS1 Scalar Loads",
        .pme_long_desc = "A floating point load was executed by LSU1",
    },
    [ POWER7_PME_PM_IC_WRITE_ALL ] = {
        .pme_name = "PM_IC_WRITE_ALL",
        .pme_code = 0x488c,
        .pme_short_desc = "Icache sectors written, prefetch + demand",
        .pme_long_desc = "Icache sectors written, prefetch + demand",
    },
    [ POWER7_PME_PM_LSU0_SRQ_STFWD ] = {
        .pme_name = "PM_LSU0_SRQ_STFWD",
        .pme_code = 0xc0a0,
        .pme_short_desc = "LS0 SRQ forwarded data to a load",
        .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 0. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load that hits L1 but becomes a store forward, then it's not treated as a load miss.",
    },
    [ POWER7_PME_PM_PTEG_FROM_RL2L3_MOD ] = {
        .pme_name = "PM_PTEG_FROM_RL2L3_MOD",
        .pme_code = 0x1c052,
        .pme_short_desc = "PTEG loaded from remote L2 or L3 modified",
        .pme_long_desc = "A Page Table Entry was loaded into the ERAT with modified (M) data from an L2 or L3 on a remote module due to a demand load or store.",
    },
    [ POWER7_PME_PM_MRK_DATA_FROM_L31_SHR ] = {
        .pme_name = "PM_MRK_DATA_FROM_L31_SHR",
        .pme_code = 0x1d04e,
        .pme_short_desc = "Marked data loaded from another L3 on same chip shared",
        .pme_long_desc = "Marked data loaded from another L3 on same chip shared",
    },
    [ POWER7_PME_PM_DATA_FROM_L21_MOD ] = {
        .pme_name = "PM_DATA_FROM_L21_MOD",
        .pme_code = 0x3c046,
        .pme_short_desc = "Data loaded from another L2 on same chip modified",
        .pme_long_desc = "Data loaded from another L2 on same chip modified",
    },
    [ POWER7_PME_PM_VSU1_SCAL_DOUBLE_ISSUED ] = {
        .pme_name = "PM_VSU1_SCAL_DOUBLE_ISSUED",
        .pme_code = 0xb08a,
        .pme_short_desc = "Double Precision scalar instruction issued on Pipe1",
        .pme_long_desc = "Double Precision scalar instruction issued on Pipe1",
    },
    [ POWER7_PME_PM_VSU0_8FLOP ] = {
        .pme_name = "PM_VSU0_8FLOP",
        .pme_code = 0xa0a0,
        .pme_short_desc = "eight flops operation (DP vector versions of fdiv,fsqrt and SP vector versions of fmadd,fnmadd,fmsub,fnmsub)",
        .pme_long_desc = "eight flops operation (DP vector versions of fdiv,fsqrt and SP vector versions of fmadd,fnmadd,fmsub,fnmsub)",
    },
    [ POWER7_PME_PM_POWER_EVENT1 ] = {
        .pme_name = "PM_POWER_EVENT1",
        .pme_code = 0x1006e,
        .pme_short_desc = "Power Management Event 1",
        .pme_long_desc = "Power Management Event 1",
    },
    [ POWER7_PME_PM_DISP_CLB_HELD_BAL ] = {
        .pme_name = "PM_DISP_CLB_HELD_BAL",
        .pme_code = 0x2092,
        .pme_short_desc = "Dispatch/CLB Hold: Balance",
        .pme_long_desc = "Dispatch/CLB Hold: Balance",
    },
    [ POWER7_PME_PM_VSU1_2FLOP ] = {
        .pme_name = "PM_VSU1_2FLOP",
        .pme_code = 0xa09a,
        .pme_short_desc = "two flops operation (scalar fmadd, fnmadd, fmsub, fnmsub and DP vector versions of single flop instructions)",
        .pme_long_desc = "two flops operation (scalar fmadd, fnmadd, fmsub, fnmsub and DP vector versions of single flop instructions)",
    },
    [ POWER7_PME_PM_LWSYNC_HELD ] = {
        .pme_name = "PM_LWSYNC_HELD",
        .pme_code = 0x209a,
        .pme_short_desc = "LWSYNC held at dispatch",
        .pme_long_desc = "Cycles a LWSYNC instruction was held at dispatch. LWSYNC instructions are held at dispatch until all previous loads are done and all previous stores have issued. LWSYNC enters the Store Request Queue and is sent to the storage subsystem but does not wait for a response.",
    },
    [ POWER7_PME_PM_PTEG_FROM_DL2L3_SHR ] = {
        .pme_name = "PM_PTEG_FROM_DL2L3_SHR",
        .pme_code = 0x3c054,
        .pme_short_desc = "PTEG loaded from remote L2 or L3 shared",
        .pme_long_desc = "A Page Table Entry was loaded into the ERAT with shared (T or SL) data from an L2 or L3 on a remote module due to a demand load or store.",
    },
    [ POWER7_PME_PM_INST_FROM_L21_MOD ] = {
        .pme_name = "PM_INST_FROM_L21_MOD",
        .pme_code = 0x34046,
        .pme_short_desc = "Instruction fetched from another L2 on same chip modified",
        .pme_long_desc = "Instruction fetched from another L2 on same chip modified",
    },
    [ POWER7_PME_PM_IERAT_XLATE_WR_16MPLUS ] = {
        .pme_name = "PM_IERAT_XLATE_WR_16MPLUS",
        .pme_code = 0x40bc,
        .pme_short_desc = "large page 16M+",
        .pme_long_desc = "large page 16M+",
    },
    [ POWER7_PME_PM_IC_REQ_ALL ] = {
        .pme_name = "PM_IC_REQ_ALL",
        .pme_code = 0x4888,
        .pme_short_desc = "Icache requests, prefetch + demand",
        .pme_long_desc = "Icache requests, prefetch + demand",
    },
    [ POWER7_PME_PM_DSLB_MISS ] = {
        .pme_name = "PM_DSLB_MISS",
        .pme_code = 0xd090,
        .pme_short_desc = "Data SLB Miss - Total of all segment sizes",
        .pme_long_desc = "A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve.",
    },
    [ POWER7_PME_PM_L3_MISS ] = {
        .pme_name = "PM_L3_MISS",
        .pme_code = 0x1f082,
        .pme_short_desc = "L3 Misses",
        .pme_long_desc = "L3 Misses",
    },
    [ POWER7_PME_PM_LSU0_L1_PREF ] = {
        .pme_name = "PM_LSU0_L1_PREF",
        .pme_code = 0xd0b8,
        .pme_short_desc = " LS0 L1 cache data prefetches",
        .pme_long_desc = " LS0 L1 cache data prefetches",
    },
    [ POWER7_PME_PM_VSU_SCALAR_SINGLE_ISSUED ] = {
        .pme_name = "PM_VSU_SCALAR_SINGLE_ISSUED",
        .pme_code = 0xb884,
        .pme_short_desc = "Single Precision scalar instruction issued on Pipe0",
        .pme_long_desc = "Single Precision scalar instruction issued on Pipe0",
    },
    [ POWER7_PME_PM_LSU1_DC_PREF_STREAM_CONFIRM_STRIDE ] = {
        .pme_name = "PM_LSU1_DC_PREF_STREAM_CONFIRM_STRIDE",
        .pme_code = 0xd0be,
        .pme_short_desc = "LS1 Dcache Strided prefetch stream confirmed",
        .pme_long_desc = "LS1 Dcache Strided prefetch stream confirmed",
    },
    [ POWER7_PME_PM_L2_INST ] = {
        .pme_name = "PM_L2_INST",
        .pme_code = 0x36080,
        .pme_short_desc = "Instruction Load Count",
        .pme_long_desc = "Instruction Load Count",
    },
    [ POWER7_PME_PM_VSU0_FRSP ] = {
        .pme_name = "PM_VSU0_FRSP",
        .pme_code = 0xa0b4,
        .pme_short_desc = "Round to single precision instruction executed",
        .pme_long_desc = "Round to single precision instruction executed",
    },
    [ POWER7_PME_PM_FLUSH_DISP ] = {
        .pme_name = "PM_FLUSH_DISP",
        .pme_code = 0x2082,
        .pme_short_desc = "Dispatch flush",
        .pme_long_desc = "Dispatch flush",
    },
    [ POWER7_PME_PM_PTEG_FROM_L2MISS ] = {
        .pme_name = "PM_PTEG_FROM_L2MISS",
        .pme_code = 0x4c058,
        .pme_short_desc = "PTEG loaded from L2 miss",
        .pme_long_desc = "A Page Table Entry was loaded into the TLB but not from the local L2.",
    },
    [ POWER7_PME_PM_VSU1_DQ_ISSUED ] = {
        .pme_name = "PM_VSU1_DQ_ISSUED",
        .pme_code = 0xb09a,
        .pme_short_desc = "128BIT Decimal Issued on Pipe1",
        .pme_long_desc = "128BIT Decimal Issued on Pipe1",
    },
    [ POWER7_PME_PM_CMPLU_STALL_LSU ] = {
        .pme_name = "PM_CMPLU_STALL_LSU",
        .pme_code = 0x20012,
        .pme_short_desc = "Completion stall caused by LSU instruction",
        .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a load/store instruction.",
    },
    [ POWER7_PME_PM_MRK_DATA_FROM_DMEM ] = {
        .pme_name = "PM_MRK_DATA_FROM_DMEM",
        .pme_code = 0x1d04a,
        .pme_short_desc = "Marked data loaded from distant memory",
        .pme_long_desc = "The processor's Data Cache was reloaded with data from memory attached to a distant module due to a marked load.",
    },
    [ POWER7_PME_PM_LSU_FLUSH_ULD ] = {
        .pme_name = "PM_LSU_FLUSH_ULD",
        .pme_code = 0xc8b0,
        .pme_short_desc = "Flush: Unaligned Load",
        .pme_long_desc = "A load was flushed because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1). Combined Unit 0 + 1.",
    },
    [ POWER7_PME_PM_PTEG_FROM_LMEM ] = {
        .pme_name = "PM_PTEG_FROM_LMEM",
        .pme_code = 0x4c052,
        .pme_short_desc = "PTEG loaded from local memory",
        .pme_long_desc = "A Page Table Entry was loaded into the TLB from memory attached to the same module this processor is located on.",
    },
    [ POWER7_PME_PM_MRK_DERAT_MISS_16M ] = {
        .pme_name = "PM_MRK_DERAT_MISS_16M",
        .pme_code = 0x3d05c,
        .pme_short_desc = "Marked DERAT misses for 16M page",
        .pme_long_desc = "A marked data request (load or store) missed the ERAT for 16M page and resulted in an ERAT reload.",
    },
    [ POWER7_PME_PM_THRD_ALL_RUN_CYC ] = {
        .pme_name = "PM_THRD_ALL_RUN_CYC",
        .pme_code = 0x2000c,
        .pme_short_desc = "All Threads in run_cycles",
        .pme_long_desc = "Cycles when all threads had their run latches set. Operating systems use the run latch to indicate when they are doing useful work.",
    },
    [ POWER7_PME_PM_MEM0_PREFETCH_DISP ] = {
        .pme_name = "PM_MEM0_PREFETCH_DISP",
        .pme_code = 0x20083,
        .pme_short_desc = " Nest events (MC0/MC1/PB/GX), Pair1 Bit1",
        .pme_long_desc = " Nest events (MC0/MC1/PB/GX), Pair1 Bit1",
    },
    [ POWER7_PME_PM_MRK_STALL_CMPLU_CYC_COUNT ] = {
        .pme_name = "PM_MRK_STALL_CMPLU_CYC_COUNT",
        .pme_code = 0x3003f,
        .pme_short_desc = "Marked Group Completion Stall cycles (use edge detect to count #)",
        .pme_long_desc = "Marked Group Completion Stall cycles (use edge detect to count #)",
    },
    [ POWER7_PME_PM_DATA_FROM_DL2L3_MOD ] = {
        .pme_name = "PM_DATA_FROM_DL2L3_MOD",
        .pme_code = 0x3c04c,
        .pme_short_desc = "Data loaded from distant L2 or L3 modified",
        .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from an L2 or L3 on a distant module due to a demand load",
    },
    [ POWER7_PME_PM_VSU_FRSP ] = {
        .pme_name = "PM_VSU_FRSP",
        .pme_code = 0xa8b4,
        .pme_short_desc = "Round to single precision instruction executed",
        .pme_long_desc = "Round to single precision instruction executed",
    },
    [ POWER7_PME_PM_MRK_DATA_FROM_L21_MOD ] = {
        .pme_name = "PM_MRK_DATA_FROM_L21_MOD",
        .pme_code = 0x3d046,
        .pme_short_desc = "Marked data loaded from another L2 on same chip modified",
        .pme_long_desc = "Marked data loaded from another L2 on same chip modified",
    },
    [ POWER7_PME_PM_PMC1_OVERFLOW ] = {
        .pme_name = "PM_PMC1_OVERFLOW",
        .pme_code = 0x20010,
        .pme_short_desc = "Overflow from counter 1",
        .pme_long_desc = "Overflows from PMC1 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.",
    },
    [ POWER7_PME_PM_VSU0_SINGLE ] = {
        .pme_name = "PM_VSU0_SINGLE",
        .pme_code = 0xa0a8,
        .pme_short_desc = "FPU single precision",
        .pme_long_desc = "VSU0 executed single precision instruction",
    },
    [ POWER7_PME_PM_MRK_PTEG_FROM_L3MISS ] = {
        .pme_name = "PM_MRK_PTEG_FROM_L3MISS",
        .pme_code = 0x2d058,
        .pme_short_desc = "Marked PTEG loaded from L3 miss",
        .pme_long_desc = "A Page Table Entry was loaded into the ERAT from beyond the L3 due to a marked load or store",
    },
    [ POWER7_PME_PM_MRK_PTEG_FROM_L31_SHR ] = {
        .pme_name = "PM_MRK_PTEG_FROM_L31_SHR",
        .pme_code = 0x2d056,
        .pme_short_desc = "Marked PTEG loaded from another L3 on same chip shared",
        .pme_long_desc = "Marked PTEG loaded from another L3 on same chip shared",
    },
    [ POWER7_PME_PM_VSU0_VECTOR_SP_ISSUED ] = {
        .pme_name = "PM_VSU0_VECTOR_SP_ISSUED",
        .pme_code = 0xb090,
        .pme_short_desc = "Single Precision vector instruction issued (executed)",
        .pme_long_desc = "Single Precision vector instruction issued (executed)",
    },
    [ POWER7_PME_PM_VSU1_FEST ] = {
        .pme_name = "PM_VSU1_FEST",
        .pme_code = 0xa0ba,
        .pme_short_desc = "Estimate instruction executed",
        .pme_long_desc = "Estimate instruction executed",
    },
    [ POWER7_PME_PM_MRK_INST_DISP ] = {
        .pme_name = "PM_MRK_INST_DISP",
        .pme_code = 0x20030,
        .pme_short_desc = "marked instruction dispatch",
        .pme_long_desc = "A marked instruction was dispatched",
    },
    [ POWER7_PME_PM_VSU0_COMPLEX_ISSUED ] = {
        .pme_name = "PM_VSU0_COMPLEX_ISSUED",
        .pme_code = 0xb096,
        .pme_short_desc = "Complex VMX instruction issued",
        .pme_long_desc = "Complex VMX instruction issued",
    },
    [ POWER7_PME_PM_LSU1_FLUSH_UST ] = {
        .pme_name = "PM_LSU1_FLUSH_UST",
        .pme_code = 0xc0b6,
        .pme_short_desc = "LS1 Flush: Unaligned Store",
        .pme_long_desc = "A store was flushed from unit 1 because it was unaligned (crossed a 4K boundary)",
    },
    [ POWER7_PME_PM_INST_CMPL ] = {
        .pme_name = "PM_INST_CMPL",
        .pme_code = 0x2,
        .pme_short_desc = "# PPC Instructions Finished",
        .pme_long_desc = "Number of PowerPC Instructions that completed.",
    },
    [ POWER7_PME_PM_FXU_IDLE ] = {
        .pme_name = "PM_FXU_IDLE",
        .pme_code = 0x1000e,
        .pme_short_desc = "fxu0 idle and fxu1 idle",
        .pme_long_desc = "FXU0 and FXU1 are both idle.",
    },
    [ POWER7_PME_PM_LSU0_FLUSH_ULD ] = {
        .pme_name = "PM_LSU0_FLUSH_ULD",
        .pme_code = 0xc0b0,
        .pme_short_desc = "LS0 Flush: Unaligned Load",
        .pme_long_desc = "A load was flushed from unit 0 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1)",
    },
    [ POWER7_PME_PM_MRK_DATA_FROM_DL2L3_MOD ] = {
        .pme_name = "PM_MRK_DATA_FROM_DL2L3_MOD",
        .pme_code = 0x3d04c,
        .pme_short_desc = "Marked data loaded from distant L2 or L3 modified",
        .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from an L2 or L3 on a distant module due to a marked load.",
    },
    [ POWER7_PME_PM_LSU_LMQ_SRQ_EMPTY_ALL_CYC ] = {
        .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_ALL_CYC",
        .pme_code = 0x3001c,
        .pme_short_desc = "ALL threads lsu empty (lmq and srq empty)",
        .pme_long_desc = "ALL threads lsu empty (lmq and srq empty)",
    },
    [ POWER7_PME_PM_LSU1_REJECT_LMQ_FULL ] = {
        .pme_name = "PM_LSU1_REJECT_LMQ_FULL",
        .pme_code = 0xc0a6,
        .pme_short_desc = "LS1 Reject: LMQ Full (LHR)",
        .pme_long_desc = "Total cycles the Load Store Unit 1 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all eight entries are full, subsequent load instructions are rejected.",
    },
    [ POWER7_PME_PM_INST_PTEG_FROM_L21_MOD ] = {
        .pme_name = "PM_INST_PTEG_FROM_L21_MOD",
        .pme_code = 0x3e056,
        .pme_short_desc = "Instruction PTEG loaded from another L2 on same chip modified",
        .pme_long_desc = "Instruction PTEG loaded from another L2 on same chip modified",
    },
    [ POWER7_PME_PM_INST_FROM_RL2L3_MOD ] = {
        .pme_name = "PM_INST_FROM_RL2L3_MOD",
        .pme_code = 0x14042,
        .pme_short_desc = "Instruction fetched from remote L2 or L3 modified",
        .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from an L2 or L3 on a remote module. Fetch groups can contain up to 8 instructions",
    },
    [ POWER7_PME_PM_SHL_CREATED ] = {
        .pme_name = "PM_SHL_CREATED",
        .pme_code = 0x5082,
        .pme_short_desc = "SHL table entry Created",
        .pme_long_desc = "SHL table entry Created",
    },
    [ POWER7_PME_PM_L2_ST_HIT ] = {
        .pme_name = "PM_L2_ST_HIT",
        .pme_code = 0x46182,
        .pme_short_desc = "All successful store dispatches that were L2 hits",
        .pme_long_desc = "A store request hit in the L2 directory. This event includes all requests to this L2 from all sources. Total for all slices.",
    },
    [ POWER7_PME_PM_DATA_FROM_DMEM ] = {
        .pme_name = "PM_DATA_FROM_DMEM",
        .pme_code = 0x1c04a,
        .pme_short_desc = "Data loaded from distant memory",
        .pme_long_desc = "The processor's Data Cache was reloaded with data from memory attached to a distant module due to a demand load",
    },
    [ POWER7_PME_PM_L3_LD_MISS ] = {
        .pme_name = "PM_L3_LD_MISS",
        .pme_code = 0x2f082,
        .pme_short_desc = "L3 demand LD Miss",
        .pme_long_desc = "L3 demand LD Miss",
    },
    [ POWER7_PME_PM_FXU1_BUSY_FXU0_IDLE ] = {
        .pme_name = "PM_FXU1_BUSY_FXU0_IDLE",
        .pme_code = 0x4000e,
        .pme_short_desc = "fxu0 idle and fxu1 busy.",
        .pme_long_desc = "FXU0 was idle while FXU1 was busy",
    },
    [ POWER7_PME_PM_DISP_CLB_HELD_RES ] = {
        .pme_name = "PM_DISP_CLB_HELD_RES",
        .pme_code = 0x2094,
        .pme_short_desc = "Dispatch/CLB Hold: Resource",
        .pme_long_desc = "Dispatch/CLB Hold: Resource",
    },
    [ POWER7_PME_PM_L2_SN_SX_I_DONE ] = {
        .pme_name = "PM_L2_SN_SX_I_DONE",
        .pme_code = 0x36382,
        .pme_short_desc = "SNP dispatched and went from Sx or Tx to Ix",
        .pme_long_desc = "SNP dispatched and went from Sx or Tx to Ix",
    },
    [ POWER7_PME_PM_GRP_CMPL ] = {
        .pme_name = "PM_GRP_CMPL",
        .pme_code = 0x30004,
        .pme_short_desc = "group completed",
        .pme_long_desc = "A group completed. Microcoded instructions that span multiple groups will generate this event once per group.",
    },
    [ POWER7_PME_PM_STCX_CMPL ] = {
        .pme_name = "PM_STCX_CMPL",
        .pme_code = 0xc098,
        .pme_short_desc = "STCX executed",
        .pme_long_desc = "Conditional stores with reservation completed",
    },
    [ POWER7_PME_PM_VSU0_2FLOP ] = {
        .pme_name = "PM_VSU0_2FLOP",
        .pme_code = 0xa098,
        .pme_short_desc = "two flops operation (scalar fmadd, fnmadd, fmsub, fnmsub and DP vector versions of single flop instructions)",
        .pme_long_desc = "two flops operation (scalar fmadd, fnmadd, fmsub, fnmsub and DP vector versions of single flop instructions)",
    },
    [ POWER7_PME_PM_L3_PREF_MISS ] = {
        .pme_name = "PM_L3_PREF_MISS",
        .pme_code = 0x3f082,
        .pme_short_desc = "L3 Prefetch Directory Miss",
        .pme_long_desc = "L3 Prefetch Directory Miss",
    },
    [ POWER7_PME_PM_LSU_SRQ_SYNC_CYC ] = {
        .pme_name = "PM_LSU_SRQ_SYNC_CYC",
        .pme_code = 0xd096,
        .pme_short_desc = "A sync is in the SRQ",
        .pme_long_desc = "Cycles that a sync instruction is active in the Store Request Queue.",
    },
    [ POWER7_PME_PM_LSU_REJECT_ERAT_MISS ] = {
        .pme_name = "PM_LSU_REJECT_ERAT_MISS",
        .pme_code = 0x20064,
        .pme_short_desc = "LSU Reject due to ERAT (up to 2 per cycle)",
        .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions due to an ERAT miss. Combined unit 0 + 1. Requests that miss the DERAT are rejected and retried until the request hits in the ERAT.",
    },
    [ POWER7_PME_PM_L1_ICACHE_MISS ] = {
        .pme_name = "PM_L1_ICACHE_MISS",
        .pme_code = 0x200fc,
        .pme_short_desc = "Demand iCache Miss",
        .pme_long_desc = "An instruction fetch request missed the L1 cache.",
    },
    [ POWER7_PME_PM_LSU1_FLUSH_SRQ ] = {
        .pme_name = "PM_LSU1_FLUSH_SRQ",
        .pme_code = 0xc0be,
        .pme_short_desc = "LS1 Flush: SRQ",
        .pme_long_desc = "Load Hit Store flush. A younger load was flushed from unit 1 because it hits (overlaps) an older store that is already in the SRQ or in the same group. If the real addresses match but the effective addresses do not, an alias condition exists that prevents store forwarding. If the load and store are in the same group the load must be flushed to separate the two instructions.",
    },
    [ POWER7_PME_PM_LD_REF_L1_LSU0 ] = {
        .pme_name = "PM_LD_REF_L1_LSU0",
        .pme_code = 0xc080,
        .pme_short_desc = "LS0 L1 D cache load references counted at finish",
        .pme_long_desc = "Load references to Level 1 Data Cache, by unit 0.",
    },
    [ POWER7_PME_PM_VSU0_FEST ] = {
        .pme_name = "PM_VSU0_FEST",
        .pme_code = 0xa0b8,
        .pme_short_desc = "Estimate instruction executed",
        .pme_long_desc = "Estimate instruction executed",
    },
    [ POWER7_PME_PM_VSU_VECTOR_SINGLE_ISSUED ] = {
        .pme_name = "PM_VSU_VECTOR_SINGLE_ISSUED",
        .pme_code = 0xb890,
        .pme_short_desc = "Single Precision vector instruction issued (executed)",
        .pme_long_desc = "Single Precision vector instruction issued (executed)",
    },
    [ POWER7_PME_PM_FREQ_UP ] = {
        .pme_name = "PM_FREQ_UP",
        .pme_code = 0x4000c,
        .pme_short_desc = "Power Management: Above Threshold A",
        .pme_long_desc = "Processor frequency was sped up due to power management",
    },
    [ POWER7_PME_PM_DATA_FROM_LMEM ] = {
        .pme_name = "PM_DATA_FROM_LMEM",
        .pme_code = 0x3c04a,
        .pme_short_desc = "Data loaded from local memory",
        .pme_long_desc = "The processor's Data Cache was reloaded from memory attached to the same module this processor is located on.",
    },
    [ POWER7_PME_PM_LSU1_LDX ] = {
        .pme_name = "PM_LSU1_LDX",
        .pme_code = 0xc08a,
        .pme_short_desc = "LS1 Vector Loads",
        .pme_long_desc = "LS1 Vector Loads",
    },
    [ POWER7_PME_PM_PMC3_OVERFLOW ] = {
        .pme_name = "PM_PMC3_OVERFLOW",
        .pme_code = 0x40010,
        .pme_short_desc = "Overflow from counter 3",
        .pme_long_desc = "Overflows from PMC3 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.",
    },
    [ POWER7_PME_PM_MRK_BR_MPRED ] = {
        .pme_name = "PM_MRK_BR_MPRED",
        .pme_code = 0x30036,
        .pme_short_desc = "Marked Branch Mispredicted",
        .pme_long_desc = "A marked branch was mispredicted",
    },
    [ POWER7_PME_PM_SHL_MATCH ] = {
        .pme_name = "PM_SHL_MATCH",
        .pme_code = 0x5086,
        .pme_short_desc = "SHL Table Match",
        .pme_long_desc = "SHL Table Match",
    },
    [ POWER7_PME_PM_MRK_BR_TAKEN ] = {
        .pme_name = "PM_MRK_BR_TAKEN",
        .pme_code = 0x10036,
        .pme_short_desc = "Marked Branch Taken",
        .pme_long_desc = "A marked branch was taken",
    },
    [ POWER7_PME_PM_CMPLU_STALL_BRU ] = {
        .pme_name = "PM_CMPLU_STALL_BRU",
        .pme_code = 0x4004e,
        .pme_short_desc = "Completion stall due to BRU",
        .pme_long_desc = "Completion stall due to BRU",
    },
    [ POWER7_PME_PM_ISLB_MISS ] = {
        .pme_name = "PM_ISLB_MISS",
        .pme_code = 0xd092,
        .pme_short_desc = "Instruction SLB Miss - Total of all segment sizes",
        .pme_long_desc = "A SLB miss for an instruction fetch has occurred",
    },
    [ POWER7_PME_PM_CYC ] = {
        .pme_name = "PM_CYC",
        .pme_code = 0x1e,
        .pme_short_desc = "Cycles",
        .pme_long_desc = "Processor Cycles",
    },
    [ POWER7_PME_PM_DISP_HELD_THERMAL ] = {
        .pme_name = "PM_DISP_HELD_THERMAL",
        .pme_code = 0x30006,
        .pme_short_desc = "Dispatch Held due to Thermal",
        .pme_long_desc = "Dispatch Held due to Thermal",
    },
    [ POWER7_PME_PM_INST_PTEG_FROM_RL2L3_SHR ] = {
        .pme_name = "PM_INST_PTEG_FROM_RL2L3_SHR",
        .pme_code = 0x2e054,
        .pme_short_desc = "Instruction PTEG loaded from remote L2 or L3 shared",
        .pme_long_desc = "Instruction PTEG loaded from remote L2 or L3 shared",
    },
    [ POWER7_PME_PM_LSU1_SRQ_STFWD ] = {
        .pme_name = "PM_LSU1_SRQ_STFWD",
        .pme_code = 0xc0a2,
        .pme_short_desc = "LS1 SRQ forwarded data to a load",
        .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 1. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load that hits L1 but becomes a store forward, then it's not treated as a load miss.",
    },
    [ POWER7_PME_PM_GCT_NOSLOT_BR_MPRED ] = {
        .pme_name = "PM_GCT_NOSLOT_BR_MPRED",
        .pme_code = 0x4001a,
        .pme_short_desc = "GCT empty by branch mispredict",
        .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread because of a branch misprediction.",
    },
    [ POWER7_PME_PM_1PLUS_PPC_CMPL ] = {
        .pme_name = "PM_1PLUS_PPC_CMPL",
        .pme_code = 0x100f2,
        .pme_short_desc = "1 or more ppc insts finished",
        .pme_long_desc = "A group containing at least one PPC instruction completed. For microcoded instructions that span multiple groups, this will only occur once.",
    },
    [ POWER7_PME_PM_PTEG_FROM_DMEM ] = {
        .pme_name = "PM_PTEG_FROM_DMEM",
        .pme_code = 0x2c052,
        .pme_short_desc = "PTEG loaded from distant memory",
        .pme_long_desc = "A Page Table Entry was loaded into the ERAT with data from memory attached to a distant module due to a demand load or store.",
    },
    [ POWER7_PME_PM_VSU_2FLOP ] = {
        .pme_name = "PM_VSU_2FLOP",
        .pme_code = 0xa898,
        .pme_short_desc = "two flops operation (scalar fmadd, fnmadd, fmsub, fnmsub and DP vector versions of single flop instructions)",
        .pme_long_desc = "two flops operation (scalar fmadd, fnmadd, fmsub, fnmsub and DP vector versions of single flop instructions)",
    },
    [ POWER7_PME_PM_GCT_FULL_CYC ] = {
        .pme_name = "PM_GCT_FULL_CYC",
        .pme_code = 0x4086,
        .pme_short_desc = "Cycles No room in EAT",
        .pme_long_desc = "The Global Completion Table is completely full.",
    },
    [ POWER7_PME_PM_MRK_DATA_FROM_L3_CYC ] = {
        .pme_name = "PM_MRK_DATA_FROM_L3_CYC",
        .pme_code = 0x40020,
        .pme_short_desc = "Marked ld latency Data source 0001 (L3)",
        .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.",
    },
    [ POWER7_PME_PM_LSU_SRQ_S0_ALLOC ] = {
        .pme_name = "PM_LSU_SRQ_S0_ALLOC",
        .pme_code = 0xd09d,
        .pme_short_desc = "Slot 0 of SRQ valid",
        .pme_long_desc = "Slot 0 of SRQ valid",
    },
    [ POWER7_PME_PM_MRK_DERAT_MISS_4K ] = {
        .pme_name = "PM_MRK_DERAT_MISS_4K",
        .pme_code = 0x1d05c,
        .pme_short_desc = "Marked DERAT misses for 4K page",
        .pme_long_desc = "A marked data request (load or store) missed the ERAT for 4K page and resulted in an ERAT reload.",
    },
    [ POWER7_PME_PM_BR_MPRED_TA ] = {
        .pme_name = "PM_BR_MPRED_TA",
        .pme_code = 0x40ae,
        .pme_short_desc = "Branch mispredict - target address",
        .pme_long_desc = "A branch instruction target was incorrectly predicted. This will result in a branch mispredict flush unless a flush is detected from an older instruction.",
    },
    [ POWER7_PME_PM_INST_PTEG_FROM_L2MISS ] = {
        .pme_name = "PM_INST_PTEG_FROM_L2MISS",
        .pme_code = 0x4e058,
        .pme_short_desc = "Instruction PTEG loaded from L2 miss",
        .pme_long_desc = "Instruction PTEG loaded from L2 miss",
    },
    [ POWER7_PME_PM_DPU_HELD_POWER ] = {
        .pme_name = "PM_DPU_HELD_POWER",
        .pme_code = 0x20006,
        .pme_short_desc = "Dispatch Held due to Power Management",
        .pme_long_desc = "Cycles that Instruction Dispatch was held due to power management. More than one hold condition can exist at the same time",
    },
    [ POWER7_PME_PM_RUN_INST_CMPL ] = {
        .pme_name = "PM_RUN_INST_CMPL",
        .pme_code = 0x400fa,
        .pme_short_desc = "Run_Instructions",
        .pme_long_desc = "Number of run instructions completed.",
    },
    [ POWER7_PME_PM_MRK_VSU_FIN ] = {
        .pme_name = "PM_MRK_VSU_FIN",
        .pme_code = 0x30032,
        .pme_short_desc = "vsu (fpu) marked instr finish",
        .pme_long_desc = "vsu (fpu) marked instr finish",
    },
    [ POWER7_PME_PM_LSU_SRQ_S0_VALID ] = {
        .pme_name = "PM_LSU_SRQ_S0_VALID",
        .pme_code = 0xd09c,
        .pme_short_desc = "Slot 0 of SRQ valid",
        .pme_long_desc = "This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. In SMT mode the SRQ is split between the two threads (16 entries each).",
    },
    [ POWER7_PME_PM_GCT_EMPTY_CYC ] = {
        .pme_name = "PM_GCT_EMPTY_CYC",
        .pme_code = 0x20008,
        .pme_short_desc = "GCT empty, all threads",
        .pme_long_desc = "Cycles when the Global Completion Table was completely empty. No thread had an entry allocated.",
    },
    [ POWER7_PME_PM_IOPS_DISP ] = {
        .pme_name = "PM_IOPS_DISP",
        .pme_code = 0x30014,
        .pme_short_desc = "IOPS dispatched",
        .pme_long_desc = "IOPS dispatched",
    },
    [ POWER7_PME_PM_RUN_SPURR ] = {
        .pme_name = "PM_RUN_SPURR",
        .pme_code = 0x10008,
        .pme_short_desc = "Run SPURR",
        .pme_long_desc = "Run SPURR",
    },
    [ POWER7_PME_PM_PTEG_FROM_L21_MOD ] = {
        .pme_name = "PM_PTEG_FROM_L21_MOD",
        .pme_code = 0x3c056,
        .pme_short_desc = "PTEG loaded from another L2 on same chip modified",
        .pme_long_desc = "PTEG loaded from another L2 on same chip modified",
    },
    [ POWER7_PME_PM_VSU0_1FLOP ] = {
        .pme_name = "PM_VSU0_1FLOP",
        .pme_code = 0xa080,
        .pme_short_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg, xsadd, xsmul, xssub, xscmp, xssel, xsabs, xsnabs, xsre, xssqrte, xsneg) operation finished",
        .pme_long_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg, xsadd, xsmul, xssub, xscmp, xssel, xsabs, xsnabs, xsre, xssqrte, xsneg) operation finished",
    },
    [ POWER7_PME_PM_SNOOP_TLBIE ] = {
        .pme_name = "PM_SNOOP_TLBIE",
        .pme_code = 0xd0b2,
        .pme_short_desc = "TLBIE snoop",
        .pme_long_desc = "A tlbie was snooped from another processor.",
    },
[ POWER7_PME_PM_DATA_FROM_L3MISS ] = { .pme_name = "PM_DATA_FROM_L3MISS", .pme_code = 0x2c048, .pme_short_desc = "Demand LD - L3 Miss (not L2 hit and not L3 hit)", .pme_long_desc = "The processor's Data Cache was reloaded from beyond L3 due to a demand load", }, [ POWER7_PME_PM_VSU_SINGLE ] = { .pme_name = "PM_VSU_SINGLE", .pme_code = 0xa8a8, .pme_short_desc = "Vector or Scalar single precision", .pme_long_desc = "Vector or Scalar single precision", }, [ POWER7_PME_PM_DTLB_MISS_16G ] = { .pme_name = "PM_DTLB_MISS_16G", .pme_code = 0x1c05e, .pme_short_desc = "Data TLB miss for 16G page", .pme_long_desc = "Data TLB references to 16GB pages that missed the TLB. Page size is determined at TLB reload time.", }, [ POWER7_PME_PM_CMPLU_STALL_VECTOR ] = { .pme_name = "PM_CMPLU_STALL_VECTOR", .pme_code = 0x2001c, .pme_short_desc = "Completion stall caused by Vector instruction", .pme_long_desc = "Completion stall caused by Vector instruction", }, [ POWER7_PME_PM_FLUSH ] = { .pme_name = "PM_FLUSH", .pme_code = 0x400f8, .pme_short_desc = "Flush (any type)", .pme_long_desc = "Flushes occurred including LSU and Branch flushes.", }, [ POWER7_PME_PM_L2_LD_HIT ] = { .pme_name = "PM_L2_LD_HIT", .pme_code = 0x36182, .pme_short_desc = "All successful load dispatches that were L2 hits", .pme_long_desc = "A load request (data or instruction) hit in the L2 directory. Includes speculative, prefetched, and demand requests. This event includes all requests to this L2 from all sources. 
Total for all slices", }, [ POWER7_PME_PM_NEST_PAIR2_AND ] = { .pme_name = "PM_NEST_PAIR2_AND", .pme_code = 0x30883, .pme_short_desc = " Nest events (MC0/MC1/PB/GX), Pair2 AND", .pme_long_desc = " Nest events (MC0/MC1/PB/GX), Pair2 AND", }, [ POWER7_PME_PM_VSU1_1FLOP ] = { .pme_name = "PM_VSU1_1FLOP", .pme_code = 0xa082, .pme_short_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg, xsadd, xsmul, xssub, xscmp, xssel, xsabs, xsnabs, xsre, xssqrte, xsneg) operation finished", .pme_long_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg, xsadd, xsmul, xssub, xscmp, xssel, xsabs, xsnabs, xsre, xssqrte, xsneg) operation finished", }, [ POWER7_PME_PM_IC_PREF_REQ ] = { .pme_name = "PM_IC_PREF_REQ", .pme_code = 0x408a, .pme_short_desc = "Instruction prefetch requests", .pme_long_desc = "An instruction prefetch request has been made.", }, [ POWER7_PME_PM_L3_LD_HIT ] = { .pme_name = "PM_L3_LD_HIT", .pme_code = 0x2f080, .pme_short_desc = "L3 demand LD Hits", .pme_long_desc = "L3 demand LD Hits", }, [ POWER7_PME_PM_GCT_NOSLOT_IC_MISS ] = { .pme_name = "PM_GCT_NOSLOT_IC_MISS", .pme_code = 0x2001a, .pme_short_desc = "GCT empty by I cache miss", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread because of an Instruction Cache miss.", }, [ POWER7_PME_PM_DISP_HELD ] = { .pme_name = "PM_DISP_HELD", .pme_code = 0x10006, .pme_short_desc = "Dispatch Held", .pme_long_desc = "Dispatch Held", }, [ POWER7_PME_PM_L2_LD ] = { .pme_name = "PM_L2_LD", .pme_code = 0x16080, .pme_short_desc = "Data Load Count", .pme_long_desc = "Data Load Count", }, [ POWER7_PME_PM_LSU_FLUSH_SRQ ] = { .pme_name = "PM_LSU_FLUSH_SRQ", .pme_code = 0xc8bc, .pme_short_desc = "Flush: SRQ", .pme_long_desc = "Load Hit Store flush. A younger load was flushed because it hits (overlaps) an older store that is already in the SRQ or in the same group. 
If the real addresses match but the effective addresses do not, an alias condition exists that prevents store forwarding. If the load and store are in the same group the load must be flushed to separate the two instructions. Combined Unit 0 + 1.", }, [ POWER7_PME_PM_BC_PLUS_8_CONV ] = { .pme_name = "PM_BC_PLUS_8_CONV", .pme_code = 0x40b8, .pme_short_desc = "BC+8 Converted", .pme_long_desc = "BC+8 Converted", }, [ POWER7_PME_PM_MRK_DATA_FROM_L31_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L31_MOD_CYC", .pme_code = 0x40026, .pme_short_desc = "Marked ld latency Data source 0111 (L3.1 M same chip)", .pme_long_desc = "Marked ld latency Data source 0111 (L3.1 M same chip)", }, [ POWER7_PME_PM_CMPLU_STALL_VECTOR_LONG ] = { .pme_name = "PM_CMPLU_STALL_VECTOR_LONG", .pme_code = 0x4004a, .pme_short_desc = "completion stall due to long latency vector instruction", .pme_long_desc = "completion stall due to long latency vector instruction", }, [ POWER7_PME_PM_L2_RCST_BUSY_RC_FULL ] = { .pme_name = "PM_L2_RCST_BUSY_RC_FULL", .pme_code = 0x26282, .pme_short_desc = " L2 activated Busy to the core for stores due to all RC full", .pme_long_desc = " L2 activated Busy to the core for stores due to all RC full", }, [ POWER7_PME_PM_TB_BIT_TRANS ] = { .pme_name = "PM_TB_BIT_TRANS", .pme_code = 0x300f8, .pme_short_desc = "Time Base bit transition", .pme_long_desc = "When the selected time base bit (as specified in MMCR0[TBSEL]) transitions from 0 to 1", }, [ POWER7_PME_PM_THERMAL_MAX ] = { .pme_name = "PM_THERMAL_MAX", .pme_code = 0x40006, .pme_short_desc = "Processor In Thermal MAX", .pme_long_desc = "The processor experienced a thermal overload condition. 
This bit is sticky, it remains set until cleared by software.", }, [ POWER7_PME_PM_LSU1_FLUSH_ULD ] = { .pme_name = "PM_LSU1_FLUSH_ULD", .pme_code = 0xc0b2, .pme_short_desc = "LS 1 Flush: Unaligned Load", .pme_long_desc = "A load was flushed from unit 1 because it was unaligned (crossed a 64 byte boundary, or 32 byte if it missed the L1).", }, [ POWER7_PME_PM_LSU1_REJECT_LHS ] = { .pme_name = "PM_LSU1_REJECT_LHS", .pme_code = 0xc0ae, .pme_short_desc = "LS1 Reject: Load Hit Store", .pme_long_desc = "Load Store Unit 1 rejected a load instruction that had an address overlap with an older store in the store queue. The store must be committed and de-allocated from the Store Queue before the load can execute successfully.", }, [ POWER7_PME_PM_LSU_LRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LRQ_S0_ALLOC", .pme_code = 0xd09f, .pme_short_desc = "Slot 0 of LRQ valid", .pme_long_desc = "Slot 0 of LRQ valid", }, [ POWER7_PME_PM_L3_CO_L31 ] = { .pme_name = "PM_L3_CO_L31", .pme_code = 0x4f080, .pme_short_desc = "L3 Castouts to Memory", .pme_long_desc = "L3 Castouts to Memory", }, [ POWER7_PME_PM_POWER_EVENT4 ] = { .pme_name = "PM_POWER_EVENT4", .pme_code = 0x4006e, .pme_short_desc = "Power Management Event 4", .pme_long_desc = "Power Management Event 4", }, [ POWER7_PME_PM_DATA_FROM_L31_SHR ] = { .pme_name = "PM_DATA_FROM_L31_SHR", .pme_code = 0x1c04e, .pme_short_desc = "Data loaded from another L3 on same chip shared", .pme_long_desc = "Data loaded from another L3 on same chip shared", }, [ POWER7_PME_PM_BR_UNCOND ] = { .pme_name = "PM_BR_UNCOND", .pme_code = 0x409e, .pme_short_desc = "Unconditional Branch", .pme_long_desc = "An unconditional branch was executed.", }, [ POWER7_PME_PM_LSU1_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_LSU1_DC_PREF_STREAM_ALLOC", .pme_code = 0xd0aa, .pme_short_desc = "LS 1 D cache new prefetch stream allocated", .pme_long_desc = "LS 1 D cache new prefetch stream allocated", }, [ POWER7_PME_PM_PMC4_REWIND ] = { .pme_name = "PM_PMC4_REWIND", .pme_code = 
0x10020, .pme_short_desc = "PMC4 Rewind Event", .pme_long_desc = "PMC4 was counting speculatively. The speculative condition was not met and the counter was restored to its previous value.", }, [ POWER7_PME_PM_L2_RCLD_DISP ] = { .pme_name = "PM_L2_RCLD_DISP", .pme_code = 0x16280, .pme_short_desc = " L2 RC load dispatch attempt", .pme_long_desc = " L2 RC load dispatch attempt", }, [ POWER7_PME_PM_THRD_PRIO_2_3_CYC ] = { .pme_name = "PM_THRD_PRIO_2_3_CYC", .pme_code = 0x40b2, .pme_short_desc = " Cycles thread running at priority level 2 or 3", .pme_long_desc = " Cycles thread running at priority level 2 or 3", }, [ POWER7_PME_PM_MRK_PTEG_FROM_L2MISS ] = { .pme_name = "PM_MRK_PTEG_FROM_L2MISS", .pme_code = 0x4d058, .pme_short_desc = "Marked PTEG loaded from L2 miss", .pme_long_desc = "A Page Table Entry was loaded into the ERAT but not from the local L2 due to a marked load or store.", }, [ POWER7_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BHT_REDIRECT", .pme_code = 0x4098, .pme_short_desc = " L2 I cache demand request due to BHT redirect", .pme_long_desc = "A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (CR mispredict).", }, [ POWER7_PME_PM_LSU_DERAT_MISS ] = { .pme_name = "PM_LSU_DERAT_MISS", .pme_code = 0x200f6, .pme_short_desc = "DERAT Reloaded due to a DERAT miss", .pme_long_desc = "Total D-ERAT Misses. Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction. 
Combined Unit 0 + 1.", }, [ POWER7_PME_PM_IC_PREF_CANCEL_L2 ] = { .pme_name = "PM_IC_PREF_CANCEL_L2", .pme_code = 0x4094, .pme_short_desc = "L2 Squashed request", .pme_long_desc = "L2 Squashed request", }, [ POWER7_PME_PM_MRK_FIN_STALL_CYC_COUNT ] = { .pme_name = "PM_MRK_FIN_STALL_CYC_COUNT", .pme_code = 0x1003d, .pme_short_desc = "Marked instruction Finish Stall cycles (marked finish after NTC) (use edge detect to count #)", .pme_long_desc = "Marked instruction Finish Stall cycles (marked finish after NTC) (use edge detect to count #)", }, [ POWER7_PME_PM_BR_PRED_CCACHE ] = { .pme_name = "PM_BR_PRED_CCACHE", .pme_code = 0x40a0, .pme_short_desc = "Count Cache Predictions", .pme_long_desc = "The count value of a Branch and Count instruction was predicted", }, [ POWER7_PME_PM_GCT_UTIL_1_TO_2_SLOTS ] = { .pme_name = "PM_GCT_UTIL_1_TO_2_SLOTS", .pme_code = 0x209c, .pme_short_desc = "GCT Utilization 1-2 entries", .pme_long_desc = "GCT Utilization 1-2 entries", }, [ POWER7_PME_PM_MRK_ST_CMPL_INT ] = { .pme_name = "PM_MRK_ST_CMPL_INT", .pme_code = 0x30034, .pme_short_desc = "marked store complete (data home) with intervention", .pme_long_desc = "A marked store previously sent to the memory subsystem completed (data home) after requiring intervention", }, [ POWER7_PME_PM_LSU_TWO_TABLEWALK_CYC ] = { .pme_name = "PM_LSU_TWO_TABLEWALK_CYC", .pme_code = 0xd0a6, .pme_short_desc = "Cycles when two tablewalks pending on this thread", .pme_long_desc = "Cycles when two tablewalks pending on this thread", }, [ POWER7_PME_PM_MRK_DATA_FROM_L3MISS ] = { .pme_name = "PM_MRK_DATA_FROM_L3MISS", .pme_code = 0x2d048, .pme_short_desc = "Marked data loaded from L3 miss", .pme_long_desc = "DL1 was reloaded from beyond L3 due to a marked load.", }, [ POWER7_PME_PM_GCT_NOSLOT_CYC ] = { .pme_name = "PM_GCT_NOSLOT_CYC", .pme_code = 0x100f8, .pme_short_desc = "No itags assigned ", .pme_long_desc = "Cycles when the Global Completion Table has no slots from this thread.", }, [ 
POWER7_PME_PM_LSU_SET_MPRED ] = { .pme_name = "PM_LSU_SET_MPRED", .pme_code = 0xc0a8, .pme_short_desc = "Line already in cache at reload time", .pme_long_desc = "Line already in cache at reload time", }, [ POWER7_PME_PM_FLUSH_DISP_TLBIE ] = { .pme_name = "PM_FLUSH_DISP_TLBIE", .pme_code = 0x208a, .pme_short_desc = "Dispatch Flush: TLBIE", .pme_long_desc = "Dispatch Flush: TLBIE", }, [ POWER7_PME_PM_VSU1_FCONV ] = { .pme_name = "PM_VSU1_FCONV", .pme_code = 0xa0b2, .pme_short_desc = "Convert instruction executed", .pme_long_desc = "Convert instruction executed", }, [ POWER7_PME_PM_DERAT_MISS_16G ] = { .pme_name = "PM_DERAT_MISS_16G", .pme_code = 0x4c05c, .pme_short_desc = "DERAT misses for 16G page", .pme_long_desc = "A data request (load or store) missed the ERAT for 16G page and resulted in an ERAT reload.", }, [ POWER7_PME_PM_INST_FROM_LMEM ] = { .pme_name = "PM_INST_FROM_LMEM", .pme_code = 0x3404a, .pme_short_desc = "Instruction fetched from local memory", .pme_long_desc = "An instruction fetch group was fetched from memory attached to the same module this processor is located on. 
Fetch groups can contain up to 8 instructions", }, [ POWER7_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BR_REDIRECT", .pme_code = 0x409a, .pme_short_desc = " L2 I cache demand request due to branch redirect", .pme_long_desc = "A demand (not prefetch) miss to the instruction cache was sent to the L2 as a result of a branch prediction redirect (either ALL mispredicted or Target).", }, [ POWER7_PME_PM_CMPLU_STALL_SCALAR_LONG ] = { .pme_name = "PM_CMPLU_STALL_SCALAR_LONG", .pme_code = 0x20018, .pme_short_desc = "Completion stall caused by long latency scalar instruction", .pme_long_desc = "Completion stall caused by long latency scalar instruction", }, [ POWER7_PME_PM_INST_PTEG_FROM_L2 ] = { .pme_name = "PM_INST_PTEG_FROM_L2", .pme_code = 0x1e050, .pme_short_desc = "Instruction PTEG loaded from L2", .pme_long_desc = "Instruction PTEG loaded from L2", }, [ POWER7_PME_PM_PTEG_FROM_L2 ] = { .pme_name = "PM_PTEG_FROM_L2", .pme_code = 0x1c050, .pme_short_desc = "PTEG loaded from L2", .pme_long_desc = "A Page Table Entry was loaded into the ERAT from the local L2 due to a demand load or store.", }, [ POWER7_PME_PM_MRK_DATA_FROM_L21_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L21_SHR_CYC", .pme_code = 0x20024, .pme_short_desc = "Marked ld latency Data source 0100 (L2.1 S)", .pme_long_desc = "Marked load latency Data source 0100 (L2.1 S)", }, [ POWER7_PME_PM_MRK_DTLB_MISS_4K ] = { .pme_name = "PM_MRK_DTLB_MISS_4K", .pme_code = 0x2d05a, .pme_short_desc = "Marked Data TLB misses for 4K page", .pme_long_desc = "Data TLB references to 4KB pages by a marked instruction that missed the TLB. 
Page size is determined at TLB reload time.", }, [ POWER7_PME_PM_VSU0_FPSCR ] = { .pme_name = "PM_VSU0_FPSCR", .pme_code = 0xb09c, .pme_short_desc = "Move to/from FPSCR type instruction issued on Pipe 0", .pme_long_desc = "Move to/from FPSCR type instruction issued on Pipe 0", }, [ POWER7_PME_PM_VSU1_VECT_DOUBLE_ISSUED ] = { .pme_name = "PM_VSU1_VECT_DOUBLE_ISSUED", .pme_code = 0xb082, .pme_short_desc = "Double Precision vector instruction issued on Pipe1", .pme_long_desc = "Double Precision vector instruction issued on Pipe1", }, [ POWER7_PME_PM_MRK_PTEG_FROM_RL2L3_MOD ] = { .pme_name = "PM_MRK_PTEG_FROM_RL2L3_MOD", .pme_code = 0x1d052, .pme_short_desc = "Marked PTEG loaded from remote L2 or L3 modified", .pme_long_desc = "A Page Table Entry was loaded into the ERAT with shared (T or SL) data from an L2 or L3 on a remote module due to a marked load or store.", }, [ POWER7_PME_PM_MEM0_RQ_DISP ] = { .pme_name = "PM_MEM0_RQ_DISP", .pme_code = 0x10083, .pme_short_desc = " Nest events (MC0/MC1/PB/GX), Pair0 Bit1", .pme_long_desc = " Nest events (MC0/MC1/PB/GX), Pair0 Bit1", }, [ POWER7_PME_PM_L2_LD_MISS ] = { .pme_name = "PM_L2_LD_MISS", .pme_code = 0x26080, .pme_short_desc = "Data Load Miss", .pme_long_desc = "Data Load Miss", }, [ POWER7_PME_PM_VMX_RESULT_SAT_1 ] = { .pme_name = "PM_VMX_RESULT_SAT_1", .pme_code = 0xb0a0, .pme_short_desc = "Valid result with sat=1", .pme_long_desc = "Valid result with sat=1", }, [ POWER7_PME_PM_L1_PREF ] = { .pme_name = "PM_L1_PREF", .pme_code = 0xd8b8, .pme_short_desc = "L1 Prefetches", .pme_long_desc = "A request to prefetch data into the L1 was made", }, [ POWER7_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM_CYC", .pme_code = 0x2002c, .pme_short_desc = "Marked ld latency Data Source 1100 (Local Memory)", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER7_PME_PM_GRP_IC_MISS_NONSPEC ] = { .pme_name = "PM_GRP_IC_MISS_NONSPEC", .pme_code = 0x1000c, .pme_short_desc = "Group experienced non-speculative I cache miss", .pme_long_desc = "Number of groups, counted at completion, that have encountered an instruction cache miss.", }, [ POWER7_PME_PM_PB_NODE_PUMP ] = { .pme_name = "PM_PB_NODE_PUMP", .pme_code = 0x10081, .pme_short_desc = " Nest events (MC0/MC1/PB/GX), Pair0 Bit0", .pme_long_desc = " Nest events (MC0/MC1/PB/GX), Pair0 Bit0", }, [ POWER7_PME_PM_SHL_MERGED ] = { .pme_name = "PM_SHL_MERGED", .pme_code = 0x5084, .pme_short_desc = "SHL table entry merged with existing", .pme_long_desc = "SHL table entry merged with existing", }, [ POWER7_PME_PM_NEST_PAIR1_ADD ] = { .pme_name = "PM_NEST_PAIR1_ADD", .pme_code = 0x20881, .pme_short_desc = " Nest events (MC0/MC1/PB/GX), Pair1 ADD", .pme_long_desc = " Nest events (MC0/MC1/PB/GX), Pair1 ADD", }, [ POWER7_PME_PM_DATA_FROM_L3 ] = { .pme_name = "PM_DATA_FROM_L3", .pme_code = 0x1c048, .pme_short_desc = "Data loaded from L3", .pme_long_desc = "The processor's Data Cache was reloaded from the local L3 due to a demand load.", }, [ POWER7_PME_PM_LSU_FLUSH ] = { .pme_name = "PM_LSU_FLUSH", .pme_code = 0x208e, .pme_short_desc = "Flush initiated by LSU", .pme_long_desc = "A flush was initiated by the Load Store Unit.", }, [ POWER7_PME_PM_LSU_SRQ_SYNC_COUNT ] = { .pme_name = "PM_LSU_SRQ_SYNC_COUNT", .pme_code = 0xd097, .pme_short_desc = "SRQ sync count (edge of PM_LSU_SRQ_SYNC_CYC)", .pme_long_desc = "SRQ sync count (edge of PM_LSU_SRQ_SYNC_CYC)", }, [ POWER7_PME_PM_PMC2_OVERFLOW ] = { .pme_name = "PM_PMC2_OVERFLOW", .pme_code = 0x30010, .pme_short_desc = "Overflow from counter 2", .pme_long_desc = "Overflows from PMC2 are counted. This effectively widens the PMC. 
The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", }, [ POWER7_PME_PM_LSU_LDF ] = { .pme_name = "PM_LSU_LDF", .pme_code = 0xc884, .pme_short_desc = "All Scalar Loads", .pme_long_desc = "LSU executed Floating Point load instruction. Combined Unit 0 + 1.", }, [ POWER7_PME_PM_POWER_EVENT3 ] = { .pme_name = "PM_POWER_EVENT3", .pme_code = 0x3006e, .pme_short_desc = "Power Management Event 3", .pme_long_desc = "Power Management Event 3", }, [ POWER7_PME_PM_DISP_WT ] = { .pme_name = "PM_DISP_WT", .pme_code = 0x30008, .pme_short_desc = "Dispatched Starved (not held, nothing to dispatch)", .pme_long_desc = "Dispatched Starved (not held, nothing to dispatch)", }, [ POWER7_PME_PM_CMPLU_STALL_REJECT ] = { .pme_name = "PM_CMPLU_STALL_REJECT", .pme_code = 0x40016, .pme_short_desc = "Completion stall caused by reject", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered a load/store reject. 
This is a subset of PM_CMPLU_STALL_LSU.", }, [ POWER7_PME_PM_IC_BANK_CONFLICT ] = { .pme_name = "PM_IC_BANK_CONFLICT", .pme_code = 0x4082, .pme_short_desc = "Read blocked due to interleave conflict.", .pme_long_desc = "Read blocked due to interleave conflict.", }, [ POWER7_PME_PM_BR_MPRED_CR_TA ] = { .pme_name = "PM_BR_MPRED_CR_TA", .pme_code = 0x48ae, .pme_short_desc = "Branch mispredict - taken/not taken and target", .pme_long_desc = "Branch mispredict - taken/not taken and target", }, [ POWER7_PME_PM_L2_INST_MISS ] = { .pme_name = "PM_L2_INST_MISS", .pme_code = 0x36082, .pme_short_desc = "Instruction Load Misses", .pme_long_desc = "Instruction Load Misses", }, [ POWER7_PME_PM_CMPLU_STALL_ERAT_MISS ] = { .pme_name = "PM_CMPLU_STALL_ERAT_MISS", .pme_code = 0x40018, .pme_short_desc = "Completion stall caused by ERAT miss", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered an ERAT miss. This is a subset of PM_CMPLU_STALL_REJECT.", }, [ POWER7_PME_PM_NEST_PAIR2_ADD ] = { .pme_name = "PM_NEST_PAIR2_ADD", .pme_code = 0x30881, .pme_short_desc = " Nest events (MC0/MC1/PB/GX), Pair2 ADD", .pme_long_desc = " Nest events (MC0/MC1/PB/GX), Pair2 ADD", }, [ POWER7_PME_PM_MRK_LSU_FLUSH ] = { .pme_name = "PM_MRK_LSU_FLUSH", .pme_code = 0xd08c, .pme_short_desc = "Flush: (marked) : All Cases", .pme_long_desc = "Marked flush initiated by LSU", }, [ POWER7_PME_PM_L2_LDST ] = { .pme_name = "PM_L2_LDST", .pme_code = 0x16880, .pme_short_desc = "Data Load+Store Count", .pme_long_desc = "Data Load+Store Count", }, [ POWER7_PME_PM_INST_FROM_L31_SHR ] = { .pme_name = "PM_INST_FROM_L31_SHR", .pme_code = 0x1404e, .pme_short_desc = "Instruction fetched from another L3 on same chip shared", .pme_long_desc = "Instruction fetched from another L3 on same chip shared", }, [ POWER7_PME_PM_VSU0_FIN ] = { .pme_name = "PM_VSU0_FIN", .pme_code = 0xa0bc, .pme_short_desc = "VSU0 Finished an instruction", 
.pme_long_desc = "VSU0 Finished an instruction", }, [ POWER7_PME_PM_LARX_LSU ] = { .pme_name = "PM_LARX_LSU", .pme_code = 0xc894, .pme_short_desc = "Larx Finished", .pme_long_desc = "Larx Finished", }, [ POWER7_PME_PM_INST_FROM_RMEM ] = { .pme_name = "PM_INST_FROM_RMEM", .pme_code = 0x34042, .pme_short_desc = "Instruction fetched from remote memory", .pme_long_desc = "An instruction fetch group was fetched from memory attached to a different module than this processor is located on. Fetch groups can contain up to 8 instructions", }, [ POWER7_PME_PM_DISP_CLB_HELD_TLBIE ] = { .pme_name = "PM_DISP_CLB_HELD_TLBIE", .pme_code = 0x2096, .pme_short_desc = "Dispatch Hold: Due to TLBIE", .pme_long_desc = "Dispatch Hold: Due to TLBIE", }, [ POWER7_PME_PM_MRK_DATA_FROM_DMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_DMEM_CYC", .pme_code = 0x2002e, .pme_short_desc = "Marked ld latency Data Source 1110 (Distant Memory)", .pme_long_desc = "Marked ld latency Data Source 1110 (Distant Memory)", }, [ POWER7_PME_PM_BR_PRED_CR ] = { .pme_name = "PM_BR_PRED_CR", .pme_code = 0x40a8, .pme_short_desc = "Branch predict - taken/not taken", .pme_long_desc = "A conditional branch instruction was predicted as taken or not taken.", }, [ POWER7_PME_PM_LSU_REJECT ] = { .pme_name = "PM_LSU_REJECT", .pme_code = 0x10064, .pme_short_desc = "LSU Reject (up to 2 per cycle)", .pme_long_desc = "The Load Store Unit rejected an instruction. 
Combined Unit 0 + 1", }, [ POWER7_PME_PM_GCT_UTIL_3_TO_6_SLOTS ] = { .pme_name = "PM_GCT_UTIL_3_TO_6_SLOTS", .pme_code = 0x209e, .pme_short_desc = "GCT Utilization 3-6 entries", .pme_long_desc = "GCT Utilization 3-6 entries", }, [ POWER7_PME_PM_CMPLU_STALL_END_GCT_NOSLOT ] = { .pme_name = "PM_CMPLU_STALL_END_GCT_NOSLOT", .pme_code = 0x10028, .pme_short_desc = "Count ended because GCT went empty", .pme_long_desc = "Count ended because GCT went empty", }, [ POWER7_PME_PM_LSU0_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU0_REJECT_LMQ_FULL", .pme_code = 0xc0a4, .pme_short_desc = "LS0 Reject: LMQ Full (LHR)", .pme_long_desc = "Total cycles the Load Store Unit 0 is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all eight entries are full, subsequent load instructions are rejected.", }, [ POWER7_PME_PM_VSU_FEST ] = { .pme_name = "PM_VSU_FEST", .pme_code = 0xa8b8, .pme_short_desc = "Estimate instruction executed", .pme_long_desc = "Estimate instruction executed", }, [ POWER7_PME_PM_NEST_PAIR0_AND ] = { .pme_name = "PM_NEST_PAIR0_AND", .pme_code = 0x10883, .pme_short_desc = " Nest events (MC0/MC1/PB/GX), Pair0 AND", .pme_long_desc = " Nest events (MC0/MC1/PB/GX), Pair0 AND", }, [ POWER7_PME_PM_PTEG_FROM_L3 ] = { .pme_name = "PM_PTEG_FROM_L3", .pme_code = 0x2c050, .pme_short_desc = "PTEG loaded from L3", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local L3 due to a demand load.", }, [ POWER7_PME_PM_POWER_EVENT2 ] = { .pme_name = "PM_POWER_EVENT2", .pme_code = 0x2006e, .pme_short_desc = "Power Management Event 2", .pme_long_desc = "Power Management Event 2", }, [ POWER7_PME_PM_IC_PREF_CANCEL_PAGE ] = { .pme_name = "PM_IC_PREF_CANCEL_PAGE", .pme_code = 0x4090, .pme_short_desc = "Prefetch Canceled due to page boundary", .pme_long_desc = "Prefetch Canceled due to page boundary", }, [ POWER7_PME_PM_VSU0_FSQRT_FDIV ] = { .pme_name = "PM_VSU0_FSQRT_FDIV", .pme_code = 0xa088, .pme_short_desc = "four flops 
operation (fdiv,fsqrt,xsdiv,xssqrt) Scalar Instructions only!", .pme_long_desc = "four flops operation (fdiv,fsqrt,xsdiv,xssqrt) Scalar Instructions only!", }, [ POWER7_PME_PM_MRK_GRP_CMPL ] = { .pme_name = "PM_MRK_GRP_CMPL", .pme_code = 0x40030, .pme_short_desc = "Marked group complete", .pme_long_desc = "A group containing a sampled instruction completed. Microcoded instructions that span multiple groups will generate this event once per group.", }, [ POWER7_PME_PM_VSU0_SCAL_DOUBLE_ISSUED ] = { .pme_name = "PM_VSU0_SCAL_DOUBLE_ISSUED", .pme_code = 0xb088, .pme_short_desc = "Double Precision scalar instruction issued on Pipe0", .pme_long_desc = "Double Precision scalar instruction issued on Pipe0", }, [ POWER7_PME_PM_GRP_DISP ] = { .pme_name = "PM_GRP_DISP", .pme_code = 0x3000a, .pme_short_desc = "dispatch_success (Group Dispatched)", .pme_long_desc = "A group was dispatched", }, [ POWER7_PME_PM_LSU0_LDX ] = { .pme_name = "PM_LSU0_LDX", .pme_code = 0xc088, .pme_short_desc = "LS0 Vector Loads", .pme_long_desc = "LS0 Vector Loads", }, [ POWER7_PME_PM_DATA_FROM_L2 ] = { .pme_name = "PM_DATA_FROM_L2", .pme_code = 0x1c040, .pme_short_desc = "Data loaded from L2", .pme_long_desc = "The processor's Data Cache was reloaded from the local L2 due to a demand load.", }, [ POWER7_PME_PM_MRK_DATA_FROM_RL2L3_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_MOD", .pme_code = 0x1d042, .pme_short_desc = "Marked data loaded from remote L2 or L3 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from an L2 or L3 on a remote module due to a marked load.", }, [ POWER7_PME_PM_LD_REF_L1 ] = { .pme_name = "PM_LD_REF_L1", .pme_code = 0xc880, .pme_short_desc = " L1 D cache load references counted at finish", .pme_long_desc = " L1 D cache load references counted at finish", }, [ POWER7_PME_PM_VSU0_VECT_DOUBLE_ISSUED ] = { .pme_name = "PM_VSU0_VECT_DOUBLE_ISSUED", .pme_code = 0xb080, .pme_short_desc = "Double Precision vector instruction issued on 
Pipe0", .pme_long_desc = "Double Precision vector instruction issued on Pipe0", }, [ POWER7_PME_PM_VSU1_2FLOP_DOUBLE ] = { .pme_name = "PM_VSU1_2FLOP_DOUBLE", .pme_code = 0xa08e, .pme_short_desc = "two flop DP vector operation (xvadddp, xvmuldp, xvsubdp, xvcmpdp, xvseldp, xvabsdp, xvnabsdp, xvredp, xvsqrtedp, vxnegdp)", .pme_long_desc = "two flop DP vector operation (xvadddp, xvmuldp, xvsubdp, xvcmpdp, xvseldp, xvabsdp, xvnabsdp, xvredp, xvsqrtedp, vxnegdp)", }, [ POWER7_PME_PM_THRD_PRIO_6_7_CYC ] = { .pme_name = "PM_THRD_PRIO_6_7_CYC", .pme_code = 0x40b6, .pme_short_desc = " Cycles thread running at priority level 6 or 7", .pme_long_desc = " Cycles thread running at priority level 6 or 7", }, [ POWER7_PME_PM_BC_PLUS_8_RSLV_TAKEN ] = { .pme_name = "PM_BC_PLUS_8_RSLV_TAKEN", .pme_code = 0x40ba, .pme_short_desc = "BC+8 Resolve outcome was Taken, resulting in the conditional instruction being canceled", .pme_long_desc = "BC+8 Resolve outcome was Taken, resulting in the conditional instruction being canceled", }, [ POWER7_PME_PM_BR_MPRED_CR ] = { .pme_name = "PM_BR_MPRED_CR", .pme_code = 0x40ac, .pme_short_desc = "Branch mispredict - taken/not taken", .pme_long_desc = "A conditional branch instruction was incorrectly predicted as taken or not taken. The branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This will result in a branch redirect flush if not overridden by a flush of an older instruction.", }, [ POWER7_PME_PM_L3_CO_MEM ] = { .pme_name = "PM_L3_CO_MEM", .pme_code = 0x4f082, .pme_short_desc = "L3 Castouts to L3.1", .pme_long_desc = "L3 Castouts to L3.1", }, [ POWER7_PME_PM_LD_MISS_L1 ] = { .pme_name = "PM_LD_MISS_L1", .pme_code = 0x400f0, .pme_short_desc = "Load Missed L1", .pme_long_desc = "Load references that miss the Level 1 Data cache. 
Combined unit 0 + 1.", }, [ POWER7_PME_PM_DATA_FROM_RL2L3_MOD ] = { .pme_name = "PM_DATA_FROM_RL2L3_MOD", .pme_code = 0x1c042, .pme_short_desc = "Data loaded from remote L2 or L3 modified", .pme_long_desc = "The processor's Data Cache was reloaded with modified (M) data from an L2 or L3 on a remote module due to a demand load", }, [ POWER7_PME_PM_LSU_SRQ_FULL_CYC ] = { .pme_name = "PM_LSU_SRQ_FULL_CYC", .pme_code = 0x1001a, .pme_short_desc = "Storage Queue is full and is blocking dispatch", .pme_long_desc = "Cycles the Store Request Queue is full.", }, [ POWER7_PME_PM_TABLEWALK_CYC ] = { .pme_name = "PM_TABLEWALK_CYC", .pme_code = 0x10026, .pme_short_desc = "Cycles when a tablewalk (I or D) is active", .pme_long_desc = "Cycles doing instruction or data tablewalks", }, [ POWER7_PME_PM_MRK_PTEG_FROM_RMEM ] = { .pme_name = "PM_MRK_PTEG_FROM_RMEM", .pme_code = 0x3d052, .pme_short_desc = "Marked PTEG loaded from remote memory", .pme_long_desc = "A Page Table Entry was loaded into the ERAT. POWER6 does not have a TLB", }, [ POWER7_PME_PM_LSU_SRQ_STFWD ] = { .pme_name = "PM_LSU_SRQ_STFWD", .pme_code = 0xc8a0, .pme_short_desc = "Load got data from a store", .pme_long_desc = "Data from a store instruction was forwarded to a load. A load that misses L1 but becomes a store forward is treated as a load miss and it causes the DL1 load miss event to be counted. It does not go into the LMQ. If a load hits L1 but becomes a store forward, it is not treated as a load miss. Combined Unit 0 + 1.", }, [ POWER7_PME_PM_INST_PTEG_FROM_RMEM ] = { .pme_name = "PM_INST_PTEG_FROM_RMEM", .pme_code = 0x3e052, .pme_short_desc = "Instruction PTEG loaded from remote memory", .pme_long_desc = "Instruction PTEG loaded from remote memory", }, [ POWER7_PME_PM_FXU0_FIN ] = { .pme_name = "PM_FXU0_FIN", .pme_code = 0x10004, .pme_short_desc = "FXU0 Finished", .pme_long_desc = "The Fixed Point unit 0 finished an instruction and produced a result. 
Instructions that finish may not necessarily complete.", }, [ POWER7_PME_PM_LSU1_L1_SW_PREF ] = { .pme_name = "PM_LSU1_L1_SW_PREF", .pme_code = 0xc09e, .pme_short_desc = "LSU1 Software L1 Prefetches, including SW Transient Prefetches", .pme_long_desc = "LSU1 Software L1 Prefetches, including SW Transient Prefetches", }, [ POWER7_PME_PM_PTEG_FROM_L31_MOD ] = { .pme_name = "PM_PTEG_FROM_L31_MOD", .pme_code = 0x1c054, .pme_short_desc = "PTEG loaded from another L3 on same chip modified", .pme_long_desc = "PTEG loaded from another L3 on same chip modified", }, [ POWER7_PME_PM_PMC5_OVERFLOW ] = { .pme_name = "PM_PMC5_OVERFLOW", .pme_code = 0x10024, .pme_short_desc = "Overflow from counter 5", .pme_long_desc = "Overflows from PMC5 are counted. This effectively widens the PMC. The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", }, [ POWER7_PME_PM_LD_REF_L1_LSU1 ] = { .pme_name = "PM_LD_REF_L1_LSU1", .pme_code = 0xc082, .pme_short_desc = "LS1 L1 D cache load references counted at finish", .pme_long_desc = "Load references to Level 1 Data Cache, by unit 1.", }, [ POWER7_PME_PM_INST_PTEG_FROM_L21_SHR ] = { .pme_name = "PM_INST_PTEG_FROM_L21_SHR", .pme_code = 0x4e056, .pme_short_desc = "Instruction PTEG loaded from another L2 on same chip shared", .pme_long_desc = "Instruction PTEG loaded from another L2 on same chip shared", }, [ POWER7_PME_PM_CMPLU_STALL_THRD ] = { .pme_name = "PM_CMPLU_STALL_THRD", .pme_code = 0x1001c, .pme_short_desc = "Completion Stalled due to thread conflict. Group ready to complete but it was another thread's turn", .pme_long_desc = "Completion Stalled due to thread conflict. 
Group ready to complete but it was another thread's turn", }, [ POWER7_PME_PM_DATA_FROM_RMEM ] = { .pme_name = "PM_DATA_FROM_RMEM", .pme_code = 0x3c042, .pme_short_desc = "Data loaded from remote memory", .pme_long_desc = "The processor's Data Cache was reloaded from memory attached to a different module than this processor is located on.", }, [ POWER7_PME_PM_VSU0_SCAL_SINGLE_ISSUED ] = { .pme_name = "PM_VSU0_SCAL_SINGLE_ISSUED", .pme_code = 0xb084, .pme_short_desc = "Single Precision scalar instruction issued on Pipe0", .pme_long_desc = "Single Precision scalar instruction issued on Pipe0", }, [ POWER7_PME_PM_BR_MPRED_LSTACK ] = { .pme_name = "PM_BR_MPRED_LSTACK", .pme_code = 0x40a6, .pme_short_desc = "Branch Mispredict due to Link Stack", .pme_long_desc = "Branch Mispredict due to Link Stack", }, [ POWER7_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_MOD_CYC", .pme_code = 0x40028, .pme_short_desc = "Marked ld latency Data source 1001 (L2.5/L3.5 M same 4 chip node)", .pme_long_desc = "Marked ld latency Data source 1001 (L2.5/L3.5 M same 4 chip node)", }, [ POWER7_PME_PM_LSU0_FLUSH_UST ] = { .pme_name = "PM_LSU0_FLUSH_UST", .pme_code = 0xc0b4, .pme_short_desc = "LS0 Flush: Unaligned Store", .pme_long_desc = "A store was flushed from unit 0 because it was unaligned (crossed a 4K boundary).", }, [ POWER7_PME_PM_LSU_NCST ] = { .pme_name = "PM_LSU_NCST", .pme_code = 0xc090, .pme_short_desc = "Non-cachable Stores sent to nest", .pme_long_desc = "Non-cachable Stores sent to nest", }, [ POWER7_PME_PM_BR_TAKEN ] = { .pme_name = "PM_BR_TAKEN", .pme_code = 0x20004, .pme_short_desc = "Branch Taken", .pme_long_desc = "A branch instruction was taken. 
This could have been a conditional branch or an unconditional branch", }, [ POWER7_PME_PM_INST_PTEG_FROM_LMEM ] = { .pme_name = "PM_INST_PTEG_FROM_LMEM", .pme_code = 0x4e052, .pme_short_desc = "Instruction PTEG loaded from local memory", .pme_long_desc = "Instruction PTEG loaded from local memory", }, [ POWER7_PME_PM_GCT_NOSLOT_BR_MPRED_IC_MISS ] = { .pme_name = "PM_GCT_NOSLOT_BR_MPRED_IC_MISS", .pme_code = 0x4001c, .pme_short_desc = "GCT empty by branch mispredict + IC miss", .pme_long_desc = "No slot in GCT caused by branch mispredict or I cache miss", }, [ POWER7_PME_PM_DTLB_MISS_4K ] = { .pme_name = "PM_DTLB_MISS_4K", .pme_code = 0x2c05a, .pme_short_desc = "Data TLB miss for 4K page", .pme_long_desc = "Data TLB references to 4KB pages that missed the TLB. Page size is determined at TLB reload time.", }, [ POWER7_PME_PM_PMC4_SAVED ] = { .pme_name = "PM_PMC4_SAVED", .pme_code = 0x30022, .pme_short_desc = "PMC4 Rewind Value saved (matched condition)", .pme_long_desc = "PMC4 was counting speculatively. The speculative condition was met and the counter value was committed by copying it to the backup register.", }, [ POWER7_PME_PM_VSU1_PERMUTE_ISSUED ] = { .pme_name = "PM_VSU1_PERMUTE_ISSUED", .pme_code = 0xb092, .pme_short_desc = "Permute VMX Instruction Issued", .pme_long_desc = "Permute VMX Instruction Issued", }, [ POWER7_PME_PM_SLB_MISS ] = { .pme_name = "PM_SLB_MISS", .pme_code = 0xd890, .pme_short_desc = "Data + Instruction SLB Miss - Total of all segment sizes", .pme_long_desc = "Total of all Segment Lookaside Buffer (SLB) misses, Instructions + Data.", }, [ POWER7_PME_PM_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_LSU1_FLUSH_LRQ", .pme_code = 0xc0ba, .pme_short_desc = "LS1 Flush: LRQ", .pme_long_desc = "Load Hit Load or Store Hit Load flush. 
A younger load was flushed from unit 1 because it executed before an older store and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER7_PME_PM_DTLB_MISS ] = { .pme_name = "PM_DTLB_MISS", .pme_code = 0x300fc, .pme_short_desc = "TLB reload valid", .pme_long_desc = "Data TLB misses, all page sizes.", }, [ POWER7_PME_PM_VSU1_FRSP ] = { .pme_name = "PM_VSU1_FRSP", .pme_code = 0xa0b6, .pme_short_desc = "Round to single precision instruction executed", .pme_long_desc = "Round to single precision instruction executed", }, [ POWER7_PME_PM_VSU_VECTOR_DOUBLE_ISSUED ] = { .pme_name = "PM_VSU_VECTOR_DOUBLE_ISSUED", .pme_code = 0xb880, .pme_short_desc = "Double Precision vector instruction issued on Pipe0", .pme_long_desc = "Double Precision vector instruction issued on Pipe0", }, [ POWER7_PME_PM_L2_CASTOUT_SHR ] = { .pme_name = "PM_L2_CASTOUT_SHR", .pme_code = 0x16182, .pme_short_desc = "L2 Castouts - Shared (T, Te, Si, S)", .pme_long_desc = "An L2 line in the Shared state was castout. 
Total for all slices.", }, [ POWER7_PME_PM_DATA_FROM_DL2L3_SHR ] = { .pme_name = "PM_DATA_FROM_DL2L3_SHR", .pme_code = 0x3c044, .pme_short_desc = "Data loaded from distant L2 or L3 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T or SL) data from an L2 or L3 on a distant module due to a demand load", }, [ POWER7_PME_PM_VSU1_STF ] = { .pme_name = "PM_VSU1_STF", .pme_code = 0xb08e, .pme_short_desc = "FPU store (SP or DP) issued on Pipe1", .pme_long_desc = "FPU store (SP or DP) issued on Pipe1", }, [ POWER7_PME_PM_ST_FIN ] = { .pme_name = "PM_ST_FIN", .pme_code = 0x200f0, .pme_short_desc = "Store Instructions Finished", .pme_long_desc = "Store requests sent to the nest.", }, [ POWER7_PME_PM_PTEG_FROM_L21_SHR ] = { .pme_name = "PM_PTEG_FROM_L21_SHR", .pme_code = 0x4c056, .pme_short_desc = "PTEG loaded from another L2 on same chip shared", .pme_long_desc = "PTEG loaded from another L2 on same chip shared", }, [ POWER7_PME_PM_L2_LOC_GUESS_WRONG ] = { .pme_name = "PM_L2_LOC_GUESS_WRONG", .pme_code = 0x26480, .pme_short_desc = "L2 guess loc and guess was not correct (ie data remote)", .pme_long_desc = "L2 guess loc and guess was not correct (ie data remote)", }, [ POWER7_PME_PM_MRK_STCX_FAIL ] = { .pme_name = "PM_MRK_STCX_FAIL", .pme_code = 0xd08e, .pme_short_desc = "Marked STCX failed", .pme_long_desc = "A marked stcx (stwcx or stdcx) failed", }, [ POWER7_PME_PM_LSU0_REJECT_LHS ] = { .pme_name = "PM_LSU0_REJECT_LHS", .pme_code = 0xc0ac, .pme_short_desc = "LS0 Reject: Load Hit Store", .pme_long_desc = "Load Store Unit 0 rejected a load instruction that had an address overlap with an older store in the store queue. 
The store must be committed and de-allocated from the Store Queue before the load can execute successfully.", }, [ POWER7_PME_PM_IC_PREF_CANCEL_HIT ] = { .pme_name = "PM_IC_PREF_CANCEL_HIT", .pme_code = 0x4092, .pme_short_desc = "Prefetch Canceled due to icache hit", .pme_long_desc = "Prefetch Canceled due to icache hit", }, [ POWER7_PME_PM_L3_PREF_BUSY ] = { .pme_name = "PM_L3_PREF_BUSY", .pme_code = 0x4f080, .pme_short_desc = "Prefetch machines >= threshold (8,16,20,24)", .pme_long_desc = "Prefetch machines >= threshold (8,16,20,24)", }, [ POWER7_PME_PM_MRK_BRU_FIN ] = { .pme_name = "PM_MRK_BRU_FIN", .pme_code = 0x2003a, .pme_short_desc = "bru marked instr finish", .pme_long_desc = "The branch unit finished a marked instruction. Instructions that finish may not necessarily complete.", }, [ POWER7_PME_PM_LSU1_NCLD ] = { .pme_name = "PM_LSU1_NCLD", .pme_code = 0xc08e, .pme_short_desc = "LS1 Non-cachable Loads counted at finish", .pme_long_desc = "A non-cacheable load was executed by Unit 1.", }, [ POWER7_PME_PM_INST_PTEG_FROM_L31_MOD ] = { .pme_name = "PM_INST_PTEG_FROM_L31_MOD", .pme_code = 0x1e054, .pme_short_desc = "Instruction PTEG loaded from another L3 on same chip modified", .pme_long_desc = "Instruction PTEG loaded from another L3 on same chip modified", }, [ POWER7_PME_PM_LSU_NCLD ] = { .pme_name = "PM_LSU_NCLD", .pme_code = 0xc88c, .pme_short_desc = "Non-cachable Loads counted at finish", .pme_long_desc = "A non-cacheable load was executed. 
Combined Unit 0 + 1.", }, [ POWER7_PME_PM_LSU_LDX ] = { .pme_name = "PM_LSU_LDX", .pme_code = 0xc888, .pme_short_desc = "All Vector loads (vsx vector + vmx vector)", .pme_long_desc = "All Vector loads (vsx vector + vmx vector)", }, [ POWER7_PME_PM_L2_LOC_GUESS_CORRECT ] = { .pme_name = "PM_L2_LOC_GUESS_CORRECT", .pme_code = 0x16480, .pme_short_desc = "L2 guess loc and guess was correct (ie data local)", .pme_long_desc = "L2 guess loc and guess was correct (ie data local)", }, [ POWER7_PME_PM_THRESH_TIMEO ] = { .pme_name = "PM_THRESH_TIMEO", .pme_code = 0x10038, .pme_short_desc = "Threshold timeout event", .pme_long_desc = "The threshold timer expired", }, [ POWER7_PME_PM_L3_PREF_ST ] = { .pme_name = "PM_L3_PREF_ST", .pme_code = 0xd0ae, .pme_short_desc = "L3 cache ST prefetches", .pme_long_desc = "L3 cache ST prefetches", }, [ POWER7_PME_PM_DISP_CLB_HELD_SYNC ] = { .pme_name = "PM_DISP_CLB_HELD_SYNC", .pme_code = 0x2098, .pme_short_desc = "Dispatch/CLB Hold: Sync type instruction", .pme_long_desc = "Dispatch/CLB Hold: Sync type instruction", }, [ POWER7_PME_PM_VSU_SIMPLE_ISSUED ] = { .pme_name = "PM_VSU_SIMPLE_ISSUED", .pme_code = 0xb894, .pme_short_desc = "Simple VMX instruction issued", .pme_long_desc = "Simple VMX instruction issued", }, [ POWER7_PME_PM_VSU1_SINGLE ] = { .pme_name = "PM_VSU1_SINGLE", .pme_code = 0xa0aa, .pme_short_desc = "FPU single precision", .pme_long_desc = "VSU1 executed single precision instruction", }, [ POWER7_PME_PM_DATA_TABLEWALK_CYC ] = { .pme_name = "PM_DATA_TABLEWALK_CYC", .pme_code = 0x3001a, .pme_short_desc = "Data Tablewalk Active", .pme_long_desc = "Cycles a translation tablewalk is active. 
While a tablewalk is active any request attempting to access the TLB will be rejected and retried.", }, [ POWER7_PME_PM_L2_RC_ST_DONE ] = { .pme_name = "PM_L2_RC_ST_DONE", .pme_code = 0x36380, .pme_short_desc = "RC did st to line that was Tx or Sx", .pme_long_desc = "RC did st to line that was Tx or Sx", }, [ POWER7_PME_PM_MRK_PTEG_FROM_L21_MOD ] = { .pme_name = "PM_MRK_PTEG_FROM_L21_MOD", .pme_code = 0x3d056, .pme_short_desc = "Marked PTEG loaded from another L2 on same chip modified", .pme_long_desc = "Marked PTEG loaded from another L2 on same chip modified", }, [ POWER7_PME_PM_LARX_LSU1 ] = { .pme_name = "PM_LARX_LSU1", .pme_code = 0xc096, .pme_short_desc = "ls1 Larx Finished", .pme_long_desc = "A larx (lwarx or ldarx) was executed on side 1 ", }, [ POWER7_PME_PM_MRK_DATA_FROM_RMEM ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM", .pme_code = 0x3d042, .pme_short_desc = "Marked data loaded from remote memory", .pme_long_desc = "The processor's Data Cache was reloaded due to a marked load from memory attached to a different module than this processor is located on.", }, [ POWER7_PME_PM_DISP_CLB_HELD ] = { .pme_name = "PM_DISP_CLB_HELD", .pme_code = 0x2090, .pme_short_desc = "CLB Hold: Any Reason", .pme_long_desc = "CLB Hold: Any Reason", }, [ POWER7_PME_PM_DERAT_MISS_4K ] = { .pme_name = "PM_DERAT_MISS_4K", .pme_code = 0x1c05c, .pme_short_desc = "DERAT misses for 4K page", .pme_long_desc = "A data request (load or store) missed the ERAT for 4K page and resulted in an ERAT reload.", }, [ POWER7_PME_PM_L2_RCLD_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2_RCLD_DISP_FAIL_ADDR", .pme_code = 0x16282, .pme_short_desc = " L2 RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = " L2 RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", }, [ POWER7_PME_PM_SEG_EXCEPTION ] = { .pme_name = "PM_SEG_EXCEPTION", .pme_code = 0x28a4, .pme_short_desc = "ISEG + DSEG Exception", .pme_long_desc = "ISEG + DSEG Exception", }, [ 
POWER7_PME_PM_FLUSH_DISP_SB ] = { .pme_name = "PM_FLUSH_DISP_SB", .pme_code = 0x208c, .pme_short_desc = "Dispatch Flush: Scoreboard", .pme_long_desc = "Dispatch Flush: Scoreboard", }, [ POWER7_PME_PM_L2_DC_INV ] = { .pme_name = "PM_L2_DC_INV", .pme_code = 0x26182, .pme_short_desc = "Dcache invalidates from L2 ", .pme_long_desc = "The L2 invalidated a line in the processor's data cache. This is caused by the L2 line being cast out or invalidated. Total for all slices", }, [ POWER7_PME_PM_PTEG_FROM_DL2L3_MOD ] = { .pme_name = "PM_PTEG_FROM_DL2L3_MOD", .pme_code = 0x4c054, .pme_short_desc = "PTEG loaded from distant L2 or L3 modified", .pme_long_desc = "A Page Table Entry was loaded into the ERAT with modified (M) data from an L2 or L3 on a distant module due to a demand load or store.", }, [ POWER7_PME_PM_DSEG ] = { .pme_name = "PM_DSEG", .pme_code = 0x20a6, .pme_short_desc = "DSEG Exception", .pme_long_desc = "DSEG Exception", }, [ POWER7_PME_PM_BR_PRED_LSTACK ] = { .pme_name = "PM_BR_PRED_LSTACK", .pme_code = 0x40a2, .pme_short_desc = "Link Stack Predictions", .pme_long_desc = "The target address of a Branch to Link instruction was predicted by the link stack.", }, [ POWER7_PME_PM_VSU0_STF ] = { .pme_name = "PM_VSU0_STF", .pme_code = 0xb08c, .pme_short_desc = "FPU store (SP or DP) issued on Pipe0", .pme_long_desc = "FPU store (SP or DP) issued on Pipe0", }, [ POWER7_PME_PM_LSU_FX_FIN ] = { .pme_name = "PM_LSU_FX_FIN", .pme_code = 0x10066, .pme_short_desc = "LSU Finished a FX operation (up to 2 per cycle)", .pme_long_desc = "LSU Finished a FX operation (up to 2 per cycle)", }, [ POWER7_PME_PM_DERAT_MISS_16M ] = { .pme_name = "PM_DERAT_MISS_16M", .pme_code = 0x3c05c, .pme_short_desc = "DERAT misses for 16M page", .pme_long_desc = "A data request (load or store) missed the ERAT for 16M page and resulted in an ERAT reload.", }, [ POWER7_PME_PM_MRK_PTEG_FROM_DL2L3_MOD ] = { .pme_name = "PM_MRK_PTEG_FROM_DL2L3_MOD", .pme_code = 0x4d054, .pme_short_desc = "Marked PTEG loaded 
from distant L2 or L3 modified", .pme_long_desc = "A Page Table Entry was loaded into the ERAT with modified (M) data from an L2 or L3 on a distant module due to a marked load or store.", }, [ POWER7_PME_PM_GCT_UTIL_11_PLUS_SLOTS ] = { .pme_name = "PM_GCT_UTIL_11_PLUS_SLOTS", .pme_code = 0x20a2, .pme_short_desc = "GCT Utilization 11+ entries", .pme_long_desc = "GCT Utilization 11+ entries", }, [ POWER7_PME_PM_INST_FROM_L3 ] = { .pme_name = "PM_INST_FROM_L3", .pme_code = 0x14048, .pme_short_desc = "Instruction fetched from L3", .pme_long_desc = "An instruction fetch group was fetched from L3. Fetch Groups can contain up to 8 instructions", }, [ POWER7_PME_PM_MRK_IFU_FIN ] = { .pme_name = "PM_MRK_IFU_FIN", .pme_code = 0x3003a, .pme_short_desc = "IFU non-branch marked instruction finished", .pme_long_desc = "The Instruction Fetch Unit finished a marked instruction.", }, [ POWER7_PME_PM_ITLB_MISS ] = { .pme_name = "PM_ITLB_MISS", .pme_code = 0x400fc, .pme_short_desc = "ITLB Reloaded (always zero on POWER6)", .pme_long_desc = "A TLB miss for an Instruction Fetch has occurred", }, [ POWER7_PME_PM_VSU_STF ] = { .pme_name = "PM_VSU_STF", .pme_code = 0xb88c, .pme_short_desc = "FPU store (SP or DP) issued on Pipe0", .pme_long_desc = "FPU store (SP or DP) issued on Pipe0", }, [ POWER7_PME_PM_LSU_FLUSH_UST ] = { .pme_name = "PM_LSU_FLUSH_UST", .pme_code = 0xc8b4, .pme_short_desc = "Flush: Unaligned Store", .pme_long_desc = "A store was flushed because it was unaligned (crossed a 4K boundary). Combined Unit 0 + 1.", }, [ POWER7_PME_PM_L2_LDST_MISS ] = { .pme_name = "PM_L2_LDST_MISS", .pme_code = 0x26880, .pme_short_desc = "Data Load+Store Miss", .pme_long_desc = "Data Load+Store Miss", }, [ POWER7_PME_PM_FXU1_FIN ] = { .pme_name = "PM_FXU1_FIN", .pme_code = 0x40004, .pme_short_desc = "FXU1 Finished", .pme_long_desc = "The Fixed Point unit 1 finished an instruction and produced a result. 
Instructions that finish may not necessarily complete.", }, [ POWER7_PME_PM_SHL_DEALLOCATED ] = { .pme_name = "PM_SHL_DEALLOCATED", .pme_code = 0x5080, .pme_short_desc = "SHL Table entry deallocated", .pme_long_desc = "SHL Table entry deallocated", }, [ POWER7_PME_PM_L2_SN_M_WR_DONE ] = { .pme_name = "PM_L2_SN_M_WR_DONE", .pme_code = 0x46382, .pme_short_desc = "SNP dispatched for a write and was M", .pme_long_desc = "SNP dispatched for a write and was M", }, [ POWER7_PME_PM_LSU_REJECT_SET_MPRED ] = { .pme_name = "PM_LSU_REJECT_SET_MPRED", .pme_code = 0xc8a8, .pme_short_desc = "Reject: Set Predict Wrong", .pme_long_desc = "The Load Store Unit rejected an instruction because the cache set was improperly predicted. This is a fast reject and will be immediately redispatched. Combined Unit 0 + 1", }, [ POWER7_PME_PM_L3_PREF_LD ] = { .pme_name = "PM_L3_PREF_LD", .pme_code = 0xd0ac, .pme_short_desc = "L3 cache LD prefetches", .pme_long_desc = "L3 cache LD prefetches", }, [ POWER7_PME_PM_L2_SN_M_RD_DONE ] = { .pme_name = "PM_L2_SN_M_RD_DONE", .pme_code = 0x46380, .pme_short_desc = "SNP dispatched for a read and was M", .pme_long_desc = "SNP dispatched for a read and was M", }, [ POWER7_PME_PM_MRK_DERAT_MISS_16G ] = { .pme_name = "PM_MRK_DERAT_MISS_16G", .pme_code = 0x4d05c, .pme_short_desc = "Marked DERAT misses for 16G page", .pme_long_desc = "A marked data request (load or store) missed the ERAT for 16G page and resulted in an ERAT reload.", }, [ POWER7_PME_PM_VSU_FCONV ] = { .pme_name = "PM_VSU_FCONV", .pme_code = 0xa8b0, .pme_short_desc = "Convert instruction executed", .pme_long_desc = "Convert instruction executed", }, [ POWER7_PME_PM_ANY_THRD_RUN_CYC ] = { .pme_name = "PM_ANY_THRD_RUN_CYC", .pme_code = 0x100fa, .pme_short_desc = "One of threads in run_cycles ", .pme_long_desc = "One of threads in run_cycles ", }, [ POWER7_PME_PM_LSU_LMQ_FULL_CYC ] = { .pme_name = "PM_LSU_LMQ_FULL_CYC", .pme_code = 0xd0a4, .pme_short_desc = "LMQ full", .pme_long_desc = "The Load Miss 
Queue was full.", }, [ POWER7_PME_PM_MRK_LSU_REJECT_LHS ] = { .pme_name = "PM_MRK_LSU_REJECT_LHS", .pme_code = 0xd082, .pme_short_desc = " Reject(marked): Load Hit Store", .pme_long_desc = "The Load Store Unit rejected a marked load instruction that had an address overlap with an older store in the store queue. The store must be committed and de-allocated from the Store Queue before the load can execute successfully", }, [ POWER7_PME_PM_MRK_LD_MISS_L1_CYC ] = { .pme_name = "PM_MRK_LD_MISS_L1_CYC", .pme_code = 0x4003e, .pme_short_desc = "L1 data load miss cycles", .pme_long_desc = "L1 data load miss cycles", }, [ POWER7_PME_PM_MRK_DATA_FROM_L2_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2_CYC", .pme_code = 0x20020, .pme_short_desc = "Marked ld latency Data source 0000 (L2 hit)", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER7_PME_PM_INST_IMC_MATCH_DISP ] = { .pme_name = "PM_INST_IMC_MATCH_DISP", .pme_code = 0x30016, .pme_short_desc = "IMC Matches dispatched", .pme_long_desc = "IMC Matches dispatched", }, [ POWER7_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM_CYC", .pme_code = 0x4002c, .pme_short_desc = "Marked ld latency Data source 1101 (Memory same 4 chip node)", .pme_long_desc = "Cycles a marked load waited for data from this level of the storage system. Counting begins when a marked load misses the data cache and ends when the data is reloaded into the data cache. 
To calculate average latency divide this count by the number of marked misses to the same level.", }, [ POWER7_PME_PM_VSU0_SIMPLE_ISSUED ] = { .pme_name = "PM_VSU0_SIMPLE_ISSUED", .pme_code = 0xb094, .pme_short_desc = "Simple VMX instruction issued", .pme_long_desc = "Simple VMX instruction issued", }, [ POWER7_PME_PM_CMPLU_STALL_DIV ] = { .pme_name = "PM_CMPLU_STALL_DIV", .pme_code = 0x40014, .pme_short_desc = "Completion stall caused by DIV instruction", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes was a fixed point divide instruction. This is a subset of PM_CMPLU_STALL_FXU.", }, [ POWER7_PME_PM_MRK_PTEG_FROM_RL2L3_SHR ] = { .pme_name = "PM_MRK_PTEG_FROM_RL2L3_SHR", .pme_code = 0x2d054, .pme_short_desc = "Marked PTEG loaded from remote L2 or L3 shared", .pme_long_desc = "A Page Table Entry was loaded into the ERAT from memory attached to a different module than this processor is located on due to a marked load or store.", }, [ POWER7_PME_PM_VSU_FMA_DOUBLE ] = { .pme_name = "PM_VSU_FMA_DOUBLE", .pme_code = 0xa890, .pme_short_desc = "DP vector version of fmadd,fnmadd,fmsub,fnmsub", .pme_long_desc = "DP vector version of fmadd,fnmadd,fmsub,fnmsub", }, [ POWER7_PME_PM_VSU_4FLOP ] = { .pme_name = "PM_VSU_4FLOP", .pme_code = 0xa89c, .pme_short_desc = "four flops operation (scalar fdiv, fsqrt; DP vector version of fmadd, fnmadd, fmsub, fnmsub; SP vector versions of single flop instructions)", .pme_long_desc = "four flops operation (scalar fdiv, fsqrt; DP vector version of fmadd, fnmadd, fmsub, fnmsub; SP vector versions of single flop instructions)", }, [ POWER7_PME_PM_VSU1_FIN ] = { .pme_name = "PM_VSU1_FIN", .pme_code = 0xa0be, .pme_short_desc = "VSU1 Finished an instruction", .pme_long_desc = "VSU1 Finished an instruction", }, [ POWER7_PME_PM_NEST_PAIR1_AND ] = { .pme_name = "PM_NEST_PAIR1_AND", .pme_code = 0x20883, .pme_short_desc = " Nest events (MC0/MC1/PB/GX), Pair1 
AND", .pme_long_desc = " Nest events (MC0/MC1/PB/GX), Pair1 AND", }, [ POWER7_PME_PM_INST_PTEG_FROM_RL2L3_MOD ] = { .pme_name = "PM_INST_PTEG_FROM_RL2L3_MOD", .pme_code = 0x1e052, .pme_short_desc = "Instruction PTEG loaded from remote L2 or L3 modified", .pme_long_desc = "Instruction PTEG loaded from remote L2 or L3 modified", }, [ POWER7_PME_PM_RUN_CYC ] = { .pme_name = "PM_RUN_CYC", .pme_code = 0x200f4, .pme_short_desc = "Run_cycles", .pme_long_desc = "Processor Cycles gated by the run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. Gating by the run latch filters out the idle loop.", }, [ POWER7_PME_PM_PTEG_FROM_RMEM ] = { .pme_name = "PM_PTEG_FROM_RMEM", .pme_code = 0x3c052, .pme_short_desc = "PTEG loaded from remote memory", .pme_long_desc = "A Page Table Entry was loaded into the TLB from memory attached to a different module than this processor is located on.", }, [ POWER7_PME_PM_LSU_LRQ_S0_VALID ] = { .pme_name = "PM_LSU_LRQ_S0_VALID", .pme_code = 0xd09e, .pme_short_desc = "Slot 0 of LRQ valid", .pme_long_desc = "This signal is asserted every cycle that the Load Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. In SMT mode the LRQ is split between the two threads (16 entries each).", }, [ POWER7_PME_PM_LSU0_LDF ] = { .pme_name = "PM_LSU0_LDF", .pme_code = 0xc084, .pme_short_desc = "LS0 Scalar Loads", .pme_long_desc = "A floating point load was executed by LSU0", }, [ POWER7_PME_PM_FLUSH_COMPLETION ] = { .pme_name = "PM_FLUSH_COMPLETION", .pme_code = 0x30012, .pme_short_desc = "Completion Flush", .pme_long_desc = "Completion Flush", }, [ POWER7_PME_PM_ST_MISS_L1 ] = { .pme_name = "PM_ST_MISS_L1", .pme_code = 0x300f0, .pme_short_desc = "L1 D cache store misses", .pme_long_desc = "A store missed the dcache. 
Combined Unit 0 + 1.", }, [ POWER7_PME_PM_L2_NODE_PUMP ] = { .pme_name = "PM_L2_NODE_PUMP", .pme_code = 0x36480, .pme_short_desc = "RC req that was a local (aka node) pump attempt", .pme_long_desc = "RC req that was a local (aka node) pump attempt", }, [ POWER7_PME_PM_INST_FROM_DL2L3_SHR ] = { .pme_name = "PM_INST_FROM_DL2L3_SHR", .pme_code = 0x34044, .pme_short_desc = "Instruction fetched from distant L2 or L3 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (S) data from the L2 or L3 on a distant module. Fetch groups can contain up to 8 instructions", }, [ POWER7_PME_PM_MRK_STALL_CMPLU_CYC ] = { .pme_name = "PM_MRK_STALL_CMPLU_CYC", .pme_code = 0x3003e, .pme_short_desc = "Marked Group Completion Stall cycles ", .pme_long_desc = "Marked Group Completion Stall cycles ", }, [ POWER7_PME_PM_VSU1_DENORM ] = { .pme_name = "PM_VSU1_DENORM", .pme_code = 0xa0ae, .pme_short_desc = "FPU denorm operand", .pme_long_desc = "VSU1 received denormalized data", }, [ POWER7_PME_PM_MRK_DATA_FROM_L31_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L31_SHR_CYC", .pme_code = 0x20026, .pme_short_desc = "Marked ld latency Data source 0110 (L3.1 S) ", .pme_long_desc = "Marked load latency Data source 0110 (L3.1 S) ", }, [ POWER7_PME_PM_NEST_PAIR0_ADD ] = { .pme_name = "PM_NEST_PAIR0_ADD", .pme_code = 0x10881, .pme_short_desc = " Nest events (MC0/MC1/PB/GX), Pair0 ADD", .pme_long_desc = " Nest events (MC0/MC1/PB/GX), Pair0 ADD", }, [ POWER7_PME_PM_INST_FROM_L3MISS ] = { .pme_name = "PM_INST_FROM_L3MISS", .pme_code = 0x24048, .pme_short_desc = "Instruction fetched missed L3", .pme_long_desc = "An instruction fetch group was fetched from beyond L3. 
Fetch groups can contain up to 8 instructions.", }, [ POWER7_PME_PM_EE_OFF_EXT_INT ] = { .pme_name = "PM_EE_OFF_EXT_INT", .pme_code = 0x2080, .pme_short_desc = "ee off and external interrupt", .pme_long_desc = "Cycles when an interrupt due to an external exception is pending but external exceptions were masked.", }, [ POWER7_PME_PM_INST_PTEG_FROM_DMEM ] = { .pme_name = "PM_INST_PTEG_FROM_DMEM", .pme_code = 0x2e052, .pme_short_desc = "Instruction PTEG loaded from distant memory", .pme_long_desc = "Instruction PTEG loaded from distant memory", }, [ POWER7_PME_PM_INST_FROM_DL2L3_MOD ] = { .pme_name = "PM_INST_FROM_DL2L3_MOD", .pme_code = 0x3404c, .pme_short_desc = "Instruction fetched from distant L2 or L3 modified", .pme_long_desc = "An instruction fetch group was fetched with modified (M) data from an L2 or L3 on a distant module. Fetch groups can contain up to 8 instructions", }, [ POWER7_PME_PM_PMC6_OVERFLOW ] = { .pme_name = "PM_PMC6_OVERFLOW", .pme_code = 0x30024, .pme_short_desc = "Overflow from counter 6", .pme_long_desc = "Overflows from PMC6 are counted. This effectively widens the PMC. 
The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", }, [ POWER7_PME_PM_VSU_2FLOP_DOUBLE ] = { .pme_name = "PM_VSU_2FLOP_DOUBLE", .pme_code = 0xa88c, .pme_short_desc = "DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres ,fsqrte, fneg", .pme_long_desc = "DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres ,fsqrte, fneg", }, [ POWER7_PME_PM_TLB_MISS ] = { .pme_name = "PM_TLB_MISS", .pme_code = 0x20066, .pme_short_desc = "TLB Miss (I + D)", .pme_long_desc = "Total of Data TLB misses + Instruction TLB misses", }, [ POWER7_PME_PM_FXU_BUSY ] = { .pme_name = "PM_FXU_BUSY", .pme_code = 0x2000e, .pme_short_desc = "fxu0 busy and fxu1 busy.", .pme_long_desc = "Cycles when both FXU0 and FXU1 are busy.", }, [ POWER7_PME_PM_L2_RCLD_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2_RCLD_DISP_FAIL_OTHER", .pme_code = 0x26280, .pme_short_desc = " L2 RC load dispatch attempt failed due to other reasons", .pme_long_desc = " L2 RC load dispatch attempt failed due to other reasons", }, [ POWER7_PME_PM_LSU_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU_REJECT_LMQ_FULL", .pme_code = 0xc8a4, .pme_short_desc = "Reject: LMQ Full (LHR)", .pme_long_desc = "Total cycles the Load Store Unit is busy rejecting instructions because the Load Miss Queue was full. The LMQ has eight entries. If all the eight entries are full, subsequent load instructions are rejected. Combined unit 0 + 1.", }, [ POWER7_PME_PM_IC_RELOAD_SHR ] = { .pme_name = "PM_IC_RELOAD_SHR", .pme_code = 0x4096, .pme_short_desc = "Reloading line to be shared between the threads", .pme_long_desc = "An Instruction Cache request was made by this thread and the cache line was already in the cache for the other thread. The line is marked valid for all threads.", }, [ POWER7_PME_PM_GRP_MRK ] = { .pme_name = "PM_GRP_MRK", .pme_code = 0x10031, .pme_short_desc = "IDU Marked Instruction", .pme_long_desc = "A group was sampled (marked). 
The group is called a marked group. One instruction within the group is tagged for detailed monitoring. The sampled instruction is called a marked instruction. Events associated with the marked instruction are annotated with the marked term.", }, [ POWER7_PME_PM_MRK_ST_NEST ] = { .pme_name = "PM_MRK_ST_NEST", .pme_code = 0x20034, .pme_short_desc = "marked store sent to Nest", .pme_long_desc = "A sampled store has been sent to the memory subsystem", }, [ POWER7_PME_PM_VSU1_FSQRT_FDIV ] = { .pme_name = "PM_VSU1_FSQRT_FDIV", .pme_code = 0xa08a, .pme_short_desc = "four flops operation (fdiv,fsqrt,xsdiv,xssqrt) Scalar Instructions only!", .pme_long_desc = "four flops operation (fdiv,fsqrt,xsdiv,xssqrt) Scalar Instructions only!", }, [ POWER7_PME_PM_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_LSU0_FLUSH_LRQ", .pme_code = 0xc0b8, .pme_short_desc = "LS0 Flush: LRQ", .pme_long_desc = "Load Hit Load or Store Hit Load flush. A younger load was flushed from unit 0 because it executed before an older store and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER7_PME_PM_LARX_LSU0 ] = { .pme_name = "PM_LARX_LSU0", .pme_code = 0xc094, .pme_short_desc = "ls0 Larx Finished", .pme_long_desc = "A larx (lwarx or ldarx) was executed on side 0 ", }, [ POWER7_PME_PM_IBUF_FULL_CYC ] = { .pme_name = "PM_IBUF_FULL_CYC", .pme_code = 0x4084, .pme_short_desc = "Cycles No room in ibuff", .pme_long_desc = "Cycles when the Instruction Buffer was full. 
The Instruction Buffer is a circular queue of 64 instructions per thread, organized as 16 groups of 4 instructions.", }, [ POWER7_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_SHR_CYC", .pme_code = 0x2002a, .pme_short_desc = "Marked ld latency Data Source 1010 (Distant L2.75/L3.75 S)", .pme_long_desc = "Marked ld latency Data Source 1010 (Distant L2.75/L3.75 S)", }, [ POWER7_PME_PM_LSU_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_LSU_DC_PREF_STREAM_ALLOC", .pme_code = 0xd8a8, .pme_short_desc = "D cache new prefetch stream allocated", .pme_long_desc = "D cache new prefetch stream allocated", }, [ POWER7_PME_PM_GRP_MRK_CYC ] = { .pme_name = "PM_GRP_MRK_CYC", .pme_code = 0x10030, .pme_short_desc = "cycles IDU marked instruction before dispatch", .pme_long_desc = "cycles IDU marked instruction before dispatch", }, [ POWER7_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_SHR_CYC", .pme_code = 0x20028, .pme_short_desc = "Marked ld latency Data Source 1000 (Remote L2.5/L3.5 S)", .pme_long_desc = "Marked load latency Data Source 1000 (Remote L2.5/L3.5 S)", }, [ POWER7_PME_PM_L2_GLOB_GUESS_CORRECT ] = { .pme_name = "PM_L2_GLOB_GUESS_CORRECT", .pme_code = 0x16482, .pme_short_desc = "L2 guess glb and guess was correct (ie data remote)", .pme_long_desc = "L2 guess glb and guess was correct (ie data remote)", }, [ POWER7_PME_PM_LSU_REJECT_LHS ] = { .pme_name = "PM_LSU_REJECT_LHS", .pme_code = 0xc8ac, .pme_short_desc = "Reject: Load Hit Store", .pme_long_desc = "The Load Store Unit rejected a load instruction that had an address overlap with an older store in the store queue. The store must be committed and de-allocated from the Store Queue before the load can execute successfully. 
Combined Unit 0 + 1", }, [ POWER7_PME_PM_MRK_DATA_FROM_LMEM ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM", .pme_code = 0x3d04a, .pme_short_desc = "Marked data loaded from local memory", .pme_long_desc = "The processor's Data Cache was reloaded due to a marked load from memory attached to the same module this processor is located on.", }, [ POWER7_PME_PM_INST_PTEG_FROM_L3 ] = { .pme_name = "PM_INST_PTEG_FROM_L3", .pme_code = 0x2e050, .pme_short_desc = "Instruction PTEG loaded from L3", .pme_long_desc = "Instruction PTEG loaded from L3", }, [ POWER7_PME_PM_FREQ_DOWN ] = { .pme_name = "PM_FREQ_DOWN", .pme_code = 0x3000c, .pme_short_desc = "Frequency is being slewed down due to Power Management", .pme_long_desc = "Processor frequency was slowed down due to power management", }, [ POWER7_PME_PM_PB_RETRY_NODE_PUMP ] = { .pme_name = "PM_PB_RETRY_NODE_PUMP", .pme_code = 0x30081, .pme_short_desc = " Nest events (MC0/MC1/PB/GX), Pair2 Bit0", .pme_long_desc = " Nest events (MC0/MC1/PB/GX), Pair2 Bit0", }, [ POWER7_PME_PM_INST_FROM_RL2L3_SHR ] = { .pme_name = "PM_INST_FROM_RL2L3_SHR", .pme_code = 0x1404c, .pme_short_desc = "Instruction fetched from remote L2 or L3 shared", .pme_long_desc = "An instruction fetch group was fetched with shared (S) data from the L2 or L3 on a remote module. 
Fetch groups can contain up to 8 instructions", }, [ POWER7_PME_PM_MRK_INST_ISSUED ] = { .pme_name = "PM_MRK_INST_ISSUED", .pme_code = 0x10032, .pme_short_desc = "Marked instruction issued", .pme_long_desc = "A marked instruction was issued to an execution unit.", }, [ POWER7_PME_PM_PTEG_FROM_L3MISS ] = { .pme_name = "PM_PTEG_FROM_L3MISS", .pme_code = 0x2c058, .pme_short_desc = "PTEG loaded from L3 miss", .pme_long_desc = " Page Table Entry was loaded into the ERAT from beyond the L3 due to a demand load or store.", }, [ POWER7_PME_PM_RUN_PURR ] = { .pme_name = "PM_RUN_PURR", .pme_code = 0x400f4, .pme_short_desc = "Run_PURR", .pme_long_desc = "The Processor Utilization of Resources Register was incremented while the run latch was set. The PURR registers will be incremented roughly in the ratio in which the instructions are dispatched from the two threads. ", }, [ POWER7_PME_PM_MRK_GRP_IC_MISS ] = { .pme_name = "PM_MRK_GRP_IC_MISS", .pme_code = 0x40038, .pme_short_desc = "Marked group experienced I cache miss", .pme_long_desc = "A group containing a marked (sampled) instruction experienced an instruction cache miss.", }, [ POWER7_PME_PM_MRK_DATA_FROM_L3 ] = { .pme_name = "PM_MRK_DATA_FROM_L3", .pme_code = 0x1d048, .pme_short_desc = "Marked data loaded from L3", .pme_long_desc = "The processor's Data Cache was reloaded from the local L3 due to a marked load.", }, [ POWER7_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { .pme_name = "PM_CMPLU_STALL_DCACHE_MISS", .pme_code = 0x20016, .pme_short_desc = "Completion stall caused by D cache miss", .pme_long_desc = "Following a completion stall (any period when no groups completed) the last instruction to finish before completion resumes suffered a Data Cache Miss. Data Cache Miss has higher priority than any other Load/Store delay, so if an instruction encounters multiple delays only the Data Cache Miss will be reported and the entire delay period will be charged to Data Cache Miss. 
This is a subset of PM_CMPLU_STALL_LSU.", }, [ POWER7_PME_PM_PTEG_FROM_RL2L3_SHR ] = { .pme_name = "PM_PTEG_FROM_RL2L3_SHR", .pme_code = 0x2c054, .pme_short_desc = "PTEG loaded from remote L2 or L3 shared", .pme_long_desc = "A Page Table Entry was loaded into the ERAT with shared (T or SL) data from an L2 or L3 on a remote module due to a demand load or store.", }, [ POWER7_PME_PM_LSU_FLUSH_LRQ ] = { .pme_name = "PM_LSU_FLUSH_LRQ", .pme_code = 0xc8b8, .pme_short_desc = "Flush: LRQ", .pme_long_desc = "Load Hit Load or Store Hit Load flush. A younger load was flushed because it executed before an older store and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte. Combined Unit 0 + 1.", }, [ POWER7_PME_PM_MRK_DERAT_MISS_64K ] = { .pme_name = "PM_MRK_DERAT_MISS_64K", .pme_code = 0x2d05c, .pme_short_desc = "Marked DERAT misses for 64K page", .pme_long_desc = "A marked data request (load or store) missed the ERAT for 64K page and resulted in an ERAT reload.", }, [ POWER7_PME_PM_INST_PTEG_FROM_DL2L3_MOD ] = { .pme_name = "PM_INST_PTEG_FROM_DL2L3_MOD", .pme_code = 0x4e054, .pme_short_desc = "Instruction PTEG loaded from distant L2 or L3 modified", .pme_long_desc = "Instruction PTEG loaded from distant L2 or L3 modified", }, [ POWER7_PME_PM_L2_ST_MISS ] = { .pme_name = "PM_L2_ST_MISS", .pme_code = 0x26082, .pme_short_desc = "Data Store Miss", .pme_long_desc = "Data Store Miss", }, [ POWER7_PME_PM_LWSYNC ] = { .pme_name = "PM_LWSYNC", .pme_code = 0xd094, .pme_short_desc = "lwsync count (easier to use than IMC)", .pme_long_desc = "lwsync count (easier to use than IMC)", }, [ POWER7_PME_PM_LSU0_DC_PREF_STREAM_CONFIRM_STRIDE ] = { .pme_name = "PM_LSU0_DC_PREF_STREAM_CONFIRM_STRIDE", .pme_code = 0xd0bc, .pme_short_desc = "LS0 Dcache Strided prefetch stream confirmed", .pme_long_desc = "LS0 Dcache Strided prefetch stream confirmed", }, [ POWER7_PME_PM_MRK_PTEG_FROM_L21_SHR ] = { .pme_name 
= "PM_MRK_PTEG_FROM_L21_SHR", .pme_code = 0x4d056, .pme_short_desc = "Marked PTEG loaded from another L2 on same chip shared", .pme_long_desc = "Marked PTEG loaded from another L2 on same chip shared", }, [ POWER7_PME_PM_MRK_LSU_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_LRQ", .pme_code = 0xd088, .pme_short_desc = "Flush: (marked) LRQ", .pme_long_desc = "Load Hit Load or Store Hit Load flush. A marked load was flushed because it executed before an older store and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ POWER7_PME_PM_INST_IMC_MATCH_CMPL ] = { .pme_name = "PM_INST_IMC_MATCH_CMPL", .pme_code = 0x100f0, .pme_short_desc = "IMC Match Count", .pme_long_desc = "Number of instructions resulting from the marked instructions expansion that completed.", }, [ POWER7_PME_PM_NEST_PAIR3_AND ] = { .pme_name = "PM_NEST_PAIR3_AND", .pme_code = 0x40883, .pme_short_desc = " Nest events (MC0/MC1/PB/GX), Pair3 AND", .pme_long_desc = " Nest events (MC0/MC1/PB/GX), Pair3 AND", }, [ POWER7_PME_PM_PB_RETRY_SYS_PUMP ] = { .pme_name = "PM_PB_RETRY_SYS_PUMP", .pme_code = 0x40081, .pme_short_desc = " Nest events (MC0/MC1/PB/GX), Pair3 Bit0", .pme_long_desc = " Nest events (MC0/MC1/PB/GX), Pair3 Bit0", }, [ POWER7_PME_PM_MRK_INST_FIN ] = { .pme_name = "PM_MRK_INST_FIN", .pme_code = 0x30030, .pme_short_desc = "marked instr finish any unit ", .pme_long_desc = "One of the execution units finished a marked instruction. 
Instructions that finish may not necessarily complete", }, [ POWER7_PME_PM_MRK_PTEG_FROM_DL2L3_SHR ] = { .pme_name = "PM_MRK_PTEG_FROM_DL2L3_SHR", .pme_code = 0x3d054, .pme_short_desc = "Marked PTEG loaded from remote L2 or L3 shared", .pme_long_desc = "A Page Table Entry was loaded into the ERAT from memory attached to a different module than this processor is located on due to a marked load or store.", }, [ POWER7_PME_PM_INST_FROM_L31_MOD ] = { .pme_name = "PM_INST_FROM_L31_MOD", .pme_code = 0x14044, .pme_short_desc = "Instruction fetched from another L3 on same chip modified", .pme_long_desc = "Instruction fetched from another L3 on same chip modified", }, [ POWER7_PME_PM_MRK_DTLB_MISS_64K ] = { .pme_name = "PM_MRK_DTLB_MISS_64K", .pme_code = 0x3d05e, .pme_short_desc = "Marked Data TLB misses for 64K page", .pme_long_desc = "Data TLB references to 64KB pages by a marked instruction that missed the TLB. Page size is determined at TLB reload time.", }, [ POWER7_PME_PM_LSU_FIN ] = { .pme_name = "PM_LSU_FIN", .pme_code = 0x30066, .pme_short_desc = "LSU Finished an instruction (up to 2 per cycle)", .pme_long_desc = "LSU Finished an instruction (up to 2 per cycle)", }, [ POWER7_PME_PM_MRK_LSU_REJECT ] = { .pme_name = "PM_MRK_LSU_REJECT", .pme_code = 0x40064, .pme_short_desc = "LSU marked reject (up to 2 per cycle)", .pme_long_desc = "LSU marked reject (up to 2 per cycle)", }, [ POWER7_PME_PM_L2_CO_FAIL_BUSY ] = { .pme_name = "PM_L2_CO_FAIL_BUSY", .pme_code = 0x16382, .pme_short_desc = " L2 RC Cast Out dispatch attempt failed due to all CO machines busy", .pme_long_desc = " L2 RC Cast Out dispatch attempt failed due to all CO machines busy", }, [ POWER7_PME_PM_MEM0_WQ_DISP ] = { .pme_name = "PM_MEM0_WQ_DISP", .pme_code = 0x40083, .pme_short_desc = " Nest events (MC0/MC1/PB/GX), Pair3 Bit1", .pme_long_desc = " Nest events (MC0/MC1/PB/GX), Pair3 Bit1", }, [ POWER7_PME_PM_DATA_FROM_L31_MOD ] = { .pme_name = "PM_DATA_FROM_L31_MOD", .pme_code = 0x1c044, .pme_short_desc = 
"Data loaded from another L3 on same chip modified", .pme_long_desc = "Data loaded from another L3 on same chip modified", }, [ POWER7_PME_PM_THERMAL_WARN ] = { .pme_name = "PM_THERMAL_WARN", .pme_code = 0x10016, .pme_short_desc = "Processor in Thermal Warning", .pme_long_desc = "Processor in Thermal Warning", }, [ POWER7_PME_PM_VSU0_4FLOP ] = { .pme_name = "PM_VSU0_4FLOP", .pme_code = 0xa09c, .pme_short_desc = "four flops operation (scalar fdiv, fsqrt; DP vector version of fmadd, fnmadd, fmsub, fnmsub; SP vector versions of single flop instructions)", .pme_long_desc = "four flops operation (scalar fdiv, fsqrt; DP vector version of fmadd, fnmadd, fmsub, fnmsub; SP vector versions of single flop instructions)", }, [ POWER7_PME_PM_BR_MPRED_CCACHE ] = { .pme_name = "PM_BR_MPRED_CCACHE", .pme_code = 0x40a4, .pme_short_desc = "Branch Mispredict due to Count Cache prediction", .pme_long_desc = "A branch instruction target was incorrectly predicted by the count cache. This will result in a branch redirect flush if not overridden by a flush of an older instruction.", }, [ POWER7_PME_PM_CMPLU_STALL_IFU ] = { .pme_name = "PM_CMPLU_STALL_IFU", .pme_code = 0x4004c, .pme_short_desc = "Completion stall due to IFU ", .pme_long_desc = "Completion stall due to IFU ", }, [ POWER7_PME_PM_L1_DEMAND_WRITE ] = { .pme_name = "PM_L1_DEMAND_WRITE", .pme_code = 0x408c, .pme_short_desc = "Instruction Demand sectors written into IL1", .pme_long_desc = "Instruction Demand sectors written into IL1", }, [ POWER7_PME_PM_FLUSH_BR_MPRED ] = { .pme_name = "PM_FLUSH_BR_MPRED", .pme_code = 0x2084, .pme_short_desc = "Flush caused by branch mispredict", .pme_long_desc = "A flush was caused by a branch mispredict.", }, [ POWER7_PME_PM_MRK_DTLB_MISS_16G ] = { .pme_name = "PM_MRK_DTLB_MISS_16G", .pme_code = 0x1d05e, .pme_short_desc = "Marked Data TLB misses for 16G page", .pme_long_desc = "Data TLB references to 16GB pages by a marked instruction that missed the TLB. 
Page size is determined at TLB reload time.", }, [ POWER7_PME_PM_MRK_PTEG_FROM_DMEM ] = { .pme_name = "PM_MRK_PTEG_FROM_DMEM", .pme_code = 0x2d052, .pme_short_desc = "Marked PTEG loaded from distant memory", .pme_long_desc = "A Page Table Entry was loaded into the ERAT from memory attached to a different module than this processor is located on due to a marked load or store.", }, [ POWER7_PME_PM_L2_RCST_DISP ] = { .pme_name = "PM_L2_RCST_DISP", .pme_code = 0x36280, .pme_short_desc = " L2 RC store dispatch attempt", .pme_long_desc = " L2 RC store dispatch attempt", }, [ POWER7_PME_PM_CMPLU_STALL ] = { .pme_name = "PM_CMPLU_STALL", .pme_code = 0x4000a, .pme_short_desc = "No groups completed, GCT not empty", .pme_long_desc = "No groups completed, GCT not empty", }, [ POWER7_PME_PM_LSU_PARTIAL_CDF ] = { .pme_name = "PM_LSU_PARTIAL_CDF", .pme_code = 0xc0aa, .pme_short_desc = "A partial cacheline was returned from the L3", .pme_long_desc = "A partial cacheline was returned from the L3", }, [ POWER7_PME_PM_DISP_CLB_HELD_SB ] = { .pme_name = "PM_DISP_CLB_HELD_SB", .pme_code = 0x20a8, .pme_short_desc = "Dispatch/CLB Hold: Scoreboard", .pme_long_desc = "Dispatch/CLB Hold: Scoreboard", }, [ POWER7_PME_PM_VSU0_FMA_DOUBLE ] = { .pme_name = "PM_VSU0_FMA_DOUBLE", .pme_code = 0xa090, .pme_short_desc = "four flop DP vector operations (xvmadddp, xvnmadddp, xvmsubdp, xvnmsubdp)", .pme_long_desc = "four flop DP vector operations (xvmadddp, xvnmadddp, xvmsubdp, xvnmsubdp)", }, [ POWER7_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { .pme_name = "PM_FXU0_BUSY_FXU1_IDLE", .pme_code = 0x3000e, .pme_short_desc = "fxu0 busy and fxu1 idle", .pme_long_desc = "FXU0 is busy while FXU1 was idle", }, [ POWER7_PME_PM_IC_DEMAND_CYC ] = { .pme_name = "PM_IC_DEMAND_CYC", .pme_code = 0x10018, .pme_short_desc = "Cycles when a demand ifetch was pending", .pme_long_desc = "Cycles when a demand ifetch was pending", }, [ POWER7_PME_PM_MRK_DATA_FROM_L21_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L21_SHR", .pme_code = 0x3d04e, 
.pme_short_desc = "Marked data loaded from another L2 on same chip shared", .pme_long_desc = "Marked data loaded from another L2 on same chip shared", }, [ POWER7_PME_PM_MRK_LSU_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU_FLUSH_UST", .pme_code = 0xd086, .pme_short_desc = "Flush: (marked) Unaligned Store", .pme_long_desc = "A marked store was flushed because it was unaligned", }, [ POWER7_PME_PM_INST_PTEG_FROM_L3MISS ] = { .pme_name = "PM_INST_PTEG_FROM_L3MISS", .pme_code = 0x2e058, .pme_short_desc = "Instruction PTEG loaded from L3 miss", .pme_long_desc = "Instruction PTEG loaded from L3 miss", }, [ POWER7_PME_PM_VSU_DENORM ] = { .pme_name = "PM_VSU_DENORM", .pme_code = 0xa8ac, .pme_short_desc = "Vector or Scalar denorm operand", .pme_long_desc = "Vector or Scalar denorm operand", }, [ POWER7_PME_PM_MRK_LSU_PARTIAL_CDF ] = { .pme_name = "PM_MRK_LSU_PARTIAL_CDF", .pme_code = 0xd080, .pme_short_desc = "A partial cacheline was returned from the L3 for a marked load", .pme_long_desc = "A partial cacheline was returned from the L3 for a marked load", }, [ POWER7_PME_PM_INST_FROM_L21_SHR ] = { .pme_name = "PM_INST_FROM_L21_SHR", .pme_code = 0x3404e, .pme_short_desc = "Instruction fetched from another L2 on same chip shared", .pme_long_desc = "Instruction fetched from another L2 on same chip shared", }, [ POWER7_PME_PM_IC_PREF_WRITE ] = { .pme_name = "PM_IC_PREF_WRITE", .pme_code = 0x408e, .pme_short_desc = "Instruction prefetch written into IL1", .pme_long_desc = "Number of Instruction Cache entries written because of prefetch. Prefetch entries are marked least recently used and are candidates for eviction if they are not needed to satisfy a demand fetch.", }, [ POWER7_PME_PM_BR_PRED ] = { .pme_name = "PM_BR_PRED", .pme_code = 0x409c, .pme_short_desc = "Branch Predictions made", .pme_long_desc = "A branch prediction was made. 
This could have been a target prediction, a condition prediction, or both", }, [ POWER7_PME_PM_INST_FROM_DMEM ] = { .pme_name = "PM_INST_FROM_DMEM", .pme_code = 0x1404a, .pme_short_desc = "Instruction fetched from distant memory", .pme_long_desc = "An instruction fetch group was fetched from memory attached to a distant module. Fetch groups can contain up to 8 instructions", }, [ POWER7_PME_PM_IC_PREF_CANCEL_ALL ] = { .pme_name = "PM_IC_PREF_CANCEL_ALL", .pme_code = 0x4890, .pme_short_desc = "Prefetch Canceled due to page boundary or icache hit", .pme_long_desc = "Prefetch Canceled due to page boundary or icache hit", }, [ POWER7_PME_PM_LSU_DC_PREF_STREAM_CONFIRM ] = { .pme_name = "PM_LSU_DC_PREF_STREAM_CONFIRM", .pme_code = 0xd8b4, .pme_short_desc = "Dcache new prefetch stream confirmed", .pme_long_desc = "Dcache new prefetch stream confirmed", }, [ POWER7_PME_PM_MRK_LSU_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_SRQ", .pme_code = 0xd08a, .pme_short_desc = "Flush: (marked) SRQ", .pme_long_desc = "Load Hit Store flush. A marked load was flushed because it hits (overlaps) an older store that is already in the SRQ or in the same group. If the real addresses match but the effective addresses do not, an alias condition exists that prevents store forwarding. If the load and store are in the same group the load must be flushed to separate the two instructions. 
", }, [ POWER7_PME_PM_MRK_FIN_STALL_CYC ] = { .pme_name = "PM_MRK_FIN_STALL_CYC", .pme_code = 0x1003c, .pme_short_desc = "Marked instruction Finish Stall cycles (marked finish after NTC) ", .pme_long_desc = "Marked instruction Finish Stall cycles (marked finish after NTC) ", }, [ POWER7_PME_PM_L2_RCST_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2_RCST_DISP_FAIL_OTHER", .pme_code = 0x46280, .pme_short_desc = " L2 RC store dispatch attempt failed due to other reasons", .pme_long_desc = " L2 RC store dispatch attempt failed due to other reasons", }, [ POWER7_PME_PM_VSU1_DD_ISSUED ] = { .pme_name = "PM_VSU1_DD_ISSUED", .pme_code = 0xb098, .pme_short_desc = "64BIT Decimal Issued on Pipe1", .pme_long_desc = "64BIT Decimal Issued on Pipe1", }, [ POWER7_PME_PM_PTEG_FROM_L31_SHR ] = { .pme_name = "PM_PTEG_FROM_L31_SHR", .pme_code = 0x2c056, .pme_short_desc = "PTEG loaded from another L3 on same chip shared", .pme_long_desc = "PTEG loaded from another L3 on same chip shared", }, [ POWER7_PME_PM_DATA_FROM_L21_SHR ] = { .pme_name = "PM_DATA_FROM_L21_SHR", .pme_code = 0x3c04e, .pme_short_desc = "Data loaded from another L2 on same chip shared", .pme_long_desc = "Data loaded from another L2 on same chip shared", }, [ POWER7_PME_PM_LSU0_NCLD ] = { .pme_name = "PM_LSU0_NCLD", .pme_code = 0xc08c, .pme_short_desc = "LS0 Non-cachable Loads counted at finish", .pme_long_desc = "A non-cacheable load was executed by unit 0.", }, [ POWER7_PME_PM_VSU1_4FLOP ] = { .pme_name = "PM_VSU1_4FLOP", .pme_code = 0xa09e, .pme_short_desc = "four flops operation (scalar fdiv, fsqrt; DP vector version of fmadd, fnmadd, fmsub, fnmsub; SP vector versions of single flop instructions)", .pme_long_desc = "four flops operation (scalar fdiv, fsqrt; DP vector version of fmadd, fnmadd, fmsub, fnmsub; SP vector versions of single flop instructions)", }, [ POWER7_PME_PM_VSU1_8FLOP ] = { .pme_name = "PM_VSU1_8FLOP", .pme_code = 0xa0a2, .pme_short_desc = "eight flops operation (DP vector versions of fdiv,fsqrt and SP 
vector versions of fmadd,fnmadd,fmsub,fnmsub) ", .pme_long_desc = "eight flops operation (DP vector versions of fdiv,fsqrt and SP vector versions of fmadd,fnmadd,fmsub,fnmsub) ", }, [ POWER7_PME_PM_VSU_8FLOP ] = { .pme_name = "PM_VSU_8FLOP", .pme_code = 0xa8a0, .pme_short_desc = "eight flops operation (DP vector versions of fdiv,fsqrt and SP vector versions of fmadd,fnmadd,fmsub,fnmsub) ", .pme_long_desc = "eight flops operation (DP vector versions of fdiv,fsqrt and SP vector versions of fmadd,fnmadd,fmsub,fnmsub) ", }, [ POWER7_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", .pme_code = 0x2003e, .pme_short_desc = "LSU empty (lmq and srq empty)", .pme_long_desc = "Cycles when both the LMQ and SRQ are empty (LSU is idle)", }, [ POWER7_PME_PM_DTLB_MISS_64K ] = { .pme_name = "PM_DTLB_MISS_64K", .pme_code = 0x3c05e, .pme_short_desc = "Data TLB miss for 64K page", .pme_long_desc = "Data TLB references to 64KB pages that missed the TLB. Page size is determined at TLB reload time.", }, [ POWER7_PME_PM_THRD_CONC_RUN_INST ] = { .pme_name = "PM_THRD_CONC_RUN_INST", .pme_code = 0x300f4, .pme_short_desc = "Concurrent Run Instructions", .pme_long_desc = "Instructions completed by this thread when both threads had their run latches set.", }, [ POWER7_PME_PM_MRK_PTEG_FROM_L2 ] = { .pme_name = "PM_MRK_PTEG_FROM_L2", .pme_code = 0x1d050, .pme_short_desc = "Marked PTEG loaded from L2", .pme_long_desc = "A Page Table Entry was loaded into the ERAT from the local L2 due to a marked load or store.", }, [ POWER7_PME_PM_PB_SYS_PUMP ] = { .pme_name = "PM_PB_SYS_PUMP", .pme_code = 0x20081, .pme_short_desc = " Nest events (MC0/MC1/PB/GX), Pair1 Bit0", .pme_long_desc = " Nest events (MC0/MC1/PB/GX), Pair1 Bit0", }, [ POWER7_PME_PM_VSU_FIN ] = { .pme_name = "PM_VSU_FIN", .pme_code = 0xa8bc, .pme_short_desc = "VSU0 Finished an instruction", .pme_long_desc = "VSU0 Finished an instruction", }, [ POWER7_PME_PM_MRK_DATA_FROM_L31_MOD ] = { .pme_name = 
"PM_MRK_DATA_FROM_L31_MOD", .pme_code = 0x1d044, .pme_short_desc = "Marked data loaded from another L3 on same chip modified", .pme_long_desc = "Marked data loaded from another L3 on same chip modified", }, [ POWER7_PME_PM_THRD_PRIO_0_1_CYC ] = { .pme_name = "PM_THRD_PRIO_0_1_CYC", .pme_code = 0x40b0, .pme_short_desc = " Cycles thread running at priority level 0 or 1", .pme_long_desc = " Cycles thread running at priority level 0 or 1", }, [ POWER7_PME_PM_DERAT_MISS_64K ] = { .pme_name = "PM_DERAT_MISS_64K", .pme_code = 0x2c05c, .pme_short_desc = "DERAT misses for 64K page", .pme_long_desc = "A data request (load or store) missed the ERAT for 64K page and resulted in an ERAT reload.", }, [ POWER7_PME_PM_PMC2_REWIND ] = { .pme_name = "PM_PMC2_REWIND", .pme_code = 0x30020, .pme_short_desc = "PMC2 Rewind Event (did not match condition)", .pme_long_desc = "PMC2 was counting speculatively. The speculative condition was not met and the counter was restored to its previous value.", }, [ POWER7_PME_PM_INST_FROM_L2 ] = { .pme_name = "PM_INST_FROM_L2", .pme_code = 0x14040, .pme_short_desc = "Instruction fetched from L2", .pme_long_desc = "An instruction fetch group was fetched from L2. 
Fetch Groups can contain up to 8 instructions", }, [ POWER7_PME_PM_GRP_BR_MPRED_NONSPEC ] = { .pme_name = "PM_GRP_BR_MPRED_NONSPEC", .pme_code = 0x1000a, .pme_short_desc = "Group experienced non-speculative branch redirect", .pme_long_desc = "Group experienced non-speculative branch redirect", }, [ POWER7_PME_PM_INST_DISP ] = { .pme_name = "PM_INST_DISP", .pme_code = 0x200f2, .pme_short_desc = "# PPC Dispatched", .pme_long_desc = "Number of PowerPC instructions successfully dispatched.", }, [ POWER7_PME_PM_MEM0_RD_CANCEL_TOTAL ] = { .pme_name = "PM_MEM0_RD_CANCEL_TOTAL", .pme_code = 0x30083, .pme_short_desc = " Nest events (MC0/MC1/PB/GX), Pair2 Bit1", .pme_long_desc = " Nest events (MC0/MC1/PB/GX), Pair2 Bit1", }, [ POWER7_PME_PM_LSU0_DC_PREF_STREAM_CONFIRM ] = { .pme_name = "PM_LSU0_DC_PREF_STREAM_CONFIRM", .pme_code = 0xd0b4, .pme_short_desc = "LS0 Dcache prefetch stream confirmed", .pme_long_desc = "LS0 Dcache prefetch stream confirmed", }, [ POWER7_PME_PM_L1_DCACHE_RELOAD_VALID ] = { .pme_name = "PM_L1_DCACHE_RELOAD_VALID", .pme_code = 0x300f6, .pme_short_desc = "L1 reload data source valid", .pme_long_desc = "The data source information is valid,the data cache has been reloaded. Prior to POWER5+ this included data cache reloads due to prefetch activity. 
With POWER5+ this now only includes reloads due to demand loads.", }, [ POWER7_PME_PM_VSU_SCALAR_DOUBLE_ISSUED ] = { .pme_name = "PM_VSU_SCALAR_DOUBLE_ISSUED", .pme_code = 0xb888, .pme_short_desc = "Double Precision scalar instruction issued on Pipe0", .pme_long_desc = "Double Precision scalar instruction issued on Pipe0", }, [ POWER7_PME_PM_L3_PREF_HIT ] = { .pme_name = "PM_L3_PREF_HIT", .pme_code = 0x3f080, .pme_short_desc = "L3 Prefetch Directory Hit", .pme_long_desc = "L3 Prefetch Directory Hit", }, [ POWER7_PME_PM_MRK_PTEG_FROM_L31_MOD ] = { .pme_name = "PM_MRK_PTEG_FROM_L31_MOD", .pme_code = 0x1d054, .pme_short_desc = "Marked PTEG loaded from another L3 on same chip modified", .pme_long_desc = "Marked PTEG loaded from another L3 on same chip modified", }, [ POWER7_PME_PM_CMPLU_STALL_STORE ] = { .pme_name = "PM_CMPLU_STALL_STORE", .pme_code = 0x2004a, .pme_short_desc = "Completion stall due to store instruction", .pme_long_desc = "Completion stall due to store instruction", }, [ POWER7_PME_PM_MRK_FXU_FIN ] = { .pme_name = "PM_MRK_FXU_FIN", .pme_code = 0x20038, .pme_short_desc = "fxu marked instr finish", .pme_long_desc = "One of the Fixed Point Units finished a marked instruction. Instructions that finish may not necessarily complete.", }, [ POWER7_PME_PM_PMC4_OVERFLOW ] = { .pme_name = "PM_PMC4_OVERFLOW", .pme_code = 0x10010, .pme_short_desc = "Overflow from counter 4", .pme_long_desc = "Overflows from PMC4 are counted. This effectively widens the PMC. 
The Overflow from the original PMC will not trigger an exception even if the PMU is configured to generate exceptions on overflow.", }, [ POWER7_PME_PM_MRK_PTEG_FROM_L3 ] = { .pme_name = "PM_MRK_PTEG_FROM_L3", .pme_code = 0x2d050, .pme_short_desc = "Marked PTEG loaded from L3", .pme_long_desc = "A Page Table Entry was loaded into the ERAT from the local L3 due to a marked load or store.", }, [ POWER7_PME_PM_LSU0_LMQ_LHR_MERGE ] = { .pme_name = "PM_LSU0_LMQ_LHR_MERGE", .pme_code = 0xd098, .pme_short_desc = "LS0 Load Merged with another cacheline request", .pme_long_desc = "LS0 Load Merged with another cacheline request", }, [ POWER7_PME_PM_BTAC_HIT ] = { .pme_name = "PM_BTAC_HIT", .pme_code = 0x508a, .pme_short_desc = "BTAC Correct Prediction", .pme_long_desc = "BTAC Correct Prediction", }, [ POWER7_PME_PM_L3_RD_BUSY ] = { .pme_name = "PM_L3_RD_BUSY", .pme_code = 0x4f082, .pme_short_desc = "Rd machines busy >= threshold (2,4,6,8)", .pme_long_desc = "Rd machines busy >= threshold (2,4,6,8)", }, [ POWER7_PME_PM_LSU0_L1_SW_PREF ] = { .pme_name = "PM_LSU0_L1_SW_PREF", .pme_code = 0xc09c, .pme_short_desc = "LSU0 Software L1 Prefetches, including SW Transient Prefetches", .pme_long_desc = "LSU0 Software L1 Prefetches, including SW Transient Prefetches", }, [ POWER7_PME_PM_INST_FROM_L2MISS ] = { .pme_name = "PM_INST_FROM_L2MISS", .pme_code = 0x44048, .pme_short_desc = "Instruction fetched missed L2", .pme_long_desc = "An instruction fetch group was fetched from beyond the local L2.", }, [ POWER7_PME_PM_LSU0_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_LSU0_DC_PREF_STREAM_ALLOC", .pme_code = 0xd0a8, .pme_short_desc = "LS0 D cache new prefetch stream allocated", .pme_long_desc = "LS0 D cache new prefetch stream allocated", }, [ POWER7_PME_PM_L2_ST ] = { .pme_name = "PM_L2_ST", .pme_code = 0x16082, .pme_short_desc = "Data Store Count", .pme_long_desc = "Data Store Count", }, [ POWER7_PME_PM_VSU0_DENORM ] = { .pme_name = "PM_VSU0_DENORM", .pme_code = 0xa0ac, .pme_short_desc = 
"FPU denorm operand", .pme_long_desc = "VSU0 received denormalized data", }, [ POWER7_PME_PM_MRK_DATA_FROM_DL2L3_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_SHR", .pme_code = 0x3d044, .pme_short_desc = "Marked data loaded from distant L2 or L3 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T or SL) data from an L2 or L3 on a distant module due to a marked load.", }, [ POWER7_PME_PM_BR_PRED_CR_TA ] = { .pme_name = "PM_BR_PRED_CR_TA", .pme_code = 0x48aa, .pme_short_desc = "Branch predict - taken/not taken and target", .pme_long_desc = "Both the condition (taken or not taken) and the target address of a branch instruction was predicted.", }, [ POWER7_PME_PM_VSU0_FCONV ] = { .pme_name = "PM_VSU0_FCONV", .pme_code = 0xa0b0, .pme_short_desc = "Convert instruction executed", .pme_long_desc = "Convert instruction executed", }, [ POWER7_PME_PM_MRK_LSU_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU_FLUSH_ULD", .pme_code = 0xd084, .pme_short_desc = "Flush: (marked) Unaligned Load", .pme_long_desc = "A marked load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", }, [ POWER7_PME_PM_BTAC_MISS ] = { .pme_name = "PM_BTAC_MISS", .pme_code = 0x5088, .pme_short_desc = "BTAC Mispredicted", .pme_long_desc = "BTAC Mispredicted", }, [ POWER7_PME_PM_MRK_LD_MISS_EXPOSED_CYC_COUNT ] = { .pme_name = "PM_MRK_LD_MISS_EXPOSED_CYC_COUNT", .pme_code = 0x1003f, .pme_short_desc = "Marked Load exposed Miss (use edge detect to count #)", .pme_long_desc = "Marked Load exposed Miss (use edge detect to count #)", }, [ POWER7_PME_PM_MRK_DATA_FROM_L2 ] = { .pme_name = "PM_MRK_DATA_FROM_L2", .pme_code = 0x1d040, .pme_short_desc = "Marked data loaded from L2", .pme_long_desc = "The processor's Data Cache was reloaded from the local L2 due to a marked load.", }, [ POWER7_PME_PM_LSU_DCACHE_RELOAD_VALID ] = { .pme_name = "PM_LSU_DCACHE_RELOAD_VALID", .pme_code = 0xd0a2, .pme_short_desc = "count per sector of lines reloaded in L1 
(demand + prefetch) ", .pme_long_desc = "count per sector of lines reloaded in L1 (demand + prefetch) ", }, [ POWER7_PME_PM_VSU_FMA ] = { .pme_name = "PM_VSU_FMA", .pme_code = 0xa884, .pme_short_desc = "two flops operation (fmadd, fnmadd, fmsub, fnmsub) Scalar instructions only!", .pme_long_desc = "two flops operation (fmadd, fnmadd, fmsub, fnmsub) Scalar instructions only!", }, [ POWER7_PME_PM_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_LSU0_FLUSH_SRQ", .pme_code = 0xc0bc, .pme_short_desc = "LS0 Flush: SRQ", .pme_long_desc = "Load Hit Store flush. A younger load was flushed from unit 0 because it hits (overlaps) an older store that is already in the SRQ or in the same group. If the real addresses match but the effective addresses do not, an alias condition exists that prevents store forwarding. If the load and store are in the same group the load must be flushed to separate the two instructions. ", }, [ POWER7_PME_PM_LSU1_L1_PREF ] = { .pme_name = "PM_LSU1_L1_PREF", .pme_code = 0xd0ba, .pme_short_desc = " LS1 L1 cache data prefetches", .pme_long_desc = " LS1 L1 cache data prefetches", }, [ POWER7_PME_PM_IOPS_CMPL ] = { .pme_name = "PM_IOPS_CMPL", .pme_code = 0x10014, .pme_short_desc = "Internal Operations completed", .pme_long_desc = "Number of internal operations that completed.", }, [ POWER7_PME_PM_L2_SYS_PUMP ] = { .pme_name = "PM_L2_SYS_PUMP", .pme_code = 0x36482, .pme_short_desc = "RC req that was a global (aka system) pump attempt", .pme_long_desc = "RC req that was a global (aka system) pump attempt", }, [ POWER7_PME_PM_L2_RCLD_BUSY_RC_FULL ] = { .pme_name = "PM_L2_RCLD_BUSY_RC_FULL", .pme_code = 0x46282, .pme_short_desc = " L2 activated Busy to the core for loads due to all RC full", .pme_long_desc = " L2 activated Busy to the core for loads due to all RC full", }, [ POWER7_PME_PM_LSU_LMQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LMQ_S0_ALLOC", .pme_code = 0xd0a1, .pme_short_desc = "Slot 0 of LMQ valid", .pme_long_desc = "Slot 0 of LMQ valid", }, [ 
POWER7_PME_PM_FLUSH_DISP_SYNC ] = { .pme_name = "PM_FLUSH_DISP_SYNC", .pme_code = 0x2088, .pme_short_desc = "Dispatch Flush: Sync", .pme_long_desc = "Dispatch Flush: Sync", }, [ POWER7_PME_PM_MRK_DATA_FROM_DL2L3_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_MOD_CYC", .pme_code = 0x4002a, .pme_short_desc = "Marked ld latency Data source 1011 (L2.75/L3.75 M different 4 chip node)", .pme_long_desc = "Marked ld latency Data source 1011 (L2.75/L3.75 M different 4 chip node)", }, [ POWER7_PME_PM_L2_IC_INV ] = { .pme_name = "PM_L2_IC_INV", .pme_code = 0x26180, .pme_short_desc = "Icache Invalidates from L2 ", .pme_long_desc = "Icache Invalidates from L2 ", }, [ POWER7_PME_PM_MRK_DATA_FROM_L21_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L21_MOD_CYC", .pme_code = 0x40024, .pme_short_desc = "Marked ld latency Data source 0101 (L2.1 M same chip)", .pme_long_desc = "Marked ld latency Data source 0101 (L2.1 M same chip)", }, [ POWER7_PME_PM_L3_PREF_LDST ] = { .pme_name = "PM_L3_PREF_LDST", .pme_code = 0xd8ac, .pme_short_desc = "L3 cache prefetches LD + ST", .pme_long_desc = "L3 cache prefetches LD + ST", }, [ POWER7_PME_PM_LSU_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_SRQ_EMPTY_CYC", .pme_code = 0x40008, .pme_short_desc = "ALL threads srq empty", .pme_long_desc = "The Store Request Queue is empty", }, [ POWER7_PME_PM_LSU_LMQ_S0_VALID ] = { .pme_name = "PM_LSU_LMQ_S0_VALID", .pme_code = 0xd0a0, .pme_short_desc = "Slot 0 of LMQ valid", .pme_long_desc = "This signal is asserted every cycle that the Load Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin. 
In SMT mode the LRQ is split between the two threads (16 entries each).", }, [ POWER7_PME_PM_FLUSH_PARTIAL ] = { .pme_name = "PM_FLUSH_PARTIAL", .pme_code = 0x2086, .pme_short_desc = "Partial flush", .pme_long_desc = "Partial flush", }, [ POWER7_PME_PM_VSU1_FMA_DOUBLE ] = { .pme_name = "PM_VSU1_FMA_DOUBLE", .pme_code = 0xa092, .pme_short_desc = "four flop DP vector operations (xvmadddp, xvnmadddp, xvmsubdp, xvnmsubdp)", .pme_long_desc = "four flop DP vector operations (xvmadddp, xvnmadddp, xvmsubdp, xvnmsubdp)", }, [ POWER7_PME_PM_1PLUS_PPC_DISP ] = { .pme_name = "PM_1PLUS_PPC_DISP", .pme_code = 0x400f2, .pme_short_desc = "Cycles at least one Instr Dispatched", .pme_long_desc = "A group containing at least one PPC instruction was dispatched. For microcoded instructions that span multiple groups, this will only occur once.", }, [ POWER7_PME_PM_DATA_FROM_L2MISS ] = { .pme_name = "PM_DATA_FROM_L2MISS", .pme_code = 0x200fe, .pme_short_desc = "Demand LD - L2 Miss (not L2 hit)", .pme_long_desc = "The processor's Data Cache was reloaded but not from the local L2.", }, [ POWER7_PME_PM_SUSPENDED ] = { .pme_name = "PM_SUSPENDED", .pme_code = 0x0, .pme_short_desc = "Counter OFF", .pme_long_desc = "The counter is suspended (does not count)", }, [ POWER7_PME_PM_VSU0_FMA ] = { .pme_name = "PM_VSU0_FMA", .pme_code = 0xa084, .pme_short_desc = "two flops operation (fmadd, fnmadd, fmsub, fnmsub, xsmadd, xsnmadd, xsmsub, xsnmsub) Scalar instructions only!", .pme_long_desc = "two flops operation (fmadd, fnmadd, fmsub, fnmsub, xsmadd, xsnmadd, xsmsub, xsnmsub) Scalar instructions only!", }, [ POWER7_PME_PM_CMPLU_STALL_SCALAR ] = { .pme_name = "PM_CMPLU_STALL_SCALAR", .pme_code = 0x40012, .pme_short_desc = "Completion stall caused by FPU instruction", .pme_long_desc = "Completion stall caused by FPU instruction", }, [ POWER7_PME_PM_STCX_FAIL ] = { .pme_name = "PM_STCX_FAIL", .pme_code = 0xc09a, .pme_short_desc = "STCX failed", .pme_long_desc = "A stcx (stwcx or stdcx) failed", }, [ 
POWER7_PME_PM_VSU0_FSQRT_FDIV_DOUBLE ] = { .pme_name = "PM_VSU0_FSQRT_FDIV_DOUBLE", .pme_code = 0xa094, .pme_short_desc = "eight flop DP vector operations (xvfdivdp, xvsqrtdp ", .pme_long_desc = "eight flop DP vector operations (xvfdivdp, xvsqrtdp ", }, [ POWER7_PME_PM_DC_PREF_DST ] = { .pme_name = "PM_DC_PREF_DST", .pme_code = 0xd0b0, .pme_short_desc = "Data Stream Touch", .pme_long_desc = "A prefetch stream was started using the DST instruction.", }, [ POWER7_PME_PM_VSU1_SCAL_SINGLE_ISSUED ] = { .pme_name = "PM_VSU1_SCAL_SINGLE_ISSUED", .pme_code = 0xb086, .pme_short_desc = "Single Precision scalar instruction issued on Pipe1", .pme_long_desc = "Single Precision scalar instruction issued on Pipe1", }, [ POWER7_PME_PM_L3_HIT ] = { .pme_name = "PM_L3_HIT", .pme_code = 0x1f080, .pme_short_desc = "L3 Hits", .pme_long_desc = "L3 Hits", }, [ POWER7_PME_PM_L2_GLOB_GUESS_WRONG ] = { .pme_name = "PM_L2_GLOB_GUESS_WRONG", .pme_code = 0x26482, .pme_short_desc = "L2 guess glb and guess was not correct (ie data local)", .pme_long_desc = "L2 guess glb and guess was not correct (ie data local)", }, [ POWER7_PME_PM_MRK_DFU_FIN ] = { .pme_name = "PM_MRK_DFU_FIN", .pme_code = 0x20032, .pme_short_desc = "Decimal Unit marked Instruction Finish", .pme_long_desc = "The Decimal Floating Point Unit finished a marked instruction.", }, [ POWER7_PME_PM_INST_FROM_L1 ] = { .pme_name = "PM_INST_FROM_L1", .pme_code = 0x4080, .pme_short_desc = "Instruction fetches from L1", .pme_long_desc = "An instruction fetch group was fetched from L1. 
Fetch Groups can contain up to 8 instructions", }, [ POWER7_PME_PM_BRU_FIN ] = { .pme_name = "PM_BRU_FIN", .pme_code = 0x10068, .pme_short_desc = "Branch Instruction Finished ", .pme_long_desc = "The Branch execution unit finished an instruction", }, [ POWER7_PME_PM_IC_DEMAND_REQ ] = { .pme_name = "PM_IC_DEMAND_REQ", .pme_code = 0x4088, .pme_short_desc = "Demand Instruction fetch request", .pme_long_desc = "Demand Instruction fetch request", }, [ POWER7_PME_PM_VSU1_FSQRT_FDIV_DOUBLE ] = { .pme_name = "PM_VSU1_FSQRT_FDIV_DOUBLE", .pme_code = 0xa096, .pme_short_desc = "eight flop DP vector operations (xvfdivdp, xvsqrtdp ", .pme_long_desc = "eight flop DP vector operations (xvfdivdp, xvsqrtdp ", }, [ POWER7_PME_PM_VSU1_FMA ] = { .pme_name = "PM_VSU1_FMA", .pme_code = 0xa086, .pme_short_desc = "two flops operation (fmadd, fnmadd, fmsub, fnmsub, xsmadd, xsnmadd, xsmsub, xsnmsub) Scalar instructions only!", .pme_long_desc = "two flops operation (fmadd, fnmadd, fmsub, fnmsub, xsmadd, xsnmadd, xsmsub, xsnmsub) Scalar instructions only!", }, [ POWER7_PME_PM_MRK_LD_MISS_L1 ] = { .pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 0x20036, .pme_short_desc = "Marked DL1 Demand Miss", .pme_long_desc = "Marked L1 D cache load misses", }, [ POWER7_PME_PM_VSU0_2FLOP_DOUBLE ] = { .pme_name = "PM_VSU0_2FLOP_DOUBLE", .pme_code = 0xa08c, .pme_short_desc = "two flop DP vector operation (xvadddp, xvmuldp, xvsubdp, xvcmpdp, xvseldp, xvabsdp, xvnabsdp, xvredp ,xvsqrtedp, vxnegdp)", .pme_long_desc = "two flop DP vector operation (xvadddp, xvmuldp, xvsubdp, xvcmpdp, xvseldp, xvabsdp, xvnabsdp, xvredp ,xvsqrtedp, vxnegdp)", }, [ POWER7_PME_PM_LSU_DC_PREF_STRIDED_STREAM_CONFIRM ] = { .pme_name = "PM_LSU_DC_PREF_STRIDED_STREAM_CONFIRM", .pme_code = 0xd8bc, .pme_short_desc = "Dcache Strided prefetch stream confirmed (software + hardware)", .pme_long_desc = "Dcache Strided prefetch stream confirmed (software + hardware)", }, [ POWER7_PME_PM_INST_PTEG_FROM_L31_SHR ] = { .pme_name = 
"PM_INST_PTEG_FROM_L31_SHR", .pme_code = 0x2e056, .pme_short_desc = "Instruction PTEG loaded from another L3 on same chip shared", .pme_long_desc = "Instruction PTEG loaded from another L3 on same chip shared", }, [ POWER7_PME_PM_MRK_LSU_REJECT_ERAT_MISS ] = { .pme_name = "PM_MRK_LSU_REJECT_ERAT_MISS", .pme_code = 0x30064, .pme_short_desc = "LSU marked reject due to ERAT (up to 2 per cycle)", .pme_long_desc = "LSU marked reject due to ERAT (up to 2 per cycle)", }, [ POWER7_PME_PM_MRK_DATA_FROM_L2MISS ] = { .pme_name = "PM_MRK_DATA_FROM_L2MISS", .pme_code = 0x4d048, .pme_short_desc = "Marked data loaded missed L2", .pme_long_desc = "DL1 was reloaded from beyond L2 due to a marked demand load.", }, [ POWER7_PME_PM_DATA_FROM_RL2L3_SHR ] = { .pme_name = "PM_DATA_FROM_RL2L3_SHR", .pme_code = 0x1c04c, .pme_short_desc = "Data loaded from remote L2 or L3 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T or SL) data from an L2 or L3 on a remote module due to a demand load", }, [ POWER7_PME_PM_INST_FROM_PREF ] = { .pme_name = "PM_INST_FROM_PREF", .pme_code = 0x14046, .pme_short_desc = "Instruction fetched from prefetch", .pme_long_desc = "An instruction fetch group was fetched from the prefetch buffer. 
Fetch groups can contain up to 8 instructions", }, [ POWER7_PME_PM_VSU1_SQ ] = { .pme_name = "PM_VSU1_SQ", .pme_code = 0xb09e, .pme_short_desc = "Store Vector Issued on Pipe1", .pme_long_desc = "Store Vector Issued on Pipe1", }, [ POWER7_PME_PM_L2_LD_DISP ] = { .pme_name = "PM_L2_LD_DISP", .pme_code = 0x36180, .pme_short_desc = "All successful load dispatches", .pme_long_desc = "All successful load dispatches", }, [ POWER7_PME_PM_L2_DISP_ALL ] = { .pme_name = "PM_L2_DISP_ALL", .pme_code = 0x46080, .pme_short_desc = "All successful LD/ST dispatches for this thread(i+d)", .pme_long_desc = "All successful LD/ST dispatches for this thread(i+d)", }, [ POWER7_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { .pme_name = "PM_THRD_GRP_CMPL_BOTH_CYC", .pme_code = 0x10012, .pme_short_desc = "Cycles group completed by both threads", .pme_long_desc = "Cycles that both threads completed.", }, [ POWER7_PME_PM_VSU_FSQRT_FDIV_DOUBLE ] = { .pme_name = "PM_VSU_FSQRT_FDIV_DOUBLE", .pme_code = 0xa894, .pme_short_desc = "DP vector versions of fdiv,fsqrt ", .pme_long_desc = "DP vector versions of fdiv,fsqrt ", }, [ POWER7_PME_PM_BR_MPRED ] = { .pme_name = "PM_BR_MPRED", .pme_code = 0x400f6, .pme_short_desc = "Number of Branch Mispredicts", .pme_long_desc = "A branch instruction was incorrectly predicted. 
This could have been a target prediction, a condition prediction, or both", }, [ POWER7_PME_PM_INST_PTEG_FROM_DL2L3_SHR ] = { .pme_name = "PM_INST_PTEG_FROM_DL2L3_SHR", .pme_code = 0x3e054, .pme_short_desc = "Instruction PTEG loaded from remote L2 or L3 shared", .pme_long_desc = "Instruction PTEG loaded from remote L2 or L3 shared", }, [ POWER7_PME_PM_VSU_1FLOP ] = { .pme_name = "PM_VSU_1FLOP", .pme_code = 0xa880, .pme_short_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation finished", .pme_long_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation finished", }, [ POWER7_PME_PM_HV_CYC ] = { .pme_name = "PM_HV_CYC", .pme_code = 0x2000a, .pme_short_desc = "cycles in hypervisor mode ", .pme_long_desc = "Cycles when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0)", }, [ POWER7_PME_PM_MRK_DATA_FROM_RL2L3_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_SHR", .pme_code = 0x1d04c, .pme_short_desc = "Marked data loaded from remote L2 or L3 shared", .pme_long_desc = "The processor's Data Cache was reloaded with shared (T or SL) data from an L2 or L3 on a remote module due to a marked load", }, [ POWER7_PME_PM_DTLB_MISS_16M ] = { .pme_name = "PM_DTLB_MISS_16M", .pme_code = 0x4c05e, .pme_short_desc = "Data TLB miss for 16M page", .pme_long_desc = "Data TLB references to 16MB pages that missed the TLB. Page size is determined at TLB reload time.", }, [ POWER7_PME_PM_MRK_LSU_FIN ] = { .pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x40032, .pme_short_desc = "Marked LSU instruction finished", .pme_long_desc = "One of the Load/Store Units finished a marked instruction. 
Instructions that finish may not necessarily complete", }, [ POWER7_PME_PM_LSU1_LMQ_LHR_MERGE ] = { .pme_name = "PM_LSU1_LMQ_LHR_MERGE", .pme_code = 0xd09a, .pme_short_desc = "LS1 Load Merge with another cacheline request", .pme_long_desc = "LS1 Load Merge with another cacheline request", }, [ POWER7_PME_PM_IFU_FIN ] = { .pme_name = "PM_IFU_FIN", .pme_code = 0x40066, .pme_short_desc = "IFU Finished a (non-branch) instruction", .pme_long_desc = "The Instruction Fetch Unit finished an instruction", }, [ POWER7_PME_PM_1THRD_CON_RUN_INSTR ] = { .pme_name = "PM_1THRD_CON_RUN_INSTR", .pme_code = 0x30062, .pme_short_desc = "1 thread Concurrent Run Instructions", .pme_long_desc = "1 thread Concurrent Run Instructions", }, [ POWER7_PME_PM_CMPLU_STALL_COUNT ] = { .pme_name = "PM_CMPLU_STALL_COUNT", .pme_code = 0x4000B, .pme_short_desc = "Marked LSU instruction finished", .pme_long_desc = "One of the Load/Store Units finished a marked instruction. Instructions that finish may not necessarily complete", }, [ POWER7_PME_PM_MEM0_PB_RD_CL ] = { .pme_name = "PM_MEM0_PB_RD_CL", .pme_code = 0x30083, .pme_short_desc = "Nest events (MC0/MC1/PB/GX), Pair2 Bit1", .pme_long_desc = "Nest events (MC0/MC1/PB/GX), Pair2 Bit1", }, [ POWER7_PME_PM_THRD_1_RUN_CYC ] = { .pme_name = "PM_THRD_1_RUN_CYC", .pme_code = 0x10060, .pme_short_desc = "1 thread in Run Cycles", .pme_long_desc = "At least one thread has set its run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. 
This event does not respect FCWAIT.", }, [ POWER7_PME_PM_THRD_2_CONC_RUN_INSTR ] = { .pme_name = "PM_THRD_2_CONC_RUN_INSTR", .pme_code = 0x40062, .pme_short_desc = "2 thread Concurrent Run Instructions", .pme_long_desc = "2 thread Concurrent Run Instructions", }, [ POWER7_PME_PM_THRD_2_RUN_CYC ] = { .pme_name = "PM_THRD_2_RUN_CYC", .pme_code = 0x20060, .pme_short_desc = "2 thread in Run Cycles", .pme_long_desc = "2 thread in Run Cycles", }, [ POWER7_PME_PM_THRD_3_CONC_RUN_INST ] = { .pme_name = "PM_THRD_3_CONC_RUN_INST", .pme_code = 0x10062, .pme_short_desc = "3 thread in Run Cycles", .pme_long_desc = "3 thread in Run Cycles", }, [ POWER7_PME_PM_THRD_3_RUN_CYC ] = { .pme_name = "PM_THRD_3_RUN_CYC", .pme_code = 0x30060, .pme_short_desc = "3 thread in Run Cycles", .pme_long_desc = "3 thread in Run Cycles", }, [ POWER7_PME_PM_THRD_4_CONC_RUN_INST ] = { .pme_name = "PM_THRD_4_CONC_RUN_INST", .pme_code = 0x20062, .pme_short_desc = "4 thread in Run Cycles", .pme_long_desc = "4 thread in Run Cycles", }, [ POWER7_PME_PM_THRD_4_RUN_CYC ] = { .pme_name = "PM_THRD_4_RUN_CYC", .pme_code = 0x40060, .pme_short_desc = "4 thread in Run Cycles", .pme_long_desc = "4 thread in Run Cycles", }, }; #endif papi-papi-7-2-0-t/src/libpfm4/lib/events/power8_events.h000066400000000000000000013011521502707512200230220ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __POWER8_EVENTS_H__ #define __POWER8_EVENTS_H__ /* * File: power8_events.h * CVS: * Author: Carl Love * carll.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2013. All Rights Reserved. * Contributed by * * Note: This code was automatically generated and should not be modified by * hand. 
* * Documentation on the PMU events will be published at: * http://www.power.org/documentation */ #define POWER8_PME_PM_1LPAR_CYC 0 #define POWER8_PME_PM_1PLUS_PPC_CMPL 1 #define POWER8_PME_PM_1PLUS_PPC_DISP 2 #define POWER8_PME_PM_2LPAR_CYC 3 #define POWER8_PME_PM_4LPAR_CYC 4 #define POWER8_PME_PM_ALL_CHIP_PUMP_CPRED 5 #define POWER8_PME_PM_ALL_GRP_PUMP_CPRED 6 #define POWER8_PME_PM_ALL_GRP_PUMP_MPRED 7 #define POWER8_PME_PM_ALL_GRP_PUMP_MPRED_RTY 8 #define POWER8_PME_PM_ALL_PUMP_CPRED 9 #define POWER8_PME_PM_ALL_PUMP_MPRED 10 #define POWER8_PME_PM_ALL_SYS_PUMP_CPRED 11 #define POWER8_PME_PM_ALL_SYS_PUMP_MPRED 12 #define POWER8_PME_PM_ALL_SYS_PUMP_MPRED_RTY 13 #define POWER8_PME_PM_ANY_THRD_RUN_CYC 14 #define POWER8_PME_PM_BACK_BR_CMPL 15 #define POWER8_PME_PM_BANK_CONFLICT 16 #define POWER8_PME_PM_BRU_FIN 17 #define POWER8_PME_PM_BR_2PATH 18 #define POWER8_PME_PM_BR_BC_8 19 #define POWER8_PME_PM_BR_BC_8_CONV 20 #define POWER8_PME_PM_BR_CMPL 21 #define POWER8_PME_PM_BR_MPRED_CCACHE 22 #define POWER8_PME_PM_BR_MPRED_CMPL 23 #define POWER8_PME_PM_BR_MPRED_CR 24 #define POWER8_PME_PM_BR_MPRED_LSTACK 25 #define POWER8_PME_PM_BR_MPRED_TA 26 #define POWER8_PME_PM_BR_MRK_2PATH 27 #define POWER8_PME_PM_BR_PRED_BR0 28 #define POWER8_PME_PM_BR_PRED_BR1 29 #define POWER8_PME_PM_BR_PRED_BR_CMPL 30 #define POWER8_PME_PM_BR_PRED_CCACHE_BR0 31 #define POWER8_PME_PM_BR_PRED_CCACHE_BR1 32 #define POWER8_PME_PM_BR_PRED_CCACHE_CMPL 33 #define POWER8_PME_PM_BR_PRED_CR_BR0 34 #define POWER8_PME_PM_BR_PRED_CR_BR1 35 #define POWER8_PME_PM_BR_PRED_CR_CMPL 36 #define POWER8_PME_PM_BR_PRED_LSTACK_BR0 37 #define POWER8_PME_PM_BR_PRED_LSTACK_BR1 38 #define POWER8_PME_PM_BR_PRED_LSTACK_CMPL 39 #define POWER8_PME_PM_BR_PRED_TA_BR0 40 #define POWER8_PME_PM_BR_PRED_TA_BR1 41 #define POWER8_PME_PM_BR_PRED_TA_CMPL 42 #define POWER8_PME_PM_BR_TAKEN_CMPL 43 #define POWER8_PME_PM_BR_UNCOND_BR0 44 #define POWER8_PME_PM_BR_UNCOND_BR1 45 #define POWER8_PME_PM_BR_UNCOND_CMPL 46 #define 
POWER8_PME_PM_CASTOUT_ISSUED 47 #define POWER8_PME_PM_CASTOUT_ISSUED_GPR 48 #define POWER8_PME_PM_CHIP_PUMP_CPRED 49 #define POWER8_PME_PM_CLB_HELD 50 #define POWER8_PME_PM_CMPLU_STALL 51 #define POWER8_PME_PM_CMPLU_STALL_BRU 52 #define POWER8_PME_PM_CMPLU_STALL_BRU_CRU 53 #define POWER8_PME_PM_CMPLU_STALL_COQ_FULL 54 #define POWER8_PME_PM_CMPLU_STALL_DCACHE_MISS 55 #define POWER8_PME_PM_CMPLU_STALL_DMISS_L21_L31 56 #define POWER8_PME_PM_CMPLU_STALL_DMISS_L2L3 57 #define POWER8_PME_PM_CMPLU_STALL_DMISS_L2L3_CONFLICT 58 #define POWER8_PME_PM_CMPLU_STALL_DMISS_L3MISS 59 #define POWER8_PME_PM_CMPLU_STALL_DMISS_LMEM 60 #define POWER8_PME_PM_CMPLU_STALL_DMISS_REMOTE 61 #define POWER8_PME_PM_CMPLU_STALL_ERAT_MISS 62 #define POWER8_PME_PM_CMPLU_STALL_FLUSH 63 #define POWER8_PME_PM_CMPLU_STALL_FXLONG 64 #define POWER8_PME_PM_CMPLU_STALL_FXU 65 #define POWER8_PME_PM_CMPLU_STALL_HWSYNC 66 #define POWER8_PME_PM_CMPLU_STALL_LOAD_FINISH 67 #define POWER8_PME_PM_CMPLU_STALL_LSU 68 #define POWER8_PME_PM_CMPLU_STALL_LWSYNC 69 #define POWER8_PME_PM_CMPLU_STALL_MEM_ECC_DELAY 70 #define POWER8_PME_PM_CMPLU_STALL_NO_NTF 71 #define POWER8_PME_PM_CMPLU_STALL_NTCG_FLUSH 72 #define POWER8_PME_PM_CMPLU_STALL_OTHER_CMPL 73 #define POWER8_PME_PM_CMPLU_STALL_REJECT 74 #define POWER8_PME_PM_CMPLU_STALL_REJECT_LHS 75 #define POWER8_PME_PM_CMPLU_STALL_REJ_LMQ_FULL 76 #define POWER8_PME_PM_CMPLU_STALL_SCALAR 77 #define POWER8_PME_PM_CMPLU_STALL_SCALAR_LONG 78 #define POWER8_PME_PM_CMPLU_STALL_STORE 79 #define POWER8_PME_PM_CMPLU_STALL_ST_FWD 80 #define POWER8_PME_PM_CMPLU_STALL_THRD 81 #define POWER8_PME_PM_CMPLU_STALL_VECTOR 82 #define POWER8_PME_PM_CMPLU_STALL_VECTOR_LONG 83 #define POWER8_PME_PM_CMPLU_STALL_VSU 84 #define POWER8_PME_PM_CO0_ALLOC 85 #define POWER8_PME_PM_CO0_BUSY 86 #define POWER8_PME_PM_CO_DISP_FAIL 87 #define POWER8_PME_PM_CO_TM_SC_FOOTPRINT 88 #define POWER8_PME_PM_CO_USAGE 89 #define POWER8_PME_PM_CRU_FIN 90 #define POWER8_PME_PM_CYC 91 #define 
POWER8_PME_PM_DATA_ALL_CHIP_PUMP_CPRED 92 #define POWER8_PME_PM_DATA_ALL_FROM_DL2L3_MOD 93 #define POWER8_PME_PM_DATA_ALL_FROM_DL2L3_SHR 94 #define POWER8_PME_PM_DATA_ALL_FROM_DL4 95 #define POWER8_PME_PM_DATA_ALL_FROM_DMEM 96 #define POWER8_PME_PM_DATA_ALL_FROM_L2 97 #define POWER8_PME_PM_DATA_ALL_FROM_L21_MOD 98 #define POWER8_PME_PM_DATA_ALL_FROM_L21_SHR 99 #define POWER8_PME_PM_DATA_ALL_FROM_L2MISS_MOD 100 #define POWER8_PME_PM_DATA_ALL_FROM_L2_DISP_CONFLICT_LDHITST 101 #define POWER8_PME_PM_DATA_ALL_FROM_L2_DISP_CONFLICT_OTHER 102 #define POWER8_PME_PM_DATA_ALL_FROM_L2_MEPF 103 #define POWER8_PME_PM_DATA_ALL_FROM_L2_NO_CONFLICT 104 #define POWER8_PME_PM_DATA_ALL_FROM_L3 105 #define POWER8_PME_PM_DATA_ALL_FROM_L31_ECO_MOD 106 #define POWER8_PME_PM_DATA_ALL_FROM_L31_ECO_SHR 107 #define POWER8_PME_PM_DATA_ALL_FROM_L31_MOD 108 #define POWER8_PME_PM_DATA_ALL_FROM_L31_SHR 109 #define POWER8_PME_PM_DATA_ALL_FROM_L3MISS_MOD 110 #define POWER8_PME_PM_DATA_ALL_FROM_L3_DISP_CONFLICT 111 #define POWER8_PME_PM_DATA_ALL_FROM_L3_MEPF 112 #define POWER8_PME_PM_DATA_ALL_FROM_L3_NO_CONFLICT 113 #define POWER8_PME_PM_DATA_ALL_FROM_LL4 114 #define POWER8_PME_PM_DATA_ALL_FROM_LMEM 115 #define POWER8_PME_PM_DATA_ALL_FROM_MEMORY 116 #define POWER8_PME_PM_DATA_ALL_FROM_OFF_CHIP_CACHE 117 #define POWER8_PME_PM_DATA_ALL_FROM_ON_CHIP_CACHE 118 #define POWER8_PME_PM_DATA_ALL_FROM_RL2L3_MOD 119 #define POWER8_PME_PM_DATA_ALL_FROM_RL2L3_SHR 120 #define POWER8_PME_PM_DATA_ALL_FROM_RL4 121 #define POWER8_PME_PM_DATA_ALL_FROM_RMEM 122 #define POWER8_PME_PM_DATA_ALL_GRP_PUMP_CPRED 123 #define POWER8_PME_PM_DATA_ALL_GRP_PUMP_MPRED 124 #define POWER8_PME_PM_DATA_ALL_GRP_PUMP_MPRED_RTY 125 #define POWER8_PME_PM_DATA_ALL_PUMP_CPRED 126 #define POWER8_PME_PM_DATA_ALL_PUMP_MPRED 127 #define POWER8_PME_PM_DATA_ALL_SYS_PUMP_CPRED 128 #define POWER8_PME_PM_DATA_ALL_SYS_PUMP_MPRED 129 #define POWER8_PME_PM_DATA_ALL_SYS_PUMP_MPRED_RTY 130 #define POWER8_PME_PM_DATA_CHIP_PUMP_CPRED 131 #define 
POWER8_PME_PM_DATA_FROM_DL2L3_MOD 132 #define POWER8_PME_PM_DATA_FROM_DL2L3_SHR 133 #define POWER8_PME_PM_DATA_FROM_DL4 134 #define POWER8_PME_PM_DATA_FROM_DMEM 135 #define POWER8_PME_PM_DATA_FROM_L2 136 #define POWER8_PME_PM_DATA_FROM_L21_MOD 137 #define POWER8_PME_PM_DATA_FROM_L21_SHR 138 #define POWER8_PME_PM_DATA_FROM_L2MISS 139 #define POWER8_PME_PM_DATA_FROM_L2MISS_MOD 140 #define POWER8_PME_PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST 141 #define POWER8_PME_PM_DATA_FROM_L2_DISP_CONFLICT_OTHER 142 #define POWER8_PME_PM_DATA_FROM_L2_MEPF 143 #define POWER8_PME_PM_DATA_FROM_L2_NO_CONFLICT 144 #define POWER8_PME_PM_DATA_FROM_L3 145 #define POWER8_PME_PM_DATA_FROM_L31_ECO_MOD 146 #define POWER8_PME_PM_DATA_FROM_L31_ECO_SHR 147 #define POWER8_PME_PM_DATA_FROM_L31_MOD 148 #define POWER8_PME_PM_DATA_FROM_L31_SHR 149 #define POWER8_PME_PM_DATA_FROM_L3MISS 150 #define POWER8_PME_PM_DATA_FROM_L3MISS_MOD 151 #define POWER8_PME_PM_DATA_FROM_L3_DISP_CONFLICT 152 #define POWER8_PME_PM_DATA_FROM_L3_MEPF 153 #define POWER8_PME_PM_DATA_FROM_L3_NO_CONFLICT 154 #define POWER8_PME_PM_DATA_FROM_LL4 155 #define POWER8_PME_PM_DATA_FROM_LMEM 156 #define POWER8_PME_PM_DATA_FROM_MEM 157 #define POWER8_PME_PM_DATA_FROM_MEMORY 158 #define POWER8_PME_PM_DATA_FROM_OFF_CHIP_CACHE 159 #define POWER8_PME_PM_DATA_FROM_ON_CHIP_CACHE 160 #define POWER8_PME_PM_DATA_FROM_RL2L3_MOD 161 #define POWER8_PME_PM_DATA_FROM_RL2L3_SHR 162 #define POWER8_PME_PM_DATA_FROM_RL4 163 #define POWER8_PME_PM_DATA_FROM_RMEM 164 #define POWER8_PME_PM_DATA_GRP_PUMP_CPRED 165 #define POWER8_PME_PM_DATA_GRP_PUMP_MPRED 166 #define POWER8_PME_PM_DATA_GRP_PUMP_MPRED_RTY 167 #define POWER8_PME_PM_DATA_PUMP_CPRED 168 #define POWER8_PME_PM_DATA_PUMP_MPRED 169 #define POWER8_PME_PM_DATA_SYS_PUMP_CPRED 170 #define POWER8_PME_PM_DATA_SYS_PUMP_MPRED 171 #define POWER8_PME_PM_DATA_SYS_PUMP_MPRED_RTY 172 #define POWER8_PME_PM_DATA_TABLEWALK_CYC 173 #define POWER8_PME_PM_DC_COLLISIONS 174 #define POWER8_PME_PM_DC_PREF_STREAM_ALLOC 175 
#define POWER8_PME_PM_DC_PREF_STREAM_CONF 176 #define POWER8_PME_PM_DC_PREF_STREAM_FUZZY_CONF 177 #define POWER8_PME_PM_DC_PREF_STREAM_STRIDED_CONF 178 #define POWER8_PME_PM_DERAT_MISS_16G 179 #define POWER8_PME_PM_DERAT_MISS_16M 180 #define POWER8_PME_PM_DERAT_MISS_4K 181 #define POWER8_PME_PM_DERAT_MISS_64K 182 #define POWER8_PME_PM_DFU 183 #define POWER8_PME_PM_DFU_DCFFIX 184 #define POWER8_PME_PM_DFU_DENBCD 185 #define POWER8_PME_PM_DFU_MC 186 #define POWER8_PME_PM_DISP_CLB_HELD_BAL 187 #define POWER8_PME_PM_DISP_CLB_HELD_RES 188 #define POWER8_PME_PM_DISP_CLB_HELD_SB 189 #define POWER8_PME_PM_DISP_CLB_HELD_SYNC 190 #define POWER8_PME_PM_DISP_CLB_HELD_TLBIE 191 #define POWER8_PME_PM_DISP_HELD 192 #define POWER8_PME_PM_DISP_HELD_IQ_FULL 193 #define POWER8_PME_PM_DISP_HELD_MAP_FULL 194 #define POWER8_PME_PM_DISP_HELD_SRQ_FULL 195 #define POWER8_PME_PM_DISP_HELD_SYNC_HOLD 196 #define POWER8_PME_PM_DISP_HOLD_GCT_FULL 197 #define POWER8_PME_PM_DISP_WT 198 #define POWER8_PME_PM_DPTEG_FROM_DL2L3_MOD 199 #define POWER8_PME_PM_DPTEG_FROM_DL2L3_SHR 200 #define POWER8_PME_PM_DPTEG_FROM_DL4 201 #define POWER8_PME_PM_DPTEG_FROM_DMEM 202 #define POWER8_PME_PM_DPTEG_FROM_L2 203 #define POWER8_PME_PM_DPTEG_FROM_L21_MOD 204 #define POWER8_PME_PM_DPTEG_FROM_L21_SHR 205 #define POWER8_PME_PM_DPTEG_FROM_L2MISS 206 #define POWER8_PME_PM_DPTEG_FROM_L2_DISP_CONFLICT_LDHITST 207 #define POWER8_PME_PM_DPTEG_FROM_L2_DISP_CONFLICT_OTHER 208 #define POWER8_PME_PM_DPTEG_FROM_L2_MEPF 209 #define POWER8_PME_PM_DPTEG_FROM_L2_NO_CONFLICT 210 #define POWER8_PME_PM_DPTEG_FROM_L3 211 #define POWER8_PME_PM_DPTEG_FROM_L31_ECO_MOD 212 #define POWER8_PME_PM_DPTEG_FROM_L31_ECO_SHR 213 #define POWER8_PME_PM_DPTEG_FROM_L31_MOD 214 #define POWER8_PME_PM_DPTEG_FROM_L31_SHR 215 #define POWER8_PME_PM_DPTEG_FROM_L3MISS 216 #define POWER8_PME_PM_DPTEG_FROM_L3_DISP_CONFLICT 217 #define POWER8_PME_PM_DPTEG_FROM_L3_MEPF 218 #define POWER8_PME_PM_DPTEG_FROM_L3_NO_CONFLICT 219 #define POWER8_PME_PM_DPTEG_FROM_LL4 
220 #define POWER8_PME_PM_DPTEG_FROM_LMEM 221 #define POWER8_PME_PM_DPTEG_FROM_MEMORY 222 #define POWER8_PME_PM_DPTEG_FROM_OFF_CHIP_CACHE 223 #define POWER8_PME_PM_DPTEG_FROM_ON_CHIP_CACHE 224 #define POWER8_PME_PM_DPTEG_FROM_RL2L3_MOD 225 #define POWER8_PME_PM_DPTEG_FROM_RL2L3_SHR 226 #define POWER8_PME_PM_DPTEG_FROM_RL4 227 #define POWER8_PME_PM_DPTEG_FROM_RMEM 228 #define POWER8_PME_PM_DSLB_MISS 229 #define POWER8_PME_PM_DTLB_MISS 230 #define POWER8_PME_PM_DTLB_MISS_16G 231 #define POWER8_PME_PM_DTLB_MISS_16M 232 #define POWER8_PME_PM_DTLB_MISS_4K 233 #define POWER8_PME_PM_DTLB_MISS_64K 234 #define POWER8_PME_PM_EAT_FORCE_MISPRED 235 #define POWER8_PME_PM_EAT_FULL_CYC 236 #define POWER8_PME_PM_EE_OFF_EXT_INT 237 #define POWER8_PME_PM_EXT_INT 238 #define POWER8_PME_PM_FAV_TBEGIN 239 #define POWER8_PME_PM_FLOP 240 #define POWER8_PME_PM_FLOP_SUM_SCALAR 241 #define POWER8_PME_PM_FLOP_SUM_VEC 242 #define POWER8_PME_PM_FLUSH 243 #define POWER8_PME_PM_FLUSH_BR_MPRED 244 #define POWER8_PME_PM_FLUSH_COMPLETION 245 #define POWER8_PME_PM_FLUSH_DISP 246 #define POWER8_PME_PM_FLUSH_DISP_SB 247 #define POWER8_PME_PM_FLUSH_DISP_SYNC 248 #define POWER8_PME_PM_FLUSH_DISP_TLBIE 249 #define POWER8_PME_PM_FLUSH_LSU 250 #define POWER8_PME_PM_FLUSH_PARTIAL 251 #define POWER8_PME_PM_FPU0_FCONV 252 #define POWER8_PME_PM_FPU0_FEST 253 #define POWER8_PME_PM_FPU0_FRSP 254 #define POWER8_PME_PM_FPU1_FCONV 255 #define POWER8_PME_PM_FPU1_FEST 256 #define POWER8_PME_PM_FPU1_FRSP 257 #define POWER8_PME_PM_FREQ_DOWN 258 #define POWER8_PME_PM_FREQ_UP 259 #define POWER8_PME_PM_FUSION_TOC_GRP0_1 260 #define POWER8_PME_PM_FUSION_TOC_GRP0_2 261 #define POWER8_PME_PM_FUSION_TOC_GRP0_3 262 #define POWER8_PME_PM_FUSION_TOC_GRP1_1 263 #define POWER8_PME_PM_FUSION_VSX_GRP0_1 264 #define POWER8_PME_PM_FUSION_VSX_GRP0_2 265 #define POWER8_PME_PM_FUSION_VSX_GRP0_3 266 #define POWER8_PME_PM_FUSION_VSX_GRP1_1 267 #define POWER8_PME_PM_FXU0_BUSY_FXU1_IDLE 268 #define POWER8_PME_PM_FXU0_FIN 269 #define 
POWER8_PME_PM_FXU1_BUSY_FXU0_IDLE 270 #define POWER8_PME_PM_FXU1_FIN 271 #define POWER8_PME_PM_FXU_BUSY 272 #define POWER8_PME_PM_FXU_IDLE 273 #define POWER8_PME_PM_GCT_EMPTY_CYC 274 #define POWER8_PME_PM_GCT_MERGE 275 #define POWER8_PME_PM_GCT_NOSLOT_BR_MPRED 276 #define POWER8_PME_PM_GCT_NOSLOT_BR_MPRED_ICMISS 277 #define POWER8_PME_PM_GCT_NOSLOT_CYC 278 #define POWER8_PME_PM_GCT_NOSLOT_DISP_HELD_ISSQ 279 #define POWER8_PME_PM_GCT_NOSLOT_DISP_HELD_MAP 280 #define POWER8_PME_PM_GCT_NOSLOT_DISP_HELD_OTHER 281 #define POWER8_PME_PM_GCT_NOSLOT_DISP_HELD_SRQ 282 #define POWER8_PME_PM_GCT_NOSLOT_IC_L3MISS 283 #define POWER8_PME_PM_GCT_NOSLOT_IC_MISS 284 #define POWER8_PME_PM_GCT_UTIL_11_14_ENTRIES 285 #define POWER8_PME_PM_GCT_UTIL_15_17_ENTRIES 286 #define POWER8_PME_PM_GCT_UTIL_18_ENTRIES 287 #define POWER8_PME_PM_GCT_UTIL_1_2_ENTRIES 288 #define POWER8_PME_PM_GCT_UTIL_3_6_ENTRIES 289 #define POWER8_PME_PM_GCT_UTIL_7_10_ENTRIES 290 #define POWER8_PME_PM_GRP_BR_MPRED_NONSPEC 291 #define POWER8_PME_PM_GRP_CMPL 292 #define POWER8_PME_PM_GRP_DISP 293 #define POWER8_PME_PM_GRP_IC_MISS_NONSPEC 294 #define POWER8_PME_PM_GRP_MRK 295 #define POWER8_PME_PM_GRP_NON_FULL_GROUP 296 #define POWER8_PME_PM_GRP_PUMP_CPRED 297 #define POWER8_PME_PM_GRP_PUMP_MPRED 298 #define POWER8_PME_PM_GRP_PUMP_MPRED_RTY 299 #define POWER8_PME_PM_GRP_TERM_2ND_BRANCH 300 #define POWER8_PME_PM_GRP_TERM_FPU_AFTER_BR 301 #define POWER8_PME_PM_GRP_TERM_NOINST 302 #define POWER8_PME_PM_GRP_TERM_OTHER 303 #define POWER8_PME_PM_GRP_TERM_SLOT_LIMIT 304 #define POWER8_PME_PM_HV_CYC 305 #define POWER8_PME_PM_IBUF_FULL_CYC 306 #define POWER8_PME_PM_IC_DEMAND_CYC 307 #define POWER8_PME_PM_IC_DEMAND_L2_BHT_REDIRECT 308 #define POWER8_PME_PM_IC_DEMAND_L2_BR_REDIRECT 309 #define POWER8_PME_PM_IC_DEMAND_REQ 310 #define POWER8_PME_PM_IC_INVALIDATE 311 #define POWER8_PME_PM_IC_PREF_CANCEL_HIT 312 #define POWER8_PME_PM_IC_PREF_CANCEL_L2 313 #define POWER8_PME_PM_IC_PREF_CANCEL_PAGE 314 #define 
POWER8_PME_PM_IC_PREF_REQ 315 #define POWER8_PME_PM_IC_PREF_WRITE 316 #define POWER8_PME_PM_IC_RELOAD_PRIVATE 317 #define POWER8_PME_PM_IERAT_RELOAD 318 #define POWER8_PME_PM_IERAT_RELOAD_16M 319 #define POWER8_PME_PM_IERAT_RELOAD_4K 320 #define POWER8_PME_PM_IERAT_RELOAD_64K 321 #define POWER8_PME_PM_IFETCH_THROTTLE 322 #define POWER8_PME_PM_IFU_L2_TOUCH 323 #define POWER8_PME_PM_INST_ALL_CHIP_PUMP_CPRED 324 #define POWER8_PME_PM_INST_ALL_FROM_DL2L3_MOD 325 #define POWER8_PME_PM_INST_ALL_FROM_DL2L3_SHR 326 #define POWER8_PME_PM_INST_ALL_FROM_DL4 327 #define POWER8_PME_PM_INST_ALL_FROM_DMEM 328 #define POWER8_PME_PM_INST_ALL_FROM_L2 329 #define POWER8_PME_PM_INST_ALL_FROM_L21_MOD 330 #define POWER8_PME_PM_INST_ALL_FROM_L21_SHR 331 #define POWER8_PME_PM_INST_ALL_FROM_L2MISS 332 #define POWER8_PME_PM_INST_ALL_FROM_L2_DISP_CONFLICT_LDHITST 333 #define POWER8_PME_PM_INST_ALL_FROM_L2_DISP_CONFLICT_OTHER 334 #define POWER8_PME_PM_INST_ALL_FROM_L2_MEPF 335 #define POWER8_PME_PM_INST_ALL_FROM_L2_NO_CONFLICT 336 #define POWER8_PME_PM_INST_ALL_FROM_L3 337 #define POWER8_PME_PM_INST_ALL_FROM_L31_ECO_MOD 338 #define POWER8_PME_PM_INST_ALL_FROM_L31_ECO_SHR 339 #define POWER8_PME_PM_INST_ALL_FROM_L31_MOD 340 #define POWER8_PME_PM_INST_ALL_FROM_L31_SHR 341 #define POWER8_PME_PM_INST_ALL_FROM_L3MISS_MOD 342 #define POWER8_PME_PM_INST_ALL_FROM_L3_DISP_CONFLICT 343 #define POWER8_PME_PM_INST_ALL_FROM_L3_MEPF 344 #define POWER8_PME_PM_INST_ALL_FROM_L3_NO_CONFLICT 345 #define POWER8_PME_PM_INST_ALL_FROM_LL4 346 #define POWER8_PME_PM_INST_ALL_FROM_LMEM 347 #define POWER8_PME_PM_INST_ALL_FROM_MEMORY 348 #define POWER8_PME_PM_INST_ALL_FROM_OFF_CHIP_CACHE 349 #define POWER8_PME_PM_INST_ALL_FROM_ON_CHIP_CACHE 350 #define POWER8_PME_PM_INST_ALL_FROM_RL2L3_MOD 351 #define POWER8_PME_PM_INST_ALL_FROM_RL2L3_SHR 352 #define POWER8_PME_PM_INST_ALL_FROM_RL4 353 #define POWER8_PME_PM_INST_ALL_FROM_RMEM 354 #define POWER8_PME_PM_INST_ALL_GRP_PUMP_CPRED 355 #define 
POWER8_PME_PM_INST_ALL_GRP_PUMP_MPRED 356 #define POWER8_PME_PM_INST_ALL_GRP_PUMP_MPRED_RTY 357 #define POWER8_PME_PM_INST_ALL_PUMP_CPRED 358 #define POWER8_PME_PM_INST_ALL_PUMP_MPRED 359 #define POWER8_PME_PM_INST_ALL_SYS_PUMP_CPRED 360 #define POWER8_PME_PM_INST_ALL_SYS_PUMP_MPRED 361 #define POWER8_PME_PM_INST_ALL_SYS_PUMP_MPRED_RTY 362 #define POWER8_PME_PM_INST_CHIP_PUMP_CPRED 363 #define POWER8_PME_PM_INST_CMPL 364 #define POWER8_PME_PM_INST_DISP 365 #define POWER8_PME_PM_INST_FROM_DL2L3_MOD 366 #define POWER8_PME_PM_INST_FROM_DL2L3_SHR 367 #define POWER8_PME_PM_INST_FROM_DL4 368 #define POWER8_PME_PM_INST_FROM_DMEM 369 #define POWER8_PME_PM_INST_FROM_L1 370 #define POWER8_PME_PM_INST_FROM_L2 371 #define POWER8_PME_PM_INST_FROM_L21_MOD 372 #define POWER8_PME_PM_INST_FROM_L21_SHR 373 #define POWER8_PME_PM_INST_FROM_L2MISS 374 #define POWER8_PME_PM_INST_FROM_L2_DISP_CONFLICT_LDHITST 375 #define POWER8_PME_PM_INST_FROM_L2_DISP_CONFLICT_OTHER 376 #define POWER8_PME_PM_INST_FROM_L2_MEPF 377 #define POWER8_PME_PM_INST_FROM_L2_NO_CONFLICT 378 #define POWER8_PME_PM_INST_FROM_L3 379 #define POWER8_PME_PM_INST_FROM_L31_ECO_MOD 380 #define POWER8_PME_PM_INST_FROM_L31_ECO_SHR 381 #define POWER8_PME_PM_INST_FROM_L31_MOD 382 #define POWER8_PME_PM_INST_FROM_L31_SHR 383 #define POWER8_PME_PM_INST_FROM_L3MISS 384 #define POWER8_PME_PM_INST_FROM_L3MISS_MOD 385 #define POWER8_PME_PM_INST_FROM_L3_DISP_CONFLICT 386 #define POWER8_PME_PM_INST_FROM_L3_MEPF 387 #define POWER8_PME_PM_INST_FROM_L3_NO_CONFLICT 388 #define POWER8_PME_PM_INST_FROM_LL4 389 #define POWER8_PME_PM_INST_FROM_LMEM 390 #define POWER8_PME_PM_INST_FROM_MEMORY 391 #define POWER8_PME_PM_INST_FROM_OFF_CHIP_CACHE 392 #define POWER8_PME_PM_INST_FROM_ON_CHIP_CACHE 393 #define POWER8_PME_PM_INST_FROM_RL2L3_MOD 394 #define POWER8_PME_PM_INST_FROM_RL2L3_SHR 395 #define POWER8_PME_PM_INST_FROM_RL4 396 #define POWER8_PME_PM_INST_FROM_RMEM 397 #define POWER8_PME_PM_INST_GRP_PUMP_CPRED 398 #define 
POWER8_PME_PM_INST_GRP_PUMP_MPRED 399
#define POWER8_PME_PM_INST_GRP_PUMP_MPRED_RTY 400
#define POWER8_PME_PM_INST_IMC_MATCH_CMPL 401
#define POWER8_PME_PM_INST_IMC_MATCH_DISP 402
#define POWER8_PME_PM_INST_PUMP_CPRED 403
#define POWER8_PME_PM_INST_PUMP_MPRED 404
#define POWER8_PME_PM_INST_SYS_PUMP_CPRED 405
#define POWER8_PME_PM_INST_SYS_PUMP_MPRED 406
#define POWER8_PME_PM_INST_SYS_PUMP_MPRED_RTY 407
#define POWER8_PME_PM_IOPS_CMPL 408
#define POWER8_PME_PM_IOPS_DISP 409
#define POWER8_PME_PM_IPTEG_FROM_DL2L3_MOD 410
#define POWER8_PME_PM_IPTEG_FROM_DL2L3_SHR 411
#define POWER8_PME_PM_IPTEG_FROM_DL4 412
#define POWER8_PME_PM_IPTEG_FROM_DMEM 413
#define POWER8_PME_PM_IPTEG_FROM_L2 414
#define POWER8_PME_PM_IPTEG_FROM_L21_MOD 415
#define POWER8_PME_PM_IPTEG_FROM_L21_SHR 416
#define POWER8_PME_PM_IPTEG_FROM_L2MISS 417
#define POWER8_PME_PM_IPTEG_FROM_L2_DISP_CONFLICT_LDHITST 418
#define POWER8_PME_PM_IPTEG_FROM_L2_DISP_CONFLICT_OTHER 419
#define POWER8_PME_PM_IPTEG_FROM_L2_MEPF 420
#define POWER8_PME_PM_IPTEG_FROM_L2_NO_CONFLICT 421
#define POWER8_PME_PM_IPTEG_FROM_L3 422
#define POWER8_PME_PM_IPTEG_FROM_L31_ECO_MOD 423
#define POWER8_PME_PM_IPTEG_FROM_L31_ECO_SHR 424
#define POWER8_PME_PM_IPTEG_FROM_L31_MOD 425
#define POWER8_PME_PM_IPTEG_FROM_L31_SHR 426
#define POWER8_PME_PM_IPTEG_FROM_L3MISS 427
#define POWER8_PME_PM_IPTEG_FROM_L3_DISP_CONFLICT 428
#define POWER8_PME_PM_IPTEG_FROM_L3_MEPF 429
#define POWER8_PME_PM_IPTEG_FROM_L3_NO_CONFLICT 430
#define POWER8_PME_PM_IPTEG_FROM_LL4 431
#define POWER8_PME_PM_IPTEG_FROM_LMEM 432
#define POWER8_PME_PM_IPTEG_FROM_MEMORY 433
#define POWER8_PME_PM_IPTEG_FROM_OFF_CHIP_CACHE 434
#define POWER8_PME_PM_IPTEG_FROM_ON_CHIP_CACHE 435
#define POWER8_PME_PM_IPTEG_FROM_RL2L3_MOD 436
#define POWER8_PME_PM_IPTEG_FROM_RL2L3_SHR 437
#define POWER8_PME_PM_IPTEG_FROM_RL4 438
#define POWER8_PME_PM_IPTEG_FROM_RMEM 439
#define POWER8_PME_PM_ISIDE_DISP 440
#define POWER8_PME_PM_ISIDE_DISP_FAIL 441
#define POWER8_PME_PM_ISIDE_DISP_FAIL_OTHER 442
#define POWER8_PME_PM_ISIDE_L2MEMACC 443
#define POWER8_PME_PM_ISIDE_MRU_TOUCH 444
#define POWER8_PME_PM_ISLB_MISS 445
#define POWER8_PME_PM_ISU_REF_FX0 446
#define POWER8_PME_PM_ISU_REF_FX1 447
#define POWER8_PME_PM_ISU_REF_FXU 448
#define POWER8_PME_PM_ISU_REF_LS0 449
#define POWER8_PME_PM_ISU_REF_LS1 450
#define POWER8_PME_PM_ISU_REF_LS2 451
#define POWER8_PME_PM_ISU_REF_LS3 452
#define POWER8_PME_PM_ISU_REJECTS_ALL 453
#define POWER8_PME_PM_ISU_REJECT_RES_NA 454
#define POWER8_PME_PM_ISU_REJECT_SAR_BYPASS 455
#define POWER8_PME_PM_ISU_REJECT_SRC_NA 456
#define POWER8_PME_PM_ISU_REJ_VS0 457
#define POWER8_PME_PM_ISU_REJ_VS1 458
#define POWER8_PME_PM_ISU_REJ_VSU 459
#define POWER8_PME_PM_ISYNC 460
#define POWER8_PME_PM_ITLB_MISS 461
#define POWER8_PME_PM_L1MISS_LAT_EXC_1024 462
#define POWER8_PME_PM_L1MISS_LAT_EXC_2048 463
#define POWER8_PME_PM_L1MISS_LAT_EXC_256 464
#define POWER8_PME_PM_L1MISS_LAT_EXC_32 465
#define POWER8_PME_PM_L1PF_L2MEMACC 466
#define POWER8_PME_PM_L1_DCACHE_RELOADED_ALL 467
#define POWER8_PME_PM_L1_DCACHE_RELOAD_VALID 468
#define POWER8_PME_PM_L1_DEMAND_WRITE 469
#define POWER8_PME_PM_L1_ICACHE_MISS 470
#define POWER8_PME_PM_L1_ICACHE_RELOADED_ALL 471
#define POWER8_PME_PM_L1_ICACHE_RELOADED_PREF 472
#define POWER8_PME_PM_L2_CASTOUT_MOD 473
#define POWER8_PME_PM_L2_CASTOUT_SHR 474
#define POWER8_PME_PM_L2_CHIP_PUMP 475
#define POWER8_PME_PM_L2_DC_INV 476
#define POWER8_PME_PM_L2_DISP_ALL_L2MISS 477
#define POWER8_PME_PM_L2_GROUP_PUMP 478
#define POWER8_PME_PM_L2_GRP_GUESS_CORRECT 479
#define POWER8_PME_PM_L2_GRP_GUESS_WRONG 480
#define POWER8_PME_PM_L2_IC_INV 481
#define POWER8_PME_PM_L2_INST 482
#define POWER8_PME_PM_L2_INST_MISS 483
#define POWER8_PME_PM_L2_LD 484
#define POWER8_PME_PM_L2_LD_DISP 485
#define POWER8_PME_PM_L2_LD_HIT 486
#define POWER8_PME_PM_L2_LD_MISS 487
#define POWER8_PME_PM_L2_LOC_GUESS_CORRECT 488
#define POWER8_PME_PM_L2_LOC_GUESS_WRONG 489
#define POWER8_PME_PM_L2_RCLD_DISP 490
#define POWER8_PME_PM_L2_RCLD_DISP_FAIL_ADDR 491
#define POWER8_PME_PM_L2_RCLD_DISP_FAIL_OTHER 492
#define POWER8_PME_PM_L2_RCST_DISP 493
#define POWER8_PME_PM_L2_RCST_DISP_FAIL_ADDR 494
#define POWER8_PME_PM_L2_RCST_DISP_FAIL_OTHER 495
#define POWER8_PME_PM_L2_RC_ST_DONE 496
#define POWER8_PME_PM_L2_RTY_LD 497
#define POWER8_PME_PM_L2_RTY_ST 498
#define POWER8_PME_PM_L2_SN_M_RD_DONE 499
#define POWER8_PME_PM_L2_SN_M_WR_DONE 500
#define POWER8_PME_PM_L2_SN_SX_I_DONE 501
#define POWER8_PME_PM_L2_ST 502
#define POWER8_PME_PM_L2_ST_DISP 503
#define POWER8_PME_PM_L2_ST_HIT 504
#define POWER8_PME_PM_L2_ST_MISS 505
#define POWER8_PME_PM_L2_SYS_GUESS_CORRECT 506
#define POWER8_PME_PM_L2_SYS_GUESS_WRONG 507
#define POWER8_PME_PM_L2_SYS_PUMP 508
#define POWER8_PME_PM_L2_TM_REQ_ABORT 509
#define POWER8_PME_PM_L2_TM_ST_ABORT_SISTER 510
#define POWER8_PME_PM_L3_CINJ 511
#define POWER8_PME_PM_L3_CI_HIT 512
#define POWER8_PME_PM_L3_CI_MISS 513
#define POWER8_PME_PM_L3_CI_USAGE 514
#define POWER8_PME_PM_L3_CO 515
#define POWER8_PME_PM_L3_CO0_ALLOC 516
#define POWER8_PME_PM_L3_CO0_BUSY 517
#define POWER8_PME_PM_L3_CO_L31 518
#define POWER8_PME_PM_L3_CO_LCO 519
#define POWER8_PME_PM_L3_CO_MEM 520
#define POWER8_PME_PM_L3_CO_MEPF 521
#define POWER8_PME_PM_L3_GRP_GUESS_CORRECT 522
#define POWER8_PME_PM_L3_GRP_GUESS_WRONG_HIGH 523
#define POWER8_PME_PM_L3_GRP_GUESS_WRONG_LOW 524
#define POWER8_PME_PM_L3_HIT 525
#define POWER8_PME_PM_L3_L2_CO_HIT 526
#define POWER8_PME_PM_L3_L2_CO_MISS 527
#define POWER8_PME_PM_L3_LAT_CI_HIT 528
#define POWER8_PME_PM_L3_LAT_CI_MISS 529
#define POWER8_PME_PM_L3_LD_HIT 530
#define POWER8_PME_PM_L3_LD_MISS 531
#define POWER8_PME_PM_L3_LD_PREF 532
#define POWER8_PME_PM_L3_LOC_GUESS_CORRECT 533
#define POWER8_PME_PM_L3_LOC_GUESS_WRONG 534
#define POWER8_PME_PM_L3_MISS 535
#define POWER8_PME_PM_L3_P0_CO_L31 536
#define POWER8_PME_PM_L3_P0_CO_MEM 537
#define POWER8_PME_PM_L3_P0_CO_RTY 538
#define POWER8_PME_PM_L3_P0_GRP_PUMP 539
#define POWER8_PME_PM_L3_P0_LCO_DATA 540
#define POWER8_PME_PM_L3_P0_LCO_NO_DATA 541
#define POWER8_PME_PM_L3_P0_LCO_RTY 542
#define POWER8_PME_PM_L3_P0_NODE_PUMP 543
#define POWER8_PME_PM_L3_P0_PF_RTY 544
#define POWER8_PME_PM_L3_P0_SN_HIT 545
#define POWER8_PME_PM_L3_P0_SN_INV 546
#define POWER8_PME_PM_L3_P0_SN_MISS 547
#define POWER8_PME_PM_L3_P0_SYS_PUMP 548
#define POWER8_PME_PM_L3_P1_CO_L31 549
#define POWER8_PME_PM_L3_P1_CO_MEM 550
#define POWER8_PME_PM_L3_P1_CO_RTY 551
#define POWER8_PME_PM_L3_P1_GRP_PUMP 552
#define POWER8_PME_PM_L3_P1_LCO_DATA 553
#define POWER8_PME_PM_L3_P1_LCO_NO_DATA 554
#define POWER8_PME_PM_L3_P1_LCO_RTY 555
#define POWER8_PME_PM_L3_P1_NODE_PUMP 556
#define POWER8_PME_PM_L3_P1_PF_RTY 557
#define POWER8_PME_PM_L3_P1_SN_HIT 558
#define POWER8_PME_PM_L3_P1_SN_INV 559
#define POWER8_PME_PM_L3_P1_SN_MISS 560
#define POWER8_PME_PM_L3_P1_SYS_PUMP 561
#define POWER8_PME_PM_L3_PF0_ALLOC 562
#define POWER8_PME_PM_L3_PF0_BUSY 563
#define POWER8_PME_PM_L3_PF_HIT_L3 564
#define POWER8_PME_PM_L3_PF_MISS_L3 565
#define POWER8_PME_PM_L3_PF_OFF_CHIP_CACHE 566
#define POWER8_PME_PM_L3_PF_OFF_CHIP_MEM 567
#define POWER8_PME_PM_L3_PF_ON_CHIP_CACHE 568
#define POWER8_PME_PM_L3_PF_ON_CHIP_MEM 569
#define POWER8_PME_PM_L3_PF_USAGE 570
#define POWER8_PME_PM_L3_PREF_ALL 571
#define POWER8_PME_PM_L3_RD0_ALLOC 572
#define POWER8_PME_PM_L3_RD0_BUSY 573
#define POWER8_PME_PM_L3_RD_USAGE 574
#define POWER8_PME_PM_L3_SN0_ALLOC 575
#define POWER8_PME_PM_L3_SN0_BUSY 576
#define POWER8_PME_PM_L3_SN_USAGE 577
#define POWER8_PME_PM_L3_ST_PREF 578
#define POWER8_PME_PM_L3_SW_PREF 579
#define POWER8_PME_PM_L3_SYS_GUESS_CORRECT 580
#define POWER8_PME_PM_L3_SYS_GUESS_WRONG 581
#define POWER8_PME_PM_L3_TRANS_PF 582
#define POWER8_PME_PM_L3_WI0_ALLOC 583
#define POWER8_PME_PM_L3_WI0_BUSY 584
#define POWER8_PME_PM_L3_WI_USAGE 585
#define POWER8_PME_PM_LARX_FIN 586
#define POWER8_PME_PM_LD_CMPL 587
#define POWER8_PME_PM_LD_L3MISS_PEND_CYC 588
#define POWER8_PME_PM_LD_MISS_L1 589
#define POWER8_PME_PM_LD_REF_L1 590
#define POWER8_PME_PM_LD_REF_L1_LSU0 591
#define POWER8_PME_PM_LD_REF_L1_LSU1 592
#define POWER8_PME_PM_LD_REF_L1_LSU2 593
#define POWER8_PME_PM_LD_REF_L1_LSU3 594
#define POWER8_PME_PM_LINK_STACK_INVALID_PTR 595
#define POWER8_PME_PM_LINK_STACK_WRONG_ADD_PRED 596
#define POWER8_PME_PM_LS0_ERAT_MISS_PREF 597
#define POWER8_PME_PM_LS0_L1_PREF 598
#define POWER8_PME_PM_LS0_L1_SW_PREF 599
#define POWER8_PME_PM_LS1_ERAT_MISS_PREF 600
#define POWER8_PME_PM_LS1_L1_PREF 601
#define POWER8_PME_PM_LS1_L1_SW_PREF 602
#define POWER8_PME_PM_LSU0_FLUSH_LRQ 603
#define POWER8_PME_PM_LSU0_FLUSH_SRQ 604
#define POWER8_PME_PM_LSU0_FLUSH_ULD 605
#define POWER8_PME_PM_LSU0_FLUSH_UST 606
#define POWER8_PME_PM_LSU0_L1_CAM_CANCEL 607
#define POWER8_PME_PM_LSU0_LARX_FIN 608
#define POWER8_PME_PM_LSU0_LMQ_LHR_MERGE 609
#define POWER8_PME_PM_LSU0_NCLD 610
#define POWER8_PME_PM_LSU0_PRIMARY_ERAT_HIT 611
#define POWER8_PME_PM_LSU0_REJECT 612
#define POWER8_PME_PM_LSU0_SRQ_STFWD 613
#define POWER8_PME_PM_LSU0_STORE_REJECT 614
#define POWER8_PME_PM_LSU0_TMA_REQ_L2 615
#define POWER8_PME_PM_LSU0_TM_L1_HIT 616
#define POWER8_PME_PM_LSU0_TM_L1_MISS 617
#define POWER8_PME_PM_LSU1_FLUSH_LRQ 618
#define POWER8_PME_PM_LSU1_FLUSH_SRQ 619
#define POWER8_PME_PM_LSU1_FLUSH_ULD 620
#define POWER8_PME_PM_LSU1_FLUSH_UST 621
#define POWER8_PME_PM_LSU1_L1_CAM_CANCEL 622
#define POWER8_PME_PM_LSU1_LARX_FIN 623
#define POWER8_PME_PM_LSU1_LMQ_LHR_MERGE 624
#define POWER8_PME_PM_LSU1_NCLD 625
#define POWER8_PME_PM_LSU1_PRIMARY_ERAT_HIT 626
#define POWER8_PME_PM_LSU1_REJECT 627
#define POWER8_PME_PM_LSU1_SRQ_STFWD 628
#define POWER8_PME_PM_LSU1_STORE_REJECT 629
#define POWER8_PME_PM_LSU1_TMA_REQ_L2 630
#define POWER8_PME_PM_LSU1_TM_L1_HIT 631
#define POWER8_PME_PM_LSU1_TM_L1_MISS 632
#define POWER8_PME_PM_LSU2_FLUSH_LRQ 633
#define POWER8_PME_PM_LSU2_FLUSH_SRQ 634
#define POWER8_PME_PM_LSU2_FLUSH_ULD 635
#define POWER8_PME_PM_LSU2_L1_CAM_CANCEL 636
#define POWER8_PME_PM_LSU2_LARX_FIN 637
#define POWER8_PME_PM_LSU2_LDF 638
#define POWER8_PME_PM_LSU2_LDX 639
#define POWER8_PME_PM_LSU2_LMQ_LHR_MERGE 640
#define POWER8_PME_PM_LSU2_PRIMARY_ERAT_HIT 641
#define POWER8_PME_PM_LSU2_REJECT 642
#define POWER8_PME_PM_LSU2_SRQ_STFWD 643
#define POWER8_PME_PM_LSU2_TMA_REQ_L2 644
#define POWER8_PME_PM_LSU2_TM_L1_HIT 645
#define POWER8_PME_PM_LSU2_TM_L1_MISS 646
#define POWER8_PME_PM_LSU3_FLUSH_LRQ 647
#define POWER8_PME_PM_LSU3_FLUSH_SRQ 648
#define POWER8_PME_PM_LSU3_FLUSH_ULD 649
#define POWER8_PME_PM_LSU3_L1_CAM_CANCEL 650
#define POWER8_PME_PM_LSU3_LARX_FIN 651
#define POWER8_PME_PM_LSU3_LDF 652
#define POWER8_PME_PM_LSU3_LDX 653
#define POWER8_PME_PM_LSU3_LMQ_LHR_MERGE 654
#define POWER8_PME_PM_LSU3_PRIMARY_ERAT_HIT 655
#define POWER8_PME_PM_LSU3_REJECT 656
#define POWER8_PME_PM_LSU3_SRQ_STFWD 657
#define POWER8_PME_PM_LSU3_TMA_REQ_L2 658
#define POWER8_PME_PM_LSU3_TM_L1_HIT 659
#define POWER8_PME_PM_LSU3_TM_L1_MISS 660
#define POWER8_PME_PM_LSU_DERAT_MISS 661
#define POWER8_PME_PM_LSU_ERAT_MISS_PREF 662
#define POWER8_PME_PM_LSU_FIN 663
#define POWER8_PME_PM_LSU_FLUSH_UST 664
#define POWER8_PME_PM_LSU_FOUR_TABLEWALK_CYC 665
#define POWER8_PME_PM_LSU_FX_FIN 666
#define POWER8_PME_PM_LSU_L1_PREF 667
#define POWER8_PME_PM_LSU_L1_SW_PREF 668
#define POWER8_PME_PM_LSU_LDF 669
#define POWER8_PME_PM_LSU_LDX 670
#define POWER8_PME_PM_LSU_LMQ_FULL_CYC 671
#define POWER8_PME_PM_LSU_LMQ_S0_ALLOC 672
#define POWER8_PME_PM_LSU_LMQ_S0_VALID 673
#define POWER8_PME_PM_LSU_LMQ_SRQ_EMPTY_ALL_CYC 674
#define POWER8_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 675
#define POWER8_PME_PM_LSU_LRQ_S0_ALLOC 676
#define POWER8_PME_PM_LSU_LRQ_S0_VALID 677
#define POWER8_PME_PM_LSU_LRQ_S43_ALLOC 678
#define POWER8_PME_PM_LSU_LRQ_S43_VALID 679
#define POWER8_PME_PM_LSU_MRK_DERAT_MISS 680
#define POWER8_PME_PM_LSU_NCLD 681
#define POWER8_PME_PM_LSU_NCST 682
#define POWER8_PME_PM_LSU_REJECT 683
#define POWER8_PME_PM_LSU_REJECT_ERAT_MISS 684
#define POWER8_PME_PM_LSU_REJECT_LHS 685
#define POWER8_PME_PM_LSU_REJECT_LMQ_FULL 686
#define POWER8_PME_PM_LSU_SET_MPRED 687
#define POWER8_PME_PM_LSU_SRQ_EMPTY_CYC 688
#define POWER8_PME_PM_LSU_SRQ_FULL_CYC 689
#define POWER8_PME_PM_LSU_SRQ_S0_ALLOC 690
#define POWER8_PME_PM_LSU_SRQ_S0_VALID 691
#define POWER8_PME_PM_LSU_SRQ_S39_ALLOC 692
#define POWER8_PME_PM_LSU_SRQ_S39_VALID 693
#define POWER8_PME_PM_LSU_SRQ_SYNC 694
#define POWER8_PME_PM_LSU_SRQ_SYNC_CYC 695
#define POWER8_PME_PM_LSU_STORE_REJECT 696
#define POWER8_PME_PM_LSU_TWO_TABLEWALK_CYC 697
#define POWER8_PME_PM_LWSYNC 698
#define POWER8_PME_PM_LWSYNC_HELD 699
#define POWER8_PME_PM_MEM_CO 700
#define POWER8_PME_PM_MEM_LOC_THRESH_IFU 701
#define POWER8_PME_PM_MEM_LOC_THRESH_LSU_HIGH 702
#define POWER8_PME_PM_MEM_LOC_THRESH_LSU_MED 703
#define POWER8_PME_PM_MEM_PREF 704
#define POWER8_PME_PM_MEM_READ 705
#define POWER8_PME_PM_MEM_RWITM 706
#define POWER8_PME_PM_MRK_BACK_BR_CMPL 707
#define POWER8_PME_PM_MRK_BRU_FIN 708
#define POWER8_PME_PM_MRK_BR_CMPL 709
#define POWER8_PME_PM_MRK_BR_MPRED_CMPL 710
#define POWER8_PME_PM_MRK_BR_TAKEN_CMPL 711
#define POWER8_PME_PM_MRK_CRU_FIN 712
#define POWER8_PME_PM_MRK_DATA_FROM_DL2L3_MOD 713
#define POWER8_PME_PM_MRK_DATA_FROM_DL2L3_MOD_CYC 714
#define POWER8_PME_PM_MRK_DATA_FROM_DL2L3_SHR 715
#define POWER8_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC 716
#define POWER8_PME_PM_MRK_DATA_FROM_DL4 717
#define POWER8_PME_PM_MRK_DATA_FROM_DL4_CYC 718
#define POWER8_PME_PM_MRK_DATA_FROM_DMEM 719
#define POWER8_PME_PM_MRK_DATA_FROM_DMEM_CYC 720
#define POWER8_PME_PM_MRK_DATA_FROM_L2 721
#define POWER8_PME_PM_MRK_DATA_FROM_L21_MOD 722
#define POWER8_PME_PM_MRK_DATA_FROM_L21_MOD_CYC 723
#define POWER8_PME_PM_MRK_DATA_FROM_L21_SHR 724
#define POWER8_PME_PM_MRK_DATA_FROM_L21_SHR_CYC 725
#define POWER8_PME_PM_MRK_DATA_FROM_L2MISS 726
#define POWER8_PME_PM_MRK_DATA_FROM_L2MISS_CYC 727
#define POWER8_PME_PM_MRK_DATA_FROM_L2_CYC 728
#define POWER8_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST 729
#define POWER8_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC 730
#define POWER8_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER 731
#define POWER8_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC 732
#define POWER8_PME_PM_MRK_DATA_FROM_L2_MEPF 733
#define POWER8_PME_PM_MRK_DATA_FROM_L2_MEPF_CYC 734
#define POWER8_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT 735
#define POWER8_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC 736
#define POWER8_PME_PM_MRK_DATA_FROM_L3 737
#define POWER8_PME_PM_MRK_DATA_FROM_L31_ECO_MOD 738
#define POWER8_PME_PM_MRK_DATA_FROM_L31_ECO_MOD_CYC 739
#define POWER8_PME_PM_MRK_DATA_FROM_L31_ECO_SHR 740
#define POWER8_PME_PM_MRK_DATA_FROM_L31_ECO_SHR_CYC 741
#define POWER8_PME_PM_MRK_DATA_FROM_L31_MOD 742
#define POWER8_PME_PM_MRK_DATA_FROM_L31_MOD_CYC 743
#define POWER8_PME_PM_MRK_DATA_FROM_L31_SHR 744
#define POWER8_PME_PM_MRK_DATA_FROM_L31_SHR_CYC 745
#define POWER8_PME_PM_MRK_DATA_FROM_L3MISS 746
#define POWER8_PME_PM_MRK_DATA_FROM_L3MISS_CYC 747
#define POWER8_PME_PM_MRK_DATA_FROM_L3_CYC 748
#define POWER8_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT 749
#define POWER8_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC 750
#define POWER8_PME_PM_MRK_DATA_FROM_L3_MEPF 751
#define POWER8_PME_PM_MRK_DATA_FROM_L3_MEPF_CYC 752
#define POWER8_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT 753
#define POWER8_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC 754
#define POWER8_PME_PM_MRK_DATA_FROM_LL4 755
#define POWER8_PME_PM_MRK_DATA_FROM_LL4_CYC 756
#define POWER8_PME_PM_MRK_DATA_FROM_LMEM 757
#define POWER8_PME_PM_MRK_DATA_FROM_LMEM_CYC 758
#define POWER8_PME_PM_MRK_DATA_FROM_MEM 759
#define POWER8_PME_PM_MRK_DATA_FROM_MEMORY 760
#define POWER8_PME_PM_MRK_DATA_FROM_MEMORY_CYC 761
#define POWER8_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE 762
#define POWER8_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC 763
#define POWER8_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE 764
#define POWER8_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC 765
#define POWER8_PME_PM_MRK_DATA_FROM_RL2L3_MOD 766
#define POWER8_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC 767
#define POWER8_PME_PM_MRK_DATA_FROM_RL2L3_SHR 768
#define POWER8_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC 769
#define POWER8_PME_PM_MRK_DATA_FROM_RL4 770
#define POWER8_PME_PM_MRK_DATA_FROM_RL4_CYC 771
#define POWER8_PME_PM_MRK_DATA_FROM_RMEM 772
#define POWER8_PME_PM_MRK_DATA_FROM_RMEM_CYC 773
#define POWER8_PME_PM_MRK_DCACHE_RELOAD_INTV 774
#define POWER8_PME_PM_MRK_DERAT_MISS 775
#define POWER8_PME_PM_MRK_DERAT_MISS_16G 776
#define POWER8_PME_PM_MRK_DERAT_MISS_16M 777
#define POWER8_PME_PM_MRK_DERAT_MISS_4K 778
#define POWER8_PME_PM_MRK_DERAT_MISS_64K 779
#define POWER8_PME_PM_MRK_DFU_FIN 780
#define POWER8_PME_PM_MRK_DPTEG_FROM_DL2L3_MOD 781
#define POWER8_PME_PM_MRK_DPTEG_FROM_DL2L3_SHR 782
#define POWER8_PME_PM_MRK_DPTEG_FROM_DL4 783
#define POWER8_PME_PM_MRK_DPTEG_FROM_DMEM 784
#define POWER8_PME_PM_MRK_DPTEG_FROM_L2 785
#define POWER8_PME_PM_MRK_DPTEG_FROM_L21_MOD 786
#define POWER8_PME_PM_MRK_DPTEG_FROM_L21_SHR 787
#define POWER8_PME_PM_MRK_DPTEG_FROM_L2MISS 788
#define POWER8_PME_PM_MRK_DPTEG_FROM_L2_DISP_CONFLICT_LDHITST 789
#define POWER8_PME_PM_MRK_DPTEG_FROM_L2_DISP_CONFLICT_OTHER 790
#define POWER8_PME_PM_MRK_DPTEG_FROM_L2_MEPF 791
#define POWER8_PME_PM_MRK_DPTEG_FROM_L2_NO_CONFLICT 792
#define POWER8_PME_PM_MRK_DPTEG_FROM_L3 793
#define POWER8_PME_PM_MRK_DPTEG_FROM_L31_ECO_MOD 794
#define POWER8_PME_PM_MRK_DPTEG_FROM_L31_ECO_SHR 795
#define POWER8_PME_PM_MRK_DPTEG_FROM_L31_MOD 796
#define POWER8_PME_PM_MRK_DPTEG_FROM_L31_SHR 797
#define POWER8_PME_PM_MRK_DPTEG_FROM_L3MISS 798
#define POWER8_PME_PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT 799
#define POWER8_PME_PM_MRK_DPTEG_FROM_L3_MEPF 800
#define POWER8_PME_PM_MRK_DPTEG_FROM_L3_NO_CONFLICT 801
#define POWER8_PME_PM_MRK_DPTEG_FROM_LL4 802
#define POWER8_PME_PM_MRK_DPTEG_FROM_LMEM 803
#define POWER8_PME_PM_MRK_DPTEG_FROM_MEMORY 804
#define POWER8_PME_PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE 805
#define POWER8_PME_PM_MRK_DPTEG_FROM_ON_CHIP_CACHE 806
#define POWER8_PME_PM_MRK_DPTEG_FROM_RL2L3_MOD 807
#define POWER8_PME_PM_MRK_DPTEG_FROM_RL2L3_SHR 808
#define POWER8_PME_PM_MRK_DPTEG_FROM_RL4 809
#define POWER8_PME_PM_MRK_DPTEG_FROM_RMEM 810
#define POWER8_PME_PM_MRK_DTLB_MISS 811
#define POWER8_PME_PM_MRK_DTLB_MISS_16G 812
#define POWER8_PME_PM_MRK_DTLB_MISS_16M 813
#define POWER8_PME_PM_MRK_DTLB_MISS_4K 814
#define POWER8_PME_PM_MRK_DTLB_MISS_64K 815
#define POWER8_PME_PM_MRK_FAB_RSP_BKILL 816
#define POWER8_PME_PM_MRK_FAB_RSP_BKILL_CYC 817
#define POWER8_PME_PM_MRK_FAB_RSP_CLAIM_RTY 818
#define POWER8_PME_PM_MRK_FAB_RSP_DCLAIM 819
#define POWER8_PME_PM_MRK_FAB_RSP_DCLAIM_CYC 820
#define POWER8_PME_PM_MRK_FAB_RSP_MATCH 821
#define POWER8_PME_PM_MRK_FAB_RSP_MATCH_CYC 822
#define POWER8_PME_PM_MRK_FAB_RSP_RD_RTY 823
#define POWER8_PME_PM_MRK_FAB_RSP_RD_T_INTV 824
#define POWER8_PME_PM_MRK_FAB_RSP_RWITM_CYC 825
#define POWER8_PME_PM_MRK_FAB_RSP_RWITM_RTY 826
#define POWER8_PME_PM_MRK_FILT_MATCH 827
#define POWER8_PME_PM_MRK_FIN_STALL_CYC 828
#define POWER8_PME_PM_MRK_FXU_FIN 829
#define POWER8_PME_PM_MRK_GRP_CMPL 830
#define POWER8_PME_PM_MRK_GRP_IC_MISS 831
#define POWER8_PME_PM_MRK_GRP_NTC 832
#define POWER8_PME_PM_MRK_INST_CMPL 833
#define POWER8_PME_PM_MRK_INST_DECODED 834
#define POWER8_PME_PM_MRK_INST_DISP 835
#define POWER8_PME_PM_MRK_INST_FIN 836
#define POWER8_PME_PM_MRK_INST_FROM_L3MISS 837
#define POWER8_PME_PM_MRK_INST_ISSUED 838
#define POWER8_PME_PM_MRK_INST_TIMEO 839
#define POWER8_PME_PM_MRK_L1_ICACHE_MISS 840
#define POWER8_PME_PM_MRK_L1_RELOAD_VALID 841
#define POWER8_PME_PM_MRK_L2_RC_DISP 842
#define POWER8_PME_PM_MRK_L2_RC_DONE 843
#define POWER8_PME_PM_MRK_LARX_FIN 844
#define POWER8_PME_PM_MRK_LD_MISS_EXPOSED 845
#define POWER8_PME_PM_MRK_LD_MISS_EXPOSED_CYC 846
#define POWER8_PME_PM_MRK_LD_MISS_L1 847
#define POWER8_PME_PM_MRK_LD_MISS_L1_CYC 848
#define POWER8_PME_PM_MRK_LSU_FIN 849
#define POWER8_PME_PM_MRK_LSU_FLUSH 850
#define POWER8_PME_PM_MRK_LSU_FLUSH_LRQ 851
#define POWER8_PME_PM_MRK_LSU_FLUSH_SRQ 852
#define POWER8_PME_PM_MRK_LSU_FLUSH_ULD 853
#define POWER8_PME_PM_MRK_LSU_FLUSH_UST 854
#define POWER8_PME_PM_MRK_LSU_REJECT 855
#define POWER8_PME_PM_MRK_LSU_REJECT_ERAT_MISS 856
#define POWER8_PME_PM_MRK_NTF_FIN 857
#define POWER8_PME_PM_MRK_RUN_CYC 858
#define POWER8_PME_PM_MRK_SRC_PREF_TRACK_EFF 859
#define POWER8_PME_PM_MRK_SRC_PREF_TRACK_INEFF 860
#define POWER8_PME_PM_MRK_SRC_PREF_TRACK_MOD 861
#define POWER8_PME_PM_MRK_SRC_PREF_TRACK_MOD_L2 862
#define POWER8_PME_PM_MRK_SRC_PREF_TRACK_MOD_L3 863
#define POWER8_PME_PM_MRK_STALL_CMPLU_CYC 864
#define POWER8_PME_PM_MRK_STCX_FAIL 865
#define POWER8_PME_PM_MRK_ST_CMPL 866
#define POWER8_PME_PM_MRK_ST_CMPL_INT 867
#define POWER8_PME_PM_MRK_ST_DRAIN_TO_L2DISP_CYC 868
#define POWER8_PME_PM_MRK_ST_FWD 869
#define POWER8_PME_PM_MRK_ST_L2DISP_TO_CMPL_CYC 870
#define POWER8_PME_PM_MRK_ST_NEST 871
#define POWER8_PME_PM_MRK_TGT_PREF_TRACK_EFF 872
#define POWER8_PME_PM_MRK_TGT_PREF_TRACK_INEFF 873
#define POWER8_PME_PM_MRK_TGT_PREF_TRACK_MOD 874
#define POWER8_PME_PM_MRK_TGT_PREF_TRACK_MOD_L2 875
#define POWER8_PME_PM_MRK_TGT_PREF_TRACK_MOD_L3 876
#define POWER8_PME_PM_MRK_VSU_FIN 877
#define POWER8_PME_PM_MULT_MRK 878
#define POWER8_PME_PM_NESTED_TEND 879
#define POWER8_PME_PM_NEST_REF_CLK 880
#define POWER8_PME_PM_NON_FAV_TBEGIN 881
#define POWER8_PME_PM_NON_TM_RST_SC 882
#define POWER8_PME_PM_NTCG_ALL_FIN 883
#define POWER8_PME_PM_OUTER_TBEGIN 884
#define POWER8_PME_PM_OUTER_TEND 885
#define POWER8_PME_PM_PMC1_OVERFLOW 886
#define POWER8_PME_PM_PMC2_OVERFLOW 887
#define POWER8_PME_PM_PMC2_REWIND 888
#define POWER8_PME_PM_PMC2_SAVED 889
#define POWER8_PME_PM_PMC3_OVERFLOW 890
#define POWER8_PME_PM_PMC4_OVERFLOW 891
#define POWER8_PME_PM_PMC4_REWIND 892
#define POWER8_PME_PM_PMC4_SAVED 893
#define POWER8_PME_PM_PMC5_OVERFLOW 894
#define POWER8_PME_PM_PMC6_OVERFLOW 895
#define POWER8_PME_PM_PREF_TRACKED 896
#define POWER8_PME_PM_PREF_TRACK_EFF 897
#define POWER8_PME_PM_PREF_TRACK_INEFF 898
#define POWER8_PME_PM_PREF_TRACK_MOD 899
#define POWER8_PME_PM_PREF_TRACK_MOD_L2 900
#define POWER8_PME_PM_PREF_TRACK_MOD_L3 901
#define POWER8_PME_PM_PROBE_NOP_DISP 902
#define POWER8_PME_PM_PTE_PREFETCH 903
#define POWER8_PME_PM_PUMP_CPRED 904
#define POWER8_PME_PM_PUMP_MPRED 905
#define POWER8_PME_PM_RC0_ALLOC 906
#define POWER8_PME_PM_RC0_BUSY 907
#define POWER8_PME_PM_RC_LIFETIME_EXC_1024 908
#define POWER8_PME_PM_RC_LIFETIME_EXC_2048 909
#define POWER8_PME_PM_RC_LIFETIME_EXC_256 910
#define POWER8_PME_PM_RC_LIFETIME_EXC_32 911
#define POWER8_PME_PM_RC_USAGE 912
#define POWER8_PME_PM_RD_CLEARING_SC 913
#define POWER8_PME_PM_RD_FORMING_SC 914
#define POWER8_PME_PM_RD_HIT_PF 915
#define POWER8_PME_PM_REAL_SRQ_FULL 916
#define POWER8_PME_PM_RUN_CYC 917
#define POWER8_PME_PM_RUN_CYC_SMT2_MODE 918
#define POWER8_PME_PM_RUN_CYC_SMT2_SHRD_MODE 919
#define POWER8_PME_PM_RUN_CYC_SMT2_SPLIT_MODE 920
#define POWER8_PME_PM_RUN_CYC_SMT4_MODE 921
#define POWER8_PME_PM_RUN_CYC_SMT8_MODE 922
#define POWER8_PME_PM_RUN_CYC_ST_MODE 923
#define POWER8_PME_PM_RUN_INST_CMPL 924
#define POWER8_PME_PM_RUN_PURR 925
#define POWER8_PME_PM_RUN_SPURR 926
#define POWER8_PME_PM_SEC_ERAT_HIT 927
#define POWER8_PME_PM_SHL_CREATED 928
#define POWER8_PME_PM_SHL_ST_CONVERT 929
#define POWER8_PME_PM_SHL_ST_DISABLE 930
#define POWER8_PME_PM_SN0_ALLOC 931
#define POWER8_PME_PM_SN0_BUSY 932
#define POWER8_PME_PM_SNOOP_TLBIE 933
#define POWER8_PME_PM_SNP_TM_HIT_M 934
#define POWER8_PME_PM_SNP_TM_HIT_T 935
#define POWER8_PME_PM_SN_USAGE 936
#define POWER8_PME_PM_STALL_END_GCT_EMPTY 937
#define POWER8_PME_PM_STCX_FAIL 938
#define POWER8_PME_PM_STCX_LSU 939
#define POWER8_PME_PM_ST_CAUSED_FAIL 940
#define POWER8_PME_PM_ST_CMPL 941
#define POWER8_PME_PM_ST_FIN 942
#define POWER8_PME_PM_ST_FWD 943
#define POWER8_PME_PM_ST_MISS_L1 944
#define POWER8_PME_PM_SUSPENDED 945
#define POWER8_PME_PM_SWAP_CANCEL 946
#define POWER8_PME_PM_SWAP_CANCEL_GPR 947
#define POWER8_PME_PM_SWAP_COMPLETE 948
#define POWER8_PME_PM_SWAP_COMPLETE_GPR 949
#define POWER8_PME_PM_SYNC_MRK_BR_LINK 950
#define POWER8_PME_PM_SYNC_MRK_BR_MPRED 951
#define POWER8_PME_PM_SYNC_MRK_FX_DIVIDE 952
#define POWER8_PME_PM_SYNC_MRK_L2HIT 953
#define POWER8_PME_PM_SYNC_MRK_L2MISS 954
#define POWER8_PME_PM_SYNC_MRK_L3MISS 955
#define POWER8_PME_PM_SYNC_MRK_PROBE_NOP 956
#define POWER8_PME_PM_SYS_PUMP_CPRED 957
#define POWER8_PME_PM_SYS_PUMP_MPRED 958
#define POWER8_PME_PM_SYS_PUMP_MPRED_RTY 959
#define POWER8_PME_PM_TABLEWALK_CYC 960
#define POWER8_PME_PM_TABLEWALK_CYC_PREF 961
#define POWER8_PME_PM_TABORT_TRECLAIM 962
#define POWER8_PME_PM_TB_BIT_TRANS 963
#define POWER8_PME_PM_TEND_PEND_CYC 964
#define POWER8_PME_PM_THRD_ALL_RUN_CYC 965
#define POWER8_PME_PM_THRD_CONC_RUN_INST 966
#define POWER8_PME_PM_THRD_GRP_CMPL_BOTH_CYC 967
#define POWER8_PME_PM_THRD_PRIO_0_1_CYC 968
#define POWER8_PME_PM_THRD_PRIO_2_3_CYC 969
#define POWER8_PME_PM_THRD_PRIO_4_5_CYC 970
#define POWER8_PME_PM_THRD_PRIO_6_7_CYC 971
#define POWER8_PME_PM_THRD_REBAL_CYC 972
#define POWER8_PME_PM_THRESH_EXC_1024 973
#define POWER8_PME_PM_THRESH_EXC_128 974
#define POWER8_PME_PM_THRESH_EXC_2048 975
#define POWER8_PME_PM_THRESH_EXC_256 976
#define POWER8_PME_PM_THRESH_EXC_32 977
#define POWER8_PME_PM_THRESH_EXC_4096 978
#define POWER8_PME_PM_THRESH_EXC_512 979
#define POWER8_PME_PM_THRESH_EXC_64 980
#define POWER8_PME_PM_THRESH_MET 981
#define POWER8_PME_PM_THRESH_NOT_MET 982
#define POWER8_PME_PM_TLBIE_FIN 983
#define POWER8_PME_PM_TLB_MISS 984
#define POWER8_PME_PM_TM_BEGIN_ALL 985
#define POWER8_PME_PM_TM_CAM_OVERFLOW 986
#define POWER8_PME_PM_TM_CAP_OVERFLOW 987
#define POWER8_PME_PM_TM_END_ALL 988
#define POWER8_PME_PM_TM_FAIL_CONF_NON_TM 989
#define POWER8_PME_PM_TM_FAIL_CON_TM 990
#define POWER8_PME_PM_TM_FAIL_DISALLOW 991
#define POWER8_PME_PM_TM_FAIL_FOOTPRINT_OVERFLOW 992
#define POWER8_PME_PM_TM_FAIL_NON_TX_CONFLICT 993
#define POWER8_PME_PM_TM_FAIL_SELF 994
#define POWER8_PME_PM_TM_FAIL_TLBIE 995
#define POWER8_PME_PM_TM_FAIL_TX_CONFLICT 996
#define POWER8_PME_PM_TM_FAV_CAUSED_FAIL 997
#define POWER8_PME_PM_TM_LD_CAUSED_FAIL 998
#define POWER8_PME_PM_TM_LD_CONF 999
#define POWER8_PME_PM_TM_RST_SC 1000
#define POWER8_PME_PM_TM_SC_CO 1001
#define POWER8_PME_PM_TM_ST_CAUSED_FAIL 1002
#define POWER8_PME_PM_TM_ST_CONF 1003
#define POWER8_PME_PM_TM_TBEGIN 1004
#define POWER8_PME_PM_TM_TRANS_RUN_CYC 1005
#define POWER8_PME_PM_TM_TRANS_RUN_INST 1006
#define POWER8_PME_PM_TM_TRESUME 1007
#define POWER8_PME_PM_TM_TSUSPEND 1008
#define POWER8_PME_PM_TM_TX_PASS_RUN_CYC 1009
#define POWER8_PME_PM_TM_TX_PASS_RUN_INST 1010
#define POWER8_PME_PM_UP_PREF_L3 1011
#define POWER8_PME_PM_UP_PREF_POINTER 1012
#define POWER8_PME_PM_VSU0_16FLOP 1013
#define POWER8_PME_PM_VSU0_1FLOP 1014
#define POWER8_PME_PM_VSU0_2FLOP 1015
#define POWER8_PME_PM_VSU0_4FLOP 1016
#define POWER8_PME_PM_VSU0_8FLOP 1017
#define POWER8_PME_PM_VSU0_COMPLEX_ISSUED 1018
#define POWER8_PME_PM_VSU0_CY_ISSUED 1019
#define POWER8_PME_PM_VSU0_DD_ISSUED 1020
#define POWER8_PME_PM_VSU0_DP_2FLOP 1021
#define POWER8_PME_PM_VSU0_DP_FMA 1022
#define POWER8_PME_PM_VSU0_DP_FSQRT_FDIV 1023
#define POWER8_PME_PM_VSU0_DQ_ISSUED 1024
#define POWER8_PME_PM_VSU0_EX_ISSUED 1025
#define POWER8_PME_PM_VSU0_FIN 1026
#define POWER8_PME_PM_VSU0_FMA 1027
#define POWER8_PME_PM_VSU0_FPSCR 1028
#define POWER8_PME_PM_VSU0_FSQRT_FDIV 1029
#define POWER8_PME_PM_VSU0_PERMUTE_ISSUED 1030
#define POWER8_PME_PM_VSU0_SCALAR_DP_ISSUED 1031
#define POWER8_PME_PM_VSU0_SIMPLE_ISSUED 1032
#define POWER8_PME_PM_VSU0_SINGLE 1033
#define POWER8_PME_PM_VSU0_SQ 1034
#define POWER8_PME_PM_VSU0_STF 1035
#define POWER8_PME_PM_VSU0_VECTOR_DP_ISSUED 1036
#define POWER8_PME_PM_VSU0_VECTOR_SP_ISSUED 1037
#define POWER8_PME_PM_VSU1_16FLOP 1038
#define POWER8_PME_PM_VSU1_1FLOP 1039
#define POWER8_PME_PM_VSU1_2FLOP 1040
#define POWER8_PME_PM_VSU1_4FLOP 1041
#define POWER8_PME_PM_VSU1_8FLOP 1042
#define POWER8_PME_PM_VSU1_COMPLEX_ISSUED 1043
#define POWER8_PME_PM_VSU1_CY_ISSUED 1044
#define POWER8_PME_PM_VSU1_DD_ISSUED 1045
#define POWER8_PME_PM_VSU1_DP_2FLOP 1046
#define POWER8_PME_PM_VSU1_DP_FMA 1047
#define POWER8_PME_PM_VSU1_DP_FSQRT_FDIV 1048
#define POWER8_PME_PM_VSU1_DQ_ISSUED 1049
#define POWER8_PME_PM_VSU1_EX_ISSUED 1050
#define POWER8_PME_PM_VSU1_FIN 1051
#define POWER8_PME_PM_VSU1_FMA 1052
#define POWER8_PME_PM_VSU1_FPSCR 1053
#define POWER8_PME_PM_VSU1_FSQRT_FDIV 1054
#define POWER8_PME_PM_VSU1_PERMUTE_ISSUED 1055
#define POWER8_PME_PM_VSU1_SCALAR_DP_ISSUED 1056
#define POWER8_PME_PM_VSU1_SIMPLE_ISSUED 1057
#define POWER8_PME_PM_VSU1_SINGLE 1058
#define POWER8_PME_PM_VSU1_SQ 1059
#define POWER8_PME_PM_VSU1_STF 1060
#define POWER8_PME_PM_VSU1_VECTOR_DP_ISSUED 1061
#define POWER8_PME_PM_VSU1_VECTOR_SP_ISSUED 1062

static const pme_power_entry_t power8_pe[] = {
[ POWER8_PME_PM_1LPAR_CYC ] = {
	.pme_name = "PM_1LPAR_CYC",
	.pme_code = 0x1f05e,
	.pme_short_desc = "Number of cycles in single lpar mode. All threads in the core are assigned to the same lpar",
	.pme_long_desc = "Number of cycles in single lpar mode.",
},
[ POWER8_PME_PM_1PLUS_PPC_CMPL ] = {
	.pme_name = "PM_1PLUS_PPC_CMPL",
	.pme_code = 0x100f2,
	.pme_short_desc = "1 or more ppc insts finished",
	.pme_long_desc = "1 or more ppc insts finished (completed).",
},
[ POWER8_PME_PM_1PLUS_PPC_DISP ] = {
	.pme_name = "PM_1PLUS_PPC_DISP",
	.pme_code = 0x400f2,
	.pme_short_desc = "Cycles at least one Instr Dispatched",
	.pme_long_desc = "Cycles at least one Instr Dispatched. Could be a group with only microcode. Issue HW016521",
},
[ POWER8_PME_PM_2LPAR_CYC ] = {
	.pme_name = "PM_2LPAR_CYC",
	.pme_code = 0x2006e,
	.pme_short_desc = "Cycles in 2-lpar mode. Threads 0-3 belong to Lpar0 and threads 4-7 belong to Lpar1",
	.pme_long_desc = "Number of cycles in 2 lpar mode.",
},
[ POWER8_PME_PM_4LPAR_CYC ] = {
	.pme_name = "PM_4LPAR_CYC",
	.pme_code = 0x4e05e,
	.pme_short_desc = "Number of cycles in 4 LPAR mode. Threads 0-1 belong to lpar0, threads 2-3 belong to lpar1, threads 4-5 belong to lpar2, and threads 6-7 belong to lpar3",
	.pme_long_desc = "Number of cycles in 4 LPAR mode.",
},
[ POWER8_PME_PM_ALL_CHIP_PUMP_CPRED ] = {
	.pme_name = "PM_ALL_CHIP_PUMP_CPRED",
	.pme_code = 0x610050,
	.pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for all data types (demand load,data prefetch,inst prefetch,inst fetch,xlate)",
	.pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was chip pump (prediction=correct) for all data types (demand load,data,inst prefetch,inst fetch,xlate (I or d)",
},
[ POWER8_PME_PM_ALL_GRP_PUMP_CPRED ] = {
	.pme_name = "PM_ALL_GRP_PUMP_CPRED",
	.pme_code = 0x520050,
	.pme_short_desc = "Initial and Final Pump Scope and data sourced across this scope was group pump for all data types (demand load,data prefetch,inst prefetch,inst fetch,xlate)",
	.pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was group pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)",
},
[ POWER8_PME_PM_ALL_GRP_PUMP_MPRED ] = {
	.pme_name = "PM_ALL_GRP_PUMP_MPRED",
	.pme_code = 0x620052,
	.pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for all data types (demand load,data prefetch,inst prefetch,inst fetch,xlate)",
	.pme_long_desc = "Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope OR Final Pump Scope(Group) got data from source that was at smaller scope(Chip) Final pump was group pump and initial pump was chip or final and initial pump was gro",
},
[ POWER8_PME_PM_ALL_GRP_PUMP_MPRED_RTY ] = {
	.pme_name = "PM_ALL_GRP_PUMP_MPRED_RTY",
	.pme_code = 0x610052,
	.pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for all data types (demand load,data prefetch,inst prefetch,inst fetch,xlate)",
	.pme_long_desc = "Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope (Chip) Final pump was group pump and initial pump was chip pumpfor all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)",
},
[ POWER8_PME_PM_ALL_PUMP_CPRED ] = {
	.pme_name = "PM_ALL_PUMP_CPRED",
	.pme_code = 0x610054,
	.pme_short_desc = "Pump prediction correct. Counts across all types of pumps for all data types (demand load,data prefetch,inst prefetch,inst fetch,xlate)",
	.pme_long_desc = "Pump prediction correct. Counts across all types of pumpsfor all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)",
},
[ POWER8_PME_PM_ALL_PUMP_MPRED ] = {
	.pme_name = "PM_ALL_PUMP_MPRED",
	.pme_code = 0x640052,
	.pme_short_desc = "Pump misprediction. Counts across all types of pumps for all data types (demand load,data prefetch,inst prefetch,inst fetch,xlate)",
	.pme_long_desc = "Pump Mis prediction Counts across all types of pumpsfor all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)",
},
[ POWER8_PME_PM_ALL_SYS_PUMP_CPRED ] = {
	.pme_name = "PM_ALL_SYS_PUMP_CPRED",
	.pme_code = 0x630050,
	.pme_short_desc = "Initial and Final Pump Scope was system pump for all data types (demand load,data prefetch,inst prefetch,inst fetch,xlate)",
	.pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was system pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)",
},
[ POWER8_PME_PM_ALL_SYS_PUMP_MPRED ] = {
	.pme_name = "PM_ALL_SYS_PUMP_MPRED",
	.pme_code = 0x630052,
	.pme_short_desc = "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. Counts for all data types (demand load,data prefetch,inst prefetch,inst fetch,xlate)",
	.pme_long_desc = "Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope(Chip/Group) OR Final Pump Scope(system) got data from source that was at smaller scope(Chip/group) Final pump was system pump and initial pump was chip or group or",
},
[ POWER8_PME_PM_ALL_SYS_PUMP_MPRED_RTY ] = {
	.pme_name = "PM_ALL_SYS_PUMP_MPRED_RTY",
	.pme_code = 0x640050,
	.pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for all data types (demand load,data prefetch,inst prefetch,inst fetch,xlate)",
	.pme_long_desc = "Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope (Chip or Group) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)",
},
[ POWER8_PME_PM_ANY_THRD_RUN_CYC ] = {
	.pme_name = "PM_ANY_THRD_RUN_CYC",
	.pme_code = 0x100fa,
	.pme_short_desc = "One of threads in run_cycles",
	.pme_long_desc = "Any thread in run_cycles (was one thread in run_cycles).",
},
[ POWER8_PME_PM_BACK_BR_CMPL ] = {
	.pme_name = "PM_BACK_BR_CMPL",
	.pme_code = 0x2505e,
	.pme_short_desc = "Branch instruction completed with a target address less than current instruction address",
	.pme_long_desc = "Branch instruction completed with a target address less than current instruction address.",
},
[ POWER8_PME_PM_BANK_CONFLICT ] = {
	.pme_name = "PM_BANK_CONFLICT",
	.pme_code = 0x4082,
	.pme_short_desc = "Read blocked due to interleave conflict. The ifar logic will detect an interleave conflict and kill the data that was read that cycle.",
	.pme_long_desc = "Read blocked due to interleave conflict. The ifar logic will detect an interleave conflict and kill the data that was read that cycle.",
},
[ POWER8_PME_PM_BRU_FIN ] = {
	.pme_name = "PM_BRU_FIN",
	.pme_code = 0x10068,
	.pme_short_desc = "Branch Instruction Finished",
	.pme_long_desc = "Branch Instruction Finished .",
},
[ POWER8_PME_PM_BR_2PATH ] = {
	.pme_name = "PM_BR_2PATH",
	.pme_code = 0x20036,
	.pme_short_desc = "two path branch",
	.pme_long_desc = "two path branch.",
},
[ POWER8_PME_PM_BR_BC_8 ] = {
	.pme_name = "PM_BR_BC_8",
	.pme_code = 0x5086,
	.pme_short_desc = "Pairable BC+8 branch that has not been converted to a Resolve Finished in the BRU pipeline",
	.pme_long_desc = "Pairable BC+8 branch that has not been converted to a Resolve Finished in the BRU pipeline",
},
[ POWER8_PME_PM_BR_BC_8_CONV ] = {
	.pme_name = "PM_BR_BC_8_CONV",
	.pme_code = 0x5084,
	.pme_short_desc = "Pairable BC+8 branch that was converted to a Resolve Finished in the BRU pipeline.",
	.pme_long_desc = "Pairable BC+8 branch that was converted to a Resolve Finished in the BRU pipeline.",
},
[ POWER8_PME_PM_BR_CMPL ] = {
	.pme_name = "PM_BR_CMPL",
	.pme_code = 0x40060,
	.pme_short_desc = "Branch Instruction completed",
	.pme_long_desc = "Branch Instruction completed.",
},
[ POWER8_PME_PM_BR_MPRED_CCACHE ] = {
	.pme_name = "PM_BR_MPRED_CCACHE",
	.pme_code = 0x40ac,
	.pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the Count Cache Target Prediction",
	.pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the Count Cache Target Prediction",
},
[ POWER8_PME_PM_BR_MPRED_CMPL ] = {
	.pme_name = "PM_BR_MPRED_CMPL",
	.pme_code = 0x400f6,
	.pme_short_desc = "Number of Branch Mispredicts",
	.pme_long_desc = "Number of Branch Mispredicts.",
},
[ POWER8_PME_PM_BR_MPRED_CR ] = {
	.pme_name = "PM_BR_MPRED_CR",
	.pme_code = 0x40b8,
	.pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the BHT Direction Prediction (taken/not taken).",
	.pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the BHT Direction Prediction (taken/not taken).",
},
[ POWER8_PME_PM_BR_MPRED_LSTACK ] = {
	.pme_name = "PM_BR_MPRED_LSTACK",
	.pme_code = 0x40ae,
	.pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the Link Stack Target Prediction",
	.pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the Link Stack Target Prediction",
},
[ POWER8_PME_PM_BR_MPRED_TA ] = {
	.pme_name = "PM_BR_MPRED_TA",
	.pme_code = 0x40ba,
	.pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the Target Address Prediction from the Count Cache or Link Stack. Only XL-form branches that resolved Taken set this event.",
	.pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the Target Address Prediction from the Count Cache or Link Stack. Only XL-form branches that resolved Taken set this event.",
},
[ POWER8_PME_PM_BR_MRK_2PATH ] = {
	.pme_name = "PM_BR_MRK_2PATH",
	.pme_code = 0x10138,
	.pme_short_desc = "marked two path branch",
	.pme_long_desc = "marked two path branch.",
},
[ POWER8_PME_PM_BR_PRED_BR0 ] = {
	.pme_name = "PM_BR_PRED_BR0",
	.pme_code = 0x409c,
	.pme_short_desc = "Conditional Branch Completed on BR0 (1st branch in group) in which the HW predicted the Direction or Target",
	.pme_long_desc = "Conditional Branch Completed on BR0 (1st branch in group) in which the HW predicted the Direction or Target",
},
[ POWER8_PME_PM_BR_PRED_BR1 ] = {
	.pme_name = "PM_BR_PRED_BR1",
	.pme_code = 0x409e,
	.pme_short_desc = "Conditional Branch Completed on BR1 (2nd branch in group) in which the HW predicted the Direction or Target. Note: BR1 can only be used in Single Thread Mode. In all of the SMT modes, only one branch can complete, thus BR1 is unused.",
	.pme_long_desc = "Conditional Branch Completed on BR1 (2nd branch in group) in which the HW predicted the Direction or Target. Note: BR1 can only be used in Single Thread Mode. In all of the SMT modes, only one branch can complete, thus BR1 is unused.",
},
[ POWER8_PME_PM_BR_PRED_BR_CMPL ] = {
	.pme_name = "PM_BR_PRED_BR_CMPL",
	.pme_code = 0x489c,
	.pme_short_desc = "Completion Time Event. This event can also be calculated from the direct bus as follows: if_pc_br0_br_pred(0) OR if_pc_br0_br_pred(1).",
	.pme_long_desc = "IFU",
},
[ POWER8_PME_PM_BR_PRED_CCACHE_BR0 ] = {
	.pme_name = "PM_BR_PRED_CCACHE_BR0",
	.pme_code = 0x40a4,
	.pme_short_desc = "Conditional Branch Completed on BR0 that used the Count Cache for Target Prediction",
	.pme_long_desc = "Conditional Branch Completed on BR0 that used the Count Cache for Target Prediction",
},
[ POWER8_PME_PM_BR_PRED_CCACHE_BR1 ] = {
	.pme_name = "PM_BR_PRED_CCACHE_BR1",
	.pme_code = 0x40a6,
	.pme_short_desc = "Conditional Branch Completed on BR1 that used the Count Cache for Target Prediction",
	.pme_long_desc = "Conditional Branch Completed on BR1 that used the Count Cache for Target Prediction",
},
[ POWER8_PME_PM_BR_PRED_CCACHE_CMPL ] = {
	.pme_name = "PM_BR_PRED_CCACHE_CMPL",
	.pme_code = 0x48a4,
	.pme_short_desc = "Completion Time Event. This event can also be calculated from the direct bus as follows: if_pc_br0_br_pred(0) AND if_pc_br0_pred_type.",
	.pme_long_desc = "IFU",
},
[ POWER8_PME_PM_BR_PRED_CR_BR0 ] = {
	.pme_name = "PM_BR_PRED_CR_BR0",
	.pme_code = 0x40b0,
	.pme_short_desc = "Conditional Branch Completed on BR0 that had its direction predicted. I-form branches do not set this event. In addition, B-form branches which do not use the BHT do not set this event - these are branches with BO-field set to 'always taken' and branches",
	.pme_long_desc = "Conditional Branch Completed on BR0 that had its direction predicted. I-form branches do not set this event.
In addition, B-form branches which do not use the BHT do not set this event - these are branches with BO-field set to 'always taken' and bra", }, [ POWER8_PME_PM_BR_PRED_CR_BR1 ] = { .pme_name = "PM_BR_PRED_CR_BR1", .pme_code = 0x40b2, .pme_short_desc = "Conditional Branch Completed on BR1 that had its direction predicted. I-form branches do not set this event. In addition, B-form branches which do not use the BHT do not set this event - these are branches with BO-field set to 'always taken' and branches", .pme_long_desc = "Conditional Branch Completed on BR1 that had its direction predicted. I-form branches do not set this event. In addition, B-form branches which do not use the BHT do not set this event - these are branches with BO-field set to 'always taken' and bra", }, [ POWER8_PME_PM_BR_PRED_CR_CMPL ] = { .pme_name = "PM_BR_PRED_CR_CMPL", .pme_code = 0x48b0, .pme_short_desc = "Completion Time Event. This event can also be calculated from the direct bus as follows: if_pc_br0_br_pred(1)='1'.", .pme_long_desc = "IFU", }, [ POWER8_PME_PM_BR_PRED_LSTACK_BR0 ] = { .pme_name = "PM_BR_PRED_LSTACK_BR0", .pme_code = 0x40a8, .pme_short_desc = "Conditional Branch Completed on BR0 that used the Link Stack for Target Prediction", .pme_long_desc = "Conditional Branch Completed on BR0 that used the Link Stack for Target Prediction", }, [ POWER8_PME_PM_BR_PRED_LSTACK_BR1 ] = { .pme_name = "PM_BR_PRED_LSTACK_BR1", .pme_code = 0x40aa, .pme_short_desc = "Conditional Branch Completed on BR1 that used the Link Stack for Target Prediction", .pme_long_desc = "Conditional Branch Completed on BR1 that used the Link Stack for Target Prediction", }, [ POWER8_PME_PM_BR_PRED_LSTACK_CMPL ] = { .pme_name = "PM_BR_PRED_LSTACK_CMPL", .pme_code = 0x48a8, .pme_short_desc = "Completion Time Event. 
This event can also be calculated from the direct bus as follows: if_pc_br0_br_pred(0) AND (not if_pc_br0_pred_type).", .pme_long_desc = "IFU", }, [ POWER8_PME_PM_BR_PRED_TA_BR0 ] = { .pme_name = "PM_BR_PRED_TA_BR0", .pme_code = 0x40b4, .pme_short_desc = "Conditional Branch Completed on BR0 that had its target address predicted. Only XL-form branches set this event.", .pme_long_desc = "Conditional Branch Completed on BR0 that had its target address predicted. Only XL-form branches set this event.", }, [ POWER8_PME_PM_BR_PRED_TA_BR1 ] = { .pme_name = "PM_BR_PRED_TA_BR1", .pme_code = 0x40b6, .pme_short_desc = "Conditional Branch Completed on BR1 that had its target address predicted. Only XL-form branches set this event.", .pme_long_desc = "Conditional Branch Completed on BR1 that had its target address predicted. Only XL-form branches set this event.", }, [ POWER8_PME_PM_BR_PRED_TA_CMPL ] = { .pme_name = "PM_BR_PRED_TA_CMPL", .pme_code = 0x48b4, .pme_short_desc = "Completion Time Event. This event can also be calculated from the direct bus as follows: if_pc_br0_br_pred(0)='1'.", .pme_long_desc = "IFU", }, [ POWER8_PME_PM_BR_TAKEN_CMPL ] = { .pme_name = "PM_BR_TAKEN_CMPL", .pme_code = 0x200fa, .pme_short_desc = "New event for Branch Taken", .pme_long_desc = "Branch Taken.", }, [ POWER8_PME_PM_BR_UNCOND_BR0 ] = { .pme_name = "PM_BR_UNCOND_BR0", .pme_code = 0x40a0, .pme_short_desc = "Unconditional Branch Completed on BR0. HW branch prediction was not used for this branch. This can be an I-form branch, a B-form branch with BO-field set to branch always, or a B-form branch which was converted to a Resolve.", .pme_long_desc = "Unconditional Branch Completed on BR0. HW branch prediction was not used for this branch. 
This can be an I-form branch, a B-form branch with BO-field set to branch always, or a B-form branch which was converted to a Resolve.", }, [ POWER8_PME_PM_BR_UNCOND_BR1 ] = { .pme_name = "PM_BR_UNCOND_BR1", .pme_code = 0x40a2, .pme_short_desc = "Unconditional Branch Completed on BR1. HW branch prediction was not used for this branch. This can be an I-form branch, a B-form branch with BO-field set to branch always, or a B-form branch which was converted to a Resolve.", .pme_long_desc = "Unconditional Branch Completed on BR1. HW branch prediction was not used for this branch. This can be an I-form branch, a B-form branch with BO-field set to branch always, or a B-form branch which was converted to a Resolve.", }, [ POWER8_PME_PM_BR_UNCOND_CMPL ] = { .pme_name = "PM_BR_UNCOND_CMPL", .pme_code = 0x48a0, .pme_short_desc = "Completion Time Event. This event can also be calculated from the direct bus as follows: if_pc_br0_br_pred=00 AND if_pc_br0_completed.", .pme_long_desc = "IFU", }, [ POWER8_PME_PM_CASTOUT_ISSUED ] = { .pme_name = "PM_CASTOUT_ISSUED", .pme_code = 0x3094, .pme_short_desc = "Castouts issued", .pme_long_desc = "Castouts issued", }, [ POWER8_PME_PM_CASTOUT_ISSUED_GPR ] = { .pme_name = "PM_CASTOUT_ISSUED_GPR", .pme_code = 0x3096, .pme_short_desc = "Castouts issued GPR", .pme_long_desc = "Castouts issued GPR", }, [ POWER8_PME_PM_CHIP_PUMP_CPRED ] = { .pme_name = "PM_CHIP_PUMP_CPRED", .pme_code = 0x10050, .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was chip pump (prediction=correct) for all data types (demand load,data,inst prefetch,inst fetch,xlate (I or d).", }, [ POWER8_PME_PM_CLB_HELD ] = { .pme_name = "PM_CLB_HELD", .pme_code = 0x2090, .pme_short_desc = "CLB Hold: Any Reason", .pme_long_desc = "CLB Hold: Any Reason", }, [ 
POWER8_PME_PM_CMPLU_STALL ] = { .pme_name = "PM_CMPLU_STALL", .pme_code = 0x4000a, .pme_short_desc = "Completion stall", .pme_long_desc = "Completion stall.", }, [ POWER8_PME_PM_CMPLU_STALL_BRU ] = { .pme_name = "PM_CMPLU_STALL_BRU", .pme_code = 0x4d018, .pme_short_desc = "Completion stall due to a Branch Unit", .pme_long_desc = "Completion stall due to a Branch Unit.", }, [ POWER8_PME_PM_CMPLU_STALL_BRU_CRU ] = { .pme_name = "PM_CMPLU_STALL_BRU_CRU", .pme_code = 0x2d018, .pme_short_desc = "Completion stall due to IFU", .pme_long_desc = "Completion stall due to IFU.", }, [ POWER8_PME_PM_CMPLU_STALL_COQ_FULL ] = { .pme_name = "PM_CMPLU_STALL_COQ_FULL", .pme_code = 0x30026, .pme_short_desc = "Completion stall due to CO q full", .pme_long_desc = "Completion stall due to CO q full.", }, [ POWER8_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { .pme_name = "PM_CMPLU_STALL_DCACHE_MISS", .pme_code = 0x2c012, .pme_short_desc = "Completion stall by Dcache miss", .pme_long_desc = "Completion stall by Dcache miss.", }, [ POWER8_PME_PM_CMPLU_STALL_DMISS_L21_L31 ] = { .pme_name = "PM_CMPLU_STALL_DMISS_L21_L31", .pme_code = 0x2c018, .pme_short_desc = "Completion stall by Dcache miss which resolved on chip (excluding local L2/L3)", .pme_long_desc = "Completion stall by Dcache miss which resolved on chip (excluding local L2/L3).", }, [ POWER8_PME_PM_CMPLU_STALL_DMISS_L2L3 ] = { .pme_name = "PM_CMPLU_STALL_DMISS_L2L3", .pme_code = 0x2c016, .pme_short_desc = "Completion stall by Dcache miss which resolved in L2/L3", .pme_long_desc = "Completion stall by Dcache miss which resolved in L2/L3.", }, [ POWER8_PME_PM_CMPLU_STALL_DMISS_L2L3_CONFLICT ] = { .pme_name = "PM_CMPLU_STALL_DMISS_L2L3_CONFLICT", .pme_code = 0x4c016, .pme_short_desc = "Completion stall due to cache miss that resolves in the L2 or L3 with a conflict", .pme_long_desc = "Completion stall due to cache miss resolving in core's L2/L3 with a conflict.", }, [ POWER8_PME_PM_CMPLU_STALL_DMISS_L3MISS ] = { .pme_name = 
"PM_CMPLU_STALL_DMISS_L3MISS", .pme_code = 0x4c01a, .pme_short_desc = "Completion stall due to cache miss resolving missed the L3", .pme_long_desc = "Completion stall due to cache miss resolving missed the L3.", }, [ POWER8_PME_PM_CMPLU_STALL_DMISS_LMEM ] = { .pme_name = "PM_CMPLU_STALL_DMISS_LMEM", .pme_code = 0x4c018, .pme_short_desc = "Completion stall due to cache miss that resolves in local memory", .pme_long_desc = "Completion stall due to cache miss resolving in core's Local Memory.", }, [ POWER8_PME_PM_CMPLU_STALL_DMISS_REMOTE ] = { .pme_name = "PM_CMPLU_STALL_DMISS_REMOTE", .pme_code = 0x2c01c, .pme_short_desc = "Completion stall by Dcache miss which resolved from remote chip (cache or memory)", .pme_long_desc = "Completion stall by Dcache miss which resolved on chip (excluding local L2/L3).", }, [ POWER8_PME_PM_CMPLU_STALL_ERAT_MISS ] = { .pme_name = "PM_CMPLU_STALL_ERAT_MISS", .pme_code = 0x4c012, .pme_short_desc = "Completion stall due to LSU reject ERAT miss", .pme_long_desc = "Completion stall due to LSU reject ERAT miss.", }, [ POWER8_PME_PM_CMPLU_STALL_FLUSH ] = { .pme_name = "PM_CMPLU_STALL_FLUSH", .pme_code = 0x30038, .pme_short_desc = "completion stall due to flush by own thread", .pme_long_desc = "completion stall due to flush by own thread.", }, [ POWER8_PME_PM_CMPLU_STALL_FXLONG ] = { .pme_name = "PM_CMPLU_STALL_FXLONG", .pme_code = 0x4d016, .pme_short_desc = "Completion stall due to a long latency fixed point instruction", .pme_long_desc = "Completion stall due to a long latency fixed point instruction.", }, [ POWER8_PME_PM_CMPLU_STALL_FXU ] = { .pme_name = "PM_CMPLU_STALL_FXU", .pme_code = 0x2d016, .pme_short_desc = "Completion stall due to FXU", .pme_long_desc = "Completion stall due to FXU.", }, [ POWER8_PME_PM_CMPLU_STALL_HWSYNC ] = { .pme_name = "PM_CMPLU_STALL_HWSYNC", .pme_code = 0x30036, .pme_short_desc = "completion stall due to hwsync", .pme_long_desc = "completion stall due to hwsync.", }, [ POWER8_PME_PM_CMPLU_STALL_LOAD_FINISH ] 
= { .pme_name = "PM_CMPLU_STALL_LOAD_FINISH", .pme_code = 0x4d014, .pme_short_desc = "Completion stall due to a Load finish", .pme_long_desc = "Completion stall due to a Load finish.", }, [ POWER8_PME_PM_CMPLU_STALL_LSU ] = { .pme_name = "PM_CMPLU_STALL_LSU", .pme_code = 0x2c010, .pme_short_desc = "Completion stall by LSU instruction", .pme_long_desc = "Completion stall by LSU instruction.", }, [ POWER8_PME_PM_CMPLU_STALL_LWSYNC ] = { .pme_name = "PM_CMPLU_STALL_LWSYNC", .pme_code = 0x10036, .pme_short_desc = "completion stall due to isync/lwsync", .pme_long_desc = "completion stall due to isync/lwsync.", }, [ POWER8_PME_PM_CMPLU_STALL_MEM_ECC_DELAY ] = { .pme_name = "PM_CMPLU_STALL_MEM_ECC_DELAY", .pme_code = 0x30028, .pme_short_desc = "Completion stall due to mem ECC delay", .pme_long_desc = "Completion stall due to mem ECC delay.", }, [ POWER8_PME_PM_CMPLU_STALL_NO_NTF ] = { .pme_name = "PM_CMPLU_STALL_NO_NTF", .pme_code = 0x2e01c, .pme_short_desc = "Completion stall due to nop", .pme_long_desc = "Completion stall due to nop.", }, [ POWER8_PME_PM_CMPLU_STALL_NTCG_FLUSH ] = { .pme_name = "PM_CMPLU_STALL_NTCG_FLUSH", .pme_code = 0x2e01e, .pme_short_desc = "Completion stall due to ntcg flush", .pme_long_desc = "Completion stall due to reject (load hit store).", }, [ POWER8_PME_PM_CMPLU_STALL_OTHER_CMPL ] = { .pme_name = "PM_CMPLU_STALL_OTHER_CMPL", .pme_code = 0x30006, .pme_short_desc = "Instructions core completed while this thread was stalled", .pme_long_desc = "Instructions core completed while this thread was stalled.", }, [ POWER8_PME_PM_CMPLU_STALL_REJECT ] = { .pme_name = "PM_CMPLU_STALL_REJECT", .pme_code = 0x4c010, .pme_short_desc = "Completion stall due to LSU reject", .pme_long_desc = "Completion stall due to LSU reject.", }, [ POWER8_PME_PM_CMPLU_STALL_REJECT_LHS ] = { .pme_name = "PM_CMPLU_STALL_REJECT_LHS", .pme_code = 0x2c01a, .pme_short_desc = "Completion stall due to reject (load hit store)", .pme_long_desc = "Completion stall due to reject (load
hit store).", }, [ POWER8_PME_PM_CMPLU_STALL_REJ_LMQ_FULL ] = { .pme_name = "PM_CMPLU_STALL_REJ_LMQ_FULL", .pme_code = 0x4c014, .pme_short_desc = "Completion stall due to LSU reject LMQ full", .pme_long_desc = "Completion stall due to LSU reject LMQ full.", }, [ POWER8_PME_PM_CMPLU_STALL_SCALAR ] = { .pme_name = "PM_CMPLU_STALL_SCALAR", .pme_code = 0x4d010, .pme_short_desc = "Completion stall due to VSU scalar instruction", .pme_long_desc = "Completion stall due to VSU scalar instruction.", }, [ POWER8_PME_PM_CMPLU_STALL_SCALAR_LONG ] = { .pme_name = "PM_CMPLU_STALL_SCALAR_LONG", .pme_code = 0x2d010, .pme_short_desc = "Completion stall due to VSU scalar long latency instruction", .pme_long_desc = "Completion stall due to VSU scalar long latency instruction.", }, [ POWER8_PME_PM_CMPLU_STALL_STORE ] = { .pme_name = "PM_CMPLU_STALL_STORE", .pme_code = 0x2c014, .pme_short_desc = "Completion stall by stores this includes store agen finishes in pipe LS0/LS1 and store data finishes in LS2/LS3", .pme_long_desc = "Completion stall by stores.", }, [ POWER8_PME_PM_CMPLU_STALL_ST_FWD ] = { .pme_name = "PM_CMPLU_STALL_ST_FWD", .pme_code = 0x4c01c, .pme_short_desc = "Completion stall due to store forward", .pme_long_desc = "Completion stall due to store forward.", }, [ POWER8_PME_PM_CMPLU_STALL_THRD ] = { .pme_name = "PM_CMPLU_STALL_THRD", .pme_code = 0x1001c, .pme_short_desc = "Completion Stalled due to thread conflict. 
Group ready to complete but it was another thread's turn", .pme_long_desc = "Completion stall due to thread conflict.", }, [ POWER8_PME_PM_CMPLU_STALL_VECTOR ] = { .pme_name = "PM_CMPLU_STALL_VECTOR", .pme_code = 0x2d014, .pme_short_desc = "Completion stall due to VSU vector instruction", .pme_long_desc = "Completion stall due to VSU vector instruction.", }, [ POWER8_PME_PM_CMPLU_STALL_VECTOR_LONG ] = { .pme_name = "PM_CMPLU_STALL_VECTOR_LONG", .pme_code = 0x4d012, .pme_short_desc = "Completion stall due to VSU vector long instruction", .pme_long_desc = "Completion stall due to VSU vector long instruction.", }, [ POWER8_PME_PM_CMPLU_STALL_VSU ] = { .pme_name = "PM_CMPLU_STALL_VSU", .pme_code = 0x2d012, .pme_short_desc = "Completion stall due to VSU instruction", .pme_long_desc = "Completion stall due to VSU instruction.", }, [ POWER8_PME_PM_CO0_ALLOC ] = { .pme_name = "PM_CO0_ALLOC", .pme_code = 0x16083, .pme_short_desc = "CO mach 0 Busy. Used by PMU to sample ave RC livetime(mach0 used as sample point)", .pme_long_desc = "0.0", }, [ POWER8_PME_PM_CO0_BUSY ] = { .pme_name = "PM_CO0_BUSY", .pme_code = 0x16082, .pme_short_desc = "CO mach 0 Busy. Used by PMU to sample ave RC livetime(mach0 used as sample point)", .pme_long_desc = "CO mach 0 Busy. 
Used by PMU to sample ave RC livetime(mach0 used as sample point)", }, [ POWER8_PME_PM_CO_DISP_FAIL ] = { .pme_name = "PM_CO_DISP_FAIL", .pme_code = 0x517082, .pme_short_desc = "CO dispatch failed due to all CO machines being busy", .pme_long_desc = "CO dispatch failed due to all CO machines being busy", }, [ POWER8_PME_PM_CO_TM_SC_FOOTPRINT ] = { .pme_name = "PM_CO_TM_SC_FOOTPRINT", .pme_code = 0x527084, .pme_short_desc = "L2 did a cleanifdirty CO to the L3 (ie created an SC line in the L3)", .pme_long_desc = "L2 did a cleanifdirty CO to the L3 (ie created an SC line in the L3)", }, [ POWER8_PME_PM_CO_USAGE ] = { .pme_name = "PM_CO_USAGE", .pme_code = 0x3608a, .pme_short_desc = "Continuous 16 cycle(2to1) window where this signal rotates thru sampling each L2 CO machine busy. PMU uses this wave to then do 16 cyc count to sample total number of machs running", .pme_long_desc = "Continuous 16 cycle(2to1) window where this signal rotates thru sampling each L2 CO machine busy. PMU uses this wave to then do 16 cyc count to sample total number of machs running", }, [ POWER8_PME_PM_CRU_FIN ] = { .pme_name = "PM_CRU_FIN", .pme_code = 0x40066, .pme_short_desc = "IFU Finished a (non-branch) instruction", .pme_long_desc = "IFU Finished a (non-branch) instruction.", }, [ POWER8_PME_PM_CYC ] = { .pme_name = "PM_CYC", .pme_code = 0x1e, .pme_short_desc = "Cycles", .pme_long_desc = "Cycles.", }, [ POWER8_PME_PM_DATA_ALL_CHIP_PUMP_CPRED ] = { .pme_name = "PM_DATA_ALL_CHIP_PUMP_CPRED", .pme_code = 0x61c050, .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for either demand loads or data prefetch", .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was chip pump (prediction=correct) for a demand load", }, [ POWER8_PME_PM_DATA_ALL_FROM_DL2L3_MOD ] = { .pme_name = "PM_DATA_ALL_FROM_DL2L3_MOD", .pme_code = 0x64c048, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2
or L3 on a different Node or Group (Distant), as this chip due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_DL2L3_SHR ] = { .pme_name = "PM_DATA_ALL_FROM_DL2L3_SHR", .pme_code = 0x63c048, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_DL4 ] = { .pme_name = "PM_DATA_ALL_FROM_DL4", .pme_code = 0x63c04c, .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_DMEM ] = { .pme_name = "PM_DATA_ALL_FROM_DMEM", .pme_code = 0x64c04c, .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_L2 ] = { .pme_name = "PM_DATA_ALL_FROM_L2", .pme_code = 0x61c042, .pme_short_desc = "The processor's 
data cache was reloaded from local core's L2 due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_L21_MOD ] = { .pme_name = "PM_DATA_ALL_FROM_L21_MOD", .pme_code = 0x64c046, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_L21_SHR ] = { .pme_name = "PM_DATA_ALL_FROM_L21_SHR", .pme_code = 0x63c046, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_L2MISS_MOD ] = { .pme_name = "PM_DATA_ALL_FROM_L2MISS_MOD", .pme_code = 0x61c04e, .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L2 due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L2 due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_L2_DISP_CONFLICT_LDHITST ] = { .pme_name = "PM_DATA_ALL_FROM_L2_DISP_CONFLICT_LDHITST", .pme_code = 0x63c040, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to either demand loads or data prefetch", .pme_long_desc = "The
processor's data cache was reloaded from local core's L2 with load hit store conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_L2_DISP_CONFLICT_OTHER ] = { .pme_name = "PM_DATA_ALL_FROM_L2_DISP_CONFLICT_OTHER", .pme_code = 0x64c040, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_L2_MEPF ] = { .pme_name = "PM_DATA_ALL_FROM_L2_MEPF", .pme_code = 0x62c040, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_L2_NO_CONFLICT ] = { .pme_name = "PM_DATA_ALL_FROM_L2_NO_CONFLICT", .pme_code = 0x61c040, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_L3 ] = { .pme_name = "PM_DATA_ALL_FROM_L3", .pme_code = 0x64c042, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ 
POWER8_PME_PM_DATA_ALL_FROM_L31_ECO_MOD ] = { .pme_name = "PM_DATA_ALL_FROM_L31_ECO_MOD", .pme_code = 0x64c044, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_L31_ECO_SHR ] = { .pme_name = "PM_DATA_ALL_FROM_L31_ECO_SHR", .pme_code = 0x63c044, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_L31_MOD ] = { .pme_name = "PM_DATA_ALL_FROM_L31_MOD", .pme_code = 0x62c044, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_L31_SHR ] = { .pme_name = "PM_DATA_ALL_FROM_L31_SHR", .pme_code = 0x61c046, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_L3MISS_MOD ] = { 
.pme_name = "PM_DATA_ALL_FROM_L3MISS_MOD", .pme_code = 0x64c04e, .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_L3_DISP_CONFLICT ] = { .pme_name = "PM_DATA_ALL_FROM_L3_DISP_CONFLICT", .pme_code = 0x63c042, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_L3_MEPF ] = { .pme_name = "PM_DATA_ALL_FROM_L3_MEPF", .pme_code = 0x62c042, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_L3_NO_CONFLICT ] = { .pme_name = "PM_DATA_ALL_FROM_L3_NO_CONFLICT", .pme_code = 0x61c044, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_LL4 ] = { .pme_name = "PM_DATA_ALL_FROM_LL4", .pme_code = 0x61c04c, .pme_short_desc = "The processor's data cache was reloaded from the local chip's L4 cache due
to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_LMEM ] = { .pme_name = "PM_DATA_ALL_FROM_LMEM", .pme_code = 0x62c048, .pme_short_desc = "The processor's data cache was reloaded from the local chip's Memory due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from the local chip's Memory due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_MEMORY ] = { .pme_name = "PM_DATA_ALL_FROM_MEMORY", .pme_code = 0x62c04c, .pme_short_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_OFF_CHIP_CACHE ] = { .pme_name = "PM_DATA_ALL_FROM_OFF_CHIP_CACHE", .pme_code = 0x64c04a, .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_ON_CHIP_CACHE ] = { .pme_name = "PM_DATA_ALL_FROM_ON_CHIP_CACHE", .pme_code = 0x61c048, .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to either demand loads or data prefetch", .pme_long_desc = "The 
processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_RL2L3_MOD ] = { .pme_name = "PM_DATA_ALL_FROM_RL2L3_MOD", .pme_code = 0x62c046, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_RL2L3_SHR ] = { .pme_name = "PM_DATA_ALL_FROM_RL2L3_SHR", .pme_code = 0x61c04a, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_RL4 ] = { .pme_name = "PM_DATA_ALL_FROM_RL4", .pme_code = 0x62c04a, .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_FROM_RMEM ] = { .pme_name = "PM_DATA_ALL_FROM_RMEM", .pme_code = 0x63c04a, .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group 
(Remote) due to either demand loads or data prefetch", .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Remote) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1", }, [ POWER8_PME_PM_DATA_ALL_GRP_PUMP_CPRED ] = { .pme_name = "PM_DATA_ALL_GRP_PUMP_CPRED", .pme_code = 0x62c050, .pme_short_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for either demand loads or data prefetch", .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was group pump for a demand load", }, [ POWER8_PME_PM_DATA_ALL_GRP_PUMP_MPRED ] = { .pme_name = "PM_DATA_ALL_GRP_PUMP_MPRED", .pme_code = 0x62c052, .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for either demand loads or data prefetch", .pme_long_desc = "Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope OR Final Pump Scope(Group) got data from source that was at smaller scope(Chip) Final pump was group pump and initial pump was chip or final and initial pump was gro", }, [ POWER8_PME_PM_DATA_ALL_GRP_PUMP_MPRED_RTY ] = { .pme_name = "PM_DATA_ALL_GRP_PUMP_MPRED_RTY", .pme_code = 0x61c052, .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for either demand loads or data prefetch", .pme_long_desc = "Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope (Chip) Final pump was group pump and initial pump was chip pump for a demand load", }, [ POWER8_PME_PM_DATA_ALL_PUMP_CPRED ] = { .pme_name = "PM_DATA_ALL_PUMP_CPRED", .pme_code = 0x61c054, .pme_short_desc = "Pump prediction correct. Counts across all types of pumps for either demand loads or data prefetch", .pme_long_desc = "Pump prediction correct. 
Counts across all types of pumps for a demand load", }, [ POWER8_PME_PM_DATA_ALL_PUMP_MPRED ] = { .pme_name = "PM_DATA_ALL_PUMP_MPRED", .pme_code = 0x64c052, .pme_short_desc = "Pump misprediction. Counts across all types of pumps for either demand loads or data prefetch", .pme_long_desc = "Pump misprediction. Counts across all types of pumps for a demand load", }, [ POWER8_PME_PM_DATA_ALL_SYS_PUMP_CPRED ] = { .pme_name = "PM_DATA_ALL_SYS_PUMP_CPRED", .pme_code = 0x63c050, .pme_short_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for either demand loads or data prefetch", .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was system pump for a demand load", }, [ POWER8_PME_PM_DATA_ALL_SYS_PUMP_MPRED ] = { .pme_name = "PM_DATA_ALL_SYS_PUMP_MPRED", .pme_code = 0x63c052, .pme_short_desc = "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. 
Counts for either demand loads or data prefetch", .pme_long_desc = "Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope(Chip/Group) OR Final Pump Scope(system) got data from source that was at smaller scope(Chip/group) Final pump was system pump and initial pump was chip or group or", }, [ POWER8_PME_PM_DATA_ALL_SYS_PUMP_MPRED_RTY ] = { .pme_name = "PM_DATA_ALL_SYS_PUMP_MPRED_RTY", .pme_code = 0x64c050, .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for either demand loads or data prefetch", .pme_long_desc = "Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope (Chip or Group) for a demand load", }, [ POWER8_PME_PM_DATA_CHIP_PUMP_CPRED ] = { .pme_name = "PM_DATA_CHIP_PUMP_CPRED", .pme_code = 0x1c050, .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for a demand load", .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was chip pump (prediction=correct) for a demand load.", }, [ POWER8_PME_PM_DATA_FROM_DL2L3_MOD ] = { .pme_name = "PM_DATA_FROM_DL2L3_MOD", .pme_code = 0x4c048, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_DL2L3_SHR ] = { .pme_name = "PM_DATA_FROM_DL2L3_SHR", .pme_code = 0x3c048, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 
or L3 on a different Node or Group (Distant), as this chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_DL4 ] = { .pme_name = "PM_DATA_FROM_DL4", .pme_code = 0x3c04c, .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_DMEM ] = { .pme_name = "PM_DATA_FROM_DMEM", .pme_code = 0x4c04c, .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_L2 ] = { .pme_name = "PM_DATA_FROM_L2", .pme_code = 0x1c042, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_L21_MOD ] = { .pme_name = "PM_DATA_FROM_L21_MOD", .pme_code = 0x4c046, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_L21_SHR ] = { .pme_name = "PM_DATA_FROM_L21_SHR", .pme_code = 0x3c046, .pme_short_desc = "The processor's data cache was reloaded 
with Shared (S) data from another core's L2 on the same chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_L2MISS ] = { .pme_name = "PM_DATA_FROM_L2MISS", .pme_code = 0x200fe, .pme_short_desc = "Demand LD - L2 Miss (not L2 hit)", .pme_long_desc = "Demand LD - L2 Miss (not L2 hit).", }, [ POWER8_PME_PM_DATA_FROM_L2MISS_MOD ] = { .pme_name = "PM_DATA_FROM_L2MISS_MOD", .pme_code = 0x1c04e, .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L2 due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L2 due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST ] = { .pme_name = "PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST", .pme_code = 0x3c040, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_L2_DISP_CONFLICT_OTHER ] = { .pme_name = "PM_DATA_FROM_L2_DISP_CONFLICT_OTHER", .pme_code = 0x4c040, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_L2_MEPF ] = { .pme_name = "PM_DATA_FROM_L2_MEPF", .pme_code = 0x2c040, .pme_short_desc = "The processor's data cache was reloaded from local core's 
L2 hit without dispatch conflicts on Mepf state due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_L2_NO_CONFLICT ] = { .pme_name = "PM_DATA_FROM_L2_NO_CONFLICT", .pme_code = 0x1c040, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_L3 ] = { .pme_name = "PM_DATA_FROM_L3", .pme_code = 0x4c042, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_L31_ECO_MOD ] = { .pme_name = "PM_DATA_FROM_L31_ECO_MOD", .pme_code = 0x4c044, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_L31_ECO_SHR ] = { .pme_name = "PM_DATA_FROM_L31_ECO_SHR", .pme_code = 0x3c044, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ 
POWER8_PME_PM_DATA_FROM_L31_MOD ] = { .pme_name = "PM_DATA_FROM_L31_MOD", .pme_code = 0x2c044, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_L31_SHR ] = { .pme_name = "PM_DATA_FROM_L31_SHR", .pme_code = 0x1c046, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_L3MISS ] = { .pme_name = "PM_DATA_FROM_L3MISS", .pme_code = 0x300fe, .pme_short_desc = "Demand LD - L3 Miss (not L2 hit and not L3 hit)", .pme_long_desc = "Demand LD - L3 Miss (not L2 hit and not L3 hit).", }, [ POWER8_PME_PM_DATA_FROM_L3MISS_MOD ] = { .pme_name = "PM_DATA_FROM_L3MISS_MOD", .pme_code = 0x4c04e, .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_L3_DISP_CONFLICT ] = { .pme_name = "PM_DATA_FROM_L3_DISP_CONFLICT", .pme_code = 0x3c042, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ 
POWER8_PME_PM_DATA_FROM_L3_MEPF ] = { .pme_name = "PM_DATA_FROM_L3_MEPF", .pme_code = 0x2c042, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_L3_NO_CONFLICT ] = { .pme_name = "PM_DATA_FROM_L3_NO_CONFLICT", .pme_code = 0x1c044, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_LL4 ] = { .pme_name = "PM_DATA_FROM_LL4", .pme_code = 0x1c04c, .pme_short_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_LMEM ] = { .pme_name = "PM_DATA_FROM_LMEM", .pme_code = 0x2c048, .pme_short_desc = "The processor's data cache was reloaded from the local chip's Memory due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from the local chip's Memory due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_MEM ] = { .pme_name = "PM_DATA_FROM_MEM", .pme_code = 0x400fe, .pme_short_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a demand load", .pme_long_desc = "Data cache reload from memory (including L4).", }, [ POWER8_PME_PM_DATA_FROM_MEMORY ] = { .pme_name = 
"PM_DATA_FROM_MEMORY", .pme_code = 0x2c04c, .pme_short_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_OFF_CHIP_CACHE ] = { .pme_name = "PM_DATA_FROM_OFF_CHIP_CACHE", .pme_code = 0x4c04a, .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a demand load", .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_ON_CHIP_CACHE ] = { .pme_name = "PM_DATA_FROM_ON_CHIP_CACHE", .pme_code = 0x1c048, .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_RL2L3_MOD ] = { .pme_name = "PM_DATA_FROM_RL2L3_MOD", .pme_code = 0x2c046, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_RL2L3_SHR ] = { 
.pme_name = "PM_DATA_FROM_RL2L3_SHR", .pme_code = 0x1c04a, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_RL4 ] = { .pme_name = "PM_DATA_FROM_RL4", .pme_code = 0x2c04a, .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_FROM_RMEM ] = { .pme_name = "PM_DATA_FROM_RMEM", .pme_code = 0x3c04a, .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Remote) due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Remote) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.", }, [ POWER8_PME_PM_DATA_GRP_PUMP_CPRED ] = { .pme_name = "PM_DATA_GRP_PUMP_CPRED", .pme_code = 0x2c050, .pme_short_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for a demand load", .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was group pump for a demand load.", }, [ POWER8_PME_PM_DATA_GRP_PUMP_MPRED ] = { .pme_name = "PM_DATA_GRP_PUMP_MPRED", .pme_code = 0x2c052, .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for a demand load", .pme_long_desc = "Final Pump Scope(Group) to get data sourced, ended up larger than 
Initial Pump Scope OR Final Pump Scope(Group) got data from source that was at smaller scope(Chip) Final pump was group pump and initial pump was chip or final and initial pump was gro", }, [ POWER8_PME_PM_DATA_GRP_PUMP_MPRED_RTY ] = { .pme_name = "PM_DATA_GRP_PUMP_MPRED_RTY", .pme_code = 0x1c052, .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", .pme_long_desc = "Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope (Chip) Final pump was group pump and initial pump was chip pump for a demand load.", }, [ POWER8_PME_PM_DATA_PUMP_CPRED ] = { .pme_name = "PM_DATA_PUMP_CPRED", .pme_code = 0x1c054, .pme_short_desc = "Pump prediction correct. Counts across all types of pumps for a demand load", .pme_long_desc = "Pump prediction correct. Counts across all types of pumps for a demand load.", }, [ POWER8_PME_PM_DATA_PUMP_MPRED ] = { .pme_name = "PM_DATA_PUMP_MPRED", .pme_code = 0x4c052, .pme_short_desc = "Pump misprediction. Counts across all types of pumps for a demand load", .pme_long_desc = "Pump misprediction. Counts across all types of pumps for a demand load.", }, [ POWER8_PME_PM_DATA_SYS_PUMP_CPRED ] = { .pme_name = "PM_DATA_SYS_PUMP_CPRED", .pme_code = 0x3c050, .pme_short_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for a demand load", .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was system pump for a demand load.", }, [ POWER8_PME_PM_DATA_SYS_PUMP_MPRED ] = { .pme_name = "PM_DATA_SYS_PUMP_MPRED", .pme_code = 0x3c052, .pme_short_desc = "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. 
Counts for a demand load", .pme_long_desc = "Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope(Chip/Group) OR Final Pump Scope(system) got data from source that was at smaller scope(Chip/group) Final pump was system pump and initial pump was chip or group or", }, [ POWER8_PME_PM_DATA_SYS_PUMP_MPRED_RTY ] = { .pme_name = "PM_DATA_SYS_PUMP_MPRED_RTY", .pme_code = 0x4c050, .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for a demand load", .pme_long_desc = "Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope (Chip or Group) for a demand load.", }, [ POWER8_PME_PM_DATA_TABLEWALK_CYC ] = { .pme_name = "PM_DATA_TABLEWALK_CYC", .pme_code = 0x3001a, .pme_short_desc = "Tablewalk Cycles (could be 1 or 2 active)", .pme_long_desc = "Data Tablewalk Active.", }, [ POWER8_PME_PM_DC_COLLISIONS ] = { .pme_name = "PM_DC_COLLISIONS", .pme_code = 0xe0bc, .pme_short_desc = "DATA Cache collisions", .pme_long_desc = "DATA Cache collisions", }, [ POWER8_PME_PM_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_DC_PREF_STREAM_ALLOC", .pme_code = 0x1e050, .pme_short_desc = "Stream marked valid. The stream could have been allocated through the hardware prefetch mechanism or through software. This is combined ls0 and ls1", .pme_long_desc = "Stream marked valid. The stream could have been allocated through the hardware prefetch mechanism or through software. This is combined ls0 and ls1.", }, [ POWER8_PME_PM_DC_PREF_STREAM_CONF ] = { .pme_name = "PM_DC_PREF_STREAM_CONF", .pme_code = 0x2e050, .pme_short_desc = "A demand load referenced a line in an active prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software. Combine up + down", .pme_long_desc = "A demand load referenced a line in an active prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software. 
Combine up + down.", }, [ POWER8_PME_PM_DC_PREF_STREAM_FUZZY_CONF ] = { .pme_name = "PM_DC_PREF_STREAM_FUZZY_CONF", .pme_code = 0x4e050, .pme_short_desc = "A demand load referenced a line in an active fuzzy prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software. Fuzzy stream confirm (out of order effects, or pf can't keep up)", .pme_long_desc = "A demand load referenced a line in an active fuzzy prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software. Fuzzy stream confirm (out of order effects, or pf can't keep up).", }, [ POWER8_PME_PM_DC_PREF_STREAM_STRIDED_CONF ] = { .pme_name = "PM_DC_PREF_STREAM_STRIDED_CONF", .pme_code = 0x3e050, .pme_short_desc = "A demand load referenced a line in an active strided prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software.", .pme_long_desc = "A demand load referenced a line in an active strided prefetch stream. 
The stream could have been allocated through the hardware prefetch mechanism or through software.", }, [ POWER8_PME_PM_DERAT_MISS_16G ] = { .pme_name = "PM_DERAT_MISS_16G", .pme_code = 0x4c054, .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 16G", .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 16G.", }, [ POWER8_PME_PM_DERAT_MISS_16M ] = { .pme_name = "PM_DERAT_MISS_16M", .pme_code = 0x3c054, .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 16M", .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 16M.", }, [ POWER8_PME_PM_DERAT_MISS_4K ] = { .pme_name = "PM_DERAT_MISS_4K", .pme_code = 0x1c056, .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 4K", .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 4K.", }, [ POWER8_PME_PM_DERAT_MISS_64K ] = { .pme_name = "PM_DERAT_MISS_64K", .pme_code = 0x2c054, .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 64K", .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 64K.", }, [ POWER8_PME_PM_DFU ] = { .pme_name = "PM_DFU", .pme_code = 0xb0ba, .pme_short_desc = "Finish DFU (all finish)", .pme_long_desc = "Finish DFU (all finish)", }, [ POWER8_PME_PM_DFU_DCFFIX ] = { .pme_name = "PM_DFU_DCFFIX", .pme_code = 0xb0be, .pme_short_desc = "Convert from fixed opcode finish (dcffix,dcffixq)", .pme_long_desc = "Convert from fixed opcode finish (dcffix,dcffixq)", }, [ POWER8_PME_PM_DFU_DENBCD ] = { .pme_name = "PM_DFU_DENBCD", .pme_code = 0xb0bc, .pme_short_desc = "BCD->DPD opcode finish (denbcd, denbcdq)", .pme_long_desc = "BCD->DPD opcode finish (denbcd, denbcdq)", }, [ POWER8_PME_PM_DFU_MC ] = { .pme_name = "PM_DFU_MC", .pme_code = 0xb0b8, .pme_short_desc = "Finish DFU multicycle", .pme_long_desc = "Finish DFU multicycle", }, [ POWER8_PME_PM_DISP_CLB_HELD_BAL ] = { .pme_name = "PM_DISP_CLB_HELD_BAL", .pme_code = 0x2092, .pme_short_desc = "Dispatch/CLB Hold: Balance", .pme_long_desc = "Dispatch/CLB Hold: Balance", }, [ 
POWER8_PME_PM_DISP_CLB_HELD_RES ] = { .pme_name = "PM_DISP_CLB_HELD_RES", .pme_code = 0x2094, .pme_short_desc = "Dispatch/CLB Hold: Resource", .pme_long_desc = "Dispatch/CLB Hold: Resource", }, [ POWER8_PME_PM_DISP_CLB_HELD_SB ] = { .pme_name = "PM_DISP_CLB_HELD_SB", .pme_code = 0x20a8, .pme_short_desc = "Dispatch/CLB Hold: Scoreboard", .pme_long_desc = "Dispatch/CLB Hold: Scoreboard", }, [ POWER8_PME_PM_DISP_CLB_HELD_SYNC ] = { .pme_name = "PM_DISP_CLB_HELD_SYNC", .pme_code = 0x2098, .pme_short_desc = "Dispatch/CLB Hold: Sync type instruction", .pme_long_desc = "Dispatch/CLB Hold: Sync type instruction", }, [ POWER8_PME_PM_DISP_CLB_HELD_TLBIE ] = { .pme_name = "PM_DISP_CLB_HELD_TLBIE", .pme_code = 0x2096, .pme_short_desc = "Dispatch Hold: Due to TLBIE", .pme_long_desc = "Dispatch Hold: Due to TLBIE", }, [ POWER8_PME_PM_DISP_HELD ] = { .pme_name = "PM_DISP_HELD", .pme_code = 0x10006, .pme_short_desc = "Dispatch Held", .pme_long_desc = "Dispatch Held.", }, [ POWER8_PME_PM_DISP_HELD_IQ_FULL ] = { .pme_name = "PM_DISP_HELD_IQ_FULL", .pme_code = 0x20006, .pme_short_desc = "Dispatch held due to Issue q full", .pme_long_desc = "Dispatch held due to Issue q full.", }, [ POWER8_PME_PM_DISP_HELD_MAP_FULL ] = { .pme_name = "PM_DISP_HELD_MAP_FULL", .pme_code = 0x1002a, .pme_short_desc = "Dispatch for this thread was held because the Mappers were full", .pme_long_desc = "Dispatch held due to Mapper full.", }, [ POWER8_PME_PM_DISP_HELD_SRQ_FULL ] = { .pme_name = "PM_DISP_HELD_SRQ_FULL", .pme_code = 0x30018, .pme_short_desc = "Dispatch held due to SRQ no room", .pme_long_desc = "Dispatch held due to SRQ no room.", }, [ POWER8_PME_PM_DISP_HELD_SYNC_HOLD ] = { .pme_name = "PM_DISP_HELD_SYNC_HOLD", .pme_code = 0x4003c, .pme_short_desc = "Dispatch held due to SYNC hold", .pme_long_desc = "Dispatch held due to SYNC hold.", }, [ POWER8_PME_PM_DISP_HOLD_GCT_FULL ] = { .pme_name = "PM_DISP_HOLD_GCT_FULL", .pme_code = 0x30a6, .pme_short_desc = "Dispatch Hold Due to no space in the GCT", 
.pme_long_desc = "Dispatch Hold Due to no space in the GCT", }, [ POWER8_PME_PM_DISP_WT ] = { .pme_name = "PM_DISP_WT", .pme_code = 0x30008, .pme_short_desc = "Dispatched Starved", .pme_long_desc = "Dispatched Starved (not held, nothing to dispatch).", }, [ POWER8_PME_PM_DPTEG_FROM_DL2L3_MOD ] = { .pme_name = "PM_DPTEG_FROM_DL2L3_MOD", .pme_code = 0x4e048, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_DL2L3_SHR ] = { .pme_name = "PM_DPTEG_FROM_DL2L3_SHR", .pme_code = 0x3e048, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_DL4 ] = { .pme_name = "PM_DPTEG_FROM_DL4", .pme_code = 0x3e04c, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_DMEM ] = { .pme_name = "PM_DPTEG_FROM_DMEM", .pme_code = 0x4e04c, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node 
or Group (Distant) due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_L2 ] = { .pme_name = "PM_DPTEG_FROM_L2", .pme_code = 0x1e042, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_L21_MOD ] = { .pme_name = "PM_DPTEG_FROM_L21_MOD", .pme_code = 0x4e046, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_L21_SHR ] = { .pme_name = "PM_DPTEG_FROM_L21_SHR", .pme_code = 0x3e046, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_L2MISS ] = { .pme_name = "PM_DPTEG_FROM_L2MISS", .pme_code = 0x1e04e, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_L2_DISP_CONFLICT_LDHITST ] = { .pme_name = "PM_DPTEG_FROM_L2_DISP_CONFLICT_LDHITST", .pme_code = 0x3e040, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 with load hit store conflict due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 with load hit store conflict due to a data side request.", }, [ 
POWER8_PME_PM_DPTEG_FROM_L2_DISP_CONFLICT_OTHER ] = { .pme_name = "PM_DPTEG_FROM_L2_DISP_CONFLICT_OTHER", .pme_code = 0x4e040, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 with dispatch conflict due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 with dispatch conflict due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_L2_MEPF ] = { .pme_name = "PM_DPTEG_FROM_L2_MEPF", .pme_code = 0x2e040, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_L2_NO_CONFLICT ] = { .pme_name = "PM_DPTEG_FROM_L2_NO_CONFLICT", .pme_code = 0x1e040, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_L3 ] = { .pme_name = "PM_DPTEG_FROM_L3", .pme_code = 0x4e042, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_L31_ECO_MOD ] = { .pme_name = "PM_DPTEG_FROM_L31_ECO_MOD", .pme_code = 0x4e044, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_L31_ECO_SHR ] = { .pme_name = 
"PM_DPTEG_FROM_L31_ECO_SHR", .pme_code = 0x3e044, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_L31_MOD ] = { .pme_name = "PM_DPTEG_FROM_L31_MOD", .pme_code = 0x2e044, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_L31_SHR ] = { .pme_name = "PM_DPTEG_FROM_L31_SHR", .pme_code = 0x1e046, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_L3MISS ] = { .pme_name = "PM_DPTEG_FROM_L3MISS", .pme_code = 0x4e04e, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_L3_DISP_CONFLICT ] = { .pme_name = "PM_DPTEG_FROM_L3_DISP_CONFLICT", .pme_code = 0x3e042, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_L3_MEPF ] = { .pme_name
= "PM_DPTEG_FROM_L3_MEPF", .pme_code = 0x2e042, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state. due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state. due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_L3_NO_CONFLICT ] = { .pme_name = "PM_DPTEG_FROM_L3_NO_CONFLICT", .pme_code = 0x1e044, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_LL4 ] = { .pme_name = "PM_DPTEG_FROM_LL4", .pme_code = 0x1e04c, .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_LMEM ] = { .pme_name = "PM_DPTEG_FROM_LMEM", .pme_code = 0x2e048, .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_MEMORY ] = { .pme_name = "PM_DPTEG_FROM_MEMORY", .pme_code = 0x2e04c, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_OFF_CHIP_CACHE ] = { .pme_name = "PM_DPTEG_FROM_OFF_CHIP_CACHE", .pme_code = 0x4e04a, .pme_short_desc = "A Page Table Entry was loaded into the 
TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_ON_CHIP_CACHE ] = { .pme_name = "PM_DPTEG_FROM_ON_CHIP_CACHE", .pme_code = 0x1e048, .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_RL2L3_MOD ] = { .pme_name = "PM_DPTEG_FROM_RL2L3_MOD", .pme_code = 0x2e046, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_RL2L3_SHR ] = { .pme_name = "PM_DPTEG_FROM_RL2L3_SHR", .pme_code = 0x1e04a, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_RL4 ] = { .pme_name = "PM_DPTEG_FROM_RL4", .pme_code = 0x2e04a, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to a data side request", .pme_long_desc = 
"A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to a data side request.", }, [ POWER8_PME_PM_DPTEG_FROM_RMEM ] = { .pme_name = "PM_DPTEG_FROM_RMEM", .pme_code = 0x3e04a, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to a data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to a data side request.", }, [ POWER8_PME_PM_DSLB_MISS ] = { .pme_name = "PM_DSLB_MISS", .pme_code = 0xd094, .pme_short_desc = "Data SLB Miss - Total of all segment sizes", .pme_long_desc = "Data SLB Miss - Total of all segment sizes. Data SLB misses", }, [ POWER8_PME_PM_DTLB_MISS ] = { .pme_name = "PM_DTLB_MISS", .pme_code = 0x300fc, .pme_short_desc = "Data PTEG reload", .pme_long_desc = "Data PTEG Reloaded (DTLB Miss).", }, [ POWER8_PME_PM_DTLB_MISS_16G ] = { .pme_name = "PM_DTLB_MISS_16G", .pme_code = 0x1c058, .pme_short_desc = "Data TLB Miss page size 16G", .pme_long_desc = "Data TLB Miss page size 16G.", }, [ POWER8_PME_PM_DTLB_MISS_16M ] = { .pme_name = "PM_DTLB_MISS_16M", .pme_code = 0x4c056, .pme_short_desc = "Data TLB Miss page size 16M", .pme_long_desc = "Data TLB Miss page size 16M.", }, [ POWER8_PME_PM_DTLB_MISS_4K ] = { .pme_name = "PM_DTLB_MISS_4K", .pme_code = 0x2c056, .pme_short_desc = "Data TLB Miss page size 4k", .pme_long_desc = "Data TLB Miss page size 4k.", }, [ POWER8_PME_PM_DTLB_MISS_64K ] = { .pme_name = "PM_DTLB_MISS_64K", .pme_code = 0x3c056, .pme_short_desc = "Data TLB Miss page size 64K", .pme_long_desc = "Data TLB Miss page size 64K.", }, [ POWER8_PME_PM_EAT_FORCE_MISPRED ] = { .pme_name = "PM_EAT_FORCE_MISPRED", .pme_code = 0x50a8, .pme_short_desc = "XL-form branch was mispredicted due to the predicted target address missing from EAT. The EAT forces a mispredict in this case since there is no predicted target to validate.
This is a rare case that may occur when the EAT is full and a branch is issue", .pme_long_desc = "XL-form branch was mispredicted due to the predicted target address missing from EAT. The EAT forces a mispredict in this case since there is no predicted target to validate. This is a rare case that may occur when the EAT is full and a branch is", }, [ POWER8_PME_PM_EAT_FULL_CYC ] = { .pme_name = "PM_EAT_FULL_CYC", .pme_code = 0x4084, .pme_short_desc = "Cycles No room in EAT", .pme_long_desc = "Cycles No room in EAT. Set on bank conflict and case where no ibuffers available.", }, [ POWER8_PME_PM_EE_OFF_EXT_INT ] = { .pme_name = "PM_EE_OFF_EXT_INT", .pme_code = 0x2080, .pme_short_desc = "Ee off and external interrupt", .pme_long_desc = "Ee off and external interrupt", }, [ POWER8_PME_PM_EXT_INT ] = { .pme_name = "PM_EXT_INT", .pme_code = 0x200f8, .pme_short_desc = "external interrupt", .pme_long_desc = "external interrupt.", }, [ POWER8_PME_PM_FAV_TBEGIN ] = { .pme_name = "PM_FAV_TBEGIN", .pme_code = 0x20b4, .pme_short_desc = "Dispatch time Favored tbegin", .pme_long_desc = "Dispatch time Favored tbegin", }, [ POWER8_PME_PM_FLOP ] = { .pme_name = "PM_FLOP", .pme_code = 0x100f4, .pme_short_desc = "Floating Point Operation Finished", .pme_long_desc = "Floating Point Operations Finished.", }, [ POWER8_PME_PM_FLOP_SUM_SCALAR ] = { .pme_name = "PM_FLOP_SUM_SCALAR", .pme_code = 0xa0ae, .pme_short_desc = "flops summary scalar instructions", .pme_long_desc = "flops summary scalar instructions", }, [ POWER8_PME_PM_FLOP_SUM_VEC ] = { .pme_name = "PM_FLOP_SUM_VEC", .pme_code = 0xa0ac, .pme_short_desc = "flops summary vector instructions", .pme_long_desc = "flops summary vector instructions", }, [ POWER8_PME_PM_FLUSH ] = { .pme_name = "PM_FLUSH", .pme_code = 0x400f8, .pme_short_desc = "Flush (any type)", .pme_long_desc = "Flush (any type).", }, [ POWER8_PME_PM_FLUSH_BR_MPRED ] = { .pme_name = "PM_FLUSH_BR_MPRED", .pme_code = 0x2084, .pme_short_desc = "Flush caused by branch
mispredict", .pme_long_desc = "Flush caused by branch mispredict", }, [ POWER8_PME_PM_FLUSH_COMPLETION ] = { .pme_name = "PM_FLUSH_COMPLETION", .pme_code = 0x30012, .pme_short_desc = "Completion Flush", .pme_long_desc = "Completion Flush.", }, [ POWER8_PME_PM_FLUSH_DISP ] = { .pme_name = "PM_FLUSH_DISP", .pme_code = 0x2082, .pme_short_desc = "Dispatch flush", .pme_long_desc = "Dispatch flush", }, [ POWER8_PME_PM_FLUSH_DISP_SB ] = { .pme_name = "PM_FLUSH_DISP_SB", .pme_code = 0x208c, .pme_short_desc = "Dispatch Flush: Scoreboard", .pme_long_desc = "Dispatch Flush: Scoreboard", }, [ POWER8_PME_PM_FLUSH_DISP_SYNC ] = { .pme_name = "PM_FLUSH_DISP_SYNC", .pme_code = 0x2088, .pme_short_desc = "Dispatch Flush: Sync", .pme_long_desc = "Dispatch Flush: Sync", }, [ POWER8_PME_PM_FLUSH_DISP_TLBIE ] = { .pme_name = "PM_FLUSH_DISP_TLBIE", .pme_code = 0x208a, .pme_short_desc = "Dispatch Flush: TLBIE", .pme_long_desc = "Dispatch Flush: TLBIE", }, [ POWER8_PME_PM_FLUSH_LSU ] = { .pme_name = "PM_FLUSH_LSU", .pme_code = 0x208e, .pme_short_desc = "Flush initiated by LSU", .pme_long_desc = "Flush initiated by LSU", }, [ POWER8_PME_PM_FLUSH_PARTIAL ] = { .pme_name = "PM_FLUSH_PARTIAL", .pme_code = 0x2086, .pme_short_desc = "Partial flush", .pme_long_desc = "Partial flush", }, [ POWER8_PME_PM_FPU0_FCONV ] = { .pme_name = "PM_FPU0_FCONV", .pme_code = 0xa0b0, .pme_short_desc = "Convert instruction executed", .pme_long_desc = "Convert instruction executed", }, [ POWER8_PME_PM_FPU0_FEST ] = { .pme_name = "PM_FPU0_FEST", .pme_code = 0xa0b8, .pme_short_desc = "Estimate instruction executed", .pme_long_desc = "Estimate instruction executed", }, [ POWER8_PME_PM_FPU0_FRSP ] = { .pme_name = "PM_FPU0_FRSP", .pme_code = 0xa0b4, .pme_short_desc = "Round to single precision instruction executed", .pme_long_desc = "Round to single precision instruction executed", }, [ POWER8_PME_PM_FPU1_FCONV ] = { .pme_name = "PM_FPU1_FCONV", .pme_code = 0xa0b2, .pme_short_desc = "Convert instruction executed", 
.pme_long_desc = "Convert instruction executed", }, [ POWER8_PME_PM_FPU1_FEST ] = { .pme_name = "PM_FPU1_FEST", .pme_code = 0xa0ba, .pme_short_desc = "Estimate instruction executed", .pme_long_desc = "Estimate instruction executed", }, [ POWER8_PME_PM_FPU1_FRSP ] = { .pme_name = "PM_FPU1_FRSP", .pme_code = 0xa0b6, .pme_short_desc = "Round to single precision instruction executed", .pme_long_desc = "Round to single precision instruction executed", }, [ POWER8_PME_PM_FREQ_DOWN ] = { .pme_name = "PM_FREQ_DOWN", .pme_code = 0x3000c, .pme_short_desc = "Power Management: Below Threshold B", .pme_long_desc = "Frequency is being slewed down due to Power Management.", }, [ POWER8_PME_PM_FREQ_UP ] = { .pme_name = "PM_FREQ_UP", .pme_code = 0x4000c, .pme_short_desc = "Power Management: Above Threshold A", .pme_long_desc = "Frequency is being slewed up due to Power Management.", }, [ POWER8_PME_PM_FUSION_TOC_GRP0_1 ] = { .pme_name = "PM_FUSION_TOC_GRP0_1", .pme_code = 0x50b0, .pme_short_desc = "One pair of instructions fused with TOC in Group0", .pme_long_desc = "One pair of instructions fused with TOC in Group0", }, [ POWER8_PME_PM_FUSION_TOC_GRP0_2 ] = { .pme_name = "PM_FUSION_TOC_GRP0_2", .pme_code = 0x50ae, .pme_short_desc = "Two pairs of instructions fused with TOC in Group0", .pme_long_desc = "Two pairs of instructions fused with TOC in Group0", }, [ POWER8_PME_PM_FUSION_TOC_GRP0_3 ] = { .pme_name = "PM_FUSION_TOC_GRP0_3", .pme_code = 0x50ac, .pme_short_desc = "Three pairs of instructions fused with TOC in Group0", .pme_long_desc = "Three pairs of instructions fused with TOC in Group0", }, [ POWER8_PME_PM_FUSION_TOC_GRP1_1 ] = { .pme_name = "PM_FUSION_TOC_GRP1_1", .pme_code = 0x50b2, .pme_short_desc = "One pair of instructions fused with TOC in Group1", .pme_long_desc = "One pair of instructions fused with TOC in Group1", }, [ POWER8_PME_PM_FUSION_VSX_GRP0_1 ] = { .pme_name = "PM_FUSION_VSX_GRP0_1", .pme_code = 0x50b8, .pme_short_desc = "One pair of instructions fused with
VSX in Group0", .pme_long_desc = "One pair of instructions fused with VSX in Group0", }, [ POWER8_PME_PM_FUSION_VSX_GRP0_2 ] = { .pme_name = "PM_FUSION_VSX_GRP0_2", .pme_code = 0x50b6, .pme_short_desc = "Two pairs of instructions fused with VSX in Group0", .pme_long_desc = "Two pairs of instructions fused with VSX in Group0", }, [ POWER8_PME_PM_FUSION_VSX_GRP0_3 ] = { .pme_name = "PM_FUSION_VSX_GRP0_3", .pme_code = 0x50b4, .pme_short_desc = "Three pairs of instructions fused with VSX in Group0", .pme_long_desc = "Three pairs of instructions fused with VSX in Group0", }, [ POWER8_PME_PM_FUSION_VSX_GRP1_1 ] = { .pme_name = "PM_FUSION_VSX_GRP1_1", .pme_code = 0x50ba, .pme_short_desc = "One pair of instructions fused with VSX in Group1", .pme_long_desc = "One pair of instructions fused with VSX in Group1", }, [ POWER8_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { .pme_name = "PM_FXU0_BUSY_FXU1_IDLE", .pme_code = 0x3000e, .pme_short_desc = "fxu0 busy and fxu1 idle", .pme_long_desc = "fxu0 busy and fxu1 idle.", }, [ POWER8_PME_PM_FXU0_FIN ] = { .pme_name = "PM_FXU0_FIN", .pme_code = 0x10004, .pme_short_desc = "The fixed point unit Unit 0 finished an instruction. Instructions that finish may not necessary complete.", .pme_long_desc = "FXU0 Finished.", }, [ POWER8_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { .pme_name = "PM_FXU1_BUSY_FXU0_IDLE", .pme_code = 0x4000e, .pme_short_desc = "fxu0 idle and fxu1 busy.", .pme_long_desc = "fxu0 idle and fxu1 busy. 
.", }, [ POWER8_PME_PM_FXU1_FIN ] = { .pme_name = "PM_FXU1_FIN", .pme_code = 0x40004, .pme_short_desc = "FXU1 Finished", .pme_long_desc = "FXU1 Finished.", }, [ POWER8_PME_PM_FXU_BUSY ] = { .pme_name = "PM_FXU_BUSY", .pme_code = 0x2000e, .pme_short_desc = "fxu0 busy and fxu1 busy.", .pme_long_desc = "fxu0 busy and fxu1 busy..", }, [ POWER8_PME_PM_FXU_IDLE ] = { .pme_name = "PM_FXU_IDLE", .pme_code = 0x1000e, .pme_short_desc = "fxu0 idle and fxu1 idle", .pme_long_desc = "fxu0 idle and fxu1 idle.", }, [ POWER8_PME_PM_GCT_EMPTY_CYC ] = { .pme_name = "PM_GCT_EMPTY_CYC", .pme_code = 0x20008, .pme_short_desc = "No itags assigned either thread (GCT Empty)", .pme_long_desc = "No itags assigned either thread (GCT Empty).", }, [ POWER8_PME_PM_GCT_MERGE ] = { .pme_name = "PM_GCT_MERGE", .pme_code = 0x30a4, .pme_short_desc = "Group dispatched on a merged GCT empty. GCT entries can be merged only within the same thread", .pme_long_desc = "Group dispatched on a merged GCT empty. GCT entries can be merged only within the same thread", }, [ POWER8_PME_PM_GCT_NOSLOT_BR_MPRED ] = { .pme_name = "PM_GCT_NOSLOT_BR_MPRED", .pme_code = 0x4d01e, .pme_short_desc = "Gct empty for this thread due to branch mispred", .pme_long_desc = "Gct empty for this thread due to branch mispred.", }, [ POWER8_PME_PM_GCT_NOSLOT_BR_MPRED_ICMISS ] = { .pme_name = "PM_GCT_NOSLOT_BR_MPRED_ICMISS", .pme_code = 0x4d01a, .pme_short_desc = "Gct empty for this thread due to Icache Miss and branch mispred", .pme_long_desc = "Gct empty for this thread due to Icache Miss and branch mispred.", }, [ POWER8_PME_PM_GCT_NOSLOT_CYC ] = { .pme_name = "PM_GCT_NOSLOT_CYC", .pme_code = 0x100f8, .pme_short_desc = "No itags assigned", .pme_long_desc = "Pipeline empty (No itags assigned , no GCT slots used).", }, [ POWER8_PME_PM_GCT_NOSLOT_DISP_HELD_ISSQ ] = { .pme_name = "PM_GCT_NOSLOT_DISP_HELD_ISSQ", .pme_code = 0x2d01e, .pme_short_desc = "Gct empty for this thread due to dispatch hold on this thread due to Issue q full", 
.pme_long_desc = "Gct empty for this thread due to dispatch hold on this thread due to Issue q full.", }, [ POWER8_PME_PM_GCT_NOSLOT_DISP_HELD_MAP ] = { .pme_name = "PM_GCT_NOSLOT_DISP_HELD_MAP", .pme_code = 0x4d01c, .pme_short_desc = "Gct empty for this thread due to dispatch hold on this thread due to Mapper full", .pme_long_desc = "Gct empty for this thread due to dispatch hold on this thread due to Mapper full.", }, [ POWER8_PME_PM_GCT_NOSLOT_DISP_HELD_OTHER ] = { .pme_name = "PM_GCT_NOSLOT_DISP_HELD_OTHER", .pme_code = 0x2e010, .pme_short_desc = "Gct empty for this thread due to dispatch hold on this thread due to sync", .pme_long_desc = "Gct empty for this thread due to dispatch hold on this thread due to sync.", }, [ POWER8_PME_PM_GCT_NOSLOT_DISP_HELD_SRQ ] = { .pme_name = "PM_GCT_NOSLOT_DISP_HELD_SRQ", .pme_code = 0x2d01c, .pme_short_desc = "Gct empty for this thread due to dispatch hold on this thread due to SRQ full", .pme_long_desc = "Gct empty for this thread due to dispatch hold on this thread due to SRQ full.", }, [ POWER8_PME_PM_GCT_NOSLOT_IC_L3MISS ] = { .pme_name = "PM_GCT_NOSLOT_IC_L3MISS", .pme_code = 0x4e010, .pme_short_desc = "Gct empty for this thread due to icache l3 miss", .pme_long_desc = "Gct empty for this thread due to icache l3 miss.", }, [ POWER8_PME_PM_GCT_NOSLOT_IC_MISS ] = { .pme_name = "PM_GCT_NOSLOT_IC_MISS", .pme_code = 0x2d01a, .pme_short_desc = "Gct empty for this thread due to Icache Miss", .pme_long_desc = "Gct empty for this thread due to Icache Miss.", }, [ POWER8_PME_PM_GCT_UTIL_11_14_ENTRIES ] = { .pme_name = "PM_GCT_UTIL_11_14_ENTRIES", .pme_code = 0x20a2, .pme_short_desc = "GCT Utilization 11-14 entries", .pme_long_desc = "GCT Utilization 11-14 entries", }, [ POWER8_PME_PM_GCT_UTIL_15_17_ENTRIES ] = { .pme_name = "PM_GCT_UTIL_15_17_ENTRIES", .pme_code = 0x20a4, .pme_short_desc = "GCT Utilization 15-17 entries", .pme_long_desc = "GCT Utilization 15-17 entries", }, [ POWER8_PME_PM_GCT_UTIL_18_ENTRIES ] = { .pme_name =
"PM_GCT_UTIL_18_ENTRIES", .pme_code = 0x20a6, .pme_short_desc = "GCT Utilization 18+ entries", .pme_long_desc = "GCT Utilization 18+ entries", }, [ POWER8_PME_PM_GCT_UTIL_1_2_ENTRIES ] = { .pme_name = "PM_GCT_UTIL_1_2_ENTRIES", .pme_code = 0x209c, .pme_short_desc = "GCT Utilization 1-2 entries", .pme_long_desc = "GCT Utilization 1-2 entries", }, [ POWER8_PME_PM_GCT_UTIL_3_6_ENTRIES ] = { .pme_name = "PM_GCT_UTIL_3_6_ENTRIES", .pme_code = 0x209e, .pme_short_desc = "GCT Utilization 3-6 entries", .pme_long_desc = "GCT Utilization 3-6 entries", }, [ POWER8_PME_PM_GCT_UTIL_7_10_ENTRIES ] = { .pme_name = "PM_GCT_UTIL_7_10_ENTRIES", .pme_code = 0x20a0, .pme_short_desc = "GCT Utilization 7-10 entries", .pme_long_desc = "GCT Utilization 7-10 entries", }, [ POWER8_PME_PM_GRP_BR_MPRED_NONSPEC ] = { .pme_name = "PM_GRP_BR_MPRED_NONSPEC", .pme_code = 0x1000a, .pme_short_desc = "Group experienced non-speculative branch redirect", .pme_long_desc = "Group experienced Non-speculative br mispredict.", }, [ POWER8_PME_PM_GRP_CMPL ] = { .pme_name = "PM_GRP_CMPL", .pme_code = 0x30004, .pme_short_desc = "group completed", .pme_long_desc = "group completed.", }, [ POWER8_PME_PM_GRP_DISP ] = { .pme_name = "PM_GRP_DISP", .pme_code = 0x3000a, .pme_short_desc = "group dispatch", .pme_long_desc = "dispatch_success (Group Dispatched).", }, [ POWER8_PME_PM_GRP_IC_MISS_NONSPEC ] = { .pme_name = "PM_GRP_IC_MISS_NONSPEC", .pme_code = 0x1000c, .pme_short_desc = "Group experienced non-speculative I cache miss", .pme_long_desc = "Group experienced Non-speculative I cache miss.", }, [ POWER8_PME_PM_GRP_MRK ] = { .pme_name = "PM_GRP_MRK", .pme_code = 0x10130, .pme_short_desc = "Instruction Marked", .pme_long_desc = "Instruction marked in idu.", }, [ POWER8_PME_PM_GRP_NON_FULL_GROUP ] = { .pme_name = "PM_GRP_NON_FULL_GROUP", .pme_code = 0x509c, .pme_short_desc = "GROUPs where we did not have 6 non branch instructions in the group(ST mode), in SMT mode 3 non branches", .pme_long_desc = "GROUPs where
we did not have 6 non branch instructions in the group(ST mode), in SMT mode 3 non branches", }, [ POWER8_PME_PM_GRP_PUMP_CPRED ] = { .pme_name = "PM_GRP_PUMP_CPRED", .pme_code = 0x20050, .pme_short_desc = "Initial and Final Pump Scope and data sourced across this scope was group pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was group pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate).", }, [ POWER8_PME_PM_GRP_PUMP_MPRED ] = { .pme_name = "PM_GRP_PUMP_MPRED", .pme_code = 0x20052, .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", .pme_long_desc = "Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope OR Final Pump Scope(Group) got data from source that was at smaller scope(Chip) Final pump was group pump and initial pump was chip or final and initial pump was gro", }, [ POWER8_PME_PM_GRP_PUMP_MPRED_RTY ] = { .pme_name = "PM_GRP_PUMP_MPRED_RTY", .pme_code = 0x10052, .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", .pme_long_desc = "Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope (Chip) Final pump was group pump and initial pump was chip pumpfor all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate).", }, [ POWER8_PME_PM_GRP_TERM_2ND_BRANCH ] = { .pme_name = "PM_GRP_TERM_2ND_BRANCH", .pme_code = 0x50a4, .pme_short_desc = "There were enough instructions in the Ibuffer, but 2nd branch ends group", .pme_long_desc = "There were enough instructions in the Ibuffer, but 2nd branch ends group", }, [ POWER8_PME_PM_GRP_TERM_FPU_AFTER_BR ] = { 
.pme_name = "PM_GRP_TERM_FPU_AFTER_BR", .pme_code = 0x50a6, .pme_short_desc = "There were enough instructions in the Ibuffer, but FPU OP IN same group after a branch terminates a group, can't do partial flushes", .pme_long_desc = "There were enough instructions in the Ibuffer, but FPU OP IN same group after a branch terminates a group, can't do partial flushes", }, [ POWER8_PME_PM_GRP_TERM_NOINST ] = { .pme_name = "PM_GRP_TERM_NOINST", .pme_code = 0x509e, .pme_short_desc = "Do not fill every slot in the group, Not enough instructions in the Ibuffer. This includes cases where the group started with enough instructions, but some got knocked out by a cache miss or branch redirect (which would also empty the Ibuffer).", .pme_long_desc = "Do not fill every slot in the group, Not enough instructions in the Ibuffer. This includes cases where the group started with enough instructions, but some got knocked out by a cache miss or branch redirect (which would also empty the Ibuffer).", }, [ POWER8_PME_PM_GRP_TERM_OTHER ] = { .pme_name = "PM_GRP_TERM_OTHER", .pme_code = 0x50a0, .pme_short_desc = "There were enough instructions in the Ibuffer, but the group terminated early for some other reason, most likely due to a First or Last.", .pme_long_desc = "There were enough instructions in the Ibuffer, but the group terminated early for some other reason, most likely due to a First or Last.", }, [ POWER8_PME_PM_GRP_TERM_SLOT_LIMIT ] = { .pme_name = "PM_GRP_TERM_SLOT_LIMIT", .pme_code = 0x50a2, .pme_short_desc = "There were enough instructions in the Ibuffer, but 3 src RA/RB/RC , 2 way crack caused a group termination", .pme_long_desc = "There were enough instructions in the Ibuffer, but 3 src RA/RB/RC , 2 way crack caused a group termination", }, [ POWER8_PME_PM_HV_CYC ] = { .pme_name = "PM_HV_CYC", .pme_code = 0x2000a, .pme_short_desc = "Cycles in which msr_hv is high.
Note that this event does not take msr_pr into consideration", .pme_long_desc = "cycles in hypervisor mode .", }, [ POWER8_PME_PM_IBUF_FULL_CYC ] = { .pme_name = "PM_IBUF_FULL_CYC", .pme_code = 0x4086, .pme_short_desc = "Cycles No room in ibuff", .pme_long_desc = "Cycles No room in ibufffully qualified transfer (if5 valid).", }, [ POWER8_PME_PM_IC_DEMAND_CYC ] = { .pme_name = "PM_IC_DEMAND_CYC", .pme_code = 0x10018, .pme_short_desc = "Cycles when a demand ifetch was pending", .pme_long_desc = "Demand ifetch pending.", }, [ POWER8_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BHT_REDIRECT", .pme_code = 0x4098, .pme_short_desc = "L2 I cache demand request due to BHT redirect, branch redirect (2 bubbles 3 cycles)", .pme_long_desc = "L2 I cache demand request due to BHT redirect, branch redirect (2 bubbles 3 cycles)", }, [ POWER8_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BR_REDIRECT", .pme_code = 0x409a, .pme_short_desc = "L2 I cache demand request due to branch Mispredict (15 cycle path)", .pme_long_desc = "L2 I cache demand request due to branch Mispredict (15 cycle path)", }, [ POWER8_PME_PM_IC_DEMAND_REQ ] = { .pme_name = "PM_IC_DEMAND_REQ", .pme_code = 0x4088, .pme_short_desc = "Demand Instruction fetch request", .pme_long_desc = "Demand Instruction fetch request", }, [ POWER8_PME_PM_IC_INVALIDATE ] = { .pme_name = "PM_IC_INVALIDATE", .pme_code = 0x508a, .pme_short_desc = "Ic line invalidated", .pme_long_desc = "Ic line invalidated", }, [ POWER8_PME_PM_IC_PREF_CANCEL_HIT ] = { .pme_name = "PM_IC_PREF_CANCEL_HIT", .pme_code = 0x4092, .pme_short_desc = "Prefetch Canceled due to icache hit", .pme_long_desc = "Prefetch Canceled due to icache hit", }, [ POWER8_PME_PM_IC_PREF_CANCEL_L2 ] = { .pme_name = "PM_IC_PREF_CANCEL_L2", .pme_code = 0x4094, .pme_short_desc = "L2 Squashed request", .pme_long_desc = "L2 Squashed request", }, [ POWER8_PME_PM_IC_PREF_CANCEL_PAGE ] = { .pme_name = "PM_IC_PREF_CANCEL_PAGE", .pme_code = 
0x4090, .pme_short_desc = "Prefetch Canceled due to page boundary", .pme_long_desc = "Prefetch Canceled due to page boundary", }, [ POWER8_PME_PM_IC_PREF_REQ ] = { .pme_name = "PM_IC_PREF_REQ", .pme_code = 0x408a, .pme_short_desc = "Instruction prefetch requests", .pme_long_desc = "Instruction prefetch requests", }, [ POWER8_PME_PM_IC_PREF_WRITE ] = { .pme_name = "PM_IC_PREF_WRITE", .pme_code = 0x408e, .pme_short_desc = "Instruction prefetch written into IL1", .pme_long_desc = "Instruction prefetch written into IL1", }, [ POWER8_PME_PM_IC_RELOAD_PRIVATE ] = { .pme_name = "PM_IC_RELOAD_PRIVATE", .pme_code = 0x4096, .pme_short_desc = "Reloading line was brought in private for a specific thread. Most lines are brought in shared for all eight threads. If RA does not match then invalidates and then brings it shared to other thread. In P7 line brought in private , then line was invalidat", .pme_long_desc = "Reloading line was brought in private for a specific thread. Most lines are brought in shared for all eight threads. If RA does not match then invalidates and then brings it shared to other thread.
In P7 line brought in private , then line was inv", }, [ POWER8_PME_PM_IERAT_RELOAD ] = { .pme_name = "PM_IERAT_RELOAD", .pme_code = 0x100f6, .pme_short_desc = "Number of I-ERAT reloads", .pme_long_desc = "IERAT Reloaded (Miss).", }, [ POWER8_PME_PM_IERAT_RELOAD_16M ] = { .pme_name = "PM_IERAT_RELOAD_16M", .pme_code = 0x4006a, .pme_short_desc = "IERAT Reloaded (Miss) for a 16M page", .pme_long_desc = "IERAT Reloaded (Miss) for a 16M page.", }, [ POWER8_PME_PM_IERAT_RELOAD_4K ] = { .pme_name = "PM_IERAT_RELOAD_4K", .pme_code = 0x20064, .pme_short_desc = "IERAT Miss (Not implemented as DI on POWER6)", .pme_long_desc = "IERAT Reloaded (Miss) for a 4k page.", }, [ POWER8_PME_PM_IERAT_RELOAD_64K ] = { .pme_name = "PM_IERAT_RELOAD_64K", .pme_code = 0x3006a, .pme_short_desc = "IERAT Reloaded (Miss) for a 64k page", .pme_long_desc = "IERAT Reloaded (Miss) for a 64k page.", }, [ POWER8_PME_PM_IFETCH_THROTTLE ] = { .pme_name = "PM_IFETCH_THROTTLE", .pme_code = 0x3405e, .pme_short_desc = "Cycles in which Instruction fetch throttle was active", .pme_long_desc = "Cycles instruction fetch was throttled in IFU.", }, [ POWER8_PME_PM_IFU_L2_TOUCH ] = { .pme_name = "PM_IFU_L2_TOUCH", .pme_code = 0x5088, .pme_short_desc = "L2 touch to update MRU on a line", .pme_long_desc = "L2 touch to update MRU on a line", }, [ POWER8_PME_PM_INST_ALL_CHIP_PUMP_CPRED ] = { .pme_name = "PM_INST_ALL_CHIP_PUMP_CPRED", .pme_code = 0x514050, .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for instruction fetches and prefetches", .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was chip pump (prediction=correct) for an instruction fetch", }, [ POWER8_PME_PM_INST_ALL_FROM_DL2L3_MOD ] = { .pme_name = "PM_INST_ALL_FROM_DL2L3_MOD", .pme_code = 0x544048, .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to instruction
fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_DL2L3_SHR ] = { .pme_name = "PM_INST_ALL_FROM_DL2L3_SHR", .pme_code = 0x534048, .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_DL4 ] = { .pme_name = "PM_INST_ALL_FROM_DL4", .pme_code = 0x53404c, .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_DMEM ] = { .pme_name = "PM_INST_ALL_FROM_DMEM", .pme_code = 0x54404c, .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Distant) due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Distant) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_L2 ] = { .pme_name = "PM_INST_ALL_FROM_L2", .pme_code = 0x514042, .pme_short_desc = "The processor's Instruction 
cache was reloaded from local core's L2 due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_L21_MOD ] = { .pme_name = "PM_INST_ALL_FROM_L21_MOD", .pme_code = 0x544046, .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L2 on the same chip due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L2 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_L21_SHR ] = { .pme_name = "PM_INST_ALL_FROM_L21_SHR", .pme_code = 0x534046, .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L2 on the same chip due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L2 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_L2MISS ] = { .pme_name = "PM_INST_ALL_FROM_L2MISS", .pme_code = 0x51404e, .pme_short_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L2 due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L2 due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_L2_DISP_CONFLICT_LDHITST ] = { .pme_name = "PM_INST_ALL_FROM_L2_DISP_CONFLICT_LDHITST", .pme_code = 0x534040, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 with load hit store conflict due to
instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 with load hit store conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_L2_DISP_CONFLICT_OTHER ] = { .pme_name = "PM_INST_ALL_FROM_L2_DISP_CONFLICT_OTHER", .pme_code = 0x544040, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 with dispatch conflict due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 with dispatch conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_L2_MEPF ] = { .pme_name = "PM_INST_ALL_FROM_L2_MEPF", .pme_code = 0x524040, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state. due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state. 
due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_L2_NO_CONFLICT ] = { .pme_name = "PM_INST_ALL_FROM_L2_NO_CONFLICT", .pme_code = 0x514040, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 without conflict due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 without conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_L3 ] = { .pme_name = "PM_INST_ALL_FROM_L3", .pme_code = 0x544042, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_L31_ECO_MOD ] = { .pme_name = "PM_INST_ALL_FROM_L31_ECO_MOD", .pme_code = 0x544044, .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_L31_ECO_SHR ] = { .pme_name = "PM_INST_ALL_FROM_L31_ECO_SHR", .pme_code = 0x534044, .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ 
POWER8_PME_PM_INST_ALL_FROM_L31_MOD ] = { .pme_name = "PM_INST_ALL_FROM_L31_MOD", .pme_code = 0x524044, .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L3 on the same chip due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_L31_SHR ] = { .pme_name = "PM_INST_ALL_FROM_L31_SHR", .pme_code = 0x514046, .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L3 on the same chip due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_L3MISS_MOD ] = { .pme_name = "PM_INST_ALL_FROM_L3MISS_MOD", .pme_code = 0x54404e, .pme_short_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L3 due to an instruction fetch", .pme_long_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L3 due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_L3_DISP_CONFLICT ] = { .pme_name = "PM_INST_ALL_FROM_L3_DISP_CONFLICT", .pme_code = 0x534042, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 with dispatch conflict due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 with dispatch conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_L3_MEPF ] = { .pme_name =
"PM_INST_ALL_FROM_L3_MEPF", .pme_code = 0x524042, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state. due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state. due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_L3_NO_CONFLICT ] = { .pme_name = "PM_INST_ALL_FROM_L3_NO_CONFLICT", .pme_code = 0x514044, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 without conflict due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 without conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_LL4 ] = { .pme_name = "PM_INST_ALL_FROM_LL4", .pme_code = 0x51404c, .pme_short_desc = "The processor's Instruction cache was reloaded from the local chip's L4 cache due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from the local chip's L4 cache due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_LMEM ] = { .pme_name = "PM_INST_ALL_FROM_LMEM", .pme_code = 0x524048, .pme_short_desc = "The processor's Instruction cache was reloaded from the local chip's Memory due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from the local chip's Memory due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_MEMORY ] = { .pme_name = "PM_INST_ALL_FROM_MEMORY", .pme_code = 0x52404c, .pme_short_desc = "The processor's Instruction cache was reloaded from a memory location including L4 from local remote or 
distant due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from a memory location including L4 from local remote or distant due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_OFF_CHIP_CACHE ] = { .pme_name = "PM_INST_ALL_FROM_OFF_CHIP_CACHE", .pme_code = 0x54404a, .pme_short_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_ON_CHIP_CACHE ] = { .pme_name = "PM_INST_ALL_FROM_ON_CHIP_CACHE", .pme_code = 0x514048, .pme_short_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_RL2L3_MOD ] = { .pme_name = "PM_INST_ALL_FROM_RL2L3_MOD", .pme_code = 0x524046, .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ 
POWER8_PME_PM_INST_ALL_FROM_RL2L3_SHR ] = { .pme_name = "PM_INST_ALL_FROM_RL2L3_SHR", .pme_code = 0x51404a, .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_RL4 ] = { .pme_name = "PM_INST_ALL_FROM_RL4", .pme_code = 0x52404a, .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_FROM_RMEM ] = { .pme_name = "PM_INST_ALL_FROM_RMEM", .pme_code = 0x53404a, .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Remote) due to instruction fetches and prefetches", .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Remote) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1", }, [ POWER8_PME_PM_INST_ALL_GRP_PUMP_CPRED ] = { .pme_name = "PM_INST_ALL_GRP_PUMP_CPRED", .pme_code = 0x524050, .pme_short_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for instruction fetches and prefetches", .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was group pump for an instruction fetch", }, [ POWER8_PME_PM_INST_ALL_GRP_PUMP_MPRED ] = { .pme_name = "PM_INST_ALL_GRP_PUMP_MPRED", 
.pme_code = 0x524052, .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for instruction fetches and prefetches", .pme_long_desc = "Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope OR Final Pump Scope(Group) got data from source that was at smaller scope(Chip) Final pump was group pump and initial pump was chip or final and initial pump was gro", }, [ POWER8_PME_PM_INST_ALL_GRP_PUMP_MPRED_RTY ] = { .pme_name = "PM_INST_ALL_GRP_PUMP_MPRED_RTY", .pme_code = 0x514052, .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for instruction fetches and prefetches", .pme_long_desc = "Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope (Chip) Final pump was group pump and initial pump was chip pump for an instruction fetch", }, [ POWER8_PME_PM_INST_ALL_PUMP_CPRED ] = { .pme_name = "PM_INST_ALL_PUMP_CPRED", .pme_code = 0x514054, .pme_short_desc = "Pump prediction correct. Counts across all types of pumps for instruction fetches and prefetches", .pme_long_desc = "Pump prediction correct. Counts across all types of pumps for an instruction fetch", }, [ POWER8_PME_PM_INST_ALL_PUMP_MPRED ] = { .pme_name = "PM_INST_ALL_PUMP_MPRED", .pme_code = 0x544052, .pme_short_desc = "Pump misprediction.
Counts across all types of pumps for instruction fetches and prefetches", .pme_long_desc = "Pump misprediction. Counts across all types of pumps for an instruction fetch", }, [ POWER8_PME_PM_INST_ALL_SYS_PUMP_CPRED ] = { .pme_name = "PM_INST_ALL_SYS_PUMP_CPRED", .pme_code = 0x534050, .pme_short_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for instruction fetches and prefetches", .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was system pump for an instruction fetch", }, [ POWER8_PME_PM_INST_ALL_SYS_PUMP_MPRED ] = { .pme_name = "PM_INST_ALL_SYS_PUMP_MPRED", .pme_code = 0x534052, .pme_short_desc = "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. Counts for instruction fetches and prefetches", .pme_long_desc = "Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope(Chip/Group) OR Final Pump Scope(system) got data from source that was at smaller scope(Chip/group) Final pump was system pump and initial pump was chip or group or", }, [ POWER8_PME_PM_INST_ALL_SYS_PUMP_MPRED_RTY ] = { .pme_name = "PM_INST_ALL_SYS_PUMP_MPRED_RTY", .pme_code = 0x544050, .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for instruction fetches and prefetches", .pme_long_desc = "Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope (Chip or Group) for an instruction fetch", }, [ POWER8_PME_PM_INST_CHIP_PUMP_CPRED ] = { .pme_name = "PM_INST_CHIP_PUMP_CPRED", .pme_code = 0x14050, .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for an instruction fetch", .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was chip pump (prediction=correct) for an instruction fetch.", }, [ POWER8_PME_PM_INST_CMPL ] = { .pme_name = "PM_INST_CMPL", .pme_code = 0x2,
.pme_short_desc = "Number of PowerPC Instructions that completed.", .pme_long_desc = "PPC Instructions Finished (completed).", }, [ POWER8_PME_PM_INST_DISP ] = { .pme_name = "PM_INST_DISP", .pme_code = 0x200f2, .pme_short_desc = "PPC Dispatched", .pme_long_desc = "PPC Dispatched.", }, [ POWER8_PME_PM_INST_FROM_DL2L3_MOD ] = { .pme_name = "PM_INST_FROM_DL2L3_MOD", .pme_code = 0x44048, .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_DL2L3_SHR ] = { .pme_name = "PM_INST_FROM_DL2L3_SHR", .pme_code = 0x34048, .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_DL4 ] = { .pme_name = "PM_INST_FROM_DL4", .pme_code = 0x3404c, .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_DMEM ] = { .pme_name = "PM_INST_FROM_DMEM", 
.pme_code = 0x4404c, .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Distant) due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Distant) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_L1 ] = { .pme_name = "PM_INST_FROM_L1", .pme_code = 0x4080, .pme_short_desc = "Instruction fetches from L1", .pme_long_desc = "Instruction fetches from L1", }, [ POWER8_PME_PM_INST_FROM_L2 ] = { .pme_name = "PM_INST_FROM_L2", .pme_code = 0x14042, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_L21_MOD ] = { .pme_name = "PM_INST_FROM_L21_MOD", .pme_code = 0x44046, .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L2 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_L21_SHR ] = { .pme_name = "PM_INST_FROM_L21_SHR", .pme_code = 0x34046, .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L2 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ 
POWER8_PME_PM_INST_FROM_L2MISS ] = { .pme_name = "PM_INST_FROM_L2MISS", .pme_code = 0x1404e, .pme_short_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L2 due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L2 due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_L2_DISP_CONFLICT_LDHITST ] = { .pme_name = "PM_INST_FROM_L2_DISP_CONFLICT_LDHITST", .pme_code = 0x34040, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 with load hit store conflict due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 with load hit store conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_L2_DISP_CONFLICT_OTHER ] = { .pme_name = "PM_INST_FROM_L2_DISP_CONFLICT_OTHER", .pme_code = 0x44040, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 with dispatch conflict due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 with dispatch conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_L2_MEPF ] = { .pme_name = "PM_INST_FROM_L2_MEPF", .pme_code = 0x24040, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state. due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state.
due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_L2_NO_CONFLICT ] = { .pme_name = "PM_INST_FROM_L2_NO_CONFLICT", .pme_code = 0x14040, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 without conflict due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 without conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_L3 ] = { .pme_name = "PM_INST_FROM_L3", .pme_code = 0x44042, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_L31_ECO_MOD ] = { .pme_name = "PM_INST_FROM_L31_ECO_MOD", .pme_code = 0x44044, .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_L31_ECO_SHR ] = { .pme_name = "PM_INST_FROM_L31_ECO_SHR", .pme_code = 0x34044, .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ 
POWER8_PME_PM_INST_FROM_L31_MOD ] = { .pme_name = "PM_INST_FROM_L31_MOD", .pme_code = 0x24044, .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_L31_SHR ] = { .pme_name = "PM_INST_FROM_L31_SHR", .pme_code = 0x14046, .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_L3MISS ] = { .pme_name = "PM_INST_FROM_L3MISS", .pme_code = 0x300fa, .pme_short_desc = "Marked instruction was reloaded from a location beyond the local chiplet", .pme_long_desc = "Inst from L3 miss.", }, [ POWER8_PME_PM_INST_FROM_L3MISS_MOD ] = { .pme_name = "PM_INST_FROM_L3MISS_MOD", .pme_code = 0x4404e, .pme_short_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L3 due to an instruction fetch", .pme_long_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L3 due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_L3_DISP_CONFLICT ] = { .pme_name = "PM_INST_FROM_L3_DISP_CONFLICT", .pme_code = 0x34042, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 with dispatch conflict due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was
reloaded from local core's L3 with dispatch conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_L3_MEPF ] = { .pme_name = "PM_INST_FROM_L3_MEPF", .pme_code = 0x24042, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state. due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state. due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_L3_NO_CONFLICT ] = { .pme_name = "PM_INST_FROM_L3_NO_CONFLICT", .pme_code = 0x14044, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 without conflict due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 without conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_LL4 ] = { .pme_name = "PM_INST_FROM_LL4", .pme_code = 0x1404c, .pme_short_desc = "The processor's Instruction cache was reloaded from the local chip's L4 cache due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from the local chip's L4 cache due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_LMEM ] = { .pme_name = "PM_INST_FROM_LMEM", .pme_code = 0x24048, .pme_short_desc = "The processor's Instruction cache was reloaded from the local chip's Memory due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from the local chip's Memory due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_MEMORY ] = { .pme_name 
= "PM_INST_FROM_MEMORY", .pme_code = 0x2404c, .pme_short_desc = "The processor's Instruction cache was reloaded from a memory location including L4 from local remote or distant due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from a memory location including L4 from local remote or distant due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_OFF_CHIP_CACHE ] = { .pme_name = "PM_INST_FROM_OFF_CHIP_CACHE", .pme_code = 0x4404a, .pme_short_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_ON_CHIP_CACHE ] = { .pme_name = "PM_INST_FROM_ON_CHIP_CACHE", .pme_code = 0x14048, .pme_short_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_RL2L3_MOD ] = { .pme_name = "PM_INST_FROM_RL2L3_MOD", .pme_code = 0x24046, .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node 
or Group (Remote), as this chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_RL2L3_SHR ] = { .pme_name = "PM_INST_FROM_RL2L3_SHR", .pme_code = 0x1404a, .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_RL4 ] = { .pme_name = "PM_INST_FROM_RL4", .pme_code = 0x2404a, .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_FROM_RMEM ] = { .pme_name = "PM_INST_FROM_RMEM", .pme_code = 0x3404a, .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Remote) due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Remote) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .", }, [ POWER8_PME_PM_INST_GRP_PUMP_CPRED ] = { .pme_name = "PM_INST_GRP_PUMP_CPRED", .pme_code = 0x24050, .pme_short_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for an instruction fetch", .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was group pump for an instruction fetch.", }, 
[ POWER8_PME_PM_INST_GRP_PUMP_MPRED ] = { .pme_name = "PM_INST_GRP_PUMP_MPRED", .pme_code = 0x24052, .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for an instruction fetch", .pme_long_desc = "Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope OR Final Pump Scope(Group) got data from source that was at smaller scope(Chip) Final pump was group pump and initial pump was chip or final and initial pump was gro", }, [ POWER8_PME_PM_INST_GRP_PUMP_MPRED_RTY ] = { .pme_name = "PM_INST_GRP_PUMP_MPRED_RTY", .pme_code = 0x14052, .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for an instruction fetch", .pme_long_desc = "Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope (Chip) Final pump was group pump and initial pump was chip pump for an instruction fetch.", }, [ POWER8_PME_PM_INST_IMC_MATCH_CMPL ] = { .pme_name = "PM_INST_IMC_MATCH_CMPL", .pme_code = 0x1003a, .pme_short_desc = "IMC Match Count (Not architected in P8)", .pme_long_desc = "IMC Match Count.", }, [ POWER8_PME_PM_INST_IMC_MATCH_DISP ] = { .pme_name = "PM_INST_IMC_MATCH_DISP", .pme_code = 0x30016, .pme_short_desc = "Matched Instructions Dispatched", .pme_long_desc = "IMC Matches dispatched.", }, [ POWER8_PME_PM_INST_PUMP_CPRED ] = { .pme_name = "PM_INST_PUMP_CPRED", .pme_code = 0x14054, .pme_short_desc = "Pump prediction correct. Counts across all types of pumps for an instruction fetch", .pme_long_desc = "Pump prediction correct. Counts across all types of pumps for an instruction fetch.", }, [ POWER8_PME_PM_INST_PUMP_MPRED ] = { .pme_name = "PM_INST_PUMP_MPRED", .pme_code = 0x44052, .pme_short_desc = "Pump misprediction. 
Counts across all types of pumps for an instruction fetch", .pme_long_desc = "Pump misprediction. Counts across all types of pumps for an instruction fetch.", }, [ POWER8_PME_PM_INST_SYS_PUMP_CPRED ] = { .pme_name = "PM_INST_SYS_PUMP_CPRED", .pme_code = 0x34050, .pme_short_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for an instruction fetch", .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was system pump for an instruction fetch.", }, [ POWER8_PME_PM_INST_SYS_PUMP_MPRED ] = { .pme_name = "PM_INST_SYS_PUMP_MPRED", .pme_code = 0x34052, .pme_short_desc = "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. Counts for an instruction fetch", .pme_long_desc = "Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope(Chip/Group) OR Final Pump Scope(system) got data from source that was at smaller scope(Chip/group) Final pump was system pump and initial pump was chip or group or", }, [ POWER8_PME_PM_INST_SYS_PUMP_MPRED_RTY ] = { .pme_name = "PM_INST_SYS_PUMP_MPRED_RTY", .pme_code = 0x44050, .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for an instruction fetch", .pme_long_desc = "Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope (Chip or Group) for an instruction fetch.", }, [ POWER8_PME_PM_IOPS_CMPL ] = { .pme_name = "PM_IOPS_CMPL", .pme_code = 0x10014, .pme_short_desc = "Internal Operations completed", .pme_long_desc = "IOPS Completed.", }, [ POWER8_PME_PM_IOPS_DISP ] = { .pme_name = "PM_IOPS_DISP", .pme_code = 0x30014, .pme_short_desc = "Internal Operations dispatched", .pme_long_desc = "IOPS dispatched.", }, [ POWER8_PME_PM_IPTEG_FROM_DL2L3_MOD ] = { .pme_name = "PM_IPTEG_FROM_DL2L3_MOD", .pme_code = 0x45048, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified 
(M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_DL2L3_SHR ] = { .pme_name = "PM_IPTEG_FROM_DL2L3_SHR", .pme_code = 0x35048, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_DL4 ] = { .pme_name = "PM_IPTEG_FROM_DL4", .pme_code = 0x3504c, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_DMEM ] = { .pme_name = "PM_IPTEG_FROM_DMEM", .pme_code = 0x4504c, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_L2 ] = { .pme_name = "PM_IPTEG_FROM_L2", .pme_code = 0x15042, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to an instruction side request.", 
}, [ POWER8_PME_PM_IPTEG_FROM_L21_MOD ] = { .pme_name = "PM_IPTEG_FROM_L21_MOD", .pme_code = 0x45046, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_L21_SHR ] = { .pme_name = "PM_IPTEG_FROM_L21_SHR", .pme_code = 0x35046, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_L2MISS ] = { .pme_name = "PM_IPTEG_FROM_L2MISS", .pme_code = 0x1504e, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_L2_DISP_CONFLICT_LDHITST ] = { .pme_name = "PM_IPTEG_FROM_L2_DISP_CONFLICT_LDHITST", .pme_code = 0x35040, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 with load hit store conflict due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 with load hit store conflict due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_L2_DISP_CONFLICT_OTHER ] = { .pme_name = "PM_IPTEG_FROM_L2_DISP_CONFLICT_OTHER", .pme_code = 0x45040, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 with dispatch conflict due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the 
TLB from local core's L2 with dispatch conflict due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_L2_MEPF ] = { .pme_name = "PM_IPTEG_FROM_L2_MEPF", .pme_code = 0x25040, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state. due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state. due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_L2_NO_CONFLICT ] = { .pme_name = "PM_IPTEG_FROM_L2_NO_CONFLICT", .pme_code = 0x15040, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_L3 ] = { .pme_name = "PM_IPTEG_FROM_L3", .pme_code = 0x45042, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_L31_ECO_MOD ] = { .pme_name = "PM_IPTEG_FROM_L31_ECO_MOD", .pme_code = 0x45044, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_L31_ECO_SHR ] = { .pme_name = "PM_IPTEG_FROM_L31_ECO_SHR", .pme_code = 0x35044, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded 
into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_L31_MOD ] = { .pme_name = "PM_IPTEG_FROM_L31_MOD", .pme_code = 0x25044, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_L31_SHR ] = { .pme_name = "PM_IPTEG_FROM_L31_SHR", .pme_code = 0x15046, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_L3MISS ] = { .pme_name = "PM_IPTEG_FROM_L3MISS", .pme_code = 0x4504e, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_L3_DISP_CONFLICT ] = { .pme_name = "PM_IPTEG_FROM_L3_DISP_CONFLICT", .pme_code = 0x35042, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_L3_MEPF ] = { .pme_name = "PM_IPTEG_FROM_L3_MEPF", .pme_code = 0x25042, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state. 
due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state. due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_L3_NO_CONFLICT ] = { .pme_name = "PM_IPTEG_FROM_L3_NO_CONFLICT", .pme_code = 0x15044, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_LL4 ] = { .pme_name = "PM_IPTEG_FROM_LL4", .pme_code = 0x1504c, .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_LMEM ] = { .pme_name = "PM_IPTEG_FROM_LMEM", .pme_code = 0x25048, .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_MEMORY ] = { .pme_name = "PM_IPTEG_FROM_MEMORY", .pme_code = 0x2504c, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_OFF_CHIP_CACHE ] = { .pme_name = "PM_IPTEG_FROM_OFF_CHIP_CACHE", .pme_code = 0x4504a, .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to 
an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_ON_CHIP_CACHE ] = { .pme_name = "PM_IPTEG_FROM_ON_CHIP_CACHE", .pme_code = 0x15048, .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_RL2L3_MOD ] = { .pme_name = "PM_IPTEG_FROM_RL2L3_MOD", .pme_code = 0x25046, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_RL2L3_SHR ] = { .pme_name = "PM_IPTEG_FROM_RL2L3_SHR", .pme_code = 0x1504a, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_RL4 ] = { .pme_name = "PM_IPTEG_FROM_RL4", .pme_code = 0x2504a, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB 
from another chip's L4 on the same Node or Group (Remote) due to an instruction side request.", }, [ POWER8_PME_PM_IPTEG_FROM_RMEM ] = { .pme_name = "PM_IPTEG_FROM_RMEM", .pme_code = 0x3504a, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to an instruction side request.", }, [ POWER8_PME_PM_ISIDE_DISP ] = { .pme_name = "PM_ISIDE_DISP", .pme_code = 0x617082, .pme_short_desc = "All i-side dispatch attempts", .pme_long_desc = "All i-side dispatch attempts", }, [ POWER8_PME_PM_ISIDE_DISP_FAIL ] = { .pme_name = "PM_ISIDE_DISP_FAIL", .pme_code = 0x627084, .pme_short_desc = "All i-side dispatch attempts that failed due to an addr collision with another machine", .pme_long_desc = "All i-side dispatch attempts that failed due to an addr collision with another machine", }, [ POWER8_PME_PM_ISIDE_DISP_FAIL_OTHER ] = { .pme_name = "PM_ISIDE_DISP_FAIL_OTHER", .pme_code = 0x627086, .pme_short_desc = "All i-side dispatch attempts that failed due to a reason other than an addr collision", .pme_long_desc = "All i-side dispatch attempts that failed due to a reason other than an addr collision", }, [ POWER8_PME_PM_ISIDE_L2MEMACC ] = { .pme_name = "PM_ISIDE_L2MEMACC", .pme_code = 0x4608e, .pme_short_desc = "valid when first beat of data comes in for an i-side fetch where data came from mem(or L4)", .pme_long_desc = "valid when first beat of data comes in for an i-side fetch where data came from mem(or L4)", }, [ POWER8_PME_PM_ISIDE_MRU_TOUCH ] = { .pme_name = "PM_ISIDE_MRU_TOUCH", .pme_code = 0x44608e, .pme_short_desc = "Iside L2 MRU touch", .pme_long_desc = "Iside L2 MRU touch", }, [ POWER8_PME_PM_ISLB_MISS ] = { .pme_name = "PM_ISLB_MISS", .pme_code = 0xd096, .pme_short_desc = "I SLB Miss.", .pme_long_desc = "I SLB Miss.", }, [ POWER8_PME_PM_ISU_REF_FX0 ] 
= { .pme_name = "PM_ISU_REF_FX0", .pme_code = 0x30ac, .pme_short_desc = "FX0 ISU reject", .pme_long_desc = "FX0 ISU reject", }, [ POWER8_PME_PM_ISU_REF_FX1 ] = { .pme_name = "PM_ISU_REF_FX1", .pme_code = 0x30ae, .pme_short_desc = "FX1 ISU reject", .pme_long_desc = "FX1 ISU reject", }, [ POWER8_PME_PM_ISU_REF_FXU ] = { .pme_name = "PM_ISU_REF_FXU", .pme_code = 0x38ac, .pme_short_desc = "FXU ISU reject from either pipe", .pme_long_desc = "ISU", }, [ POWER8_PME_PM_ISU_REF_LS0 ] = { .pme_name = "PM_ISU_REF_LS0", .pme_code = 0x30b0, .pme_short_desc = "LS0 ISU reject", .pme_long_desc = "LS0 ISU reject", }, [ POWER8_PME_PM_ISU_REF_LS1 ] = { .pme_name = "PM_ISU_REF_LS1", .pme_code = 0x30b2, .pme_short_desc = "LS1 ISU reject", .pme_long_desc = "LS1 ISU reject", }, [ POWER8_PME_PM_ISU_REF_LS2 ] = { .pme_name = "PM_ISU_REF_LS2", .pme_code = 0x30b4, .pme_short_desc = "LS2 ISU reject", .pme_long_desc = "LS2 ISU reject", }, [ POWER8_PME_PM_ISU_REF_LS3 ] = { .pme_name = "PM_ISU_REF_LS3", .pme_code = 0x30b6, .pme_short_desc = "LS3 ISU reject", .pme_long_desc = "LS3 ISU reject", }, [ POWER8_PME_PM_ISU_REJECTS_ALL ] = { .pme_name = "PM_ISU_REJECTS_ALL", .pme_code = 0x309c, .pme_short_desc = "All isu rejects could be more than 1 per cycle", .pme_long_desc = "All isu rejects could be more than 1 per cycle", }, [ POWER8_PME_PM_ISU_REJECT_RES_NA ] = { .pme_name = "PM_ISU_REJECT_RES_NA", .pme_code = 0x30a2, .pme_short_desc = "ISU reject due to resource not available", .pme_long_desc = "ISU reject due to resource not available", }, [ POWER8_PME_PM_ISU_REJECT_SAR_BYPASS ] = { .pme_name = "PM_ISU_REJECT_SAR_BYPASS", .pme_code = 0x309e, .pme_short_desc = "Reject because of SAR bypass", .pme_long_desc = "Reject because of SAR bypass", }, [ POWER8_PME_PM_ISU_REJECT_SRC_NA ] = { .pme_name = "PM_ISU_REJECT_SRC_NA", .pme_code = 0x30a0, .pme_short_desc = "ISU reject due to source not available", .pme_long_desc = "ISU reject due to source not available", }, [ POWER8_PME_PM_ISU_REJ_VS0 ] = { 
.pme_name = "PM_ISU_REJ_VS0", .pme_code = 0x30a8, .pme_short_desc = "VS0 ISU reject", .pme_long_desc = "VS0 ISU reject", }, [ POWER8_PME_PM_ISU_REJ_VS1 ] = { .pme_name = "PM_ISU_REJ_VS1", .pme_code = 0x30aa, .pme_short_desc = "VS1 ISU reject", .pme_long_desc = "VS1 ISU reject", }, [ POWER8_PME_PM_ISU_REJ_VSU ] = { .pme_name = "PM_ISU_REJ_VSU", .pme_code = 0x38a8, .pme_short_desc = "VSU ISU reject from either pipe", .pme_long_desc = "ISU", }, [ POWER8_PME_PM_ISYNC ] = { .pme_name = "PM_ISYNC", .pme_code = 0x30b8, .pme_short_desc = "Isync count per thread", .pme_long_desc = "Isync count per thread", }, [ POWER8_PME_PM_ITLB_MISS ] = { .pme_name = "PM_ITLB_MISS", .pme_code = 0x400fc, .pme_short_desc = "ITLB Reloaded (always zero on POWER6)", .pme_long_desc = "ITLB Reloaded.", }, [ POWER8_PME_PM_L1MISS_LAT_EXC_1024 ] = { .pme_name = "PM_L1MISS_LAT_EXC_1024", .pme_code = 0x67200301eaull, .pme_short_desc = "L1 misses that took longer than 1024 cycles to resolve (miss to reload)", .pme_long_desc = "Reload latency exceeded 1024 cyc", }, [ POWER8_PME_PM_L1MISS_LAT_EXC_2048 ] = { .pme_name = "PM_L1MISS_LAT_EXC_2048", .pme_code = 0x67200401ecull, .pme_short_desc = "L1 misses that took longer than 2048 cycles to resolve (miss to reload)", .pme_long_desc = "Reload latency exceeded 2048 cyc", }, [ POWER8_PME_PM_L1MISS_LAT_EXC_256 ] = { .pme_name = "PM_L1MISS_LAT_EXC_256", .pme_code = 0x67200101e8ull, .pme_short_desc = "L1 misses that took longer than 256 cycles to resolve (miss to reload)", .pme_long_desc = "Reload latency exceeded 256 cyc", }, [ POWER8_PME_PM_L1MISS_LAT_EXC_32 ] = { .pme_name = "PM_L1MISS_LAT_EXC_32", .pme_code = 0x67200201e6ull, .pme_short_desc = "L1 misses that took longer than 32 cycles to resolve (miss to reload)", .pme_long_desc = "Reload latency exceeded 32 cyc", }, [ POWER8_PME_PM_L1PF_L2MEMACC ] = { .pme_name = "PM_L1PF_L2MEMACC", .pme_code = 0x26086, .pme_short_desc = "valid when first beat of data comes in for an L1pref where data came from mem(or L4)", 
.pme_long_desc = "valid when first beat of data comes in for an L1pref where data came from mem(or L4)", }, [ POWER8_PME_PM_L1_DCACHE_RELOADED_ALL ] = { .pme_name = "PM_L1_DCACHE_RELOADED_ALL", .pme_code = 0x1002c, .pme_short_desc = "L1 data cache reloaded for demand or prefetch", .pme_long_desc = "L1 data cache reloaded for demand or prefetch.", }, [ POWER8_PME_PM_L1_DCACHE_RELOAD_VALID ] = { .pme_name = "PM_L1_DCACHE_RELOAD_VALID", .pme_code = 0x300f6, .pme_short_desc = "DL1 reloaded due to Demand Load", .pme_long_desc = "DL1 reloaded due to Demand Load.", }, [ POWER8_PME_PM_L1_DEMAND_WRITE ] = { .pme_name = "PM_L1_DEMAND_WRITE", .pme_code = 0x408c, .pme_short_desc = "Instruction Demand sectors written into IL1", .pme_long_desc = "Instruction Demand sectors written into IL1", }, [ POWER8_PME_PM_L1_ICACHE_MISS ] = { .pme_name = "PM_L1_ICACHE_MISS", .pme_code = 0x200fd, .pme_short_desc = "Demand iCache Miss", .pme_long_desc = "Demand iCache Miss.", }, [ POWER8_PME_PM_L1_ICACHE_RELOADED_ALL ] = { .pme_name = "PM_L1_ICACHE_RELOADED_ALL", .pme_code = 0x40012, .pme_short_desc = "Counts all Icache reloads includes demand, prefetch, prefetch turned into demand and demand turned into prefetch", .pme_long_desc = "Counts all Icache reloads includes demand, prefetch, prefetch turned into demand and demand turned into prefetch.", }, [ POWER8_PME_PM_L1_ICACHE_RELOADED_PREF ] = { .pme_name = "PM_L1_ICACHE_RELOADED_PREF", .pme_code = 0x30068, .pme_short_desc = "Counts all Icache prefetch reloads (includes demand turned into prefetch)", .pme_long_desc = "Counts all Icache prefetch reloads (includes demand turned into prefetch).", }, [ POWER8_PME_PM_L2_CASTOUT_MOD ] = { .pme_name = "PM_L2_CASTOUT_MOD", .pme_code = 0x417080, .pme_short_desc = "L2 Castouts - Modified (M, Mu, Me)", .pme_long_desc = "L2 Castouts - Modified (M, Mu, Me)", }, [ POWER8_PME_PM_L2_CASTOUT_SHR ] = { .pme_name = "PM_L2_CASTOUT_SHR", .pme_code = 0x417082, .pme_short_desc = "L2 Castouts - Shared (T, Te, 
Si, S)", .pme_long_desc = "L2 Castouts - Shared (T, Te, Si, S)", }, [ POWER8_PME_PM_L2_CHIP_PUMP ] = { .pme_name = "PM_L2_CHIP_PUMP", .pme_code = 0x27084, .pme_short_desc = "RC requests that were local on chip pump attempts", .pme_long_desc = "RC requests that were local on chip pump attempts", }, [ POWER8_PME_PM_L2_DC_INV ] = { .pme_name = "PM_L2_DC_INV", .pme_code = 0x427086, .pme_short_desc = "Dcache invalidates from L2", .pme_long_desc = "Dcache invalidates from L2", }, [ POWER8_PME_PM_L2_DISP_ALL_L2MISS ] = { .pme_name = "PM_L2_DISP_ALL_L2MISS", .pme_code = 0x44608c, .pme_short_desc = "All successful Ld/St dispatches for this thread that were an L2miss.", .pme_long_desc = "All successful Ld/St dispatches for this thread that were an L2miss.", }, [ POWER8_PME_PM_L2_GROUP_PUMP ] = { .pme_name = "PM_L2_GROUP_PUMP", .pme_code = 0x27086, .pme_short_desc = "RC requests that were on Node Pump attempts", .pme_long_desc = "RC requests that were on Node Pump attempts", }, [ POWER8_PME_PM_L2_GRP_GUESS_CORRECT ] = { .pme_name = "PM_L2_GRP_GUESS_CORRECT", .pme_code = 0x626084, .pme_short_desc = "L2 guess grp and guess was correct (data intra-6chip AND ^on-chip)", .pme_long_desc = "L2 guess grp and guess was correct (data intra-6chip AND ^on-chip)", }, [ POWER8_PME_PM_L2_GRP_GUESS_WRONG ] = { .pme_name = "PM_L2_GRP_GUESS_WRONG", .pme_code = 0x626086, .pme_short_desc = "L2 guess grp and guess was not correct (ie data on-chip OR beyond-6chip)", .pme_long_desc = "L2 guess grp and guess was not correct (ie data on-chip OR beyond-6chip)", }, [ POWER8_PME_PM_L2_IC_INV ] = { .pme_name = "PM_L2_IC_INV", .pme_code = 0x427084, .pme_short_desc = "Icache Invalidates from L2", .pme_long_desc = "Icache Invalidates from L2", }, [ POWER8_PME_PM_L2_INST ] = { .pme_name = "PM_L2_INST", .pme_code = 0x436088, .pme_short_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)", .pme_long_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch 
reqs)", }, [ POWER8_PME_PM_L2_INST_MISS ] = { .pme_name = "PM_L2_INST_MISS", .pme_code = 0x43608a, .pme_short_desc = "All successful i-side dispatches that were an L2miss for this thread (excludes i_l2mru_tch reqs)", .pme_long_desc = "All successful i-side dispatches that were an L2miss for this thread (excludes i_l2mru_tch reqs)", }, [ POWER8_PME_PM_L2_LD ] = { .pme_name = "PM_L2_LD", .pme_code = 0x416080, .pme_short_desc = "All successful D-side Load dispatches for this thread", .pme_long_desc = "All successful D-side Load dispatches for this thread", }, [ POWER8_PME_PM_L2_LD_DISP ] = { .pme_name = "PM_L2_LD_DISP", .pme_code = 0x437088, .pme_short_desc = "All successful load dispatches", .pme_long_desc = "All successful load dispatches", }, [ POWER8_PME_PM_L2_LD_HIT ] = { .pme_name = "PM_L2_LD_HIT", .pme_code = 0x43708a, .pme_short_desc = "All successful load dispatches that were L2 hits", .pme_long_desc = "All successful load dispatches that were L2 hits", }, [ POWER8_PME_PM_L2_LD_MISS ] = { .pme_name = "PM_L2_LD_MISS", .pme_code = 0x426084, .pme_short_desc = "All successful D-Side Load dispatches that were an L2miss for this thread", .pme_long_desc = "All successful D-Side Load dispatches that were an L2miss for this thread", }, [ POWER8_PME_PM_L2_LOC_GUESS_CORRECT ] = { .pme_name = "PM_L2_LOC_GUESS_CORRECT", .pme_code = 0x616080, .pme_short_desc = "L2 guess loc and guess was correct (ie data local)", .pme_long_desc = "L2 guess loc and guess was correct (ie data local)", }, [ POWER8_PME_PM_L2_LOC_GUESS_WRONG ] = { .pme_name = "PM_L2_LOC_GUESS_WRONG", .pme_code = 0x616082, .pme_short_desc = "L2 guess loc and guess was not correct (ie data not on chip)", .pme_long_desc = "L2 guess loc and guess was not correct (ie data not on chip)", }, [ POWER8_PME_PM_L2_RCLD_DISP ] = { .pme_name = "PM_L2_RCLD_DISP", .pme_code = 0x516080, .pme_short_desc = "L2 RC load dispatch attempt", .pme_long_desc = "L2 RC load dispatch attempt", }, [ POWER8_PME_PM_L2_RCLD_DISP_FAIL_ADDR ] = 
{ .pme_name = "PM_L2_RCLD_DISP_FAIL_ADDR", .pme_code = 0x516082, .pme_short_desc = "L2 RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "L2 RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ", }, [ POWER8_PME_PM_L2_RCLD_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2_RCLD_DISP_FAIL_OTHER", .pme_code = 0x526084, .pme_short_desc = "L2 RC load dispatch attempt failed due to other reasons", .pme_long_desc = "L2 RC load dispatch attempt failed due to other reasons", }, [ POWER8_PME_PM_L2_RCST_DISP ] = { .pme_name = "PM_L2_RCST_DISP", .pme_code = 0x536088, .pme_short_desc = "L2 RC store dispatch attempt", .pme_long_desc = "L2 RC store dispatch attempt", }, [ POWER8_PME_PM_L2_RCST_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2_RCST_DISP_FAIL_ADDR", .pme_code = 0x53608a, .pme_short_desc = "L2 RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "L2 RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ", }, [ POWER8_PME_PM_L2_RCST_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2_RCST_DISP_FAIL_OTHER", .pme_code = 0x54608c, .pme_short_desc = "L2 RC store dispatch attempt failed due to other reasons", .pme_long_desc = "L2 RC store dispatch attempt failed due to other reasons", }, [ POWER8_PME_PM_L2_RC_ST_DONE ] = { .pme_name = "PM_L2_RC_ST_DONE", .pme_code = 0x537088, .pme_short_desc = "RC did st to line that was Tx or Sx", .pme_long_desc = "RC did st to line that was Tx or Sx", }, [ POWER8_PME_PM_L2_RTY_LD ] = { .pme_name = "PM_L2_RTY_LD", .pme_code = 0x63708a, .pme_short_desc = "RC retries on PB for any load from core", .pme_long_desc = "RC retries on PB for any load from core", }, [ POWER8_PME_PM_L2_RTY_ST ] = { .pme_name = "PM_L2_RTY_ST", .pme_code = 0x3708a, .pme_short_desc = "RC retries on PB for any store from core", .pme_long_desc = "RC retries on PB for any store from core", }, [ POWER8_PME_PM_L2_SN_M_RD_DONE ] = { .pme_name = "PM_L2_SN_M_RD_DONE", .pme_code = 
0x54708c, .pme_short_desc = "SNP dispatched for a read and was M", .pme_long_desc = "SNP dispatched for a read and was M", }, [ POWER8_PME_PM_L2_SN_M_WR_DONE ] = { .pme_name = "PM_L2_SN_M_WR_DONE", .pme_code = 0x54708e, .pme_short_desc = "SNP dispatched for a write and was M", .pme_long_desc = "SNP dispatched for a write and was M", }, [ POWER8_PME_PM_L2_SN_SX_I_DONE ] = { .pme_name = "PM_L2_SN_SX_I_DONE", .pme_code = 0x53708a, .pme_short_desc = "SNP dispatched and went from Sx or Tx to Ix", .pme_long_desc = "SNP dispatched and went from Sx or Tx to Ix", }, [ POWER8_PME_PM_L2_ST ] = { .pme_name = "PM_L2_ST", .pme_code = 0x17080, .pme_short_desc = "All successful D-side store dispatches for this thread", .pme_long_desc = "All successful D-side store dispatches for this thread", }, [ POWER8_PME_PM_L2_ST_DISP ] = { .pme_name = "PM_L2_ST_DISP", .pme_code = 0x44708c, .pme_short_desc = "All successful store dispatches", .pme_long_desc = "All successful store dispatches", }, [ POWER8_PME_PM_L2_ST_HIT ] = { .pme_name = "PM_L2_ST_HIT", .pme_code = 0x44708e, .pme_short_desc = "All successful store dispatches that were L2Hits", .pme_long_desc = "All successful store dispatches that were L2Hits", }, [ POWER8_PME_PM_L2_ST_MISS ] = { .pme_name = "PM_L2_ST_MISS", .pme_code = 0x17082, .pme_short_desc = "All successful D-side store dispatches for this thread that were L2 Miss", .pme_long_desc = "All successful D-side store dispatches for this thread that were L2 Miss", }, [ POWER8_PME_PM_L2_SYS_GUESS_CORRECT ] = { .pme_name = "PM_L2_SYS_GUESS_CORRECT", .pme_code = 0x636088, .pme_short_desc = "L2 guess sys and guess was correct (ie data beyond-6chip)", .pme_long_desc = "L2 guess sys and guess was correct (ie data beyond-6chip)", }, [ POWER8_PME_PM_L2_SYS_GUESS_WRONG ] = { .pme_name = "PM_L2_SYS_GUESS_WRONG", .pme_code = 0x63608a, .pme_short_desc = "L2 guess sys and guess was not correct (ie data ^beyond-6chip)", .pme_long_desc = "L2 guess sys and guess was not correct (ie data 
^beyond-6chip)", }, [ POWER8_PME_PM_L2_SYS_PUMP ] = { .pme_name = "PM_L2_SYS_PUMP", .pme_code = 0x617080, .pme_short_desc = "RC requests that were system pump attempts", .pme_long_desc = "RC requests that were system pump attempts", }, [ POWER8_PME_PM_L2_TM_REQ_ABORT ] = { .pme_name = "PM_L2_TM_REQ_ABORT", .pme_code = 0x1e05e, .pme_short_desc = "TM abort", .pme_long_desc = "TM abort.", }, [ POWER8_PME_PM_L2_TM_ST_ABORT_SISTER ] = { .pme_name = "PM_L2_TM_ST_ABORT_SISTER", .pme_code = 0x3e05c, .pme_short_desc = "TM marked store abort", .pme_long_desc = "TM marked store abort.", }, [ POWER8_PME_PM_L3_CINJ ] = { .pme_name = "PM_L3_CINJ", .pme_code = 0x23808a, .pme_short_desc = "l3 ci of cache inject", .pme_long_desc = "l3 ci of cache inject", }, [ POWER8_PME_PM_L3_CI_HIT ] = { .pme_name = "PM_L3_CI_HIT", .pme_code = 0x128084, .pme_short_desc = "L3 Castins Hit (total count)", .pme_long_desc = "L3 Castins Hit (total count)", }, [ POWER8_PME_PM_L3_CI_MISS ] = { .pme_name = "PM_L3_CI_MISS", .pme_code = 0x128086, .pme_short_desc = "L3 castins miss (total count)", .pme_long_desc = "L3 castins miss (total count)", }, [ POWER8_PME_PM_L3_CI_USAGE ] = { .pme_name = "PM_L3_CI_USAGE", .pme_code = 0x819082, .pme_short_desc = "rotating sample of 16 CI or CO actives", .pme_long_desc = "rotating sample of 16 CI or CO actives", }, [ POWER8_PME_PM_L3_CO ] = { .pme_name = "PM_L3_CO", .pme_code = 0x438088, .pme_short_desc = "l3 castout occurring (does not include casthrough or log writes (cinj/dmaw))", .pme_long_desc = "l3 castout occurring (does not include casthrough or log writes (cinj/dmaw))", }, [ POWER8_PME_PM_L3_CO0_ALLOC ] = { .pme_name = "PM_L3_CO0_ALLOC", .pme_code = 0x83908b, .pme_short_desc = "lifetime, sample of CO machine 0 valid", .pme_long_desc = "0.0", }, [ POWER8_PME_PM_L3_CO0_BUSY ] = { .pme_name = "PM_L3_CO0_BUSY", .pme_code = 0x83908a, .pme_short_desc = "lifetime, sample of CO machine 0 valid", .pme_long_desc = "lifetime, sample of CO machine 0 valid", }, [ 
POWER8_PME_PM_L3_CO_L31 ] = { .pme_name = "PM_L3_CO_L31", .pme_code = 0x28086, .pme_short_desc = "L3 CO to L3.1 OR of port 0 and 1 (lossy)", .pme_long_desc = "L3 CO to L3.1 OR of port 0 and 1 (lossy)", }, [ POWER8_PME_PM_L3_CO_LCO ] = { .pme_name = "PM_L3_CO_LCO", .pme_code = 0x238088, .pme_short_desc = "Total L3 castouts occurred on LCO", .pme_long_desc = "Total L3 castouts occurred on LCO", }, [ POWER8_PME_PM_L3_CO_MEM ] = { .pme_name = "PM_L3_CO_MEM", .pme_code = 0x28084, .pme_short_desc = "L3 CO to memory OR of port 0 and 1 (lossy)", .pme_long_desc = "L3 CO to memory OR of port 0 and 1 (lossy)", }, [ POWER8_PME_PM_L3_CO_MEPF ] = { .pme_name = "PM_L3_CO_MEPF", .pme_code = 0x18082, .pme_short_desc = "L3 CO of line in Mep state (includes casthrough)", .pme_long_desc = "L3 CO of line in Mep state (includes casthrough)", }, [ POWER8_PME_PM_L3_GRP_GUESS_CORRECT ] = { .pme_name = "PM_L3_GRP_GUESS_CORRECT", .pme_code = 0xb19082, .pme_short_desc = "Initial scope=group and data from same group (near) (pred successful)", .pme_long_desc = "Initial scope=group and data from same group (near) (pred successful)", }, [ POWER8_PME_PM_L3_GRP_GUESS_WRONG_HIGH ] = { .pme_name = "PM_L3_GRP_GUESS_WRONG_HIGH", .pme_code = 0xb3908a, .pme_short_desc = "Initial scope=group but data from local node. Prediction too high", .pme_long_desc = "Initial scope=group but data from local node. Prediction too high", }, [ POWER8_PME_PM_L3_GRP_GUESS_WRONG_LOW ] = { .pme_name = "PM_L3_GRP_GUESS_WRONG_LOW", .pme_code = 0xb39088, .pme_short_desc = "Initial scope=group but data from outside group (far or rem). Prediction too Low", .pme_long_desc = "Initial scope=group but data from outside group (far or rem). 
Prediction too Low", }, [ POWER8_PME_PM_L3_HIT ] = { .pme_name = "PM_L3_HIT", .pme_code = 0x218080, .pme_short_desc = "L3 Hits", .pme_long_desc = "L3 Hits", }, [ POWER8_PME_PM_L3_L2_CO_HIT ] = { .pme_name = "PM_L3_L2_CO_HIT", .pme_code = 0x138088, .pme_short_desc = "L2 castout hits", .pme_long_desc = "L2 castout hits", }, [ POWER8_PME_PM_L3_L2_CO_MISS ] = { .pme_name = "PM_L3_L2_CO_MISS", .pme_code = 0x13808a, .pme_short_desc = "L2 castout miss", .pme_long_desc = "L2 castout miss", }, [ POWER8_PME_PM_L3_LAT_CI_HIT ] = { .pme_name = "PM_L3_LAT_CI_HIT", .pme_code = 0x14808c, .pme_short_desc = "L3 Lateral Castins Hit", .pme_long_desc = "L3 Lateral Castins Hit", }, [ POWER8_PME_PM_L3_LAT_CI_MISS ] = { .pme_name = "PM_L3_LAT_CI_MISS", .pme_code = 0x14808e, .pme_short_desc = "L3 Lateral Castins Miss", .pme_long_desc = "L3 Lateral Castins Miss", }, [ POWER8_PME_PM_L3_LD_HIT ] = { .pme_name = "PM_L3_LD_HIT", .pme_code = 0x228084, .pme_short_desc = "L3 demand LD Hits", .pme_long_desc = "L3 demand LD Hits", }, [ POWER8_PME_PM_L3_LD_MISS ] = { .pme_name = "PM_L3_LD_MISS", .pme_code = 0x228086, .pme_short_desc = "L3 demand LD Miss", .pme_long_desc = "L3 demand LD Miss", }, [ POWER8_PME_PM_L3_LD_PREF ] = { .pme_name = "PM_L3_LD_PREF", .pme_code = 0x1e052, .pme_short_desc = "L3 Load Prefetches", .pme_long_desc = "L3 Load Prefetches.", }, [ POWER8_PME_PM_L3_LOC_GUESS_CORRECT ] = { .pme_name = "PM_L3_LOC_GUESS_CORRECT", .pme_code = 0xb19080, .pme_short_desc = "initial scope=node/chip and data from local node (local) (pred successful)", .pme_long_desc = "initial scope=node/chip and data from local node (local) (pred successful)", }, [ POWER8_PME_PM_L3_LOC_GUESS_WRONG ] = { .pme_name = "PM_L3_LOC_GUESS_WRONG", .pme_code = 0xb29086, .pme_short_desc = "Initial scope=node but data from outside local node (near or far or rem). Prediction too Low", .pme_long_desc = "Initial scope=node but data from outside local node (near or far or rem). 
Prediction too Low", }, [ POWER8_PME_PM_L3_MISS ] = { .pme_name = "PM_L3_MISS", .pme_code = 0x218082, .pme_short_desc = "L3 Misses", .pme_long_desc = "L3 Misses", }, [ POWER8_PME_PM_L3_P0_CO_L31 ] = { .pme_name = "PM_L3_P0_CO_L31", .pme_code = 0x54808c, .pme_short_desc = "l3 CO to L3.1 (lco) port 0", .pme_long_desc = "l3 CO to L3.1 (lco) port 0", }, [ POWER8_PME_PM_L3_P0_CO_MEM ] = { .pme_name = "PM_L3_P0_CO_MEM", .pme_code = 0x538088, .pme_short_desc = "l3 CO to memory port 0", .pme_long_desc = "l3 CO to memory port 0", }, [ POWER8_PME_PM_L3_P0_CO_RTY ] = { .pme_name = "PM_L3_P0_CO_RTY", .pme_code = 0x929084, .pme_short_desc = "L3 CO received retry port 0", .pme_long_desc = "L3 CO received retry port 0", }, [ POWER8_PME_PM_L3_P0_GRP_PUMP ] = { .pme_name = "PM_L3_P0_GRP_PUMP", .pme_code = 0xa29084, .pme_short_desc = "L3 pf sent with grp scope port 0", .pme_long_desc = "L3 pf sent with grp scope port 0", }, [ POWER8_PME_PM_L3_P0_LCO_DATA ] = { .pme_name = "PM_L3_P0_LCO_DATA", .pme_code = 0x528084, .pme_short_desc = "lco sent with data port 0", .pme_long_desc = "lco sent with data port 0", }, [ POWER8_PME_PM_L3_P0_LCO_NO_DATA ] = { .pme_name = "PM_L3_P0_LCO_NO_DATA", .pme_code = 0x518080, .pme_short_desc = "dataless l3 lco sent port 0", .pme_long_desc = "dataless l3 lco sent port 0", }, [ POWER8_PME_PM_L3_P0_LCO_RTY ] = { .pme_name = "PM_L3_P0_LCO_RTY", .pme_code = 0xa4908c, .pme_short_desc = "L3 LCO received retry port 0", .pme_long_desc = "L3 LCO received retry port 0", }, [ POWER8_PME_PM_L3_P0_NODE_PUMP ] = { .pme_name = "PM_L3_P0_NODE_PUMP", .pme_code = 0xa19080, .pme_short_desc = "L3 pf sent with nodal scope port 0", .pme_long_desc = "L3 pf sent with nodal scope port 0", }, [ POWER8_PME_PM_L3_P0_PF_RTY ] = { .pme_name = "PM_L3_P0_PF_RTY", .pme_code = 0x919080, .pme_short_desc = "L3 PF received retry port 0", .pme_long_desc = "L3 PF received retry port 0", }, [ POWER8_PME_PM_L3_P0_SN_HIT ] = { .pme_name = "PM_L3_P0_SN_HIT", .pme_code = 0x939088, .pme_short_desc = 
"L3 snoop hit port 0", .pme_long_desc = "L3 snoop hit port 0", }, [ POWER8_PME_PM_L3_P0_SN_INV ] = { .pme_name = "PM_L3_P0_SN_INV", .pme_code = 0x118080, .pme_short_desc = "Port0 snooper detects someone doing a store to a line that is Sx", .pme_long_desc = "Port0 snooper detects someone doing a store to a line that is Sx", }, [ POWER8_PME_PM_L3_P0_SN_MISS ] = { .pme_name = "PM_L3_P0_SN_MISS", .pme_code = 0x94908c, .pme_short_desc = "L3 snoop miss port 0", .pme_long_desc = "L3 snoop miss port 0", }, [ POWER8_PME_PM_L3_P0_SYS_PUMP ] = { .pme_name = "PM_L3_P0_SYS_PUMP", .pme_code = 0xa39088, .pme_short_desc = "L3 pf sent with sys scope port 0", .pme_long_desc = "L3 pf sent with sys scope port 0", }, [ POWER8_PME_PM_L3_P1_CO_L31 ] = { .pme_name = "PM_L3_P1_CO_L31", .pme_code = 0x54808e, .pme_short_desc = "l3 CO to L3.1 (lco) port 1", .pme_long_desc = "l3 CO to L3.1 (lco) port 1", }, [ POWER8_PME_PM_L3_P1_CO_MEM ] = { .pme_name = "PM_L3_P1_CO_MEM", .pme_code = 0x53808a, .pme_short_desc = "l3 CO to memory port 1", .pme_long_desc = "l3 CO to memory port 1", }, [ POWER8_PME_PM_L3_P1_CO_RTY ] = { .pme_name = "PM_L3_P1_CO_RTY", .pme_code = 0x929086, .pme_short_desc = "L3 CO received retry port 1", .pme_long_desc = "L3 CO received retry port 1", }, [ POWER8_PME_PM_L3_P1_GRP_PUMP ] = { .pme_name = "PM_L3_P1_GRP_PUMP", .pme_code = 0xa29086, .pme_short_desc = "L3 pf sent with grp scope port 1", .pme_long_desc = "L3 pf sent with grp scope port 1", }, [ POWER8_PME_PM_L3_P1_LCO_DATA ] = { .pme_name = "PM_L3_P1_LCO_DATA", .pme_code = 0x528086, .pme_short_desc = "lco sent with data port 1", .pme_long_desc = "lco sent with data port 1", }, [ POWER8_PME_PM_L3_P1_LCO_NO_DATA ] = { .pme_name = "PM_L3_P1_LCO_NO_DATA", .pme_code = 0x518082, .pme_short_desc = "dataless l3 lco sent port 1", .pme_long_desc = "dataless l3 lco sent port 1", }, [ POWER8_PME_PM_L3_P1_LCO_RTY ] = { .pme_name = "PM_L3_P1_LCO_RTY", .pme_code = 0xa4908e, .pme_short_desc = "L3 LCO received retry port 1", 
.pme_long_desc = "L3 LCO received retry port 1", }, [ POWER8_PME_PM_L3_P1_NODE_PUMP ] = { .pme_name = "PM_L3_P1_NODE_PUMP", .pme_code = 0xa19082, .pme_short_desc = "L3 pf sent with nodal scope port 1", .pme_long_desc = "L3 pf sent with nodal scope port 1", }, [ POWER8_PME_PM_L3_P1_PF_RTY ] = { .pme_name = "PM_L3_P1_PF_RTY", .pme_code = 0x919082, .pme_short_desc = "L3 PF received retry port 1", .pme_long_desc = "L3 PF received retry port 1", }, [ POWER8_PME_PM_L3_P1_SN_HIT ] = { .pme_name = "PM_L3_P1_SN_HIT", .pme_code = 0x93908a, .pme_short_desc = "L3 snoop hit port 1", .pme_long_desc = "L3 snoop hit port 1", }, [ POWER8_PME_PM_L3_P1_SN_INV ] = { .pme_name = "PM_L3_P1_SN_INV", .pme_code = 0x118082, .pme_short_desc = "Port1 snooper detects someone doing a store to a line that is Sx", .pme_long_desc = "Port1 snooper detects someone doing a store to a line that is Sx", }, [ POWER8_PME_PM_L3_P1_SN_MISS ] = { .pme_name = "PM_L3_P1_SN_MISS", .pme_code = 0x94908e, .pme_short_desc = "L3 snoop miss port 1", .pme_long_desc = "L3 snoop miss port 1", }, [ POWER8_PME_PM_L3_P1_SYS_PUMP ] = { .pme_name = "PM_L3_P1_SYS_PUMP", .pme_code = 0xa3908a, .pme_short_desc = "L3 pf sent with sys scope port 1", .pme_long_desc = "L3 pf sent with sys scope port 1", }, [ POWER8_PME_PM_L3_PF0_ALLOC ] = { .pme_name = "PM_L3_PF0_ALLOC", .pme_code = 0x84908d, .pme_short_desc = "lifetime, sample of PF machine 0 valid", .pme_long_desc = "0.0", }, [ POWER8_PME_PM_L3_PF0_BUSY ] = { .pme_name = "PM_L3_PF0_BUSY", .pme_code = 0x84908c, .pme_short_desc = "lifetime, sample of PF machine 0 valid", .pme_long_desc = "lifetime, sample of PF machine 0 valid", }, [ POWER8_PME_PM_L3_PF_HIT_L3 ] = { .pme_name = "PM_L3_PF_HIT_L3", .pme_code = 0x428084, .pme_short_desc = "l3 pf hit in l3", .pme_long_desc = "l3 pf hit in l3", }, [ POWER8_PME_PM_L3_PF_MISS_L3 ] = { .pme_name = "PM_L3_PF_MISS_L3", .pme_code = 0x18080, .pme_short_desc = "L3 Prefetch missed in L3", .pme_long_desc = "L3 Prefetch missed in L3", }, [ 
POWER8_PME_PM_L3_PF_OFF_CHIP_CACHE ] = { .pme_name = "PM_L3_PF_OFF_CHIP_CACHE", .pme_code = 0x3808a, .pme_short_desc = "L3 Prefetch from Off chip cache", .pme_long_desc = "L3 Prefetch from Off chip cache", }, [ POWER8_PME_PM_L3_PF_OFF_CHIP_MEM ] = { .pme_name = "PM_L3_PF_OFF_CHIP_MEM", .pme_code = 0x4808e, .pme_short_desc = "L3 Prefetch from Off chip memory", .pme_long_desc = "L3 Prefetch from Off chip memory", }, [ POWER8_PME_PM_L3_PF_ON_CHIP_CACHE ] = { .pme_name = "PM_L3_PF_ON_CHIP_CACHE", .pme_code = 0x38088, .pme_short_desc = "L3 Prefetch from On chip cache", .pme_long_desc = "L3 Prefetch from On chip cache", }, [ POWER8_PME_PM_L3_PF_ON_CHIP_MEM ] = { .pme_name = "PM_L3_PF_ON_CHIP_MEM", .pme_code = 0x4808c, .pme_short_desc = "L3 Prefetch from On chip memory", .pme_long_desc = "L3 Prefetch from On chip memory", }, [ POWER8_PME_PM_L3_PF_USAGE ] = { .pme_name = "PM_L3_PF_USAGE", .pme_code = 0x829084, .pme_short_desc = "rotating sample of 32 PF actives", .pme_long_desc = "rotating sample of 32 PF actives", }, [ POWER8_PME_PM_L3_PREF_ALL ] = { .pme_name = "PM_L3_PREF_ALL", .pme_code = 0x4e052, .pme_short_desc = "Total HW L3 prefetches(Load+store)", .pme_long_desc = "Total HW L3 prefetches(Load+store).", }, [ POWER8_PME_PM_L3_RD0_ALLOC ] = { .pme_name = "PM_L3_RD0_ALLOC", .pme_code = 0x84908f, .pme_short_desc = "lifetime, sample of RD machine 0 valid", .pme_long_desc = "0.0", }, [ POWER8_PME_PM_L3_RD0_BUSY ] = { .pme_name = "PM_L3_RD0_BUSY", .pme_code = 0x84908e, .pme_short_desc = "lifetime, sample of RD machine 0 valid", .pme_long_desc = "lifetime, sample of RD machine 0 valid", }, [ POWER8_PME_PM_L3_RD_USAGE ] = { .pme_name = "PM_L3_RD_USAGE", .pme_code = 0x829086, .pme_short_desc = "rotating sample of 16 RD actives", .pme_long_desc = "rotating sample of 16 RD actives", }, [ POWER8_PME_PM_L3_SN0_ALLOC ] = { .pme_name = "PM_L3_SN0_ALLOC", .pme_code = 0x839089, .pme_short_desc = "lifetime, sample of snooper machine 0 valid", .pme_long_desc = "0.0", }, [ 
POWER8_PME_PM_L3_SN0_BUSY ] = { .pme_name = "PM_L3_SN0_BUSY", .pme_code = 0x839088, .pme_short_desc = "lifetime, sample of snooper machine 0 valid", .pme_long_desc = "lifetime, sample of snooper machine 0 valid", }, [ POWER8_PME_PM_L3_SN_USAGE ] = { .pme_name = "PM_L3_SN_USAGE", .pme_code = 0x819080, .pme_short_desc = "rotating sample of 8 snoop valids", .pme_long_desc = "rotating sample of 8 snoop valids", }, [ POWER8_PME_PM_L3_ST_PREF ] = { .pme_name = "PM_L3_ST_PREF", .pme_code = 0x2e052, .pme_short_desc = "L3 store Prefetches", .pme_long_desc = "L3 store Prefetches.", }, [ POWER8_PME_PM_L3_SW_PREF ] = { .pme_name = "PM_L3_SW_PREF", .pme_code = 0x3e052, .pme_short_desc = "Data stream touch to L3", .pme_long_desc = "Data stream touch to L3.", }, [ POWER8_PME_PM_L3_SYS_GUESS_CORRECT ] = { .pme_name = "PM_L3_SYS_GUESS_CORRECT", .pme_code = 0xb29084, .pme_short_desc = "Initial scope=system and data from outside group (far or rem)(pred successful)", .pme_long_desc = "Initial scope=system and data from outside group (far or rem)(pred successful)", }, [ POWER8_PME_PM_L3_SYS_GUESS_WRONG ] = { .pme_name = "PM_L3_SYS_GUESS_WRONG", .pme_code = 0xb4908c, .pme_short_desc = "Initial scope=system but data from local or near. Prediction too high", .pme_long_desc = "Initial scope=system but data from local or near. 
Prediction too high", }, [ POWER8_PME_PM_L3_TRANS_PF ] = { .pme_name = "PM_L3_TRANS_PF", .pme_code = 0x24808e, .pme_short_desc = "L3 Transient prefetch", .pme_long_desc = "L3 Transient prefetch", }, [ POWER8_PME_PM_L3_WI0_ALLOC ] = { .pme_name = "PM_L3_WI0_ALLOC", .pme_code = 0x18081, .pme_short_desc = "lifetime, sample of Write Inject machine 0 valid", .pme_long_desc = "0.0", }, [ POWER8_PME_PM_L3_WI0_BUSY ] = { .pme_name = "PM_L3_WI0_BUSY", .pme_code = 0x418080, .pme_short_desc = "lifetime, sample of Write Inject machine 0 valid", .pme_long_desc = "lifetime, sample of Write Inject machine 0 valid", }, [ POWER8_PME_PM_L3_WI_USAGE ] = { .pme_name = "PM_L3_WI_USAGE", .pme_code = 0x418082, .pme_short_desc = "rotating sample of 8 WI actives", .pme_long_desc = "rotating sample of 8 WI actives", }, [ POWER8_PME_PM_LARX_FIN ] = { .pme_name = "PM_LARX_FIN", .pme_code = 0x3c058, .pme_short_desc = "Larx finished", .pme_long_desc = "Larx finished .", }, [ POWER8_PME_PM_LD_CMPL ] = { .pme_name = "PM_LD_CMPL", .pme_code = 0x1002e, .pme_short_desc = "count of Loads completed", .pme_long_desc = "count of Loads completed.", }, [ POWER8_PME_PM_LD_L3MISS_PEND_CYC ] = { .pme_name = "PM_LD_L3MISS_PEND_CYC", .pme_code = 0x10062, .pme_short_desc = "Cycles L3 miss was pending for this thread", .pme_long_desc = "Cycles L3 miss was pending for this thread.", }, [ POWER8_PME_PM_LD_MISS_L1 ] = { .pme_name = "PM_LD_MISS_L1", .pme_code = 0x3e054, .pme_short_desc = "Load Missed L1", .pme_long_desc = "Load Missed L1.", }, [ POWER8_PME_PM_LD_REF_L1 ] = { .pme_name = "PM_LD_REF_L1", .pme_code = 0x100ee, .pme_short_desc = "All L1 D cache load references counted at finish, gated by reject", .pme_long_desc = "Load Ref count combined for all units.", }, [ POWER8_PME_PM_LD_REF_L1_LSU0 ] = { .pme_name = "PM_LD_REF_L1_LSU0", .pme_code = 0xc080, .pme_short_desc = "LS0 L1 D cache load references counted at finish, gated by reject", .pme_long_desc = "LS0 L1 D cache load references counted at finish, gated 
by rejectLSU0 L1 D cache load references", }, [ POWER8_PME_PM_LD_REF_L1_LSU1 ] = { .pme_name = "PM_LD_REF_L1_LSU1", .pme_code = 0xc082, .pme_short_desc = "LS1 L1 D cache load references counted at finish, gated by reject", .pme_long_desc = "LS1 L1 D cache load references counted at finish, gated by rejectLSU1 L1 D cache load references", }, [ POWER8_PME_PM_LD_REF_L1_LSU2 ] = { .pme_name = "PM_LD_REF_L1_LSU2", .pme_code = 0xc094, .pme_short_desc = "LS2 L1 D cache load references counted at finish, gated by reject", .pme_long_desc = "LS2 L1 D cache load references counted at finish, gated by reject42", }, [ POWER8_PME_PM_LD_REF_L1_LSU3 ] = { .pme_name = "PM_LD_REF_L1_LSU3", .pme_code = 0xc096, .pme_short_desc = "LS3 L1 D cache load references counted at finish, gated by reject", .pme_long_desc = "LS3 L1 D cache load references counted at finish, gated by reject42", }, [ POWER8_PME_PM_LINK_STACK_INVALID_PTR ] = { .pme_name = "PM_LINK_STACK_INVALID_PTR", .pme_code = 0x509a, .pme_short_desc = "A flush where LS ptr is invalid, results in a pop , A lot of interrupts between push and pops", .pme_long_desc = "A flush where LS ptr is invalid, results in a pop , A lot of interrupts between push and pops", }, [ POWER8_PME_PM_LINK_STACK_WRONG_ADD_PRED ] = { .pme_name = "PM_LINK_STACK_WRONG_ADD_PRED", .pme_code = 0x5098, .pme_short_desc = "Link stack predicts wrong address, because of link stack design limitation.", .pme_long_desc = "Link stack predicts wrong address, because of link stack design limitation.", }, [ POWER8_PME_PM_LS0_ERAT_MISS_PREF ] = { .pme_name = "PM_LS0_ERAT_MISS_PREF", .pme_code = 0xe080, .pme_short_desc = "LS0 Erat miss due to prefetch", .pme_long_desc = "LS0 Erat miss due to prefetch42", }, [ POWER8_PME_PM_LS0_L1_PREF ] = { .pme_name = "PM_LS0_L1_PREF", .pme_code = 0xd0b8, .pme_short_desc = "LS0 L1 cache data prefetches", .pme_long_desc = "LS0 L1 cache data prefetches42", }, [ POWER8_PME_PM_LS0_L1_SW_PREF ] = { .pme_name = "PM_LS0_L1_SW_PREF", .pme_code = 
0xc098, .pme_short_desc = "Software L1 Prefetches, including SW Transient Prefetches", .pme_long_desc = "Software L1 Prefetches, including SW Transient Prefetches42", }, [ POWER8_PME_PM_LS1_ERAT_MISS_PREF ] = { .pme_name = "PM_LS1_ERAT_MISS_PREF", .pme_code = 0xe082, .pme_short_desc = "LS1 Erat miss due to prefetch", .pme_long_desc = "LS1 Erat miss due to prefetch42", }, [ POWER8_PME_PM_LS1_L1_PREF ] = { .pme_name = "PM_LS1_L1_PREF", .pme_code = 0xd0ba, .pme_short_desc = "LS1 L1 cache data prefetches", .pme_long_desc = "LS1 L1 cache data prefetches42", }, [ POWER8_PME_PM_LS1_L1_SW_PREF ] = { .pme_name = "PM_LS1_L1_SW_PREF", .pme_code = 0xc09a, .pme_short_desc = "Software L1 Prefetches, including SW Transient Prefetches", .pme_long_desc = "Software L1 Prefetches, including SW Transient Prefetches42", }, [ POWER8_PME_PM_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_LSU0_FLUSH_LRQ", .pme_code = 0xc0b0, .pme_short_desc = "LS0 Flush: LRQ", .pme_long_desc = "LS0 Flush: LRQLSU0 LRQ flushes", }, [ POWER8_PME_PM_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_LSU0_FLUSH_SRQ", .pme_code = 0xc0b8, .pme_short_desc = "LS0 Flush: SRQ", .pme_long_desc = "LS0 Flush: SRQLSU0 SRQ lhs flushes", }, [ POWER8_PME_PM_LSU0_FLUSH_ULD ] = { .pme_name = "PM_LSU0_FLUSH_ULD", .pme_code = 0xc0a4, .pme_short_desc = "LS0 Flush: Unaligned Load", .pme_long_desc = "LS0 Flush: Unaligned LoadLSU0 unaligned load flushes", }, [ POWER8_PME_PM_LSU0_FLUSH_UST ] = { .pme_name = "PM_LSU0_FLUSH_UST", .pme_code = 0xc0ac, .pme_short_desc = "LS0 Flush: Unaligned Store", .pme_long_desc = "LS0 Flush: Unaligned StoreLSU0 unaligned store flushes", }, [ POWER8_PME_PM_LSU0_L1_CAM_CANCEL ] = { .pme_name = "PM_LSU0_L1_CAM_CANCEL", .pme_code = 0xf088, .pme_short_desc = "ls0 l1 tm cam cancel", .pme_long_desc = "ls0 l1 tm cam cancel42", }, [ POWER8_PME_PM_LSU0_LARX_FIN ] = { .pme_name = "PM_LSU0_LARX_FIN", .pme_code = 0x1e056, .pme_short_desc = "Larx finished in LSU pipe0", .pme_long_desc = ".", }, [ POWER8_PME_PM_LSU0_LMQ_LHR_MERGE ] = { 
.pme_name = "PM_LSU0_LMQ_LHR_MERGE", .pme_code = 0xd08c, .pme_short_desc = "LS0 Load Merged with another cacheline request", .pme_long_desc = "LS0 Load Merged with another cacheline request42", }, [ POWER8_PME_PM_LSU0_NCLD ] = { .pme_name = "PM_LSU0_NCLD", .pme_code = 0xc08c, .pme_short_desc = "LS0 Non-cachable Loads counted at finish", .pme_long_desc = "LS0 Non-cachable Loads counted at finishLSU0 non-cacheable loads", }, [ POWER8_PME_PM_LSU0_PRIMARY_ERAT_HIT ] = { .pme_name = "PM_LSU0_PRIMARY_ERAT_HIT", .pme_code = 0xe090, .pme_short_desc = "Primary ERAT hit", .pme_long_desc = "Primary ERAT hit42", }, [ POWER8_PME_PM_LSU0_REJECT ] = { .pme_name = "PM_LSU0_REJECT", .pme_code = 0x1e05a, .pme_short_desc = "LSU0 reject", .pme_long_desc = "LSU0 reject .", }, [ POWER8_PME_PM_LSU0_SRQ_STFWD ] = { .pme_name = "PM_LSU0_SRQ_STFWD", .pme_code = 0xc09c, .pme_short_desc = "LS0 SRQ forwarded data to a load", .pme_long_desc = "LS0 SRQ forwarded data to a loadLSU0 SRQ store forwarded", }, [ POWER8_PME_PM_LSU0_STORE_REJECT ] = { .pme_name = "PM_LSU0_STORE_REJECT", .pme_code = 0xf084, .pme_short_desc = "ls0 store reject", .pme_long_desc = "ls0 store reject42", }, [ POWER8_PME_PM_LSU0_TMA_REQ_L2 ] = { .pme_name = "PM_LSU0_TMA_REQ_L2", .pme_code = 0xe0a8, .pme_short_desc = "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding", .pme_long_desc = "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding42", }, [ POWER8_PME_PM_LSU0_TM_L1_HIT ] = { .pme_name = "PM_LSU0_TM_L1_HIT", .pme_code = 0xe098, .pme_short_desc = "Load tm hit in L1", .pme_long_desc = "Load tm hit in L142", }, [ POWER8_PME_PM_LSU0_TM_L1_MISS ] = { .pme_name = "PM_LSU0_TM_L1_MISS", .pme_code = 0xe0a0, .pme_short_desc = "Load tm L1 miss", .pme_long_desc = "Load tm L1 miss42", }, [ POWER8_PME_PM_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_LSU1_FLUSH_LRQ", .pme_code = 0xc0b2, .pme_short_desc = "LS1 Flush: LRQ", .pme_long_desc = "LS1 Flush: LRQLSU1 LRQ 
flushes", }, [ POWER8_PME_PM_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_LSU1_FLUSH_SRQ", .pme_code = 0xc0ba, .pme_short_desc = "LS1 Flush: SRQ", .pme_long_desc = "LS1 Flush: SRQLSU1 SRQ lhs flushes", }, [ POWER8_PME_PM_LSU1_FLUSH_ULD ] = { .pme_name = "PM_LSU1_FLUSH_ULD", .pme_code = 0xc0a6, .pme_short_desc = "LS 1 Flush: Unaligned Load", .pme_long_desc = "LS 1 Flush: Unaligned LoadLSU1 unaligned load flushes", }, [ POWER8_PME_PM_LSU1_FLUSH_UST ] = { .pme_name = "PM_LSU1_FLUSH_UST", .pme_code = 0xc0ae, .pme_short_desc = "LS1 Flush: Unaligned Store", .pme_long_desc = "LS1 Flush: Unaligned StoreLSU1 unaligned store flushes", }, [ POWER8_PME_PM_LSU1_L1_CAM_CANCEL ] = { .pme_name = "PM_LSU1_L1_CAM_CANCEL", .pme_code = 0xf08a, .pme_short_desc = "ls1 l1 tm cam cancel", .pme_long_desc = "ls1 l1 tm cam cancel42", }, [ POWER8_PME_PM_LSU1_LARX_FIN ] = { .pme_name = "PM_LSU1_LARX_FIN", .pme_code = 0x2e056, .pme_short_desc = "Larx finished in LSU pipe1", .pme_long_desc = "Larx finished in LSU pipe1.", }, [ POWER8_PME_PM_LSU1_LMQ_LHR_MERGE ] = { .pme_name = "PM_LSU1_LMQ_LHR_MERGE", .pme_code = 0xd08e, .pme_short_desc = "LS1 Load Merge with another cacheline request", .pme_long_desc = "LS1 Load Merge with another cacheline request42", }, [ POWER8_PME_PM_LSU1_NCLD ] = { .pme_name = "PM_LSU1_NCLD", .pme_code = 0xc08e, .pme_short_desc = "LS1 Non-cachable Loads counted at finish", .pme_long_desc = "LS1 Non-cachable Loads counted at finishLSU1 non-cacheable loads", }, [ POWER8_PME_PM_LSU1_PRIMARY_ERAT_HIT ] = { .pme_name = "PM_LSU1_PRIMARY_ERAT_HIT", .pme_code = 0xe092, .pme_short_desc = "Primary ERAT hit", .pme_long_desc = "Primary ERAT hit42", }, [ POWER8_PME_PM_LSU1_REJECT ] = { .pme_name = "PM_LSU1_REJECT", .pme_code = 0x2e05a, .pme_short_desc = "LSU1 reject", .pme_long_desc = "LSU1 reject .", }, [ POWER8_PME_PM_LSU1_SRQ_STFWD ] = { .pme_name = "PM_LSU1_SRQ_STFWD", .pme_code = 0xc09e, .pme_short_desc = "LS1 SRQ forwarded data to a load", .pme_long_desc = "LS1 SRQ forwarded data to a 
loadLSU1 SRQ store forwarded", }, [ POWER8_PME_PM_LSU1_STORE_REJECT ] = { .pme_name = "PM_LSU1_STORE_REJECT", .pme_code = 0xf086, .pme_short_desc = "ls1 store reject", .pme_long_desc = "ls1 store reject42", }, [ POWER8_PME_PM_LSU1_TMA_REQ_L2 ] = { .pme_name = "PM_LSU1_TMA_REQ_L2", .pme_code = 0xe0aa, .pme_short_desc = "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding", .pme_long_desc = "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding42", }, [ POWER8_PME_PM_LSU1_TM_L1_HIT ] = { .pme_name = "PM_LSU1_TM_L1_HIT", .pme_code = 0xe09a, .pme_short_desc = "Load tm hit in L1", .pme_long_desc = "Load tm hit in L142", }, [ POWER8_PME_PM_LSU1_TM_L1_MISS ] = { .pme_name = "PM_LSU1_TM_L1_MISS", .pme_code = 0xe0a2, .pme_short_desc = "Load tm L1 miss", .pme_long_desc = "Load tm L1 miss42", }, [ POWER8_PME_PM_LSU2_FLUSH_LRQ ] = { .pme_name = "PM_LSU2_FLUSH_LRQ", .pme_code = 0xc0b4, .pme_short_desc = "LS2 Flush: LRQ", .pme_long_desc = "LS2 Flush: LRQ42", }, [ POWER8_PME_PM_LSU2_FLUSH_SRQ ] = { .pme_name = "PM_LSU2_FLUSH_SRQ", .pme_code = 0xc0bc, .pme_short_desc = "LS2 Flush: SRQ", .pme_long_desc = "LS2 Flush: SRQ42", }, [ POWER8_PME_PM_LSU2_FLUSH_ULD ] = { .pme_name = "PM_LSU2_FLUSH_ULD", .pme_code = 0xc0a8, .pme_short_desc = "LS3 Flush: Unaligned Load", .pme_long_desc = "LS3 Flush: Unaligned Load42", }, [ POWER8_PME_PM_LSU2_L1_CAM_CANCEL ] = { .pme_name = "PM_LSU2_L1_CAM_CANCEL", .pme_code = 0xf08c, .pme_short_desc = "ls2 l1 tm cam cancel", .pme_long_desc = "ls2 l1 tm cam cancel42", }, [ POWER8_PME_PM_LSU2_LARX_FIN ] = { .pme_name = "PM_LSU2_LARX_FIN", .pme_code = 0x3e056, .pme_short_desc = "Larx finished in LSU pipe2", .pme_long_desc = "Larx finished in LSU pipe2.", }, [ POWER8_PME_PM_LSU2_LDF ] = { .pme_name = "PM_LSU2_LDF", .pme_code = 0xc084, .pme_short_desc = "LS2 Scalar Loads", .pme_long_desc = "LS2 Scalar Loads42", }, [ POWER8_PME_PM_LSU2_LDX ] = { .pme_name = "PM_LSU2_LDX", .pme_code = 
0xc088, .pme_short_desc = "LS0 Vector Loads", .pme_long_desc = "LS0 Vector Loads42", }, [ POWER8_PME_PM_LSU2_LMQ_LHR_MERGE ] = { .pme_name = "PM_LSU2_LMQ_LHR_MERGE", .pme_code = 0xd090, .pme_short_desc = "LS0 Load Merged with another cacheline request", .pme_long_desc = "LS0 Load Merged with another cacheline request42", }, [ POWER8_PME_PM_LSU2_PRIMARY_ERAT_HIT ] = { .pme_name = "PM_LSU2_PRIMARY_ERAT_HIT", .pme_code = 0xe094, .pme_short_desc = "Primary ERAT hit", .pme_long_desc = "Primary ERAT hit42", }, [ POWER8_PME_PM_LSU2_REJECT ] = { .pme_name = "PM_LSU2_REJECT", .pme_code = 0x3e05a, .pme_short_desc = "LSU2 reject", .pme_long_desc = "LSU2 reject .", }, [ POWER8_PME_PM_LSU2_SRQ_STFWD ] = { .pme_name = "PM_LSU2_SRQ_STFWD", .pme_code = 0xc0a0, .pme_short_desc = "LS2 SRQ forwarded data to a load", .pme_long_desc = "LS2 SRQ forwarded data to a load42", }, [ POWER8_PME_PM_LSU2_TMA_REQ_L2 ] = { .pme_name = "PM_LSU2_TMA_REQ_L2", .pme_code = 0xe0ac, .pme_short_desc = "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding", .pme_long_desc = "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding42", }, [ POWER8_PME_PM_LSU2_TM_L1_HIT ] = { .pme_name = "PM_LSU2_TM_L1_HIT", .pme_code = 0xe09c, .pme_short_desc = "Load tm hit in L1", .pme_long_desc = "Load tm hit in L142", }, [ POWER8_PME_PM_LSU2_TM_L1_MISS ] = { .pme_name = "PM_LSU2_TM_L1_MISS", .pme_code = 0xe0a4, .pme_short_desc = "Load tm L1 miss", .pme_long_desc = "Load tm L1 miss42", }, [ POWER8_PME_PM_LSU3_FLUSH_LRQ ] = { .pme_name = "PM_LSU3_FLUSH_LRQ", .pme_code = 0xc0b6, .pme_short_desc = "LS3 Flush: LRQ", .pme_long_desc = "LS3 Flush: LRQ42", }, [ POWER8_PME_PM_LSU3_FLUSH_SRQ ] = { .pme_name = "PM_LSU3_FLUSH_SRQ", .pme_code = 0xc0be, .pme_short_desc = "LS3 Flush: SRQ", .pme_long_desc = "LS3 Flush: SRQ42", }, [ POWER8_PME_PM_LSU3_FLUSH_ULD ] = { .pme_name = "PM_LSU3_FLUSH_ULD", .pme_code = 0xc0aa, .pme_short_desc = "LS3 Flush: Unaligned Load", .pme_long_desc = "LS3 Flush: Unaligned Load42", }, [ POWER8_PME_PM_LSU3_L1_CAM_CANCEL ] = { .pme_name = "PM_LSU3_L1_CAM_CANCEL", .pme_code = 0xf08e, .pme_short_desc = "ls3 l1 tm cam cancel", .pme_long_desc = "ls3 l1 tm cam cancel42", }, [ POWER8_PME_PM_LSU3_LARX_FIN ] = { .pme_name = "PM_LSU3_LARX_FIN", .pme_code = 0x4e056, .pme_short_desc = "Larx finished in LSU pipe3", .pme_long_desc = "Larx finished in LSU pipe3.", }, [ POWER8_PME_PM_LSU3_LDF ] = { .pme_name = "PM_LSU3_LDF", .pme_code = 0xc086, .pme_short_desc = "LS3 Scalar Loads", .pme_long_desc = "LS3 Scalar Loads 42", }, [ POWER8_PME_PM_LSU3_LDX ] = { .pme_name = "PM_LSU3_LDX", .pme_code = 0xc08a, .pme_short_desc = "LS1 Vector Loads", .pme_long_desc = "LS1 Vector Loads42", }, [ POWER8_PME_PM_LSU3_LMQ_LHR_MERGE ] = { .pme_name = "PM_LSU3_LMQ_LHR_MERGE", .pme_code = 0xd092, .pme_short_desc = "LS1 Load Merge with another cacheline request", .pme_long_desc = "LS1 Load Merge with another cacheline request42", }, [ POWER8_PME_PM_LSU3_PRIMARY_ERAT_HIT ] = { .pme_name = "PM_LSU3_PRIMARY_ERAT_HIT", .pme_code = 0xe096, .pme_short_desc = "Primary ERAT hit", .pme_long_desc = "Primary ERAT hit42", }, [ POWER8_PME_PM_LSU3_REJECT ] = { .pme_name = "PM_LSU3_REJECT", .pme_code = 0x4e05a, .pme_short_desc = "LSU3 reject", .pme_long_desc = "LSU3 reject .", }, [ POWER8_PME_PM_LSU3_SRQ_STFWD ] = { .pme_name = "PM_LSU3_SRQ_STFWD", .pme_code = 0xc0a2, .pme_short_desc = "LS3 SRQ forwarded data to a load", .pme_long_desc = "LS3 SRQ forwarded data to a load42", }, [ POWER8_PME_PM_LSU3_TMA_REQ_L2 ] = { .pme_name = "PM_LSU3_TMA_REQ_L2", .pme_code = 0xe0ae, .pme_short_desc = "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding", .pme_long_desc = "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding42", }, [ POWER8_PME_PM_LSU3_TM_L1_HIT ] = { .pme_name = "PM_LSU3_TM_L1_HIT", .pme_code = 0xe09e, .pme_short_desc = "Load tm hit in L1", .pme_long_desc = "Load 
tm hit in L142", }, [ POWER8_PME_PM_LSU3_TM_L1_MISS ] = { .pme_name = "PM_LSU3_TM_L1_MISS", .pme_code = 0xe0a6, .pme_short_desc = "Load tm L1 miss", .pme_long_desc = "Load tm L1 miss42", }, [ POWER8_PME_PM_LSU_DERAT_MISS ] = { .pme_name = "PM_LSU_DERAT_MISS", .pme_code = 0x200f6, .pme_short_desc = "DERAT Reloaded due to a DERAT miss", .pme_long_desc = "DERAT Reloaded (Miss).", }, [ POWER8_PME_PM_LSU_ERAT_MISS_PREF ] = { .pme_name = "PM_LSU_ERAT_MISS_PREF", .pme_code = 0xe880, .pme_short_desc = "Erat miss due to prefetch, on either pipe", .pme_long_desc = "LSU", }, [ POWER8_PME_PM_LSU_FIN ] = { .pme_name = "PM_LSU_FIN", .pme_code = 0x30066, .pme_short_desc = "LSU Finished an instruction (up to 2 per cycle)", .pme_long_desc = "LSU Finished an instruction (up to 2 per cycle).", }, [ POWER8_PME_PM_LSU_FLUSH_UST ] = { .pme_name = "PM_LSU_FLUSH_UST", .pme_code = 0xc8ac, .pme_short_desc = "Unaligned Store Flush on either pipe", .pme_long_desc = "LSU", }, [ POWER8_PME_PM_LSU_FOUR_TABLEWALK_CYC ] = { .pme_name = "PM_LSU_FOUR_TABLEWALK_CYC", .pme_code = 0xd0a4, .pme_short_desc = "Cycles when four tablewalks pending on this thread", .pme_long_desc = "Cycles when four tablewalks pending on this thread42", }, [ POWER8_PME_PM_LSU_FX_FIN ] = { .pme_name = "PM_LSU_FX_FIN", .pme_code = 0x10066, .pme_short_desc = "LSU Finished a FX operation (up to 2 per cycle)", .pme_long_desc = "LSU Finished a FX operation (up to 2 per cycle).", }, [ POWER8_PME_PM_LSU_L1_PREF ] = { .pme_name = "PM_LSU_L1_PREF", .pme_code = 0xd8b8, .pme_short_desc = "hw initiated , include sw streaming forms as well , include sw streams as a separate event", .pme_long_desc = "LSU", }, [ POWER8_PME_PM_LSU_L1_SW_PREF ] = { .pme_name = "PM_LSU_L1_SW_PREF", .pme_code = 0xc898, .pme_short_desc = "Software L1 Prefetches, including SW Transient Prefetches, on both pipes", .pme_long_desc = "LSU", }, [ POWER8_PME_PM_LSU_LDF ] = { .pme_name = "PM_LSU_LDF", .pme_code = 0xc884, .pme_short_desc = "FPU loads only on LS2/LS3 ie 
LU0/LU1", .pme_long_desc = "LSU", }, [ POWER8_PME_PM_LSU_LDX ] = { .pme_name = "PM_LSU_LDX", .pme_code = 0xc888, .pme_short_desc = "Vector loads can issue only on LS2/LS3", .pme_long_desc = "LSU", }, [ POWER8_PME_PM_LSU_LMQ_FULL_CYC ] = { .pme_name = "PM_LSU_LMQ_FULL_CYC", .pme_code = 0xd0a2, .pme_short_desc = "LMQ full", .pme_long_desc = "LMQ fullCycles LMQ full,", }, [ POWER8_PME_PM_LSU_LMQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LMQ_S0_ALLOC", .pme_code = 0xd0a1, .pme_short_desc = "Per thread - use edge detect to count allocates On a per thread basis, level signal indicating Slot 0 is valid. By instrumenting a single slot we can calculate service time for that slot. Previous machines required a separate signal indicating the slot was allocated. Because any signal can be routed to any counter in P8, we can count level in one PMC and edge detect in another PMC using the same signal", .pme_long_desc = "0.0", }, [ POWER8_PME_PM_LSU_LMQ_S0_VALID ] = { .pme_name = "PM_LSU_LMQ_S0_VALID", .pme_code = 0xd0a0, .pme_short_desc = "Slot 0 of LMQ valid", .pme_long_desc = "Slot 0 of LMQ validLMQ slot 0 valid", }, [ POWER8_PME_PM_LSU_LMQ_SRQ_EMPTY_ALL_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_ALL_CYC", .pme_code = 0x3001c, .pme_short_desc = "ALL threads lsu empty (lmq and srq empty)", .pme_long_desc = "ALL threads lsu empty (lmq and srq empty). Issue HW016541", }, [ POWER8_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", .pme_code = 0x2003e, .pme_short_desc = "LSU empty (lmq and srq empty)", .pme_long_desc = "LSU empty (lmq and srq empty).", }, [ POWER8_PME_PM_LSU_LRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LRQ_S0_ALLOC", .pme_code = 0xd09f, .pme_short_desc = "Per thread - use edge detect to count allocates On a per thread basis, level signal indicating Slot 0 is valid. By instrumenting a single slot we can calculate service time for that slot. Previous machines required a separate signal indicating the slot was allocated. 
Because any signal can be routed to any counter in P8, we can count level in one PMC and edge detect in another PMC using the same signal", .pme_long_desc = "0.0", }, [ POWER8_PME_PM_LSU_LRQ_S0_VALID ] = { .pme_name = "PM_LSU_LRQ_S0_VALID", .pme_code = 0xd09e, .pme_short_desc = "Slot 0 of LRQ valid", .pme_long_desc = "LRQ slot 0 valid", }, [ POWER8_PME_PM_LSU_LRQ_S43_ALLOC ] = { .pme_name = "PM_LSU_LRQ_S43_ALLOC", .pme_code = 0xf091, .pme_short_desc = "LRQ slot 43 was released", .pme_long_desc = "0.0", }, [ POWER8_PME_PM_LSU_LRQ_S43_VALID ] = { .pme_name = "PM_LSU_LRQ_S43_VALID", .pme_code = 0xf090, .pme_short_desc = "LRQ slot 43 was busy", .pme_long_desc = "LRQ slot 43 was busy", }, [ POWER8_PME_PM_LSU_MRK_DERAT_MISS ] = { .pme_name = "PM_LSU_MRK_DERAT_MISS", .pme_code = 0x30162, .pme_short_desc = "DERAT Reloaded (Miss)", .pme_long_desc = "DERAT Reloaded (Miss).", }, [ POWER8_PME_PM_LSU_NCLD ] = { .pme_name = "PM_LSU_NCLD", .pme_code = 0xc88c, .pme_short_desc = "count at finish so can return only on ls0 or ls1", .pme_long_desc = "LSU", }, [ POWER8_PME_PM_LSU_NCST ] = { .pme_name = "PM_LSU_NCST", .pme_code = 0xc092, .pme_short_desc = "Non-cacheable Stores sent to nest", .pme_long_desc = "Non-cacheable Stores sent to nest", }, [ POWER8_PME_PM_LSU_REJECT ] = { .pme_name = "PM_LSU_REJECT", .pme_code = 0x10064, .pme_short_desc = "LSU Reject (up to 4 per cycle)", .pme_long_desc = "LSU Reject (up to 4 per cycle).", }, [ POWER8_PME_PM_LSU_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU_REJECT_ERAT_MISS", .pme_code = 0x2e05c, .pme_short_desc = "LSU Reject due to ERAT (up to 4 per cycle)", .pme_long_desc = "LSU Reject due to ERAT (up to 4 per cycle).", }, [ POWER8_PME_PM_LSU_REJECT_LHS ] = { .pme_name = "PM_LSU_REJECT_LHS", .pme_code = 0x4e05c, .pme_short_desc = "LSU Reject due to LHS (up to 4 per cycle)", .pme_long_desc = "LSU Reject due to LHS (up to 4 per cycle).", }, [ POWER8_PME_PM_LSU_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU_REJECT_LMQ_FULL",
.pme_code = 0x1e05c, .pme_short_desc = "LSU reject due to LMQ full (4 per cycle)", .pme_long_desc = "LSU reject due to LMQ full (4 per cycle).", }, [ POWER8_PME_PM_LSU_SET_MPRED ] = { .pme_name = "PM_LSU_SET_MPRED", .pme_code = 0xd082, .pme_short_desc = "Line already in cache at reload time", .pme_long_desc = "Line already in cache at reload time", }, [ POWER8_PME_PM_LSU_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_SRQ_EMPTY_CYC", .pme_code = 0x40008, .pme_short_desc = "ALL threads srq empty", .pme_long_desc = "All threads srq empty.", }, [ POWER8_PME_PM_LSU_SRQ_FULL_CYC ] = { .pme_name = "PM_LSU_SRQ_FULL_CYC", .pme_code = 0x1001a, .pme_short_desc = "Storage Queue is full and is blocking dispatch", .pme_long_desc = "SRQ is Full.", }, [ POWER8_PME_PM_LSU_SRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_SRQ_S0_ALLOC", .pme_code = 0xd09d, .pme_short_desc = "Per thread - use edge detect to count allocates On a per thread basis, level signal indicating Slot 0 is valid. By instrumenting a single slot we can calculate service time for that slot. Previous machines required a separate signal indicating the slot was allocated.
Because any signal can be routed to any counter in P8, we can count level in one PMC and edge detect in another PMC using the same signal", .pme_long_desc = "0.0", }, [ POWER8_PME_PM_LSU_SRQ_S0_VALID ] = { .pme_name = "PM_LSU_SRQ_S0_VALID", .pme_code = 0xd09c, .pme_short_desc = "Slot 0 of SRQ valid", .pme_long_desc = "SRQ slot 0 valid", }, [ POWER8_PME_PM_LSU_SRQ_S39_ALLOC ] = { .pme_name = "PM_LSU_SRQ_S39_ALLOC", .pme_code = 0xf093, .pme_short_desc = "SRQ slot 39 was released", .pme_long_desc = "0.0", }, [ POWER8_PME_PM_LSU_SRQ_S39_VALID ] = { .pme_name = "PM_LSU_SRQ_S39_VALID", .pme_code = 0xf092, .pme_short_desc = "SRQ slot 39 was busy", .pme_long_desc = "SRQ slot 39 was busy", }, [ POWER8_PME_PM_LSU_SRQ_SYNC ] = { .pme_name = "PM_LSU_SRQ_SYNC", .pme_code = 0xd09b, .pme_short_desc = "A sync in the SRQ ended", .pme_long_desc = "0.0", }, [ POWER8_PME_PM_LSU_SRQ_SYNC_CYC ] = { .pme_name = "PM_LSU_SRQ_SYNC_CYC", .pme_code = 0xd09a, .pme_short_desc = "A sync is in the SRQ (edge detect to count)", .pme_long_desc = "SRQ sync duration", }, [ POWER8_PME_PM_LSU_STORE_REJECT ] = { .pme_name = "PM_LSU_STORE_REJECT", .pme_code = 0xf084, .pme_short_desc = "Store reject on either pipe", .pme_long_desc = "LSU", }, [ POWER8_PME_PM_LSU_TWO_TABLEWALK_CYC ] = { .pme_name = "PM_LSU_TWO_TABLEWALK_CYC", .pme_code = 0xd0a6, .pme_short_desc = "Cycles when two tablewalks pending on this thread", .pme_long_desc = "Cycles when two tablewalks pending on this thread", }, [ POWER8_PME_PM_LWSYNC ] = { .pme_name = "PM_LWSYNC", .pme_code = 0x5094, .pme_short_desc = "threaded version, IC Misses where we got EA dir hit but no sector valids were on. ICBI took line out", .pme_long_desc = "threaded version, IC Misses where we got EA dir hit but no sector valids were on.
ICBI took line out", }, [ POWER8_PME_PM_LWSYNC_HELD ] = { .pme_name = "PM_LWSYNC_HELD", .pme_code = 0x209a, .pme_short_desc = "LWSYNC held at dispatch", .pme_long_desc = "LWSYNC held at dispatch", }, [ POWER8_PME_PM_MEM_CO ] = { .pme_name = "PM_MEM_CO", .pme_code = 0x4c058, .pme_short_desc = "Memory castouts from this lpar", .pme_long_desc = "Memory castouts from this lpar.", }, [ POWER8_PME_PM_MEM_LOC_THRESH_IFU ] = { .pme_name = "PM_MEM_LOC_THRESH_IFU", .pme_code = 0x10058, .pme_short_desc = "Local Memory above threshold for IFU speculation control", .pme_long_desc = "Local Memory above threshold for IFU speculation control.", }, [ POWER8_PME_PM_MEM_LOC_THRESH_LSU_HIGH ] = { .pme_name = "PM_MEM_LOC_THRESH_LSU_HIGH", .pme_code = 0x40056, .pme_short_desc = "Local memory above threshold for LSU medium", .pme_long_desc = "Local memory above threshold for LSU medium.", }, [ POWER8_PME_PM_MEM_LOC_THRESH_LSU_MED ] = { .pme_name = "PM_MEM_LOC_THRESH_LSU_MED", .pme_code = 0x1c05e, .pme_short_desc = "Local memory above threshold for data prefetch", .pme_long_desc = "Local memory above threshold for data prefetch.", }, [ POWER8_PME_PM_MEM_PREF ] = { .pme_name = "PM_MEM_PREF", .pme_code = 0x2c058, .pme_short_desc = "Memory prefetch for this lpar. Includes L4", .pme_long_desc = "Memory prefetch for this lpar.", }, [ POWER8_PME_PM_MEM_READ ] = { .pme_name = "PM_MEM_READ", .pme_code = 0x10056, .pme_short_desc = "Reads from Memory from this lpar (includes data/inst/xlate/l1prefetch/inst prefetch).
Includes L4", .pme_long_desc = "Reads from Memory from this lpar (includes data/inst/xlate/l1prefetch/inst prefetch).", }, [ POWER8_PME_PM_MEM_RWITM ] = { .pme_name = "PM_MEM_RWITM", .pme_code = 0x3c05e, .pme_short_desc = "Memory rwitm for this lpar", .pme_long_desc = "Memory rwitm for this lpar.", }, [ POWER8_PME_PM_MRK_BACK_BR_CMPL ] = { .pme_name = "PM_MRK_BACK_BR_CMPL", .pme_code = 0x3515e, .pme_short_desc = "Marked branch instruction completed with a target address less than current instruction address", .pme_long_desc = "Marked branch instruction completed with a target address less than current instruction address.", }, [ POWER8_PME_PM_MRK_BRU_FIN ] = { .pme_name = "PM_MRK_BRU_FIN", .pme_code = 0x2013a, .pme_short_desc = "bru marked instr finish", .pme_long_desc = "bru marked instr finish.", }, [ POWER8_PME_PM_MRK_BR_CMPL ] = { .pme_name = "PM_MRK_BR_CMPL", .pme_code = 0x1016e, .pme_short_desc = "Branch Instruction completed", .pme_long_desc = "Branch Instruction completed.", }, [ POWER8_PME_PM_MRK_BR_MPRED_CMPL ] = { .pme_name = "PM_MRK_BR_MPRED_CMPL", .pme_code = 0x301e4, .pme_short_desc = "Marked Branch Mispredicted", .pme_long_desc = "Marked Branch Mispredicted.", }, [ POWER8_PME_PM_MRK_BR_TAKEN_CMPL ] = { .pme_name = "PM_MRK_BR_TAKEN_CMPL", .pme_code = 0x101e2, .pme_short_desc = "Marked Branch Taken completed", .pme_long_desc = "Marked Branch Taken.", }, [ POWER8_PME_PM_MRK_CRU_FIN ] = { .pme_name = "PM_MRK_CRU_FIN", .pme_code = 0x3013a, .pme_short_desc = "IFU non-branch finished", .pme_long_desc = "IFU non-branch marked instruction finished.", }, [ POWER8_PME_PM_MRK_DATA_FROM_DL2L3_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_MOD", .pme_code = 0x4d148, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a 
different Node or Group (Distant), as this chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_DL2L3_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_MOD_CYC", .pme_code = 0x2d128, .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_DL2L3_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_SHR", .pme_code = 0x3d148, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_SHR_CYC", .pme_code = 0x2c128, .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_DL4 ] = { .pme_name = "PM_MRK_DATA_FROM_DL4", .pme_code = 0x3d14c, .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_DL4_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_DL4_CYC", .pme_code = 0x2c12c, 
.pme_short_desc = "Duration in cycles to reload from another chip's L4 on a different Node or Group (Distant) due to a marked load", .pme_long_desc = "Duration in cycles to reload from another chip's L4 on a different Node or Group (Distant) due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_DMEM ] = { .pme_name = "PM_MRK_DATA_FROM_DMEM", .pme_code = 0x4d14c, .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_DMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_DMEM_CYC", .pme_code = 0x2d12c, .pme_short_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group (Distant) due to a marked load", .pme_long_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group (Distant) due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L2 ] = { .pme_name = "PM_MRK_DATA_FROM_L2", .pme_code = 0x1d142, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L21_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L21_MOD", .pme_code = 0x4d146, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L21_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L21_MOD_CYC", .pme_code = 0x2d126, .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another core's L2 on the same chip due to a marked load", 
.pme_long_desc = "Duration in cycles to reload with Modified (M) data from another core's L2 on the same chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L21_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L21_SHR", .pme_code = 0x3d146, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L21_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L21_SHR_CYC", .pme_code = 0x2c126, .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another core's L2 on the same chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another core's L2 on the same chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L2MISS ] = { .pme_name = "PM_MRK_DATA_FROM_L2MISS", .pme_code = 0x1d14e, .pme_short_desc = "Data cache reload L2 miss", .pme_long_desc = "Data cache reload L2 miss.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L2MISS_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2MISS_CYC", .pme_code = 0x4c12e, .pme_short_desc = "Duration in cycles to reload from a location other than the local core's L2 due to a marked load", .pme_long_desc = "Duration in cycles to reload from a location other than the local core's L2 due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L2_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2_CYC", .pme_code = 0x4c122, .pme_short_desc = "Duration in cycles to reload from local core's L2 due to a marked load", .pme_long_desc = "Duration in cycles to reload from local core's L2 due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST ] = { .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST", .pme_code = 0x3d140, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with load hit store
conflict due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC", .pme_code = 0x2c120, .pme_short_desc = "Duration in cycles to reload from local core's L2 with load hit store conflict due to a marked load", .pme_long_desc = "Duration in cycles to reload from local core's L2 with load hit store conflict due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER ] = { .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER", .pme_code = 0x4d140, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC", .pme_code = 0x2d120, .pme_short_desc = "Duration in cycles to reload from local core's L2 with dispatch conflict due to a marked load", .pme_long_desc = "Duration in cycles to reload from local core's L2 with dispatch conflict due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L2_MEPF ] = { .pme_name = "PM_MRK_DATA_FROM_L2_MEPF", .pme_code = 0x2d140, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state. due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state. due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L2_MEPF_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2_MEPF_CYC", .pme_code = 0x4d120, .pme_short_desc = "Duration in cycles to reload from local core's L2 hit without dispatch conflicts on Mepf state. 
due to a marked load", .pme_long_desc = "Duration in cycles to reload from local core's L2 hit without dispatch conflicts on Mepf state. due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT ] = { .pme_name = "PM_MRK_DATA_FROM_L2_NO_CONFLICT", .pme_code = 0x1d140, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC", .pme_code = 0x4c120, .pme_short_desc = "Duration in cycles to reload from local core's L2 without conflict due to a marked load", .pme_long_desc = "Duration in cycles to reload from local core's L2 without conflict due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L3 ] = { .pme_name = "PM_MRK_DATA_FROM_L3", .pme_code = 0x4d142, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L31_ECO_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L31_ECO_MOD", .pme_code = 0x4d144, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L31_ECO_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L31_ECO_MOD_CYC", .pme_code = 0x2d124, .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another core's ECO L3 on the same chip due to a marked load.", }, [ 
POWER8_PME_PM_MRK_DATA_FROM_L31_ECO_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L31_ECO_SHR", .pme_code = 0x3d144, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L31_ECO_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L31_ECO_SHR_CYC", .pme_code = 0x2c124, .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another core's ECO L3 on the same chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L31_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L31_MOD", .pme_code = 0x2d144, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L31_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L31_MOD_CYC", .pme_code = 0x4d124, .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another core's L3 on the same chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another core's L3 on the same chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L31_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L31_SHR", .pme_code = 0x1d146, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a marked load.", }, [ 
POWER8_PME_PM_MRK_DATA_FROM_L31_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L31_SHR_CYC", .pme_code = 0x4c126, .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another core's L3 on the same chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another core's L3 on the same chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L3MISS ] = { .pme_name = "PM_MRK_DATA_FROM_L3MISS", .pme_code = 0x201e4, .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L3MISS_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L3MISS_CYC", .pme_code = 0x2d12e, .pme_short_desc = "Duration in cycles to reload from a location other than the local core's L3 due to a marked load", .pme_long_desc = "Duration in cycles to reload from a location other than the local core's L3 due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L3_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L3_CYC", .pme_code = 0x2d122, .pme_short_desc = "Duration in cycles to reload from local core's L3 due to a marked load", .pme_long_desc = "Duration in cycles to reload from local core's L3 due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT ] = { .pme_name = "PM_MRK_DATA_FROM_L3_DISP_CONFLICT", .pme_code = 0x3d142, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC", .pme_code = 0x2c122, .pme_short_desc = "Duration in cycles to reload from local core's L3 with dispatch conflict due to a
marked load", .pme_long_desc = "Duration in cycles to reload from local core's L3 with dispatch conflict due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L3_MEPF ] = { .pme_name = "PM_MRK_DATA_FROM_L3_MEPF", .pme_code = 0x2d142, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state. due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state. due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L3_MEPF_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L3_MEPF_CYC", .pme_code = 0x4d122, .pme_short_desc = "Duration in cycles to reload from local core's L3 without dispatch conflicts hit on Mepf state. due to a marked load", .pme_long_desc = "Duration in cycles to reload from local core's L3 without dispatch conflicts hit on Mepf state. due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT ] = { .pme_name = "PM_MRK_DATA_FROM_L3_NO_CONFLICT", .pme_code = 0x1d144, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC", .pme_code = 0x4c124, .pme_short_desc = "Duration in cycles to reload from local core's L3 without conflict due to a marked load", .pme_long_desc = "Duration in cycles to reload from local core's L3 without conflict due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_LL4 ] = { .pme_name = "PM_MRK_DATA_FROM_LL4", .pme_code = 0x1d14c, .pme_short_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a marked load.", }, [ 
POWER8_PME_PM_MRK_DATA_FROM_LL4_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_LL4_CYC", .pme_code = 0x4c12c, .pme_short_desc = "Duration in cycles to reload from the local chip's L4 cache due to a marked load", .pme_long_desc = "Duration in cycles to reload from the local chip's L4 cache due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_LMEM ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM", .pme_code = 0x2d148, .pme_short_desc = "The processor's data cache was reloaded from the local chip's Memory due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from the local chip's Memory due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM_CYC", .pme_code = 0x4d128, .pme_short_desc = "Duration in cycles to reload from the local chip's Memory due to a marked load", .pme_long_desc = "Duration in cycles to reload from the local chip's Memory due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_MEM ] = { .pme_name = "PM_MRK_DATA_FROM_MEM", .pme_code = 0x201e0, .pme_short_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_MEMORY ] = { .pme_name = "PM_MRK_DATA_FROM_MEMORY", .pme_code = 0x2d14c, .pme_short_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_MEMORY_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_MEMORY_CYC", .pme_code = 0x4d12c, .pme_short_desc = "Duration in cycles to reload from a memory location including L4 from local remote or distant due to a marked load", .pme_long_desc 
= "Duration in cycles to reload from a memory location including L4 from local remote or distant due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE ] = { .pme_name = "PM_MRK_DATA_FROM_OFF_CHIP_CACHE", .pme_code = 0x4d14a, .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC", .pme_code = 0x2d12a, .pme_short_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", .pme_long_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE ] = { .pme_name = "PM_MRK_DATA_FROM_ON_CHIP_CACHE", .pme_code = 0x1d148, .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC", .pme_code = 0x4c128, .pme_short_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on the same chip due to a marked load", .pme_long_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on the same chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_RL2L3_MOD ] = { .pme_name = 
"PM_MRK_DATA_FROM_RL2L3_MOD", .pme_code = 0x2d146, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_MOD_CYC", .pme_code = 0x4d126, .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_RL2L3_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_SHR", .pme_code = 0x1d14a, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_SHR_CYC", .pme_code = 0x4c12a, .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_RL4 ] = { .pme_name = "PM_MRK_DATA_FROM_RL4", .pme_code = 0x2d14a, .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on 
the same Node or Group (Remote) due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_RL4_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RL4_CYC", .pme_code = 0x4d12a, .pme_short_desc = "Duration in cycles to reload from another chip's L4 on the same Node or Group (Remote) due to a marked load", .pme_long_desc = "Duration in cycles to reload from another chip's L4 on the same Node or Group (Remote) due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_RMEM ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM", .pme_code = 0x3d14a, .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Remote) due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Remote) due to a marked load.", }, [ POWER8_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM_CYC", .pme_code = 0x2c12a, .pme_short_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group (Remote) due to a marked load", .pme_long_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group (Remote) due to a marked load.", }, [ POWER8_PME_PM_MRK_DCACHE_RELOAD_INTV ] = { .pme_name = "PM_MRK_DCACHE_RELOAD_INTV", .pme_code = 0x40118, .pme_short_desc = "Combined Intervention event", .pme_long_desc = "Combined Intervention event.", }, [ POWER8_PME_PM_MRK_DERAT_MISS ] = { .pme_name = "PM_MRK_DERAT_MISS", .pme_code = 0x301e6, .pme_short_desc = "Erat Miss (TLB Access) All page sizes", .pme_long_desc = "Erat Miss (TLB Access) All page sizes.", }, [ POWER8_PME_PM_MRK_DERAT_MISS_16G ] = { .pme_name = "PM_MRK_DERAT_MISS_16G", .pme_code = 0x4d154, .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16G", .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16G.", 
}, [ POWER8_PME_PM_MRK_DERAT_MISS_16M ] = { .pme_name = "PM_MRK_DERAT_MISS_16M", .pme_code = 0x3d154, .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16M", .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16M.", }, [ POWER8_PME_PM_MRK_DERAT_MISS_4K ] = { .pme_name = "PM_MRK_DERAT_MISS_4K", .pme_code = 0x1d156, .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 4K", .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 4K.", }, [ POWER8_PME_PM_MRK_DERAT_MISS_64K ] = { .pme_name = "PM_MRK_DERAT_MISS_64K", .pme_code = 0x2d154, .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 64K", .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 64K.", }, [ POWER8_PME_PM_MRK_DFU_FIN ] = { .pme_name = "PM_MRK_DFU_FIN", .pme_code = 0x20132, .pme_short_desc = "Decimal Unit marked Instruction Finish", .pme_long_desc = "Decimal Unit marked Instruction Finish.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_DL2L3_MOD ] = { .pme_name = "PM_MRK_DPTEG_FROM_DL2L3_MOD", .pme_code = 0x4f148, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_DL2L3_SHR ] = { .pme_name = "PM_MRK_DPTEG_FROM_DL2L3_SHR", .pme_code = 0x3f148, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request.", }, 
[ POWER8_PME_PM_MRK_DPTEG_FROM_DL4 ] = { .pme_name = "PM_MRK_DPTEG_FROM_DL4", .pme_code = 0x3f14c, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_DMEM ] = { .pme_name = "PM_MRK_DPTEG_FROM_DMEM", .pme_code = 0x4f14c, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L2 ] = { .pme_name = "PM_MRK_DPTEG_FROM_L2", .pme_code = 0x1f142, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L21_MOD ] = { .pme_name = "PM_MRK_DPTEG_FROM_L21_MOD", .pme_code = 0x4f146, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L21_SHR ] = { .pme_name = "PM_MRK_DPTEG_FROM_L21_SHR", .pme_code = 0x3f146, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the 
same chip due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L2MISS ] = { .pme_name = "PM_MRK_DPTEG_FROM_L2MISS", .pme_code = 0x1f14e, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L2_DISP_CONFLICT_LDHITST ] = { .pme_name = "PM_MRK_DPTEG_FROM_L2_DISP_CONFLICT_LDHITST", .pme_code = 0x3f140, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 with load hit store conflict due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 with load hit store conflict due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L2_DISP_CONFLICT_OTHER ] = { .pme_name = "PM_MRK_DPTEG_FROM_L2_DISP_CONFLICT_OTHER", .pme_code = 0x4f140, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 with dispatch conflict due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 with dispatch conflict due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L2_MEPF ] = { .pme_name = "PM_MRK_DPTEG_FROM_L2_MEPF", .pme_code = 0x2f140, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state. due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state. 
due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L2_NO_CONFLICT ] = { .pme_name = "PM_MRK_DPTEG_FROM_L2_NO_CONFLICT", .pme_code = 0x1f140, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L3 ] = { .pme_name = "PM_MRK_DPTEG_FROM_L3", .pme_code = 0x4f142, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L31_ECO_MOD ] = { .pme_name = "PM_MRK_DPTEG_FROM_L31_ECO_MOD", .pme_code = 0x4f144, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L31_ECO_SHR ] = { .pme_name = "PM_MRK_DPTEG_FROM_L31_ECO_SHR", .pme_code = 0x3f144, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L31_MOD ] = { .pme_name = "PM_MRK_DPTEG_FROM_L31_MOD", .pme_code = 0x2f144, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with 
Modified (M) data from another core's L3 on the same chip due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L31_SHR ] = { .pme_name = "PM_MRK_DPTEG_FROM_L31_SHR", .pme_code = 0x1f146, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L3MISS ] = { .pme_name = "PM_MRK_DPTEG_FROM_L3MISS", .pme_code = 0x4f14e, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT ] = { .pme_name = "PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT", .pme_code = 0x3f142, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L3_MEPF ] = { .pme_name = "PM_MRK_DPTEG_FROM_L3_MEPF", .pme_code = 0x2f142, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state. due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state. 
due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_L3_NO_CONFLICT ] = { .pme_name = "PM_MRK_DPTEG_FROM_L3_NO_CONFLICT", .pme_code = 0x1f144, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_LL4 ] = { .pme_name = "PM_MRK_DPTEG_FROM_LL4", .pme_code = 0x1f14c, .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_LMEM ] = { .pme_name = "PM_MRK_DPTEG_FROM_LMEM", .pme_code = 0x2f148, .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_MEMORY ] = { .pme_name = "PM_MRK_DPTEG_FROM_MEMORY", .pme_code = 0x2f14c, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE ] = { .pme_name = "PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE", .pme_code = 0x4f14a, .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data 
from another core's L2/L3 on a different chip (remote or distant) due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_ON_CHIP_CACHE ] = { .pme_name = "PM_MRK_DPTEG_FROM_ON_CHIP_CACHE", .pme_code = 0x1f148, .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_RL2L3_MOD ] = { .pme_name = "PM_MRK_DPTEG_FROM_RL2L3_MOD", .pme_code = 0x2f146, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_RL2L3_SHR ] = { .pme_name = "PM_MRK_DPTEG_FROM_RL2L3_SHR", .pme_code = 0x1f14a, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_RL4 ] = { .pme_name = "PM_MRK_DPTEG_FROM_RL4", .pme_code = 0x2f14a, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to a marked data side 
request.", }, [ POWER8_PME_PM_MRK_DPTEG_FROM_RMEM ] = { .pme_name = "PM_MRK_DPTEG_FROM_RMEM", .pme_code = 0x3f14a, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to a marked data side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to a marked data side request.", }, [ POWER8_PME_PM_MRK_DTLB_MISS ] = { .pme_name = "PM_MRK_DTLB_MISS", .pme_code = 0x401e4, .pme_short_desc = "Marked dtlb miss", .pme_long_desc = "Marked dtlb miss.", }, [ POWER8_PME_PM_MRK_DTLB_MISS_16G ] = { .pme_name = "PM_MRK_DTLB_MISS_16G", .pme_code = 0x1d158, .pme_short_desc = "Marked Data TLB Miss page size 16G", .pme_long_desc = "Marked Data TLB Miss page size 16G.", }, [ POWER8_PME_PM_MRK_DTLB_MISS_16M ] = { .pme_name = "PM_MRK_DTLB_MISS_16M", .pme_code = 0x4d156, .pme_short_desc = "Marked Data TLB Miss page size 16M", .pme_long_desc = "Marked Data TLB Miss page size 16M.", }, [ POWER8_PME_PM_MRK_DTLB_MISS_4K ] = { .pme_name = "PM_MRK_DTLB_MISS_4K", .pme_code = 0x2d156, .pme_short_desc = "Marked Data TLB Miss page size 4k", .pme_long_desc = "Marked Data TLB Miss page size 4k.", }, [ POWER8_PME_PM_MRK_DTLB_MISS_64K ] = { .pme_name = "PM_MRK_DTLB_MISS_64K", .pme_code = 0x3d156, .pme_short_desc = "Marked Data TLB Miss page size 64K", .pme_long_desc = "Marked Data TLB Miss page size 64K.", }, [ POWER8_PME_PM_MRK_FAB_RSP_BKILL ] = { .pme_name = "PM_MRK_FAB_RSP_BKILL", .pme_code = 0x40154, .pme_short_desc = "Marked store had to do a bkill", .pme_long_desc = "Marked store had to do a bkill.", }, [ POWER8_PME_PM_MRK_FAB_RSP_BKILL_CYC ] = { .pme_name = "PM_MRK_FAB_RSP_BKILL_CYC", .pme_code = 0x2f150, .pme_short_desc = "cycles L2 RC took for a bkill", .pme_long_desc = "cycles L2 RC took for a bkill.", }, [ POWER8_PME_PM_MRK_FAB_RSP_CLAIM_RTY ] = { .pme_name = "PM_MRK_FAB_RSP_CLAIM_RTY", .pme_code = 0x3015e, .pme_short_desc = "Sampled 
store did a rwitm and got a rty", .pme_long_desc = "Sampled store did a rwitm and got a rty.", }, [ POWER8_PME_PM_MRK_FAB_RSP_DCLAIM ] = { .pme_name = "PM_MRK_FAB_RSP_DCLAIM", .pme_code = 0x30154, .pme_short_desc = "Marked store had to do a dclaim", .pme_long_desc = "Marked store had to do a dclaim.", }, [ POWER8_PME_PM_MRK_FAB_RSP_DCLAIM_CYC ] = { .pme_name = "PM_MRK_FAB_RSP_DCLAIM_CYC", .pme_code = 0x2f152, .pme_short_desc = "cycles L2 RC took for a dclaim", .pme_long_desc = "cycles L2 RC took for a dclaim.", }, [ POWER8_PME_PM_MRK_FAB_RSP_MATCH ] = { .pme_name = "PM_MRK_FAB_RSP_MATCH", .pme_code = 0x30156, .pme_short_desc = "ttype and cresp matched as specified in MMCR1", .pme_long_desc = "ttype and cresp matched as specified in MMCR1.", }, [ POWER8_PME_PM_MRK_FAB_RSP_MATCH_CYC ] = { .pme_name = "PM_MRK_FAB_RSP_MATCH_CYC", .pme_code = 0x4f152, .pme_short_desc = "cresp/ttype match cycles", .pme_long_desc = "cresp/ttype match cycles.", }, [ POWER8_PME_PM_MRK_FAB_RSP_RD_RTY ] = { .pme_name = "PM_MRK_FAB_RSP_RD_RTY", .pme_code = 0x4015e, .pme_short_desc = "Sampled L2 reads retry count", .pme_long_desc = "Sampled L2 reads retry count.", }, [ POWER8_PME_PM_MRK_FAB_RSP_RD_T_INTV ] = { .pme_name = "PM_MRK_FAB_RSP_RD_T_INTV", .pme_code = 0x1015e, .pme_short_desc = "Sampled Read got a T intervention", .pme_long_desc = "Sampled Read got a T intervention.", }, [ POWER8_PME_PM_MRK_FAB_RSP_RWITM_CYC ] = { .pme_name = "PM_MRK_FAB_RSP_RWITM_CYC", .pme_code = 0x4f150, .pme_short_desc = "cycles L2 RC took for a rwitm", .pme_long_desc = "cycles L2 RC took for a rwitm.", }, [ POWER8_PME_PM_MRK_FAB_RSP_RWITM_RTY ] = { .pme_name = "PM_MRK_FAB_RSP_RWITM_RTY", .pme_code = 0x2015e, .pme_short_desc = "Sampled store did a rwitm and got a rty", .pme_long_desc = "Sampled store did a rwitm and got a rty.", }, [ POWER8_PME_PM_MRK_FILT_MATCH ] = { .pme_name = "PM_MRK_FILT_MATCH", .pme_code = 0x2013c, .pme_short_desc = "Marked filter Match", .pme_long_desc = "Marked filter Match.", }, [ 
POWER8_PME_PM_MRK_FIN_STALL_CYC ] = { .pme_name = "PM_MRK_FIN_STALL_CYC", .pme_code = 0x1013c, .pme_short_desc = "Marked instruction Finish Stall cycles (marked finish after NTC) (use edge detect to count )", .pme_long_desc = "Marked instruction Finish Stall cycles (marked finish after NTC) (use edge detect to count #).", }, [ POWER8_PME_PM_MRK_FXU_FIN ] = { .pme_name = "PM_MRK_FXU_FIN", .pme_code = 0x20134, .pme_short_desc = "fxu marked instr finish", .pme_long_desc = "fxu marked instr finish.", }, [ POWER8_PME_PM_MRK_GRP_CMPL ] = { .pme_name = "PM_MRK_GRP_CMPL", .pme_code = 0x40130, .pme_short_desc = "marked instruction finished (completed)", .pme_long_desc = "marked instruction finished (completed).", }, [ POWER8_PME_PM_MRK_GRP_IC_MISS ] = { .pme_name = "PM_MRK_GRP_IC_MISS", .pme_code = 0x4013a, .pme_short_desc = "Marked Group experienced I cache miss", .pme_long_desc = "Marked Group experienced I cache miss.", }, [ POWER8_PME_PM_MRK_GRP_NTC ] = { .pme_name = "PM_MRK_GRP_NTC", .pme_code = 0x3013c, .pme_short_desc = "Marked group ntc cycles.", .pme_long_desc = "Marked group ntc cycles.", }, [ POWER8_PME_PM_MRK_INST_CMPL ] = { .pme_name = "PM_MRK_INST_CMPL", .pme_code = 0x401e0, .pme_short_desc = "marked instruction completed", .pme_long_desc = "marked instruction completed.", }, [ POWER8_PME_PM_MRK_INST_DECODED ] = { .pme_name = "PM_MRK_INST_DECODED", .pme_code = 0x20130, .pme_short_desc = "marked instruction decoded", .pme_long_desc = "marked instruction decoded. 
Name from ISU?", }, [ POWER8_PME_PM_MRK_INST_DISP ] = { .pme_name = "PM_MRK_INST_DISP", .pme_code = 0x101e0, .pme_short_desc = "The thread has dispatched a randomly sampled marked instruction", .pme_long_desc = "Marked Instruction dispatched.", }, [ POWER8_PME_PM_MRK_INST_FIN ] = { .pme_name = "PM_MRK_INST_FIN", .pme_code = 0x30130, .pme_short_desc = "marked instruction finished", .pme_long_desc = "marked instr finish any unit .", }, [ POWER8_PME_PM_MRK_INST_FROM_L3MISS ] = { .pme_name = "PM_MRK_INST_FROM_L3MISS", .pme_code = 0x401e6, .pme_short_desc = "Marked instruction was reloaded from a location beyond the local chiplet", .pme_long_desc = "n/a", }, [ POWER8_PME_PM_MRK_INST_ISSUED ] = { .pme_name = "PM_MRK_INST_ISSUED", .pme_code = 0x10132, .pme_short_desc = "Marked instruction issued", .pme_long_desc = "Marked instruction issued.", }, [ POWER8_PME_PM_MRK_INST_TIMEO ] = { .pme_name = "PM_MRK_INST_TIMEO", .pme_code = 0x40134, .pme_short_desc = "marked Instruction finish timeout (instruction lost)", .pme_long_desc = "marked Instruction finish timeout (instruction lost).", }, [ POWER8_PME_PM_MRK_L1_ICACHE_MISS ] = { .pme_name = "PM_MRK_L1_ICACHE_MISS", .pme_code = 0x101e4, .pme_short_desc = "sampled Instruction suffered an icache Miss", .pme_long_desc = "Marked L1 Icache Miss.", }, [ POWER8_PME_PM_MRK_L1_RELOAD_VALID ] = { .pme_name = "PM_MRK_L1_RELOAD_VALID", .pme_code = 0x101ea, .pme_short_desc = "Marked demand reload", .pme_long_desc = "Marked demand reload.", }, [ POWER8_PME_PM_MRK_L2_RC_DISP ] = { .pme_name = "PM_MRK_L2_RC_DISP", .pme_code = 0x20114, .pme_short_desc = "Marked Instruction RC dispatched in L2", .pme_long_desc = "Marked Instruction RC dispatched in L2.", }, [ POWER8_PME_PM_MRK_L2_RC_DONE ] = { .pme_name = "PM_MRK_L2_RC_DONE", .pme_code = 0x3012a, .pme_short_desc = "Marked RC done", .pme_long_desc = "Marked RC done.", }, [ POWER8_PME_PM_MRK_LARX_FIN ] = { .pme_name = "PM_MRK_LARX_FIN", .pme_code = 0x40116, .pme_short_desc = "Larx finished", 
.pme_long_desc = "Larx finished .", }, [ POWER8_PME_PM_MRK_LD_MISS_EXPOSED ] = { .pme_name = "PM_MRK_LD_MISS_EXPOSED", .pme_code = 0x1013f, .pme_short_desc = "Marked Load exposed Miss (exposed period ended)", .pme_long_desc = "Marked Load exposed Miss (use edge detect to count #)", }, [ POWER8_PME_PM_MRK_LD_MISS_EXPOSED_CYC ] = { .pme_name = "PM_MRK_LD_MISS_EXPOSED_CYC", .pme_code = 0x1013e, .pme_short_desc = "Marked Load exposed Miss cycles", .pme_long_desc = "Marked Load exposed Miss (use edge detect to count #).", }, [ POWER8_PME_PM_MRK_LD_MISS_L1 ] = { .pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 0x201e2, .pme_short_desc = "Marked DL1 Demand Miss counted at exec time", .pme_long_desc = "Marked DL1 Demand Miss counted at exec time.", }, [ POWER8_PME_PM_MRK_LD_MISS_L1_CYC ] = { .pme_name = "PM_MRK_LD_MISS_L1_CYC", .pme_code = 0x4013e, .pme_short_desc = "Marked ld latency", .pme_long_desc = "Marked ld latency.", }, [ POWER8_PME_PM_MRK_LSU_FIN ] = { .pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x40132, .pme_short_desc = "lsu marked instr finish", .pme_long_desc = "lsu marked instr finish.", }, [ POWER8_PME_PM_MRK_LSU_FLUSH ] = { .pme_name = "PM_MRK_LSU_FLUSH", .pme_code = 0xd180, .pme_short_desc = "Flush: (marked) : All Cases", .pme_long_desc = "Flush: (marked) : All Cases42", }, [ POWER8_PME_PM_MRK_LSU_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_LRQ", .pme_code = 0xd188, .pme_short_desc = "Flush: (marked) LRQ", .pme_long_desc = "Flush: (marked) LRQMarked LRQ flushes", }, [ POWER8_PME_PM_MRK_LSU_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU_FLUSH_SRQ", .pme_code = 0xd18a, .pme_short_desc = "Flush: (marked) SRQ", .pme_long_desc = "Flush: (marked) SRQMarked SRQ lhs flushes", }, [ POWER8_PME_PM_MRK_LSU_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU_FLUSH_ULD", .pme_code = 0xd184, .pme_short_desc = "Flush: (marked) Unaligned Load", .pme_long_desc = "Flush: (marked) Unaligned LoadMarked unaligned load flushes", }, [ POWER8_PME_PM_MRK_LSU_FLUSH_UST ] = { .pme_name = 
"PM_MRK_LSU_FLUSH_UST", .pme_code = 0xd186, .pme_short_desc = "Flush: (marked) Unaligned Store", .pme_long_desc = "Flush: (marked) Unaligned StoreMarked unaligned store flushes", }, [ POWER8_PME_PM_MRK_LSU_REJECT ] = { .pme_name = "PM_MRK_LSU_REJECT", .pme_code = 0x40164, .pme_short_desc = "LSU marked reject (up to 2 per cycle)", .pme_long_desc = "LSU marked reject (up to 2 per cycle).", }, [ POWER8_PME_PM_MRK_LSU_REJECT_ERAT_MISS ] = { .pme_name = "PM_MRK_LSU_REJECT_ERAT_MISS", .pme_code = 0x30164, .pme_short_desc = "LSU marked reject due to ERAT (up to 2 per cycle)", .pme_long_desc = "LSU marked reject due to ERAT (up to 2 per cycle).", }, [ POWER8_PME_PM_MRK_NTF_FIN ] = { .pme_name = "PM_MRK_NTF_FIN", .pme_code = 0x20112, .pme_short_desc = "Marked next to finish instruction finished", .pme_long_desc = "Marked next to finish instruction finished.", }, [ POWER8_PME_PM_MRK_RUN_CYC ] = { .pme_name = "PM_MRK_RUN_CYC", .pme_code = 0x1d15e, .pme_short_desc = "Marked run cycles", .pme_long_desc = "Marked run cycles.", }, [ POWER8_PME_PM_MRK_SRC_PREF_TRACK_EFF ] = { .pme_name = "PM_MRK_SRC_PREF_TRACK_EFF", .pme_code = 0x1d15a, .pme_short_desc = "Marked src pref track was effective", .pme_long_desc = "Marked src pref track was effective.", }, [ POWER8_PME_PM_MRK_SRC_PREF_TRACK_INEFF ] = { .pme_name = "PM_MRK_SRC_PREF_TRACK_INEFF", .pme_code = 0x3d15a, .pme_short_desc = "Prefetch tracked was ineffective for marked src", .pme_long_desc = "Prefetch tracked was ineffective for marked src.", }, [ POWER8_PME_PM_MRK_SRC_PREF_TRACK_MOD ] = { .pme_name = "PM_MRK_SRC_PREF_TRACK_MOD", .pme_code = 0x4d15c, .pme_short_desc = "Prefetch tracked was moderate for marked src", .pme_long_desc = "Prefetch tracked was moderate for marked src.", }, [ POWER8_PME_PM_MRK_SRC_PREF_TRACK_MOD_L2 ] = { .pme_name = "PM_MRK_SRC_PREF_TRACK_MOD_L2", .pme_code = 0x1d15c, .pme_short_desc = "Marked src Prefetch Tracked was moderate (source L2)", .pme_long_desc = "Marked src Prefetch Tracked was moderate 
(source L2).", }, [ POWER8_PME_PM_MRK_SRC_PREF_TRACK_MOD_L3 ] = { .pme_name = "PM_MRK_SRC_PREF_TRACK_MOD_L3", .pme_code = 0x3d15c, .pme_short_desc = "Prefetch tracked was moderate (L3 hit) for marked src", .pme_long_desc = "Prefetch tracked was moderate (L3 hit) for marked src.", }, [ POWER8_PME_PM_MRK_STALL_CMPLU_CYC ] = { .pme_name = "PM_MRK_STALL_CMPLU_CYC", .pme_code = 0x3013e, .pme_short_desc = "Marked Group completion Stall", .pme_long_desc = "Marked Group Completion Stall cycles (use edge detect to count #).", }, [ POWER8_PME_PM_MRK_STCX_FAIL ] = { .pme_name = "PM_MRK_STCX_FAIL", .pme_code = 0x3e158, .pme_short_desc = "marked stcx failed", .pme_long_desc = "marked stcx failed.", }, [ POWER8_PME_PM_MRK_ST_CMPL ] = { .pme_name = "PM_MRK_ST_CMPL", .pme_code = 0x10134, .pme_short_desc = "marked store completed and sent to nest", .pme_long_desc = "Marked store completed.", }, [ POWER8_PME_PM_MRK_ST_CMPL_INT ] = { .pme_name = "PM_MRK_ST_CMPL_INT", .pme_code = 0x30134, .pme_short_desc = "marked store finished with intervention", .pme_long_desc = "marked store complete (data home) with intervention.", }, [ POWER8_PME_PM_MRK_ST_DRAIN_TO_L2DISP_CYC ] = { .pme_name = "PM_MRK_ST_DRAIN_TO_L2DISP_CYC", .pme_code = 0x3f150, .pme_short_desc = "cycles to drain st from core to L2", .pme_long_desc = "cycles to drain st from core to L2.", }, [ POWER8_PME_PM_MRK_ST_FWD ] = { .pme_name = "PM_MRK_ST_FWD", .pme_code = 0x3012c, .pme_short_desc = "Marked st forwards", .pme_long_desc = "Marked st forwards.", }, [ POWER8_PME_PM_MRK_ST_L2DISP_TO_CMPL_CYC ] = { .pme_name = "PM_MRK_ST_L2DISP_TO_CMPL_CYC", .pme_code = 0x1f150, .pme_short_desc = "cycles from L2 rc disp to l2 rc completion", .pme_long_desc = "cycles from L2 rc disp to l2 rc completion.", }, [ POWER8_PME_PM_MRK_ST_NEST ] = { .pme_name = "PM_MRK_ST_NEST", .pme_code = 0x20138, .pme_short_desc = "Marked store sent to nest", .pme_long_desc = "Marked store sent to nest.", }, [ POWER8_PME_PM_MRK_TGT_PREF_TRACK_EFF ] = { .pme_name = 
"PM_MRK_TGT_PREF_TRACK_EFF", .pme_code = 0x1c15a, .pme_short_desc = "Marked target pref track was effective", .pme_long_desc = "Marked target pref track was effective.", }, [ POWER8_PME_PM_MRK_TGT_PREF_TRACK_INEFF ] = { .pme_name = "PM_MRK_TGT_PREF_TRACK_INEFF", .pme_code = 0x3c15a, .pme_short_desc = "Prefetch tracked was ineffective for marked target", .pme_long_desc = "Prefetch tracked was ineffective for marked target.", }, [ POWER8_PME_PM_MRK_TGT_PREF_TRACK_MOD ] = { .pme_name = "PM_MRK_TGT_PREF_TRACK_MOD", .pme_code = 0x4c15c, .pme_short_desc = "Prefetch tracked was moderate for marked target", .pme_long_desc = "Prefetch tracked was moderate for marked target.", }, [ POWER8_PME_PM_MRK_TGT_PREF_TRACK_MOD_L2 ] = { .pme_name = "PM_MRK_TGT_PREF_TRACK_MOD_L2", .pme_code = 0x1c15c, .pme_short_desc = "Marked target Prefetch Tracked was moderate (source L2)", .pme_long_desc = "Marked target Prefetch Tracked was moderate (source L2).", }, [ POWER8_PME_PM_MRK_TGT_PREF_TRACK_MOD_L3 ] = { .pme_name = "PM_MRK_TGT_PREF_TRACK_MOD_L3", .pme_code = 0x3c15c, .pme_short_desc = "Prefetch tracked was moderate (L3 hit) for marked target", .pme_long_desc = "Prefetch tracked was moderate (L3 hit) for marked target.", }, [ POWER8_PME_PM_MRK_VSU_FIN ] = { .pme_name = "PM_MRK_VSU_FIN", .pme_code = 0x30132, .pme_short_desc = "VSU marked instr finish", .pme_long_desc = "vsu (fpu) marked instr finish.", }, [ POWER8_PME_PM_MULT_MRK ] = { .pme_name = "PM_MULT_MRK", .pme_code = 0x3d15e, .pme_short_desc = "mult marked instr", .pme_long_desc = "mult marked instr.", }, [ POWER8_PME_PM_NESTED_TEND ] = { .pme_name = "PM_NESTED_TEND", .pme_code = 0x20b0, .pme_short_desc = "Completion time nested tend", .pme_long_desc = "Completion time nested tend", }, [ POWER8_PME_PM_NEST_REF_CLK ] = { .pme_name = "PM_NEST_REF_CLK", .pme_code = 0x3006e, .pme_short_desc = "Multiply by 4 to obtain the number of PB cycles", .pme_long_desc = "Nest reference clocks.", }, [ POWER8_PME_PM_NON_FAV_TBEGIN ] = { .pme_name = 
"PM_NON_FAV_TBEGIN", .pme_code = 0x20b6, .pme_short_desc = "Dispatch time non favored tbegin", .pme_long_desc = "Dispatch time non favored tbegin", }, [ POWER8_PME_PM_NON_TM_RST_SC ] = { .pme_name = "PM_NON_TM_RST_SC", .pme_code = 0x328084, .pme_short_desc = "non tm snp rst tm sc", .pme_long_desc = "non tm snp rst tm sc", }, [ POWER8_PME_PM_NTCG_ALL_FIN ] = { .pme_name = "PM_NTCG_ALL_FIN", .pme_code = 0x2001a, .pme_short_desc = "Cycles after all instructions have finished to group completed", .pme_long_desc = "Cycles after all instructions have finished to group completed.", }, [ POWER8_PME_PM_OUTER_TBEGIN ] = { .pme_name = "PM_OUTER_TBEGIN", .pme_code = 0x20ac, .pme_short_desc = "Completion time outer tbegin", .pme_long_desc = "Completion time outer tbegin", }, [ POWER8_PME_PM_OUTER_TEND ] = { .pme_name = "PM_OUTER_TEND", .pme_code = 0x20ae, .pme_short_desc = "Completion time outer tend", .pme_long_desc = "Completion time outer tend", }, [ POWER8_PME_PM_PMC1_OVERFLOW ] = { .pme_name = "PM_PMC1_OVERFLOW", .pme_code = 0x20010, .pme_short_desc = "Overflow from counter 1", .pme_long_desc = "Overflow from counter 1.", }, [ POWER8_PME_PM_PMC2_OVERFLOW ] = { .pme_name = "PM_PMC2_OVERFLOW", .pme_code = 0x30010, .pme_short_desc = "Overflow from counter 2", .pme_long_desc = "Overflow from counter 2.", }, [ POWER8_PME_PM_PMC2_REWIND ] = { .pme_name = "PM_PMC2_REWIND", .pme_code = 0x30020, .pme_short_desc = "PMC2 Rewind Event (did not match condition)", .pme_long_desc = "PMC2 Rewind Event (did not match condition).", }, [ POWER8_PME_PM_PMC2_SAVED ] = { .pme_name = "PM_PMC2_SAVED", .pme_code = 0x10022, .pme_short_desc = "PMC2 Rewind Value saved", .pme_long_desc = "PMC2 Rewind Value saved (matched condition).", }, [ POWER8_PME_PM_PMC3_OVERFLOW ] = { .pme_name = "PM_PMC3_OVERFLOW", .pme_code = 0x40010, .pme_short_desc = "Overflow from counter 3", .pme_long_desc = "Overflow from counter 3.", }, [ POWER8_PME_PM_PMC4_OVERFLOW ] = { .pme_name = "PM_PMC4_OVERFLOW", .pme_code = 
0x10010, .pme_short_desc = "Overflow from counter 4", .pme_long_desc = "Overflow from counter 4.", }, [ POWER8_PME_PM_PMC4_REWIND ] = { .pme_name = "PM_PMC4_REWIND", .pme_code = 0x10020, .pme_short_desc = "PMC4 Rewind Event", .pme_long_desc = "PMC4 Rewind Event (did not match condition).", }, [ POWER8_PME_PM_PMC4_SAVED ] = { .pme_name = "PM_PMC4_SAVED", .pme_code = 0x30022, .pme_short_desc = "PMC4 Rewind Value saved (matched condition)", .pme_long_desc = "PMC4 Rewind Value saved (matched condition).", }, [ POWER8_PME_PM_PMC5_OVERFLOW ] = { .pme_name = "PM_PMC5_OVERFLOW", .pme_code = 0x10024, .pme_short_desc = "Overflow from counter 5", .pme_long_desc = "Overflow from counter 5.", }, [ POWER8_PME_PM_PMC6_OVERFLOW ] = { .pme_name = "PM_PMC6_OVERFLOW", .pme_code = 0x30024, .pme_short_desc = "Overflow from counter 6", .pme_long_desc = "Overflow from counter 6.", }, [ POWER8_PME_PM_PREF_TRACKED ] = { .pme_name = "PM_PREF_TRACKED", .pme_code = 0x2005a, .pme_short_desc = "Total number of Prefetch Operations that were tracked", .pme_long_desc = "Total number of Prefetch Operations that were tracked.", }, [ POWER8_PME_PM_PREF_TRACK_EFF ] = { .pme_name = "PM_PREF_TRACK_EFF", .pme_code = 0x1005a, .pme_short_desc = "Prefetch Tracked was effective", .pme_long_desc = "Prefetch Tracked was effective.", }, [ POWER8_PME_PM_PREF_TRACK_INEFF ] = { .pme_name = "PM_PREF_TRACK_INEFF", .pme_code = 0x3005a, .pme_short_desc = "Prefetch tracked was ineffective", .pme_long_desc = "Prefetch tracked was ineffective.", }, [ POWER8_PME_PM_PREF_TRACK_MOD ] = { .pme_name = "PM_PREF_TRACK_MOD", .pme_code = 0x4005a, .pme_short_desc = "Prefetch tracked was moderate", .pme_long_desc = "Prefetch tracked was moderate.", }, [ POWER8_PME_PM_PREF_TRACK_MOD_L2 ] = { .pme_name = "PM_PREF_TRACK_MOD_L2", .pme_code = 0x1005c, .pme_short_desc = "Prefetch Tracked was moderate (source L2)", .pme_long_desc = "Prefetch Tracked was moderate (source L2).", }, [ POWER8_PME_PM_PREF_TRACK_MOD_L3 ] = { .pme_name = 
"PM_PREF_TRACK_MOD_L3", .pme_code = 0x3005c, .pme_short_desc = "Prefetch tracked was moderate (L3)", .pme_long_desc = "Prefetch tracked was moderate (L3).", }, [ POWER8_PME_PM_PROBE_NOP_DISP ] = { .pme_name = "PM_PROBE_NOP_DISP", .pme_code = 0x40014, .pme_short_desc = "ProbeNops dispatched", .pme_long_desc = "ProbeNops dispatched.", }, [ POWER8_PME_PM_PTE_PREFETCH ] = { .pme_name = "PM_PTE_PREFETCH", .pme_code = 0xe084, .pme_short_desc = "PTE prefetches", .pme_long_desc = "PTE prefetches", }, [ POWER8_PME_PM_PUMP_CPRED ] = { .pme_name = "PM_PUMP_CPRED", .pme_code = 0x10054, .pme_short_desc = "Pump prediction correct. Counts across all types of pumps for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", .pme_long_desc = "Pump prediction correct. Counts across all types of pumps for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate).", }, [ POWER8_PME_PM_PUMP_MPRED ] = { .pme_name = "PM_PUMP_MPRED", .pme_code = 0x40052, .pme_short_desc = "Pump misprediction. Counts across all types of pumps for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", .pme_long_desc = "Pump misprediction. Counts across all types of pumps for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate).", }, [ POWER8_PME_PM_RC0_ALLOC ] = { .pme_name = "PM_RC0_ALLOC", .pme_code = 0x16081, .pme_short_desc = "RC mach 0 Busy. Used by PMU to sample ave RC livetime(mach0 used as sample point)", .pme_long_desc = "0.0", }, [ POWER8_PME_PM_RC0_BUSY ] = { .pme_name = "PM_RC0_BUSY", .pme_code = 0x16080, .pme_short_desc = "RC mach 0 Busy. Used by PMU to sample ave RC livetime(mach0 used as sample point)", .pme_long_desc = "RC mach 0 Busy. 
Used by PMU to sample ave RC livetime(mach0 used as sample point)", }, [ POWER8_PME_PM_RC_LIFETIME_EXC_1024 ] = { .pme_name = "PM_RC_LIFETIME_EXC_1024", .pme_code = 0xde200301eaull, .pme_short_desc = "Number of times the RC machine for a sampled instruction was active for more than 1024 cycles", .pme_long_desc = "Reload latency exceeded 1024 cyc", }, [ POWER8_PME_PM_RC_LIFETIME_EXC_2048 ] = { .pme_name = "PM_RC_LIFETIME_EXC_2048", .pme_code = 0xde200401ecull, .pme_short_desc = "Number of times the RC machine for a sampled instruction was active for more than 2048 cycles", .pme_long_desc = "Threshold counter exceeded a value of 2048", }, [ POWER8_PME_PM_RC_LIFETIME_EXC_256 ] = { .pme_name = "PM_RC_LIFETIME_EXC_256", .pme_code = 0xde200101e8ull, .pme_short_desc = "Number of times the RC machine for a sampled instruction was active for more than 256 cycles", .pme_long_desc = "Threshold counter exceeded a count of 256", }, [ POWER8_PME_PM_RC_LIFETIME_EXC_32 ] = { .pme_name = "PM_RC_LIFETIME_EXC_32", .pme_code = 0xde200201e6ull, .pme_short_desc = "Number of times the RC machine for a sampled instruction was active for more than 32 cycles", .pme_long_desc = "Reload latency exceeded 32 cyc", }, [ POWER8_PME_PM_RC_USAGE ] = { .pme_name = "PM_RC_USAGE", .pme_code = 0x36088, .pme_short_desc = "Continuous 16 cycle(2to1) window where this signals rotates thru sampling each L2 RC machine busy. PMU uses this wave to then do 16 cyc count to sample total number of machs running", .pme_long_desc = "Continuous 16 cycle(2to1) window where this signals rotates thru sampling each L2 RC machine busy. 
PMU uses this wave to then do 16 cyc count to sample total number of machs running", }, [ POWER8_PME_PM_RD_CLEARING_SC ] = { .pme_name = "PM_RD_CLEARING_SC", .pme_code = 0x34808e, .pme_short_desc = "rd clearing sc", .pme_long_desc = "rd clearing sc", }, [ POWER8_PME_PM_RD_FORMING_SC ] = { .pme_name = "PM_RD_FORMING_SC", .pme_code = 0x34808c, .pme_short_desc = "rd forming sc", .pme_long_desc = "rd forming sc", }, [ POWER8_PME_PM_RD_HIT_PF ] = { .pme_name = "PM_RD_HIT_PF", .pme_code = 0x428086, .pme_short_desc = "rd machine hit l3 pf machine", .pme_long_desc = "rd machine hit l3 pf machine", }, [ POWER8_PME_PM_REAL_SRQ_FULL ] = { .pme_name = "PM_REAL_SRQ_FULL", .pme_code = 0x20004, .pme_short_desc = "Out of real srq entries", .pme_long_desc = "Out of real srq entries.", }, [ POWER8_PME_PM_RUN_CYC ] = { .pme_name = "PM_RUN_CYC", .pme_code = 0x600f4, .pme_short_desc = "Run_cycles", .pme_long_desc = "Run_cycles.", }, [ POWER8_PME_PM_RUN_CYC_SMT2_MODE ] = { .pme_name = "PM_RUN_CYC_SMT2_MODE", .pme_code = 0x3006c, .pme_short_desc = "Cycles run latch is set and core is in SMT2 mode", .pme_long_desc = "Cycles run latch is set and core is in SMT2 mode.", }, [ POWER8_PME_PM_RUN_CYC_SMT2_SHRD_MODE ] = { .pme_name = "PM_RUN_CYC_SMT2_SHRD_MODE", .pme_code = 0x2006a, .pme_short_desc = "cycles this threads run latch is set and the core is in SMT2 shared mode", .pme_long_desc = "Cycles run latch is set and core is in SMT2-shared mode.", }, [ POWER8_PME_PM_RUN_CYC_SMT2_SPLIT_MODE ] = { .pme_name = "PM_RUN_CYC_SMT2_SPLIT_MODE", .pme_code = 0x1006a, .pme_short_desc = "Cycles run latch is set and core is in SMT2-split mode", .pme_long_desc = "Cycles run latch is set and core is in SMT2-split mode.", }, [ POWER8_PME_PM_RUN_CYC_SMT4_MODE ] = { .pme_name = "PM_RUN_CYC_SMT4_MODE", .pme_code = 0x2006c, .pme_short_desc = "cycles this threads run latch is set and the core is in SMT4 mode", .pme_long_desc = "Cycles run latch is set and core is in SMT4 mode.", }, [ 
POWER8_PME_PM_RUN_CYC_SMT8_MODE ] = { .pme_name = "PM_RUN_CYC_SMT8_MODE", .pme_code = 0x4006c, .pme_short_desc = "Cycles run latch is set and core is in SMT8 mode", .pme_long_desc = "Cycles run latch is set and core is in SMT8 mode.", }, [ POWER8_PME_PM_RUN_CYC_ST_MODE ] = { .pme_name = "PM_RUN_CYC_ST_MODE", .pme_code = 0x1006c, .pme_short_desc = "Cycles run latch is set and core is in ST mode", .pme_long_desc = "Cycles run latch is set and core is in ST mode.", }, [ POWER8_PME_PM_RUN_INST_CMPL ] = { .pme_name = "PM_RUN_INST_CMPL", .pme_code = 0x500fa, .pme_short_desc = "Run_Instructions", .pme_long_desc = "Run_Instructions.", }, [ POWER8_PME_PM_RUN_PURR ] = { .pme_name = "PM_RUN_PURR", .pme_code = 0x400f4, .pme_short_desc = "Run_PURR", .pme_long_desc = "Run_PURR.", }, [ POWER8_PME_PM_RUN_SPURR ] = { .pme_name = "PM_RUN_SPURR", .pme_code = 0x10008, .pme_short_desc = "Run SPURR", .pme_long_desc = "Run SPURR.", }, [ POWER8_PME_PM_SEC_ERAT_HIT ] = { .pme_name = "PM_SEC_ERAT_HIT", .pme_code = 0xf082, .pme_short_desc = "secondary ERAT Hit", .pme_long_desc = "secondary ERAT Hit", }, [ POWER8_PME_PM_SHL_CREATED ] = { .pme_name = "PM_SHL_CREATED", .pme_code = 0x508c, .pme_short_desc = "Store-Hit-Load Table Entry Created", .pme_long_desc = "Store-Hit-Load Table Entry Created", }, [ POWER8_PME_PM_SHL_ST_CONVERT ] = { .pme_name = "PM_SHL_ST_CONVERT", .pme_code = 0x508e, .pme_short_desc = "Store-Hit-Load Table Read Hit with entry Enabled", .pme_long_desc = "Store-Hit-Load Table Read Hit with entry Enabled", }, [ POWER8_PME_PM_SHL_ST_DISABLE ] = { .pme_name = "PM_SHL_ST_DISABLE", .pme_code = 0x5090, .pme_short_desc = "Store-Hit-Load Table Read Hit with entry Disabled (entry was disabled due to the entry shown to not prevent the flush)", .pme_long_desc = "Store-Hit-Load Table Read Hit with entry Disabled (entry was disabled due to the entry shown to not prevent the flush)", }, [ POWER8_PME_PM_SN0_ALLOC ] = { .pme_name = "PM_SN0_ALLOC", .pme_code = 0x26085, .pme_short_desc = 
"SN mach 0 Busy. Used by PMU to sample ave RC livetime(mach0 used as sample point)", .pme_long_desc = "0.0", }, [ POWER8_PME_PM_SN0_BUSY ] = { .pme_name = "PM_SN0_BUSY", .pme_code = 0x26084, .pme_short_desc = "SN mach 0 Busy. Used by PMU to sample ave RC livetime(mach0 used as sample point)", .pme_long_desc = "SN mach 0 Busy. Used by PMU to sample ave RC livetime(mach0 used as sample point)", }, [ POWER8_PME_PM_SNOOP_TLBIE ] = { .pme_name = "PM_SNOOP_TLBIE", .pme_code = 0xd0b2, .pme_short_desc = "TLBIE snoop", .pme_long_desc = "TLBIE snoop", }, [ POWER8_PME_PM_SNP_TM_HIT_M ] = { .pme_name = "PM_SNP_TM_HIT_M", .pme_code = 0x338088, .pme_short_desc = "snp tm st hit m mu", .pme_long_desc = "snp tm st hit m mu", }, [ POWER8_PME_PM_SNP_TM_HIT_T ] = { .pme_name = "PM_SNP_TM_HIT_T", .pme_code = 0x33808a, .pme_short_desc = "snp tm_st_hit t tn te", .pme_long_desc = "snp tm_st_hit t tn te", }, [ POWER8_PME_PM_SN_USAGE ] = { .pme_name = "PM_SN_USAGE", .pme_code = 0x4608c, .pme_short_desc = "Continuous 16 cycle(2to1) window where this signals rotates thru sampling each L2 SN machine busy. PMU uses this wave to then do 16 cyc count to sample total number of machs running", .pme_long_desc = "Continuous 16 cycle(2to1) window where this signals rotates thru sampling each L2 SN machine busy. 
PMU uses this wave to then do 16 cyc count to sample total number of machs running", }, [ POWER8_PME_PM_STALL_END_GCT_EMPTY ] = { .pme_name = "PM_STALL_END_GCT_EMPTY", .pme_code = 0x10028, .pme_short_desc = "Count ended because GCT went empty", .pme_long_desc = "Count ended because GCT went empty.", }, [ POWER8_PME_PM_STCX_FAIL ] = { .pme_name = "PM_STCX_FAIL", .pme_code = 0x1e058, .pme_short_desc = "stcx failed", .pme_long_desc = "stcx failed.", }, [ POWER8_PME_PM_STCX_LSU ] = { .pme_name = "PM_STCX_LSU", .pme_code = 0xc090, .pme_short_desc = "STCX executed reported at sent to nest", .pme_long_desc = "STCX executed reported at sent to nest", }, [ POWER8_PME_PM_ST_CAUSED_FAIL ] = { .pme_name = "PM_ST_CAUSED_FAIL", .pme_code = 0x717080, .pme_short_desc = "Non TM St caused any thread to fail", .pme_long_desc = "Non TM St caused any thread to fail", }, [ POWER8_PME_PM_ST_CMPL ] = { .pme_name = "PM_ST_CMPL", .pme_code = 0x20016, .pme_short_desc = "Store completion count", .pme_long_desc = "Store completion count.", }, [ POWER8_PME_PM_ST_FIN ] = { .pme_name = "PM_ST_FIN", .pme_code = 0x200f0, .pme_short_desc = "Store Instructions Finished", .pme_long_desc = "Store Instructions Finished (store sent to nest).", }, [ POWER8_PME_PM_ST_FWD ] = { .pme_name = "PM_ST_FWD", .pme_code = 0x20018, .pme_short_desc = "Store forwards that finished", .pme_long_desc = "Store forwards that finished.", }, [ POWER8_PME_PM_ST_MISS_L1 ] = { .pme_name = "PM_ST_MISS_L1", .pme_code = 0x300f0, .pme_short_desc = "Store Missed L1", .pme_long_desc = "Store Missed L1.", }, [ POWER8_PME_PM_SUSPENDED ] = { .pme_name = "PM_SUSPENDED", .pme_code = 0x0, .pme_short_desc = "Counter OFF", .pme_long_desc = "Counter OFF.", }, [ POWER8_PME_PM_SWAP_CANCEL ] = { .pme_name = "PM_SWAP_CANCEL", .pme_code = 0x3090, .pme_short_desc = "SWAP cancel , rtag not available", .pme_long_desc = "SWAP cancel , rtag not available", }, [ POWER8_PME_PM_SWAP_CANCEL_GPR ] = { .pme_name = "PM_SWAP_CANCEL_GPR", .pme_code = 0x3092, 
.pme_short_desc = "SWAP cancel , rtag not available for gpr", .pme_long_desc = "SWAP cancel , rtag not available for gpr", }, [ POWER8_PME_PM_SWAP_COMPLETE ] = { .pme_name = "PM_SWAP_COMPLETE", .pme_code = 0x308c, .pme_short_desc = "swap cast in completed", .pme_long_desc = "swap cast in completed", }, [ POWER8_PME_PM_SWAP_COMPLETE_GPR ] = { .pme_name = "PM_SWAP_COMPLETE_GPR", .pme_code = 0x308e, .pme_short_desc = "swap cast in completed fpr gpr", .pme_long_desc = "swap cast in completed fpr gpr", }, [ POWER8_PME_PM_SYNC_MRK_BR_LINK ] = { .pme_name = "PM_SYNC_MRK_BR_LINK", .pme_code = 0x15152, .pme_short_desc = "Marked Branch and link branch that can cause a synchronous interrupt", .pme_long_desc = "Marked Branch and link branch that can cause a synchronous interrupt.", }, [ POWER8_PME_PM_SYNC_MRK_BR_MPRED ] = { .pme_name = "PM_SYNC_MRK_BR_MPRED", .pme_code = 0x1515c, .pme_short_desc = "Marked Branch mispredict that can cause a synchronous interrupt", .pme_long_desc = "Marked Branch mispredict that can cause a synchronous interrupt.", }, [ POWER8_PME_PM_SYNC_MRK_FX_DIVIDE ] = { .pme_name = "PM_SYNC_MRK_FX_DIVIDE", .pme_code = 0x15156, .pme_short_desc = "Marked fixed point divide that can cause a synchronous interrupt", .pme_long_desc = "Marked fixed point divide that can cause a synchronous interrupt.", }, [ POWER8_PME_PM_SYNC_MRK_L2HIT ] = { .pme_name = "PM_SYNC_MRK_L2HIT", .pme_code = 0x15158, .pme_short_desc = "Marked L2 Hits that can throw a synchronous interrupt", .pme_long_desc = "Marked L2 Hits that can throw a synchronous interrupt.", }, [ POWER8_PME_PM_SYNC_MRK_L2MISS ] = { .pme_name = "PM_SYNC_MRK_L2MISS", .pme_code = 0x1515a, .pme_short_desc = "Marked L2 Miss that can throw a synchronous interrupt", .pme_long_desc = "Marked L2 Miss that can throw a synchronous interrupt.", }, [ POWER8_PME_PM_SYNC_MRK_L3MISS ] = { .pme_name = "PM_SYNC_MRK_L3MISS", .pme_code = 0x15154, .pme_short_desc = "Marked L3 misses that can throw a synchronous interrupt", 
.pme_long_desc = "Marked L3 misses that can throw a synchronous interrupt.", }, [ POWER8_PME_PM_SYNC_MRK_PROBE_NOP ] = { .pme_name = "PM_SYNC_MRK_PROBE_NOP", .pme_code = 0x15150, .pme_short_desc = "Marked probeNops which can cause synchronous interrupts", .pme_long_desc = "Marked probeNops which can cause synchronous interrupts.", }, [ POWER8_PME_PM_SYS_PUMP_CPRED ] = { .pme_name = "PM_SYS_PUMP_CPRED", .pme_code = 0x30050, .pme_short_desc = "Initial and Final Pump Scope was system pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was system pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate).", }, [ POWER8_PME_PM_SYS_PUMP_MPRED ] = { .pme_name = "PM_SYS_PUMP_MPRED", .pme_code = 0x30052, .pme_short_desc = "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. 
Counts for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", .pme_long_desc = "Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope(Chip/Group) OR Final Pump Scope(system) got data from source that was at smaller scope(Chip/group) Final pump was system pump and initial pump was chip or group or", }, [ POWER8_PME_PM_SYS_PUMP_MPRED_RTY ] = { .pme_name = "PM_SYS_PUMP_MPRED_RTY", .pme_code = 0x40050, .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", .pme_long_desc = "Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope (Chip or Group) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate).", }, [ POWER8_PME_PM_TABLEWALK_CYC ] = { .pme_name = "PM_TABLEWALK_CYC", .pme_code = 0x10026, .pme_short_desc = "Cycles when a tablewalk (I or D) is active", .pme_long_desc = "Tablewalk Active.", }, [ POWER8_PME_PM_TABLEWALK_CYC_PREF ] = { .pme_name = "PM_TABLEWALK_CYC_PREF", .pme_code = 0xe086, .pme_short_desc = "tablewalk qualified for pte prefetches", .pme_long_desc = "tablewalk qualified for pte prefetches", }, [ POWER8_PME_PM_TABORT_TRECLAIM ] = { .pme_name = "PM_TABORT_TRECLAIM", .pme_code = 0x20b2, .pme_short_desc = "Completion time tabortnoncd, tabortcd, treclaim", .pme_long_desc = "Completion time tabortnoncd, tabortcd, treclaim", }, [ POWER8_PME_PM_TB_BIT_TRANS ] = { .pme_name = "PM_TB_BIT_TRANS", .pme_code = 0x300f8, .pme_short_desc = "timebase event", .pme_long_desc = "timebase event.", }, [ POWER8_PME_PM_TEND_PEND_CYC ] = { .pme_name = "PM_TEND_PEND_CYC", .pme_code = 0xe0ba, .pme_short_desc = "TEND latency per thread", .pme_long_desc = "TEND latency per thread", }, [ POWER8_PME_PM_THRD_ALL_RUN_CYC ] = { .pme_name = "PM_THRD_ALL_RUN_CYC", .pme_code = 0x2000c, .pme_short_desc = "All Threads in 
Run_cycles (was both threads in run_cycles)", .pme_long_desc = "All Threads in Run_cycles (was both threads in run_cycles).", }, [ POWER8_PME_PM_THRD_CONC_RUN_INST ] = { .pme_name = "PM_THRD_CONC_RUN_INST", .pme_code = 0x300f4, .pme_short_desc = "PPC Instructions Finished when both threads in run_cycles", .pme_long_desc = "Concurrent Run Instructions.", }, [ POWER8_PME_PM_THRD_GRP_CMPL_BOTH_CYC ] = { .pme_name = "PM_THRD_GRP_CMPL_BOTH_CYC", .pme_code = 0x10012, .pme_short_desc = "Cycles group completed on both completion slots by any thread", .pme_long_desc = "Two threads finished same cycle (gated by run latch).", }, [ POWER8_PME_PM_THRD_PRIO_0_1_CYC ] = { .pme_name = "PM_THRD_PRIO_0_1_CYC", .pme_code = 0x40bc, .pme_short_desc = "Cycles thread running at priority level 0 or 1", .pme_long_desc = "Cycles thread running at priority level 0 or 1", }, [ POWER8_PME_PM_THRD_PRIO_2_3_CYC ] = { .pme_name = "PM_THRD_PRIO_2_3_CYC", .pme_code = 0x40be, .pme_short_desc = "Cycles thread running at priority level 2 or 3", .pme_long_desc = "Cycles thread running at priority level 2 or 3", }, [ POWER8_PME_PM_THRD_PRIO_4_5_CYC ] = { .pme_name = "PM_THRD_PRIO_4_5_CYC", .pme_code = 0x5080, .pme_short_desc = "Cycles thread running at priority level 4 or 5", .pme_long_desc = "Cycles thread running at priority level 4 or 5", }, [ POWER8_PME_PM_THRD_PRIO_6_7_CYC ] = { .pme_name = "PM_THRD_PRIO_6_7_CYC", .pme_code = 0x5082, .pme_short_desc = "Cycles thread running at priority level 6 or 7", .pme_long_desc = "Cycles thread running at priority level 6 or 7", }, [ POWER8_PME_PM_THRD_REBAL_CYC ] = { .pme_name = "PM_THRD_REBAL_CYC", .pme_code = 0x3098, .pme_short_desc = "cycles rebalance was active", .pme_long_desc = "cycles rebalance was active", }, [ POWER8_PME_PM_THRESH_EXC_1024 ] = { .pme_name = "PM_THRESH_EXC_1024", .pme_code = 0x301ea, .pme_short_desc = "Threshold counter exceeded a value of 1024", .pme_long_desc = "Threshold counter exceeded a value of 1024.", }, [ 
POWER8_PME_PM_THRESH_EXC_128 ] = { .pme_name = "PM_THRESH_EXC_128", .pme_code = 0x401ea, .pme_short_desc = "Threshold counter exceeded a value of 128", .pme_long_desc = "Threshold counter exceeded a value of 128.", }, [ POWER8_PME_PM_THRESH_EXC_2048 ] = { .pme_name = "PM_THRESH_EXC_2048", .pme_code = 0x401ec, .pme_short_desc = "Threshold counter exceeded a value of 2048", .pme_long_desc = "Threshold counter exceeded a value of 2048.", }, [ POWER8_PME_PM_THRESH_EXC_256 ] = { .pme_name = "PM_THRESH_EXC_256", .pme_code = 0x101e8, .pme_short_desc = "Threshold counter exceeded a count of 256", .pme_long_desc = "Threshold counter exceeded a count of 256.", }, [ POWER8_PME_PM_THRESH_EXC_32 ] = { .pme_name = "PM_THRESH_EXC_32", .pme_code = 0x201e6, .pme_short_desc = "Threshold counter exceeded a value of 32", .pme_long_desc = "Threshold counter exceeded a value of 32.", }, [ POWER8_PME_PM_THRESH_EXC_4096 ] = { .pme_name = "PM_THRESH_EXC_4096", .pme_code = 0x101e6, .pme_short_desc = "Threshold counter exceeded a count of 4096", .pme_long_desc = "Threshold counter exceeded a count of 4096.", }, [ POWER8_PME_PM_THRESH_EXC_512 ] = { .pme_name = "PM_THRESH_EXC_512", .pme_code = 0x201e8, .pme_short_desc = "Threshold counter exceeded a value of 512", .pme_long_desc = "Threshold counter exceeded a value of 512.", }, [ POWER8_PME_PM_THRESH_EXC_64 ] = { .pme_name = "PM_THRESH_EXC_64", .pme_code = 0x301e8, .pme_short_desc = "IFU non-branch finished", .pme_long_desc = "Threshold counter exceeded a value of 64.", }, [ POWER8_PME_PM_THRESH_MET ] = { .pme_name = "PM_THRESH_MET", .pme_code = 0x101ec, .pme_short_desc = "threshold exceeded", .pme_long_desc = "threshold exceeded.", }, [ POWER8_PME_PM_THRESH_NOT_MET ] = { .pme_name = "PM_THRESH_NOT_MET", .pme_code = 0x4016e, .pme_short_desc = "Threshold counter did not meet threshold", .pme_long_desc = "Threshold counter did not meet threshold.", }, [ POWER8_PME_PM_TLBIE_FIN ] = { .pme_name = "PM_TLBIE_FIN", .pme_code = 0x30058, .pme_short_desc = 
"tlbie finished", .pme_long_desc = "tlbie finished.", }, [ POWER8_PME_PM_TLB_MISS ] = { .pme_name = "PM_TLB_MISS", .pme_code = 0x20066, .pme_short_desc = "TLB Miss (I + D)", .pme_long_desc = "TLB Miss (I + D).", }, [ POWER8_PME_PM_TM_BEGIN_ALL ] = { .pme_name = "PM_TM_BEGIN_ALL", .pme_code = 0x20b8, .pme_short_desc = "Tm any tbegin", .pme_long_desc = "Tm any tbegin", }, [ POWER8_PME_PM_TM_CAM_OVERFLOW ] = { .pme_name = "PM_TM_CAM_OVERFLOW", .pme_code = 0x318082, .pme_short_desc = "l3 tm cam overflow during L2 co of SC", .pme_long_desc = "l3 tm cam overflow during L2 co of SC", }, [ POWER8_PME_PM_TM_CAP_OVERFLOW ] = { .pme_name = "PM_TM_CAP_OVERFLOW", .pme_code = 0x74708c, .pme_short_desc = "TM Footprint Capacity Overflow", .pme_long_desc = "TM Footprint Capacity Overflow", }, [ POWER8_PME_PM_TM_END_ALL ] = { .pme_name = "PM_TM_END_ALL", .pme_code = 0x20ba, .pme_short_desc = "Tm any tend", .pme_long_desc = "Tm any tend", }, [ POWER8_PME_PM_TM_FAIL_CONF_NON_TM ] = { .pme_name = "PM_TM_FAIL_CONF_NON_TM", .pme_code = 0x3086, .pme_short_desc = "TEXAS fail reason @ completion", .pme_long_desc = "TEXAS fail reason @ completion", }, [ POWER8_PME_PM_TM_FAIL_CON_TM ] = { .pme_name = "PM_TM_FAIL_CON_TM", .pme_code = 0x3088, .pme_short_desc = "TEXAS fail reason @ completion", .pme_long_desc = "TEXAS fail reason @ completion", }, [ POWER8_PME_PM_TM_FAIL_DISALLOW ] = { .pme_name = "PM_TM_FAIL_DISALLOW", .pme_code = 0xe0b2, .pme_short_desc = "TM fail disallow", .pme_long_desc = "TM fail disallow", }, [ POWER8_PME_PM_TM_FAIL_FOOTPRINT_OVERFLOW ] = { .pme_name = "PM_TM_FAIL_FOOTPRINT_OVERFLOW", .pme_code = 0x3084, .pme_short_desc = "TEXAS fail reason @ completion", .pme_long_desc = "TEXAS fail reason @ completion", }, [ POWER8_PME_PM_TM_FAIL_NON_TX_CONFLICT ] = { .pme_name = "PM_TM_FAIL_NON_TX_CONFLICT", .pme_code = 0xe0b8, .pme_short_desc = "Non transactional conflict from LSU whatever gets reported to texas", .pme_long_desc = "Non transactional conflict from LSU whatever gets 
reported to texas", }, [ POWER8_PME_PM_TM_FAIL_SELF ] = { .pme_name = "PM_TM_FAIL_SELF", .pme_code = 0x308a, .pme_short_desc = "TEXAS fail reason @ completion", .pme_long_desc = "TEXAS fail reason @ completion", }, [ POWER8_PME_PM_TM_FAIL_TLBIE ] = { .pme_name = "PM_TM_FAIL_TLBIE", .pme_code = 0xe0b4, .pme_short_desc = "TLBIE hit bloom filter", .pme_long_desc = "TLBIE hit bloom filter", }, [ POWER8_PME_PM_TM_FAIL_TX_CONFLICT ] = { .pme_name = "PM_TM_FAIL_TX_CONFLICT", .pme_code = 0xe0b6, .pme_short_desc = "Transactional conflict from LSU, whatever gets reported to texas", .pme_long_desc = "Transactional conflict from LSU, whatever gets reported to texas", }, [ POWER8_PME_PM_TM_FAV_CAUSED_FAIL ] = { .pme_name = "PM_TM_FAV_CAUSED_FAIL", .pme_code = 0x727086, .pme_short_desc = "TM Load (fav) caused another thread to fail", .pme_long_desc = "TM Load (fav) caused another thread to fail", }, [ POWER8_PME_PM_TM_LD_CAUSED_FAIL ] = { .pme_name = "PM_TM_LD_CAUSED_FAIL", .pme_code = 0x717082, .pme_short_desc = "Non TM Ld caused any thread to fail", .pme_long_desc = "Non TM Ld caused any thread to fail", }, [ POWER8_PME_PM_TM_LD_CONF ] = { .pme_name = "PM_TM_LD_CONF", .pme_code = 0x727084, .pme_short_desc = "TM Load (fav or non-fav) ran into conflict (failed)", .pme_long_desc = "TM Load (fav or non-fav) ran into conflict (failed)", }, [ POWER8_PME_PM_TM_RST_SC ] = { .pme_name = "PM_TM_RST_SC", .pme_code = 0x328086, .pme_short_desc = "tm snp rst tm sc", .pme_long_desc = "tm snp rst tm sc", }, [ POWER8_PME_PM_TM_SC_CO ] = { .pme_name = "PM_TM_SC_CO", .pme_code = 0x318080, .pme_short_desc = "l3 castout tm Sc line", .pme_long_desc = "l3 castout tm Sc line", }, [ POWER8_PME_PM_TM_ST_CAUSED_FAIL ] = { .pme_name = "PM_TM_ST_CAUSED_FAIL", .pme_code = 0x73708a, .pme_short_desc = "TM Store (fav or non-fav) caused another thread to fail", .pme_long_desc = "TM Store (fav or non-fav) caused another thread to fail", }, [ POWER8_PME_PM_TM_ST_CONF ] = { .pme_name = "PM_TM_ST_CONF", 
.pme_code = 0x737088, .pme_short_desc = "TM Store (fav or non-fav) ran into conflict (failed)", .pme_long_desc = "TM Store (fav or non-fav) ran into conflict (failed)", }, [ POWER8_PME_PM_TM_TBEGIN ] = { .pme_name = "PM_TM_TBEGIN", .pme_code = 0x20bc, .pme_short_desc = "Tm nested tbegin", .pme_long_desc = "Tm nested tbegin", }, [ POWER8_PME_PM_TM_TRANS_RUN_CYC ] = { .pme_name = "PM_TM_TRANS_RUN_CYC", .pme_code = 0x10060, .pme_short_desc = "run cycles in transactional state", .pme_long_desc = "run cycles in transactional state.", }, [ POWER8_PME_PM_TM_TRANS_RUN_INST ] = { .pme_name = "PM_TM_TRANS_RUN_INST", .pme_code = 0x30060, .pme_short_desc = "Instructions completed in transactional state", .pme_long_desc = "Instructions completed in transactional state.", }, [ POWER8_PME_PM_TM_TRESUME ] = { .pme_name = "PM_TM_TRESUME", .pme_code = 0x3080, .pme_short_desc = "Tm resume", .pme_long_desc = "Tm resume", }, [ POWER8_PME_PM_TM_TSUSPEND ] = { .pme_name = "PM_TM_TSUSPEND", .pme_code = 0x20be, .pme_short_desc = "Tm suspend", .pme_long_desc = "Tm suspend", }, [ POWER8_PME_PM_TM_TX_PASS_RUN_CYC ] = { .pme_name = "PM_TM_TX_PASS_RUN_CYC", .pme_code = 0x2e012, .pme_short_desc = "cycles spent in successful transactions", .pme_long_desc = "run cycles spent in successful transactions.", }, [ POWER8_PME_PM_TM_TX_PASS_RUN_INST ] = { .pme_name = "PM_TM_TX_PASS_RUN_INST", .pme_code = 0x4e014, .pme_short_desc = "run instructions spent in successful transactions.", .pme_long_desc = "run instructions spent in successful transactions.", }, [ POWER8_PME_PM_UP_PREF_L3 ] = { .pme_name = "PM_UP_PREF_L3", .pme_code = 0xe08c, .pme_short_desc = "Micropartition prefetch", .pme_long_desc = "Micropartition prefetch", }, [ POWER8_PME_PM_UP_PREF_POINTER ] = { .pme_name = "PM_UP_PREF_POINTER", .pme_code = 0xe08e, .pme_short_desc = "Micropartition pointer prefetches", .pme_long_desc = "Micropartition pointer prefetches", }, [ POWER8_PME_PM_VSU0_16FLOP ] = { .pme_name = "PM_VSU0_16FLOP", .pme_code = 
0xa0a4, .pme_short_desc = "Sixteen flops operation (SP vector versions of fdiv,fsqrt)", .pme_long_desc = "Sixteen flops operation (SP vector versions of fdiv,fsqrt)", }, [ POWER8_PME_PM_VSU0_1FLOP ] = { .pme_name = "PM_VSU0_1FLOP", .pme_code = 0xa080, .pme_short_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation finished", .pme_long_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation finished. Decode into 1,2,4 FLOP according to instr IOP, multiplied by #vector elements according to route(eg x1, x2, x4) Only if instr sends finish to ISU", }, [ POWER8_PME_PM_VSU0_2FLOP ] = { .pme_name = "PM_VSU0_2FLOP", .pme_code = 0xa098, .pme_short_desc = "two flops operation (scalar fmadd, fnmadd, fmsub, fnmsub and DP vector versions of single flop instructions)", .pme_long_desc = "two flops operation (scalar fmadd, fnmadd, fmsub, fnmsub and DP vector versions of single flop instructions)", }, [ POWER8_PME_PM_VSU0_4FLOP ] = { .pme_name = "PM_VSU0_4FLOP", .pme_code = 0xa09c, .pme_short_desc = "four flops operation (scalar fdiv, fsqrt, DP vector version of fmadd, fnmadd, fmsub, fnmsub, SP vector versions of single flop instructions)", .pme_long_desc = "four flops operation (scalar fdiv, fsqrt, DP vector version of fmadd, fnmadd, fmsub, fnmsub, SP vector versions of single flop instructions)", }, [ POWER8_PME_PM_VSU0_8FLOP ] = { .pme_name = "PM_VSU0_8FLOP", .pme_code = 0xa0a0, .pme_short_desc = "eight flops operation (DP vector versions of fdiv,fsqrt and SP vector versions of fmadd,fnmadd,fmsub,fnmsub)", .pme_long_desc = "eight flops operation (DP vector versions of fdiv,fsqrt and SP vector versions of fmadd,fnmadd,fmsub,fnmsub)", }, [ POWER8_PME_PM_VSU0_COMPLEX_ISSUED ] = { .pme_name = "PM_VSU0_COMPLEX_ISSUED", .pme_code = 0xb0a4, .pme_short_desc = "Complex VMX instruction issued", .pme_long_desc = "Complex VMX instruction issued", }, [ POWER8_PME_PM_VSU0_CY_ISSUED ] = { .pme_name = "PM_VSU0_CY_ISSUED", 
.pme_code = 0xb0b4, .pme_short_desc = "Cryptographic instruction RFC02196 Issued", .pme_long_desc = "Cryptographic instruction RFC02196 Issued", }, [ POWER8_PME_PM_VSU0_DD_ISSUED ] = { .pme_name = "PM_VSU0_DD_ISSUED", .pme_code = 0xb0a8, .pme_short_desc = "64BIT Decimal Issued", .pme_long_desc = "64BIT Decimal Issued", }, [ POWER8_PME_PM_VSU0_DP_2FLOP ] = { .pme_name = "PM_VSU0_DP_2FLOP", .pme_code = 0xa08c, .pme_short_desc = "DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres ,fsqrte, fneg", .pme_long_desc = "DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres ,fsqrte, fneg", }, [ POWER8_PME_PM_VSU0_DP_FMA ] = { .pme_name = "PM_VSU0_DP_FMA", .pme_code = 0xa090, .pme_short_desc = "DP vector version of fmadd,fnmadd,fmsub,fnmsub", .pme_long_desc = "DP vector version of fmadd,fnmadd,fmsub,fnmsub", }, [ POWER8_PME_PM_VSU0_DP_FSQRT_FDIV ] = { .pme_name = "PM_VSU0_DP_FSQRT_FDIV", .pme_code = 0xa094, .pme_short_desc = "DP vector versions of fdiv,fsqrt", .pme_long_desc = "DP vector versions of fdiv,fsqrt", }, [ POWER8_PME_PM_VSU0_DQ_ISSUED ] = { .pme_name = "PM_VSU0_DQ_ISSUED", .pme_code = 0xb0ac, .pme_short_desc = "128BIT Decimal Issued", .pme_long_desc = "128BIT Decimal Issued", }, [ POWER8_PME_PM_VSU0_EX_ISSUED ] = { .pme_name = "PM_VSU0_EX_ISSUED", .pme_code = 0xb0b0, .pme_short_desc = "Direct move 32/64b VRFtoGPR RFC02206 Issued", .pme_long_desc = "Direct move 32/64b VRFtoGPR RFC02206 Issued", }, [ POWER8_PME_PM_VSU0_FIN ] = { .pme_name = "PM_VSU0_FIN", .pme_code = 0xa0bc, .pme_short_desc = "VSU0 Finished an instruction", .pme_long_desc = "VSU0 Finished an instruction", }, [ POWER8_PME_PM_VSU0_FMA ] = { .pme_name = "PM_VSU0_FMA", .pme_code = 0xa084, .pme_short_desc = "two flops operation (fmadd, fnmadd, fmsub, fnmsub) Scalar instructions only!", .pme_long_desc = "two flops operation (fmadd, fnmadd, fmsub, fnmsub) Scalar instructions only!", }, [ POWER8_PME_PM_VSU0_FPSCR ] = { .pme_name = "PM_VSU0_FPSCR", .pme_code = 0xb098, .pme_short_desc = 
"Move to/from FPSCR type instruction issued on Pipe 0",
	.pme_long_desc = "Move to/from FPSCR type instruction issued on Pipe 0",
},
[ POWER8_PME_PM_VSU0_FSQRT_FDIV ] = {
	.pme_name = "PM_VSU0_FSQRT_FDIV",
	.pme_code = 0xa088,
	.pme_short_desc = "four flops operation (fdiv,fsqrt) Scalar Instructions only!",
	.pme_long_desc = "four flops operation (fdiv,fsqrt) Scalar Instructions only!",
},
[ POWER8_PME_PM_VSU0_PERMUTE_ISSUED ] = {
	.pme_name = "PM_VSU0_PERMUTE_ISSUED",
	.pme_code = 0xb090,
	.pme_short_desc = "Permute VMX Instruction Issued",
	.pme_long_desc = "Permute VMX Instruction Issued",
},
[ POWER8_PME_PM_VSU0_SCALAR_DP_ISSUED ] = {
	.pme_name = "PM_VSU0_SCALAR_DP_ISSUED",
	.pme_code = 0xb088,
	.pme_short_desc = "Double Precision scalar instruction issued on Pipe0",
	.pme_long_desc = "Double Precision scalar instruction issued on Pipe0",
},
[ POWER8_PME_PM_VSU0_SIMPLE_ISSUED ] = {
	.pme_name = "PM_VSU0_SIMPLE_ISSUED",
	.pme_code = 0xb094,
	.pme_short_desc = "Simple VMX instruction issued",
	.pme_long_desc = "Simple VMX instruction issued",
},
[ POWER8_PME_PM_VSU0_SINGLE ] = {
	.pme_name = "PM_VSU0_SINGLE",
	.pme_code = 0xa0a8,
	.pme_short_desc = "FPU single precision",
	.pme_long_desc = "FPU single precision",
},
[ POWER8_PME_PM_VSU0_SQ ] = {
	.pme_name = "PM_VSU0_SQ",
	.pme_code = 0xb09c,
	.pme_short_desc = "Store Vector Issued",
	.pme_long_desc = "Store Vector Issued",
},
[ POWER8_PME_PM_VSU0_STF ] = {
	.pme_name = "PM_VSU0_STF",
	.pme_code = 0xb08c,
	.pme_short_desc = "FPU store (SP or DP) issued on Pipe0",
	.pme_long_desc = "FPU store (SP or DP) issued on Pipe0",
},
[ POWER8_PME_PM_VSU0_VECTOR_DP_ISSUED ] = {
	.pme_name = "PM_VSU0_VECTOR_DP_ISSUED",
	.pme_code = 0xb080,
	.pme_short_desc = "Double Precision vector instruction issued on Pipe0",
	.pme_long_desc = "Double Precision vector instruction issued on Pipe0",
},
[ POWER8_PME_PM_VSU0_VECTOR_SP_ISSUED ] = {
	.pme_name = "PM_VSU0_VECTOR_SP_ISSUED",
	.pme_code = 0xb084,
	.pme_short_desc = "Single Precision vector instruction issued (executed)",
	.pme_long_desc = "Single Precision vector instruction issued (executed)",
},
[ POWER8_PME_PM_VSU1_16FLOP ] = {
	.pme_name = "PM_VSU1_16FLOP",
	.pme_code = 0xa0a6,
	.pme_short_desc = "Sixteen flops operation (SP vector versions of fdiv,fsqrt)",
	.pme_long_desc = "Sixteen flops operation (SP vector versions of fdiv,fsqrt)",
},
[ POWER8_PME_PM_VSU1_1FLOP ] = {
	.pme_name = "PM_VSU1_1FLOP",
	.pme_code = 0xa082,
	.pme_short_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation finished",
	.pme_long_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation finished",
},
[ POWER8_PME_PM_VSU1_2FLOP ] = {
	.pme_name = "PM_VSU1_2FLOP",
	.pme_code = 0xa09a,
	.pme_short_desc = "two flops operation (scalar fmadd, fnmadd, fmsub, fnmsub and DP vector versions of single flop instructions)",
	.pme_long_desc = "two flops operation (scalar fmadd, fnmadd, fmsub, fnmsub and DP vector versions of single flop instructions)",
},
[ POWER8_PME_PM_VSU1_4FLOP ] = {
	.pme_name = "PM_VSU1_4FLOP",
	.pme_code = 0xa09e,
	.pme_short_desc = "four flops operation (scalar fdiv, fsqrt, DP vector version of fmadd, fnmadd, fmsub, fnmsub, SP vector versions of single flop instructions)",
	.pme_long_desc = "four flops operation (scalar fdiv, fsqrt, DP vector version of fmadd, fnmadd, fmsub, fnmsub, SP vector versions of single flop instructions)",
},
[ POWER8_PME_PM_VSU1_8FLOP ] = {
	.pme_name = "PM_VSU1_8FLOP",
	.pme_code = 0xa0a2,
	.pme_short_desc = "eight flops operation (DP vector versions of fdiv,fsqrt and SP vector versions of fmadd,fnmadd,fmsub,fnmsub)",
	.pme_long_desc = "eight flops operation (DP vector versions of fdiv,fsqrt and SP vector versions of fmadd,fnmadd,fmsub,fnmsub)",
},
[ POWER8_PME_PM_VSU1_COMPLEX_ISSUED ] = {
	.pme_name = "PM_VSU1_COMPLEX_ISSUED",
	.pme_code = 0xb0a6,
	.pme_short_desc = "Complex VMX instruction issued",
	.pme_long_desc = "Complex VMX instruction issued",
},
[ POWER8_PME_PM_VSU1_CY_ISSUED ] = {
	.pme_name = "PM_VSU1_CY_ISSUED",
	.pme_code = 0xb0b6,
	.pme_short_desc = "Cryptographic instruction RFC02196 Issued",
	.pme_long_desc = "Cryptographic instruction RFC02196 Issued",
},
[ POWER8_PME_PM_VSU1_DD_ISSUED ] = {
	.pme_name = "PM_VSU1_DD_ISSUED",
	.pme_code = 0xb0aa,
	.pme_short_desc = "64BIT Decimal Issued",
	.pme_long_desc = "64BIT Decimal Issued",
},
[ POWER8_PME_PM_VSU1_DP_2FLOP ] = {
	.pme_name = "PM_VSU1_DP_2FLOP",
	.pme_code = 0xa08e,
	.pme_short_desc = "DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres ,fsqrte, fneg",
	.pme_long_desc = "DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres ,fsqrte, fneg",
},
[ POWER8_PME_PM_VSU1_DP_FMA ] = {
	.pme_name = "PM_VSU1_DP_FMA",
	.pme_code = 0xa092,
	.pme_short_desc = "DP vector version of fmadd,fnmadd,fmsub,fnmsub",
	.pme_long_desc = "DP vector version of fmadd,fnmadd,fmsub,fnmsub",
},
[ POWER8_PME_PM_VSU1_DP_FSQRT_FDIV ] = {
	.pme_name = "PM_VSU1_DP_FSQRT_FDIV",
	.pme_code = 0xa096,
	.pme_short_desc = "DP vector versions of fdiv,fsqrt",
	.pme_long_desc = "DP vector versions of fdiv,fsqrt",
},
[ POWER8_PME_PM_VSU1_DQ_ISSUED ] = {
	.pme_name = "PM_VSU1_DQ_ISSUED",
	.pme_code = 0xb0ae,
	.pme_short_desc = "128BIT Decimal Issued",
	.pme_long_desc = "128BIT Decimal Issued",
},
[ POWER8_PME_PM_VSU1_EX_ISSUED ] = {
	.pme_name = "PM_VSU1_EX_ISSUED",
	.pme_code = 0xb0b2,
	.pme_short_desc = "Direct move 32/64b VRFtoGPR RFC02206 Issued",
	.pme_long_desc = "Direct move 32/64b VRFtoGPR RFC02206 Issued",
},
[ POWER8_PME_PM_VSU1_FIN ] = {
	.pme_name = "PM_VSU1_FIN",
	.pme_code = 0xa0be,
	.pme_short_desc = "VSU1 Finished an instruction",
	.pme_long_desc = "VSU1 Finished an instruction",
},
[ POWER8_PME_PM_VSU1_FMA ] = {
	.pme_name = "PM_VSU1_FMA",
	.pme_code = 0xa086,
	.pme_short_desc = "two flops operation (fmadd, fnmadd, fmsub, fnmsub) Scalar instructions only!",
	.pme_long_desc = "two flops operation (fmadd, fnmadd, fmsub, fnmsub) Scalar instructions only!",
},
[ POWER8_PME_PM_VSU1_FPSCR ] = {
	.pme_name = "PM_VSU1_FPSCR",
	.pme_code = 0xb09a,
	.pme_short_desc = "Move to/from FPSCR type instruction issued on Pipe 0",
	.pme_long_desc = "Move to/from FPSCR type instruction issued on Pipe 0",
},
[ POWER8_PME_PM_VSU1_FSQRT_FDIV ] = {
	.pme_name = "PM_VSU1_FSQRT_FDIV",
	.pme_code = 0xa08a,
	.pme_short_desc = "four flops operation (fdiv,fsqrt) Scalar Instructions only!",
	.pme_long_desc = "four flops operation (fdiv,fsqrt) Scalar Instructions only!",
},
[ POWER8_PME_PM_VSU1_PERMUTE_ISSUED ] = {
	.pme_name = "PM_VSU1_PERMUTE_ISSUED",
	.pme_code = 0xb092,
	.pme_short_desc = "Permute VMX Instruction Issued",
	.pme_long_desc = "Permute VMX Instruction Issued",
},
[ POWER8_PME_PM_VSU1_SCALAR_DP_ISSUED ] = {
	.pme_name = "PM_VSU1_SCALAR_DP_ISSUED",
	.pme_code = 0xb08a,
	.pme_short_desc = "Double Precision scalar instruction issued on Pipe1",
	.pme_long_desc = "Double Precision scalar instruction issued on Pipe1",
},
[ POWER8_PME_PM_VSU1_SIMPLE_ISSUED ] = {
	.pme_name = "PM_VSU1_SIMPLE_ISSUED",
	.pme_code = 0xb096,
	.pme_short_desc = "Simple VMX instruction issued",
	.pme_long_desc = "Simple VMX instruction issued",
},
[ POWER8_PME_PM_VSU1_SINGLE ] = {
	.pme_name = "PM_VSU1_SINGLE",
	.pme_code = 0xa0aa,
	.pme_short_desc = "FPU single precision",
	.pme_long_desc = "FPU single precision",
},
[ POWER8_PME_PM_VSU1_SQ ] = {
	.pme_name = "PM_VSU1_SQ",
	.pme_code = 0xb09e,
	.pme_short_desc = "Store Vector Issued",
	.pme_long_desc = "Store Vector Issued",
},
[ POWER8_PME_PM_VSU1_STF ] = {
	.pme_name = "PM_VSU1_STF",
	.pme_code = 0xb08e,
	.pme_short_desc = "FPU store (SP or DP) issued on Pipe1",
	.pme_long_desc = "FPU store (SP or DP) issued on Pipe1",
},
[ POWER8_PME_PM_VSU1_VECTOR_DP_ISSUED ] = {
	.pme_name = "PM_VSU1_VECTOR_DP_ISSUED",
	.pme_code = 0xb082,
	.pme_short_desc = "Double Precision vector instruction issued on Pipe1",
	.pme_long_desc = "Double Precision vector instruction issued on Pipe1",
},
[ POWER8_PME_PM_VSU1_VECTOR_SP_ISSUED ] = {
	.pme_name = "PM_VSU1_VECTOR_SP_ISSUED",
	.pme_code = 0xb086,
	.pme_short_desc = "Single Precision vector instruction issued (executed)",
	.pme_long_desc = "Single Precision vector instruction issued (executed)",
},
};
#endif

papi-papi-7-2-0-t/src/libpfm4/lib/events/power9_events.h

/*
 * File: power9_events.h
 * CVS:
 * Author: Will Schmidt
 *         will_schmidt@vnet.ibm.com
 * Author: Carl Love
 *         cel@us.ibm.com
 *
 * Mods:
 * Initial content generated by Will Schmidt. (Jan 31, 2017).
 * Refresh/update generated Jun 06, 2017 by Will Schmidt.
 * missing _ALT events added, Nov 16, 2017 by Will Schmidt.
 *
 * Contributed by
 * (C) Copyright IBM Corporation, 2017. All Rights Reserved.
 *
 * Note: This code was automatically generated and should not be modified by
 * hand.
 *
 * Documentation on the PMU events will be published at:
 * ...
 */

#ifndef __POWER9_EVENTS_H__
#define __POWER9_EVENTS_H__

#define POWER9_PME_PM_1FLOP_CMPL 0
#define POWER9_PME_PM_1PLUS_PPC_CMPL 1
#define POWER9_PME_PM_1PLUS_PPC_DISP 2
#define POWER9_PME_PM_2FLOP_CMPL 3
#define POWER9_PME_PM_4FLOP_CMPL 4
#define POWER9_PME_PM_8FLOP_CMPL 5
#define POWER9_PME_PM_ANY_THRD_RUN_CYC 6
#define POWER9_PME_PM_BACK_BR_CMPL 7
#define POWER9_PME_PM_BANK_CONFLICT 8
#define POWER9_PME_PM_BFU_BUSY 9
#define POWER9_PME_PM_BR_2PATH 10
#define POWER9_PME_PM_BR_CMPL 11
#define POWER9_PME_PM_BR_CORECT_PRED_TAKEN_CMPL 12
#define POWER9_PME_PM_BR_MPRED_CCACHE 13
#define POWER9_PME_PM_BR_MPRED_CMPL 14
#define POWER9_PME_PM_BR_MPRED_LSTACK 15
#define POWER9_PME_PM_BR_MPRED_PCACHE 16
#define POWER9_PME_PM_BR_MPRED_TAKEN_CR 17
#define POWER9_PME_PM_BR_MPRED_TAKEN_TA 18
#define POWER9_PME_PM_BR_PRED_CCACHE 19
#define POWER9_PME_PM_BR_PRED_LSTACK 20
#define POWER9_PME_PM_BR_PRED_PCACHE 21
#define POWER9_PME_PM_BR_PRED_TAKEN_CR 22
#define POWER9_PME_PM_BR_PRED_TA 23
#define POWER9_PME_PM_BR_PRED 24
#define POWER9_PME_PM_BR_TAKEN_CMPL 25
#define POWER9_PME_PM_BRU_FIN 26
#define POWER9_PME_PM_BR_UNCOND 27
#define POWER9_PME_PM_BTAC_BAD_RESULT 28
#define
POWER9_PME_PM_BTAC_GOOD_RESULT 29
#define POWER9_PME_PM_CHIP_PUMP_CPRED 30
#define POWER9_PME_PM_CLB_HELD 31
#define POWER9_PME_PM_CMPLU_STALL_ANY_SYNC 32
#define POWER9_PME_PM_CMPLU_STALL_BRU 33
#define POWER9_PME_PM_CMPLU_STALL_CRYPTO 34
#define POWER9_PME_PM_CMPLU_STALL_DCACHE_MISS 35
#define POWER9_PME_PM_CMPLU_STALL_DFLONG 36
#define POWER9_PME_PM_CMPLU_STALL_DFU 37
#define POWER9_PME_PM_CMPLU_STALL_DMISS_L21_L31 38
#define POWER9_PME_PM_CMPLU_STALL_DMISS_L2L3_CONFLICT 39
#define POWER9_PME_PM_CMPLU_STALL_DMISS_L2L3 40
#define POWER9_PME_PM_CMPLU_STALL_DMISS_L3MISS 41
#define POWER9_PME_PM_CMPLU_STALL_DMISS_LMEM 42
#define POWER9_PME_PM_CMPLU_STALL_DMISS_REMOTE 43
#define POWER9_PME_PM_CMPLU_STALL_DPLONG 44
#define POWER9_PME_PM_CMPLU_STALL_DP 45
#define POWER9_PME_PM_CMPLU_STALL_EIEIO 46
#define POWER9_PME_PM_CMPLU_STALL_EMQ_FULL 47
#define POWER9_PME_PM_CMPLU_STALL_ERAT_MISS 48
#define POWER9_PME_PM_CMPLU_STALL_EXCEPTION 49
#define POWER9_PME_PM_CMPLU_STALL_EXEC_UNIT 50
#define POWER9_PME_PM_CMPLU_STALL_FLUSH_ANY_THREAD 51
#define POWER9_PME_PM_CMPLU_STALL_FXLONG 52
#define POWER9_PME_PM_CMPLU_STALL_FXU 53
#define POWER9_PME_PM_CMPLU_STALL_HWSYNC 54
#define POWER9_PME_PM_CMPLU_STALL_LARX 55
#define POWER9_PME_PM_CMPLU_STALL_LHS 56
#define POWER9_PME_PM_CMPLU_STALL_LMQ_FULL 57
#define POWER9_PME_PM_CMPLU_STALL_LOAD_FINISH 58
#define POWER9_PME_PM_CMPLU_STALL_LRQ_FULL 59
#define POWER9_PME_PM_CMPLU_STALL_LRQ_OTHER 60
#define POWER9_PME_PM_CMPLU_STALL_LSAQ_ARB 61
#define POWER9_PME_PM_CMPLU_STALL_LSU_FIN 62
#define POWER9_PME_PM_CMPLU_STALL_LSU_FLUSH_NEXT 63
#define POWER9_PME_PM_CMPLU_STALL_LSU_MFSPR 64
#define POWER9_PME_PM_CMPLU_STALL_LSU 65
#define POWER9_PME_PM_CMPLU_STALL_LWSYNC 66
#define POWER9_PME_PM_CMPLU_STALL_MTFPSCR 67
#define POWER9_PME_PM_CMPLU_STALL_NESTED_TBEGIN 68
#define POWER9_PME_PM_CMPLU_STALL_NESTED_TEND 69
#define POWER9_PME_PM_CMPLU_STALL_NTC_DISP_FIN 70
#define POWER9_PME_PM_CMPLU_STALL_NTC_FLUSH 71
#define POWER9_PME_PM_CMPLU_STALL_OTHER_CMPL 72
#define POWER9_PME_PM_CMPLU_STALL_PASTE 73
#define POWER9_PME_PM_CMPLU_STALL_PM 74
#define POWER9_PME_PM_CMPLU_STALL_SLB 75
#define POWER9_PME_PM_CMPLU_STALL_SPEC_FINISH 76
#define POWER9_PME_PM_CMPLU_STALL_SRQ_FULL 77
#define POWER9_PME_PM_CMPLU_STALL_STCX 78
#define POWER9_PME_PM_CMPLU_STALL_ST_FWD 79
#define POWER9_PME_PM_CMPLU_STALL_STORE_DATA 80
#define POWER9_PME_PM_CMPLU_STALL_STORE_FIN_ARB 81
#define POWER9_PME_PM_CMPLU_STALL_STORE_FINISH 82
#define POWER9_PME_PM_CMPLU_STALL_STORE_PIPE_ARB 83
#define POWER9_PME_PM_CMPLU_STALL_SYNC_PMU_INT 84
#define POWER9_PME_PM_CMPLU_STALL_TEND 85
#define POWER9_PME_PM_CMPLU_STALL_THRD 86
#define POWER9_PME_PM_CMPLU_STALL_TLBIE 87
#define POWER9_PME_PM_CMPLU_STALL 88
#define POWER9_PME_PM_CMPLU_STALL_VDPLONG 89
#define POWER9_PME_PM_CMPLU_STALL_VDP 90
#define POWER9_PME_PM_CMPLU_STALL_VFXLONG 91
#define POWER9_PME_PM_CMPLU_STALL_VFXU 92
#define POWER9_PME_PM_CO0_BUSY 93
#define POWER9_PME_PM_CO0_BUSY_ALT 94
#define POWER9_PME_PM_CO_DISP_FAIL 95
#define POWER9_PME_PM_CO_TM_SC_FOOTPRINT 96
#define POWER9_PME_PM_CO_USAGE 97
#define POWER9_PME_PM_CYC 98
#define POWER9_PME_PM_DARQ0_0_3_ENTRIES 99
#define POWER9_PME_PM_DARQ0_10_12_ENTRIES 100
#define POWER9_PME_PM_DARQ0_4_6_ENTRIES 101
#define POWER9_PME_PM_DARQ0_7_9_ENTRIES 102
#define POWER9_PME_PM_DARQ1_0_3_ENTRIES 103
#define POWER9_PME_PM_DARQ1_10_12_ENTRIES 104
#define POWER9_PME_PM_DARQ1_4_6_ENTRIES 105
#define POWER9_PME_PM_DARQ1_7_9_ENTRIES 106
#define POWER9_PME_PM_DARQ_STORE_REJECT 107
#define POWER9_PME_PM_DARQ_STORE_XMIT 108
#define POWER9_PME_PM_DATA_CHIP_PUMP_CPRED 109
#define POWER9_PME_PM_DATA_FROM_DL2L3_MOD 110
#define POWER9_PME_PM_DATA_FROM_DL2L3_SHR 111
#define POWER9_PME_PM_DATA_FROM_DL4 112
#define POWER9_PME_PM_DATA_FROM_DMEM 113
#define POWER9_PME_PM_DATA_FROM_L21_MOD 114
#define POWER9_PME_PM_DATA_FROM_L21_SHR 115
#define POWER9_PME_PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST 116
#define POWER9_PME_PM_DATA_FROM_L2_DISP_CONFLICT_OTHER 117
#define POWER9_PME_PM_DATA_FROM_L2_MEPF 118
#define POWER9_PME_PM_DATA_FROM_L2MISS_MOD 119
#define POWER9_PME_PM_DATA_FROM_L2MISS 120
#define POWER9_PME_PM_DATA_FROM_L2_NO_CONFLICT 121
#define POWER9_PME_PM_DATA_FROM_L2 122
#define POWER9_PME_PM_DATA_FROM_L31_ECO_MOD 123
#define POWER9_PME_PM_DATA_FROM_L31_ECO_SHR 124
#define POWER9_PME_PM_DATA_FROM_L31_MOD 125
#define POWER9_PME_PM_DATA_FROM_L31_SHR 126
#define POWER9_PME_PM_DATA_FROM_L3_DISP_CONFLICT 127
#define POWER9_PME_PM_DATA_FROM_L3_MEPF 128
#define POWER9_PME_PM_DATA_FROM_L3MISS_MOD 129
#define POWER9_PME_PM_DATA_FROM_L3MISS 130
#define POWER9_PME_PM_DATA_FROM_L3_NO_CONFLICT 131
#define POWER9_PME_PM_DATA_FROM_L3 132
#define POWER9_PME_PM_DATA_FROM_LL4 133
#define POWER9_PME_PM_DATA_FROM_LMEM 134
#define POWER9_PME_PM_DATA_FROM_MEMORY 135
#define POWER9_PME_PM_DATA_FROM_OFF_CHIP_CACHE 136
#define POWER9_PME_PM_DATA_FROM_ON_CHIP_CACHE 137
#define POWER9_PME_PM_DATA_FROM_RL2L3_MOD 138
#define POWER9_PME_PM_DATA_FROM_RL2L3_SHR 139
#define POWER9_PME_PM_DATA_FROM_RL4 140
#define POWER9_PME_PM_DATA_FROM_RMEM 141
#define POWER9_PME_PM_DATA_GRP_PUMP_CPRED 142
#define POWER9_PME_PM_DATA_GRP_PUMP_MPRED_RTY 143
#define POWER9_PME_PM_DATA_GRP_PUMP_MPRED 144
#define POWER9_PME_PM_DATA_PUMP_CPRED 145
#define POWER9_PME_PM_DATA_PUMP_MPRED 146
#define POWER9_PME_PM_DATA_STORE 147
#define POWER9_PME_PM_DATA_SYS_PUMP_CPRED 148
#define POWER9_PME_PM_DATA_SYS_PUMP_MPRED_RTY 149
#define POWER9_PME_PM_DATA_SYS_PUMP_MPRED 150
#define POWER9_PME_PM_DATA_TABLEWALK_CYC 151
#define POWER9_PME_PM_DC_DEALLOC_NO_CONF 152
#define POWER9_PME_PM_DC_PREF_CONF 153
#define POWER9_PME_PM_DC_PREF_CONS_ALLOC 154
#define POWER9_PME_PM_DC_PREF_FUZZY_CONF 155
#define POWER9_PME_PM_DC_PREF_HW_ALLOC 156
#define POWER9_PME_PM_DC_PREF_STRIDED_CONF 157
#define POWER9_PME_PM_DC_PREF_SW_ALLOC 158
#define POWER9_PME_PM_DC_PREF_XCONS_ALLOC 159
#define POWER9_PME_PM_DECODE_FUSION_CONST_GEN 160
#define POWER9_PME_PM_DECODE_FUSION_EXT_ADD 161
#define POWER9_PME_PM_DECODE_FUSION_LD_ST_DISP 162
#define POWER9_PME_PM_DECODE_FUSION_OP_PRESERV 163
#define POWER9_PME_PM_DECODE_HOLD_ICT_FULL 164
#define POWER9_PME_PM_DECODE_LANES_NOT_AVAIL 165
#define POWER9_PME_PM_DERAT_MISS_16G 166
#define POWER9_PME_PM_DERAT_MISS_16M 167
#define POWER9_PME_PM_DERAT_MISS_1G 168
#define POWER9_PME_PM_DERAT_MISS_2M 169
#define POWER9_PME_PM_DERAT_MISS_4K 170
#define POWER9_PME_PM_DERAT_MISS_64K 171
#define POWER9_PME_PM_DFU_BUSY 172
#define POWER9_PME_PM_DISP_CLB_HELD_BAL 173
#define POWER9_PME_PM_DISP_CLB_HELD_SB 174
#define POWER9_PME_PM_DISP_CLB_HELD_TLBIE 175
#define POWER9_PME_PM_DISP_HELD_HB_FULL 176
#define POWER9_PME_PM_DISP_HELD_ISSQ_FULL 177
#define POWER9_PME_PM_DISP_HELD_SYNC_HOLD 178
#define POWER9_PME_PM_DISP_HELD_TBEGIN 179
#define POWER9_PME_PM_DISP_HELD 180
#define POWER9_PME_PM_DISP_STARVED 181
#define POWER9_PME_PM_DP_QP_FLOP_CMPL 182
#define POWER9_PME_PM_DPTEG_FROM_DL2L3_MOD 183
#define POWER9_PME_PM_DPTEG_FROM_DL2L3_SHR 184
#define POWER9_PME_PM_DPTEG_FROM_DL4 185
#define POWER9_PME_PM_DPTEG_FROM_DMEM 186
#define POWER9_PME_PM_DPTEG_FROM_L21_MOD 187
#define POWER9_PME_PM_DPTEG_FROM_L21_SHR 188
#define POWER9_PME_PM_DPTEG_FROM_L2_MEPF 189
#define POWER9_PME_PM_DPTEG_FROM_L2MISS 190
#define POWER9_PME_PM_DPTEG_FROM_L2_NO_CONFLICT 191
#define POWER9_PME_PM_DPTEG_FROM_L2 192
#define POWER9_PME_PM_DPTEG_FROM_L31_ECO_MOD 193
#define POWER9_PME_PM_DPTEG_FROM_L31_ECO_SHR 194
#define POWER9_PME_PM_DPTEG_FROM_L31_MOD 195
#define POWER9_PME_PM_DPTEG_FROM_L31_SHR 196
#define POWER9_PME_PM_DPTEG_FROM_L3_DISP_CONFLICT 197
#define POWER9_PME_PM_DPTEG_FROM_L3_MEPF 198
#define POWER9_PME_PM_DPTEG_FROM_L3MISS 199
#define POWER9_PME_PM_DPTEG_FROM_L3_NO_CONFLICT 200
#define POWER9_PME_PM_DPTEG_FROM_L3 201
#define POWER9_PME_PM_DPTEG_FROM_LL4 202
#define POWER9_PME_PM_DPTEG_FROM_LMEM 203
#define POWER9_PME_PM_DPTEG_FROM_MEMORY 204
#define POWER9_PME_PM_DPTEG_FROM_OFF_CHIP_CACHE 205
#define POWER9_PME_PM_DPTEG_FROM_ON_CHIP_CACHE 206
#define POWER9_PME_PM_DPTEG_FROM_RL2L3_MOD 207
#define POWER9_PME_PM_DPTEG_FROM_RL2L3_SHR 208
#define POWER9_PME_PM_DPTEG_FROM_RL4 209
#define POWER9_PME_PM_DPTEG_FROM_RMEM 210
#define POWER9_PME_PM_DSIDE_L2MEMACC 211
#define POWER9_PME_PM_DSIDE_MRU_TOUCH 212
#define POWER9_PME_PM_DSIDE_OTHER_64B_L2MEMACC 213
#define POWER9_PME_PM_DSLB_MISS 214
#define POWER9_PME_PM_DSLB_MISS_ALT 215
#define POWER9_PME_PM_DTLB_MISS_16G 216
#define POWER9_PME_PM_DTLB_MISS_16M 217
#define POWER9_PME_PM_DTLB_MISS_1G 218
#define POWER9_PME_PM_DTLB_MISS_2M 219
#define POWER9_PME_PM_DTLB_MISS_4K 220
#define POWER9_PME_PM_DTLB_MISS_64K 221
#define POWER9_PME_PM_DTLB_MISS 222
#define POWER9_PME_PM_SPACEHOLDER_0000040062 223
#define POWER9_PME_PM_SPACEHOLDER_0000040064 224
#define POWER9_PME_PM_EAT_FORCE_MISPRED 225
#define POWER9_PME_PM_EAT_FULL_CYC 226
#define POWER9_PME_PM_EE_OFF_EXT_INT 227
#define POWER9_PME_PM_EXT_INT 228
#define POWER9_PME_PM_FLOP_CMPL 229
#define POWER9_PME_PM_FLUSH_COMPLETION 230
#define POWER9_PME_PM_FLUSH_DISP_SB 231
#define POWER9_PME_PM_FLUSH_DISP_TLBIE 232
#define POWER9_PME_PM_FLUSH_DISP 233
#define POWER9_PME_PM_FLUSH_HB_RESTORE_CYC 234
#define POWER9_PME_PM_FLUSH_LSU 235
#define POWER9_PME_PM_FLUSH_MPRED 236
#define POWER9_PME_PM_FLUSH 237
#define POWER9_PME_PM_FMA_CMPL 238
#define POWER9_PME_PM_FORCED_NOP 239
#define POWER9_PME_PM_FREQ_DOWN 240
#define POWER9_PME_PM_FREQ_UP 241
#define POWER9_PME_PM_FXU_1PLUS_BUSY 242
#define POWER9_PME_PM_FXU_BUSY 243
#define POWER9_PME_PM_FXU_FIN 244
#define POWER9_PME_PM_FXU_IDLE 245
#define POWER9_PME_PM_GRP_PUMP_CPRED 246
#define POWER9_PME_PM_GRP_PUMP_MPRED_RTY 247
#define POWER9_PME_PM_GRP_PUMP_MPRED 248
#define POWER9_PME_PM_HV_CYC 249
#define POWER9_PME_PM_HWSYNC 250
#define POWER9_PME_PM_IBUF_FULL_CYC 251
#define POWER9_PME_PM_IC_DEMAND_CYC 252
#define POWER9_PME_PM_IC_DEMAND_L2_BHT_REDIRECT 253
#define POWER9_PME_PM_IC_DEMAND_L2_BR_REDIRECT 254
#define POWER9_PME_PM_IC_DEMAND_REQ 255
#define POWER9_PME_PM_IC_INVALIDATE 256
#define POWER9_PME_PM_IC_MISS_CMPL 257
#define POWER9_PME_PM_IC_MISS_ICBI 258
#define POWER9_PME_PM_IC_PREF_CANCEL_HIT 259
#define POWER9_PME_PM_IC_PREF_CANCEL_L2 260
#define POWER9_PME_PM_IC_PREF_CANCEL_PAGE 261
#define POWER9_PME_PM_IC_PREF_REQ 262
#define POWER9_PME_PM_IC_PREF_WRITE 263
#define POWER9_PME_PM_IC_RELOAD_PRIVATE 264
#define POWER9_PME_PM_ICT_EMPTY_CYC 265
#define POWER9_PME_PM_ICT_NOSLOT_BR_MPRED_ICMISS 266
#define POWER9_PME_PM_ICT_NOSLOT_BR_MPRED 267
#define POWER9_PME_PM_ICT_NOSLOT_CYC 268
#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_HB_FULL 269
#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_ISSQ 270
#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_SYNC 271
#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_TBEGIN 272
#define POWER9_PME_PM_ICT_NOSLOT_DISP_HELD 273
#define POWER9_PME_PM_ICT_NOSLOT_IC_L3MISS 274
#define POWER9_PME_PM_ICT_NOSLOT_IC_L3 275
#define POWER9_PME_PM_ICT_NOSLOT_IC_MISS 276
#define POWER9_PME_PM_IERAT_RELOAD_16M 277
#define POWER9_PME_PM_IERAT_RELOAD_4K 278
#define POWER9_PME_PM_IERAT_RELOAD_64K 279
#define POWER9_PME_PM_IERAT_RELOAD 280
#define POWER9_PME_PM_IFETCH_THROTTLE 281
#define POWER9_PME_PM_INST_CHIP_PUMP_CPRED 282
#define POWER9_PME_PM_INST_CMPL 283
#define POWER9_PME_PM_INST_DISP 284
#define POWER9_PME_PM_INST_FROM_DL2L3_MOD 285
#define POWER9_PME_PM_INST_FROM_DL2L3_SHR 286
#define POWER9_PME_PM_INST_FROM_DL4 287
#define POWER9_PME_PM_INST_FROM_DMEM 288
#define POWER9_PME_PM_INST_FROM_L1 289
#define POWER9_PME_PM_INST_FROM_L21_MOD 290
#define POWER9_PME_PM_INST_FROM_L21_SHR 291
#define POWER9_PME_PM_INST_FROM_L2_DISP_CONFLICT_LDHITST 292
#define POWER9_PME_PM_INST_FROM_L2_DISP_CONFLICT_OTHER 293
#define POWER9_PME_PM_INST_FROM_L2_MEPF 294
#define POWER9_PME_PM_INST_FROM_L2MISS 295
#define POWER9_PME_PM_INST_FROM_L2_NO_CONFLICT 296
#define POWER9_PME_PM_INST_FROM_L2 297
#define POWER9_PME_PM_INST_FROM_L31_ECO_MOD 298
#define POWER9_PME_PM_INST_FROM_L31_ECO_SHR 299
#define POWER9_PME_PM_INST_FROM_L31_MOD 300
#define POWER9_PME_PM_INST_FROM_L31_SHR 301
#define POWER9_PME_PM_INST_FROM_L3_DISP_CONFLICT 302
#define POWER9_PME_PM_INST_FROM_L3_MEPF 303
#define POWER9_PME_PM_INST_FROM_L3MISS_MOD 304
#define POWER9_PME_PM_INST_FROM_L3MISS 305
#define POWER9_PME_PM_INST_FROM_L3_NO_CONFLICT 306
#define POWER9_PME_PM_INST_FROM_L3 307
#define POWER9_PME_PM_INST_FROM_LL4 308
#define POWER9_PME_PM_INST_FROM_LMEM 309
#define POWER9_PME_PM_INST_FROM_MEMORY 310
#define POWER9_PME_PM_INST_FROM_OFF_CHIP_CACHE 311
#define POWER9_PME_PM_INST_FROM_ON_CHIP_CACHE 312
#define POWER9_PME_PM_INST_FROM_RL2L3_MOD 313
#define POWER9_PME_PM_INST_FROM_RL2L3_SHR 314
#define POWER9_PME_PM_INST_FROM_RL4 315
#define POWER9_PME_PM_INST_FROM_RMEM 316
#define POWER9_PME_PM_INST_GRP_PUMP_CPRED 317
#define POWER9_PME_PM_INST_GRP_PUMP_MPRED_RTY 318
#define POWER9_PME_PM_INST_GRP_PUMP_MPRED 319
#define POWER9_PME_PM_INST_IMC_MATCH_CMPL 320
#define POWER9_PME_PM_INST_PUMP_CPRED 321
#define POWER9_PME_PM_INST_PUMP_MPRED 322
#define POWER9_PME_PM_INST_SYS_PUMP_CPRED 323
#define POWER9_PME_PM_INST_SYS_PUMP_MPRED_RTY 324
#define POWER9_PME_PM_INST_SYS_PUMP_MPRED 325
#define POWER9_PME_PM_IOPS_CMPL 326
#define POWER9_PME_PM_IPTEG_FROM_DL2L3_MOD 327
#define POWER9_PME_PM_IPTEG_FROM_DL2L3_SHR 328
#define POWER9_PME_PM_IPTEG_FROM_DL4 329
#define POWER9_PME_PM_IPTEG_FROM_DMEM 330
#define POWER9_PME_PM_IPTEG_FROM_L21_MOD 331
#define POWER9_PME_PM_IPTEG_FROM_L21_SHR 332
#define POWER9_PME_PM_IPTEG_FROM_L2_MEPF 333
#define POWER9_PME_PM_IPTEG_FROM_L2MISS 334
#define POWER9_PME_PM_IPTEG_FROM_L2_NO_CONFLICT 335
#define POWER9_PME_PM_IPTEG_FROM_L2 336
#define POWER9_PME_PM_IPTEG_FROM_L31_ECO_MOD 337
#define POWER9_PME_PM_IPTEG_FROM_L31_ECO_SHR 338
#define POWER9_PME_PM_IPTEG_FROM_L31_MOD 339
#define POWER9_PME_PM_IPTEG_FROM_L31_SHR 340
#define POWER9_PME_PM_IPTEG_FROM_L3_DISP_CONFLICT 341
#define POWER9_PME_PM_IPTEG_FROM_L3_MEPF 342
#define POWER9_PME_PM_IPTEG_FROM_L3MISS 343
#define POWER9_PME_PM_IPTEG_FROM_L3_NO_CONFLICT 344
#define POWER9_PME_PM_IPTEG_FROM_L3 345
#define POWER9_PME_PM_IPTEG_FROM_LL4 346
#define POWER9_PME_PM_IPTEG_FROM_LMEM 347
#define POWER9_PME_PM_IPTEG_FROM_MEMORY 348
#define POWER9_PME_PM_IPTEG_FROM_OFF_CHIP_CACHE 349
#define POWER9_PME_PM_IPTEG_FROM_ON_CHIP_CACHE 350
#define POWER9_PME_PM_IPTEG_FROM_RL2L3_MOD 351
#define POWER9_PME_PM_IPTEG_FROM_RL2L3_SHR 352
#define POWER9_PME_PM_IPTEG_FROM_RL4 353
#define POWER9_PME_PM_IPTEG_FROM_RMEM 354
#define POWER9_PME_PM_ISIDE_DISP_FAIL_ADDR 355
#define POWER9_PME_PM_ISIDE_DISP_FAIL_OTHER 356
#define POWER9_PME_PM_ISIDE_DISP 357
#define POWER9_PME_PM_ISIDE_L2MEMACC 358
#define POWER9_PME_PM_ISIDE_MRU_TOUCH 359
#define POWER9_PME_PM_ISLB_MISS 360
#define POWER9_PME_PM_ISLB_MISS_ALT 361
#define POWER9_PME_PM_ISQ_0_8_ENTRIES 362
#define POWER9_PME_PM_ISQ_36_44_ENTRIES 363
#define POWER9_PME_PM_ISU0_ISS_HOLD_ALL 364
#define POWER9_PME_PM_ISU1_ISS_HOLD_ALL 365
#define POWER9_PME_PM_ISU2_ISS_HOLD_ALL 366
#define POWER9_PME_PM_ISU3_ISS_HOLD_ALL 367
#define POWER9_PME_PM_ISYNC 368
#define POWER9_PME_PM_ITLB_MISS 369
#define POWER9_PME_PM_L1_DCACHE_RELOADED_ALL 370
#define POWER9_PME_PM_L1_DCACHE_RELOAD_VALID 371
#define POWER9_PME_PM_L1_DEMAND_WRITE 372
#define POWER9_PME_PM_L1_ICACHE_MISS 373
#define POWER9_PME_PM_L1_ICACHE_RELOADED_ALL 374
#define POWER9_PME_PM_L1_ICACHE_RELOADED_PREF 375
#define POWER9_PME_PM_L1PF_L2MEMACC 376
#define POWER9_PME_PM_L1_PREF 377
#define POWER9_PME_PM_L1_SW_PREF 378
#define POWER9_PME_PM_L2_CASTOUT_MOD 379
#define POWER9_PME_PM_L2_CASTOUT_SHR 380
#define POWER9_PME_PM_L2_CHIP_PUMP 381
#define POWER9_PME_PM_L2_DC_INV 382
#define POWER9_PME_PM_L2_DISP_ALL_L2MISS 383
#define POWER9_PME_PM_L2_GROUP_PUMP 384
#define POWER9_PME_PM_L2_GRP_GUESS_CORRECT 385
#define POWER9_PME_PM_L2_GRP_GUESS_WRONG 386
#define POWER9_PME_PM_L2_IC_INV 387
#define POWER9_PME_PM_L2_INST_MISS 388
#define POWER9_PME_PM_L2_INST_MISS_ALT 389
#define POWER9_PME_PM_L2_INST 390
#define POWER9_PME_PM_L2_INST_ALT 391
#define POWER9_PME_PM_L2_LD_DISP 392
#define POWER9_PME_PM_L2_LD_DISP_ALT 393
#define POWER9_PME_PM_L2_LD_HIT 394
#define POWER9_PME_PM_L2_LD_HIT_ALT 395
#define POWER9_PME_PM_L2_LD_MISS_128B 396
#define POWER9_PME_PM_L2_LD_MISS_64B 397
#define POWER9_PME_PM_L2_LD_MISS 398
#define POWER9_PME_PM_L2_LD 399
#define POWER9_PME_PM_L2_LOC_GUESS_CORRECT 400
#define POWER9_PME_PM_L2_LOC_GUESS_WRONG 401
#define POWER9_PME_PM_L2_RCLD_DISP_FAIL_ADDR 402
#define POWER9_PME_PM_L2_RCLD_DISP_FAIL_OTHER 403
#define POWER9_PME_PM_L2_RCLD_DISP 404
#define POWER9_PME_PM_L2_RCST_DISP_FAIL_ADDR 405
#define POWER9_PME_PM_L2_RCST_DISP_FAIL_OTHER 406
#define POWER9_PME_PM_L2_RCST_DISP 407
#define POWER9_PME_PM_L2_RC_ST_DONE 408
#define POWER9_PME_PM_L2_RTY_LD 409
#define POWER9_PME_PM_L2_RTY_LD_ALT 410
#define POWER9_PME_PM_L2_RTY_ST 411
#define POWER9_PME_PM_L2_RTY_ST_ALT 412
#define POWER9_PME_PM_L2_SN_M_RD_DONE 413
#define POWER9_PME_PM_L2_SN_M_WR_DONE 414
#define POWER9_PME_PM_L2_SN_M_WR_DONE_ALT 415
#define POWER9_PME_PM_L2_SN_SX_I_DONE 416
#define POWER9_PME_PM_L2_ST_DISP 417
#define POWER9_PME_PM_L2_ST_DISP_ALT 418
#define POWER9_PME_PM_L2_ST_HIT 419
#define POWER9_PME_PM_L2_ST_HIT_ALT 420
#define POWER9_PME_PM_L2_ST_MISS_128B 421
#define POWER9_PME_PM_L2_ST_MISS_64B 422
#define POWER9_PME_PM_L2_ST_MISS 423
#define POWER9_PME_PM_L2_ST 424
#define POWER9_PME_PM_L2_SYS_GUESS_CORRECT 425
#define POWER9_PME_PM_L2_SYS_GUESS_WRONG 426
#define POWER9_PME_PM_L2_SYS_PUMP 427
#define POWER9_PME_PM_L3_CI_HIT 428
#define POWER9_PME_PM_L3_CI_MISS 429
#define POWER9_PME_PM_L3_CINJ 430
#define POWER9_PME_PM_L3_CI_USAGE 431
#define POWER9_PME_PM_L3_CO0_BUSY 432
#define POWER9_PME_PM_L3_CO0_BUSY_ALT 433
#define POWER9_PME_PM_L3_CO_L31 434
#define POWER9_PME_PM_L3_CO_LCO 435
#define POWER9_PME_PM_L3_CO_MEM 436
#define POWER9_PME_PM_L3_CO_MEPF 437
#define POWER9_PME_PM_L3_CO_MEPF_ALT 438
#define POWER9_PME_PM_L3_CO 439
#define POWER9_PME_PM_L3_GRP_GUESS_CORRECT 440
#define POWER9_PME_PM_L3_GRP_GUESS_WRONG_HIGH 441
#define POWER9_PME_PM_L3_GRP_GUESS_WRONG_LOW 442
#define POWER9_PME_PM_L3_HIT 443
#define POWER9_PME_PM_L3_L2_CO_HIT 444
#define POWER9_PME_PM_L3_L2_CO_MISS 445
#define POWER9_PME_PM_L3_LAT_CI_HIT 446
#define POWER9_PME_PM_L3_LAT_CI_MISS 447
#define POWER9_PME_PM_L3_LD_HIT 448
#define POWER9_PME_PM_L3_LD_MISS 449
#define POWER9_PME_PM_L3_LD_PREF 450
#define POWER9_PME_PM_L3_LOC_GUESS_CORRECT 451
#define POWER9_PME_PM_L3_LOC_GUESS_WRONG 452
#define POWER9_PME_PM_L3_MISS 453
#define POWER9_PME_PM_L3_P0_CO_L31 454
#define POWER9_PME_PM_L3_P0_CO_MEM 455
#define POWER9_PME_PM_L3_P0_CO_RTY 456
#define POWER9_PME_PM_L3_P0_CO_RTY_ALT 457
#define POWER9_PME_PM_L3_P0_GRP_PUMP 458
#define POWER9_PME_PM_L3_P0_LCO_DATA 459
#define POWER9_PME_PM_L3_P0_LCO_NO_DATA 460
#define POWER9_PME_PM_L3_P0_LCO_RTY 461
#define POWER9_PME_PM_L3_P0_NODE_PUMP 462
#define POWER9_PME_PM_L3_P0_PF_RTY 463
#define POWER9_PME_PM_L3_P0_PF_RTY_ALT 464
#define POWER9_PME_PM_L3_P0_SYS_PUMP 465
#define POWER9_PME_PM_L3_P1_CO_L31 466
#define POWER9_PME_PM_L3_P1_CO_MEM 467
#define POWER9_PME_PM_L3_P1_CO_RTY 468
#define POWER9_PME_PM_L3_P1_CO_RTY_ALT 469
#define POWER9_PME_PM_L3_P1_GRP_PUMP 470
#define POWER9_PME_PM_L3_P1_LCO_DATA 471
#define POWER9_PME_PM_L3_P1_LCO_NO_DATA 472
#define POWER9_PME_PM_L3_P1_LCO_RTY 473
#define POWER9_PME_PM_L3_P1_NODE_PUMP 474
#define POWER9_PME_PM_L3_P1_PF_RTY 475
#define POWER9_PME_PM_L3_P1_PF_RTY_ALT 476
#define POWER9_PME_PM_L3_P1_SYS_PUMP 477
#define POWER9_PME_PM_L3_P2_LCO_RTY 478
#define POWER9_PME_PM_L3_P3_LCO_RTY 479
#define POWER9_PME_PM_L3_PF0_BUSY 480
#define POWER9_PME_PM_L3_PF0_BUSY_ALT 481
#define POWER9_PME_PM_L3_PF_HIT_L3 482
#define POWER9_PME_PM_L3_PF_MISS_L3 483
#define POWER9_PME_PM_L3_PF_OFF_CHIP_CACHE 484
#define POWER9_PME_PM_L3_PF_OFF_CHIP_MEM 485
#define POWER9_PME_PM_L3_PF_ON_CHIP_CACHE 486
#define POWER9_PME_PM_L3_PF_ON_CHIP_MEM 487
#define POWER9_PME_PM_L3_PF_USAGE 488
#define POWER9_PME_PM_L3_RD0_BUSY 489
#define POWER9_PME_PM_L3_RD0_BUSY_ALT 490
#define POWER9_PME_PM_L3_RD_USAGE 491
#define POWER9_PME_PM_L3_SN0_BUSY 492
#define POWER9_PME_PM_L3_SN0_BUSY_ALT 493
#define POWER9_PME_PM_L3_SN_USAGE 494
#define POWER9_PME_PM_L3_SW_PREF 495
#define POWER9_PME_PM_L3_SYS_GUESS_CORRECT 496
#define POWER9_PME_PM_L3_SYS_GUESS_WRONG 497
#define POWER9_PME_PM_L3_TRANS_PF 498
#define POWER9_PME_PM_L3_WI0_BUSY 499
#define POWER9_PME_PM_L3_WI0_BUSY_ALT 500
#define POWER9_PME_PM_L3_WI_USAGE 501
#define POWER9_PME_PM_LARX_FIN 502
#define POWER9_PME_PM_LD_CMPL 503
#define POWER9_PME_PM_LD_L3MISS_PEND_CYC 504
#define POWER9_PME_PM_LD_MISS_L1_FIN 505
#define POWER9_PME_PM_LD_MISS_L1 506
#define POWER9_PME_PM_LD_REF_L1 507
#define POWER9_PME_PM_LINK_STACK_CORRECT 508
#define POWER9_PME_PM_LINK_STACK_INVALID_PTR 509
#define POWER9_PME_PM_LINK_STACK_WRONG_ADD_PRED 510
#define POWER9_PME_PM_LMQ_EMPTY_CYC 511
#define POWER9_PME_PM_LMQ_MERGE 512
#define POWER9_PME_PM_LRQ_REJECT 513
#define POWER9_PME_PM_LS0_DC_COLLISIONS 514
#define POWER9_PME_PM_LS0_ERAT_MISS_PREF 515
#define POWER9_PME_PM_LS0_LAUNCH_HELD_PREF 516
#define POWER9_PME_PM_LS0_PTE_TABLEWALK_CYC 517
#define POWER9_PME_PM_LS0_TM_DISALLOW 518
#define POWER9_PME_PM_LS0_UNALIGNED_LD 519
#define POWER9_PME_PM_LS0_UNALIGNED_ST 520
#define POWER9_PME_PM_LS1_DC_COLLISIONS 521
#define POWER9_PME_PM_LS1_ERAT_MISS_PREF 522
#define POWER9_PME_PM_LS1_LAUNCH_HELD_PREF 523
#define POWER9_PME_PM_LS1_PTE_TABLEWALK_CYC 524
#define POWER9_PME_PM_LS1_TM_DISALLOW 525
#define POWER9_PME_PM_LS1_UNALIGNED_LD 526
#define POWER9_PME_PM_LS1_UNALIGNED_ST 527
#define POWER9_PME_PM_LS2_DC_COLLISIONS 528
#define POWER9_PME_PM_LS2_ERAT_MISS_PREF 529
#define POWER9_PME_PM_LS2_TM_DISALLOW 530
#define POWER9_PME_PM_LS2_UNALIGNED_LD 531
#define POWER9_PME_PM_LS2_UNALIGNED_ST 532
#define POWER9_PME_PM_LS3_DC_COLLISIONS 533
#define POWER9_PME_PM_LS3_ERAT_MISS_PREF 534
#define POWER9_PME_PM_LS3_TM_DISALLOW 535
#define POWER9_PME_PM_LS3_UNALIGNED_LD 536
#define POWER9_PME_PM_LS3_UNALIGNED_ST 537
#define POWER9_PME_PM_LSU0_1_LRQF_FULL_CYC 538
#define POWER9_PME_PM_LSU0_ERAT_HIT 539
#define POWER9_PME_PM_LSU0_FALSE_LHS 540
#define POWER9_PME_PM_LSU0_L1_CAM_CANCEL 541
#define POWER9_PME_PM_LSU0_LDMX_FIN 542
#define POWER9_PME_PM_LSU0_LMQ_S0_VALID 543
#define POWER9_PME_PM_LSU0_LRQ_S0_VALID_CYC 544
#define POWER9_PME_PM_LSU0_SET_MPRED 545
#define POWER9_PME_PM_LSU0_SRQ_S0_VALID_CYC 546
#define POWER9_PME_PM_LSU0_STORE_REJECT 547
#define POWER9_PME_PM_LSU0_TM_L1_HIT 548
#define POWER9_PME_PM_LSU0_TM_L1_MISS 549
#define POWER9_PME_PM_LSU1_ERAT_HIT 550
#define POWER9_PME_PM_LSU1_FALSE_LHS 551
#define POWER9_PME_PM_LSU1_L1_CAM_CANCEL 552
#define POWER9_PME_PM_LSU1_LDMX_FIN 553
#define POWER9_PME_PM_LSU1_SET_MPRED 554
#define POWER9_PME_PM_LSU1_STORE_REJECT 555
#define POWER9_PME_PM_LSU1_TM_L1_HIT 556
#define POWER9_PME_PM_LSU1_TM_L1_MISS 557
#define POWER9_PME_PM_LSU2_3_LRQF_FULL_CYC 558
#define POWER9_PME_PM_LSU2_ERAT_HIT 559
#define POWER9_PME_PM_LSU2_FALSE_LHS 560
#define POWER9_PME_PM_LSU2_L1_CAM_CANCEL 561
#define POWER9_PME_PM_LSU2_LDMX_FIN 562
#define POWER9_PME_PM_LSU2_SET_MPRED 563
#define POWER9_PME_PM_LSU2_STORE_REJECT 564
#define POWER9_PME_PM_LSU2_TM_L1_HIT 565
#define POWER9_PME_PM_LSU2_TM_L1_MISS 566
#define POWER9_PME_PM_LSU3_ERAT_HIT 567
#define POWER9_PME_PM_LSU3_FALSE_LHS 568
#define POWER9_PME_PM_LSU3_L1_CAM_CANCEL 569
#define POWER9_PME_PM_LSU3_LDMX_FIN 570
#define POWER9_PME_PM_LSU3_SET_MPRED 571
#define POWER9_PME_PM_LSU3_STORE_REJECT 572
#define POWER9_PME_PM_LSU3_TM_L1_HIT 573
#define POWER9_PME_PM_LSU3_TM_L1_MISS 574
#define POWER9_PME_PM_LSU_DERAT_MISS 575
#define POWER9_PME_PM_LSU_FIN 576
#define POWER9_PME_PM_LSU_FLUSH_ATOMIC 577
#define POWER9_PME_PM_LSU_FLUSH_CI 578
#define POWER9_PME_PM_LSU_FLUSH_EMSH 579
#define POWER9_PME_PM_LSU_FLUSH_LARX_STCX 580
#define POWER9_PME_PM_LSU_FLUSH_LHL_SHL 581
#define POWER9_PME_PM_LSU_FLUSH_LHS 582
#define POWER9_PME_PM_LSU_FLUSH_NEXT 583
#define POWER9_PME_PM_LSU_FLUSH_OTHER 584
#define POWER9_PME_PM_LSU_FLUSH_RELAUNCH_MISS 585
#define POWER9_PME_PM_LSU_FLUSH_SAO 586
#define POWER9_PME_PM_LSU_FLUSH_UE 587
#define POWER9_PME_PM_LSU_FLUSH_WRK_ARND 588
#define POWER9_PME_PM_LSU_LMQ_FULL_CYC 589
#define POWER9_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 590
#define POWER9_PME_PM_LSU_NCST 591
#define POWER9_PME_PM_LSU_REJECT_ERAT_MISS 592
#define POWER9_PME_PM_LSU_REJECT_LHS 593
#define POWER9_PME_PM_LSU_REJECT_LMQ_FULL 594
#define POWER9_PME_PM_LSU_SRQ_FULL_CYC 595
#define POWER9_PME_PM_LSU_STCX_FAIL 596
#define POWER9_PME_PM_LSU_STCX 597
#define POWER9_PME_PM_LWSYNC 598
#define POWER9_PME_PM_MATH_FLOP_CMPL 599
#define POWER9_PME_PM_MEM_CO 600
#define POWER9_PME_PM_MEM_LOC_THRESH_IFU 601
#define POWER9_PME_PM_MEM_LOC_THRESH_LSU_HIGH 602
#define POWER9_PME_PM_MEM_LOC_THRESH_LSU_MED 603
#define POWER9_PME_PM_MEM_PREF 604
#define POWER9_PME_PM_MEM_READ 605
#define POWER9_PME_PM_MEM_RWITM 606
#define POWER9_PME_PM_MRK_BACK_BR_CMPL 607
#define POWER9_PME_PM_MRK_BR_2PATH 608
#define POWER9_PME_PM_MRK_BR_CMPL 609
#define POWER9_PME_PM_MRK_BR_MPRED_CMPL 610
#define POWER9_PME_PM_MRK_BR_TAKEN_CMPL 611
#define POWER9_PME_PM_MRK_BRU_FIN 612
#define POWER9_PME_PM_MRK_DATA_FROM_DL2L3_MOD_CYC 613
#define POWER9_PME_PM_MRK_DATA_FROM_DL2L3_MOD 614
#define POWER9_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC 615
#define POWER9_PME_PM_MRK_DATA_FROM_DL2L3_SHR 616
#define POWER9_PME_PM_MRK_DATA_FROM_DL4_CYC 617
#define POWER9_PME_PM_MRK_DATA_FROM_DL4 618
#define POWER9_PME_PM_MRK_DATA_FROM_DMEM_CYC 619
#define POWER9_PME_PM_MRK_DATA_FROM_DMEM 620
#define POWER9_PME_PM_MRK_DATA_FROM_L21_MOD_CYC 621
#define POWER9_PME_PM_MRK_DATA_FROM_L21_MOD 622
#define POWER9_PME_PM_MRK_DATA_FROM_L21_SHR_CYC 623
#define POWER9_PME_PM_MRK_DATA_FROM_L21_SHR 624
#define POWER9_PME_PM_MRK_DATA_FROM_L2_CYC 625
#define POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC 626
#define POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST 627
#define POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC 628
#define POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER 629
#define POWER9_PME_PM_MRK_DATA_FROM_L2_MEPF_CYC 630
#define POWER9_PME_PM_MRK_DATA_FROM_L2_MEPF 631
#define POWER9_PME_PM_MRK_DATA_FROM_L2MISS_CYC 632
#define POWER9_PME_PM_MRK_DATA_FROM_L2MISS 633
#define POWER9_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC 634
#define POWER9_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT 635
#define POWER9_PME_PM_MRK_DATA_FROM_L2 636
#define POWER9_PME_PM_MRK_DATA_FROM_L31_ECO_MOD_CYC 637
#define POWER9_PME_PM_MRK_DATA_FROM_L31_ECO_MOD 638
#define POWER9_PME_PM_MRK_DATA_FROM_L31_ECO_SHR_CYC 639
#define POWER9_PME_PM_MRK_DATA_FROM_L31_ECO_SHR 640
#define POWER9_PME_PM_MRK_DATA_FROM_L31_MOD_CYC 641
#define POWER9_PME_PM_MRK_DATA_FROM_L31_MOD 642
#define POWER9_PME_PM_MRK_DATA_FROM_L31_SHR_CYC 643
#define POWER9_PME_PM_MRK_DATA_FROM_L31_SHR 644
#define POWER9_PME_PM_MRK_DATA_FROM_L3_CYC 645
#define POWER9_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC 646
#define POWER9_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT 647
#define POWER9_PME_PM_MRK_DATA_FROM_L3_MEPF_CYC 648
#define POWER9_PME_PM_MRK_DATA_FROM_L3_MEPF 649
#define POWER9_PME_PM_MRK_DATA_FROM_L3MISS_CYC 650
#define POWER9_PME_PM_MRK_DATA_FROM_L3MISS 651
#define POWER9_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC 652
#define POWER9_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT 653
#define POWER9_PME_PM_MRK_DATA_FROM_L3 654
#define POWER9_PME_PM_MRK_DATA_FROM_LL4_CYC 655
#define POWER9_PME_PM_MRK_DATA_FROM_LL4 656
#define POWER9_PME_PM_MRK_DATA_FROM_LMEM_CYC 657
#define POWER9_PME_PM_MRK_DATA_FROM_LMEM 658
#define POWER9_PME_PM_MRK_DATA_FROM_MEMORY_CYC 659
#define POWER9_PME_PM_MRK_DATA_FROM_MEMORY 660
#define POWER9_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC 661
#define POWER9_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE 662
#define POWER9_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC 663
#define POWER9_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE 664
#define POWER9_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC 665
#define POWER9_PME_PM_MRK_DATA_FROM_RL2L3_MOD 666
#define POWER9_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC 667
#define POWER9_PME_PM_MRK_DATA_FROM_RL2L3_SHR 668
#define POWER9_PME_PM_MRK_DATA_FROM_RL4_CYC 669
#define POWER9_PME_PM_MRK_DATA_FROM_RL4 670
#define POWER9_PME_PM_MRK_DATA_FROM_RMEM_CYC 671
#define POWER9_PME_PM_MRK_DATA_FROM_RMEM 672
#define POWER9_PME_PM_MRK_DCACHE_RELOAD_INTV 673
#define POWER9_PME_PM_MRK_DERAT_MISS_16G 674
#define POWER9_PME_PM_MRK_DERAT_MISS_16M 675
#define POWER9_PME_PM_MRK_DERAT_MISS_1G 676
#define POWER9_PME_PM_MRK_DERAT_MISS_2M 677
#define POWER9_PME_PM_MRK_DERAT_MISS_4K 678
#define POWER9_PME_PM_MRK_DERAT_MISS_64K 679
#define POWER9_PME_PM_MRK_DERAT_MISS 680
#define POWER9_PME_PM_MRK_DFU_FIN 681
#define POWER9_PME_PM_MRK_DPTEG_FROM_DL2L3_MOD 682
#define POWER9_PME_PM_MRK_DPTEG_FROM_DL2L3_SHR 683
#define POWER9_PME_PM_MRK_DPTEG_FROM_DL4 684
#define POWER9_PME_PM_MRK_DPTEG_FROM_DMEM 685
#define POWER9_PME_PM_MRK_DPTEG_FROM_L21_MOD 686
#define POWER9_PME_PM_MRK_DPTEG_FROM_L21_SHR 687
#define POWER9_PME_PM_MRK_DPTEG_FROM_L2_MEPF 688
#define POWER9_PME_PM_MRK_DPTEG_FROM_L2MISS 689
#define POWER9_PME_PM_MRK_DPTEG_FROM_L2_NO_CONFLICT 690
#define POWER9_PME_PM_MRK_DPTEG_FROM_L2 691
#define POWER9_PME_PM_MRK_DPTEG_FROM_L31_ECO_MOD 692
#define POWER9_PME_PM_MRK_DPTEG_FROM_L31_ECO_SHR 693
#define POWER9_PME_PM_MRK_DPTEG_FROM_L31_MOD 694
#define POWER9_PME_PM_MRK_DPTEG_FROM_L31_SHR 695
#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT 696
#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_MEPF 697
#define POWER9_PME_PM_MRK_DPTEG_FROM_L3MISS 698
#define POWER9_PME_PM_MRK_DPTEG_FROM_L3_NO_CONFLICT 699
#define POWER9_PME_PM_MRK_DPTEG_FROM_L3 700
#define POWER9_PME_PM_MRK_DPTEG_FROM_LL4 701
#define POWER9_PME_PM_MRK_DPTEG_FROM_LMEM 702
#define POWER9_PME_PM_MRK_DPTEG_FROM_MEMORY 703
#define POWER9_PME_PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE 704
#define POWER9_PME_PM_MRK_DPTEG_FROM_ON_CHIP_CACHE 705
#define POWER9_PME_PM_MRK_DPTEG_FROM_RL2L3_MOD 706
#define POWER9_PME_PM_MRK_DPTEG_FROM_RL2L3_SHR 707
#define POWER9_PME_PM_MRK_DPTEG_FROM_RL4 708
#define POWER9_PME_PM_MRK_DPTEG_FROM_RMEM 709
#define POWER9_PME_PM_MRK_DTLB_MISS_16G 710
#define POWER9_PME_PM_MRK_DTLB_MISS_16M 711
#define POWER9_PME_PM_MRK_DTLB_MISS_1G 712
#define POWER9_PME_PM_MRK_DTLB_MISS_4K 713
#define POWER9_PME_PM_MRK_DTLB_MISS_64K 714
#define POWER9_PME_PM_MRK_DTLB_MISS 715
#define POWER9_PME_PM_MRK_FAB_RSP_BKILL_CYC 716
#define POWER9_PME_PM_MRK_FAB_RSP_BKILL 717
#define POWER9_PME_PM_MRK_FAB_RSP_CLAIM_RTY 718
#define POWER9_PME_PM_MRK_FAB_RSP_DCLAIM_CYC 719
#define POWER9_PME_PM_MRK_FAB_RSP_DCLAIM 720
#define POWER9_PME_PM_MRK_FAB_RSP_RD_RTY 721
#define POWER9_PME_PM_MRK_FAB_RSP_RD_T_INTV 722
#define POWER9_PME_PM_MRK_FAB_RSP_RWITM_CYC 723
#define POWER9_PME_PM_MRK_FAB_RSP_RWITM_RTY 724
#define POWER9_PME_PM_MRK_FXU_FIN 725
#define POWER9_PME_PM_MRK_IC_MISS 726
#define POWER9_PME_PM_MRK_INST_CMPL 727
#define POWER9_PME_PM_MRK_INST_DECODED 728
#define POWER9_PME_PM_MRK_INST_DISP 729
#define POWER9_PME_PM_MRK_INST_FIN 730
#define POWER9_PME_PM_MRK_INST_FROM_L3MISS 731
#define POWER9_PME_PM_MRK_INST_ISSUED 732
#define POWER9_PME_PM_MRK_INST_TIMEO 733
#define POWER9_PME_PM_MRK_INST 734
#define POWER9_PME_PM_MRK_L1_ICACHE_MISS 735
#define POWER9_PME_PM_MRK_L1_RELOAD_VALID 736
#define POWER9_PME_PM_MRK_L2_RC_DISP 737
#define POWER9_PME_PM_MRK_L2_RC_DONE 738
#define POWER9_PME_PM_MRK_L2_TM_REQ_ABORT 739
#define POWER9_PME_PM_MRK_L2_TM_ST_ABORT_SISTER 740
#define POWER9_PME_PM_MRK_LARX_FIN 741
#define POWER9_PME_PM_MRK_LD_MISS_EXPOSED_CYC 742
#define POWER9_PME_PM_MRK_LD_MISS_L1_CYC 743
#define POWER9_PME_PM_MRK_LD_MISS_L1 744
#define POWER9_PME_PM_MRK_LSU_DERAT_MISS 745
#define POWER9_PME_PM_MRK_LSU_FIN 746
#define POWER9_PME_PM_MRK_LSU_FLUSH_ATOMIC 747
#define POWER9_PME_PM_MRK_LSU_FLUSH_EMSH 748
#define POWER9_PME_PM_MRK_LSU_FLUSH_LARX_STCX 749
#define POWER9_PME_PM_MRK_LSU_FLUSH_LHL_SHL 750
#define
POWER9_PME_PM_MRK_LSU_FLUSH_LHS 751 #define POWER9_PME_PM_MRK_LSU_FLUSH_RELAUNCH_MISS 752 #define POWER9_PME_PM_MRK_LSU_FLUSH_SAO 753 #define POWER9_PME_PM_MRK_LSU_FLUSH_UE 754 #define POWER9_PME_PM_MRK_NTC_CYC 755 #define POWER9_PME_PM_MRK_NTF_FIN 756 #define POWER9_PME_PM_MRK_PROBE_NOP_CMPL 757 #define POWER9_PME_PM_MRK_RUN_CYC 758 #define POWER9_PME_PM_MRK_STALL_CMPLU_CYC 759 #define POWER9_PME_PM_MRK_ST_CMPL_INT 760 #define POWER9_PME_PM_MRK_ST_CMPL 761 #define POWER9_PME_PM_MRK_STCX_FAIL 762 #define POWER9_PME_PM_MRK_STCX_FIN 763 #define POWER9_PME_PM_MRK_ST_DONE_L2 764 #define POWER9_PME_PM_MRK_ST_DRAIN_TO_L2DISP_CYC 765 #define POWER9_PME_PM_MRK_ST_FWD 766 #define POWER9_PME_PM_MRK_ST_L2DISP_TO_CMPL_CYC 767 #define POWER9_PME_PM_MRK_ST_NEST 768 #define POWER9_PME_PM_MRK_TEND_FAIL 769 #define POWER9_PME_PM_MRK_VSU_FIN 770 #define POWER9_PME_PM_MULT_MRK 771 #define POWER9_PME_PM_NEST_REF_CLK 772 #define POWER9_PME_PM_NON_DATA_STORE 773 #define POWER9_PME_PM_NON_FMA_FLOP_CMPL 774 #define POWER9_PME_PM_NON_MATH_FLOP_CMPL 775 #define POWER9_PME_PM_NON_TM_RST_SC 776 #define POWER9_PME_PM_NTC_ALL_FIN 777 #define POWER9_PME_PM_NTC_FIN 778 #define POWER9_PME_PM_NTC_ISSUE_HELD_ARB 779 #define POWER9_PME_PM_NTC_ISSUE_HELD_DARQ_FULL 780 #define POWER9_PME_PM_NTC_ISSUE_HELD_OTHER 781 #define POWER9_PME_PM_PARTIAL_ST_FIN 782 #define POWER9_PME_PM_PMC1_OVERFLOW 783 #define POWER9_PME_PM_PMC1_REWIND 784 #define POWER9_PME_PM_PMC1_SAVED 785 #define POWER9_PME_PM_PMC2_OVERFLOW 786 #define POWER9_PME_PM_PMC2_REWIND 787 #define POWER9_PME_PM_PMC2_SAVED 788 #define POWER9_PME_PM_PMC3_OVERFLOW 789 #define POWER9_PME_PM_PMC3_REWIND 790 #define POWER9_PME_PM_PMC3_SAVED 791 #define POWER9_PME_PM_PMC4_OVERFLOW 792 #define POWER9_PME_PM_PMC4_REWIND 793 #define POWER9_PME_PM_PMC4_SAVED 794 #define POWER9_PME_PM_PMC5_OVERFLOW 795 #define POWER9_PME_PM_PMC6_OVERFLOW 796 #define POWER9_PME_PM_PROBE_NOP_DISP 797 #define POWER9_PME_PM_PTE_PREFETCH 798 #define POWER9_PME_PM_PTESYNC 799 
#define POWER9_PME_PM_PUMP_CPRED 800 #define POWER9_PME_PM_PUMP_MPRED 801 #define POWER9_PME_PM_RADIX_PWC_L1_HIT 802 #define POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L2 803 #define POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L3MISS 804 #define POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L3 805 #define POWER9_PME_PM_RADIX_PWC_L2_HIT 806 #define POWER9_PME_PM_RADIX_PWC_L2_PDE_FROM_L2 807 #define POWER9_PME_PM_RADIX_PWC_L2_PDE_FROM_L3 808 #define POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L2 809 #define POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L3MISS 810 #define POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L3 811 #define POWER9_PME_PM_RADIX_PWC_L3_HIT 812 #define POWER9_PME_PM_RADIX_PWC_L3_PDE_FROM_L2 813 #define POWER9_PME_PM_RADIX_PWC_L3_PDE_FROM_L3 814 #define POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L2 815 #define POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L3MISS 816 #define POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L3 817 #define POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L2 818 #define POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L3MISS 819 #define POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L3 820 #define POWER9_PME_PM_RADIX_PWC_MISS 821 #define POWER9_PME_PM_RC0_BUSY 822 #define POWER9_PME_PM_RC0_BUSY_ALT 823 #define POWER9_PME_PM_RC_USAGE 824 #define POWER9_PME_PM_RD_CLEARING_SC 825 #define POWER9_PME_PM_RD_FORMING_SC 826 #define POWER9_PME_PM_RD_HIT_PF 827 #define POWER9_PME_PM_RUN_CYC_SMT2_MODE 828 #define POWER9_PME_PM_RUN_CYC_SMT4_MODE 829 #define POWER9_PME_PM_RUN_CYC_ST_MODE 830 #define POWER9_PME_PM_RUN_CYC 831 #define POWER9_PME_PM_RUN_INST_CMPL 832 #define POWER9_PME_PM_RUN_PURR 833 #define POWER9_PME_PM_RUN_SPURR 834 #define POWER9_PME_PM_S2Q_FULL 835 #define POWER9_PME_PM_SCALAR_FLOP_CMPL 836 #define POWER9_PME_PM_SHL_CREATED 837 #define POWER9_PME_PM_SHL_ST_DEP_CREATED 838 #define POWER9_PME_PM_SHL_ST_DISABLE 839 #define POWER9_PME_PM_SLB_TABLEWALK_CYC 840 #define POWER9_PME_PM_SN0_BUSY 841 #define POWER9_PME_PM_SN0_BUSY_ALT 842 #define POWER9_PME_PM_SN_HIT 843 #define POWER9_PME_PM_SN_INVL 844 #define POWER9_PME_PM_SN_MISS 845 
#define POWER9_PME_PM_SNOOP_TLBIE 846 #define POWER9_PME_PM_SNP_TM_HIT_M 847 #define POWER9_PME_PM_SNP_TM_HIT_T 848 #define POWER9_PME_PM_SN_USAGE 849 #define POWER9_PME_PM_SP_FLOP_CMPL 850 #define POWER9_PME_PM_SRQ_EMPTY_CYC 851 #define POWER9_PME_PM_SRQ_SYNC_CYC 852 #define POWER9_PME_PM_STALL_END_ICT_EMPTY 853 #define POWER9_PME_PM_ST_CAUSED_FAIL 854 #define POWER9_PME_PM_ST_CMPL 855 #define POWER9_PME_PM_STCX_FAIL 856 #define POWER9_PME_PM_STCX_FIN 857 #define POWER9_PME_PM_STCX_SUCCESS_CMPL 858 #define POWER9_PME_PM_ST_FIN 859 #define POWER9_PME_PM_ST_FWD 860 #define POWER9_PME_PM_ST_MISS_L1 861 #define POWER9_PME_PM_STOP_FETCH_PENDING_CYC 862 #define POWER9_PME_PM_SUSPENDED 863 #define POWER9_PME_PM_SYNC_MRK_BR_LINK 864 #define POWER9_PME_PM_SYNC_MRK_BR_MPRED 865 #define POWER9_PME_PM_SYNC_MRK_FX_DIVIDE 866 #define POWER9_PME_PM_SYNC_MRK_L2HIT 867 #define POWER9_PME_PM_SYNC_MRK_L2MISS 868 #define POWER9_PME_PM_SYNC_MRK_L3MISS 869 #define POWER9_PME_PM_SYNC_MRK_PROBE_NOP 870 #define POWER9_PME_PM_SYS_PUMP_CPRED 871 #define POWER9_PME_PM_SYS_PUMP_MPRED_RTY 872 #define POWER9_PME_PM_SYS_PUMP_MPRED 873 #define POWER9_PME_PM_TABLEWALK_CYC_PREF 874 #define POWER9_PME_PM_TABLEWALK_CYC 875 #define POWER9_PME_PM_TAGE_CORRECT_TAKEN_CMPL 876 #define POWER9_PME_PM_TAGE_CORRECT 877 #define POWER9_PME_PM_TAGE_OVERRIDE_WRONG_SPEC 878 #define POWER9_PME_PM_TAGE_OVERRIDE_WRONG 879 #define POWER9_PME_PM_TAKEN_BR_MPRED_CMPL 880 #define POWER9_PME_PM_TB_BIT_TRANS 881 #define POWER9_PME_PM_TEND_PEND_CYC 882 #define POWER9_PME_PM_THRD_ALL_RUN_CYC 883 #define POWER9_PME_PM_THRD_CONC_RUN_INST 884 #define POWER9_PME_PM_THRD_PRIO_0_1_CYC 885 #define POWER9_PME_PM_THRD_PRIO_2_3_CYC 886 #define POWER9_PME_PM_THRD_PRIO_4_5_CYC 887 #define POWER9_PME_PM_THRD_PRIO_6_7_CYC 888 #define POWER9_PME_PM_THRESH_ACC 889 #define POWER9_PME_PM_THRESH_EXC_1024 890 #define POWER9_PME_PM_THRESH_EXC_128 891 #define POWER9_PME_PM_THRESH_EXC_2048 892 #define POWER9_PME_PM_THRESH_EXC_256 893 #define 
POWER9_PME_PM_THRESH_EXC_32 894 #define POWER9_PME_PM_THRESH_EXC_4096 895 #define POWER9_PME_PM_THRESH_EXC_512 896 #define POWER9_PME_PM_THRESH_EXC_64 897 #define POWER9_PME_PM_THRESH_MET 898 #define POWER9_PME_PM_THRESH_NOT_MET 899 #define POWER9_PME_PM_TLB_HIT 900 #define POWER9_PME_PM_TLBIE_FIN 901 #define POWER9_PME_PM_TLB_MISS 902 #define POWER9_PME_PM_TM_ABORTS 903 #define POWER9_PME_PM_TMA_REQ_L2 904 #define POWER9_PME_PM_TM_CAM_OVERFLOW 905 #define POWER9_PME_PM_TM_CAP_OVERFLOW 906 #define POWER9_PME_PM_TM_FAIL_CONF_NON_TM 907 #define POWER9_PME_PM_TM_FAIL_CONF_TM 908 #define POWER9_PME_PM_TM_FAIL_FOOTPRINT_OVERFLOW 909 #define POWER9_PME_PM_TM_FAIL_NON_TX_CONFLICT 910 #define POWER9_PME_PM_TM_FAIL_SELF 911 #define POWER9_PME_PM_TM_FAIL_TLBIE 912 #define POWER9_PME_PM_TM_FAIL_TX_CONFLICT 913 #define POWER9_PME_PM_TM_FAV_CAUSED_FAIL 914 #define POWER9_PME_PM_TM_FAV_TBEGIN 915 #define POWER9_PME_PM_TM_LD_CAUSED_FAIL 916 #define POWER9_PME_PM_TM_LD_CONF 917 #define POWER9_PME_PM_TM_NESTED_TBEGIN 918 #define POWER9_PME_PM_TM_NESTED_TEND 919 #define POWER9_PME_PM_TM_NON_FAV_TBEGIN 920 #define POWER9_PME_PM_TM_OUTER_TBEGIN_DISP 921 #define POWER9_PME_PM_TM_OUTER_TBEGIN 922 #define POWER9_PME_PM_TM_OUTER_TEND 923 #define POWER9_PME_PM_TM_PASSED 924 #define POWER9_PME_PM_TM_RST_SC 925 #define POWER9_PME_PM_TM_SC_CO 926 #define POWER9_PME_PM_TM_ST_CAUSED_FAIL 927 #define POWER9_PME_PM_TM_ST_CONF 928 #define POWER9_PME_PM_TM_TABORT_TRECLAIM 929 #define POWER9_PME_PM_TM_TRANS_RUN_CYC 930 #define POWER9_PME_PM_TM_TRANS_RUN_INST 931 #define POWER9_PME_PM_TM_TRESUME 932 #define POWER9_PME_PM_TM_TSUSPEND 933 #define POWER9_PME_PM_TM_TX_PASS_RUN_CYC 934 #define POWER9_PME_PM_TM_TX_PASS_RUN_INST 935 #define POWER9_PME_PM_VECTOR_FLOP_CMPL 936 #define POWER9_PME_PM_VECTOR_LD_CMPL 937 #define POWER9_PME_PM_VECTOR_ST_CMPL 938 #define POWER9_PME_PM_VSU_DP_FSQRT_FDIV 939 #define POWER9_PME_PM_VSU_FIN 940 #define POWER9_PME_PM_VSU_FSQRT_FDIV 941 #define 
POWER9_PME_PM_VSU_NON_FLOP_CMPL 942 #define POWER9_PME_PM_XLATE_HPT_MODE 943 #define POWER9_PME_PM_XLATE_MISS 944 #define POWER9_PME_PM_XLATE_RADIX_MODE 945 #define POWER9_PME_PM_BR_2PATH_ALT 946 #define POWER9_PME_PM_CYC_ALT 947 #define POWER9_PME_PM_CYC_ALT2 948 #define POWER9_PME_PM_CYC_ALT3 949 #define POWER9_PME_PM_INST_CMPL_ALT 950 #define POWER9_PME_PM_INST_CMPL_ALT2 951 #define POWER9_PME_PM_INST_CMPL_ALT3 952 #define POWER9_PME_PM_INST_DISP_ALT 953 #define POWER9_PME_PM_LD_MISS_L1_ALT 954 #define POWER9_PME_PM_SUSPENDED_ALT 955 #define POWER9_PME_PM_SUSPENDED_ALT2 956 #define POWER9_PME_PM_SUSPENDED_ALT3 957 static const pme_power_entry_t power9_pe[] = { [ POWER9_PME_PM_1FLOP_CMPL ] = { .pme_name = "PM_1FLOP_CMPL", .pme_code = 0x0000045050, .pme_short_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation completed", .pme_long_desc = "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation completed", }, [ POWER9_PME_PM_1PLUS_PPC_CMPL ] = { .pme_name = "PM_1PLUS_PPC_CMPL", .pme_code = 0x00000100F2, .pme_short_desc = "1 or more ppc insts finished", .pme_long_desc = "1 or more ppc insts finished", }, [ POWER9_PME_PM_1PLUS_PPC_DISP ] = { .pme_name = "PM_1PLUS_PPC_DISP", .pme_code = 0x00000400F2, .pme_short_desc = "Cycles at least one Instr Dispatched", .pme_long_desc = "Cycles at least one Instr Dispatched", }, [ POWER9_PME_PM_2FLOP_CMPL ] = { .pme_name = "PM_2FLOP_CMPL", .pme_code = 0x000004D052, .pme_short_desc = "DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres ,fsqrte, fneg ", .pme_long_desc = "DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres ,fsqrte, fneg ", }, [ POWER9_PME_PM_4FLOP_CMPL ] = { .pme_name = "PM_4FLOP_CMPL", .pme_code = 0x0000045052, .pme_short_desc = "4 FLOP instruction completed", .pme_long_desc = "4 FLOP instruction completed", }, [ POWER9_PME_PM_8FLOP_CMPL ] = { .pme_name = "PM_8FLOP_CMPL", .pme_code = 0x000004D054, .pme_short_desc = "8 FLOP 
instruction completed", .pme_long_desc = "8 FLOP instruction completed", }, [ POWER9_PME_PM_ANY_THRD_RUN_CYC ] = { .pme_name = "PM_ANY_THRD_RUN_CYC", .pme_code = 0x00000100FA, .pme_short_desc = "Cycles in which at least one thread has the run latch set", .pme_long_desc = "Cycles in which at least one thread has the run latch set", }, [ POWER9_PME_PM_BACK_BR_CMPL ] = { .pme_name = "PM_BACK_BR_CMPL", .pme_code = 0x000002505E, .pme_short_desc = "Branch instruction completed with a target address less than current instruction address", .pme_long_desc = "Branch instruction completed with a target address less than current instruction address", }, [ POWER9_PME_PM_BANK_CONFLICT ] = { .pme_name = "PM_BANK_CONFLICT", .pme_code = 0x0000004880, .pme_short_desc = "Read blocked due to interleave conflict.", .pme_long_desc = "Read blocked due to interleave conflict. The ifar logic will detect an interleave conflict and kill the data that was read that cycle.", }, [ POWER9_PME_PM_BFU_BUSY ] = { .pme_name = "PM_BFU_BUSY", .pme_code = 0x000003005C, .pme_short_desc = "Cycles in which all 4 Binary Floating Point units are busy.", .pme_long_desc = "Cycles in which all 4 Binary Floating Point units are busy. The BFU is running at capacity", }, /* See also alternate entries for 0000020036 / POWER9_PME_PM_BR_2PATH with code(s) 0000040036 at the bottom of this table. 
\n */ [ POWER9_PME_PM_BR_2PATH ] = { .pme_name = "PM_BR_2PATH", .pme_code = 0x0000020036, .pme_short_desc = "Branches that are not strongly biased", .pme_long_desc = "Branches that are not strongly biased", }, [ POWER9_PME_PM_BR_CMPL ] = { .pme_name = "PM_BR_CMPL", .pme_code = 0x000004D05E, .pme_short_desc = "Any Branch instruction completed", .pme_long_desc = "Any Branch instruction completed", }, [ POWER9_PME_PM_BR_CORECT_PRED_TAKEN_CMPL ] = { .pme_name = "PM_BR_CORECT_PRED_TAKEN_CMPL", .pme_code = 0x000000489C, .pme_short_desc = "Conditional Branch Completed in which the HW correctly predicted the direction as taken.", .pme_long_desc = "Conditional Branch Completed in which the HW correctly predicted the direction as taken. Counted at completion time", }, [ POWER9_PME_PM_BR_MPRED_CCACHE ] = { .pme_name = "PM_BR_MPRED_CCACHE", .pme_code = 0x00000040AC, .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the Count Cache Target Prediction", .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the Count Cache Target Prediction", }, [ POWER9_PME_PM_BR_MPRED_CMPL ] = { .pme_name = "PM_BR_MPRED_CMPL", .pme_code = 0x00000400F6, .pme_short_desc = "Number of Branch Mispredicts", .pme_long_desc = "Number of Branch Mispredicts", }, [ POWER9_PME_PM_BR_MPRED_LSTACK ] = { .pme_name = "PM_BR_MPRED_LSTACK", .pme_code = 0x00000048AC, .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the Link Stack Target Prediction", .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the Link Stack Target Prediction", }, [ POWER9_PME_PM_BR_MPRED_PCACHE ] = { .pme_name = "PM_BR_MPRED_PCACHE", .pme_code = 0x00000048B0, .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to pattern cache prediction", .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to pattern cache prediction", }, [ POWER9_PME_PM_BR_MPRED_TAKEN_CR ] = { .pme_name = "PM_BR_MPRED_TAKEN_CR", 
.pme_code = 0x00000040B8, .pme_short_desc = "A Conditional Branch that resolved to taken was mispredicted as not taken (due to the BHT Direction Prediction).", .pme_long_desc = "A Conditional Branch that resolved to taken was mispredicted as not taken (due to the BHT Direction Prediction).", }, [ POWER9_PME_PM_BR_MPRED_TAKEN_TA ] = { .pme_name = "PM_BR_MPRED_TAKEN_TA", .pme_code = 0x00000048B8, .pme_short_desc = "Conditional Branch Completed that was Mispredicted due to the Target Address Prediction from the Count Cache or Link Stack.", .pme_long_desc = "Conditional Branch Completed that was Mispredicted due to the Target Address Prediction from the Count Cache or Link Stack. Only XL-form branches that resolved Taken set this event.", }, [ POWER9_PME_PM_BR_PRED_CCACHE ] = { .pme_name = "PM_BR_PRED_CCACHE", .pme_code = 0x00000040A4, .pme_short_desc = "Conditional Branch Completed that used the Count Cache for Target Prediction", .pme_long_desc = "Conditional Branch Completed that used the Count Cache for Target Prediction", }, [ POWER9_PME_PM_BR_PRED_LSTACK ] = { .pme_name = "PM_BR_PRED_LSTACK", .pme_code = 0x00000040A8, .pme_short_desc = "Conditional Branch Completed that used the Link Stack for Target Prediction", .pme_long_desc = "Conditional Branch Completed that used the Link Stack for Target Prediction", }, [ POWER9_PME_PM_BR_PRED_PCACHE ] = { .pme_name = "PM_BR_PRED_PCACHE", .pme_code = 0x00000048A0, .pme_short_desc = "Conditional branch completed that used pattern cache prediction", .pme_long_desc = "Conditional branch completed that used pattern cache prediction", }, [ POWER9_PME_PM_BR_PRED_TAKEN_CR ] = { .pme_name = "PM_BR_PRED_TAKEN_CR", .pme_code = 0x00000040B0, .pme_short_desc = "Conditional Branch that had its direction predicted.", .pme_long_desc = "Conditional Branch that had its direction predicted. I-form branches do not set this event. 
In addition, B-form branches which do not use the BHT do not set this event - these are branches with BO-field set to 'always taken' and branches", }, [ POWER9_PME_PM_BR_PRED_TA ] = { .pme_name = "PM_BR_PRED_TA", .pme_code = 0x00000040B4, .pme_short_desc = "Conditional Branch Completed that had its target address predicted.", .pme_long_desc = "Conditional Branch Completed that had its target address predicted. Only XL-form branches set this event. This equal the sum of CCACHE, LSTACK, and PCACHE", }, [ POWER9_PME_PM_BR_PRED ] = { .pme_name = "PM_BR_PRED", .pme_code = 0x000000409C, .pme_short_desc = "Conditional Branch Executed in which the HW predicted the Direction or Target.", .pme_long_desc = "Conditional Branch Executed in which the HW predicted the Direction or Target. Includes taken and not taken and is counted at execution time", }, [ POWER9_PME_PM_BR_TAKEN_CMPL ] = { .pme_name = "PM_BR_TAKEN_CMPL", .pme_code = 0x00000200FA, .pme_short_desc = "New event for Branch Taken", .pme_long_desc = "New event for Branch Taken", }, [ POWER9_PME_PM_BRU_FIN ] = { .pme_name = "PM_BRU_FIN", .pme_code = 0x0000010068, .pme_short_desc = "Branch Instruction Finished", .pme_long_desc = "Branch Instruction Finished", }, [ POWER9_PME_PM_BR_UNCOND ] = { .pme_name = "PM_BR_UNCOND", .pme_code = 0x00000040A0, .pme_short_desc = "Unconditional Branch Completed.", .pme_long_desc = "Unconditional Branch Completed. HW branch prediction was not used for this branch. This can be an I-form branch, a B-form branch with BO-field set to branch always, or a B-form branch which was covenrted to a Resolve.", }, [ POWER9_PME_PM_BTAC_BAD_RESULT ] = { .pme_name = "PM_BTAC_BAD_RESULT", .pme_code = 0x00000050B0, .pme_short_desc = "BTAC thinks branch will be taken but it is either predicted not-taken by the BHT, or the target address is wrong (less common).", .pme_long_desc = "BTAC thinks branch will be taken but it is either predicted not-taken by the BHT, or the target address is wrong (less common). 
In both cases, a redirect will happen", }, [ POWER9_PME_PM_BTAC_GOOD_RESULT ] = { .pme_name = "PM_BTAC_GOOD_RESULT", .pme_code = 0x00000058B0, .pme_short_desc = "BTAC predicts a taken branch and the BHT agrees, and the target address is correct", .pme_long_desc = "BTAC predicts a taken branch and the BHT agrees, and the target address is correct", }, [ POWER9_PME_PM_CHIP_PUMP_CPRED ] = { .pme_name = "PM_CHIP_PUMP_CPRED", .pme_code = 0x0000010050, .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", .pme_long_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", }, [ POWER9_PME_PM_CLB_HELD ] = { .pme_name = "PM_CLB_HELD", .pme_code = 0x000000208C, .pme_short_desc = "CLB (control logic block - indicates quadword fetch block) Hold: Any Reason", .pme_long_desc = "CLB (control logic block - indicates quadword fetch block) Hold: Any Reason", }, [ POWER9_PME_PM_CMPLU_STALL_ANY_SYNC ] = { .pme_name = "PM_CMPLU_STALL_ANY_SYNC", .pme_code = 0x000001E05A, .pme_short_desc = "Cycles in which the NTC sync instruction (isync, lwsync or hwsync) is not allowed to complete", .pme_long_desc = "Cycles in which the NTC sync instruction (isync, lwsync or hwsync) is not allowed to complete", }, [ POWER9_PME_PM_CMPLU_STALL_BRU ] = { .pme_name = "PM_CMPLU_STALL_BRU", .pme_code = 0x000004D018, .pme_short_desc = "Completion stall due to a Branch Unit", .pme_long_desc = "Completion stall due to a Branch Unit", }, [ POWER9_PME_PM_CMPLU_STALL_CRYPTO ] = { .pme_name = "PM_CMPLU_STALL_CRYPTO", .pme_code = 0x000004C01E, .pme_short_desc = "Finish stall because the NTF instruction was routed to the crypto execution pipe and was waiting to finish", .pme_long_desc = "Finish stall because the NTF instruction was routed to the crypto execution pipe and was waiting to finish", }, 
[ POWER9_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { .pme_name = "PM_CMPLU_STALL_DCACHE_MISS", .pme_code = 0x000002C012, .pme_short_desc = "Finish stall because the NTF instruction was a load that missed the L1 and was waiting for the data to return from the nest", .pme_long_desc = "Finish stall because the NTF instruction was a load that missed the L1 and was waiting for the data to return from the nest", }, [ POWER9_PME_PM_CMPLU_STALL_DFLONG ] = { .pme_name = "PM_CMPLU_STALL_DFLONG", .pme_code = 0x000001005A, .pme_short_desc = "Finish stall because the NTF instruction was a multi-cycle instruction issued to the Decimal Floating Point execution pipe and waiting to finish.", .pme_long_desc = "Finish stall because the NTF instruction was a multi-cycle instruction issued to the Decimal Floating Point execution pipe and waiting to finish. Includes decimal floating point instructions + 128 bit binary floating point instructions. Qualified by multicycle", }, [ POWER9_PME_PM_CMPLU_STALL_DFU ] = { .pme_name = "PM_CMPLU_STALL_DFU", .pme_code = 0x000002D012, .pme_short_desc = "Finish stall because the NTF instruction was issued to the Decimal Floating Point execution pipe and waiting to finish.", .pme_long_desc = "Finish stall because the NTF instruction was issued to the Decimal Floating Point execution pipe and waiting to finish. Includes decimal floating point instructions + 128 bit binary floating point instructions. 
Not qualified by multicycle", }, [ POWER9_PME_PM_CMPLU_STALL_DMISS_L21_L31 ] = { .pme_name = "PM_CMPLU_STALL_DMISS_L21_L31", .pme_code = 0x000002C018, .pme_short_desc = "Completion stall by Dcache miss which resolved on chip (excluding local L2/L3)", .pme_long_desc = "Completion stall by Dcache miss which resolved on chip (excluding local L2/L3)", }, [ POWER9_PME_PM_CMPLU_STALL_DMISS_L2L3_CONFLICT ] = { .pme_name = "PM_CMPLU_STALL_DMISS_L2L3_CONFLICT", .pme_code = 0x000004C016, .pme_short_desc = "Completion stall due to cache miss that resolves in the L2 or L3 with a conflict", .pme_long_desc = "Completion stall due to cache miss that resolves in the L2 or L3 with a conflict", }, [ POWER9_PME_PM_CMPLU_STALL_DMISS_L2L3 ] = { .pme_name = "PM_CMPLU_STALL_DMISS_L2L3", .pme_code = 0x000001003C, .pme_short_desc = "Completion stall by Dcache miss which resolved in L2/L3", .pme_long_desc = "Completion stall by Dcache miss which resolved in L2/L3", }, [ POWER9_PME_PM_CMPLU_STALL_DMISS_L3MISS ] = { .pme_name = "PM_CMPLU_STALL_DMISS_L3MISS", .pme_code = 0x000004C01A, .pme_short_desc = "Completion stall due to cache miss resolving missed the L3", .pme_long_desc = "Completion stall due to cache miss resolving missed the L3", }, [ POWER9_PME_PM_CMPLU_STALL_DMISS_LMEM ] = { .pme_name = "PM_CMPLU_STALL_DMISS_LMEM", .pme_code = 0x0000030038, .pme_short_desc = "Completion stall due to cache miss that resolves in local memory", .pme_long_desc = "Completion stall due to cache miss that resolves in local memory", }, [ POWER9_PME_PM_CMPLU_STALL_DMISS_REMOTE ] = { .pme_name = "PM_CMPLU_STALL_DMISS_REMOTE", .pme_code = 0x000002C01C, .pme_short_desc = "Completion stall by Dcache miss which resolved from remote chip (cache or memory)", .pme_long_desc = "Completion stall by Dcache miss which resolved from remote chip (cache or memory)", }, [ POWER9_PME_PM_CMPLU_STALL_DPLONG ] = { .pme_name = "PM_CMPLU_STALL_DPLONG", .pme_code = 0x000003405C, .pme_short_desc = "Finish stall because the NTF 
instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish.", .pme_long_desc = "Finish stall because the NTF instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. Qualified by NOT vector AND multicycle", }, [ POWER9_PME_PM_CMPLU_STALL_DP ] = { .pme_name = "PM_CMPLU_STALL_DP", .pme_code = 0x000001005C, .pme_short_desc = "Finish stall because the NTF instruction was a scalar instruction issued to the Double Precision execution pipe and waiting to finish.", .pme_long_desc = "Finish stall because the NTF instruction was a scalar instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. Not qualified multicycle. Qualified by NOT vector", }, [ POWER9_PME_PM_CMPLU_STALL_EIEIO ] = { .pme_name = "PM_CMPLU_STALL_EIEIO", .pme_code = 0x000004D01A, .pme_short_desc = "Finish stall because the NTF instruction is an EIEIO waiting for response from L2", .pme_long_desc = "Finish stall because the NTF instruction is an EIEIO waiting for response from L2", }, [ POWER9_PME_PM_CMPLU_STALL_EMQ_FULL ] = { .pme_name = "PM_CMPLU_STALL_EMQ_FULL", .pme_code = 0x0000030004, .pme_short_desc = "Finish stall because the next to finish instruction suffered an ERAT miss and the EMQ was full", .pme_long_desc = "Finish stall because the next to finish instruction suffered an ERAT miss and the EMQ was full", }, [ POWER9_PME_PM_CMPLU_STALL_ERAT_MISS ] = { .pme_name = "PM_CMPLU_STALL_ERAT_MISS", .pme_code = 0x000004C012, .pme_short_desc = "Finish stall because the NTF instruction was a load or store that suffered a translation miss", .pme_long_desc = "Finish stall because the NTF instruction was a load or store that suffered a translation miss", }, [ 
POWER9_PME_PM_CMPLU_STALL_EXCEPTION ] = { .pme_name = "PM_CMPLU_STALL_EXCEPTION", .pme_code = 0x000003003A, .pme_short_desc = "Cycles in which the NTC instruction is not allowed to complete because it was interrupted by ANY exception, which has to be serviced before the instruction can complete", .pme_long_desc = "Cycles in which the NTC instruction is not allowed to complete because it was interrupted by ANY exception, which has to be serviced before the instruction can complete", }, [ POWER9_PME_PM_CMPLU_STALL_EXEC_UNIT ] = { .pme_name = "PM_CMPLU_STALL_EXEC_UNIT", .pme_code = 0x000002D018, .pme_short_desc = "Completion stall due to execution units (FXU/VSU/CRU)", .pme_long_desc = "Completion stall due to execution units (FXU/VSU/CRU)", }, [ POWER9_PME_PM_CMPLU_STALL_FLUSH_ANY_THREAD ] = { .pme_name = "PM_CMPLU_STALL_FLUSH_ANY_THREAD", .pme_code = 0x000001E056, .pme_short_desc = "Cycles in which the NTC instruction is not allowed to complete because any of the 4 threads in the same core suffered a flush, which blocks completion", .pme_long_desc = "Cycles in which the NTC instruction is not allowed to complete because any of the 4 threads in the same core suffered a flush, which blocks completion", }, [ POWER9_PME_PM_CMPLU_STALL_FXLONG ] = { .pme_name = "PM_CMPLU_STALL_FXLONG", .pme_code = 0x000004D016, .pme_short_desc = "Completion stall due to a long latency scalar fixed point instruction (division, square root)", .pme_long_desc = "Completion stall due to a long latency scalar fixed point instruction (division, square root)", }, [ POWER9_PME_PM_CMPLU_STALL_FXU ] = { .pme_name = "PM_CMPLU_STALL_FXU", .pme_code = 0x000002D016, .pme_short_desc = "Finish stall due to a scalar fixed point or CR instruction in the execution pipeline.", .pme_long_desc = "Finish stall due to a scalar fixed point or CR instruction in the execution pipeline. 
These instructions get routed to the ALU, ALU2, and DIV pipes", }, [ POWER9_PME_PM_CMPLU_STALL_HWSYNC ] = { .pme_name = "PM_CMPLU_STALL_HWSYNC", .pme_code = 0x0000030036, .pme_short_desc = "completion stall due to hwsync", .pme_long_desc = "completion stall due to hwsync", }, [ POWER9_PME_PM_CMPLU_STALL_LARX ] = { .pme_name = "PM_CMPLU_STALL_LARX", .pme_code = 0x000001002A, .pme_short_desc = "Finish stall because the NTF instruction was a larx waiting to be satisfied", .pme_long_desc = "Finish stall because the NTF instruction was a larx waiting to be satisfied", }, [ POWER9_PME_PM_CMPLU_STALL_LHS ] = { .pme_name = "PM_CMPLU_STALL_LHS", .pme_code = 0x000002C01A, .pme_short_desc = "Finish stall because the NTF instruction was a load that hit on an older store and it was waiting for store data", .pme_long_desc = "Finish stall because the NTF instruction was a load that hit on an older store and it was waiting for store data", }, [ POWER9_PME_PM_CMPLU_STALL_LMQ_FULL ] = { .pme_name = "PM_CMPLU_STALL_LMQ_FULL", .pme_code = 0x000004C014, .pme_short_desc = "Finish stall because the NTF instruction was a load that missed in the L1 and the LMQ was unable to accept this load miss request because it was full", .pme_long_desc = "Finish stall because the NTF instruction was a load that missed in the L1 and the LMQ was unable to accept this load miss request because it was full", }, [ POWER9_PME_PM_CMPLU_STALL_LOAD_FINISH ] = { .pme_name = "PM_CMPLU_STALL_LOAD_FINISH", .pme_code = 0x000004D014, .pme_short_desc = "Finish stall because the NTF instruction was a load instruction with all its dependencies satisfied just going through the LSU pipe to finish", .pme_long_desc = "Finish stall because the NTF instruction was a load instruction with all its dependencies satisfied just going through the LSU pipe to finish", }, [ POWER9_PME_PM_CMPLU_STALL_LRQ_FULL ] = { .pme_name = "PM_CMPLU_STALL_LRQ_FULL", .pme_code = 0x000002D014, .pme_short_desc = "Finish stall because the NTF 
instruction was a load that was held in LSAQ (load-store address queue) because the LRQ (load-reorder queue) was full", .pme_long_desc = "Finish stall because the NTF instruction was a load that was held in LSAQ (load-store address queue) because the LRQ (load-reorder queue) was full", }, [ POWER9_PME_PM_CMPLU_STALL_LRQ_OTHER ] = { .pme_name = "PM_CMPLU_STALL_LRQ_OTHER", .pme_code = 0x0000010004, .pme_short_desc = "Finish stall due to LRQ miscellaneous reasons, lost arbitration to LMQ slot, bank collisions, set prediction cleanup, set prediction multihit and others", .pme_long_desc = "Finish stall due to LRQ miscellaneous reasons, lost arbitration to LMQ slot, bank collisions, set prediction cleanup, set prediction multihit and others", }, [ POWER9_PME_PM_CMPLU_STALL_LSAQ_ARB ] = { .pme_name = "PM_CMPLU_STALL_LSAQ_ARB", .pme_code = 0x000004E016, .pme_short_desc = "Finish stall because the NTF instruction was a load or store that was held in LSAQ because an older instruction from SRQ or LRQ won arbitration to the LSU pipe when this instruction tried to launch", .pme_long_desc = "Finish stall because the NTF instruction was a load or store that was held in LSAQ because an older instruction from SRQ or LRQ won arbitration to the LSU pipe when this instruction tried to launch", }, [ POWER9_PME_PM_CMPLU_STALL_LSU_FIN ] = { .pme_name = "PM_CMPLU_STALL_LSU_FIN", .pme_code = 0x000001003A, .pme_short_desc = "Finish stall because the NTF instruction was an LSU op (other than a load or a store) with all its dependencies met and just going through the LSU pipe to finish", .pme_long_desc = "Finish stall because the NTF instruction was an LSU op (other than a load or a store) with all its dependencies met and just going through the LSU pipe to finish", }, [ POWER9_PME_PM_CMPLU_STALL_LSU_FLUSH_NEXT ] = { .pme_name = "PM_CMPLU_STALL_LSU_FLUSH_NEXT", .pme_code = 0x000002E01A, .pme_short_desc = "Completion stall of one cycle because the LSU requested to flush the next iop in the 
sequence.", .pme_long_desc = "Completion stall of one cycle because the LSU requested to flush the next iop in the sequence. It takes 1 cycle for the ISU to process this request before the LSU instruction is allowed to complete", }, [ POWER9_PME_PM_CMPLU_STALL_LSU_MFSPR ] = { .pme_name = "PM_CMPLU_STALL_LSU_MFSPR", .pme_code = 0x0000034056, .pme_short_desc = "Finish stall because the NTF instruction was a mfspr instruction targeting an LSU SPR and it was waiting for the register data to be returned", .pme_long_desc = "Finish stall because the NTF instruction was a mfspr instruction targeting an LSU SPR and it was waiting for the register data to be returned", }, [ POWER9_PME_PM_CMPLU_STALL_LSU ] = { .pme_name = "PM_CMPLU_STALL_LSU", .pme_code = 0x000002C010, .pme_short_desc = "Completion stall by LSU instruction", .pme_long_desc = "Completion stall by LSU instruction", }, [ POWER9_PME_PM_CMPLU_STALL_LWSYNC ] = { .pme_name = "PM_CMPLU_STALL_LWSYNC", .pme_code = 0x0000010036, .pme_short_desc = "completion stall due to lwsync", .pme_long_desc = "completion stall due to lwsync", }, [ POWER9_PME_PM_CMPLU_STALL_MTFPSCR ] = { .pme_name = "PM_CMPLU_STALL_MTFPSCR", .pme_code = 0x000004E012, .pme_short_desc = "Completion stall because the ISU is updating the register and notifying the Effective Address Table (EAT)", .pme_long_desc = "Completion stall because the ISU is updating the register and notifying the Effective Address Table (EAT)", }, [ POWER9_PME_PM_CMPLU_STALL_NESTED_TBEGIN ] = { .pme_name = "PM_CMPLU_STALL_NESTED_TBEGIN", .pme_code = 0x000001E05C, .pme_short_desc = "Completion stall because the ISU is updating the TEXASR to keep track of the nested tbegin.", .pme_long_desc = "Completion stall because the ISU is updating the TEXASR to keep track of the nested tbegin. 
This is a short delay, and it includes ROT", }, [ POWER9_PME_PM_CMPLU_STALL_NESTED_TEND ] = { .pme_name = "PM_CMPLU_STALL_NESTED_TEND", .pme_code = 0x000003003C, .pme_short_desc = "Completion stall because the ISU is updating the TEXASR to keep track of the nested tend and decrement the TEXASR nested level.", .pme_long_desc = "Completion stall because the ISU is updating the TEXASR to keep track of the nested tend and decrement the TEXASR nested level. This is a short delay", }, [ POWER9_PME_PM_CMPLU_STALL_NTC_DISP_FIN ] = { .pme_name = "PM_CMPLU_STALL_NTC_DISP_FIN", .pme_code = 0x000004E018, .pme_short_desc = "Finish stall because the NTF instruction was one that must finish at dispatch.", .pme_long_desc = "Finish stall because the NTF instruction was one that must finish at dispatch.", }, [ POWER9_PME_PM_CMPLU_STALL_NTC_FLUSH ] = { .pme_name = "PM_CMPLU_STALL_NTC_FLUSH", .pme_code = 0x000002E01E, .pme_short_desc = "Completion stall due to ntc flush", .pme_long_desc = "Completion stall due to ntc flush", }, [ POWER9_PME_PM_CMPLU_STALL_OTHER_CMPL ] = { .pme_name = "PM_CMPLU_STALL_OTHER_CMPL", .pme_code = 0x0000030006, .pme_short_desc = "Instructions the core completed while this thread was stalled", .pme_long_desc = "Instructions the core completed while this thread was stalled", }, [ POWER9_PME_PM_CMPLU_STALL_PASTE ] = { .pme_name = "PM_CMPLU_STALL_PASTE", .pme_code = 0x000002C016, .pme_short_desc = "Finish stall because the NTF instruction was a paste waiting for response from L2", .pme_long_desc = "Finish stall because the NTF instruction was a paste waiting for response from L2", }, [ POWER9_PME_PM_CMPLU_STALL_PM ] = { .pme_name = "PM_CMPLU_STALL_PM", .pme_code = 0x000003000A, .pme_short_desc = "Finish stall because the NTF instruction was issued to the Permute execution pipe and waiting to finish.", .pme_long_desc = "Finish stall because the NTF instruction was issued to the Permute execution pipe and waiting to finish. 
Includes permute and decimal fixed point instructions (128 bit BCD arithmetic) + a few 128 bit fixpoint add/subtract instructions with carry. Not qualified by vector or multicycle", }, [ POWER9_PME_PM_CMPLU_STALL_SLB ] = { .pme_name = "PM_CMPLU_STALL_SLB", .pme_code = 0x000001E052, .pme_short_desc = "Finish stall because the NTF instruction was awaiting L2 response for an SLB", .pme_long_desc = "Finish stall because the NTF instruction was awaiting L2 response for an SLB", }, [ POWER9_PME_PM_CMPLU_STALL_SPEC_FINISH ] = { .pme_name = "PM_CMPLU_STALL_SPEC_FINISH", .pme_code = 0x0000030028, .pme_short_desc = "Finish stall while waiting for the non-speculative finish of either a stcx waiting for its result or a load waiting for non-critical sectors of data and ECC", .pme_long_desc = "Finish stall while waiting for the non-speculative finish of either a stcx waiting for its result or a load waiting for non-critical sectors of data and ECC", }, [ POWER9_PME_PM_CMPLU_STALL_SRQ_FULL ] = { .pme_name = "PM_CMPLU_STALL_SRQ_FULL", .pme_code = 0x0000030016, .pme_short_desc = "Finish stall because the NTF instruction was a store that was held in LSAQ because the SRQ was full", .pme_long_desc = "Finish stall because the NTF instruction was a store that was held in LSAQ because the SRQ was full", }, [ POWER9_PME_PM_CMPLU_STALL_STCX ] = { .pme_name = "PM_CMPLU_STALL_STCX", .pme_code = 0x000002D01C, .pme_short_desc = "Finish stall because the NTF instruction was a stcx waiting for response from L2", .pme_long_desc = "Finish stall because the NTF instruction was a stcx waiting for response from L2", }, [ POWER9_PME_PM_CMPLU_STALL_ST_FWD ] = { .pme_name = "PM_CMPLU_STALL_ST_FWD", .pme_code = 0x000004C01C, .pme_short_desc = "Completion stall due to store forward", .pme_long_desc = "Completion stall due to store forward", }, [ POWER9_PME_PM_CMPLU_STALL_STORE_DATA ] = { .pme_name = "PM_CMPLU_STALL_STORE_DATA", .pme_code = 0x0000030026, .pme_short_desc = "Finish stall because the next to 
finish instruction was a store waiting on data", .pme_long_desc = "Finish stall because the next to finish instruction was a store waiting on data", }, [ POWER9_PME_PM_CMPLU_STALL_STORE_FIN_ARB ] = { .pme_name = "PM_CMPLU_STALL_STORE_FIN_ARB", .pme_code = 0x0000030014, .pme_short_desc = "Finish stall because the NTF instruction was a store waiting for a slot in the store finish pipe.", .pme_long_desc = "Finish stall because the NTF instruction was a store waiting for a slot in the store finish pipe. This means the instruction is ready to finish but there are instructions ahead of it, using the finish pipe", }, [ POWER9_PME_PM_CMPLU_STALL_STORE_FINISH ] = { .pme_name = "PM_CMPLU_STALL_STORE_FINISH", .pme_code = 0x000002C014, .pme_short_desc = "Finish stall because the NTF instruction was a store with all its dependencies met, just waiting to go through the LSU pipe to finish", .pme_long_desc = "Finish stall because the NTF instruction was a store with all its dependencies met, just waiting to go through the LSU pipe to finish", }, [ POWER9_PME_PM_CMPLU_STALL_STORE_PIPE_ARB ] = { .pme_name = "PM_CMPLU_STALL_STORE_PIPE_ARB", .pme_code = 0x000004C010, .pme_short_desc = "Finish stall because the NTF instruction was a store waiting for the next relaunch opportunity after an internal reject.", .pme_long_desc = "Finish stall because the NTF instruction was a store waiting for the next relaunch opportunity after an internal reject. 
This means the instruction is ready to relaunch and tried once but lost arbitration", }, [ POWER9_PME_PM_CMPLU_STALL_SYNC_PMU_INT ] = { .pme_name = "PM_CMPLU_STALL_SYNC_PMU_INT", .pme_code = 0x000002C01E, .pme_short_desc = "Cycles in which the NTC instruction is waiting for a synchronous PMU interrupt", .pme_long_desc = "Cycles in which the NTC instruction is waiting for a synchronous PMU interrupt", }, [ POWER9_PME_PM_CMPLU_STALL_TEND ] = { .pme_name = "PM_CMPLU_STALL_TEND", .pme_code = 0x000001E050, .pme_short_desc = "Finish stall because the NTF instruction was a tend instruction awaiting response from L2", .pme_long_desc = "Finish stall because the NTF instruction was a tend instruction awaiting response from L2", }, [ POWER9_PME_PM_CMPLU_STALL_THRD ] = { .pme_name = "PM_CMPLU_STALL_THRD", .pme_code = 0x000001001C, .pme_short_desc = "Completion Stalled because the thread was blocked", .pme_long_desc = "Completion Stalled because the thread was blocked", }, [ POWER9_PME_PM_CMPLU_STALL_TLBIE ] = { .pme_name = "PM_CMPLU_STALL_TLBIE", .pme_code = 0x000002E01C, .pme_short_desc = "Finish stall because the NTF instruction was a tlbie waiting for response from L2", .pme_long_desc = "Finish stall because the NTF instruction was a tlbie waiting for response from L2", }, [ POWER9_PME_PM_CMPLU_STALL ] = { .pme_name = "PM_CMPLU_STALL", .pme_code = 0x000001E054, .pme_short_desc = "Nothing completed and ICT not empty", .pme_long_desc = "Nothing completed and ICT not empty", }, [ POWER9_PME_PM_CMPLU_STALL_VDPLONG ] = { .pme_name = "PM_CMPLU_STALL_VDPLONG", .pme_code = 0x000003C05A, .pme_short_desc = "Finish stall because the NTF instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish.", .pme_long_desc = "Finish stall because the NTF instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish. 
Includes binary floating point instructions in 32 and 64 bit binary floating point format. Qualified by NOT vector AND multicycle", }, [ POWER9_PME_PM_CMPLU_STALL_VDP ] = { .pme_name = "PM_CMPLU_STALL_VDP", .pme_code = 0x000004405C, .pme_short_desc = "Finish stall because the NTF instruction was a vector instruction issued to the Double Precision execution pipe and waiting to finish.", .pme_long_desc = "Finish stall because the NTF instruction was a vector instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. Not qualified multicycle. Qualified by vector", }, [ POWER9_PME_PM_CMPLU_STALL_VFXLONG ] = { .pme_name = "PM_CMPLU_STALL_VFXLONG", .pme_code = 0x000002E018, .pme_short_desc = "Completion stall due to a long latency vector fixed point instruction (division, square root)", .pme_long_desc = "Completion stall due to a long latency vector fixed point instruction (division, square root)", }, [ POWER9_PME_PM_CMPLU_STALL_VFXU ] = { .pme_name = "PM_CMPLU_STALL_VFXU", .pme_code = 0x000003C05C, .pme_short_desc = "Finish stall due to a vector fixed point instruction in the execution pipeline.", .pme_long_desc = "Finish stall due to a vector fixed point instruction in the execution pipeline. These instructions get routed to the ALU, ALU2, and DIV pipes", }, [ POWER9_PME_PM_CO0_BUSY ] = { .pme_name = "PM_CO0_BUSY", .pme_code = 0x000003608C, .pme_short_desc = "CO mach 0 Busy.", .pme_long_desc = "CO mach 0 Busy. Used by PMU to sample ave CO lifetime (mach0 used as sample point)", }, [ POWER9_PME_PM_CO0_BUSY_ALT ] = { .pme_name = "PM_CO0_BUSY_ALT", .pme_code = 0x000004608C, .pme_short_desc = "CO mach 0 Busy.", .pme_long_desc = "CO mach 0 Busy. 
Used by PMU to sample ave CO lifetime (mach0 used as sample point)", }, [ POWER9_PME_PM_CO_DISP_FAIL ] = { .pme_name = "PM_CO_DISP_FAIL", .pme_code = 0x0000016886, .pme_short_desc = "CO dispatch failed due to all CO machines being busy", .pme_long_desc = "CO dispatch failed due to all CO machines being busy", }, [ POWER9_PME_PM_CO_TM_SC_FOOTPRINT ] = { .pme_name = "PM_CO_TM_SC_FOOTPRINT", .pme_code = 0x0000026086, .pme_short_desc = "L2 did a cleanifdirty CO to the L3 (ie created an SC line in the L3) OR L2 TM_store hit dirty HPC line and L3 indicated SC line formed in L3 on RDR bus", .pme_long_desc = "L2 did a cleanifdirty CO to the L3 (ie created an SC line in the L3) OR L2 TM_store hit dirty HPC line and L3 indicated SC line formed in L3 on RDR bus", }, [ POWER9_PME_PM_CO_USAGE ] = { .pme_name = "PM_CO_USAGE", .pme_code = 0x000002688C, .pme_short_desc = "Continuous 16 cycle (2to1) window where this signal rotates through sampling each CO machine busy.", .pme_long_desc = "Continuous 16 cycle (2to1) window where this signal rotates through sampling each CO machine busy. PMU uses this wave to then do 16 cyc count to sample total number of machs running", }, /* See also alternate entries for 000001001E / POWER9_PME_PM_CYC with code(s) 000002001E 000003001E 000004001E at the bottom of this table. 
*/ [ POWER9_PME_PM_CYC ] = { .pme_name = "PM_CYC", .pme_code = 0x000001001E, .pme_short_desc = "Processor cycles", .pme_long_desc = "Processor cycles", }, [ POWER9_PME_PM_DARQ0_0_3_ENTRIES ] = { .pme_name = "PM_DARQ0_0_3_ENTRIES", .pme_code = 0x000004D04A, .pme_short_desc = "Cycles in which 3 or fewer DARQ entries (out of 12) are in use", .pme_long_desc = "Cycles in which 3 or fewer DARQ entries (out of 12) are in use", }, [ POWER9_PME_PM_DARQ0_10_12_ENTRIES ] = { .pme_name = "PM_DARQ0_10_12_ENTRIES", .pme_code = 0x000001D058, .pme_short_desc = "Cycles in which 10 or more DARQ entries (out of 12) are in use", .pme_long_desc = "Cycles in which 10 or more DARQ entries (out of 12) are in use", }, [ POWER9_PME_PM_DARQ0_4_6_ENTRIES ] = { .pme_name = "PM_DARQ0_4_6_ENTRIES", .pme_code = 0x000003504E, .pme_short_desc = "Cycles in which 4, 5, or 6 DARQ entries (out of 12) are in use", .pme_long_desc = "Cycles in which 4, 5, or 6 DARQ entries (out of 12) are in use", }, [ POWER9_PME_PM_DARQ0_7_9_ENTRIES ] = { .pme_name = "PM_DARQ0_7_9_ENTRIES", .pme_code = 0x000002E050, .pme_short_desc = "Cycles in which 7, 8, or 9 DARQ entries (out of 12) are in use", .pme_long_desc = "Cycles in which 7, 8, or 9 DARQ entries (out of 12) are in use", }, [ POWER9_PME_PM_DARQ1_0_3_ENTRIES ] = { .pme_name = "PM_DARQ1_0_3_ENTRIES", .pme_code = 0x000004C122, .pme_short_desc = "Cycles in which 3 or fewer DARQ1 entries (out of 12) are in use", .pme_long_desc = "Cycles in which 3 or fewer DARQ1 entries (out of 12) are in use", }, [ POWER9_PME_PM_DARQ1_10_12_ENTRIES ] = { .pme_name = "PM_DARQ1_10_12_ENTRIES", .pme_code = 0x0000020058, .pme_short_desc = "Cycles in which 10 or more DARQ1 entries (out of 12) are in use", .pme_long_desc = "Cycles in which 10 or more DARQ1 entries (out of 12) are in use", }, [ POWER9_PME_PM_DARQ1_4_6_ENTRIES ] = { .pme_name = "PM_DARQ1_4_6_ENTRIES", .pme_code = 0x000003E050, .pme_short_desc = "Cycles in which 4, 5, or 6 DARQ1 entries (out of 12) are in use", .pme_long_desc 
= "Cycles in which 4, 5, or 6 DARQ1 entries (out of 12) are in use", }, [ POWER9_PME_PM_DARQ1_7_9_ENTRIES ] = { .pme_name = "PM_DARQ1_7_9_ENTRIES", .pme_code = 0x000002005A, .pme_short_desc = "Cycles in which 7 to 9 DARQ1 entries (out of 12) are in use", .pme_long_desc = "Cycles in which 7 to 9 DARQ1 entries (out of 12) are in use", }, [ POWER9_PME_PM_DARQ_STORE_REJECT ] = { .pme_name = "PM_DARQ_STORE_REJECT", .pme_code = 0x000004405E, .pme_short_desc = "The DARQ attempted to transmit a store into an LSAQ or SRQ entry but it was rejected.", .pme_long_desc = "The DARQ attempted to transmit a store into an LSAQ or SRQ entry but it was rejected. Divide by PM_DARQ_STORE_XMIT to get reject ratio", }, [ POWER9_PME_PM_DARQ_STORE_XMIT ] = { .pme_name = "PM_DARQ_STORE_XMIT", .pme_code = 0x0000030064, .pme_short_desc = "The DARQ attempted to transmit a store into an LSAQ or SRQ entry.", .pme_long_desc = "The DARQ attempted to transmit a store into an LSAQ or SRQ entry. Includes rejects. Not qualified by thread, so it includes counts for the whole core", }, [ POWER9_PME_PM_DATA_CHIP_PUMP_CPRED ] = { .pme_name = "PM_DATA_CHIP_PUMP_CPRED", .pme_code = 0x000001C050, .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for a demand load", .pme_long_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for a demand load", }, [ POWER9_PME_PM_DATA_FROM_DL2L3_MOD ] = { .pme_name = "PM_DATA_FROM_DL2L3_MOD", .pme_code = 0x000004C048, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_DL2L3_SHR ] = { .pme_name = "PM_DATA_FROM_DL2L3_SHR", .pme_code = 0x000003C048, .pme_short_desc = "The 
processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_DL4 ] = { .pme_name = "PM_DATA_FROM_DL4", .pme_code = 0x000003C04C, .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_DMEM ] = { .pme_name = "PM_DATA_FROM_DMEM", .pme_code = 0x000004C04C, .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_L21_MOD ] = { .pme_name = "PM_DATA_FROM_L21_MOD", .pme_code = 0x000004C046, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_L21_SHR ] = { .pme_name = "PM_DATA_FROM_L21_SHR", .pme_code = 0x000003C046, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST ] = { .pme_name = 
"PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST", .pme_code = 0x000003C040, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_L2_DISP_CONFLICT_OTHER ] = { .pme_name = "PM_DATA_FROM_L2_DISP_CONFLICT_OTHER", .pme_code = 0x000004C040, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_L2_MEPF ] = { .pme_name = "PM_DATA_FROM_L2_MEPF", .pme_code = 0x000002C040, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_L2MISS_MOD ] = { .pme_name = "PM_DATA_FROM_L2MISS_MOD", .pme_code = 0x000001C04E, .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L2 due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L2 due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_L2MISS ] = { .pme_name = "PM_DATA_FROM_L2MISS", .pme_code = 0x00000200FE, .pme_short_desc = "Demand LD - L2 Miss (not L2 hit)", .pme_long_desc = "Demand LD - L2 Miss (not L2 hit)", }, [ POWER9_PME_PM_DATA_FROM_L2_NO_CONFLICT ] = { .pme_name = "PM_DATA_FROM_L2_NO_CONFLICT", .pme_code = 0x000001C040, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from 
local core's L2 without conflict due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_L2 ] = { .pme_name = "PM_DATA_FROM_L2", .pme_code = 0x000001C042, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_L31_ECO_MOD ] = { .pme_name = "PM_DATA_FROM_L31_ECO_MOD", .pme_code = 0x000004C044, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_L31_ECO_SHR ] = { .pme_name = "PM_DATA_FROM_L31_ECO_SHR", .pme_code = 0x000003C044, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_L31_MOD ] = { .pme_name = "PM_DATA_FROM_L31_MOD", .pme_code = 0x000002C044, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_L31_SHR ] = { .pme_name = "PM_DATA_FROM_L31_SHR", .pme_code = 0x000001C046, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a demand load", }, [ 
POWER9_PME_PM_DATA_FROM_L3_DISP_CONFLICT ] = { .pme_name = "PM_DATA_FROM_L3_DISP_CONFLICT", .pme_code = 0x000003C042, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_L3_MEPF ] = { .pme_name = "PM_DATA_FROM_L3_MEPF", .pme_code = 0x000002C042, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_L3MISS_MOD ] = { .pme_name = "PM_DATA_FROM_L3MISS_MOD", .pme_code = 0x000004C04E, .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_L3MISS ] = { .pme_name = "PM_DATA_FROM_L3MISS", .pme_code = 0x00000300FE, .pme_short_desc = "Demand LD - L3 Miss (not L2 hit and not L3 hit)", .pme_long_desc = "Demand LD - L3 Miss (not L2 hit and not L3 hit)", }, [ POWER9_PME_PM_DATA_FROM_L3_NO_CONFLICT ] = { .pme_name = "PM_DATA_FROM_L3_NO_CONFLICT", .pme_code = 0x000001C044, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_L3 ] = { .pme_name = "PM_DATA_FROM_L3", .pme_code = 0x000004C042, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from 
local core's L3 due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_LL4 ] = { .pme_name = "PM_DATA_FROM_LL4", .pme_code = 0x000001C04C, .pme_short_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_LMEM ] = { .pme_name = "PM_DATA_FROM_LMEM", .pme_code = 0x000002C048, .pme_short_desc = "The processor's data cache was reloaded from the local chip's Memory due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from the local chip's Memory due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_MEMORY ] = { .pme_name = "PM_DATA_FROM_MEMORY", .pme_code = 0x00000400FE, .pme_short_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_OFF_CHIP_CACHE ] = { .pme_name = "PM_DATA_FROM_OFF_CHIP_CACHE", .pme_code = 0x000004C04A, .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a demand load", .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_ON_CHIP_CACHE ] = { .pme_name = "PM_DATA_FROM_ON_CHIP_CACHE", .pme_code = 0x000001C048, .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a demand load", }, [ 
POWER9_PME_PM_DATA_FROM_RL2L3_MOD ] = { .pme_name = "PM_DATA_FROM_RL2L3_MOD", .pme_code = 0x000002C046, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_RL2L3_SHR ] = { .pme_name = "PM_DATA_FROM_RL2L3_SHR", .pme_code = 0x000001C04A, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_RL4 ] = { .pme_name = "PM_DATA_FROM_RL4", .pme_code = 0x000002C04A, .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to a demand load", }, [ POWER9_PME_PM_DATA_FROM_RMEM ] = { .pme_name = "PM_DATA_FROM_RMEM", .pme_code = 0x000003C04A, .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Remote) due to a demand load", .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Remote) due to a demand load", }, [ POWER9_PME_PM_DATA_GRP_PUMP_CPRED ] = { .pme_name = "PM_DATA_GRP_PUMP_CPRED", .pme_code = 0x000002C050, .pme_short_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for a demand load", .pme_long_desc = "Initial and Final Pump Scope was group pump (prediction=correct) 
for a demand load", }, [ POWER9_PME_PM_DATA_GRP_PUMP_MPRED_RTY ] = { .pme_name = "PM_DATA_GRP_PUMP_MPRED_RTY", .pme_code = 0x000001C052, .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", .pme_long_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", }, [ POWER9_PME_PM_DATA_GRP_PUMP_MPRED ] = { .pme_name = "PM_DATA_GRP_PUMP_MPRED", .pme_code = 0x000002C052, .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for a demand load", .pme_long_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for a demand load", }, [ POWER9_PME_PM_DATA_PUMP_CPRED ] = { .pme_name = "PM_DATA_PUMP_CPRED", .pme_code = 0x000001C054, .pme_short_desc = "Pump prediction correct.", .pme_long_desc = "Pump prediction correct. Counts across all types of pumps for a demand load", }, [ POWER9_PME_PM_DATA_PUMP_MPRED ] = { .pme_name = "PM_DATA_PUMP_MPRED", .pme_code = 0x000004C052, .pme_short_desc = "Pump misprediction.", .pme_long_desc = "Pump misprediction. 
Counts across all types of pumps for a demand load", }, [ POWER9_PME_PM_DATA_STORE ] = { .pme_name = "PM_DATA_STORE", .pme_code = 0x000000F0A0, .pme_short_desc = "All ops that drain from s2q to L2 containing data", .pme_long_desc = "All ops that drain from s2q to L2 containing data", }, [ POWER9_PME_PM_DATA_SYS_PUMP_CPRED ] = { .pme_name = "PM_DATA_SYS_PUMP_CPRED", .pme_code = 0x000003C050, .pme_short_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for a demand load", .pme_long_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for a demand load", }, [ POWER9_PME_PM_DATA_SYS_PUMP_MPRED_RTY ] = { .pme_name = "PM_DATA_SYS_PUMP_MPRED_RTY", .pme_code = 0x000004C050, .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for a demand load", .pme_long_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for a demand load", }, [ POWER9_PME_PM_DATA_SYS_PUMP_MPRED ] = { .pme_name = "PM_DATA_SYS_PUMP_MPRED", .pme_code = 0x000003C052, .pme_short_desc = "Final Pump Scope (system) mispredicted.", .pme_long_desc = "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. Counts for a demand load", }, [ POWER9_PME_PM_DATA_TABLEWALK_CYC ] = { .pme_name = "PM_DATA_TABLEWALK_CYC", .pme_code = 0x000003001A, .pme_short_desc = "Data Tablewalk Cycles.", .pme_long_desc = "Data Tablewalk Cycles. Could be 1 or 2 active tablewalks. Includes data prefetches.", }, [ POWER9_PME_PM_DC_DEALLOC_NO_CONF ] = { .pme_name = "PM_DC_DEALLOC_NO_CONF", .pme_code = 0x000000F8AC, .pme_short_desc = "A demand load referenced a line in an active fuzzy prefetch stream.", .pme_long_desc = "A demand load referenced a line in an active fuzzy prefetch stream. 
The stream could have been allocated through the hardware prefetch mechanism or through software. Fuzzy stream confirm (out of order effects, or pf can't keep up)", }, [ POWER9_PME_PM_DC_PREF_CONF ] = { .pme_name = "PM_DC_PREF_CONF", .pme_code = 0x000000F0A8, .pme_short_desc = "A demand load referenced a line in an active prefetch stream.", .pme_long_desc = "A demand load referenced a line in an active prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software. Includes forwards and backwards streams", }, [ POWER9_PME_PM_DC_PREF_CONS_ALLOC ] = { .pme_name = "PM_DC_PREF_CONS_ALLOC", .pme_code = 0x000000F0B4, .pme_short_desc = "Prefetch stream allocated in the conservative phase by either the hardware prefetch mechanism or software prefetch", .pme_long_desc = "Prefetch stream allocated in the conservative phase by either the hardware prefetch mechanism or software prefetch", }, [ POWER9_PME_PM_DC_PREF_FUZZY_CONF ] = { .pme_name = "PM_DC_PREF_FUZZY_CONF", .pme_code = 0x000000F8A8, .pme_short_desc = "A demand load referenced a line in an active fuzzy prefetch stream.", .pme_long_desc = "A demand load referenced a line in an active fuzzy prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software. Fuzzy stream confirm (out of order effects, or pf can't keep up)", }, [ POWER9_PME_PM_DC_PREF_HW_ALLOC ] = { .pme_name = "PM_DC_PREF_HW_ALLOC", .pme_code = 0x000000F0A4, .pme_short_desc = "Prefetch stream allocated by the hardware prefetch mechanism", .pme_long_desc = "Prefetch stream allocated by the hardware prefetch mechanism", }, [ POWER9_PME_PM_DC_PREF_STRIDED_CONF ] = { .pme_name = "PM_DC_PREF_STRIDED_CONF", .pme_code = 0x000000F0AC, .pme_short_desc = "A demand load referenced a line in an active strided prefetch stream.", .pme_long_desc = "A demand load referenced a line in an active strided prefetch stream. 
The stream could have been allocated through the hardware prefetch mechanism or through software.", }, [ POWER9_PME_PM_DC_PREF_SW_ALLOC ] = { .pme_name = "PM_DC_PREF_SW_ALLOC", .pme_code = 0x000000F8A4, .pme_short_desc = "Prefetch stream allocated by software prefetching", .pme_long_desc = "Prefetch stream allocated by software prefetching", }, [ POWER9_PME_PM_DC_PREF_XCONS_ALLOC ] = { .pme_name = "PM_DC_PREF_XCONS_ALLOC", .pme_code = 0x000000F8B4, .pme_short_desc = "Prefetch stream allocated in the Ultra conservative phase by either the hardware prefetch mechanism or software prefetch", .pme_long_desc = "Prefetch stream allocated in the Ultra conservative phase by either the hardware prefetch mechanism or software prefetch", }, [ POWER9_PME_PM_DECODE_FUSION_CONST_GEN ] = { .pme_name = "PM_DECODE_FUSION_CONST_GEN", .pme_code = 0x00000048B4, .pme_short_desc = "32-bit constant generation", .pme_long_desc = "32-bit constant generation", }, [ POWER9_PME_PM_DECODE_FUSION_EXT_ADD ] = { .pme_name = "PM_DECODE_FUSION_EXT_ADD", .pme_code = 0x0000005084, .pme_short_desc = "32-bit extended addition", .pme_long_desc = "32-bit extended addition", }, [ POWER9_PME_PM_DECODE_FUSION_LD_ST_DISP ] = { .pme_name = "PM_DECODE_FUSION_LD_ST_DISP", .pme_code = 0x00000048A8, .pme_short_desc = "32-bit displacement D-form and 16-bit displacement X-form", .pme_long_desc = "32-bit displacement D-form and 16-bit displacement X-form", }, [ POWER9_PME_PM_DECODE_FUSION_OP_PRESERV ] = { .pme_name = "PM_DECODE_FUSION_OP_PRESERV", .pme_code = 0x0000005088, .pme_short_desc = "Destructive op operand preservation", .pme_long_desc = "Destructive op operand preservation", }, [ POWER9_PME_PM_DECODE_HOLD_ICT_FULL ] = { .pme_name = "PM_DECODE_HOLD_ICT_FULL", .pme_code = 0x00000058A8, .pme_short_desc = "Counts the number of cycles in which the IFU was not able to decode and transmit one or more instructions because all itags were in use.", .pme_long_desc = "Counts the number of cycles in which the IFU was not 
able to decode and transmit one or more instructions because all itags were in use. This means the ICT is full for this thread", }, [ POWER9_PME_PM_DECODE_LANES_NOT_AVAIL ] = { .pme_name = "PM_DECODE_LANES_NOT_AVAIL", .pme_code = 0x0000005884, .pme_short_desc = "Decode has something to transmit but dispatch lanes are not available", .pme_long_desc = "Decode has something to transmit but dispatch lanes are not available", }, [ POWER9_PME_PM_DERAT_MISS_16G ] = { .pme_name = "PM_DERAT_MISS_16G", .pme_code = 0x000004C054, .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 16G", .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 16G", }, [ POWER9_PME_PM_DERAT_MISS_16M ] = { .pme_name = "PM_DERAT_MISS_16M", .pme_code = 0x000003C054, .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 16M", .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 16M", }, [ POWER9_PME_PM_DERAT_MISS_1G ] = { .pme_name = "PM_DERAT_MISS_1G", .pme_code = 0x000002C05A, .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 1G.", .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 1G. Implies radix translation", }, [ POWER9_PME_PM_DERAT_MISS_2M ] = { .pme_name = "PM_DERAT_MISS_2M", .pme_code = 0x000001C05A, .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 2M.", .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 2M. 
Implies radix translation", }, [ POWER9_PME_PM_DERAT_MISS_4K ] = { .pme_name = "PM_DERAT_MISS_4K", .pme_code = 0x000001C056, .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 4K", .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 4K", }, [ POWER9_PME_PM_DERAT_MISS_64K ] = { .pme_name = "PM_DERAT_MISS_64K", .pme_code = 0x000002C054, .pme_short_desc = "Data ERAT Miss (Data TLB Access) page size 64K", .pme_long_desc = "Data ERAT Miss (Data TLB Access) page size 64K", }, [ POWER9_PME_PM_DFU_BUSY ] = { .pme_name = "PM_DFU_BUSY", .pme_code = 0x000004D04C, .pme_short_desc = "Cycles in which all 4 Decimal Floating Point units are busy.", .pme_long_desc = "Cycles in which all 4 Decimal Floating Point units are busy. The DFU is running at capacity", }, [ POWER9_PME_PM_DISP_CLB_HELD_BAL ] = { .pme_name = "PM_DISP_CLB_HELD_BAL", .pme_code = 0x000000288C, .pme_short_desc = "Dispatch/CLB Hold: Balance Flush", .pme_long_desc = "Dispatch/CLB Hold: Balance Flush", }, [ POWER9_PME_PM_DISP_CLB_HELD_SB ] = { .pme_name = "PM_DISP_CLB_HELD_SB", .pme_code = 0x0000002090, .pme_short_desc = "Dispatch/CLB Hold: Scoreboard", .pme_long_desc = "Dispatch/CLB Hold: Scoreboard", }, [ POWER9_PME_PM_DISP_CLB_HELD_TLBIE ] = { .pme_name = "PM_DISP_CLB_HELD_TLBIE", .pme_code = 0x0000002890, .pme_short_desc = "Dispatch Hold: Due to TLBIE", .pme_long_desc = "Dispatch Hold: Due to TLBIE", }, [ POWER9_PME_PM_DISP_HELD_HB_FULL ] = { .pme_name = "PM_DISP_HELD_HB_FULL", .pme_code = 0x000003D05C, .pme_short_desc = "Dispatch held due to History Buffer full.", .pme_long_desc = "Dispatch held due to History Buffer full. Could be GPR/VSR/VMR/FPR/CR/XVF; CR; XVF (XER/VSCR/FPSCR)", }, [ POWER9_PME_PM_DISP_HELD_ISSQ_FULL ] = { .pme_name = "PM_DISP_HELD_ISSQ_FULL", .pme_code = 0x0000020006, .pme_short_desc = "Dispatch held due to Issue q full.", .pme_long_desc = "Dispatch held due to Issue q full. 
Includes issue queue and branch queue", }, [ POWER9_PME_PM_DISP_HELD_SYNC_HOLD ] = { .pme_name = "PM_DISP_HELD_SYNC_HOLD", .pme_code = 0x000004003C, .pme_short_desc = "Cycles in which dispatch is held because of a synchronizing instruction in the pipeline", .pme_long_desc = "Cycles in which dispatch is held because of a synchronizing instruction in the pipeline", }, [ POWER9_PME_PM_DISP_HELD_TBEGIN ] = { .pme_name = "PM_DISP_HELD_TBEGIN", .pme_code = 0x00000028B0, .pme_short_desc = "This outer tbegin transaction cannot be dispatched until the previous tend instruction completes", .pme_long_desc = "This outer tbegin transaction cannot be dispatched until the previous tend instruction completes", }, [ POWER9_PME_PM_DISP_HELD ] = { .pme_name = "PM_DISP_HELD", .pme_code = 0x0000010006, .pme_short_desc = "Dispatch Held", .pme_long_desc = "Dispatch Held", }, [ POWER9_PME_PM_DISP_STARVED ] = { .pme_name = "PM_DISP_STARVED", .pme_code = 0x0000030008, .pme_short_desc = "Dispatch Starved", .pme_long_desc = "Dispatch Starved", }, [ POWER9_PME_PM_DP_QP_FLOP_CMPL ] = { .pme_name = "PM_DP_QP_FLOP_CMPL", .pme_code = 0x000004D05C, .pme_short_desc = "Double-Precision or Quad-Precision instruction completed", .pme_long_desc = "Double-Precision or Quad-Precision instruction completed", }, [ POWER9_PME_PM_DPTEG_FROM_DL2L3_MOD ] = { .pme_name = "PM_DPTEG_FROM_DL2L3_MOD", .pme_code = 0x000004E048, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_DL2L3_SHR ] = { .pme_name = "PM_DPTEG_FROM_DL2L3_SHR", .pme_code = 0x000003E048, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_DL4 ] = { .pme_name = "PM_DPTEG_FROM_DL4", .pme_code = 0x000003E04C, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_DMEM ] = { .pme_name = "PM_DPTEG_FROM_DMEM", .pme_code = 0x000004E04C, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_L21_MOD ] = { .pme_name = "PM_DPTEG_FROM_L21_MOD", .pme_code = 0x000004E046, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_L21_SHR ] = { .pme_name = "PM_DPTEG_FROM_L21_SHR", .pme_code = 0x000003E046, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_L2_MEPF ] = { .pme_name = "PM_DPTEG_FROM_L2_MEPF", .pme_code = 0x000002E040, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state. due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_L2MISS ] = { .pme_name = "PM_DPTEG_FROM_L2MISS", .pme_code = 0x000001E04E, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_L2_NO_CONFLICT ] = { .pme_name = "PM_DPTEG_FROM_L2_NO_CONFLICT", .pme_code = 0x000001E040, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_L2 ] = { .pme_name = "PM_DPTEG_FROM_L2", .pme_code = 0x000001E042, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_L31_ECO_MOD ] = { .pme_name = "PM_DPTEG_FROM_L31_ECO_MOD", .pme_code = 0x000004E044, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_L31_ECO_SHR ] = { .pme_name = "PM_DPTEG_FROM_L31_ECO_SHR", .pme_code = 0x000003E044, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_L31_MOD ] = { .pme_name = "PM_DPTEG_FROM_L31_MOD", .pme_code = 0x000002E044, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_L31_SHR ] = { .pme_name = "PM_DPTEG_FROM_L31_SHR", .pme_code = 0x000001E046, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_L3_DISP_CONFLICT ] = { .pme_name = "PM_DPTEG_FROM_L3_DISP_CONFLICT", .pme_code = 0x000003E042, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_L3_MEPF ] = { .pme_name = "PM_DPTEG_FROM_L3_MEPF", .pme_code = 0x000002E042, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state. due to a data side request. 
When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_L3MISS ] = { .pme_name = "PM_DPTEG_FROM_L3MISS", .pme_code = 0x000004E04E, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_L3_NO_CONFLICT ] = { .pme_name = "PM_DPTEG_FROM_L3_NO_CONFLICT", .pme_code = 0x000001E044, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_L3 ] = { .pme_name = "PM_DPTEG_FROM_L3", .pme_code = 0x000004E042, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_LL4 ] = { .pme_name = "PM_DPTEG_FROM_LL4", .pme_code = 0x000001E04C, .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_LMEM ] = { .pme_name = "PM_DPTEG_FROM_LMEM", .pme_code = 0x000002E048, .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_MEMORY ] = { .pme_name = "PM_DPTEG_FROM_MEMORY", .pme_code = 0x000002E04C, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_OFF_CHIP_CACHE ] = { .pme_name = "PM_DPTEG_FROM_OFF_CHIP_CACHE", .pme_code = 0x000004E04A, .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_ON_CHIP_CACHE ] = { .pme_name = "PM_DPTEG_FROM_ON_CHIP_CACHE", .pme_code = 0x000001E048, .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_RL2L3_MOD ] = { .pme_name = "PM_DPTEG_FROM_RL2L3_MOD", .pme_code = 0x000002E046, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_RL2L3_SHR ] = { .pme_name = "PM_DPTEG_FROM_RL2L3_SHR", .pme_code = 0x000001E04A, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_RL4 ] = { .pme_name = "PM_DPTEG_FROM_RL4", .pme_code = 0x000002E04A, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_DPTEG_FROM_RMEM ] = { .pme_name = "PM_DPTEG_FROM_RMEM", .pme_code = 0x000003E04A, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to a data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", }, [ POWER9_PME_PM_DSIDE_L2MEMACC ] = { .pme_name = "PM_DSIDE_L2MEMACC", .pme_code = 0x0000036092, .pme_short_desc = "Valid when first beat of data comes in for a D-side fetch where data came EXCLUSIVELY from memory (excluding hpcread64 accesses), i.", .pme_long_desc = "Valid when first beat of data comes in for a D-side fetch where data came EXCLUSIVELY from memory (excluding hpcread64 accesses), i.e., total memory accesses by RCs", }, [ POWER9_PME_PM_DSIDE_MRU_TOUCH ] = { .pme_name = "PM_DSIDE_MRU_TOUCH", .pme_code = 0x0000026884, .pme_short_desc = "D-side L2 MRU touch sent to L2", .pme_long_desc = "D-side L2 MRU touch sent to L2", }, [ POWER9_PME_PM_DSIDE_OTHER_64B_L2MEMACC ] = { .pme_name = "PM_DSIDE_OTHER_64B_L2MEMACC", .pme_code = 0x0000036892, .pme_short_desc = "Valid when first beat of data comes in for a D-side fetch where data came EXCLUSIVELY from memory that was for hpc_read64, (RC had to fetch other 64B of a line from MC) i.", .pme_long_desc = "Valid when first beat of data comes in for a D-side fetch where data came EXCLUSIVELY from memory that was for hpc_read64, (RC had to fetch other 64B of a line from MC) i.e., number of times RC had to go to memory to get 'missing' 64B", }, [ POWER9_PME_PM_DSLB_MISS ] = { .pme_name = "PM_DSLB_MISS", .pme_code = 0x000000D0A8, .pme_short_desc = "Data SLB Miss - Total of all segment sizes", .pme_long_desc = "Data SLB Miss - Total of all segment sizes", }, [ POWER9_PME_PM_DSLB_MISS_ALT ] = { .pme_name = "PM_DSLB_MISS_ALT", .pme_code = 0x0000010016, .pme_short_desc = "gate_and(sd_pc_c0_comp_valid AND sd_pc_c0_comp_thread(0:1)=tid,sd_pc_c0_comp_ppc_count(0:3)) + gate_and(sd_pc_c1_comp_valid AND sd_pc_c1_comp_thread(0:1)=tid,sd_pc_c1_comp_ppc_count(0:3))", .pme_long_desc = "gate_and(sd_pc_c0_comp_valid AND sd_pc_c0_comp_thread(0:1)=tid,sd_pc_c0_comp_ppc_count(0:3)) + gate_and(sd_pc_c1_comp_valid AND sd_pc_c1_comp_thread(0:1)=tid,sd_pc_c1_comp_ppc_count(0:3))", }, [ 
POWER9_PME_PM_DTLB_MISS_16G ] = { .pme_name = "PM_DTLB_MISS_16G", .pme_code = 0x000001C058, .pme_short_desc = "Data TLB Miss page size 16G", .pme_long_desc = "Data TLB Miss page size 16G", }, [ POWER9_PME_PM_DTLB_MISS_16M ] = { .pme_name = "PM_DTLB_MISS_16M", .pme_code = 0x000004C056, .pme_short_desc = "Data TLB Miss page size 16M", .pme_long_desc = "Data TLB Miss page size 16M", }, [ POWER9_PME_PM_DTLB_MISS_1G ] = { .pme_name = "PM_DTLB_MISS_1G", .pme_code = 0x000004C05A, .pme_short_desc = "Data TLB reload (after a miss) page size 1G.", .pme_long_desc = "Data TLB reload (after a miss) page size 1G. Implies radix translation was used", }, [ POWER9_PME_PM_DTLB_MISS_2M ] = { .pme_name = "PM_DTLB_MISS_2M", .pme_code = 0x000001C05C, .pme_short_desc = "Data TLB reload (after a miss) page size 2M.", .pme_long_desc = "Data TLB reload (after a miss) page size 2M. Implies radix translation was used", }, [ POWER9_PME_PM_DTLB_MISS_4K ] = { .pme_name = "PM_DTLB_MISS_4K", .pme_code = 0x000002C056, .pme_short_desc = "Data TLB Miss page size 4k", .pme_long_desc = "Data TLB Miss page size 4k", }, [ POWER9_PME_PM_DTLB_MISS_64K ] = { .pme_name = "PM_DTLB_MISS_64K", .pme_code = 0x000003C056, .pme_short_desc = "Data TLB Miss page size 64K", .pme_long_desc = "Data TLB Miss page size 64K", }, [ POWER9_PME_PM_DTLB_MISS ] = { .pme_name = "PM_DTLB_MISS", .pme_code = 0x00000300FC, .pme_short_desc = "Data PTEG reload", .pme_long_desc = "Data PTEG reload", }, [ POWER9_PME_PM_SPACEHOLDER_0000040062 ] = { .pme_name = "PM_SPACEHOLDER_0000040062", .pme_code = 0x0000040062, .pme_short_desc = "SPACE_HOLDER for event 0000040062", .pme_long_desc = "SPACE_HOLDER for event 0000040062", }, [ POWER9_PME_PM_SPACEHOLDER_0000040064 ] = { .pme_name = "PM_SPACEHOLDER_0000040064", .pme_code = 0x0000040064, .pme_short_desc = "SPACE_HOLDER for event 0000040064", .pme_long_desc = "SPACE_HOLDER for event 0000040064", }, [ POWER9_PME_PM_EAT_FORCE_MISPRED ] = { .pme_name = "PM_EAT_FORCE_MISPRED", .pme_code = 
0x00000050A8, .pme_short_desc = "XL-form branch was mispredicted due to the predicted target address missing from EAT.", .pme_long_desc = "XL-form branch was mispredicted due to the predicted target address missing from EAT. The EAT forces a mispredict in this case since there is no predicted target to validate. This is a rare case that may occur when the EAT is full and a branch is issued", }, [ POWER9_PME_PM_EAT_FULL_CYC ] = { .pme_name = "PM_EAT_FULL_CYC", .pme_code = 0x0000004084, .pme_short_desc = "Cycles No room in EAT", .pme_long_desc = "Cycles No room in EAT", }, [ POWER9_PME_PM_EE_OFF_EXT_INT ] = { .pme_name = "PM_EE_OFF_EXT_INT", .pme_code = 0x0000002080, .pme_short_desc = "Cycles MSR[EE] is off and external interrupts are active", .pme_long_desc = "Cycles MSR[EE] is off and external interrupts are active", }, [ POWER9_PME_PM_EXT_INT ] = { .pme_name = "PM_EXT_INT", .pme_code = 0x00000200F8, .pme_short_desc = "external interrupt", .pme_long_desc = "external interrupt", }, [ POWER9_PME_PM_FLOP_CMPL ] = { .pme_name = "PM_FLOP_CMPL", .pme_code = 0x000004505E, .pme_short_desc = "Floating Point Operation Finished", .pme_long_desc = "Floating Point Operation Finished", }, [ POWER9_PME_PM_FLUSH_COMPLETION ] = { .pme_name = "PM_FLUSH_COMPLETION", .pme_code = 0x0000030012, .pme_short_desc = "The instruction that was next to complete did not complete because it suffered a flush", .pme_long_desc = "The instruction that was next to complete did not complete because it suffered a flush", }, [ POWER9_PME_PM_FLUSH_DISP_SB ] = { .pme_name = "PM_FLUSH_DISP_SB", .pme_code = 0x0000002088, .pme_short_desc = "Dispatch Flush: Scoreboard", .pme_long_desc = "Dispatch Flush: Scoreboard", }, [ POWER9_PME_PM_FLUSH_DISP_TLBIE ] = { .pme_name = "PM_FLUSH_DISP_TLBIE", .pme_code = 0x0000002888, .pme_short_desc = "Dispatch Flush: TLBIE", .pme_long_desc = "Dispatch Flush: TLBIE", }, [ POWER9_PME_PM_FLUSH_DISP ] = { .pme_name = "PM_FLUSH_DISP", .pme_code = 0x0000002880, .pme_short_desc = 
"Dispatch flush", .pme_long_desc = "Dispatch flush", }, [ POWER9_PME_PM_FLUSH_HB_RESTORE_CYC ] = { .pme_name = "PM_FLUSH_HB_RESTORE_CYC", .pme_code = 0x0000002084, .pme_short_desc = "Cycles in which no new instructions can be dispatched to the ICT after a flush.", .pme_long_desc = "Cycles in which no new instructions can be dispatched to the ICT after a flush. History buffer recovery", }, [ POWER9_PME_PM_FLUSH_LSU ] = { .pme_name = "PM_FLUSH_LSU", .pme_code = 0x00000058A4, .pme_short_desc = "LSU flushes.", .pme_long_desc = "LSU flushes. Includes all lsu flushes", }, [ POWER9_PME_PM_FLUSH_MPRED ] = { .pme_name = "PM_FLUSH_MPRED", .pme_code = 0x00000050A4, .pme_short_desc = "Branch mispredict flushes.", .pme_long_desc = "Branch mispredict flushes. Includes target and address misprecition", }, [ POWER9_PME_PM_FLUSH ] = { .pme_name = "PM_FLUSH", .pme_code = 0x00000400F8, .pme_short_desc = "Flush (any type)", .pme_long_desc = "Flush (any type)", }, [ POWER9_PME_PM_FMA_CMPL ] = { .pme_name = "PM_FMA_CMPL", .pme_code = 0x0000045054, .pme_short_desc = "two flops operation completed (fmadd, fnmadd, fmsub, fnmsub) Scalar instructions only.", .pme_long_desc = "two flops operation completed (fmadd, fnmadd, fmsub, fnmsub) Scalar instructions only. 
", }, [ POWER9_PME_PM_FORCED_NOP ] = { .pme_name = "PM_FORCED_NOP", .pme_code = 0x000000509C, .pme_short_desc = "Instruction was forced to execute as a nop because it was found to behave like a nop (have no effect) at decode time", .pme_long_desc = "Instruction was forced to execute as a nop because it was found to behave like a nop (have no effect) at decode time", }, [ POWER9_PME_PM_FREQ_DOWN ] = { .pme_name = "PM_FREQ_DOWN", .pme_code = 0x000003000C, .pme_short_desc = "Power Management: Below Threshold B", .pme_long_desc = "Power Management: Below Threshold B", }, [ POWER9_PME_PM_FREQ_UP ] = { .pme_name = "PM_FREQ_UP", .pme_code = 0x000004000C, .pme_short_desc = "Power Management: Above Threshold A", .pme_long_desc = "Power Management: Above Threshold A", }, [ POWER9_PME_PM_FXU_1PLUS_BUSY ] = { .pme_name = "PM_FXU_1PLUS_BUSY", .pme_code = 0x000003000E, .pme_short_desc = "At least one of the 4 FXU units is busy", .pme_long_desc = "At least one of the 4 FXU units is busy", }, [ POWER9_PME_PM_FXU_BUSY ] = { .pme_name = "PM_FXU_BUSY", .pme_code = 0x000002000E, .pme_short_desc = "Cycles in which all 4 FXUs are busy.", .pme_long_desc = "Cycles in which all 4 FXUs are busy. The FXU is running at capacity", }, [ POWER9_PME_PM_FXU_FIN ] = { .pme_name = "PM_FXU_FIN", .pme_code = 0x0000040004, .pme_short_desc = "The fixed point unit Unit finished an instruction.", .pme_long_desc = "The fixed point unit Unit finished an instruction. 
Instructions that finish may not necessarily complete.", }, [ POWER9_PME_PM_FXU_IDLE ] = { .pme_name = "PM_FXU_IDLE", .pme_code = 0x0000024052, .pme_short_desc = "Cycles in which FXU0, FXU1, FXU2, and FXU3 are all idle", .pme_long_desc = "Cycles in which FXU0, FXU1, FXU2, and FXU3 are all idle", }, [ POWER9_PME_PM_GRP_PUMP_CPRED ] = { .pme_name = "PM_GRP_PUMP_CPRED", .pme_code = 0x0000020050, .pme_short_desc = "Initial and Final Pump Scope and data sourced across this scope was group pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", .pme_long_desc = "Initial and Final Pump Scope and data sourced across this scope was group pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", }, [ POWER9_PME_PM_GRP_PUMP_MPRED_RTY ] = { .pme_name = "PM_GRP_PUMP_MPRED_RTY", .pme_code = 0x0000010052, .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", .pme_long_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", }, [ POWER9_PME_PM_GRP_PUMP_MPRED ] = { .pme_name = "PM_GRP_PUMP_MPRED", .pme_code = 0x0000020052, .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", .pme_long_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", }, [ POWER9_PME_PM_HV_CYC ] = { .pme_name = "PM_HV_CYC", .pme_code = 0x000002000A, .pme_short_desc = "Cycles in which msr_hv is high.", .pme_long_desc = "Cycles in which msr_hv is high. 
Note that this event does not take msr_pr into consideration", }, [ POWER9_PME_PM_HWSYNC ] = { .pme_name = "PM_HWSYNC", .pme_code = 0x00000050A0, .pme_short_desc = "Hwsync instruction decoded and transferred", .pme_long_desc = "Hwsync instruction decoded and transferred", }, [ POWER9_PME_PM_IBUF_FULL_CYC ] = { .pme_name = "PM_IBUF_FULL_CYC", .pme_code = 0x0000004884, .pme_short_desc = "Cycles No room in ibuff", .pme_long_desc = "Cycles No room in ibuff", }, [ POWER9_PME_PM_IC_DEMAND_CYC ] = { .pme_name = "PM_IC_DEMAND_CYC", .pme_code = 0x0000010018, .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", .pme_long_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load", }, [ POWER9_PME_PM_IC_DEMAND_L2_BHT_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BHT_REDIRECT", .pme_code = 0x0000004098, .pme_short_desc = "L2 I cache demand request due to BHT redirect, branch redirect (2 bubbles 3 cycles)", .pme_long_desc = "L2 I cache demand request due to BHT redirect, branch redirect (2 bubbles 3 cycles)", }, [ POWER9_PME_PM_IC_DEMAND_L2_BR_REDIRECT ] = { .pme_name = "PM_IC_DEMAND_L2_BR_REDIRECT", .pme_code = 0x0000004898, .pme_short_desc = "L2 I cache demand request due to branch Mispredict (15 cycle path)", .pme_long_desc = "L2 I cache demand request due to branch Mispredict (15 cycle path)", }, [ POWER9_PME_PM_IC_DEMAND_REQ ] = { .pme_name = "PM_IC_DEMAND_REQ", .pme_code = 0x0000004088, .pme_short_desc = "Demand Instruction fetch request", .pme_long_desc = "Demand Instruction fetch request", }, [ POWER9_PME_PM_IC_INVALIDATE ] = { .pme_name = "PM_IC_INVALIDATE", .pme_code = 0x0000005888, .pme_short_desc = "Ic line invalidated", .pme_long_desc = "Ic line invalidated", }, [ POWER9_PME_PM_IC_MISS_CMPL ] = { .pme_name = "PM_IC_MISS_CMPL", .pme_code = 0x0000045058, .pme_short_desc = "Non-speculative icache miss, counted at completion", .pme_long_desc = "Non-speculative icache miss, 
counted at completion", }, [ POWER9_PME_PM_IC_MISS_ICBI ] = { .pme_name = "PM_IC_MISS_ICBI", .pme_code = 0x0000005094, .pme_short_desc = "threaded version, IC Misses where we got EA dir hit but no sector valids were on.", .pme_long_desc = "threaded version, IC Misses where we got EA dir hit but no sector valids were on. ICBI took line out", }, [ POWER9_PME_PM_IC_PREF_CANCEL_HIT ] = { .pme_name = "PM_IC_PREF_CANCEL_HIT", .pme_code = 0x0000004890, .pme_short_desc = "Prefetch Canceled due to icache hit", .pme_long_desc = "Prefetch Canceled due to icache hit", }, [ POWER9_PME_PM_IC_PREF_CANCEL_L2 ] = { .pme_name = "PM_IC_PREF_CANCEL_L2", .pme_code = 0x0000004094, .pme_short_desc = "L2 Squashed a demand or prefetch request", .pme_long_desc = "L2 Squashed a demand or prefetch request", }, [ POWER9_PME_PM_IC_PREF_CANCEL_PAGE ] = { .pme_name = "PM_IC_PREF_CANCEL_PAGE", .pme_code = 0x0000004090, .pme_short_desc = "Prefetch Canceled due to page boundary", .pme_long_desc = "Prefetch Canceled due to page boundary", }, [ POWER9_PME_PM_IC_PREF_REQ ] = { .pme_name = "PM_IC_PREF_REQ", .pme_code = 0x0000004888, .pme_short_desc = "Instruction prefetch requests", .pme_long_desc = "Instruction prefetch requests", }, [ POWER9_PME_PM_IC_PREF_WRITE ] = { .pme_name = "PM_IC_PREF_WRITE", .pme_code = 0x000000488C, .pme_short_desc = "Instruction prefetch written into IL1", .pme_long_desc = "Instruction prefetch written into IL1", }, [ POWER9_PME_PM_IC_RELOAD_PRIVATE ] = { .pme_name = "PM_IC_RELOAD_PRIVATE", .pme_code = 0x0000004894, .pme_short_desc = "Reloading line was brought in private for a specific thread.", .pme_long_desc = "Reloading line was brought in private for a specific thread. Most lines are brought in shared for all eight threads. If RA does not match then invalidates and then brings it shared to other thread. 
In P7 line brought in private , then line was invalidat", }, [ POWER9_PME_PM_ICT_EMPTY_CYC ] = { .pme_name = "PM_ICT_EMPTY_CYC", .pme_code = 0x0000020008, .pme_short_desc = "Cycles in which the ICT is completely empty.", .pme_long_desc = "Cycles in which the ICT is completely empty. No itags are assigned to any thread", }, [ POWER9_PME_PM_ICT_NOSLOT_BR_MPRED_ICMISS ] = { .pme_name = "PM_ICT_NOSLOT_BR_MPRED_ICMISS", .pme_code = 0x0000034058, .pme_short_desc = "Ict empty for this thread due to Icache Miss and branch mispred", .pme_long_desc = "Ict empty for this thread due to Icache Miss and branch mispred", }, [ POWER9_PME_PM_ICT_NOSLOT_BR_MPRED ] = { .pme_name = "PM_ICT_NOSLOT_BR_MPRED", .pme_code = 0x000004D01E, .pme_short_desc = "Ict empty for this thread due to branch mispred", .pme_long_desc = "Ict empty for this thread due to branch mispred", }, [ POWER9_PME_PM_ICT_NOSLOT_CYC ] = { .pme_name = "PM_ICT_NOSLOT_CYC", .pme_code = 0x00000100F8, .pme_short_desc = "Number of cycles the ICT has no itags assigned to this thread", .pme_long_desc = "Number of cycles the ICT has no itags assigned to this thread", }, [ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_HB_FULL ] = { .pme_name = "PM_ICT_NOSLOT_DISP_HELD_HB_FULL", .pme_code = 0x0000030018, .pme_short_desc = "Ict empty for this thread due to dispatch holds because the History Buffer was full.", .pme_long_desc = "Ict empty for this thread due to dispatch holds because the History Buffer was full. 
Could be GPR/VSR/VMR/FPR/CR/XVF; CR; XVF (XER/VSCR/FPSCR)", }, [ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_ISSQ ] = { .pme_name = "PM_ICT_NOSLOT_DISP_HELD_ISSQ", .pme_code = 0x000002D01E, .pme_short_desc = "Ict empty for this thread due to dispatch hold on this thread due to Issue q full, BRQ full, XVCF Full, Count cache, Link, Tar full", .pme_long_desc = "Ict empty for this thread due to dispatch hold on this thread due to Issue q full, BRQ full, XVCF Full, Count cache, Link, Tar full", }, [ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_SYNC ] = { .pme_name = "PM_ICT_NOSLOT_DISP_HELD_SYNC", .pme_code = 0x000004D01C, .pme_short_desc = "Dispatch held due to a synchronizing instruction at dispatch", .pme_long_desc = "Dispatch held due to a synchronizing instruction at dispatch", }, [ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD_TBEGIN ] = { .pme_name = "PM_ICT_NOSLOT_DISP_HELD_TBEGIN", .pme_code = 0x0000010064, .pme_short_desc = "the NTC instruction is being held at dispatch because it is a tbegin instruction and there is an older tbegin in the pipeline that must complete before the younger tbegin can dispatch", .pme_long_desc = "the NTC instruction is being held at dispatch because it is a tbegin instruction and there is an older tbegin in the pipeline that must complete before the younger tbegin can dispatch", }, [ POWER9_PME_PM_ICT_NOSLOT_DISP_HELD ] = { .pme_name = "PM_ICT_NOSLOT_DISP_HELD", .pme_code = 0x000004E01A, .pme_short_desc = "Cycles in which the NTC instruction is held at dispatch for any reason", .pme_long_desc = "Cycles in which the NTC instruction is held at dispatch for any reason", }, [ POWER9_PME_PM_ICT_NOSLOT_IC_L3MISS ] = { .pme_name = "PM_ICT_NOSLOT_IC_L3MISS", .pme_code = 0x000004E010, .pme_short_desc = "Ict empty for this thread due to icache misses that were sourced from beyond the local L3.", .pme_long_desc = "Ict empty for this thread due to icache misses that were sourced from beyond the local L3. 
The source could be local/remote/distant memory or another core's cache", }, [ POWER9_PME_PM_ICT_NOSLOT_IC_L3 ] = { .pme_name = "PM_ICT_NOSLOT_IC_L3", .pme_code = 0x000003E052, .pme_short_desc = "Ict empty for this thread due to icache misses that were sourced from the local L3", .pme_long_desc = "Ict empty for this thread due to icache misses that were sourced from the local L3", }, [ POWER9_PME_PM_ICT_NOSLOT_IC_MISS ] = { .pme_name = "PM_ICT_NOSLOT_IC_MISS", .pme_code = 0x000002D01A, .pme_short_desc = "Ict empty for this thread due to Icache Miss", .pme_long_desc = "Ict empty for this thread due to Icache Miss", }, [ POWER9_PME_PM_IERAT_RELOAD_16M ] = { .pme_name = "PM_IERAT_RELOAD_16M", .pme_code = 0x000004006A, .pme_short_desc = "IERAT Reloaded (Miss) for a 16M page", .pme_long_desc = "IERAT Reloaded (Miss) for a 16M page", }, [ POWER9_PME_PM_IERAT_RELOAD_4K ] = { .pme_name = "PM_IERAT_RELOAD_4K", .pme_code = 0x0000020064, .pme_short_desc = "IERAT reloaded (after a miss) for 4K pages", .pme_long_desc = "IERAT reloaded (after a miss) for 4K pages", }, [ POWER9_PME_PM_IERAT_RELOAD_64K ] = { .pme_name = "PM_IERAT_RELOAD_64K", .pme_code = 0x000003006A, .pme_short_desc = "IERAT Reloaded (Miss) for a 64k page", .pme_long_desc = "IERAT Reloaded (Miss) for a 64k page", }, [ POWER9_PME_PM_IERAT_RELOAD ] = { .pme_name = "PM_IERAT_RELOAD", .pme_code = 0x00000100F6, .pme_short_desc = "Number of I-ERAT reloads", .pme_long_desc = "Number of I-ERAT reloads", }, [ POWER9_PME_PM_IFETCH_THROTTLE ] = { .pme_name = "PM_IFETCH_THROTTLE", .pme_code = 0x000003405E, .pme_short_desc = "Cycles in which Instruction fetch throttle was active.", .pme_long_desc = "Cycles in which Instruction fetch throttle was active.", }, [ POWER9_PME_PM_INST_CHIP_PUMP_CPRED ] = { .pme_name = "PM_INST_CHIP_PUMP_CPRED", .pme_code = 0x0000014050, .pme_short_desc = "Initial and Final Pump Scope was chip pump (prediction=correct) for an instruction fetch", .pme_long_desc = "Initial and Final Pump Scope was 
chip pump (prediction=correct) for an instruction fetch", }, /* See also alternate entries for 0000010002 / POWER9_PME_PM_INST_CMPL with code(s) 0000020002 0000030002 0000040002 at the bottom of this table. \n */ [ POWER9_PME_PM_INST_CMPL ] = { .pme_name = "PM_INST_CMPL", .pme_code = 0x0000010002, .pme_short_desc = "Number of PowerPC Instructions that completed.", .pme_long_desc = "Number of PowerPC Instructions that completed.", }, /* See also alternate entries for 00000200F2 / POWER9_PME_PM_INST_DISP with code(s) 00000300F2 at the bottom of this table. \n */ [ POWER9_PME_PM_INST_DISP ] = { .pme_name = "PM_INST_DISP", .pme_code = 0x00000200F2, .pme_short_desc = "# PPC Dispatched", .pme_long_desc = "# PPC Dispatched", }, [ POWER9_PME_PM_INST_FROM_DL2L3_MOD ] = { .pme_name = "PM_INST_FROM_DL2L3_MOD", .pme_code = 0x0000044048, .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_DL2L3_SHR ] = { .pme_name = "PM_INST_FROM_DL2L3_SHR", .pme_code = 0x0000034048, .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_DL4 ] = { .pme_name = "PM_INST_FROM_DL4", .pme_code = 0x000003404C, .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's L4 on a 
different Node or Group (Distant) due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_DMEM ] = { .pme_name = "PM_INST_FROM_DMEM", .pme_code = 0x000004404C, .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Distant) due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Distant) due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_L1 ] = { .pme_name = "PM_INST_FROM_L1", .pme_code = 0x0000004080, .pme_short_desc = "Instruction fetches from L1.", .pme_long_desc = "Instruction fetches from L1. L1 instruction hit", }, [ POWER9_PME_PM_INST_FROM_L21_MOD ] = { .pme_name = "PM_INST_FROM_L21_MOD", .pme_code = 0x0000044046, .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_L21_SHR ] = { .pme_name = "PM_INST_FROM_L21_SHR", .pme_code = 0x0000034046, .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L2 on the same chip due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_L2_DISP_CONFLICT_LDHITST ] = { .pme_name = "PM_INST_FROM_L2_DISP_CONFLICT_LDHITST", .pme_code = 0x0000034040, .pme_short_desc = "The processor's Instruction 
cache was reloaded from local core's L2 with load hit store conflict due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 with load hit store conflict due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_L2_DISP_CONFLICT_OTHER ] = { .pme_name = "PM_INST_FROM_L2_DISP_CONFLICT_OTHER", .pme_code = 0x0000044040, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 with dispatch conflict due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 with dispatch conflict due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_L2_MEPF ] = { .pme_name = "PM_INST_FROM_L2_MEPF", .pme_code = 0x0000024040, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state.", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state. 
due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_L2MISS ] = { .pme_name = "PM_INST_FROM_L2MISS", .pme_code = 0x000001404E, .pme_short_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L2 due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L2 due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_L2_NO_CONFLICT ] = { .pme_name = "PM_INST_FROM_L2_NO_CONFLICT", .pme_code = 0x0000014040, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 without conflict due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 without conflict due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_L2 ] = { .pme_name = "PM_INST_FROM_L2", .pme_code = 0x0000014042, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L2 due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L2 due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_L31_ECO_MOD ] = { .pme_name = "PM_INST_FROM_L31_ECO_MOD", .pme_code = 0x0000044044, .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_L31_ECO_SHR ] = { .pme_name = "PM_INST_FROM_L31_ECO_SHR", .pme_code = 0x0000034044, .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", 
.pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_L31_MOD ] = { .pme_name = "PM_INST_FROM_L31_MOD", .pme_code = 0x0000024044, .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_L31_SHR ] = { .pme_name = "PM_INST_FROM_L31_SHR", .pme_code = 0x0000014046, .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another core's L3 on the same chip due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_L3_DISP_CONFLICT ] = { .pme_name = "PM_INST_FROM_L3_DISP_CONFLICT", .pme_code = 0x0000034042, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 with dispatch conflict due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 with dispatch conflict due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_L3_MEPF ] = { .pme_name = "PM_INST_FROM_L3_MEPF", .pme_code = 0x0000024042, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state.", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state. 
due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_L3MISS_MOD ] = { .pme_name = "PM_INST_FROM_L3MISS_MOD", .pme_code = 0x000004404E, .pme_short_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L3 due to a instruction fetch", .pme_long_desc = "The processor's Instruction cache was reloaded from a location other than the local core's L3 due to a instruction fetch", }, [ POWER9_PME_PM_INST_FROM_L3MISS ] = { .pme_name = "PM_INST_FROM_L3MISS", .pme_code = 0x00000300FA, .pme_short_desc = "Marked instruction was reloaded from a location beyond the local chiplet", .pme_long_desc = "Marked instruction was reloaded from a location beyond the local chiplet", }, [ POWER9_PME_PM_INST_FROM_L3_NO_CONFLICT ] = { .pme_name = "PM_INST_FROM_L3_NO_CONFLICT", .pme_code = 0x0000014044, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 without conflict due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 without conflict due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_L3 ] = { .pme_name = "PM_INST_FROM_L3", .pme_code = 0x0000044042, .pme_short_desc = "The processor's Instruction cache was reloaded from local core's L3 due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from local core's L3 due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_LL4 ] = { .pme_name = "PM_INST_FROM_LL4", .pme_code = 0x000001404C, .pme_short_desc = "The processor's Instruction cache was reloaded from the local chip's L4 cache due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from the local chip's L4 cache due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_LMEM ] = { .pme_name = "PM_INST_FROM_LMEM", .pme_code = 0x0000024048, .pme_short_desc = 
"The processor's Instruction cache was reloaded from the local chip's Memory due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from the local chip's Memory due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_MEMORY ] = { .pme_name = "PM_INST_FROM_MEMORY", .pme_code = 0x000002404C, .pme_short_desc = "The processor's Instruction cache was reloaded from a memory location including L4 from local remote or distant due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from a memory location including L4 from local remote or distant due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_OFF_CHIP_CACHE ] = { .pme_name = "PM_INST_FROM_OFF_CHIP_CACHE", .pme_code = 0x000004404A, .pme_short_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_ON_CHIP_CACHE ] = { .pme_name = "PM_INST_FROM_ON_CHIP_CACHE", .pme_code = 0x0000014048, .pme_short_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_RL2L3_MOD ] = { .pme_name = "PM_INST_FROM_RL2L3_MOD", .pme_code = 0x0000024046, .pme_short_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), 
as this chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_RL2L3_SHR ] = { .pme_name = "PM_INST_FROM_RL2L3_SHR", .pme_code = 0x000001404A, .pme_short_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_RL4 ] = { .pme_name = "PM_INST_FROM_RL4", .pme_code = 0x000002404A, .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_FROM_RMEM ] = { .pme_name = "PM_INST_FROM_RMEM", .pme_code = 0x000003404A, .pme_short_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Remote) due to an instruction fetch (not prefetch)", .pme_long_desc = "The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Remote) due to an instruction fetch (not prefetch)", }, [ POWER9_PME_PM_INST_GRP_PUMP_CPRED ] = { .pme_name = "PM_INST_GRP_PUMP_CPRED", .pme_code = 0x000002C05C, .pme_short_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for an instruction fetch (demand only)", .pme_long_desc = "Initial and Final Pump Scope was group pump (prediction=correct) for an 
instruction fetch (demand only)", }, [ POWER9_PME_PM_INST_GRP_PUMP_MPRED_RTY ] = { .pme_name = "PM_INST_GRP_PUMP_MPRED_RTY", .pme_code = 0x0000014052, .pme_short_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for an instruction fetch", .pme_long_desc = "Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for an instruction fetch", }, [ POWER9_PME_PM_INST_GRP_PUMP_MPRED ] = { .pme_name = "PM_INST_GRP_PUMP_MPRED", .pme_code = 0x000002C05E, .pme_short_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for an instruction fetch (demand only)", .pme_long_desc = "Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for an instruction fetch (demand only)", }, [ POWER9_PME_PM_INST_IMC_MATCH_CMPL ] = { .pme_name = "PM_INST_IMC_MATCH_CMPL", .pme_code = 0x000004001C, .pme_short_desc = "IMC Match Count", .pme_long_desc = "IMC Match Count", }, [ POWER9_PME_PM_INST_PUMP_CPRED ] = { .pme_name = "PM_INST_PUMP_CPRED", .pme_code = 0x0000014054, .pme_short_desc = "Pump prediction correct.", .pme_long_desc = "Pump prediction correct. Counts across all types of pumps for an instruction fetch", }, [ POWER9_PME_PM_INST_PUMP_MPRED ] = { .pme_name = "PM_INST_PUMP_MPRED", .pme_code = 0x0000044052, .pme_short_desc = "Pump misprediction.", .pme_long_desc = "Pump misprediction. 
Counts across all types of pumps for an instruction fetch", }, [ POWER9_PME_PM_INST_SYS_PUMP_CPRED ] = { .pme_name = "PM_INST_SYS_PUMP_CPRED", .pme_code = 0x0000034050, .pme_short_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for an instruction fetch", .pme_long_desc = "Initial and Final Pump Scope was system pump (prediction=correct) for an instruction fetch", }, [ POWER9_PME_PM_INST_SYS_PUMP_MPRED_RTY ] = { .pme_name = "PM_INST_SYS_PUMP_MPRED_RTY", .pme_code = 0x0000044050, .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for an instruction fetch", .pme_long_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for an instruction fetch", }, [ POWER9_PME_PM_INST_SYS_PUMP_MPRED ] = { .pme_name = "PM_INST_SYS_PUMP_MPRED", .pme_code = 0x0000034052, .pme_short_desc = "Final Pump Scope (system) mispredicted.", .pme_long_desc = "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. 
Counts for an instruction fetch", }, [ POWER9_PME_PM_IOPS_CMPL ] = { .pme_name = "PM_IOPS_CMPL", .pme_code = 0x0000024050, .pme_short_desc = "Internal Operations completed", .pme_long_desc = "Internal Operations completed", }, [ POWER9_PME_PM_IPTEG_FROM_DL2L3_MOD ] = { .pme_name = "PM_IPTEG_FROM_DL2L3_MOD", .pme_code = 0x0000045048, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_DL2L3_SHR ] = { .pme_name = "PM_IPTEG_FROM_DL2L3_SHR", .pme_code = 0x0000035048, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_DL4 ] = { .pme_name = "PM_IPTEG_FROM_DL4", .pme_code = 0x000003504C, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_DMEM ] = { .pme_name = "PM_IPTEG_FROM_DMEM", .pme_code = 0x000004504C, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from 
another chip's memory on the same Node or Group (Distant) due to a instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_L21_MOD ] = { .pme_name = "PM_IPTEG_FROM_L21_MOD", .pme_code = 0x0000045046, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_L21_SHR ] = { .pme_name = "PM_IPTEG_FROM_L21_SHR", .pme_code = 0x0000035046, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_L2_MEPF ] = { .pme_name = "PM_IPTEG_FROM_L2_MEPF", .pme_code = 0x0000025040, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state. 
due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_L2MISS ] = { .pme_name = "PM_IPTEG_FROM_L2MISS", .pme_code = 0x000001504E, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_L2_NO_CONFLICT ] = { .pme_name = "PM_IPTEG_FROM_L2_NO_CONFLICT", .pme_code = 0x0000015040, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_L2 ] = { .pme_name = "PM_IPTEG_FROM_L2", .pme_code = 0x0000015042, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_L31_ECO_MOD ] = { .pme_name = "PM_IPTEG_FROM_L31_ECO_MOD", .pme_code = 0x0000045044, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_L31_ECO_SHR ] = { .pme_name = "PM_IPTEG_FROM_L31_ECO_SHR", .pme_code = 0x0000035044, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same 
chip due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_L31_MOD ] = { .pme_name = "PM_IPTEG_FROM_L31_MOD", .pme_code = 0x0000025044, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_L31_SHR ] = { .pme_name = "PM_IPTEG_FROM_L31_SHR", .pme_code = 0x0000015046, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_L3_DISP_CONFLICT ] = { .pme_name = "PM_IPTEG_FROM_L3_DISP_CONFLICT", .pme_code = 0x0000035042, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_L3_MEPF ] = { .pme_name = "PM_IPTEG_FROM_L3_MEPF", .pme_code = 0x0000025042, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state. 
due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_L3MISS ] = { .pme_name = "PM_IPTEG_FROM_L3MISS", .pme_code = 0x000004504E, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_L3_NO_CONFLICT ] = { .pme_name = "PM_IPTEG_FROM_L3_NO_CONFLICT", .pme_code = 0x0000015044, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_L3 ] = { .pme_name = "PM_IPTEG_FROM_L3", .pme_code = 0x0000045042, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_LL4 ] = { .pme_name = "PM_IPTEG_FROM_LL4", .pme_code = 0x000001504C, .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_LMEM ] = { .pme_name = "PM_IPTEG_FROM_LMEM", .pme_code = 0x0000025048, .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_MEMORY ] = { .pme_name = "PM_IPTEG_FROM_MEMORY", .pme_code = 0x000002504C, .pme_short_desc = "A Page 
Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_OFF_CHIP_CACHE ] = { .pme_name = "PM_IPTEG_FROM_OFF_CHIP_CACHE", .pme_code = 0x000004504A, .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_ON_CHIP_CACHE ] = { .pme_name = "PM_IPTEG_FROM_ON_CHIP_CACHE", .pme_code = 0x0000015048, .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_RL2L3_MOD ] = { .pme_name = "PM_IPTEG_FROM_RL2L3_MOD", .pme_code = 0x0000025046, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_RL2L3_SHR ] = { .pme_name = "PM_IPTEG_FROM_RL2L3_SHR", .pme_code = 0x000001504A, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or 
L3 on the same Node or Group (Remote), as this chip due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_RL4 ] = { .pme_name = "PM_IPTEG_FROM_RL4", .pme_code = 0x000002504A, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to an instruction side request", }, [ POWER9_PME_PM_IPTEG_FROM_RMEM ] = { .pme_name = "PM_IPTEG_FROM_RMEM", .pme_code = 0x000003504A, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to an instruction side request", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to an instruction side request", }, [ POWER9_PME_PM_ISIDE_DISP_FAIL_ADDR ] = { .pme_name = "PM_ISIDE_DISP_FAIL_ADDR", .pme_code = 0x000002608A, .pme_short_desc = "All I-side dispatch attempts for this thread that failed due to an address collision with another machine (excludes i_l2mru_tch_reqs)", .pme_long_desc = "All I-side dispatch attempts for this thread that failed due to an address collision with another machine (excludes i_l2mru_tch_reqs)", }, [ POWER9_PME_PM_ISIDE_DISP_FAIL_OTHER ] = { .pme_name = "PM_ISIDE_DISP_FAIL_OTHER", .pme_code = 0x000002688A, .pme_short_desc = "All I-side dispatch attempts for this thread that failed due to a reason other than address collision (excludes i_l2mru_tch_reqs)", .pme_long_desc = "All I-side dispatch attempts for this thread that failed due to a reason other than address collision (excludes i_l2mru_tch_reqs)", }, [ POWER9_PME_PM_ISIDE_DISP ] = { .pme_name = 
"PM_ISIDE_DISP", .pme_code = 0x000001688A, .pme_short_desc = "All I-side dispatch attempts for this thread (excludes i_l2mru_tch_reqs)", .pme_long_desc = "All I-side dispatch attempts for this thread (excludes i_l2mru_tch_reqs)", }, [ POWER9_PME_PM_ISIDE_L2MEMACC ] = { .pme_name = "PM_ISIDE_L2MEMACC", .pme_code = 0x0000026890, .pme_short_desc = "Valid when first beat of data comes in for an I-side fetch where data came from memory", .pme_long_desc = "Valid when first beat of data comes in for an I-side fetch where data came from memory", }, [ POWER9_PME_PM_ISIDE_MRU_TOUCH ] = { .pme_name = "PM_ISIDE_MRU_TOUCH", .pme_code = 0x0000046880, .pme_short_desc = "I-side L2 MRU touch sent to L2 for this thread", .pme_long_desc = "I-side L2 MRU touch sent to L2 for this thread", }, [ POWER9_PME_PM_ISLB_MISS ] = { .pme_name = "PM_ISLB_MISS", .pme_code = 0x000000D8A8, .pme_short_desc = "Instruction SLB Miss - Total of all segment sizes", .pme_long_desc = "Instruction SLB Miss - Total of all segment sizes", }, [ POWER9_PME_PM_ISLB_MISS_ALT ] = { .pme_name = "PM_ISLB_MISS_ALT", .pme_code = 0x0000040006, .pme_short_desc = "Number of ISLB misses for this thread", .pme_long_desc = "Number of ISLB misses for this thread", }, [ POWER9_PME_PM_ISQ_0_8_ENTRIES ] = { .pme_name = "PM_ISQ_0_8_ENTRIES", .pme_code = 0x000003005A, .pme_short_desc = "Cycles in which 8 or less Issue Queue entries are in use.", .pme_long_desc = "Cycles in which 8 or less Issue Queue entries are in use. This is a shared event, not per thread", }, [ POWER9_PME_PM_ISQ_36_44_ENTRIES ] = { .pme_name = "PM_ISQ_36_44_ENTRIES", .pme_code = 0x000004000A, .pme_short_desc = "Cycles in which 36 or more Issue Queue entries are in use.", .pme_long_desc = "Cycles in which 36 or more Issue Queue entries are in use. This is a shared event, not per thread. 
There are 44 issue queue entries across 4 slices in the whole core", }, [ POWER9_PME_PM_ISU0_ISS_HOLD_ALL ] = { .pme_name = "PM_ISU0_ISS_HOLD_ALL", .pme_code = 0x0000003080, .pme_short_desc = "All ISU rejects", .pme_long_desc = "All ISU rejects", }, [ POWER9_PME_PM_ISU1_ISS_HOLD_ALL ] = { .pme_name = "PM_ISU1_ISS_HOLD_ALL", .pme_code = 0x0000003084, .pme_short_desc = "All ISU rejects", .pme_long_desc = "All ISU rejects", }, [ POWER9_PME_PM_ISU2_ISS_HOLD_ALL ] = { .pme_name = "PM_ISU2_ISS_HOLD_ALL", .pme_code = 0x0000003880, .pme_short_desc = "All ISU rejects", .pme_long_desc = "All ISU rejects", }, [ POWER9_PME_PM_ISU3_ISS_HOLD_ALL ] = { .pme_name = "PM_ISU3_ISS_HOLD_ALL", .pme_code = 0x0000003884, .pme_short_desc = "All ISU rejects", .pme_long_desc = "All ISU rejects", }, [ POWER9_PME_PM_ISYNC ] = { .pme_name = "PM_ISYNC", .pme_code = 0x0000002884, .pme_short_desc = "Isync completion count per thread", .pme_long_desc = "Isync completion count per thread", }, [ POWER9_PME_PM_ITLB_MISS ] = { .pme_name = "PM_ITLB_MISS", .pme_code = 0x00000400FC, .pme_short_desc = "ITLB Reloaded.", .pme_long_desc = "ITLB Reloaded. Counts 1 per ITLB miss for HPT but multiple for radix depending on number of levels traversed", }, [ POWER9_PME_PM_L1_DCACHE_RELOADED_ALL ] = { .pme_name = "PM_L1_DCACHE_RELOADED_ALL", .pme_code = 0x000001002C, .pme_short_desc = "L1 data cache reloaded for demand.", .pme_long_desc = "L1 data cache reloaded for demand. 
If MMCR1[16] is 1, prefetches will be included as well", }, [ POWER9_PME_PM_L1_DCACHE_RELOAD_VALID ] = { .pme_name = "PM_L1_DCACHE_RELOAD_VALID", .pme_code = 0x00000300F6, .pme_short_desc = "DL1 reloaded due to Demand Load", .pme_long_desc = "DL1 reloaded due to Demand Load", }, [ POWER9_PME_PM_L1_DEMAND_WRITE ] = { .pme_name = "PM_L1_DEMAND_WRITE", .pme_code = 0x000000408C, .pme_short_desc = "Instruction Demand sectors written into IL1", .pme_long_desc = "Instruction Demand sectors written into IL1", }, [ POWER9_PME_PM_L1_ICACHE_MISS ] = { .pme_name = "PM_L1_ICACHE_MISS", .pme_code = 0x00000200FD, .pme_short_desc = "Demand iCache Miss", .pme_long_desc = "Demand iCache Miss", }, [ POWER9_PME_PM_L1_ICACHE_RELOADED_ALL ] = { .pme_name = "PM_L1_ICACHE_RELOADED_ALL", .pme_code = 0x0000040012, .pme_short_desc = "Counts all Icache reloads includes demand, prefetch, prefetch turned into demand and demand turned into prefetch", .pme_long_desc = "Counts all Icache reloads includes demand, prefetch, prefetch turned into demand and demand turned into prefetch", }, [ POWER9_PME_PM_L1_ICACHE_RELOADED_PREF ] = { .pme_name = "PM_L1_ICACHE_RELOADED_PREF", .pme_code = 0x0000030068, .pme_short_desc = "Counts all Icache prefetch reloads (includes demand turned into prefetch)", .pme_long_desc = "Counts all Icache prefetch reloads (includes demand turned into prefetch)", }, [ POWER9_PME_PM_L1PF_L2MEMACC ] = { .pme_name = "PM_L1PF_L2MEMACC", .pme_code = 0x0000016890, .pme_short_desc = "Valid when first beat of data comes in for an L1PF where data came from memory", .pme_long_desc = "Valid when first beat of data comes in for an L1PF where data came from memory", }, [ POWER9_PME_PM_L1_PREF ] = { .pme_name = "PM_L1_PREF", .pme_code = 0x0000020054, .pme_short_desc = "A data line was written to the L1 due to a hardware or software prefetch", .pme_long_desc = "A data line was written to the L1 due to a hardware or software prefetch", }, [ POWER9_PME_PM_L1_SW_PREF ] = { .pme_name = 
"PM_L1_SW_PREF", .pme_code = 0x000000E880, .pme_short_desc = "Software L1 Prefetches, including SW Transient Prefetches", .pme_long_desc = "Software L1 Prefetches, including SW Transient Prefetches", }, [ POWER9_PME_PM_L2_CASTOUT_MOD ] = { .pme_name = "PM_L2_CASTOUT_MOD", .pme_code = 0x0000016082, .pme_short_desc = "L2 Castouts - Modified (M,Mu,Me)", .pme_long_desc = "L2 Castouts - Modified (M,Mu,Me)", }, [ POWER9_PME_PM_L2_CASTOUT_SHR ] = { .pme_name = "PM_L2_CASTOUT_SHR", .pme_code = 0x0000016882, .pme_short_desc = "L2 Castouts - Shared (Tx,Sx)", .pme_long_desc = "L2 Castouts - Shared (Tx,Sx)", }, [ POWER9_PME_PM_L2_CHIP_PUMP ] = { .pme_name = "PM_L2_CHIP_PUMP", .pme_code = 0x0000046088, .pme_short_desc = "RC requests that were local (aka chip) pump attempts", .pme_long_desc = "RC requests that were local (aka chip) pump attempts", }, [ POWER9_PME_PM_L2_DC_INV ] = { .pme_name = "PM_L2_DC_INV", .pme_code = 0x0000026882, .pme_short_desc = "D-cache invalidates sent over the reload bus to the core", .pme_long_desc = "D-cache invalidates sent over the reload bus to the core", }, [ POWER9_PME_PM_L2_DISP_ALL_L2MISS ] = { .pme_name = "PM_L2_DISP_ALL_L2MISS", .pme_code = 0x0000046080, .pme_short_desc = "All successful Ld/St dispatches for this thread that were an L2 miss (excludes i_l2mru_tch_reqs)", .pme_long_desc = "All successful Ld/St dispatches for this thread that were an L2 miss (excludes i_l2mru_tch_reqs)", }, [ POWER9_PME_PM_L2_GROUP_PUMP ] = { .pme_name = "PM_L2_GROUP_PUMP", .pme_code = 0x0000046888, .pme_short_desc = "RC requests that were on group (aka nodel) pump attempts", .pme_long_desc = "RC requests that were on group (aka nodel) pump attempts", }, [ POWER9_PME_PM_L2_GRP_GUESS_CORRECT ] = { .pme_name = "PM_L2_GRP_GUESS_CORRECT", .pme_code = 0x0000026088, .pme_short_desc = "L2 guess grp (GS or NNS) and guess was correct (data intra-group AND ^on-chip)", .pme_long_desc = "L2 guess grp (GS or NNS) and guess was correct (data intra-group AND ^on-chip)", }, [ 
POWER9_PME_PM_L2_GRP_GUESS_WRONG ] = { .pme_name = "PM_L2_GRP_GUESS_WRONG", .pme_code = 0x0000026888, .pme_short_desc = "L2 guess grp (GS or NNS) and guess was not correct (ie data on-chip OR beyond-group)", .pme_long_desc = "L2 guess grp (GS or NNS) and guess was not correct (ie data on-chip OR beyond-group)", }, [ POWER9_PME_PM_L2_IC_INV ] = { .pme_name = "PM_L2_IC_INV", .pme_code = 0x0000026082, .pme_short_desc = "I-cache Invalidates sent over the reload bus to the core", .pme_long_desc = "I-cache Invalidates sent over the reload bus to the core", }, [ POWER9_PME_PM_L2_INST_MISS ] = { .pme_name = "PM_L2_INST_MISS", .pme_code = 0x0000036880, .pme_short_desc = "All successful I-side dispatches that were an L2 miss for this thread (excludes i_l2mru_tch reqs)", .pme_long_desc = "All successful I-side dispatches that were an L2 miss for this thread (excludes i_l2mru_tch reqs)", }, [ POWER9_PME_PM_L2_INST_MISS_ALT ] = { .pme_name = "PM_L2_INST_MISS_ALT", .pme_code = 0x000004609E, .pme_short_desc = "All successful I-side dispatches that were an L2 miss for this thread (excludes i_l2mru_tch reqs)", .pme_long_desc = "All successful I-side dispatches that were an L2 miss for this thread (excludes i_l2mru_tch reqs)", }, [ POWER9_PME_PM_L2_INST ] = { .pme_name = "PM_L2_INST", .pme_code = 0x0000036080, .pme_short_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)", .pme_long_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)", }, [ POWER9_PME_PM_L2_INST_ALT ] = { .pme_name = "PM_L2_INST_ALT", .pme_code = 0x000003609E, .pme_short_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)", .pme_long_desc = "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)", }, [ POWER9_PME_PM_L2_LD_DISP ] = { .pme_name = "PM_L2_LD_DISP", .pme_code = 0x000001609E, .pme_short_desc = "All successful D-side load dispatches for this thread (L2 miss + L2 hits)", .pme_long_desc = 
"All successful D-side load dispatches for this thread (L2 miss + L2 hits)", }, [ POWER9_PME_PM_L2_LD_DISP_ALT ] = { .pme_name = "PM_L2_LD_DISP_ALT", .pme_code = 0x0000036082, .pme_short_desc = "All successful I-or-D side load dispatches for this thread (excludes i_l2mru_tch_reqs)", .pme_long_desc = "All successful I-or-D side load dispatches for this thread (excludes i_l2mru_tch_reqs)", }, [ POWER9_PME_PM_L2_LD_HIT ] = { .pme_name = "PM_L2_LD_HIT", .pme_code = 0x000002609E, .pme_short_desc = "All successful D-side load dispatches that were L2 hits for this thread", .pme_long_desc = "All successful D-side load dispatches that were L2 hits for this thread", }, [ POWER9_PME_PM_L2_LD_HIT_ALT ] = { .pme_name = "PM_L2_LD_HIT_ALT", .pme_code = 0x0000036882, .pme_short_desc = "All successful I-or-D side load dispatches for this thread that were L2 hits (excludes i_l2mru_tch_reqs)", .pme_long_desc = "All successful I-or-D side load dispatches for this thread that were L2 hits (excludes i_l2mru_tch_reqs)", }, [ POWER9_PME_PM_L2_LD_MISS_128B ] = { .pme_name = "PM_L2_LD_MISS_128B", .pme_code = 0x0000016092, .pme_short_desc = "All successful D-side load dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 128B (i.", .pme_long_desc = "All successful D-side load dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 128B (i.e., M=0)", }, [ POWER9_PME_PM_L2_LD_MISS_64B ] = { .pme_name = "PM_L2_LD_MISS_64B", .pme_code = 0x0000026092, .pme_short_desc = "All successful D-side load dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 64B(i.", .pme_long_desc = "All successful D-side load dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 64B(i.e., M=1)", }, [ POWER9_PME_PM_L2_LD_MISS ] = { .pme_name = "PM_L2_LD_MISS", .pme_code = 0x0000026080, 
.pme_short_desc = "All successful D-Side Load dispatches that were an L2 miss for this thread", .pme_long_desc = "All successful D-Side Load dispatches that were an L2 miss for this thread", }, [ POWER9_PME_PM_L2_LD ] = { .pme_name = "PM_L2_LD", .pme_code = 0x0000016080, .pme_short_desc = "All successful D-side Load dispatches for this thread (L2 miss + L2 hits)", .pme_long_desc = "All successful D-side Load dispatches for this thread (L2 miss + L2 hits)", }, [ POWER9_PME_PM_L2_LOC_GUESS_CORRECT ] = { .pme_name = "PM_L2_LOC_GUESS_CORRECT", .pme_code = 0x0000016088, .pme_short_desc = "L2 guess local (LNS) and guess was correct (ie data local)", .pme_long_desc = "L2 guess local (LNS) and guess was correct (ie data local)", }, [ POWER9_PME_PM_L2_LOC_GUESS_WRONG ] = { .pme_name = "PM_L2_LOC_GUESS_WRONG", .pme_code = 0x0000016888, .pme_short_desc = "L2 guess local (LNS) and guess was not correct (ie data not on chip)", .pme_long_desc = "L2 guess local (LNS) and guess was not correct (ie data not on chip)", }, [ POWER9_PME_PM_L2_RCLD_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2_RCLD_DISP_FAIL_ADDR", .pme_code = 0x0000016884, .pme_short_desc = "All I-od-D side load dispatch attempts for this thread that failed due to address collision with RC/CO/SN/SQ machine (excludes i_l2mru_tch_reqs)", .pme_long_desc = "All I-od-D side load dispatch attempts for this thread that failed due to address collision with RC/CO/SN/SQ machine (excludes i_l2mru_tch_reqs)", }, [ POWER9_PME_PM_L2_RCLD_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2_RCLD_DISP_FAIL_OTHER", .pme_code = 0x0000026084, .pme_short_desc = "All I-or-D side load dispatch attempts for this thread that failed due to reason other than address collision (excludes i_l2mru_tch_reqs)", .pme_long_desc = "All I-or-D side load dispatch attempts for this thread that failed due to reason other than address collision (excludes i_l2mru_tch_reqs)", }, [ POWER9_PME_PM_L2_RCLD_DISP ] = { .pme_name = "PM_L2_RCLD_DISP", .pme_code = 0x0000016084, 
.pme_short_desc = "All I-or-D side load dispatch attempts for this thread (excludes i_l2mru_tch_reqs)", .pme_long_desc = "All I-or-D side load dispatch attempts for this thread (excludes i_l2mru_tch_reqs)", }, [ POWER9_PME_PM_L2_RCST_DISP_FAIL_ADDR ] = { .pme_name = "PM_L2_RCST_DISP_FAIL_ADDR", .pme_code = 0x0000036884, .pme_short_desc = "All D-side store dispatch attempts for this thread that failed due to address collision with RC/CO/SN/SQ", .pme_long_desc = "All D-side store dispatch attempts for this thread that failed due to address collision with RC/CO/SN/SQ", }, [ POWER9_PME_PM_L2_RCST_DISP_FAIL_OTHER ] = { .pme_name = "PM_L2_RCST_DISP_FAIL_OTHER", .pme_code = 0x0000046084, .pme_short_desc = "All D-side store dispatch attempts for this thread that failed due to reason other than address collision", .pme_long_desc = "All D-side store dispatch attempts for this thread that failed due to reason other than address collision", }, [ POWER9_PME_PM_L2_RCST_DISP ] = { .pme_name = "PM_L2_RCST_DISP", .pme_code = 0x0000036084, .pme_short_desc = "All D-side store dispatch attempts for this thread", .pme_long_desc = "All D-side store dispatch attempts for this thread", }, [ POWER9_PME_PM_L2_RC_ST_DONE ] = { .pme_name = "PM_L2_RC_ST_DONE", .pme_code = 0x0000036086, .pme_short_desc = "RC did store to line that was Tx or Sx", .pme_long_desc = "RC did store to line that was Tx or Sx", }, [ POWER9_PME_PM_L2_RTY_LD ] = { .pme_name = "PM_L2_RTY_LD", .pme_code = 0x000003688A, .pme_short_desc = "RC retries on PB for any load from core (excludes DCBFs)", .pme_long_desc = "RC retries on PB for any load from core (excludes DCBFs)", }, [ POWER9_PME_PM_L2_RTY_LD_ALT ] = { .pme_name = "PM_L2_RTY_LD_ALT", .pme_code = 0x000003689E, .pme_short_desc = "RC retries on PB for any load from core (excludes DCBFs)", .pme_long_desc = "RC retries on PB for any load from core (excludes DCBFs)", }, [ POWER9_PME_PM_L2_RTY_ST ] = { .pme_name = "PM_L2_RTY_ST", .pme_code = 0x000003608A, .pme_short_desc = 
"RC retries on PB for any store from core (excludes DCBFs)", .pme_long_desc = "RC retries on PB for any store from core (excludes DCBFs)", }, [ POWER9_PME_PM_L2_RTY_ST_ALT ] = { .pme_name = "PM_L2_RTY_ST_ALT", .pme_code = 0x000004689E, .pme_short_desc = "RC retries on PB for any store from core (excludes DCBFs)", .pme_long_desc = "RC retries on PB for any store from core (excludes DCBFs)", }, [ POWER9_PME_PM_L2_SN_M_RD_DONE ] = { .pme_name = "PM_L2_SN_M_RD_DONE", .pme_code = 0x0000046086, .pme_short_desc = "SNP dispatched for a read and was M (true M)", .pme_long_desc = "SNP dispatched for a read and was M (true M)", }, [ POWER9_PME_PM_L2_SN_M_WR_DONE ] = { .pme_name = "PM_L2_SN_M_WR_DONE", .pme_code = 0x0000016086, .pme_short_desc = "SNP dispatched for a write and was M (true M); for DMA cacheinj this will pulse if rty/push is required (won't pulse if cacheinj is accepted)", .pme_long_desc = "SNP dispatched for a write and was M (true M); for DMA cacheinj this will pulse if rty/push is required (won't pulse if cacheinj is accepted)", }, [ POWER9_PME_PM_L2_SN_M_WR_DONE_ALT ] = { .pme_name = "PM_L2_SN_M_WR_DONE_ALT", .pme_code = 0x0000046886, .pme_short_desc = "SNP dispatched for a write and was M (true M); for DMA cacheinj this will pulse if rty/push is required (won't pulse if cacheinj is accepted)", .pme_long_desc = "SNP dispatched for a write and was M (true M); for DMA cacheinj this will pulse if rty/push is required (won't pulse if cacheinj is accepted)", }, [ POWER9_PME_PM_L2_SN_SX_I_DONE ] = { .pme_name = "PM_L2_SN_SX_I_DONE", .pme_code = 0x0000036886, .pme_short_desc = "SNP dispatched and went from Sx to Ix", .pme_long_desc = "SNP dispatched and went from Sx to Ix", }, [ POWER9_PME_PM_L2_ST_DISP ] = { .pme_name = "PM_L2_ST_DISP", .pme_code = 0x0000046082, .pme_short_desc = "All successful D-side store dispatches for this thread", .pme_long_desc = "All successful D-side store dispatches for this thread", }, [ POWER9_PME_PM_L2_ST_DISP_ALT ] = { .pme_name = 
"PM_L2_ST_DISP_ALT", .pme_code = 0x000001689E, .pme_short_desc = "All successful D-side store dispatches for this thread (L2 miss + L2 hits)", .pme_long_desc = "All successful D-side store dispatches for this thread (L2 miss + L2 hits)", }, [ POWER9_PME_PM_L2_ST_HIT ] = { .pme_name = "PM_L2_ST_HIT", .pme_code = 0x0000046882, .pme_short_desc = "All successful D-side store dispatches for this thread that were L2 hits", .pme_long_desc = "All successful D-side store dispatches for this thread that were L2 hits", }, [ POWER9_PME_PM_L2_ST_HIT_ALT ] = { .pme_name = "PM_L2_ST_HIT_ALT", .pme_code = 0x000002689E, .pme_short_desc = "All successful D-side store dispatches that were L2 hits for this thread", .pme_long_desc = "All successful D-side store dispatches that were L2 hits for this thread", }, [ POWER9_PME_PM_L2_ST_MISS_128B ] = { .pme_name = "PM_L2_ST_MISS_128B", .pme_code = 0x0000016892, .pme_short_desc = "All successful D-side store dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 128B (i.", .pme_long_desc = "All successful D-side store dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 128B (i.e., M=0)", }, [ POWER9_PME_PM_L2_ST_MISS_64B ] = { .pme_name = "PM_L2_ST_MISS_64B", .pme_code = 0x0000026892, .pme_short_desc = "All successful D-side store dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 64B (i.", .pme_long_desc = "All successful D-side store dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 64B (i.e., M=1)", }, [ POWER9_PME_PM_L2_ST_MISS ] = { .pme_name = "PM_L2_ST_MISS", .pme_code = 0x0000026880, .pme_short_desc = "All successful D-Side Store dispatches that were an L2 miss for this thread", .pme_long_desc = "All successful D-Side Store dispatches that were an L2 miss for this thread", }, [ POWER9_PME_PM_L2_ST ] = 
{ .pme_name = "PM_L2_ST", .pme_code = 0x0000016880, .pme_short_desc = "All successful D-side store dispatches for this thread (L2 miss + L2 hits)", .pme_long_desc = "All successful D-side store dispatches for this thread (L2 miss + L2 hits)", }, [ POWER9_PME_PM_L2_SYS_GUESS_CORRECT ] = { .pme_name = "PM_L2_SYS_GUESS_CORRECT", .pme_code = 0x0000036088, .pme_short_desc = "L2 guess system (VGS or RNS) and guess was correct (ie data beyond-group)", .pme_long_desc = "L2 guess system (VGS or RNS) and guess was correct (ie data beyond-group)", }, [ POWER9_PME_PM_L2_SYS_GUESS_WRONG ] = { .pme_name = "PM_L2_SYS_GUESS_WRONG", .pme_code = 0x0000036888, .pme_short_desc = "L2 guess system (VGS or RNS) and guess was not correct (ie data ^beyond-group)", .pme_long_desc = "L2 guess system (VGS or RNS) and guess was not correct (ie data ^beyond-group)", }, [ POWER9_PME_PM_L2_SYS_PUMP ] = { .pme_name = "PM_L2_SYS_PUMP", .pme_code = 0x000004688A, .pme_short_desc = "RC requests that were system pump attempts", .pme_long_desc = "RC requests that were system pump attempts", }, [ POWER9_PME_PM_L3_CI_HIT ] = { .pme_name = "PM_L3_CI_HIT", .pme_code = 0x00000260A2, .pme_short_desc = "L3 Castins Hit (total count)", .pme_long_desc = "L3 Castins Hit (total count)", }, [ POWER9_PME_PM_L3_CI_MISS ] = { .pme_name = "PM_L3_CI_MISS", .pme_code = 0x00000268A2, .pme_short_desc = "L3 castins miss (total count)", .pme_long_desc = "L3 castins miss (total count)", }, [ POWER9_PME_PM_L3_CINJ ] = { .pme_name = "PM_L3_CINJ", .pme_code = 0x00000368A4, .pme_short_desc = "L3 castin of cache inject", .pme_long_desc = "L3 castin of cache inject", }, [ POWER9_PME_PM_L3_CI_USAGE ] = { .pme_name = "PM_L3_CI_USAGE", .pme_code = 0x00000168AC, .pme_short_desc = "Rotating sample of 16 CI or CO actives", .pme_long_desc = "Rotating sample of 16 CI or CO actives", }, [ POWER9_PME_PM_L3_CO0_BUSY ] = { .pme_name = "PM_L3_CO0_BUSY", .pme_code = 0x00000368AC, .pme_short_desc = "Lifetime, sample of CO machine 0 valid", 
.pme_long_desc = "Lifetime, sample of CO machine 0 valid", }, [ POWER9_PME_PM_L3_CO0_BUSY_ALT ] = { .pme_name = "PM_L3_CO0_BUSY_ALT", .pme_code = 0x00000468AC, .pme_short_desc = "Lifetime, sample of CO machine 0 valid", .pme_long_desc = "Lifetime, sample of CO machine 0 valid", }, [ POWER9_PME_PM_L3_CO_L31 ] = { .pme_name = "PM_L3_CO_L31", .pme_code = 0x00000268A0, .pme_short_desc = "L3 CO to L3.", .pme_long_desc = "L3 CO to L3.1 OR of port 0 and 1 (lossy = may undercount if two cresps come in the same cyc)", }, [ POWER9_PME_PM_L3_CO_LCO ] = { .pme_name = "PM_L3_CO_LCO", .pme_code = 0x00000360A4, .pme_short_desc = "Total L3 COs occurred on LCO L3.", .pme_long_desc = "Total L3 COs occurred on LCO L3.1 (good cresp, may end up in mem on a retry)", }, [ POWER9_PME_PM_L3_CO_MEM ] = { .pme_name = "PM_L3_CO_MEM", .pme_code = 0x00000260A0, .pme_short_desc = "L3 CO to memory OR of port 0 and 1 (lossy = may undercount if two cresp come in the same cyc)", .pme_long_desc = "L3 CO to memory OR of port 0 and 1 (lossy = may undercount if two cresp come in the same cyc)", }, [ POWER9_PME_PM_L3_CO_MEPF ] = { .pme_name = "PM_L3_CO_MEPF", .pme_code = 0x000003E05E, .pme_short_desc = "L3 castouts in Mepf state for this thread", .pme_long_desc = "L3 castouts in Mepf state for this thread", }, [ POWER9_PME_PM_L3_CO_MEPF_ALT ] = { .pme_name = "PM_L3_CO_MEPF_ALT", .pme_code = 0x00000168A0, .pme_short_desc = "L3 CO of line in Mep state (includes casthrough to memory).", .pme_long_desc = "L3 CO of line in Mep state (includes casthrough to memory). 
The Mepf state indicates that a line was brought in to satisfy an L3 prefetch request", }, [ POWER9_PME_PM_L3_CO ] = { .pme_name = "PM_L3_CO", .pme_code = 0x00000360A8, .pme_short_desc = "L3 castout occurring (does not include casthrough or log writes (cinj/dmaw))", .pme_long_desc = "L3 castout occurring (does not include casthrough or log writes (cinj/dmaw))", }, [ POWER9_PME_PM_L3_GRP_GUESS_CORRECT ] = { .pme_name = "PM_L3_GRP_GUESS_CORRECT", .pme_code = 0x00000168B2, .pme_short_desc = "Initial scope=group (GS or NNS) and data from same group (near) (pred successful)", .pme_long_desc = "Initial scope=group (GS or NNS) and data from same group (near) (pred successful)", }, [ POWER9_PME_PM_L3_GRP_GUESS_WRONG_HIGH ] = { .pme_name = "PM_L3_GRP_GUESS_WRONG_HIGH", .pme_code = 0x00000368B2, .pme_short_desc = "Initial scope=group (GS or NNS) but data from local node.", .pme_long_desc = "Initial scope=group (GS or NNS) but data from local node. Prediction too high", }, [ POWER9_PME_PM_L3_GRP_GUESS_WRONG_LOW ] = { .pme_name = "PM_L3_GRP_GUESS_WRONG_LOW", .pme_code = 0x00000360B2, .pme_short_desc = "Initial scope=group (GS or NNS) but data from outside group (far or rem).", .pme_long_desc = "Initial scope=group (GS or NNS) but data from outside group (far or rem). 
Prediction too Low", }, [ POWER9_PME_PM_L3_HIT ] = { .pme_name = "PM_L3_HIT", .pme_code = 0x00000160A4, .pme_short_desc = "L3 Hits (L2 miss hitting L3, including data/instrn/xlate)", .pme_long_desc = "L3 Hits (L2 miss hitting L3, including data/instrn/xlate)", }, [ POWER9_PME_PM_L3_L2_CO_HIT ] = { .pme_name = "PM_L3_L2_CO_HIT", .pme_code = 0x00000360A2, .pme_short_desc = "L2 CO hits", .pme_long_desc = "L2 CO hits", }, [ POWER9_PME_PM_L3_L2_CO_MISS ] = { .pme_name = "PM_L3_L2_CO_MISS", .pme_code = 0x00000368A2, .pme_short_desc = "L2 CO miss", .pme_long_desc = "L2 CO miss", }, [ POWER9_PME_PM_L3_LAT_CI_HIT ] = { .pme_name = "PM_L3_LAT_CI_HIT", .pme_code = 0x00000460A2, .pme_short_desc = "L3 Lateral Castins Hit", .pme_long_desc = "L3 Lateral Castins Hit", }, [ POWER9_PME_PM_L3_LAT_CI_MISS ] = { .pme_name = "PM_L3_LAT_CI_MISS", .pme_code = 0x00000468A2, .pme_short_desc = "L3 Lateral Castins Miss", .pme_long_desc = "L3 Lateral Castins Miss", }, [ POWER9_PME_PM_L3_LD_HIT ] = { .pme_name = "PM_L3_LD_HIT", .pme_code = 0x00000260A4, .pme_short_desc = "L3 Hits for demand LDs", .pme_long_desc = "L3 Hits for demand LDs", }, [ POWER9_PME_PM_L3_LD_MISS ] = { .pme_name = "PM_L3_LD_MISS", .pme_code = 0x00000268A4, .pme_short_desc = "L3 Misses for demand LDs", .pme_long_desc = "L3 Misses for demand LDs", }, [ POWER9_PME_PM_L3_LD_PREF ] = { .pme_name = "PM_L3_LD_PREF", .pme_code = 0x000000F0B0, .pme_short_desc = "L3 load prefetch, sourced from a hardware or software stream, was sent to the nest", .pme_long_desc = "L3 load prefetch, sourced from a hardware or software stream, was sent to the nest", }, [ POWER9_PME_PM_L3_LOC_GUESS_CORRECT ] = { .pme_name = "PM_L3_LOC_GUESS_CORRECT", .pme_code = 0x00000160B2, .pme_short_desc = "initial scope=node/chip (LNS) and data from local node (local) (pred successful) - always PFs only", .pme_long_desc = "initial scope=node/chip (LNS) and data from local node (local) (pred successful) - always PFs only", }, [ POWER9_PME_PM_L3_LOC_GUESS_WRONG ] = 
{ .pme_name = "PM_L3_LOC_GUESS_WRONG", .pme_code = 0x00000268B2, .pme_short_desc = "Initial scope=node (LNS) but data from outside local node (near or far or rem).", .pme_long_desc = "Initial scope=node (LNS) but data from outside local node (near or far or rem). Prediction too Low", }, [ POWER9_PME_PM_L3_MISS ] = { .pme_name = "PM_L3_MISS", .pme_code = 0x00000168A4, .pme_short_desc = "L3 Misses (L2 miss also missing L3, including data/instrn/xlate)", .pme_long_desc = "L3 Misses (L2 miss also missing L3, including data/instrn/xlate)", }, [ POWER9_PME_PM_L3_P0_CO_L31 ] = { .pme_name = "PM_L3_P0_CO_L31", .pme_code = 0x00000460AA, .pme_short_desc = "L3 CO to L3.", .pme_long_desc = "L3 CO to L3.1 (LCO) port 0 with or without data", }, [ POWER9_PME_PM_L3_P0_CO_MEM ] = { .pme_name = "PM_L3_P0_CO_MEM", .pme_code = 0x00000360AA, .pme_short_desc = "L3 CO to memory port 0 with or without data", .pme_long_desc = "L3 CO to memory port 0 with or without data", }, [ POWER9_PME_PM_L3_P0_CO_RTY ] = { .pme_name = "PM_L3_P0_CO_RTY", .pme_code = 0x00000360AE, .pme_short_desc = "L3 CO received retry port 0 (memory only), every retry counted", .pme_long_desc = "L3 CO received retry port 0 (memory only), every retry counted", }, [ POWER9_PME_PM_L3_P0_CO_RTY_ALT ] = { .pme_name = "PM_L3_P0_CO_RTY_ALT", .pme_code = 0x00000460AE, .pme_short_desc = "L3 CO received retry port 2 (memory only), every retry counted", .pme_long_desc = "L3 CO received retry port 2 (memory only), every retry counted", }, [ POWER9_PME_PM_L3_P0_GRP_PUMP ] = { .pme_name = "PM_L3_P0_GRP_PUMP", .pme_code = 0x00000260B0, .pme_short_desc = "L3 PF sent with grp scope port 0, counts even retried requests", .pme_long_desc = "L3 PF sent with grp scope port 0, counts even retried requests", }, [ POWER9_PME_PM_L3_P0_LCO_DATA ] = { .pme_name = "PM_L3_P0_LCO_DATA", .pme_code = 0x00000260AA, .pme_short_desc = "LCO sent with data port 0", .pme_long_desc = "LCO sent with data port 0", }, [ POWER9_PME_PM_L3_P0_LCO_NO_DATA ] = {
.pme_name = "PM_L3_P0_LCO_NO_DATA", .pme_code = 0x00000160AA, .pme_short_desc = "Dataless L3 LCO sent port 0", .pme_long_desc = "Dataless L3 LCO sent port 0", }, [ POWER9_PME_PM_L3_P0_LCO_RTY ] = { .pme_name = "PM_L3_P0_LCO_RTY", .pme_code = 0x00000160B4, .pme_short_desc = "L3 initiated LCO received retry on port 0 (can try 4 times)", .pme_long_desc = "L3 initiated LCO received retry on port 0 (can try 4 times)", }, [ POWER9_PME_PM_L3_P0_NODE_PUMP ] = { .pme_name = "PM_L3_P0_NODE_PUMP", .pme_code = 0x00000160B0, .pme_short_desc = "L3 PF sent with nodal scope port 0, counts even retried requests", .pme_long_desc = "L3 PF sent with nodal scope port 0, counts even retried requests", }, [ POWER9_PME_PM_L3_P0_PF_RTY ] = { .pme_name = "PM_L3_P0_PF_RTY", .pme_code = 0x00000160AE, .pme_short_desc = "L3 PF received retry port 0, every retry counted", .pme_long_desc = "L3 PF received retry port 0, every retry counted", }, [ POWER9_PME_PM_L3_P0_PF_RTY_ALT ] = { .pme_name = "PM_L3_P0_PF_RTY_ALT", .pme_code = 0x00000260AE, .pme_short_desc = "L3 PF received retry port 2, every retry counted", .pme_long_desc = "L3 PF received retry port 2, every retry counted", }, [ POWER9_PME_PM_L3_P0_SYS_PUMP ] = { .pme_name = "PM_L3_P0_SYS_PUMP", .pme_code = 0x00000360B0, .pme_short_desc = "L3 PF sent with sys scope port 0, counts even retried requests", .pme_long_desc = "L3 PF sent with sys scope port 0, counts even retried requests", }, [ POWER9_PME_PM_L3_P1_CO_L31 ] = { .pme_name = "PM_L3_P1_CO_L31", .pme_code = 0x00000468AA, .pme_short_desc = "L3 CO to L3.", .pme_long_desc = "L3 CO to L3.1 (LCO) port 1 with or without data", }, [ POWER9_PME_PM_L3_P1_CO_MEM ] = { .pme_name = "PM_L3_P1_CO_MEM", .pme_code = 0x00000368AA, .pme_short_desc = "L3 CO to memory port 1 with or without data", .pme_long_desc = "L3 CO to memory port 1 with or without data", }, [ POWER9_PME_PM_L3_P1_CO_RTY ] = { .pme_name = "PM_L3_P1_CO_RTY", .pme_code = 0x00000368AE, .pme_short_desc = "L3 CO received retry port 1 
(memory only), every retry counted", .pme_long_desc = "L3 CO received retry port 1 (memory only), every retry counted", }, [ POWER9_PME_PM_L3_P1_CO_RTY_ALT ] = { .pme_name = "PM_L3_P1_CO_RTY_ALT", .pme_code = 0x00000468AE, .pme_short_desc = "L3 CO received retry port 3 (memory only), every retry counted", .pme_long_desc = "L3 CO received retry port 3 (memory only), every retry counted", }, [ POWER9_PME_PM_L3_P1_GRP_PUMP ] = { .pme_name = "PM_L3_P1_GRP_PUMP", .pme_code = 0x00000268B0, .pme_short_desc = "L3 PF sent with grp scope port 1, counts even retried requests", .pme_long_desc = "L3 PF sent with grp scope port 1, counts even retried requests", }, [ POWER9_PME_PM_L3_P1_LCO_DATA ] = { .pme_name = "PM_L3_P1_LCO_DATA", .pme_code = 0x00000268AA, .pme_short_desc = "LCO sent with data port 1", .pme_long_desc = "LCO sent with data port 1", }, [ POWER9_PME_PM_L3_P1_LCO_NO_DATA ] = { .pme_name = "PM_L3_P1_LCO_NO_DATA", .pme_code = 0x00000168AA, .pme_short_desc = "Dataless L3 LCO sent port 1", .pme_long_desc = "Dataless L3 LCO sent port 1", }, [ POWER9_PME_PM_L3_P1_LCO_RTY ] = { .pme_name = "PM_L3_P1_LCO_RTY", .pme_code = 0x00000168B4, .pme_short_desc = "L3 initiated LCO received retry on port 1 (can try 4 times)", .pme_long_desc = "L3 initiated LCO received retry on port 1 (can try 4 times)", }, [ POWER9_PME_PM_L3_P1_NODE_PUMP ] = { .pme_name = "PM_L3_P1_NODE_PUMP", .pme_code = 0x00000168B0, .pme_short_desc = "L3 PF sent with nodal scope port 1, counts even retried requests", .pme_long_desc = "L3 PF sent with nodal scope port 1, counts even retried requests", }, [ POWER9_PME_PM_L3_P1_PF_RTY ] = { .pme_name = "PM_L3_P1_PF_RTY", .pme_code = 0x00000168AE, .pme_short_desc = "L3 PF received retry port 1, every retry counted", .pme_long_desc = "L3 PF received retry port 1, every retry counted", }, [ POWER9_PME_PM_L3_P1_PF_RTY_ALT ] = { .pme_name = "PM_L3_P1_PF_RTY_ALT", .pme_code = 0x00000268AE, .pme_short_desc = "L3 PF received retry port 3, every retry counted", 
.pme_long_desc = "L3 PF received retry port 3, every retry counted", }, [ POWER9_PME_PM_L3_P1_SYS_PUMP ] = { .pme_name = "PM_L3_P1_SYS_PUMP", .pme_code = 0x00000368B0, .pme_short_desc = "L3 PF sent with sys scope port 1, counts even retried requests", .pme_long_desc = "L3 PF sent with sys scope port 1, counts even retried requests", }, [ POWER9_PME_PM_L3_P2_LCO_RTY ] = { .pme_name = "PM_L3_P2_LCO_RTY", .pme_code = 0x00000260B4, .pme_short_desc = "L3 initiated LCO received retry on port 2 (can try 4 times)", .pme_long_desc = "L3 initiated LCO received retry on port 2 (can try 4 times)", }, [ POWER9_PME_PM_L3_P3_LCO_RTY ] = { .pme_name = "PM_L3_P3_LCO_RTY", .pme_code = 0x00000268B4, .pme_short_desc = "L3 initiated LCO received retry on port 3 (can try 4 times)", .pme_long_desc = "L3 initiated LCO received retry on port 3 (can try 4 times)", }, [ POWER9_PME_PM_L3_PF0_BUSY ] = { .pme_name = "PM_L3_PF0_BUSY", .pme_code = 0x00000360B4, .pme_short_desc = "Lifetime, sample of PF machine 0 valid", .pme_long_desc = "Lifetime, sample of PF machine 0 valid", }, [ POWER9_PME_PM_L3_PF0_BUSY_ALT ] = { .pme_name = "PM_L3_PF0_BUSY_ALT", .pme_code = 0x00000460B4, .pme_short_desc = "Lifetime, sample of PF machine 0 valid", .pme_long_desc = "Lifetime, sample of PF machine 0 valid", }, [ POWER9_PME_PM_L3_PF_HIT_L3 ] = { .pme_name = "PM_L3_PF_HIT_L3", .pme_code = 0x00000260A8, .pme_short_desc = "L3 PF hit in L3 (abandoned)", .pme_long_desc = "L3 PF hit in L3 (abandoned)", }, [ POWER9_PME_PM_L3_PF_MISS_L3 ] = { .pme_name = "PM_L3_PF_MISS_L3", .pme_code = 0x00000160A0, .pme_short_desc = "L3 PF missed in L3", .pme_long_desc = "L3 PF missed in L3", }, [ POWER9_PME_PM_L3_PF_OFF_CHIP_CACHE ] = { .pme_name = "PM_L3_PF_OFF_CHIP_CACHE", .pme_code = 0x00000368A0, .pme_short_desc = "L3 PF from Off chip cache", .pme_long_desc = "L3 PF from Off chip cache", }, [ POWER9_PME_PM_L3_PF_OFF_CHIP_MEM ] = { .pme_name = "PM_L3_PF_OFF_CHIP_MEM", .pme_code = 0x00000468A0, .pme_short_desc = "L3 PF from Off 
chip memory", .pme_long_desc = "L3 PF from Off chip memory", }, [ POWER9_PME_PM_L3_PF_ON_CHIP_CACHE ] = { .pme_name = "PM_L3_PF_ON_CHIP_CACHE", .pme_code = 0x00000360A0, .pme_short_desc = "L3 PF from On chip cache", .pme_long_desc = "L3 PF from On chip cache", }, [ POWER9_PME_PM_L3_PF_ON_CHIP_MEM ] = { .pme_name = "PM_L3_PF_ON_CHIP_MEM", .pme_code = 0x00000460A0, .pme_short_desc = "L3 PF from On chip memory", .pme_long_desc = "L3 PF from On chip memory", }, [ POWER9_PME_PM_L3_PF_USAGE ] = { .pme_name = "PM_L3_PF_USAGE", .pme_code = 0x00000260AC, .pme_short_desc = "Rotating sample of 32 PF actives", .pme_long_desc = "Rotating sample of 32 PF actives", }, [ POWER9_PME_PM_L3_RD0_BUSY ] = { .pme_name = "PM_L3_RD0_BUSY", .pme_code = 0x00000368B4, .pme_short_desc = "Lifetime, sample of RD machine 0 valid", .pme_long_desc = "Lifetime, sample of RD machine 0 valid", }, [ POWER9_PME_PM_L3_RD0_BUSY_ALT ] = { .pme_name = "PM_L3_RD0_BUSY_ALT", .pme_code = 0x00000468B4, .pme_short_desc = "Lifetime, sample of RD machine 0 valid", .pme_long_desc = "Lifetime, sample of RD machine 0 valid", }, [ POWER9_PME_PM_L3_RD_USAGE ] = { .pme_name = "PM_L3_RD_USAGE", .pme_code = 0x00000268AC, .pme_short_desc = "Rotating sample of 16 RD actives", .pme_long_desc = "Rotating sample of 16 RD actives", }, [ POWER9_PME_PM_L3_SN0_BUSY ] = { .pme_name = "PM_L3_SN0_BUSY", .pme_code = 0x00000360AC, .pme_short_desc = "Lifetime, sample of snooper machine 0 valid", .pme_long_desc = "Lifetime, sample of snooper machine 0 valid", }, [ POWER9_PME_PM_L3_SN0_BUSY_ALT ] = { .pme_name = "PM_L3_SN0_BUSY_ALT", .pme_code = 0x00000460AC, .pme_short_desc = "Lifetime, sample of snooper machine 0 valid", .pme_long_desc = "Lifetime, sample of snooper machine 0 valid", }, [ POWER9_PME_PM_L3_SN_USAGE ] = { .pme_name = "PM_L3_SN_USAGE", .pme_code = 0x00000160AC, .pme_short_desc = "Rotating sample of 16 snoop valids", .pme_long_desc = "Rotating sample of 16 snoop valids", }, [ POWER9_PME_PM_L3_SW_PREF ] = { .pme_name = 
"PM_L3_SW_PREF", .pme_code = 0x000000F8B0, .pme_short_desc = "L3 load prefetch, sourced from a software prefetch stream, was sent to the nest", .pme_long_desc = "L3 load prefetch, sourced from a software prefetch stream, was sent to the nest", }, [ POWER9_PME_PM_L3_SYS_GUESS_CORRECT ] = { .pme_name = "PM_L3_SYS_GUESS_CORRECT", .pme_code = 0x00000260B2, .pme_short_desc = "Initial scope=system (VGS or RNS) and data from outside group (far or rem)(pred successful)", .pme_long_desc = "Initial scope=system (VGS or RNS) and data from outside group (far or rem)(pred successful)", }, [ POWER9_PME_PM_L3_SYS_GUESS_WRONG ] = { .pme_name = "PM_L3_SYS_GUESS_WRONG", .pme_code = 0x00000460B2, .pme_short_desc = "Initial scope=system (VGS or RNS) but data from local or near.", .pme_long_desc = "Initial scope=system (VGS or RNS) but data from local or near. Prediction too high", }, [ POWER9_PME_PM_L3_TRANS_PF ] = { .pme_name = "PM_L3_TRANS_PF", .pme_code = 0x00000468A4, .pme_short_desc = "L3 Transient prefetch received from L2", .pme_long_desc = "L3 Transient prefetch received from L2", }, [ POWER9_PME_PM_L3_WI0_BUSY ] = { .pme_name = "PM_L3_WI0_BUSY", .pme_code = 0x00000160B6, .pme_short_desc = "Rotating sample of 8 WI valid", .pme_long_desc = "Rotating sample of 8 WI valid", }, [ POWER9_PME_PM_L3_WI0_BUSY_ALT ] = { .pme_name = "PM_L3_WI0_BUSY_ALT", .pme_code = 0x00000260B6, .pme_short_desc = "Rotating sample of 8 WI valid (duplicate)", .pme_long_desc = "Rotating sample of 8 WI valid (duplicate)", }, [ POWER9_PME_PM_L3_WI_USAGE ] = { .pme_name = "PM_L3_WI_USAGE", .pme_code = 0x00000168A8, .pme_short_desc = "Lifetime, sample of Write Inject machine 0 valid", .pme_long_desc = "Lifetime, sample of Write Inject machine 0 valid", }, [ POWER9_PME_PM_LARX_FIN ] = { .pme_name = "PM_LARX_FIN", .pme_code = 0x000003C058, .pme_short_desc = "Larx finished", .pme_long_desc = "Larx finished", }, [ POWER9_PME_PM_LD_CMPL ] = { .pme_name = "PM_LD_CMPL", .pme_code = 0x000004003E, .pme_short_desc = 
"count of Loads completed", .pme_long_desc = "count of Loads completed", }, [ POWER9_PME_PM_LD_L3MISS_PEND_CYC ] = { .pme_name = "PM_LD_L3MISS_PEND_CYC", .pme_code = 0x0000010062, .pme_short_desc = "Cycles L3 miss was pending for this thread", .pme_long_desc = "Cycles L3 miss was pending for this thread", }, [ POWER9_PME_PM_LD_MISS_L1_FIN ] = { .pme_name = "PM_LD_MISS_L1_FIN", .pme_code = 0x000002C04E, .pme_short_desc = "Number of load instructions that finished with an L1 miss.", .pme_long_desc = "Number of load instructions that finished with an L1 miss. Note that even if a load spans multiple slices this event will increment only once per load op.", }, /* See also alternate entries for 000003E054 / POWER9_PME_PM_LD_MISS_L1 with code(s) 00000400F0 at the bottom of this table. \n */ [ POWER9_PME_PM_LD_MISS_L1 ] = { .pme_name = "PM_LD_MISS_L1", .pme_code = 0x000003E054, .pme_short_desc = "Load Missed L1, counted at execution time (can be greater than loads finished).", .pme_long_desc = "Load Missed L1, counted at execution time (can be greater than loads finished). LMQ merges are not included in this count. i.e. if a load instruction misses on an address that is already allocated on the LMQ, this event will not increment for that load). 
Note that this count is per slice, so if a load spans multiple slices this event will increment multiple times for a single load.", }, [ POWER9_PME_PM_LD_REF_L1 ] = { .pme_name = "PM_LD_REF_L1", .pme_code = 0x00000100FC, .pme_short_desc = "All L1 D cache load references counted at finish, gated by reject", .pme_long_desc = "All L1 D cache load references counted at finish, gated by reject", }, [ POWER9_PME_PM_LINK_STACK_CORRECT ] = { .pme_name = "PM_LINK_STACK_CORRECT", .pme_code = 0x00000058A0, .pme_short_desc = "Link stack predicts right address", .pme_long_desc = "Link stack predicts right address", }, [ POWER9_PME_PM_LINK_STACK_INVALID_PTR ] = { .pme_name = "PM_LINK_STACK_INVALID_PTR", .pme_code = 0x0000005898, .pme_short_desc = "It is most often caused by certain types of flush where the pointer is not available.", .pme_long_desc = "It is most often caused by certain types of flush where the pointer is not available. Can result in the data in the link stack becoming unusable.", }, [ POWER9_PME_PM_LINK_STACK_WRONG_ADD_PRED ] = { .pme_name = "PM_LINK_STACK_WRONG_ADD_PRED", .pme_code = 0x0000005098, .pme_short_desc = "Link stack predicts wrong address, because of link stack design limitation or software violating the coding conventions", .pme_long_desc = "Link stack predicts wrong address, because of link stack design limitation or software violating the coding conventions", }, [ POWER9_PME_PM_LMQ_EMPTY_CYC ] = { .pme_name = "PM_LMQ_EMPTY_CYC", .pme_code = 0x000002E05E, .pme_short_desc = "Cycles in which the LMQ has no pending load misses for this thread", .pme_long_desc = "Cycles in which the LMQ has no pending load misses for this thread", }, [ POWER9_PME_PM_LMQ_MERGE ] = { .pme_name = "PM_LMQ_MERGE", .pme_code = 0x000001002E, .pme_short_desc = "A demand miss collides with a prefetch for the same line", .pme_long_desc = "A demand miss collides with a prefetch for the same line", }, [ POWER9_PME_PM_LRQ_REJECT ] = { .pme_name = "PM_LRQ_REJECT", .pme_code = 
0x000002E05A, .pme_short_desc = "Internal LSU reject from LRQ.", .pme_long_desc = "Internal LSU reject from LRQ. Rejects cause the load to go back to LRQ, but it stays contained within the LSU once it gets issued. This event counts the number of times the LRQ attempts to relaunch an instruction after a reject. Any load can suffer multiple rejects", }, [ POWER9_PME_PM_LS0_DC_COLLISIONS ] = { .pme_name = "PM_LS0_DC_COLLISIONS", .pme_code = 0x000000D090, .pme_short_desc = "Read-write data cache collisions", .pme_long_desc = "Read-write data cache collisions", }, [ POWER9_PME_PM_LS0_ERAT_MISS_PREF ] = { .pme_name = "PM_LS0_ERAT_MISS_PREF", .pme_code = 0x000000E084, .pme_short_desc = "LS0 Erat miss due to prefetch", .pme_long_desc = "LS0 Erat miss due to prefetch", }, [ POWER9_PME_PM_LS0_LAUNCH_HELD_PREF ] = { .pme_name = "PM_LS0_LAUNCH_HELD_PREF", .pme_code = 0x000000C09C, .pme_short_desc = "Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle", .pme_long_desc = "Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle", }, [ POWER9_PME_PM_LS0_PTE_TABLEWALK_CYC ] = { .pme_name = "PM_LS0_PTE_TABLEWALK_CYC", .pme_code = 0x000000E0BC, .pme_short_desc = "Cycles when a tablewalk is pending on this thread on table 0", .pme_long_desc = "Cycles when a tablewalk is pending on this thread on table 0", }, [ POWER9_PME_PM_LS0_TM_DISALLOW ] = { .pme_name = "PM_LS0_TM_DISALLOW", .pme_code = 0x000000E0B4, .pme_short_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", .pme_long_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", }, [ POWER9_PME_PM_LS0_UNALIGNED_LD ] = { .pme_name = "PM_LS0_UNALIGNED_LD", .pme_code = 0x000000C094, .pme_short_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an 
additional slice beyond what normally would be required of the load of that size.", .pme_long_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the load of that size. If the load wraps from slice 3 to slice 0, there is an additional 3-cycle penalty", }, [ POWER9_PME_PM_LS0_UNALIGNED_ST ] = { .pme_name = "PM_LS0_UNALIGNED_ST", .pme_code = 0x000000F0B8, .pme_short_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the Store of that size.", .pme_long_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the Store of that size. If the Store wraps from slice 3 to slice 0, there is an additional 3-cycle penalty", }, [ POWER9_PME_PM_LS1_DC_COLLISIONS ] = { .pme_name = "PM_LS1_DC_COLLISIONS", .pme_code = 0x000000D890, .pme_short_desc = "Read-write data cache collisions", .pme_long_desc = "Read-write data cache collisions", }, [ POWER9_PME_PM_LS1_ERAT_MISS_PREF ] = { .pme_name = "PM_LS1_ERAT_MISS_PREF", .pme_code = 0x000000E884, .pme_short_desc = "LS1 Erat miss due to prefetch", .pme_long_desc = "LS1 Erat miss due to prefetch", }, [ POWER9_PME_PM_LS1_LAUNCH_HELD_PREF ] = { .pme_name = "PM_LS1_LAUNCH_HELD_PREF", .pme_code = 0x000000C89C, .pme_short_desc = "Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle", .pme_long_desc = "Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle", }, [ POWER9_PME_PM_LS1_PTE_TABLEWALK_CYC ] = { .pme_name = "PM_LS1_PTE_TABLEWALK_CYC", .pme_code = 0x000000E8BC, .pme_short_desc = "Cycles when a tablewalk is pending on this thread on table 1", .pme_long_desc =
"Cycles when a tablewalk is pending on this thread on table 1", }, [ POWER9_PME_PM_LS1_TM_DISALLOW ] = { .pme_name = "PM_LS1_TM_DISALLOW", .pme_code = 0x000000E8B4, .pme_short_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", .pme_long_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", }, [ POWER9_PME_PM_LS1_UNALIGNED_LD ] = { .pme_name = "PM_LS1_UNALIGNED_LD", .pme_code = 0x000000C894, .pme_short_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the load of that size.", .pme_long_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the load of that size. If the load wraps from slice 3 to slice 0, there is an additional 3-cycle penalty", }, [ POWER9_PME_PM_LS1_UNALIGNED_ST ] = { .pme_name = "PM_LS1_UNALIGNED_ST", .pme_code = 0x000000F8B8, .pme_short_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the Store of that size.", .pme_long_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the Store of that size.
If the Store wraps from slice 3 to slice 0, there is an additional 3-cycle penalty", }, [ POWER9_PME_PM_LS2_DC_COLLISIONS ] = { .pme_name = "PM_LS2_DC_COLLISIONS", .pme_code = 0x000000D094, .pme_short_desc = "Read-write data cache collisions", .pme_long_desc = "Read-write data cache collisions", }, [ POWER9_PME_PM_LS2_ERAT_MISS_PREF ] = { .pme_name = "PM_LS2_ERAT_MISS_PREF", .pme_code = 0x000000E088, .pme_short_desc = "LS0 Erat miss due to prefetch", .pme_long_desc = "LS0 Erat miss due to prefetch", }, [ POWER9_PME_PM_LS2_TM_DISALLOW ] = { .pme_name = "PM_LS2_TM_DISALLOW", .pme_code = 0x000000E0B8, .pme_short_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", .pme_long_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", }, [ POWER9_PME_PM_LS2_UNALIGNED_LD ] = { .pme_name = "PM_LS2_UNALIGNED_LD", .pme_code = 0x000000C098, .pme_short_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the load of that size.", .pme_long_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the load of that size. If the load wraps from slice 3 to slice 0, there is an additional 3-cycle penalty", }, [ POWER9_PME_PM_LS2_UNALIGNED_ST ] = { .pme_name = "PM_LS2_UNALIGNED_ST", .pme_code = 0x000000F0BC, .pme_short_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the Store of that size.", .pme_long_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the Store of that size.
If the Store wraps from slice 3 to slice 0, there is an additional 3-cycle penalty", }, [ POWER9_PME_PM_LS3_DC_COLLISIONS ] = { .pme_name = "PM_LS3_DC_COLLISIONS", .pme_code = 0x000000D894, .pme_short_desc = "Read-write data cache collisions", .pme_long_desc = "Read-write data cache collisions", }, [ POWER9_PME_PM_LS3_ERAT_MISS_PREF ] = { .pme_name = "PM_LS3_ERAT_MISS_PREF", .pme_code = 0x000000E888, .pme_short_desc = "LS1 Erat miss due to prefetch", .pme_long_desc = "LS1 Erat miss due to prefetch", }, [ POWER9_PME_PM_LS3_TM_DISALLOW ] = { .pme_name = "PM_LS3_TM_DISALLOW", .pme_code = 0x000000E8B8, .pme_short_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", .pme_long_desc = "A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it", }, [ POWER9_PME_PM_LS3_UNALIGNED_LD ] = { .pme_name = "PM_LS3_UNALIGNED_LD", .pme_code = 0x000000C898, .pme_short_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the load of that size.", .pme_long_desc = "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the load of that size. If the load wraps from slice 3 to slice 0, there is an additional 3-cycle penalty", }, [ POWER9_PME_PM_LS3_UNALIGNED_ST ] = { .pme_name = "PM_LS3_UNALIGNED_ST", .pme_code = 0x000000F8BC, .pme_short_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the Store of that size.", .pme_long_desc = "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice beyond what normally would be required of the Store of that size.
If the Store wraps from slice 3 to slice 0, there is an additional 3-cycle penalty", }, [ POWER9_PME_PM_LSU0_1_LRQF_FULL_CYC ] = { .pme_name = "PM_LSU0_1_LRQF_FULL_CYC", .pme_code = 0x000000D0BC, .pme_short_desc = "Counts the number of cycles the LRQF is full.", .pme_long_desc = "Counts the number of cycles the LRQF is full. LRQF is the queue that holds loads between finish and completion. If it fills up, instructions stay in LRQ until completion, potentially backing up the LRQ", }, [ POWER9_PME_PM_LSU0_ERAT_HIT ] = { .pme_name = "PM_LSU0_ERAT_HIT", .pme_code = 0x000000E08C, .pme_short_desc = "Primary ERAT hit.", .pme_long_desc = "Primary ERAT hit. There is no secondary ERAT", }, [ POWER9_PME_PM_LSU0_FALSE_LHS ] = { .pme_name = "PM_LSU0_FALSE_LHS", .pme_code = 0x000000C0A0, .pme_short_desc = "False LHS match detected", .pme_long_desc = "False LHS match detected", }, [ POWER9_PME_PM_LSU0_L1_CAM_CANCEL ] = { .pme_name = "PM_LSU0_L1_CAM_CANCEL", .pme_code = 0x000000F090, .pme_short_desc = "ls0 l1 tm cam cancel", .pme_long_desc = "ls0 l1 tm cam cancel", }, [ POWER9_PME_PM_LSU0_LDMX_FIN ] = { .pme_name = "PM_LSU0_LDMX_FIN", .pme_code = 0x000000D088, .pme_short_desc = "New P9 instruction LDMX.", .pme_long_desc = "New P9 instruction LDMX. The definition of this new PMU event is (from the ldmx RFC02491): The thread has executed an ldmx instruction that accessed a doubleword that contains an effective address within an enabled section of the Load Monitored region.
This event, therefore, should not occur if the FSCR has disabled the load monitored facility (FSCR[52]) or disabled the EBB facility (FSCR[56]).", }, [ POWER9_PME_PM_LSU0_LMQ_S0_VALID ] = { .pme_name = "PM_LSU0_LMQ_S0_VALID", .pme_code = 0x000000D8B8, .pme_short_desc = "Slot 0 of LMQ valid", .pme_long_desc = "Slot 0 of LMQ valid", }, [ POWER9_PME_PM_LSU0_LRQ_S0_VALID_CYC ] = { .pme_name = "PM_LSU0_LRQ_S0_VALID_CYC", .pme_code = 0x000000D8B4, .pme_short_desc = "Slot 0 of LRQ valid", .pme_long_desc = "Slot 0 of LRQ valid", }, [ POWER9_PME_PM_LSU0_SET_MPRED ] = { .pme_name = "PM_LSU0_SET_MPRED", .pme_code = 0x000000D080, .pme_short_desc = "Set prediction(set-p) miss.", .pme_long_desc = "Set prediction(set-p) miss. The entry was not found in the Set prediction table", }, [ POWER9_PME_PM_LSU0_SRQ_S0_VALID_CYC ] = { .pme_name = "PM_LSU0_SRQ_S0_VALID_CYC", .pme_code = 0x000000D0B4, .pme_short_desc = "Slot 0 of SRQ valid", .pme_long_desc = "Slot 0 of SRQ valid", }, [ POWER9_PME_PM_LSU0_STORE_REJECT ] = { .pme_name = "PM_LSU0_STORE_REJECT", .pme_code = 0x000000F088, .pme_short_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", .pme_long_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", }, [ POWER9_PME_PM_LSU0_TM_L1_HIT ] = { .pme_name = "PM_LSU0_TM_L1_HIT", .pme_code = 0x000000E094, .pme_short_desc = "Load tm hit in L1", .pme_long_desc = "Load tm hit in L1", }, [ POWER9_PME_PM_LSU0_TM_L1_MISS ] = { .pme_name = "PM_LSU0_TM_L1_MISS", .pme_code = 0x000000E09C, .pme_short_desc = "Load tm L1 miss", .pme_long_desc = "Load tm L1 miss", }, [ POWER9_PME_PM_LSU1_ERAT_HIT ] = { .pme_name = "PM_LSU1_ERAT_HIT", .pme_code = 0x000000E88C, .pme_short_desc = "Primary ERAT hit.", .pme_long_desc = "Primary ERAT hit. 
There is no secondary ERAT", }, [ POWER9_PME_PM_LSU1_FALSE_LHS ] = { .pme_name = "PM_LSU1_FALSE_LHS", .pme_code = 0x000000C8A0, .pme_short_desc = "False LHS match detected", .pme_long_desc = "False LHS match detected", }, [ POWER9_PME_PM_LSU1_L1_CAM_CANCEL ] = { .pme_name = "PM_LSU1_L1_CAM_CANCEL", .pme_code = 0x000000F890, .pme_short_desc = "ls1 l1 tm cam cancel", .pme_long_desc = "ls1 l1 tm cam cancel", }, [ POWER9_PME_PM_LSU1_LDMX_FIN ] = { .pme_name = "PM_LSU1_LDMX_FIN", .pme_code = 0x000000D888, .pme_short_desc = "New P9 instruction LDMX.", .pme_long_desc = "New P9 instruction LDMX. The definition of this new PMU event is (from the ldmx RFC02491): The thread has executed an ldmx instruction that accessed a doubleword that contains an effective address within an enabled section of the Load Monitored region. This event, therefore, should not occur if the FSCR has disabled the load monitored facility (FSCR[52]) or disabled the EBB facility (FSCR[56]).", }, [ POWER9_PME_PM_LSU1_SET_MPRED ] = { .pme_name = "PM_LSU1_SET_MPRED", .pme_code = 0x000000D880, .pme_short_desc = "Set prediction(set-p) miss.", .pme_long_desc = "Set prediction(set-p) miss. 
The entry was not found in the Set prediction table", }, [ POWER9_PME_PM_LSU1_STORE_REJECT ] = { .pme_name = "PM_LSU1_STORE_REJECT", .pme_code = 0x000000F888, .pme_short_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", .pme_long_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", }, [ POWER9_PME_PM_LSU1_TM_L1_HIT ] = { .pme_name = "PM_LSU1_TM_L1_HIT", .pme_code = 0x000000E894, .pme_short_desc = "Load tm hit in L1", .pme_long_desc = "Load tm hit in L1", }, [ POWER9_PME_PM_LSU1_TM_L1_MISS ] = { .pme_name = "PM_LSU1_TM_L1_MISS", .pme_code = 0x000000E89C, .pme_short_desc = "Load tm L1 miss", .pme_long_desc = "Load tm L1 miss", }, [ POWER9_PME_PM_LSU2_3_LRQF_FULL_CYC ] = { .pme_name = "PM_LSU2_3_LRQF_FULL_CYC", .pme_code = 0x000000D8BC, .pme_short_desc = "Counts the number of cycles the LRQF is full.", .pme_long_desc = "Counts the number of cycles the LRQF is full. LRQF is the queue that holds loads between finish and completion. If it fills up, instructions stay in LRQ until completion, potentially backing up the LRQ", }, [ POWER9_PME_PM_LSU2_ERAT_HIT ] = { .pme_name = "PM_LSU2_ERAT_HIT", .pme_code = 0x000000E090, .pme_short_desc = "Primary ERAT hit.", .pme_long_desc = "Primary ERAT hit. 
There is no secondary ERAT", }, [ POWER9_PME_PM_LSU2_FALSE_LHS ] = { .pme_name = "PM_LSU2_FALSE_LHS", .pme_code = 0x000000C0A4, .pme_short_desc = "False LHS match detected", .pme_long_desc = "False LHS match detected", }, [ POWER9_PME_PM_LSU2_L1_CAM_CANCEL ] = { .pme_name = "PM_LSU2_L1_CAM_CANCEL", .pme_code = 0x000000F094, .pme_short_desc = "ls2 l1 tm cam cancel", .pme_long_desc = "ls2 l1 tm cam cancel", }, [ POWER9_PME_PM_LSU2_LDMX_FIN ] = { .pme_name = "PM_LSU2_LDMX_FIN", .pme_code = 0x000000D08C, .pme_short_desc = "New P9 instruction LDMX.", .pme_long_desc = "New P9 instruction LDMX. The definition of this new PMU event is (from the ldmx RFC02491): The thread has executed an ldmx instruction that accessed a doubleword that contains an effective address within an enabled section of the Load Monitored region. This event, therefore, should not occur if the FSCR has disabled the load monitored facility (FSCR[52]) or disabled the EBB facility (FSCR[56]).", }, [ POWER9_PME_PM_LSU2_SET_MPRED ] = { .pme_name = "PM_LSU2_SET_MPRED", .pme_code = 0x000000D084, .pme_short_desc = "Set prediction(set-p) miss.", .pme_long_desc = "Set prediction(set-p) miss. 
The entry was not found in the Set prediction table", }, [ POWER9_PME_PM_LSU2_STORE_REJECT ] = { .pme_name = "PM_LSU2_STORE_REJECT", .pme_code = 0x000000F08C, .pme_short_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", .pme_long_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", }, [ POWER9_PME_PM_LSU2_TM_L1_HIT ] = { .pme_name = "PM_LSU2_TM_L1_HIT", .pme_code = 0x000000E098, .pme_short_desc = "Load tm hit in L1", .pme_long_desc = "Load tm hit in L1", }, [ POWER9_PME_PM_LSU2_TM_L1_MISS ] = { .pme_name = "PM_LSU2_TM_L1_MISS", .pme_code = 0x000000E0A0, .pme_short_desc = "Load tm L1 miss", .pme_long_desc = "Load tm L1 miss", }, [ POWER9_PME_PM_LSU3_ERAT_HIT ] = { .pme_name = "PM_LSU3_ERAT_HIT", .pme_code = 0x000000E890, .pme_short_desc = "Primary ERAT hit.", .pme_long_desc = "Primary ERAT hit. There is no secondary ERAT", }, [ POWER9_PME_PM_LSU3_FALSE_LHS ] = { .pme_name = "PM_LSU3_FALSE_LHS", .pme_code = 0x000000C8A4, .pme_short_desc = "False LHS match detected", .pme_long_desc = "False LHS match detected", }, [ POWER9_PME_PM_LSU3_L1_CAM_CANCEL ] = { .pme_name = "PM_LSU3_L1_CAM_CANCEL", .pme_code = 0x000000F894, .pme_short_desc = "ls3 l1 tm cam cancel", .pme_long_desc = "ls3 l1 tm cam cancel", }, [ POWER9_PME_PM_LSU3_LDMX_FIN ] = { .pme_name = "PM_LSU3_LDMX_FIN", .pme_code = 0x000000D88C, .pme_short_desc = "New P9 instruction LDMX.", .pme_long_desc = "New P9 instruction LDMX. The definition of this new PMU event is (from the ldmx RFC02491): The thread has executed an ldmx instruction that accessed a doubleword that contains an effective address within an enabled section of the Load Monitored region. 
This event, therefore, should not occur if the FSCR has disabled the load monitored facility (FSCR[52]) or disabled the EBB facility (FSCR[56]).", }, [ POWER9_PME_PM_LSU3_SET_MPRED ] = { .pme_name = "PM_LSU3_SET_MPRED", .pme_code = 0x000000D884, .pme_short_desc = "Set prediction(set-p) miss.", .pme_long_desc = "Set prediction(set-p) miss. The entry was not found in the Set prediction table", }, [ POWER9_PME_PM_LSU3_STORE_REJECT ] = { .pme_name = "PM_LSU3_STORE_REJECT", .pme_code = 0x000000F88C, .pme_short_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", .pme_long_desc = "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met", }, [ POWER9_PME_PM_LSU3_TM_L1_HIT ] = { .pme_name = "PM_LSU3_TM_L1_HIT", .pme_code = 0x000000E898, .pme_short_desc = "Load tm hit in L1", .pme_long_desc = "Load tm hit in L1", }, [ POWER9_PME_PM_LSU3_TM_L1_MISS ] = { .pme_name = "PM_LSU3_TM_L1_MISS", .pme_code = 0x000000E8A0, .pme_short_desc = "Load tm L1 miss", .pme_long_desc = "Load tm L1 miss", }, [ POWER9_PME_PM_LSU_DERAT_MISS ] = { .pme_name = "PM_LSU_DERAT_MISS", .pme_code = 0x00000200F6, .pme_short_desc = "DERAT Reloaded due to a DERAT miss", .pme_long_desc = "DERAT Reloaded due to a DERAT miss", }, [ POWER9_PME_PM_LSU_FIN ] = { .pme_name = "PM_LSU_FIN", .pme_code = 0x0000030066, .pme_short_desc = "LSU Finished a PPC instruction (up to 4 per cycle)", .pme_long_desc = "LSU Finished a PPC instruction (up to 4 per cycle)", }, [ POWER9_PME_PM_LSU_FLUSH_ATOMIC ] = { .pme_name = "PM_LSU_FLUSH_ATOMIC", .pme_code = 0x000000C8A8, .pme_short_desc = "Quad-word loads (lq) are considered atomic because they always span at least 2 slices.", .pme_long_desc = "Quad-word loads (lq) are considered atomic because they always span at least 2 slices. 
If a snoop or store from another thread changes the data the load is accessing between the 2 or 3 pieces of the lq instruction, the lq will be flushed", }, [ POWER9_PME_PM_LSU_FLUSH_CI ] = { .pme_name = "PM_LSU_FLUSH_CI", .pme_code = 0x000000C0A8, .pme_short_desc = "Load was not issued to LSU as a cache inhibited (non-cacheable) load but it was later determined to be cache inhibited", .pme_long_desc = "Load was not issued to LSU as a cache inhibited (non-cacheable) load but it was later determined to be cache inhibited", }, [ POWER9_PME_PM_LSU_FLUSH_EMSH ] = { .pme_name = "PM_LSU_FLUSH_EMSH", .pme_code = 0x000000C0AC, .pme_short_desc = "An ERAT miss was detected after a set-p hit.", .pme_long_desc = "An ERAT miss was detected after a set-p hit. Erat tracker indicates fail due to tlbmiss and the instruction gets flushed because the instruction was working on the wrong address", }, [ POWER9_PME_PM_LSU_FLUSH_LARX_STCX ] = { .pme_name = "PM_LSU_FLUSH_LARX_STCX", .pme_code = 0x000000C8B8, .pme_short_desc = "A larx is flushed because an older larx has an LMQ reservation for the same thread.", .pme_long_desc = "A larx is flushed because an older larx has an LMQ reservation for the same thread. A stcx is flushed because an older stcx is in the LMQ. The flush happens when the older larx/stcx relaunches", }, [ POWER9_PME_PM_LSU_FLUSH_LHL_SHL ] = { .pme_name = "PM_LSU_FLUSH_LHL_SHL", .pme_code = 0x000000C8B4, .pme_short_desc = "The instruction was flushed because of a sequential load/store consistency.", .pme_long_desc = "The instruction was flushed because of a sequential load/store consistency. 
If a load or store hits on an older load that has either been snooped (for loads) or has stale data (for stores).", }, [ POWER9_PME_PM_LSU_FLUSH_LHS ] = { .pme_name = "PM_LSU_FLUSH_LHS", .pme_code = 0x000000C8B0, .pme_short_desc = "Effective Address alias flush : no EA match but Real Address match.", .pme_long_desc = "Effective Address alias flush : no EA match but Real Address match. If the data has not yet been returned for this load, the instruction will just be rejected, but if it has returned data, it will be flushed", }, [ POWER9_PME_PM_LSU_FLUSH_NEXT ] = { .pme_name = "PM_LSU_FLUSH_NEXT", .pme_code = 0x00000020B0, .pme_short_desc = "LSU flush next reported at flush time.", .pme_long_desc = "LSU flush next reported at flush time. Sometimes these also come with an exception", }, [ POWER9_PME_PM_LSU_FLUSH_OTHER ] = { .pme_name = "PM_LSU_FLUSH_OTHER", .pme_code = 0x000000C0BC, .pme_short_desc = "Other LSU flushes including: Sync (sync ack from L2 caused search of LRQ for oldest snooped load, This will either signal a Precise Flush of the oldest snooped load or a Flush Next PPC); Data Valid Flush Next (several cases of this, one example is store and reload are lined up such that a store-hit-reload scenario exists and the CDF has already launched and has gotten bad/stale data); Bad Data Valid Flush Next (might be a few cases of this, one example is a larxa (D$ hit) return data and dval but can't allocate to LMQ (LMQ full or other reason).", .pme_long_desc = "Other LSU flushes including: Sync (sync ack from L2 caused search of LRQ for oldest snooped load, This will either signal a Precise Flush of the oldest snooped load or a Flush Next PPC); Data Valid Flush Next (several cases of this, one example is store and reload are lined up such that a store-hit-reload scenario exists and the CDF has already launched and has gotten bad/stale data); Bad Data Valid Flush Next (might be a few cases of this, one example is a larxa (D$ hit) return data and dval but can't allocate 
to LMQ (LMQ full or other reason). Already gave dval but can't watch it for snoop_hit_larx. Need to take the \"bad dval\" back and flush all younger ops)", }, [ POWER9_PME_PM_LSU_FLUSH_RELAUNCH_MISS ] = { .pme_name = "PM_LSU_FLUSH_RELAUNCH_MISS", .pme_code = 0x000000C8AC, .pme_short_desc = "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent", .pme_long_desc = "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent", }, [ POWER9_PME_PM_LSU_FLUSH_SAO ] = { .pme_name = "PM_LSU_FLUSH_SAO", .pme_code = 0x000000C0B8, .pme_short_desc = "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush", .pme_long_desc = "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush", }, [ POWER9_PME_PM_LSU_FLUSH_UE ] = { .pme_name = "PM_LSU_FLUSH_UE", .pme_code = 0x000000C0B0, .pme_short_desc = "Correctable ECC error on reload data, reported at critical data forward time", .pme_long_desc = "Correctable ECC error on reload data, reported at critical data forward time", }, [ POWER9_PME_PM_LSU_FLUSH_WRK_ARND ] = { .pme_name = "PM_LSU_FLUSH_WRK_ARND", .pme_code = 0x000000C0B4, .pme_short_desc = "LSU workaround flush.", .pme_long_desc = "LSU workaround flush. These flushes are set up with programmable scan only latches to perform various actions when the flush macro receives a trigger from the dbg macros. These actions include things like flushing the next op encountered for a particular thread or flushing the next op that is NTC op that is encountered on a particular slice. 
The kind of flush that the workaround is set up to perform is highly variable.", }, [ POWER9_PME_PM_LSU_LMQ_FULL_CYC ] = { .pme_name = "PM_LSU_LMQ_FULL_CYC", .pme_code = 0x000000D0B8, .pme_short_desc = "Counts the number of cycles the LMQ is full", .pme_long_desc = "Counts the number of cycles the LMQ is full", }, [ POWER9_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", .pme_code = 0x000002003E, .pme_short_desc = "Cycles in which the LSU is empty for all threads (lmq and srq are completely empty)", .pme_long_desc = "Cycles in which the LSU is empty for all threads (lmq and srq are completely empty)", }, [ POWER9_PME_PM_LSU_NCST ] = { .pme_name = "PM_LSU_NCST", .pme_code = 0x000000C890, .pme_short_desc = "Asserts when an i=1 store op is sent to the nest.", .pme_long_desc = "Asserts when an i=1 store op is sent to the nest. No record of issue pipe (LS0/LS1) is maintained so this is for both pipes. Probably don't need separate LS0 and LS1", }, [ POWER9_PME_PM_LSU_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU_REJECT_ERAT_MISS", .pme_code = 0x000002E05C, .pme_short_desc = "LSU Reject due to ERAT (up to 4 per cycle)", .pme_long_desc = "LSU Reject due to ERAT (up to 4 per cycle)", }, [ POWER9_PME_PM_LSU_REJECT_LHS ] = { .pme_name = "PM_LSU_REJECT_LHS", .pme_code = 0x000004E05C, .pme_short_desc = "LSU Reject due to LHS (up to 4 per cycle)", .pme_long_desc = "LSU Reject due to LHS (up to 4 per cycle)", }, [ POWER9_PME_PM_LSU_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU_REJECT_LMQ_FULL", .pme_code = 0x000003001C, .pme_short_desc = "LSU Reject due to LMQ full (up to 4 per cycle)", .pme_long_desc = "LSU Reject due to LMQ full (up to 4 per cycle)", }, [ POWER9_PME_PM_LSU_SRQ_FULL_CYC ] = { .pme_name = "PM_LSU_SRQ_FULL_CYC", .pme_code = 0x000001001A, .pme_short_desc = "Cycles in which the Store Queue is full on all 4 slices.", .pme_long_desc = "Cycles in which the Store Queue is full on all 4 slices. This event is not per thread. 
All the threads will see the same count for this core resource", }, [ POWER9_PME_PM_LSU_STCX_FAIL ] = { .pme_name = "PM_LSU_STCX_FAIL", .pme_code = 0x000000F080, .pme_short_desc = "", .pme_long_desc = "", }, [ POWER9_PME_PM_LSU_STCX ] = { .pme_name = "PM_LSU_STCX", .pme_code = 0x000000C090, .pme_short_desc = "STCX sent to nest, i.", .pme_long_desc = "STCX sent to nest, i.e. total", }, [ POWER9_PME_PM_LWSYNC ] = { .pme_name = "PM_LWSYNC", .pme_code = 0x0000005894, .pme_short_desc = "Lwsync instruction decoded and transferred", .pme_long_desc = "Lwsync instruction decoded and transferred", }, [ POWER9_PME_PM_MATH_FLOP_CMPL ] = { .pme_name = "PM_MATH_FLOP_CMPL", .pme_code = 0x000004505C, .pme_short_desc = "Math flop instruction completed", .pme_long_desc = "Math flop instruction completed", }, [ POWER9_PME_PM_MEM_CO ] = { .pme_name = "PM_MEM_CO", .pme_code = 0x000004C058, .pme_short_desc = "Memory castouts from this thread", .pme_long_desc = "Memory castouts from this thread", }, [ POWER9_PME_PM_MEM_LOC_THRESH_IFU ] = { .pme_name = "PM_MEM_LOC_THRESH_IFU", .pme_code = 0x0000010058, .pme_short_desc = "Local Memory above threshold for IFU speculation control", .pme_long_desc = "Local Memory above threshold for IFU speculation control", }, [ POWER9_PME_PM_MEM_LOC_THRESH_LSU_HIGH ] = { .pme_name = "PM_MEM_LOC_THRESH_LSU_HIGH", .pme_code = 0x0000040056, .pme_short_desc = "Local memory above threshold for LSU medium", .pme_long_desc = "Local memory above threshold for LSU medium", }, [ POWER9_PME_PM_MEM_LOC_THRESH_LSU_MED ] = { .pme_name = "PM_MEM_LOC_THRESH_LSU_MED", .pme_code = 0x000001C05E, .pme_short_desc = "Local memory above threshold for data prefetch", .pme_long_desc = "Local memory above threshold for data prefetch", }, [ POWER9_PME_PM_MEM_PREF ] = { .pme_name = "PM_MEM_PREF", .pme_code = 0x000002C058, .pme_short_desc = "Memory prefetch for this thread.", .pme_long_desc = "Memory prefetch for this thread. 
Includes L4", }, [ POWER9_PME_PM_MEM_READ ] = { .pme_name = "PM_MEM_READ", .pme_code = 0x0000010056, .pme_short_desc = "Reads from Memory from this thread (includes data/inst/xlate/l1prefetch/inst prefetch).", .pme_long_desc = "Reads from Memory from this thread (includes data/inst/xlate/l1prefetch/inst prefetch). Includes L4", }, [ POWER9_PME_PM_MEM_RWITM ] = { .pme_name = "PM_MEM_RWITM", .pme_code = 0x000003C05E, .pme_short_desc = "Memory Read With Intent to Modify for this thread", .pme_long_desc = "Memory Read With Intent to Modify for this thread", }, [ POWER9_PME_PM_MRK_BACK_BR_CMPL ] = { .pme_name = "PM_MRK_BACK_BR_CMPL", .pme_code = 0x000003515E, .pme_short_desc = "Marked branch instruction completed with a target address less than current instruction address", .pme_long_desc = "Marked branch instruction completed with a target address less than current instruction address", }, [ POWER9_PME_PM_MRK_BR_2PATH ] = { .pme_name = "PM_MRK_BR_2PATH", .pme_code = 0x0000010138, .pme_short_desc = "marked branches which are not strongly biased", .pme_long_desc = "marked branches which are not strongly biased", }, [ POWER9_PME_PM_MRK_BR_CMPL ] = { .pme_name = "PM_MRK_BR_CMPL", .pme_code = 0x000001016E, .pme_short_desc = "Branch Instruction completed", .pme_long_desc = "Branch Instruction completed", }, [ POWER9_PME_PM_MRK_BR_MPRED_CMPL ] = { .pme_name = "PM_MRK_BR_MPRED_CMPL", .pme_code = 0x00000301E4, .pme_short_desc = "Marked Branch Mispredicted", .pme_long_desc = "Marked Branch Mispredicted", }, [ POWER9_PME_PM_MRK_BR_TAKEN_CMPL ] = { .pme_name = "PM_MRK_BR_TAKEN_CMPL", .pme_code = 0x00000101E2, .pme_short_desc = "Marked Branch Taken completed", .pme_long_desc = "Marked Branch Taken completed", }, [ POWER9_PME_PM_MRK_BRU_FIN ] = { .pme_name = "PM_MRK_BRU_FIN", .pme_code = 0x000002013A, .pme_short_desc = "bru marked instr finish", .pme_long_desc = "bru marked instr finish", }, [ POWER9_PME_PM_MRK_DATA_FROM_DL2L3_MOD_CYC ] = { .pme_name = 
"PM_MRK_DATA_FROM_DL2L3_MOD_CYC", .pme_code = 0x000004D12E, .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_DL2L3_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_MOD", .pme_code = 0x000003D14E, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_DL2L3_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_SHR_CYC", .pme_code = 0x000002C128, .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_DL2L3_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_DL2L3_SHR", .pme_code = 0x000001D150, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_DL4_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_DL4_CYC", .pme_code = 0x000002C12C, .pme_short_desc = "Duration 
in cycles to reload from another chip's L4 on a different Node or Group (Distant) due to a marked load", .pme_long_desc = "Duration in cycles to reload from another chip's L4 on a different Node or Group (Distant) due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_DL4 ] = { .pme_name = "PM_MRK_DATA_FROM_DL4", .pme_code = 0x000001D152, .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_DMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_DMEM_CYC", .pme_code = 0x000004E11E, .pme_short_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group (Distant) due to a marked load", .pme_long_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group (Distant) due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_DMEM ] = { .pme_name = "PM_MRK_DATA_FROM_DMEM", .pme_code = 0x000003D14C, .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L21_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L21_MOD_CYC", .pme_code = 0x000003D148, .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another core's L2 on the same chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another core's L2 on the same chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L21_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L21_MOD", .pme_code = 0x000004D146, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from 
another core's L2 on the same chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L21_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L21_SHR_CYC", .pme_code = 0x000001D154, .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another core's L2 on the same chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another core's L2 on the same chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L21_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L21_SHR", .pme_code = 0x000002D14E, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L2_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2_CYC", .pme_code = 0x0000014156, .pme_short_desc = "Duration in cycles to reload from local core's L2 due to a marked load", .pme_long_desc = "Duration in cycles to reload from local core's L2 due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC", .pme_code = 0x000001415A, .pme_short_desc = "Duration in cycles to reload from local core's L2 with load hit store conflict due to a marked load", .pme_long_desc = "Duration in cycles to reload from local core's L2 with load hit store conflict due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST ] = { .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST", .pme_code = 0x000002D148, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a marked load", .pme_long_desc = "The processor's data cache 
was reloaded from local core's L2 with load hit store conflict due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC", .pme_code = 0x000003D140, .pme_short_desc = "Duration in cycles to reload from local core's L2 with dispatch conflict due to a marked load", .pme_long_desc = "Duration in cycles to reload from local core's L2 with dispatch conflict due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER ] = { .pme_name = "PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER", .pme_code = 0x000002C124, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L2_MEPF_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2_MEPF_CYC", .pme_code = 0x000003D144, .pme_short_desc = "Duration in cycles to reload from local core's L2 hit without dispatch conflicts on Mepf state.", .pme_long_desc = "Duration in cycles to reload from local core's L2 hit without dispatch conflicts on Mepf state. due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L2_MEPF ] = { .pme_name = "PM_MRK_DATA_FROM_L2_MEPF", .pme_code = 0x000004C120, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state.", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state. 
due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L2MISS_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2MISS_CYC", .pme_code = 0x0000035152, .pme_short_desc = "Duration in cycles to reload from a location other than the local core's L2 due to a marked load", .pme_long_desc = "Duration in cycles to reload from a location other than the local core's L2 due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L2MISS ] = { .pme_name = "PM_MRK_DATA_FROM_L2MISS", .pme_code = 0x00000401E8, .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L2 due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L2 due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC", .pme_code = 0x0000014158, .pme_short_desc = "Duration in cycles to reload from local core's L2 without conflict due to a marked load", .pme_long_desc = "Duration in cycles to reload from local core's L2 without conflict due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L2_NO_CONFLICT ] = { .pme_name = "PM_MRK_DATA_FROM_L2_NO_CONFLICT", .pme_code = 0x000002C120, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 without conflict due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L2 ] = { .pme_name = "PM_MRK_DATA_FROM_L2", .pme_code = 0x000002C126, .pme_short_desc = "The processor's data cache was reloaded from local core's L2 due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from local core's L2 due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L31_ECO_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L31_ECO_MOD_CYC", .pme_code = 0x0000035158, .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another core's ECO L3 
on the same chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L31_ECO_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L31_ECO_MOD", .pme_code = 0x000004D144, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L31_ECO_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L31_ECO_SHR_CYC", .pme_code = 0x000001D142, .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L31_ECO_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L31_ECO_SHR", .pme_code = 0x000002D14C, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L31_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L31_MOD_CYC", .pme_code = 0x000001D140, .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another core's L3 on the same chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another core's L3 on the same chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L31_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L31_MOD", .pme_code = 0x000002D144, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from 
another core's L3 on the same chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L31_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L31_SHR_CYC", .pme_code = 0x0000035156, .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another core's L3 on the same chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another core's L3 on the same chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L31_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L31_SHR", .pme_code = 0x000004D124, .pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L3_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L3_CYC", .pme_code = 0x0000035154, .pme_short_desc = "Duration in cycles to reload from local core's L3 due to a marked load", .pme_long_desc = "Duration in cycles to reload from local core's L3 due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC", .pme_code = 0x000002C122, .pme_short_desc = "Duration in cycles to reload from local core's L3 with dispatch conflict due to a marked load", .pme_long_desc = "Duration in cycles to reload from local core's L3 with dispatch conflict due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L3_DISP_CONFLICT ] = { .pme_name = "PM_MRK_DATA_FROM_L3_DISP_CONFLICT", .pme_code = 0x000001D144, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 with dispatch 
conflict due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L3_MEPF_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L3_MEPF_CYC", .pme_code = 0x000001415C, .pme_short_desc = "Duration in cycles to reload from local core's L3 without dispatch conflicts hit on Mepf state due to a marked load", .pme_long_desc = "Duration in cycles to reload from local core's L3 without dispatch conflicts hit on Mepf state due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L3_MEPF ] = { .pme_name = "PM_MRK_DATA_FROM_L3_MEPF", .pme_code = 0x000002D142, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state.", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state. due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L3MISS_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L3MISS_CYC", .pme_code = 0x000001415E, .pme_short_desc = "Duration in cycles to reload from a location other than the local core's L3 due to a marked load", .pme_long_desc = "Duration in cycles to reload from a location other than the local core's L3 due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L3MISS ] = { .pme_name = "PM_MRK_DATA_FROM_L3MISS", .pme_code = 0x00000201E4, .pme_short_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from a location other than the local core's L3 due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC", .pme_code = 0x000004C124, .pme_short_desc = "Duration in cycles to reload from local core's L3 without conflict due to a marked load", .pme_long_desc = "Duration in cycles to reload from local core's L3 without conflict due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L3_NO_CONFLICT ] = { .pme_name = "PM_MRK_DATA_FROM_L3_NO_CONFLICT", .pme_code 
= 0x000003D146, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 without conflict due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_L3 ] = { .pme_name = "PM_MRK_DATA_FROM_L3", .pme_code = 0x000004D142, .pme_short_desc = "The processor's data cache was reloaded from local core's L3 due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from local core's L3 due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_LL4_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_LL4_CYC", .pme_code = 0x000002C12E, .pme_short_desc = "Duration in cycles to reload from the local chip's L4 cache due to a marked load", .pme_long_desc = "Duration in cycles to reload from the local chip's L4 cache due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_LL4 ] = { .pme_name = "PM_MRK_DATA_FROM_LL4", .pme_code = 0x000001D14C, .pme_short_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from the local chip's L4 cache due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_LMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM_CYC", .pme_code = 0x000004D128, .pme_short_desc = "Duration in cycles to reload from the local chip's Memory due to a marked load", .pme_long_desc = "Duration in cycles to reload from the local chip's Memory due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_LMEM ] = { .pme_name = "PM_MRK_DATA_FROM_LMEM", .pme_code = 0x000003D142, .pme_short_desc = "The processor's data cache was reloaded from the local chip's Memory due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from the local chip's Memory due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_MEMORY_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_MEMORY_CYC", .pme_code = 0x000001D146, .pme_short_desc = "Duration 
in cycles to reload from a memory location including L4 from local remote or distant due to a marked load", .pme_long_desc = "Duration in cycles to reload from a memory location including L4 from local remote or distant due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_MEMORY ] = { .pme_name = "PM_MRK_DATA_FROM_MEMORY", .pme_code = 0x00000201E0, .pme_short_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC", .pme_code = 0x000001D14E, .pme_short_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", .pme_long_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_OFF_CHIP_CACHE ] = { .pme_name = "PM_MRK_DATA_FROM_OFF_CHIP_CACHE", .pme_code = 0x000002D120, .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC", .pme_code = 0x000003515A, .pme_short_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on the same chip due to a marked load", .pme_long_desc = "Duration in cycles to reload either shared or modified data from another core's L2/L3 on the same chip 
due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_ON_CHIP_CACHE ] = { .pme_name = "PM_MRK_DATA_FROM_ON_CHIP_CACHE", .pme_code = 0x000004D140, .pme_short_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded either shared or modified data from another core's L2/L3 on the same chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_RL2L3_MOD_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_MOD_CYC", .pme_code = 0x000002D14A, .pme_short_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_RL2L3_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_MOD", .pme_code = 0x000001D14A, .pme_short_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_RL2L3_SHR_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_SHR_CYC", .pme_code = 0x000004C12A, .pme_short_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", .pme_long_desc = "Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_RL2L3_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_RL2L3_SHR", .pme_code = 0x0000035150, 
.pme_short_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", .pme_long_desc = "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_RL4_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RL4_CYC", .pme_code = 0x000004D12A, .pme_short_desc = "Duration in cycles to reload from another chip's L4 on the same Node or Group (Remote) due to a marked load", .pme_long_desc = "Duration in cycles to reload from another chip's L4 on the same Node or Group (Remote) due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_RL4 ] = { .pme_name = "PM_MRK_DATA_FROM_RL4", .pme_code = 0x000003515C, .pme_short_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_RMEM_CYC ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM_CYC", .pme_code = 0x000002C12A, .pme_short_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group (Remote) due to a marked load", .pme_long_desc = "Duration in cycles to reload from another chip's memory on the same Node or Group (Remote) due to a marked load", }, [ POWER9_PME_PM_MRK_DATA_FROM_RMEM ] = { .pme_name = "PM_MRK_DATA_FROM_RMEM", .pme_code = 0x000001D148, .pme_short_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Remote) due to a marked load", .pme_long_desc = "The processor's data cache was reloaded from another chip's memory on the same Node or Group (Remote) due to a marked load", }, [ POWER9_PME_PM_MRK_DCACHE_RELOAD_INTV ] = { .pme_name = "PM_MRK_DCACHE_RELOAD_INTV", .pme_code = 
0x0000040118, .pme_short_desc = "Combined Intervention event", .pme_long_desc = "Combined Intervention event", }, [ POWER9_PME_PM_MRK_DERAT_MISS_16G ] = { .pme_name = "PM_MRK_DERAT_MISS_16G", .pme_code = 0x000004C15C, .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16G", .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16G", }, [ POWER9_PME_PM_MRK_DERAT_MISS_16M ] = { .pme_name = "PM_MRK_DERAT_MISS_16M", .pme_code = 0x000003D154, .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16M", .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 16M", }, [ POWER9_PME_PM_MRK_DERAT_MISS_1G ] = { .pme_name = "PM_MRK_DERAT_MISS_1G", .pme_code = 0x000003D152, .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 1G.", .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 1G. Implies radix translation", }, [ POWER9_PME_PM_MRK_DERAT_MISS_2M ] = { .pme_name = "PM_MRK_DERAT_MISS_2M", .pme_code = 0x000002D152, .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 2M.", .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 2M. 
Implies radix translation", }, [ POWER9_PME_PM_MRK_DERAT_MISS_4K ] = { .pme_name = "PM_MRK_DERAT_MISS_4K", .pme_code = 0x000002D150, .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 4K", .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 4K", }, [ POWER9_PME_PM_MRK_DERAT_MISS_64K ] = { .pme_name = "PM_MRK_DERAT_MISS_64K", .pme_code = 0x000002D154, .pme_short_desc = "Marked Data ERAT Miss (Data TLB Access) page size 64K", .pme_long_desc = "Marked Data ERAT Miss (Data TLB Access) page size 64K", }, [ POWER9_PME_PM_MRK_DERAT_MISS ] = { .pme_name = "PM_MRK_DERAT_MISS", .pme_code = 0x00000301E6, .pme_short_desc = "Erat Miss (TLB Access) All page sizes", .pme_long_desc = "Erat Miss (TLB Access) All page sizes", }, [ POWER9_PME_PM_MRK_DFU_FIN ] = { .pme_name = "PM_MRK_DFU_FIN", .pme_code = 0x0000020132, .pme_short_desc = "Decimal Unit marked Instruction Finish", .pme_long_desc = "Decimal Unit marked Instruction Finish", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_DL2L3_MOD ] = { .pme_name = "PM_MRK_DPTEG_FROM_DL2L3_MOD", .pme_code = 0x000004F148, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_DL2L3_SHR ] = { .pme_name = "PM_MRK_DPTEG_FROM_DL2L3_SHR", .pme_code = 0x000003F148, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_DL4 ] = { .pme_name = "PM_MRK_DPTEG_FROM_DL4", .pme_code = 0x000003F14C, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_DMEM ] = { .pme_name = "PM_MRK_DPTEG_FROM_DMEM", .pme_code = 0x000004F14C, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_L21_MOD ] = { .pme_name = "PM_MRK_DPTEG_FROM_L21_MOD", .pme_code = 0x000004F146, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_L21_SHR ] = { .pme_name = "PM_MRK_DPTEG_FROM_L21_SHR", .pme_code = 0x000003F146, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_L2_MEPF ] = { .pme_name = "PM_MRK_DPTEG_FROM_L2_MEPF", .pme_code = 0x000002F140, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state. due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_L2MISS ] = { .pme_name = "PM_MRK_DPTEG_FROM_L2MISS", .pme_code = 0x000001F14E, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_L2_NO_CONFLICT ] = { .pme_name = "PM_MRK_DPTEG_FROM_L2_NO_CONFLICT", .pme_code = 0x000001F140, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_L2 ] = { .pme_name = "PM_MRK_DPTEG_FROM_L2", .pme_code = 0x000001F142, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L2 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_L31_ECO_MOD ] = { .pme_name = "PM_MRK_DPTEG_FROM_L31_ECO_MOD", .pme_code = 0x000004F144, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_L31_ECO_SHR ] = { .pme_name = "PM_MRK_DPTEG_FROM_L31_ECO_SHR", .pme_code = 0x000003F144, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_L31_MOD ] = { .pme_name = "PM_MRK_DPTEG_FROM_L31_MOD", .pme_code = 0x000002F144, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_L31_SHR ] = { .pme_name = "PM_MRK_DPTEG_FROM_L31_SHR", .pme_code = 0x000001F146, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT ] = { .pme_name = "PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT", .pme_code = 0x000003F142, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_L3_MEPF ] = { .pme_name = "PM_MRK_DPTEG_FROM_L3_MEPF", .pme_code = 0x000002F142, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state. due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_L3MISS ] = { .pme_name = "PM_MRK_DPTEG_FROM_L3MISS", .pme_code = 0x000004F14E, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_L3_NO_CONFLICT ] = { .pme_name = "PM_MRK_DPTEG_FROM_L3_NO_CONFLICT", .pme_code = 0x000001F144, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a marked data side request. 
When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_L3 ] = { .pme_name = "PM_MRK_DPTEG_FROM_L3", .pme_code = 0x000004F142, .pme_short_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from local core's L3 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_LL4 ] = { .pme_name = "PM_MRK_DPTEG_FROM_LL4", .pme_code = 0x000001F14C, .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_LMEM ] = { .pme_name = "PM_MRK_DPTEG_FROM_LMEM", .pme_code = 0x000002F148, .pme_short_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from the local chip's Memory due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_MEMORY ] = { .pme_name = "PM_MRK_DPTEG_FROM_MEMORY", .pme_code = 0x000002F14C, .pme_short_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE ] = { .pme_name = "PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE", .pme_code = 0x000004F14A, .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_ON_CHIP_CACHE ] = { .pme_name = "PM_MRK_DPTEG_FROM_ON_CHIP_CACHE", .pme_code = 0x000001F148, .pme_short_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_RL2L3_MOD ] = { .pme_name = "PM_MRK_DPTEG_FROM_RL2L3_MOD", .pme_code = 0x000002F146, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_RL2L3_SHR ] = { .pme_name = "PM_MRK_DPTEG_FROM_RL2L3_SHR", .pme_code = 0x000001F14A, .pme_short_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_RL4 ] = { .pme_name = "PM_MRK_DPTEG_FROM_RL4", .pme_code = 0x000002F14A, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DPTEG_FROM_RMEM ] = { .pme_name = "PM_MRK_DPTEG_FROM_RMEM", .pme_code = 0x000003F14A, .pme_short_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to a marked data side request.", .pme_long_desc = "A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. 
Only PTE reloads are included", }, [ POWER9_PME_PM_MRK_DTLB_MISS_16G ] = { .pme_name = "PM_MRK_DTLB_MISS_16G", .pme_code = 0x000002D15E, .pme_short_desc = "Marked Data TLB Miss page size 16G", .pme_long_desc = "Marked Data TLB Miss page size 16G", }, [ POWER9_PME_PM_MRK_DTLB_MISS_16M ] = { .pme_name = "PM_MRK_DTLB_MISS_16M", .pme_code = 0x000004C15E, .pme_short_desc = "Marked Data TLB Miss page size 16M", .pme_long_desc = "Marked Data TLB Miss page size 16M", }, [ POWER9_PME_PM_MRK_DTLB_MISS_1G ] = { .pme_name = "PM_MRK_DTLB_MISS_1G", .pme_code = 0x000001D15C, .pme_short_desc = "Marked Data TLB reload (after a miss) page size 1G.", .pme_long_desc = "Marked Data TLB reload (after a miss) page size 1G. Implies radix translation was used", }, [ POWER9_PME_PM_MRK_DTLB_MISS_4K ] = { .pme_name = "PM_MRK_DTLB_MISS_4K", .pme_code = 0x000002D156, .pme_short_desc = "Marked Data TLB Miss page size 4K", .pme_long_desc = "Marked Data TLB Miss page size 4K", }, [ POWER9_PME_PM_MRK_DTLB_MISS_64K ] = { .pme_name = "PM_MRK_DTLB_MISS_64K", .pme_code = 0x000003D156, .pme_short_desc = "Marked Data TLB Miss page size 64K", .pme_long_desc = "Marked Data TLB Miss page size 64K", }, [ POWER9_PME_PM_MRK_DTLB_MISS ] = { .pme_name = "PM_MRK_DTLB_MISS", .pme_code = 0x00000401E4, .pme_short_desc = "Marked dtlb miss", .pme_long_desc = "Marked dtlb miss", }, [ POWER9_PME_PM_MRK_FAB_RSP_BKILL_CYC ] = { .pme_name = "PM_MRK_FAB_RSP_BKILL_CYC", .pme_code = 0x000001F152, .pme_short_desc = "cycles L2 RC took for a bkill", .pme_long_desc = "cycles L2 RC took for a bkill", }, [ POWER9_PME_PM_MRK_FAB_RSP_BKILL ] = { .pme_name = "PM_MRK_FAB_RSP_BKILL", .pme_code = 0x0000040154, .pme_short_desc = "Marked store had to do a bkill", .pme_long_desc = "Marked store had to do a bkill", }, [ POWER9_PME_PM_MRK_FAB_RSP_CLAIM_RTY ] = { .pme_name = "PM_MRK_FAB_RSP_CLAIM_RTY", .pme_code = 0x000003015E, .pme_short_desc = "Sampled store did a rwitm and got a rty", .pme_long_desc = "Sampled store did a rwitm and got a 
rty", }, [ POWER9_PME_PM_MRK_FAB_RSP_DCLAIM_CYC ] = { .pme_name = "PM_MRK_FAB_RSP_DCLAIM_CYC", .pme_code = 0x000002F152, .pme_short_desc = "cycles L2 RC took for a dclaim", .pme_long_desc = "cycles L2 RC took for a dclaim", }, [ POWER9_PME_PM_MRK_FAB_RSP_DCLAIM ] = { .pme_name = "PM_MRK_FAB_RSP_DCLAIM", .pme_code = 0x0000030154, .pme_short_desc = "Marked store had to do a dclaim", .pme_long_desc = "Marked store had to do a dclaim", }, [ POWER9_PME_PM_MRK_FAB_RSP_RD_RTY ] = { .pme_name = "PM_MRK_FAB_RSP_RD_RTY", .pme_code = 0x000004015E, .pme_short_desc = "Sampled L2 reads retry count", .pme_long_desc = "Sampled L2 reads retry count", }, [ POWER9_PME_PM_MRK_FAB_RSP_RD_T_INTV ] = { .pme_name = "PM_MRK_FAB_RSP_RD_T_INTV", .pme_code = 0x000001015E, .pme_short_desc = "Sampled Read got a T intervention", .pme_long_desc = "Sampled Read got a T intervention", }, [ POWER9_PME_PM_MRK_FAB_RSP_RWITM_CYC ] = { .pme_name = "PM_MRK_FAB_RSP_RWITM_CYC", .pme_code = 0x000004F150, .pme_short_desc = "cycles L2 RC took for a rwitm", .pme_long_desc = "cycles L2 RC took for a rwitm", }, [ POWER9_PME_PM_MRK_FAB_RSP_RWITM_RTY ] = { .pme_name = "PM_MRK_FAB_RSP_RWITM_RTY", .pme_code = 0x000002015E, .pme_short_desc = "Sampled store did a rwitm and got a rty", .pme_long_desc = "Sampled store did a rwitm and got a rty", }, [ POWER9_PME_PM_MRK_FXU_FIN ] = { .pme_name = "PM_MRK_FXU_FIN", .pme_code = 0x0000020134, .pme_short_desc = "fxu marked instr finish", .pme_long_desc = "fxu marked instr finish", }, [ POWER9_PME_PM_MRK_IC_MISS ] = { .pme_name = "PM_MRK_IC_MISS", .pme_code = 0x000004013A, .pme_short_desc = "Marked instruction experienced I cache miss", .pme_long_desc = "Marked instruction experienced I cache miss", }, [ POWER9_PME_PM_MRK_INST_CMPL ] = { .pme_name = "PM_MRK_INST_CMPL", .pme_code = 0x00000401E0, .pme_short_desc = "marked instruction completed", .pme_long_desc = "marked instruction completed", }, [ POWER9_PME_PM_MRK_INST_DECODED ] = { .pme_name = "PM_MRK_INST_DECODED", .pme_code 
= 0x0000020130, .pme_short_desc = "An instruction was marked at decode time.", .pme_long_desc = "An instruction was marked at decode time. Random Instruction Sampling (RIS) only", }, [ POWER9_PME_PM_MRK_INST_DISP ] = { .pme_name = "PM_MRK_INST_DISP", .pme_code = 0x00000101E0, .pme_short_desc = "The thread has dispatched a randomly sampled marked instruction", .pme_long_desc = "The thread has dispatched a randomly sampled marked instruction", }, [ POWER9_PME_PM_MRK_INST_FIN ] = { .pme_name = "PM_MRK_INST_FIN", .pme_code = 0x0000030130, .pme_short_desc = "marked instruction finished", .pme_long_desc = "marked instruction finished", }, [ POWER9_PME_PM_MRK_INST_FROM_L3MISS ] = { .pme_name = "PM_MRK_INST_FROM_L3MISS", .pme_code = 0x00000401E6, .pme_short_desc = "Marked instruction was reloaded from a location beyond the local chiplet", .pme_long_desc = "Marked instruction was reloaded from a location beyond the local chiplet", }, [ POWER9_PME_PM_MRK_INST_ISSUED ] = { .pme_name = "PM_MRK_INST_ISSUED", .pme_code = 0x0000010132, .pme_short_desc = "Marked instruction issued", .pme_long_desc = "Marked instruction issued", }, [ POWER9_PME_PM_MRK_INST_TIMEO ] = { .pme_name = "PM_MRK_INST_TIMEO", .pme_code = 0x0000040134, .pme_short_desc = "marked Instruction finish timeout (instruction lost)", .pme_long_desc = "marked Instruction finish timeout (instruction lost)", }, [ POWER9_PME_PM_MRK_INST ] = { .pme_name = "PM_MRK_INST", .pme_code = 0x0000024058, .pme_short_desc = "An instruction was marked.", .pme_long_desc = "An instruction was marked. 
Includes both Random Instruction Sampling (RIS) at decode time and Random Event Sampling (RES) at the time the configured event happens", }, [ POWER9_PME_PM_MRK_L1_ICACHE_MISS ] = { .pme_name = "PM_MRK_L1_ICACHE_MISS", .pme_code = 0x00000101E4, .pme_short_desc = "sampled Instruction suffered an icache Miss", .pme_long_desc = "sampled Instruction suffered an icache Miss", }, [ POWER9_PME_PM_MRK_L1_RELOAD_VALID ] = { .pme_name = "PM_MRK_L1_RELOAD_VALID", .pme_code = 0x00000101EA, .pme_short_desc = "Marked demand reload", .pme_long_desc = "Marked demand reload", }, [ POWER9_PME_PM_MRK_L2_RC_DISP ] = { .pme_name = "PM_MRK_L2_RC_DISP", .pme_code = 0x0000020114, .pme_short_desc = "Marked Instruction RC dispatched in L2", .pme_long_desc = "Marked Instruction RC dispatched in L2", }, [ POWER9_PME_PM_MRK_L2_RC_DONE ] = { .pme_name = "PM_MRK_L2_RC_DONE", .pme_code = 0x000003012A, .pme_short_desc = "Marked RC done", .pme_long_desc = "Marked RC done", }, [ POWER9_PME_PM_MRK_L2_TM_REQ_ABORT ] = { .pme_name = "PM_MRK_L2_TM_REQ_ABORT", .pme_code = 0x000001E15E, .pme_short_desc = "TM abort", .pme_long_desc = "TM abort", }, [ POWER9_PME_PM_MRK_L2_TM_ST_ABORT_SISTER ] = { .pme_name = "PM_MRK_L2_TM_ST_ABORT_SISTER", .pme_code = 0x000003E15C, .pme_short_desc = "TM marked store abort for this thread", .pme_long_desc = "TM marked store abort for this thread", }, [ POWER9_PME_PM_MRK_LARX_FIN ] = { .pme_name = "PM_MRK_LARX_FIN", .pme_code = 0x0000040116, .pme_short_desc = "Larx finished", .pme_long_desc = "Larx finished", }, [ POWER9_PME_PM_MRK_LD_MISS_EXPOSED_CYC ] = { .pme_name = "PM_MRK_LD_MISS_EXPOSED_CYC", .pme_code = 0x000001013E, .pme_short_desc = "Marked Load exposed Miss (use edge detect to count #)", .pme_long_desc = "Marked Load exposed Miss (use edge detect to count #)", }, [ POWER9_PME_PM_MRK_LD_MISS_L1_CYC ] = { .pme_name = "PM_MRK_LD_MISS_L1_CYC", .pme_code = 0x000001D056, .pme_short_desc = "Marked ld latency", .pme_long_desc = "Marked ld latency", }, [ 
POWER9_PME_PM_MRK_LD_MISS_L1 ] = { .pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 0x00000201E2, .pme_short_desc = "Marked DL1 Demand Miss counted at exec time.", .pme_long_desc = "Marked DL1 Demand Miss counted at exec time. Note that this count is per slice, so if a load spans multiple slices this event will increment multiple times for a single load.", }, [ POWER9_PME_PM_MRK_LSU_DERAT_MISS ] = { .pme_name = "PM_MRK_LSU_DERAT_MISS", .pme_code = 0x0000030162, .pme_short_desc = "Marked derat reload (miss) for any page size", .pme_long_desc = "Marked derat reload (miss) for any page size", }, [ POWER9_PME_PM_MRK_LSU_FIN ] = { .pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x0000040132, .pme_short_desc = "lsu marked instr PPC finish", .pme_long_desc = "lsu marked instr PPC finish", }, [ POWER9_PME_PM_MRK_LSU_FLUSH_ATOMIC ] = { .pme_name = "PM_MRK_LSU_FLUSH_ATOMIC", .pme_code = 0x000000D098, .pme_short_desc = "Quad-word loads (lq) are considered atomic because they always span at least 2 slices.", .pme_long_desc = "Quad-word loads (lq) are considered atomic because they always span at least 2 slices. If a snoop or store from another thread changes the data the load is accessing between the 2 or 3 pieces of the lq instruction, the lq will be flushed", }, [ POWER9_PME_PM_MRK_LSU_FLUSH_EMSH ] = { .pme_name = "PM_MRK_LSU_FLUSH_EMSH", .pme_code = 0x000000D898, .pme_short_desc = "An ERAT miss was detected after a set-p hit.", .pme_long_desc = "An ERAT miss was detected after a set-p hit. Erat tracker indicates fail due to tlbmiss and the instruction gets flushed because the instruction was working on the wrong address", }, [ POWER9_PME_PM_MRK_LSU_FLUSH_LARX_STCX ] = { .pme_name = "PM_MRK_LSU_FLUSH_LARX_STCX", .pme_code = 0x000000D8A4, .pme_short_desc = "A larx is flushed because an older larx has an LMQ reservation for the same thread.", .pme_long_desc = "A larx is flushed because an older larx has an LMQ reservation for the same thread. 
A stcx is flushed because an older stcx is in the LMQ. The flush happens when the older larx/stcx relaunches", }, [ POWER9_PME_PM_MRK_LSU_FLUSH_LHL_SHL ] = { .pme_name = "PM_MRK_LSU_FLUSH_LHL_SHL", .pme_code = 0x000000D8A0, .pme_short_desc = "The instruction was flushed because of a sequential load/store consistency.", .pme_long_desc = "The instruction was flushed because of a sequential load/store consistency. If a load or store hits on an older load that has either been snooped (for loads) or has stale data (for stores).", }, [ POWER9_PME_PM_MRK_LSU_FLUSH_LHS ] = { .pme_name = "PM_MRK_LSU_FLUSH_LHS", .pme_code = 0x000000D0A0, .pme_short_desc = "Effective Address alias flush : no EA match but Real Address match.", .pme_long_desc = "Effective Address alias flush : no EA match but Real Address match. If the data has not yet been returned for this load, the instruction will just be rejected, but if it has returned data, it will be flushed", }, [ POWER9_PME_PM_MRK_LSU_FLUSH_RELAUNCH_MISS ] = { .pme_name = "PM_MRK_LSU_FLUSH_RELAUNCH_MISS", .pme_code = 0x000000D09C, .pme_short_desc = "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent", .pme_long_desc = "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent", }, [ POWER9_PME_PM_MRK_LSU_FLUSH_SAO ] = { .pme_name = "PM_MRK_LSU_FLUSH_SAO", .pme_code = 0x000000D0A4, .pme_short_desc = "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush", .pme_long_desc = "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush", }, [ POWER9_PME_PM_MRK_LSU_FLUSH_UE ] = { .pme_name = "PM_MRK_LSU_FLUSH_UE", .pme_code = 0x000000D89C, .pme_short_desc = "Correctable 
ECC error on reload data, reported at critical data forward time", .pme_long_desc = "Correctable ECC error on reload data, reported at critical data forward time", }, [ POWER9_PME_PM_MRK_NTC_CYC ] = { .pme_name = "PM_MRK_NTC_CYC", .pme_code = 0x000002011C, .pme_short_desc = "Cycles during which the marked instruction is next to complete (completion is held up because the marked instruction hasn't completed yet)", .pme_long_desc = "Cycles during which the marked instruction is next to complete (completion is held up because the marked instruction hasn't completed yet)", }, [ POWER9_PME_PM_MRK_NTF_FIN ] = { .pme_name = "PM_MRK_NTF_FIN", .pme_code = 0x0000020112, .pme_short_desc = "Marked next to finish instruction finished", .pme_long_desc = "Marked next to finish instruction finished", }, [ POWER9_PME_PM_MRK_PROBE_NOP_CMPL ] = { .pme_name = "PM_MRK_PROBE_NOP_CMPL", .pme_code = 0x000001F05E, .pme_short_desc = "Marked probeNops completed", .pme_long_desc = "Marked probeNops completed", }, [ POWER9_PME_PM_MRK_RUN_CYC ] = { .pme_name = "PM_MRK_RUN_CYC", .pme_code = 0x000001D15E, .pme_short_desc = "Run cycles in which a marked instruction is in the pipeline", .pme_long_desc = "Run cycles in which a marked instruction is in the pipeline", }, [ POWER9_PME_PM_MRK_STALL_CMPLU_CYC ] = { .pme_name = "PM_MRK_STALL_CMPLU_CYC", .pme_code = 0x000003013E, .pme_short_desc = "Number of cycles the marked instruction is experiencing a stall while it is next to complete (NTC)", .pme_long_desc = "Number of cycles the marked instruction is experiencing a stall while it is next to complete (NTC)", }, [ POWER9_PME_PM_MRK_ST_CMPL_INT ] = { .pme_name = "PM_MRK_ST_CMPL_INT", .pme_code = 0x0000030134, .pme_short_desc = "marked store finished with intervention", .pme_long_desc = "marked store finished with intervention", }, [ POWER9_PME_PM_MRK_ST_CMPL ] = { .pme_name = "PM_MRK_ST_CMPL", .pme_code = 0x00000301E2, .pme_short_desc = "Marked store completed and sent to nest", .pme_long_desc = 
"Marked store completed and sent to nest", }, [ POWER9_PME_PM_MRK_STCX_FAIL ] = { .pme_name = "PM_MRK_STCX_FAIL", .pme_code = 0x000003E158, .pme_short_desc = "marked stcx failed", .pme_long_desc = "marked stcx failed", }, [ POWER9_PME_PM_MRK_STCX_FIN ] = { .pme_name = "PM_MRK_STCX_FIN", .pme_code = 0x0000024056, .pme_short_desc = "Number of marked stcx instructions finished.", .pme_long_desc = "Number of marked stcx instructions finished. This includes instructions in the speculative path of a branch that may be flushed", }, [ POWER9_PME_PM_MRK_ST_DONE_L2 ] = { .pme_name = "PM_MRK_ST_DONE_L2", .pme_code = 0x0000010134, .pme_short_desc = "marked store completed in L2 (RC machine done)", .pme_long_desc = "marked store completed in L2 (RC machine done)", }, [ POWER9_PME_PM_MRK_ST_DRAIN_TO_L2DISP_CYC ] = { .pme_name = "PM_MRK_ST_DRAIN_TO_L2DISP_CYC", .pme_code = 0x000003F150, .pme_short_desc = "cycles to drain st from core to L2", .pme_long_desc = "cycles to drain st from core to L2", }, [ POWER9_PME_PM_MRK_ST_FWD ] = { .pme_name = "PM_MRK_ST_FWD", .pme_code = 0x000003012C, .pme_short_desc = "Marked st forwards", .pme_long_desc = "Marked st forwards", }, [ POWER9_PME_PM_MRK_ST_L2DISP_TO_CMPL_CYC ] = { .pme_name = "PM_MRK_ST_L2DISP_TO_CMPL_CYC", .pme_code = 0x000001F150, .pme_short_desc = "cycles from L2 rc disp to l2 rc completion", .pme_long_desc = "cycles from L2 rc disp to l2 rc completion", }, [ POWER9_PME_PM_MRK_ST_NEST ] = { .pme_name = "PM_MRK_ST_NEST", .pme_code = 0x0000020138, .pme_short_desc = "Marked store sent to nest", .pme_long_desc = "Marked store sent to nest", }, [ POWER9_PME_PM_MRK_TEND_FAIL ] = { .pme_name = "PM_MRK_TEND_FAIL", .pme_code = 0x00000028A4, .pme_short_desc = "Nested or not nested tend failed for a marked tend instruction", .pme_long_desc = "Nested or not nested tend failed for a marked tend instruction", }, [ POWER9_PME_PM_MRK_VSU_FIN ] = { .pme_name = "PM_MRK_VSU_FIN", .pme_code = 0x0000030132, .pme_short_desc = "VSU marked instr 
finish", .pme_long_desc = "VSU marked instr finish", }, [ POWER9_PME_PM_MULT_MRK ] = { .pme_name = "PM_MULT_MRK", .pme_code = 0x000003D15E, .pme_short_desc = "mult marked instr", .pme_long_desc = "mult marked instr", }, [ POWER9_PME_PM_NEST_REF_CLK ] = { .pme_name = "PM_NEST_REF_CLK", .pme_code = 0x000003006E, .pme_short_desc = "Multiply by 4 to obtain the number of PB cycles", .pme_long_desc = "Multiply by 4 to obtain the number of PB cycles", }, [ POWER9_PME_PM_NON_DATA_STORE ] = { .pme_name = "PM_NON_DATA_STORE", .pme_code = 0x000000F8A0, .pme_short_desc = "All ops that drain from s2q to L2 and contain no data", .pme_long_desc = "All ops that drain from s2q to L2 and contain no data", }, [ POWER9_PME_PM_NON_FMA_FLOP_CMPL ] = { .pme_name = "PM_NON_FMA_FLOP_CMPL", .pme_code = 0x000004D056, .pme_short_desc = "Non FMA instruction completed", .pme_long_desc = "Non FMA instruction completed", }, [ POWER9_PME_PM_NON_MATH_FLOP_CMPL ] = { .pme_name = "PM_NON_MATH_FLOP_CMPL", .pme_code = 0x000004D05A, .pme_short_desc = "Non FLOP operation completed", .pme_long_desc = "Non FLOP operation completed", }, [ POWER9_PME_PM_NON_TM_RST_SC ] = { .pme_name = "PM_NON_TM_RST_SC", .pme_code = 0x00000260A6, .pme_short_desc = "Non-TM snp rst TM SC", .pme_long_desc = "Non-TM snp rst TM SC", }, [ POWER9_PME_PM_NTC_ALL_FIN ] = { .pme_name = "PM_NTC_ALL_FIN", .pme_code = 0x000002001A, .pme_short_desc = "Cycles after all instructions have finished to group completed", .pme_long_desc = "Cycles after all instructions have finished to group completed", }, [ POWER9_PME_PM_NTC_FIN ] = { .pme_name = "PM_NTC_FIN", .pme_code = 0x000002405A, .pme_short_desc = "Cycles in which the oldest instruction in the pipeline (NTC) finishes.", .pme_long_desc = "Cycles in which the oldest instruction in the pipeline (NTC) finishes. 
This event is used to account for cycles in which work is being completed in the CPI stack", }, [ POWER9_PME_PM_NTC_ISSUE_HELD_ARB ] = { .pme_name = "PM_NTC_ISSUE_HELD_ARB", .pme_code = 0x000002E016, .pme_short_desc = "The NTC instruction is being held at dispatch because it lost arbitration onto the issue pipe to another instruction (from the same thread or a different thread)", .pme_long_desc = "The NTC instruction is being held at dispatch because it lost arbitration onto the issue pipe to another instruction (from the same thread or a different thread)", }, [ POWER9_PME_PM_NTC_ISSUE_HELD_DARQ_FULL ] = { .pme_name = "PM_NTC_ISSUE_HELD_DARQ_FULL", .pme_code = 0x000001006A, .pme_short_desc = "The NTC instruction is being held at dispatch because there are no slots in the DARQ for it", .pme_long_desc = "The NTC instruction is being held at dispatch because there are no slots in the DARQ for it", }, [ POWER9_PME_PM_NTC_ISSUE_HELD_OTHER ] = { .pme_name = "PM_NTC_ISSUE_HELD_OTHER", .pme_code = 0x000003D05A, .pme_short_desc = "The NTC instruction is being held at dispatch during regular pipeline cycles, or because the VSU is busy with multi-cycle instructions, or because of a write-back collision with VSU", .pme_long_desc = "The NTC instruction is being held at dispatch during regular pipeline cycles, or because the VSU is busy with multi-cycle instructions, or because of a write-back collision with VSU", }, [ POWER9_PME_PM_PARTIAL_ST_FIN ] = { .pme_name = "PM_PARTIAL_ST_FIN", .pme_code = 0x0000034054, .pme_short_desc = "Any store finished by an LSU slice", .pme_long_desc = "Any store finished by an LSU slice", }, [ POWER9_PME_PM_PMC1_OVERFLOW ] = { .pme_name = "PM_PMC1_OVERFLOW", .pme_code = 0x0000020010, .pme_short_desc = "Overflow from counter 1", .pme_long_desc = "Overflow from counter 1", }, [ POWER9_PME_PM_PMC1_REWIND ] = { .pme_name = "PM_PMC1_REWIND", .pme_code = 0x000004D02C, .pme_short_desc = "", .pme_long_desc = "", }, [ POWER9_PME_PM_PMC1_SAVED ] = { 
.pme_name = "PM_PMC1_SAVED", .pme_code = 0x000004D010, .pme_short_desc = "PMC1 Rewind Value saved", .pme_long_desc = "PMC1 Rewind Value saved", }, [ POWER9_PME_PM_PMC2_OVERFLOW ] = { .pme_name = "PM_PMC2_OVERFLOW", .pme_code = 0x0000030010, .pme_short_desc = "Overflow from counter 2", .pme_long_desc = "Overflow from counter 2", }, [ POWER9_PME_PM_PMC2_REWIND ] = { .pme_name = "PM_PMC2_REWIND", .pme_code = 0x0000030020, .pme_short_desc = "PMC2 Rewind Event (did not match condition)", .pme_long_desc = "PMC2 Rewind Event (did not match condition)", }, [ POWER9_PME_PM_PMC2_SAVED ] = { .pme_name = "PM_PMC2_SAVED", .pme_code = 0x0000010022, .pme_short_desc = "PMC2 Rewind Value saved", .pme_long_desc = "PMC2 Rewind Value saved", }, [ POWER9_PME_PM_PMC3_OVERFLOW ] = { .pme_name = "PM_PMC3_OVERFLOW", .pme_code = 0x0000040010, .pme_short_desc = "Overflow from counter 3", .pme_long_desc = "Overflow from counter 3", }, [ POWER9_PME_PM_PMC3_REWIND ] = { .pme_name = "PM_PMC3_REWIND", .pme_code = 0x000001000A, .pme_short_desc = "PMC3 rewind event.", .pme_long_desc = "PMC3 rewind event. A rewind happens when a speculative event (such as latency or CPI stack) is selected on PMC3 and the stall reason or reload source did not match the one programmed in PMC3. 
When this occurs, the count in PMC3 will not change.", }, [ POWER9_PME_PM_PMC3_SAVED ] = { .pme_name = "PM_PMC3_SAVED", .pme_code = 0x000004D012, .pme_short_desc = "PMC3 Rewind Value saved", .pme_long_desc = "PMC3 Rewind Value saved", }, [ POWER9_PME_PM_PMC4_OVERFLOW ] = { .pme_name = "PM_PMC4_OVERFLOW", .pme_code = 0x0000010010, .pme_short_desc = "Overflow from counter 4", .pme_long_desc = "Overflow from counter 4", }, [ POWER9_PME_PM_PMC4_REWIND ] = { .pme_name = "PM_PMC4_REWIND", .pme_code = 0x0000010020, .pme_short_desc = "PMC4 Rewind Event", .pme_long_desc = "PMC4 Rewind Event", }, [ POWER9_PME_PM_PMC4_SAVED ] = { .pme_name = "PM_PMC4_SAVED", .pme_code = 0x0000030022, .pme_short_desc = "PMC4 Rewind Value saved (matched condition)", .pme_long_desc = "PMC4 Rewind Value saved (matched condition)", }, [ POWER9_PME_PM_PMC5_OVERFLOW ] = { .pme_name = "PM_PMC5_OVERFLOW", .pme_code = 0x0000010024, .pme_short_desc = "Overflow from counter 5", .pme_long_desc = "Overflow from counter 5", }, [ POWER9_PME_PM_PMC6_OVERFLOW ] = { .pme_name = "PM_PMC6_OVERFLOW", .pme_code = 0x0000030024, .pme_short_desc = "Overflow from counter 6", .pme_long_desc = "Overflow from counter 6", }, [ POWER9_PME_PM_PROBE_NOP_DISP ] = { .pme_name = "PM_PROBE_NOP_DISP", .pme_code = 0x0000040014, .pme_short_desc = "ProbeNops dispatched", .pme_long_desc = "ProbeNops dispatched", }, [ POWER9_PME_PM_PTE_PREFETCH ] = { .pme_name = "PM_PTE_PREFETCH", .pme_code = 0x000000F084, .pme_short_desc = "PTE prefetches", .pme_long_desc = "PTE prefetches", }, [ POWER9_PME_PM_PTESYNC ] = { .pme_name = "PM_PTESYNC", .pme_code = 0x000000589C, .pme_short_desc = "ptesync instruction counted when the instruction is decoded and transmitted", .pme_long_desc = "ptesync instruction counted when the instruction is decoded and transmitted", }, [ POWER9_PME_PM_PUMP_CPRED ] = { .pme_name = "PM_PUMP_CPRED", .pme_code = 0x0000010054, .pme_short_desc = "Pump prediction correct.", .pme_long_desc = "Pump prediction correct. 
Counts across all types of pumps for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", }, [ POWER9_PME_PM_PUMP_MPRED ] = { .pme_name = "PM_PUMP_MPRED", .pme_code = 0x0000040052, .pme_short_desc = "Pump misprediction.", .pme_long_desc = "Pump misprediction. Counts across all types of pumps for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", }, [ POWER9_PME_PM_RADIX_PWC_L1_HIT ] = { .pme_name = "PM_RADIX_PWC_L1_HIT", .pme_code = 0x000001F056, .pme_short_desc = "A radix translation attempt missed in the TLB and only the first level page walk cache was a hit.", .pme_long_desc = "A radix translation attempt missed in the TLB and only the first level page walk cache was a hit.", }, [ POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L2 ] = { .pme_name = "PM_RADIX_PWC_L1_PDE_FROM_L2", .pme_code = 0x000002D026, .pme_short_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L2 data cache", .pme_long_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L2 data cache", }, [ POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L3MISS ] = { .pme_name = "PM_RADIX_PWC_L1_PDE_FROM_L3MISS", .pme_code = 0x000004F056, .pme_short_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from beyond the core's L3 data cache.", .pme_long_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from beyond the core's L3 data cache. 
The source could be local/remote/distant memory or another core's cache", }, [ POWER9_PME_PM_RADIX_PWC_L1_PDE_FROM_L3 ] = { .pme_name = "PM_RADIX_PWC_L1_PDE_FROM_L3", .pme_code = 0x000003F058, .pme_short_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L3 data cache", .pme_long_desc = "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L3 data cache", }, [ POWER9_PME_PM_RADIX_PWC_L2_HIT ] = { .pme_name = "PM_RADIX_PWC_L2_HIT", .pme_code = 0x000002D024, .pme_short_desc = "A radix translation attempt missed in the TLB but hit on both the first and second levels of page walk cache.", .pme_long_desc = "A radix translation attempt missed in the TLB but hit on both the first and second levels of page walk cache.", }, [ POWER9_PME_PM_RADIX_PWC_L2_PDE_FROM_L2 ] = { .pme_name = "PM_RADIX_PWC_L2_PDE_FROM_L2", .pme_code = 0x000002D028, .pme_short_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L2 data cache", .pme_long_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L2 data cache", }, [ POWER9_PME_PM_RADIX_PWC_L2_PDE_FROM_L3 ] = { .pme_name = "PM_RADIX_PWC_L2_PDE_FROM_L3", .pme_code = 0x000003F05A, .pme_short_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L3 data cache", .pme_long_desc = "A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L3 data cache", }, [ POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L2 ] = { .pme_name = "PM_RADIX_PWC_L2_PTE_FROM_L2", .pme_code = 0x000001F058, .pme_short_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from the core's L2 data cache.", .pme_long_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from the core's L2 data cache. 
This implies that level 3 and level 4 PWC accesses were not necessary for this translation", }, [ POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L3MISS ] = { .pme_name = "PM_RADIX_PWC_L2_PTE_FROM_L3MISS", .pme_code = 0x000004F05C, .pme_short_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from beyond the core's L3 data cache.", .pme_long_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from beyond the core's L3 data cache. This implies that level 3 and level 4 PWC accesses were not necessary for this translation. The source could be local/remote/distant memory or another core's cache", }, [ POWER9_PME_PM_RADIX_PWC_L2_PTE_FROM_L3 ] = { .pme_name = "PM_RADIX_PWC_L2_PTE_FROM_L3", .pme_code = 0x000004F058, .pme_short_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from the core's L3 data cache.", .pme_long_desc = "A Page Table Entry was reloaded to a level 2 page walk cache from the core's L3 data cache. This implies that level 3 and level 4 PWC accesses were not necessary for this translation", }, [ POWER9_PME_PM_RADIX_PWC_L3_HIT ] = { .pme_name = "PM_RADIX_PWC_L3_HIT", .pme_code = 0x000003F056, .pme_short_desc = "A radix translation attempt missed in the TLB but hit on the first, second, and third levels of page walk cache.", .pme_long_desc = "A radix translation attempt missed in the TLB but hit on the first, second, and third levels of page walk cache.", }, [ POWER9_PME_PM_RADIX_PWC_L3_PDE_FROM_L2 ] = { .pme_name = "PM_RADIX_PWC_L3_PDE_FROM_L2", .pme_code = 0x000002D02A, .pme_short_desc = "A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L2 data cache", .pme_long_desc = "A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L2 data cache", }, [ POWER9_PME_PM_RADIX_PWC_L3_PDE_FROM_L3 ] = { .pme_name = "PM_RADIX_PWC_L3_PDE_FROM_L3", .pme_code = 0x000001F15C, .pme_short_desc = "A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L3 
data cache", .pme_long_desc = "A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L3 data cache", }, [ POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L2 ] = { .pme_name = "PM_RADIX_PWC_L3_PTE_FROM_L2", .pme_code = 0x000002D02E, .pme_short_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from the core's L2 data cache.", .pme_long_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from the core's L2 data cache. This implies that a level 4 PWC access was not necessary for this translation", }, [ POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L3MISS ] = { .pme_name = "PM_RADIX_PWC_L3_PTE_FROM_L3MISS", .pme_code = 0x000004F05E, .pme_short_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from beyond the core's L3 data cache.", .pme_long_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from beyond the core's L3 data cache. This implies that a level 4 PWC access was not necessary for this translation. The source could be local/remote/distant memory or another core's cache", }, [ POWER9_PME_PM_RADIX_PWC_L3_PTE_FROM_L3 ] = { .pme_name = "PM_RADIX_PWC_L3_PTE_FROM_L3", .pme_code = 0x000003F05E, .pme_short_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from the core's L3 data cache.", .pme_long_desc = "A Page Table Entry was reloaded to a level 3 page walk cache from the core's L3 data cache. This implies that a level 4 PWC access was not necessary for this translation", }, [ POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L2 ] = { .pme_name = "PM_RADIX_PWC_L4_PTE_FROM_L2", .pme_code = 0x000001F05A, .pme_short_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from the core's L2 data cache.", .pme_long_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from the core's L2 data cache. 
This is the deepest level of PWC possible for a translation", }, [ POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L3MISS ] = { .pme_name = "PM_RADIX_PWC_L4_PTE_FROM_L3MISS", .pme_code = 0x000003F054, .pme_short_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from beyond the core's L3 data cache.", .pme_long_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from beyond the core's L3 data cache. This is the deepest level of PWC possible for a translation. The source could be local/remote/distant memory or another core's cache", }, [ POWER9_PME_PM_RADIX_PWC_L4_PTE_FROM_L3 ] = { .pme_name = "PM_RADIX_PWC_L4_PTE_FROM_L3", .pme_code = 0x000004F05A, .pme_short_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from the core's L3 data cache.", .pme_long_desc = "A Page Table Entry was reloaded to a level 4 page walk cache from the core's L3 data cache. This is the deepest level of PWC possible for a translation", }, [ POWER9_PME_PM_RADIX_PWC_MISS ] = { .pme_name = "PM_RADIX_PWC_MISS", .pme_code = 0x000004F054, .pme_short_desc = "A radix translation attempt missed in the TLB and all levels of page walk cache.", .pme_long_desc = "A radix translation attempt missed in the TLB and all levels of page walk cache.", }, [ POWER9_PME_PM_RC0_BUSY ] = { .pme_name = "PM_RC0_BUSY", .pme_code = 0x000001608C, .pme_short_desc = "RC mach 0 Busy.", .pme_long_desc = "RC mach 0 Busy. Used by PMU to sample ave RC lifetime (mach0 used as sample point)", }, [ POWER9_PME_PM_RC0_BUSY_ALT ] = { .pme_name = "PM_RC0_BUSY_ALT", .pme_code = 0x000002608C, .pme_short_desc = "RC mach 0 Busy.", .pme_long_desc = "RC mach 0 Busy. 
Used by PMU to sample ave RC lifetime (mach0 used as sample point)", }, [ POWER9_PME_PM_RC_USAGE ] = { .pme_name = "PM_RC_USAGE", .pme_code = 0x000001688C, .pme_short_desc = "Continuous 16 cycle (2to1) window where this signal rotates through sampling each RC machine busy.", .pme_long_desc = "Continuous 16 cycle (2to1) window where this signal rotates through sampling each RC machine busy. PMU uses this wave to then do 16 cyc count to sample total number of machs running", }, [ POWER9_PME_PM_RD_CLEARING_SC ] = { .pme_name = "PM_RD_CLEARING_SC", .pme_code = 0x00000468A6, .pme_short_desc = "Read clearing SC", .pme_long_desc = "Read clearing SC", }, [ POWER9_PME_PM_RD_FORMING_SC ] = { .pme_name = "PM_RD_FORMING_SC", .pme_code = 0x00000460A6, .pme_short_desc = "Read forming SC", .pme_long_desc = "Read forming SC", }, [ POWER9_PME_PM_RD_HIT_PF ] = { .pme_name = "PM_RD_HIT_PF", .pme_code = 0x00000268A8, .pme_short_desc = "RD machine hit L3 PF machine", .pme_long_desc = "RD machine hit L3 PF machine", }, [ POWER9_PME_PM_RUN_CYC_SMT2_MODE ] = { .pme_name = "PM_RUN_CYC_SMT2_MODE", .pme_code = 0x000003006C, .pme_short_desc = "Cycles in which this thread's run latch is set and the core is in SMT2 mode", .pme_long_desc = "Cycles in which this thread's run latch is set and the core is in SMT2 mode", }, [ POWER9_PME_PM_RUN_CYC_SMT4_MODE ] = { .pme_name = "PM_RUN_CYC_SMT4_MODE", .pme_code = 0x000002006C, .pme_short_desc = "Cycles in which this thread's run latch is set and the core is in SMT4 mode", .pme_long_desc = "Cycles in which this thread's run latch is set and the core is in SMT4 mode", }, [ POWER9_PME_PM_RUN_CYC_ST_MODE ] = { .pme_name = "PM_RUN_CYC_ST_MODE", .pme_code = 0x000001006C, .pme_short_desc = "Cycles run latch is set and core is in ST mode", .pme_long_desc = "Cycles run latch is set and core is in ST mode", }, [ POWER9_PME_PM_RUN_CYC ] = { .pme_name = "PM_RUN_CYC", .pme_code = 0x00000200F4, .pme_short_desc = "Run_cycles", .pme_long_desc = "Run_cycles", }, [ 
POWER9_PME_PM_RUN_INST_CMPL ] = { .pme_name = "PM_RUN_INST_CMPL", .pme_code = 0x00000400FA, .pme_short_desc = "Run_Instructions", .pme_long_desc = "Run_Instructions", }, [ POWER9_PME_PM_RUN_PURR ] = { .pme_name = "PM_RUN_PURR", .pme_code = 0x00000400F4, .pme_short_desc = "Run_PURR", .pme_long_desc = "Run_PURR", }, [ POWER9_PME_PM_RUN_SPURR ] = { .pme_name = "PM_RUN_SPURR", .pme_code = 0x0000010008, .pme_short_desc = "Run SPURR", .pme_long_desc = "Run SPURR", }, [ POWER9_PME_PM_S2Q_FULL ] = { .pme_name = "PM_S2Q_FULL", .pme_code = 0x000000E080, .pme_short_desc = "Cycles during which the S2Q is full", .pme_long_desc = "Cycles during which the S2Q is full", }, [ POWER9_PME_PM_SCALAR_FLOP_CMPL ] = { .pme_name = "PM_SCALAR_FLOP_CMPL", .pme_code = 0x0000045056, .pme_short_desc = "Scalar flop operation completed", .pme_long_desc = "Scalar flop operation completed", }, [ POWER9_PME_PM_SHL_CREATED ] = { .pme_name = "PM_SHL_CREATED", .pme_code = 0x000000508C, .pme_short_desc = "Store-Hit-Load Table Entry Created", .pme_long_desc = "Store-Hit-Load Table Entry Created", }, [ POWER9_PME_PM_SHL_ST_DEP_CREATED ] = { .pme_name = "PM_SHL_ST_DEP_CREATED", .pme_code = 0x000000588C, .pme_short_desc = "Store-Hit-Load Table Read Hit with entry Enabled", .pme_long_desc = "Store-Hit-Load Table Read Hit with entry Enabled", }, [ POWER9_PME_PM_SHL_ST_DISABLE ] = { .pme_name = "PM_SHL_ST_DISABLE", .pme_code = 0x0000005090, .pme_short_desc = "Store-Hit-Load Table Read Hit with entry Disabled (entry was disabled due to the entry shown to not prevent the flush)", .pme_long_desc = "Store-Hit-Load Table Read Hit with entry Disabled (entry was disabled due to the entry shown to not prevent the flush)", }, [ POWER9_PME_PM_SLB_TABLEWALK_CYC ] = { .pme_name = "PM_SLB_TABLEWALK_CYC", .pme_code = 0x000000F09C, .pme_short_desc = "Cycles when a tablewalk is pending on this thread on the SLB table", .pme_long_desc = "Cycles when a tablewalk is pending on this thread on the SLB table", }, [ 
POWER9_PME_PM_SN0_BUSY ] = { .pme_name = "PM_SN0_BUSY", .pme_code = 0x0000016090, .pme_short_desc = "SN mach 0 Busy.", .pme_long_desc = "SN mach 0 Busy. Used by PMU to sample ave SN lifetime (mach0 used as sample point)", }, [ POWER9_PME_PM_SN0_BUSY_ALT ] = { .pme_name = "PM_SN0_BUSY_ALT", .pme_code = 0x0000026090, .pme_short_desc = "SN mach 0 Busy.", .pme_long_desc = "SN mach 0 Busy. Used by PMU to sample ave SN lifetime (mach0 used as sample point)", }, [ POWER9_PME_PM_SN_HIT ] = { .pme_name = "PM_SN_HIT", .pme_code = 0x00000460A8, .pme_short_desc = "Any port snooper hit L3.", .pme_long_desc = "Any port snooper hit L3. Up to 4 can happen in a cycle but we only count 1", }, [ POWER9_PME_PM_SN_INVL ] = { .pme_name = "PM_SN_INVL", .pme_code = 0x00000368A8, .pme_short_desc = "Any port snooper detects a store to a line in the Sx state and invalidates the line.", .pme_long_desc = "Any port snooper detects a store to a line in the Sx state and invalidates the line. Up to 4 can happen in a cycle but we only count 1", }, [ POWER9_PME_PM_SN_MISS ] = { .pme_name = "PM_SN_MISS", .pme_code = 0x00000468A8, .pme_short_desc = "Any port snooper L3 miss or collision.", .pme_long_desc = "Any port snooper L3 miss or collision. 
Up to 4 can happen in a cycle but we only count 1", }, [ POWER9_PME_PM_SNOOP_TLBIE ] = { .pme_name = "PM_SNOOP_TLBIE", .pme_code = 0x000000F880, .pme_short_desc = "TLBIE snoop", .pme_long_desc = "TLBIE snoop", }, [ POWER9_PME_PM_SNP_TM_HIT_M ] = { .pme_name = "PM_SNP_TM_HIT_M", .pme_code = 0x00000360A6, .pme_short_desc = "Snp TM st hit M/Mu", .pme_long_desc = "Snp TM st hit M/Mu", }, [ POWER9_PME_PM_SNP_TM_HIT_T ] = { .pme_name = "PM_SNP_TM_HIT_T", .pme_code = 0x00000368A6, .pme_short_desc = "Snp TM st hit T/Tn/Te", .pme_long_desc = "Snp TM st hit T/Tn/Te", }, [ POWER9_PME_PM_SN_USAGE ] = { .pme_name = "PM_SN_USAGE", .pme_code = 0x000003688C, .pme_short_desc = "Continuous 16 cycle (2to1) window where this signal rotates through sampling each SN machine busy.", .pme_long_desc = "Continuous 16 cycle (2to1) window where this signal rotates through sampling each SN machine busy. PMU uses this wave to then do 16 cyc count to sample total number of machs running", }, [ POWER9_PME_PM_SP_FLOP_CMPL ] = { .pme_name = "PM_SP_FLOP_CMPL", .pme_code = 0x000004505A, .pme_short_desc = "SP instruction completed", .pme_long_desc = "SP instruction completed", }, [ POWER9_PME_PM_SRQ_EMPTY_CYC ] = { .pme_name = "PM_SRQ_EMPTY_CYC", .pme_code = 0x0000040008, .pme_short_desc = "Cycles in which the SRQ has at least one (out of four) empty slice", .pme_long_desc = "Cycles in which the SRQ has at least one (out of four) empty slice", }, [ POWER9_PME_PM_SRQ_SYNC_CYC ] = { .pme_name = "PM_SRQ_SYNC_CYC", .pme_code = 0x000000D0AC, .pme_short_desc = "A sync is in the S2Q (edge detect to count)", .pme_long_desc = "A sync is in the S2Q (edge detect to count)", }, [ POWER9_PME_PM_STALL_END_ICT_EMPTY ] = { .pme_name = "PM_STALL_END_ICT_EMPTY", .pme_code = 0x0000010028, .pme_short_desc = "The number of times the core transitioned from a stall to ICT-empty for this thread", .pme_long_desc = "The number of times the core transitioned from a stall to ICT-empty for this thread", }, [ POWER9_PME_PM_ST_CAUSED_FAIL 
] = { .pme_name = "PM_ST_CAUSED_FAIL", .pme_code = 0x000001608E, .pme_short_desc = "Non-TM Store caused any thread to fail", .pme_long_desc = "Non-TM Store caused any thread to fail", }, [ POWER9_PME_PM_ST_CMPL ] = { .pme_name = "PM_ST_CMPL", .pme_code = 0x00000200F0, .pme_short_desc = "Stores completed from S2Q (2nd-level store queue).", .pme_long_desc = "Stores completed from S2Q (2nd-level store queue).", }, [ POWER9_PME_PM_STCX_FAIL ] = { .pme_name = "PM_STCX_FAIL", .pme_code = 0x000001E058, .pme_short_desc = "stcx failed", .pme_long_desc = "stcx failed", }, [ POWER9_PME_PM_STCX_FIN ] = { .pme_name = "PM_STCX_FIN", .pme_code = 0x000002E014, .pme_short_desc = "Number of stcx instructions finished.", .pme_long_desc = "Number of stcx instructions finished. This includes instructions in the speculative path of a branch that may be flushed", }, [ POWER9_PME_PM_STCX_SUCCESS_CMPL ] = { .pme_name = "PM_STCX_SUCCESS_CMPL", .pme_code = 0x000000C8BC, .pme_short_desc = "Number of stcx instructions that completed successfully", .pme_long_desc = "Number of stcx instructions that completed successfully", }, [ POWER9_PME_PM_ST_FIN ] = { .pme_name = "PM_ST_FIN", .pme_code = 0x0000020016, .pme_short_desc = "Store finish count.", .pme_long_desc = "Store finish count. 
Includes speculative activity", }, [ POWER9_PME_PM_ST_FWD ] = { .pme_name = "PM_ST_FWD", .pme_code = 0x0000020018, .pme_short_desc = "Store forwards that finished", .pme_long_desc = "Store forwards that finished", }, [ POWER9_PME_PM_ST_MISS_L1 ] = { .pme_name = "PM_ST_MISS_L1", .pme_code = 0x00000300F0, .pme_short_desc = "Store Missed L1", .pme_long_desc = "Store Missed L1", }, [ POWER9_PME_PM_STOP_FETCH_PENDING_CYC ] = { .pme_name = "PM_STOP_FETCH_PENDING_CYC", .pme_code = 0x00000048A4, .pme_short_desc = "Fetching is stopped due to an incoming instruction that will result in a flush", .pme_long_desc = "Fetching is stopped due to an incoming instruction that will result in a flush", }, /* See also alternate entries for 0000010000 / POWER9_PME_PM_SUSPENDED with code(s) 0000020000 0000030000 0000040000 at the bottom of this table. \n */ [ POWER9_PME_PM_SUSPENDED ] = { .pme_name = "PM_SUSPENDED", .pme_code = 0x0000010000, .pme_short_desc = "Counter OFF", .pme_long_desc = "Counter OFF", }, [ POWER9_PME_PM_SYNC_MRK_BR_LINK ] = { .pme_name = "PM_SYNC_MRK_BR_LINK", .pme_code = 0x0000015152, .pme_short_desc = "Marked Branch and link branch that can cause a synchronous interrupt", .pme_long_desc = "Marked Branch and link branch that can cause a synchronous interrupt", }, [ POWER9_PME_PM_SYNC_MRK_BR_MPRED ] = { .pme_name = "PM_SYNC_MRK_BR_MPRED", .pme_code = 0x000001515C, .pme_short_desc = "Marked Branch mispredict that can cause a synchronous interrupt", .pme_long_desc = "Marked Branch mispredict that can cause a synchronous interrupt", }, [ POWER9_PME_PM_SYNC_MRK_FX_DIVIDE ] = { .pme_name = "PM_SYNC_MRK_FX_DIVIDE", .pme_code = 0x0000015156, .pme_short_desc = "Marked fixed point divide that can cause a synchronous interrupt", .pme_long_desc = "Marked fixed point divide that can cause a synchronous interrupt", }, [ POWER9_PME_PM_SYNC_MRK_L2HIT ] = { .pme_name = "PM_SYNC_MRK_L2HIT", .pme_code = 0x0000015158, .pme_short_desc = "Marked L2 Hits that can throw a synchronous 
interrupt", .pme_long_desc = "Marked L2 Hits that can throw a synchronous interrupt", }, [ POWER9_PME_PM_SYNC_MRK_L2MISS ] = { .pme_name = "PM_SYNC_MRK_L2MISS", .pme_code = 0x000001515A, .pme_short_desc = "Marked L2 Miss that can throw a synchronous interrupt", .pme_long_desc = "Marked L2 Miss that can throw a synchronous interrupt", }, [ POWER9_PME_PM_SYNC_MRK_L3MISS ] = { .pme_name = "PM_SYNC_MRK_L3MISS", .pme_code = 0x0000015154, .pme_short_desc = "Marked L3 misses that can throw a synchronous interrupt", .pme_long_desc = "Marked L3 misses that can throw a synchronous interrupt", }, [ POWER9_PME_PM_SYNC_MRK_PROBE_NOP ] = { .pme_name = "PM_SYNC_MRK_PROBE_NOP", .pme_code = 0x0000015150, .pme_short_desc = "Marked probeNops which can cause synchronous interrupts", .pme_long_desc = "Marked probeNops which can cause synchronous interrupts", }, [ POWER9_PME_PM_SYS_PUMP_CPRED ] = { .pme_name = "PM_SYS_PUMP_CPRED", .pme_code = 0x0000030050, .pme_short_desc = "Initial and Final Pump Scope was system pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", .pme_long_desc = "Initial and Final Pump Scope was system pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", }, [ POWER9_PME_PM_SYS_PUMP_MPRED_RTY ] = { .pme_name = "PM_SYS_PUMP_MPRED_RTY", .pme_code = 0x0000040050, .pme_short_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", .pme_long_desc = "Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", }, [ POWER9_PME_PM_SYS_PUMP_MPRED ] = { .pme_name = "PM_SYS_PUMP_MPRED", .pme_code = 0x0000030052, .pme_short_desc = "Final Pump Scope (system) mispredicted.", .pme_long_desc = "Final Pump Scope (system) mispredicted. 
Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. Counts for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)", }, [ POWER9_PME_PM_TABLEWALK_CYC_PREF ] = { .pme_name = "PM_TABLEWALK_CYC_PREF", .pme_code = 0x000000F884, .pme_short_desc = "tablewalk qualified for pte prefetches", .pme_long_desc = "tablewalk qualified for pte prefetches", }, [ POWER9_PME_PM_TABLEWALK_CYC ] = { .pme_name = "PM_TABLEWALK_CYC", .pme_code = 0x0000010026, .pme_short_desc = "Cycles when an instruction tablewalk is active", .pme_long_desc = "Cycles when an instruction tablewalk is active", }, [ POWER9_PME_PM_TAGE_CORRECT_TAKEN_CMPL ] = { .pme_name = "PM_TAGE_CORRECT_TAKEN_CMPL", .pme_code = 0x00000050B4, .pme_short_desc = "The TAGE overrode BHT direction prediction and it was correct.", .pme_long_desc = "The TAGE overrode BHT direction prediction and it was correct. Counted at completion for taken branches only", }, [ POWER9_PME_PM_TAGE_CORRECT ] = { .pme_name = "PM_TAGE_CORRECT", .pme_code = 0x00000058B4, .pme_short_desc = "The TAGE overrode BHT direction prediction and it was correct.", .pme_long_desc = "The TAGE overrode BHT direction prediction and it was correct. Includes taken and not taken and is counted at execution time", }, [ POWER9_PME_PM_TAGE_OVERRIDE_WRONG_SPEC ] = { .pme_name = "PM_TAGE_OVERRIDE_WRONG_SPEC", .pme_code = 0x00000058B8, .pme_short_desc = "The TAGE overrode BHT direction prediction and it was correct.", .pme_long_desc = "The TAGE overrode BHT direction prediction and it was correct. Includes taken and not taken and is counted at execution time", }, [ POWER9_PME_PM_TAGE_OVERRIDE_WRONG ] = { .pme_name = "PM_TAGE_OVERRIDE_WRONG", .pme_code = 0x00000050B8, .pme_short_desc = "The TAGE overrode BHT direction prediction but it was incorrect.", .pme_long_desc = "The TAGE overrode BHT direction prediction but it was incorrect. 
Counted at completion for taken branches only", }, [ POWER9_PME_PM_TAKEN_BR_MPRED_CMPL ] = { .pme_name = "PM_TAKEN_BR_MPRED_CMPL", .pme_code = 0x0000020056, .pme_short_desc = "Total number of taken branches that were incorrectly predicted as not-taken.", .pme_long_desc = "Total number of taken branches that were incorrectly predicted as not-taken. This event counts branches completed and does not include speculative instructions", }, [ POWER9_PME_PM_TB_BIT_TRANS ] = { .pme_name = "PM_TB_BIT_TRANS", .pme_code = 0x00000300F8, .pme_short_desc = "timebase event", .pme_long_desc = "timebase event", }, [ POWER9_PME_PM_TEND_PEND_CYC ] = { .pme_name = "PM_TEND_PEND_CYC", .pme_code = 0x000000E8B0, .pme_short_desc = "TEND latency per thread", .pme_long_desc = "TEND latency per thread", }, [ POWER9_PME_PM_THRD_ALL_RUN_CYC ] = { .pme_name = "PM_THRD_ALL_RUN_CYC", .pme_code = 0x000002000C, .pme_short_desc = "Cycles in which all the threads have the run latch set", .pme_long_desc = "Cycles in which all the threads have the run latch set", }, [ POWER9_PME_PM_THRD_CONC_RUN_INST ] = { .pme_name = "PM_THRD_CONC_RUN_INST", .pme_code = 0x00000300F4, .pme_short_desc = "PPC Instructions Finished by this thread when all threads in the core had the run-latch set", .pme_long_desc = "PPC Instructions Finished by this thread when all threads in the core had the run-latch set", }, [ POWER9_PME_PM_THRD_PRIO_0_1_CYC ] = { .pme_name = "PM_THRD_PRIO_0_1_CYC", .pme_code = 0x00000040BC, .pme_short_desc = "Cycles thread running at priority level 0 or 1", .pme_long_desc = "Cycles thread running at priority level 0 or 1", }, [ POWER9_PME_PM_THRD_PRIO_2_3_CYC ] = { .pme_name = "PM_THRD_PRIO_2_3_CYC", .pme_code = 0x00000048BC, .pme_short_desc = "Cycles thread running at priority level 2 or 3", .pme_long_desc = "Cycles thread running at priority level 2 or 3", }, [ POWER9_PME_PM_THRD_PRIO_4_5_CYC ] = { .pme_name = "PM_THRD_PRIO_4_5_CYC", .pme_code = 0x0000005080, .pme_short_desc = "Cycles thread running 
at priority level 4 or 5", .pme_long_desc = "Cycles thread running at priority level 4 or 5", }, [ POWER9_PME_PM_THRD_PRIO_6_7_CYC ] = { .pme_name = "PM_THRD_PRIO_6_7_CYC", .pme_code = 0x0000005880, .pme_short_desc = "Cycles thread running at priority level 6 or 7", .pme_long_desc = "Cycles thread running at priority level 6 or 7", }, [ POWER9_PME_PM_THRESH_ACC ] = { .pme_name = "PM_THRESH_ACC", .pme_code = 0x0000024154, .pme_short_desc = "This event increments every time the threshold event counter ticks.", .pme_long_desc = "This event increments every time the threshold event counter ticks. Thresholding must be enabled (via MMCRA) and the thresholding start event must occur for this counter to increment. It will stop incrementing when the thresholding stop event occurs or when thresholding is disabled, until the next time a configured thresholding start event occurs.", }, [ POWER9_PME_PM_THRESH_EXC_1024 ] = { .pme_name = "PM_THRESH_EXC_1024", .pme_code = 0x00000301EA, .pme_short_desc = "Threshold counter exceeded a value of 1024", .pme_long_desc = "Threshold counter exceeded a value of 1024", }, [ POWER9_PME_PM_THRESH_EXC_128 ] = { .pme_name = "PM_THRESH_EXC_128", .pme_code = 0x00000401EA, .pme_short_desc = "Threshold counter exceeded a value of 128", .pme_long_desc = "Threshold counter exceeded a value of 128", }, [ POWER9_PME_PM_THRESH_EXC_2048 ] = { .pme_name = "PM_THRESH_EXC_2048", .pme_code = 0x00000401EC, .pme_short_desc = "Threshold counter exceeded a value of 2048", .pme_long_desc = "Threshold counter exceeded a value of 2048", }, [ POWER9_PME_PM_THRESH_EXC_256 ] = { .pme_name = "PM_THRESH_EXC_256", .pme_code = 0x00000101E8, .pme_short_desc = "Threshold counter exceed a count of 256", .pme_long_desc = "Threshold counter exceed a count of 256", }, [ POWER9_PME_PM_THRESH_EXC_32 ] = { .pme_name = "PM_THRESH_EXC_32", .pme_code = 0x00000201E6, .pme_short_desc = "Threshold counter exceeded a value of 32", .pme_long_desc = "Threshold counter exceeded a value of 
32", }, [ POWER9_PME_PM_THRESH_EXC_4096 ] = { .pme_name = "PM_THRESH_EXC_4096", .pme_code = 0x00000101E6, .pme_short_desc = "Threshold counter exceed a count of 4096", .pme_long_desc = "Threshold counter exceed a count of 4096", }, [ POWER9_PME_PM_THRESH_EXC_512 ] = { .pme_name = "PM_THRESH_EXC_512", .pme_code = 0x00000201E8, .pme_short_desc = "Threshold counter exceeded a value of 512", .pme_long_desc = "Threshold counter exceeded a value of 512", }, [ POWER9_PME_PM_THRESH_EXC_64 ] = { .pme_name = "PM_THRESH_EXC_64", .pme_code = 0x00000301E8, .pme_short_desc = "Threshold counter exceeded a value of 64", .pme_long_desc = "Threshold counter exceeded a value of 64", }, [ POWER9_PME_PM_THRESH_MET ] = { .pme_name = "PM_THRESH_MET", .pme_code = 0x00000101EC, .pme_short_desc = "threshold exceeded", .pme_long_desc = "threshold exceeded", }, [ POWER9_PME_PM_THRESH_NOT_MET ] = { .pme_name = "PM_THRESH_NOT_MET", .pme_code = 0x000004016E, .pme_short_desc = "Threshold counter did not meet threshold", .pme_long_desc = "Threshold counter did not meet threshold", }, [ POWER9_PME_PM_TLB_HIT ] = { .pme_name = "PM_TLB_HIT", .pme_code = 0x000001F054, .pme_short_desc = "Number of times the TLB had the data required by the instruction.", .pme_long_desc = "Number of times the TLB had the data required by the instruction. 
Applies to both HPT and RPT", }, [ POWER9_PME_PM_TLBIE_FIN ] = { .pme_name = "PM_TLBIE_FIN", .pme_code = 0x0000030058, .pme_short_desc = "tlbie finished", .pme_long_desc = "tlbie finished", }, [ POWER9_PME_PM_TLB_MISS ] = { .pme_name = "PM_TLB_MISS", .pme_code = 0x0000020066, .pme_short_desc = "TLB Miss (I + D)", .pme_long_desc = "TLB Miss (I + D)", }, [ POWER9_PME_PM_TM_ABORTS ] = { .pme_name = "PM_TM_ABORTS", .pme_code = 0x0000030056, .pme_short_desc = "Number of TM transactions aborted", .pme_long_desc = "Number of TM transactions aborted", }, [ POWER9_PME_PM_TMA_REQ_L2 ] = { .pme_name = "PM_TMA_REQ_L2", .pme_code = 0x000000E0A4, .pme_short_desc = "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding", .pme_long_desc = "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding", }, [ POWER9_PME_PM_TM_CAM_OVERFLOW ] = { .pme_name = "PM_TM_CAM_OVERFLOW", .pme_code = 0x00000168A6, .pme_short_desc = "L3 TM cam overflow during L2 co of SC", .pme_long_desc = "L3 TM cam overflow during L2 co of SC", }, [ POWER9_PME_PM_TM_CAP_OVERFLOW ] = { .pme_name = "PM_TM_CAP_OVERFLOW", .pme_code = 0x000004608E, .pme_short_desc = "TM Footprint Capacity Overflow", .pme_long_desc = "TM Footprint Capacity Overflow", }, [ POWER9_PME_PM_TM_FAIL_CONF_NON_TM ] = { .pme_name = "PM_TM_FAIL_CONF_NON_TM", .pme_code = 0x00000028A8, .pme_short_desc = "TM aborted because a conflict occurred with a non-transactional access by another processor", .pme_long_desc = "TM aborted because a conflict occurred with a non-transactional access by another processor", }, [ POWER9_PME_PM_TM_FAIL_CONF_TM ] = { .pme_name = "PM_TM_FAIL_CONF_TM", .pme_code = 0x00000020AC, .pme_short_desc = "TM aborted because a conflict occurred with another transaction.", .pme_long_desc = "TM aborted because a conflict occurred with another transaction.", }, [ POWER9_PME_PM_TM_FAIL_FOOTPRINT_OVERFLOW ] = { .pme_name = "PM_TM_FAIL_FOOTPRINT_OVERFLOW", .pme_code 
= 0x00000020A8, .pme_short_desc = "TM aborted because the tracking limit for transactional storage accesses was exceeded.", .pme_long_desc = "TM aborted because the tracking limit for transactional storage accesses was exceeded. Asynchronous", }, [ POWER9_PME_PM_TM_FAIL_NON_TX_CONFLICT ] = { .pme_name = "PM_TM_FAIL_NON_TX_CONFLICT", .pme_code = 0x000000E0B0, .pme_short_desc = "Non transactional conflict from LSU, gets reported to TEXASR", .pme_long_desc = "Non transactional conflict from LSU, gets reported to TEXASR", }, [ POWER9_PME_PM_TM_FAIL_SELF ] = { .pme_name = "PM_TM_FAIL_SELF", .pme_code = 0x00000028AC, .pme_short_desc = "TM aborted because a self-induced conflict occurred in Suspended state, due to one of the following: a store to a storage location that was previously accessed transactionally; a dcbf, dcbi, or icbi specifying a block that was previously accessed transactionally; a dcbst specifying a block that was previously written transactionally; or a tlbie that specifies a translation that was previously used transactionally", .pme_long_desc = "TM aborted because a self-induced conflict occurred in Suspended state, due to one of the following: a store to a storage location that was previously accessed transactionally; a dcbf, dcbi, or icbi specifying a block that was previously accessed transactionally; a dcbst specifying a block that was previously written transactionally; or a tlbie that specifies a translation that was previously used transactionally", }, [ POWER9_PME_PM_TM_FAIL_TLBIE ] = { .pme_name = "PM_TM_FAIL_TLBIE", .pme_code = 0x000000E0AC, .pme_short_desc = "Transaction failed because there was a TLBIE hit in the bloom filter", .pme_long_desc = "Transaction failed because there was a TLBIE hit in the bloom filter", }, [ POWER9_PME_PM_TM_FAIL_TX_CONFLICT ] = { .pme_name = "PM_TM_FAIL_TX_CONFLICT", .pme_code = 0x000000E8AC, .pme_short_desc = "Transactional conflict from LSU, gets reported to TEXASR", .pme_long_desc = "Transactional
conflict from LSU, gets reported to TEXASR", }, [ POWER9_PME_PM_TM_FAV_CAUSED_FAIL ] = { .pme_name = "PM_TM_FAV_CAUSED_FAIL", .pme_code = 0x000002688E, .pme_short_desc = "TM Load (fav) caused another thread to fail", .pme_long_desc = "TM Load (fav) caused another thread to fail", }, [ POWER9_PME_PM_TM_FAV_TBEGIN ] = { .pme_name = "PM_TM_FAV_TBEGIN", .pme_code = 0x000000209C, .pme_short_desc = "Dispatch time Favored tbegin", .pme_long_desc = "Dispatch time Favored tbegin", }, [ POWER9_PME_PM_TM_LD_CAUSED_FAIL ] = { .pme_name = "PM_TM_LD_CAUSED_FAIL", .pme_code = 0x000001688E, .pme_short_desc = "Non-TM Load caused any thread to fail", .pme_long_desc = "Non-TM Load caused any thread to fail", }, [ POWER9_PME_PM_TM_LD_CONF ] = { .pme_name = "PM_TM_LD_CONF", .pme_code = 0x000002608E, .pme_short_desc = "TM Load (fav or non-fav) ran into conflict (failed)", .pme_long_desc = "TM Load (fav or non-fav) ran into conflict (failed)", }, [ POWER9_PME_PM_TM_NESTED_TBEGIN ] = { .pme_name = "PM_TM_NESTED_TBEGIN", .pme_code = 0x00000020A0, .pme_short_desc = "Completion Tm nested tbegin", .pme_long_desc = "Completion Tm nested tbegin", }, [ POWER9_PME_PM_TM_NESTED_TEND ] = { .pme_name = "PM_TM_NESTED_TEND", .pme_code = 0x0000002098, .pme_short_desc = "Completion time nested tend", .pme_long_desc = "Completion time nested tend", }, [ POWER9_PME_PM_TM_NON_FAV_TBEGIN ] = { .pme_name = "PM_TM_NON_FAV_TBEGIN", .pme_code = 0x000000289C, .pme_short_desc = "Dispatch time non favored tbegin", .pme_long_desc = "Dispatch time non favored tbegin", }, [ POWER9_PME_PM_TM_OUTER_TBEGIN_DISP ] = { .pme_name = "PM_TM_OUTER_TBEGIN_DISP", .pme_code = 0x000004E05E, .pme_short_desc = "Number of outer tbegin instructions dispatched.", .pme_long_desc = "Number of outer tbegin instructions dispatched. The dispatch unit determines whether the tbegin instruction is outer or nested. 
This is a speculative count, which includes flushed instructions", }, [ POWER9_PME_PM_TM_OUTER_TBEGIN ] = { .pme_name = "PM_TM_OUTER_TBEGIN", .pme_code = 0x0000002094, .pme_short_desc = "Completion time outer tbegin", .pme_long_desc = "Completion time outer tbegin", }, [ POWER9_PME_PM_TM_OUTER_TEND ] = { .pme_name = "PM_TM_OUTER_TEND", .pme_code = 0x0000002894, .pme_short_desc = "Completion time outer tend", .pme_long_desc = "Completion time outer tend", }, [ POWER9_PME_PM_TM_PASSED ] = { .pme_name = "PM_TM_PASSED", .pme_code = 0x000002E052, .pme_short_desc = "Number of TM transactions that passed", .pme_long_desc = "Number of TM transactions that passed", }, [ POWER9_PME_PM_TM_RST_SC ] = { .pme_name = "PM_TM_RST_SC", .pme_code = 0x00000268A6, .pme_short_desc = "TM-snp rst RM SC", .pme_long_desc = "TM-snp rst RM SC", }, [ POWER9_PME_PM_TM_SC_CO ] = { .pme_name = "PM_TM_SC_CO", .pme_code = 0x00000160A6, .pme_short_desc = "L3 castout TM SC line", .pme_long_desc = "L3 castout TM SC line", }, [ POWER9_PME_PM_TM_ST_CAUSED_FAIL ] = { .pme_name = "PM_TM_ST_CAUSED_FAIL", .pme_code = 0x000003688E, .pme_short_desc = "TM Store (fav or non-fav) caused another thread to fail", .pme_long_desc = "TM Store (fav or non-fav) caused another thread to fail", }, [ POWER9_PME_PM_TM_ST_CONF ] = { .pme_name = "PM_TM_ST_CONF", .pme_code = 0x000003608E, .pme_short_desc = "TM Store (fav or non-fav) ran into conflict (failed)", .pme_long_desc = "TM Store (fav or non-fav) ran into conflict (failed)", }, [ POWER9_PME_PM_TM_TABORT_TRECLAIM ] = { .pme_name = "PM_TM_TABORT_TRECLAIM", .pme_code = 0x0000002898, .pme_short_desc = "Completion time tabortnoncd, tabortcd, treclaim", .pme_long_desc = "Completion time tabortnoncd, tabortcd, treclaim", }, [ POWER9_PME_PM_TM_TRANS_RUN_CYC ] = { .pme_name = "PM_TM_TRANS_RUN_CYC", .pme_code = 0x0000010060, .pme_short_desc = "run cycles in transactional state", .pme_long_desc = "run cycles in transactional state", }, [ POWER9_PME_PM_TM_TRANS_RUN_INST ] = { 
.pme_name = "PM_TM_TRANS_RUN_INST", .pme_code = 0x0000030060, .pme_short_desc = "Run instructions completed in transactional state (gated by the run latch)", .pme_long_desc = "Run instructions completed in transactional state (gated by the run latch)", }, [ POWER9_PME_PM_TM_TRESUME ] = { .pme_name = "PM_TM_TRESUME", .pme_code = 0x00000020A4, .pme_short_desc = "TM resume instruction completed", .pme_long_desc = "TM resume instruction completed", }, [ POWER9_PME_PM_TM_TSUSPEND ] = { .pme_name = "PM_TM_TSUSPEND", .pme_code = 0x00000028A0, .pme_short_desc = "TM suspend instruction completed", .pme_long_desc = "TM suspend instruction completed", }, [ POWER9_PME_PM_TM_TX_PASS_RUN_CYC ] = { .pme_name = "PM_TM_TX_PASS_RUN_CYC", .pme_code = 0x000002E012, .pme_short_desc = "cycles spent in successful transactions", .pme_long_desc = "cycles spent in successful transactions", }, [ POWER9_PME_PM_TM_TX_PASS_RUN_INST ] = { .pme_name = "PM_TM_TX_PASS_RUN_INST", .pme_code = 0x000004E014, .pme_short_desc = "Run instructions spent in successful transactions", .pme_long_desc = "Run instructions spent in successful transactions", }, [ POWER9_PME_PM_VECTOR_FLOP_CMPL ] = { .pme_name = "PM_VECTOR_FLOP_CMPL", .pme_code = 0x000004D058, .pme_short_desc = "Vector FP instruction completed", .pme_long_desc = "Vector FP instruction completed", }, [ POWER9_PME_PM_VECTOR_LD_CMPL ] = { .pme_name = "PM_VECTOR_LD_CMPL", .pme_code = 0x0000044054, .pme_short_desc = "Number of vector load instructions completed", .pme_long_desc = "Number of vector load instructions completed", }, [ POWER9_PME_PM_VECTOR_ST_CMPL ] = { .pme_name = "PM_VECTOR_ST_CMPL", .pme_code = 0x0000044056, .pme_short_desc = "Number of vector store instructions completed", .pme_long_desc = "Number of vector store instructions completed", }, [ POWER9_PME_PM_VSU_DP_FSQRT_FDIV ] = { .pme_name = "PM_VSU_DP_FSQRT_FDIV", .pme_code = 0x000003D058, .pme_short_desc = "vector versions of fdiv,fsqrt", .pme_long_desc = "vector versions of 
fdiv,fsqrt", }, [ POWER9_PME_PM_VSU_FIN ] = { .pme_name = "PM_VSU_FIN", .pme_code = 0x000002505C, .pme_short_desc = "VSU instruction finished.", .pme_long_desc = "VSU instruction finished. Up to 4 per cycle", }, [ POWER9_PME_PM_VSU_FSQRT_FDIV ] = { .pme_name = "PM_VSU_FSQRT_FDIV", .pme_code = 0x000004D04E, .pme_short_desc = "four flops operation (fdiv,fsqrt) Scalar Instructions only", .pme_long_desc = "four flops operation (fdiv,fsqrt) Scalar Instructions only", }, [ POWER9_PME_PM_VSU_NON_FLOP_CMPL ] = { .pme_name = "PM_VSU_NON_FLOP_CMPL", .pme_code = 0x000004D050, .pme_short_desc = "Non FLOP operation completed", .pme_long_desc = "Non FLOP operation completed", }, [ POWER9_PME_PM_XLATE_HPT_MODE ] = { .pme_name = "PM_XLATE_HPT_MODE", .pme_code = 0x000000F098, .pme_short_desc = "LSU reports every cycle the thread is in HPT translation mode (as opposed to radix mode)", .pme_long_desc = "LSU reports every cycle the thread is in HPT translation mode (as opposed to radix mode)", }, [ POWER9_PME_PM_XLATE_MISS ] = { .pme_name = "PM_XLATE_MISS", .pme_code = 0x000000F89C, .pme_short_desc = "The LSU requested a line from L2 for translation.", .pme_long_desc = "The LSU requested a line from L2 for translation. It may be satisfied from any source beyond L2. 
Includes speculative instructions", }, [ POWER9_PME_PM_XLATE_RADIX_MODE ] = { .pme_name = "PM_XLATE_RADIX_MODE", .pme_code = 0x000000F898, .pme_short_desc = "LSU reports every cycle the thread is in radix translation mode (as opposed to HPT mode)", .pme_long_desc = "LSU reports every cycle the thread is in radix translation mode (as opposed to HPT mode)", }, [ POWER9_PME_PM_BR_2PATH_ALT ] = { .pme_name = "PM_BR_2PATH_ALT", .pme_code = 0x0000040036, .pme_short_desc = "Branches that are not strongly biased", .pme_long_desc = "Branches that are not strongly biased", }, [ POWER9_PME_PM_CYC_ALT ] = { .pme_name = "PM_CYC_ALT", .pme_code = 0x000002001E, .pme_short_desc = "Processor cycles", .pme_long_desc = "Processor cycles", }, [ POWER9_PME_PM_CYC_ALT2 ] = { .pme_name = "PM_CYC_ALT2", .pme_code = 0x000003001E, .pme_short_desc = "Processor cycles", .pme_long_desc = "Processor cycles", }, [ POWER9_PME_PM_CYC_ALT3 ] = { .pme_name = "PM_CYC_ALT3", .pme_code = 0x000004001E, .pme_short_desc = "Processor cycles", .pme_long_desc = "Processor cycles", }, [ POWER9_PME_PM_INST_CMPL_ALT ] = { .pme_name = "PM_INST_CMPL_ALT", .pme_code = 0x0000020002, .pme_short_desc = "Number of PowerPC Instructions that completed.", .pme_long_desc = "Number of PowerPC Instructions that completed.", }, [ POWER9_PME_PM_INST_CMPL_ALT2 ] = { .pme_name = "PM_INST_CMPL_ALT2", .pme_code = 0x0000030002, .pme_short_desc = "Number of PowerPC Instructions that completed.", .pme_long_desc = "Number of PowerPC Instructions that completed.", }, [ POWER9_PME_PM_INST_CMPL_ALT3 ] = { .pme_name = "PM_INST_CMPL_ALT3", .pme_code = 0x0000040002, .pme_short_desc = "Number of PowerPC Instructions that completed.", .pme_long_desc = "Number of PowerPC Instructions that completed.", }, [ POWER9_PME_PM_INST_DISP_ALT ] = { .pme_name = "PM_INST_DISP_ALT", .pme_code = 0x00000300F2, .pme_short_desc = "# PPC Dispatched", .pme_long_desc = "# PPC Dispatched", }, [ POWER9_PME_PM_LD_MISS_L1_ALT ] = { .pme_name = "PM_LD_MISS_L1_ALT", 
.pme_code = 0x00000400F0, .pme_short_desc = "Load Missed L1, counted at execution time (can be greater than loads finished).", .pme_long_desc = "Load Missed L1, counted at execution time (can be greater than loads finished). LMQ merges are not included in this count. i.e. if a load instruction misses on an address that is already allocated on the LMQ, this event will not increment for that load). Note that this count is per slice, so if a load spans multiple slices this event will increment multiple times for a single load.", }, [ POWER9_PME_PM_SUSPENDED_ALT ] = { .pme_name = "PM_SUSPENDED_ALT", .pme_code = 0x0000020000, .pme_short_desc = "Counter OFF", .pme_long_desc = "Counter OFF", }, [ POWER9_PME_PM_SUSPENDED_ALT2 ] = { .pme_name = "PM_SUSPENDED_ALT2", .pme_code = 0x0000030000, .pme_short_desc = "Counter OFF", .pme_long_desc = "Counter OFF", }, [ POWER9_PME_PM_SUSPENDED_ALT3 ] = { .pme_name = "PM_SUSPENDED_ALT3", .pme_code = 0x0000040000, .pme_short_desc = "Counter OFF", .pme_long_desc = "Counter OFF", }, /* total 957 */ }; #endif papi-papi-7-2-0-t/src/libpfm4/lib/events/powerpc_events.h000066400000000000000000000022551502707512200232560ustar00rootroot00000000000000/* * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS * IN THE SOFTWARE. * * powerpc_events.h */ #ifndef _POWERPC_EVENTS_H_ #define _POWERPC_EVENTS_H_ #define PME_INSTR_COMPLETED 1 #endif papi-papi-7-2-0-t/src/libpfm4/lib/events/powerpc_nest_events.h000066400000000000000000000044211502707512200243040ustar00rootroot00000000000000#ifndef __POWERPC_NEST_EVENTS_H__ #define __POWERPC_NEST_EVENTS_H__ #define POWERPC_PME_NEST_MCS_00 0 #define POWERPC_PME_NEST_MCS_01 1 #define POWERPC_PME_NEST_MCS_02 2 #define POWERPC_PME_NEST_MCS_03 3 static const pme_power_entry_t powerpc_nest_read_pe[] = { [ POWERPC_PME_NEST_MCS_00 ] = { .pme_name = "MCS_00", .pme_code = 0x118, .pme_short_desc = "Total Read Bandwidth seen on both MCS of MC0", .pme_long_desc = "Total Read Bandwidth seen on both MCS of MC0", }, [ POWERPC_PME_NEST_MCS_01 ] = { .pme_name = "MCS_01", .pme_code = 0x120, .pme_short_desc = "Total Read Bandwidth seen on both MCS of MC1", .pme_long_desc = "Total Read Bandwidth seen on both MCS of MC1", }, [ POWERPC_PME_NEST_MCS_02 ] = { .pme_name = "MCS_02", .pme_code = 0x128, .pme_short_desc = "Total Read Bandwidth seen on both MCS of MC2", .pme_long_desc = "Total Read Bandwidth seen on both MCS of MC2", }, [ POWERPC_PME_NEST_MCS_03 ] = { .pme_name = "MCS_03", .pme_code = 0x130, .pme_short_desc = "Total Read Bandwidth seen on both MCS of MC3", .pme_long_desc = "Total Read Bandwidth seen on both MCS of MC3", }, }; static const pme_power_entry_t powerpc_nest_write_pe[] = { [ POWERPC_PME_NEST_MCS_00 ] = { .pme_name = "MCS_00", .pme_code = 0x198, .pme_short_desc = "Total Write Bandwidth seen on both MCS of MC0", .pme_long_desc = "Total Write Bandwidth seen on both MCS of MC0", }, [ POWERPC_PME_NEST_MCS_01 ] = { .pme_name = "MCS_01", .pme_code = 0x1a0, .pme_short_desc = "Total Write Bandwidth 
seen on both MCS of MC1", .pme_long_desc = "Total Write Bandwidth seen on both MCS of MC1", }, [ POWERPC_PME_NEST_MCS_02 ] = { .pme_name = "MCS_02", .pme_code = 0x1a8, .pme_short_desc = "Total Write Bandwidth seen on both MCS of MC2", .pme_long_desc = "Total Write Bandwidth seen on both MCS of MC2", }, [ POWERPC_PME_NEST_MCS_03 ] = { .pme_name = "MCS_03", .pme_code = 0x1b0, .pme_short_desc = "Total Write Bandwidth seen on both MCS of MC3", .pme_long_desc = "Total Write Bandwidth seen on both MCS of MC3", }, }; #endif papi-papi-7-2-0-t/src/libpfm4/lib/events/ppc970_events.h000066400000000000000000002003251502707512200226170ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __PPC970_EVENTS_H__ #define __PPC970_EVENTS_H__ /* * File: ppc970_events.h * CVS: * Author: Corey Ashford * cjashfor@us.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2009. All Rights Reserved. * Contributed by Corey Ashford * * Note: This code was automatically generated and should not be modified by * hand. 
* */ #define PPC970_PME_PM_LSU_REJECT_RELOAD_CDF 0 #define PPC970_PME_PM_MRK_LSU_SRQ_INST_VALID 1 #define PPC970_PME_PM_FPU1_SINGLE 2 #define PPC970_PME_PM_FPU0_STALL3 3 #define PPC970_PME_PM_TB_BIT_TRANS 4 #define PPC970_PME_PM_GPR_MAP_FULL_CYC 5 #define PPC970_PME_PM_MRK_ST_CMPL 6 #define PPC970_PME_PM_FPU0_STF 7 #define PPC970_PME_PM_FPU1_FMA 8 #define PPC970_PME_PM_LSU1_FLUSH_ULD 9 #define PPC970_PME_PM_MRK_INST_FIN 10 #define PPC970_PME_PM_MRK_LSU0_FLUSH_UST 11 #define PPC970_PME_PM_LSU_LRQ_S0_ALLOC 12 #define PPC970_PME_PM_FPU_FDIV 13 #define PPC970_PME_PM_FPU0_FULL_CYC 14 #define PPC970_PME_PM_FPU_SINGLE 15 #define PPC970_PME_PM_FPU0_FMA 16 #define PPC970_PME_PM_MRK_LSU1_FLUSH_ULD 17 #define PPC970_PME_PM_LSU1_FLUSH_LRQ 18 #define PPC970_PME_PM_DTLB_MISS 19 #define PPC970_PME_PM_MRK_ST_MISS_L1 20 #define PPC970_PME_PM_EXT_INT 21 #define PPC970_PME_PM_MRK_LSU1_FLUSH_LRQ 22 #define PPC970_PME_PM_MRK_ST_GPS 23 #define PPC970_PME_PM_GRP_DISP_SUCCESS 24 #define PPC970_PME_PM_LSU1_LDF 25 #define PPC970_PME_PM_LSU0_SRQ_STFWD 26 #define PPC970_PME_PM_CR_MAP_FULL_CYC 27 #define PPC970_PME_PM_MRK_LSU0_FLUSH_ULD 28 #define PPC970_PME_PM_LSU_DERAT_MISS 29 #define PPC970_PME_PM_FPU0_SINGLE 30 #define PPC970_PME_PM_FPU1_FDIV 31 #define PPC970_PME_PM_FPU1_FEST 32 #define PPC970_PME_PM_FPU0_FRSP_FCONV 33 #define PPC970_PME_PM_GCT_EMPTY_SRQ_FULL 34 #define PPC970_PME_PM_MRK_ST_CMPL_INT 35 #define PPC970_PME_PM_FLUSH_BR_MPRED 36 #define PPC970_PME_PM_FXU_FIN 37 #define PPC970_PME_PM_FPU_STF 38 #define PPC970_PME_PM_DSLB_MISS 39 #define PPC970_PME_PM_FXLS1_FULL_CYC 40 #define PPC970_PME_PM_LSU_LMQ_LHR_MERGE 41 #define PPC970_PME_PM_MRK_STCX_FAIL 42 #define PPC970_PME_PM_FXU0_BUSY_FXU1_IDLE 43 #define PPC970_PME_PM_MRK_DATA_FROM_L25_SHR 44 #define PPC970_PME_PM_LSU_FLUSH_ULD 45 #define PPC970_PME_PM_MRK_BRU_FIN 46 #define PPC970_PME_PM_IERAT_XLATE_WR 47 #define PPC970_PME_PM_DATA_FROM_MEM 48 #define PPC970_PME_PM_FPR_MAP_FULL_CYC 49 #define PPC970_PME_PM_FPU1_FULL_CYC 50 
#define PPC970_PME_PM_FPU0_FIN 51 #define PPC970_PME_PM_GRP_BR_REDIR 52 #define PPC970_PME_PM_THRESH_TIMEO 53 #define PPC970_PME_PM_FPU_FSQRT 54 #define PPC970_PME_PM_MRK_LSU0_FLUSH_LRQ 55 #define PPC970_PME_PM_PMC1_OVERFLOW 56 #define PPC970_PME_PM_FXLS0_FULL_CYC 57 #define PPC970_PME_PM_FPU0_ALL 58 #define PPC970_PME_PM_DATA_TABLEWALK_CYC 59 #define PPC970_PME_PM_FPU0_FEST 60 #define PPC970_PME_PM_DATA_FROM_L25_MOD 61 #define PPC970_PME_PM_LSU0_REJECT_ERAT_MISS 62 #define PPC970_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 63 #define PPC970_PME_PM_LSU0_REJECT_RELOAD_CDF 64 #define PPC970_PME_PM_FPU_FEST 65 #define PPC970_PME_PM_0INST_FETCH 66 #define PPC970_PME_PM_LD_MISS_L1_LSU0 67 #define PPC970_PME_PM_LSU1_REJECT_RELOAD_CDF 68 #define PPC970_PME_PM_L1_PREF 69 #define PPC970_PME_PM_FPU1_STALL3 70 #define PPC970_PME_PM_BRQ_FULL_CYC 71 #define PPC970_PME_PM_PMC8_OVERFLOW 72 #define PPC970_PME_PM_PMC7_OVERFLOW 73 #define PPC970_PME_PM_WORK_HELD 74 #define PPC970_PME_PM_MRK_LD_MISS_L1_LSU0 75 #define PPC970_PME_PM_FXU_IDLE 76 #define PPC970_PME_PM_INST_CMPL 77 #define PPC970_PME_PM_LSU1_FLUSH_UST 78 #define PPC970_PME_PM_LSU0_FLUSH_ULD 79 #define PPC970_PME_PM_LSU_FLUSH 80 #define PPC970_PME_PM_INST_FROM_L2 81 #define PPC970_PME_PM_LSU1_REJECT_LMQ_FULL 82 #define PPC970_PME_PM_PMC2_OVERFLOW 83 #define PPC970_PME_PM_FPU0_DENORM 84 #define PPC970_PME_PM_FPU1_FMOV_FEST 85 #define PPC970_PME_PM_GRP_DISP_REJECT 86 #define PPC970_PME_PM_LSU_LDF 87 #define PPC970_PME_PM_INST_DISP 88 #define PPC970_PME_PM_DATA_FROM_L25_SHR 89 #define PPC970_PME_PM_L1_DCACHE_RELOAD_VALID 90 #define PPC970_PME_PM_MRK_GRP_ISSUED 91 #define PPC970_PME_PM_FPU_FMA 92 #define PPC970_PME_PM_MRK_CRU_FIN 93 #define PPC970_PME_PM_MRK_LSU1_FLUSH_UST 94 #define PPC970_PME_PM_MRK_FXU_FIN 95 #define PPC970_PME_PM_LSU1_REJECT_ERAT_MISS 96 #define PPC970_PME_PM_BR_ISSUED 97 #define PPC970_PME_PM_PMC4_OVERFLOW 98 #define PPC970_PME_PM_EE_OFF 99 #define PPC970_PME_PM_INST_FROM_L25_MOD 100 #define PPC970_PME_PM_ITLB_MISS 101 
#define PPC970_PME_PM_FXU1_BUSY_FXU0_IDLE 102 #define PPC970_PME_PM_GRP_DISP_VALID 103 #define PPC970_PME_PM_MRK_GRP_DISP 104 #define PPC970_PME_PM_LSU_FLUSH_UST 105 #define PPC970_PME_PM_FXU1_FIN 106 #define PPC970_PME_PM_GRP_CMPL 107 #define PPC970_PME_PM_FPU_FRSP_FCONV 108 #define PPC970_PME_PM_MRK_LSU0_FLUSH_SRQ 109 #define PPC970_PME_PM_LSU_LMQ_FULL_CYC 110 #define PPC970_PME_PM_ST_REF_L1_LSU0 111 #define PPC970_PME_PM_LSU0_DERAT_MISS 112 #define PPC970_PME_PM_LSU_SRQ_SYNC_CYC 113 #define PPC970_PME_PM_FPU_STALL3 114 #define PPC970_PME_PM_LSU_REJECT_ERAT_MISS 115 #define PPC970_PME_PM_MRK_DATA_FROM_L2 116 #define PPC970_PME_PM_LSU0_FLUSH_SRQ 117 #define PPC970_PME_PM_FPU0_FMOV_FEST 118 #define PPC970_PME_PM_LD_REF_L1_LSU0 119 #define PPC970_PME_PM_LSU1_FLUSH_SRQ 120 #define PPC970_PME_PM_GRP_BR_MPRED 121 #define PPC970_PME_PM_LSU_LMQ_S0_ALLOC 122 #define PPC970_PME_PM_LSU0_REJECT_LMQ_FULL 123 #define PPC970_PME_PM_ST_REF_L1 124 #define PPC970_PME_PM_MRK_VMX_FIN 125 #define PPC970_PME_PM_LSU_SRQ_EMPTY_CYC 126 #define PPC970_PME_PM_FPU1_STF 127 #define PPC970_PME_PM_RUN_CYC 128 #define PPC970_PME_PM_LSU_LMQ_S0_VALID 129 #define PPC970_PME_PM_LSU0_LDF 130 #define PPC970_PME_PM_LSU_LRQ_S0_VALID 131 #define PPC970_PME_PM_PMC3_OVERFLOW 132 #define PPC970_PME_PM_MRK_IMR_RELOAD 133 #define PPC970_PME_PM_MRK_GRP_TIMEO 134 #define PPC970_PME_PM_FPU_FMOV_FEST 135 #define PPC970_PME_PM_GRP_DISP_BLK_SB_CYC 136 #define PPC970_PME_PM_XER_MAP_FULL_CYC 137 #define PPC970_PME_PM_ST_MISS_L1 138 #define PPC970_PME_PM_STOP_COMPLETION 139 #define PPC970_PME_PM_MRK_GRP_CMPL 140 #define PPC970_PME_PM_ISLB_MISS 141 #define PPC970_PME_PM_SUSPENDED 142 #define PPC970_PME_PM_CYC 143 #define PPC970_PME_PM_LD_MISS_L1_LSU1 144 #define PPC970_PME_PM_STCX_FAIL 145 #define PPC970_PME_PM_LSU1_SRQ_STFWD 146 #define PPC970_PME_PM_GRP_DISP 147 #define PPC970_PME_PM_L2_PREF 148 #define PPC970_PME_PM_FPU1_DENORM 149 #define PPC970_PME_PM_DATA_FROM_L2 150 #define PPC970_PME_PM_FPU0_FPSCR 151 #define 
PPC970_PME_PM_MRK_DATA_FROM_L25_MOD 152 #define PPC970_PME_PM_FPU0_FSQRT 153 #define PPC970_PME_PM_LD_REF_L1 154 #define PPC970_PME_PM_MRK_L1_RELOAD_VALID 155 #define PPC970_PME_PM_1PLUS_PPC_CMPL 156 #define PPC970_PME_PM_INST_FROM_L1 157 #define PPC970_PME_PM_EE_OFF_EXT_INT 158 #define PPC970_PME_PM_PMC6_OVERFLOW 159 #define PPC970_PME_PM_LSU_LRQ_FULL_CYC 160 #define PPC970_PME_PM_IC_PREF_INSTALL 161 #define PPC970_PME_PM_DC_PREF_OUT_OF_STREAMS 162 #define PPC970_PME_PM_MRK_LSU1_FLUSH_SRQ 163 #define PPC970_PME_PM_GCT_FULL_CYC 164 #define PPC970_PME_PM_INST_FROM_MEM 165 #define PPC970_PME_PM_FLUSH_LSU_BR_MPRED 166 #define PPC970_PME_PM_FXU_BUSY 167 #define PPC970_PME_PM_ST_REF_L1_LSU1 168 #define PPC970_PME_PM_MRK_LD_MISS_L1 169 #define PPC970_PME_PM_L1_WRITE_CYC 170 #define PPC970_PME_PM_LSU_REJECT_LMQ_FULL 171 #define PPC970_PME_PM_FPU_ALL 172 #define PPC970_PME_PM_LSU_SRQ_S0_ALLOC 173 #define PPC970_PME_PM_INST_FROM_L25_SHR 174 #define PPC970_PME_PM_GRP_MRK 175 #define PPC970_PME_PM_BR_MPRED_CR 176 #define PPC970_PME_PM_DC_PREF_STREAM_ALLOC 177 #define PPC970_PME_PM_FPU1_FIN 178 #define PPC970_PME_PM_LSU_REJECT_SRQ 179 #define PPC970_PME_PM_BR_MPRED_TA 180 #define PPC970_PME_PM_CRQ_FULL_CYC 181 #define PPC970_PME_PM_LD_MISS_L1 182 #define PPC970_PME_PM_INST_FROM_PREF 183 #define PPC970_PME_PM_STCX_PASS 184 #define PPC970_PME_PM_DC_INV_L2 185 #define PPC970_PME_PM_LSU_SRQ_FULL_CYC 186 #define PPC970_PME_PM_LSU0_FLUSH_LRQ 187 #define PPC970_PME_PM_LSU_SRQ_S0_VALID 188 #define PPC970_PME_PM_LARX_LSU0 189 #define PPC970_PME_PM_GCT_EMPTY_CYC 190 #define PPC970_PME_PM_FPU1_ALL 191 #define PPC970_PME_PM_FPU1_FSQRT 192 #define PPC970_PME_PM_FPU_FIN 193 #define PPC970_PME_PM_LSU_SRQ_STFWD 194 #define PPC970_PME_PM_MRK_LD_MISS_L1_LSU1 195 #define PPC970_PME_PM_FXU0_FIN 196 #define PPC970_PME_PM_MRK_FPU_FIN 197 #define PPC970_PME_PM_PMC5_OVERFLOW 198 #define PPC970_PME_PM_SNOOP_TLBIE 199 #define PPC970_PME_PM_FPU1_FRSP_FCONV 200 #define PPC970_PME_PM_FPU0_FDIV 201 #define 
PPC970_PME_PM_LD_REF_L1_LSU1 202 #define PPC970_PME_PM_HV_CYC 203 #define PPC970_PME_PM_LR_CTR_MAP_FULL_CYC 204 #define PPC970_PME_PM_FPU_DENORM 205 #define PPC970_PME_PM_LSU0_REJECT_SRQ 206 #define PPC970_PME_PM_LSU1_REJECT_SRQ 207 #define PPC970_PME_PM_LSU1_DERAT_MISS 208 #define PPC970_PME_PM_IC_PREF_REQ 209 #define PPC970_PME_PM_MRK_LSU_FIN 210 #define PPC970_PME_PM_MRK_DATA_FROM_MEM 211 #define PPC970_PME_PM_LSU0_FLUSH_UST 212 #define PPC970_PME_PM_LSU_FLUSH_LRQ 213 #define PPC970_PME_PM_LSU_FLUSH_SRQ 214 static const pme_power_entry_t ppc970_pe[] = { [ PPC970_PME_PM_LSU_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU_REJECT_RELOAD_CDF", .pme_code = 0x6920, .pme_short_desc = "LSU reject due to reload CDF or tag update collision", .pme_long_desc = "LSU reject due to reload CDF or tag update collision", }, [ PPC970_PME_PM_MRK_LSU_SRQ_INST_VALID ] = { .pme_name = "PM_MRK_LSU_SRQ_INST_VALID", .pme_code = 0x936, .pme_short_desc = "Marked instruction valid in SRQ", .pme_long_desc = "This signal is asserted every cycle when a marked request is resident in the Store Request Queue", }, [ PPC970_PME_PM_FPU1_SINGLE ] = { .pme_name = "PM_FPU1_SINGLE", .pme_code = 0x127, .pme_short_desc = "FPU1 executed single precision instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing single precision instruction.", }, [ PPC970_PME_PM_FPU0_STALL3 ] = { .pme_name = "PM_FPU0_STALL3", .pme_code = 0x121, .pme_short_desc = "FPU0 stalled in pipe3", .pme_long_desc = "This signal indicates that fp0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. 
", }, [ PPC970_PME_PM_TB_BIT_TRANS ] = { .pme_name = "PM_TB_BIT_TRANS", .pme_code = 0x8005, .pme_short_desc = "Time Base bit transition", .pme_long_desc = "When the selected time base bit (as specified in MMCR0[TBSEL])transitions from 0 to 1 ", }, [ PPC970_PME_PM_GPR_MAP_FULL_CYC ] = { .pme_name = "PM_GPR_MAP_FULL_CYC", .pme_code = 0x335, .pme_short_desc = "Cycles GPR mapper full", .pme_long_desc = "The ISU sends a signal indicating that the gpr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", }, [ PPC970_PME_PM_MRK_ST_CMPL ] = { .pme_name = "PM_MRK_ST_CMPL", .pme_code = 0x1003, .pme_short_desc = "Marked store instruction completed", .pme_long_desc = "A sampled store has completed (data home)", }, [ PPC970_PME_PM_FPU0_STF ] = { .pme_name = "PM_FPU0_STF", .pme_code = 0x122, .pme_short_desc = "FPU0 executed store instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing a store instruction.", }, [ PPC970_PME_PM_FPU1_FMA ] = { .pme_name = "PM_FPU1_FMA", .pme_code = 0x105, .pme_short_desc = "FPU1 executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ PPC970_PME_PM_LSU1_FLUSH_ULD ] = { .pme_name = "PM_LSU1_FLUSH_ULD", .pme_code = 0x804, .pme_short_desc = "LSU1 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", }, [ PPC970_PME_PM_MRK_INST_FIN ] = { .pme_name = "PM_MRK_INST_FIN", .pme_code = 0x7005, .pme_short_desc = "Marked instruction finished", .pme_long_desc = "One of the execution units finished a marked instruction. 
Instructions that finish may not necessarily complete", }, [ PPC970_PME_PM_MRK_LSU0_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU0_FLUSH_UST", .pme_code = 0x711, .pme_short_desc = "LSU0 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 0 because it was unaligned", }, [ PPC970_PME_PM_LSU_LRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LRQ_S0_ALLOC", .pme_code = 0x826, .pme_short_desc = "LRQ slot 0 allocated", .pme_long_desc = "LRQ slot zero was allocated", }, [ PPC970_PME_PM_FPU_FDIV ] = { .pme_name = "PM_FPU_FDIV", .pme_code = 0x1100, .pme_short_desc = "FPU executed FDIV instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when FPU is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. Combined Unit 0 + Unit 1", }, [ PPC970_PME_PM_FPU0_FULL_CYC ] = { .pme_name = "PM_FPU0_FULL_CYC", .pme_code = 0x303, .pme_short_desc = "Cycles FPU0 issue queue full", .pme_long_desc = "The issue queue for FPU unit 0 cannot accept any more instructions. Issue is stopped", }, [ PPC970_PME_PM_FPU_SINGLE ] = { .pme_name = "PM_FPU_SINGLE", .pme_code = 0x5120, .pme_short_desc = "FPU executed single precision instruction", .pme_long_desc = "FPU is executing single precision instruction. Combined Unit 0 + Unit 1", }, [ PPC970_PME_PM_FPU0_FMA ] = { .pme_name = "PM_FPU0_FMA", .pme_code = 0x101, .pme_short_desc = "FPU0 executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. 
This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ PPC970_PME_PM_MRK_LSU1_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU1_FLUSH_ULD", .pme_code = 0x714, .pme_short_desc = "LSU1 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", }, [ PPC970_PME_PM_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_LSU1_FLUSH_LRQ", .pme_code = 0x806, .pme_short_desc = "LSU1 LRQ flushes", .pme_long_desc = "A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ PPC970_PME_PM_DTLB_MISS ] = { .pme_name = "PM_DTLB_MISS", .pme_code = 0x704, .pme_short_desc = "Data TLB misses", .pme_long_desc = "A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). 
This may result in multiple TLB misses for the same instruction.", }, [ PPC970_PME_PM_MRK_ST_MISS_L1 ] = { .pme_name = "PM_MRK_ST_MISS_L1", .pme_code = 0x723, .pme_short_desc = "Marked L1 D cache store misses", .pme_long_desc = "A marked store missed the dcache", }, [ PPC970_PME_PM_EXT_INT ] = { .pme_name = "PM_EXT_INT", .pme_code = 0x8002, .pme_short_desc = "External interrupts", .pme_long_desc = "An external interrupt occurred", }, [ PPC970_PME_PM_MRK_LSU1_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_LRQ", .pme_code = 0x716, .pme_short_desc = "LSU1 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ PPC970_PME_PM_MRK_ST_GPS ] = { .pme_name = "PM_MRK_ST_GPS", .pme_code = 0x6003, .pme_short_desc = "Marked store sent to GPS", .pme_long_desc = "A sampled store has been sent to the memory subsystem", }, [ PPC970_PME_PM_GRP_DISP_SUCCESS ] = { .pme_name = "PM_GRP_DISP_SUCCESS", .pme_code = 0x5001, .pme_short_desc = "Group dispatch success", .pme_long_desc = "Number of groups successfully dispatched (not rejected)", }, [ PPC970_PME_PM_LSU1_LDF ] = { .pme_name = "PM_LSU1_LDF", .pme_code = 0x734, .pme_short_desc = "LSU1 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed from LSU unit 1", }, [ PPC970_PME_PM_LSU0_SRQ_STFWD ] = { .pme_name = "PM_LSU0_SRQ_STFWD", .pme_code = 0x820, .pme_short_desc = "LSU0 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 0", }, [ PPC970_PME_PM_CR_MAP_FULL_CYC ] = { .pme_name = "PM_CR_MAP_FULL_CYC", .pme_code = 0x304, .pme_short_desc = "Cycles CR logical operation mapper full", .pme_long_desc = "The ISU sends a signal indicating that the cr mapper cannot accept any more groups. Dispatch is stopped. 
Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", }, [ PPC970_PME_PM_MRK_LSU0_FLUSH_ULD ] = { .pme_name = "PM_MRK_LSU0_FLUSH_ULD", .pme_code = 0x710, .pme_short_desc = "LSU0 marked unaligned load flushes", .pme_long_desc = "A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", }, [ PPC970_PME_PM_LSU_DERAT_MISS ] = { .pme_name = "PM_LSU_DERAT_MISS", .pme_code = 0x6700, .pme_short_desc = "DERAT misses", .pme_long_desc = "Total D-ERAT Misses (Unit 0 + Unit 1). Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction.", }, [ PPC970_PME_PM_FPU0_SINGLE ] = { .pme_name = "PM_FPU0_SINGLE", .pme_code = 0x123, .pme_short_desc = "FPU0 executed single precision instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing single precision instruction.", }, [ PPC970_PME_PM_FPU1_FDIV ] = { .pme_name = "PM_FPU1_FDIV", .pme_code = 0x104, .pme_short_desc = "FPU1 executed FDIV instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp1 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs.", }, [ PPC970_PME_PM_FPU1_FEST ] = { .pme_name = "PM_FPU1_FEST", .pme_code = 0x116, .pme_short_desc = "FPU1 executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. ", }, [ PPC970_PME_PM_FPU0_FRSP_FCONV ] = { .pme_name = "PM_FPU0_FRSP_FCONV", .pme_code = 0x111, .pme_short_desc = "FPU0 executed FRSP or FCONV instructions", .pme_long_desc = "This signal is active for one cycle when fp0 is executing frsp or convert kind of instruction. 
This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ PPC970_PME_PM_GCT_EMPTY_SRQ_FULL ] = { .pme_name = "PM_GCT_EMPTY_SRQ_FULL", .pme_code = 0x200b, .pme_short_desc = "GCT empty caused by SRQ full", .pme_long_desc = "GCT empty caused by SRQ full", }, [ PPC970_PME_PM_MRK_ST_CMPL_INT ] = { .pme_name = "PM_MRK_ST_CMPL_INT", .pme_code = 0x3003, .pme_short_desc = "Marked store completed with intervention", .pme_long_desc = "A marked store previously sent to the memory subsystem completed (data home) after requiring intervention", }, [ PPC970_PME_PM_FLUSH_BR_MPRED ] = { .pme_name = "PM_FLUSH_BR_MPRED", .pme_code = 0x316, .pme_short_desc = "Flush caused by branch mispredict", .pme_long_desc = "Flush caused by branch mispredict", }, [ PPC970_PME_PM_FXU_FIN ] = { .pme_name = "PM_FXU_FIN", .pme_code = 0x3330, .pme_short_desc = "FXU produced a result", .pme_long_desc = "The fixed point unit (Unit 0 + Unit 1) finished a marked instruction. Instructions that finish may not necessarily complete.", }, [ PPC970_PME_PM_FPU_STF ] = { .pme_name = "PM_FPU_STF", .pme_code = 0x6120, .pme_short_desc = "FPU executed store instruction", .pme_long_desc = "FPU is executing a store instruction. Combined Unit 0 + Unit 1", }, [ PPC970_PME_PM_DSLB_MISS ] = { .pme_name = "PM_DSLB_MISS", .pme_code = 0x705, .pme_short_desc = "Data SLB misses", .pme_long_desc = "A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve", }, [ PPC970_PME_PM_FXLS1_FULL_CYC ] = { .pme_name = "PM_FXLS1_FULL_CYC", .pme_code = 0x314, .pme_short_desc = "Cycles FXU1/LS1 queue full", .pme_long_desc = "The issue queue for FXU/LSU unit 1 cannot accept any more instructions. 
Issue is stopped", }, [ PPC970_PME_PM_LSU_LMQ_LHR_MERGE ] = { .pme_name = "PM_LSU_LMQ_LHR_MERGE", .pme_code = 0x935, .pme_short_desc = "LMQ LHR merges", .pme_long_desc = "A dcache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry.", }, [ PPC970_PME_PM_MRK_STCX_FAIL ] = { .pme_name = "PM_MRK_STCX_FAIL", .pme_code = 0x726, .pme_short_desc = "Marked STCX failed", .pme_long_desc = "A marked stcx (stwcx or stdcx) failed", }, [ PPC970_PME_PM_FXU0_BUSY_FXU1_IDLE ] = { .pme_name = "PM_FXU0_BUSY_FXU1_IDLE", .pme_code = 0x7002, .pme_short_desc = "FXU0 busy FXU1 idle", .pme_long_desc = "FXU0 is busy while FXU1 was idle", }, [ PPC970_PME_PM_MRK_DATA_FROM_L25_SHR ] = { .pme_name = "PM_MRK_DATA_FROM_L25_SHR", .pme_code = 0x193d, .pme_short_desc = "Marked data loaded from L2.5 shared", .pme_long_desc = "DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a marked demand load", }, [ PPC970_PME_PM_LSU_FLUSH_ULD ] = { .pme_name = "PM_LSU_FLUSH_ULD", .pme_code = 0x1800, .pme_short_desc = "LRQ unaligned load flushes", .pme_long_desc = "A load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", }, [ PPC970_PME_PM_MRK_BRU_FIN ] = { .pme_name = "PM_MRK_BRU_FIN", .pme_code = 0x2005, .pme_short_desc = "Marked instruction BRU processing finished", .pme_long_desc = "The branch unit finished a marked instruction. Instructions that finish may not necessarily complete", }, [ PPC970_PME_PM_IERAT_XLATE_WR ] = { .pme_name = "PM_IERAT_XLATE_WR", .pme_code = 0x430, .pme_short_desc = "Translation written to ierat", .pme_long_desc = "This signal will be asserted each time the I-ERAT is written. This indicates that an ERAT miss has been serviced. ERAT misses will initiate a sequence resulting in the ERAT being written. 
ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed, This should be a fairly accurate count of ERAT missed (best available).", }, [ PPC970_PME_PM_DATA_FROM_MEM ] = { .pme_name = "PM_DATA_FROM_MEM", .pme_code = 0x3837, .pme_short_desc = "Data loaded from memory", .pme_long_desc = "Data loaded from memory", }, [ PPC970_PME_PM_FPR_MAP_FULL_CYC ] = { .pme_name = "PM_FPR_MAP_FULL_CYC", .pme_code = 0x301, .pme_short_desc = "Cycles FPR mapper full", .pme_long_desc = "The ISU sends a signal indicating that the FPR mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", }, [ PPC970_PME_PM_FPU1_FULL_CYC ] = { .pme_name = "PM_FPU1_FULL_CYC", .pme_code = 0x307, .pme_short_desc = "Cycles FPU1 issue queue full", .pme_long_desc = "The issue queue for FPU unit 1 cannot accept any more instructions. Issue is stopped", }, [ PPC970_PME_PM_FPU0_FIN ] = { .pme_name = "PM_FPU0_FIN", .pme_code = 0x113, .pme_short_desc = "FPU0 produced a result", .pme_long_desc = "fp0 finished, produced a result This only indicates finish, not completion. ", }, [ PPC970_PME_PM_GRP_BR_REDIR ] = { .pme_name = "PM_GRP_BR_REDIR", .pme_code = 0x326, .pme_short_desc = "Group experienced branch redirect", .pme_long_desc = "Group experienced branch redirect", }, [ PPC970_PME_PM_THRESH_TIMEO ] = { .pme_name = "PM_THRESH_TIMEO", .pme_code = 0x2003, .pme_short_desc = "Threshold timeout", .pme_long_desc = "The threshold timer expired", }, [ PPC970_PME_PM_FPU_FSQRT ] = { .pme_name = "PM_FPU_FSQRT", .pme_code = 0x6100, .pme_short_desc = "FPU executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when FPU is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
Combined Unit 0 + Unit 1", }, [ PPC970_PME_PM_MRK_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_LRQ", .pme_code = 0x712, .pme_short_desc = "LSU0 marked LRQ flushes", .pme_long_desc = "A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ PPC970_PME_PM_PMC1_OVERFLOW ] = { .pme_name = "PM_PMC1_OVERFLOW", .pme_code = 0x200a, .pme_short_desc = "PMC1 Overflow", .pme_long_desc = "PMC1 Overflow", }, [ PPC970_PME_PM_FXLS0_FULL_CYC ] = { .pme_name = "PM_FXLS0_FULL_CYC", .pme_code = 0x310, .pme_short_desc = "Cycles FXU0/LS0 queue full", .pme_long_desc = "The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped", }, [ PPC970_PME_PM_FPU0_ALL ] = { .pme_name = "PM_FPU0_ALL", .pme_code = 0x103, .pme_short_desc = "FPU0 executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo", }, [ PPC970_PME_PM_DATA_TABLEWALK_CYC ] = { .pme_name = "PM_DATA_TABLEWALK_CYC", .pme_code = 0x707, .pme_short_desc = "Cycles doing data tablewalks", .pme_long_desc = "This signal is asserted every cycle when a tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried.", }, [ PPC970_PME_PM_FPU0_FEST ] = { .pme_name = "PM_FPU0_FEST", .pme_code = 0x112, .pme_short_desc = "FPU0 executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. 
", }, [ PPC970_PME_PM_DATA_FROM_L25_MOD ] = { .pme_name = "PM_DATA_FROM_L25_MOD", .pme_code = 0x383d, .pme_short_desc = "Data loaded from L2.5 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a demand load", }, [ PPC970_PME_PM_LSU0_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU0_REJECT_ERAT_MISS", .pme_code = 0x923, .pme_short_desc = "LSU0 reject due to ERAT miss", .pme_long_desc = "LSU0 reject due to ERAT miss", }, [ PPC970_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC", .pme_code = 0x2002, .pme_short_desc = "Cycles LMQ and SRQ empty", .pme_long_desc = "Cycles when both the LMQ and SRQ are empty (LSU is idle)", }, [ PPC970_PME_PM_LSU0_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU0_REJECT_RELOAD_CDF", .pme_code = 0x922, .pme_short_desc = "LSU0 reject due to reload CDF or tag update collision", .pme_long_desc = "LSU0 reject due to reload CDF or tag update collision", }, [ PPC970_PME_PM_FPU_FEST ] = { .pme_name = "PM_FPU_FEST", .pme_code = 0x3110, .pme_short_desc = "FPU executed FEST instruction", .pme_long_desc = "This signal is active for one cycle when executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. 
Combined Unit 0 + Unit 1.", }, [ PPC970_PME_PM_0INST_FETCH ] = { .pme_name = "PM_0INST_FETCH", .pme_code = 0x442d, .pme_short_desc = "No instructions fetched", .pme_long_desc = "No instructions were fetched this cycle (due to IFU hold, redirect, or icache miss)", }, [ PPC970_PME_PM_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_LD_MISS_L1_LSU0", .pme_code = 0x812, .pme_short_desc = "LSU0 L1 D cache load misses", .pme_long_desc = "A load, executing on unit 0, missed the dcache", }, [ PPC970_PME_PM_LSU1_REJECT_RELOAD_CDF ] = { .pme_name = "PM_LSU1_REJECT_RELOAD_CDF", .pme_code = 0x926, .pme_short_desc = "LSU1 reject due to reload CDF or tag update collision", .pme_long_desc = "LSU1 reject due to reload CDF or tag update collision", }, [ PPC970_PME_PM_L1_PREF ] = { .pme_name = "PM_L1_PREF", .pme_code = 0x731, .pme_short_desc = "L1 cache data prefetches", .pme_long_desc = "A request to prefetch data into the L1 was made", }, [ PPC970_PME_PM_FPU1_STALL3 ] = { .pme_name = "PM_FPU1_STALL3", .pme_code = 0x125, .pme_short_desc = "FPU1 stalled in pipe3", .pme_long_desc = "This signal indicates that fp1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. 
", }, [ PPC970_PME_PM_BRQ_FULL_CYC ] = { .pme_name = "PM_BRQ_FULL_CYC", .pme_code = 0x305, .pme_short_desc = "Cycles branch queue full", .pme_long_desc = "The ISU sends a signal indicating that the issue queue that feeds the ifu br unit cannot accept any more group (queue is full of groups).", }, [ PPC970_PME_PM_PMC8_OVERFLOW ] = { .pme_name = "PM_PMC8_OVERFLOW", .pme_code = 0x100a, .pme_short_desc = "PMC8 Overflow", .pme_long_desc = "PMC8 Overflow", }, [ PPC970_PME_PM_PMC7_OVERFLOW ] = { .pme_name = "PM_PMC7_OVERFLOW", .pme_code = 0x800a, .pme_short_desc = "PMC7 Overflow", .pme_long_desc = "PMC7 Overflow", }, [ PPC970_PME_PM_WORK_HELD ] = { .pme_name = "PM_WORK_HELD", .pme_code = 0x2001, .pme_short_desc = "Work held", .pme_long_desc = "RAS Unit has signaled completion to stop and there are groups waiting to complete", }, [ PPC970_PME_PM_MRK_LD_MISS_L1_LSU0 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU0", .pme_code = 0x720, .pme_short_desc = "LSU0 L1 D cache load misses", .pme_long_desc = "A marked load, executing on unit 0, missed the dcache", }, [ PPC970_PME_PM_FXU_IDLE ] = { .pme_name = "PM_FXU_IDLE", .pme_code = 0x5002, .pme_short_desc = "FXU idle", .pme_long_desc = "FXU0 and FXU1 are both idle", }, [ PPC970_PME_PM_INST_CMPL ] = { .pme_name = "PM_INST_CMPL", .pme_code = 0x1, .pme_short_desc = "Instructions completed", .pme_long_desc = "Number of Eligible Instructions that completed. 
", }, [ PPC970_PME_PM_LSU1_FLUSH_UST ] = { .pme_name = "PM_LSU1_FLUSH_UST", .pme_code = 0x805, .pme_short_desc = "LSU1 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 1 because it was unaligned (crossed a 4k boundary)", }, [ PPC970_PME_PM_LSU0_FLUSH_ULD ] = { .pme_name = "PM_LSU0_FLUSH_ULD", .pme_code = 0x800, .pme_short_desc = "LSU0 unaligned load flushes", .pme_long_desc = "A load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)", }, [ PPC970_PME_PM_LSU_FLUSH ] = { .pme_name = "PM_LSU_FLUSH", .pme_code = 0x315, .pme_short_desc = "Flush initiated by LSU", .pme_long_desc = "Flush initiated by LSU", }, [ PPC970_PME_PM_INST_FROM_L2 ] = { .pme_name = "PM_INST_FROM_L2", .pme_code = 0x1426, .pme_short_desc = "Instructions fetched from L2", .pme_long_desc = "An instruction fetch group was fetched from L2. Fetch Groups can contain up to 8 instructions", }, [ PPC970_PME_PM_LSU1_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU1_REJECT_LMQ_FULL", .pme_code = 0x925, .pme_short_desc = "LSU1 reject due to LMQ full or missed data coming", .pme_long_desc = "LSU1 reject due to LMQ full or missed data coming", }, [ PPC970_PME_PM_PMC2_OVERFLOW ] = { .pme_name = "PM_PMC2_OVERFLOW", .pme_code = 0x300a, .pme_short_desc = "PMC2 Overflow", .pme_long_desc = "PMC2 Overflow", }, [ PPC970_PME_PM_FPU0_DENORM ] = { .pme_name = "PM_FPU0_DENORM", .pme_code = 0x120, .pme_short_desc = "FPU0 received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized.", }, [ PPC970_PME_PM_FPU1_FMOV_FEST ] = { .pme_name = "PM_FPU1_FMOV_FEST", .pme_code = 0x114, .pme_short_desc = "FPU1 executing FMOV or FEST instructions", .pme_long_desc = "This signal is active for one cycle when fp1 is executing a move kind of instruction or one of the estimate instructions. 
This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ", }, [ PPC970_PME_PM_GRP_DISP_REJECT ] = { .pme_name = "PM_GRP_DISP_REJECT", .pme_code = 0x324, .pme_short_desc = "Group dispatch rejected", .pme_long_desc = "A group that previously attempted dispatch was rejected.", }, [ PPC970_PME_PM_LSU_LDF ] = { .pme_name = "PM_LSU_LDF", .pme_code = 0x8730, .pme_short_desc = "LSU executed Floating Point load instruction", .pme_long_desc = "LSU executed Floating Point load instruction", }, [ PPC970_PME_PM_INST_DISP ] = { .pme_name = "PM_INST_DISP", .pme_code = 0x320, .pme_short_desc = "Instructions dispatched", .pme_long_desc = "The ISU sends the number of instructions dispatched.", }, [ PPC970_PME_PM_DATA_FROM_L25_SHR ] = { .pme_name = "PM_DATA_FROM_L25_SHR", .pme_code = 0x183d, .pme_short_desc = "Data loaded from L2.5 shared", .pme_long_desc = "DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a demand load", }, [ PPC970_PME_PM_L1_DCACHE_RELOAD_VALID ] = { .pme_name = "PM_L1_DCACHE_RELOAD_VALID", .pme_code = 0x834, .pme_short_desc = "L1 reload data source valid", .pme_long_desc = "The data source information is valid", }, [ PPC970_PME_PM_MRK_GRP_ISSUED ] = { .pme_name = "PM_MRK_GRP_ISSUED", .pme_code = 0x6005, .pme_short_desc = "Marked group issued", .pme_long_desc = "A sampled instruction was issued", }, [ PPC970_PME_PM_FPU_FMA ] = { .pme_name = "PM_FPU_FMA", .pme_code = 0x2100, .pme_short_desc = "FPU executed multiply-add instruction", .pme_long_desc = "This signal is active for one cycle when FPU is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1", }, [ PPC970_PME_PM_MRK_CRU_FIN ] = { .pme_name = "PM_MRK_CRU_FIN", .pme_code = 0x4005, .pme_short_desc = "Marked instruction CRU processing finished", .pme_long_desc = "The Condition Register Unit finished a marked instruction. 
Instructions that finish may not necessarily complete", }, [ PPC970_PME_PM_MRK_LSU1_FLUSH_UST ] = { .pme_name = "PM_MRK_LSU1_FLUSH_UST", .pme_code = 0x715, .pme_short_desc = "LSU1 marked unaligned store flushes", .pme_long_desc = "A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary)", }, [ PPC970_PME_PM_MRK_FXU_FIN ] = { .pme_name = "PM_MRK_FXU_FIN", .pme_code = 0x6004, .pme_short_desc = "Marked instruction FXU processing finished", .pme_long_desc = "Marked instruction FXU processing finished", }, [ PPC970_PME_PM_LSU1_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU1_REJECT_ERAT_MISS", .pme_code = 0x927, .pme_short_desc = "LSU1 reject due to ERAT miss", .pme_long_desc = "LSU1 reject due to ERAT miss", }, [ PPC970_PME_PM_BR_ISSUED ] = { .pme_name = "PM_BR_ISSUED", .pme_code = 0x431, .pme_short_desc = "Branches issued", .pme_long_desc = "This signal will be asserted each time the ISU selects a branch instruction to issue.", }, [ PPC970_PME_PM_PMC4_OVERFLOW ] = { .pme_name = "PM_PMC4_OVERFLOW", .pme_code = 0x500a, .pme_short_desc = "PMC4 Overflow", .pme_long_desc = "PMC4 Overflow", }, [ PPC970_PME_PM_EE_OFF ] = { .pme_name = "PM_EE_OFF", .pme_code = 0x333, .pme_short_desc = "Cycles MSR(EE) bit off", .pme_long_desc = "The number of Cycles MSR(EE) bit was off.", }, [ PPC970_PME_PM_INST_FROM_L25_MOD ] = { .pme_name = "PM_INST_FROM_L25_MOD", .pme_code = 0x6426, .pme_short_desc = "Instruction fetched from L2.5 modified", .pme_long_desc = "Instruction fetched from L2.5 modified", }, [ PPC970_PME_PM_ITLB_MISS ] = { .pme_name = "PM_ITLB_MISS", .pme_code = 0x700, .pme_short_desc = "Instruction TLB misses", .pme_long_desc = "A TLB miss for an Instruction Fetch has occurred", }, [ PPC970_PME_PM_FXU1_BUSY_FXU0_IDLE ] = { .pme_name = "PM_FXU1_BUSY_FXU0_IDLE", .pme_code = 0x4002, .pme_short_desc = "FXU1 busy FXU0 idle", .pme_long_desc = "FXU0 was idle while FXU1 was busy", }, [ 
PPC970_PME_PM_GRP_DISP_VALID ] = { .pme_name = "PM_GRP_DISP_VALID", .pme_code = 0x323, .pme_short_desc = "Group dispatch valid", .pme_long_desc = "Dispatch has been attempted for a valid group. Some groups may be rejected. The total number of successful dispatches is the number of dispatch valid minus dispatch reject.", }, [ PPC970_PME_PM_MRK_GRP_DISP ] = { .pme_name = "PM_MRK_GRP_DISP", .pme_code = 0x1002, .pme_short_desc = "Marked group dispatched", .pme_long_desc = "A group containing a sampled instruction was dispatched", }, [ PPC970_PME_PM_LSU_FLUSH_UST ] = { .pme_name = "PM_LSU_FLUSH_UST", .pme_code = 0x2800, .pme_short_desc = "SRQ unaligned store flushes", .pme_long_desc = "A store was flushed because it was unaligned", }, [ PPC970_PME_PM_FXU1_FIN ] = { .pme_name = "PM_FXU1_FIN", .pme_code = 0x336, .pme_short_desc = "FXU1 produced a result", .pme_long_desc = "The Fixed Point unit 1 finished an instruction and produced a result", }, [ PPC970_PME_PM_GRP_CMPL ] = { .pme_name = "PM_GRP_CMPL", .pme_code = 0x7003, .pme_short_desc = "Group completed", .pme_long_desc = "A group completed. Microcoded instructions that span multiple groups will generate this event once per group.", }, [ PPC970_PME_PM_FPU_FRSP_FCONV ] = { .pme_name = "PM_FPU_FRSP_FCONV", .pme_code = 0x7110, .pme_short_desc = "FPU executed FRSP or FCONV instructions", .pme_long_desc = "This signal is active for one cycle when executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. 
Combined Unit 0 + Unit 1", }, [ PPC970_PME_PM_MRK_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU0_FLUSH_SRQ", .pme_code = 0x713, .pme_short_desc = "LSU0 marked SRQ flushes", .pme_long_desc = "A marked store was flushed because a younger load hits an older store that is already in the SRQ or in the same group.", }, [ PPC970_PME_PM_LSU_LMQ_FULL_CYC ] = { .pme_name = "PM_LSU_LMQ_FULL_CYC", .pme_code = 0x837, .pme_short_desc = "Cycles LMQ full", .pme_long_desc = "The LMQ was full", }, [ PPC970_PME_PM_ST_REF_L1_LSU0 ] = { .pme_name = "PM_ST_REF_L1_LSU0", .pme_code = 0x811, .pme_short_desc = "LSU0 L1 D cache store references", .pme_long_desc = "A store executed on unit 0", }, [ PPC970_PME_PM_LSU0_DERAT_MISS ] = { .pme_name = "PM_LSU0_DERAT_MISS", .pme_code = 0x702, .pme_short_desc = "LSU0 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", }, [ PPC970_PME_PM_LSU_SRQ_SYNC_CYC ] = { .pme_name = "PM_LSU_SRQ_SYNC_CYC", .pme_code = 0x735, .pme_short_desc = "SRQ sync duration", .pme_long_desc = "This signal is asserted every cycle when a sync is in the SRQ.", }, [ PPC970_PME_PM_FPU_STALL3 ] = { .pme_name = "PM_FPU_STALL3", .pme_code = 0x2120, .pme_short_desc = "FPU stalled in pipe3", .pme_long_desc = "FPU has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. 
Combined Unit 0 + Unit 1", }, [ PPC970_PME_PM_LSU_REJECT_ERAT_MISS ] = { .pme_name = "PM_LSU_REJECT_ERAT_MISS", .pme_code = 0x5920, .pme_short_desc = "LSU reject due to ERAT miss", .pme_long_desc = "LSU reject due to ERAT miss", }, [ PPC970_PME_PM_MRK_DATA_FROM_L2 ] = { .pme_name = "PM_MRK_DATA_FROM_L2", .pme_code = 0x1937, .pme_short_desc = "Marked data loaded from L2", .pme_long_desc = "DL1 was reloaded from the local L2 due to a marked demand load", }, [ PPC970_PME_PM_LSU0_FLUSH_SRQ ] = { .pme_name = "PM_LSU0_FLUSH_SRQ", .pme_code = 0x803, .pme_short_desc = "LSU0 SRQ flushes", .pme_long_desc = "A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group.", }, [ PPC970_PME_PM_FPU0_FMOV_FEST ] = { .pme_name = "PM_FPU0_FMOV_FEST", .pme_code = 0x110, .pme_short_desc = "FPU0 executed FMOV or FEST instructions", .pme_long_desc = "This signal is active for one cycle when fp0 is executing a move kind of instruction or one of the estimate instructions. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ", }, [ PPC970_PME_PM_LD_REF_L1_LSU0 ] = { .pme_name = "PM_LD_REF_L1_LSU0", .pme_code = 0x810, .pme_short_desc = "LSU0 L1 D cache load references", .pme_long_desc = "A load executed on unit 0", }, [ PPC970_PME_PM_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_LSU1_FLUSH_SRQ", .pme_code = 0x807, .pme_short_desc = "LSU1 SRQ flushes", .pme_long_desc = "A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group. 
", }, [ PPC970_PME_PM_GRP_BR_MPRED ] = { .pme_name = "PM_GRP_BR_MPRED", .pme_code = 0x327, .pme_short_desc = "Group experienced a branch mispredict", .pme_long_desc = "Group experienced a branch mispredict", }, [ PPC970_PME_PM_LSU_LMQ_S0_ALLOC ] = { .pme_name = "PM_LSU_LMQ_S0_ALLOC", .pme_code = 0x836, .pme_short_desc = "LMQ slot 0 allocated", .pme_long_desc = "The first entry in the LMQ was allocated.", }, [ PPC970_PME_PM_LSU0_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU0_REJECT_LMQ_FULL", .pme_code = 0x921, .pme_short_desc = "LSU0 reject due to LMQ full or missed data coming", .pme_long_desc = "LSU0 reject due to LMQ full or missed data coming", }, [ PPC970_PME_PM_ST_REF_L1 ] = { .pme_name = "PM_ST_REF_L1", .pme_code = 0x7810, .pme_short_desc = "L1 D cache store references", .pme_long_desc = "Total DL1 Store references", }, [ PPC970_PME_PM_MRK_VMX_FIN ] = { .pme_name = "PM_MRK_VMX_FIN", .pme_code = 0x3005, .pme_short_desc = "Marked instruction VMX processing finished", .pme_long_desc = "Marked instruction VMX processing finished", }, [ PPC970_PME_PM_LSU_SRQ_EMPTY_CYC ] = { .pme_name = "PM_LSU_SRQ_EMPTY_CYC", .pme_code = 0x4003, .pme_short_desc = "Cycles SRQ empty", .pme_long_desc = "The Store Request Queue is empty", }, [ PPC970_PME_PM_FPU1_STF ] = { .pme_name = "PM_FPU1_STF", .pme_code = 0x126, .pme_short_desc = "FPU1 executed store instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing a store instruction.", }, [ PPC970_PME_PM_RUN_CYC ] = { .pme_name = "PM_RUN_CYC", .pme_code = 0x1005, .pme_short_desc = "Run cycles", .pme_long_desc = "Processor Cycles gated by the run latch", }, [ PPC970_PME_PM_LSU_LMQ_S0_VALID ] = { .pme_name = "PM_LSU_LMQ_S0_VALID", .pme_code = 0x835, .pme_short_desc = "LMQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle when the first entry in the LMQ is valid. 
The LMQ had eight entries that are allocated FIFO", }, [ PPC970_PME_PM_LSU0_LDF ] = { .pme_name = "PM_LSU0_LDF", .pme_code = 0x730, .pme_short_desc = "LSU0 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed from LSU unit 0", }, [ PPC970_PME_PM_LSU_LRQ_S0_VALID ] = { .pme_name = "PM_LSU_LRQ_S0_VALID", .pme_code = 0x822, .pme_short_desc = "LRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Load Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin.", }, [ PPC970_PME_PM_PMC3_OVERFLOW ] = { .pme_name = "PM_PMC3_OVERFLOW", .pme_code = 0x400a, .pme_short_desc = "PMC3 Overflow", .pme_long_desc = "PMC3 Overflow", }, [ PPC970_PME_PM_MRK_IMR_RELOAD ] = { .pme_name = "PM_MRK_IMR_RELOAD", .pme_code = 0x722, .pme_short_desc = "Marked IMR reloaded", .pme_long_desc = "A DL1 reload occurred due to marked load", }, [ PPC970_PME_PM_MRK_GRP_TIMEO ] = { .pme_name = "PM_MRK_GRP_TIMEO", .pme_code = 0x5005, .pme_short_desc = "Marked group completion timeout", .pme_long_desc = "The sampling timeout expired indicating that the previously sampled instruction is no longer in the processor", }, [ PPC970_PME_PM_FPU_FMOV_FEST ] = { .pme_name = "PM_FPU_FMOV_FEST", .pme_code = 0x8110, .pme_short_desc = "FPU executing FMOV or FEST instructions", .pme_long_desc = "This signal is active for one cycle when executing a move kind of instruction or one of the estimate instructions.. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ . 
Combined Unit 0 + Unit 1", }, [ PPC970_PME_PM_GRP_DISP_BLK_SB_CYC ] = { .pme_name = "PM_GRP_DISP_BLK_SB_CYC", .pme_code = 0x331, .pme_short_desc = "Cycles group dispatch blocked by scoreboard", .pme_long_desc = "The ISU sends a signal indicating that dispatch is blocked by scoreboard.", }, [ PPC970_PME_PM_XER_MAP_FULL_CYC ] = { .pme_name = "PM_XER_MAP_FULL_CYC", .pme_code = 0x302, .pme_short_desc = "Cycles XER mapper full", .pme_long_desc = "The ISU sends a signal indicating that the xer mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", }, [ PPC970_PME_PM_ST_MISS_L1 ] = { .pme_name = "PM_ST_MISS_L1", .pme_code = 0x813, .pme_short_desc = "L1 D cache store misses", .pme_long_desc = "A store missed the dcache", }, [ PPC970_PME_PM_STOP_COMPLETION ] = { .pme_name = "PM_STOP_COMPLETION", .pme_code = 0x3001, .pme_short_desc = "Completion stopped", .pme_long_desc = "RAS Unit has signaled completion to stop", }, [ PPC970_PME_PM_MRK_GRP_CMPL ] = { .pme_name = "PM_MRK_GRP_CMPL", .pme_code = 0x4004, .pme_short_desc = "Marked group completed", .pme_long_desc = "A group containing a sampled instruction completed. 
Microcoded instructions that span multiple groups will generate this event once per group.", }, [ PPC970_PME_PM_ISLB_MISS ] = { .pme_name = "PM_ISLB_MISS", .pme_code = 0x701, .pme_short_desc = "Instruction SLB misses", .pme_long_desc = "An SLB miss for an instruction fetch has occurred", }, [ PPC970_PME_PM_SUSPENDED ] = { .pme_name = "PM_SUSPENDED", .pme_code = 0x0, .pme_short_desc = "Suspended", .pme_long_desc = "Suspended", }, [ PPC970_PME_PM_CYC ] = { .pme_name = "PM_CYC", .pme_code = 0x7, .pme_short_desc = "Processor cycles", .pme_long_desc = "Processor cycles", }, [ PPC970_PME_PM_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_LD_MISS_L1_LSU1", .pme_code = 0x816, .pme_short_desc = "LSU1 L1 D cache load misses", .pme_long_desc = "A load, executing on unit 1, missed the dcache", }, [ PPC970_PME_PM_STCX_FAIL ] = { .pme_name = "PM_STCX_FAIL", .pme_code = 0x721, .pme_short_desc = "STCX failed", .pme_long_desc = "A stcx (stwcx or stdcx) failed", }, [ PPC970_PME_PM_LSU1_SRQ_STFWD ] = { .pme_name = "PM_LSU1_SRQ_STFWD", .pme_code = 0x824, .pme_short_desc = "LSU1 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 1", }, [ PPC970_PME_PM_GRP_DISP ] = { .pme_name = "PM_GRP_DISP", .pme_code = 0x2004, .pme_short_desc = "Group dispatches", .pme_long_desc = "A group was dispatched", }, [ PPC970_PME_PM_L2_PREF ] = { .pme_name = "PM_L2_PREF", .pme_code = 0x733, .pme_short_desc = "L2 cache prefetches", .pme_long_desc = "A request to prefetch data into L2 was made", }, [ PPC970_PME_PM_FPU1_DENORM ] = { .pme_name = "PM_FPU1_DENORM", .pme_code = 0x124, .pme_short_desc = "FPU1 received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized.", }, [ PPC970_PME_PM_DATA_FROM_L2 ] = { .pme_name = "PM_DATA_FROM_L2", .pme_code = 0x1837, .pme_short_desc = "Data loaded from L2", .pme_long_desc = "DL1 was reloaded from the local L2 due to a demand load", }, [ PPC970_PME_PM_FPU0_FPSCR ] = { 
.pme_name = "PM_FPU0_FPSCR", .pme_code = 0x130, .pme_short_desc = "FPU0 executed FPSCR instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*. mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs", }, [ PPC970_PME_PM_MRK_DATA_FROM_L25_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L25_MOD", .pme_code = 0x393d, .pme_short_desc = "Marked data loaded from L2.5 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a marked demand load", }, [ PPC970_PME_PM_FPU0_FSQRT ] = { .pme_name = "PM_FPU0_FSQRT", .pme_code = 0x102, .pme_short_desc = "FPU0 executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp0 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ PPC970_PME_PM_LD_REF_L1 ] = { .pme_name = "PM_LD_REF_L1", .pme_code = 0x8810, .pme_short_desc = "L1 D cache load references", .pme_long_desc = "Total DL1 Load references", }, [ PPC970_PME_PM_MRK_L1_RELOAD_VALID ] = { .pme_name = "PM_MRK_L1_RELOAD_VALID", .pme_code = 0x934, .pme_short_desc = "Marked L1 reload data source valid", .pme_long_desc = "The source information is valid and is for a marked load", }, [ PPC970_PME_PM_1PLUS_PPC_CMPL ] = { .pme_name = "PM_1PLUS_PPC_CMPL", .pme_code = 0x5003, .pme_short_desc = "One or more PPC instruction completed", .pme_long_desc = "A group containing at least one PPC instruction completed. For microcoded instructions that span multiple groups, this will only occur once.", }, [ PPC970_PME_PM_INST_FROM_L1 ] = { .pme_name = "PM_INST_FROM_L1", .pme_code = 0x142d, .pme_short_desc = "Instruction fetched from L1", .pme_long_desc = "An instruction fetch group was fetched from L1. 
Fetch Groups can contain up to 8 instructions", }, [ PPC970_PME_PM_EE_OFF_EXT_INT ] = { .pme_name = "PM_EE_OFF_EXT_INT", .pme_code = 0x337, .pme_short_desc = "Cycles MSR(EE) bit off and external interrupt pending", .pme_long_desc = "Cycles MSR(EE) bit off and external interrupt pending", }, [ PPC970_PME_PM_PMC6_OVERFLOW ] = { .pme_name = "PM_PMC6_OVERFLOW", .pme_code = 0x700a, .pme_short_desc = "PMC6 Overflow", .pme_long_desc = "PMC6 Overflow", }, [ PPC970_PME_PM_LSU_LRQ_FULL_CYC ] = { .pme_name = "PM_LSU_LRQ_FULL_CYC", .pme_code = 0x312, .pme_short_desc = "Cycles LRQ full", .pme_long_desc = "The ISU sends this signal when the LRQ is full.", }, [ PPC970_PME_PM_IC_PREF_INSTALL ] = { .pme_name = "PM_IC_PREF_INSTALL", .pme_code = 0x427, .pme_short_desc = "Instruction prefetched installed in prefetch", .pme_long_desc = "New line coming into the prefetch buffer", }, [ PPC970_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { .pme_name = "PM_DC_PREF_OUT_OF_STREAMS", .pme_code = 0x732, .pme_short_desc = "D cache out of streams", .pme_long_desc = "out of streams", }, [ PPC970_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_SRQ", .pme_code = 0x717, .pme_short_desc = "LSU1 marked SRQ flushes", .pme_long_desc = "A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group.", }, [ PPC970_PME_PM_GCT_FULL_CYC ] = { .pme_name = "PM_GCT_FULL_CYC", .pme_code = 0x300, .pme_short_desc = "Cycles GCT full", .pme_long_desc = "The ISU sends a signal indicating the gct is full. 
", }, [ PPC970_PME_PM_INST_FROM_MEM ] = { .pme_name = "PM_INST_FROM_MEM", .pme_code = 0x2426, .pme_short_desc = "Instruction fetched from memory", .pme_long_desc = "Instruction fetched from memory", }, [ PPC970_PME_PM_FLUSH_LSU_BR_MPRED ] = { .pme_name = "PM_FLUSH_LSU_BR_MPRED", .pme_code = 0x317, .pme_short_desc = "Flush caused by LSU or branch mispredict", .pme_long_desc = "Flush caused by LSU or branch mispredict", }, [ PPC970_PME_PM_FXU_BUSY ] = { .pme_name = "PM_FXU_BUSY", .pme_code = 0x6002, .pme_short_desc = "FXU busy", .pme_long_desc = "FXU0 and FXU1 are both busy", }, [ PPC970_PME_PM_ST_REF_L1_LSU1 ] = { .pme_name = "PM_ST_REF_L1_LSU1", .pme_code = 0x815, .pme_short_desc = "LSU1 L1 D cache store references", .pme_long_desc = "A store executed on unit 1", }, [ PPC970_PME_PM_MRK_LD_MISS_L1 ] = { .pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 0x1720, .pme_short_desc = "Marked L1 D cache load misses", .pme_long_desc = "Marked L1 D cache load misses", }, [ PPC970_PME_PM_L1_WRITE_CYC ] = { .pme_name = "PM_L1_WRITE_CYC", .pme_code = 0x434, .pme_short_desc = "Cycles writing to instruction L1", .pme_long_desc = "This signal is asserted each cycle a cache write is active.", }, [ PPC970_PME_PM_LSU_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU_REJECT_LMQ_FULL", .pme_code = 0x2920, .pme_short_desc = "LSU reject due to LMQ full or missed data coming", .pme_long_desc = "LSU reject due to LMQ full or missed data coming", }, [ PPC970_PME_PM_FPU_ALL ] = { .pme_name = "PM_FPU_ALL", .pme_code = 0x5100, .pme_short_desc = "FPU executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "This signal is active for one cycle when FPU is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo. 
Combined Unit 0 + Unit 1", }, [ PPC970_PME_PM_LSU_SRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_SRQ_S0_ALLOC", .pme_code = 0x825, .pme_short_desc = "SRQ slot 0 allocated", .pme_long_desc = "SRQ Slot zero was allocated", }, [ PPC970_PME_PM_INST_FROM_L25_SHR ] = { .pme_name = "PM_INST_FROM_L25_SHR", .pme_code = 0x5426, .pme_short_desc = "Instruction fetched from L2.5 shared", .pme_long_desc = "Instruction fetched from L2.5 shared", }, [ PPC970_PME_PM_GRP_MRK ] = { .pme_name = "PM_GRP_MRK", .pme_code = 0x5004, .pme_short_desc = "Group marked in IDU", .pme_long_desc = "A group was sampled (marked)", }, [ PPC970_PME_PM_BR_MPRED_CR ] = { .pme_name = "PM_BR_MPRED_CR", .pme_code = 0x432, .pme_short_desc = "Branch mispredictions due to CR bit setting", .pme_long_desc = "This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction.", }, [ PPC970_PME_PM_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_DC_PREF_STREAM_ALLOC", .pme_code = 0x737, .pme_short_desc = "D cache new prefetch stream allocated", .pme_long_desc = "A new Prefetch Stream was allocated", }, [ PPC970_PME_PM_FPU1_FIN ] = { .pme_name = "PM_FPU1_FIN", .pme_code = 0x117, .pme_short_desc = "FPU1 produced a result", .pme_long_desc = "fp1 finished, produced a result. This only indicates finish, not completion. ", }, [ PPC970_PME_PM_LSU_REJECT_SRQ ] = { .pme_name = "PM_LSU_REJECT_SRQ", .pme_code = 0x1920, .pme_short_desc = "LSU SRQ rejects", .pme_long_desc = "LSU SRQ rejects", }, [ PPC970_PME_PM_BR_MPRED_TA ] = { .pme_name = "PM_BR_MPRED_TA", .pme_code = 0x433, .pme_short_desc = "Branch mispredictions due to target address", .pme_long_desc = "branch miss predict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. 
This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction.", }, [ PPC970_PME_PM_CRQ_FULL_CYC ] = { .pme_name = "PM_CRQ_FULL_CYC", .pme_code = 0x311, .pme_short_desc = "Cycles CR issue queue full", .pme_long_desc = "The ISU sends a signal indicating that the issue queue that feeds the ifu cr unit cannot accept any more groups (queue is full of groups).", }, [ PPC970_PME_PM_LD_MISS_L1 ] = { .pme_name = "PM_LD_MISS_L1", .pme_code = 0x3810, .pme_short_desc = "L1 D cache load misses", .pme_long_desc = "Total DL1 Load references that miss the DL1", }, [ PPC970_PME_PM_INST_FROM_PREF ] = { .pme_name = "PM_INST_FROM_PREF", .pme_code = 0x342d, .pme_short_desc = "Instructions fetched from prefetch", .pme_long_desc = "An instruction fetch group was fetched from the prefetch buffer. Fetch Groups can contain up to 8 instructions", }, [ PPC970_PME_PM_STCX_PASS ] = { .pme_name = "PM_STCX_PASS", .pme_code = 0x725, .pme_short_desc = "Stcx passes", .pme_long_desc = "A stcx (stwcx or stdcx) instruction was successful", }, [ PPC970_PME_PM_DC_INV_L2 ] = { .pme_name = "PM_DC_INV_L2", .pme_code = 0x817, .pme_short_desc = "L1 D cache entries invalidated from L2", .pme_long_desc = "A dcache invalidate was received from the L2 because a line in L2 was castout.", }, [ PPC970_PME_PM_LSU_SRQ_FULL_CYC ] = { .pme_name = "PM_LSU_SRQ_FULL_CYC", .pme_code = 0x313, .pme_short_desc = "Cycles SRQ full", .pme_long_desc = "The ISU sends this signal when the srq is full.", }, [ PPC970_PME_PM_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_LSU0_FLUSH_LRQ", .pme_code = 0x802, .pme_short_desc = "LSU0 LRQ flushes", .pme_long_desc = "A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ PPC970_PME_PM_LSU_SRQ_S0_VALID ] = { 
.pme_name = "PM_LSU_SRQ_S0_VALID", .pme_code = 0x821, .pme_short_desc = "SRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Store Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin.", }, [ PPC970_PME_PM_LARX_LSU0 ] = { .pme_name = "PM_LARX_LSU0", .pme_code = 0x727, .pme_short_desc = "Larx executed on LSU0", .pme_long_desc = "A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0)", }, [ PPC970_PME_PM_GCT_EMPTY_CYC ] = { .pme_name = "PM_GCT_EMPTY_CYC", .pme_code = 0x1004, .pme_short_desc = "Cycles GCT empty", .pme_long_desc = "The Global Completion Table is completely empty", }, [ PPC970_PME_PM_FPU1_ALL ] = { .pme_name = "PM_FPU1_ALL", .pme_code = 0x107, .pme_short_desc = "FPU1 executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo", }, [ PPC970_PME_PM_FPU1_FSQRT ] = { .pme_name = "PM_FPU1_FSQRT", .pme_code = 0x106, .pme_short_desc = "FPU1 executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp1 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ PPC970_PME_PM_FPU_FIN ] = { .pme_name = "PM_FPU_FIN", .pme_code = 0x4110, .pme_short_desc = "FPU produced a result", .pme_long_desc = "FPU finished, produced a result This only indicates finish, not completion. 
Combined Unit 0 + Unit 1", }, [ PPC970_PME_PM_LSU_SRQ_STFWD ] = { .pme_name = "PM_LSU_SRQ_STFWD", .pme_code = 0x1820, .pme_short_desc = "SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load", }, [ PPC970_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU1", .pme_code = 0x724, .pme_short_desc = "LSU1 L1 D cache load misses", .pme_long_desc = "A marked load, executing on unit 1, missed the dcache", }, [ PPC970_PME_PM_FXU0_FIN ] = { .pme_name = "PM_FXU0_FIN", .pme_code = 0x332, .pme_short_desc = "FXU0 produced a result", .pme_long_desc = "The Fixed Point unit 0 finished an instruction and produced a result", }, [ PPC970_PME_PM_MRK_FPU_FIN ] = { .pme_name = "PM_MRK_FPU_FIN", .pme_code = 0x7004, .pme_short_desc = "Marked instruction FPU processing finished", .pme_long_desc = "One of the Floating Point Units finished a marked instruction. Instructions that finish may not necessarily complete", }, [ PPC970_PME_PM_PMC5_OVERFLOW ] = { .pme_name = "PM_PMC5_OVERFLOW", .pme_code = 0x600a, .pme_short_desc = "PMC5 Overflow", .pme_long_desc = "PMC5 Overflow", }, [ PPC970_PME_PM_SNOOP_TLBIE ] = { .pme_name = "PM_SNOOP_TLBIE", .pme_code = 0x703, .pme_short_desc = "Snoop TLBIE", .pme_long_desc = "A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction.", }, [ PPC970_PME_PM_FPU1_FRSP_FCONV ] = { .pme_name = "PM_FPU1_FRSP_FCONV", .pme_code = 0x115, .pme_short_desc = "FPU1 executed FRSP or FCONV instructions", .pme_long_desc = "This signal is active for one cycle when fp1 is executing frsp or convert kind of instruction. 
This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ PPC970_PME_PM_FPU0_FDIV ] = { .pme_name = "PM_FPU0_FDIV", .pme_code = 0x100, .pme_short_desc = "FPU0 executed FDIV instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp0 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs.", }, [ PPC970_PME_PM_LD_REF_L1_LSU1 ] = { .pme_name = "PM_LD_REF_L1_LSU1", .pme_code = 0x814, .pme_short_desc = "LSU1 L1 D cache load references", .pme_long_desc = "A load executed on unit 1", }, [ PPC970_PME_PM_HV_CYC ] = { .pme_name = "PM_HV_CYC", .pme_code = 0x3004, .pme_short_desc = "Hypervisor Cycles", .pme_long_desc = "Cycles when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0)", }, [ PPC970_PME_PM_LR_CTR_MAP_FULL_CYC ] = { .pme_name = "PM_LR_CTR_MAP_FULL_CYC", .pme_code = 0x306, .pme_short_desc = "Cycles LR/CTR mapper full", .pme_long_desc = "The ISU sends a signal indicating that the lr/ctr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", }, [ PPC970_PME_PM_FPU_DENORM ] = { .pme_name = "PM_FPU_DENORM", .pme_code = 0x1120, .pme_short_desc = "FPU received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized. 
Combined Unit 0 + Unit 1", }, [ PPC970_PME_PM_LSU0_REJECT_SRQ ] = { .pme_name = "PM_LSU0_REJECT_SRQ", .pme_code = 0x920, .pme_short_desc = "LSU0 SRQ rejects", .pme_long_desc = "LSU0 SRQ rejects", }, [ PPC970_PME_PM_LSU1_REJECT_SRQ ] = { .pme_name = "PM_LSU1_REJECT_SRQ", .pme_code = 0x924, .pme_short_desc = "LSU1 SRQ rejects", .pme_long_desc = "LSU1 SRQ rejects", }, [ PPC970_PME_PM_LSU1_DERAT_MISS ] = { .pme_name = "PM_LSU1_DERAT_MISS", .pme_code = 0x706, .pme_short_desc = "LSU1 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", }, [ PPC970_PME_PM_IC_PREF_REQ ] = { .pme_name = "PM_IC_PREF_REQ", .pme_code = 0x426, .pme_short_desc = "Instruction prefetch requests", .pme_long_desc = "Asserted when a non-canceled prefetch is made to the cache interface unit (CIU).", }, [ PPC970_PME_PM_MRK_LSU_FIN ] = { .pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x8004, .pme_short_desc = "Marked instruction LSU processing finished", .pme_long_desc = "One of the Load/Store Units finished a marked instruction. 
Instructions that finish may not necessarily complete", }, [ PPC970_PME_PM_MRK_DATA_FROM_MEM ] = { .pme_name = "PM_MRK_DATA_FROM_MEM", .pme_code = 0x3937, .pme_short_desc = "Marked data loaded from memory", .pme_long_desc = "Marked data loaded from memory", }, [ PPC970_PME_PM_LSU0_FLUSH_UST ] = { .pme_name = "PM_LSU0_FLUSH_UST", .pme_code = 0x801, .pme_short_desc = "LSU0 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 0 because it was unaligned (crossed a 4k boundary)", }, [ PPC970_PME_PM_LSU_FLUSH_LRQ ] = { .pme_name = "PM_LSU_FLUSH_LRQ", .pme_code = 0x6800, .pme_short_desc = "LRQ flushes", .pme_long_desc = "A load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ PPC970_PME_PM_LSU_FLUSH_SRQ ] = { .pme_name = "PM_LSU_FLUSH_SRQ", .pme_code = 0x5800, .pme_short_desc = "SRQ flushes", .pme_long_desc = "A store was flushed because a younger load hits an older store that is already in the SRQ or in the same group.", } }; #endif papi-papi-7-2-0-t/src/libpfm4/lib/events/ppc970mp_events.h000066400000000000000000002120261502707512200231550ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __PPC970MP_EVENTS_H__ #define __PPC970MP_EVENTS_H__ /* * File: ppc970mp_events.h * CVS: * Author: Corey Ashford * cjashfor@us.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2009. All Rights Reserved. * Contributed by Corey Ashford * * Note: This code was automatically generated and should not be modified by * hand. 
* */ #define PPC970MP_PME_PM_LSU_REJECT_RELOAD_CDF 0 #define PPC970MP_PME_PM_MRK_LSU_SRQ_INST_VALID 1 #define PPC970MP_PME_PM_FPU1_SINGLE 2 #define PPC970MP_PME_PM_FPU0_STALL3 3 #define PPC970MP_PME_PM_TB_BIT_TRANS 4 #define PPC970MP_PME_PM_GPR_MAP_FULL_CYC 5 #define PPC970MP_PME_PM_MRK_ST_CMPL 6 #define PPC970MP_PME_PM_FPU0_STF 7 #define PPC970MP_PME_PM_FPU1_FMA 8 #define PPC970MP_PME_PM_LSU1_FLUSH_ULD 9 #define PPC970MP_PME_PM_MRK_INST_FIN 10 #define PPC970MP_PME_PM_MRK_LSU0_FLUSH_UST 11 #define PPC970MP_PME_PM_LSU_LRQ_S0_ALLOC 12 #define PPC970MP_PME_PM_FPU_FDIV 13 #define PPC970MP_PME_PM_FPU0_FULL_CYC 14 #define PPC970MP_PME_PM_FPU_SINGLE 15 #define PPC970MP_PME_PM_FPU0_FMA 16 #define PPC970MP_PME_PM_MRK_LSU1_FLUSH_ULD 17 #define PPC970MP_PME_PM_LSU1_FLUSH_LRQ 18 #define PPC970MP_PME_PM_DTLB_MISS 19 #define PPC970MP_PME_PM_CMPLU_STALL_FXU 20 #define PPC970MP_PME_PM_MRK_ST_MISS_L1 21 #define PPC970MP_PME_PM_EXT_INT 22 #define PPC970MP_PME_PM_MRK_LSU1_FLUSH_LRQ 23 #define PPC970MP_PME_PM_MRK_ST_GPS 24 #define PPC970MP_PME_PM_GRP_DISP_SUCCESS 25 #define PPC970MP_PME_PM_LSU1_LDF 26 #define PPC970MP_PME_PM_LSU0_SRQ_STFWD 27 #define PPC970MP_PME_PM_CR_MAP_FULL_CYC 28 #define PPC970MP_PME_PM_MRK_LSU0_FLUSH_ULD 29 #define PPC970MP_PME_PM_LSU_DERAT_MISS 30 #define PPC970MP_PME_PM_FPU0_SINGLE 31 #define PPC970MP_PME_PM_FPU1_FDIV 32 #define PPC970MP_PME_PM_FPU1_FEST 33 #define PPC970MP_PME_PM_FPU0_FRSP_FCONV 34 #define PPC970MP_PME_PM_GCT_EMPTY_SRQ_FULL 35 #define PPC970MP_PME_PM_MRK_ST_CMPL_INT 36 #define PPC970MP_PME_PM_FLUSH_BR_MPRED 37 #define PPC970MP_PME_PM_FXU_FIN 38 #define PPC970MP_PME_PM_FPU_STF 39 #define PPC970MP_PME_PM_DSLB_MISS 40 #define PPC970MP_PME_PM_FXLS1_FULL_CYC 41 #define PPC970MP_PME_PM_CMPLU_STALL_FPU 42 #define PPC970MP_PME_PM_LSU_LMQ_LHR_MERGE 43 #define PPC970MP_PME_PM_MRK_STCX_FAIL 44 #define PPC970MP_PME_PM_FXU0_BUSY_FXU1_IDLE 45 #define PPC970MP_PME_PM_CMPLU_STALL_LSU 46 #define PPC970MP_PME_PM_MRK_DATA_FROM_L25_SHR 47 #define 
PPC970MP_PME_PM_LSU_FLUSH_ULD 48 #define PPC970MP_PME_PM_MRK_BRU_FIN 49 #define PPC970MP_PME_PM_IERAT_XLATE_WR 50 #define PPC970MP_PME_PM_GCT_EMPTY_BR_MPRED 51 #define PPC970MP_PME_PM_LSU0_BUSY 52 #define PPC970MP_PME_PM_DATA_FROM_MEM 53 #define PPC970MP_PME_PM_FPR_MAP_FULL_CYC 54 #define PPC970MP_PME_PM_FPU1_FULL_CYC 55 #define PPC970MP_PME_PM_FPU0_FIN 56 #define PPC970MP_PME_PM_GRP_BR_REDIR 57 #define PPC970MP_PME_PM_GCT_EMPTY_IC_MISS 58 #define PPC970MP_PME_PM_THRESH_TIMEO 59 #define PPC970MP_PME_PM_FPU_FSQRT 60 #define PPC970MP_PME_PM_MRK_LSU0_FLUSH_LRQ 61 #define PPC970MP_PME_PM_PMC1_OVERFLOW 62 #define PPC970MP_PME_PM_FXLS0_FULL_CYC 63 #define PPC970MP_PME_PM_FPU0_ALL 64 #define PPC970MP_PME_PM_DATA_TABLEWALK_CYC 65 #define PPC970MP_PME_PM_FPU0_FEST 66 #define PPC970MP_PME_PM_DATA_FROM_L25_MOD 67 #define PPC970MP_PME_PM_LSU0_REJECT_ERAT_MISS 68 #define PPC970MP_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC 69 #define PPC970MP_PME_PM_LSU0_REJECT_RELOAD_CDF 70 #define PPC970MP_PME_PM_FPU_FEST 71 #define PPC970MP_PME_PM_0INST_FETCH 72 #define PPC970MP_PME_PM_LD_MISS_L1_LSU0 73 #define PPC970MP_PME_PM_LSU1_REJECT_RELOAD_CDF 74 #define PPC970MP_PME_PM_L1_PREF 75 #define PPC970MP_PME_PM_FPU1_STALL3 76 #define PPC970MP_PME_PM_BRQ_FULL_CYC 77 #define PPC970MP_PME_PM_PMC8_OVERFLOW 78 #define PPC970MP_PME_PM_PMC7_OVERFLOW 79 #define PPC970MP_PME_PM_WORK_HELD 80 #define PPC970MP_PME_PM_MRK_LD_MISS_L1_LSU0 81 #define PPC970MP_PME_PM_FXU_IDLE 82 #define PPC970MP_PME_PM_INST_CMPL 83 #define PPC970MP_PME_PM_LSU1_FLUSH_UST 84 #define PPC970MP_PME_PM_LSU0_FLUSH_ULD 85 #define PPC970MP_PME_PM_LSU_FLUSH 86 #define PPC970MP_PME_PM_INST_FROM_L2 87 #define PPC970MP_PME_PM_LSU1_REJECT_LMQ_FULL 88 #define PPC970MP_PME_PM_PMC2_OVERFLOW 89 #define PPC970MP_PME_PM_FPU0_DENORM 90 #define PPC970MP_PME_PM_FPU1_FMOV_FEST 91 #define PPC970MP_PME_PM_INST_FETCH_CYC 92 #define PPC970MP_PME_PM_GRP_DISP_REJECT 93 #define PPC970MP_PME_PM_LSU_LDF 94 #define PPC970MP_PME_PM_INST_DISP 95 #define 
PPC970MP_PME_PM_DATA_FROM_L25_SHR 96 #define PPC970MP_PME_PM_L1_DCACHE_RELOAD_VALID 97 #define PPC970MP_PME_PM_MRK_GRP_ISSUED 98 #define PPC970MP_PME_PM_FPU_FMA 99 #define PPC970MP_PME_PM_MRK_CRU_FIN 100 #define PPC970MP_PME_PM_CMPLU_STALL_REJECT 101 #define PPC970MP_PME_PM_MRK_LSU1_FLUSH_UST 102 #define PPC970MP_PME_PM_MRK_FXU_FIN 103 #define PPC970MP_PME_PM_LSU1_REJECT_ERAT_MISS 104 #define PPC970MP_PME_PM_BR_ISSUED 105 #define PPC970MP_PME_PM_PMC4_OVERFLOW 106 #define PPC970MP_PME_PM_EE_OFF 107 #define PPC970MP_PME_PM_INST_FROM_L25_MOD 108 #define PPC970MP_PME_PM_CMPLU_STALL_ERAT_MISS 109 #define PPC970MP_PME_PM_ITLB_MISS 110 #define PPC970MP_PME_PM_FXU1_BUSY_FXU0_IDLE 111 #define PPC970MP_PME_PM_GRP_DISP_VALID 112 #define PPC970MP_PME_PM_MRK_GRP_DISP 113 #define PPC970MP_PME_PM_LSU_FLUSH_UST 114 #define PPC970MP_PME_PM_FXU1_FIN 115 #define PPC970MP_PME_PM_GRP_CMPL 116 #define PPC970MP_PME_PM_FPU_FRSP_FCONV 117 #define PPC970MP_PME_PM_MRK_LSU0_FLUSH_SRQ 118 #define PPC970MP_PME_PM_CMPLU_STALL_OTHER 119 #define PPC970MP_PME_PM_LSU_LMQ_FULL_CYC 120 #define PPC970MP_PME_PM_ST_REF_L1_LSU0 121 #define PPC970MP_PME_PM_LSU0_DERAT_MISS 122 #define PPC970MP_PME_PM_LSU_SRQ_SYNC_CYC 123 #define PPC970MP_PME_PM_FPU_STALL3 124 #define PPC970MP_PME_PM_LSU_REJECT_ERAT_MISS 125 #define PPC970MP_PME_PM_MRK_DATA_FROM_L2 126 #define PPC970MP_PME_PM_LSU0_FLUSH_SRQ 127 #define PPC970MP_PME_PM_FPU0_FMOV_FEST 128 #define PPC970MP_PME_PM_IOPS_CMPL 129 #define PPC970MP_PME_PM_LD_REF_L1_LSU0 130 #define PPC970MP_PME_PM_LSU1_FLUSH_SRQ 131 #define PPC970MP_PME_PM_CMPLU_STALL_DIV 132 #define PPC970MP_PME_PM_GRP_BR_MPRED 133 #define PPC970MP_PME_PM_LSU_LMQ_S0_ALLOC 134 #define PPC970MP_PME_PM_LSU0_REJECT_LMQ_FULL 135 #define PPC970MP_PME_PM_ST_REF_L1 136 #define PPC970MP_PME_PM_MRK_VMX_FIN 137 #define PPC970MP_PME_PM_LSU_SRQ_EMPTY_CYC 138 #define PPC970MP_PME_PM_FPU1_STF 139 #define PPC970MP_PME_PM_RUN_CYC 140 #define PPC970MP_PME_PM_LSU_LMQ_S0_VALID 141 #define PPC970MP_PME_PM_LSU0_LDF 142 
#define PPC970MP_PME_PM_LSU_LRQ_S0_VALID	143
#define PPC970MP_PME_PM_PMC3_OVERFLOW	144
#define PPC970MP_PME_PM_MRK_IMR_RELOAD	145
#define PPC970MP_PME_PM_MRK_GRP_TIMEO	146
#define PPC970MP_PME_PM_FPU_FMOV_FEST	147
#define PPC970MP_PME_PM_GRP_DISP_BLK_SB_CYC	148
#define PPC970MP_PME_PM_XER_MAP_FULL_CYC	149
#define PPC970MP_PME_PM_ST_MISS_L1	150
#define PPC970MP_PME_PM_STOP_COMPLETION	151
#define PPC970MP_PME_PM_MRK_GRP_CMPL	152
#define PPC970MP_PME_PM_ISLB_MISS	153
#define PPC970MP_PME_PM_SUSPENDED	154
#define PPC970MP_PME_PM_CYC	155
#define PPC970MP_PME_PM_LD_MISS_L1_LSU1	156
#define PPC970MP_PME_PM_STCX_FAIL	157
#define PPC970MP_PME_PM_LSU1_SRQ_STFWD	158
#define PPC970MP_PME_PM_GRP_DISP	159
#define PPC970MP_PME_PM_L2_PREF	160
#define PPC970MP_PME_PM_FPU1_DENORM	161
#define PPC970MP_PME_PM_DATA_FROM_L2	162
#define PPC970MP_PME_PM_FPU0_FPSCR	163
#define PPC970MP_PME_PM_MRK_DATA_FROM_L25_MOD	164
#define PPC970MP_PME_PM_FPU0_FSQRT	165
#define PPC970MP_PME_PM_LD_REF_L1	166
#define PPC970MP_PME_PM_MRK_L1_RELOAD_VALID	167
#define PPC970MP_PME_PM_1PLUS_PPC_CMPL	168
#define PPC970MP_PME_PM_INST_FROM_L1	169
#define PPC970MP_PME_PM_EE_OFF_EXT_INT	170
#define PPC970MP_PME_PM_PMC6_OVERFLOW	171
#define PPC970MP_PME_PM_LSU_LRQ_FULL_CYC	172
#define PPC970MP_PME_PM_IC_PREF_INSTALL	173
#define PPC970MP_PME_PM_DC_PREF_OUT_OF_STREAMS	174
#define PPC970MP_PME_PM_MRK_LSU1_FLUSH_SRQ	175
#define PPC970MP_PME_PM_GCT_FULL_CYC	176
#define PPC970MP_PME_PM_INST_FROM_MEM	177
#define PPC970MP_PME_PM_FLUSH_LSU_BR_MPRED	178
#define PPC970MP_PME_PM_FXU_BUSY	179
#define PPC970MP_PME_PM_ST_REF_L1_LSU1	180
#define PPC970MP_PME_PM_MRK_LD_MISS_L1	181
#define PPC970MP_PME_PM_L1_WRITE_CYC	182
#define PPC970MP_PME_PM_LSU1_BUSY	183
#define PPC970MP_PME_PM_LSU_REJECT_LMQ_FULL	184
#define PPC970MP_PME_PM_CMPLU_STALL_FDIV	185
#define PPC970MP_PME_PM_FPU_ALL	186
#define PPC970MP_PME_PM_LSU_SRQ_S0_ALLOC	187
#define PPC970MP_PME_PM_INST_FROM_L25_SHR	188
#define PPC970MP_PME_PM_GRP_MRK	189
#define PPC970MP_PME_PM_BR_MPRED_CR	190
#define PPC970MP_PME_PM_DC_PREF_STREAM_ALLOC	191
#define PPC970MP_PME_PM_FPU1_FIN	192
#define PPC970MP_PME_PM_LSU_REJECT_SRQ	193
#define PPC970MP_PME_PM_BR_MPRED_TA	194
#define PPC970MP_PME_PM_CRQ_FULL_CYC	195
#define PPC970MP_PME_PM_LD_MISS_L1	196
#define PPC970MP_PME_PM_INST_FROM_PREF	197
#define PPC970MP_PME_PM_STCX_PASS	198
#define PPC970MP_PME_PM_DC_INV_L2	199
#define PPC970MP_PME_PM_LSU_SRQ_FULL_CYC	200
#define PPC970MP_PME_PM_LSU0_FLUSH_LRQ	201
#define PPC970MP_PME_PM_LSU_SRQ_S0_VALID	202
#define PPC970MP_PME_PM_LARX_LSU0	203
#define PPC970MP_PME_PM_GCT_EMPTY_CYC	204
#define PPC970MP_PME_PM_FPU1_ALL	205
#define PPC970MP_PME_PM_FPU1_FSQRT	206
#define PPC970MP_PME_PM_FPU_FIN	207
#define PPC970MP_PME_PM_LSU_SRQ_STFWD	208
#define PPC970MP_PME_PM_MRK_LD_MISS_L1_LSU1	209
#define PPC970MP_PME_PM_FXU0_FIN	210
#define PPC970MP_PME_PM_MRK_FPU_FIN	211
#define PPC970MP_PME_PM_PMC5_OVERFLOW	212
#define PPC970MP_PME_PM_SNOOP_TLBIE	213
#define PPC970MP_PME_PM_FPU1_FRSP_FCONV	214
#define PPC970MP_PME_PM_FPU0_FDIV	215
#define PPC970MP_PME_PM_LD_REF_L1_LSU1	216
#define PPC970MP_PME_PM_HV_CYC	217
#define PPC970MP_PME_PM_LR_CTR_MAP_FULL_CYC	218
#define PPC970MP_PME_PM_FPU_DENORM	219
#define PPC970MP_PME_PM_LSU0_REJECT_SRQ	220
#define PPC970MP_PME_PM_LSU1_REJECT_SRQ	221
#define PPC970MP_PME_PM_LSU1_DERAT_MISS	222
#define PPC970MP_PME_PM_IC_PREF_REQ	223
#define PPC970MP_PME_PM_MRK_LSU_FIN	224
#define PPC970MP_PME_PM_MRK_DATA_FROM_MEM	225
#define PPC970MP_PME_PM_CMPLU_STALL_DCACHE_MISS	226
#define PPC970MP_PME_PM_LSU0_FLUSH_UST	227
#define PPC970MP_PME_PM_LSU_FLUSH_LRQ	228
#define PPC970MP_PME_PM_LSU_FLUSH_SRQ	229

static const pme_power_entry_t ppc970mp_pe[] = {
	[ PPC970MP_PME_PM_LSU_REJECT_RELOAD_CDF ] = {
		.pme_name = "PM_LSU_REJECT_RELOAD_CDF",
		.pme_code = 0x6920,
		.pme_short_desc = "LSU reject due to reload CDF or tag update collision",
		.pme_long_desc = "LSU reject due to reload CDF or tag update collision",
	},
	[ PPC970MP_PME_PM_MRK_LSU_SRQ_INST_VALID ] = {
		.pme_name = "PM_MRK_LSU_SRQ_INST_VALID",
		.pme_code = 0x936,
		.pme_short_desc = "Marked instruction valid in SRQ",
		.pme_long_desc = "This signal is asserted every cycle when a marked request is resident in the Store Request Queue",
	},
	[ PPC970MP_PME_PM_FPU1_SINGLE ] = {
		.pme_name = "PM_FPU1_SINGLE",
		.pme_code = 0x127,
		.pme_short_desc = "FPU1 executed single precision instruction",
		.pme_long_desc = "This signal is active for one cycle when fp1 is executing single precision instruction.",
	},
	[ PPC970MP_PME_PM_FPU0_STALL3 ] = {
		.pme_name = "PM_FPU0_STALL3",
		.pme_code = 0x121,
		.pme_short_desc = "FPU0 stalled in pipe3",
		.pme_long_desc = "This signal indicates that fp0 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. ",
	},
	[ PPC970MP_PME_PM_TB_BIT_TRANS ] = {
		.pme_name = "PM_TB_BIT_TRANS",
		.pme_code = 0x8005,
		.pme_short_desc = "Time Base bit transition",
		.pme_long_desc = "When the selected time base bit (as specified in MMCR0[TBSEL])transitions from 0 to 1 ",
	},
	[ PPC970MP_PME_PM_GPR_MAP_FULL_CYC ] = {
		.pme_name = "PM_GPR_MAP_FULL_CYC",
		.pme_code = 0x335,
		.pme_short_desc = "Cycles GPR mapper full",
		.pme_long_desc = "The ISU sends a signal indicating that the gpr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.",
	},
	[ PPC970MP_PME_PM_MRK_ST_CMPL ] = {
		.pme_name = "PM_MRK_ST_CMPL",
		.pme_code = 0x1003,
		.pme_short_desc = "Marked store instruction completed",
		.pme_long_desc = "A sampled store has completed (data home)",
	},
	[ PPC970MP_PME_PM_FPU0_STF ] = {
		.pme_name = "PM_FPU0_STF",
		.pme_code = 0x122,
		.pme_short_desc = "FPU0 executed store instruction",
		.pme_long_desc = "This signal is active for one cycle when fp0 is executing a store instruction.",
	},
	[ PPC970MP_PME_PM_FPU1_FMA ] = {
		.pme_name = "PM_FPU1_FMA",
		.pme_code = 0x105,
		.pme_short_desc = "FPU1 executed multiply-add instruction",
		.pme_long_desc = "This signal is active for one cycle when fp1 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.",
	},
	[ PPC970MP_PME_PM_LSU1_FLUSH_ULD ] = {
		.pme_name = "PM_LSU1_FLUSH_ULD",
		.pme_code = 0x804,
		.pme_short_desc = "LSU1 unaligned load flushes",
		.pme_long_desc = "A load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)",
	},
	[ PPC970MP_PME_PM_MRK_INST_FIN ] = {
		.pme_name = "PM_MRK_INST_FIN",
		.pme_code = 0x7005,
		.pme_short_desc = "Marked instruction finished",
		.pme_long_desc = "One of the execution units finished a marked instruction. Instructions that finish may not necessary complete",
	},
	[ PPC970MP_PME_PM_MRK_LSU0_FLUSH_UST ] = {
		.pme_name = "PM_MRK_LSU0_FLUSH_UST",
		.pme_code = 0x711,
		.pme_short_desc = "LSU0 marked unaligned store flushes",
		.pme_long_desc = "A marked store was flushed from unit 0 because it was unaligned",
	},
	[ PPC970MP_PME_PM_LSU_LRQ_S0_ALLOC ] = {
		.pme_name = "PM_LSU_LRQ_S0_ALLOC",
		.pme_code = 0x826,
		.pme_short_desc = "LRQ slot 0 allocated",
		.pme_long_desc = "LRQ slot zero was allocated",
	},
	[ PPC970MP_PME_PM_FPU_FDIV ] = {
		.pme_name = "PM_FPU_FDIV",
		.pme_code = 0x1100,
		.pme_short_desc = "FPU executed FDIV instruction",
		.pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when FPU is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs. Combined Unit 0 + Unit 1",
	},
	[ PPC970MP_PME_PM_FPU0_FULL_CYC ] = {
		.pme_name = "PM_FPU0_FULL_CYC",
		.pme_code = 0x303,
		.pme_short_desc = "Cycles FPU0 issue queue full",
		.pme_long_desc = "The issue queue for FPU unit 0 cannot accept any more instructions. Issue is stopped",
	},
	[ PPC970MP_PME_PM_FPU_SINGLE ] = {
		.pme_name = "PM_FPU_SINGLE",
		.pme_code = 0x5120,
		.pme_short_desc = "FPU executed single precision instruction",
		.pme_long_desc = "FPU is executing single precision instruction. Combined Unit 0 + Unit 1",
	},
	[ PPC970MP_PME_PM_FPU0_FMA ] = {
		.pme_name = "PM_FPU0_FMA",
		.pme_code = 0x101,
		.pme_short_desc = "FPU0 executed multiply-add instruction",
		.pme_long_desc = "This signal is active for one cycle when fp0 is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs.",
	},
	[ PPC970MP_PME_PM_MRK_LSU1_FLUSH_ULD ] = {
		.pme_name = "PM_MRK_LSU1_FLUSH_ULD",
		.pme_code = 0x714,
		.pme_short_desc = "LSU1 marked unaligned load flushes",
		.pme_long_desc = "A marked load was flushed from unit 1 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)",
	},
	[ PPC970MP_PME_PM_LSU1_FLUSH_LRQ ] = {
		.pme_name = "PM_LSU1_FLUSH_LRQ",
		.pme_code = 0x806,
		.pme_short_desc = "LSU1 LRQ flushes",
		.pme_long_desc = "A load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.",
	},
	[ PPC970MP_PME_PM_DTLB_MISS ] = {
		.pme_name = "PM_DTLB_MISS",
		.pme_code = 0x704,
		.pme_short_desc = "Data TLB misses",
		.pme_long_desc = "A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction.",
	},
	[ PPC970MP_PME_PM_CMPLU_STALL_FXU ] = {
		.pme_name = "PM_CMPLU_STALL_FXU",
		.pme_code = 0x508b,
		.pme_short_desc = "Completion stall caused by FXU instruction",
		.pme_long_desc = "Completion stall caused by FXU instruction",
	},
	[ PPC970MP_PME_PM_MRK_ST_MISS_L1 ] = {
		.pme_name = "PM_MRK_ST_MISS_L1",
		.pme_code = 0x723,
		.pme_short_desc = "Marked L1 D cache store misses",
		.pme_long_desc = "A marked store missed the dcache",
	},
	[ PPC970MP_PME_PM_EXT_INT ] = {
		.pme_name = "PM_EXT_INT",
		.pme_code = 0x8002,
		.pme_short_desc = "External interrupts",
		.pme_long_desc = "An external interrupt occurred",
	},
	[ PPC970MP_PME_PM_MRK_LSU1_FLUSH_LRQ ] = {
		.pme_name = "PM_MRK_LSU1_FLUSH_LRQ",
		.pme_code = 0x716,
		.pme_short_desc = "LSU1 marked LRQ flushes",
		.pme_long_desc = "A marked load was flushed by unit 1 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.",
	},
	[ PPC970MP_PME_PM_MRK_ST_GPS ] = {
		.pme_name = "PM_MRK_ST_GPS",
		.pme_code = 0x6003,
		.pme_short_desc = "Marked store sent to GPS",
		.pme_long_desc = "A sampled store has been sent to the memory subsystem",
	},
	[ PPC970MP_PME_PM_GRP_DISP_SUCCESS ] = {
		.pme_name = "PM_GRP_DISP_SUCCESS",
		.pme_code = 0x5001,
		.pme_short_desc = "Group dispatch success",
		.pme_long_desc = "Number of groups successfully dispatched (not rejected)",
	},
	[ PPC970MP_PME_PM_LSU1_LDF ] = {
		.pme_name = "PM_LSU1_LDF",
		.pme_code = 0x734,
		.pme_short_desc = "LSU1 executed Floating Point load instruction",
		.pme_long_desc = "A floating point load was executed from LSU unit 1",
	},
	[ PPC970MP_PME_PM_LSU0_SRQ_STFWD ] = {
		.pme_name = "PM_LSU0_SRQ_STFWD",
		.pme_code = 0x820,
		.pme_short_desc = "LSU0 SRQ store forwarded",
		.pme_long_desc = "Data from a store instruction was forwarded to a load on unit 0",
	},
	[ PPC970MP_PME_PM_CR_MAP_FULL_CYC ] = {
		.pme_name = "PM_CR_MAP_FULL_CYC",
		.pme_code = 0x304,
		.pme_short_desc = "Cycles CR logical operation mapper full",
		.pme_long_desc = "The ISU sends a signal indicating that the cr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.",
	},
	[ PPC970MP_PME_PM_MRK_LSU0_FLUSH_ULD ] = {
		.pme_name = "PM_MRK_LSU0_FLUSH_ULD",
		.pme_code = 0x710,
		.pme_short_desc = "LSU0 marked unaligned load flushes",
		.pme_long_desc = "A marked load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)",
	},
	[ PPC970MP_PME_PM_LSU_DERAT_MISS ] = {
		.pme_name = "PM_LSU_DERAT_MISS",
		.pme_code = 0x6700,
		.pme_short_desc = "DERAT misses",
		.pme_long_desc = "Total D-ERAT Misses (Unit 0 + Unit 1). Requests that miss the Derat are rejected and retried until the request hits in the Erat. This may result in multiple erat misses for the same instruction.",
	},
	[ PPC970MP_PME_PM_FPU0_SINGLE ] = {
		.pme_name = "PM_FPU0_SINGLE",
		.pme_code = 0x123,
		.pme_short_desc = "FPU0 executed single precision instruction",
		.pme_long_desc = "This signal is active for one cycle when fp0 is executing single precision instruction.",
	},
	[ PPC970MP_PME_PM_FPU1_FDIV ] = {
		.pme_name = "PM_FPU1_FDIV",
		.pme_code = 0x104,
		.pme_short_desc = "FPU1 executed FDIV instruction",
		.pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp1 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs.",
	},
	[ PPC970MP_PME_PM_FPU1_FEST ] = {
		.pme_name = "PM_FPU1_FEST",
		.pme_code = 0x116,
		.pme_short_desc = "FPU1 executed FEST instruction",
		.pme_long_desc = "This signal is active for one cycle when fp1 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. ",
	},
	[ PPC970MP_PME_PM_FPU0_FRSP_FCONV ] = {
		.pme_name = "PM_FPU0_FRSP_FCONV",
		.pme_code = 0x111,
		.pme_short_desc = "FPU0 executed FRSP or FCONV instructions",
		.pme_long_desc = "This signal is active for one cycle when fp0 is executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.",
	},
	[ PPC970MP_PME_PM_GCT_EMPTY_SRQ_FULL ] = {
		.pme_name = "PM_GCT_EMPTY_SRQ_FULL",
		.pme_code = 0x200b,
		.pme_short_desc = "GCT empty caused by SRQ full",
		.pme_long_desc = "GCT empty caused by SRQ full",
	},
	[ PPC970MP_PME_PM_MRK_ST_CMPL_INT ] = {
		.pme_name = "PM_MRK_ST_CMPL_INT",
		.pme_code = 0x3003,
		.pme_short_desc = "Marked store completed with intervention",
		.pme_long_desc = "A marked store previously sent to the memory subsystem completed (data home) after requiring intervention",
	},
	[ PPC970MP_PME_PM_FLUSH_BR_MPRED ] = {
		.pme_name = "PM_FLUSH_BR_MPRED",
		.pme_code = 0x316,
		.pme_short_desc = "Flush caused by branch mispredict",
		.pme_long_desc = "Flush caused by branch mispredict",
	},
	[ PPC970MP_PME_PM_FXU_FIN ] = {
		.pme_name = "PM_FXU_FIN",
		.pme_code = 0x3330,
		.pme_short_desc = "FXU produced a result",
		.pme_long_desc = "The fixed point unit (Unit 0 + Unit 1) finished an instruction. Instructions that finish may not necessary complete.",
	},
	[ PPC970MP_PME_PM_FPU_STF ] = {
		.pme_name = "PM_FPU_STF",
		.pme_code = 0x6120,
		.pme_short_desc = "FPU executed store instruction",
		.pme_long_desc = "FPU is executing a store instruction. Combined Unit 0 + Unit 1",
	},
	[ PPC970MP_PME_PM_DSLB_MISS ] = {
		.pme_name = "PM_DSLB_MISS",
		.pme_code = 0x705,
		.pme_short_desc = "Data SLB misses",
		.pme_long_desc = "A SLB miss for a data request occurred. SLB misses trap to the operating system to resolve",
	},
	[ PPC970MP_PME_PM_FXLS1_FULL_CYC ] = {
		.pme_name = "PM_FXLS1_FULL_CYC",
		.pme_code = 0x314,
		.pme_short_desc = "Cycles FXU1/LS1 queue full",
		.pme_long_desc = "The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped",
	},
	[ PPC970MP_PME_PM_CMPLU_STALL_FPU ] = {
		.pme_name = "PM_CMPLU_STALL_FPU",
		.pme_code = 0x704b,
		.pme_short_desc = "Completion stall caused by FPU instruction",
		.pme_long_desc = "Completion stall caused by FPU instruction",
	},
	[ PPC970MP_PME_PM_LSU_LMQ_LHR_MERGE ] = {
		.pme_name = "PM_LSU_LMQ_LHR_MERGE",
		.pme_code = 0x935,
		.pme_short_desc = "LMQ LHR merges",
		.pme_long_desc = "A dcache miss occurred for the same real cache line address as an earlier request already in the Load Miss Queue and was merged into the LMQ entry.",
	},
	[ PPC970MP_PME_PM_MRK_STCX_FAIL ] = {
		.pme_name = "PM_MRK_STCX_FAIL",
		.pme_code = 0x726,
		.pme_short_desc = "Marked STCX failed",
		.pme_long_desc = "A marked stcx (stwcx or stdcx) failed",
	},
	[ PPC970MP_PME_PM_FXU0_BUSY_FXU1_IDLE ] = {
		.pme_name = "PM_FXU0_BUSY_FXU1_IDLE",
		.pme_code = 0x7002,
		.pme_short_desc = "FXU0 busy FXU1 idle",
		.pme_long_desc = "FXU0 is busy while FXU1 was idle",
	},
	[ PPC970MP_PME_PM_CMPLU_STALL_LSU ] = {
		.pme_name = "PM_CMPLU_STALL_LSU",
		.pme_code = 0x504b,
		.pme_short_desc = "Completion stall caused by LSU instruction",
		.pme_long_desc = "Completion stall caused by LSU instruction",
	},
	[ PPC970MP_PME_PM_MRK_DATA_FROM_L25_SHR ] = {
		.pme_name = "PM_MRK_DATA_FROM_L25_SHR",
		.pme_code = 0x5937,
		.pme_short_desc = "Marked data loaded from L2.5 shared",
		.pme_long_desc = "DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a marked demand load",
	},
	[ PPC970MP_PME_PM_LSU_FLUSH_ULD ] = {
		.pme_name = "PM_LSU_FLUSH_ULD",
		.pme_code = 0x1800,
		.pme_short_desc = "LRQ unaligned load flushes",
		.pme_long_desc = "A load was flushed because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)",
	},
	[ PPC970MP_PME_PM_MRK_BRU_FIN ] = {
		.pme_name = "PM_MRK_BRU_FIN",
		.pme_code = 0x2005,
		.pme_short_desc = "Marked instruction BRU processing finished",
		.pme_long_desc = "The branch unit finished a marked instruction. Instructions that finish may not necessary complete",
	},
	[ PPC970MP_PME_PM_IERAT_XLATE_WR ] = {
		.pme_name = "PM_IERAT_XLATE_WR",
		.pme_code = 0x430,
		.pme_short_desc = "Translation written to ierat",
		.pme_long_desc = "This signal will be asserted each time the I-ERAT is written. This indicates that an ERAT miss has been serviced. ERAT misses will initiate a sequence resulting in the ERAT being written. ERAT misses that are later ignored will not be counted unless the ERAT is written before the instruction stream is changed, This should be a fairly accurate count of ERAT missed (best available).",
	},
	[ PPC970MP_PME_PM_GCT_EMPTY_BR_MPRED ] = {
		.pme_name = "PM_GCT_EMPTY_BR_MPRED",
		.pme_code = 0x708c,
		.pme_short_desc = "GCT empty due to branch mispredict",
		.pme_long_desc = "GCT empty due to branch mispredict",
	},
	[ PPC970MP_PME_PM_LSU0_BUSY ] = {
		.pme_name = "PM_LSU0_BUSY",
		.pme_code = 0x823,
		.pme_short_desc = "LSU0 busy",
		.pme_long_desc = "LSU unit 0 is busy rejecting instructions",
	},
	[ PPC970MP_PME_PM_DATA_FROM_MEM ] = {
		.pme_name = "PM_DATA_FROM_MEM",
		.pme_code = 0x2837,
		.pme_short_desc = "Data loaded from memory",
		.pme_long_desc = "Data loaded from memory",
	},
	[ PPC970MP_PME_PM_FPR_MAP_FULL_CYC ] = {
		.pme_name = "PM_FPR_MAP_FULL_CYC",
		.pme_code = 0x301,
		.pme_short_desc = "Cycles FPR mapper full",
		.pme_long_desc = "The ISU sends a signal indicating that the FPR mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.",
	},
	[ PPC970MP_PME_PM_FPU1_FULL_CYC ] = {
		.pme_name = "PM_FPU1_FULL_CYC",
		.pme_code = 0x307,
		.pme_short_desc = "Cycles FPU1 issue queue full",
		.pme_long_desc = "The issue queue for FPU unit 1 cannot accept any more instructions. Issue is stopped",
	},
	[ PPC970MP_PME_PM_FPU0_FIN ] = {
		.pme_name = "PM_FPU0_FIN",
		.pme_code = 0x113,
		.pme_short_desc = "FPU0 produced a result",
		.pme_long_desc = "fp0 finished, produced a result This only indicates finish, not completion. ",
	},
	[ PPC970MP_PME_PM_GRP_BR_REDIR ] = {
		.pme_name = "PM_GRP_BR_REDIR",
		.pme_code = 0x326,
		.pme_short_desc = "Group experienced branch redirect",
		.pme_long_desc = "Group experienced branch redirect",
	},
	[ PPC970MP_PME_PM_GCT_EMPTY_IC_MISS ] = {
		.pme_name = "PM_GCT_EMPTY_IC_MISS",
		.pme_code = 0x508c,
		.pme_short_desc = "GCT empty due to I cache miss",
		.pme_long_desc = "GCT empty due to I cache miss",
	},
	[ PPC970MP_PME_PM_THRESH_TIMEO ] = {
		.pme_name = "PM_THRESH_TIMEO",
		.pme_code = 0x2003,
		.pme_short_desc = "Threshold timeout",
		.pme_long_desc = "The threshold timer expired",
	},
	[ PPC970MP_PME_PM_FPU_FSQRT ] = {
		.pme_name = "PM_FPU_FSQRT",
		.pme_code = 0x6100,
		.pme_short_desc = "FPU executed FSQRT instruction",
		.pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when FPU is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1",
	},
	[ PPC970MP_PME_PM_MRK_LSU0_FLUSH_LRQ ] = {
		.pme_name = "PM_MRK_LSU0_FLUSH_LRQ",
		.pme_code = 0x712,
		.pme_short_desc = "LSU0 marked LRQ flushes",
		.pme_long_desc = "A marked load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.",
	},
	[ PPC970MP_PME_PM_PMC1_OVERFLOW ] = {
		.pme_name = "PM_PMC1_OVERFLOW",
		.pme_code = 0x200a,
		.pme_short_desc = "PMC1 Overflow",
		.pme_long_desc = "PMC1 Overflow",
	},
	[ PPC970MP_PME_PM_FXLS0_FULL_CYC ] = {
		.pme_name = "PM_FXLS0_FULL_CYC",
		.pme_code = 0x310,
		.pme_short_desc = "Cycles FXU0/LS0 queue full",
		.pme_long_desc = "The issue queue for FXU/LSU unit 0 cannot accept any more instructions. Issue is stopped",
	},
	[ PPC970MP_PME_PM_FPU0_ALL ] = {
		.pme_name = "PM_FPU0_ALL",
		.pme_code = 0x103,
		.pme_short_desc = "FPU0 executed add, mult, sub, cmp or sel instruction",
		.pme_long_desc = "This signal is active for one cycle when fp0 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo",
	},
	[ PPC970MP_PME_PM_DATA_TABLEWALK_CYC ] = {
		.pme_name = "PM_DATA_TABLEWALK_CYC",
		.pme_code = 0x707,
		.pme_short_desc = "Cycles doing data tablewalks",
		.pme_long_desc = "This signal is asserted every cycle when a tablewalk is active. While a tablewalk is active any request attempting to access the TLB will be rejected and retried.",
	},
	[ PPC970MP_PME_PM_FPU0_FEST ] = {
		.pme_name = "PM_FPU0_FEST",
		.pme_code = 0x112,
		.pme_short_desc = "FPU0 executed FEST instruction",
		.pme_long_desc = "This signal is active for one cycle when fp0 is executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. ",
	},
	[ PPC970MP_PME_PM_DATA_FROM_L25_MOD ] = {
		.pme_name = "PM_DATA_FROM_L25_MOD",
		.pme_code = 0x6837,
		.pme_short_desc = "Data loaded from L2.5 modified",
		.pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a demand load",
	},
	[ PPC970MP_PME_PM_LSU0_REJECT_ERAT_MISS ] = {
		.pme_name = "PM_LSU0_REJECT_ERAT_MISS",
		.pme_code = 0x923,
		.pme_short_desc = "LSU0 reject due to ERAT miss",
		.pme_long_desc = "LSU0 reject due to ERAT miss",
	},
	[ PPC970MP_PME_PM_LSU_LMQ_SRQ_EMPTY_CYC ] = {
		.pme_name = "PM_LSU_LMQ_SRQ_EMPTY_CYC",
		.pme_code = 0x2002,
		.pme_short_desc = "Cycles LMQ and SRQ empty",
		.pme_long_desc = "Cycles when both the LMQ and SRQ are empty (LSU is idle)",
	},
	[ PPC970MP_PME_PM_LSU0_REJECT_RELOAD_CDF ] = {
		.pme_name = "PM_LSU0_REJECT_RELOAD_CDF",
		.pme_code = 0x922,
		.pme_short_desc = "LSU0 reject due to reload CDF or tag update collision",
		.pme_long_desc = "LSU0 reject due to reload CDF or tag update collision",
	},
	[ PPC970MP_PME_PM_FPU_FEST ] = {
		.pme_name = "PM_FPU_FEST",
		.pme_code = 0x3110,
		.pme_short_desc = "FPU executed FEST instruction",
		.pme_long_desc = "This signal is active for one cycle when executing one of the estimate instructions. This could be fres* or frsqrte* where XYZ* means XYZ or XYZ. Combined Unit 0 + Unit 1.",
	},
	[ PPC970MP_PME_PM_0INST_FETCH ] = {
		.pme_name = "PM_0INST_FETCH",
		.pme_code = 0x442d,
		.pme_short_desc = "No instructions fetched",
		.pme_long_desc = "No instructions were fetched this cycles (due to IFU hold, redirect, or icache miss)",
	},
	[ PPC970MP_PME_PM_LD_MISS_L1_LSU0 ] = {
		.pme_name = "PM_LD_MISS_L1_LSU0",
		.pme_code = 0x812,
		.pme_short_desc = "LSU0 L1 D cache load misses",
		.pme_long_desc = "A load, executing on unit 0, missed the dcache",
	},
	[ PPC970MP_PME_PM_LSU1_REJECT_RELOAD_CDF ] = {
		.pme_name = "PM_LSU1_REJECT_RELOAD_CDF",
		.pme_code = 0x926,
		.pme_short_desc = "LSU1 reject due to reload CDF or tag update collision",
		.pme_long_desc = "LSU1 reject due to reload CDF or tag update collision",
	},
	[ PPC970MP_PME_PM_L1_PREF ] = {
		.pme_name = "PM_L1_PREF",
		.pme_code = 0x731,
		.pme_short_desc = "L1 cache data prefetches",
		.pme_long_desc = "A request to prefetch data into the L1 was made",
	},
	[ PPC970MP_PME_PM_FPU1_STALL3 ] = {
		.pme_name = "PM_FPU1_STALL3",
		.pme_code = 0x125,
		.pme_short_desc = "FPU1 stalled in pipe3",
		.pme_long_desc = "This signal indicates that fp1 has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. ",
	},
	[ PPC970MP_PME_PM_BRQ_FULL_CYC ] = {
		.pme_name = "PM_BRQ_FULL_CYC",
		.pme_code = 0x305,
		.pme_short_desc = "Cycles branch queue full",
		.pme_long_desc = "The ISU sends a signal indicating that the issue queue that feeds the ifu br unit cannot accept any more group (queue is full of groups).",
	},
	[ PPC970MP_PME_PM_PMC8_OVERFLOW ] = {
		.pme_name = "PM_PMC8_OVERFLOW",
		.pme_code = 0x100a,
		.pme_short_desc = "PMC8 Overflow",
		.pme_long_desc = "PMC8 Overflow",
	},
	[ PPC970MP_PME_PM_PMC7_OVERFLOW ] = {
		.pme_name = "PM_PMC7_OVERFLOW",
		.pme_code = 0x800a,
		.pme_short_desc = "PMC7 Overflow",
		.pme_long_desc = "PMC7 Overflow",
	},
	[ PPC970MP_PME_PM_WORK_HELD ] = {
		.pme_name = "PM_WORK_HELD",
		.pme_code = 0x2001,
		.pme_short_desc = "Work held",
		.pme_long_desc = "RAS Unit has signaled completion to stop and there are groups waiting to complete",
	},
	[ PPC970MP_PME_PM_MRK_LD_MISS_L1_LSU0 ] = {
		.pme_name = "PM_MRK_LD_MISS_L1_LSU0",
		.pme_code = 0x720,
		.pme_short_desc = "LSU0 L1 D cache load misses",
		.pme_long_desc = "A marked load, executing on unit 0, missed the dcache",
	},
	[ PPC970MP_PME_PM_FXU_IDLE ] = {
		.pme_name = "PM_FXU_IDLE",
		.pme_code = 0x5002,
		.pme_short_desc = "FXU idle",
		.pme_long_desc = "FXU0 and FXU1 are both idle",
	},
	[ PPC970MP_PME_PM_INST_CMPL ] = {
		.pme_name = "PM_INST_CMPL",
		.pme_code = 0x1,
		.pme_short_desc = "Instructions completed",
		.pme_long_desc = "Number of Eligible Instructions that completed. ",
	},
	[ PPC970MP_PME_PM_LSU1_FLUSH_UST ] = {
		.pme_name = "PM_LSU1_FLUSH_UST",
		.pme_code = 0x805,
		.pme_short_desc = "LSU1 unaligned store flushes",
		.pme_long_desc = "A store was flushed from unit 1 because it was unaligned (crossed a 4k boundary)",
	},
	[ PPC970MP_PME_PM_LSU0_FLUSH_ULD ] = {
		.pme_name = "PM_LSU0_FLUSH_ULD",
		.pme_code = 0x800,
		.pme_short_desc = "LSU0 unaligned load flushes",
		.pme_long_desc = "A load was flushed from unit 0 because it was unaligned (crossed a 64byte boundary, or 32 byte if it missed the L1)",
	},
	[ PPC970MP_PME_PM_LSU_FLUSH ] = {
		.pme_name = "PM_LSU_FLUSH",
		.pme_code = 0x315,
		.pme_short_desc = "Flush initiated by LSU",
		.pme_long_desc = "Flush initiated by LSU",
	},
	[ PPC970MP_PME_PM_INST_FROM_L2 ] = {
		.pme_name = "PM_INST_FROM_L2",
		.pme_code = 0x1426,
		.pme_short_desc = "Instructions fetched from L2",
		.pme_long_desc = "An instruction fetch group was fetched from L2. Fetch Groups can contain up to 8 instructions",
	},
	[ PPC970MP_PME_PM_LSU1_REJECT_LMQ_FULL ] = {
		.pme_name = "PM_LSU1_REJECT_LMQ_FULL",
		.pme_code = 0x925,
		.pme_short_desc = "LSU1 reject due to LMQ full or missed data coming",
		.pme_long_desc = "LSU1 reject due to LMQ full or missed data coming",
	},
	[ PPC970MP_PME_PM_PMC2_OVERFLOW ] = {
		.pme_name = "PM_PMC2_OVERFLOW",
		.pme_code = 0x300a,
		.pme_short_desc = "PMC2 Overflow",
		.pme_long_desc = "PMC2 Overflow",
	},
	[ PPC970MP_PME_PM_FPU0_DENORM ] = {
		.pme_name = "PM_FPU0_DENORM",
		.pme_code = 0x120,
		.pme_short_desc = "FPU0 received denormalized data",
		.pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized.",
	},
	[ PPC970MP_PME_PM_FPU1_FMOV_FEST ] = {
		.pme_name = "PM_FPU1_FMOV_FEST",
		.pme_code = 0x114,
		.pme_short_desc = "FPU1 executing FMOV or FEST instructions",
		.pme_long_desc = "This signal is active for one cycle when fp1 is executing a move kind of instruction or one of the estimate instructions.. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ",
	},
	[ PPC970MP_PME_PM_INST_FETCH_CYC ] = {
		.pme_name = "PM_INST_FETCH_CYC",
		.pme_code = 0x424,
		.pme_short_desc = "Cycles at least 1 instruction fetched",
		.pme_long_desc = "Asserted each cycle when the IFU sends at least one instruction to the IDU. ",
	},
	[ PPC970MP_PME_PM_GRP_DISP_REJECT ] = {
		.pme_name = "PM_GRP_DISP_REJECT",
		.pme_code = 0x324,
		.pme_short_desc = "Group dispatch rejected",
		.pme_long_desc = "A group that previously attempted dispatch was rejected.",
	},
	[ PPC970MP_PME_PM_LSU_LDF ] = {
		.pme_name = "PM_LSU_LDF",
		.pme_code = 0x8730,
		.pme_short_desc = "LSU executed Floating Point load instruction",
		.pme_long_desc = "LSU executed Floating Point load instruction",
	},
	[ PPC970MP_PME_PM_INST_DISP ] = {
		.pme_name = "PM_INST_DISP",
		.pme_code = 0x320,
		.pme_short_desc = "Instructions dispatched",
		.pme_long_desc = "The ISU sends the number of instructions dispatched.",
	},
	[ PPC970MP_PME_PM_DATA_FROM_L25_SHR ] = {
		.pme_name = "PM_DATA_FROM_L25_SHR",
		.pme_code = 0x5837,
		.pme_short_desc = "Data loaded from L2.5 shared",
		.pme_long_desc = "DL1 was reloaded with shared (T or SL) data from the L2 of a chip on this MCM due to a demand load",
	},
	[ PPC970MP_PME_PM_L1_DCACHE_RELOAD_VALID ] = {
		.pme_name = "PM_L1_DCACHE_RELOAD_VALID",
		.pme_code = 0x834,
		.pme_short_desc = "L1 reload data source valid",
		.pme_long_desc = "The data source information is valid",
	},
	[ PPC970MP_PME_PM_MRK_GRP_ISSUED ] = {
		.pme_name = "PM_MRK_GRP_ISSUED",
		.pme_code = 0x6005,
		.pme_short_desc = "Marked group issued",
		.pme_long_desc = "A sampled instruction was issued",
	},
	[ PPC970MP_PME_PM_FPU_FMA ] = {
		.pme_name = "PM_FPU_FMA",
		.pme_code = 0x2100,
		.pme_short_desc = "FPU executed multiply-add instruction",
		.pme_long_desc = "This signal is active for one cycle when FPU is executing multiply-add kind of instruction. This could be fmadd*, fnmadd*, fmsub*, fnmsub* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1",
	},
	[ PPC970MP_PME_PM_MRK_CRU_FIN ] = {
		.pme_name = "PM_MRK_CRU_FIN",
		.pme_code = 0x4005,
		.pme_short_desc = "Marked instruction CRU processing finished",
		.pme_long_desc = "The Condition Register Unit finished a marked instruction. Instructions that finish may not necessary complete",
	},
	[ PPC970MP_PME_PM_CMPLU_STALL_REJECT ] = {
		.pme_name = "PM_CMPLU_STALL_REJECT",
		.pme_code = 0x70cb,
		.pme_short_desc = "Completion stall caused by reject",
		.pme_long_desc = "Completion stall caused by reject",
	},
	[ PPC970MP_PME_PM_MRK_LSU1_FLUSH_UST ] = {
		.pme_name = "PM_MRK_LSU1_FLUSH_UST",
		.pme_code = 0x715,
		.pme_short_desc = "LSU1 marked unaligned store flushes",
		.pme_long_desc = "A marked store was flushed from unit 1 because it was unaligned (crossed a 4k boundary)",
	},
	[ PPC970MP_PME_PM_MRK_FXU_FIN ] = {
		.pme_name = "PM_MRK_FXU_FIN",
		.pme_code = 0x6004,
		.pme_short_desc = "Marked instruction FXU processing finished",
		.pme_long_desc = "Marked instruction FXU processing finished",
	},
	[ PPC970MP_PME_PM_LSU1_REJECT_ERAT_MISS ] = {
		.pme_name = "PM_LSU1_REJECT_ERAT_MISS",
		.pme_code = 0x927,
		.pme_short_desc = "LSU1 reject due to ERAT miss",
		.pme_long_desc = "LSU1 reject due to ERAT miss",
	},
	[ PPC970MP_PME_PM_BR_ISSUED ] = {
		.pme_name = "PM_BR_ISSUED",
		.pme_code = 0x431,
		.pme_short_desc = "Branches issued",
		.pme_long_desc = "This signal will be asserted each time the ISU issues a branch instruction. This signal will be asserted each time the ISU selects a branch instruction to issue.",
	},
	[ PPC970MP_PME_PM_PMC4_OVERFLOW ] = {
		.pme_name = "PM_PMC4_OVERFLOW",
		.pme_code = 0x500a,
		.pme_short_desc = "PMC4 Overflow",
		.pme_long_desc = "PMC4 Overflow",
	},
	[ PPC970MP_PME_PM_EE_OFF ] = {
		.pme_name = "PM_EE_OFF",
		.pme_code = 0x333,
		.pme_short_desc = "Cycles MSR(EE) bit off",
		.pme_long_desc = "The number of Cycles MSR(EE) bit was off.",
	},
	[ PPC970MP_PME_PM_INST_FROM_L25_MOD ] = {
		.pme_name = "PM_INST_FROM_L25_MOD",
		.pme_code = 0x6426,
		.pme_short_desc = "Instruction fetched from L2.5 modified",
		.pme_long_desc = "Instruction fetched from L2.5 modified",
	},
	[ PPC970MP_PME_PM_CMPLU_STALL_ERAT_MISS ] = {
		.pme_name = "PM_CMPLU_STALL_ERAT_MISS",
		.pme_code = 0x704c,
		.pme_short_desc = "Completion stall caused by ERAT miss",
		.pme_long_desc = "Completion stall caused by ERAT miss",
	},
	[ PPC970MP_PME_PM_ITLB_MISS ] = {
		.pme_name = "PM_ITLB_MISS",
		.pme_code = 0x700,
		.pme_short_desc = "Instruction TLB misses",
		.pme_long_desc = "A TLB miss for an Instruction Fetch has occurred",
	},
	[ PPC970MP_PME_PM_FXU1_BUSY_FXU0_IDLE ] = {
		.pme_name = "PM_FXU1_BUSY_FXU0_IDLE",
		.pme_code = 0x4002,
		.pme_short_desc = "FXU1 busy FXU0 idle",
		.pme_long_desc = "FXU0 was idle while FXU1 was busy",
	},
	[ PPC970MP_PME_PM_GRP_DISP_VALID ] = {
		.pme_name = "PM_GRP_DISP_VALID",
		.pme_code = 0x323,
		.pme_short_desc = "Group dispatch valid",
		.pme_long_desc = "Dispatch has been attempted for a valid group. Some groups may be rejected. The total number of successful dispatches is the number of dispatch valid minus dispatch reject.",
	},
	[ PPC970MP_PME_PM_MRK_GRP_DISP ] = {
		.pme_name = "PM_MRK_GRP_DISP",
		.pme_code = 0x1002,
		.pme_short_desc = "Marked group dispatched",
		.pme_long_desc = "A group containing a sampled instruction was dispatched",
	},
	[ PPC970MP_PME_PM_LSU_FLUSH_UST ] = {
		.pme_name = "PM_LSU_FLUSH_UST",
		.pme_code = 0x2800,
		.pme_short_desc = "SRQ unaligned store flushes",
		.pme_long_desc = "A store was flushed because it was unaligned",
	},
	[ PPC970MP_PME_PM_FXU1_FIN ] = {
		.pme_name = "PM_FXU1_FIN",
		.pme_code = 0x336,
		.pme_short_desc = "FXU1 produced a result",
		.pme_long_desc = "The Fixed Point unit 1 finished an instruction and produced a result",
	},
	[ PPC970MP_PME_PM_GRP_CMPL ] = {
		.pme_name = "PM_GRP_CMPL",
		.pme_code = 0x7003,
		.pme_short_desc = "Group completed",
		.pme_long_desc = "A group completed. Microcoded instructions that span multiple groups will generate this event once per group.",
	},
	[ PPC970MP_PME_PM_FPU_FRSP_FCONV ] = {
		.pme_name = "PM_FPU_FRSP_FCONV",
		.pme_code = 0x7110,
		.pme_short_desc = "FPU executed FRSP or FCONV instructions",
		.pme_long_desc = "This signal is active for one cycle when executing frsp or convert kind of instruction. This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs. Combined Unit 0 + Unit 1",
	},
	[ PPC970MP_PME_PM_MRK_LSU0_FLUSH_SRQ ] = {
		.pme_name = "PM_MRK_LSU0_FLUSH_SRQ",
		.pme_code = 0x713,
		.pme_short_desc = "LSU0 marked SRQ flushes",
		.pme_long_desc = "A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group.",
	},
	[ PPC970MP_PME_PM_CMPLU_STALL_OTHER ] = {
		.pme_name = "PM_CMPLU_STALL_OTHER",
		.pme_code = 0x100b,
		.pme_short_desc = "Completion stall caused by other reason",
		.pme_long_desc = "Completion stall caused by other reason",
	},
	[ PPC970MP_PME_PM_LSU_LMQ_FULL_CYC ] = {
		.pme_name = "PM_LSU_LMQ_FULL_CYC",
		.pme_code = 0x837,
		.pme_short_desc = "Cycles LMQ full",
		.pme_long_desc = "The LMQ was full",
	},
	[ PPC970MP_PME_PM_ST_REF_L1_LSU0 ] = {
		.pme_name = "PM_ST_REF_L1_LSU0",
		.pme_code = 0x811,
		.pme_short_desc = "LSU0 L1 D cache store references",
		.pme_long_desc = "A store executed on unit 0",
	},
	[ PPC970MP_PME_PM_LSU0_DERAT_MISS ] = {
		.pme_name = "PM_LSU0_DERAT_MISS",
		.pme_code = 0x702,
		.pme_short_desc = "LSU0 DERAT misses",
		.pme_long_desc = "A data request (load or store) from LSU Unit 0 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.",
	},
	[ PPC970MP_PME_PM_LSU_SRQ_SYNC_CYC ] = {
		.pme_name = "PM_LSU_SRQ_SYNC_CYC",
		.pme_code = 0x735,
		.pme_short_desc = "SRQ sync duration",
		.pme_long_desc = "This signal is asserted every cycle when a sync is in the SRQ.",
	},
	[ PPC970MP_PME_PM_FPU_STALL3 ] = {
		.pme_name = "PM_FPU_STALL3",
		.pme_code = 0x2120,
		.pme_short_desc = "FPU stalled in pipe3",
		.pme_long_desc = "FPU has generated a stall in pipe3 due to overflow, underflow, massive cancel, convert to integer (sometimes), or convert from integer (always). This signal is active during the entire duration of the stall. Combined Unit 0 + Unit 1",
	},
	[ PPC970MP_PME_PM_LSU_REJECT_ERAT_MISS ] = {
		.pme_name = "PM_LSU_REJECT_ERAT_MISS",
		.pme_code = 0x5920,
		.pme_short_desc = "LSU reject due to ERAT miss",
		.pme_long_desc = "LSU reject due to ERAT miss",
	},
	[ PPC970MP_PME_PM_MRK_DATA_FROM_L2 ] = {
		.pme_name = "PM_MRK_DATA_FROM_L2",
		.pme_code = 0x1937,
		.pme_short_desc = "Marked data loaded from L2",
		.pme_long_desc = "DL1 was reloaded from the local L2 due to a marked demand load",
	},
	[ PPC970MP_PME_PM_LSU0_FLUSH_SRQ ] = {
		.pme_name = "PM_LSU0_FLUSH_SRQ",
		.pme_code = 0x803,
		.pme_short_desc = "LSU0 SRQ flushes",
		.pme_long_desc = "A store was flushed because younger load hits and older store that is already in the SRQ or in the same group.",
	},
	[ PPC970MP_PME_PM_FPU0_FMOV_FEST ] = {
		.pme_name = "PM_FPU0_FMOV_FEST",
		.pme_code = 0x110,
		.pme_short_desc = "FPU0 executed FMOV or FEST instructions",
		.pme_long_desc = "This signal is active for one cycle when fp0 is executing a move kind of instruction or one of the estimate instructions.. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ",
	},
	[ PPC970MP_PME_PM_IOPS_CMPL ] = {
		.pme_name = "PM_IOPS_CMPL",
		.pme_code = 0x1001,
		.pme_short_desc = "IOPS instructions completed",
		.pme_long_desc = "Number of IOPS Instructions that completed.",
	},
	[ PPC970MP_PME_PM_LD_REF_L1_LSU0 ] = {
		.pme_name = "PM_LD_REF_L1_LSU0",
		.pme_code = 0x810,
		.pme_short_desc = "LSU0 L1 D cache load references",
		.pme_long_desc = "A load executed on unit 0",
	},
	[ PPC970MP_PME_PM_LSU1_FLUSH_SRQ ] = {
		.pme_name = "PM_LSU1_FLUSH_SRQ",
		.pme_code = 0x807,
		.pme_short_desc = "LSU1 SRQ flushes",
		.pme_long_desc = "A store was flushed because younger load hits and older store that is already in the SRQ or in the same group. ",
	},
	[ PPC970MP_PME_PM_CMPLU_STALL_DIV ] = {
		.pme_name = "PM_CMPLU_STALL_DIV",
		.pme_code = 0x708b,
		.pme_short_desc = "Completion stall caused by DIV instruction",
		.pme_long_desc = "Completion stall caused by DIV instruction",
	},
	[ PPC970MP_PME_PM_GRP_BR_MPRED ] = {
		.pme_name = "PM_GRP_BR_MPRED",
		.pme_code = 0x327,
		.pme_short_desc = "Group experienced a branch mispredict",
		.pme_long_desc = "Group experienced a branch mispredict",
	},
	[ PPC970MP_PME_PM_LSU_LMQ_S0_ALLOC ] = {
		.pme_name = "PM_LSU_LMQ_S0_ALLOC",
		.pme_code = 0x836,
		.pme_short_desc = "LMQ slot 0 allocated",
		.pme_long_desc = "The first entry in the LMQ was allocated.",
	},
	[ PPC970MP_PME_PM_LSU0_REJECT_LMQ_FULL ] = {
		.pme_name = "PM_LSU0_REJECT_LMQ_FULL",
		.pme_code = 0x921,
		.pme_short_desc = "LSU0 reject due to LMQ full or missed data coming",
		.pme_long_desc = "LSU0 reject due to LMQ full or missed data coming",
	},
	[ PPC970MP_PME_PM_ST_REF_L1 ] = {
		.pme_name = "PM_ST_REF_L1",
		.pme_code = 0x7810,
		.pme_short_desc = "L1 D cache store references",
		.pme_long_desc = "Total DL1 Store references",
	},
	[ PPC970MP_PME_PM_MRK_VMX_FIN ] = {
		.pme_name = "PM_MRK_VMX_FIN",
		.pme_code = 0x3005,
		.pme_short_desc = "Marked instruction VMX processing finished",
		.pme_long_desc = "Marked instruction VMX processing finished",
	},
	[ PPC970MP_PME_PM_LSU_SRQ_EMPTY_CYC ] = {
		.pme_name = "PM_LSU_SRQ_EMPTY_CYC",
		.pme_code = 0x4003,
		.pme_short_desc = "Cycles SRQ empty",
		.pme_long_desc = "The Store Request Queue is empty",
	},
	[ PPC970MP_PME_PM_FPU1_STF ] = {
		.pme_name = "PM_FPU1_STF",
		.pme_code = 0x126,
		.pme_short_desc = "FPU1 executed store instruction",
		.pme_long_desc = "This signal is active for one cycle when fp1 is executing a store instruction.",
	},
	[ PPC970MP_PME_PM_RUN_CYC ] = {
		.pme_name = "PM_RUN_CYC",
		.pme_code = 0x1005,
		.pme_short_desc = "Run cycles",
		.pme_long_desc = "Processor Cycles gated by the run latch",
	},
	[ PPC970MP_PME_PM_LSU_LMQ_S0_VALID ] = {
		.pme_name = "PM_LSU_LMQ_S0_VALID",
		.pme_code = 0x835,
		.pme_short_desc = "LMQ 
slot 0 valid", .pme_long_desc = "This signal is asserted every cycle when the first entry in the LMQ is valid. The LMQ had eight entries that are allocated FIFO", }, [ PPC970MP_PME_PM_LSU0_LDF ] = { .pme_name = "PM_LSU0_LDF", .pme_code = 0x730, .pme_short_desc = "LSU0 executed Floating Point load instruction", .pme_long_desc = "A floating point load was executed from LSU unit 0", }, [ PPC970MP_PME_PM_LSU_LRQ_S0_VALID ] = { .pme_name = "PM_LSU_LRQ_S0_VALID", .pme_code = 0x822, .pme_short_desc = "LRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Load Request Queue slot zero is valid. The SRQ is 32 entries long and is allocated round-robin.", }, [ PPC970MP_PME_PM_PMC3_OVERFLOW ] = { .pme_name = "PM_PMC3_OVERFLOW", .pme_code = 0x400a, .pme_short_desc = "PMC3 Overflow", .pme_long_desc = "PMC3 Overflow", }, [ PPC970MP_PME_PM_MRK_IMR_RELOAD ] = { .pme_name = "PM_MRK_IMR_RELOAD", .pme_code = 0x722, .pme_short_desc = "Marked IMR reloaded", .pme_long_desc = "A DL1 reload occurred due to marked load", }, [ PPC970MP_PME_PM_MRK_GRP_TIMEO ] = { .pme_name = "PM_MRK_GRP_TIMEO", .pme_code = 0x5005, .pme_short_desc = "Marked group completion timeout", .pme_long_desc = "The sampling timeout expired indicating that the previously sampled instruction is no longer in the processor", }, [ PPC970MP_PME_PM_FPU_FMOV_FEST ] = { .pme_name = "PM_FPU_FMOV_FEST", .pme_code = 0x8110, .pme_short_desc = "FPU executing FMOV or FEST instructions", .pme_long_desc = "This signal is active for one cycle when executing a move kind of instruction or one of the estimate instructions.. This could be fmr*, fneg*, fabs*, fnabs* , fres* or frsqrte* where XYZ* means XYZ or XYZ . 
Combined Unit 0 + Unit 1", }, [ PPC970MP_PME_PM_GRP_DISP_BLK_SB_CYC ] = { .pme_name = "PM_GRP_DISP_BLK_SB_CYC", .pme_code = 0x331, .pme_short_desc = "Cycles group dispatch blocked by scoreboard", .pme_long_desc = "The ISU sends a signal indicating that dispatch is blocked by scoreboard.", }, [ PPC970MP_PME_PM_XER_MAP_FULL_CYC ] = { .pme_name = "PM_XER_MAP_FULL_CYC", .pme_code = 0x302, .pme_short_desc = "Cycles XER mapper full", .pme_long_desc = "The ISU sends a signal indicating that the xer mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", }, [ PPC970MP_PME_PM_ST_MISS_L1 ] = { .pme_name = "PM_ST_MISS_L1", .pme_code = 0x813, .pme_short_desc = "L1 D cache store misses", .pme_long_desc = "A store missed the dcache", }, [ PPC970MP_PME_PM_STOP_COMPLETION ] = { .pme_name = "PM_STOP_COMPLETION", .pme_code = 0x3001, .pme_short_desc = "Completion stopped", .pme_long_desc = "RAS Unit has signaled completion to stop", }, [ PPC970MP_PME_PM_MRK_GRP_CMPL ] = { .pme_name = "PM_MRK_GRP_CMPL", .pme_code = 0x4004, .pme_short_desc = "Marked group completed", .pme_long_desc = "A group containing a sampled instruction completed. 
Microcoded instructions that span multiple groups will generate this event once per group.", }, [ PPC970MP_PME_PM_ISLB_MISS ] = { .pme_name = "PM_ISLB_MISS", .pme_code = 0x701, .pme_short_desc = "Instruction SLB misses", .pme_long_desc = "An SLB miss for an instruction fetch has occurred", }, [ PPC970MP_PME_PM_SUSPENDED ] = { .pme_name = "PM_SUSPENDED", .pme_code = 0x0, .pme_short_desc = "Suspended", .pme_long_desc = "Suspended", }, [ PPC970MP_PME_PM_CYC ] = { .pme_name = "PM_CYC", .pme_code = 0x7, .pme_short_desc = "Processor cycles", .pme_long_desc = "Processor cycles", }, [ PPC970MP_PME_PM_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_LD_MISS_L1_LSU1", .pme_code = 0x816, .pme_short_desc = "LSU1 L1 D cache load misses", .pme_long_desc = "A load, executing on unit 1, missed the dcache", }, [ PPC970MP_PME_PM_STCX_FAIL ] = { .pme_name = "PM_STCX_FAIL", .pme_code = 0x721, .pme_short_desc = "STCX failed", .pme_long_desc = "A stcx (stwcx or stdcx) failed", }, [ PPC970MP_PME_PM_LSU1_SRQ_STFWD ] = { .pme_name = "PM_LSU1_SRQ_STFWD", .pme_code = 0x824, .pme_short_desc = "LSU1 SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load on unit 1", }, [ PPC970MP_PME_PM_GRP_DISP ] = { .pme_name = "PM_GRP_DISP", .pme_code = 0x2004, .pme_short_desc = "Group dispatches", .pme_long_desc = "A group was dispatched", }, [ PPC970MP_PME_PM_L2_PREF ] = { .pme_name = "PM_L2_PREF", .pme_code = 0x733, .pme_short_desc = "L2 cache prefetches", .pme_long_desc = "A request to prefetch data into L2 was made", }, [ PPC970MP_PME_PM_FPU1_DENORM ] = { .pme_name = "PM_FPU1_DENORM", .pme_code = 0x124, .pme_short_desc = "FPU1 received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized.", }, [ PPC970MP_PME_PM_DATA_FROM_L2 ] = { .pme_name = "PM_DATA_FROM_L2", .pme_code = 0x1837, .pme_short_desc = "Data loaded from L2", .pme_long_desc = "DL1 was reloaded from the local L2 due to a demand load", }, [ 
PPC970MP_PME_PM_FPU0_FPSCR ] = { .pme_name = "PM_FPU0_FPSCR", .pme_code = 0x130, .pme_short_desc = "FPU0 executed FPSCR instruction", .pme_long_desc = "This signal is active for one cycle when fp0 is executing fpscr move related instruction. This could be mtfsfi*, mtfsb0*, mtfsb1*. mffs*, mtfsf*, mcrsf* where XYZ* means XYZ, XYZs, XYZ., XYZs", }, [ PPC970MP_PME_PM_MRK_DATA_FROM_L25_MOD ] = { .pme_name = "PM_MRK_DATA_FROM_L25_MOD", .pme_code = 0x6937, .pme_short_desc = "Marked data loaded from L2.5 modified", .pme_long_desc = "DL1 was reloaded with modified (M) data from the L2 of a chip on this MCM due to a marked demand load", }, [ PPC970MP_PME_PM_FPU0_FSQRT ] = { .pme_name = "PM_FPU0_FSQRT", .pme_code = 0x102, .pme_short_desc = "FPU0 executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp0 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ PPC970MP_PME_PM_LD_REF_L1 ] = { .pme_name = "PM_LD_REF_L1", .pme_code = 0x8810, .pme_short_desc = "L1 D cache load references", .pme_long_desc = "Total DL1 Load references", }, [ PPC970MP_PME_PM_MRK_L1_RELOAD_VALID ] = { .pme_name = "PM_MRK_L1_RELOAD_VALID", .pme_code = 0x934, .pme_short_desc = "Marked L1 reload data source valid", .pme_long_desc = "The source information is valid and is for a marked load", }, [ PPC970MP_PME_PM_1PLUS_PPC_CMPL ] = { .pme_name = "PM_1PLUS_PPC_CMPL", .pme_code = 0x5003, .pme_short_desc = "One or more PPC instruction completed", .pme_long_desc = "A group containing at least one PPC instruction completed. For microcoded instructions that span multiple groups, this will only occur once.", }, [ PPC970MP_PME_PM_INST_FROM_L1 ] = { .pme_name = "PM_INST_FROM_L1", .pme_code = 0x142d, .pme_short_desc = "Instruction fetched from L1", .pme_long_desc = "An instruction fetch group was fetched from L1. 
Fetch Groups can contain up to 8 instructions", }, [ PPC970MP_PME_PM_EE_OFF_EXT_INT ] = { .pme_name = "PM_EE_OFF_EXT_INT", .pme_code = 0x337, .pme_short_desc = "Cycles MSR(EE) bit off and external interrupt pending", .pme_long_desc = "Cycles MSR(EE) bit off and external interrupt pending", }, [ PPC970MP_PME_PM_PMC6_OVERFLOW ] = { .pme_name = "PM_PMC6_OVERFLOW", .pme_code = 0x700a, .pme_short_desc = "PMC6 Overflow", .pme_long_desc = "PMC6 Overflow", }, [ PPC970MP_PME_PM_LSU_LRQ_FULL_CYC ] = { .pme_name = "PM_LSU_LRQ_FULL_CYC", .pme_code = 0x312, .pme_short_desc = "Cycles LRQ full", .pme_long_desc = "The ISU sends this signal when the LRQ is full.", }, [ PPC970MP_PME_PM_IC_PREF_INSTALL ] = { .pme_name = "PM_IC_PREF_INSTALL", .pme_code = 0x427, .pme_short_desc = "Instruction prefetched installed in prefetch", .pme_long_desc = "New line coming into the prefetch buffer", }, [ PPC970MP_PME_PM_DC_PREF_OUT_OF_STREAMS ] = { .pme_name = "PM_DC_PREF_OUT_OF_STREAMS", .pme_code = 0x732, .pme_short_desc = "D cache out of streams", .pme_long_desc = "out of streams", }, [ PPC970MP_PME_PM_MRK_LSU1_FLUSH_SRQ ] = { .pme_name = "PM_MRK_LSU1_FLUSH_SRQ", .pme_code = 0x717, .pme_short_desc = "LSU1 marked SRQ flushes", .pme_long_desc = "A marked store was flushed because younger load hits and older store that is already in the SRQ or in the same group.", }, [ PPC970MP_PME_PM_GCT_FULL_CYC ] = { .pme_name = "PM_GCT_FULL_CYC", .pme_code = 0x300, .pme_short_desc = "Cycles GCT full", .pme_long_desc = "The ISU sends a signal indicating the gct is full. 
", }, [ PPC970MP_PME_PM_INST_FROM_MEM ] = { .pme_name = "PM_INST_FROM_MEM", .pme_code = 0x2426, .pme_short_desc = "Instruction fetched from memory", .pme_long_desc = "Instruction fetched from memory", }, [ PPC970MP_PME_PM_FLUSH_LSU_BR_MPRED ] = { .pme_name = "PM_FLUSH_LSU_BR_MPRED", .pme_code = 0x317, .pme_short_desc = "Flush caused by LSU or branch mispredict", .pme_long_desc = "Flush caused by LSU or branch mispredict", }, [ PPC970MP_PME_PM_FXU_BUSY ] = { .pme_name = "PM_FXU_BUSY", .pme_code = 0x6002, .pme_short_desc = "FXU busy", .pme_long_desc = "FXU0 and FXU1 are both busy", }, [ PPC970MP_PME_PM_ST_REF_L1_LSU1 ] = { .pme_name = "PM_ST_REF_L1_LSU1", .pme_code = 0x815, .pme_short_desc = "LSU1 L1 D cache store references", .pme_long_desc = "A store executed on unit 1", }, [ PPC970MP_PME_PM_MRK_LD_MISS_L1 ] = { .pme_name = "PM_MRK_LD_MISS_L1", .pme_code = 0x1720, .pme_short_desc = "Marked L1 D cache load misses", .pme_long_desc = "Marked L1 D cache load misses", }, [ PPC970MP_PME_PM_L1_WRITE_CYC ] = { .pme_name = "PM_L1_WRITE_CYC", .pme_code = 0x434, .pme_short_desc = "Cycles writing to instruction L1", .pme_long_desc = "This signal is asserted each cycle a cache write is active.", }, [ PPC970MP_PME_PM_LSU1_BUSY ] = { .pme_name = "PM_LSU1_BUSY", .pme_code = 0x827, .pme_short_desc = "LSU1 busy", .pme_long_desc = "LSU unit 0 is busy rejecting instructions ", }, [ PPC970MP_PME_PM_LSU_REJECT_LMQ_FULL ] = { .pme_name = "PM_LSU_REJECT_LMQ_FULL", .pme_code = 0x2920, .pme_short_desc = "LSU reject due to LMQ full or missed data coming", .pme_long_desc = "LSU reject due to LMQ full or missed data coming", }, [ PPC970MP_PME_PM_CMPLU_STALL_FDIV ] = { .pme_name = "PM_CMPLU_STALL_FDIV", .pme_code = 0x504c, .pme_short_desc = "Completion stall caused by FDIV or FQRT instruction", .pme_long_desc = "Completion stall caused by FDIV or FQRT instruction", }, [ PPC970MP_PME_PM_FPU_ALL ] = { .pme_name = "PM_FPU_ALL", .pme_code = 0x5100, .pme_short_desc = "FPU executed add, mult, sub, 
cmp or sel instruction", .pme_long_desc = "This signal is active for one cycle when FPU is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo. Combined Unit 0 + Unit 1", }, [ PPC970MP_PME_PM_LSU_SRQ_S0_ALLOC ] = { .pme_name = "PM_LSU_SRQ_S0_ALLOC", .pme_code = 0x825, .pme_short_desc = "SRQ slot 0 allocated", .pme_long_desc = "SRQ Slot zero was allocated", }, [ PPC970MP_PME_PM_INST_FROM_L25_SHR ] = { .pme_name = "PM_INST_FROM_L25_SHR", .pme_code = 0x5426, .pme_short_desc = "Instruction fetched from L2.5 shared", .pme_long_desc = "Instruction fetched from L2.5 shared", }, [ PPC970MP_PME_PM_GRP_MRK ] = { .pme_name = "PM_GRP_MRK", .pme_code = 0x5004, .pme_short_desc = "Group marked in IDU", .pme_long_desc = "A group was sampled (marked)", }, [ PPC970MP_PME_PM_BR_MPRED_CR ] = { .pme_name = "PM_BR_MPRED_CR", .pme_code = 0x432, .pme_short_desc = "Branch mispredictions due to CR bit setting", .pme_long_desc = "This signal is asserted when the branch execution unit detects a branch mispredict because the CR value is opposite of the predicted value. This signal is asserted after a branch issue event and will result in a branch redirect flush if not overridden by a flush of an older instruction.", }, [ PPC970MP_PME_PM_DC_PREF_STREAM_ALLOC ] = { .pme_name = "PM_DC_PREF_STREAM_ALLOC", .pme_code = 0x737, .pme_short_desc = "D cache new prefetch stream allocated", .pme_long_desc = "A new Prefetch Stream was allocated", }, [ PPC970MP_PME_PM_FPU1_FIN ] = { .pme_name = "PM_FPU1_FIN", .pme_code = 0x117, .pme_short_desc = "FPU1 produced a result", .pme_long_desc = "fp1 finished, produced a result. This only indicates finish, not completion. 
", }, [ PPC970MP_PME_PM_LSU_REJECT_SRQ ] = { .pme_name = "PM_LSU_REJECT_SRQ", .pme_code = 0x1920, .pme_short_desc = "LSU SRQ rejects", .pme_long_desc = "LSU SRQ rejects", }, [ PPC970MP_PME_PM_BR_MPRED_TA ] = { .pme_name = "PM_BR_MPRED_TA", .pme_code = 0x433, .pme_short_desc = "Branch mispredictions due to target address", .pme_long_desc = "branch miss predict due to a target address prediction. This signal will be asserted each time the branch execution unit detects an incorrect target address prediction. This signal will be asserted after a valid branch execution unit issue and will cause a branch mispredict flush unless a flush is detected from an older instruction.", }, [ PPC970MP_PME_PM_CRQ_FULL_CYC ] = { .pme_name = "PM_CRQ_FULL_CYC", .pme_code = 0x311, .pme_short_desc = "Cycles CR issue queue full", .pme_long_desc = "The ISU sends a signal indicating that the issue queue that feeds the ifu cr unit cannot accept any more group (queue is full of groups).", }, [ PPC970MP_PME_PM_LD_MISS_L1 ] = { .pme_name = "PM_LD_MISS_L1", .pme_code = 0x3810, .pme_short_desc = "L1 D cache load misses", .pme_long_desc = "Total DL1 Load references that miss the DL1", }, [ PPC970MP_PME_PM_INST_FROM_PREF ] = { .pme_name = "PM_INST_FROM_PREF", .pme_code = 0x342d, .pme_short_desc = "Instructions fetched from prefetch", .pme_long_desc = "An instruction fetch group was fetched from the prefetch buffer. 
Fetch Groups can contain up to 8 instructions", }, [ PPC970MP_PME_PM_STCX_PASS ] = { .pme_name = "PM_STCX_PASS", .pme_code = 0x725, .pme_short_desc = "Stcx passes", .pme_long_desc = "A stcx (stwcx or stdcx) instruction was successful", }, [ PPC970MP_PME_PM_DC_INV_L2 ] = { .pme_name = "PM_DC_INV_L2", .pme_code = 0x817, .pme_short_desc = "L1 D cache entries invalidated from L2", .pme_long_desc = "A dcache invalidate was received from the L2 because a line in L2 was castout.", }, [ PPC970MP_PME_PM_LSU_SRQ_FULL_CYC ] = { .pme_name = "PM_LSU_SRQ_FULL_CYC", .pme_code = 0x313, .pme_short_desc = "Cycles SRQ full", .pme_long_desc = "The ISU sends this signal when the srq is full.", }, [ PPC970MP_PME_PM_LSU0_FLUSH_LRQ ] = { .pme_name = "PM_LSU0_FLUSH_LRQ", .pme_code = 0x802, .pme_short_desc = "LSU0 LRQ flushes", .pme_long_desc = "A load was flushed by unit 0 because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ PPC970MP_PME_PM_LSU_SRQ_S0_VALID ] = { .pme_name = "PM_LSU_SRQ_S0_VALID", .pme_code = 0x821, .pme_short_desc = "SRQ slot 0 valid", .pme_long_desc = "This signal is asserted every cycle that the Store Request Queue slot zero is valid. 
The SRQ is 32 entries long and is allocated round-robin.", }, [ PPC970MP_PME_PM_LARX_LSU0 ] = { .pme_name = "PM_LARX_LSU0", .pme_code = 0x727, .pme_short_desc = "Larx executed on LSU0", .pme_long_desc = "A larx (lwarx or ldarx) was executed on side 0 (there is no corresponding unit 1 event since larx instructions can only execute on unit 0)", }, [ PPC970MP_PME_PM_GCT_EMPTY_CYC ] = { .pme_name = "PM_GCT_EMPTY_CYC", .pme_code = 0x1004, .pme_short_desc = "Cycles GCT empty", .pme_long_desc = "The Global Completion Table is completely empty", }, [ PPC970MP_PME_PM_FPU1_ALL ] = { .pme_name = "PM_FPU1_ALL", .pme_code = 0x107, .pme_short_desc = "FPU1 executed add, mult, sub, cmp or sel instruction", .pme_long_desc = "This signal is active for one cycle when fp1 is executing an add, mult, sub, compare, or fsel kind of instruction. This could be fadd*, fmul*, fsub*, fcmp**, fsel where XYZ* means XYZ, XYZs, XYZ., XYZs. and XYZ** means XYZu, XYZo", }, [ PPC970MP_PME_PM_FPU1_FSQRT ] = { .pme_name = "PM_FPU1_FSQRT", .pme_code = 0x106, .pme_short_desc = "FPU1 executed FSQRT instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp1 is executing a square root instruction. This could be fsqrt* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ PPC970MP_PME_PM_FPU_FIN ] = { .pme_name = "PM_FPU_FIN", .pme_code = 0x4110, .pme_short_desc = "FPU produced a result", .pme_long_desc = "FPU finished, produced a result This only indicates finish, not completion. 
Combined Unit 0 + Unit 1", }, [ PPC970MP_PME_PM_LSU_SRQ_STFWD ] = { .pme_name = "PM_LSU_SRQ_STFWD", .pme_code = 0x1820, .pme_short_desc = "SRQ store forwarded", .pme_long_desc = "Data from a store instruction was forwarded to a load", }, [ PPC970MP_PME_PM_MRK_LD_MISS_L1_LSU1 ] = { .pme_name = "PM_MRK_LD_MISS_L1_LSU1", .pme_code = 0x724, .pme_short_desc = "LSU1 L1 D cache load misses", .pme_long_desc = "A marked load, executing on unit 1, missed the dcache", }, [ PPC970MP_PME_PM_FXU0_FIN ] = { .pme_name = "PM_FXU0_FIN", .pme_code = 0x332, .pme_short_desc = "FXU0 produced a result", .pme_long_desc = "The Fixed Point unit 0 finished an instruction and produced a result", }, [ PPC970MP_PME_PM_MRK_FPU_FIN ] = { .pme_name = "PM_MRK_FPU_FIN", .pme_code = 0x7004, .pme_short_desc = "Marked instruction FPU processing finished", .pme_long_desc = "One of the Floating Point Units finished a marked instruction. Instructions that finish may not necessarily complete", }, [ PPC970MP_PME_PM_PMC5_OVERFLOW ] = { .pme_name = "PM_PMC5_OVERFLOW", .pme_code = 0x600a, .pme_short_desc = "PMC5 Overflow", .pme_long_desc = "PMC5 Overflow", }, [ PPC970MP_PME_PM_SNOOP_TLBIE ] = { .pme_name = "PM_SNOOP_TLBIE", .pme_code = 0x703, .pme_short_desc = "Snoop TLBIE", .pme_long_desc = "A TLB miss for a data request occurred. Requests that miss the TLB may be retried until the instruction is in the next to complete group (unless HID4 is set to allow speculative tablewalks). This may result in multiple TLB misses for the same instruction.", }, [ PPC970MP_PME_PM_FPU1_FRSP_FCONV ] = { .pme_name = "PM_FPU1_FRSP_FCONV", .pme_code = 0x115, .pme_short_desc = "FPU1 executed FRSP or FCONV instructions", .pme_long_desc = "This signal is active for one cycle when fp1 is executing frsp or convert kind of instruction. 
This could be frsp*, fcfid*, fcti* where XYZ* means XYZ, XYZs, XYZ., XYZs.", }, [ PPC970MP_PME_PM_FPU0_FDIV ] = { .pme_name = "PM_FPU0_FDIV", .pme_code = 0x100, .pme_short_desc = "FPU0 executed FDIV instruction", .pme_long_desc = "This signal is active for one cycle at the end of the microcode executed when fp0 is executing a divide instruction. This could be fdiv, fdivs, fdiv. fdivs.", }, [ PPC970MP_PME_PM_LD_REF_L1_LSU1 ] = { .pme_name = "PM_LD_REF_L1_LSU1", .pme_code = 0x814, .pme_short_desc = "LSU1 L1 D cache load references", .pme_long_desc = "A load executed on unit 1", }, [ PPC970MP_PME_PM_HV_CYC ] = { .pme_name = "PM_HV_CYC", .pme_code = 0x3004, .pme_short_desc = "Hypervisor Cycles", .pme_long_desc = "Cycles when the processor is executing in Hypervisor (MSR[HV] = 1 and MSR[PR]=0)", }, [ PPC970MP_PME_PM_LR_CTR_MAP_FULL_CYC ] = { .pme_name = "PM_LR_CTR_MAP_FULL_CYC", .pme_code = 0x306, .pme_short_desc = "Cycles LR/CTR mapper full", .pme_long_desc = "The ISU sends a signal indicating that the lr/ctr mapper cannot accept any more groups. Dispatch is stopped. Note: this condition indicates that a pool of mapper is full but the entire mapper may not be.", }, [ PPC970MP_PME_PM_FPU_DENORM ] = { .pme_name = "PM_FPU_DENORM", .pme_code = 0x1120, .pme_short_desc = "FPU received denormalized data", .pme_long_desc = "This signal is active for one cycle when one of the operands is denormalized. 
Combined Unit 0 + Unit 1", }, [ PPC970MP_PME_PM_LSU0_REJECT_SRQ ] = { .pme_name = "PM_LSU0_REJECT_SRQ", .pme_code = 0x920, .pme_short_desc = "LSU0 SRQ rejects", .pme_long_desc = "LSU0 SRQ rejects", }, [ PPC970MP_PME_PM_LSU1_REJECT_SRQ ] = { .pme_name = "PM_LSU1_REJECT_SRQ", .pme_code = 0x924, .pme_short_desc = "LSU1 SRQ rejects", .pme_long_desc = "LSU1 SRQ rejects", }, [ PPC970MP_PME_PM_LSU1_DERAT_MISS ] = { .pme_name = "PM_LSU1_DERAT_MISS", .pme_code = 0x706, .pme_short_desc = "LSU1 DERAT misses", .pme_long_desc = "A data request (load or store) from LSU Unit 1 missed the ERAT and resulted in an ERAT reload. Multiple instructions may miss the ERAT entry for the same 4K page, but only one reload will occur.", }, [ PPC970MP_PME_PM_IC_PREF_REQ ] = { .pme_name = "PM_IC_PREF_REQ", .pme_code = 0x426, .pme_short_desc = "Instruction prefetch requests", .pme_long_desc = "Asserted when a non-canceled prefetch is made to the cache interface unit (CIU).", }, [ PPC970MP_PME_PM_MRK_LSU_FIN ] = { .pme_name = "PM_MRK_LSU_FIN", .pme_code = 0x8004, .pme_short_desc = "Marked instruction LSU processing finished", .pme_long_desc = "One of the Load/Store Units finished a marked instruction. 
Instructions that finish may not necessarily complete", }, [ PPC970MP_PME_PM_MRK_DATA_FROM_MEM ] = { .pme_name = "PM_MRK_DATA_FROM_MEM", .pme_code = 0x2937, .pme_short_desc = "Marked data loaded from memory", .pme_long_desc = "Marked data loaded from memory", }, [ PPC970MP_PME_PM_CMPLU_STALL_DCACHE_MISS ] = { .pme_name = "PM_CMPLU_STALL_DCACHE_MISS", .pme_code = 0x50cb, .pme_short_desc = "Completion stall caused by D cache miss", .pme_long_desc = "Completion stall caused by D cache miss", }, [ PPC970MP_PME_PM_LSU0_FLUSH_UST ] = { .pme_name = "PM_LSU0_FLUSH_UST", .pme_code = 0x801, .pme_short_desc = "LSU0 unaligned store flushes", .pme_long_desc = "A store was flushed from unit 0 because it was unaligned (crossed a 4k boundary)", }, [ PPC970MP_PME_PM_LSU_FLUSH_LRQ ] = { .pme_name = "PM_LSU_FLUSH_LRQ", .pme_code = 0x6800, .pme_short_desc = "LRQ flushes", .pme_long_desc = "A load was flushed because a younger load executed before an older store executed and they had overlapping data OR two loads executed out of order and they have byte overlap and there was a snoop in between to an overlapped byte.", }, [ PPC970MP_PME_PM_LSU_FLUSH_SRQ ] = { .pme_name = "PM_LSU_FLUSH_SRQ", .pme_code = 0x5800, .pme_short_desc = "SRQ flushes", .pme_long_desc = "A store was flushed because younger load hits and older store that is already in the SRQ or in the same group.", } }; #endif papi-papi-7-2-0-t/src/libpfm4/lib/events/s390x_cpumf_events.h #ifndef __S390X_CPUMF_EVENTS_H__ #define __S390X_CPUMF_EVENTS_H__ #define __stringify(x) #x #define STRINGIFY(x) __stringify(x) /* CPUMF counter sets */ #define CPUMF_CTRSET_NONE 0 #define CPUMF_CTRSET_BASIC 2 #define CPUMF_CTRSET_PROBLEM_STATE 4 #define CPUMF_CTRSET_CRYPTO 8 #define CPUMF_CTRSET_EXTENDED 1 #define CPUMF_CTRSET_MT_DIAG 32 #define CPUMF_SVN6_ECC 4 static const pme_cpumf_ctr_t cpumcf_fvn1_counters[] = { { .ctrnum = 0, .ctrset = CPUMF_CTRSET_BASIC, .name = 
"CPU_CYCLES", .desc = "Cycle Count", }, { .ctrnum = 1, .ctrset = CPUMF_CTRSET_BASIC, .name = "INSTRUCTIONS", .desc = "Instruction Count", }, { .ctrnum = 2, .ctrset = CPUMF_CTRSET_BASIC, .name = "L1I_DIR_WRITES", .desc = "Level-1 I-Cache Directory Write Count", }, { .ctrnum = 3, .ctrset = CPUMF_CTRSET_BASIC, .name = "L1I_PENALTY_CYCLES", .desc = "Level-1 I-Cache Penalty Cycle Count", }, { .ctrnum = 4, .ctrset = CPUMF_CTRSET_BASIC, .name = "L1D_DIR_WRITES", .desc = "Level-1 D-Cache Directory Write Count", }, { .ctrnum = 5, .ctrset = CPUMF_CTRSET_BASIC, .name = "L1D_PENALTY_CYCLES", .desc = "Level-1 D-Cache Penalty Cycle Count", }, { .ctrnum = 32, .ctrset = CPUMF_CTRSET_PROBLEM_STATE, .name = "PROBLEM_STATE_CPU_CYCLES", .desc = "Problem-State Cycle Count", }, { .ctrnum = 33, .ctrset = CPUMF_CTRSET_PROBLEM_STATE, .name = "PROBLEM_STATE_INSTRUCTIONS", .desc = "Problem-State Instruction Count", }, { .ctrnum = 34, .ctrset = CPUMF_CTRSET_PROBLEM_STATE, .name = "PROBLEM_STATE_L1I_DIR_WRITES", .desc = "Problem-State Level-1 I-Cache Directory Write Count", }, { .ctrnum = 35, .ctrset = CPUMF_CTRSET_PROBLEM_STATE, .name = "PROBLEM_STATE_L1I_PENALTY_CYCLES", .desc = "Problem-State Level-1 I-Cache Penalty Cycle Count", }, { .ctrnum = 36, .ctrset = CPUMF_CTRSET_PROBLEM_STATE, .name = "PROBLEM_STATE_L1D_DIR_WRITES", .desc = "Problem-State Level-1 D-Cache Directory Write Count", }, { .ctrnum = 37, .ctrset = CPUMF_CTRSET_PROBLEM_STATE, .name = "PROBLEM_STATE_L1D_PENALTY_CYCLES", .desc = "Problem-State Level-1 D-Cache Penalty Cycle Count", }, }; static const pme_cpumf_ctr_t cpumcf_fvn3_counters[] = { { .ctrnum = 0, .ctrset = CPUMF_CTRSET_BASIC, .name = "CPU_CYCLES", .desc = "Cycle Count", }, { .ctrnum = 1, .ctrset = CPUMF_CTRSET_BASIC, .name = "INSTRUCTIONS", .desc = "Instruction Count", }, { .ctrnum = 2, .ctrset = CPUMF_CTRSET_BASIC, .name = "L1I_DIR_WRITES", .desc = "Level-1 I-Cache Directory Write Count", }, { .ctrnum = 3, .ctrset = CPUMF_CTRSET_BASIC, .name = "L1I_PENALTY_CYCLES", 
.desc = "Level-1 I-Cache Penalty Cycle Count", }, { .ctrnum = 4, .ctrset = CPUMF_CTRSET_BASIC, .name = "L1D_DIR_WRITES", .desc = "Level-1 D-Cache Directory Write Count", }, { .ctrnum = 5, .ctrset = CPUMF_CTRSET_BASIC, .name = "L1D_PENALTY_CYCLES", .desc = "Level-1 D-Cache Penalty Cycle Count", }, { .ctrnum = 32, .ctrset = CPUMF_CTRSET_PROBLEM_STATE, .name = "PROBLEM_STATE_CPU_CYCLES", .desc = "Problem-State Cycle Count", }, { .ctrnum = 33, .ctrset = CPUMF_CTRSET_PROBLEM_STATE, .name = "PROBLEM_STATE_INSTRUCTIONS", .desc = "Problem-State Instruction Count", }, }; static const pme_cpumf_ctr_t cpumcf_svn_generic_counters[] = { { .ctrnum = 64, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "PRNG_FUNCTIONS", .desc = "Total number of the PRNG functions issued by the" " CPU", }, { .ctrnum = 65, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "PRNG_CYCLES", .desc = "Total number of CPU cycles when the DEA/AES" " coprocessor is busy performing PRNG functions" " issued by the CPU", }, { .ctrnum = 66, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "PRNG_BLOCKED_FUNCTIONS", .desc = "Total number of the PRNG functions that are issued" " by the CPU and are blocked because the DEA/AES" " coprocessor is busy performing a function issued by" " another CPU", }, { .ctrnum = 67, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "PRNG_BLOCKED_CYCLES", .desc = "Total number of CPU cycles blocked for the PRNG" " functions issued by the CPU because the DEA/AES" " coprocessor is busy performing a function issued by" " another CPU", }, { .ctrnum = 68, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "SHA_FUNCTIONS", .desc = "Total number of SHA functions issued by the CPU", }, { .ctrnum = 69, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "SHA_CYCLES", .desc = "Total number of CPU cycles when the SHA coprocessor" " is busy performing the SHA functions issued by the" " CPU", }, { .ctrnum = 70, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "SHA_BLOCKED_FUNCTIONS", .desc = "Total number of the SHA functions that are issued" " by the CPU and are blocked 
because the SHA" " coprocessor is busy performing a function issued by" " another CPU", }, { .ctrnum = 71, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "SHA_BLOCKED_CYCLES", .desc = "Total number of CPU cycles blocked for the SHA" " functions issued by the CPU because the SHA" " coprocessor is busy performing a function issued by" " another CPU", }, { .ctrnum = 72, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "DEA_FUNCTIONS", .desc = "Total number of the DEA functions issued by the CPU", }, { .ctrnum = 73, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "DEA_CYCLES", .desc = "Total number of CPU cycles when the DEA/AES" " coprocessor is busy performing the DEA functions" " issued by the CPU", }, { .ctrnum = 74, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "DEA_BLOCKED_FUNCTIONS", .desc = "Total number of the DEA functions that are issued" " by the CPU and are blocked because the DEA/AES" " coprocessor is busy performing a function issued by" " another CPU", }, { .ctrnum = 75, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "DEA_BLOCKED_CYCLES", .desc = "Total number of CPU cycles blocked for the DEA" " functions issued by the CPU because the DEA/AES" " coprocessor is busy performing a function issued by" " another CPU", }, { .ctrnum = 76, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "AES_FUNCTIONS", .desc = "Total number of AES functions issued by the CPU", }, { .ctrnum = 77, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "AES_CYCLES", .desc = "Total number of CPU cycles when the DEA/AES" " coprocessor is busy performing the AES functions" " issued by the CPU", }, { .ctrnum = 78, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "AES_BLOCKED_FUNCTIONS", .desc = "Total number of AES functions that are issued by" " the CPU and are blocked because the DEA/AES" " coprocessor is busy performing a function issued by" " another CPU", }, { .ctrnum = 79, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "AES_BLOCKED_CYCLES", .desc = "Total number of CPU cycles blocked for the AES" " functions issued by the CPU because the DEA/AES" " coprocessor 
is busy performing a function issued by" " another CPU", }, { .ctrnum = 80, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "ECC_FUNCTION_COUNT", .desc = "This counter counts the total number of the" " elliptic-curve cryptography (ECC) functions issued" " by the CPU.", }, { .ctrnum = 81, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "ECC_CYCLES_COUNT", .desc = "This counter counts the total number of CPU cycles" " when the ECC coprocessor is busy performing the" " elliptic-curve cryptography (ECC) functions issued" " by the CPU.", }, { .ctrnum = 82, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "ECC_BLOCKED_FUNCTION_COUNT", .desc = "This counter counts the total number of the" " elliptic-curve cryptography (ECC) functions that" " are issued by the CPU and are blocked because the" " ECC coprocessor is busy performing a function" " issued by another CPU.", }, { .ctrnum = 83, .ctrset = CPUMF_CTRSET_CRYPTO, .name = "ECC_BLOCKED_CYCLES_COUNT", .desc = "This counter counts the total number of CPU cycles" " blocked for the elliptic-curve cryptography (ECC)" " functions issued by the CPU because the ECC" " coprocessor is busy performing a function issued" " by another CPU.", }, }; static const pme_cpumf_ctr_t cpumcf_z10_counters[] = { { .ctrnum = 128, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_L2_SOURCED_WRITES", .desc = "A directory write to the Level-1 I-Cache directory" " where the returned cache line was sourced from the" " Level-2 (L1.5) cache", }, { .ctrnum = 129, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_L2_SOURCED_WRITES", .desc = "A directory write to the Level-1 D-Cache directory" " where the installed cache line was sourced from the" " Level-2 (L1.5) cache", }, { .ctrnum = 130, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_L3_LOCAL_WRITES", .desc = "A directory write to the Level-1 I-Cache directory" " where the installed cache line was sourced from the" " Level-3 cache that is on the same book as the" " Instruction cache (Local L2 cache)", }, { .ctrnum = 131, .ctrset = 
CPUMF_CTRSET_EXTENDED, .name = "L1D_L3_LOCAL_WRITES", .desc = "A directory write to the Level-1 D-Cache directory" " where the installed cache line was sourced from" " the Level-3 cache that is on the same book as the" " Data cache (Local L2 cache)", }, { .ctrnum = 132, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_L3_REMOTE_WRITES", .desc = "A directory write to the Level-1 I-Cache directory" " where the installed cache line was sourced from a" " Level-3 cache that is not on the same book as the" " Instruction cache (Remote L2 cache)", }, { .ctrnum = 133, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_L3_REMOTE_WRITES", .desc = "A directory write to the Level-1 D-Cache directory" " where the installed cache line was sourced from a" " Level-3 cache that is not on the same book as the" " Data cache (Remote L2 cache)", }, { .ctrnum = 134, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_LMEM_SOURCED_WRITES", .desc = "A directory write to the Level-1 D-Cache directory" " where the installed cache line was sourced from" " memory that is attached to the same book as the" " Data cache (Local Memory)", }, { .ctrnum = 135, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_LMEM_SOURCED_WRITES", .desc = "A directory write to the Level-1 I-Cache where the" " installed cache line was sourced from memory that" " is attached to the same book as the Instruction" " cache (Local Memory)", }, { .ctrnum = 136, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_RO_EXCL_WRITES", .desc = "A directory write to the Level-1 D-Cache where the" " line was originally in a Read-Only state in the" " cache but has been updated to be in the Exclusive" " state that allows stores to the cache line", }, { .ctrnum = 137, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_CACHELINE_INVALIDATES", .desc = "A cache line in the Level-1 I-Cache has been" " invalidated by a store on the same CPU as the Level-" "1 I-Cache", }, { .ctrnum = 138, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ITLB1_WRITES", .desc = "A translation 
entry has been written into the Level-" "1 Instruction Translation Lookaside Buffer", }, { .ctrnum = 139, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB1_WRITES", .desc = "A translation entry has been written to the Level-1" " Data Translation Lookaside Buffer", }, { .ctrnum = 140, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_PTE_WRITES", .desc = "A translation entry has been written to the Level-2" " TLB Page Table Entry arrays", }, { .ctrnum = 141, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_CRSTE_WRITES", .desc = "A translation entry has been written to the Level-2" " TLB Common Region Segment Table Entry arrays", }, { .ctrnum = 142, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_CRSTE_HPAGE_WRITES", .desc = "A translation entry has been written to the Level-2" " TLB Common Region Segment Table Entry arrays for a" " one-megabyte large page translation", }, { .ctrnum = 145, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ITLB1_MISSES", .desc = "Level-1 Instruction TLB miss in progress." " Incremented by one for every cycle an ITLB1 miss is" " in progress", }, { .ctrnum = 146, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB1_MISSES", .desc = "Level-1 Data TLB miss in progress. 
Incremented by" " one for every cycle a DTLB1 miss is in progress", }, { .ctrnum = 147, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L2C_STORES_SENT", .desc = "Incremented by one for every store sent to Level-2" " (L1.5) cache", }, }; static const pme_cpumf_ctr_t cpumcf_z196_counters[] = { { .ctrnum = 128, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_L2_SOURCED_WRITES", .desc = "A directory write to the Level-1 D-Cache directory" " where the returned cache line was sourced from the" " Level-2 cache", }, { .ctrnum = 129, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_L2_SOURCED_WRITES", .desc = "A directory write to the Level-1 I-Cache directory" " where the returned cache line was sourced from the" " Level-2 cache", }, { .ctrnum = 130, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB1_MISSES", .desc = "Level-1 Data TLB miss in progress. Incremented by" " one for every cycle a DTLB1 miss is in progress.", }, { .ctrnum = 131, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ITLB1_MISSES", .desc = "Level-1 Instruction TLB miss in progress." 
" Incremented by one for every cycle an ITLB1 miss is" " in progress.", }, { .ctrnum = 133, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L2C_STORES_SENT", .desc = "Incremented by one for every store sent to Level-2" " cache", }, { .ctrnum = 134, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFBOOK_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 D-Cache directory" " where the returned cache line was sourced from an" " Off Book Level-3 cache", }, { .ctrnum = 135, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONBOOK_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 D-Cache directory" " where the returned cache line was sourced from an" " On Book Level-4 cache", }, { .ctrnum = 136, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONBOOK_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 I-Cache directory" " where the returned cache line was sourced from an" " On Book Level-4 cache", }, { .ctrnum = 137, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_RO_EXCL_WRITES", .desc = "A directory write to the Level-1 D-Cache where the" " line was originally in a Read-Only state in the" " cache but has been updated to be in the Exclusive" " state that allows stores to the cache line", }, { .ctrnum = 138, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFBOOK_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 D-Cache directory" " where the returned cache line was sourced from an" " Off Book Level-4 cache", }, { .ctrnum = 139, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFBOOK_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 I-Cache directory" " where the returned cache line was sourced from an" " Off Book Level-4 cache", }, { .ctrnum = 140, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB1_HPAGE_WRITES", .desc = "A translation entry has been written to the Level-1" " Data Translation Lookaside Buffer for a one-" "megabyte page", }, { .ctrnum = 141, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_LMEM_SOURCED_WRITES", .desc = "A 
directory write to the Level-1 D-Cache where the" " installed cache line was sourced from memory that" " is attached to the same book as the Data cache" " (Local Memory)", }, { .ctrnum = 142, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_LMEM_SOURCED_WRITES", .desc = "A directory write to the Level-1 I-Cache where the" " installed cache line was sourced from memory that" " is attached to the same book as the Instruction" " cache (Local Memory)", }, { .ctrnum = 143, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFBOOK_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 I-Cache directory" " where the returned cache line was sourced from an" " Off Book Level-3 cache", }, { .ctrnum = 144, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB1_WRITES", .desc = "A translation entry has been written to the Level-1" " Data Translation Lookaside Buffer", }, { .ctrnum = 145, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ITLB1_WRITES", .desc = "A translation entry has been written to the Level-1" " Instruction Translation Lookaside Buffer", }, { .ctrnum = 146, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_PTE_WRITES", .desc = "A translation entry has been written to the Level-2" " TLB Page Table Entry arrays", }, { .ctrnum = 147, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_CRSTE_HPAGE_WRITES", .desc = "A translation entry has been written to the Level-2" " TLB Common Region Segment Table Entry arrays for a" " one-megabyte large page translation", }, { .ctrnum = 148, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_CRSTE_WRITES", .desc = "A translation entry has been written to the Level-2" " TLB Common Region Segment Table Entry arrays", }, { .ctrnum = 150, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCHIP_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 D-Cache directory" " where the returned cache line was sourced from an" " On Chip Level-3 cache", }, { .ctrnum = 152, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFCHIP_L3_SOURCED_WRITES", .desc = "A 
directory write to the Level-1 D-Cache directory" " where the returned cache line was sourced from an" " Off Chip/On Book Level-3 cache", }, { .ctrnum = 153, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCHIP_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 I-Cache directory" " where the returned cache line was sourced from an" " On Chip Level-3 cache", }, { .ctrnum = 155, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFCHIP_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 I-Cache directory" " where the returned cache line was sourced from an" " Off Chip/On Book Level-3 cache", }, }; static const pme_cpumf_ctr_t cpumcf_zec12_counters[] = { { .ctrnum = 128, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB1_MISSES", .desc = "Level-1 Data TLB miss in progress. Incremented by" " one for every cycle a DTLB1 miss is in progress.", }, { .ctrnum = 129, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ITLB1_MISSES", .desc = "Level-1 Instruction TLB miss in progress." " Incremented by one for every cycle an ITLB1 miss is" " in progress.", }, { .ctrnum = 130, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_L2I_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from the Level-2 Instruction cache", }, { .ctrnum = 131, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_L2I_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from the Level-2 Instruction cache", }, { .ctrnum = 132, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_L2D_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from the Level-2 Data cache", }, { .ctrnum = 133, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB1_WRITES", .desc = "A translation entry has been written to the Level-1" " Data Translation Lookaside Buffer", }, { .ctrnum = 135, .ctrset = CPUMF_CTRSET_EXTENDED, 
.name = "L1D_LMEM_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache where" " the installed cache line was sourced from memory" " that is attached to the same book as the Data cache" " (Local Memory)", }, { .ctrnum = 137, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_LMEM_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " where the installed cache line was sourced from" " memory that is attached to the same book as the" " Instruction cache (Local Memory)", }, { .ctrnum = 138, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_RO_EXCL_WRITES", .desc = "A directory write to the Level-1 D-Cache where the" " line was originally in a Read-Only state in the" " cache but has been updated to be in the Exclusive" " state that allows stores to the cache line", }, { .ctrnum = 139, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB1_HPAGE_WRITES", .desc = "A translation entry has been written to the Level-1" " Data Translation Lookaside Buffer for a one-" "megabyte page", }, { .ctrnum = 140, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ITLB1_WRITES", .desc = "A translation entry has been written to the Level-1" " Instruction Translation Lookaside Buffer", }, { .ctrnum = 141, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_PTE_WRITES", .desc = "A translation entry has been written to the Level-2" " TLB Page Table Entry arrays", }, { .ctrnum = 142, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_CRSTE_HPAGE_WRITES", .desc = "A translation entry has been written to the Level-2" " TLB Common Region Segment Table Entry arrays for a" " one-megabyte large page translation", }, { .ctrnum = 143, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_CRSTE_WRITES", .desc = "A translation entry has been written to the Level-2" " TLB Common Region Segment Table Entry arrays", }, { .ctrnum = 144, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCHIP_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line 
was sourced" " from an On Chip Level-3 cache without intervention", }, { .ctrnum = 145, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFCHIP_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off Chip/On Book Level-3 cache without" " intervention", }, { .ctrnum = 146, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFBOOK_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off Book Level-3 cache without intervention", }, { .ctrnum = 147, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONBOOK_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On Book Level-4 cache", }, { .ctrnum = 148, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFBOOK_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off Book Level-4 cache", }, { .ctrnum = 149, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_NC_TEND", .desc = "A TEND instruction has completed in a" " nonconstrained transactional-execution mode", }, { .ctrnum = 150, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCHIP_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On Chip Level-3 cache with intervention", }, { .ctrnum = 151, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFCHIP_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off Chip/On Book Level-3 cache with" " intervention", }, { .ctrnum = 152, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFBOOK_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off Book 
Level-3 cache with intervention", }, { .ctrnum = 153, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCHIP_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On Chip Level-3 cache without intervention", }, { .ctrnum = 154, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFCHIP_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off Chip/On Book Level-3 cache without" " intervention", }, { .ctrnum = 155, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFBOOK_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off Book Level-3 cache without intervention", }, { .ctrnum = 156, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONBOOK_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On Book Level-4 cache", }, { .ctrnum = 157, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFBOOK_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off Book Level-4 cache", }, { .ctrnum = 158, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_C_TEND", .desc = "A TEND instruction has completed in a constrained" " transactional-execution mode", }, { .ctrnum = 159, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCHIP_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On Chip Level-3 cache with intervention", }, { .ctrnum = 160, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFCHIP_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off Chip/On Book 
Level-3 cache with" " intervention", }, { .ctrnum = 161, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFBOOK_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off Book Level-3 cache with intervention", }, { .ctrnum = 177, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_NC_TABORT", .desc = "A transaction abort has occurred in a" " nonconstrained transactional-execution mode", }, { .ctrnum = 178, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_C_TABORT_NO_SPECIAL", .desc = "A transaction abort has occurred in a constrained" " transactional-execution mode and the CPU is not" " using any special logic to allow the transaction to" " complete", }, { .ctrnum = 179, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_C_TABORT_SPECIAL", .desc = "A transaction abort has occurred in a constrained" " transactional-execution mode and the CPU is using" " special logic to allow the transaction to complete", }, }; static const pme_cpumf_ctr_t cpumcf_z13_counters[] = { { .ctrnum = 128, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_RO_EXCL_WRITES", .desc = "A directory write to the Level-1 Data cache where" " the line was originally in a Read-Only state in the" " cache but has been updated to be in the Exclusive" " state that allows stores to the cache line.", }, { .ctrnum = 129, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB1_WRITES", .desc = "A translation entry has been written to the Level-1" " Data Translation Lookaside Buffer", }, { .ctrnum = 130, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB1_MISSES", .desc = "Level-1 Data TLB miss in progress. 
Incremented by" " one for every cycle a DTLB1 miss is in progress.", }, { .ctrnum = 131, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB1_HPAGE_WRITES", .desc = "A translation entry has been written to the Level-1" " Data Translation Lookaside Buffer for a one-" "megabyte page", }, { .ctrnum = 132, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB1_GPAGE_WRITES", .desc = "A translation entry has been written to the Level-1" " Data Translation Lookaside Buffer for a two-" "gigabyte page.", }, { .ctrnum = 133, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_L2D_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from the Level-2 Data cache", }, { .ctrnum = 134, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ITLB1_WRITES", .desc = "A translation entry has been written to the Level-1" " Instruction Translation Lookaside Buffer", }, { .ctrnum = 135, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ITLB1_MISSES", .desc = "Level-1 Instruction TLB miss in progress." 
" Incremented by one for every cycle an ITLB1 miss is" " in progress", }, { .ctrnum = 136, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_L2I_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from the Level-2 Instruction cache", }, { .ctrnum = 137, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_PTE_WRITES", .desc = "A translation entry has been written to the Level-2" " TLB Page Table Entry arrays", }, { .ctrnum = 138, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_CRSTE_HPAGE_WRITES", .desc = "A translation entry has been written to the Level-2" " TLB Combined Region Segment Table Entry arrays for" " a one-megabyte large page translation", }, { .ctrnum = 139, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_CRSTE_WRITES", .desc = "A translation entry has been written to the Level-2" " TLB Combined Region Segment Table Entry arrays", }, { .ctrnum = 140, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_C_TEND", .desc = "A TEND instruction has completed in a constrained" " transactional-execution mode", }, { .ctrnum = 141, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_NC_TEND", .desc = "A TEND instruction has completed in a non-" "constrained transactional-execution mode", }, { .ctrnum = 143, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1C_TLB1_MISSES", .desc = "Increments by one for any cycle where a Level-1" " cache or Level-1 TLB miss is in progress.", }, { .ctrnum = 144, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCHIP_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Chip Level-3 cache without intervention", }, { .ctrnum = 145, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCHIP_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Chip Level-3 cache with intervention", }, { .ctrnum = 146, .ctrset = 
CPUMF_CTRSET_EXTENDED, .name = "L1D_ONNODE_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Node Level-4 cache", }, { .ctrnum = 147, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONNODE_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Node Level-3 cache with intervention", }, { .ctrnum = 148, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONNODE_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Node Level-3 cache without intervention", }, { .ctrnum = 149, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONDRAWER_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Drawer Level-4 cache", }, { .ctrnum = 150, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONDRAWER_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Drawer Level-3 cache with intervention", }, { .ctrnum = 151, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONDRAWER_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Drawer Level-3 cache without" " intervention", }, { .ctrnum = 152, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFDRAWER_SCOL_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off-Drawer Same-Column Level-4 cache", }, { .ctrnum = 153, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off-Drawer Same-Column Level-3 
cache with" " intervention", }, { .ctrnum = 154, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off-Drawer Same-Column Level-3 cache" " without intervention", }, { .ctrnum = 155, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFDRAWER_FCOL_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off-Drawer Far-Column Level-4 cache", }, { .ctrnum = 156, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off-Drawer Far-Column Level-3 cache with" " intervention", }, { .ctrnum = 157, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off-Drawer Far-Column Level-3 cache without" " intervention", }, { .ctrnum = 158, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONNODE_MEM_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from On-Node memory", }, { .ctrnum = 159, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONDRAWER_MEM_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from On-Drawer memory", }, { .ctrnum = 160, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFDRAWER_MEM_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from Off-Drawer memory", }, { .ctrnum = 161, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCHIP_MEM_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the 
returned cache line was sourced" " from On-Chip memory", }, { .ctrnum = 162, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCHIP_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Chip Level-3 cache without intervention", }, { .ctrnum = 163, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCHIP_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On Chip Level-3 cache with intervention", }, { .ctrnum = 164, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONNODE_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Node Level-4 cache", }, { .ctrnum = 165, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONNODE_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Node Level-3 cache with intervention", }, { .ctrnum = 166, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONNODE_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Node Level-3 cache without intervention", }, { .ctrnum = 167, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONDRAWER_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Drawer Level-4 cache", }, { .ctrnum = 168, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONDRAWER_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Drawer Level-3 cache with intervention", }, { .ctrnum = 169, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONDRAWER_L3_SOURCED_WRITES", .desc = "A directory write to the 
Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Drawer Level-3 cache without" " intervention", }, { .ctrnum = 170, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFDRAWER_SCOL_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off-Drawer Same-Column Level-4 cache", }, { .ctrnum = 171, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off-Drawer Same-Column Level-3 cache with" " intervention", }, { .ctrnum = 172, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off-Drawer Same-Column Level-3 cache" " without intervention", }, { .ctrnum = 173, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFDRAWER_FCOL_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off-Drawer Far-Column Level-4 cache", }, { .ctrnum = 174, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off-Drawer Far-Column Level-3 cache with" " intervention", }, { .ctrnum = 175, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off-Drawer Far-Column Level-3 cache without" " intervention", }, { .ctrnum = 176, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONNODE_MEM_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory 
where the returned cache line was sourced" " from On-Node memory", }, { .ctrnum = 177, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONDRAWER_MEM_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from On-Drawer memory", }, { .ctrnum = 178, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFDRAWER_MEM_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from Off-Drawer memory", }, { .ctrnum = 179, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCHIP_MEM_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from On-Chip memory", }, { .ctrnum = 218, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_NC_TABORT", .desc = "A transaction abort has occurred in a non-" "constrained transactional-execution mode", }, { .ctrnum = 219, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_C_TABORT_NO_SPECIAL", .desc = "A transaction abort has occurred in a constrained" " transactional-execution mode and the CPU is not" " using any special logic to allow the transaction to" " complete", }, { .ctrnum = 220, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_C_TABORT_SPECIAL", .desc = "A transaction abort has occurred in a constrained" " transactional-execution mode and the CPU is using" " special logic to allow the transaction to complete", }, { .ctrnum = 448, .ctrset = CPUMF_CTRSET_MT_DIAG, .name = "MT_DIAG_CYCLES_ONE_THR_ACTIVE", .desc = "Cycle count with one thread active", }, { .ctrnum = 449, .ctrset = CPUMF_CTRSET_MT_DIAG, .name = "MT_DIAG_CYCLES_TWO_THR_ACTIVE", .desc = "Cycle count with two threads active", }, }; static const pme_cpumf_ctr_t cpumcf_z14_counters[] = { { .ctrnum = 128, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_RO_EXCL_WRITES", .desc = "A directory write to the Level-1 Data cache where" " the line was originally in a Read-Only state in 
the" " cache but has been updated to be in the Exclusive" " state that allows stores to the cache line", }, { .ctrnum = 129, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB2_WRITES", .desc = "A translation has been written into The Translation" " Lookaside Buffer 2 (TLB2) and the request was made" " by the data cache", }, { .ctrnum = 130, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB2_MISSES", .desc = "A TLB2 miss is in progress for a request made by" " the data cache. Incremented by one for every TLB2" " miss in progress for the Level-1 Data cache on this" " cycle", }, { .ctrnum = 131, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB2_HPAGE_WRITES", .desc = "A translation entry was written into the Combined" " Region and Segment Table Entry array in the Level-2" " TLB for a one-megabyte page or a Last Host" " Translation was done", }, { .ctrnum = 132, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB2_GPAGE_WRITES", .desc = "A translation entry for a two-gigabyte page was" " written into the Level-2 TLB", }, { .ctrnum = 133, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_L2D_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from the Level-2 Data cache", }, { .ctrnum = 134, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ITLB2_WRITES", .desc = "A translation entry has been written into the" " Translation Lookaside Buffer 2 (TLB2) and the" " request was made by the instruction cache", }, { .ctrnum = 135, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ITLB2_MISSES", .desc = "A TLB2 miss is in progress for a request made by" " the instruction cache. 
Incremented by one for every" " TLB2 miss in progress for the Level-1 Instruction" " cache in a cycle", }, { .ctrnum = 136, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_L2I_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from the Level-2 Instruction cache", }, { .ctrnum = 137, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_PTE_WRITES", .desc = "A translation entry was written into the Page Table" " Entry array in the Level-2 TLB", }, { .ctrnum = 138, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_CRSTE_WRITES", .desc = "Translation entries were written into the Combined" " Region and Segment Table Entry array and the Page" " Table Entry array in the Level-2 TLB", }, { .ctrnum = 139, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_ENGINES_BUSY", .desc = "The number of Level-2 TLB translation engines busy" " in a cycle", }, { .ctrnum = 140, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_C_TEND", .desc = "A TEND instruction has completed in a constrained" " transactional-execution mode", }, { .ctrnum = 141, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_NC_TEND", .desc = "A TEND instruction has completed in a non-" "constrained transactional-execution mode", }, { .ctrnum = 143, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1C_TLB2_MISSES", .desc = "Increments by one for any cycle where a level-1" " cache or level-2 TLB miss is in progress", }, { .ctrnum = 144, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCHIP_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Chip Level-3 cache without intervention", }, { .ctrnum = 145, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCHIP_MEMORY_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from On-Chip memory", }, { .ctrnum = 146, .ctrset = CPUMF_CTRSET_EXTENDED, .name = 
"L1D_ONCHIP_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Chip Level-3 cache with intervention", }, { .ctrnum = 147, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCLUSTER_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from On-Cluster Level-3 cache without intervention", }, { .ctrnum = 148, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCLUSTER_MEMORY_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from On-Cluster memory", }, { .ctrnum = 149, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCLUSTER_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Cluster Level-3 cache with intervention", }, { .ctrnum = 150, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFCLUSTER_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off-Cluster Level-3 cache without" " intervention", }, { .ctrnum = 151, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFCLUSTER_MEMORY_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from Off-Cluster memory", }, { .ctrnum = 152, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFCLUSTER_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off-Cluster Level-3 cache with intervention", }, { .ctrnum = 153, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFDRAWER_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off-Drawer Level-3 cache without" " intervention", }, { 
.ctrnum = 154, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFDRAWER_MEMORY_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from Off-Drawer memory", }, { .ctrnum = 155, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFDRAWER_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off-Drawer Level-3 cache with intervention", }, { .ctrnum = 156, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONDRAWER_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from On-Drawer Level-4 cache", }, { .ctrnum = 157, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFDRAWER_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from Off-Drawer Level-4 cache", }, { .ctrnum = 158, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCHIP_L3_SOURCED_WRITES_RO", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from On-Chip L3 but a read-only invalidate was done" " to remove other copies of the cache line", }, { .ctrnum = 162, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCHIP_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Chip Level-3 cache without intervention", }, { .ctrnum = 163, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCHIP_MEMORY_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from On-Chip memory", }, { .ctrnum = 164, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCHIP_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an 
On-Chip Level-3 cache with intervention", }, { .ctrnum = 165, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCLUSTER_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Cluster Level-3 cache without" " intervention", }, { .ctrnum = 166, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCLUSTER_MEMORY_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from On-Cluster memory", }, { .ctrnum = 167, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCLUSTER_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from On-Cluster Level-3 cache with intervention", }, { .ctrnum = 168, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFCLUSTER_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off-Cluster Level-3 cache without" " intervention", }, { .ctrnum = 169, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFCLUSTER_MEMORY_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from Off-Cluster memory", }, { .ctrnum = 170, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFCLUSTER_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off-Cluster Level-3 cache with intervention", }, { .ctrnum = 171, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFDRAWER_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off-Drawer Level-3 cache without" " intervention", }, { .ctrnum = 172, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFDRAWER_MEMORY_SOURCED_WRITES", .desc 
= "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from Off-Drawer memory", }, { .ctrnum = 173, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFDRAWER_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off-Drawer Level-3 cache with intervention", }, { .ctrnum = 174, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONDRAWER_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from On-Drawer Level-4 cache", }, { .ctrnum = 175, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFDRAWER_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from Off-Drawer Level-4 cache", }, { .ctrnum = 224, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "BCD_DFP_EXECUTION_SLOTS", .desc = "Count of floating point execution slots used for" " finished Binary Coded Decimal to Decimal Floating" " Point conversions. Instructions: CDZT, CXZT, CZDT," " CZXT", }, { .ctrnum = 225, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "VX_BCD_EXECUTION_SLOTS", .desc = "Count of floating point execution slots used for" " finished vector arithmetic Binary Coded Decimal" " instructions. Instructions: VAP, VSP, VMP, VMSP, VDP," " VSDP, VRP, VLIP, VSRP, VPSOP, VCP, VTP, VPKZ, VUPKZ," " VCVB, VCVBG, VCVD, VCVDG", }, { .ctrnum = 226, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DECIMAL_INSTRUCTIONS", .desc = "Decimal instructions dispatched. 
Instructions: CVB," " CVD, AP, CP, DP, ED, EDMK, MP, SRP, SP, ZAP", }, { .ctrnum = 232, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "LAST_HOST_TRANSLATIONS", .desc = "Last Host Translation done", }, { .ctrnum = 243, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_NC_TABORT", .desc = "A transaction abort has occurred in a non-" "constrained transactional-execution mode", }, { .ctrnum = 244, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_C_TABORT_NO_SPECIAL", .desc = "A transaction abort has occurred in a constrained" " transactional-execution mode and the CPU is not" " using any special logic to allow the transaction to" " complete", }, { .ctrnum = 245, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_C_TABORT_SPECIAL", .desc = "A transaction abort has occurred in a constrained" " transactional-execution mode and the CPU is using" " special logic to allow the transaction to complete", }, { .ctrnum = 448, .ctrset = CPUMF_CTRSET_MT_DIAG, .name = "MT_DIAG_CYCLES_ONE_THR_ACTIVE", .desc = "Cycle count with one thread active", }, { .ctrnum = 449, .ctrset = CPUMF_CTRSET_MT_DIAG, .name = "MT_DIAG_CYCLES_TWO_THR_ACTIVE", .desc = "Cycle count with two threads active", }, }; static const pme_cpumf_ctr_t cpumcf_z15_counters[] = { { .ctrnum = 128, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_RO_EXCL_WRITES", .desc = "A directory write to the Level-1 Data cache where" " the line was originally in a Read-Only state in the" " cache but has been updated to be in the Exclusive" " state that allows stores to the cache line", }, { .ctrnum = 129, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB2_WRITES", .desc = "A translation has been written into The Translation" " Lookaside Buffer 2 (TLB2) and the request was made" " by the data cache", }, { .ctrnum = 130, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB2_MISSES", .desc = "A TLB2 miss is in progress for a request made by" " the data cache. 
Incremented by one for every TLB2" " miss in progress for the Level-1 Data cache on this" " cycle", }, { .ctrnum = 131, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB2_HPAGE_WRITES", .desc = "A translation entry was written into the Combined" " Region and Segment Table Entry array in the Level-2" " TLB for a one-megabyte page", }, { .ctrnum = 132, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB2_GPAGE_WRITES", .desc = "A translation entry for a two-gigabyte page was" " written into the Level-2 TLB", }, { .ctrnum = 133, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_L2D_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from the Level-2 Data cache", }, { .ctrnum = 134, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ITLB2_WRITES", .desc = "A translation entry has been written into the" " Translation Lookaside Buffer 2 (TLB2) and the" " request was made by the instruction cache", }, { .ctrnum = 135, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ITLB2_MISSES", .desc = "A TLB2 miss is in progress for a request made by" " the instruction cache. 
Incremented by one for every" " TLB2 miss in progress for the Level-1 Instruction" " cache in a cycle", }, { .ctrnum = 136, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_L2I_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from the Level-2 Instruction cache", }, { .ctrnum = 137, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_PTE_WRITES", .desc = "A translation entry was written into the Page Table" " Entry array in the Level-2 TLB", }, { .ctrnum = 138, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_CRSTE_WRITES", .desc = "Translation entries were written into the Combined" " Region and Segment Table Entry array and the Page" " Table Entry array in the Level-2 TLB", }, { .ctrnum = 139, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_ENGINES_BUSY", .desc = "The number of Level-2 TLB translation engines busy" " in a cycle", }, { .ctrnum = 140, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_C_TEND", .desc = "A TEND instruction has completed in a constrained" " transactional-execution mode", }, { .ctrnum = 141, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_NC_TEND", .desc = "A TEND instruction has completed in a non-" "constrained transactional-execution mode", }, { .ctrnum = 143, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1C_TLB2_MISSES", .desc = "Increments by one for any cycle where a level-1" " cache or level-2 TLB miss is in progress", }, { .ctrnum = 144, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCHIP_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Chip Level-3 cache without intervention", }, { .ctrnum = 145, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCHIP_MEMORY_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from On-Chip memory", }, { .ctrnum = 146, .ctrset = CPUMF_CTRSET_EXTENDED, .name = 
"L1D_ONCHIP_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Chip Level-3 cache with intervention", }, { .ctrnum = 147, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCLUSTER_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from On-Cluster Level-3 cache without intervention", }, { .ctrnum = 148, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCLUSTER_MEMORY_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from On-Cluster memory", }, { .ctrnum = 149, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCLUSTER_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Cluster Level-3 cache with intervention", }, { .ctrnum = 150, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFCLUSTER_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off-Cluster Level-3 cache without" " intervention", }, { .ctrnum = 151, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFCLUSTER_MEMORY_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from Off-Cluster memory", }, { .ctrnum = 152, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFCLUSTER_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off-Cluster Level-3 cache with intervention", }, { .ctrnum = 153, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFDRAWER_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off-Drawer Level-3 cache without" " intervention", }, { 
.ctrnum = 154, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFDRAWER_MEMORY_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from Off-Drawer memory", }, { .ctrnum = 155, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFDRAWER_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off-Drawer Level-3 cache with intervention", }, { .ctrnum = 156, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONDRAWER_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from On-Drawer Level-4 cache", }, { .ctrnum = 157, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_OFFDRAWER_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from Off-Drawer Level-4 cache", }, { .ctrnum = 158, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_ONCHIP_L3_SOURCED_WRITES_RO", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from On-Chip L3 but a read-only invalidate was done" " to remove other copies of the cache line", }, { .ctrnum = 162, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCHIP_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Chip Level-3 cache without intervention", }, { .ctrnum = 163, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCHIP_MEMORY_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from On-Chip memory", }, { .ctrnum = 164, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCHIP_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an 
On-Chip Level-3 cache with intervention", }, { .ctrnum = 165, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCLUSTER_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Cluster Level-3 cache without" " intervention", }, { .ctrnum = 166, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCLUSTER_MEMORY_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from On-Cluster memory", }, { .ctrnum = 167, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONCLUSTER_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from On-Cluster Level-3 cache with intervention", }, { .ctrnum = 168, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFCLUSTER_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off-Cluster Level-3 cache without" " intervention", }, { .ctrnum = 169, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFCLUSTER_MEMORY_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from Off-Cluster memory", }, { .ctrnum = 170, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFCLUSTER_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off-Cluster Level-3 cache with intervention", }, { .ctrnum = 171, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFDRAWER_L3_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off-Drawer Level-3 cache without" " intervention", }, { .ctrnum = 172, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFDRAWER_MEMORY_SOURCED_WRITES", .desc 
= "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from Off-Drawer memory", }, { .ctrnum = 173, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFDRAWER_L3_SOURCED_WRITES_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off-Drawer Level-3 cache with intervention", }, { .ctrnum = 174, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_ONDRAWER_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from On-Drawer Level-4 cache", }, { .ctrnum = 175, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1I_OFFDRAWER_L4_SOURCED_WRITES", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from Off-Drawer Level-4 cache", }, { .ctrnum = 224, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "BCD_DFP_EXECUTION_SLOTS", .desc = "Count of floating point execution slots used for" " finished Binary Coded Decimal to Decimal Floating" " Point conversions. Instructions: CDZT, CXZT, CZDT," " CZXT", }, { .ctrnum = 225, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "VX_BCD_EXECUTION_SLOTS", .desc = "Count of floating point execution slots used for" " finished vector arithmetic Binary Coded Decimal" " instructions. Instructions: VAP, VSP, VMP, VMSP, VDP," " VSDP, VRP, VLIP, VSRP, VPSOP, VCP, VTP, VPKZ, VUPKZ," " VCVB, VCVBG, VCVD, VCVDG", }, { .ctrnum = 226, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DECIMAL_INSTRUCTIONS", .desc = "Decimal instructions dispatched. 
Instructions: CVB," " CVD, AP, CP, DP, ED, EDMK, MP, SRP, SP, ZAP", }, { .ctrnum = 232, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "LAST_HOST_TRANSLATIONS", .desc = "Last Host Translation done", }, { .ctrnum = 243, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_NC_TABORT", .desc = "A transaction abort has occurred in a non-" "constrained transactional-execution mode", }, { .ctrnum = 244, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_C_TABORT_NO_SPECIAL", .desc = "A transaction abort has occurred in a constrained" " transactional-execution mode and the CPU is not" " using any special logic to allow the transaction to" " complete", }, { .ctrnum = 245, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_C_TABORT_SPECIAL", .desc = "A transaction abort has occurred in a constrained" " transactional-execution mode and the CPU is using" " special logic to allow the transaction to complete", }, { .ctrnum = 247, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DFLT_ACCESS", .desc = "Cycles CPU spent obtaining access to Deflate unit", }, { .ctrnum = 252, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DFLT_CYCLES", .desc = "Cycles CPU is using Deflate unit", }, { .ctrnum = 264, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DFLT_CC", .desc = "Increments by one for every DEFLATE CONVERSION CALL" " instruction executed", }, { .ctrnum = 265, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DFLT_CCFINISH", .desc = "Increments by one for every DEFLATE CONVERSION CALL" " instruction executed that ended in Condition Codes" " 0, 1 or 2", }, { .ctrnum = 448, .ctrset = CPUMF_CTRSET_MT_DIAG, .name = "MT_DIAG_CYCLES_ONE_THR_ACTIVE", .desc = "Cycle count with one thread active", }, { .ctrnum = 449, .ctrset = CPUMF_CTRSET_MT_DIAG, .name = "MT_DIAG_CYCLES_TWO_THR_ACTIVE", .desc = "Cycle count with two threads active", }, }; static const pme_cpumf_ctr_t cpumcf_z16_counters[] = { { .ctrnum = 128, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1D_RO_EXCL_WRITES", .desc = "A directory write to the Level-1 Data cache where" " the line 
was originally in a Read-Only state in the" " cache but has been updated to be in the Exclusive" " state that allows stores to the cache line.", }, { .ctrnum = 129, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB2_WRITES", .desc = "A translation has been written into The Translation" " Lookaside Buffer 2 (TLB2) and the request was made" " by the Level-1 Data cache. This is a replacement" " for what was provided for the DTLB on z13 and prior" " machines.", }, { .ctrnum = 130, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB2_MISSES", .desc = "A TLB2 miss is in progress for a request made by" " the Level-1 Data cache. Incremented by one for" " every TLB2 miss in progress for the Level-1 Data" " cache on this cycle. This is a replacement for what" " was provided for the DTLB on z13 and prior" " machines.", }, { .ctrnum = 131, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "CRSTE_1MB_WRITES", .desc = "A translation entry was written into the Combined" " Region and Segment Table Entry array in the Level-2" " TLB for a one-megabyte page.", }, { .ctrnum = 132, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DTLB2_GPAGE_WRITES", .desc = "A translation entry for a two-gigabyte page was" " written into the Level-2 TLB.", }, { .ctrnum = 134, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ITLB2_WRITES", .desc = "A translation entry has been written into the" " Translation Lookaside Buffer 2 (TLB2) and the" " request was made by the instruction cache. This is" " a replacement for what was provided for the ITLB on" " z13 and prior machines.", }, { .ctrnum = 135, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ITLB2_MISSES", .desc = "A TLB2 miss is in progress for a request made by" " the Level-1 Instruction cache. Incremented by one" " for every TLB2 miss in progress for the Level-1" " Instruction cache in a cycle. 
This is a replacement" " for what was provided for the ITLB on z13 and prior" " machines.", }, { .ctrnum = 137, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_PTE_WRITES", .desc = "A translation entry was written into the Page Table" " Entry array in the Level-2 TLB.", }, { .ctrnum = 138, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_CRSTE_WRITES", .desc = "Translation entries were written into the Combined" " Region and Segment Table Entry array and the Page" " Table Entry array in the Level-2 TLB.", }, { .ctrnum = 139, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TLB2_ENGINES_BUSY", .desc = "The number of Level-2 TLB translation engines busy" " in a cycle.", }, { .ctrnum = 140, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_C_TEND", .desc = "A TEND instruction has completed in a constrained" " transactional-execution mode.", }, { .ctrnum = 141, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_NC_TEND", .desc = "A TEND instruction has completed in a non-" " constrained transactional-execution mode.", }, { .ctrnum = 143, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "L1C_TLB2_MISSES", .desc = "Increments by one for any cycle where a level-1" " cache or level-2 TLB miss is in progress.", }, { .ctrnum = 145, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DCW_REQ", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from the requestor's Level-2 cache.", }, { .ctrnum = 146, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DCW_REQ_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from the requestor's Level-2 cache with" " intervention.", }, { .ctrnum = 147, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DCW_REQ_CHIP_HIT", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from the requestor's Level-2 cache after using" " chip level horizontal persistence, Chip-HP hit.", }, { .ctrnum = 148, .ctrset = 
CPUMF_CTRSET_EXTENDED, .name = "DCW_REQ_DRAWER_HIT", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from the requestor's Level-2 cache after using" " drawer level horizontal persistence, Drawer-HP hit.", }, { .ctrnum = 149, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DCW_ON_CHIP", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Chip Level-2 cache.", }, { .ctrnum = 150, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DCW_ON_CHIP_IV", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Chip Level-2 cache with intervention.", }, { .ctrnum = 151, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DCW_ON_CHIP_CHIP_HIT", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Chip Level-2 cache after using chip" " level horizontal persistence, Chip-HP hit.", }, { .ctrnum = 152, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DCW_ON_CHIP_DRAWER_HIT", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Chip Level-2 cache using drawer level" " horizontal persistence, Drawer-HP hit.", }, { .ctrnum = 153, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DCW_ON_MODULE", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Module Level-2 cache.", }, { .ctrnum = 154, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DCW_ON_DRAWER", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an On-Drawer Level-2 cache.", }, { .ctrnum = 155, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DCW_OFF_DRAWER", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from an Off-Drawer Level-2 cache.", }, { .ctrnum = 
156, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DCW_ON_CHIP_MEMORY", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from On-Chip memory.", }, { .ctrnum = 157, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DCW_ON_MODULE_MEMORY", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from On-Module memory.", }, { .ctrnum = 158, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DCW_ON_DRAWER_MEMORY", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from On-Drawer memory.", }, { .ctrnum = 159, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DCW_OFF_DRAWER_MEMORY", .desc = "A directory write to the Level-1 Data cache" " directory where the returned cache line was sourced" " from Off-Drawer memory.", }, { .ctrnum = 160, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "IDCW_ON_MODULE_IV", .desc = "A directory write to the Level-1 Data or Level-1" " Instruction cache directory where the returned" " cache line was sourced from an On-Module Level-2" " cache with intervention.", }, { .ctrnum = 161, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "IDCW_ON_MODULE_CHIP_HIT", .desc = "A directory write to the Level-1 Data or Level-1" " Instruction cache directory where the returned" " cache line was sourced from an On-Module Level-2" " cache using chip horizontal persistence, Chip-HP" " hit.", }, { .ctrnum = 162, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "IDCW_ON_MODULE_DRAWER_HIT", .desc = "A directory write to the Level-1 Data or Level-1" " Instruction cache directory where the returned" " cache line was sourced from an On-Module Level-2" " cache using drawer level horizontal persistence," " Drawer-HP hit.", }, { .ctrnum = 163, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "IDCW_ON_DRAWER_IV", .desc = "A directory write to the Level-1 Data or Level-1" " Instruction cache directory where the returned" " cache line was sourced from an 
On-Drawer Level-2" " cache with intervention.", }, { .ctrnum = 164, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "IDCW_ON_DRAWER_CHIP_HIT", .desc = "A directory write to the Level-1 Data or Level-1" " instruction cache directory where the returned" " cache line was sourced from an On-Drawer Level-2" " cache using chip level horizontal persistence, Chip-" " HP hit.", }, { .ctrnum = 165, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "IDCW_ON_DRAWER_DRAWER_HIT", .desc = "A directory write to the Level-1 Data or Level-1" " instruction cache directory where the returned" " cache line was sourced from an On-Drawer Level-2" " cache using drawer level horizontal persistence," " Drawer-HP hit.", }, { .ctrnum = 166, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "IDCW_OFF_DRAWER_IV", .desc = "A directory write to the Level-1 Data or Level-1" " instruction cache directory where the returned" " cache line was sourced from an Off-Drawer Level-2" " cache with intervention.", }, { .ctrnum = 167, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "IDCW_OFF_DRAWER_CHIP_HIT", .desc = "A directory write to the Level-1 Data or Level-1" " instruction cache directory where the returned" " cache line was sourced from an Off-Drawer Level-2" " cache using chip level horizontal persistence, Chip-" " HP hit.", }, { .ctrnum = 168, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "IDCW_OFF_DRAWER_DRAWER_HIT", .desc = "A directory write to the Level-1 Data or Level-1" " Instruction cache directory where the returned" " cache line was sourced from an Off-Drawer Level-2" " cache using drawer level horizontal persistence," " Drawer-HP hit.", }, { .ctrnum = 169, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ICW_REQ", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from the requestor's Level-2 cache.", }, { .ctrnum = 170, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ICW_REQ_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache
line was sourced" " from the requestor's Level-2 cache with" " intervention.", }, { .ctrnum = 171, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ICW_REQ_CHIP_HIT", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from the requestor's Level-2 cache using chip level" " horizontal persistence, Chip-HP hit.", }, { .ctrnum = 172, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ICW_REQ_DRAWER_HIT", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from the requestor's Level-2 cache using drawer" " level horizontal persistence, Drawer-HP hit.", }, { .ctrnum = 173, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ICW_ON_CHIP", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Chip Level-2 cache.", }, { .ctrnum = 174, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ICW_ON_CHIP_IV", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Chip Level-2 cache with intervention.", }, { .ctrnum = 175, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ICW_ON_CHIP_CHIP_HIT", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Chip Level-2 cache using chip level" " horizontal persistence, Chip-HP hit.", }, { .ctrnum = 176, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ICW_ON_CHIP_DRAWER_HIT", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Chip Level-2 cache using drawer level" " horizontal persistence, Drawer-HP hit.", }, { .ctrnum = 177, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ICW_ON_MODULE", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Module Level-2 cache.", }, { .ctrnum = 178, .ctrset =
CPUMF_CTRSET_EXTENDED, .name = "ICW_ON_DRAWER", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an On-Drawer Level-2 cache.", }, { .ctrnum = 179, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ICW_OFF_DRAWER", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from an Off-Drawer Level-2 cache.", }, { .ctrnum = 180, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ICW_ON_CHIP_MEMORY", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from On-Chip memory.", }, { .ctrnum = 181, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ICW_ON_MODULE_MEMORY", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from On-Module memory.", }, { .ctrnum = 182, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ICW_ON_DRAWER_MEMORY", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from On-Drawer memory.", }, { .ctrnum = 183, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "ICW_OFF_DRAWER_MEMORY", .desc = "A directory write to the Level-1 Instruction cache" " directory where the returned cache line was sourced" " from Off-Drawer memory.", }, { .ctrnum = 224, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "BCD_DFP_EXECUTION_SLOTS", .desc = "Count of floating point execution slots used for" " finished Binary Coded Decimal to Decimal Floating" " Point conversions. Instructions: CDZT, CXZT, CZDT," " CZXT.", }, { .ctrnum = 225, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "VX_BCD_EXECUTION_SLOTS", .desc = "Count of floating point execution slots used for" " finished vector arithmetic Binary Coded Decimal" " instructions.
Instructions: VAP, VSP, VMP, VMSP," " VDP, VSDP, VRP, VLIP, VSRP, VPSOP, VCP, VTP, VPKZ," " VUPKZ, VCVB, VCVBG, VCVD, VCVDG.", }, { .ctrnum = 226, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DECIMAL_INSTRUCTIONS", .desc = "Decimal instruction dispatched. Instructions: CVB," " CVD, AP, CP, DP, ED, EDMK, MP, SRP, SP, ZAP.", }, { .ctrnum = 232, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "LAST_HOST_TRANSLATIONS", .desc = "Last Host Translation done", }, { .ctrnum = 244, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_NC_TABORT", .desc = "A transaction abort has occurred in a non-" " constrained transactional-execution mode.", }, { .ctrnum = 245, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_C_TABORT_NO_SPECIAL", .desc = "A transaction abort has occurred in a constrained" " transactional-execution mode and the CPU is not" " using any special logic to allow the transaction to" " complete.", }, { .ctrnum = 246, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "TX_C_TABORT_SPECIAL", .desc = "A transaction abort has occurred in a constrained" " transactional-execution mode and the CPU is using" " special logic to allow the transaction to complete.", }, { .ctrnum = 248, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DFLT_ACCESS", .desc = "Cycles CPU spent obtaining access to Deflate unit", }, { .ctrnum = 253, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DFLT_CYCLES", .desc = "Cycles CPU is using Deflate unit", }, { .ctrnum = 256, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "SORTL", .desc = "Increments by one for every SORT LISTS instruction" " executed.", }, { .ctrnum = 265, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DFLT_CC", .desc = "Increments by one for every DEFLATE CONVERSION CALL" " instruction executed.", }, { .ctrnum = 266, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "DFLT_CCFINISH", .desc = "Increments by one for every DEFLATE CONVERSION CALL" " instruction executed that ended in Condition Codes" " 0, 1 or 2.", }, { .ctrnum = 267, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "NNPA_INVOCATIONS", 
.desc = "Increments by one for every Neural Network" " Processing Assist instruction executed.", }, { .ctrnum = 268, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "NNPA_COMPLETIONS", .desc = "Increments by one for every Neural Network" " Processing Assist instruction executed that ended" " in Condition Codes 0, 1 or 2.", }, { .ctrnum = 269, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "NNPA_WAIT_LOCK", .desc = "Cycles CPU spent obtaining access to IBM Z" " Integrated Accelerator for AI.", }, { .ctrnum = 270, .ctrset = CPUMF_CTRSET_EXTENDED, .name = "NNPA_HOLD_LOCK", .desc = "Cycles CPU is using IBM Z Integrated Accelerator" " for AI.", }, { .ctrnum = 448, .ctrset = CPUMF_CTRSET_MT_DIAG, .name = "MT_DIAG_CYCLES_ONE_THR_ACTIVE", .desc = "Cycle count with one thread active", }, { .ctrnum = 449, .ctrset = CPUMF_CTRSET_MT_DIAG, .name = "MT_DIAG_CYCLES_TWO_THR_ACTIVE", .desc = "Cycle count with two threads active", }, }; static const pme_cpumf_ctr_t cpumsf_counters[] = { { .ctrnum = 720896, .ctrset = CPUMF_CTRSET_NONE, .name = "SF_CYCLES_BASIC", .desc = "Sample CPU cycles using basic-sampling mode", }, { .ctrnum = 774144, .ctrset = CPUMF_CTRSET_NONE, .name = "SF_CYCLES_BASIC_DIAG", .desc = "Sample CPU cycles using diagnostic-sampling mode" " (not for ordinary use)", }, }; #endif /* __S390X_CPUMF_EVENTS_H__ */ papi-papi-7-2-0-t/src/libpfm4/lib/events/sparc_niagara1_events.h000066400000000000000000000020611502707512200244450ustar00rootroot00000000000000static const sparc_entry_t niagara1_pe[] = { /* PIC1 Niagara-1 events */ { .name = "Instr_cnt", .desc = "Number of instructions completed", .ctrl = PME_CTRL_S1, .code = 0x0, }, /* PIC0 Niagara-1 events */ { .name = "SB_full", .desc = "Store-buffer full", .ctrl = PME_CTRL_S0, .code = 0x0, }, { .name = "FP_instr_cnt", .desc = "FPU instructions", .ctrl = PME_CTRL_S0, .code = 0x1, }, { .name = "IC_miss", .desc = "I-cache miss", .ctrl = PME_CTRL_S0, .code = 0x2, }, { .name = "DC_miss", .desc = "D-cache miss", .ctrl = PME_CTRL_S0, .code =
0x3, }, { .name = "ITLB_miss", .desc = "I-TLB miss", .ctrl = PME_CTRL_S0, .code = 0x4, }, { .name = "DTLB_miss", .desc = "D-TLB miss", .ctrl = PME_CTRL_S0, .code = 0x5, }, { .name = "L2_imiss", .desc = "E-cache instruction fetch miss", .ctrl = PME_CTRL_S0, .code = 0x6, }, { .name = "L2_dmiss_ld", .desc = "E-cache data load miss", .ctrl = PME_CTRL_S0, .code = 0x7, }, }; #define PME_SPARC_NIAGARA1_EVENT_COUNT (sizeof(niagara1_pe)/sizeof(sparc_entry_t)) papi-papi-7-2-0-t/src/libpfm4/lib/events/sparc_niagara2_events.h000066400000000000000000000147361502707512200244560ustar00rootroot00000000000000static const sparc_entry_t niagara2_pe[] = { /* PIC0 Niagara-2 events */ { .name = "All_strands_idle", .desc = "Cycles when no strand can be picked for the physical core on which the monitoring strand resides.", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x0, }, { .name = "Instr_cnt", .desc = "Number of instructions completed", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x2, .umasks = { { .uname = "branches", .udesc = "Completed branches", .ubit = 0, }, { .uname = "taken_branches", .udesc = "Taken branches, which are always mispredicted", .ubit = 1, }, { .uname = "FGU_arith", .udesc = "All FADD, FSUB, FCMP, convert, FMUL, FDIV, FNEG, FABS, FSQRT, FMOV, FPADD, FPSUB, FPACK, FEXPAND, FPMERGE, FMUL8, FMULD8, FALIGNDATA, BSHUFFLE, FZERO, FONE, FSRC, FNOT1, FNOT2, FOR, FNOR, FAND, FNAND, FXOR, FXNOR, FORNOT1, FORNOT2, FANDNOT1, FANDNOT2, PDIST, SIAM", .ubit = 2, }, { .uname = "Loads", .udesc = "Load instructions", .ubit = 3, }, { .uname = "Stores", .udesc = "Store instructions", .ubit = 4, }, { .uname = "SW_count", .udesc = "Software count 'sethi %hi(fc00), %g0' instructions", .ubit = 5, }, { .uname = "other", .udesc = "Instructions not covered by other mask bits", .ubit = 6, }, { .uname = "atomics", .udesc = "Atomics are LDSTUB/A, CASA/XA, SWAP/A", .ubit = 7, }, }, .numasks = 8, }, { .name = "cache", .desc = "Cache events", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x3, .umasks =
{ { .uname = "IC_miss", .udesc = "I-cache misses. This counts only primary instruction cache misses, and does not count duplicate instruction cache misses. Also, only 'true' misses are counted. If a thread encounters an I$ miss, but the thread is redirected (due to a branch misprediction or trap, for example) before the line returns from L2 and is loaded into the I$, then the miss is not counted.", .ubit = 0, }, { .uname = "DC_miss", .udesc = "D-cache misses. This counts both primary and duplicate data cache misses.", .ubit = 1, }, { .uname = "L2IC_miss", .udesc = "L2 cache instruction misses", .ubit = 4, }, { .uname = "L2LD_miss", .udesc = "L2 cache load misses. Block loads are treated as one L2 miss event. In reality, each individual load can hit or miss in the L2 since the block load is not atomic.", .ubit = 5, }, }, .numasks = 4, }, { .name = "TLB", .desc = "TLB events", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x4, .umasks = { { .uname = "ITLB_L2ref", .udesc = "ITLB references to L2. For each ITLB miss with hardware tablewalk enabled, count each access the ITLB hardware tablewalk makes to L2.", .ubit = 2, }, { .uname = "DTLB_L2ref", .udesc = "DTLB references to L2. For each DTLB miss with hardware tablewalk enabled, count each access the DTLB hardware tablewalk makes to L2.", .ubit = 3, }, { .uname = "ITLB_L2miss", .udesc = "For each ITLB miss with hardware tablewalk enabled, count each access the ITLB hardware tablewalk makes to L2 which misses in L2. Note: Depending upon the hardware table walk configuration, each ITLB miss may issue from 1 to 4 requests to L2 to search TSBs.", .ubit = 4, }, { .uname = "DTLB_L2miss", .udesc = "For each DTLB miss with hardware tablewalk enabled, count each access the DTLB hardware tablewalk makes to L2 which misses in L2.
Note: Depending upon the hardware table walk configuration, each DTLB miss may issue from 1 to 4 requests to L2 to search TSBs.", .ubit = 5, }, }, .numasks = 4, }, { .name = "mem", .desc = "Memory operations", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x5, .umasks = { { .uname = "stream_load", .udesc = "Stream Unit load operations to L2", .ubit = 0, }, { .uname = "stream_store", .udesc = "Stream Unit store operations to L2", .ubit = 1, }, { .uname = "cpu_load", .udesc = "CPU loads to L2", .ubit = 2, }, { .uname = "cpu_ifetch", .udesc = "CPU instruction fetches to L2", .ubit = 3, }, { .uname = "cpu_store", .udesc = "CPU stores to L2", .ubit = 6, }, { .uname = "mmu_load", .udesc = "MMU loads to L2", .ubit = 7, }, }, .numasks = 6, }, { .name = "spu_ops", .desc = "Stream Unit operations. User, supervisor, and hypervisor counting must all be enabled to properly count these events.", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x6, .umasks = { { .uname = "DES", .udesc = "Increment for each CWQ or ASI operation that uses DES/3DES unit", .ubit = 0, }, { .uname = "AES", .udesc = "Increment for each CWQ or ASI operation that uses AES unit", .ubit = 1, }, { .uname = "RC4", .udesc = "Increment for each CWQ or ASI operation that uses RC4 unit", .ubit = 2, }, { .uname = "HASH", .udesc = "Increment for each CWQ or ASI operation that uses MD5/SHA-1/SHA-256 unit", .ubit = 3, }, { .uname = "MA", .udesc = "Increment for each CWQ or ASI modular arithmetic operation", .ubit = 4, }, { .uname = "CSUM", .udesc = "Increment for each iSCSI CRC or TCP/IP checksum operation", .ubit = 5, }, }, .numasks = 6, }, { .name = "spu_busy", .desc = "Stream Unit busy cycles. 
User, supervisor, and hypervisor counting must all be enabled to properly count these events.", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x07, .umasks = { { .uname = "DES", .udesc = "Cycles the DES/3DES unit is busy", .ubit = 0, }, { .uname = "AES", .udesc = "Cycles the AES unit is busy", .ubit = 1, }, { .uname = "RC4", .udesc = "Cycles the RC4 unit is busy", .ubit = 2, }, { .uname = "HASH", .udesc = "Cycles the MD5/SHA-1/SHA-256 unit is busy", .ubit = 3, }, { .uname = "MA", .udesc = "Cycles the modular arithmetic unit is busy", .ubit = 4, }, { .uname = "CSUM", .udesc = "Cycles the CRC/MPA/checksum unit is busy", .ubit = 5, }, }, .numasks = 6, }, { .name = "tlb_miss", .desc = "TLB misses", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0xb, .umasks = { { .uname = "ITLB", .udesc = "I-TLB misses", .ubit = 2, }, { .uname = "DTLB", .udesc = "D-TLB misses", .ubit = 3, }, }, .numasks = 2, }, }; #define PME_SPARC_NIAGARA2_EVENT_COUNT (sizeof(niagara2_pe)/sizeof(sparc_entry_t)) papi-papi-7-2-0-t/src/libpfm4/lib/events/sparc_ultra12_events.h000066400000000000000000000061601502707512200242600ustar00rootroot00000000000000static const sparc_entry_t ultra12_pe[] = { /* These two must always be first. 
*/ { .name = "Cycle_cnt", .desc = "Accumulated cycles", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x0, }, { .name = "Instr_cnt", .desc = "Number of instructions completed", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x1, }, { .name = "Dispatch0_IC_miss", .desc = "I-buffer is empty from I-Cache miss", .ctrl = PME_CTRL_S0, .code = 0x2, }, /* PIC0 events for UltraSPARC-I/II/IIi/IIe */ { .name = "Dispatch0_storeBuf", .desc = "Store buffer can not hold additional stores", .ctrl = PME_CTRL_S0, .code = 0x3, }, { .name = "IC_ref", .desc = "I-cache references", .ctrl = PME_CTRL_S0, .code = 0x8, }, { .name = "DC_rd", .desc = "D-cache read references (including accesses that subsequently trap)", .ctrl = PME_CTRL_S0, .code = 0x9, }, { .name = "DC_wr", .desc = "D-cache write references (including accesses that subsequently trap)", .ctrl = PME_CTRL_S0, .code = 0xa, }, { .name = "Load_use", .desc = "An instruction in the execute stage depends on an earlier load result that is not yet available", .ctrl = PME_CTRL_S0, .code = 0xb, }, { .name = "EC_ref", .desc = "Total E-cache references", .ctrl = PME_CTRL_S0, .code = 0xc, }, { .name = "EC_write_hit_RDO", .desc = "E-cache hits that do a read for ownership UPA transaction", .ctrl = PME_CTRL_S0, .code = 0xd, }, { .name = "EC_snoop_inv", .desc = "E-cache invalidates from the following UPA transactions: S_INV_REQ, S_CPI_REQ", .ctrl = PME_CTRL_S0, .code = 0xe, }, { .name = "EC_rd_hit", .desc = "E-cache read hits from D-cache misses", .ctrl = PME_CTRL_S0, .code = 0xf, }, /* PIC1 events for UltraSPARC-I/II/IIi/IIe */ { .name = "Dispatch0_mispred", .desc = "I-buffer is empty from Branch misprediction", .ctrl = PME_CTRL_S1, .code = 0x2, }, { .name = "Dispatch0_FP_use", .desc = "First instruction in the group depends on an earlier floating point result that is not yet available", .ctrl = PME_CTRL_S1, .code = 0x3, }, { .name = "IC_hit", .desc = "I-cache hits", .ctrl = PME_CTRL_S1, .code = 0x8, }, { .name = "DC_rd_hit", .desc = "D-cache read 
hits", .ctrl = PME_CTRL_S1, .code = 0x9, }, { .name = "DC_wr_hit", .desc = "D-cache write hits", .ctrl = PME_CTRL_S1, .code = 0xa, }, { .name = "Load_use_RAW", .desc = "There is a load use in the execute stage and there is a read-after-write hazard on the oldest outstanding load", .ctrl = PME_CTRL_S1, .code = 0xb, }, { .name = "EC_hit", .desc = "Total E-cache hits", .ctrl = PME_CTRL_S1, .code = 0xc, }, { .name = "EC_wb", .desc = "E-cache misses that do writebacks", .ctrl = PME_CTRL_S1, .code = 0xd, }, { .name = "EC_snoop_cb", .desc = "E-cache snoop copy-backs from the following UPA transactions: S_CPB_REQ, S_CPI_REQ, S_CPD_REQ, S_CPB_MIS_REQ", .ctrl = PME_CTRL_S1, .code = 0xe, }, { .name = "EC_ic_hit", .desc = "E-cache read hits from I-cache misses", .ctrl = PME_CTRL_S1, .code = 0xf, }, }; #define PME_SPARC_ULTRA12_EVENT_COUNT (sizeof(ultra12_pe)/sizeof(sparc_entry_t)) papi-papi-7-2-0-t/src/libpfm4/lib/events/sparc_ultra3_events.h000066400000000000000000000245621502707512200242060ustar00rootroot00000000000000static const sparc_entry_t ultra3_pe[] = { /* These two must always be first. 
*/ { .name = "Cycle_cnt", .desc = "Accumulated cycles", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x0, }, { .name = "Instr_cnt", .desc = "Number of instructions completed", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x1, }, /* PIC0 events common to all UltraSPARC processors */ { .name = "Dispatch0_IC_miss", .desc = "I-buffer is empty from I-Cache miss", .ctrl = PME_CTRL_S0, .code = 0x2, }, { .name = "IC_ref", .desc = "I-cache references", .ctrl = PME_CTRL_S0, .code = 0x8, }, { .name = "DC_rd", .desc = "D-cache read references (including accesses that subsequently trap)", .ctrl = PME_CTRL_S0, .code = 0x9, }, { .name = "DC_wr", .desc = "D-cache store accesses (including cacheable stores that subsequently trap)", .ctrl = PME_CTRL_S0, .code = 0xa, }, { .name = "EC_ref", .desc = "E-cache references", .ctrl = PME_CTRL_S0, .code = 0xc, }, { .name = "EC_snoop_inv", .desc = "L2-cache invalidates generated from a snoop by a remote processor", .ctrl = PME_CTRL_S0, .code = 0xe, }, /* PIC1 events common to all UltraSPARC processors */ { .name = "Dispatch0_mispred", .desc = "I-buffer is empty from Branch misprediction", .ctrl = PME_CTRL_S1, .code = 0x2, }, { .name = "EC_wb", .desc = "Dirty sub-blocks that produce writebacks due to L2-cache miss events", .ctrl = PME_CTRL_S1, .code = 0xd, }, { .name = "EC_snoop_cb", .desc = "L2-cache copybacks generated from a snoop by a remote processor", .ctrl = PME_CTRL_S1, .code = 0xe, }, /* PIC0 events common to all UltraSPARC-III/III+/IIIi processors */ { .name = "Dispatch0_br_target", .desc = "I-buffer is empty due to a branch target address calculation", .ctrl = PME_CTRL_S0, .code = 0x3, }, { .name = "Dispatch0_2nd_br", .desc = "Stall cycles due to having two branch instructions line-up in one 4-instruction group causing the second branch in the group to be re-fetched, delaying its entrance into the I-buffer", .ctrl = PME_CTRL_S0, .code = 0x4, }, { .name = "Rstall_storeQ", .desc = "R-stage stall for a store instruction which is the next
instruction to be executed, but is stalled due to the store queue being full", .ctrl = PME_CTRL_S0, .code = 0x5, }, { .name = "Rstall_IU_use", .desc = "R-stage stall for an event that the next instruction to be executed depends on the result of a preceding integer instruction in the pipeline that is not yet available", .ctrl = PME_CTRL_S0, .code = 0x6, }, { .name = "EC_write_hit_RTO", .desc = "W-cache exclusive requests that hit L2-cache in S, O, or Os state and thus, do a read-to-own bus transaction", .ctrl = PME_CTRL_S0, .code = 0xd, }, { .name = "EC_rd_miss", .desc = "L2-cache miss events (including atomics) from D-cache events", .ctrl = PME_CTRL_S0, .code = 0xf, }, { .name = "PC_port0_rd", .desc = "P-cache cacheable FP loads to the first port (general purpose load path to D-cache and P-cache via MS pipeline)", .ctrl = PME_CTRL_S0, .code = 0x10, }, { .name = "SI_snoop", .desc = "Counts snoops from remote processor(s) including RTS, RTSR, RTO, RTOR, RS, RSR, RTSM, and WS", .ctrl = PME_CTRL_S0, .code = 0x11, }, { .name = "SI_ciq_flow", .desc = "Counts system clock cycles when the flow control (PauseOut) signal is asserted", .ctrl = PME_CTRL_S0, .code = 0x12, }, { .name = "SI_owned", .desc = "Counts events where owned_in is asserted on bus requests from the local processor", .ctrl = PME_CTRL_S0, .code = 0x13, }, { .name = "SW_count0", .desc = "Counts software-generated occurrences of 'sethi %hi(0xfc000), %g0' instruction", .ctrl = PME_CTRL_S0, .code = 0x14, }, { .name = "IU_Stat_Br_miss_taken", .desc = "Retired branches that were predicted to be taken, but in fact were not taken", .ctrl = PME_CTRL_S0, .code = 0x15, }, { .name = "IU_Stat_Br_Count_taken", .desc = "Retired taken branches", .ctrl = PME_CTRL_S0, .code = 0x16, }, { .name = "Dispatch0_rs_mispred", .desc = "I-buffer is empty due to a Return Address Stack misprediction", .ctrl = PME_CTRL_S0, .code = 0x4, }, { .name = "FA_pipe_completion", .desc = "Instructions that complete execution on the FPG ALU
pipelines", .ctrl = PME_CTRL_S0, .code = 0x18, }, /* PIC1 events common to all UltraSPARC-III/III+/IIIi processors */ { .name = "IC_miss_cancelled", .desc = "I-cache misses cancelled due to mis-speculation, recycle, or other events", .ctrl = PME_CTRL_S1, .code = 0x3, }, { .name = "Re_FPU_bypass", .desc = "Stall due to recirculation when an FPU bypass condition that does not have a direct bypass path occurs", .ctrl = PME_CTRL_S1, .code = 0x5, }, { .name = "Re_DC_miss", .desc = "Stall due to loads that miss D-cache and get recirculated", .ctrl = PME_CTRL_S1, .code = 0x6, }, { .name = "Re_EC_miss", .desc = "Stall due to loads that miss L2-cache and get recirculated", .ctrl = PME_CTRL_S1, .code = 0x7, }, { .name = "IC_miss", .desc = "I-cache misses, including fetches from mis-speculated execution paths which are later cancelled", .ctrl = PME_CTRL_S1, .code = 0x8, }, { .name = "DC_rd_miss", .desc = "Recirculated loads that miss the D-cache", .ctrl = PME_CTRL_S1, .code = 0x9, }, { .name = "DC_wr_miss", .desc = "D-cache store accesses that miss D-cache", .ctrl = PME_CTRL_S1, .code = 0xa, }, { .name = "Rstall_FP_use", .desc = "R-stage stall for an event that the next instruction to be executed depends on the result of a preceding floating-point instruction in the pipeline that is not yet available", .ctrl = PME_CTRL_S1, .code = 0xb, }, { .name = "EC_misses", .desc = "E-cache misses", .ctrl = PME_CTRL_S1, .code = 0xc, }, { .name = "EC_ic_miss", .desc = "L2-cache read misses from I-cache requests", .ctrl = PME_CTRL_S1, .code = 0xf, }, { .name = "Re_PC_miss", .desc = "Stall due to recirculation when a prefetch cache miss occurs on a prefetch predicted second load", .ctrl = PME_CTRL_S1, .code = 0x10, }, { .name = "ITLB_miss", .desc = "I-TLB miss traps taken", .ctrl = PME_CTRL_S1, .code = 0x11, }, { .name = "DTLB_miss", .desc = "Memory reference instructions which trap due to D-TLB miss", .ctrl = PME_CTRL_S1, .code = 0x12, }, { .name = "WC_miss", .desc = "W-cache misses", .ctrl 
= PME_CTRL_S1, .code = 0x13, }, { .name = "WC_snoop_cb", .desc = "W-cache copybacks generated by a snoop from a remote processor", .ctrl = PME_CTRL_S1, .code = 0x14, }, { .name = "WC_scrubbed", .desc = "W-cache hits to clean lines", .ctrl = PME_CTRL_S1, .code = 0x15, }, { .name = "WC_wb_wo_read", .desc = "W-cache writebacks not requiring a read", .ctrl = PME_CTRL_S1, .code = 0x16, }, { .name = "PC_soft_hit", .desc = "FP loads that hit a P-cache line that was prefetched by a software-prefetch instruction", .ctrl = PME_CTRL_S1, .code = 0x18, }, { .name = "PC_snoop_inv", .desc = "P-cache invalidates that were generated by a snoop from a remote processor and stores by a local processor", .ctrl = PME_CTRL_S1, .code = 0x19, }, { .name = "PC_hard_hit", .desc = "FP loads that hit a P-cache line that was prefetched by a hardware prefetch", .ctrl = PME_CTRL_S1, .code = 0x1a, }, { .name = "PC_port1_rd", .desc = "P-cache cacheable FP loads to the second port (memory and out-of-pipeline instruction execution loads via the A0 and A1 pipelines)", .ctrl = PME_CTRL_S1, .code = 0x1b, }, { .name = "SW_count1", .desc = "Counts software-generated occurrences of 'sethi %hi(0xfc000), %g0' instruction", .ctrl = PME_CTRL_S1, .code = 0x1c, }, { .name = "IU_Stat_Br_miss_untaken", .desc = "Retired branches that were predicted to be untaken, but in fact were taken", .ctrl = PME_CTRL_S1, .code = 0x1d, }, { .name = "IU_Stat_Br_Count_untaken", .desc = "Retired untaken branches", .ctrl = PME_CTRL_S1, .code = 0x1e, }, { .name = "PC_MS_miss", .desc = "FP loads through the MS pipeline that miss P-cache", .ctrl = PME_CTRL_S1, .code = 0x1f, }, { .name = "Re_RAW_miss", .desc = "Stall due to recirculation when there is a load in the E-stage which has a non-bypassable read-after-write hazard with an earlier store instruction", .ctrl = PME_CTRL_S1, .code = 0x26, }, { .name = "FM_pipe_completion", .desc = "Instructions that complete execution on the FPG Multiply pipelines", .ctrl = PME_CTRL_S0, .code = 
0x27, }, /* PIC0 memory controller events common to UltraSPARC-III/III+ processors */ { .name = "MC_reads_0", .desc = "Read requests completed to memory bank 0", .ctrl = PME_CTRL_S0, .code = 0x20, }, { .name = "MC_reads_1", .desc = "Read requests completed to memory bank 1", .ctrl = PME_CTRL_S0, .code = 0x21, }, { .name = "MC_reads_2", .desc = "Read requests completed to memory bank 2", .ctrl = PME_CTRL_S0, .code = 0x22, }, { .name = "MC_reads_3", .desc = "Read requests completed to memory bank 3", .ctrl = PME_CTRL_S0, .code = 0x23, }, { .name = "MC_stalls_0", .desc = "Clock cycles that requests were stalled in the MCU queues because bank 0 was busy with a previous request", .ctrl = PME_CTRL_S0, .code = 0x24, }, { .name = "MC_stalls_2", .desc = "Clock cycles that requests were stalled in the MCU queues because bank 2 was busy with a previous request", .ctrl = PME_CTRL_S0, .code = 0x25, }, /* PIC1 memory controller events common to all UltraSPARC-III/III+ processors */ { .name = "MC_writes_0", .desc = "Write requests completed to memory bank 0", .ctrl = PME_CTRL_S1, .code = 0x20, }, { .name = "MC_writes_1", .desc = "Write requests completed to memory bank 1", .ctrl = PME_CTRL_S1, .code = 0x21, }, { .name = "MC_writes_2", .desc = "Write requests completed to memory bank 2", .ctrl = PME_CTRL_S1, .code = 0x22, }, { .name = "MC_writes_3", .desc = "Write requests completed to memory bank 3", .ctrl = PME_CTRL_S1, .code = 0x23, }, { .name = "MC_stalls_1", .desc = "Clock cycles that requests were stalled in the MCU queues because bank 1 was busy with a previous request", .ctrl = PME_CTRL_S1, .code = 0x24, }, { .name = "MC_stalls_3", .desc = "Clock cycles that requests were stalled in the MCU queues because bank 3 was busy with a previous request", .ctrl = PME_CTRL_S1, .code = 0x25, }, }; #define PME_SPARC_ULTRA3_EVENT_COUNT (sizeof(ultra3_pe)/sizeof(sparc_entry_t)) 
papi-papi-7-2-0-t/src/libpfm4/lib/events/sparc_ultra3i_events.h000066400000000000000000000241271502707512200243540ustar00rootroot00000000000000static const sparc_entry_t ultra3i_pe[] = { /* These two must always be first. */ { .name = "Cycle_cnt", .desc = "Accumulated cycles", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x0, }, { .name = "Instr_cnt", .desc = "Number of instructions completed", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x1, }, /* PIC0 events common to all UltraSPARC processors */ { .name = "Dispatch0_IC_miss", .desc = "I-buffer is empty from I-Cache miss", .ctrl = PME_CTRL_S0, .code = 0x2, }, { .name = "IC_ref", .desc = "I-cache references", .ctrl = PME_CTRL_S0, .code = 0x8, }, { .name = "DC_rd", .desc = "D-cache read references (including accesses that subsequently trap)", .ctrl = PME_CTRL_S0, .code = 0x9, }, { .name = "DC_wr", .desc = "D-cache store accesses (including cacheable stores that subsequently trap)", .ctrl = PME_CTRL_S0, .code = 0xa, }, { .name = "EC_ref", .desc = "E-cache references", .ctrl = PME_CTRL_S0, .code = 0xc, }, { .name = "EC_snoop_inv", .desc = "L2-cache invalidates generated from a snoop by a remote processor", .ctrl = PME_CTRL_S0, .code = 0xe, }, /* PIC1 events common to all UltraSPARC processors */ { .name = "Dispatch0_mispred", .desc = "I-buffer is empty from Branch misprediction", .ctrl = PME_CTRL_S1, .code = 0x2, }, { .name = "EC_wb", .desc = "Dirty sub-blocks that produce writebacks due to L2-cache miss events", .ctrl = PME_CTRL_S1, .code = 0xd, }, { .name = "EC_snoop_cb", .desc = "L2-cache copybacks generated from a snoop by a remote processor", .ctrl = PME_CTRL_S1, .code = 0xe, }, /* PIC0 events common to all UltraSPARC-III/III+/IIIi processors */ { .name = "Dispatch0_br_target", .desc = "I-buffer is empty due to a branch target address calculation", .ctrl = PME_CTRL_S0, .code = 0x3, }, { .name = "Dispatch0_2nd_br", .desc = "Stall cycles due to having two branch instructions line-up in one 4-instruction group causing 
the second branch in the group to be re-fetched, delaying its entrance into the I-buffer", .ctrl = PME_CTRL_S0, .code = 0x4, }, { .name = "Rstall_storeQ", .desc = "R-stage stall for a store instruction which is the next instruction to be executed, but is stalled due to the store queue being full", .ctrl = PME_CTRL_S0, .code = 0x5, }, { .name = "Rstall_IU_use", .desc = "R-stage stall for an event that the next instruction to be executed depends on the result of a preceding integer instruction in the pipeline that is not yet available", .ctrl = PME_CTRL_S0, .code = 0x6, }, { .name = "EC_write_hit_RTO", .desc = "W-cache exclusive requests that hit L2-cache in S, O, or Os state and thus, do a read-to-own bus transaction", .ctrl = PME_CTRL_S0, .code = 0xd, }, { .name = "EC_rd_miss", .desc = "L2-cache miss events (including atomics) from D-cache events", .ctrl = PME_CTRL_S0, .code = 0xf, }, { .name = "PC_port0_rd", .desc = "P-cache cacheable FP loads to the first port (general purpose load path to D-cache and P-cache via MS pipeline)", .ctrl = PME_CTRL_S0, .code = 0x10, }, { .name = "SI_snoop", .desc = "Counts snoops from remote processor(s) including RTS, RTSR, RTO, RTOR, RS, RSR, RTSM, and WS", .ctrl = PME_CTRL_S0, .code = 0x11, }, { .name = "SI_ciq_flow", .desc = "Counts system clock cycles when the flow control (PauseOut) signal is asserted", .ctrl = PME_CTRL_S0, .code = 0x12, }, { .name = "SI_owned", .desc = "Counts events where owned_in is asserted on bus requests from the local processor", .ctrl = PME_CTRL_S0, .code = 0x13, }, { .name = "SW_count0", .desc = "Counts software-generated occurrences of 'sethi %hi(0xfc000), %g0' instruction", .ctrl = PME_CTRL_S0, .code = 0x14, }, { .name = "IU_Stat_Br_miss_taken", .desc = "Retired branches that were predicted to be taken, but in fact were not taken", .ctrl = PME_CTRL_S0, .code = 0x15, }, { .name = "IU_Stat_Br_Count_taken", .desc = "Retired taken branches", .ctrl = PME_CTRL_S0, .code = 0x16, }, { .name =
"Dispatch0_rs_mispred", .desc = "I-buffer is empty due to a Return Address Stack misprediction", .ctrl = PME_CTRL_S0, .code = 0x4, }, { .name = "FA_pipe_completion", .desc = "Instructions that complete execution on the FPG ALU pipelines", .ctrl = PME_CTRL_S0, .code = 0x18, }, /* PIC1 events common to all UltraSPARC-III/III+/IIIi processors */ { .name = "IC_miss_cancelled", .desc = "I-cache misses cancelled due to mis-speculation, recycle, or other events", .ctrl = PME_CTRL_S1, .code = 0x3, }, { .name = "Re_FPU_bypass", .desc = "Stall due to recirculation when an FPU bypass condition that does not have a direct bypass path occurs", .ctrl = PME_CTRL_S1, .code = 0x5, }, { .name = "Re_DC_miss", .desc = "Stall due to loads that miss D-cache and get recirculated", .ctrl = PME_CTRL_S1, .code = 0x6, }, { .name = "Re_EC_miss", .desc = "Stall due to loads that miss L2-cache and get recirculated", .ctrl = PME_CTRL_S1, .code = 0x7, }, { .name = "IC_miss", .desc = "I-cache misses, including fetches from mis-speculated execution paths which are later cancelled", .ctrl = PME_CTRL_S1, .code = 0x8, }, { .name = "DC_rd_miss", .desc = "Recirculated loads that miss the D-cache", .ctrl = PME_CTRL_S1, .code = 0x9, }, { .name = "DC_wr_miss", .desc = "D-cache store accesses that miss D-cache", .ctrl = PME_CTRL_S1, .code = 0xa, }, { .name = "Rstall_FP_use", .desc = "R-stage stall for an event that the next instruction to be executed depends on the result of a preceding floating-point instruction in the pipeline that is not yet available", .ctrl = PME_CTRL_S1, .code = 0xb, }, { .name = "EC_misses", .desc = "E-cache misses", .ctrl = PME_CTRL_S1, .code = 0xc, }, { .name = "EC_ic_miss", .desc = "L2-cache read misses from I-cache requests", .ctrl = PME_CTRL_S1, .code = 0xf, }, { .name = "Re_PC_miss", .desc = "Stall due to recirculation when a prefetch cache miss occurs on a prefetch predicted second load", .ctrl = PME_CTRL_S1, .code = 0x10, }, { .name = "ITLB_miss", .desc = "I-TLB miss traps 
taken", .ctrl = PME_CTRL_S1, .code = 0x11, }, { .name = "DTLB_miss", .desc = "Memory reference instructions which trap due to D-TLB miss", .ctrl = PME_CTRL_S1, .code = 0x12, }, { .name = "WC_miss", .desc = "W-cache misses", .ctrl = PME_CTRL_S1, .code = 0x13, }, { .name = "WC_snoop_cb", .desc = "W-cache copybacks generated by a snoop from a remote processor", .ctrl = PME_CTRL_S1, .code = 0x14, }, { .name = "WC_scrubbed", .desc = "W-cache hits to clean lines", .ctrl = PME_CTRL_S1, .code = 0x15, }, { .name = "WC_wb_wo_read", .desc = "W-cache writebacks not requiring a read", .ctrl = PME_CTRL_S1, .code = 0x16, }, { .name = "PC_soft_hit", .desc = "FP loads that hit a P-cache line that was prefetched by a software-prefetch instruction", .ctrl = PME_CTRL_S1, .code = 0x18, }, { .name = "PC_snoop_inv", .desc = "P-cache invalidates that were generated by a snoop from a remote processor and stores by a local processor", .ctrl = PME_CTRL_S1, .code = 0x19, }, { .name = "PC_hard_hit", .desc = "FP loads that hit a P-cache line that was prefetched by a hardware prefetch", .ctrl = PME_CTRL_S1, .code = 0x1a, }, { .name = "PC_port1_rd", .desc = "P-cache cacheable FP loads to the second port (memory and out-of-pipeline instruction execution loads via the A0 and A1 pipelines)", .ctrl = PME_CTRL_S1, .code = 0x1b, }, { .name = "SW_count1", .desc = "Counts software-generated occurrences of 'sethi %hi(0xfc000), %g0' instruction", .ctrl = PME_CTRL_S1, .code = 0x1c, }, { .name = "IU_Stat_Br_miss_untaken", .desc = "Retired branches that were predicted to be untaken, but in fact were taken", .ctrl = PME_CTRL_S1, .code = 0x1d, }, { .name = "IU_Stat_Br_Count_untaken", .desc = "Retired untaken branches", .ctrl = PME_CTRL_S1, .code = 0x1e, }, { .name = "PC_MS_miss", .desc = "FP loads through the MS pipeline that miss P-cache", .ctrl = PME_CTRL_S1, .code = 0x1f, }, { .name = "Re_RAW_miss", .desc = "Stall due to recirculation when there is a load in the E-stage which has a non-bypassable 
read-after-write hazard with an earlier store instruction", .ctrl = PME_CTRL_S1, .code = 0x26, }, { .name = "FM_pipe_completion", .desc = "Instructions that complete execution on the FPG Multiply pipelines", .ctrl = PME_CTRL_S0, .code = 0x27, }, /* PIC0 memory controller events specific to UltraSPARC-IIIi processors */ { .name = "MC_read_dispatched", .desc = "DDR 64-byte reads dispatched by the MIU", .ctrl = PME_CTRL_S0, .code = 0x20, }, { .name = "MC_write_dispatched", .desc = "DDR 64-byte writes dispatched by the MIU", .ctrl = PME_CTRL_S0, .code = 0x21, }, { .name = "MC_read_returned_to_JBU", .desc = "64-byte reads that return data to JBU", .ctrl = PME_CTRL_S0, .code = 0x22, }, { .name = "MC_msl_busy_stall", .desc = "Stall cycles due to msl_busy", .ctrl = PME_CTRL_S0, .code = 0x23, }, { .name = "MC_mdb_overflow_stall", .desc = "Stall cycles due to potential memory data buffer overflow", .ctrl = PME_CTRL_S0, .code = 0x24, }, { .name = "MC_miu_spec_request", .desc = "Speculative requests accepted by MIU", .ctrl = PME_CTRL_S0, .code = 0x25, }, /* PIC1 memory controller events specific to UltraSPARC-IIIi processors */ { .name = "MC_reads", .desc = "64-byte reads by the MSL", .ctrl = PME_CTRL_S1, .code = 0x20, }, { .name = "MC_writes", .desc = "64-byte writes by the MSL", .ctrl = PME_CTRL_S1, .code = 0x21, }, { .name = "MC_page_close_stall", .desc = "DDR page conflicts", .ctrl = PME_CTRL_S1, .code = 0x22, }, /* PIC1 events specific to UltraSPARC-III+/IIIi */ { .name = "Re_DC_missovhd", .desc = "Used to measure D-cache stall counts separately for L2-cache hits and misses. 
This counter is used with the recirculation and cache access events to separately calculate the D-cache loads that hit and miss the L2-cache", .ctrl = PME_CTRL_S1, .code = 0x4, }, }; #define PME_SPARC_ULTRA3I_EVENT_COUNT (sizeof(ultra3i_pe)/sizeof(sparc_entry_t))
papi-papi-7-2-0-t/src/libpfm4/lib/events/sparc_ultra3plus_events.h
static const sparc_entry_t ultra3plus_pe[] = { /* These two must always be first. */ { .name = "Cycle_cnt", .desc = "Accumulated cycles", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x0, }, { .name = "Instr_cnt", .desc = "Number of instructions completed", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x1, }, /* PIC0 events common to all UltraSPARC processors */ { .name = "Dispatch0_IC_miss", .desc = "I-buffer is empty from I-Cache miss", .ctrl = PME_CTRL_S0, .code = 0x2, }, { .name = "IC_ref", .desc = "I-cache references", .ctrl = PME_CTRL_S0, .code = 0x8, }, { .name = "DC_rd", .desc = "D-cache read references (including accesses that subsequently trap)", .ctrl = PME_CTRL_S0, .code = 0x9, }, { .name = "DC_wr", .desc = "D-cache store accesses (including cacheable stores that subsequently trap)", .ctrl = PME_CTRL_S0, .code = 0xa, }, { .name = "EC_ref", .desc = "E-cache references", .ctrl = PME_CTRL_S0, .code = 0xc, }, { .name = "EC_snoop_inv", .desc = "L2-cache invalidates generated from a snoop by a remote processor", .ctrl = PME_CTRL_S0, .code = 0xe, }, /* PIC1 events common to all UltraSPARC processors */ { .name = "Dispatch0_mispred", .desc = "I-buffer is empty from Branch misprediction", .ctrl = PME_CTRL_S1, .code = 0x2, }, { .name = "EC_wb", .desc = "Dirty sub-blocks that produce writebacks due to L2-cache miss events", .ctrl = PME_CTRL_S1, .code = 0xd, }, { .name = "EC_snoop_cb", .desc = "L2-cache copybacks generated from a snoop by a remote processor", .ctrl = PME_CTRL_S1, .code = 0xe, }, /* PIC0 events common to all UltraSPARC-III/III+/IIIi processors */ {
.name = "Dispatch0_br_target", .desc = "I-buffer is empty due to a branch target address calculation", .ctrl = PME_CTRL_S0, .code = 0x3, }, { .name = "Dispatch0_2nd_br", .desc = "Stall cycles due to having two branch instructions line-up in one 4-instruction group causing the second branch in the group to be re-fetched, delaying its entrance into the I-buffer", .ctrl = PME_CTRL_S0, .code = 0x4, }, { .name = "Rstall_storeQ", .desc = "R-stage stall for a store instruction which is the next instruction to be executed, but it stalled due to the store queue being full", .ctrl = PME_CTRL_S0, .code = 0x5, }, { .name = "Rstall_IU_use", .desc = "R-stage stall for an event that the next instruction to be executed depends on the result of a preceding integer instruction in the pipeline that is not yet available", .ctrl = PME_CTRL_S0, .code = 0x6, }, { .name = "EC_write_hit_RTO", .desc = "W-cache exclusive requests that hit L2-cache in S, O, or Os state and thus, do a read-to-own bus transaction", .ctrl = PME_CTRL_S0, .code = 0xd, }, { .name = "EC_rd_miss", .desc = "L2-cache miss events (including atomics) from D-cache events", .ctrl = PME_CTRL_S0, .code = 0xf, }, { .name = "PC_port0_rd", .desc = "P-cache cacheable FP loads to the first port (general purpose load path to D-cache and P-cache via MS pipeline)", .ctrl = PME_CTRL_S0, .code = 0x10, }, { .name = "SI_snoop", .desc = "Counts snoops from remote processor(s) including RTS, RTSR, RTO, RTOR, RS, RSR, RTSM, and WS", .ctrl = PME_CTRL_S0, .code = 0x11, }, { .name = "SI_ciq_flow", .desc = "Counts system clock cycles when the flow control (PauseOut) signal is asserted", .ctrl = PME_CTRL_S0, .code = 0x12, }, { .name = "SI_owned", .desc = "Counts events where owned_in is asserted on bus requests from the local processor", .ctrl = PME_CTRL_S0, .code = 0x13, }, { .name = "SW_count0", .desc = "Counts software-generated occurrences of 'sethi %hi(0xfc000), %g0' instruction", .ctrl = PME_CTRL_S0, .code = 0x14, }, { .name =
"IU_Stat_Br_miss_taken", .desc = "Retired branches that were predicted to be taken, but in fact were not taken", .ctrl = PME_CTRL_S0, .code = 0x15, }, { .name = "IU_Stat_Br_Count_taken", .desc = "Retired taken branches", .ctrl = PME_CTRL_S0, .code = 0x16, }, { .name = "Dispatch0_rs_mispred", .desc = "I-buffer is empty due to a Return Address Stack misprediction", .ctrl = PME_CTRL_S0, .code = 0x4, }, { .name = "FA_pipe_completion", .desc = "Instructions that complete execution on the FPG ALU pipelines", .ctrl = PME_CTRL_S0, .code = 0x18, }, /* PIC1 events common to all UltraSPARC-III/III+/IIIi processors */ { .name = "IC_miss_cancelled", .desc = "I-cache misses cancelled due to mis-speculation, recycle, or other events", .ctrl = PME_CTRL_S1, .code = 0x3, }, { .name = "Re_FPU_bypass", .desc = "Stall due to recirculation when an FPU bypass condition that does not have a direct bypass path occurs", .ctrl = PME_CTRL_S1, .code = 0x5, }, { .name = "Re_DC_miss", .desc = "Stall due to loads that miss D-cache and get recirculated", .ctrl = PME_CTRL_S1, .code = 0x6, }, { .name = "Re_EC_miss", .desc = "Stall due to loads that miss L2-cache and get recirculated", .ctrl = PME_CTRL_S1, .code = 0x7, }, { .name = "IC_miss", .desc = "I-cache misses, including fetches from mis-speculated execution paths which are later cancelled", .ctrl = PME_CTRL_S1, .code = 0x8, }, { .name = "DC_rd_miss", .desc = "Recirculated loads that miss the D-cache", .ctrl = PME_CTRL_S1, .code = 0x9, }, { .name = "DC_wr_miss", .desc = "D-cache store accesses that miss D-cache", .ctrl = PME_CTRL_S1, .code = 0xa, }, { .name = "Rstall_FP_use", .desc = "R-stage stall for an event that the next instruction to be executed depends on the result of a preceding floating-point instruction in the pipeline that is not yet available", .ctrl = PME_CTRL_S1, .code = 0xb, }, { .name = "EC_misses", .desc = "E-cache misses", .ctrl = PME_CTRL_S1, .code = 0xc, }, { .name = "EC_ic_miss", .desc = "L2-cache read misses from I-cache 
requests", .ctrl = PME_CTRL_S1, .code = 0xf, }, { .name = "Re_PC_miss", .desc = "Stall due to recirculation when a prefetch cache miss occurs on a prefetch predicted second load", .ctrl = PME_CTRL_S1, .code = 0x10, }, { .name = "ITLB_miss", .desc = "I-TLB miss traps taken", .ctrl = PME_CTRL_S1, .code = 0x11, }, { .name = "DTLB_miss", .desc = "Memory reference instructions which trap due to D-TLB miss", .ctrl = PME_CTRL_S1, .code = 0x12, }, { .name = "WC_miss", .desc = "W-cache misses", .ctrl = PME_CTRL_S1, .code = 0x13, }, { .name = "WC_snoop_cb", .desc = "W-cache copybacks generated by a snoop from a remote processor", .ctrl = PME_CTRL_S1, .code = 0x14, }, { .name = "WC_scrubbed", .desc = "W-cache hits to clean lines", .ctrl = PME_CTRL_S1, .code = 0x15, }, { .name = "WC_wb_wo_read", .desc = "W-cache writebacks not requiring a read", .ctrl = PME_CTRL_S1, .code = 0x16, }, { .name = "PC_soft_hit", .desc = "FP loads that hit a P-cache line that was prefetched by a software-prefetch instruction", .ctrl = PME_CTRL_S1, .code = 0x18, }, { .name = "PC_snoop_inv", .desc = "P-cache invalidates that were generated by a snoop from a remote processor and stores by a local processor", .ctrl = PME_CTRL_S1, .code = 0x19, }, { .name = "PC_hard_hit", .desc = "FP loads that hit a P-cache line that was prefetched by a hardware prefetch", .ctrl = PME_CTRL_S1, .code = 0x1a, }, { .name = "PC_port1_rd", .desc = "P-cache cacheable FP loads to the second port (memory and out-of-pipeline instruction execution loads via the A0 and A1 pipelines)", .ctrl = PME_CTRL_S1, .code = 0x1b, }, { .name = "SW_count1", .desc = "Counts software-generated occurrences of 'sethi %hi(0xfc000), %g0' instruction", .ctrl = PME_CTRL_S1, .code = 0x1c, }, { .name = "IU_Stat_Br_miss_untaken", .desc = "Retired branches that were predicted to be untaken, but in fact were taken", .ctrl = PME_CTRL_S1, .code = 0x1d, }, { .name = "IU_Stat_Br_Count_untaken", .desc = "Retired untaken branches", .ctrl = PME_CTRL_S1, .code = 
0x1e, }, { .name = "PC_MS_miss", .desc = "FP loads through the MS pipeline that miss P-cache", .ctrl = PME_CTRL_S1, .code = 0x1f, }, { .name = "Re_RAW_miss", .desc = "Stall due to recirculation when there is a load in the E-stage which has a non-bypassable read-after-write hazard with an earlier store instruction", .ctrl = PME_CTRL_S1, .code = 0x26, }, { .name = "FM_pipe_completion", .desc = "Instructions that complete execution on the FPG Multiply pipelines", .ctrl = PME_CTRL_S0, .code = 0x27, }, /* PIC0 memory controller events common to UltraSPARC-III/III+ processors */ { .name = "MC_reads_0", .desc = "Read requests completed to memory bank 0", .ctrl = PME_CTRL_S0, .code = 0x20, }, { .name = "MC_reads_1", .desc = "Read requests completed to memory bank 1", .ctrl = PME_CTRL_S0, .code = 0x21, }, { .name = "MC_reads_2", .desc = "Read requests completed to memory bank 2", .ctrl = PME_CTRL_S0, .code = 0x22, }, { .name = "MC_reads_3", .desc = "Read requests completed to memory bank 3", .ctrl = PME_CTRL_S0, .code = 0x23, }, { .name = "MC_stalls_0", .desc = "Clock cycles that requests were stalled in the MCU queues because bank 0 was busy with a previous request", .ctrl = PME_CTRL_S0, .code = 0x24, }, { .name = "MC_stalls_2", .desc = "Clock cycles that requests were stalled in the MCU queues because bank 2 was busy with a previous request", .ctrl = PME_CTRL_S0, .code = 0x25, }, /* PIC1 memory controller events common to all UltraSPARC-III/III+ processors */ { .name = "MC_writes_0", .desc = "Write requests completed to memory bank 0", .ctrl = PME_CTRL_S1, .code = 0x20, }, { .name = "MC_writes_1", .desc = "Write requests completed to memory bank 1", .ctrl = PME_CTRL_S1, .code = 0x21, }, { .name = "MC_writes_2", .desc = "Write requests completed to memory bank 2", .ctrl = PME_CTRL_S1, .code = 0x22, }, { .name = "MC_writes_3", .desc = "Write requests completed to memory bank 3", .ctrl = PME_CTRL_S1, .code = 0x23, }, { .name = "MC_stalls_1", .desc = "Clock cycles that 
requests were stalled in the MCU queues because bank 1 was busy with a previous request", .ctrl = PME_CTRL_S1, .code = 0x24, }, { .name = "MC_stalls_3", .desc = "Clock cycles that requests were stalled in the MCU queues because bank 3 was busy with a previous request", .ctrl = PME_CTRL_S1, .code = 0x25, }, /* PIC0 events specific to UltraSPARC-III+ processors */ { .name = "EC_wb_remote", .desc = "Counts victimizations for which the processor generates an R_WB transaction to a non-LPA address region", .ctrl = PME_CTRL_S0, .code = 0x19, }, { .name = "EC_miss_local", .desc = "Counts any transaction to an LPA for which the processor issues an RTS/RTO/RS transaction", .ctrl = PME_CTRL_S0, .code = 0x1a, }, { .name = "EC_miss_mtag_remote", .desc = "Counts any transaction to an LPA in which the processor is required to generate a retry transaction", .ctrl = PME_CTRL_S0, .code = 0x1b, }, /* PIC1 events specific to UltraSPARC-III+/IIIi processors */ { .name = "Re_DC_missovhd", .desc = "Used to measure D-cache stall counts separately for L2-cache hits and misses.
This counter is used with the recirculation and cache access events to separately calculate the D-cache loads that hit and miss the L2-cache", .ctrl = PME_CTRL_S1, .code = 0x4, }, /* PIC1 events specific to UltraSPARC-III+ processors */ { .name = "EC_miss_mtag_remote", .desc = "Counts any transaction to an LPA in which the processor is required to generate a retry transaction", .ctrl = PME_CTRL_S1, .code = 0x28, }, { .name = "EC_miss_remote", .desc = "Counts the events triggered whenever the processor generates a remote (R_*) transaction and the address is to a non-LPA portion (remote) of the physical address space, or an R_WS transaction due to block-store/block-store-commit to any address space (LPA or non-LPA), or an R_RTO due to store/swap request on Os state to LPA space", .ctrl = PME_CTRL_S1, .code = 0x29, }, }; #define PME_SPARC_ULTRA3PLUS_EVENT_COUNT (sizeof(ultra3plus_pe)/sizeof(sparc_entry_t))
papi-papi-7-2-0-t/src/libpfm4/lib/events/sparc_ultra4plus_events.h
static const sparc_entry_t ultra4plus_pe[] = { /* These two must always be first.
*/ { .name = "Cycle_cnt", .desc = "Accumulated cycles", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x0, }, { .name = "Instr_cnt", .desc = "Number of instructions completed", .ctrl = PME_CTRL_S0 | PME_CTRL_S1, .code = 0x1, }, /* PIC0 UltraSPARC-IV+ events */ { .name = "Dispatch0_IC_miss", .desc = "I-buffer is empty from I-Cache miss", .ctrl = PME_CTRL_S0, .code = 0x2, }, { .name = "IU_stat_jmp_correct_pred", .desc = "Retired non-annulled register indirect jumps predicted correctly", .ctrl = PME_CTRL_S0, .code = 0x3, }, { .name = "Dispatch0_2nd_br", .desc = "Stall cycles due to having two branch instructions line-up in one 4-instruction group causing the second branch in the group to be re-fetched, delaying its entrance into the I-buffer", .ctrl = PME_CTRL_S0, .code = 0x4, }, { .name = "Rstall_storeQ", .desc = "R-stage stall for a store instruction which is the next instruction to be executed, but it stalled due to the store queue being full", .ctrl = PME_CTRL_S0, .code = 0x5, }, { .name = "Rstall_IU_use", .desc = "R-stage stall for an event that the next instruction to be executed depends on the result of a preceding integer instruction in the pipeline that is not yet available", .ctrl = PME_CTRL_S0, .code = 0x6, }, { .name = "IU_stat_ret_correct_pred", .desc = "Retired non-annulled returns predicted correctly", .ctrl = PME_CTRL_S0, .code = 0x7, }, { .name = "IC_ref", .desc = "I-cache references", .ctrl = PME_CTRL_S0, .code = 0x8, }, { .name = "DC_rd", .desc = "D-cache read references (including accesses that subsequently trap)", .ctrl = PME_CTRL_S0, .code = 0x9, }, { .name = "Rstall_FP_use", .desc = "R-stage stall for an event that the next instruction to be executed depends on the result of a preceding floating-point instruction in the pipeline that is not yet available", .ctrl = PME_CTRL_S0, .code = 0xa, }, { .name = "SW_pf_instr", .desc = "Retired SW prefetch instructions", .ctrl = PME_CTRL_S0, .code = 0xb, }, { .name = "L2_ref", .desc = "L2-cache references",
.ctrl = PME_CTRL_S0, .code = 0xc, }, { .name = "L2_write_hit_RTO", .desc = "L2-cache exclusive requests that hit L2-cache in S, O, or Os state and thus, do a read-to-own bus transaction", .ctrl = PME_CTRL_S0, .code = 0xd, }, { .name = "L2_snoop_inv_sh", .desc = "L2 cache lines that were written back to the L3 cache due to requests from both cores", .ctrl = PME_CTRL_S0, .code = 0xe, }, { .name = "L2_rd_miss", .desc = "L2-cache miss events (including atomics) from D-cache events", .ctrl = PME_CTRL_S0, .code = 0xf, }, { .name = "PC_rd", .desc = "P-cache cacheable loads", .ctrl = PME_CTRL_S0, .code = 0x10, }, { .name = "SI_snoop_sh", .desc = "Counts snoops from remote processor(s) including RTS, RTSR, RTO, RTOR, RS, RSR, RTSM, and WS", .ctrl = PME_CTRL_S0, .code = 0x11, }, { .name = "SI_ciq_flow_sh", .desc = "Counts system clock cycles when the flow control (PauseOut) signal is asserted", .ctrl = PME_CTRL_S0, .code = 0x12, }, { .name = "Re_DC_miss", .desc = "Stall due to loads that miss D-cache and get recirculated", .ctrl = PME_CTRL_S0, .code = 0x13, }, { .name = "SW_count_NOP0", .desc = "Retired, non-annulled special software NOP instructions (which is equivalent to 'sethi %hi(0xfc000), %g0' instruction)", .ctrl = PME_CTRL_S0, .code = 0x14, }, { .name = "IU_Stat_Br_miss_taken", .desc = "Retired branches that were predicted to be taken, but in fact were not taken", .ctrl = PME_CTRL_S0, .code = 0x15, }, { .name = "IU_Stat_Br_Count_taken", .desc = "Retired taken branches", .ctrl = PME_CTRL_S0, .code = 0x16, }, { .name = "HW_pf_exec", .desc = "Hardware prefetches enqueued in the prefetch queue", .ctrl = PME_CTRL_S0, .code = 0x17, }, { .name = "FA_pipe_completion", .desc = "Instructions that complete execution on the FPG ALU pipelines", .ctrl = PME_CTRL_S0, .code = 0x18, }, { .name = "SSM_L3_wb_remote", .desc = "L3 cache line victimizations from this core which generate R_WB transactions to non-LPA (remote physical address) regions", .ctrl = PME_CTRL_S0, .code = 0x19, }, 
{ .name = "SSM_L3_miss_local", .desc = "L3 cache misses to LPA (local physical address) from this core which generate an RTS, RTO, or RS transaction", .ctrl = PME_CTRL_S0, .code = 0x1a, }, { .name = "SSM_L3_miss_mtag_remote", .desc = "L3 cache misses to LPA (local physical address) from this core which generate retry (R_*) transactions including R_RTS, R_RTO, and R_RS", .ctrl = PME_CTRL_S0, .code = 0x1b, }, { .name = "SW_pf_str_trapped", .desc = "Strong software prefetch instructions trapping due to TLB miss", .ctrl = PME_CTRL_S0, .code = 0x1c, }, { .name = "SW_pf_PC_installed", .desc = "Software prefetch instructions that installed lines in the P-cache", .ctrl = PME_CTRL_S0, .code = 0x1d, }, { .name = "IPB_to_IC_fill", .desc = "I-cache fills from the instruction prefetch buffer", .ctrl = PME_CTRL_S0, .code = 0x1e, }, { .name = "L2_write_miss", .desc = "L2-cache misses from this core by cacheable store requests", .ctrl = PME_CTRL_S0, .code = 0x1f, }, { .name = "MC_reads_0_sh", .desc = "Read requests completed to memory bank 0", .ctrl = PME_CTRL_S0, .code = 0x20, }, { .name = "MC_reads_1_sh", .desc = "Read requests completed to memory bank 1", .ctrl = PME_CTRL_S0, .code = 0x21, }, { .name = "MC_reads_2_sh", .desc = "Read requests completed to memory bank 2", .ctrl = PME_CTRL_S0, .code = 0x22, }, { .name = "MC_reads_3_sh", .desc = "Read requests completed to memory bank 3", .ctrl = PME_CTRL_S0, .code = 0x23, }, { .name = "MC_stalls_0_sh", .desc = "Clock cycles that requests were stalled in the MCU queues because bank 0 was busy with a previous request", .ctrl = PME_CTRL_S0, .code = 0x24, }, { .name = "MC_stalls_2_sh", .desc = "Clock cycles that requests were stalled in the MCU queues because bank 2 was busy with a previous request", .ctrl = PME_CTRL_S0, .code = 0x25, }, { .name = "L2_hit_other_half", .desc = "L2 cache hits from this core to the ways filled by the other core when the cache is in the pseudo-split mode", .ctrl = PME_CTRL_S0, .code = 0x26, }, { .name = 
"L3_rd_miss", .desc = "L3 cache misses sent out to SIU from this core by cacheable I-cache, D-cache, P-cache, and W-cache (excluding block store) requests", .ctrl = PME_CTRL_S0, .code = 0x28, }, { .name = "Re_L2_miss", .desc = "Stall cycles due to recirculation of cacheable loads that miss both D-cache and L2 cache", .ctrl = PME_CTRL_S0, .code = 0x29, }, { .name = "IC_miss_cancelled", .desc = "I-cache miss requests cancelled due to new fetch stream", .ctrl = PME_CTRL_S0, .code = 0x2a, }, { .name = "DC_wr_miss", .desc = "D-cache store accesses that miss D-cache", .ctrl = PME_CTRL_S0, .code = 0x2b, }, { .name = "L3_hit_I_state_sh", .desc = "Tag hits in L3 cache when the line is in I state", .ctrl = PME_CTRL_S0, .code = 0x2c, }, { .name = "SI_RTS_src_data", .desc = "Local RTS transactions due to I-cache, D-cache, or P-cache requests from this core where data is from the cache of another processor on the system, not from memory", .ctrl = PME_CTRL_S0, .code = 0x2d, }, { .name = "L2_IC_miss", .desc = "L2 cache misses from this core by cacheable I-cache requests", .ctrl = PME_CTRL_S0, .code = 0x2e, }, { .name = "SSM_new_transaction_sh", .desc = "New SSM transactions (RTSU, RTOU, UGM) observed by this processor on the Fireplane Interconnect", .ctrl = PME_CTRL_S0, .code = 0x2f, }, { .name = "L2_SW_pf_miss", .desc = "L2 cache misses by software prefetch requests from this core", .ctrl = PME_CTRL_S0, .code = 0x30, }, { .name = "L2_wb", .desc = "L2 cache lines that were written back to the L3 cache because of requests from this core", .ctrl = PME_CTRL_S0, .code = 0x31, }, { .name = "L2_wb_sh", .desc = "L2 cache lines that were written back to the L3 cache because of requests from both cores", .ctrl = PME_CTRL_S0, .code = 0x32, }, { .name = "L2_snoop_cb_sh", .desc = "L2 cache lines that were copied back due to other processors", .ctrl = PME_CTRL_S0, .code = 0x33, }, /* PIC1 UltraSPARC-IV+ events */ { .name = "Dispatch0_other", .desc = "Stall cycles due to the event that no
instructions are dispatched because the I-queue is empty due to various other events, including branch target address fetch and various events which cause an instruction to be refetched", .ctrl = PME_CTRL_S1, .code = 0x2, }, { .name = "DC_wr", .desc = "D-cache write references by cacheable stores (excluding block stores)", .ctrl = PME_CTRL_S1, .code = 0x3, }, { .name = "Re_DC_missovhd", .desc = "Stall cycles due to D-cache load miss", .ctrl = PME_CTRL_S1, .code = 0x4, }, { .name = "Re_FPU_bypass", .desc = "Stall due to recirculation when an FPU bypass condition that does not have a direct bypass path occurs", .ctrl = PME_CTRL_S1, .code = 0x5, }, { .name = "L3_write_hit_RTO", .desc = "L3 cache hits in O, Os, or S state by cacheable store requests from this core that do a read-to-own (RTO) bus transaction", .ctrl = PME_CTRL_S1, .code = 0x6, }, { .name = "L2L3_snoop_inv_sh", .desc = "L2 and L3 cache lines that were invalidated due to other processors doing RTO, RTOR, RTOU, or WS transactions", .ctrl = PME_CTRL_S1, .code = 0x7, }, { .name = "IC_L2_req", .desc = "I-cache requests sent to L2 cache", .ctrl = PME_CTRL_S1, .code = 0x8, }, { .name = "DC_rd_miss", .desc = "Cacheable loads (excluding atomics and block loads) that miss D-cache as well as P-cache (for FP loads)", .ctrl = PME_CTRL_S1, .code = 0x9, }, { .name = "L2_hit_I_state_sh", .desc = "Tag hits in L2 cache when the line is in I state", .ctrl = PME_CTRL_S1, .code = 0xa, }, { .name = "L3_write_miss_RTO", .desc = "L3 cache misses from this core by cacheable store requests that do a read-to-own (RTO) bus transaction. 
This count does not include RTO requests for prefetch (fcn=2,3/22,23) instructions", .ctrl = PME_CTRL_S1, .code = 0xb, }, { .name = "L2_miss", .desc = "L2 cache misses from this core by cacheable I-cache, D-cache, P-cache, and W-cache (excluding block stores) requests", .ctrl = PME_CTRL_S1, .code = 0xc, }, { .name = "SI_owned_sh", .desc = "Number of times owned_in is asserted on bus requests from the local processor", .ctrl = PME_CTRL_S1, .code = 0xd, }, { .name = "SI_RTO_src_data", .desc = "Number of local RTO transactions due to W-cache or P-cache requests from this core where data is from the cache of another processor on the system, not from memory", .ctrl = PME_CTRL_S1, .code = 0xe, }, { .name = "SW_pf_duplicate", .desc = "Number of software prefetch instructions that were dropped because the prefetch request matched an outstanding request in the prefetch queue or the request hit the P-cache", .ctrl = PME_CTRL_S1, .code = 0xf, }, { .name = "IU_stat_jmp_mispred", .desc = "Number of retired non-annulled register indirect jumps mispredicted", .ctrl = PME_CTRL_S1, .code = 0x10, }, { .name = "ITLB_miss", .desc = "I-TLB misses", .ctrl = PME_CTRL_S1, .code = 0x11, }, { .name = "DTLB_miss", .desc = "D-TLB misses", .ctrl = PME_CTRL_S1, .code = 0x12, }, { .name = "WC_miss", .desc = "W-cache misses", .ctrl = PME_CTRL_S1, .code = 0x13, }, { .name = "IC_fill", .desc = "Number of I-cache fills excluding fills from the instruction prefetch buffer.
This is the best approximation of the number of I-cache misses for instructions that were actually executed", .ctrl = PME_CTRL_S1, .code = 0x14, }, { .name = "IU_stat_ret_mispred", .desc = "Number of retired non-annulled returns mispredicted", .ctrl = PME_CTRL_S1, .code = 0x15, }, { .name = "Re_L3_miss", .desc = "Stall cycles due to recirculation of cacheable loads that miss D-cache, L2, and L3 cache", .ctrl = PME_CTRL_S1, .code = 0x16, }, { .name = "Re_PFQ_full", .desc = "Stall cycles due to recirculation of prefetch instructions because the prefetch queue (PFQ) was full", .ctrl = PME_CTRL_S1, .code = 0x17, }, { .name = "PC_soft_hit", .desc = "Number of cacheable FP loads that hit a P-cache line that was prefetched by a software prefetch instruction", .ctrl = PME_CTRL_S1, .code = 0x18, }, { .name = "PC_inv", .desc = "Number of P-cache lines that were invalidated due to external snoops, internal stores, and L2 evictions", .ctrl = PME_CTRL_S1, .code = 0x19, }, { .name = "PC_hard_hit", .desc = "Number of FP loads that hit a P-cache line that was fetched by a FP load or a hardware prefetch, irrespective of whether the loads hit or miss the D-cache", .ctrl = PME_CTRL_S1, .code = 0x1a, }, { .name = "IC_pf", .desc = "Number of I-cache prefetch requests sent to L2 cache", .ctrl = PME_CTRL_S1, .code = 0x1b, }, { .name = "SW_count_NOP1", .desc = "Retired, non-annulled special software NOP instructions (which is equivalent to 'sethi %hi(0xfc000), %g0' instruction)", .ctrl = PME_CTRL_S1, .code = 0x1c, }, { .name = "IU_stat_br_miss_untaken", .desc = "Number of retired non-annulled conditional branches that were predicted to be not taken, but in fact were taken", .ctrl = PME_CTRL_S1, .code = 0x1d, }, { .name = "IU_stat_br_count_taken", .desc = "Number of retired non-annulled conditional branches that were taken", .ctrl = PME_CTRL_S1, .code = 0x1e, }, { .name = "PC_miss", .desc = "Number of cacheable FP loads that miss P-cache, irrespective of whether the loads hit or miss the 
D-cache", .ctrl = PME_CTRL_S1, .code = 0x1f, }, { .name = "MC_writes_0_sh", .desc = "Number of write requests completed to memory bank 0", .ctrl = PME_CTRL_S1, .code = 0x20, }, { .name = "MC_writes_1_sh", .desc = "Number of write requests completed to memory bank 1", .ctrl = PME_CTRL_S1, .code = 0x21, }, { .name = "MC_writes_2_sh", .desc = "Number of write requests completed to memory bank 2", .ctrl = PME_CTRL_S1, .code = 0x22, }, { .name = "MC_writes_3_sh", .desc = "Number of write requests completed to memory bank 3", .ctrl = PME_CTRL_S1, .code = 0x23, }, { .name = "MC_stalls_1_sh", .desc = "Number of processor cycles that requests were stalled in the MCU queues because bank 1 was busy with a previous request", .ctrl = PME_CTRL_S1, .code = 0x24, }, { .name = "MC_stalls_3_sh", .desc = "Number of processor cycles that requests were stalled in the MCU queues because bank 3 was busy with a previous request", .ctrl = PME_CTRL_S1, .code = 0x25, }, { .name = "Re_RAW_miss", .desc = "Stall cycles due to recirculation when there is a load instruction in the E-stage of the pipeline which has a non-bypassable read-after-write (RAW) hazard with an earlier store instruction", .ctrl = PME_CTRL_S1, .code = 0x26, }, { .name = "FM_pipe_completion", .desc = "Number of retired instructions that complete execution on the Floating-Point/Graphics Multiply pipeline", .ctrl = PME_CTRL_S1, .code = 0x27, }, { .name = "SSM_L3_miss_mtag_remote", .desc = "Number of L3 cache misses to LPA (local physical address) from this core which generate retry (R_*) transactions including R_RTS, R_RTO, and R_RS", .ctrl = PME_CTRL_S1, .code = 0x28, }, { .name = "SSM_L3_miss_remote", .desc = "Number of L3 cache misses from this core which generate retry (R_*) transactions to non-LPA (non-local physical address) address space, or R_WS transactions due to block store (BST) / block store commit (BSTC) to any address space (LPA or non-LPA), or R_RTO due to atomic request on Os state to LPA space.", .ctrl =
PME_CTRL_S1, .code = 0x29, }, { .name = "SW_pf_exec", .desc = "Number of retired, non-trapping software prefetch instructions that completed, i.e. number of retired prefetch instructions that were not dropped due to the prefetch queue being full", .ctrl = PME_CTRL_S1, .code = 0x2a, }, { .name = "SW_pf_str_exec", .desc = "Number of retired, non-trapping strong prefetch instructions that completed", .ctrl = PME_CTRL_S1, .code = 0x2b, }, { .name = "SW_pf_dropped", .desc = "Number of software prefetch instructions dropped due to TLB miss or due to the prefetch queue being full", .ctrl = PME_CTRL_S1, .code = 0x2c, }, { .name = "SW_pf_L2_installed", .desc = "Number of software prefetch instructions that installed lines in the L2 cache", .ctrl = PME_CTRL_S1, .code = 0x2d, }, { .name = "L2_HW_pf_miss", .desc = "Number of L2 cache misses by hardware prefetch requests from this core", .ctrl = PME_CTRL_S1, .code = 0x2f, }, { .name = "L3_miss", .desc = "Number of L3 cache misses sent out to SIU from this core by cacheable I-cache, D-cache, P-cache, and W-cache (excluding block stores) requests", .ctrl = PME_CTRL_S1, .code = 0x31, }, { .name = "L3_IC_miss", .desc = "Number of L3 cache misses by cacheable I-cache requests from this core", .ctrl = PME_CTRL_S1, .code = 0x32, }, { .name = "L3_SW_pf_miss", .desc = "Number of L3 cache misses by software prefetch requests from this core", .ctrl = PME_CTRL_S1, .code = 0x33, }, { .name = "L3_hit_other_half", .desc = "Number of L3 cache hits from this core to the ways filled by the other core when the cache is in pseudo-split mode", .ctrl = PME_CTRL_S1, .code = 0x34, }, { .name = "L3_wb", .desc = "Number of L3 cache lines that were written back because of requests from this core", .ctrl = PME_CTRL_S1, .code = 0x35, }, { .name = "L3_wb_sh", .desc = "Number of L3 cache lines that were written back because of requests from both cores", .ctrl = PME_CTRL_S1, .code = 0x36, }, { .name = "L2L3_snoop_cb_sh", .desc = "Total number of L2 and L3
cache lines that were copied back due to other processors", .ctrl = PME_CTRL_S1, .code = 0x37, }, }; #define PME_SPARC_ULTRA4PLUS_EVENT_COUNT (sizeof(ultra4plus_pe)/sizeof(sparc_entry_t)) papi-papi-7-2-0-t/src/libpfm4/lib/events/torrent_events.h /* Power Torrent PMU event codes */ #ifndef __POWER_TORRENT_EVENTS_H__ #define __POWER_TORRENT_EVENTS_H__ /* PRELIMINARY EVENT ENCODING * 0x0000_0000 - 0x00FF_FFFF = PowerPC core events * 0x0100_0000 - 0x01FF_FFFF = Torrent events * 0x0200_0000 - 0xFFFF_FFFF = reserved * For Torrent events: * 0x00F0_0000 = Torrent PMU id * 0x000F_0000 = PMU unit number (e.g. 0 for MCD0, 1 for MCD1) * 0x0000_FF00 = virtual counter number (unused on MCD) * 0x0000_00FF = PMC mux value (unused on Util, MMU, CAU) * (Note that some of these fields are wider than necessary) * * The upper bits 0xFFFF_FFFF_0000_0000 are reserved for attribute * fields.
*/ #define PMU_SPACE_MASK 0xFF000000 #define POWERPC_CORE_SPACE 0x00000000 #define TORRENT_SPACE 0x01000000 #define IS_CORE_EVENT(x) ((x & PMU_SPACE_MASK) == POWERPC_CORE_SPACE) #define IS_TORRENT_EVENT(x) ((x & PMU_SPACE_MASK) == TORRENT_SPACE) #define TORRENT_PMU_SHIFT 20 #define TORRENT_PMU_MASK (0xF << TORRENT_PMU_SHIFT) #define TORRENT_PMU_GET(x) ((x & TORRENT_PMU_MASK) >> TORRENT_PMU_SHIFT) #define TORRENT_UNIT_SHIFT 16 #define TORRENT_UNIT_MASK (0xF << TORRENT_UNIT_SHIFT) #define TORRENT_UNIT_GET(x) ((x & TORRENT_UNIT_MASK) >> TORRENT_UNIT_SHIFT) #define TORRENT_VIRT_CTR_SHIFT 8 #define TORRENT_VIRT_CTR_MASK (0xFF << TORRENT_VIRT_CTR_SHIFT) #define TORRENT_VIRT_CTR_GET(x) ((x & TORRENT_VIRT_CTR_MASK) >> TORRENT_VIRT_CTR_SHIFT) #define TORRENT_MUX_SHIFT 0 #define TORRENT_MUX_MASK 0xFF #define TORRENT_MUX_GET(x) ((x & TORRENT_MUX_MASK) >> TORRENT_MUX_SHIFT) #define TORRENT_PBUS_WXYZ_ID 0x0 #define TORRENT_PBUS_LL_ID 0x1 #define TORRENT_PBUS_MCD_ID 0x2 #define TORRENT_PBUS_UTIL_ID 0x3 #define TORRENT_MMU_ID 0x4 #define TORRENT_CAU_ID 0x5 #define TORRENT_LAST_ID (TORRENT_CAU_ID) #define TORRENT_NUM_PMU_TYPES (TORRENT_LAST_ID + 1) /* TORRENT_DEVEL_NUM_PMU_TYPES is so that we don't try to call functions in * PMUs which are not currently supported. When all Torrent PMUs are * supported, we NEED to remove this definition and replace the usages of it * with TORRENT_NUM_PMU_TYPES. 
*/ #define TORRENT_DEVEL_NUM_PMU_TYPES (TORRENT_PBUS_WXYZ_ID + 1) #define TORRENT_PMU(pmu) (TORRENT_SPACE | \ TORRENT_##pmu##_ID << TORRENT_PMU_SHIFT) #define TORRENT_PBUS_WXYZ TORRENT_PMU(PBUS_WXYZ) #define TORRENT_PBUS_LL TORRENT_PMU(PBUS_LL) #define TORRENT_PBUS_MCD TORRENT_PMU(PBUS_MCD) #define TORRENT_PBUS_UTIL TORRENT_PMU(PBUS_UTIL) #define TORRENT_MMU TORRENT_PMU(MMU) #define TORRENT_CAU TORRENT_PMU(CAU) #define COUNTER_W (0 << TORRENT_VIRT_CTR_SHIFT) #define COUNTER_X (1 << TORRENT_VIRT_CTR_SHIFT) #define COUNTER_Y (2 << TORRENT_VIRT_CTR_SHIFT) #define COUNTER_Z (3 << TORRENT_VIRT_CTR_SHIFT) #define COUNTER_LL0 (0 << TORRENT_VIRT_CTR_SHIFT) #define COUNTER_LL1 (1 << TORRENT_VIRT_CTR_SHIFT) #define COUNTER_LL2 (2 << TORRENT_VIRT_CTR_SHIFT) #define COUNTER_LL3 (3 << TORRENT_VIRT_CTR_SHIFT) #define COUNTER_LL4 (4 << TORRENT_VIRT_CTR_SHIFT) #define COUNTER_LL5 (5 << TORRENT_VIRT_CTR_SHIFT) #define COUNTER_LL6 (6 << TORRENT_VIRT_CTR_SHIFT) /* Attributes */ #define TORRENT_ATTR_MCD_TYPE_SHIFT 32 #define TORRENT_ATTR_MCD_TYPE_MASK (0x3ULL << TORRENT_ATTR_MCD_TYPE_SHIFT) #define TORRENT_ATTR_UTIL_SEL_SHIFT 32 #define TORRENT_ATTR_UTIL_SEL_MASK (0x3ULL << TORRENT_ATTR_UTIL_SEL_SHIFT) #define TORRENT_ATTR_UTIL_CMP_SHIFT 34 #define TORRENT_ATTR_UTIL_CMP_MASK (0x1FULL << TORRENT_ATTR_UTIL_CMP_SHIFT) static const pme_torrent_entry_t torrent_pe[] = { { .pme_name = "PM_PBUS_W_DISABLED", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_W | 0x0, .pme_desc = "The W Link event counter is disabled" }, { .pme_name = "PM_PBUS_W_IN_IDLE", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_W | 0x1, .pme_desc = "Bus cycles that the W Link \"in\" channel is idle" }, { .pme_name = "PM_PBUS_W_IN_CMDRSP", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_W | 0x2, .pme_desc = "Number of commands, partial responses, and combined responses received on the W Link \"in\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_W_IN_DATA", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_W | 0x3, 
.pme_desc = "Bus cycles that the W Link \"in\" channel is receiving data or a data header" }, { .pme_name = "PM_PBUS_W_OUT_IDLE", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_W | 0x5, .pme_desc = "Bus cycles that the W Link \"out\" channel is idle" }, { .pme_name = "PM_PBUS_W_OUT_CMDRSP", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_W | 0x6, .pme_desc = "Number of commands, partial responses, and combined responses sent on the W Link \"out\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_W_OUT_DATA", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_W | 0x7, .pme_desc = "Bus cycles that the W Link \"out\" channel is sending data or a data header" }, { .pme_name = "PM_PBUS_X_DISABLED", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_X | 0x0, .pme_desc = "The X Link event counter is disabled" }, { .pme_name = "PM_PBUS_X_IN_IDLE", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_X | 0x1, .pme_desc = "Bus cycles that the X Link \"in\" channel is idle" }, { .pme_name = "PM_PBUS_X_IN_CMDRSP", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_X | 0x2, .pme_desc = "Number of commands, partial responses, and combined responses received on the X Link \"in\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_X_IN_DATA", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_X | 0x3, .pme_desc = "Bus cycles that the X Link \"in\" channel is receiving data or a data header" }, { .pme_name = "PM_PBUS_X_OUT_IDLE", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_X | 0x5, .pme_desc = "Bus cycles that the X Link \"out\" channel is idle" }, { .pme_name = "PM_PBUS_X_OUT_CMDRSP", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_X | 0x6, .pme_desc = "Number of commands, partial responses, and combined responses sent on the X Link \"out\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_X_OUT_DATA", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_X | 0x7, .pme_desc = "Bus cycles that the X Link \"out\" channel is sending data or a data header" }, { .pme_name = 
"PM_PBUS_Y_DISABLED", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_Y | 0x0, .pme_desc = "The Y Link event counter is disabled" }, { .pme_name = "PM_PBUS_Y_IN_IDLE", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_Y | 0x1, .pme_desc = "Bus cycles that the Y Link \"in\" channel is idle" }, { .pme_name = "PM_PBUS_Y_IN_CMDRSP", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_Y | 0x2, .pme_desc = "Number of commands, partial responses, and combined responses received on the Y Link \"in\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_Y_IN_DATA", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_Y | 0x3, .pme_desc = "Bus cycles that the Y Link \"in\" channel is receiving data or a data header" }, { .pme_name = "PM_PBUS_Y_OUT_IDLE", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_Y | 0x5, .pme_desc = "Bus cycles that the Y Link \"out\" channel is idle" }, { .pme_name = "PM_PBUS_Y_OUT_CMDRSP", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_Y | 0x6, .pme_desc = "Number of commands, partial responses, and combined responses sent on the Y Link \"out\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_Y_OUT_DATA", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_Y | 0x7, .pme_desc = "Bus cycles that the Y Link \"out\" channel is sending data or a data header" }, { .pme_name = "PM_PBUS_Z_DISABLED", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_Z | 0x0, .pme_desc = "The Z Link event counter is disabled" }, { .pme_name = "PM_PBUS_Z_IN_IDLE", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_Z | 0x1, .pme_desc = "Bus cycles that the Z Link \"in\" channel is idle" }, { .pme_name = "PM_PBUS_Z_IN_CMDRSP", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_Z | 0x2, .pme_desc = "Number of commands, partial responses, and combined responses received on the Z Link \"in\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_Z_IN_DATA", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_Z | 0x3, .pme_desc = "Bus cycles that the Z Link \"in\" channel is receiving data or a data header"
}, { .pme_name = "PM_PBUS_Z_OUT_IDLE", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_Z | 0x5, .pme_desc = "Bus cycles that the Z Link \"out\" channel is idle" }, { .pme_name = "PM_PBUS_Z_OUT_CMDRSP", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_Z | 0x6, .pme_desc = "Number of commands, partial responses, and combined responses sent on the Z Link \"out\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_Z_OUT_DATA", .pme_code = TORRENT_PBUS_WXYZ | COUNTER_Z | 0x7, .pme_desc = "Bus cycles that the Z Link \"out\" channel is sending data or a data header" }, { .pme_name = "PM_PBUS_LL0_DISABLED", .pme_code = TORRENT_PBUS_LL | COUNTER_LL0 | 0x0, .pme_desc = "The Local Link 0 event counter is disabled" }, { .pme_name = "PM_PBUS_LL0_IN_IDLE", .pme_code = TORRENT_PBUS_LL | COUNTER_LL0 | 0x1, .pme_desc = "Bus cycles that the Local Link 0 \"in\" channel is idle" }, { .pme_name = "PM_PBUS_LL0_IN_CMDRSP", .pme_code = TORRENT_PBUS_LL | COUNTER_LL0 | 0x2, .pme_desc = "Number of commands, partial responses, and combined responses received on the Local Link 0 \"in\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_LL0_IN_DATA", .pme_code = TORRENT_PBUS_LL | COUNTER_LL0 | 0x3, .pme_desc = "Bus cycles that the Local Link 0 \"in\" channel is receiving data or a data header" }, { .pme_name = "PM_PBUS_LL0_OUT_IDLE", .pme_code = TORRENT_PBUS_LL | COUNTER_LL0 | 0x5, .pme_desc = "Bus cycles that the Local Link 0 \"out\" channel is idle" }, { .pme_name = "PM_PBUS_LL0_OUT_CMDRSP", .pme_code = TORRENT_PBUS_LL | COUNTER_LL0 | 0x6, .pme_desc = "Number of commands, partial responses, and combined responses sent on the Local Link 0 \"out\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_LL0_OUT_DATA", .pme_code = TORRENT_PBUS_LL | COUNTER_LL0 | 0x7, .pme_desc = "Bus cycles that the Local Link 0 \"out\" channel is sending data or a data header" }, { .pme_name = "PM_PBUS_LL0_IN_ISR", .pme_code = TORRENT_PBUS_LL 
| COUNTER_LL0 | 0x9, .pme_desc = "Bus cycles that the Local Link 0 \"in\" channel is receiving ISR data or an ISR data header" }, { .pme_name = "PM_PBUS_LL0_OUT_ISR", .pme_code = TORRENT_PBUS_LL | COUNTER_LL0 | 0xd, .pme_desc = "Bus cycles that the Local Link 0 \"out\" channel is sending ISR data or an ISR data header" }, { .pme_name = "PM_PBUS_LL1_DISABLED", .pme_code = TORRENT_PBUS_LL | COUNTER_LL1 | 0x0, .pme_desc = "The Local Link 1 event counter is disabled" }, { .pme_name = "PM_PBUS_LL1_IN_IDLE", .pme_code = TORRENT_PBUS_LL | COUNTER_LL1 | 0x1, .pme_desc = "Bus cycles that the Local Link 1 \"in\" channel is idle" }, { .pme_name = "PM_PBUS_LL1_IN_CMDRSP", .pme_code = TORRENT_PBUS_LL | COUNTER_LL1 | 0x2, .pme_desc = "Number of commands, partial responses, and combined responses received on the Local Link 1 \"in\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_LL1_IN_DATA", .pme_code = TORRENT_PBUS_LL | COUNTER_LL1 | 0x3, .pme_desc = "Bus cycles that the Local Link 1 \"in\" channel is receiving data or a data header" }, { .pme_name = "PM_PBUS_LL1_OUT_IDLE", .pme_code = TORRENT_PBUS_LL | COUNTER_LL1 | 0x5, .pme_desc = "Bus cycles that the Local Link 1 \"out\" channel is idle" }, { .pme_name = "PM_PBUS_LL1_OUT_CMDRSP", .pme_code = TORRENT_PBUS_LL | COUNTER_LL1 | 0x6, .pme_desc = "Number of commands, partial responses, and combined responses sent on the Local Link 1 \"out\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_LL1_OUT_DATA", .pme_code = TORRENT_PBUS_LL | COUNTER_LL1 | 0x7, .pme_desc = "Bus cycles that the Local Link 1 \"out\" channel is sending data or a data header" }, { .pme_name = "PM_PBUS_LL1_IN_ISR", .pme_code = TORRENT_PBUS_LL | COUNTER_LL1 | 0x9, .pme_desc = "Bus cycles that the Local Link 1 \"in\" channel is receiving ISR data or an ISR data header" }, { .pme_name = "PM_PBUS_LL1_OUT_ISR", .pme_code = TORRENT_PBUS_LL | COUNTER_LL1 | 0xd, .pme_desc = "Bus cycles that the Local 
Link 1 \"out\" channel is sending ISR data or an ISR data header" }, { .pme_name = "PM_PBUS_LL2_DISABLED", .pme_code = TORRENT_PBUS_LL | COUNTER_LL2 | 0x0, .pme_desc = "The Local Link 2 event counter is disabled" }, { .pme_name = "PM_PBUS_LL2_IN_IDLE", .pme_code = TORRENT_PBUS_LL | COUNTER_LL2 | 0x1, .pme_desc = "Bus cycles that the Local Link 2 \"in\" channel is idle" }, { .pme_name = "PM_PBUS_LL2_IN_CMDRSP", .pme_code = TORRENT_PBUS_LL | COUNTER_LL2 | 0x2, .pme_desc = "Number of commands, partial responses, and combined responses received on the Local Link 2 \"in\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_LL2_IN_DATA", .pme_code = TORRENT_PBUS_LL | COUNTER_LL2 | 0x3, .pme_desc = "Bus cycles that the Local Link 2 \"in\" channel is receiving data or a data header" }, { .pme_name = "PM_PBUS_LL2_OUT_IDLE", .pme_code = TORRENT_PBUS_LL | COUNTER_LL2 | 0x5, .pme_desc = "Bus cycles that the Local Link 2 \"out\" channel is idle" }, { .pme_name = "PM_PBUS_LL2_OUT_CMDRSP", .pme_code = TORRENT_PBUS_LL | COUNTER_LL2 | 0x6, .pme_desc = "Number of commands, partial responses, and combined responses sent on the Local Link 2 \"out\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_LL2_OUT_DATA", .pme_code = TORRENT_PBUS_LL | COUNTER_LL2 | 0x7, .pme_desc = "Bus cycles that the Local Link 2 \"out\" channel is sending data or a data header" }, { .pme_name = "PM_PBUS_LL2_IN_ISR", .pme_code = TORRENT_PBUS_LL | COUNTER_LL2 | 0x9, .pme_desc = "Bus cycles that the Local Link 2 \"in\" channel is receiving ISR data or an ISR data header" }, { .pme_name = "PM_PBUS_LL2_OUT_ISR", .pme_code = TORRENT_PBUS_LL | COUNTER_LL2 | 0xd, .pme_desc = "Bus cycles that the Local Link 2 \"out\" channel is sending ISR data or an ISR data header" }, { .pme_name = "PM_PBUS_LL3_DISABLED", .pme_code = TORRENT_PBUS_LL | COUNTER_LL3 | 0x0, .pme_desc = "The Local Link 3 event counter is disabled" }, { .pme_name = "PM_PBUS_LL3_IN_IDLE", 
.pme_code = TORRENT_PBUS_LL | COUNTER_LL3 | 0x1, .pme_desc = "Bus cycles that the Local Link 3 \"in\" channel is idle" }, { .pme_name = "PM_PBUS_LL3_IN_CMDRSP", .pme_code = TORRENT_PBUS_LL | COUNTER_LL3 | 0x2, .pme_desc = "Number of commands, partial responses, and combined responses received on the Local Link 3 \"in\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_LL3_IN_DATA", .pme_code = TORRENT_PBUS_LL | COUNTER_LL3 | 0x3, .pme_desc = "Bus cycles that the Local Link 3 \"in\" channel is receiving data or a data header" }, { .pme_name = "PM_PBUS_LL3_OUT_IDLE", .pme_code = TORRENT_PBUS_LL | COUNTER_LL3 | 0x5, .pme_desc = "Bus cycles that the Local Link 3 \"out\" channel is idle" }, { .pme_name = "PM_PBUS_LL3_OUT_CMDRSP", .pme_code = TORRENT_PBUS_LL | COUNTER_LL3 | 0x6, .pme_desc = "Number of commands, partial responses, and combined responses sent on the Local Link 3 \"out\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_LL3_OUT_DATA", .pme_code = TORRENT_PBUS_LL | COUNTER_LL3 | 0x7, .pme_desc = "Bus cycles that the Local Link 3 \"out\" channel is sending data or a data header" }, { .pme_name = "PM_PBUS_LL3_IN_ISR", .pme_code = TORRENT_PBUS_LL | COUNTER_LL3 | 0x9, .pme_desc = "Bus cycles that the Local Link 3 \"in\" channel is receiving ISR data or an ISR data header" }, { .pme_name = "PM_PBUS_LL3_OUT_ISR", .pme_code = TORRENT_PBUS_LL | COUNTER_LL3 | 0xd, .pme_desc = "Bus cycles that the Local Link 3 \"out\" channel is sending ISR data or an ISR data header" }, { .pme_name = "PM_PBUS_LL4_DISABLED", .pme_code = TORRENT_PBUS_LL | COUNTER_LL4 | 0x0, .pme_desc = "The Local Link 4 event counter is disabled" }, { .pme_name = "PM_PBUS_LL4_IN_IDLE", .pme_code = TORRENT_PBUS_LL | COUNTER_LL4 | 0x1, .pme_desc = "Bus cycles that the Local Link 4 \"in\" channel is idle" }, { .pme_name = "PM_PBUS_LL4_IN_CMDRSP", .pme_code = TORRENT_PBUS_LL | COUNTER_LL4 | 0x2, .pme_desc = "Number of commands, partial 
responses, and combined responses received on the Local Link 4 \"in\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_LL4_IN_DATA", .pme_code = TORRENT_PBUS_LL | COUNTER_LL4 | 0x3, .pme_desc = "Bus cycles that the Local Link 4 \"in\" channel is receiving data or a data header" }, { .pme_name = "PM_PBUS_LL4_OUT_IDLE", .pme_code = TORRENT_PBUS_LL | COUNTER_LL4 | 0x5, .pme_desc = "Bus cycles that the Local Link 4 \"out\" channel is idle" }, { .pme_name = "PM_PBUS_LL4_OUT_CMDRSP", .pme_code = TORRENT_PBUS_LL | COUNTER_LL4 | 0x6, .pme_desc = "Number of commands, partial responses, and combined responses sent on the Local Link 4 \"out\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_LL4_OUT_DATA", .pme_code = TORRENT_PBUS_LL | COUNTER_LL4 | 0x7, .pme_desc = "Bus cycles that the Local Link 4 \"out\" channel is sending data or a data header" }, { .pme_name = "PM_PBUS_LL4_IN_ISR", .pme_code = TORRENT_PBUS_LL | COUNTER_LL4 | 0x9, .pme_desc = "Bus cycles that the Local Link 4 \"in\" channel is receiving ISR data or an ISR data header" }, { .pme_name = "PM_PBUS_LL4_OUT_ISR", .pme_code = TORRENT_PBUS_LL | COUNTER_LL4 | 0xd, .pme_desc = "Bus cycles that the Local Link 4 \"out\" channel is sending ISR data or an ISR data header" }, { .pme_name = "PM_PBUS_LL5_DISABLED", .pme_code = TORRENT_PBUS_LL | COUNTER_LL5 | 0x0, .pme_desc = "The Local Link 5 event counter is disabled" }, { .pme_name = "PM_PBUS_LL5_IN_IDLE", .pme_code = TORRENT_PBUS_LL | COUNTER_LL5 | 0x1, .pme_desc = "Bus cycles that the Local Link 5 \"in\" channel is idle" }, { .pme_name = "PM_PBUS_LL5_IN_CMDRSP", .pme_code = TORRENT_PBUS_LL | COUNTER_LL5 | 0x2, .pme_desc = "Number of commands, partial responses, and combined responses received on the Local Link 5 \"in\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_LL5_IN_DATA", .pme_code = TORRENT_PBUS_LL | COUNTER_LL5 | 0x3, .pme_desc = "Bus cycles that the Local Link 
5 \"in\" channel is receiving data or a data header" }, { .pme_name = "PM_PBUS_LL5_OUT_IDLE", .pme_code = TORRENT_PBUS_LL | COUNTER_LL5 | 0x5, .pme_desc = "Bus cycles that the Local Link 5 \"out\" channel is idle" }, { .pme_name = "PM_PBUS_LL5_OUT_CMDRSP", .pme_code = TORRENT_PBUS_LL | COUNTER_LL5 | 0x6, .pme_desc = "Number of commands, partial responses, and combined responses sent on the Local Link 5 \"out\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_LL5_OUT_DATA", .pme_code = TORRENT_PBUS_LL | COUNTER_LL5 | 0x7, .pme_desc = "Bus cycles that the Local Link 5 \"out\" channel is sending data or a data header" }, { .pme_name = "PM_PBUS_LL5_IN_ISR", .pme_code = TORRENT_PBUS_LL | COUNTER_LL5 | 0x9, .pme_desc = "Bus cycles that the Local Link 5 \"in\" channel is receiving ISR data or an ISR data header" }, { .pme_name = "PM_PBUS_LL5_OUT_ISR", .pme_code = TORRENT_PBUS_LL | COUNTER_LL5 | 0xd, .pme_desc = "Bus cycles that the Local Link 5 \"out\" channel is sending ISR data or an ISR data header" }, { .pme_name = "PM_PBUS_LL6_DISABLED", .pme_code = TORRENT_PBUS_LL | COUNTER_LL6 | 0x0, .pme_desc = "The Local Link 6 event counter is disabled" }, { .pme_name = "PM_PBUS_LL6_IN_IDLE", .pme_code = TORRENT_PBUS_LL | COUNTER_LL6 | 0x1, .pme_desc = "Bus cycles that the Local Link 6 \"in\" channel is idle" }, { .pme_name = "PM_PBUS_LL6_IN_CMDRSP", .pme_code = TORRENT_PBUS_LL | COUNTER_LL6 | 0x2, .pme_desc = "Number of commands, partial responses, and combined responses received on the Local Link 6 \"in\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_LL6_IN_DATA", .pme_code = TORRENT_PBUS_LL | COUNTER_LL6 | 0x3, .pme_desc = "Bus cycles that the Local Link 6 \"in\" channel is receiving data or a data header" }, { .pme_name = "PM_PBUS_LL6_OUT_IDLE", .pme_code = TORRENT_PBUS_LL | COUNTER_LL6 | 0x5, .pme_desc = "Bus cycles that the Local Link 6 \"out\" channel is idle" }, { .pme_name = "PM_PBUS_LL6_OUT_CMDRSP", 
.pme_code = TORRENT_PBUS_LL | COUNTER_LL6 | 0x6, .pme_desc = "Number of commands, partial responses, and combined responses sent on the Local Link 6 \"out\" channel (Note: multiple events can occur in one cycle)" }, { .pme_name = "PM_PBUS_LL6_OUT_DATA", .pme_code = TORRENT_PBUS_LL | COUNTER_LL6 | 0x7, .pme_desc = "Bus cycles that the Local Link 6 \"out\" channel is sending data or a data header" }, { .pme_name = "PM_PBUS_LL6_IN_ISR", .pme_code = TORRENT_PBUS_LL | COUNTER_LL6 | 0x9, .pme_desc = "Bus cycles that the Local Link 6 \"in\" channel is receiving ISR data or an ISR data header" }, { .pme_name = "PM_PBUS_LL6_OUT_ISR", .pme_code = TORRENT_PBUS_LL | COUNTER_LL6 | 0xd, .pme_desc = "Bus cycles that the Local Link 6 \"out\" channel is sending ISR data or an ISR data header" }, { .pme_name = "PM_PBUS_MCD0_PROBE_ISSUED", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x00, .pme_desc = "cl_probe command issued", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_PROBE_CRESP_GOOD", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x01, .pme_desc = "cResp for a cl_probe was addr_ack_done", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_PROBE_CRESP_RETRY", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x02, .pme_desc = "cResp for a cl_probe was rty_sp or addr_error or unexpected cResp", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_FLUSH1_ISSUED", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x03, .pme_desc = "dcbfk command issued", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_FLUSH0_ISSUED", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x04, .pme_desc = "dcbf command issued", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_BKILL_ISSUED", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x05, .pme_desc = "bkill command issued", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_FLUSH1_GOOD_COMP", .pme_code = 
TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x06, .pme_desc = "cResp for a dcbfk was addr_ack_done and no collision", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_FLUSH1_COLLISION", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x07, .pme_desc = "dcbfk had a collision", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_FLUSH1_BAD_CRESP", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x08, .pme_desc = "cResp for a dcbfk was rty_sp or fl_addr_ack_bk_sp", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_FLUSH0_CRESP_RETRY", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x09, .pme_desc = "cResp for a dcbf was rty_sp", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_BKILL_CRESP_RETRY", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x0A, .pme_desc = "cResp for a bkill was rty_sp or fl_addr_ack_bk_sp", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_RCMD_HIT", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x0B, .pme_desc = "a reflected command got a hit", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_RCMD_MISS", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x0C, .pme_desc = "a reflected command got a miss", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_RCMD_HIT_MD", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x0D, .pme_desc = "a reflected command got a hit in the main directory", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_RCMD_HIT_NE", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x0E, .pme_desc = "a reflected command got a hit in the new entry buffer", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_RCMD_HIT_CO", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x0F, .pme_desc = "a reflected command got a hit in the castout buffer", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_RCMD_MISS_CREATE", .pme_code = 
TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x10, .pme_desc = "a reflected command with a miss should create an entry", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_RCMD_MISS_CREATED", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x11, .pme_desc = "a new entry was created", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_RTY_DINC", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x12, .pme_desc = "MCD responded rty_dinc", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_RTY_FULL", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x13, .pme_desc = "MCD responded rty_lpc", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_BK_RTY", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x14, .pme_desc = "MCD responded with a master retry (rty_other or rty_lost_claim)", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_NE_FULL", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x15, .pme_desc = "The new entry buffer is full", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_DEMAND_CASTOUT", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x16, .pme_desc = "A demand castout was done", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_OTHER_CASTOUT", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x17, .pme_desc = "A non-demand castout was done", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_CASTOUT", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x18, .pme_desc = "A castout was done", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_CO_MOVE", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x19, .pme_desc = "A castout entry was moved to the main directory", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_NE_MOVE", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x1A, .pme_desc = "A new entry movement was processed", .pme_modmsk = 
_TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_PAGE_CREATE", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x1B, .pme_desc = "A new entry movement created a page (got a miss)", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_NE_MOVE_MERGE", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x1C, .pme_desc = "A new entry movement merged with an existing page (got a hit)", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_NE_MOVE_ABORT_FLUSH", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x1D, .pme_desc = "A new entry movement was aborted due to flush in progress", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_NE_MOVE_ABORT_COQ", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x1E, .pme_desc = "A new entry movement was aborted due to castout buffer full", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_EM_HOLDOFF", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x1F, .pme_desc = "An entry movement was held off", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD0_EMQ_NOT_MT", .pme_code = TORRENT_PBUS_MCD | 0 << TORRENT_UNIT_SHIFT | 0x21, .pme_desc = "The entry movement queue is not empty", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_PROBE_ISSUED", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x00, .pme_desc = "cl_probe command issued", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_PROBE_CRESP_GOOD", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x01, .pme_desc = "cResp for a cl_probe was addr_ack_done", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_PROBE_CRESP_RETRY", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x02, .pme_desc = "cResp for a cl_probe was rty_sp or addr_error or unexpected cResp", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_FLUSH1_ISSUED", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x03, .pme_desc = "dcbfk 
command issued", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_FLUSH0_ISSUED", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x04, .pme_desc = "dcbf command issued", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_BKILL_ISSUED", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x05, .pme_desc = "bkill command issued", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_FLUSH1_GOOD_COMP", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x06, .pme_desc = "cResp for a dcbfk was addr_ack_done and no collision", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_FLUSH1_COLLISION", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x07, .pme_desc = "dcbfk had a collision", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_FLUSH1_BAD_CRESP", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x08, .pme_desc = "cResp for a dcbfk was rty_sp or fl_addr_ack_bk_sp", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_FLUSH0_CRESP_RETRY", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x09, .pme_desc = "cResp for a dcbf was rty_sp", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_BKILL_CRESP_RETRY", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x0A, .pme_desc = "cResp for a bkill was rty_sp or fl_addr_ack_bk_sp", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_RCMD_HIT", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x0B, .pme_desc = "a reflected command got a hit", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_RCMD_MISS", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x0C, .pme_desc = "a reflected command got a miss", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_RCMD_HIT_MD", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x0D, .pme_desc = "a reflected command got a hit in the main directory", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = 
"PM_PBUS_MCD1_RCMD_HIT_NE", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x0E, .pme_desc = "a reflected command got a hit in the new entry buffer", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_RCMD_HIT_CO", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x0F, .pme_desc = "a reflected command got a hit in the castout buffer", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_RCMD_MISS_CREATE", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x10, .pme_desc = "a reflected command with a miss should create an entry", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_RCMD_MISS_CREATED", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x11, .pme_desc = "a new entry was created", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_RTY_DINC", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x12, .pme_desc = "MCD responded rty_dinc", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_RTY_FULL", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x13, .pme_desc = "MCD responded rty_lpc", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_BK_RTY", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x14, .pme_desc = "MCD responded with a master retry (rty_other or rty_lost_claim)", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_NE_FULL", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x15, .pme_desc = "The new entry buffer is full", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_DEMAND_CASTOUT", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x16, .pme_desc = "A demand castout was done", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_OTHER_CASTOUT", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x17, .pme_desc = "A non-demand castout was done", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_CASTOUT", .pme_code = TORRENT_PBUS_MCD | 1 << 
TORRENT_UNIT_SHIFT | 0x18, .pme_desc = "A castout was done", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_CO_MOVE", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x19, .pme_desc = "A castout entry was moved to the main directory", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_NE_MOVE", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x1A, .pme_desc = "A new entry movement was processed", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_PAGE_CREATE", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x1B, .pme_desc = "A new entry movement created a page (got a miss)", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_NE_MOVE_MERGE", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x1C, .pme_desc = "A new entry movement merged with an existing page (got a hit)", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_NE_MOVE_ABORT_FLUSH", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x1D, .pme_desc = "A new entry movement was aborted due to flush in progress", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_NE_MOVE_ABORT_COQ", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x1E, .pme_desc = "A new entry movement was aborted due to castout buffer full", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_EM_HOLDOFF", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x1F, .pme_desc = "An entry movement was held off", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_MCD1_EMQ_NOT_MT", .pme_code = TORRENT_PBUS_MCD | 1 << TORRENT_UNIT_SHIFT | 0x21, .pme_desc = "The entry movement queue is not empty", .pme_modmsk = _TORRENT_ATTR_MCD }, { .pme_name = "PM_PBUS_UTIL_PB_APM_NM_HI_CNT", .pme_code = TORRENT_PBUS_UTIL | 0x0 << TORRENT_VIRT_CTR_SHIFT | 0x0, .pme_desc = "Node Master High Threshold Counter", .pme_modmsk = _TORRENT_ATTR_UTIL_HI }, { .pme_name = "PM_PBUS_UTIL_PB_APM_NM_LO_CNT", .pme_code = 
TORRENT_PBUS_UTIL | 0x1 << TORRENT_VIRT_CTR_SHIFT | 0x0, .pme_desc = "Node Master Low Threshold Counter", .pme_modmsk = _TORRENT_ATTR_UTIL_LO }, { .pme_name = "PM_PBUS_UTIL_PB_APM_LM_HI_CNT", .pme_code = TORRENT_PBUS_UTIL | 0x2 << TORRENT_VIRT_CTR_SHIFT | 0x0, .pme_desc = "Local Master High Threshold Counter", .pme_modmsk = _TORRENT_ATTR_UTIL_HI }, { .pme_name = "PM_PBUS_UTIL_PB_APM_LM_LO_CNT", .pme_code = TORRENT_PBUS_UTIL | 0x3 << TORRENT_VIRT_CTR_SHIFT | 0x0, .pme_desc = "Local Master Low Threshold Counter", .pme_modmsk = _TORRENT_ATTR_UTIL_LO }, { .pme_name = "PM_PBUS_UTIL_NODE_MASTER_PUMPS", .pme_code = TORRENT_PBUS_UTIL | 0x0 << TORRENT_VIRT_CTR_SHIFT | 0x1, .pme_desc = "Node Master Pumps" }, { .pme_name = "PM_PBUS_UTIL_LOCAL_MASTER_PUMPS", .pme_code = TORRENT_PBUS_UTIL | 0x1 << TORRENT_VIRT_CTR_SHIFT | 0x1, .pme_desc = "Local Master Pumps" }, { .pme_name = "PM_PBUS_UTIL_RETRY_NODE_MASTER_PUMPS", .pme_code = TORRENT_PBUS_UTIL | 0x2 << TORRENT_VIRT_CTR_SHIFT | 0x1, .pme_desc = "Retry Node Master Pumps" }, { .pme_name = "PM_PBUS_UTIL_RETRY_LOCAL_MASTER_PUMPS", .pme_code = TORRENT_PBUS_UTIL | 0x3 << TORRENT_VIRT_CTR_SHIFT | 0x1, .pme_desc = "Retry Local Master Pumps" }, { .pme_name = "PM_PBUS_UTIL_PB_APM_RCMD_CNT", .pme_code = TORRENT_PBUS_UTIL | 0x4 << TORRENT_VIRT_CTR_SHIFT, .pme_desc = "rCmd Activity Counter" }, { .pme_name = "PM_PBUS_UTIL_PB_APM_INTDATA_CNT", .pme_code = TORRENT_PBUS_UTIL | 0x5 << TORRENT_VIRT_CTR_SHIFT, .pme_desc = "Internal Data Counter" }, { .pme_name = "PM_PBUS_UTIL_PB_APM_EXTDATSND_W_CNT", .pme_code = TORRENT_PBUS_UTIL | 0x6 << TORRENT_VIRT_CTR_SHIFT, .pme_desc = "External Data Send Activity Counter for WXYZ links" }, { .pme_name = "PM_PBUS_UTIL_PB_APM_EXTDATRCV_W_CNT", .pme_code = TORRENT_PBUS_UTIL | 0x7 << TORRENT_VIRT_CTR_SHIFT, .pme_desc = "External Data Receive Activity Counter for WXYZ links" }, { .pme_name = "PM_PBUS_UTIL_PB_APM_EXTDATSND_LL_CNT", .pme_code = TORRENT_PBUS_UTIL | 0x8 << TORRENT_VIRT_CTR_SHIFT, .pme_desc = 
"External Data Send Activity Counter for LL links" }, { .pme_name = "PM_PBUS_UTIL_PB_APM_EXTDATRCV_LL_CNT", .pme_code = TORRENT_PBUS_UTIL | 0x9 << TORRENT_VIRT_CTR_SHIFT, .pme_desc = "External Data Receive Activity Counter for LL links" }, { .pme_name = "PM_PBUS_UTIL_PB_APM_EXTDAT_W_LL_CNT", .pme_code = TORRENT_PBUS_UTIL | 0xA << TORRENT_VIRT_CTR_SHIFT, .pme_desc = "External Data Activity Counter from WXYZ to LL links" }, { .pme_name = "PM_PBUS_UTIL_PB_APM_EXTDAT_LL_W_CNT", .pme_code = TORRENT_PBUS_UTIL | 0xB << TORRENT_VIRT_CTR_SHIFT, .pme_desc = "External Data Activity Counter from LL to WXYZ links" }, { .pme_name = "PM_MMU_G_MMCHIT", .pme_code = TORRENT_MMU | (0 << TORRENT_VIRT_CTR_SHIFT), .pme_desc = "Memory Management Cache Hit Counter Register" }, { .pme_name = "PM_MMU_G_MMCMIS", .pme_code = TORRENT_MMU | (1 << TORRENT_VIRT_CTR_SHIFT), .pme_desc = "Memory Management Cache Miss Counter Register" }, { .pme_name = "PM_MMU_G_MMATHIT", .pme_code = TORRENT_MMU | (2 << TORRENT_VIRT_CTR_SHIFT), .pme_desc = "Memory Management AT Cache Hit Counter Register" }, { .pme_name = "PM_MMU_G_MMATMIS", .pme_code = TORRENT_MMU | (3 << TORRENT_VIRT_CTR_SHIFT), .pme_desc = "Memory Management AT Cache Miss Counter Register" }, { .pme_name = "PM_CAU_CYCLES_WAITING_ON_A_CREDIT", .pme_code = TORRENT_CAU | 0, .pme_desc = "Count of cycles spent waiting on a credit. Increments whenever any index has a packet to send, but nothing (from any index) can be sent." }, }; #define PME_TORRENT_EVENT_COUNT (sizeof(torrent_pe) / sizeof(pme_torrent_entry_t)) #endif papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64.c000066400000000000000000000517651502707512200211600ustar00rootroot00000000000000/* * pfmlib_amd64.c : support for the AMD64 architected PMU * (for both 64 and 32 bit modes) * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Based on: * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <string.h> #include <stdlib.h> /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_amd64_priv.h" /* architecture private */ const pfmlib_attr_desc_t amd64_mods[]={ PFM_ATTR_B("k", "monitor at priv level 0"), /* monitor priv level 0 */ PFM_ATTR_B("u", "monitor at priv level 1, 2, 3"), /* monitor priv level 1, 2, 3 */ PFM_ATTR_B("e", "edge level"), /* edge */ PFM_ATTR_B("i", "invert"), /* invert */ PFM_ATTR_I("c", "counter-mask in range [0-255]"), /* counter-mask */ PFM_ATTR_B("h", "monitor in hypervisor"), /* monitor in hypervisor*/ PFM_ATTR_B("g", "measure in guest"), /* monitor in guest */ PFM_ATTR_NULL /* end-marker to avoid exporting number of entries */ }; pfmlib_pmu_t amd64_support; pfm_amd64_config_t pfm_amd64_cfg; static int amd64_num_mods(void *this, int idx) { const amd64_entry_t *pe = this_pe(this); unsigned int mask; mask = pe[idx].modmsk; return pfmlib_popcnt(mask); } static inline int amd64_eflag(void *this, int idx, int flag) { const amd64_entry_t *pe = this_pe(this); return !!(pe[idx].flags & flag); } static inline int amd64_uflag(void *this, int idx, int attr, int flag) { const amd64_entry_t *pe = this_pe(this); return !!(pe[idx].umasks[attr].uflags & flag); } static inline int amd64_event_ibsfetch(void *this, int idx) { return amd64_eflag(this, idx, AMD64_FL_IBSFE); } static inline int amd64_event_ibsop(void *this, int idx) { return amd64_eflag(this, idx, AMD64_FL_IBSOP); } static inline int amd64_from_rev(unsigned int flags) { return ((flags) >> 8) & 0xff; } static inline int amd64_till_rev(unsigned int flags) { int till = (((flags)>>16) & 0xff); if (!till) return 0xff; return till; } static void amd64_get_revision(pfm_amd64_config_t *cfg) { pfm_pmu_t rev = PFM_PMU_NONE; if (cfg->family == 6) { cfg->revision = PFM_PMU_AMD64_K7; return; } if (cfg->family == 15) { /* family fh */ switch (cfg->model >> 4) { case 0: if (cfg->model == 5 && cfg->stepping < 2) { rev = PFM_PMU_AMD64_K8_REVB; break; } if (cfg->model == 4 &&
cfg->stepping == 0) { rev = PFM_PMU_AMD64_K8_REVB; break; } rev = PFM_PMU_AMD64_K8_REVC; break; case 1: rev = PFM_PMU_AMD64_K8_REVD; break; case 2: case 3: rev = PFM_PMU_AMD64_K8_REVE; break; case 4: case 5: case 0xc: rev = PFM_PMU_AMD64_K8_REVF; break; case 6: case 7: case 8: rev = PFM_PMU_AMD64_K8_REVG; break; default: rev = PFM_PMU_AMD64_K8_REVB; } } else if (cfg->family == 16) { /* family 10h */ switch (cfg->model) { case 4: case 5: case 6: rev = PFM_PMU_AMD64_FAM10H_SHANGHAI; break; case 8: case 9: rev = PFM_PMU_AMD64_FAM10H_ISTANBUL; break; default: rev = PFM_PMU_AMD64_FAM10H_BARCELONA; } } else if (cfg->family == 17) { /* family 11h */ switch (cfg->model) { default: rev = PFM_PMU_AMD64_FAM11H_TURION; } } else if (cfg->family == 18) { /* family 12h */ switch (cfg->model) { default: rev = PFM_PMU_AMD64_FAM12H_LLANO; } } else if (cfg->family == 20) { /* family 14h */ switch (cfg->model) { default: rev = PFM_PMU_AMD64_FAM14H_BOBCAT; } } else if (cfg->family == 21) { /* family 15h */ rev = PFM_PMU_AMD64_FAM15H_INTERLAGOS; } else if (cfg->family == 22) { /* family 16h */ rev = PFM_PMU_AMD64_FAM16H; } else if (cfg->family == 23) { /* family 17h */ if (cfg->model >= 48) rev = PFM_PMU_AMD64_FAM17H_ZEN2; else rev = PFM_PMU_AMD64_FAM17H_ZEN1; } else if (cfg->family == 25) { /* family 19h */ if (cfg->model >= 0x60 || (cfg->model >= 0x10 && cfg->model <= 0x1f)) { rev = PFM_PMU_AMD64_FAM19H_ZEN4; } else { rev = PFM_PMU_AMD64_FAM19H_ZEN3; } } else if (cfg->family == 26) { /* family 1ah */ rev = PFM_PMU_AMD64_FAM1AH_ZEN5; } cfg->revision = rev; } static inline void cpuid(unsigned int op, unsigned int *a, unsigned int *b, unsigned int *c, unsigned int *d) { asm volatile("cpuid" : "=a" (*a), "=b" (*b), "=c" (*c), "=d" (*d) : "a" (op) : "memory"); } static int amd64_event_valid(void *this, int i) { const amd64_entry_t *pe = this_pe(this); pfmlib_pmu_t *pmu = this; int flags; flags = pe[i].flags; if (pmu->pmu_rev < amd64_from_rev(flags)) return 0; if (pmu->pmu_rev > 
amd64_till_rev(flags)) return 0; /* no restrictions or matches restrictions */ return 1; } static int amd64_umask_valid(void *this, int i, int attr) { pfmlib_pmu_t *pmu = this; const amd64_entry_t *pe = this_pe(this); int flags; flags = pe[i].umasks[attr].uflags; if (pmu->pmu_rev < amd64_from_rev(flags)) return 0; if (pmu->pmu_rev > amd64_till_rev(flags)) return 0; /* no restrictions or matches restrictions */ return 1; } static unsigned int amd64_num_umasks(void *this, int pidx) { const amd64_entry_t *pe = this_pe(this); unsigned int i, n = 0; /* unit masks + modifiers */ for (i = 0; i < pe[pidx].numasks; i++) if (amd64_umask_valid(this, pidx, i)) n++; return n; } static int amd64_get_umask(void *this, int pidx, int attr_idx) { const amd64_entry_t *pe = this_pe(this); unsigned int i; int n; for (i=0, n = 0; i < pe[pidx].numasks; i++) { if (!amd64_umask_valid(this, pidx, i)) continue; if (n++ == attr_idx) return i; } return -1; } static inline int amd64_attr2mod(void *this, int pidx, int attr_idx) { const amd64_entry_t *pe = this_pe(this); size_t x; int n; n = attr_idx - amd64_num_umasks(this, pidx); pfmlib_for_each_bit(x, pe[pidx].modmsk) { if (n == 0) break; n--; } return x; } void amd64_display_reg(void *this, pfmlib_event_desc_t *e, pfm_amd64_reg_t reg) { pfmlib_pmu_t *pmu = this; __pfm_vbprintf("[0x%"PRIx64" event_sel=0x%x umask=0x%x os=%d usr=%d en=%d int=%d inv=%d edge=%d cnt_mask=%d", reg.val, reg.sel_event_mask | (reg.sel_event_mask2 << 8), reg.sel_unit_mask, reg.sel_os, reg.sel_usr, reg.sel_en, reg.sel_int, reg.sel_inv, reg.sel_edge, reg.sel_cnt_mask); /* Fam10h or later has host/guest filtering except Fam11h */ if (pfm_amd64_supports_virt(pmu)) __pfm_vbprintf(" guest=%d host=%d", reg.sel_guest, reg.sel_host); __pfm_vbprintf("] %s\n", e->fstr); } int pfm_amd64_detect(void *this) { unsigned int a, b, c, d; char buffer[128]; if (pfm_amd64_cfg.family) return PFM_SUCCESS; cpuid(0, &a, &b, &c, &d); strncpy(&buffer[0], (char *)(&b), 4); strncpy(&buffer[4],
(char *)(&d), 4); strncpy(&buffer[8], (char *)(&c), 4); buffer[12] = '\0'; if (strcmp(buffer, "AuthenticAMD")) return PFM_ERR_NOTSUPP; cpuid(1, &a, &b, &c, &d); pfm_amd64_cfg.family = (a >> 8) & 0x0000000f; // bits 11 - 8 pfm_amd64_cfg.model = (a >> 4) & 0x0000000f; // Bits 7 - 4 if (pfm_amd64_cfg.family == 0xf) { pfm_amd64_cfg.family += (a >> 20) & 0x000000ff; // Extended family pfm_amd64_cfg.model |= (a >> 12) & 0x000000f0; // Extended model } pfm_amd64_cfg.stepping= a & 0x0000000f; // bits 3 - 0 amd64_get_revision(&pfm_amd64_cfg); if (pfm_amd64_cfg.revision == PFM_PMU_NONE) return PFM_ERR_NOTSUPP; return PFM_SUCCESS; } int pfm_amd64_family_detect(void *this) { struct pfmlib_pmu *pmu = this; int ret; ret = pfm_amd64_detect(this); if (ret != PFM_SUCCESS) return ret; ret = pfm_amd64_cfg.revision; return ret == pmu->cpu_family ? PFM_SUCCESS : PFM_ERR_NOTSUPP; } static int amd64_add_defaults(void *this, pfmlib_event_desc_t *e, unsigned int msk, uint64_t *umask) { const amd64_entry_t *ent, *pe = this_pe(this); unsigned int i; int j, k, added, omit, numasks_grp; int idx; k = e->nattrs; ent = pe+e->event; for(i=0; msk; msk >>=1, i++) { if (!(msk & 0x1)) continue; added = omit = numasks_grp = 0; for (j = 0; j < e->npattrs; j++) { if (e->pattrs[j].ctrl != PFM_ATTR_CTRL_PMU) continue; if (e->pattrs[j].type != PFM_ATTR_UMASK) continue; idx = e->pattrs[j].idx; if (ent->umasks[idx].grpid != i) continue; /* number of umasks in this group */ numasks_grp++; if (amd64_uflag(this, e->event, idx, AMD64_FL_DFL)) { DPRINT("added default for %s j=%d idx=%d\n", ent->umasks[idx].uname, j, idx); *umask |= ent->umasks[idx].ucode; e->attrs[k].id = j; /* pattrs index */ e->attrs[k].ival = 0; k++; added++; } if (amd64_uflag(this, e->event, idx, AMD64_FL_OMIT)) omit++; } /* * fail if no default was found AND at least one umasks cannot be omitted * in the group */ if (!added && omit != numasks_grp) { DPRINT("no default found for event %s unit mask group %d\n", ent->name, i); return 
PFM_ERR_UMASK; } } e->nattrs = k; return PFM_SUCCESS; } int pfm_amd64_get_encoding(void *this, pfmlib_event_desc_t *e) { const amd64_entry_t *pe = this_pe(this); pfm_amd64_reg_t reg; pfmlib_event_attr_info_t *a; uint64_t umask = 0; unsigned int plmmsk = 0; int k, ret, grpid; unsigned int grpmsk, ugrpmsk = 0; int grpcounts[AMD64_MAX_GRP]; int ncombo[AMD64_MAX_GRP]; memset(grpcounts, 0, sizeof(grpcounts)); memset(ncombo, 0, sizeof(ncombo)); e->fstr[0] = '\0'; reg.val = 0; /* assume reserved bits are zeroed */ grpmsk = (1 << pe[e->event].ngrp)-1; if (amd64_event_ibsfetch(this, e->event)) reg.ibsfetch.en = 1; else if (amd64_event_ibsop(this, e->event)) reg.ibsop.en = 1; else { reg.sel_event_mask = pe[e->event].code; reg.sel_event_mask2 = pe[e->event].code >> 8; reg.sel_en = 1; /* force enable */ reg.sel_int = 1; /* force APIC */ } for(k=0; k < e->nattrs; k++) { a = attr(e, k); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) { grpid = pe[e->event].umasks[a->idx].grpid; ++grpcounts[grpid]; /* * upper layer has removed duplicates * so if we come here more than once, it is for two * distinct umasks */ if (amd64_uflag(this, e->event, a->idx, AMD64_FL_NCOMBO)) ncombo[grpid] = 1; /* * if more than one umask in this group but one is marked * with ncombo, then fail.
It is okay to combine umasks within * a group as long as none is tagged with NCOMBO */ if (grpcounts[grpid] > 1 && ncombo[grpid]) { DPRINT("event does not support unit mask combination within a group\n"); return PFM_ERR_FEATCOMB; } umask |= pe[e->event].umasks[a->idx].ucode; ugrpmsk |= 1 << pe[e->event].umasks[a->idx].grpid; } else if (a->type == PFM_ATTR_RAW_UMASK) { /* there can only be one RAW_UMASK per event */ /* sanity checks */ if (a->idx & ~0xff) { DPRINT("raw umask is invalid\n"); return PFM_ERR_ATTR; } /* override umask */ umask = a->idx & 0xff; ugrpmsk = grpmsk; } else { /* modifiers */ uint64_t ival = e->attrs[k].ival; switch(a->idx) { //amd64_attr2mod(this, e->osid, e->event, a->idx)) { case AMD64_ATTR_I: /* invert */ reg.sel_inv = !!ival; break; case AMD64_ATTR_E: /* edge */ reg.sel_edge = !!ival; break; case AMD64_ATTR_C: /* counter-mask */ if (ival > 255) return PFM_ERR_ATTR_VAL; reg.sel_cnt_mask = ival; break; case AMD64_ATTR_U: /* USR */ reg.sel_usr = !!ival; plmmsk |= _AMD64_ATTR_U; break; case AMD64_ATTR_K: /* OS */ reg.sel_os = !!ival; plmmsk |= _AMD64_ATTR_K; break; case AMD64_ATTR_G: /* GUEST */ reg.sel_guest = !!ival; plmmsk |= _AMD64_ATTR_G; break; case AMD64_ATTR_H: /* HOST */ reg.sel_host = !!ival; plmmsk |= _AMD64_ATTR_H; break; } } } /* * handle case where no priv level mask was passed. * then we use the dfl_plm */ if (!(plmmsk & (_AMD64_ATTR_K|_AMD64_ATTR_U|_AMD64_ATTR_H))) { if (e->dfl_plm & PFM_PLM0) reg.sel_os = 1; if (e->dfl_plm & PFM_PLM3) reg.sel_usr = 1; if (e->dfl_plm & PFM_PLMH) reg.sel_host = 1; } /* * check that there is at least one unit mask in each unit * mask group */ if (ugrpmsk != grpmsk) { ugrpmsk ^= grpmsk; ret = amd64_add_defaults(this, e, ugrpmsk, &umask); if (ret != PFM_SUCCESS) return ret; } reg.sel_unit_mask = umask; e->codes[0] = reg.val; e->count = 1; /* * reorder all the attributes such that the fstr appears always * the same regardless of how the attributes were submitted.
*/ evt_strcat(e->fstr, "%s", pe[e->event].name); pfmlib_sort_attr(e); for (k = 0; k < e->nattrs; k++) { a = attr(e, k); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) evt_strcat(e->fstr, ":%s", pe[e->event].umasks[a->idx].uname); else if (a->type == PFM_ATTR_RAW_UMASK) evt_strcat(e->fstr, ":0x%x", a->idx); } for (k = 0; k < e->npattrs; k++) { int idx; if (e->pattrs[k].ctrl != PFM_ATTR_CTRL_PMU) continue; if (e->pattrs[k].type == PFM_ATTR_UMASK) continue; idx = e->pattrs[k].idx; switch(idx) { case AMD64_ATTR_K: evt_strcat(e->fstr, ":%s=%lu", amd64_mods[idx].name, reg.sel_os); break; case AMD64_ATTR_U: evt_strcat(e->fstr, ":%s=%lu", amd64_mods[idx].name, reg.sel_usr); break; case AMD64_ATTR_E: evt_strcat(e->fstr, ":%s=%lu", amd64_mods[idx].name, reg.sel_edge); break; case AMD64_ATTR_I: evt_strcat(e->fstr, ":%s=%lu", amd64_mods[idx].name, reg.sel_inv); break; case AMD64_ATTR_C: evt_strcat(e->fstr, ":%s=%lu", amd64_mods[idx].name, reg.sel_cnt_mask); break; case AMD64_ATTR_H: evt_strcat(e->fstr, ":%s=%lu", amd64_mods[idx].name, reg.sel_host); break; case AMD64_ATTR_G: evt_strcat(e->fstr, ":%s=%lu", amd64_mods[idx].name, reg.sel_guest); break; } } amd64_display_reg(this, e, reg); return PFM_SUCCESS; } int pfm_amd64_get_event_first(void *this) { pfmlib_pmu_t *pmu = this; int idx; for(idx=0; idx < pmu->pme_count; idx++) if (amd64_event_valid(this, idx)) return idx; return -1; } int pfm_amd64_get_event_next(void *this, int idx) { pfmlib_pmu_t *pmu = this; /* basic validity checks on idx done by caller */ if (idx >= (pmu->pme_count-1)) return -1; /* validate event for this host PMU */ if (!amd64_event_valid(this, idx)) return -1; for(++idx; idx < pmu->pme_count; idx++) { if (amd64_event_valid(this, idx)) return idx; } return -1; } int pfm_amd64_event_is_valid(void *this, int pidx) { pfmlib_pmu_t *pmu = this; if (pidx < 0 || pidx >= pmu->pme_count) return 0; /* valid revision */ return amd64_event_valid(this, pidx); } int
pfm_amd64_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info) { const amd64_entry_t *pe = this_pe(this); int numasks, idx; numasks = amd64_num_umasks(this, pidx); if (attr_idx < numasks) { idx = amd64_get_umask(this, pidx, attr_idx); if (idx == -1) return PFM_ERR_ATTR; info->name = pe[pidx].umasks[idx].uname; info->desc = pe[pidx].umasks[idx].udesc; info->code = pe[pidx].umasks[idx].ucode; info->type = PFM_ATTR_UMASK; info->is_dfl = amd64_uflag(this, pidx, idx, AMD64_FL_DFL); } else { idx = amd64_attr2mod(this, pidx, attr_idx); info->name = amd64_mods[idx].name; info->desc = amd64_mods[idx].desc; info->type = amd64_mods[idx].type; info->code = idx; info->is_dfl = 0; } info->is_precise = 0; info->support_hw_smpl = 0; info->equiv = NULL; info->ctrl = PFM_ATTR_CTRL_PMU; info->idx = idx; /* namespace specific index */ info->dfl_val64 = 0; return PFM_SUCCESS; } int pfm_amd64_get_event_info(void *this, int idx, pfm_event_info_t *info) { pfmlib_pmu_t *pmu = this; const amd64_entry_t *pe = this_pe(this); info->name = pe[idx].name; info->desc = pe[idx].desc; info->equiv = NULL; info->code = pe[idx].code; info->idx = idx; info->pmu = pmu->pmu; info->is_precise = 0; info->support_hw_smpl = 0; info->nattrs = amd64_num_umasks(this, idx); info->nattrs += amd64_num_mods(this, idx); return PFM_SUCCESS; } int pfm_amd64_validate_table(void *this, FILE *fp) { pfmlib_pmu_t *pmu = this; const amd64_entry_t *pe = this_pe(this); const char *name = pmu->name; unsigned int i, j, k; int ndfl; int error = 0; if (!pmu->atdesc) { fprintf(fp, "pmu: %s missing attr_desc\n", pmu->name); error++; } if (!pmu->supported_plm && pmu->type == PFM_PMU_TYPE_CORE) { fprintf(fp, "pmu: %s supported_plm not set\n", pmu->name); error++; } for(i=0; i < (unsigned int)pmu->pme_count; i++) { if (!pe[i].name) { fprintf(fp, "pmu: %s event%d: :: no name (prev event was %s)\n", pmu->name, i, i > 1 ? 
pe[i-1].name : "??"); error++; } if (!pe[i].desc) { fprintf(fp, "pmu: %s event%d: %s :: no description\n", name, i, pe[i].name); error++; } if (pe[i].numasks && pe[i].umasks == NULL) { fprintf(fp, "pmu: %s event%d: %s :: numasks but no umasks\n", pmu->name, i, pe[i].name); error++; } if (pe[i].numasks == 0 && pe[i].umasks) { fprintf(fp, "pmu: %s event%d: %s :: numasks=0 but umasks defined\n", pmu->name, i, pe[i].name); error++; } if (pe[i].numasks && pe[i].ngrp == 0) { fprintf(fp, "pmu: %s event%d: %s :: ngrp cannot be zero\n", name, i, pe[i].name); error++; } if (pe[i].numasks == 0 && pe[i].ngrp) { fprintf(fp, "pmu: %s event%d: %s :: ngrp must be zero\n", name, i, pe[i].name); error++; } if (pe[i].ngrp >= AMD64_MAX_GRP) { fprintf(fp, "pmu: %s event%d: %s :: ngrp too big (max=%d)\n", name, i, pe[i].name, AMD64_MAX_GRP); error++; } for(ndfl = 0, j= 0; j < pe[i].numasks; j++) { if (!pe[i].umasks[j].uname) { fprintf(fp, "pmu: %s event%d: %s umask%d :: no name\n", pmu->name, i, pe[i].name, j); error++; } if (!pe[i].umasks[j].udesc) { fprintf(fp, "pmu: %s event%d:%s umask%d: %s :: no description\n", name, i, pe[i].name, j, pe[i].umasks[j].uname); error++; } if (pe[i].ngrp && pe[i].umasks[j].grpid >= pe[i].ngrp) { fprintf(fp, "pmu: %s event%d: %s umask%d: %s :: invalid grpid %d (must be < %d)\n", name, i, pe[i].name, j, pe[i].umasks[j].uname, pe[i].umasks[j].grpid, pe[i].ngrp); error++; } if (pe[i].umasks[j].uflags & AMD64_FL_DFL) { for(k=0; k < j; k++) if ((pe[i].umasks[k].uflags == pe[i].umasks[j].uflags) && (pe[i].umasks[k].grpid == pe[i].umasks[j].grpid)) ndfl++; if (pe[i].numasks == 1) ndfl = 1; } } if (pe[i].numasks > 1 && ndfl) { fprintf(fp, "pmu: %s event%d: %s :: more than one default unit mask with same code\n", name, i, pe[i].name); error++; } /* if only one umask, then ought to be default */ if (pe[i].numasks == 1 && ndfl != 1) { fprintf(fp, "pmu: %s event%d: %s, only one umask but no default\n", pmu->name, i, pe[i].name); error++; } if (pe[i].flags & 
AMD64_FL_NCOMBO) { fprintf(fp, "pmu: %s event%d: %s :: NCOMBO is unit mask only flag\n", name, i, pe[i].name); error++; } for(j=0; j < pe[i].numasks; j++) { if (pe[i].umasks[j].uflags & AMD64_FL_NCOMBO) continue; for(k=j+1; k < pe[i].numasks; k++) { if (pe[i].umasks[k].uflags & AMD64_FL_NCOMBO) continue; if ((pe[i].umasks[j].ucode & pe[i].umasks[k].ucode)) { fprintf(fp, "pmu: %s event%d: %s :: umask %s and %s have overlapping code bits\n", name, i, pe[i].name, pe[i].umasks[j].uname, pe[i].umasks[k].uname); error++; } } } for (j=i+1; j < (unsigned int)pmu->pme_count; j++) { if (pe[i].code == pe[j].code && pe[i].flags == pe[j].flags) { fprintf(fp, "pmu: %s events %s and %s have the same code 0x%x\n", pmu->name, pe[i].name, pe[j].name, pe[i].code); error++; } } } return error ? PFM_ERR_INVAL : PFM_SUCCESS; } unsigned int pfm_amd64_get_event_nattrs(void *this, int pidx) { unsigned int nattrs; nattrs = amd64_num_umasks(this, pidx); nattrs += amd64_num_mods(this, pidx); return nattrs; } int pfm_amd64_get_num_events(void *this) { pfmlib_pmu_t *pmu = this; int i, num = 0; /* * count actual number of events for specific PMU. * Table may contain more events for the family than * what a specific model actually supports. 
*/ for (i = 0; i < pmu->pme_count; i++) if (amd64_event_valid(this, i)) num++; return num; } papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64_fam10h.c000066400000000000000000000052561502707512200223060ustar00rootroot00000000000000/* * pfmlib_amd64_fam10h.c : AMD64 Family 10h * * Copyright (c) 2010 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_amd64_priv.h" #include "events/amd64_events_fam10h.h" #define DEFINE_FAM10H_REV(d, n, r, pmuid) \ pfmlib_pmu_t amd64_fam10h_##n##_support={ \ .desc = "AMD64 Fam10h "#d, \ .name = "amd64_fam10h_"#n, \ .pmu = pmuid, \ .pmu_rev = r, \ .pme_count = LIBPFM_ARRAY_SIZE(amd64_fam10h_pe),\ .type = PFM_PMU_TYPE_CORE, \ .supported_plm = AMD64_FAM10H_PLM, \ .num_cntrs = 4, \ .max_encoding = 1, \ .pe = amd64_fam10h_pe, \ .atdesc = amd64_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ \ .cpu_family = pmuid, \ .pmu_detect = pfm_amd64_family_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), \ .get_event_first = pfm_amd64_get_event_first, \ .get_event_next = pfm_amd64_get_event_next, \ .event_is_valid = pfm_amd64_event_is_valid, \ .validate_table = pfm_amd64_validate_table, \ .get_event_info = pfm_amd64_get_event_info, \ .get_event_attr_info = pfm_amd64_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs),\ .get_event_nattrs = pfm_amd64_get_event_nattrs, \ .get_num_events = pfm_amd64_get_num_events, \ } DEFINE_FAM10H_REV(Barcelona, barcelona, AMD64_FAM10H_REV_B, PFM_PMU_AMD64_FAM10H_BARCELONA); DEFINE_FAM10H_REV(Shanghai, shanghai, AMD64_FAM10H_REV_C, PFM_PMU_AMD64_FAM10H_SHANGHAI); DEFINE_FAM10H_REV(Istanbul, istanbul, AMD64_FAM10H_REV_D, PFM_PMU_AMD64_FAM10H_ISTANBUL); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64_fam11h.c000066400000000000000000000047031502707512200223030ustar00rootroot00000000000000/* * pfmlib_amd64_fam11h.c : AMD64 Family 11h * * Copyright (c) 2012 University of Tennessee * Contributed by Vince Weaver * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, 
and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_amd64_priv.h" #include "events/amd64_events_fam11h.h" #define DEFINE_FAM11H_REV(d, n, r, pmuid) \ pfmlib_pmu_t amd64_fam11h_##n##_support={ \ .desc = "AMD64 Fam11h "#d, \ .name = "amd64_fam11h_"#n, \ .pmu = pmuid, \ .pmu_rev = r, \ .pme_count = LIBPFM_ARRAY_SIZE(amd64_fam11h_pe),\ .type = PFM_PMU_TYPE_CORE, \ .supported_plm = AMD64_FAM10H_PLM, \ .num_cntrs = 4, \ .max_encoding = 1, \ .pe = amd64_fam11h_pe, \ .atdesc = amd64_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ \ .cpu_family = pmuid, \ .pmu_detect = pfm_amd64_family_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), \ .get_event_first = pfm_amd64_get_event_first, \ .get_event_next = pfm_amd64_get_event_next, \ .event_is_valid = pfm_amd64_event_is_valid, \ .validate_table = pfm_amd64_validate_table, \ .get_event_info = pfm_amd64_get_event_info, \ .get_event_attr_info = pfm_amd64_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs),\ .get_event_nattrs = pfm_amd64_get_event_nattrs, \ } DEFINE_FAM11H_REV(Turion, turion, AMD64_FAM11H, PFM_PMU_AMD64_FAM11H_TURION); 
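/* The per-family support files above all route detection through
 * pfm_amd64_family_detect, which relies on the CPUID leaf 1 decoding in
 * pfm_amd64_detect: the base family and model fields are widened by the
 * extended fields only when the base family reads 0xf. A minimal
 * standalone sketch of that folding rule follows; the struct and function
 * names here are illustrative, not part of libpfm4. */

```c
#include <stdio.h>

/* Decoded CPUID leaf 1 EAX signature (illustrative helper, not libpfm4 API). */
struct amd_sig {
	unsigned family;
	unsigned model;
	unsigned stepping;
};

/* Decode the effective AMD family/model/stepping from CPUID Fn0000_0001 EAX,
 * mirroring the rule in pfm_amd64_detect: the extended family is added and
 * the extended model is OR-ed in only when the base family field is 0xf. */
static struct amd_sig amd_decode_sig(unsigned eax)
{
	struct amd_sig s;

	s.family   = (eax >> 8) & 0xf;	/* bits 11-8 */
	s.model    = (eax >> 4) & 0xf;	/* bits 7-4  */
	s.stepping =  eax       & 0xf;	/* bits 3-0  */

	if (s.family == 0xf) {
		s.family += (eax >> 20) & 0xff;	/* extended family, bits 27-20 */
		s.model  |= (eax >> 12) & 0xf0;	/* extended model,  bits 19-16 */
	}
	return s;
}
```

/* pfm_amd64_get_revision then maps the folded (family, model, stepping)
 * triple to a PFM_PMU_* revision constant, e.g. family 16 model 4 selects
 * the Fam10h Shanghai table. */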
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64_fam12h.c000066400000000000000000000047001502707512200223010ustar00rootroot00000000000000/* * pfmlib_amd64_fam12h.c : AMD64 Family 12h * * Copyright (c) 2011 University of Tennessee * Contributed by Vince Weaver * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_amd64_priv.h" #include "events/amd64_events_fam12h.h" #define DEFINE_FAM12H_REV(d, n, r, pmuid) \ pfmlib_pmu_t amd64_fam12h_##n##_support={ \ .desc = "AMD64 Fam12h "#d, \ .name = "amd64_fam12h_"#n, \ .pmu = pmuid, \ .pmu_rev = r, \ .pme_count = LIBPFM_ARRAY_SIZE(amd64_fam12h_pe),\ .type = PFM_PMU_TYPE_CORE, \ .supported_plm = AMD64_FAM10H_PLM, \ .num_cntrs = 4, \ .max_encoding = 1, \ .pe = amd64_fam12h_pe, \ .atdesc = amd64_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ \ .cpu_family = pmuid, \ .pmu_detect = pfm_amd64_family_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), \ .get_event_first = pfm_amd64_get_event_first, \ .get_event_next = pfm_amd64_get_event_next, \ .event_is_valid = pfm_amd64_event_is_valid, \ .validate_table = pfm_amd64_validate_table, \ .get_event_info = pfm_amd64_get_event_info, \ .get_event_attr_info = pfm_amd64_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs),\ .get_event_nattrs = pfm_amd64_get_event_nattrs, \ } DEFINE_FAM12H_REV(Llano, llano, AMD64_FAM12H, PFM_PMU_AMD64_FAM12H_LLANO); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64_fam14h.c000066400000000000000000000047011502707512200223040ustar00rootroot00000000000000/* * pfmlib_amd64_fam14h.c : AMD64 Family 14h * * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial 
portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_amd64_priv.h" #include "events/amd64_events_fam14h.h" #define DEFINE_FAM14H_REV(d, n, r, pmuid) \ pfmlib_pmu_t amd64_fam14h_##n##_support={ \ .desc = "AMD64 Fam14h "#d, \ .name = "amd64_fam14h_"#n, \ .pmu = pmuid, \ .pmu_rev = r, \ .pme_count = LIBPFM_ARRAY_SIZE(amd64_fam14h_pe),\ .type = PFM_PMU_TYPE_CORE, \ .supported_plm = AMD64_FAM10H_PLM, \ .num_cntrs = 4, \ .max_encoding = 1, \ .pe = amd64_fam14h_pe, \ .atdesc = amd64_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ \ .cpu_family = pmuid, \ .pmu_detect = pfm_amd64_family_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), \ .get_event_first = pfm_amd64_get_event_first, \ .get_event_next = pfm_amd64_get_event_next, \ .event_is_valid = pfm_amd64_event_is_valid, \ .validate_table = pfm_amd64_validate_table, \ .get_event_info = pfm_amd64_get_event_info, \ .get_event_attr_info = pfm_amd64_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs),\ .get_event_nattrs = pfm_amd64_get_event_nattrs, \ } DEFINE_FAM14H_REV(Bobcat, bobcat, AMD64_FAM14H_REV_B, PFM_PMU_AMD64_FAM14H_BOBCAT); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64_fam15h.c000066400000000000000000000066101502707512200223060ustar00rootroot00000000000000/* * pfmlib_amd64_fam15h.c : AMD64 Family 15h * * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of 
charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_amd64_priv.h" #include "events/amd64_events_fam15h.h" #include "events/amd64_events_fam15h_nb.h" pfmlib_pmu_t amd64_fam15h_interlagos_support={ .desc = "AMD64 Fam15h Interlagos", .name = "amd64_fam15h_interlagos", .pmu = PFM_PMU_AMD64_FAM15H_INTERLAGOS, .pmu_rev = AMD64_FAM15H, .pme_count = LIBPFM_ARRAY_SIZE(amd64_fam15h_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = AMD64_FAM10H_PLM, .num_cntrs = 6, .max_encoding = 1, .pe = amd64_fam15h_pe, .atdesc = amd64_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .cpu_family = PFM_PMU_AMD64_FAM15H_INTERLAGOS, .pmu_detect = pfm_amd64_family_detect, .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding, PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), .get_event_first = pfm_amd64_get_event_first, .get_event_next = pfm_amd64_get_event_next, .event_is_valid = pfm_amd64_event_is_valid, .validate_table = pfm_amd64_validate_table, .get_event_info = pfm_amd64_get_event_info, .get_event_attr_info = pfm_amd64_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs), .get_event_nattrs = pfm_amd64_get_event_nattrs, }; pfmlib_pmu_t amd64_fam15h_nb_support={ .desc = "AMD64 Fam15h NorthBridge", .name = "amd64_fam15h_nb", .pmu = PFM_PMU_AMD64_FAM15H_NB, .perf_name = "amd_nb", .pmu_rev = 0, .pme_count = LIBPFM_ARRAY_SIZE(amd64_fam15h_nb_pe), .type = PFM_PMU_TYPE_UNCORE, .supported_plm = 0, /* no plm support */ .num_cntrs = 4, .max_encoding = 1, .pe = amd64_fam15h_nb_pe, .atdesc = amd64_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .cpu_family = PFM_PMU_AMD64_FAM15H_INTERLAGOS, .pmu_detect = pfm_amd64_family_detect, .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding, PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), .get_event_first = pfm_amd64_get_event_first, .get_event_next = pfm_amd64_get_event_next, .event_is_valid = pfm_amd64_event_is_valid, .validate_table = pfm_amd64_validate_table, .get_event_info = pfm_amd64_get_event_info, .get_event_attr_info 
= pfm_amd64_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_amd64_nb_perf_validate_pattrs), .get_event_nattrs = pfm_amd64_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64_fam16h.c000066400000000000000000000043771502707512200223170ustar00rootroot00000000000000/* * pfmlib_amd64_fam16h.c : AMD64 Family 16h * * Copyright (c) 2017 by Vince Weaver * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_amd64_priv.h" #include "events/amd64_events_fam16h.h" pfmlib_pmu_t amd64_fam16h_support={ .desc = "AMD64 Fam16h Jaguar", .name = "amd64_fam16h", .pmu = PFM_PMU_AMD64_FAM16H, .pmu_rev = AMD64_FAM16H, .pme_count = LIBPFM_ARRAY_SIZE(amd64_fam16h_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = AMD64_FAM10H_PLM, .num_cntrs = 4, .max_encoding = 1, .pe = amd64_fam16h_pe, .atdesc = amd64_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .cpu_family = PFM_PMU_AMD64_FAM16H, .pmu_detect = pfm_amd64_family_detect, .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding, PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), .get_event_first = pfm_amd64_get_event_first, .get_event_next = pfm_amd64_get_event_next, .event_is_valid = pfm_amd64_event_is_valid, .validate_table = pfm_amd64_validate_table, .get_event_info = pfm_amd64_get_event_info, .get_event_attr_info = pfm_amd64_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs), .get_event_nattrs = pfm_amd64_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64_fam17h.c000066400000000000000000000114721502707512200223120ustar00rootroot00000000000000/* * pfmlib_amd64_fam17h.c : AMD64 Family 17h * * Copyright (c) 2017 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_amd64_priv.h" #include "events/amd64_events_fam17h_zen1.h" #include "events/amd64_events_fam17h_zen2.h" /* * This function detects ZEN1 for the deprecated * amd_fam17h pmu model name. */ static int pfm_amd64_family_detect_zen1(void *this) { int ret, rev; ret = pfm_amd64_detect(this); if (ret != PFM_SUCCESS) return ret; rev = pfm_amd64_cfg.revision; return rev == PFM_PMU_AMD64_FAM17H_ZEN1 ? PFM_SUCCESS: PFM_ERR_NOTSUPP; } /* * Deprecated PMU model, kept here for backward compatibility. * Should use amd_fam17h_zen1 instead. 
*/ pfmlib_pmu_t amd64_fam17h_deprecated_support={ .desc = "AMD64 Fam17h Zen1 (deprecated - use amd_fam17h_zen1 instead)", .name = "amd64_fam17h", .pmu = PFM_PMU_AMD64_FAM17H, .pmu_rev = AMD64_FAM17H, .pme_count = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = AMD64_FAM10H_PLM, .num_cntrs = 6, .max_encoding = 1, .pe = amd64_fam17h_zen1_pe, .atdesc = amd64_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | PFMLIB_PMU_FL_DEPR, .cpu_family = PFM_PMU_AMD64_FAM17H, .pmu_detect = pfm_amd64_family_detect_zen1, .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding, PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), .get_event_first = pfm_amd64_get_event_first, .get_event_next = pfm_amd64_get_event_next, .event_is_valid = pfm_amd64_event_is_valid, .validate_table = pfm_amd64_validate_table, .get_event_info = pfm_amd64_get_event_info, .get_event_attr_info = pfm_amd64_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs), .get_event_nattrs = pfm_amd64_get_event_nattrs, }; pfmlib_pmu_t amd64_fam17h_zen1_support={ .desc = "AMD64 Fam17h Zen1", .name = "amd64_fam17h_zen1", .pmu = PFM_PMU_AMD64_FAM17H_ZEN1, .pmu_rev = 0, .pme_count = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen1_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = AMD64_FAM10H_PLM, .num_cntrs = 6, .max_encoding = 1, .pe = amd64_fam17h_zen1_pe, .atdesc = amd64_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .cpu_family = PFM_PMU_AMD64_FAM17H_ZEN1, .pmu_detect = pfm_amd64_family_detect, .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding, PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), .get_event_first = pfm_amd64_get_event_first, .get_event_next = pfm_amd64_get_event_next, .event_is_valid = pfm_amd64_event_is_valid, .validate_table = pfm_amd64_validate_table, .get_event_info = pfm_amd64_get_event_info, .get_event_attr_info = pfm_amd64_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs), .get_event_nattrs = pfm_amd64_get_event_nattrs, }; pfmlib_pmu_t 
amd64_fam17h_zen2_support={ .desc = "AMD64 Fam17h Zen2", .name = "amd64_fam17h_zen2", .pmu = PFM_PMU_AMD64_FAM17H_ZEN2, .pmu_rev = 0, .pme_count = LIBPFM_ARRAY_SIZE(amd64_fam17h_zen2_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = AMD64_FAM10H_PLM, .num_cntrs = 6, .max_encoding = 1, .pe = amd64_fam17h_zen2_pe, .atdesc = amd64_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .cpu_family = PFM_PMU_AMD64_FAM17H_ZEN2, .pmu_detect = pfm_amd64_family_detect, .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding, PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), .get_event_first = pfm_amd64_get_event_first, .get_event_next = pfm_amd64_get_event_next, .event_is_valid = pfm_amd64_event_is_valid, .validate_table = pfm_amd64_validate_table, .get_event_info = pfm_amd64_get_event_info, .get_event_attr_info = pfm_amd64_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs), .get_event_nattrs = pfm_amd64_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64_fam19h.c000066400000000000000000000065361502707512200223210ustar00rootroot00000000000000/* * pfmlib_amd64_fam19h.c : AMD64 Fam19h core PMU support * * Contributed by Swarup Sahoo * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_amd64_priv.h" #include "events/amd64_events_fam19h_zen3.h" #include "events/amd64_events_fam19h_zen4.h" pfmlib_pmu_t amd64_fam19h_zen3_support={ .desc = "AMD64 Fam19h Zen3", .name = "amd64_fam19h_zen3", .pmu = PFM_PMU_AMD64_FAM19H_ZEN3, .pmu_rev = AMD64_FAM19H, .pme_count = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = AMD64_FAM10H_PLM, .num_cntrs = 6, .max_encoding = 1, .pe = amd64_fam19h_zen3_pe, .atdesc = amd64_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .cpu_family = PFM_PMU_AMD64_FAM19H_ZEN3, .pmu_detect = pfm_amd64_family_detect, .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding, PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), .get_event_first = pfm_amd64_get_event_first, .get_event_next = pfm_amd64_get_event_next, .event_is_valid = pfm_amd64_event_is_valid, .validate_table = pfm_amd64_validate_table, .get_event_info = pfm_amd64_get_event_info, .get_event_attr_info = pfm_amd64_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs), .get_event_nattrs = pfm_amd64_get_event_nattrs, }; pfmlib_pmu_t amd64_fam19h_zen4_support={ .desc = "AMD64 Fam19h Zen4", .name = "amd64_fam19h_zen4", .pmu = PFM_PMU_AMD64_FAM19H_ZEN4, .pmu_rev = AMD64_FAM19H, .pme_count = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen4_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = AMD64_FAM10H_PLM, .num_cntrs = 6, .max_encoding = 1, .pe = amd64_fam19h_zen4_pe, .atdesc = amd64_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .cpu_family = PFM_PMU_AMD64_FAM19H_ZEN4, .pmu_detect = pfm_amd64_family_detect, .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding, PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), 
.get_event_first = pfm_amd64_get_event_first, .get_event_next = pfm_amd64_get_event_next, .event_is_valid = pfm_amd64_event_is_valid, .validate_table = pfm_amd64_validate_table, .get_event_info = pfm_amd64_get_event_info, .get_event_attr_info = pfm_amd64_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs), .get_event_nattrs = pfm_amd64_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64_fam19h_l3.c000066400000000000000000000053331502707512200227110ustar00rootroot00000000000000/* * pfmlib_amd64_fam19h_zen3_l3.c : AMD Fam19h Zen3 L3 PMU * * Copyright 2021 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_amd64_priv.h" #include "events/amd64_events_fam19h_zen3_l3.h" static void display_l3(void *this, pfmlib_event_desc_t *e, void *val) { pfm_amd64_reg_t *reg = val; __pfm_vbprintf("[L3=0x%"PRIx64" event=0x%x umask=0x%x\n", reg->val, reg->l3.event, reg->l3.umask); } const pfmlib_attr_desc_t fam19h_l3_mods[]={ PFM_ATTR_NULL }; pfmlib_pmu_t amd64_fam19h_zen3_l3_support = { .desc = "AMD64 Fam19h Zen3 L3", .name = "amd64_fam19h_zen3_l3", .perf_name = "amd_l3", .pmu = PFM_PMU_AMD64_FAM19H_ZEN3_L3, .pmu_rev = 0, .pme_count = LIBPFM_ARRAY_SIZE(amd64_fam19h_zen3_l3_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .max_encoding = 1, .pe = amd64_fam19h_zen3_l3_pe, .atdesc = fam19h_l3_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .cpu_family = PFM_PMU_AMD64_FAM19H_ZEN3, .pmu_detect = pfm_amd64_family_detect, .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding, PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), .get_event_first = pfm_amd64_get_event_first, .get_event_next = pfm_amd64_get_event_next, .event_is_valid = pfm_amd64_event_is_valid, .validate_table = pfm_amd64_validate_table, .get_event_info = pfm_amd64_get_event_info, .get_event_attr_info = pfm_amd64_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs), .get_event_nattrs = pfm_amd64_get_event_nattrs, .display_reg = display_l3, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64_fam1ah.c000066400000000000000000000045501502707512200223630ustar00rootroot00000000000000/* * pfmlib_amd64_fam1ah.c : AMD64 Fam1Ah core PMU support * * Copyright (C) 2024 Advanced Micro Devices, Inc. All rights reserved. 
* Contributed by Swarup Sahoo * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_amd64_priv.h" #include "events/amd64_events_fam1ah_zen5.h" pfmlib_pmu_t amd64_fam1ah_zen5_support={ .desc = "AMD64 Fam1Ah Zen5", .name = "amd64_fam1ah_zen5", .pmu = PFM_PMU_AMD64_FAM1AH_ZEN5, .pmu_rev = AMD64_FAM1AH, .pme_count = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = AMD64_FAM10H_PLM, .num_cntrs = 6, .max_encoding = 1, .pe = amd64_fam1ah_zen5_pe, .atdesc = amd64_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .cpu_family = PFM_PMU_AMD64_FAM1AH_ZEN5, .pmu_detect = pfm_amd64_family_detect, .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding, PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), .get_event_first = pfm_amd64_get_event_first, .get_event_next = pfm_amd64_get_event_next, .event_is_valid = pfm_amd64_event_is_valid, .validate_table = pfm_amd64_validate_table, .get_event_info = pfm_amd64_get_event_info, .get_event_attr_info = pfm_amd64_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs), .get_event_nattrs = pfm_amd64_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64_fam1ah_l3.c000066400000000000000000000057251502707512200227660ustar00rootroot00000000000000/* * pfmlib_amd64_fam1ah_zen5_l3.c : AMD Fam1Ah Zen3 L3 PMU * * Copyright (C) 2024 Advanced Micro Devices, Inc. All rights reserved. * Contributed by Swarup Sahoo * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_amd64_priv.h" #include "events/amd64_events_fam1ah_zen5_l3.h" static void display_l3(void *this, pfmlib_event_desc_t *e, void *val) { pfm_amd64_reg_t *reg = val; __pfm_vbprintf("[L3=0x%" PRIx64 " event=0x%x umask=0x%x\n", reg->val, reg->l3.event, reg->l3.umask); } const pfmlib_attr_desc_t fam1ah_l3_mods[] = { PFM_ATTR_NULL }; pfmlib_pmu_t amd64_fam1ah_zen5_l3_support = { .desc = "AMD64 Fam1ah Zen5 L3", .name = "amd64_fam1ah_zen5_l3", .perf_name = "amd_l3", .pmu = PFM_PMU_AMD64_FAM1AH_ZEN5_L3, .pmu_rev = 0, .pme_count = LIBPFM_ARRAY_SIZE(amd64_fam1ah_zen5_l3_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .max_encoding = 1, .pe = amd64_fam1ah_zen5_l3_pe, .atdesc = fam1ah_l3_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .cpu_family = PFM_PMU_AMD64_FAM1AH_ZEN5, .pmu_detect = pfm_amd64_family_detect, .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding, PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), .get_event_first = pfm_amd64_get_event_first, .get_event_next = pfm_amd64_get_event_next, .event_is_valid = pfm_amd64_event_is_valid, .validate_table = pfm_amd64_validate_table, .get_event_info = pfm_amd64_get_event_info, .get_event_attr_info = pfm_amd64_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs), .get_event_nattrs = pfm_amd64_get_event_nattrs, .display_reg = display_l3, }; 
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64_k7.c000066400000000000000000000043401502707512200215440ustar00rootroot00000000000000/* * pfmlib_amd64_k7.c : AMD64 K7 * * Copyright (c) 2010 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_amd64_priv.h" #include "events/amd64_events_k7.h" pfmlib_pmu_t amd64_k7_support={ .desc = "AMD64 K7", .name = "amd64_k7", .pmu = PFM_PMU_AMD64_K7, .pmu_rev = AMD64_K7, .pme_count = LIBPFM_ARRAY_SIZE(amd64_k7_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = AMD64_K7_PLM, .num_cntrs = 4, .max_encoding = 1, .pe = amd64_k7_pe, .atdesc = amd64_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .cpu_family = PFM_PMU_AMD64_K7, .pmu_detect = pfm_amd64_family_detect, .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding, PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), .get_event_first = pfm_amd64_get_event_first, .get_event_next = pfm_amd64_get_event_next, .event_is_valid = pfm_amd64_event_is_valid, .validate_table = pfm_amd64_validate_table, .get_event_info = pfm_amd64_get_event_info, .get_event_attr_info = pfm_amd64_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs), .get_event_nattrs = pfm_amd64_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64_k8.c000066400000000000000000000054011502707512200215440ustar00rootroot00000000000000/* * pfmlib_amd64_k8.c : AMD64 K8 * * Copyright (c) 2010 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_amd64_priv.h" #include "events/amd64_events_k8.h" #define DEFINE_K8_REV(d, n, r, pmuid) \ pfmlib_pmu_t amd64_k8_##n##_support={ \ .desc = "AMD64 K8 "#d, \ .name = "amd64_k8_"#n, \ .pmu = pmuid, \ .pmu_rev = r, \ .pme_count = LIBPFM_ARRAY_SIZE(amd64_k8_pe),\ .type = PFM_PMU_TYPE_CORE, \ .supported_plm = AMD64_K7_PLM, \ .num_cntrs = 4, \ .max_encoding = 1, \ .pe = amd64_k8_pe, \ .atdesc = amd64_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ \ .cpu_family = pmuid, \ .pmu_detect = pfm_amd64_family_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_amd64_get_encoding, \ PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), \ .get_event_first = pfm_amd64_get_event_first, \ .get_event_next = pfm_amd64_get_event_next, \ .event_is_valid = pfm_amd64_event_is_valid, \ .validate_table = pfm_amd64_validate_table, \ .get_event_info = pfm_amd64_get_event_info, \ .get_event_attr_info = pfm_amd64_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs),\ .get_event_nattrs = pfm_amd64_get_event_nattrs, \ .get_num_events = pfm_amd64_get_num_events, \ } DEFINE_K8_REV(RevB, revb, AMD64_K8_REV_B, PFM_PMU_AMD64_K8_REVB); DEFINE_K8_REV(RevC, revc, AMD64_K8_REV_C, PFM_PMU_AMD64_K8_REVC); DEFINE_K8_REV(RevD, revd, AMD64_K8_REV_D, PFM_PMU_AMD64_K8_REVD); DEFINE_K8_REV(RevE, reve, AMD64_K8_REV_E, PFM_PMU_AMD64_K8_REVE); DEFINE_K8_REV(RevF, revf, AMD64_K8_REV_F, PFM_PMU_AMD64_K8_REVF); DEFINE_K8_REV(RevG, revg, AMD64_K8_REV_G, 
PFM_PMU_AMD64_K8_REVG); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64_perf_event.c000066400000000000000000000112321502707512200233560ustar00rootroot00000000000000/* * pfmlib_amd64_perf_event.c : perf_event AMD64 functions * * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_amd64_priv.h" /* architecture private */ #include "pfmlib_perf_event_priv.h" static int find_pmu_type_by_name(const char *name) { char filename[PATH_MAX]; FILE *fp; int ret, type; if (!name) return PFM_ERR_NOTSUPP; sprintf(filename, "/sys/bus/event_source/devices/%s/type", name); fp = fopen(filename, "r"); if (!fp) return PFM_ERR_NOTSUPP; ret = fscanf(fp, "%d", &type); if (ret != 1) type = PFM_ERR_NOTSUPP; fclose(fp); return type; } int pfm_amd64_get_perf_encoding(void *this, pfmlib_event_desc_t *e) { pfmlib_pmu_t *pmu = this; pfm_amd64_reg_t reg; struct perf_event_attr *attr = e->os_data; int ret; if (!pmu->get_event_encoding[PFM_OS_NONE]) return PFM_ERR_NOTSUPP; /* * use generic raw encoding function first */ ret = pmu->get_event_encoding[PFM_OS_NONE](this, e); if (ret != PFM_SUCCESS) return ret; if (e->count > 1) { DPRINT("unsupported count=%d\n", e->count); return PFM_ERR_NOTSUPP; } ret = PERF_TYPE_RAW; /* * if specific perf PMU is provided then try to locate it * otherwise assume core PMU and thus type RAW */ if (pmu->perf_name) { /* grab PMU type from sysfs */ ret = find_pmu_type_by_name(pmu->perf_name); if (ret < 0) return ret; } DPRINT("amd64_get_perf_encoding: PMU type=%d\n", ret); attr->type = ret; reg.val = e->codes[0]; /* * suppress the bits which are under the control of perf_events * they will be ignored by the perf tool and the kernel interface * the OS/USR bits are controlled by the attr.exclude_* fields * the EN/INT bits are controlled by the kernel */ reg.sel_en = 0; reg.sel_int = 0; reg.sel_os = 0; reg.sel_usr = 0; reg.sel_guest = 0; reg.sel_host = 0; attr->config = reg.val; return PFM_SUCCESS; } void pfm_amd64_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e) { pfmlib_pmu_t *pmu = this; int i, compact; for (i=0; i < e->npattrs; i++) { compact = 0; /* umasks never conflict */ if (e->pattrs[i].type == PFM_ATTR_UMASK)
continue; /* * with perf_events, u and k are handled at the OS level * via attr.exclude_* fields */ if (e->pattrs[i].ctrl == PFM_ATTR_CTRL_PMU) { if (e->pattrs[i].idx == AMD64_ATTR_U || e->pattrs[i].idx == AMD64_ATTR_K || e->pattrs[i].idx == AMD64_ATTR_H) compact = 1; } if (e->pattrs[i].ctrl == PFM_ATTR_CTRL_PERF_EVENT) { /* No precise mode on AMD */ if (e->pattrs[i].idx == PERF_ATTR_PR) compact = 1; /* older processors do not support hypervisor priv level */ if (e->pattrs[i].idx == PERF_ATTR_H &&!pfm_amd64_supports_virt(pmu)) compact = 1; } /* hardware sampling not supported */ if (e->pattrs[i].idx == PERF_ATTR_HWS) compact = 1; if (compact) { pfmlib_compact_pattrs(e, i); i--; } } } void pfm_amd64_nb_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e) { int i, compact; for (i=0; i < e->npattrs; i++) { compact = 0; /* umasks never conflict */ if (e->pattrs[i].type == PFM_ATTR_UMASK) continue; /* * no perf_events attr is supported by AMD64 Northbridge PMU * sampling is not supported */ if (e->pattrs[i].ctrl == PFM_ATTR_CTRL_PERF_EVENT) { compact = 1; } /* hardware sampling not supported on AMD */ if (e->pattrs[i].idx == PERF_ATTR_HWS) compact = 1; if (compact) { pfmlib_compact_pattrs(e, i); i--; } } } papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64_priv.h000066400000000000000000000211251502707512200222100ustar00rootroot00000000000000/* * Copyright (c) 2004-2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
*/ #ifndef __PFMLIB_AMD64_PRIV_H__ #define __PFMLIB_AMD64_PRIV_H__ #define AMD64_MAX_GRP 4 /* must be < 32 (int) */ typedef struct { const char *uname; /* unit mask name */ const char *udesc; /* event/umask description */ unsigned int ucode; /* unit mask code */ unsigned int uflags; /* unit mask flags */ unsigned int grpid; /* unit mask group id */ } amd64_umask_t; typedef struct { const char *name; /* event name */ const char *desc; /* event description */ const amd64_umask_t *umasks;/* list of umasks */ unsigned int code; /* event code */ unsigned int numasks;/* number of umasks */ unsigned int flags; /* flags */ unsigned int modmsk; /* modifiers bitmask */ unsigned int ngrp; /* number of unit masks groups */ } amd64_entry_t; /* * we keep an internal revision type to avoid * dealing with arbitrarily large pfm_pmu_t * which would not fit into the 8 bits reserved * in amd64_entry_t.flags or amd64_umask_t.flags */ typedef enum { AMD64_CPU_UN = 0, AMD64_K7, AMD64_K8_REV_B, AMD64_K8_REV_C, AMD64_K8_REV_D, AMD64_K8_REV_E, AMD64_K8_REV_F, AMD64_K8_REV_G, AMD64_FAM10H_REV_B, AMD64_FAM10H_REV_C, AMD64_FAM10H_REV_D, AMD64_FAM11H, AMD64_FAM12H, /* first with Host/Guest filtering */ AMD64_FAM14H_REV_B, AMD64_FAM15H, AMD64_FAM16H, AMD64_FAM17H, AMD64_FAM19H, AMD64_FAM1AH, } amd64_rev_t; #define AMD64_FAM10H AMD64_FAM10H_REV_B typedef struct { pfm_pmu_t revision; int family; /* 0 means nothing detected yet */ int model; int stepping; } pfm_amd64_config_t; extern pfm_amd64_config_t pfm_amd64_cfg; /* * flags values (bottom 8 bits only) * bits 00-07: flags * bits 08-15: from revision * bits 16-23: till revision */ #define AMD64_FROM_REV(rev) ((rev)<<8) #define AMD64_TILL_REV(rev) ((rev)<<16) #define AMD64_NOT_SUPP 0x1ff00 #define AMD64_FL_NCOMBO 0x01 /* unit mask can be combined */ #define AMD64_FL_IBSFE 0x02 /* IBS fetch */ #define AMD64_FL_IBSOP 0x04 /* IBS op */ #define AMD64_FL_DFL 0x08 /* unit mask is default choice */ #define AMD64_FL_OMIT 0x10 /* umask can be omitted */ 
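The flags comment above describes a from/till revision window packed into bits 8-15 and 16-23. The following is an illustrative, self-contained sketch of how such a window can be checked against a detected PMU revision; `amd64_event_supported()` and the stand-in revision codes are hypothetical and not part of libpfm4's API.

```c
/*
 * Illustrative sketch only -- not part of the original header.
 * Shows how a from/till revision window packed into bits 8-15 and 16-23
 * of an event's flags (per the AMD64_FROM_REV()/AMD64_TILL_REV()
 * convention above) can be tested against a detected PMU revision.
 * amd64_event_supported() is a hypothetical helper, not a libpfm4 API.
 */
#define AMD64_FROM_REV(rev)	((rev)<<8)
#define AMD64_TILL_REV(rev)	((rev)<<16)

/* stand-in revision codes, for the example only */
enum { K8_REV_C = 3, K8_REV_D = 4, K8_REV_E = 5, FAM10H_REV_B = 8 };

int
amd64_event_supported(unsigned int flags, int rev)
{
	int from = (flags >>  8) & 0xff; /* first revision providing the event */
	int till = (flags >> 16) & 0xff; /* last revision, 0 = no upper bound */

	if (from && rev < from)
		return 0; /* event introduced after this revision */
	if (till && rev > till)
		return 0; /* event removed before this revision */
	return 1;
}
```

Under this sketch, an event tagged with `AMD64_FROM_REV(K8_REV_D) | AMD64_TILL_REV(K8_REV_E)` would be reported as supported on K8 revisions D and E only, and rejected on earlier or later revisions.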
#define AMD64_FL_TILL_K8_REV_C AMD64_TILL_REV(AMD64_K8_REV_C) #define AMD64_FL_K8_REV_D AMD64_FROM_REV(AMD64_K8_REV_D) #define AMD64_FL_K8_REV_E AMD64_FROM_REV(AMD64_K8_REV_E) #define AMD64_FL_TILL_K8_REV_E AMD64_TILL_REV(AMD64_K8_REV_E) #define AMD64_FL_K8_REV_F AMD64_FROM_REV(AMD64_K8_REV_F) #define AMD64_FL_TILL_FAM10H_REV_B AMD64_TILL_REV(AMD64_FAM10H_REV_B) #define AMD64_FL_FAM10H_REV_C AMD64_FROM_REV(AMD64_FAM10H_REV_C) #define AMD64_FL_TILL_FAM10H_REV_C AMD64_TILL_REV(AMD64_FAM10H_REV_C) #define AMD64_FL_FAM10H_REV_D AMD64_FROM_REV(AMD64_FAM10H_REV_D) #define AMD64_ATTR_K 0 #define AMD64_ATTR_U 1 #define AMD64_ATTR_E 2 #define AMD64_ATTR_I 3 #define AMD64_ATTR_C 4 #define AMD64_ATTR_H 5 #define AMD64_ATTR_G 6 #define _AMD64_ATTR_U (1 << AMD64_ATTR_U) #define _AMD64_ATTR_K (1 << AMD64_ATTR_K) #define _AMD64_ATTR_I (1 << AMD64_ATTR_I) #define _AMD64_ATTR_E (1 << AMD64_ATTR_E) #define _AMD64_ATTR_C (1 << AMD64_ATTR_C) #define _AMD64_ATTR_H (1 << AMD64_ATTR_H) #define _AMD64_ATTR_G (1 << AMD64_ATTR_G) #define AMD64_BASIC_ATTRS \ (_AMD64_ATTR_I|_AMD64_ATTR_E|_AMD64_ATTR_C|_AMD64_ATTR_U|_AMD64_ATTR_K) #define AMD64_K8_ATTRS (AMD64_BASIC_ATTRS) #define AMD64_FAM10H_ATTRS (AMD64_BASIC_ATTRS|_AMD64_ATTR_H|_AMD64_ATTR_G) #define AMD64_FAM12H_ATTRS AMD64_FAM10H_ATTRS #define AMD64_FAM14H_ATTRS AMD64_FAM10H_ATTRS #define AMD64_FAM15H_ATTRS AMD64_FAM10H_ATTRS #define AMD64_FAM17H_ATTRS AMD64_FAM10H_ATTRS #define AMD64_FAM19H_ATTRS AMD64_FAM10H_ATTRS #define AMD64_FAM1AH_ATTRS AMD64_FAM10H_ATTRS #define AMD64_FAM10H_PLM (PFM_PLM0|PFM_PLM3|PFM_PLMH) #define AMD64_K7_PLM (PFM_PLM0|PFM_PLM3) /* * AMD64 MSR definitions */ typedef union { uint64_t val; /* complete register value */ struct { uint64_t sel_event_mask:8; /* event mask */ uint64_t sel_unit_mask:8; /* unit mask */ uint64_t sel_usr:1; /* user level */ uint64_t sel_os:1; /* system level */ uint64_t sel_edge:1; /* edge detect */ uint64_t sel_pc:1; /* pin control */ uint64_t sel_int:1; /* enable APIC intr */ uint64_t
sel_res1:1; /* reserved */ uint64_t sel_en:1; /* enable */ uint64_t sel_inv:1; /* invert counter mask */ uint64_t sel_cnt_mask:8; /* counter mask */ uint64_t sel_event_mask2:4; /* 10h only: event mask [11:8] */ uint64_t sel_res2:4; /* reserved */ uint64_t sel_guest:1; /* 10h only: guest only counter */ uint64_t sel_host:1; /* 10h only: host only counter */ uint64_t sel_res3:22; /* reserved */ } perfsel; struct { uint64_t maxcnt:16; uint64_t cnt:16; uint64_t lat:16; uint64_t en:1; uint64_t val:1; uint64_t comp:1; uint64_t icmiss:1; uint64_t phyaddrvalid:1; uint64_t l1tlbpgsz:2; uint64_t l1tlbmiss:1; uint64_t l2tlbmiss:1; uint64_t randen:1; uint64_t reserved:6; } ibsfetch; struct { uint64_t maxcnt:16; uint64_t reserved1:1; uint64_t en:1; uint64_t val:1; uint64_t reserved2:45; } ibsop; struct { /* Zen3 L3 */ uint64_t event:8; /* event mask */ uint64_t umask:8; /* unit mask */ uint64_t reserved1:6; /* reserved */ uint64_t en:1; /* enable */ uint64_t reserved2:19; /* reserved */ uint64_t core_id:3; /* Core ID */ uint64_t reserved3:1; /* reserved */ uint64_t en_all_slices:1; /* enable all slices */ uint64_t en_all_cores:1; /* enable all cores */ uint64_t slice_id:3; /* Slice ID */ uint64_t reserved4:5; /* reserved */ uint64_t thread_id:4; /* reserved */ uint64_t reserved5:4; /* reserved */ } l3; } pfm_amd64_reg_t; /* MSR 0xc001000-0xc001003 */ /* let's define some handy shortcuts! 
*/ #define sel_event_mask perfsel.sel_event_mask #define sel_unit_mask perfsel.sel_unit_mask #define sel_usr perfsel.sel_usr #define sel_os perfsel.sel_os #define sel_edge perfsel.sel_edge #define sel_pc perfsel.sel_pc #define sel_int perfsel.sel_int #define sel_en perfsel.sel_en #define sel_inv perfsel.sel_inv #define sel_cnt_mask perfsel.sel_cnt_mask #define sel_event_mask2 perfsel.sel_event_mask2 #define sel_guest perfsel.sel_guest #define sel_host perfsel.sel_host extern int pfm_amd64_get_encoding(void *this, pfmlib_event_desc_t *e); extern int pfm_amd64_get_event_first(void *this); extern int pfm_amd64_get_event_next(void *this, int idx); extern int pfm_amd64_event_is_valid(void *this, int idx); extern int pfm_amd64_get_event_attr_info(void *this, int idx, int attr_idx, pfmlib_event_attr_info_t *info); extern int pfm_amd64_get_event_info(void *this, int idx, pfm_event_info_t *info); extern int pfm_amd64_validate_table(void *this, FILE *fp); extern int pfm_amd64_detect(void *this); extern const pfmlib_attr_desc_t amd64_mods[]; extern unsigned int pfm_amd64_get_event_nattrs(void *this, int pidx); extern int pfm_amd64_get_num_events(void *this); extern int pfm_amd64_get_perf_encoding(void *this, pfmlib_event_desc_t *e); extern void pfm_amd64_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e); extern void pfm_amd64_nb_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e); extern int pfm_amd64_family_detect(void *this); static inline int pfm_amd64_supports_virt(pfmlib_pmu_t *pmu) { return pmu->pmu_rev >= AMD64_FAM10H && pmu->pmu_rev != AMD64_FAM11H; } #endif /* __PFMLIB_AMD64_PRIV_H__ */ papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_amd64_rapl.c000066400000000000000000000067671502707512200222000ustar00rootroot00000000000000/* * pfmlib_amd64_rapl.c : AMD RAPL PMU * * Copyright 2021 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files 
(the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * AMD RAPL PMU (AMD Zen2) */ /* private headers */ #include "pfmlib_priv.h" /* * for now, we reuse the x86 table entry format and callback to avoid duplicating * code. We may revisit this later on */ #include "pfmlib_amd64_priv.h" extern pfmlib_pmu_t amd64_rapl_support; static const amd64_entry_t amd64_rapl_zen2[]={ { .name = "RAPL_ENERGY_PKG", .desc = "Number of Joules consumed by all cores and Last level cache on the package. 
Unit is 2^-32 Joules", .code = 0x2, } }; static int pfm_amd64_rapl_detect(void *this) { int ret, rev; ret = pfm_amd64_detect(this); if (ret != PFM_SUCCESS) return ret; rev = pfm_amd64_cfg.revision; switch(rev) { case PFM_PMU_AMD64_FAM17H_ZEN2: case PFM_PMU_AMD64_FAM19H_ZEN3: case PFM_PMU_AMD64_FAM19H_ZEN4: ret = PFM_SUCCESS; break; default: ret = PFM_ERR_NOTSUPP; } return ret; } static int pfm_amd64_rapl_get_encoding(void *this, pfmlib_event_desc_t *e) { const amd64_entry_t *pe; pe = this_pe(this); e->fstr[0] = '\0'; e->codes[0] = pe[e->event].code; e->count = 1; evt_strcat(e->fstr, "%s", pe[e->event].name); __pfm_vbprintf("[0x%"PRIx64" event=0x%x] %s\n", e->codes[0], e->codes[0], e->fstr); return PFM_SUCCESS; } /* * number modifiers for RAPL * define an empty modifier to avoid firing the * sanity pfm_amd64_validate_table(). We are * using this function to avoid duplicating code. */ static const pfmlib_attr_desc_t amd64_rapl_mods[]= { { 0, } }; pfmlib_pmu_t amd64_rapl_support={ .desc = "AMD64 RAPL", .name = "amd64_rapl", .perf_name = "power", .pmu = PFM_PMU_AMD64_RAPL, .pme_count = LIBPFM_ARRAY_SIZE(amd64_rapl_zen2), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 0, .num_fixed_cntrs = 3, .max_encoding = 1, .pe = amd64_rapl_zen2, .pmu_detect = pfm_amd64_rapl_detect, .atdesc = amd64_rapl_mods, .get_event_encoding[PFM_OS_NONE] = pfm_amd64_rapl_get_encoding, PFMLIB_ENCODE_PERF(pfm_amd64_get_perf_encoding), .get_event_first = pfm_amd64_get_event_first, .get_event_next = pfm_amd64_get_event_next, .event_is_valid = pfm_amd64_event_is_valid, .validate_table = pfm_amd64_validate_table, .get_event_info = pfm_amd64_get_event_info, .get_event_attr_info = pfm_amd64_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_amd64_perf_validate_pattrs), .get_event_nattrs = pfm_amd64_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_arm.c000066400000000000000000000216641502707512200210170ustar00rootroot00000000000000/* * pfmlib_arm.c : support for ARM chips * * Copyright (c) 2010 
University of Tennessee * Contributed by Vince Weaver * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * */ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_arm_priv.h" const pfmlib_attr_desc_t arm_mods[]={ PFM_ATTR_B("k", "monitor at kernel level"), PFM_ATTR_B("u", "monitor at user level"), PFM_ATTR_B("hv", "monitor in hypervisor"), PFM_ATTR_NULL /* end-marker to avoid exporting number of entries */ }; pfm_arm_config_t pfm_arm_cfg = { .init_cpuinfo_done = 0, }; #define MAX_ARM_CPUIDS 8 static arm_cpuid_t arm_cpuids[MAX_ARM_CPUIDS]; static int num_arm_cpuids; static int pfmlib_find_arm_cpuid(arm_cpuid_t *attr, arm_cpuid_t *match_attr) { int i; if (attr == NULL) return PFM_ERR_NOTFOUND; for (i=0; i < num_arm_cpuids; i++) { #if 0 /* * disabled due to issues with expected arch vs. 
reported * arch by the Linux kernel cpuinfo */ if (arm_cpuids[i].arch != attr->arch) continue; #endif if (arm_cpuids[i].impl != attr->impl) continue; if (arm_cpuids[i].part != attr->part) continue; if (match_attr) *match_attr = arm_cpuids[i]; return PFM_SUCCESS; } return PFM_ERR_NOTSUPP; } #ifdef CONFIG_PFMLIB_OS_LINUX /* * Function populates the arm_cpuids[] table with each unique * core identification found on the host. In the case of hybrids * that number is greater than 1 */ static int pfmlib_init_cpuids(void) { arm_cpuid_t attr = {0, }; FILE *fp = NULL; int ret = -1; size_t buf_len = 0; char *p, *value = NULL; char *buffer = NULL; int nattrs = 0; if (pfm_arm_cfg.init_cpuinfo_done == 1) return PFM_SUCCESS; fp = fopen(pfm_cfg.proc_cpuinfo, "r"); if (fp == NULL) { DPRINT("pfmlib_init_cpuids: cannot open %s\n", pfm_cfg.proc_cpuinfo); return PFM_ERR_NOTFOUND; } while(pfmlib_getl(&buffer, &buf_len, fp) != -1){ if (nattrs == ARM_NUM_ATTR_FIELDS) { if (pfmlib_find_arm_cpuid(&attr, NULL) != PFM_SUCCESS) { /* must add */ if (num_arm_cpuids == MAX_ARM_CPUIDS) { DPRINT("pfmlib_init_cpuids: too many cpuids num_arm_cpuids=%d\n", num_arm_cpuids); ret = PFM_ERR_TOOMANY; goto error; } arm_cpuids[num_arm_cpuids++] = attr; __pfm_vbprintf("Detected ARM CPU impl=0x%x arch=%d part=0x%x\n", attr.impl, attr.arch, attr.part); } nattrs = 0; } /* skip blank lines */ if (*buffer == '\n' || *buffer == '\r') continue; p = strchr(buffer, ':'); if (p == NULL) continue; /* * p+2: +1 = space, +2 = first character * strlen()-1 gets rid of \n */ *p = '\0'; value = p+2; value[strlen(value)-1] = '\0'; if (!strncmp("CPU implementer", buffer, 15)) { attr.impl = strtoul(value, NULL, 0); nattrs++; continue; } if (!strncmp("CPU architecture", buffer, 16)) { attr.arch = strtoul(value, NULL, 0); nattrs++; continue; } if (!strncmp("CPU part", buffer, 8)) { attr.part = strtoul(value, NULL, 0); nattrs++; continue; } } ret = PFM_SUCCESS; DPRINT("num_arm_cpuids=%d\n", num_arm_cpuids); error: for (nattrs = 0;
nattrs < num_arm_cpuids; nattrs++) { DPRINT("cpuids[%d] = impl=0x%x arch=%d part=0x%x\n", nattrs, arm_cpuids[nattrs].impl, arm_cpuids[nattrs].arch, arm_cpuids[nattrs].part); } pfm_arm_cfg.init_cpuinfo_done = 1; free(buffer); fclose(fp); return ret; } #else static int pfmlib_init_cpuids(void) { return -1; } #endif static int arm_num_mods(void *this, int idx) { const arm_entry_t *pe = this_pe(this); unsigned int mask; mask = pe[idx].modmsk; return pfmlib_popcnt(mask); } static inline int arm_attr2mod(void *this, int pidx, int attr_idx) { const arm_entry_t *pe = this_pe(this); size_t x; int n; n = attr_idx; pfmlib_for_each_bit(x, pe[pidx].modmsk) { if (n == 0) break; n--; } return x; } static void pfm_arm_display_reg(void *this, pfmlib_event_desc_t *e, pfm_arm_reg_t reg) { __pfm_vbprintf("[0x%x] %s\n", reg.val, e->fstr); } int pfm_arm_detect(arm_cpuid_t *attr, arm_cpuid_t *match_attr) { int ret; ret = pfmlib_init_cpuids(); if (ret != PFM_SUCCESS) return PFM_ERR_NOTSUPP; return pfmlib_find_arm_cpuid(attr, match_attr); } int pfm_arm_get_encoding(void *this, pfmlib_event_desc_t *e) { const arm_entry_t *pe = this_pe(this); pfmlib_event_attr_info_t *a; pfm_arm_reg_t reg; unsigned int plm = 0; int i, idx, has_plm = 0; reg.val = pe[e->event].code; for (i = 0; i < e->nattrs; i++) { a = attr(e, i); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type > PFM_ATTR_UMASK) { uint64_t ival = e->attrs[i].ival; switch(a->idx) { case ARM_ATTR_U: /* USR */ if (ival) plm |= PFM_PLM3; has_plm = 1; break; case ARM_ATTR_K: /* OS */ if (ival) plm |= PFM_PLM0; has_plm = 1; break; case ARM_ATTR_HV: /* HYPERVISOR */ if (ival) plm |= PFM_PLMH; has_plm = 1; break; default: return PFM_ERR_ATTR; } } } if (arm_has_plm(this, e)) { if (!has_plm) plm = e->dfl_plm; reg.evtsel.excl_pl1 = !(plm & PFM_PLM0); reg.evtsel.excl_usr = !(plm & PFM_PLM3); reg.evtsel.excl_hyp = !(plm & PFM_PLMH); } evt_strcat(e->fstr, "%s", pe[e->event].name); e->codes[0] = reg.val; e->count = 1; for (i = 0; i < e->npattrs; 
i++) { if (e->pattrs[i].ctrl != PFM_ATTR_CTRL_PMU) continue; if (e->pattrs[i].type == PFM_ATTR_UMASK) continue; idx = e->pattrs[i].idx; switch(idx) { case ARM_ATTR_K: evt_strcat(e->fstr, ":%s=%lu", arm_mods[idx].name, !reg.evtsel.excl_pl1); break; case ARM_ATTR_U: evt_strcat(e->fstr, ":%s=%lu", arm_mods[idx].name, !reg.evtsel.excl_usr); break; case ARM_ATTR_HV: evt_strcat(e->fstr, ":%s=%lu", arm_mods[idx].name, !reg.evtsel.excl_hyp); break; } } pfm_arm_display_reg(this, e, reg); return PFM_SUCCESS; } int pfm_arm_get_event_first(void *this) { return 0; } int pfm_arm_get_event_next(void *this, int idx) { pfmlib_pmu_t *p = this; if (idx >= (p->pme_count-1)) return -1; return idx+1; } int pfm_arm_event_is_valid(void *this, int pidx) { pfmlib_pmu_t *p = this; return pidx >= 0 && pidx < p->pme_count; } int pfm_arm_validate_table(void *this, FILE *fp) { pfmlib_pmu_t *pmu = this; const arm_entry_t *pe = this_pe(this); int i, j, error = 0; for(i=0; i < pmu->pme_count; i++) { if (!pe[i].name) { fprintf(fp, "pmu: %s event%d: :: no name (prev event was %s)\n", pmu->name, i, i > 1 ? pe[i-1].name : "??"); error++; } if (!pe[i].desc) { fprintf(fp, "pmu: %s event%d: %s :: no description\n", pmu->name, i, pe[i].name); error++; } for(j = i+1; j < pmu->pme_count; j++) { if (pe[i].code == pe[j].code && !(pe[j].equiv || pe[i].equiv)) { fprintf(fp, "pmu: %s events %s and %s have the same code 0x%x\n", pmu->name, pe[i].name, pe[j].name, pe[i].code); error++; } } } return error ? 
PFM_ERR_INVAL : PFM_SUCCESS; } int pfm_arm_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info) { int idx; idx = arm_attr2mod(this, pidx, attr_idx); info->name = arm_mods[idx].name; info->desc = arm_mods[idx].desc; info->type = arm_mods[idx].type; info->code = idx; info->is_dfl = 0; info->equiv = NULL; info->ctrl = PFM_ATTR_CTRL_PMU; info->idx = idx; /* namespace specific index */ info->dfl_val64 = 0; info->is_precise = 0; info->support_hw_smpl = 0; return PFM_SUCCESS; } unsigned int pfm_arm_get_event_nattrs(void *this, int pidx) { return arm_num_mods(this, pidx); } int pfm_arm_get_event_info(void *this, int idx, pfm_event_info_t *info) { pfmlib_pmu_t *pmu = this; const arm_entry_t *pe = this_pe(this); info->name = pe[idx].name; info->desc = pe[idx].desc; info->code = pe[idx].code; info->equiv = pe[idx].equiv; info->idx = idx; /* private index */ info->pmu = pmu->pmu; info->is_precise = 0; info->support_hw_smpl = 0; /* no attributes defined for ARM yet */ info->nattrs = 0; return PFM_SUCCESS; } papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_arm_armv6.c000066400000000000000000000046271502707512200221320ustar00rootroot00000000000000/* * pfmlib_arm_armv6.c : support for ARMv6 chips * * Copyright (c) 2013 by Vince Weaver * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * */ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_arm_priv.h" #include "events/arm_1176_events.h" /* event tables */ static int pfm_arm_detect_1176(void *this) { /* ARM 1176 */ arm_cpuid_t attr = { .impl = 0x41, .arch = 7, .part = 0xb76 }; return pfm_arm_detect(&attr, NULL); } /* ARM1176 support */ pfmlib_pmu_t arm_1176_support={ .desc = "ARM1176", .name = "arm_1176", .perf_name = "armv6_1176", .pmu = PFM_PMU_ARM_1176, .pme_count = LIBPFM_ARRAY_SIZE(arm_1176_pe), .type = PFM_PMU_TYPE_CORE, .pe = arm_1176_pe, .pmu_detect = pfm_arm_detect_1176, .max_encoding = 1, .num_cntrs = 2, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_arm_armv7_pmuv1.c000066400000000000000000000160231502707512200232540ustar00rootroot00000000000000/* * pfmlib_arm_armv7_pmuv1.c : support for ARMV7 chips * * Copyright (c) 2010 University of Tennessee * Contributed by Vince Weaver * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and 
associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * */ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_arm_priv.h" #include "events/arm_cortex_a7_events.h" /* event tables */ #include "events/arm_cortex_a8_events.h" #include "events/arm_cortex_a9_events.h" #include "events/arm_cortex_a15_events.h" #include "events/arm_qcom_krait_events.h" static int pfm_arm_detect_cortex_a7(void *this) { /* ARM Cortex A7 */ arm_cpuid_t attr = { .impl = 0x41, .arch = 7, .part = 0xc07 }; return pfm_arm_detect(&attr, NULL); } static int pfm_arm_detect_cortex_a8(void *this) { /* ARM Cortex A8 */ arm_cpuid_t attr = { .impl = 0x41, .arch = 7, .part = 0xc08 }; return pfm_arm_detect(&attr, NULL); } static int pfm_arm_detect_cortex_a9(void *this) { /* ARM Cortex A9 */ arm_cpuid_t attr = { .impl = 0x41, .arch = 7, .part = 0xc09 }; return pfm_arm_detect(&attr, NULL); } static int pfm_arm_detect_cortex_a15(void *this) { /* ARM Cortex A15 */ arm_cpuid_t attr = { .impl = 0x41, .arch = 7, .part = 0xc0f }; return 
pfm_arm_detect(&attr, NULL); } static int pfm_arm_detect_krait(void *this) { /* Qualcomm Krait */ /* Check that [15:10] of midr is 0x01 which */ /* indicates Krait rather than Scorpion CPU */ /* match_attr.part is (midr>>4)&0xfff */ /* if (pfm_arm_cfg.part >> 6 == 0x1) { */ /* return PFM_SUCCESS; */ arm_cpuid_t attr = { .impl = 0x51, .arch = 7, .part = 1 << 6 }; arm_cpuid_t match_attr; int ret; ret = pfm_arm_detect(&attr, &match_attr); if (ret != PFM_SUCCESS) return ret; if ((match_attr.part >> 6) == 0x1) return PFM_SUCCESS; return PFM_ERR_NOTFOUND; } /* Cortex A7 support */ pfmlib_pmu_t arm_cortex_a7_support={ .desc = "ARM Cortex A7", .name = "arm_ac7", .pmu = PFM_PMU_ARM_CORTEX_A7, .pme_count = LIBPFM_ARRAY_SIZE(arm_cortex_a7_pe), .type = PFM_PMU_TYPE_CORE, .pe = arm_cortex_a7_pe, .pmu_detect = pfm_arm_detect_cortex_a7, .max_encoding = 1, .num_cntrs = 4, .supported_plm = ARMV7_A7_PLM, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; /* Cortex A8 support */ pfmlib_pmu_t arm_cortex_a8_support={ .desc = "ARM Cortex A8", .name = "arm_ac8", .pmu = PFM_PMU_ARM_CORTEX_A8, .pme_count = LIBPFM_ARRAY_SIZE(arm_cortex_a8_pe), .type = PFM_PMU_TYPE_CORE, .pe = arm_cortex_a8_pe, .pmu_detect = pfm_arm_detect_cortex_a8, .max_encoding = 1, .num_cntrs = 2, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, 
.get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; /* Cortex A9 support */ pfmlib_pmu_t arm_cortex_a9_support={ .desc = "ARM Cortex A9", .name = "arm_ac9", .pmu = PFM_PMU_ARM_CORTEX_A9, .pme_count = LIBPFM_ARRAY_SIZE(arm_cortex_a9_pe), .type = PFM_PMU_TYPE_CORE, .pe = arm_cortex_a9_pe, .pmu_detect = pfm_arm_detect_cortex_a9, .max_encoding = 1, .num_cntrs = 2, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; /* Cortex A15 support */ pfmlib_pmu_t arm_cortex_a15_support={ .desc = "ARM Cortex A15", .name = "arm_ac15", .pmu = PFM_PMU_ARM_CORTEX_A15, .pme_count = LIBPFM_ARRAY_SIZE(arm_cortex_a15_pe), .type = PFM_PMU_TYPE_CORE, .pe = arm_cortex_a15_pe, .pmu_detect = pfm_arm_detect_cortex_a15, .max_encoding = 1, .num_cntrs = 6, .supported_plm = ARMV7_A15_PLM, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; /* Qualcomm Krait support */ pfmlib_pmu_t arm_qcom_krait_support={ .desc = "ARM Qualcomm Krait", .name = "qcom_krait", .pmu = PFM_PMU_ARM_QCOM_KRAIT, .pme_count = LIBPFM_ARRAY_SIZE(arm_qcom_krait_pe), .type = PFM_PMU_TYPE_CORE, .pe = 
arm_qcom_krait_pe, .pmu_detect = pfm_arm_detect_krait, .max_encoding = 1, .num_cntrs = 5, .supported_plm = ARMV7_A15_PLM, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_arm_armv8.c000066400000000000000000000336371502707512200221370ustar00rootroot00000000000000/* * pfmlib_arm_armv8.c : support for ARMv8 processors * * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. * Contributed by John Linford * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 */
#include <sys/types.h>
#include <string.h>
#include <stdlib.h>

/* private headers */
#include "pfmlib_priv.h"			/* library private */
#include "pfmlib_arm_priv.h"

#include "events/arm_cortex_a57_events.h"	/* A57 event tables */
#include "events/arm_cortex_a53_events.h"	/* A53 event tables */
#include "events/arm_cortex_a55_events.h"	/* A55 event tables */
#include "events/arm_cortex_a76_events.h"	/* A76 event tables */
#include "events/arm_xgene_events.h"		/* Applied Micro X-Gene tables */
#include "events/arm_cavium_tx2_events.h"	/* Marvell ThunderX2 tables */
#include "events/arm_fujitsu_a64fx_events.h"	/* Fujitsu A64FX PMU tables */
#include "events/arm_neoverse_n1_events.h"	/* ARM Neoverse N1 table */
#include "events/arm_neoverse_v1_events.h"	/* Arm Neoverse V1 table */
#include "events/arm_hisilicon_kunpeng_events.h" /* HiSilicon Kunpeng PMU tables */

static int
pfm_arm_detect_n1(void *this)
{
	/* Neoverse N1 */
	arm_cpuid_t attr = { .impl = 0x41, .arch = 8, .part = 0xd0c };
	return pfm_arm_detect(&attr, NULL);
}

static int
pfm_arm_detect_v1(void *this)
{
	/* Neoverse V1 */
	arm_cpuid_t attr = { .impl = 0x41, .arch = 8, .part = 0xd40 };
	return pfm_arm_detect(&attr, NULL);
}

static int
pfm_arm_detect_cortex_a57(void *this)
{
	/* Cortex A57 */
	arm_cpuid_t attr = { .impl = 0x41, .arch = 8, .part = 0xd07 };
	return pfm_arm_detect(&attr, NULL);
}

static int
pfm_arm_detect_cortex_a72(void *this)
{
	/* Cortex A72 */
	arm_cpuid_t attr = { .impl = 0x41, .arch = 8, .part = 0xd08 };
	return pfm_arm_detect(&attr, NULL);
}

static int
pfm_arm_detect_cortex_a53(void *this)
{
	/* Cortex A53 */
	arm_cpuid_t attr = { .impl = 0x41, .arch = 8, .part = 0xd03 };
	return pfm_arm_detect(&attr, NULL);
}

static int
pfm_arm_detect_cortex_a55(void *this)
{
	/* Cortex A55 */
arm_cpuid_t attr = { .impl = 0x41, .arch = 8, .part = 0xd05 }; return pfm_arm_detect(&attr, NULL); } static int pfm_arm_detect_cortex_a76(void *this) { /* Cortex A76 */ arm_cpuid_t attr = { .impl = 0x41, .arch = 8, .part = 0xd0b }; return pfm_arm_detect(&attr, NULL); } static int pfm_arm_detect_xgene(void *this) { /* Applied Micro X-Gene */ arm_cpuid_t attr = { .impl = 0x50, .arch = 8, .part = 0x0 }; return pfm_arm_detect(&attr, NULL); } static int pfm_arm_detect_thunderx2(void *this) { /* Broadcom Thunder X2*/ arm_cpuid_t attr = { .impl = 0x42, .arch = 8, .part = 0x516 }; int ret; ret = pfm_arm_detect(&attr, NULL); if (ret == PFM_SUCCESS) return ret; /* Cavium Thunder X2 */ attr.impl = 0x43; attr.part = 0xaf; return pfm_arm_detect(&attr, NULL); } static int pfm_arm_detect_a64fx(void *this) { /* Fujitsu a64fx */ arm_cpuid_t attr = { .impl = 0x46, .arch = 8, .part = 0x001 }; return pfm_arm_detect(&attr, NULL); } static int pfm_arm_detect_hisilicon_kunpeng(void *this) { /* Hisilicon Kunpeng */ arm_cpuid_t attr = { .impl = 0x48, .arch = 8, .part = 0xd01 }; return pfm_arm_detect(&attr, NULL); } /* ARM Cortex A57 support */ pfmlib_pmu_t arm_cortex_a57_support={ .desc = "ARM Cortex A57", .name = "arm_ac57", .perf_name = "armv8_cortex_a57,armv8_pmuv3_0,armv8_pmuv3", .pmu = PFM_PMU_ARM_CORTEX_A57, .pme_count = LIBPFM_ARRAY_SIZE(arm_cortex_a57_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = ARMV8_PLM, .pe = arm_cortex_a57_pe, .pmu_detect = pfm_arm_detect_cortex_a57, .max_encoding = 1, .num_cntrs = 6, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; /* ARM 
Cortex A72 support */ pfmlib_pmu_t arm_cortex_a72_support={ .desc = "ARM Cortex A72", .name = "arm_ac72", .perf_name = "armv8_cortex_a72,armv8_pmuv3_0", .pmu = PFM_PMU_ARM_CORTEX_A72, .pme_count = LIBPFM_ARRAY_SIZE(arm_cortex_a57_pe), /* shared with a57 */ .type = PFM_PMU_TYPE_CORE, .supported_plm = ARMV8_PLM, .pe = arm_cortex_a57_pe, /* shared with a57 */ .pmu_detect = pfm_arm_detect_cortex_a72, .max_encoding = 1, .num_cntrs = 6, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; /* ARM Cortex A53 support */ pfmlib_pmu_t arm_cortex_a53_support={ .desc = "ARM Cortex A53", .name = "arm_ac53", .perf_name = "armv8_cortex_a53", .pmu = PFM_PMU_ARM_CORTEX_A53, .pme_count = LIBPFM_ARRAY_SIZE(arm_cortex_a53_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = ARMV8_PLM, .pe = arm_cortex_a53_pe, .pmu_detect = pfm_arm_detect_cortex_a53, .max_encoding = 1, .num_cntrs = 6, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; /* ARM Cortex A55 support */ pfmlib_pmu_t arm_cortex_a55_support={ .desc = "ARM Cortex A55", .name = "arm_ac55", .perf_name = "armv8_cortex_a55", .pmu = PFM_PMU_ARM_CORTEX_A55, .pme_count = LIBPFM_ARRAY_SIZE(arm_cortex_a55_pe), .type 
= PFM_PMU_TYPE_CORE, .supported_plm = ARMV8_PLM, .pe = arm_cortex_a55_pe, .pmu_detect = pfm_arm_detect_cortex_a55, .max_encoding = 1, .num_cntrs = 6, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; /* ARM Cortex A76 support */ pfmlib_pmu_t arm_cortex_a76_support={ .desc = "ARM Cortex A76", .name = "arm_ac76", .perf_name = "armv8_cortex_a76", .pmu = PFM_PMU_ARM_CORTEX_A76, .pme_count = LIBPFM_ARRAY_SIZE(arm_cortex_a76_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = ARMV8_PLM, .pe = arm_cortex_a76_pe, .pmu_detect = pfm_arm_detect_cortex_a76, .max_encoding = 1, .num_cntrs = 6, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; /* Applied Micro X-Gene support */ pfmlib_pmu_t arm_xgene_support={ .desc = "Applied Micro X-Gene", .name = "arm_xgene", .pmu = PFM_PMU_ARM_XGENE, .pme_count = LIBPFM_ARRAY_SIZE(arm_xgene_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = ARMV8_PLM, .pe = arm_xgene_pe, .pmu_detect = pfm_arm_detect_xgene, .max_encoding = 1, .num_cntrs = 4, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = 
pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; /* Marvell ThunderX2 support */ pfmlib_pmu_t arm_thunderx2_support={ .desc = "Cavium ThunderX2", .name = "arm_thunderx2", .perf_name = "armv8_cavium_thunder", .pmu = PFM_PMU_ARM_THUNDERX2, .pme_count = LIBPFM_ARRAY_SIZE(arm_thunderx2_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = ARMV8_PLM, .pe = arm_thunderx2_pe, .pmu_detect = pfm_arm_detect_thunderx2, .max_encoding = 1, .num_cntrs = 6, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; /* Fujitsu A64FX support */ pfmlib_pmu_t arm_fujitsu_a64fx_support={ .desc = "Fujitsu A64FX", .name = "arm_a64fx", .pmu = PFM_PMU_ARM_A64FX, .pme_count = LIBPFM_ARRAY_SIZE(arm_a64fx_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = ARMV8_PLM, .pe = arm_a64fx_pe, .pmu_detect = pfm_arm_detect_a64fx, .max_encoding = 1, .num_cntrs = 8, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; /* HiSilicon Kunpeng 
support */ pfmlib_pmu_t arm_hisilicon_kunpeng_support={ .desc = "Hisilicon Kunpeng", .name = "arm_kunpeng", .pmu = PFM_PMU_ARM_KUNPENG, .pme_count = LIBPFM_ARRAY_SIZE(arm_kunpeng_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = ARMV8_PLM, .pe = arm_kunpeng_pe, .pmu_detect = pfm_arm_detect_hisilicon_kunpeng, .max_encoding = 1, .num_cntrs = 12, .num_fixed_cntrs = 1, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; pfmlib_pmu_t arm_n1_support={ .desc = "ARM Neoverse N1", .name = "arm_n1", .pmu = PFM_PMU_ARM_N1, .pme_count = LIBPFM_ARRAY_SIZE(arm_n1_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = ARMV8_PLM, .pe = arm_n1_pe, .pmu_detect = pfm_arm_detect_n1, .max_encoding = 1, .num_cntrs = 6, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; pfmlib_pmu_t arm_v1_support={ .desc = "Arm Neoverse V1", .name = "arm_v1", .pmu = PFM_PMU_ARM_V1, .pme_count = LIBPFM_ARRAY_SIZE(arm_v1_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = ARMV8_PLM, .pe = arm_v1_pe, .pmu_detect = pfm_arm_detect_v1, .max_encoding = 1, .num_cntrs = 6, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = 
pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_arm_armv8_kunpeng_unc.c000066400000000000000000000165621502707512200245310ustar00rootroot00000000000000/* * pfmlib_arm_armv8_kunpeng_unc.c : support for HiSilicon Kunpeng uncore PMUs * * Copyright (c) 2024 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
 *
 */
#include <sys/types.h>
#include <string.h>
#include <stdlib.h>

/* private headers */
#include "pfmlib_priv.h"			/* library private */
#include "pfmlib_arm_priv.h"
#include "pfmlib_arm_armv8_unc_priv.h"

#include "events/arm_hisilicon_kunpeng_unc_events.h" /* Hisilicon Kunpeng PMU uncore tables */

static int
pfm_arm_detect_hisilicon_kunpeng(void *this)
{
	/* Hisilicon Kunpeng */
	arm_cpuid_t attr = { .impl = 0x48, .arch = 8, .part = 0xd01 };
	return pfm_arm_detect(&attr, NULL);
}

static void
display_com(void *this, pfmlib_event_desc_t *e, void *val)
{
	const arm_entry_t *pe = this_pe(this);
	kunpeng_unc_data_t *reg = val;

	__pfm_vbprintf("[UNC=0x%"PRIx64"] %s\n", reg->val, pe[e->event].name);
}

static void
display_reg(void *this, pfmlib_event_desc_t *e, kunpeng_unc_data_t reg)
{
	pfmlib_pmu_t *pmu = this;

	if (pmu->display_reg)
		pmu->display_reg(this, e, &reg);
	else
		display_com(this, e, &reg);
}

int
pfm_kunpeng_unc_get_event_encoding(void *this, pfmlib_event_desc_t *e)
{
	//from the pe field of the uncore PMU, get the array with all the event defs
	const arm_entry_t *event_list = this_pe(this);
	kunpeng_unc_data_t reg;

	//get code for the event from the table
	reg.val = event_list[e->event].code;
	//pass the data back to the caller
	e->codes[0] = reg.val;
	e->count = 1;
	evt_strcat(e->fstr, "%s", event_list[e->event].name);
	display_reg(this, e, reg);
	return PFM_SUCCESS;
}

/* Hisilicon Kunpeng support */

// For uncore, each socket has a separate perf name; otherwise the PMUs are identical, so use a macro
#define DEFINE_KUNPENG_DDRC(n,m) \
pfmlib_pmu_t arm_hisilicon_kunpeng_sccl##n##_ddrc##m##_support={ \
	.desc		= "Hisilicon Kunpeng SCCL"#n" DDRC"#m, \
	.name		= "hisi_sccl"#n"_ddrc"#m, \
	.perf_name	= "hisi_sccl"#n"_ddrc"#m, \
	.pmu		= PFM_PMU_ARM_KUNPENG_UNC_SCCL##n##_DDRC##m, \
	.pme_count	= LIBPFM_ARRAY_SIZE(arm_kunpeng_unc_ddrc_pe), \
	.type		= PFM_PMU_TYPE_UNCORE, \
	.pe		= arm_kunpeng_unc_ddrc_pe, \
	.pmu_detect	= pfm_arm_detect_hisilicon_kunpeng, \
	.max_encoding	= 1, \
	.num_cntrs	= 4, \
	.get_event_encoding[PFM_OS_NONE] = \
pfm_kunpeng_unc_get_event_encoding, \ PFMLIB_ENCODE_PERF(pfm_kunpeng_unc_get_perf_encoding), \ .get_event_first = pfm_arm_get_event_first, \ .get_event_next = pfm_arm_get_event_next, \ .event_is_valid = pfm_arm_event_is_valid, \ .validate_table = pfm_arm_validate_table, \ .get_event_info = pfm_arm_get_event_info, \ .get_event_attr_info = pfm_arm_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), \ .get_event_nattrs = pfm_arm_get_event_nattrs, \ }; DEFINE_KUNPENG_DDRC(1,0); DEFINE_KUNPENG_DDRC(1,1); DEFINE_KUNPENG_DDRC(1,2); DEFINE_KUNPENG_DDRC(1,3); DEFINE_KUNPENG_DDRC(3,0); DEFINE_KUNPENG_DDRC(3,1); DEFINE_KUNPENG_DDRC(3,2); DEFINE_KUNPENG_DDRC(3,3); DEFINE_KUNPENG_DDRC(5,0); DEFINE_KUNPENG_DDRC(5,1); DEFINE_KUNPENG_DDRC(5,2); DEFINE_KUNPENG_DDRC(5,3); DEFINE_KUNPENG_DDRC(7,0); DEFINE_KUNPENG_DDRC(7,1); DEFINE_KUNPENG_DDRC(7,2); DEFINE_KUNPENG_DDRC(7,3); #define DEFINE_KUNPENG_HHA(n,m) \ pfmlib_pmu_t arm_hisilicon_kunpeng_sccl##n##_hha##m##_support={ \ .desc = "Hisilicon Kunpeng SCCL"#n" HHA"#m, \ .name = "hisi_sccl"#n"_hha"#m, \ .perf_name = "hisi_sccl"#n"_hha"#m, \ .pmu = PFM_PMU_ARM_KUNPENG_UNC_SCCL##n##_HHA##m, \ .pme_count = LIBPFM_ARRAY_SIZE(arm_kunpeng_unc_hha_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .pe = arm_kunpeng_unc_hha_pe, \ .pmu_detect = pfm_arm_detect_hisilicon_kunpeng, \ .max_encoding = 1, \ .num_cntrs = 4, \ .get_event_encoding[PFM_OS_NONE] = pfm_kunpeng_unc_get_event_encoding, \ PFMLIB_ENCODE_PERF(pfm_kunpeng_unc_get_perf_encoding), \ .get_event_first = pfm_arm_get_event_first, \ .get_event_next = pfm_arm_get_event_next, \ .event_is_valid = pfm_arm_event_is_valid, \ .validate_table = pfm_arm_validate_table, \ .get_event_info = pfm_arm_get_event_info, \ .get_event_attr_info = pfm_arm_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), \ .get_event_nattrs = pfm_arm_get_event_nattrs, \ }; DEFINE_KUNPENG_HHA(1,2); DEFINE_KUNPENG_HHA(1,3); DEFINE_KUNPENG_HHA(3,0); DEFINE_KUNPENG_HHA(3,1); 
DEFINE_KUNPENG_HHA(5,6); DEFINE_KUNPENG_HHA(5,7); DEFINE_KUNPENG_HHA(7,4); DEFINE_KUNPENG_HHA(7,5); #define DEFINE_KUNPENG_L3C(n,m) \ pfmlib_pmu_t arm_hisilicon_kunpeng_sccl##n##_l3c##m##_support={ \ .desc = "Hisilicon Kunpeng SCCL"#n" L3C"#m, \ .name = "hisi_sccl"#n"_l3c"#m, \ .perf_name = "hisi_sccl"#n"_l3c"#m, \ .pmu = PFM_PMU_ARM_KUNPENG_UNC_SCCL##n##_L3C##m, \ .pme_count = LIBPFM_ARRAY_SIZE(arm_kunpeng_unc_l3c_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .pe = arm_kunpeng_unc_l3c_pe, \ .pmu_detect = pfm_arm_detect_hisilicon_kunpeng, \ .max_encoding = 1, \ .num_cntrs = 4, \ .get_event_encoding[PFM_OS_NONE] = pfm_kunpeng_unc_get_event_encoding, \ PFMLIB_ENCODE_PERF(pfm_kunpeng_unc_get_perf_encoding), \ .get_event_first = pfm_arm_get_event_first, \ .get_event_next = pfm_arm_get_event_next, \ .event_is_valid = pfm_arm_event_is_valid, \ .validate_table = pfm_arm_validate_table, \ .get_event_info = pfm_arm_get_event_info, \ .get_event_attr_info = pfm_arm_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), \ .get_event_nattrs = pfm_arm_get_event_nattrs, \ }; DEFINE_KUNPENG_L3C(1,10); DEFINE_KUNPENG_L3C(1,11); DEFINE_KUNPENG_L3C(1,12); DEFINE_KUNPENG_L3C(1,13); DEFINE_KUNPENG_L3C(1,14); DEFINE_KUNPENG_L3C(1,15); DEFINE_KUNPENG_L3C(1,8); DEFINE_KUNPENG_L3C(1,9); DEFINE_KUNPENG_L3C(3,0); DEFINE_KUNPENG_L3C(3,1); DEFINE_KUNPENG_L3C(3,2); DEFINE_KUNPENG_L3C(3,3); DEFINE_KUNPENG_L3C(3,4); DEFINE_KUNPENG_L3C(3,5); DEFINE_KUNPENG_L3C(3,6); DEFINE_KUNPENG_L3C(3,7); DEFINE_KUNPENG_L3C(5,24); DEFINE_KUNPENG_L3C(5,25); DEFINE_KUNPENG_L3C(5,26); DEFINE_KUNPENG_L3C(5,27); DEFINE_KUNPENG_L3C(5,28); DEFINE_KUNPENG_L3C(5,29); DEFINE_KUNPENG_L3C(5,30); DEFINE_KUNPENG_L3C(5,31); DEFINE_KUNPENG_L3C(7,16); DEFINE_KUNPENG_L3C(7,17); DEFINE_KUNPENG_L3C(7,18); DEFINE_KUNPENG_L3C(7,19); DEFINE_KUNPENG_L3C(7,20); DEFINE_KUNPENG_L3C(7,21); DEFINE_KUNPENG_L3C(7,22); DEFINE_KUNPENG_L3C(7,23); 
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_arm_armv8_kunpeng_unc_perf_event.c000066400000000000000000000051631502707512200267410ustar00rootroot00000000000000/* * Copyright (c) 2021 Barcelona Supercomputing Center * Contributed by Estanislao Mercadal MeliĂ  * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* */ #include #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_perf_event_priv.h" #include "pfmlib_arm_priv.h" #include "pfmlib_arm_armv8_unc_priv.h" static int find_pmu_type_by_name(const char *name) { char filename[PATH_MAX]; FILE *fp; int ret, type; if (!name) return PFM_ERR_NOTSUPP; sprintf(filename, "/sys/bus/event_source/devices/%s/type", name); fp = fopen(filename, "r"); if (!fp) return PFM_ERR_NOTSUPP; ret = fscanf(fp, "%d", &type); if (ret != 1) type = PFM_ERR_NOTSUPP; fclose(fp); return type; } int pfm_kunpeng_unc_get_perf_encoding(void *this, pfmlib_event_desc_t *e) { pfmlib_pmu_t *pmu = this; struct perf_event_attr *attr = e->os_data; kunpeng_unc_data_t reg; int ret; if (!pmu->get_event_encoding[PFM_OS_NONE]) return PFM_ERR_NOTSUPP; ret = pmu->get_event_encoding[PFM_OS_NONE](this, e); if (ret != PFM_SUCCESS) return ret; //get pmu type to probe ret = find_pmu_type_by_name(pmu->perf_name); if (ret < 0) return ret; attr->type = ret; //get code to provide to the uncore pmu probe reg.val = e->codes[0]; attr->config = reg.val; // if needed, can use attr->config1 or attr->config2 for extra info from event structure defines e->codes[i] // uncore measures at all priv levels attr->exclude_hv = 0; attr->exclude_kernel = 0; attr->exclude_user = 0; return PFM_SUCCESS; } papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_arm_armv8_thunderx2_unc.c000066400000000000000000000126131502707512200247760ustar00rootroot00000000000000/* * pfmlib_arm_armv8_thunderx2_unc.c : support for Marvell ThunderX2 uncore PMUs * * Copyright (c) 2024 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* */ #include #include #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_arm_priv.h" #include "pfmlib_arm_armv8_unc_priv.h" #include "events/arm_marvell_tx2_unc_events.h" /* Marvell ThunderX2 PMU tables */ static int pfm_arm_detect_thunderx2(void *this) { /* Broadcom Thunder X2*/ arm_cpuid_t attr = { .impl = 0x42, .arch = 8, .part = 0x516 }; int ret; ret = pfm_arm_detect(&attr, NULL); if (ret == PFM_SUCCESS) return ret; /* Cavium Thunder X2 */ attr.impl = 0x43; attr.part = 0xaf; return pfm_arm_detect(&attr, NULL); } static int pfm_tx2_unc_get_event_encoding(void *this, pfmlib_event_desc_t *e) { //from pe field in for the uncore, get the array with all the event defs const arm_entry_t *event_list = this_pe(this); tx2_unc_data_t reg; //get code for the event from the table reg.val = event_list[e->event].code; //pass the data back to the caller e->codes[0] = reg.val; e->count = 1; evt_strcat(e->fstr, "%s", event_list[e->event].name); return PFM_SUCCESS; } // For uncore, each socket has a separate perf name, otherwise they are the same, use macro #define DEFINE_TX2_DMC(n) \ pfmlib_pmu_t arm_thunderx2_dmc##n##_support={ \ .desc = "Marvell ThunderX2 Node"#n" DMC", \ .name = "tx2_dmc"#n, \ .perf_name = "uncore_dmc_"#n, \ .pmu = PFM_PMU_ARM_THUNDERX2_DMC##n, \ .pme_count = LIBPFM_ARRAY_SIZE(arm_thunderx2_unc_dmc_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .pe = arm_thunderx2_unc_dmc_pe, \ .pmu_detect = pfm_arm_detect_thunderx2, \ .max_encoding = 1, \ .num_cntrs = 4, \ .get_event_encoding[PFM_OS_NONE] = pfm_tx2_unc_get_event_encoding, \ PFMLIB_ENCODE_PERF(pfm_tx2_unc_get_perf_encoding), \ .get_event_first = pfm_arm_get_event_first, \ .get_event_next = pfm_arm_get_event_next, \ .event_is_valid = pfm_arm_event_is_valid, \ .validate_table = pfm_arm_validate_table, \ .get_event_info = pfm_arm_get_event_info, \ .get_event_attr_info = pfm_arm_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs),\ .get_event_nattrs = 
pfm_arm_get_event_nattrs, \ }; DEFINE_TX2_DMC(0); DEFINE_TX2_DMC(1); #define DEFINE_TX2_LLC(n) \ pfmlib_pmu_t arm_thunderx2_llc##n##_support={ \ .desc = "Marvell ThunderX2 node "#n" LLC", \ .name = "tx2_llc"#n, \ .perf_name = "uncore_l3c_"#n, \ .pmu = PFM_PMU_ARM_THUNDERX2_LLC##n, \ .pme_count = LIBPFM_ARRAY_SIZE(arm_thunderx2_unc_llc_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .pe = arm_thunderx2_unc_llc_pe, \ .pmu_detect = pfm_arm_detect_thunderx2, \ .max_encoding = 1, \ .num_cntrs = 4, \ .get_event_encoding[PFM_OS_NONE] = pfm_tx2_unc_get_event_encoding, \ PFMLIB_ENCODE_PERF(pfm_tx2_unc_get_perf_encoding), \ .get_event_first = pfm_arm_get_event_first, \ .get_event_next = pfm_arm_get_event_next, \ .event_is_valid = pfm_arm_event_is_valid, \ .validate_table = pfm_arm_validate_table, \ .get_event_info = pfm_arm_get_event_info, \ .get_event_attr_info = pfm_arm_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs),\ .get_event_nattrs = pfm_arm_get_event_nattrs, \ }; DEFINE_TX2_LLC(0); DEFINE_TX2_LLC(1); #define DEFINE_TX2_CCPI(n) \ pfmlib_pmu_t arm_thunderx2_ccpi##n##_support={ \ .desc = "Marvell ThunderX2 node "#n" Cross-Socket Interconnect", \ .name = "tx2_ccpi"#n, \ .perf_name = "uncore_ccpi2_"#n, \ .pmu = PFM_PMU_ARM_THUNDERX2_CCPI##n, \ .pme_count = LIBPFM_ARRAY_SIZE(arm_thunderx2_unc_ccpi_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .pe = arm_thunderx2_unc_ccpi_pe, \ .pmu_detect = pfm_arm_detect_thunderx2, \ .max_encoding = 1, \ .num_cntrs = 4, \ .get_event_encoding[PFM_OS_NONE] = pfm_tx2_unc_get_event_encoding, \ PFMLIB_ENCODE_PERF(pfm_tx2_unc_get_perf_encoding), \ .get_event_first = pfm_arm_get_event_first, \ .get_event_next = pfm_arm_get_event_next, \ .event_is_valid = pfm_arm_event_is_valid, \ .validate_table = pfm_arm_validate_table, \ .get_event_info = pfm_arm_get_event_info, \ .get_event_attr_info = pfm_arm_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs),\ .get_event_nattrs = pfm_arm_get_event_nattrs, \ }; 
DEFINE_TX2_CCPI(0); DEFINE_TX2_CCPI(1); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_arm_armv8_thunderx2_unc_perf_event.c000066400000000000000000000026671502707512200272230ustar00rootroot00000000000000#include #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_perf_event_priv.h" #include "pfmlib_arm_priv.h" #include "pfmlib_arm_armv8_unc_priv.h" static int find_pmu_type_by_name(const char *name) { char filename[PATH_MAX]; FILE *fp; int ret, type; if (!name) return PFM_ERR_NOTSUPP; sprintf(filename, "/sys/bus/event_source/devices/%s/type", name); fp = fopen(filename, "r"); if (!fp) return PFM_ERR_NOTSUPP; ret = fscanf(fp, "%d", &type); if (ret != 1) type = PFM_ERR_NOTSUPP; fclose(fp); return type; } int pfm_tx2_unc_get_perf_encoding(void *this, pfmlib_event_desc_t *e) { pfmlib_pmu_t *pmu = this; struct perf_event_attr *attr = e->os_data; tx2_unc_data_t reg; int ret; if (!pmu->get_event_encoding[PFM_OS_NONE]) return PFM_ERR_NOTSUPP; ret = pmu->get_event_encoding[PFM_OS_NONE](this, e); if (ret != PFM_SUCCESS) return ret; //get pmu type to probe ret = find_pmu_type_by_name(pmu->perf_name); if (ret < 0) return ret; attr->type = ret; //get code to provide to the uncore pmu probe reg.val = e->codes[0]; attr->config = reg.val; // if needed, can use attr->config1 or attr->config2 for extra info from event structure defines e->codes[i] // uncore measures at all priv levels attr->exclude_hv = 0; attr->exclude_kernel = 0; attr->exclude_user = 0; return PFM_SUCCESS; } papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_arm_armv8_unc.c000066400000000000000000000251511502707512200227740ustar00rootroot00000000000000/* * pfmlib_arm_armv8_unc.c : support for ARMv8 uncore PMUs * * Copyright (c) 2024 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* */ #include <sys/types.h> #include <string.h> #include <stdlib.h> /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_arm_priv.h" #include "pfmlib_arm_armv8_unc_priv.h" #include "events/arm_marvell_tx2_unc_events.h" /* Marvell ThunderX2 PMU tables */ #include "events/arm_hisilicon_kunpeng_unc_events.h" /* Hisilicon Kunpeng PMU uncore tables */ static int pfm_arm_detect_thunderx2(void *this) { int ret; ret = pfm_arm_detect(this); if (ret != PFM_SUCCESS) return PFM_ERR_NOTSUPP; if ((pfm_arm_cfg.implementer == 0x42) && /* Broadcom */ (pfm_arm_cfg.part == 0x516)) { /* ThunderX2 */ return PFM_SUCCESS; } if ((pfm_arm_cfg.implementer == 0x43) && /* Cavium */ (pfm_arm_cfg.part == 0xaf)) { /* ThunderX2 */ return PFM_SUCCESS; } return PFM_ERR_NOTSUPP; } static int pfm_arm_detect_hisilicon_kunpeng(void *this) { int ret; ret = pfm_arm_detect(this); if (ret != PFM_SUCCESS) return PFM_ERR_NOTSUPP; if ((pfm_arm_cfg.implementer == 0x48) && /* Hisilicon */ (pfm_arm_cfg.part == 0xd01)) { /* Kunpeng */ return PFM_SUCCESS; } return PFM_ERR_NOTSUPP; } static int pfm_tx2_unc_get_event_encoding(void *this, pfmlib_event_desc_t *e) { // from the uncore PMU descriptor's pe field, get the array with all the event definitions const arm_entry_t *event_list = this_pe(this); tx2_unc_data_t reg; // get the code for the event from the table reg.val = event_list[e->event].code; // pass the data back to the caller e->codes[0] = reg.val; e->count = 1; evt_strcat(e->fstr, "%s", event_list[e->event].name); return PFM_SUCCESS; } /* Hisilicon Kunpeng support */ // For uncore, each socket has a separate perf name; the PMUs are otherwise identical, so use a macro #define DEFINE_KUNPENG_DDRC(n,m) \ pfmlib_pmu_t arm_hisilicon_kunpeng_sccl##n##_ddrc##m##_support={ \ .desc = "Hisilicon Kunpeng SCCL"#n" DDRC"#m, \ .name = "hisi_sccl"#n"_ddrc"#m, \ .perf_name = "hisi_sccl"#n"_ddrc"#m, \ .pmu = PFM_PMU_ARM_KUNPENG_UNC_SCCL##n##_DDRC##m, \ .pme_count = LIBPFM_ARRAY_SIZE(arm_kunpeng_unc_ddrc_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .pe =
arm_kunpeng_unc_ddrc_pe, \ .pmu_detect = pfm_arm_detect_hisilicon_kunpeng, \ .max_encoding = 1, \ .num_cntrs = 4, \ .get_event_encoding[PFM_OS_NONE] = pfm_kunpeng_unc_get_event_encoding, \ PFMLIB_ENCODE_PERF(pfm_kunpeng_unc_get_perf_encoding), \ .get_event_first = pfm_arm_get_event_first, \ .get_event_next = pfm_arm_get_event_next, \ .event_is_valid = pfm_arm_event_is_valid, \ .validate_table = pfm_arm_validate_table, \ .get_event_info = pfm_arm_get_event_info, \ .get_event_attr_info = pfm_arm_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), \ .get_event_nattrs = pfm_arm_get_event_nattrs, \ }; DEFINE_KUNPENG_DDRC(1,0); DEFINE_KUNPENG_DDRC(1,1); DEFINE_KUNPENG_DDRC(1,2); DEFINE_KUNPENG_DDRC(1,3); DEFINE_KUNPENG_DDRC(3,0); DEFINE_KUNPENG_DDRC(3,1); DEFINE_KUNPENG_DDRC(3,2); DEFINE_KUNPENG_DDRC(3,3); DEFINE_KUNPENG_DDRC(5,0); DEFINE_KUNPENG_DDRC(5,1); DEFINE_KUNPENG_DDRC(5,2); DEFINE_KUNPENG_DDRC(5,3); DEFINE_KUNPENG_DDRC(7,0); DEFINE_KUNPENG_DDRC(7,1); DEFINE_KUNPENG_DDRC(7,2); DEFINE_KUNPENG_DDRC(7,3); #define DEFINE_KUNPENG_HHA(n,m) \ pfmlib_pmu_t arm_hisilicon_kunpeng_sccl##n##_hha##m##_support={ \ .desc = "Hisilicon Kunpeng SCCL"#n" HHA"#m, \ .name = "hisi_sccl"#n"_hha"#m, \ .perf_name = "hisi_sccl"#n"_hha"#m, \ .pmu = PFM_PMU_ARM_KUNPENG_UNC_SCCL##n##_HHA##m, \ .pme_count = LIBPFM_ARRAY_SIZE(arm_kunpeng_unc_hha_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .pe = arm_kunpeng_unc_hha_pe, \ .pmu_detect = pfm_arm_detect_hisilicon_kunpeng, \ .max_encoding = 1, \ .num_cntrs = 4, \ .get_event_encoding[PFM_OS_NONE] = pfm_kunpeng_unc_get_event_encoding, \ PFMLIB_ENCODE_PERF(pfm_kunpeng_unc_get_perf_encoding), \ .get_event_first = pfm_arm_get_event_first, \ .get_event_next = pfm_arm_get_event_next, \ .event_is_valid = pfm_arm_event_is_valid, \ .validate_table = pfm_arm_validate_table, \ .get_event_info = pfm_arm_get_event_info, \ .get_event_attr_info = pfm_arm_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), \ 
.get_event_nattrs = pfm_arm_get_event_nattrs, \ }; DEFINE_KUNPENG_HHA(1,2); DEFINE_KUNPENG_HHA(1,3); DEFINE_KUNPENG_HHA(3,0); DEFINE_KUNPENG_HHA(3,1); DEFINE_KUNPENG_HHA(5,6); DEFINE_KUNPENG_HHA(5,7); DEFINE_KUNPENG_HHA(7,4); DEFINE_KUNPENG_HHA(7,5); #define DEFINE_KUNPENG_L3C(n,m) \ pfmlib_pmu_t arm_hisilicon_kunpeng_sccl##n##_l3c##m##_support={ \ .desc = "Hisilicon Kunpeng SCCL"#n" L3C"#m, \ .name = "hisi_sccl"#n"_l3c"#m, \ .perf_name = "hisi_sccl"#n"_l3c"#m, \ .pmu = PFM_PMU_ARM_KUNPENG_UNC_SCCL##n##_L3C##m, \ .pme_count = LIBPFM_ARRAY_SIZE(arm_kunpeng_unc_l3c_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .pe = arm_kunpeng_unc_l3c_pe, \ .pmu_detect = pfm_arm_detect_hisilicon_kunpeng, \ .max_encoding = 1, \ .num_cntrs = 4, \ .get_event_encoding[PFM_OS_NONE] = pfm_kunpeng_unc_get_event_encoding, \ PFMLIB_ENCODE_PERF(pfm_kunpeng_unc_get_perf_encoding), \ .get_event_first = pfm_arm_get_event_first, \ .get_event_next = pfm_arm_get_event_next, \ .event_is_valid = pfm_arm_event_is_valid, \ .validate_table = pfm_arm_validate_table, \ .get_event_info = pfm_arm_get_event_info, \ .get_event_attr_info = pfm_arm_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), \ .get_event_nattrs = pfm_arm_get_event_nattrs, \ }; DEFINE_KUNPENG_L3C(1,10); DEFINE_KUNPENG_L3C(1,11); DEFINE_KUNPENG_L3C(1,12); DEFINE_KUNPENG_L3C(1,13); DEFINE_KUNPENG_L3C(1,14); DEFINE_KUNPENG_L3C(1,15); DEFINE_KUNPENG_L3C(1,8); DEFINE_KUNPENG_L3C(1,9); DEFINE_KUNPENG_L3C(3,0); DEFINE_KUNPENG_L3C(3,1); DEFINE_KUNPENG_L3C(3,2); DEFINE_KUNPENG_L3C(3,3); DEFINE_KUNPENG_L3C(3,4); DEFINE_KUNPENG_L3C(3,5); DEFINE_KUNPENG_L3C(3,6); DEFINE_KUNPENG_L3C(3,7); DEFINE_KUNPENG_L3C(5,24); DEFINE_KUNPENG_L3C(5,25); DEFINE_KUNPENG_L3C(5,26); DEFINE_KUNPENG_L3C(5,27); DEFINE_KUNPENG_L3C(5,28); DEFINE_KUNPENG_L3C(5,29); DEFINE_KUNPENG_L3C(5,30); DEFINE_KUNPENG_L3C(5,31); DEFINE_KUNPENG_L3C(7,16); DEFINE_KUNPENG_L3C(7,17); DEFINE_KUNPENG_L3C(7,18); DEFINE_KUNPENG_L3C(7,19); DEFINE_KUNPENG_L3C(7,20); 
DEFINE_KUNPENG_L3C(7,21); DEFINE_KUNPENG_L3C(7,22); DEFINE_KUNPENG_L3C(7,23); // For uncore, each socket has a separate perf name, otherwise they are the same, use macro #define DEFINE_TX2_DMC(n) \ pfmlib_pmu_t arm_thunderx2_dmc##n##_support={ \ .desc = "Marvell ThunderX2 Node"#n" DMC", \ .name = "tx2_dmc"#n, \ .perf_name = "uncore_dmc_"#n, \ .pmu = PFM_PMU_ARM_THUNDERX2_DMC##n, \ .pme_count = LIBPFM_ARRAY_SIZE(arm_thunderx2_unc_dmc_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .pe = arm_thunderx2_unc_dmc_pe, \ .pmu_detect = pfm_arm_detect_thunderx2, \ .max_encoding = 1, \ .num_cntrs = 4, \ .get_event_encoding[PFM_OS_NONE] = pfm_tx2_unc_get_event_encoding, \ PFMLIB_ENCODE_PERF(pfm_tx2_unc_get_perf_encoding), \ .get_event_first = pfm_arm_get_event_first, \ .get_event_next = pfm_arm_get_event_next, \ .event_is_valid = pfm_arm_event_is_valid, \ .validate_table = pfm_arm_validate_table, \ .get_event_info = pfm_arm_get_event_info, \ .get_event_attr_info = pfm_arm_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs),\ .get_event_nattrs = pfm_arm_get_event_nattrs, \ }; DEFINE_TX2_DMC(0); DEFINE_TX2_DMC(1); #define DEFINE_TX2_LLC(n) \ pfmlib_pmu_t arm_thunderx2_llc##n##_support={ \ .desc = "Marvell ThunderX2 node "#n" LLC", \ .name = "tx2_llc"#n, \ .perf_name = "uncore_l3c_"#n, \ .pmu = PFM_PMU_ARM_THUNDERX2_LLC##n, \ .pme_count = LIBPFM_ARRAY_SIZE(arm_thunderx2_unc_llc_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .pe = arm_thunderx2_unc_llc_pe, \ .pmu_detect = pfm_arm_detect_thunderx2, \ .max_encoding = 1, \ .num_cntrs = 4, \ .get_event_encoding[PFM_OS_NONE] = pfm_tx2_unc_get_event_encoding, \ PFMLIB_ENCODE_PERF(pfm_tx2_unc_get_perf_encoding), \ .get_event_first = pfm_arm_get_event_first, \ .get_event_next = pfm_arm_get_event_next, \ .event_is_valid = pfm_arm_event_is_valid, \ .validate_table = pfm_arm_validate_table, \ .get_event_info = pfm_arm_get_event_info, \ .get_event_attr_info = pfm_arm_get_event_attr_info, \ 
PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs),\ .get_event_nattrs = pfm_arm_get_event_nattrs, \ }; DEFINE_TX2_LLC(0); DEFINE_TX2_LLC(1); #define DEFINE_TX2_CCPI(n) \ pfmlib_pmu_t arm_thunderx2_ccpi##n##_support={ \ .desc = "Marvell ThunderX2 node "#n" Cross-Socket Interconnect", \ .name = "tx2_ccpi"#n, \ .perf_name = "uncore_ccpi2_"#n, \ .pmu = PFM_PMU_ARM_THUNDERX2_CCPI##n, \ .pme_count = LIBPFM_ARRAY_SIZE(arm_thunderx2_unc_ccpi_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .pe = arm_thunderx2_unc_ccpi_pe, \ .pmu_detect = pfm_arm_detect_thunderx2, \ .max_encoding = 1, \ .num_cntrs = 4, \ .get_event_encoding[PFM_OS_NONE] = pfm_tx2_unc_get_event_encoding, \ PFMLIB_ENCODE_PERF(pfm_tx2_unc_get_perf_encoding), \ .get_event_first = pfm_arm_get_event_first, \ .get_event_next = pfm_arm_get_event_next, \ .event_is_valid = pfm_arm_event_is_valid, \ .validate_table = pfm_arm_validate_table, \ .get_event_info = pfm_arm_get_event_info, \ .get_event_attr_info = pfm_arm_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs),\ .get_event_nattrs = pfm_arm_get_event_nattrs, \ }; DEFINE_TX2_CCPI(0); DEFINE_TX2_CCPI(1);
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_arm_armv8_unc_priv.h
#ifndef PFMLIB_ARM_ARMV8_UNC_PRIV_H #define PFMLIB_ARM_ARMV8_UNC_PRIV_H #include <stdint.h> typedef union { uint64_t val; struct { unsigned long unc_res1:32; /* reserved */ } com; /* reserved space for future extensions */ } tx2_unc_data_t; typedef struct { uint64_t val; } kunpeng_unc_data_t; extern int pfm_tx2_unc_get_perf_encoding(void *this, pfmlib_event_desc_t *e); //extern int pfm_kunpeng_get_perf_encoding(void *this, pfmlib_event_desc_t *e); extern int pfm_kunpeng_unc_get_event_encoding(void *this, pfmlib_event_desc_t *e); extern int pfm_kunpeng_unc_get_perf_encoding(void *this, pfmlib_event_desc_t *e); #endif /* PFMLIB_ARM_ARMV8_UNC_PRIV_H */
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_arm_armv9.c000066400000000000000000000153171502707512200221330ustar00rootroot00000000000000/* * pfmlib_arm_armv9.c : support for ARMv9 processors * * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. * Contributed by John Linford * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* */ #include #include #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_arm_priv.h" #include "events/arm_neoverse_n2_events.h" /* Arm Neoverse N2 table */ #include "events/arm_neoverse_n3_events.h" /* Arm Neoverse N3 table */ #include "events/arm_neoverse_v2_events.h" /* Arm Neoverse V2 table */ #include "events/arm_neoverse_v3_events.h" /* Arm Neoverse V3 table */ #include "events/arm_fujitsu_monaka_events.h" /* Fujitsu FUJITSU-MONAKA PMU tables */ static int pfm_arm_detect_n2(void *this) { /* ARM Neoverse N2 */ arm_cpuid_t attr = { .impl = 0x41, .arch = 9, .part = 0xd49 }; return pfm_arm_detect(&attr, NULL); } static int pfm_arm_detect_n3(void *this) { /* ARM Neoverse N3 */ arm_cpuid_t attr = { .impl = 0x41, .arch = 9, .part = 0xd8e }; return pfm_arm_detect(&attr, NULL); } static int pfm_arm_detect_v2(void *this) { /* ARM Neoverse V2 */ arm_cpuid_t attr = { .impl = 0x41, .arch = 9, .part = 0xd4f }; return pfm_arm_detect(&attr, NULL); } static int pfm_arm_detect_v3(void *this) { /* ARM Neoverse V3 */ arm_cpuid_t attr = { .impl = 0x41, .arch = 9, .part = 0xd84 }; return pfm_arm_detect(&attr, NULL); } static int pfm_arm_detect_monaka(void *this) { /* Fujitsu Monaka */ arm_cpuid_t attr = { .impl = 0x46, .arch = 9, .part = 0x3 }; return pfm_arm_detect(&attr, NULL); } pfmlib_pmu_t arm_n2_support={ .desc = "Arm Neoverse N2", .name = "arm_n2", .pmu = PFM_PMU_ARM_N2, .pme_count = LIBPFM_ARRAY_SIZE(arm_n2_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = ARMV9_PLM, .pe = arm_n2_pe, .pmu_detect = pfm_arm_detect_n2, .max_encoding = 1, .num_cntrs = 6, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, 
PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; pfmlib_pmu_t arm_n3_support={ .desc = "Arm Neoverse N3", .name = "arm_n3", .pmu = PFM_PMU_ARM_N3, .pme_count = LIBPFM_ARRAY_SIZE(arm_n3_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = ARMV9_PLM, .pe = arm_n3_pe, .pmu_detect = pfm_arm_detect_n3, .max_encoding = 1, .num_cntrs = 6, /* or 20 */ .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; pfmlib_pmu_t arm_v2_support={ .desc = "Arm Neoverse V2", .name = "arm_v2", .pmu = PFM_PMU_ARM_V2, .pme_count = LIBPFM_ARRAY_SIZE(arm_v2_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = ARMV9_PLM, .pe = arm_v2_pe, .pmu_detect = pfm_arm_detect_v2, .max_encoding = 1, .num_cntrs = 6, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; pfmlib_pmu_t arm_v3_support={ .desc = "Arm Neoverse V3", .name = "arm_v3", .pmu = PFM_PMU_ARM_V3, .pme_count = LIBPFM_ARRAY_SIZE(arm_neoverse_v3_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = ARMV9_PLM, .pe = arm_neoverse_v3_pe, .pmu_detect = pfm_arm_detect_v3, .max_encoding = 1, .num_cntrs = 6, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, 
PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; /* Fujitsu FUJITSU-MONAKA support */ pfmlib_pmu_t arm_fujitsu_monaka_support={ .desc = "Fujitsu FUJITSU-MONAKA", .name = "arm_monaka", .pmu = PFM_PMU_ARM_MONAKA, .pme_count = LIBPFM_ARRAY_SIZE(arm_monaka_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = ARMV9_PLM, .pe = arm_monaka_pe, .pmu_detect = pfm_arm_detect_monaka, .max_encoding = 1, .num_cntrs = 8, .get_event_encoding[PFM_OS_NONE] = pfm_arm_get_encoding, PFMLIB_ENCODE_PERF(pfm_arm_get_perf_encoding), .get_event_first = pfm_arm_get_event_first, .get_event_next = pfm_arm_get_event_next, .event_is_valid = pfm_arm_event_is_valid, .validate_table = pfm_arm_validate_table, .get_event_info = pfm_arm_get_event_info, .get_event_attr_info = pfm_arm_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_arm_perf_validate_pattrs), .get_event_nattrs = pfm_arm_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_arm_perf_event.c000066400000000000000000000067161502707512200232350ustar00rootroot00000000000000/* * pfmlib_arm_perf_event.c : perf_event ARM functions * * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this 
permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_arm_priv.h" #include "pfmlib_perf_event_priv.h" int pfm_arm_get_perf_encoding(void *this, pfmlib_event_desc_t *e) { pfmlib_pmu_t *pmu = this; pfm_arm_reg_t reg; struct perf_event_attr *attr = e->os_data; int type; int ret; if (!pmu->get_event_encoding[PFM_OS_NONE]) return PFM_ERR_NOTSUPP; /* * use generic raw encoding function first */ ret = pmu->get_event_encoding[PFM_OS_NONE](this, e); if (ret != PFM_SUCCESS) return ret; if (e->count > 1) { DPRINT("unsupported count=%d\n", e->count); return PFM_ERR_NOTSUPP; } /* * To eliminate the issue of PERF_TYPE_RAW not working * for hybrid because the attr needs to encode the actual * PMU type, then we simply extract the actual PMU type * from sysfs. */ if (pfm_perf_find_pmu_type(pmu, &type) != PFM_SUCCESS) { DPRINT("cannot determine PMU type for %s\n", pmu->name); return PFM_ERR_NOTSUPP; } attr->type = type; reg.val = e->codes[0]; /* * suppress the bits which are under the control of perf_events. * Recent version of the Linux perf tools may warn if bits which * should not be set by users are set. To avoid the warning, * clear the bits, they are overwritten by the kernel anyway. 
*/ reg.evtsel.excl_pl1 = 0; reg.evtsel.excl_usr = 0; reg.evtsel.excl_hyp = 0; attr->config = reg.val; return PFM_SUCCESS; } void pfm_arm_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e) { int i, compact; for (i = 0; i < e->npattrs; i++) { compact = 0; /* umasks never conflict */ if (e->pattrs[i].type == PFM_ATTR_UMASK) continue; /* * with perf_events, u and k, hv are handled at the OS * level via attr.exclude_* fields */ if (arm_has_plm(this, e) && e->pattrs[i].ctrl == PFM_ATTR_CTRL_PMU) { if ( e->pattrs[i].idx == ARM_ATTR_U || e->pattrs[i].idx == ARM_ATTR_K || e->pattrs[i].idx == ARM_ATTR_HV) compact = 1; } if (e->pattrs[i].ctrl == PFM_ATTR_CTRL_PERF_EVENT) { if (e->pattrs[i].idx == PERF_ATTR_PR) compact = 1; } /* hardware sampling not supported */ if (e->pattrs[i].idx == PERF_ATTR_HWS) compact = 1; if (compact) { pfmlib_compact_pattrs(e, i); i--; } } } papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_arm_priv.h000066400000000000000000000076321502707512200220630ustar00rootroot00000000000000/* * Copyright (c) 2010 University of Tennessee * Contributed by Vince Weaver * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ #ifndef __PFMLIB_ARM_PRIV_H__ #define __PFMLIB_ARM_PRIV_H__ /* * This file contains the definitions used for ARM processors */ /* * event description */ typedef struct { const char *name; /* event name */ const char *desc; /* event description */ const char *equiv; /* aliased to that event */ unsigned int code; /* event code */ unsigned int modmsk; /* modifiers bitmask */ } arm_entry_t; typedef union pfm_arm_reg { unsigned int val; /* complete register value */ struct { unsigned int sel:8; unsigned int reserved1:19; unsigned int excl_hyp:1; unsigned int reserved2:2; unsigned int excl_pl1:1; unsigned int excl_usr:1; } evtsel; } pfm_arm_reg_t; typedef struct { int init_cpuinfo_done; } pfm_arm_config_t; extern pfm_arm_config_t pfm_arm_cfg; typedef struct { int impl; int arch; int part; /* if number of fields altered, update ARM_NUM_ATTR_FIELDS */ } arm_cpuid_t; #define ARM_NUM_ATTR_FIELDS 3 /* number of fields on arm_cpuid_t */ extern int pfm_arm_detect(arm_cpuid_t *attr, arm_cpuid_t *match_attr); extern int pfm_arm_get_encoding(void *this, pfmlib_event_desc_t *e); extern int pfm_arm_get_event_first(void *this); extern int pfm_arm_get_event_next(void *this, int idx); extern int pfm_arm_event_is_valid(void *this, int pidx); extern int pfm_arm_validate_table(void *this, FILE *fp); extern int pfm_arm_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info); extern int pfm_arm_get_event_info(void *this, int idx, pfm_event_info_t *info); extern unsigned int pfm_arm_get_event_nattrs(void *this, int pidx); extern void pfm_arm_perf_validate_pattrs(void *this, pfmlib_event_desc_t 
*e); extern int pfm_arm_get_perf_encoding(void *this, pfmlib_event_desc_t *e); #define ARM_ATTR_K 0 /* pl1 priv level */ #define ARM_ATTR_U 1 /* user priv level */ #define ARM_ATTR_HV 2 /* hypervisor priv level */ #define _ARM_ATTR_K (1 << ARM_ATTR_K) #define _ARM_ATTR_U (1 << ARM_ATTR_U) #define _ARM_ATTR_HV (1 << ARM_ATTR_HV) #define ARM_ATTR_PLM_ALL (_ARM_ATTR_K|_ARM_ATTR_U|_ARM_ATTR_HV) #define ARMV7_A15_ATTRS (_ARM_ATTR_K|_ARM_ATTR_U|_ARM_ATTR_HV) #define ARMV7_A15_PLM (PFM_PLM0|PFM_PLM3|PFM_PLMH) #define ARMV7_A7_ATTRS (_ARM_ATTR_K|_ARM_ATTR_U|_ARM_ATTR_HV) #define ARMV7_A7_PLM (PFM_PLM0|PFM_PLM3|PFM_PLMH) #define ARMV8_ATTRS (_ARM_ATTR_K|_ARM_ATTR_U|_ARM_ATTR_HV) #define ARMV8_PLM (PFM_PLM0|PFM_PLM3|PFM_PLMH) #define ARMV9_ATTRS (_ARM_ATTR_K|_ARM_ATTR_U|_ARM_ATTR_HV) #define ARMV9_PLM (PFM_PLM0|PFM_PLM3|PFM_PLMH) static inline int arm_has_plm(void *this, pfmlib_event_desc_t *e) { const arm_entry_t *pe = this_pe(this); return pe[e->event].modmsk & ARM_ATTR_PLM_ALL; } #endif /* __PFMLIB_ARM_PRIV_H__ */ papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_cell.c000066400000000000000000000435451502707512200211610ustar00rootroot00000000000000/* * pfmlib_cell.c : support for the Cell PMU family * * Copyright (c) 2007 TOSHIBA CORPORATION based on code from * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include /* public headers */ #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_cell_priv.h" /* architecture private */ #include "cell_events.h" /* PMU private */ #define SIGNAL_TYPE_CYCLES 0 #define PM_COUNTER_CTRL_CYLES 0x42C00000U #define PFM_CELL_NUM_PMCS 24 #define PFM_CELL_EVENT_MIN 1 #define PFM_CELL_EVENT_MAX 8 #define PMX_MIN_NUM 1 #define PMX_MAX_NUM 8 #define PFM_CELL_16BIT_CNTR_EVENT_MAX 8 #define PFM_CELL_32BIT_CNTR_EVENT_MAX 4 #define COMMON_REG_NUMS 8 #define ENABLE_WORD0 0 #define ENABLE_WORD1 1 #define ENABLE_WORD2 2 #define PFM_CELL_GRP_CONTROL_REG_GRP0_BIT 30 #define PFM_CELL_GRP_CONTROL_REG_GRP1_BIT 28 #define PFM_CELL_BASE_WORD_UNIT_FIELD_BIT 24 #define PFM_CELL_WORD_UNIT_FIELD_WIDTH 2 #define PFM_CELL_MAX_WORD_NUMBER 3 #define PFM_CELL_COUNTER_CONTROL_GRP1 0x80000000U #define PFM_CELL_DEFAULT_TRIGGER_EVENT_UNIT 0x00555500U #define PFM_CELL_PM_CONTROL_16BIT_CNTR_MASK 0x01E00000U #define PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_PROBLEM 0x00080000U #define PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_SUPERVISOR 0x00000000U #define PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_HYPERVISOR 0x00040000U #define PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_ALL 0x000C0000U #define PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_MASK 0x000C0000U #define ONLY_WORD(x) \ ((x == WORD_0_ONLY)||(x == WORD_2_ONLY)) ? 
x : 0 struct pfm_cell_signal_group_desc { unsigned int signal_type; unsigned int word_type; unsigned long long word; unsigned long long freq; unsigned int subunit; }; #define swap_int(num1, num2) do { \ int tmp = num1; \ num1 = num2; \ num2 = tmp; \ } while(0) static int pfm_cell_detect(void) { int ret; char buffer[128]; ret = __pfm_getcpuinfo_attr("cpu", buffer, sizeof(buffer)); if (ret == -1) { return PFMLIB_ERR_NOTSUPP; } if (strcmp(buffer, "Cell Broadband Engine, altivec supported")) { return PFMLIB_ERR_NOTSUPP; } return PFMLIB_SUCCESS; } static int get_pmx_offset(int pmx_num, unsigned int *pmx_ctrl_bits) { /* pmx_num==0 -> not specified * pmx_num==1 -> pm0 * : * pmx_num==8 -> pm7 */ int i = 0; int offset; if ((pmx_num >= PMX_MIN_NUM) && (pmx_num <= PMX_MAX_NUM)) { /* offset is specified */ offset = (pmx_num - 1); if ((~*pmx_ctrl_bits >> offset) & 0x1) { *pmx_ctrl_bits |= (0x1 << offset); return offset; } else { /* offset is used */ return PFMLIB_ERR_INVAL; } } else if (pmx_num == 0){ /* offset is not specified */ while (((*pmx_ctrl_bits >> i) & 0x1) && (i < PMX_MAX_NUM)) { i++; } *pmx_ctrl_bits |= (0x1 << i); return i; } /* pmx_num is invalid */ return PFMLIB_ERR_INVAL; } static unsigned long long search_enable_word(int word) { unsigned long long count = 0; while ((~word) & 0x1) { count++; word >>= 1; } return count; } static int get_count_bit(unsigned int type) { int count = 0; while(type) { if (type & 1) { count++; } type >>= 1; } return count; } static int get_debug_bus_word(struct pfm_cell_signal_group_desc *group0, struct pfm_cell_signal_group_desc *group1) { unsigned int word_type0, word_type1; /* search enable word */ word_type0 = group0->word_type; word_type1 = group1->word_type; if (group1->signal_type == NONE_SIGNAL) { group0->word = search_enable_word(word_type0); goto found; } /* swap */ if ((get_count_bit(word_type0) > get_count_bit(word_type1)) || (group0->freq == PFM_CELL_PME_FREQ_SPU)) { swap_int(group0->signal_type, group1->signal_type); 
swap_int(group0->freq, group1->freq); swap_int(group0->word_type, group1->word_type); swap_int(group0->subunit, group1->subunit); swap_int(word_type0, word_type1); } if ((ONLY_WORD(word_type0) != 0) && (word_type0 == word_type1)) { return PFMLIB_ERR_INVAL; } if (ONLY_WORD(word_type0)) { group0->word = search_enable_word(ONLY_WORD(word_type0)); word_type1 &= ~(1UL << (group0->word)); group1->word = search_enable_word(word_type1); } else if (ONLY_WORD(word_type1)) { group1->word = search_enable_word(ONLY_WORD(word_type1)); word_type0 &= ~(1UL << (group1->word)); group0->word = search_enable_word(word_type0); } else { group0->word = ENABLE_WORD0; if (word_type1 == WORD_0_AND_1) { group1->word = ENABLE_WORD1; } else if(word_type1 == WORD_0_AND_2) { group1->word = ENABLE_WORD2; } else { return PFMLIB_ERR_INVAL; } } found: return PFMLIB_SUCCESS; } static unsigned int get_signal_type(unsigned long long event_code) { return (event_code & 0x00000000FFFFFFFFULL) / 100; } static unsigned int get_signal_bit(unsigned long long event_code) { return (event_code & 0x00000000FFFFFFFFULL) % 100; } static int is_spe_signal_group(unsigned int signal_type) { if (41 <= signal_type && signal_type <= 56) { return 1; } else { return 0; } } static int check_signal_type(pfmlib_input_param_t *inp, pfmlib_cell_input_param_t *mod_in, struct pfm_cell_signal_group_desc *group0, struct pfm_cell_signal_group_desc *group1) { pfmlib_event_t *e; unsigned int event_cnt; int signal_cnt = 0; int i; int cycles_signal_cnt = 0; unsigned int signal_type, subunit; e = inp->pfp_events; event_cnt = inp->pfp_event_count; for(i = 0; i < event_cnt; i++) { signal_type = get_signal_type(cell_pe[e[i].event].pme_code); if ((signal_type == SIGNAL_SPU_TRIGGER) || (signal_type == SIGNAL_SPU_EVENT)) { continue; } if (signal_type == SIGNAL_TYPE_CYCLES) { cycles_signal_cnt = 1; continue; } subunit = 0; if (is_spe_signal_group(signal_type)) { subunit = mod_in->pfp_cell_counters[i].spe_subunit; } switch(signal_cnt) { case 0: 
group0->signal_type = signal_type; group0->word_type = cell_pe[e[i].event].pme_enable_word; group0->freq = cell_pe[e[i].event].pme_freq; group0->subunit = subunit; signal_cnt++; break; case 1: if ((group0->signal_type != signal_type) || (is_spe_signal_group(signal_type) && group0->subunit != subunit)) { group1->signal_type = signal_type; group1->word_type = cell_pe[e[i].event].pme_enable_word; group1->freq = cell_pe[e[i].event].pme_freq; group1->subunit = subunit; signal_cnt++; } break; case 2: if ((group0->signal_type != signal_type) && (group1->signal_type != signal_type)) { DPRINT("signal count is invalid\n"); return PFMLIB_ERR_INVAL; } break; default: DPRINT("signal count is invalid\n"); return PFMLIB_ERR_INVAL; } } return (signal_cnt + cycles_signal_cnt); } /* * The mapping between the privilege level options * and the ppu-count-mode field in the pm_control register. * * option ppu count mode(pm_control) * --------------------------------- * -u(-3) 0b10 : Problem mode * -k(-0) 0b00 : Supervisor mode * -1 0b00 : Supervisor mode * -2 0b01 : Hypervisor mode * two options 0b11 : Any mode * * Note: Hypervisor-mode and Any-mode don't work on PS3.
 *
 */
static unsigned int
get_ppu_count_mode(unsigned int plm)
{
    unsigned int ppu_count_mode = 0;

    switch (plm) {
    case PFM_PLM0:
    case PFM_PLM1:
        ppu_count_mode = PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_SUPERVISOR;
        break;
    case PFM_PLM2:
        ppu_count_mode = PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_HYPERVISOR;
        break;
    case PFM_PLM3:
        ppu_count_mode = PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_PROBLEM;
        break;
    default:
        ppu_count_mode = PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_ALL;
        break;
    }
    return ppu_count_mode;
}

static int
pfm_cell_dispatch_counters(pfmlib_input_param_t *inp,
                           pfmlib_cell_input_param_t *mod_in,
                           pfmlib_output_param_t *outp)
{
    pfmlib_event_t *e;
    pfmlib_reg_t *pc, *pd;
    unsigned int event_cnt;
    unsigned int signal_cnt = 0, pmcs_cnt = 0;
    unsigned int signal_type;
    unsigned long long signal_bit;
    struct pfm_cell_signal_group_desc group[2];
    int pmx_offset = 0;
    int i, ret;
    int input_control, polarity, count_cycle, count_enable;
    unsigned long long subunit;
    int shift0, shift1;
    unsigned int pmx_ctrl_bits;
    int max_event_cnt = PFM_CELL_32BIT_CNTR_EVENT_MAX;

    count_enable = 1;
    group[0].signal_type = group[1].signal_type = NONE_SIGNAL;
    group[0].word = group[1].word = 0L;
    group[0].freq = group[1].freq = 0L;
    group[0].subunit = group[1].subunit = 0;
    group[0].word_type = group[1].word_type = WORD_NONE;

    event_cnt = inp->pfp_event_count;
    e = inp->pfp_events;
    pc = outp->pfp_pmcs;
    pd = outp->pfp_pmds;

    /* check event_cnt */
    if (mod_in->control & PFM_CELL_PM_CONTROL_16BIT_CNTR_MASK)
        max_event_cnt = PFM_CELL_16BIT_CNTR_EVENT_MAX;
    if (event_cnt < PFM_CELL_EVENT_MIN)
        return PFMLIB_ERR_NOTFOUND;
    if (event_cnt > max_event_cnt)
        return PFMLIB_ERR_TOOMANY;

    /* check signal type */
    signal_cnt = check_signal_type(inp, mod_in, &group[0], &group[1]);
    if (signal_cnt == PFMLIB_ERR_INVAL)
        return PFMLIB_ERR_NOASSIGN;

    /* decide debug_bus word */
    if (signal_cnt != 0 && group[0].signal_type != NONE_SIGNAL) {
        ret = get_debug_bus_word(&group[0], &group[1]);
        if (ret != PFMLIB_SUCCESS)
            return PFMLIB_ERR_NOASSIGN;
    }

    /* common register setting */
    pc[pmcs_cnt].reg_num = REG_GROUP_CONTROL;
    if (signal_cnt == 1) {
        pc[pmcs_cnt].reg_value = group[0].word << PFM_CELL_GRP_CONTROL_REG_GRP0_BIT;
    } else if (signal_cnt == 2) {
        pc[pmcs_cnt].reg_value = (group[0].word << PFM_CELL_GRP_CONTROL_REG_GRP0_BIT) |
                                 (group[1].word << PFM_CELL_GRP_CONTROL_REG_GRP1_BIT);
    }
    pmcs_cnt++;

    pc[pmcs_cnt].reg_num = REG_DEBUG_BUS_CONTROL;
    if (signal_cnt == 1) {
        shift0 = PFM_CELL_BASE_WORD_UNIT_FIELD_BIT +
                 ((PFM_CELL_MAX_WORD_NUMBER - group[0].word) * PFM_CELL_WORD_UNIT_FIELD_WIDTH);
        pc[pmcs_cnt].reg_value = group[0].freq << shift0;
    } else if (signal_cnt == 2) {
        shift0 = PFM_CELL_BASE_WORD_UNIT_FIELD_BIT +
                 ((PFM_CELL_MAX_WORD_NUMBER - group[0].word) * PFM_CELL_WORD_UNIT_FIELD_WIDTH);
        shift1 = PFM_CELL_BASE_WORD_UNIT_FIELD_BIT +
                 ((PFM_CELL_MAX_WORD_NUMBER - group[1].word) * PFM_CELL_WORD_UNIT_FIELD_WIDTH);
        pc[pmcs_cnt].reg_value = (group[0].freq << shift0) | (group[1].freq << shift1);
    }
    pc[pmcs_cnt].reg_value |= PFM_CELL_DEFAULT_TRIGGER_EVENT_UNIT;
    pmcs_cnt++;

    pc[pmcs_cnt].reg_num = REG_TRACE_ADDRESS;
    pc[pmcs_cnt].reg_value = 0;
    pmcs_cnt++;

    pc[pmcs_cnt].reg_num = REG_EXT_TRACE_TIMER;
    pc[pmcs_cnt].reg_value = 0;
    pmcs_cnt++;

    pc[pmcs_cnt].reg_num = REG_PM_STATUS;
    pc[pmcs_cnt].reg_value = 0;
    pmcs_cnt++;

    pc[pmcs_cnt].reg_num = REG_PM_CONTROL;
    pc[pmcs_cnt].reg_value = (mod_in->control & ~PFM_CELL_PM_CONTROL_PPU_CNTR_MODE_MASK) |
                             get_ppu_count_mode(inp->pfp_dfl_plm);
    pmcs_cnt++;

    pc[pmcs_cnt].reg_num = REG_PM_INTERVAL;
    pc[pmcs_cnt].reg_value = mod_in->interval;
    pmcs_cnt++;

    pc[pmcs_cnt].reg_num = REG_PM_START_STOP;
    pc[pmcs_cnt].reg_value = mod_in->triggers;
    pmcs_cnt++;

    pmx_ctrl_bits = 0;

    /* pmX register setting */
    for(i = 0; i < event_cnt; i++) {
        /* PMX_CONTROL */
        pmx_offset = get_pmx_offset(mod_in->pfp_cell_counters[i].pmX_control_num,
                                    &pmx_ctrl_bits);
        if (pmx_offset == PFMLIB_ERR_INVAL) {
            DPRINT("pmX already used\n");
            return PFMLIB_ERR_INVAL;
        }

        signal_type = get_signal_type(cell_pe[e[i].event].pme_code);
        if (signal_type == SIGNAL_TYPE_CYCLES) {
            pc[pmcs_cnt].reg_value = PM_COUNTER_CTRL_CYLES;
            pc[pmcs_cnt].reg_num = REG_PM0_CONTROL + pmx_offset;
            pmcs_cnt++;
            pc[pmcs_cnt].reg_value = cell_pe[e[i].event].pme_code;
            pc[pmcs_cnt].reg_num = REG_PM0_EVENT + pmx_offset;
            pmcs_cnt++;
            pd[i].reg_num = pmx_offset;
            pd[i].reg_value = 0;
            continue;
        }

        switch(cell_pe[e[i].event].pme_type) {
        case COUNT_TYPE_BOTH_TYPE:
        case COUNT_TYPE_CUMULATIVE_LEN:
        case COUNT_TYPE_MULTI_CYCLE:
        case COUNT_TYPE_SINGLE_CYCLE:
            count_cycle = 1;
            break;
        case COUNT_TYPE_OCCURRENCE:
            count_cycle = 0;
            break;
        default:
            return PFMLIB_ERR_INVAL;
        }

        signal_bit = get_signal_bit(cell_pe[e[i].event].pme_code);
        polarity = mod_in->pfp_cell_counters[i].polarity;
        input_control = mod_in->pfp_cell_counters[i].input_control;
        subunit = 0;
        if (is_spe_signal_group(signal_type)) {
            subunit = mod_in->pfp_cell_counters[i].spe_subunit;
        }

        pc[pmcs_cnt].reg_value = ( (signal_bit << (31 - 5)) |
                                   (input_control << (31 - 6)) |
                                   (polarity << (31 - 7)) |
                                   (count_cycle << (31 - 8)) |
                                   (count_enable << (31 - 9)) );
        pc[pmcs_cnt].reg_num = REG_PM0_CONTROL + pmx_offset;
        if (signal_type == group[1].signal_type && subunit == group[1].subunit) {
            pc[pmcs_cnt].reg_value |= PFM_CELL_COUNTER_CONTROL_GRP1;
        }
        pmcs_cnt++;

        /* PMX_EVENT */
        pc[pmcs_cnt].reg_num = REG_PM0_EVENT + pmx_offset;

        /* debug bus word setting */
        if (signal_type == group[0].signal_type && subunit == group[0].subunit) {
            pc[pmcs_cnt].reg_value = (cell_pe[e[i].event].pme_code |
                                      (group[0].word << 48) | (subunit << 32));
        } else if (signal_type == group[1].signal_type && subunit == group[1].subunit) {
            pc[pmcs_cnt].reg_value = (cell_pe[e[i].event].pme_code |
                                      (group[1].word << 48) | (subunit << 32));
        } else if ((signal_type == SIGNAL_SPU_TRIGGER) ||
                   (signal_type == SIGNAL_SPU_EVENT)) {
            pc[pmcs_cnt].reg_value = cell_pe[e[i].event].pme_code | (subunit << 32);
        } else {
            return PFMLIB_ERR_INVAL;
        }
        pmcs_cnt++;

        /* pmd setting */
        pd[i].reg_num = pmx_offset;
        pd[i].reg_value = 0;
    }

    outp->pfp_pmc_count = pmcs_cnt;
    outp->pfp_pmd_count = event_cnt;

    return PFMLIB_SUCCESS;
}

static int
pfm_cell_dispatch_events(pfmlib_input_param_t *inp, void *model_in,
                         pfmlib_output_param_t *outp, void *model_out)
{
    pfmlib_cell_input_param_t *mod_in = (pfmlib_cell_input_param_t *)model_in;
    pfmlib_cell_input_param_t default_model_in;
    int i;

    if (model_in) {
        mod_in = (pfmlib_cell_input_param_t *)model_in;
    } else {
        mod_in = &default_model_in;
        mod_in->control = 0x80000000;
        mod_in->interval = 0;
        mod_in->triggers = 0;
        for (i = 0; i < PMU_CELL_NUM_COUNTERS; i++) {
            mod_in->pfp_cell_counters[i].pmX_control_num = 0;
            mod_in->pfp_cell_counters[i].spe_subunit = 0;
            mod_in->pfp_cell_counters[i].polarity = 1;
            mod_in->pfp_cell_counters[i].input_control = 0;
            mod_in->pfp_cell_counters[i].cnt_mask = 0;
            mod_in->pfp_cell_counters[i].flags = 0;
        }
    }
    return pfm_cell_dispatch_counters(inp, mod_in, outp);
}

static int
pfm_cell_get_event_code(unsigned int i, unsigned int cnt, int *code)
{
    // if (cnt != PFMLIB_CNT_FIRST && cnt > 2) {
    if (cnt != PFMLIB_CNT_FIRST && cnt > cell_support.num_cnt) {
        return PFMLIB_ERR_INVAL;
    }
    *code = cell_pe[i].pme_code;
    return PFMLIB_SUCCESS;
}

static void
pfm_cell_get_event_counters(unsigned int j, pfmlib_regmask_t *counters)
{
    unsigned int i;

    memset(counters, 0, sizeof(*counters));
    for(i=0; i < PMU_CELL_NUM_COUNTERS; i++) {
        pfm_regmask_set(counters, i);
    }
}

static void
pfm_cell_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs)
{
    unsigned int i;

    memset(impl_pmcs, 0, sizeof(*impl_pmcs));
    for(i=0; i < PFM_CELL_NUM_PMCS; i++) {
        pfm_regmask_set(impl_pmcs, i);
    }
}

static void
pfm_cell_get_impl_pmds(pfmlib_regmask_t *impl_pmds)
{
    unsigned int i;

    memset(impl_pmds, 0, sizeof(*impl_pmds));
    for(i=0; i < PMU_CELL_NUM_PERFCTR; i++) {
        pfm_regmask_set(impl_pmds, i);
    }
}

static void
pfm_cell_get_impl_counters(pfmlib_regmask_t *impl_counters)
{
    unsigned int i;

    for(i=0; i < PMU_CELL_NUM_COUNTERS; i++) {
        pfm_regmask_set(impl_counters, i);
    }
}

static char*
pfm_cell_get_event_name(unsigned int i)
{
    return
cell_pe[i].pme_name;
}

static int
pfm_cell_get_event_desc(unsigned int ev, char **str)
{
    char *s;

    s = cell_pe[ev].pme_desc;
    if (s) {
        *str = strdup(s);
    } else {
        *str = NULL;
    }
    return PFMLIB_SUCCESS;
}

static int
pfm_cell_get_cycle_event(pfmlib_event_t *e)
{
    int i;

    for (i = 0; i < PME_CELL_EVENT_COUNT; i++) {
        if (!strcmp(cell_pe[i].pme_name, "CYCLES")) {
            e->event = i;
            return PFMLIB_SUCCESS;
        }
    }
    return PFMLIB_ERR_NOTFOUND;
}

int
pfm_cell_spe_event(unsigned int event_index)
{
    if (event_index >= PME_CELL_EVENT_COUNT)
        return 0;
    return is_spe_signal_group(get_signal_type(cell_pe[event_index].pme_code));
}

pfm_pmu_support_t cell_support={
    .pmu_name           = "CELL",
    .pmu_type           = PFMLIB_CELL_PMU,
    .pme_count          = PME_CELL_EVENT_COUNT,
    .pmc_count          = PFM_CELL_NUM_PMCS,
    .pmd_count          = PMU_CELL_NUM_PERFCTR,
    .num_cnt            = PMU_CELL_NUM_COUNTERS,
    .get_event_code     = pfm_cell_get_event_code,
    .get_event_name     = pfm_cell_get_event_name,
    .get_event_counters = pfm_cell_get_event_counters,
    .dispatch_events    = pfm_cell_dispatch_events,
    .pmu_detect         = pfm_cell_detect,
    .get_impl_pmcs      = pfm_cell_get_impl_pmcs,
    .get_impl_pmds      = pfm_cell_get_impl_pmds,
    .get_impl_counters  = pfm_cell_get_impl_counters,
    .get_event_desc     = pfm_cell_get_event_desc,
    .get_cycle_event    = pfm_cell_get_cycle_event
};

/* ==== papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_cell_priv.h ==== */

/*
 * Copyright (c) 2007 TOSHIBA CORPORATION based on code from
 * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */
#ifndef __PFMLIB_CELL_PRIV_H__
#define __PFMLIB_CELL_PRIV_H__

#define PFM_CELL_PME_FREQ_PPU_MFC 0
#define PFM_CELL_PME_FREQ_SPU     1
#define PFM_CELL_PME_FREQ_HALF    2

typedef struct {
    char *pme_name;               /* event name */
    char *pme_desc;               /* event description */
    unsigned long long pme_code;  /* event code */
    unsigned int pme_type;        /* count type */
    unsigned int pme_freq;        /* debug_bus_control's frequency value */
    unsigned int pme_enable_word;
} pme_cell_entry_t;

/* PMC register */
#define REG_PM0_CONTROL       0x0000
#define REG_PM1_CONTROL       0x0001
#define REG_PM2_CONTROL       0x0002
#define REG_PM3_CONTROL       0x0003
#define REG_PM4_CONTROL       0x0004
#define REG_PM5_CONTROL       0x0005
#define REG_PM6_CONTROL       0x0006
#define REG_PM7_CONTROL       0x0007
#define REG_PM0_EVENT         0x0008
#define REG_PM1_EVENT         0x0009
#define REG_PM2_EVENT         0x000A
#define REG_PM3_EVENT         0x000B
#define REG_PM4_EVENT         0x000C
#define REG_PM5_EVENT         0x000D
#define REG_PM6_EVENT         0x000E
#define REG_PM7_EVENT         0x000F
#define REG_GROUP_CONTROL     0x0010
#define REG_DEBUG_BUS_CONTROL 0x0011
#define REG_TRACE_ADDRESS     0x0012
#define REG_EXT_TRACE_TIMER   0x0013
#define REG_PM_STATUS         0x0014
#define REG_PM_CONTROL        0x0015
#define REG_PM_INTERVAL       0x0016
#define REG_PM_START_STOP     0x0017

#define NONE_SIGNAL        0x0000
#define SIGNAL_SPU         41
#define SIGNAL_SPU_TRIGGER 42
#define SIGNAL_SPU_EVENT   43

#define COUNT_TYPE_BOTH_TYPE      1
#define COUNT_TYPE_CUMULATIVE_LEN 2
#define COUNT_TYPE_OCCURRENCE     3
#define COUNT_TYPE_MULTI_CYCLE    4
#define COUNT_TYPE_SINGLE_CYCLE   5

#define WORD_0_ONLY  1 /* 0001 */
#define WORD_2_ONLY  4 /* 0100 */
#define WORD_0_AND_1 3 /* 0011 */
#define WORD_0_AND_2 5 /* 0101 */
#define WORD_NONE    0

#endif /* __PFMLIB_CELL_PRIV_H__ */

/* ==== papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_common.c ==== */

/*
 * pfmlib_common.c: set of functions common to all PMU models
 *
 * Copyright (c) 2009 Google, Inc
 * Contributed by Stephane Eranian
 *
 * Based on:
 * Copyright (c) 2001-2006 Hewlett-Packard
Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES.
 * Contributed by John Linford
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */
/* standard headers required by this file */
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdarg.h>
#include <string.h>
#include <stdint.h>
#include <limits.h>
#include <ctype.h>

#include "pfmlib_priv.h"

static pfmlib_pmu_t *pfmlib_pmus[]=
{
#ifdef CONFIG_PFMLIB_ARCH_IA64
#if 0
    &montecito_support,
    &itanium2_support,
    &itanium_support,
    &generic_ia64_support, /* must always be last for IA-64 */
#endif
#endif

#ifdef CONFIG_PFMLIB_ARCH_I386
    /* 32-bit only processors */
    &intel_pii_support,
    &intel_ppro_support,
    &intel_p6_support,
    &intel_pm_support,
    &intel_coreduo_support,
#endif

#ifdef CONFIG_PFMLIB_ARCH_X86
    /* 32 and 64 bit processors */
    &netburst_support,
    &netburst_p_support,
    &amd64_k7_support,
    &amd64_k8_revb_support,
    &amd64_k8_revc_support,
    &amd64_k8_revd_support,
    &amd64_k8_reve_support,
    &amd64_k8_revf_support,
    &amd64_k8_revg_support,
    &amd64_fam10h_barcelona_support,
    &amd64_fam10h_shanghai_support,
    &amd64_fam10h_istanbul_support,
    &amd64_fam11h_turion_support,
    &amd64_fam12h_llano_support,
    &amd64_fam14h_bobcat_support,
    &amd64_fam15h_interlagos_support,
    &amd64_fam15h_nb_support,
    &amd64_fam16h_support,
    &amd64_fam17h_deprecated_support,
    &amd64_fam17h_zen1_support,
    &amd64_fam17h_zen2_support,
    &amd64_fam19h_zen3_support,
    &amd64_fam19h_zen4_support,
    &amd64_fam19h_zen3_l3_support,
    &amd64_fam1ah_zen5_support,
    &amd64_fam1ah_zen5_l3_support,
    &amd64_rapl_support,
    &intel_core_support,
    &intel_atom_support,
    &intel_nhm_support,
    &intel_nhm_ex_support,
    &intel_nhm_unc_support,
    &intel_wsm_sp_support,
    &intel_wsm_dp_support,
    &intel_wsm_unc_support,
    &intel_snb_support,
    &intel_snb_unc_cbo0_support,
    &intel_snb_unc_cbo1_support,
    &intel_snb_unc_cbo2_support,
    &intel_snb_unc_cbo3_support,
    &intel_snb_ep_support,
    &intel_ivb_support,
    &intel_ivb_unc_cbo0_support,
    &intel_ivb_unc_cbo1_support,
    &intel_ivb_unc_cbo2_support,
    &intel_ivb_unc_cbo3_support,
    &intel_ivb_ep_support,
    &intel_hsw_support,
    &intel_hsw_ep_support,
    &intel_bdw_support,
    &intel_bdw_ep_support,
    &intel_skl_support,
    &intel_skx_support,
    &intel_clx_support,
    &intel_icl_support,
    &intel_icx_support,
    &intel_icx_unc_cha0_support,
&intel_icx_unc_cha1_support, &intel_icx_unc_cha2_support, &intel_icx_unc_cha3_support, &intel_icx_unc_cha4_support, &intel_icx_unc_cha5_support, &intel_icx_unc_cha6_support, &intel_icx_unc_cha7_support, &intel_icx_unc_cha8_support, &intel_icx_unc_cha9_support, &intel_icx_unc_cha10_support, &intel_icx_unc_cha11_support, &intel_icx_unc_cha12_support, &intel_icx_unc_cha13_support, &intel_icx_unc_cha14_support, &intel_icx_unc_cha15_support, &intel_icx_unc_cha16_support, &intel_icx_unc_cha17_support, &intel_icx_unc_cha18_support, &intel_icx_unc_cha19_support, &intel_icx_unc_cha20_support, &intel_icx_unc_cha21_support, &intel_icx_unc_cha22_support, &intel_icx_unc_cha23_support, &intel_icx_unc_cha24_support, &intel_icx_unc_cha25_support, &intel_icx_unc_cha26_support, &intel_icx_unc_cha27_support, &intel_icx_unc_cha28_support, &intel_icx_unc_cha29_support, &intel_icx_unc_cha30_support, &intel_icx_unc_cha31_support, &intel_icx_unc_cha32_support, &intel_icx_unc_cha33_support, &intel_icx_unc_cha34_support, &intel_icx_unc_cha35_support, &intel_icx_unc_cha36_support, &intel_icx_unc_cha37_support, &intel_icx_unc_cha38_support, &intel_icx_unc_cha39_support, &intel_icx_unc_imc0_support, &intel_icx_unc_imc1_support, &intel_icx_unc_imc2_support, &intel_icx_unc_imc3_support, &intel_icx_unc_imc4_support, &intel_icx_unc_imc5_support, &intel_icx_unc_imc6_support, &intel_icx_unc_imc7_support, &intel_icx_unc_imc8_support, &intel_icx_unc_imc9_support, &intel_icx_unc_imc10_support, &intel_icx_unc_imc11_support, &intel_icx_unc_m2m0_support, &intel_icx_unc_m2m1_support, &intel_icx_unc_iio0_support, &intel_icx_unc_iio1_support, &intel_icx_unc_iio2_support, &intel_icx_unc_iio3_support, &intel_icx_unc_iio4_support, &intel_icx_unc_iio5_support, &intel_icx_unc_irp0_support, &intel_icx_unc_irp1_support, &intel_icx_unc_irp2_support, &intel_icx_unc_irp3_support, &intel_icx_unc_irp4_support, &intel_icx_unc_irp5_support, &intel_icx_unc_pcu_support, &intel_icx_unc_upi0_support, 
&intel_icx_unc_upi1_support, &intel_icx_unc_upi2_support, &intel_icx_unc_upi3_support, &intel_icx_unc_m3upi0_support, &intel_icx_unc_m3upi1_support, &intel_icx_unc_m3upi2_support, &intel_icx_unc_m3upi3_support, &intel_icx_unc_ubox_support, &intel_icx_unc_m2pcie0_support, &intel_icx_unc_m2pcie1_support, &intel_icx_unc_m2pcie2_support, &intel_spr_support, &intel_spr_unc_imc0_support, &intel_spr_unc_imc1_support, &intel_spr_unc_imc2_support, &intel_spr_unc_imc3_support, &intel_spr_unc_imc4_support, &intel_spr_unc_imc5_support, &intel_spr_unc_imc6_support, &intel_spr_unc_imc7_support, &intel_spr_unc_imc8_support, &intel_spr_unc_imc9_support, &intel_spr_unc_imc10_support, &intel_spr_unc_imc11_support, &intel_spr_unc_upi0_support, &intel_spr_unc_upi1_support, &intel_spr_unc_upi2_support, &intel_spr_unc_upi3_support, &intel_spr_unc_cha0_support, &intel_spr_unc_cha1_support, &intel_spr_unc_cha2_support, &intel_spr_unc_cha3_support, &intel_spr_unc_cha4_support, &intel_spr_unc_cha5_support, &intel_spr_unc_cha6_support, &intel_spr_unc_cha7_support, &intel_spr_unc_cha8_support, &intel_spr_unc_cha9_support, &intel_spr_unc_cha10_support, &intel_spr_unc_cha11_support, &intel_spr_unc_cha12_support, &intel_spr_unc_cha13_support, &intel_spr_unc_cha14_support, &intel_spr_unc_cha15_support, &intel_spr_unc_cha16_support, &intel_spr_unc_cha17_support, &intel_spr_unc_cha18_support, &intel_spr_unc_cha19_support, &intel_spr_unc_cha20_support, &intel_spr_unc_cha21_support, &intel_spr_unc_cha22_support, &intel_spr_unc_cha23_support, &intel_spr_unc_cha24_support, &intel_spr_unc_cha25_support, &intel_spr_unc_cha26_support, &intel_spr_unc_cha27_support, &intel_spr_unc_cha28_support, &intel_spr_unc_cha29_support, &intel_spr_unc_cha30_support, &intel_spr_unc_cha31_support, &intel_spr_unc_cha32_support, &intel_spr_unc_cha33_support, &intel_spr_unc_cha34_support, &intel_spr_unc_cha35_support, &intel_spr_unc_cha36_support, &intel_spr_unc_cha37_support, &intel_spr_unc_cha38_support, 
&intel_spr_unc_cha39_support, &intel_spr_unc_cha40_support, &intel_spr_unc_cha41_support, &intel_spr_unc_cha42_support, &intel_spr_unc_cha43_support, &intel_spr_unc_cha44_support, &intel_spr_unc_cha45_support, &intel_spr_unc_cha46_support, &intel_spr_unc_cha47_support, &intel_spr_unc_cha48_support, &intel_spr_unc_cha49_support, &intel_spr_unc_cha50_support, &intel_spr_unc_cha51_support, &intel_spr_unc_cha52_support, &intel_spr_unc_cha53_support, &intel_spr_unc_cha54_support, &intel_spr_unc_cha55_support, &intel_spr_unc_cha56_support, &intel_spr_unc_cha57_support, &intel_spr_unc_cha58_support, &intel_spr_unc_cha59_support, &intel_emr_support, &intel_gnr_support, &intel_gnr_unc_imc0_support, &intel_gnr_unc_imc1_support, &intel_gnr_unc_imc2_support, &intel_gnr_unc_imc3_support, &intel_gnr_unc_imc4_support, &intel_gnr_unc_imc5_support, &intel_gnr_unc_imc6_support, &intel_gnr_unc_imc7_support, &intel_gnr_unc_imc8_support, &intel_gnr_unc_imc9_support, &intel_gnr_unc_imc10_support, &intel_gnr_unc_imc11_support, &intel_adl_glc_support, &intel_adl_grt_support, &intel_rapl_support, &intel_snbep_unc_cb0_support, &intel_snbep_unc_cb1_support, &intel_snbep_unc_cb2_support, &intel_snbep_unc_cb3_support, &intel_snbep_unc_cb4_support, &intel_snbep_unc_cb5_support, &intel_snbep_unc_cb6_support, &intel_snbep_unc_cb7_support, &intel_snbep_unc_ha_support, &intel_snbep_unc_imc0_support, &intel_snbep_unc_imc1_support, &intel_snbep_unc_imc2_support, &intel_snbep_unc_imc3_support, &intel_snbep_unc_pcu_support, &intel_snbep_unc_qpi0_support, &intel_snbep_unc_qpi1_support, &intel_snbep_unc_ubo_support, &intel_snbep_unc_r2pcie_support, &intel_snbep_unc_r3qpi0_support, &intel_snbep_unc_r3qpi1_support, &intel_knc_support, &intel_slm_support, &intel_glm_support, &intel_tmt_support, &intel_ivbep_unc_cb0_support, &intel_ivbep_unc_cb1_support, &intel_ivbep_unc_cb2_support, &intel_ivbep_unc_cb3_support, &intel_ivbep_unc_cb4_support, &intel_ivbep_unc_cb5_support, &intel_ivbep_unc_cb6_support, 
&intel_ivbep_unc_cb7_support, &intel_ivbep_unc_cb8_support, &intel_ivbep_unc_cb9_support, &intel_ivbep_unc_cb10_support, &intel_ivbep_unc_cb11_support, &intel_ivbep_unc_cb12_support, &intel_ivbep_unc_cb13_support, &intel_ivbep_unc_cb14_support, &intel_ivbep_unc_ha0_support, &intel_ivbep_unc_ha1_support, &intel_ivbep_unc_imc0_support, &intel_ivbep_unc_imc1_support, &intel_ivbep_unc_imc2_support, &intel_ivbep_unc_imc3_support, &intel_ivbep_unc_imc4_support, &intel_ivbep_unc_imc5_support, &intel_ivbep_unc_imc6_support, &intel_ivbep_unc_imc7_support, &intel_ivbep_unc_pcu_support, &intel_ivbep_unc_qpi0_support, &intel_ivbep_unc_qpi1_support, &intel_ivbep_unc_qpi2_support, &intel_ivbep_unc_ubo_support, &intel_ivbep_unc_r2pcie_support, &intel_ivbep_unc_r3qpi0_support, &intel_ivbep_unc_r3qpi1_support, &intel_ivbep_unc_r3qpi2_support, &intel_ivbep_unc_irp_support, &intel_hswep_unc_cb0_support, &intel_hswep_unc_cb1_support, &intel_hswep_unc_cb2_support, &intel_hswep_unc_cb3_support, &intel_hswep_unc_cb4_support, &intel_hswep_unc_cb5_support, &intel_hswep_unc_cb6_support, &intel_hswep_unc_cb7_support, &intel_hswep_unc_cb8_support, &intel_hswep_unc_cb9_support, &intel_hswep_unc_cb10_support, &intel_hswep_unc_cb11_support, &intel_hswep_unc_cb12_support, &intel_hswep_unc_cb13_support, &intel_hswep_unc_cb14_support, &intel_hswep_unc_cb15_support, &intel_hswep_unc_cb16_support, &intel_hswep_unc_cb17_support, &intel_hswep_unc_ha0_support, &intel_hswep_unc_ha1_support, &intel_hswep_unc_imc0_support, &intel_hswep_unc_imc1_support, &intel_hswep_unc_imc2_support, &intel_hswep_unc_imc3_support, &intel_hswep_unc_imc4_support, &intel_hswep_unc_imc5_support, &intel_hswep_unc_imc6_support, &intel_hswep_unc_imc7_support, &intel_hswep_unc_pcu_support, &intel_hswep_unc_qpi0_support, &intel_hswep_unc_qpi1_support, &intel_hswep_unc_sb0_support, &intel_hswep_unc_sb1_support, &intel_hswep_unc_sb2_support, &intel_hswep_unc_sb3_support, &intel_hswep_unc_ubo_support, &intel_hswep_unc_r2pcie_support, 
&intel_hswep_unc_r3qpi0_support, &intel_hswep_unc_r3qpi1_support, &intel_hswep_unc_r3qpi2_support, &intel_hswep_unc_irp_support, &intel_knl_support, &intel_knl_unc_imc0_support, &intel_knl_unc_imc1_support, &intel_knl_unc_imc2_support, &intel_knl_unc_imc3_support, &intel_knl_unc_imc4_support, &intel_knl_unc_imc5_support, &intel_knl_unc_imc_uclk0_support, &intel_knl_unc_imc_uclk1_support, &intel_knl_unc_edc_uclk0_support, &intel_knl_unc_edc_uclk1_support, &intel_knl_unc_edc_uclk2_support, &intel_knl_unc_edc_uclk3_support, &intel_knl_unc_edc_uclk4_support, &intel_knl_unc_edc_uclk5_support, &intel_knl_unc_edc_uclk6_support, &intel_knl_unc_edc_uclk7_support, &intel_knl_unc_edc_eclk0_support, &intel_knl_unc_edc_eclk1_support, &intel_knl_unc_edc_eclk2_support, &intel_knl_unc_edc_eclk3_support, &intel_knl_unc_edc_eclk4_support, &intel_knl_unc_edc_eclk5_support, &intel_knl_unc_edc_eclk6_support, &intel_knl_unc_edc_eclk7_support, &intel_knl_unc_cha0_support, &intel_knl_unc_cha1_support, &intel_knl_unc_cha2_support, &intel_knl_unc_cha3_support, &intel_knl_unc_cha4_support, &intel_knl_unc_cha5_support, &intel_knl_unc_cha6_support, &intel_knl_unc_cha7_support, &intel_knl_unc_cha8_support, &intel_knl_unc_cha9_support, &intel_knl_unc_cha10_support, &intel_knl_unc_cha11_support, &intel_knl_unc_cha12_support, &intel_knl_unc_cha13_support, &intel_knl_unc_cha14_support, &intel_knl_unc_cha15_support, &intel_knl_unc_cha16_support, &intel_knl_unc_cha17_support, &intel_knl_unc_cha18_support, &intel_knl_unc_cha19_support, &intel_knl_unc_cha20_support, &intel_knl_unc_cha21_support, &intel_knl_unc_cha22_support, &intel_knl_unc_cha23_support, &intel_knl_unc_cha24_support, &intel_knl_unc_cha25_support, &intel_knl_unc_cha26_support, &intel_knl_unc_cha27_support, &intel_knl_unc_cha28_support, &intel_knl_unc_cha29_support, &intel_knl_unc_cha30_support, &intel_knl_unc_cha31_support, &intel_knl_unc_cha32_support, &intel_knl_unc_cha33_support, &intel_knl_unc_cha34_support, 
&intel_knl_unc_cha35_support, &intel_knl_unc_cha36_support, &intel_knl_unc_cha37_support, &intel_knl_unc_m2pcie_support, &intel_bdx_unc_cb0_support, &intel_bdx_unc_cb1_support, &intel_bdx_unc_cb2_support, &intel_bdx_unc_cb3_support, &intel_bdx_unc_cb4_support, &intel_bdx_unc_cb5_support, &intel_bdx_unc_cb6_support, &intel_bdx_unc_cb7_support, &intel_bdx_unc_cb8_support, &intel_bdx_unc_cb9_support, &intel_bdx_unc_cb10_support, &intel_bdx_unc_cb11_support, &intel_bdx_unc_cb12_support, &intel_bdx_unc_cb13_support, &intel_bdx_unc_cb14_support, &intel_bdx_unc_cb15_support, &intel_bdx_unc_cb16_support, &intel_bdx_unc_cb17_support, &intel_bdx_unc_cb18_support, &intel_bdx_unc_cb19_support, &intel_bdx_unc_cb20_support, &intel_bdx_unc_cb21_support, &intel_bdx_unc_cb22_support, &intel_bdx_unc_cb23_support, &intel_bdx_unc_ubo_support, &intel_bdx_unc_sbo0_support, &intel_bdx_unc_sbo1_support, &intel_bdx_unc_sbo2_support, &intel_bdx_unc_sbo3_support, &intel_bdx_unc_ha0_support, &intel_bdx_unc_ha1_support, &intel_bdx_unc_imc0_support, &intel_bdx_unc_imc1_support, &intel_bdx_unc_imc2_support, &intel_bdx_unc_imc3_support, &intel_bdx_unc_imc4_support, &intel_bdx_unc_imc5_support, &intel_bdx_unc_imc6_support, &intel_bdx_unc_imc7_support, &intel_bdx_unc_irp_support, &intel_bdx_unc_pcu_support, &intel_bdx_unc_qpi0_support, &intel_bdx_unc_qpi1_support, &intel_bdx_unc_qpi2_support, &intel_bdx_unc_r2pcie_support, &intel_bdx_unc_r3qpi0_support, &intel_bdx_unc_r3qpi1_support, &intel_bdx_unc_r3qpi2_support, &intel_skx_unc_cha0_support, &intel_skx_unc_cha1_support, &intel_skx_unc_cha2_support, &intel_skx_unc_cha3_support, &intel_skx_unc_cha4_support, &intel_skx_unc_cha5_support, &intel_skx_unc_cha6_support, &intel_skx_unc_cha7_support, &intel_skx_unc_cha8_support, &intel_skx_unc_cha9_support, &intel_skx_unc_cha10_support, &intel_skx_unc_cha11_support, &intel_skx_unc_cha12_support, &intel_skx_unc_cha13_support, &intel_skx_unc_cha14_support, &intel_skx_unc_cha15_support, 
&intel_skx_unc_cha16_support, &intel_skx_unc_cha17_support, &intel_skx_unc_cha18_support, &intel_skx_unc_cha19_support, &intel_skx_unc_cha20_support, &intel_skx_unc_cha21_support, &intel_skx_unc_cha22_support, &intel_skx_unc_cha23_support, &intel_skx_unc_cha24_support, &intel_skx_unc_cha25_support, &intel_skx_unc_cha26_support, &intel_skx_unc_cha27_support, &intel_skx_unc_iio0_support, &intel_skx_unc_iio1_support, &intel_skx_unc_iio2_support, &intel_skx_unc_iio3_support, &intel_skx_unc_iio4_support, &intel_skx_unc_iio5_support, &intel_skx_unc_imc0_support, &intel_skx_unc_imc1_support, &intel_skx_unc_imc2_support, &intel_skx_unc_imc3_support, &intel_skx_unc_imc4_support, &intel_skx_unc_imc5_support, &intel_skx_unc_pcu_support, &intel_skx_unc_upi0_support, &intel_skx_unc_upi1_support, &intel_skx_unc_upi2_support, &intel_skx_unc_m2m0_support, &intel_skx_unc_m2m1_support, &intel_skx_unc_ubo_support, &intel_skx_unc_m3upi0_support, &intel_skx_unc_m3upi1_support, &intel_skx_unc_m3upi2_support, &intel_skx_unc_irp_support, &intel_knm_support, &intel_knm_unc_imc0_support, &intel_knm_unc_imc1_support, &intel_knm_unc_imc2_support, &intel_knm_unc_imc3_support, &intel_knm_unc_imc4_support, &intel_knm_unc_imc5_support, &intel_knm_unc_imc_uclk0_support, &intel_knm_unc_imc_uclk1_support, &intel_knm_unc_edc_uclk0_support, &intel_knm_unc_edc_uclk1_support, &intel_knm_unc_edc_uclk2_support, &intel_knm_unc_edc_uclk3_support, &intel_knm_unc_edc_uclk4_support, &intel_knm_unc_edc_uclk5_support, &intel_knm_unc_edc_uclk6_support, &intel_knm_unc_edc_uclk7_support, &intel_knm_unc_edc_eclk0_support, &intel_knm_unc_edc_eclk1_support, &intel_knm_unc_edc_eclk2_support, &intel_knm_unc_edc_eclk3_support, &intel_knm_unc_edc_eclk4_support, &intel_knm_unc_edc_eclk5_support, &intel_knm_unc_edc_eclk6_support, &intel_knm_unc_edc_eclk7_support, &intel_knm_unc_cha0_support, &intel_knm_unc_cha1_support, &intel_knm_unc_cha2_support, &intel_knm_unc_cha3_support, &intel_knm_unc_cha4_support, 
&intel_knm_unc_cha5_support, &intel_knm_unc_cha6_support, &intel_knm_unc_cha7_support, &intel_knm_unc_cha8_support, &intel_knm_unc_cha9_support, &intel_knm_unc_cha10_support, &intel_knm_unc_cha11_support, &intel_knm_unc_cha12_support, &intel_knm_unc_cha13_support, &intel_knm_unc_cha14_support, &intel_knm_unc_cha15_support, &intel_knm_unc_cha16_support, &intel_knm_unc_cha17_support, &intel_knm_unc_cha18_support, &intel_knm_unc_cha19_support, &intel_knm_unc_cha20_support, &intel_knm_unc_cha21_support, &intel_knm_unc_cha22_support, &intel_knm_unc_cha23_support, &intel_knm_unc_cha24_support, &intel_knm_unc_cha25_support, &intel_knm_unc_cha26_support, &intel_knm_unc_cha27_support, &intel_knm_unc_cha28_support, &intel_knm_unc_cha29_support, &intel_knm_unc_cha30_support, &intel_knm_unc_cha31_support, &intel_knm_unc_cha32_support, &intel_knm_unc_cha33_support, &intel_knm_unc_cha34_support, &intel_knm_unc_cha35_support, &intel_knm_unc_cha36_support, &intel_knm_unc_cha37_support, &intel_knm_unc_m2pcie_support, &intel_x86_arch_support, /* must always be last for x86 */ #endif #ifdef CONFIG_PFMLIB_ARCH_MIPS &mips_74k_support, #endif #ifdef CONFIG_PFMLIB_ARCH_SICORTEX &sicortex_support, #endif #ifdef CONFIG_PFMLIB_ARCH_POWERPC &power4_support, &ppc970_support, &ppc970mp_support, &power5_support, &power5p_support, &power6_support, &power7_support, &power8_support, &power9_support, &power10_support, &torrent_support, &powerpc_nest_mcs_read_support, &powerpc_nest_mcs_write_support, #endif #ifdef CONFIG_PFMLIB_ARCH_SPARC &sparc_ultra12_support, &sparc_ultra3_support, &sparc_ultra3i_support, &sparc_ultra3plus_support, &sparc_ultra4plus_support, &sparc_niagara1_support, &sparc_niagara2_support, #endif #ifdef CONFIG_PFMLIB_CELL &cell_support, #endif #ifdef CONFIG_PFMLIB_ARCH_ARM &arm_cortex_a7_support, &arm_cortex_a8_support, &arm_cortex_a9_support, &arm_cortex_a15_support, &arm_1176_support, &arm_qcom_krait_support, &arm_cortex_a57_support, &arm_cortex_a53_support, 
&arm_cortex_a55_support, &arm_cortex_a72_support, &arm_cortex_a76_support, &arm_xgene_support, &arm_thunderx2_support, &arm_thunderx2_dmc0_support, &arm_thunderx2_dmc1_support, &arm_thunderx2_llc0_support, &arm_thunderx2_llc1_support, &arm_thunderx2_ccpi0_support, &arm_thunderx2_ccpi1_support, &arm_n1_support, &arm_n2_support, &arm_n3_support, &arm_v1_support, &arm_v2_support, &arm_v3_support, &arm_hisilicon_kunpeng_support, &arm_hisilicon_kunpeng_sccl1_ddrc0_support, &arm_hisilicon_kunpeng_sccl1_ddrc1_support, &arm_hisilicon_kunpeng_sccl1_ddrc2_support, &arm_hisilicon_kunpeng_sccl1_ddrc3_support, &arm_hisilicon_kunpeng_sccl3_ddrc0_support, &arm_hisilicon_kunpeng_sccl3_ddrc1_support, &arm_hisilicon_kunpeng_sccl3_ddrc2_support, &arm_hisilicon_kunpeng_sccl3_ddrc3_support, &arm_hisilicon_kunpeng_sccl5_ddrc0_support, &arm_hisilicon_kunpeng_sccl5_ddrc1_support, &arm_hisilicon_kunpeng_sccl5_ddrc2_support, &arm_hisilicon_kunpeng_sccl5_ddrc3_support, &arm_hisilicon_kunpeng_sccl7_ddrc0_support, &arm_hisilicon_kunpeng_sccl7_ddrc1_support, &arm_hisilicon_kunpeng_sccl7_ddrc2_support, &arm_hisilicon_kunpeng_sccl7_ddrc3_support, &arm_hisilicon_kunpeng_sccl1_hha2_support, &arm_hisilicon_kunpeng_sccl1_hha3_support, &arm_hisilicon_kunpeng_sccl3_hha0_support, &arm_hisilicon_kunpeng_sccl3_hha1_support, &arm_hisilicon_kunpeng_sccl5_hha6_support, &arm_hisilicon_kunpeng_sccl5_hha7_support, &arm_hisilicon_kunpeng_sccl7_hha4_support, &arm_hisilicon_kunpeng_sccl7_hha5_support, &arm_hisilicon_kunpeng_sccl1_l3c10_support, &arm_hisilicon_kunpeng_sccl1_l3c11_support, &arm_hisilicon_kunpeng_sccl1_l3c12_support, &arm_hisilicon_kunpeng_sccl1_l3c13_support, &arm_hisilicon_kunpeng_sccl1_l3c14_support, &arm_hisilicon_kunpeng_sccl1_l3c15_support, &arm_hisilicon_kunpeng_sccl1_l3c8_support, &arm_hisilicon_kunpeng_sccl1_l3c9_support, &arm_hisilicon_kunpeng_sccl3_l3c0_support, &arm_hisilicon_kunpeng_sccl3_l3c1_support, &arm_hisilicon_kunpeng_sccl3_l3c2_support, &arm_hisilicon_kunpeng_sccl3_l3c3_support, 
&arm_hisilicon_kunpeng_sccl3_l3c4_support, &arm_hisilicon_kunpeng_sccl3_l3c5_support, &arm_hisilicon_kunpeng_sccl3_l3c6_support, &arm_hisilicon_kunpeng_sccl3_l3c7_support, &arm_hisilicon_kunpeng_sccl5_l3c24_support, &arm_hisilicon_kunpeng_sccl5_l3c25_support, &arm_hisilicon_kunpeng_sccl5_l3c26_support, &arm_hisilicon_kunpeng_sccl5_l3c27_support, &arm_hisilicon_kunpeng_sccl5_l3c28_support, &arm_hisilicon_kunpeng_sccl5_l3c29_support, &arm_hisilicon_kunpeng_sccl5_l3c30_support, &arm_hisilicon_kunpeng_sccl5_l3c31_support, &arm_hisilicon_kunpeng_sccl7_l3c16_support, &arm_hisilicon_kunpeng_sccl7_l3c17_support, &arm_hisilicon_kunpeng_sccl7_l3c18_support, &arm_hisilicon_kunpeng_sccl7_l3c19_support, &arm_hisilicon_kunpeng_sccl7_l3c20_support, &arm_hisilicon_kunpeng_sccl7_l3c21_support, &arm_hisilicon_kunpeng_sccl7_l3c22_support, &arm_hisilicon_kunpeng_sccl7_l3c23_support, #endif #ifdef CONFIG_PFMLIB_ARCH_ARM64 &arm_cortex_a57_support, &arm_cortex_a53_support, &arm_cortex_a55_support, &arm_cortex_a72_support, &arm_cortex_a76_support, &arm_xgene_support, &arm_thunderx2_support, &arm_thunderx2_dmc0_support, &arm_thunderx2_dmc1_support, &arm_thunderx2_llc0_support, &arm_thunderx2_llc1_support, &arm_thunderx2_ccpi0_support, &arm_thunderx2_ccpi1_support, &arm_fujitsu_a64fx_support, &arm_fujitsu_monaka_support, &arm_hisilicon_kunpeng_sccl1_ddrc0_support, &arm_hisilicon_kunpeng_sccl1_ddrc1_support, &arm_hisilicon_kunpeng_sccl1_ddrc2_support, &arm_hisilicon_kunpeng_sccl1_ddrc3_support, &arm_hisilicon_kunpeng_sccl3_ddrc0_support, &arm_hisilicon_kunpeng_sccl3_ddrc1_support, &arm_hisilicon_kunpeng_sccl3_ddrc2_support, &arm_hisilicon_kunpeng_sccl3_ddrc3_support, &arm_hisilicon_kunpeng_sccl5_ddrc0_support, &arm_hisilicon_kunpeng_sccl5_ddrc1_support, &arm_hisilicon_kunpeng_sccl5_ddrc2_support, &arm_hisilicon_kunpeng_sccl5_ddrc3_support, &arm_hisilicon_kunpeng_sccl7_ddrc0_support, &arm_hisilicon_kunpeng_sccl7_ddrc1_support, &arm_hisilicon_kunpeng_sccl7_ddrc2_support, 
&arm_hisilicon_kunpeng_sccl7_ddrc3_support, &arm_hisilicon_kunpeng_sccl1_hha2_support, &arm_hisilicon_kunpeng_sccl1_hha3_support, &arm_hisilicon_kunpeng_sccl3_hha0_support, &arm_hisilicon_kunpeng_sccl3_hha1_support, &arm_hisilicon_kunpeng_sccl5_hha6_support, &arm_hisilicon_kunpeng_sccl5_hha7_support, &arm_hisilicon_kunpeng_sccl7_hha4_support, &arm_hisilicon_kunpeng_sccl7_hha5_support, &arm_hisilicon_kunpeng_sccl1_l3c10_support, &arm_hisilicon_kunpeng_sccl1_l3c11_support, &arm_hisilicon_kunpeng_sccl1_l3c12_support, &arm_hisilicon_kunpeng_sccl1_l3c13_support, &arm_hisilicon_kunpeng_sccl1_l3c14_support, &arm_hisilicon_kunpeng_sccl1_l3c15_support, &arm_hisilicon_kunpeng_sccl1_l3c8_support, &arm_hisilicon_kunpeng_sccl1_l3c9_support, &arm_hisilicon_kunpeng_sccl3_l3c0_support, &arm_hisilicon_kunpeng_sccl3_l3c1_support, &arm_hisilicon_kunpeng_sccl3_l3c2_support, &arm_hisilicon_kunpeng_sccl3_l3c3_support, &arm_hisilicon_kunpeng_sccl3_l3c4_support, &arm_hisilicon_kunpeng_sccl3_l3c5_support, &arm_hisilicon_kunpeng_sccl3_l3c6_support, &arm_hisilicon_kunpeng_sccl3_l3c7_support, &arm_hisilicon_kunpeng_sccl5_l3c24_support, &arm_hisilicon_kunpeng_sccl5_l3c25_support, &arm_hisilicon_kunpeng_sccl5_l3c26_support, &arm_hisilicon_kunpeng_sccl5_l3c27_support, &arm_hisilicon_kunpeng_sccl5_l3c28_support, &arm_hisilicon_kunpeng_sccl5_l3c29_support, &arm_hisilicon_kunpeng_sccl5_l3c30_support, &arm_hisilicon_kunpeng_sccl5_l3c31_support, &arm_hisilicon_kunpeng_sccl7_l3c16_support, &arm_hisilicon_kunpeng_sccl7_l3c17_support, &arm_hisilicon_kunpeng_sccl7_l3c18_support, &arm_hisilicon_kunpeng_sccl7_l3c19_support, &arm_hisilicon_kunpeng_sccl7_l3c20_support, &arm_hisilicon_kunpeng_sccl7_l3c21_support, &arm_hisilicon_kunpeng_sccl7_l3c22_support, &arm_hisilicon_kunpeng_sccl7_l3c23_support, &arm_n1_support, &arm_n2_support, &arm_n3_support, &arm_v1_support, &arm_v2_support, &arm_v3_support, &arm_hisilicon_kunpeng_support, #endif #ifdef CONFIG_PFMLIB_ARCH_S390X &s390x_cpum_cf_support, 
&s390x_cpum_sf_support, #endif #ifdef __linux__ &perf_event_support, &perf_event_raw_support, #endif }; #define PFMLIB_NUM_PMUS (int)(sizeof(pfmlib_pmus)/sizeof(pfmlib_pmu_t *)) static pfmlib_os_t pfmlib_os_none; pfmlib_os_t *pfmlib_os = &pfmlib_os_none; static pfmlib_os_t *pfmlib_oses[]={ &pfmlib_os_none, #ifdef __linux__ &pfmlib_os_perf, &pfmlib_os_perf_ext, #endif }; #define PFMLIB_NUM_OSES (int)(sizeof(pfmlib_oses)/sizeof(pfmlib_os_t *)) /* * Mapping table from PMU index to pfmlib_pmu_t * table is populated from pfmlib_pmus[] when the library * is initialized. * * Some entries can be NULL if PMU is not implemented on host * architecture or if the initialization failed. */ static pfmlib_pmu_t *pfmlib_pmus_map[PFM_PMU_MAX]; static pfmlib_node_t pfmlib_active_pmus_list; /* * A drop-in replacement for strsep(). strsep() is not part of the POSIX * standard, and it is not available on all platforms - in particular it is not * provided by Microsoft's C runtime or by MinGW. */ static char* pfmlib_strsep(char **stringp, const char *delim) { char* token = *stringp; char* end = *stringp; if (!end) return NULL; while (*end && !strchr(delim, *end)) end++; if (*end) { *end = '\0'; *stringp = end + 1; } else { *stringp = NULL; } return token; } #define pfmlib_for_each_pmu_event(p, e) \ for(e=(p)->get_event_first((p)); e != -1; e = (p)->get_event_next((p), e)) #define for_each_pmu_event_attr(u, i) \ for((u)=0; (u) < (i)->nattrs; (u) = (u)+1) #define pfmlib_for_each_pmu(x) \ for((x)= 0 ; (x) < PFMLIB_NUM_PMUS; (x)++) #define pfmlib_for_each_node(list, n) \ for((n) = (list)->next; (list) != (n) ; (n) = (n)->next) #define pfmlib_for_each_os(x) \ for((x)= 0 ; (x) < PFMLIB_NUM_OSES; (x)++) pfmlib_config_t pfm_cfg; static inline pfmlib_pmu_t * pfmlib_node_to_pmu(pfmlib_node_t *n) { void *p = (void *)n; void *offs = (void *)offsetof(pfmlib_pmu_t, node); return (pfmlib_pmu_t *)(p - offs); } void
__pfm_dbprintf(const char *fmt, ...) { va_list ap; if (pfm_cfg.debug == 0) return; va_start(ap, fmt); vfprintf(pfm_cfg.fp, fmt, ap); va_end(ap); } void __pfm_vbprintf(const char *fmt, ...) { va_list ap; if (pfm_cfg.verbose == 0) return; va_start(ap, fmt); vfprintf(pfm_cfg.fp, fmt, ap); va_end(ap); } /* * pfmlib_getl: our own equivalent to GNU getline() extension. * This avoids a dependency on having a C library with * support for getline(). */ int pfmlib_getl(char **buffer, size_t *len, FILE *fp) { #define GETL_DFL_LEN 32 char *b; int c; size_t maxsz, maxi, d, i = 0; if (!len || !fp || !buffer || (*buffer && *len < 2)) return -1; b = *buffer; if (!b) *len = 0; maxsz = *len; maxi = maxsz - 2; while ((c = fgetc(fp)) != EOF) { if (maxsz == 0 || i == maxi) { if (maxsz == 0) maxsz = GETL_DFL_LEN; else maxsz <<= 1; if (*buffer) d = &b[i] - *buffer; else d = 0; *buffer = realloc(*buffer, maxsz); if (!*buffer) return -1; b = *buffer + d; maxi = maxsz - d - 2; i = 0; *len = maxsz; } b[i++] = c; if (c == '\n') break; } if (c != EOF) b[i] = '\0'; return c != EOF ? 0 : -1; } /* * append fmt+args to str such that the string is no * more than max characters incl. null termination */ void pfmlib_strconcat(char *str, size_t max, const char *fmt, ...) 
{ va_list ap; size_t len, todo; len = strlen(str); todo = max - strlen(str); va_start(ap, fmt); vsnprintf(str+len, todo, fmt, ap); va_end(ap); } /* * compact all pattrs starting from index i */ void pfmlib_compact_pattrs(pfmlib_event_desc_t *e, int i) { int j; for (j = i+1; j < e->npattrs; j++) e->pattrs[j - 1] = e->pattrs[j]; e->npattrs--; } static void pfmlib_compact_attrs(pfmlib_event_desc_t *e, int i) { int j; for (j = i+1; j < e->nattrs; j++) e->attrs[j - 1] = e->attrs[j]; e->nattrs--; } /* * 0 : different attribute * 1 : exactly same attribute (duplicate can be removed) * -1 : same attribute but value differ, this is an error */ static inline int pfmlib_same_attr(pfmlib_event_desc_t *d, int i, int j) { pfmlib_event_attr_info_t *a1, *a2; pfmlib_attr_t *b1, *b2; a1 = attr(d, i); a2 = attr(d, j); b1 = d->attrs+i; b2 = d->attrs+j; if (a1->idx == a2->idx && a1->type == a2->type && a1->ctrl == a2->ctrl) { if (b1->ival == b2->ival) return 1; return -1; } return 0; } static inline int pfmlib_pmu_active(pfmlib_pmu_t *pmu) { return !!(pmu->flags & PFMLIB_PMU_FL_ACTIVE); } static inline int pfmlib_pmu_deprecated(pfmlib_pmu_t *pmu) { return !!(pmu->flags & PFMLIB_PMU_FL_DEPR); } static inline int pfmlib_pmu_initialized(pfmlib_pmu_t *pmu) { return !!(pmu->flags & PFMLIB_PMU_FL_INIT); } static inline pfm_pmu_t idx2pmu(int idx) { return (pfm_pmu_t)((idx >> PFMLIB_PMU_SHIFT) & PFMLIB_PMU_MASK); } static inline pfmlib_pmu_t * pmu2pmuidx(pfm_pmu_t pmu) { /* pfm_pmu_t is unsigned int enum, so * just need to check for upper bound */ if (pmu >= PFM_PMU_MAX) return NULL; return pfmlib_pmus_map[pmu]; } /* * external opaque idx -> PMU + internal idx */ static pfmlib_pmu_t * pfmlib_idx2pidx(int idx, int *pidx) { pfmlib_pmu_t *pmu; pfm_pmu_t pmu_id; if (PFMLIB_INITIALIZED() == 0) return NULL; if (idx < 0) return NULL; pmu_id = idx2pmu(idx); pmu = pmu2pmuidx(pmu_id); if (!pmu) return NULL; *pidx = idx & PFMLIB_PMU_PIDX_MASK; if (!pmu->event_is_valid(pmu, *pidx)) return NULL; return 
pmu; } static pfmlib_os_t * pfmlib_find_os(pfm_os_t id) { int o; pfmlib_os_t *os; pfmlib_for_each_os(o) { os = pfmlib_oses[o]; if (os->id == id && (os->flags & PFMLIB_OS_FL_ACTIVATED)) return os; } return NULL; } size_t pfmlib_check_struct(void *st, size_t usz, size_t refsz, size_t sz) { size_t rsz = sz; /* * if user size is zero, then use ABI0 size */ if (usz == 0) usz = refsz; /* * cannot be smaller than ABI0 size */ if (usz < refsz) { DPRINT("pfmlib_check_struct: user size too small %zu\n", usz); return 0; } /* * if bigger than current ABI, then check that none * of the extra bits are set. This is to avoid mistake * by caller assuming the library set those bits. */ if (usz > sz) { char *addr = (char *)st + sz; char *end = (char *)st + usz; while (addr != end) { if (*addr++) { DPRINT("pfmlib_check_struct: invalid extra bits\n"); return 0; } } } return rsz; } /* * check environment variables for: * LIBPFM_VERBOSE : enable verbose output (must be 1) * LIBPFM_DEBUG : enable debug output (must be 1) */ static void pfmlib_init_env(void) { char *str; pfm_cfg.fp = stderr; str = getenv("LIBPFM_VERBOSE"); if (str && isdigit((int)*str)) pfm_cfg.verbose = *str - '0'; str = getenv("LIBPFM_DEBUG"); if (str && isdigit((int)*str)) pfm_cfg.debug = *str - '0'; str = getenv("LIBPFM_DEBUG_STDOUT"); if (str) pfm_cfg.fp = stdout; pfm_cfg.forced_pmu = getenv("LIBPFM_FORCE_PMU"); str = getenv("LIBPFM_ENCODE_INACTIVE"); if (str && isdigit((int)*str)) pfm_cfg.inactive = *str - '0'; str = getenv("LIBPFM_DISABLED_PMUS"); if (str) pfm_cfg.blacklist_pmus = str; #ifdef CONFIG_PFMLIB_OS_LINUX str = getenv("LIBPFM_PROC_CPUINFO"); if (str) pfm_cfg.proc_cpuinfo = str; else pfm_cfg.proc_cpuinfo = "/proc/cpuinfo"; #endif } static int pfmlib_pmu_sanity_checks(pfmlib_pmu_t *p) { /* * check event can be encoded */ if (p->pme_count >= (1<< PFMLIB_PMU_SHIFT)) { DPRINT("too many events for %s\n", p->desc); return PFM_ERR_NOTSUPP; } if (p->max_encoding > PFMLIB_MAX_ENCODING) { DPRINT("max encoding too 
high (%d > %d) for %s\n", p->max_encoding, PFMLIB_MAX_ENCODING, p->desc); return PFM_ERR_NOTSUPP; } return PFM_SUCCESS; } int pfmlib_build_fstr(pfmlib_event_desc_t *e, char **fstr) { /* nothing to do */ if (!fstr) return PFM_SUCCESS; *fstr = malloc(strlen(e->fstr) + 2 + strlen(e->pmu->name) + 1); if (*fstr) sprintf(*fstr, "%s::%s", e->pmu->name, e->fstr); return *fstr ? PFM_SUCCESS : PFM_ERR_NOMEM; } static int pfmlib_pmu_activate(pfmlib_pmu_t *p) { int ret; if (p->pmu_init) { ret = p->pmu_init(p); if (ret != PFM_SUCCESS) return ret; } p->flags |= PFMLIB_PMU_FL_ACTIVE; DPRINT("activated %s\n", p->desc); return PFM_SUCCESS; } static inline int pfmlib_match_forced_pmu(const char *name) { const char *p; size_t l; /* skip any lower level specifier */ p = strchr(pfm_cfg.forced_pmu, ','); if (p) l = p - pfm_cfg.forced_pmu; else l = strlen(pfm_cfg.forced_pmu); return !strncasecmp(name, pfm_cfg.forced_pmu, l); } static int pfmlib_is_blacklisted_pmu(pfmlib_pmu_t *p) { char *q, *buffer; int ret = 1; if (!pfm_cfg.blacklist_pmus) return 0; /* * scan list for matching PMU names, we accept substrings. * for instance: snbep does match snbep* */ buffer = strdup(pfm_cfg.blacklist_pmus); if (!buffer) return 0; strcpy (buffer, pfm_cfg.blacklist_pmus); for (q = strtok (buffer, ","); q != NULL; q = strtok (NULL, ",")) { if (strstr (p->name, q) != NULL) { goto done; } } ret = 0; done: free(buffer); return ret; } static inline void pfmlib_node_init(pfmlib_node_t *n) { n->next = n->prev = n; } static inline void pfmlib_node_add_tail(pfmlib_node_t *list, pfmlib_node_t *n) { n->prev = list->prev; n->next = list; list->prev->next = n; list->prev = n; } static inline void pfmlib_add_active_pmu(pfmlib_pmu_t *pmu) { /* * We must append to tail of the list to respect the ordering * of the PMUs in the pfmlib_pmus[] array as the order matters. 
* For instance, on Intel x86 there is an architected PMU to * catch default events and it needs to be checked last to * give priority to model specific event encodings */ pfmlib_node_add_tail(&pfmlib_active_pmus_list, &pmu->node); } static int pfmlib_init_pmus(void) { pfmlib_pmu_t *p; int i, ret; int nsuccess = 0; /* * activate all detected PMUs * when forced, only the designated PMU * is setup and activated */ pfmlib_for_each_pmu(i) { p = pfmlib_pmus[i]; DPRINT("trying %s\n", p->desc); ret = PFM_SUCCESS; if (!pfm_cfg.forced_pmu) ret = p->pmu_detect(p); else if (!pfmlib_match_forced_pmu(p->name)) ret = PFM_ERR_NOTSUPP; /* * basic checks * failure causes PMU to not be available */ if (pfmlib_pmu_sanity_checks(p) != PFM_SUCCESS) continue; if (pfmlib_is_blacklisted_pmu(p)) { DPRINT("%s PMU blacklisted, skipping initialization\n", p->name); continue; } p->flags |= PFMLIB_PMU_FL_INIT; /* * populate mapping table */ pfmlib_pmus_map[p->pmu] = p; if (ret != PFM_SUCCESS) { /* * if LIBPFM_ENCODE_INACTIVE=1, we place all PMUs on the active list. * we lose the optimization but we do not have to special case the parsing * code. This is a debug option anyway. */ if (pfm_cfg.inactive) pfmlib_add_active_pmu(p); continue; } /* * check if exported by OS if needed */ if (p->os_detect[pfmlib_os->id]) { ret = p->os_detect[pfmlib_os->id](p); if (ret != PFM_SUCCESS) { /* * must force on active list when * LIBPFM_ENCODE_INACTIVE=1 */ if (pfm_cfg.inactive) pfmlib_add_active_pmu(p); DPRINT("%s PMU not exported by OS\n", p->name); continue; } } ret = pfmlib_pmu_activate(p); if (ret == PFM_SUCCESS) { nsuccess++; pfmlib_add_active_pmu(p); } if (pfm_cfg.forced_pmu) { __pfm_vbprintf("PMU forced to %s (%s) : %s\n", p->name, p->desc, ret == PFM_SUCCESS ? 
"success" : "failure"); return ret; } } DPRINT("%d PMU detected out of %d supported\n", nsuccess, PFMLIB_NUM_PMUS); return PFM_SUCCESS; } static void pfmlib_init_os(void) { int o; pfmlib_os_t *os; pfmlib_for_each_os(o) { os = pfmlib_oses[o]; if (!os->detect) continue; if (os->detect(os) != PFM_SUCCESS) continue; if (os != &pfmlib_os_none && pfmlib_os == &pfmlib_os_none) pfmlib_os = os; DPRINT("OS layer %s activated\n", os->name); os->flags = PFMLIB_OS_FL_ACTIVATED; } DPRINT("default OS layer: %s\n", pfmlib_os->name); } int pfm_initialize(void) { int ret; /* * not atomic * if initialization already done, then return previous return value */ if (pfm_cfg.initdone) return pfm_cfg.initret; pfmlib_node_init(&pfmlib_active_pmus_list); /* * generic sanity checks */ if (PFM_PMU_MAX & (~PFMLIB_PMU_MASK)) { DPRINT("PFM_PMU_MAX exceeds PFMLIB_PMU_MASK\n"); ret = PFM_ERR_NOTSUPP; } else { pfmlib_init_env(); /* must be done before pfmlib_init_pmus() */ pfmlib_init_os(); ret = pfmlib_init_pmus(); } pfm_cfg.initdone = 1; pfm_cfg.initret = ret; return ret; } void pfm_terminate(void) { pfmlib_node_t *n; pfmlib_pmu_t *pmu; if (PFMLIB_INITIALIZED() == 0) return; pfmlib_for_each_node(&pfmlib_active_pmus_list, n) { pmu = pfmlib_node_to_pmu(n); /* handle LIBPFM_ENCODE_INACTIVE=1 */ if (!pfmlib_pmu_active(pmu)) continue; if (pmu->pmu_terminate) pmu->pmu_terminate(pmu); } pfm_cfg.initdone = 0; pfmlib_node_init(&pfmlib_active_pmus_list); } int pfm_find_event(const char *str) { pfmlib_event_desc_t e; int ret; if (PFMLIB_INITIALIZED() == 0) return PFM_ERR_NOINIT; if (!str) return PFM_ERR_INVAL; memset(&e, 0, sizeof(e)); ret = pfmlib_parse_event(str, &e); if (ret == PFM_SUCCESS) { /* * save index so we can return it * and free the pattrs data that was * allocated in pfmlib_parse_event() */ ret = pfmlib_pidx2idx(e.pmu, e.event); pfmlib_release_event(&e); } return ret; } static int pfmlib_sanitize_event(pfmlib_event_desc_t *d) { int i, j, ret; /* * fail if duplicate attributes are found */
for(i=0; i < d->nattrs; i++) { for(j=i+1; j < d->nattrs; j++) { ret = pfmlib_same_attr(d, i, j); if (ret == 1) pfmlib_compact_attrs(d, j); else if (ret == -1) return PFM_ERR_ATTR_SET; } } return PFM_SUCCESS; } static int pfmlib_parse_event_attr(char *str, pfmlib_event_desc_t *d) { pfmlib_event_attr_info_t *ainfo; char *s, *p, *q, *endptr; char yes[2] = "y"; pfm_attr_t type; int aidx = 0, has_val, has_raw_um = 0, has_um = 0; int ret = PFM_ERR_INVAL; s = str; while(s) { p = s; pfmlib_strsep(&p, PFMLIB_ATTR_DELIM); /* if (p) *p++ = '\0'; */ q = strchr(s, '='); if (q) *q++ = '\0'; has_val = !!q; /* * check for raw umasks in hexadecimal only */ if (*s == '0' && tolower(*(s+1)) == 'x') { char *endptr = NULL; /* can only have one raw umask */ if (has_raw_um || has_um) { DPRINT("cannot mix raw umask with umask\n"); return PFM_ERR_ATTR; } if (!(d->pmu->flags & PFMLIB_PMU_FL_RAW_UMASK)) { DPRINT("PMU %s does not support RAW umasks\n", d->pmu->name); return PFM_ERR_ATTR; } /* we have reserved an entry at the end of pattrs */ aidx = d->npattrs; ainfo = d->pattrs + aidx; ainfo->name = "RAW_UMASK"; ainfo->type = PFM_ATTR_RAW_UMASK; ainfo->ctrl = PFM_ATTR_CTRL_PMU; /* can handle up to 64-bit raw umask */ ainfo->idx = strtoull(s, &endptr, 0); ainfo->equiv= NULL; if (*endptr) { DPRINT("raw umask (%s) is not a number\n", str); return PFM_ERR_ATTR; } has_raw_um = 1; goto found_attr; } for(aidx = 0; aidx < d->npattrs; aidx++) { if (!strcasecmp(d->pattrs[aidx].name, s)) { ainfo = d->pattrs + aidx; /* disambiguate modifier and umask * with the same name : snb::L2_LINES_IN:I:I=1 */ if (has_val && ainfo->type == PFM_ATTR_UMASK) continue; goto found_attr; } } DPRINT("cannot find attribute %s\n", s); return PFM_ERR_ATTR; found_attr: type = ainfo->type; if (type == PFM_ATTR_UMASK) { has_um = 1; if (has_raw_um) { DPRINT("cannot mix raw umask with umask\n"); return PFM_ERR_ATTR; } } if (ainfo->equiv) { char *z; /* cannot have equiv for attributes with value */ if (has_val) return
PFM_ERR_ATTR_VAL; /* copy because it is const */ z = strdup(ainfo->equiv); if (!z) return PFM_ERR_NOMEM; ret = pfmlib_parse_event_attr(z, d); free(z); if (ret != PFM_SUCCESS) return ret; s = p; continue; } /* * we tolerate missing value for boolean attributes. * Presence of the attribute is equivalent to * attr=1, i.e., attribute is set */ if (type != PFM_ATTR_UMASK && type != PFM_ATTR_RAW_UMASK && !has_val) { if (type != PFM_ATTR_MOD_BOOL) return PFM_ERR_ATTR_VAL; s = yes; /* no const */ goto handle_bool; } d->attrs[d->nattrs].ival = 0; if ((type == PFM_ATTR_UMASK || type == PFM_ATTR_RAW_UMASK) && has_val) return PFM_ERR_ATTR_VAL; if (has_val) { s = q; handle_bool: ret = PFM_ERR_ATTR_VAL; if (!strlen(s)) goto error; if (d->nattrs == PFMLIB_MAX_ATTRS) { DPRINT("too many attributes\n"); ret = PFM_ERR_TOOMANY; goto error; } endptr = NULL; switch(type) { case PFM_ATTR_MOD_BOOL: if (strlen(s) > 1) goto error; if (tolower((int)*s) == 'y' || tolower((int)*s) == 't' || *s == '1') d->attrs[d->nattrs].ival = 1; else if (tolower((int)*s) == 'n' || tolower((int)*s) == 'f' || *s == '0') d->attrs[d->nattrs].ival = 0; else goto error; break; case PFM_ATTR_MOD_INTEGER: d->attrs[d->nattrs].ival = strtoull(s, &endptr, 0); if (*endptr != '\0') goto error; break; default: goto error; } } d->attrs[d->nattrs].id = aidx; d->nattrs++; s = p; } ret = PFM_SUCCESS; error: return ret; } static int pfmlib_build_event_pattrs(pfmlib_event_desc_t *e) { pfmlib_pmu_t *pmu; pfmlib_os_t *os; int i, ret, pmu_nattrs = 0, os_nattrs = 0; int npattrs; /* * cannot satisfy request for an OS that was not activated */ os = pfmlib_find_os(e->osid); if (!os) return PFM_ERR_NOTSUPP; pmu = e->pmu; /* get actual PMU number of attributes for the event */ if (pmu->get_event_nattrs) pmu_nattrs = pmu->get_event_nattrs(pmu, e->event); if (os && os->get_os_nattrs) os_nattrs = os->get_os_nattrs(os, e); npattrs = pmu_nattrs + os_nattrs; /* * add extra entry for raw umask, if supported */ if (pmu->flags & 
PFMLIB_PMU_FL_RAW_UMASK) npattrs++; if (npattrs) { e->pattrs = calloc(npattrs, sizeof(*e->pattrs)); if (!e->pattrs) return PFM_ERR_NOMEM; } /* collect all actual PMU attrs */ for(i = 0; i < pmu_nattrs; i++) { ret = pmu->get_event_attr_info(pmu, e->event, i, e->pattrs+i); if (ret != PFM_SUCCESS) goto error; } e->npattrs = pmu_nattrs; if (os_nattrs) { if (e->osid == os->id && os->get_os_attr_info) { os->get_os_attr_info(os, e); /* * check for conflicts between HW and OS attributes */ if (pmu->validate_pattrs[e->osid]) pmu->validate_pattrs[e->osid](pmu, e); } } for (i = 0; i < e->npattrs; i++) DPRINT("%d %d %d %d %d %s\n", e->event, i, e->pattrs[i].type, e->pattrs[i].ctrl, e->pattrs[i].idx, e->pattrs[i].name); return PFM_SUCCESS; error: free(e->pattrs); e->pattrs = NULL; return ret; } void pfmlib_release_event(pfmlib_event_desc_t *e) { free(e->pattrs); e->pattrs = NULL; } static int match_event(void *this, pfmlib_event_desc_t *d, const char *e, const char *s) { return strcasecmp(e, s); } static int pfmlib_parse_equiv_event(const char *event, pfmlib_event_desc_t *d) { pfmlib_pmu_t *pmu = d->pmu; pfm_event_info_t einfo; int (*match)(void *this, pfmlib_event_desc_t *d, const char *e, const char *s); char *str, *s, *p; int i; int ret; /* * create copy because string is const */ s = str = strdup(event); if (!str) return PFM_ERR_NOMEM; p = s; pfmlib_strsep(&p, PFMLIB_ATTR_DELIM); /* if (p) *p++ = '\0'; */ match = pmu->match_event ? 
pmu->match_event : match_event; pfmlib_for_each_pmu_event(pmu, i) { ret = pmu->get_event_info(pmu, i, &einfo); if (ret != PFM_SUCCESS) goto error; if (!match(pmu, d, einfo.name, s)) goto found; } free(str); return PFM_ERR_NOTFOUND; found: d->pmu = pmu; d->event = i; /* private index */ /* * build_event_pattrs and parse_event_attr * cannot be factorized with pfmlib_parse_event() * because equivalent event may add its own attributes */ ret = pfmlib_build_event_pattrs(d); if (ret != PFM_SUCCESS) goto error; ret = pfmlib_parse_event_attr(p, d); if (ret == PFM_SUCCESS) ret = pfmlib_sanitize_event(d); error: free(str); if (ret != PFM_SUCCESS) pfmlib_release_event(d); return ret; } int pfmlib_parse_event(const char *event, pfmlib_event_desc_t *d) { pfmlib_node_t *n; pfm_event_info_t einfo; char *str, *s, *p; pfmlib_pmu_t *pmu; int (*match)(void *this, pfmlib_event_desc_t *d, const char *e, const char *s); const char *pname = NULL; int i, ret; /* * support only one event at a time. */ p = strpbrk(event, PFMLIB_EVENT_DELIM); if (p) return PFM_ERR_INVAL; /* * create copy because string is const */ s = str = strdup(event); if (!str) return PFM_ERR_NOMEM; /* check for optional PMU name */ p = strstr(s, PFMLIB_PMU_DELIM); if (p) { *p = '\0'; pname = s; s = p + strlen(PFMLIB_PMU_DELIM); } p = s; pfmlib_strsep(&p, PFMLIB_ATTR_DELIM); /* if (p) *p++ = '\0'; */ /* * for each pmu */ pfmlib_for_each_node(&pfmlib_active_pmus_list, n) { pmu = pfmlib_node_to_pmu(n); /* * test against active still required in case of * pfm_cfg.inactive=1 because we put all PMUs on * the active list to avoid forking this loop */ if (!pname && !pfmlib_pmu_active(pmu)) continue; /* * if the PMU name is not passed, then if * the pmu is deprecated, then skip it. 
It means * there is a better candidate in the active list */ if (!pname && pfmlib_pmu_deprecated(pmu)) continue; /* * check for requested PMU name, */ if (pname && strcasecmp(pname, pmu->name)) continue; /* * only allow event on inactive PMU if enabled via * environment variable */ if (pname && !pfmlib_pmu_active(pmu) && !pfm_cfg.inactive) continue; match = pmu->match_event ? pmu->match_event : match_event; /* * for each event */ pfmlib_for_each_pmu_event(pmu, i) { ret = pmu->get_event_info(pmu, i, &einfo); if (ret != PFM_SUCCESS) goto error; if (!match(pmu, d, einfo.name, s)) goto found; } } free(str); return PFM_ERR_NOTFOUND; found: d->pmu = pmu; /* * handle equivalence */ if (einfo.equiv) { ret = pfmlib_parse_equiv_event(einfo.equiv, d); if (ret != PFM_SUCCESS) goto error; } else { d->event = i; /* private index */ ret = pfmlib_build_event_pattrs(d); if (ret != PFM_SUCCESS) goto error; } /* * parse attributes from original event */ ret = pfmlib_parse_event_attr(p, d); if (ret == PFM_SUCCESS) ret = pfmlib_sanitize_event(d); for (i = 0; i < d->nattrs; i++) { pfmlib_event_attr_info_t *a = attr(d, i); if (a->type != PFM_ATTR_RAW_UMASK) DPRINT("%d %d %d %s\n", d->event, i, a->idx, d->pattrs[d->attrs[i].id].name); else DPRINT("%d %d RAW_UMASK (0x%x)\n", d->event, i, a->idx); } error: free(str); if (ret != PFM_SUCCESS) pfmlib_release_event(d); return ret; } /* sorry, only English supported at this point!
*/ static const char *pfmlib_err_list[]= { "success", "not supported", "invalid parameters", "pfmlib not initialized", "event not found", "invalid combination of model specific features", "invalid or missing unit mask", "out of memory", "invalid event attribute", "invalid event attribute value", "attribute value already set", "too many parameters", "parameter is too small", }; static int pfmlib_err_count = (int)sizeof(pfmlib_err_list)/sizeof(char *); const char * pfm_strerror(int code) { code = -code; if (code <0 || code >= pfmlib_err_count) return "unknown error code"; return pfmlib_err_list[code]; } int pfm_get_version(void) { return LIBPFM_VERSION; } int pfm_get_event_next(int idx) { pfmlib_pmu_t *pmu; int pidx; pmu = pfmlib_idx2pidx(idx, &pidx); if (!pmu) return -1; pidx = pmu->get_event_next(pmu, pidx); return pidx == -1 ? -1 : pfmlib_pidx2idx(pmu, pidx); } int pfm_get_os_event_encoding(const char *str, int dfl_plm, pfm_os_t uos, void *args) { pfmlib_os_t *os; if (PFMLIB_INITIALIZED() == 0) return PFM_ERR_NOINIT; if (!(args && str)) return PFM_ERR_INVAL; if (dfl_plm & ~(PFM_PLM_ALL)) return PFM_ERR_INVAL; os = pfmlib_find_os(uos); if (!os) return PFM_ERR_NOTSUPP; return os->encode(os, str, dfl_plm, args); } /* * old API maintained for backward compatibility with existing apps * prefer pfm_get_os_event_encoding() */ int pfm_get_event_encoding(const char *str, int dfl_plm, char **fstr, int *idx, uint64_t **codes, int *count) { pfm_pmu_encode_arg_t arg; int ret; if (!(str && codes && count)) return PFM_ERR_INVAL; if ((*codes && !*count) || (!*codes && *count)) return PFM_ERR_INVAL; memset(&arg, 0, sizeof(arg)); arg.fstr = fstr; arg.codes = *codes; arg.count = *count; arg.size = sizeof(arg); /* * request RAW PMU encoding */ ret = pfm_get_os_event_encoding(str, dfl_plm, PFM_OS_NONE, &arg); if (ret != PFM_SUCCESS) return ret; /* handle the case where the array was allocated */ *codes = arg.codes; *count = arg.count; if (idx) *idx = arg.idx; return PFM_SUCCESS; } 
static int pfmlib_check_event_pattrs(pfmlib_pmu_t *pmu, int pidx, pfm_os_t osid, FILE *fp) { pfmlib_event_desc_t e; int i, j, ret; memset(&e, 0, sizeof(e)); e.event = pidx; e.osid = osid; e.pmu = pmu; ret = pfmlib_build_event_pattrs(&e); if (ret != PFM_SUCCESS) { fprintf(fp, "invalid pattrs for event %d\n", pidx); return ret; } ret = PFM_ERR_ATTR; for (i = 0; i < e.npattrs; i++) { for (j = i+1; j < e.npattrs; j++) { if (!strcmp(e.pattrs[i].name, e.pattrs[j].name)) { fprintf(fp, "event %d duplicate pattrs %s\n", pidx, e.pattrs[i].name); goto error; } } } ret = PFM_SUCCESS; error: /* * release resources allocated for event */ pfmlib_release_event(&e); return ret; } static int pfmlib_validate_encoding(char *buf, int plm) { uint64_t *codes = NULL; int count = 0, ret; ret = pfm_get_event_encoding(buf, plm, NULL, NULL, &codes, &count); if (ret != PFM_SUCCESS) { int i; DPRINT("%s ", buf); for(i=0; i < count; i++) __pfm_dbprintf(" %#"PRIx64, codes[i]); __pfm_dbprintf("\n"); } if (codes) free(codes); return ret; } static int pfmlib_pmu_validate_encoding(pfmlib_pmu_t *pmu, FILE *fp) { pfm_event_info_t einfo; pfmlib_event_attr_info_t ainfo; char *buf; size_t maxlen = 0, len; int i, u, um; int ret, retval = PFM_SUCCESS; pfmlib_for_each_pmu_event(pmu, i) { ret = pmu->get_event_info(pmu, i, &einfo); if (ret != PFM_SUCCESS) return ret; ret = pfmlib_check_event_pattrs(pmu, i, PFM_OS_NONE, fp); if (ret != PFM_SUCCESS) return ret; len = strlen(einfo.name); if (len > maxlen) maxlen = len; for_each_pmu_event_attr(u, &einfo) { ret = pmu->get_event_attr_info(pmu, i, u, &ainfo); if (ret != PFM_SUCCESS) return ret; if (ainfo.type != PFM_ATTR_UMASK) continue; len = strlen(einfo.name) + strlen(ainfo.name); if (len > maxlen) maxlen = len; } } /* 2 = ::, 1=:, 1=eol */ maxlen += strlen(pmu->name) + 2 + 1 + 1; buf = malloc(maxlen); if (!buf) return PFM_ERR_NOMEM; pfmlib_for_each_pmu_event(pmu, i) { ret = pmu->get_event_info(pmu, i, &einfo); if (ret != PFM_SUCCESS) { retval = ret; continue; } um 
= 0; for_each_pmu_event_attr(u, &einfo) { ret = pmu->get_event_attr_info(pmu, i, u, &ainfo); if (ret != PFM_SUCCESS) { retval = ret; continue; } if (ainfo.type != PFM_ATTR_UMASK) continue; /* * XXX: some events may require more than one umask to encode */ sprintf(buf, "%s::%s:%s", pmu->name, einfo.name, ainfo.name); ret = pfmlib_validate_encoding(buf, PFM_PLM3|PFM_PLM0); if (ret != PFM_SUCCESS) { if (pmu->can_auto_encode) { if (!pmu->can_auto_encode(pmu, i, u)) continue; } /* * some PMU may not support raw encoding */ if (ret != PFM_ERR_NOTSUPP) { fprintf(fp, "cannot encode event %s : %s\n", buf, pfm_strerror(ret)); retval = ret; } continue; } um++; } if (um == 0) { sprintf(buf, "%s::%s", pmu->name, einfo.name); ret = pfmlib_validate_encoding(buf, PFM_PLM3|PFM_PLM0); if (ret != PFM_SUCCESS) { if (pmu->can_auto_encode) { if (!pmu->can_auto_encode(pmu, i, u)) continue; } if (ret != PFM_ERR_NOTSUPP) { fprintf(fp, "cannot encode event %s : %s\n", buf, pfm_strerror(ret)); retval = ret; } continue; } } } free(buf); return retval; } int pfm_pmu_validate(pfm_pmu_t pmu_id, FILE *fp) { pfmlib_pmu_t *pmu, *pmx; int nos = 0; int i, ret; if (fp == NULL) return PFM_ERR_INVAL; pmu = pmu2pmuidx(pmu_id); if (!pmu) return PFM_ERR_INVAL; if (!pfmlib_pmu_initialized(pmu)) { fprintf(fp, "pmu: %s :: initialization failed\n", pmu->name); return PFM_ERR_INVAL; } if (!pmu->name) { fprintf(fp, "pmu id: %d :: no name\n", pmu->pmu); return PFM_ERR_INVAL; } if (!pmu->desc) { fprintf(fp, "pmu: %s :: no description\n", pmu->name); return PFM_ERR_INVAL; } if (pmu->pmu >= PFM_PMU_MAX) { fprintf(fp, "pmu: %s :: invalid PMU id\n", pmu->name); return PFM_ERR_INVAL; } if (pmu->max_encoding >= PFMLIB_MAX_ENCODING) { fprintf(fp, "pmu: %s :: max encoding too high\n", pmu->name); return PFM_ERR_INVAL; } if (pfmlib_pmu_active(pmu) && !pmu->pme_count) { fprintf(fp, "pmu: %s :: no events\n", pmu->name); return PFM_ERR_INVAL; } if (!pmu->pmu_detect) { fprintf(fp, "pmu: %s :: missing pmu_detect callback\n",
pmu->name); return PFM_ERR_INVAL; } if (!pmu->get_event_first) { fprintf(fp, "pmu: %s :: missing get_event_first callback\n", pmu->name); return PFM_ERR_INVAL; } if (!pmu->get_event_next) { fprintf(fp, "pmu: %s :: missing get_event_next callback\n", pmu->name); return PFM_ERR_INVAL; } if (!pmu->get_event_info) { fprintf(fp, "pmu: %s :: missing get_event_info callback\n", pmu->name); return PFM_ERR_INVAL; } if (!pmu->get_event_attr_info) { fprintf(fp, "pmu: %s :: missing get_event_attr_info callback\n", pmu->name); return PFM_ERR_INVAL; } for (i = PFM_OS_NONE; i < PFM_OS_MAX; i++) { if (pmu->get_event_encoding[i]) nos++; } if (!nos) { fprintf(fp, "pmu: %s :: no os event encoding callback\n", pmu->name); return PFM_ERR_INVAL; } if (!pmu->max_encoding) { fprintf(fp, "pmu: %s :: max_encoding is zero\n", pmu->name); return PFM_ERR_INVAL; } /* look for duplicate names, id */ pfmlib_for_each_pmu(i) { pmx = pfmlib_pmus[i]; if (!pfmlib_pmu_active(pmx)) continue; if (pmx == pmu) continue; if (!strcasecmp(pmx->name, pmu->name)) { fprintf(fp, "pmu: %s :: duplicate name\n", pmu->name); return PFM_ERR_INVAL; } if (pmx->pmu == pmu->pmu) { fprintf(fp, "pmu: %s :: duplicate id\n", pmu->name); return PFM_ERR_INVAL; } } if (pmu->validate_table) { ret = pmu->validate_table(pmu, fp); if (ret != PFM_SUCCESS) return ret; } return pfmlib_pmu_validate_encoding(pmu, fp); } int pfm_get_event_info(int idx, pfm_os_t os, pfm_event_info_t *uinfo) { pfm_event_info_t info; pfmlib_event_desc_t e; pfmlib_pmu_t *pmu; size_t sz = sizeof(info); int pidx, ret; if (!PFMLIB_INITIALIZED()) return PFM_ERR_NOINIT; if (os >= PFM_OS_MAX) return PFM_ERR_INVAL; pmu = pfmlib_idx2pidx(idx, &pidx); if (!pmu) return PFM_ERR_INVAL; if (!uinfo) return PFM_ERR_INVAL; sz = pfmlib_check_struct(uinfo, uinfo->size, PFM_EVENT_INFO_ABI0, sz); if (!sz) return PFM_ERR_INVAL; memset(&info, 0, sizeof(info)); info.size = sz; /* default data type is uint64 */ info.dtype = PFM_DTYPE_UINT64; /* initialize flags */ 
info.is_speculative = PFM_EVENT_INFO_SPEC_NA; ret = pmu->get_event_info(pmu, pidx, &info); if (ret != PFM_SUCCESS) return ret; info.pmu = pmu->pmu; info.idx = idx; memset(&e, 0, sizeof(e)); e.event = pidx; e.osid = os; e.pmu = pmu; ret = pfmlib_build_event_pattrs(&e); if (ret == PFM_SUCCESS) { info.nattrs = e.npattrs; memcpy(uinfo, &info, sz); } pfmlib_release_event(&e); return ret; } int pfm_get_event_attr_info(int idx, int attr_idx, pfm_os_t os, pfm_event_attr_info_t *uinfo) { pfmlib_event_attr_info_t *info; pfmlib_event_desc_t e; pfmlib_pmu_t *pmu; size_t sz = sizeof(*info); int pidx, ret; if (!PFMLIB_INITIALIZED()) return PFM_ERR_NOINIT; if (attr_idx < 0) return PFM_ERR_INVAL; if (os >= PFM_OS_MAX) return PFM_ERR_INVAL; pmu = pfmlib_idx2pidx(idx, &pidx); if (!pmu) return PFM_ERR_INVAL; if (!uinfo) return PFM_ERR_INVAL; sz = pfmlib_check_struct(uinfo, uinfo->size, PFM_ATTR_INFO_ABI0, sz); if (!sz) return PFM_ERR_INVAL; memset(&e, 0, sizeof(e)); e.event = pidx; e.osid = os; e.pmu = pmu; ret = pfmlib_build_event_pattrs(&e); if (ret != PFM_SUCCESS) return ret; ret = PFM_ERR_INVAL; if (attr_idx >= e.npattrs) goto error; info = &e.pattrs[attr_idx]; /* * info.idx = private, namespace specific index, * should not be visible externally, so override * with public index * * cannot memcpy() info into uinfo as they do not * have the same size, cf. 
idx field (uint64 vs, uint32) */ uinfo->name = info->name; uinfo->desc = info->desc; uinfo->equiv = info->equiv; uinfo->size = sz; uinfo->code = info->code; uinfo->type = info->type; uinfo->idx = attr_idx; uinfo->ctrl = info->ctrl; uinfo->is_dfl= info->is_dfl; uinfo->is_precise = info->is_precise; uinfo->is_speculative = info->is_speculative; uinfo->support_hw_smpl = info->support_hw_smpl; uinfo->support_no_mods = info->support_no_mods; uinfo->reserved_bits = 0; uinfo->dfl_val64 = info->dfl_val64; ret = PFM_SUCCESS; error: pfmlib_release_event(&e); return ret; } int pfm_get_pmu_info(pfm_pmu_t pmuid, pfm_pmu_info_t *uinfo) { pfm_pmu_info_t info; pfmlib_pmu_t *pmu; size_t sz = sizeof(info); int pidx; if (!PFMLIB_INITIALIZED()) return PFM_ERR_NOINIT; if (pmuid >= PFM_PMU_MAX) return PFM_ERR_INVAL; if (!uinfo) return PFM_ERR_INVAL; sz = pfmlib_check_struct(uinfo, uinfo->size, PFM_PMU_INFO_ABI0, sz); if (!sz) return PFM_ERR_INVAL; pmu = pfmlib_pmus_map[pmuid]; if (!pmu) return PFM_ERR_NOTSUPP; info.name = pmu->name; info.desc = pmu->desc; info.pmu = pmuid; info.size = sz; info.max_encoding = pmu->max_encoding; info.num_cntrs = pmu->num_cntrs; info.num_fixed_cntrs = pmu->num_fixed_cntrs; pidx = pmu->get_event_first(pmu); if (pidx == -1) info.first_event = -1; else info.first_event = pfmlib_pidx2idx(pmu, pidx); /* * XXX: pme_count only valid when PMU is detected */ info.is_present = pfmlib_pmu_active(pmu); info.is_dfl = !!(pmu->flags & PFMLIB_PMU_FL_ARCH_DFL); info.type = pmu->type; if (pmu->get_num_events) info.nevents = pmu->get_num_events(pmu); else info.nevents = pmu->pme_count; memcpy(uinfo, &info, sz); return PFM_SUCCESS; } pfmlib_pmu_t * pfmlib_get_pmu_by_type(pfm_pmu_type_t t) { pfmlib_pmu_t *pmu; pfmlib_node_t *n; pfmlib_for_each_node(&pfmlib_active_pmus_list, n) { pmu = pfmlib_node_to_pmu(n); /* handle LIBPFM_ENCODE_INACTIVE=1 */ if (!pfmlib_pmu_active(pmu)) continue; /* first match */ if (pmu->type != t) continue; return pmu; } return NULL; } static int 
pfmlib_compare_attr_id(const void *a, const void *b)
{
	const pfmlib_attr_t *t1 = a;
	const pfmlib_attr_t *t2 = b;

	if (t1->id < t2->id)
		return -1;
	return t1->id == t2->id ? 0 : 1;
}

void
pfmlib_sort_attr(pfmlib_event_desc_t *e)
{
	qsort(e->attrs, e->nattrs, sizeof(pfmlib_attr_t), pfmlib_compare_attr_id);
}

static int
pfmlib_raw_pmu_encode(void *this, const char *str, int dfl_plm, void *data)
{
	pfm_pmu_encode_arg_t arg;
	pfm_pmu_encode_arg_t *uarg = data;
	pfmlib_pmu_t *pmu;
	pfmlib_event_desc_t e;
	size_t sz = sizeof(arg);
	int ret, i;

	sz = pfmlib_check_struct(uarg, uarg->size, PFM_RAW_ENCODE_ABI0, sz);
	if (!sz)
		return PFM_ERR_INVAL;

	memset(&arg, 0, sizeof(arg));

	/*
	 * get input data
	 */
	memcpy(&arg, uarg, sz);

	memset(&e, 0, sizeof(e));

	e.osid    = PFM_OS_NONE;
	e.dfl_plm = dfl_plm;

	ret = pfmlib_parse_event(str, &e);
	if (ret != PFM_SUCCESS)
		return ret;

	pmu = e.pmu;

	if (!pmu->get_event_encoding[PFM_OS_NONE]) {
		DPRINT("PMU %s does not support PFM_OS_NONE\n", pmu->name);
		ret = PFM_ERR_NOTSUPP;
		goto error;
	}

	ret = pmu->get_event_encoding[PFM_OS_NONE](pmu, &e);
	if (ret != PFM_SUCCESS)
		goto error;

	/*
	 * return opaque event identifier
	 */
	arg.idx = pfmlib_pidx2idx(e.pmu, e.event);

	if (arg.codes == NULL) {
		ret = PFM_ERR_NOMEM;
		arg.codes = malloc(sizeof(uint64_t) * e.count);
		if (!arg.codes)
			goto error_fstr;
	} else if (arg.count < e.count) {
		ret = PFM_ERR_TOOSMALL;
		goto error_fstr;
	}

	arg.count = e.count;

	for (i = 0; i < e.count; i++)
		arg.codes[i] = e.codes[i];

	if (arg.fstr) {
		ret = pfmlib_build_fstr(&e, arg.fstr);
		if (ret != PFM_SUCCESS)
			goto error;
	}

	ret = PFM_SUCCESS;

	/* copy out results */
	memcpy(uarg, &arg, sz);

error_fstr:
	if (ret != PFM_SUCCESS)
		free(arg.fstr);
error:
	/*
	 * release resources allocated for event
	 */
	pfmlib_release_event(&e);
	return ret;
}

static int
pfmlib_raw_pmu_detect(void *this)
{
	return PFM_SUCCESS;
}

static pfmlib_os_t pfmlib_os_none = {
	.name   = "No OS (raw PMU)",
	.id     = PFM_OS_NONE,
	.flags  = PFMLIB_OS_FL_ACTIVATED,
	.encode = pfmlib_raw_pmu_encode,
	.detect = pfmlib_raw_pmu_detect,
};

/* ==== src/libpfm4/lib/pfmlib_gen_ia64.c ==== */

/*
 * pfmlib_gen_ia64.c : support default architected IA-64 PMU features
 *
 * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/ #include #include #include #include #include #include #include "pfmlib_priv.h" /* library private */ #include "pfmlib_priv_ia64.h" /* architecture private */ #define PMU_GEN_IA64_MAX_COUNTERS 4 /* * number of architected events */ #define PME_GEN_COUNT 2 /* * Description of the PMC register mappings use by * this module (as reported in pfmlib_reg_t.reg_num): * * 0 -> PMC0 * 1 -> PMC1 * n -> PMCn */ #define PFMLIB_GEN_IA64_PMC_BASE 0 /* * generic event as described by architecture */ typedef struct { unsigned long pme_code:8; /* major event code */ unsigned long pme_ig:56; /* ignored */ } pme_gen_ia64_code_t; /* * union of all possible entry codes. All encodings must fit in 64bit */ typedef union { unsigned long pme_vcode; pme_gen_ia64_code_t pme_gen_code; } pme_gen_ia64_entry_code_t; /* * entry in the event table (one table per implementation) */ typedef struct pme_entry { char *pme_name; pme_gen_ia64_entry_code_t pme_entry_code; /* event code */ pfmlib_regmask_t pme_counters; /* counter bitmask */ } pme_gen_ia64_entry_t; /* let's define some handy shortcuts ! */ #define pmc_plm pmc_gen_count_reg.pmc_plm #define pmc_ev pmc_gen_count_reg.pmc_ev #define pmc_oi pmc_gen_count_reg.pmc_oi #define pmc_pm pmc_gen_count_reg.pmc_pm #define pmc_es pmc_gen_count_reg.pmc_es /* * this table is patched by initialization code */ static pme_gen_ia64_entry_t generic_pe[PME_GEN_COUNT]={ #define PME_IA64_GEN_CPU_CYCLES 0 { "CPU_CYCLES", }, #define PME_IA64_GEN_INST_RETIRED 1 { "IA64_INST_RETIRED", }, }; static int pfm_gen_ia64_counter_width; static int pfm_gen_ia64_counters; static pfmlib_regmask_t pfm_gen_ia64_impl_pmcs; static pfmlib_regmask_t pfm_gen_ia64_impl_pmds; /* * Description of the PMC register mappings use by * this module (as reported in pfmlib_reg_t.reg_num): * * 0 -> PMC0 * 1 -> PMC1 * n -> PMCn * We do not use a mapping table, instead we make up the * values on the fly given the base. */ #define PFMLIB_GEN_IA64_PMC_BASE 0 /* * convert text range (e.g. 
4-15 18 12-26) into actual bitmask * range argument is modified */ static int parse_counter_range(char *range, pfmlib_regmask_t *b) { char *p, c; int start, end; if (range[strlen(range)-1] == '\n') range[strlen(range)-1] = '\0'; while(range) { p = range; while (*p && *p != ' ' && *p != '-') p++; if (*p == '\0') break; c = *p; *p = '\0'; start = atoi(range); range = p+1; if (c == '-') { p++; while (*p && *p != ' ' && *p != '-') p++; if (*p) *p++ = '\0'; end = atoi(range); range = p; } else { end = start; } if (end >= PFMLIB_REG_MAX|| start >= PFMLIB_REG_MAX) goto invalid; for (; start <= end; start++) pfm_regmask_set(b, start); } return 0; invalid: fprintf(stderr, "%s.%s : bitmask too small need %d bits\n", __FILE__, __FUNCTION__, start); return -1; } static int pfm_gen_ia64_initialize(void) { FILE *fp; char *p; char buffer[64]; int matches = 0; fp = fopen("/proc/pal/cpu0/perfmon_info", "r"); if (fp == NULL) return PFMLIB_ERR_NOTSUPP; for (;;) { p = fgets(buffer, sizeof(buffer)-1, fp); if (p == NULL) break; if ((p = strchr(buffer, ':')) == NULL) break; *p = '\0'; if (!strncmp("Counter width", buffer, 13)) { pfm_gen_ia64_counter_width = atoi(p+2); matches++; continue; } if (!strncmp("PMC/PMD pairs", buffer, 13)) { pfm_gen_ia64_counters = atoi(p+2); matches++; continue; } if (!strncmp("Cycle event number", buffer, 18)) { generic_pe[0].pme_entry_code.pme_vcode = atoi(p+2); matches++; continue; } if (!strncmp("Retired event number", buffer, 20)) { generic_pe[1].pme_entry_code.pme_vcode = atoi(p+2); matches++; continue; } if (!strncmp("Cycles count capable", buffer, 20)) { if (parse_counter_range(p+2, &generic_pe[0].pme_counters) == -1) return -1; matches++; continue; } if (!strncmp("Retired bundles count capable", buffer, 29)) { if (parse_counter_range(p+2, &generic_pe[1].pme_counters) == -1) return -1; matches++; continue; } if (!strncmp("Implemented PMC", buffer, 15)) { if (parse_counter_range(p+2, &pfm_gen_ia64_impl_pmcs) == -1) return -1; matches++; continue; } if 
(!strncmp("Implemented PMD", buffer, 15)) { if (parse_counter_range(p+2, &pfm_gen_ia64_impl_pmds) == -1) return -1; matches++; continue; } } pfm_regmask_weight(&pfm_gen_ia64_impl_pmcs, &generic_ia64_support.pmc_count); pfm_regmask_weight(&pfm_gen_ia64_impl_pmds, &generic_ia64_support.pmd_count); fclose(fp); return matches == 8 ? PFMLIB_SUCCESS : PFMLIB_ERR_NOTSUPP; } static void pfm_gen_ia64_forced_initialize(void) { unsigned int i; pfm_gen_ia64_counter_width = 47; pfm_gen_ia64_counters = 4; generic_pe[0].pme_entry_code.pme_vcode = 18; generic_pe[1].pme_entry_code.pme_vcode = 8; memset(&pfm_gen_ia64_impl_pmcs, 0, sizeof(pfmlib_regmask_t)); memset(&pfm_gen_ia64_impl_pmds, 0, sizeof(pfmlib_regmask_t)); for(i=0; i < 8; i++) pfm_regmask_set(&pfm_gen_ia64_impl_pmcs, i); for(i=4; i < 8; i++) pfm_regmask_set(&pfm_gen_ia64_impl_pmds, i); memset(&generic_pe[0].pme_counters, 0, sizeof(pfmlib_regmask_t)); memset(&generic_pe[1].pme_counters, 0, sizeof(pfmlib_regmask_t)); for(i=4; i < 8; i++) { pfm_regmask_set(&generic_pe[0].pme_counters, i); pfm_regmask_set(&generic_pe[1].pme_counters, i); } generic_ia64_support.pmc_count = 8; generic_ia64_support.pmd_count = 4; generic_ia64_support.num_cnt = 4; } static int pfm_gen_ia64_detect(void) { /* PMU is architected, so guaranteed to be present */ return PFMLIB_SUCCESS; } static int pfm_gen_ia64_init(void) { if (forced_pmu != PFMLIB_NO_PMU) { pfm_gen_ia64_forced_initialize(); } else if (pfm_gen_ia64_initialize() == -1) return PFMLIB_ERR_NOTSUPP; return PFMLIB_SUCCESS; } static int valid_assign(unsigned int *as, pfmlib_regmask_t *r_pmcs, unsigned int cnt) { unsigned int i; for(i=0; i < cnt; i++) { if (as[i]==0) return 0; /* * take care of restricted PMC registers */ if (pfm_regmask_isset(r_pmcs, as[i])) return 0; } return 1; } /* * Automatically dispatch events to corresponding counters following constraints. 
 * Upon return the pfarg_reg_t structure is ready to be submitted to kernel
 */
static int
pfm_gen_ia64_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_output_param_t *outp)
{
#define has_counter(e,b) (pfm_regmask_isset(&generic_pe[e].pme_counters, b) ? b : 0)
	unsigned int max_l0, max_l1, max_l2, max_l3;
	unsigned int assign[PMU_GEN_IA64_MAX_COUNTERS];
	pfm_gen_ia64_pmc_reg_t reg;
	pfmlib_event_t *e;
	pfmlib_reg_t *pc, *pd;
	pfmlib_regmask_t *r_pmcs;
	unsigned int i,j,k,l;
	unsigned int cnt;

	e      = inp->pfp_events;
	pc     = outp->pfp_pmcs;
	pd     = outp->pfp_pmds;
	cnt    = inp->pfp_event_count;
	r_pmcs = &inp->pfp_unavail_pmcs;

	if (cnt > PMU_GEN_IA64_MAX_COUNTERS)
		return PFMLIB_ERR_TOOMANY;

	max_l0 = PMU_GEN_IA64_FIRST_COUNTER + PMU_GEN_IA64_MAX_COUNTERS;
	max_l1 = PMU_GEN_IA64_FIRST_COUNTER + PMU_GEN_IA64_MAX_COUNTERS*(cnt>1);
	max_l2 = PMU_GEN_IA64_FIRST_COUNTER + PMU_GEN_IA64_MAX_COUNTERS*(cnt>2);
	max_l3 = PMU_GEN_IA64_FIRST_COUNTER + PMU_GEN_IA64_MAX_COUNTERS*(cnt>3);

	if (PFMLIB_DEBUG()) {
		DPRINT("max_l0=%u max_l1=%u max_l2=%u max_l3=%u\n",
			max_l0, max_l1, max_l2, max_l3);
	}

	/*
	 * This code needs fixing. It is not very pretty and
	 * won't handle more than 4 counters if more become
	 * available !
	 * For now, worst case in the loop nest: 4! (factorial)
	 */
	for (i=PMU_GEN_IA64_FIRST_COUNTER; i < max_l0; i++) {
		assign[0] = has_counter(e[0].event,i);
		if (max_l1 == PMU_GEN_IA64_FIRST_COUNTER && valid_assign(assign, r_pmcs, cnt))
			goto done;
		for (j=PMU_GEN_IA64_FIRST_COUNTER; j < max_l1; j++) {
			if (j == i)
				continue;
			assign[1] = has_counter(e[1].event,j);
			if (max_l2 == PMU_GEN_IA64_FIRST_COUNTER && valid_assign(assign, r_pmcs, cnt))
				goto done;
			for (k=PMU_GEN_IA64_FIRST_COUNTER; k < max_l2; k++) {
				if (k == i || k == j)
					continue;
				assign[2] = has_counter(e[2].event,k);
				if (max_l3 == PMU_GEN_IA64_FIRST_COUNTER && valid_assign(assign, r_pmcs, cnt))
					goto done;
				for (l=PMU_GEN_IA64_FIRST_COUNTER; l < max_l3; l++) {
					if (l == i || l == j || l == k)
						continue;
					assign[3] = has_counter(e[3].event,l);
					if (valid_assign(assign, r_pmcs, cnt))
						goto done;
				}
			}
		}
	}
	/* we cannot satisfy the constraints */
	return PFMLIB_ERR_NOASSIGN;
done:
	memset(pc, 0, cnt*sizeof(pfmlib_reg_t));
	memset(pd, 0, cnt*sizeof(pfmlib_reg_t));

	for (j=0; j < cnt; j++) {
		reg.pmc_val = 0; /* clear all */
		/* if not specified per event, then use default (could be zero: measure nothing) */
		reg.pmc_plm = e[j].plm ? e[j].plm : inp->pfp_dfl_plm;
		reg.pmc_oi  = 1; /* overflow interrupt */
		reg.pmc_pm  = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0;
		reg.pmc_es  = generic_pe[e[j].event].pme_entry_code.pme_gen_code.pme_code;

		pc[j].reg_num   = assign[j];
		pc[j].reg_value = reg.pmc_val;
		pc[j].reg_addr  = PFMLIB_GEN_IA64_PMC_BASE+j;

		pd[j].reg_num  = assign[j];
		pd[j].reg_addr = assign[j];

		__pfm_vbprintf("[PMC%u(pmc%u)=0x%lx,es=0x%02x,plm=%d pm=%d] %s\n",
				assign[j], assign[j],
				reg.pmc_val, reg.pmc_es, reg.pmc_plm, reg.pmc_pm,
				generic_pe[e[j].event].pme_name);
		__pfm_vbprintf("[PMD%u(pmd%u)]\n", pd[j].reg_num, pd[j].reg_num);
	}
	/* number of PMC programmed */
	outp->pfp_pmc_count = cnt;
	outp->pfp_pmd_count = cnt;
	return PFMLIB_SUCCESS;
}

static int
pfm_gen_ia64_dispatch_events(pfmlib_input_param_t *inp, void *dummy1, pfmlib_output_param_t *outp, void *dummy2)
{
	return pfm_gen_ia64_dispatch_counters(inp, outp);
}

static int
pfm_gen_ia64_get_event_code(unsigned int i, unsigned int cnt, int *code)
{
	if (cnt != PFMLIB_CNT_FIRST && (cnt < 4 || cnt > 7))
		return PFMLIB_ERR_INVAL;

	*code = (int)generic_pe[i].pme_entry_code.pme_gen_code.pme_code;
	return PFMLIB_SUCCESS;
}

static char *
pfm_gen_ia64_get_event_name(unsigned int i)
{
	return generic_pe[i].pme_name;
}

static void
pfm_gen_ia64_get_event_counters(unsigned int j, pfmlib_regmask_t *counters)
{
	unsigned int i;

	memset(counters, 0, sizeof(*counters));
	for(i=0; i < pfm_gen_ia64_counters; i++) {
		if (pfm_regmask_isset(&generic_pe[j].pme_counters, i))
			pfm_regmask_set(counters, i);
	}
}

static void
pfm_gen_ia64_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs)
{
	*impl_pmcs = pfm_gen_ia64_impl_pmcs;
}

static void
pfm_gen_ia64_get_impl_pmds(pfmlib_regmask_t *impl_pmds)
{
	*impl_pmds = pfm_gen_ia64_impl_pmds;
}

static void
pfm_gen_ia64_get_impl_counters(pfmlib_regmask_t *impl_counters)
{
	unsigned int i = 0;

	/* pmd4-pmd7 */
	for(i=4; i < 8; i++)
		pfm_regmask_set(impl_counters, i);
}

static void
pfm_gen_ia64_get_hw_counter_width(unsigned int *width)
{
	*width = pfm_gen_ia64_counter_width;
}

static int
pfm_gen_ia64_get_event_desc(unsigned int ev, char **str)
{
	switch(ev) {
	case
PME_IA64_GEN_CPU_CYCLES:
		*str = strdup("CPU cycles");
		break;
	case PME_IA64_GEN_INST_RETIRED:
		*str = strdup("IA-64 instructions retired");
		break;
	default:
		*str = NULL;
	}
	return PFMLIB_SUCCESS;
}

static int
pfm_gen_ia64_get_cycle_event(pfmlib_event_t *e)
{
	e->event = PME_IA64_GEN_CPU_CYCLES;
	return PFMLIB_SUCCESS;
}

static int
pfm_gen_ia64_get_inst_retired(pfmlib_event_t *e)
{
	e->event = PME_IA64_GEN_INST_RETIRED;
	return PFMLIB_SUCCESS;
}

pfm_pmu_support_t generic_ia64_support = {
	.pmu_name		= "IA-64",
	.pmu_type		= PFMLIB_GEN_IA64_PMU,
	.pme_count		= PME_GEN_COUNT,
	.pmc_count		= 4+4,
	.pmd_count		= PMU_GEN_IA64_MAX_COUNTERS,
	.num_cnt		= PMU_GEN_IA64_MAX_COUNTERS,
	.get_event_code		= pfm_gen_ia64_get_event_code,
	.get_event_name		= pfm_gen_ia64_get_event_name,
	.get_event_counters	= pfm_gen_ia64_get_event_counters,
	.dispatch_events	= pfm_gen_ia64_dispatch_events,
	.pmu_detect		= pfm_gen_ia64_detect,
	.pmu_init		= pfm_gen_ia64_init,
	.get_impl_pmcs		= pfm_gen_ia64_get_impl_pmcs,
	.get_impl_pmds		= pfm_gen_ia64_get_impl_pmds,
	.get_impl_counters	= pfm_gen_ia64_get_impl_counters,
	.get_hw_counter_width	= pfm_gen_ia64_get_hw_counter_width,
	.get_event_desc		= pfm_gen_ia64_get_event_desc,
	.get_cycle_event	= pfm_gen_ia64_get_cycle_event,
	.get_inst_retired_event	= pfm_gen_ia64_get_inst_retired
};

/* ==== src/libpfm4/lib/pfmlib_ia64_priv.h ==== */

/*
 * Copyright (c) 2003-2006 Hewlett-Packard Development Company, L.P.
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux/ia64. 
*/ #ifndef __PFMLIB_PRIV_IA64_H__ #define __PFMLIB_PRIV_IA64_H__ /* * architected PMC register structure */ typedef union { unsigned long pmc_val; /* generic PMC register */ struct { unsigned long pmc_plm:4; /* privilege level mask */ unsigned long pmc_ev:1; /* external visibility */ unsigned long pmc_oi:1; /* overflow interrupt */ unsigned long pmc_pm:1; /* privileged monitor */ unsigned long pmc_ig1:1; /* reserved */ unsigned long pmc_es:8; /* event select */ unsigned long pmc_ig2:48; /* reserved */ } pmc_gen_count_reg; /* This is the Itanium-specific PMC layout for counter config */ struct { unsigned long pmc_plm:4; /* privilege level mask */ unsigned long pmc_ev:1; /* external visibility */ unsigned long pmc_oi:1; /* overflow interrupt */ unsigned long pmc_pm:1; /* privileged monitor */ unsigned long pmc_ig1:1; /* reserved */ unsigned long pmc_es:7; /* event select */ unsigned long pmc_ig2:1; /* reserved */ unsigned long pmc_umask:4; /* unit mask */ unsigned long pmc_thres:3; /* threshold */ unsigned long pmc_ig3:1; /* reserved (missing from table on p6-17) */ unsigned long pmc_ism:2; /* instruction set mask */ unsigned long pmc_ig4:38; /* reserved */ } pmc_ita_count_reg; /* Opcode matcher */ struct { unsigned long ignored1:3; unsigned long mask:27; /* mask encoding bits {40:27}{12:0} */ unsigned long ignored2:3; unsigned long match:27; /* match encoding bits {40:27}{12:0} */ unsigned long b:1; /* B-syllable */ unsigned long f:1; /* F-syllable */ unsigned long i:1; /* I-syllable */ unsigned long m:1; /* M-syllable */ } pmc8_9_ita_reg; /* Instruction Event Address Registers */ struct { unsigned long iear_plm:4; /* privilege level mask */ unsigned long iear_ig1:2; /* reserved */ unsigned long iear_pm:1; /* privileged monitor */ unsigned long iear_tlb:1; /* cache/tlb mode */ unsigned long iear_ig2:8; /* reserved */ unsigned long iear_umask:4; /* unit mask */ unsigned long iear_ig3:4; /* reserved */ unsigned long iear_ism:2; /* instruction set */ unsigned long 
iear_ig4:38; /* reserved */ } pmc10_ita_reg; /* Data Event Address Registers */ struct { unsigned long dear_plm:4; /* privilege level mask */ unsigned long dear_ig1:2; /* reserved */ unsigned long dear_pm:1; /* privileged monitor */ unsigned long dear_tlb:1; /* cache/tlb mode */ unsigned long dear_ig2:8; /* reserved */ unsigned long dear_umask:4; /* unit mask */ unsigned long dear_ig3:4; /* reserved */ unsigned long dear_ism:2; /* instruction set */ unsigned long dear_ig4:2; /* reserved */ unsigned long dear_pt:1; /* pass tags */ unsigned long dear_ig5:35; /* reserved */ } pmc11_ita_reg; /* Branch Trace Buffer registers */ struct { unsigned long btbc_plm:4; /* privilege level */ unsigned long btbc_ig1:2; unsigned long btbc_pm:1; /* privileged monitor */ unsigned long btbc_tar:1; /* target address register */ unsigned long btbc_tm:2; /* taken mask */ unsigned long btbc_ptm:2; /* predicted taken address mask */ unsigned long btbc_ppm:2; /* predicted predicate mask */ unsigned long btbc_bpt:1; /* branch prediction table */ unsigned long btbc_bac:1; /* branch address calculator */ unsigned long btbc_ig2:48; } pmc12_ita_reg; struct { unsigned long irange_ta:1; /* tag all bit */ unsigned long irange_ig:63; } pmc13_ita_reg; /* This is the Itanium2-specific PMC layout for counter config */ struct { unsigned long pmc_plm:4; /* privilege level mask */ unsigned long pmc_ev:1; /* external visibility */ unsigned long pmc_oi:1; /* overflow interrupt */ unsigned long pmc_pm:1; /* privileged monitor */ unsigned long pmc_ig1:1; /* reserved */ unsigned long pmc_es:8; /* event select */ unsigned long pmc_umask:4; /* unit mask */ unsigned long pmc_thres:3; /* threshold */ unsigned long pmc_enable:1; /* pmc4 only: power enable bit */ unsigned long pmc_ism:2; /* instruction set mask */ unsigned long pmc_ig2:38; /* reserved */ } pmc_ita2_counter_reg; /* opcode matchers */ struct { unsigned long opcm_ig_ad:1; /* ignore instruction address range checking */ unsigned long opcm_inv:1; /* 
invert range check */ unsigned long opcm_bit2:1; /* must be 1 */ unsigned long opcm_mask:27; /* mask encoding bits {41:27}{12:0} */ unsigned long opcm_ig1:3; /* reserved */ unsigned long opcm_match:27; /* match encoding bits {41:27}{12:0} */ unsigned long opcm_b:1; /* B-syllable */ unsigned long opcm_f:1; /* F-syllable */ unsigned long opcm_i:1; /* I-syllable */ unsigned long opcm_m:1; /* M-syllable */ } pmc8_9_ita2_reg; /* * instruction event address register configuration * * The register has two layout depending on the value of the ct field. * In cache mode(ct=1x): * - ct is 1 bit, umask is 8 bits * In TLB mode (ct=00): * - ct is 2 bits, umask is 7 bits * ct=11 <=> cache mode and use a latency with eighth bit set * ct=01 => nothing monitored * * The ct=01 value is the only reason why we cannot fix the layout * to ct 1 bit and umask 8 bits. Even though in TLB mode, only 6 bits * are effectively used for the umask, if the user inadvertently use * a umask with the most significant bit set, it would be equivalent * to no monitoring. 
*/ struct { unsigned long iear_plm:4; /* privilege level mask */ unsigned long iear_pm:1; /* privileged monitor */ unsigned long iear_umask:8; /* event unit mask: 7 bits in TLB mode, 8 bits in cache mode */ unsigned long iear_ct:1; /* cache tlb bit13: 0 for TLB mode, 1 for cache mode */ unsigned long iear_ism:2; /* instruction set */ unsigned long iear_ig4:48; /* reserved */ } pmc10_ita2_cache_reg; struct { unsigned long iear_plm:4; /* privilege level mask */ unsigned long iear_pm:1; /* privileged monitor */ unsigned long iear_umask:7; /* event unit mask: 7 bits in TLB mode, 8 bits in cache mode */ unsigned long iear_ct:2; /* cache tlb bit13: 0 for TLB mode, 1 for cache mode */ unsigned long iear_ism:2; /* instruction set */ unsigned long iear_ig4:48; /* reserved */ } pmc10_ita2_tlb_reg; /* data event address register configuration */ struct { unsigned long dear_plm:4; /* privilege level mask */ unsigned long dear_ig1:2; /* reserved */ unsigned long dear_pm:1; /* privileged monitor */ unsigned long dear_mode:2; /* mode */ unsigned long dear_ig2:7; /* reserved */ unsigned long dear_umask:4; /* unit mask */ unsigned long dear_ig3:4; /* reserved */ unsigned long dear_ism:2; /* instruction set */ unsigned long dear_ig4:38; /* reserved */ } pmc11_ita2_reg; /* branch trace buffer configuration register */ struct { unsigned long btbc_plm:4; /* privilege level */ unsigned long btbc_ig1:2; unsigned long btbc_pm:1; /* privileged monitor */ unsigned long btbc_ds:1; /* data selector */ unsigned long btbc_tm:2; /* taken mask */ unsigned long btbc_ptm:2; /* predicted taken address mask */ unsigned long btbc_ppm:2; /* predicted predicate mask */ unsigned long btbc_brt:2; /* branch type mask */ unsigned long btbc_ig2:48; } pmc12_ita2_reg; /* data address range configuration register */ struct { unsigned long darc_ig1:3; unsigned long darc_cfg_dbrp0:2; /* constraint on dbr0 */ unsigned long darc_ig2:6; unsigned long darc_cfg_dbrp1:2; /* constraint on dbr1 */ unsigned long 
darc_ig3:6; unsigned long darc_cfg_dbrp2:2; /* constraint on dbr2 */ unsigned long darc_ig4:6; unsigned long darc_cfg_dbrp3:2; /* constraint on dbr3 */ unsigned long darc_ig5:16; unsigned long darc_ena_dbrp0:1; /* enable constraint dbr0 */ unsigned long darc_ena_dbrp1:1; /* enable constraint dbr1 */ unsigned long darc_ena_dbrp2:1; /* enable constraint dbr2 */ unsigned long darc_ena_dbrp3:1; /* enable constraint dbr3 */ unsigned long darc_ig6:15; } pmc13_ita2_reg; /* instruction address range configuration register */ struct { unsigned long iarc_ig1:1; unsigned long iarc_ibrp0:1; /* constrained by ibr0 */ unsigned long iarc_ig2:2; unsigned long iarc_ibrp1:1; /* constrained by ibr1 */ unsigned long iarc_ig3:2; unsigned long iarc_ibrp2:1; /* constrained by ibr2 */ unsigned long iarc_ig4:2; unsigned long iarc_ibrp3:1; /* constrained by ibr3 */ unsigned long iarc_ig5:2; unsigned long iarc_fine:1; /* fine mode */ unsigned long iarc_ig6:50; } pmc14_ita2_reg; /* opcode matcher configuration register */ struct { unsigned long opcmc_ibrp0_pmc8:1; unsigned long opcmc_ibrp1_pmc9:1; unsigned long opcmc_ibrp2_pmc8:1; unsigned long opcmc_ibrp3_pmc9:1; unsigned long opcmc_ig1:60; } pmc15_ita2_reg; /* This is the Montecito-specific PMC layout for counters PMC4-PMC15 */ struct { unsigned long pmc_plm:4; /* privilege level mask */ unsigned long pmc_ev:1; /* external visibility */ unsigned long pmc_oi:1; /* overflow interrupt */ unsigned long pmc_pm:1; /* privileged monitor */ unsigned long pmc_ig1:1; /* ignored */ unsigned long pmc_es:8; /* event select */ unsigned long pmc_umask:4; /* unit mask */ unsigned long pmc_thres:3; /* threshold */ unsigned long pmc_ig2:1; /* ignored */ unsigned long pmc_ism:2; /* instruction set: must be 2 */ unsigned long pmc_all:1; /* 0=only self, 1=both threads */ unsigned long pmc_i:1; /* Invalidate */ unsigned long pmc_s:1; /* Shared */ unsigned long pmc_e:1; /* Exclusive */ unsigned long pmc_m:1; /* Modified */ unsigned long pmc_res3:33; /* reserved 
*/ } pmc_mont_counter_reg; /* opcode matchers mask registers */ struct { unsigned long opcm_mask:41; /* opcode mask */ unsigned long opcm_ig1:7; /* ignored */ unsigned long opcm_b:1; /* B-syllable */ unsigned long opcm_f:1; /* F-syllable */ unsigned long opcm_i:1; /* I-syllable */ unsigned long opcm_m:1; /* M-syllable */ unsigned long opcm_ig2:4; /* ignored */ unsigned long opcm_inv:1; /* inverse range for ibrp0 */ unsigned long opcm_ig_ad:1; /* ignore address range restrictions */ unsigned long opcm_ig3:6; /* ignored */ } pmc32_34_mont_reg; /* opcode matchers match registers */ struct { unsigned long opcm_match:41; /* opcode match */ unsigned long opcm_ig1:23; /* ignored */ } pmc33_35_mont_reg; /* opcode matcher config register */ struct { unsigned long opcm_ch0_ig_opcm:1; /* chan0 opcode constraint */ unsigned long opcm_ch1_ig_opcm:1; /* chan1 opcode constraint */ unsigned long opcm_ch2_ig_opcm:1; /* chan2 opcode constraint */ unsigned long opcm_ch3_ig_opcm:1; /* chan3 opcode constraint */ unsigned long opcm_res:28; /* reserved */ unsigned long opcm_ig:32; /* ignored */ } pmc36_mont_reg; /* * instruction event address register configuration (I-EAR) * * The register has two layouts depending on the value of the ct field. * In cache mode(ct=1x): * - ct is 1 bit, umask is 8 bits * In TLB mode (ct=0x): * - ct is 2 bits, umask is 7 bits * ct=11 => cache mode using a latency filter with eighth bit set * ct=01 => nothing monitored * * The ct=01 value is the only reason why we cannot fix the layout * to ct 1 bit and umask 8 bits. Even though in TLB mode, only 6 bits * are effectively used for the umask, if the user inadvertently sets * a umask with the most significant bit set, it would be equivalent * to no monitoring. 
*/ struct { unsigned long iear_plm:4; /* privilege level mask */ unsigned long iear_pm:1; /* privileged monitor */ unsigned long iear_umask:8; /* event unit mask */ unsigned long iear_ct:1; /* =1 for i-cache */ unsigned long iear_res:2; /* reserved */ unsigned long iear_ig:48; /* ignored */ } pmc37_mont_cache_reg; struct { unsigned long iear_plm:4; /* privilege level mask */ unsigned long iear_pm:1; /* privileged monitor */ unsigned long iear_umask:7; /* event unit mask */ unsigned long iear_ct:2; /* 00=i-tlb, 01=nothing 1x=illegal */ unsigned long iear_res:50; /* reserved */ } pmc37_mont_tlb_reg; /* data event address register configuration (D-EAR) */ struct { unsigned long dear_plm:4; /* privilege level mask */ unsigned long dear_ig1:2; /* ignored */ unsigned long dear_pm:1; /* privileged monitor */ unsigned long dear_mode:2; /* mode */ unsigned long dear_ig2:7; /* ignored */ unsigned long dear_umask:4; /* unit mask */ unsigned long dear_ig3:4; /* ignored */ unsigned long dear_ism:2; /* instruction set: must be 2 */ unsigned long dear_ig4:38; /* ignored */ } pmc40_mont_reg; /* IP event address register (IP-EAR) */ struct { unsigned long ipear_plm:4; /* privilege level mask */ unsigned long ipear_ig1:2; /* ignored */ unsigned long ipear_pm:1; /* privileged monitor */ unsigned long ipear_ig2:1; /* ignored */ unsigned long ipear_mode:3; /* mode */ unsigned long ipear_delay:8; /* delay */ unsigned long ipear_ig3:45; /* reserved */ } pmc42_mont_reg; /* execution trace buffer configuration register (ETB) */ struct { unsigned long etbc_plm:4; /* privilege level */ unsigned long etbc_res1:2; /* reserved */ unsigned long etbc_pm:1; /* privileged monitor */ unsigned long etbc_ds:1; /* data selector */ unsigned long etbc_tm:2; /* taken mask */ unsigned long etbc_ptm:2; /* predicted taken address mask */ unsigned long etbc_ppm:2; /* predicted predicate mask */ unsigned long etbc_brt:2; /* branch type mask */ unsigned long etbc_ig:48; /* ignored */ } pmc39_mont_reg; /* data 
address range configuration register */ struct { unsigned long darc_res1:3; /* reserved */ unsigned long darc_cfg_dtag0:2; /* constraints on dbrp0 */ unsigned long darc_res2:6; /* reserved */ unsigned long darc_cfg_dtag1:2; /* constraints on dbrp1 */ unsigned long darc_res3:6; /* reserved */ unsigned long darc_cfg_dtag2:2; /* constraints on dbrp2 */ unsigned long darc_res4:6; /* reserved */ unsigned long darc_cfg_dtag3:2; /* constraints on dbrp3 */ unsigned long darc_res5:16; /* reserved */ unsigned long darc_ena_dbrp0:1; /* enable constraints dbrp0 */ unsigned long darc_ena_dbrp1:1; /* enable constraints dbrp1 */ unsigned long darc_ena_dbrp2:1; /* enable constraints dbrp2 */ unsigned long darc_ena_dbrp3:1; /* enable constraint dbr3 */ unsigned long darc_res6:15; } pmc41_mont_reg; /* instruction address range configuration register */ struct { unsigned long iarc_res1:1; /* reserved */ unsigned long iarc_ig_ibrp0:1; /* constrained by ibrp0 */ unsigned long iarc_res2:2; /* reserved */ unsigned long iarc_ig_ibrp1:1; /* constrained by ibrp1 */ unsigned long iarc_res3:2; /* reserved */ unsigned long iarc_ig_ibrp2:1; /* constrained by ibrp2 */ unsigned long iarc_res4:2; /* reserved */ unsigned long iarc_ig_ibrp3:1; /* constrained by ibrp3 */ unsigned long iarc_res5:2; /* reserved */ unsigned long iarc_fine:1; /* fine mode */ unsigned long iarc_ig6:50; /* reserved */ } pmc38_mont_reg; } pfm_gen_ia64_pmc_reg_t; typedef struct { unsigned long pmd_val; /* generic counter value */ /* counting pmd register */ struct { unsigned long pmd_count:32; /* 32-bit hardware counter */ unsigned long pmd_sxt32:32; /* sign extension of bit 32 */ } pmd_ita_counter_reg; struct { unsigned long iear_v:1; /* valid bit */ unsigned long iear_tlb:1; /* tlb miss bit */ unsigned long iear_ig1:3; /* reserved */ unsigned long iear_icla:59; /* instruction cache line address {60:51} sxt {50}*/ } pmd0_ita_reg; struct { unsigned long iear_lat:12; /* latency */ unsigned long iear_ig1:52; /* reserved */ } 
pmd1_ita_reg; struct { unsigned long dear_daddr; /* data address */ } pmd2_ita_reg; struct { unsigned long dear_latency:12; /* latency */ unsigned long dear_ig1:50; /* reserved */ unsigned long dear_level:2; /* level */ } pmd3_ita_reg; struct { unsigned long btb_b:1; /* branch bit */ unsigned long btb_mp:1; /* mispredict bit */ unsigned long btb_slot:2; /* which slot, 3=not taken branch */ unsigned long btb_addr:60; /* b=1, bundle address, b=0 target address */ } pmd8_15_ita_reg; struct { unsigned long btbi_bbi:3; /* branch buffer index */ unsigned long btbi_full:1; /* full bit (sticky) */ unsigned long btbi_ignored:60; } pmd16_ita_reg; struct { unsigned long dear_vl:1; /* valid bit */ unsigned long dear_ig1:1; /* reserved */ unsigned long dear_slot:2; /* slot number */ unsigned long dear_iaddr:60; /* instruction address */ } pmd17_ita_reg; /* counting pmd register */ struct { unsigned long pmd_count:47; /* 47-bit hardware counter */ unsigned long pmd_sxt47:17; /* sign extension of bit 46 */ } pmd_ita2_counter_reg; /* instruction event address register: data address register */ struct { unsigned long iear_stat:2; /* status bit */ unsigned long iear_ig1:3; unsigned long iear_iaddr:59; /* instruction cache line address {60:51} sxt {50}*/ } pmd0_ita2_reg; /* instruction event address register: data address register */ struct { unsigned long iear_latency:12; /* latency */ unsigned long iear_overflow:1; /* latency overflow */ unsigned long iear_ig1:51; /* reserved */ } pmd1_ita2_reg; /* data event address register: data address register */ struct { unsigned long dear_daddr; /* data address */ } pmd2_ita2_reg; /* data event address register: data address register */ struct { unsigned long dear_latency:13; /* latency */ unsigned long dear_overflow:1; /* overflow */ unsigned long dear_stat:2; /* status */ unsigned long dear_ig1:48; /* ignored */ } pmd3_ita2_reg; /* branch trace buffer data register when pmc12.ds == 0 */ struct { unsigned long btb_b:1; /* branch bit */ 
unsigned long btb_mp:1; /* mispredict bit */ unsigned long btb_slot:2; /* which slot, 3=not taken branch */ unsigned long btb_addr:60; /* bundle address(b=1), target address(b=0) */ } pmd8_15_ita2_reg; /* branch trace buffer data register when pmc12.ds == 1 */ struct { unsigned long btb_b:1; /* branch bit */ unsigned long btb_mp:1; /* mispredict bit */ unsigned long btb_slot:2; /* which slot, 3=not taken branch */ unsigned long btb_loaddr:37; /* b=1, bundle address, b=0 target address */ unsigned long btb_pred:20; /* low 20bits of L1IBR */ unsigned long btb_hiaddr:3; /* hi 3bits of bundle address(b=1) or target address (b=0)*/ } pmd8_15_ds_ita2_reg; /* branch trace buffer index register */ struct { unsigned long btbi_bbi:3; /* next entry index */ unsigned long btbi_full:1; /* full bit (sticky) */ unsigned long btbi_pmd8ext_b1:1; /* pmd8 ext */ unsigned long btbi_pmd8ext_bruflush:1; /* pmd8 ext */ unsigned long btbi_pmd8ext_ig:2; /* pmd8 ext */ unsigned long btbi_pmd9ext_b1:1; /* pmd9 ext */ unsigned long btbi_pmd9ext_bruflush:1; /* pmd9 ext */ unsigned long btbi_pmd9ext_ig:2; /* pmd9 ext */ unsigned long btbi_pmd10ext_b1:1; /* pmd10 ext */ unsigned long btbi_pmd10ext_bruflush:1; /* pmd10 ext */ unsigned long btbi_pmd10ext_ig:2; /* pmd10 ext */ unsigned long btbi_pmd11ext_b1:1; /* pmd11 ext */ unsigned long btbi_pmd11ext_bruflush:1; /* pmd11 ext */ unsigned long btbi_pmd11ext_ig:2; /* pmd11 ext */ unsigned long btbi_pmd12ext_b1:1; /* pmd12 ext */ unsigned long btbi_pmd12ext_bruflush:1; /* pmd12 ext */ unsigned long btbi_pmd12ext_ig:2; /* pmd12 ext */ unsigned long btbi_pmd13ext_b1:1; /* pmd13 ext */ unsigned long btbi_pmd13ext_bruflush:1; /* pmd13 ext */ unsigned long btbi_pmd13ext_ig:2; /* pmd13 ext */ unsigned long btbi_pmd14ext_b1:1; /* pmd14 ext */ unsigned long btbi_pmd14ext_bruflush:1; /* pmd14 ext */ unsigned long btbi_pmd14ext_ig:2; /* pmd14 ext */ unsigned long btbi_pmd15ext_b1:1; /* pmd15 ext */ unsigned long btbi_pmd15ext_bruflush:1; /* pmd15 ext */ 
unsigned long btbi_pmd15ext_ig:2; /* pmd15 ext */ unsigned long btbi_ignored:28; } pmd16_ita2_reg; /* data event address register: data address register */ struct { unsigned long dear_slot:2; /* slot */ unsigned long dear_bn:1; /* bundle bit (if 1 add 16 to address) */ unsigned long dear_vl:1; /* valid */ unsigned long dear_iaddr:60; /* instruction address (2-bundle window)*/ } pmd17_ita2_reg; struct { unsigned long pmd_count:47; /* 47-bit hardware counter */ unsigned long pmd_sxt47:17; /* sign extension of bit 46 */ } pmd_mont_counter_reg; /* data event address register */ struct { unsigned long dear_daddr; /* data address */ } pmd32_mont_reg; /* data event address register (D-EAR) */ struct { unsigned long dear_latency:13; /* latency */ unsigned long dear_ov:1; /* latency overflow */ unsigned long dear_stat:2; /* status */ unsigned long dear_ig:48; /* ignored */ } pmd33_mont_reg; /* instruction event address register (I-EAR) */ struct { unsigned long iear_stat:2; /* status bit */ unsigned long iear_ig:3; /* ignored */ unsigned long iear_iaddr:59; /* instruction cache line address {60:51} sxt {50}*/ } pmd34_mont_reg; /* instruction event address register (I-EAR) */ struct { unsigned long iear_latency:12; /* latency */ unsigned long iear_ov:1; /* latency overflow */ unsigned long iear_ig:51; /* ignored */ } pmd35_mont_reg; /* data event address register (D-EAR) */ struct { unsigned long dear_slot:2; /* slot */ unsigned long dear_bn:1; /* bundle bit (if 1 add 16 to iaddr) */ unsigned long dear_vl:1; /* valid */ unsigned long dear_iaddr:60; /* instruction address (2-bundle window)*/ } pmd36_mont_reg; /* execution trace buffer index register (ETB) */ struct { unsigned long etbi_ebi:4; /* next entry index */ unsigned long etbi_ig1:1; /* ignored */ unsigned long etbi_full:1; /* ETB overflowed at least once */ unsigned long etbi_ig2:58; /* ignored */ } pmd38_mont_reg; /* execution trace buffer extension register (ETB) */ struct { unsigned long etb_pmd48ext_b1:1; /* pmd48 
ext */ unsigned long etb_pmd48ext_bruflush:1; /* pmd48 ext */ unsigned long etb_pmd48ext_res:2; /* reserved */ unsigned long etb_pmd56ext_b1:1; /* pmd56 ext */ unsigned long etb_pmd56ext_bruflush:1; /* pmd56 ext */ unsigned long etb_pmd56ext_res:2; /* reserved */ unsigned long etb_pmd49ext_b1:1; /* pmd49 ext */ unsigned long etb_pmd49ext_bruflush:1; /* pmd49 ext */ unsigned long etb_pmd49ext_res:2; /* reserved */ unsigned long etb_pmd57ext_b1:1; /* pmd57 ext */ unsigned long etb_pmd57ext_bruflush:1; /* pmd57 ext */ unsigned long etb_pmd57ext_res:2; /* reserved */ unsigned long etb_pmd50ext_b1:1; /* pmd50 ext */ unsigned long etb_pmd50ext_bruflush:1; /* pmd50 ext */ unsigned long etb_pmd50ext_res:2; /* reserved */ unsigned long etb_pmd58ext_b1:1; /* pmd58 ext */ unsigned long etb_pmd58ext_bruflush:1; /* pmd58 ext */ unsigned long etb_pmd58ext_res:2; /* reserved */ unsigned long etb_pmd51ext_b1:1; /* pmd51 ext */ unsigned long etb_pmd51ext_bruflush:1; /* pmd51 ext */ unsigned long etb_pmd51ext_res:2; /* reserved */ unsigned long etb_pmd59ext_b1:1; /* pmd59 ext */ unsigned long etb_pmd59ext_bruflush:1; /* pmd59 ext */ unsigned long etb_pmd59ext_res:2; /* reserved */ unsigned long etb_pmd52ext_b1:1; /* pmd52 ext */ unsigned long etb_pmd52ext_bruflush:1; /* pmd52 ext */ unsigned long etb_pmd52ext_res:2; /* reserved */ unsigned long etb_pmd60ext_b1:1; /* pmd60 ext */ unsigned long etb_pmd60ext_bruflush:1; /* pmd60 ext */ unsigned long etb_pmd60ext_res:2; /* reserved */ unsigned long etb_pmd53ext_b1:1; /* pmd53 ext */ unsigned long etb_pmd53ext_bruflush:1; /* pmd53 ext */ unsigned long etb_pmd53ext_res:2; /* reserved */ unsigned long etb_pmd61ext_b1:1; /* pmd61 ext */ unsigned long etb_pmd61ext_bruflush:1; /* pmd61 ext */ unsigned long etb_pmd61ext_res:2; /* reserved */ unsigned long etb_pmd54ext_b1:1; /* pmd54 ext */ unsigned long etb_pmd54ext_bruflush:1; /* pmd54 ext */ unsigned long etb_pmd54ext_res:2; /* reserved */ unsigned long etb_pmd62ext_b1:1; /* pmd62 ext */ 
unsigned long etb_pmd62ext_bruflush:1; /* pmd62 ext */ unsigned long etb_pmd62ext_res:2; /* reserved */ unsigned long etb_pmd55ext_b1:1; /* pmd55 ext */ unsigned long etb_pmd55ext_bruflush:1; /* pmd55 ext */ unsigned long etb_pmd55ext_res:2; /* reserved */ unsigned long etb_pmd63ext_b1:1; /* pmd63 ext */ unsigned long etb_pmd63ext_bruflush:1; /* pmd63 ext */ unsigned long etb_pmd63ext_res:2; /* reserved */ } pmd39_mont_reg; /* * execution trace buffer extension register when used with IP-EAR * * to be used in conjunction with pmd48_63_ipear_reg (see below) */ struct { unsigned long ipear_pmd48ext_cycles:2; /* pmd48 upper 2 bits of cycles */ unsigned long ipear_pmd48ext_f:1; /* pmd48 flush bit */ unsigned long ipear_pmd48ext_ef:1; /* pmd48 early freeze */ unsigned long ipear_pmd56ext_cycles:2; /* pmd56 upper 2 bits of cycles */ unsigned long ipear_pmd56ext_f:1; /* pmd56 flush bit */ unsigned long ipear_pmd56ext_ef:1; /* pmd56 early freeze */ unsigned long ipear_pmd49ext_cycles:2; /* pmd49 upper 2 bits of cycles */ unsigned long ipear_pmd49ext_f:1; /* pmd49 flush bit */ unsigned long ipear_pmd49ext_ef:1; /* pmd49 early freeze */ unsigned long ipear_pmd57ext_cycles:2; /* pmd57 upper 2 bits of cycles */ unsigned long ipear_pmd57ext_f:1; /* pmd57 flush bit */ unsigned long ipear_pmd57ext_ef:1; /* pmd57 early freeze */ unsigned long ipear_pmd50ext_cycles:2; /* pmd50 upper 2 bits of cycles */ unsigned long ipear_pmd50ext_f:1; /* pmd50 flush bit */ unsigned long ipear_pmd50ext_ef:1; /* pmd50 early freeze */ unsigned long ipear_pmd58ext_cycles:2; /* pmd58 upper 2 bits of cycles */ unsigned long ipear_pmd58ext_f:1; /* pmd58 flush bit */ unsigned long ipear_pmd58ext_ef:1; /* pmd58 early freeze */ unsigned long ipear_pmd51ext_cycles:2; /* pmd51 upper 2 bits of cycles */ unsigned long ipear_pmd51ext_f:1; /* pmd51 flush bit */ unsigned long ipear_pmd51ext_ef:1; /* pmd51 early freeze */ unsigned long ipear_pmd59ext_cycles:2; /* pmd59 upper 2 bits of cycles */ unsigned long 
ipear_pmd59ext_f:1; /* pmd59 flush bit */ unsigned long ipear_pmd59ext_ef:1; /* pmd59 early freeze */ unsigned long ipear_pmd52ext_cycles:2; /* pmd52 upper 2 bits of cycles */ unsigned long ipear_pmd52ext_f:1; /* pmd52 flush bit */ unsigned long ipear_pmd52ext_ef:1; /* pmd52 early freeze */ unsigned long ipear_pmd60ext_cycles:2; /* pmd60 upper 2 bits of cycles */ unsigned long ipear_pmd60ext_f:1; /* pmd60 flush bit */ unsigned long ipear_pmd60ext_ef:1; /* pmd60 early freeze */ unsigned long ipear_pmd53ext_cycles:2; /* pmd53 upper 2 bits of cycles */ unsigned long ipear_pmd53ext_f:1; /* pmd53 flush bit */ unsigned long ipear_pmd53ext_ef:1; /* pmd53 early freeze */ unsigned long ipear_pmd61ext_cycles:2; /* pmd61 upper 2 bits of cycles */ unsigned long ipear_pmd61ext_f:1; /* pmd61 flush bit */ unsigned long ipear_pmd61ext_ef:1; /* pmd61 early freeze */ unsigned long ipear_pmd54ext_cycles:2; /* pmd54 upper 2 bits of cycles */ unsigned long ipear_pmd54ext_f:1; /* pmd54 flush bit */ unsigned long ipear_pmd54ext_ef:1; /* pmd54 early freeze */ unsigned long ipear_pmd62ext_cycles:2; /* pmd62 upper 2 bits of cycles */ unsigned long ipear_pmd62ext_f:1; /* pmd62 flush bit */ unsigned long ipear_pmd62ext_ef:1; /* pmd62 early freeze */ unsigned long ipear_pmd55ext_cycles:2; /* pmd55 upper 2 bits of cycles */ unsigned long ipear_pmd55ext_f:1; /* pmd55 flush bit */ unsigned long ipear_pmd55ext_ef:1; /* pmd55 early freeze */ unsigned long ipear_pmd63ext_cycles:2; /* pmd63 upper 2 bits of cycles */ unsigned long ipear_pmd63ext_f:1; /* pmd63 flush bit */ unsigned long ipear_pmd63ext_ef:1; /* pmd63 early freeze */ } pmd39_ipear_mont_reg; /* * execution trace buffer data register (ETB) * * when pmc39.ds == 0: pmd48-63 contains branch targets * when pmc39.ds == 1: pmd48-63 content is undefined */ struct { unsigned long etb_s:1; /* source bit */ unsigned long etb_mp:1; /* mispredict bit */ unsigned long etb_slot:2; /* which slot, 3=not taken branch */ unsigned long etb_addr:60; /* bundle 
address(s=1), target address(s=0) */
	} pmd48_63_etb_mont_reg;

	/*
	 * execution trace buffer when used with IP-EAR with PMD48-63.ef=0
	 *
	 * The cycles field straddles pmdXX and the corresponding extension in
	 * pmd39 (pmd39_ipear_mont_reg). For instance, cycles for pmd48:
	 *
	 * cycles = pmd39_ipear_mont_reg.ipear_pmd48ext_cycles << 4
	 *        | pmd48_63_ipear_mont_reg.ipear_cycles
	 */
	struct {
		unsigned long ipear_addr:60;	/* retired IP[63:4] */
		unsigned long ipear_cycles:4;	/* lower 4 bits of cycles */
	} pmd48_63_ipear_mont_reg;

	/*
	 * execution trace buffer when used with IP-EAR with PMD48-63.ef=1
	 *
	 * The cycles field straddles pmdXX and the corresponding extension in
	 * pmd39 (pmd39_ipear_mont_reg). For instance, cycles for pmd48:
	 *
	 * cycles = pmd39_ipear_mont_reg.ipear_pmd48ext_cycles << 4
	 *        | pmd48_63_ipear_ef_mont_reg.ipear_cycles
	 */
	struct {
		unsigned long ipear_delay:8;	/* delay count */
		unsigned long ipear_addr:52;	/* retired IP[61:12] */
		unsigned long ipear_cycles:4;	/* lower 4 bits of cycles */
	} pmd48_63_ipear_ef_mont_reg;
} pfm_gen_ia64_pmd_reg_t;

#define PFMLIB_ITA2_FL_EVT_NO_QUALCHECK	0x1 /* don't check qualifier constraints */

#define PFMLIB_ITA2_RR_INV		0x1 /* inverse instruction ranges (iranges only) */
#define PFMLIB_ITA2_RR_NO_FINE_MODE	0x2 /* force non fine mode for instruction ranges */

#define PFMLIB_ITA2_EVT_NO_GRP		0 /* event does not belong to a group */
#define PFMLIB_ITA2_EVT_L1_CACHE_GRP	1 /* event belongs to L1 Cache group */
#define PFMLIB_ITA2_EVT_L2_CACHE_GRP	2 /* event belongs to L2 Cache group */

#define PFMLIB_ITA2_EVT_NO_SET		-1 /* event does not belong to a set */

/*
 * counter specific flags
 */
#define PFMLIB_MONT_FL_EVT_NO_QUALCHECK	0x1 /* don't check qualifier constraints */
#define PFMLIB_MONT_FL_EVT_ALL_THRD	0x2 /* event measured for both threads */
#define PFMLIB_MONT_FL_EVT_ACTIVE_ONLY	0x4 /* measure the event only when the thread is active */
#define PFMLIB_MONT_FL_EVT_ALWAYS	0x8 /* measure the event at all times (active or inactive) */

#define
PFMLIB_MONT_RR_INV 0x1 /* inverse instruction ranges (iranges only) */ #define PFMLIB_MONT_RR_NO_FINE_MODE 0x2 /* force non fine mode for instruction ranges */ #define PFMLIB_MONT_IRR_DEMAND_FETCH 0x4 /* demand fetch only for dual events */ #define PFMLIB_MONT_IRR_PREFETCH_MATCH 0x8 /* regular prefetches for dual events */ #define PFMLIB_MONT_EVT_NO_GRP 0 /* event does not belong to a group */ #define PFMLIB_MONT_EVT_L1D_CACHE_GRP 1 /* event belongs to L1D Cache group */ #define PFMLIB_MONT_EVT_L2D_CACHE_GRP 2 /* event belongs to L2D Cache group */ #define PFMLIB_MONT_EVT_NO_SET -1 /* event does not belong to a set */ #define PFMLIB_MONT_EVT_ACTIVE 0 /* event measures only when thread is active */ #define PFMLIB_MONT_EVT_FLOATING 1 #define PFMLIB_MONT_EVT_CAUSAL 2 #define PFMLIB_MONT_EVT_SELF_FLOATING 3 /* floating with .self, causal otherwise */ typedef struct { unsigned long db_mask:56; unsigned long db_plm:4; unsigned long db_ig:2; unsigned long db_w:1; unsigned long db_rx:1; } br_mask_reg_t; typedef union { unsigned long val; br_mask_reg_t db; } dbreg_t; static inline int pfm_ia64_get_cpu_family(void) { return (int)((ia64_get_cpuid(3) >> 24) & 0xff); } static inline int pfm_ia64_get_cpu_model(void) { return (int)((ia64_get_cpuid(3) >> 16) & 0xff); } /* * find last bit set */ static inline int pfm_ia64_fls (unsigned long x) { double d = x; long exp; exp = ia64_getf(d); return exp - 0xffff; } #endif /* __PFMLIB_PRIV_IA64_H__ */ papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_adl.c000066400000000000000000000102341502707512200221620ustar00rootroot00000000000000/* * pfmlib_intel_adl.c : Intel AlderLake core PMU (P-Core, E-Core) * * Copyright (c) 2024 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, 
publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "events/intel_adl_glc_events.h" #include "events/intel_adl_grt_events.h" static const int adl_glc_models[] = { 0x97, /* Alderlake */ 0x9a, /* Alderlake L */ 0xb7, /* Raptorlake */ 0xba, /* Raptorlake P */ 0xbf, /* Raptorlake S */ 0 }; static int pfm_adl_init(void *this) { pfm_intel_x86_cfg.arch_version = 5; return PFM_SUCCESS; } pfmlib_pmu_t intel_adl_glc_support={ .desc = "Intel AlderLake GoldenCove (P-Core)", .name = "adl_glc", .perf_name = "cpu_core", .pmu = PFM_PMU_INTEL_ADL_GLC, .pme_count = LIBPFM_ARRAY_SIZE(intel_adl_glc_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 8, /* consider with HT off by default */ .num_fixed_cntrs = 4, .max_encoding = 2, /* offcore_response */ .pe = intel_adl_glc_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = adl_glc_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_adl_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, 
.get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, .get_num_events = pfm_intel_x86_get_num_events, }; pfmlib_pmu_t intel_adl_grt_support={ .desc = "Intel AlderLake Gracemont (E-Core)", .name = "adl_grt", .perf_name = "cpu_atom", .pmu = PFM_PMU_INTEL_ADL_GRT, .pme_count = LIBPFM_ARRAY_SIZE(intel_adl_grt_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 6, /* no HT */ .num_fixed_cntrs = 3, .max_encoding = 2, /* offcore_response */ .pe = intel_adl_grt_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = adl_glc_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_adl_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, .get_num_events = pfm_intel_x86_get_num_events, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_atom.c000066400000000000000000000057241502707512200223720ustar00rootroot00000000000000/* * pfmlib_intel_atom.c : Intel Atom PMU * * Copyright (c) 2008 Google, Inc * Contributed by Stephane Eranian * * Based on work: * Copyright (c) 2006 Hewlett-Packard Development Company, 
L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
 *
 *
 * This file implements support for the Intel Atom PMU as specified in the following document:
 * "IA-32 Intel Architecture Software Developer's Manual - Volume 3B: System
 * Programming Guide"
 *
 * Intel Atom = architectural v3 + PEBS
 */

/* private headers */
#include "pfmlib_priv.h"
#include "pfmlib_intel_x86_priv.h"
#include "events/intel_atom_events.h"

static const int atom_models[] = {
	28, /* Pineview/Silverthorne */
	38, /* Lincroft */
	39, /* Penwell */
	53, /* Cloverview */
	54, /* Cedarview */
	0
};

static int
pfm_intel_atom_init(void *this)
{
	pfm_intel_x86_cfg.arch_version = 3;
	return PFM_SUCCESS;
}

pfmlib_pmu_t intel_atom_support={
	.desc = "Intel Atom",
	.name = "atom",
	.pmu = PFM_PMU_INTEL_ATOM,
	.pme_count = LIBPFM_ARRAY_SIZE(intel_atom_pe),
	.type = PFM_PMU_TYPE_CORE,
	.num_cntrs = 2,
	.num_fixed_cntrs = 3,
	.max_encoding = 1,
	.pe = intel_atom_pe,
	.atdesc = intel_x86_mods,
	.flags = PFMLIB_PMU_FL_RAW_UMASK,
	.supported_plm = INTEL_X86_PLM,
	.cpu_family = 6,
	.cpu_models = atom_models,
	.pmu_detect = pfm_intel_x86_model_detect,
	.pmu_init = pfm_intel_atom_init,
	.get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding,
	PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding),
	.get_event_first = pfm_intel_x86_get_event_first,
	.get_event_next = pfm_intel_x86_get_event_next,
	.event_is_valid = pfm_intel_x86_event_is_valid,
	.validate_table = pfm_intel_x86_validate_table,
	.get_event_info = pfm_intel_x86_get_event_info,
	.get_event_attr_info = pfm_intel_x86_get_event_attr_info,
	PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs),
	.get_event_nattrs = pfm_intel_x86_get_event_nattrs,
};
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_bdw.c000066400000000000000000000077121502707512200222050ustar00rootroot00000000000000/*
 * pfmlib_intel_bdw.c : Intel Broadwell core PMU
 *
 * Copyright (c) 2014 Google, Inc
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the
"Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "events/intel_bdw_events.h" static const int bdw_models[] = { 61, /* Broadwell Core-M */ 71, /* Broadwell + GT3e (Iris Pro graphics) */ 0 }; static const int bdwep_models[] = { 79, /* Broadwell-EP, Xeon */ 86, /* Broadwell-EP, Xeon D */ 0 }; static int pfm_bdw_init(void *this) { pfm_intel_x86_cfg.arch_version = 4; return PFM_SUCCESS; } pfmlib_pmu_t intel_bdw_support={ .desc = "Intel Broadwell", .name = "bdw", .pmu = PFM_PMU_INTEL_BDW, .pme_count = LIBPFM_ARRAY_SIZE(intel_bdw_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 8, /* consider with HT off by default */ .num_fixed_cntrs = 3, .max_encoding = 2, /* offcore_response */ .pe = intel_bdw_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = bdw_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_bdw_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, 
PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, }; pfmlib_pmu_t intel_bdw_ep_support={ .desc = "Intel Broadwell EP", .name = "bdw_ep", .pmu = PFM_PMU_INTEL_BDW_EP, .pme_count = LIBPFM_ARRAY_SIZE(intel_bdw_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 8, /* consider with HT off by default */ .num_fixed_cntrs = 3, .max_encoding = 2, /* offcore_response */ .pe = intel_bdw_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = bdwep_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_bdw_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_bdx_unc_cbo.c000066400000000000000000000101111502707512200236610ustar00rootroot00000000000000/* * pfmlib_intel_bdx_unc_cbo.c : Intel BDX C-Box uncore PMU * * Copyright (c) 2017 Google Inc. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/

#include <sys/types.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

/* private headers */
#include "pfmlib_priv.h"
#include "pfmlib_intel_x86_priv.h"
#include "pfmlib_intel_snbep_unc_priv.h"
#include "events/intel_bdx_unc_cbo_events.h"

static void
display_cbo(void *this, pfmlib_event_desc_t *e, void *val)
{
	const intel_x86_entry_t *pe = this_pe(this);
	pfm_snbep_unc_reg_t *reg = val;
	pfm_snbep_unc_reg_t f;

	__pfm_vbprintf("[UNC_CBO=0x%"PRIx64" event=0x%x umask=0x%x en=%d "
		       "inv=%d edge=%d thres=%d tid_en=%d] %s\n",
			reg->val,
			reg->cbo.unc_event,
			reg->cbo.unc_umask,
			reg->cbo.unc_en,
			reg->cbo.unc_inv,
			reg->cbo.unc_edge,
			reg->cbo.unc_thres,
			reg->cbo.unc_tid,
			pe[e->event].name);

	if (e->count == 1)
		return;

	f.val = e->codes[1];
	__pfm_vbprintf("[UNC_CBOX_FILTER0=0x%"PRIx64" tid=%d core=0x%x"
		       " state=0x%x]\n",
			f.val,
			f.ivbep_cbo_filt0.tid,
			f.ivbep_cbo_filt0.cid,
			f.ivbep_cbo_filt0.state);

	if (e->count == 2)
		return;

	f.val = e->codes[2];
	__pfm_vbprintf("[UNC_CBOX_FILTER1=0x%"PRIx64" nid=%d opc=0x%x"
		       " nc=0x%x isoc=0x%x]\n",
			f.val,
			f.ivbep_cbo_filt1.nid,
			f.ivbep_cbo_filt1.opc,
			f.ivbep_cbo_filt1.nc,
			f.ivbep_cbo_filt1.isoc);
}

#define DEFINE_C_BOX(n) \
pfmlib_pmu_t intel_bdx_unc_cb##n##_support = {\
	.desc = "Intel BroadwellX C-Box "#n" uncore",\
	.name = "bdx_unc_cbo"#n,\
	.perf_name = "uncore_cbox_"#n,\
	.pmu = PFM_PMU_INTEL_BDX_UNC_CB##n,\
	.pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_c_pe),\
	.type = PFM_PMU_TYPE_UNCORE,\
	.num_cntrs = 4,\
	.num_fixed_cntrs = 0,\
	.max_encoding = 2,\
	.pe = intel_bdx_unc_c_pe,\
	.atdesc = snbep_unc_mods,\
	.flags = PFMLIB_PMU_FL_RAW_UMASK|INTEL_PMU_FL_UNC_CBO,\
	.pmu_detect = pfm_intel_bdx_unc_detect,\
	.get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\
	PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\
	PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \
	.get_event_first = pfm_intel_x86_get_event_first,\
	.get_event_next = pfm_intel_x86_get_event_next,\
	.event_is_valid = pfm_intel_x86_event_is_valid,\
	.validate_table = pfm_intel_x86_validate_table,\
.get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ .display_reg = display_cbo,\ } DEFINE_C_BOX(0); DEFINE_C_BOX(1); DEFINE_C_BOX(2); DEFINE_C_BOX(3); DEFINE_C_BOX(4); DEFINE_C_BOX(5); DEFINE_C_BOX(6); DEFINE_C_BOX(7); DEFINE_C_BOX(8); DEFINE_C_BOX(9); DEFINE_C_BOX(10); DEFINE_C_BOX(11); DEFINE_C_BOX(12); DEFINE_C_BOX(13); DEFINE_C_BOX(14); DEFINE_C_BOX(15); DEFINE_C_BOX(16); DEFINE_C_BOX(17); DEFINE_C_BOX(18); DEFINE_C_BOX(19); DEFINE_C_BOX(20); DEFINE_C_BOX(21); DEFINE_C_BOX(22); DEFINE_C_BOX(23); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_bdx_unc_ha.c000066400000000000000000000066601502707512200235240ustar00rootroot00000000000000/* * pfmlib_intel_bdx_unc_ha.c : Intel BroadwellX Home Agent (HA) uncore PMU * * Copyright (c) 2017 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */

#include <sys/types.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

/* private headers */
#include "pfmlib_priv.h"
#include "pfmlib_intel_x86_priv.h"
#include "pfmlib_intel_snbep_unc_priv.h"
#include "events/intel_bdx_unc_ha_events.h"

static void
display_ha(void *this, pfmlib_event_desc_t *e, void *val)
{
	const intel_x86_entry_t *pe = this_pe(this);
	pfm_snbep_unc_reg_t *reg = val;
	pfm_snbep_unc_reg_t f;

	__pfm_vbprintf("[UNC_HA=0x%"PRIx64" event=0x%x umask=0x%x en=%d "
		       "inv=%d edge=%d thres=%d] %s\n",
			reg->val,
			reg->com.unc_event,
			reg->com.unc_umask,
			reg->com.unc_en,
			reg->com.unc_inv,
			reg->com.unc_edge,
			reg->com.unc_thres,
			pe[e->event].name);

	if (e->count == 1)
		return;

	f.val = e->codes[1];
	__pfm_vbprintf("[UNC_HA_ADDR=0x%"PRIx64" lo_addr=0x%x hi_addr=0x%x]\n",
			f.val,
			f.ha_addr.lo_addr,
			f.ha_addr.hi_addr);

	f.val = e->codes[2];
	__pfm_vbprintf("[UNC_HA_OPC=0x%"PRIx64" opc=0x%x]\n",
			f.val,
			f.ha_opc.opc);
}

#define DEFINE_HA_BOX(n) \
pfmlib_pmu_t intel_bdx_unc_ha##n##_support = {\
	.desc = "Intel BroadwellX HA "#n" uncore",\
	.name = "bdx_unc_ha"#n,\
	.perf_name = "uncore_ha_"#n,\
	.pmu = PFM_PMU_INTEL_BDX_UNC_HA##n,\
	.pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_h_pe),\
	.type = PFM_PMU_TYPE_UNCORE,\
	.num_cntrs = 4,\
	.num_fixed_cntrs = 0,\
	.max_encoding = 3, /* address matchers */\
	.pe = intel_bdx_unc_h_pe,\
	.atdesc = snbep_unc_mods,\
	.flags = PFMLIB_PMU_FL_RAW_UMASK,\
	.pmu_detect = pfm_intel_bdx_unc_detect,\
	.get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\
	PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\
	PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \
	.get_event_first = pfm_intel_x86_get_event_first,\
	.get_event_next = pfm_intel_x86_get_event_next,\
	.event_is_valid = pfm_intel_x86_event_is_valid,\
.validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .display_reg = display_ha,\ } DEFINE_HA_BOX(0); DEFINE_HA_BOX(1); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_bdx_unc_imc.c000066400000000000000000000055001502707512200236740ustar00rootroot00000000000000/* * pfmlib_intel_bdx_unc_imc.c : Intel BroadwellX Integrated Memory Controller (IMC) uncore PMU * * Copyright (c) 2017 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_bdx_unc_imc_events.h" #define DEFINE_IMC_BOX(n) \ pfmlib_pmu_t intel_bdx_unc_imc##n##_support = { \ .desc = "Intel BroadwellX IMC"#n" uncore", \ .name = "bdx_unc_imc"#n, \ .perf_name = "uncore_imc_"#n, \ .pmu = PFM_PMU_INTEL_BDX_UNC_IMC##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_m_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .num_cntrs = 4, \ .num_fixed_cntrs = 1, \ .max_encoding = 1, \ .pe = intel_bdx_unc_m_pe, \ .atdesc = snbep_unc_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ .pmu_detect = pfm_intel_bdx_unc_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first, \ .get_event_next = pfm_intel_x86_get_event_next, \ .event_is_valid = pfm_intel_x86_event_is_valid, \ .validate_table = pfm_intel_x86_validate_table, \ .get_event_info = pfm_intel_x86_get_event_info, \ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ }; DEFINE_IMC_BOX(0); DEFINE_IMC_BOX(1); DEFINE_IMC_BOX(2); DEFINE_IMC_BOX(3); DEFINE_IMC_BOX(4); DEFINE_IMC_BOX(5); DEFINE_IMC_BOX(6); DEFINE_IMC_BOX(7); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_bdx_unc_irp.c000066400000000000000000000057311502707512200237240ustar00rootroot00000000000000/* * pfmlib_intel_bdx_irp.c : Intel BroadwellX IRP uncore PMU * * Copyright (c) 2017 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_bdx_unc_irp_events.h" static void display_irp(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_IRP=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "edge=%d thres=%d] %s\n", reg->val, reg->irp.unc_event, reg->irp.unc_umask, reg->irp.unc_en, reg->irp.unc_edge, reg->irp.unc_thres, pe[e->event].name); } pfmlib_pmu_t intel_bdx_unc_irp_support = { .desc = "Intel BroadwellX IRP uncore", .name = "bdx_unc_irp", .perf_name = "uncore_irp", .pmu = PFM_PMU_INTEL_BDX_UNC_IRP, .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_i_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, .max_encoding = 3, .pe = intel_bdx_unc_i_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .pmu_detect = pfm_intel_bdx_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .display_reg = display_irp, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_bdx_unc_pcu.c000066400000000000000000000067731502707512200237300ustar00rootroot00000000000000/* * pfmlib_intel_bdx_unc_pcu.c : Intel BroadwellX Power Control Unit (PCU) uncore PMU * * Copyright (c) 2017 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
 */

#include <sys/types.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

/* private headers */
#include "pfmlib_priv.h"
#include "pfmlib_intel_x86_priv.h"
#include "pfmlib_intel_snbep_unc_priv.h"
#include "events/intel_bdx_unc_pcu_events.h"

static void
display_pcu(void *this, pfmlib_event_desc_t *e, void *val)
{
	const intel_x86_entry_t *pe = this_pe(this);
	pfm_snbep_unc_reg_t *reg = val;
	pfm_snbep_unc_reg_t f;

	__pfm_vbprintf("[UNC_PCU=0x%"PRIx64" event=0x%x sel_ext=%d occ_sel=0x%x en=%d "
		       "edge=%d thres=%d occ_inv=%d occ_edge=%d] %s\n",
		       reg->val,
		       reg->ivbep_pcu.unc_event,
		       reg->ivbep_pcu.unc_sel_ext,
		       reg->ivbep_pcu.unc_occ,
		       reg->ivbep_pcu.unc_en,
		       reg->ivbep_pcu.unc_edge,
		       reg->ivbep_pcu.unc_thres,
		       reg->ivbep_pcu.unc_occ_inv,
		       reg->ivbep_pcu.unc_occ_edge,
		       pe[e->event].name);

	if (e->count == 1)
		return;

	f.val = e->codes[1];
	__pfm_vbprintf("[UNC_PCU_FILTER=0x%"PRIx64" band0=%u band1=%u band2=%u band3=%u]\n",
		       f.val,
		       f.pcu_filt.filt0,
		       f.pcu_filt.filt1,
		       f.pcu_filt.filt2,
		       f.pcu_filt.filt3);
}

pfmlib_pmu_t intel_bdx_unc_pcu_support = {
	.desc = "Intel BroadwellX PCU uncore",
	.name = "bdx_unc_pcu",
	.perf_name = "uncore_pcu",
	.pmu = PFM_PMU_INTEL_BDX_UNC_PCU,
	.pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_p_pe),
	.type = PFM_PMU_TYPE_UNCORE,
	.num_cntrs = 4,
	.num_fixed_cntrs = 0,
	.max_encoding = 2,
	.pe = intel_bdx_unc_p_pe,
	.atdesc = snbep_unc_mods,
	.flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_PMU_FL_UNC_OCC | PFMLIB_PMU_FL_NO_SMPL,
	.pmu_detect = pfm_intel_bdx_unc_detect,
	.get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,
	PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),
	PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect),
	.get_event_first = pfm_intel_x86_get_event_first,
	.get_event_next = pfm_intel_x86_get_event_next,
	.event_is_valid = pfm_intel_x86_event_is_valid,
	.validate_table = pfm_intel_x86_validate_table,
	.get_event_info = pfm_intel_x86_get_event_info,
	.get_event_attr_info = pfm_intel_x86_get_event_attr_info,
	PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),
	.get_event_nattrs = pfm_intel_x86_get_event_nattrs,
	.can_auto_encode = pfm_intel_snbep_unc_can_auto_encode,
	.display_reg = display_pcu,
};

papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_bdx_unc_qpi.c

/*
 * pfmlib_intel_bdx_qpi.c : Intel BroadwellX QPI uncore PMU
 *
 * Copyright (c) 2017 Google Inc. All rights reserved
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_bdx_unc_qpi_events.h" static void display_qpi(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_QPI=0x%"PRIx64" event=0x%x sel_ext=%d umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->qpi.unc_event, reg->qpi.unc_event_ext, reg->qpi.unc_umask, reg->qpi.unc_en, reg->qpi.unc_inv, reg->qpi.unc_edge, reg->qpi.unc_thres, pe[e->event].name); } #define DEFINE_QPI_BOX(n) \ pfmlib_pmu_t intel_bdx_unc_qpi##n##_support = {\ .desc = "Intel BroadwellX QPI"#n" uncore",\ .name = "bdx_unc_qpi"#n,\ .perf_name = "uncore_qpi_"#n,\ .pmu = PFM_PMU_INTEL_BDX_UNC_QPI##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_q_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 3,\ .pe = intel_bdx_unc_q_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_bdx_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .display_reg = display_qpi,\ } DEFINE_QPI_BOX(0); DEFINE_QPI_BOX(1); DEFINE_QPI_BOX(2); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_bdx_unc_r2pcie.c000066400000000000000000000060201502707512200243060ustar00rootroot00000000000000/* * 
pfmlib_intel_bdx_r2pcie.c : Intel BroadwellX R2PCIe uncore PMU * * Copyright (c) 2017 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_bdx_unc_r2pcie_events.h" static void display_r2(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_R2PCIE=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } pfmlib_pmu_t intel_bdx_unc_r2pcie_support = { .desc = "Intel BroadwellX R2PCIe uncore", .name = "bdx_unc_r2pcie", .perf_name = "uncore_r2pcie", .pmu = PFM_PMU_INTEL_BDX_UNC_R2PCIE, .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_r2_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, .max_encoding = 1, .pe = intel_bdx_unc_r2_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .pmu_detect = pfm_intel_bdx_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .display_reg = display_r2, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_bdx_unc_r3qpi.c000066400000000000000000000062221502707512200241640ustar00rootroot00000000000000/* * pfmlib_intel_bdx_r3qpi.c : Intel BroadwellX R3QPI uncore PMU * * Copyright (c) 2017 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_bdx_unc_r3qpi_events.h" static void display_r3(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_R3QPI=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } #define DEFINE_R3QPI_BOX(n) \ pfmlib_pmu_t intel_bdx_unc_r3qpi##n##_support = {\ .desc = "Intel BroadwellX R3QPI"#n" uncore", \ .name = "bdx_unc_r3qpi"#n,\ .perf_name = "uncore_r3qpi_"#n, \ .pmu = PFM_PMU_INTEL_BDX_UNC_R3QPI##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_r3_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 3,\ .num_fixed_cntrs = 0,\ .max_encoding = 1,\ .pe = intel_bdx_unc_r3_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_bdx_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .display_reg = display_r3,\ } DEFINE_R3QPI_BOX(0); DEFINE_R3QPI_BOX(1); DEFINE_R3QPI_BOX(2); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_bdx_unc_sbo.c000066400000000000000000000062071502707512200237140ustar00rootroot00000000000000/* * pfmlib_intel_bdx_unc_sbo.c : Intel 
BroadwellX S-Box uncore PMU * * Copyright (c) 2017 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_bdx_unc_sbo_events.h" static void display_sbo(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_SBO=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } #define DEFINE_S_BOX(n) \ pfmlib_pmu_t intel_bdx_unc_sbo##n##_support = {\ .desc = "Intel BroadwellX S-BOX"#n" uncore",\ .name = "bdx_unc_sbo"#n,\ .perf_name = "uncore_sbox_"#n,\ .pmu = PFM_PMU_INTEL_BDX_UNC_SB##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_s_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 3,\ .pe = intel_bdx_unc_s_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_bdx_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .display_reg = display_sbo,\ } DEFINE_S_BOX(0); DEFINE_S_BOX(1); DEFINE_S_BOX(2); DEFINE_S_BOX(3); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_bdx_unc_ubo.c000066400000000000000000000060001502707512200237050ustar00rootroot00000000000000/* * pfmlib_intel_bdx_unc_ubo.c : Intel BroadwellX 
U-Box uncore PMU * * Copyright (c) 2017 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_bdx_unc_ubo_events.h" static void display_ubo(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_UBO=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } pfmlib_pmu_t intel_bdx_unc_ubo_support = { .desc = "Intel BroadwellX U-Box uncore", .name = "bdx_unc_ubo", .perf_name = "uncore_ubox", .pmu = PFM_PMU_INTEL_BDX_UNC_UBOX, .pme_count = LIBPFM_ARRAY_SIZE(intel_bdx_unc_u_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 2, .num_fixed_cntrs = 1, .max_encoding = 1, .pe = intel_bdx_unc_u_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .pmu_detect = pfm_intel_bdx_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .display_reg = display_ubo, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_core.c000066400000000000000000000053511502707512200223560ustar00rootroot00000000000000/* * pfmlib_intel_core.c : Intel Core PMU * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Based on: * Copyright (c) 2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* * Core PMU = architectural perfmon v2 + PEBS */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "events/intel_core_events.h" static const int core_models[] = { 15, /* Merom */ 23, /* Penryn */ 29, /* Dunnington */ 0 }; static int pfm_core_init(void *this) { pfm_intel_x86_cfg.arch_version = 2; return PFM_SUCCESS; } pfmlib_pmu_t intel_core_support={ .desc = "Intel Core", .name = "core", .pmu = PFM_PMU_INTEL_CORE, .pme_count = LIBPFM_ARRAY_SIZE(intel_core_pe), .type = PFM_PMU_TYPE_CORE, .num_cntrs = 2, .num_fixed_cntrs = 3, .max_encoding = 1, .supported_plm = INTEL_X86_PLM, .pe = intel_core_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = core_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_core_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_coreduo.c000066400000000000000000000054001502707512200230610ustar00rootroot00000000000000/* * pfmlib_intel_coreduo.c : Intel Core Duo/Solo (Yonah) * * Copyright (c) 2009, Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit 
persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_intel_x86_priv.h" /* architecture private */ #include "events/intel_coreduo_events.h" static int pfm_coreduo_detect(void *this) { int ret; ret = pfm_intel_x86_detect(); if (ret != PFM_SUCCESS) return ret; /* * check for core solo/core duo */ if (pfm_intel_x86_cfg.family != 6) return PFM_ERR_NOTSUPP; if (pfm_intel_x86_cfg.model != 14) return PFM_ERR_NOTSUPP; return PFM_SUCCESS; } static int pfm_coreduo_init(void *this) { pfm_intel_x86_cfg.arch_version = 1; return PFM_SUCCESS; } pfmlib_pmu_t intel_coreduo_support={ .desc = "Intel Core Duo/Core Solo", .name = "coreduo", .pmu = PFM_PMU_COREDUO, .pme_count = LIBPFM_ARRAY_SIZE(intel_coreduo_pe), .type = PFM_PMU_TYPE_CORE, .num_cntrs = 2, .max_encoding = 1, .pe = intel_coreduo_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .supported_plm = INTEL_X86_PLM, .pmu_detect = pfm_coreduo_detect, .pmu_init = pfm_coreduo_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, 
	.get_event_info = pfm_intel_x86_get_event_info,
	.get_event_attr_info = pfm_intel_x86_get_event_attr_info,
	PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs),
	.get_event_nattrs = pfm_intel_x86_get_event_nattrs,
};

papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_glm.c

/*
 * pfmlib_intel_glm.c : Intel Goldmont core PMU
 *
 * Copyright (c) 2016 Google
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */

/* private headers */
#include "pfmlib_priv.h"
#include "pfmlib_intel_x86_priv.h"
#include "events/intel_glm_events.h"

static const int glm_models[] = {
	92, /* Goldmont */
	95, /* Goldmont Denverton */
	0
};

static int
pfm_intel_glm_init(void *this)
{
	pfm_intel_x86_cfg.arch_version = 3;
	return PFM_SUCCESS;
}

pfmlib_pmu_t intel_glm_support={
	.desc = "Intel Goldmont",
	.name = "glm",
	.pmu = PFM_PMU_INTEL_GLM,
	.pme_count = LIBPFM_ARRAY_SIZE(intel_glm_pe),
	.type = PFM_PMU_TYPE_CORE,
	.num_cntrs = 4,
	.num_fixed_cntrs = 3,
	.max_encoding = 2,
	.pe = intel_glm_pe,
	.atdesc = intel_x86_mods,
	.flags = PFMLIB_PMU_FL_RAW_UMASK,
	.supported_plm = INTEL_X86_PLM,
	.cpu_family = 6,
	.cpu_models = glm_models,
	.pmu_detect = pfm_intel_x86_model_detect,
	.pmu_init = pfm_intel_glm_init,
	.get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding,
	PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding),
	.get_event_first = pfm_intel_x86_get_event_first,
	.get_event_next = pfm_intel_x86_get_event_next,
	.event_is_valid = pfm_intel_x86_event_is_valid,
	.validate_table = pfm_intel_x86_validate_table,
	.get_event_info = pfm_intel_x86_get_event_info,
	.get_event_attr_info = pfm_intel_x86_get_event_attr_info,
	PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs),
	.get_event_nattrs = pfm_intel_x86_get_event_nattrs,
};

papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_gnr.c

/*
 * pfmlib_intel_gnr.c : Intel GraniteRapids core PMU
 *
 * Copyright (c) 2024 Google LLC
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following
conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "events/intel_gnr_events.h" static const int gnr_models[] = { 173, /* GraniteRapids X */ 174, /* GraniteRapids D */ 0 }; static int pfm_gnr_init(void *this) { pfm_intel_x86_cfg.arch_version = 5; return PFM_SUCCESS; } pfmlib_pmu_t intel_gnr_support={ .desc = "Intel GraniteRapids", .name = "gnr", .pmu = PFM_PMU_INTEL_GNR, .pme_count = LIBPFM_ARRAY_SIZE(intel_gnr_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 16, /* consider with HT off by default */ .num_fixed_cntrs = 4, .max_encoding = 2, /* offcore_response */ .pe = intel_gnr_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | PFMLIB_PMU_FL_SPEC | INTEL_X86_PMU_FL_ECMASK | INTEL_X86_PMU_FL_EXTPEBS, .cpu_family = 6, .cpu_models = gnr_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_gnr_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, 
PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, .get_num_events = pfm_intel_x86_get_num_events, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_gnr_unc_imc.c000066400000000000000000000064231502707512200237120ustar00rootroot00000000000000/* * pfmlib_intel_gnr_unc_imc.c : Intel GNR IMC uncore PMU * * Copyright (c) 2024 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_gnr_unc_imc_events.h" static void display_imc(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_IMC=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } #define DEFINE_IMC(n) \ pfmlib_pmu_t intel_gnr_unc_imc##n##_support = {\ .desc = "Intel GraniteRapids IMC"#n" uncore",\ .name = "gnr_unc_imc"#n,\ .perf_name = "uncore_imc_"#n,\ .pmu = PFM_PMU_INTEL_GNR_UNC_IMC##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_gnr_unc_imc_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 2,\ .pe = intel_gnr_unc_imc_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_gnr_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ .display_reg = display_imc,\ } DEFINE_IMC(0); DEFINE_IMC(1); DEFINE_IMC(2); DEFINE_IMC(3); DEFINE_IMC(4); DEFINE_IMC(5); DEFINE_IMC(6); DEFINE_IMC(7); DEFINE_IMC(8); DEFINE_IMC(9); DEFINE_IMC(10); DEFINE_IMC(11); 
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_hsw.c000066400000000000000000000076101502707512200222270ustar00rootroot00000000000000/* * pfmlib_intel_hsw.c : Intel Haswell core PMU * * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "events/intel_hsw_events.h" static const int hsw_models[] = { 60, /* Haswell */ 69, /* Haswell */ 70, /* Haswell */ 0 }; static const int hsw_ep_models[] = { 63, /* Haswell */ 0 }; static int pfm_hsw_init(void *this) { pfm_intel_x86_cfg.arch_version = 4; return PFM_SUCCESS; } pfmlib_pmu_t intel_hsw_support={ .desc = "Intel Haswell", .name = "hsw", .pmu = PFM_PMU_INTEL_HSW, .pme_count = LIBPFM_ARRAY_SIZE(intel_hsw_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 8, /* consider with HT off by default */ .num_fixed_cntrs = 3, .max_encoding = 2, /* offcore_response */ .pe = intel_hsw_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = hsw_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_hsw_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, }; pfmlib_pmu_t intel_hsw_ep_support={ .desc = "Intel Haswell EP", .name = "hsw_ep", .pmu = PFM_PMU_INTEL_HSW_EP, .pme_count = LIBPFM_ARRAY_SIZE(intel_hsw_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 8, /* consider with HT off by default */ .num_fixed_cntrs = 3, .max_encoding = 2, /* offcore_response */ .pe = intel_hsw_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = hsw_ep_models, .pmu_detect = 
pfm_intel_x86_model_detect, .pmu_init = pfm_hsw_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_hswep_unc_cbo.c000066400000000000000000000100101502707512200242300ustar00rootroot00000000000000/* * pfmlib_intel_hswep_unc_cbo.c : Intel Haswell-EP C-Box uncore PMU * * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_hswep_unc_cbo_events.h" static void display_cbo(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; pfm_snbep_unc_reg_t f; __pfm_vbprintf("[UNC_CBO=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d tid_en=%d] %s\n", reg->val, reg->cbo.unc_event, reg->cbo.unc_umask, reg->cbo.unc_en, reg->cbo.unc_inv, reg->cbo.unc_edge, reg->cbo.unc_thres, reg->cbo.unc_tid, pe[e->event].name); if (e->count == 1) return; f.val = e->codes[1]; __pfm_vbprintf("[UNC_CBOX_FILTER0=0x%"PRIx64" tid=%d core=0x%x" " state=0x%x]\n", f.val, f.ivbep_cbo_filt0.tid, f.ivbep_cbo_filt0.cid, f.ivbep_cbo_filt0.state); if (e->count == 2) return; f.val = e->codes[2]; __pfm_vbprintf("[UNC_CBOX_FILTER1=0x%"PRIx64" nid=%d opc=0x%x" " nc=0x%x isoc=0x%x]\n", f.val, f.ivbep_cbo_filt1.nid, f.ivbep_cbo_filt1.opc, f.ivbep_cbo_filt1.nc, f.ivbep_cbo_filt1.isoc); } #define DEFINE_C_BOX(n) \ pfmlib_pmu_t intel_hswep_unc_cb##n##_support = {\ .desc = "Intel Haswell-EP C-Box "#n" uncore",\ .name = "hswep_unc_cbo"#n,\ .perf_name = "uncore_cbox_"#n,\ .pmu = PFM_PMU_INTEL_HSWEP_UNC_CB##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_hswep_unc_c_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 2,\ .pe = intel_hswep_unc_c_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK|INTEL_PMU_FL_UNC_CBO,\ .pmu_detect = pfm_intel_hswep_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = 
pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ .display_reg = display_cbo,\ } DEFINE_C_BOX(0); DEFINE_C_BOX(1); DEFINE_C_BOX(2); DEFINE_C_BOX(3); DEFINE_C_BOX(4); DEFINE_C_BOX(5); DEFINE_C_BOX(6); DEFINE_C_BOX(7); DEFINE_C_BOX(8); DEFINE_C_BOX(9); DEFINE_C_BOX(10); DEFINE_C_BOX(11); DEFINE_C_BOX(12); DEFINE_C_BOX(13); DEFINE_C_BOX(14); DEFINE_C_BOX(15); DEFINE_C_BOX(16); DEFINE_C_BOX(17); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_hswep_unc_ha.c000066400000000000000000000067001502707512200240700ustar00rootroot00000000000000/* * pfmlib_intel_hswep_unc_ha.c : Intel Haswell-EP Home Agent (HA) uncore PMU * * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_hswep_unc_ha_events.h" static void display_ha(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; pfm_snbep_unc_reg_t f; __pfm_vbprintf("[UNC_HA=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); if (e->count == 1) return; f.val = e->codes[1]; __pfm_vbprintf("[UNC_HA_ADDR=0x%"PRIx64" lo_addr=0x%x hi_addr=0x%x]\n", f.val, f.ha_addr.lo_addr, f.ha_addr.hi_addr); f.val = e->codes[2]; __pfm_vbprintf("[UNC_HA_OPC=0x%"PRIx64" opc=0x%x]\n", f.val, f.ha_opc.opc); } #define DEFINE_HA_BOX(n) \ pfmlib_pmu_t intel_hswep_unc_ha##n##_support = {\ .desc = "Intel Haswell-EP HA "#n" uncore",\ .name = "hswep_unc_ha"#n,\ .perf_name = "uncore_ha_"#n,\ .pmu = PFM_PMU_INTEL_HSWEP_UNC_HA##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_hswep_unc_h_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 3, /* address matchers */\ .pe = intel_hswep_unc_h_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_hswep_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = 
pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .display_reg = display_ha,\ } DEFINE_HA_BOX(0); DEFINE_HA_BOX(1); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_hswep_unc_imc.c000066400000000000000000000055201502707512200242470ustar00rootroot00000000000000/* * pfmlib_intel_hswep_unc_imc.c : Intel Haswell-EP Integrated Memory Controller (IMC) uncore PMU * * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_hswep_unc_imc_events.h" #define DEFINE_IMC_BOX(n) \ pfmlib_pmu_t intel_hswep_unc_imc##n##_support = { \ .desc = "Intel Haswell-EP IMC"#n" uncore", \ .name = "hswep_unc_imc"#n, \ .perf_name = "uncore_imc_"#n, \ .pmu = PFM_PMU_INTEL_HSWEP_UNC_IMC##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_hswep_unc_m_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .num_cntrs = 4, \ .num_fixed_cntrs = 1, \ .max_encoding = 1, \ .pe = intel_hswep_unc_m_pe, \ .atdesc = snbep_unc_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ .pmu_detect = pfm_intel_hswep_unc_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first, \ .get_event_next = pfm_intel_x86_get_event_next, \ .event_is_valid = pfm_intel_x86_event_is_valid, \ .validate_table = pfm_intel_x86_validate_table, \ .get_event_info = pfm_intel_x86_get_event_info, \ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ }; DEFINE_IMC_BOX(0); DEFINE_IMC_BOX(1); DEFINE_IMC_BOX(2); DEFINE_IMC_BOX(3); DEFINE_IMC_BOX(4); DEFINE_IMC_BOX(5); DEFINE_IMC_BOX(6); DEFINE_IMC_BOX(7); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_hswep_unc_irp.c000066400000000000000000000057521502707512200243000ustar00rootroot00000000000000 /* * pfmlib_intel_hswep_unc_irp.c : Intel Haswell-EP IRP uncore PMU * * Copyright (c) 2014 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_hswep_unc_irp_events.h" static void display_irp(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_IRP=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "edge=%d thres=%d] %s\n", reg->val, reg->irp.unc_event, reg->irp.unc_umask, reg->irp.unc_en, reg->irp.unc_edge, reg->irp.unc_thres, pe[e->event].name); } pfmlib_pmu_t intel_hswep_unc_irp_support = { .desc = "Intel Haswell-EP IRP uncore", .name = "hswep_unc_irp", .perf_name = "uncore_irp", .pmu = PFM_PMU_INTEL_HSWEP_UNC_IRP, .pme_count = LIBPFM_ARRAY_SIZE(intel_hswep_unc_i_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, .max_encoding = 3, .pe = intel_hswep_unc_i_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .pmu_detect = pfm_intel_hswep_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .display_reg = display_irp, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_hswep_unc_pcu.c000066400000000000000000000070131502707512200242650ustar00rootroot00000000000000/* * pfmlib_intel_hswep_unc_pcu.c : Intel Haswell-EP Power Control Unit (PCU) uncore PMU * * Copyright (c) 2014 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_hswep_unc_pcu_events.h" static void display_pcu(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; pfm_snbep_unc_reg_t f; __pfm_vbprintf("[UNC_PCU=0x%"PRIx64" event=0x%x sel_ext=%d occ_sel=0x%x en=%d " "edge=%d thres=%d occ_inv=%d occ_edge=%d] %s\n", reg->val, reg->ivbep_pcu.unc_event, reg->ivbep_pcu.unc_sel_ext, reg->ivbep_pcu.unc_occ, reg->ivbep_pcu.unc_en, reg->ivbep_pcu.unc_edge, reg->ivbep_pcu.unc_thres, reg->ivbep_pcu.unc_occ_inv, reg->ivbep_pcu.unc_occ_edge, pe[e->event].name); if (e->count == 1) return; f.val = e->codes[1]; __pfm_vbprintf("[UNC_PCU_FILTER=0x%"PRIx64" band0=%u band1=%u band2=%u band3=%u]\n", f.val, f.pcu_filt.filt0, f.pcu_filt.filt1, f.pcu_filt.filt2, f.pcu_filt.filt3); } pfmlib_pmu_t intel_hswep_unc_pcu_support = { .desc = "Intel Haswell-EP PCU uncore", .name = "hswep_unc_pcu", .perf_name = "uncore_pcu", .pmu = PFM_PMU_INTEL_HSWEP_UNC_PCU, .pme_count = LIBPFM_ARRAY_SIZE(intel_hswep_unc_p_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, .max_encoding = 2, .pe = intel_hswep_unc_p_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_PMU_FL_UNC_OCC | PFMLIB_PMU_FL_NO_SMPL, .pmu_detect = pfm_intel_hswep_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, 
PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_snbep_unc_can_auto_encode, .display_reg = display_pcu, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_hswep_unc_qpi.c000066400000000000000000000062301502707512200242700ustar00rootroot00000000000000/* * pfmlib_intel_hswep_unc_qpi.c : Intel Haswell-EP QPI uncore PMU * * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_hswep_unc_qpi_events.h" static void display_qpi(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_QPI=0x%"PRIx64" event=0x%x sel_ext=%d umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->qpi.unc_event, reg->qpi.unc_event_ext, reg->qpi.unc_umask, reg->qpi.unc_en, reg->qpi.unc_inv, reg->qpi.unc_edge, reg->qpi.unc_thres, pe[e->event].name); } #define DEFINE_QPI_BOX(n) \ pfmlib_pmu_t intel_hswep_unc_qpi##n##_support = {\ .desc = "Intel Haswell-EP QPI"#n" uncore",\ .name = "hswep_unc_qpi"#n,\ .perf_name = "uncore_qpi_"#n,\ .pmu = PFM_PMU_INTEL_HSWEP_UNC_QPI##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_hswep_unc_q_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 3,\ .pe = intel_hswep_unc_q_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_hswep_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .display_reg = display_qpi,\ } DEFINE_QPI_BOX(0); DEFINE_QPI_BOX(1); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_hswep_unc_r2pcie.c000066400000000000000000000060401502707512200246610ustar00rootroot00000000000000/* * 
pfmlib_intel_hswep_unc_r2pcie.c : Intel Haswell-EP R2PCIe uncore PMU * * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_hswep_unc_r2pcie_events.h" static void display_r2(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_R2PCIE=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } pfmlib_pmu_t intel_hswep_unc_r2pcie_support = { .desc = "Intel Haswell-EP R2PCIe uncore", .name = "hswep_unc_r2pcie", .perf_name = "uncore_r2pcie", .pmu = PFM_PMU_INTEL_HSWEP_UNC_R2PCIE, .pme_count = LIBPFM_ARRAY_SIZE(intel_hswep_unc_r2_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, .max_encoding = 1, .pe = intel_hswep_unc_r2_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .pmu_detect = pfm_intel_hswep_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .display_reg = display_r2, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_hswep_unc_r3qpi.c000066400000000000000000000062421502707512200245370ustar00rootroot00000000000000/* * pfmlib_intel_hswep_r3qpi.c : Intel Haswell-EP R3QPI uncore PMU * * Copyright (c) 2014 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_hswep_unc_r3qpi_events.h" static void display_r3(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_R3QPI=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } #define DEFINE_R3QPI_BOX(n) \ pfmlib_pmu_t intel_hswep_unc_r3qpi##n##_support = {\ .desc = "Intel Haswell-EP R3QPI"#n" uncore", \ .name = "hswep_unc_r3qpi"#n,\ .perf_name = "uncore_r3qpi_"#n, \ .pmu = PFM_PMU_INTEL_HSWEP_UNC_R3QPI##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_hswep_unc_r3_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 3,\ .num_fixed_cntrs = 0,\ .max_encoding = 1,\ .pe = intel_hswep_unc_r3_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_hswep_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .display_reg = display_r3,\ } DEFINE_R3QPI_BOX(0); DEFINE_R3QPI_BOX(1); DEFINE_R3QPI_BOX(2); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_hswep_unc_sbo.c000066400000000000000000000062261502707512200242660ustar00rootroot00000000000000/* * 
pfmlib_intel_hswep_unc_sbo.c : Intel Haswell-EP S-Box uncore PMU * * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_hswep_unc_sbo_events.h" static void display_sbo(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_SBO=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } #define DEFINE_S_BOX(n) \ pfmlib_pmu_t intel_hswep_unc_sb##n##_support = {\ .desc = "Intel Haswell-EP S-BOX"#n" uncore",\ .name = "hswep_unc_sbo"#n,\ .perf_name = "uncore_sbox_"#n,\ .pmu = PFM_PMU_INTEL_HSWEP_UNC_SB##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_hswep_unc_s_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 3,\ .pe = intel_hswep_unc_s_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_hswep_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .display_reg = display_sbo,\ } DEFINE_S_BOX(0); DEFINE_S_BOX(1); DEFINE_S_BOX(2); DEFINE_S_BOX(3); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_hswep_unc_ubo.c000066400000000000000000000060201502707512200242600ustar00rootroot00000000000000/* * pfmlib_intel_hswep_unc_ubo.c : 
Intel Haswell-EP U-Box uncore PMU * * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_hswep_unc_ubo_events.h" static void display_ubo(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_UBO=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } pfmlib_pmu_t intel_hswep_unc_ubo_support = { .desc = "Intel Haswell-EP U-Box uncore", .name = "hswep_unc_ubo", .perf_name = "uncore_ubox", .pmu = PFM_PMU_INTEL_HSWEP_UNC_UBOX, .pme_count = LIBPFM_ARRAY_SIZE(intel_hswep_unc_u_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 2, .num_fixed_cntrs = 1, .max_encoding = 1, .pe = intel_hswep_unc_u_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .pmu_detect = pfm_intel_hswep_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .display_reg = display_ubo, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_icl.c000066400000000000000000000102641502707512200221740ustar00rootroot00000000000000/* * pfmlib_intel_icl.c : Intel Icelake core PMU * * Copyright (c) 2019 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person 
obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "events/intel_icl_events.h" static const int icl_models[] = { 108, /* Icelake D */ 125, /* Icelake */ 126, /* Icelake L */ 157, /* Icelake NNPI */ 140, /* Tigerlake L */ 141, /* Tigerlake */ 167, /* Rocketlake */ 0 }; static const int icx_models[] = { 106, /* IcelakeX */ 0 }; static int pfm_icl_init(void *this) { pfm_intel_x86_cfg.arch_version = 4; return PFM_SUCCESS; } pfmlib_pmu_t intel_icl_support={ .desc = "Intel Icelake", .name = "icl", .pmu = PFM_PMU_INTEL_ICL, .pme_count = LIBPFM_ARRAY_SIZE(intel_icl_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 16, /* consider with HT off by default */ .num_fixed_cntrs = 4, .max_encoding = 2, /* offcore_response */ .pe = intel_icl_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | PFMLIB_PMU_FL_SPEC | INTEL_X86_PMU_FL_ECMASK | INTEL_X86_PMU_FL_EXTPEBS, .cpu_family = 6, .cpu_models = icl_models, .pmu_detect = 
pfm_intel_x86_model_detect, .pmu_init = pfm_icl_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, .get_num_events = pfm_intel_x86_get_num_events, }; pfmlib_pmu_t intel_icx_support={ .desc = "Intel IcelakeX", .name = "icx", .pmu = PFM_PMU_INTEL_ICX, .pme_count = LIBPFM_ARRAY_SIZE(intel_icl_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 16, /* consider with HT off by default */ .num_fixed_cntrs = 4, .max_encoding = 2, /* offcore_response */ .pe = intel_icl_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | PFMLIB_PMU_FL_SPEC | INTEL_X86_PMU_FL_ECMASK | INTEL_X86_PMU_FL_EXTPEBS, .cpu_family = 6, .cpu_models = icx_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_icl_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, .get_num_events = pfm_intel_x86_get_num_events, }; 
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_icx_unc_cha.c000066400000000000000000000107411502707512200236700ustar00rootroot00000000000000/* * pfmlib_intel_icx_unc_cha.c : Intel ICX CHA-Box uncore PMU * * Copyright (c) 2023 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_icx_unc_cha_events.h" static void display_cha(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; pfm_snbep_unc_reg_t f; __pfm_vbprintf("[UNC_CHA=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d tid_en=%d umask_ext=0x%x] %s\n", reg->val, reg->icx_cha.unc_event, reg->icx_cha.unc_umask, reg->icx_cha.unc_en, reg->icx_cha.unc_inv, reg->icx_cha.unc_edge, reg->icx_cha.unc_thres, reg->icx_cha.unc_tid, reg->icx_cha.unc_umask_ext, pe[e->event].name); if (e->count == 1) return; f.val = e->codes[1]; __pfm_vbprintf("[UNC_CHA_FILTER0=0x%"PRIx64" thread_id=%d source=0x%x state=0x%x]\n", f.val, f.skx_cha_filt0.tid, f.skx_cha_filt0.sid, f.skx_cha_filt0.state); if (e->count == 2) return; f.val = e->codes[2]; __pfm_vbprintf("[UNC_CHA_FILTER1=0x%"PRIx64" rem=%d loc=%d all_opc=%d nm=%d" " not_nm=%d opc0=0x%x opc1=0x%x nc=%d isoc=%d]\n", f.val, f.skx_cha_filt1.rem, f.skx_cha_filt1.loc, f.skx_cha_filt1.all_opc, f.skx_cha_filt1.nm, f.skx_cha_filt1.not_nm, f.skx_cha_filt1.opc0, f.skx_cha_filt1.opc1, f.skx_cha_filt1.nc, f.skx_cha_filt1.isoc); } #define DEFINE_CHA(n) \ pfmlib_pmu_t intel_icx_unc_cha##n##_support = {\ .desc = "Intel IcelakeX CHA"#n" uncore",\ .name = "icx_unc_cha"#n,\ .perf_name = "uncore_cha_"#n,\ .pmu = PFM_PMU_INTEL_ICX_UNC_CHA##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_icx_unc_cha_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 2,\ .pe = intel_icx_unc_cha_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_icx_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ 
.get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ .display_reg = display_cha,\ } DEFINE_CHA(0); DEFINE_CHA(1); DEFINE_CHA(2); DEFINE_CHA(3); DEFINE_CHA(4); DEFINE_CHA(5); DEFINE_CHA(6); DEFINE_CHA(7); DEFINE_CHA(8); DEFINE_CHA(9); DEFINE_CHA(10); DEFINE_CHA(11); DEFINE_CHA(12); DEFINE_CHA(13); DEFINE_CHA(14); DEFINE_CHA(15); DEFINE_CHA(16); DEFINE_CHA(17); DEFINE_CHA(18); DEFINE_CHA(19); DEFINE_CHA(20); DEFINE_CHA(21); DEFINE_CHA(22); DEFINE_CHA(23); DEFINE_CHA(24); DEFINE_CHA(25); DEFINE_CHA(26); DEFINE_CHA(27); DEFINE_CHA(28); DEFINE_CHA(29); DEFINE_CHA(30); DEFINE_CHA(31); DEFINE_CHA(32); DEFINE_CHA(33); DEFINE_CHA(34); DEFINE_CHA(35); DEFINE_CHA(36); DEFINE_CHA(37); DEFINE_CHA(38); DEFINE_CHA(39); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_icx_unc_iio.c000066400000000000000000000063721502707512200237200ustar00rootroot00000000000000/* * pfmlib_intel_icx_unc_iio.c : Intel ICX IIO-Box uncore PMU * * Copyright (c) 2023 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_icx_unc_iio_events.h" static void display_iio(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_IIO=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d fc_mask=%d ch_mask=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, reg->iio.unc_fcmsk, reg->iio.unc_chmsk, pe[e->event].name); } #define DEFINE_IIO(n) \ pfmlib_pmu_t intel_icx_unc_iio##n##_support = {\ .desc = "Intel IcelakeX IIO"#n" uncore",\ .name = "icx_unc_iio"#n,\ .perf_name = "uncore_iio_"#n,\ .pmu = PFM_PMU_INTEL_ICX_UNC_IIO##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_icx_unc_iio_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 2,\ .pe = intel_icx_unc_iio_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_icx_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = 
pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ .display_reg = display_iio,\ } DEFINE_IIO(0); DEFINE_IIO(1); DEFINE_IIO(2); DEFINE_IIO(3); DEFINE_IIO(4); DEFINE_IIO(5); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_icx_unc_imc.c000066400000000000000000000064221502707512200237060ustar00rootroot00000000000000/* * pfmlib_intel_icx_unc_imc.c : Intel ICX IMC-Box uncore PMU * * Copyright (c) 2023 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_icx_unc_imc_events.h" static void display_imc(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_IMC=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } #define DEFINE_IMC(n) \ pfmlib_pmu_t intel_icx_unc_imc##n##_support = {\ .desc = "Intel IcelakeX IMC"#n" uncore",\ .name = "icx_unc_imc"#n,\ .perf_name = "uncore_imc_"#n,\ .pmu = PFM_PMU_INTEL_ICX_UNC_IMC##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_icx_unc_imc_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 2,\ .pe = intel_icx_unc_imc_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_icx_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ .display_reg = display_imc,\ } DEFINE_IMC(0); DEFINE_IMC(1); DEFINE_IMC(2); DEFINE_IMC(3); DEFINE_IMC(4); DEFINE_IMC(5); DEFINE_IMC(6); DEFINE_IMC(7); DEFINE_IMC(8); DEFINE_IMC(9); DEFINE_IMC(10); DEFINE_IMC(11); 
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_icx_unc_irp.c000066400000000000000000000063101502707512200237240ustar00rootroot00000000000000/* * pfmlib_intel_icx_unc_irp.c : Intel ICX IRP uncore PMU * * Copyright (c) 2023 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_icx_unc_irp_events.h" static void display_irp(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_IRP=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } #define DEFINE_IRP(n) \ pfmlib_pmu_t intel_icx_unc_irp##n##_support = {\ .desc = "Intel IcelakeX IRP"#n" uncore",\ .name = "icx_unc_irp"#n,\ .perf_name = "uncore_irp_"#n,\ .pmu = PFM_PMU_INTEL_ICX_UNC_IRP##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_icx_unc_irp_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 2,\ .num_fixed_cntrs = 0,\ .max_encoding = 2,\ .pe = intel_icx_unc_irp_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_icx_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ .display_reg = display_irp,\ } DEFINE_IRP(0); DEFINE_IRP(1); DEFINE_IRP(2); DEFINE_IRP(3); DEFINE_IRP(4); DEFINE_IRP(5); 
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_icx_unc_m2m.c000066400000000000000000000062341502707512200236320ustar00rootroot00000000000000/* * pfmlib_intel_icx_unc_m2m.c : Intel ICX M2M uncore PMU * * Copyright (c) 2023 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_icx_unc_m2m_events.h" static void display_m2m(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_M2M=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->icx_m2m.unc_event, reg->icx_m2m.unc_umask, reg->icx_m2m.unc_en, reg->icx_m2m.unc_inv, reg->icx_m2m.unc_edge, reg->icx_m2m.unc_thres, pe[e->event].name); } #define DEFINE_M2M(n) \ pfmlib_pmu_t intel_icx_unc_m2m##n##_support = {\ .desc = "Intel IcelakeX M2M"#n" uncore",\ .name = "icx_unc_m2m"#n,\ .perf_name = "uncore_m2m_"#n,\ .pmu = PFM_PMU_INTEL_ICX_UNC_M2M##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_icx_unc_m2m_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 2,\ .pe = intel_icx_unc_m2m_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_icx_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ .display_reg = display_m2m,\ } DEFINE_M2M(0); DEFINE_M2M(1); 
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_icx_unc_m2pcie.c000066400000000000000000000062761502707512200243220ustar00rootroot00000000000000/* * pfmlib_intel_icx_unc_m2pcie.c : Intel ICX M2PCIE uncore PMU * * Copyright (c) 2023 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_icx_unc_m2pcie_events.h" static void display_m2pcie(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_M2PCIE=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } #define DEFINE_M2PCIE(n) \ pfmlib_pmu_t intel_icx_unc_m2pcie##n##_support = {\ .desc = "Intel IcelakeX M2PCIE"#n" uncore",\ .name = "icx_unc_m2pcie"#n,\ .perf_name = "uncore_m2pcie_"#n,\ .pmu = PFM_PMU_INTEL_ICX_UNC_M2PCIE##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_icx_unc_m2pcie_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 2,\ .pe = intel_icx_unc_m2pcie_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_icx_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ .display_reg = display_m2pcie,\ } DEFINE_M2PCIE(0); DEFINE_M2PCIE(1); DEFINE_M2PCIE(2); 
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_icx_unc_m3upi.c000066400000000000000000000062741502707512200241760ustar00rootroot00000000000000/* * pfmlib_intel_icx_unc_m3upi.c : Intel ICX M3UPI uncore PMU * * Copyright (c) 2023 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_icx_unc_m3upi_events.h" static void display_m3upi(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_M3UPI=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } #define DEFINE_M3UPI(n) \ pfmlib_pmu_t intel_icx_unc_m3upi##n##_support = {\ .desc = "Intel IcelakeX M3UPI"#n" uncore",\ .name = "icx_unc_m3upi"#n,\ .perf_name = "uncore_m3upi_"#n,\ .pmu = PFM_PMU_INTEL_ICX_UNC_M3UPI##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_icx_unc_m3upi_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 2,\ .pe = intel_icx_unc_m3upi_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_icx_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ .display_reg = display_m3upi,\ } DEFINE_M3UPI(0); DEFINE_M3UPI(1); DEFINE_M3UPI(2); DEFINE_M3UPI(3); 
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_icx_unc_pcu.c000066400000000000000000000063331502707512200237260ustar00rootroot00000000000000/* * pfmlib_intel_icx_unc_pcu.c : Intel ICX PCU uncore PMU * * Copyright (c) 2023 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_icx_unc_pcu_events.h" static void display_pcu(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_PCU=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d tid_en=%d occ_inv=%d occ_edge=%d] %s\n", reg->val, reg->icx_pcu.unc_event, reg->icx_pcu.unc_umask, reg->icx_pcu.unc_en, reg->icx_pcu.unc_inv, reg->icx_pcu.unc_edge, reg->icx_pcu.unc_thres, reg->icx_pcu.unc_tid_en, reg->icx_pcu.unc_occ_inv, reg->icx_pcu.unc_occ_edge, pe[e->event].name); } pfmlib_pmu_t intel_icx_unc_pcu_support = { .desc = "Intel IcelakeX PCU uncore", .name = "icx_unc_pcu", .perf_name = "uncore_pcu", .pmu = PFM_PMU_INTEL_ICX_UNC_PCU, .pme_count = LIBPFM_ARRAY_SIZE(intel_icx_unc_pcu_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, .max_encoding = 2, .pe = intel_icx_unc_pcu_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_PMU_FL_UNC_OCC | PFMLIB_PMU_FL_NO_SMPL, .pmu_detect = pfm_intel_icx_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_snbep_unc_can_auto_encode, .display_reg = display_pcu, }; 
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_icx_unc_ubox.c000066400000000000000000000060411502707512200241100ustar00rootroot00000000000000/* * pfmlib_intel_icx_unc_ubox.c : Intel ICX UBOX uncore PMU * * Copyright (c) 2023 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_icx_unc_ubox_events.h" static void display_ubox(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_UBOX=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } pfmlib_pmu_t intel_icx_unc_ubox_support = { .desc = "Intel IcelakeX UBOX uncore", .name = "icx_unc_ubox", .perf_name = "uncore_ubox", .pmu = PFM_PMU_INTEL_ICX_UNC_UBOX, .pme_count = LIBPFM_ARRAY_SIZE(intel_icx_unc_ubox_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, .max_encoding = 2, .pe = intel_icx_unc_ubox_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .pmu_detect = pfm_intel_icx_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, .display_reg = display_ubox, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_icx_unc_upi.c000066400000000000000000000062321502707512200237320ustar00rootroot00000000000000/* * pfmlib_intel_icx_unc_upi.c : Intel ICX UPI uncore PMU * * Copyright (c) 2023 Google LLC * Contributed by Stephane Eranian * * 
Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_icx_unc_upi_events.h" static void display_upi(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_UPI=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } #define DEFINE_UPI(n) \ pfmlib_pmu_t intel_icx_unc_upi##n##_support = {\ .desc = "Intel IcelakeX UPI"#n" uncore",\ .name = "icx_unc_upi"#n,\ .perf_name = "uncore_upi_"#n,\ .pmu = PFM_PMU_INTEL_ICX_UNC_UPI##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_icx_unc_upi_ll_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 2,\ .pe = intel_icx_unc_upi_ll_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_icx_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ .display_reg = display_upi,\ } DEFINE_UPI(0); DEFINE_UPI(1); DEFINE_UPI(2); DEFINE_UPI(3); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_ivb.c000066400000000000000000000076001502707512200222050ustar00rootroot00000000000000/* * 
pfmlib_intel_ivb.c : Intel Ivy Bridge core PMU * * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "events/intel_ivb_events.h" static int pfm_ivb_init(void *this) { pfm_intel_x86_cfg.arch_version = 3; return PFM_SUCCESS; } static const int ivb_models[] = { 58, /* IvyBridge (Core i3/i5/i7 3xxx) */ 0 }; static const int ivbep_models[] = { 62, /* Ivytown */ 0 }; pfmlib_pmu_t intel_ivb_support={ .desc = "Intel Ivy Bridge", .name = "ivb", .pmu = PFM_PMU_INTEL_IVB, .pme_count = LIBPFM_ARRAY_SIZE(intel_ivb_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 8, /* consider with HT off by default */ .num_fixed_cntrs = 3, .max_encoding = 2, /* offcore_response */ .pe = intel_ivb_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = ivb_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_ivb_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, }; pfmlib_pmu_t intel_ivb_ep_support={ .desc = "Intel Ivy Bridge EP", .name = "ivb_ep", .pmu = PFM_PMU_INTEL_IVB_EP, .pme_count = LIBPFM_ARRAY_SIZE(intel_ivb_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 8, /* consider with HT off by default */ .num_fixed_cntrs = 3, .max_encoding = 2, /* offcore_response */ .pe = intel_ivb_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = ivbep_models, .pmu_detect = 
pfm_intel_x86_model_detect, .pmu_init = pfm_ivb_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_ivb_unc.c000066400000000000000000000057421502707512200230570ustar00rootroot00000000000000/* * pfmlib_intel_ivb_unc.c : Intel IvyBridge C-Box uncore PMU * * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #define INTEL_SNB_UNC_ATTRS \ (_INTEL_X86_ATTR_I|_INTEL_X86_ATTR_E|_INTEL_X86_ATTR_C) /* same event table as SNB */ #include "events/intel_snb_unc_events.h" static int pfm_ivb_unc_detect(void *this) { int ret; ret = pfm_intel_x86_detect(); if (ret != PFM_SUCCESS) return ret; if (pfm_intel_x86_cfg.family != 6) return PFM_ERR_NOTSUPP; switch (pfm_intel_x86_cfg.model) { case 58: /* IvyBridge */ break; default: return PFM_ERR_NOTSUPP; } return PFM_SUCCESS; } #define IVB_UNC_CBOX(n, p) \ pfmlib_pmu_t intel_ivb_unc_cbo##n##_support={ \ .desc = "Intel Ivy Bridge C-box"#n" uncore", \ .name = "ivb_unc_cbo"#n, \ .perf_name = "uncore_cbox_"#n, \ .pmu = PFM_PMU_INTEL_IVB_UNC_CB##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_snb_unc_##p##_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .num_cntrs = 2, \ .num_fixed_cntrs = 1, \ .max_encoding = 1,\ .pe = intel_snb_unc_##p##_pe, \ .atdesc = intel_x86_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK\ | PFMLIB_PMU_FL_NO_SMPL,\ .pmu_detect = pfm_ivb_unc_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, \ PFMLIB_ENCODE_PERF(pfm_intel_nhm_unc_get_perf_encoding), \ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first, \ .get_event_next = pfm_intel_x86_get_event_next, \ .event_is_valid = pfm_intel_x86_event_is_valid, \ .validate_table = pfm_intel_x86_validate_table, \ .get_event_info = pfm_intel_x86_get_event_info, \ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ } IVB_UNC_CBOX(0, cbo0); IVB_UNC_CBOX(1, cbo); IVB_UNC_CBOX(2, cbo); IVB_UNC_CBOX(3, cbo); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_ivbep_unc_cbo.c000066400000000000000000000077271502707512200242340ustar00rootroot00000000000000/* * pfmlib_intel_ivbep_unc_cbo.c : Intel IvyBridge-EP C-Box uncore PMU * * Copyright (c) 2014 
Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_ivbep_unc_cbo_events.h" static void display_cbo(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; pfm_snbep_unc_reg_t f; __pfm_vbprintf("[UNC_CBO=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d tid_en=%d] %s\n", reg->val, reg->cbo.unc_event, reg->cbo.unc_umask, reg->cbo.unc_en, reg->cbo.unc_inv, reg->cbo.unc_edge, reg->cbo.unc_thres, reg->cbo.unc_tid, pe[e->event].name); if (e->count == 1) return; f.val = e->codes[1]; __pfm_vbprintf("[UNC_CBOX_FILTER0=0x%"PRIx64" tid=%d core=0x%x" " state=0x%x]\n", f.val, f.ivbep_cbo_filt0.tid, f.ivbep_cbo_filt0.cid, f.ivbep_cbo_filt0.state); if (e->count == 2) return; f.val = e->codes[2]; __pfm_vbprintf("[UNC_CBOX_FILTER1=0x%"PRIx64" nid=%d opc=0x%x" " nc=0x%x isoc=0x%x]\n", f.val, f.ivbep_cbo_filt1.nid, f.ivbep_cbo_filt1.opc, f.ivbep_cbo_filt1.nc, f.ivbep_cbo_filt1.isoc); } #define DEFINE_C_BOX(n) \ pfmlib_pmu_t intel_ivbep_unc_cb##n##_support = {\ .desc = "Intel Ivy Bridge-EP C-Box "#n" uncore",\ .name = "ivbep_unc_cbo"#n,\ .perf_name = "uncore_cbox_"#n,\ .pmu = PFM_PMU_INTEL_IVBEP_UNC_CB##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_ivbep_unc_c_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 2,\ .pe = intel_ivbep_unc_c_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK|INTEL_PMU_FL_UNC_CBO,\ .pmu_detect = pfm_intel_ivbep_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = 
pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ .display_reg = display_cbo,\ } DEFINE_C_BOX(0); DEFINE_C_BOX(1); DEFINE_C_BOX(2); DEFINE_C_BOX(3); DEFINE_C_BOX(4); DEFINE_C_BOX(5); DEFINE_C_BOX(6); DEFINE_C_BOX(7); DEFINE_C_BOX(8); DEFINE_C_BOX(9); DEFINE_C_BOX(10); DEFINE_C_BOX(11); DEFINE_C_BOX(12); DEFINE_C_BOX(13); DEFINE_C_BOX(14); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_ivbep_unc_ha.c000066400000000000000000000067051502707512200240540ustar00rootroot00000000000000/* * pfmlib_intel_ivbep_unc_ha.c : Intel IvyBridge-EP Home Agent (HA) uncore PMU * * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_ivbep_unc_ha_events.h" static void display_ha(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; pfm_snbep_unc_reg_t f; __pfm_vbprintf("[UNC_HA=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); if (e->count == 1) return; f.val = e->codes[1]; __pfm_vbprintf("[UNC_HA_ADDR=0x%"PRIx64" lo_addr=0x%x hi_addr=0x%x]\n", f.val, f.ha_addr.lo_addr, f.ha_addr.hi_addr); f.val = e->codes[2]; __pfm_vbprintf("[UNC_HA_OPC=0x%"PRIx64" opc=0x%x]\n", f.val, f.ha_opc.opc); } #define DEFINE_HA_BOX(n) \ pfmlib_pmu_t intel_ivbep_unc_ha##n##_support = {\ .desc = "Intel Ivy Bridge-EP HA "#n" uncore",\ .name = "ivbep_unc_ha"#n,\ .perf_name = "uncore_ha_"#n,\ .pmu = PFM_PMU_INTEL_IVBEP_UNC_HA##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_ivbep_unc_h_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 3, /* address matchers */\ .pe = intel_ivbep_unc_h_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_ivbep_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = 
pfm_intel_x86_get_event_nattrs,\ .display_reg = display_ha,\ } DEFINE_HA_BOX(0); DEFINE_HA_BOX(1); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_ivbep_unc_imc.c000066400000000000000000000055251502707512200242330ustar00rootroot00000000000000/* * pfmlib_intel_ivbep_unc_imc.c : Intel IvyBridge-EP Integrated Memory Controller (IMC) uncore PMU * * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_ivbep_unc_imc_events.h" #define DEFINE_IMC_BOX(n) \ pfmlib_pmu_t intel_ivbep_unc_imc##n##_support = { \ .desc = "Intel Ivy Bridge-EP IMC"#n" uncore", \ .name = "ivbep_unc_imc"#n, \ .perf_name = "uncore_imc_"#n, \ .pmu = PFM_PMU_INTEL_IVBEP_UNC_IMC##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_ivbep_unc_m_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .num_cntrs = 4, \ .num_fixed_cntrs = 1, \ .max_encoding = 1, \ .pe = intel_ivbep_unc_m_pe, \ .atdesc = snbep_unc_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ .pmu_detect = pfm_intel_ivbep_unc_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first, \ .get_event_next = pfm_intel_x86_get_event_next, \ .event_is_valid = pfm_intel_x86_event_is_valid, \ .validate_table = pfm_intel_x86_validate_table, \ .get_event_info = pfm_intel_x86_get_event_info, \ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ }; DEFINE_IMC_BOX(0); DEFINE_IMC_BOX(1); DEFINE_IMC_BOX(2); DEFINE_IMC_BOX(3); DEFINE_IMC_BOX(4); DEFINE_IMC_BOX(5); DEFINE_IMC_BOX(6); DEFINE_IMC_BOX(7); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_ivbep_unc_irp.c000066400000000000000000000057561502707512200242610ustar00rootroot00000000000000/* * pfmlib_intel_ivbep_irp.c : Intel IvyBridge-EP IRP uncore PMU * * Copyright (c) 2014 Google Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdarg.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_ivbep_unc_irp_events.h" static void display_irp(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_IRP=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "edge=%d thres=%d] %s\n", reg->val, reg->irp.unc_event, reg->irp.unc_umask, reg->irp.unc_en, reg->irp.unc_edge, reg->irp.unc_thres, pe[e->event].name); } pfmlib_pmu_t intel_ivbep_unc_irp_support = { .desc = "Intel Ivy Bridge-EP IRP uncore", .name = "ivbep_unc_irp", .perf_name = "uncore_irp", .pmu = PFM_PMU_INTEL_IVBEP_UNC_IRP, .pme_count = LIBPFM_ARRAY_SIZE(intel_ivbep_unc_i_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, .max_encoding = 3, .pe = intel_ivbep_unc_i_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .pmu_detect = pfm_intel_ivbep_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .display_reg = display_irp, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_ivbep_unc_pcu.c000066400000000000000000000070201502707512200242420ustar00rootroot00000000000000/* * pfmlib_intel_ivbep_unc_pcu.c : Intel IvyBridge-EP Power Control Unit (PCU) uncore PMU * * Copyright (c) 2014 Google Inc.
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdarg.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_ivbep_unc_pcu_events.h" static void display_pcu(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; pfm_snbep_unc_reg_t f; __pfm_vbprintf("[UNC_PCU=0x%"PRIx64" event=0x%x sel_ext=%d occ_sel=0x%x en=%d " "edge=%d thres=%d occ_inv=%d occ_edge=%d] %s\n", reg->val, reg->ivbep_pcu.unc_event, reg->ivbep_pcu.unc_sel_ext, reg->ivbep_pcu.unc_occ, reg->ivbep_pcu.unc_en, reg->ivbep_pcu.unc_edge, reg->ivbep_pcu.unc_thres, reg->ivbep_pcu.unc_occ_inv, reg->ivbep_pcu.unc_occ_edge, pe[e->event].name); if (e->count == 1) return; f.val = e->codes[1]; __pfm_vbprintf("[UNC_PCU_FILTER=0x%"PRIx64" band0=%u band1=%u band2=%u band3=%u]\n", f.val, f.pcu_filt.filt0, f.pcu_filt.filt1, f.pcu_filt.filt2, f.pcu_filt.filt3); } pfmlib_pmu_t intel_ivbep_unc_pcu_support = { .desc = "Intel Ivy Bridge-EP PCU uncore", .name = "ivbep_unc_pcu", .perf_name = "uncore_pcu", .pmu = PFM_PMU_INTEL_IVBEP_UNC_PCU, .pme_count = LIBPFM_ARRAY_SIZE(intel_ivbep_unc_p_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, .max_encoding = 2, .pe = intel_ivbep_unc_p_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_PMU_FL_UNC_OCC | PFMLIB_PMU_FL_NO_SMPL, .pmu_detect = pfm_intel_ivbep_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info,
PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_snbep_unc_can_auto_encode, .display_reg = display_pcu, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_ivbep_unc_qpi.c000066400000000000000000000062601502707512200242510ustar00rootroot00000000000000/* * pfmlib_intel_ivbep_qpi.c : Intel IvyBridge-EP QPI uncore PMU * * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdarg.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_ivbep_unc_qpi_events.h" static void display_qpi(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_QPI=0x%"PRIx64" event=0x%x sel_ext=%d umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->qpi.unc_event, reg->qpi.unc_event_ext, reg->qpi.unc_umask, reg->qpi.unc_en, reg->qpi.unc_inv, reg->qpi.unc_edge, reg->qpi.unc_thres, pe[e->event].name); } #define DEFINE_QPI_BOX(n) \ pfmlib_pmu_t intel_ivbep_unc_qpi##n##_support = {\ .desc = "Intel Ivy Bridge-EP QPI"#n" uncore",\ .name = "ivbep_unc_qpi"#n,\ .perf_name = "uncore_qpi_"#n,\ .pmu = PFM_PMU_INTEL_IVBEP_UNC_QPI##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_ivbep_unc_q_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 3,\ .pe = intel_ivbep_unc_q_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_ivbep_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .display_reg = display_qpi,\ } DEFINE_QPI_BOX(0); DEFINE_QPI_BOX(1); DEFINE_QPI_BOX(2); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_ivbep_unc_r2pcie.c000066400000000000000000000051241502707512200246420ustar00rootroot00000000000000/* *
pfmlib_intel_ivbep_r2pcie.c : Intel IvyBridge-EP R2PCIe uncore PMU * * Copyright (c) 2014 Google Inc. All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdarg.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_ivbep_unc_r2pcie_events.h" pfmlib_pmu_t intel_ivbep_unc_r2pcie_support = { .desc = "Intel Ivy Bridge-EP R2PCIe uncore", .name = "ivbep_unc_r2pcie", .perf_name = "uncore_r2pcie", .pmu = PFM_PMU_INTEL_IVBEP_UNC_R2PCIE, .pme_count = LIBPFM_ARRAY_SIZE(intel_ivbep_unc_r2_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, .max_encoding = 1, .pe = intel_ivbep_unc_r2_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .pmu_detect = pfm_intel_ivbep_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_ivbep_unc_r3qpi.c000066400000000000000000000053261502707512200245200ustar00rootroot00000000000000/* * pfmlib_intel_ivbep_r3qpi.c : Intel IvyBridge-EP R3QPI uncore PMU * * Copyright (c) 2014 Google Inc.
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdarg.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_ivbep_unc_r3qpi_events.h" #define DEFINE_R3QPI_BOX(n) \ pfmlib_pmu_t intel_ivbep_unc_r3qpi##n##_support = {\ .desc = "Intel Ivy Bridge-EP R3QPI"#n" uncore", \ .name = "ivbep_unc_r3qpi"#n,\ .perf_name = "uncore_r3qpi_"#n, \ .pmu = PFM_PMU_INTEL_IVBEP_UNC_R3QPI##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_ivbep_unc_r3_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 3,\ .num_fixed_cntrs = 0,\ .max_encoding = 1,\ .pe = intel_ivbep_unc_r3_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_ivbep_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ } DEFINE_R3QPI_BOX(0); DEFINE_R3QPI_BOX(1); DEFINE_R3QPI_BOX(2); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_ivbep_unc_ubo.c000066400000000000000000000051041502707512200242410ustar00rootroot00000000000000/* * pfmlib_intel_ivbep_unc_ubo.c : Intel IvyBridge-EP U-Box uncore PMU * * Copyright (c) 2014 Google Inc.
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdarg.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_ivbep_unc_ubo_events.h" pfmlib_pmu_t intel_ivbep_unc_ubo_support = { .desc = "Intel Ivy Bridge-EP U-Box uncore", .name = "ivbep_unc_ubo", .perf_name = "uncore_ubox", .pmu = PFM_PMU_INTEL_IVBEP_UNC_UBOX, .pme_count = LIBPFM_ARRAY_SIZE(intel_ivbep_unc_u_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 2, .num_fixed_cntrs = 1, .max_encoding = 1, .pe = intel_ivbep_unc_u_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .pmu_detect = pfm_intel_ivbep_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_knc.c000066400000000000000000000045731502707512200222060ustar00rootroot00000000000000/* * pfmlib_intel_knc.c : Intel Knights Corner (Xeon Phi) * * Copyright (c) 2012, Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright
notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_intel_x86_priv.h" /* architecture private */ #include "events/intel_knc_events.h" static const int knc_models[] = { 1, /* Knights Corner */ 0 }; pfmlib_pmu_t intel_knc_support={ .desc = "Intel Knights Corner", .name = "knc", .pmu = PFM_PMU_INTEL_KNC, .pme_count = LIBPFM_ARRAY_SIZE(intel_knc_pe), .type = PFM_PMU_TYPE_CORE, .num_cntrs = 2, .max_encoding = 1, .pe = intel_knc_pe, .atdesc = intel_x86_mods, .supported_plm = INTEL_X86_PLM, .cpu_family = 11, .cpu_models = knc_models, .pmu_detect = pfm_intel_x86_model_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_knl.c000066400000000000000000000106221502707512200222070ustar00rootroot00000000000000/* * pfmlib_intel_knl.c : Intel Knights Landing core PMU * * Copyright (c) 2016 Intel Corp. 
All rights reserved * Contributed by Peinan Zhang * * Intel Knights Mill core PMU support added March 2018 * Based on Intel's Knights Landing event table, which is shared with Knights Mill * Contributed by Heike Jagode * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* * Based on Intel Software Optimization Guide 2015 */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "events/intel_knl_events.h" static const int knl_models[] = { 87, /* knights landing */ 0 }; static const int knm_models[] = { 133, /* knights mill */ 0 }; static int pfm_intel_knl_init(void *this) { pfm_intel_x86_cfg.arch_version = 2; return PFM_SUCCESS; } pfmlib_pmu_t intel_knl_support={ .desc = "Intel Knights Landing", .name = "knl", .pmu = PFM_PMU_INTEL_KNL, .pme_count = LIBPFM_ARRAY_SIZE(intel_knl_pe), .type = PFM_PMU_TYPE_CORE, .num_cntrs = 2, .num_fixed_cntrs = 3, .max_encoding = 2, .pe = intel_knl_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .supported_plm = INTEL_X86_PLM, .cpu_family = 6, .cpu_models = knl_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_intel_knl_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; pfmlib_pmu_t intel_knm_support={ .desc = "Intel Knights Mill", .name = "knm", .pmu = PFM_PMU_INTEL_KNM, .pme_count = LIBPFM_ARRAY_SIZE(intel_knl_pe), .type = PFM_PMU_TYPE_CORE, .num_cntrs = 2, .num_fixed_cntrs = 3, .max_encoding = 2, .pe = intel_knl_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .supported_plm = INTEL_X86_PLM, .cpu_family = 6, .cpu_models = knm_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_intel_knl_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, 
PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_knl_unc_cha.c000066400000000000000000000114241502707512200236700ustar00rootroot00000000000000/* * pfmlib_intel_knl_unc_cha.c : Intel KnightsLanding CHA uncore PMU * * Copyright (c) 2016 Intel Corp. All rights reserved * Contributed by Peinan Zhang * * Intel Knights Mill CHA uncore PMU support added April 2018 * Based on Intel's Knights Landing event table, which is shared with Knights Mill * Contributed by Heike Jagode * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdarg.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_knl_unc_cha_events.h" #define DEFINE_CHA_BOX(n) \ pfmlib_pmu_t intel_knl_unc_cha##n##_support = { \ .desc = "Intel KnightLanding CHA "#n" uncore", \ .name = "knl_unc_cha"#n, \ .perf_name = "uncore_cha_"#n, \ .pmu = PFM_PMU_INTEL_KNL_UNC_CHA##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_knl_unc_cha_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .num_cntrs = 4, \ .num_fixed_cntrs = 0, \ .max_encoding = 1, \ .pe = intel_knl_unc_cha_pe, \ .atdesc = snbep_unc_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ .pmu_detect = pfm_intel_knl_unc_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first, \ .get_event_next = pfm_intel_x86_get_event_next, \ .event_is_valid = pfm_intel_x86_event_is_valid, \ .validate_table = pfm_intel_x86_validate_table, \ .get_event_info = pfm_intel_x86_get_event_info, \ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ }; \ \ pfmlib_pmu_t intel_knm_unc_cha##n##_support = { \ .desc = "Intel Knights Mill CHA "#n" uncore", \ .name = "knm_unc_cha"#n, \ .perf_name = "uncore_cha_"#n, \ .pmu = PFM_PMU_INTEL_KNM_UNC_CHA##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_knl_unc_cha_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .num_cntrs = 4, \ .num_fixed_cntrs = 0, \ .max_encoding = 1, \ .pe = intel_knl_unc_cha_pe, \ .atdesc = snbep_unc_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ .pmu_detect = pfm_intel_knm_unc_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \
PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first, \ .get_event_next = pfm_intel_x86_get_event_next, \ .event_is_valid = pfm_intel_x86_event_is_valid, \ .validate_table = pfm_intel_x86_validate_table, \ .get_event_info = pfm_intel_x86_get_event_info, \ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ }; DEFINE_CHA_BOX(0); DEFINE_CHA_BOX(1); DEFINE_CHA_BOX(2); DEFINE_CHA_BOX(3); DEFINE_CHA_BOX(4); DEFINE_CHA_BOX(5); DEFINE_CHA_BOX(6); DEFINE_CHA_BOX(7); DEFINE_CHA_BOX(8); DEFINE_CHA_BOX(9); DEFINE_CHA_BOX(10); DEFINE_CHA_BOX(11); DEFINE_CHA_BOX(12); DEFINE_CHA_BOX(13); DEFINE_CHA_BOX(14); DEFINE_CHA_BOX(15); DEFINE_CHA_BOX(16); DEFINE_CHA_BOX(17); DEFINE_CHA_BOX(18); DEFINE_CHA_BOX(19); DEFINE_CHA_BOX(20); DEFINE_CHA_BOX(21); DEFINE_CHA_BOX(22); DEFINE_CHA_BOX(23); DEFINE_CHA_BOX(24); DEFINE_CHA_BOX(25); DEFINE_CHA_BOX(26); DEFINE_CHA_BOX(27); DEFINE_CHA_BOX(28); DEFINE_CHA_BOX(29); DEFINE_CHA_BOX(30); DEFINE_CHA_BOX(31); DEFINE_CHA_BOX(32); DEFINE_CHA_BOX(33); DEFINE_CHA_BOX(34); DEFINE_CHA_BOX(35); DEFINE_CHA_BOX(36); DEFINE_CHA_BOX(37); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_knl_unc_edc.c000066400000000000000000000156751502707512200237040ustar00rootroot00000000000000/* * pfmlib_intel_knl_unc_edc.c : Intel KnightsLanding Integrated EDRAM uncore PMU * * Copyright (c) 2016 Intel Corp. 
All rights reserved * Contributed by Peinan Zhang * * Intel Knights Mill Integrated EDRAM uncore PMU support added April 2018 * Based on Intel's Knights Landing event table, which is shared with Knights Mill * Contributed by Heike Jagode * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdarg.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_knl_unc_edc_events.h" #define DEFINE_EDC_UCLK_BOX(n) \ pfmlib_pmu_t intel_knl_unc_edc_uclk##n##_support = { \ .desc = "Intel KnightLanding EDC_UCLK_"#n" uncore", \ .name = "knl_unc_edc_uclk"#n, \ .perf_name = "uncore_edc_uclk_"#n, \ .pmu = PFM_PMU_INTEL_KNL_UNC_EDC_UCLK##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_knl_unc_edc_uclk_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .num_cntrs = 4, \ .num_fixed_cntrs = 0, \ .max_encoding = 1, \ .pe = intel_knl_unc_edc_uclk_pe, \ .atdesc = snbep_unc_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ .pmu_detect = pfm_intel_knl_unc_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first, \ .get_event_next = pfm_intel_x86_get_event_next, \ .event_is_valid = pfm_intel_x86_event_is_valid, \ .validate_table = pfm_intel_x86_validate_table, \ .get_event_info = pfm_intel_x86_get_event_info, \ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ }; \ \ pfmlib_pmu_t intel_knm_unc_edc_uclk##n##_support = { \ .desc = "Intel Knights Mill EDC_UCLK_"#n" uncore", \ .name = "knm_unc_edc_uclk"#n, \ .perf_name = "uncore_edc_uclk_"#n, \ .pmu = PFM_PMU_INTEL_KNM_UNC_EDC_UCLK##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_knl_unc_edc_uclk_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .num_cntrs = 4, \ .num_fixed_cntrs = 0, \ .max_encoding = 1, \ .pe = intel_knl_unc_edc_uclk_pe, \ .atdesc = snbep_unc_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ .pmu_detect = pfm_intel_knm_unc_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \
PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first, \ .get_event_next = pfm_intel_x86_get_event_next, \ .event_is_valid = pfm_intel_x86_event_is_valid, \ .validate_table = pfm_intel_x86_validate_table, \ .get_event_info = pfm_intel_x86_get_event_info, \ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ }; DEFINE_EDC_UCLK_BOX(0); DEFINE_EDC_UCLK_BOX(1); DEFINE_EDC_UCLK_BOX(2); DEFINE_EDC_UCLK_BOX(3); DEFINE_EDC_UCLK_BOX(4); DEFINE_EDC_UCLK_BOX(5); DEFINE_EDC_UCLK_BOX(6); DEFINE_EDC_UCLK_BOX(7); #define DEFINE_EDC_ECLK_BOX(n) \ pfmlib_pmu_t intel_knl_unc_edc_eclk##n##_support = { \ .desc = "Intel KnightLanding EDC_ECLK_"#n" uncore", \ .name = "knl_unc_edc_eclk"#n, \ .perf_name = "uncore_edc_eclk_"#n, \ .pmu = PFM_PMU_INTEL_KNL_UNC_EDC_ECLK##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_knl_unc_edc_eclk_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .num_cntrs = 4, \ .num_fixed_cntrs = 0, \ .max_encoding = 1, \ .pe = intel_knl_unc_edc_eclk_pe, \ .atdesc = snbep_unc_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ .pmu_detect = pfm_intel_knl_unc_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first, \ .get_event_next = pfm_intel_x86_get_event_next, \ .event_is_valid = pfm_intel_x86_event_is_valid, \ .validate_table = pfm_intel_x86_validate_table, \ .get_event_info = pfm_intel_x86_get_event_info, \ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ }; \ \ pfmlib_pmu_t intel_knm_unc_edc_eclk##n##_support = { \ .desc = "Intel Knights Mill 
EDC_ECLK_"#n" uncore", \ .name = "knm_unc_edc_eclk"#n, \ .perf_name = "uncore_edc_eclk_"#n, \ .pmu = PFM_PMU_INTEL_KNM_UNC_EDC_ECLK##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_knl_unc_edc_eclk_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .num_cntrs = 4, \ .num_fixed_cntrs = 0, \ .max_encoding = 1, \ .pe = intel_knl_unc_edc_eclk_pe, \ .atdesc = snbep_unc_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ .pmu_detect = pfm_intel_knm_unc_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first, \ .get_event_next = pfm_intel_x86_get_event_next, \ .event_is_valid = pfm_intel_x86_event_is_valid, \ .validate_table = pfm_intel_x86_validate_table, \ .get_event_info = pfm_intel_x86_get_event_info, \ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ }; DEFINE_EDC_ECLK_BOX(0); DEFINE_EDC_ECLK_BOX(1); DEFINE_EDC_ECLK_BOX(2); DEFINE_EDC_ECLK_BOX(3); DEFINE_EDC_ECLK_BOX(4); DEFINE_EDC_ECLK_BOX(5); DEFINE_EDC_ECLK_BOX(6); DEFINE_EDC_ECLK_BOX(7); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_knl_unc_imc.c000066400000000000000000000152541502707512200237120ustar00rootroot00000000000000/* * pfmlib_intel_knl_unc_imc.c : Intel KnightsLanding Integrated Memory Controller (IMC) uncore PMU * * Copyright (c) 2016 Intel Corp. 
All rights reserved
 * Contributed by Peinan Zhang
 *
 * Intel Knights Mill Integrated Memory Controller (IMC) uncore PMU support added April 2018
 * Based on Intel's Knights Landing event table, which is shared with Knights Mill
 * Contributed by Heike Jagode
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */
#include <sys/types.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

/* private headers */
#include "pfmlib_priv.h"
#include "pfmlib_intel_x86_priv.h"
#include "pfmlib_intel_snbep_unc_priv.h"
#include "events/intel_knl_unc_imc_events.h"

#define DEFINE_IMC_BOX(n) \
pfmlib_pmu_t intel_knl_unc_imc##n##_support = { \
	.desc			= "Intel KnightLanding IMC "#n" uncore", \
	.name			= "knl_unc_imc"#n, \
	.perf_name		= "uncore_imc_"#n, \
	.pmu			= PFM_PMU_INTEL_KNL_UNC_IMC##n, \
	.pme_count		= LIBPFM_ARRAY_SIZE(intel_knl_unc_imc_pe), \
	.type			= PFM_PMU_TYPE_UNCORE, \
	.num_cntrs		= 4, \
	.num_fixed_cntrs	= 1, \
	.max_encoding		= 1, \
	.pe			= intel_knl_unc_imc_pe, \
	.atdesc			= snbep_unc_mods, \
	.flags			= PFMLIB_PMU_FL_RAW_UMASK, \
	.pmu_detect		= pfm_intel_knl_unc_detect, \
	.get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \
	 PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \
	 PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \
	.get_event_first	= pfm_intel_x86_get_event_first, \
	.get_event_next		= pfm_intel_x86_get_event_next, \
	.event_is_valid		= pfm_intel_x86_event_is_valid, \
	.validate_table		= pfm_intel_x86_validate_table, \
	.get_event_info		= pfm_intel_x86_get_event_info, \
	.get_event_attr_info	= pfm_intel_x86_get_event_attr_info, \
	 PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \
	.get_event_nattrs	= pfm_intel_x86_get_event_nattrs, \
}; \
\
pfmlib_pmu_t intel_knm_unc_imc##n##_support = { \
	.desc			= "Intel Knights Mill IMC "#n" uncore", \
	.name			= "knm_unc_imc"#n, \
	.perf_name		= "uncore_imc_"#n, \
	.pmu			= PFM_PMU_INTEL_KNM_UNC_IMC##n, \
	.pme_count		= LIBPFM_ARRAY_SIZE(intel_knl_unc_imc_pe), \
	.type			= PFM_PMU_TYPE_UNCORE, \
	.num_cntrs		= 4, \
	.num_fixed_cntrs	= 1, \
	.max_encoding		= 1, \
	.pe			= intel_knl_unc_imc_pe, \
	.atdesc			= snbep_unc_mods, \
	.flags			= PFMLIB_PMU_FL_RAW_UMASK, \
	.pmu_detect		= pfm_intel_knm_unc_detect, \
	.get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \
	 PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \
	 PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \
	.get_event_first	= pfm_intel_x86_get_event_first, \
	.get_event_next		= pfm_intel_x86_get_event_next, \
	.event_is_valid		= pfm_intel_x86_event_is_valid, \
	.validate_table		= pfm_intel_x86_validate_table, \
	.get_event_info		= pfm_intel_x86_get_event_info, \
	.get_event_attr_info	= pfm_intel_x86_get_event_attr_info, \
	 PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \
	.get_event_nattrs	= pfm_intel_x86_get_event_nattrs, \
};

DEFINE_IMC_BOX(0);
DEFINE_IMC_BOX(1);
DEFINE_IMC_BOX(2);
DEFINE_IMC_BOX(3);
DEFINE_IMC_BOX(4);
DEFINE_IMC_BOX(5);

#define DEFINE_IMC_UCLK_BOX(n) \
pfmlib_pmu_t intel_knl_unc_imc_uclk##n##_support = { \
	.desc			= "Intel KnightLanding IMC UCLK "#n" uncore", \
	.name			= "knl_unc_imc_uclk"#n, \
	.perf_name		= "uncore_mc_uclk_"#n, \
	.pmu			= PFM_PMU_INTEL_KNL_UNC_IMC_UCLK##n, \
	.pme_count		= LIBPFM_ARRAY_SIZE(intel_knl_unc_imc_uclk_pe), \
	.type			= PFM_PMU_TYPE_UNCORE, \
	.num_cntrs		= 4, \
	.num_fixed_cntrs	= 1, \
	.max_encoding		= 1, \
	.pe			= intel_knl_unc_imc_uclk_pe, \
	.atdesc			= snbep_unc_mods, \
	.flags			= PFMLIB_PMU_FL_RAW_UMASK, \
	.pmu_detect		= pfm_intel_knl_unc_detect, \
	.get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \
	 PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \
	 PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \
	.get_event_first	= pfm_intel_x86_get_event_first, \
	.get_event_next		= pfm_intel_x86_get_event_next, \
	.event_is_valid		= pfm_intel_x86_event_is_valid, \
	.validate_table		= pfm_intel_x86_validate_table, \
	.get_event_info		= pfm_intel_x86_get_event_info, \
	.get_event_attr_info	= pfm_intel_x86_get_event_attr_info, \
	 PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \
	.get_event_nattrs	= pfm_intel_x86_get_event_nattrs, \
}; \
\
pfmlib_pmu_t intel_knm_unc_imc_uclk##n##_support = { \
	.desc			= "Intel Knights Mill IMC UCLK "#n" uncore", \
	.name			= "knm_unc_imc_uclk"#n, \
	.perf_name		= "uncore_mc_uclk_"#n, \
	.pmu			= PFM_PMU_INTEL_KNM_UNC_IMC_UCLK##n, \
	.pme_count		= LIBPFM_ARRAY_SIZE(intel_knl_unc_imc_uclk_pe), \
	.type			= PFM_PMU_TYPE_UNCORE, \
	.num_cntrs		= 4, \
	.num_fixed_cntrs	= 1, \
	.max_encoding		= 1, \
	.pe			= intel_knl_unc_imc_uclk_pe, \
	.atdesc			= snbep_unc_mods, \
	.flags			= PFMLIB_PMU_FL_RAW_UMASK, \
	.pmu_detect		= pfm_intel_knm_unc_detect, \
	.get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \
	 PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \
	 PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \
	.get_event_first	= pfm_intel_x86_get_event_first, \
	.get_event_next		= pfm_intel_x86_get_event_next, \
	.event_is_valid		= pfm_intel_x86_event_is_valid, \
	.validate_table		= pfm_intel_x86_validate_table, \
	.get_event_info		= pfm_intel_x86_get_event_info, \
	.get_event_attr_info	= pfm_intel_x86_get_event_attr_info, \
	 PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \
	.get_event_nattrs	= pfm_intel_x86_get_event_nattrs, \
};

DEFINE_IMC_UCLK_BOX(0);
DEFINE_IMC_UCLK_BOX(1);
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_knl_unc_m2pcie.c000066400000000000000000000105351502707512200243160ustar00rootroot00000000000000
/*
 * pfmlib_intel_knl_m2pcie.c : Intel Knights Landing M2PCIe uncore PMU
 *
 * Copyright (c) 2016 Intel Corp.
All rights reserved
 * Contributed by Peinan Zhang
 *
 * Intel Knights Mill M2PCIe uncore PMU support added April 2018
 * Based on Intel's Knights Landing event table, which is shared with Knights Mill
 * Contributed by Heike Jagode
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */
#include <sys/types.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

/* private headers */
#include "pfmlib_priv.h"
#include "pfmlib_intel_x86_priv.h"
#include "pfmlib_intel_snbep_unc_priv.h"
#include "events/intel_knl_unc_m2pcie_events.h"

static void
display_m2p(void *this, pfmlib_event_desc_t *e, void *val)
{
	const intel_x86_entry_t *pe = this_pe(this);
	pfm_snbep_unc_reg_t *reg = val;

	__pfm_vbprintf("[UNC_R2PCIE=0x%"PRIx64" event=0x%x umask=0x%x en=%d "
		       "inv=%d edge=%d thres=%d] %s\n",
			reg->val,
			reg->com.unc_event,
			reg->com.unc_umask,
			reg->com.unc_en,
			reg->com.unc_inv,
			reg->com.unc_edge,
			reg->com.unc_thres,
			pe[e->event].name);
}

pfmlib_pmu_t intel_knl_unc_m2pcie_support = {
	.desc			= "Intel Knights Landing M2PCIe uncore",
	.name			= "knl_unc_m2pcie",
	.perf_name		= "uncore_m2pcie",
	.pmu			= PFM_PMU_INTEL_KNL_UNC_M2PCIE,
	.pme_count		= LIBPFM_ARRAY_SIZE(intel_knl_unc_m2pcie_pe),
	.type			= PFM_PMU_TYPE_UNCORE,
	.num_cntrs		= 4,
	.num_fixed_cntrs	= 0,
	.max_encoding		= 1,
	.pe			= intel_knl_unc_m2pcie_pe,
	.atdesc			= snbep_unc_mods,
	.flags			= PFMLIB_PMU_FL_RAW_UMASK,
	.pmu_detect		= pfm_intel_knl_unc_detect,
	.get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,
	 PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),
	 PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect),
	.get_event_first	= pfm_intel_x86_get_event_first,
	.get_event_next		= pfm_intel_x86_get_event_next,
	.event_is_valid		= pfm_intel_x86_event_is_valid,
	.validate_table		= pfm_intel_x86_validate_table,
	.get_event_info		= pfm_intel_x86_get_event_info,
	.get_event_attr_info	= pfm_intel_x86_get_event_attr_info,
	 PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),
	.get_event_nattrs	= pfm_intel_x86_get_event_nattrs,
	.display_reg		= display_m2p,
};

pfmlib_pmu_t intel_knm_unc_m2pcie_support = {
	.desc			= "Intel Knights Mill M2PCIe uncore",
	.name			= "knm_unc_m2pcie",
	.perf_name		= "uncore_m2pcie",
	.pmu			= PFM_PMU_INTEL_KNM_UNC_M2PCIE,
	.pme_count		= LIBPFM_ARRAY_SIZE(intel_knl_unc_m2pcie_pe),
	.type			= PFM_PMU_TYPE_UNCORE,
	.num_cntrs		= 4,
	.num_fixed_cntrs	= 0,
	.max_encoding		= 1,
	.pe			= intel_knl_unc_m2pcie_pe,
	.atdesc			= snbep_unc_mods,
	.flags			= PFMLIB_PMU_FL_RAW_UMASK,
	.pmu_detect		= pfm_intel_knm_unc_detect,
	.get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,
	 PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),
	 PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect),
	.get_event_first	= pfm_intel_x86_get_event_first,
	.get_event_next		= pfm_intel_x86_get_event_next,
	.event_is_valid		= pfm_intel_x86_event_is_valid,
	.validate_table		= pfm_intel_x86_validate_table,
	.get_event_info		= pfm_intel_x86_get_event_info,
	.get_event_attr_info	= pfm_intel_x86_get_event_attr_info,
	 PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),
	.get_event_nattrs	= pfm_intel_x86_get_event_nattrs,
	.display_reg		= display_m2p,
};
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_netburst.c000066400000000000000000000316621502707512200232760ustar00rootroot00000000000000
/*
 * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P.
 * Copyright (c) 2006 IBM Corp.
 * Contributed by Kevin Corry
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS * IN THE SOFTWARE. * * pfmlib_intel_netburst.c * * Support for the Pentium4/Xeon/EM64T processor family (family=15). */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_netburst_priv.h" #include "pfmlib_intel_x86_priv.h" #include "events/intel_netburst_events.h" const pfmlib_attr_desc_t netburst_mods[]={ PFM_ATTR_B("u", "monitor at priv level 1, 2, 3"), /* monitor priv level 1, 2, 3 */ PFM_ATTR_B("k", "monitor at priv level 0"), /* monitor priv level 0 */ PFM_ATTR_B("cmpl", "complement"), /* set: <=, clear: > */ PFM_ATTR_B("e", "edge"), /* edge */ PFM_ATTR_I("thr", "event threshold in range [0-15]"), /* threshold */ }; #define NETBURST_MODS_COUNT (sizeof(netburst_mods)/sizeof(pfmlib_attr_desc_t)) extern pfmlib_pmu_t netburst_support; static inline int netburst_get_numasks(int pidx) { int i = 0; /* * name = NULL is end-marker */ while (netburst_events[pidx].event_masks[i].name) i++; return i; } static void netburst_display_reg(pfmlib_event_desc_t *e) { netburst_escr_value_t escr; netburst_cccr_value_t cccr; escr.val = e->codes[0]; cccr.val = e->codes[1]; __pfm_vbprintf("[0x%"PRIx64" 0x%"PRIx64" 0x%"PRIx64" usr=%d os=%d tag_ena=%d tag_val=%d " "evmask=0x%x evsel=0x%x escr_sel=0x%x comp=%d cmpl=%d thr=%d e=%d", escr, cccr, e->codes[2], /* perf_event code */ escr.bits.t0_usr, /* t1 is identical */ escr.bits.t0_os, /* t1 is identical */ escr.bits.tag_enable, escr.bits.tag_value, escr.bits.event_mask, escr.bits.event_select, cccr.bits.escr_select, cccr.bits.compare, cccr.bits.complement, cccr.bits.threshold, cccr.bits.edge); __pfm_vbprintf("] %s\n", e->fstr); } static int netburst_add_defaults(pfmlib_event_desc_t *e, unsigned int *evmask) { int i, n; n = netburst_get_numasks(e->event); for (i = 0; i < n; 
i++) { if (netburst_events[e->event].event_masks[i].flags & NETBURST_FL_DFL) goto found; } return PFM_ERR_ATTR; found: *evmask = 1 << netburst_events[e->event].event_masks[i].bit; n = e->nattrs; e->attrs[n].id = i; e->attrs[n].ival = i; e->nattrs = n+1; return PFM_SUCCESS; } int pfm_netburst_get_encoding(void *this, pfmlib_event_desc_t *e) { pfmlib_event_attr_info_t *a; netburst_escr_value_t escr; netburst_cccr_value_t cccr; unsigned int evmask = 0; unsigned int plmmsk = 0; int umask_done = 0; const char *n; int k, id, bit, ret; int tag_enable = 0, tag_value = 0; e->fstr[0] = '\0'; escr.val = 0; cccr.val = 0; for(k=0; k < e->nattrs; k++) { a = attr(e, k); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) { bit = netburst_events[e->event].event_masks[a->idx].bit; n = netburst_events[e->event].event_masks[a->idx].name; /* * umask combination seems possible, although it does * not always make sense, e.g., BOGUS vs. NBOGUS */ if (bit < EVENT_MASK_BITS && n) { evmask |= (1 << bit); } else if (bit >= EVENT_MASK_BITS && n) { tag_value |= (1 << (bit - EVENT_MASK_BITS)); tag_enable = 1; } umask_done = 1; } else if (a->type == PFM_ATTR_RAW_UMASK) { /* should not happen */ return PFM_ERR_ATTR; } else { uint64_t ival = e->attrs[k].ival; switch (a->idx) { case NETBURST_ATTR_U: escr.bits.t1_usr = !!ival; escr.bits.t0_usr = !!ival; plmmsk |= _NETBURST_ATTR_U; break; case NETBURST_ATTR_K: escr.bits.t1_os = !!ival; escr.bits.t0_os = !!ival; plmmsk |= _NETBURST_ATTR_K; break; case NETBURST_ATTR_E: if (ival) { cccr.bits.compare = 1; cccr.bits.edge = 1; } break; case NETBURST_ATTR_C: if (ival) { cccr.bits.compare = 1; cccr.bits.complement = 1; } break; case NETBURST_ATTR_T: if (ival > 15) return PFM_ERR_ATTR_VAL; if (ival) { cccr.bits.compare = 1; cccr.bits.threshold = ival; } break; default: return PFM_ERR_ATTR; } } } /* * handle case where no priv level mask was passed. 
* then we use the dfl_plm */ if (!(plmmsk & (_NETBURST_ATTR_K|_NETBURST_ATTR_U))) { if (e->dfl_plm & PFM_PLM0) { escr.bits.t1_os = 1; escr.bits.t0_os = 1; } if (e->dfl_plm & PFM_PLM3) { escr.bits.t1_usr = 1; escr.bits.t0_usr = 1; } } if (!umask_done) { ret = netburst_add_defaults(e, &evmask); if (ret != PFM_SUCCESS) return ret; } escr.bits.tag_enable = tag_enable; escr.bits.tag_value = tag_value; escr.bits.event_mask = evmask; escr.bits.event_select = netburst_events[e->event].event_select; cccr.bits.enable = 1; cccr.bits.escr_select = netburst_events[e->event].escr_select; cccr.bits.active_thread = 3; if (e->event == PME_REPLAY_EVENT) escr.bits.event_mask &= P4_REPLAY_REAL_MASK; /* remove virtual mask bits */ /* * reorder all the attributes such that the fstr appears always * the same regardless of how the attributes were submitted. */ evt_strcat(e->fstr, "%s", netburst_events[e->event].name); pfmlib_sort_attr(e); for(k=0; k < e->nattrs; k++) { a = attr(e, k); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) { id = e->attrs[k].id; evt_strcat(e->fstr, ":%s", netburst_events[e->event].event_masks[id].name); } } evt_strcat(e->fstr, ":%s=%lu", netburst_mods[NETBURST_ATTR_K].name, escr.bits.t0_os); evt_strcat(e->fstr, ":%s=%lu", netburst_mods[NETBURST_ATTR_U].name, escr.bits.t0_usr); evt_strcat(e->fstr, ":%s=%lu", netburst_mods[NETBURST_ATTR_E].name, cccr.bits.edge); evt_strcat(e->fstr, ":%s=%lu", netburst_mods[NETBURST_ATTR_C].name, cccr.bits.complement); evt_strcat(e->fstr, ":%s=%lu", netburst_mods[NETBURST_ATTR_T].name, cccr.bits.threshold); e->count = 2; e->codes[0] = escr.val; e->codes[1] = cccr.val; netburst_display_reg(e); return PFM_SUCCESS; } static int pfm_netburst_detect(void *this) { int ret; int model; ret = pfm_intel_x86_detect(); if (ret != PFM_SUCCESS) return ret; if (pfm_intel_x86_cfg.family != 15) return PFM_ERR_NOTSUPP; model = pfm_intel_x86_cfg.model; if (model == 3 || model == 4 || model == 6) return PFM_ERR_NOTSUPP; 
return PFM_SUCCESS; } static int pfm_netburst_detect_prescott(void *this) { int ret; int model; ret = pfm_intel_x86_detect(); if (ret != PFM_SUCCESS) return ret; if (pfm_intel_x86_cfg.family != 15) return PFM_ERR_NOTSUPP; /* * prescott has one more event (instr_completed) */ model = pfm_intel_x86_cfg.model; if (model != 3 && model != 4 && model != 6) return PFM_ERR_NOTSUPP; return PFM_SUCCESS; } static int pfm_netburst_get_event_first(void *this) { pfmlib_pmu_t *p = this; return p->pme_count ? 0 : -1; } static int pfm_netburst_get_event_next(void *this, int idx) { pfmlib_pmu_t *p = this; if (idx >= (p->pme_count-1)) return -1; return idx+1; } static int pfm_netburst_event_is_valid(void *this, int pidx) { pfmlib_pmu_t *p = this; return pidx >= 0 && pidx < p->pme_count; } static int pfm_netburst_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info) { const netburst_entry_t *pe = this_pe(this); int numasks, idx; numasks = netburst_get_numasks(pidx); if (attr_idx < numasks) { //idx = pfm_intel_x86_attr2umask(this, pidx, attr_idx); idx = attr_idx; info->name = pe[pidx].event_masks[idx].name; info->desc = pe[pidx].event_masks[idx].desc; info->equiv= NULL; info->code = pe[pidx].event_masks[idx].bit; info->type = PFM_ATTR_UMASK; info->is_dfl = !!(pe[pidx].event_masks[idx].flags & NETBURST_FL_DFL); } else { idx = attr_idx - numasks; info->name = netburst_mods[idx].name; info->desc = netburst_mods[idx].desc; info->equiv= NULL; info->code = idx; info->type = netburst_mods[idx].type; info->is_dfl = 0; } info->ctrl = PFM_ATTR_CTRL_PMU; info->idx = idx; /* namespace specific index */ info->dfl_val64 = 0; info->is_precise = 0; info->support_hw_smpl = 0; return PFM_SUCCESS; } static int pfm_netburst_get_event_info(void *this, int idx, pfm_event_info_t *info) { const netburst_entry_t *pe = this_pe(this); pfmlib_pmu_t *pmu = this; /* * pmu and idx filled out by caller */ info->name = pe[idx].name; info->desc = pe[idx].desc; info->code = 
pe[idx].event_select | (pe[idx].escr_select << 8); info->equiv = NULL; info->idx = idx; /* private index */ info->pmu = pmu->pmu; info->is_precise = 0; info->support_hw_smpl = 0; info->nattrs = netburst_get_numasks(idx); info->nattrs += NETBURST_MODS_COUNT; return PFM_SUCCESS; } static int pfm_netburst_validate_table(void *this, FILE *fp) { pfmlib_pmu_t *pmu = this; const netburst_entry_t *pe = netburst_events; const char *name = pmu->name; int i, j, noname, ndfl; int error = 0; for(i=0; i < pmu->pme_count; i++) { if (!pe[i].name) { fprintf(fp, "pmu: %s event%d: :: no name (prev event was %s)\n", pmu->name, i, i > 1 ? pe[i-1].name : "??"); error++; } if (!pe[i].desc) { fprintf(fp, "pmu: %s event%d: %s :: no description\n", name, i, pe[i].name); error++; } noname = ndfl = 0; /* name = NULL is end-marker, veryfy there is at least one */ for(j= 0; j < EVENT_MASK_BITS; j++) { if (!pe[i].event_masks[j].name) noname++; if (pe[i].event_masks[j].name) { if (!pe[i].event_masks[j].desc) { fprintf(fp, "pmu: %s event%d:%s umask%d: %s :: no description\n", name, i, pe[i].name, j, pe[i].event_masks[j].name); error++; } if (pe[i].event_masks[j].bit >= (EVENT_MASK_BITS+4)) { fprintf(fp, "pmu: %s event%d:%s umask%d: %s :: invalid bit field\n", name, i, pe[i].name, j, pe[i].event_masks[j].name); error++; } if (pe[i].event_masks[j].flags & NETBURST_FL_DFL) ndfl++; } } if (ndfl > 1) { fprintf(fp, "pmu: %s event%d:%s :: more than one default umask\n", name, i, pe[i].name); error++; } if (!noname) { fprintf(fp, "pmu: %s event%d:%s :: no event mask end-marker\n", name, i, pe[i].name); error++; } } return error ? 
PFM_ERR_INVAL : PFM_SUCCESS; } static unsigned int pfm_netburst_get_event_nattrs(void *this, int pidx) { unsigned int nattrs; nattrs = netburst_get_numasks(pidx); nattrs += NETBURST_MODS_COUNT; return nattrs; } pfmlib_pmu_t netburst_support = { .desc = "Pentium4", .name = "netburst", .pmu = PFM_PMU_INTEL_NETBURST, .pme_count = LIBPFM_ARRAY_SIZE(netburst_events) - 1, .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .atdesc = netburst_mods, .pe = netburst_events, .max_encoding = 3, .num_cntrs = 18, .pmu_detect = pfm_netburst_detect, .get_event_encoding[PFM_OS_NONE] = pfm_netburst_get_encoding, PFMLIB_ENCODE_PERF(pfm_netburst_get_perf_encoding), .get_event_first = pfm_netburst_get_event_first, .get_event_next = pfm_netburst_get_event_next, .event_is_valid = pfm_netburst_event_is_valid, .validate_table = pfm_netburst_validate_table, .get_event_info = pfm_netburst_get_event_info, .get_event_attr_info = pfm_netburst_get_event_attr_info, .get_event_nattrs = pfm_netburst_get_event_nattrs, PFMLIB_VALID_PERF_PATTRS(pfm_netburst_perf_validate_pattrs), }; pfmlib_pmu_t netburst_p_support = { .desc = "Pentium4 (Prescott)", .name = "netburst_p", .pmu = PFM_PMU_INTEL_NETBURST_P, .pme_count = LIBPFM_ARRAY_SIZE(netburst_events), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .atdesc = netburst_mods, .pe = netburst_events, .max_encoding = 3, .num_cntrs = 18, .pmu_detect = pfm_netburst_detect_prescott, .get_event_encoding[PFM_OS_NONE] = pfm_netburst_get_encoding, PFMLIB_ENCODE_PERF(pfm_netburst_get_perf_encoding), .get_event_first = pfm_netburst_get_event_first, .get_event_next = pfm_netburst_get_event_next, .event_is_valid = pfm_netburst_event_is_valid, .validate_table = pfm_netburst_validate_table, .get_event_info = pfm_netburst_get_event_info, .get_event_attr_info = pfm_netburst_get_event_attr_info, .get_event_nattrs = pfm_netburst_get_event_nattrs, PFMLIB_VALID_PERF_PATTRS(pfm_netburst_perf_validate_pattrs), }; 
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_netburst_perf_event.c000066400000000000000000000057471502707512200255140ustar00rootroot00000000000000
/* pfmlib_intel_netburst_perf_event.c : perf_event Intel Netburst functions
 *
 * Copyright (c) 2011 Google, Inc
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file implements the common code for all Intel X86 processors.
 */
#include <sys/types.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

/* private headers */
#include "pfmlib_priv.h"
#include "pfmlib_intel_netburst_priv.h"
#include "pfmlib_perf_event_priv.h"

int
pfm_netburst_get_perf_encoding(void *this, pfmlib_event_desc_t *e)
{
	const netburst_entry_t *pe = this_pe(this);
	struct perf_event_attr *attr = e->os_data;
	int perf_code = pe[e->event].perf_code;
	uint64_t escr;
	int ret;

	ret = pfm_netburst_get_encoding(this, e);
	if (ret != PFM_SUCCESS)
		return ret;

	attr->type = PERF_TYPE_RAW;
	/*
	 * codes[0] = ESCR
	 * codes[1] = CCCR
	 *
	 * cleanup event_select, and install perf specific code
	 */
	escr = e->codes[0] & ~(0x3full << 25);
	escr |= perf_code << 25;

	attr->config = (escr << 32) | e->codes[1];

	return PFM_SUCCESS;
}

void
pfm_netburst_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e)
{
	int i, compact;

	for (i = 0; i < e->npattrs; i++) {
		compact = 0;

		/* umasks never conflict */
		if (e->pattrs[i].type == PFM_ATTR_UMASK)
			continue;

		/*
		 * with perf_events, u and k are handled at the OS level
		 * via exclude_user, exclude_kernel.
		 */
		if (e->pattrs[i].ctrl == PFM_ATTR_CTRL_PMU) {
			if (e->pattrs[i].idx == NETBURST_ATTR_U
			    || e->pattrs[i].idx == NETBURST_ATTR_K)
				compact = 1;
		}
		if (e->pattrs[i].ctrl == PFM_ATTR_CTRL_PERF_EVENT) {
			/* no PEBS support (for now) */
			if (e->pattrs[i].idx == PERF_ATTR_PR)
				compact = 1;
			/*
			 * No hypervisor on Intel
			 */
			if (e->pattrs[i].idx == PERF_ATTR_H)
				compact = 1;
		}
		/* hardware sampling not supported */
		if (e->pattrs[i].idx == PERF_ATTR_HWS)
			compact = 1;

		if (compact) {
			pfmlib_compact_pattrs(e, i);
			i--;
		}
	}
}
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_netburst_priv.h000066400000000000000000000162171502707512200243420ustar00rootroot00000000000000
/*
 * Copyright (c) 2006 IBM Corp.
 * Contributed by Kevin Corry
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
 * IN THE SOFTWARE.
 *
 * pfmlib_netburst_priv.h
 *
 * Structures and definitions for use in the Pentium4/Xeon/EM64T libpfm code.
 */
#ifndef _PFMLIB_INTEL_NETBURST_PRIV_H_
#define _PFMLIB_INTEL_NETBURST_PRIV_H_

/* ESCR: Event Selection Control Register
 *
 * These registers are used to select which event to count along with options
 * for that event. There are (up to) 45 ESCRs, but each data counter is
 * restricted to a specific set of ESCRs.
 */

/**
 * netburst_escr_value_t
 *
 * Bit-wise breakdown of the ESCR registers.
 *
 *  Bits     Description
 *  -------  -----------
 *  63 - 31  Reserved
 *  30 - 25  Event Select
 *  24 - 9   Event Mask
 *   8 - 5   Tag Value
 *   4       Tag Enable
 *   3       T0 OS - Enable counting in kernel mode (thread 0)
 *   2       T0 USR - Enable counting in user mode (thread 0)
 *   1       T1 OS - Enable counting in kernel mode (thread 1)
 *   0       T1 USR - Enable counting in user mode (thread 1)
 **/

#define EVENT_MASK_BITS   16
#define EVENT_SELECT_BITS  6

typedef union {
	unsigned long long val;
	struct {
		unsigned long t1_usr:1;
		unsigned long t1_os:1;
		unsigned long t0_usr:1;
		unsigned long t0_os:1;
		unsigned long tag_enable:1;
		unsigned long tag_value:4;
		unsigned long event_mask:EVENT_MASK_BITS;
		unsigned long event_select:EVENT_SELECT_BITS;
		unsigned long reserved:1;
	} bits;
} netburst_escr_value_t;

/* CCCR: Counter Configuration Control Register
 *
 * These registers are used to configure the data counters. There are 18
 * CCCRs, one for each data counter.
 */

/**
 * netburst_cccr_value_t
 *
 * Bit-wise breakdown of the CCCR registers.
 *
 *  Bits     Description
 *  -------  -----------
 *  63 - 32  Reserved
 *  31       OVF - The data counter overflowed.
 *  30       Cascade - Enable cascading of data counter when alternate
 *           counter overflows.
 *  29 - 28  Reserved
 *  27       OVF_PMI_T1 - Generate interrupt for LP1 on counter overflow
 *  26       OVF_PMI_T0 - Generate interrupt for LP0 on counter overflow
 *  25       FORCE_OVF - Force interrupt on every counter increment
 *  24       Edge - Enable rising edge detection of the threshold comparison
 *           output for filtering event counts.
 *  23 - 20  Threshold Value - Select the threshold value for comparing to
 *           incoming event counts.
 *  19       Complement - Select how incoming event count is compared with
 *           the threshold value.
 *  18       Compare - Enable filtering of event counts.
 *  17 - 16  Active Thread - Only used with HT enabled.
 *           00 - None: Count when neither LP is active.
 *           01 - Single: Count when only one LP is active.
 *           10 - Both: Count when both LPs are active.
 *           11 - Any: Count when either LP is active.
 *  15 - 13  ESCR Select - Select which ESCR to use for selecting the
 *           event to count.
 *  12       Enable - Turns the data counter on or off.
 *  11 - 0   Reserved
 **/

typedef union {
	unsigned long long val;
	struct {
		unsigned long reserved1:12;
		unsigned long enable:1;
		unsigned long escr_select:3;
		unsigned long active_thread:2;
		unsigned long compare:1;
		unsigned long complement:1;
		unsigned long threshold:4;
		unsigned long edge:1;
		unsigned long force_ovf:1;
		unsigned long ovf_pmi_t0:1;
		unsigned long ovf_pmi_t1:1;
		unsigned long reserved2:2;
		unsigned long cascade:1;
		unsigned long overflow:1;
	} bits;
} netburst_cccr_value_t;

/**
 * netburst_event_mask_t
 *
 * Defines one bit of the event-mask for one Pentium4 event.
 *
 * @name: Event mask name
 * @desc: Event mask description
 * @bit: The bit position within the event_mask field.
 **/
typedef struct {
	const char *name;
	const char *desc;
	unsigned int bit;
	unsigned int flags;
} netburst_event_mask_t;

/*
 * netburst_event_mask_t->flags
 */
#define NETBURST_FL_DFL 0x1 /* event mask is default */

#define MAX_ESCRS_PER_EVENT 2

/*
 * These are the unique event codes used by perf_events.
* They need to be encoded in the ESCR.event_select field when * programming for perf_events */ enum netburst_events { P4_EVENT_TC_DELIVER_MODE, P4_EVENT_BPU_FETCH_REQUEST, P4_EVENT_ITLB_REFERENCE, P4_EVENT_MEMORY_CANCEL, P4_EVENT_MEMORY_COMPLETE, P4_EVENT_LOAD_PORT_REPLAY, P4_EVENT_STORE_PORT_REPLAY, P4_EVENT_MOB_LOAD_REPLAY, P4_EVENT_PAGE_WALK_TYPE, P4_EVENT_BSQ_CACHE_REFERENCE, P4_EVENT_IOQ_ALLOCATION, P4_EVENT_IOQ_ACTIVE_ENTRIES, P4_EVENT_FSB_DATA_ACTIVITY, P4_EVENT_BSQ_ALLOCATION, P4_EVENT_BSQ_ACTIVE_ENTRIES, P4_EVENT_SSE_INPUT_ASSIST, P4_EVENT_PACKED_SP_UOP, P4_EVENT_PACKED_DP_UOP, P4_EVENT_SCALAR_SP_UOP, P4_EVENT_SCALAR_DP_UOP, P4_EVENT_64BIT_MMX_UOP, P4_EVENT_128BIT_MMX_UOP, P4_EVENT_X87_FP_UOP, P4_EVENT_TC_MISC, P4_EVENT_GLOBAL_POWER_EVENTS, P4_EVENT_TC_MS_XFER, P4_EVENT_UOP_QUEUE_WRITES, P4_EVENT_RETIRED_MISPRED_BRANCH_TYPE, P4_EVENT_RETIRED_BRANCH_TYPE, P4_EVENT_RESOURCE_STALL, P4_EVENT_WC_BUFFER, P4_EVENT_B2B_CYCLES, P4_EVENT_BNR, P4_EVENT_SNOOP, P4_EVENT_RESPONSE, P4_EVENT_FRONT_END_EVENT, P4_EVENT_EXECUTION_EVENT, P4_EVENT_REPLAY_EVENT, P4_EVENT_INSTR_RETIRED, P4_EVENT_UOPS_RETIRED, P4_EVENT_UOP_TYPE, P4_EVENT_BRANCH_RETIRED, P4_EVENT_MISPRED_BRANCH_RETIRED, P4_EVENT_X87_ASSIST, P4_EVENT_MACHINE_CLEAR, P4_EVENT_INSTR_COMPLETED, }; typedef struct { const char *name; const char *desc; unsigned int event_select; unsigned int escr_select; enum netburst_events perf_code; /* perf_event event code, enum P4_EVENTS */ int allowed_escrs[MAX_ESCRS_PER_EVENT]; netburst_event_mask_t event_masks[EVENT_MASK_BITS]; } netburst_entry_t; #define NETBURST_ATTR_U 0 #define NETBURST_ATTR_K 1 #define NETBURST_ATTR_C 2 #define NETBURST_ATTR_E 3 #define NETBURST_ATTR_T 4 #define _NETBURST_ATTR_U (1 << NETBURST_ATTR_U) #define _NETBURST_ATTR_K (1 << NETBURST_ATTR_K) #define P4_REPLAY_REAL_MASK 0x00000003 extern int pfm_netburst_get_encoding(void *this, pfmlib_event_desc_t *e); extern int pfm_netburst_get_perf_encoding(void *this, pfmlib_event_desc_t *e); extern void
pfm_netburst_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e); #endif papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_nhm.c000066400000000000000000000143611502707512200222110ustar00rootroot00000000000000/* * pfmlib_intel_nhm.c : Intel Nehalem core PMU * * Copyright (c) 2008 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* * Nehalem PMU = architectural perfmon v3 + OFFCORE + PEBS v2 + LBR */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #if 0 static int pfm_nhm_lbr_encode(void *this, pfmlib_event_desc_t *e, uint64_t *codes, int *count, pfmlib_perf_attr_t *attrs); static int pfm_nhm_offcore_encode(void *this, pfmlib_event_desc_t *e, uint64_t *codes, int *count, pfmlib_perf_attr_t *attrs); #endif #include "events/intel_nhm_events.h" static const int nhm_models[] = { 26, 30, 31, 0 }; static const int nhm_ex_models[] = { 46, 0 }; static int pfm_nhm_init(void *this) { pfm_intel_x86_cfg.arch_version = 3; return PFM_SUCCESS; } /* * the following function implement the model * specific API directly available to user */ static const char *data_src_encodings[]={ /* 0 */ "unknown L3 cache miss", /* 1 */ "minimal latency core cache hit. Request was satisfied by L1 data cache", /* 2 */ "pending core cache HIT. Outstanding core cache miss to same cacheline address already underway", /* 3 */ "data request satisfied by the L2", /* 4 */ "L3 HIT. Local or remote home request that hit L3 in the uncore with no coherency actions required (snooping)", /* 5 */ "L3 HIT. Local or remote home request that hit L3 and was serviced by another core with a cross core snoop where no modified copy was found (clean)", /* 6 */ "L3 HIT. Local or remote home request that hit L3 and was serviced by another core with a cross core snoop where modified copies were found (HITM)", /* 7 */ "reserved", /* 8 */ "L3 MISS. Local homed request that missed L3 and was serviced by forwarded data following a cross package snoop where no modified copy was found (remote home requests are not counted)", /* 9 */ "reserved", /* 10 */ "L3 MISS. Local homed request that missed L3 and was serviced by local DRAM (go to shared state)", /* 11 */ "L3 MISS. Remote homed request that missed L3 and was serviced by remote DRAM (go to shared state)", /* 12 */ "L3 MISS. 
Local homed request that missed L3 and was serviced by local DRAM (go to exclusive state)", /* 13 */ "L3 MISS. Remote homed request that missed L3 and was serviced by remote DRAM (go to exclusive state)", /* 14 */ "reserved", /* 15 */ "request to uncacheable memory" }; /* * return data source encoding based on index in val * To be used with PEBS load latency filtering to decode * source of the load miss */ const char * pfm_nhm_data_src_desc(int val) { if (val > 15 || val < 0) return NULL; return data_src_encodings[val]; } #if 0 static int pfm_nhm_lbr_encode(void *this, pfmlib_event_desc_t *e, uint64_t *codes, int *count, pfmlib_perf_attr_t *attrs) { return PFM_ERR_NOTSUPP; } static int pfm_nhm_offcore_encode(void *this, pfmlib_event_desc_t *e, uint64_t *codes, int *count, pfmlib_perf_attr_t *attrs) { return PFM_ERR_NOTSUPP; } #endif pfmlib_pmu_t intel_nhm_support={ .desc = "Intel Nehalem", .name = "nhm", .pmu = PFM_PMU_INTEL_NHM, .pme_count = LIBPFM_ARRAY_SIZE(intel_nhm_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 4, .num_fixed_cntrs = 3, .max_encoding = 2, /* because of OFFCORE_RESPONSE */ .pe = intel_nhm_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = nhm_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_nhm_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, }; pfmlib_pmu_t intel_nhm_ex_support={ .desc = "Intel Nehalem 
EX", .name = "nhm_ex", .pmu = PFM_PMU_INTEL_NHM_EX, .pme_count = LIBPFM_ARRAY_SIZE(intel_nhm_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 4, .num_fixed_cntrs = 3, .max_encoding = 2, /* because of OFFCORE_RESPONSE */ .pe = intel_nhm_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = nhm_ex_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_nhm_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_nhm_unc.c000066400000000000000000000237451502707512200230640ustar00rootroot00000000000000/* * pfmlib_intel_nhm_unc.c : Intel Nehalem/Westmere uncore PMU * * Copyright (c) 2008 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #define NHM_UNC_ATTR_E 0 #define NHM_UNC_ATTR_I 1 #define NHM_UNC_ATTR_C 2 #define NHM_UNC_ATTR_O 3 #define _NHM_UNC_ATTR_I (1 << NHM_UNC_ATTR_I) #define _NHM_UNC_ATTR_E (1 << NHM_UNC_ATTR_E) #define _NHM_UNC_ATTR_C (1 << NHM_UNC_ATTR_C) #define _NHM_UNC_ATTR_O (1 << NHM_UNC_ATTR_O) #define NHM_UNC_ATTRS \ (_NHM_UNC_ATTR_I|_NHM_UNC_ATTR_E|_NHM_UNC_ATTR_C|_NHM_UNC_ATTR_O) #define NHM_UNC_MOD_OCC_BIT 17 #define NHM_UNC_MOD_EDGE_BIT 18 #define NHM_UNC_MOD_INV_BIT 23 #define NHM_UNC_MOD_CMASK_BIT 24 #define NHM_UNC_MOD_OCC (1 << NHM_UNC_MOD_OCC_BIT) #define NHM_UNC_MOD_EDGE (1 << NHM_UNC_MOD_EDGE_BIT) #define NHM_UNC_MOD_INV (1 << NHM_UNC_MOD_INV_BIT) /* Intel Nehalem/Westmere uncore event table */ #include "events/intel_nhm_unc_events.h" #include "events/intel_wsm_unc_events.h" static const pfmlib_attr_desc_t nhm_unc_mods[]={ PFM_ATTR_B("e", "edge level"), /* edge */ PFM_ATTR_B("i", "invert"), /* invert */ PFM_ATTR_I("c", "counter-mask in range [0-255]"), /* counter-mask */ PFM_ATTR_B("o", "queue occupancy"), /* queue occupancy */ PFM_ATTR_NULL }; static const int nhm_models[] = { 26, 30, 31, 0 }; static const int wsm_dp_models[] = { 44, /* Westmere-EP, Gulftown */ 47, /* Westmere E7 */ 0, }; static int pfm_nhm_unc_get_encoding(void *this, pfmlib_event_desc_t *e) { pfm_intel_x86_reg_t reg; pfmlib_event_attr_info_t *a; const intel_x86_entry_t *pe = this_pe(this); unsigned int 
grpmsk, ugrpmsk = 0; int umodmsk = 0, modmsk_r = 0; uint64_t val; uint64_t umask; unsigned int modhw = 0; int k, ret, grpid, last_grpid = -1; int grpcounts[INTEL_X86_NUM_GRP]; int ncombo[INTEL_X86_NUM_GRP]; char umask_str[PFMLIB_EVT_MAX_NAME_LEN]; memset(grpcounts, 0, sizeof(grpcounts)); memset(ncombo, 0, sizeof(ncombo)); umask_str[0] = e->fstr[0] = '\0'; reg.val = 0; val = pe[e->event].code; grpmsk = (1 << pe[e->event].ngrp)-1; reg.val |= val; /* preset some filters from code */ /* take into account hardcoded umask */ umask = (val >> 8) & 0xff; modmsk_r = pe[e->event].modmsk_req; for(k=0; k < e->nattrs; k++) { a = attr(e, k); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) { grpid = pe[e->event].umasks[a->idx].grpid; /* * cfor certain events groups are meant to be * exclusive, i.e., only unit masks of one group * can be used */ if (last_grpid != -1 && grpid != last_grpid && intel_x86_eflag(this, e->event, INTEL_X86_GRP_EXCL)) { DPRINT("exclusive unit mask group error\n"); return PFM_ERR_FEATCOMB; } /* * upper layer has removed duplicates * so if we come here more than once, it is for two * disinct umasks * * NCOMBO=no combination of unit masks within the same * umask group */ ++grpcounts[grpid]; if (intel_x86_uflag(this, e->event, a->idx, INTEL_X86_NCOMBO)) ncombo[grpid] = 1; if (grpcounts[grpid] > 1 && ncombo[grpid]) { DPRINT("event does not support unit mask combination within a group\n"); return PFM_ERR_FEATCOMB; } evt_strcat(umask_str, ":%s", pe[e->event].umasks[a->idx].uname); last_grpid = grpid; modhw |= pe[e->event].umasks[a->idx].modhw; umask |= pe[e->event].umasks[a->idx].ucode >> 8; ugrpmsk |= 1 << pe[e->event].umasks[a->idx].grpid; reg.val |= umask << 8; modmsk_r |= pe[e->event].umasks[a->idx].umodmsk_req; } else if (a->type == PFM_ATTR_RAW_UMASK) { /* there can only be one RAW_UMASK per event */ /* sanity check */ if (a->idx & ~0xff) { DPRINT("raw umask is 8-bit wide\n"); return PFM_ERR_ATTR; } /* override umask */ umask = 
a->idx & 0xff; ugrpmsk = grpmsk; } else { uint64_t ival = e->attrs[k].ival; switch(a->idx) { case NHM_UNC_ATTR_I: /* invert */ reg.nhm_unc.usel_inv = !!ival; umodmsk |= _NHM_UNC_ATTR_I; break; case NHM_UNC_ATTR_E: /* edge */ reg.nhm_unc.usel_edge = !!ival; umodmsk |= _NHM_UNC_ATTR_E; break; case NHM_UNC_ATTR_C: /* counter-mask */ /* already forced, cannot overwrite */ if (ival > 255) return PFM_ERR_INVAL; reg.nhm_unc.usel_cnt_mask = ival; umodmsk |= _NHM_UNC_ATTR_C; break; case NHM_UNC_ATTR_O: /* occupancy */ reg.nhm_unc.usel_occ = !!ival; umodmsk |= _NHM_UNC_ATTR_O; break; } } } if ((modhw & _NHM_UNC_ATTR_I) && reg.nhm_unc.usel_inv) return PFM_ERR_ATTR_SET; if ((modhw & _NHM_UNC_ATTR_E) && reg.nhm_unc.usel_edge) return PFM_ERR_ATTR_SET; if ((modhw & _NHM_UNC_ATTR_C) && reg.nhm_unc.usel_cnt_mask) return PFM_ERR_ATTR_SET; if ((modhw & _NHM_UNC_ATTR_O) && reg.nhm_unc.usel_occ) return PFM_ERR_ATTR_SET; /* * check that there is at least of unit mask in each unit * mask group */ if ((ugrpmsk != grpmsk && !intel_x86_eflag(this, e->event, INTEL_X86_GRP_EXCL)) || ugrpmsk == 0) { ugrpmsk ^= grpmsk; ret = pfm_intel_x86_add_defaults(this, e, ugrpmsk, &umask, (unsigned short) -1, -1); if (ret != PFM_SUCCESS) return ret; } if (modmsk_r && (umodmsk ^ modmsk_r)) { DPRINT("required modifiers missing: 0x%x\n", modmsk_r); return PFM_ERR_ATTR; } evt_strcat(e->fstr, "%s", pe[e->event].name); pfmlib_sort_attr(e); for(k=0; k < e->nattrs; k++) { a = attr(e, k); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) evt_strcat(e->fstr, ":%s", pe[e->event].umasks[a->idx].uname); else if (a->type == PFM_ATTR_RAW_UMASK) evt_strcat(e->fstr, ":0x%x", a->idx); } reg.val |= umask << 8; reg.nhm_unc.usel_en = 1; /* force enable bit to 1 */ reg.nhm_unc.usel_int = 1; /* force APIC int to 1 */ e->codes[0] = reg.val; e->count = 1; for (k = 0; k < e->npattrs; k++) { int idx; if (e->pattrs[k].ctrl != PFM_ATTR_CTRL_PMU) continue; if (e->pattrs[k].type == PFM_ATTR_UMASK) continue; idx 
= e->pattrs[k].idx; switch(idx) { case NHM_UNC_ATTR_E: evt_strcat(e->fstr, ":%s=%lu", nhm_unc_mods[idx].name, reg.nhm_unc.usel_edge); break; case NHM_UNC_ATTR_I: evt_strcat(e->fstr, ":%s=%lu", nhm_unc_mods[idx].name, reg.nhm_unc.usel_inv); break; case NHM_UNC_ATTR_C: evt_strcat(e->fstr, ":%s=%lu", nhm_unc_mods[idx].name, reg.nhm_unc.usel_cnt_mask); break; case NHM_UNC_ATTR_O: evt_strcat(e->fstr, ":%s=%lu", nhm_unc_mods[idx].name, reg.nhm_unc.usel_occ); break; } } __pfm_vbprintf("[UNC_PERFEVTSEL=0x%"PRIx64" event=0x%x umask=0x%x en=%d int=%d inv=%d edge=%d occ=%d cnt_msk=%d] %s\n", reg.val, reg.nhm_unc.usel_event, reg.nhm_unc.usel_umask, reg.nhm_unc.usel_en, reg.nhm_unc.usel_int, reg.nhm_unc.usel_inv, reg.nhm_unc.usel_edge, reg.nhm_unc.usel_occ, reg.nhm_unc.usel_cnt_mask, pe[e->event].name); return PFM_SUCCESS; } pfmlib_pmu_t intel_nhm_unc_support={ .desc = "Intel Nehalem uncore", .name = "nhm_unc", .perf_name = "uncore", .pmu = PFM_PMU_INTEL_NHM_UNC, .pme_count = LIBPFM_ARRAY_SIZE(intel_nhm_unc_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 8, .num_fixed_cntrs = 1, .max_encoding = 1, .pe = intel_nhm_unc_pe, .atdesc = nhm_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .cpu_family = 6, .cpu_models = nhm_models, .pmu_detect = pfm_intel_x86_model_detect, .get_event_encoding[PFM_OS_NONE] = pfm_nhm_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_nhm_unc_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; pfmlib_pmu_t intel_wsm_unc_support={ .desc = "Intel Westmere uncore", .name = "wsm_unc", .perf_name = "uncore", .pmu = PFM_PMU_INTEL_WSM_UNC, .pme_count = LIBPFM_ARRAY_SIZE(intel_wsm_unc_pe), 
.type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 8, .num_fixed_cntrs = 1, .max_encoding = 1, .pe = intel_wsm_unc_pe, .atdesc = nhm_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .cpu_family = 6, .cpu_models = wsm_dp_models, .pmu_detect = pfm_intel_x86_model_detect, .get_event_encoding[PFM_OS_NONE] = pfm_nhm_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_nhm_unc_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_p6.c000066400000000000000000000140401502707512200217460ustar00rootroot00000000000000/* * pfmlib_i386_p6.c : support for the P6 processor family (family=6) * incl. Pentium II, Pentium III, Pentium Pro, Pentium M * * Copyright (c) 2005-2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_intel_x86_priv.h" /* architecture private */ #include "events/intel_p6_events.h" /* generic P6 (PIII) */ #include "events/intel_pii_events.h" /* Pentium II */ #include "events/intel_ppro_events.h" /* Pentium Pro */ #include "events/intel_pm_events.h" /* Pentium M */ static const int pii_models[] = { 3, /* Pentium II */ 5, /* Pentium II Deschutes */ 6, /* Pentium II Mendocino */ 0 }; static const int ppro_models[] = { 1, /* Pentium Pro */ 0 }; static const int piii_models[] = { 7, /* Pentium III Katmai */ 8, /* Pentium III Coppermine */ 10,/* Pentium III Cascades */ 11,/* Pentium III Tualatin */ 0 }; static const int pm_models[] = { 9, /* Pentium M */ 13, /* Pentium III Coppermine */ 0 }; /* Pentium II support */ pfmlib_pmu_t intel_pii_support={ .desc = "Intel Pentium II", .name = "pii", .pmu = PFM_PMU_INTEL_PII, .pme_count = LIBPFM_ARRAY_SIZE(intel_pii_pe), .pe = intel_pii_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .cpu_family = 6, .cpu_models = pii_models, .pmu_detect = pfm_intel_x86_model_detect, .num_cntrs = 2, .max_encoding = 1, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = 
pfm_intel_x86_get_event_nattrs, }; pfmlib_pmu_t intel_p6_support={ .desc = "Intel P6 Processor Family", .name = "p6", .pmu = PFM_PMU_I386_P6, .pme_count = LIBPFM_ARRAY_SIZE(intel_p6_pe), .pe = intel_p6_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .cpu_family = 6, .cpu_models = piii_models, .pmu_detect = pfm_intel_x86_model_detect, .num_cntrs = 2, .max_encoding = 1, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; pfmlib_pmu_t intel_ppro_support={ .desc = "Intel Pentium Pro", .name = "ppro", .pmu = PFM_PMU_INTEL_PPRO, .pme_count = LIBPFM_ARRAY_SIZE(intel_ppro_pe), .pe = intel_ppro_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .cpu_family = 6, .cpu_models = ppro_models, .pmu_detect = pfm_intel_x86_model_detect, .num_cntrs = 2, .max_encoding = 1, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; /* Pentium M support */ pfmlib_pmu_t intel_pm_support={ .desc = "Intel 
Pentium M", .name = "pm", .pmu = PFM_PMU_I386_PM, .pe = intel_pm_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .supported_plm = INTEL_X86_PLM, .cpu_family = 6, .cpu_models = pm_models, .pmu_detect = pfm_intel_x86_model_detect, .pme_count = LIBPFM_ARRAY_SIZE(intel_pm_pe), .type = PFM_PMU_TYPE_CORE, .num_cntrs = 2, .max_encoding = 1, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_rapl.c000066400000000000000000000146511502707512200223670ustar00rootroot00000000000000/* * pfmlib_intel_rapl.c : Intel RAPL PMU * * Copyright (c) 2013 Google, Inc * Contributed by Stephane Eranian * * Based on: * Copyright (c) 2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * RAPL PMU (SNB, IVB, HSW) */ /* private headers */ #include "pfmlib_priv.h" /* * for now, we reuse the x86 table entry format and callback to avoid duplicating * code. We may revisit this later on */ #include "pfmlib_intel_x86_priv.h" extern pfmlib_pmu_t intel_rapl_support; #define RAPL_COMMON_EVENTS \ { .name = "RAPL_ENERGY_CORES",\ .desc = "Number of Joules consumed by all cores on the package. Unit is 2^-32 Joules",\ .cntmsk = 0x1,\ .code = 0x1,\ },\ { .name = "RAPL_ENERGY_PKG",\ .desc = "Number of Joules consumed by all cores and Last level cache on the package. Unit is 2^-32 Joules",\ .cntmsk = 0x2,\ .code = 0x2,\ } static const intel_x86_entry_t intel_rapl_cln_pe[]={ RAPL_COMMON_EVENTS, { .name = "RAPL_ENERGY_GPU", .desc = "Number of Joules consumed by the builtin GPU. Unit is 2^-32 Joules", .cntmsk = 0x8, .code = 0x4, } }; static const intel_x86_entry_t intel_rapl_skl_cln_pe[]={ RAPL_COMMON_EVENTS, { .name = "RAPL_ENERGY_GPU", .desc = "Number of Joules consumed by the builtin GPU. Unit is 2^-32 Joules", .cntmsk = 0x8, .code = 0x4, }, { .name = "RAPL_ENERGY_PSYS", .desc = "Number of Joules consumed by the builtin PSYS. Unit is 2^-32 Joules", .cntmsk = 0x8, .code = 0x5, } }; static const intel_x86_entry_t intel_rapl_srv_pe[]={ RAPL_COMMON_EVENTS, { .name = "RAPL_ENERGY_DRAM", .desc = "Number of Joules consumed by the DRAM. 
Unit is 2^-32 Joules", .cntmsk = 0x4, .code = 0x3, }, }; static const intel_x86_entry_t intel_rapl_hswep_pe[]={ /* * RAPL_ENERGY_CORES not supported in HSW-EP */ { .name = "RAPL_ENERGY_PKG", .desc = "Number of Joules consumed by all cores and Last level cache on the package. Unit is 2^-32 Joules", .cntmsk = 0x2, .code = 0x2, }, { .name = "RAPL_ENERGY_DRAM", .desc = "Number of Joules consumed by the DRAM. Unit is 2^-32 Joules", .cntmsk = 0x4, .code = 0x3, }, }; static int pfm_rapl_detect(void *this) { int ret; ret = pfm_intel_x86_detect(); if (ret != PFM_SUCCESS) return ret; if (pfm_intel_x86_cfg.family != 6) return PFM_ERR_NOTSUPP; switch(pfm_intel_x86_cfg.model) { case 42: /* Sandy Bridge */ case 58: /* Ivy Bridge */ case 60: /* Haswell */ case 69: /* Haswell */ case 70: /* Haswell */ case 61: /* Broadwell */ case 71: /* Broadwell GT3E */ case 92: /* Goldmont */ case 95: /* Denverton */ case 102: /* Cannonlake */ case 122: /* Goldmont Plus */ /* already setup by default */ break; case 45: /* Sandy Bridg-EP */ case 62: /* Ivy Bridge-EP */ intel_rapl_support.pe = intel_rapl_srv_pe; intel_rapl_support.pme_count = LIBPFM_ARRAY_SIZE(intel_rapl_srv_pe); break; case 78: /* Skylake */ case 94: /* Skylake H/S */ case 142: /* Kabylake */ case 158: /* Kabylake */ case 165: /* CometLake mobile */ case 166: /* CometLake */ case 125: /* Icelake */ case 126: /* Icelake mobile */ case 157: /* Icelake NNPI */ intel_rapl_support.pe = intel_rapl_skl_cln_pe; intel_rapl_support.pme_count = LIBPFM_ARRAY_SIZE(intel_rapl_skl_cln_pe); break; case 63: /* Haswell-EP */ case 79: /* Broadwell-EP */ case 86: /* Broadwell D */ case 85: /* Skylake X */ case 106:/* IcelakeX */ case 108:/* IcelakeD */ case 143:/* SapphireRapidX */ intel_rapl_support.pe = intel_rapl_hswep_pe; intel_rapl_support.pme_count = LIBPFM_ARRAY_SIZE(intel_rapl_hswep_pe); break; default : return PFM_ERR_NOTSUPP; } return PFM_SUCCESS; } static int pfm_intel_rapl_get_encoding(void *this, pfmlib_event_desc_t *e) { const 
intel_x86_entry_t *pe; pe = this_pe(this); e->fstr[0] = '\0'; e->codes[0] = pe[e->event].code; e->count = 1; evt_strcat(e->fstr, "%s", pe[e->event].name); __pfm_vbprintf("[0x%"PRIx64" event=0x%x] %s\n", e->codes[0], e->codes[0], e->fstr); return PFM_SUCCESS; } /* * number modifiers for RAPL * define an empty modifier to avoid firing the * sanity pfm_intel_x86_validate_table(). We are * using this function to avoid duplicating code. */ static const pfmlib_attr_desc_t rapl_mods[]= { { 0, } }; pfmlib_pmu_t intel_rapl_support={ .desc = "Intel RAPL", .name = "rapl", .perf_name = "power", .pmu = PFM_PMU_INTEL_RAPL, .pme_count = LIBPFM_ARRAY_SIZE(intel_rapl_cln_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 0, .num_fixed_cntrs = 3, .max_encoding = 1, .pe = intel_rapl_cln_pe, /* default, maybe updated */ .pmu_detect = pfm_rapl_detect, .atdesc = rapl_mods, .get_event_encoding[PFM_OS_NONE] = pfm_intel_rapl_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_skl.c000066400000000000000000000135101502707512200222130ustar00rootroot00000000000000/* * pfmlib_intel_skl.c : Intel Skylake core PMU * * Copyright (c) 2015 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, 
and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "events/intel_skl_events.h" static const int skl_models[] = { 78, /* Skylake mobile */ 94, /* Skylake desktop */ 142,/* KabyLake mobile */ 158,/* KabyLake desktop */ 165,/* CometLake mobile */ 166,/* CometLake */ 0 }; static const int skx_models[] = { 85, /* Skylake X */ 0 }; static int pfm_skx_detect(void *this) { int ret; /* Detect SKX model numbers (skx_models) */ ret = pfm_intel_x86_model_detect(this); if (ret != PFM_SUCCESS) return ret; /* SKX model with stepping < 5 */ return pfm_intel_x86_cfg.stepping < 5 ? PFM_SUCCESS : PFM_ERR_NOTSUPP; } static int pfm_clx_detect(void *this) { int ret; /* Detect SKX model numbers (skx_models) */ ret = pfm_intel_x86_model_detect(this); if (ret != PFM_SUCCESS) return ret; /* CLX is SKX model with stepping >= 5 */ return pfm_intel_x86_cfg.stepping >= 5 ? 
PFM_SUCCESS : PFM_ERR_NOTSUPP; } static int pfm_skl_init(void *this) { pfm_intel_x86_cfg.arch_version = 4; return PFM_SUCCESS; } pfmlib_pmu_t intel_skl_support={ .desc = "Intel Skylake", .name = "skl", .pmu = PFM_PMU_INTEL_SKL, .pme_count = LIBPFM_ARRAY_SIZE(intel_skl_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 8, /* consider with HT off by default */ .num_fixed_cntrs = 3, .max_encoding = 2, /* offcore_response */ .pe = intel_skl_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = skl_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_skl_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, .get_num_events = pfm_intel_x86_get_num_events, }; pfmlib_pmu_t intel_skx_support={ .desc = "Intel Skylake X", .name = "skx", .pmu = PFM_PMU_INTEL_SKX, .pme_count = LIBPFM_ARRAY_SIZE(intel_skl_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 8, /* consider with HT off by default */ .num_fixed_cntrs = 3, .max_encoding = 2, /* offcore_response */ .pe = intel_skl_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = skx_models, .pmu_detect = pfm_skx_detect, .pmu_init = pfm_skl_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, 
.get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, .get_num_events = pfm_intel_x86_get_num_events, }; pfmlib_pmu_t intel_clx_support={ .desc = "Intel CascadeLake X", .name = "clx", .pmu = PFM_PMU_INTEL_CLX, .pme_count = LIBPFM_ARRAY_SIZE(intel_skl_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 8, /* consider with HT off by default */ .num_fixed_cntrs = 3, .max_encoding = 2, /* offcore_response */ .pe = intel_skl_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = skx_models, .pmu_detect = pfm_clx_detect, .pmu_init = pfm_skl_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, .get_num_events = pfm_intel_x86_get_num_events, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_skx_unc_cha.c000066400000000000000000000103501502707512200237060ustar00rootroot00000000000000/* * pfmlib_intel_skx_unc_cha.c : Intel SKX CHA-Box uncore PMU * * Copyright (c) 2017 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software 
and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_skx_unc_cha_events.h" static void display_cha(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; pfm_snbep_unc_reg_t f; __pfm_vbprintf("[UNC_CHA=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d tid_en=%d] %s\n", reg->val, reg->cha.unc_event, reg->cha.unc_umask, reg->cha.unc_en, reg->cha.unc_inv, reg->cha.unc_edge, reg->cha.unc_thres, reg->cha.unc_tid, pe[e->event].name); if (e->count == 1) return; f.val = e->codes[1]; __pfm_vbprintf("[UNC_CHA_FILTER0=0x%"PRIx64" thread_id=%d source=0x%x state=0x%x]\n", f.val, f.skx_cha_filt0.tid, f.skx_cha_filt0.sid, f.skx_cha_filt0.state); if (e->count == 2) return; f.val = e->codes[2]; __pfm_vbprintf("[UNC_CHA_FILTER1=0x%"PRIx64" rem=%d loc=%d all_opc=%d nm=%d" " not_nm=%d opc0=0x%x opc1=0x%x nc=%d isoc=%d]\n",
f.val, f.skx_cha_filt1.rem, f.skx_cha_filt1.loc, f.skx_cha_filt1.all_opc, f.skx_cha_filt1.nm, f.skx_cha_filt1.not_nm, f.skx_cha_filt1.opc0, f.skx_cha_filt1.opc1, f.skx_cha_filt1.nc, f.skx_cha_filt1.isoc); } #define DEFINE_CHA(n) \ pfmlib_pmu_t intel_skx_unc_cha##n##_support = {\ .desc = "Intel SkylakeX CHA"#n" uncore",\ .name = "skx_unc_cha"#n,\ .perf_name = "uncore_cha_"#n,\ .pmu = PFM_PMU_INTEL_SKX_UNC_CHA##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_skx_unc_c_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 2,\ .pe = intel_skx_unc_c_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK|INTEL_PMU_FL_UNC_CHA,\ .pmu_detect = pfm_intel_skx_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ .display_reg = display_cha,\ } DEFINE_CHA(0); DEFINE_CHA(1); DEFINE_CHA(2); DEFINE_CHA(3); DEFINE_CHA(4); DEFINE_CHA(5); DEFINE_CHA(6); DEFINE_CHA(7); DEFINE_CHA(8); DEFINE_CHA(9); DEFINE_CHA(10); DEFINE_CHA(11); DEFINE_CHA(12); DEFINE_CHA(13); DEFINE_CHA(14); DEFINE_CHA(15); DEFINE_CHA(16); DEFINE_CHA(17); DEFINE_CHA(18); DEFINE_CHA(19); DEFINE_CHA(20); DEFINE_CHA(21); DEFINE_CHA(22); DEFINE_CHA(23); DEFINE_CHA(24); DEFINE_CHA(25); DEFINE_CHA(26); DEFINE_CHA(27); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_skx_unc_iio.c000066400000000000000000000064341502707512200237430ustar00rootroot00000000000000/* * 
pfmlib_intel_skx_unc_iio.c : Intel SkylakeX IIO uncore PMU * * Copyright (c) 2017 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_skx_unc_iio_events.h" static void display_iio(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_IIO=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d chmask=0x%x fcmsk=0x%x] %s\n", reg->val, reg->iio.unc_event, reg->iio.unc_umask, reg->iio.unc_en, reg->iio.unc_inv, reg->iio.unc_edge, reg->iio.unc_thres, reg->iio.unc_chmsk, reg->iio.unc_fcmsk, pe[e->event].name); /* no support for inbound and outbound bandwidth counters */ } #define DEFINE_IIO(n) \ pfmlib_pmu_t intel_skx_unc_iio##n##_support = {\ .desc = "Intel SkylakeX IIO"#n" uncore",\ .name = "skx_unc_iio"#n,\ .perf_name = "uncore_iio_"#n,\ .pmu = PFM_PMU_INTEL_SKX_UNC_IIO##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_skx_unc_iio_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 3, /* address matchers */\ .pe = intel_skx_unc_iio_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_skx_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .display_reg = display_iio,\ } DEFINE_IIO(0); DEFINE_IIO(1); DEFINE_IIO(2); DEFINE_IIO(3); DEFINE_IIO(4); DEFINE_IIO(5);
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_skx_unc_imc.c000066400000000000000000000053451502707512200237310ustar00rootroot00000000000000/* * pfmlib_intel_skx_unc_imc.c : Intel SkylakeX Integrated Memory Controller (IMC) uncore PMU * * Copyright (c) 2017 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_skx_unc_imc_events.h" #define DEFINE_IMC(n) \ pfmlib_pmu_t intel_skx_unc_imc##n##_support = { \ .desc = "Intel SkylakeX IMC"#n" uncore", \ .name = "skx_unc_imc"#n, \ .perf_name = "uncore_imc_"#n, \ .pmu = PFM_PMU_INTEL_SKX_UNC_IMC##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_skx_unc_m_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .num_cntrs = 4, \ .num_fixed_cntrs = 1, \ .max_encoding = 1, \ .pe = intel_skx_unc_m_pe, \ .atdesc = snbep_unc_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ .pmu_detect = pfm_intel_skx_unc_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first, \ .get_event_next = pfm_intel_x86_get_event_next, \ .event_is_valid = pfm_intel_x86_event_is_valid, \ .validate_table = pfm_intel_x86_validate_table, \ .get_event_info = pfm_intel_x86_get_event_info, \ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ }; DEFINE_IMC(0); DEFINE_IMC(1); DEFINE_IMC(2); DEFINE_IMC(3); DEFINE_IMC(4); DEFINE_IMC(5); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_skx_unc_irp.c000066400000000000000000000057001502707512200237500ustar00rootroot00000000000000/* * pfmlib_intel_skx_irp.c : Intel SkylakeX IRP uncore PMU * * Copyright (c) 2017 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or
sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_skx_unc_irp_events.h" static void display_irp(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_IRP=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "edge=%d thres=%d] %s\n", reg->val, reg->irp.unc_event, reg->irp.unc_umask, reg->irp.unc_en, reg->irp.unc_edge, reg->irp.unc_thres, pe[e->event].name); } pfmlib_pmu_t intel_skx_unc_irp_support = { .desc = "Intel SkylakeX IRP uncore", .name = "skx_unc_irp", .perf_name = "uncore_irp", .pmu = PFM_PMU_INTEL_SKX_UNC_IRP, .pme_count = LIBPFM_ARRAY_SIZE(intel_skx_unc_i_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, .max_encoding = 3, .pe = intel_skx_unc_i_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .pmu_detect = pfm_intel_skx_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next =
pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .display_reg = display_irp, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_skx_unc_m2m.c000066400000000000000000000061001502707512200236440ustar00rootroot00000000000000/* * pfmlib_intel_skx_m2m.c : Intel SkylakeX M2M uncore PMU * * Copyright (c) 2017 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_skx_unc_m2m_events.h" static void display_m2m(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_M2M=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "edge=%d thres=%d] %s\n", reg->val, reg->irp.unc_event, reg->irp.unc_umask, reg->irp.unc_en, reg->irp.unc_edge, reg->irp.unc_thres, pe[e->event].name); } #define DEFINE_M2M(n) \ pfmlib_pmu_t intel_skx_unc_m2m##n##_support = { \ .desc = "Intel SkylakeX M2M"#n" uncore", \ .name = "skx_unc_m2m"#n, \ .perf_name = "uncore_m2m_"#n, \ .pmu = PFM_PMU_INTEL_SKX_UNC_M2M##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_skx_unc_m2m_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .num_cntrs = 4, \ .num_fixed_cntrs = 0, \ .max_encoding = 1, \ .pe = intel_skx_unc_m2m_pe, \ .atdesc = snbep_unc_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK, \ .pmu_detect = pfm_intel_skx_unc_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first, \ .get_event_next = pfm_intel_x86_get_event_next, \ .event_is_valid = pfm_intel_x86_event_is_valid, \ .validate_table = pfm_intel_x86_validate_table, \ .get_event_info = pfm_intel_x86_get_event_info, \ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ .display_reg = display_m2m, \ }; DEFINE_M2M(0); DEFINE_M2M(1); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_skx_unc_m3upi.c000066400000000000000000000061571502707512200242220ustar00rootroot00000000000000/* * pfmlib_intel_skx_m3upi.c : Intel SkylakeX M3UPI uncore PMU * * Copyright (c) 2017 
Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_skx_unc_m3upi_events.h" static void display_m3upi(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_M3UPI=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } #define DEFINE_M3UPI(n) \ pfmlib_pmu_t intel_skx_unc_m3upi##n##_support = {\ .desc = "Intel SkylakeX M3UPI"#n" uncore", \ .name = "skx_unc_m3upi"#n,\ .perf_name = "uncore_m3upi_"#n, \ .pmu = PFM_PMU_INTEL_SKX_UNC_M3UPI##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_skx_unc_m3_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 1,\ .pe = intel_skx_unc_m3_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_skx_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .display_reg = display_m3upi,\ } DEFINE_M3UPI(0); DEFINE_M3UPI(1); DEFINE_M3UPI(2); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_skx_unc_pcu.c000066400000000000000000000067421502707512200237540ustar00rootroot00000000000000/* * pfmlib_intel_skx_unc_pcu.c : Intel SkylakeX 
Power Control Unit (PCU) uncore PMU * * Copyright (c) 2017 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_skx_unc_pcu_events.h" static void display_pcu(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; pfm_snbep_unc_reg_t f; __pfm_vbprintf("[UNC_PCU=0x%"PRIx64" event=0x%x sel_ext=%d occ_sel=0x%x en=%d " "edge=%d thres=%d occ_inv=%d occ_edge=%d] %s\n", reg->val, reg->ivbep_pcu.unc_event, reg->ivbep_pcu.unc_sel_ext, reg->ivbep_pcu.unc_occ, reg->ivbep_pcu.unc_en, reg->ivbep_pcu.unc_edge, reg->ivbep_pcu.unc_thres, reg->ivbep_pcu.unc_occ_inv, reg->ivbep_pcu.unc_occ_edge, pe[e->event].name); if (e->count == 1) return; f.val = e->codes[1]; __pfm_vbprintf("[UNC_PCU_FILTER=0x%"PRIx64" band0=%u band1=%u band2=%u band3=%u]\n", f.val, f.pcu_filt.filt0, f.pcu_filt.filt1, f.pcu_filt.filt2, f.pcu_filt.filt3); } pfmlib_pmu_t intel_skx_unc_pcu_support = { .desc = "Intel SkylakeX PCU uncore", .name = "skx_unc_pcu", .perf_name = "uncore_pcu", .pmu = PFM_PMU_INTEL_SKX_UNC_PCU, .pme_count = LIBPFM_ARRAY_SIZE(intel_skx_unc_p_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, .max_encoding = 2, .pe = intel_skx_unc_p_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_PMU_FL_UNC_OCC | PFMLIB_PMU_FL_NO_SMPL, .pmu_detect = pfm_intel_skx_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), 
.get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_snbep_unc_can_auto_encode, .display_reg = display_pcu, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_skx_unc_ubo.c000066400000000000000000000057471502707512200237560ustar00rootroot00000000000000/* * pfmlib_intel_skx_unc_ubo.c : Intel SkylakeX U-Box uncore PMU * * Copyright (c) 2017 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_skx_unc_ubo_events.h" static void display_ubo(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_UBO=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } pfmlib_pmu_t intel_skx_unc_ubo_support = { .desc = "Intel SkylakeX U-Box uncore", .name = "skx_unc_ubo", .perf_name = "uncore_ubox", .pmu = PFM_PMU_INTEL_SKX_UNC_UBOX, .pme_count = LIBPFM_ARRAY_SIZE(intel_skx_unc_u_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 2, .num_fixed_cntrs = 1, .max_encoding = 1, .pe = intel_skx_unc_u_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .pmu_detect = pfm_intel_skx_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .display_reg = display_ubo, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_skx_unc_upi.c000066400000000000000000000061661502707512200237620ustar00rootroot00000000000000/* * pfmlib_intel_skx_upi.c : Intel SkylakeX UPI uncore PMU * * Copyright (c) 2017 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person 
obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_skx_unc_upi_events.h" static void display_upi(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_UPI=0x%"PRIx64" event=0x%x sel_ext=%d umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->qpi.unc_event, reg->qpi.unc_event_ext, reg->qpi.unc_umask, reg->qpi.unc_en, reg->qpi.unc_inv, reg->qpi.unc_edge, reg->qpi.unc_thres, pe[e->event].name); } #define DEFINE_UPI(n) \ pfmlib_pmu_t intel_skx_unc_upi##n##_support = {\ .desc = "Intel SkylakeX UPI"#n" uncore",\ .name = "skx_unc_upi"#n,\ .perf_name = "uncore_upi_"#n,\ .pmu = PFM_PMU_INTEL_SKX_UNC_UPI##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_skx_unc_upi_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 3,\ .pe = 
intel_skx_unc_upi_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_skx_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .display_reg = display_upi,\ } DEFINE_UPI(0); DEFINE_UPI(1); DEFINE_UPI(2); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_slm.c000066400000000000000000000051531502707512200222210ustar00rootroot00000000000000/* * pfmlib_intel_slm.c : Intel Silvermont core PMU * * Copyright (c) 2013 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * Based on Intel Software Optimization Guide June 2013 */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "events/intel_slm_events.h" static const int slm_models[] = { 55, /* Silvermont */ 77, /* Silvermont Avoton */ 76, /* Airmont */ 0 }; static int pfm_intel_slm_init(void *this) { pfm_intel_x86_cfg.arch_version = 2; return PFM_SUCCESS; } pfmlib_pmu_t intel_slm_support={ .desc = "Intel Silvermont", .name = "slm", .pmu = PFM_PMU_INTEL_SLM, .pme_count = LIBPFM_ARRAY_SIZE(intel_slm_pe), .type = PFM_PMU_TYPE_CORE, .num_cntrs = 4, .num_fixed_cntrs = 3, .max_encoding = 2, .pe = intel_slm_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .supported_plm = INTEL_X86_PLM, .cpu_family = 6, .cpu_models = slm_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_intel_slm_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_snb.c000066400000000000000000000076231502707512200222140ustar00rootroot00000000000000/* * pfmlib_intel_snb.c : Intel Sandy Bridge core PMU * * Copyright (c) 2010 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this 
software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "events/intel_snb_events.h" static const int snb_models[] = { 42, /* Sandy Bridge (Core i7 26xx, 25xx) */ 0 }; static const int snb_ep_models[] = { 45, /* Sandy Bridge EP */ 0 }; static int pfm_snb_init(void *this) { pfm_intel_x86_cfg.arch_version = 3; return PFM_SUCCESS; } pfmlib_pmu_t intel_snb_support={ .desc = "Intel Sandy Bridge", .name = "snb", .pmu = PFM_PMU_INTEL_SNB, .pme_count = LIBPFM_ARRAY_SIZE(intel_snb_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 8, /* consider with HT off by default */ .num_fixed_cntrs = 3, .max_encoding = 2, /* offcore_response */ .pe = intel_snb_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = snb_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_snb_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), 
.get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, }; pfmlib_pmu_t intel_snb_ep_support={ .desc = "Intel Sandy Bridge EP", .name = "snb_ep", .pmu = PFM_PMU_INTEL_SNB_EP, .pme_count = LIBPFM_ARRAY_SIZE(intel_snb_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 8, /* consider with HT off by default */ .num_fixed_cntrs = 3, .max_encoding = 2, /* offcore_response */ .pe = intel_snb_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = snb_ep_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_snb_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_snb_unc.c000066400000000000000000000054621502707512200230600ustar00rootroot00000000000000/* * pfmlib_intel_snb_unc.c : Intel SandyBridge C-Box uncore PMU * * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated 
documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #define INTEL_SNB_UNC_ATTRS \ (_INTEL_X86_ATTR_I|_INTEL_X86_ATTR_E|_INTEL_X86_ATTR_C) #include "events/intel_snb_unc_events.h" static const int snb_models[] = { 42, /* Sandy Bridge (Core i7 26xx, 25xx) */ 0 }; #define SNB_UNC_CBOX(n, p) \ pfmlib_pmu_t intel_snb_unc_cbo##n##_support={ \ .desc = "Intel Sandy Bridge C-box"#n" uncore", \ .name = "snb_unc_cbo"#n, \ .perf_name = "uncore_cbox_"#n, \ .pmu = PFM_PMU_INTEL_SNB_UNC_CB##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_snb_unc_##p##_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .num_cntrs = 2, \ .num_fixed_cntrs = 1, \ .max_encoding = 1,\ .pe = intel_snb_unc_##p##_pe, \ .atdesc = intel_x86_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK\ | PFMLIB_PMU_FL_NO_SMPL,\ .cpu_family = 6,\ .cpu_models = snb_models, \ .pmu_detect = pfm_intel_x86_model_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, \ PFMLIB_ENCODE_PERF(pfm_intel_nhm_unc_get_perf_encoding), \ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ 
.get_event_first = pfm_intel_x86_get_event_first, \ .get_event_next = pfm_intel_x86_get_event_next, \ .event_is_valid = pfm_intel_x86_event_is_valid, \ .validate_table = pfm_intel_x86_validate_table, \ .get_event_info = pfm_intel_x86_get_event_info, \ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ } SNB_UNC_CBOX(0, cbo0); SNB_UNC_CBOX(1, cbo); SNB_UNC_CBOX(2, cbo); SNB_UNC_CBOX(3, cbo); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_snbep_unc.c000066400000000000000000000703701502707512200234050ustar00rootroot00000000000000/* * pfmlib_intel_snbep_unc.c : Intel SandyBridge-EP uncore PMU common code * * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" const pfmlib_attr_desc_t snbep_unc_mods[]={ PFM_ATTR_B("e", "edge detect"), /* edge */ PFM_ATTR_B("i", "invert"), /* invert */ PFM_ATTR_I("t", "threshold in range [0-255]"), /* threshold */ PFM_ATTR_I("t", "threshold in range [0-31]"), /* threshold */ PFM_ATTR_I("tf", "thread id filter [0-1]"), /* thread id */ PFM_ATTR_I("cf", "core id filter, includes non-thread data in bit 4 [0-15]"), /* core id (ivbep) */ PFM_ATTR_I("nf", "node id bitmask filter [0-255]"),/* nodeid mask filter0 */ PFM_ATTR_I("ff", "frequency >= 100Mhz * [0-255]"),/* freq filter */ PFM_ATTR_I("addr", "physical address matcher [40 bits]"),/* address matcher */ PFM_ATTR_I("nf", "node id bitmask filter [0-255]"),/* nodeid mask filter1 */ PFM_ATTR_B("isoc", "match isochronous requests"), /* isochronous */ PFM_ATTR_B("nc", "match non-coherent requests"), /* non-coherent */ PFM_ATTR_I("cf", "core id filter, includes non-thread data in bit 5 [0-63]"), /* core id (hswep) */ PFM_ATTR_I("tf", "thread id filter [0-3]"), /* thread id (skx)*/ PFM_ATTR_I("cf", "source id filter [0-63]"), /* src-id/core-id (skx) */ PFM_ATTR_B("loc", "match on local node target"), /* loc filter1 (skx) */ PFM_ATTR_B("rem", "match on remote node target"),/* rem filter1 (skx) */ PFM_ATTR_B("lmem", "local memory cacheable"), /* nm filter1 (skx) */ PFM_ATTR_B("rmem", "remote memory cacheable"), /* not_nm filter1 (skx) */ PFM_ATTR_I("dnid", "destination node id [0-15]"), /* SKX:UPI */ PFM_ATTR_I("rcsnid", "destination RCS Node id [0-15]"), /* SKX:UPI */ PFM_ATTR_I("t", "threshold in range [0-63]"), /* threshold */ PFM_ATTR_B("occ_i", "occupancy event invert"), /* invert occupancy event */ PFM_ATTR_B("occ_e", "occupancy event edge "), /* edge occupancy event */ PFM_ATTR_NULL }; int pfm_intel_snbep_unc_detect(void *this) { int ret; ret = pfm_intel_x86_detect(); if (ret 
!= PFM_SUCCESS) return ret; if (pfm_intel_x86_cfg.family != 6) return PFM_ERR_NOTSUPP; switch(pfm_intel_x86_cfg.model) { case 45: /* SandyBridge-EP */ break; default: return PFM_ERR_NOTSUPP; } return PFM_SUCCESS; } int pfm_intel_ivbep_unc_detect(void *this) { int ret; ret = pfm_intel_x86_detect(); if (ret != PFM_SUCCESS) return ret; if (pfm_intel_x86_cfg.family != 6) return PFM_ERR_NOTSUPP; switch(pfm_intel_x86_cfg.model) { case 62: /* IvyBridge-EP */ break; default: return PFM_ERR_NOTSUPP; } return PFM_SUCCESS; } int pfm_intel_hswep_unc_detect(void *this) { int ret; ret = pfm_intel_x86_detect(); if (ret != PFM_SUCCESS) return ret; if (pfm_intel_x86_cfg.family != 6) return PFM_ERR_NOTSUPP; switch(pfm_intel_x86_cfg.model) { case 63: /* Haswell-EP */ break; default: return PFM_ERR_NOTSUPP; } return PFM_SUCCESS; } int pfm_intel_knl_unc_detect(void *this) { int ret; ret = pfm_intel_x86_detect(); if (ret != PFM_SUCCESS) return ret; if (pfm_intel_x86_cfg.family != 6) return PFM_ERR_NOTSUPP; switch(pfm_intel_x86_cfg.model) { case 87: /* Knights Landing */ break; default: return PFM_ERR_NOTSUPP; } return PFM_SUCCESS; } int pfm_intel_knm_unc_detect(void *this) { int ret; ret = pfm_intel_x86_detect(); if (ret != PFM_SUCCESS) return ret; if (pfm_intel_x86_cfg.family != 6) return PFM_ERR_NOTSUPP; switch(pfm_intel_x86_cfg.model) { case 133: /* Knights Mill */ break; default: return PFM_ERR_NOTSUPP; } return PFM_SUCCESS; } int pfm_intel_bdx_unc_detect(void *this) { int ret; ret = pfm_intel_x86_detect(); if (ret != PFM_SUCCESS) return ret; if (pfm_intel_x86_cfg.family != 6) return PFM_ERR_NOTSUPP; switch(pfm_intel_x86_cfg.model) { case 79: /* Broadwell X */ case 86: /* Broadwell X */ break; default: return PFM_ERR_NOTSUPP; } return PFM_SUCCESS; } int pfm_intel_skx_unc_detect(void *this) { int ret; ret = pfm_intel_x86_detect(); if (ret != PFM_SUCCESS) return ret; if (pfm_intel_x86_cfg.family != 6) return PFM_ERR_NOTSUPP; switch(pfm_intel_x86_cfg.model) { case 85: /* Skylake X */ break; default: return PFM_ERR_NOTSUPP; } return PFM_SUCCESS; } int
pfm_intel_icx_unc_detect(void *this) { int ret; ret = pfm_intel_x86_detect(); if (ret != PFM_SUCCESS) return ret; if (pfm_intel_x86_cfg.family != 6) return PFM_ERR_NOTSUPP; switch(pfm_intel_x86_cfg.model) { case 106: /* Icelake X */ break; default: return PFM_ERR_NOTSUPP; } return PFM_SUCCESS; } int pfm_intel_spr_unc_detect(void *this) { int ret; ret = pfm_intel_x86_detect(); if (ret != PFM_SUCCESS) return ret; if (pfm_intel_x86_cfg.family != 6) return PFM_ERR_NOTSUPP; switch(pfm_intel_x86_cfg.model) { case 143: /* SapphireRapids */ break; case 207: /* EmeraldRapids */ break; default: return PFM_ERR_NOTSUPP; } return PFM_SUCCESS; } int pfm_intel_gnr_unc_detect(void *this) { int ret; ret = pfm_intel_x86_detect(); if (ret != PFM_SUCCESS) return ret; if (pfm_intel_x86_cfg.family != 6) return PFM_ERR_NOTSUPP; switch(pfm_intel_x86_cfg.model) { case 173: /* GraniteRapids X */ break; case 174: /* GraniteRapids D */ break; default: return PFM_ERR_NOTSUPP; } return PFM_SUCCESS; } static void display_com(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } static void display_reg(void *this, pfmlib_event_desc_t *e, pfm_snbep_unc_reg_t reg) { pfmlib_pmu_t *pmu = this; if (pmu->display_reg) pmu->display_reg(this, e, &reg); else display_com(this, e, &reg); } static inline int is_occ_event(void *this, int idx) { pfmlib_pmu_t *pmu = this; const intel_x86_entry_t *pe = this_pe(this); return (pmu->flags & INTEL_PMU_FL_UNC_OCC) && (pe[idx].code & 0x80); } static inline int get_pcu_filt_band(void *this, pfm_snbep_unc_reg_t reg) { #define PCU_FREQ_BAND0_CODE 0xb /* event code for UNC_P_FREQ_BAND0_CYCLES */ return reg.pcu.unc_event - PCU_FREQ_BAND0_CODE; } static inline void set_filters(void *this, pfm_snbep_unc_reg_t
*filters, int event, int umask) { const intel_x86_entry_t *pe = this_pe(this); filters[0].val |= pe[event].umasks[umask].ufilters[0] & ((1ULL << 32)-1); filters[0].val &= ~(pe[event].umasks[umask].ufilters[0] >> 32); filters[1].val |= pe[event].umasks[umask].ufilters[1] & ((1ULL << 32)-1); filters[1].val &= ~(pe[event].umasks[umask].ufilters[1] >>32); } int snbep_unc_add_defaults(void *this, pfmlib_event_desc_t *e, unsigned int msk, uint64_t *umask, pfm_snbep_unc_reg_t *filters, unsigned short max_grpid, int *numasks) { const intel_x86_entry_t *pe = this_pe(this); const intel_x86_entry_t *ent; unsigned int i; int j, k, added, skip; int idx; k = e->nattrs; ent = pe+e->event; for(i=0; msk; msk >>=1, i++) { if (!(msk & 0x1)) continue; added = skip = 0; for (j = 0; j < e->npattrs; j++) { if (e->pattrs[j].ctrl != PFM_ATTR_CTRL_PMU) continue; if (e->pattrs[j].type != PFM_ATTR_UMASK) continue; idx = e->pattrs[j].idx; if (get_grpid(ent->umasks[idx].grpid) != i) continue; if (max_grpid != INTEL_X86_MAX_GRPID && i > max_grpid) { skip = 1; continue; } if (intel_x86_uflag(this, e->event, idx, INTEL_X86_GRP_DFL_NONE)) { skip = 1; continue; } /* umask is default for group */ if (intel_x86_uflag(this, e->event, idx, INTEL_X86_DFL)) { DPRINT("added default %s for group %d j=%d idx=%d ucode=0x%"PRIx64"\n", ent->umasks[idx].uname, i, j, idx, ent->umasks[idx].ucode); /* * default could be an alias, but * ucode must reflect actual code */ *umask |= ent->umasks[idx].ucode >> 8; set_filters(this, filters, e->event, idx); e->attrs[k].id = j; /* pattrs index */ e->attrs[k].ival = 0; k++; (*numasks)++; added++; if (intel_x86_eflag(this, e->event, INTEL_X86_GRP_EXCL)) goto done; if (intel_x86_uflag(this, e->event, idx, INTEL_X86_EXCL_GRP_GT)) { if (max_grpid != INTEL_X86_MAX_GRPID) { DPRINT("two max_grpid, old=%d new=%d\n", max_grpid, get_grpid(ent->umasks[idx].grpid)); return PFM_ERR_UMASK; } max_grpid = get_grpid(ent->umasks[idx].grpid); } } } if (!added && !skip) { DPRINT("no default 
found for event %s unit mask group %d (max_grpid=%d, i=%d)\n", ent->name, i, max_grpid, i); return PFM_ERR_UMASK; } } DPRINT("max_grpid=%d nattrs=%d k=%d umask=0x%"PRIx64"\n", max_grpid, e->nattrs, k, *umask); done: e->nattrs = k; return PFM_SUCCESS; } /* * common encoding routine */ int pfm_intel_snbep_unc_get_encoding(void *this, pfmlib_event_desc_t *e) { const intel_x86_entry_t *pe = this_pe(this); unsigned int grpmsk, ugrpmsk = 0; unsigned short max_grpid = INTEL_X86_MAX_GRPID; unsigned short last_grpid = INTEL_X86_MAX_GRPID; unsigned short req_grpid; int umodmsk = 0, modmsk_r = 0; int pcu_filt_band = -1; pfm_snbep_unc_reg_t reg; pfm_snbep_unc_reg_t filters[INTEL_X86_MAX_FILTERS]; pfm_snbep_unc_reg_t addr; pfmlib_event_attr_info_t *a; uint64_t val, umask1, umask2; int k, ret, numasks = 0; int must_have_filt0 = 0; int max_req_grpid = -1; unsigned short grpid; int grpcounts[INTEL_X86_NUM_GRP]; int req_grps[INTEL_X86_NUM_GRP]; int ncombo[INTEL_X86_NUM_GRP]; char umask_str[PFMLIB_EVT_MAX_NAME_LEN]; memset(grpcounts, 0, sizeof(grpcounts)); memset(ncombo, 0, sizeof(ncombo)); memset(filters, 0, sizeof(filters)); addr.val = 0; umask_str[0] = e->fstr[0] = '\0'; reg.val = val = pe[e->event].code; /* take into account hardcoded umask */ umask1 = (val >> 8); umask2 = umask1; grpmsk = (1 << pe[e->event].ngrp)-1; modmsk_r = pe[e->event].modmsk_req; if (intel_x86_eflag(this, e->event, INTEL_X86_FORCE_FILT0)) must_have_filt0 = 1; for(k=0; k < e->nattrs; k++) { a = attr(e, k); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) { uint64_t um; grpid = get_grpid(pe[e->event].umasks[a->idx].grpid); req_grpid = get_req_grpid(pe[e->event].umasks[a->idx].grpid); /* * certain event groups are meant to be * exclusive, i.e., only unit masks of one group * can be used */ if (last_grpid != INTEL_X86_MAX_GRPID && grpid != last_grpid && intel_x86_eflag(this, e->event, INTEL_X86_GRP_EXCL)) { DPRINT("exclusive unit mask group error\n"); return PFM_ERR_FEATCOMB; } /* * 
selecting certain umasks in a group may exclude any umasks * from any groups with a higher index * * enforcement requires looking at the grpid of all the umasks */ if (intel_x86_uflag(this, e->event, a->idx, INTEL_X86_EXCL_GRP_GT)) max_grpid = grpid; /* * upper layer has removed duplicates * so if we come here more than once, it is for two * distinct umasks * * NCOMBO=no combination of unit masks within the same * umask group */ ++grpcounts[grpid]; if (intel_x86_uflag(this, e->event, a->idx, INTEL_X86_GRP_REQ)) { DPRINT("event requires grpid %d\n", req_grpid); /* initialize req_grpcounts array only when needed */ if (max_req_grpid == -1) { int x; for (x = 0; x < INTEL_X86_NUM_GRP; x++) req_grps[x] = 0xff; } if (req_grpid > max_req_grpid) max_req_grpid = req_grpid; DPRINT("max_req_grpid=%d\n", max_req_grpid); req_grps[req_grpid] = 1; } /* mark that we have a umask with NCOMBO in this group */ if (intel_x86_uflag(this, e->event, a->idx, INTEL_X86_NCOMBO)) ncombo[grpid] = 1; /* * if more than one umask in this group but one is marked * with ncombo, then fail.
It is okay to combine umask within * a group as long as none is tagged with NCOMBO */ if (grpcounts[grpid] > 1 && ncombo[grpid]) { DPRINT("umask %s does not support unit mask combination within group %d\n", pe[e->event].umasks[a->idx].uname, grpid); return PFM_ERR_FEATCOMB; } last_grpid = grpid; um = pe[e->event].umasks[a->idx].ucode; set_filters(this, filters, e->event, a->idx); um >>= 8; umask2 |= um; ugrpmsk |= 1 << grpid; /* PCU occ event */ if (is_occ_event(this, e->event)) { reg.pcu.unc_occ = umask2 >> 6; umask2 = 0; } else reg.val |= umask2 << 8; evt_strcat(umask_str, ":%s", pe[e->event].umasks[a->idx].uname); modmsk_r |= pe[e->event].umasks[a->idx].umodmsk_req; numasks++; } else if (a->type == PFM_ATTR_RAW_UMASK) { /* there can only be one RAW_UMASK per event */ /* sanity check */ if (a->idx & ~0xff) { DPRINT("raw umask is 8-bit wide\n"); return PFM_ERR_ATTR; } /* override umask */ umask2 = a->idx & 0xff; ugrpmsk = grpmsk; numasks++; } else { uint64_t ival = e->attrs[k].ival; switch(a->idx) { case SNBEP_UNC_ATTR_I: /* invert */ if (is_occ_event(this, e->event)) reg.pcu.unc_occ_inv = !!ival; else reg.com.unc_inv = !!ival; umodmsk |= _SNBEP_UNC_ATTR_I; break; case SNBEP_UNC_ATTR_E: /* edge */ if (is_occ_event(this, e->event)) reg.pcu.unc_occ_edge = !!ival; else reg.com.unc_edge = !!ival; umodmsk |= _SNBEP_UNC_ATTR_E; break; case SNBEP_UNC_ATTR_T8: /* counter-mask */ /* already forced, cannot overwrite */ if (ival > 255) return PFM_ERR_ATTR_VAL; reg.com.unc_thres = ival; umodmsk |= _SNBEP_UNC_ATTR_T8; break; case SNBEP_UNC_ATTR_T5: /* pcu counter-mask */ /* already forced, cannot overwrite */ if (ival > 31) return PFM_ERR_ATTR_VAL; reg.pcu.unc_thres = ival; umodmsk |= _SNBEP_UNC_ATTR_T5; break; case SNBEP_UNC_ATTR_T6: /* counter-mask */ /* already forced, cannot overwrite */ if (ival > 63) return PFM_ERR_ATTR_VAL; reg.com.unc_thres = ival; umodmsk |= _SNBEP_UNC_ATTR_T6; break; case SNBEP_UNC_ATTR_TF: /* thread id */ if (ival > 1) { DPRINT("invalid thread id, 
must be < 1"); return PFM_ERR_ATTR_VAL; } reg.cbo.unc_tid = 1; must_have_filt0 = 1; filters[0].cbo_filt.tid = ival; umodmsk |= _SNBEP_UNC_ATTR_TF; break; case SNBEP_UNC_ATTR_TF1: /* thread id skx */ if (ival > 7) return PFM_ERR_ATTR_VAL; reg.cha.unc_tid = 1; filters[0].skx_cha_filt0.tid = ival; /* includes non-thread data */ must_have_filt0 = 1; umodmsk |= _SNBEP_UNC_ATTR_TF1; break; case SNBEP_UNC_ATTR_CF: /* core id */ if (ival > 15) return PFM_ERR_ATTR_VAL; reg.cbo.unc_tid = 1; filters[0].cbo_filt.cid = ival; must_have_filt0 = 1; umodmsk |= _SNBEP_UNC_ATTR_CF; break; case SNBEP_UNC_ATTR_CF1: /* core id */ if (ival > 63) return PFM_ERR_ATTR_VAL; reg.cbo.unc_tid = 1; filters[0].hswep_cbo_filt0.cid = ival; /* includes non-thread data */ must_have_filt0 = 1; umodmsk |= _SNBEP_UNC_ATTR_CF1; break; case SNBEP_UNC_ATTR_NF: /* node id filter0 */ if (ival > 255 || ival == 0) { DPRINT("invalid nf, 0 < nf < 256\n"); return PFM_ERR_ATTR_VAL; } filters[0].cbo_filt.nid = ival; umodmsk |= _SNBEP_UNC_ATTR_NF; break; case SNBEP_UNC_ATTR_NF1: /* node id filter1 */ if (ival > 255 || ival == 0) { DPRINT("invalid nf, 0 < nf < 256\n"); return PFM_ERR_ATTR_VAL; } filters[1].ivbep_cbo_filt1.nid = ival; umodmsk |= _SNBEP_UNC_ATTR_NF1; break; case SNBEP_UNC_ATTR_CF2: /* src-id/core-id skx */ if (ival > 64) return PFM_ERR_ATTR_VAL; reg.cha.unc_tid = 1; filters[0].skx_cha_filt0.sid = ival; must_have_filt0 = 1; umodmsk |= _SNBEP_UNC_ATTR_CF2; break; case SNBEP_UNC_ATTR_FF: /* freq band filter */ if (ival > 255) return PFM_ERR_ATTR_VAL; pcu_filt_band = get_pcu_filt_band(this, reg); filters[0].val = ival << (pcu_filt_band * 8); umodmsk |= _SNBEP_UNC_ATTR_FF; break; case SNBEP_UNC_ATTR_A: /* addr filter */ if (ival & ~((1ULL << 40)-1)) { DPRINT("address filter 40bits max\n"); return PFM_ERR_ATTR_VAL; } addr.ha_addr.lo_addr = ival; /* LSB 26 bits */ addr.ha_addr.hi_addr = (ival >> 26) & ((1ULL << 14)-1); umodmsk |= _SNBEP_UNC_ATTR_A; break; case SNBEP_UNC_ATTR_ISOC: /* isoc filter */ 
filters[1].ivbep_cbo_filt1.isoc = !!ival; break; case SNBEP_UNC_ATTR_NC: /* nc filter */ filters[1].ivbep_cbo_filt1.nc = !!ival; break; case SNBEP_UNC_ATTR_LOC: /* local target skx */ filters[1].skx_cha_filt1.loc = !!ival; break; case SNBEP_UNC_ATTR_REM: /* remote target skx */ filters[1].skx_cha_filt1.rem = !!ival; break; case SNBEP_UNC_ATTR_LMEM: /* local memory skx */ filters[1].skx_cha_filt1.loc = !!ival; break; case SNBEP_UNC_ATTR_RMEM: /* remote memory skx */ filters[1].skx_cha_filt1.not_nm = !!ival; break; case SNBEP_UNC_ATTR_DNID: /* destination node id skx */ if (ival > 15) { DPRINT("dnid must be [0-15]\n"); return PFM_ERR_ATTR_VAL; } filters[0].skx_upi_filt.dnid = ival; filters[0].skx_upi_filt.en_dnidd = 1; break; case SNBEP_UNC_ATTR_RCSNID: /* RCS node id skx */ if (ival > 15) { DPRINT("rcsnid must be [0-15]\n"); return PFM_ERR_ATTR_VAL; } filters[0].skx_upi_filt.rcsnid = ival; filters[0].skx_upi_filt.en_rcsnid = 1; break; case SNBEP_UNC_ATTR_OCC_I: /* occ_i */ reg.icx_pcu.unc_occ_inv = !!ival; umodmsk |= _SNBEP_UNC_ATTR_OCC_I; break; case SNBEP_UNC_ATTR_OCC_E: /* occ_e */ reg.icx_pcu.unc_occ_edge = !!ival; umodmsk |= _SNBEP_UNC_ATTR_OCC_E; break; default: DPRINT("event %s invalid attribute %d\n", pe[e->event].name, a->idx); return PFM_ERR_ATTR; } } } /* check required groups are in place */ if (max_req_grpid != -1) { int x; for (x = 0; x <= max_req_grpid; x++) { if (req_grps[x] == 0xff) continue; if ((ugrpmsk & (1 << x)) == 0) { DPRINT("required grpid %d umask missing\n", x); return PFM_ERR_FEATCOMB; } } } /* * check that there is at least one unit mask in each unit mask group */ if (pe[e->event].numasks && (ugrpmsk != grpmsk || ugrpmsk == 0)) { uint64_t um = 0; ugrpmsk ^= grpmsk; ret =
snbep_unc_add_defaults(this, e, ugrpmsk, &um, filters, max_grpid, &numasks); if (ret != PFM_SUCCESS) return ret; umask2 |= um; } /* if event has umasks, then likely at least one must be set */ if (pe[e->event].numasks && numasks == 0) { DPRINT("event has umasks but none specified\n"); return PFM_ERR_ATTR; } /* * nf= is only required on some events in CBO */ if (!(modmsk_r & _SNBEP_UNC_ATTR_NF) && (umodmsk & _SNBEP_UNC_ATTR_NF)) { DPRINT("using nf= on an umask which does not require it\n"); return PFM_ERR_ATTR; } if (!(modmsk_r & _SNBEP_UNC_ATTR_NF1) && (umodmsk & _SNBEP_UNC_ATTR_NF1)) { DPRINT("using nf= on an umask which does not require it\n"); return PFM_ERR_ATTR; } if (modmsk_r && !(umodmsk & modmsk_r)) { DPRINT("required modifiers missing: 0x%x\n", modmsk_r); return PFM_ERR_ATTR; } /* * fixup filt1.all_opc based on values of the filter */ if (is_cha_filt_event(this, 1, reg)) { if (filters[1].val == 0) /* default value: rem=loc=nm=not_nm=all_opc=1 */ filters[1].val =0x3b; else if (filters[1].skx_cha_filt1.opc0 || filters[1].skx_cha_filt1.opc1) { /* enable opcode filtering */ filters[1].val &= ~(1ULL << 3); } } evt_strcat(e->fstr, "%s", pe[e->event].name); pfmlib_sort_attr(e); for(k = 0; k < e->nattrs; k++) { a = attr(e, k); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) evt_strcat(e->fstr, ":%s", pe[e->event].umasks[a->idx].uname); else if (a->type == PFM_ATTR_RAW_UMASK) evt_strcat(e->fstr, ":0x%x", a->idx); } DPRINT("umask2=0x%"PRIx64" umask1=0x%"PRIx64"\n", umask2, umask1); e->count = 0; reg.val |= (umask1 | umask2) << 8; e->codes[e->count++] = reg.val; /* * handles filters */ if (filters[0].val || filters[1].val || must_have_filt0) e->codes[e->count++] = filters[0].val; if (filters[1].val) e->codes[e->count++] = filters[1].val; /* HA address matcher */ if (addr.val) e->codes[e->count++] = addr.val; for (k = 0; k < e->npattrs; k++) { int idx; if (e->pattrs[k].ctrl != PFM_ATTR_CTRL_PMU) continue; if (e->pattrs[k].type == 
PFM_ATTR_UMASK) continue; idx = e->pattrs[k].idx; switch(idx) { case SNBEP_UNC_ATTR_E: if (is_occ_event(this, e->event)) evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, reg.pcu.unc_occ_edge); else evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, reg.com.unc_edge); break; case SNBEP_UNC_ATTR_I: if (is_occ_event(this, e->event)) evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, reg.pcu.unc_occ_inv); else evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, reg.com.unc_inv); break; case SNBEP_UNC_ATTR_T8: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, reg.com.unc_thres); break; case SNBEP_UNC_ATTR_T5: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, reg.pcu.unc_thres); break; case SNBEP_UNC_ATTR_TF: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, reg.cbo.unc_tid); break; case SNBEP_UNC_ATTR_TF1: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, filters[0].skx_cha_filt0.tid); break; case SNBEP_UNC_ATTR_CF: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, filters[0].cbo_filt.cid); break; case SNBEP_UNC_ATTR_CF1: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, filters[0].hswep_cbo_filt0.cid); break; case SNBEP_UNC_ATTR_CF2: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, filters[0].skx_cha_filt0.sid); break; case SNBEP_UNC_ATTR_FF: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, (filters[0].val >> (pcu_filt_band*8)) & 0xff); break; case SNBEP_UNC_ATTR_ISOC: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, filters[1].ivbep_cbo_filt1.isoc); break; case SNBEP_UNC_ATTR_NC: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, filters[1].ivbep_cbo_filt1.nc); break; case SNBEP_UNC_ATTR_NF: if (modmsk_r & _SNBEP_UNC_ATTR_NF) evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, filters[0].cbo_filt.nid); break; case SNBEP_UNC_ATTR_NF1: if (modmsk_r & _SNBEP_UNC_ATTR_NF1) evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, filters[1].ivbep_cbo_filt1.nid); break; 
case SNBEP_UNC_ATTR_A: evt_strcat(e->fstr, ":%s=0x%lx", snbep_unc_mods[idx].name, addr.ha_addr.hi_addr << 26 | addr.ha_addr.lo_addr); break; case SNBEP_UNC_ATTR_REM: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, filters[1].skx_cha_filt1.rem); break; case SNBEP_UNC_ATTR_LOC: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, filters[1].skx_cha_filt1.loc); break; case SNBEP_UNC_ATTR_RMEM: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, filters[1].skx_cha_filt1.rem); break; case SNBEP_UNC_ATTR_LMEM: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, filters[1].skx_cha_filt1.loc); break; case SNBEP_UNC_ATTR_DNID: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, filters[0].skx_upi_filt.dnid); break; case SNBEP_UNC_ATTR_RCSNID: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, filters[0].skx_upi_filt.rcsnid); break; case SNBEP_UNC_ATTR_T6: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, reg.icx_pcu.unc_thres); break; case SNBEP_UNC_ATTR_OCC_I: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, reg.icx_pcu.unc_occ_inv); break; case SNBEP_UNC_ATTR_OCC_E: evt_strcat(e->fstr, ":%s=%lu", snbep_unc_mods[idx].name, reg.icx_pcu.unc_occ_edge); break; default: DPRINT("unknown attribute %d for event %s\n", idx, pe[e->event].name); return PFM_ERR_ATTR; } } display_reg(this, e, reg); return PFM_SUCCESS; } int pfm_intel_snbep_unc_can_auto_encode(void *this, int pidx, int uidx) { if (intel_x86_eflag(this, pidx, INTEL_X86_NO_AUTOENCODE)) return 0; return !intel_x86_uflag(this, pidx, uidx, INTEL_X86_NO_AUTOENCODE); } int pfm_intel_snbep_unc_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info) { const intel_x86_entry_t *pe = this_pe(this); const pfmlib_attr_desc_t *atdesc = this_atdesc(this); int numasks, idx; numasks = intel_x86_num_umasks(this, pidx); if (attr_idx < numasks) { idx = intel_x86_attr2umask(this, pidx, attr_idx); info->name = pe[pidx].umasks[idx].uname; info->desc = 
pe[pidx].umasks[idx].udesc; info->equiv= pe[pidx].umasks[idx].uequiv; info->code = pe[pidx].umasks[idx].ucode; if (!intel_x86_uflag(this, pidx, idx, INTEL_X86_CODE_OVERRIDE)) info->code >>= 8; if (info->code == 0) info->code = pe[pidx].umasks[idx].ufilters[0]; info->type = PFM_ATTR_UMASK; info->is_dfl = intel_x86_uflag(this, pidx, idx, INTEL_X86_DFL); info->is_precise = intel_x86_uflag(this, pidx, idx, INTEL_X86_PEBS); } else { idx = intel_x86_attr2mod(this, pidx, attr_idx); info->name = atdesc[idx].name; info->desc = atdesc[idx].desc; info->type = atdesc[idx].type; info->equiv= NULL; info->code = idx; info->is_dfl = 0; info->is_precise = 0; } info->ctrl = PFM_ATTR_CTRL_PMU; info->idx = idx; /* namespace specific index */ info->dfl_val64 = 0; info->support_hw_smpl = 0; return PFM_SUCCESS; } papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_snbep_unc_cbo.c000066400000000000000000000070741502707512200242310ustar00rootroot00000000000000/* * pfmlib_intel_snb_unc_cbo.c : Intel SandyBridge-EP C-Box uncore PMU * * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_snbep_unc_cbo_events.h" static void display_cbo(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; pfm_snbep_unc_reg_t f; __pfm_vbprintf("[UNC_CBO=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d tid_en=%d] %s\n", reg->val, reg->cbo.unc_event, reg->cbo.unc_umask, reg->cbo.unc_en, reg->cbo.unc_inv, reg->cbo.unc_edge, reg->cbo.unc_thres, reg->cbo.unc_tid, pe[e->event].name); if (e->count == 1) return; f.val = e->codes[1]; __pfm_vbprintf("[UNC_CBOX_FILTER=0x%"PRIx64" tid=%d core=0x%x nid=0x%x" " state=0x%x opc=0x%x]\n", f.val, f.cbo_filt.tid, f.cbo_filt.cid, f.cbo_filt.nid, f.cbo_filt.state, f.cbo_filt.opc); } #define DEFINE_C_BOX(n) \ pfmlib_pmu_t intel_snbep_unc_cb##n##_support = {\ .desc = "Intel Sandy Bridge-EP C-Box "#n" uncore",\ .name = "snbep_unc_cbo"#n,\ .perf_name = "uncore_cbox_"#n,\ .pmu = PFM_PMU_INTEL_SNBEP_UNC_CB##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_snbep_unc_c_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 2,\ .pe = intel_snbep_unc_c_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK\ | PFMLIB_PMU_FL_NO_SMPL,\ .pmu_detect = pfm_intel_snbep_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = 
pfm_intel_x86_validate_table,\
	.get_event_info		= pfm_intel_x86_get_event_info,\
	.get_event_attr_info	= pfm_intel_snbep_unc_get_event_attr_info,\
	PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\
	.get_event_nattrs	= pfm_intel_x86_get_event_nattrs,\
	.can_auto_encode	= pfm_intel_x86_can_auto_encode,\
	.display_reg		= display_cbo,\
}

DEFINE_C_BOX(0);
DEFINE_C_BOX(1);
DEFINE_C_BOX(2);
DEFINE_C_BOX(3);
DEFINE_C_BOX(4);
DEFINE_C_BOX(5);
DEFINE_C_BOX(6);
DEFINE_C_BOX(7);

---- papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_snbep_unc_ha.c ----

/*
 * pfmlib_intel_snb_unc_ha.c : Intel SandyBridge-EP Home Agent (HA) uncore PMU
 *
 * Copyright (c) 2012 Google, Inc
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_snbep_unc_ha_events.h" static void display_ha(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; pfm_snbep_unc_reg_t f; __pfm_vbprintf("[UNC_HA=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); if (e->count == 1) return; f.val = e->codes[1]; __pfm_vbprintf("[UNC_HA_ADDR=0x%"PRIx64" lo_addr=0x%x hi_addr=0x%x]\n", f.val, f.ha_addr.lo_addr, f.ha_addr.hi_addr); f.val = e->codes[2]; __pfm_vbprintf("[UNC_HA_OPC=0x%"PRIx64" opc=0x%x]\n", f.val, f.ha_opc.opc); } pfmlib_pmu_t intel_snbep_unc_ha_support = { .desc = "Intel Sandy Bridge-EP HA uncore", .name = "snbep_unc_ha", .perf_name = "uncore_ha", .pmu = PFM_PMU_INTEL_SNBEP_UNC_HA, .pme_count = LIBPFM_ARRAY_SIZE(intel_snbep_unc_h_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, .max_encoding = 3, /* address matchers */ .pe = intel_snbep_unc_h_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | PFMLIB_PMU_FL_NO_SMPL, .pmu_detect = pfm_intel_snbep_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .display_reg = display_ha, }; 
---- papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_snbep_unc_imc.c ----

/*
 * pfmlib_intel_snbep_unc_imc.c : Intel SandyBridge-EP Integrated Memory Controller (IMC) uncore PMU
 *
 * Copyright (c) 2012 Google, Inc
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_snbep_unc_imc_events.h" #define DEFINE_IMC_BOX(n) \ pfmlib_pmu_t intel_snbep_unc_imc##n##_support = { \ .desc = "Intel Sandy Bridge-EP IMC"#n" uncore", \ .name = "snbep_unc_imc"#n, \ .perf_name = "uncore_imc_"#n, \ .pmu = PFM_PMU_INTEL_SNBEP_UNC_IMC##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_snbep_unc_m_pe), \ .type = PFM_PMU_TYPE_UNCORE, \ .num_cntrs = 4, \ .num_fixed_cntrs = 1, \ .max_encoding = 1, \ .pe = intel_snbep_unc_m_pe, \ .atdesc = snbep_unc_mods, \ .flags = PFMLIB_PMU_FL_RAW_UMASK\ | PFMLIB_PMU_FL_NO_SMPL,\ .pmu_detect = pfm_intel_snbep_unc_detect, \ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, \ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), \ .get_event_first = pfm_intel_x86_get_event_first, \ .get_event_next = pfm_intel_x86_get_event_next, \ .event_is_valid = pfm_intel_x86_event_is_valid, \ .validate_table = pfm_intel_x86_validate_table, \ .get_event_info = pfm_intel_x86_get_event_info, \ .get_event_attr_info = pfm_intel_x86_get_event_attr_info, \ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), \ .get_event_nattrs = pfm_intel_x86_get_event_nattrs, \ }; DEFINE_IMC_BOX(0); DEFINE_IMC_BOX(1); DEFINE_IMC_BOX(2); DEFINE_IMC_BOX(3); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_snbep_unc_pcu.c000066400000000000000000000066301502707512200242520ustar00rootroot00000000000000/* * pfmlib_intel_snbep_unc_pcu.c : Intel SandyBridge-EP Power Control Unit (PCU) uncore PMU * * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, 
distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_snbep_unc_pcu_events.h" static void display_pcu(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; pfm_snbep_unc_reg_t f; __pfm_vbprintf("[UNC_PCU=0x%"PRIx64" event=0x%x occ_sel=0x%x en=%d " "inv=%d edge=%d thres=%d occ_inv=%d occ_edge=%d] %s\n", reg->val, reg->pcu.unc_event, reg->pcu.unc_occ, reg->pcu.unc_en, reg->pcu.unc_inv, reg->pcu.unc_edge, reg->pcu.unc_thres, reg->pcu.unc_occ_inv, reg->pcu.unc_occ_edge, pe[e->event].name); if (e->count == 1) return; f.val = e->codes[1]; __pfm_vbprintf("[UNC_PCU_FILTER=0x%"PRIx64" band0=%u band1=%u band2=%u band3=%u]\n", f.val, f.pcu_filt.filt0, f.pcu_filt.filt1, f.pcu_filt.filt2, f.pcu_filt.filt3); } pfmlib_pmu_t intel_snbep_unc_pcu_support = { .desc = "Intel Sandy Bridge-EP PCU uncore", .name = "snbep_unc_pcu", .perf_name = "uncore_pcu", .pmu = PFM_PMU_INTEL_SNBEP_UNC_PCU, .pme_count = LIBPFM_ARRAY_SIZE(intel_snbep_unc_p_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, 
.max_encoding = 2, .pe = intel_snbep_unc_p_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_PMU_FL_UNC_OCC | PFMLIB_PMU_FL_NO_SMPL, .pmu_detect = pfm_intel_snbep_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_snbep_unc_can_auto_encode, .display_reg = display_pcu, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_snbep_unc_perf_event.c000066400000000000000000000103131502707512200256110ustar00rootroot00000000000000/* pfmlib_intel_snbep_unc_perf.c : perf_events SNB-EP uncore support * * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */
#include <sys/types.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>	/* FILE, fopen, fscanf */
#include <stdarg.h>
#include <limits.h>	/* PATH_MAX */

/* private headers */
#include "pfmlib_priv.h"
#include "pfmlib_intel_x86_priv.h"
#include "pfmlib_intel_snbep_unc_priv.h"
#include "pfmlib_perf_event_priv.h"

static int
find_pmu_type_by_name(const char *name)
{
	char filename[PATH_MAX];
	FILE *fp;
	int ret, type;

	if (!name)
		return PFM_ERR_NOTSUPP;

	sprintf(filename, "/sys/bus/event_source/devices/%s/type", name);

	fp = fopen(filename, "r");
	if (!fp)
		return PFM_ERR_NOTSUPP;

	ret = fscanf(fp, "%d", &type);
	if (ret != 1)
		type = PFM_ERR_NOTSUPP;
	fclose(fp);

	return type;
}

int
pfm_intel_snbep_unc_get_perf_encoding(void *this, pfmlib_event_desc_t *e)
{
	pfmlib_pmu_t *pmu = this;
	struct perf_event_attr *attr = e->os_data;
	pfm_snbep_unc_reg_t reg;
	int ret;

	if (!pmu->get_event_encoding[PFM_OS_NONE])
		return PFM_ERR_NOTSUPP;

	ret = pmu->get_event_encoding[PFM_OS_NONE](this, e);
	if (ret != PFM_SUCCESS)
		return ret;

	ret = find_pmu_type_by_name(pmu->perf_name);
	if (ret < 0)
		return ret;

	attr->type = ret;

	reg.val = e->codes[0];
	attr->config = reg.val;

	if ((is_cbo_filt_event(this, reg)
	     || is_cha_filt_event(this, 0, reg)
	     || is_cha_filt_event(this, 1, reg)) && e->count > 1) {
		if (e->count >= 2)
			attr->config1 = e->codes[1];
		if (e->count >= 3)
			attr->config1 |= e->codes[2] << 32;
	} else {
		/*
		 * various filters
		 */
		if (e->count >= 2)
			attr->config1 = e->codes[1];
		if (e->count >= 3)
			attr->config2 = e->codes[2];
	}

	/*
	 * uncore measures at all priv levels
	 *
	 * user cannot set per-event priv levels because
	 * attributes are simply not there
	 *
	 * dfl_plm is ignored in this case
	 */
	attr->exclude_hv = 0;
	attr->exclude_kernel = 0;
	attr->exclude_user = 0;

	return PFM_SUCCESS;
}

void
pfm_intel_snbep_unc_perf_validate_pattrs(void *this,
pfmlib_event_desc_t *e) { pfmlib_pmu_t *pmu = this; int no_smpl = pmu->flags & PFMLIB_PMU_FL_NO_SMPL; int i, compact; for (i = 0; i < e->npattrs; i++) { compact = 0; /* umasks never conflict */ if (e->pattrs[i].type == PFM_ATTR_UMASK) continue; if (e->pattrs[i].ctrl == PFM_ATTR_CTRL_PERF_EVENT) { /* No precise sampling mode for uncore */ if (e->pattrs[i].idx == PERF_ATTR_PR) compact = 1; /* * No hypervisor for uncore */ if (e->pattrs[i].idx == PERF_ATTR_H) compact = 1; if (no_smpl && ( e->pattrs[i].idx == PERF_ATTR_FR || e->pattrs[i].idx == PERF_ATTR_PR || e->pattrs[i].idx == PERF_ATTR_PE)) compact = 1; /* * uncore has no priv level support */ if (pmu->supported_plm == 0 && ( e->pattrs[i].idx == PERF_ATTR_U || e->pattrs[i].idx == PERF_ATTR_K || e->pattrs[i].idx == PERF_ATTR_MG || e->pattrs[i].idx == PERF_ATTR_MH)) compact = 1; } /* hardware sampling not supported */ if (e->pattrs[i].idx == PERF_ATTR_HWS) compact = 1; if (compact) { pfmlib_compact_pattrs(e, i); i--; } } } papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_snbep_unc_priv.h000066400000000000000000000515001502707512200244440ustar00rootroot00000000000000/* * pfmlib_intel_snbep_unc_priv.c : Intel SandyBridge/IvyBridge-EP common definitions * * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PFMLIB_INTEL_SNBEP_UNC_PRIV_H__ #define __PFMLIB_INTEL_SNBEP_UNC_PRIV_H__ /* * Intel x86 specific pmu flags (pmu->flags 16 MSB) */ #define INTEL_PMU_FL_UNC_OCC 0x10000 /* PMU has occupancy counter filters */ #define INTEL_PMU_FL_UNC_CBO 0x20000 /* PMU is Cbox */ #define INTEL_PMU_FL_UNC_CHA 0x40000 /* PMU is CHA (skylake and later) */ #define SNBEP_UNC_ATTR_E 0 #define SNBEP_UNC_ATTR_I 1 #define SNBEP_UNC_ATTR_T8 2 #define SNBEP_UNC_ATTR_T5 3 #define SNBEP_UNC_ATTR_TF 4 #define SNBEP_UNC_ATTR_CF 5 #define SNBEP_UNC_ATTR_NF 6 /* for filter0 */ #define SNBEP_UNC_ATTR_FF 7 #define SNBEP_UNC_ATTR_A 8 #define SNBEP_UNC_ATTR_NF1 9 /* for filter1 */ #define SNBEP_UNC_ATTR_ISOC 10 /* isochronous */ #define SNBEP_UNC_ATTR_NC 11 /* non-coherent */ #define SNBEP_UNC_ATTR_CF1 12 /* core-filter hswep */ #define SNBEP_UNC_ATTR_TF1 13 /* thread-filter skx */ #define SNBEP_UNC_ATTR_CF2 14 /* core-filter (src filter) skx */ #define SNBEP_UNC_ATTR_LOC 15 /* local node target skx */ #define SNBEP_UNC_ATTR_REM 16 /* remote node target skx */ #define SNBEP_UNC_ATTR_LMEM 17 /* near memory cacheable skx */ #define SNBEP_UNC_ATTR_RMEM 18 /* not near memory cacheable skx */ #define SNBEP_UNC_ATTR_DNID 19 /* destination node id */ #define SNBEP_UNC_ATTR_RCSNID 20 /* RCS node id */ #define SNBEP_UNC_ATTR_T6 21 /* threshold (cmask) 6-bit */ #define SNBEP_UNC_ATTR_OCC_I 22 /* occupancy invert */ #define SNBEP_UNC_ATTR_OCC_E 23 /* occupancy edge */ #define _SNBEP_UNC_ATTR_I (1 << SNBEP_UNC_ATTR_I) #define 
_SNBEP_UNC_ATTR_E (1 << SNBEP_UNC_ATTR_E) #define _SNBEP_UNC_ATTR_T8 (1 << SNBEP_UNC_ATTR_T8) #define _SNBEP_UNC_ATTR_T5 (1 << SNBEP_UNC_ATTR_T5) #define _SNBEP_UNC_ATTR_TF (1 << SNBEP_UNC_ATTR_TF) #define _SNBEP_UNC_ATTR_CF (1 << SNBEP_UNC_ATTR_CF) #define _SNBEP_UNC_ATTR_NF (1 << SNBEP_UNC_ATTR_NF) #define _SNBEP_UNC_ATTR_FF (1 << SNBEP_UNC_ATTR_FF) #define _SNBEP_UNC_ATTR_A (1 << SNBEP_UNC_ATTR_A) #define _SNBEP_UNC_ATTR_NF1 (1 << SNBEP_UNC_ATTR_NF1) #define _SNBEP_UNC_ATTR_ISOC (1 << SNBEP_UNC_ATTR_ISOC) #define _SNBEP_UNC_ATTR_NC (1 << SNBEP_UNC_ATTR_NC) #define _SNBEP_UNC_ATTR_CF1 (1 << SNBEP_UNC_ATTR_CF1) #define _SNBEP_UNC_ATTR_TF1 (1 << SNBEP_UNC_ATTR_TF1) #define _SNBEP_UNC_ATTR_CF2 (1 << SNBEP_UNC_ATTR_CF2) #define _SNBEP_UNC_ATTR_LOC (1 << SNBEP_UNC_ATTR_LOC) #define _SNBEP_UNC_ATTR_REM (1 << SNBEP_UNC_ATTR_REM) #define _SNBEP_UNC_ATTR_LMEM (1 << SNBEP_UNC_ATTR_LMEM) #define _SNBEP_UNC_ATTR_RMEM (1 << SNBEP_UNC_ATTR_RMEM) #define _SNBEP_UNC_ATTR_DNID (1 << SNBEP_UNC_ATTR_DNID) #define _SNBEP_UNC_ATTR_RCSNID (1 << SNBEP_UNC_ATTR_RCSNID) #define _SNBEP_UNC_ATTR_T6 (1 << SNBEP_UNC_ATTR_T6) #define _SNBEP_UNC_ATTR_OCC_I (1 << SNBEP_UNC_ATTR_OCC_I) #define _SNBEP_UNC_ATTR_OCC_E (1 << SNBEP_UNC_ATTR_OCC_E) #define SNBEP_UNC_IRP_ATTRS \ (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) #define HSWEP_UNC_IRP_ATTRS \ (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8|_SNBEP_UNC_ATTR_I) #define BDX_UNC_IRP_ATTRS HSWEP_UNC_IRP_ATTRS #define SNBEP_UNC_R3QPI_ATTRS \ (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) #define HSWEP_UNC_R3QPI_ATTRS SNBEP_UNC_R3QPI_ATTRS #define BDX_UNC_R3QPI_ATTRS SNBEP_UNC_R3QPI_ATTRS #define IVBEP_UNC_R3QPI_ATTRS \ (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) #define SNBEP_UNC_R2PCIE_ATTRS \ (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) #define HSWEP_UNC_R2PCIE_ATTRS SNBEP_UNC_R2PCIE_ATTRS #define BDX_UNC_R2PCIE_ATTRS SNBEP_UNC_R2PCIE_ATTRS #define IVBEP_UNC_R2PCIE_ATTRS \ (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) #define SNBEP_UNC_QPI_ATTRS \ 
(_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) #define IVBEP_UNC_QPI_ATTRS \ (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) #define HSWEP_UNC_QPI_ATTRS SNBEP_UNC_QPI_ATTRS #define BDX_UNC_QPI_ATTRS SNBEP_UNC_QPI_ATTRS #define SNBEP_UNC_UBO_ATTRS \ (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) #define IVBEP_UNC_UBO_ATTRS \ (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) #define HSWEP_UNC_UBO_ATTRS SNBEP_UNC_UBO_ATTRS #define BDX_UNC_UBO_ATTRS SNBEP_UNC_UBO_ATTRS #define SNBEP_UNC_PCU_ATTRS \ (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T5) #define IVBEP_UNC_PCU_ATTRS \ (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T5) #define HSWEP_UNC_PCU_ATTRS SNBEP_UNC_PCU_ATTRS #define BDX_UNC_PCU_ATTRS \ (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T5) #define SKX_UNC_PCU_ATTRS \ (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) #define ICX_UNC_PCU_ATTRS \ (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T6) #define ICX_UNC_PCU_OCC_ATTRS \ (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T6|_SNBEP_UNC_ATTR_OCC_I|_SNBEP_UNC_ATTR_OCC_E) #define SNBEP_UNC_PCU_BAND_ATTRS \ (SNBEP_UNC_PCU_ATTRS | _SNBEP_UNC_ATTR_FF) #define IVBEP_UNC_PCU_BAND_ATTRS \ (IVBEP_UNC_PCU_ATTRS | _SNBEP_UNC_ATTR_FF) #define HSWEP_UNC_PCU_BAND_ATTRS SNBEP_UNC_PCU_BAND_ATTRS #define BDX_UNC_PCU_BAND_ATTRS SNBEP_UNC_PCU_BAND_ATTRS #define SKX_UNC_PCU_BAND_ATTRS \ (SKX_UNC_PCU_ATTRS | _SNBEP_UNC_ATTR_FF) #define SNBEP_UNC_IMC_ATTRS \ (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) #define IVBEP_UNC_IMC_ATTRS \ (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) #define HSWEP_UNC_IMC_ATTRS SNBEP_UNC_IMC_ATTRS #define BDX_UNC_IMC_ATTRS SNBEP_UNC_IMC_ATTRS #define SNBEP_UNC_CBO_ATTRS \ (_SNBEP_UNC_ATTR_I |\ _SNBEP_UNC_ATTR_E |\ _SNBEP_UNC_ATTR_T8 |\ _SNBEP_UNC_ATTR_CF |\ _SNBEP_UNC_ATTR_TF) #define IVBEP_UNC_CBO_ATTRS \ (_SNBEP_UNC_ATTR_E |\ _SNBEP_UNC_ATTR_T8 |\ _SNBEP_UNC_ATTR_CF |\ _SNBEP_UNC_ATTR_TF) #define HSWEP_UNC_CBO_ATTRS \ (_SNBEP_UNC_ATTR_E |\ _SNBEP_UNC_ATTR_T8 |\ 
_SNBEP_UNC_ATTR_CF1 |\ _SNBEP_UNC_ATTR_TF) #define BDX_UNC_CBO_ATTRS HSWEP_UNC_CBO_ATTRS #define SNBEP_UNC_CBO_NID_ATTRS \ (SNBEP_UNC_CBO_ATTRS|_SNBEP_UNC_ATTR_NF) #define IVBEP_UNC_CBO_NID_ATTRS \ (IVBEP_UNC_CBO_ATTRS|_SNBEP_UNC_ATTR_NF1) #define HSWEP_UNC_CBO_NID_ATTRS \ (HSWEP_UNC_CBO_ATTRS | _SNBEP_UNC_ATTR_NF1) #define BDX_UNC_CBO_NID_ATTRS HSWEP_UNC_CBO_NID_ATTRS #define SNBEP_UNC_HA_ATTRS \ (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) #define IVBEP_UNC_HA_ATTRS \ (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) #define HSWEP_UNC_HA_ATTRS \ (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8|_SNBEP_UNC_ATTR_I) #define BDX_UNC_HA_ATTRS \ (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8|_SNBEP_UNC_ATTR_I) #define SNBEP_UNC_HA_OPC_ATTRS \ (SNBEP_UNC_HA_ATTRS|_SNBEP_UNC_ATTR_A) #define HSWEP_UNC_SBO_ATTRS \ (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8|_SNBEP_UNC_ATTR_I) #define BDX_UNC_SBO_ATTRS \ (_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8|_SNBEP_UNC_ATTR_I) #define KNL_UNC_CHA_TOR_ATTRS _SNBEP_UNC_ATTR_NF1 #define SKX_UNC_CHA_ATTRS \ (_SNBEP_UNC_ATTR_I |\ _SNBEP_UNC_ATTR_E |\ _SNBEP_UNC_ATTR_T8) #define ICX_UNC_CHA_ATTRS SKX_UNC_CHA_ATTRS #define SPR_UNC_CHA_ATTRS SKX_UNC_CHA_ATTRS #define SKX_UNC_CHA_FILT1_ATTRS \ (SKX_UNC_CHA_ATTRS |\ _SNBEP_UNC_ATTR_LOC |\ _SNBEP_UNC_ATTR_REM |\ _SNBEP_UNC_ATTR_LMEM|\ _SNBEP_UNC_ATTR_RMEM|\ _SNBEP_UNC_ATTR_NC |\ _SNBEP_UNC_ATTR_ISOC) #define SKX_UNC_IIO_ATTRS \ (_SNBEP_UNC_ATTR_I |\ _SNBEP_UNC_ATTR_E |\ _SNBEP_UNC_ATTR_T8) #define ICX_UNC_IIO_ATTRS \ (_SNBEP_UNC_ATTR_I |\ _SNBEP_UNC_ATTR_E |\ _SNBEP_UNC_ATTR_T8) #define SKX_UNC_IMC_ATTRS \ (_SNBEP_UNC_ATTR_I |\ _SNBEP_UNC_ATTR_E |\ _SNBEP_UNC_ATTR_T8) #define ICX_UNC_IMC_ATTRS SKX_UNC_IMC_ATTRS #define SPR_UNC_IMC_ATTRS SKX_UNC_IMC_ATTRS #define GNR_UNC_IMC_ATTRS SKX_UNC_IMC_ATTRS #define ICX_UNC_M2PCIE_ATTRS SKX_UNC_IMC_ATTRS #define SKX_UNC_IRP_ATTRS \ (_SNBEP_UNC_ATTR_I |\ _SNBEP_UNC_ATTR_E |\ _SNBEP_UNC_ATTR_T8) #define ICX_UNC_IRP_ATTRS SKX_UNC_IRP_ATTRS #define SKX_UNC_M2M_ATTRS \ (_SNBEP_UNC_ATTR_I |\ 
_SNBEP_UNC_ATTR_E |\ _SNBEP_UNC_ATTR_T8) #define ICX_UNC_M2M_ATTRS SKX_UNC_M2M_ATTRS #define SKX_UNC_M3UPI_ATTRS \ (_SNBEP_UNC_ATTR_I |\ _SNBEP_UNC_ATTR_E |\ _SNBEP_UNC_ATTR_T8) #define ICX_UNC_M3UPI_ATTRS SKX_UNC_M3UPI_ATTRS #define SKX_UNC_UBO_ATTRS SNBEP_UNC_UBO_ATTRS #define ICX_UNC_UBO_ATTRS SNBEP_UNC_UBO_ATTRS #define SKX_UNC_UPI_ATTRS \ (_SNBEP_UNC_ATTR_I|_SNBEP_UNC_ATTR_E|_SNBEP_UNC_ATTR_T8) #define ICX_UNC_UPI_ATTRS SKX_UNC_UPI_ATTRS #define SPR_UNC_UPI_ATTRS SKX_UNC_UPI_ATTRS #define SKX_UNC_UPI_OPC_ATTRS \ (SKX_UNC_UPI_ATTRS |\ _SNBEP_UNC_ATTR_DNID| _SNBEP_UNC_ATTR_RCSNID) typedef union { uint64_t val; struct { unsigned long unc_event:8; /* event code */ unsigned long unc_umask:8; /* unit mask */ unsigned long unc_res1:1; /* reserved */ unsigned long unc_rst:1; /* reset */ unsigned long unc_edge:1; /* edge detec */ unsigned long unc_res2:3; /* reserved */ unsigned long unc_en:1; /* enable */ unsigned long unc_inv:1; /* invert counter mask */ unsigned long unc_thres:8; /* counter mask */ unsigned long unc_res3:32; /* reserved */ } com; /* covers common fields for cbox, ha, imc, ubox, r2pcie, r3qpi, sbox */ struct { unsigned long unc_event:8; /* event code */ unsigned long unc_umask:8; /* unit mask */ unsigned long unc_res1:1; /* reserved */ unsigned long unc_rst:1; /* reset */ unsigned long unc_edge:1; /* edge detect */ unsigned long unc_tid:1; /* tid filter enable */ unsigned long unc_res2:2; /* reserved */ unsigned long unc_en:1; /* enable */ unsigned long unc_inv:1; /* invert counter mask */ unsigned long unc_thres:8; /* counter mask */ unsigned long unc_res3:32; /* reserved */ } cbo; /* covers c-box */ struct { unsigned long unc_event:8; /* event code */ unsigned long unc_umask:8; /* unit mask */ unsigned long unc_res1:1; /* reserved */ unsigned long unc_rst:1; /* reset */ unsigned long unc_edge:1; /* edge detect */ unsigned long unc_tid:1; /* tid filter enable */ unsigned long unc_ov:1; /* overflow enable */ unsigned long unc_res2:1; /* reserved */ 
unsigned long unc_en:1; /* enable */ unsigned long unc_inv:1; /* invert counter mask */ unsigned long unc_thres:8; /* counter mask */ unsigned long unc_res3:32; /* reserved */ } cha; /* covers skx cha */ struct { unsigned long unc_event:8; /* event code */ unsigned long unc_umask:8; /* unit mask */ unsigned long unc_res1:1; /* reserved */ unsigned long unc_rst:1; /* reset */ unsigned long unc_edge:1; /* edge detect */ unsigned long unc_tid:1; /* tid filter enable */ unsigned long unc_ov:1; /* overflow enable */ unsigned long unc_res2:1; /* reserved */ unsigned long unc_en:1; /* enable */ unsigned long unc_inv:1; /* invert counter mask */ unsigned long unc_thres:8; /* counter mask */ unsigned long unc_umask_ext:26; /* extended umask */ unsigned long unc_res3:9; /* reserved */ } icx_cha; /* covers icx cha */ struct { unsigned long unc_event:8; /* event code */ unsigned long unc_umask:8; /* unit mask */ unsigned long unc_res1:1; /* reserved */ unsigned long unc_rst:1; /* reset */ unsigned long unc_edge:1; /* edge detec */ unsigned long unc_res2:3; /* reserved */ unsigned long unc_en:1; /* enable */ unsigned long unc_inv:1; /* invert counter mask */ unsigned long unc_thres:8; /* counter mask */ unsigned long unc_umask2:8; /* extended unit mask */ unsigned long unc_res3:24; /* reserved */ } icx_m2m; /* covers icx m2m */ struct { unsigned long unc_event:8; /* event code */ unsigned long unc_umask:8; /* unit mask */ unsigned long unc_res1:1; /* reserved */ unsigned long unc_rst:1; /* reset */ unsigned long unc_edge:1; /* edge detec */ unsigned long unc_tid_en:1; /* tid enable */ unsigned long unc_res2:2; /* reserved */ unsigned long unc_en:1; /* enable */ unsigned long unc_inv:1; /* invert counter mask */ unsigned long unc_thres:6; /* counter mask */ unsigned long unc_occ_inv:1; /* occupancy event invert */ unsigned long unc_occ_edge:1; /* occupancy event edge */ unsigned long unc_res3:24; /* reserved */ } icx_pcu; /* covers icx pcu */ struct { unsigned long unc_event:8; 
/* event code */ unsigned long unc_umask:8; /* unit mask */ unsigned long unc_res1:1; /* reserved */ unsigned long unc_rst:1; /* reset */ unsigned long unc_edge:1; /* edge detect */ unsigned long unc_tid:1; /* tid filter enable */ unsigned long unc_ov:1; /* overflow enable */ unsigned long unc_res2:1; /* reserved */ unsigned long unc_en:1; /* enable */ unsigned long unc_inv:1; /* invert counter mask */ unsigned long unc_thres:8; /* counter mask */ unsigned long unc_chmsk:8; /* channel mask */ unsigned long unc_fcmsk:8; /* fc mask */ unsigned long unc_res3:16; /* reserved */ } iio; /* covers skx iio*/ struct { unsigned long unc_event:8; /* event code */ unsigned long unc_res1:6; /* reserved */ unsigned long unc_occ:2; /* occ select */ unsigned long unc_res2:1; /* reserved */ unsigned long unc_rst:1; /* reset */ unsigned long unc_edge:1; /* edge detec */ unsigned long unc_res3:1; /* reserved */ unsigned long unc_res4:2; /* reserved */ unsigned long unc_en:1; /* enable */ unsigned long unc_inv:1; /* invert counter mask */ unsigned long unc_thres:5; /* threshold */ unsigned long unc_res5:1; /* reserved */ unsigned long unc_occ_inv:1; /* occupancy invert */ unsigned long unc_occ_edge:1; /* occupancy edge detect */ unsigned long unc_res6:32; /* reserved */ } pcu; /* covers pcu */ struct { unsigned long unc_event:8; /* event code */ unsigned long unc_res1:6; /* reserved */ unsigned long unc_occ:2; /* occ select */ unsigned long unc_res2:1; /* reserved */ unsigned long unc_rst:1; /* reset */ unsigned long unc_edge:1; /* edge detec */ unsigned long unc_res3:1; /* reserved */ unsigned long unc_ov_en:1; /* overflow enable */ unsigned long unc_sel_ext:1; /* event_sel extension */ unsigned long unc_en:1; /* enable */ unsigned long unc_res4:1; /* reserved */ unsigned long unc_thres:5; /* threshold */ unsigned long unc_res5:1; /* reserved */ unsigned long unc_occ_inv:1; /* occupancy invert */ unsigned long unc_occ_edge:1; /* occupancy edge detect */ unsigned long unc_res6:32; /* 
reserved */ } ivbep_pcu; /* covers ivb-ep pcu */ struct { unsigned long unc_event:8; /* event code */ unsigned long unc_umask:8; /* unit mask */ unsigned long unc_res1:1; /* reserved */ unsigned long unc_rst:1; /* reset */ unsigned long unc_edge:1; /* edge detect */ unsigned long unc_res2:1; /* reserved */ unsigned long unc_res3:1; /* reserved */ unsigned long unc_event_ext:1; /* event code extension */ unsigned long unc_en:1; /* enable */ unsigned long unc_inv:1; /* invert counter mask */ unsigned long unc_thres:8; /* threshold */ unsigned long unc_res4:32; /* reserved */ } qpi; /* covers qpi */ struct { unsigned long tid:1; unsigned long cid:3; unsigned long res0:1; unsigned long res1:3; unsigned long res2:2; unsigned long nid:8; unsigned long state:5; unsigned long opc:9; unsigned long res3:1; unsigned long res4:32; } cbo_filt; /* cbox filter */ struct { unsigned long tid:1; unsigned long cid:4; unsigned long res0:12; unsigned long state:6; unsigned long res1:9; unsigned long res2:32; } ivbep_cbo_filt0; /* ivbep cbox filter0 */ struct { unsigned long nid:16; unsigned long res0:4; unsigned long opc:9; unsigned long res1:1; unsigned long nc:1; unsigned long isoc:1; unsigned long res2:32; } ivbep_cbo_filt1; /* ivbep cbox filter1 */ struct { unsigned long tid:1; unsigned long cid:5; unsigned long res0:11; unsigned long state:7; unsigned long res1:8; unsigned long res2:32; } hswep_cbo_filt0; /* hswep cbox filter0 */ struct { unsigned long nid:16; unsigned long res0:4; unsigned long opc:9; unsigned long res1:1; unsigned long nc:1; unsigned long isoc:1; unsigned long res2:32; } hswep_cbo_filt1; /* hswep cbox filter1 */ struct { unsigned long tid:3; /* thread 0-3 */ unsigned long sid:6; /* source id */ unsigned long res0:8; unsigned long state:10; /* llc lookup cacheline state */ unsigned long res1:32; unsigned long res2:5; } skx_cha_filt0; /* skx cha filter0 */ struct { unsigned long rem:1; unsigned long loc:1; unsigned long res0:1; unsigned long all_opc:1; unsigned 
long nm:1; unsigned long not_nm:1; unsigned long res1:3; unsigned long opc0:10; unsigned long opc1:10; unsigned long res2:1; unsigned long nc:1; unsigned long isoc:1; unsigned long res3:32; } skx_cha_filt1; /* skx cha filter1 */ struct { unsigned long opc:1; unsigned long loc:1; unsigned long rem:1; unsigned long data:1; unsigned long nondata:1; unsigned long dualslot:1; unsigned long sglslot:1; unsigned long isoch:1; unsigned long dnid:4; unsigned long res1:1; unsigned long en_dnidd:1; unsigned long rcsnid:4; unsigned long en_rcsnid:1; unsigned long slot0:1; unsigned long slot1:1; unsigned long slot2:1; unsigned long llcrd_non0:1; unsigned long llcrd_implnull:1; unsigned long res2:9; } skx_upi_filt; /* skx upi basic_hdr_filt */ struct { unsigned long filt0:8; /* band0 freq filter */ unsigned long filt1:8; /* band1 freq filter */ unsigned long filt2:8; /* band2 freq filter */ unsigned long filt3:8; /* band3 freq filter */ unsigned long res1:32; /* reserved */ } pcu_filt; struct { unsigned long res1:6; unsigned long lo_addr:26; /* lo order 26b */ unsigned long hi_addr:14; /* hi order 14b */ unsigned long res2:18; /* reserved */ } ha_addr; struct { unsigned long opc:6; /* opcode match */ unsigned long res1:26; /* reserved */ unsigned long res2:32; /* reserved */ } ha_opc; struct { unsigned long unc_event:8; /* event code */ unsigned long unc_umask:8; /* unit mask */ unsigned long unc_res1:1; /* reserved */ unsigned long unc_rst:1; /* reset */ unsigned long unc_edge:1; /* edge detect */ unsigned long unc_res2:3; /* reserved */ unsigned long unc_en:1; /* enable */ unsigned long unc_res3:1; /* reserved */ unsigned long unc_thres:8; /* counter mask */ unsigned long unc_res4:32; /* reserved */ } irp; /* covers irp */ } pfm_snbep_unc_reg_t; extern void pfm_intel_snbep_unc_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e); extern int pfm_intel_snbep_unc_get_encoding(void *this, pfmlib_event_desc_t *e); extern const pfmlib_attr_desc_t snbep_unc_mods[]; extern int 
pfm_intel_snbep_unc_detect(void *this); extern int pfm_intel_ivbep_unc_detect(void *this); extern int pfm_intel_hswep_unc_detect(void *this); extern int pfm_intel_knl_unc_detect(void *this); extern int pfm_intel_knm_unc_detect(void *this); extern int pfm_intel_bdx_unc_detect(void *this); extern int pfm_intel_skx_unc_detect(void *this); extern int pfm_intel_icx_unc_detect(void *this); extern int pfm_intel_spr_unc_detect(void *this); extern int pfm_intel_gnr_unc_detect(void *this); extern int pfm_intel_snbep_unc_get_perf_encoding(void *this, pfmlib_event_desc_t *e); extern int pfm_intel_snbep_unc_can_auto_encode(void *this, int pidx, int uidx); extern int pfm_intel_snbep_unc_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info); static inline int is_cha_filt_event(void *this, int x, pfm_snbep_unc_reg_t reg) { pfmlib_pmu_t *pmu = this; uint64_t sel = reg.com.unc_event; /* * TOR_INSERT: event code 0x35 * TOR_OCCUPANCY: event code 0x36 * LLC_LOOKUP : event code 0x34 */ if (!(pmu->flags & INTEL_PMU_FL_UNC_CHA)) return 0; if (x == 0) return sel == 0x34; if (x == 1) return sel == 0x35 || sel == 0x36; return 0; } static inline int is_cbo_filt_event(void *this, pfm_snbep_unc_reg_t reg) { pfmlib_pmu_t *pmu = this; uint64_t sel = reg.com.unc_event; /* * Cbox-only: umask bit 0 must be 1 (OPCODE) * * TOR_INSERT: event code 0x35 * TOR_OCCUPANCY: event code 0x36 * LLC_LOOKUP : event code 0x34 */ if (pmu->flags & INTEL_PMU_FL_UNC_CBO) return (reg.com.unc_umask & 0x1) && (sel == 0x35 || sel == 0x36 || sel == 0x34); return 0; } #endif /* __PFMLIB_INTEL_SNBEP_UNC_PRIV_H__ */ papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_snbep_unc_qpi.c000066400000000000000000000061711502707512200242540ustar00rootroot00000000000000/* * pfmlib_intel_snbep_qpi.c : Intel SandyBridge-EP QPI uncore PMU * * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and 
associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_snbep_unc_qpi_events.h" static void display_qpi(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_QPI=0x%"PRIx64" event=0x%x sel_ext=%d umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->qpi.unc_event, reg->qpi.unc_event_ext, reg->qpi.unc_umask, reg->qpi.unc_en, reg->qpi.unc_inv, reg->qpi.unc_edge, reg->qpi.unc_thres, pe[e->event].name); } #define DEFINE_QPI_BOX(n) \ pfmlib_pmu_t intel_snbep_unc_qpi##n##_support = {\ .desc = "Intel Sandy Bridge-EP QPI"#n" uncore",\ .name = "snbep_unc_qpi"#n,\ .perf_name = "uncore_qpi_"#n,\ .pmu = PFM_PMU_INTEL_SNBEP_UNC_QPI##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_snbep_unc_q_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 3,\ .pe = 
intel_snbep_unc_q_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK\ | PFMLIB_PMU_FL_NO_SMPL,\ .pmu_detect = pfm_intel_snbep_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .display_reg = display_qpi,\ } DEFINE_QPI_BOX(0); DEFINE_QPI_BOX(1); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_snbep_unc_r2pcie.c000066400000000000000000000050611502707512200246440ustar00rootroot00000000000000/* * pfmlib_intel_snbep_r2pcie.c : Intel SandyBridge-EP R2PCIe uncore PMU * * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_snbep_unc_r2pcie_events.h" pfmlib_pmu_t intel_snbep_unc_r2pcie_support = { .desc = "Intel Sandy Bridge-EP R2PCIe uncore", .name = "snbep_unc_r2pcie", .perf_name = "uncore_r2pcie", .pmu = PFM_PMU_INTEL_SNBEP_UNC_R2PCIE, .pme_count = LIBPFM_ARRAY_SIZE(intel_snbep_unc_r2_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, .max_encoding = 1, .pe = intel_snbep_unc_r2_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | PFMLIB_PMU_FL_NO_SMPL, .pmu_detect = pfm_intel_snbep_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_snbep_unc_r3qpi.c000066400000000000000000000052351502707512200245210ustar00rootroot00000000000000/* * pfmlib_intel_snbep_r3qpi.c : Intel SandyBridge-EP R3QPI uncore PMU * * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without 
restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_snbep_unc_r3qpi_events.h" #define DEFINE_R3QPI_BOX(n) \ pfmlib_pmu_t intel_snbep_unc_r3qpi##n##_support = {\ .desc = "Intel Sandy Bridge-EP R3QPI"#n" uncore", \ .name = "snbep_unc_r3qpi"#n,\ .perf_name = "uncore_r3qpi_"#n, \ .pmu = PFM_PMU_INTEL_SNBEP_UNC_R3QPI##n, \ .pme_count = LIBPFM_ARRAY_SIZE(intel_snbep_unc_r3_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 3,\ .num_fixed_cntrs = 0,\ .max_encoding = 1,\ .pe = intel_snbep_unc_r3_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK\ | PFMLIB_PMU_FL_NO_SMPL,\ .pmu_detect = pfm_intel_snbep_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = 
pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ } DEFINE_R3QPI_BOX(0); DEFINE_R3QPI_BOX(1); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_snbep_unc_ubo.c000066400000000000000000000050411502707512200242430ustar00rootroot00000000000000/* * pfmlib_intel_snbep_unc_ubo.c : Intel SandyBridge-EP U-Box uncore PMU * * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_snbep_unc_ubo_events.h" pfmlib_pmu_t intel_snbep_unc_ubo_support = { .desc = "Intel Sandy Bridge-EP U-Box uncore", .name = "snbep_unc_ubo", .perf_name = "uncore_ubox", .pmu = PFM_PMU_INTEL_SNBEP_UNC_UBOX, .pme_count = LIBPFM_ARRAY_SIZE(intel_snbep_unc_u_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 2, .num_fixed_cntrs = 1, .max_encoding = 1, .pe = intel_snbep_unc_u_pe, .atdesc = snbep_unc_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | PFMLIB_PMU_FL_NO_SMPL, .pmu_detect = pfm_intel_snbep_unc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_spr.c000066400000000000000000000101051502707512200222230ustar00rootroot00000000000000/* * pfmlib_intel_spr.c : Intel SapphireRapid core PMU * * Copyright (c) 2022 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission 
notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "events/intel_spr_events.h" static const int spr_models[] = { 143, /* SapphireRapid */ 0 }; static const int emr_models[] = { 207, /* EmeraldRapid */ 0 }; static int pfm_spr_init(void *this) { pfm_intel_x86_cfg.arch_version = 5; return PFM_SUCCESS; } pfmlib_pmu_t intel_spr_support={ .desc = "Intel SapphireRapid", .name = "spr", .pmu = PFM_PMU_INTEL_SPR, .pme_count = LIBPFM_ARRAY_SIZE(intel_spr_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 16, /* consider with HT off by default */ .num_fixed_cntrs = 4, .max_encoding = 2, /* offcore_response */ .pe = intel_spr_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | PFMLIB_PMU_FL_SPEC | INTEL_X86_PMU_FL_ECMASK | INTEL_X86_PMU_FL_EXTPEBS, .cpu_family = 6, .cpu_models = spr_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_spr_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = 
pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, .get_num_events = pfm_intel_x86_get_num_events, }; pfmlib_pmu_t intel_emr_support={ .desc = "Intel EmeraldRapid", .name = "emr", .pmu = PFM_PMU_INTEL_EMR, .pme_count = LIBPFM_ARRAY_SIZE(intel_spr_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 16, /* consider with HT off by default */ .num_fixed_cntrs = 4, .max_encoding = 2, /* offcore_response */ .pe = intel_spr_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | PFMLIB_PMU_FL_SPEC | INTEL_X86_PMU_FL_ECMASK | INTEL_X86_PMU_FL_EXTPEBS, .cpu_family = 6, .cpu_models = emr_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_spr_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, .get_num_events = pfm_intel_x86_get_num_events, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_spr_unc_cha.c000066400000000000000000000105741502707512200237150ustar00rootroot00000000000000/* * pfmlib_intel_spr_unc_cha.c : Intel SPR CHA-Box uncore PMU * * Copyright (c) 2024 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom 
the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_spr_unc_cha_events.h" static void display_cha(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; pfm_snbep_unc_reg_t f; __pfm_vbprintf("[UNC_CHA=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d tid_en=%d umask_ext=0x%x] %s\n", reg->val, reg->icx_cha.unc_event, reg->icx_cha.unc_umask, reg->icx_cha.unc_en, reg->icx_cha.unc_inv, reg->icx_cha.unc_edge, reg->icx_cha.unc_thres, reg->icx_cha.unc_tid, reg->icx_cha.unc_umask_ext, pe[e->event].name); if (e->count == 1) return; f.val = e->codes[1]; __pfm_vbprintf("[UNC_CHA_FILTER0=0x%"PRIx64" thread_id=%d source=0x%x state=0x%x]\n", f.val, f.skx_cha_filt0.tid, f.skx_cha_filt0.sid, f.skx_cha_filt0.state); } #define DEFINE_CHA(n) \ pfmlib_pmu_t intel_spr_unc_cha##n##_support = {\ .desc = "Intel SapphireRapids CHA"#n" uncore",\ .name = "spr_unc_cha"#n,\ .perf_name = "uncore_cha_"#n,\ .pmu = PFM_PMU_INTEL_SPR_UNC_CHA##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_spr_unc_cha_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 2,\ .pe = 
intel_spr_unc_cha_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_spr_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ .display_reg = display_cha,\ } DEFINE_CHA(0); DEFINE_CHA(1); DEFINE_CHA(2); DEFINE_CHA(3); DEFINE_CHA(4); DEFINE_CHA(5); DEFINE_CHA(6); DEFINE_CHA(7); DEFINE_CHA(8); DEFINE_CHA(9); DEFINE_CHA(10); DEFINE_CHA(11); DEFINE_CHA(12); DEFINE_CHA(13); DEFINE_CHA(14); DEFINE_CHA(15); DEFINE_CHA(16); DEFINE_CHA(17); DEFINE_CHA(18); DEFINE_CHA(19); DEFINE_CHA(20); DEFINE_CHA(21); DEFINE_CHA(22); DEFINE_CHA(23); DEFINE_CHA(24); DEFINE_CHA(25); DEFINE_CHA(26); DEFINE_CHA(27); DEFINE_CHA(28); DEFINE_CHA(29); DEFINE_CHA(30); DEFINE_CHA(31); DEFINE_CHA(32); DEFINE_CHA(33); DEFINE_CHA(34); DEFINE_CHA(35); DEFINE_CHA(36); DEFINE_CHA(37); DEFINE_CHA(38); DEFINE_CHA(39); DEFINE_CHA(40); DEFINE_CHA(41); DEFINE_CHA(42); DEFINE_CHA(43); DEFINE_CHA(44); DEFINE_CHA(45); DEFINE_CHA(46); DEFINE_CHA(47); DEFINE_CHA(48); DEFINE_CHA(49); DEFINE_CHA(50); DEFINE_CHA(51); DEFINE_CHA(52); DEFINE_CHA(53); DEFINE_CHA(54); DEFINE_CHA(55); DEFINE_CHA(56); DEFINE_CHA(57); DEFINE_CHA(58); DEFINE_CHA(59); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_spr_unc_imc.c000066400000000000000000000064241502707512200237310ustar00rootroot00000000000000/* * pfmlib_intel_spr_unc_imc.c : Intel SPR IMC uncore PMU * * Copyright (c) 2024 Google LLC * 
Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_spr_unc_imc_events.h" static void display_imc(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_IMC=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } #define DEFINE_IMC(n) \ pfmlib_pmu_t intel_spr_unc_imc##n##_support = {\ .desc = "Intel SapphireRapids IMC"#n" uncore",\ .name = "spr_unc_imc"#n,\ .perf_name = "uncore_imc_"#n,\ .pmu = PFM_PMU_INTEL_SPR_UNC_IMC##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_spr_unc_imc_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 2,\ .pe = intel_spr_unc_imc_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_spr_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ .display_reg = display_imc,\ } DEFINE_IMC(0); DEFINE_IMC(1); DEFINE_IMC(2); DEFINE_IMC(3); DEFINE_IMC(4); DEFINE_IMC(5); DEFINE_IMC(6); DEFINE_IMC(7); DEFINE_IMC(8); DEFINE_IMC(9); DEFINE_IMC(10); DEFINE_IMC(11); 
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_spr_unc_upi.c000066400000000000000000000062401502707512200237520ustar00rootroot00000000000000/* * pfmlib_intel_spr_unc_upi.c : Intel SPR UPI uncore PMU * * Copyright (c) 2024 Google LLC * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_intel_snbep_unc_priv.h" #include "events/intel_spr_unc_upi_events.h" static void display_upi(void *this, pfmlib_event_desc_t *e, void *val) { const intel_x86_entry_t *pe = this_pe(this); pfm_snbep_unc_reg_t *reg = val; __pfm_vbprintf("[UNC_UPI=0x%"PRIx64" event=0x%x umask=0x%x en=%d " "inv=%d edge=%d thres=%d] %s\n", reg->val, reg->com.unc_event, reg->com.unc_umask, reg->com.unc_en, reg->com.unc_inv, reg->com.unc_edge, reg->com.unc_thres, pe[e->event].name); } #define DEFINE_UPI(n) \ pfmlib_pmu_t intel_spr_unc_upi##n##_support = {\ .desc = "Intel SapphireRapids UPI"#n" uncore",\ .name = "spr_unc_upi"#n,\ .perf_name = "uncore_upi_"#n,\ .pmu = PFM_PMU_INTEL_SPR_UNC_UPI##n,\ .pme_count = LIBPFM_ARRAY_SIZE(intel_spr_unc_upi_ll_pe),\ .type = PFM_PMU_TYPE_UNCORE,\ .num_cntrs = 4,\ .num_fixed_cntrs = 0,\ .max_encoding = 2,\ .pe = intel_spr_unc_upi_ll_pe,\ .atdesc = snbep_unc_mods,\ .flags = PFMLIB_PMU_FL_RAW_UMASK,\ .pmu_detect = pfm_intel_spr_unc_detect,\ .get_event_encoding[PFM_OS_NONE] = pfm_intel_snbep_unc_get_encoding,\ PFMLIB_ENCODE_PERF(pfm_intel_snbep_unc_get_perf_encoding),\ PFMLIB_OS_DETECT(pfm_intel_x86_perf_detect), \ .get_event_first = pfm_intel_x86_get_event_first,\ .get_event_next = pfm_intel_x86_get_event_next,\ .event_is_valid = pfm_intel_x86_event_is_valid,\ .validate_table = pfm_intel_x86_validate_table,\ .get_event_info = pfm_intel_x86_get_event_info,\ .get_event_attr_info = pfm_intel_x86_get_event_attr_info,\ PFMLIB_VALID_PERF_PATTRS(pfm_intel_snbep_unc_perf_validate_pattrs),\ .get_event_nattrs = pfm_intel_x86_get_event_nattrs,\ .can_auto_encode = pfm_intel_x86_can_auto_encode, \ .display_reg = display_upi,\ } DEFINE_UPI(0); DEFINE_UPI(1); DEFINE_UPI(2); DEFINE_UPI(3); papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_tmt.c000066400000000000000000000050561502707512200222340ustar00rootroot00000000000000/* * 
pfmlib_intel_tmt.c : Intel Tremont core PMU * * Copyright (c) 2020 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "events/intel_tmt_events.h" static const int tmt_models[] = { 0x86, /* Tremont D, Jacobsville */ 0x96, /* Tremont, Elkhart Lake */ 0 }; static int pfm_intel_tmt_init(void *this) { pfm_intel_x86_cfg.arch_version = 4; return PFM_SUCCESS; } pfmlib_pmu_t intel_tmt_support = { .desc = "Intel Tremont", .name = "tmt", .pmu = PFM_PMU_INTEL_TMT, .pme_count = LIBPFM_ARRAY_SIZE(intel_tmt_pe), .type = PFM_PMU_TYPE_CORE, .num_cntrs = 4, .num_fixed_cntrs = 3, .max_encoding = 2, .pe = intel_tmt_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK, .supported_plm = INTEL_X86_PLM, .cpu_family = 6, .cpu_models = tmt_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_intel_tmt_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_wsm.c000066400000000000000000000075551502707512200222340ustar00rootroot00000000000000/* * pfmlib_intel_wsm.c : Intel Westmere core PMU * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to
the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "events/intel_wsm_events.h" static const int wsm_models[] = { 37, /* Clarkdale */ 0, }; static const int wsm_dp_models[] = { 44, /* Westmere-EP, Gulftown */ 47, /* Westmere E7 */ 0, }; static int pfm_wsm_init(void *this) { pfm_intel_x86_cfg.arch_version = 3; return PFM_SUCCESS; } pfmlib_pmu_t intel_wsm_sp_support={ .desc = "Intel Westmere (single-socket)", .name = "wsm", .pmu = PFM_PMU_INTEL_WSM, .pme_count = LIBPFM_ARRAY_SIZE(intel_wsm_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 4, .num_fixed_cntrs = 3, .max_encoding = 2, /* because of OFFCORE_RESPONSE */ .pe = intel_wsm_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = wsm_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_wsm_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, 
PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, }; pfmlib_pmu_t intel_wsm_dp_support={ .desc = "Intel Westmere DP", .name = "wsm_dp", .pmu = PFM_PMU_INTEL_WSM_DP, .pme_count = LIBPFM_ARRAY_SIZE(intel_wsm_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = INTEL_X86_PLM, .num_cntrs = 4, .num_fixed_cntrs = 3, .max_encoding = 2, /* because of OFFCORE_RESPONSE */ .pe = intel_wsm_pe, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | INTEL_X86_PMU_FL_ECMASK, .cpu_family = 6, .cpu_models = wsm_dp_models, .pmu_detect = pfm_intel_x86_model_detect, .pmu_init = pfm_wsm_init, .get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .validate_table = pfm_intel_x86_validate_table, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, .can_auto_encode = pfm_intel_x86_can_auto_encode, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_x86.c000066400000000000000000001057111502707512200220540ustar00rootroot00000000000000/* pfmlib_intel_x86.c : common code for Intel X86 processors * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The 
above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file implements the common code for all Intel X86 processors. */ #include <sys/types.h> #include <ctype.h> #include <string.h> #include <stdlib.h> #include <stdio.h> /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" const pfmlib_attr_desc_t intel_x86_mods[]={ PFM_ATTR_B("k", "monitor at priv level 0"), /* monitor priv level 0 */ PFM_ATTR_B("u", "monitor at priv level 1, 2, 3"), /* monitor priv level 1, 2, 3 */ PFM_ATTR_B("e", "edge level (may require counter-mask >= 1)"), /* edge */ PFM_ATTR_B("i", "invert"), /* invert */ PFM_ATTR_I("c", "counter-mask in range [0-255]"), /* counter-mask */ PFM_ATTR_B("t", "measure any thread"), /* monitor on both threads */ PFM_ATTR_I("ldlat", "load latency threshold (cycles, [3-65535])"), /* load latency threshold */ PFM_ATTR_B("intx", "monitor only inside transactional memory region"), PFM_ATTR_B("intxcp", "do not count occurrences inside aborted transactional memory region"), PFM_ATTR_I("fe_thres", "frontend bubble latency threshold in cycles ([1-4095])"), PFM_ATTR_NULL /* end-marker to avoid exporting number of entries */ }; pfm_intel_x86_config_t pfm_intel_x86_cfg; #define mdhw(m, u, at) (m & u & _INTEL_X86_##at) static inline void cpuid(unsigned int op, unsigned int *a, unsigned int *b, unsigned int *c, unsigned int *d) { asm volatile("cpuid" : "=a" (*a), "=b" (*b), "=c" (*c), "=d" (*d) : "a" (op) : "memory"); } static void
pfm_intel_x86_display_reg(void *this, pfmlib_event_desc_t *e) { const intel_x86_entry_t *pe = this_pe(this); pfm_intel_x86_reg_t reg; int i; reg.val = e->codes[0]; /* * handle generic counters */ __pfm_vbprintf("[0x%"PRIx64" event_sel=0x%x umask=0x%x os=%d usr=%d " "en=%d int=%d inv=%d edge=%d cnt_mask=%d", reg.val, reg.sel_event_select, reg.sel_unit_mask, reg.sel_os, reg.sel_usr, reg.sel_en, reg.sel_int, reg.sel_inv, reg.sel_edge, reg.sel_cnt_mask); if (pe[e->event].modmsk & _INTEL_X86_ATTR_T) __pfm_vbprintf(" any=%d", reg.sel_anythr); __pfm_vbprintf("]"); for (i = 1 ; i < e->count; i++) __pfm_vbprintf(" [0x%"PRIx64"]", e->codes[i]); __pfm_vbprintf(" %s\n", e->fstr); } /* * number of HW modifiers */ static int intel_x86_num_mods(void *this, int idx) { const intel_x86_entry_t *pe = this_pe(this); unsigned int mask; mask = pe[idx].modmsk; return pfmlib_popcnt(mask); } int intel_x86_attr2mod(void *this, int pidx, int attr_idx) { const intel_x86_entry_t *pe = this_pe(this); size_t x; int n, numasks; numasks = intel_x86_num_umasks(this, pidx); n = attr_idx - numasks; pfmlib_for_each_bit(x, pe[pidx].modmsk) { if (n == 0) break; n--; } return x; } /* * detect processor model using cpuid() * based on documentation * http://www.intel.com/Assets/PDF/appnote/241618.pdf */ int pfm_intel_x86_detect(void) { unsigned int a, b, c, d; char buffer[64]; if (pfm_intel_x86_cfg.family) return PFM_SUCCESS; cpuid(0, &a, &b, &c, &d); strncpy(&buffer[0], (char *)(&b), 4); strncpy(&buffer[4], (char *)(&d), 4); strncpy(&buffer[8], (char *)(&c), 4); buffer[12] = '\0'; /* must be Intel */ if (strcmp(buffer, "GenuineIntel")) return PFM_ERR_NOTSUPP; cpuid(1, &a, &b, &c, &d); pfm_intel_x86_cfg.family = (a >> 8) & 0xf; // bits 11 - 8 pfm_intel_x86_cfg.model = (a >> 4) & 0xf; // Bits 7 - 4 pfm_intel_x86_cfg.stepping = a & 0xf; // Bits 0 - 3 /* extended family */ if (pfm_intel_x86_cfg.family == 0xf) pfm_intel_x86_cfg.family += (a >> 20) & 0xff; /* extended model */ if (pfm_intel_x86_cfg.family >= 
0x6) pfm_intel_x86_cfg.model += ((a >> 16) & 0xf) << 4; return PFM_SUCCESS; } int pfm_intel_x86_model_detect(void *this) { pfmlib_pmu_t *pmu = this; const int *p; int ret; ret = pfm_intel_x86_detect(); if (ret != PFM_SUCCESS) return ret; if (pfm_intel_x86_cfg.family != pmu->cpu_family) return PFM_ERR_NOTSUPP; for (p = pmu->cpu_models; *p; p++) { if (*p == pfm_intel_x86_cfg.model) return PFM_SUCCESS; } return PFM_ERR_NOTSUPP; } int pfm_intel_x86_add_defaults(void *this, pfmlib_event_desc_t *e, unsigned int msk, uint64_t *umask, unsigned short max_grpid, int excl_grp_but_0) { const intel_x86_entry_t *pe = this_pe(this); const intel_x86_entry_t *ent; unsigned int i; unsigned short grpid; int j, k, added, skip; int idx; k = e->nattrs; ent = pe+e->event; for(i=0; msk; msk >>=1, i++) { if (!(msk & 0x1)) continue; added = skip = 0; /* * must scan list of possible attributes * (not all possible attributes) */ for (j = 0; j < e->npattrs; j++) { if (e->pattrs[j].ctrl != PFM_ATTR_CTRL_PMU) continue; if (e->pattrs[j].type != PFM_ATTR_UMASK) continue; idx = e->pattrs[j].idx; if (get_grpid(ent->umasks[idx].grpid) != i) continue; if (max_grpid != INTEL_X86_MAX_GRPID && i > max_grpid) { skip = 1; continue; } if (intel_x86_uflag(this, e->event, idx, INTEL_X86_GRP_DFL_NONE)) { skip = 1; continue; } grpid = ent->umasks[idx].grpid; if (excl_grp_but_0 != -1 && grpid != 0 && excl_grp_but_0 != grpid) { skip = 1; continue; } /* umask is default for group */ if (intel_x86_uflag(this, e->event, idx, INTEL_X86_DFL)) { DPRINT("added default %s for group %d j=%d idx=%d ucode=0x%"PRIx64"\n", ent->umasks[idx].uname, i, j, idx, ent->umasks[idx].ucode); /* * default could be an alias, but * ucode must reflect actual code */ *umask |= ent->umasks[idx].ucode >> 8; e->attrs[k].id = j; /* pattrs index */ e->attrs[k].ival = 0; k++; added++; if (intel_x86_eflag(this, e->event, INTEL_X86_GRP_EXCL)) goto done; if (intel_x86_uflag(this, e->event, idx, INTEL_X86_EXCL_GRP_GT)) { if (max_grpid != 
INTEL_X86_MAX_GRPID) { DPRINT("two max_grpid, old=%d new=%d\n", max_grpid, get_grpid(ent->umasks[idx].grpid)); return PFM_ERR_UMASK; } max_grpid = ent->umasks[idx].grpid; } } } if (!added && !skip) { DPRINT("no default found for event %s unit mask group %d (max_grpid=%d)\n", ent->name, i, max_grpid); return PFM_ERR_UMASK; } } DPRINT("max_grpid=%d nattrs=%d k=%d umask=0x%"PRIx64"\n", max_grpid, e->nattrs, k, *umask); done: e->nattrs = k; return PFM_SUCCESS; } #if 1 static int intel_x86_check_pebs(void *this, pfmlib_event_desc_t *e) { return PFM_SUCCESS; } #else /* this routine is supposed to check that umask combination is valid * w.r.t. PEBS. You cannot combine PEBS with non PEBS. But this is * only a problem if PEBS is requested. But this is not known at this * arch level. It is known at the OS interface level. Therefore we * cannot really check this here. * We keep the code around in case we need it later on */ static int intel_x86_check_pebs(void *this, pfmlib_event_desc_t *e) { const intel_x86_entry_t *pe = this_pe(this); pfmlib_event_attr_info_t *a; int numasks = 0, pebs = 0; int i; /* * if event has no umask and is PEBS, then we are okay */ if (!pe[e->event].numasks && intel_x86_eflag(this, e->event, INTEL_X86_PEBS)) return PFM_SUCCESS; /* * if the event sets PEBS, then it means at least one umask * supports PEBS, so we need to check */ for (i = 0; i < e->nattrs; i++) { a = attr(e, i); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) { /* count number of umasks */ numasks++; /* and those that support PEBS */ if (intel_x86_uflag(this, e->event, a->idx, INTEL_X86_PEBS)) pebs++; } } /* * pass if user requested only PEBS umasks */ return pebs != numasks ?
PFM_ERR_FEATCOMB : PFM_SUCCESS; } #endif static int intel_x86_check_max_grpid(void *this, pfmlib_event_desc_t *e, unsigned short max_grpid) { const intel_x86_entry_t *pe; pfmlib_event_attr_info_t *a; unsigned short grpid; int i; DPRINT("check: max_grpid=%d\n", max_grpid); pe = this_pe(this); for (i = 0; i < e->nattrs; i++) { a = attr(e, i); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) { grpid = pe[e->event].umasks[a->idx].grpid; if (grpid > max_grpid) return PFM_ERR_FEATCOMB; } } return PFM_SUCCESS; } static int pfm_intel_x86_encode_gen(void *this, pfmlib_event_desc_t *e) { pfmlib_pmu_t *pmu = this; pfmlib_event_attr_info_t *a; const intel_x86_entry_t *pe; pfm_intel_x86_reg_t reg, reg2; unsigned int grpmsk, ugrpmsk = 0; uint64_t umask1, umask2, ucode, last_ucode = ~0ULL; unsigned int modhw = 0; unsigned int plmmsk = 0; int umodmsk = 0, modmsk_r = 0; int k, ret, id, no_mods = 0; int max_req_grpid = -1; unsigned short grpid; unsigned short max_grpid = INTEL_X86_MAX_GRPID; unsigned short last_grpid = INTEL_X86_MAX_GRPID; unsigned short req_grpid; unsigned int ldlat = 0, ldlat_um = 0; unsigned int fe_thr= 0, fe_thr_um = 0; int excl_grp_but_0 = -1; int grpcounts[INTEL_X86_NUM_GRP]; int req_grps[INTEL_X86_NUM_GRP]; int ncombo[INTEL_X86_NUM_GRP]; memset(grpcounts, 0, sizeof(grpcounts)); memset(ncombo, 0, sizeof(ncombo)); pe = this_pe(this); e->fstr[0] = '\0'; /* * preset certain fields from event code * including modifiers */ reg.val = pe[e->event].code; grpmsk = (1 << pe[e->event].ngrp)-1; /* take into account hardcoded umask */ umask1 = (reg.val >> 8) & 0xff; umask2 = 0; modmsk_r = pe[e->event].modmsk_req; for (k = 0; k < e->nattrs; k++) { a = attr(e, k); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) { grpid = get_grpid(pe[e->event].umasks[a->idx].grpid); req_grpid = get_req_grpid(pe[e->event].umasks[a->idx].grpid); /* * certain event groups are meant to be * exclusive, i.e., only unit masks of one group * can 
be used */ if (last_grpid != INTEL_X86_MAX_GRPID && grpid != last_grpid && intel_x86_eflag(this, e->event, INTEL_X86_GRP_EXCL)) { DPRINT("exclusive unit mask group error\n"); return PFM_ERR_FEATCOMB; } /* * selecting certain umasks in a group may exclude any umasks * from any groups with a higher index * * enforcement requires looking at the grpid of all the umasks */ if (intel_x86_uflag(this, e->event, a->idx, INTEL_X86_EXCL_GRP_GT)) max_grpid = grpid; if (intel_x86_uflag(this, e->event, a->idx, INTEL_X86_EXCL_GRP_BUT_0)) excl_grp_but_0 = grpid; /* * upper layer has removed duplicates * so if we come here more than once, it is for two * distinct umasks * * NCOMBO=no combination of unit masks within the same * umask group */ ++grpcounts[grpid]; if (intel_x86_uflag(this, e->event, a->idx, INTEL_X86_GRP_REQ)) { DPRINT("event requires grpid %d\n", req_grpid); /* initialize req_grpcounts array only when needed */ if (max_req_grpid == -1) { int x; for (x = 0; x < INTEL_X86_NUM_GRP; x++) req_grps[x] = 0xff; } if (req_grpid > max_req_grpid) max_req_grpid = req_grpid; DPRINT("max_req_grpid=%d\n", max_req_grpid); req_grps[req_grpid] = 1; } /* mark that we have a umask with NCOMBO in this group */ if (intel_x86_uflag(this, e->event, a->idx, INTEL_X86_NCOMBO)) ncombo[grpid] = 1; if (intel_x86_uflag(this, e->event, a->idx, INTEL_X86_LDLAT)) ldlat_um = 1; if (intel_x86_uflag(this, e->event, a->idx, INTEL_X86_FETHR)) fe_thr_um = 1; no_mods |= intel_x86_uflag(this, e->event, a->idx, INTEL_X86_NO_MODS); /* * if more than one umask in this group but one is marked * with ncombo, then fail.
It is okay to combine umask within * a group as long as none is tagged with NCOMBO */ if (grpcounts[grpid] > 1 && ncombo[grpid]) { DPRINT("umask %s does not support unit mask combination within group %d\n", pe[e->event].umasks[a->idx].uname, grpid); return PFM_ERR_FEATCOMB; } last_grpid= grpid; ucode = pe[e->event].umasks[a->idx].ucode; modhw |= pe[e->event].umasks[a->idx].modhw; umask2 |= ucode >> 8; ugrpmsk |= 1 << pe[e->event].umasks[a->idx].grpid; modmsk_r |= pe[e->event].umasks[a->idx].umodmsk_req; if (intel_x86_uflag(this, e->event, a->idx, INTEL_X86_CODE_OVERRIDE)) { if (last_ucode != ~0ULL && (ucode & 0xff) != last_ucode) { DPRINT("cannot override event with two different codes for %s\n", pe[e->event].name); return PFM_ERR_FEATCOMB; } last_ucode = ucode & 0xff; reg.sel_event_select = last_ucode; } } else if (a->type == PFM_ATTR_RAW_UMASK) { int ofr_bits = 8; uint64_t rmask; /* set limit on width of raw umask */ if (intel_x86_eflag(this, e->event, INTEL_X86_NHM_OFFCORE)) { ofr_bits = 38; if (e->pmu->pmu == PFM_PMU_INTEL_WSM || e->pmu->pmu == PFM_PMU_INTEL_WSM_DP) ofr_bits = 16; } rmask = (1ULL << ofr_bits) - 1; if (a->idx & ~rmask) { DPRINT("raw umask is too wide max %d bits\n", ofr_bits); return PFM_ERR_ATTR; } /* override umask */ umask2 = a->idx & rmask; ugrpmsk = grpmsk; } else { uint64_t ival = e->attrs[k].ival; switch(a->idx) { case INTEL_X86_ATTR_I: /* invert */ reg.sel_inv = !!ival; umodmsk |= _INTEL_X86_ATTR_I; break; case INTEL_X86_ATTR_E: /* edge */ reg.sel_edge = !!ival; umodmsk |= _INTEL_X86_ATTR_E; break; case INTEL_X86_ATTR_C: /* counter-mask */ if (ival > 255) return PFM_ERR_ATTR_VAL; reg.sel_cnt_mask = ival; umodmsk |= _INTEL_X86_ATTR_C; break; case INTEL_X86_ATTR_U: /* USR */ reg.sel_usr = !!ival; plmmsk |= _INTEL_X86_ATTR_U; umodmsk |= _INTEL_X86_ATTR_U; break; case INTEL_X86_ATTR_K: /* OS */ reg.sel_os = !!ival; plmmsk |= _INTEL_X86_ATTR_K; umodmsk |= _INTEL_X86_ATTR_K; break; case INTEL_X86_ATTR_T: /* anythread (v3 and above) */ 
reg.sel_anythr = !!ival; umodmsk |= _INTEL_X86_ATTR_T; break; case INTEL_X86_ATTR_LDLAT: /* load latency */ /* as per Intel SDM, lowest value is 1, 16-bit field */ if (ival < 1 || ival > 65535) return PFM_ERR_ATTR_VAL; ldlat = ival; break; case INTEL_X86_ATTR_INTX: /* in_tx */ reg.sel_intx = !!ival; umodmsk |= _INTEL_X86_ATTR_INTX; break; case INTEL_X86_ATTR_INTXCP: /* in_tx_cp */ reg.sel_intxcp = !!ival; umodmsk |= _INTEL_X86_ATTR_INTXCP; break; case INTEL_X86_ATTR_FETHR: /* precise frontend latency threshold */ if (ival < 1 || ival > 4095) return PFM_ERR_ATTR_VAL; fe_thr = ival; umodmsk |= _INTEL_X86_ATTR_FETHR; break; } } } /* check required groups are in place */ if (max_req_grpid != -1) { int x; for (x = 0; x <= max_req_grpid; x++) { if (req_grps[x] == 0xff) continue; if ((ugrpmsk & (1 << x)) == 0) { DPRINT("required grpid %d umask missing\n", x); return PFM_ERR_FEATCOMB; } } } /* * we need to wait until all the attributes have been parsed to check * for conflicts between hardcoded attributes and user-provided attributes. * we do not want to depend on the order in which they are specified * * The test checks for conflicts. It is okay to specify an attribute if * it encodes to the same value as the hardcoded value.
That allows * us to parse a FQESTR (fully-qualified event string) as returned by * the library */ reg2.val = (umask1 | umask2) << 8; if (mdhw(modhw, umodmsk, ATTR_I) && reg2.sel_inv != reg.sel_inv) return PFM_ERR_ATTR_SET; if (mdhw(modhw, umodmsk, ATTR_E) && reg2.sel_edge != reg.sel_edge) return PFM_ERR_ATTR_SET; if (mdhw(modhw, umodmsk, ATTR_C) && reg2.sel_cnt_mask != reg.sel_cnt_mask) return PFM_ERR_ATTR_SET; if (mdhw(modhw, umodmsk, ATTR_U) && reg2.sel_usr != reg.sel_usr) return PFM_ERR_ATTR_SET; if (mdhw(modhw, umodmsk, ATTR_K) && reg2.sel_os != reg.sel_os) return PFM_ERR_ATTR_SET; if (mdhw(modhw, umodmsk, ATTR_T) && reg2.sel_anythr != reg.sel_anythr) return PFM_ERR_ATTR_SET; if (mdhw(modhw, umodmsk, ATTR_INTX) && reg2.sel_intx != reg.sel_intx) return PFM_ERR_ATTR_SET; if (mdhw(modhw, umodmsk, ATTR_INTXCP) && reg2.sel_intxcp != reg.sel_intxcp) return PFM_ERR_ATTR_SET; /* * handle case where no priv level mask was passed. * then we use the dfl_plm */ if (!(plmmsk & (_INTEL_X86_ATTR_K|_INTEL_X86_ATTR_U))) { if ((e->dfl_plm & PFM_PLM0) && (pmu->supported_plm & PFM_PLM0)) reg.sel_os = 1; if ((e->dfl_plm & PFM_PLM3) && (pmu->supported_plm & PFM_PLM3)) reg.sel_usr = 1; } /* * check that there is at least one unit mask in each unit * mask group */ if ((ugrpmsk != grpmsk && !intel_x86_eflag(this, e->event, INTEL_X86_GRP_EXCL)) || ugrpmsk == 0) { ugrpmsk ^= grpmsk; ret = pfm_intel_x86_add_defaults(this, e, ugrpmsk, &umask2, max_grpid, excl_grp_but_0); if (ret != PFM_SUCCESS) return ret; } /* * GRP_EXCL_BUT_0 groups require at least one bit set in grpid = 0 and one in theirs * applies to OFFCORE_RESPONSE umasks on some processors (e.g., Goldmont) */ DPRINT("excl_grp_but_0=%d\n", excl_grp_but_0); if (excl_grp_but_0 != -1) { /* skip group 0, because it is authorized */ for (k = 1; k < INTEL_X86_NUM_GRP; k++) { DPRINT("grpcounts[%d]=%d\n", k, grpcounts[k]); if (grpcounts[k] && k != excl_grp_but_0) { DPRINT("GRP_EXCL_BUT_0 but grpcounts[%d]=%d\n", k, grpcounts[k]); return
PFM_ERR_FEATCOMB; } } } ret = intel_x86_check_pebs(this, e); if (ret != PFM_SUCCESS) return ret; /* * check no umask violates the max_grpid constraint */ if (max_grpid != INTEL_X86_MAX_GRPID) { ret = intel_x86_check_max_grpid(this, e, max_grpid); if (ret != PFM_SUCCESS) { DPRINT("event %s: umask from grp > %d\n", pe[e->event].name, max_grpid); return ret; } } if (modmsk_r && (umodmsk ^ modmsk_r)) { DPRINT("required modifiers missing: 0x%x\n", modmsk_r); return PFM_ERR_ATTR; } /* * reorder all the attributes such that the fstr appears always * the same regardless of how the attributes were submitted. */ evt_strcat(e->fstr, "%s", pe[e->event].name); pfmlib_sort_attr(e); for(k=0; k < e->nattrs; k++) { a = attr(e, k); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) evt_strcat(e->fstr, ":%s", pe[e->event].umasks[a->idx].uname); else if (a->type == PFM_ATTR_RAW_UMASK) evt_strcat(e->fstr, ":0x%x", a->idx); } if (intel_x86_eflag(this, e->event, INTEL_X86_FRONTEND)) { uint64_t um_thr = (umask2 >> 8) & 0xfff; /* threshold from umask */ DPRINT("um_thr=0x%"PRIx64 " fe_thr=%u thr_um=%u modhw=0x%x umodhw=0x%x\n", um_thr, fe_thr, fe_thr_um, modhw, umodmsk); /* umask expects a fe_thres modifier */ if (fe_thr_um) { /* hardware has non zero fe_thres (hardcoded) */ if (um_thr) { /* user passed fe_thres, then must match hardcoded */ if (mdhw(modhw, umodmsk, ATTR_FETHR)) { if (fe_thr != um_thr) return PFM_ERR_ATTR_SET; } else fe_thr = um_thr; } else if (fe_thr == 0) { fe_thr = INTEL_X86_FETHR_DEFAULT; } umask2 &= ~((0xfffULL) << 8); umask2 |= fe_thr << 8; } } /* * offcore_response or precise frontend require a separate register */ if (intel_x86_eflag(this, e->event, INTEL_X86_NHM_OFFCORE) || intel_x86_eflag(this, e->event, INTEL_X86_FRONTEND)) { e->codes[1] = umask2; e->count = 2; umask2 = 0; } else { e->count = 1; } if (ldlat && !ldlat_um) { DPRINT("passed ldlat= but not using ldlat umask\n"); return PFM_ERR_ATTR; } /* * force a default ldlat (will not 
appear in display_reg) */ if (ldlat_um && !ldlat) { DPRINT("missing ldlat= for umask, forcing to default %d cycles\n", INTEL_X86_LDLAT_DEFAULT); ldlat = INTEL_X86_LDLAT_DEFAULT; } if (ldlat && ldlat_um) { e->codes[1] = ldlat; e->count = 2; } /* take into account hardcoded modifiers, so use or on reg.val */ reg.val |= (umask1 | umask2) << 8; reg.sel_en = 1; /* force enable bit to 1 */ reg.sel_int = 1; /* force APIC int to 1 */ e->codes[0] = reg.val; /* * on recent processors (except Atom), edge requires cmask >=1 */ if ((pmu->flags & INTEL_X86_PMU_FL_ECMASK) && reg.sel_edge && !reg.sel_cnt_mask) { DPRINT("edge requires cmask >= 1\n"); return PFM_ERR_ATTR; } /* * no modifier to encode in fstr if umasks or event * does not support any */ if (no_mods) return PFM_SUCCESS; /* * decode ALL modifiers */ for (k = 0; k < e->npattrs; k++) { if (e->pattrs[k].ctrl != PFM_ATTR_CTRL_PMU) continue; if (e->pattrs[k].type == PFM_ATTR_UMASK) continue; id = e->pattrs[k].idx; switch(id) { case INTEL_X86_ATTR_U: evt_strcat(e->fstr, ":%s=%lu", intel_x86_mods[id].name, reg.sel_usr); break; case INTEL_X86_ATTR_K: evt_strcat(e->fstr, ":%s=%lu", intel_x86_mods[id].name, reg.sel_os); break; case INTEL_X86_ATTR_E: evt_strcat(e->fstr, ":%s=%lu", intel_x86_mods[id].name, reg.sel_edge); break; case INTEL_X86_ATTR_I: evt_strcat(e->fstr, ":%s=%lu", intel_x86_mods[id].name, reg.sel_inv); break; case INTEL_X86_ATTR_C: evt_strcat(e->fstr, ":%s=%lu", intel_x86_mods[id].name, reg.sel_cnt_mask); break; case INTEL_X86_ATTR_T: evt_strcat(e->fstr, ":%s=%lu", intel_x86_mods[id].name, reg.sel_anythr); break; case INTEL_X86_ATTR_LDLAT: evt_strcat(e->fstr, ":%s=%d", intel_x86_mods[id].name, ldlat); break; case INTEL_X86_ATTR_INTX: evt_strcat(e->fstr, ":%s=%lu", intel_x86_mods[id].name, reg.sel_intx); break; case INTEL_X86_ATTR_INTXCP: evt_strcat(e->fstr, ":%s=%lu", intel_x86_mods[id].name, reg.sel_intxcp); break; case INTEL_X86_ATTR_FETHR: evt_strcat(e->fstr, ":%s=%lu", intel_x86_mods[id].name, fe_thr); break; 
} } return PFM_SUCCESS; } int pfm_intel_x86_get_encoding(void *this, pfmlib_event_desc_t *e) { int ret; ret = pfm_intel_x86_encode_gen(this, e); if (ret != PFM_SUCCESS) return ret; pfm_intel_x86_display_reg(this, e); return PFM_SUCCESS; } int pfm_intel_x86_get_event_first(void *this) { pfmlib_pmu_t *p = this; int idx = 0; /* skip event for different models */ while (idx < p->pme_count && !is_model_event(this, idx)) idx++; return idx < p->pme_count ? idx : -1; } int pfm_intel_x86_get_event_next(void *this, int idx) { pfmlib_pmu_t *p = this; /* pme_count is always >= 1*/ if (idx >= (p->pme_count-1)) return -1; idx++; /* skip event for different models */ while (idx < p->pme_count && !is_model_event(this, idx)) idx++; return idx < p->pme_count ? idx : -1; } int pfm_intel_x86_event_is_valid(void *this, int pidx) { pfmlib_pmu_t *p = this; return pidx >= 0 && pidx < p->pme_count && is_model_event(this, pidx); } int pfm_intel_x86_validate_table(void *this, FILE *fp) { pfmlib_pmu_t *pmu = this; const intel_x86_entry_t *pe = this_pe(this); int ndfl[INTEL_X86_NUM_GRP]; int i, j, error = 0; unsigned int u, v; int npebs; if (!pmu->atdesc) { fprintf(fp, "pmu: %s missing attr_desc\n", pmu->name); error++; } if (!pmu->supported_plm && pmu->type == PFM_PMU_TYPE_CORE) { fprintf(fp, "pmu: %s supported_plm not set\n", pmu->name); error++; } for(i=0; i < pmu->pme_count; i++) { if (!is_model_event(this, i)) continue; if (!pe[i].name) { fprintf(fp, "pmu: %s event%d: :: no name (prev event was %s)\n", pmu->name, i, i > 1 ? 
pe[i-1].name : "??"); error++; } if (!pe[i].desc) { fprintf(fp, "pmu: %s event%d: %s :: no description\n", pmu->name, i, pe[i].name); error++; } if (pe[i].desc && strlen(pe[i].desc) == 0) { fprintf(fp, "pmu: %s event%d: %s :: empty description\n", pmu->name, i, pe[i].name); error++; } if (!pe[i].cntmsk) { fprintf(fp, "pmu: %s event%d: %s :: cntmsk=0\n", pmu->name, i, pe[i].name); error++; } if (pe[i].numasks && pe[i].ngrp == 0) { fprintf(fp, "pmu: %s event%d: %s :: ngrp cannot be zero\n", pmu->name, i, pe[i].name); error++; } if (pe[i].numasks && pe[i].umasks == NULL) { fprintf(fp, "pmu: %s event%d: %s :: numasks but no umasks\n", pmu->name, i, pe[i].name); error++; } if (pe[i].numasks == 0 && pe[i].umasks) { fprintf(fp, "pmu: %s event%d: %s :: numasks=0 but umasks defined\n", pmu->name, i, pe[i].name); error++; } if (pe[i].numasks == 0 && pe[i].ngrp) { fprintf(fp, "pmu: %s event%d: %s :: ngrp must be zero\n", pmu->name, i, pe[i].name); error++; } if (pe[i].ngrp >= INTEL_X86_NUM_GRP) { fprintf(fp, "pmu: %s event%d: %s :: ngrp too big (max=%d)\n", pmu->name, i, pe[i].name, INTEL_X86_NUM_GRP); error++; } if (pe[i].model >= PFM_PMU_MAX) { fprintf(fp, "pmu: %s event%d: %s :: model too big (max=%d)\n", pmu->name, i, pe[i].name, PFM_PMU_MAX); error++; } for (j=i+1; j < (int)pmu->pme_count; j++) { if (pe[i].code == pe[j].code && pe[i].model == pe[j].model && !intel_x86_eflag(pmu, i, INTEL_X86_DEPRECATED) && !(pe[j].equiv || pe[i].equiv) && pe[j].cntmsk == pe[i].cntmsk && !intel_x86_eflag(pmu, i, INTEL_X86_CODE_DUP) && !!intel_x86_eflag(pmu, j, INTEL_X86_CODE_DUP)) { fprintf(fp, "pmu: %s events %s and %s have the same code 0x%x\n", pmu->name, pe[i].name, pe[j].name, pe[i].code); error++; } } for(j=0; j < INTEL_X86_NUM_GRP; j++) ndfl[j] = 0; for(j=0, npebs = 0; j < (int)pe[i].numasks; j++) { if (!pe[i].umasks[j].uname) { fprintf(fp, "pmu: %s event%d: %s umask%d :: no name\n", pmu->name, i, pe[i].name, j); error++; } if (pe[i].umasks[j].modhw && (pe[i].umasks[j].modhw | 
pe[i].modmsk) != pe[i].modmsk) { fprintf(fp, "pmu: %s event%d: %s umask%d: %s :: modhw not subset of modmsk\n", pmu->name, i, pe[i].name, j, pe[i].umasks[j].uname); error++; } if (!pe[i].umasks[j].udesc) { fprintf(fp, "pmu: %s event%d: umask%d: %s :: no description\n", pmu->name, i, j, pe[i].umasks[j].uname); error++; } if (pe[i].umasks[j].udesc && strlen(pe[i].umasks[j].udesc) == 0) { fprintf(fp, "pmu: %s event%d: umask%d: %s :: empty description\n", pmu->name, i, j, pe[i].umasks[j].uname); error++; } if (pe[i].ngrp && get_grpid(pe[i].umasks[j].grpid) >= pe[i].ngrp) { fprintf(fp, "pmu: %s event%d: %s umask%d: %s :: invalid grpid %d (must be < %d)\n", pmu->name, i, pe[i].name, j, pe[i].umasks[j].uname, get_grpid(pe[i].umasks[j].grpid), pe[i].ngrp); error++; } if (pe[i].ngrp && get_req_grpid(pe[i].umasks[j].grpid) >= pe[i].ngrp) { fprintf(fp, "pmu: %s event%d: %s umask%d: %s :: invalid req_grpid %d (must be < %d)\n", pmu->name, i, pe[i].name, j, pe[i].umasks[j].uname, get_req_grpid(pe[i].umasks[j].grpid), pe[i].ngrp); error++; } if (pe[i].umasks[j].umodel >= PFM_PMU_MAX) { fprintf(fp, "pmu: %s event%d: %s umask%d: %s :: model too big (max=%d)\n", pmu->name, i, pe[i].name, j, pe[i].umasks[j].uname, PFM_PMU_MAX); error++; } if (pe[i].umasks[j].uflags & INTEL_X86_DFL) ndfl[pe[i].umasks[j].grpid]++; if (pe[i].umasks[j].uflags & INTEL_X86_PEBS) npebs++; } if (npebs && !intel_x86_eflag(this, i, INTEL_X86_PEBS)) { fprintf(fp, "pmu: %s event%d: %s, pebs umasks but event pebs flag is not set\n", pmu->name, i, pe[i].name); error++; } if (intel_x86_eflag(this, i, INTEL_X86_PEBS) && pe[i].numasks && npebs == 0) { fprintf(fp, "pmu: %s event%d: %s, pebs event flag but no umask has the pebs flag\n", pmu->name, i, pe[i].name); error++; } /* if only one umask, then ought to be default */ if (pe[i].numasks == 1 && !(pe[i].umasks[0].uflags & INTEL_X86_DFL)) { fprintf(fp, "pmu: %s event%d: %s, only one umask but no default set\n", pmu->name, i, pe[i].name); error++; } if 
(pe[i].numasks) { unsigned int *dfl_model = malloc(sizeof(*dfl_model) * pe[i].numasks); if (!dfl_model) goto skip_dfl; for(u=0; u < pe[i].ngrp; u++) { int l = 0, m; for (v = 0; v < pe[i].numasks; v++) { if (pe[i].umasks[v].grpid != u) continue; if (pe[i].umasks[v].uflags & INTEL_X86_DFL) { for (m = 0; m < l; m++) { if (dfl_model[m] == pe[i].umasks[v].umodel || dfl_model[m] == 0) { fprintf(fp, "pmu: %s event%d: %s grpid %d has 2 default umasks\n", pmu->name, i, pe[i].name, u); error++; } } if (m == l) dfl_model[l++] = pe[i].umasks[v].umodel; } } } free(dfl_model); } skip_dfl: if (pe[i].flags & INTEL_X86_NCOMBO) { fprintf(fp, "pmu: %s event%d: %s :: NCOMBO is a umask only flag\n", pmu->name, i, pe[i].name); error++; } for(u=0; u < pe[i].numasks; u++) { if (pe[i].umasks[u].uequiv) continue; if (pe[i].umasks[u].uflags & INTEL_X86_NCOMBO) continue; for(v=j+1; v < pe[i].numasks; v++) { if (pe[i].umasks[v].uequiv) continue; if (pe[i].umasks[v].uflags & INTEL_X86_NCOMBO) continue; if (pe[i].umasks[v].grpid != pe[i].umasks[u].grpid) continue; if ((pe[i].umasks[u].ucode & pe[i].umasks[v].ucode) && pe[i].umasks[u].umodel == pe[i].umasks[v].umodel) { fprintf(fp, "pmu: %s event%d: %s :: umask %s and %s have overlapping code bits\n", pmu->name, i, pe[i].name, pe[i].umasks[u].uname, pe[i].umasks[v].uname); error++; } } } } return error ? 
PFM_ERR_INVAL : PFM_SUCCESS; } int pfm_intel_x86_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info) { const intel_x86_entry_t *pe = this_pe(this); const pfmlib_attr_desc_t *atdesc = this_atdesc(this); pfmlib_pmu_t *pmu = this; int numasks, idx; if (!is_model_event(this, pidx)) { DPRINT("invalid event index %d\n", pidx); return PFM_ERR_INVAL; } numasks = intel_x86_num_umasks(this, pidx); if (attr_idx < numasks) { int has_extpebs = pmu->flags & INTEL_X86_PMU_FL_EXTPEBS; int no_mods; idx = intel_x86_attr2umask(this, pidx, attr_idx); info->name = pe[pidx].umasks[idx].uname; info->desc = pe[pidx].umasks[idx].udesc; info->equiv= pe[pidx].umasks[idx].uequiv; info->code = pe[pidx].umasks[idx].ucode; if (!intel_x86_uflag(this, pidx, idx, INTEL_X86_CODE_OVERRIDE)) info->code >>= 8; no_mods = intel_x86_uflag(this, pidx, idx, INTEL_X86_NO_MODS); info->type = PFM_ATTR_UMASK; info->support_no_mods = no_mods; info->is_dfl = intel_x86_uflag(this, pidx, idx, INTEL_X86_DFL); info->is_precise = intel_x86_uflag(this, pidx, idx, INTEL_X86_PEBS) && !no_mods; /* * if PEBS is supported, then hw buffer sampling is also supported * because PEBS is a hw buffer */ info->support_hw_smpl = (info->is_precise || has_extpebs) && !no_mods; /* * On Intel X86, either all or none of the umasks are speculative * for a speculative event, so propagate speculation info to all * umasks */ if (pmu->flags & PFMLIB_PMU_FL_SPEC) { int ret = intel_x86_eflag(this, pidx, INTEL_X86_SPEC); if (ret) info->is_speculative = PFM_EVENT_INFO_SPEC_TRUE; else info->is_speculative = PFM_EVENT_INFO_SPEC_FALSE; } else info->is_speculative = PFM_EVENT_INFO_SPEC_NA; } else { idx = intel_x86_attr2mod(this, pidx, attr_idx); info->name = atdesc[idx].name; info->desc = atdesc[idx].desc; info->type = atdesc[idx].type; info->equiv= NULL; info->code = idx; info->is_dfl = 0; info->is_precise = 0; info->is_speculative = PFM_EVENT_INFO_SPEC_NA; info->support_hw_smpl = 0; } info->ctrl = 
PFM_ATTR_CTRL_PMU; info->idx = idx; /* namespace specific index */ info->dfl_val64 = 0; return PFM_SUCCESS; } int pfm_intel_x86_get_event_info(void *this, int idx, pfm_event_info_t *info) { const intel_x86_entry_t *pe = this_pe(this); pfmlib_pmu_t *pmu = this; int has_extpebs = pmu->flags & INTEL_X86_PMU_FL_EXTPEBS; if (!is_model_event(this, idx)) { DPRINT("invalid event index %d\n", idx); return PFM_ERR_INVAL; } info->name = pe[idx].name; info->desc = pe[idx].desc; info->code = pe[idx].code; info->equiv = pe[idx].equiv; info->idx = idx; /* private index */ info->pmu = pmu->pmu; /* * no umask: event supports PEBS * with umasks: at least one umask supports PEBS */ info->is_precise = intel_x86_eflag(this, idx, INTEL_X86_PEBS); /* * if PEBS is supported, then hw buffer sampling is also supported * because PEBS is a hw buffer * * if the PMU supports ExtendedPEBS, then all events can be * recorded using the PEBS buffer. They will all benefit from * the sampling buffer feature. They will not all become precise. * Only the precise at-retirement events will be skidless. Though * by construction PEBS also limits the skid for all events. */ info->support_hw_smpl = (info->is_precise || has_extpebs); if (pmu->flags & PFMLIB_PMU_FL_SPEC) { int ret = intel_x86_eflag(this, idx, INTEL_X86_SPEC); if (ret) info->is_speculative = PFM_EVENT_INFO_SPEC_TRUE; else info->is_speculative = PFM_EVENT_INFO_SPEC_FALSE; } info->nattrs = intel_x86_num_umasks(this, idx); info->nattrs += intel_x86_num_mods(this, idx); return PFM_SUCCESS; } int pfm_intel_x86_valid_pebs(pfmlib_event_desc_t *e) { pfmlib_event_attr_info_t *a; int i, npebs = 0, numasks = 0; /* first check at the event level */ if (intel_x86_eflag(e->pmu, e->event, INTEL_X86_PEBS)) return PFM_SUCCESS; /* * next check the umasks * * we do not assume we are called after * pfm_intel_x86_get_event_encoding(), therefore * we check the unit masks again. * They must all be PEBS-capable. 
*/ for(i=0; i < e->nattrs; i++) { a = attr(e, i); if (a->ctrl != PFM_ATTR_CTRL_PMU || a->type != PFM_ATTR_UMASK) continue; numasks++; if (intel_x86_uflag(e->pmu, e->event, a->idx, INTEL_X86_PEBS)) npebs++; } return npebs == numasks ? PFM_SUCCESS : PFM_ERR_FEATCOMB; } unsigned int pfm_intel_x86_get_event_nattrs(void *this, int pidx) { unsigned int nattrs; nattrs = intel_x86_num_umasks(this, pidx); nattrs += intel_x86_num_mods(this, pidx); return nattrs; } int pfm_intel_x86_can_auto_encode(void *this, int pidx, int uidx) { int numasks; if (intel_x86_eflag(this, pidx, INTEL_X86_NO_AUTOENCODE)) return 0; numasks = intel_x86_num_umasks(this, pidx); if (uidx >= numasks) return 0; return !intel_x86_uflag(this, pidx, uidx, INTEL_X86_NO_AUTOENCODE); } static int intel_x86_event_valid(void *this, int i) { const intel_x86_entry_t *pe = this_pe(this); pfmlib_pmu_t *pmu = this; return pe[i].model == 0 || pe[i].model == pmu->pmu; } int pfm_intel_x86_get_num_events(void *this) { pfmlib_pmu_t *pmu = this; int i, num = 0; /* * count actual number of events for specific PMU. * Table may contain more events for the family than * what a specific model actually supports. */ for (i = 0; i < pmu->pme_count; i++) if (intel_x86_event_valid(this, i)) num++; return num; } papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_x86_arch.c000066400000000000000000000134321502707512200230470ustar00rootroot00000000000000/* * pfmlib_intel_x86_arch.c : Intel architectural PMU v1, v2, v3 * * Copyright (c) 2005-2007 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* * * This file implements supports for the IA-32 architectural PMU as specified * in the following document: * "IA-32 Intel Architecture Software Developer's Manual - Volume 3B: System * Programming Guide" */ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_intel_x86_priv.h" /* architecture private */ #include "events/intel_x86_arch_events.h" /* architected event table */ extern pfmlib_pmu_t intel_x86_arch_support; static intel_x86_entry_t *x86_arch_pe; static inline void cpuid(unsigned int op, unsigned int *a, unsigned int *b, unsigned int *c, unsigned int *d) { asm volatile("cpuid" : "=a" (*a), "=b" (*b), "=c" (*c), "=d" (*d) : "a" (op) : "memory"); } /* * create architected event table */ static int create_arch_event_table(unsigned int mask, int version) { intel_x86_entry_t *pe; int i, num_events = 0; int m; DPRINT("version=%d evt_msk=0x%x\n", version, mask); /* * first pass: count the number of supported events */ m = mask; for(i=0; i < 7; i++, m>>=1) { if ((m & 0x1) == 0) num_events++; } intel_x86_arch_support.pme_count = num_events; pe = calloc(num_events, sizeof(intel_x86_entry_t)); if (pe == NULL) return PFM_ERR_NOTSUPP; x86_arch_pe = pe; intel_x86_arch_support.pe = pe; /* * second pass: populate the table */ m = mask; for(i=0; i < 7; i++, m>>=1) { if (!(m & 0x1)) { *pe = intel_x86_arch_pe[i]; switch(version) { case 5: pe->modmsk = INTEL_V5_ATTRS; break; case 4: pe->modmsk = INTEL_V4_ATTRS; break; case 3: pe->modmsk = INTEL_V3_ATTRS; break; default: pe->modmsk = INTEL_V2_ATTRS; break; } pe++; } } return PFM_SUCCESS; } static int check_arch_pmu(int family) { union { unsigned int val; intel_x86_pmu_eax_t eax; intel_x86_pmu_edx_t edx; } eax, ecx, edx, ebx; /* * check family number to reject for processors * older than Pentium (family=5). 
Those processors * did not have the CPUID instruction */ if (family < 5 || family == 15) return PFM_ERR_NOTSUPP; /* * check if CPU supports 0xa function of CPUID * 0xa started with Core Duo. Needed to detect if * architected PMU is present */ cpuid(0x0, &eax.val, &ebx.val, &ecx.val, &edx.val); if (eax.val < 0xa) return PFM_ERR_NOTSUPP; /* * extract architected PMU information */ cpuid(0xa, &eax.val, &ebx.val, &ecx.val, &edx.val); /* * version must be greater than zero */ return eax.eax.version < 1 ? PFM_ERR_NOTSUPP : PFM_SUCCESS; } static int pfm_intel_x86_arch_detect(void *this) { int ret; ret = pfm_intel_x86_detect(); if (ret != PFM_SUCCESS) return ret; return check_arch_pmu(pfm_intel_x86_cfg.family); } static int pfm_intel_x86_arch_init(void *this) { union { unsigned int val; intel_x86_pmu_eax_t eax; intel_x86_pmu_edx_t edx; } eax, ecx, edx, ebx; /* * extract architected PMU information */ if (!pfm_cfg.forced_pmu) { cpuid(0xa, &eax.val, &ebx.val, &ecx.val, &edx.val); intel_x86_arch_support.num_cntrs = eax.eax.num_cnt; intel_x86_arch_support.num_fixed_cntrs = edx.edx.num_cnt; } else { eax.eax.version = 3; ebx.val = 0; /* no restriction */ intel_x86_arch_support.num_cntrs = 0; intel_x86_arch_support.num_fixed_cntrs = 0; } /* * must be called after impl_cntrs has been initialized */ return create_arch_event_table(ebx.val, eax.eax.version); } void pfm_intel_x86_arch_terminate(void *this) { /* workaround const void for intel_x86_arch_support.pe */ if (x86_arch_pe) free(x86_arch_pe); } /* architected PMU */ pfmlib_pmu_t intel_x86_arch_support={ .desc = "Intel X86 architectural PMU", .name = "ix86arch", .pmu = PFM_PMU_INTEL_X86_ARCH, .pme_count = 0, .pe = NULL, .atdesc = intel_x86_mods, .flags = PFMLIB_PMU_FL_RAW_UMASK | PFMLIB_PMU_FL_ARCH_DFL, .type = PFM_PMU_TYPE_CORE, .max_encoding = 1, .supported_plm = INTEL_X86_PLM, .pmu_detect = pfm_intel_x86_arch_detect, .pmu_init = pfm_intel_x86_arch_init, .pmu_terminate = pfm_intel_x86_arch_terminate, 
.get_event_encoding[PFM_OS_NONE] = pfm_intel_x86_get_encoding, PFMLIB_ENCODE_PERF(pfm_intel_x86_get_perf_encoding), .get_event_first = pfm_intel_x86_get_event_first, .get_event_next = pfm_intel_x86_get_event_next, .event_is_valid = pfm_intel_x86_event_is_valid, .get_event_info = pfm_intel_x86_get_event_info, .get_event_attr_info = pfm_intel_x86_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_intel_x86_perf_validate_pattrs), .get_event_nattrs = pfm_intel_x86_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_x86_perf_event.c000066400000000000000000000212511502707512200242650ustar00rootroot00000000000000/* pfmlib_intel_x86_perf.c : perf_event Intel X86 functions * * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_intel_x86_priv.h" #include "pfmlib_perf_event_priv.h" static int has_ldlat(void *this, pfmlib_event_desc_t *e) { pfmlib_event_attr_info_t *a; int i; for (i = 0; i < e->nattrs; i++) { a = attr(e, i); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type != PFM_ATTR_UMASK) continue; if (intel_x86_uflag(this, e->event, a->idx, INTEL_X86_LDLAT)) return 1; } return 0; } int pfm_intel_x86_get_perf_encoding(void *this, pfmlib_event_desc_t *e) { pfmlib_pmu_t *pmu = this; pfm_intel_x86_reg_t reg; struct perf_event_attr *attr = e->os_data; int ret; if (!pmu->get_event_encoding[PFM_OS_NONE]) return PFM_ERR_NOTSUPP; /* * first, we need to do the generic encoding */ ret = pmu->get_event_encoding[PFM_OS_NONE](this, e); if (ret != PFM_SUCCESS) return ret; if (e->count > 2) { DPRINT("unsupported count=%d\n", e->count); return PFM_ERR_NOTSUPP; } /* default PMU type */ attr->type = PERF_TYPE_RAW; /* * if PMU specifies a perf PMU name, then grab the type * from sysfs as it is most likely dynamically assigned. * This allows this function to be used by some uncore PMUs */ if (pmu->perf_name) { int type; ret = pfm_perf_find_pmu_type(pmu, &type); if (ret != PFM_SUCCESS) { DPRINT("perf PMU %s, not supported by OS\n", pmu->perf_name); } else { DPRINT("PMU %s perf type=%d\n", pmu->name, type); attr->type = type; } } reg.val = e->codes[0]; /* * suppress the bits which are under the control of perf_events * they will be ignored by the perf tool and the kernel interface * the OS/USR bits are controlled by the attr.exclude_* fields * the EN/INT bits are controlled by the kernel */ reg.sel_en = 0; reg.sel_int = 0; reg.sel_os = 0; reg.sel_usr = 0; attr->config = reg.val; if (e->count > 1) { /* * Nehalem/Westmere/Sandy Bridge OFFCORE_RESPONSE events * take two MSRs. 
Lower level returns two codes: * - codes[0] goes to regular counter config * - codes[1] goes into extra MSR */ if (intel_x86_eflag(this, e->event, INTEL_X86_NHM_OFFCORE)) { if (e->count != 2) { DPRINT("perf_encoding: offcore=1 count=%d\n", e->count); return PFM_ERR_INVAL; } attr->config1 = e->codes[1]; } /* * SkyLake FRONTEND_RETIRED event * takes two MSRs. Lower level returns two codes: * - codes[0] goes to regular counter config * - codes[1] goes into extra MSR */ if (intel_x86_eflag(this, e->event, INTEL_X86_FRONTEND)) { if (e->count != 2) { DPRINT("perf_encoding: frontend_retired=1 count=%d\n", e->count); return PFM_ERR_INVAL; } attr->config1 = e->codes[1]; } /* * Event has filters and perf_events expects them in the umask (extended) * For instance: SK UPI BASIC_HDR_FILT */ if (e->count > 1 && intel_x86_eflag(this, e->event, INTEL_X86_FILT_UMASK)) { attr->config |= e->codes[1] << 32; } /* * Load Latency threshold (NHM/WSM/SNB) * - codes[0] goes to regular counter config * - codes[1] LD_LAT MSR value (LSB 16 bits) */ if (has_ldlat(this, e)) { if (e->count != 2) { DPRINT("perf_encoding: ldlat count=%d\n", e->count); return PFM_ERR_INVAL; } attr->config1 = e->codes[1]; } } return PFM_SUCCESS; } int pfm_intel_nhm_unc_get_perf_encoding(void *this, pfmlib_event_desc_t *e) { pfmlib_pmu_t *pmu = this; struct perf_event_attr *attr = e->os_data; pfm_intel_x86_reg_t reg; int ret, type; if (!pmu->get_event_encoding[PFM_OS_NONE]) return PFM_ERR_NOTSUPP; ret = pmu->get_event_encoding[PFM_OS_NONE](this, e); if (ret != PFM_SUCCESS) return ret; ret = pfm_perf_find_pmu_type(pmu, &type); if (ret != PFM_SUCCESS) return ret; attr->type = type; reg.val = e->codes[0]; /* * encoder treats all events as using the generic * counters. * perf_events override the enable and int bits, so * drop them here. * * also makes fixed counter special encoding 0xff * work. kernel checking for perfect match. 
*/ reg.nhm_unc.usel_en = 0; reg.nhm_unc.usel_int = 0; attr->config = reg.val; /* * uncore measures at all priv levels * * user cannot set per-event priv levels because * attributes are simply not there * * dfl_plm is ignored in this case */ attr->exclude_hv = 0; attr->exclude_kernel = 0; attr->exclude_user = 0; return PFM_SUCCESS; } int pfm_intel_x86_requesting_pebs(pfmlib_event_desc_t *e) { pfmlib_event_attr_info_t *a; int i; for (i = 0; i < e->nattrs; i++) { a = attr(e, i); if (a->ctrl != PFM_ATTR_CTRL_PERF_EVENT) continue; if (a->idx == PERF_ATTR_PR && e->attrs[i].ival) return 1; } return 0; } static int intel_x86_event_has_pebs(void *this, pfmlib_event_desc_t *e) { pfmlib_event_attr_info_t *a; int i; /* first check at the event level */ if (intel_x86_eflag(e->pmu, e->event, INTEL_X86_PEBS)) return 1; /* check umasks */ for(i=0; i < e->npattrs; i++) { a = e->pattrs+i; if (a->ctrl != PFM_ATTR_CTRL_PMU || a->type != PFM_ATTR_UMASK) continue; if (intel_x86_uflag(e->pmu, e->event, a->idx, INTEL_X86_PEBS)) return 1; } return 0; } static int intel_x86_event_has_hws(void *this, pfmlib_event_desc_t *e) { pfmlib_pmu_t *pmu = this; return !!(pmu->flags & INTEL_X86_PMU_FL_EXTPEBS); } /* * remove attrs which are in conflicts (or duplicated) with os layer */ void pfm_intel_x86_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e) { pfmlib_pmu_t *pmu = this; int i, compact; int has_hws = intel_x86_event_has_hws(this, e); int has_pebs = intel_x86_event_has_pebs(this, e); int no_smpl = pmu->flags & PFMLIB_PMU_FL_NO_SMPL; for (i = 0; i < e->npattrs; i++) { compact = 0; /* umasks never conflict */ if (e->pattrs[i].type == PFM_ATTR_UMASK) continue; /* * with perf_events, u and k are handled at the OS level * via exclude_user, exclude_kernel. 
*/ if (e->pattrs[i].ctrl == PFM_ATTR_CTRL_PMU) { if (e->pattrs[i].idx == INTEL_X86_ATTR_U || e->pattrs[i].idx == INTEL_X86_ATTR_K) compact = 1; } if (e->pattrs[i].ctrl == PFM_ATTR_CTRL_PERF_EVENT) { /* Precise mode, subject to PEBS */ if (e->pattrs[i].idx == PERF_ATTR_PR && !has_pebs) compact = 1; /* hardware sampling mode, subject to HWS or PEBS */ if (e->pattrs[i].idx == PERF_ATTR_HWS && (!has_hws || has_pebs)) compact = 1; /* * No hypervisor on Intel */ if (e->pattrs[i].idx == PERF_ATTR_H) compact = 1; if (no_smpl && ( e->pattrs[i].idx == PERF_ATTR_FR || e->pattrs[i].idx == PERF_ATTR_PR || e->pattrs[i].idx == PERF_ATTR_PE)) compact = 1; /* * no priv level support * We assume that if we do not support hardware plm, * then the host, guest priv level filtering is not * supported as well, even though on some arch it is * achieved by the OS enabling/disabling on VMM entry * and exit. */ if (pmu->supported_plm == 0 && ( e->pattrs[i].idx == PERF_ATTR_U || e->pattrs[i].idx == PERF_ATTR_K || e->pattrs[i].idx == PERF_ATTR_MG || e->pattrs[i].idx == PERF_ATTR_MH)) compact = 1; } if (compact) { /* e->npattrs modified by call */ pfmlib_compact_pattrs(e, i); /* compensate for i++ */ i--; } } } int pfm_intel_x86_perf_detect(void *this) { pfmlib_pmu_t *pmu = this; char file[64]; snprintf(file,sizeof(file), "%s/%s", SYSFS_PMU_DEVICES_DIR, pmu->perf_name); return access(file, R_OK|X_OK) ? 
PFM_ERR_NOTSUPP : PFM_SUCCESS; } papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_intel_x86_priv.h000066400000000000000000000347451502707512200231310ustar00rootroot00000000000000/* * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
*/ #ifndef __PFMLIB_INTEL_X86_PRIV_H__ #define __PFMLIB_INTEL_X86_PRIV_H__ /* * This file contains the definitions used for all Intel X86 processors */ /* * maximum number of unit masks groups per event */ #define INTEL_X86_NUM_GRP 8 #define INTEL_X86_MAX_FILTERS 2 /* * unit mask description */ typedef struct { const char *uname; /* unit mask name */ const char *udesc; /* unit umask description */ const char *uequiv;/* name of event from which this one is derived, NULL if none */ uint64_t ucntmsk;/* supported counters for umask (if set, supersedes cntmsk) */ uint64_t ucode; /* unit mask code */ /* * extra 32-bit encoding for event * filter[0]: bit 0-31 is set mask, bit 32-63 is clear mask * filter[1]: bit 0-31 is set mask, bit 32-63 is clear mask */ uint64_t ufilters[INTEL_X86_MAX_FILTERS]; unsigned int uflags; /* unit mask flags */ unsigned short umodel; /* only available on this PMU model */ unsigned short grpid; /* unit mask group id */ unsigned int modhw; /* hardwired modifiers, cannot be changed */ unsigned int umodmsk_req; /* bitmask of required modifiers */ } intel_x86_umask_t; #define INTEL_X86_MAX_GRPID ((unsigned short)(~0)) /* * event description */ typedef struct { const char *name; /* event name */ const char *desc; /* event description */ const char *equiv; /* name of event from which this one is derived, NULL if none */ uint64_t cntmsk; /* supported counters */ unsigned int code; /* event code */ unsigned int numasks;/* number of umasks */ unsigned int flags; /* flags */ unsigned int modmsk; /* bitmask of modifiers for this event */ unsigned int modmsk_req; /* bitmask of required modifiers */ unsigned short ngrp; /* number of unit masks groups */ unsigned short model; /* only available on this PMU model */ const intel_x86_umask_t *umasks; /* umask desc */ } intel_x86_entry_t; /* * pme_flags value (event and unit mask) */ #define INTEL_X86_NCOMBO 0x00001 /* unit masks within group cannot be combined */ #define INTEL_X86_FALLBACK_GEN 0x00002 /* 
fallback from fixed to generic counter possible */ #define INTEL_X86_PEBS 0x00004 /* event supports PEBS or at least one umask supports PEBS */ #define INTEL_X86_DFL 0x00008 /* unit mask is default choice */ #define INTEL_X86_GRP_EXCL 0x00010 /* only one unit mask group can be selected */ #define INTEL_X86_NHM_OFFCORE 0x00020 /* Nehalem/Westmere offcore_response */ #define INTEL_X86_EXCL_GRP_GT 0x00040 /* exclude use of grp with id > own grp */ #define INTEL_X86_FIXED 0x00080 /* fixed counter only event */ #define INTEL_X86_NO_AUTOENCODE 0x00100 /* does not support auto encoding validation */ #define INTEL_X86_CODE_OVERRIDE 0x00200 /* umask overrides event code */ #define INTEL_X86_LDLAT 0x00400 /* needs load latency modifier (ldlat) */ #define INTEL_X86_GRP_DFL_NONE 0x00800 /* ok if umask group defaults to no umask */ #define INTEL_X86_FRONTEND 0x01000 /* Skylake Precise frontend */ #define INTEL_X86_FETHR 0x02000 /* precise frontend umask requires threshold modifier (fe_thres) */ #define INTEL_X86_EXCL_GRP_BUT_0 0x04000 /* exclude all groups except self and grpid = 0 */ #define INTEL_X86_GRP_REQ 0x08000 /* grpid field split as (grpid & 0xff) | (required_grpid & 0xff) << 8 */ #define INTEL_X86_FILT_UMASK 0x10000 /* Event use filter which may be encoded in umask */ #define INTEL_X86_FORCE_FILT0 0x20000 /* Event must set filter0 even if zero value */ #define INTEL_X86_SPEC 0x40000 /* Event includes speculative execution */ #define INTEL_X86_DEPRECATED 0x80000 /* Event is deprecated, ignore duplicate event code */ #define INTEL_X86_CODE_DUP 0x100000 /* Event code duplication is handled */ #define INTEL_X86_NO_MODS 0x200000 /* umask does not support modifier */ typedef union pfm_intel_x86_reg { unsigned long long val; /* complete register value */ struct { unsigned long sel_event_select:8; /* event mask */ unsigned long sel_unit_mask:8; /* unit mask */ unsigned long sel_usr:1; /* user level */ unsigned long sel_os:1; /* system level */ unsigned long sel_edge:1; /* 
edge detect */ unsigned long sel_pc:1; /* pin control */ unsigned long sel_int:1; /* enable APIC intr */ unsigned long sel_anythr:1; /* measure any thread */ unsigned long sel_en:1; /* enable */ unsigned long sel_inv:1; /* invert counter mask */ unsigned long sel_cnt_mask:8; /* counter mask */ unsigned long sel_intx:1; /* only in tx region */ unsigned long sel_intxcp:1; /* excl. aborted tx region */ unsigned long sel_res2:30; } perfevtsel; struct { unsigned long usel_event:8; /* event select */ unsigned long usel_umask:8; /* event unit mask */ unsigned long usel_res1:1; /* reserved */ unsigned long usel_occ:1; /* occupancy reset */ unsigned long usel_edge:1; /* edge detection */ unsigned long usel_res2:1; /* reserved */ unsigned long usel_int:1; /* PMI enable */ unsigned long usel_res3:1; /* reserved */ unsigned long usel_en:1; /* enable */ unsigned long usel_inv:1; /* invert */ unsigned long usel_cnt_mask:8; /* counter mask */ unsigned long usel_res4:32; /* reserved */ } nhm_unc; struct { unsigned long usel_en:1; /* enable */ unsigned long usel_res1:1; unsigned long usel_int:1; /* PMI enable */ unsigned long usel_res2:32; unsigned long usel_res3:29; } nhm_unc_fixed; struct { unsigned long cpl_eq0:1; /* filter out branches at pl0 */ unsigned long cpl_neq0:1; /* filter out branches at pl1-pl3 */ unsigned long jcc:1; /* filter out conditional branches */ unsigned long near_rel_call:1; /* filter out near relative calls */ unsigned long near_ind_call:1; /* filter out near indirect calls */ unsigned long near_ret:1; /* filter out near returns */ unsigned long near_ind_jmp:1; /* filter out near unconditional jmp/calls */ unsigned long near_rel_jmp:1; /* filter out near unconditional relative jmp */ unsigned long far_branch:1; /* filter out far branches */ unsigned long reserved1:23; /* reserved */ unsigned long reserved2:32; /* reserved */ } nhm_lbr_select; } pfm_intel_x86_reg_t; #define INTEL_X86_ATTR_K 0 /* kernel (0) */ #define INTEL_X86_ATTR_U 1 /* user (1, 2, 3) */ 
#define INTEL_X86_ATTR_E 2 /* edge */ #define INTEL_X86_ATTR_I 3 /* invert */ #define INTEL_X86_ATTR_C 4 /* counter mask */ #define INTEL_X86_ATTR_T 5 /* any thread */ #define INTEL_X86_ATTR_LDLAT 6 /* load latency threshold */ #define INTEL_X86_ATTR_INTX 7 /* in transaction */ #define INTEL_X86_ATTR_INTXCP 8 /* not aborted transaction */ #define INTEL_X86_ATTR_FETHR 9 /* precise frontend latency threshold */ #define _INTEL_X86_ATTR_U (1 << INTEL_X86_ATTR_U) #define _INTEL_X86_ATTR_K (1 << INTEL_X86_ATTR_K) #define _INTEL_X86_ATTR_I (1 << INTEL_X86_ATTR_I) #define _INTEL_X86_ATTR_E (1 << INTEL_X86_ATTR_E) #define _INTEL_X86_ATTR_C (1 << INTEL_X86_ATTR_C) #define _INTEL_X86_ATTR_T (1 << INTEL_X86_ATTR_T) #define _INTEL_X86_ATTR_INTX (1 << INTEL_X86_ATTR_INTX) #define _INTEL_X86_ATTR_INTXCP (1 << INTEL_X86_ATTR_INTXCP) #define _INTEL_X86_ATTR_LDLAT (1 << INTEL_X86_ATTR_LDLAT) #define _INTEL_X86_ATTR_FETHR (1 << INTEL_X86_ATTR_FETHR) #define INTEL_X86_ATTRS \ (_INTEL_X86_ATTR_I|_INTEL_X86_ATTR_E|_INTEL_X86_ATTR_C|_INTEL_X86_ATTR_U|_INTEL_X86_ATTR_K) #define INTEL_V1_ATTRS INTEL_X86_ATTRS #define INTEL_V2_ATTRS INTEL_X86_ATTRS #define INTEL_FIXED2_ATTRS (_INTEL_X86_ATTR_U|_INTEL_X86_ATTR_K) #define INTEL_FIXED3_ATTRS (INTEL_FIXED2_ATTRS|_INTEL_X86_ATTR_T) #define INTEL_V3_ATTRS (INTEL_V2_ATTRS|_INTEL_X86_ATTR_T) #define INTEL_V4_ATTRS (INTEL_V3_ATTRS | _INTEL_X86_ATTR_INTX | _INTEL_X86_ATTR_INTXCP) #define INTEL_V5_ATTRS (INTEL_V2_ATTRS | _INTEL_X86_ATTR_INTX | _INTEL_X86_ATTR_INTXCP) #define INTEL_SKL_FE_ATTRS (INTEL_V2_ATTRS |\ _INTEL_X86_ATTR_INTX |\ _INTEL_X86_ATTR_INTXCP |\ _INTEL_X86_ATTR_FETHR) /* let's define some handy shortcuts! 
*/ #define sel_event_select perfevtsel.sel_event_select #define sel_unit_mask perfevtsel.sel_unit_mask #define sel_usr perfevtsel.sel_usr #define sel_os perfevtsel.sel_os #define sel_edge perfevtsel.sel_edge #define sel_pc perfevtsel.sel_pc #define sel_int perfevtsel.sel_int #define sel_en perfevtsel.sel_en #define sel_inv perfevtsel.sel_inv #define sel_cnt_mask perfevtsel.sel_cnt_mask #define sel_anythr perfevtsel.sel_anythr #define sel_intx perfevtsel.sel_intx #define sel_intxcp perfevtsel.sel_intxcp /* * shift relative to start of register */ #define INTEL_X86_EDGE_BIT 18 #define INTEL_X86_ANY_BIT 21 #define INTEL_X86_INV_BIT 23 #define INTEL_X86_CMASK_BIT 24 #define INTEL_X86_MOD_EDGE (1 << INTEL_X86_EDGE_BIT) #define INTEL_X86_MOD_ANY (1 << INTEL_X86_ANY_BIT) #define INTEL_X86_MOD_INV (1 << INTEL_X86_INV_BIT) /* intel x86 core PMU supported plm */ #define INTEL_X86_PLM (PFM_PLM0|PFM_PLM3) /* * Intel x86 specific pmu flags (pmu->flags 16 MSB) */ #define INTEL_X86_PMU_FL_ECMASK 0x10000 /* edge requires cmask >=1 */ #define INTEL_X86_PMU_FL_EXTPEBS 0x20000 /* PMU supports ExtendedPEBS */ /* * default ldlat value for PEBS-LL events. 
Used when ldlat= is missing */ #define INTEL_X86_LDLAT_DEFAULT 3 /* default ldlat value in core cycles */ #define INTEL_X86_FETHR_DEFAULT 1 /* default fe_thres value in core cycles */ typedef struct { unsigned int version:8; unsigned int num_cnt:8; unsigned int cnt_width:8; unsigned int ebx_length:8; } intel_x86_pmu_eax_t; typedef struct { unsigned int num_cnt:6; unsigned int cnt_width:6; unsigned int reserved:20; } intel_x86_pmu_edx_t; typedef struct { unsigned int no_core_cycle:1; unsigned int no_inst_retired:1; unsigned int no_ref_cycle:1; unsigned int no_llc_ref:1; unsigned int no_llc_miss:1; unsigned int no_br_retired:1; unsigned int no_br_mispred_retired:1; unsigned int reserved:25; } intel_x86_pmu_ebx_t; typedef struct { int model; int family; /* 0 means nothing detected yet */ int arch_version; int stepping; } pfm_intel_x86_config_t; extern pfm_intel_x86_config_t pfm_intel_x86_cfg; extern const pfmlib_attr_desc_t intel_x86_mods[]; static inline int intel_x86_eflag(void *this, int idx, int flag) { const intel_x86_entry_t *pe = this_pe(this); return !!(pe[idx].flags & flag); } static inline int is_model_event(void *this, int pidx) { pfmlib_pmu_t *pmu = this; const intel_x86_entry_t *pe = this_pe(this); unsigned short model; model = pe[pidx].model; return model == 0 || model == pmu->pmu; } static inline int is_model_umask(void *this, int pidx, int attr) { pfmlib_pmu_t *pmu = this; const intel_x86_entry_t *pe = this_pe(this); const intel_x86_entry_t *ent; unsigned short model; ent = pe + pidx; model = ent->umasks[attr].umodel; return model == 0 || model == pmu->pmu; } static inline int intel_x86_uflag(void *this, int idx, int attr, int flag) { const intel_x86_entry_t *pe = this_pe(this); if (pe[idx].numasks) return !!(pe[idx].umasks[attr].uflags & flag); return 0; } static inline unsigned int intel_x86_num_umasks(void *this, int pidx) { pfmlib_pmu_t *pmu = this; const intel_x86_entry_t *pe = this_pe(this); unsigned int i, n = 0; unsigned short model; /* * some 
umasks may be model specific */ for (i = 0; i < pe[pidx].numasks; i++) { model = pe[pidx].umasks[i].umodel; if (model && model != pmu->pmu) continue; n++; } return n; } /* * find actual index of umask based on attr_idx */ static inline int intel_x86_attr2umask(void *this, int pidx, int attr_idx) { const intel_x86_entry_t *pe = this_pe(this); unsigned int i; for (i = 0; i < pe[pidx].numasks; i++) { if (!is_model_umask(this, pidx, i)) continue; if (attr_idx == 0) break; attr_idx--; } return i; } static inline unsigned short get_grpid(unsigned short grpid) { return grpid & 0xff; } static inline unsigned short get_req_grpid(unsigned short grpid) { return (grpid >> 8) & 0xff; } extern int pfm_intel_x86_detect(void); extern int pfm_intel_x86_add_defaults(void *this, pfmlib_event_desc_t *e, unsigned int msk, uint64_t *umask, unsigned short max_grpid, int excl_grp_but_0); extern int pfm_intel_x86_event_is_valid(void *this, int pidx); extern int pfm_intel_x86_get_encoding(void *this, pfmlib_event_desc_t *e); extern int pfm_intel_x86_get_event_first(void *this); extern int pfm_intel_x86_get_event_next(void *this, int idx); extern int pfm_intel_x86_get_event_umask_first(void *this, int idx); extern int pfm_intel_x86_get_event_umask_next(void *this, int idx, int attr); extern int pfm_intel_x86_validate_table(void *this, FILE *fp); extern int pfm_intel_x86_get_event_attr_info(void *this, int idx, int attr_idx, pfmlib_event_attr_info_t *info); extern int pfm_intel_x86_get_event_info(void *this, int idx, pfm_event_info_t *info); extern int pfm_intel_x86_valid_pebs(pfmlib_event_desc_t *e); extern int pfm_intel_x86_perf_event_encoding(pfmlib_event_desc_t *e, void *data); extern int pfm_intel_x86_perf_detect(void *this); extern unsigned int pfm_intel_x86_get_event_nattrs(void *this, int pidx); extern int intel_x86_attr2mod(void *this, int pidx, int attr_idx); extern int pfm_intel_x86_get_perf_encoding(void *this, pfmlib_event_desc_t *e); extern int 
pfm_intel_nhm_unc_get_perf_encoding(void *this, pfmlib_event_desc_t *e);
extern void pfm_intel_x86_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e);
extern int pfm_intel_x86_can_auto_encode(void *this, int pidx, int uidx);
extern int pfm_intel_x86_model_detect(void *this);
extern int pfm_intel_x86_get_num_events(void *this);

#endif /* __PFMLIB_INTEL_X86_PRIV_H__ */
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_itanium.c
/*
 * pfmlib_itanium.c : support for Itanium-family PMU
 *
 * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */

#include <sys/types.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>

/* public headers */
#include <perfmon/pfmlib_itanium.h>

/* private headers */
#include "pfmlib_priv.h"		/* library private */
#include "pfmlib_priv_ia64.h"		/* architecture private */
#include "pfmlib_itanium_priv.h"	/* PMU private */
#include "itanium_events.h"		/* PMU private */

#define is_ear(i)	event_is_ear(itanium_pe+(i))
#define is_ear_tlb(i)	event_is_tlb_ear(itanium_pe+(i))
#define is_iear(i)	event_is_iear(itanium_pe+(i))
#define is_dear(i)	event_is_dear(itanium_pe+(i))
#define is_btb(i)	event_is_btb(itanium_pe+(i))
#define has_opcm(i)	event_opcm_ok(itanium_pe+(i))
#define has_iarr(i)	event_iarr_ok(itanium_pe+(i))
#define has_darr(i)	event_darr_ok(itanium_pe+(i))

#define evt_use_opcm(e)		((e)->pfp_ita_pmc8.opcm_used != 0 || (e)->pfp_ita_pmc9.opcm_used != 0)
#define evt_use_irange(e)	((e)->pfp_ita_irange.rr_used)
#define evt_use_drange(e)	((e)->pfp_ita_drange.rr_used)

#define evt_umask(e)	itanium_pe[(e)].pme_umask

/* let's define some handy shortcuts! */
#define pmc_plm		pmc_ita_count_reg.pmc_plm
#define pmc_ev		pmc_ita_count_reg.pmc_ev
#define pmc_oi		pmc_ita_count_reg.pmc_oi
#define pmc_pm		pmc_ita_count_reg.pmc_pm
#define pmc_es		pmc_ita_count_reg.pmc_es
#define pmc_umask	pmc_ita_count_reg.pmc_umask
#define pmc_thres	pmc_ita_count_reg.pmc_thres
#define pmc_ism		pmc_ita_count_reg.pmc_ism

/*
 * Description of the PMC register mappings used by
 * this module (as reported in pfmlib_reg_t.reg_num):
 *
 *	0 -> PMC0
 *	1 -> PMC1
 *	n -> PMCn
 *
 * The following are in the model specific rr_br[]:
 *	IBR0 -> 0
 *	IBR1 -> 1
 *	...
 *	IBR7 -> 7
 *	DBR0 -> 0
 *	DBR1 -> 1
 *	...
 *	DBR7 -> 7
 *
 * We do not use a mapping table, instead we make up the
 * values on the fly given the base.
 */
#define PFMLIB_ITA_PMC_BASE 0

static int
pfm_ita_detect(void)
{
	int ret = PFMLIB_ERR_NOTSUPP;
	/*
	 * we support all chips (there is only one!)
in the Itanium family */ if (pfm_ia64_get_cpu_family() == 0x07) ret = PFMLIB_SUCCESS; return ret; } /* * Part of the following code will eventually go into a perfmon library */ static int valid_assign(unsigned int *as, pfmlib_regmask_t *r_pmcs, unsigned int cnt) { unsigned int i; for(i=0; i < cnt; i++) { if (as[i]==0) return PFMLIB_ERR_NOASSIGN; /* * take care of restricted PMC registers */ if (pfm_regmask_isset(r_pmcs, as[i])) return PFMLIB_ERR_NOASSIGN; } return PFMLIB_SUCCESS; } /* * Automatically dispatch events to corresponding counters following constraints. */ static int pfm_ita_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in, pfmlib_output_param_t *outp) { #define has_counter(e,b) (itanium_pe[e].pme_counters & (1 << (b)) ? (b) : 0) pfmlib_ita_input_param_t *param = mod_in; pfm_ita_pmc_reg_t reg; pfmlib_event_t *e; pfmlib_reg_t *pc, *pd; pfmlib_regmask_t *r_pmcs; unsigned int i,j,k,l, m; unsigned int max_l0, max_l1, max_l2, max_l3; unsigned int assign[PMU_ITA_NUM_COUNTERS]; unsigned int cnt; e = inp->pfp_events; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; cnt = inp->pfp_event_count; r_pmcs = &inp->pfp_unavail_pmcs; if (PFMLIB_DEBUG()) { for (m=0; m < cnt; m++) { DPRINT("ev[%d]=%s counters=0x%lx\n", m, itanium_pe[e[m].event].pme_name, itanium_pe[e[m].event].pme_counters); } } if (cnt > PMU_ITA_NUM_COUNTERS) return PFMLIB_ERR_TOOMANY; max_l0 = PMU_ITA_FIRST_COUNTER + PMU_ITA_NUM_COUNTERS; max_l1 = PMU_ITA_FIRST_COUNTER + PMU_ITA_NUM_COUNTERS*(cnt>1); max_l2 = PMU_ITA_FIRST_COUNTER + PMU_ITA_NUM_COUNTERS*(cnt>2); max_l3 = PMU_ITA_FIRST_COUNTER + PMU_ITA_NUM_COUNTERS*(cnt>3); DPRINT("max_l0=%u max_l1=%u max_l2=%u max_l3=%u\n", max_l0, max_l1, max_l2, max_l3); /* * This code needs fixing. It is not very pretty and * won't handle more than 4 counters if more become * available ! * For now, worst case in the loop nest: 4! 
(factorial) */ for (i=PMU_ITA_FIRST_COUNTER; i < max_l0; i++) { assign[0]= has_counter(e[0].event,i); if (max_l1 == PMU_ITA_FIRST_COUNTER && valid_assign(assign, r_pmcs, cnt) == PFMLIB_SUCCESS) goto done; for (j=PMU_ITA_FIRST_COUNTER; j < max_l1; j++) { if (j == i) continue; assign[1] = has_counter(e[1].event,j); if (max_l2 == PMU_ITA_FIRST_COUNTER && valid_assign(assign, r_pmcs, cnt) == PFMLIB_SUCCESS) goto done; for (k=PMU_ITA_FIRST_COUNTER; k < max_l2; k++) { if(k == i || k == j) continue; assign[2] = has_counter(e[2].event,k); if (max_l3 == PMU_ITA_FIRST_COUNTER && valid_assign(assign, r_pmcs, cnt) == PFMLIB_SUCCESS) goto done; for (l=PMU_ITA_FIRST_COUNTER; l < max_l3; l++) { if(l == i || l == j || l == k) continue; assign[3] = has_counter(e[3].event,l); if (valid_assign(assign, r_pmcs, cnt) == PFMLIB_SUCCESS) goto done; } } } } /* we cannot satisfy the constraints */ return PFMLIB_ERR_NOASSIGN; done: for (j=0; j < cnt ; j++ ) { reg.pmc_val = 0; /* clear all */ /* if plm is 0, then assume not specified per-event and use default */ reg.pmc_plm = e[j].plm ? e[j].plm : inp->pfp_dfl_plm; reg.pmc_oi = 1; /* overflow interrupt */ reg.pmc_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc_thres = param ? param->pfp_ita_counters[j].thres: 0; reg.pmc_ism = param ? param->pfp_ita_counters[j].ism : PFMLIB_ITA_ISM_BOTH; reg.pmc_umask = is_ear(e[j].event) ? 
0x0 : evt_umask(e[j].event); reg.pmc_es = itanium_pe[e[j].event].pme_code; pc[j].reg_num = assign[j]; pc[j].reg_value = reg.pmc_val; pc[j].reg_addr = assign[j]; pc[j].reg_alt_addr= assign[j]; pd[j].reg_num = assign[j]; pd[j].reg_addr = assign[j]; pd[j].reg_alt_addr = assign[j]; __pfm_vbprintf("[PMC%u(pmc%u)=0x%06lx thres=%d es=0x%02x plm=%d umask=0x%x pm=%d ism=0x%x oi=%d] %s\n", assign[j], assign[j], reg.pmc_val, reg.pmc_thres, reg.pmc_es,reg.pmc_plm, reg.pmc_umask, reg.pmc_pm, reg.pmc_ism, reg.pmc_oi, itanium_pe[e[j].event].pme_name); __pfm_vbprintf("[PMD%u(pmd%u)]\n", pd[j].reg_num, pd[j].reg_num); } /* number of PMC registers programmed */ outp->pfp_pmc_count = cnt; outp->pfp_pmd_count = cnt; return PFMLIB_SUCCESS; } static int pfm_dispatch_iear(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfm_ita_pmc_reg_t reg; pfmlib_ita_input_param_t *param = mod_in; pfmlib_ita_input_param_t fake_param; pfmlib_reg_t *pc, *pd; unsigned int pos1, pos2; int iear_idx = -1; unsigned int i, count; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; pos1 = outp->pfp_pmc_count; pos2 = outp->pfp_pmd_count; count = inp->pfp_event_count; for (i=0; i < count; i++) { if (is_iear(inp->pfp_events[i].event)) iear_idx = i; } if (param == NULL || mod_in->pfp_ita_iear.ear_used == 0) { /* * case 3: no I-EAR event, no (or nothing) in param->pfp_ita2_iear.ear_used */ if (iear_idx == -1) return PFMLIB_SUCCESS; memset(&fake_param, 0, sizeof(fake_param)); param = &fake_param; pfm_ita_get_ear_mode(inp->pfp_events[iear_idx].event, ¶m->pfp_ita_iear.ear_mode); param->pfp_ita_iear.ear_umask = evt_umask(inp->pfp_events[iear_idx].event); param->pfp_ita_iear.ear_ism = PFMLIB_ITA_ISM_BOTH; /* force both instruction sets */ DPRINT("I-EAR event with no info\n"); } /* sanity check on the mode */ if (param->pfp_ita_iear.ear_mode < 0 || param->pfp_ita_iear.ear_mode > 2) return PFMLIB_ERR_INVAL; /* * case 2: ear_used=1, event is defined, we use the param info as it is more 
precise * case 4: ear_used=1, no event (free running I-EAR), use param info */ reg.pmc_val = 0; /* if plm is 0, then assume not specified per-event and use default */ reg.pmc10_ita_reg.iear_plm = param->pfp_ita_iear.ear_plm ? param->pfp_ita_iear.ear_plm : inp->pfp_dfl_plm; reg.pmc10_ita_reg.iear_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc10_ita_reg.iear_tlb = param->pfp_ita_iear.ear_mode; reg.pmc10_ita_reg.iear_umask = param->pfp_ita_iear.ear_umask; reg.pmc10_ita_reg.iear_ism = param->pfp_ita_iear.ear_ism; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 10)) return PFMLIB_ERR_NOASSIGN; pc[pos1].reg_num = 10; /* PMC10 is I-EAR config register */ pc[pos1].reg_value = reg.pmc_val; pc[pos1].reg_addr = 10; pc[pos1].reg_alt_addr= 10; pos1++; pd[pos2].reg_num = 0; pd[pos2].reg_addr = 0; pd[pos2].reg_alt_addr = 0; pos2++; pd[pos2].reg_num = 1; pd[pos2].reg_addr = 1; pd[pos2].reg_alt_addr = 1; pos2++; __pfm_vbprintf("[PMC10(pmc10)=0x%lx tlb=%s plm=%d pm=%d ism=0x%x umask=0x%x]\n", reg.pmc_val, reg.pmc10_ita_reg.iear_tlb ? 
"Yes" : "No", reg.pmc10_ita_reg.iear_plm, reg.pmc10_ita_reg.iear_pm, reg.pmc10_ita_reg.iear_ism, reg.pmc10_ita_reg.iear_umask); __pfm_vbprintf("[PMD0(pmd0)]\n[PMD1(pmd1)\n"); /* update final number of entries used */ outp->pfp_pmc_count = pos1; outp->pfp_pmd_count = pos2; return PFMLIB_SUCCESS; } static int pfm_dispatch_dear(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfm_ita_pmc_reg_t reg; pfmlib_ita_input_param_t *param = mod_in; pfmlib_ita_input_param_t fake_param; pfmlib_reg_t *pc, *pd; unsigned int pos1, pos2; int dear_idx = -1; unsigned int i, count; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; pos1 = outp->pfp_pmc_count; pos2 = outp->pfp_pmd_count; count = inp->pfp_event_count; for (i=0; i < count; i++) { if (is_dear(inp->pfp_events[i].event)) dear_idx = i; } if (param == NULL || param->pfp_ita_dear.ear_used == 0) { /* * case 3: no D-EAR event, no (or nothing) in param->pfp_ita2_dear.ear_used */ if (dear_idx == -1) return PFMLIB_SUCCESS; memset(&fake_param, 0, sizeof(fake_param)); param = &fake_param; pfm_ita_get_ear_mode(inp->pfp_events[dear_idx].event, ¶m->pfp_ita_dear.ear_mode); param->pfp_ita_dear.ear_umask = evt_umask(inp->pfp_events[dear_idx].event); param->pfp_ita_dear.ear_ism = PFMLIB_ITA_ISM_BOTH; /* force both instruction sets */ DPRINT("D-EAR event with no info\n"); } /* sanity check on the mode */ if (param->pfp_ita_dear.ear_mode > 2) return PFMLIB_ERR_INVAL; /* * case 2: ear_used=1, event is defined, we use the param info as it is more precise * case 4: ear_used=1, no event (free running D-EAR), use param info */ reg.pmc_val = 0; /* if plm is 0, then assume not specified per-event and use default */ reg.pmc11_ita_reg.dear_plm = param->pfp_ita_dear.ear_plm ? param->pfp_ita_dear.ear_plm : inp->pfp_dfl_plm; reg.pmc11_ita_reg.dear_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 
1 : 0; reg.pmc11_ita_reg.dear_tlb = param->pfp_ita_dear.ear_mode; reg.pmc11_ita_reg.dear_ism = param->pfp_ita_dear.ear_ism; reg.pmc11_ita_reg.dear_umask = param->pfp_ita_dear.ear_umask; reg.pmc11_ita_reg.dear_pt = param->pfp_ita_drange.rr_used ? 0: 1; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 11)) return PFMLIB_ERR_NOASSIGN; pc[pos1].reg_num = 11; /* PMC11 is D-EAR config register */ pc[pos1].reg_value = reg.pmc_val; pc[pos1].reg_addr = 11; pos1++; pd[pos2].reg_num = 2; pd[pos2].reg_addr = 2; pd[pos2].reg_alt_addr = 2; pos2++; pd[pos2].reg_num = 3; pd[pos2].reg_addr = 3; pd[pos2].reg_alt_addr = 3; pos2++; pd[pos2].reg_num = 17; pd[pos2].reg_addr = 17; pd[pos2].reg_alt_addr = 17; pos2++; __pfm_vbprintf("[PMC11(pmc11)=0x%lx tlb=%s plm=%d pm=%d ism=0x%x umask=0x%x pt=%d]\n", reg.pmc_val, reg.pmc11_ita_reg.dear_tlb ? "Yes" : "No", reg.pmc11_ita_reg.dear_plm, reg.pmc11_ita_reg.dear_pm, reg.pmc11_ita_reg.dear_ism, reg.pmc11_ita_reg.dear_umask, reg.pmc11_ita_reg.dear_pt); __pfm_vbprintf("[PMD2(pmd2)]\n[PMD3(pmd3)\nPMD17(pmd17)\n"); /* update final number of entries used */ outp->pfp_pmc_count = pos1; outp->pfp_pmd_count = pos2; return PFMLIB_SUCCESS; } static int pfm_dispatch_opcm(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfmlib_ita_input_param_t *param = mod_in; pfm_ita_pmc_reg_t reg; pfmlib_reg_t *pc = outp->pfp_pmcs; int pos = outp->pfp_pmc_count; if (param == NULL) return PFMLIB_SUCCESS; if (param->pfp_ita_pmc8.opcm_used) { reg.pmc_val = param->pfp_ita_pmc8.pmc_val; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 8)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 8; pc[pos].reg_value = reg.pmc_val; pc[pos].reg_addr = 8; pc[pos].reg_alt_addr = 8; pos++; __pfm_vbprintf("[PMC8(pmc8)=0x%lx m=%d i=%d f=%d b=%d match=0x%x mask=0x%x]\n", reg.pmc_val, reg.pmc8_9_ita_reg.m, reg.pmc8_9_ita_reg.i, reg.pmc8_9_ita_reg.f, reg.pmc8_9_ita_reg.b, reg.pmc8_9_ita_reg.match, reg.pmc8_9_ita_reg.mask); } if (param->pfp_ita_pmc9.opcm_used) 
{ reg.pmc_val = param->pfp_ita_pmc9.pmc_val; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 9)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 9; pc[pos].reg_value = reg.pmc_val; pc[pos].reg_addr = 9; pc[pos].reg_alt_addr = 9; pos++; __pfm_vbprintf("[PMC9(pmc9)=0x%lx m=%d i=%d f=%d b=%d match=0x%x mask=0x%x]\n", reg.pmc_val, reg.pmc8_9_ita_reg.m, reg.pmc8_9_ita_reg.i, reg.pmc8_9_ita_reg.f, reg.pmc8_9_ita_reg.b, reg.pmc8_9_ita_reg.match, reg.pmc8_9_ita_reg.mask); } outp->pfp_pmc_count = pos; return PFMLIB_SUCCESS; } static int pfm_dispatch_btb(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfm_ita_pmc_reg_t reg; pfmlib_ita_input_param_t *param = mod_in; pfmlib_ita_input_param_t fake_param; pfmlib_reg_t *pc, *pd; int found_btb=0; unsigned int i, count; unsigned int pos1, pos2; reg.pmc_val = 0; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; pos1 = outp->pfp_pmc_count; pos2 = outp->pfp_pmd_count; count = inp->pfp_event_count; for (i=0; i < count; i++) { if (is_btb(inp->pfp_events[i].event)) found_btb = 1; } if (param == NULL || param->pfp_ita_btb.btb_used == 0) { /* * case 3: no BTB event, no param */ if (found_btb == 0) return PFMLIB_SUCCESS; /* * case 1: BTB event, no param, capture all branches */ memset(&fake_param, 0, sizeof(fake_param)); param = &fake_param; param->pfp_ita_btb.btb_tar = 0x1; /* capture TAR */ param->pfp_ita_btb.btb_tm = 0x3; /* all branches */ param->pfp_ita_btb.btb_ptm = 0x3; /* all branches */ param->pfp_ita_btb.btb_ppm = 0x3; /* all branches */ param->pfp_ita_btb.btb_tac = 0x1; /* capture TAC */ param->pfp_ita_btb.btb_bac = 0x1; /* capture BAC */ DPRINT("BTB event with no info\n"); } /* * case 2: BTB event, param * case 4: no BTB event, param (free running mode) */ /* if plm is 0, then assume not specified per-event and use default */ reg.pmc12_ita_reg.btbc_plm = param->pfp_ita_btb.btb_plm ? param->pfp_ita_btb.btb_plm : inp->pfp_dfl_plm; reg.pmc12_ita_reg.btbc_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 
	1 : 0;
	reg.pmc12_ita_reg.btbc_tar = param->pfp_ita_btb.btb_tar & 0x1;
	reg.pmc12_ita_reg.btbc_tm  = param->pfp_ita_btb.btb_tm  & 0x3;
	reg.pmc12_ita_reg.btbc_ptm = param->pfp_ita_btb.btb_ptm & 0x3;
	reg.pmc12_ita_reg.btbc_ppm = param->pfp_ita_btb.btb_ppm & 0x3;
	reg.pmc12_ita_reg.btbc_bpt = param->pfp_ita_btb.btb_tac & 0x1;
	reg.pmc12_ita_reg.btbc_bac = param->pfp_ita_btb.btb_bac & 0x1;

	if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 12))
		return PFMLIB_ERR_NOASSIGN;

	pc[pos1].reg_num   = 12;
	pc[pos1].reg_value = reg.pmc_val;
	pc[pos1].reg_addr  = 12;
	pos1++;

	__pfm_vbprintf("[PMC12(pmc12)=0x%lx plm=%d pm=%d tar=%d tm=%d ptm=%d ppm=%d bpt=%d bac=%d]\n",
			reg.pmc_val,
			reg.pmc12_ita_reg.btbc_plm,
			reg.pmc12_ita_reg.btbc_pm,
			reg.pmc12_ita_reg.btbc_tar,
			reg.pmc12_ita_reg.btbc_tm,
			reg.pmc12_ita_reg.btbc_ptm,
			reg.pmc12_ita_reg.btbc_ppm,
			reg.pmc12_ita_reg.btbc_bpt,
			reg.pmc12_ita_reg.btbc_bac);

	/*
	 * PMD16 is included in list of used PMD
	 */
	for(i=8; i < 17; i++, pos2++) {
		pd[pos2].reg_num      = i;
		pd[pos2].reg_addr     = i;
		pd[pos2].reg_alt_addr = i;
		__pfm_vbprintf("[PMD%u(pmd%u)]\n", pd[pos2].reg_num, pd[pos2].reg_num);
	}

	/* update final number of entries used */
	outp->pfp_pmc_count = pos1;
	outp->pfp_pmd_count = pos2;

	return PFMLIB_SUCCESS;
}

/*
 * mode = 0 -> check code (enforce bundle alignment)
 * mode = 1 -> check data
 */
static int
check_intervals(pfmlib_ita_input_rr_t *irr, int mode, int *n_intervals)
{
	int i;
	pfmlib_ita_input_rr_desc_t *lim = irr->rr_limits;

	for(i=0; i < 4; i++) {
		/* end marker */
		if (lim[i].rr_start == 0 && lim[i].rr_end == 0)
			break;

		/* invalid entry */
		if (lim[i].rr_start >= lim[i].rr_end)
			return PFMLIB_ERR_IRRINVAL;

		if (mode == 0 && (lim[i].rr_start & 0xf || lim[i].rr_end & 0xf))
			return PFMLIB_ERR_IRRALIGN;
	}
	*n_intervals = i;
	return PFMLIB_SUCCESS;
}

static void
do_normal_rr(unsigned long start, unsigned long end,
	     pfmlib_reg_t *br, int nbr, int dir, int *idx, int *reg_idx, int plm)
{
	unsigned long size, l_addr, c;
	unsigned long l_offs = 0, r_offs = 0;
	unsigned long
l_size, r_size; dbreg_t db; int p2; if (nbr < 1 || end <= start) return; size = end - start; DPRINT("start=0x%016lx end=0x%016lx size=0x%lx bytes (%lu bundles) nbr=%d dir=%d\n", start, end, size, size >> 4, nbr, dir); p2 = pfm_ia64_fls(size); c = ALIGN_DOWN(end, p2); DPRINT("largest power of two possible: 2^%d=0x%lx, crossing=0x%016lx\n", p2, 1UL << p2, c); if ((c - (1UL<= start) { l_addr = c - (1UL << p2); } else { p2--; if ((c + (1UL<>l_offs: 0x%lx\n", l_offs); } } else if (dir == 1 && r_size != 0 && nbr == 1) { p2++; l_addr = start; if (PFMLIB_DEBUG()) { r_offs = l_addr+(1UL<>r_offs: 0x%lx\n", r_offs); } } l_size = l_addr - start; r_size = end - l_addr-(1UL<>largest chunk: 2^%d @0x%016lx-0x%016lx\n", p2, l_addr, l_addr+(1UL<>before: 0x%016lx-0x%016lx\n", start, l_addr); if (r_size && !r_offs) DPRINT(">>after : 0x%016lx-0x%016lx\n", l_addr+(1UL<>1; if (nbr & 0x1) { /* * our simple heuristic is: * we assign the largest number of registers to the largest * of the two chunks */ if (l_size > r_size) { l_nbr++; } else { r_nbr++; } } do_normal_rr(start, l_addr, br, l_nbr, 0, idx, reg_idx, plm); do_normal_rr(l_addr+(1UL<rr_start, in_rr->rr_end, n_pairs); __pfm_vbprintf("start offset: -0x%lx end_offset: +0x%lx\n", out_rr->rr_soff, out_rr->rr_eoff); for (j=0; j < n_pairs; j++, base_idx += 2) { d.val = dbr[base_idx+1].reg_value; r_end = dbr[base_idx].reg_value+((~(d.db.db_mask)) & ~(0xffUL << 56)); __pfm_vbprintf("brp%u: db%u: 0x%016lx db%u: plm=0x%x mask=0x%016lx end=0x%016lx\n", dbr[base_idx].reg_num>>1, dbr[base_idx].reg_num, dbr[base_idx].reg_value, dbr[base_idx+1].reg_num, d.db.db_plm, (unsigned long) d.db.db_mask, r_end); } } static int compute_normal_rr(pfmlib_ita_input_rr_t *irr, int dfl_plm, int n, int *base_idx, pfmlib_ita_output_rr_t *orr) { pfmlib_ita_input_rr_desc_t *in_rr; pfmlib_ita_output_rr_desc_t *out_rr; unsigned long r_end; pfmlib_reg_t *br; dbreg_t d; int i, j, br_index, reg_idx, prev_index; in_rr = irr->rr_limits; out_rr = orr->rr_infos; br = 
orr->rr_br; reg_idx = *base_idx; br_index = 0; for (i=0; i < n; i++, in_rr++, out_rr++) { /* * running out of registers */ if (br_index == 8) break; prev_index = br_index; do_normal_rr( in_rr->rr_start, in_rr->rr_end, br, 4 - (reg_idx>>1), /* how many pairs available */ 0, &br_index, ®_idx, in_rr->rr_plm ? in_rr->rr_plm : dfl_plm); DPRINT("br_index=%d reg_idx=%d\n", br_index, reg_idx); /* * compute offsets */ out_rr->rr_soff = out_rr->rr_eoff = 0; for(j=prev_index; j < br_index; j+=2) { d.val = br[j+1].reg_value; r_end = br[j].reg_value+((~(d.db.db_mask)+1) & ~(0xffUL << 56)); if (br[j].reg_value <= in_rr->rr_start) out_rr->rr_soff = in_rr->rr_start - br[j].reg_value; if (r_end >= in_rr->rr_end) out_rr->rr_eoff = r_end - in_rr->rr_end; } if (PFMLIB_VERBOSE()) print_one_range(in_rr, out_rr, br, prev_index, (br_index-prev_index)>>1); } /* do not have enough registers to cover all the ranges */ if (br_index == 8 && i < n) return PFMLIB_ERR_TOOMANY; orr->rr_nbr_used = br_index; return PFMLIB_SUCCESS; } static int pfm_dispatch_irange(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in, pfmlib_output_param_t *outp, pfmlib_ita_output_param_t *mod_out) { pfm_ita_pmc_reg_t reg; pfmlib_ita_input_param_t *param = mod_in; pfmlib_reg_t *pc = outp->pfp_pmcs; pfmlib_ita_input_rr_t *irr; pfmlib_ita_output_rr_t *orr; int pos = outp->pfp_pmc_count; int ret, base_idx = 0; int n_intervals; if (param == NULL || param->pfp_ita_irange.rr_used == 0) return PFMLIB_SUCCESS; if (mod_out == NULL) return PFMLIB_ERR_INVAL; irr = ¶m->pfp_ita_irange; orr = &mod_out->pfp_ita_irange; ret = check_intervals(irr, 0, &n_intervals); if (ret != PFMLIB_SUCCESS) return ret; if (n_intervals < 1) return PFMLIB_ERR_IRRINVAL; DPRINT("n_intervals=%d\n", n_intervals); ret = compute_normal_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr); if (ret != PFMLIB_SUCCESS) { return ret == PFMLIB_ERR_TOOMANY ? 
PFMLIB_ERR_IRRTOOMANY : ret; } reg.pmc_val = 0; reg.pmc13_ita_reg.irange_ta = 0x0; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 13)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 13; pc[pos].reg_value = reg.pmc_val; pc[pos].reg_addr = 13; pc[pos].reg_alt_addr= 13; pos++; __pfm_vbprintf("[PMC13(pmc13)=0x%lx ta=%d]\n", reg.pmc_val, reg.pmc13_ita_reg.irange_ta); outp->pfp_pmc_count = pos; return PFMLIB_SUCCESS; } static int pfm_dispatch_drange(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in, pfmlib_output_param_t *outp, pfmlib_ita_output_param_t *mod_out) { pfmlib_ita_input_param_t *param = mod_in; pfmlib_event_t *e = inp->pfp_events; pfmlib_reg_t *pc = outp->pfp_pmcs; pfmlib_ita_input_rr_t *irr; pfmlib_ita_output_rr_t *orr; pfm_ita_pmc_reg_t reg; unsigned int i, count; int pos = outp->pfp_pmc_count; int ret, base_idx = 0; int n_intervals; if (param == NULL || param->pfp_ita_drange.rr_used == 0) return PFMLIB_SUCCESS; if (mod_out == NULL) return PFMLIB_ERR_INVAL; irr = ¶m->pfp_ita_drange; orr = &mod_out->pfp_ita_drange; ret = check_intervals(irr, 1 , &n_intervals); if (ret != PFMLIB_SUCCESS) return ret; if (n_intervals < 1) return PFMLIB_ERR_DRRINVAL; DPRINT("n_intervals=%d\n", n_intervals); ret = compute_normal_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr); if (ret != PFMLIB_SUCCESS) { return ret == PFMLIB_ERR_TOOMANY ? 
	       PFMLIB_ERR_DRRTOOMANY : ret;
	}

	count = inp->pfp_event_count;
	for (i=0; i < count; i++) {
		if (is_dear(e[i].event))
			return PFMLIB_SUCCESS; /* will be done there */
	}

	reg.pmc_val = 0UL;

	/*
	 * here we have no other choice but to use the default priv level as there is no
	 * specific D-EAR event provided
	 */
	reg.pmc11_ita_reg.dear_plm = inp->pfp_dfl_plm;

	if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 11))
		return PFMLIB_ERR_NOASSIGN;

	pc[pos].reg_num      = 11;
	pc[pos].reg_value    = reg.pmc_val;
	pc[pos].reg_addr     = 11;
	pc[pos].reg_alt_addr = 11;
	pos++;

	__pfm_vbprintf("[PMC11(pmc11)=0x%lx tlb=%s plm=%d pm=%d ism=0x%x umask=0x%x pt=%d]\n",
			reg.pmc_val,
			reg.pmc11_ita_reg.dear_tlb ? "Yes" : "No",
			reg.pmc11_ita_reg.dear_plm,
			reg.pmc11_ita_reg.dear_pm,
			reg.pmc11_ita_reg.dear_ism,
			reg.pmc11_ita_reg.dear_umask,
			reg.pmc11_ita_reg.dear_pt);

	outp->pfp_pmc_count = pos;

	return PFMLIB_SUCCESS;
}

static int
check_qualifier_constraints(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in)
{
	pfmlib_event_t *e = inp->pfp_events;
	unsigned int i, count;

	count = inp->pfp_event_count;
	for(i=0; i < count; i++) {
		/*
		 * skip the check for counters which requested it. Use at your own risk.
		 * Not all counters have necessarily been validated for use with
		 * qualifiers. Typically the event is counted as if no constraint
		 * existed.
		 */
		if (mod_in->pfp_ita_counters[i].flags & PFMLIB_ITA_FL_EVT_NO_QUALCHECK)
			continue;

		if (evt_use_irange(mod_in) && has_iarr(e[i].event) == 0)
			return PFMLIB_ERR_FEATCOMB;
		if (evt_use_drange(mod_in) && has_darr(e[i].event) == 0)
			return PFMLIB_ERR_FEATCOMB;
		if (evt_use_opcm(mod_in) && has_opcm(e[i].event) == 0)
			return PFMLIB_ERR_FEATCOMB;
	}
	return PFMLIB_SUCCESS;
}

static int
check_range_plm(pfmlib_input_param_t *inp, pfmlib_ita_input_param_t *mod_in)
{
	unsigned int i, count;

	if (mod_in->pfp_ita_drange.rr_used == 0 && mod_in->pfp_ita_irange.rr_used == 0)
		return PFMLIB_SUCCESS;

	/*
	 * range restriction applies to all events, therefore we must have a consistent
	 * set of plm and they must match the pfp_dfl_plm which is used to setup the debug
	 * registers
	 */
	count = inp->pfp_event_count;
	for(i=0; i < count; i++) {
		if (inp->pfp_events[i].plm && inp->pfp_events[i].plm != inp->pfp_dfl_plm)
			return PFMLIB_ERR_FEATCOMB;
	}
	return PFMLIB_SUCCESS;
}

static int
pfm_ita_dispatch_events(pfmlib_input_param_t *inp, void *model_in, pfmlib_output_param_t *outp, void *model_out)
{
	int ret;
	pfmlib_ita_input_param_t *mod_in   = (pfmlib_ita_input_param_t *)model_in;
	pfmlib_ita_output_param_t *mod_out = (pfmlib_ita_output_param_t *)model_out;

	/*
	 * nothing will come out of this combination
	 */
	if (mod_out && mod_in == NULL)
		return PFMLIB_ERR_INVAL;

	/* check opcode match, range restriction qualifiers */
	if (mod_in && check_qualifier_constraints(inp, mod_in) != PFMLIB_SUCCESS)
		return PFMLIB_ERR_FEATCOMB;

	/* check for problems with range restriction and per-event plm */
	if (mod_in && check_range_plm(inp, mod_in) != PFMLIB_SUCCESS)
		return PFMLIB_ERR_FEATCOMB;

	ret = pfm_ita_dispatch_counters(inp, mod_in, outp);
	if (ret != PFMLIB_SUCCESS)
		return ret;

	/* now check for I-EAR */
	ret = pfm_dispatch_iear(inp, mod_in, outp);
	if (ret != PFMLIB_SUCCESS)
		return ret;

	/* now check for D-EAR */
	ret = pfm_dispatch_dear(inp, mod_in, outp);
	if (ret != PFMLIB_SUCCESS)
		return ret;

	/* now check for Opcode matchers */
	ret
= pfm_dispatch_opcm(inp, mod_in, outp);
	if (ret != PFMLIB_SUCCESS)
		return ret;

	ret = pfm_dispatch_btb(inp, mod_in, outp);
	if (ret != PFMLIB_SUCCESS)
		return ret;

	ret = pfm_dispatch_irange(inp, mod_in, outp, mod_out);
	if (ret != PFMLIB_SUCCESS)
		return ret;

	ret = pfm_dispatch_drange(inp, mod_in, outp, mod_out);

	return ret;
}

/* XXX: return value is also error code */
int pfm_ita_get_event_maxincr(unsigned int i, unsigned int *maxincr)
{
	if (i >= PME_ITA_EVENT_COUNT || maxincr == NULL)
		return PFMLIB_ERR_INVAL;
	*maxincr = itanium_pe[i].pme_maxincr;
	return PFMLIB_SUCCESS;
}

int pfm_ita_is_ear(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || !is_ear(i) ? 0 : 1; }

int pfm_ita_is_dear(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || !is_dear(i) ? 0 : 1; }

int pfm_ita_is_dear_tlb(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || !(is_dear(i) && is_ear_tlb(i)) ? 0 : 1; }

int pfm_ita_is_dear_cache(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || !(is_dear(i) && !is_ear_tlb(i)) ? 0 : 1; }

int pfm_ita_is_iear(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || !is_iear(i) ? 0 : 1; }

int pfm_ita_is_iear_tlb(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || !(is_iear(i) && is_ear_tlb(i)) ? 0 : 1; }

int pfm_ita_is_iear_cache(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || !(is_iear(i) && !is_ear_tlb(i)) ? 0 : 1; }

int pfm_ita_is_btb(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || !is_btb(i) ? 0 : 1; }

int pfm_ita_support_iarr(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || !has_iarr(i) ? 0 : 1; }

int pfm_ita_support_darr(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || !has_darr(i) ? 0 : 1; }

int pfm_ita_support_opcm(unsigned int i) { return i >= PME_ITA_EVENT_COUNT || !has_opcm(i) ? 0 : 1; }

int pfm_ita_get_ear_mode(unsigned int i, pfmlib_ita_ear_mode_t *m)
{
	if (!is_ear(i) || m == NULL)
		return PFMLIB_ERR_INVAL;

	*m = is_ear_tlb(i) ?
PFMLIB_ITA_EAR_TLB_MODE : PFMLIB_ITA_EAR_CACHE_MODE; return PFMLIB_SUCCESS; } static int pfm_ita_get_event_code(unsigned int i, unsigned int cnt, int *code) { if (cnt != PFMLIB_CNT_FIRST && (cnt < 4 || cnt > 7)) return PFMLIB_ERR_INVAL; *code = (int)itanium_pe[i].pme_code; return PFMLIB_SUCCESS; } /* * This function is accessible directly to the user */ int pfm_ita_get_event_umask(unsigned int i, unsigned long *umask) { if (i >= PME_ITA_EVENT_COUNT || umask == NULL) return PFMLIB_ERR_INVAL; *umask = evt_umask(i); return PFMLIB_SUCCESS; } static char * pfm_ita_get_event_name(unsigned int i) { return itanium_pe[i].pme_name; } static void pfm_ita_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { unsigned int i; unsigned long m; memset(counters, 0, sizeof(*counters)); m =itanium_pe[j].pme_counters; for(i=0; m ; i++, m>>=1) { if (m & 0x1) pfm_regmask_set(counters, i); } } static void pfm_ita_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs) { unsigned int i = 0; /* all pmcs are contiguous */ for(i=0; i < PMU_ITA_NUM_PMCS; i++) pfm_regmask_set(impl_pmcs, i); } static void pfm_ita_get_impl_pmds(pfmlib_regmask_t *impl_pmds) { unsigned int i = 0; /* all pmds are contiguous */ for(i=0; i < PMU_ITA_NUM_PMDS; i++) pfm_regmask_set(impl_pmds, i); } static void pfm_ita_get_impl_counters(pfmlib_regmask_t *impl_counters) { unsigned int i = 0; /* counting pmds are contiguous */ for(i=4; i < 8; i++) pfm_regmask_set(impl_counters, i); } static void pfm_ita_get_hw_counter_width(unsigned int *width) { *width = PMU_ITA_COUNTER_WIDTH; } static int pfm_ita_get_cycle_event(pfmlib_event_t *e) { e->event = PME_ITA_CPU_CYCLES; return PFMLIB_SUCCESS; } static int pfm_ita_get_inst_retired(pfmlib_event_t *e) { e->event = PME_ITA_IA64_INST_RETIRED; return PFMLIB_SUCCESS; } pfm_pmu_support_t itanium_support={ .pmu_name = "itanium", .pmu_type = PFMLIB_ITANIUM_PMU, .pme_count = PME_ITA_EVENT_COUNT, .pmc_count = PMU_ITA_NUM_PMCS, .pmd_count = PMU_ITA_NUM_PMDS, .num_cnt = PMU_ITA_NUM_COUNTERS, 
	.get_event_code		= pfm_ita_get_event_code,
	.get_event_name		= pfm_ita_get_event_name,
	.get_event_counters	= pfm_ita_get_event_counters,
	.dispatch_events	= pfm_ita_dispatch_events,
	.pmu_detect		= pfm_ita_detect,
	.get_impl_pmcs		= pfm_ita_get_impl_pmcs,
	.get_impl_pmds		= pfm_ita_get_impl_pmds,
	.get_impl_counters	= pfm_ita_get_impl_counters,
	.get_hw_counter_width	= pfm_ita_get_hw_counter_width,
	.get_cycle_event	= pfm_ita_get_cycle_event,
	.get_inst_retired_event	= pfm_ita_get_inst_retired
	/* no event description available for Itanium */
};
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_itanium2.c
/*
 * pfmlib_itanium2.c : support for the Itanium2 PMU family
 *
 * Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/ #include <sys/types.h> #include <stdio.h> #include <stdlib.h> #include <string.h> /* public headers */ #include <perfmon/pfmlib.h> /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_priv_ia64.h" /* architecture private */ #include "pfmlib_itanium2_priv.h" /* PMU private */ #include "itanium2_events.h" /* PMU private */ #define is_ear(i) event_is_ear(itanium2_pe+(i)) #define is_ear_tlb(i) event_is_ear_tlb(itanium2_pe+(i)) #define is_ear_alat(i) event_is_ear_alat(itanium2_pe+(i)) #define is_ear_cache(i) event_is_ear_cache(itanium2_pe+(i)) #define is_iear(i) event_is_iear(itanium2_pe+(i)) #define is_dear(i) event_is_dear(itanium2_pe+(i)) #define is_btb(i) event_is_btb(itanium2_pe+(i)) #define has_opcm(i) event_opcm_ok(itanium2_pe+(i)) #define has_iarr(i) event_iarr_ok(itanium2_pe+(i)) #define has_darr(i) event_darr_ok(itanium2_pe+(i)) #define evt_use_opcm(e) ((e)->pfp_ita2_pmc8.opcm_used != 0 || (e)->pfp_ita2_pmc9.opcm_used !=0) #define evt_use_irange(e) ((e)->pfp_ita2_irange.rr_used) #define evt_use_drange(e) ((e)->pfp_ita2_drange.rr_used) #define evt_grp(e) (int)itanium2_pe[e].pme_qualifiers.pme_qual.pme_group #define evt_set(e) (int)itanium2_pe[e].pme_qualifiers.pme_qual.pme_set #define evt_umask(e) itanium2_pe[e].pme_umask #define FINE_MODE_BOUNDARY_BITS 12 #define FINE_MODE_MASK ~((1UL<<FINE_MODE_BOUNDARY_BITS)-1) /* let's define some handy shortcuts! */ #define pmc_plm pmc_ita2_counter_reg.pmc_plm #define pmc_ev pmc_ita2_counter_reg.pmc_ev #define pmc_oi pmc_ita2_counter_reg.pmc_oi #define pmc_pm pmc_ita2_counter_reg.pmc_pm #define pmc_es pmc_ita2_counter_reg.pmc_es #define pmc_umask pmc_ita2_counter_reg.pmc_umask #define pmc_thres pmc_ita2_counter_reg.pmc_thres #define pmc_ism pmc_ita2_counter_reg.pmc_ism static char * pfm_ita2_get_event_name(unsigned int i); /* * Description of the PMC register mappings used by * this module (as reported in pfmlib_reg_t.reg_num): * * 0 -> PMC0 * 1 -> PMC1 * n -> PMCn * * The following are in the model specific rr_br[]: * IBR0 -> 0 * IBR1 -> 1 * ...
* IBR7 -> 7 * DBR0 -> 0 * DBR1 -> 1 * ... * DBR7 -> 7 * * We do not use a mapping table, instead we make up the * values on the fly given the base. */ /* * The Itanium2 PMU has a bug in the fine mode implementation. * It only sees ranges with a granularity of two bundles. * So we prepare for the day they fix it. */ static int has_fine_mode_bug; static int pfm_ita2_detect(void) { int tmp; int ret = PFMLIB_ERR_NOTSUPP; tmp = pfm_ia64_get_cpu_family(); if (tmp == 0x1f) { has_fine_mode_bug = 1; ret = PFMLIB_SUCCESS; } return ret; } /* * Check the events for incompatibilities. This is useful * for L1 and L2 related events. Due to wire limitations, * some cache events are separated into sets. There * are 5 sets for the L1D cache group and 6 sets for the L2 group. * It is NOT possible to simultaneously measure events from * different sets within a group. For instance, you cannot * measure events from set0 and set1 in the L1D cache group. However * it is possible to measure set0 in L1D and set1 in L2 at the same * time. * * This function verifies that the set constraints are respected. */ static int check_cross_groups_and_umasks(pfmlib_input_param_t *inp) { unsigned long ref_umask, umask; int g, s; unsigned int cnt = inp->pfp_event_count; pfmlib_event_t *e = inp->pfp_events; unsigned int i, j; /* * XXX: could possibly be optimized */ for (i=0; i < cnt; i++) { g = evt_grp(e[i].event); s = evt_set(e[i].event); if (g == PFMLIB_ITA2_EVT_NO_GRP) continue; ref_umask = evt_umask(e[i].event); for (j=i+1; j < cnt; j++) { if (evt_grp(e[j].event) != g) continue; if (evt_set(e[j].event) != s) return PFMLIB_ERR_EVTSET; /* only care about L2 cache group */ if (g != PFMLIB_ITA2_EVT_L2_CACHE_GRP || (s == 1 || s == 2)) continue; umask = evt_umask(e[j].event); /* * there is no assignment possible if the event in PMC4 * has a umask (ref_umask) and an event (from the same * set) also has a umask AND it is different.
For some * sets, the umasks are shared, therefore the value * programmed into PMC4 determines the umask for all * the other events (with umask) from the set. */ if (umask && ref_umask != umask) return PFMLIB_ERR_NOASSIGN; } } return PFMLIB_SUCCESS; } /* * Certain prefetch events must be treated specially when instruction range restriction * is in use because they can only be constrained by IBRP1 in fine-mode. Other events * will use IBRP0 if tagged as a demand fetch OR IBRP1 if tagged as a prefetch match. * From the library's point of view there is no way of distinguishing this, so we leave * it up to the user to interpret the results. * * Events which can be qualified by the two pairs depending on their tag: * - IBP_BUNPAIRS_IN * - L1I_FETCH_RAB_HIT * - L1I_FETCH_ISB_HIT * - L1I_FILLS * * This function returns the number of qualifying prefetch events found. * * XXX: not clear which events do qualify as prefetch events. */ static int prefetch_events[]={ PME_ITA2_L1I_PREFETCHES, PME_ITA2_L1I_STRM_PREFETCHES, PME_ITA2_L2_INST_PREFETCHES }; #define NPREFETCH_EVENTS sizeof(prefetch_events)/sizeof(int) static int check_prefetch_events(pfmlib_input_param_t *inp) { int code; int prefetch_codes[NPREFETCH_EVENTS]; unsigned int i, j, count; int c; int found = 0; for(i=0; i < NPREFETCH_EVENTS; i++) { pfm_get_event_code(prefetch_events[i], &code); prefetch_codes[i] = code; } count = inp->pfp_event_count; for(i=0; i < count; i++) { pfm_get_event_code(inp->pfp_events[i].event, &c); for(j=0; j < NPREFETCH_EVENTS; j++) { if (c == prefetch_codes[j]) found++; } } return found; } /* * IA64_INST_RETIRED (and subevents) is the only event which can be measured on all * 4 IBR when non-fine mode is not possible.
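The group/set rule enforced by check_cross_groups_and_umasks can be sketched in isolation. The following is an illustrative model only (the `evt_desc` struct and `check_sets` helper are hypothetical, not part of libpfm): events carrying the same group id must all come from the same set, while events in different groups (or with no group) are unconstrained.

```c
#include <stddef.h>

#define EVT_NO_GRP (-1)

/* hypothetical, simplified event descriptor: group and set ids only */
struct evt_desc { int grp; int set; };

/* returns 0 when the selection is consistent, -1 on a same-group,
 * cross-set conflict (the analogue of PFMLIB_ERR_EVTSET above) */
static int check_sets(const struct evt_desc *e, size_t n)
{
	size_t i, j;

	for (i = 0; i < n; i++) {
		if (e[i].grp == EVT_NO_GRP)
			continue;
		for (j = i + 1; j < n; j++) {
			if (e[j].grp == e[i].grp && e[j].set != e[i].set)
				return -1;
		}
	}
	return 0;
}
```

Note that, as in the library, set0 of one group combined with set1 of another group passes the check; only a cross-set mix inside one group is rejected.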
* * This function returns: * - the number of events matching the IA64_INST_RETIRED code * - in retired_mask the bottom 4 bits indicates which of the 4 INST_RETIRED event * is present */ static unsigned int check_inst_retired_events(pfmlib_input_param_t *inp, unsigned long *retired_mask) { int code; int c; unsigned int i, count, found = 0; unsigned long umask, mask; pfm_get_event_code(PME_ITA2_IA64_INST_RETIRED_THIS, &code); count = inp->pfp_event_count; mask = 0; for(i=0; i < count; i++) { pfm_get_event_code(inp->pfp_events[i].event, &c); if (c == code) { pfm_ita2_get_event_umask(inp->pfp_events[i].event, &umask); switch(umask) { case 0: mask |= 1; break; case 1: mask |= 2; break; case 2: mask |= 4; break; case 3: mask |= 8; break; } found++; } } if (retired_mask) *retired_mask = mask; return found; } static int check_fine_mode_possible(pfmlib_ita2_input_rr_t *rr, int n) { pfmlib_ita2_input_rr_desc_t *lim = rr->rr_limits; int i; for(i=0; i < n; i++) { if ((lim[i].rr_start & FINE_MODE_MASK) != (lim[i].rr_end & FINE_MODE_MASK)) return 0; } return 1; } /* * mode = 0 -> check code (enforce bundle alignment) * mode = 1 -> check data */ static int check_intervals(pfmlib_ita2_input_rr_t *irr, int mode, unsigned int *n_intervals) { unsigned int i; pfmlib_ita2_input_rr_desc_t *lim = irr->rr_limits; for(i=0; i < 4; i++) { /* end marker */ if (lim[i].rr_start == 0 && lim[i].rr_end == 0) break; /* invalid entry */ if (lim[i].rr_start >= lim[i].rr_end) return PFMLIB_ERR_IRRINVAL; if (mode == 0 && (lim[i].rr_start & 0xf || lim[i].rr_end & 0xf)) return PFMLIB_ERR_IRRALIGN; } *n_intervals = i; return PFMLIB_SUCCESS; } static int valid_assign(pfmlib_event_t *e, unsigned int *as, pfmlib_regmask_t *r_pmcs, unsigned int cnt) { unsigned long pmc4_umask = 0, umask; char *name; int l1_grp_present = 0, l2_grp_present = 0; unsigned int i; int c, failure; int need_pmc5, need_pmc4; int pmc5_evt = -1, pmc4_evt = -1; if (PFMLIB_DEBUG()) { unsigned int j; for(j=0;jpfp_event_count; for(i=0; i < 
count; i++) { for (j=0; j < NCANCEL_EVENTS; j++) { pfm_get_event_code(inp->pfp_events[i].event, &code); if (code == cancel_codes[j]) { if (idx != -1) { return PFMLIB_ERR_INVAL; } idx = inp->pfp_events[i].event; } } } return PFMLIB_SUCCESS; } /* * Automatically dispatch events to corresponding counters following constraints. * Upon return the pfarg_regt structure is ready to be submitted to kernel */ static int pfm_ita2_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in, pfmlib_output_param_t *outp) { #define has_counter(e,b) (itanium2_pe[e].pme_counters & (1 << (b)) ? (b) : 0) pfmlib_ita2_input_param_t *param = mod_in; pfm_ita2_pmc_reg_t reg; pfmlib_event_t *e; pfmlib_reg_t *pc, *pd; pfmlib_regmask_t *r_pmcs; unsigned int i,j,k,l; int ret; unsigned int max_l0, max_l1, max_l2, max_l3; unsigned int assign[PMU_ITA2_NUM_COUNTERS]; unsigned int m, cnt; e = inp->pfp_events; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; cnt = inp->pfp_event_count; r_pmcs = &inp->pfp_unavail_pmcs; if (PFMLIB_DEBUG()) for (m=0; m < cnt; m++) { DPRINT("ev[%d]=%s counters=0x%lx\n", m, itanium2_pe[e[m].event].pme_name, itanium2_pe[e[m].event].pme_counters); } if (cnt > PMU_ITA2_NUM_COUNTERS) return PFMLIB_ERR_TOOMANY; ret = check_cross_groups_and_umasks(inp); if (ret != PFMLIB_SUCCESS) return ret; ret = check_cancel_events(inp); if (ret != PFMLIB_SUCCESS) return ret; max_l0 = PMU_ITA2_FIRST_COUNTER + PMU_ITA2_NUM_COUNTERS; max_l1 = PMU_ITA2_FIRST_COUNTER + PMU_ITA2_NUM_COUNTERS*(cnt>1); max_l2 = PMU_ITA2_FIRST_COUNTER + PMU_ITA2_NUM_COUNTERS*(cnt>2); max_l3 = PMU_ITA2_FIRST_COUNTER + PMU_ITA2_NUM_COUNTERS*(cnt>3); DPRINT("max_l0=%u max_l1=%u max_l2=%u max_l3=%u\n", max_l0, max_l1, max_l2, max_l3); /* * For now, worst case in the loop nest: 4! 
(factorial) */ for (i=PMU_ITA2_FIRST_COUNTER; i < max_l0; i++) { assign[0] = has_counter(e[0].event,i); if (max_l1 == PMU_ITA2_FIRST_COUNTER && valid_assign(e, assign, r_pmcs, cnt) == PFMLIB_SUCCESS) goto done; for (j=PMU_ITA2_FIRST_COUNTER; j < max_l1; j++) { if (j == i) continue; assign[1] = has_counter(e[1].event,j); if (max_l2 == PMU_ITA2_FIRST_COUNTER && valid_assign(e, assign, r_pmcs, cnt) == PFMLIB_SUCCESS) goto done; for (k=PMU_ITA2_FIRST_COUNTER; k < max_l2; k++) { if(k == i || k == j) continue; assign[2] = has_counter(e[2].event,k); if (max_l3 == PMU_ITA2_FIRST_COUNTER && valid_assign(e, assign, r_pmcs, cnt) == PFMLIB_SUCCESS) goto done; for (l=PMU_ITA2_FIRST_COUNTER; l < max_l3; l++) { if(l == i || l == j || l == k) continue; assign[3] = has_counter(e[3].event,l); if (valid_assign(e, assign, r_pmcs, cnt) == PFMLIB_SUCCESS) goto done; } } } } /* we cannot satisfy the constraints */ return PFMLIB_ERR_NOASSIGN; done: for (j=0; j < cnt ; j++ ) { reg.pmc_val = 0; /* clear all, bits 26-27 must be zero for proper operations */ /* if plm is 0, then assume not specified per-event and use default */ reg.pmc_plm = inp->pfp_events[j].plm ? inp->pfp_events[j].plm : inp->pfp_dfl_plm; reg.pmc_oi = 1; /* overflow interrupt */ reg.pmc_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc_thres = param ? param->pfp_ita2_counters[j].thres: 0; reg.pmc_ism = param ? param->pfp_ita2_counters[j].ism : PFMLIB_ITA2_ISM_BOTH; reg.pmc_umask = is_ear(e[j].event) ? 0x0 : itanium2_pe[e[j].event].pme_umask; reg.pmc_es = itanium2_pe[e[j].event].pme_code; /* * Note that we don't force PMC4.pmc_ena = 1 because the kernel takes care of this for us. 
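The 4-deep loop nest above performs an exhaustive search over counter permutations (worst case 4!). A recursive sketch of the same idea makes the backtracking explicit; this is an illustrative helper, not the library's code, where `allowed[i]` plays the role of `pme_counters`: a bitmask of the PMC4-PMC7 slots event `i` may occupy.

```c
/* try to place events depth..n-1 on counters 4-7; used is a bitmask of
 * already-taken counters; fills assign[] and returns 1 on success, 0
 * when no valid assignment exists (the PFMLIB_ERR_NOASSIGN case) */
static int assign_events(const unsigned long *allowed, int n, int depth,
                         unsigned int used, int *assign)
{
	int c;

	if (depth == n)
		return 1;

	for (c = 4; c < 8; c++) {
		if (used & (1U << c))
			continue;	/* counter already taken */
		if (!(allowed[depth] & (1UL << c)))
			continue;	/* event cannot live on this counter */
		assign[depth] = c;
		if (assign_events(allowed, n, depth + 1, used | (1U << c), assign))
			return 1;	/* remaining events fit too */
	}
	return 0;			/* backtrack */
}
```

With at most 4 events and 4 counters the recursion visits the same permutations as the hand-unrolled loops, which is why the library can afford the brute-force form.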
* This way we don't have to program something in PMC4 even when we don't use it */ pc[j].reg_num = assign[j]; pc[j].reg_value = reg.pmc_val; pc[j].reg_addr = pc[j].reg_alt_addr = assign[j]; pd[j].reg_num = assign[j]; pd[j].reg_addr = pd[j].reg_alt_addr = assign[j]; __pfm_vbprintf("[PMC%u(pmc%u)=0x%06lx thres=%d es=0x%02x plm=%d umask=0x%x pm=%d ism=0x%x oi=%d] %s\n", assign[j], assign[j], reg.pmc_val, reg.pmc_thres, reg.pmc_es,reg.pmc_plm, reg.pmc_umask, reg.pmc_pm, reg.pmc_ism, reg.pmc_oi, itanium2_pe[e[j].event].pme_name); __pfm_vbprintf("[PMD%u(pmd%u)]\n", pd[j].reg_num, pd[j].reg_num); } /* number of PMC registers programmed */ outp->pfp_pmc_count = cnt; outp->pfp_pmd_count = cnt; return PFMLIB_SUCCESS; } static int pfm_dispatch_iear(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfm_ita2_pmc_reg_t reg; pfmlib_ita2_input_param_t *param = mod_in; pfmlib_reg_t *pc, *pd; pfmlib_ita2_input_param_t fake_param; unsigned int pos1, pos2; unsigned int i, count; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; pos1 = outp->pfp_pmc_count; pos2 = outp->pfp_pmd_count; count = inp->pfp_event_count; for (i=0; i < count; i++) { if (is_iear(inp->pfp_events[i].event)) break; } if (param == NULL || param->pfp_ita2_iear.ear_used == 0) { /* * case 3: no I-EAR event, no (or nothing) in param->pfp_ita2_iear.ear_used */ if (i == count) return PFMLIB_SUCCESS; memset(&fake_param, 0, sizeof(fake_param)); param = &fake_param; /* * case 1: extract all information for event (name) */ pfm_ita2_get_ear_mode(inp->pfp_events[i].event, &param->pfp_ita2_iear.ear_mode); param->pfp_ita2_iear.ear_umask = evt_umask(inp->pfp_events[i].event); param->pfp_ita2_iear.ear_ism = PFMLIB_ITA2_ISM_BOTH; /* force both instruction sets */ DPRINT("I-EAR event with no info\n"); } /* * case 2: ear_used=1, event is defined, we use the param info as it is more precise * case 4: ear_used=1, no event (free running I-EAR), use param info */ reg.pmc_val = 0; if
(param->pfp_ita2_iear.ear_mode == PFMLIB_ITA2_EAR_TLB_MODE) { /* if plm is 0, then assume not specified per-event and use default */ reg.pmc10_ita2_tlb_reg.iear_plm = param->pfp_ita2_iear.ear_plm ? param->pfp_ita2_iear.ear_plm : inp->pfp_dfl_plm; reg.pmc10_ita2_tlb_reg.iear_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc10_ita2_tlb_reg.iear_ct = 0x0; reg.pmc10_ita2_tlb_reg.iear_umask = param->pfp_ita2_iear.ear_umask; reg.pmc10_ita2_tlb_reg.iear_ism = param->pfp_ita2_iear.ear_ism; } else if (param->pfp_ita2_iear.ear_mode == PFMLIB_ITA2_EAR_CACHE_MODE) { /* if plm is 0, then assume not specified per-event and use default */ reg.pmc10_ita2_cache_reg.iear_plm = param->pfp_ita2_iear.ear_plm ? param->pfp_ita2_iear.ear_plm : inp->pfp_dfl_plm; reg.pmc10_ita2_cache_reg.iear_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc10_ita2_cache_reg.iear_ct = 0x1; reg.pmc10_ita2_cache_reg.iear_umask = param->pfp_ita2_iear.ear_umask; reg.pmc10_ita2_cache_reg.iear_ism = param->pfp_ita2_iear.ear_ism; } else { DPRINT("ALAT mode not supported in I-EAR mode\n"); return PFMLIB_ERR_INVAL; } if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 10)) return PFMLIB_ERR_NOASSIGN; pc[pos1].reg_num = 10; /* PMC10 is I-EAR config register */ pc[pos1].reg_value = reg.pmc_val; pc[pos1].reg_addr = pc[pos1].reg_alt_addr = 10; pos1++; pd[pos2].reg_num = 0; pd[pos2].reg_addr = pd[pos2].reg_alt_addr= 0; pos2++; pd[pos2].reg_num = 1; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 1; pos2++; if (param->pfp_ita2_iear.ear_mode == PFMLIB_ITA2_EAR_TLB_MODE) { __pfm_vbprintf("[PMC10(pmc10)=0x%lx ctb=tlb plm=%d pm=%d ism=0x%x umask=0x%x]\n", reg.pmc_val, reg.pmc10_ita2_tlb_reg.iear_plm, reg.pmc10_ita2_tlb_reg.iear_pm, reg.pmc10_ita2_tlb_reg.iear_ism, reg.pmc10_ita2_tlb_reg.iear_umask); } else { __pfm_vbprintf("[PMC10(pmc10)=0x%lx ctb=cache plm=%d pm=%d ism=0x%x umask=0x%x]\n", reg.pmc_val, reg.pmc10_ita2_cache_reg.iear_plm, reg.pmc10_ita2_cache_reg.iear_pm, reg.pmc10_ita2_cache_reg.iear_ism, 
reg.pmc10_ita2_cache_reg.iear_umask); } __pfm_vbprintf("[PMD0(pmd0)]\n[PMD1(pmd1)\n"); /* update final number of entries used */ outp->pfp_pmc_count = pos1; outp->pfp_pmd_count = pos2; return PFMLIB_SUCCESS; } static int pfm_dispatch_dear(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfm_ita2_pmc_reg_t reg; pfmlib_ita2_input_param_t *param = mod_in; pfmlib_reg_t *pc, *pd; pfmlib_ita2_input_param_t fake_param; unsigned int pos1, pos2; unsigned int i, count; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; pos1 = outp->pfp_pmc_count; pos2 = outp->pfp_pmd_count; count = inp->pfp_event_count; for (i=0; i < count; i++) { if (is_dear(inp->pfp_events[i].event)) break; } if (param == NULL || param->pfp_ita2_dear.ear_used == 0) { /* * case 3: no D-EAR event, no (or nothing) in param->pfp_ita2_dear.ear_used */ if (i == count) return PFMLIB_SUCCESS; memset(&fake_param, 0, sizeof(fake_param)); param = &fake_param; /* * case 1: extract all information for event (name) */ pfm_ita2_get_ear_mode(inp->pfp_events[i].event, &param->pfp_ita2_dear.ear_mode); param->pfp_ita2_dear.ear_umask = evt_umask(inp->pfp_events[i].event); param->pfp_ita2_dear.ear_ism = PFMLIB_ITA2_ISM_BOTH; /* force both instruction sets */ DPRINT("D-EAR event with no info\n"); } /* sanity check on the mode */ if ( param->pfp_ita2_dear.ear_mode != PFMLIB_ITA2_EAR_CACHE_MODE && param->pfp_ita2_dear.ear_mode != PFMLIB_ITA2_EAR_TLB_MODE && param->pfp_ita2_dear.ear_mode != PFMLIB_ITA2_EAR_ALAT_MODE) return PFMLIB_ERR_INVAL; /* * case 2: ear_used=1, event is defined, we use the param info as it is more precise * case 4: ear_used=1, no event (free running D-EAR), use param info */ reg.pmc_val = 0; /* if plm is 0, then assume not specified per-event and use default */ reg.pmc11_ita2_reg.dear_plm = param->pfp_ita2_dear.ear_plm ? param->pfp_ita2_dear.ear_plm : inp->pfp_dfl_plm; reg.pmc11_ita2_reg.dear_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ?
1 : 0; reg.pmc11_ita2_reg.dear_mode = param->pfp_ita2_dear.ear_mode; reg.pmc11_ita2_reg.dear_umask = param->pfp_ita2_dear.ear_umask; reg.pmc11_ita2_reg.dear_ism = param->pfp_ita2_dear.ear_ism; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 11)) return PFMLIB_ERR_NOASSIGN; pc[pos1].reg_num = 11; /* PMC11 is D-EAR config register */ pc[pos1].reg_value = reg.pmc_val; pc[pos1].reg_addr = pc[pos1].reg_alt_addr = 11; pos1++; pd[pos2].reg_num = 2; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 2; pos2++; pd[pos2].reg_num = 3; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 3; pos2++; pd[pos2].reg_num = 17; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 17; pos2++; __pfm_vbprintf("[PMC11(pmc11)=0x%lx mode=%s plm=%d pm=%d ism=0x%x umask=0x%x]\n", reg.pmc_val, reg.pmc11_ita2_reg.dear_mode == 0 ? "L1D" : (reg.pmc11_ita2_reg.dear_mode == 1 ? "L1DTLB" : "ALAT"), reg.pmc11_ita2_reg.dear_plm, reg.pmc11_ita2_reg.dear_pm, reg.pmc11_ita2_reg.dear_ism, reg.pmc11_ita2_reg.dear_umask); __pfm_vbprintf("[PMD2(pmd2)]\n[PMD3(pmd3)\nPMD17(pmd17)\n"); /* update final number of entries used */ outp->pfp_pmc_count = pos1; outp->pfp_pmd_count = pos2; return PFMLIB_SUCCESS; } static int pfm_dispatch_opcm(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in, pfmlib_output_param_t *outp, pfmlib_ita2_output_param_t *mod_out) { pfmlib_ita2_input_param_t *param = mod_in; pfmlib_reg_t *pc = outp->pfp_pmcs; pfm_ita2_pmc_reg_t reg, pmc15; unsigned int i, has_1st_pair, has_2nd_pair, count; unsigned int pos = outp->pfp_pmc_count; if (param == NULL) return PFMLIB_SUCCESS; /* not constrained by PMC8 nor PMC9 */ pmc15.pmc_val = 0xffffffff; /* XXX: use PAL instead. PAL value is 0xfffffff0 */ if (param->pfp_ita2_irange.rr_used && mod_out == NULL) return PFMLIB_ERR_INVAL; if (param->pfp_ita2_pmc8.opcm_used || (param->pfp_ita2_irange.rr_used && mod_out->pfp_ita2_irange.rr_nbr_used!=0) ) { reg.pmc_val = param->pfp_ita2_pmc8.opcm_used ? 
param->pfp_ita2_pmc8.pmc_val : 0xffffffff3fffffff; if (param->pfp_ita2_irange.rr_used) { reg.pmc8_9_ita2_reg.opcm_ig_ad = 0; reg.pmc8_9_ita2_reg.opcm_inv = param->pfp_ita2_irange.rr_flags & PFMLIB_ITA2_RR_INV ? 1 : 0; } else { /* clear range restriction fields when none is used */ reg.pmc8_9_ita2_reg.opcm_ig_ad = 1; reg.pmc8_9_ita2_reg.opcm_inv = 0; } /* force bit 2 to 1 */ reg.pmc8_9_ita2_reg.opcm_bit2 = 1; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 8)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 8; pc[pos].reg_value = reg.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 8; pos++; /* * will be constrained by PMC8 */ if (param->pfp_ita2_pmc8.opcm_used) { has_1st_pair = has_2nd_pair = 0; count = inp->pfp_event_count; for(i=0; i < count; i++) { if (inp->pfp_events[i].event == PME_ITA2_IA64_TAGGED_INST_RETIRED_IBRP0_PMC8) has_1st_pair=1; if (inp->pfp_events[i].event == PME_ITA2_IA64_TAGGED_INST_RETIRED_IBRP2_PMC8) has_2nd_pair=1; } if (has_1st_pair || has_2nd_pair == 0) pmc15.pmc15_ita2_reg.opcmc_ibrp0_pmc8 = 0; if (has_2nd_pair || has_1st_pair == 0) pmc15.pmc15_ita2_reg.opcmc_ibrp2_pmc8 = 0; } __pfm_vbprintf("[PMC8(pmc8)=0x%lx m=%d i=%d f=%d b=%d match=0x%x mask=0x%x inv=%d ig_ad=%d]\n", reg.pmc_val, reg.pmc8_9_ita2_reg.opcm_m, reg.pmc8_9_ita2_reg.opcm_i, reg.pmc8_9_ita2_reg.opcm_f, reg.pmc8_9_ita2_reg.opcm_b, reg.pmc8_9_ita2_reg.opcm_match, reg.pmc8_9_ita2_reg.opcm_mask, reg.pmc8_9_ita2_reg.opcm_inv, reg.pmc8_9_ita2_reg.opcm_ig_ad); } if (param->pfp_ita2_pmc9.opcm_used) { /* * PMC9 can only be used to qualify IA64_INST_RETIRED_* events */ if (check_inst_retired_events(inp, NULL) != inp->pfp_event_count) return PFMLIB_ERR_FEATCOMB; reg.pmc_val = param->pfp_ita2_pmc9.pmc_val; /* ig_ad, inv are ignored for PMC9, to avoid confusion we force default values */ reg.pmc8_9_ita2_reg.opcm_ig_ad = 1; reg.pmc8_9_ita2_reg.opcm_inv = 0; /* force bit 2 to 1 */ reg.pmc8_9_ita2_reg.opcm_bit2 = 1; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 9)) return PFMLIB_ERR_NOASSIGN;
pc[pos].reg_num = 9; pc[pos].reg_value = reg.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 9; pos++; /* * will be constrained by PMC9 */ has_1st_pair = has_2nd_pair = 0; count = inp->pfp_event_count; for(i=0; i < count; i++) { if (inp->pfp_events[i].event == PME_ITA2_IA64_TAGGED_INST_RETIRED_IBRP1_PMC9) has_1st_pair=1; if (inp->pfp_events[i].event == PME_ITA2_IA64_TAGGED_INST_RETIRED_IBRP3_PMC9) has_2nd_pair=1; } if (has_1st_pair || has_2nd_pair == 0) pmc15.pmc15_ita2_reg.opcmc_ibrp1_pmc9 = 0; if (has_2nd_pair || has_1st_pair == 0) pmc15.pmc15_ita2_reg.opcmc_ibrp3_pmc9 = 0; __pfm_vbprintf("[PMC9(pmc9)=0x%lx m=%d i=%d f=%d b=%d match=0x%x mask=0x%x]\n", reg.pmc_val, reg.pmc8_9_ita2_reg.opcm_m, reg.pmc8_9_ita2_reg.opcm_i, reg.pmc8_9_ita2_reg.opcm_f, reg.pmc8_9_ita2_reg.opcm_b, reg.pmc8_9_ita2_reg.opcm_match, reg.pmc8_9_ita2_reg.opcm_mask); } if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 15)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 15; pc[pos].reg_value = pmc15.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 15; pos++; __pfm_vbprintf("[PMC15(pmc15)=0x%lx ibrp0_pmc8=%d ibrp1_pmc9=%d ibrp2_pmc8=%d ibrp3_pmc9=%d]\n", pmc15.pmc_val, pmc15.pmc15_ita2_reg.opcmc_ibrp0_pmc8, pmc15.pmc15_ita2_reg.opcmc_ibrp1_pmc9, pmc15.pmc15_ita2_reg.opcmc_ibrp2_pmc8, pmc15.pmc15_ita2_reg.opcmc_ibrp3_pmc9); outp->pfp_pmc_count = pos; return PFMLIB_SUCCESS; } static int pfm_dispatch_btb(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfmlib_event_t *e= inp->pfp_events; pfm_ita2_pmc_reg_t reg; pfmlib_ita2_input_param_t *param = mod_in; pfmlib_reg_t *pc, *pd; pfmlib_ita2_input_param_t fake_param; int found_btb = 0, found_bad_dear = 0; int has_btb_param; unsigned int i, pos1, pos2; unsigned int count; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; pos1 = outp->pfp_pmc_count; pos2 = outp->pfp_pmd_count; /* * explicit BTB settings */ has_btb_param = param && param->pfp_ita2_btb.btb_used; reg.pmc_val = 0UL; /* * we need to scan all events looking 
for DEAR ALAT/TLB due to incompatibility */ count = inp->pfp_event_count; for (i=0; i < count; i++) { if (is_btb(e[i].event)) found_btb = 1; /* * keep track of the first BTB event */ /* look only for DEAR TLB */ if (is_dear(e[i].event) && (is_ear_tlb(e[i].event) || is_ear_alat(e[i].event))) { found_bad_dear = 1; } } DPRINT("found_btb=%d found_bar_dear=%d\n", found_btb, found_bad_dear); /* * did not find D-EAR TLB/ALAT event, need to check param structure */ if (found_bad_dear == 0 && param && param->pfp_ita2_dear.ear_used == 1) { if ( param->pfp_ita2_dear.ear_mode == PFMLIB_ITA2_EAR_TLB_MODE || param->pfp_ita2_dear.ear_mode == PFMLIB_ITA2_EAR_ALAT_MODE) found_bad_dear = 1; } /* * no explicit BTB event and no special case to deal with (cover part of case 3) */ if (found_btb == 0 && has_btb_param == 0 && found_bad_dear == 0) return PFMLIB_SUCCESS; if (has_btb_param == 0) { /* * case 3: no BTB event, btb_used=0 but found_bad_dear=1, need to cleanup PMC12 */ if (found_btb == 0) goto assign_zero; /* * case 1: we have a BTB event but no param, default setting is to capture * all branches. */ memset(&fake_param, 0, sizeof(fake_param)); param = &fake_param; param->pfp_ita2_btb.btb_ds = 0; /* capture branch targets */ param->pfp_ita2_btb.btb_tm = 0x3; /* all branches */ param->pfp_ita2_btb.btb_ptm = 0x3; /* all branches */ param->pfp_ita2_btb.btb_ppm = 0x3; /* all branches */ param->pfp_ita2_btb.btb_brt = 0x0; /* all branches */ DPRINT("BTB event with no info\n"); } /* * case 2: BTB event in the list, param provided * case 4: no BTB event, param provided (free running mode) */ reg.pmc12_ita2_reg.btbc_plm = param->pfp_ita2_btb.btb_plm ? param->pfp_ita2_btb.btb_plm : inp->pfp_dfl_plm; reg.pmc12_ita2_reg.btbc_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 
1 : 0; reg.pmc12_ita2_reg.btbc_ds = param->pfp_ita2_btb.btb_ds & 0x1; reg.pmc12_ita2_reg.btbc_tm = param->pfp_ita2_btb.btb_tm & 0x3; reg.pmc12_ita2_reg.btbc_ptm = param->pfp_ita2_btb.btb_ptm & 0x3; reg.pmc12_ita2_reg.btbc_ppm = param->pfp_ita2_btb.btb_ppm & 0x3; reg.pmc12_ita2_reg.btbc_brt = param->pfp_ita2_btb.btb_brt & 0x3; /* * if DEAR-ALAT or DEAR-TLB is set then PMC12 must be set to zero (see documentation p. 87) * * D-EAR ALAT/TLB and BTB cannot be used at the same time. * From documentation: PMC12 must be zero in this mode; else the wrong IP for misses * coming right after a mispredicted branch. * * D-EAR cache is fine. */ assign_zero: if (found_bad_dear && reg.pmc_val != 0UL) return PFMLIB_ERR_EVTINCOMP; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 12)) return PFMLIB_ERR_NOASSIGN; memset(pc+pos1, 0, sizeof(pfmlib_reg_t)); pc[pos1].reg_num = 12; pc[pos1].reg_value = reg.pmc_val; pc[pos1].reg_addr = pc[pos1].reg_alt_addr = 12; pos1++; __pfm_vbprintf("[PMC12(pmc12)=0x%lx plm=%d pm=%d ds=%d tm=%d ptm=%d ppm=%d brt=%d]\n", reg.pmc_val, reg.pmc12_ita2_reg.btbc_plm, reg.pmc12_ita2_reg.btbc_pm, reg.pmc12_ita2_reg.btbc_ds, reg.pmc12_ita2_reg.btbc_tm, reg.pmc12_ita2_reg.btbc_ptm, reg.pmc12_ita2_reg.btbc_ppm, reg.pmc12_ita2_reg.btbc_brt); /* * only add BTB PMD when actually using BTB. 
* Not needed when dealing with D-EAR TLB and DEAR-ALAT * PMC12 restriction */ if (found_btb || has_btb_param) { /* * PMD16 is included in list of used PMD */ for(i=8; i < 17; i++, pos2++) { pd[pos2].reg_num = i; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = i; __pfm_vbprintf("[PMD%u(pmd%u)]\n", pd[pos2].reg_num, pd[pos2].reg_num); } } /* update final number of entries used */ outp->pfp_pmc_count = pos1; outp->pfp_pmd_count = pos2; return PFMLIB_SUCCESS; } static void do_normal_rr(unsigned long start, unsigned long end, pfmlib_reg_t *br, int nbr, int dir, int *idx, int *reg_idx, int plm) { unsigned long size, l_addr, c; unsigned long l_offs = 0, r_offs = 0; unsigned long l_size, r_size; dbreg_t db; int p2; if (nbr < 1 || end <= start) return; size = end - start; DPRINT("start=0x%016lx end=0x%016lx size=0x%lx bytes (%lu bundles) nbr=%d dir=%d\n", start, end, size, size >> 4, nbr, dir); p2 = pfm_ia64_fls(size); c = ALIGN_DOWN(end, p2); DPRINT("largest power of two possible: 2^%d=0x%lx, crossing=0x%016lx\n", p2, 1UL << p2, c); if ((c - (1UL<<p2)) >= start) { l_addr = c - (1UL << p2); } else { p2--; if ((c + (1UL<<p2)) <= end) { l_addr = c; } else { l_addr = c - (1UL << p2); } } l_size = l_addr - start; r_size = end - l_addr - (1UL<<p2); if (dir == 0 && l_size != 0 && nbr == 1) { p2++; l_addr = end - (1UL << p2); if (PFMLIB_DEBUG()) { l_offs = start - l_addr; printf(">>l_offs: 0x%lx\n", l_offs); } } else if (dir == 1 && r_size != 0 && nbr == 1) { p2++; l_addr = start; if (PFMLIB_DEBUG()) { r_offs = l_addr+(1UL<<p2) - end; printf(">>r_offs: 0x%lx\n", r_offs); } } l_size = l_addr - start; r_size = end - l_addr-(1UL<<p2); if (PFMLIB_DEBUG()) { printf(">>largest chunk: 2^%d @0x%016lx-0x%016lx\n", p2, l_addr, l_addr+(1UL<<p2)); if (l_size && !l_offs) printf(">>before: 0x%016lx-0x%016lx\n", start, l_addr); if (r_size && !r_offs) printf(">>after : 0x%016lx-0x%016lx\n", l_addr+(1UL<<p2), end); } /* * setup the debug register pair covering the largest chunk: * only the mask and plm fields are set, the rest stays zero */ db.val = 0; db.db.db_mask = ~((1UL << p2)-1); db.db.db_plm = plm; br[*idx].reg_num = *reg_idx; br[*idx].reg_value = l_addr; br[*idx].reg_addr = br[*idx].reg_alt_addr = *reg_idx; br[*idx+1].reg_num = *reg_idx+1; br[*idx+1].reg_value = db.val; br[*idx+1].reg_addr = br[*idx+1].reg_alt_addr = *reg_idx+1; *idx += 2; *reg_idx += 2; nbr--; if (nbr) { int r_nbr, l_nbr; r_nbr = l_nbr = nbr >>1; if (nbr & 0x1) { /* * our simple heuristic is: * we assign the largest number of registers to the largest * of the two chunks */ if (l_size > r_size) { l_nbr++; } else { r_nbr++; } } do_normal_rr(start, l_addr, br, l_nbr, 0, idx, reg_idx, plm); do_normal_rr(l_addr+(1UL<<p2), end, br, r_nbr, 1, idx, reg_idx, plm); } } static void print_one_range(pfmlib_ita2_input_rr_desc_t *in_rr, pfmlib_ita2_output_rr_desc_t *out_rr, pfmlib_reg_t *dbr, int base_idx, int n_pairs, int fine_mode, unsigned int rr_flags) { int j; dbreg_t d; unsigned long r_end; __pfm_vbprintf("[0x%lx-0x%lx): %d register pair(s)%s%s\n", in_rr->rr_start, in_rr->rr_end, n_pairs, fine_mode ? ", fine_mode" : "", rr_flags & PFMLIB_ITA2_RR_INV ?
", inversed" : ""); __pfm_vbprintf("start offset: -0x%lx end_offset: +0x%lx\n", out_rr->rr_soff, out_rr->rr_eoff); for (j=0; j < n_pairs; j++, base_idx+=2) { d.val = dbr[base_idx+1].reg_value; r_end = dbr[base_idx].reg_value+((~(d.db.db_mask)) & ~(0xffUL << 56)); if (fine_mode) __pfm_vbprintf("brp%u: db%u: 0x%016lx db%u: plm=0x%x mask=0x%016lx\n", dbr[base_idx].reg_num>>1, dbr[base_idx].reg_num, dbr[base_idx].reg_value, dbr[base_idx+1].reg_num, d.db.db_plm, d.db.db_mask); else __pfm_vbprintf("brp%u: db%u: 0x%016lx db%u: plm=0x%x mask=0x%016lx end=0x%016lx\n", dbr[base_idx].reg_num>>1, dbr[base_idx].reg_num, dbr[base_idx].reg_value, dbr[base_idx+1].reg_num, d.db.db_plm, d.db.db_mask, r_end); } } /* * base_idx = base register index to use (for IBRP1, base_idx = 2) */ static int compute_fine_rr(pfmlib_ita2_input_rr_t *irr, int dfl_plm, int n, int *base_idx, pfmlib_ita2_output_rr_t *orr) { int i; pfmlib_reg_t *br; pfmlib_ita2_input_rr_desc_t *in_rr; pfmlib_ita2_output_rr_desc_t *out_rr; unsigned long addr; int reg_idx; dbreg_t db; in_rr = irr->rr_limits; out_rr = orr->rr_infos; br = orr->rr_br+orr->rr_nbr_used; reg_idx = *base_idx; db.val = 0; db.db.db_mask = FINE_MODE_MASK; if (n > 2) return PFMLIB_ERR_IRRTOOMANY; for (i=0; i < n; i++, reg_idx += 2, in_rr++, br+= 4) { /* * setup lower limit pair * * because of the PMU bug, we must align down to the closest bundle-pair * aligned address. 5 => 32-byte aligned address */ addr = has_fine_mode_bug ? ALIGN_DOWN(in_rr->rr_start, 5) : in_rr->rr_start; out_rr->rr_soff = in_rr->rr_start - addr; /* * adjust plm for each range */ db.db.db_plm = in_rr->rr_plm ? 
in_rr->rr_plm : (unsigned long)dfl_plm; br[0].reg_num = reg_idx; br[0].reg_value = addr; br[0].reg_addr = br[0].reg_alt_addr = reg_idx; br[1].reg_num = reg_idx+1; br[1].reg_value = db.val; br[1].reg_addr = br[1].reg_alt_addr = reg_idx+1; /* * setup upper limit pair * * * In fine mode, the bundle address stored in the upper limit debug * registers is included in the count, so we subtract 0x10 to exclude it. * * because of the PMU bug, we align the (corrected) end to the nearest * 32-byte aligned address + 0x10. With this correction and depending * on the correction, we may count one * * */ addr = in_rr->rr_end - 0x10; if (has_fine_mode_bug && (addr & 0x1f) == 0) addr += 0x10; out_rr->rr_eoff = addr - in_rr->rr_end + 0x10; br[2].reg_num = reg_idx+4; br[2].reg_value = addr; br[2].reg_addr = br[2].reg_alt_addr = reg_idx+4; br[3].reg_num = reg_idx+5; br[3].reg_value = db.val; br[3].reg_addr = br[3].reg_alt_addr = reg_idx+5; if (PFMLIB_VERBOSE()) print_one_range(in_rr, out_rr, br, 0, 2, 1, irr->rr_flags); } orr->rr_nbr_used += i<<2; /* update base_idx, for subsequent calls */ *base_idx = reg_idx; return PFMLIB_SUCCESS; } /* * base_idx = base register index to use (for IBRP1, base_idx = 2) */ static int compute_single_rr(pfmlib_ita2_input_rr_t *irr, int dfl_plm, int *base_idx, pfmlib_ita2_output_rr_t *orr) { unsigned long size, end, start; unsigned long p_start, p_end; pfmlib_ita2_input_rr_desc_t *in_rr; pfmlib_ita2_output_rr_desc_t *out_rr; pfmlib_reg_t *br; dbreg_t db; int reg_idx; int l, m; in_rr = irr->rr_limits; out_rr = orr->rr_infos; br = orr->rr_br+orr->rr_nbr_used; start = in_rr->rr_start; end = in_rr->rr_end; size = end - start; reg_idx = *base_idx; l = pfm_ia64_fls(size); m = l; if (size & ((1UL << l)-1)) { if (l>62) { printf("range: [0x%lx-0x%lx] too big\n", start, end); return PFMLIB_ERR_IRRTOOBIG; } m++; } DPRINT("size=%ld, l=%d m=%d, internal: 0x%lx full: 0x%lx\n", size, l, m, 1UL << l, 1UL << m); for (; m < 64; m++) { p_start = ALIGN_DOWN(start, m); p_end
= p_start+(1UL<<m); if (p_end >= end) goto found; } return PFMLIB_ERR_IRRINVAL; found: DPRINT("m=%d p_start=0x%lx p_end=0x%lx\n", m, p_start,p_end); /* when the event is not IA64_INST_RETIRED, then we MUST use ibrp0 */ br[0].reg_num = reg_idx; br[0].reg_value = p_start; br[0].reg_addr = br[0].reg_alt_addr = reg_idx; db.val = 0; db.db.db_mask = ~((1UL << m)-1); db.db.db_plm = in_rr->rr_plm ? in_rr->rr_plm : (unsigned long)dfl_plm; br[1].reg_num = reg_idx + 1; br[1].reg_value = db.val; br[1].reg_addr = br[1].reg_alt_addr = reg_idx + 1; out_rr->rr_soff = start - p_start; out_rr->rr_eoff = p_end - end; if (PFMLIB_VERBOSE()) print_one_range(in_rr, out_rr, br, 0, 1, 0, irr->rr_flags); orr->rr_nbr_used += 2; /* update base_idx, for subsequent calls */ *base_idx = reg_idx; return PFMLIB_SUCCESS; } static int compute_normal_rr(pfmlib_ita2_input_rr_t *irr, int dfl_plm, int n, int *base_idx, pfmlib_ita2_output_rr_t *orr) { pfmlib_ita2_input_rr_desc_t *in_rr; pfmlib_ita2_output_rr_desc_t *out_rr; unsigned long r_end; pfmlib_reg_t *br; dbreg_t d; int i, j; int br_index, reg_idx, prev_index; in_rr = irr->rr_limits; out_rr = orr->rr_infos; br = orr->rr_br+orr->rr_nbr_used; reg_idx = *base_idx; br_index = 0; for (i=0; i < n; i++, in_rr++, out_rr++) { /* * running out of registers */ if (br_index == 8) break; prev_index = br_index; do_normal_rr( in_rr->rr_start, in_rr->rr_end, br, 4 - (reg_idx>>1), /* how many pairs available */ 0, &br_index, &reg_idx, in_rr->rr_plm ?
in_rr->rr_plm : dfl_plm); DPRINT("br_index=%d reg_idx=%d\n", br_index, reg_idx); /* * compute offsets */ out_rr->rr_soff = out_rr->rr_eoff = 0; for(j=prev_index; j < br_index; j+=2) { d.val = br[j+1].reg_value; r_end = br[j].reg_value+((~(d.db.db_mask)+1) & ~(0xffUL << 56)); if (br[j].reg_value <= in_rr->rr_start) out_rr->rr_soff = in_rr->rr_start - br[j].reg_value; if (r_end >= in_rr->rr_end) out_rr->rr_eoff = r_end - in_rr->rr_end; } if (PFMLIB_VERBOSE()) print_one_range(in_rr, out_rr, br, prev_index, (br_index-prev_index)>>1, 0, irr->rr_flags); } /* do not have enough registers to cover all the ranges */ if (br_index == 8 && i < n) return PFMLIB_ERR_TOOMANY; orr->rr_nbr_used += br_index; /* update base_idx, for subsequent calls */ *base_idx = reg_idx; return PFMLIB_SUCCESS; } static int pfm_dispatch_irange(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in, pfmlib_output_param_t *outp, pfmlib_ita2_output_param_t *mod_out) { pfm_ita2_pmc_reg_t reg; pfmlib_ita2_input_param_t *param = mod_in; pfmlib_ita2_input_rr_t *irr; pfmlib_ita2_output_rr_t *orr; pfmlib_reg_t *pc = outp->pfp_pmcs; unsigned int i, pos = outp->pfp_pmc_count, count; int ret; unsigned int retired_only, retired_count, fine_mode, prefetch_count; unsigned int n_intervals; int base_idx = 0; unsigned long retired_mask; if (param == NULL) return PFMLIB_SUCCESS; if (param->pfp_ita2_irange.rr_used == 0) return PFMLIB_SUCCESS; if (mod_out == NULL) return PFMLIB_ERR_INVAL; irr = &param->pfp_ita2_irange; orr = &mod_out->pfp_ita2_irange; ret = check_intervals(irr, 0, &n_intervals); if (ret != PFMLIB_SUCCESS) return ret; if (n_intervals < 1) return PFMLIB_ERR_IRRINVAL; retired_count = check_inst_retired_events(inp, &retired_mask); retired_only = retired_count == inp->pfp_event_count; prefetch_count = check_prefetch_events(inp); fine_mode = irr->rr_flags & PFMLIB_ITA2_RR_NO_FINE_MODE ?
0 : check_fine_mode_possible(irr, n_intervals); DPRINT("n_intervals=%d retired_only=%d retired_count=%d prefetch_count=%d fine_mode=%d\n", n_intervals, retired_only, retired_count, prefetch_count, fine_mode); /* * On Itanium2, there are more constraints on what can be measured with irange. * * - The fine mode is the best because you directly set the lower and upper limits of * the range. This uses 2 ibr pairs per range (ibrp0/ibrp2 and ibrp1/ibrp3). Therefore * at most 2 fine mode ranges can be defined. There is a limit on the size and alignment * of the range to allow fine mode: the range must be less than 4KB in size AND the lower * and upper limits must NOT cross a 4KB page boundary. The fine mode works with all events. * * - if the fine mode fails, then for all events, except IA64_TAGGED_INST_RETIRED_*, only * the first pair of ibr is available: ibrp0. This imposes some severe restrictions on the * size and alignment of the range. It can be bigger than 4KB and must be properly aligned * on its size. The library relaxes these constraints by allowing the covered areas to be * larger than the expected range. It may start before and end after. You can determine how * far off the range is in either direction for each range by looking at the rr_soff (start * offset) and rr_eoff (end offset). * * - if the events include certain prefetch events then only IBRP1 can be used in fine mode * See 10.3.5.1 Exception 1. * * - Finally, when the events are ONLY IA64_TAGGED_INST_RETIRED_* then all IBR pairs can be used * to cover the range giving us more flexibility to approximate the range when it is not * properly aligned on its size (see 10.3.5.2 Exception 2).
*/ if (fine_mode == 0 && retired_only == 0 && n_intervals > 1) return PFMLIB_ERR_IRRTOOMANY; /* we do not default to non-fine mode to support more ranges */ if (n_intervals > 2 && fine_mode == 1) return PFMLIB_ERR_IRRTOOMANY; if (fine_mode == 0) { if (retired_only) { ret = compute_normal_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr); } else { /* unless we have only prefetch and instruction retired events, * we cannot satisfy the request because the other events cannot * be measured on anything but IBRP0. */ if (prefetch_count && (prefetch_count+retired_count) != inp->pfp_event_count) return PFMLIB_ERR_FEATCOMB; base_idx = prefetch_count ? 2 : 0; ret = compute_single_rr(irr, inp->pfp_dfl_plm, &base_idx, orr); } } else { if (prefetch_count && n_intervals != 1) return PFMLIB_ERR_IRRTOOMANY; base_idx = prefetch_count ? 2 : 0; ret = compute_fine_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr); } if (ret != PFMLIB_SUCCESS) { return ret == PFMLIB_ERR_TOOMANY ? PFMLIB_ERR_IRRTOOMANY : ret; } reg.pmc_val = 0xdb6; /* default value */ count = orr->rr_nbr_used; for (i=0; i < count; i++) { switch(orr->rr_br[i].reg_num) { case 0: reg.pmc14_ita2_reg.iarc_ibrp0 = 0; break; case 2: reg.pmc14_ita2_reg.iarc_ibrp1 = 0; break; case 4: reg.pmc14_ita2_reg.iarc_ibrp2 = 0; break; case 6: reg.pmc14_ita2_reg.iarc_ibrp3 = 0; break; } } if (retired_only && (param->pfp_ita2_pmc8.opcm_used ||param->pfp_ita2_pmc9.opcm_used)) { /* * PMC8 + IA64_INST_RETIRED only works if irange on IBRP0 and/or IBRP2 * PMC9 + IA64_INST_RETIRED only works if irange on IBRP1 and/or IBRP3 */ count = orr->rr_nbr_used; for (i=0; i < count; i++) { if (orr->rr_br[i].reg_num == 0 && param->pfp_ita2_pmc9.opcm_used) return PFMLIB_ERR_FEATCOMB; if (orr->rr_br[i].reg_num == 2 && param->pfp_ita2_pmc8.opcm_used) return PFMLIB_ERR_FEATCOMB; if (orr->rr_br[i].reg_num == 4 && param->pfp_ita2_pmc9.opcm_used) return PFMLIB_ERR_FEATCOMB; if (orr->rr_br[i].reg_num == 6 && param->pfp_ita2_pmc8.opcm_used) return 
PFMLIB_ERR_FEATCOMB; } } if (fine_mode) { reg.pmc14_ita2_reg.iarc_fine = 1; } else if (retired_only) { /* * we need to check that the user provided all the events needed to cover * all the ibr pairs used to cover the range */ if ((retired_mask & 0x1) == 0 && reg.pmc14_ita2_reg.iarc_ibrp0 == 0) return PFMLIB_ERR_IRRINVAL; if ((retired_mask & 0x2) == 0 && reg.pmc14_ita2_reg.iarc_ibrp1 == 0) return PFMLIB_ERR_IRRINVAL; if ((retired_mask & 0x4) == 0 && reg.pmc14_ita2_reg.iarc_ibrp2 == 0) return PFMLIB_ERR_IRRINVAL; if ((retired_mask & 0x8) == 0 && reg.pmc14_ita2_reg.iarc_ibrp3 == 0) return PFMLIB_ERR_IRRINVAL; } /* initialize pmc request slot */ memset(pc+pos, 0, sizeof(pfmlib_reg_t)); if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 14)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 14; pc[pos].reg_value = reg.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 14; pos++; __pfm_vbprintf("[PMC14(pmc14)=0x%lx ibrp0=%d ibrp1=%d ibrp2=%d ibrp3=%d fine=%d]\n", reg.pmc_val, reg.pmc14_ita2_reg.iarc_ibrp0, reg.pmc14_ita2_reg.iarc_ibrp1, reg.pmc14_ita2_reg.iarc_ibrp2, reg.pmc14_ita2_reg.iarc_ibrp3, reg.pmc14_ita2_reg.iarc_fine); outp->pfp_pmc_count = pos; return PFMLIB_SUCCESS; } static const unsigned long iod_tab[8]={ /* --- */ 3, /* --D */ 2, /* -O- */ 3, /* should not be used */ /* -OD */ 0, /* =IOD safe because default IBR is harmless */ /* I-- */ 1, /* =IO safe because by default OPC is turned off */ /* I-D */ 0, /* =IOD safe because by default opc is turned off */ /* IO- */ 1, /* IOD */ 0 }; /* * IMPORTANT: MUST BE CALLED *AFTER* pfm_dispatch_irange() to make sure we see * the irange programming to adjust pmc13.
*/ static int pfm_dispatch_drange(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in, pfmlib_output_param_t *outp, pfmlib_ita2_output_param_t *mod_out) { pfmlib_ita2_input_param_t *param = mod_in; pfmlib_reg_t *pc = outp->pfp_pmcs; pfmlib_ita2_input_rr_t *irr; pfmlib_ita2_output_rr_t *orr, *orr2; pfm_ita2_pmc_reg_t pmc13; pfm_ita2_pmc_reg_t pmc14; unsigned int i, pos = outp->pfp_pmc_count; int iod_codes[4], dfl_val_pmc8, dfl_val_pmc9; unsigned int n_intervals; int ret; int base_idx = 0; int fine_mode = 0; #define DR_USED 0x1 /* data range is used */ #define OP_USED 0x2 /* opcode matching is used */ #define IR_USED 0x4 /* code range is used */ if (param == NULL) return PFMLIB_SUCCESS; /* * if only pmc8/pmc9 opcode matching is used, we do not need to change * the default value of pmc13 regardless of the events being measured. */ if ( param->pfp_ita2_drange.rr_used == 0 && param->pfp_ita2_irange.rr_used == 0) return PFMLIB_SUCCESS; /* * it seems like the ignored bits need to have special values * otherwise this does not work. */ pmc13.pmc_val = 0x2078fefefefe; /* * initialize iod codes */ iod_codes[0] = iod_codes[1] = iod_codes[2] = iod_codes[3] = 0; /* * setup default iod value, we need to separate because * if drange is used we do not know in advance which DBR will be used * therefore we need to apply dfl_val later */ dfl_val_pmc8 = param->pfp_ita2_pmc8.opcm_used ? OP_USED : 0; dfl_val_pmc9 = param->pfp_ita2_pmc9.opcm_used ? OP_USED : 0; if (param->pfp_ita2_drange.rr_used == 1) { if (mod_out == NULL) return PFMLIB_ERR_INVAL; irr = &param->pfp_ita2_drange; orr = &mod_out->pfp_ita2_drange; ret = check_intervals(irr, 1, &n_intervals); if (ret != PFMLIB_SUCCESS) return ret; if (n_intervals < 1) return PFMLIB_ERR_DRRINVAL; ret = compute_normal_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr); if (ret != PFMLIB_SUCCESS) { return ret == PFMLIB_ERR_TOOMANY ? PFMLIB_ERR_DRRTOOMANY : ret; } /* * Update iod_codes to reflect the use of the DBR constraint.
*/ for (i=0; i < orr->rr_nbr_used; i++) { if (orr->rr_br[i].reg_num == 0) iod_codes[0] |= DR_USED | dfl_val_pmc8; if (orr->rr_br[i].reg_num == 2) iod_codes[1] |= DR_USED | dfl_val_pmc9; if (orr->rr_br[i].reg_num == 4) iod_codes[2] |= DR_USED | dfl_val_pmc8; if (orr->rr_br[i].reg_num == 6) iod_codes[3] |= DR_USED | dfl_val_pmc9; } } /* * XXX: assume dispatch_irange executed before calling this function */ if (param->pfp_ita2_irange.rr_used == 1) { if (mod_out == NULL) return PFMLIB_ERR_INVAL; orr2 = &mod_out->pfp_ita2_irange; /* * we need to find out whether or not the irange is using * fine mode. If this is the case, then we only need to * program pmc13 for the ibr pairs which designate the lower * bounds of a range. For instance, if IBRP0/IBRP2 are used, * then we only need to program pmc13.cfg_dbrp0 and pmc13.ena_dbrp0, * the PMU will automatically use IBRP2, even though pmc13.ena_dbrp2=0. */ for(i=0; i < pos; i++) { if (pc[i].reg_num == 14) { pmc14.pmc_val = pc[i].reg_value; if (pmc14.pmc14_ita2_reg.iarc_fine == 1) fine_mode = 1; break; } } /* * Update to reflect the use of the IBR constraint */ for (i=0; i < orr2->rr_nbr_used; i++) { if (orr2->rr_br[i].reg_num == 0) iod_codes[0] |= IR_USED | dfl_val_pmc8; if (orr2->rr_br[i].reg_num == 2) iod_codes[1] |= IR_USED | dfl_val_pmc9; if (fine_mode == 0 && orr2->rr_br[i].reg_num == 4) iod_codes[2] |= IR_USED | dfl_val_pmc8; if (fine_mode == 0 && orr2->rr_br[i].reg_num == 6) iod_codes[3] |= IR_USED | dfl_val_pmc9; } } if (param->pfp_ita2_irange.rr_used == 0 && param->pfp_ita2_drange.rr_used == 0) { iod_codes[0] = iod_codes[2] = dfl_val_pmc8; iod_codes[1] = iod_codes[3] = dfl_val_pmc9; } /* * update the cfg dbrpX field. If we put a constraint on a cfg dbrp, then * we must enable it in the corresponding ena_dbrpX */ pmc13.pmc13_ita2_reg.darc_ena_dbrp0 = iod_codes[0] ? 1 : 0; pmc13.pmc13_ita2_reg.darc_cfg_dbrp0 = iod_tab[iod_codes[0]]; pmc13.pmc13_ita2_reg.darc_ena_dbrp1 = iod_codes[1] ?
1 : 0; pmc13.pmc13_ita2_reg.darc_cfg_dbrp1 = iod_tab[iod_codes[1]]; pmc13.pmc13_ita2_reg.darc_ena_dbrp2 = iod_codes[2] ? 1 : 0; pmc13.pmc13_ita2_reg.darc_cfg_dbrp2 = iod_tab[iod_codes[2]]; pmc13.pmc13_ita2_reg.darc_ena_dbrp3 = iod_codes[3] ? 1 : 0; pmc13.pmc13_ita2_reg.darc_cfg_dbrp3 = iod_tab[iod_codes[3]]; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 13)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 13; pc[pos].reg_value = pmc13.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 13; pos++; __pfm_vbprintf("[PMC13(pmc13)=0x%lx cfg_dbrp0=%d cfg_dbrp1=%d cfg_dbrp2=%d cfg_dbrp3=%d ena_dbrp0=%d ena_dbrp1=%d ena_dbrp2=%d ena_dbrp3=%d]\n", pmc13.pmc_val, pmc13.pmc13_ita2_reg.darc_cfg_dbrp0, pmc13.pmc13_ita2_reg.darc_cfg_dbrp1, pmc13.pmc13_ita2_reg.darc_cfg_dbrp2, pmc13.pmc13_ita2_reg.darc_cfg_dbrp3, pmc13.pmc13_ita2_reg.darc_ena_dbrp0, pmc13.pmc13_ita2_reg.darc_ena_dbrp1, pmc13.pmc13_ita2_reg.darc_ena_dbrp2, pmc13.pmc13_ita2_reg.darc_ena_dbrp3); outp->pfp_pmc_count = pos; return PFMLIB_SUCCESS; } static int check_qualifier_constraints(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in) { pfmlib_ita2_input_param_t *param = mod_in; pfmlib_event_t *e = inp->pfp_events; unsigned int i, count; count = inp->pfp_event_count; for(i=0; i < count; i++) { /* * skip the check for counters which requested it. Use at your own risk. * Not all counters have necessarily been validated for use with * qualifiers. Typically the event is counted as if no constraint * existed.
*/ if (param->pfp_ita2_counters[i].flags & PFMLIB_ITA2_FL_EVT_NO_QUALCHECK) continue; if (evt_use_irange(param) && has_iarr(e[i].event) == 0) return PFMLIB_ERR_FEATCOMB; if (evt_use_drange(param) && has_darr(e[i].event) == 0) return PFMLIB_ERR_FEATCOMB; if (evt_use_opcm(param) && has_opcm(e[i].event) == 0) return PFMLIB_ERR_FEATCOMB; } return PFMLIB_SUCCESS; } static int check_range_plm(pfmlib_input_param_t *inp, pfmlib_ita2_input_param_t *mod_in) { pfmlib_ita2_input_param_t *param = mod_in; unsigned int i, count; if (param->pfp_ita2_drange.rr_used == 0 && param->pfp_ita2_irange.rr_used == 0) return PFMLIB_SUCCESS; /* * range restriction applies to all events, therefore we must have a consistent * set of plm and they must match the pfp_dfl_plm which is used to setup the debug * registers */ count = inp->pfp_event_count; for(i=0; i < count; i++) { if (inp->pfp_events[i].plm && inp->pfp_events[i].plm != inp->pfp_dfl_plm) return PFMLIB_ERR_FEATCOMB; } return PFMLIB_SUCCESS; } static int pfm_ita2_dispatch_events(pfmlib_input_param_t *inp, void *model_in, pfmlib_output_param_t *outp, void *model_out) { int ret; pfmlib_ita2_input_param_t *mod_in = (pfmlib_ita2_input_param_t *)model_in; pfmlib_ita2_output_param_t *mod_out = (pfmlib_ita2_output_param_t *)model_out; /* * nothing will come out of this combination */ if (mod_out && mod_in == NULL) return PFMLIB_ERR_INVAL; /* check opcode match, range restriction qualifiers */ if (mod_in && check_qualifier_constraints(inp, mod_in) != PFMLIB_SUCCESS) return PFMLIB_ERR_FEATCOMB; /* check for problems with range restriction and per-event plm */ if (mod_in && check_range_plm(inp, mod_in) != PFMLIB_SUCCESS) return PFMLIB_ERR_FEATCOMB; ret = pfm_ita2_dispatch_counters(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) return ret; /* now check for I-EAR */ ret = pfm_dispatch_iear(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) return ret; /* now check for D-EAR */ ret = pfm_dispatch_dear(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS)
return ret; /* XXX: must be done before dispatch_opcm() and dispatch_drange() */ ret = pfm_dispatch_irange(inp, mod_in, outp, mod_out); if (ret != PFMLIB_SUCCESS) return ret; ret = pfm_dispatch_drange(inp, mod_in, outp, mod_out); if (ret != PFMLIB_SUCCESS) return ret; /* now check for Opcode matchers */ ret = pfm_dispatch_opcm(inp, mod_in, outp, mod_out); if (ret != PFMLIB_SUCCESS) return ret; ret = pfm_dispatch_btb(inp, mod_in, outp); return ret; } /* XXX: return value is also error code */ int pfm_ita2_get_event_maxincr(unsigned int i, unsigned int *maxincr) { if (i >= PME_ITA2_EVENT_COUNT || maxincr == NULL) return PFMLIB_ERR_INVAL; *maxincr = itanium2_pe[i].pme_maxincr; return PFMLIB_SUCCESS; } int pfm_ita2_is_ear(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_ear(i); } int pfm_ita2_is_dear(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_dear(i); } int pfm_ita2_is_dear_tlb(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_dear(i) && is_ear_tlb(i); } int pfm_ita2_is_dear_cache(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_dear(i) && is_ear_cache(i); } int pfm_ita2_is_dear_alat(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_ear_alat(i); } int pfm_ita2_is_iear(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_iear(i); } int pfm_ita2_is_iear_tlb(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_iear(i) && is_ear_tlb(i); } int pfm_ita2_is_iear_cache(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_iear(i) && is_ear_cache(i); } int pfm_ita2_is_btb(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && is_btb(i); } int pfm_ita2_support_iarr(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && has_iarr(i); } int pfm_ita2_support_darr(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && has_darr(i); } int pfm_ita2_support_opcm(unsigned int i) { return i < PME_ITA2_EVENT_COUNT && has_opcm(i); } int pfm_ita2_get_ear_mode(unsigned int i, pfmlib_ita2_ear_mode_t *m) { pfmlib_ita2_ear_mode_t r; if (!is_ear(i) || m
== NULL) return PFMLIB_ERR_INVAL; r = PFMLIB_ITA2_EAR_TLB_MODE; if (is_ear_tlb(i)) goto done; r = PFMLIB_ITA2_EAR_CACHE_MODE; if (is_ear_cache(i)) goto done; r = PFMLIB_ITA2_EAR_ALAT_MODE; if (is_ear_alat(i)) goto done; return PFMLIB_ERR_INVAL; done: *m = r; return PFMLIB_SUCCESS; } static int pfm_ita2_get_event_code(unsigned int i, unsigned int cnt, int *code) { if (cnt != PFMLIB_CNT_FIRST && (cnt < 4 || cnt > 7)) return PFMLIB_ERR_INVAL; *code = (int)itanium2_pe[i].pme_code; return PFMLIB_SUCCESS; } /* * This function is accessible directly to the user */ int pfm_ita2_get_event_umask(unsigned int i, unsigned long *umask) { if (i >= PME_ITA2_EVENT_COUNT || umask == NULL) return PFMLIB_ERR_INVAL; *umask = evt_umask(i); return PFMLIB_SUCCESS; } int pfm_ita2_get_event_group(unsigned int i, int *grp) { if (i >= PME_ITA2_EVENT_COUNT || grp == NULL) return PFMLIB_ERR_INVAL; *grp = evt_grp(i); return PFMLIB_SUCCESS; } int pfm_ita2_get_event_set(unsigned int i, int *set) { if (i >= PME_ITA2_EVENT_COUNT || set == NULL) return PFMLIB_ERR_INVAL; *set = evt_set(i) == 0xf ? PFMLIB_ITA2_EVT_NO_SET : evt_set(i); return PFMLIB_SUCCESS; } /* external interface */ int pfm_ita2_irange_is_fine(pfmlib_output_param_t *outp, pfmlib_ita2_output_param_t *mod_out) { pfmlib_ita2_output_param_t *param = mod_out; pfm_ita2_pmc_reg_t reg; unsigned int i, count; /* some sanity checks */ if (outp == NULL || param == NULL) return 0; if (outp->pfp_pmc_count >= PFMLIB_MAX_PMCS) return 0; if (param->pfp_ita2_irange.rr_nbr_used == 0) return 0; /* * we look for pmc14 as it contains the bit indicating if fine mode is used */ count = outp->pfp_pmc_count; for(i=0; i < count; i++) { if (outp->pfp_pmcs[i].reg_num == 14) goto found; } return 0; found: reg.pmc_val = outp->pfp_pmcs[i].reg_value; return reg.pmc14_ita2_reg.iarc_fine ? 
1 : 0; } static char * pfm_ita2_get_event_name(unsigned int i) { return itanium2_pe[i].pme_name; } static void pfm_ita2_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { unsigned int i; unsigned long m; memset(counters, 0, sizeof(*counters)); m =itanium2_pe[j].pme_counters; for(i=0; m ; i++, m>>=1) { if (m & 0x1) pfm_regmask_set(counters, i); } } static void pfm_ita2_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs) { unsigned int i = 0; /* all pmcs are contiguous */ for(i=0; i < PMU_ITA2_NUM_PMCS; i++) pfm_regmask_set(impl_pmcs, i); } static void pfm_ita2_get_impl_pmds(pfmlib_regmask_t *impl_pmds) { unsigned int i = 0; /* all pmds are contiguous */ for(i=0; i < PMU_ITA2_NUM_PMDS; i++) pfm_regmask_set(impl_pmds, i); } static void pfm_ita2_get_impl_counters(pfmlib_regmask_t *impl_counters) { unsigned int i = 0; /* counting pmds are contiguous */ for(i=4; i < 8; i++) pfm_regmask_set(impl_counters, i); } static void pfm_ita2_get_hw_counter_width(unsigned int *width) { *width = PMU_ITA2_COUNTER_WIDTH; } static int pfm_ita2_get_event_description(unsigned int ev, char **str) { char *s; s = itanium2_pe[ev].pme_desc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static int pfm_ita2_get_cycle_event(pfmlib_event_t *e) { e->event = PME_ITA2_CPU_CYCLES; return PFMLIB_SUCCESS; } static int pfm_ita2_get_inst_retired(pfmlib_event_t *e) { e->event = PME_ITA2_IA64_INST_RETIRED; return PFMLIB_SUCCESS; } pfm_pmu_support_t itanium2_support={ .pmu_name = "itanium2", .pmu_type = PFMLIB_ITANIUM2_PMU, .pme_count = PME_ITA2_EVENT_COUNT, .pmc_count = PMU_ITA2_NUM_PMCS, .pmd_count = PMU_ITA2_NUM_PMDS, .num_cnt = PMU_ITA2_NUM_COUNTERS, .get_event_code = pfm_ita2_get_event_code, .get_event_name = pfm_ita2_get_event_name, .get_event_counters = pfm_ita2_get_event_counters, .dispatch_events = pfm_ita2_dispatch_events, .pmu_detect = pfm_ita2_detect, .get_impl_pmcs = pfm_ita2_get_impl_pmcs, .get_impl_pmds = pfm_ita2_get_impl_pmds, .get_impl_counters = 
pfm_ita2_get_impl_counters, .get_hw_counter_width = pfm_ita2_get_hw_counter_width, .get_event_desc = pfm_ita2_get_event_description, .get_cycle_event = pfm_ita2_get_cycle_event, .get_inst_retired_event = pfm_ita2_get_inst_retired }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_itanium2_priv.h000066400000000000000000000121711502707512200230260ustar00rootroot00000000000000/* * Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux/ia64. */ #ifndef __PFMLIB_ITANIUM2_PRIV_H__ #define __PFMLIB_ITANIUM2_PRIV_H__ /* * Event type definitions * * The virtual events are not really defined in the specs but are an artifact used * to quickly and easily setup EAR and/or BTB. The event type encodes the exact feature * which must be configured in combination with a counting monitor. 
* For instance, DATA_EAR_CACHE_LAT4 is a virtual D-EAR cache event. If the user * requests this event, this will configure a counting monitor to count DATA_EAR_EVENTS * and PMC11 will be configured for cache mode. The latency is encoded in the umask, here * it would correspond to 4 cycles. * */ #define PFMLIB_ITA2_EVENT_NORMAL 0x0 /* standard counter */ #define PFMLIB_ITA2_EVENT_BTB 0x1 /* virtual event used with BTB configuration */ #define PFMLIB_ITA2_EVENT_IEAR_TLB 0x2 /* virtual event used for I-EAR TLB configuration */ #define PFMLIB_ITA2_EVENT_IEAR_CACHE 0x3 /* virtual event used for I-EAR cache configuration */ #define PFMLIB_ITA2_EVENT_DEAR_TLB 0x4 /* virtual event used for D-EAR TLB configuration */ #define PFMLIB_ITA2_EVENT_DEAR_CACHE 0x5 /* virtual event used for D-EAR cache configuration */ #define PFMLIB_ITA2_EVENT_DEAR_ALAT 0x6 /* virtual event used for D-EAR ALAT configuration */ #define event_is_ear(e) ((e)->pme_type >= PFMLIB_ITA2_EVENT_IEAR_TLB &&(e)->pme_type <= PFMLIB_ITA2_EVENT_DEAR_ALAT) #define event_is_iear(e) ((e)->pme_type == PFMLIB_ITA2_EVENT_IEAR_TLB || (e)->pme_type == PFMLIB_ITA2_EVENT_IEAR_CACHE) #define event_is_dear(e) ((e)->pme_type >= PFMLIB_ITA2_EVENT_DEAR_TLB && (e)->pme_type <= PFMLIB_ITA2_EVENT_DEAR_ALAT) #define event_is_ear_cache(e) ((e)->pme_type == PFMLIB_ITA2_EVENT_DEAR_CACHE || (e)->pme_type == PFMLIB_ITA2_EVENT_IEAR_CACHE) #define event_is_ear_tlb(e) ((e)->pme_type == PFMLIB_ITA2_EVENT_IEAR_TLB || (e)->pme_type == PFMLIB_ITA2_EVENT_DEAR_TLB) #define event_is_ear_alat(e) ((e)->pme_type == PFMLIB_ITA2_EVENT_DEAR_ALAT) #define event_is_btb(e) ((e)->pme_type == PFMLIB_ITA2_EVENT_BTB) /* * Itanium encoding structure * (code must be first 8 bits) */ typedef struct { unsigned long pme_code:8; /* major event code */ unsigned long pme_type:3; /* see definitions above */ unsigned long pme_ig1:5; /* ignored */ unsigned long pme_umask:16; /* unit mask*/ unsigned long pme_ig:32; /* ignored */ } pme_ita2_entry_code_t; typedef union { 
unsigned long pme_vcode; pme_ita2_entry_code_t pme_ita2_code; /* must not be larger than vcode */ } pme_ita2_code_t; typedef union { unsigned long qual; /* generic qualifier */ struct { unsigned long pme_iar:1; /* instruction address range supported */ unsigned long pme_opm:1; /* opcode match supported */ unsigned long pme_dar:1; /* data address range supported */ unsigned long pme_res1:13; /* reserved */ unsigned long pme_group:4; /* event group */ unsigned long pme_set:4; /* event feature set*/ unsigned long pme_res2:40; /* reserved */ } pme_qual; } pme_ita2_qualifiers_t; typedef struct { char *pme_name; pme_ita2_code_t pme_entry_code; unsigned long pme_counters; /* supported counters */ unsigned int pme_maxincr; pme_ita2_qualifiers_t pme_qualifiers; char *pme_desc; /* text description of the event */ } pme_ita2_entry_t; /* * We embed the umask value into the event code. Because it really is * like a subevent. * pme_code: * - lower 16 bits: major event code * - upper 16 bits: unit mask */ #define pme_code pme_entry_code.pme_ita2_code.pme_code #define pme_umask pme_entry_code.pme_ita2_code.pme_umask #define pme_used pme_qualifiers.pme_qual_struct.pme_used #define pme_type pme_entry_code.pme_ita2_code.pme_type #define event_opcm_ok(e) ((e)->pme_qualifiers.pme_qual.pme_opm==1) #define event_iarr_ok(e) ((e)->pme_qualifiers.pme_qual.pme_iar==1) #define event_darr_ok(e) ((e)->pme_qualifiers.pme_qual.pme_dar==1) #endif /* __PFMLIB_ITANIUM2_PRIV_H__ */ papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_itanium_priv.h000066400000000000000000000071471502707512200227530ustar00rootroot00000000000000/* * Copyright (c) 2001-2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux/ia64. 
*/ #ifndef __PFMLIB_ITANIUM_PRIV_H__ #define __PFMLIB_ITANIUM_PRIV_H__ /* * Itanium encoding structure * (code must be first 8 bits) */ typedef struct { unsigned long pme_code:8; /* major event code */ unsigned long pme_ear:1; /* is EAR event */ unsigned long pme_dear:1; /* 1=Data 0=Instr */ unsigned long pme_tlb:1; /* 1=TLB 0=Cache */ unsigned long pme_btb:1; /* 1=BTB */ unsigned long pme_ig1:4; /* ignored */ unsigned long pme_umask:16; /* unit mask*/ unsigned long pme_ig:32; /* ignored */ } pme_ita_entry_code_t; #define PME_UMASK_NONE 0x0 typedef union { unsigned long pme_vcode; pme_ita_entry_code_t pme_ita_code; /* must not be larger than vcode */ } pme_ita_code_t; typedef union { unsigned long qual; /* generic qualifier */ struct { unsigned long pme_iar:1; /* instruction address range supported */ unsigned long pme_opm:1; /* opcode match supported */ unsigned long pme_dar:1; /* data address range supported */ unsigned long pme_reserved:61; /* not used */ } pme_qual; } pme_ita_qualifiers_t; typedef struct { char *pme_name; pme_ita_code_t pme_entry_code; unsigned long pme_counters; /* supported counters */ unsigned int pme_maxincr; pme_ita_qualifiers_t pme_qualifiers; char *pme_desc; } pme_ita_entry_t; /* * We embed the umask value into the event code. Because it really is * like a subevent. 
* pme_code: * - lower 16 bits: major event code * - upper 16 bits: unit mask */ #define pme_code pme_entry_code.pme_ita_code.pme_code #define pme_ear pme_entry_code.pme_ita_code.pme_ear #define pme_dear pme_entry_code.pme_ita_code.pme_dear #define pme_tlb pme_entry_code.pme_ita_code.pme_tlb #define pme_btb pme_entry_code.pme_ita_code.pme_btb #define pme_umask pme_entry_code.pme_ita_code.pme_umask #define pme_used pme_qualifiers.pme_qual_struct.pme_used #define event_is_ear(e) ((e)->pme_ear == 1) #define event_is_iear(e) ((e)->pme_ear == 1 && (e)->pme_dear==0) #define event_is_dear(e) ((e)->pme_ear == 1 && (e)->pme_dear==1) #define event_is_tlb_ear(e) ((e)->pme_ear == 1 && (e)->pme_tlb==1) #define event_is_btb(e) ((e)->pme_btb) #define event_opcm_ok(e) ((e)->pme_qualifiers.pme_qual.pme_opm==1) #define event_iarr_ok(e) ((e)->pme_qualifiers.pme_qual.pme_iar==1) #define event_darr_ok(e) ((e)->pme_qualifiers.pme_qual.pme_dar==1) #endif /* __PFMLIB_ITANIUM_PRIV_H__ */ papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_mips.c000066400000000000000000000215061502707512200212030ustar00rootroot00000000000000/* * pfmlib_mips.c : support for MIPS chips * * Copyright (c) 2011 Samara Technology Group, Inc * Contributed by Philip Mucci * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * */ #include <sys/types.h> #include <string.h> #include <stdio.h> #include <stdlib.h> #include <stdarg.h> /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_mips_priv.h" pfm_mips_config_t pfm_mips_cfg; static const pfmlib_attr_desc_t mips_mods[]={ PFM_ATTR_B("k", "monitor at system level"), PFM_ATTR_B("u", "monitor at user level"), PFM_ATTR_B("s", "monitor at supervisor level"), PFM_ATTR_B("e", "monitor at exception level "), PFM_ATTR_NULL /* end-marker to avoid exporting number of entries */ }; #ifdef CONFIG_PFMLIB_OS_LINUX /* * helper function to retrieve one value from /proc/cpuinfo * for internal libpfm use only * attr: the attribute (line) to look for * ret_buf: a buffer to store the value of the attribute (as a string) * maxlen : number of bytes of capacity in ret_buf * * ret_buf is null terminated.
* * Return: * 0 : attribute found, ret_buf populated * -1: attribute not found */ static int pfmlib_getcpuinfo_attr(const char *attr, char *ret_buf, size_t maxlen) { FILE *fp = NULL; int ret = -1; size_t attr_len, buf_len = 0; char *p, *value = NULL; char *buffer = NULL; if (attr == NULL || ret_buf == NULL || maxlen < 1) return -1; attr_len = strlen(attr); fp = fopen("/proc/cpuinfo", "r"); if (fp == NULL) return -1; while(pfmlib_getl(&buffer, &buf_len, fp) != -1){ /* skip blank lines */ if (*buffer == '\n') continue; p = strchr(buffer, ':'); if (p == NULL) goto error; /* * p+2: +1 = space, +2 = first character * strlen()-1 gets rid of \n */ *p = '\0'; value = p+2; value[strlen(value)-1] = '\0'; if (!strncmp(attr, buffer, attr_len)) break; } strncpy(ret_buf, value, maxlen-1); ret_buf[maxlen-1] = '\0'; ret = 0; error: free(buffer); fclose(fp); return ret; } #else static int pfmlib_getcpuinfo_attr(const char *attr, char *ret_buf, size_t maxlen) { DPRINT("/proc/cpuinfo ignored\n"); return -1; } #endif static void pfm_mips_display_reg(pfm_mips_sel_reg_t reg, uint64_t cntrs, char *fstr) { __pfm_vbprintf("[0x%"PRIx64" mask=0x%x usr=%d sys=%d sup=%d int=%d cntrs=0x%"PRIx64"] %s\n", reg.val, reg.perfsel64.sel_event_mask, reg.perfsel64.sel_usr, reg.perfsel64.sel_os, reg.perfsel64.sel_sup, reg.perfsel64.sel_exl, cntrs, fstr); } int pfm_mips_detect(void *this) { int ret; char buffer[1024]; DPRINT("mips_detect\n"); ret = pfmlib_getcpuinfo_attr("cpu model", buffer, sizeof(buffer)); if (ret == -1) return PFM_ERR_NOTSUPP; if (strstr(buffer,"MIPS") == NULL) return PFM_ERR_NOTSUPP; strcpy(pfm_mips_cfg.model, buffer); /* ret = pfmlib_getcpuinfo_attr("CPU implementer", buffer, sizeof(buffer)); if (ret == -1) return PFM_ERR_NOTSUPP; pfm_mips_cfg.implementer = strtol(buffer, NULL, 16); ret = pfmlib_getcpuinfo_attr("CPU part", buffer, sizeof(buffer)); if (ret == -1) return PFM_ERR_NOTSUPP; pfm_mips_cfg.part = strtol(buffer, NULL, 16); ret = pfmlib_getcpuinfo_attr("CPU architecture", buffer,
sizeof(buffer)); if (ret == -1) return PFM_ERR_NOTSUPP; pfm_mips_cfg.architecture = strtol(buffer, NULL, 16); */ return PFM_SUCCESS; } int pfm_mips_get_encoding(void *this, pfmlib_event_desc_t *e) { pfmlib_pmu_t *pmu = this; const mips_entry_t *pe = this_pe(this); pfmlib_event_attr_info_t *a; pfm_mips_sel_reg_t reg; uint64_t ival, cntmask = 0; int plmmsk = 0, code; int k, id; reg.val = 0; code = pe[e->event].code; /* truncates bit 7 (counter info) */ reg.perfsel64.sel_event_mask = code; for (k = 0; k < e->nattrs; k++) { a = attr(e, k); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; ival = e->attrs[k].ival; switch(a->idx) { case MIPS_ATTR_K: /* os */ reg.perfsel64.sel_os = !!ival; plmmsk |= _MIPS_ATTR_K; break; case MIPS_ATTR_U: /* user */ reg.perfsel64.sel_usr = !!ival; plmmsk |= _MIPS_ATTR_U; break; case MIPS_ATTR_S: /* supervisor */ reg.perfsel64.sel_sup = !!ival; plmmsk |= _MIPS_ATTR_S; break; case MIPS_ATTR_E: /* int */ reg.perfsel64.sel_exl = !!ival; plmmsk |= _MIPS_ATTR_E; } } /* * handle case where no priv level mask was passed. 
* then we use the dfl_plm */ if (!(plmmsk & MIPS_PLM_ALL)) { if (e->dfl_plm & PFM_PLM0) reg.perfsel64.sel_os = 1; if (e->dfl_plm & PFM_PLM1) reg.perfsel64.sel_sup = 1; if (e->dfl_plm & PFM_PLM2) reg.perfsel64.sel_exl = 1; if (e->dfl_plm & PFM_PLM3) reg.perfsel64.sel_usr = 1; } evt_strcat(e->fstr, "%s", pe[e->event].name); for (k = 0; k < e->npattrs; k++) { if (e->pattrs[k].ctrl != PFM_ATTR_CTRL_PMU) continue; id = e->pattrs[k].idx; switch(id) { case MIPS_ATTR_K: evt_strcat(e->fstr, ":%s=%lu", mips_mods[id].name, reg.perfsel64.sel_os); break; case MIPS_ATTR_U: evt_strcat(e->fstr, ":%s=%lu", mips_mods[id].name, reg.perfsel64.sel_usr); break; case MIPS_ATTR_S: evt_strcat(e->fstr, ":%s=%lu", mips_mods[id].name, reg.perfsel64.sel_sup); break; case MIPS_ATTR_E: evt_strcat(e->fstr, ":%s=%lu", mips_mods[id].name, reg.perfsel64.sel_exl); break; } } e->codes[0] = reg.val; /* cycles and instructions support all counters */ if (code == 0 || code == 1) { cntmask = (1ULL << pmu->num_cntrs) -1; } else { /* event work on odd counters only */ for (k = !!(code & 0x80) ; k < pmu->num_cntrs; k+=2) { cntmask |= 1ULL << k; } } e->codes[1] = cntmask; e->count = 2; pfm_mips_display_reg(reg, cntmask, e->fstr); return PFM_SUCCESS; } int pfm_mips_get_event_first(void *this) { return 0; } int pfm_mips_get_event_next(void *this, int idx) { pfmlib_pmu_t *p = this; if (idx >= (p->pme_count-1)) return -1; return idx+1; } int pfm_mips_event_is_valid(void *this, int pidx) { pfmlib_pmu_t *p = this; return pidx >= 0 && pidx < p->pme_count; } int pfm_mips_validate_table(void *this, FILE *fp) { pfmlib_pmu_t *pmu = this; const mips_entry_t *pe = this_pe(this); int i, j, error = 0; for(i=0; i < pmu->pme_count; i++) { if (!pe[i].name) { fprintf(fp, "pmu: %s event%d: :: no name (prev event was %s)\n", pmu->name, i, i > 1 ? 
pe[i-1].name : "??"); error++; } if (!pe[i].desc) { fprintf(fp, "pmu: %s event%d: %s :: no description\n", pmu->name, i, pe[i].name); error++; } for (j=i+1; j < pmu->pme_count; j++) { if (pe[i].code == pe[j].code) { fprintf(fp, "pmu: %s events %s and %s have the same code 0x%x\n", pmu->name, pe[i].name, pe[j].name, pe[i].code); error++; } } } if (!pmu->supported_plm) { fprintf(fp, "pmu: %s supported_plm=0, is that right?\n", pmu->name); error++; } return error ? PFM_ERR_INVAL : PFM_SUCCESS; } unsigned int pfm_mips_get_event_nattrs(void *this, int pidx) { /* assume all pmus have the same number of attributes */ return MIPS_NUM_ATTRS; } int pfm_mips_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info) { /* no umasks, so all attrs are modifiers */ info->name = mips_mods[attr_idx].name; info->desc = mips_mods[attr_idx].desc; info->type = mips_mods[attr_idx].type; info->equiv = NULL; info->idx = attr_idx; /* private index */ info->code = attr_idx; info->is_dfl = 0; info->is_precise = 0; info->support_hw_smpl = 0; info->ctrl = PFM_ATTR_CTRL_PMU; return PFM_SUCCESS; } int pfm_mips_get_event_info(void *this, int idx, pfm_event_info_t *info) { pfmlib_pmu_t *pmu = this; const mips_entry_t *pe = this_pe(this); info->name = pe[idx].name; info->desc = pe[idx].desc; info->code = pe[idx].code; info->equiv = NULL; info->idx = idx; /* private index */ info->pmu = pmu->pmu; info->is_precise = 0; info->support_hw_smpl = 0; /* modifiers are the only attributes on MIPS */ info->nattrs = pfm_mips_get_event_nattrs(this, idx); return PFM_SUCCESS; } papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_mips_74k.c000066400000000000000000000055141502707512200216710ustar00rootroot00000000000000/* * pfmlib_mips_74k.c : support for MIPS chips * * Copyright (c) 2011 Samara Technology Group, Inc * Contributed by Philip Mucci * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated
documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * */ #include #include #include "pfmlib_priv.h" /* library private */ #include "pfmlib_mips_priv.h" #include "events/mips_74k_events.h" /* event tables */ /* root@redhawk_RT-N16:/proc# more cpuinfo system type : Broadcom BCM4716 chip rev 1 processor : 0 cpu model : MIPS 74K V4.0 BogoMIPS : 239.20 wait instruction : no microsecond timers : yes tlb_entries : 64 */ static int pfm_mips_detect_74k(void *this) { int ret; DPRINT("mips_detect_74k\n"); ret = pfm_mips_detect(this); if (ret != PFM_SUCCESS) return PFM_ERR_NOTSUPP; if (strstr(pfm_mips_cfg.model,"MIPS 74K")) return PFM_SUCCESS; return PFM_ERR_NOTSUPP; } /* MIPS 74K support */ pfmlib_pmu_t mips_74k_support={ .desc = "MIPS 74k", .name = "mips_74k", .pmu = PFM_PMU_MIPS_74K, .pme_count = LIBPFM_ARRAY_SIZE(mips_74k_pe), .type = PFM_PMU_TYPE_CORE, .pe = mips_74k_pe, .pmu_detect = pfm_mips_detect_74k, .max_encoding = 2, /* event encoding + counter bitmask */ .num_cntrs = 4, .get_event_encoding[PFM_OS_NONE] = pfm_mips_get_encoding, PFMLIB_ENCODE_PERF(pfm_mips_get_perf_encoding), .get_event_first =
pfm_mips_get_event_first, .get_event_next = pfm_mips_get_event_next, .event_is_valid = pfm_mips_event_is_valid, .validate_table = pfm_mips_validate_table, .get_event_info = pfm_mips_get_event_info, .get_event_attr_info = pfm_mips_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_mips_perf_validate_pattrs), .get_event_nattrs = pfm_mips_get_event_nattrs, .supported_plm = PFM_PLM0|PFM_PLM3|PFM_PLMH, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_mips_perf_event.c000066400000000000000000000065471502707512200234300ustar00rootroot00000000000000/* * pfmlib_mips_perf_event.c : perf_event MIPS functions * * Copyright (c) 2011 Samara Technology Group, Inc * Contributed by Philip Mucci * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include "pfmlib_priv.h" #include "pfmlib_mips_priv.h" #include "pfmlib_perf_event_priv.h" int pfm_mips_get_perf_encoding(void *this, pfmlib_event_desc_t *e) { struct perf_event_attr *attr = e->os_data; int ret; ret = pfm_mips_get_encoding(this, e); if (ret != PFM_SUCCESS) return ret; if (e->count != 2) { DPRINT("unexpected encoding count=%d\n", e->count); return PFM_ERR_INVAL; } attr->type = PERF_TYPE_RAW; /* * priv levels are ignored because they are managed * directly through perf excl_*. */ attr->config = e->codes[0] >> 5; /* * codes[1] contains counter mask supported by the event. * Events support either odd or even indexed counters * except for cycles (code = 0) and instructions (code =1) * which work on all counters. * * The kernel expects bit 7 of config to indicate whether * the event works only on odd-indexed counters */ if ((e->codes[1] & 0x2) && attr->config > 1) attr->config |= 1ULL << 7; return PFM_SUCCESS; } void pfm_mips_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e) { int i, compact; for (i = 0; i < e->npattrs; i++) { compact = 0; /* umasks never conflict */ if (e->pattrs[i].type == PFM_ATTR_UMASK) continue; /* * remove PMU-provided attributes which are either * not accessible under perf_events or fully controlled * by perf_events, e.g., priv levels filters */ if (e->pattrs[i].ctrl == PFM_ATTR_CTRL_PMU) { /* * with perf_event, priv levels under full * control of perf_event. 
*/ if ( e->pattrs[i].idx == MIPS_ATTR_K ||e->pattrs[i].idx == MIPS_ATTR_U ||e->pattrs[i].idx == MIPS_ATTR_S ||e->pattrs[i].idx == MIPS_ATTR_E) compact = 1; } /* * remove perf_event generic attributes not supported * by MIPS */ if (e->pattrs[i].ctrl == PFM_ATTR_CTRL_PERF_EVENT) { /* no precise sampling on MIPS */ if (e->pattrs[i].idx == PERF_ATTR_PR) compact = 1; } /* hardware sampling not supported */ if (e->pattrs[i].idx == PERF_ATTR_HWS) compact = 1; if (compact) { pfmlib_compact_pattrs(e, i); i--; } } } papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_mips_priv.h000066400000000000000000000102141502707512200222420ustar00rootroot00000000000000/* * Copyright (c) 2011 Samara Technology Group, Inc * Contributed by Philip Mucci * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. 
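The perfsel bitfield layout declared in this header can be illustrated with a small packing helper. `mips_perfsel_pack` is a hypothetical name for this sketch; it mirrors the little-endian field order (exl at bit 0, then os, sup, usr, the interrupt-enable bit, and a 7-bit event code starting at bit 5):

```c
#include <assert.h>
#include <stdint.h>

// Illustrative packing of a MIPS perfsel value, matching the
// little-endian bitfield layout: exl(0) os(1) sup(2) usr(3) int(4)
// event(5-11). Hypothetical helper, not part of libpfm.
static uint64_t mips_perfsel_pack(unsigned event, int usr, int os,
                                  int sup, int exl)
{
	uint64_t v = 0;

	v |= (uint64_t)(exl & 1) << 0;
	v |= (uint64_t)(os  & 1) << 1;
	v |= (uint64_t)(sup & 1) << 2;
	v |= (uint64_t)(usr & 1) << 3;
	// bit 4 is the interrupt-enable bit, left clear here
	v |= (uint64_t)(event & 0x7f) << 5;
	return v;
}
```

Note that shifting such a value right by 5 recovers the raw event code, which is how the perf_event encoding path derives `attr->config` from `codes[0]`.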
*/ #ifndef __PFMLIB_MIPS_PRIV_H__ #define __PFMLIB_MIPS_PRIV_H__ /* * This file contains the definitions used for MIPS processors */ /* * event description */ typedef struct { const char *name; /* event name */ const char *desc; /* event description */ unsigned int mask; /* which counters event lives on */ unsigned int code; /* event code */ } mips_entry_t; #if __BYTE_ORDER == __LITTLE_ENDIAN typedef union { uint64_t val; /* complete register value */ struct { unsigned long sel_exl:1; /* int level */ unsigned long sel_os:1; /* system level */ unsigned long sel_sup:1; /* supervisor level */ unsigned long sel_usr:1; /* user level */ unsigned long sel_int:1; /* enable intr */ unsigned long sel_event_mask:7; /* event mask */ unsigned long sel_res1:20; /* reserved */ unsigned long sel_res2:32; /* reserved */ } perfsel64; } pfm_mips_sel_reg_t; #elif __BYTE_ORDER == __BIG_ENDIAN typedef union { uint64_t val; /* complete register value */ struct { unsigned long sel_res2:32; /* reserved */ unsigned long sel_res1:20; /* reserved */ unsigned long sel_event_mask:7; /* event mask */ unsigned long sel_int:1; /* enable intr */ unsigned long sel_usr:1; /* user level */ unsigned long sel_sup:1; /* supervisor level */ unsigned long sel_os:1; /* system level */ unsigned long sel_exl:1; /* int level */ } perfsel64; } pfm_mips_sel_reg_t; #else #error "cannot determine endianess" #endif typedef struct { char model[1024]; int implementer; int architecture; int part; } pfm_mips_config_t; extern pfm_mips_config_t pfm_mips_cfg; #define MIPS_ATTR_K 0 /* system level */ #define MIPS_ATTR_U 1 /* user level */ #define MIPS_ATTR_S 2 /* supervisor level */ #define MIPS_ATTR_E 3 /* exception level */ #define MIPS_NUM_ATTRS 4 #define _MIPS_ATTR_K (1 << MIPS_ATTR_K) #define _MIPS_ATTR_U (1 << MIPS_ATTR_U) #define _MIPS_ATTR_S (1 << MIPS_ATTR_S) #define _MIPS_ATTR_E (1 << MIPS_ATTR_E) #define MIPS_PLM_ALL ( _MIPS_ATTR_K |\ _MIPS_ATTR_U |\ _MIPS_ATTR_S |\ _MIPS_ATTR_E) extern int pfm_mips_detect(void 
*this); extern int pfm_mips_get_encoding(void *this, pfmlib_event_desc_t *e); extern int pfm_mips_get_event_first(void *this); extern int pfm_mips_get_event_next(void *this, int idx); extern int pfm_mips_event_is_valid(void *this, int pidx); extern int pfm_mips_validate_table(void *this, FILE *fp); extern int pfm_mips_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info); extern int pfm_mips_get_event_info(void *this, int idx, pfm_event_info_t *info); extern unsigned int pfm_mips_get_event_nattrs(void *this, int pidx); extern void pfm_mips_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e); extern int pfm_mips_get_perf_encoding(void *this, pfmlib_event_desc_t *e); #endif /* __PFMLIB_MIPS_PRIV_H__ */ papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_montecito.c000066400000000000000000002117251502707512200222400ustar00rootroot00000000000000/* * pfmlib_montecito.c : support for the Dual-Core Itanium2 processor * * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include /* public headers */ #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_priv_ia64.h" /* architecture private */ #include "pfmlib_montecito_priv.h" /* PMU private */ #include "montecito_events.h" /* PMU private */ #define is_ear(i) event_is_ear(montecito_pe+(i)) #define is_ear_tlb(i) event_is_ear_tlb(montecito_pe+(i)) #define is_ear_alat(i) event_is_ear_alat(montecito_pe+(i)) #define is_ear_cache(i) event_is_ear_cache(montecito_pe+(i)) #define is_iear(i) event_is_iear(montecito_pe+(i)) #define is_dear(i) event_is_dear(montecito_pe+(i)) #define is_etb(i) event_is_etb(montecito_pe+(i)) #define has_opcm(i) event_opcm_ok(montecito_pe+(i)) #define has_iarr(i) event_iarr_ok(montecito_pe+(i)) #define has_darr(i) event_darr_ok(montecito_pe+(i)) #define has_all(i) event_all_ok(montecito_pe+(i)) #define has_mesi(i) event_mesi_ok(montecito_pe+(i)) #define evt_use_opcm(e) ((e)->pfp_mont_opcm1.opcm_used != 0 || (e)->pfp_mont_opcm2.opcm_used !=0) #define evt_use_irange(e) ((e)->pfp_mont_irange.rr_used) #define evt_use_drange(e) ((e)->pfp_mont_drange.rr_used) #define evt_grp(e) (int)montecito_pe[e].pme_qualifiers.pme_qual.pme_group #define evt_set(e) (int)montecito_pe[e].pme_qualifiers.pme_qual.pme_set #define evt_umask(e) montecito_pe[e].pme_umask #define evt_type(e) (int)montecito_pe[e].pme_type #define evt_caf(e) (int)montecito_pe[e].pme_caf #define FINE_MODE_BOUNDARY_BITS 16 #define FINE_MODE_MASK ~((1U<<FINE_MODE_BOUNDARY_BITS)-1) /* * The following mappings are used for the PMC registers: * 0 -> PMC0 * 1 -> PMC1 * n -> PMCn * * The following are in the model specific rr_br[]: * IBR0 -> 0 * IBR1 -> 1 * ... * IBR7 -> 7 * DBR0 -> 0 * DBR1 -> 1 * ...
* DBR7 -> 7 * * We do not use a mapping table, instead we make up the * values on the fly given the base. */ static int pfm_mont_detect(void) { int tmp; int ret = PFMLIB_ERR_NOTSUPP; tmp = pfm_ia64_get_cpu_family(); if (tmp == 0x20) { ret = PFMLIB_SUCCESS; } return ret; } /* * Check the event for incompatibilities. This is useful * for L1D and L2D related events. Due to wire limitations, * some caches events are separated into sets. There * are 6 sets for the L1D cache group and 8 sets for L2D group. * It is NOT possible to simultaneously measure events from * differents sets for L1D. For instance, you cannot * measure events from set0 and set1 in L1D cache group. The L2D * group allows up to two different sets to be active at the same * time. The first set is selected by the event in PMC4 and the second * set by the event in PMC6. Once the set is selected for PMC4, * the same set is locked for PMC5 and PMC8. Similarly, once the * set is selected for PMC6, the same set is locked for PMC7 and * PMC9. 
* * This function verifies that only one set of L1D is selected * and that no more than 2 sets are selected for L2D */ static int check_cross_groups(pfmlib_input_param_t *inp, unsigned int *l1d_event, unsigned long *l2d_set1_mask, unsigned long *l2d_set2_mask) { int g, s, s1, s2; unsigned int cnt = inp->pfp_event_count; pfmlib_event_t *e = inp->pfp_events; unsigned int i, j; unsigned long l2d_mask1 = 0, l2d_mask2 = 0; unsigned int l1d_event_idx = UNEXISTING_SET; /* * Let check the L1D constraint first * * There is no umask restriction for this group */ for (i=0; i < cnt; i++) { g = evt_grp(e[i].event); s = evt_set(e[i].event); if (g != PFMLIB_MONT_EVT_L1D_CACHE_GRP) continue; DPRINT("i=%u g=%d s=%d\n", i, g, s); l1d_event_idx = i; for (j=i+1; j < cnt; j++) { if (evt_grp(e[j].event) != g) continue; /* * if there is another event from the same group * but with a different set, then we return an error */ if (evt_set(e[j].event) != s) return PFMLIB_ERR_EVTSET; } } /* * Check that we have only up to two distinct * sets for L2D */ s1 = s2 = -1; for (i=0; i < cnt; i++) { g = evt_grp(e[i].event); if (g != PFMLIB_MONT_EVT_L2D_CACHE_GRP) continue; s = evt_set(e[i].event); /* * we have seen this set before, continue */ if (s1 == s) { l2d_mask1 |= 1UL << i; continue; } if (s2 == s) { l2d_mask2 |= 1UL << i; continue; } /* * record first of second set seen */ if (s1 == -1) { s1 = s; l2d_mask1 |= 1UL << i; } else if (s2 == -1) { s2 = s; l2d_mask2 |= 1UL << i; } else { /* * found a third set, that's not possible */ return PFMLIB_ERR_EVTSET; } } *l1d_event = l1d_event_idx; *l2d_set1_mask = l2d_mask1; *l2d_set2_mask = l2d_mask2; return PFMLIB_SUCCESS; } /* * Certain prefetch events must be treated specially when instruction range restriction * is used because they can only be constrained by IBRP1 in fine-mode. Other events * will use IBRP0 if tagged as a demand fetch OR IBPR1 if tagged as a prefetch match. 
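The IBRP selection rule for prefetch events described above can be sketched as a tiny decision helper. Names and flag values here are illustrative, not the libpfm API (the real code uses the PFMLIB_MONT_IRR_* flags and found_ibrp0/found_ibrp1 counts):

```c
#include <assert.h>

// Illustrative flags standing in for PFMLIB_MONT_IRR_DEMAND_FETCH and
// PFMLIB_MONT_IRR_PREFETCH_MATCH.
#define DEMAND_FETCH   0x1
#define PREFETCH_MATCH 0x2

// Pick the base IBR pair for a dual prefetch event: IBRP0 (base 0) for
// demand fetches, IBRP1 (base 2) for prefetch matches. When both are
// requested, the range must be duplicated into both pairs (*dup = 1).
// Hypothetical helper mirroring the logic in check_prefetch_events().
static int pick_ibrp(unsigned int flags, int *dup)
{
	*dup = 0;
	if (flags == 0)
		return -1;	// caller must choose at least one mode
	if ((flags & DEMAND_FETCH) && (flags & PREFETCH_MATCH))
		*dup = 1;	// duplicate range in ibrp0 and ibrp1
	return (flags & DEMAND_FETCH) ? 0 : 2;
}
```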
* * Events which can be qualified by the two pairs depending on their tag: * - ISB_BUNPAIRS_IN * - L1I_FETCH_RAB_HIT * - L1I_FETCH_ISB_HIT * - L1I_FILLS * * This function returns the number of qualifying prefetch events found */ static int prefetch_events[]={ PME_MONT_L1I_PREFETCHES, PME_MONT_L1I_STRM_PREFETCHES, PME_MONT_L2I_PREFETCHES }; #define NPREFETCH_EVENTS sizeof(prefetch_events)/sizeof(int) static int prefetch_dual_events[]= { PME_MONT_ISB_BUNPAIRS_IN, PME_MONT_L1I_FETCH_RAB_HIT, PME_MONT_L1I_FETCH_ISB_HIT, PME_MONT_L1I_FILLS }; #define NPREFETCH_DUAL_EVENTS sizeof(prefetch_dual_events)/sizeof(int) /* * prefetch events must use IBRP1, unless they are dual and the user specified * PFMLIB_MONT_IRR_DEMAND_FETCH in rr_flags */ static int check_prefetch_events(pfmlib_input_param_t *inp, pfmlib_mont_input_rr_t *irr, unsigned int *count, int *base_idx, int *dup) { int code; int prefetch_codes[NPREFETCH_EVENTS]; int prefetch_dual_codes[NPREFETCH_DUAL_EVENTS]; unsigned int i, j; int c, flags; int found = 0, found_ibrp0 = 0, found_ibrp1 = 0; flags = irr->rr_flags & (PFMLIB_MONT_IRR_DEMAND_FETCH|PFMLIB_MONT_IRR_PREFETCH_MATCH); for(i=0; i < NPREFETCH_EVENTS; i++) { pfm_get_event_code(prefetch_events[i], &code); prefetch_codes[i] = code; } for(i=0; i < NPREFETCH_DUAL_EVENTS; i++) { pfm_get_event_code(prefetch_dual_events[i], &code); prefetch_dual_codes[i] = code; } for(i=0; i < inp->pfp_event_count; i++) { pfm_get_event_code(inp->pfp_events[i].event, &c); for(j=0; j < NPREFETCH_EVENTS; j++) { if (c == prefetch_codes[j]) { found++; found_ibrp1++; } } /* * for the dual events, users must specify one or both of the * PFMLIB_MONT_IRR_DEMAND_FETCH or PFMLIB_MONT_IRR_PREFETCH_MATCH */ for(j=0; j < NPREFETCH_DUAL_EVENTS; j++) { if (c == prefetch_dual_codes[j]) { found++; if (flags == 0) return PFMLIB_ERR_IRRFLAGS; if (flags & PFMLIB_MONT_IRR_DEMAND_FETCH) found_ibrp0++; if (flags & PFMLIB_MONT_IRR_PREFETCH_MATCH) found_ibrp1++; } } } *count = found; *dup = 0; /* * if both 
found_ibrp0 and found_ibrp1 > 0, then we need to duplicate * the range in ibrp0 to ibrp1. */ if (found) { *base_idx = found_ibrp0 ? 0 : 2; if (found_ibrp1 && found_ibrp0) *dup = 1; } return 0; } /* * look for CPU_OP_CYCLES_QUAL * Return: * 1 if found * 0 otherwise */ static int has_cpu_cycles_qual(pfmlib_input_param_t *inp) { unsigned int i; int code, c; pfm_get_event_code(PME_MONT_CPU_OP_CYCLES_QUAL, &code); for(i=0; i < inp->pfp_event_count; i++) { pfm_get_event_code(inp->pfp_events[i].event, &c); if (c == code) return 1; } return 0; } /* * IA64_INST_RETIRED (and subevents) is the only event which can be measured on all * 4 IBR when non-fine mode is not possible. * * This function returns: * - the number of events match the IA64_INST_RETIRED code * - in retired_mask to bottom 4 bits indicates which of the 4 INST_RETIRED event * is present */ static unsigned int check_inst_retired_events(pfmlib_input_param_t *inp, unsigned long *retired_mask) { int code; int c; unsigned int i, count, found = 0; unsigned long umask, mask; pfm_get_event_code(PME_MONT_IA64_INST_RETIRED, &code); count = inp->pfp_event_count; mask = 0; for(i=0; i < count; i++) { pfm_get_event_code(inp->pfp_events[i].event, &c); if (c == code) { pfm_mont_get_event_umask(inp->pfp_events[i].event, &umask); switch(umask) { case 0: mask |= 1; break; case 1: mask |= 2; break; case 2: mask |= 4; break; case 3: mask |= 8; break; } found++; } } if (retired_mask) *retired_mask = mask; return found; } static int check_fine_mode_possible(pfmlib_mont_input_rr_t *rr, int n) { pfmlib_mont_input_rr_desc_t *lim = rr->rr_limits; int i; for(i=0; i < n; i++) { if ((lim[i].rr_start & FINE_MODE_MASK) != (lim[i].rr_end & FINE_MODE_MASK)) return 0; } return 1; } /* * mode = 0 -> check code (enforce bundle alignment) * mode = 1 -> check data */ static int check_intervals(pfmlib_mont_input_rr_t *irr, int mode, unsigned int *n_intervals) { unsigned int i; pfmlib_mont_input_rr_desc_t *lim = irr->rr_limits; for(i=0; i < 4; i++) { 
/* end marker */ if (lim[i].rr_start == 0 && lim[i].rr_end == 0) break; /* invalid entry */ if (lim[i].rr_start >= lim[i].rr_end) return PFMLIB_ERR_IRRINVAL; if (mode == 0 && (lim[i].rr_start & 0xf || lim[i].rr_end & 0xf)) return PFMLIB_ERR_IRRALIGN; } *n_intervals = i; return PFMLIB_SUCCESS; } /* * It is not possible to measure more than one of the * L2D_OZQ_CANCELS0, L2D_OZQ_CANCELS1 at the same time. */ static int cancel_events[]= { PME_MONT_L2D_OZQ_CANCELS0_ACQ, PME_MONT_L2D_OZQ_CANCELS1_ANY }; #define NCANCEL_EVENTS sizeof(cancel_events)/sizeof(int) static int check_cancel_events(pfmlib_input_param_t *inp) { unsigned int i, j, count; int code; int cancel_codes[NCANCEL_EVENTS]; int idx = -1; for(i=0; i < NCANCEL_EVENTS; i++) { pfm_get_event_code(cancel_events[i], &code); cancel_codes[i] = code; } count = inp->pfp_event_count; for(i=0; i < count; i++) { for (j=0; j < NCANCEL_EVENTS; j++) { pfm_get_event_code(inp->pfp_events[i].event, &code); if (code == cancel_codes[j]) { if (idx != -1) { return PFMLIB_ERR_INVAL; } idx = inp->pfp_events[i].event; } } } return PFMLIB_SUCCESS; } /* * Automatically dispatch events to corresponding counters following constraints. 
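At its core, the constraint-driven placement described above claims a free counter out of each event's allowed-counter bitmask. A minimal greedy sketch (hypothetical names; it ignores the L1D/L2D set constraints and the designated-counter passes handled below, and relies on the GCC/Clang `__builtin_ctz` builtin):

```c
#include <assert.h>

// Greedy counter assignment sketch: allowed[i] is a bitmask of counters
// event i may use, *avail is the bitmask of still-free counters.
// Returns 0 on success with assign[i] holding the chosen counter index,
// -1 when some event has no compatible free counter.
static int assign_counters(const unsigned int *allowed, int nevents,
                           int assign[], unsigned int *avail)
{
	int i, j;

	for (i = 0; i < nevents; i++) {
		unsigned int can = allowed[i] & *avail;
		if (can == 0)
			return -1;		// no compatible free counter
		j = __builtin_ctz(can);		// lowest usable counter index
		assign[i] = j;
		*avail &= ~(1u << j);		// counter is now taken
	}
	return 0;
}
```

A purely greedy pass can fail where a smarter ordering would succeed, which is why the real dispatch routine places the most constrained events (L1D, then the L2D sets, then restricted BUS_*/ER_* events) before the unconstrained ones.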
*/ static unsigned int l2d_set1_cnts[]={ 4, 5, 8 }; static unsigned int l2d_set2_cnts[]={ 6, 7, 9 }; static int pfm_mont_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfmlib_mont_input_param_t *param = mod_in; pfm_mont_pmc_reg_t reg; pfmlib_event_t *e; pfmlib_reg_t *pc, *pd; pfmlib_regmask_t avail_cntrs, impl_cntrs; unsigned int i,j, k, max_cnt; unsigned int assign[PMU_MONT_NUM_COUNTERS]; unsigned int m, cnt; unsigned int l1d_set; unsigned long l2d_set1_mask, l2d_set2_mask, evt_mask, mesi; unsigned long not_assigned_events, cnt_mask; int l2d_set1_p, l2d_set2_p; int ret; e = inp->pfp_events; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; cnt = inp->pfp_event_count; if (PFMLIB_DEBUG()) for (m=0; m < cnt; m++) { DPRINT("ev[%d]=%s counters=0x%lx\n", m, montecito_pe[e[m].event].pme_name, montecito_pe[e[m].event].pme_counters); } if (cnt > PMU_MONT_NUM_COUNTERS) return PFMLIB_ERR_TOOMANY; l1d_set = UNEXISTING_SET; ret = check_cross_groups(inp, &l1d_set, &l2d_set1_mask, &l2d_set2_mask); if (ret != PFMLIB_SUCCESS) return ret; ret = check_cancel_events(inp); if (ret != PFMLIB_SUCCESS) return ret; /* * at this point, we know that: * - we have at most 1 L1D set * - we have at most 2 L2D sets * - cancel events are compatible */ DPRINT("l1d_set=%u l2d_set1_mask=0x%lx l2d_set2_mask=0x%lx\n", l1d_set, l2d_set1_mask, l2d_set2_mask); /* * first, place L1D cache event in PMC5 * * this is the strongest constraint */ pfm_get_impl_counters(&impl_cntrs); pfm_regmask_andnot(&avail_cntrs, &impl_cntrs, &inp->pfp_unavail_pmcs); not_assigned_events = 0; DPRINT("avail_cntrs=0x%lx\n", avail_cntrs.bits[0]); /* * we do not check ALL_THRD here because at least * one event has to be in PMC5 for this group */ if (l1d_set != UNEXISTING_SET) { if (!pfm_regmask_isset(&avail_cntrs, 5)) return PFMLIB_ERR_NOASSIGN; assign[l1d_set] = 5; pfm_regmask_clr(&avail_cntrs, 5); } l2d_set1_p = l2d_set2_p = 0; /* * assign L2D set1 and set2 counters */ for 
(i=0; i < cnt ; i++) { evt_mask = 1UL << i; /* * place l2d set1 events. First 3 go to designated * counters, the rest is placed elsewhere in the final * pass */ if (l2d_set1_p < 3 && (l2d_set1_mask & evt_mask)) { assign[i] = l2d_set1_cnts[l2d_set1_p]; if (!pfm_regmask_isset(&avail_cntrs, assign[i])) return PFMLIB_ERR_NOASSIGN; pfm_regmask_clr(&avail_cntrs, assign[i]); l2d_set1_p++; continue; } /* * same as above but for l2d set2 */ if (l2d_set2_p < 3 && (l2d_set2_mask & evt_mask)) { assign[i] = l2d_set2_cnts[l2d_set2_p]; if (!pfm_regmask_isset(&avail_cntrs, assign[i])) return PFMLIB_ERR_NOASSIGN; pfm_regmask_clr(&avail_cntrs, assign[i]); l2d_set2_p++; continue; } /* * if not l2d nor l1d, then defer placement until final pass */ if (i != l1d_set) not_assigned_events |= evt_mask; DPRINT("phase 1: i=%u avail_cntrs=0x%lx l2d_set1_p=%d l2d_set2_p=%d not_assigned=0x%lx\n", i, avail_cntrs.bits[0], l2d_set1_p, l2d_set2_p, not_assigned_events); } /* * assign BUS_* ER_* events (work only in PMC4-PMC9) */ evt_mask = not_assigned_events; for (i=0; evt_mask ; i++, evt_mask >>=1) { if ((evt_mask & 0x1) == 0) continue; cnt_mask = montecito_pe[e[i].event].pme_counters; /* * only interested in events with restricted set of counters */ if (cnt_mask == 0xfff0) continue; for(j=0; cnt_mask; j++, cnt_mask >>=1) { if ((cnt_mask & 0x1) == 0) continue; DPRINT("phase 2: i=%d j=%d cnt_mask=0x%lx avail_cntrs=0x%lx not_assigned_evnts=0x%lx\n", i, j, cnt_mask, avail_cntrs.bits[0], not_assigned_events); if (!pfm_regmask_isset(&avail_cntrs, j)) continue; assign[i] = j; not_assigned_events &= ~(1UL << i); pfm_regmask_clr(&avail_cntrs, j); break; } if (cnt_mask == 0) return PFMLIB_ERR_NOASSIGN; } /* * assign the rest of the events (no constraints) */ evt_mask = not_assigned_events; max_cnt = PMU_MONT_FIRST_COUNTER + PMU_MONT_NUM_COUNTERS; for (i=0, j=0; evt_mask ; i++, evt_mask >>=1) { DPRINT("phase 3a: i=%d j=%d evt_mask=0x%lx avail_cntrs=0x%lx not_assigned_evnts=0x%lx\n", i, j, evt_mask, 
avail_cntrs.bits[0], not_assigned_events); if ((evt_mask & 0x1) == 0) continue; while(j < max_cnt && !pfm_regmask_isset(&avail_cntrs, j)) { DPRINT("phase 3: i=%d j=%d evt_mask=0x%lx avail_cntrs=0x%lx not_assigned_evnts=0x%lx\n", i, j, evt_mask, avail_cntrs.bits[0], not_assigned_events); j++; } if (j == max_cnt) return PFMLIB_ERR_NOASSIGN; assign[i] = j; j++; } for (j=0; j < cnt ; j++ ) { mesi = 0; /* * XXX: we do not support .all placement just yet */ if (param && param->pfp_mont_counters[j].flags & PFMLIB_MONT_FL_EVT_ALL_THRD) { DPRINT(".all mode is not yet supported by libpfm\n"); return PFMLIB_ERR_NOTSUPP; } if (has_mesi(e[j].event)) { for(k=0;k< e[j].num_masks; k++) { mesi |= 1UL << e[j].unit_masks[k]; } /* by default we capture everything */ if (mesi == 0) mesi = 0xf; } reg.pmc_val = 0; /* clear all, bits 26-27 must be zero for proper operations */ /* if plm is 0, then assume not specified per-event and use default */ reg.pmc_plm = inp->pfp_events[j].plm ? inp->pfp_events[j].plm : inp->pfp_dfl_plm; reg.pmc_oi = 0; /* let the user/OS deal with this field */ reg.pmc_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc_thres = param ? param->pfp_mont_counters[j].thres: 0; reg.pmc_ism = 0x2; /* force IA-64 mode */ reg.pmc_umask = is_ear(e[j].event) ? 0x0 : montecito_pe[e[j].event].pme_umask; reg.pmc_es = montecito_pe[e[j].event].pme_code; reg.pmc_all = 0; /* XXX force self for now */ reg.pmc_m = (mesi>>3) & 0x1; reg.pmc_e = (mesi>>2) & 0x1; reg.pmc_s = (mesi>>1) & 0x1; reg.pmc_i = mesi & 0x1; /* * Note that we don't force PMC4.pmc_ena = 1 because the kernel takes care of this for us. 
* This way we don't have to program something in PMC4 even when we don't use it */ pc[j].reg_num = assign[j]; pc[j].reg_value = reg.pmc_val; pc[j].reg_addr = pc[j].reg_alt_addr = assign[j]; pd[j].reg_num = assign[j]; pd[j].reg_addr = pd[j].reg_alt_addr = assign[j]; __pfm_vbprintf("[PMC%u(pmc%u)=0x%06lx m=%d e=%d s=%d i=%d thres=%d all=%d es=0x%02x plm=%d umask=0x%x pm=%d ism=0x%x oi=%d] %s\n", assign[j], assign[j], reg.pmc_val, reg.pmc_m, reg.pmc_e, reg.pmc_s, reg.pmc_i, reg.pmc_thres, reg.pmc_all, reg.pmc_es,reg.pmc_plm, reg.pmc_umask, reg.pmc_pm, reg.pmc_ism, reg.pmc_oi, montecito_pe[e[j].event].pme_name); __pfm_vbprintf("[PMD%u(pmd%u)]\n", pd[j].reg_num, pd[j].reg_num); } /* number of PMC registers programmed */ outp->pfp_pmc_count = cnt; outp->pfp_pmd_count = cnt; return PFMLIB_SUCCESS; } static int pfm_dispatch_iear(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfm_mont_pmc_reg_t reg; pfmlib_mont_input_param_t *param = mod_in; pfmlib_reg_t *pc, *pd; pfmlib_mont_input_param_t fake_param; unsigned int pos1, pos2; unsigned int i, count; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; pos1 = outp->pfp_pmc_count; pos2 = outp->pfp_pmd_count; count = inp->pfp_event_count; for (i=0; i < count; i++) { if (is_iear(inp->pfp_events[i].event)) break; } if (param == NULL || param->pfp_mont_iear.ear_used == 0) { /* * case 3: no I-EAR event, no (or nothing) in param->pfp_mont_iear.ear_used */ if (i == count) return PFMLIB_SUCCESS; memset(&fake_param, 0, sizeof(fake_param)); param = &fake_param; /* * case 1: extract all information for event (name) */ pfm_mont_get_ear_mode(inp->pfp_events[i].event, ¶m->pfp_mont_iear.ear_mode); param->pfp_mont_iear.ear_umask = evt_umask(inp->pfp_events[i].event); DPRINT("I-EAR event with no info\n"); } /* * case 2: ear_used=1, event is defined, we use the param info as it is more precise * case 4: ear_used=1, no event (free running I-EAR), use param info */ reg.pmc_val = 0; if 
(param->pfp_mont_iear.ear_mode == PFMLIB_MONT_EAR_TLB_MODE) {
		/* if plm is 0, then assume not specified per-event and use default */
		reg.pmc37_mont_tlb_reg.iear_plm   = param->pfp_mont_iear.ear_plm ? param->pfp_mont_iear.ear_plm : inp->pfp_dfl_plm;
		reg.pmc37_mont_tlb_reg.iear_pm    = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0;
		reg.pmc37_mont_tlb_reg.iear_ct    = 0x0;
		reg.pmc37_mont_tlb_reg.iear_umask = param->pfp_mont_iear.ear_umask;
	} else if (param->pfp_mont_iear.ear_mode == PFMLIB_MONT_EAR_CACHE_MODE) {
		/* if plm is 0, then assume not specified per-event and use default */
		reg.pmc37_mont_cache_reg.iear_plm   = param->pfp_mont_iear.ear_plm ? param->pfp_mont_iear.ear_plm : inp->pfp_dfl_plm;
		reg.pmc37_mont_cache_reg.iear_pm    = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0;
		reg.pmc37_mont_cache_reg.iear_ct    = 0x1;
		reg.pmc37_mont_cache_reg.iear_umask = param->pfp_mont_iear.ear_umask;
	} else {
		DPRINT("ALAT mode not supported in I-EAR mode\n");
		return PFMLIB_ERR_INVAL;
	}

	if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 37))
		return PFMLIB_ERR_NOASSIGN;

	pc[pos1].reg_num   = 37; /* PMC37 is I-EAR config register */
	pc[pos1].reg_value = reg.pmc_val;
	pc[pos1].reg_addr  = pc[pos1].reg_alt_addr = 37;
	pos1++;

	pd[pos2].reg_num  = 34;
	pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 34;
	pos2++;

	pd[pos2].reg_num  = 35;
	pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 35;
	pos2++;

	if (param->pfp_mont_iear.ear_mode == PFMLIB_MONT_EAR_TLB_MODE) {
		__pfm_vbprintf("[PMC37(pmc37)=0x%lx ctb=tlb plm=%d pm=%d umask=0x%x]\n",
			reg.pmc_val,
			reg.pmc37_mont_tlb_reg.iear_plm,
			reg.pmc37_mont_tlb_reg.iear_pm,
			reg.pmc37_mont_tlb_reg.iear_umask);
	} else {
		__pfm_vbprintf("[PMC37(pmc37)=0x%lx ctb=cache plm=%d pm=%d umask=0x%x]\n",
			reg.pmc_val,
			reg.pmc37_mont_cache_reg.iear_plm,
			reg.pmc37_mont_cache_reg.iear_pm,
			reg.pmc37_mont_cache_reg.iear_umask);
	}

	__pfm_vbprintf("[PMD34(pmd34)]\n[PMD35(pmd35)]\n");

	/* update final number of entries used */
	outp->pfp_pmc_count = pos1;
	outp->pfp_pmd_count = pos2;

	return PFMLIB_SUCCESS;
}

static
int
pfm_dispatch_dear(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in, pfmlib_output_param_t *outp)
{
	pfm_mont_pmc_reg_t reg;
	pfmlib_mont_input_param_t *param = mod_in;
	pfmlib_reg_t *pc, *pd;
	pfmlib_mont_input_param_t fake_param;
	unsigned int pos1, pos2;
	unsigned int i, count;

	pc = outp->pfp_pmcs;
	pd = outp->pfp_pmds;
	pos1 = outp->pfp_pmc_count;
	pos2 = outp->pfp_pmd_count;

	count = inp->pfp_event_count;
	for (i=0; i < count; i++) {
		if (is_dear(inp->pfp_events[i].event)) break;
	}

	if (param == NULL || param->pfp_mont_dear.ear_used == 0) {
		/*
		 * case 3: no D-EAR event, no (or nothing) in param->pfp_mont_dear.ear_used
		 */
		if (i == count) return PFMLIB_SUCCESS;

		memset(&fake_param, 0, sizeof(fake_param));
		param = &fake_param;

		/*
		 * case 1: extract all information for event (name)
		 */
		pfm_mont_get_ear_mode(inp->pfp_events[i].event, &param->pfp_mont_dear.ear_mode);
		param->pfp_mont_dear.ear_umask = evt_umask(inp->pfp_events[i].event);

		DPRINT("D-EAR event with no info\n");
	}

	/* sanity check on the mode */
	if (   param->pfp_mont_dear.ear_mode != PFMLIB_MONT_EAR_CACHE_MODE
	    && param->pfp_mont_dear.ear_mode != PFMLIB_MONT_EAR_TLB_MODE
	    && param->pfp_mont_dear.ear_mode != PFMLIB_MONT_EAR_ALAT_MODE)
		return PFMLIB_ERR_INVAL;

	/*
	 * case 2: ear_used=1, event is defined, we use the param info as it is more precise
	 * case 4: ear_used=1, no event (free running D-EAR), use param info
	 */
	reg.pmc_val = 0;

	/* if plm is 0, then assume not specified per-event and use default */
	reg.pmc40_mont_reg.dear_plm = param->pfp_mont_dear.ear_plm ? param->pfp_mont_dear.ear_plm : inp->pfp_dfl_plm;
	reg.pmc40_mont_reg.dear_pm  = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ?
1 : 0;
	reg.pmc40_mont_reg.dear_mode  = param->pfp_mont_dear.ear_mode;
	reg.pmc40_mont_reg.dear_umask = param->pfp_mont_dear.ear_umask;
	reg.pmc40_mont_reg.dear_ism   = 0x2; /* force IA-64 mode */

	if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 40))
		return PFMLIB_ERR_NOASSIGN;

	pc[pos1].reg_num   = 40; /* PMC40 is D-EAR config register */
	pc[pos1].reg_value = reg.pmc_val;
	pc[pos1].reg_addr  = pc[pos1].reg_alt_addr = 40;
	pos1++;

	pd[pos2].reg_num  = 32;
	pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 32;
	pos2++;

	pd[pos2].reg_num  = 33;
	pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 33;
	pos2++;

	pd[pos2].reg_num  = 36;
	pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 36;
	pos2++;

	__pfm_vbprintf("[PMC40(pmc40)=0x%lx mode=%s plm=%d pm=%d ism=0x%x umask=0x%x]\n",
		reg.pmc_val,
		reg.pmc40_mont_reg.dear_mode == 0 ? "L1D" : (reg.pmc40_mont_reg.dear_mode == 1 ? "L1DTLB" : "ALAT"),
		reg.pmc40_mont_reg.dear_plm,
		reg.pmc40_mont_reg.dear_pm,
		reg.pmc40_mont_reg.dear_ism,
		reg.pmc40_mont_reg.dear_umask);

	__pfm_vbprintf("[PMD32(pmd32)]\n[PMD33(pmd33)]\n[PMD36(pmd36)]\n");

	/* update final number of entries used */
	outp->pfp_pmc_count = pos1;
	outp->pfp_pmd_count = pos2;

	return PFMLIB_SUCCESS;
}

static int
pfm_dispatch_opcm(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in, pfmlib_output_param_t *outp, pfmlib_mont_output_param_t *mod_out)
{
	pfmlib_mont_input_param_t *param = mod_in;
	pfmlib_reg_t *pc = outp->pfp_pmcs;
	pfm_mont_pmc_reg_t reg1, reg2, pmc36;
	unsigned int i, has_1st_pair, has_2nd_pair, count;
	unsigned int pos = outp->pfp_pmc_count;
	int used_pmc32, used_pmc34;

	if (param == NULL) return PFMLIB_SUCCESS;

#define PMC36_DFL_VAL 0xfffffff0

	/*
	 * mandatory default value for PMC36 as described in the documentation:
	 * all monitoring is opcode constrained. Better make sure the match/mask
	 * is set to match everything! It looks weird as a default value!
 */
	pmc36.pmc_val = PMC36_DFL_VAL;
	reg1.pmc_val  = 0x030f01ffffffffff;
	reg2.pmc_val  = 0;

	used_pmc32 = param->pfp_mont_opcm1.opcm_used;
	used_pmc34 = param->pfp_mont_opcm2.opcm_used;

	/*
	 * check if any feature is used.
	 * PMC36 must be setup when opcode matching is used OR when code range restriction is used
	 */
	if (used_pmc32 == 0 && used_pmc34 == 0 && param->pfp_mont_irange.rr_used == 0)
		return 0;

	/*
	 * check for rr_nbr_used to make sure that the range request produced something on output
	 */
	if (used_pmc32 || (param->pfp_mont_irange.rr_used && mod_out->pfp_mont_irange.rr_nbr_used) ) {
		/*
		 * if not used, ignore all bits
		 */
		if (used_pmc32) {
			reg1.pmc32_34_mont_reg.opcm_mask = param->pfp_mont_opcm1.opcm_mask;
			reg1.pmc32_34_mont_reg.opcm_b    = param->pfp_mont_opcm1.opcm_b;
			reg1.pmc32_34_mont_reg.opcm_f    = param->pfp_mont_opcm1.opcm_f;
			reg1.pmc32_34_mont_reg.opcm_i    = param->pfp_mont_opcm1.opcm_i;
			reg1.pmc32_34_mont_reg.opcm_m    = param->pfp_mont_opcm1.opcm_m;

			reg2.pmc33_35_mont_reg.opcm_match = param->pfp_mont_opcm1.opcm_match;
		}

		if (param->pfp_mont_irange.rr_used) {
			reg1.pmc32_34_mont_reg.opcm_ig_ad = 0;
			reg1.pmc32_34_mont_reg.opcm_inv   = param->pfp_mont_irange.rr_flags & PFMLIB_MONT_RR_INV ?
1 : 0;
		} else {
			/* clear range restriction fields when none is used */
			reg1.pmc32_34_mont_reg.opcm_ig_ad = 1;
			reg1.pmc32_34_mont_reg.opcm_inv   = 0;
		}

		if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 32))
			return PFMLIB_ERR_NOASSIGN;

		pc[pos].reg_num   = 32;
		pc[pos].reg_value = reg1.pmc_val;
		pc[pos].reg_addr  = pc[pos].reg_alt_addr = 32;
		pos++;

		/*
		 * will be constrained by PMC32
		 */
		if (used_pmc32) {
			if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 33))
				return PFMLIB_ERR_NOASSIGN;
			/*
			 * use pmc33 only when we have active opcode matching
			 */
			pc[pos].reg_num   = 33;
			pc[pos].reg_value = reg2.pmc_val;
			pc[pos].reg_addr  = pc[pos].reg_alt_addr = 33;
			pos++;

			has_1st_pair = has_2nd_pair = 0;
			count = inp->pfp_event_count;
			for(i=0; i < count; i++) {
				if (inp->pfp_events[i].event == PME_MONT_IA64_TAGGED_INST_RETIRED_IBRP0_PMC32_33) has_1st_pair=1;
				if (inp->pfp_events[i].event == PME_MONT_IA64_TAGGED_INST_RETIRED_IBRP2_PMC32_33) has_2nd_pair=1;
			}
			if (has_1st_pair || has_2nd_pair == 0) pmc36.pmc36_mont_reg.opcm_ch0_ig_opcm = 0;
			if (has_2nd_pair || has_1st_pair == 0) pmc36.pmc36_mont_reg.opcm_ch2_ig_opcm = 0;
		}

		__pfm_vbprintf("[PMC32(pmc32)=0x%lx m=%d i=%d f=%d b=%d mask=0x%lx inv=%d ig_ad=%d]\n",
			reg1.pmc_val,
			reg1.pmc32_34_mont_reg.opcm_m,
			reg1.pmc32_34_mont_reg.opcm_i,
			reg1.pmc32_34_mont_reg.opcm_f,
			reg1.pmc32_34_mont_reg.opcm_b,
			reg1.pmc32_34_mont_reg.opcm_mask,
			reg1.pmc32_34_mont_reg.opcm_inv,
			reg1.pmc32_34_mont_reg.opcm_ig_ad);

		if (used_pmc32)
			__pfm_vbprintf("[PMC33(pmc33)=0x%lx match=0x%lx]\n",
				reg2.pmc_val,
				reg2.pmc33_35_mont_reg.opcm_match);
	}

	/*
	 * will be constrained by PMC34
	 */
	if (used_pmc34) {
		reg1.pmc_val = 0x01ffffffffff; /* pmc34 default value */
		reg2.pmc_val = 0;

		reg1.pmc32_34_mont_reg.opcm_mask = param->pfp_mont_opcm2.opcm_mask;
		reg1.pmc32_34_mont_reg.opcm_b    = param->pfp_mont_opcm2.opcm_b;
		reg1.pmc32_34_mont_reg.opcm_f    = param->pfp_mont_opcm2.opcm_f;
		reg1.pmc32_34_mont_reg.opcm_i    = param->pfp_mont_opcm2.opcm_i;
		reg1.pmc32_34_mont_reg.opcm_m    = param->pfp_mont_opcm2.opcm_m;
reg2.pmc33_35_mont_reg.opcm_match = param->pfp_mont_opcm2.opcm_match; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 34)) return PFMLIB_ERR_NOASSIGN; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 35)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 34; pc[pos].reg_value = reg1.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 34; pos++; pc[pos].reg_num = 35; pc[pos].reg_value = reg2.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 35; pos++; has_1st_pair = has_2nd_pair = 0; count = inp->pfp_event_count; for(i=0; i < count; i++) { if (inp->pfp_events[i].event == PME_MONT_IA64_TAGGED_INST_RETIRED_IBRP1_PMC34_35) has_1st_pair=1; if (inp->pfp_events[i].event == PME_MONT_IA64_TAGGED_INST_RETIRED_IBRP3_PMC34_35) has_2nd_pair=1; } if (has_1st_pair || has_2nd_pair == 0) pmc36.pmc36_mont_reg.opcm_ch1_ig_opcm = 0; if (has_2nd_pair || has_1st_pair == 0) pmc36.pmc36_mont_reg.opcm_ch3_ig_opcm = 0; __pfm_vbprintf("[PMC34(pmc34)=0x%lx m=%d i=%d f=%d b=%d mask=0x%lx]\n", reg1.pmc_val, reg1.pmc32_34_mont_reg.opcm_m, reg1.pmc32_34_mont_reg.opcm_i, reg1.pmc32_34_mont_reg.opcm_f, reg1.pmc32_34_mont_reg.opcm_b, reg1.pmc32_34_mont_reg.opcm_mask); __pfm_vbprintf("[PMC35(pmc35)=0x%lx match=0x%lx]\n", reg2.pmc_val, reg2.pmc33_35_mont_reg.opcm_match); } if (pmc36.pmc_val != PMC36_DFL_VAL) { if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 36)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 36; pc[pos].reg_value = pmc36.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 36; pos++; __pfm_vbprintf("[PMC36(pmc36)=0x%lx ch0_ig_op=%d ch1_ig_op=%d ch2_ig_op=%d ch3_ig_op=%d]\n", pmc36.pmc_val, pmc36.pmc36_mont_reg.opcm_ch0_ig_opcm, pmc36.pmc36_mont_reg.opcm_ch1_ig_opcm, pmc36.pmc36_mont_reg.opcm_ch2_ig_opcm, pmc36.pmc36_mont_reg.opcm_ch3_ig_opcm); } outp->pfp_pmc_count = pos; return PFMLIB_SUCCESS; } static int pfm_dispatch_etb(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfmlib_event_t *e= inp->pfp_events; pfm_mont_pmc_reg_t reg; 
pfmlib_mont_input_param_t *param = mod_in;
	pfmlib_reg_t *pc, *pd;
	pfmlib_mont_input_param_t fake_param;
	int found_etb = 0, found_bad_dear = 0;
	int has_etb_param;
	unsigned int i, pos1, pos2;
	unsigned int count;

	pc = outp->pfp_pmcs;
	pd = outp->pfp_pmds;
	pos1 = outp->pfp_pmc_count;
	pos2 = outp->pfp_pmd_count;

	/*
	 * explicit ETB settings
	 */
	has_etb_param = param && param->pfp_mont_etb.etb_used;

	reg.pmc_val = 0UL;

	/*
	 * we need to scan all events looking for DEAR ALAT/TLB due to incompatibility.
	 * In this case PMC39 must be forced to zero
	 */
	count = inp->pfp_event_count;
	for (i=0; i < count; i++) {
		if (is_etb(e[i].event)) found_etb = 1;
		/*
		 * keep track of the first ETB event
		 */
		/* look for D-EAR TLB or ALAT */
		if (is_dear(e[i].event) && (is_ear_tlb(e[i].event) || is_ear_alat(e[i].event))) {
			found_bad_dear = 1;
		}
	}

	DPRINT("found_etb=%d found_bad_dear=%d\n", found_etb, found_bad_dear);

	/*
	 * did not find D-EAR TLB/ALAT event, need to check param structure
	 */
	if (found_bad_dear == 0 && param && param->pfp_mont_dear.ear_used == 1) {
		if (   param->pfp_mont_dear.ear_mode == PFMLIB_MONT_EAR_TLB_MODE
		    || param->pfp_mont_dear.ear_mode == PFMLIB_MONT_EAR_ALAT_MODE)
			found_bad_dear = 1;
	}

	/*
	 * no explicit ETB event and no special case to deal with (cover part of case 3)
	 */
	if (found_etb == 0 && has_etb_param == 0 && found_bad_dear == 0)
		return PFMLIB_SUCCESS;

	if (has_etb_param == 0) {
		/*
		 * case 3: no ETB event, etb_used=0 but found_bad_dear=1, need to cleanup PMC39
		 */
		if (found_etb == 0) goto assign_zero;
		/*
		 * case 1: we have an ETB event but no param, default setting is to capture
		 * all branches.
 */
		memset(&fake_param, 0, sizeof(fake_param));
		param = &fake_param;

		param->pfp_mont_etb.etb_tm  = 0x3; /* all branches */
		param->pfp_mont_etb.etb_ptm = 0x3; /* all branches */
		param->pfp_mont_etb.etb_ppm = 0x3; /* all branches */
		param->pfp_mont_etb.etb_brt = 0x0; /* all branches */

		DPRINT("ETB event with no info\n");
	}

	/*
	 * case 2: ETB event in the list, param provided
	 * case 4: no ETB event, param provided (free running mode)
	 */
	reg.pmc39_mont_reg.etbc_plm = param->pfp_mont_etb.etb_plm ? param->pfp_mont_etb.etb_plm : inp->pfp_dfl_plm;
	reg.pmc39_mont_reg.etbc_pm  = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0;
	reg.pmc39_mont_reg.etbc_ds  = 0; /* 1 is reserved */
	reg.pmc39_mont_reg.etbc_tm  = param->pfp_mont_etb.etb_tm & 0x3;
	reg.pmc39_mont_reg.etbc_ptm = param->pfp_mont_etb.etb_ptm & 0x3;
	reg.pmc39_mont_reg.etbc_ppm = param->pfp_mont_etb.etb_ppm & 0x3;
	reg.pmc39_mont_reg.etbc_brt = param->pfp_mont_etb.etb_brt & 0x3;

	/*
	 * if D-EAR ALAT or D-EAR TLB is set then PMC39 must be set to zero (see documentation p. 87)
	 *
	 * D-EAR ALAT/TLB and ETB cannot be used at the same time.
	 * From documentation: PMC39 must be zero in this mode; otherwise the wrong IP is recorded
	 * for misses coming right after a mispredicted branch.
	 *
	 * D-EAR cache is fine.
	 */
assign_zero:
	if (found_bad_dear && reg.pmc_val != 0UL)
		return PFMLIB_ERR_EVTINCOMP;

	if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 39))
		return PFMLIB_ERR_NOASSIGN;

	pc[pos1].reg_num   = 39;
	pc[pos1].reg_value = reg.pmc_val;
	pc[pos1].reg_addr  = pc[pos1].reg_alt_addr = 39;
	pos1++;

	__pfm_vbprintf("[PMC39(pmc39)=0x%lx plm=%d pm=%d ds=%d tm=%d ptm=%d ppm=%d brt=%d]\n",
		reg.pmc_val,
		reg.pmc39_mont_reg.etbc_plm,
		reg.pmc39_mont_reg.etbc_pm,
		reg.pmc39_mont_reg.etbc_ds,
		reg.pmc39_mont_reg.etbc_tm,
		reg.pmc39_mont_reg.etbc_ptm,
		reg.pmc39_mont_reg.etbc_ppm,
		reg.pmc39_mont_reg.etbc_brt);

	/*
	 * only add ETB PMDs when actually using BTB.
 * Not needed when dealing with D-EAR TLB and DEAR-ALAT
 * PMC39 restriction
 */
	if (found_etb || has_etb_param) {
		pd[pos2].reg_num  = 38;
		pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 38;
		pos2++;

		pd[pos2].reg_num  = 39;
		pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 39;
		pos2++;

		__pfm_vbprintf("[PMD38(pmd38)]\n[PMD39(pmd39)]\n");

		for(i=48; i < 64; i++, pos2++) {
			pd[pos2].reg_num  = i;
			pd[pos2].reg_addr = pd[pos2].reg_alt_addr = i;
			__pfm_vbprintf("[PMD%u(pmd%u)]\n", pd[pos2].reg_num, pd[pos2].reg_num);
		}
	}

	/* update final number of entries used */
	outp->pfp_pmc_count = pos1;
	outp->pfp_pmd_count = pos2;

	return PFMLIB_SUCCESS;
}

static void
do_normal_rr(unsigned long start, unsigned long end, pfmlib_reg_t *br, int nbr, int dir, int *idx, int *reg_idx, int plm)
{
	unsigned long size, l_addr, c;
	unsigned long l_offs = 0, r_offs = 0;
	unsigned long l_size, r_size;
	dbreg_t db;
	int p2;

	if (nbr < 1 || end <= start) return;

	size = end - start;

	DPRINT("start=0x%016lx end=0x%016lx size=0x%lx bytes (%lu bundles) nbr=%d dir=%d\n", start, end, size, size >> 4, nbr, dir);

	p2 = pfm_ia64_fls(size);

	c = ALIGN_DOWN(end, p2);

	DPRINT("largest power of two possible: 2^%d=0x%lx, crossing=0x%016lx\n", p2, 1UL << p2, c);

	if ((c - (1UL << p2)) >= start) {
		l_addr = c - (1UL << p2);
	} else {
		p2--;
		if ((c + (1UL << p2)) <= end) {
			l_addr = c;
		} else {
			l_addr = c - (1UL << p2);
		}
	}
	l_size = l_addr - start;
	r_size = end - l_addr - (1UL << p2);

	if (dir == 0 && l_size != 0 && nbr == 1) {
		p2++;
		l_addr = end - (1UL << p2);
		if (PFMLIB_DEBUG()) {
			l_offs = start - l_addr;
			printf(">>l_offs: 0x%lx\n", l_offs);
		}
	} else if (dir == 1 && r_size != 0 && nbr == 1) {
		p2++;
		l_addr = start;
		if (PFMLIB_DEBUG()) {
			r_offs = l_addr + (1UL << p2) - end;
			printf(">>r_offs: 0x%lx\n", r_offs);
		}
	}
	l_size = l_addr - start;
	r_size = end - l_addr - (1UL << p2);

	if (PFMLIB_DEBUG()) {
		printf(">>largest chunk: 2^%d @0x%016lx-0x%016lx\n", p2, l_addr, l_addr + (1UL << p2));
		if (l_size && !l_offs) printf(">>before: 0x%016lx-0x%016lx\n", start, l_addr);
		if (r_size && !r_offs) printf(">>after : 0x%016lx-0x%016lx\n", l_addr + (1UL << p2), end);
	}

	/*
	 * program the debug register pair covering the chunk we just carved out
	 */
	db.val        = 0;
	db.db.db_mask = ~((1UL << p2)-1);
	db.db.db_plm  = plm;

	br[*idx].reg_num   = *reg_idx;
	br[*idx].reg_value = l_addr;
	br[*idx].reg_addr  = br[*idx].reg_alt_addr = 1 + *reg_idx;

	br[*idx+1].reg_num   = *reg_idx + 1;
	br[*idx+1].reg_value = db.val;
	br[*idx+1].reg_addr  = br[*idx+1].reg_alt_addr = 1 + *reg_idx + 1;

	*idx     += 2;
	*reg_idx += 2;

	nbr--;
	if (nbr) {
		int r_nbr, l_nbr;

		r_nbr = l_nbr = nbr >>1;

		if (nbr & 0x1) {
			/*
			 * our simple heuristic is:
			 * we assign the largest number of registers to the largest
			 * of the two chunks
			 */
			if (l_size > r_size) {
				l_nbr++;
			} else {
				r_nbr++;
			}
		}
		do_normal_rr(start, l_addr, br, l_nbr, 0, idx, reg_idx, plm);
		do_normal_rr(l_addr + (1UL << p2), end, br, r_nbr, 1, idx, reg_idx, plm);
	}
}

static void
print_one_range(pfmlib_mont_input_rr_desc_t *in_rr, pfmlib_mont_output_rr_desc_t *out_rr, pfmlib_reg_t *dbr, int base_idx, int n_pairs, int fine_mode, unsigned int rr_flags)
{
	int j;
	dbreg_t d;
	unsigned long r_end;

	__pfm_vbprintf("[0x%lx-0x%lx): %d pair(s)%s%s\n", in_rr->rr_start, in_rr->rr_end,
		n_pairs,
		fine_mode ? ", fine_mode" : "",
		rr_flags & PFMLIB_MONT_RR_INV ? ", inversed" : "");

	__pfm_vbprintf("start offset: -0x%lx end_offset: +0x%lx\n", out_rr->rr_soff, out_rr->rr_eoff);

	for (j=0; j < n_pairs; j++, base_idx+=2) {
		d.val = dbr[base_idx+1].reg_value;
		r_end = dbr[base_idx].reg_value+((~(d.db.db_mask)) & ~(0xffUL << 56));

		if (fine_mode)
			__pfm_vbprintf("brp%u: db%u: 0x%016lx db%u: plm=0x%x mask=0x%016lx\n",
				dbr[base_idx].reg_num>>1,
				dbr[base_idx].reg_num,
				dbr[base_idx].reg_value,
				dbr[base_idx+1].reg_num,
				d.db.db_plm, d.db.db_mask);
		else
			__pfm_vbprintf("brp%u: db%u: 0x%016lx db%u: plm=0x%x mask=0x%016lx end=0x%016lx\n",
				dbr[base_idx].reg_num>>1,
				dbr[base_idx].reg_num,
				dbr[base_idx].reg_value,
				dbr[base_idx+1].reg_num,
				d.db.db_plm, d.db.db_mask,
				r_end);
	}
}

/*
 * base_idx = base register index to use (for IBRP1, base_idx = 2)
 */
static int
compute_fine_rr(pfmlib_mont_input_rr_t *irr, int dfl_plm, int n, int *base_idx, pfmlib_mont_output_rr_t *orr)
{
	int i;
	pfmlib_reg_t *br;
	pfmlib_mont_input_rr_desc_t *in_rr;
	pfmlib_mont_output_rr_desc_t *out_rr;
	unsigned long addr;
	int reg_idx;
	dbreg_t db;

	in_rr = irr->rr_limits;
	out_rr = orr->rr_infos;
	br = orr->rr_br+orr->rr_nbr_used;
	reg_idx = *base_idx;

	db.val = 0;
	db.db.db_mask = FINE_MODE_MASK;

	if (n > 2) return PFMLIB_ERR_IRRTOOMANY;

	for (i=0; i < n; i++, reg_idx += 2, in_rr++, br+= 4) {
		/*
		 * setup lower limit pair
		 *
		 * because the PMU can only see addresses on a 2-bundle boundary, we must align
		 * down to the closest bundle-pair aligned address. 5 => 32-byte aligned address
		 */
		addr = ALIGN_DOWN(in_rr->rr_start, 5);
		out_rr->rr_soff = in_rr->rr_start - addr;

		/*
		 * adjust plm for each range
		 */
		db.db.db_plm = in_rr->rr_plm ?
in_rr->rr_plm : (unsigned long)dfl_plm;

		br[0].reg_num   = reg_idx;
		br[0].reg_value = addr;
		br[0].reg_addr  = br[0].reg_alt_addr = 1+reg_idx;

		br[1].reg_num   = reg_idx+1;
		br[1].reg_value = db.val;
		br[1].reg_addr  = br[1].reg_alt_addr = 1+reg_idx+1;

		/*
		 * setup upper limit pair
		 *
		 * In fine mode, the bundle address stored in the upper limit debug
		 * registers is included in the count, so we subtract 0x10 to exclude it.
		 *
		 * because of the PMU bug, we align the (corrected) end to the nearest
		 * 32-byte aligned address + 0x10. Depending on the correction, we may
		 * count one extra bundle pair.
		 */
		addr = in_rr->rr_end - 0x10;

		if ((addr & 0x1f) == 0) addr += 0x10;

		out_rr->rr_eoff = addr - in_rr->rr_end + 0x10;

		br[2].reg_num   = reg_idx+4;
		br[2].reg_value = addr;
		br[2].reg_addr  = br[2].reg_alt_addr = 1+reg_idx+4;

		br[3].reg_num   = reg_idx+5;
		br[3].reg_value = db.val;
		br[3].reg_addr  = br[3].reg_alt_addr = 1+reg_idx+5;

		if (PFMLIB_VERBOSE()) print_one_range(in_rr, out_rr, br, 0, 2, 1, irr->rr_flags);
	}
	orr->rr_nbr_used += i<<2;

	/* update base_idx, for subsequent calls */
	*base_idx = reg_idx;

	return PFMLIB_SUCCESS;
}

/*
 * base_idx = base register index to use (for IBRP1, base_idx = 2)
 */
static int
compute_single_rr(pfmlib_mont_input_rr_t *irr, int dfl_plm, int *base_idx, pfmlib_mont_output_rr_t *orr)
{
	unsigned long size, end, start;
	unsigned long p_start, p_end;
	pfmlib_mont_input_rr_desc_t *in_rr;
	pfmlib_mont_output_rr_desc_t *out_rr;
	pfmlib_reg_t *br;
	dbreg_t db;
	int reg_idx;
	int l, m;

	in_rr = irr->rr_limits;
	out_rr = orr->rr_infos;
	br = orr->rr_br+orr->rr_nbr_used;
	start = in_rr->rr_start;
	end = in_rr->rr_end;
	size = end - start;
	reg_idx = *base_idx;

	l = pfm_ia64_fls(size);

	m = l;
	if (size & ((1UL << l)-1)) {
		if (l>62) {
			printf("range: [0x%lx-0x%lx] too big\n", start, end);
			return PFMLIB_ERR_IRRTOOBIG;
		}
		m++;
	}

	DPRINT("size=%ld, l=%d m=%d, internal: 0x%lx full: 0x%lx\n", size, l, m, 1UL << l, 1UL << m);

	for (; m < 64; m++) {
		p_start = ALIGN_DOWN(start, m);
		p_end =
p_start+(1UL << m);
		if (p_end >= end) goto found;
	}
	return PFMLIB_ERR_IRRINVAL;
found:
	DPRINT("m=%d p_start=0x%lx p_end=0x%lx\n", m, p_start, p_end);

	/* when the event is not IA64_INST_RETIRED, then we MUST use ibrp0 */
	br[0].reg_num   = reg_idx;
	br[0].reg_value = p_start;
	br[0].reg_addr  = br[0].reg_alt_addr = 1+reg_idx;

	db.val        = 0;
	db.db.db_mask = ~((1UL << m)-1);
	db.db.db_plm  = in_rr->rr_plm ? in_rr->rr_plm : (unsigned long)dfl_plm;

	br[1].reg_num   = reg_idx + 1;
	br[1].reg_value = db.val;
	br[1].reg_addr  = br[1].reg_alt_addr = 1+reg_idx+1;

	out_rr->rr_soff = start - p_start;
	out_rr->rr_eoff = p_end - end;

	if (PFMLIB_VERBOSE()) print_one_range(in_rr, out_rr, br, 0, 1, 0, irr->rr_flags);

	orr->rr_nbr_used += 2;

	/* update base_idx, for subsequent calls */
	*base_idx = reg_idx;

	return PFMLIB_SUCCESS;
}

static int
compute_normal_rr(pfmlib_mont_input_rr_t *irr, int dfl_plm, int n, int *base_idx, pfmlib_mont_output_rr_t *orr)
{
	pfmlib_mont_input_rr_desc_t *in_rr;
	pfmlib_mont_output_rr_desc_t *out_rr;
	unsigned long r_end;
	pfmlib_reg_t *br;
	dbreg_t d;
	int i, j;
	int br_index, reg_idx, prev_index;

	in_rr = irr->rr_limits;
	out_rr = orr->rr_infos;
	br = orr->rr_br+orr->rr_nbr_used;
	reg_idx = *base_idx;
	br_index = 0;

	for (i=0; i < n; i++, in_rr++, out_rr++) {
		/*
		 * running out of registers
		 */
		if (br_index == 8) break;

		prev_index = br_index;

		do_normal_rr(in_rr->rr_start, in_rr->rr_end,
			     br,
			     4 - (reg_idx>>1), /* how many pairs available */
			     0,
			     &br_index,
			     &reg_idx,
			     in_rr->rr_plm ?
in_rr->rr_plm : dfl_plm);

		DPRINT("br_index=%d reg_idx=%d\n", br_index, reg_idx);

		/*
		 * compute offsets
		 */
		out_rr->rr_soff = out_rr->rr_eoff = 0;

		for(j=prev_index; j < br_index; j+=2) {
			d.val = br[j+1].reg_value;
			r_end = br[j].reg_value+((~(d.db.db_mask)+1) & ~(0xffUL << 56));

			if (br[j].reg_value <= in_rr->rr_start)
				out_rr->rr_soff = in_rr->rr_start - br[j].reg_value;
			if (r_end >= in_rr->rr_end)
				out_rr->rr_eoff = r_end - in_rr->rr_end;
		}

		if (PFMLIB_VERBOSE()) print_one_range(in_rr, out_rr, br, prev_index, (br_index-prev_index)>>1, 0, irr->rr_flags);
	}

	/* do not have enough registers to cover all the ranges */
	if (br_index == 8 && i < n) return PFMLIB_ERR_TOOMANY;

	orr->rr_nbr_used += br_index;

	/* update base_idx, for subsequent calls */
	*base_idx = reg_idx;

	return PFMLIB_SUCCESS;
}

static int
pfm_dispatch_irange(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in, pfmlib_output_param_t *outp, pfmlib_mont_output_param_t *mod_out)
{
	pfm_mont_pmc_reg_t reg;
	pfmlib_mont_input_param_t *param = mod_in;
	pfmlib_mont_input_rr_t *irr;
	pfmlib_mont_output_rr_t *orr;
	pfmlib_reg_t *pc = outp->pfp_pmcs;
	unsigned long retired_mask;
	unsigned int i, pos = outp->pfp_pmc_count, count;
	unsigned int retired_only, retired_count, fine_mode, prefetch_count;
	unsigned int n_intervals;
	int base_idx = 0, dup = 0;
	int ret;

	if (param == NULL) return PFMLIB_SUCCESS;

	if (param->pfp_mont_irange.rr_used == 0) return PFMLIB_SUCCESS;

	if (mod_out == NULL) return PFMLIB_ERR_INVAL;

	irr = &param->pfp_mont_irange;
	orr = &mod_out->pfp_mont_irange;

	ret = check_intervals(irr, 0, &n_intervals);
	if (ret != PFMLIB_SUCCESS) return ret;

	if (n_intervals < 1) return PFMLIB_ERR_IRRINVAL;

	retired_count = check_inst_retired_events(inp, &retired_mask);
	retired_only  = retired_count == inp->pfp_event_count;

	fine_mode = irr->rr_flags & PFMLIB_MONT_RR_NO_FINE_MODE ?
0 : check_fine_mode_possible(irr, n_intervals);

	DPRINT("n_intervals=%d retired_only=%d retired_count=%d fine_mode=%d\n", n_intervals, retired_only, retired_count, fine_mode);

	/*
	 * On montecito, there are more constraints on what can be measured with irange.
	 *
	 * - The fine mode is the best because you directly set the lower and upper limits of
	 *   the range. This uses 2 ibr pairs per range (ibrp0/ibrp2 and ibrp1/ibrp3). Therefore
	 *   at most 2 fine mode ranges can be defined. The boundaries of the range must be in the
	 *   same 64KB page. The fine mode works with all events.
	 *
	 * - if the fine mode fails, then for all events, except IA64_TAGGED_INST_RETIRED_*, only
	 *   the first pair of ibr is available: ibrp0. This imposes some severe restrictions on the
	 *   size and alignment of the range. It can be bigger than 64KB and must be properly aligned
	 *   on its size. The library relaxes these constraints by allowing the covered areas to be
	 *   larger than the expected range. It may start before and end after the requested range.
	 *   You can determine the amount of overrun in either direction for each range by looking at
	 *   the rr_soff (start offset) and rr_eoff (end offset).
	 *
	 * - if the events include certain prefetch events then only IBRP1 can be used.
	 *   See 3.3.5.2 Exception 1.
	 *
	 * - Finally, when the events are ONLY IA64_TAGGED_INST_RETIRED_* then all IBR pairs can be used
	 *   to cover the range giving us more flexibility to approximate the range when it is not
	 *   properly aligned on its size (see 10.3.5.2 Exception 2). But the corresponding
	 *   IA64_TAGGED_INST_RETIRED_* must be present.
 */
	if (fine_mode == 0 && retired_only == 0 && n_intervals > 1)
		return PFMLIB_ERR_IRRTOOMANY;

	/* we do not default to non-fine mode to support more ranges */
	if (n_intervals > 2 && fine_mode == 1)
		return PFMLIB_ERR_IRRTOOMANY;

	ret = check_prefetch_events(inp, irr, &prefetch_count, &base_idx, &dup);
	if (ret) return ret;

	DPRINT("prefetch_count=%u base_idx=%d dup=%d\n", prefetch_count, base_idx, dup);

	/*
	 * CPU_OP_CYCLES.QUAL supports code range restrictions but it returns
	 * meaningful values (fine/coarse mode) only when IBRP1 is not used.
	 */
	if ((base_idx > 0 || dup) && has_cpu_cycles_qual(inp))
		return PFMLIB_ERR_FEATCOMB;

	if (fine_mode == 0) {
		if (retired_only) {
			/* can take multiple intervals */
			ret = compute_normal_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr);
		} else {
			/* unless we have only prefetch and instruction retired events,
			 * we cannot satisfy the request because the other events cannot
			 * be measured on anything but IBRP0.
			 */
			if ((prefetch_count+retired_count) != inp->pfp_event_count)
				return PFMLIB_ERR_FEATCOMB;
			ret = compute_single_rr(irr, inp->pfp_dfl_plm, &base_idx, orr);
			if (ret == PFMLIB_SUCCESS && dup)
				ret = compute_single_rr(irr, inp->pfp_dfl_plm, &base_idx, orr);
		}
	} else {
		if (prefetch_count && n_intervals != 1)
			return PFMLIB_ERR_IRRTOOMANY;
		/* except if retired_only, can take only one interval */
		ret = compute_fine_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr);
		if (ret == PFMLIB_SUCCESS && dup)
			ret = compute_fine_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr);
	}
	if (ret != PFMLIB_SUCCESS)
		return ret == PFMLIB_ERR_TOOMANY ?
PFMLIB_ERR_IRRTOOMANY : ret;

	reg.pmc_val = 0xdb6; /* default value */

	count = orr->rr_nbr_used;
	for (i=0; i < count; i++) {
		switch(orr->rr_br[i].reg_num) {
			case 0:
				reg.pmc38_mont_reg.iarc_ig_ibrp0 = 0;
				break;
			case 2:
				reg.pmc38_mont_reg.iarc_ig_ibrp1 = 0;
				break;
			case 4:
				reg.pmc38_mont_reg.iarc_ig_ibrp2 = 0;
				break;
			case 6:
				reg.pmc38_mont_reg.iarc_ig_ibrp3 = 0;
				break;
		}
	}

	if (fine_mode) {
		reg.pmc38_mont_reg.iarc_fine = 1;
	} else if (retired_only) {
		/*
		 * we need to check that the user provided all the events needed to cover
		 * all the ibr pairs used to cover the range
		 */
		if ((retired_mask & 0x1) == 0 && reg.pmc38_mont_reg.iarc_ig_ibrp0 == 0)
			return PFMLIB_ERR_IRRINVAL;
		if ((retired_mask & 0x2) == 0 && reg.pmc38_mont_reg.iarc_ig_ibrp1 == 0)
			return PFMLIB_ERR_IRRINVAL;
		if ((retired_mask & 0x4) == 0 && reg.pmc38_mont_reg.iarc_ig_ibrp2 == 0)
			return PFMLIB_ERR_IRRINVAL;
		if ((retired_mask & 0x8) == 0 && reg.pmc38_mont_reg.iarc_ig_ibrp3 == 0)
			return PFMLIB_ERR_IRRINVAL;
	}

	if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 38))
		return PFMLIB_ERR_NOASSIGN;

	pc[pos].reg_num   = 38;
	pc[pos].reg_value = reg.pmc_val;
	pc[pos].reg_addr  = pc[pos].reg_alt_addr = 38;
	pos++;

	__pfm_vbprintf("[PMC38(pmc38)=0x%lx ig_ibrp0=%d ig_ibrp1=%d ig_ibrp2=%d ig_ibrp3=%d fine=%d]\n",
		reg.pmc_val,
		reg.pmc38_mont_reg.iarc_ig_ibrp0,
		reg.pmc38_mont_reg.iarc_ig_ibrp1,
		reg.pmc38_mont_reg.iarc_ig_ibrp2,
		reg.pmc38_mont_reg.iarc_ig_ibrp3,
		reg.pmc38_mont_reg.iarc_fine);

	outp->pfp_pmc_count = pos;

	return PFMLIB_SUCCESS;
}

static const unsigned long iod_tab[8]={
	/* --- */ 3,
	/* --D */ 2,
	/* -O- */ 3, /* should not be used */
	/* -OD */ 0, /* =IOD safe because default IBR is harmless */
	/* I-- */ 1, /* =IO safe because by default OPC is turned off */
	/* I-D */ 0, /* =IOD safe because by default opc is turned off */
	/* IO- */ 1,
	/* IOD */ 0
};

/*
 * IMPORTANT: MUST BE CALLED *AFTER* pfm_dispatch_irange() to make sure we see
 * the irange programming to adjust pmc41.
static int
pfm_dispatch_drange(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in, pfmlib_output_param_t *outp, pfmlib_mont_output_param_t *mod_out)
{
	pfmlib_mont_input_param_t *param = mod_in;
	pfmlib_reg_t *pc = outp->pfp_pmcs;
	pfmlib_mont_input_rr_t *irr;
	pfmlib_mont_output_rr_t *orr, *orr2;
	pfm_mont_pmc_reg_t pmc38;
	pfm_mont_pmc_reg_t reg;
	unsigned int i, pos = outp->pfp_pmc_count;
	int iod_codes[4], dfl_val_pmc32, dfl_val_pmc34;
	unsigned int n_intervals;
	int ret;
	int base_idx = 0;
	int fine_mode = 0;

#define DR_USED 0x1 /* data range is used */
#define OP_USED 0x2 /* opcode matching is used */
#define IR_USED 0x4 /* code range is used */

	if (param == NULL) return PFMLIB_SUCCESS;

	/*
	 * if only pmc32/pmc33 opcode matching is used, we do not need to change
	 * the default value of pmc41 regardless of the events being measured.
	 */
	if (   param->pfp_mont_drange.rr_used == 0
	    && param->pfp_mont_irange.rr_used == 0)
		return PFMLIB_SUCCESS;

	/*
	 * it seems like the ignored bits need to have special values
	 * otherwise this does not work.
	 */
	reg.pmc_val = 0x2078fefefefe;

	/*
	 * initialize iod codes
	 */
	iod_codes[0] = iod_codes[1] = iod_codes[2] = iod_codes[3] = 0;

	/*
	 * setup default iod value, we need to separate because
	 * if drange is used we do not know in advance which DBR will be used
	 * therefore we need to apply dfl_val later
	 */
	dfl_val_pmc32 = param->pfp_mont_opcm1.opcm_used ? OP_USED : 0;
	dfl_val_pmc34 = param->pfp_mont_opcm2.opcm_used ? OP_USED : 0;

	if (param->pfp_mont_drange.rr_used == 1) {
		if (mod_out == NULL) return PFMLIB_ERR_INVAL;

		irr = &param->pfp_mont_drange;
		orr = &mod_out->pfp_mont_drange;

		ret = check_intervals(irr, 1, &n_intervals);
		if (ret != PFMLIB_SUCCESS) return ret;

		if (n_intervals < 1) return PFMLIB_ERR_DRRINVAL;

		ret = compute_normal_rr(irr, inp->pfp_dfl_plm, n_intervals, &base_idx, orr);
		if (ret != PFMLIB_SUCCESS) {
			return ret == PFMLIB_ERR_TOOMANY ? PFMLIB_ERR_DRRTOOMANY : ret;
		}
		/*
		 * Update iod_codes to reflect the use of the DBR constraint.
 */
		for (i=0; i < orr->rr_nbr_used; i++) {
			if (orr->rr_br[i].reg_num == 0) iod_codes[0] |= DR_USED | dfl_val_pmc32;
			if (orr->rr_br[i].reg_num == 2) iod_codes[1] |= DR_USED | dfl_val_pmc34;
			if (orr->rr_br[i].reg_num == 4) iod_codes[2] |= DR_USED | dfl_val_pmc32;
			if (orr->rr_br[i].reg_num == 6) iod_codes[3] |= DR_USED | dfl_val_pmc34;
		}
	}

	/*
	 * XXX: assume dispatch_irange executed before calling this function
	 */
	if (param->pfp_mont_irange.rr_used == 1) {
		if (mod_out == NULL) return PFMLIB_ERR_INVAL;

		orr2 = &mod_out->pfp_mont_irange;

		/*
		 * we need to find out whether or not the irange is using
		 * fine mode. If this is the case, then we only need to
		 * program pmc41 for the ibr pairs which designate the lower
		 * bounds of a range. For instance, if IBRP0/IBRP2 are used,
		 * then we only need to program pmc41.cfg_dtag0 and pmc41.ena_dbrp0,
		 * the PMU will automatically use IBRP2, even though pmc41.ena_dbrp2=0.
		 */
		for(i=0; i < pos; i++) {
			if (pc[i].reg_num == 38) {
				pmc38.pmc_val = pc[i].reg_value;
				if (pmc38.pmc38_mont_reg.iarc_fine == 1) fine_mode = 1;
				break;
			}
		}

		/*
		 * Update to reflect the use of the IBR constraint
		 */
		for (i=0; i < orr2->rr_nbr_used; i++) {
			if (orr2->rr_br[i].reg_num == 0) iod_codes[0] |= IR_USED | dfl_val_pmc32;
			if (orr2->rr_br[i].reg_num == 2) iod_codes[1] |= IR_USED | dfl_val_pmc34;
			if (fine_mode == 0 && orr2->rr_br[i].reg_num == 4) iod_codes[2] |= IR_USED | dfl_val_pmc32;
			if (fine_mode == 0 && orr2->rr_br[i].reg_num == 6) iod_codes[3] |= IR_USED | dfl_val_pmc34;
		}
	}

	if (param->pfp_mont_irange.rr_used == 0 && param->pfp_mont_drange.rr_used == 0) {
		iod_codes[0] = iod_codes[2] = dfl_val_pmc32;
		iod_codes[1] = iod_codes[3] = dfl_val_pmc34;
	}

	/*
	 * update the cfg dbrpX field. If we put a constraint on a cfg dbrp, then
	 * we must enable it in the corresponding ena_dbrpX
	 */
	reg.pmc41_mont_reg.darc_ena_dbrp0 = iod_codes[0] ? 1 : 0;
	reg.pmc41_mont_reg.darc_cfg_dtag0 = iod_tab[iod_codes[0]];
	reg.pmc41_mont_reg.darc_ena_dbrp1 = iod_codes[1] ?
1 : 0; reg.pmc41_mont_reg.darc_cfg_dtag1 = iod_tab[iod_codes[1]]; reg.pmc41_mont_reg.darc_ena_dbrp2 = iod_codes[2] ? 1 : 0; reg.pmc41_mont_reg.darc_cfg_dtag2 = iod_tab[iod_codes[2]]; reg.pmc41_mont_reg.darc_ena_dbrp3 = iod_codes[3] ? 1 : 0; reg.pmc41_mont_reg.darc_cfg_dtag3 = iod_tab[iod_codes[3]]; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 41)) return PFMLIB_ERR_NOASSIGN; pc[pos].reg_num = 41; pc[pos].reg_value = reg.pmc_val; pc[pos].reg_addr = pc[pos].reg_alt_addr = 41; pos++; __pfm_vbprintf("[PMC41(pmc41)=0x%lx cfg_dtag0=%d cfg_dtag1=%d cfg_dtag2=%d cfg_dtag3=%d ena_dbrp0=%d ena_dbrp1=%d ena_dbrp2=%d ena_dbrp3=%d]\n", reg.pmc_val, reg.pmc41_mont_reg.darc_cfg_dtag0, reg.pmc41_mont_reg.darc_cfg_dtag1, reg.pmc41_mont_reg.darc_cfg_dtag2, reg.pmc41_mont_reg.darc_cfg_dtag3, reg.pmc41_mont_reg.darc_ena_dbrp0, reg.pmc41_mont_reg.darc_ena_dbrp1, reg.pmc41_mont_reg.darc_ena_dbrp2, reg.pmc41_mont_reg.darc_ena_dbrp3); outp->pfp_pmc_count = pos; return PFMLIB_SUCCESS; } static int check_qualifier_constraints(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in) { pfmlib_mont_input_param_t *param = mod_in; pfmlib_event_t *e = inp->pfp_events; unsigned int i, count; count = inp->pfp_event_count; for(i=0; i < count; i++) { /* * skip the check for counters which requested it. Use at your own risk. * Not all counters have necessarily been validated for use with * qualifiers. Typically the event is counted as if no constraint * existed. 
*/ if (param->pfp_mont_counters[i].flags & PFMLIB_MONT_FL_EVT_NO_QUALCHECK) continue; if (evt_use_irange(param) && has_iarr(e[i].event) == 0) return PFMLIB_ERR_FEATCOMB; if (evt_use_drange(param) && has_darr(e[i].event) == 0) return PFMLIB_ERR_FEATCOMB; if (evt_use_opcm(param) && has_opcm(e[i].event) == 0) return PFMLIB_ERR_FEATCOMB; } return PFMLIB_SUCCESS; } static int check_range_plm(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in) { pfmlib_mont_input_param_t *param = mod_in; unsigned int i, count; if (param->pfp_mont_drange.rr_used == 0 && param->pfp_mont_irange.rr_used == 0) return PFMLIB_SUCCESS; /* * range restriction applies to all events, therefore we must have a consistent * set of plm and they must match the pfp_dfl_plm which is used to setup the debug * registers */ count = inp->pfp_event_count; for(i=0; i < count; i++) { if (inp->pfp_events[i].plm && inp->pfp_events[i].plm != inp->pfp_dfl_plm) return PFMLIB_ERR_FEATCOMB; } return PFMLIB_SUCCESS; } static int pfm_dispatch_ipear(pfmlib_input_param_t *inp, pfmlib_mont_input_param_t *mod_in, pfmlib_output_param_t *outp) { pfm_mont_pmc_reg_t reg; pfmlib_mont_input_param_t *param = mod_in; pfmlib_event_t *e = inp->pfp_events; pfmlib_reg_t *pc, *pd; unsigned int pos1, pos2; unsigned int i, count; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; pos1 = outp->pfp_pmc_count; pos2 = outp->pfp_pmd_count; /* * check if there is something to do */ if (param == NULL || param->pfp_mont_ipear.ipear_used == 0) return PFMLIB_SUCCESS; /* * we need to look for use of ETB, because IP-EAR and ETB cannot be used at the * same time */ if (param->pfp_mont_etb.etb_used) return PFMLIB_ERR_FEATCOMB; /* * look for implicit ETB used because of BRANCH_EVENT */ count = inp->pfp_event_count; for (i=0; i < count; i++) { if (is_etb(e[i].event)) return PFMLIB_ERR_FEATCOMB; } reg.pmc_val = 0; reg.pmc42_mont_reg.ipear_plm = param->pfp_mont_ipear.ipear_plm ? 
param->pfp_mont_ipear.ipear_plm : inp->pfp_dfl_plm; reg.pmc42_mont_reg.ipear_pm = inp->pfp_flags & PFMLIB_PFP_SYSTEMWIDE ? 1 : 0; reg.pmc42_mont_reg.ipear_mode = 4; reg.pmc42_mont_reg.ipear_delay = param->pfp_mont_ipear.ipear_delay; if (pfm_regmask_isset(&inp->pfp_unavail_pmcs, 42)) return PFMLIB_ERR_NOASSIGN; pc[pos1].reg_num = 42; pc[pos1].reg_value = reg.pmc_val; pc[pos1].reg_addr = pc[pos1].reg_alt_addr = 42; pos1++; __pfm_vbprintf("[PMC42(pmc42)=0x%lx plm=%d pm=%d mode=%d delay=%d]\n", reg.pmc_val, reg.pmc42_mont_reg.ipear_plm, reg.pmc42_mont_reg.ipear_pm, reg.pmc42_mont_reg.ipear_mode, reg.pmc42_mont_reg.ipear_delay); pd[pos2].reg_num = 38; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 38; pos2++; pd[pos2].reg_num = 39; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = 39; pos2++; __pfm_vbprintf("[PMD38(pmd38)]\n[PMD39(pmd39)]\n"); for(i=48; i < 64; i++, pos2++) { pd[pos2].reg_num = i; pd[pos2].reg_addr = pd[pos2].reg_alt_addr = i; __pfm_vbprintf("[PMD%u(pmd%u)]\n", pd[pos2].reg_num, pd[pos2].reg_num); } outp->pfp_pmc_count = pos1; outp->pfp_pmd_count = pos2; return PFMLIB_SUCCESS; } static int pfm_mont_dispatch_events(pfmlib_input_param_t *inp, void *model_in, pfmlib_output_param_t *outp, void *model_out) { int ret; pfmlib_mont_input_param_t *mod_in = (pfmlib_mont_input_param_t *)model_in; pfmlib_mont_output_param_t *mod_out = (pfmlib_mont_output_param_t *)model_out; /* * nothing will come out of this combination */ if (mod_out && mod_in == NULL) return PFMLIB_ERR_INVAL; /* check opcode match, range restriction qualifiers */ if (mod_in && check_qualifier_constraints(inp, mod_in) != PFMLIB_SUCCESS) return PFMLIB_ERR_FEATCOMB; /* check for problems with range restriction and per-event plm */ if (mod_in && check_range_plm(inp, mod_in) != PFMLIB_SUCCESS) return PFMLIB_ERR_FEATCOMB; ret = pfm_mont_dispatch_counters(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) return ret; /* now check for I-EAR */ ret = pfm_dispatch_iear(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) 
return ret; /* now check for D-EAR */ ret = pfm_dispatch_dear(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) return ret; /* XXX: must be done before dispatch_opcm() and dispatch_drange() */ ret = pfm_dispatch_irange(inp, mod_in, outp, mod_out);; if (ret != PFMLIB_SUCCESS) return ret; ret = pfm_dispatch_drange(inp, mod_in, outp, mod_out);; if (ret != PFMLIB_SUCCESS) return ret; /* now check for Opcode matchers */ ret = pfm_dispatch_opcm(inp, mod_in, outp, mod_out); if (ret != PFMLIB_SUCCESS) return ret; /* now check for ETB */ ret = pfm_dispatch_etb(inp, mod_in, outp); if (ret != PFMLIB_SUCCESS) return ret; /* now check for IP-EAR */ ret = pfm_dispatch_ipear(inp, mod_in, outp); return ret; } /* XXX: return value is also error code */ int pfm_mont_get_event_maxincr(unsigned int i, unsigned int *maxincr) { if (i >= PME_MONT_EVENT_COUNT || maxincr == NULL) return PFMLIB_ERR_INVAL; *maxincr = montecito_pe[i].pme_maxincr; return PFMLIB_SUCCESS; } int pfm_mont_is_ear(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_ear(i); } int pfm_mont_is_dear(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_dear(i); } int pfm_mont_is_dear_tlb(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_dear(i) && is_ear_tlb(i); } int pfm_mont_is_dear_cache(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_dear(i) && is_ear_cache(i); } int pfm_mont_is_dear_alat(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_ear_alat(i); } int pfm_mont_is_iear(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_iear(i); } int pfm_mont_is_iear_tlb(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_iear(i) && is_ear_tlb(i); } int pfm_mont_is_iear_cache(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_iear(i) && is_ear_cache(i); } int pfm_mont_is_etb(unsigned int i) { return i < PME_MONT_EVENT_COUNT && is_etb(i); } int pfm_mont_support_iarr(unsigned int i) { return i < PME_MONT_EVENT_COUNT && has_iarr(i); } int pfm_mont_support_darr(unsigned int i) { return i < 
PME_MONT_EVENT_COUNT && has_darr(i); } int pfm_mont_support_opcm(unsigned int i) { return i < PME_MONT_EVENT_COUNT && has_opcm(i); } int pfm_mont_support_all(unsigned int i) { return i < PME_MONT_EVENT_COUNT && has_all(i); } int pfm_mont_get_ear_mode(unsigned int i, pfmlib_mont_ear_mode_t *m) { pfmlib_mont_ear_mode_t r; if (!is_ear(i) || m == NULL) return PFMLIB_ERR_INVAL; r = PFMLIB_MONT_EAR_TLB_MODE; if (is_ear_tlb(i)) goto done; r = PFMLIB_MONT_EAR_CACHE_MODE; if (is_ear_cache(i)) goto done; r = PFMLIB_MONT_EAR_ALAT_MODE; if (is_ear_alat(i)) goto done; return PFMLIB_ERR_INVAL; done: *m = r; return PFMLIB_SUCCESS; } static int pfm_mont_get_event_code(unsigned int i, unsigned int cnt, int *code) { if (cnt != PFMLIB_CNT_FIRST && (cnt < 4 || cnt > 15)) return PFMLIB_ERR_INVAL; *code = (int)montecito_pe[i].pme_code; return PFMLIB_SUCCESS; } /* * This function is accessible directly to the user */ int pfm_mont_get_event_umask(unsigned int i, unsigned long *umask) { if (i >= PME_MONT_EVENT_COUNT || umask == NULL) return PFMLIB_ERR_INVAL; *umask = evt_umask(i); return PFMLIB_SUCCESS; } int pfm_mont_get_event_group(unsigned int i, int *grp) { if (i >= PME_MONT_EVENT_COUNT || grp == NULL) return PFMLIB_ERR_INVAL; *grp = evt_grp(i); return PFMLIB_SUCCESS; } int pfm_mont_get_event_set(unsigned int i, int *set) { if (i >= PME_MONT_EVENT_COUNT || set == NULL) return PFMLIB_ERR_INVAL; *set = evt_set(i) == 0xf ? 
PFMLIB_MONT_EVT_NO_SET : evt_set(i); return PFMLIB_SUCCESS; } int pfm_mont_get_event_type(unsigned int i, int *type) { if (i >= PME_MONT_EVENT_COUNT || type == NULL) return PFMLIB_ERR_INVAL; *type = evt_caf(i); return PFMLIB_SUCCESS; } /* external interface */ int pfm_mont_irange_is_fine(pfmlib_output_param_t *outp, pfmlib_mont_output_param_t *mod_out) { pfmlib_mont_output_param_t *param = mod_out; pfm_mont_pmc_reg_t reg; unsigned int i, count; /* some sanity checks */ if (outp == NULL || param == NULL) return 0; if (outp->pfp_pmc_count >= PFMLIB_MAX_PMCS) return 0; if (param->pfp_mont_irange.rr_nbr_used == 0) return 0; /* * we look for pmc38 as it contains the bit indicating if fine mode is used */ count = outp->pfp_pmc_count; for(i=0; i < count; i++) { if (outp->pfp_pmcs[i].reg_num == 38) goto found; } return 0; found: reg.pmc_val = outp->pfp_pmcs[i].reg_value; return reg.pmc38_mont_reg.iarc_fine ? 1 : 0; } static char * pfm_mont_get_event_name(unsigned int i) { return montecito_pe[i].pme_name; } static void pfm_mont_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { unsigned int i; unsigned long m; memset(counters, 0, sizeof(*counters)); m =montecito_pe[j].pme_counters; for(i=0; m ; i++, m>>=1) { if (m & 0x1) pfm_regmask_set(counters, i); } } static void pfm_mont_get_impl_pmcs(pfmlib_regmask_t *impl_pmcs) { unsigned int i = 0; for(i=0; i < 16; i++) pfm_regmask_set(impl_pmcs, i); for(i=32; i < 43; i++) pfm_regmask_set(impl_pmcs, i); } static void pfm_mont_get_impl_pmds(pfmlib_regmask_t *impl_pmds) { unsigned int i = 0; for(i=4; i < 16; i++) pfm_regmask_set(impl_pmds, i); for(i=32; i < 40; i++) pfm_regmask_set(impl_pmds, i); for(i=48; i < 64; i++) pfm_regmask_set(impl_pmds, i); } static void pfm_mont_get_impl_counters(pfmlib_regmask_t *impl_counters) { unsigned int i = 0; /* counter pmds are contiguous */ for(i=4; i < 16; i++) pfm_regmask_set(impl_counters, i); } static void pfm_mont_get_hw_counter_width(unsigned int *width) { *width = 
PMU_MONT_COUNTER_WIDTH; } static int pfm_mont_get_event_description(unsigned int ev, char **str) { char *s; s = montecito_pe[ev].pme_desc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static int pfm_mont_get_cycle_event(pfmlib_event_t *e) { e->event = PME_MONT_CPU_OP_CYCLES_ALL; return PFMLIB_SUCCESS; } static int pfm_mont_get_inst_retired(pfmlib_event_t *e) { e->event = PME_MONT_IA64_INST_RETIRED; return PFMLIB_SUCCESS; } static unsigned int pfm_mont_get_num_event_masks(unsigned int event) { return has_mesi(event) ? 4 : 0; } static char * pfm_mont_get_event_mask_name(unsigned int event, unsigned int mask) { switch(mask) { case 0: return "I"; case 1: return "S"; case 2: return "E"; case 3: return "M"; } return NULL; } static int pfm_mont_get_event_mask_desc(unsigned int event, unsigned int mask, char **desc) { switch(mask) { case 0: *desc = strdup("invalid"); break; case 1: *desc = strdup("shared"); break; case 2: *desc = strdup("exclusive"); break; case 3: *desc = strdup("modified"); break; default: return PFMLIB_ERR_INVAL; } return PFMLIB_SUCCESS; } static int pfm_mont_get_event_mask_code(unsigned int event, unsigned int mask, unsigned int *code) { *code = mask; return PFMLIB_SUCCESS; } pfm_pmu_support_t montecito_support={ .pmu_name = "dual-core Itanium 2", .pmu_type = PFMLIB_MONTECITO_PMU, .pme_count = PME_MONT_EVENT_COUNT, .pmc_count = PMU_MONT_NUM_PMCS, .pmd_count = PMU_MONT_NUM_PMDS, .num_cnt = PMU_MONT_NUM_COUNTERS, .get_event_code = pfm_mont_get_event_code, .get_event_name = pfm_mont_get_event_name, .get_event_counters = pfm_mont_get_event_counters, .dispatch_events = pfm_mont_dispatch_events, .pmu_detect = pfm_mont_detect, .get_impl_pmcs = pfm_mont_get_impl_pmcs, .get_impl_pmds = pfm_mont_get_impl_pmds, .get_impl_counters = pfm_mont_get_impl_counters, .get_hw_counter_width = pfm_mont_get_hw_counter_width, .get_event_desc = pfm_mont_get_event_description, .get_cycle_event = pfm_mont_get_cycle_event, .get_inst_retired_event = 
pfm_mont_get_inst_retired, .get_num_event_masks = pfm_mont_get_num_event_masks, .get_event_mask_name = pfm_mont_get_event_mask_name, .get_event_mask_desc = pfm_mont_get_event_mask_desc, .get_event_mask_code = pfm_mont_get_event_mask_code }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_montecito_priv.h000066400000000000000000000127401502707512200233010ustar00rootroot00000000000000/* * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ #ifndef __PFMLIB_MONTECITO_PRIV_H__ #define __PFMLIB_MONTECITO_PRIV_H__ /* * Event type definitions * * The virtual events are not really defined in the specs but are an artifact used * to quickly and easily setup EAR and/or BTB. The event type encodes the exact feature * which must be configured in combination with a counting monitor. 
* For instance, DATA_EAR_CACHE_LAT4 is a virtual D-EAR cache event. If the user * requests this event, this will configure a counting monitor to count DATA_EAR_EVENTS * and PMC11 will be configured for cache mode. The latency is encoded in the umask, here * it would correspond to 4 cycles. * */ #define PFMLIB_MONT_EVENT_NORMAL 0x0 /* standard counter */ #define PFMLIB_MONT_EVENT_ETB 0x1 /* virtual event used with ETB configuration */ #define PFMLIB_MONT_EVENT_IEAR_TLB 0x2 /* virtual event used for I-EAR TLB configuration */ #define PFMLIB_MONT_EVENT_IEAR_CACHE 0x3 /* virtual event used for I-EAR cache configuration */ #define PFMLIB_MONT_EVENT_DEAR_TLB 0x4 /* virtual event used for D-EAR TLB configuration */ #define PFMLIB_MONT_EVENT_DEAR_CACHE 0x5 /* virtual event used for D-EAR cache configuration */ #define PFMLIB_MONT_EVENT_DEAR_ALAT 0x6 /* virtual event used for D-EAR ALAT configuration */ #define event_is_ear(e) ((e)->pme_type >= PFMLIB_MONT_EVENT_IEAR_TLB &&(e)->pme_type <= PFMLIB_MONT_EVENT_DEAR_ALAT) #define event_is_iear(e) ((e)->pme_type == PFMLIB_MONT_EVENT_IEAR_TLB || (e)->pme_type == PFMLIB_MONT_EVENT_IEAR_CACHE) #define event_is_dear(e) ((e)->pme_type >= PFMLIB_MONT_EVENT_DEAR_TLB && (e)->pme_type <= PFMLIB_MONT_EVENT_DEAR_ALAT) #define event_is_ear_cache(e) ((e)->pme_type == PFMLIB_MONT_EVENT_DEAR_CACHE || (e)->pme_type == PFMLIB_MONT_EVENT_IEAR_CACHE) #define event_is_ear_tlb(e) ((e)->pme_type == PFMLIB_MONT_EVENT_IEAR_TLB || (e)->pme_type == PFMLIB_MONT_EVENT_DEAR_TLB) #define event_is_ear_alat(e) ((e)->pme_type == PFMLIB_MONT_EVENT_DEAR_ALAT) #define event_is_etb(e) ((e)->pme_type == PFMLIB_MONT_EVENT_ETB) /* * Itanium encoding structure * (code must be first 8 bits) */ typedef struct { unsigned long pme_code:8; /* major event code */ unsigned long pme_type:3; /* see definitions above */ unsigned long pme_caf:2; /* Active, Floating, Causal, Self-Floating */ unsigned long pme_ig1:3; /* ignored */ unsigned long pme_umask:16; /* unit mask*/ unsigned 
long pme_ig:32; /* ignored */ } pme_mont_entry_code_t; typedef union { unsigned long pme_vcode; pme_mont_entry_code_t pme_mont_code; /* must not be larger than vcode */ } pme_mont_code_t; typedef union { unsigned long qual; /* generic qualifier */ struct { unsigned long pme_iar:1; /* instruction address range supported */ unsigned long pme_opm:1; /* opcode match supported */ unsigned long pme_dar:1; /* data address range supported */ unsigned long pme_all:1; /* supports all_thrd=1 */ unsigned long pme_mesi:1; /* event supports MESI */ unsigned long pme_res1:11; /* reserved */ unsigned long pme_group:3; /* event group */ unsigned long pme_set:4; /* event set*/ unsigned long pme_res2:41; /* reserved */ } pme_qual; } pme_mont_qualifiers_t; typedef struct { char *pme_name; pme_mont_code_t pme_entry_code; unsigned long pme_counters; /* supported counters */ unsigned int pme_maxincr; pme_mont_qualifiers_t pme_qualifiers; char *pme_desc; /* text description of the event */ } pme_mont_entry_t; /* * We embed the umask value into the event code. Because it really is * like a subevent. 
* pme_code: * - lower 16 bits: major event code * - upper 16 bits: unit mask */ #define pme_code pme_entry_code.pme_mont_code.pme_code #define pme_umask pme_entry_code.pme_mont_code.pme_umask #define pme_used pme_qualifiers.pme_qual_struct.pme_used #define pme_type pme_entry_code.pme_mont_code.pme_type #define pme_caf pme_entry_code.pme_mont_code.pme_caf #define event_opcm_ok(e) ((e)->pme_qualifiers.pme_qual.pme_opm==1) #define event_iarr_ok(e) ((e)->pme_qualifiers.pme_qual.pme_iar==1) #define event_darr_ok(e) ((e)->pme_qualifiers.pme_qual.pme_dar==1) #define event_all_ok(e) ((e)->pme_qualifiers.pme_qual.pme_all==1) #define event_mesi_ok(e) ((e)->pme_qualifiers.pme_qual.pme_mesi==1) #endif /* __PFMLIB_MONTECITO_PRIV_H__ */ papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_perf_event.c000066400000000000000000000413251502707512200223710ustar00rootroot00000000000000/* * pfmlib_perf_events.c: encode events for perf_event API * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include #include #include #include "pfmlib_priv.h" #include "pfmlib_perf_event_priv.h" #define PERF_PROC_FILE "/proc/sys/kernel/perf_event_paranoid" #ifdef min #undef min #endif #define min(a, b) ((a) < (b) ? (a) : (b)) /* * contains ONLY attributes related to PMU features */ static const pfmlib_attr_desc_t perf_event_mods[]={ PFM_ATTR_B("u", "monitor at user level"), /* monitor user level */ PFM_ATTR_B("k", "monitor at kernel level"), /* monitor kernel level */ PFM_ATTR_B("h", "monitor at hypervisor level"), /* monitor hypervisor level */ PFM_ATTR_SKIP, /* to match index in perf_event_ext_mods */ PFM_ATTR_SKIP, /* to match index in perf_event_ext_mods */ PFM_ATTR_SKIP, /* to match index in perf_event_ext_mods */ PFM_ATTR_SKIP, /* to match index in perf_event_ext_mods */ PFM_ATTR_B("mg", "monitor guest execution"), /* monitor guest level */ PFM_ATTR_B("mh", "monitor host execution"), /* monitor host level */ PFM_ATTR_NULL /* end-marker to avoid exporting number of entries */ }; /* * contains all attributes controlled by perf_events. 
That includes PMU attributes * and pure software attributes such as sampling periods */ static const pfmlib_attr_desc_t perf_event_ext_mods[]={ PFM_ATTR_B("u", "monitor at user level"), /* monitor user level */ PFM_ATTR_B("k", "monitor at kernel level"), /* monitor kernel level */ PFM_ATTR_B("h", "monitor at hypervisor level"), /* monitor hypervisor level */ PFM_ATTR_I("period", "sampling period"), /* sampling period */ PFM_ATTR_I("freq", "sampling frequency (Hz)"), /* sampling frequency */ PFM_ATTR_I("precise", "precise event sampling"), /* anti-skid mechanism */ PFM_ATTR_B("excl", "exclusive access"), /* exclusive PMU access */ PFM_ATTR_B("mg", "monitor guest execution"), /* monitor guest level */ PFM_ATTR_B("mh", "monitor host execution"), /* monitor host level */ PFM_ATTR_I("cpu", "CPU to program"), /* CPU to program */ PFM_ATTR_B("pinned", "pin event to counters"), /* pin event to PMU */ PFM_ATTR_B("hw_smpl", "enable hardware sampling"),/* enable hw_smpl, not precise IP */ PFM_ATTR_NULL /* end-marker to avoid exporting number of entries */ }; typedef struct sysfs_pmu_entry { char *name; int type; int flags; } sysfs_pmu_entry_t; static sysfs_pmu_entry_t *sysfs_pmus; /* cache os pmus available in sysfs */ static int sysfs_npmus; /* number of entries in sysfs_pmus */ static int pfmlib_check_no_mods(pfmlib_event_desc_t *e) { pfmlib_event_attr_info_t *a; int numasks = 0, no_mods = 0; int i; /* * scan umasks used with the event * for support_no_attr values */ for (i = 0; i < e->nattrs; i++) { a = attr(e, i); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type != PFM_ATTR_UMASK) continue; numasks++; if (a->support_no_mods) no_mods++; } /* * handle the case where some umasks have no_attr * and not others. 
In that case no_attr has priority */ if (no_mods && numasks != no_mods) { DPRINT("event %s with umasks with and without no_mods (%d) attribute, forcing no_mods\n", e->fstr, no_mods); } return no_mods; } static int pfmlib_perf_event_encode(void *this, const char *str, int dfl_plm, void *data) { pfm_perf_encode_arg_t arg; pfm_perf_encode_arg_t *uarg = data; pfmlib_os_t *os = this; struct perf_event_attr my_attr, *attr; pfmlib_pmu_t *pmu; pfmlib_event_desc_t e; pfmlib_event_attr_info_t *a; size_t orig_sz, asz, sz = sizeof(arg); uint64_t ival; int has_plm = 0, has_vmx_plm = 0; int i, plm = 0, ret, vmx_plm = 0; int cpu = -1, pinned = 0; int no_mods; sz = pfmlib_check_struct(uarg, uarg->size, PFM_PERF_ENCODE_ABI0, sz); if (!sz) return PFM_ERR_INVAL; /* copy input */ memcpy(&arg, uarg, sz); /* pointer to our internal attr struct */ memset(&my_attr, 0, sizeof(my_attr)); attr = &my_attr; /* * copy user attr to our internal version * size == 0 is interpreted minimal possible * size (ABI_VER0) */ /* size of attr struct passed by user */ orig_sz = uarg->attr->size; if (orig_sz == 0) asz = PERF_ATTR_SIZE_VER0; else asz = min(sizeof(*attr), orig_sz); /* * we copy the user struct to preserve whatever may * have been initialized but that we do not use */ memcpy(attr, uarg->attr, asz); /* restore internal size (just in case we need it) */ attr->size = sizeof(my_attr); /* useful for debugging */ if (asz != sizeof(*attr)) __pfm_vbprintf("warning: mismatch attr struct size " "user=%d libpfm=%zu\n", asz, sizeof(*attr)); memset(&e, 0, sizeof(e)); e.osid = os->id; e.os_data = attr; e.dfl_plm = dfl_plm; /* after this call, need to call pfmlib_release_event() */ ret = pfmlib_parse_event(str, &e); if (ret != PFM_SUCCESS) return ret; pmu = e.pmu; ret = PFM_ERR_NOTSUPP; if (!pmu->get_event_encoding[e.osid]) { DPRINT("PMU %s does not support PFM_OS_NONE\n", pmu->name); goto done; } ret = pmu->get_event_encoding[e.osid](pmu, &e); if (ret != PFM_SUCCESS) goto done; no_mods = 
pfmlib_check_no_mods(&e); /* * process perf_event attributes */ for (i = 0; i < e.nattrs; i++) { a = attr(&e, i); if (a->ctrl != PFM_ATTR_CTRL_PERF_EVENT) continue; ival = e.attrs[i].ival; /* * if event or umasks do not support any modifiers, * then reject */ if (no_mods && ival && a->idx != PERF_ATTR_PIN) { ret = PFM_ERR_ATTR_VAL; goto done; } switch(a->idx) { case PERF_ATTR_U: if (ival) plm |= PFM_PLM3; has_plm = 1; break; case PERF_ATTR_K: if (ival) plm |= PFM_PLM0; has_plm = 1; break; case PERF_ATTR_H: if (ival) plm |= PFM_PLMH; has_plm = 1; break; case PERF_ATTR_PE: if (!ival || attr->freq) { ret = PFM_ERR_ATTR_VAL; goto done; } attr->sample_period = ival; break; case PERF_ATTR_FR: if (!ival || attr->sample_period) { ret = PFM_ERR_ATTR_VAL; goto done; } attr->sample_freq = ival; attr->freq = 1; break; case PERF_ATTR_PR: if (ival > 3) { ret = PFM_ERR_ATTR_VAL; goto done; } attr->precise_ip = ival; break; case PERF_ATTR_EX: if (ival && !attr->exclusive) attr->exclusive = 1; break; case PERF_ATTR_MG: vmx_plm |= PFM_PLM3; has_vmx_plm = 1; break; case PERF_ATTR_MH: vmx_plm |= PFM_PLM0; has_vmx_plm = 1; break; case PERF_ATTR_CPU: if (ival >= INT_MAX) { ret = PFM_ERR_ATTR_VAL; goto done; } cpu = (int)ival; break; case PERF_ATTR_PIN: pinned = (int)!!ival; break; case PERF_ATTR_HWS: attr->precise_ip = (int)!!ival; break; } } /* * if no priv level mask was provided * with the event, then use dfl_plm */ if (!has_plm) plm = dfl_plm; /* exclude_guest by default */ if (!has_vmx_plm) vmx_plm = PFM_PLM0; /* * perf_event plm work by exclusion, so use logical or * goal here is to set to zero any exclude_* not supported * by underlying PMU */ plm |= (~pmu->supported_plm) & PFM_PLM_ALL; vmx_plm |= (~pmu->supported_plm) & PFM_PLM_ALL; attr->exclude_user = !(plm & PFM_PLM3); attr->exclude_kernel = !(plm & PFM_PLM0); attr->exclude_hv = !(plm & PFM_PLMH); attr->exclude_guest = !(vmx_plm & PFM_PLM3); attr->exclude_host = !(vmx_plm & PFM_PLM0); attr->pinned = pinned; 
__pfm_vbprintf("PERF[type=%x config=0x%"PRIx64" config1=0x%"PRIx64 " excl=%d excl_user=%d excl_kernel=%d excl_hv=%d excl_host=%d excl_guest=%d period=%"PRIu64" freq=%d" " precise=%d pinned=%d] %s\n", attr->type, attr->config, attr->config1, attr->exclusive, attr->exclude_user, attr->exclude_kernel, attr->exclude_hv, attr->exclude_host, attr->exclude_guest, attr->sample_period, attr->freq, attr->precise_ip, attr->pinned, str); /* * propagate event index if necessary */ arg.idx = pfmlib_pidx2idx(e.pmu, e.event); /* propagate cpu */ arg.cpu = cpu; /* propagate our changes, that overwrites attr->size */ memcpy(uarg->attr, attr, asz); /* restore user size */ uarg->attr->size = orig_sz; /* * fstr not requested, stop here * or no_mods set */ ret = PFM_SUCCESS; if (!arg.fstr || no_mods) { memcpy(uarg, &arg, sz); goto done; } for (i=0; i < e.npattrs; i++) { int idx; if (e.pattrs[i].ctrl != PFM_ATTR_CTRL_PERF_EVENT) continue; idx = e.pattrs[i].idx; switch (idx) { case PERF_ATTR_K: evt_strcat(e.fstr, ":%s=%lu", perf_event_ext_mods[idx].name, !!(plm & PFM_PLM0)); break; case PERF_ATTR_U: evt_strcat(e.fstr, ":%s=%lu", perf_event_ext_mods[idx].name, !!(plm & PFM_PLM3)); break; case PERF_ATTR_H: evt_strcat(e.fstr, ":%s=%lu", perf_event_ext_mods[idx].name, !!(plm & PFM_PLMH)); break; case PERF_ATTR_PR: case PERF_ATTR_HWS: evt_strcat(e.fstr, ":%s=%d", perf_event_ext_mods[idx].name, attr->precise_ip); break; case PERF_ATTR_PE: case PERF_ATTR_FR: if (attr->freq && attr->sample_period) evt_strcat(e.fstr, ":%s=%"PRIu64, perf_event_ext_mods[idx].name, attr->sample_period); else if (attr->sample_period) evt_strcat(e.fstr, ":%s=%"PRIu64, perf_event_ext_mods[idx].name, attr->sample_period); break; case PERF_ATTR_MG: evt_strcat(e.fstr, ":%s=%lu", perf_event_ext_mods[idx].name, !attr->exclude_guest); break; case PERF_ATTR_MH: evt_strcat(e.fstr, ":%s=%lu", perf_event_ext_mods[idx].name, !attr->exclude_host); break; case PERF_ATTR_EX: evt_strcat(e.fstr, ":%s=%lu", 
perf_event_ext_mods[idx].name, attr->exclusive); break; } } ret = pfmlib_build_fstr(&e, arg.fstr); if (ret == PFM_SUCCESS) memcpy(uarg, &arg, sz); done: pfmlib_release_event(&e); return ret; } /* * get OS-specific event attributes */ static int perf_get_os_nattrs(void *this, pfmlib_event_desc_t *e) { pfmlib_os_t *os = this; int i, n = 0; for (i = 0; os->atdesc[i].name; i++) if (!is_empty_attr(os->atdesc+i)) n++; return n; } static int perf_get_os_attr_info(void *this, pfmlib_event_desc_t *e) { pfmlib_os_t *os = this; pfmlib_event_attr_info_t *info; int i, k, j = e->npattrs; for (i = k = 0; os->atdesc[i].name; i++) { /* skip padding entries */ if (is_empty_attr(os->atdesc+i)) continue; info = e->pattrs + j + k; info->name = os->atdesc[i].name; info->desc = os->atdesc[i].desc; info->equiv= NULL; info->code = i; info->idx = i; /* namespace-specific index */ info->type = os->atdesc[i].type; info->is_dfl = 0; info->ctrl = PFM_ATTR_CTRL_PERF_EVENT; k++; } e->npattrs += k; return PFM_SUCCESS; } /* * old interface, maintained for backward compatibility with earlier versions of the library */ int pfm_get_perf_event_encoding(const char *str, int dfl_plm, struct perf_event_attr *attr, char **fstr, int *idx) { pfm_perf_encode_arg_t arg; int ret; if (PFMLIB_INITIALIZED() == 0) return PFM_ERR_NOINIT; /* idx and fstr can be NULL */ if (!(attr && str)) return PFM_ERR_INVAL; if (dfl_plm & ~(PFM_PLM_ALL)) return PFM_ERR_INVAL; memset(&arg, 0, sizeof(arg)); /* do not clear attr, some fields may be initialized by caller already, e.g., size */ arg.attr = attr; arg.fstr = fstr; ret = pfm_get_os_event_encoding(str, dfl_plm, PFM_OS_PERF_EVENT_EXT, &arg); if (ret != PFM_SUCCESS) return ret; if (idx) *idx = arg.idx; return PFM_SUCCESS; } /* * generic perf encoding helper */ static int pfmlib_perf_find_pmu_type_by_name(const char *perf_name, int *type) { char filename[PATH_MAX]; FILE *fp; int ret, tmp; int retval = PFM_ERR_NOTFOUND; if (!(perf_name && type)) return PFM_ERR_NOTSUPP; 
snprintf(filename, PATH_MAX, "%s/%s/type", SYSFS_PMU_DEVICES_DIR, perf_name); fp = fopen(filename, "r"); if (!fp) return PFM_ERR_NOTSUPP; ret = fscanf(fp, "%d", &tmp); fclose(fp); if (ret == 1) { *type = tmp; retval = PFM_SUCCESS; } return retval; } /* * identify perf_events subdirectory * via the presence of the mux interval config file * Return: * 1 : directory is a perf_events directory (match) * 0 : directory is not a perf_events directory (no match) */ static int filter_pmu_dir(const struct dirent *d) { char fn[PATH_MAX]; if (d->d_name[0] == '.') return 0; if (d->d_type != DT_DIR && d->d_type != DT_LNK) return 0; snprintf(fn, PATH_MAX, "%s/%s/perf_event_mux_interval_ms", SYSFS_PMU_DEVICES_DIR, d->d_name); return !access(fn, F_OK); } /* * build a cache of PMUs available via sysfs * to speed up lookup later on */ int pfm_init_sysfs_pmu_cache(void) { struct dirent **dir_list = NULL; int n, i, j, ret; int type; /* only initialize once (perf vs. perf_ext) */ if (sysfs_pmus) return PFM_SUCCESS; n = scandir(SYSFS_PMU_DEVICES_DIR, &dir_list, filter_pmu_dir, NULL); if (n == 0) { free(dir_list); return PFM_ERR_NOTSUPP; } sysfs_pmus = (sysfs_pmu_entry_t *)malloc(n * sizeof(sysfs_pmu_entry_t)); if (!sysfs_pmus) return PFM_ERR_NOMEM; /* * cache perf_event PMU name and type (attr.type) */ for (i = j = 0; i < n; i++) { sysfs_pmus[j].name = dir_list[i]->d_name; ret = pfmlib_perf_find_pmu_type_by_name(sysfs_pmus[j].name, &type); /* skip PMU if cannot get the type */ if (ret != PFM_SUCCESS) { DPRINT("sysfs_pmus[%d]=%s failed to get PMU type from sysfs\n", j, sysfs_pmus[j].name); continue; } sysfs_pmus[j].type = type; DPRINT("sysfs_pmus[%d]=%s type=%d\n", j, sysfs_pmus[j].name, sysfs_pmus[j].type); j++; } sysfs_npmus = j; free(dir_list); return PFM_SUCCESS; } static int pfm_perf_event_os_detect(void *this) { if (access(PERF_PROC_FILE, F_OK)) return PFM_ERR_NOTSUPP; return pfm_init_sysfs_pmu_cache(); } static int pfmlib_perf_find_pmu_type(char *pmu_name, int *type) { int i; if 
(!sysfs_pmus) return PFM_ERR_NOTFOUND; for (i = 0; i < sysfs_npmus; i++) { /* for now use exact match, add regexp later */ if (!strcmp(pmu_name, sysfs_pmus[i].name)) { *type = sysfs_pmus[i].type; return PFM_SUCCESS; } } DPRINT("perf_find_pmu_type: cannot find PMU %s\n", pmu_name); return PFM_ERR_NOTFOUND; } /* * generic perf encoding helper */ int pfm_perf_find_pmu_type(void *this, int *type) { pfmlib_pmu_t *pmu = this; char *p, *s, *q; int ret; /* * if no perf_name specified, then the best * option is to use TYPE_RAW, i.e., the core PMU * which the caller is running on when invoking * perf_event_open() */ if (!pmu->perf_name) { *type = PERF_TYPE_RAW; DPRINT("No perf_name for %s, defaulting to TYPE_RAW\n", pmu->name); return PFM_SUCCESS; } /* * perf_name may be a comma separated list of PMU names * so duplicate to split the string into PMU keywords */ s = q = strdup(pmu->perf_name); if (!s) { DPRINT("cannot dup perf_name for %s\n", pmu->perf_name); return PFM_ERR_NOTSUPP; } ret = PFM_ERR_NOTFOUND; while ((p = strchr(s, ','))) { *p = '\0'; /* stop at first match */ ret = pfmlib_perf_find_pmu_type(s, type); if (ret == PFM_SUCCESS) break; s = p + 1; } /* only or last element of perf_name */ if (ret == PFM_ERR_NOTFOUND) ret = pfmlib_perf_find_pmu_type(s, type); free(q); if (ret != PFM_SUCCESS) { DPRINT("cannot find perf_events PMU type for %s perf_name=%s using PERF_TYPE_RAW\n", pmu->name, pmu->perf_name); } return ret; } pfmlib_os_t pfmlib_os_perf={ .name = "perf_event", .id = PFM_OS_PERF_EVENT, .atdesc = perf_event_mods, .detect = pfm_perf_event_os_detect, .get_os_attr_info = perf_get_os_attr_info, .get_os_nattrs = perf_get_os_nattrs, .encode = pfmlib_perf_event_encode, }; pfmlib_os_t pfmlib_os_perf_ext={ .name = "perf_event extended", .id = PFM_OS_PERF_EVENT_EXT, .atdesc = perf_event_ext_mods, .detect = pfm_perf_event_os_detect, .get_os_attr_info = perf_get_os_attr_info, .get_os_nattrs = perf_get_os_nattrs, .encode = pfmlib_perf_event_encode, }; 
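The comma-separated `perf_name` walk in `pfm_perf_find_pmu_type` above (split on `,`, stop at the first name that resolves, fall back to the last element) can be sketched standalone. This is a minimal illustration only: `lookup_type` is a hypothetical stand-in for the sysfs-backed `pfmlib_perf_find_pmu_type`, and its name/type table is made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* hypothetical stand-in for the sysfs-backed per-name lookup */
static int lookup_type(const char *name, int *type)
{
	/* made-up name -> perf_event_attr.type table, for illustration only */
	if (!strcmp(name, "cpu"))         { *type = 4;  return 0; }
	if (!strcmp(name, "cstate_core")) { *type = 19; return 0; }
	return -1; /* analogue of PFM_ERR_NOTFOUND */
}

/* walk a comma-separated list of PMU names, stop at first match */
static int find_pmu_type(const char *perf_name, int *type)
{
	char buf[256]; /* local copy so we can split in place */
	char *s = buf, *p;

	snprintf(buf, sizeof(buf), "%s", perf_name);

	while ((p = strchr(s, ','))) {
		*p = '\0';
		if (lookup_type(s, type) == 0)
			return 0;        /* stop at first match */
		s = p + 1;
	}
	return lookup_type(s, type); /* only or last element */
}
```

The library duplicates the string for the same reason the sketch copies it: splitting with `'\0'` must not modify the PMU descriptor's `perf_name`.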
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_perf_event_pmu.c000066400000000000000000000634361502707512200232610ustar00rootroot00000000000000/* * pfmlib_perf_pmu.c: support for perf_events event table * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #ifdef __linux__ #include /* for openat() */ #include #endif #include "pfmlib_priv.h" #include "pfmlib_perf_event_priv.h" #define PERF_MAX_UMASKS 8 typedef struct { const char *uname; /* unit mask name */ const char *udesc; /* unit mask desc */ uint64_t uid; /* unit mask id */ int uflags; /* umask options */ int grpid; /* group identifier */ } perf_umask_t; typedef struct { const char *name; /* name */ const char *desc; /* description */ const char *equiv; /* event is aliased to */ const char *pmu; /* PMU instance (sysfs) */ uint64_t id; /* perf_hw_id or equivalent */ int modmsk; /* modifiers bitmask */ int type; /* perf_type_id */ int numasks; /* number of unit masks */ int ngrp; /* number of umasks groups */ unsigned long umask_ovfl_idx; /* base index of overflow unit masks */ int flags; /* event flags */ perf_umask_t umasks[PERF_MAX_UMASKS];/* first unit masks */ } perf_event_t; /* * event/umask flags */ #define PERF_FL_DEFAULT 0x1 /* umask is default for group */ #define PERF_FL_PRECISE 0x2 /* support precise sampling */ #define PERF_INVAL_OVFL_IDX (~0UL) #define PCL_EVT(f, t, m, fl) \ { .name = #f, \ .id = (f), \ .type = (t), \ .desc = #f, \ .equiv = NULL, \ .numasks = 0, \ .modmsk = (m), \ .ngrp = 0, \ .flags = fl, \ .umask_ovfl_idx = PERF_INVAL_OVFL_IDX,\ } #define PCL_EVTA(f, t, m, a, fl)\ { .name = #f, \ .id = a, \ .type = t, \ .desc = #a, \ .equiv = #a, \ .numasks = 0, \ .modmsk = m, \ .ngrp = 0, \ .flags = fl, \ .umask_ovfl_idx = PERF_INVAL_OVFL_IDX,\ } #define PCL_EVTR(f, t, a, d)\ { .name = #f, \ .id = a, \ .type = t, \ .desc = d, \ .umask_ovfl_idx = PERF_INVAL_OVFL_IDX,\ } #define PCL_EVT_HW(n) PCL_EVT(PERF_COUNT_HW_##n, PERF_TYPE_HARDWARE, PERF_ATTR_HW, 0) #define PCL_EVT_SW(n) PCL_EVT(PERF_COUNT_SW_##n, PERF_TYPE_SOFTWARE, PERF_ATTR_SW, 0) #define PCL_EVT_AHW(n, a) PCL_EVTA(n, PERF_TYPE_HARDWARE, PERF_ATTR_HW, PERF_COUNT_HW_##a, 0) #define PCL_EVT_ASW(n, a) PCL_EVTA(n,
PERF_TYPE_SOFTWARE, PERF_ATTR_SW, PERF_COUNT_SW_##a, 0) #define PCL_EVT_HW_FL(n, fl) PCL_EVT(PERF_COUNT_HW_##n, PERF_TYPE_HARDWARE, PERF_ATTR_HW, fl) #define PCL_EVT_RAW(n, e, u, d) PCL_EVTR(n, PERF_TYPE_RAW, (u) << 8 | (e), d) #ifndef MAXPATHLEN #define MAXPATHLEN 1024 #endif #define PERF_ATTR_HW 0 #define PERF_ATTR_SW 0 #include "events/perf_events.h" #define perf_nevents (perf_event_support.pme_count) static perf_event_t *perf_pe = perf_static_events; static perf_event_t *perf_pe_free, *perf_pe_end; static perf_umask_t *perf_um, *perf_um_free, *perf_um_end; static int perf_pe_count; static inline int pfm_perf_pmu_supported_plm(void *this) { pfmlib_pmu_t *pmu; pmu = pfmlib_get_pmu_by_type(PFM_PMU_TYPE_CORE); if (!pmu) { DPRINT("no core CPU PMU, going with default\n"); pmu = this; } else { DPRINT("guessing plm from %s PMU plm=0x%x\n", pmu->name, pmu->supported_plm); } return pmu->supported_plm; } static inline perf_umask_t * perf_get_ovfl_umask(int pidx) { return perf_um+perf_pe[pidx].umask_ovfl_idx; } static inline perf_umask_t * perf_attridx2um(int idx, int attr_idx) { perf_umask_t *um; if (attr_idx < PERF_MAX_UMASKS) { um = &perf_pe[idx].umasks[attr_idx]; } else { um = perf_get_ovfl_umask(idx); um += attr_idx - PERF_MAX_UMASKS; } return um; } #define PERF_ALLOC_EVENT_COUNT (512) #define PERF_ALLOC_UMASK_COUNT (1024) /* * clone static event table into a dynamic * event table * * Used for tracepoints */ static perf_event_t * perf_table_clone(void) { perf_event_t *addr; perf_pe_count = perf_nevents + PERF_ALLOC_EVENT_COUNT; addr = calloc(perf_pe_count, sizeof(perf_event_t)); if (addr) { memcpy(addr, perf_static_events, perf_nevents * sizeof(perf_event_t)); perf_pe_free = addr + perf_nevents; perf_pe_end = perf_pe_free + PERF_ALLOC_EVENT_COUNT; perf_pe = addr; } return addr; } static inline int perf_pe_allocated(void) { return perf_pe != perf_static_events; } /* * allocate space for one new event in event table * * returns NULL if out-of-memory * * may realloc 
existing table if necessary for growth */ static perf_event_t * perf_table_alloc_event(void) { perf_event_t *new_pe; perf_event_t *p; size_t num_free; /* * if we need to allocate an event and we have not yet * cloned the static events, then clone them */ if (!perf_pe_allocated()) { DPRINT("cloning static event table\n"); p = perf_table_clone(); if (!p) return NULL; perf_pe = p; } retry: if (perf_pe_free < perf_pe_end) return perf_pe_free++; perf_pe_count += PERF_ALLOC_EVENT_COUNT; /* * compute number of free events left * before realloc() to avoid compiler warning (use-after-free) * even though we are simply doing pointer arithmetic and not * dereferencing the perf_pe after realloc when it may be stale * in case the memory was moved. */ num_free = perf_pe_free - perf_pe; new_pe = realloc(perf_pe, perf_pe_count * sizeof(perf_event_t)); if (!new_pe) return NULL; perf_pe_free = new_pe + num_free; perf_pe_end = perf_pe_free + PERF_ALLOC_EVENT_COUNT; perf_pe = new_pe; goto retry; } #ifndef CONFIG_PFMLIB_NOTRACEPOINT static int perf_um_count; static char debugfs_mnt[MAXPATHLEN]; static inline unsigned long perf_get_ovfl_umask_idx(perf_umask_t *um) { return um - perf_um; } /* * figure out the mount point of the debugfs filesystem * * returns -1 if none is found */ static int get_debugfs_mnt(void) { FILE *fp; char *buffer = NULL; size_t len = 0; char *q, *mnt, *fs; int res = -1; fp = fopen("/proc/mounts", "r"); if (!fp) return -1; while(pfmlib_getl(&buffer, &len, fp) != -1) { q = strchr(buffer, ' '); if (!q) continue; mnt = ++q; q = strchr(q, ' '); if (!q) continue; *q = '\0'; fs = ++q; q = strchr(q, ' '); if (!q) continue; *q = '\0'; if (!strcmp(fs, "debugfs")) { strncpy(debugfs_mnt, mnt, MAXPATHLEN); debugfs_mnt[MAXPATHLEN-1]= '\0'; res = 0; break; } } free(buffer); fclose(fp); return res; } /* * allocate space for overflow new unit masks * * Each event can hold up to PERF_MAX_UMASKS. 
* But given we can dynamically add events * which may have more unit masks, then we * put them into a separate overflow unit * masks table, which can grow on demand. * In that case the first PERF_MAX_UMASKS * are in the event, the rest in the overflow * table at index pointed to by event->umask_ovfl_idx * All unit masks for an event are contiguous in the * overflow table. */ static perf_umask_t * perf_table_alloc_umask(void) { perf_umask_t *new_um; size_t num_free; retry: if (perf_um_free < perf_um_end) return perf_um_free++; perf_um_count += PERF_ALLOC_UMASK_COUNT; /* * compute number of free umasks left * before realloc() to avoid compiler warning (use-after-free) * even though we are simply doing pointer arithmetic and not * dereferencing the perf_um after realloc when it may be stale * in case the memory was moved. */ num_free = perf_um_free - perf_um; new_um = realloc(perf_um, perf_um_count * sizeof(*new_um)); if (!new_um) return NULL; perf_um_free = new_um + num_free; perf_um_end = perf_um_free + PERF_ALLOC_UMASK_COUNT; perf_um = new_um; goto retry; } #ifdef __GNUC__ #define POTENTIALLY_UNUSED __attribute__((unused)) #endif static void gen_tracepoint_table(void) { DIR *dir1, *dir2; struct dirent *d1, *d2; perf_event_t *p = NULL; perf_umask_t *um; char POTENTIALLY_UNUSED d2path[MAXPATHLEN]; char idpath[MAXPATHLEN]; char id_str[32]; uint64_t id; int fd, err; int POTENTIALLY_UNUSED dir1_fd; int POTENTIALLY_UNUSED dir2_fd; int reuse_event = 0; int numasks; char *tracepoint_name; int retlen; err = get_debugfs_mnt(); if (err == -1) return; strncat(debugfs_mnt, "/tracing/events", MAXPATHLEN-1); debugfs_mnt[MAXPATHLEN-1]= '\0'; #ifdef HAS_OPENAT dir1_fd = open(debugfs_mnt, O_DIRECTORY); if (dir1_fd < 0) return; dir1 = fdopendir(dir1_fd); #else dir1 = opendir(debugfs_mnt); if (!dir1) return; #endif err = 0; while((d1 = readdir(dir1)) && err >= 0) { if (!strcmp(d1->d_name, ".")) continue; if (!strcmp(d1->d_name, "..")) continue; #ifdef HAS_OPENAT /* fails if it cannot
open */ dir2_fd = openat(dir1_fd, d1->d_name, O_DIRECTORY); if (dir2_fd < 0) continue; dir2 = fdopendir(dir2_fd); if (!dir2) continue; #else retlen = snprintf(d2path, MAXPATHLEN, "%s/%s", debugfs_mnt, d1->d_name); /* ensure generated d2path string is valid */ if (retlen <= 0 || MAXPATHLEN <= retlen) continue; /* fails if d2path is not a directory */ dir2 = opendir(d2path); if (!dir2) continue; #endif dir2_fd = dirfd(dir2); /* * if a subdir did not fit our expected * tracepoint format, then we reuse the * allocated space (we have no free) */ if (!reuse_event) p = perf_table_alloc_event(); if (!p) break; if (p) p->name = tracepoint_name = strdup(d1->d_name); if (!(p && p->name)) { closedir(dir2); err = -1; continue; } p->desc = "tracepoint"; p->id = ~0ULL; p->type = PERF_TYPE_TRACEPOINT; p->umask_ovfl_idx = PERF_INVAL_OVFL_IDX; p->modmsk = 0, p->ngrp = 1; numasks = 0; reuse_event = 0; while((d2 = readdir(dir2))) { if (!strcmp(d2->d_name, ".")) continue; if (!strcmp(d2->d_name, "..")) continue; #ifdef HAS_OPENAT retlen = snprintf(idpath, MAXPATHLEN, "%s/id", d2->d_name); /* ensure generated idpath string is valid */ if (retlen <= 0 || MAXPATHLEN <= retlen) continue; fd = openat(dir2_fd, idpath, O_RDONLY); #else retlen = snprintf(idpath, MAXPATHLEN, "%s/%s/id", d2path, d2->d_name); /* ensure generated idpath string is valid */ if (retlen <= 0 || MAXPATHLEN <= retlen) continue; fd = open(idpath, O_RDONLY); #endif if (fd == -1) continue; err = read(fd, id_str, sizeof(id_str)); close(fd); if (err < 0) continue; id = strtoull(id_str, NULL, 0); if (numasks < PERF_MAX_UMASKS) um = p->umasks+numasks; else { um = perf_table_alloc_umask(); if (numasks == PERF_MAX_UMASKS) p->umask_ovfl_idx = perf_get_ovfl_umask_idx(um); } if (!um) { err = -1; break; } /* * tracepoints have no event codes * the code is in the unit masks */ p->id = 0; um->uname = strdup(d2->d_name); if (!um->uname) { err = -1; break; } um->udesc = um->uname; um->uid = id; um->grpid = 0; DPRINT("idpath=%s:%s
id=%"PRIu64"\n", p->name, um->uname, id); numasks++; } p->numasks = numasks; closedir(dir2); /* * directory was not pointing * to a tree structure we know about */ if (!numasks) { free(tracepoint_name); reuse_event = 1; continue; } /* * update total number of events * only when no error is reported */ if (err >= 0) perf_nevents++; reuse_event = 0; } closedir(dir1); } #endif /* CONFIG_PFMLIB_NOTRACEPOINT */ static int pfm_perf_detect(void *this) { #ifdef __linux__ /* ought to find a better way of detecting PERF */ #define PERF_OLD_PROC_FILE "/proc/sys/kernel/perf_counter_paranoid" #define PERF_PROC_FILE "/proc/sys/kernel/perf_event_paranoid" return !(access(PERF_PROC_FILE, F_OK) && access(PERF_OLD_PROC_FILE, F_OK)) ? PFM_SUCCESS: PFM_ERR_NOTSUPP; #else return PFM_SUCCESS; #endif } /* * checks that the event is exported by the PMU specified by the event entry * This code assumes the PMU type is RAW, which requires an encoding exported * via sysfs. */ static int event_exist(perf_event_t *e) { char buf[PATH_MAX]; snprintf(buf, PATH_MAX, "%s/%s/events/%s", SYSFS_PMU_DEVICES_DIR, e->pmu ? 
e->pmu : "cpu", e->name); return access(buf, F_OK) == 0; } static void add_optional_events(void) { perf_event_t *ent, *e; size_t i; for (i = 0; i < PME_PERF_EVENT_OPT_COUNT; i++) { e = perf_optional_events + i; if (!event_exist(e)) { DPRINT("perf::%s not available\n", e->name); continue; } ent = perf_table_alloc_event(); if (!ent) break; memcpy(ent, e, sizeof(*e)); perf_nevents++; } } static int pfm_perf_init(void *this) { pfmlib_pmu_t *pmu = this; perf_pe = perf_static_events; /* * we force the value of pme_count by hand because * the library could be initialized multiple times * due to pfm_terminate() and thus we need to start * from the default count */ perf_event_support.pme_count = PME_PERF_EVENT_COUNT; #ifndef CONFIG_PFMLIB_NOTRACEPOINT /* must dynamically add tracepoints */ gen_tracepoint_table(); #endif /* must dynamically add optional hw events */ add_optional_events(); /* dynamically patch supported plm based on CORE PMU plm */ pmu->supported_plm = pfm_perf_pmu_supported_plm(pmu); return PFM_SUCCESS; } static int pfm_perf_get_event_first(void *this) { return 0; } static int pfm_perf_get_event_next(void *this, int idx) { if (idx < 0 || idx >= (perf_nevents-1)) return -1; return idx+1; } static int pfm_perf_add_defaults(pfmlib_event_desc_t *e, unsigned int msk, uint64_t *umask) { perf_event_t *ent; perf_umask_t *um; int i, j, k, added; k = e->nattrs; ent = perf_pe+e->event; for(i=0; msk; msk >>=1, i++) { if (!(msk & 0x1)) continue; added = 0; for(j=0; j < ent->numasks; j++) { if (j < PERF_MAX_UMASKS) { um = &perf_pe[e->event].umasks[j]; } else { um = perf_get_ovfl_umask(e->event); um += j - PERF_MAX_UMASKS; } if (um->grpid != i) continue; if (um->uflags & PERF_FL_DEFAULT) { DPRINT("added default %s for group %d\n", um->uname, i); *umask |= um->uid; e->attrs[k].id = j; e->attrs[k].ival = 0; k++; added++; } } if (!added) { DPRINT("no default found for event %s unit mask group %d\n", ent->name, i); return PFM_ERR_UMASK; } } e->nattrs = k; return PFM_SUCCESS;
} static int pfmlib_perf_encode_tp(pfmlib_event_desc_t *e) { perf_umask_t *um; pfmlib_event_attr_info_t *a; int i, nu = 0; e->fstr[0] = '\0'; e->count = 1; evt_strcat(e->fstr, "%s", perf_pe[e->event].name); /* * look for tracepoints */ for(i=0; i < e->nattrs; i++) { a = attr(e, i); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) { /* * tracepoint unit masks cannot be combined */ if (++nu > 1) return PFM_ERR_FEATCOMB; if (a->idx < PERF_MAX_UMASKS) { e->codes[0] = perf_pe[e->event].umasks[a->idx].uid; evt_strcat(e->fstr, ":%s", perf_pe[e->event].umasks[a->idx].uname); } else { um = perf_get_ovfl_umask(e->event); e->codes[0] = um[a->idx - PERF_MAX_UMASKS].uid; evt_strcat(e->fstr, ":%s", um[a->idx - PERF_MAX_UMASKS].uname); } } else return PFM_ERR_ATTR; } return PFM_SUCCESS; } static int pfmlib_perf_encode_hw_cache(pfmlib_event_desc_t *e) { pfmlib_event_attr_info_t *a; perf_event_t *ent; unsigned int msk, grpmsk; uint64_t umask = 0; int i, ret; grpmsk = (1 << perf_pe[e->event].ngrp)-1; ent = perf_pe + e->event; e->codes[0] = ent->id; e->count = 1; e->fstr[0] = '\0'; for(i=0; i < e->nattrs; i++) { a = attr(e, i); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) { e->codes[0] |= ent->umasks[a->idx].uid; msk = 1 << ent->umasks[a->idx].grpid; /* umask cannot be combined in each group */ if ((grpmsk & msk) == 0) return PFM_ERR_UMASK; grpmsk &= ~msk; } else return PFM_ERR_ATTR; /* no mod, no raw umask */ } /* check for missing default umasks */ if (grpmsk) { ret = pfm_perf_add_defaults(e, grpmsk, &umask); if (ret != PFM_SUCCESS) return ret; e->codes[0] |= umask; } /* * reorder all the attributes such that the fstr appears always * the same regardless of how the attributes were submitted. 
* * cannot sort attr until after we have added the default umasks */ evt_strcat(e->fstr, "%s", ent->name); pfmlib_sort_attr(e); for(i=0; i < e->nattrs; i++) { a = attr(e, i); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) evt_strcat(e->fstr, ":%s", ent->umasks[a->idx].uname); } return PFM_SUCCESS; } static int pfm_perf_get_encoding(void *this, pfmlib_event_desc_t *e) { int ret; switch(perf_pe[e->event].type) { case PERF_TYPE_TRACEPOINT: ret = pfmlib_perf_encode_tp(e); break; case PERF_TYPE_HW_CACHE: ret = pfmlib_perf_encode_hw_cache(e); break; case PERF_TYPE_HARDWARE: case PERF_TYPE_SOFTWARE: case PERF_TYPE_RAW: ret = PFM_SUCCESS; e->codes[0] = perf_pe[e->event].id; e->count = 1; e->fstr[0] = '\0'; evt_strcat(e->fstr, "%s", perf_pe[e->event].name); break; default: DPRINT("unsupported event type=%d\n", perf_pe[e->event].type); return PFM_ERR_NOTSUPP; } return ret; } static int pfm_perf_get_perf_encoding(void *this, pfmlib_event_desc_t *e) { struct perf_event_attr *attr; int ret; switch(perf_pe[e->event].type) { case PERF_TYPE_TRACEPOINT: ret = pfmlib_perf_encode_tp(e); break; case PERF_TYPE_HW_CACHE: ret = pfmlib_perf_encode_hw_cache(e); break; case PERF_TYPE_HARDWARE: case PERF_TYPE_SOFTWARE: case PERF_TYPE_RAW: ret = PFM_SUCCESS; e->codes[0] = perf_pe[e->event].id; e->count = 1; e->fstr[0] = '\0'; evt_strcat(e->fstr, "%s", perf_pe[e->event].name); break; default: DPRINT("unsupported event type=%d\n", perf_pe[e->event].type); return PFM_ERR_NOTSUPP; } attr = e->os_data; attr->type = perf_pe[e->event].type; attr->config = e->codes[0]; return ret; } static int pfm_perf_event_is_valid(void *this, int idx) { return idx >= 0 && idx < perf_nevents; } static int pfm_perf_get_event_attr_info(void *this, int idx, int attr_idx, pfmlib_event_attr_info_t *info) { perf_umask_t *um; /* only supports umasks, modifiers handled at OS layer */ um = perf_attridx2um(idx, attr_idx); info->name = um->uname; info->desc = um->udesc; info->equiv= NULL; 
info->code = um->uid; info->type = PFM_ATTR_UMASK; info->ctrl = PFM_ATTR_CTRL_PMU; info->is_precise = !!(um->uflags & PERF_FL_PRECISE); info->support_hw_smpl = info->is_precise; info->is_dfl = 0; info->idx = attr_idx; info->dfl_val64 = 0; return PFM_SUCCESS; } static int pfm_perf_get_event_info(void *this, int idx, pfm_event_info_t *info) { pfmlib_pmu_t *pmu = this; info->name = perf_pe[idx].name; info->desc = perf_pe[idx].desc; info->code = perf_pe[idx].id; info->equiv = perf_pe[idx].equiv; info->idx = idx; info->pmu = pmu->pmu; info->is_precise = !!(perf_pe[idx].flags & PERF_FL_PRECISE); info->support_hw_smpl = info->is_precise; /* unit masks + modifiers */ info->nattrs = perf_pe[idx].numasks; return PFM_SUCCESS; } static void pfm_perf_terminate(void *this) { perf_event_t *p; int i, j; /* if perf_pe not allocated then perf_um not allocated */ if (!perf_pe_allocated()) return; /* * free tracepoints name + unit mask names * which are dynamically allocated */ for (i = 0; i < perf_nevents; i++) { p = &perf_pe[i]; if (p->type != PERF_TYPE_TRACEPOINT) continue; /* cast to keep compiler happy, we are * freeing the dynamically allocated clone * table, not the static one. 
We do not want * to create a specific data type */ free((void *)p->name); /* * first PERF_MAX_UMASKS are pre-allocated * the rest is in a separate dynamic table */ for (j = 0; j < p->numasks; j++) { if (j == PERF_MAX_UMASKS) break; free((void *)p->umasks[j].uname); } } /* * perf_pe is systematically allocated */ if (perf_pe_allocated()) { free(perf_pe); perf_pe = perf_pe_free = perf_pe_end = NULL; } if (perf_um) { int n; /* * free the dynamic umasks' uname */ n = perf_um_free - perf_um; for(i=0; i < n; i++) free((void *)(perf_um[i].uname)); free(perf_um); perf_um = NULL; perf_um_free = perf_um_end = NULL; } } static int pfm_perf_validate_table(void *this, FILE *fp) { const char *name = perf_event_support.name; perf_umask_t *um; int i, j; int error = 0; for(i=0; i < perf_event_support.pme_count; i++) { if (!perf_pe[i].name) { fprintf(fp, "pmu: %s event%d: :: no name (prev event was %s)\n", name, i, i > 1 ? perf_pe[i-1].name : "??"); error++; } if (!perf_pe[i].desc) { fprintf(fp, "pmu: %s event%d: %s :: no description\n", name, i, perf_pe[i].name); error++; } if (perf_pe[i].type < PERF_TYPE_HARDWARE || perf_pe[i].type >= PERF_TYPE_MAX) { fprintf(fp, "pmu: %s event%d: %s :: invalid type\n", name, i, perf_pe[i].name); error++; } if (perf_pe[i].numasks > PERF_MAX_UMASKS && perf_pe[i].umask_ovfl_idx == PERF_INVAL_OVFL_IDX) { fprintf(fp, "pmu: %s event%d: %s :: numasks too big (<%d)\n", name, i, perf_pe[i].name, PERF_MAX_UMASKS); error++; } if (perf_pe[i].numasks < PERF_MAX_UMASKS && perf_pe[i].umask_ovfl_idx != PERF_INVAL_OVFL_IDX) { fprintf(fp, "pmu: %s event%d: %s :: overflow umask idx defined but not needed (<%d)\n", name, i, perf_pe[i].name, PERF_MAX_UMASKS); error++; } if (perf_pe[i].numasks && perf_pe[i].ngrp == 0) { fprintf(fp, "pmu: %s event%d: %s :: ngrp cannot be zero\n", name, i, perf_pe[i].name); error++; } if (perf_pe[i].numasks == 0 && perf_pe[i].ngrp) { fprintf(fp, "pmu: %s event%d: %s :: ngrp must be zero\n", name, i, perf_pe[i].name); error++; } for(j = 
0; j < perf_pe[i].numasks; j++) { if (j < PERF_MAX_UMASKS){ um = perf_pe[i].umasks+j; } else { um = perf_get_ovfl_umask(i); um += j - PERF_MAX_UMASKS; } if (!um->uname) { fprintf(fp, "pmu: %s event%d: %s umask%d :: no name\n", name, i, perf_pe[i].name, j); error++; } if (!um->udesc) { fprintf(fp, "pmu: %s event%d:%s umask%d: %s :: no description\n", name, i, perf_pe[i].name, j, um->uname); error++; } if (perf_pe[i].ngrp && um->grpid >= perf_pe[i].ngrp) { fprintf(fp, "pmu: %s event%d: %s umask%d: %s :: invalid grpid %d (must be < %d)\n", name, i, perf_pe[i].name, j, um->uname, um->grpid, perf_pe[i].ngrp); error++; } } /* check for excess unit masks */ for(; j < PERF_MAX_UMASKS; j++) { if (perf_pe[i].umasks[j].uname || perf_pe[i].umasks[j].udesc) { fprintf(fp, "pmu: %s event%d: %s :: numasks (%d) invalid more events exists\n", name, i, perf_pe[i].name, perf_pe[i].numasks); error++; } } } return error ? PFM_ERR_INVAL : PFM_SUCCESS; } static unsigned int pfm_perf_get_event_nattrs(void *this, int idx) { return perf_pe[idx].numasks; } /* * this function tries to figure out what the underlying core PMU * priv level masks are. It looks for a TYPE_CORE PMU and uses the * first event to determine supported priv level masks. 
*/ /* * remove attrs which are in conflicts (or duplicated) with os layer */ static void pfm_perf_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e) { pfmlib_pmu_t *pmu = this; int i, compact, type; int plm = pmu->supported_plm; for (i = 0; i < e->npattrs; i++) { compact = 0; /* umasks never conflict */ if (e->pattrs[i].type == PFM_ATTR_UMASK) continue; if (e->pattrs[i].ctrl != PFM_ATTR_CTRL_PERF_EVENT) continue; /* * only PERF_TYPE_HARDWARE/HW_CACHE may have * precise mode or hypervisor mode * * there is no way to know for sure for those events * so we allow the modifiers and leave it to the kernel * to decide */ type = perf_pe[e->event].type; if (type == PERF_TYPE_HARDWARE || type == PERF_TYPE_HW_CACHE) { /* no hypervisor mode */ if (e->pattrs[i].idx == PERF_ATTR_H && !(plm & PFM_PLMH)) compact = 1; /* no user mode */ if (e->pattrs[i].idx == PERF_ATTR_U && !(plm & PFM_PLM3)) compact = 1; /* no kernel mode */ if (e->pattrs[i].idx == PERF_ATTR_K && !(plm & PFM_PLM0)) compact = 1; } else { if (e->pattrs[i].idx == PERF_ATTR_PR) compact = 1; /* no hypervisor mode */ if (e->pattrs[i].idx == PERF_ATTR_H) compact = 1; } /* hardware sampling not supported */ if (e->pattrs[i].idx == PERF_ATTR_HWS) compact = 1; if (compact) { pfmlib_compact_pattrs(e, i); i--; } } } pfmlib_pmu_t perf_event_support={ .desc = "perf_events generic PMU", .name = "perf", .pmu = PFM_PMU_PERF_EVENT, .pme_count = PME_PERF_EVENT_COUNT, .type = PFM_PMU_TYPE_OS_GENERIC, .max_encoding = 1, .supported_plm = PERF_PLM_ALL, .pmu_detect = pfm_perf_detect, .pmu_init = pfm_perf_init, .pmu_terminate = pfm_perf_terminate, .get_event_encoding[PFM_OS_NONE] = pfm_perf_get_encoding, PFMLIB_ENCODE_PERF(pfm_perf_get_perf_encoding), .get_event_first = pfm_perf_get_event_first, .get_event_next = pfm_perf_get_event_next, .event_is_valid = pfm_perf_event_is_valid, .get_event_info = pfm_perf_get_event_info, .get_event_attr_info = pfm_perf_get_event_attr_info, .validate_table = pfm_perf_validate_table, 
.get_event_nattrs = pfm_perf_get_event_nattrs, PFMLIB_VALID_PERF_PATTRS(pfm_perf_perf_validate_pattrs), }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_perf_event_priv.h000066400000000000000000000051611502707512200234340ustar00rootroot00000000000000/* * pfmlib_perf_events_priv.h: perf_event public attributes * * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #ifndef __PERF_EVENT_PRIV_H__ #define __PERF_EVENT_PRIV_H__ #include "pfmlib_priv.h" #include "perfmon/pfmlib_perf_event.h" #define PERF_ATTR_U 0 /* monitor at user privilege levels */ #define PERF_ATTR_K 1 /* monitor at kernel privilege levels */ #define PERF_ATTR_H 2 /* monitor at hypervisor levels */ #define PERF_ATTR_PE 3 /* sampling period */ #define PERF_ATTR_FR 4 /* average target sampling rate */ #define PERF_ATTR_PR 5 /* precise sampling mode */ #define PERF_ATTR_EX 6 /* exclusive event */ #define PERF_ATTR_MG 7 /* monitor guest execution */ #define PERF_ATTR_MH 8 /* monitor host execution */ #define PERF_ATTR_CPU 9 /* CPU to program */ #define PERF_ATTR_PIN 10 /* pin event to CPU */ #define PERF_ATTR_HWS 11 /* hardware sampling */ #define _PERF_ATTR_U (1 << PERF_ATTR_U) #define _PERF_ATTR_K (1 << PERF_ATTR_K) #define _PERF_ATTR_H (1 << PERF_ATTR_H) #define _PERF_ATTR_PE (1 << PERF_ATTR_PE) #define _PERF_ATTR_FR (1 << PERF_ATTR_FR) #define _PERF_ATTR_PR (1 << PERF_ATTR_PR) #define _PERF_ATTR_EX (1 << PERF_ATTR_EX) #define _PERF_ATTR_MG (1 << PERF_ATTR_MG) #define _PERF_ATTR_MH (1 << PERF_ATTR_MH) #define _PERF_ATTR_CPU (1 << PERF_ATTR_CPU) #define _PERF_ATTR_PIN (1 << PERF_ATTR_PIN) #define _PERF_ATTR_HWS (1 << PERF_ATTR_HWS) #define PERF_PLM_ALL (PFM_PLM0|PFM_PLM3|PFM_PLMH) extern int pfm_perf_find_pmu_type(void *this, int *type); #define SYSFS_PMU_DEVICES_DIR "/sys/bus/event_source/devices" #endif papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_perf_event_raw.c000066400000000000000000000111101502707512200232270ustar00rootroot00000000000000/* * pfmlib_perf_events_raw.c: support for raw event syntax * * Copyright (c) 2014 Google, Inc. 
All rights reserved * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include #include #include #include #include "pfmlib_priv.h" #include "pfmlib_perf_event_priv.h" static int pfm_perf_raw_detect(void *this) { #ifdef __linux__ /* ought to find a better way of detecting PERF */ #define PERF_OLD_PROC_FILE "/proc/sys/kernel/perf_counter_paranoid" #define PERF_PROC_FILE "/proc/sys/kernel/perf_event_paranoid" return !(access(PERF_PROC_FILE, F_OK) && access(PERF_OLD_PROC_FILE, F_OK)) ? 
PFM_SUCCESS: PFM_ERR_NOTSUPP; #else return PFM_SUCCESS; #endif } static int pfm_perf_raw_get_event_first(void *this) { return 0; } static int pfm_perf_raw_get_event_next(void *this, int idx) { /* only one pseudo event */ return -1; } static int pfm_perf_raw_get_encoding(void *this, pfmlib_event_desc_t *e) { /* * actual encoding done in pfm_perf_raw_match_event() */ e->fstr[0] = '\0'; evt_strcat(e->fstr, "r%"PRIx64, e->codes[0]); return PFM_SUCCESS; } static int pfm_perf_raw_get_perf_encoding(void *this, pfmlib_event_desc_t *e) { struct perf_event_attr *attr; attr = e->os_data; attr->type = PERF_TYPE_RAW; attr->config = e->codes[0]; attr->config1 = e->codes[1]; attr->config2 = e->codes[2]; return PFM_SUCCESS; } static int pfm_perf_raw_event_is_valid(void *this, int idx) { return idx == 0; } static int pfm_perf_raw_get_event_attr_info(void *this, int idx, int attr_idx, pfmlib_event_attr_info_t *info) { return PFM_ERR_ATTR; } static int pfm_perf_raw_get_event_info(void *this, int idx, pfm_event_info_t *info) { pfmlib_pmu_t *pmu = this; info->name = "r0000"; info->desc = "perf_events raw event syntax: r[0-9a-fA-F]+"; info->code = 0; info->equiv = NULL; info->idx = 0; info->pmu = pmu->pmu; info->is_precise = 0; info->support_hw_smpl = 0; /* unit masks + modifiers */ info->nattrs = 0; return PFM_SUCCESS; } static unsigned int pfm_perf_raw_get_event_nattrs(void *this, int idx) { return 0; } /* * remove attrs which are in conflict with (or duplicated by) the os layer */ static void pfm_perf_raw_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e) { } /* * returns 0 if match (like strcmp()) */ static int pfm_perf_raw_match_event(void *this, pfmlib_event_desc_t *d, const char *e, const char *s) { uint64_t code; char *endptr = NULL; if (*s != 'r' || !isxdigit(*(s+1))) return 1; code = strtoull(s+1, &endptr, 16); if (code == ULLONG_MAX || errno == ERANGE || (endptr && *endptr)) return 1; /* * stash code in final position */ d->codes[0] = code; d->count = 1; return 0; }
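The `r<hex>` syntax accepted by `pfm_perf_raw_match_event` above reduces to a check-and-convert step: require a leading `r` followed by at least one hex digit, convert with `strtoull`, and reject overflow or trailing characters. A minimal standalone sketch of that parsing logic (the function name `match_raw_event` is made up for illustration; the return convention follows the original, 0 on match):

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* returns 0 if s is a valid raw event "r[0-9a-fA-F]+", stashing the code */
static int match_raw_event(const char *s, uint64_t *code)
{
	char *endptr = NULL;
	uint64_t c;

	/* must start with 'r' and have at least one hex digit */
	if (*s != 'r' || !isxdigit((unsigned char)s[1]))
		return 1;

	errno = 0;
	c = strtoull(s + 1, &endptr, 16);
	/* reject overflow or trailing non-hex characters */
	if (c == ULLONG_MAX || errno == ERANGE || (endptr && *endptr))
		return 1;

	*code = c;
	return 0;
}
```

For example, `match_raw_event("r1a2b", &c)` matches and yields the raw code `0x1a2b`, which the raw PMU support then places into `perf_event_attr.config` with `attr.type = PERF_TYPE_RAW`.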
pfmlib_pmu_t perf_event_raw_support={
	.desc			= "perf_events raw PMU",
	.name			= "perf_raw",
	.pmu			= PFM_PMU_PERF_EVENT_RAW,
	.pme_count		= 1,
	.type			= PFM_PMU_TYPE_OS_GENERIC,
	.max_encoding		= 1,
	.supported_plm		= PERF_PLM_ALL,
	.pmu_detect		= pfm_perf_raw_detect,
	.get_event_encoding[PFM_OS_NONE] = pfm_perf_raw_get_encoding,
	 PFMLIB_ENCODE_PERF(pfm_perf_raw_get_perf_encoding),
	.get_event_first	= pfm_perf_raw_get_event_first,
	.get_event_next		= pfm_perf_raw_get_event_next,
	.event_is_valid		= pfm_perf_raw_event_is_valid,
	.get_event_info		= pfm_perf_raw_get_event_info,
	.get_event_attr_info	= pfm_perf_raw_get_event_attr_info,
	.get_event_nattrs	= pfm_perf_raw_get_event_nattrs,
	.match_event		= pfm_perf_raw_match_event,
	 PFMLIB_VALID_PERF_PATTRS(pfm_perf_raw_perf_validate_pattrs),
};

/* ===== src/libpfm4/lib/pfmlib_power10.c ===== */

/*
 * pfmlib_power10.c : IBM Power10 support
 *
 * Copyright (C) IBM Corporation, 2020. All rights reserved.
 * Contributed by Will Schmidt (will_schmidt@vnet.ibm.com)
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_power_priv.h" #include "events/power10_events.h" static int pfm_power10_detect(void* this) { if (__is_processor(PV_POWER10)) return PFM_SUCCESS; return PFM_ERR_NOTSUPP; } pfmlib_pmu_t power10_support={ .desc = "POWER10", .name = "power10", .pmu = PFM_PMU_POWER10, .pme_count = LIBPFM_ARRAY_SIZE(power10_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = POWER10_PLM, .num_cntrs = 4, .num_fixed_cntrs = 2, .max_encoding = 1, .pe = power10_pe, .pmu_detect = pfm_power10_detect, .get_event_encoding[PFM_OS_NONE] = pfm_gen_powerpc_get_encoding, PFMLIB_ENCODE_PERF(pfm_gen_powerpc_get_perf_encoding), PFMLIB_VALID_PERF_PATTRS(pfm_gen_powerpc_perf_validate_pattrs), .get_event_first = pfm_gen_powerpc_get_event_first, .get_event_next = pfm_gen_powerpc_get_event_next, .event_is_valid = pfm_gen_powerpc_event_is_valid, .validate_table = pfm_gen_powerpc_validate_table, .get_event_info = pfm_gen_powerpc_get_event_info, .get_event_attr_info = pfm_gen_powerpc_get_event_attr_info, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_power4.c000066400000000000000000000043601502707512200214520ustar00rootroot00000000000000/* * pfmlib_power4.c : IBM Power4 support * * Copyright (C) IBM Corporation, 2009. All rights reserved. 
* Contributed by Corey Ashford (cjashfor@us.ibm.com) * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_power_priv.h" #include "events/power4_events.h" static int pfm_power4_detect(void* this) { if (__is_processor(PV_POWER4) || __is_processor(PV_POWER4p)) return PFM_SUCCESS; return PFM_ERR_NOTSUPP; } pfmlib_pmu_t power4_support={ .desc = "POWER4", .name = "power4", .pmu = PFM_PMU_POWER4, .pme_count = LIBPFM_ARRAY_SIZE(power4_pe), .type = PFM_PMU_TYPE_CORE, .num_cntrs = 8, .max_encoding = 1, .pe = power4_pe, .pmu_detect = pfm_power4_detect, .get_event_encoding[PFM_OS_NONE] = pfm_gen_powerpc_get_encoding, PFMLIB_ENCODE_PERF(pfm_gen_powerpc_get_perf_encoding), PFMLIB_VALID_PERF_PATTRS(pfm_gen_powerpc_perf_validate_pattrs), .get_event_first = pfm_gen_powerpc_get_event_first, .get_event_next = pfm_gen_powerpc_get_event_next, .event_is_valid = pfm_gen_powerpc_event_is_valid, .validate_table = pfm_gen_powerpc_validate_table, .get_event_info = pfm_gen_powerpc_get_event_info, .get_event_attr_info = pfm_gen_powerpc_get_event_attr_info, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_power5.c000066400000000000000000000062661502707512200214620ustar00rootroot00000000000000/* * pfmlib_power5.c : IBM Power5 support * * Copyright (C) IBM Corporation, 2009. All rights reserved. * Contributed by Corey Ashford (cjashfor@us.ibm.com) * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_power_priv.h" #include "events/power5_events.h" #include "events/power5+_events.h" static int pfm_power5_detect(void* this) { if (__is_processor(PV_POWER5)) return PFM_SUCCESS; return PFM_ERR_NOTSUPP; } static int pfm_power5p_detect(void* this) { if (__is_processor(PV_POWER5p)) return PFM_SUCCESS; return PFM_ERR_NOTSUPP; } pfmlib_pmu_t power5_support={ .desc = "POWER5", .name = "power5", .pmu = PFM_PMU_POWER5, .pme_count = LIBPFM_ARRAY_SIZE(power5_pe), .type = PFM_PMU_TYPE_CORE, .num_cntrs = 4, .num_fixed_cntrs = 2, .max_encoding = 1, .pe = power5_pe, .pmu_detect = pfm_power5_detect, .get_event_encoding[PFM_OS_NONE] = pfm_gen_powerpc_get_encoding, PFMLIB_ENCODE_PERF(pfm_gen_powerpc_get_perf_encoding), PFMLIB_VALID_PERF_PATTRS(pfm_gen_powerpc_perf_validate_pattrs), .get_event_first = pfm_gen_powerpc_get_event_first, .get_event_next = pfm_gen_powerpc_get_event_next, .event_is_valid = pfm_gen_powerpc_event_is_valid, .validate_table = pfm_gen_powerpc_validate_table, .get_event_info = pfm_gen_powerpc_get_event_info, .get_event_attr_info = pfm_gen_powerpc_get_event_attr_info, }; pfmlib_pmu_t power5p_support={ .desc = "POWER5+", .name = "power5p", .pmu = PFM_PMU_POWER5p, .pme_count = LIBPFM_ARRAY_SIZE(power5p_pe), .type = PFM_PMU_TYPE_CORE, .num_cntrs = 4, .num_fixed_cntrs = 2, .max_encoding = 1, .pe = power5p_pe, .pmu_detect = pfm_power5p_detect, .get_event_encoding[PFM_OS_NONE] = pfm_gen_powerpc_get_encoding, 
PFMLIB_ENCODE_PERF(pfm_gen_powerpc_get_perf_encoding), PFMLIB_VALID_PERF_PATTRS(pfm_gen_powerpc_perf_validate_pattrs), .get_event_first = pfm_gen_powerpc_get_event_first, .get_event_next = pfm_gen_powerpc_get_event_next, .event_is_valid = pfm_gen_powerpc_event_is_valid, .validate_table = pfm_gen_powerpc_validate_table, .get_event_info = pfm_gen_powerpc_get_event_info, .get_event_attr_info = pfm_gen_powerpc_get_event_attr_info, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_power6.c000066400000000000000000000043501502707512200214530ustar00rootroot00000000000000/* * pfmlib_power6.c : IBM Power6 support * * Copyright (C) IBM Corporation, 2009. All rights reserved. * Contributed by Corey Ashford (cjashfor@us.ibm.com) * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_power_priv.h" #include "events/power6_events.h" static int pfm_power6_detect(void* this) { if (__is_processor(PV_POWER6)) return PFM_SUCCESS; return PFM_ERR_NOTSUPP; } pfmlib_pmu_t power6_support={ .desc = "POWER6", .name = "power6", .pmu = PFM_PMU_POWER6, .pme_count = LIBPFM_ARRAY_SIZE(power6_pe), .type = PFM_PMU_TYPE_CORE, .num_cntrs = 4, .num_fixed_cntrs = 2, .max_encoding = 1, .pe = power6_pe, .pmu_detect = pfm_power6_detect, .get_event_encoding[PFM_OS_NONE] = pfm_gen_powerpc_get_encoding, PFMLIB_ENCODE_PERF(pfm_gen_powerpc_get_perf_encoding), PFMLIB_VALID_PERF_PATTRS(pfm_gen_powerpc_perf_validate_pattrs), .get_event_first = pfm_gen_powerpc_get_event_first, .get_event_next = pfm_gen_powerpc_get_event_next, .event_is_valid = pfm_gen_powerpc_event_is_valid, .validate_table = pfm_gen_powerpc_validate_table, .get_event_info = pfm_gen_powerpc_get_event_info, .get_event_attr_info = pfm_gen_powerpc_get_event_attr_info, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_power7.c000066400000000000000000000044061502707512200214560ustar00rootroot00000000000000/* * pfmlib_power7.c : IBM Power7 support * * Copyright (C) IBM Corporation, 2009. All rights reserved. * Contributed by Corey Ashford (cjashfor@us.ibm.com) * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_power_priv.h" #include "events/power7_events.h" static int pfm_power7_detect(void* this) { if (__is_processor(PV_POWER7) || __is_processor(PV_POWER7p)) return PFM_SUCCESS; return PFM_ERR_NOTSUPP; } pfmlib_pmu_t power7_support={ .desc = "POWER7", .name = "power7", .pmu = PFM_PMU_POWER7, .pme_count = LIBPFM_ARRAY_SIZE(power7_pe), .type = PFM_PMU_TYPE_CORE, .num_cntrs = 4, .num_fixed_cntrs = 2, .max_encoding = 1, .pe = power7_pe, .pmu_detect = pfm_power7_detect, .get_event_encoding[PFM_OS_NONE] = pfm_gen_powerpc_get_encoding, PFMLIB_ENCODE_PERF(pfm_gen_powerpc_get_perf_encoding), PFMLIB_VALID_PERF_PATTRS(pfm_gen_powerpc_perf_validate_pattrs), .get_event_first = pfm_gen_powerpc_get_event_first, .get_event_next = pfm_gen_powerpc_get_event_next, .event_is_valid = pfm_gen_powerpc_event_is_valid, .validate_table = pfm_gen_powerpc_validate_table, .get_event_info = pfm_gen_powerpc_get_event_info, .get_event_attr_info = pfm_gen_powerpc_get_event_attr_info, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_power8.c000066400000000000000000000045071502707512200214610ustar00rootroot00000000000000/* * pfmlib_power8.c : IBM Power8 support * * Copyright (C) IBM Corporation, 2013-2016. All rights reserved. 
* Contributed by Carl Love (carll@us.ibm.com) * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_power_priv.h" #include "events/power8_events.h" static int pfm_power8_detect(void* this) { if (__is_processor(PV_POWER8) || __is_processor(PV_POWER8E) || __is_processor(PV_POWER8NVL)) return PFM_SUCCESS; return PFM_ERR_NOTSUPP; } pfmlib_pmu_t power8_support={ .desc = "POWER8", .name = "power8", .pmu = PFM_PMU_POWER8, .pme_count = LIBPFM_ARRAY_SIZE(power8_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = POWER8_PLM, .num_cntrs = 4, .num_fixed_cntrs = 2, .max_encoding = 1, .pe = power8_pe, .pmu_detect = pfm_power8_detect, .get_event_encoding[PFM_OS_NONE] = pfm_gen_powerpc_get_encoding, PFMLIB_ENCODE_PERF(pfm_gen_powerpc_get_perf_encoding), PFMLIB_VALID_PERF_PATTRS(pfm_gen_powerpc_perf_validate_pattrs), .get_event_first = pfm_gen_powerpc_get_event_first, .get_event_next = pfm_gen_powerpc_get_event_next, .event_is_valid = pfm_gen_powerpc_event_is_valid, .validate_table = pfm_gen_powerpc_validate_table, .get_event_info = pfm_gen_powerpc_get_event_info, .get_event_attr_info = pfm_gen_powerpc_get_event_attr_info, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_power9.c000066400000000000000000000044141502707512200214570ustar00rootroot00000000000000/* * pfmlib_power9.c : IBM Power9 support * * Copyright (C) IBM Corporation, 2017. All rights reserved. * Contributed by Will Schmidt (will_schmidt@vnet.ibm.com) * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_power_priv.h" #include "events/power9_events.h" static int pfm_power9_detect(void* this) { if (__is_processor(PV_POWER9)) return PFM_SUCCESS; return PFM_ERR_NOTSUPP; } pfmlib_pmu_t power9_support={ .desc = "POWER9", .name = "power9", .pmu = PFM_PMU_POWER9, .pme_count = LIBPFM_ARRAY_SIZE(power9_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = POWER9_PLM, .num_cntrs = 4, .num_fixed_cntrs = 2, .max_encoding = 1, .pe = power9_pe, .pmu_detect = pfm_power9_detect, .get_event_encoding[PFM_OS_NONE] = pfm_gen_powerpc_get_encoding, PFMLIB_ENCODE_PERF(pfm_gen_powerpc_get_perf_encoding), PFMLIB_VALID_PERF_PATTRS(pfm_gen_powerpc_perf_validate_pattrs), .get_event_first = pfm_gen_powerpc_get_event_first, .get_event_next = pfm_gen_powerpc_get_event_next, .event_is_valid = pfm_gen_powerpc_event_is_valid, .validate_table = pfm_gen_powerpc_validate_table, .get_event_info = pfm_gen_powerpc_get_event_info, .get_event_attr_info = pfm_gen_powerpc_get_event_attr_info, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_power_priv.h000066400000000000000000000074361502707512200224420ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ #ifndef __PFMLIB_POWER_PRIV_H__ #define __PFMLIB_POWER_PRIV_H__ /* * File: pfmlib_power_priv.h * CVS: * Author: Corey Ashford * cjashfor@us.ibm.com * Mods: * * * (C) Copyright IBM Corporation, 2009. All Rights Reserved. 
* Contributed by Corey Ashford * */ typedef struct { uint64_t pme_code; const char *pme_name; const char *pme_short_desc; const char *pme_long_desc; } pme_power_entry_t; typedef struct { const char *pme_name; const char *pme_desc; unsigned pme_code; uint64_t pme_modmsk; } pme_torrent_entry_t; /* Attribute "type "for PowerBus MCD events */ #define TORRENT_ATTR_MCD_TYPE 0 /* Attribute "sel" for PowerBus bus utilization events */ #define TORRENT_ATTR_UTIL_SEL 1 /* Attribute "lo_cmp" for PowerBus utilization events */ #define TORRENT_ATTR_UTIL_LO_CMP 2 /* Attribute "hi_cmp" for PowerBus utilization events */ #define TORRENT_ATTR_UTIL_HI_CMP 3 #define _TORRENT_ATTR_MCD_TYPE (1 << TORRENT_ATTR_MCD_TYPE) #define _TORRENT_ATTR_MCD (_TORRENT_ATTR_MCD_TYPE) #define _TORRENT_ATTR_UTIL_SEL (1 << TORRENT_ATTR_UTIL_SEL) #define _TORRENT_ATTR_UTIL_LO_CMP (1 << TORRENT_ATTR_UTIL_LO_CMP) #define _TORRENT_ATTR_UTIL_HI_CMP (1 << TORRENT_ATTR_UTIL_HI_CMP) #define _TORRENT_ATTR_UTIL_LO (_TORRENT_ATTR_UTIL_SEL | \ _TORRENT_ATTR_UTIL_LO_CMP) #define _TORRENT_ATTR_UTIL_HI (_TORRENT_ATTR_UTIL_SEL | \ _TORRENT_ATTR_UTIL_HI_CMP) /* * These definitions were taken from the reg.h file which, until Linux * 2.6.18, resided in /usr/include/asm-ppc64. Most of the unneeded * definitions have been removed, but there are still a few in this file * that are currently unused by libpfm. 
*/ #ifndef _POWER_REG_H #define _POWER_REG_H #define __stringify_1(x) #x #define __stringify(x) __stringify_1(x) #ifdef __powerpc__ #define mfspr(rn) ({unsigned long rval; \ asm volatile("mfspr %0," __stringify(rn) \ : "=r" (rval)); rval;}) #else #define mfspr(rn) (0) #endif /* Special Purpose Registers (SPRNs)*/ #define SPRN_PVR 0x11F /* Processor Version Register */ /* Processor Version Register (PVR) field extraction */ #define PVR_VER(pvr) (((pvr) >> 16) & 0xFFFF) /* Version field */ #define PVR_REV(pvr) (((pvr) >> 0) & 0xFFFF) /* Revision field */ #define __is_processor(pv) (PVR_VER(mfspr(SPRN_PVR)) == (pv)) /* 64-bit processors */ #define PV_POWER4 0x0035 #define PV_POWER4p 0x0038 #define PV_970 0x0039 #define PV_POWER5 0x003A #define PV_POWER5p 0x003B #define PV_970FX 0x003C #define PV_POWER6 0x003E #define PV_POWER7 0x003F #define PV_POWER7p 0x004a #define PV_970MP 0x0044 #define PV_970GX 0x0045 #define PV_POWER8E 0x004b #define PV_POWER8NVL 0x004c #define PV_POWER8 0x004d #define PV_POWER9 0x004e #define PV_POWER10 0x0080 #define POWER_PLM (PFM_PLM0|PFM_PLM3) #define POWER8_PLM (POWER_PLM|PFM_PLMH) #define POWER9_PLM (POWER_PLM|PFM_PLMH) #define POWER10_PLM (POWER_PLM|PFM_PLMH) extern int pfm_gen_powerpc_get_event_info(void *this, int pidx, pfm_event_info_t *info); extern int pfm_gen_powerpc_get_event_attr_info(void *this, int pidx, int umask_idx, pfmlib_event_attr_info_t *info); extern int pfm_gen_powerpc_get_encoding(void *this, pfmlib_event_desc_t *e); extern int pfm_gen_powerpc_get_event_first(void *this); extern int pfm_gen_powerpc_get_event_next(void *this, int idx); extern int pfm_gen_powerpc_event_is_valid(void *this, int pidx); extern int pfm_gen_powerpc_validate_table(void *this, FILE *fp); extern void pfm_gen_powerpc_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e); extern int pfm_gen_powerpc_get_perf_encoding(void *this, pfmlib_event_desc_t *e); extern int pfm_gen_powerpc_get_nest_perf_encoding(void *this, pfmlib_event_desc_t *e); 
#endif /* _POWER_REG_H */ #endif papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_powerpc.c000066400000000000000000000061601502707512200217110ustar00rootroot00000000000000/* * Copyright (C) IBM Corporation, 2009. All rights reserved. * Contributed by Corey Ashford (cjashfor@us.ibm.com) * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS * IN THE SOFTWARE. * * pfmlib_gen_powerpc.c * * Support for libpfm4 for the PowerPC 970, 970MP, Power4,4+,5,5+,6,7 processors. 
 */

#include <sys/types.h>
#include <string.h>

/* private headers */
#include "pfmlib_priv.h"
#include "pfmlib_power_priv.h"

int
pfm_gen_powerpc_get_event_info(void *this, int pidx, pfm_event_info_t *info)
{
	pfmlib_pmu_t *pmu = this;
	const pme_power_entry_t *pe = this_pe(this);

	/*
	 * pmu and idx filled out by caller
	 */
	info->name = pe[pidx].pme_name;
	info->desc = pe[pidx].pme_long_desc;
	info->code = pe[pidx].pme_code;
	info->equiv = NULL;
	info->idx = pidx; /* private index */
	info->pmu = pmu->pmu;
	info->is_precise = 0;
	info->support_hw_smpl = 0;
	info->nattrs = 0;

	return PFM_SUCCESS;
}

int
pfm_gen_powerpc_get_event_attr_info(void *this, int pidx, int umask_idx, pfmlib_event_attr_info_t *info)
{
	/* No attributes are supported */
	return PFM_ERR_ATTR;
}

int
pfm_gen_powerpc_get_encoding(void *this, pfmlib_event_desc_t *e)
{
	const pme_power_entry_t *pe = this_pe(this);

	e->count = 1;
	e->codes[0] = (uint64_t)pe[e->event].pme_code;
	evt_strcat(e->fstr, "%s", pe[e->event].pme_name);

	return PFM_SUCCESS;
}

int
pfm_gen_powerpc_get_event_first(void *this)
{
	return 0;
}

int
pfm_gen_powerpc_get_event_next(void *this, int idx)
{
	pfmlib_pmu_t *p = this;

	if (idx >= (p->pme_count-1))
		return -1;
	return idx+1;
}

int
pfm_gen_powerpc_event_is_valid(void *this, int pidx)
{
	pfmlib_pmu_t *p = this;
	return pidx >= 0 && pidx < p->pme_count;
}

int
pfm_gen_powerpc_validate_table(void *this, FILE *fp)
{
	pfmlib_pmu_t *pmu = this;
	const pme_power_entry_t *pe = this_pe(this);
	int i;
	int ret = PFM_ERR_INVAL;

	for (i = 0; i < pmu->pme_count; i++) {
		if (!pe[i].pme_name) {
			fprintf(fp, "pmu: %s event%d: :: no name\n", pmu->name, i);
			goto error;
		}
		if (!pe[i].pme_long_desc) {
			fprintf(fp, "pmu: %s event%d: %s :: no description\n", pmu->name, i, pe[i].pme_name);
			goto error;
		}
	}
	ret = PFM_SUCCESS;
error:
	return ret;
}

/* ===== src/libpfm4/lib/pfmlib_powerpc_nest.c ===== */

/*
 * pfmlib_powerpc_nest.c
 */
#include "pfmlib_priv.h"
#include "pfmlib_power_priv.h"
#include "events/powerpc_nest_events.h" static int pfm_powerpc_nest_detect(void* this) { if (__is_processor(PV_POWER8)) return PFM_SUCCESS; return PFM_ERR_NOTSUPP; } pfmlib_pmu_t powerpc_nest_mcs_read_support={ .desc = "POWERPC_NEST_MCS_RD_BW", .name = "powerpc_nest_mcs_read", .pmu = PFM_PMU_POWERPC_NEST_MCS_READ_BW, .perf_name = "Nest_MCS_Read_BW", .pme_count = LIBPFM_ARRAY_SIZE(powerpc_nest_read_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, .max_encoding = 1, .pe = powerpc_nest_read_pe, .pmu_detect = pfm_powerpc_nest_detect, .get_event_encoding[PFM_OS_NONE] = pfm_gen_powerpc_get_encoding, PFMLIB_ENCODE_PERF(pfm_gen_powerpc_get_nest_perf_encoding), PFMLIB_VALID_PERF_PATTRS(pfm_gen_powerpc_perf_validate_pattrs), .get_event_first = pfm_gen_powerpc_get_event_first, .get_event_next = pfm_gen_powerpc_get_event_next, .event_is_valid = pfm_gen_powerpc_event_is_valid, .validate_table = pfm_gen_powerpc_validate_table, .get_event_info = pfm_gen_powerpc_get_event_info, .get_event_attr_info = pfm_gen_powerpc_get_event_attr_info, }; pfmlib_pmu_t powerpc_nest_mcs_write_support={ .desc = "POWERPC_NEST_MCS_WR_BW", .name = "powerpc_nest_mcs_write", .pmu = PFM_PMU_POWERPC_NEST_MCS_WRITE_BW, .perf_name = "Nest_MCS_Write_BW", .pme_count = LIBPFM_ARRAY_SIZE(powerpc_nest_write_pe), .type = PFM_PMU_TYPE_UNCORE, .num_cntrs = 4, .num_fixed_cntrs = 0, .max_encoding = 1, .pe = powerpc_nest_write_pe, .pmu_detect = pfm_powerpc_nest_detect, .get_event_encoding[PFM_OS_NONE] = pfm_gen_powerpc_get_encoding, PFMLIB_ENCODE_PERF(pfm_gen_powerpc_get_nest_perf_encoding), PFMLIB_VALID_PERF_PATTRS(pfm_gen_powerpc_perf_validate_pattrs), .get_event_first = pfm_gen_powerpc_get_event_first, .get_event_next = pfm_gen_powerpc_get_event_next, .event_is_valid = pfm_gen_powerpc_event_is_valid, .validate_table = pfm_gen_powerpc_validate_table, .get_event_info = pfm_gen_powerpc_get_event_info, .get_event_attr_info = pfm_gen_powerpc_get_event_attr_info, }; 
/* ===== src/libpfm4/lib/pfmlib_powerpc_perf_event.c ===== */

/*
 * pfmlib_powerpc_perf_event.c : perf_event IBM Power/Torrent functions
 *
 * Copyright (c) 2011 Google, Inc
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */

/* system headers: snprintf()/fopen()/fscanf() and PATH_MAX */
#include <sys/types.h>
#include <string.h>
#include <stdio.h>
#include <limits.h>

/* private headers */
#include "pfmlib_priv.h"		/* library private */
#include "pfmlib_power_priv.h"		/* architecture private */
#include "pfmlib_perf_event_priv.h"

int
pfm_gen_powerpc_get_perf_encoding(void *this, pfmlib_event_desc_t *e)
{
	pfmlib_pmu_t *pmu = this;
	struct perf_event_attr *attr = e->os_data;
	int ret;

	if (!pmu->get_event_encoding[PFM_OS_NONE])
		return PFM_ERR_NOTSUPP;

	/*
	 * encoding routine changes based on PMU model
	 */
	ret = pmu->get_event_encoding[PFM_OS_NONE](this, e);
	if (ret != PFM_SUCCESS)
		return ret;

	attr->type = PERF_TYPE_RAW;
	attr->config = e->codes[0];

	return PFM_SUCCESS;
}

static int
find_pmu_type_by_name(const char *name)
{
	char filename[PATH_MAX];
	FILE *fp;
	int ret, type;

	if (!name)
		return PFM_ERR_NOTSUPP;

	/* bounded formatting to avoid overflowing filename */
	snprintf(filename, sizeof(filename), "/sys/bus/event_source/devices/%s/type", name);

	fp = fopen(filename, "r");
	if (!fp)
		return PFM_ERR_NOTSUPP;

	ret = fscanf(fp, "%d", &type);
	if (ret != 1)
		type = PFM_ERR_NOTSUPP;

	fclose(fp);

	return type;
}

int
pfm_gen_powerpc_get_nest_perf_encoding(void *this, pfmlib_event_desc_t *e)
{
	pfmlib_pmu_t *pmu = this;
	struct perf_event_attr *attr = e->os_data;
	int ret;

	if (!pmu->get_event_encoding[PFM_OS_NONE])
		return PFM_ERR_NOTSUPP;

	/*
	 * encoding routine changes based on PMU model
	 */
	ret = pmu->get_event_encoding[PFM_OS_NONE](this, e);
	if (ret != PFM_SUCCESS)
		return ret;

	ret = find_pmu_type_by_name(pmu->perf_name);
	if (ret < 0)
		return ret;

	attr->type = ret;
	attr->config = e->codes[0];

	return PFM_SUCCESS;
}

void
pfm_gen_powerpc_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e)
{
	int i, compact;

	for (i = 0; i < e->npattrs; i++) {
		compact = 0;

		/* umasks never conflict */
		if (e->pattrs[i].type == PFM_ATTR_UMASK)
			continue;

		/*
		 * remove PMU-provided attributes which are either
		 * not accessible under perf_events or fully controlled
		 * by perf_events, e.g., priv levels filters
		 */
		if (e->pattrs[i].ctrl == PFM_ATTR_CTRL_PMU) {
		}

		/*
		 * remove perf_event generic attributes not supported
		 * by PPC
		 */
		if (e->pattrs[i].ctrl == PFM_ATTR_CTRL_PERF_EVENT) {
			/* no precise sampling */
			if (e->pattrs[i].idx == PERF_ATTR_PR)
				compact = 1;
		}

		/* hardware sampling not supported */
		if (e->pattrs[i].idx == PERF_ATTR_HWS)
			compact = 1;

		if (compact) {
			pfmlib_compact_pattrs(e, i);
			i--;
		}
	}
}

/* ===== src/libpfm4/lib/pfmlib_ppc970.c ===== */

/*
 * pfmlib_ppc970.c : IBM Power 970/970mp support
 *
 * Copyright (C) IBM Corporation, 2009. All rights reserved.
 * Contributed by Corey Ashford (cjashfor@us.ibm.com)
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_power_priv.h" #include "events/ppc970_events.h" #include "events/ppc970mp_events.h" static int pfm_ppc970_detect(void* this) { if (__is_processor(PV_970) || __is_processor(PV_970FX) || __is_processor(PV_970GX)) return PFM_SUCCESS; return PFM_ERR_NOTSUPP; } static int pfm_ppc970mp_detect(void* this) { if (__is_processor(PV_970MP)) return PFM_SUCCESS; return PFM_ERR_NOTSUPP; } pfmlib_pmu_t ppc970_support={ .desc = "PPC970", .name = "ppc970", .pmu = PFM_PMU_PPC970, .pme_count = LIBPFM_ARRAY_SIZE(ppc970_pe), .max_encoding = 1, .pe = ppc970_pe, .pmu_detect = pfm_ppc970_detect, .get_event_encoding[PFM_OS_NONE] = pfm_gen_powerpc_get_encoding, PFMLIB_ENCODE_PERF(pfm_gen_powerpc_get_perf_encoding), PFMLIB_VALID_PERF_PATTRS(pfm_gen_powerpc_perf_validate_pattrs), .get_event_first = pfm_gen_powerpc_get_event_first, .get_event_next = pfm_gen_powerpc_get_event_next, .event_is_valid = pfm_gen_powerpc_event_is_valid, .validate_table = pfm_gen_powerpc_validate_table, .get_event_info = pfm_gen_powerpc_get_event_info, .get_event_attr_info = pfm_gen_powerpc_get_event_attr_info, }; pfmlib_pmu_t ppc970mp_support={ .desc = "PPC970MP", .name = "ppc970mp", .pmu = PFM_PMU_PPC970MP, .pme_count = LIBPFM_ARRAY_SIZE(ppc970mp_pe), .max_encoding = 1, .pe = ppc970mp_pe, .pmu_detect = pfm_ppc970mp_detect, .get_event_encoding[PFM_OS_NONE] = pfm_gen_powerpc_get_encoding, PFMLIB_ENCODE_PERF(pfm_gen_powerpc_get_perf_encoding), PFMLIB_VALID_PERF_PATTRS(pfm_gen_powerpc_perf_validate_pattrs), .get_event_first = pfm_gen_powerpc_get_event_first, .get_event_next = pfm_gen_powerpc_get_event_next, .event_is_valid = pfm_gen_powerpc_event_is_valid, .validate_table = pfm_gen_powerpc_validate_table, .get_event_info = pfm_gen_powerpc_get_event_info, .get_event_attr_info = pfm_gen_powerpc_get_event_attr_info, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_priv.h000066400000000000000000001275511502707512200212270ustar00rootroot00000000000000/* * 
 * Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P.
 * Contributed by Stephane Eranian
 *
 * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES.
 * Contributed by John Linford
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
 */
#ifndef __PFMLIB_PRIV_H__
#define __PFMLIB_PRIV_H__

#include <string.h>
#include <perfmon/pfmlib.h>

#define PFM_PLM_ALL (PFM_PLM0|PFM_PLM1|PFM_PLM2|PFM_PLM3|PFM_PLMH)

#define PFMLIB_ATTR_DELIM	":."	/* event attribute delimiter possible */
#define PFMLIB_PMU_DELIM	"::"	/* pmu to event delimiter */
#define PFMLIB_EVENT_DELIM	", \t\n"	/* event to event delimiters */

#define PFM_ATTR_I(y, d) { .name = (y), .type = PFM_ATTR_MOD_INTEGER, .desc = (d) }
#define PFM_ATTR_B(y, d) { .name = (y), .type = PFM_ATTR_MOD_BOOL, .desc = (d) }
#define PFM_ATTR_SKIP	{ .name = "" }	/* entry not populated (skipped) */
#define PFM_ATTR_NULL	{ .name = NULL }

#define PFMLIB_EVT_MAX_NAME_LEN	256

/*
 * event identifier encoding:
 * bit 00-20 : event table specific index (2097152 possibilities)
 * bit 21-30 : PMU identifier (1024 possibilities)
 * bit 31    : reserved (to distinguish from a negative error code)
 */
#define PFMLIB_PMU_SHIFT	21
#define PFMLIB_PMU_MASK		0x3ff	/* must fit PFM_PMU_MAX */
#define PFMLIB_PMU_PIDX_MASK	((1<< PFMLIB_PMU_SHIFT)-1)

typedef struct {
	const char	*name;	/* name */
	const char	*desc;	/* description */
	pfm_attr_t	type;	/* used to validate value (if any) */
} pfmlib_attr_desc_t;

typedef struct {
	const char	*name;	/* attribute symbolic name */
	const char	*desc;	/* attribute description */
	const char	*equiv;	/* attribute is equivalent to */
	size_t		size;	/* struct sizeof */
	uint64_t	code;	/* attribute code */
	pfm_attr_t	type;	/* attribute type */
	pfm_attr_ctrl_t	ctrl;	/* what is providing attr */
	uint64_t	idx;	/* attribute opaque index */
	struct {
		unsigned int	is_dfl:1;	/* is default umask */
		unsigned int	is_precise:1;	/* Intel X86: supports PEBS */
		unsigned int	is_speculative:2;	/* count correct and wrong path occurrences */
		unsigned int	support_hw_smpl:1;	/* can be recorded by hw buffer (Intel X86=EXTPEBS) */
		unsigned int	support_no_mods:1;	/* attribute does not support modifiers (umask only) */
		unsigned int	reserved_bits:26;
	};
	union {
		uint64_t	dfl_val64;	/* default 64-bit value */
		const char	*dfl_str;	/* default string value */
		int		dfl_bool;	/* default boolean value */
		int		dfl_int;	/* default integer value */
	};
} pfmlib_event_attr_info_t;

/*
 * attribute description passed to model-specific layer
 */
typedef struct {
	int	id;	/* attribute index */
	union {
		uint64_t	ival;	/* integer value (incl. bool) */
		char		*sval;	/* string */
	};
} pfmlib_attr_t;

/*
 * must be big enough to hold all possible priv level attributes
 */
#define PFMLIB_MAX_ATTRS	64	/* max attributes per event desc */
#define PFMLIB_MAX_ENCODING	4	/* max encoding length */

/*
 * we add one entry to hold any raw umask users may specify
 * the last entry in pattrs[] hold that raw umask info
 */
#define PFMLIB_MAX_PATTRS	(PFMLIB_MAX_ATTRS+1)

struct pfmlib_pmu;

typedef struct {
	struct pfmlib_pmu	*pmu;		/* pmu */
	int			dfl_plm;	/* default priv level mask */
	int			event;		/* pidx */
	int			npattrs;	/* number of attrs in pattrs[] */
	int			nattrs;		/* number of attrs in attrs[] */
	pfm_os_t		osid;		/* OS API requested */
	int			count;		/* number of entries in codes[] */
	pfmlib_attr_t		attrs[PFMLIB_MAX_ATTRS];	/* list of requested attributes */
	pfmlib_event_attr_info_t *pattrs;	/* list of possible attributes */
	char			fstr[PFMLIB_EVT_MAX_NAME_LEN];	/* fully qualified event string */
	uint64_t		codes[PFMLIB_MAX_ENCODING];	/* event encoding */
	void			*os_data;
} pfmlib_event_desc_t;

#define modx(atdesc, a, z)	(atdesc[(a)].z)
#define attr(e, k)		((e)->pattrs + (e)->attrs[k].id)

typedef struct pfmlib_node {
	struct pfmlib_node *next;
	struct pfmlib_node *prev;
} pfmlib_node_t;

typedef struct pfmlib_pmu {
	const char	*desc;		/* PMU description */
	const char	*name;		/* pmu short name */
	const char	*perf_name;	/* (Linux optional): comma separated list of possible perf_events PMU names */
	pfmlib_node_t	node;		/* active list node */
	struct pfmlib_pmu *next_active;	/* active PMU link list */
	struct pfmlib_pmu *prev_active;	/* active PMU link list */
	pfm_pmu_t	pmu;		/* PMU model */
	int		pme_count;	/* number of events */
	int		max_encoding;	/* max number of uint64_t to encode an event */
	int		flags;		/* 16 LSB: common, 16 MSB: arch spec*/
	int		pmu_rev;	/* PMU model specific revision */
	int		num_cntrs;	/* number of generic counters */
	int		num_fixed_cntrs; /* number of fixed counters */
	int		supported_plm;	/* supported priv levels */
	pfm_pmu_type_t	type;		/* PMU type */
	const void	*pe;		/* pointer to event table */
	const pfmlib_attr_desc_t *atdesc;	/* pointer to attrs table */
	const int	cpu_family;	/* cpu family number for detection */
	const int	*cpu_models;	/* cpu model numbers for detection (zero terminated) */

	int		(*pmu_detect)(void *this);
	int		(*pmu_init)(void *this);	/* optional */
	void		(*pmu_terminate)(void *this);	/* optional */
	int		(*get_event_first)(void *this);
	int		(*get_event_next)(void *this, int pidx);
	int		(*get_event_info)(void *this, int pidx, pfm_event_info_t *info);
	unsigned int	(*get_event_nattrs)(void *this, int pidx);
	int		(*event_is_valid)(void *this, int pidx);
	int		(*can_auto_encode)(void *this, int pidx, int uidx);
	int		(*get_event_attr_info)(void *this, int pidx, int umask_idx, pfmlib_event_attr_info_t *info);
	int		(*get_event_encoding[PFM_OS_MAX])(void *this, pfmlib_event_desc_t *e);
	void		(*validate_pattrs[PFM_OS_MAX])(void *this, pfmlib_event_desc_t *e);
	int		(*os_detect[PFM_OS_MAX])(void *this);
	int		(*validate_table)(void *this, FILE *fp);
	/*
	 * optional callbacks
	 */
	int		(*get_num_events)(void *this);
	void		(*display_reg)(void *this, pfmlib_event_desc_t *e, void *val);
	int		(*match_event)(void *this, pfmlib_event_desc_t *d, const char *e, const char *s);
} pfmlib_pmu_t;

typedef struct {
	const char		*name;
	const pfmlib_attr_desc_t *atdesc;
	pfm_os_t		id;
	int			flags;
	int			(*detect)(void *this);
	int			(*get_os_attr_info)(void *this, pfmlib_event_desc_t *e);
	int			(*get_os_nattrs)(void *this, pfmlib_event_desc_t *e);
	int			(*encode)(void *this, const char *str, int dfl_plm, void *args);
} pfmlib_os_t;

#define PFMLIB_OS_FL_ACTIVATED	0x1	/* OS layer detected */

/*
 * pfmlib_pmu_t common flags (LSB 16 bits)
 */
#define PFMLIB_PMU_FL_INIT	0x1	/* PMU initialized correctly */
#define PFMLIB_PMU_FL_ACTIVE	0x2	/* PMU is initialized + detected on host */
#define PFMLIB_PMU_FL_RAW_UMASK	0x4	/* PMU supports PFM_ATTR_RAW_UMASKS */
#define PFMLIB_PMU_FL_ARCH_DFL	0x8	/* PMU is arch default */
#define PFMLIB_PMU_FL_NO_SMPL	0x10	/* PMU does not support sampling */
#define PFMLIB_PMU_FL_SPEC	0x20	/* PMU provides event speculation info */
#define PFMLIB_PMU_FL_DEPR	0x40	/* PMU model is deprecated */

typedef struct {
	int	initdone;
	int	initret;	/* initial return value from pfm_initialize() */
	int	verbose;
	int	debug;
	int	inactive;
	char	*forced_pmu;
	char	*blacklist_pmus;
	char	*proc_cpuinfo;	/* override /proc/cpuinfo with this file */
	FILE	*fp;		/* verbose and debug file descriptor, default stderr or PFMLIB_DEBUG_STDOUT */
} pfmlib_config_t;

#define PFMLIB_INITIALIZED()	(pfm_cfg.initdone && pfm_cfg.initret == PFM_SUCCESS)

extern pfmlib_config_t pfm_cfg;

extern void __pfm_vbprintf(const char *fmt,...);
extern void __pfm_dbprintf(const char *fmt,...);
extern void pfmlib_strconcat(char *str, size_t max, const char *fmt, ...);
extern int  pfmlib_getl(char **buffer, size_t *len, FILE *fp);
extern void pfmlib_compact_pattrs(pfmlib_event_desc_t *e, int i);

#define evt_strcat(str, fmt, a...) pfmlib_strconcat(str, PFMLIB_EVT_MAX_NAME_LEN, fmt, a)

extern int  pfmlib_parse_event(const char *event, pfmlib_event_desc_t *d);
extern int  pfmlib_build_fstr(pfmlib_event_desc_t *e, char **fstr);
extern void pfmlib_sort_attr(pfmlib_event_desc_t *e);
extern pfmlib_pmu_t *pfmlib_get_pmu_by_type(pfm_pmu_type_t t);
extern void pfmlib_release_event(pfmlib_event_desc_t *e);
extern size_t pfmlib_check_struct(void *st, size_t usz, size_t refsz, size_t sz);

#ifdef CONFIG_PFMLIB_DEBUG
#define DPRINT(fmt, a...) \
	do { \
		__pfm_dbprintf("%s (%s.%d): " fmt, __FILE__, __func__, __LINE__, ## a); \
	} while (0)
#else
#define DPRINT(fmt, a...)
#endif

extern pfmlib_pmu_t montecito_support;
extern pfmlib_pmu_t itanium2_support;
extern pfmlib_pmu_t itanium_support;
extern pfmlib_pmu_t generic_ia64_support;
extern pfmlib_pmu_t amd64_k7_support;
extern pfmlib_pmu_t amd64_k8_revb_support;
extern pfmlib_pmu_t amd64_k8_revc_support;
extern pfmlib_pmu_t amd64_k8_revd_support;
extern pfmlib_pmu_t amd64_k8_reve_support;
extern pfmlib_pmu_t amd64_k8_revf_support;
extern pfmlib_pmu_t amd64_k8_revg_support;
extern pfmlib_pmu_t amd64_fam10h_barcelona_support;
extern pfmlib_pmu_t amd64_fam10h_shanghai_support;
extern pfmlib_pmu_t amd64_fam10h_istanbul_support;
extern pfmlib_pmu_t amd64_fam11h_turion_support;
extern pfmlib_pmu_t amd64_fam12h_llano_support;
extern pfmlib_pmu_t amd64_fam14h_bobcat_support;
extern pfmlib_pmu_t amd64_fam15h_interlagos_support;
extern pfmlib_pmu_t amd64_fam15h_nb_support;
extern pfmlib_pmu_t amd64_fam16h_support;
extern pfmlib_pmu_t amd64_fam17h_deprecated_support;
extern pfmlib_pmu_t amd64_fam17h_zen1_support;
extern pfmlib_pmu_t amd64_fam17h_zen2_support;
extern pfmlib_pmu_t amd64_fam19h_zen3_support;
extern pfmlib_pmu_t amd64_fam19h_zen4_support;
extern pfmlib_pmu_t amd64_fam19h_zen3_l3_support;
extern pfmlib_pmu_t amd64_fam1ah_zen5_support;
extern pfmlib_pmu_t amd64_fam1ah_zen5_l3_support;
extern pfmlib_pmu_t amd64_rapl_support;
extern pfmlib_pmu_t intel_p6_support;
extern pfmlib_pmu_t intel_ppro_support;
extern pfmlib_pmu_t intel_pii_support;
extern pfmlib_pmu_t intel_pm_support;
extern pfmlib_pmu_t sicortex_support;
extern pfmlib_pmu_t netburst_support;
extern pfmlib_pmu_t netburst_p_support;
extern pfmlib_pmu_t intel_coreduo_support;
extern pfmlib_pmu_t intel_core_support;
extern pfmlib_pmu_t intel_x86_arch_support;
extern pfmlib_pmu_t intel_atom_support;
extern pfmlib_pmu_t intel_nhm_support;
extern pfmlib_pmu_t intel_nhm_ex_support;
extern pfmlib_pmu_t intel_nhm_unc_support;
extern pfmlib_pmu_t intel_snb_support;
extern pfmlib_pmu_t intel_snb_unc_cbo0_support;
extern pfmlib_pmu_t intel_snb_unc_cbo1_support;
extern pfmlib_pmu_t intel_snb_unc_cbo2_support;
extern pfmlib_pmu_t intel_snb_unc_cbo3_support;
extern pfmlib_pmu_t intel_snb_ep_support;
extern pfmlib_pmu_t intel_ivb_support;
extern pfmlib_pmu_t intel_ivb_unc_cbo0_support;
extern pfmlib_pmu_t intel_ivb_unc_cbo1_support;
extern pfmlib_pmu_t intel_ivb_unc_cbo2_support;
extern pfmlib_pmu_t intel_ivb_unc_cbo3_support;
extern pfmlib_pmu_t intel_ivb_ep_support;
extern pfmlib_pmu_t intel_hsw_support;
extern pfmlib_pmu_t intel_hsw_ep_support;
extern pfmlib_pmu_t intel_bdw_support;
extern pfmlib_pmu_t intel_bdw_ep_support;
extern pfmlib_pmu_t intel_skl_support;
extern pfmlib_pmu_t intel_skx_support;
extern pfmlib_pmu_t intel_clx_support;
extern pfmlib_pmu_t intel_icl_support;
extern pfmlib_pmu_t intel_icx_support;
extern pfmlib_pmu_t intel_icx_unc_cha0_support;
extern pfmlib_pmu_t intel_icx_unc_cha1_support;
extern pfmlib_pmu_t intel_icx_unc_cha2_support;
extern pfmlib_pmu_t intel_icx_unc_cha3_support;
extern pfmlib_pmu_t intel_icx_unc_cha4_support;
extern pfmlib_pmu_t intel_icx_unc_cha5_support;
extern pfmlib_pmu_t intel_icx_unc_cha6_support;
extern pfmlib_pmu_t intel_icx_unc_cha7_support;
extern pfmlib_pmu_t intel_icx_unc_cha8_support;
extern pfmlib_pmu_t intel_icx_unc_cha9_support;
extern pfmlib_pmu_t intel_icx_unc_cha10_support;
extern pfmlib_pmu_t intel_icx_unc_cha11_support;
extern pfmlib_pmu_t intel_icx_unc_cha12_support;
extern pfmlib_pmu_t intel_icx_unc_cha13_support;
extern pfmlib_pmu_t intel_icx_unc_cha14_support;
extern pfmlib_pmu_t intel_icx_unc_cha15_support;
extern pfmlib_pmu_t intel_icx_unc_cha16_support;
extern pfmlib_pmu_t intel_icx_unc_cha17_support;
extern pfmlib_pmu_t intel_icx_unc_cha18_support;
extern pfmlib_pmu_t intel_icx_unc_cha19_support;
extern pfmlib_pmu_t intel_icx_unc_cha20_support;
extern pfmlib_pmu_t intel_icx_unc_cha21_support;
extern pfmlib_pmu_t intel_icx_unc_cha22_support;
extern pfmlib_pmu_t intel_icx_unc_cha23_support;
extern pfmlib_pmu_t intel_icx_unc_cha24_support;
extern pfmlib_pmu_t intel_icx_unc_cha25_support;
extern pfmlib_pmu_t intel_icx_unc_cha26_support;
extern pfmlib_pmu_t intel_icx_unc_cha27_support;
extern pfmlib_pmu_t intel_icx_unc_cha28_support;
extern pfmlib_pmu_t intel_icx_unc_cha29_support;
extern pfmlib_pmu_t intel_icx_unc_cha30_support;
extern pfmlib_pmu_t intel_icx_unc_cha31_support;
extern pfmlib_pmu_t intel_icx_unc_cha32_support;
extern pfmlib_pmu_t intel_icx_unc_cha33_support;
extern pfmlib_pmu_t intel_icx_unc_cha34_support;
extern pfmlib_pmu_t intel_icx_unc_cha35_support;
extern pfmlib_pmu_t intel_icx_unc_cha36_support;
extern pfmlib_pmu_t intel_icx_unc_cha37_support;
extern pfmlib_pmu_t intel_icx_unc_cha38_support;
extern pfmlib_pmu_t intel_icx_unc_cha39_support;
extern pfmlib_pmu_t intel_icx_unc_imc0_support;
extern pfmlib_pmu_t intel_icx_unc_imc1_support;
extern pfmlib_pmu_t intel_icx_unc_imc2_support;
extern pfmlib_pmu_t intel_icx_unc_imc3_support;
extern pfmlib_pmu_t intel_icx_unc_imc4_support;
extern pfmlib_pmu_t intel_icx_unc_imc5_support;
extern pfmlib_pmu_t intel_icx_unc_imc6_support;
extern pfmlib_pmu_t intel_icx_unc_imc7_support;
extern pfmlib_pmu_t intel_icx_unc_imc8_support;
extern pfmlib_pmu_t intel_icx_unc_imc9_support;
extern pfmlib_pmu_t intel_icx_unc_imc10_support;
extern pfmlib_pmu_t intel_icx_unc_imc11_support;
extern pfmlib_pmu_t intel_icx_unc_m2m0_support;
extern pfmlib_pmu_t intel_icx_unc_m2m1_support;
extern pfmlib_pmu_t intel_icx_unc_iio0_support;
extern pfmlib_pmu_t intel_icx_unc_iio1_support;
extern pfmlib_pmu_t intel_icx_unc_iio2_support;
extern pfmlib_pmu_t intel_icx_unc_iio3_support;
extern pfmlib_pmu_t intel_icx_unc_iio4_support;
extern pfmlib_pmu_t intel_icx_unc_iio5_support;
extern pfmlib_pmu_t intel_icx_unc_irp0_support;
extern pfmlib_pmu_t intel_icx_unc_irp1_support;
extern pfmlib_pmu_t intel_icx_unc_irp2_support;
extern pfmlib_pmu_t intel_icx_unc_irp3_support;
extern pfmlib_pmu_t intel_icx_unc_irp4_support;
extern pfmlib_pmu_t intel_icx_unc_irp5_support;
extern pfmlib_pmu_t intel_icx_unc_pcu_support;
extern pfmlib_pmu_t intel_icx_unc_upi0_support;
extern pfmlib_pmu_t intel_icx_unc_upi1_support;
extern pfmlib_pmu_t intel_icx_unc_upi2_support;
extern pfmlib_pmu_t intel_icx_unc_upi3_support;
extern pfmlib_pmu_t intel_icx_unc_m3upi0_support;
extern pfmlib_pmu_t intel_icx_unc_m3upi1_support;
extern pfmlib_pmu_t intel_icx_unc_m3upi2_support;
extern pfmlib_pmu_t intel_icx_unc_m3upi3_support;
extern pfmlib_pmu_t intel_icx_unc_ubox_support;
extern pfmlib_pmu_t intel_icx_unc_m2pcie0_support;
extern pfmlib_pmu_t intel_icx_unc_m2pcie1_support;
extern pfmlib_pmu_t intel_icx_unc_m2pcie2_support;
extern pfmlib_pmu_t intel_spr_support;
extern pfmlib_pmu_t intel_spr_unc_imc0_support;
extern pfmlib_pmu_t intel_spr_unc_imc1_support;
extern pfmlib_pmu_t intel_spr_unc_imc2_support;
extern pfmlib_pmu_t intel_spr_unc_imc3_support;
extern pfmlib_pmu_t intel_spr_unc_imc4_support;
extern pfmlib_pmu_t intel_spr_unc_imc5_support;
extern pfmlib_pmu_t intel_spr_unc_imc6_support;
extern pfmlib_pmu_t intel_spr_unc_imc7_support;
extern pfmlib_pmu_t intel_spr_unc_imc8_support;
extern pfmlib_pmu_t intel_spr_unc_imc9_support;
extern pfmlib_pmu_t intel_spr_unc_imc10_support;
extern pfmlib_pmu_t intel_spr_unc_imc11_support;
extern pfmlib_pmu_t intel_spr_unc_upi0_support;
extern pfmlib_pmu_t intel_spr_unc_upi1_support;
extern pfmlib_pmu_t intel_spr_unc_upi2_support;
extern pfmlib_pmu_t intel_spr_unc_upi3_support;
extern pfmlib_pmu_t intel_spr_unc_cha0_support;
extern pfmlib_pmu_t intel_spr_unc_cha1_support;
extern pfmlib_pmu_t intel_spr_unc_cha2_support;
extern pfmlib_pmu_t intel_spr_unc_cha3_support;
extern pfmlib_pmu_t intel_spr_unc_cha4_support;
extern pfmlib_pmu_t intel_spr_unc_cha5_support;
extern pfmlib_pmu_t intel_spr_unc_cha6_support;
extern pfmlib_pmu_t intel_spr_unc_cha7_support;
extern pfmlib_pmu_t intel_spr_unc_cha8_support;
extern pfmlib_pmu_t intel_spr_unc_cha9_support;
extern pfmlib_pmu_t intel_spr_unc_cha10_support;
extern pfmlib_pmu_t intel_spr_unc_cha11_support;
extern pfmlib_pmu_t intel_spr_unc_cha12_support;
extern pfmlib_pmu_t intel_spr_unc_cha13_support;
extern pfmlib_pmu_t intel_spr_unc_cha14_support;
extern pfmlib_pmu_t intel_spr_unc_cha15_support;
extern pfmlib_pmu_t intel_spr_unc_cha16_support;
extern pfmlib_pmu_t intel_spr_unc_cha17_support;
extern pfmlib_pmu_t intel_spr_unc_cha18_support;
extern pfmlib_pmu_t intel_spr_unc_cha19_support;
extern pfmlib_pmu_t intel_spr_unc_cha20_support;
extern pfmlib_pmu_t intel_spr_unc_cha21_support;
extern pfmlib_pmu_t intel_spr_unc_cha22_support;
extern pfmlib_pmu_t intel_spr_unc_cha23_support;
extern pfmlib_pmu_t intel_spr_unc_cha24_support;
extern pfmlib_pmu_t intel_spr_unc_cha25_support;
extern pfmlib_pmu_t intel_spr_unc_cha26_support;
extern pfmlib_pmu_t intel_spr_unc_cha27_support;
extern pfmlib_pmu_t intel_spr_unc_cha28_support;
extern pfmlib_pmu_t intel_spr_unc_cha29_support;
extern pfmlib_pmu_t intel_spr_unc_cha30_support;
extern pfmlib_pmu_t intel_spr_unc_cha31_support;
extern pfmlib_pmu_t intel_spr_unc_cha32_support;
extern pfmlib_pmu_t intel_spr_unc_cha33_support;
extern pfmlib_pmu_t intel_spr_unc_cha34_support;
extern pfmlib_pmu_t intel_spr_unc_cha35_support;
extern pfmlib_pmu_t intel_spr_unc_cha36_support;
extern pfmlib_pmu_t intel_spr_unc_cha37_support;
extern pfmlib_pmu_t intel_spr_unc_cha38_support;
extern pfmlib_pmu_t intel_spr_unc_cha39_support;
extern pfmlib_pmu_t intel_spr_unc_cha40_support;
extern pfmlib_pmu_t intel_spr_unc_cha41_support;
extern pfmlib_pmu_t intel_spr_unc_cha42_support;
extern pfmlib_pmu_t intel_spr_unc_cha43_support;
extern pfmlib_pmu_t intel_spr_unc_cha44_support;
extern pfmlib_pmu_t intel_spr_unc_cha45_support;
extern pfmlib_pmu_t intel_spr_unc_cha46_support;
extern pfmlib_pmu_t intel_spr_unc_cha47_support;
extern pfmlib_pmu_t intel_spr_unc_cha48_support;
extern pfmlib_pmu_t intel_spr_unc_cha49_support;
extern pfmlib_pmu_t intel_spr_unc_cha50_support;
extern pfmlib_pmu_t intel_spr_unc_cha51_support;
extern pfmlib_pmu_t intel_spr_unc_cha52_support;
extern pfmlib_pmu_t intel_spr_unc_cha53_support;
extern pfmlib_pmu_t intel_spr_unc_cha54_support;
extern pfmlib_pmu_t intel_spr_unc_cha55_support;
extern pfmlib_pmu_t intel_spr_unc_cha56_support;
extern pfmlib_pmu_t intel_spr_unc_cha57_support;
extern pfmlib_pmu_t intel_spr_unc_cha58_support;
extern pfmlib_pmu_t intel_spr_unc_cha59_support;
extern pfmlib_pmu_t intel_emr_support;
extern pfmlib_pmu_t intel_gnr_support;
extern pfmlib_pmu_t intel_gnr_unc_imc0_support;
extern pfmlib_pmu_t intel_gnr_unc_imc1_support;
extern pfmlib_pmu_t intel_gnr_unc_imc2_support;
extern pfmlib_pmu_t intel_gnr_unc_imc3_support;
extern pfmlib_pmu_t intel_gnr_unc_imc4_support;
extern pfmlib_pmu_t intel_gnr_unc_imc5_support;
extern pfmlib_pmu_t intel_gnr_unc_imc6_support;
extern pfmlib_pmu_t intel_gnr_unc_imc7_support;
extern pfmlib_pmu_t intel_gnr_unc_imc8_support;
extern pfmlib_pmu_t intel_gnr_unc_imc9_support;
extern pfmlib_pmu_t intel_gnr_unc_imc10_support;
extern pfmlib_pmu_t intel_gnr_unc_imc11_support;
extern pfmlib_pmu_t intel_adl_glc_support;
extern pfmlib_pmu_t intel_adl_grt_support;
extern pfmlib_pmu_t intel_rapl_support;
extern pfmlib_pmu_t intel_snbep_unc_cb0_support;
extern pfmlib_pmu_t intel_snbep_unc_cb1_support;
extern pfmlib_pmu_t intel_snbep_unc_cb2_support;
extern pfmlib_pmu_t intel_snbep_unc_cb3_support;
extern pfmlib_pmu_t intel_snbep_unc_cb4_support;
extern pfmlib_pmu_t intel_snbep_unc_cb5_support;
extern pfmlib_pmu_t intel_snbep_unc_cb6_support;
extern pfmlib_pmu_t intel_snbep_unc_cb7_support;
extern pfmlib_pmu_t intel_snbep_unc_ha_support;
extern pfmlib_pmu_t intel_snbep_unc_imc0_support;
extern pfmlib_pmu_t intel_snbep_unc_imc1_support;
extern pfmlib_pmu_t intel_snbep_unc_imc2_support;
extern pfmlib_pmu_t intel_snbep_unc_imc3_support;
extern pfmlib_pmu_t intel_snbep_unc_pcu_support;
extern pfmlib_pmu_t intel_snbep_unc_qpi0_support;
extern pfmlib_pmu_t intel_snbep_unc_qpi1_support;
extern pfmlib_pmu_t intel_snbep_unc_ubo_support;
extern pfmlib_pmu_t intel_snbep_unc_r2pcie_support;
extern pfmlib_pmu_t intel_snbep_unc_r3qpi0_support;
extern pfmlib_pmu_t intel_snbep_unc_r3qpi1_support;
extern pfmlib_pmu_t intel_ivbep_unc_cb0_support;
extern pfmlib_pmu_t intel_ivbep_unc_cb1_support;
extern pfmlib_pmu_t intel_ivbep_unc_cb2_support;
extern pfmlib_pmu_t intel_ivbep_unc_cb3_support;
extern pfmlib_pmu_t intel_ivbep_unc_cb4_support;
extern pfmlib_pmu_t intel_ivbep_unc_cb5_support;
extern pfmlib_pmu_t intel_ivbep_unc_cb6_support;
extern pfmlib_pmu_t intel_ivbep_unc_cb7_support;
extern pfmlib_pmu_t intel_ivbep_unc_cb8_support;
extern pfmlib_pmu_t intel_ivbep_unc_cb9_support;
extern pfmlib_pmu_t intel_ivbep_unc_cb10_support;
extern pfmlib_pmu_t intel_ivbep_unc_cb11_support;
extern pfmlib_pmu_t intel_ivbep_unc_cb12_support;
extern pfmlib_pmu_t intel_ivbep_unc_cb13_support;
extern pfmlib_pmu_t intel_ivbep_unc_cb14_support;
extern pfmlib_pmu_t intel_ivbep_unc_ha0_support;
extern pfmlib_pmu_t intel_ivbep_unc_ha1_support;
extern pfmlib_pmu_t intel_ivbep_unc_imc0_support;
extern pfmlib_pmu_t intel_ivbep_unc_imc1_support;
extern pfmlib_pmu_t intel_ivbep_unc_imc2_support;
extern pfmlib_pmu_t intel_ivbep_unc_imc3_support;
extern pfmlib_pmu_t intel_ivbep_unc_imc4_support;
extern pfmlib_pmu_t intel_ivbep_unc_imc5_support;
extern pfmlib_pmu_t intel_ivbep_unc_imc6_support;
extern pfmlib_pmu_t intel_ivbep_unc_imc7_support;
extern pfmlib_pmu_t intel_ivbep_unc_pcu_support;
extern pfmlib_pmu_t intel_ivbep_unc_qpi0_support;
extern pfmlib_pmu_t intel_ivbep_unc_qpi1_support;
extern pfmlib_pmu_t intel_ivbep_unc_qpi2_support;
extern pfmlib_pmu_t intel_ivbep_unc_ubo_support;
extern pfmlib_pmu_t intel_ivbep_unc_r2pcie_support;
extern pfmlib_pmu_t intel_ivbep_unc_r3qpi0_support;
extern pfmlib_pmu_t intel_ivbep_unc_r3qpi1_support;
extern pfmlib_pmu_t intel_ivbep_unc_r3qpi2_support;
extern pfmlib_pmu_t intel_ivbep_unc_irp_support;
extern pfmlib_pmu_t intel_hswep_unc_cb0_support;
extern pfmlib_pmu_t intel_hswep_unc_cb1_support;
extern pfmlib_pmu_t intel_hswep_unc_cb2_support;
extern pfmlib_pmu_t intel_hswep_unc_cb3_support;
extern pfmlib_pmu_t intel_hswep_unc_cb4_support;
extern pfmlib_pmu_t intel_hswep_unc_cb5_support;
extern pfmlib_pmu_t intel_hswep_unc_cb6_support;
extern pfmlib_pmu_t intel_hswep_unc_cb7_support;
extern pfmlib_pmu_t intel_hswep_unc_cb8_support;
extern pfmlib_pmu_t intel_hswep_unc_cb9_support;
extern pfmlib_pmu_t intel_hswep_unc_cb10_support;
extern pfmlib_pmu_t intel_hswep_unc_cb11_support;
extern pfmlib_pmu_t intel_hswep_unc_cb12_support;
extern pfmlib_pmu_t intel_hswep_unc_cb13_support;
extern pfmlib_pmu_t intel_hswep_unc_cb14_support;
extern pfmlib_pmu_t intel_hswep_unc_cb15_support;
extern pfmlib_pmu_t intel_hswep_unc_cb16_support;
extern pfmlib_pmu_t intel_hswep_unc_cb17_support;
extern pfmlib_pmu_t intel_hswep_unc_ha0_support;
extern pfmlib_pmu_t intel_hswep_unc_ha1_support;
extern pfmlib_pmu_t intel_hswep_unc_imc0_support;
extern pfmlib_pmu_t intel_hswep_unc_imc1_support;
extern pfmlib_pmu_t intel_hswep_unc_imc2_support;
extern pfmlib_pmu_t intel_hswep_unc_imc3_support;
extern pfmlib_pmu_t intel_hswep_unc_imc4_support;
extern pfmlib_pmu_t intel_hswep_unc_imc5_support;
extern pfmlib_pmu_t intel_hswep_unc_imc6_support;
extern pfmlib_pmu_t intel_hswep_unc_imc7_support;
extern pfmlib_pmu_t intel_hswep_unc_pcu_support;
extern pfmlib_pmu_t intel_hswep_unc_qpi0_support;
extern pfmlib_pmu_t intel_hswep_unc_qpi1_support;
extern pfmlib_pmu_t intel_hswep_unc_sb0_support;
extern pfmlib_pmu_t intel_hswep_unc_sb1_support;
extern pfmlib_pmu_t intel_hswep_unc_sb2_support;
extern pfmlib_pmu_t intel_hswep_unc_sb3_support;
extern pfmlib_pmu_t intel_hswep_unc_ubo_support;
extern pfmlib_pmu_t intel_hswep_unc_r2pcie_support;
extern pfmlib_pmu_t intel_hswep_unc_r3qpi0_support;
extern pfmlib_pmu_t intel_hswep_unc_r3qpi1_support;
extern pfmlib_pmu_t intel_hswep_unc_r3qpi2_support;
extern pfmlib_pmu_t intel_hswep_unc_irp_support;
extern pfmlib_pmu_t intel_knc_support;
extern pfmlib_pmu_t intel_slm_support;
extern pfmlib_pmu_t intel_tmt_support;
extern pfmlib_pmu_t intel_knl_support;
extern pfmlib_pmu_t intel_knl_unc_imc0_support;
extern pfmlib_pmu_t intel_knl_unc_imc1_support;
extern pfmlib_pmu_t intel_knl_unc_imc2_support;
extern pfmlib_pmu_t intel_knl_unc_imc3_support;
extern pfmlib_pmu_t intel_knl_unc_imc4_support;
extern pfmlib_pmu_t intel_knl_unc_imc5_support;
extern pfmlib_pmu_t intel_knl_unc_imc_uclk0_support;
extern pfmlib_pmu_t intel_knl_unc_imc_uclk1_support;
extern pfmlib_pmu_t intel_knl_unc_edc_uclk0_support;
extern pfmlib_pmu_t intel_knl_unc_edc_uclk1_support;
extern pfmlib_pmu_t intel_knl_unc_edc_uclk2_support;
extern pfmlib_pmu_t intel_knl_unc_edc_uclk3_support;
extern pfmlib_pmu_t intel_knl_unc_edc_uclk4_support;
extern pfmlib_pmu_t intel_knl_unc_edc_uclk5_support;
extern pfmlib_pmu_t intel_knl_unc_edc_uclk6_support;
extern pfmlib_pmu_t intel_knl_unc_edc_uclk7_support;
extern pfmlib_pmu_t intel_knl_unc_edc_eclk0_support;
extern pfmlib_pmu_t intel_knl_unc_edc_eclk1_support;
extern pfmlib_pmu_t intel_knl_unc_edc_eclk2_support;
extern pfmlib_pmu_t intel_knl_unc_edc_eclk3_support;
extern pfmlib_pmu_t intel_knl_unc_edc_eclk4_support;
extern pfmlib_pmu_t intel_knl_unc_edc_eclk5_support;
extern pfmlib_pmu_t intel_knl_unc_edc_eclk6_support;
extern pfmlib_pmu_t intel_knl_unc_edc_eclk7_support;
extern pfmlib_pmu_t intel_knl_unc_cha0_support;
extern pfmlib_pmu_t intel_knl_unc_cha1_support;
extern pfmlib_pmu_t intel_knl_unc_cha2_support;
extern pfmlib_pmu_t intel_knl_unc_cha3_support;
extern pfmlib_pmu_t intel_knl_unc_cha4_support;
extern pfmlib_pmu_t intel_knl_unc_cha5_support;
extern pfmlib_pmu_t intel_knl_unc_cha6_support;
extern pfmlib_pmu_t intel_knl_unc_cha7_support;
extern pfmlib_pmu_t intel_knl_unc_cha8_support;
extern pfmlib_pmu_t intel_knl_unc_cha9_support;
extern pfmlib_pmu_t intel_knl_unc_cha10_support;
extern pfmlib_pmu_t intel_knl_unc_cha11_support;
extern pfmlib_pmu_t intel_knl_unc_cha12_support;
extern pfmlib_pmu_t intel_knl_unc_cha13_support;
extern pfmlib_pmu_t intel_knl_unc_cha14_support;
extern pfmlib_pmu_t intel_knl_unc_cha15_support;
extern pfmlib_pmu_t intel_knl_unc_cha16_support;
extern pfmlib_pmu_t intel_knl_unc_cha17_support;
extern pfmlib_pmu_t intel_knl_unc_cha18_support;
extern pfmlib_pmu_t intel_knl_unc_cha19_support;
extern pfmlib_pmu_t intel_knl_unc_cha20_support;
extern pfmlib_pmu_t intel_knl_unc_cha21_support;
extern pfmlib_pmu_t intel_knl_unc_cha22_support;
extern pfmlib_pmu_t intel_knl_unc_cha23_support;
extern pfmlib_pmu_t intel_knl_unc_cha24_support;
extern pfmlib_pmu_t intel_knl_unc_cha25_support;
extern pfmlib_pmu_t intel_knl_unc_cha26_support;
extern pfmlib_pmu_t intel_knl_unc_cha27_support;
extern pfmlib_pmu_t intel_knl_unc_cha28_support;
extern pfmlib_pmu_t intel_knl_unc_cha29_support;
extern pfmlib_pmu_t intel_knl_unc_cha30_support;
extern pfmlib_pmu_t intel_knl_unc_cha31_support;
extern pfmlib_pmu_t intel_knl_unc_cha32_support;
extern pfmlib_pmu_t intel_knl_unc_cha33_support;
extern pfmlib_pmu_t intel_knl_unc_cha34_support;
extern pfmlib_pmu_t intel_knl_unc_cha35_support;
extern pfmlib_pmu_t intel_knl_unc_cha36_support;
extern pfmlib_pmu_t intel_knl_unc_cha37_support;
extern pfmlib_pmu_t intel_knl_unc_m2pcie_support;
extern pfmlib_pmu_t intel_bdx_unc_cb0_support;
extern pfmlib_pmu_t intel_bdx_unc_cb1_support;
extern pfmlib_pmu_t intel_bdx_unc_cb2_support;
extern pfmlib_pmu_t intel_bdx_unc_cb3_support;
extern pfmlib_pmu_t intel_bdx_unc_cb4_support;
extern pfmlib_pmu_t intel_bdx_unc_cb5_support;
extern pfmlib_pmu_t intel_bdx_unc_cb6_support;
extern pfmlib_pmu_t intel_bdx_unc_cb7_support;
extern pfmlib_pmu_t intel_bdx_unc_cb8_support;
extern pfmlib_pmu_t intel_bdx_unc_cb9_support;
extern pfmlib_pmu_t intel_bdx_unc_cb10_support;
extern pfmlib_pmu_t intel_bdx_unc_cb11_support;
extern pfmlib_pmu_t intel_bdx_unc_cb12_support;
extern pfmlib_pmu_t intel_bdx_unc_cb13_support;
extern pfmlib_pmu_t intel_bdx_unc_cb14_support;
extern pfmlib_pmu_t intel_bdx_unc_cb15_support;
extern pfmlib_pmu_t intel_bdx_unc_cb16_support;
extern pfmlib_pmu_t intel_bdx_unc_cb17_support;
extern pfmlib_pmu_t intel_bdx_unc_cb18_support;
extern pfmlib_pmu_t intel_bdx_unc_cb19_support;
extern pfmlib_pmu_t intel_bdx_unc_cb20_support;
extern pfmlib_pmu_t intel_bdx_unc_cb21_support;
extern pfmlib_pmu_t intel_bdx_unc_cb22_support;
extern pfmlib_pmu_t intel_bdx_unc_cb23_support;
extern pfmlib_pmu_t intel_bdx_unc_ha0_support;
extern pfmlib_pmu_t intel_bdx_unc_ha1_support;
extern pfmlib_pmu_t intel_bdx_unc_imc0_support;
extern pfmlib_pmu_t intel_bdx_unc_imc1_support;
extern pfmlib_pmu_t intel_bdx_unc_imc2_support;
extern pfmlib_pmu_t intel_bdx_unc_imc3_support;
extern pfmlib_pmu_t intel_bdx_unc_imc4_support;
extern pfmlib_pmu_t intel_bdx_unc_imc5_support;
extern pfmlib_pmu_t intel_bdx_unc_imc6_support;
extern pfmlib_pmu_t intel_bdx_unc_imc7_support;
extern pfmlib_pmu_t intel_bdx_unc_pcu_support;
extern pfmlib_pmu_t intel_bdx_unc_qpi0_support;
extern pfmlib_pmu_t intel_bdx_unc_qpi1_support;
extern pfmlib_pmu_t intel_bdx_unc_qpi2_support;
extern pfmlib_pmu_t intel_bdx_unc_sbo0_support;
extern pfmlib_pmu_t intel_bdx_unc_sbo1_support;
extern pfmlib_pmu_t intel_bdx_unc_sbo2_support;
extern pfmlib_pmu_t intel_bdx_unc_sbo3_support;
extern pfmlib_pmu_t intel_bdx_unc_ubo_support;
extern pfmlib_pmu_t intel_bdx_unc_r2pcie_support;
extern pfmlib_pmu_t intel_bdx_unc_r3qpi0_support;
extern pfmlib_pmu_t intel_bdx_unc_r3qpi1_support;
extern pfmlib_pmu_t intel_bdx_unc_r3qpi2_support;
extern pfmlib_pmu_t intel_bdx_unc_irp_support;
extern pfmlib_pmu_t intel_skx_unc_cha0_support;
extern pfmlib_pmu_t intel_skx_unc_cha1_support;
extern pfmlib_pmu_t intel_skx_unc_cha2_support;
extern pfmlib_pmu_t intel_skx_unc_cha3_support;
extern pfmlib_pmu_t intel_skx_unc_cha4_support;
extern pfmlib_pmu_t intel_skx_unc_cha5_support;
extern pfmlib_pmu_t intel_skx_unc_cha6_support;
extern pfmlib_pmu_t intel_skx_unc_cha7_support;
extern pfmlib_pmu_t intel_skx_unc_cha8_support;
extern pfmlib_pmu_t intel_skx_unc_cha9_support;
extern pfmlib_pmu_t intel_skx_unc_cha10_support;
extern pfmlib_pmu_t intel_skx_unc_cha11_support;
extern pfmlib_pmu_t intel_skx_unc_cha12_support;
extern pfmlib_pmu_t intel_skx_unc_cha13_support;
extern pfmlib_pmu_t intel_skx_unc_cha14_support;
extern pfmlib_pmu_t intel_skx_unc_cha15_support;
extern pfmlib_pmu_t intel_skx_unc_cha16_support;
extern pfmlib_pmu_t intel_skx_unc_cha17_support;
extern pfmlib_pmu_t intel_skx_unc_cha18_support;
extern pfmlib_pmu_t intel_skx_unc_cha19_support;
extern pfmlib_pmu_t intel_skx_unc_cha20_support;
extern pfmlib_pmu_t intel_skx_unc_cha21_support;
extern pfmlib_pmu_t intel_skx_unc_cha22_support;
extern pfmlib_pmu_t intel_skx_unc_cha23_support;
extern pfmlib_pmu_t intel_skx_unc_cha24_support;
extern pfmlib_pmu_t intel_skx_unc_cha25_support;
extern pfmlib_pmu_t intel_skx_unc_cha26_support;
extern pfmlib_pmu_t intel_skx_unc_cha27_support;
extern pfmlib_pmu_t intel_skx_unc_iio0_support;
extern pfmlib_pmu_t intel_skx_unc_iio1_support;
extern pfmlib_pmu_t intel_skx_unc_iio2_support;
extern pfmlib_pmu_t intel_skx_unc_iio3_support;
extern pfmlib_pmu_t intel_skx_unc_iio4_support;
extern pfmlib_pmu_t intel_skx_unc_iio5_support;
extern pfmlib_pmu_t intel_skx_unc_imc0_support;
extern pfmlib_pmu_t intel_skx_unc_imc1_support;
extern pfmlib_pmu_t intel_skx_unc_imc2_support;
extern pfmlib_pmu_t intel_skx_unc_imc3_support;
extern pfmlib_pmu_t intel_skx_unc_imc4_support;
extern pfmlib_pmu_t intel_skx_unc_imc5_support;
extern pfmlib_pmu_t intel_skx_unc_pcu_support;
extern pfmlib_pmu_t intel_skx_unc_upi0_support;
extern pfmlib_pmu_t intel_skx_unc_upi1_support;
extern pfmlib_pmu_t intel_skx_unc_upi2_support;
extern pfmlib_pmu_t intel_skx_unc_m2m0_support;
extern pfmlib_pmu_t intel_skx_unc_m2m1_support;
extern pfmlib_pmu_t intel_skx_unc_ubo_support;
extern pfmlib_pmu_t intel_skx_unc_m3upi0_support;
extern pfmlib_pmu_t intel_skx_unc_m3upi1_support;
extern pfmlib_pmu_t intel_skx_unc_m3upi2_support;
extern pfmlib_pmu_t intel_skx_unc_irp_support;
extern pfmlib_pmu_t intel_knm_support;
extern pfmlib_pmu_t intel_knm_unc_imc0_support;
extern pfmlib_pmu_t intel_knm_unc_imc1_support;
extern pfmlib_pmu_t intel_knm_unc_imc2_support;
extern pfmlib_pmu_t intel_knm_unc_imc3_support;
extern pfmlib_pmu_t intel_knm_unc_imc4_support;
extern pfmlib_pmu_t intel_knm_unc_imc5_support;
extern pfmlib_pmu_t intel_knm_unc_imc_uclk0_support;
extern pfmlib_pmu_t intel_knm_unc_imc_uclk1_support;
extern pfmlib_pmu_t intel_knm_unc_edc_uclk0_support;
extern pfmlib_pmu_t intel_knm_unc_edc_uclk1_support;
extern pfmlib_pmu_t intel_knm_unc_edc_uclk2_support;
extern pfmlib_pmu_t intel_knm_unc_edc_uclk3_support;
extern pfmlib_pmu_t intel_knm_unc_edc_uclk4_support;
extern pfmlib_pmu_t intel_knm_unc_edc_uclk5_support;
extern pfmlib_pmu_t intel_knm_unc_edc_uclk6_support;
extern pfmlib_pmu_t intel_knm_unc_edc_uclk7_support;
extern pfmlib_pmu_t intel_knm_unc_edc_eclk0_support;
extern pfmlib_pmu_t intel_knm_unc_edc_eclk1_support;
extern pfmlib_pmu_t intel_knm_unc_edc_eclk2_support;
extern pfmlib_pmu_t intel_knm_unc_edc_eclk3_support;
extern pfmlib_pmu_t intel_knm_unc_edc_eclk4_support;
extern pfmlib_pmu_t intel_knm_unc_edc_eclk5_support;
extern pfmlib_pmu_t intel_knm_unc_edc_eclk6_support;
extern pfmlib_pmu_t intel_knm_unc_edc_eclk7_support;
extern pfmlib_pmu_t intel_knm_unc_cha0_support;
extern pfmlib_pmu_t intel_knm_unc_cha1_support;
extern pfmlib_pmu_t intel_knm_unc_cha2_support;
extern pfmlib_pmu_t intel_knm_unc_cha3_support;
extern pfmlib_pmu_t intel_knm_unc_cha4_support;
extern pfmlib_pmu_t intel_knm_unc_cha5_support;
extern pfmlib_pmu_t intel_knm_unc_cha6_support;
extern pfmlib_pmu_t intel_knm_unc_cha7_support;
extern pfmlib_pmu_t intel_knm_unc_cha8_support;
extern pfmlib_pmu_t intel_knm_unc_cha9_support;
extern pfmlib_pmu_t intel_knm_unc_cha10_support; extern pfmlib_pmu_t intel_knm_unc_cha11_support; extern pfmlib_pmu_t intel_knm_unc_cha12_support; extern pfmlib_pmu_t intel_knm_unc_cha13_support; extern pfmlib_pmu_t intel_knm_unc_cha14_support; extern pfmlib_pmu_t intel_knm_unc_cha15_support; extern pfmlib_pmu_t intel_knm_unc_cha16_support; extern pfmlib_pmu_t intel_knm_unc_cha17_support; extern pfmlib_pmu_t intel_knm_unc_cha18_support; extern pfmlib_pmu_t intel_knm_unc_cha19_support; extern pfmlib_pmu_t intel_knm_unc_cha20_support; extern pfmlib_pmu_t intel_knm_unc_cha21_support; extern pfmlib_pmu_t intel_knm_unc_cha22_support; extern pfmlib_pmu_t intel_knm_unc_cha23_support; extern pfmlib_pmu_t intel_knm_unc_cha24_support; extern pfmlib_pmu_t intel_knm_unc_cha25_support; extern pfmlib_pmu_t intel_knm_unc_cha26_support; extern pfmlib_pmu_t intel_knm_unc_cha27_support; extern pfmlib_pmu_t intel_knm_unc_cha28_support; extern pfmlib_pmu_t intel_knm_unc_cha29_support; extern pfmlib_pmu_t intel_knm_unc_cha30_support; extern pfmlib_pmu_t intel_knm_unc_cha31_support; extern pfmlib_pmu_t intel_knm_unc_cha32_support; extern pfmlib_pmu_t intel_knm_unc_cha33_support; extern pfmlib_pmu_t intel_knm_unc_cha34_support; extern pfmlib_pmu_t intel_knm_unc_cha35_support; extern pfmlib_pmu_t intel_knm_unc_cha36_support; extern pfmlib_pmu_t intel_knm_unc_cha37_support; extern pfmlib_pmu_t intel_knm_unc_m2pcie_support; extern pfmlib_pmu_t intel_glm_support; extern pfmlib_pmu_t power4_support; extern pfmlib_pmu_t ppc970_support; extern pfmlib_pmu_t ppc970mp_support; extern pfmlib_pmu_t power5_support; extern pfmlib_pmu_t power5p_support; extern pfmlib_pmu_t power6_support; extern pfmlib_pmu_t power7_support; extern pfmlib_pmu_t power8_support; extern pfmlib_pmu_t power9_support; extern pfmlib_pmu_t power10_support; extern pfmlib_pmu_t torrent_support; extern pfmlib_pmu_t powerpc_nest_mcs_read_support; extern pfmlib_pmu_t powerpc_nest_mcs_write_support; extern pfmlib_pmu_t sparc_support; 
extern pfmlib_pmu_t sparc_ultra12_support; extern pfmlib_pmu_t sparc_ultra3_support; extern pfmlib_pmu_t sparc_ultra3i_support; extern pfmlib_pmu_t sparc_ultra3plus_support; extern pfmlib_pmu_t sparc_ultra4plus_support; extern pfmlib_pmu_t sparc_niagara1_support; extern pfmlib_pmu_t sparc_niagara2_support; extern pfmlib_pmu_t cell_support; extern pfmlib_pmu_t perf_event_support; extern pfmlib_pmu_t perf_event_raw_support; extern pfmlib_pmu_t intel_wsm_sp_support; extern pfmlib_pmu_t intel_wsm_dp_support; extern pfmlib_pmu_t intel_wsm_unc_support; extern pfmlib_pmu_t arm_cortex_a7_support; extern pfmlib_pmu_t arm_cortex_a8_support; extern pfmlib_pmu_t arm_cortex_a9_support; extern pfmlib_pmu_t arm_cortex_a15_support; extern pfmlib_pmu_t arm_1176_support; extern pfmlib_pmu_t arm_qcom_krait_support; extern pfmlib_pmu_t arm_cortex_a57_support; extern pfmlib_pmu_t arm_cortex_a53_support; extern pfmlib_pmu_t arm_cortex_a55_support; extern pfmlib_pmu_t arm_cortex_a72_support; extern pfmlib_pmu_t arm_cortex_a76_support; extern pfmlib_pmu_t arm_xgene_support; extern pfmlib_pmu_t arm_n1_support; extern pfmlib_pmu_t arm_n2_support; extern pfmlib_pmu_t arm_n3_support; extern pfmlib_pmu_t arm_v1_support; extern pfmlib_pmu_t arm_v2_support; extern pfmlib_pmu_t arm_v3_support; extern pfmlib_pmu_t arm_thunderx2_support; extern pfmlib_pmu_t arm_thunderx2_dmc0_support; extern pfmlib_pmu_t arm_thunderx2_dmc1_support; extern pfmlib_pmu_t arm_thunderx2_llc0_support; extern pfmlib_pmu_t arm_thunderx2_llc1_support; extern pfmlib_pmu_t arm_thunderx2_ccpi0_support; extern pfmlib_pmu_t arm_thunderx2_ccpi1_support; extern pfmlib_pmu_t arm_fujitsu_a64fx_support; extern pfmlib_pmu_t arm_fujitsu_monaka_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl1_ddrc0_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl1_ddrc1_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl1_ddrc2_support; extern pfmlib_pmu_t 
arm_hisilicon_kunpeng_sccl1_ddrc3_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl3_ddrc0_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl3_ddrc1_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl3_ddrc2_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl3_ddrc3_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl5_ddrc0_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl5_ddrc1_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl5_ddrc2_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl5_ddrc3_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl7_ddrc0_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl7_ddrc1_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl7_ddrc2_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl7_ddrc3_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl1_hha2_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl1_hha3_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl3_hha0_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl3_hha1_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl5_hha6_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl5_hha7_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl7_hha4_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl7_hha5_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl1_l3c10_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl1_l3c11_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl1_l3c12_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl1_l3c13_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl1_l3c14_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl1_l3c15_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl1_l3c8_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl1_l3c9_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl3_l3c0_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl3_l3c1_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl3_l3c2_support; extern 
pfmlib_pmu_t arm_hisilicon_kunpeng_sccl3_l3c3_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl3_l3c4_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl3_l3c5_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl3_l3c6_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl3_l3c7_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl5_l3c24_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl5_l3c25_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl5_l3c26_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl5_l3c27_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl5_l3c28_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl5_l3c29_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl5_l3c30_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl5_l3c31_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl7_l3c16_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl7_l3c17_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl7_l3c18_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl7_l3c19_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl7_l3c20_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl7_l3c21_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl7_l3c22_support; extern pfmlib_pmu_t arm_hisilicon_kunpeng_sccl7_l3c23_support; extern pfmlib_pmu_t mips_74k_support; extern pfmlib_pmu_t s390x_cpum_cf_support; extern pfmlib_pmu_t s390x_cpum_sf_support; extern pfmlib_os_t *pfmlib_os; extern pfmlib_os_t pfmlib_os_perf; extern pfmlib_os_t pfmlib_os_perf_ext; extern char *pfmlib_forced_pmu; #define this_pe(t) (((pfmlib_pmu_t *)t)->pe) #define this_atdesc(t) (((pfmlib_pmu_t *)t)->atdesc) #define LIBPFM_ARRAY_SIZE(a) (sizeof(a) / sizeof(typeof(*(a)))) /* * population count (number of bits set) */ static inline int pfmlib_popcnt(unsigned long v) { int sum = 0; for(; v ; v >>=1) { if (v & 0x1) sum++; } return sum; } /* * find next bit set */ static inline size_t pfmlib_fnb(unsigned long value, size_t nbits, int p) 
{ unsigned long m; size_t i; for(i=p; i < nbits; i++) { m = 1 << i; if (value & m) return i; } return i; } /* * PMU + internal idx -> external opaque idx */ static inline int pfmlib_pidx2idx(pfmlib_pmu_t *pmu, int pidx) { int idx; idx = pmu->pmu << PFMLIB_PMU_SHIFT; idx |= pidx; return idx; } #define pfmlib_for_each_bit(x, m) \ for((x) = pfmlib_fnb((m), (sizeof(m)<<3), 0); (x) < (sizeof(m)<<3); (x) = pfmlib_fnb((m), (sizeof(m)<<3), (x)+1)) #ifdef __linux__ #define PFMLIB_VALID_PERF_PATTRS(f) \ .validate_pattrs[PFM_OS_PERF_EVENT] = f, \ .validate_pattrs[PFM_OS_PERF_EVENT_EXT] = f #define PFMLIB_ENCODE_PERF(f) \ .get_event_encoding[PFM_OS_PERF_EVENT] = f, \ .get_event_encoding[PFM_OS_PERF_EVENT_EXT] = f #define PFMLIB_OS_DETECT(f) \ .os_detect[PFM_OS_PERF_EVENT] = f, \ .os_detect[PFM_OS_PERF_EVENT_EXT] = f #else #define PFMLIB_VALID_PERF_PATTRS(f) \ .validate_pattrs[PFM_OS_PERF_EVENT] = NULL, \ .validate_pattrs[PFM_OS_PERF_EVENT_EXT] = NULL #define PFMLIB_ENCODE_PERF(f) \ .get_event_encoding[PFM_OS_PERF_EVENT] = NULL, \ .get_event_encoding[PFM_OS_PERF_EVENT_EXT] = NULL #define PFMLIB_OS_DETECT(f) \ .os_detect[PFM_OS_PERF_EVENT] = NULL, \ .os_detect[PFM_OS_PERF_EVENT_EXT] = NULL #endif static inline int is_empty_attr(const pfmlib_attr_desc_t *a) { return !a || !a->name || strlen(a->name) == 0 ? 1 : 0; } #endif /* __PFMLIB_PRIV_H__ */

papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_s390x_cpumf.c

/* * PMU support for the CPU-measurement facilities * * Copyright IBM Corp.
2012, 2014 * Contributed by Hendrik Brueckner * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <ctype.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> /* private library and arch headers */ #include "pfmlib_priv.h" #include "pfmlib_s390x_priv.h" #include "pfmlib_perf_event_priv.h" #include "events/s390x_cpumf_events.h" #define CPUM_CF_DEVICE_DIR "/sys/bus/event_source/devices/cpum_cf" #define CPUM_SF_DEVICE_DIR "/sys/bus/event_source/devices/cpum_sf" #define SYS_INFO "/proc/sysinfo" #define SERVICE_LEVEL "/proc/service_levels" #define CF_VERSION_STR "CPU-MF: Counter facility: version=" /* CPU-measurement counter list (pmu events) */ static pme_cpumf_ctr_t *cpumcf_pe = NULL; /* Detect the CPU-measurement counter and sampling facilities */ static int pfm_cpumcf_detect(void *this) { if (access(CPUM_CF_DEVICE_DIR, R_OK)) return PFM_ERR_NOTSUPP; return PFM_SUCCESS; } static int pfm_cpumsf_detect(void *this) { if (access(CPUM_SF_DEVICE_DIR, R_OK)) return PFM_ERR_NOTSUPP; return PFM_SUCCESS; } /* Parses the machine type that identifies an IBM mainframe. * This information is read from /proc/sysinfo.
*/ static long get_machine_type(void) { long machine_type; size_t buflen, len; char *buffer, *tmp; FILE *fp; machine_type = 0; fp = fopen(SYS_INFO, "r"); if (fp == NULL) goto out; buffer = NULL; while (pfmlib_getl(&buffer, &buflen, fp) != -1) { /* skip empty lines */ if (*buffer == '\n') continue; /* look for 'Type:' entry */ if (!strncmp("Type:", buffer, 5)) { tmp = buffer + 5; /* set ptr after ':' */ /* skip leading blanks */ while (isspace(*tmp)) tmp++; /* skip trailing blanks */ len = strlen(tmp); while (len > 0 && isspace(tmp[len])) len--; tmp[len+1] = '\0'; machine_type = strtol(tmp, NULL, 10); break; } } fclose(fp); free(buffer); out: return machine_type; } static void get_cf_version(unsigned int *cfvn, unsigned int *csvn) { int rc; FILE *fp; char *buffer; size_t buflen; *cfvn = *csvn = 0; fp = fopen(SERVICE_LEVEL, "r"); if (fp == NULL) return; buffer = NULL; while (pfmlib_getl(&buffer, &buflen, fp) != -1) { /* skip empty lines */ if (*buffer == '\n') continue; /* look for 'CPU-MF: Counter facility: version=' entry */ if (!strncmp(CF_VERSION_STR, buffer, strlen(CF_VERSION_STR))) { rc = sscanf(buffer + strlen(CF_VERSION_STR), "%u.%u", cfvn, csvn); if (rc != 2) *cfvn = *csvn = 0; break; } } fclose(fp); free(buffer); } /* Initialize the PMU representation for CPUMF. 
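The `Type:` line handling in `get_machine_type` above can be exercised in isolation. The sketch below is not part of libpfm4; `parse_machine_type` is a hypothetical helper that mirrors the tag check, leading-blank skip, and decimal `strtol` conversion:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical standalone helper mirroring get_machine_type(): given one
 * line of /proc/sysinfo, return the decimal machine type, or 0 when the
 * line is not a "Type:" entry. */
static long parse_machine_type(const char *line)
{
	const char *p;

	if (strncmp(line, "Type:", 5))
		return 0;
	p = line + 5;			/* set ptr after ':' */
	while (isspace((unsigned char)*p))
		p++;			/* skip leading blanks */
	return strtol(p, NULL, 10);
}
```

On a machine type 8561 system, for example, the sysinfo entry has the form `Type: 8561`, which this helper reduces to the number used for the extended counter set lookup.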
* * Set up the PMU events array based on * - generic (basic, problem-state, and crypto-activaty) counter sets * - the extended counter depending on the machine type */ static int pfm_cpumcf_init(void *this) { pfmlib_pmu_t *pmu = this; unsigned int cfvn, csvn; const pme_cpumf_ctr_t *cfvn_set, *csvn_set, *ext_set; size_t cfvn_set_count, csvn_set_count, ext_set_count, pme_count; /* obtain counter first/second version number */ get_cf_version(&cfvn, &csvn); /* counters based on first version number */ switch (cfvn) { case 1: cfvn_set = cpumcf_fvn1_counters; cfvn_set_count = LIBPFM_ARRAY_SIZE(cpumcf_fvn1_counters); break; case 3: cfvn_set = cpumcf_fvn3_counters; cfvn_set_count = LIBPFM_ARRAY_SIZE(cpumcf_fvn3_counters); break; default: cfvn_set = NULL; cfvn_set_count = 0; break; } /* counters based on second version number */ csvn_set = cpumcf_svn_generic_counters; csvn_set_count = LIBPFM_ARRAY_SIZE(cpumcf_svn_generic_counters); if (csvn < 6) /* Crypto counter set enlarged for SVN == 6 */ csvn_set_count -= CPUMF_SVN6_ECC; /* check and assign a machine-specific extended counter set */ switch (get_machine_type()) { case 2097: /* IBM System z10 EC */ case 2098: /* IBM System z10 BC */ ext_set = cpumcf_z10_counters, ext_set_count = LIBPFM_ARRAY_SIZE(cpumcf_z10_counters); break; case 2817: /* IBM zEnterprise 196 */ case 2818: /* IBM zEnterprise 114 */ ext_set = cpumcf_z196_counters; ext_set_count = LIBPFM_ARRAY_SIZE(cpumcf_z196_counters); break; case 2827: /* IBM zEnterprise EC12 */ case 2828: /* IBM zEnterprise BC12 */ ext_set = cpumcf_zec12_counters; ext_set_count = LIBPFM_ARRAY_SIZE(cpumcf_zec12_counters); break; case 2964: /* IBM z13 */ case 2965: /* IBM z13s */ ext_set = cpumcf_z13_counters; ext_set_count = LIBPFM_ARRAY_SIZE(cpumcf_z13_counters); break; case 3906: /* IBM z14 */ case 3907: /* IBM z14 ZR1 */ ext_set = cpumcf_z14_counters; ext_set_count = LIBPFM_ARRAY_SIZE(cpumcf_z14_counters); break; case 8561: /* IBM Machine types 8561 and 8562 */ case 8562: ext_set = 
cpumcf_z15_counters; ext_set_count = LIBPFM_ARRAY_SIZE(cpumcf_z15_counters); break; case 3931: /* IBM Machine types 3931 and 3932 */ case 3932: ext_set = cpumcf_z16_counters; ext_set_count = LIBPFM_ARRAY_SIZE(cpumcf_z16_counters); break; default: /* No extended counter set for this machine type or there * was an error retrieving the machine type */ ext_set = NULL; ext_set_count = 0; break; } cpumcf_pe = calloc(cfvn_set_count + csvn_set_count + ext_set_count, sizeof(*cpumcf_pe)); if (cpumcf_pe == NULL) return PFM_ERR_NOMEM; pme_count = 0; memcpy(cpumcf_pe, cfvn_set, sizeof(*cpumcf_pe) * cfvn_set_count); pme_count += cfvn_set_count; memcpy((void *) (cpumcf_pe + pme_count), csvn_set, sizeof(*cpumcf_pe) * csvn_set_count); pme_count += csvn_set_count; if (ext_set_count) memcpy((void *) (cpumcf_pe + pme_count), ext_set, sizeof(*cpumcf_pe) * ext_set_count); pme_count += ext_set_count; pmu->pe = cpumcf_pe; pmu->pme_count = pme_count; /* CPUM-CF provides fixed counters only. The number of installed * counters depends on the version and hardware model up to * CPUMF_COUNTER_MAX. */ pmu->num_fixed_cntrs = pme_count; return PFM_SUCCESS; } static void pfm_cpumcf_exit(void *this) { pfmlib_pmu_t *pmu = this; pmu->pme_count = 0; pmu->pe = NULL; free(cpumcf_pe); } static int pfm_cpumf_get_encoding(void *this, pfmlib_event_desc_t *e) { const pme_cpumf_ctr_t *pe = this_pe(this); e->count = 1; /* number of encoded entries in e->codes */ e->codes[0] = pe[e->event].ctrnum; evt_strcat(e->fstr, "%s", pe[e->event].name); return PFM_SUCCESS; } static int pfm_cpumf_get_event_first(void *this) { pfmlib_pmu_t *pmu = this; return !!pmu->pme_count ? 
0 : -1; } static int pfm_cpumf_get_event_next(void *this, int idx) { pfmlib_pmu_t *pmu = this; if (idx >= (pmu->pme_count - 1)) return -1; return idx + 1; } static int pfm_cpumf_event_is_valid(void *this, int idx) { pfmlib_pmu_t *pmu = this; return (idx >= 0 && idx < pmu->pme_count); } static int pfm_cpumf_validate_table(void *this, FILE *fp) { pfmlib_pmu_t *pmu = this; const pme_cpumf_ctr_t *pe = this_pe(this); int i, rc; rc = PFM_ERR_INVAL; for (i = 0; i < pmu->pme_count; i++) { if (!pe[i].name) { fprintf(fp, "pmu: %s event: %i: No name\n", pmu->name, i); goto failed; } if (!pe[i].desc) { fprintf(fp, "pmu: %s event: %i: No description\n", pmu->name, i); goto failed; } } rc = PFM_SUCCESS; failed: return rc; } static int pfm_cpumcf_validate_table(void *this, FILE *fp) { pfmlib_pmu_t *pmu = this; if (pmu->pme_count > CPUMF_COUNTER_MAX) { fprintf(fp, "pmu: %s: pme number exceeded maximum\n", pmu->name); return PFM_ERR_INVAL; } return pfm_cpumf_validate_table(this, fp); } static int pfm_cpumf_get_event_info(void *this, int idx, pfm_event_info_t *info) { pfmlib_pmu_t *pmu = this; const pme_cpumf_ctr_t *pe = this_pe(this); if (idx >= pmu->pme_count) return PFM_ERR_INVAL; info->name = pe[idx].name; info->desc = pe[idx].desc; info->code = pe[idx].ctrnum; info->equiv = NULL; info->idx = idx; info->pmu = pmu->pmu; info->is_precise = 0; info->nattrs = 0; /* attributes are not supported */ return PFM_SUCCESS; } static int pfm_cpumf_get_event_attr_info(void *this, int idx, int umask_idx, pfmlib_event_attr_info_t *info) { /* Attributes are not supported */ return PFM_ERR_ATTR; } pfmlib_pmu_t s390x_cpum_cf_support = { .desc = "CPU-measurement counter facility", .name = "cpum_cf", .pmu = PFM_PMU_S390X_CPUM_CF, .type = PFM_PMU_TYPE_CORE, .flags = PFMLIB_PMU_FL_ARCH_DFL, .supported_plm = PFM_PLM3, .num_cntrs = 0, /* no general-purpose counters */ .num_fixed_cntrs = CPUMF_COUNTER_MAX, /* fixed counters only */ .max_encoding = 1, .pe = NULL, .pme_count = 0, .pmu_detect = 
pfm_cpumcf_detect, .pmu_init = pfm_cpumcf_init, .pmu_terminate = pfm_cpumcf_exit, .get_event_encoding[PFM_OS_NONE] = pfm_cpumf_get_encoding, PFMLIB_ENCODE_PERF(pfm_s390x_get_perf_encoding), .get_event_first = pfm_cpumf_get_event_first, .get_event_next = pfm_cpumf_get_event_next, .event_is_valid = pfm_cpumf_event_is_valid, .validate_table = pfm_cpumcf_validate_table, .get_event_info = pfm_cpumf_get_event_info, .get_event_attr_info = pfm_cpumf_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_s390x_perf_validate_pattrs), }; pfmlib_pmu_t s390x_cpum_sf_support = { .desc = "CPU-measurement sampling facility", .name = "cpum_sf", .pmu = PFM_PMU_S390X_CPUM_SF, .type = PFM_PMU_TYPE_CORE, .flags = PFMLIB_PMU_FL_ARCH_DFL, .num_cntrs = 0, /* no general-purpose counters */ .num_fixed_cntrs = 2, /* fixed counters only */ .max_encoding = 1, .pe = cpumsf_counters, .pme_count = LIBPFM_ARRAY_SIZE(cpumsf_counters), .pmu_detect = pfm_cpumsf_detect, .get_event_encoding[PFM_OS_NONE] = pfm_cpumf_get_encoding, PFMLIB_ENCODE_PERF(pfm_s390x_get_perf_encoding), .get_event_first = pfm_cpumf_get_event_first, .get_event_next = pfm_cpumf_get_event_next, .event_is_valid = pfm_cpumf_event_is_valid, .validate_table = pfm_cpumf_validate_table, .get_event_info = pfm_cpumf_get_event_info, .get_event_attr_info = pfm_cpumf_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_s390x_perf_validate_pattrs), };

papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_s390x_perf_event.c

/* * perf_event for Linux on IBM System z * * Copyright IBM Corp.
2012 * Contributed by Hendrik Brueckner * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include <sys/types.h> #include <stdio.h> #include <stdlib.h> /* private library and arch headers */ #include "pfmlib_priv.h" #include "pfmlib_s390x_priv.h" #include "pfmlib_perf_event_priv.h" /* * The s390 Performance Measurement counter facility does not have a fixed * type number anymore. This was caused by linux kernel commits * 66d258c5b0488 perf/core: Optimize perf_init_event() * and its necessary follow on commit * 6a82e23f45fe0 s390/cpumf: Adjust registration of s390 PMU device drivers * * Now read out the current type number from a sysfs file named * /sys/devices/cpum_cf/type. If it does not exist there is no CPU-MF counter * facility installed or activated. * * As the CPU Measurement counter facility does not change on a running * system, read out the type value on first read and cache it.
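The read-once pattern described here boils down to parsing one unsigned integer from the sysfs "type" file and falling back to the raw event type when parsing fails. The sketch below is hypothetical, not libpfm4 code; it assumes `PERF_TYPE_RAW` (value 4, from `linux/perf_event.h`) as the fallback and leaves caching to the caller:

```c
#include <stdio.h>

#ifndef PERF_TYPE_RAW
#define PERF_TYPE_RAW 4	/* value from linux/perf_event.h */
#endif

/* Hypothetical sketch: parse the dynamic PMU type number from an
 * already-opened sysfs "type" file; fall back to PERF_TYPE_RAW when the
 * content cannot be parsed as an unsigned integer. */
static unsigned int read_pmu_type(FILE *fp)
{
	unsigned int type;

	if (fscanf(fp, "%u", &type) != 1)
		type = PERF_TYPE_RAW;
	return type;
}
```

Caching the value, as the comment above explains, is safe because the facility's type number cannot change on a running system.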
* * There are several PMUs for s390, so find the correct one first and return * its PMU type value assigned at system boot time. */ static struct pfm_s390_perf_aptt { /* Perf attribute PMU type table */ pfm_pmu_t pmutype; /* PMU Type number */ const char *fname; /* File name to read type from */ unsigned int value; /* Type value, 0 --> unused */ } pfm_s390_perf_aptt[] = { { .pmutype = PFM_PMU_S390X_CPUM_CF, .fname = "/sys/bus/event_source/devices/cpum_cf/type" }, { .pmutype = PFM_PMU_S390X_CPUM_SF, .fname = "/sys/bus/event_source/devices/cpum_sf/type" }, }; #define S390_APTT_COUNT LIBPFM_ARRAY_SIZE(pfm_s390_perf_aptt) static int pfm_s390_get_perf_attr_type(pfm_pmu_t pmutype) { int cpum_cf_type; size_t buflen; char *buffer; FILE *fp; size_t i; /* Find type of PMU and return known and cached value */ for (i = 0; i < S390_APTT_COUNT; ++i) { if (pfm_s390_perf_aptt[i].pmutype == pmutype) break; } if (i == S390_APTT_COUNT) return PFM_ERR_NOTFOUND; if (pfm_s390_perf_aptt[i].value) return pfm_s390_perf_aptt[i].value; /* Value unknown, read from file */ fp = fopen(pfm_s390_perf_aptt[i].fname, "r"); if (fp == NULL) return PFM_ERR_NOTFOUND; buffer = NULL; if (pfmlib_getl(&buffer, &buflen, fp) != -1 && sscanf(buffer, "%u", &cpum_cf_type) == -1) cpum_cf_type = PERF_TYPE_RAW; fclose(fp); free(buffer); pfm_s390_perf_aptt[i].value = cpum_cf_type; return cpum_cf_type; } int pfm_s390x_get_perf_encoding(void *this, pfmlib_event_desc_t *e) { pfmlib_pmu_t *pmu = this; struct perf_event_attr *attr = e->os_data; int rc; if (!pmu->get_event_encoding[PFM_OS_NONE]) return PFM_ERR_NOTSUPP; /* set up raw pmu event encoding */ rc = pmu->get_event_encoding[PFM_OS_NONE](this, e); if (rc == PFM_SUCCESS) { /* currently use raw events only */ rc = pfm_s390_get_perf_attr_type(pmu->pmu); if (rc > 0) { /* PMU types are positive */ attr->type = rc; attr->config = e->codes[0]; rc = PFM_SUCCESS; } } return rc; } void pfm_s390x_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e) { int i, compact; for 
(i=0; i < e->npattrs; i++) { compact = 0; /* umasks never conflict */ if (e->pattrs[i].type == PFM_ATTR_UMASK) continue; if (e->pattrs[i].ctrl == PFM_ATTR_CTRL_PERF_EVENT) { /* No precise mode on s390x */ if (e->pattrs[i].idx == PERF_ATTR_PR) compact = 1; } /* hardware sampling not supported */ if (e->pattrs[i].idx == PERF_ATTR_HWS) compact = 1; if (compact) { pfmlib_compact_pattrs(e, i); i--; } } }

papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_s390x_priv.h

#ifndef __PFMLIB_S390X_PRIV_H__ #define __PFMLIB_S390X_PRIV_H__ #define CPUMF_COUNTER_MAX 0xffff typedef struct { uint64_t ctrnum; /* counter number */ unsigned int ctrset; /* counter set */ char *name; /* counter ID */ char *desc; /* short description */ } pme_cpumf_ctr_t; #define min(a, b) ((a) < (b) ? (a) : (b)) extern int pfm_s390x_get_perf_encoding(void *this, pfmlib_event_desc_t *e); extern void pfm_s390x_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e); #endif /* __PFMLIB_S390X_PRIV_H__ */

papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_sicortex.c

/* * pfmlib_sicortex.c : support for the generic MIPS64 PMU family * * Contributed by Philip Mucci based on code from * Copyright (c) 2005-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software.
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include #include #include #include #include /* public headers */ #include #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_sicortex_priv.h" /* architecture private */ #include "sicortex/ice9a/ice9a_all_spec_pme.h" #include "sicortex/ice9b/ice9b_all_spec_pme.h" #include "sicortex/ice9/ice9_scb_spec_sw.h" /* let's define some handy shortcuts! */ #define sel_event_mask perfsel.sel_event_mask #define sel_exl perfsel.sel_exl #define sel_os perfsel.sel_os #define sel_usr perfsel.sel_usr #define sel_sup perfsel.sel_sup #define sel_int perfsel.sel_int static pme_sicortex_entry_t *sicortex_pe = NULL; // CHANGE FOR ICET #define core_counters 2 #define MAX_ICE9_PMCS 2+4+256 #define MAX_ICE9_PMDS 2+4+256 static int compute_ice9_counters(int type) { int i; int bound = 0; pme_gen_mips64_entry_t *gen_mips64_pe = NULL; sicortex_support.pmd_count = 0; sicortex_support.pmc_count = 0; for (i=0;i 2) { /* Account for 4 sampling PMD registers */ sicortex_support.num_cnt = sicortex_support.pmd_count - 4; sicortex_support.pme_count = bound; } else { sicortex_support.pme_count = 0; /* Count up CPU only events */ for (i=0;i> (cntr*8)) & 0xff; pc[j].reg_addr = cntr*2; pc[j].reg_value = reg.val; pc[j].reg_num = cntr; __pfm_vbprintf("[CP0_25_%u(pmc%u)=0x%"PRIx64" event_mask=0x%x usr=%d os=%d sup=%d exl=%d int=1] %s\n", pc[j].reg_addr, pc[j].reg_num, pc[j].reg_value, reg.sel_event_mask, reg.sel_usr, reg.sel_os, reg.sel_sup, reg.sel_exl, 
sicortex_pe[e[j].event].pme_name); pd[j].reg_num = cntr; pd[j].reg_addr = cntr*2 + 1; __pfm_vbprintf("[CP0_25_%u(pmd%u)]\n", pc[j].reg_addr, pc[j].reg_num); } /* SCB event */ else { pmc_sicortex_scb_reg_t scbreg; int k; scbreg.val = 0; scbreg.sicortex_ScbPerfBucket_reg.event = sicortex_pe[e[j].event].pme_code >> 16; for (k=0;kflags & PFMLIB_SICORTEX_INPUT_SCB_INTERVAL)) { two.sicortex_ScbPerfCtl_reg.Interval = mod_in->pfp_sicortex_scb_global.Interval; } else { two.sicortex_ScbPerfCtl_reg.Interval = 6; /* 2048 cycles */ } if (mod_in && (mod_in->flags & PFMLIB_SICORTEX_INPUT_SCB_NOINC)) { two.sicortex_ScbPerfCtl_reg.NoInc = mod_in->pfp_sicortex_scb_global.NoInc; } else { two.sicortex_ScbPerfCtl_reg.NoInc = 0; } two.sicortex_ScbPerfCtl_reg.IntBit = 31; /* Interrupt on last bit */ two.sicortex_ScbPerfCtl_reg.MagicEvent = 0; two.sicortex_ScbPerfCtl_reg.AddrAssert = 1; __pfm_vbprintf("[Scb%s(pmc%u)=0x%"PRIx64" Interval=0x%x IntBit=0x%x NoInc=%d AddrAssert=%d MagicEvent=0x%x]\n","PerfCtl", pc[num].reg_num, two.val, two.sicortex_ScbPerfCtl_reg.Interval, two.sicortex_ScbPerfCtl_reg.IntBit, two.sicortex_ScbPerfCtl_reg.NoInc, two.sicortex_ScbPerfCtl_reg.AddrAssert, two.sicortex_ScbPerfCtl_reg.MagicEvent); pc[num].reg_value = two.val; /*ScbPerfHist */ pc[++num].reg_num = 3; pc[num].reg_addr = 3; three.val = 0; if (mod_in && (mod_in->flags & PFMLIB_SICORTEX_INPUT_SCB_HISTGTE)) three.sicortex_ScbPerfHist_reg.HistGte = mod_in->pfp_sicortex_scb_global.HistGte; else three.sicortex_ScbPerfHist_reg.HistGte = 1; __pfm_vbprintf("[Scb%s(pmc%u)=0x%"PRIx64" HistGte=0x%x]\n","PerfHist", pc[num].reg_num, three.val, three.sicortex_ScbPerfHist_reg.HistGte); pc[num].reg_value = three.val; /*ScbPerfBuckNum */ pc[++num].reg_num = 4; pc[num].reg_addr = 4; four.val = 0; if (mod_in && (mod_in->flags & PFMLIB_SICORTEX_INPUT_SCB_BUCKET)) four.sicortex_ScbPerfBuckNum_reg.Bucket = mod_in->pfp_sicortex_scb_global.Bucket; else four.sicortex_ScbPerfBuckNum_reg.Bucket = 0; 
__pfm_vbprintf("[Scb%s(pmc%u)=0x%"PRIx64" Bucket=0x%x]\n","PerfBuckNum", pc[num].reg_num, four.val, four.sicortex_ScbPerfBuckNum_reg.Bucket); pc[num].reg_value = four.val; /*ScbPerfEna */ pc[++num].reg_num = 5; pc[num].reg_addr = 5; five.val = 0; five.sicortex_ScbPerfEna_reg.ena = 1; __pfm_vbprintf("[Scb%s(pmc%u)=0x%"PRIx64" ena=%d]\n","PerfEna", pc[num].reg_num, five.val, five.sicortex_ScbPerfEna_reg.ena); pc[num].reg_value = five.val; ++num; return(num); } /* * Automatically dispatch events to corresponding counters following constraints. * Upon return the pfarg_regt structure is ready to be submitted to kernel */ static int pfm_sicortex_dispatch_counters(pfmlib_input_param_t *inp, pfmlib_sicortex_input_param_t *mod_in, pfmlib_output_param_t *outp) { /* pfmlib_sicortex_input_param_t *param = mod_in; */ pfmlib_event_t *e = inp->pfp_events; pfmlib_reg_t *pc, *pd; unsigned int i, j, cnt = inp->pfp_event_count; unsigned int used = 0; extern pfm_pmu_support_t sicortex_support; unsigned int cntr, avail; pc = outp->pfp_pmcs; pd = outp->pfp_pmds; /* Degree N rank based allocation */ if (cnt > sicortex_support.pmc_count) return PFMLIB_ERR_TOOMANY; if (PFMLIB_DEBUG()) { for (j=0; j < cnt; j++) { DPRINT("ev[%d]=%s, counters=0x%x\n", j, sicortex_pe[e[j].event].pme_name,sicortex_pe[e[j].event].pme_counters); } } /* Do rank based allocation, counters that live on 1 reg before counters that live on 2 regs etc. 
*/ /* CPU counters first */ for (i=1;i<=core_counters;i++) { for (j=0; j < cnt;j++) { /* CPU counters first */ if ((sicortex_pe[e[j].event].pme_counters & ((1<pfp_dfl_plm,pc,pd,cntr,j,mod_in); used |= (1 << cntr); DPRINT("Rank %d: Used counters 0x%x\n",i, used); } } } /* SCB counters can live anywhere */ used = 0; for (j=0; j < cnt;j++) { unsigned int cntr; /* CPU counters first */ if (sicortex_pe[e[j].event].pme_counters & (1<pfp_dfl_plm,pc,pd,cntr,j,mod_in); used++; DPRINT("SCB(%d): Used counters %d\n",j,used); } } if (used) { outp->pfp_pmc_count = stuff_sicortex_scb_control_regs(pc,pd,cnt,mod_in); outp->pfp_pmd_count = cnt; return PFMLIB_SUCCESS; } /* number of evtsel registers programmed */ outp->pfp_pmc_count = cnt; outp->pfp_pmd_count = cnt; return PFMLIB_SUCCESS; } static int pfm_sicortex_dispatch_events(pfmlib_input_param_t *inp, void *model_in, pfmlib_output_param_t *outp, void *model_out) { pfmlib_sicortex_input_param_t *mod_sicortex_in = (pfmlib_sicortex_input_param_t *)model_in; return pfm_sicortex_dispatch_counters(inp, mod_sicortex_in, outp); } static int pfm_sicortex_get_event_code(unsigned int i, unsigned int cnt, int *code) { extern pfm_pmu_support_t sicortex_support; /* check validity of counter index */ if (cnt != PFMLIB_CNT_FIRST) { if (cnt < 0 || cnt >= sicortex_support.pmc_count) return PFMLIB_ERR_INVAL; } else { cnt = ffs(sicortex_pe[i].pme_counters)-1; if (cnt == -1) return(PFMLIB_ERR_INVAL); } /* if cnt == 1, shift right by 0, if cnt == 2, shift right by 8 */ /* Works on both 5k anf 20K */ unsigned int tmp = sicortex_pe[i].pme_counters; /* CPU event */ if (tmp & ((1<> (cnt*8)); else return PFMLIB_ERR_INVAL; } /* SCB event */ else { if ((cnt < 6) || (cnt >= sicortex_support.pmc_count)) return PFMLIB_ERR_INVAL; *code = 0xffff & (sicortex_pe[i].pme_code >> 16); } return PFMLIB_SUCCESS; } /* * This function is accessible directly to the user */ int pfm_sicortex_get_event_umask(unsigned int i, unsigned long *umask) { extern pfm_pmu_support_t 
sicortex_support; if (i >= sicortex_support.pme_count || umask == NULL) return PFMLIB_ERR_INVAL; *umask = 0; //evt_umask(i); return PFMLIB_SUCCESS; } static void pfm_sicortex_get_event_counters(unsigned int j, pfmlib_regmask_t *counters) { extern pfm_pmu_support_t sicortex_support; unsigned int tmp; memset(counters, 0, sizeof(*counters)); tmp = sicortex_pe[j].pme_counters; /* CPU counter */ if (tmp & ((1< core_counters) { /* counting pmds are not contiguous on ICE9*/ for(i=6; i < sicortex_support.pmd_count; i++) pfm_regmask_set(impl_counters, i); } } static void pfm_sicortex_get_hw_counter_width(unsigned int *width) { *width = PMU_GEN_MIPS64_COUNTER_WIDTH; } static char * pfm_sicortex_get_event_name(unsigned int i) { return sicortex_pe[i].pme_name; } static int pfm_sicortex_get_event_description(unsigned int ev, char **str) { char *s; s = sicortex_pe[ev].pme_desc; if (s) { *str = strdup(s); } else { *str = NULL; } return PFMLIB_SUCCESS; } static int pfm_sicortex_get_cycle_event(pfmlib_event_t *e) { return pfm_find_full_event("CPU_CYCLES",e); } static int pfm_sicortex_get_inst_retired(pfmlib_event_t *e) { return pfm_find_full_event("CPU_INSEXEC",e); } /* SiCortex specific functions */ /* CPU counter */ int pfm_sicortex_is_cpu(unsigned int i) { if (i < sicortex_support.pme_count) { unsigned int tmp = sicortex_pe[i].pme_counters; return !(tmp & (1< based on code from * Copyright (c) 2004-2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux/ia64. */ #ifndef __PFMLIB_SICORTEX_PRIV_H__ #define __PFMLIB_SICORTEX_PRIV_H__ #include "pfmlib_gen_mips64_priv.h" #define PFMLIB_SICORTEX_MAX_UMASK 5 typedef struct { char *pme_uname; /* unit mask name */ char *pme_udesc; /* event/umask description */ unsigned int pme_ucode; /* unit mask code */ } pme_sicortex_umask_t; typedef struct { char *pme_name; char *pme_desc; /* text description of the event */ unsigned int pme_code; /* event mask, holds room for four events, low 8 bits cntr0, ... 
high 8 bits cntr3 */ unsigned int pme_counters; /* Which counter event lives on */ unsigned int pme_numasks; /* number of umasks */ pme_sicortex_umask_t pme_umasks[PFMLIB_SICORTEX_MAX_UMASK]; /* umask desc */ } pme_sicortex_entry_t; /* * SiCortex specific */ typedef union { uint64_t val; /* complete register value */ struct { unsigned long sel_exl:1; /* int level */ unsigned long sel_os:1; /* system level */ unsigned long sel_sup:1; /* supervisor level */ unsigned long sel_usr:1; /* user level */ unsigned long sel_int:1; /* enable intr */ unsigned long sel_event_mask:6; /* event mask */ unsigned long sel_res1:23; /* reserved */ unsigned long sel_res2:32; /* reserved */ } perfsel; } pfm_sicortex_sel_reg_t; #define PMU_SICORTEX_SCB_NUM_COUNTERS 256 typedef union { uint64_t val; struct { unsigned long Interval:4; unsigned long IntBit:5; unsigned long NoInc:1; unsigned long AddrAssert:1; unsigned long MagicEvent:2; unsigned long Reserved:19; } sicortex_ScbPerfCtl_reg; struct { unsigned long HistGte:20; unsigned long Reserved:12; } sicortex_ScbPerfHist_reg; struct { unsigned long Bucket:8; unsigned long Reserved:24; } sicortex_ScbPerfBuckNum_reg; struct { unsigned long ena:1; unsigned long Reserved:31; } sicortex_ScbPerfEna_reg; struct { unsigned long event:15; unsigned long hist:1; unsigned long ifOther:2; unsigned long Reserved:15; } sicortex_ScbPerfBucket_reg; } pmc_sicortex_scb_reg_t; typedef union { uint64_t val; struct { unsigned long Reserved:2; uint64_t VPCL:38; unsigned long VPCH:2; } sicortex_CpuPerfVPC_reg; struct { unsigned long Reserved:5; unsigned long PEA:31; unsigned long Reserved2:12; unsigned long ASID:8; unsigned long L2STOP:4; unsigned long L2STATE:3; unsigned long L2HIT:1; } sicortex_CpuPerfPEA_reg; } pmd_sicortex_cpu_reg_t; #define PFMLIB_SICORTEX_INPUT_SCB_NONE (unsigned long)0x0 #define PFMLIB_SICORTEX_INPUT_SCB_INTERVAL (unsigned long)0x1 #define PFMLIB_SICORTEX_INPUT_SCB_NOINC (unsigned long)0x2 #define PFMLIB_SICORTEX_INPUT_SCB_HISTGTE 
(unsigned long)0x4 #define PFMLIB_SICORTEX_INPUT_SCB_BUCKET (unsigned long)0x8 static pme_sicortex_umask_t sicortex_scb_umasks[PFMLIB_SICORTEX_MAX_UMASK] = { { "IFOTHER_NONE","Both buckets count independently",0x00 }, { "IFOTHER_AND","Increment where this event counts and the opposite bucket counts",0x02 }, { "IFOTHER_ANDNOT","Increment where this event counts and the opposite bucket does not",0x04 }, { "HIST_NONE","Count cycles where the event is asserted",0x0 }, { "HIST_EDGE","Histogram on edges of the specified event",0x1 } }; #endif /* __PFMLIB_GEN_MIPS64_PRIV_H__ */ papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_sparc.c000066400000000000000000000171611502707512200213450ustar00rootroot00000000000000/* * pfmlib_sparc.c : support for SPARC processors * * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS * IN THE SOFTWARE. 
* */ #include #include #include #include #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_sparc_priv.h" const pfmlib_attr_desc_t sparc_mods[]={ PFM_ATTR_B("k", "monitor at priv level 0"), /* monitor priv level 0 */ PFM_ATTR_B("u", "monitor at priv level 1, 2, 3"), /* monitor priv level 1, 2, 3 */ PFM_ATTR_B("h", "monitor in hypervisor"), /* monitor in hypervisor*/ PFM_ATTR_NULL /* end-marker to avoid exporting number of entries */ }; #define SPARC_NUM_MODS (sizeof(sparc_mods)/sizeof(pfmlib_attr_desc_t) - 1) #ifdef CONFIG_PFMLIB_OS_LINUX /* * helper function to retrieve one value from /proc/cpuinfo * for internal libpfm use only * attr: the attribute (line) to look for * ret_buf: a buffer to store the value of the attribute (as a string) * maxlen : number of bytes of capacity in ret_buf * * ret_buf is null terminated. * * Return: * 0 : attribute found, ret_buf populated * -1: attribute not found */ static int pfmlib_getcpuinfo_attr(const char *attr, char *ret_buf, size_t maxlen) { FILE *fp = NULL; int ret = -1; size_t attr_len, buf_len = 0; char *p, *value = NULL; char *buffer = NULL; if (attr == NULL || ret_buf == NULL || maxlen < 1) return -1; attr_len = strlen(attr); fp = fopen("/proc/cpuinfo", "r"); if (fp == NULL) return -1; while(pfmlib_getl(&buffer, &buf_len, fp) != -1){ /* skip blank lines */ if (*buffer == '\n') continue; p = strchr(buffer, ':'); if (p == NULL) goto error; /* * p+2: +1 = space, +2= firt character * strlen()-1 gets rid of \n */ *p = '\0'; value = p+2; value[strlen(value)-1] = '\0'; if (!strncmp(attr, buffer, attr_len)) break; } strncpy(ret_buf, value, maxlen-1); ret_buf[maxlen-1] = '\0'; ret = 0; error: free(buffer); fclose(fp); return ret; } #else static int pfmlib_getcpuinfo_attr(const char *attr, char *ret_buf, size_t maxlen) { return -1; } #endif static pfm_pmu_t pmu_name_to_pmu_type(char *name) { if (!strcmp(name, "ultra12")) return PFM_PMU_SPARC_ULTRA12; if (!strcmp(name, "ultra3")) return 
PFM_PMU_SPARC_ULTRA3; if (!strcmp(name, "ultra3i")) return PFM_PMU_SPARC_ULTRA3I; if (!strcmp(name, "ultra3+")) return PFM_PMU_SPARC_ULTRA3PLUS; if (!strcmp(name, "ultra4+")) return PFM_PMU_SPARC_ULTRA4PLUS; if (!strcmp(name, "niagara2")) return PFM_PMU_SPARC_NIAGARA2; if (!strcmp(name, "niagara")) return PFM_PMU_SPARC_NIAGARA1; return PFM_PMU_NONE; } int pfm_sparc_detect(void *this) { pfmlib_pmu_t *pmu = this; pfm_pmu_t model; int ret; char buffer[32]; ret = pfmlib_getcpuinfo_attr("pmu", buffer, sizeof(buffer)); if (ret == -1) return PFM_ERR_NOTSUPP; model = pmu_name_to_pmu_type(buffer); return model == pmu->pmu ? PFM_SUCCESS : PFM_ERR_NOTSUPP; } void pfm_sparc_display_reg(void *this, pfmlib_event_desc_t *e, pfm_sparc_reg_t reg) { __pfm_vbprintf("[0x%x umask=0x%x code=0x%x ctrl_s1=%d ctrl_s0=%d] %s\n", reg.val, reg.config.umask, reg.config.code, reg.config.ctrl_s1, reg.config.ctrl_s0, e->fstr); } int pfm_sparc_get_encoding(void *this, pfmlib_event_desc_t *e) { const sparc_entry_t *pe = this_pe(this); pfmlib_event_attr_info_t *a; pfm_sparc_reg_t reg; int i; //reg.val = pe[e->event].code << 16 | pe[e->event].ctrl; reg.val = pe[e->event].code; for (i = 0; i < e->nattrs; i++) { a = attr(e, i); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) reg.config.umask |= 1 << pe[e->event].umasks[a->idx].ubit; } e->count = 2; e->codes[0] = reg.val; e->codes[1] = pe[e->event].ctrl; evt_strcat(e->fstr, "%s", pe[e->event].name); pfmlib_sort_attr(e); for (i = 0; i < e->nattrs; i++) { a = attr(e, i); if (a->ctrl != PFM_ATTR_CTRL_PMU) continue; if (a->type == PFM_ATTR_UMASK) evt_strcat(e->fstr, ":%s", pe[e->event].umasks[a->idx].uname); } pfm_sparc_display_reg(this, e, reg); return PFM_SUCCESS; } int pfm_sparc_get_event_first(void *this) { return 0; } int pfm_sparc_get_event_next(void *this, int idx) { pfmlib_pmu_t *p = this; if (idx >= (p->pme_count-1)) return -1; return idx+1; } int pfm_sparc_event_is_valid(void *this, int pidx) { pfmlib_pmu_t *p = this; 
return pidx >= 0 && pidx < p->pme_count; } int pfm_sparc_validate_table(void *this, FILE *fp) { pfmlib_pmu_t *pmu = this; const sparc_entry_t *pe = this_pe(this); int i, j, error = 0; for(i=0; i < pmu->pme_count; i++) { if (!pe[i].name) { fprintf(fp, "pmu: %s event%d: :: no name (prev event was %s)\n", pmu->name, i, i > 1 ? pe[i-1].name : "??"); error++; } if (!pe[i].desc) { fprintf(fp, "pmu: %s event%d: %s :: no description\n", pmu->name, i, pe[i].name); error++; } for(j=i+1; j < pmu->pme_count; j++) { if (pe[i].code == pe[j].code && pe[i].ctrl == pe[j].ctrl) { fprintf(fp, "pmu: %s event%d: %s code: 0x%x is duplicated in event%d : %s\n", pmu->name, i, pe[i].name, pe[i].code, j, pe[j].name); error++; } } } return error ? PFM_ERR_INVAL : PFM_SUCCESS; } int pfm_sparc_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info) { const sparc_entry_t *pe = this_pe(this); int idx; if (attr_idx < pe[pidx].numasks) { info->name = pe[pidx].umasks[attr_idx].uname; info->desc = pe[pidx].umasks[attr_idx].udesc; info->name = pe[pidx].umasks[attr_idx].uname; info->equiv= NULL; info->code = 1 << pe[pidx].umasks[attr_idx].ubit; info->type = PFM_ATTR_UMASK; info->idx = attr_idx; } else { /* * all mods implemented by ALL events */ idx = attr_idx - pe[pidx].numasks; info->name = sparc_mods[idx].name; info->desc = sparc_mods[idx].desc; info->type = sparc_mods[idx].type; info->code = idx; info->type = sparc_mods[idx].type; } info->is_dfl = 0; info->is_precise = 0; info->support_hw_smpl = 0; info->ctrl = PFM_ATTR_CTRL_PMU;; return PFM_SUCCESS; } int pfm_sparc_get_event_info(void *this, int idx, pfm_event_info_t *info) { pfmlib_pmu_t *pmu = this; const sparc_entry_t *pe = this_pe(this); /* * pmu and idx filled out by caller */ info->name = pe[idx].name; info->desc = pe[idx].desc; info->code = pe[idx].code; info->equiv = NULL; info->idx = idx; /* private index */ info->pmu = pmu->pmu; info->is_precise = 0; info->support_hw_smpl = 0; info->nattrs = 
pe[idx].numasks; return PFM_SUCCESS; } unsigned int pfm_sparc_get_event_nattrs(void *this, int pidx) { const sparc_entry_t *pe = this_pe(this); return SPARC_NUM_MODS + pe[pidx].numasks; } papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_sparc_niagara.c000066400000000000000000000061531502707512200230260ustar00rootroot00000000000000/* * pfmlib_sparc_niagara.c : SPARC Niagara I, II * * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* * Core PMU = architectural perfmon v2 + PEBS */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_sparc_priv.h" #include "events/sparc_niagara1_events.h" #include "events/sparc_niagara2_events.h" pfmlib_pmu_t sparc_niagara1_support={ .desc = "Sparc Niagara I", .name = "niagara1", .pmu = PFM_PMU_SPARC_NIAGARA1, .pme_count = LIBPFM_ARRAY_SIZE(niagara1_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = SPARC_PLM, .max_encoding = 2, .num_cntrs = 2, .pe = niagara1_pe, .atdesc = NULL, .flags = 0, .pmu_detect = pfm_sparc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_sparc_get_encoding, PFMLIB_ENCODE_PERF(pfm_sparc_get_perf_encoding), .get_event_first = pfm_sparc_get_event_first, .get_event_next = pfm_sparc_get_event_next, .event_is_valid = pfm_sparc_event_is_valid, .validate_table = pfm_sparc_validate_table, .get_event_info = pfm_sparc_get_event_info, .get_event_attr_info = pfm_sparc_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_sparc_perf_validate_pattrs), .get_event_nattrs = pfm_sparc_get_event_nattrs, }; pfmlib_pmu_t sparc_niagara2_support={ .desc = "Sparc Niagara II", .name = "niagara2", .pmu = PFM_PMU_SPARC_NIAGARA2, .pme_count = LIBPFM_ARRAY_SIZE(niagara2_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = NIAGARA2_PLM, .num_cntrs = 2, .max_encoding = 2, .pe = niagara2_pe, .atdesc = NULL, .flags = 0, .pmu_detect = pfm_sparc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_sparc_get_encoding, PFMLIB_ENCODE_PERF(pfm_sparc_get_perf_encoding), .get_event_first = pfm_sparc_get_event_first, .get_event_next = pfm_sparc_get_event_next, .event_is_valid = pfm_sparc_event_is_valid, .validate_table = pfm_sparc_validate_table, .get_event_info = pfm_sparc_get_event_info, .get_event_attr_info = pfm_sparc_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_sparc_perf_validate_pattrs), .get_event_nattrs = pfm_sparc_get_event_nattrs, }; 
papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_sparc_perf_event.c000066400000000000000000000050641502707512200235610ustar00rootroot00000000000000/* * pfmlib_sparc_perf_event.c : perf_event SPARC functions * * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include /* private headers */ #include "pfmlib_priv.h" /* library private */ #include "pfmlib_sparc_priv.h" /* architecture private */ #include "pfmlib_perf_event_priv.h" int pfm_sparc_get_perf_encoding(void *this, pfmlib_event_desc_t *e) { struct perf_event_attr *attr = e->os_data; int ret; ret = pfm_sparc_get_encoding(this, e); if (ret != PFM_SUCCESS) return ret; attr->type = PERF_TYPE_RAW; attr->config = (e->codes[0] << 16) | e->codes[1]; return PFM_SUCCESS; } void pfm_sparc_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e) { int i, compact; for (i = 0; i < e->npattrs; i++) { compact = 0; /* umasks never conflict */ if (e->pattrs[i].type == PFM_ATTR_UMASK) continue; /* * with perf_events, u and k are handled at the OS level * via attr.exclude_* fields */ if (e->pattrs[i].ctrl == PFM_ATTR_CTRL_PMU) { if (e->pattrs[i].idx == SPARC_ATTR_U || e->pattrs[i].idx == SPARC_ATTR_K || e->pattrs[i].idx == SPARC_ATTR_H) compact = 1; } if (e->pattrs[i].ctrl == PFM_ATTR_CTRL_PERF_EVENT) { /* No precise mode on SPARC */ if (e->pattrs[i].idx == PERF_ATTR_PR) compact = 1; } /* hardware sampling not supported */ if (e->pattrs[i].idx == PERF_ATTR_HWS) compact = 1; if (compact) { pfmlib_compact_pattrs(e, i); i--; } } } papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_sparc_priv.h000066400000000000000000000032721502707512200224100ustar00rootroot00000000000000#ifndef __PFMLIB_SPARC_PRIV_H__ #define __PFMLIB_SPARC_PRIV_H__ typedef struct { char *uname; /* mask name */ char *udesc; /* mask description */ int ubit; /* umask bit position */ } sparc_mask_t; #define EVENT_MASK_BITS 8 typedef struct { char *name; /* event name */ char *desc; /* event description */ char ctrl; /* S0 or S1 */ char __pad; int code; /* S0/S1 encoding */ int numasks; /* number of entries in masks */ sparc_mask_t umasks[EVENT_MASK_BITS]; } sparc_entry_t; typedef union { unsigned int val; struct { unsigned int ctrl_s0 : 1; unsigned int ctrl_s1 : 1; unsigned int reserved1 : 14; unsigned int code 
: 8; unsigned int umask : 8; } config; } pfm_sparc_reg_t; #define PME_CTRL_S0 1 #define PME_CTRL_S1 2 #define SPARC_ATTR_K 0 #define SPARC_ATTR_U 1 #define SPARC_ATTR_H 2 #define SPARC_PLM (PFM_PLM0|PFM_PLM3) #define NIAGARA2_PLM (SPARC_PLM|PFM_PLMH) extern int pfm_sparc_detect(void *this); extern int pfm_sparc_get_encoding(void *this, pfmlib_event_desc_t *e); extern int pfm_sparc_get_event_first(void *this); extern int pfm_sparc_get_event_next(void *this, int idx); extern int pfm_sparc_event_is_valid(void *this, int pidx); extern int pfm_sparc_validate_table(void *this, FILE *fp); extern int pfm_sparc_get_event_attr_info(void *this, int pidx, int attr_idx, pfmlib_event_attr_info_t *info); extern int pfm_sparc_get_event_info(void *this, int idx, pfm_event_info_t *info); extern unsigned int pfm_sparc_get_event_nattrs(void *this, int pidx); extern void pfm_sparc_perf_validate_pattrs(void *this, pfmlib_event_desc_t *e); extern int pfm_sparc_get_perf_encoding(void *this, pfmlib_event_desc_t *e); #endif /* __PFMLIB_SPARC_PRIV_H__ */ papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_sparc_ultra12.c000066400000000000000000000043171502707512200227160ustar00rootroot00000000000000/* * pfmlib_sparc_ultra12.c : SPARC Ultra I, II * * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * Core PMU = architectural perfmon v2 + PEBS */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_sparc_priv.h" #include "events/sparc_ultra12_events.h" pfmlib_pmu_t sparc_ultra12_support={ .desc = "Ultra Sparc I/II", .name = "ultra12", .pmu = PFM_PMU_SPARC_ULTRA12, .pme_count = LIBPFM_ARRAY_SIZE(ultra12_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = SPARC_PLM, .max_encoding = 2, .num_cntrs = 2, .pe = ultra12_pe, .atdesc = NULL, .flags = 0, .pmu_detect = pfm_sparc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_sparc_get_encoding, PFMLIB_ENCODE_PERF(pfm_sparc_get_perf_encoding), .get_event_first = pfm_sparc_get_event_first, .get_event_next = pfm_sparc_get_event_next, .event_is_valid = pfm_sparc_event_is_valid, .validate_table = pfm_sparc_validate_table, .get_event_info = pfm_sparc_get_event_info, .get_event_attr_info = pfm_sparc_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_sparc_perf_validate_pattrs), .get_event_nattrs = pfm_sparc_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_sparc_ultra3.c000066400000000000000000000077621502707512200226450ustar00rootroot00000000000000/* * pfmlib_sparc_ultra3.c : SPARC Ultra I, II * * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, 
distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * Core PMU = architectural perfmon v2 + PEBS */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_sparc_priv.h" #include "events/sparc_ultra3_events.h" #include "events/sparc_ultra3i_events.h" #include "events/sparc_ultra3plus_events.h" pfmlib_pmu_t sparc_ultra3_support={ .desc = "Ultra Sparc III", .name = "ultra3", .pmu = PFM_PMU_SPARC_ULTRA3, .pme_count = LIBPFM_ARRAY_SIZE(ultra3_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = SPARC_PLM, .max_encoding = 2, .num_cntrs = 2, .pe = ultra3_pe, .atdesc = NULL, .flags = 0, .pmu_detect = pfm_sparc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_sparc_get_encoding, PFMLIB_ENCODE_PERF(pfm_sparc_get_perf_encoding), .get_event_first = pfm_sparc_get_event_first, .get_event_next = pfm_sparc_get_event_next, .event_is_valid = pfm_sparc_event_is_valid, .validate_table = pfm_sparc_validate_table, .get_event_info = pfm_sparc_get_event_info, .get_event_attr_info = pfm_sparc_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_sparc_perf_validate_pattrs), .get_event_nattrs = pfm_sparc_get_event_nattrs, }; pfmlib_pmu_t sparc_ultra3i_support={ .desc = "Ultra Sparc IIIi", .name = "ultra3i", .pmu = PFM_PMU_SPARC_ULTRA3I, .pme_count = LIBPFM_ARRAY_SIZE(ultra3i_pe), 
.type = PFM_PMU_TYPE_CORE, .supported_plm = SPARC_PLM, .num_cntrs = 2, .max_encoding = 2, .pe = ultra3i_pe, .atdesc = NULL, .flags = 0, .pmu_detect = pfm_sparc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_sparc_get_encoding, PFMLIB_ENCODE_PERF(pfm_sparc_get_perf_encoding), .get_event_first = pfm_sparc_get_event_first, .get_event_next = pfm_sparc_get_event_next, .event_is_valid = pfm_sparc_event_is_valid, .validate_table = pfm_sparc_validate_table, .get_event_info = pfm_sparc_get_event_info, .get_event_attr_info = pfm_sparc_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_sparc_perf_validate_pattrs), .get_event_nattrs = pfm_sparc_get_event_nattrs, }; pfmlib_pmu_t sparc_ultra3plus_support={ .desc = "Ultra Sparc III+", .name = "ultra3p", .pmu = PFM_PMU_SPARC_ULTRA3PLUS, .pme_count = LIBPFM_ARRAY_SIZE(ultra3plus_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = SPARC_PLM, .max_encoding = 2, .num_cntrs = 2, .pe = ultra3plus_pe, .atdesc = NULL, .flags = 0, .pmu_detect = pfm_sparc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_sparc_get_encoding, PFMLIB_ENCODE_PERF(pfm_sparc_get_perf_encoding), .get_event_first = pfm_sparc_get_event_first, .get_event_next = pfm_sparc_get_event_next, .event_is_valid = pfm_sparc_event_is_valid, .validate_table = pfm_sparc_validate_table, .get_event_info = pfm_sparc_get_event_info, .get_event_attr_info = pfm_sparc_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_sparc_perf_validate_pattrs), .get_event_nattrs = pfm_sparc_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_sparc_ultra4.c000066400000000000000000000042471502707512200226410ustar00rootroot00000000000000/* * pfmlib_sparc_ultra4.c : SPARC Ultra 4+ * * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, 
modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ /* private headers */ #include "pfmlib_priv.h" #include "pfmlib_sparc_priv.h" #include "events/sparc_ultra4plus_events.h" pfmlib_pmu_t sparc_ultra4plus_support={ .desc = "Ultra Sparc 4+", .name = "ultra4p", .pmu = PFM_PMU_SPARC_ULTRA4PLUS, .pme_count = LIBPFM_ARRAY_SIZE(ultra4plus_pe), .type = PFM_PMU_TYPE_CORE, .supported_plm = SPARC_PLM, .max_encoding = 2, .num_cntrs = 2, .pe = ultra4plus_pe, .atdesc = NULL, .flags = 0, .pmu_detect = pfm_sparc_detect, .get_event_encoding[PFM_OS_NONE] = pfm_sparc_get_encoding, PFMLIB_ENCODE_PERF(pfm_sparc_get_perf_encoding), .get_event_first = pfm_sparc_get_event_first, .get_event_next = pfm_sparc_get_event_next, .event_is_valid = pfm_sparc_event_is_valid, .validate_table = pfm_sparc_validate_table, .get_event_info = pfm_sparc_get_event_info, .get_event_attr_info = pfm_sparc_get_event_attr_info, PFMLIB_VALID_PERF_PATTRS(pfm_sparc_perf_validate_pattrs), .get_event_nattrs = pfm_sparc_get_event_nattrs, }; papi-papi-7-2-0-t/src/libpfm4/lib/pfmlib_torrent.c000066400000000000000000000154561502707512200217370ustar00rootroot00000000000000/* * pfmlib_torrent.c : IBM Torrent support * * Copyright (C) IBM Corporation, 2010. All rights reserved. 
* Contributed by Corey Ashford (cjashfor@us.ibm.com) * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <string.h> #include <dirent.h> #include "pfmlib_priv.h" #include "pfmlib_power_priv.h" #include "events/torrent_events.h" const pfmlib_attr_desc_t torrent_modifiers[] = { PFM_ATTR_I("type", "Counter type: 0 = 2x64-bit counters w/32-bit prescale, 1 = 4x32-bit counters w/16-bit prescale, 2 = 2x32-bit counters w/no prescale, 3 = 4x16-bit counters w/no prescale"), PFM_ATTR_I("sel", "Sample period / Cmd Increment select: 0 = 256 cycles/ +16, 1 = 512 cycles / +8, 2 = 1024 cycles / +4, 3 = 2048 cycles / +2"), PFM_ATTR_I("lo_cmp", "Low threshold compare: 0..31"), PFM_ATTR_I("hi_cmp", "High threshold compare: 0..31"), PFM_ATTR_NULL }; static inline int pfm_torrent_attr2mod(void *this, int pidx, int attr_idx) { const pme_torrent_entry_t *pe = this_pe(this); size_t x; int n; n = attr_idx; pfmlib_for_each_bit(x, pe[pidx].pme_modmsk) { if (n == 0) break; n--; } return x; } /** * torrent_pmu_detect * * Determine if this machine has a Torrent chip * **/ static int pfm_torrent_detect(void* this) { struct dirent *de; DIR *dir; int ret = PFM_ERR_NOTSUPP; /* If /proc/device-tree/hfi-iohub@ exists, * this machine has an accessible Torrent chip */ dir = opendir("/proc/device-tree"); if (!dir) return PFM_ERR_NOTSUPP; while ((de = readdir(dir)) != NULL) { if (!strncmp(de->d_name, "hfi-iohub@", 10)) { ret = PFM_SUCCESS; break; } } closedir(dir); return ret; } static int pfm_torrent_get_event_info(void *this, int pidx, pfm_event_info_t *info) { pfmlib_pmu_t *pmu = this; const pme_torrent_entry_t *pe = this_pe(this); info->name = pe[pidx].pme_name; info->desc = pe[pidx].pme_desc ?
pe[pidx].pme_desc : ""; info->code = pe[pidx].pme_code; info->equiv = NULL; info->idx = pidx; /* private index */ info->pmu = pmu->pmu; info->dtype = PFM_DTYPE_UINT64; info->is_precise = 0; /* unit masks + modifiers */ info->nattrs = pfmlib_popcnt((unsigned long)pe[pidx].pme_modmsk); return PFM_SUCCESS; } static int pfm_torrent_get_event_attr_info(void *this, int idx, int attr_idx, pfmlib_event_attr_info_t *info) { int m; m = pfm_torrent_attr2mod(this, idx, attr_idx); info->name = modx(torrent_modifiers, m, name); info->desc = modx(torrent_modifiers, m, desc); info->code = m; info->type = modx(torrent_modifiers, m, type); info->equiv = NULL; info->is_dfl = 0; info->is_precise = 0; info->idx = m; info->dfl_val64 = 0; info->ctrl = PFM_ATTR_CTRL_PMU; return PFM_SUCCESS; } static int pfm_torrent_validate_table(void *this, FILE *fp) { pfmlib_pmu_t *pmu = this; const pme_torrent_entry_t *pe = this_pe(this); int i, ret = PFM_ERR_INVAL; for (i = 0; i < pmu->pme_count; i++) { if (!pe[i].pme_name) { fprintf(fp, "pmu: %s event%d: :: no name\n", pmu->name, i); goto error; } if (pe[i].pme_code == 0) { fprintf(fp, "pmu: %s event%d: %s :: event code is 0\n", pmu->name, i, pe[i].pme_name); goto error; } } ret = PFM_SUCCESS; error: return ret; } static int pfm_torrent_get_encoding(void *this, pfmlib_event_desc_t *e) { const pme_torrent_entry_t *pe = this_pe(this); uint32_t torrent_pmu; int i, mod; e->fstr[0] = '\0'; /* initialize the fully-qualified event string */ e->count = 1; e->codes[0] = (uint64_t)pe[e->event].pme_code; for (i = 0; i < e->nattrs; i++) { mod = pfm_torrent_attr2mod(this, e->event, e->attrs[i].id); torrent_pmu = pe[e->event].pme_code & (TORRENT_SPACE | TORRENT_PMU_MASK); switch (torrent_pmu) { case TORRENT_PBUS_MCD: switch (mod) { case TORRENT_ATTR_MCD_TYPE: if (e->attrs[i].ival <= 3) { e->codes[0] |= e->attrs[i].ival << TORRENT_ATTR_MCD_TYPE_SHIFT; } else { DPRINT("value of attribute \'type\' - %" PRIu64 " - is not in the range 0..3.\n", e->attrs[i].ival); 
return PFM_ERR_ATTR_VAL; } break; default: DPRINT("unknown attribute for TORRENT_POWERBUS_MCD - %d\n", mod); return PFM_ERR_ATTR; } break; case TORRENT_PBUS_UTIL: switch (mod) { case TORRENT_ATTR_UTIL_SEL: if (e->attrs[i].ival <= 3) { e->codes[0] |= e->attrs[i].ival << TORRENT_ATTR_UTIL_SEL_SHIFT; } else { DPRINT("value of attribute \'sel\' - %" PRIu64 " - is not in the range 0..3.\n", e->attrs[i].ival); return PFM_ERR_ATTR_VAL; } break; case TORRENT_ATTR_UTIL_LO_CMP: case TORRENT_ATTR_UTIL_HI_CMP: if (e->attrs[i].ival <= 31) { e->codes[0] |= e->attrs[i].ival << TORRENT_ATTR_UTIL_CMP_SHIFT; } else { if (mod == TORRENT_ATTR_UTIL_LO_CMP) DPRINT("value of attribute \'lo_cmp\' - %" PRIu64 " - is not in the range 0..31.\n", e->attrs[i].ival); else DPRINT("value of attribute \'hi_cmp\' - %" PRIu64 " - is not in the range 0..31.\n", e->attrs[i].ival); return PFM_ERR_ATTR_VAL; } } break; default: DPRINT("attributes are unsupported for this Torrent PMU - code = %" PRIx32 "\n", torrent_pmu); return PFM_ERR_ATTR; } } return PFM_SUCCESS; } pfmlib_pmu_t torrent_support = { .pmu = PFM_PMU_TORRENT, .name = "power_torrent", .desc = "IBM Power Torrent PMU", .pme_count = PME_TORRENT_EVENT_COUNT, .pe = torrent_pe, .max_encoding = 1, .get_event_first = pfm_gen_powerpc_get_event_first, .get_event_next = pfm_gen_powerpc_get_event_next, .event_is_valid = pfm_gen_powerpc_event_is_valid, .pmu_detect = pfm_torrent_detect, .get_event_encoding[PFM_OS_NONE] = pfm_torrent_get_encoding, PFMLIB_ENCODE_PERF(pfm_gen_powerpc_get_perf_encoding), PFMLIB_VALID_PERF_PATTRS(pfm_gen_powerpc_perf_validate_pattrs), .validate_table = pfm_torrent_validate_table, .get_event_info = pfm_torrent_get_event_info, .get_event_attr_info = pfm_torrent_get_event_attr_info, }; papi-papi-7-2-0-t/src/libpfm4/libpfm.spec000066400000000000000000000066551502707512200201250ustar00rootroot00000000000000%{!?with_python: %global with_python 1} %define python_sitearch %(python -c "from distutils.sysconfig import get_python_lib; 
print get_python_lib(1)") %define python_prefix %(python -c "import sys; print sys.prefix") Name: libpfm Version: 4.6.0 Release: 1%{?dist} Summary: Library to encode performance events for use by perf tool Group: System Environment/Libraries License: MIT URL: http://perfmon2.sourceforge.net/ Source0: http://sourceforge.net/projects/perfmon2/files/libpfm4/%{name}-%{version}.tar.gz %if %{with_python} BuildRequires: python-devel BuildRequires: python-setuptools BuildRequires: swig %endif BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n) %description libpfm4 is a library to help encode events for use with operating system kernels performance monitoring interfaces. The current version provides support for the perf_events interface available in upstream Linux kernels since v2.6.31. %package devel Summary: Development library to encode performance events for perf_events based tools Group: Development/Libraries Requires: %{name} = %{version}-%{release} %description devel Development library and header files to create performance monitoring applications for the perf_events interface. %if %{with_python} %package python Summary: Python bindings for libpfm and perf_event_open system call Group: Development/Languages Requires: %{name} = %{version}-%{release} %description python Python bindings for libpfm4 and perf_event_open system call. 
%endif %prep %setup -q %build %if %{with_python} %global python_config CONFIG_PFMLIB_NOPYTHON=n %else %global python_config CONFIG_PFMLIB_NOPYTHON=y %endif make %{python_config} %{?_smp_mflags} %install rm -rf $RPM_BUILD_ROOT %if %{with_python} %global python_config CONFIG_PFMLIB_NOPYTHON=n %else %global python_config CONFIG_PFMLIB_NOPYTHON=y %endif make \ PREFIX=$RPM_BUILD_ROOT%{_prefix} \ LIBDIR=$RPM_BUILD_ROOT%{_libdir} \ PYTHON_PREFIX=$RPM_BUILD_ROOT/%{python_prefix} \ %{python_config} \ LDCONFIG=/bin/true \ install %clean rm -fr $RPM_BUILD_ROOT %post -p /sbin/ldconfig %postun -p /sbin/ldconfig %files %defattr(644,root,root,755) %doc README %attr(755,root,root) %{_libdir}/lib*.so* %files devel %defattr(644,root,root,755) %{_includedir}/* %{_mandir}/man3/* %{_libdir}/lib*.a %if %{with_python} %files python %defattr(644,root,root,755) %attr(755,root,root) %{python_sitearch}/* %endif %changelog * Tue Feb 9 2016 William Cohen 4.6.0-1 - Update spec file. * Wed Nov 13 2013 Lukas Berk 4.4.0-1 - Intel IVB-EP support - Intel IVB updates support - Intel SNB updates support - Intel SNB-EP uncore support - ldlat support (PEBS-LL) - New Intel Atom support - bug fixes * Tue Aug 28 2012 Stephane Eranian 4.3.0-1 - ARM Cortex A15 support - updated Intel Sandy Bridge core PMU events - Intel Sandy Bridge desktop (model 42) uncore PMU support - Intel Ivy Bridge support - full perf_events generic event support - updated perf_examples - enabled Intel Nehalem/Westmere uncore PMU support - AMD Llano processor support (Fam 12h) - AMD Turion processor support (Fam 11h) - Intel Atom Cedarview processor support - Win32 compilation support - perf_events excl attribute - perf_events generic hw event aliases support - many bug fixes * Wed Mar 14 2012 William Cohen 4.2.0-2 - Some spec file fixup.
* Wed Jan 12 2011 Arun Sharma 4.2.0-0 Initial revision papi-papi-7-2-0-t/src/libpfm4/perf_examples/000077500000000000000000000000001502707512200206165ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libpfm4/perf_examples/Makefile000066400000000000000000000055501502707512200222630ustar00rootroot00000000000000# # Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. # Contributed by Stephane Eranian # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # TOPDIR := $(shell if [ "$$PWD" != "" ]; then echo $$PWD; else pwd; fi)/.. include $(TOPDIR)/config.mk include $(TOPDIR)/rules.mk DIRS= ifeq ($(ARCH),ia64) #DIRS +=ia64 endif ifeq ($(ARCH),x86_64) DIRS += x86 endif ifeq ($(ARCH),i386) DIRS += x86 endif CFLAGS+= -I. 
-D_GNU_SOURCE -pthread PERF_EVENT_HDR=$(TOPDIR)/include/perfmon/pfmlib_perf_event.h LPC_UTILS=perf_util.o LPC_UTILS_HDR=perf_util.h TARGETS+=self self_basic self_count task task_attach_timeout syst \ notify_self notify_group task_smpl self_smpl_multi \ self_pipe syst_count task_cpu syst_smpl evt2raw \ branch_smpl # Make rtop conditional on ncurses development package installed ifeq ($(shell /bin/echo -e '\#include \nint main(void) { return 0;}' | $(CC) -o /dev/null -xc - 2>/dev/null && echo -n yes), yes) RTOP=rtop endif EXAMPLESDIR=$(DESTDIR)$(DOCDIR)/perf_examples all: $(TARGETS) $(RTOP) @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d $@ ; done rtop: rtop.o $(LPC_UTILS) $(PFMLIB) $(PERF_EVENT_HDR) $(CC) $(CFLAGS) -o $@ $(LDFLAGS) $< $(LPC_UTILS) $(PFMLIB) $(LIBS) -lpthread -lncurses -ltinfo -lm $(TARGETS): %:%.o $(LPC_UTILS) $(PFMLIB) -$(CC) $(CFLAGS) -o $@ $(LDFLAGS) $< $(LPC_UTILS) $(PFMLIB) $(LIBS) $(LPC_UTILS): $(LPC_UTILS_HDR) clean: @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d $@ ; done $(RM) -f *.o $(TARGETS) rtop *~ distclean: clean install-examples install_examples: $(TARGETS) @echo installing: $(TARGETS) -mkdir -p $(EXAMPLESDIR) $(INSTALL) -m 755 $(TARGETS) $(EXAMPLESDIR) @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d $@ ; done # # examples are installed as part of the RPM install, typically in /usr/share/doc/libpfm-X.Y/ # .PHONY: install depend install_examples install-examples papi-papi-7-2-0-t/src/libpfm4/perf_examples/branch_smpl.c000066400000000000000000000264201502707512200232560ustar00rootroot00000000000000/* * branch_smpl.c - example of a branch sampling on another task * * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and 
to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "perf_util.h" #define DFL_BR_EVENT "branches:freq=100:u" typedef struct { int opt_no_show; int opt_inherit; uint64_t branch_filt; int cpu; int mmap_pages; char *events; FILE *output_file; } options_t; static jmp_buf jbuf; static uint64_t collected_samples, lost_samples; static perf_event_desc_t *fds; static int num_fds; static options_t options; static void cld_handler(int n) { longjmp(jbuf, 1); } int child(char **arg) { execvp(arg[0], arg); /* not reached */ return -1; } struct timeval last_read, this_read; static void process_smpl_buf(perf_event_desc_t *hw) { struct perf_event_header ehdr; int ret; for(;;) { ret = perf_read_buffer(hw, &ehdr, sizeof(ehdr)); if (ret) return; /* nothing to read */ if (options.opt_no_show) { perf_skip_buffer(hw, ehdr.size - sizeof(ehdr)); continue; } switch(ehdr.type) { case PERF_RECORD_SAMPLE: collected_samples++; ret = perf_display_sample(fds, num_fds, hw - fds, &ehdr, options.output_file); if (ret) errx(1, "cannot parse sample"); break; case PERF_RECORD_EXIT: display_exit(hw, options.output_file); break; case PERF_RECORD_LOST: lost_samples += display_lost(hw, fds, num_fds, 
options.output_file); break; case PERF_RECORD_THROTTLE: display_freq(1, hw, options.output_file); break; case PERF_RECORD_UNTHROTTLE: display_freq(0, hw, options.output_file); break; default: printf("unknown sample type %d\n", ehdr.type); perf_skip_buffer(hw, ehdr.size - sizeof(ehdr)); } } } int mainloop(char **arg) { static uint64_t ovfl_count; /* static to avoid setjmp issue */ struct pollfd pollfds[1]; sigset_t bmask; int go[2], ready[2]; size_t pgsz; size_t map_size = 0; pid_t pid; int status, ret; int i; char buf; if (pfm_initialize() != PFM_SUCCESS) errx(1, "libpfm initialization failed\n"); pgsz = sysconf(_SC_PAGESIZE); map_size = (options.mmap_pages+1)*pgsz; /* * does allocate fds */ ret = perf_setup_list_events(options.events, &fds, &num_fds); if (ret || !num_fds) errx(1, "cannot setup event list"); memset(pollfds, 0, sizeof(pollfds)); ret = pipe(ready); if (ret) err(1, "cannot create pipe ready"); ret = pipe(go); if (ret) err(1, "cannot create pipe go"); /* * Create the child task */ if ((pid=fork()) == -1) err(1, "cannot fork process\n"); if (pid == 0) { close(ready[0]); close(go[1]); /* * let the parent know we exist */ close(ready[1]); if (read(go[0], &buf, 1) == -1) err(1, "unable to read go_pipe"); exit(child(arg)); } close(ready[1]); close(go[0]); if (read(ready[0], &buf, 1) == -1) err(1, "unable to read child_ready_pipe"); close(ready[0]); fds[0].fd = -1; if (!fds[0].hw.sample_period) errx(1, "need to set sampling period or freq on first event, use :period= or :freq="); for(i=0; i < num_fds; i++) { if (i == 0) { fds[i].hw.disabled = 1; fds[i].hw.enable_on_exec = 1; /* start immediately */ } else fds[i].hw.disabled = 0; if (options.opt_inherit) fds[i].hw.inherit = 1; if (fds[i].hw.sample_period) { /* * set notification threshold to be halfway through the buffer */ fds[i].hw.wakeup_watermark = (options.mmap_pages*pgsz) / 2; fds[i].hw.watermark = 1; fds[i].hw.sample_type = 
PERF_SAMPLE_IP|PERF_SAMPLE_TID|PERF_SAMPLE_READ|PERF_SAMPLE_TIME|PERF_SAMPLE_PERIOD; /* * if we have more than one event, then record event identifier to help with parsing */ if (num_fds > 1) fds[i].hw.sample_type |= PERF_SAMPLE_IDENTIFIER; fprintf(options.output_file,"%s period=%"PRIu64" freq=%d\n", fds[i].name, fds[i].hw.sample_period, fds[i].hw.freq); fds[i].hw.read_format = PERF_FORMAT_SCALE; if (fds[i].hw.freq) fds[i].hw.sample_type |= PERF_SAMPLE_PERIOD; fds[i].hw.sample_type |= PERF_SAMPLE_BRANCH_STACK; fds[i].hw.branch_sample_type = options.branch_filt; } /* * we are grouping the events, so there may be a limit */ fds[i].fd = perf_event_open(&fds[i].hw, pid, options.cpu, fds[0].fd, 0); if (fds[i].fd == -1) { if (fds[i].hw.precise_ip) err(1, "cannot attach event %s: precise mode may not be supported", fds[i].name); err(1, "cannot attach event %s", fds[i].name); } } /* * kernel adds the header page to the size of the mmapped region */ fds[0].buf = mmap(NULL, map_size, PROT_READ|PROT_WRITE, MAP_SHARED, fds[0].fd, 0); if (fds[0].buf == MAP_FAILED) err(1, "cannot mmap buffer"); /* does not include header page */ fds[0].pgmsk = (options.mmap_pages*pgsz)-1; /* * send samples for all events to first event's buffer */ for (i = 1; i < num_fds; i++) { if (!fds[i].hw.sample_period) continue; ret = ioctl(fds[i].fd, PERF_EVENT_IOC_SET_OUTPUT, fds[0].fd); if (ret) err(1, "cannot redirect sampling output"); } if (num_fds > 1 && fds[0].fd > -1) { for(i = 0; i < num_fds; i++) { /* * read the event identifier using ioctl * new method replaced the trick with PERF_FORMAT_GROUP + PERF_FORMAT_ID + read() */ ret = ioctl(fds[i].fd, PERF_EVENT_IOC_ID, &fds[i].id); if (ret == -1) err(1, "cannot read ID"); fprintf(options.output_file,"ID %"PRIu64" %s\n", fds[i].id, fds[i].name); } } pollfds[0].fd = fds[0].fd; pollfds[0].events = POLLIN; for(i=0; i < num_fds; i++) { ret = ioctl(fds[i].fd, PERF_EVENT_IOC_ENABLE, 0); if (ret) err(1, "cannot enable event %s\n", fds[i].name); }
signal(SIGCHLD, cld_handler); close(go[1]); if (setjmp(jbuf) == 1) goto terminate_session; sigemptyset(&bmask); sigaddset(&bmask, SIGCHLD); /* * core loop */ for(;;) { ret = poll(pollfds, 1, -1); if (ret < 0 && errno == EINTR) break; ovfl_count++; ret = sigprocmask(SIG_SETMASK, &bmask, NULL); if (ret) err(1, "setmask"); process_smpl_buf(&fds[0]); ret = sigprocmask(SIG_UNBLOCK, &bmask, NULL); if (ret) err(1, "unblock"); } terminate_session: /* * cleanup child */ wait4(pid, &status, 0, NULL); for(i=0; i < num_fds; i++) close(fds[i].fd); /* check for partial event buffer */ process_smpl_buf(&fds[0]); munmap(fds[0].buf, map_size); perf_free_fds(fds, num_fds); fprintf(options.output_file, "%"PRIu64" samples collected in %"PRIu64" poll events, %"PRIu64" lost samples\n", collected_samples, ovfl_count, lost_samples); /* free libpfm resources cleanly */ pfm_terminate(); fclose(options.output_file); return 0; } typedef struct { const char *filt; const int flag; } branch_filt_t; #define FILT(a, b) { .filt = a, .flag = b } static const branch_filt_t br_filters[] = { /* priv level filters */ FILT("u", PERF_SAMPLE_BRANCH_USER), FILT("k", PERF_SAMPLE_BRANCH_KERNEL), FILT("hv", PERF_SAMPLE_BRANCH_HV), FILT("any", PERF_SAMPLE_BRANCH_ANY), FILT("call", PERF_SAMPLE_BRANCH_ANY_CALL), FILT("return", PERF_SAMPLE_BRANCH_ANY_RETURN), FILT("indirect", PERF_SAMPLE_BRANCH_IND_CALL), FILT("conditional", PERF_SAMPLE_BRANCH_COND), FILT("indirect_jump", PERF_SAMPLE_BRANCH_IND_JUMP), FILT(NULL, 0), }; static void parse_branch_arg(const char *arg) { const branch_filt_t *br; char *q, *p, *str; if (!arg) { options.branch_filt = PERF_SAMPLE_BRANCH_ANY; return; } str = q = strdup(arg); if (!str) err(1, "cannot allocate memory to dup string"); while (*q) { p = strchr(q, ','); if (p) *p = '\0'; for (br = br_filters; br->filt; br++) { if (!strcasecmp(q, br->filt)) options.branch_filt |= br->flag; } if (!br->filt) errx(1, "unknown branch filter %s", q); if (!p) break; q = p + 1; } free(str); #define
BR_PLM (PERF_SAMPLE_BRANCH_USER|PERF_SAMPLE_BRANCH_KERNEL|PERF_SAMPLE_BRANCH_HV) if (!(options.branch_filt & ~BR_PLM)) errx(1, "no branch mode specified, privilege level does not define a branch type, use the any filter"); } static void usage(void) { printf("usage: branch_smpl [-h] [--help] [-i] [-c cpu] [-m mmap_pages] [-b] [-j br-filt] [-o output_file] [-e event1] cmd\n" "\t-j br-filt\t : comma separated list of branch filters among: u, k, hv, any, call, return, indirect, conditional, indirect_jump\n" "\t-b\t\t : sample any branch (equivalent to -j any), default mode\n"); } int main(int argc, char **argv) { int c; setlocale(LC_ALL, ""); options.cpu = -1; options.output_file = stdout; while ((c=getopt(argc, argv,"he:m:ic:o:j:b")) != -1) { switch(c) { case 0: continue; case 'e': if (options.events) errx(1, "events specified twice\n"); options.events = optarg; break; case 'i': options.opt_inherit = 1; break; case 'm': if (options.mmap_pages) errx(1, "mmap pages already set\n"); options.mmap_pages = atoi(optarg); break; case 'b': if (options.branch_filt) errx(1, "cannot use multiple branch filter options"); options.branch_filt = PERF_SAMPLE_BRANCH_ANY; break; case 'j': if (options.branch_filt) errx(1, "cannot set multiple branch options"); parse_branch_arg(optarg); break; case 'c': options.cpu = atoi(optarg); break; case 'o': options.output_file = fopen(optarg,"w"); if (!options.output_file) err(1, "cannot create file %s\n", optarg); break; case 'h': usage(); exit(0); default: errx(1, "unknown option"); } } if (argv[optind] == NULL) errx(1, "you must specify a command to execute\n"); if (!options.branch_filt) options.branch_filt = PERF_SAMPLE_BRANCH_ANY | PERF_SAMPLE_BRANCH_USER; /* * use low frequency rate to avoid flooding output * use generic branches event to make this test more portable */ if (!options.events) options.events = strdup(DFL_BR_EVENT); if (!options.mmap_pages) options.mmap_pages = 1; if (options.mmap_pages > 1 && ((options.mmap_pages) & 0x1)) errx(1,
"number of pages must be power of 2 greater than 1\n"); printf("branch_filt=0x%"PRIx64"\n", options.branch_filt); printf("event=%s\n", options.events); return mainloop(argv+optind); } papi-papi-7-2-0-t/src/libpfm4/perf_examples/evt2raw.c000066400000000000000000000056341502707512200223640ustar00rootroot00000000000000/* * evt2raw.c - example which converts an event string (event + modifiers) to * a raw event code usable by the perf tool. * * Copyright (c) 2010 IBM Corp. * Contributed by Corey Ashford * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include <sys/types.h> #include <inttypes.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <err.h> #include <perfmon/pfmlib_perf_event.h> static void usage(void) { printf("usage: evt2raw [-v] <event>\n" "<event> is the symbolic event, including modifiers, to " "translate to a raw code.\n"); } #define MAX_MODIFIER_CHARS 5 /* u,k,h plus the colon and null terminator */ int main(int argc, char **argv) { int ret, c, verbose = 0; struct perf_event_attr pea; char *event_str, *fstr = NULL; char modifiers[MAX_MODIFIER_CHARS]; if (argc < 2) { usage(); return 1; } while ( (c=getopt(argc, argv, "hv")) != -1) { switch(c) { case 'h': usage(); exit(0); case 'v': verbose = 1; break; default: exit(1); } } event_str = argv[optind]; ret = pfm_initialize(); if (ret != PFM_SUCCESS) errx(1, "Internal error: pfm_initialize returned %s", pfm_strerror(ret)); pea.size = sizeof(struct perf_event_attr); ret = pfm_get_perf_event_encoding(event_str, PFM_PLM0|PFM_PLM3|PFM_PLMH, &pea, &fstr, NULL); if (ret != PFM_SUCCESS) errx(1, "Error: pfm_get_perf_event_encoding returned %s", pfm_strerror(ret)); if (pea.type != PERF_TYPE_RAW) errx(1, "Error: %s is not a raw hardware event", event_str); modifiers[0] = '\0'; if (pea.exclude_user | pea.exclude_kernel | pea.exclude_hv) { strcat(modifiers, ":"); if (!pea.exclude_user) strcat(modifiers, "u"); if (!pea.exclude_kernel) strcat(modifiers, "k"); if (!pea.exclude_hv) strcat(modifiers, "h"); } if (verbose) printf("r%"PRIx64"%s\t%s\n", pea.config, modifiers, fstr); else printf("r%"PRIx64"%s\n", pea.config, modifiers); if (fstr) free(fstr); return 0; } papi-papi-7-2-0-t/src/libpfm4/perf_examples/notify_group.c000066400000000000000000000126421502707512200235130ustar00rootroot00000000000000/* * notify_group.c - self-sampling multiple events in one group * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include "perf_util.h" #define SMPL_PERIOD 2400000000ULL typedef struct { uint64_t ip; } sample_t; static volatile unsigned long notification_received; static perf_event_desc_t *fds; static int num_fds; static int buffer_pages = 1; /* size of buffer payload (must be power of 2) */ static void sigio_handler(int n, siginfo_t *info, struct sigcontext *sc) { struct perf_event_header ehdr; uint64_t ip; int id, ret; id = perf_fd2event(fds, num_fds, info->si_fd); if (id == -1) errx(1, "cannot find event for descriptor %d", info->si_fd); ret = perf_read_buffer(fds+id, &ehdr, sizeof(ehdr)); if (ret) errx(1, "cannot read event header"); if (ehdr.type != PERF_RECORD_SAMPLE) { warnx("unknown event type %d, skipping", ehdr.type); perf_skip_buffer(fds+id, ehdr.size - sizeof(ehdr)); goto skip; } ret = perf_read_buffer(fds+id, &ip, sizeof(ip)); if (ret) errx(1, "cannot read IP"); notification_received++; printf("Notification %lu: 0x%"PRIx64" fd=%d %s\n", notification_received, ip, info->si_fd, fds[id].name); skip: /* * rearm the counter for one more shot */ ret = 
ioctl(info->si_fd, PERF_EVENT_IOC_REFRESH, 1); if (ret == -1) err(1, "cannot refresh"); } /* * infinite loop waiting for notification to get out */ void busyloop(void) { /* * busy loop to burn CPU cycles */ for(;notification_received < 1024;) ; } int main(int argc, char **argv) { struct sigaction act; sigset_t new, old; size_t pgsz; int ret, i; ret = pfm_initialize(); if (ret != PFM_SUCCESS) errx(1, "Cannot initialize library: %s", pfm_strerror(ret)); pgsz = sysconf(_SC_PAGESIZE); /* * Install the signal handler (SIGIO) */ memset(&act, 0, sizeof(act)); act.sa_sigaction = (void (*)(int, siginfo_t *, void *)) sigio_handler; act.sa_flags = SA_SIGINFO; sigaction (SIGIO, &act, 0); sigemptyset(&old); sigemptyset(&new); sigaddset(&new, SIGIO); ret = sigprocmask(SIG_SETMASK, NULL, &old); if (ret) err(1, "sigprocmask failed"); if (sigismember(&old, SIGIO)) { warnx("program started with SIGIO masked, unmasking it now\n"); ret = sigprocmask(SIG_UNBLOCK, &new, NULL); if (ret) err(1, "sigprocmask failed"); } /* * allocates fd for us */ ret = perf_setup_list_events("cycles:u," "instructions:u," "cycles:u", &fds, &num_fds); if (ret || !num_fds) exit(1); fds[0].fd = -1; for(i=0; i < num_fds; i++) { /* want a notification for each sample added to the buffer */ fds[i].hw.disabled = !!i; printf("i=%d disabled=%d\n", i, fds[i].hw.disabled); fds[i].hw.wakeup_events = 1; fds[i].hw.sample_type = PERF_SAMPLE_IP; fds[i].hw.sample_period = SMPL_PERIOD; fds[i].fd = perf_event_open(&fds[i].hw, 0, -1, fds[0].fd, 0); if (fds[i].fd == -1) { warn("cannot attach event %s", fds[i].name); goto error; } fds[i].buf = mmap(NULL, (buffer_pages + 1)*pgsz, PROT_READ|PROT_WRITE, MAP_SHARED, fds[i].fd, 0); if (fds[i].buf == MAP_FAILED) err(1, "cannot mmap buffer"); /* * setup asynchronous notification on the file descriptor */ ret = fcntl(fds[i].fd, F_SETFL, fcntl(fds[i].fd, F_GETFL, 0) | O_ASYNC); if (ret == -1) err(1, "cannot set ASYNC"); /* * necessary if we want to get the file descriptor for * which 
the SIGIO is sent for in siginfo->si_fd. * SA_SIGINFO in itself is not enough */ ret = fcntl(fds[i].fd, F_SETSIG, SIGIO); if (ret == -1) err(1, "cannot setsig"); /* * get ownership of the descriptor */ ret = fcntl(fds[i].fd, F_SETOWN, getpid()); if (ret == -1) err(1, "cannot setown"); fds[i].pgmsk = (buffer_pages * pgsz) - 1; } for(i=0; i < num_fds; i++) { ret = ioctl(fds[i].fd, PERF_EVENT_IOC_REFRESH , 1); if (ret == -1) err(1, "cannot refresh"); } busyloop(); prctl(PR_TASK_PERF_EVENTS_DISABLE); error: /* * destroy our session */ for(i=0; i < num_fds; i++) if (fds[i].fd > -1) close(fds[i].fd); perf_free_fds(fds, num_fds); /* free libpfm resources cleanly */ pfm_terminate(); return 0; } papi-papi-7-2-0-t/src/libpfm4/perf_examples/notify_self.c000066400000000000000000000163101502707512200233040ustar00rootroot00000000000000/* * notify_self.c - example of how you can use overflow notifications * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include "perf_util.h" #define SMPL_PERIOD 2400000000ULL static volatile unsigned long notification_received; static perf_event_desc_t *fds = NULL; static int num_fds = 0; static int buffer_pages = 1; /* size of buffer payload (must be power of 2)*/ static void sigio_handler(int n, siginfo_t *info, void *uc) { struct perf_event_header ehdr; int ret, id; /* * positive si_code indicate kernel generated signal * which is normal for SIGIO */ if (info->si_code < 0) errx(1, "signal not generated by kernel"); /* * SIGPOLL = SIGIO * expect POLL_HUP instead of POLL_IN because we are * in one-shot mode (IOC_REFRESH) */ if (info->si_code != POLL_HUP) errx(1, "signal not generated by SIGIO"); id = perf_fd2event(fds, num_fds, info->si_fd); if (id == -1) errx(1, "no event associated with fd=%d", info->si_fd); ret = perf_read_buffer(fds+id, &ehdr, sizeof(ehdr)); if (ret) errx(1, "cannot read event header"); if (ehdr.type != PERF_RECORD_SAMPLE) { warnx("unexpected sample type=%d, skipping\n", ehdr.type); perf_skip_buffer(fds+id, ehdr.size); goto skip; } printf("Notification:%lu ", notification_received); ret = perf_display_sample(fds, num_fds, 0, &ehdr, stdout); /* * increment our notification counter */ notification_received++; skip: /* * rearm the counter for one more shot */ ret = ioctl(info->si_fd, PERF_EVENT_IOC_REFRESH, 1); if (ret == -1) err(1, "cannot refresh"); } /* * infinite loop waiting for notification to get out */ void busyloop(void) { /* * busy loop to burn CPU cycles */ for(;notification_received < 20;) ; } int main(int argc, char **argv) { struct sigaction act; sigset_t new, old; 
uint64_t *val; size_t sz, pgsz; int ret, i; setlocale(LC_ALL, ""); ret = pfm_initialize(); if (ret != PFM_SUCCESS) errx(1, "Cannot initialize library: %s", pfm_strerror(ret)); pgsz = sysconf(_SC_PAGESIZE); /* * Install the signal handler (SIGIO) * need SA_SIGINFO because we need the fd * in the signal handler */ memset(&act, 0, sizeof(act)); act.sa_sigaction = sigio_handler; act.sa_flags = SA_SIGINFO; sigaction (SIGIO, &act, 0); sigemptyset(&old); sigemptyset(&new); sigaddset(&new, SIGIO); ret = sigprocmask(SIG_SETMASK, NULL, &old); if (ret) err(1, "sigprocmask failed"); if (sigismember(&old, SIGIO)) { warnx("program started with SIGIO masked, unmasking it now\n"); ret = sigprocmask(SIG_UNBLOCK, &new, NULL); if (ret) err(1, "sigprocmask failed"); } /* * allocates fd for us */ ret = perf_setup_list_events("cycles:u," "instructions:u", &fds, &num_fds); if (ret || (num_fds == 0)) exit(1); fds[0].fd = -1; for(i=0; i < num_fds; i++) { /* want a notification for each sample added to the buffer */ fds[i].hw.disabled = !i; if (!i) { fds[i].hw.wakeup_events = 1; fds[i].hw.sample_type = PERF_SAMPLE_IP|PERF_SAMPLE_READ|PERF_SAMPLE_PERIOD; fds[i].hw.sample_period = SMPL_PERIOD; /* read() returns event identification for signal handler */ fds[i].hw.read_format = PERF_FORMAT_GROUP|PERF_FORMAT_ID|PERF_FORMAT_SCALE; } fds[i].fd = perf_event_open(&fds[i].hw, 0, -1, fds[0].fd, 0); if (fds[i].fd == -1) err(1, "cannot attach event %s", fds[i].name); } sz = (3+2*num_fds)*sizeof(uint64_t); val = malloc(sz); if (!val) err(1, "cannot allocate memory"); /* * On overflow, the non lead events are stored in the sample. * However we need some key to figure the order in which they * were laid out in the buffer. The file descriptor does not * work for this. Instead, we extract a unique ID for each event. * That id will be part of the sample for each event value. * Therefore we will be able to match value to events * * PERF_FORMAT_ID: returns unique 64-bit identifier in addition * to event value.
*/ if (fds[0].fd == -1) errx(1, "cannot create event 0"); ret = read(fds[0].fd, val, sz); if (ret == -1) err(1, "cannot read id %zu", sizeof(val)); /* * we are using PERF_FORMAT_GROUP, therefore the structure * of val is as follows: * * { u64 nr; * { u64 time_enabled; } && PERF_FORMAT_ENABLED * { u64 time_running; } && PERF_FORMAT_RUNNING * { u64 value; * { u64 id; } && PERF_FORMAT_ID * } cntr[nr]; * We are skipping the first 3 values (nr, time_enabled, time_running) * and then for each event we get a pair of values. */ for(i=0; i < num_fds; i++) { fds[i].id = val[2*i+1+3]; printf("%"PRIu64" %s\n", fds[i].id, fds[i].name); } fds[0].buf = mmap(NULL, (buffer_pages+1)*pgsz, PROT_READ|PROT_WRITE, MAP_SHARED, fds[0].fd, 0); if (fds[0].buf == MAP_FAILED) err(1, "cannot mmap buffer"); fds[0].pgmsk = (buffer_pages * pgsz) - 1; /* * setup asynchronous notification on the file descriptor */ ret = fcntl(fds[0].fd, F_SETFL, fcntl(fds[0].fd, F_GETFL, 0) | O_ASYNC); if (ret == -1) err(1, "cannot set ASYNC"); /* * necessary if we want to get the file descriptor for * which the SIGIO is sent in siginfo->si_fd. 
* SA_SIGINFO in itself is not enough */ ret = fcntl(fds[0].fd, F_SETSIG, SIGIO); if (ret == -1) err(1, "cannot setsig"); /* * get ownership of the descriptor */ ret = fcntl(fds[0].fd, F_SETOWN, getpid()); if (ret == -1) err(1, "cannot setown"); /* * enable the group for one period */ ret = ioctl(fds[0].fd, PERF_EVENT_IOC_REFRESH , 1); if (ret == -1) err(1, "cannot refresh"); busyloop(); ret = ioctl(fds[0].fd, PERF_EVENT_IOC_DISABLE, 1); if (ret == -1) err(1, "cannot disable"); /* * destroy our session */ for(i=0; i < num_fds; i++) close(fds[i].fd); perf_free_fds(fds, num_fds); free(val); /* free libpfm resources cleanly */ pfm_terminate(); return 0; } papi-papi-7-2-0-t/src/libpfm4/perf_examples/perf_util.c000066400000000000000000000415271502707512200227640ustar00rootroot00000000000000/* * perf_util.c - helper functions for perf_events * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include "perf_util.h" /* the **fd parameter must point to a null pointer on the first call * max_fds and num_fds must both point to a zero value on the first call * The return value is success (0) vs. failure (non-zero) */ int perf_setup_argv_events(const char **argv, perf_event_desc_t **fds, int *num_fds) { perf_event_desc_t *fd; pfm_perf_encode_arg_t arg; int new_max, ret, num, max_fds; int group_leader; if (!(argv && fds && num_fds)) return -1; fd = *fds; if (fd) { max_fds = fd[0].max_fds; if (max_fds < 2) return -1; num = *num_fds; } else { max_fds = num = 0; /* bootstrap */ } group_leader = num; while(*argv) { if (num == max_fds) { if (max_fds == 0) new_max = 2; else new_max = max_fds << 1; if (new_max < max_fds) { warn("too many entries"); goto error; } fd = realloc(fd, new_max * sizeof(*fd)); if (!fd) { warn("cannot allocate memory"); goto error; } /* reset newly allocated chunk */ memset(fd + max_fds, 0, (new_max - max_fds) * sizeof(*fd)); max_fds = new_max; /* update max size */ fd[0].max_fds = max_fds; } /* ABI compatibility, set before calling libpfm */ fd[num].hw.size = sizeof(fd[num].hw); memset(&arg, 0, sizeof(arg)); arg.attr = &fd[num].hw; arg.fstr = &fd[num].fstr; /* fd[].fstr is NULL */ ret = pfm_get_os_event_encoding(*argv, PFM_PLM0|PFM_PLM3, PFM_OS_PERF_EVENT_EXT, &arg); if (ret != PFM_SUCCESS) { warnx("event %s: %s", *argv, pfm_strerror(ret)); goto error; } fd[num].name = strdup(*argv); fd[num].group_leader = group_leader; fd[num].idx = arg.idx; fd[num].cpu = arg.cpu; num++; argv++; } *num_fds = num; *fds = fd; return 0; error: perf_free_fds(fd, num); return -1; } int perf_setup_list_events(const char *ev, perf_event_desc_t **fd, int *num_fds) { const char **argv; char *p, *q, *events; int i, ret, num = 0; if (!(ev && fd && num_fds)) return -1; events = strdup(ev); if (!events) return -1; q = events; while((p = strchr(q, ','))) { num++; q = p + 1; } num++; num++; /* 
terminator */ argv = malloc(num * sizeof(char *)); if (!argv) { free(events); return -1; } i = 0; q = events; while((p = strchr(q, ','))) { *p = '\0'; argv[i++] = q; q = p + 1; } argv[i++] = q; argv[i] = NULL; ret = perf_setup_argv_events(argv, fd, num_fds); free(argv); free(events); /* strdup in perf_setup_argv_events() */ return ret; } void perf_free_fds(perf_event_desc_t *fds, int num_fds) { int i; for (i = 0 ; i < num_fds; i++) { free(fds[i].name); free(fds[i].fstr); } free(fds); } int perf_get_group_nevents(perf_event_desc_t *fds, int num, int idx) { int leader; int i; if (idx < 0 || idx >= num) return 0; leader = fds[idx].group_leader; for (i = leader + 1; i < num; i++) { if (fds[i].group_leader != leader) { /* This is a new group leader, so the previous * event was the final event of the preceding * group. */ return i - leader; } } return i - leader; } int perf_read_buffer(perf_event_desc_t *hw, void *buf, size_t sz) { struct perf_event_mmap_page *hdr = hw->buf; size_t pgmsk = hw->pgmsk; void *data; unsigned long tail; size_t avail_sz, m, c; /* * data points to beginning of buffer payload */ data = (void*)(((uintptr_t)hdr)+sysconf(_SC_PAGESIZE)); /* * position of tail within the buffer payload */ tail = hdr->data_tail & pgmsk; /* * size of what is available * * data_head, data_tail never wrap around */ avail_sz = hdr->data_head - hdr->data_tail; if (sz > avail_sz) return -1; /* * sz <= avail_sz, we can satisfy the request */ /* * c = size till end of buffer * * buffer payload size is necessarily * a power of two, so we can do: */ c = pgmsk + 1 - tail; /* * min with requested size */ m = c < sz ? 
c : sz; /* copy beginning */ memcpy(buf, (void*)(((uintptr_t)data)+tail), m); /* * copy wrapped around leftover */ if (sz > m) memcpy((void*)(((uintptr_t)buf)+m), data, sz - m); //printf("\nhead=%lx tail=%lx new_tail=%lx sz=%zu\n", hdr->data_head, hdr->data_tail, hdr->data_tail+sz, sz); hdr->data_tail += sz; return 0; } void perf_skip_buffer(perf_event_desc_t *hw, size_t sz) { struct perf_event_mmap_page *hdr = hw->buf; if ((hdr->data_tail + sz) > hdr->data_head) sz = hdr->data_head - hdr->data_tail; hdr->data_tail += sz; } static size_t __perf_handle_raw(perf_event_desc_t *hw) { size_t sz = 0; uint32_t raw_sz, i; char *buf; int ret; ret = perf_read_buffer_32(hw, &raw_sz); if (ret) { warnx("cannot read raw size"); return (size_t)-1; } sz += sizeof(raw_sz); printf("\n\tRAWSZ:%u\n", raw_sz); buf = malloc(raw_sz); if (!buf) { warn("cannot allocate raw buffer"); return (size_t)-1; } ret = perf_read_buffer(hw, buf, raw_sz); if (ret) { warnx("cannot read raw data"); free(buf); return (size_t)-1; } if (raw_sz) putchar('\t'); for(i=0; i < raw_sz; i++) { printf("0x%02x ", buf[i] & 0xff ); if (((i+1) % 16) == 0) printf("\n\t"); } if (raw_sz) putchar('\n'); free(buf); return sz + raw_sz; } static int perf_display_branch_stack(perf_event_desc_t *desc, FILE *fp) { struct perf_branch_entry b; uint64_t nr, n; int ret; ret = perf_read_buffer(desc, &n, sizeof(n)); if (ret) errx(1, "cannot read branch stack nr"); fprintf(fp, "\n\tBRANCH_STACK:%"PRIu64"\n", n); nr = n; /* * from most recent to least recent take branch */ while (nr--) { ret = perf_read_buffer(desc, &b, sizeof(b)); if (ret) errx(1, "cannot read branch stack entry"); fprintf(fp, "\tFROM:0x%016"PRIx64" TO:0x%016"PRIx64" MISPRED:%c PRED:%c IN_TX:%c ABORT:%c CYCLES:%d type:%d\n", b.from, b.to, !(b.mispred || b.predicted) ? '-': (b.mispred ? 'Y' :'N'), !(b.mispred || b.predicted) ? '-': (b.predicted? 'Y' :'N'), (b.in_tx? 'Y' :'N'), (b.abort? 
'Y' :'N'), b.type, b.cycles); } return (int)(n * sizeof(b) + sizeof(n)); } static int perf_display_regs_user(perf_event_desc_t *hw, FILE *fp) { errx(1, "display regs_user not implemented yet\n"); return 0; } static int perf_display_regs_intr(perf_event_desc_t *hw, FILE *fp) { errx(1, "display regs_intr not implemented yet\n"); return 0; } static int perf_display_stack_user(perf_event_desc_t *hw, FILE *fp) { uint64_t nr; char buf[512]; size_t sz; int ret; ret = perf_read_buffer(hw, &nr, sizeof(nr)); if (ret) errx(1, "cannot read user stack size"); fprintf(fp, "USER_STACK: SZ:%"PRIu64"\n", nr); /* consume content */ while (nr) { sz = nr; if (sz > sizeof(buf)) sz = sizeof(buf); ret = perf_read_buffer(hw, buf, sz); if (ret) errx(1, "cannot read user stack content"); nr -= sz; } return 0; } int perf_display_sample(perf_event_desc_t *fds, int num_fds, int idx, struct perf_event_header *ehdr, FILE *fp) { perf_event_desc_t *hw; struct { uint32_t pid, tid; } pid; struct { uint64_t value, id; } grp; uint64_t time_enabled, time_running; size_t sz; uint64_t type, fmt; uint64_t val64; const char *str; int ret, e; if (!fds || !fp || !ehdr || num_fds < 0 || idx < 0 || idx >= num_fds) return -1; sz = ehdr->size - sizeof(*ehdr); hw = fds+idx; type = hw->hw.sample_type; fmt = hw->hw.read_format; if (type & PERF_SAMPLE_IDENTIFIER) { ret = perf_read_buffer_64(hw, &val64); if (ret) { warnx("cannot read identifier"); return -1; } fprintf(fp, "ID:%"PRIu64" ", val64); sz -= sizeof(val64); } /* * the sample_type information is laid down * based on the PERF_RECORD_SAMPLE format specified * in the perf_event.h header file.
* That order is different from the enum perf_event_sample_format */ if (type & PERF_SAMPLE_IP) { const char *xtra = " "; ret = perf_read_buffer_64(hw, &val64); if (ret) { warnx("cannot read IP"); return -1; } /* * MISC_EXACT_IP indicates that the kernel is returning * the IIP of an instruction which caused the event, i.e., * no skid */ if (hw->hw.precise_ip && (ehdr->misc & PERF_RECORD_MISC_EXACT_IP)) xtra = " (exact) "; fprintf(fp, "IIP:%#016"PRIx64"%s", val64, xtra); sz -= sizeof(val64); } if (type & PERF_SAMPLE_TID) { ret = perf_read_buffer(hw, &pid, sizeof(pid)); if (ret) { warnx( "cannot read PID"); return -1; } fprintf(fp, "PID:%d TID:%d ", pid.pid, pid.tid); sz -= sizeof(pid); } if (type & PERF_SAMPLE_TIME) { ret = perf_read_buffer_64(hw, &val64); if (ret) { warnx( "cannot read time"); return -1; } fprintf(fp, "TIME:%'"PRIu64" ", val64); sz -= sizeof(val64); } if (type & PERF_SAMPLE_ADDR) { ret = perf_read_buffer_64(hw, &val64); if (ret) { warnx( "cannot read addr"); return -1; } fprintf(fp, "ADDR:%#016"PRIx64" ", val64); sz -= sizeof(val64); } if (type & PERF_SAMPLE_ID) { ret = perf_read_buffer_64(hw, &val64); if (ret) { warnx( "cannot read id"); return -1; } fprintf(fp, "ID:%"PRIu64" ", val64); sz -= sizeof(val64); } if (type & PERF_SAMPLE_STREAM_ID) { ret = perf_read_buffer_64(hw, &val64); if (ret) { warnx( "cannot read stream_id"); return -1; } fprintf(fp, "STREAM_ID:%"PRIu64" ", val64); sz -= sizeof(val64); } if (type & PERF_SAMPLE_CPU) { struct { uint32_t cpu, reserved; } cpu; ret = perf_read_buffer(hw, &cpu, sizeof(cpu)); if (ret) { warnx( "cannot read cpu"); return -1; } fprintf(fp, "CPU:%u ", cpu.cpu); sz -= sizeof(cpu); } if (type & PERF_SAMPLE_PERIOD) { ret = perf_read_buffer_64(hw, &val64); if (ret) { warnx( "cannot read period"); return -1; } fprintf(fp, "PERIOD:%'"PRIu64" ", val64); sz -= sizeof(val64); } /* struct read_format { * { u64 value; * { u64 time_enabled; } && PERF_FORMAT_ENABLED * { u64 time_running; } && PERF_FORMAT_RUNNING * { u64 id; }
&& PERF_FORMAT_ID * } && !PERF_FORMAT_GROUP * * { u64 nr; * { u64 time_enabled; } && PERF_FORMAT_ENABLED * { u64 time_running; } && PERF_FORMAT_RUNNING * { u64 value; * { u64 id; } && PERF_FORMAT_ID * } cntr[nr]; * } && PERF_FORMAT_GROUP * }; */ if (type & PERF_SAMPLE_READ) { uint64_t values[3]; uint64_t nr; if (fmt & PERF_FORMAT_GROUP) { ret = perf_read_buffer_64(hw, &nr); if (ret) { warnx( "cannot read nr"); return -1; } sz -= sizeof(nr); time_enabled = time_running = 1; if (fmt & PERF_FORMAT_TOTAL_TIME_ENABLED) { ret = perf_read_buffer_64(hw, &time_enabled); if (ret) { warnx( "cannot read timing info"); return -1; } sz -= sizeof(time_enabled); } if (fmt & PERF_FORMAT_TOTAL_TIME_RUNNING) { ret = perf_read_buffer_64(hw, &time_running); if (ret) { warnx( "cannot read timing info"); return -1; } sz -= sizeof(time_running); } fprintf(fp, "ENA=%'"PRIu64" RUN=%'"PRIu64" NR=%"PRIu64"\n", time_enabled, time_running, nr); values[1] = time_enabled; values[2] = time_running; while(nr--) { grp.id = ~0ULL; ret = perf_read_buffer_64(hw, &grp.value); if (ret) { warnx( "cannot read group value"); return -1; } sz -= sizeof(grp.value); if (fmt & PERF_FORMAT_ID) { ret = perf_read_buffer_64(hw, &grp.id); if (ret) { warnx( "cannot read leader id"); return -1; } sz -= sizeof(grp.id); } e = perf_id2event(fds, num_fds, grp.id); if (e == -1) str = "unknown sample event"; else str = fds[e].name; values[0] = grp.value; grp.value = perf_scale(values); fprintf(fp, "\t%'"PRIu64" %s (%"PRIu64"%s)\n", grp.value, str, grp.id, time_running != time_enabled ? 
", scaled":""); } } else { time_enabled = time_running = 0; /* * this program does not use FORMAT_GROUP when there is only one event */ ret = perf_read_buffer_64(hw, &val64); if (ret) { warnx( "cannot read value"); return -1; } sz -= sizeof(val64); if (fmt & PERF_FORMAT_TOTAL_TIME_ENABLED) { ret = perf_read_buffer_64(hw, &time_enabled); if (ret) { warnx( "cannot read timing info"); return -1; } sz -= sizeof(time_enabled); } if (fmt & PERF_FORMAT_TOTAL_TIME_RUNNING) { ret = perf_read_buffer_64(hw, &time_running); if (ret) { warnx( "cannot read timing info"); return -1; } sz -= sizeof(time_running); } if (fmt & PERF_FORMAT_ID) { ret = perf_read_buffer_64(hw, &val64); if (ret) { warnx( "cannot read leader id"); return -1; } sz -= sizeof(val64); } fprintf(fp, "ENA=%'"PRIu64" RUN=%'"PRIu64"\n", time_enabled, time_running); values[0] = val64; values[1] = time_enabled; values[2] = time_running; val64 = perf_scale(values); fprintf(fp, "\t%'"PRIu64" %s %s\n", val64, fds[0].name, time_running != time_enabled ? 
", scaled":""); } } if (type & PERF_SAMPLE_CALLCHAIN) { uint64_t nr, ip; ret = perf_read_buffer_64(hw, &nr); if (ret) { warnx( "cannot read callchain nr"); return -1; } sz -= sizeof(nr); while(nr--) { ret = perf_read_buffer_64(hw, &ip); if (ret) { warnx( "cannot read ip"); return -1; } sz -= sizeof(ip); fprintf(fp, "\t0x%"PRIx64"\n", ip); } } if (type & PERF_SAMPLE_RAW) { ret = __perf_handle_raw(hw); if (ret == -1) return -1; sz -= ret; } if (type & PERF_SAMPLE_BRANCH_STACK) { ret = perf_display_branch_stack(hw, fp); sz -= ret; } if (type & PERF_SAMPLE_REGS_USER) { ret = perf_display_regs_user(hw, fp); sz -= ret; } if (type & PERF_SAMPLE_STACK_USER) { ret = perf_display_stack_user(hw, fp); sz -= ret; } if (type & PERF_SAMPLE_WEIGHT) { ret = perf_read_buffer_64(hw, &val64); if (ret) { warnx( "cannot read weight"); return -1; } fprintf(fp, "WEIGHT:%'"PRIu64" ", val64); sz -= sizeof(val64); } if (type & PERF_SAMPLE_DATA_SRC) { ret = perf_read_buffer_64(hw, &val64); if (ret) { warnx( "cannot read data src"); return -1; } fprintf(fp, "DATA_SRC:%'"PRIu64" ", val64); sz -= sizeof(val64); } if (type & PERF_SAMPLE_TRANSACTION) { ret = perf_read_buffer_64(hw, &val64); if (ret) { warnx( "cannot read txn"); return -1; } fprintf(fp, "TXN:%'"PRIu64" ", val64); sz -= sizeof(val64); } if (type & PERF_SAMPLE_REGS_INTR) { ret = perf_display_regs_intr(hw, fp); sz -= ret; } /* * if we have some data left, it is because there is more * than what we know about. In fact, it is more complicated * because we may have the right size but wrong layout. But * that's the best we can do. 
*/ if (sz) { warnx("did not correctly parse sample leftover=%zu", sz); perf_skip_buffer(hw, sz); } fputc('\n',fp); return 0; } uint64_t display_lost(perf_event_desc_t *hw, perf_event_desc_t *fds, int num_fds, FILE *fp) { struct { uint64_t id, lost; } lost; const char *str; int e, ret; ret = perf_read_buffer(hw, &lost, sizeof(lost)); if (ret) { warnx("cannot read lost info"); return 0; } e = perf_id2event(fds, num_fds, lost.id); if (e == -1) str = "unknown lost event"; else str = fds[e].name; fprintf(fp, "<<<LOST %"PRIu64" SAMPLES %s>>>\n", lost.lost, str); return lost.lost; } void display_exit(perf_event_desc_t *hw, FILE *fp) { struct { pid_t pid, ppid, tid, ptid; } grp; int ret; ret = perf_read_buffer(hw, &grp, sizeof(grp)); if (ret) { warnx("cannot read exit info"); return; } fprintf(fp,"[%d] exited\n", grp.pid); } void display_freq(int mode, perf_event_desc_t *hw, FILE *fp) { struct { uint64_t time, id, stream_id; } thr; int ret; ret = perf_read_buffer(hw, &thr, sizeof(thr)); if (ret) { warnx("cannot read throttling info"); return; } fprintf(fp, "%s value=%"PRIu64" event ID=%"PRIu64"\n", mode ? "Throttled" : "Unthrottled", thr.id, thr.stream_id); } papi-papi-7-2-0-t/src/libpfm4/perf_examples/perf_util.h000066400000000000000000000111521502707512200227600ustar00rootroot00000000000000/* * perf_util.h - helper functions for perf_events * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software.
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #ifndef __PERF_UTIL_H__ #define __PERF_UTIL_H__ #include #include #include #include typedef struct { struct perf_event_attr hw; uint64_t values[3]; uint64_t prev_values[3]; char *name; uint64_t id; /* event id kernel */ void *buf; size_t pgmsk; int group_leader; int fd; int max_fds; int idx; /* opaque libpfm event identifier */ int cpu; /* cpu to program */ char *fstr; /* fstr from library, must be freed */ } perf_event_desc_t; /* handy shortcut */ #define PERF_FORMAT_SCALE (PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING) extern int perf_setup_argv_events(const char **argv, perf_event_desc_t **fd, int *num_fds); extern int perf_setup_list_events(const char *events, perf_event_desc_t **fd, int *num_fds); extern int perf_read_buffer(perf_event_desc_t *hw, void *buf, size_t sz); extern void perf_free_fds(perf_event_desc_t *fds, int num_fds); extern void perf_skip_buffer(perf_event_desc_t *hw, size_t sz); static inline int perf_read_buffer_32(perf_event_desc_t *hw, void *buf) { return perf_read_buffer(hw, buf, sizeof(uint32_t)); } static inline int perf_read_buffer_64(perf_event_desc_t *hw, void *buf) { return perf_read_buffer(hw, buf, sizeof(uint64_t)); } /* * values[0] = raw count * values[1] = TIME_ENABLED * values[2] = TIME_RUNNING */ static inline uint64_t perf_scale(uint64_t *values) { uint64_t res = 0; if (!values[2] && !values[1] && values[0]) warnx("WARNING: time_running = 0 = time_enabled, raw count not zero\n"); if (values[2] > values[1]) warnx("WARNING: time_running > 
time_enabled\n"); if (values[2]) res = (uint64_t)((double)values[0] * values[1]/values[2]); return res; } static inline uint64_t perf_scale_delta(uint64_t *values, uint64_t *prev_values) { double pval[3], val[3]; uint64_t res = 0; if (!values[2] && !values[1] && values[0]) warnx("WARNING: time_running = 0 = time_enabled, raw count not zero\n"); if (values[2] > values[1]) warnx("WARNING: time_running > time_enabled\n"); if (values[2] - prev_values[2]) { /* covnert everything to double to avoid overflows! */ pval[0] = prev_values[0]; pval[1] = prev_values[1]; pval[2] = prev_values[2]; val[0] = values[0]; val[1] = values[1]; val[2] = values[2]; res = (uint64_t)(((val[0] - pval[0]) * (val[1] - pval[1])/ (val[2] - pval[2]))); } return res; } /* * TIME_RUNNING/TIME_ENABLED */ static inline double perf_scale_ratio(uint64_t *values) { if (!values[1]) return 0.0; return values[2]*1.0/values[1]; } static inline int perf_fd2event(perf_event_desc_t *fds, int num_events, int fd) { int i; for(i=0; i < num_events; i++) if (fds[i].fd == fd) return i; return -1; } /* * id = PERF_FORMAT_ID */ static inline int perf_id2event(perf_event_desc_t *fds, int num_events, uint64_t id) { int j; for(j=0; j < num_events; j++) if (fds[j].id == id) return j; return -1; } static inline int perf_is_group_leader(perf_event_desc_t *fds, int idx) { return fds[idx].group_leader == idx; } extern int perf_get_group_nevents(perf_event_desc_t *fds, int num, int leader); extern int perf_display_sample(perf_event_desc_t *fds, int num_fds, int idx, struct perf_event_header *ehdr, FILE *fp); extern uint64_t display_lost(perf_event_desc_t *hw, perf_event_desc_t *fds, int num_fds, FILE *fp); extern void display_exit(perf_event_desc_t *hw, FILE *fp); extern void display_freq(int mode, perf_event_desc_t *hw, FILE *fp); #endif papi-papi-7-2-0-t/src/libpfm4/perf_examples/rtop.c000066400000000000000000000300751502707512200217530ustar00rootroot00000000000000/* rtop.c - a simple PMU-based CPU utilization tool * * 
Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Based on: * Copyright (c) 2004-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "perf_util.h" #define RTOP_VERSION "0.2" /* * max number of cpus (threads) supported */ #define RTOP_MAX_CPUS 2048 /* MUST BE power of 2 */ #define RTOP_CPUMASK_BITS (sizeof(unsigned long)<<3) #define RTOP_CPUMASK_COUNT (RTOP_MAX_CPUS/RTOP_CPUMASK_BITS) #define RTOP_CPUMASK_SET(m, g) ((m)[(g)/RTOP_CPUMASK_BITS] |= (1UL << ((g) % RTOP_CPUMASK_BITS))) #define RTOP_CPUMASK_CLEAR(m, g) ((m)[(g)/RTOP_CPUMASK_BITS] &= ~(1UL << ((g) % RTOP_CPUMASK_BITS))) #define RTOP_CPUMASK_ISSET(m, g) ((m)[(g)/RTOP_CPUMASK_BITS] & (1UL << ((g) % RTOP_CPUMASK_BITS))) typedef unsigned long rtop_cpumask_t[RTOP_CPUMASK_COUNT]; typedef struct { struct { int opt_verbose; int opt_delay; /* refresh delay in second */ int opt_delay_set; } program_opt_flags; rtop_cpumask_t cpu_mask; /* which CPUs to use in system wide mode */ int online_cpus; int selected_cpus; unsigned long cpu_mhz; } program_options_t; #define opt_verbose program_opt_flags.opt_verbose #define opt_delay program_opt_flags.opt_delay #define opt_delay_set program_opt_flags.opt_delay_set static program_options_t options; static struct termios saved_tty; static int time_to_quit; static int term_rows, term_cols; static void get_term_size(void) { int ret; struct winsize ws; ret = ioctl(1, TIOCGWINSZ, &ws); if (ret) err(1, "cannot determine screen size"); if (ws.ws_row > 10) { term_cols = ws.ws_col; term_rows = ws.ws_row; } else { term_cols = 80; term_rows = 24; } if (term_rows < options.selected_cpus) errx(1, "you need at least %d rows on your terminal to display all CPUs", options.selected_cpus); } static void sigwinch_handler(int n) { get_term_size(); } static void setup_screen(void) { int ret; ret = tcgetattr(0, &saved_tty); if (ret == -1) errx(1, "cannot save tty settings\n"); get_term_size(); initscr(); nocbreak(); resizeterm(term_rows, term_cols); } static void 
close_screen(void) { endwin(); tcsetattr(0, TCSAFLUSH, &saved_tty); } static void fatal_errorw(char *fmt, ...) { va_list ap; close_screen(); va_start(ap, fmt); vfprintf(stderr, fmt, ap); va_end(ap); exit(1); } static void sigint_handler(int n) { time_to_quit = 1; } static unsigned long find_cpu_speed(void) { FILE *fp1; unsigned long f1 = 0, f2 = 0; char buffer[128], *p, *value; memset(buffer, 0, sizeof(buffer)); fp1 = fopen("/proc/cpuinfo", "r"); if (fp1 == NULL) return 0; for (;;) { buffer[0] = '\0'; p = fgets(buffer, 127, fp1); if (p == NULL) break; /* skip blank lines */ if (*p == '\n') continue; p = strchr(buffer, ':'); if (p == NULL) break; /* * p+2: +1 = space, +2= first character * strlen()-1 gets rid of \n */ *p = '\0'; value = p+2; value[strlen(value)-1] = '\0'; if (!strncasecmp("cpu MHz", buffer, 7)) { float fl; sscanf(value, "%f", &fl); f1 = lroundf(fl); break; } if (!strncasecmp("BogoMIPS", buffer, 8)) { float fl; sscanf(value, "%f", &fl); f2 = lroundf(fl); } } fclose(fp1); return f1 == 0 ? f2 : f1; } static void setup_signals(void) { struct sigaction act; sigset_t my_set; /* * SIGINT is an asynchronous signal * sent to the process (not a specific thread). POSIX states * that one and only one thread will execute the handler. This * could be any thread that does not have the signal blocked.
*/ /* * install SIGINT handler */ memset(&act,0,sizeof(act)); sigemptyset(&my_set); act.sa_handler = sigint_handler; sigaction (SIGINT, &act, 0); /* * install SIGWINCH handler */ memset(&act,0,sizeof(act)); sigemptyset(&my_set); act.sa_handler = sigwinch_handler; sigaction (SIGWINCH, &act, 0); } static struct option rtop_cmd_options[]={ { "help", 0, 0, 1 }, { "version", 0, 0, 2 }, { "delay", 0, 0, 3 }, { "cpu-list", 1, 0, 4 }, { "verbose", 0, &options.opt_verbose, 1 }, { 0, 0, 0, 0} }; #define MAX_EVENTS 2 typedef struct { uint64_t prev_values[MAX_EVENTS]; int fd[MAX_EVENTS]; int cpu; } cpudesc_t; /* * { u64 nr; * { u64 time_enabled; } && PERF_FORMAT_ENABLED * { u64 time_running; } && PERF_FORMAT_RUNNING * { u64 value; * { u64 id; } && PERF_FORMAT_ID * } cntr[nr]; */ typedef struct { uint64_t nr; uint64_t time_enabled; uint64_t time_running; uint64_t values[2]; } rtop_grp_t; static void mainloop(void) { struct perf_event_attr ev[MAX_EVENTS]; unsigned long itc_delta; cpudesc_t *cpus; int i, j = 0, k, ncpus = 0; int num, ret; ncpus = options.selected_cpus; cpus = calloc(ncpus, sizeof(cpudesc_t)); if (!cpus) err(1, "cannot allocate file descriptors"); memset(ev, 0, sizeof(ev)); /* measure user cycles */ ev[0].type = PERF_TYPE_HARDWARE; ev[0].config = PERF_COUNT_HW_CPU_CYCLES; ev[0].read_format = PERF_FORMAT_SCALE|PERF_FORMAT_GROUP; ev[0].exclude_kernel = 1; ev[0].disabled = 1; ev[0].pinned = 0; /* measure kernel cycles */ ev[1].type = PERF_TYPE_HARDWARE; ev[1].config = PERF_COUNT_HW_CPU_CYCLES; ev[1].exclude_user = 1; ev[1].disabled = 1; ev[1].pinned = 0; num = 2; for(i=0, k = 0; ncpus; i++) { if (RTOP_CPUMASK_ISSET(options.cpu_mask, i) == 0) continue; cpus[k].cpu = i; cpus[k].fd[0] = -1; for(j=0 ; j < num; j++) { cpus[k].fd[j] = perf_event_open(ev+j, -1, i, cpus[k].fd[0], 0); if (cpus[k].fd[j] == -1) fatal_errorw("cannot open event %d on CPU%d: %s\n", j, i, strerror(errno)); } ncpus--; k++; } ncpus = options.selected_cpus; itc_delta = options.opt_delay * 
options.cpu_mhz * 1000000; for(i=0; i < ncpus; i++) for(j=0; j < num; j++) ioctl(cpus[i].fd[j], PERF_EVENT_IOC_ENABLE, 0); for(;time_to_quit == 0;) { sleep(options.opt_delay); move(0, 0); for(i=0; i < ncpus; i++) { uint64_t values[MAX_EVENTS]; uint64_t raw_values[5]; double k_cycles, u_cycles, i_cycles, ratio; /* * given our events are in the same group, we can do a * group read and get both counts + scaling information */ ret = read(cpus[i].fd[0], raw_values, sizeof(raw_values)); if (ret != sizeof(raw_values)) fatal_errorw("cannot read count for event %d on CPU%d\n", j, cpus[i].cpu); if (options.opt_verbose) { printw("nr=%"PRIu64"\n", raw_values[0]); printw("ena=%"PRIu64"\n", raw_values[1]); printw("run=%"PRIu64"\n", raw_values[2]); } raw_values[0] = raw_values[3]; values[0] = perf_scale(raw_values); raw_values[0] = raw_values[4]; values[1] = perf_scale(raw_values); ratio = perf_scale_ratio(raw_values); k_cycles = (double)(values[1] - cpus[i].prev_values[1])*100.0/ (double)itc_delta; u_cycles = (double)(values[0] - cpus[i].prev_values[0])*100.0/ (double)itc_delta; i_cycles = 100.0 - (k_cycles + u_cycles); cpus[i].prev_values[0] = values[0]; cpus[i].prev_values[1] = values[1]; /* * adjust for rounding errors */ if (i_cycles < 0.0) i_cycles = 0.0; if (i_cycles > 100.0) i_cycles = 100.0; if (k_cycles > 100.0) k_cycles = 100.0; if (u_cycles > 100.0) u_cycles = 100.0; printw("CPU%-2ld %6.2f%% usr %6.2f%% sys %6.2f%% idle (scaling ratio %.2f%%)\n", i, u_cycles, k_cycles, i_cycles, ratio*100.0); } refresh(); } for(i=0; i < ncpus; i++) for(j=0; j < num; j++) close(cpus[i].fd[j]); free(cpus); } void populate_cpumask(char *cpu_list) { char *p; int start_cpu, end_cpu = 0; int i, count = 0; options.online_cpus = (int)sysconf(_SC_NPROCESSORS_ONLN); if (options.online_cpus == -1) errx(1, "cannot figure out the number of online processors"); if (cpu_list == NULL) { if (options.online_cpus >= RTOP_MAX_CPUS) errx(1, "rtop can only handle to %u CPUs", RTOP_MAX_CPUS); for(i=0; i < 
options.online_cpus; i++) RTOP_CPUMASK_SET(options.cpu_mask, i); options.selected_cpus = options.online_cpus; return; } while(isdigit(*cpu_list)) { p = NULL; start_cpu = strtoul(cpu_list, &p, 0); /* auto-detect base */ if (start_cpu == INT_MAX || (*p != '\0' && *p != ',' && *p != '-')) goto invalid; if (p && *p == '-') { cpu_list = ++p; p = NULL; end_cpu = strtoul(cpu_list, &p, 0); /* auto-detect base */ if (end_cpu == INT_MAX || (*p != '\0' && *p != ',')) goto invalid; if (end_cpu < start_cpu) goto invalid_range; } else { end_cpu = start_cpu; } if (start_cpu >= RTOP_MAX_CPUS || end_cpu >= RTOP_MAX_CPUS) goto too_big; for (; start_cpu <= end_cpu; start_cpu++) { if (start_cpu >= options.online_cpus) goto not_online; /* XXX: assume contiguous range of CPUs */ if (RTOP_CPUMASK_ISSET(options.cpu_mask, start_cpu)) continue; RTOP_CPUMASK_SET(options.cpu_mask, start_cpu); count++; } if (*p) ++p; cpu_list = p; } options.selected_cpus = count; return; invalid: errx(1, "invalid cpu list argument: %s", cpu_list); /* no return */ not_online: errx(1, "cpu %d is not online", start_cpu); /* no return */ invalid_range: errx(1, "cpu range %d - %d is invalid", start_cpu, end_cpu); /* no return */ too_big: errx(1, "rtop is limited to %d CPUs", RTOP_MAX_CPUS); /* no return */ } static void usage(void) { printf( "usage: rtop [options]:\n" "-h, --help\t\t\tdisplay this help and exit\n" "-v, --verbose\t\t\tverbose output\n" "-V, --version\t\t\tshow version and exit\n" "-d nsec, --delay=nsec\t\tnumber of seconds between refresh (default=1s)\n" "--cpu-list=cpu1,cpu2\t\tlist of CPUs to monitor (default=all)\n" ); } int main(int argc, char **argv) { int c; char *cpu_list = NULL; //if (geteuid()) err(1, "perf_event requires root privileges to create system-wide measurements\n"); while ((c=getopt_long(argc, argv,"+vhVd:", rtop_cmd_options, 0)) != -1) { switch(c) { case 0: continue; /* fast path for options */ case 'v': options.opt_verbose++; break; case 1: case 'h': usage(); exit(0); case 2:
case 'V': printf("rtop version " RTOP_VERSION " Date: " __DATE__ "\n" "Copyright (C) 2009 Google, Inc\n"); exit(0); case 3: case 'd': options.opt_delay = atoi(optarg); if (options.opt_delay < 0) errx(1, "invalid delay, must be >= 0"); options.opt_delay_set = 1; break; case 4: if (*optarg == '\0') errx(1, "--cpu-list needs an argument\n"); cpu_list = optarg; break; default: errx(1, "unknown option\n"); } } /* * default refresh delay */ if (options.opt_delay_set == 0) options.opt_delay = 1; options.cpu_mhz = find_cpu_speed(); populate_cpumask(cpu_list); setup_signals(); setup_screen(); mainloop(); close_screen(); return 0; } papi-papi-7-2-0-t/src/libpfm4/perf_examples/self.c000066400000000000000000000104531502707512200217160ustar00rootroot00000000000000/* * self.c - example of a simple self monitoring task * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Based on: * Copyright (c) 2002-2007 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include "perf_util.h" static const char *gen_events[]={ "cycles:u", "instructions:u", NULL }; static volatile int quit; void sig_handler(int n) { quit = 1; } void noploop(void) { for(;quit == 0;); } static void print_counts(perf_event_desc_t *fds, int num_fds, const char *msg) { uint64_t val; uint64_t values[3]; double ratio; int i; ssize_t ret; /* * now read the results. We use pfp_event_count because * libpfm guarantees that counters for the events always * come first. */ memset(values, 0, sizeof(values)); for (i = 0; i < num_fds; i++) { ret = read(fds[i].fd, values, sizeof(values)); if (ret < (ssize_t)sizeof(values)) { if (ret == -1) err(1, "cannot read results: %s", strerror(errno)); else warnx("could not read event%d", i); } /* * scaling is systematic because we may be sharing the PMU and * thus may be multiplexed */ val = perf_scale(values); ratio = perf_scale_ratio(values); printf("%s %'20"PRIu64" %s (%.2f%% scaling, raw=%'"PRIu64", ena=%'"PRIu64", run=%'"PRIu64")\n", msg, val, fds[i].name, (1.0-ratio)*100.0, values[0], values[1], values[2]); } } int main(int argc, char **argv) { perf_event_desc_t *fds = NULL; int i, ret, num_fds = 0; setlocale(LC_ALL, ""); /* * Initialize pfm library (required before we can use it) */ ret = pfm_initialize(); if (ret != PFM_SUCCESS) errx(1, "Cannot initialize library: %s", pfm_strerror(ret)); ret = perf_setup_argv_events(argc > 1 ? 
(const char **)argv+1 : gen_events, &fds, &num_fds); if (ret || !num_fds) errx(1, "cannot setup events"); fds[0].fd = -1; for(i=0; i < num_fds; i++) { /* request timing information necessary for scaling */ fds[i].hw.read_format = PERF_FORMAT_SCALE; fds[i].hw.disabled = 1; /* do not start now */ /* each event is in an independent group (multiplexing likely) */ fds[i].fd = perf_event_open(&fds[i].hw, 0, -1, -1, 0); if (fds[i].fd == -1) err(1, "cannot open event %d", i); } signal(SIGALRM, sig_handler); /* * enable all counters attached to this thread and created by it */ ret = prctl(PR_TASK_PERF_EVENTS_ENABLE); if (ret) err(1, "prctl(enable) failed"); print_counts(fds, num_fds, "INITIAL: "); alarm(10); noploop(); /* * disable all counters attached to this thread */ ret = prctl(PR_TASK_PERF_EVENTS_DISABLE); if (ret) err(1, "prctl(disable) failed"); printf("Final counts:\n"); print_counts(fds, num_fds, "FINAL: "); for (i = 0; i < num_fds; i++) close(fds[i].fd); perf_free_fds(fds, num_fds); /* free libpfm resources cleanly */ pfm_terminate(); return 0; } papi-papi-7-2-0-t/src/libpfm4/perf_examples/self_basic.c000066400000000000000000000076661502707512200230730ustar00rootroot00000000000000/* * self-basic.c - example of a simple self monitoring task no-helper * * Copyright (c) 2010 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ #include #include #include #include #include #include #include #include #include #include #include #include #define N 30 static unsigned long fib(unsigned long n) { if (n == 0) return 0; if (n == 1) return 2; return fib(n-1)+fib(n-2); } int main(int argc, char **argv) { struct perf_event_attr attr; int fd, ret; uint64_t count = 0, values[3]; setlocale(LC_ALL, ""); /* * Initialize libpfm library (required before we can use it) */ ret = pfm_initialize(); if (ret != PFM_SUCCESS) errx(1, "cannot initialize library: %s", pfm_strerror(ret)); memset(&attr, 0, sizeof(attr)); /* * 1st argument: event string * 2nd argument: default privilege level (used if not specified in the event string) * 3rd argument: the perf_event_attr to initialize */ ret = pfm_get_perf_event_encoding("cycles:u", PFM_PLM0|PFM_PLM3, &attr, NULL, NULL); if (ret != PFM_SUCCESS) errx(1, "cannot find encoding: %s", pfm_strerror(ret)); /* * request timing information because event may be multiplexed * and thus it may not count all the time. 
The scaling information * will be used to scale the raw count as if the event had run all * along */ attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING; /* do not start immediately after perf_event_open() */ attr.disabled = 1; /* * create the event and attach to self * Note that it attaches only to the main thread, there is no inheritance * to threads that may be created subsequently. * * if mulithreaded, then getpid() must be replaced by gettid() */ fd = perf_event_open(&attr, getpid(), -1, -1, 0); if (fd < 0) err(1, "cannot create event"); /* * start counting now */ ret = ioctl(fd, PERF_EVENT_IOC_ENABLE, 0); if (ret) err(1, "ioctl(enable) failed"); printf("Fibonacci(%d)=%lu\n", N, fib(N)); /* * stop counting */ ret = ioctl(fd, PERF_EVENT_IOC_DISABLE, 0); if (ret) err(1, "ioctl(disable) failed"); /* * read the count + scaling values * * It is not necessary to stop an event to read its value */ ret = read(fd, values, sizeof(values)); if (ret != sizeof(values)) err(1, "cannot read results: %s", strerror(errno)); /* * scale count * * values[0] = raw count * values[1] = TIME_ENABLED * values[2] = TIME_RUNNING */ if (values[2]) count = (uint64_t)((double)values[0] * values[1]/values[2]); printf("count=%'"PRIu64"\n", count); close(fd); /* free libpfm resources cleanly */ pfm_terminate(); return 0; } papi-papi-7-2-0-t/src/libpfm4/perf_examples/self_count.c000066400000000000000000000140601502707512200231240ustar00rootroot00000000000000/* * self_count.c - example of a simple self monitoring using mmapped page * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom 
the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ #include #include #include #include #include #include #include #include #include #include #include #include #include "perf_util.h" static const char *gen_events[]={ "cycles:u", NULL }; static volatile int quit; void sig_handler(int n) { quit = 1; } #if defined(__x86_64__) || defined(__i386__) #ifdef __x86_64__ #define DECLARE_ARGS(val, low, high) unsigned low, high #define EAX_EDX_VAL(val, low, high) ((low) | ((uint64_t )(high) << 32)) #define EAX_EDX_ARGS(val, low, high) "a" (low), "d" (high) #define EAX_EDX_RET(val, low, high) "=a" (low), "=d" (high) #else #define DECLARE_ARGS(val, low, high) unsigned long long val #define EAX_EDX_VAL(val, low, high) (val) #define EAX_EDX_ARGS(val, low, high) "A" (val) #define EAX_EDX_RET(val, low, high) "=A" (val) #endif #define barrier() __asm__ __volatile__("": : :"memory") static inline int rdpmc(struct perf_event_mmap_page *hdr, uint64_t *value) { int counter = hdr->index - 1; DECLARE_ARGS(val, low, high); if (counter < 0) return -1; asm volatile("rdpmc" : EAX_EDX_RET(val, low, high) : "c" (counter)); *value = EAX_EDX_VAL(val, low, high); return 0; } #else /* * Default barrier macro. 
* Given this is architecture specific, it must be defined when * libpfm is ported to new architecture. The default macro below * simply does nothing. */ #define barrier() {} /* * Default function to read counter directly from user level mode. * Given this is architecture specific, it must be defined when * libpfm is ported to new architecture. The default routine below * simply fails and the caller falls backs to syscall. */ static inline int rdpmc(struct perf_event_mmap_page *hdr, uint64_t *value) { int counter = hdr->index - 1; if (counter < 0) return -1; printf("your architecture does not have a way to read counters from user mode\n"); return -1; } #endif /* * our test code (function cannot be made static otherwise it is optimized away) */ unsigned long fib(unsigned long n) { if (n == 0) return 0; if (n == 1) return 2; return fib(n-1)+fib(n-2); } uint64_t read_count(perf_event_desc_t *fds) { struct perf_event_mmap_page *hdr; uint64_t values[3]; uint64_t count = 0; uint32_t width; unsigned int seq; ssize_t ret; int idx = -1; hdr = fds->buf; width = hdr->pmc_width; do { seq = hdr->lock; barrier(); /* try reading directly from user mode */ if (!rdpmc(hdr, &values[0])) { values[1] = hdr->time_enabled; values[2] = hdr->time_running; ret = 0; } else { idx = -1; ret = read(fds->fd, values, sizeof(values)); if (ret < (ssize_t)sizeof(values)) errx(1, "cannot read values"); printf("using read\n"); break; } barrier(); } while (hdr->lock != seq); printf("raw=0x%"PRIx64 " width=%d ena=%"PRIu64 " run=%"PRIu64" idx=%d\n", values[0], width, values[1], values[2], idx); count = values[0]; count <<= 64 - width; count >>= 64 - width; values[0] = count; return perf_scale(values); } int main(int argc, char **argv) { perf_event_desc_t *fds = NULL; long lret; size_t pgsz; uint64_t val, prev_val; int i, ret, num_fds = 0; lret = sysconf(_SC_PAGESIZE); if (lret < 0) err(1, "cannot get page size"); pgsz = (size_t)lret; /* * Initialize pfm library (required before we can use it) */ ret = 
pfm_initialize(); if (ret != PFM_SUCCESS) errx(1, "Cannot initialize library: %s", pfm_strerror(ret)); ret = perf_setup_argv_events(argc > 1 ? (const char **)argv+1 : gen_events, &fds, &num_fds); if (ret || !num_fds) errx(1, "cannot setup events"); fds[0].fd = -1; for(i=0; i < num_fds; i++) { /* request timing information necessary for scaling */ fds[i].hw.read_format = PERF_FORMAT_SCALE; fds[i].hw.disabled = 0; //fds[i].fd = perf_event_open(&fds[i].hw, 0, -1, fds[0].fd, 0); fds[i].fd = perf_event_open(&fds[i].hw, 0, -1, -1, 0); if (fds[i].fd == -1) err(1, "cannot open event %d", i); fds[i].buf = mmap(NULL, pgsz, PROT_READ, MAP_SHARED, fds[i].fd, 0); if (fds[i].buf == MAP_FAILED) err(1, "cannot mmap page"); } signal(SIGALRM, sig_handler); /* * enable all counters attached to this thread */ ioctl(fds[0].fd, PERF_EVENT_IOC_ENABLE, 0); alarm(10); prev_val = 0; for(;quit == 0;) { for (i = 0; i < num_fds; i++) { val = read_count(&fds[i]); /* print event deltas */ printf("%20"PRIu64" %s\n", val - prev_val, fds[i].name); prev_val = val; } fib(35); } /* * disable all counters attached to this thread */ ioctl(fds[0].fd, PERF_EVENT_IOC_DISABLE, 0); for (i=0; i < num_fds; i++) { munmap(fds[i].buf, pgsz); close(fds[i].fd); } perf_free_fds(fds, num_fds); /* free libpfm resources cleanly */ pfm_terminate(); return 0; } papi-papi-7-2-0-t/src/libpfm4/perf_examples/self_pipe.c000066400000000000000000000131271502707512200227340ustar00rootroot00000000000000/* * self_pipe.c - dual process ping-pong example to stress PMU context switch of one process * * Copyright (c) 2008 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom
the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "perf_util.h" static struct { const char *events; int cpu; int delay; } options; int pin_cpu(pid_t pid, unsigned int cpu) { cpu_set_t mask; CPU_ZERO(&mask); CPU_SET(cpu, &mask); return sched_setaffinity(pid, sizeof(mask), &mask); } static volatile int quit; void sig_handler(int n) { quit = 1; } static void do_child(int fr, int fw) { char c; ssize_t ret; for(;;) { ret = read(fr, &c, 1); if (ret < 0) break; ret = write(fw, "c", 1); if (ret < 0) break; } printf("child exited\n"); exit(0); } static void measure(void) { perf_event_desc_t *fds = NULL; int num_fds = 0; uint64_t values[3]; ssize_t n; int i, ret; int pr[2], pw[2]; pid_t pid; char cc = '0'; ret = pfm_initialize(); if (ret != PFM_SUCCESS) err(1, "cannot initialize libpfm"); if (options.cpu == -1) { srandom(getpid()); options.cpu = random() % sysconf(_SC_NPROCESSORS_ONLN); } ret = pipe(pr); if (ret) err(1, "cannot create read pipe"); ret = pipe(pw); if (ret) err(1, "cannot create write pipe"); ret = perf_setup_list_events(options.events, &fds, &num_fds); if (ret || !num_fds) exit(1); for(i=0; i < 
num_fds; i++) { fds[i].hw.disabled = 1; fds[i].hw.read_format = PERF_FORMAT_SCALE; fds[i].fd = perf_event_open(&fds[i].hw, 0, -1, -1, 0); if (fds[i].fd == -1) err(1, "cannot open event %d", i); } /* * Pin to the selected CPU, inherited by the child process. That will enforce * the ping-ponging and thus stress the PMU context switch * which is what we want */ ret = pin_cpu(getpid(), options.cpu); if (ret) err(1, "cannot pin to CPU%d", options.cpu); printf("Both processes pinned to CPU%d, running for %d seconds\n", options.cpu, options.delay); /* * create second process which is not monitoring at the moment */ switch(pid=fork()) { case -1: err(1, "cannot create child\n"); exit(1); /* not reached */ case 0: /* do not inherit session fd */ for(i=0; i < num_fds; i++) close(fds[i].fd); /* pr[]: write master, read child */ /* pw[]: read master, write child */ close(pr[1]); close(pw[0]); do_child(pr[0], pw[1]); exit(1); } close(pr[0]); close(pw[1]); /* * Let's roll now */ prctl(PR_TASK_PERF_EVENTS_ENABLE); signal(SIGALRM, sig_handler); alarm(options.delay); /* * ping pong loop */ while(!quit) { n = write(pr[1], "c", 1); if (n < 1) err(1, "write failed"); n = read(pw[0], &cc, 1); if (n < 1) err(1, "read failed"); } prctl(PR_TASK_PERF_EVENTS_DISABLE); for(i=0; i < num_fds; i++) { uint64_t val; double ratio; ret = read(fds[i].fd, values, sizeof(values)); if (ret == -1) err(1,"pfm_read error"); if (ret != sizeof(values)) errx(1, "did not read correct amount %d", ret); val = perf_scale(values); ratio = perf_scale_ratio(values); if (ratio == 1.0) printf("%20"PRIu64" %s\n", val, fds[i].name); else if (ratio == 0.0) printf("%20"PRIu64" %s (did not run: competing session)\n", val, fds[i].name); else printf("%20"PRIu64" %s (scaled from %.2f%% of time)\n", val, fds[i].name, ratio*100.0); } /* * kill child process * (note: kill() takes the pid first, then the signal) */ kill(pid, SIGKILL); /* * close pipes */ close(pr[1]); close(pw[0]); /* * and destroy our session */ for(i=0; i < num_fds; i++) close(fds[i].fd); perf_free_fds(fds, num_fds); /* free
libpfm resources cleanly */ pfm_terminate(); } static void usage(void) { printf("usage: self_pipe [-h] [-c cpu] [-d delay] [-e event1,event2,...]\n"); } int main(int argc, char **argv) { int c; options.cpu = -1; options.delay = -1; while ((c=getopt(argc, argv,"he:c:d:")) != -1) { switch(c) { case 'e': options.events = optarg; break; case 'c': options.cpu = atoi(optarg); break; case 'd': options.delay = atoi(optarg); break; case 'h': usage(); exit(0); default: errx(1, "unknown error"); } } if (!options.events) options.events = "cycles:u,instructions:u"; if (options.delay == -1) options.delay = 10; measure(); return 0; } papi-papi-7-2-0-t/src/libpfm4/perf_examples/self_smpl_multi.c000066400000000000000000000263711502707512200241710ustar00rootroot00000000000000/* * * self_smpl_multi.c - multi-thread self-sampling program * * Copyright (c) 2009 Google, Inc * Modified by Stephane Eranian * * Based on: * Copyright (c) 2008 Mark W. Krentel * Contributed by Mark W. Krentel * Modified by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. * * Test perfmon overflow without PAPI. * * Create a new thread, launch perfmon overflow counters in both * threads, print the number of interrupts per thread and per second, * and look for anomalous interrupts. Look for mismatched thread * ids, bad message type, or failed pfm_restart(). * * self_smpl_multi is a test program to stress signal delivery in the context * of a multi-threaded self-sampling program which is common with PAPI and HPC. * * There is an issue with existing (as of 2.6.30) kernels, which do not provide * a reliable way of having the signal delivered to the thread in which the * counter overflow occurred. This is problematic for many self-monitoring * programs. * * This program demonstrates the issue by tracking the number of times * the signal goes to the wrong thread. The bad behavior is exacerbated * if the monitored threads, themselves, already use signals. Here we * use SIGUSR1. * * Note that kernel developers have been made aware of this problem and * a fix has been proposed. It introduces a new F_SETOWN_EX command to * fcntl.
*/ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "perf_util.h" #define PROGRAM_TIME 8 #define THRESHOLD 20000000 static int program_time = PROGRAM_TIME; static int threshold = THRESHOLD; static int signum = SIGIO; static pthread_barrier_t barrier; static int buffer_pages = 1; #define MAX_THR 128 /* * the following definitions come * from the F_SETOWN_EX patch from Peter Zijlstra * Check out: http://lkml.org/lkml/2009/8/4/128 */ #ifndef F_SETOWN_EX #define F_SETOWN_EX 15 #define F_GETOWN_EX 16 #define F_OWNER_TID 0 #define F_OWNER_PID 1 #define F_OWNER_PGRP 2 struct f_owner_ex { int type; pid_t pid; }; #endif struct over_args { int fd; pid_t tid; int id; perf_event_desc_t *fds; }; struct over_args fd2ov[MAX_THR]; long count[MAX_THR]; long total[MAX_THR]; long iter[MAX_THR]; long mismatch[MAX_THR]; long bad_msg[MAX_THR]; long bad_restart[MAX_THR]; int fown; static __thread int myid; /* TLS */ static __thread perf_event_desc_t *fds; /* TLS */ static __thread int num_fds; /* TLS */ pid_t gettid(void) { return (pid_t)syscall(__NR_gettid); } void user_callback(int m) { count[m]++; total[m]++; } void do_cycles(void) { struct timeval start, last, now; unsigned long x; gettimeofday(&start, NULL); last = start; count[myid] = 0; total[myid] = 0; iter[myid] = 0; do { for (x = 1; x < 250000; x++) { /* signal pending to private queue because of * pthread_kill(), i.e., tkill() */ if ((x % 5000) == 0) pthread_kill(pthread_self(), SIGUSR1); } iter[myid]++; gettimeofday(&now, NULL); if (now.tv_sec > last.tv_sec) { printf("%ld: myid = %3d, fd = %3d, count = %4ld, iter = %4ld, rate = %ld/Kiter\n", (long)(now.tv_sec - start.tv_sec), myid, fd2ov[myid].fd, count[myid], iter[myid], (1000 * count[myid])/iter[myid]); count[myid] = 0; iter[myid] = 0; last = now; } } while (now.tv_sec < start.tv_sec + program_time); } #define DPRINT(str) \ printf("(%s) si->fd = %d, ov->self = 0x%lx, self = 0x%lx\n", \ 
str, fd, (unsigned long)ov->self, (unsigned long)self) void sigusr1_handler(int sig, siginfo_t *info, void *context) { } /* * a signal handler cannot safely invoke printf() */ void sigio_handler(int sig, siginfo_t *info, void *context) { perf_event_desc_t *fdx; struct perf_event_header ehdr; struct over_args *ov; int fd, i, ret; pid_t tid; /* * positive si_code indicate kernel generated signal * which is normal for SIGIO */ if (info->si_code < 0) errx(1, "signal not generated by kernel"); /* * SIGPOLL = SIGIO * expect POLL_HUP instead of POLL_IN because we are * in one-shot mode (IOC_REFRESH) */ if (info->si_code != POLL_HUP) errx(1, "signal not generated by SIGIO: %d", info->si_code); fd = info->si_fd; tid = gettid(); for(i=0; i < MAX_THR; i++) if (fd2ov[i].fd == fd) break; if (i == MAX_THR) errx(1, "bad info.si_fd: %d", fd); ov = &fd2ov[i]; /* * current thread id may not always match the id * associated with the file descriptor * * We need to use the other's thread fds info * otherwise, it is going to get stuck with no * more samples generated */ if (tid != ov->tid) { mismatch[myid]++; fdx = ov->fds; } else { fdx = fds; } /* * read sample header */ ret = perf_read_buffer(fdx+0, &ehdr, sizeof(ehdr)); if (ret) { errx(1, "cannot read event header"); } /* * message we do not handle */ if (ehdr.type != PERF_RECORD_SAMPLE) { bad_msg[myid]++; goto skip; } user_callback(myid); skip: /* mark sample as consumed */ perf_skip_buffer(fdx+0, ehdr.size); /* * re-arm period, next notification after wakeup_events */ ret = ioctl(fd, PERF_EVENT_IOC_REFRESH, 1); if (ret) err(1, "cannot refresh"); } void overflow_start(char *name) { struct f_owner_ex fown_ex; struct over_args *ov; size_t pgsz; int ret, fd, flags; fds = NULL; num_fds = 0; ret = perf_setup_list_events("cycles:u", &fds, &num_fds); if (ret || !num_fds) errx(1, "cannot monitor event"); pgsz = sysconf(_SC_PAGESIZE); ov = &fd2ov[myid]; /* do not enable now */ fds[0].hw.disabled = 1; /* notify after 1 sample */ 
fds[0].hw.wakeup_events = 1; fds[0].hw.sample_type = PERF_SAMPLE_IP; fds[0].hw.sample_period = threshold; fds[0].hw.read_format = 0; fds[0].fd = fd = perf_event_open(&fds[0].hw, gettid(), -1, -1, 0); if (fd == -1) err(1, "cannot attach event %s", fds[0].name); ov->fd = fd; ov->tid = gettid(); ov->id = myid; ov->fds = fds; flags = fcntl(fd, F_GETFL, 0); if (fcntl(fd, F_SETFL, flags | O_ASYNC) < 0) err(1, "fcntl SETFL failed"); fown_ex.type = F_OWNER_TID; fown_ex.pid = gettid(); ret = fcntl(fd, (fown ? F_SETOWN_EX : F_SETOWN), (fown ? (unsigned long)&fown_ex: (unsigned long)gettid())); if (ret) err(1, "fcntl SETOWN failed"); if (fcntl(fd, F_SETSIG, signum) < 0) err(1, "fcntl SETSIG failed"); fds[0].buf = mmap(NULL, (buffer_pages + 1)* pgsz, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); if (fds[0].buf == MAP_FAILED) err(1, "cannot mmap buffer"); fds[0].pgmsk = (buffer_pages * pgsz) - 1; printf("launch %s: fd: %d, tid: %d\n", name, fd, ov->tid); /* * activate event for wakeup_events (samples) */ ret = ioctl(fd, PERF_EVENT_IOC_REFRESH , 1); if (ret == -1) err(1, "cannot refresh"); } void overflow_stop(void) { int ret; ret = ioctl(fd2ov[myid].fd, PERF_EVENT_IOC_DISABLE, 0); if (ret) err(1, "cannot stop"); } void * my_thread(void *v) { int retval = 0; myid = (unsigned long)v; pthread_barrier_wait(&barrier); overflow_start("side"); do_cycles(); overflow_stop(); perf_free_fds(fds, num_fds); pthread_exit((void *)&retval); } static void usage(void) { printf("self_smpl_multi [-t secs] [-p period] [-s signal] [-f] [-n threads]\n" "-t secs: duration of the run in seconds\n" "-p period: sampling period in CPU cycles\n" "-s signal: signal to use (default: SIGIO)\n" "-n thread: number of threads to create (default: 1)\n" "-f : use F_SETOWN_EX for correct delivery of signal to thread (default: off)\n"); } /* * Program args: program_time, threshold, signum. 
*/ int main(int argc, char **argv) { struct sigaction sa; pthread_t allthr[MAX_THR]; sigset_t set, old, new; int i, ret, max_thr = 1; while ((i=getopt(argc, argv, "t:p:s:fhn:")) != EOF) { switch(i) { case 'h': usage(); return 0; case 't': program_time = atoi(optarg); break; case 'p': threshold = atoi(optarg); break; case 's': signum = atoi(optarg); break; case 'f': fown = 1; break; case 'n': max_thr = atoi(optarg); if (max_thr >= MAX_THR) errx(1, "no more than %d threads", MAX_THR); break; default: errx(1, "invalid option"); } } printf("program_time = %d, threshold = %d, signum = %d fcntl(%s), threads = %d\n", program_time, threshold, signum, fown ? "F_SETOWN_EX" : "F_SETOWN", max_thr); for (i = 0; i < MAX_THR; i++) { mismatch[i] = 0; bad_msg[i] = 0; bad_restart[i] = 0; } memset(&sa, 0, sizeof(sa)); sigemptyset(&set); sa.sa_sigaction = sigusr1_handler; sa.sa_mask = set; sa.sa_flags = SA_SIGINFO; if (sigaction(SIGUSR1, &sa, NULL) != 0) errx(1, "sigaction failed"); memset(&sa, 0, sizeof(sa)); sigemptyset(&set); sa.sa_sigaction = sigio_handler; sa.sa_mask = set; sa.sa_flags = SA_SIGINFO; if (sigaction(signum, &sa, NULL) != 0) errx(1, "sigaction failed"); if (pfm_initialize() != PFM_SUCCESS) errx(1, "pfm_initialize failed"); /* * +1 because main thread is also using the barrier */ pthread_barrier_init(&barrier, 0, max_thr+1); for(i=0; i < max_thr; i++) { ret = pthread_create(allthr+i, NULL, my_thread, (void *)(unsigned long)i); if (ret) err(1, "pthread_create failed"); } myid = i; sigemptyset(&set); sigemptyset(&new); sigaddset(&set, SIGIO); sigaddset(&new, SIGIO); if (pthread_sigmask(SIG_BLOCK, &set, NULL)) err(1, "cannot mask SIGIO in main thread"); ret = sigprocmask(SIG_SETMASK, NULL, &old); if (ret) err(1, "sigprocmask failed"); if (sigismember(&old, SIGIO)) { warnx("program started with SIGIO masked, unmasking it now\n"); ret = sigprocmask(SIG_UNBLOCK, &new, NULL); if (ret) err(1, "sigprocmask failed"); } pthread_barrier_wait(&barrier); printf("\n\n"); for (i = 0; 
i < max_thr; i++) { pthread_join(allthr[i], NULL); } printf("\n\n"); for (i = 0; i < max_thr; i++) { printf("myid = %3d, fd = %3d, total = %4ld, mismatch = %ld, " "bad_msg = %ld, bad_restart = %ld\n", fd2ov[i].id, fd2ov[i].fd, total[i], mismatch[i], bad_msg[i], bad_restart[i]); } /* free libpfm resources cleanly */ pfm_terminate(); return (0); } papi-papi-7-2-0-t/src/libpfm4/perf_examples/syst.c000066400000000000000000000126461502707512200217750ustar00rootroot00000000000000/* * syst.c - example of a simple system wide monitoring program * * Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include #include #include "perf_util.h" typedef struct { const char *events; int delay; int excl; int cpu; int group; } options_t; static options_t options; static perf_event_desc_t **all_fds; static int *num_fds; void setup_cpu(int cpu) { perf_event_desc_t *fds; int i, ret; ret = perf_setup_list_events(options.events, &all_fds[cpu], &num_fds[cpu]); if (ret || (num_fds == 0)) errx(1, "cannot setup events\n"); fds = all_fds[cpu]; /* temp */ fds[0].fd = -1; for(i=0; i < num_fds[cpu]; i++) { fds[i].hw.disabled = options.group ? !i : 1; if (options.excl && ((options.group && !i) || (!options.group))) fds[i].hw.exclusive = 1; fds[i].hw.disabled = options.group ? !i : 1; /* request timing information necessary for scaling counts */ fds[i].hw.read_format = PERF_FORMAT_SCALE; fds[i].fd = perf_event_open(&fds[i].hw, -1, cpu, (options.group ? fds[0].fd : -1), 0); if (fds[i].fd == -1) err(1, "cannot attach event to CPU%d %s", cpu, fds[i].name); } } void measure(void) { perf_event_desc_t *fds; long lret; int c, cmin, cmax, ncpus; int i, ret, l; printf("\n", options.delay); cmin = 0; lret = sysconf(_SC_NPROCESSORS_ONLN); if (lret < 0) err(1, "cannot get number of online processors"); cmax = (int)lret; ncpus = cmax; if (options.cpu != -1) { cmin = options.cpu; cmax = cmin + 1; } all_fds = calloc(ncpus, sizeof(perf_event_desc_t *)); num_fds = calloc(ncpus, sizeof(int)); if (!all_fds || !num_fds) err(1, "cannot allocate memory for internal structures"); for(c=cmin ; c < cmax; c++) setup_cpu(c); /* * FIX this for hotplug CPU */ for(c=cmin ; c < cmax; c++) { fds = all_fds[c]; if (options.group) ret = ioctl(fds[0].fd, PERF_EVENT_IOC_ENABLE, 0); else for(i=0; i < num_fds[c]; i++) { ret = ioctl(fds[i].fd, PERF_EVENT_IOC_ENABLE, 0); if (ret) err(1, "cannot enable event %s\n", fds[i].name); } } for(l=0; l < options.delay; l++) { sleep(1); puts("------------------------"); for(c = cmin; c < cmax; c++) { fds = 
all_fds[c]; for(i=0; i < num_fds[c]; i++) { uint64_t val, delta; double ratio; ret = read(fds[i].fd, fds[i].values, sizeof(fds[i].values)); if (ret != sizeof(fds[i].values)) { if (ret == -1) err(1, "cannot read event %d:%d", i, ret); else warnx("could not read event%d", i); } /* * scaling because we may be sharing the PMU and * thus may be multiplexed */ val = perf_scale(fds[i].values); ratio = perf_scale_ratio(fds[i].values); delta = perf_scale_delta(fds[i].values, fds[i].prev_values); printf("CPU%d val=%-20"PRIu64" %-20"PRIu64" raw=%"PRIu64" ena=%"PRIu64" run=%"PRIu64" ratio=%.2f %s\n", c, val, delta, fds[i].values[0], fds[i].values[1], fds[i].values[2], ratio, fds[i].name); fds[i].prev_values[0] = fds[i].values[0]; fds[i].prev_values[1] = fds[i].values[1]; fds[i].prev_values[2] = fds[i].values[2]; } } } for(c = cmin; c < cmax; c++) { fds = all_fds[c]; for(i=0; i < num_fds[c]; i++) close(fds[i].fd); perf_free_fds(fds, num_fds[c]); } } static void usage(void) { printf("usage: syst [-c cpu] [-x] [-h] [-d delay] [-g] [-e event1,event2,...]\n"); } int main(int argc, char **argv) { int c, ret; options.cpu = -1; while ((c=getopt(argc, argv,"hc:e:d:gx")) != -1) { switch(c) { case 'x': options.excl = 1; break; case 'e': options.events = optarg; break; case 'c': options.cpu = atoi(optarg); break; case 'g': options.group = 1; break; case 'd': options.delay = atoi(optarg); break; case 'h': usage(); exit(0); default: errx(1, "unknown error"); } } if (!options.delay) options.delay = 20; if (!options.events) options.events = "cycles,instructions"; ret = pfm_initialize(); if (ret != PFM_SUCCESS) errx(1, "libpfm initialization failed: %s\n", pfm_strerror(ret)); measure(); /* free libpfm resources cleanly */ pfm_terminate(); return 0; } papi-papi-7-2-0-t/src/libpfm4/perf_examples/syst_count.c000066400000000000000000000233351502707512200232020ustar00rootroot00000000000000/* * syst.c - example of a simple system wide monitoring program * * Copyright (c) 2010 Google, Inc * 
Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include #include #include #include #include #include "perf_util.h" #define MAX_GROUPS 256 #define MAX_PATH 1024 #ifndef STR # define _STR(x) #x # define STR(x) _STR(x) #endif typedef struct { const char *events[MAX_GROUPS]; int nevents[MAX_GROUPS]; /* #events per group */ int num_groups; int delay; int excl; int pin; int interval; int cpu; char *cgroup_name; } options_t; static options_t options; static perf_event_desc_t **all_fds; static int cgroupfs_find_mountpoint(char *buf, size_t maxlen) { FILE *fp; char mountpoint[MAX_PATH+1], tokens[MAX_PATH+1], type[MAX_PATH+1]; char *token, *saved_ptr = NULL; int found = 0; fp = fopen("/proc/mounts", "r"); if (!fp) return -1; /* * in order to handle split hierarchy, we need to scan /proc/mounts * and inspect every cgroupfs mount point to find one that has * perf_event subsystem */ while (fscanf(fp, "%*s %"STR(MAX_PATH)"s %"STR(MAX_PATH)"s %" STR(MAX_PATH)"s %*d %*d\n", mountpoint, type, tokens) == 3) { if (!strcmp(type, "cgroup")) { token = strtok_r(tokens, ",", &saved_ptr); while (token != NULL) { if (!strcmp(token, "perf_event")) { found = 1; break; } token = strtok_r(NULL, ",", &saved_ptr); } } if (found) break; } fclose(fp); if (!found) return -1; if (strlen(mountpoint) < maxlen) { strcpy(buf, mountpoint); return 0; } return -1; } int open_cgroup(char *name) { char path[MAX_PATH+1]; char mnt[MAX_PATH+1]; int cfd = -1; int retlen; if (cgroupfs_find_mountpoint(mnt, MAX_PATH+1)) errx(1, "cannot find cgroup fs mount point"); retlen = snprintf(path, MAX_PATH, "%s/%s", mnt, name); /* ensure generated d2path string is valid */ if (retlen <= 0 || MAX_PATH <= retlen) { warn("Unable to generate path name %s/%s\n", mnt, name); return cfd; } cfd = open(path, O_RDONLY); if (cfd == -1) warn("no access to cgroup %s\n", name); return cfd; } void setup_cpu(int cpu, int cfd) { perf_event_desc_t *fds = NULL; int old_total, total = 0, num; int i, j, n, ret, 
is_lead, group_fd; unsigned long flags; pid_t pid; for(i=0, j=0; i < options.num_groups; i++) { old_total = total; ret = perf_setup_list_events(options.events[i], &fds, &total); if (ret) errx(1, "cannot setup events\n"); all_fds[cpu] = fds; num = total - old_total; options.nevents[i] = num; for(n=0; n < num; n++, j++) { is_lead = perf_is_group_leader(fds, j); if (is_lead) { fds[j].hw.disabled = 1; group_fd = -1; } else { fds[j].hw.disabled = 0; group_fd = fds[fds[j].group_leader].fd; } fds[j].hw.size = sizeof(struct perf_event_attr); if (options.cgroup_name) { flags = PERF_FLAG_PID_CGROUP; pid = cfd; //fds[j].hw.cgroup = 1; //fds[j].hw.cgroup_fd = cfd; } else { flags = 0; pid = -1; } if (options.pin && is_lead) fds[j].hw.pinned = 1; if (options.excl && is_lead) fds[j].hw.exclusive = 1; /* request timing information necessary for scaling counts */ fds[j].hw.read_format = PERF_FORMAT_SCALE; fds[j].fd = perf_event_open(&fds[j].hw, pid, cpu, group_fd, flags); if (fds[j].fd == -1) { if (errno == EACCES) err(1, "you need to be root to run system-wide on this machine"); warn("cannot attach event %s to CPU%ds, aborting", fds[j].name, cpu); exit(1); } } } } void start_cpu(int c) { perf_event_desc_t *fds = NULL; int j, ret, n = 0; fds = all_fds[c]; if (fds[0].fd == -1) return; for(j=0; j < options.num_groups; j++) { /* group leader always first in each group */ ret = ioctl(fds[n].fd, PERF_EVENT_IOC_ENABLE, 0); if (ret) err(1, "cannot enable event %s\n", fds[j].name); n += options.nevents[j]; } } void stop_cpu(int c) { perf_event_desc_t *fds = NULL; int j, ret, n = 0; fds = all_fds[c]; if (fds[0].fd == -1) return; for(j=0; j < options.num_groups; j++) { /* group leader always first in each group */ ret = ioctl(fds[n].fd, PERF_EVENT_IOC_DISABLE, 0); if (ret) err(1, "cannot disable event %s\n", fds[j].name); n += options.nevents[j]; } } void read_cpu(int c) { perf_event_desc_t *fds; uint64_t val, delta; double ratio; int i, j, n, ret; fds = all_fds[c]; if (fds[0].fd == -1) { 
printf("CPU%d not monitored\n", c); return; } for(i=0, j = 0; i < options.num_groups; i++) { for(n = 0; n < options.nevents[i]; n++, j++) { ret = read(fds[j].fd, fds[j].values, sizeof(fds[j].values)); if (ret != sizeof(fds[j].values)) { if (ret == -1) err(1, "cannot read event %s : %d", fds[j].name, ret); else { warnx("CPU%d G%-2d could not read event %s, read=%d", c, i, fds[j].name, ret); continue; } } /* * scaling because we may be sharing the PMU and * thus may be multiplexed */ delta = perf_scale_delta(fds[j].values, fds[j].prev_values); val = perf_scale(fds[j].values); ratio = perf_scale_ratio(fds[j].values); printf("CPU%-3d G%-2d %'20"PRIu64" %'20"PRIu64" %s (scaling %.2f%%, ena=%'"PRIu64", run=%'"PRIu64") %s\n", c, i, val, delta, fds[j].name, (1.0-ratio)*100, fds[j].values[1], fds[j].values[2], options.cgroup_name ? options.cgroup_name : ""); fds[j].prev_values[0] = fds[j].values[0]; fds[j].prev_values[1] = fds[j].values[1]; fds[j].prev_values[2] = fds[j].values[2]; if (fds[j].values[2] > fds[j].values[1]) errx(1, "WARNING: time_running > time_enabled %"PRIu64"\n", fds[j].values[2] - fds[j].values[1]); } } } void close_cpu(int c) { perf_event_desc_t *fds = NULL; int i, j; int total = 0; fds = all_fds[c]; if (fds[0].fd == -1) return; for(i=0; i < options.num_groups; i++) { for(j=0; j < options.nevents[i]; j++) close(fds[j].fd); total += options.nevents[i]; } perf_free_fds(fds, total); } void measure(void) { int c, cmin, cmax, ncpus; int cfd = -1; cmin = 0; cmax = (int)sysconf(_SC_NPROCESSORS_ONLN); ncpus = cmax; if (options.cpu != -1) { cmin = options.cpu; cmax = cmin + 1; } all_fds = malloc(ncpus * sizeof(perf_event_desc_t *)); if (!all_fds) err(1, "cannot allocate memory for all_fds"); if (options.cgroup_name) { cfd = open_cgroup(options.cgroup_name); if (cfd == -1) exit(1); } for(c=cmin ; c < cmax; c++) setup_cpu(c, cfd); if (options.cgroup_name) close(cfd); printf("\n", options.delay); /* * FIX this for hotplug CPU */ if (options.interval) { struct 
timespec tv; int delay; for (delay = 1 ; delay <= options.delay; delay++) { for(c=cmin ; c < cmax; c++) start_cpu(c); if (0) { tv.tv_sec = 0; tv.tv_nsec = 100000000; nanosleep(&tv, NULL); } else sleep(1); for(c=cmin ; c < cmax; c++) stop_cpu(c); for(c = cmin; c < cmax; c++) { printf("# %'ds -----\n", delay); read_cpu(c); } } } else { for(c=cmin ; c < cmax; c++) start_cpu(c); sleep(options.delay); if (0) for(c=cmin ; c < cmax; c++) stop_cpu(c); for(c = cmin; c < cmax; c++) { printf("# -----\n"); read_cpu(c); } } for(c = cmin; c < cmax; c++) close_cpu(c); free(all_fds); } static void usage(void) { printf("usage: syst [-c cpu] [-x] [-h] [-p] [-d delay] [-P] [-G cgroup name] [-e event1,event2,...]\n"); } int main(int argc, char **argv) { int c, ret; setlocale(LC_ALL, ""); options.cpu = -1; while ((c=getopt(argc, argv,"hc:e:d:xPpG:")) != -1) { switch(c) { case 'x': options.excl = 1; break; case 'p': options.interval = 1; break; case 'e': if (options.num_groups < MAX_GROUPS) { options.events[options.num_groups++] = optarg; } else { errx(1, "you cannot specify more than %d groups.\n", MAX_GROUPS); } break; case 'c': options.cpu = atoi(optarg); break; case 'd': options.delay = atoi(optarg); break; case 'P': options.pin = 1; break; case 'h': usage(); exit(0); case 'G': options.cgroup_name = optarg; break; default: errx(1, "unknown error"); } } if (!options.delay) options.delay = 20; if (!options.events[0]) { options.events[0] = "cycles,instructions"; options.num_groups = 1; } ret = pfm_initialize(); if (ret != PFM_SUCCESS) errx(1, "libpfm initialization failed: %s\n", pfm_strerror(ret)); measure(); /* free libpfm resources cleanly */ pfm_terminate(); return 0; } papi-papi-7-2-0-t/src/libpfm4/perf_examples/syst_smpl.c000077500000000000000000000236161502707512200230320ustar00rootroot00000000000000/* * syst_smpl.c - example of a system-wide sampling * * Copyright (c) 2010 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any 
person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "perf_util.h" #define SMPL_PERIOD 240000000ULL #define MAX_PATH 1024 #ifndef STR # define _STR(x) #x # define STR(x) _STR(x) #endif typedef struct { int opt_no_show; int mmap_pages; int cpu; int pin; int delay; char *events; char *cgroup; } options_t; static jmp_buf jbuf; static uint64_t collected_samples, lost_samples; static perf_event_desc_t *fds; static int num_fds; static options_t options; static size_t pgsz; static size_t map_size; static struct option the_options[]={ { "help", 0, 0, 1}, { "no-show", 0, &options.opt_no_show, 1}, { 0, 0, 0, 0} }; static const char *gen_events = "cycles,instructions"; static void process_smpl_buf(perf_event_desc_t *hw) { struct perf_event_header ehdr; int ret; for(;;) { ret = perf_read_buffer(hw, &ehdr, sizeof(ehdr)); if (ret) return; /* nothing to read */ switch(ehdr.type) { 
case PERF_RECORD_SAMPLE: ret = perf_display_sample(fds, num_fds, hw - fds, &ehdr, stdout); if (ret) errx(1, "cannot parse sample"); collected_samples++; break; case PERF_RECORD_EXIT: display_exit(hw, stdout); break; case PERF_RECORD_LOST: lost_samples += display_lost(hw, fds, num_fds, stdout); break; case PERF_RECORD_THROTTLE: display_freq(1, hw, stdout); break; case PERF_RECORD_UNTHROTTLE: display_freq(0, hw, stdout); break; default: printf("unknown sample type %d\n", ehdr.type); perf_skip_buffer(hw, ehdr.size - sizeof(ehdr)); } } } int setup_cpu(int cpu, int fd) { int ret, flags; int i, pid; /* * does allocate fds */ ret = perf_setup_list_events(options.events, &fds, &num_fds); if (ret || !num_fds) errx(1, "cannot setup event list"); if (!fds[0].hw.sample_period) errx(1, "need to set sampling period or freq on first event, use :period= or :freq="); fds[0].fd = -1; for(i=0; i < num_fds; i++) { fds[i].hw.disabled = !i; /* start immediately */ if (options.cgroup) { flags = PERF_FLAG_PID_CGROUP; pid = fd; } else { flags = 0; pid = -1; } if (options.pin) fds[i].hw.pinned = 1; if (fds[i].hw.sample_period) { /* * set notification threshold to be halfway through the buffer */ if (fds[i].hw.sample_period) { fds[i].hw.wakeup_watermark = (options.mmap_pages*pgsz) / 2; fds[i].hw.watermark = 1; } fds[i].hw.sample_type = PERF_SAMPLE_IP|PERF_SAMPLE_TID|PERF_SAMPLE_READ|PERF_SAMPLE_TIME|PERF_SAMPLE_PERIOD|PERF_SAMPLE_STREAM_ID|PERF_SAMPLE_CPU; /* * if we have more than one event, then record event identifier to help with parsing */ if (num_fds > 1) fds[i].hw.sample_type |= PERF_SAMPLE_IDENTIFIER; printf("%s period=%"PRIu64" freq=%d\n", fds[i].name, fds[i].hw.sample_period, fds[i].hw.freq); fds[i].hw.read_format = PERF_FORMAT_SCALE; if (fds[i].hw.freq) fds[i].hw.sample_type |= PERF_SAMPLE_PERIOD; } fds[i].fd = perf_event_open(&fds[i].hw, pid, cpu, fds[0].fd, flags); if (fds[i].fd == -1) { if (fds[i].hw.precise_ip) err(1, "cannot attach event %s: precise mode may not be 
supported", fds[i].name); err(1, "cannot attach event %s", fds[i].name); } } /* * kernel adds the header page to the size of the mmapped region */ fds[0].buf = mmap(NULL, map_size, PROT_READ|PROT_WRITE, MAP_SHARED, fds[0].fd, 0); if (fds[0].buf == MAP_FAILED) err(1, "cannot mmap buffer"); /* does not include header page */ fds[0].pgmsk = (options.mmap_pages*pgsz)-1; /* * send samples for all events to first event's buffer */ for (i = 1; i < num_fds; i++) { if (!fds[i].hw.sample_period) continue; ret = ioctl(fds[i].fd, PERF_EVENT_IOC_SET_OUTPUT, fds[0].fd); if (ret) err(1, "cannot redirect sampling output"); } /* * collect event ids */ if (num_fds > 1 && fds[0].fd > -1) { for(i = 0; i < num_fds; i++) { /* * read the event identifier using ioctl * new method replaced the trick with PERF_FORMAT_GROUP + PERF_FORMAT_ID + read() */ ret = ioctl(fds[i].fd, PERF_EVENT_IOC_ID, &fds[i].id); if (ret == -1) err(1, "cannot read ID"); printf("ID %"PRIu64" %s\n", fds[i].id, fds[i].name); } } return 0; } static void start_cpu(void) { int ret; ret = ioctl(fds[0].fd, PERF_EVENT_IOC_ENABLE, 0); if (ret) err(1, "cannot start counter"); } static int cgroupfs_find_mountpoint(char *buf, size_t maxlen) { FILE *fp; char mountpoint[MAX_PATH+1], tokens[MAX_PATH+1], type[MAX_PATH+1]; char *token, *saved_ptr = NULL; int found = 0; fp = fopen("/proc/mounts", "r"); if (!fp) return -1; /* * in order to handle split hierarchy, we need to scan /proc/mounts * and inspect every cgroupfs mount point to find one that has * perf_event subsystem */ while (fscanf(fp, "%*s %"STR(MAX_PATH)"s %"STR(MAX_PATH)"s %" STR(MAX_PATH)"s %*d %*d\n", mountpoint, type, tokens) == 3) { if (!strcmp(type, "cgroup")) { token = strtok_r(tokens, ",", &saved_ptr); while (token != NULL) { if (!strcmp(token, "perf_event")) { found = 1; break; } token = strtok_r(NULL, ",", &saved_ptr); } } if (found) break; } fclose(fp); if (!found) return -1; if (strlen(mountpoint) < maxlen) { strcpy(buf, mountpoint); return 0; } return -1; } 
int open_cgroup(char *name) { char path[MAX_PATH+1]; char mnt[MAX_PATH+1]; int cfd = -1; int retlen; if (cgroupfs_find_mountpoint(mnt, MAX_PATH+1)) errx(1, "cannot find cgroup fs mount point"); retlen = snprintf(path, MAX_PATH, "%s/%s", mnt, name); /* ensure generated d2path string is valid */ if (retlen <= 0 || MAX_PATH <= retlen) { warn("Unable to generate path name %s/%s\n", mnt, name); return cfd; } cfd = open(path, O_RDONLY); if (cfd == -1) warn("no access to cgroup %s\n", name); return cfd; } static void handler(int n) { longjmp(jbuf, 1); } int mainloop(char **arg) { static uint64_t ovfl_count = 0; /* static to avoid setjmp issue */ struct pollfd pollfds[1]; int ret; int fd = -1; int i; if (pfm_initialize() != PFM_SUCCESS) errx(1, "libpfm initialization failed\n"); pgsz = sysconf(_SC_PAGESIZE); map_size = (options.mmap_pages+1)*pgsz; if (options.cgroup) { fd = open_cgroup(options.cgroup); if (fd == -1) err(1, "cannot open cgroup file %s\n", options.cgroup); } setup_cpu(options.cpu, fd); /* done with cgroup */ if (fd != -1) close(fd); signal(SIGALRM, handler); signal(SIGINT, handler); pollfds[0].fd = fds[0].fd; pollfds[0].events = POLLIN; printf("monitoring on CPU%d, session ending in %ds\n", options.cpu, options.delay); if (setjmp(jbuf) == 1) goto terminate_session; start_cpu(); alarm(options.delay); /* * core loop */ for(;;) { ret = poll(pollfds, 1, -1); if (ret < 0 && errno == EINTR) break; ovfl_count++; process_smpl_buf(&fds[0]); } terminate_session: for(i=0; i < num_fds; i++) close(fds[i].fd); /* check for partial event buffer */ process_smpl_buf(&fds[0]); munmap(fds[0].buf, map_size); perf_free_fds(fds, num_fds); printf("%"PRIu64" samples collected in %"PRIu64" poll events, %"PRIu64" lost samples\n", collected_samples, ovfl_count, lost_samples); return 0; } static void usage(void) { printf("usage: syst_smpl [-h] [-P] [--help] [-m mmap_pages] [-f] [-e event1,...,eventn] [-c cpu] [-d seconds]\n"); } int main(int argc, char **argv) { int c; 
setlocale(LC_ALL, ""); options.cpu = -1; options.delay = -1; while ((c=getopt_long(argc, argv,"hPe:m:c:d:G:", the_options, 0)) != -1) { switch(c) { case 0: continue; case 'e': if (options.events) errx(1, "events specified twice\n"); options.events = optarg; break; case 'm': if (options.mmap_pages) errx(1, "mmap pages already set\n"); options.mmap_pages = atoi(optarg); break; case 'P': options.pin = 1; break; case 'd': options.delay = atoi(optarg); break; case 'G': options.cgroup = optarg; break; case 'c': options.cpu = atoi(optarg); break; case 'h': usage(); exit(0); default: errx(1, "unknown option"); } } if (!options.events) options.events = strdup(gen_events); if (!options.mmap_pages) options.mmap_pages = 1; if (options.cpu == -1) options.cpu = random() % sysconf(_SC_NPROCESSORS_ONLN); if (options.delay == -1) options.delay = 10; if (options.mmap_pages > 1 && ((options.mmap_pages) & 0x1)) errx(1, "number of pages must be power of 2\n"); return mainloop(argv+optind); } papi-papi-7-2-0-t/src/libpfm4/perf_examples/task.c000066400000000000000000000214071502707512200217300ustar00rootroot00000000000000/* * task_inherit.c - example of a task counting event in a tree of child processes * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include #include #include "perf_util.h" #define MAX_GROUPS 256 typedef struct { const char *events[MAX_GROUPS]; int num_groups; int format_group; int inherit; int print; int pin; pid_t pid; } options_t; static options_t options; static volatile int quit; int child(char **arg) { /* * execute the requested command */ execvp(arg[0], arg); errx(1, "cannot exec: %s\n", arg[0]); /* not reached */ } static void read_groups(perf_event_desc_t *fds, int num) { uint64_t *values = NULL; size_t new_sz, sz = 0; int i, evt; ssize_t ret; /* * { u64 nr; * { u64 time_enabled; } && PERF_FORMAT_ENABLED * { u64 time_running; } && PERF_FORMAT_RUNNING * { u64 value; * { u64 id; } && PERF_FORMAT_ID * } cntr[nr]; * } && PERF_FORMAT_GROUP * * we do not use FORMAT_ID in this program */ for (evt = 0; evt < num; ) { int num_evts_to_read; if (options.format_group) { num_evts_to_read = perf_get_group_nevents(fds, num, evt); new_sz = sizeof(uint64_t) * (3 + num_evts_to_read); } else { num_evts_to_read = 1; new_sz = sizeof(uint64_t) * 3; } if (new_sz > sz) { sz = new_sz; values = realloc(values, sz); } if (!values) err(1, "cannot allocate memory for values\n"); ret = read(fds[evt].fd, values, new_sz); if (ret != (ssize_t)new_sz) { /* unsigned */ if (ret == -1) err(1, "cannot read values event %s", fds[evt].name); /* likely pinned and could not be loaded */ warnx("could not read event %d, tried to read %zu bytes, but got %zd", evt, new_sz, ret); } 
/* * propagate to save area */ for (i = evt; i < (evt + num_evts_to_read); i++) { if (options.format_group) values[0] = values[3 + (i - evt)]; /* * scaling because we may be sharing the PMU and * thus may be multiplexed */ fds[i].values[0] = values[0]; fds[i].values[1] = values[1]; fds[i].values[2] = values[2]; } evt += num_evts_to_read; } if (values) free(values); } static void print_counts(perf_event_desc_t *fds, int num) { double ratio; uint64_t val, delta; int i; read_groups(fds, num); for(i=0; i < num; i++) { val = perf_scale(fds[i].values); delta = perf_scale_delta(fds[i].values, fds[i].prev_values); ratio = perf_scale_ratio(fds[i].values); /* separate groups */ if (perf_is_group_leader(fds, i)) putchar('\n'); if (options.print) printf("%'20"PRIu64" %'20"PRIu64" %s (%.2f%% scaling, ena=%'"PRIu64", run=%'"PRIu64")\n", val, delta, fds[i].name, (1.0-ratio)*100.0, fds[i].values[1], fds[i].values[2]); else printf("%'20"PRIu64" %s (%.2f%% scaling, ena=%'"PRIu64", run=%'"PRIu64")\n", val, fds[i].name, (1.0-ratio)*100.0, fds[i].values[1], fds[i].values[2]); fds[i].prev_values[0] = fds[i].values[0]; fds[i].prev_values[1] = fds[i].values[1]; fds[i].prev_values[2] = fds[i].values[2]; } } static void sig_handler(int n) { quit = 1; } int parent(char **arg) { perf_event_desc_t *fds = NULL; int status, ret, i, num_fds = 0, grp, group_fd = -1; int ready[2], go[2]; uint32_t group_pmu = -1; char buf; pid_t pid; go[0] = go[1] = -1; if (pfm_initialize() != PFM_SUCCESS) errx(1, "libpfm initialization failed"); for (grp = 0; grp < options.num_groups; grp++) { int ret; ret = perf_setup_list_events(options.events[grp], &fds, &num_fds); if (ret || !num_fds) exit(1); } pid = options.pid; if (!pid) { ret = pipe(ready); if (ret) err(1, "cannot create pipe ready"); ret = pipe(go); if (ret) err(1, "cannot create pipe go"); /* * Create the child task */ if ((pid=fork()) == -1) err(1, "Cannot fork process"); /* * and launch the child code * * The pipe is used to avoid a race condition * 
between for() and exec(). We need the pid * of the new tak but we want to start measuring * at the first user level instruction. Thus we * need to prevent exec until we have attached * the events. */ if (pid == 0) { close(ready[0]); close(go[1]); /* * let the parent know we exist */ close(ready[1]); if (read(go[0], &buf, 1) == -1) err(1, "unable to read go_pipe"); exit(child(arg)); } close(ready[1]); close(go[0]); if (read(ready[0], &buf, 1) == -1) err(1, "unable to read child_ready_pipe"); close(ready[0]); } for(i=0; i < num_fds; i++) { int is_group_leader; /* boolean */ /* we can only group events if the belong to the same PMU */ is_group_leader = perf_is_group_leader(fds, i); if (is_group_leader) { /* this is the group leader */ group_fd = -1; group_pmu = fds[i].hw.type; } else if (fds[i].hw.type == group_pmu) { /* same PMU */ group_fd = fds[fds[i].group_leader].fd; } /* * create leader disabled with enable_on-exec */ if (!options.pid) { fds[i].hw.disabled = is_group_leader; fds[i].hw.enable_on_exec = is_group_leader; } fds[i].hw.read_format = PERF_FORMAT_SCALE; /* request timing information necessary for scaling counts */ if (is_group_leader && options.format_group) fds[i].hw.read_format |= PERF_FORMAT_GROUP; if (options.inherit) fds[i].hw.inherit = 1; if (options.pin && is_group_leader) fds[i].hw.pinned = 1; fds[i].fd = perf_event_open(&fds[i].hw, pid, -1, group_fd, 0); if (fds[i].fd == -1) { warn("cannot attach event%d %s", i, fds[i].name); goto error; } } if (!options.pid && go[1] > -1) close(go[1]); if (options.print) { if (!options.pid) { while(waitpid(pid, &status, WNOHANG) == 0) { sleep(1); print_counts(fds, num_fds); } } else { while(quit == 0) { sleep(1); print_counts(fds, num_fds); } } } else { if (!options.pid) waitpid(pid, &status, 0); else pause(); print_counts(fds, num_fds); } for(i=0; i < num_fds; i++) close(fds[i].fd); perf_free_fds(fds, num_fds); /* free libpfm resources cleanly */ pfm_terminate(); return 0; error: free(fds); if (!options.pid) 
kill(SIGKILL, pid); /* free libpfm resources cleanly */ pfm_terminate(); return -1; } static void usage(void) { printf("usage: task [-h] [-i] [-g] [-p] [-P] [-t pid] [-e event1,event2,...] cmd\n" "-h\t\tget help\n" "-i\t\tinherit across fork\n" "-f\t\tuse PERF_FORMAT_GROUP for reading up counts (experimental, not working)\n" "-p\t\tprint counts every second\n" "-P\t\tpin events\n" "-t pid\tmeasure existing pid\n" "-e ev,ev\tgroup of events to measure (multiple -e switches are allowed)\n" ); } int main(int argc, char **argv) { int c; setlocale(LC_ALL, ""); while ((c=getopt(argc, argv,"+he:ifpPt:")) != -1) { switch(c) { case 'e': if (options.num_groups < MAX_GROUPS) { options.events[options.num_groups++] = optarg; } else { errx(1, "you cannot specify more than %d groups.\n", MAX_GROUPS); } break; case 'f': options.format_group = 1; break; case 'p': options.print = 1; break; case 'P': options.pin = 1; break; case 'i': options.inherit = 1; break; case 't': options.pid = atoi(optarg); break; case 'h': usage(); exit(0); default: errx(1, "unknown error"); } } if (options.num_groups == 0) { options.events[0] = "cycles:u,instructions:u"; options.num_groups = 1; } if (!argv[optind] && !options.pid) errx(1, "you must specify a command to execute or a thread to attach to\n"); signal(SIGINT, sig_handler); return parent(argv+optind); } papi-papi-7-2-0-t/src/libpfm4/perf_examples/task_attach_timeout.c000066400000000000000000000116541502707512200250250ustar00rootroot00000000000000/* * task_attach_timeout.c - attach to another task for monitoring for a short while * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Based on: * Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. 
* Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include "perf_util.h" typedef struct { char *events; int delay; int print; int group; int pinned; } options_t; static options_t options; static void print_counts(perf_event_desc_t *fds, int num, int do_delta) { ssize_t ret; int i; /* * now simply read the results. */ for(i=0; i < num; i++) { uint64_t val; double ratio; ret = read(fds[i].fd, fds[i].values, sizeof(fds[i].values)); if (ret < (ssize_t)sizeof(fds[i].values)) { if (ret == -1) err(1, "cannot read values event %s", fds[i].name); else warnx("could not read event%d", i); } val = perf_scale(fds[i].values); ratio = perf_scale_ratio(fds[i].values); val = do_delta ? 
perf_scale_delta(fds[i].values, fds[i].prev_values) : val; fds[i].prev_values[0] = fds[i].values[0]; fds[i].prev_values[1] = fds[i].values[1]; fds[i].prev_values[2] = fds[i].values[2]; if (ratio == 1.0) printf("%20"PRIu64" %s\n", val, fds[i].name); else if (ratio == 0.0) printf("%20"PRIu64" %s (did not run: incompatible events, too many events in a group, competing session)\n", val, fds[i].name); else printf("%20"PRIu64" %s (scaled from %.2f%% of time)\n", val, fds[i].name, ratio*100.0); } } int measure(pid_t pid) { perf_event_desc_t *fds = NULL; int i, ret, num_fds = 0; char fn[32]; if (pfm_initialize() != PFM_SUCCESS) errx(1, "libpfm initialization failed\n"); ret = perf_setup_list_events(options.events, &fds, &num_fds); if (ret || (num_fds == 0)) exit(1); fds[0].fd = -1; for(i=0; i < num_fds; i++) { fds[i].hw.disabled = 0; /* start immediately */ /* request timing information necessary for scaling counts */ fds[i].hw.read_format = PERF_FORMAT_SCALE; fds[i].hw.pinned = !i && options.pinned; fds[i].fd = perf_event_open(&fds[i].hw, pid, -1, (options.group? fds[0].fd : -1), 0); if (fds[i].fd == -1) errx(1, "cannot attach event %s", fds[i].name); } /* * no notification is generated by perf_counters * when the monitored thread exits. Thus we need * to poll /proc/ to detect it has disappeared, * otherwise we have to wait until the end of the * timeout */ sprintf(fn, "/proc/%d/status", pid); while(access(fn, F_OK) == 0 && options.delay) { sleep(1); options.delay--; if (options.print) print_counts(fds, num_fds, 1); } if (options.delay) warn("thread %d terminated before timeout", pid); if (!options.print) print_counts(fds, num_fds, 0); for(i=0; i < num_fds; i++) close(fds[i].fd); perf_free_fds(fds, num_fds); /* free libpfm resources cleanly */ pfm_terminate(); return 0; } static void usage(void) { printf("usage: task_attach_timeout [-h] [-p] [-P] [-g] [-d delay] [-e event1,event2,...] 
pid\n"); } int main(int argc, char **argv) { int c; while ((c=getopt(argc, argv,"he:vd:pgP")) != -1) { switch(c) { case 'e': options.events = optarg; break; case 'p': options.print = 1; break; case 'P': options.pinned = 1; break; case 'g': options.group = 1; break; case 'd': options.delay = atoi(optarg); break; case 'h': usage(); exit(0); default: errx(1, "unknown error"); } } if (!options.events) options.events = strdup("cycles:u,instructions:u"); if (options.delay < 1) options.delay = 10; if (!argv[optind]) errx(1, "you must specify pid to attach to\n"); return measure(atoi(argv[optind])); } papi-papi-7-2-0-t/src/libpfm4/perf_examples/task_cpu.c000066400000000000000000000227371502707512200226060ustar00rootroot00000000000000/* * task_cpu.c - example of per-thread remote monitoring with per-cpu breakdown * * Copyright (c) 2010 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include #include #include #include #include "perf_util.h" #define MAX_GROUPS 256 #define MAX_CPUS 64 typedef struct { const char *events[MAX_GROUPS]; int num_groups; int format_group; int inherit; int print; int pin; int ncpus; pid_t pid; } options_t; static options_t options; static volatile int quit; int child(char **arg) { /* * execute the requested command */ execvp(arg[0], arg); errx(1, "cannot exec: %s\n", arg[0]); /* not reached */ } static void read_groups(perf_event_desc_t *fds, int num) { uint64_t *values = NULL; size_t new_sz, sz = 0; int i, evt; ssize_t ret; /* * { u64 nr; * { u64 time_enabled; } && PERF_FORMAT_ENABLED * { u64 time_running; } && PERF_FORMAT_RUNNING * { u64 value; * { u64 id; } && PERF_FORMAT_ID * } cntr[nr]; * } && PERF_FORMAT_GROUP * * we do not use FORMAT_ID in this program */ for (evt = 0; evt < num; ) { int num_evts_to_read; if (options.format_group) { num_evts_to_read = perf_get_group_nevents(fds, num, evt); new_sz = sizeof(uint64_t) * (3 + num_evts_to_read); } else { num_evts_to_read = 1; new_sz = sizeof(uint64_t) * 3; } if (new_sz > sz) { sz = new_sz; values = realloc(values, sz); } if (!values) err(1, "cannot allocate memory for values\n"); ret = read(fds[evt].fd, values, new_sz); if (ret != (ssize_t)new_sz) { /* unsigned */ if (ret == -1) err(1, "cannot read values event %s", fds[evt].name); /* likely pinned and could not be loaded */ warnx("could not read event %d, tried to read %zu bytes, but got %zd", evt, new_sz, ret); } /* * propagate to save area */ for (i = evt; i < (evt + num_evts_to_read); i++) { if (options.format_group) values[0] = values[3 + (i - evt)]; /* * scaling because we may be sharing the PMU and * thus may be multiplexed */ fds[i].values[0] = values[0]; fds[i].values[1] = values[1]; fds[i].values[2] = values[2]; } evt += num_evts_to_read; } if (values) free(values); } static void print_counts(perf_event_desc_t *fds, int num, int cpu) 
{ double ratio; uint64_t val, delta; int i; read_groups(fds, num); for(i=0; i < num; i++) { val = perf_scale(fds[i].values); delta = perf_scale_delta(fds[i].values, fds[i].prev_values); ratio = perf_scale_ratio(fds[i].values); /* separate groups */ if (perf_is_group_leader(fds, i)) putchar('\n'); if (options.print) printf("CPU%-2d %'20"PRIu64" %'20"PRIu64" %s (%.2f%% scaling, ena=%'"PRIu64", run=%'"PRIu64")\n", cpu, val, delta, fds[i].name, (1.0-ratio)*100.0, fds[i].values[1], fds[i].values[2]); else printf("CPU%-2d %'20"PRIu64" %s (%.2f%% scaling, ena=%'"PRIu64", run=%'"PRIu64")\n", cpu, val, fds[i].name, (1.0-ratio)*100.0, fds[i].values[1], fds[i].values[2]); } } static void sig_handler(int n) { quit = 1; } int parent(char **arg) { perf_event_desc_t *fds, *fds_cpus[MAX_CPUS]; int status, ret, i, num_fds = 0, grp, group_fd; int ready[2], go[2], cpu; char buf; pid_t pid; go[0] = go[1] = -1; if (pfm_initialize() != PFM_SUCCESS) errx(1, "libpfm initialization failed"); if (options.ncpus >= MAX_CPUS) errx(1, "maximum number of cpus exceeded (%d)", MAX_CPUS); memset(fds_cpus, 0, sizeof(fds_cpus)); for (cpu=0; cpu < options.ncpus; cpu++) { for (grp = 0; grp < options.num_groups; grp++) { num_fds = 0; ret = perf_setup_list_events(options.events[grp], &fds_cpus[cpu], &num_fds); if (ret || !num_fds) exit(1); } } pid = options.pid; if (!pid) { ret = pipe(ready); if (ret) err(1, "cannot create pipe ready"); ret = pipe(go); if (ret) err(1, "cannot create pipe go"); /* * Create the child task */ if ((pid=fork()) == -1) err(1, "Cannot fork process"); /* * and launch the child code * * The pipe is used to avoid a race condition * between for() and exec(). We need the pid * of the new tak but we want to start measuring * at the first user level instruction. Thus we * need to prevent exec until we have attached * the events. 
*/ if (pid == 0) { close(ready[0]); close(go[1]); /* * let the parent know we exist */ close(ready[1]); if (read(go[0], &buf, 1) == -1) err(1, "unable to read go_pipe"); exit(child(arg)); } close(ready[1]); close(go[0]); if (read(ready[0], &buf, 1) == -1) err(1, "unable to read child_ready_pipe"); close(ready[0]); } for (cpu=0; cpu < options.ncpus; cpu++) { fds = fds_cpus[cpu]; for(i=0; i < num_fds; i++) { int is_group_leader; /* boolean */ is_group_leader = perf_is_group_leader(fds, i); if (is_group_leader) { /* this is the group leader */ group_fd = -1; } else { group_fd = fds[fds[i].group_leader].fd; } /* * create leader disabled with enable_on-exec */ if (!options.pid) { fds[i].hw.disabled = is_group_leader; fds[i].hw.enable_on_exec = is_group_leader; } fds[i].hw.read_format = PERF_FORMAT_SCALE; /* request timing information necessary for scaling counts */ if (is_group_leader && options.format_group) fds[i].hw.read_format |= PERF_FORMAT_GROUP; if (options.inherit) fds[i].hw.inherit = 1; if (options.pin && is_group_leader) fds[i].hw.pinned = 1; fds[i].fd = perf_event_open(&fds[i].hw, pid, cpu, group_fd, 0); if (fds[i].fd == -1) { warn("cannot attach event%d %s", i, fds[i].name); goto error; } } } if (!options.pid && go[1] > -1) close(go[1]); if (options.print) { if (!options.pid) { while(waitpid(pid, &status, WNOHANG) == 0) { sleep(1); for (cpu=0; cpu < options.ncpus; cpu++) { fds = fds_cpus[cpu]; print_counts(fds, num_fds, cpu); } } } else { while(quit == 0) { sleep(1); for (cpu=0; cpu < options.ncpus; cpu++) { fds = fds_cpus[cpu]; print_counts(fds, num_fds, cpu); } } } } else { if (!options.pid) waitpid(pid, &status, 0); else { pause(); for (cpu=0; cpu < options.ncpus; cpu++) { fds = fds_cpus[cpu]; for(i=0; i < num_fds; i++) ioctl(fds[i].fd, PERF_EVENT_IOC_DISABLE, 0); } } for (cpu=0; cpu < options.ncpus; cpu++) { fds = fds_cpus[cpu]; print_counts(fds, num_fds, cpu); } } for (cpu=0; cpu < options.ncpus; cpu++) { fds = fds_cpus[cpu]; for(i=0; i < num_fds; i++) 
close(fds[i].fd); perf_free_fds(fds, num_fds); } /* free libpfm resources cleanly */ pfm_terminate(); return 0; error: free(fds); if (!options.pid) kill(SIGKILL, pid); /* free libpfm resources cleanly */ pfm_terminate(); return -1; } static void usage(void) { printf("usage: task_cpu [-h] [-i] [-g] [-p] [-P] [-t pid] [-e event1,event2,...] cmd\n" "-h\t\tget help\n" "-i\t\tinherit across fork\n" "-f\t\tuse PERF_FORMAT_GROUP for reading up counts (experimental, not working)\n" "-p\t\tprint counts every second\n" "-P\t\tpin events\n" "-t pid\tmeasure existing pid\n" "-e ev,ev\tgroup of events to measure (multiple -e switches are allowed)\n" ); } int main(int argc, char **argv) { int c; setlocale(LC_ALL, ""); while ((c=getopt(argc, argv,"+he:ifpPt:")) != -1) { switch(c) { case 'e': if (options.num_groups < MAX_GROUPS) { options.events[options.num_groups++] = optarg; } else { errx(1, "you cannot specify more than %d groups.\n", MAX_GROUPS); } break; case 'f': options.format_group = 1; break; case 'p': options.print = 1; break; case 'P': options.pin = 1; break; case 'i': options.inherit = 1; break; case 't': options.pid = atoi(optarg); break; case 'h': usage(); exit(0); default: errx(1, "unknown error"); } } options.ncpus = sysconf(_SC_NPROCESSORS_ONLN); if (options.ncpus < 1) errx(1, "cannot determine number of online processors"); if (options.num_groups == 0) { options.events[0] = "cycles:u,instructions:u"; options.num_groups = 1; } if (!argv[optind] && !options.pid) errx(1, "you must specify a command to execute or a thread to attach to\n"); signal(SIGINT, sig_handler); return parent(argv+optind); } papi-papi-7-2-0-t/src/libpfm4/perf_examples/task_smpl.c000066400000000000000000000234321502707512200227630ustar00rootroot00000000000000/* * task_smpl.c - example of a task sampling another one using a randomized sampling period * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Based on: * Copyright (c) 2003-2006 Hewlett-Packard Development Company, 
L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "perf_util.h" #define SMPL_PERIOD 240000000ULL typedef struct { int opt_no_show; int opt_inherit; int mem_mode; int branch_mode; int cpu; int mmap_pages; char *events; FILE *output_file; } options_t; static jmp_buf jbuf; static uint64_t collected_samples, lost_samples; static perf_event_desc_t *fds; static int num_fds; static options_t options; static struct option the_options[]={ { "help", 0, 0, 1}, { "no-show", 0, &options.opt_no_show, 1}, { 0, 0, 0, 0} }; static char *gen_events = "cycles:u:freq=100,instructions:u:freq=100"; static void cld_handler(int n) { longjmp(jbuf, 1); } int child(char **arg) { execvp(arg[0], arg); /* not reached */ return -1; } struct timeval last_read, this_read; static void process_smpl_buf(perf_event_desc_t *hw) { struct perf_event_header ehdr; int ret; for(;;) { ret = perf_read_buffer(hw, &ehdr, sizeof(ehdr)); if (ret) return; /* nothing to read */ if (options.opt_no_show) { perf_skip_buffer(hw, ehdr.size - sizeof(ehdr)); continue; } switch(ehdr.type) { case PERF_RECORD_SAMPLE: collected_samples++; ret = perf_display_sample(fds, num_fds, hw - fds, &ehdr, options.output_file); if (ret) errx(1, "cannot parse sample"); break; case PERF_RECORD_EXIT: display_exit(hw, options.output_file); break; case PERF_RECORD_LOST: lost_samples += display_lost(hw, fds, num_fds, options.output_file); break; case PERF_RECORD_THROTTLE: display_freq(1, hw, options.output_file); break; case PERF_RECORD_UNTHROTTLE: display_freq(0, hw, options.output_file); break; default: printf("unknown sample type %d\n", ehdr.type); perf_skip_buffer(hw, ehdr.size - sizeof(ehdr)); } } } int mainloop(char **arg) { static uint64_t ovfl_count; /* static to avoid setjmp issue */ struct pollfd pollfds[1]; sigset_t bmask; int go[2], ready[2]; size_t pgsz; size_t map_size = 0; pid_t pid; int status, ret; int i; char 
buf; if (pfm_initialize() != PFM_SUCCESS) errx(1, "libpfm initialization failed\n"); pgsz = sysconf(_SC_PAGESIZE); map_size = (options.mmap_pages+1)*pgsz; /* * does allocate fds */ ret = perf_setup_list_events(options.events, &fds, &num_fds); if (ret || !num_fds) errx(1, "cannot setup event list"); memset(pollfds, 0, sizeof(pollfds)); ret = pipe(ready); if (ret) err(1, "cannot create pipe ready"); ret = pipe(go); if (ret) err(1, "cannot create pipe go"); /* * Create the child task */ if ((pid=fork()) == -1) err(1, "cannot fork process\n"); if (pid == 0) { close(ready[0]); close(go[1]); /* * let the parent know we exist */ close(ready[1]); if (read(go[0], &buf, 1) == -1) err(1, "unable to read go_pipe"); exit(child(arg)); } close(ready[1]); close(go[0]); if (read(ready[0], &buf, 1) == -1) err(1, "unable to read child_ready_pipe"); close(ready[0]); fds[0].fd = -1; if (!fds[0].hw.sample_period) errx(1, "need to set sampling period or freq on first event, use :period= or :freq="); for(i=0; i < num_fds; i++) { if (i == 0) { fds[i].hw.disabled = 1; fds[i].hw.enable_on_exec = 1; /* start immediately */ } else fds[i].hw.disabled = 0; if (options.opt_inherit) fds[i].hw.inherit = 1; if (fds[i].hw.sample_period) { /* * set notification threshold to be halfway through the buffer */ fds[i].hw.wakeup_watermark = (options.mmap_pages*pgsz) / 2; fds[i].hw.watermark = 1; fds[i].hw.sample_type = PERF_SAMPLE_IP|PERF_SAMPLE_TID|PERF_SAMPLE_READ|PERF_SAMPLE_TIME|PERF_SAMPLE_PERIOD; /* * if we have more than one event, then record event identifier to help with parsing */ if (num_fds > 1) fds[i].hw.sample_type |= PERF_SAMPLE_IDENTIFIER; fprintf(options.output_file,"%s period=%"PRIu64" freq=%d\n", fds[i].name, fds[i].hw.sample_period, fds[i].hw.freq); fds[i].hw.read_format = PERF_FORMAT_SCALE; if (fds[i].hw.freq) fds[i].hw.sample_type |= PERF_SAMPLE_PERIOD; if (options.mem_mode) fds[i].hw.sample_type |= PERF_SAMPLE_WEIGHT | PERF_SAMPLE_DATA_SRC | PERF_SAMPLE_ADDR; if (options.branch_mode) 
{ fds[i].hw.sample_type |= PERF_SAMPLE_BRANCH_STACK; fds[i].hw.branch_sample_type = PERF_SAMPLE_BRANCH_ANY; } } /* * we are grouping the events, so there may be a limit */ fds[i].fd = perf_event_open(&fds[i].hw, pid, options.cpu, fds[0].fd, 0); if (fds[i].fd == -1) { if (fds[i].hw.precise_ip) err(1, "cannot attach event %s: precise mode may not be supported", fds[i].name); err(1, "cannot attach event %s", fds[i].name); } } /* * kernel adds the header page to the size of the mmapped region */ fds[0].buf = mmap(NULL, map_size, PROT_READ|PROT_WRITE, MAP_SHARED, fds[0].fd, 0); if (fds[0].buf == MAP_FAILED) err(1, "cannot mmap buffer"); /* does not include header page */ fds[0].pgmsk = (options.mmap_pages*pgsz)-1; /* * send samples for all events to first event's buffer */ for (i = 1; i < num_fds; i++) { if (!fds[i].hw.sample_period) continue; ret = ioctl(fds[i].fd, PERF_EVENT_IOC_SET_OUTPUT, fds[0].fd); if (ret) err(1, "cannot redirect sampling output"); } if (num_fds > 1 && fds[0].fd > -1) { for(i = 0; i < num_fds; i++) { /* * read the event identifier using ioctl * new method replaced the trick with PERF_FORMAT_GROUP + PERF_FORMAT_ID + read() */ ret = ioctl(fds[i].fd, PERF_EVENT_IOC_ID, &fds[i].id); if (ret == -1) err(1, "cannot read ID"); fprintf(options.output_file,"ID %"PRIu64" %s\n", fds[i].id, fds[i].name); } } pollfds[0].fd = fds[0].fd; pollfds[0].events = POLLIN; for(i=0; i < num_fds; i++) { ret = ioctl(fds[i].fd, PERF_EVENT_IOC_ENABLE, 0); if (ret) err(1, "cannot enable event %s\n", fds[i].name); } signal(SIGCHLD, cld_handler); close(go[1]); if (setjmp(jbuf) == 1) goto terminate_session; sigemptyset(&bmask); sigaddset(&bmask, SIGCHLD); /* * core loop */ for(;;) { ret = poll(pollfds, 1, -1); if (ret < 0 && errno == EINTR) break; ovfl_count++; ret = sigprocmask(SIG_SETMASK, &bmask, NULL); if (ret) err(1, "setmask"); process_smpl_buf(&fds[0]); ret = sigprocmask(SIG_UNBLOCK, &bmask, NULL); if (ret) err(1, "unblock"); } terminate_session: /* * cleanup child */ 
wait4(pid, &status, 0, NULL); for(i=0; i < num_fds; i++) close(fds[i].fd); /* check for partial event buffer */ process_smpl_buf(&fds[0]); munmap(fds[0].buf, map_size); perf_free_fds(fds, num_fds); fprintf(options.output_file, "%"PRIu64" samples collected in %"PRIu64" poll events, %"PRIu64" lost samples\n", collected_samples, ovfl_count, lost_samples); /* free libpfm resources cleanly */ pfm_terminate(); fclose(options.output_file); return 0; } static void usage(void) { printf("usage: task_smpl [-h] [--help] [-i] [-c cpu] [-m mmap_pages] [-M] [-b] [-o output_file] [-e event1,...,eventn] cmd\n"); } int main(int argc, char **argv) { int c; setlocale(LC_ALL, ""); options.cpu = -1; options.output_file=stdout; while ((c=getopt_long(argc, argv,"+he:m:ic:o:Mb", the_options, 0)) != -1) { switch(c) { case 0: continue; case 'e': if (options.events) errx(1, "events specified twice\n"); options.events = optarg; break; case 'i': options.opt_inherit = 1; break; case 'm': if (options.mmap_pages) errx(1, "mmap pages already set\n"); options.mmap_pages = atoi(optarg); break; case 'M': options.mem_mode = 1; break; case 'b': options.branch_mode = 1; break; case 'c': options.cpu = atoi(optarg); break; case 'o': options.output_file=fopen(optarg,"w"); if (options.output_file==NULL) { printf("Invalid filename %s\n", optarg); exit(0); } break; case 'h': usage(); exit(0); default: errx(1, "unknown option"); } } if (argv[optind] == NULL) errx(1, "you must specify a command to execute\n"); if (!options.events) options.events = strdup(gen_events); if (!options.mmap_pages) options.mmap_pages = 1; if (options.mmap_pages > 1 && ((options.mmap_pages) & 0x1)) errx(1, "number of pages must be power of 2\n"); return mainloop(argv+optind); } 
papi-papi-7-2-0-t/src/libpfm4/perf_examples/x86/000077500000000000000000000000001502707512200212435ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libpfm4/perf_examples/x86/Makefile000066400000000000000000000036621502707512200227120ustar00rootroot00000000000000# # Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. # Contributed by Stephane Eranian # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # TOPDIR := $(shell if [ "$$PWD" != "" ]; then echo $$PWD; else pwd; fi)/../.. include $(TOPDIR)/config.mk include $(TOPDIR)/rules.mk CFLAGS+= -I. -D_GNU_SOURCE -I.. 
LIBS += -lm ifeq ($(SYS),Linux) CFLAGS+= -pthread endif TARGETS= ifeq ($(SYS),Linux) LPC_UTILS=../perf_util.o TARGETS += bts_smpl endif EXAMPLESDIR=$(DOCDIR)/perf_examples/x86 all: $(TARGETS) $(TARGETS): %:%.o $(LPC_UTILS) $(PFMLIB) $(CC) $(CFLAGS) -o $@ $(LDFLAGS) $^ $(LIBS) clean: $(RM) -f *.o $(TARGETS) *~ distclean: clean install_examples: $(TARGETS) @echo installing: $(TARGETS) -mkdir -p $(DESTDIR)$(EXAMPLESDIR) $(INSTALL) -m 755 $(TARGETS) $(TARGET_GEN) $(DESTDIR)$(EXAMPLESDIR) @set -e ; for d in $(DIRS) ; do $(MAKE) -C $$d $@ ; done .PHONY: install depend install_examples papi-papi-7-2-0-t/src/libpfm4/perf_examples/x86/bts_smpl.c000066400000000000000000000154441502707512200232420ustar00rootroot00000000000000/* * bts_smpl.c - example of Intel Branch Trace Stack sampling * * Copyright (c) 2009 Google, Inc * Contributed by Stephane Eranian * * Based on: * Copyright (c) 2003-2006 Hewlett-Packard Development Company, L.P. * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "perf_util.h" #define SMPL_PERIOD 24000000ULL typedef struct { int opt_no_show; int opt_inherit; int mmap_pages; } options_t; static jmp_buf jbuf; static uint64_t collected_samples, lost_samples; static perf_event_desc_t *fds; static int num_fds; static options_t options; static struct option the_options[]={ { "help", 0, 0, 1}, { "no-show", 0, &options.opt_no_show, 1}, { 0, 0, 0, 0} }; static void cld_handler(int n) { longjmp(jbuf, 1); } int child(char **arg) { /* * force the task to stop before executing the first * user level instruction */ ptrace(PTRACE_TRACEME, 0, NULL, NULL); execvp(arg[0], arg); /* not reached */ return -1; } struct timeval last_read, this_read; static void process_smpl_buf(perf_event_desc_t *hw) { struct perf_event_header ehdr; int ret; for(;;) { ret = perf_read_buffer(hw, &ehdr, sizeof(ehdr)); if (ret) return; /* nothing to read */ switch(ehdr.type) { case PERF_RECORD_SAMPLE: perf_display_sample(fds, num_fds, hw - fds, &ehdr, stdout); collected_samples++; break; case PERF_RECORD_EXIT: display_exit(hw, stdout); break; case PERF_RECORD_LOST: display_lost(hw, fds, num_fds, stdout); break; case PERF_RECORD_THROTTLE: display_freq(1, hw, stdout); break; case PERF_RECORD_UNTHROTTLE: display_freq(0, hw, stdout); break; default: printf("unknown sample type %d sz=%d\n", ehdr.type, ehdr.size); perf_skip_buffer(hw, ehdr.size - sizeof(ehdr)); } } } int mainloop(char **arg) { static uint64_t ovfl_count; /* static to avoid setjmp issue */ struct pollfd pollfds[1]; size_t map_size = 0; sigset_t bmask; pid_t 
pid; uint64_t val[2]; int status, ret; if (pfm_initialize() != PFM_SUCCESS) errx(1, "libpfm initialization failed\n"); map_size = (options.mmap_pages+1)*getpagesize(); /* * does allocate fds */ ret = perf_setup_list_events("branches:u", &fds, &num_fds); if (ret || !num_fds) errx(1, "cannot setup event"); memset(pollfds, 0, sizeof(pollfds)); /* * Create the child task */ if ((pid=fork()) == -1) err(1, "cannot fork process\n"); if (pid == 0) exit(child(arg)); /* * wait for the child to exec */ ret = waitpid(pid, &status, WUNTRACED); if (ret == -1) err(1, "waitpid failed"); if (WIFEXITED(status)) errx(1, "task %s [%d] exited already status %d\n", arg[0], pid, WEXITSTATUS(status)); fds[0].fd = -1; fds[0].hw.disabled = 0; /* start immediately */ if (options.opt_inherit) fds[0].hw.inherit = 1; fds[0].hw.sample_type = PERF_SAMPLE_IP|PERF_SAMPLE_ADDR; /* * BTS only supported at user level */ if (fds[0].hw.exclude_user ||fds[0].hw.exclude_kernel == 0) errx(1, "BTS currently supported only at the user level\n"); /* * period MUST be one to trigger BTS: tracing not sampling anymore */ fds[0].hw.sample_period = 1; fds[0].hw.exclude_kernel = 1; fds[0].hw.exclude_hv = 1; fds[0].hw.read_format |= PERF_FORMAT_ID; fds[0].fd = perf_event_open(&fds[0].hw, pid, -1, -1, 0); if (fds[0].fd == -1) err(1, "cannot attach event %s", fds[0].name); fds[0].buf = mmap(NULL, map_size, PROT_READ|PROT_WRITE, MAP_SHARED, fds[0].fd, 0); if (fds[0].buf == MAP_FAILED) err(1, "cannot mmap buffer"); /* does not include header page */ fds[0].pgmsk = (options.mmap_pages*getpagesize())-1; ret = read(fds[0].fd, val, sizeof(val)); if (ret == -1) err(1, "cannot read id %zu", sizeof(val)); fds[0].id = val[1]; printf("%"PRIu64" %s\n", fds[0].id, fds[0].name); /* * effectively activate monitoring */ ptrace(PTRACE_DETACH, pid, NULL, 0); signal(SIGCHLD, cld_handler); pollfds[0].fd = fds[0].fd; pollfds[0].events = POLLIN; if (setjmp(jbuf) == 1) goto terminate_session; sigemptyset(&bmask); sigaddset(&bmask, SIGCHLD); 
/* * core loop */ for(;;) { ret = poll(pollfds, 1, -1); if (ret < 0 && errno == EINTR) break; ovfl_count++; ret = sigprocmask(SIG_SETMASK, &bmask, NULL); if (ret) err(1, "setmask"); process_smpl_buf(&fds[0]); ret = sigprocmask(SIG_UNBLOCK, &bmask, NULL); if (ret) err(1, "unblock"); } terminate_session: /* * cleanup child */ wait4(pid, &status, 0, NULL); close(fds[0].fd); /* check for partial event buffer */ process_smpl_buf(&fds[0]); munmap(fds[0].buf, map_size); free(fds); printf("%"PRIu64" samples collected in %"PRIu64" poll events, %"PRIu64" lost samples\n", collected_samples, ovfl_count, lost_samples); return 0; } static void usage(void) { printf("usage: bts_smpl [-h] [--help] [-i] [-m mmap_pages] cmd\n"); } int main(int argc, char **argv) { int c; while ((c=getopt_long(argc, argv,"+hm:p:if", the_options, 0)) != -1) { switch(c) { case 0: continue; case 'i': options.opt_inherit = 1; break; case 'm': if (options.mmap_pages) errx(1, "mmap pages already set\n"); options.mmap_pages = atoi(optarg); break; case 'h': usage(); exit(0); default: errx(1, "unknown option"); } } if (argv[optind] == NULL) errx(1, "you must specify a command to execute\n"); if (!options.mmap_pages) options.mmap_pages = 4; return mainloop(argv+optind); } papi-papi-7-2-0-t/src/libpfm4/python/000077500000000000000000000000001502707512200173055ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libpfm4/python/Makefile000066400000000000000000000027651502707512200207570ustar00rootroot00000000000000# # Copyright (c) 2008 Google, Inc. 
# Contributed by Arun Sharma # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # TOPDIR := $(shell if [ "$$PWD" != "" ]; then echo $$PWD; else pwd; fi)/.. include $(TOPDIR)/config.mk include $(TOPDIR)/rules.mk PYTHON_PREFIX=$(PREFIX) all: CFLAGS="-O2 -g" ./setup.py build install: CFLAGS="-O2 -g" ./setup.py install --prefix=$(DESTDIR)$(PYTHON_PREFIX) clean: $(RM) src/perfmon_int_wrap.c src/perfmon_int.py src/*.pyc $(RM) -r build papi-papi-7-2-0-t/src/libpfm4/python/README000066400000000000000000000004101502707512200201600ustar00rootroot00000000000000Requirements: To use the python bindings, you need the following packages: 1. swig (http://www.swig.org) 2. python-dev (http://www.python.org) 3. module-linux (http://code.google.com/p/module-linux) linux.sched is python package that comes with module-linux. 
papi-papi-7-2-0-t/src/libpfm4/python/self.py000077500000000000000000000041441502707512200206160ustar00rootroot00000000000000#!/usr/bin/env python # # Copyright (c) 2008 Google, Inc. # Contributed by Arun Sharma # # Permission is hereby granted, free of charge, to any person obtaining a # copy of this software and associated documentation files (the "Software"), # to deal in the Software without restriction, including without limitation # the rights to use, copy, modify, merge, publish, distribute, sublicense, # and/or sell copies of the Software, and to permit persons to whom the # Software is furnished to do so, subject to the following conditions: # # The above copyright notice and this permission notice shall be included # in all copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL # THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR # OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, # ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR # OTHER DEALINGS IN THE SOFTWARE. # # Self monitoring example. 
Copied from self.c from __future__ import print_function import os import optparse import random import errno import struct import perfmon if __name__ == '__main__': parser = optparse.OptionParser() parser.add_option("-e", "--events", help="Events to use", action="store", dest="events") parser.set_defaults(events="PERF_COUNT_HW_CPU_CYCLES") (options, args) = parser.parse_args() if options.events: events = options.events.split(",") else: raise Exception("You need to specify events to monitor") s = perfmon.PerThreadSession(int(os.getpid()), events) s.start() # code to be measured # # note that this is not identical to what examples/self.c does # thus counts will be different in the end for i in range(1, 1000000): random.random() # read the counts for i in range(0, len(events)): count = struct.unpack("L", s.read(i))[0] print("""%s\t%lu""" % (events[i], count)) papi-papi-7-2-0-t/src/libpfm4/python/setup.py000077500000000000000000000012401502707512200210170ustar00rootroot00000000000000#!/usr/bin/env python from distutils.core import setup, Extension from distutils.command.install_data import install_data setup(name='perfmon', version='4.0', author='Arun Sharma', author_email='arun.sharma@google.com', description='libpfm wrapper', packages=['perfmon'], package_dir={ 'perfmon' : 'src' }, py_modules=['perfmon.perfmon_int'], ext_modules=[Extension('perfmon._perfmon_int', sources = ['src/perfmon_int.i'], libraries = ['pfm'], library_dirs = ['../lib'], include_dirs = ['../include'], swig_opts=['-I../include'])]) papi-papi-7-2-0-t/src/libpfm4/python/src/000077500000000000000000000000001502707512200200745ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libpfm4/python/src/__init__.py000066400000000000000000000001241502707512200222020ustar00rootroot00000000000000from perfmon_int import * from pmu import * from session import * pfm_initialize() papi-papi-7-2-0-t/src/libpfm4/python/src/perfmon_int.i000066400000000000000000000061701502707512200225720ustar00rootroot00000000000000/* 
* * Copyright (c) 2008 Google, Inc. * Contributed by Arun Sharma * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), * to deal in the Software without restriction, including without limitation * the rights to use, copy, modify, merge, publish, distribute, sublicense, * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included * in all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR * OTHER DEALINGS IN THE SOFTWARE. * * Python Bindings for perfmon. 
*/ %module perfmon_int %{ #include #define SWIG #include #include #include static PyObject *libpfm_err; %} %include "typemaps.i" %include "carrays.i" %include "cstring.i" %include /* Convert libpfm errors into exceptions */ %typemap(out) os_err_t { if (result == -1) { PyErr_SetFromErrno(PyExc_OSError); SWIG_fail; } resultobj = SWIG_From_int((int)(result)); }; %typemap(out) pfm_err_t { if (result != PFM_SUCCESS) { PyObject *obj = Py_BuildValue("(i,s)", result, pfm_strerror(result)); PyErr_SetObject(libpfm_err, obj); SWIG_fail; } else { PyErr_Clear(); } resultobj = SWIG_From_int((int)(result)); } /* Generic return structures via pointer output arguments */ %define ptr_argout(T) %typemap(argout) T* output { if (!PyTuple_Check($result)) { PyObject *x = $result; $result = PyTuple_New(1); PyTuple_SET_ITEM($result, 0, x); } PyObject *o = SWIG_NewPointerObj((void *)$1, $descriptor, 0); $result = SWIG_AppendOutput($result, o); } %typemap(in, numinputs=0) T* output { $1 = (T*) malloc(sizeof(T)); memset($1, 0, sizeof(T)); } %extend T { ~T() { free(self); } } %enddef ptr_argout(pfm_pmu_info_t); ptr_argout(pfm_event_info_t); ptr_argout(pfm_event_attr_info_t); %typedef int pid_t; /* Kernel interface */ %include ptr_argout(perf_event_attr_t); /* Library interface */ /* We never set the const char * members. So no memory leak */ #pragma SWIG nowarn=451 %include /* OS specific library interface */ extern pfm_err_t pfm_get_perf_event_encoding(const char *str, int dfl_plm, perf_event_attr_t *output, char **fstr, int *idx); %init %{ libpfm_err = PyErr_NewException("perfmon.libpfmError", NULL, NULL); PyDict_SetItemString(d, "libpfmError", libpfm_err); %} papi-papi-7-2-0-t/src/libpfm4/python/src/pmu.py000066400000000000000000000056651502707512200212630ustar00rootroot00000000000000# # Copyright (c) 2008 Google, Inc. 
# Contributed by Arun Sharma # # Permission is hereby granted, free of charge, to any person obtaining a # copy of this software and associated documentation files (the "Software"), # to deal in the Software without restriction, including without limitation # the rights to use, copy, modify, merge, publish, distribute, sublicense, # and/or sell copies of the Software, and to permit persons to whom the # Software is furnished to do so, subject to the following conditions: # # The above copyright notice and this permission notice shall be included # in all copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL # THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR # OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, # ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR # OTHER DEALINGS IN THE SOFTWARE. 
# from __future__ import print_function import os from perfmon import * def public_members(self): s = "{ " for k, v in self.__dict__.items(): if not k[0] == '_': s += "%s : %s, " % (k, v) s += " }" return s class System: # Use the os that gives us everything os = PFM_OS_PERF_EVENT_EXT def __init__(self): self.ncpus = os.sysconf('SC_NPROCESSORS_ONLN') self.pmus = [] for i in range(0, PFM_PMU_MAX): try: pmu = PMU(i) except: pass else: self.pmus.append(pmu) def __repr__(self): return public_members(self) class Event: def __init__(self, info): self.info = info self.__attrs = [] def __repr__(self): return '\n' + public_members(self) def __parse_attrs(self): info = self.info for index in range(0, info.nattrs): self.__attrs.append(pfm_get_event_attr_info(info.idx, index, System.os)[1]) def attrs(self): if not self.__attrs: self.__parse_attrs() return self.__attrs class PMU: def __init__(self, i): self.info = pfm_get_pmu_info(i)[1] self.__events = [] def __parse_events(self): index = self.info.first_event while index != -1: self.__events.append(Event(pfm_get_event_info(index, System.os)[1])) index = pfm_get_event_next(index) def events(self): if not self.__events: self.__parse_events() return self.__events def __repr__(self): return public_members(self) if __name__ == '__main__': from perfmon import * s = System() for pmu in s.pmus: info = pmu.info if info.flags.is_present: print(info.name, info.size, info.nevents) for e in pmu.events(): print(e.info.name, e.info.code) for a in e.attrs(): print('\t\t', a.name, a.code) papi-papi-7-2-0-t/src/libpfm4/python/src/session.py000066400000000000000000000047231502707512200221370ustar00rootroot00000000000000# # Copyright (c) 2008 Google, Inc. 
# Contributed by Arun Sharma # # Permission is hereby granted, free of charge, to any person obtaining a # copy of this software and associated documentation files (the "Software"), # to deal in the Software without restriction, including without limitation # the rights to use, copy, modify, merge, publish, distribute, sublicense, # and/or sell copies of the Software, and to permit persons to whom the # Software is furnished to do so, subject to the following conditions: # # The above copyright notice and this permission notice shall be included # in all copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL # THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR # OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, # ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR # OTHER DEALINGS IN THE SOFTWARE. 
# from perfmon import * import os import sys # Common base class class Session: def __init__(self, events): self.system = System() self.event_names = events self.events = [] self.fds = [] for e in events: err, encoding = pfm_get_perf_event_encoding(e, PFM_PLM0 | PFM_PLM3, None, None) self.events.append(encoding) def __del__(self): pass def read(self, fd): # TODO: determine counter width return os.read(fd, 8) class SystemWideSession(Session): def __init__(self, cpus, events): self.cpus = cpus Session.__init__(self, events) def __del__(self): Session.__del__(self) def start(self): self.cpu_fds = [] for c in self.cpus: self.cpu_fds.append([]) cur_cpu_fds = self.cpu_fds[-1] for e in self.events: cur_cpu_fds.append(perf_event_open(e, -1, c, -1, 0)) def read(self, c, i): index = self.cpus.index(c) return Session.read(self, self.cpu_fds[index][i]) class PerThreadSession(Session): def __init__(self, pid, events): self.pid = pid Session.__init__(self, events) def __del__(self): Session.__del__(self) def start(self): for e in self.events: self.fds.append(perf_event_open(e, self.pid, -1, -1, 0)) def read(self, i): return Session.read(self, self.fds[i]) papi-papi-7-2-0-t/src/libpfm4/python/sys.py000077500000000000000000000044371502707512200205100ustar00rootroot00000000000000#!/usr/bin/env python # # Copyright (c) 2008 Google, Inc. # Contributed by Arun Sharma # # Permission is hereby granted, free of charge, to any person obtaining a # copy of this software and associated documentation files (the "Software"), # to deal in the Software without restriction, including without limitation # the rights to use, copy, modify, merge, publish, distribute, sublicense, # and/or sell copies of the Software, and to permit persons to whom the # Software is furnished to do so, subject to the following conditions: # # The above copyright notice and this permission notice shall be included # in all copies or substantial portions of the Software. 
# # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL # THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR # OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, # ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR # OTHER DEALINGS IN THE SOFTWARE. # # System wide monitoring example. Copied from syst.c # # Run as: ./sys.py -c cpulist -e eventlist from __future__ import print_function import sys import os import optparse import time import struct import perfmon if __name__ == '__main__': parser = optparse.OptionParser() parser.add_option("-e", "--events", help="Events to use", action="store", dest="events") parser.add_option("-c", "--cpulist", help="CPUs to monitor", action="store", dest="cpulist") parser.set_defaults(cpulist="0") parser.set_defaults(events="PERF_COUNT_HW_CPU_CYCLES") (options, args) = parser.parse_args() cpus = options.cpulist.split(',') cpus = [ int(c) for c in cpus ] if options.events: events = options.events.split(",") else: raise Exception("You need to specify events to monitor") s = perfmon.SystemWideSession(cpus, events) s.start() # Measuring loop while 1: time.sleep(1) # read the counts for c in cpus: for i in range(0, len(events)): count = struct.unpack("L", s.read(c, i))[0] print("""CPU%d: %s\t%lu""" % (c, events[i], count)) papi-papi-7-2-0-t/src/libpfm4/rules.mk000066400000000000000000000030041502707512200174440ustar00rootroot00000000000000# # Copyright (c) 2002-2006 Hewlett-Packard Development Company, L.P. 
# Contributed by Stephane Eranian # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # # This file is part of libpfm, a performance monitoring support library for # applications on Linux/ia64. 
# .SUFFIXES: .c .S .o .lo .cpp .S.o: $(CC) $(CFLAGS) -c $*.S .c.o: $(CC) $(CFLAGS) -c $*.c .cpp.o: $(CXX) $(CFLAGS) -c $*.cpp .c.lo: $(CC) -fPIC -DPIC $(CFLAGS) -c $*.c -o $*.lo .S.lo: $(CC) -fPIC -DPIC $(CFLAGS) -c $*.S -o $*.lo papi-papi-7-2-0-t/src/libpfm4/tests/000077500000000000000000000000001502707512200171265ustar00rootroot00000000000000papi-papi-7-2-0-t/src/libpfm4/tests/Makefile000066400000000000000000000040711502707512200205700ustar00rootroot00000000000000# # Copyright (c) 2010 Google, Inc # Contributed by Stephane Eranian # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal # in the Software without restriction, including without limitation the rights # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do so, # subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF # CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE # OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. # TOPDIR := $(shell if [ "$$PWD" != "" ]; then echo $$PWD; else pwd; fi)/.. 
include $(TOPDIR)/config.mk include $(TOPDIR)/rules.mk SRCS=validate.c ifeq ($(CONFIG_PFMLIB_ARCH_X86),y) SRCS += validate_x86.c endif ifeq ($(CONFIG_PFMLIB_ARCH_MIPS),y) SRCS += validate_mips.c endif ifeq ($(CONFIG_PFMLIB_ARCH_ARM),y) SRCS += validate_arm.c endif ifeq ($(CONFIG_PFMLIB_ARCH_ARM64),y) SRCS += validate_arm64.c endif ifeq ($(CONFIG_PFMLIB_ARCH_POWERPC),y) SRCS += validate_power.c endif ifeq ($(SYS),Linux) SRCS += validate_perf.c endif CFLAGS+= -I. -D_GNU_SOURCE LIBS += -lm ifeq ($(SYS),Linux) CFLAGS+= -pthread endif OBJS=$(SRCS:.c=.o) TARGETS=validate all: $(TARGETS) validate: $(OBJS) $(PFMLIB) $(CC) $(CFLAGS) -o $@ $(LDFLAGS) $^ $(LIBS) clean: $(RM) -f *.o $(TARGETS) *~ distclean: clean # # examples are installed as part of the RPM install, typically in /usr/share/doc/libpfm-X.Y/ # .PHONY: install depend install_examples papi-papi-7-2-0-t/src/libpfm4/tests/validate.c000066400000000000000000000202141502707512200210620ustar00rootroot00000000000000/* * validate.c - validate event tables + encodings * * Copyright (c) 2010 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * * This file is part of libpfm, a performance monitoring support library for * applications on Linux. */ #include #include #include #include #include #include #include #include #include #include #ifdef __linux__ #include #endif #define __weak_func __attribute__((weak)) #ifdef PFMLIB_WINDOWS int set_env_var(const char *var, const char *value, int ov) { size_t len; char *str; int ret; len = strlen(var) + 1 + strlen(value) + 1; str = malloc(len); if (!str) return PFM_ERR_NOMEM; sprintf(str, "%s=%s", var, value); ret = putenv(str); free(str); return ret ? PFM_ERR_INVAL : PFM_SUCCESS; } #else static inline int set_env_var(const char *var, const char *value, int ov) { return setenv(var, value, ov); } #endif __weak_func int validate_arch(FILE *fp) { return 0; } __weak_func int validate_perf(FILE *fp) { return 0; } static struct { int valid_mode; } options; #define VALID_INTERN 0x1 #define VALID_ARCH 0x2 #define VALID_PERF 0x4 #define VALID_ALL (VALID_INTERN|\ VALID_ARCH |\ VALID_PERF) static inline int valid_mode(int f) { return !!(options.valid_mode & f); } static void usage(void) { printf("validate [-c] [-a] [-A]" #ifdef __linux__ "[-p]" #endif "\n" "-c\trun the library validate events\n" "-a\trun architecture specific event tests\n" #ifdef __linux__ "-p\trun perf_events specific event tests\n" #endif "-A\trun all tests\n" "-h\tget help\n"); } static int validate_event_tables(void) { pfm_pmu_info_t pinfo; pfm_pmu_t i; int ret, errors = 0; memset(&pinfo, 0, sizeof(pinfo)); pinfo.size = sizeof(pinfo); pfm_for_all_pmus(i) { ret = pfm_get_pmu_info(i, &pinfo); if (ret != PFM_SUCCESS) continue; printf("\tchecking %s (%d events): ", pinfo.name, pinfo.nevents); fflush(stdout); ret = pfm_pmu_validate(i, stdout); if 
(ret != PFM_SUCCESS && ret != PFM_ERR_NOTSUPP) { printf("Failed\n"); errors++; } else if (ret == PFM_ERR_NOTSUPP) { printf("N/A\n"); } else { printf("Passed\n"); } } return errors; } #if __WORDSIZE == 64 #define STRUCT_MULT 8 #else #define STRUCT_MULT 4 #endif #define MAX_FIELDS 32 typedef struct { const char *name; size_t sz; } field_desc_t; typedef struct { const char *name; size_t sz; size_t bitfield_sz; size_t abi_sz; field_desc_t fields[MAX_FIELDS]; } struct_desc_t; #define LAST_STRUCT { .name = NULL, } #define FIELD(n, st) \ { .name = #n, \ .sz = sizeof(((st *)(0))->n), \ } #define LAST_FIELD { .name = NULL, } static const struct_desc_t pfmlib_structs[]={ { .name = "pfm_pmu_info_t", .sz = sizeof(pfm_pmu_info_t), .bitfield_sz = 4, .abi_sz = PFM_PMU_INFO_ABI0, .fields= { FIELD(name, pfm_pmu_info_t), FIELD(desc, pfm_pmu_info_t), FIELD(size, pfm_pmu_info_t), FIELD(pmu, pfm_pmu_info_t), FIELD(type, pfm_pmu_info_t), FIELD(nevents, pfm_pmu_info_t), FIELD(first_event, pfm_pmu_info_t), FIELD(max_encoding, pfm_pmu_info_t), FIELD(num_cntrs, pfm_pmu_info_t), FIELD(num_fixed_cntrs, pfm_pmu_info_t), LAST_FIELD }, }, { .name = "pfm_event_info_t", .sz = sizeof(pfm_event_info_t), .bitfield_sz = 4, .abi_sz = PFM_EVENT_INFO_ABI0, .fields= { FIELD(name, pfm_event_info_t), FIELD(desc, pfm_event_info_t), FIELD(equiv, pfm_event_info_t), FIELD(size, pfm_event_info_t), FIELD(code, pfm_event_info_t), FIELD(pmu, pfm_event_info_t), FIELD(dtype, pfm_event_info_t), FIELD(idx, pfm_event_info_t), FIELD(nattrs, pfm_event_info_t), FIELD(reserved, pfm_event_info_t), LAST_FIELD }, }, { .name = "pfm_event_attr_info_t", .sz = sizeof(pfm_event_attr_info_t), .bitfield_sz = 4+8, .abi_sz = PFM_ATTR_INFO_ABI0, .fields= { FIELD(name, pfm_event_attr_info_t), FIELD(desc, pfm_event_attr_info_t), FIELD(equiv, pfm_event_attr_info_t), FIELD(size, pfm_event_attr_info_t), FIELD(code, pfm_event_attr_info_t), FIELD(type, pfm_event_attr_info_t), FIELD(idx, pfm_event_attr_info_t), FIELD(ctrl, 
pfm_event_attr_info_t), LAST_FIELD }, }, { .name = "pfm_pmu_encode_arg_t", .sz = sizeof(pfm_pmu_encode_arg_t), .abi_sz = PFM_RAW_ENCODE_ABI0, .fields= { FIELD(codes, pfm_pmu_encode_arg_t), FIELD(fstr, pfm_pmu_encode_arg_t), FIELD(size, pfm_pmu_encode_arg_t), FIELD(count, pfm_pmu_encode_arg_t), FIELD(idx, pfm_pmu_encode_arg_t), LAST_FIELD }, }, #ifdef __linux__ { .name = "pfm_perf_encode_arg_t", .sz = sizeof(pfm_perf_encode_arg_t), .bitfield_sz = 0, .abi_sz = PFM_PERF_ENCODE_ABI0, .fields= { FIELD(attr, pfm_perf_encode_arg_t), FIELD(fstr, pfm_perf_encode_arg_t), FIELD(size, pfm_perf_encode_arg_t), FIELD(idx, pfm_perf_encode_arg_t), FIELD(cpu, pfm_perf_encode_arg_t), FIELD(flags, pfm_perf_encode_arg_t), FIELD(pad0, pfm_perf_encode_arg_t), LAST_FIELD }, }, #endif LAST_STRUCT }; static int validate_structs(void) { const struct_desc_t *d; const field_desc_t *f; size_t sz; int errors = 0; int abi = LIBPFM_ABI_VERSION; printf("\tlibpfm ABI version : %d\n", abi); for (d = pfmlib_structs; d->name; d++) { printf("\t%s : ", d->name); if (d->abi_sz != d->sz) { printf("struct size does not correspond to ABI size %zu vs. 
%zu)\n", d->abi_sz, d->sz); errors++; } if (d->sz % STRUCT_MULT) { printf("Failed (wrong mult size=%zu)\n", d->sz); errors++; } sz = d->bitfield_sz; for (f = d->fields; f->name; f++) { sz += f->sz; } if (sz != d->sz) { printf("Failed (invisible padding of %zu bytes, total struct size %zu bytes)\n", d->sz - sz, d->sz); errors++; continue; } printf("Passed\n"); } return errors; } int main(int argc, char **argv) { int ret, c, errors = 0; while ((c=getopt(argc, argv,"hpcaA")) != -1) { switch(c) { case 'c': options.valid_mode |= VALID_INTERN; break; case 'a': options.valid_mode |= VALID_ARCH; break; case 'p': options.valid_mode |= VALID_PERF; break; case 'A': options.valid_mode |= VALID_ALL; break; case 'h': usage(); exit(0); default: errx(1, "unknown option error"); } } if (options.valid_mode == 0) options.valid_mode = VALID_ALL; /* to allow encoding of events from non detected PMU models */ ret = set_env_var("LIBPFM_ENCODE_INACTIVE", "1", 1); if (ret != PFM_SUCCESS) errx(1, "cannot force inactive encoding"); ret = pfm_initialize(); if (ret != PFM_SUCCESS) errx(1, "cannot initialize libpfm: %s", pfm_strerror(ret)); printf("Libpfm structure tests:\n"); errors += validate_structs(); if (valid_mode(VALID_PERF)) { printf("perf_events specific tests:\n"); errors += validate_perf(stderr); } if (valid_mode(VALID_INTERN)) { printf("Libpfm internal table tests:\n"); errors += validate_event_tables(); } if (valid_mode(VALID_ARCH)) { printf("Architecture specific tests:\n"); errors += validate_arch(stderr); } pfm_terminate(); if (errors) printf("Total %d errors\n", errors); else printf("All tests passed\n"); return errors; } papi-papi-7-2-0-t/src/libpfm4/tests/validate_arm.c000066400000000000000000000216071502707512200217300ustar00rootroot00000000000000/* * validate_arm.c - validate ARM event tables + encodings * * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software 
and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * */ #include #include #include #include #include #include #include #include #include #define MAX_ENCODING 1 #define SRC_LINE .line = __LINE__ typedef struct { const char *name; const char *fstr; uint64_t codes[MAX_ENCODING]; int ret, count, line; } test_event_t; static const test_event_t arm_test_events[]={ { SRC_LINE, .name = "arm_ac7::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_ac7::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac7::CPU_CYCLES:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88000011, .fstr = "arm_ac7::CPU_CYCLES:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_ac7::CPU_CYCLES:k:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_ac7::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac7::INST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000008, .fstr = "arm_ac7::INST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac8::NEON_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5a, .fstr = "arm_ac8::NEON_CYCLES", }, { SRC_LINE, 
.name = "arm_ac8::NEON_CYCLES:k", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "arm_ac8::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xff, .fstr = "arm_ac8::CPU_CYCLES", }, { SRC_LINE, .name = "arm_ac8::CPU_CYCLES_HALTED", .ret = PFM_ERR_NOTFOUND, }, { SRC_LINE, .name = "arm_ac9::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xff, .fstr = "arm_ac9::CPU_CYCLES", }, { SRC_LINE, .name = "arm_ac9::DMB_DEP_STALL_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x86, .fstr = "arm_ac9::DMB_DEP_STALL_CYCLES", }, { SRC_LINE, .name = "arm_ac9::CPU_CYCLES:u", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "arm_ac9::JAVA_HW_BYTECODE_EXEC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x40, .fstr = "arm_ac9::JAVA_HW_BYTECODE_EXEC", }, { SRC_LINE, .name = "arm_ac15::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_ac15::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac15::CPU_CYCLES:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88000011, .fstr = "arm_ac15::CPU_CYCLES:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_ac15::CPU_CYCLES:k:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_ac15::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac15::INST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000008, .fstr = "arm_ac15::INST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_1176::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xff, .fstr = "arm_1176::CPU_CYCLES", }, { SRC_LINE, .name = "arm_1176::CPU_CYCLES:k", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "arm_1176::INSTR_EXEC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x07, .fstr = "arm_1176::INSTR_EXEC", }, { SRC_LINE, .name = "qcom_krait::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x80000ff, .fstr = "qcom_krait::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "qcom_krait::CPU_CYCLES:k:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x80000ff, .fstr = "qcom_krait::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, 
.name = "qcom_krait::CPU_CYCLES:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x480000ff, .fstr = "qcom_krait::CPU_CYCLES:k=0:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac57::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_ac57::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac57::CPU_CYCLES:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88000011, .fstr = "arm_ac57::CPU_CYCLES:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_ac57::CPU_CYCLES:k:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_ac57::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac57::INST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000008, .fstr = "arm_ac57::INST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac53::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_ac53::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac53::CPU_CYCLES:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88000011, .fstr = "arm_ac53::CPU_CYCLES:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_ac53::CPU_CYCLES:k:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_ac53::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac53::INST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000008, .fstr = "arm_ac53::INST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac53::LD_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000006, .fstr = "arm_ac53::LD_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac53::ST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000007, .fstr = "arm_ac53::ST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_v3::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_v3::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_v3::INST_FETCH_PERCYC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8008120, .fstr = "arm_v3::INST_FETCH_PERCYC:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_v3::INST_RETIRED", .ret = PFM_SUCCESS, 
.count = 1, .codes[0] = 0x8000008, .fstr = "arm_v3::INST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_v3::DTLB_WALK_PERCYC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8008128, .fstr = "arm_v3::DTLB_WALK_PERCYC:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_v3::SAMPLE_FEED_LD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x800812b, .fstr = "arm_v3::SAMPLE_FEED_LD:k=1:u=1:hv=0", }, }; #define NUM_TEST_EVENTS (int)(sizeof(arm_test_events)/sizeof(test_event_t)) static int check_test_events(FILE *fp) { const test_event_t *e; char *fstr; uint64_t *codes; int count, i, j; int ret, errors = 0; for (i = 0, e = arm_test_events; i < NUM_TEST_EVENTS; i++, e++) { codes = NULL; count = 0; fstr = NULL; ret = pfm_get_event_encoding(e->name, PFM_PLM0 | PFM_PLM3, &fstr, NULL, &codes, &count); if (ret != e->ret) { fprintf(fp,"Event%d %s, ret=%s(%d) expected %s(%d)\n", i, e->name, pfm_strerror(ret), ret, pfm_strerror(e->ret), e->ret); errors++; } else { if (ret != PFM_SUCCESS) { if (fstr) { fprintf(fp,"Event%d %s, expected fstr NULL but it is not\n", i, e->name); errors++; } if (count != 0) { fprintf(fp,"Event%d %s, expected count=0 instead of %d\n", i, e->name, count); errors++; } if (codes) { fprintf(fp,"Event%d %s, expected codes[] NULL but it is not\n", i, e->name); errors++; } } else { if (count != e->count) { fprintf(fp,"Event%d %s, count=%d expected %d\n", i, e->name, count, e->count); errors++; } for (j=0; j < count; j++) { if (codes[j] != e->codes[j]) { fprintf(fp,"Event%d %s, codes[%d]=%#"PRIx64" expected %#"PRIx64"\n", i, e->name, j, codes[j], e->codes[j]); errors++; } } if (e->fstr && strcmp(fstr, e->fstr)) { fprintf(fp,"Event%d %s, fstr=%s expected %s\n", i, e->name, fstr, e->fstr); errors++; } } } if (codes) free(codes); if (fstr) free(fstr); } printf("\t %d ARM events: %d errors\n", i, errors); return errors; } int validate_arch(FILE *fp) { return check_test_events(fp); } 
papi-papi-7-2-0-t/src/libpfm4/tests/validate_arm64.c000066400000000000000000000342031502707512200220760ustar00rootroot00000000000000/* * validate_arm64.c - validate ARM64 event tables + encodings * * Copyright (c) 2014 Google, Inc * Contributed by Stephane Eranian * * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. * Contributed by John Linford * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* */ #include #include #include #include #include #include #include #include #include #define MAX_ENCODING 1 #define SRC_LINE .line = __LINE__ typedef struct { const char *name; const char *fstr; uint64_t codes[MAX_ENCODING]; int ret, count, line; } test_event_t; static const test_event_t arm64_test_events[]={ { SRC_LINE, .name = "arm_ac57::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_ac57::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac57::CPU_CYCLES:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88000011, .fstr = "arm_ac57::CPU_CYCLES:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_ac57::CPU_CYCLES:k:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_ac57::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac57::INST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000008, .fstr = "arm_ac57::INST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac53::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_ac53::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac53::CPU_CYCLES:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88000011, .fstr = "arm_ac53::CPU_CYCLES:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_ac53::CPU_CYCLES:k:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_ac53::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac53::INST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000008, .fstr = "arm_ac53::INST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac53::LD_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000006, .fstr = "arm_ac53::LD_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_ac53::ST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000007, .fstr = "arm_ac53::ST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_xgene::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_xgene::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_xgene::CPU_CYCLES:k", 
.ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88000011, .fstr = "arm_xgene::CPU_CYCLES:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_xgene::CPU_CYCLES:k:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_xgene::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_xgene::INST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000008, .fstr = "arm_xgene::INST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_thunderx2::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_thunderx2::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_thunderx2::CPU_CYCLES:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88000011, .fstr = "arm_thunderx2::CPU_CYCLES:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_thunderx2::LD_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000006, .fstr = "arm_thunderx2::LD_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_thunderx2::ST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000007, .fstr = "arm_thunderx2::ST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_thunderx2::INST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000008, .fstr = "arm_thunderx2::INST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "tx2_dmc1::UNC_DMC_READS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf, .fstr = "tx2_dmc1::UNC_DMC_READS", }, { SRC_LINE, .name = "tx2_ccpi0::UNC_CCPI_GIC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x12d, .fstr = "tx2_ccpi0::UNC_CCPI_GIC", }, { SRC_LINE, .name = "tx2_llc0::UNC_LLC_READ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xd, .fstr = "tx2_llc0::UNC_LLC_READ", }, { SRC_LINE, .name = "arm_a64fx::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_a64fx::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_a64fx::CPU_CYCLES:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88000011, .fstr = "arm_a64fx::CPU_CYCLES:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_a64fx::INST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 
0x8000008, .fstr = "arm_a64fx::INST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_monaka::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_monaka::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_monaka::CPU_CYCLES:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88000011, .fstr = "arm_monaka::CPU_CYCLES:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_monaka::INST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000008, .fstr = "arm_monaka::INST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_monaka::_1INST_COMMIT", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000191, .fstr = "arm_monaka::_1INST_COMMIT:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_n1::INST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000008, .fstr = "arm_n1::INST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_n1::INST_RETIRED:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88000008, .fstr = "arm_n1::INST_RETIRED:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_n1::INST_RETIRED:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x48000008, .fstr = "arm_n1::INST_RETIRED:k=0:u=1:hv=0", }, { SRC_LINE, .name = "arm_n1::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_n1::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_n1::L3_CACHE_RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x80000a0, .fstr = "arm_n1::L3_CACHE_RD:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_n1::BR_RET_SPEC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000079, .fstr = "arm_n1::BR_RETURN_SPEC:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_n2::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_n2::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_n2::CPU_CYCLES:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x48000011, .fstr = "arm_n2::CPU_CYCLES:k=0:u=1:hv=0", }, { SRC_LINE, .name = "arm_n2::CPU_CYCLES:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88000011, .fstr = "arm_n2::CPU_CYCLES:k=1:u=0:hv=0", }, { 
SRC_LINE, .name = "arm_n2::LDST_ALIGN_LAT:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88004020, .fstr = "arm_n2::LDST_ALIGN_LAT:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_v1::INST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000008, .fstr = "arm_v1::INST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_v1::INST_RETIRED:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88000008, .fstr = "arm_v1::INST_RETIRED:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_v1::INST_RETIRED:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x48000008, .fstr = "arm_v1::INST_RETIRED:k=0:u=1:hv=0", }, { SRC_LINE, .name = "arm_v1::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_v1::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_v1::BR_RET_SPEC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000079, .fstr = "arm_v1::BR_RETURN_SPEC:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_v2::INST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000008, .fstr = "arm_v2::INST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_v2::INST_RETIRED:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88000008, .fstr = "arm_v2::INST_RETIRED:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_v2::INST_RETIRED:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x48000008, .fstr = "arm_v2::INST_RETIRED:k=0:u=1:hv=0", }, { SRC_LINE, .name = "arm_v2::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_v2::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_v2::BR_RET_SPEC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000079, .fstr = "arm_v2::BR_RETURN_SPEC:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_kunpeng::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_kunpeng::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_kunpeng::CPU_CYCLES:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88000011, .fstr = "arm_kunpeng::CPU_CYCLES:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_kunpeng::INST_RETIRED", .ret = 
PFM_SUCCESS, .count = 1, .codes[0] = 0x8000008, .fstr = "arm_kunpeng::INST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "hisi_sccl1_l3c8::rd_cpipe", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hisi_sccl1_l3c8::rd_cpipe", }, { SRC_LINE, .name = "hisi_sccl1_hha2::rx_ops_num", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hisi_sccl1_hha2::rx_ops_num", }, { SRC_LINE, .name = "hisi_sccl1_ddrc0::flux_wr", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hisi_sccl1_ddrc0::flux_wr", }, { SRC_LINE, .name = "arm_v3::INST_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000008, .fstr = "arm_v3::INST_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_v3::DTLB_WALK_PERCYC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8008128, .fstr = "arm_v3::DTLB_WALK_PERCYC:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_v3::SAMPLE_FEED_LD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x800812b, .fstr = "arm_v3::SAMPLE_FEED_LD:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_n3::CPU_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000011, .fstr = "arm_n3::CPU_CYCLES:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_n3::CPU_CYCLES:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x48000011, .fstr = "arm_n3::CPU_CYCLES:k=0:u=1:hv=0", }, { SRC_LINE, .name = "arm_n3::CPU_CYCLES:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88000011, .fstr = "arm_n3::CPU_CYCLES:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_n3::LDST_ALIGN_LAT:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x88004020, .fstr = "arm_n3::LDST_ALIGN_LAT:k=1:u=0:hv=0", }, { SRC_LINE, .name = "arm_n3::BR_IND_PRED_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8008112, .fstr = "arm_n3::BR_IND_PRED_RETIRED:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_n3::CAS_FAR_SPEC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8008173, .fstr = "arm_n3::CAS_FAR_SPEC:k=1:u=1:hv=0", }, { SRC_LINE, .name = "arm_n3::STALL_FRONTEND_MEMBOUND", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8008158, .fstr = 
"arm_n3::STALL_FRONTEND_MEMBOUND:k=1:u=1:hv=0", }, }; #define NUM_TEST_EVENTS (int)(sizeof(arm64_test_events)/sizeof(test_event_t)) static int check_test_events(FILE *fp) { const test_event_t *e; char *fstr; uint64_t *codes; int count, i, j; int ret, errors = 0; for (i = 0, e = arm64_test_events; i < NUM_TEST_EVENTS; i++, e++) { codes = NULL; count = 0; fstr = NULL; ret = pfm_get_event_encoding(e->name, PFM_PLM0 | PFM_PLM3, &fstr, NULL, &codes, &count); if (ret != e->ret) { fprintf(fp,"Line %d, Event%d %s, ret=%s(%d) expected %s(%d)\n", e->line, i, e->name, pfm_strerror(ret), ret, pfm_strerror(e->ret), e->ret); errors++; } else { if (ret != PFM_SUCCESS) { if (fstr) { fprintf(fp,"Line %d, Event%d %s, expected fstr NULL but it is not\n", e->line, i, e->name); errors++; } if (count != 0) { fprintf(fp,"Line %d, Event%d %s, expected count=0 instead of %d\n", e->line, i, e->name, count); errors++; } if (codes) { fprintf(fp,"Line %d, Event%d %s, expected codes[] NULL but it is not\n", e->line, i, e->name); errors++; } } else { if (count != e->count) { fprintf(fp,"Line %d, Event%d %s, count=%d expected %d\n", e->line, i, e->name, count, e->count); errors++; } for (j=0; j < count; j++) { if (codes[j] != e->codes[j]) { fprintf(fp,"Line %d, Event%d %s, codes[%d]=%#"PRIx64" expected %#"PRIx64"\n", e->line, i, e->name, j, codes[j], e->codes[j]); errors++; } } if (e->fstr && strcmp(fstr, e->fstr)) { fprintf(fp,"Line %d, Event%d %s, fstr=%s expected %s\n", e->line, i, e->name, fstr, e->fstr); errors++; } } } if (codes) free(codes); if (fstr) free(fstr); } printf("\t %d ARM64 events: %d errors\n", i, errors); return errors; } int validate_arch(FILE *fp) { return check_test_events(fp); } papi-papi-7-2-0-t/src/libpfm4/tests/validate_mips.c000066400000000000000000000134721502707512200221220ustar00rootroot00000000000000/* * validate_mips.c - validate MIPS event tables + encodings * * Copyright (c) 2011 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, 
free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* */ #include #include #include #include #include #include #include #include #include #define MAX_ENCODING 2 #define SRC_LINE .line = __LINE__ typedef struct { const char *name; const char *fstr; uint64_t codes[MAX_ENCODING]; int ret, count, line; } test_event_t; static const test_event_t mips_test_events[]={ { SRC_LINE, .name = "mips_74k::cycles", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xa, .codes[1] = 0xf, .fstr = "mips_74k::CYCLES:k=1:u=1:s=0:e=0", }, { SRC_LINE, .name = "mips_74k::cycles:k", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x2, .codes[1] = 0xf, .fstr = "mips_74k::CYCLES:k=1:u=0:s=0:e=0", }, { SRC_LINE, .name = "mips_74k::cycles:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x8, .codes[1] = 0xf, .fstr = "mips_74k::CYCLES:k=0:u=1:s=0:e=0", }, { SRC_LINE, .name = "mips_74k::cycles:s", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x4, .codes[1] = 0xf, .fstr = "mips_74k::CYCLES:k=0:u=0:s=1:e=0", }, { SRC_LINE, .name = "mips_74k::cycles:e", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x1, .codes[1] = 0xf, .fstr = "mips_74k::CYCLES:k=0:u=0:s=0:e=1", }, { SRC_LINE, .name = "mips_74k::cycles:u:k", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xa, .codes[1] = 0xf, .fstr = "mips_74k::CYCLES:k=1:u=1:s=0:e=0", }, { SRC_LINE, .name = "mips_74k::instructions", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x2a, .codes[1] = 0xf, .fstr = "mips_74k::INSTRUCTIONS:k=1:u=1:s=0:e=0", }, { SRC_LINE, .name = "mips_74k::instructions:k", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x22, .codes[1] = 0xf, .fstr = "mips_74k::INSTRUCTIONS:k=1:u=0:s=0:e=0", }, { SRC_LINE, .name = "mips_74k::instructions:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x28, .codes[1] = 0xf, .fstr = "mips_74k::INSTRUCTIONS:k=0:u=1:s=0:e=0", }, { SRC_LINE, .name = "mips_74k::instructions:s", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x24, .codes[1] = 0xf, .fstr = "mips_74k::INSTRUCTIONS:k=0:u=0:s=1:e=0", }, { SRC_LINE, .name = "mips_74k::instructions:e", .ret = PFM_SUCCESS, .count = 2, 
.codes[0] = 0x21, .codes[1] = 0xf, .fstr = "mips_74k::INSTRUCTIONS:k=0:u=0:s=0:e=1", }, { SRC_LINE, .name = "mips_74k::instructions:u:k", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x2a, .codes[1] = 0xf, .fstr = "mips_74k::INSTRUCTIONS:k=1:u=1:s=0:e=0", }, { SRC_LINE, .name = "mips_74k::PREDICTED_JR_31:u:k", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x4a, .codes[1] = 0x5, .fstr = "mips_74k::PREDICTED_JR_31:k=1:u=1:s=0:e=0", }, { SRC_LINE, .name = "mips_74k::JR_31_MISPREDICTIONS:s:e", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x45, .codes[1] = 0xa, .fstr = "mips_74k::JR_31_MISPREDICTIONS:k=0:u=0:s=1:e=1", }, }; #define NUM_TEST_EVENTS (int)(sizeof(mips_test_events)/sizeof(test_event_t)) static int check_test_events(FILE *fp) { const test_event_t *e; char *fstr; uint64_t *codes; int count, i, j; int ret, errors = 0; for (i = 0, e = mips_test_events; i < NUM_TEST_EVENTS; i++, e++) { codes = NULL; count = 0; fstr = NULL; ret = pfm_get_event_encoding(e->name, PFM_PLM0 | PFM_PLM3, &fstr, NULL, &codes, &count); if (ret != e->ret) { fprintf(fp,"Event%d %s, ret=%s(%d) expected %s(%d)\n", i, e->name, pfm_strerror(ret), ret, pfm_strerror(e->ret), e->ret); errors++; } else { if (ret != PFM_SUCCESS) { if (fstr) { fprintf(fp,"Event%d %s, expected fstr NULL but it is not\n", i, e->name); errors++; } if (count != 0) { fprintf(fp,"Event%d %s, expected count=0 instead of %d\n", i, e->name, count); errors++; } if (codes) { fprintf(fp,"Event%d %s, expected codes[] NULL but it is not\n", i, e->name); errors++; } } else { if (count != e->count) { fprintf(fp,"Event%d %s, count=%d expected %d\n", i, e->name, count, e->count); errors++; } for (j=0; j < count; j++) { if (codes[j] != e->codes[j]) { fprintf(fp,"Event%d %s, codes[%d]=%#"PRIx64" expected %#"PRIx64"\n", i, e->name, j, codes[j], e->codes[j]); errors++; } } if (e->fstr && strcmp(fstr, e->fstr)) { fprintf(fp,"Event%d %s, fstr=%s expected %s\n", i, e->name, fstr, e->fstr); errors++; } } } if (codes) free(codes); if (fstr) 
free(fstr); } printf("\t %d MIPS events: %d errors\n", i, errors); return errors; } int validate_arch(FILE *fp) { return check_test_events(fp); } papi-papi-7-2-0-t/src/libpfm4/tests/validate_perf.c000066400000000000000000000073241502707512200221050ustar00rootroot00000000000000/* * validate_perf.c - validate perf generic event encodings * * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* */ #include #include #include #include #include #include #include #include #include #define MAX_ENCODING 1 #define SRC_LINE .line = __LINE__ typedef struct { const char *name; const char *fstr; uint64_t codes[MAX_ENCODING]; int ret, count, line; } test_event_t; static const test_event_t perf_test_events[]={ { SRC_LINE, .name = "perf::cycles", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "perf::PERF_COUNT_HW_CPU_CYCLES", }, { SRC_LINE, .name = "perf::instructions", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x01, .fstr = "perf::PERF_COUNT_HW_INSTRUCTIONS", }, { SRC_LINE, .name = "perf::branches", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x04, .fstr = "perf::PERF_COUNT_HW_BRANCH_INSTRUCTIONS", }, }; #define NUM_TEST_EVENTS (int)(sizeof(perf_test_events)/sizeof(test_event_t)) static int check_test_events(FILE *fp) { const test_event_t *e; char *fstr; uint64_t *codes; int count, i, j; int ret, errors = 0; for (i = 0, e = perf_test_events; i < NUM_TEST_EVENTS; i++, e++) { codes = NULL; count = 0; fstr = NULL; ret = pfm_get_event_encoding(e->name, PFM_PLM0 | PFM_PLM3, &fstr, NULL, &codes, &count); if (ret != e->ret) { fprintf(fp,"Event%d %s, ret=%s(%d) expected %s(%d)\n", i, e->name, pfm_strerror(ret), ret, pfm_strerror(e->ret), e->ret); errors++; } else { if (ret != PFM_SUCCESS) { if (fstr) { fprintf(fp,"Event%d %s, expected fstr NULL but it is not\n", i, e->name); errors++; } if (count != 0) { fprintf(fp,"Event%d %s, expected count=0 instead of %d\n", i, e->name, count); errors++; } if (codes) { fprintf(fp,"Event%d %s, expected codes[] NULL but it is not\n", i, e->name); errors++; } } else { if (count != e->count) { fprintf(fp,"Event%d %s, count=%d expected %d\n", i, e->name, count, e->count); errors++; } for (j=0; j < count; j++) { if (codes[j] != e->codes[j]) { fprintf(fp,"Event%d %s, codes[%d]=%#"PRIx64" expected %#"PRIx64"\n", i, e->name, j, codes[j], e->codes[j]); errors++; } } if (e->fstr && strcmp(fstr, e->fstr)) { fprintf(fp,"Event%d %s, 
fstr=%s expected %s\n", i, e->name, fstr, e->fstr); errors++; } } } if (codes) free(codes); if (fstr) free(fstr); } printf("\t %d perf_events generic events: %d errors\n", i, errors); return errors; } int validate_perf(FILE *fp) { return check_test_events(fp); } papi-papi-7-2-0-t/src/libpfm4/tests/validate_power.c000066400000000000000000000175251502707512200223110ustar00rootroot00000000000000/* * validate_power.c - validate PowerPC event tables + encodings * * Copyright (c) 2012 Google, Inc * Contributed by Stephane Eranian * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal * in the Software without restriction, including without limitation the rights * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies * of the Software, and to permit persons to whom the Software is furnished to do so, * subject to the following conditions: * * The above copyright notice and this permission notice shall be included in all * copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
* */ #include #include #include #include #include #include #include #include #include #define MAX_ENCODING 1 #define SRC_LINE .line = __LINE__ typedef struct { const char *name; const char *fstr; uint64_t codes[MAX_ENCODING]; int ret, count, line; } test_event_t; static const test_event_t ppc_test_events[]={ { SRC_LINE, .name = "ppc970::PM_CYC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x7, .fstr = "ppc970::PM_CYC", }, { SRC_LINE, .name = "ppc970::PM_INST_DISP", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x320, .fstr = "ppc970::PM_INST_DISP", }, { SRC_LINE, .name = "ppc970mp::PM_CYC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x7, .fstr = "ppc970mp::PM_CYC", }, { SRC_LINE, .name = "ppc970mp::PM_INST_DISP", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x320, .fstr = "ppc970mp::PM_INST_DISP", }, { SRC_LINE, .name = "power4::PM_CYC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x7, .fstr = "power4::PM_CYC", }, { SRC_LINE, .name = "power4::PM_INST_DISP", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x221, .fstr = "power4::PM_INST_DISP", }, { SRC_LINE, .name = "power5::PM_CYC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf, .fstr = "power5::PM_CYC", }, { SRC_LINE, .name = "power5::PM_INST_DISP", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x300009, .fstr = "power5::PM_INST_DISP", }, { SRC_LINE, .name = "power5p::PM_CYC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf, .fstr = "power5p::PM_CYC", }, { SRC_LINE, .name = "power5p::PM_INST_DISP", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x300009, .fstr = "power5p::PM_INST_DISP", }, { SRC_LINE, .name = "power6::PM_INST_CMPL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x2, .fstr = "power6::PM_INST_CMPL", }, { SRC_LINE, .name = "power6::PM_THRD_CONC_RUN_INST", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x300026, .fstr = "power6::PM_THRD_CONC_RUN_INST", }, { SRC_LINE, .name = "power7::PM_CYC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1e, .fstr = "power7::PM_CYC", }, { SRC_LINE, .name = "power7::PM_INST_DISP", 
.ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x200f2, .fstr = "power7::PM_INST_DISP", }, { SRC_LINE, .name = "power8::PM_L1MISS_LAT_EXC_1024", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x67200301eaull, .fstr = "power8::PM_L1MISS_LAT_EXC_1024", }, { SRC_LINE, .name = "power8::PM_RC_LIFETIME_EXC_32", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xde200201e6ull, .fstr = "power8::PM_RC_LIFETIME_EXC_32", }, { SRC_LINE, .name = "power9::PM_CYC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1001e, .fstr = "power9::PM_CYC", }, { SRC_LINE, .name = "power9::PM_INST_DISP", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x200f2, .fstr = "power9::PM_INST_DISP", }, { SRC_LINE, .name = "power9::PM_CYC_ALT", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x2001e, .fstr = "power9::PM_CYC_ALT", }, { SRC_LINE, .name = "power9::PM_CYC_ALT2", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x3001e, .fstr = "power9::PM_CYC_ALT2", }, { SRC_LINE, .name = "power9::PM_INST_CMPL_ALT", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x20002, .fstr = "power9::PM_INST_CMPL_ALT", }, { SRC_LINE, .name = "power9::PM_L2_INST_MISS_ALT", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x4609e, .fstr = "power9::PM_L2_INST_MISS_ALT", }, { SRC_LINE, .name = "power9::PM_L2_INST_MISS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x36880, .fstr = "power9::PM_L2_INST_MISS", }, { SRC_LINE, .name = "power10::PM_CYC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x100f0, .fstr = "power10::PM_CYC", }, { SRC_LINE, .name = "power10::PM_CYC_ALT2", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x2001e, .fstr = "power10::PM_CYC_ALT2", }, { SRC_LINE, .name = "power10::PM_CYC_ALT3", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x3001e, .fstr = "power10::PM_CYC_ALT3", }, { SRC_LINE, .name = "power10::PM_INST_CMPL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x100fe, .fstr = "power10::PM_INST_CMPL", }, { SRC_LINE, .name = "power10::PM_INST_CMPL_ALT2", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x20002, .fstr = 
"power10::PM_INST_CMPL_ALT2", }, { SRC_LINE, .name = "power10::PM_L2_INST", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x36080, .fstr = "power10::PM_L2_INST", }, { SRC_LINE, .name = "powerpc_nest_mcs_read::MCS_00", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x118, .fstr = "powerpc_nest_mcs_read::MCS_00", }, { SRC_LINE, .name = "powerpc_nest_mcs_write::MCS_00", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x198, .fstr = "powerpc_nest_mcs_write::MCS_00", }, }; #define NUM_TEST_EVENTS (int)(sizeof(ppc_test_events)/sizeof(test_event_t)) static int check_test_events(FILE *fp) { const test_event_t *e; char *fstr; uint64_t *codes; int count, i, j; int ret, errors = 0; for (i = 0, e = ppc_test_events; i < NUM_TEST_EVENTS; i++, e++) { codes = NULL; count = 0; fstr = NULL; ret = pfm_get_event_encoding(e->name, PFM_PLM0 | PFM_PLM3, &fstr, NULL, &codes, &count); if (ret != e->ret) { fprintf(fp,"Event%d %s, ret=%s(%d) expected %s(%d)\n", i, e->name, pfm_strerror(ret), ret, pfm_strerror(e->ret), e->ret); errors++; } else { if (ret != PFM_SUCCESS) { if (fstr) { fprintf(fp,"Event%d %s, expected fstr NULL but it is not\n", i, e->name); errors++; } if (count != 0) { fprintf(fp,"Event%d %s, expected count=0 instead of %d\n", i, e->name, count); errors++; } if (codes) { fprintf(fp,"Event%d %s, expected codes[] NULL but it is not\n", i, e->name); errors++; } } else { if (count != e->count) { fprintf(fp,"Event%d %s, count=%d expected %d\n", i, e->name, count, e->count); errors++; } for (j=0; j < count; j++) { if (codes[j] != e->codes[j]) { fprintf(fp,"Event%d %s, codes[%d]=%#"PRIx64" expected %#"PRIx64"\n", i, e->name, j, codes[j], e->codes[j]); errors++; } } if (e->fstr && strcmp(fstr, e->fstr)) { fprintf(fp,"Event%d %s, fstr=%s expected %s\n", i, e->name, fstr, e->fstr); errors++; } } } if (codes) free(codes); if (fstr) free(fstr); } printf("\t %d PowerPC events: %d errors\n", i, errors); return errors; } int validate_arch(FILE *fp) { return check_test_events(fp); } 
papi-papi-7-2-0-t/src/libpfm4/tests/validate_x86.c
/*
 * validate_x86.c - validate event tables + encodings
 *
 * Copyright (c) 2010 Google, Inc
 * Contributed by Stephane Eranian
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
 * of the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 * INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
 * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * This file is part of libpfm, a performance monitoring support library for
 * applications on Linux.
 */
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include <stdarg.h>
#include <errno.h>
#include <unistd.h>
#include <string.h>
#include <perfmon/pfmlib.h>

#define MAX_ENCODING	8
#define SRC_LINE	.line = __LINE__

typedef struct {
	const char *name;
	const char *fstr;
	uint64_t codes[MAX_ENCODING];
	int ret, count;
	int line;
} test_event_t;

static const test_event_t x86_test_events[]={
	{ SRC_LINE,
	  .name = "core::INST_RETIRED:ANY_P",
	  .ret  = PFM_SUCCESS,
	  .count = 1,
	  .codes[0] = 0x5300c0ull,
	},
	{ SRC_LINE,
	  .name = "core::INST_RETIRED:ANY_P:ANY_P",
	  .ret  = PFM_SUCCESS,
	  .count = 1,
	  .codes[0] = 0x5300c0ull,
	},
	{ SRC_LINE,
	  .name = "core::INST_RETIRED:ANY_P:DEAD",
	  .ret  = PFM_ERR_ATTR, /* cannot know if it is umask or mod */
	  .count = 0,
	  .codes[0] = 0ull,
	},
	{ SRC_LINE,
	  .name = "core::INST_RETIRED:ANY_P:u:u",
	  .ret  = PFM_SUCCESS,
	  .count = 1,
	  .codes[0] = 0x5100c0ull,
	},
	{ SRC_LINE,
	  .name = "core::INST_RETIRED:ANY_P:u=0:k=1:u=1",
	  .ret  = PFM_ERR_ATTR_SET,
	  .count = 0,
	  .codes[0] = 0ull,
	},
	{ SRC_LINE,
	  .name = "core::INST_RETIRED:ANY_P:c=1:i",
	  .ret  = PFM_SUCCESS,
	  .count = 1,
	  .codes[0] = 0x1d300c0ull,
	},
	{ SRC_LINE,
	  .name = "core::INST_RETIRED:ANY_P:c=1:i=1",
	  .ret  = PFM_SUCCESS,
	  .count = 1,
	  .codes[0] = 0x1d300c0ull,
	},
	{ SRC_LINE,
	  .name = "core::INST_RETIRED:ANY_P:c=2",
	  .ret  = PFM_SUCCESS,
	  .count = 1,
	  .codes[0] = 0x25300c0ull,
	},
	{ SRC_LINE,
	  .name = "core::INST_RETIRED:ANY_P:c=320",
	  .ret  = PFM_ERR_ATTR_VAL,
	  .count = 0,
	  .codes[0] = 0ull,
	},
	{ SRC_LINE,
	  .name = "core::INST_RETIRED:ANY_P:t=1",
	  .ret  = PFM_ERR_ATTR,
	  .count = 0,
	  .codes[0] = 0ull,
	},
	{ SRC_LINE,
	  .name = "core::L2_LINES_IN",
	  .ret  = PFM_SUCCESS,
	  .count = 1,
	  .codes[0] = 0x537024ull,
	},
	{ SRC_LINE,
	  .name = "core::L2_LINES_IN:SELF",
	  .ret  = PFM_SUCCESS,
	  .count = 1,
	  .codes[0] = 0x537024ull,
	  .fstr = "core::L2_LINES_IN:SELF:ANY:k=1:u=1:e=0:i=0:c=0",
	},
	{ SRC_LINE,
	  .name = "core::L2_LINES_IN:SELF:BOTH_CORES",
	  .ret  = PFM_ERR_FEATCOMB,
	  .count = 0,
	  .codes[0] = 0ull,
	},
	{ SRC_LINE,
	  .name = "core::L2_LINES_IN:SELF:PREFETCH",
	  .ret  = PFM_SUCCESS,
	  .count = 1,
	  .codes[0] = 0x535024ull,
	},
	{ SRC_LINE, .name =
"core::L2_LINES_IN:SELF:PREFETCH:ANY", .ret = PFM_ERR_FEATCOMB, .count = 0, .codes[0] = 0ull, }, { SRC_LINE, .name = "core::RS_UOPS_DISPATCHED_NONE", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d300a0ull, }, { SRC_LINE, .name = "core::RS_UOPS_DISPATCHED_NONE:c=2", .ret = PFM_ERR_ATTR_SET, .count = 1, .codes[0] = 0ull, }, { SRC_LINE, .name = "core::branch_instructions_retired", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5300c4ull, .fstr = "core::BR_INST_RETIRED:ANY:k=1:u=1:e=0:i=0:c=0" }, { SRC_LINE, .name = "nhm::branch_instructions_retired", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5300c4ull, .fstr = "nhm::BR_INST_RETIRED:ALL_BRANCHES:k=1:u=1:e=0:i=0:c=0:t=0" }, { SRC_LINE, .name = "wsm::BRANCH_INSTRUCTIONS_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5300c4ull, /* architected encoding, guaranteed to exist */ .fstr = "wsm::BR_INST_RETIRED:ALL_BRANCHES:k=1:u=1:e=0:i=0:c=0:t=0" }, { SRC_LINE, .name = "nhm::ARITH:DIV:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d60114ull, .fstr = "nhm::ARITH:CYCLES_DIV_BUSY:k=1:u=0:e=1:i=1:c=1:t=0", }, { SRC_LINE, .name = "nhm::ARITH:CYCLES_DIV_BUSY:k=1:u=1:e=1:i=1:c=1:t=0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d70114ull, .fstr = "nhm::ARITH:CYCLES_DIV_BUSY:k=1:u=1:e=1:i=1:c=1:t=0", }, { SRC_LINE, .name = "wsm::UOPS_EXECUTED:CORE_STALL_COUNT:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1f53fb1ull, .fstr = "wsm::UOPS_EXECUTED:CORE_STALL_CYCLES:k=0:u=1:e=1:i=1:c=1:t=1", }, { SRC_LINE, .name = "wsm::UOPS_EXECUTED:CORE_STALL_COUNT:u:t=0", .ret = PFM_ERR_ATTR_SET, .count = 0, .codes[0] = 0ull, }, { SRC_LINE, .name = "wsm_unc::unc_qmc_writes:full_any:partial_any", .ret = PFM_ERR_FEATCOMB, .count = 0, .codes[0] = 0ull, }, { SRC_LINE, .name = "wsm_unc::unc_qmc_writes", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x50072full, .fstr = "wsm_unc::UNC_QMC_WRITES:FULL_ANY:e=0:i=0:c=0:o=0", }, { SRC_LINE, .name = "wsm_unc::unc_qmc_writes:full_any", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x50072full, 
.fstr = "wsm_unc::UNC_QMC_WRITES:FULL_ANY:e=0:i=0:c=0:o=0", }, { SRC_LINE, .name = "wsm_unc::unc_qmc_writes:full_ch0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x50012full, .fstr = "wsm_unc::UNC_QMC_WRITES:FULL_CH0:e=0:i=0:c=0:o=0", }, { SRC_LINE, .name = "wsm_unc::unc_qmc_writes:partial_any", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x50382full, .fstr = "wsm_unc::UNC_QMC_WRITES:PARTIAL_ANY:e=0:i=0:c=0:o=0", }, { SRC_LINE, .name = "wsm_unc::unc_qmc_writes:partial_ch0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x50082full, .fstr = "wsm_unc::UNC_QMC_WRITES:PARTIAL_CH0:e=0:i=0:c=0:o=0", }, { SRC_LINE, .name = "wsm_unc::unc_qmc_writes:partial_ch0:partial_ch1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x50182full, .fstr = "wsm_unc::UNC_QMC_WRITES:PARTIAL_CH0:PARTIAL_CH1:e=0:i=0:c=0:o=0", }, { SRC_LINE, .name = "amd64_fam10h_barcelona::DISPATCHED_FPU", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x533f00ull, .fstr = "amd64_fam10h_barcelona::DISPATCHED_FPU:ALL:k=1:u=1:e=0:i=0:c=0:h=0:g=0" }, { SRC_LINE, .name = "amd64_fam10h_barcelona::DISPATCHED_FPU:k:u=0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x523f00ull, .fstr = "amd64_fam10h_barcelona::DISPATCHED_FPU:ALL:k=1:u=0:e=0:i=0:c=0:h=0:g=0" }, { SRC_LINE, .name = "amd64_fam10h_barcelona::DISPATCHED_FPU:OPS_ADD:OPS_MULTIPLY", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530300ull, .fstr = "amd64_fam10h_barcelona::DISPATCHED_FPU:OPS_ADD:OPS_MULTIPLY:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam10h_barcelona::L2_CACHE_MISS:ALL:DATA", .ret = PFM_ERR_FEATCOMB, .count = 0, .codes[0] = 0ull, }, { SRC_LINE, .name = "amd64_fam10h_barcelona::MEMORY_CONTROLLER_REQUESTS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x10053fff0ull, .fstr = "amd64_fam10h_barcelona::MEMORY_CONTROLLER_REQUESTS:ALL:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_k8_revb::RETURN_STACK_OVERFLOWS:g=1:u", .ret = PFM_ERR_ATTR, .count = 0, .codes[0] = 0ull, }, { SRC_LINE, .name = 
"amd64_k8_revb::RETURN_STACK_HITS:e=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x570088ull, .fstr = "amd64_k8_revb::RETURN_STACK_HITS:k=1:u=1:e=1:i=0:c=0", }, { SRC_LINE, .name = "amd64_k8_revb::PROBE:ALL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x533fecull, .fstr = "amd64_k8_revb::PROBE:ALL:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "amd64_k8_revc::PROBE:ALL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x533fecull, .fstr = "amd64_k8_revc::PROBE:ALL:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "amd64_k8_revd::PROBE:ALL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x537fecull, .fstr = "amd64_k8_revd::PROBE:ALL:k=1:u=1:e=0:i=0:c=0" }, { SRC_LINE, .name = "amd64_k8_reve::PROBE:ALL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x537fecull, .fstr = "amd64_k8_reve::PROBE:ALL:k=1:u=1:e=0:i=0:c=0" }, { SRC_LINE, .name = "amd64_k8_revf::PROBE:ALL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x537fecull, .fstr = "amd64_k8_revf::PROBE:ALL:k=1:u=1:e=0:i=0:c=0" }, { SRC_LINE, .name = "amd64_k8_revg::PROBE:ALL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x537fecull, .fstr = "amd64_k8_revg::PROBE:ALL:k=1:u=1:e=0:i=0:c=0" }, { SRC_LINE, .name = "amd64_fam10h_barcelona::L1_DTLB_MISS_AND_L2_DTLB_HIT:L2_1G_TLB_HIT", .ret = PFM_ERR_ATTR, .count = 0, .codes[0] = 0ull, }, { SRC_LINE, .name = "amd64_fam10h_barcelona::L1_DTLB_MISS_AND_L2_DTLB_HIT:ALL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530345ull, .fstr = "amd64_fam10h_barcelona::L1_DTLB_MISS_AND_L2_DTLB_HIT:ALL:k=1:u=1:e=0:i=0:c=0:h=0:g=0" }, { SRC_LINE, .name = "amd64_fam10h_shanghai::L1_DTLB_MISS_AND_L2_DTLB_HIT:ALL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530745ull, .fstr = "amd64_fam10h_shanghai::L1_DTLB_MISS_AND_L2_DTLB_HIT:ALL:k=1:u=1:e=0:i=0:c=0:h=0:g=0" }, { SRC_LINE, .name = "amd64_fam10h_istanbul::L1_DTLB_MISS_AND_L2_DTLB_HIT:ALL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530745ull, .fstr = "amd64_fam10h_istanbul::L1_DTLB_MISS_AND_L2_DTLB_HIT:ALL:k=1:u=1:e=0:i=0:c=0:h=0:g=0" }, { 
SRC_LINE, .name = "amd64_fam10h_barcelona::READ_REQUEST_TO_L3_CACHE", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x40053f7e0ull, .fstr = "amd64_fam10h_barcelona::READ_REQUEST_TO_L3_CACHE:ANY_READ:ALL_CORES:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam10h_shanghai::READ_REQUEST_TO_L3_CACHE", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x40053f7e0ull, .fstr = "amd64_fam10h_shanghai::READ_REQUEST_TO_L3_CACHE:ANY_READ:ALL_CORES:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "core::RAT_STALLS:ANY:u:c=1,cycles", /* must cut at comma */ .ret = PFM_ERR_INVAL, }, { SRC_LINE, .name = "wsm::mem_uncore_retired:remote_dram", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53200f, .fstr = "wsm::MEM_UNCORE_RETIRED:REMOTE_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm_dp::mem_uncore_retired:remote_dram", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53100f, .fstr = "wsm_dp::MEM_UNCORE_RETIRED:REMOTE_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm::mem_uncore_retired:local_dram", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53100f, .fstr = "wsm::MEM_UNCORE_RETIRED:LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm_dp::mem_uncore_retired:local_dram", .ret = PFM_ERR_ATTR, .count = 1, .codes[0] = 0, }, { SRC_LINE, .name = "nhm::mem_uncore_retired:uncacheable", .ret = PFM_ERR_ATTR, .count = 1, .codes[0] = 0, }, { SRC_LINE, .name = "nhm::mem_uncore_retired:l3_data_miss_unknown", .ret = PFM_ERR_ATTR, .count = 1, .codes[0] = 0, }, { SRC_LINE, .name = "nhm_ex::mem_uncore_retired:uncacheable", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53800f, .fstr = "nhm_ex::MEM_UNCORE_RETIRED:UNCACHEABLE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "nhm_ex::mem_uncore_retired:l3_data_miss_unknown", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53010f, .fstr = "nhm_ex::MEM_UNCORE_RETIRED:L3_DATA_MISS_UNKNOWN:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "nhm::mem_uncore_retired:local_dram", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 
0x53200f, .fstr = "nhm::MEM_UNCORE_RETIRED:LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "nhm_ex::mem_uncore_retired:local_dram", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53200f, .fstr = "nhm_ex::MEM_UNCORE_RETIRED:LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm::offcore_response_0:k", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5201b7, .codes[1] = 0xffff, .fstr = "wsm::OFFCORE_RESPONSE_0:ANY_REQUEST:ANY_RESPONSE:k=1:u=0:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm::offcore_response_0:local_dram", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x20ff, .fstr = "wsm::OFFCORE_RESPONSE_0:ANY_REQUEST:LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm::offcore_response_0:PF_IFETCH", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xff40, .fstr = "wsm::OFFCORE_RESPONSE_0:PF_IFETCH:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm::offcore_response_0:ANY_DATA:LOCAL_DRAM", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x2033, .fstr = "wsm::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:PF_DATA_RD:PF_RFO:LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm::offcore_response_0:DMND_RFO:DMND_DATA_RD:LOCAL_DRAM:REMOTE_DRAM", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x6003, .fstr = "wsm::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:LOCAL_DRAM:REMOTE_DRAM:k=1:u=1:e=0:i=0:c=0:t=0" }, { SRC_LINE, .name = "wsm::offcore_response_1:local_dram", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x20ff, .fstr = "wsm::OFFCORE_RESPONSE_1:ANY_REQUEST:LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm::offcore_response_1:PF_IFETCH", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0xff40, .fstr = "wsm::OFFCORE_RESPONSE_1:PF_IFETCH:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm::offcore_response_1:ANY_DATA:LOCAL_DRAM", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, 
.codes[1] = 0x2033, .fstr = "wsm::OFFCORE_RESPONSE_1:DMND_DATA_RD:DMND_RFO:PF_DATA_RD:PF_RFO:LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm::offcore_response_1:DMND_RFO:DMND_DATA_RD:LOCAL_DRAM:REMOTE_DRAM", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x6003, .fstr = "wsm::OFFCORE_RESPONSE_1:DMND_DATA_RD:DMND_RFO:LOCAL_DRAM:REMOTE_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm::offcore_response_0:ANY_LLC_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xf8ff, .fstr = "wsm::OFFCORE_RESPONSE_0:ANY_REQUEST:REMOTE_CACHE_HITM:REMOTE_CACHE_FWD:LOCAL_DRAM:REMOTE_DRAM:NON_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm_dp::offcore_response_0:ANY_LLC_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xf8ff, .fstr = "wsm_dp::OFFCORE_RESPONSE_0:ANY_REQUEST:REMOTE_CACHE_HITM:LOCAL_DRAM_AND_REMOTE_CACHE_HIT:REMOTE_DRAM:OTHER_LLC_MISS:NON_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm_dp::offcore_response_0:LOCAL_CACHE", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x7ff, .fstr = "wsm_dp::OFFCORE_RESPONSE_0:ANY_REQUEST:UNCORE_HIT:OTHER_CORE_HIT_SNP:OTHER_CORE_HITM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm_dp::offcore_response_0:ANY_CACHE_DRAM", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x7fff, .fstr = "wsm_dp::OFFCORE_RESPONSE_0:ANY_REQUEST:UNCORE_HIT:OTHER_CORE_HIT_SNP:OTHER_CORE_HITM:REMOTE_CACHE_HITM:LOCAL_DRAM_AND_REMOTE_CACHE_HIT:REMOTE_DRAM:OTHER_LLC_MISS:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "nhm::offcore_response_0:k", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5201b7, .codes[1] = 0xffff, .fstr = "nhm::OFFCORE_RESPONSE_0:ANY_REQUEST:ANY_RESPONSE:k=1:u=0:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "nhm::offcore_response_0:local_dram", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x40ff, .fstr = 
"nhm::OFFCORE_RESPONSE_0:ANY_REQUEST:LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "nhm::offcore_response_0:any_llc_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xf8ff, .fstr = "nhm::OFFCORE_RESPONSE_0:ANY_REQUEST:REMOTE_CACHE_HITM:REMOTE_CACHE_FWD:REMOTE_DRAM:LOCAL_DRAM:NON_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "nhm::offcore_response_0:any_dram", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x60ff, .fstr = "nhm::OFFCORE_RESPONSE_0:ANY_REQUEST:REMOTE_DRAM:LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "nhm::offcore_response_0:PF_IFETCH", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xff40, .fstr = "nhm::OFFCORE_RESPONSE_0:PF_IFETCH:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "nhm::offcore_response_0:ANY_DATA:LOCAL_DRAM", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x4033, .fstr = "nhm::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:PF_DATA_RD:PF_RFO:LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "nhm::offcore_response_0:DMND_RFO:DMND_DATA_RD:LOCAL_DRAM:REMOTE_DRAM", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x6003, .fstr = "nhm::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:REMOTE_DRAM:LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "amd64_k8_revg::DISPATCHED_FPU:0xff:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x52ff00ull, .fstr = "amd64_k8_revg::DISPATCHED_FPU:0xff:k=1:u=0:e=0:i=0:c=0" }, { SRC_LINE, .name = "amd64_k8_revg::DISPATCHED_FPU:0x4ff", .ret = PFM_ERR_ATTR, .count = 0, }, { SRC_LINE, .name = "amd64_fam10h_barcelona::DISPATCHED_FPU:0x4ff:u", .ret = PFM_ERR_ATTR }, { SRC_LINE, .name = "amd64_fam10h_barcelona::DISPATCHED_FPU:0xff:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x51ff00ull, .fstr = "amd64_fam10h_barcelona::DISPATCHED_FPU:0xff:k=0:u=1:e=0:i=0:c=0:h=0:g=0" }, { SRC_LINE, .name = "wsm::inst_retired:0xff:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 
0x52ffc0, .fstr = "wsm::INST_RETIRED:0xff:k=1:u=0:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm::uops_issued:0xff:stall_cycles", .ret = PFM_ERR_ATTR, .count = 0, }, { SRC_LINE, .name = "wsm::uops_issued:0xff:0xf1", .ret = PFM_ERR_ATTR, .count = 0, }, { SRC_LINE, .name = "wsm::uops_issued:0xff=", .ret = PFM_ERR_ATTR_VAL, .count = 0, }, { SRC_LINE, .name = "wsm::uops_issued:123", .ret = PFM_ERR_ATTR, .count = 0, }, { SRC_LINE, .name = "wsm::uops_issued:0xfff", .ret = PFM_ERR_ATTR, .count = 0, }, { SRC_LINE, .name = "netburst::global_power_events", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x2600020f, .codes[1] = 0x3d000, .fstr = "netburst::global_power_events:RUNNING:k=1:u=1:e=0:cmpl=0:thr=0", }, { SRC_LINE, .name = "netburst::global_power_events:RUNNING:u:k", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x2600020f, .codes[1] = 0x3d000, .fstr = "netburst::global_power_events:RUNNING:k=1:u=1:e=0:cmpl=0:thr=0", }, { SRC_LINE, .name = "netburst::global_power_events:RUNNING:e", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x2600020f, .codes[1] = 0x107d000, .fstr = "netburst::global_power_events:RUNNING:k=1:u=1:e=1:cmpl=0:thr=0", }, { SRC_LINE, .name = "netburst::global_power_events:RUNNING:cmpl:e:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x26000205, .codes[1] = 0x10fd000, .fstr = "netburst::global_power_events:RUNNING:k=0:u=1:e=1:cmpl=1:thr=0", }, { SRC_LINE, .name = "netburst::global_power_events:RUNNING:cmpl:thr=8:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x26000205, .codes[1] = 0x8fd000, .fstr = "netburst::global_power_events:RUNNING:k=0:u=1:e=0:cmpl=1:thr=8", }, { SRC_LINE, .name = "netburst::global_power_events:RUNNING:cmpl:thr=32:u", .ret = PFM_ERR_ATTR_VAL, .count = 0, }, { SRC_LINE, .name = "netburst::instr_completed:nbogus", .ret = PFM_ERR_NOTFOUND, .count = 0, }, { SRC_LINE, .name = "netburst_p::instr_completed:nbogus", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xe00020f, .codes[1] = 0x39000, .fstr = 
"netburst_p::instr_completed:NBOGUS:k=1:u=1:e=0:cmpl=0:thr=0", }, { SRC_LINE, .name = "snb::cpl_cycles:ring0_trans:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x155015c, .fstr = "snb::CPL_CYCLES:RING0:k=0:u=1:e=1:i=0:c=1:t=0", }, { SRC_LINE, .name = "snb::cpl_cycles:ring0_trans:e=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x157015cull, }, { SRC_LINE, .name = "snb::OFFCORE_REQUESTS_OUTSTanding:ALL_DATA_RD_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1530860, .fstr = "snb::OFFCORE_REQUESTS_OUTSTANDING:ALL_DATA_RD:k=1:u=1:e=0:i=0:c=1:t=0", }, { SRC_LINE, .name = "snb::uops_issued:core_stall_cycles:u:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1f3010e, .fstr = "snb::UOPS_ISSUED:ANY:k=1:u=1:e=0:i=1:c=1:t=1", }, { SRC_LINE, .name = "snb::LLC_REFERences:k:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x724f2e, .fstr = "snb::LAST_LEVEL_CACHE_REFERENCES:k=1:u=0:e=0:i=0:c=0:t=1", }, { SRC_LINE, .name = "snb::ITLB:0x1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5301ae, .fstr = "snb::ITLB:0x1:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_0:DMND_RFO:ANY_RESPONSE", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x10002, .fstr = "snb::OFFCORE_RESPONSE_0:DMND_RFO:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_0:DMND_RFO:ANY_REQUEST", .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x18fff, }, { SRC_LINE, .name = "snb::offcore_response_0:DMND_RFO", .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x10002, .fstr = "snb::OFFCORE_RESPONSE_0:DMND_RFO:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_0:any_response", .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x18fff, .fstr = "snb::OFFCORE_RESPONSE_0:ANY_REQUEST:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_0:NO_SUPP:SNP_NONE:PF_RFO:PF_IFETCH", .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x80020060, .fstr = 
"snb::OFFCORE_RESPONSE_0:PF_RFO:PF_IFETCH:NO_SUPP:SNP_NONE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_1:DMND_RFO:ANY_RESPONSE", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x10002, .fstr = "snb::OFFCORE_RESPONSE_1:DMND_RFO:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_1:DMND_RFO:ANY_REQUEST", .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x18fff, }, { SRC_LINE, .name = "snb::offcore_response_1:DMND_RFO", .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x10002, .fstr = "snb::OFFCORE_RESPONSE_1:DMND_RFO:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_1:any_response", .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x18fff, .fstr = "snb::OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_1:NO_SUPP:SNP_NONE:PF_RFO:PF_IFETCH", .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x80020060, .fstr = "snb::OFFCORE_RESPONSE_1:PF_RFO:PF_IFETCH:NO_SUPP:SNP_NONE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_1:ANY_REQUEST:LLC_MISS_LOCAL_DRAM", .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x3f80408fffull, .fstr = "snb::OFFCORE_RESPONSE_1:DMND_DATA_RD:DMND_RFO:DMND_IFETCH:WB:PF_DATA_RD:PF_RFO:PF_IFETCH:PF_LLC_DATA_RD:PF_LLC_RFO:PF_LLC_IFETCH:BUS_LOCKS:STRM_ST:OTHER:LLC_MISS_LOCAL_DRAM:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "amd64_fam14h_bobcat::MAB_REQUESTS:DC_BUFFER_0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530068, .fstr = "amd64_fam14h_bobcat::MAB_REQUESTS:DC_BUFFER_0:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam14h_bobcat::MAB_REQUESTS:DC_BUFFER_0:DC_BUFFER_1", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "amd64_fam14h_bobcat::MAB_REQUESTS:DC_BUFFER_0:IC_BUFFER_0", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "amd64_fam14h_bobcat::MAB_REQUESTS:ANY_DC_BUFFER", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530b68, .fstr = 
"amd64_fam14h_bobcat::MAB_REQUESTS:ANY_DC_BUFFER:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam14h_bobcat::MAB_REQUESTS:ANY_IC_BUFFER", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530a68, .fstr = "amd64_fam14h_bobcat::MAB_REQUESTS:ANY_IC_BUFFER:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam14h_bobcat::MAB_REQUESTS:ANY_IC_BUFFER:IC_BUFFER_1", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "core::INST_RETIRED:ANY_P:e", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "core::INST_RETIRED:ANY_P:e:c=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15700c0ull, }, { SRC_LINE, .name = "atom::INST_RETIRED:ANY_P:e", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5700c0ull, }, { SRC_LINE, .name = "atom::INST_RETIRED:ANY_P:e:c=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15700c0ull, }, { SRC_LINE, .name = "nhm::INST_RETIRED:ANY_P:e", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "nhm::INST_RETIRED:ANY_P:e:c=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15700c0ull, }, { SRC_LINE, .name = "wsm::INST_RETIRED:ANY_P:e", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "wsm::INST_RETIRED:ANY_P:e:c=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15701c0ull, }, { SRC_LINE, .name = "snb::INST_RETIRED:ANY_P:e", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "snb::INST_RETIRED:ANY_P:e:c=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15700c0ull, }, { SRC_LINE, .name = "snb::INST_RETIRED:ANY_P:e:c=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15700c0ull, }, { SRC_LINE, .name = "snb::offcore_response_0:any_request", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7ull, .codes[1]= 0x18fffull, .fstr = "snb::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:DMND_IFETCH:WB:PF_DATA_RD:PF_RFO:PF_IFETCH:PF_LLC_DATA_RD:PF_LLC_RFO:PF_LLC_IFETCH:BUS_LOCKS:STRM_ST:OTHER:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_0:dmnd_data_rd", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7ull, .codes[1]=0x10001ull, .fstr = 
"snb::OFFCORE_RESPONSE_0:DMND_DATA_RD:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_0:dmnd_data_rd:llc_hite", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7ull, .codes[1]=0x3f80080001ull, .fstr = "snb::OFFCORE_RESPONSE_0:DMND_DATA_RD:LLC_HITE:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_0:dmnd_data_rd:llc_hite:snp_any", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7ull, .codes[1]=0x3f80080001ull, .fstr = "snb::OFFCORE_RESPONSE_0:DMND_DATA_RD:LLC_HITE:SNP_NONE:SNP_NOT_NEEDED:SNP_MISS:SNP_NO_FWD:SNP_FWD:HITM:NON_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_0:dmnd_data_rd:llc_hite:hitm", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7ull, .codes[1]=0x1000080001ull, .fstr = "snb::OFFCORE_RESPONSE_0:DMND_DATA_RD:LLC_HITE:HITM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_0:dmnd_data_rd:any_response", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7ull, .codes[1]=0x10001ull, .fstr = "snb::OFFCORE_RESPONSE_0:DMND_DATA_RD:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_0:dmnd_data_rd:any_response:snp_any", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "snb::offcore_response_0:dmnd_data_rd:any_response:llc_hitmesf", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "snb::offcore_response_0:any_response", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7ull, .codes[1]=0x18fffull, .fstr = "snb::OFFCORE_RESPONSE_0:ANY_REQUEST:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_0:l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] =0x5301b7, .codes[1] = 0x3f80408fffull, .fstr = "snb::OFFCORE_RESPONSE_0:ANY_REQUEST:LLC_MISS_LOCAL_DRAM:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "amd64_fam11h_turion::MAB_REQUESTS:DC_BUFFER_0", .ret = PFM_ERR_NOTFOUND, }, { SRC_LINE, .name = "amd64_fam11h_turion::RETIRED_INSTRUCTIONS", .ret = PFM_SUCCESS, .count 
= 1, .codes[0] = 0x5300c0, .fstr = "amd64_fam11h_turion::RETIRED_INSTRUCTIONS:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam11h_turion::RETIRED_UOPS:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5200c1, .fstr = "amd64_fam11h_turion::RETIRED_UOPS:k=1:u=0:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam11h_turion::CPU_CLK_UNHALTED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530076, .fstr = "amd64_fam11h_turion::CPU_CLK_UNHALTED:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam11h_turion::RETIRED_UOPS:c=1:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d300c1, .fstr = "amd64_fam11h_turion::RETIRED_UOPS:k=1:u=1:e=0:i=1:c=1:h=0:g=0", }, { SRC_LINE, .name = "ivb::ARITH:FPU_DIV", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1570414, .fstr = "ivb::ARITH:FPU_DIV:k=1:u=1:e=1:i=0:c=1:t=0", }, { SRC_LINE, .name = "ivb::INST_RETIRED:ALL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5301c0, .fstr = "ivb::INST_RETIRED:ALL:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::INST_RETIRED:ALL:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5201c0, .fstr = "ivb::INST_RETIRED:ALL:k=1:u=0:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::INST_RETIRED:ALL:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5101c0, .fstr = "ivb::INST_RETIRED:ALL:k=0:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::TLB_ACCESS:LOAD_STLB_HIT", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53045f, .fstr = "ivb::DTLB_LOAD_ACCESS:STLB_HIT:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::TLB_ACCESS:STLB_HIT", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53045f, .fstr = "ivb::DTLB_LOAD_ACCESS:STLB_HIT:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::DTLB_LOAD_ACCESS:STLB_HIT", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53045f, .fstr = "ivb::DTLB_LOAD_ACCESS:STLB_HIT:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::MOVE_ELIMINATION:INT_NOT_ELIMINATED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530158, .fstr = 
"ivb::MOVE_ELIMINATION:INT_NOT_ELIMINATED:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::RESOURCE_STALLS:SB:RS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530ca2, .fstr = "ivb::RESOURCE_STALLS:RS:SB:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::RESOURCE_STALLS:ROB:RS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5314a2, .fstr = "ivb::RESOURCE_STALLS:RS:ROB:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::UOPS_EXECUTED:THREAD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5301b1, .fstr = "ivb::UOPS_EXECUTED:THREAD:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::UOPS_EXECUTED:THREAD:e:c=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15701b1, .fstr = "ivb::UOPS_EXECUTED:THREAD:k=1:u=1:e=1:i=0:c=1:t=0", }, { SRC_LINE, .name = "ivb::UOPS_EXECUTED:THREAD:e", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "ivb::UOPS_EXECUTED:THREAD:c=1:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d301b1, .fstr = "ivb::UOPS_EXECUTED:THREAD:k=1:u=1:e=0:i=1:c=1:t=0", }, { SRC_LINE, .name = "ivb::CPU_CLK_UNHALTED:REF_P", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53013c, .fstr = "ivb::CPU_CLK_UNHALTED:REF_XCLK:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::DTLB_LOAD_MISSES:DEMAND_LD_MISS_CAUSES_A_WALK", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538108, .fstr = "ivb::DTLB_LOAD_MISSES:MISS_CAUSES_A_WALK:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::offcore_response_0:k", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5201b7, .codes[1] = 0x18fff, .fstr = "ivb::OFFCORE_RESPONSE_0:ANY_REQUEST:ANY_RESPONSE:k=1:u=0:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::offcore_response_0:LLC_MISS_LOCAL", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f80408fffull, .fstr = "ivb::OFFCORE_RESPONSE_0:ANY_REQUEST:LLC_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::offcore_response_0:l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] =0x5301b7, .codes[1] = 0x3f80408fffull, .fstr = 
"ivb::OFFCORE_RESPONSE_0:ANY_REQUEST:LLC_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::DTLB_LOAD_MISSES:STLB_HIT", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53045f, .fstr = "ivb::DTLB_LOAD_MISSES:STLB_HIT:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::DTLB_LOAD_MISSES:LARGE_WALK_COMPLETED:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x518808, .fstr = "ivb::DTLB_LOAD_MISSES:LARGE_WALK_COMPLETED:k=0:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::l2_lines_in:i:i=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xd301f1, .fstr = "snb::L2_LINES_IN:I:k=1:u=1:e=0:i=1:c=0:t=0", }, { SRC_LINE, .name = "snb::l2_lines_in:i=1:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xd301f1, .fstr = "snb::L2_LINES_IN:I:k=1:u=1:e=0:i=1:c=0:t=0", }, { SRC_LINE, .name = "snb::l2_lines_in:i:i=0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5301f1, .fstr = "snb::L2_LINES_IN:I:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::l2_lines_in:e:e=0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5304f1, .fstr = "snb::L2_LINES_IN:E:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::l2_lines_in:e:e=1", .ret = PFM_ERR_ATTR, .count = 0, }, { SRC_LINE, .name = "snb::l2_lines_in:e:e=1:c=10", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xa5704f1, .fstr = "snb::L2_LINES_IN:E:k=1:u=1:e=1:i=0:c=10:t=0", }, { SRC_LINE, .name = "snb_unc_cbo0::unc_clockticks", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5000ff, .fstr = "snb_unc_cbo0::UNC_CLOCKTICKS", }, { SRC_LINE, .name = "snb_unc_cbo1::unc_clockticks", .ret = PFM_ERR_NOTFOUND }, { SRC_LINE, .name = "snb_unc_cbo2::unc_clockticks", .ret = PFM_ERR_NOTFOUND }, { SRC_LINE, .name = "snb_unc_cbo3::unc_clockticks", .ret = PFM_ERR_NOTFOUND }, { SRC_LINE, .name = "snb_unc_cbo1::UNC_CBO_CACHE_LOOKUP:STATE_MESI:READ_FILTER:c=1:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d01f34, .fstr = "snb_unc_cbo1::UNC_CBO_CACHE_LOOKUP:STATE_MESI:READ_FILTER:e=0:i=1:c=1", }, { SRC_LINE, .name = 
"snbep_unc_cbo1::UNC_C_CLOCKTICKS:u", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "snbep_unc_cbo0::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "snbep_unc_cbo0::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "snbep_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x334, .codes[1] = 0x7c0000, .fstr = "snbep_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ:STATE_MESIF:e=0:i=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "snbep_unc_cbo0::UNC_C_LLC_LOOKUP:ANY", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x1f34, .codes[1] = 0x7c0000, .fstr = "snbep_unc_cbo0::UNC_C_LLC_LOOKUP:ANY:STATE_MESIF:e=0:i=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "snbep_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ:nf=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "snbep_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ:tid=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "snbep_unc_cbo0::UNC_C_LLC_LOOKUP:NID", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "snbep_unc_cbo0::UNC_C_LLC_LOOKUP:NID:STATE_M", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "snbep_unc_cbo0::UNC_C_LLC_LOOKUP:NID:nf=3", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x4334, .codes[1] = 0x7c0c00, .fstr = "snbep_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ:NID:STATE_MESIF:e=0:i=0:t=0:tf=0:cf=0:nf=3", }, { SRC_LINE, .name = "snbep_unc_cbo0::UNC_C_LLC_LOOKUP:NID:STATE_M:nf=3", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x4334, .codes[1] = 0x200c00, .fstr = "snbep_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ:NID:STATE_M:e=0:i=0:t=0:tf=0:cf=0:nf=3", }, { SRC_LINE, .name = "snbep_unc_cbo0::UNC_C_LLC_LOOKUP:NID:nf=3:tid=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "snbep_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE", .ret = PFM_ERR_UMASK, }, { SRC_LINE, .name = "snbep_unc_cbo0::UNC_C_TOR_INSERTS:WB", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1035, .fstr = "snbep_unc_cbo0::UNC_C_TOR_INSERTS:WB:e=0:i=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "snbep_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_PCIWILF", .ret = PFM_SUCCESS, .count = 2, .codes[0] 
= 0x135, .codes[1] = 0xca000000, .fstr = "snbep_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_PCIWILF:e=0:i=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "snbep_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_PCIWILF:nf=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "snbep_unc_cbo0::UNC_C_TOR_INSERTS:NID_OPCODE:OPC_PCIRDCUR:nf=1", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x4135, .codes[1] = 0xcf000400, .fstr = "snbep_unc_cbo0::UNC_C_TOR_INSERTS:NID_OPCODE:OPC_PCIRDCUR:e=0:i=0:t=0:tf=0:cf=0:nf=1", }, { SRC_LINE, .name = "snbep_unc_cbo0::UNC_C_TOR_INSERTS:OPC_RFO:NID_OPCODE:nf=1", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x4135, .codes[1] = 0xc0000400, .fstr = "snbep_unc_cbo0::UNC_C_TOR_INSERTS:NID_OPCODE:OPC_RFO:e=0:i=0:t=0:tf=0:cf=0:nf=1", }, { SRC_LINE, .name = "snbep_unc_ha::UNC_H_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "snbep_unc_ha::UNC_H_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_ha::UNC_H_REQUESTS:READS:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1000301, .fstr = "snbep_unc_ha::UNC_H_REQUESTS:READS:e=0:i=0:t=1", }, { SRC_LINE, .name = "snbep_unc_imc0::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xff, .fstr = "snbep_unc_imc0::UNC_M_CLOCKTICKS", }, { SRC_LINE, .name = "snbep_unc_imc0::UNC_M_CLOCKTICKS:t=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "snbep_unc_imc0::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0304, .fstr = "snbep_unc_imc0::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_imc0::UNC_M_POWER_CKE_CYCLES:RANK0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x183, .fstr = "snbep_unc_imc0::UNC_M_POWER_CKE_CYCLES:RANK0:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_imc0::UNC_M_CAS_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc04, .fstr = "snbep_unc_imc0::UNC_M_CAS_COUNT:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_pcu::UNC_P_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = 
"snbep_unc_pcu::UNC_P_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_pcu::UNC_P_CLOCKTICKS:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1000000, .fstr = "snbep_unc_pcu::UNC_P_CLOCKTICKS:e=0:i=0:t=1", }, { SRC_LINE, .name = "snbep_unc_pcu::UNC_P_CORE0_TRANSITION_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x200003, .fstr = "snbep_unc_pcu::UNC_P_CORE0_TRANSITION_CYCLES:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "snbep_unc_pcu::UNC_P_FREQ_BAND1_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "snbep_unc_pcu::UNC_P_FREQ_BAND2_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "snbep_unc_pcu::UNC_P_FREQ_BAND3_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "snbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xb, .codes[1] = 0x20, .fstr = "snbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=0:i=0:t=0:ff=32", }, { SRC_LINE, .name = "snbep_unc_pcu::UNC_P_FREQ_BAND1_CYCLES:ff=16", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xc, .codes[1] = 0x1000, .fstr = "snbep_unc_pcu::UNC_P_FREQ_BAND1_CYCLES:e=0:i=0:t=0:ff=16", }, { SRC_LINE, .name = "snbep_unc_pcu::UNC_P_FREQ_BAND2_CYCLES:ff=8", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xd, .codes[1] = 0x80000, .fstr = "snbep_unc_pcu::UNC_P_FREQ_BAND2_CYCLES:e=0:i=0:t=0:ff=8", }, { SRC_LINE, .name = "snbep_unc_pcu::UNC_P_FREQ_BAND3_CYCLES:ff=40", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xe, .codes[1] = 0x28000000, .fstr = "snbep_unc_pcu::UNC_P_FREQ_BAND3_CYCLES:e=0:i=0:t=0:ff=40", }, { SRC_LINE, .name = "snbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:e", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x4000b, .codes[1] = 0x20, .fstr = "snbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=1:i=0:t=0:ff=32", }, { SRC_LINE, .name = "snbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:i", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x80000b, .codes[1] = 0x20, .fstr = 
"snbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=0:i=1:t=0:ff=32", }, { SRC_LINE, .name = "snbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:e:i", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x84000b, .codes[1] = 0x20, .fstr = "snbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=1:i=1:t=0:ff=32", }, { SRC_LINE, .name = "snbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:e:i:t=4", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x484000b, .codes[1] = 0x20, .fstr = "snbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=1:i=1:t=4:ff=32", }, { SRC_LINE, .name = "SNBEP_UNC_PCU::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x4080, .fstr = "snbep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=0:i=0:t=0" }, { SRC_LINE, .name = "SNBEP_UNC_PCU::UNC_P_POWER_STATE_OCCUPANCY:CORES_C3", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8080, .fstr = "snbep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C3:e=0:i=0:t=0", }, { SRC_LINE, .name = "SNBEP_UNC_PCU::UNC_P_POWER_STATE_OCCUPANCY:CORES_C6", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc080, .fstr = "snbep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C6:e=0:i=0:t=0" }, { SRC_LINE, .name = "SNBEP_UNC_PCU::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x40004080, .fstr = "snbep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=0:i=1:t=0" }, { SRC_LINE, .name = "SNBEP_UNC_PCU::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:i:e", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc0004080, .fstr = "snbep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=1:i=1:t=0" }, { SRC_LINE, .name = "snbep_unc_qpi0::UNC_Q_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x14, .fstr = "snbep_unc_qpi0::UNC_Q_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_qpi0::UNC_Q_RXL_FLITS_G0:DATA", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x201, .fstr = "snbep_unc_qpi0::UNC_Q_RXL_FLITS_G0:DATA:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_qpi0::UNC_Q_RXL_FLITS_G0:IDLE:t=1:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 
0x1800101, .fstr = "snbep_unc_qpi0::UNC_Q_RXL_FLITS_G0:IDLE:e=0:i=1:t=1", }, { SRC_LINE, .name = "snbep_unc_qpi0::UNC_Q_TXL_FLITS_G0:DATA", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x200, .fstr = "snbep_unc_qpi0::UNC_Q_TXL_FLITS_G0:DATA:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_qpi0::UNC_Q_RXL_FLITS_G1:HOM", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x200602, .fstr = "snbep_unc_qpi0::UNC_Q_RXL_FLITS_G1:HOM:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_qpi0::UNC_Q_TXL_FLITS_G1:HOM", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x200600, .fstr = "snbep_unc_qpi0::UNC_Q_TXL_FLITS_G1:HOM:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_ubo::UNC_U_LOCK_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x44, .fstr = "snbep_unc_ubo::UNC_U_LOCK_CYCLES:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_r2pcie::UNC_R2_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "snbep_unc_r2pcie::UNC_R2_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_r2pcie::UNC_R2_RING_AD_USED:ANY", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf07, .fstr = "snbep_unc_r2pcie::UNC_R2_RING_AD_USED:ANY:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_r3qpi0::UNC_R3_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "snbep_unc_r3qpi0::UNC_R3_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_r3qpi0::UNC_R3_TXR_CYCLES_FULL:e=0:i=0:t=0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x25, .fstr = "snbep_unc_r3qpi0::UNC_R3_TXR_CYCLES_FULL:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_r3qpi1::UNC_R3_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "snbep_unc_r3qpi1::UNC_R3_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "snbep_unc_r3qpi1::UNC_R3_TXR_CYCLES_FULL:e=0:i=0:t=0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x25, .fstr = "snbep_unc_r3qpi1::UNC_R3_TXR_CYCLES_FULL:e=0:i=0:t=0", }, { SRC_LINE, .name = "knc::cpu_clk_unhalted", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53002a, .fstr = 
"knc::CPU_CLK_UNHALTED:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "knc::instructions_executed", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530016, .fstr = "knc::INSTRUCTIONS_EXECUTED:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "knc::vpu_data_read", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x532000, .fstr = "knc::VPU_DATA_READ:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "knc::vpu_data_read:t:c=1:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1f32000, .fstr = "knc::VPU_DATA_READ:k=1:u=1:e=0:i=1:c=1:t=1", }, { SRC_LINE, .name = "snb_ep::cpl_cycles:ring0_trans:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x155015c, .fstr = "snb_ep::CPL_CYCLES:RING0:k=0:u=1:e=1:i=0:c=1:t=0", }, { SRC_LINE, .name = "snb_ep::cycle_activity:0x6:c=6", .count = 1, .codes[0] = 0x65306a3, .fstr = "snb_ep::CYCLE_ACTIVITY:0x6:k=1:u=1:e=0:i=0:c=6:t=0", }, { SRC_LINE, .name = "snb_ep::cpl_cycles:ring0_trans:e=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x157015cull, }, { SRC_LINE, .name = "snb_ep::OFFCORE_REQUESTS_OUTSTanding:ALL_DATA_RD_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1530860, .fstr = "snb_ep::OFFCORE_REQUESTS_OUTSTANDING:ALL_DATA_RD:k=1:u=1:e=0:i=0:c=1:t=0", }, { SRC_LINE, .name = "snb_ep::uops_issued:core_stall_cycles:u:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1f3010e, .fstr = "snb_ep::UOPS_ISSUED:ANY:k=1:u=1:e=0:i=1:c=1:t=1", }, { SRC_LINE, .name = "snb_ep::LLC_REFERences:k:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x724f2e, .fstr = "snb_ep::LAST_LEVEL_CACHE_REFERENCES:k=1:u=0:e=0:i=0:c=0:t=1", }, { SRC_LINE, .name = "snb_ep::ITLB:0x1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5301ae, .fstr = "snb_ep::ITLB:0x1:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb_ep::mem_load_uops_llc_miss_retired:local_dram", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5301d3, .fstr = "snb_ep::MEM_LOAD_UOPS_LLC_MISS_RETIRED:LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = 
"snb_ep::mem_load_uops_llc_miss_retired:remote_dram", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5304d3, .fstr = "snb_ep::MEM_LOAD_UOPS_LLC_MISS_RETIRED:REMOTE_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb_ep::offcore_response_0:DMND_RFO:ANY_RESPONSE", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x10002, .fstr = "snb_ep::OFFCORE_RESPONSE_0:DMND_RFO:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb_ep::offcore_response_0:DMND_RFO:ANY_REQUEST", .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x18fff, }, { SRC_LINE, .name = "snb_ep::offcore_response_0:DMND_RFO", .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x10002, .fstr = "snb_ep::OFFCORE_RESPONSE_0:DMND_RFO:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb_ep::offcore_response_0:any_response", .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x18fff, .fstr = "snb_ep::OFFCORE_RESPONSE_0:ANY_REQUEST:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb_ep::offcore_response_0:NO_SUPP:SNP_NONE:PF_RFO:PF_IFETCH", .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x80020060, .fstr = "snb_ep::OFFCORE_RESPONSE_0:PF_RFO:PF_IFETCH:NO_SUPP:SNP_NONE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb_ep::offcore_response_1:DMND_RFO:ANY_RESPONSE", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x10002, .fstr = "snb_ep::OFFCORE_RESPONSE_1:DMND_RFO:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb_ep::offcore_response_1:DMND_RFO:ANY_REQUEST", .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x18fff, }, { SRC_LINE, .name = "snb_ep::offcore_response_1:DMND_RFO", .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x10002, .fstr = "snb_ep::OFFCORE_RESPONSE_1:DMND_RFO:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb_ep::offcore_response_1:any_response", .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x18fff, .fstr = "snb_ep::OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name 
= "snb_ep::offcore_response_1:NO_SUPP:SNP_NONE:PF_RFO:PF_IFETCH", .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x80020060, .fstr = "snb_ep::OFFCORE_RESPONSE_1:PF_RFO:PF_IFETCH:NO_SUPP:SNP_NONE:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb_ep::offcore_response_1:ANY_REQUEST:LLC_MISS_LOCAL_DRAM", .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x3f80408fffull, .fstr = "snb_ep::OFFCORE_RESPONSE_1:DMND_DATA_RD:DMND_RFO:DMND_IFETCH:WB:PF_DATA_RD:PF_RFO:PF_IFETCH:PF_LLC_DATA_RD:PF_LLC_RFO:PF_LLC_IFETCH:BUS_LOCKS:STRM_ST:OTHER:LLC_MISS_LOCAL_DRAM:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb_ep::offcore_response_1:ANY_REQUEST:LLC_MISS_REMOTE_DRAM", .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x3fff808fffull, .fstr = "snb_ep::OFFCORE_RESPONSE_1:DMND_DATA_RD:DMND_RFO:DMND_IFETCH:WB:PF_DATA_RD:PF_RFO:PF_IFETCH:PF_LLC_DATA_RD:PF_LLC_RFO:PF_LLC_IFETCH:BUS_LOCKS:STRM_ST:OTHER:LLC_MISS_REMOTE_DRAM:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb_ep::offcore_response_0:l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] =0x5301b7, .codes[1] = 0x3fffc08fffull, .fstr = "snb_ep::OFFCORE_RESPONSE_0:ANY_REQUEST:LLC_MISS_LOCAL_DRAM:LLC_MISS_REMOTE_DRAM:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb_ep::mem_trans_retired:latency_above_threshold", .ret = PFM_SUCCESS, .count = 2, .codes[0]=0x5301cd, .codes[1] = 3, .fstr = "snb_ep::MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD:k=1:u=1:e=0:i=0:c=0:t=0:ldlat=3", }, { SRC_LINE, .name = "snb_ep::mem_trans_retired:latency_above_threshold:ldlat=0", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "snb_ep::mem_trans_retired:latency_above_threshold:ldlat=3", .ret = PFM_SUCCESS, .count = 2, .codes[0]=0x5301cd, .codes[1] = 3, .fstr = "snb_ep::MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD:k=1:u=1:e=0:i=0:c=0:t=0:ldlat=3", }, { SRC_LINE, .name = "snb_ep::mem_trans_retired:latency_above_threshold:ldlat=1000000", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "snb::mem_trans_retired:latency_above_threshold", 
.ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301cd, .codes[1] = 3, .fstr = "snb::MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD:k=1:u=1:e=0:i=0:c=0:t=0:ldlat=3", }, { SRC_LINE, .name = "snb::mem_trans_retired:latency_above_threshold:ldlat=0", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "snb::mem_trans_retired:latency_above_threshold:ldlat=3", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301cd, .codes[1] = 3, .fstr = "snb::MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD:k=1:u=1:e=0:i=0:c=0:t=0:ldlat=3", }, { SRC_LINE, .name = "snb::mem_trans_retired:latency_above_threshold:ldlat=1000000", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "snb::cycle_activity:0x6:c=6", .count = 1, .codes[0] = 0x65306a3, .fstr = "snb::CYCLE_ACTIVITY:0x6:k=1:u=1:e=0:i=0:c=6:t=0", }, { SRC_LINE, .name = "ivb::mem_trans_retired:latency_above_threshold", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301cd, .codes[1] = 3, .fstr = "ivb::MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD:k=1:u=1:e=0:i=0:c=0:t=0:ldlat=3", }, { SRC_LINE, .name = "ivb::mem_trans_retired:latency_above_threshold:ldlat=0", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "ivb::mem_trans_retired:latency_above_threshold:ldlat=3:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5101cd, .codes[1] = 3, .fstr = "ivb::MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD:k=0:u=1:e=0:i=0:c=0:t=0:ldlat=3", }, { SRC_LINE, .name = "ivb::mem_trans_retired:latency_above_threshold:ldlat=1000000", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "nhm::mem_inst_retired:latency_above_threshold", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53100b, .codes[1] = 3, .fstr = "nhm::MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD:k=1:u=1:e=0:i=0:c=0:t=0:ldlat=3", }, { SRC_LINE, .name = "nhm::mem_inst_retired:latency_above_threshold:ldlat=0", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "nhm::mem_inst_retired:latency_above_threshold:ldlat=3", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53100b, .codes[1] = 3, .fstr = 
"nhm::MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD:k=1:u=1:e=0:i=0:c=0:t=0:ldlat=3", }, { SRC_LINE, .name = "nhm::mem_inst_retired:latency_above_threshold:ldlat=1000000", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "wsm::mem_inst_retired:latency_above_threshold", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53100b, .codes[1] = 3, .fstr = "wsm::MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD:k=1:u=1:e=0:i=0:c=0:t=0:ldlat=3", }, { SRC_LINE, .name = "wsm::mem_inst_retired:latency_above_threshold:ldlat=0", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "wsm::mem_inst_retired:latency_above_threshold:ldlat=3", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53100b, .codes[1] = 3, .fstr = "wsm::MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD:k=1:u=1:e=0:i=0:c=0:t=0:ldlat=3", }, { SRC_LINE, .name = "wsm::mem_inst_retired:latency_above_threshold:ldlat=1000000", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "amd64_fam15h_interlagos::LINK_TRANSMIT_BANDWIDTH_LINK_0:NOP_DW_SENT", .ret = PFM_ERR_NOTFOUND, /* event in Northbridge PMU */ }, { SRC_LINE, .name = "amd64_fam15h_nb::LINK_TRANSMIT_BANDWIDTH_LINK_0:NOP_DW_SENT", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5308f6, .fstr = "amd64_fam15h_nb::LINK_TRANSMIT_BANDWIDTH_LINK_0:NOP_DW_SENT:SUBLINK_0", }, { SRC_LINE, .name = "amd64_fam15h_nb::LINK_TRANSMIT_BANDWIDTH_LINK_0:ALL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x533ff6, .fstr = "amd64_fam15h_nb::LINK_TRANSMIT_BANDWIDTH_LINK_0:ALL:SUBLINK_0", }, { SRC_LINE, .name = "amd64_fam15h_nb::LINK_TRANSMIT_BANDWIDTH_LINK_0:ALL:SUBLINK_1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53bff6, .fstr = "amd64_fam15h_nb::LINK_TRANSMIT_BANDWIDTH_LINK_0:ALL:SUBLINK_1", }, { SRC_LINE, .name = "amd64_fam15h_nb::LINK_TRANSMIT_BANDWIDTH_LINK_0:COMMAND_DW_SENT:DATA_DW_SENT", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5303f6, .fstr = "amd64_fam15h_nb::LINK_TRANSMIT_BANDWIDTH_LINK_0:COMMAND_DW_SENT:DATA_DW_SENT:SUBLINK_0", }, { SRC_LINE, .name = "amd64_fam15h_interlagos::DISPATCHED_FPU_OPS:0x4ff:u", 
.ret = PFM_ERR_ATTR }, { SRC_LINE, .name = "amd64_fam15h_interlagos::DISPATCHED_FPU_OPS:0xff:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x51ff00ull, .fstr = "amd64_fam15h_interlagos::DISPATCHED_FPU_OPS:0xff:k=0:u=1:e=0:i=0:c=0:h=0:g=0" }, { SRC_LINE, .name = "amd64_fam15h_nb::READ_REQUEST_TO_L3_CACHE:read_block_modify:core_3", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x4005334e0ull, .fstr = "amd64_fam15h_nb::READ_REQUEST_TO_L3_CACHE:READ_BLOCK_MODIFY:CORE_3", }, { SRC_LINE, .name = "amd64_fam15h_nb::READ_REQUEST_TO_L3_CACHE", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x40053f7e0ull, .fstr = "amd64_fam15h_nb::READ_REQUEST_TO_L3_CACHE:READ_BLOCK_ANY:ANY_CORE", }, { SRC_LINE, .name = "amd64_fam15h_nb::READ_REQUEST_TO_L3_CACHE:READ_BLOCK_EXCLUSIVE:PREFETCH:READ_BLOCK_MODIFY:core_4", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x400534de0ull, .fstr = "amd64_fam15h_nb::READ_REQUEST_TO_L3_CACHE:READ_BLOCK_EXCLUSIVE:READ_BLOCK_MODIFY:PREFETCH:CORE_4", }, { SRC_LINE, .name = "amd64_fam15h_nb::READ_REQUEST_TO_L3_CACHE:read_block_any:prefetch:core_1", .ret = PFM_ERR_FEATCOMB, /* must use individual umasks to combine with prefetch */ }, { SRC_LINE, .name = "amd64_fam15h_nb::READ_REQUEST_TO_L3_CACHE:read_block_any:prefetch:core_1:core_3", .ret = PFM_ERR_FEATCOMB, /* core umasks cannot be combined */ }, { SRC_LINE, .name = "amd64_fam15h_nb::READ_REQUEST_TO_L3_CACHE:prefetch:core_0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x4005308e0ull, .fstr = "amd64_fam15h_nb::READ_REQUEST_TO_L3_CACHE:PREFETCH:CORE_0", }, { SRC_LINE, .name = "ivb_ep::mem_load_uops_llc_miss_retired:local_dram", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5301d3, .fstr = "ivb_ep::MEM_LOAD_UOPS_LLC_MISS_RETIRED:LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb_ep::mem_load_uops_llc_miss_retired:remote_dram", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530cd3, .fstr = "ivb_ep::MEM_LOAD_UOPS_LLC_MISS_RETIRED:REMOTE_DRAM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = 
"ivb_ep::mem_load_uops_llc_miss_retired:remote_hitm", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5310d3, .fstr = "ivb_ep::MEM_LOAD_UOPS_LLC_MISS_RETIRED:REMOTE_HITM:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb_ep::mem_load_uops_llc_miss_retired:remote_fwd", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5320d3, .fstr = "ivb_ep::MEM_LOAD_UOPS_LLC_MISS_RETIRED:REMOTE_FWD:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb::mem_load_uops_llc_miss_retired:remote_dram", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "ivb::cycle_activity:0x6:c=6", .count = 1, .codes[0] = 0x65306a3, .fstr = "ivb::CYCLE_ACTIVITY:0x6:k=1:u=1:e=0:i=0:c=6:t=0", }, { SRC_LINE, .name = "ivb_ep::offcore_response_0:any_request:LLC_MISS_REMOTE", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3fff808fffULL, .fstr = "ivb_ep::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:DMND_IFETCH:WB:PF_DATA_RD:PF_RFO:PF_IFETCH:PF_LLC_DATA_RD:PF_LLC_RFO:PF_LLC_IFETCH:BUS_LOCKS:STRM_ST:OTHER:LLC_MISS_REMOTE_DRAM:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0" }, { SRC_LINE, .name = "ivb_ep::offcore_response_0:l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] =0x5301b7, .codes[1] = 0x3fffc08fffull, .fstr = "ivb_ep::OFFCORE_RESPONSE_0:ANY_REQUEST:LLC_MISS_LOCAL:LLC_MISS_REMOTE_DRAM:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "hsw::mem_trans_retired:latency_above_threshold:ldlat=3:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5101cd, .codes[1] = 3, .fstr = "hsw::MEM_TRANS_RETIRED:LOAD_LATENCY:k=0:u=1:e=0:i=0:c=0:t=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::mem_trans_retired:latency_above_threshold:ldlat=1000000", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "hsw::mem_trans_retired:load_latency", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301cd, .codes[1] = 3, .fstr = "hsw::MEM_TRANS_RETIRED:LOAD_LATENCY:k=1:u=1:e=0:i=0:c=0:t=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::mem_trans_retired:load_latency:ldlat=1000000", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, 
.name = "hsw::mem_trans_retired:latency_above_threshold:ldlat=0:intx=0:intxcp=0", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "hsw::inst_Retired:any_p:intx", .count = 1, .codes[0] = 0x1005300c0ull, .fstr = "hsw::INST_RETIRED:ANY_P:k=1:u=1:e=0:i=0:c=0:t=0:intx=1:intxcp=0", }, { SRC_LINE, .name = "hsw::inst_Retired:any_p:intx:intxcp", .count = 1, .codes[0] = 0x3005300c0ull, .fstr = "hsw::INST_RETIRED:ANY_P:k=1:u=1:e=0:i=0:c=0:t=0:intx=1:intxcp=1", }, { SRC_LINE, .name = "hsw::inst_Retired:any_p:intx=0:intxcp", .count = 1, .codes[0] = 0x2005300c0ull, .fstr = "hsw::INST_RETIRED:ANY_P:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=1", }, { SRC_LINE, .name = "hsw::cycle_activity:cycles_l2_pending", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15301a3, .fstr = "hsw::CYCLE_ACTIVITY:CYCLES_L2_PENDING:k=1:u=1:e=0:i=0:c=1:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::cycle_activity:cycles_l2_pending:c=8", .ret = PFM_ERR_ATTR_SET }, { SRC_LINE, .name = "hsw::hle_retired:aborted", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5304c8, .fstr = "hsw::HLE_RETIRED:ABORTED:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::rtm_retired:aborted", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5304c9, .fstr = "hsw::RTM_RETIRED:ABORTED:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::offcore_response_0:k:intx=1", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x1005201b7ull, .codes[1] = 0x18fff, .fstr = "hsw::OFFCORE_RESPONSE_0:ANY_REQUEST:ANY_RESPONSE:k=1:u=0:e=0:i=0:c=0:t=0:intx=1:intxcp=0", }, { SRC_LINE, .name = "hsw::offcore_response_0:any_request", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x18fff, .fstr = "hsw::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:WB:PF_DATA_RD:PF_RFO:PF_CODE_RD:PF_L3_DATA_RD:PF_L3_RFO:PF_L3_CODE_RD:SPLIT_LOCK_UC_LOCK:STRM_ST:OTHER:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::offcore_response_0:any_request:any_response:L3_MISS_LOCAL", .ret = 
PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "hsw::offcore_response_0:split_lock_uc_lock", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x10400, .fstr = "hsw::OFFCORE_RESPONSE_0:SPLIT_LOCK_UC_LOCK:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::offcore_response_0:any_ifetch", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x10240, .fstr = "hsw::OFFCORE_RESPONSE_0:ANY_IFETCH:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::offcore_response_0:L3_HITF", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hsw::offcore_response_0:LLC_MISS_LOCAL", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hsw::offcore_response_0:L3_HIT:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5101b7, .codes[1] = 0x3f801c8fffull, .fstr = "hsw::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_HITM:L3_HITE:L3_HITS:SNP_ANY:k=0:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::offcore_response_0:ANY_DATA", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x10091, .fstr = "hsw::OFFCORE_RESPONSE_0:DMND_DATA_RD:PF_DATA_RD:PF_L3_DATA_RD:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::offcore_response_0:DMND_DATA_RD:L3_HITS:SNP_FWD", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x800100001ull, .fstr = "hsw::OFFCORE_RESPONSE_0:DMND_DATA_RD:L3_HITS:SNP_FWD:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::offcore_response_0:l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] =0x5301b7, .codes[1] = 0x3f80408fffull, .fstr = "hsw::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "ivb_unc_cbo0::unc_clockticks", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5000ff, .fstr = "ivb_unc_cbo0::UNC_CLOCKTICKS", }, { SRC_LINE, .name = "ivb_unc_cbo1::unc_clockticks", .ret = PFM_ERR_NOTFOUND }, /* * RAPL note: * we can only use the PKG event because it is 
the only one available on all processors. The GPU event is client-only; the CORES event exists only on certain early CPUs. */ { SRC_LINE, .name = "rapl::rapl_energy_pkg", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x2, .fstr = "rapl::RAPL_ENERGY_PKG", }, { SRC_LINE, .name = "rapl::rapl_energy_pkg:u", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "slm::offcore_response_0:snp_hitm", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x100001ffffull, .fstr = "slm::OFFCORE_RESPONSE_0:ANY_REQUEST:ANY_RESPONSE:SNP_HITM:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "slm::offcore_response_0:any_data", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x12011, .fstr = "slm::OFFCORE_RESPONSE_0:DMND_DATA_RD:PF_L2_DATA_RD:PF_L1_DATA_RD:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "slm::offcore_response_0:uc_ifetch", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x10200, .fstr = "slm::OFFCORE_RESPONSE_0:UC_IFETCH:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "slm::offcore_response_0:any_ifetch", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x10244, .fstr = "slm::OFFCORE_RESPONSE_0:DMND_IFETCH:PF_IFETCH:UC_IFETCH:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "slm::offcore_response_1:snp_hitm", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5302b7, .codes[1] = 0x100001ffffull, .fstr = "slm::OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE:SNP_HITM:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "slm::offcore_response_1:any_data", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5302b7, .codes[1] = 0x12011, .fstr = "slm::OFFCORE_RESPONSE_1:DMND_DATA_RD:PF_L2_DATA_RD:PF_L1_DATA_RD:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "slm::offcore_response_1:uc_ifetch", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5302b7, .codes[1] = 0x10200, .fstr = "slm::OFFCORE_RESPONSE_1:UC_IFETCH:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "slm::offcore_response_1:any_ifetch", .ret = PFM_SUCCESS, .count = 2, 
.codes[0] = 0x5302b7, .codes[1]=0x10244, .fstr = "slm::OFFCORE_RESPONSE_1:DMND_IFETCH:PF_IFETCH:UC_IFETCH:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "slm::decode_restriction:predecode_wrong", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5301e9, .fstr = "slm::DECODE_RESTRICTION:PREDECODE_WRONG:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "slm::rs_full_stall:any", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x531fcb, .fstr = "slm::RS_FULL_STALL:ALL:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "slm::no_alloc_cycles:any", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x533fca, .fstr = "slm::NO_ALLOC_CYCLES:ALL:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "slm::no_alloc_cycles:any:t=1", .ret = PFM_ERR_ATTR }, { SRC_LINE, .name = "ivbep_unc_irp::unc_i_clockticks", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "ivbep_unc_irp::UNC_I_CLOCKTICKS:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_irp::unc_i_transactions:reads", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x115, .fstr = "ivbep_unc_irp::UNC_I_TRANSACTIONS:READS:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_irp::unc_i_transactions:reads:c=1:i", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "ivbep_unc_irp::unc_i_transactions:reads:t=6", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x6000115, .fstr = "ivbep_unc_irp::UNC_I_TRANSACTIONS:READS:e=0:t=6", }, { SRC_LINE, .name = "ivbep_unc_cbo1::UNC_C_CLOCKTICKS:u", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "ivbep_unc_cbo0::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x334, .codes[1] = 0x7e0000, .fstr = "ivbep_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ:STATE_MESIF:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ:nf=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_LLC_LOOKUP", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 
0x1134, .codes[1] = 0x7e0000, .fstr = "ivbep_unc_cbo0::UNC_C_LLC_LOOKUP:ANY:STATE_MESIF:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_LLC_LOOKUP:NID:STATE_M", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_LLC_LOOKUP:NID:nf=3", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x5134, .codes[1] = 0x7e0000, .codes[2] = 0x3, .fstr = "ivbep_unc_cbo0::UNC_C_LLC_LOOKUP:ANY:NID:STATE_MESIF:e=0:t=0:tf=0:cf=0:nf=3", }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_LLC_LOOKUP:NID:STATE_M:tid=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ:WRITE", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_LLC_LOOKUP:WRITE:NID:nf=3:tf=1:e:t=1", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x10c4534, .codes[1] = 0x7e0001, .codes[2] = 0x3, .fstr = "ivbep_unc_cbo0::UNC_C_LLC_LOOKUP:WRITE:NID:STATE_MESIF:e=1:t=1:tf=1:cf=0:nf=3", }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_LLC_VICTIMS", .ret = PFM_ERR_UMASK, }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_LLC_VICTIMS:NID", .ret = PFM_ERR_UMASK, }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_LLC_VICTIMS:NID:nf=1", .ret = PFM_ERR_UMASK, }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_LLC_VICTIMS:STATE_M", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x137, .fstr = "ivbep_unc_cbo0::UNC_C_LLC_VICTIMS:STATE_M:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_LLC_VICTIMS:STATE_M:STATE_S", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x537, .fstr = "ivbep_unc_cbo0::UNC_C_LLC_VICTIMS:STATE_M:STATE_S:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_LLC_VICTIMS:STATE_M:STATE_S:NID:nf=1", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x4537, .codes[1] = 0x0, .codes[2] = 0x1, .fstr = "ivbep_unc_cbo0::UNC_C_LLC_VICTIMS:STATE_M:STATE_S:NID:e=0:t=0:tf=0:cf=0:nf=1", }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE", .ret = PFM_ERR_UMASK, }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_TOR_INSERTS:WB", .ret = PFM_SUCCESS, 
.count = 1, .codes[0] = 0x1035, .fstr = "ivbep_unc_cbo0::UNC_C_TOR_INSERTS:WB:e=0:t=0:tf=0:cf=0:isoc=0:nc=0", }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_PCIWILF", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x135, .codes[1] = 0x0, .codes[2] = 0x19400000ull, .fstr = "ivbep_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_PCIWILF:e=0:t=0:tf=0:cf=0:isoc=0:nc=0", }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_PCIWILF:isoc=1", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x135, .codes[1] = 0x0, .codes[2] = 0x99400000ull, .fstr = "ivbep_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_PCIWILF:e=0:t=0:tf=0:cf=0:isoc=1:nc=0", }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_PCIWILF:nf=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_TOR_INSERTS:NID_OPCODE:OPC_PCIRDCUR:nf=1", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x4135, .codes[1] = 0x0, .codes[2] = 0x19e00001ull, .fstr = "ivbep_unc_cbo0::UNC_C_TOR_INSERTS:NID_OPCODE:OPC_PCIRDCUR:e=0:t=0:tf=0:cf=0:nf=1:isoc=0:nc=0", }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_TOR_INSERTS:OPC_RFO:NID_OPCODE:nf=1", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x4135, .codes[1] = 0x0, .codes[2] = 0x18000001ull, .fstr = "ivbep_unc_cbo0::UNC_C_TOR_INSERTS:NID_OPCODE:OPC_RFO:e=0:t=0:tf=0:cf=0:nf=1:isoc=0:nc=0", }, { SRC_LINE, .name = "ivbep_unc_cbo0::UNC_C_TOR_OCCUPANCY:MISS_REMOTE", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8a36, .fstr = "ivbep_unc_cbo0::UNC_C_TOR_OCCUPANCY:MISS_REMOTE:e=0:t=0:tf=0:cf=0:isoc=0:nc=0", }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "ivbep_unc_pcu::UNC_P_CLOCKTICKS:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_CLOCKTICKS:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1000000, .fstr = "ivbep_unc_pcu::UNC_P_CLOCKTICKS:e=0:t=1", }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_CORE0_TRANSITION_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x70, .fstr = 
"ivbep_unc_pcu::UNC_P_CORE0_TRANSITION_CYCLES:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_FREQ_BAND1_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_FREQ_BAND2_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_FREQ_BAND3_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xb, .codes[1] = 0x20, .fstr = "ivbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=0:t=0:ff=32", }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_FREQ_BAND1_CYCLES:ff=16", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xc, .codes[1] = 0x1000, .fstr = "ivbep_unc_pcu::UNC_P_FREQ_BAND1_CYCLES:e=0:t=0:ff=16", }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_FREQ_BAND2_CYCLES:ff=8", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xd, .codes[1] = 0x80000, .fstr = "ivbep_unc_pcu::UNC_P_FREQ_BAND2_CYCLES:e=0:t=0:ff=8", }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_FREQ_BAND3_CYCLES:ff=40", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xe, .codes[1] = 0x28000000, .fstr = "ivbep_unc_pcu::UNC_P_FREQ_BAND3_CYCLES:e=0:t=0:ff=40", }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:e", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x4000b, .codes[1] = 0x20, .fstr = "ivbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=1:t=0:ff=32", }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:t=24", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x1800000b, .codes[1] = 0x20, .fstr = "ivbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=0:t=24:ff=32", }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:e:t=4", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x404000b, .codes[1] = 0x20, .fstr = "ivbep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=1:t=4:ff=32", }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x4080, .fstr = 
"ivbep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=0:i=0:t=0" }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C3", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8080, .fstr = "ivbep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C3:e=0:i=0:t=0", }, { SRC_LINE, .name = "IVBEP_UNC_PCU::UNC_P_POWER_STATE_OCCUPANCY:CORES_C6", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc080, .fstr = "ivbep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C6:e=0:i=0:t=0" }, { SRC_LINE, .name = "IVBEP_UNC_PCU::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:t=6", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x6004080, .fstr = "ivbep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=0:i=0:t=6" }, { SRC_LINE, .name = "IVBEP_UNC_PCU::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:t=6:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x46004080, .fstr = "ivbep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=0:i=1:t=6" }, { SRC_LINE, .name = "IVBEP_UNC_PCU::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:t=6:i:e", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc6004080, .fstr = "ivbep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=1:i=1:t=6" }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_DEMOTIONS_CORE10", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x42, .fstr = "ivbep_unc_pcu::UNC_P_DEMOTIONS_CORE10:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_pcu::UNC_P_DEMOTIONS_CORE14", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x46, .fstr = "ivbep_unc_pcu::UNC_P_DEMOTIONS_CORE14:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_ha0::UNC_H_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "ivbep_unc_ha0::UNC_H_CLOCKTICKS:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_ha1::UNC_H_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "ivbep_unc_ha1::UNC_H_CLOCKTICKS:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_ha1::UNC_H_REQUESTS:READS:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1000301, .fstr = "ivbep_unc_ha1::UNC_H_REQUESTS:READS:e=0:t=1", }, { SRC_LINE, .name = 
"ivbep_unc_ha0::UNC_H_IMC_WRITES:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1000f1a, .fstr = "ivbep_unc_ha0::UNC_H_IMC_WRITES:ALL:e=0:t=1", }, { SRC_LINE, .name = "ivbep_unc_ha0::UNC_H_IMC_READS:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1000117, .fstr = "ivbep_unc_ha0::UNC_H_IMC_READS:NORMAL:e=0:t=1", }, { SRC_LINE, .name = "ivbep_unc_imc0::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xff, .fstr = "ivbep_unc_imc0::UNC_M_CLOCKTICKS", }, { SRC_LINE, .name = "ivbep_unc_imc0::UNC_M_CLOCKTICKS:t=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "ivbep_unc_imc0::UNC_M_DCLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "ivbep_unc_imc0::UNC_M_DCLOCKTICKS:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_imc4::UNC_M_DCLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "ivbep_unc_imc4::UNC_M_DCLOCKTICKS:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_imc0::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0304, .fstr = "ivbep_unc_imc0::UNC_M_CAS_COUNT:RD:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_imc0::UNC_M_PRE_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0802, .fstr = "ivbep_unc_imc0::UNC_M_PRE_COUNT:WR:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_imc0::UNC_M_POWER_CKE_CYCLES:RANK0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x183, .fstr = "ivbep_unc_imc0::UNC_M_POWER_CKE_CYCLES:RANK0:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_imc0::UNC_M_CAS_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc04, .fstr = "ivbep_unc_imc0::UNC_M_CAS_COUNT:WR:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_imc0::UNC_M_RD_CAS_RANK0:BANK0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1b0, .fstr = "ivbep_unc_imc0::UNC_M_RD_CAS_RANK0:BANK0:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x80b4, .fstr = "ivbep_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7:t=1", 
.ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x10080b4, .fstr = "ivbep_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7:e=0:t=1", }, { SRC_LINE, .name = "ivbep_unc_qpi0::UNC_Q_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x14, .fstr = "ivbep_unc_qpi0::UNC_Q_CLOCKTICKS:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_qpi0::UNC_Q_RXL_FLITS_G0:DATA", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x201, .fstr = "ivbep_unc_qpi0::UNC_Q_RXL_FLITS_G0:DATA:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_qpi0::UNC_Q_RXL_FLITS_G0:IDLE:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1000101, .fstr = "ivbep_unc_qpi0::UNC_Q_RXL_FLITS_G0:IDLE:e=0:t=1", }, { SRC_LINE, .name = "ivbep_unc_qpi0::UNC_Q_TXL_FLITS_G0:DATA", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x200, .fstr = "ivbep_unc_qpi0::UNC_Q_TXL_FLITS_G0:DATA:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_qpi0::UNC_Q_RXL_FLITS_G1:HOM", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x200602, .fstr = "ivbep_unc_qpi0::UNC_Q_RXL_FLITS_G1:HOM:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_qpi0::UNC_Q_TXL_FLITS_G1:HOM", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x200600, .fstr = "ivbep_unc_qpi0::UNC_Q_TXL_FLITS_G1:HOM:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_ubo::UNC_U_LOCK_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x44, .fstr = "ivbep_unc_ubo::UNC_U_LOCK_CYCLES:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_r2pcie::UNC_R2_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "ivbep_unc_r2pcie::UNC_R2_CLOCKTICKS:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_r2pcie::UNC_R2_RING_AD_USED:CW", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x3307, .fstr = "ivbep_unc_r2pcie::UNC_R2_RING_AD_USED:CW:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_r3qpi0::UNC_R3_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "ivbep_unc_r3qpi0::UNC_R3_CLOCKTICKS:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_r3qpi0::UNC_R3_TXR_CYCLES_FULL:e=0:t=0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x25, .fstr = 
"ivbep_unc_r3qpi0::UNC_R3_TXR_CYCLES_FULL:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_r3qpi1::UNC_R3_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "ivbep_unc_r3qpi1::UNC_R3_CLOCKTICKS:e=0:t=0", }, { SRC_LINE, .name = "ivbep_unc_r3qpi1::UNC_R3_TXR_CYCLES_FULL:e=0:t=0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x25, .fstr = "ivbep_unc_r3qpi1::UNC_R3_TXR_CYCLES_FULL:e=0:t=0", }, { SRC_LINE, .name = "hsw_ep::mem_trans_retired:latency_above_threshold:ldlat=3:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5101cd, .codes[1] = 3, .fstr = "hsw_ep::MEM_TRANS_RETIRED:LOAD_LATENCY:k=0:u=1:e=0:i=0:c=0:t=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw_ep::mem_trans_retired:latency_above_threshold:ldlat=1000000", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "hsw_ep::mem_trans_retired:load_latency", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301cd, .codes[1] = 3, .fstr = "hsw_ep::MEM_TRANS_RETIRED:LOAD_LATENCY:k=1:u=1:e=0:i=0:c=0:t=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw_ep::cycle_activity:0x6:c=6", .count = 1, .codes[0] = 0x65306a3, .fstr = "hsw_ep::CYCLE_ACTIVITY:0x6:k=1:u=1:e=0:i=0:c=6:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw_ep::mem_trans_retired:load_latency:ldlat=1000000", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "hsw_ep::mem_trans_retired:latency_above_threshold:ldlat=0:intx=0:intxcp=0", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "hsw_ep::inst_Retired:any_p:intx", .count = 1, .codes[0] = 0x1005300c0ull, .fstr = "hsw_ep::INST_RETIRED:ANY_P:k=1:u=1:e=0:i=0:c=0:t=0:intx=1:intxcp=0", }, { SRC_LINE, .name = "hsw_ep::inst_Retired:any_p:intx:intxcp", .count = 1, .codes[0] = 0x3005300c0ull, .fstr = "hsw_ep::INST_RETIRED:ANY_P:k=1:u=1:e=0:i=0:c=0:t=0:intx=1:intxcp=1", }, { SRC_LINE, .name = "hsw_ep::inst_Retired:any_p:intx=0:intxcp", .count = 1, .codes[0] = 0x2005300c0ull, .fstr = "hsw_ep::INST_RETIRED:ANY_P:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=1", }, { SRC_LINE, .name = 
"hsw_ep::cycle_activity:cycles_l2_pending", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15301a3, .fstr = "hsw_ep::CYCLE_ACTIVITY:CYCLES_L2_PENDING:k=1:u=1:e=0:i=0:c=1:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw_ep::cycle_activity:cycles_l2_pending:c=8", .ret = PFM_ERR_ATTR_SET, }, { SRC_LINE, .name = "hsw_ep::hle_retired:aborted", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5304c8, .fstr = "hsw_ep::HLE_RETIRED:ABORTED:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw_ep::mem_load_uops_l3_miss_retired:remote_dram", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5304d3, .fstr = "hsw_ep::MEM_LOAD_UOPS_L3_MISS_RETIRED:REMOTE_DRAM:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw_ep::offcore_response_0:any_data:L3_miss_local", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f80400091ull, .fstr = "hsw_ep::OFFCORE_RESPONSE_0:DMND_DATA_RD:PF_DATA_RD:PF_L3_DATA_RD:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw_ep::offcore_response_0:any_data:L3_miss_remote", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3fb8000091ull, .fstr = "hsw_ep::OFFCORE_RESPONSE_0:DMND_DATA_RD:PF_DATA_RD:PF_L3_DATA_RD:L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, /* here SNP_ANY gets expanded when passed on the cmdline, but not when added automatically by library */ .name = "hsw_ep::OFFCORE_RESPONSE_0:DMND_DATA_RD:PF_DATA_RD:PF_L3_DATA_RD:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f80400091ull, .fstr = "hsw_ep::OFFCORE_RESPONSE_0:DMND_DATA_RD:PF_DATA_RD:PF_L3_DATA_RD:L3_MISS_LOCAL:SNP_NONE:SNP_NOT_NEEDED:SNP_MISS:SNP_NO_FWD:SNP_FWD:SNP_HITM:SNP_NON_DRAM:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0" }, { SRC_LINE, .name = "hsw_ep::offcore_response_0:any_data:LLC_miss_local", .ret = PFM_ERR_ATTR, }, { 
SRC_LINE, .name = "hsw_ep::offcore_response_0:any_data:LLC_miss_remote", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hsw_ep::offcore_response_0:any_data:L3_HIT", .ret = PFM_SUCCESS, .count = 2, .codes[0] =0x5301b7, .codes[1] =0x3f803c0091ull, .fstr = "hsw_ep::OFFCORE_RESPONSE_0:DMND_DATA_RD:PF_DATA_RD:PF_L3_DATA_RD:L3_HITM:L3_HITE:L3_HITS:L3_HITF:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw_ep::offcore_response_0:l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] =0x5301b7, .codes[1] = 0x3fb8408fffull, .fstr = "hsw_ep::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::mem_trans_retired:latency_above_threshold:ldlat=3:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5101cd, .codes[1] = 3, .fstr = "bdw::MEM_TRANS_RETIRED:LOAD_LATENCY:k=0:u=1:e=0:i=0:c=0:t=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::mem_trans_retired:latency_above_threshold:ldlat=1000000", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "bdw::mem_trans_retired:load_latency", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301cd, .codes[1] = 3, .fstr = "bdw::MEM_TRANS_RETIRED:LOAD_LATENCY:k=1:u=1:e=0:i=0:c=0:t=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::mem_trans_retired:load_latency:ldlat=1000000", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "bdw::mem_trans_retired:latency_above_threshold:ldlat=0:intx=0:intxcp=0", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "bdw::inst_Retired:any_p:intx", .count = 1, .codes[0] = 0x1005300c0ull, .fstr = "bdw::INST_RETIRED:ANY_P:k=1:u=1:e=0:i=0:c=0:t=0:intx=1:intxcp=0", }, { SRC_LINE, .name = "bdw::inst_Retired:any_p:intx:intxcp", .count = 1, .codes[0] = 0x3005300c0ull, .fstr = "bdw::INST_RETIRED:ANY_P:k=1:u=1:e=0:i=0:c=0:t=0:intx=1:intxcp=1", }, { SRC_LINE, .name = "bdw::cycle_activity:0x6:c=6", .count = 1, .codes[0] = 0x65306a3, .fstr = 
"bdw::CYCLE_ACTIVITY:0x6:k=1:u=1:e=0:i=0:c=6:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::inst_Retired:any_p:intx=0:intxcp", .count = 1, .codes[0] = 0x2005300c0ull, .fstr = "bdw::INST_RETIRED:ANY_P:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=1", }, { SRC_LINE, .name = "bdw::cycle_activity:cycles_l2_pending", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15301a3, .fstr = "bdw::CYCLE_ACTIVITY:CYCLES_L2_PENDING:k=1:u=1:e=0:i=0:c=1:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::cycle_activity:stalls_ldm_pending", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x65306a3, .fstr = "bdw::CYCLE_ACTIVITY:STALLS_LDM_PENDING:k=1:u=1:e=0:i=0:c=6:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::CYCLE_ACTIVITY:STALLS_LDM_PENDING:k=1:u=1:e=0:i=0:c=6:t=0:intx=0:intxcp=0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x65306a3, .fstr = "bdw::CYCLE_ACTIVITY:STALLS_LDM_PENDING:k=1:u=1:e=0:i=0:c=6:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::cycle_activity:cycles_l2_pending:c=8", .ret = PFM_ERR_ATTR_SET, }, { SRC_LINE, .name = "bdw::hle_retired:aborted", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5304c8, .fstr = "bdw::HLE_RETIRED:ABORTED:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::rtm_retired:aborted", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5304c9, .fstr = "bdw::RTM_RETIRED:ABORTED:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::arith:fpu_div_active", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530114, .fstr = "bdw::ARITH:FPU_DIV_ACTIVE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::inst_retired:prec_dist", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5301c0, .fstr = "bdw::INST_RETIRED:PREC_DIST:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::rs_events:empty_end", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d7015e, .fstr = "bdw::RS_EVENTS:EMPTY_END:k=1:u=1:e=1:i=1:c=1:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::offcore_response_0:llc_hit", .ret = PFM_SUCCESS, 
.count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f803c8fffull, .fstr = "bdw::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_HITM:L3_HITE:L3_HITS:L3_HITF:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::offcore_response_0:llc_miss_local", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f84008fffull, .fstr = "bdw::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::offcore_response_0:l3_miss_local", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f84008fffull, .fstr = "bdw::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::offcore_response_0:l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] =0x5301b7, .codes[1] = 0x3f84008fffull, .fstr = "bdw::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::offcore_response_1:any_data", .ret = PFM_SUCCESS, .count = 2, .codes[0] =0x5301bb, .codes[1] = 0x10091, .fstr = "bdw::OFFCORE_RESPONSE_1:DMND_DATA_RD:PF_DATA_RD:PF_LLC_DATA_RD:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw_ep::offcore_response_0:l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3fbc008fffull, .fstr = "bdw_ep::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw_ep::offcore_response_1:l3_miss_remote", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x3fb8008fffull, .fstr = "bdw_ep::OFFCORE_RESPONSE_1:ANY_REQUEST:L3_MISS_REMOTE_HOP0:L3_MISS_REMOTE_HOP1:L3_MISS_REMOTE_HOP2P:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw_ep::offcore_response_0:L3_MISS_REMOTE_HOP0_DRAM", .ret = PFM_SUCCESS, .count = 2, .codes[0] =0x5301b7, .codes[1] = 0x3f88008fffull, 
.fstr = "bdw_ep::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_REMOTE_HOP0:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw::offcore_response_0:L3_MISS_REMOTE_HOP0_DRAM", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hswep_unc_cbo1::UNC_C_CLOCKTICKS:u", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo0::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo1::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo1::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo2::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo2::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo3::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo3::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo4::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo4::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo5::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo5::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo6::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo6::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo7::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo7::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo8::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo8::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo9::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo9::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo10::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo10::UNC_C_CLOCKTICKS", }, { SRC_LINE, 
.name = "hswep_unc_cbo11::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo11::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo12::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo12::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo13::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo13::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo14::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo14::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo15::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo15::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo16::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo16::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo17::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_cbo17::UNC_C_CLOCKTICKS", }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x334, .codes[1] = 0xfe0000, .fstr = "hswep_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ:STATE_MESIFD:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ:nf=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_LLC_LOOKUP", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x1134, .codes[1] = 0xfe0000, .fstr = "hswep_unc_cbo0::UNC_C_LLC_LOOKUP:ANY:STATE_MESIFD:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_LLC_LOOKUP:NID:STATE_M", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_LLC_LOOKUP:NID:nf=3", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x5134, .codes[1] = 0xfe0000, .codes[2] = 0x3, .fstr = "hswep_unc_cbo0::UNC_C_LLC_LOOKUP:ANY:NID:STATE_MESIFD:e=0:t=0:tf=0:nf=3:cf=0", }, { SRC_LINE, .name = 
"hswep_unc_cbo0::UNC_C_LLC_LOOKUP:NID:STATE_M:tid=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ:WRITE", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_LLC_LOOKUP:WRITE:NID:nf=3:tf=1:e:t=1", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x10c4534, .codes[1] = 0xfe0001, .codes[2] = 0x3, .fstr = "hswep_unc_cbo0::UNC_C_LLC_LOOKUP:WRITE:NID:STATE_MESIFD:e=1:t=1:tf=1:nf=3:cf=0", }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_LLC_VICTIMS", .ret = PFM_ERR_UMASK, }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_LLC_VICTIMS:NID", .ret = PFM_ERR_UMASK, }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_LLC_VICTIMS:NID:nf=1", .ret = PFM_ERR_UMASK, }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_LLC_VICTIMS:STATE_M", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x137, .fstr = "hswep_unc_cbo0::UNC_C_LLC_VICTIMS:STATE_M:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_LLC_VICTIMS:STATE_M:STATE_S", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x537, .fstr = "hswep_unc_cbo0::UNC_C_LLC_VICTIMS:STATE_M:STATE_S:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_LLC_VICTIMS:STATE_M:STATE_S:NID:nf=1", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x4537, .codes[1] = 0x0, .codes[2] = 0x1, .fstr = "hswep_unc_cbo0::UNC_C_LLC_VICTIMS:STATE_M:STATE_S:NID:e=0:t=0:tf=0:nf=1:cf=0", }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE", .ret = PFM_ERR_UMASK, }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_TOR_INSERTS:WB", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1035, .fstr = "hswep_unc_cbo0::UNC_C_TOR_INSERTS:WB:e=0:t=0:tf=0:isoc=0:nc=0:cf=0", }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_ITOM", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x135, .codes[1] = 0x0, .codes[2] = 0x1c800000ull, .fstr = "hswep_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_ITOM:e=0:t=0:tf=0:isoc=0:nc=0:cf=0", }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_ITOM:isoc=1", .ret = 
PFM_SUCCESS, .count = 3, .codes[0] = 0x135, .codes[1] = 0x0, .codes[2] = 0x9c800000ull, .fstr = "hswep_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_ITOM:e=0:t=0:tf=0:isoc=1:nc=0:cf=0", }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_PCIWILF:nf=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_TOR_INSERTS:NID_OPCODE:OPC_PCIRDCUR:nf=1", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x4135, .codes[1] = 0x0, .codes[2] = 0x19e00001ull, .fstr = "hswep_unc_cbo0::UNC_C_TOR_INSERTS:NID_OPCODE:OPC_PCIRDCUR:e=0:t=0:tf=0:nf=1:isoc=0:nc=0:cf=0", }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_TOR_INSERTS:OPC_RFO:NID_OPCODE:nf=1", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x4135, .codes[1] = 0x0, .codes[2] = 0x18000001ull, .fstr = "hswep_unc_cbo0::UNC_C_TOR_INSERTS:NID_OPCODE:OPC_RFO:e=0:t=0:tf=0:nf=1:isoc=0:nc=0:cf=0", }, { SRC_LINE, .name = "hswep_unc_cbo0::UNC_C_TOR_OCCUPANCY:MISS_REMOTE", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8a36, .fstr = "hswep_unc_cbo0::UNC_C_TOR_OCCUPANCY:MISS_REMOTE:e=0:t=0:tf=0:isoc=0:nc=0:cf=0", }, { SRC_LINE, .name = "hswep_unc_irp::unc_i_clockticks", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "hswep_unc_irp::UNC_I_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_irp::unc_i_coherent_ops:RFO", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x813, .fstr = "hswep_unc_irp::UNC_I_COHERENT_OPS:RFO:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_irp::unc_i_transactions:reads", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x116, .fstr = "hswep_unc_irp::UNC_I_TRANSACTIONS:READS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_irp::unc_i_transactions:reads:c=1:i", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hswep_unc_irp::unc_i_transactions:reads:t=6", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x6000116, .fstr = "hswep_unc_irp::UNC_I_TRANSACTIONS:READS:e=0:i=0:t=6", }, { SRC_LINE, .name = "hswep_unc_sbo0::unc_s_clockticks", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = 
"hswep_unc_sbo0::UNC_S_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_sbo1::unc_s_clockticks", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "hswep_unc_sbo1::UNC_S_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_sbo2::unc_s_clockticks", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "hswep_unc_sbo2::UNC_S_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_sbo3::unc_s_clockticks", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "hswep_unc_sbo3::UNC_S_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "hswep_unc_pcu::UNC_P_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_CLOCKTICKS:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1000000, .fstr = "hswep_unc_pcu::UNC_P_CLOCKTICKS:e=0:i=0:t=1", }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_CORE0_TRANSITION_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x60, .fstr = "hswep_unc_pcu::UNC_P_CORE0_TRANSITION_CYCLES:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_CORE17_TRANSITION_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x71, .fstr = "hswep_unc_pcu::UNC_P_CORE17_TRANSITION_CYCLES:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_FREQ_BAND1_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_FREQ_BAND2_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_FREQ_BAND3_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xb, .codes[1] = 0x20, .fstr = "hswep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=0:i=0:t=0:ff=32", }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_FREQ_BAND1_CYCLES:ff=16", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xc, .codes[1] = 0x1000, .fstr = 
"hswep_unc_pcu::UNC_P_FREQ_BAND1_CYCLES:e=0:i=0:t=0:ff=16", }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_FREQ_BAND2_CYCLES:ff=8", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xd, .codes[1] = 0x80000, .fstr = "hswep_unc_pcu::UNC_P_FREQ_BAND2_CYCLES:e=0:i=0:t=0:ff=8", }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_FREQ_BAND3_CYCLES:ff=40", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xe, .codes[1] = 0x28000000, .fstr = "hswep_unc_pcu::UNC_P_FREQ_BAND3_CYCLES:e=0:i=0:t=0:ff=40", }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:e", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x4000b, .codes[1] = 0x20, .fstr = "hswep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=1:i=0:t=0:ff=32", }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:t=24", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x1800000b, .codes[1] = 0x20, .fstr = "hswep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=0:i=0:t=24:ff=32", }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:e:t=4", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x404000b, .codes[1] = 0x20, .fstr = "hswep_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=1:i=0:t=4:ff=32", }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x4080, .fstr = "hswep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=0:i=0:t=0" }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C3", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8080, .fstr = "hswep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C3:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C6", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc080, .fstr = "hswep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C6:e=0:i=0:t=0" }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:t=6:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x46004080, .fstr = "hswep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=0:i=1:t=6" }, { SRC_LINE, .name = 
"hswep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:t=6", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x6004080, .fstr = "hswep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=0:i=0:t=6" }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:t=6:i:e", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc6004080, .fstr = "hswep_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=1:i=1:t=6" }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_DEMOTIONS_CORE10", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x3a, .fstr = "hswep_unc_pcu::UNC_P_DEMOTIONS_CORE10:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_DEMOTIONS_CORE14", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x3e, .fstr = "hswep_unc_pcu::UNC_P_DEMOTIONS_CORE14:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_pcu::UNC_P_DEMOTIONS_CORE17", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x41, .fstr = "hswep_unc_pcu::UNC_P_DEMOTIONS_CORE17:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_ha0::UNC_H_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "hswep_unc_ha0::UNC_H_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_ha1::UNC_H_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "hswep_unc_ha1::UNC_H_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_ha1::UNC_H_REQUESTS:READS:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1000301, .fstr = "hswep_unc_ha1::UNC_H_REQUESTS:READS:e=0:i=0:t=1", }, { SRC_LINE, .name = "hswep_unc_ha0::UNC_H_IMC_WRITES:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1000f1a, .fstr = "hswep_unc_ha0::UNC_H_IMC_WRITES:ALL:e=0:i=0:t=1", }, { SRC_LINE, .name = "hswep_unc_ha0::UNC_H_IMC_READS:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1000117, .fstr = "hswep_unc_ha0::UNC_H_IMC_READS:NORMAL:e=0:i=0:t=1", }, { SRC_LINE, .name = "hswep_unc_imc0::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xff, .fstr = "hswep_unc_imc0::UNC_M_CLOCKTICKS", }, { SRC_LINE, .name = 
"hswep_unc_imc0::UNC_M_CLOCKTICKS:t=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hswep_unc_imc0::UNC_M_DCLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_imc0::UNC_M_DCLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_imc4::UNC_M_DCLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "hswep_unc_imc4::UNC_M_DCLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_imc0::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0304, .fstr = "hswep_unc_imc0::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_imc0::UNC_M_PRE_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0802, .fstr = "hswep_unc_imc0::UNC_M_PRE_COUNT:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_imc0::UNC_M_POWER_CKE_CYCLES:RANK0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x183, .fstr = "hswep_unc_imc0::UNC_M_POWER_CKE_CYCLES:RANK0:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_imc0::UNC_M_CAS_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc04, .fstr = "hswep_unc_imc0::UNC_M_CAS_COUNT:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_imc0::UNC_M_RD_CAS_RANK0:BANK0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xb0, .fstr = "hswep_unc_imc0::UNC_M_RD_CAS_RANK0:BANK0:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_imc0::UNC_M_RD_CAS_RANK0:ALLBANKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x10b0, .fstr = "hswep_unc_imc0::UNC_M_RD_CAS_RANK0:ALLBANKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_imc0::UNC_M_RD_CAS_RANK0:BANKG0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x11b0, .fstr = "hswep_unc_imc0::UNC_M_RD_CAS_RANK0:BANKG0:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x7b4, .fstr = "hswep_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x10007b4, .fstr = 
"hswep_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7:e=0:i=0:t=1", }, { SRC_LINE, .name = "hswep_unc_imc0::UNC_M_RD_CAS_RANK7:BANK7:t=1:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x18007b7, .fstr = "hswep_unc_imc0::UNC_M_RD_CAS_RANK7:BANK7:e=0:i=1:t=1", }, { SRC_LINE, .name = "hswep_unc_sbo0::UNC_S_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "hswep_unc_sbo0::UNC_S_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_sbo0::UNC_S_FAST_ASSERTED:t=1:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1800009, .fstr = "hswep_unc_sbo0::UNC_S_FAST_ASSERTED:e=0:i=1:t=1", }, { SRC_LINE, .name = "hswep_unc_sbo3::UNC_S_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "hswep_unc_sbo3::UNC_S_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_sbo3::UNC_S_FAST_ASSERTED:t=1:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1800009, .fstr = "hswep_unc_sbo3::UNC_S_FAST_ASSERTED:e=0:i=1:t=1", }, { SRC_LINE, .name = "hswep_unc_ubo::UNC_U_EVENT_MSG", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x842, .fstr = "hswep_unc_ubo::UNC_U_EVENT_MSG:DOORBELL_RCVD:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_qpi0::UNC_Q_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x14, .fstr = "hswep_unc_qpi0::UNC_Q_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_qpi0::UNC_Q_DIRECT2CORE:SUCCESS_RBT_HIT", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x113, .fstr = "hswep_unc_qpi0::UNC_Q_DIRECT2CORE:SUCCESS_RBT_HIT:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_qpi0::UNC_Q_RXL_FLITS_G1:DRS:i:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1a01802, .fstr = "hswep_unc_qpi0::UNC_Q_RXL_FLITS_G1:DRS:e=0:i=1:t=1", }, { SRC_LINE, .name = "hswep_unc_qpi0::UNC_Q_TXL_FLITS_G2:NDR_AD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x200101, .fstr = "hswep_unc_qpi0::UNC_Q_TXL_FLITS_G2:NDR_AD:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_qpi0::UNC_Q_RXL_OCCUPANCY", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xb, .fstr = 
"hswep_unc_qpi0::UNC_Q_RXL_OCCUPANCY:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_qpi0::UNC_Q_TXL_INSERTS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x4, .fstr = "hswep_unc_qpi0::UNC_Q_TXL_INSERTS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_r2pcie::UNC_R2_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "hswep_unc_r2pcie::UNC_R2_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_r2pcie::UNC_R2_RING_AD_USED:CW", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x307, .fstr = "hswep_unc_r2pcie::UNC_R2_RING_AD_USED:CW:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_r3qpi0::UNC_R3_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "hswep_unc_r3qpi0::UNC_R3_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_r3qpi0::UNC_R3_RXR_CYCLES_NE:SNP:e=0:t=0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x210, .fstr = "hswep_unc_r3qpi0::UNC_R3_RXR_CYCLES_NE:SNP:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_r3qpi1::UNC_R3_RING_SINK_STARVED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x20e, .fstr = "hswep_unc_r3qpi1::UNC_R3_RING_SINK_STARVED:AK:e=0:i=0:t=0", }, { SRC_LINE, .name = "hswep_unc_r3qpi1::UNC_R3_HA_R2_BL_CREDITS_EMPTY:HA1:i:t=2", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x280022d, .fstr = "hswep_unc_r3qpi1::UNC_R3_HA_R2_BL_CREDITS_EMPTY:HA1:e=0:i=1:t=2", }, { SRC_LINE, .name = "skl::FRONTEND_RETIRED:DSB_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x11, .fstr = "skl::FRONTEND_RETIRED:DSB_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "skl::MEM_LOAD_RETIRED:L3_MISS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5320d1, .fstr = "skl::MEM_LOAD_RETIRED:L3_MISS:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skl::FRONTEND_RETIRED:ITLB_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x14, .fstr = "skl::FRONTEND_RETIRED:ITLB_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = 
"skl::FRONTEND_RETIRED:L1I_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x12, .fstr = "skl::FRONTEND_RETIRED:L1I_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "skl::FRONTEND_RETIRED:L2_MISS:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5101c6, .codes[1] = 0x13, .fstr = "skl::FRONTEND_RETIRED:L2_MISS:k=0:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "skl::FRONTEND_RETIRED:STLB_MISS:c=1:i", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x1d301c6, .codes[1] = 0x15, .fstr = "skl::FRONTEND_RETIRED:STLB_MISS:k=1:u=1:e=0:i=1:c=1:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "skl::FRONTEND_RETIRED:IDQ_4_BUBBLES", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x400106, .fstr = "skl::FRONTEND_RETIRED:IDQ_4_BUBBLES:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=1", }, { SRC_LINE, .name = "skl::FRONTEND_RETIRED:IDQ_3_BUBBLES", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x300106, .fstr = "skl::FRONTEND_RETIRED:IDQ_3_BUBBLES:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=1", }, { SRC_LINE, .name = "skl::FRONTEND_RETIRED:IDQ_3_BUBBLES:fe_thres=8", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x300806, .fstr = "skl::FRONTEND_RETIRED:IDQ_3_BUBBLES:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=8", }, { SRC_LINE, .name = "skl::FRONTEND_RETIRED:IDQ_3_BUBBLES:fe_thres=4095", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x3fff06, .fstr = "skl::FRONTEND_RETIRED:IDQ_3_BUBBLES:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=4095", }, { SRC_LINE, .name = "skl::FRONTEND_RETIRED:IDQ_3_BUBBLES:fe_thres=4096", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "skl::FRONTEND_RETIRED:DSB_MISS:ITLB_MISS", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "skl::offcore_response_0:any_request", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x18007, .fstr = 
"skl::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:OTHER:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skl::offcore_response_0:l3_hitmes", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f801d8007ull, .fstr = "skl::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_HITM:L3_HITE:L3_HITS:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skl::offcore_response_0:L4_HIT_LOCAL_L4", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f80418007ull, .fstr = "skl::OFFCORE_RESPONSE_0:ANY_REQUEST:L4_HIT_LOCAL_L4:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skl::offcore_response_0:L3_MISS_LOCAL", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f84018007ull, .fstr = "skl::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skl::offcore_response_0:l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] =0x5301b7, .codes[1] = 0x3f84018007ull, .fstr = "skl::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skl::cycle_activity:0x6:c=6", .count = 1, .codes[0] = 0x65306a3, .fstr = "skl::CYCLE_ACTIVITY:0x6:k=1:u=1:e=0:i=0:c=6:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skl::dtlb_store_misses:walk_completed_2m_4m:c=1", .count = 1, .codes[0] = 0x1530449, .fstr = "skl::DTLB_STORE_MISSES:WALK_COMPLETED_2M_4M:k=1:u=1:e=0:i=0:c=1:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skl::rob_misc_events:lbr_inserts", .count = 1, .codes[0] = 0x5320cc, .fstr = "skl::ROB_MISC_EVENTS:LBR_INSERTS:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skl::cycle_activity:stalls_mem_any:c=6", .ret = PFM_ERR_ATTR_SET, }, { SRC_LINE, .name = "skl::uops_dispatched_port:port_0", .count = 1, .codes[0] = 0x5301a1, .fstr = "skl::UOPS_DISPATCHED_PORT:PORT_0:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = 
"skl::uops_dispatched:port_0", .count = 1, .codes[0] = 0x5301a1, .fstr = "skl::UOPS_DISPATCHED_PORT:PORT_0:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::CYCLE_ACTIVITY:CYCLES_L2_PENDING:k=1:u=1:e=0:i=0:c=1:t=0:intx=0:intxcp=0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15301a3, .fstr = "hsw::CYCLE_ACTIVITY:CYCLES_L2_PENDING:k=1:u=1:e=0:i=0:c=1:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::CYCLE_ACTIVITY:CYCLES_L2_PENDING:c=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15301a3, .fstr = "hsw::CYCLE_ACTIVITY:CYCLES_L2_PENDING:k=1:u=1:e=0:i=0:c=1:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "glm::offcore_response_1:any_request", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5302b7, .codes[1] = 0x18000, .fstr = "glm::OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "glm::offcore_response_1:any_rfo", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5302b7, .codes[1] = 0x10022, .fstr = "glm::OFFCORE_RESPONSE_1:DMND_RFO:PF_RFO:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "glm::offcore_response_1:any_rfo:l2_miss_snp_miss_or_no_snoop_needed", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5302b7, .codes[1] = 0x200010022ull, .fstr = "glm::OFFCORE_RESPONSE_1:DMND_RFO:PF_RFO:ANY_RESPONSE:L2_MISS_SNP_MISS_OR_NO_SNOOP_NEEDED:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "glm::offcore_response_0:strm_st", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x14800, .fstr = "glm::OFFCORE_RESPONSE_0:FULL_STRM_ST:PARTIAL_STRM_ST:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "glm::offcore_response_1:dmnd_data_rd:outstanding", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "glm::offcore_response_1:dmnd_data_rd:l2_hit:outstanding", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "glm::offcore_response_0:strm_st:outstanding", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x4000004800ull, .fstr = 
"glm::OFFCORE_RESPONSE_0:FULL_STRM_ST:PARTIAL_STRM_ST:OUTSTANDING:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "glm::offcore_response_0:outstanding:dmnd_data_rd:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5101b7, .codes[1] = 0x4000000001ull, .fstr = "glm::OFFCORE_RESPONSE_0:DMND_DATA_RD:OUTSTANDING:k=0:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "glm::offcore_response_0:strm_st:l2_hit:outstanding", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5301ca, .fstr = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:k:c=1:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d201ca, .fstr = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:k=1:u=0:e=0:i=1:c=1", }, { SRC_LINE, .name = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:u:t", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:u:intxcp", .ret = PFM_ERR_ATTR, }, /* * test delimiter options */ { SRC_LINE, .name = "glm::ISSUE_SLOTS_NOT_CONSUMED.RESOURCE_FULL.k=1.u=0.e=0.i=0.c=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15201ca, .fstr = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:k=1:u=0:e=0:i=0:c=1", }, { SRC_LINE, .name = "glm::ISSUE_SLOTS_NOT_CONSUMED.RESOURCE_FULL:k=1:u=1:e=0:i=0:c=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15301ca, .fstr = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:k=1:u=1:e=0:i=0:c=1", }, { SRC_LINE, .name = "glm::ISSUE_SLOTS_NOT_CONSUMED.RESOURCE_FULL:k=1:u=1:e=0.i=0.c=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15301ca, .fstr = "glm::ISSUE_SLOTS_NOT_CONSUMED:RESOURCE_FULL:k=1:u=1:e=0:i=0:c=1", }, { SRC_LINE, .name = "knl::no_alloc_cycles:all", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x537fca, .fstr = "knl::NO_ALLOC_CYCLES:ALL:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "knl::MEM_UOPS_RETIRED:DTLB_MISS_LOADS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 
0x530804, .fstr = "knl::MEM_UOPS_RETIRED:DTLB_MISS_LOADS:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "knl::uops_retired:any:t", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "knl::unhalted_reference_cycles:u:t", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x710300, .fstr = "knl::UNHALTED_REFERENCE_CYCLES:k=0:u=1:t=1", }, { SRC_LINE, .name = "knl::instructions_retired:k:t", .ret = PFM_SUCCESS, .count = 1, .codes[0] =0x7200c0, .fstr = "knl::INSTRUCTION_RETIRED:k=1:u=0:e=0:i=0:c=0:t=1", }, { SRC_LINE, .name = "knl::unhalted_core_cycles:k:t", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x72003c, .fstr = "knl::UNHALTED_CORE_CYCLES:k=1:u=0:e=0:i=0:c=0:t=1", }, { SRC_LINE, .name = "knl::offcore_response_1:any_request", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5302b7, .codes[1] = 0x18000, .fstr = "knl::OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "knl::offcore_response_0:any_read", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x132e7, .fstr = "knl::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:PF_L2_RFO:PF_L2_CODE_RD:PARTIAL_READS:UC_CODE_READS:PF_SOFTWARE:PF_L1_DATA_RD:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "knl::offcore_response_1:any_read", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5302b7, .codes[1] = 0x132e7, .fstr = "knl::OFFCORE_RESPONSE_1:DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:PF_L2_RFO:PF_L2_CODE_RD:PARTIAL_READS:UC_CODE_READS:PF_SOFTWARE:PF_L1_DATA_RD:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "knl::offcore_response_0:any_request:ddr_near", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x80808000ull, .fstr = "knl::OFFCORE_RESPONSE_0:ANY_REQUEST:DDR_NEAR:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "knl::offcore_response_0:any_request:L2_HIT_NEAR_TILE:L2_HIT_FAR_TILE", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x1800588000ull, .fstr = 
"knl::OFFCORE_RESPONSE_0:ANY_REQUEST:L2_HIT_NEAR_TILE:L2_HIT_FAR_TILE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "knl::offcore_response_0:dmnd_data_rd:outstanding", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x4000000001ull, .fstr = "knl::OFFCORE_RESPONSE_0:DMND_DATA_RD:OUTSTANDING:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "knl::offcore_response_0:dmnd_data_rd:ddr_near:outstanding", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "knl::offcore_response_1:dmnd_data_rd:outstanding", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "knl_unc_imc0::UNC_M_D_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knl_unc_imc0::UNC_M_D_CLOCKTICKS", }, { SRC_LINE, .name = "knl_unc_imc0::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0103, .fstr = "knl_unc_imc0::UNC_M_CAS_COUNT:RD", }, { SRC_LINE, .name = "knl_unc_imc0::UNC_M_CAS_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0203, .fstr = "knl_unc_imc0::UNC_M_CAS_COUNT:WR", }, { SRC_LINE, .name = "knl_unc_imc0::UNC_M_CAS_COUNT:ALL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0303, .fstr = "knl_unc_imc0::UNC_M_CAS_COUNT:ALL", }, { SRC_LINE, .name = "knl_unc_imc_uclk0::UNC_M_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knl_unc_imc_uclk0::UNC_M_U_CLOCKTICKS", }, { SRC_LINE, .name = "knl_unc_edc_uclk0::UNC_E_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knl_unc_edc_uclk0::UNC_E_U_CLOCKTICKS", }, { SRC_LINE, .name = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:HIT_CLEAN", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0102, .fstr = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:HIT_CLEAN", }, { SRC_LINE, .name = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:HIT_DIRTY", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0202, .fstr = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:HIT_DIRTY", }, { SRC_LINE, .name = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_CLEAN", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0402, .fstr = 
"knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_CLEAN", }, { SRC_LINE, .name = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_DIRTY", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0802, .fstr = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_DIRTY", }, { SRC_LINE, .name = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_INVALID", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1002, .fstr = "knl_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_INVALID", }, { SRC_LINE, .name = "knl_unc_edc_eclk0::UNC_E_E_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knl_unc_edc_eclk0::UNC_E_E_CLOCKTICKS", }, { SRC_LINE, .name = "knl_unc_edc_eclk0::UNC_E_RPQ_INSERTS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0101, .fstr = "knl_unc_edc_eclk0::UNC_E_RPQ_INSERTS", }, { SRC_LINE, .name = "knl_unc_cha0::UNC_H_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knl_unc_cha0::UNC_H_U_CLOCKTICKS", }, { SRC_LINE, .name = "knl_unc_cha1::UNC_H_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knl_unc_cha1::UNC_H_U_CLOCKTICKS", }, { SRC_LINE, .name = "knl_unc_cha10::UNC_H_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knl_unc_cha10::UNC_H_U_CLOCKTICKS", }, { SRC_LINE, .name = "knl_unc_cha20::UNC_H_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knl_unc_cha20::UNC_H_U_CLOCKTICKS", }, { SRC_LINE, .name = "knl_unc_cha25::UNC_H_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knl_unc_cha25::UNC_H_U_CLOCKTICKS", }, { SRC_LINE, .name = "knl_unc_cha30::UNC_H_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knl_unc_cha30::UNC_H_U_CLOCKTICKS", }, { SRC_LINE, .name = "knl_unc_cha37::UNC_H_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knl_unc_cha37::UNC_H_U_CLOCKTICKS", }, { SRC_LINE, .name = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IRQ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0111, .fstr = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IRQ", 
}, { SRC_LINE, .name = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IRQ_REJ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0211, .fstr = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IRQ_REJ", }, { SRC_LINE, .name = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IPQ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0411, .fstr = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IPQ", }, { SRC_LINE, .name = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:PRQ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1011, .fstr = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:PRQ", }, { SRC_LINE, .name = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:PRQ_REJ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x2011, .fstr = "knl_unc_cha0::UNC_H_INGRESS_OCCUPANCY:PRQ_REJ", }, { SRC_LINE, .name = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:IRQ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0113, .fstr = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:IRQ", }, { SRC_LINE, .name = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:IRQ_REJ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0213, .fstr = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:IRQ_REJ", }, { SRC_LINE, .name = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:IPQ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0413, .fstr = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:IPQ", }, { SRC_LINE, .name = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:PRQ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1013, .fstr = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:PRQ", }, { SRC_LINE, .name = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:PRQ_REJ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x2013, .fstr = "knl_unc_cha0::UNC_H_INGRESS_INSERTS:PRQ_REJ", }, { SRC_LINE, .name = "knl_unc_cha0::UNC_H_INGRESS_RETRY_IRQ0_REJECT:AD_RSP_VN0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0218, .fstr = "knl_unc_cha0::UNC_H_INGRESS_RETRY_IRQ0_REJECT:AD_RSP_VN0", }, { SRC_LINE, .name = "knl_unc_m2pcie::UNC_M2P_INGRESS_CYCLES_NE:ALL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0810, .fstr = "knl_unc_m2pcie::UNC_M2P_INGRESS_CYCLES_NE:ALL", }, { SRC_LINE, .name = 
"knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_NE:AD_0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0123, .fstr = "knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_NE:AD_0", }, { SRC_LINE, .name = "knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_NE:AD_1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0823, .fstr = "knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_NE:AD_1", }, { SRC_LINE, .name = "knl_unc_m2pcie::UNC_M2P_EGRESS_INSERTS:AD_0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0124, .fstr = "knl_unc_m2pcie::UNC_M2P_EGRESS_INSERTS:AD_0", }, { SRC_LINE, .name = "knl_unc_m2pcie::UNC_M2P_EGRESS_INSERTS:AD_1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1024, .fstr = "knl_unc_m2pcie::UNC_M2P_EGRESS_INSERTS:AD_1", }, { SRC_LINE, .name = "knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_FULL:AD_0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0125, .fstr = "knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_FULL:AD_0", }, { SRC_LINE, .name = "knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_FULL:AD_1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0825, .fstr = "knl_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_FULL:AD_1", }, { SRC_LINE, .name = "wsm::offcore_response_0:0xf", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xf, .fstr = "wsm::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm::offcore_response_0:0xffff", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xffff, .fstr = "wsm::OFFCORE_RESPONSE_0:0xffff:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm::offcore_response_0:0x7fffffffff", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "snb::offcore_response_0:0xf", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xf, .fstr = "snb::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_0:0xfffffffff", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xfffffffff, .fstr = "snb::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_0:0x7fffffffff", .ret 
= PFM_ERR_ATTR, }, { SRC_LINE, .name = "ivb_ep::offcore_response_0:0xf", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xf, .fstr = "ivb_ep::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb_ep::offcore_response_0:0xfffffffff", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xfffffffff, .fstr = "ivb_ep::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb_ep::offcore_response_0:0x7fffffffff", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hsw::offcore_response_0:0xf", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xf, .fstr = "hsw::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::offcore_response_0:0xfffffffff", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xfffffffff, .fstr = "hsw::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::offcore_response_0:0x7fffffffff", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "bdw_ep::offcore_response_0:0xf", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xf, .fstr = "bdw_ep::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw_ep::offcore_response_0:0xfffffffff", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xfffffffff, .fstr = "bdw_ep::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw_ep::offcore_response_0:0x7fffffffff", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "skl::offcore_response_0:0xf", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xf, .fstr = "skl::OFFCORE_RESPONSE_0:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skl::offcore_response_0:0xfffffffff", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0xfffffffff, .fstr = "skl::OFFCORE_RESPONSE_0:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { 
SRC_LINE, .name = "skl::offcore_response_0:0x7fffffffff", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "wsm::offcore_response_1:0xfff", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0xfff, .fstr = "wsm::OFFCORE_RESPONSE_1:0xfff:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "wsm::offcore_response_1:0x7fffffffff", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "snb::offcore_response_1:0xf", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0xf, .fstr = "snb::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_1:0xfffffffff", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0xfffffffff, .fstr = "snb::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "snb::offcore_response_1:0x7fffffffff", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "ivb_ep::offcore_response_1:0xf", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0xf, .fstr = "ivb_ep::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb_ep::offcore_response_1:0xfffffffff", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0xfffffffff, .fstr = "ivb_ep::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0", }, { SRC_LINE, .name = "ivb_ep::offcore_response_1:0x7fffffffff", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "hsw::offcore_response_1:0xf", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0xf, .fstr = "hsw::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::offcore_response_1:0xfffffffff", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0xfffffffff, .fstr = "hsw::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "hsw::offcore_response_1:0x7fffffffff", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "bdw_ep::offcore_response_1:0xf", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0xf, .fstr = 
"bdw_ep::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw_ep::offcore_response_1:0xfffffffff", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0xfffffffff, .fstr = "bdw_ep::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "bdw_ep::offcore_response_1:0x7fffffffff", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "skl::offcore_response_1:0xf", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0xf, .fstr = "skl::OFFCORE_RESPONSE_1:0xf:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skl::offcore_response_1:0xfffffffff", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0xfffffffff, .fstr = "skl::OFFCORE_RESPONSE_1:0xffffffff:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skl::offcore_response_1:0x7fffffffff", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "bdx_unc_cbo1::UNC_C_CLOCKTICKS:u", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo0::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo1::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo1::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo2::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo2::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo3::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo3::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo4::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo4::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo5::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo5::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { 
SRC_LINE, .name = "bdx_unc_cbo6::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo6::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo7::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo7::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo8::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo8::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo9::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo9::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo10::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo10::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo11::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo11::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo12::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo12::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo13::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo13::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo14::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo14::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo15::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo15::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo16::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo16::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo17::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = 
"bdx_unc_cbo17::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo18::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo18::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo19::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo19::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo20::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo20::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo21::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_cbo21::UNC_C_CLOCKTICKS:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x334, .codes[1] = 0xfe0000, .fstr = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ:STATE_MESIFD:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:DATA_READ:nf=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x1134, .codes[1] = 0xfe0000, .fstr = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:ANY:STATE_MESIFD:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:NID:STATE_M", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:NID:nf=3", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x5134, .codes[1] = 0xfe0000, .codes[2] = 0x3, .fstr = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:ANY:NID:STATE_MESIFD:e=0:t=0:tf=0:nf=3:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:NID:STATE_M:tid=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_ring_iv_used:DN:UP", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_LLC_LOOKUP:WRITE:NID:nf=3:tf=1:e:t=1", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x10c4534, .codes[1] = 0xfe0001, .codes[2] = 0x3, .fstr = 
"bdx_unc_cbo0::UNC_C_LLC_LOOKUP:NID:WRITE:STATE_MESIFD:e=1:t=1:tf=1:nf=3:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS", .ret = PFM_ERR_UMASK, }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS:NID", .ret = PFM_ERR_UMASK, }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS:NID:nf=1", .ret = PFM_ERR_UMASK, }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS:M_STATE", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x137, .fstr = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS:M_STATE:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS:M_STATE:S_STATE", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x537, .fstr = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS:S_STATE:M_STATE:e=0:t=0:tf=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS:M_STATE:S_STATE:NID:nf=1", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x4537, .codes[1] = 0x0, .codes[2] = 0x1, .fstr = "bdx_unc_cbo0::UNC_C_LLC_VICTIMS:S_STATE:M_STATE:NID:e=0:t=0:tf=0:nf=1:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE", .ret = PFM_ERR_UMASK, }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:WB", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1035, .fstr = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:WB:e=0:t=0:tf=0:isoc=0:nc=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_ITOM", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x135, .codes[1] = 0x0, .codes[2] = 0x1c800000ull, .fstr = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_ITOM:e=0:t=0:tf=0:isoc=0:nc=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_ITOM:isoc=1", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x135, .codes[1] = 0x0, .codes[2] = 0x9c800000ull, .fstr = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_ITOM:e=0:t=0:tf=0:isoc=1:nc=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:OPCODE:OPC_PCIWILF:nf=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:NID_OPCODE:OPC_PCIRDCUR:nf=1", .ret = PFM_SUCCESS, .count = 3, 
.codes[0] = 0x4135, .codes[1] = 0x0, .codes[2] = 0x19e00001ull, .fstr = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:NID_OPCODE:OPC_PCIRDCUR:e=0:t=0:tf=0:nf=1:isoc=0:nc=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:OPC_RFO:NID_OPCODE:nf=1", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x4135, .codes[1] = 0x0, .codes[2] = 0x18000001ull, .fstr = "bdx_unc_cbo0::UNC_C_TOR_INSERTS:NID_OPCODE:OPC_RFO:e=0:t=0:tf=0:nf=1:isoc=0:nc=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_cbo0::UNC_C_TOR_OCCUPANCY:MISS_REMOTE", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8a36, .fstr = "bdx_unc_cbo0::UNC_C_TOR_OCCUPANCY:MISS_REMOTE:e=0:t=0:tf=0:isoc=0:nc=0:cf=0", }, { SRC_LINE, .name = "bdx_unc_irp::unc_i_clockticks", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "bdx_unc_irp::UNC_I_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_irp::unc_i_coherent_ops:RFO", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x813, .fstr = "bdx_unc_irp::UNC_I_COHERENT_OPS:RFO:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_irp::unc_i_transactions:reads", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x116, .fstr = "bdx_unc_irp::UNC_I_TRANSACTIONS:READS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_irp::unc_i_transactions:reads:c=1:i", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "bdx_unc_irp::unc_i_transactions:reads:t=6", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x6000116, .fstr = "bdx_unc_irp::UNC_I_TRANSACTIONS:READS:e=0:i=0:t=6", }, { SRC_LINE, .name = "bdx_unc_sbo0::unc_s_clockticks", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "bdx_unc_sbo0::UNC_S_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_sbo1::unc_s_clockticks", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "bdx_unc_sbo1::UNC_S_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_sbo2::unc_s_clockticks", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "bdx_unc_sbo2::UNC_S_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_sbo3::unc_s_clockticks", .ret = PFM_SUCCESS, .count = 
1, .codes[0] = 0x0, .fstr = "bdx_unc_sbo3::UNC_S_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "bdx_unc_pcu::UNC_P_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_CLOCKTICKS:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1000000, .fstr = "bdx_unc_pcu::UNC_P_CLOCKTICKS:e=0:i=0:t=1", }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_CORE0_TRANSITION_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x60, .fstr = "bdx_unc_pcu::UNC_P_CORE0_TRANSITION_CYCLES:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_CORE17_TRANSITION_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x71, .fstr = "bdx_unc_pcu::UNC_P_CORE17_TRANSITION_CYCLES:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_FREQ_BAND1_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_FREQ_BAND2_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_FREQ_BAND3_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xb, .codes[1] = 0x20, .fstr = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=0:i=0:t=0:ff=32", }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_FREQ_BAND1_CYCLES:ff=16", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xc, .codes[1] = 0x1000, .fstr = "bdx_unc_pcu::UNC_P_FREQ_BAND1_CYCLES:e=0:i=0:t=0:ff=16", }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_FREQ_BAND2_CYCLES:ff=8", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xd, .codes[1] = 0x80000, .fstr = "bdx_unc_pcu::UNC_P_FREQ_BAND2_CYCLES:e=0:i=0:t=0:ff=8", }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_FREQ_BAND3_CYCLES:ff=40", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xe, .codes[1] = 0x28000000, .fstr = "bdx_unc_pcu::UNC_P_FREQ_BAND3_CYCLES:e=0:i=0:t=0:ff=40", }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:e", 
.ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x4000b, .codes[1] = 0x20, .fstr = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=1:i=0:t=0:ff=32", }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:t=24", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x1800000b, .codes[1] = 0x20, .fstr = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=0:i=0:t=24:ff=32", }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:e:t=4", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x404000b, .codes[1] = 0x20, .fstr = "bdx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=1:i=0:t=4:ff=32", }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x4080, .fstr = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=0:i=0:t=0" }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C3", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8080, .fstr = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C3:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C6", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc080, .fstr = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C6:e=0:i=0:t=0" }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:t=6:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x46004080, .fstr = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=0:i=1:t=6" }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:t=6", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x6004080, .fstr = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=0:i=0:t=6" }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:t=6:i:e", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc6004080, .fstr = "bdx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C0:e=1:i=1:t=6" }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_DEMOTIONS_CORE10", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x3a, .fstr = "bdx_unc_pcu::UNC_P_DEMOTIONS_CORE10:e=0:i=0:t=0", }, { SRC_LINE, .name = 
"bdx_unc_pcu::UNC_P_DEMOTIONS_CORE14", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x3e, .fstr = "bdx_unc_pcu::UNC_P_DEMOTIONS_CORE14:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_pcu::UNC_P_DEMOTIONS_CORE17", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x41, .fstr = "bdx_unc_pcu::UNC_P_DEMOTIONS_CORE17:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_ha0::UNC_H_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "bdx_unc_ha0::UNC_H_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_ha1::UNC_H_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "bdx_unc_ha1::UNC_H_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_ha1::UNC_H_REQUESTS:READS:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1000301, .fstr = "bdx_unc_ha1::UNC_H_REQUESTS:READS:e=0:i=0:t=1", }, { SRC_LINE, .name = "bdx_unc_ha0::UNC_H_IMC_WRITES:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1000f1a, .fstr = "bdx_unc_ha0::UNC_H_IMC_WRITES:ALL:e=0:i=0:t=1", }, { SRC_LINE, .name = "bdx_unc_ha0::UNC_H_IMC_READS:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1000117, .fstr = "bdx_unc_ha0::UNC_H_IMC_READS:NORMAL:e=0:i=0:t=1", }, { SRC_LINE, .name = "bdx_unc_imc0::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xff, .fstr = "bdx_unc_imc0::UNC_M_CLOCKTICKS", }, { SRC_LINE, .name = "bdx_unc_imc1::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xff, .fstr = "bdx_unc_imc1::UNC_M_CLOCKTICKS", }, { SRC_LINE, .name = "bdx_unc_imc2::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xff, .fstr = "bdx_unc_imc2::UNC_M_CLOCKTICKS", }, { SRC_LINE, .name = "bdx_unc_imc3::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xff, .fstr = "bdx_unc_imc3::UNC_M_CLOCKTICKS", }, { SRC_LINE, .name = "bdx_unc_imc4::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xff, .fstr = "bdx_unc_imc4::UNC_M_CLOCKTICKS", }, { SRC_LINE, .name = "bdx_unc_imc5::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] 
= 0xff, .fstr = "bdx_unc_imc5::UNC_M_CLOCKTICKS", }, { SRC_LINE, .name = "bdx_unc_imc6::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xff, .fstr = "bdx_unc_imc6::UNC_M_CLOCKTICKS", }, { SRC_LINE, .name = "bdx_unc_imc7::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xff, .fstr = "bdx_unc_imc7::UNC_M_CLOCKTICKS", }, { SRC_LINE, .name = "bdx_unc_imc0::UNC_M_CLOCKTICKS:t=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "bdx_unc_imc0::UNC_M_DCLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_imc0::UNC_M_DCLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_imc4::UNC_M_DCLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "bdx_unc_imc4::UNC_M_DCLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_imc0::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0304, .fstr = "bdx_unc_imc0::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_imc0::UNC_M_PRE_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0802, .fstr = "bdx_unc_imc0::UNC_M_PRE_COUNT:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_imc0::UNC_M_POWER_CKE_CYCLES:RANK0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x183, .fstr = "bdx_unc_imc0::UNC_M_POWER_CKE_CYCLES:RANK0:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_imc0::UNC_M_CAS_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc04, .fstr = "bdx_unc_imc0::UNC_M_CAS_COUNT:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_imc0::UNC_M_RD_CAS_RANK0:BANK0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xb0, .fstr = "bdx_unc_imc0::UNC_M_RD_CAS_RANK0:BANK0:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_imc0::UNC_M_RD_CAS_RANK0:ALLBANKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x10b0, .fstr = "bdx_unc_imc0::UNC_M_RD_CAS_RANK0:ALLBANKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_imc0::UNC_M_RD_CAS_RANK0:BANKG0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x11b0, .fstr = "bdx_unc_imc0::UNC_M_RD_CAS_RANK0:BANKG0:e=0:i=0:t=0", }, { SRC_LINE, 
.name = "bdx_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x7b4, .fstr = "bdx_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x10007b4, .fstr = "bdx_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7:e=0:i=0:t=1", }, { SRC_LINE, .name = "bdx_unc_imc0::UNC_M_RD_CAS_RANK7:BANK7:t=1:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x18007b7, .fstr = "bdx_unc_imc0::UNC_M_RD_CAS_RANK7:BANK7:e=0:i=1:t=1", }, { SRC_LINE, .name = "bdx_unc_sbo0::UNC_S_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "bdx_unc_sbo0::UNC_S_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_sbo0::UNC_S_FAST_ASSERTED:t=1:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1800009, .fstr = "bdx_unc_sbo0::UNC_S_FAST_ASSERTED:e=0:i=1:t=1", }, { SRC_LINE, .name = "bdx_unc_sbo3::UNC_S_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "bdx_unc_sbo3::UNC_S_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_sbo3::UNC_S_FAST_ASSERTED:t=1:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1800009, .fstr = "bdx_unc_sbo3::UNC_S_FAST_ASSERTED:e=0:i=1:t=1", }, { SRC_LINE, .name = "bdx_unc_ubo::UNC_U_EVENT_MSG", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x842, .fstr = "bdx_unc_ubo::UNC_U_EVENT_MSG:DOORBELL_RCVD:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_qpi0::UNC_Q_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x14, .fstr = "bdx_unc_qpi0::UNC_Q_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_qpi1::UNC_Q_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x14, .fstr = "bdx_unc_qpi1::UNC_Q_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_qpi2::UNC_Q_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x14, .fstr = "bdx_unc_qpi2::UNC_Q_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_qpi0::UNC_Q_DIRECT2CORE:SUCCESS_RBT_HIT", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x113, .fstr 
= "bdx_unc_qpi0::UNC_Q_DIRECT2CORE:SUCCESS_RBT_HIT:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_qpi0::UNC_Q_RXL_FLITS_G1:DRS:i:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1a01802, .fstr = "bdx_unc_qpi0::UNC_Q_RXL_FLITS_G1:DRS:e=0:i=1:t=1", }, { SRC_LINE, .name = "bdx_unc_qpi0::UNC_Q_TXL_FLITS_G2:NDR_AD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x200101, .fstr = "bdx_unc_qpi0::UNC_Q_TXL_FLITS_G2:NDR_AD:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_qpi0::UNC_Q_RXL_OCCUPANCY", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xb, .fstr = "bdx_unc_qpi0::UNC_Q_RXL_OCCUPANCY:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_qpi0::UNC_Q_TXL_INSERTS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x4, .fstr = "bdx_unc_qpi0::UNC_Q_TXL_INSERTS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_r2pcie::UNC_R2_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "bdx_unc_r2pcie::UNC_R2_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_r2pcie::UNC_R2_RING_AD_USED:CW", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x307, .fstr = "bdx_unc_r2pcie::UNC_R2_RING_AD_USED:CW:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_r3qpi0::UNC_R3_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "bdx_unc_r3qpi0::UNC_R3_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_r3qpi0::UNC_R3_RXR_CYCLES_NE:SNP:e=0:t=0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x210, .fstr = "bdx_unc_r3qpi0::UNC_R3_RXR_CYCLES_NE:SNP:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_r3qpi1::UNC_R3_RING_SINK_STARVED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x20e, .fstr = "bdx_unc_r3qpi1::UNC_R3_RING_SINK_STARVED:AK:e=0:i=0:t=0", }, { SRC_LINE, .name = "bdx_unc_r3qpi1::UNC_R3_HA_R2_BL_CREDITS_EMPTY:HA1:i:t=2", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x280022d, .fstr = "bdx_unc_r3qpi1::UNC_R3_HA_R2_BL_CREDITS_EMPTY:HA1:e=0:i=1:t=2", }, { SRC_LINE, .name = "amd64_fam17h::retired_uops", .count = 1, .codes[0] = 0x5300c1ull, .fstr = 
"amd64_fam17h::RETIRED_UOPS:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam17h::cycles_not_in_halt", .count = 1, .codes[0] = 0x530076ull, .fstr = "amd64_fam17h::CYCLES_NOT_IN_HALT:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam17h::locks:spec_lock", .count = 1, .codes[0] = 0x530425ull, .fstr = "amd64_fam17h::LOCKS:SPEC_LOCK:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam17h::L1_DTLB_MISS:TLB_RELOAD_1G_L2_HIT:u", .count = 1, .codes[0] = 0x510845ull, .fstr = "amd64_fam17h::L1_DTLB_MISS:TLB_RELOAD_1G_L2_HIT:k=0:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam17h_zen1::retired_uops", .count = 1, .codes[0] = 0x5300c1ull, .fstr = "amd64_fam17h_zen1::RETIRED_UOPS:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam17h_zen1::cycles_not_in_halt", .count = 1, .codes[0] = 0x530076ull, .fstr = "amd64_fam17h_zen1::CYCLES_NOT_IN_HALT:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam17h_zen1::locks:spec_lock", .count = 1, .codes[0] = 0x530425ull, .fstr = "amd64_fam17h_zen1::LOCKS:SPEC_LOCK:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam17h_zen1::L1_DTLB_MISS:TLB_RELOAD_1G_L2_HIT:u", .count = 1, .codes[0] = 0x510845ull, .fstr = "amd64_fam17h_zen1::L1_DTLB_MISS:TLB_RELOAD_1G_L2_HIT:k=0:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam16h::RETIRED_INSTRUCTIONS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5300c0, .fstr = "amd64_fam16h::RETIRED_INSTRUCTIONS:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam16h::CPU_CLK_UNHALTED:u", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x510076, .fstr = "amd64_fam16h::CPU_CLK_UNHALTED:k=0:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "skx::offcore_response_1:pf_l1d_and_sw", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x10400, .fstr = "skx::OFFCORE_RESPONSE_1:PF_L1D_AND_SW:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skx::offcore_response_0:any_request", .ret = 
PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x185b7, .fstr = "skx::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:PF_L2_DATA_RD:PF_L2_RFO:PF_L3_DATA_RD:PF_L3_RFO:PF_L1D_AND_SW:OTHER:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skx::offcore_response_0:snp_any", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f800185b7ull, .fstr = "skx::OFFCORE_RESPONSE_0:ANY_REQUEST:ANY_RESPONSE:SNP_NONE:SNP_NOT_NEEDED:SNP_MISS:SNP_HIT_NO_FWD:SNP_HIT_WITH_FWD:SNP_HITM:SNP_NON_DRAM:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skx::offcore_response_0:l3_hitmesf", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f803c85b7ull, .fstr = "skx::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_HITM:L3_HITE:L3_HITS:L3_HITF:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skx::offcore_response_0:L4_HIT_LOCAL_L4", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "skx::offcore_response_0:L3_MISS_LOCAL", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f840085b7ull, .fstr = "skx::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skx::offcore_response_0:l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] =0x5301b7, .codes[1] = 0x3f840085b7ull, .fstr = "skx::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skx::mem_load_uops_l3_miss_retired:remote_hitm", .ret = PFM_SUCCESS, .count = 1, .codes[0] =0x5304d3, .fstr = "skx::MEM_LOAD_L3_MISS_RETIRED:REMOTE_HITM:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skx::mem_load_uops_l3_miss_retired:local_dram", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5301d3, .fstr = "skx::MEM_LOAD_L3_MISS_RETIRED:LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skx::mem_load_uops_l3_hit_retired:xsnp_hit", .ret = PFM_SUCCESS, .count = 1, .codes[0] 
=0x5302d2, .fstr = "skx::MEM_LOAD_L3_HIT_RETIRED:XSNP_HIT:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "skx_unc_cha1::UNC_C_CLOCKTICKS:u", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "skx_unc_cha0::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha0::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha1::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha1::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha2::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha2::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha3::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha3::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha4::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha4::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha5::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha5::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha6::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha6::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha7::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha7::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha8::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha8::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha9::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha9::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha10::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha10::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = 
"skx_unc_cha11::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha11::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha12::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha12::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha13::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha13::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha14::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha14::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha15::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha15::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha16::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha16::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha17::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha17::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha18::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha18::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha19::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha19::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha20::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha20::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha21::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha21::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha22::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha22::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha23::UNC_C_CLOCKTICKS", .ret = 
PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha23::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha24::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha24::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha25::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha25::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha26::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha26::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha27::UNC_C_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "skx_unc_cha27::UNC_C_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha0::UNC_C_LLC_LOOKUP:DATA_READ", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x334l, .codes[1] = 0xfe0000, .fstr = "skx_unc_cha0::UNC_C_LLC_LOOKUP:DATA_READ:STATE_CACHE_ANY:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha0::UNC_C_TOR_INSERTS:IA_MISS", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x2135, .codes[1] = 0x0, .codes[2] = 0x3b, .fstr = "skx_unc_cha0::UNC_C_TOR_INSERTS:IA_MISS:e=0:i=0:t=0:isoc=0:nc=0:loc=1:rem=1:lmem=1:rmem=1", }, { SRC_LINE, .name = "skx_unc_cha0::UNC_C_TOR_INSERTS:IRQ:OPC0_RFO:OPC1_RFO", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x0135, .codes[1] = 0x0, .codes[2] = 0x10040000, .fstr = "skx_unc_cha0::UNC_C_TOR_INSERTS:IRQ:OPC0_RFO:OPC1_RFO:e=0:i=0:t=0:isoc=0:nc=0:loc=0:rem=0:lmem=0:rmem=0", }, { SRC_LINE, .name = "skx_unc_cha0::UNC_C_TOR_INSERTS:IRQ:OPC0_RFO:OPC1_RFO:lmem=1", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x0135, .codes[1] = 0x0, .codes[2] = 0x10040002, .fstr = "skx_unc_cha0::UNC_C_TOR_INSERTS:IRQ:OPC0_RFO:OPC1_RFO:e=0:i=0:t=0:isoc=0:nc=0:loc=1:rem=0:lmem=1:rmem=0", }, { SRC_LINE, .name = "skx_unc_cha0::UNC_C_TOR_INSERTS:IPQ:OPC0_RFO:OPC1_RFO", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = 
"skx_unc_cha0::UNC_C_TOR_INSERTS:IRQ:OPC0_RFO:OPC1_DRD:lmem=0:rmem=1", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x0135, .codes[1] = 0x0, .codes[2] = 0x10140020, .fstr = "skx_unc_cha0::UNC_C_TOR_INSERTS:IRQ:OPC0_RFO:OPC1_DRD:e=0:i=0:t=0:isoc=0:nc=0:loc=0:rem=0:lmem=0:rmem=0", }, { SRC_LINE, .name = "skx_unc_cha0::UNC_C_TOR_INSERTS:IRQ:OPC1_RFO:OPC0_DRD:lmem=0:rmem=1", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x0135, .codes[1] = 0x0, .codes[2] = 0x10040420, .fstr = "skx_unc_cha0::UNC_C_TOR_INSERTS:IRQ:OPC0_DRD:OPC1_RFO:e=0:i=0:t=0:isoc=0:nc=0:loc=0:rem=0:lmem=0:rmem=0", }, { SRC_LINE, .name = "skx_unc_cha0::UNC_C_TOR_INSERTS:IPQ:OPC0_SNP_CUR", .ret = PFM_SUCCESS, .count = 3, .codes[0] = 0x0835, .codes[1] = 0x0, .codes[2] = 0x30000, .fstr = "skx_unc_cha0::UNC_C_TOR_INSERTS:IPQ:OPC0_SNP_CUR:e=0:i=0:t=0:isoc=0:nc=0:loc=0:rem=0:lmem=0:rmem=0", }, { SRC_LINE, .name = "skx_unc_cha0::UNC_C_LLC_LOOKUP:LOCAL:DATA_READ", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x3334, .codes[1] = 0xfe0000, .fstr = "skx_unc_cha0::UNC_C_LLC_LOOKUP:DATA_READ:LOCAL:STATE_CACHE_ANY:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha0::UNC_C_LLC_LOOKUP:ANY", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x1134, .codes[1] = 0xfe0000, .fstr = "skx_unc_cha0::UNC_C_LLC_LOOKUP:ANY:STATE_CACHE_ANY:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha0::UNC_C_LLC_LOOKUP:REMOTE:STATE_LLC_S", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x9134, .codes[1] = 0x200000, .fstr = "skx_unc_cha0::UNC_C_LLC_LOOKUP:REMOTE:STATE_LLC_S:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_cha0::UNC_C_LLC_VICTIMS:local_all", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x2f37, .fstr = "skx_unc_cha0::UNC_C_LLC_VICTIMS:LOCAL_ALL:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_irp::unc_i_clockticks", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "skx_unc_irp::UNC_I_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_irp::unc_i_coherent_ops:RFO", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x810, .fstr = 
"skx_unc_irp::UNC_I_COHERENT_OPS:RFO:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_irp::unc_i_transactions:reads", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x111, .fstr = "skx_unc_irp::UNC_I_TRANSACTIONS:READS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_irp::unc_i_transactions:reads:c=1:i", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "skx_unc_irp::unc_i_transactions:reads:t=6", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x6000111, .fstr = "skx_unc_irp::UNC_I_TRANSACTIONS:READS:e=0:i=0:t=6", }, { SRC_LINE, .name = "skx_unc_pcu::UNC_P_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "skx_unc_pcu::UNC_P_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_pcu::UNC_P_CLOCKTICKS:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1000000, .fstr = "skx_unc_pcu::UNC_P_CLOCKTICKS:e=0:i=0:t=1", }, { SRC_LINE, .name = "skx_unc_pcu::UNC_P_CORE_TRANSITION_CYCLES", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x60, .fstr = "skx_unc_pcu::UNC_P_CORE_TRANSITION_CYCLES:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "skx_unc_pcu::UNC_P_FREQ_BAND1_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "skx_unc_pcu::UNC_P_FREQ_BAND2_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "skx_unc_pcu::UNC_P_FREQ_BAND3_CYCLES", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "skx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xb, .codes[1] = 0x20, .fstr = "skx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=0:i=0:t=0:ff=32", }, { SRC_LINE, .name = "skx_unc_pcu::UNC_P_FREQ_BAND1_CYCLES:ff=16", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xc, .codes[1] = 0x1000, .fstr = "skx_unc_pcu::UNC_P_FREQ_BAND1_CYCLES:e=0:i=0:t=0:ff=16", }, { SRC_LINE, .name = "skx_unc_pcu::UNC_P_FREQ_BAND2_CYCLES:ff=8", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xd, .codes[1] = 0x80000, .fstr = "skx_unc_pcu::UNC_P_FREQ_BAND2_CYCLES:e=0:i=0:t=0:ff=8", }, { SRC_LINE, .name = 
"skx_unc_pcu::UNC_P_FREQ_BAND3_CYCLES:ff=40", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xe, .codes[1] = 0x28000000, .fstr = "skx_unc_pcu::UNC_P_FREQ_BAND3_CYCLES:e=0:i=0:t=0:ff=40", }, { SRC_LINE, .name = "skx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:e", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x4000b, .codes[1] = 0x20, .fstr = "skx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=1:i=0:t=0:ff=32", }, { SRC_LINE, .name = "skx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:t=24", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x1800000b, .codes[1] = 0x20, .fstr = "skx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=0:i=0:t=24:ff=32", }, { SRC_LINE, .name = "skx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:ff=32:e:t=4", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x404000b, .codes[1] = 0x20, .fstr = "skx_unc_pcu::UNC_P_FREQ_BAND0_CYCLES:e=1:i=0:t=4:ff=32", }, { SRC_LINE, .name = "skx_unc_pcu::UNC_P_DEMOTIONS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x30, .fstr = "skx_unc_pcu::UNC_P_DEMOTIONS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_imc0::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "skx_unc_imc0::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_imc1::UNC_M_CLOCKTICKS:e=0:i=0:t=0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "skx_unc_imc1::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_imc2::UNC_M_CLOCKTICKS:e=0:i=0:t=0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "skx_unc_imc2::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_imc3::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "skx_unc_imc3::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_imc4::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "skx_unc_imc4::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_imc5::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "skx_unc_imc5::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = 
"skx_unc_imc0::UNC_M_DCLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xff, .fstr = "skx_unc_imc0::UNC_M_DCLOCKTICKS", }, { SRC_LINE, .name = "skx_unc_imc0::UNC_M_DCLOCKTICKS:e=1", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "skx_unc_imc4::UNC_M_DCLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xff, .fstr = "skx_unc_imc4::UNC_M_DCLOCKTICKS", }, { SRC_LINE, .name = "skx_unc_imc0::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0304, .fstr = "skx_unc_imc0::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_imc0::UNC_M_PRE_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0802, .fstr = "skx_unc_imc0::UNC_M_PRE_COUNT:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_imc0::UNC_M_POWER_CKE_CYCLES:RANK0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x183, .fstr = "skx_unc_imc0::UNC_M_POWER_CKE_CYCLES:RANK0:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_imc0::UNC_M_CAS_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc04, .fstr = "skx_unc_imc0::UNC_M_CAS_COUNT:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_imc0::UNC_M_RD_CAS_RANK0:BANK0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xb0, .fstr = "skx_unc_imc0::UNC_M_RD_CAS_RANK0:BANK0:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_imc0::UNC_M_RD_CAS_RANK0:ALLBANKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x10b0, .fstr = "skx_unc_imc0::UNC_M_RD_CAS_RANK0:ALLBANKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_imc0::UNC_M_RD_CAS_RANK0:BANKG0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x11b0, .fstr = "skx_unc_imc0::UNC_M_RD_CAS_RANK0:BANKG0:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x7b4, .fstr = "skx_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7:t=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x10007b4, .fstr = "skx_unc_imc0::UNC_M_RD_CAS_RANK4:BANK7:e=0:i=0:t=1", }, { SRC_LINE, .name = 
"skx_unc_imc0::UNC_M_RD_CAS_RANK7:BANK7:t=1:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x18007b7, .fstr = "skx_unc_imc0::UNC_M_RD_CAS_RANK7:BANK7:e=0:i=1:t=1", }, { SRC_LINE, .name = "skx_unc_ubo::UNC_U_EVENT_MSG:DOORBELL_RCVD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x842, .fstr = "skx_unc_ubo::UNC_U_EVENT_MSG:DOORBELL_RCVD:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_upi0::UNC_UPI_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "skx_unc_upi0::UNC_UPI_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_upi1::UNC_UPI_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "skx_unc_upi1::UNC_UPI_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_upi2::UNC_UPI_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "skx_unc_upi2::UNC_UPI_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_upi0::UNC_UPI_RXL_BASIC_HDR_MATCH:REQ", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x805, .codes[1] = 0x0, .fstr = "skx_unc_upi0::UNC_UPI_RXL_BASIC_HDR_MATCH:REQ:FILT_NONE:e=0:i=0:t=0:dnid=0:rcsnid=0", }, { SRC_LINE, .name = "skx_unc_upi0::UNC_UPI_RXL_BASIC_HDR_MATCH:REQ_OPC", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "skx_unc_upi0::UNC_UPI_RXL_BASIC_HDR_MATCH:REQ_OPC_RDINV:WB", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "skx_unc_upi0::UNC_UPI_RXL_BASIC_HDR_MATCH:REQ_OPC_RDINV", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xc805ull, .codes[1] = 0x1, .fstr = "skx_unc_upi0::UNC_UPI_RXL_BASIC_HDR_MATCH:REQ_OPC_RDINV:FILT_NONE:e=0:i=0:t=0:dnid=0:rcsnid=0", }, { SRC_LINE, .name = "skx_unc_upi0::UNC_UPI_RXL_BASIC_HDR_MATCH:REQ_OPC_RDINV:FILT_SLOT2", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xc805, .codes[1] = 0x200001, .fstr = "skx_unc_upi0::UNC_UPI_RXL_BASIC_HDR_MATCH:REQ_OPC_RDINV:FILT_SLOT2:e=0:i=0:t=0:dnid=0:rcsnid=0", }, { SRC_LINE, .name = "skx_unc_upi0::UNC_UPI_RXL_BASIC_HDR_MATCH:REQ_OPC_RDINV:FILT_SLOT2:FILT_SLOT1", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xc805, .codes[1] = 0x300001, 
.fstr = "skx_unc_upi0::UNC_UPI_RXL_BASIC_HDR_MATCH:REQ_OPC_RDINV:FILT_SLOT1:FILT_SLOT2:e=0:i=0:t=0:dnid=0:rcsnid=0", }, { SRC_LINE, .name = "skx_unc_upi0::UNC_UPI_RXL_BASIC_HDR_MATCH:NCB_OPC_NCWR:FILT_SLOT2:FILT_IMPL_NULL:dnid=1:FILT_REMOTE", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0xe05, .codes[1] = 0xa02105, .fstr = "skx_unc_upi0::UNC_UPI_RXL_BASIC_HDR_MATCH:NCB_OPC_NCWR:FILT_REMOTE:FILT_SLOT2:FILT_IMPL_NULL:e=0:i=0:t=0:dnid=1:rcsnid=0", }, { SRC_LINE, .name = "skx_unc_upi1::UNC_UPI_RXL_BASIC_HDR_MATCH:WB_OPC_WBMTOI:FILT_SLOT2:t=2:dnid=1:FILT_REMOTE:FILT_LOCAL", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x2000d05, .codes[1] = 0x202107, .fstr = "skx_unc_upi1::UNC_UPI_RXL_BASIC_HDR_MATCH:WB_OPC_WBMTOI:FILT_LOCAL:FILT_REMOTE:FILT_SLOT2:e=0:i=0:t=2:dnid=1:rcsnid=0", }, { SRC_LINE, .name = "skx_unc_upi1::UNC_UPI_RXL_BASIC_HDR_MATCH:WB:t=2:i:e=1:dnid=1:FILT_REMOTE:FILT_LOCAL", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x2840d05, .codes[1] = 0x2106, .fstr = "skx_unc_upi1::UNC_UPI_RXL_BASIC_HDR_MATCH:WB:FILT_LOCAL:FILT_REMOTE:e=1:i=1:t=2:dnid=1:rcsnid=0", }, { SRC_LINE, .name = "skx_unc_iio0::UNC_IO_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "skx_unc_iio0::UNC_IO_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_iio0::UNC_IO_DATA_REQ_OF_CPU:MEM_READ_PART0:FC_ANY", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x701000000483ull, .fstr = "skx_unc_iio0::UNC_IO_DATA_REQ_OF_CPU:MEM_READ_PART0:FC_POSTED_REQ:FC_NON_POSTED_REQ:FC_CMPL:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_iio0::UNC_IO_DATA_REQ_OF_CPU:MEM_READ_PART0:FC_POSTED_REQ:FC_NON_POSTED_REQ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x301000000483ull, .fstr = "skx_unc_iio0::UNC_IO_DATA_REQ_OF_CPU:MEM_READ_PART0:FC_POSTED_REQ:FC_NON_POSTED_REQ:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_iio0::UNC_IO_DATA_REQ_OF_CPU:MEM_READ_PART0:MEM_READ_PART1:FC_ANY", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x703000000483ull, .fstr = 
"skx_unc_iio0::UNC_IO_DATA_REQ_OF_CPU:MEM_READ_PART0:MEM_READ_PART1:FC_POSTED_REQ:FC_NON_POSTED_REQ:FC_CMPL:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_iio0::UNC_IO_DATA_REQ_OF_CPU:MEM_READ_PART0:MEM_READ_PART1:FC_ANY:t=2:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x703002800483ull, .fstr = "skx_unc_iio0::UNC_IO_DATA_REQ_OF_CPU:MEM_READ_PART0:MEM_READ_PART1:FC_POSTED_REQ:FC_NON_POSTED_REQ:FC_CMPL:e=0:i=1:t=2", }, { SRC_LINE, .name = "skx_unc_irp::UNC_I_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "skx_unc_irp::UNC_I_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_irp::UNC_I_FAF_INSERTS:t=2:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x2800018, .fstr = "skx_unc_irp::UNC_I_FAF_INSERTS:e=0:i=1:t=2", }, { SRC_LINE, .name = "skx_unc_m2m0::UNC_M2_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "skx_unc_m2m0::UNC_M2_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_m2m0::UNC_M2_DIRECTORY_LOOKUP", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x12d, .fstr = "skx_unc_m2m0::UNC_M2_DIRECTORY_LOOKUP:ANY:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_m2m1::UNC_M2_DIRECTORY_LOOKUP:STATE_A:t=2", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x200082d, .fstr = "skx_unc_m2m1::UNC_M2_DIRECTORY_LOOKUP:STATE_A:e=0:i=0:t=2", }, { SRC_LINE, .name = "skx_unc_m2m1::UNC_M2_CMS_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc0, .fstr = "skx_unc_m2m1::UNC_M2_CMS_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_m3upi0::UNC_M3_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "skx_unc_m3upi0::UNC_M3_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_m3upi1::UNC_M3_CMS_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc0, .fstr = "skx_unc_m3upi1::UNC_M3_CMS_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "skx_unc_m3upi1::UNC_M3_AG1_BL_CREDITS_ACQUIRED:TGR5:t=2:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x280208c, .fstr = 
"skx_unc_m3upi1::UNC_M3_AG1_BL_CREDITS_ACQUIRED:TGR5:e=0:i=1:t=2", }, { SRC_LINE, .name = "knm::no_alloc_cycles:all", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x537fca, .fstr = "knm::NO_ALLOC_CYCLES:ALL:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "knm::MEM_UOPS_RETIRED:DTLB_MISS_LOADS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530804, .fstr = "knm::MEM_UOPS_RETIRED:DTLB_MISS_LOADS:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "knm::uops_retired:any:t", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "knm::unhalted_reference_cycles:u:t", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x710300, .fstr = "knm::UNHALTED_REFERENCE_CYCLES:k=0:u=1:t=1", }, { SRC_LINE, .name = "knm::instructions_retired:k:t", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x7200c0, .fstr = "knm::INSTRUCTION_RETIRED:k=1:u=0:e=0:i=0:c=0:t=1", }, { SRC_LINE, .name = "knm::unhalted_core_cycles:k:t", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x72003c, .fstr = "knm::UNHALTED_CORE_CYCLES:k=1:u=0:e=0:i=0:c=0:t=1", }, { SRC_LINE, .name = "knm::offcore_response_1:any_request", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5302b7, .codes[1] = 0x18000, .fstr = "knm::OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "knm::offcore_response_0:any_read", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x132e7, .fstr = "knm::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:PF_L2_RFO:PF_L2_CODE_RD:PARTIAL_READS:UC_CODE_READS:PF_SOFTWARE:PF_L1_DATA_RD:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "knm::offcore_response_1:any_read", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5302b7, .codes[1] = 0x132e7, .fstr = "knm::OFFCORE_RESPONSE_1:DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:PF_L2_RFO:PF_L2_CODE_RD:PARTIAL_READS:UC_CODE_READS:PF_SOFTWARE:PF_L1_DATA_RD:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "knm::offcore_response_0:any_request:ddr_near", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x80808000ull, .fstr 
= "knm::OFFCORE_RESPONSE_0:ANY_REQUEST:DDR_NEAR:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "knm::offcore_response_0:any_request:L2_HIT_NEAR_TILE:L2_HIT_FAR_TILE", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x1800588000ull, .fstr = "knm::OFFCORE_RESPONSE_0:ANY_REQUEST:L2_HIT_NEAR_TILE:L2_HIT_FAR_TILE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "knm::offcore_response_0:dmnd_data_rd:outstanding", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x4000000001ull, .fstr = "knm::OFFCORE_RESPONSE_0:DMND_DATA_RD:OUTSTANDING:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "knm::offcore_response_0:dmnd_data_rd:ddr_near:outstanding", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "knm::offcore_response_1:dmnd_data_rd:outstanding", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "knm_unc_imc0::UNC_M_D_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knm_unc_imc0::UNC_M_D_CLOCKTICKS", }, { SRC_LINE, .name = "knm_unc_imc0::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0103, .fstr = "knm_unc_imc0::UNC_M_CAS_COUNT:RD", }, { SRC_LINE, .name = "knm_unc_imc0::UNC_M_CAS_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0203, .fstr = "knm_unc_imc0::UNC_M_CAS_COUNT:WR", }, { SRC_LINE, .name = "knm_unc_imc0::UNC_M_CAS_COUNT:ALL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0303, .fstr = "knm_unc_imc0::UNC_M_CAS_COUNT:ALL", }, { SRC_LINE, .name = "knm_unc_imc_uclk0::UNC_M_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knm_unc_imc_uclk0::UNC_M_U_CLOCKTICKS", }, { SRC_LINE, .name = "knm_unc_edc_uclk0::UNC_E_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knm_unc_edc_uclk0::UNC_E_U_CLOCKTICKS", }, { SRC_LINE, .name = "knm_unc_edc_uclk0::UNC_E_EDC_ACCESS:HIT_CLEAN", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0102, .fstr = "knm_unc_edc_uclk0::UNC_E_EDC_ACCESS:HIT_CLEAN", }, { SRC_LINE, .name = "knm_unc_edc_uclk0::UNC_E_EDC_ACCESS:HIT_DIRTY", .ret = 
PFM_SUCCESS, .count = 1, .codes[0] = 0x0202, .fstr = "knm_unc_edc_uclk0::UNC_E_EDC_ACCESS:HIT_DIRTY", }, { SRC_LINE, .name = "knm_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_CLEAN", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0402, .fstr = "knm_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_CLEAN", }, { SRC_LINE, .name = "knm_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_DIRTY", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0802, .fstr = "knm_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_DIRTY", }, { SRC_LINE, .name = "knm_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_INVALID", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1002, .fstr = "knm_unc_edc_uclk0::UNC_E_EDC_ACCESS:MISS_INVALID", }, { SRC_LINE, .name = "knm_unc_edc_eclk0::UNC_E_E_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knm_unc_edc_eclk0::UNC_E_E_CLOCKTICKS", }, { SRC_LINE, .name = "knm_unc_edc_eclk0::UNC_E_RPQ_INSERTS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0101, .fstr = "knm_unc_edc_eclk0::UNC_E_RPQ_INSERTS", }, { SRC_LINE, .name = "knm_unc_cha0::UNC_H_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knm_unc_cha0::UNC_H_U_CLOCKTICKS", }, { SRC_LINE, .name = "knm_unc_cha1::UNC_H_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knm_unc_cha1::UNC_H_U_CLOCKTICKS", }, { SRC_LINE, .name = "knm_unc_cha10::UNC_H_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knm_unc_cha10::UNC_H_U_CLOCKTICKS", }, { SRC_LINE, .name = "knm_unc_cha20::UNC_H_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knm_unc_cha20::UNC_H_U_CLOCKTICKS", }, { SRC_LINE, .name = "knm_unc_cha25::UNC_H_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knm_unc_cha25::UNC_H_U_CLOCKTICKS", }, { SRC_LINE, .name = "knm_unc_cha30::UNC_H_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "knm_unc_cha30::UNC_H_U_CLOCKTICKS", }, { SRC_LINE, .name = "knm_unc_cha37::UNC_H_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, 
.codes[0] = 0x00, .fstr = "knm_unc_cha37::UNC_H_U_CLOCKTICKS", }, { SRC_LINE, .name = "knm_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IRQ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0111, .fstr = "knm_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IRQ", }, { SRC_LINE, .name = "knm_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IRQ_REJ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0211, .fstr = "knm_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IRQ_REJ", }, { SRC_LINE, .name = "knm_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IPQ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0411, .fstr = "knm_unc_cha0::UNC_H_INGRESS_OCCUPANCY:IPQ", }, { SRC_LINE, .name = "knm_unc_cha0::UNC_H_INGRESS_OCCUPANCY:PRQ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1011, .fstr = "knm_unc_cha0::UNC_H_INGRESS_OCCUPANCY:PRQ", }, { SRC_LINE, .name = "knm_unc_cha0::UNC_H_INGRESS_OCCUPANCY:PRQ_REJ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x2011, .fstr = "knm_unc_cha0::UNC_H_INGRESS_OCCUPANCY:PRQ_REJ", }, { SRC_LINE, .name = "knm_unc_cha0::UNC_H_INGRESS_INSERTS:IRQ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0113, .fstr = "knm_unc_cha0::UNC_H_INGRESS_INSERTS:IRQ", }, { SRC_LINE, .name = "knm_unc_cha0::UNC_H_INGRESS_INSERTS:IRQ_REJ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0213, .fstr = "knm_unc_cha0::UNC_H_INGRESS_INSERTS:IRQ_REJ", }, { SRC_LINE, .name = "knm_unc_cha0::UNC_H_INGRESS_INSERTS:IPQ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0413, .fstr = "knm_unc_cha0::UNC_H_INGRESS_INSERTS:IPQ", }, { SRC_LINE, .name = "knm_unc_cha0::UNC_H_INGRESS_INSERTS:PRQ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1013, .fstr = "knm_unc_cha0::UNC_H_INGRESS_INSERTS:PRQ", }, { SRC_LINE, .name = "knm_unc_cha0::UNC_H_INGRESS_INSERTS:PRQ_REJ", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x2013, .fstr = "knm_unc_cha0::UNC_H_INGRESS_INSERTS:PRQ_REJ", }, { SRC_LINE, .name = "knm_unc_cha0::UNC_H_INGRESS_RETRY_IRQ0_REJECT:AD_RSP_VN0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0218, .fstr = 
"knm_unc_cha0::UNC_H_INGRESS_RETRY_IRQ0_REJECT:AD_RSP_VN0", }, { SRC_LINE, .name = "knm_unc_m2pcie::UNC_M2P_INGRESS_CYCLES_NE:ALL", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0810, .fstr = "knm_unc_m2pcie::UNC_M2P_INGRESS_CYCLES_NE:ALL", }, { SRC_LINE, .name = "knm_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_NE:AD_0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0123, .fstr = "knm_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_NE:AD_0", }, { SRC_LINE, .name = "knm_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_NE:AD_1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0823, .fstr = "knm_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_NE:AD_1", }, { SRC_LINE, .name = "knm_unc_m2pcie::UNC_M2P_EGRESS_INSERTS:AD_0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0124, .fstr = "knm_unc_m2pcie::UNC_M2P_EGRESS_INSERTS:AD_0", }, { SRC_LINE, .name = "knm_unc_m2pcie::UNC_M2P_EGRESS_INSERTS:AD_1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1024, .fstr = "knm_unc_m2pcie::UNC_M2P_EGRESS_INSERTS:AD_1", }, { SRC_LINE, .name = "knm_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_FULL:AD_0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0125, .fstr = "knm_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_FULL:AD_0", }, { SRC_LINE, .name = "knm_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_FULL:AD_1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0825, .fstr = "knm_unc_m2pcie::UNC_M2P_EGRESS_CYCLES_FULL:AD_1", }, { SRC_LINE, .name = "clx::offcore_response_1:pf_l1d_and_sw", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301bb, .codes[1] = 0x10400, .fstr = "clx::OFFCORE_RESPONSE_1:PF_L1D_AND_SW:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "clx::offcore_response_0:any_request", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x185b7, .fstr = "clx::OFFCORE_RESPONSE_0:DMND_DATA_RD:DMND_RFO:DMND_CODE_RD:PF_L2_DATA_RD:PF_L2_RFO:PF_L3_DATA_RD:PF_L3_RFO:PF_L1D_AND_SW:OTHER:ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "clx::offcore_response_0:snp_any", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 
0x5301b7, .codes[1] = 0x3f800185b7ull, .fstr = "clx::OFFCORE_RESPONSE_0:ANY_REQUEST:ANY_RESPONSE:SNP_NONE:SNP_NOT_NEEDED:SNP_MISS:SNP_HIT_NO_FWD:SNP_HIT_WITH_FWD:SNP_HITM:SNP_NON_DRAM:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "clx::offcore_response_0:l3_hitmesf", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f803c85b7ull, .fstr = "clx::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_HITM:L3_HITE:L3_HITS:L3_HITF:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "clx::offcore_response_0:L4_HIT_LOCAL_L4", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "clx::offcore_response_0:L3_MISS_LOCAL", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f840085b7ull, .fstr = "clx::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "clx::offcore_response_0:l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f840085b7ull, .fstr = "clx::OFFCORE_RESPONSE_0:ANY_REQUEST:L3_MISS_LOCAL:SNP_ANY:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "clx::mem_load_uops_l3_miss_retired:remote_hitm", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5304d3, .fstr = "clx::MEM_LOAD_L3_MISS_RETIRED:REMOTE_HITM:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "clx::mem_load_uops_l3_miss_retired:local_dram", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5301d3, .fstr = "clx::MEM_LOAD_L3_MISS_RETIRED:LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "clx::mem_load_uops_l3_hit_retired:xsnp_hit", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5302d2, .fstr = "clx::MEM_LOAD_L3_HIT_RETIRED:XSNP_HIT:k=1:u=1:e=0:i=0:c=0:t=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "amd64_fam17h_zen2::retired_uops", .count = 1, .codes[0] = 0x5300c1ull, .fstr = "amd64_fam17h_zen2::RETIRED_UOPS:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam17h_zen2::cycles_not_in_halt", .count = 1, .codes[0] = 
0x530076ull, .fstr = "amd64_fam17h_zen2::CYCLES_NOT_IN_HALT:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam17h_zen2::L2_PREFETCH_HIT_L2", .count = 1, .codes[0] = 0x531f70ull, .fstr = "amd64_fam17h_zen2::L2_PREFETCH_HIT_L2:ANY:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam17h_zen2::L1_DTLB_MISS:TLB_RELOAD_1G_L2_HIT:u", .count = 1, .codes[0] = 0x510845ull, .fstr = "amd64_fam17h_zen2::L1_DTLB_MISS:TLB_RELOAD_1G_L2_HIT:k=0:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam17h_zen2::RETIRED_FUSED_INSTRUCTIONS", .count = 1, .codes[0] = 0x1005300d0ull, .fstr = "amd64_fam17h_zen2::RETIRED_FUSED_INSTRUCTIONS:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam17h_zen2::RETIRED_SSE_AVX_FLOPS", .count = 1, .codes[0] = 0x530f03, .fstr = "amd64_fam17h_zen2::RETIRED_SSE_AVX_FLOPS:ANY:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam17h_zen2::RETIRED_SSE_AVX_FLOPS:MULT_FLOPS:u", .count = 1, .codes[0] = 0x510203, .fstr = "amd64_fam17h_zen2::RETIRED_SSE_AVX_FLOPS:MULT_FLOPS:k=0:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "tmt::cpu_clk_unhalted:core_p", .count = 1, .codes[0] = 0x53003c, .fstr = "tmt::CPU_CLK_UNHALTED:CORE_P:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "tmt::cpu_clk_unhalted:ref", .count = 1, .codes[0] = 0x53013c, .fstr = "tmt::CPU_CLK_UNHALTED:REF:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "tmt::cpu_clk_unhalted:ref_tsc", .count = 1, .codes[0] = 0x530300, .fstr = "tmt::CPU_CLK_UNHALTED:REF_TSC:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "tmt::dtlb_load_misses:walk_completed_4k:t", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "tmt::instructions_retired:t", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "tmt::offcore_response_0:demand_data_rd_any_response", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x10001, .fstr = "tmt::OFFCORE_RESPONSE_0:DEMAND_DATA_RD_ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "tmt::ocr:demand_data_rd_any_response", .ret = PFM_SUCCESS, 
.count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x10001, .fstr = "tmt::OFFCORE_RESPONSE_0:DEMAND_DATA_RD_ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "tmt::offcore_response_1:demand_data_rd_l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5302b7, .codes[1] = 0x3f04000001ull, .fstr = "tmt::OFFCORE_RESPONSE_1:DEMAND_DATA_RD_L3_MISS:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "icl::FRONTEND_RETIRED:DSB_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x11, .fstr = "icl::FRONTEND_RETIRED:DSB_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "icl::MEM_LOAD_RETIRED:L3_MISS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5320d1, .fstr = "icl::MEM_LOAD_RETIRED:L3_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::CPU_CLK_UNHALTED.DISTRIBUTED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5302ec, .fstr = "icl::CPU_CLK_UNHALTED:DISTRIBUTED:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::CPU_CLK_UNHALTED.REF_DISTRIBUTED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53083c, .fstr = "icl::CPU_CLK_UNHALTED:REF_DISTRIBUTED:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::FRONTEND_RETIRED:ITLB_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x14, .fstr = "icl::FRONTEND_RETIRED:ITLB_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "icl::FRONTEND_RETIRED:L1I_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x12, .fstr = "icl::FRONTEND_RETIRED:L1I_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "icl::FRONTEND_RETIRED:L2_MISS:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5101c6, .codes[1] = 0x13, .fstr = "icl::FRONTEND_RETIRED:L2_MISS:k=0:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "icl::FRONTEND_RETIRED:STLB_MISS:c=1:i", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x1d301c6, .codes[1] = 0x15, .fstr = 
"icl::FRONTEND_RETIRED:STLB_MISS:k=1:u=1:e=0:i=1:c=1:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "icl::FRONTEND_RETIRED:LATENCY_GE_256", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x510006, .fstr = "icl::FRONTEND_RETIRED:LATENCY_GE_256:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=256", }, { SRC_LINE, .name = "icl::FRONTEND_RETIRED:LATENCY_GE_2_BUBBLES_GE_1", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x100206, .fstr = "icl::FRONTEND_RETIRED:LATENCY_GE_2_BUBBLES_GE_1:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=2", }, { SRC_LINE, .name = "icl::FRONTEND_RETIRED:LATENCY_GE_4:fe_thres=4095", .ret = PFM_ERR_ATTR_SET, }, { SRC_LINE, .name = "icl::FRONTEND_RETIRED:IDQ_2_BUBBLES:k", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5201c6, .codes[1] = 0x200106, .fstr = "icl::FRONTEND_RETIRED:IDQ_2_BUBBLES:k=1:u=0:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=1", }, { SRC_LINE, .name = "icl::FRONTEND_RETIRED:IDQ_4_BUBBLES:fe_thres=4095", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x4fff06, .fstr = "icl::FRONTEND_RETIRED:IDQ_4_BUBBLES:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=4095", }, { SRC_LINE, .name = "icl::FRONTEND_RETIRED:IDQ_4_BUBBLES:fe_thres=4096", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "icl::FRONTEND_RETIRED:DSB_MISS:ITLB_MISS", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "icl::offcore_response_0:demand_Data_rd_local_dram", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x184000001ull, .fstr = "icl::OFFCORE_RESPONSE_0:DEMAND_DATA_RD_LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::offcore_response_0:demand_rfo_l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3fffc00002ull, .fstr = "icl::OFFCORE_RESPONSE_0:DEMAND_RFO_L3_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::offcore_response_0:other_any_response", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x18000ull, .fstr 
= "icl::OFFCORE_RESPONSE_0:OTHER_ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::offcore_response_0:DEMAND_DATA_RD_L3_HIT_SNOOP_HITM", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x10003c0001ull, .fstr = "icl::OFFCORE_RESPONSE_0:DEMAND_DATA_RD_L3_HIT_SNOOP_HITM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::offcore_response_0:STREAMING_WR_LOCAL_DRAM", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x184000800ull, .fstr = "icl::OFFCORE_RESPONSE_0:STREAMING_WR_LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::cycle_activity:0x6:c=6", .count = 1, .codes[0] = 0x65306a3, .fstr = "icl::CYCLE_ACTIVITY:0x6:k=1:u=1:e=0:i=0:c=6:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::dtlb_store_misses:walk_completed_2m_4m:c=1", .count = 1, .codes[0] = 0x1530449, .fstr = "icl::DTLB_STORE_MISSES:WALK_COMPLETED_2M_4M:k=1:u=1:e=0:i=0:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::rob_misc_events:lbr_inserts", .ret = PFM_ERR_NOTFOUND, .count = 1, .codes[0] = 0x5320cc, .fstr = "icl::ROB_MISC_EVENTS:LBR_INSERTS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::misc_retired:lbr_inserts", .count = 1, .codes[0] = 0x5320cc, .fstr = "icl::MISC_RETIRED:LBR_INSERTS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::cycle_activity:stalls_mem_any:c=6", .ret = PFM_ERR_ATTR_SET, }, { SRC_LINE, .name = "icl::uops_dispatched_port:port_0", .count = 1, .codes[0] = 0x5301a1, .fstr = "icl::UOPS_DISPATCHED:PORT_0:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::uops_dispatched:port_0", .count = 1, .codes[0] = 0x5301a1, .fstr = "icl::UOPS_DISPATCHED:PORT_0:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::l2_lines_out.useless_hwpf", .count = 1, .codes[0] = 0x5304f2, .fstr = "icl::L2_LINES_OUT:USELESS_HWPF:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::l2_lines_out.silent:t=0", .ret = PFM_ERR_ATTR, }, 
{ SRC_LINE, .name = "icl::inst_retired:stall_cycles", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d301c0, .fstr = "icl::INST_RETIRED:STALL_CYCLES:k=1:u=1:e=0:i=1:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::inst_retired:stall_cycles", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d301c0, .fstr = "icl::INST_RETIRED:STALL_CYCLES:k=1:u=1:e=0:i=1:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::inst_retired:prec_dist", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530100, .fstr = "icl::INST_RETIRED:PREC_DIST:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::inst_retired:any_p", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5300c0, .fstr = "icl::INST_RETIRED:ANY_P:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::inst_retired:any", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530100, .fstr = "icl::INST_RETIRED:ANY:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::mem_trans_retired:load_latency:ldlat=3:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5101cd, .codes[1] = 3, .fstr = "icl::MEM_TRANS_RETIRED:LOAD_LATENCY:k=0:u=1:e=0:i=0:c=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::mem_trans_retired:load_latency:ldlat=1000000", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "icl::mem_trans_retired:load_latency", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301cd, .codes[1] = 3, .fstr = "icl::MEM_TRANS_RETIRED:LOAD_LATENCY:k=1:u=1:e=0:i=0:c=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::mem_trans_retired:load_latency:ldlat=0:intx=0:intxcp=0", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "icl::topdown.slots", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530400, .fstr = "icl::TOPDOWN:SLOTS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::topdown.slots_p", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5301a4, .fstr = "icl::TOPDOWN:SLOTS_P:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icl::TOPDOWN_M.FRONTEND_BOUND", .ret = PFM_SUCCESS, .count = 1, 
.codes[0] = 0x538200, .fstr = "icl::TOPDOWN_M:FRONTEND_BOUND", }, { SRC_LINE, .name = "icl::TOPDOWN_M.BACKEND_BOUND", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538300, .fstr = "icl::TOPDOWN_M:BACKEND_BOUND", }, { SRC_LINE, .name = "icl::TOPDOWN_M.RETIRING", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538000, .fstr = "icl::TOPDOWN_M:RETIRING", }, { SRC_LINE, .name = "icl::TOPDOWN_M.BAD_SPEC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538100, .fstr = "icl::TOPDOWN_M:BAD_SPEC", }, { SRC_LINE, .name = "icl::TOPDOWN_M.SLOTS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530400, .fstr = "icl::TOPDOWN_M:SLOTS:k=1:u=1", }, { SRC_LINE, .name = "amd64_fam19h_zen3::retired_ops", .count = 1, .codes[0] = 0x5300c1ull, .fstr = "amd64_fam19h_zen3::RETIRED_OPS:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam19h_zen3::cycles_not_in_halt", .count = 1, .codes[0] = 0x530076ull, .fstr = "amd64_fam19h_zen3::CYCLES_NOT_IN_HALT:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam19h_zen3::L2_PREFETCH_HIT_L2:L2_HW_PREFETCHER", .count = 1, .codes[0] = 0x531f70ull, .fstr = "amd64_fam19h_zen3::L2_PREFETCH_HIT_L2:L2_HW_PREFETCHER:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam19h_zen3::L1_DTLB_MISS:TLB_RELOAD_1G_L2_HIT:u", .count = 1, .codes[0] = 0x510845ull, .fstr = "amd64_fam19h_zen3::L1_DTLB_MISS:TLB_RELOAD_1G_L2_HIT:k=0:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam19h_zen3::RETIRED_FUSED_INSTRUCTIONS", .count = 1, .codes[0] = 0x1005300d0ull, .fstr = "amd64_fam19h_zen3::RETIRED_FUSED_INSTRUCTIONS:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam19h_zen3::RETIRED_SSE_AVX_FLOPS", .count = 1, .codes[0] = 0x530f03ull, .fstr = "amd64_fam19h_zen3::RETIRED_SSE_AVX_FLOPS:ANY:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam19h_zen3::RETIRED_SSE_AVX_FLOPS:MULT_FLOPS:u", .count = 1, .codes[0] = 0x510203ull, .fstr = "amd64_fam19h_zen3::RETIRED_SSE_AVX_FLOPS:MULT_FLOPS:k=0:u=1:e=0:i=0:c=0:h=0:g=0", }, { 
SRC_LINE, .name = "amd64_fam19h_zen3_l3::UNC_L3_REQUESTS", .count = 1, .codes[0] = 0x53ff04ull, .fstr = "amd64_fam19h_zen3_l3::UNC_L3_REQUESTS:ALL", }, { SRC_LINE, .name = "amd64_fam19h_zen3_l3::UNC_L3_REQUESTS:u", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "amd64_fam19h_zen3_l3::UNC_L3_MISSES", .count = 1, .codes[0] = 0x53ff9aull, .fstr = "amd64_fam19h_zen3_l3::UNC_L3_MISSES:ALL", }, { SRC_LINE, .name = "icx::FRONTEND_RETIRED:DSB_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x11, .fstr = "icx::FRONTEND_RETIRED:DSB_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "icx::MEM_LOAD_RETIRED:L3_MISS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5320d1, .fstr = "icx::MEM_LOAD_RETIRED:L3_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::CPU_CLK_UNHALTED.DISTRIBUTED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5302ec, .fstr = "icx::CPU_CLK_UNHALTED:DISTRIBUTED:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::CPU_CLK_UNHALTED.REF_DISTRIBUTED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53083c, .fstr = "icx::CPU_CLK_UNHALTED:REF_DISTRIBUTED:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::FRONTEND_RETIRED:ITLB_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x14, .fstr = "icx::FRONTEND_RETIRED:ITLB_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "icx::FRONTEND_RETIRED:L1I_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x12, .fstr = "icx::FRONTEND_RETIRED:L1I_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "icx::FRONTEND_RETIRED:L2_MISS:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5101c6, .codes[1] = 0x13, .fstr = "icx::FRONTEND_RETIRED:L2_MISS:k=0:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "icx::FRONTEND_RETIRED:STLB_MISS:c=1:i", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x1d301c6, .codes[1] = 0x15, .fstr = 
"icx::FRONTEND_RETIRED:STLB_MISS:k=1:u=1:e=0:i=1:c=1:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "icx::FRONTEND_RETIRED:LATENCY_GE_256", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x510006, .fstr = "icx::FRONTEND_RETIRED:LATENCY_GE_256:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=256", }, { SRC_LINE, .name = "icx::FRONTEND_RETIRED:LATENCY_GE_2_BUBBLES_GE_1", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x100206, .fstr = "icx::FRONTEND_RETIRED:LATENCY_GE_2_BUBBLES_GE_1:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=2", }, { SRC_LINE, .name = "icx::FRONTEND_RETIRED:LATENCY_GE_4:fe_thres=4095", .ret = PFM_ERR_ATTR_SET, }, { SRC_LINE, .name = "icx::FRONTEND_RETIRED:IDQ_2_BUBBLES:k", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5201c6, .codes[1] = 0x200106, .fstr = "icx::FRONTEND_RETIRED:IDQ_2_BUBBLES:k=1:u=0:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=1", }, { SRC_LINE, .name = "icx::FRONTEND_RETIRED:IDQ_4_BUBBLES:fe_thres=4095", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x4fff06, .fstr = "icx::FRONTEND_RETIRED:IDQ_4_BUBBLES:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=4095", }, { SRC_LINE, .name = "icx::FRONTEND_RETIRED:IDQ_4_BUBBLES:fe_thres=4096", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "icx::FRONTEND_RETIRED:DSB_MISS:ITLB_MISS", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "icx::offcore_response_0:demand_Data_rd_local_dram", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x104000001ull, .fstr = "icx::OFFCORE_RESPONSE_0:DEMAND_DATA_RD_LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::offcore_response_0:demand_rfo_l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f3fc00002ull, .fstr = "icx::OFFCORE_RESPONSE_0:DEMAND_RFO_L3_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::ocr:reads_to_core_any_response", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x3f3ffc0477ull, .fstr = 
"icx::OFFCORE_RESPONSE_0:READS_TO_CORE_ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::offcore_response_0:other_any_response", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x18000ull, .fstr = "icx::OFFCORE_RESPONSE_0:OTHER_ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::offcore_response_0:DEMAND_DATA_RD_L3_HIT_SNOOP_HITM", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x10003c0001ull, .fstr = "icx::OFFCORE_RESPONSE_0:DEMAND_DATA_RD_L3_HIT_SNOOP_HITM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::offcore_response_0:STREAMING_WR_L3_MISS_LOCAL", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1] = 0x84000800ull, .fstr = "icx::OFFCORE_RESPONSE_0:STREAMING_WR_L3_MISS_LOCAL:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::cycle_activity:0x6:c=6", .count = 1, .codes[0] = 0x65306a3, .fstr = "icx::CYCLE_ACTIVITY:0x6:k=1:u=1:e=0:i=0:c=6:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::dtlb_store_misses:walk_completed_2m_4m:c=1", .count = 1, .codes[0] = 0x1530449, .fstr = "icx::DTLB_STORE_MISSES:WALK_COMPLETED_2M_4M:k=1:u=1:e=0:i=0:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::rob_misc_events:lbr_inserts", .ret = PFM_ERR_NOTFOUND, .count = 1, .codes[0] = 0x5320cc, .fstr = "icx::ROB_MISC_EVENTS:LBR_INSERTS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::misc_retired:lbr_inserts", .count = 1, .codes[0] = 0x5320cc, .fstr = "icx::MISC_RETIRED:LBR_INSERTS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::cycle_activity:stalls_mem_any:c=6", .ret = PFM_ERR_ATTR_SET, }, { SRC_LINE, .name = "icx::uops_dispatched_port:port_0", .count = 1, .codes[0] = 0x5301a1, .fstr = "icx::UOPS_DISPATCHED:PORT_0:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::uops_dispatched:port_0", .count = 1, .codes[0] = 0x5301a1, .fstr = "icx::UOPS_DISPATCHED:PORT_0:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", 
}, { SRC_LINE, .name = "icx::l2_lines_out.useless_hwpf", .count = 1, .codes[0] = 0x5304f2, .fstr = "icx::L2_LINES_OUT:USELESS_HWPF:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::l2_lines_out.silent:t=0", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "icx::inst_retired:stall_cycles", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d301c0, .fstr = "icx::INST_RETIRED:STALL_CYCLES:k=1:u=1:e=0:i=1:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::inst_retired:stall_cycles", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d301c0, .fstr = "icx::INST_RETIRED:STALL_CYCLES:k=1:u=1:e=0:i=1:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::inst_retired:prec_dist", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530100, .fstr = "icx::INST_RETIRED:PREC_DIST:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::inst_retired:any_p", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5300c0, .fstr = "icx::INST_RETIRED:ANY_P:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::inst_retired:any", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530100, .fstr = "icx::INST_RETIRED:ANY:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::mem_trans_retired:load_latency:ldlat=3:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5101cd, .codes[1] = 3, .fstr = "icx::MEM_TRANS_RETIRED:LOAD_LATENCY:k=0:u=1:e=0:i=0:c=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::mem_trans_retired:load_latency:ldlat=1000000", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "icx::mem_trans_retired:load_latency", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301cd, .codes[1] = 3, .fstr = "icx::MEM_TRANS_RETIRED:LOAD_LATENCY:k=1:u=1:e=0:i=0:c=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::mem_load_l3_miss_retired:remote_pmm", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5310d3, .fstr = "icx::MEM_LOAD_L3_MISS_RETIRED:REMOTE_PMM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::mem_load_l3_miss_retired:remote_pmm", .ret = 
PFM_SUCCESS, .count = 1, .codes[0] = 0x5310d3, .fstr = "icx::MEM_LOAD_L3_MISS_RETIRED:REMOTE_PMM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::mem_load_l3_miss_retired:remote_dram", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5302d3, .fstr = "icx::MEM_LOAD_L3_MISS_RETIRED:REMOTE_DRAM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx::TOPDOWN_M.FRONTEND_BOUND", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538200, .fstr = "icx::TOPDOWN_M:FRONTEND_BOUND", }, { SRC_LINE, .name = "icx::TOPDOWN_M.BACKEND_BOUND", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538300, .fstr = "icx::TOPDOWN_M:BACKEND_BOUND", }, { SRC_LINE, .name = "icx::TOPDOWN_M.RETIRING", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538000, .fstr = "icx::TOPDOWN_M:RETIRING", }, { SRC_LINE, .name = "icx::TOPDOWN_M.BAD_SPEC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538100, .fstr = "icx::TOPDOWN_M:BAD_SPEC", }, { SRC_LINE, .name = "icx::TOPDOWN_M.SLOTS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530400, .fstr = "icx::TOPDOWN_M:SLOTS:k=1:u=1", }, { SRC_LINE, .name = "icl::mem_load_l3_miss_retired:remote_dram", .ret = PFM_ERR_ATTR, .count = 1, }, { SRC_LINE, .name = "spr::FRONTEND_RETIRED:DSB_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x11, .fstr = "spr::FRONTEND_RETIRED:DSB_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "spr::FRONTEND_RETIRED:UNKNOWN_BRANCH", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x17, .fstr = "spr::FRONTEND_RETIRED:UNKNOWN_BRANCH:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "spr::MEM_LOAD_RETIRED:L3_MISS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5320d1, .fstr = "spr::MEM_LOAD_RETIRED:L3_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::CPU_CLK_UNHALTED.DISTRIBUTED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5302ec, .fstr = "spr::CPU_CLK_UNHALTED:DISTRIBUTED:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, 
{ SRC_LINE, .name = "spr::CPU_CLK_UNHALTED.REF_DISTRIBUTED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53083c, .fstr = "spr::CPU_CLK_UNHALTED:REF_DISTRIBUTED:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::CPU_CLK_UNHALTED.C02", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5320ec, .fstr = "spr::CPU_CLK_UNHALTED:C02:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53023c, .fstr = "spr::CPU_CLK_UNHALTED:ONE_THREAD_ACTIVE:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::FRONTEND_RETIRED:ITLB_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x14, .fstr = "spr::FRONTEND_RETIRED:ITLB_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "spr::FRONTEND_RETIRED:L1I_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x12, .fstr = "spr::FRONTEND_RETIRED:L1I_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "spr::FRONTEND_RETIRED:L2_MISS:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5101c6, .codes[1] = 0x13, .fstr = "spr::FRONTEND_RETIRED:L2_MISS:k=0:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "spr::FRONTEND_RETIRED:STLB_MISS:c=1:i", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x1d301c6, .codes[1] = 0x15, .fstr = "spr::FRONTEND_RETIRED:STLB_MISS:k=1:u=1:e=0:i=1:c=1:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "spr::FRONTEND_RETIRED:LATENCY_GE_256", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x610006, .fstr = "spr::FRONTEND_RETIRED:LATENCY_GE_256:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=256", }, { SRC_LINE, .name = "spr::FRONTEND_RETIRED:LATENCY_GE_2_BUBBLES_GE_1", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x100206, .fstr = "spr::FRONTEND_RETIRED:LATENCY_GE_2_BUBBLES_GE_1:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=2", }, { SRC_LINE, .name = 
"spr::FRONTEND_RETIRED:LATENCY_GE_4:fe_thres=4095", .ret = PFM_ERR_ATTR_SET, }, { SRC_LINE, .name = "spr::FRONTEND_RETIRED:DSB_MISS:ITLB_MISS", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "spr::offcore_response_0:demand_Data_rd_local_dram", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x104000001ull, .fstr = "spr::OCR:DEMAND_DATA_RD_LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::offcore_response_0:demand_rfo_l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x3f3fc00002ull, .fstr = "spr::OCR:DEMAND_RFO_L3_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::ocr:reads_to_core_any_response", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x3f3ffc4477ull, .fstr = "spr::OCR:READS_TO_CORE_ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::ocr:HWPF_L3_ANY_RESPONSE", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x12380ull, .fstr = "spr::OCR:HWPF_L3_ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::offcore_response_0:DEMAND_DATA_RD_L3_HIT_SNOOP_HITM", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x10003c0001ull, .fstr = "spr::OCR:DEMAND_DATA_RD_L3_HIT_SNOOP_HITM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::offcore_response_0:STREAMING_WR_L3_MISS_LOCAL", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x84000800ull, .fstr = "spr::OCR:STREAMING_WR_L3_MISS_LOCAL:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::cycle_activity:0x6:c=6", .count = 1, .codes[0] = 0x65306a3, .fstr = "spr::CYCLE_ACTIVITY:0x6:k=1:u=1:e=0:i=0:c=6:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::dtlb_store_misses:walk_completed_2m_4m:c=1", .count = 1, .codes[0] = 0x1530413, .fstr = "spr::DTLB_STORE_MISSES:WALK_COMPLETED_2M_4M:k=1:u=1:e=0:i=0:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::rob_misc_events:lbr_inserts", .ret = 
PFM_ERR_NOTFOUND, .count = 1, .codes[0] = 0x5320cc, .fstr = "spr::ROB_MISC_EVENTS:LBR_INSERTS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::misc_retired:lbr_inserts", .count = 1, .codes[0] = 0x5320cc, .fstr = "spr::MISC_RETIRED:LBR_INSERTS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::cycle_activity:cycles_mem_any:c=6", .ret = PFM_ERR_ATTR_SET, }, { SRC_LINE, .name = "spr::uops_dispatched:port_0", .count = 1, .codes[0] = 0x5301b2, .fstr = "spr::UOPS_DISPATCHED:PORT_0:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::uops_dispatched:port_2_3_10", .count = 1, .codes[0] = 0x5304b2, .fstr = "spr::UOPS_DISPATCHED:PORT_2_3_10:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::uops_dispatched:port_5_11", .count = 1, .codes[0] = 0x5320b2, .fstr = "spr::UOPS_DISPATCHED:PORT_5_11:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::inst_retired:stall_cycles", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d301c0, .fstr = "spr::INST_RETIRED:STALL_CYCLES:k=1:u=1:e=0:i=1:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::inst_retired:prec_dist", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530100, .fstr = "spr::INST_RETIRED:PREC_DIST:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::inst_retired:any_p", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5300c0, .fstr = "spr::INST_RETIRED:ANY_P:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::inst_retired:any", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530100, .fstr = "spr::INST_RETIRED:ANY:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::mem_trans_retired:load_latency:ldlat=3:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5101cd, .codes[1] = 3, .fstr = "spr::MEM_TRANS_RETIRED:LOAD_LATENCY:k=0:u=1:e=0:i=0:c=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::mem_trans_retired:load_latency:ldlat=1000000", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = 
"spr::mem_trans_retired:load_latency", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301cd, .codes[1] = 3, .fstr = "spr::MEM_TRANS_RETIRED:LOAD_LATENCY:k=1:u=1:e=0:i=0:c=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::mem_load_l3_miss_retired:remote_pmm", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5310d3, .fstr = "spr::MEM_LOAD_L3_MISS_RETIRED:REMOTE_PMM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::mem_load_l3_miss_retired:remote_pmm", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5310d3, .fstr = "spr::MEM_LOAD_L3_MISS_RETIRED:REMOTE_PMM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::mem_load_l3_miss_retired:remote_dram", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5302d3, .fstr = "spr::MEM_LOAD_L3_MISS_RETIRED:REMOTE_DRAM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::cpu_clk_unhalted:pause_inst:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15640ec, .fstr = "spr::CPU_CLK_UNHALTED:PAUSE_INST:k=1:u=0:e=1:i=0:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::ARITH.FP_DIVIDER_ACTIVE", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15301b0, .fstr = "spr::ARITH:FPDIV_ACTIVE:k=1:u=1:e=0:i=0:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::UOPS_RETIRED.STALLS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d302c2, .fstr = "spr::UOPS_RETIRED:STALLS:k=1:u=1:e=0:i=1:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "spr::TOPDOWN_M.FRONTEND_BOUND", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538200, .fstr = "spr::TOPDOWN_M:FRONTEND_BOUND", }, { SRC_LINE, .name = "spr::TOPDOWN_M.BACKEND_BOUND", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538300, .fstr = "spr::TOPDOWN_M:BACKEND_BOUND", }, { SRC_LINE, .name = "spr::TOPDOWN_M.RETIRING", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538000, .fstr = "spr::TOPDOWN_M:RETIRING", }, { SRC_LINE, .name = "spr::TOPDOWN_M.BAD_SPEC", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538100, .fstr = "spr::TOPDOWN_M:BAD_SPEC", }, { SRC_LINE, .name = 
"spr::TOPDOWN_M.HEAVY_OPS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538400, .fstr = "spr::TOPDOWN_M:HEAVY_OPS", }, { SRC_LINE, .name = "spr::TOPDOWN_M:FETCH_LAT", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538600, .fstr = "spr::TOPDOWN_M:FETCH_LAT", }, { SRC_LINE, .name = "spr::TOPDOWN_M:BR_MISPREDICT", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538500, .fstr = "spr::TOPDOWN_M:BR_MISPREDICT", }, { SRC_LINE, .name = "spr::TOPDOWN_M:MEMORY_BOUND", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538700, .fstr = "spr::TOPDOWN_M:MEMORY_BOUND", }, { SRC_LINE, .name = "spr::TOPDOWN_M.SLOTS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530400, .fstr = "spr::TOPDOWN_M:SLOTS:k=1:u=1", }, { SRC_LINE, .name = "amd64_fam19h_zen4::retired_ops", .count = 1, .codes[0] = 0x5300c1ull, .fstr = "amd64_fam19h_zen4::RETIRED_OPS:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam19h_zen4::cycles_not_in_halt", .count = 1, .codes[0] = 0x530076ull, .fstr = "amd64_fam19h_zen4::CYCLES_NOT_IN_HALT:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam19h_zen4::L2_PREFETCH_HIT_L2:L2_STREAM", .count = 1, .codes[0] = 0x530170ull, .fstr = "amd64_fam19h_zen4::L2_PREFETCH_HIT_L2:L2_STREAM:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam19h_zen4::L1_DTLB_MISS:TLB_RELOAD_1G_L2_HIT:u", .count = 1, .codes[0] = 0x510845ull, .fstr = "amd64_fam19h_zen4::L1_DTLB_MISS:TLB_RELOAD_1G_L2_HIT:k=0:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam19h_zen4::RETIRED_FUSED_INSTRUCTIONS", .count = 1, .codes[0] = 0x1005300d0ull, .fstr = "amd64_fam19h_zen4::RETIRED_FUSED_INSTRUCTIONS:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam19h_zen4::RETIRED_SSE_AVX_FLOPS", .count = 1, .codes[0] = 0x531f03ull, .fstr = "amd64_fam19h_zen4::RETIRED_SSE_AVX_FLOPS:ANY:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam19h_zen4::RETIRED_SSE_AVX_FLOPS:MULT_FLOPS:u", .count = 1, .codes[0] = 0x510203ull, .fstr = 
"amd64_fam19h_zen4::RETIRED_SSE_AVX_FLOPS:MULT_FLOPS:k=0:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam19h_zen4::P0_FREQ_CYCLES_NOT_IN_HALT.P0_FREQ_CYCLES", .count = 1, .codes[0] = 0x100530120ull, .fstr = "amd64_fam19h_zen4::P0_FREQ_CYCLES_NOT_IN_HALT:P0_FREQ_CYCLES:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam19h_zen4::DISPATCH_STALLS_1:SMT_CONTENTION", .count = 1, .codes[0] = 0x1005360a0ull, .fstr = "amd64_fam19h_zen4::DISPATCH_STALLS_1:SMT_CONTENTION:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam19h_zen4::OPS_QUEUE_EMPTY", .count = 1, .codes[0] = 0x5300a9ull, .fstr = "amd64_fam19h_zen4::OPS_QUEUE_EMPTY:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam19h_zen4::PACKED_FP_OPS_RETIRED:FP128_SQRT", .count = 1, .codes[0] = 0x53060cull, .fstr = "amd64_fam19h_zen4::PACKED_FP_OPS_RETIRED:FP128_SQRT:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "emr::FRONTEND_RETIRED:DSB_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x11, .fstr = "emr::FRONTEND_RETIRED:DSB_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "emr::FRONTEND_RETIRED:UNKNOWN_BRANCH", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x17, .fstr = "emr::FRONTEND_RETIRED:UNKNOWN_BRANCH:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "emr::MEM_LOAD_RETIRED:L3_MISS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5320d1, .fstr = "emr::MEM_LOAD_RETIRED:L3_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::CPU_CLK_UNHALTED.DISTRIBUTED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5302ec, .fstr = "emr::CPU_CLK_UNHALTED:DISTRIBUTED:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::CPU_CLK_UNHALTED.REF_DISTRIBUTED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53083c, .fstr = "emr::CPU_CLK_UNHALTED:REF_DISTRIBUTED:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::CPU_CLK_UNHALTED.C02", .ret = 
PFM_SUCCESS, .count = 1, .codes[0] = 0x5320ec, .fstr = "emr::CPU_CLK_UNHALTED:C02:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53023c, .fstr = "emr::CPU_CLK_UNHALTED:ONE_THREAD_ACTIVE:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::FRONTEND_RETIRED:ITLB_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x14, .fstr = "emr::FRONTEND_RETIRED:ITLB_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "emr::FRONTEND_RETIRED:L1I_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x12, .fstr = "emr::FRONTEND_RETIRED:L1I_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "emr::FRONTEND_RETIRED:L2_MISS:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5101c6, .codes[1] = 0x13, .fstr = "emr::FRONTEND_RETIRED:L2_MISS:k=0:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "emr::FRONTEND_RETIRED:STLB_MISS:c=1:i", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x1d301c6, .codes[1] = 0x15, .fstr = "emr::FRONTEND_RETIRED:STLB_MISS:k=1:u=1:e=0:i=1:c=1:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "emr::FRONTEND_RETIRED:LATENCY_GE_256", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x610006, .fstr = "emr::FRONTEND_RETIRED:LATENCY_GE_256:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=256", }, { SRC_LINE, .name = "emr::FRONTEND_RETIRED:LATENCY_GE_2_BUBBLES_GE_1", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301c6, .codes[1] = 0x100206, .fstr = "emr::FRONTEND_RETIRED:LATENCY_GE_2_BUBBLES_GE_1:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=2", }, { SRC_LINE, .name = "emr::FRONTEND_RETIRED:LATENCY_GE_4:fe_thres=4095", .ret = PFM_ERR_ATTR_SET, }, { SRC_LINE, .name = "emr::FRONTEND_RETIRED:DSB_MISS:ITLB_MISS", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "emr::offcore_response_0:demand_Data_rd_local_dram", .ret = PFM_SUCCESS, .count = 
2, .codes[0] = 0x53012a, .codes[1] = 0x104000001ull, .fstr = "emr::OCR:DEMAND_DATA_RD_LOCAL_DRAM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::offcore_response_0:demand_rfo_l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x3f3fc00002ull, .fstr = "emr::OCR:DEMAND_RFO_L3_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::ocr:reads_to_core_any_response", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x3f3ffc4477ull, .fstr = "emr::OCR:READS_TO_CORE_ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::ocr:HWPF_L3_ANY_RESPONSE", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x12380ull, .fstr = "emr::OCR:HWPF_L3_ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::offcore_response_0:DEMAND_DATA_RD_L3_HIT_SNOOP_HITM", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x10003c0001ull, .fstr = "emr::OCR:DEMAND_DATA_RD_L3_HIT_SNOOP_HITM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::offcore_response_0:STREAMING_WR_L3_MISS_LOCAL", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x84000800ull, .fstr = "emr::OCR:STREAMING_WR_L3_MISS_LOCAL:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::cycle_activity:0x6:c=6", .count = 1, .codes[0] = 0x65306a3, .fstr = "emr::CYCLE_ACTIVITY:0x6:k=1:u=1:e=0:i=0:c=6:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::dtlb_store_misses:walk_completed_2m_4m:c=1", .count = 1, .codes[0] = 0x1530413, .fstr = "emr::DTLB_STORE_MISSES:WALK_COMPLETED_2M_4M:k=1:u=1:e=0:i=0:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::rob_misc_events:lbr_inserts", .ret = PFM_ERR_NOTFOUND, .count = 1, .codes[0] = 0x5320cc, .fstr = "emr::ROB_MISC_EVENTS:LBR_INSERTS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::misc_retired:lbr_inserts", .count = 1, .codes[0] = 0x5320cc, .fstr = 
"emr::MISC_RETIRED:LBR_INSERTS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::cycle_activity:cycles_mem_any:c=6", .ret = PFM_ERR_ATTR_SET, }, { SRC_LINE, .name = "emr::uops_dispatched:port_0", .count = 1, .codes[0] = 0x5301b2, .fstr = "emr::UOPS_DISPATCHED:PORT_0:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::uops_dispatched:port_2_3_10", .count = 1, .codes[0] = 0x5304b2, .fstr = "emr::UOPS_DISPATCHED:PORT_2_3_10:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::uops_dispatched:port_5_11", .count = 1, .codes[0] = 0x5320b2, .fstr = "emr::UOPS_DISPATCHED:PORT_5_11:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::inst_retired:stall_cycles", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d301c0, .fstr = "emr::INST_RETIRED:STALL_CYCLES:k=1:u=1:e=0:i=1:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::inst_retired:stall_cycles", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d301c0, .fstr = "emr::INST_RETIRED:STALL_CYCLES:k=1:u=1:e=0:i=1:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::inst_retired:prec_dist", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530100, .fstr = "emr::INST_RETIRED:PREC_DIST:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::inst_retired:any_p", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5300c0, .fstr = "emr::INST_RETIRED:ANY_P:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::inst_retired:any", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530100, .fstr = "emr::INST_RETIRED:ANY:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::mem_trans_retired:load_latency:ldlat=3:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5101cd, .codes[1] = 3, .fstr = "emr::MEM_TRANS_RETIRED:LOAD_LATENCY:k=0:u=1:e=0:i=0:c=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::mem_trans_retired:load_latency:ldlat=1000000", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "emr::mem_trans_retired:load_latency", .ret = PFM_SUCCESS, .count = 2, 
.codes[0] = 0x5301cd, .codes[1] = 3, .fstr = "emr::MEM_TRANS_RETIRED:LOAD_LATENCY:k=1:u=1:e=0:i=0:c=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::mem_load_l3_miss_retired:remote_pmm", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5310d3, .fstr = "emr::MEM_LOAD_L3_MISS_RETIRED:REMOTE_PMM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::mem_load_l3_miss_retired:remote_pmm", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5310d3, .fstr = "emr::MEM_LOAD_L3_MISS_RETIRED:REMOTE_PMM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::mem_load_l3_miss_retired:remote_dram", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5302d3, .fstr = "emr::MEM_LOAD_L3_MISS_RETIRED:REMOTE_DRAM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::cpu_clk_unhalted:pause_inst:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15640ec, .fstr = "emr::CPU_CLK_UNHALTED:PAUSE_INST:k=1:u=0:e=1:i=0:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "emr::ARITH.FP_DIVIDER_ACTIVE", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15301b0, .fstr = "emr::ARITH:FPDIV_ACTIVE:k=1:u=1:e=0:i=0:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "icx_unc_cha0::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha0::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha1::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha1::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha2::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha2::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha3::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha3::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha4::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha4::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = 
"icx_unc_cha5::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha5::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha6::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha6::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha7::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha7::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha8::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha8::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha9::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha9::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha10::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha10::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha11::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha11::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha12::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha12::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha13::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha13::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha14::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha14::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha15::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha15::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha16::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha16::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = 
"icx_unc_cha17::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha17::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha18::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha18::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha19::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha19::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha20::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha20::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha21::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha21::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha22::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha22::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha23::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha23::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha24::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha24::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha25::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha25::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha26::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha26::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha27::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha27::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha28::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha28::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, 
.name = "icx_unc_cha29::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha29::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha30::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha30::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha31::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha31::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha32::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha32::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha33::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha33::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha34::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha34::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha35::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha35::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha36::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha36::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha37::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha37::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha38::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha38::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha39::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_cha39::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha2::UNC_CHA_TOR_INSERTS:IO_MISS_PCIRDCUR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc8f3fe00000435ull, .fstr = 
"icx_unc_cha2::UNC_CHA_TOR_INSERTS:IO_MISS_PCIRDCUR:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_cha4::UNC_CHA_TOR_INSERTS:IO_MISS_PCIRDCUR:t=16:i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc8f3fe10800435ull, .fstr = "icx_unc_cha4::UNC_CHA_TOR_INSERTS:IO_MISS_PCIRDCUR:e=0:i=1:t=16", }, { SRC_LINE, .name = "icx_unc_imc0::UNC_M_CAS_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x3004, .fstr = "icx_unc_imc0::UNC_M_CAS_COUNT:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_imc0::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf04, .fstr = "icx_unc_imc0::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_imc0::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "icx_unc_imc0::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_imc1::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf04, .fstr = "icx_unc_imc1::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_imc2::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "icx_unc_imc2::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_imc3::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf04, .fstr = "icx_unc_imc3::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_imc4::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "icx_unc_imc4::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_imc5::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf04, .fstr = "icx_unc_imc5::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_imc6::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "icx_unc_imc6::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_imc7::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf04, .fstr = "icx_unc_imc7::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_imc8::UNC_M_CLOCKTICKS", .ret = 
PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "icx_unc_imc8::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_imc9::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf04, .fstr = "icx_unc_imc9::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_imc10::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "icx_unc_imc10::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_imc11::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf04, .fstr = "icx_unc_imc11::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_iio0::UNC_IIO_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x01, .fstr = "icx_unc_iio0::UNC_IIO_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_iio1::UNC_IIO_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x01, .fstr = "icx_unc_iio1::UNC_IIO_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_iio2::UNC_IIO_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x01, .fstr = "icx_unc_iio2::UNC_IIO_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_iio3::UNC_IIO_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x01, .fstr = "icx_unc_iio3::UNC_IIO_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_iio4::UNC_IIO_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x01, .fstr = "icx_unc_iio4::UNC_IIO_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_iio5::UNC_IIO_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x01, .fstr = "icx_unc_iio5::UNC_IIO_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_iio1::UNC_IIO_DATA_REQ_BY_CPU:MEM_READ_PART0", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x70010000004c0ull, .fstr = "icx_unc_iio1::UNC_IIO_DATA_REQ_BY_CPU:MEM_READ_PART0:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_iio2::UNC_IIO_DATA_REQ_OF_CPU:MEM_READ_PART5", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x7020000000483ull, .fstr = 
"icx_unc_iio2::UNC_IIO_DATA_REQ_OF_CPU:MEM_READ_PART5:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_irp1::UNC_I_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x01, .fstr = "icx_unc_irp1::UNC_I_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_m2m0::UNC_M2M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x00, .fstr = "icx_unc_m2m0::UNC_M2M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_m2m1::UNC_M2M_DIRECTORY_UPDATE", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x12e, .fstr = "icx_unc_m2m1::UNC_M2M_DIRECTORY_UPDATE:ANY:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_pcu::UNC_P_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "icx_unc_pcu::UNC_P_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_pcu::UNC_P_CLOCKTICKS:occ_i:occ_e", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "icx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C6", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc080, .fstr = "icx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C6:e=0:i=0:t=0:occ_i=0:occ_e=0", }, { SRC_LINE, .name = "icx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C6:occ_i", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x4000c080, .fstr = "icx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C6:e=0:i=1:t=0:occ_i=1:occ_e=0", }, { SRC_LINE, .name = "icx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C6:occ_e", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x8000c080, .fstr = "icx_unc_pcu::UNC_P_POWER_STATE_OCCUPANCY:CORES_C6:e=1:i=0:t=0:occ_i=0:occ_e=1", }, { SRC_LINE, .name = "icx_unc_upi0::UNC_UPI_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "icx_unc_upi0::UNC_UPI_CLOCKTICKS:e=0:i=0:t=0" }, { SRC_LINE, .name = "icx_unc_upi1::UNC_UPI_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "icx_unc_upi1::UNC_UPI_CLOCKTICKS:e=0:i=0:t=0" }, { SRC_LINE, .name = "icx_unc_upi2::UNC_UPI_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "icx_unc_upi2::UNC_UPI_CLOCKTICKS:e=0:i=0:t=0" }, { SRC_LINE, .name = 
"icx_unc_upi3::UNC_UPI_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "icx_unc_upi3::UNC_UPI_CLOCKTICKS:e=0:i=0:t=0" }, { SRC_LINE, .name = "icx_unc_upi0::UNC_UPI_RxL_FLITS:ALL_DATA", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf03, .fstr = "icx_unc_upi0::UNC_UPI_RxL_FLITS:ALL_DATA:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_upi0::UNC_UPI_TxL_FLITS:ALL_DATA", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf02, .fstr = "icx_unc_upi0::UNC_UPI_TxL_FLITS:ALL_DATA:e=0:i=0:t=0", }, { SRC_LINE, .name = "icx_unc_m3upi0::UNC_M3UPI_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "icx_unc_m3upi0::UNC_M3UPI_CLOCKTICKS:e=0:i=0:t=0" }, { SRC_LINE, .name = "icx_unc_m3upi1::UNC_M3UPI_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "icx_unc_m3upi1::UNC_M3UPI_CLOCKTICKS:e=0:i=0:t=0" }, { SRC_LINE, .name = "icx_unc_m3upi2::UNC_M3UPI_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "icx_unc_m3upi2::UNC_M3UPI_CLOCKTICKS:e=0:i=0:t=0" }, { SRC_LINE, .name = "icx_unc_m3upi3::UNC_M3UPI_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "icx_unc_m3upi3::UNC_M3UPI_CLOCKTICKS:e=0:i=0:t=0" }, { SRC_LINE, .name = "icx_unc_ubox::UNC_U_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x0, .fstr = "icx_unc_ubox::UNC_U_CLOCKTICKS:e=0:i=0:t=0" }, { SRC_LINE, .name = "adl_glc::TOPDOWN:SLOTS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530400, .fstr = "adl_glc::TOPDOWN:SLOTS:k=1:u=1" }, { SRC_LINE, .name = "adl_glc::TOPDOWN:SLOTS:k=1", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x520400, .fstr = "adl_glc::TOPDOWN:SLOTS:k=1:u=0" }, { SRC_LINE, .name = "adl_glc::TOPDOWN:SLOTS:c=2", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "adl_glc::TOPDOWN:BR_MISPREDICT_SLOTS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538500, .fstr = "adl_glc::TOPDOWN:BR_MISPREDICT_SLOTS:k=1:u=1" }, { SRC_LINE, .name = "adl_glc::TOPDOWN:SLOTS_P", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5301a4, .fstr = 
"adl_glc::TOPDOWN:SLOTS_P:k=1:u=1", }, { SRC_LINE, .name = "adl_glc::OCR0:DEMAND_DATA_RD_DRAM", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1]=0x184000001ull, .fstr = "adl_glc::OCR0:DEMAND_DATA_RD_DRAM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "adl_glc::OCR1:DEMAND_DATA_RD_DRAM", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012b, .codes[1]=0x184000001ull, .fstr = "adl_glc::OCR1:DEMAND_DATA_RD_DRAM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "adl_grt::TOPDOWN_BAD_SPECULATION", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530073, .fstr = "adl_grt::TOPDOWN_BAD_SPECULATION:ALL:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "adl_grt::TOPDOWN_RETIRING", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5300c2, .fstr = "adl_grt::TOPDOWN_RETIRING:ALL:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "adl_grt::UOPS_RETIRED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5300c2, .fstr = "adl_grt::UOPS_RETIRED:ALL:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "adl_grt::mem_uops_retired:load_latency_gt_128", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5305d0, .codes[1] = 0x80, .fstr = "adl_grt::MEM_UOPS_RETIRED:LOAD_LATENCY:k=1:u=1:e=0:i=0:c=0:ldlat=128", }, { SRC_LINE, .name = "adl_grt::mem_uops_retired:load_latency:ldlat=24", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5305d0, .codes[1] = 0x18, .fstr = "adl_grt::MEM_UOPS_RETIRED:LOAD_LATENCY:k=1:u=1:e=0:i=0:c=0:ldlat=24", }, { SRC_LINE, .name = "adl_grt::OCR0:DEMAND_DATA_RD_ANY_RESPONSE", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301b7, .codes[1]=0x10001, .fstr = "adl_grt::OCR0:DEMAND_DATA_RD_ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "adl_grt::OCR1:DEMAND_DATA_RD_L3_HIT_SNOOP_HIT_NO_FWD", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5302b7, .codes[1] = 0x4003c0001ull, .fstr = "adl_grt::OCR1:DEMAND_DATA_RD_L3_HIT_SNOOP_HIT_NO_FWD:k=1:u=1:e=0:i=0:c=0", }, { SRC_LINE, .name = "spr_unc_imc0::UNC_M_CAS_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf005, .fstr 
= "spr_unc_imc0::UNC_M_CAS_COUNT:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_imc0::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xcf05, .fstr = "spr_unc_imc0::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_imc0::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x101, .fstr = "spr_unc_imc0::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_imc1::UNC_M_CAS_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf005, .fstr = "spr_unc_imc1::UNC_M_CAS_COUNT:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_imc2::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xcf05, .fstr = "spr_unc_imc2::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_imc3::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x101, .fstr = "spr_unc_imc3::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_imc4::UNC_M_CAS_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf005, .fstr = "spr_unc_imc4::UNC_M_CAS_COUNT:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_imc5::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xcf05, .fstr = "spr_unc_imc5::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_imc6::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x101, .fstr = "spr_unc_imc6::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_imc7::UNC_M_CAS_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf005, .fstr = "spr_unc_imc7::UNC_M_CAS_COUNT:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_imc8::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xcf05, .fstr = "spr_unc_imc8::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_imc9::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x101, .fstr = "spr_unc_imc9::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_imc10::UNC_M_CAS_COUNT:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf005, .fstr = 
"spr_unc_imc10::UNC_M_CAS_COUNT:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_imc11::UNC_M_CAS_COUNT:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xcf05, .fstr = "spr_unc_imc11::UNC_M_CAS_COUNT:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_upi0::UNC_UPI_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_upi0::UNC_UPI_CLOCKTICKS:e=0:i=0:t=0" }, { SRC_LINE, .name = "spr_unc_upi1::UNC_UPI_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_upi1::UNC_UPI_CLOCKTICKS:e=0:i=0:t=0" }, { SRC_LINE, .name = "spr_unc_upi2::UNC_UPI_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_upi2::UNC_UPI_CLOCKTICKS:e=0:i=0:t=0" }, { SRC_LINE, .name = "spr_unc_upi3::UNC_UPI_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_upi3::UNC_UPI_CLOCKTICKS:e=0:i=0:t=0" }, { SRC_LINE, .name = "spr_unc_upi0::UNC_UPI_RxL_FLITS:ALL_DATA", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf03, .fstr = "spr_unc_upi0::UNC_UPI_RxL_FLITS:ALL_DATA:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_upi0::UNC_UPI_TxL_FLITS:ALL_DATA", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf02, .fstr = "spr_unc_upi0::UNC_UPI_TxL_FLITS:ALL_DATA:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha0::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha0::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha1::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha1::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha2::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha2::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha3::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha3::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha4::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = 
"spr_unc_cha4::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha5::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha5::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha6::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha6::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha7::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha7::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha8::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha8::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha9::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha9::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha10::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha10::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha11::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha11::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha12::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha12::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha13::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha13::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha14::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha14::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha15::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha15::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha16::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = 
"spr_unc_cha16::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha17::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha17::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha18::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha18::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha19::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha19::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha20::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha20::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha21::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha21::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha22::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha22::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha23::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha23::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha24::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha24::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha25::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha25::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha26::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha26::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha27::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha27::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha28::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = 
"spr_unc_cha28::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha29::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha29::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha30::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha30::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha31::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha31::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha32::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha32::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha33::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha33::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha34::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha34::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha35::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha35::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha36::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha36::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha37::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha37::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha38::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha38::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha39::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha39::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha40::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = 
"spr_unc_cha40::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha41::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha41::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha42::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha42::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha43::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha43::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha44::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha44::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha45::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha45::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha46::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha46::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha47::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha47::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha48::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha48::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha49::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha49::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha50::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha50::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha51::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha51::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha52::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = 
"spr_unc_cha52::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha53::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha53::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha54::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha54::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha55::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha55::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha56::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha56::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha57::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha57::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha58::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha58::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha59::UNC_CHA_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "spr_unc_cha59::UNC_CHA_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "spr_unc_cha2::UNC_CHA_TOR_INSERTS:IO_MISS_PCIRDCUR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xc8f3fe00000435ull, .fstr = "spr_unc_cha2::UNC_CHA_TOR_INSERTS:IO_MISS_PCIRDCUR:e=0:i=0:t=0", }, { SRC_LINE, .name = "gnr::FRONTEND_RETIRED:DSB_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5303c6, .codes[1] = 0x11, .fstr = "gnr::FRONTEND_RETIRED:DSB_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "gnr::FRONTEND_RETIRED:UNKNOWN_BRANCH", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5303c6, .codes[1] = 0x17, .fstr = "gnr::FRONTEND_RETIRED:UNKNOWN_BRANCH:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "gnr::FRONTEND_RETIRED:LATE_SWPF", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5303c6, .codes[1] = 
0xa, .fstr = "gnr::FRONTEND_RETIRED:LATE_SWPF:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "gnr::MEM_LOAD_RETIRED:L3_MISS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5320d1, .fstr = "gnr::MEM_LOAD_RETIRED:L3_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::CPU_CLK_UNHALTED.DISTRIBUTED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5302ec, .fstr = "gnr::CPU_CLK_UNHALTED:DISTRIBUTED:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::CPU_CLK_UNHALTED.REF_DISTRIBUTED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53083c, .fstr = "gnr::CPU_CLK_UNHALTED:REF_DISTRIBUTED:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::CPU_CLK_UNHALTED.C02", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5320ec, .fstr = "gnr::CPU_CLK_UNHALTED:C02:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53023c, .fstr = "gnr::CPU_CLK_UNHALTED:ONE_THREAD_ACTIVE:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::FRONTEND_RETIRED:ITLB_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5303c6, .codes[1] = 0x14, .fstr = "gnr::FRONTEND_RETIRED:ITLB_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "gnr::FRONTEND_RETIRED:L1I_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5303c6, .codes[1] = 0x12, .fstr = "gnr::FRONTEND_RETIRED:L1I_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "gnr::FRONTEND_RETIRED:L2_MISS:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5103c6, .codes[1] = 0x13, .fstr = "gnr::FRONTEND_RETIRED:L2_MISS:k=0:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "gnr::FRONTEND_RETIRED:STLB_MISS:c=1:i", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x1d303c6, .codes[1] = 0x15, .fstr = "gnr::FRONTEND_RETIRED:STLB_MISS:k=1:u=1:e=0:i=1:c=1:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = 
"gnr::FRONTEND_RETIRED:LATENCY_GE_2_BUBBLES_GE_1", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5303c6, .codes[1] = 0x100206, .fstr = "gnr::FRONTEND_RETIRED:LATENCY_GE_2_BUBBLES_GE_1:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=2", }, { SRC_LINE, .name = "gnr::FRONTEND_RETIRED:LATENCY_GE_4:fe_thres=4095", .ret = PFM_ERR_ATTR_SET, }, { SRC_LINE, .name = "gnr::FRONTEND_RETIRED:DSB_MISS:ITLB_MISS", .ret = PFM_ERR_FEATCOMB, }, { SRC_LINE, .name = "gnr::OCR:DEMAND_DATA_RD_ANY_RESPONSE", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x10001ull, .fstr = "gnr::OCR:DEMAND_DATA_RD_ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::offcore_response_0:demand_rfo_l3_miss", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x3f3fc00002ull, .fstr = "gnr::OCR:DEMAND_RFO_L3_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::ocr:reads_to_core_any_response", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x3f3ffc4477ull, .fstr = "gnr::OCR:READS_TO_CORE_ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::ocr:STREAMING_WR_ANY_RESPONSE", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x10800ull, .fstr = "gnr::OCR:STREAMING_WR_ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::offcore_response_0:DEMAND_DATA_RD_L3_HIT_SNOOP_HITM", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x10003c0001ull, .fstr = "gnr::OCR:DEMAND_DATA_RD_L3_HIT_SNOOP_HITM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::offcore_response_0:MODIFIED_WRITE_ANY_RESPONSE", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x10808ull, .fstr = "gnr::OCR:MODIFIED_WRITE_ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::cycle_activity:0x6:c=6", .count = 1, .codes[0] = 0x65306a3, .fstr = "gnr::CYCLE_ACTIVITY:0x6:k=1:u=1:e=0:i=0:c=6:intx=0:intxcp=0", }, { SRC_LINE, 
.name = "gnr::dtlb_store_misses:walk_completed_2m_4m:c=1", .count = 1, .codes[0] = 0x1530413, .fstr = "gnr::DTLB_STORE_MISSES:WALK_COMPLETED_2M_4M:k=1:u=1:e=0:i=0:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::rob_misc_events:lbr_inserts", .ret = PFM_ERR_NOTFOUND, .count = 1, .codes[0] = 0x5320cc, .fstr = "gnr::ROB_MISC_EVENTS:LBR_INSERTS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::misc_retired:lbr_inserts", .count = 1, .codes[0] = 0x5320cc, .fstr = "gnr::MISC_RETIRED:LBR_INSERTS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::uops_dispatched:port_0", .count = 1, .codes[0] = 0x5301b2, .fstr = "gnr::UOPS_DISPATCHED:PORT_0:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::uops_dispatched:port_2_3_10", .count = 1, .codes[0] = 0x5304b2, .fstr = "gnr::UOPS_DISPATCHED:PORT_2_3_10:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::uops_dispatched:port_5_11", .count = 1, .codes[0] = 0x5320b2, .fstr = "gnr::UOPS_DISPATCHED:PORT_5_11:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::inst_retired:prec_dist", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530100, .fstr = "gnr::INST_RETIRED:PREC_DIST:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::inst_retired:any_p", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5300c0, .fstr = "gnr::INST_RETIRED:ANY_P:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::inst_retired:any", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530100, .fstr = "gnr::INST_RETIRED:ANY:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::mem_trans_retired:load_latency:ldlat=3:u", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5101cd, .codes[1] = 3, .fstr = "gnr::MEM_TRANS_RETIRED:LOAD_LATENCY:k=0:u=1:e=0:i=0:c=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::mem_trans_retired:load_latency:ldlat=1000000", .ret = PFM_ERR_ATTR_VAL, }, { SRC_LINE, .name = "gnr::mem_trans_retired:load_latency", .ret = PFM_SUCCESS, 
.count = 2, .codes[0] = 0x5301cd, .codes[1] = 3, .fstr = "gnr::MEM_TRANS_RETIRED:LOAD_LATENCY:k=1:u=1:e=0:i=0:c=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::mem_load_l3_miss_retired:remote_dram", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5302d3, .fstr = "gnr::MEM_LOAD_L3_MISS_RETIRED:REMOTE_DRAM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::mem_load_l3_miss_retired:remote_hitm", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5304d3, .fstr = "gnr::MEM_LOAD_L3_MISS_RETIRED:REMOTE_HITM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::cpu_clk_unhalted:pause_inst:k", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x15640ec, .fstr = "gnr::CPU_CLK_UNHALTED:PAUSE_INST:k=1:u=0:e=1:i=0:c=1:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::UOPS_RETIRED.STALLS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1d302c2, }, { SRC_LINE, .name = "gnr::FRONTEND_RETIRED:DSB_MISS", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5303c6, .codes[1] = 0x11, .fstr = "gnr::FRONTEND_RETIRED:DSB_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "gnr::FRONTEND_RETIRED:UNKNOWN_BRANCH", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5303c6, .codes[1] = 0x17, .fstr = "gnr::FRONTEND_RETIRED:UNKNOWN_BRANCH:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=0", }, { SRC_LINE, .name = "gnr::MEM_LOAD_RETIRED:L3_MISS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5320d1, .fstr = "gnr::MEM_LOAD_RETIRED:L3_MISS:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::CPU_CLK_UNHALTED.DISTRIBUTED", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5302ec, .fstr = "gnr::CPU_CLK_UNHALTED:DISTRIBUTED:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::CPU_CLK_UNHALTED.REF_DISTRIBUTED", .ret = PFM_SUCCESS, .count = 
1, .codes[0] = 0x53083c, .fstr = "gnr::CPU_CLK_UNHALTED:REF_DISTRIBUTED:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::CPU_CLK_UNHALTED.C02", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x5320ec, .fstr = "gnr::CPU_CLK_UNHALTED:C02:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x53023c, .fstr = "gnr::CPU_CLK_UNHALTED:ONE_THREAD_ACTIVE:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::FRONTEND_RETIRED:LATENCY_GE_256", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5303c6, .codes[1] = 0x610006, .fstr = "gnr::FRONTEND_RETIRED:LATENCY_GE_256:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0:fe_thres=256", }, { SRC_LINE, .name = "gnr::OCR:MODIFIED_WRITE_ANY_RESPONSE", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x10808ull, .fstr = "gnr::OCR:MODIFIED_WRITE_ANY_RESPONSE:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::OCR.DEMAND_DATA_RD_L3_HIT_SNOOP_HITM", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x53012a, .codes[1] = 0x10003c0001ull, .fstr = "gnr::OCR:DEMAND_DATA_RD_L3_HIT_SNOOP_HITM:k=1:u=1:e=0:i=0:c=0:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::cycle_activity:cycles_mem_any:c=6", .ret = PFM_ERR_ATTR_SET, }, { SRC_LINE, .name = "gnr::mem_trans_retired:load_latency", .ret = PFM_SUCCESS, .count = 2, .codes[0] = 0x5301cd, .codes[1] = 3, .fstr = "gnr::MEM_TRANS_RETIRED:LOAD_LATENCY:k=1:u=1:e=0:i=0:c=0:ldlat=3:intx=0:intxcp=0", }, { SRC_LINE, .name = "gnr::TOPDOWN_M.FRONTEND_BOUND", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538200, .fstr = "gnr::TOPDOWN_M:FRONTEND_BOUND", }, { SRC_LINE, .name = "gnr::TOPDOWN_M.BACKEND_BOUND", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538300, .fstr = "gnr::TOPDOWN_M:BACKEND_BOUND", }, { SRC_LINE, .name = "gnr::TOPDOWN_M.RETIRING", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538000, .fstr = "gnr::TOPDOWN_M:RETIRING", }, { SRC_LINE, .name = "gnr::TOPDOWN_M.BAD_SPEC", 
.ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538100, .fstr = "gnr::TOPDOWN_M:BAD_SPEC", }, { SRC_LINE, .name = "gnr::TOPDOWN_M.HEAVY_OPS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538400, .fstr = "gnr::TOPDOWN_M:HEAVY_OPS", }, { SRC_LINE, .name = "gnr::TOPDOWN_M:FETCH_LAT", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538600, .fstr = "gnr::TOPDOWN_M:FETCH_LAT", }, { SRC_LINE, .name = "gnr::TOPDOWN_M:BR_MISPREDICT", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538500, .fstr = "gnr::TOPDOWN_M:BR_MISPREDICT", }, { SRC_LINE, .name = "gnr::TOPDOWN_M:MEMORY_BOUND", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x538700, .fstr = "gnr::TOPDOWN_M:MEMORY_BOUND", }, { SRC_LINE, .name = "gnr::TOPDOWN_M:SLOTS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x530400, .fstr = "gnr::TOPDOWN_M:SLOTS:k=1:u=1", }, /*** AMD Zen5 Tests ***/ { SRC_LINE, .name = "amd64_fam1ah_zen5::retired_ops", .count = 1, .codes[0] = 0x5300c1ull, .fstr = "amd64_fam1ah_zen5::RETIRED_OPS:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam1ah_zen5::cycles_not_in_halt", .count = 1, .codes[0] = 0x530076ull, .fstr = "amd64_fam1ah_zen5::CYCLES_NOT_IN_HALT:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam1ah_zen5::L2_PREFETCH_HIT_L2:L1_DC_L2_HW_PF", .count = 1, .codes[0] = 0x53ff70ull, .fstr = "amd64_fam1ah_zen5::L2_PREFETCH_HIT_L2:L1_DC_L2_HW_PF:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam1ah_zen5::L1_DTLB_MISS:TLB_RELOAD_1G_L2_HIT:u", .count = 1, .codes[0] = 0x510845ull, .fstr = "amd64_fam1ah_zen5::L1_DTLB_MISS:TLB_RELOAD_1G_L2_HIT:k=0:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam1ah_zen5::RETIRED_FUSED_INSTRUCTIONS", .count = 1, .codes[0] = 0x1005300d0ull, .fstr = "amd64_fam1ah_zen5::RETIRED_FUSED_INSTRUCTIONS:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam1ah_zen5::RETIRED_SSE_AVX_FLOPS", .count = 1, .codes[0] = 0x530f03ull, .fstr = "amd64_fam1ah_zen5::RETIRED_SSE_AVX_FLOPS:ALL_TYPE:ANY:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { 
SRC_LINE, .name = "amd64_fam1ah_zen5::RETIRED_SSE_AVX_FLOPS:MULT_FLOPS:u", .count = 1, .codes[0] = 0x510203ull, .fstr = "amd64_fam1ah_zen5::RETIRED_SSE_AVX_FLOPS:MULT_FLOPS:ALL_TYPE:k=0:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam1ah_zen5::P0_FREQ_CYCLES_NOT_IN_HALT.P0_FREQ_CYCLES", .count = 1, .codes[0] = 0x100530120ull, .fstr = "amd64_fam1ah_zen5::P0_FREQ_CYCLES_NOT_IN_HALT:P0_FREQ_CYCLES:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam1ah_zen5::DISPATCH_STALLS_1:SMT_CONTENTION", .count = 1, .codes[0] = 0x1005360a0ull, .fstr = "amd64_fam1ah_zen5::DISPATCH_STALLS_1:SMT_CONTENTION:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam1ah_zen5::OPS_QUEUE_EMPTY", .count = 1, .codes[0] = 0x5300a9ull, .fstr = "amd64_fam1ah_zen5::OPS_QUEUE_EMPTY:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam1ah_zen5::PACKED_FP_OPS_RETIRED:FP128_ALL:FP256_ALL", .count = 1, .codes[0] = 0x53ff0cull, .fstr = "amd64_fam1ah_zen5::PACKED_FP_OPS_RETIRED:FP128_ALL:FP256_ALL:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam1ah_zen5::PACKED_FP_OPS_RETIRED:FP256_ALL", .count = 1, .codes[0] = 0x53f00cull, .fstr = "amd64_fam1ah_zen5::PACKED_FP_OPS_RETIRED:FP128_NONE:FP256_ALL:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam1ah_zen5::RETIRED_FP_OPS_BY_TYPE:SCALAR_ADD", .count = 1, .codes[0] = 0x53010aull, .fstr = "amd64_fam1ah_zen5::RETIRED_FP_OPS_BY_TYPE:SCALAR_ADD:VECTOR_NONE:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam1ah_zen5::RETIRED_INT_OPS:SSE_AVX_LOGICAL", .count = 1, .codes[0] = 0x53d00bull, .fstr = "amd64_fam1ah_zen5::RETIRED_INT_OPS:MMX_NONE:SSE_AVX_LOGICAL:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam1ah_zen5::PACKED_INT_OPS_RETIRED:INT128_AES:INT256_AES", .count = 1, .codes[0] = 0x53550dull, .fstr = "amd64_fam1ah_zen5::PACKED_INT_OPS_RETIRED:INT128_AES:INT256_AES:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = 
"amd64_fam1ah_zen5::PACKED_INT_OPS_RETIRED:INT128_AES", .count = 1, .codes[0] = 0x53050dull, .fstr = "amd64_fam1ah_zen5::PACKED_INT_OPS_RETIRED:INT128_AES:INT256_NONE:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam1ah_zen5::PACKED_INT_OPS_RETIRED:INT256_ADD", .count = 1, .codes[0] = 0x53100dull, .fstr = "amd64_fam1ah_zen5::PACKED_INT_OPS_RETIRED:INT128_NONE:INT256_ADD:k=1:u=1:e=0:i=0:c=0:h=0:g=0", }, { SRC_LINE, .name = "amd64_fam1ah_zen5_l3::UNC_L3_REQUESTS", .count = 1, .codes[0] = 0x53ff04ull, .fstr = "amd64_fam1ah_zen5_l3::UNC_L3_REQUESTS:ALL", }, { SRC_LINE, .name = "amd64_fam1ah_zen5_l3::UNC_L3_REQUESTS:L3_MISS", .count = 1, .codes[0] = 0x530104ull, .fstr = "amd64_fam1ah_zen5_l3::UNC_L3_REQUESTS:L3_MISS", }, { SRC_LINE, .name = "amd64_fam1ah_zen5_l3::UNC_L3_REQUESTS:u", .ret = PFM_ERR_ATTR, }, { SRC_LINE, .name = "gnr_unc_imc0::UNC_M_CAS_COUNT_SCH0:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf005, .fstr = "gnr_unc_imc0::UNC_M_CAS_COUNT_SCH0:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "gnr_unc_imc0::UNC_M_CAS_COUNT_SCH1:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf006, .fstr = "gnr_unc_imc0::UNC_M_CAS_COUNT_SCH1:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "gnr_unc_imc0::UNC_M_CAS_COUNT_SCH0:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xcf05, .fstr = "gnr_unc_imc0::UNC_M_CAS_COUNT_SCH0:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "gnr_unc_imc0::UNC_M_CAS_COUNT_SCH1:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xcf06, .fstr = "gnr_unc_imc0::UNC_M_CAS_COUNT_SCH1:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "gnr_unc_imc0::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "gnr_unc_imc0::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "gnr_unc_imc1::UNC_M_CAS_COUNT_SCH0:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf005, .fstr = "gnr_unc_imc1::UNC_M_CAS_COUNT_SCH0:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "gnr_unc_imc2::UNC_M_CAS_COUNT_SCH0:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xcf05, .fstr =
"gnr_unc_imc2::UNC_M_CAS_COUNT_SCH0:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "gnr_unc_imc3::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "gnr_unc_imc3::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "gnr_unc_imc4::UNC_M_CAS_COUNT_SCH0:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf005, .fstr = "gnr_unc_imc4::UNC_M_CAS_COUNT_SCH0:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "gnr_unc_imc5::UNC_M_CAS_COUNT_SCH0:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xcf05, .fstr = "gnr_unc_imc5::UNC_M_CAS_COUNT_SCH0:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "gnr_unc_imc6::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "gnr_unc_imc6::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "gnr_unc_imc7::UNC_M_CAS_COUNT_SCH0:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf005, .fstr = "gnr_unc_imc7::UNC_M_CAS_COUNT_SCH0:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "gnr_unc_imc8::UNC_M_CAS_COUNT_SCH0:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xcf05, .fstr = "gnr_unc_imc8::UNC_M_CAS_COUNT_SCH0:RD:e=0:i=0:t=0", }, { SRC_LINE, .name = "gnr_unc_imc9::UNC_M_CLOCKTICKS", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0x1, .fstr = "gnr_unc_imc9::UNC_M_CLOCKTICKS:e=0:i=0:t=0", }, { SRC_LINE, .name = "gnr_unc_imc10::UNC_M_CAS_COUNT_SCH0:WR", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xf005, .fstr = "gnr_unc_imc10::UNC_M_CAS_COUNT_SCH0:WR:e=0:i=0:t=0", }, { SRC_LINE, .name = "gnr_unc_imc11::UNC_M_CAS_COUNT_SCH0:RD", .ret = PFM_SUCCESS, .count = 1, .codes[0] = 0xcf05, .fstr = "gnr_unc_imc11::UNC_M_CAS_COUNT_SCH0:RD:e=0:i=0:t=0", }, };

#define NUM_TEST_EVENTS (int)(sizeof(x86_test_events)/sizeof(test_event_t))

static int check_pmu_supported(const char *evt) { pfm_pmu_info_t info; char *p; int ret; pfm_pmu_t i; memset(&info, 0, sizeof(info)); info.size = sizeof(info); /* look for pmu_name::....
*/ p = strchr(evt, ':'); if (!p) return 1; if (*(p+1) != ':') return 1; pfm_for_all_pmus(i) { ret = pfm_get_pmu_info(i, &info); if (ret != PFM_SUCCESS) continue; if (!strncmp(info.name, evt, p - evt)) return 1; } /* PMU not there */ return 0; } static int check_test_events(FILE *fp) { const test_event_t *e; char *fstr; uint64_t *codes; int count, i, j; int ret, errors = 0; for (i=0, e = x86_test_events; i < NUM_TEST_EVENTS; i++, e++) { codes = NULL; count = 0; fstr = NULL; ret = pfm_get_event_encoding(e->name, PFM_PLM0 | PFM_PLM3, &fstr, NULL, &codes, &count); if (ret != e->ret) { if (ret == PFM_ERR_NOTFOUND && !check_pmu_supported(e->name)) { fprintf(fp,"Line %d, Event%d %s, skipped because no PMU support\n", e->line, i, e->name); continue; } fprintf(fp,"Line %d, Event%d %s, ret=%s(%d) expected %s(%d)\n", e->line, i, e->name, pfm_strerror(ret), ret, pfm_strerror(e->ret), e->ret); errors++; } else { if (ret != PFM_SUCCESS) { if (fstr) { fprintf(fp,"Line %d, Event%d %s, expected fstr NULL but it is not\n", e->line, i, e->name); errors++; } if (count != 0) { fprintf(fp,"Line %d, Event%d %s, expected count=0 instead of %d\n", e->line, i, e->name, count); errors++; } if (codes) { fprintf(fp,"Line %d, Event%d %s, expected codes[] NULL but it is not\n", e->line, i, e->name); errors++; } } else { if (count != e->count) { fprintf(fp,"Line %d, Event%d %s, count=%d expected %d\n", e->line, i, e->name, count, e->count); errors++; } for (j=0; j < count; j++) { if (codes[j] != e->codes[j]) { fprintf(fp,"Line %d, Event%d %s, codes[%d]=%#"PRIx64" expected %#"PRIx64"\n", e->line, i, e->name, j, codes[j], e->codes[j]); errors++; } } if (e->fstr && strcmp(fstr, e->fstr)) { fprintf(fp,"Line %d, Event%d %s, fstr=%s expected %s\n", e->line, i, e->name, fstr, e->fstr); errors++; } } } if (codes) free(codes); if (fstr) free(fstr); } printf("\t %d x86 events: %d errors\n", i, errors); return errors; } int validate_arch(FILE *fp) { return check_test_events(fp); } 
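The validation loop above is a table-driven test: an array of expected encodings is walked and each entry is compared against what the encoder actually produces. A minimal, self-contained sketch of that pattern follows; `mock_encode()`, `run_checks()`, and the `demo::` event names and codes are hypothetical stand-ins for `pfm_get_event_encoding()` and the real `x86_test_events[]` table, invented here purely for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Expected-result entry, mirroring the shape of test_event_t above:
 * an event name plus the raw config words the encoder should emit. */
typedef struct {
	const char *name;
	int count;           /* expected number of config words */
	uint64_t codes[2];   /* expected raw codes */
} expected_t;

/* Hypothetical encoder standing in for pfm_get_event_encoding(). */
static int mock_encode(const char *name, uint64_t *codes, int *count)
{
	if (strcmp(name, "demo::EVENT_A") == 0) {
		codes[0] = 0x5300c1ull;
		*count = 1;
		return 0;
	}
	if (strcmp(name, "demo::EVENT_B") == 0) {
		codes[0] = 0x53012aull;
		codes[1] = 0x10808ull;
		*count = 2;
		return 0;
	}
	return -1; /* unknown event */
}

/* Walk the expectation table and count mismatches, just as
 * check_test_events() walks x86_test_events[]. */
static int run_checks(void)
{
	static const expected_t table[] = {
		{ "demo::EVENT_A", 1, { 0x5300c1ull, 0 } },
		{ "demo::EVENT_B", 2, { 0x53012aull, 0x10808ull } },
	};
	int errors = 0;
	size_t i;

	for (i = 0; i < sizeof table / sizeof table[0]; i++) {
		uint64_t codes[2] = { 0, 0 };
		int count = 0;

		if (mock_encode(table[i].name, codes, &count) != 0 ||
		    count != table[i].count ||
		    memcmp(codes, table[i].codes, sizeof codes) != 0) {
			fprintf(stderr, "mismatch for %s\n", table[i].name);
			errors++;
		}
	}
	return errors;
}
```

The design keeps the expectations as plain data, so adding coverage for a new event is a one-line table entry rather than new control flow.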
papi-papi-7-2-0-t/src/linux-bgp-context.h000066400000000000000000000004161502707512200201700ustar00rootroot00000000000000
#ifndef _LINUX_BGP_CONTEXT_H
#define _LINUX_BGP_CONTEXT_H

#include

#define GET_OVERFLOW_ADDRESS(ctx) 0x0

/* Signal handling functions */
#undef hwd_siginfo_t
#undef hwd_ucontext_t
typedef int hwd_siginfo_t;
typedef ucontext_t hwd_ucontext_t;

#endif
papi-papi-7-2-0-t/src/linux-bgp-lock.h000066400000000000000000000001121502707512200174300ustar00rootroot00000000000000
extern void _papi_hwd_lock( int );
extern void _papi_hwd_unlock( int );
papi-papi-7-2-0-t/src/linux-bgp-memory.c000066400000000000000000000025071502707512200200120ustar00rootroot00000000000000
/*
 * File:    linux-bgp-memory.c
 * Author:  Dave Hermsmeier
 *          dlherms@us.ibm.com
 */

#include "papi.h"
#include "papi_internal.h"
#ifdef __LINUX__
#include
#endif
#include
#include

/*
 * Prototypes...
 */
int init_bgp( PAPI_mh_info_t * pMem_Info );

/*
 * Get Memory Information
 *
 * Fills in memory information - effectively set to all 0x00's
 */
extern int
_bgp_get_memory_info( PAPI_hw_info_t * pHwInfo, int pCPU_Type )
{
	int retval = 0;

	switch ( pCPU_Type ) {
	default:
		//fprintf(stderr,"Default CPU type in %s (%d)\n",__FUNCTION__,__LINE__);
		retval = init_bgp( &pHwInfo->mem_hierarchy );
		break;
	}
	return retval;
}

/*
 * Get DMem Information for BG/P
 *
 * NOTE: Currently, all values set to -1
 */
extern int
_bgp_get_dmem_info( PAPI_dmem_info_t * pDmemInfo )
{
	pDmemInfo->size = PAPI_EINVAL;
	pDmemInfo->resident = PAPI_EINVAL;
	pDmemInfo->high_water_mark = PAPI_EINVAL;
	pDmemInfo->shared = PAPI_EINVAL;
	pDmemInfo->text = PAPI_EINVAL;
	pDmemInfo->library = PAPI_EINVAL;
	pDmemInfo->heap = PAPI_EINVAL;
	pDmemInfo->locked = PAPI_EINVAL;
	pDmemInfo->stack = PAPI_EINVAL;
	pDmemInfo->pagesize = PAPI_EINVAL;
	return PAPI_OK;
}

/*
 * Cache configuration for BG/P
 */
int
init_bgp( PAPI_mh_info_t * pMem_Info )
{
	memset( pMem_Info, 0x0, sizeof ( *pMem_Info ) );
	return PAPI_OK;
}
papi-papi-7-2-0-t/src/linux-bgp-native-events.h000066400000000000000000000404461502707512200213020ustar00rootroot00000000000000
#ifndef _LINUX_BGP_NATIVE_EVENTS_H
#define _LINUX_BGP_NATIVE_EVENTS_H

/* BEGIN PYTHON GENERATION 1 (Do not modify or move this comment) */
/*
 * Native BG/P Performance Counter Events
 */
typedef enum _papi_hwd_bgp_native_event_id {
/* User Mode 0 */
PNE_BGP_PU0_JPIPE_INSTRUCTIONS = 0x40000000, PNE_BGP_PU0_JPIPE_ADD_SUB, PNE_BGP_PU0_JPIPE_LOGICAL_OPS, PNE_BGP_PU0_JPIPE_SHROTMK, PNE_BGP_PU0_IPIPE_INSTRUCTIONS, PNE_BGP_PU0_IPIPE_MULT_DIV, PNE_BGP_PU0_IPIPE_ADD_SUB, PNE_BGP_PU0_IPIPE_LOGICAL_OPS, PNE_BGP_PU0_IPIPE_SHROTMK, PNE_BGP_PU0_IPIPE_BRANCHES, PNE_BGP_PU0_IPIPE_TLB_OPS, PNE_BGP_PU0_IPIPE_PROCESS_CONTROL, PNE_BGP_PU0_IPIPE_OTHER, PNE_BGP_PU0_DCACHE_LINEFILLINPROG, PNE_BGP_PU0_ICACHE_LINEFILLINPROG, PNE_BGP_PU0_DCACHE_MISS, PNE_BGP_PU0_DCACHE_HIT, PNE_BGP_PU0_DATA_LOADS, PNE_BGP_PU0_DATA_STORES, PNE_BGP_PU0_DCACHE_OPS, PNE_BGP_PU0_ICACHE_MISS, PNE_BGP_PU0_ICACHE_HIT, PNE_BGP_PU0_FPU_ADD_SUB_1, PNE_BGP_PU0_FPU_MULT_1, PNE_BGP_PU0_FPU_FMA_2, PNE_BGP_PU0_FPU_DIV_1, PNE_BGP_PU0_FPU_OTHER_NON_STORAGE_OPS, PNE_BGP_PU0_FPU_ADD_SUB_2, PNE_BGP_PU0_FPU_MULT_2, PNE_BGP_PU0_FPU_FMA_4, PNE_BGP_PU0_FPU_DUAL_PIPE_OTHER_NON_STORAGE_OPS, PNE_BGP_PU0_FPU_QUADWORD_LOADS, PNE_BGP_PU0_FPU_OTHER_LOADS, PNE_BGP_PU0_FPU_QUADWORD_STORES, PNE_BGP_PU0_FPU_OTHER_STORES, PNE_BGP_PU1_JPIPE_INSTRUCTIONS, PNE_BGP_PU1_JPIPE_ADD_SUB, PNE_BGP_PU1_JPIPE_LOGICAL_OPS, PNE_BGP_PU1_JPIPE_SHROTMK, PNE_BGP_PU1_IPIPE_INSTRUCTIONS, PNE_BGP_PU1_IPIPE_MULT_DIV, PNE_BGP_PU1_IPIPE_ADD_SUB, PNE_BGP_PU1_IPIPE_LOGICAL_OPS, PNE_BGP_PU1_IPIPE_SHROTMK, PNE_BGP_PU1_IPIPE_BRANCHES, PNE_BGP_PU1_IPIPE_TLB_OPS, PNE_BGP_PU1_IPIPE_PROCESS_CONTROL, PNE_BGP_PU1_IPIPE_OTHER, PNE_BGP_PU1_DCACHE_LINEFILLINPROG, PNE_BGP_PU1_ICACHE_LINEFILLINPROG, PNE_BGP_PU1_DCACHE_MISS, PNE_BGP_PU1_DCACHE_HIT, PNE_BGP_PU1_DATA_LOADS, PNE_BGP_PU1_DATA_STORES, PNE_BGP_PU1_DCACHE_OPS, PNE_BGP_PU1_ICACHE_MISS,
PNE_BGP_PU1_ICACHE_HIT, PNE_BGP_PU1_FPU_ADD_SUB_1, PNE_BGP_PU1_FPU_MULT_1, PNE_BGP_PU1_FPU_FMA_2, PNE_BGP_PU1_FPU_DIV_1, PNE_BGP_PU1_FPU_OTHER_NON_STORAGE_OPS, PNE_BGP_PU1_FPU_ADD_SUB_2, PNE_BGP_PU1_FPU_MULT_2, PNE_BGP_PU1_FPU_FMA_4, PNE_BGP_PU1_FPU_DUAL_PIPE_OTHER_NON_STORAGE_OPS, PNE_BGP_PU1_FPU_QUADWORD_LOADS, PNE_BGP_PU1_FPU_OTHER_LOADS, PNE_BGP_PU1_FPU_QUADWORD_STORES, PNE_BGP_PU1_FPU_OTHER_STORES, PNE_BGP_PU0_L1_INVALIDATION_REQUESTS, PNE_BGP_PU1_L1_INVALIDATION_REQUESTS, PNE_BGP_PU0_L2_VALID_PREFETCH_REQUESTS, PNE_BGP_PU0_L2_PREFETCH_HITS_IN_FILTER, PNE_BGP_PU0_L2_PREFETCH_HITS_IN_STREAM, PNE_BGP_PU0_L2_CYCLES_PREFETCH_PENDING, PNE_BGP_PU0_L2_PAGE_ALREADY_IN_L2, PNE_BGP_PU0_L2_PREFETCH_SNOOP_HIT_SAME_CORE, PNE_BGP_PU0_L2_PREFETCH_SNOOP_HIT_OTHER_CORE, PNE_BGP_PU0_L2_PREFETCH_SNOOP_HIT_PLB, PNE_BGP_PU0_L2_CYCLES_READ_REQUEST_PENDING, PNE_BGP_PU0_L2_READ_REQUESTS, PNE_BGP_PU0_L2_DEVBUS_READ_REQUESTS, PNE_BGP_PU0_L2_L3_READ_REQUESTS, PNE_BGP_PU0_L2_NETBUS_READ_REQUESTS, PNE_BGP_PU0_L2_BLIND_DEV_READ_REQUESTS, PNE_BGP_PU0_L2_PREFETCHABLE_REQUESTS, PNE_BGP_PU0_L2_HIT, PNE_BGP_PU0_L2_SAME_CORE_SNOOPS, PNE_BGP_PU0_L2_OTHER_CORE_SNOOPS, PNE_BGP_PU0_L2_OTHER_DP_PU0_SNOOPS, PNE_BGP_PU0_L2_OTHER_DP_PU1_SNOOPS, PNE_BGP_PU0_L2_MEMORY_WRITES = 0x40000065, PNE_BGP_PU0_L2_NETWORK_WRITES, PNE_BGP_PU0_L2_DEVBUS_WRITES, PNE_BGP_PU1_L2_VALID_PREFETCH_REQUESTS, PNE_BGP_PU1_L2_PREFETCH_HITS_IN_FILTER, PNE_BGP_PU1_L2_PREFETCH_HITS_IN_STREAM, PNE_BGP_PU1_L2_CYCLES_PREFETCH_PENDING, PNE_BGP_PU1_L2_PAGE_ALREADY_IN_L2, PNE_BGP_PU1_L2_PREFETCH_SNOOP_HIT_SAME_CORE, PNE_BGP_PU1_L2_PREFETCH_SNOOP_HIT_OTHER_CORE, PNE_BGP_PU1_L2_PREFETCH_SNOOP_HIT_PLB, PNE_BGP_PU1_L2_CYCLES_READ_REQUEST_PENDING, PNE_BGP_PU1_L2_READ_REQUESTS, PNE_BGP_PU1_L2_DEVBUS_READ_REQUESTS, PNE_BGP_PU1_L2_L3_READ_REQUESTS, PNE_BGP_PU1_L2_NETBUS_READ_REQUESTS, PNE_BGP_PU1_L2_BLIND_DEV_READ_REQUESTS, PNE_BGP_PU1_L2_PREFETCHABLE_REQUESTS, PNE_BGP_PU1_L2_HIT, PNE_BGP_PU1_L2_SAME_CORE_SNOOPS, 
PNE_BGP_PU1_L2_OTHER_CORE_SNOOPS, PNE_BGP_PU1_L2_OTHER_DP_PU0_SNOOPS, PNE_BGP_PU1_L2_OTHER_DP_PU1_SNOOPS, PNE_BGP_PU1_L2_MEMORY_WRITES = 0x40000085, PNE_BGP_PU1_L2_NETWORK_WRITES, PNE_BGP_PU1_L2_DEVBUS_WRITES, PNE_BGP_L3_M0_RD0_SINGLE_LINE_DELIVERED_L2, PNE_BGP_L3_M0_RD0_BURST_DELIVERED_L2, PNE_BGP_L3_M0_RD0_READ_RETURN_COLLISION, PNE_BGP_L3_M0_RD0_DIR0_HIT_OR_INFLIGHT, PNE_BGP_L3_M0_RD0_DIR0_MISS_OR_LOCKDOWN, PNE_BGP_L3_M0_RD0_DIR1_HIT_OR_INFLIGHT, PNE_BGP_L3_M0_RD0_DIR1_MISS_OR_LOCKDOWN, PNE_BGP_L3_M0_RD1_SINGLE_LINE_DELIVERED_L2, PNE_BGP_L3_M0_RD1_BURST_DELIVERED_L2, PNE_BGP_L3_M0_RD1_READ_RETURN_COLLISION, PNE_BGP_L3_M0_RD1_DIR0_HIT_OR_INFLIGHT, PNE_BGP_L3_M0_RD1_DIR0_MISS_OR_LOCKDOWN, PNE_BGP_L3_M0_RD1_DIR1_HIT_OR_INFLIGHT, PNE_BGP_L3_M0_RD1_DIR1_MISS_OR_LOCKDOWN, PNE_BGP_L3_M0_DIR0_LOOKUPS, PNE_BGP_L3_M0_DIR0_CYCLES_REQUESTS_NOT_TAKEN, PNE_BGP_L3_M0_DIR1_LOOKUPS, PNE_BGP_L3_M0_DIR1_CYCLES_REQUESTS_NOT_TAKEN, PNE_BGP_L3_M0_MH_DDR_STORES, PNE_BGP_L3_M0_MH_DDR_FETCHES, PNE_BGP_L3_M1_RD0_SINGLE_LINE_DELIVERED_L2, PNE_BGP_L3_M1_RD0_BURST_DELIVERED_L2, PNE_BGP_L3_M1_RD0_READ_RETURN_COLLISION, PNE_BGP_L3_M1_RD0_DIR0_HIT_OR_INFLIGHT, PNE_BGP_L3_M1_RD0_DIR0_MISS_OR_LOCKDOWN, PNE_BGP_L3_M1_RD0_DIR1_HIT_OR_INFLIGHT, PNE_BGP_L3_M1_RD0_DIR1_MISS_OR_LOCKDOWN, PNE_BGP_L3_M1_RD1_SINGLE_LINE_DELIVERED_L2, PNE_BGP_L3_M1_RD1_BURST_DELIVERED_L2, PNE_BGP_L3_M1_RD1_READ_RETURN_COLLISION, PNE_BGP_L3_M1_RD1_DIR0_HIT_OR_INFLIGHT, PNE_BGP_L3_M1_RD1_DIR0_MISS_OR_LOCKDOWN, PNE_BGP_L3_M1_RD1_DIR1_HIT_OR_INFLIGHT, PNE_BGP_L3_M1_RD1_DIR1_MISS_OR_LOCKDOWN, PNE_BGP_L3_M1_DIR0_LOOKUPS, PNE_BGP_L3_M1_DIR0_CYCLES_REQUESTS_NOT_TAKEN, PNE_BGP_L3_M1_DIR1_LOOKUPS, PNE_BGP_L3_M1_DIR1_CYCLES_REQUESTS_NOT_TAKEN, PNE_BGP_L3_M1_MH_DDR_STORES, PNE_BGP_L3_M1_MH_DDR_FETCHES, PNE_BGP_PU0_SNOOP_PORT0_REMOTE_SOURCE_REQUESTS, PNE_BGP_PU0_SNOOP_PORT1_REMOTE_SOURCE_REQUESTS, PNE_BGP_PU0_SNOOP_PORT2_REMOTE_SOURCE_REQUESTS, PNE_BGP_PU0_SNOOP_PORT3_REMOTE_SOURCE_REQUESTS, PNE_BGP_PU0_SNOOP_PORT0_REJECTED_REQUESTS, 
PNE_BGP_PU0_SNOOP_PORT1_REJECTED_REQUESTS, PNE_BGP_PU0_SNOOP_PORT2_REJECTED_REQUESTS, PNE_BGP_PU0_SNOOP_PORT3_REJECTED_REQUESTS, PNE_BGP_PU0_SNOOP_L1_CACHE_WRAP, PNE_BGP_PU1_SNOOP_PORT0_REMOTE_SOURCE_REQUESTS, PNE_BGP_PU1_SNOOP_PORT1_REMOTE_SOURCE_REQUESTS, PNE_BGP_PU1_SNOOP_PORT2_REMOTE_SOURCE_REQUESTS, PNE_BGP_PU1_SNOOP_PORT3_REMOTE_SOURCE_REQUESTS, PNE_BGP_PU1_SNOOP_PORT0_REJECTED_REQUESTS, PNE_BGP_PU1_SNOOP_PORT1_REJECTED_REQUESTS, PNE_BGP_PU1_SNOOP_PORT2_REJECTED_REQUESTS, PNE_BGP_PU1_SNOOP_PORT3_REJECTED_REQUESTS, PNE_BGP_PU1_SNOOP_L1_CACHE_WRAP, PNE_BGP_TORUS_XP_PACKETS, PNE_BGP_TORUS_XP_32BCHUNKS, PNE_BGP_TORUS_XM_PACKETS, PNE_BGP_TORUS_XM_32BCHUNKS, PNE_BGP_TORUS_YP_PACKETS, PNE_BGP_TORUS_YP_32BCHUNKS, PNE_BGP_TORUS_YM_PACKETS, PNE_BGP_TORUS_YM_32BCHUNKS, PNE_BGP_TORUS_ZP_PACKETS, PNE_BGP_TORUS_ZP_32BCHUNKS, PNE_BGP_TORUS_ZM_PACKETS, PNE_BGP_TORUS_ZM_32BCHUNKS, PNE_BGP_DMA_PACKETS_INJECTED, PNE_BGP_DMA_DESCRIPTORS_READ_FROM_L3, PNE_BGP_DMA_FIFO_PACKETS_RECEIVED, PNE_BGP_DMA_COUNTER_PACKETS_RECEIVED, PNE_BGP_DMA_REMOTE_GET_PACKETS_RECEIVED, PNE_BGP_DMA_IDPU_READ_REQUESTS_TO_L3, PNE_BGP_DMA_READ_VALID_RETURNED, PNE_BGP_DMA_ACKED_READ_REQUESTS, PNE_BGP_DMA_CYCLES_RDPU_WRITE_ACTIVE, PNE_BGP_DMA_WRITE_REQUESTS_TO_L3, PNE_BGP_COL_AC_CH2_VC0_MATURE = 0x400000DE, PNE_BGP_COL_AC_CH1_VC0_MATURE, PNE_BGP_COL_AC_CH0_VC0_MATURE, PNE_BGP_COL_AC_INJECT_VC0_MATURE, PNE_BGP_COL_AC_CH2_VC1_MATURE, PNE_BGP_COL_AC_CH1_VC1_MATURE, PNE_BGP_COL_AC_CH0_VC1_MATURE, PNE_BGP_COL_AC_INJECT_VC1_MATURE, PNE_BGP_COL_AC_PENDING_REQUESTS, PNE_BGP_COL_AC_WAITING_REQUESTS, PNE_BGP_COL_AR2_PACKET_TAKEN, PNE_BGP_COL_AR1_PACKET_TAKEN, PNE_BGP_COL_AR0_PACKET_TAKEN, PNE_BGP_COL_ALC_PACKET_TAKEN, PNE_BGP_COL_AR0_VC0_DATA_PACKETS_RECEIVED, PNE_BGP_COL_AR0_VC1_DATA_PACKETS_RECEIVED, PNE_BGP_COL_AR1_VC0_DATA_PACKETS_RECEIVED, PNE_BGP_COL_AR1_VC1_DATA_PACKETS_RECEIVED, PNE_BGP_COL_AR2_VC0_DATA_PACKETS_RECEIVED, PNE_BGP_COL_AR2_VC1_DATA_PACKETS_RECEIVED, PNE_BGP_COL_AS0_VC0_DATA_PACKETS_SENT, 
PNE_BGP_COL_AS0_VC1_DATA_PACKETS_SENT, PNE_BGP_COL_AS1_VC0_DATA_PACKETS_SENT, PNE_BGP_COL_AS1_VC1_DATA_PACKETS_SENT, PNE_BGP_COL_AS2_VC0_DATA_PACKETS_SENT, PNE_BGP_COL_AS2_VC1_DATA_PACKETS_SENT, PNE_BGP_COL_INJECT_VC0_HEADER, PNE_BGP_COL_INJECT_VC1_HEADER, PNE_BGP_COL_RECEPTION_VC0_PACKET_ADDED, PNE_BGP_COL_RECEPTION_VC1_PACKET_ADDED, PNE_BGP_IC_TIMESTAMP, PNE_BGP_MISC_ELAPSED_TIME = 0x400000FF, /* User Mode 1 */ PNE_BGP_PU2_JPIPE_INSTRUCTIONS = 0x40000100, PNE_BGP_PU2_JPIPE_ADD_SUB, PNE_BGP_PU2_JPIPE_LOGICAL_OPS, PNE_BGP_PU2_JPIPE_SHROTMK, PNE_BGP_PU2_IPIPE_INSTRUCTIONS, PNE_BGP_PU2_IPIPE_MULT_DIV, PNE_BGP_PU2_IPIPE_ADD_SUB, PNE_BGP_PU2_IPIPE_LOGICAL_OPS, PNE_BGP_PU2_IPIPE_SHROTMK, PNE_BGP_PU2_IPIPE_BRANCHES, PNE_BGP_PU2_IPIPE_TLB_OPS, PNE_BGP_PU2_IPIPE_PROCESS_CONTROL, PNE_BGP_PU2_IPIPE_OTHER, PNE_BGP_PU2_DCACHE_LINEFILLINPROG, PNE_BGP_PU2_ICACHE_LINEFILLINPROG, PNE_BGP_PU2_DCACHE_MISS, PNE_BGP_PU2_DCACHE_HIT, PNE_BGP_PU2_DATA_LOADS, PNE_BGP_PU2_DATA_STORES, PNE_BGP_PU2_DCACHE_OPS, PNE_BGP_PU2_ICACHE_MISS, PNE_BGP_PU2_ICACHE_HIT, PNE_BGP_PU2_FPU_ADD_SUB_1, PNE_BGP_PU2_FPU_MULT_1, PNE_BGP_PU2_FPU_FMA_2, PNE_BGP_PU2_FPU_DIV_1, PNE_BGP_PU2_FPU_OTHER_NON_STORAGE_OPS, PNE_BGP_PU2_FPU_ADD_SUB_2, PNE_BGP_PU2_FPU_MULT_2, PNE_BGP_PU2_FPU_FMA_4, PNE_BGP_PU2_FPU_DUAL_PIPE_OTHER_NON_STORAGE_OPS, PNE_BGP_PU2_FPU_QUADWORD_LOADS, PNE_BGP_PU2_FPU_OTHER_LOADS, PNE_BGP_PU2_FPU_QUADWORD_STORES, PNE_BGP_PU2_FPU_OTHER_STORES, PNE_BGP_PU3_JPIPE_INSTRUCTIONS, PNE_BGP_PU3_JPIPE_ADD_SUB, PNE_BGP_PU3_JPIPE_LOGICAL_OPS, PNE_BGP_PU3_JPIPE_SHROTMK, PNE_BGP_PU3_IPIPE_INSTRUCTIONS, PNE_BGP_PU3_IPIPE_MULT_DIV, PNE_BGP_PU3_IPIPE_ADD_SUB, PNE_BGP_PU3_IPIPE_LOGICAL_OPS, PNE_BGP_PU3_IPIPE_SHROTMK, PNE_BGP_PU3_IPIPE_BRANCHES, PNE_BGP_PU3_IPIPE_TLB_OPS, PNE_BGP_PU3_IPIPE_PROCESS_CONTROL, PNE_BGP_PU3_IPIPE_OTHER, PNE_BGP_PU3_DCACHE_LINEFILLINPROG, PNE_BGP_PU3_ICACHE_LINEFILLINPROG, PNE_BGP_PU3_DCACHE_MISS, PNE_BGP_PU3_DCACHE_HIT, PNE_BGP_PU3_DATA_LOADS, PNE_BGP_PU3_DATA_STORES, PNE_BGP_PU3_DCACHE_OPS, 
PNE_BGP_PU3_ICACHE_MISS, PNE_BGP_PU3_ICACHE_HIT, PNE_BGP_PU3_FPU_ADD_SUB_1, PNE_BGP_PU3_FPU_MULT_1, PNE_BGP_PU3_FPU_FMA_2, PNE_BGP_PU3_FPU_DIV_1, PNE_BGP_PU3_FPU_OTHER_NON_STORAGE_OPS, PNE_BGP_PU3_FPU_ADD_SUB_2, PNE_BGP_PU3_FPU_MULT_2, PNE_BGP_PU3_FPU_FMA_4, PNE_BGP_PU3_FPU_DUAL_PIPE_OTHER_NON_STORAGE_OPS, PNE_BGP_PU3_FPU_QUADWORD_LOADS, PNE_BGP_PU3_FPU_OTHER_LOADS, PNE_BGP_PU3_FPU_QUADWORD_STORES, PNE_BGP_PU3_FPU_OTHER_STORES, PNE_BGP_PU2_L1_INVALIDATION_REQUESTS, PNE_BGP_PU3_L1_INVALIDATION_REQUESTS, PNE_BGP_COL_AC_CH2_VC0_MATURE_UM1, PNE_BGP_COL_AC_CH1_VC0_MATURE_UM1, PNE_BGP_COL_AC_CH0_VC0_MATURE_UM1, PNE_BGP_COL_AC_INJECT_VC0_MATURE_UM1, PNE_BGP_COL_AC_CH2_VC1_MATURE_UM1, PNE_BGP_COL_AC_CH1_VC1_MATURE_UM1, PNE_BGP_COL_AC_CH0_VC1_MATURE_UM1, PNE_BGP_COL_AC_INJECT_VC1_MATURE_UM1, PNE_BGP_COL_AR0_VC0_EMPTY_PACKET, PNE_BGP_COL_AR0_VC1_EMPTY_PACKET, PNE_BGP_COL_AR0_IDLE_PACKET, PNE_BGP_COL_AR0_BAD_PACKET_MARKER, PNE_BGP_COL_AR0_VC0_CUT_THROUGH, PNE_BGP_COL_AR0_VC1_CUT_THROUGH, PNE_BGP_COL_AR0_HEADER_PARITY_ERROR, PNE_BGP_COL_AR0_UNEXPECTED_HEADER_ERROR, PNE_BGP_COL_AR0_RESYNC, PNE_BGP_COL_AR1_VC0_EMPTY_PACKET, PNE_BGP_COL_AR1_VC1_EMPTY_PACKET, PNE_BGP_COL_AR1_IDLE_PACKET, PNE_BGP_COL_AR1_BAD_PACKET_MARKER, PNE_BGP_COL_AR1_VC0_CUT_THROUGH, PNE_BGP_COL_AR1_VC1_CUT_THROUGH, PNE_BGP_COL_AR1_HEADER_PARITY_ERROR, PNE_BGP_COL_AR1_UNEXPECTED_HEADER_ERROR, PNE_BGP_COL_AR1_RESYNC, PNE_BGP_COL_AR2_VC0_EMPTY_PACKET, PNE_BGP_COL_AR2_VC1_EMPTY_PACKET, PNE_BGP_COL_AR2_IDLE_PACKET, PNE_BGP_COL_AR2_BAD_PACKET_MARKER, PNE_BGP_COL_AR2_VC0_CUT_THROUGH, PNE_BGP_COL_AR2_VC1_CUT_THROUGH, PNE_BGP_COL_AR2_HEADER_PARITY_ERROR, PNE_BGP_COL_AR2_UNEXPECTED_HEADER_ERROR, PNE_BGP_COL_AR2_RESYNC, PNE_BGP_COL_AS0_VC0_CUT_THROUGH, PNE_BGP_COL_AS0_VC1_CUT_THROUGH, PNE_BGP_COL_AS0_VC0_PACKETS_SENT, PNE_BGP_COL_AS0_VC1_PACKETS_SENT, PNE_BGP_COL_AS0_IDLE_PACKETS_SENT, PNE_BGP_COL_AS1_VC0_CUT_THROUGH, PNE_BGP_COL_AS1_VC1_CUT_THROUGH, PNE_BGP_COL_AS1_VC0_PACKETS_SENT, PNE_BGP_COL_AS1_VC1_PACKETS_SENT, 
PNE_BGP_COL_AS1_IDLE_PACKETS_SENT, PNE_BGP_COL_AS2_VC0_CUT_THROUGH, PNE_BGP_COL_AS2_VC1_CUT_THROUGH, PNE_BGP_COL_AS2_VC0_PACKETS_SENT, PNE_BGP_COL_AS2_VC1_PACKETS_SENT, PNE_BGP_COL_AS2_IDLE_PACKETS_SENT, PNE_BGP_COL_INJECT_VC0_PAYLOAD_ADDED, PNE_BGP_COL_INJECT_VC1_PAYLOAD_ADDED, PNE_BGP_COL_INJECT_VC0_PACKET_TAKEN, PNE_BGP_COL_INJECT_VC1_PACKET_TAKEN, PNE_BGP_COL_RECEPTION_VC0_HEADER_TAKEN, PNE_BGP_COL_RECEPTION_VC1_HEADER_TAKEN, PNE_BGP_COL_RECEPTION_VC0_PAYLOAD_TAKEN, PNE_BGP_COL_RECEPTION_VC1_PAYLOAD_TAKEN, PNE_BGP_COL_RECEPTION_VC0_PACKET_DISCARDED, PNE_BGP_COL_RECEPTION_VC1_PACKET_DISCARDED, PNE_BGP_PU2_L2_VALID_PREFETCH_REQUESTS, PNE_BGP_PU2_L2_PREFETCH_HITS_IN_FILTER, PNE_BGP_PU2_L2_PREFETCH_HITS_IN_STREAM, PNE_BGP_PU2_L2_CYCLES_PREFETCH_PENDING, PNE_BGP_PU2_L2_PAGE_ALREADY_IN_L2, PNE_BGP_PU2_L2_PREFETCH_SNOOP_HIT_SAME_CORE, PNE_BGP_PU2_L2_PREFETCH_SNOOP_HIT_OTHER_CORE, PNE_BGP_PU2_L2_PREFETCH_SNOOP_HIT_PLB, PNE_BGP_PU2_L2_CYCLES_READ_REQUEST_PENDING, PNE_BGP_PU2_L2_READ_REQUESTS, PNE_BGP_PU2_L2_DEVBUS_READ_REQUESTS, PNE_BGP_PU2_L2_L3_READ_REQUESTS, PNE_BGP_PU2_L2_NETBUS_READ_REQUESTS, PNE_BGP_PU2_L2_BLIND_DEV_READ_REQUESTS, PNE_BGP_PU2_L2_PREFETCHABLE_REQUESTS, PNE_BGP_PU2_L2_HIT, PNE_BGP_PU2_L2_SAME_CORE_SNOOPS, PNE_BGP_PU2_L2_OTHER_CORE_SNOOPS, PNE_BGP_PU2_L2_OTHER_DP_PU0_SNOOPS, PNE_BGP_PU2_L2_OTHER_DP_PU1_SNOOPS, PNE_BGP_PU2_L2_MEMORY_WRITES = 0x400001A1, PNE_BGP_PU2_L2_NETWORK_WRITES, PNE_BGP_PU2_L2_DEVBUS_WRITES, PNE_BGP_PU3_L2_VALID_PREFETCH_REQUESTS, PNE_BGP_PU3_L2_PREFETCH_HITS_IN_FILTER, PNE_BGP_PU3_L2_PREFETCH_HITS_IN_STREAM, PNE_BGP_PU3_L2_CYCLES_PREFETCH_PENDING, PNE_BGP_PU3_L2_PAGE_ALREADY_IN_L2, PNE_BGP_PU3_L2_PREFETCH_SNOOP_HIT_SAME_CORE, PNE_BGP_PU3_L2_PREFETCH_SNOOP_HIT_OTHER_CORE, PNE_BGP_PU3_L2_PREFETCH_SNOOP_HIT_PLB, PNE_BGP_PU3_L2_CYCLES_READ_REQUEST_PENDING, PNE_BGP_PU3_L2_READ_REQUESTS, PNE_BGP_PU3_L2_DEVBUS_READ_REQUESTS, PNE_BGP_PU3_L2_L3_READ_REQUESTS, PNE_BGP_PU3_L2_NETBUS_READ_REQUESTS, PNE_BGP_PU3_L2_BLIND_DEV_READ_REQUESTS, 
PNE_BGP_PU3_L2_PREFETCHABLE_REQUESTS, PNE_BGP_PU3_L2_HIT, PNE_BGP_PU3_L2_SAME_CORE_SNOOPS, PNE_BGP_PU3_L2_OTHER_CORE_SNOOPS, PNE_BGP_PU3_L2_OTHER_DP_PU0_SNOOPS, PNE_BGP_PU3_L2_OTHER_DP_PU1_SNOOPS, PNE_BGP_PU3_L2_MEMORY_WRITES = 0x400001C1, PNE_BGP_PU3_L2_NETWORK_WRITES, PNE_BGP_PU3_L2_DEVBUS_WRITES, PNE_BGP_L3_M0_R2_SINGLE_LINE_DELIVERED_L2, PNE_BGP_L3_M0_R2_BURST_DELIVERED_L2, PNE_BGP_L3_M0_R2_READ_RETURN_COLLISION, PNE_BGP_L3_M0_R2_DIR0_HIT_OR_INFLIGHT, PNE_BGP_L3_M0_R2_DIR0_MISS_OR_LOCKDOWN, PNE_BGP_L3_M0_R2_DIR1_HIT_OR_INFLIGHT, PNE_BGP_L3_M0_R2_DIR1_MISS_OR_LOCKDOWN, PNE_BGP_L3_M0_W0_DEPOSIT_REQUESTS, PNE_BGP_L3_M0_W0_CYCLES_REQUESTS_NOT_TAKEN, PNE_BGP_L3_M0_W1_DEPOSIT_REQUESTS, PNE_BGP_L3_M0_W1_CYCLES_REQUESTS_NOT_TAKEN, PNE_BGP_L3_M0_MH_ALLOCATION_REQUESTS, PNE_BGP_L3_M0_MH_CYCLES_ALLOCATION_REQUESTS_NOT_TAKEN, PNE_BGP_L3_M0_PF_PREFETCH_INTO_EDRAM, PNE_BGP_L3_M1_R2_SINGLE_LINE_DELIVERED_L2 = 0x400001D8, PNE_BGP_L3_M1_R2_BURST_DELIVERED_L2, PNE_BGP_L3_M1_R2_READ_RETURN_COLLISION, PNE_BGP_L3_M1_R2_DIR0_HIT_OR_INFLIGHT, PNE_BGP_L3_M1_R2_DIR0_MISS_OR_LOCKDOWN, PNE_BGP_L3_M1_R2_DIR1_HIT_OR_INFLIGHT, PNE_BGP_L3_M1_R2_DIR1_MISS_OR_LOCKDOWN, PNE_BGP_L3_M1_W0_DEPOSIT_REQUESTS, PNE_BGP_L3_M1_W0_CYCLES_REQUESTS_NOT_TAKEN, PNE_BGP_L3_M1_W1_DEPOSIT_REQUESTS, PNE_BGP_L3_M1_W1_CYCLES_REQUESTS_NOT_TAKEN, PNE_BGP_L3_M1_MH_ALLOCATION_REQUESTS, PNE_BGP_L3_M1_MH_CYCLES_ALLOCATION_REQUESTS_NOT_TAKEN, PNE_BGP_L3_M1_PF_PREFETCH_INTO_EDRAM, PNE_BGP_PU2_SNOOP_PORT0_REMOTE_SOURCE_REQUESTS = 0x400001EC, PNE_BGP_PU2_SNOOP_PORT1_REMOTE_SOURCE_REQUESTS, PNE_BGP_PU2_SNOOP_PORT2_REMOTE_SOURCE_REQUESTS, PNE_BGP_PU2_SNOOP_PORT3_REMOTE_SOURCE_REQUESTS, PNE_BGP_PU2_SNOOP_PORT0_REJECTED_REQUESTS, PNE_BGP_PU2_SNOOP_PORT1_REJECTED_REQUESTS, PNE_BGP_PU2_SNOOP_PORT2_REJECTED_REQUESTS, PNE_BGP_PU2_SNOOP_PORT3_REJECTED_REQUESTS, PNE_BGP_PU2_SNOOP_L1_CACHE_WRAP, PNE_BGP_PU3_SNOOP_PORT0_REMOTE_SOURCE_REQUESTS, PNE_BGP_PU3_SNOOP_PORT1_REMOTE_SOURCE_REQUESTS, 
PNE_BGP_PU3_SNOOP_PORT2_REMOTE_SOURCE_REQUESTS, PNE_BGP_PU3_SNOOP_PORT3_REMOTE_SOURCE_REQUESTS, PNE_BGP_PU3_SNOOP_PORT0_REJECTED_REQUESTS, PNE_BGP_PU3_SNOOP_PORT1_REJECTED_REQUESTS, PNE_BGP_PU3_SNOOP_PORT2_REJECTED_REQUESTS, PNE_BGP_PU3_SNOOP_PORT3_REJECTED_REQUESTS, PNE_BGP_PU3_SNOOP_L1_CACHE_WRAP, PNE_BGP_MISC_ELAPSED_TIME_UM1 = 0x400001FF
} _papi_hwd_bgp_native_event_id_t;
/* END PYTHON GENERATION 1 (Do not modify or move this comment) */

#endif
papi-papi-7-2-0-t/src/linux-bgp.c000066400000000000000000000614451502707512200165100ustar00rootroot00000000000000
/*
 * File:    linux-bgp.c
 * Author:  Dave Hermsmeier
 *          dlherms@us.ibm.com
 */

/*
 * PAPI stuff
 */
#include "papi.h"
#include "papi_internal.h"
#include "papi_vector.h"
#include "papi_memory.h"
#include "extras.h"
#include "linux-bgp.h"

/*
 * BG/P specific 'stuff'
 */

/* BG/P includes */
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include

/* BG/P macros */
#define get_cycles _bgp_GetTimeBase

/* BG/P external structures/functions */
papi_vector_t _bgp_vectors;

/* Defined in linux-bgp-memory.c */
extern int _bgp_get_memory_info( PAPI_hw_info_t * pHwInfo, int pCPU_Type );
extern int _bgp_get_dmem_info( PAPI_dmem_info_t * pDmemInfo );

/* BG/P globals */
hwi_search_t *preset_search_map;
volatile unsigned int lock[PAPI_MAX_LOCK];
const char *BGP_NATIVE_RESERVED_EVENTID = "Reserved";
PAPI_os_info_t _papi_os_info;

/*
 * Get BGP Native Event Id from PAPI Event Id
 */
inline BGP_UPC_Event_Id_t
get_bgp_native_event_id( int pEventId )
{
	return ( BGP_UPC_Event_Id_t ) ( pEventId & PAPI_NATIVE_AND_MASK );
}

/*
 * Lock initialization
 */
void
_papi_hwd_lock_init( void )
{
	/* PAPI on BG/P does not need locks. */
	return;
}

/*
 * Lock
 */
void
_papi_hwd_lock( int lock )
{
	/* PAPI on BG/P does not need locks. */
	return;
}

/*
 * Unlock
 */
void
_papi_hwd_unlock( int lock )
{
	/* PAPI on BG/P does not need locks.
 */
	return;
}

/*
 * Get System Information
 *
 * Initialize system information structure
 */
int
_bgp_get_system_info( papi_mdi_t *mdi )
{
	_BGP_Personality_t bgp;
	int tmp;
	unsigned utmp;
	char chipID[64];

	/* Hardware info */
	if ( ( tmp = Kernel_GetPersonality( &bgp, sizeof bgp ) ) ) {
#include "error.h"
		fprintf( stdout, "Kernel_GetPersonality returned %d (sys error=%d).\n"
				 "\t%s\n", tmp, errno, strerror( errno ) );
		return PAPI_ESYS;
	}
	_papi_hwi_system_info.hw_info.ncpu = Kernel_ProcessorCount( );
	_papi_hwi_system_info.hw_info.nnodes =
		( int ) BGP_Personality_numComputeNodes( &bgp );
	_papi_hwi_system_info.hw_info.totalcpus =
		_papi_hwi_system_info.hw_info.ncpu *
		_papi_hwi_system_info.hw_info.nnodes;
	utmp = Kernel_GetProcessorVersion( );
	_papi_hwi_system_info.hw_info.model = ( int ) utmp;
	_papi_hwi_system_info.hw_info.vendor = ( utmp >> ( 31 - 11 ) ) & 0xFFF;
	_papi_hwi_system_info.hw_info.revision =
		( ( float ) ( ( utmp >> ( 31 - 15 ) ) & 0xFFFF ) ) +
		0.00001 * ( ( float ) ( utmp & 0xFFFF ) );
	strcpy( _papi_hwi_system_info.hw_info.vendor_string, "IBM" );
	tmp = snprintf( _papi_hwi_system_info.hw_info.model_string,
					sizeof _papi_hwi_system_info.hw_info.model_string,
					"PVR=%#4.4x:%#4.4x",
					( utmp >> ( 31 - 15 ) ) & 0xFFFF, ( utmp & 0xFFFF ) );
	BGP_Personality_getLocationString( &bgp, chipID );
	tmp += 12 + sizeof ( chipID );
	if ( sizeof ( _papi_hwi_system_info.hw_info.model_string ) > tmp ) {
		strcat( _papi_hwi_system_info.hw_info.model_string, " Serial=" );
		strncat( _papi_hwi_system_info.hw_info.model_string, chipID,
				 sizeof ( chipID ) );
	}
	_papi_hwi_system_info.hw_info.mhz =
		( float ) BGP_Personality_clockMHz( &bgp );
	SUBDBG( "_bgp_get_system_info: Detected MHZ is %f\n",
			_papi_hwi_system_info.hw_info.mhz );
	_papi_hwi_system_info.hw_info.cpu_max_mhz=_papi_hwi_system_info.hw_info.mhz;
	_papi_hwi_system_info.hw_info.cpu_min_mhz=_papi_hwi_system_info.hw_info.mhz;
	// Memory information structure not filled in - same as BG/L
	// _papi_hwi_system_info.hw_info.mem_hierarchy = ???;
	// The mpx_info structure disappeared in PAPI-C
	//_papi_hwi_system_info.mpx_info.timer_sig = PAPI_NULL;
	return PAPI_OK;
}

/*
 * Initialize Control State
 *
 * All state is kept in BG/P UPC structures
 */
int
_bgp_init_control_state( hwd_control_state_t *ctl )
{
	int i;
	//bgp_control_state_t *bgp_ctl = (bgp_control_state_t *)ctl;

	for ( i = 1; i < BGP_UPC_MAX_MONITORED_EVENTS; i++ )
		ctl->counters[i] = 0;
	return PAPI_OK;
}

/*
 * Set Domain
 *
 * All state is kept in BG/P UPC structures
 */
int
_bgp_set_domain( hwd_control_state_t * cntrl, int domain )
{
	return ( PAPI_OK );
}

/*
 * PAPI Initialization
 *
 * All state is kept in BG/P UPC structures
 */
int
_bgp_init_thread( hwd_context_t * ctx )
{
	return PAPI_OK;
}

/*
 * PAPI Global Initialization
 *
 * Global initialization - does initial PAPI setup and
 * calls BGP_UPC_Initialize()
 */
int
_bgp_init_global( void )
{
	int retval;
	int cidx = _bgp_vectors.cmp_info.CmpIdx;

	/*
	 * Fill in what we can of the papi_system_info
	 */
	SUBDBG( "Before _bgp_get_system_info()...\n" );
	retval = _bgp_get_system_info( &_papi_hwi_system_info );
	SUBDBG( "After _bgp_get_system_info(), retval=%d...\n", retval );
	if ( retval != PAPI_OK )
		return ( retval );

	/*
	 * Setup presets
	 */
	SUBDBG( "Before setup_bgp_presets, _papi_hwi_system_info.hw_info.model=%d...\n",
			_papi_hwi_system_info.hw_info.model );
	retval = _papi_load_preset_table( "BGP", 0, cidx );
	SUBDBG( "After setup_bgp_presets, retval=%d...\n", retval );
	if ( retval )
		return ( retval );

	/*
	 * Setup memory info
	 */
	SUBDBG( "Before _bgp_get_memory_info...\n" );
	retval = _bgp_get_memory_info( &_papi_hwi_system_info.hw_info,
								   ( int ) _papi_hwi_system_info.hw_info.model );
	SUBDBG( "After _bgp_get_memory_info, retval=%d...\n", retval );
	if ( retval )
		return ( retval );

	/*
	 * Initialize BG/P global variables...
	 * NOTE: If the BG/P SPI interface is to be used, then this
	 * initialize routine must be called from each process for the
	 * application.
 * It does not matter if this routine is called more
	 * than once per process, but must be called by each process at
	 * least once, preferably at the beginning of the application.
	 */
	SUBDBG( "Before BGP_UPC_Initialize()...\n" );
	BGP_UPC_Initialize( );
	SUBDBG( "After BGP_UPC_Initialize()...\n" );

	return PAPI_OK;
}

/*
 * PAPI Shutdown Global
 *
 * Called once per process - nothing to do
 */
int
_bgp_shutdown_global( void )
{
	return PAPI_OK;
}

/*
 * Register Allocation
 *
 * Sets up the UPC configuration to monitor those events
 * as identified in the event set.
 */
int
_bgp_allocate_registers( EventSetInfo_t * ESI )
{
	int i, natNum;
	BGP_UPC_Event_Id_t xEventId;

	/*
	 * If an active UPC unit, return error
	 */
	if ( BGP_UPC_Check_Active( ) ) {
		SUBDBG( "_bgp_allocate_registers: UPC is active...\n" );
		return PAPI_ESYS;
	}

	/*
	 * If a counter mode of 1, return error
	 */
	if ( BGP_UPC_Get_Counter_Mode( ) ) {
		SUBDBG( "_bgp_allocate_registers: Inconsistent counter mode...\n" );
		return PAPI_ESYS;
	}

	/*
	 * Start monitoring the events...
	 */
	natNum = ESI->NativeCount;
	// printf("_bgp_allocate_registers: natNum=%d\n", natNum);
	for ( i = 0; i < natNum; i++ ) {
		xEventId = get_bgp_native_event_id( ESI->NativeInfoArray[i].ni_event );
		// printf("_bgp_allocate_registers: xEventId = %d\n", xEventId);
		if ( !BGP_UPC_Check_Active_Event( xEventId ) ) {
			// NOTE: We do not have to start monitoring for elapsed time... It is always being
			// monitored at location 255...
			if ( ( xEventId % BGP_UPC_MAX_MONITORED_EVENTS ) != 255 ) {
				/*
				 * The event is not already being monitored by the UPC, start monitoring
				 * for the event. This will automatically zero the counter and turn off any
				 * threshold value...
*/ // printf("_bgp_allocate_registers: Event id %d not being monitored...\n", xEventId); if ( BGP_UPC_Monitor_Event( xEventId, BGP_UPC_CFG_EDGE_DEFAULT ) < 0 ) { // printf("_bgp_allocate_registers: Monitor_Event failed...\n"); return PAPI_ECMP; } } /* here is if we are event 255 */ else { } } else { /* * The event is already being monitored by the UPC. This is a normal * case where the UPC is monitoring all events for a particular user * mode. We are in this leg because the PAPI event set has not yet * started monitoring the event. So, simply zero the counter and turn * off any threshold value... */ // printf("_bgp_allocate_registers: Event id %d is already being monitored...\n", xEventId); // NOTE: Can't zero the counter or reset the threshold for the timestamp counter... if ( ESI->NativeInfoArray[i].ni_event != PNE_BGP_IC_TIMESTAMP ) { if ( BGP_UPC_Zero_Counter_Value( xEventId ) < 0 ) { // printf("_bgp_allocate_registers: Zero_Counter failed...\n"); return PAPI_ECMP; } if ( BGP_UPC_Set_Counter_Threshold_Value( xEventId, 0 ) < 0 ) { // printf("_bgp_allocate_registers: Set_Counter_Threshold_Value failed...\n"); return PAPI_ECMP; } } } ESI->NativeInfoArray[i].ni_position = xEventId % BGP_UPC_MAX_MONITORED_EVENTS; // printf("_bgp_allocate_registers: ESI->NativeInfoArray[i].ni_position=%d\n", ESI->NativeInfoArray[i].ni_position); } // printf("_bgp_allocate_registers: Exiting normally...\n"); return PAPI_OK; } /* * Update Control State * * This function clears the current contents of the control * structure and updates it with whatever resources are allocated * for all the native events in the native info structure array. * * Since no BGP specific state is kept at the PAPI level, there is * nothing to update and we simply return. 
*/ int _bgp_update_control_state( hwd_control_state_t *ctl, NativeInfo_t *native, int count, hwd_context_t *ctx ) { return PAPI_OK; } /* Hack to get cycle count */ static long_long begin_cycles; /* * PAPI Start * * Start UPC unit(s) */ int _bgp_start( hwd_context_t * ctx, hwd_control_state_t * ctrlstate ) { sigset_t mask_set; sigset_t old_set; sigemptyset( &mask_set ); sigaddset( &mask_set, SIGXCPU ); sigprocmask( SIG_BLOCK, &mask_set, &old_set ); begin_cycles=_bgp_GetTimeBase(); BGP_UPC_Start( BGP_UPC_NO_RESET_COUNTERS ); sigprocmask( SIG_UNBLOCK, &mask_set, NULL ); return ( PAPI_OK ); } /* * PAPI Stop * * Stop UPC unit(s) */ int _bgp_stop( hwd_context_t * ctx, hwd_control_state_t * state ) { sigset_t mask_set; sigset_t old_set; sigemptyset( &mask_set ); sigaddset( &mask_set, SIGXCPU ); sigprocmask( SIG_BLOCK, &mask_set, &old_set ); BGP_UPC_Stop( ); sigprocmask( SIG_UNBLOCK, &mask_set, NULL ); return PAPI_OK; } /* * PAPI Read Counters * * Read the counters into local storage */ int _bgp_read( hwd_context_t *ctx, hwd_control_state_t *ctl, long_long ** dp, int flags ) { // printf("_bgp_read: this_state* = %p\n", this_state); // printf("_bgp_read: (long_long*)&this_state->counters[0] = %p\n", (long_long*)&this_state->counters[0]); // printf("_bgp_read: (long_long*)&this_state->counters[1] = %p\n", (long_long*)&this_state->counters[1]); sigset_t mask_set; sigset_t old_set; sigemptyset( &mask_set ); sigaddset( &mask_set, SIGXCPU ); sigprocmask( SIG_BLOCK, &mask_set, &old_set ); if ( BGP_UPC_Read_Counters ( ( long_long * ) & ctl->counters[0], BGP_UPC_MAXIMUM_LENGTH_READ_COUNTERS_ONLY, BGP_UPC_READ_EXCLUSIVE ) < 0 ) { sigprocmask( SIG_UNBLOCK, &mask_set, NULL ); return PAPI_ECMP; } sigprocmask( SIG_UNBLOCK, &mask_set, NULL ); /* hack to emulate BGP_MISC_ELAPSED_TIME counter */ ctl->counters[255]=_bgp_GetTimeBase()-begin_cycles; *dp = ( long_long * ) & ctl->counters[0]; // printf("_bgp_read: dp = %p\n", dp); // printf("_bgp_read: *dp = %p\n", *dp); // printf("_bgp_read: 
(*dp)[0]* = %p\n", &((*dp)[0]));
	// printf("_bgp_read: (*dp)[1]* = %p\n", &((*dp)[1]));
	// printf("_bgp_read: (*dp)[2]* = %p\n", &((*dp)[2]));
	// int i;
	// for (i=0; i<256; i++)
	//     if ((*dp)[i])
	//         printf("_bgp_read: i=%d, (*dp)[i]=%lld\n", i, (*dp)[i]);
	return PAPI_OK;
}

/*
 * PAPI Reset
 *
 * Zero the counter values
 */
int
_bgp_reset( hwd_context_t * ctx, hwd_control_state_t * ctrlstate )
{
	// NOTE: PAPI can reset the counters with the UPC running.  One way it happens
	//       is with PAPI_accum.  In that case, stop and restart the UPC, resetting
	//       the counters.
	sigset_t mask_set;
	sigset_t old_set;
	sigemptyset( &mask_set );
	sigaddset( &mask_set, SIGXCPU );
	sigprocmask( SIG_BLOCK, &mask_set, &old_set );
	if ( BGP_UPC_Check_Active( ) ) {
		// printf("_bgp_reset: BGP_UPC_Stop()\n");
		BGP_UPC_Stop( );
		// printf("_bgp_reset: BGP_UPC_Start(BGP_UPC_RESET_COUNTERS)\n");
		BGP_UPC_Start( BGP_UPC_RESET_COUNTERS );
	} else {
		// printf("_bgp_reset: BGP_UPC_Zero_Counter_Values()\n");
		BGP_UPC_Zero_Counter_Values( );
	}
	sigprocmask( SIG_UNBLOCK, &mask_set, NULL );
	return ( PAPI_OK );
}

/*
 * PAPI Shutdown
 *
 * This routine is for shutting down threads,
 * including the master thread.
 * Effectively a no-op, same as BG/L...
 */
int
_bgp_shutdown( hwd_context_t * ctx )
{
	return ( PAPI_OK );
}

/*
 * PAPI Write
 *
 * Write counter values
 * NOTE: Could possibly be supported, but signal an error as BG/L does...
*/ int _bgp_write( hwd_context_t * ctx, hwd_control_state_t * cntrl, long_long * from ) { return PAPI_ECMP; } /* * Dispatch Timer * * Same as BG/L - simple return */ void _bgp_dispatch_timer( int signal, hwd_siginfo_t * si, void *context ) { return; } void user_signal_handler( int signum, hwd_siginfo_t * siginfo, void *mycontext ) { EventSetInfo_t *ESI; ThreadInfo_t *thread = NULL; int isHardware = 1; vptr_t pc; _papi_hwi_context_t ctx; BGP_UPC_Event_Id_t xEventId = 0; // int thresh; int event_index, i; long_long overflow_bit = 0; int64_t threshold; ctx.si = siginfo; ctx.ucontext = ( ucontext_t * ) mycontext; ucontext_t *context = ( ucontext_t * ) mycontext; pc = ( vptr_t ) context->uc_mcontext.regs->nip; thread = _papi_hwi_lookup_thread( 0 ); //int cidx = (int) &thread; ESI = thread->running_eventset[0]; //ESI = (EventSetInfo_t *) thread->running_eventset; if ( ESI == NULL ) { //printf("ESI is null\n"); return; } else { BGP_UPC_Stop( ); //xEventId = get_bgp_native_event_id(ESI->NativeInfoArray[0].ni_event); //*ESI->overflow.EventIndex].ni_event); event_index = *ESI->overflow.EventIndex; //printf("event index %d\n", event_index); for ( i = 0; i <= event_index; i++ ) { xEventId = get_bgp_native_event_id( ESI->NativeInfoArray[i].ni_event ); if ( BGP_UPC_Read_Counter( xEventId, 1 ) >= BGP_UPC_Get_Counter_Threshold_Value( xEventId ) && BGP_UPC_Get_Counter_Threshold_Value( xEventId ) != 0 ) { break; } } overflow_bit ^= 1 << xEventId; //ESI->overflow.handler(ESI->EventSetIndex, pc, 0, (void *) &ctx); _papi_hwi_dispatch_overflow_signal( ( void * ) &ctx, pc, &isHardware, overflow_bit, 0, &thread, 0 ); //thresh = (int)(*ESI->overflow.threshold + BGP_UPC_Read_Counter_Value(xEventId, 1)); //(int)BGP_UPC_Get_Counter_Threshold_Value(xEventId)); //printf("thresh %llu val %llu\n", (int64_t)*ESI->overflow.threshold, BGP_UPC_Read_Counter_Value(xEventId, 1)); threshold = ( int64_t ) * ESI->overflow.threshold + BGP_UPC_Read_Counter_Value( xEventId, 1 ); //printf("threshold %llu\n", 
threshold); BGP_UPC_Set_Counter_Threshold_Value( xEventId, threshold ); BGP_UPC_Start( 0 ); } } /* * Set Overflow * * This is commented out in BG/L - need to explore and complete... * However, with true 64-bit counters in BG/P and all counters for PAPI * always starting from a true zero (we don't allow write...), the possibility * for overflow is remote at best... * * Commented out code is carry-over from BG/L... */ int _bgp_set_overflow( EventSetInfo_t * ESI, int EventIndex, int threshold ) { int rc = 0; BGP_UPC_Event_Id_t xEventId; // = get_bgp_native_event_id(EventCode); xEventId = get_bgp_native_event_id( ESI->NativeInfoArray[EventIndex].ni_event ); //rc = BGP_UPC_Monitor_Event(xEventId, BGP_UPC_CFG_LEVEL_HIGH); rc = BGP_UPC_Set_Counter_Threshold_Value( xEventId, threshold ); //printf("setting up sigactioni %d\n", xEventId); //ESI->NativeInfoArray[EventIndex].ni_event); /*struct sigaction act; act.sa_sigaction = user_signal_handler; memset(&act.sa_mask, 0x0, sizeof(act.sa_mask)); act.sa_flags = SA_RESTART | SA_SIGINFO; if (sigaction(SIGXCPU, &act, NULL) == -1) { return (PAPI_ESYS); } */ struct sigaction new_action; sigemptyset( &new_action.sa_mask ); new_action.sa_sigaction = ( void * ) user_signal_handler; new_action.sa_flags = SA_RESTART | SA_SIGINFO; sigaction( SIGXCPU, &new_action, NULL ); return PAPI_OK; } /* * Set Profile * * Same as for BG/L, routine not used and returns error */ int _bgp_set_profile( EventSetInfo_t * ESI, int EventIndex, int threshold ) { /* This function is not used and shouldn't be called. */ return PAPI_ECMP; } /* * Stop Profiling * * Same as for BG/L... 
*/ int _bgp_stop_profiling( ThreadInfo_t * master, EventSetInfo_t * ESI ) { return PAPI_OK; } /* * PAPI Control * * Same as for BG/L - initialize the domain */ int _bgp_ctl( hwd_context_t * ctx, int code, _papi_int_option_t * option ) { // extern int _bgp_set_domain(hwd_control_state_t * cntrl, int domain); switch ( code ) { case PAPI_DOMAIN: case PAPI_DEFDOM: // Simply return PAPI_OK, as no state is kept. return PAPI_OK; case PAPI_GRANUL: case PAPI_DEFGRN: return PAPI_ECMP; default: return PAPI_EINVAL; } } /* * Get Real Micro-seconds */ long long _bgp_get_real_usec( void ) { /* * NOTE: _papi_hwi_system_info.hw_info.mhz is really a representation of unit of time per cycle. * On BG/P, it's value is 8.5e-4. Therefore, to get cycles per sec, we have to multiply * by 1.0e12. To then convert to usec, we have to divide by 1.0e-3. */ // SUBDBG("_bgp_get_real_usec: _papi_hwi_system_info.hw_info.mhz=%e\n",(_papi_hwi_system_info.hw_info.mhz)); // float x = (float)get_cycles(); // float y = (_papi_hwi_system_info.hw_info.mhz)*(1.0e9); // SUBDBG("_bgp_get_real_usec: _papi_hwi_system_info.hw_info.mhz=%e, x=%e, y=%e, x/y=%e, (long long)(x/y) = %lld\n", // (_papi_hwi_system_info.hw_info.mhz), x, y, x/y, (long long)(x/y)); // return (long long)(x/y); return ( ( long long ) ( ( ( float ) get_cycles( ) ) / ( ( _papi_hwi_system_info.hw_info.cpu_max_mhz ) ) ) ); } /* * Get Real Cycles * * Same for BG/L, using native function... */ long long _bgp_get_real_cycles( void ) { return ( get_cycles( ) ); } /* * Get Virtual Micro-seconds * * Same calc as for BG/L, returns real usec... */ long long _bgp_get_virt_usec( void ) { return _bgp_get_real_usec( ); } /* * Get Virtual Cycles * * Same calc as for BG/L, returns real cycles... 
*/
long long
_bgp_get_virt_cycles( void )
{
	return _bgp_get_real_cycles( );
}

/*
 * Component setup and shutdown
 *
 * Initializes hardware counters, sets up the function vector table,
 * and gets hardware information; this routine is called when the
 * PAPI process is initialized (i.e., PAPI_library_init)
 */
int
_bgp_init_component( int cidx )
{
	int retval;
	_bgp_vectors.cmp_info.CmpIdx = cidx;
	retval = _bgp_init_global( );
	return ( retval );
}

/*************************************/
/* CODE TO SUPPORT OPAQUE NATIVE MAP */
/*************************************/

/*
 * Native Code to Event Name
 *
 * Given a native event code, returns the short text label
 */
int
_bgp_ntv_code_to_name( unsigned int EventCode, char *name, int len )
{
	char xNativeEventName[BGP_UPC_MAXIMUM_LENGTH_EVENT_NAME];
	BGP_UPC_Event_Id_t xEventId = get_bgp_native_event_id( EventCode );

	/*
	 * NOTE: We do not return the event name for a user mode 2 or 3 event...
	 */
	if ( ( int ) xEventId < 0 || ( int ) xEventId > 511 )
		return ( PAPI_ENOEVNT );
	if ( BGP_UPC_Get_Event_Name
	     ( xEventId, BGP_UPC_MAXIMUM_LENGTH_EVENT_NAME,
	       xNativeEventName ) != BGP_UPC_SUCCESS )
		return ( PAPI_ENOEVNT );
	SUBDBG( "_bgp_ntv_code_to_name: EventCode = %d, xNativeEventName = %s\n",
	        EventCode, xNativeEventName );
	strncpy( name, "PNE_", len );
	strncat( name, xNativeEventName, len - strlen( name ) - 1 );
	return ( PAPI_OK );
}

/*
 * Native Code to Event Description
 *
 * Given a native event code, returns the longer native event description
 */
int
_bgp_ntv_code_to_descr( unsigned int EventCode, char *name, int len )
{
	char xNativeEventDesc[BGP_UPC_MAXIMUM_LENGTH_EVENT_DESCRIPTION];
	BGP_UPC_Event_Id_t xEventId = get_bgp_native_event_id( EventCode );

	/*
	 * NOTE: We do not return the event name for a user mode 2 or 3 event...
	 */
	if ( ( int ) xEventId < 0 || ( int ) xEventId > 511 )
		return ( PAPI_ENOEVNT );
	else if ( BGP_UPC_Get_Event_Description
	          ( xEventId, BGP_UPC_MAXIMUM_LENGTH_EVENT_DESCRIPTION,
	            xNativeEventDesc ) != BGP_UPC_SUCCESS )
		return ( PAPI_ENOEVNT );
	strncpy( name, xNativeEventDesc, len );
	name[len - 1] = '\0';
	return ( PAPI_OK );
}

/*
 * Native Code to Bit Configuration
 *
 * Given a native event code, assigns the native event's
 * information to a given pointer.
 * NOTE: The info must be COPIED to the location addressed by
 *       the provided pointer, not just referenced!
 * NOTE: For BG/P, the bit configuration is not needed,
 *       as the native SPI is used to configure events.
 */
int
_bgp_ntv_code_to_bits( unsigned int EventCode, hwd_register_t * bits )
{
	return ( PAPI_OK );
}

/*
 * Native ENUM Events
 *
 * Given a native event code, looks for the next MOESI bit if applicable.
 * If not, looks for the next event in the table if the next one exists.
 * If not, returns the proper error code.
 *
 * For BG/P, we simply advance the native event id to the
 * next logical non-reserved event id.
 *
 * We only support enumerating all or available events.
 */
int
_bgp_ntv_enum_events( unsigned int *EventCode, int modifier )
{
	/*
	 * Check for a valid EventCode; we only process the 'all events'
	 * and 'available events' modifiers...
	 */
	// printf("_bgp_ntv_enum_events: EventCode=%8.8x\n", *EventCode);
	if ( *EventCode < 0x40000000 || *EventCode > 0x400001FF ||
	     ( modifier != PAPI_ENUM_ALL && modifier != PAPI_PRESET_ENUM_AVAIL ) )
		return PAPI_ECMP;

	char xNativeEventName[BGP_UPC_MAXIMUM_LENGTH_EVENT_NAME];
	BGP_UPC_RC_t xRC;

	// NOTE: We turn off the PAPI_NATIVE bit here...
int32_t xNativeEventId = ( ( *EventCode ) & PAPI_NATIVE_AND_MASK ) + 0x00000001; while ( xNativeEventId <= 0x000001FF ) { xRC = BGP_UPC_Get_Event_Name( xNativeEventId, BGP_UPC_MAXIMUM_LENGTH_EVENT_NAME, xNativeEventName ); // printf("_bgp_ntv_enum_events: xNativeEventId = %8.8x, xRC=%d\n", xNativeEventId, xRC); if ( ( xRC == BGP_UPC_SUCCESS ) && ( strlen( xNativeEventName ) > 0 ) ) { // printf("_bgp_ntv_enum_events: len(xNativeEventName)=%d, xNativeEventName=%s\n", strlen(xNativeEventName), xNativeEventName); break; } xNativeEventId++; } if ( xNativeEventId > 0x000001FF ) return ( PAPI_ENOEVNT ); else { // NOTE: We turn the PAPI_NATIVE bit back on here... *EventCode = xNativeEventId | PAPI_NATIVE_MASK; return ( PAPI_OK ); } } int _papi_hwi_init_os(void) { struct utsname uname_buffer; uname(&uname_buffer); strncpy(_papi_os_info.name,uname_buffer.sysname,PAPI_MAX_STR_LEN); strncpy(_papi_os_info.version,uname_buffer.release,PAPI_MAX_STR_LEN); _papi_os_info.itimer_sig = PAPI_INT_MPX_SIGNAL; _papi_os_info.itimer_num = PAPI_INT_ITIMER; _papi_os_info.itimer_res_ns = 1; return PAPI_OK; } /* * PAPI Vector Table for BG/P */ papi_vector_t _bgp_vectors = { .cmp_info = { .name = "linux-bgp", .short_name = "bgp", .description = "BlueGene/P component", .num_cntrs = BGP_UPC_MAX_MONITORED_EVENTS, .num_mpx_cntrs = BGP_UPC_MAX_MONITORED_EVENTS, .default_domain = PAPI_DOM_USER, .available_domains = PAPI_DOM_USER | PAPI_DOM_KERNEL, .default_granularity = PAPI_GRN_THR, .available_granularities = PAPI_GRN_THR, .hardware_intr_sig = PAPI_INT_SIGNAL, .hardware_intr = 1, .fast_real_timer = 1, .fast_virtual_timer = 0, }, /* Sizes of framework-opaque component-private structures */ .size = { .context = sizeof ( hwd_context_t ), .control_state = sizeof ( hwd_control_state_t ), .reg_value = sizeof ( hwd_register_t ), .reg_alloc = sizeof ( hwd_reg_alloc_t ), }, /* Function pointers in this component */ .dispatch_timer = _bgp_dispatch_timer, .start = _bgp_start, .stop = _bgp_stop, .read = 
_bgp_read, .reset = _bgp_reset, .write = _bgp_write, .stop_profiling = _bgp_stop_profiling, .init_component = _bgp_init_component, .init_thread = _bgp_init_thread, .init_control_state = _bgp_init_control_state, .update_control_state = _bgp_update_control_state, .ctl = _bgp_ctl, .set_overflow = _bgp_set_overflow, .set_profile = _bgp_set_profile, .set_domain = _bgp_set_domain, .ntv_enum_events = _bgp_ntv_enum_events, .ntv_code_to_name = _bgp_ntv_code_to_name, .ntv_code_to_descr = _bgp_ntv_code_to_descr, .ntv_code_to_bits = _bgp_ntv_code_to_bits, .allocate_registers = _bgp_allocate_registers, .shutdown_thread = _bgp_shutdown }; papi_os_vector_t _papi_os_vector = { .get_memory_info = _bgp_get_memory_info, .get_dmem_info = _bgp_get_dmem_info, .get_real_cycles = _bgp_get_real_cycles, .get_real_usec = _bgp_get_real_usec, .get_virt_cycles = _bgp_get_virt_cycles, .get_virt_usec = _bgp_get_virt_usec, .get_system_info = _bgp_get_system_info, }; papi-papi-7-2-0-t/src/linux-bgp.h000066400000000000000000000030121502707512200165010ustar00rootroot00000000000000#ifndef _LINUX_BGP_H #define _LINUX_BGP_H #include #include #include #include #include #include #include #include #include #include #include #include #include #define MAX_COUNTERS BGP_UPC_MAX_MONITORED_EVENTS #define MAX_COUNTER_TERMS MAX_COUNTERS #include "papi.h" #include "papi_preset.h" //#include "papi_defines.h" #include "linux-bgp-native-events.h" // Context structure not used... typedef struct bgp_context { int reserved; } bgp_context_t; // Control state structure... Holds local copy of read counters... typedef struct bgp_control_state { long_long counters[BGP_UPC_MAX_MONITORED_EVENTS]; } bgp_control_state_t; // Register allocation structure typedef struct bgp_reg_alloc { _papi_hwd_bgp_native_event_id_t id; } bgp_reg_alloc_t; // Register structure not used... 
typedef struct bgp_register { int reserved; } bgp_register_t; /* Override void* definitions from PAPI framework layer */ /* with typedefs to conform to PAPI component layer code. */ #undef hwd_reg_alloc_t #undef hwd_register_t #undef hwd_control_state_t #undef hwd_context_t typedef bgp_reg_alloc_t hwd_reg_alloc_t; typedef bgp_register_t hwd_register_t; typedef bgp_control_state_t hwd_control_state_t; typedef bgp_context_t hwd_context_t; extern void _papi_hwd_lock( int ); extern void _papi_hwd_unlock( int ); #include "linux-bgp-context.h" extern hwi_search_t *preset_search_map; #endif papi-papi-7-2-0-t/src/linux-bgq-common.c000066400000000000000000000102341502707512200177670ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file linux-bgq-common.c * CVS: $Id$ * @author Heike Jagode * jagode@eecs.utk.edu * Mods: < your name here > * < your email address > * BGPM component * * Tested version of bgpm (early access) * * @brief * This file is part of the source code for a component that enables PAPI-C to * access hardware monitoring counters for BG/Q through the bgpm library. 
*/ #include "linux-bgq-common.h" /******************************************************************************* ******** BEGIN FUNCTIONS USED INTERNALLY SPECIFIC TO THIS COMPONENT ********** ******************************************************************************/ int _check_BGPM_error( int err, char* bgpmfunc ) { char buffer[PAPI_MAX_STR_LEN]; int retval; if ( err < 0 ) { sprintf( buffer, "Error: ret value is %d for BGPM API function '%s'.", err, bgpmfunc); retval = _papi_hwi_publish_error( buffer ); return retval; } return PAPI_OK; } /* * Returns all event values from the BGPM eventGroup */ long_long _common_getEventValue( unsigned event_id, int EventGroup ) { uint64_t value; int retval; retval = Bgpm_ReadEvent( EventGroup, event_id, &value ); retval = _check_BGPM_error( retval, "Bgpm_ReadEvent" ); if ( retval < 0 ) return retval; return ( ( long_long ) value ); } /* * Delete BGPM eventGroup and create an new empty one */ int _common_deleteRecreate( int *EventGroup_ptr ) { #ifdef DEBUG_BGQ printf( _AT_ " _common_deleteRecreate: *EventGroup_ptr=%d\n", *EventGroup_ptr); #endif int retval; // delete previous bgpm eventset retval = Bgpm_DeleteEventSet( *EventGroup_ptr ); retval = _check_BGPM_error( retval, "Bgpm_DeleteEventSet" ); if ( retval < 0 ) return retval; // create a new empty bgpm eventset *EventGroup_ptr = Bgpm_CreateEventSet(); retval = _check_BGPM_error( *EventGroup_ptr, "Bgpm_CreateEventSet" ); if ( retval < 0 ) return retval; #ifdef DEBUG_BGQ printf( _AT_ " _common_deleteRecreate: *EventGroup_ptr=%d\n", *EventGroup_ptr); #endif return PAPI_OK; } /* * Rebuild BGPM eventGroup with the events as it was prior to deletion */ int _common_rebuildEventgroup( int count, int *EventGroup_local, int *EventGroup_ptr ) { #ifdef DEBUG_BGQ printf( "_common_rebuildEventgroup\n" ); #endif int i, retval; // rebuild BGPM EventGroup for ( i = 0; i < count; i++ ) { retval = Bgpm_AddEvent( *EventGroup_ptr, EventGroup_local[i] ); retval = _check_BGPM_error( retval, 
"Bgpm_AddEvent" );
		if ( retval < 0 ) return retval;
#ifdef DEBUG_BGQ
		printf( "_common_rebuildEventgroup: After emptying EventGroup, event re-added: %d\n",
		        EventGroup_local[i] );
#endif
	}

	return PAPI_OK;
}

/*
 * _common_set_overflow_BGPM
 *
 * Since update_control_state trashes overflow settings, this puts things
 * back into balance for BGPM
 */
int
_common_set_overflow_BGPM( int EventGroup, int evt_idx, int threshold,
                           void (*handler)(int, uint64_t, uint64_t, const ucontext_t *) )
{
	int retval;
	uint64_t threshold_for_bgpm;

	/* convert the threshold value assigned by the PAPI user to the value that is
	 * programmed into the counter. This value is required by Bgpm_SetOverflow() */
	threshold_for_bgpm = BGPM_PERIOD2THRES( threshold );

#ifdef DEBUG_BGQ
	printf("_common_set_overflow_BGPM\n");
	int i;
	int numEvts = Bgpm_NumEvents( EventGroup );
	for ( i = 0; i < numEvts; i++ ) {
		printf("_common_set_overflow_BGPM: %d = %s\n", i, Bgpm_GetEventLabel( EventGroup, i) );
	}
#endif

	retval = Bgpm_SetOverflow( EventGroup, evt_idx, threshold_for_bgpm );
	retval = _check_BGPM_error( retval, "Bgpm_SetOverflow" );
	if ( retval < 0 ) return retval;

	retval = Bgpm_SetEventUser1( EventGroup, evt_idx, 1024 );
	retval = _check_BGPM_error( retval, "Bgpm_SetEventUser1" );
	if ( retval < 0 ) return retval;

	/* user signal handler for overflow case */
	retval = Bgpm_SetOverflowHandler( EventGroup, handler );
	retval = _check_BGPM_error( retval, "Bgpm_SetOverflowHandler" );
	if ( retval < 0 ) return retval;

	return PAPI_OK;
}
papi-papi-7-2-0-t/src/linux-bgq-common.h000066400000000000000000000030321502707512200177720ustar00rootroot00000000000000
/****************************/
/* THIS IS OPEN SOURCE CODE */
/****************************/
/**
 * @file    linux-bgq-common.h
 * CVS:     $Id$
 * @author  Heike Jagode
 *          jagode@eecs.utk.edu
 * Mods:    < your name here >
 *          < your email address >
 * BGPM component
 *
 * Tested version of bgpm (early access)
 *
 * @brief
 * This file is part of the source code for a component that enables PAPI-C to
 * access
hardware monitoring counters for BG/Q through the bgpm library. */ #include "papi.h" /* Header required by BGPM */ #include "bgpm/include/bgpm.h" extern int _papi_hwi_publish_error( char *error ); // Define gymnastics to create a compile time AT string. #define STRINGIFY(x) #x #define TOSTRING(x) STRINGIFY(x) #define _AT_ __FILE__ ":" TOSTRING(__LINE__) /* return EXIT_FAILURE; \*/ #define MAX_COUNTERS ( PEVT_LAST_EVENT + 1 ) //#define DEBUG_BGQ /************************* COMMON PROTOTYPES ********************************* *******************************************************************************/ /* common prototypes for BGQ sustrate and BGPM components */ int _check_BGPM_error( int err, char* bgpmfunc ); long_long _common_getEventValue( unsigned event_id, int EventGroup ); int _common_deleteRecreate( int *EventGroup_ptr ); int _common_rebuildEventgroup( int count, int *EventGroup_local, int *EventGroup_ptr ); int _common_set_overflow_BGPM( int EventGroup, int evt_idx, int threshold, void (*handler)(int, uint64_t, uint64_t, const ucontext_t *) ); papi-papi-7-2-0-t/src/linux-bgq-lock.h000066400000000000000000000001101502707512200174240ustar00rootroot00000000000000extern void _papi_hwd_lock( int ); extern void _papi_hwd_unlock( int ); papi-papi-7-2-0-t/src/linux-bgq-memory.c000066400000000000000000000036001502707512200200060ustar00rootroot00000000000000/* * File: linux-bgq-memory.c * CVS: $Id$ * Author: Heike Jagode * jagode@eecs.utk.edu * Mods: * */ #include "papi.h" #include "papi_internal.h" #include "linux-bgq.h" #ifdef __LINUX__ #include #endif #include /* * Prototypes... 
*/ int init_bgq( PAPI_mh_info_t * pMem_Info ); // inline void cpuid(unsigned int *, unsigned int *,unsigned int *,unsigned int *); /* * Get Memory Information * * Fills in memory information - effectively set to all 0x00's */ extern int _bgq_get_memory_info( PAPI_hw_info_t * pHwInfo, int pCPU_Type ) { int retval = 0; switch ( pCPU_Type ) { default: //fprintf(stderr,"Default CPU type in %s (%d)\n",__FUNCTION__,__LINE__); retval = init_bgq( &pHwInfo->mem_hierarchy ); break; } return retval; } /* * Get DMem Information for BG/Q * * NOTE: Currently, all values set to -1 */ extern int _bgq_get_dmem_info( PAPI_dmem_info_t * pDmemInfo ) { // pid_t xPID = getpid(); // prpsinfo_t xInfo; // char xFile[256]; // int xFD; // sprintf(xFile, "/proc/%05d", xPID); // if ((fd = open(xFile, O_RDONLY)) < 0) { // SUBDBG("PAPI_get_dmem_info can't open /proc/%d\n", xPID); // return (PAPI_ESYS); // } // if (ioctl(xFD, PIOCPSINFO, &xInfo) < 0) { // return (PAPI_ESYS); // } // close(xFD); pDmemInfo->size = PAPI_EINVAL; pDmemInfo->resident = PAPI_EINVAL; pDmemInfo->high_water_mark = PAPI_EINVAL; pDmemInfo->shared = PAPI_EINVAL; pDmemInfo->text = PAPI_EINVAL; pDmemInfo->library = PAPI_EINVAL; pDmemInfo->heap = PAPI_EINVAL; pDmemInfo->locked = PAPI_EINVAL; pDmemInfo->stack = PAPI_EINVAL; pDmemInfo->pagesize = PAPI_EINVAL; return PAPI_OK; } /* * Cache configuration for BG/Q */ int init_bgq( PAPI_mh_info_t * pMem_Info ) { memset( pMem_Info, 0x0, sizeof ( *pMem_Info ) ); //fprintf(stderr,"mem_info not est up [%s (%d)]\n",__FUNCTION__,__LINE__); return PAPI_OK; } papi-papi-7-2-0-t/src/linux-bgq.c000066400000000000000000001027221502707512200165050ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file linux-bgq.c * @author Heike Jagode * jagode@eecs.utk.edu * Mods: < your name here > * < your email address > * Blue Gene/Q CPU component: BGPM / Punit * * Tested version of bgpm (early access) * * @brief * This file has the 
source code for a component that enables PAPI-C to * access hardware monitoring counters for BG/Q through the BGPM library. */ #include "papi.h" #include "papi_internal.h" #include "papi_lock.h" #include "papi_memory.h" #include "extras.h" #include "linux-bgq.h" #include "papi_vector.h" #include "error.h" /* * BG/Q specific 'stuff' */ #include #include #include #include #include #include #include #include #include "spi/include/upci/upci.h" #ifdef DEBUG_BGQ #include #endif // BG/Q macros #define get_cycles GetTimeBase // BG/Q external structures/functions/stuff #if 1 UPC_Lock_t thdLocks[PAPI_MAX_LOCK]; #else pthread_mutex_t thdLocks[PAPI_MAX_LOCK]; #endif /* Defined in papi_data.c */ //extern papi_mdi_t _papi_hwi_system_info; papi_vector_t _bgq_vectors; PAPI_os_info_t _papi_os_info; #define OPCODE_EVENT_CHUNK 8 static int allocated_opcode_events = 0; static int num_opcode_events = 0; struct bgq_generic_events_t { int idx; int eventId; char mask[PAPI_MIN_STR_LEN]; char opcode[PAPI_MIN_STR_LEN]; uint64_t opcode_mask; }; static struct bgq_generic_events_t *GenericEvent; /* Defined in linux-bgq-memory.c */ extern int _bgq_get_memory_info( PAPI_hw_info_t * pHwInfo, int pCPU_Type ); extern int _bgq_get_dmem_info( PAPI_dmem_info_t * pDmemInfo ); /* prototypes */ void user_signal_handler( int hEvtSet, uint64_t address, uint64_t ovfVector, const ucontext_t *pContext ); /******************************************************************************* ******** BEGIN FUNCTIONS USED INTERNALLY SPECIFIC TO THIS COMPONENT ********** ******************************************************************************/ /* * Lock */ void _papi_hwd_lock( int lock ) { #ifdef DEBUG_BGQ printf( _AT_ " _papi_hwd_lock %d\n", lock); #endif assert( lock < PAPI_MAX_LOCK ); #if 1 UPC_Lock( &thdLocks[lock] ); #else pthread_mutex_lock( &thdLocks[lock] ); #endif #ifdef DEBUG_BGQ printf( _AT_ " _papi_hwd_lock got lock %d\n", lock ); #endif return; } /* * Unlock */ void _papi_hwd_unlock( int lock ) { 
#ifdef DEBUG_BGQ printf( _AT_ " _papi_hwd_unlock %d\n", lock ); #endif assert( lock < PAPI_MAX_LOCK ); #if 1 UPC_Unlock( &thdLocks[lock] ); #else pthread_mutex_unlock( &thdLocks[lock] ); #endif return; } /* * Get System Information * * Initialize system information structure */ int _bgq_get_system_info( papi_mdi_t *mdi ) { #ifdef DEBUG_BGQ printf( "_bgq_get_system_info\n" ); #endif ( void ) mdi; Personality_t personality; int retval; /* Hardware info */ retval = Kernel_GetPersonality( &personality, sizeof( Personality_t ) ); if ( retval ) { fprintf( stdout, "Kernel_GetPersonality returned %d (sys error=%d).\n" "\t%s\n", retval, errno, strerror( errno ) ); return PAPI_ESYS; } /* Returns the number of processors that are associated with the currently * running process */ _papi_hwi_system_info.hw_info.ncpu = Kernel_ProcessorCount( ); // TODO: HJ Those values need to be fixed _papi_hwi_system_info.hw_info.nnodes = Kernel_ProcessCount( ); _papi_hwi_system_info.hw_info.totalcpus = _papi_hwi_system_info.hw_info.ncpu; _papi_hwi_system_info.hw_info.cpu_max_mhz = personality.Kernel_Config.FreqMHz; _papi_hwi_system_info.hw_info.cpu_min_mhz = personality.Kernel_Config.FreqMHz; _papi_hwi_system_info.hw_info.mhz = ( float ) personality.Kernel_Config.FreqMHz; SUBDBG( "_bgq_get_system_info: Detected MHZ is %f\n", _papi_hwi_system_info.hw_info.mhz ); return ( PAPI_OK ); } /* * Initialize Control State * */ int _bgq_init_control_state( hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "_bgq_init_control_state\n" ); #endif int retval; ptr->EventGroup = Bgpm_CreateEventSet(); retval = _check_BGPM_error( ptr->EventGroup, "Bgpm_CreateEventSet" ); if ( retval < 0 ) return retval; // initialize multiplexing flag to OFF (0) ptr->muxOn = 0; // initialize overflow flag to OFF (0) ptr->overflow = 0; ptr->overflow_count = 0; // initialized BGPM eventGroup flag to NOT applied yet (0) ptr->bgpm_eventset_applied = 0; return PAPI_OK; } /* * Set Domain */ int _bgq_set_domain( 
hwd_control_state_t * cntrl, int domain ) { #ifdef DEBUG_BGQ printf( "_bgq_set_domain\n" ); #endif int found = 0; ( void ) cntrl; if ( PAPI_DOM_USER & domain ) found = 1; if ( PAPI_DOM_KERNEL & domain ) found = 1; if ( PAPI_DOM_OTHER & domain ) found = 1; if ( !found ) return ( PAPI_EINVAL ); return ( PAPI_OK ); } /* * PAPI Initialization * This is called whenever a thread is initialized */ int _bgq_init( hwd_context_t * ctx ) { #ifdef DEBUG_BGQ printf( "_bgq_init\n" ); #endif ( void ) ctx; int retval; #ifdef DEBUG_BGPM Bgpm_PrintOnError(1); Bgpm_ExitOnError(0); #else Bgpm_PrintOnError(0); // avoid bgpm default of exiting when error occurs - caller will check return code instead. Bgpm_ExitOnError(0); #endif retval = Bgpm_Init( BGPM_MODE_SWDISTRIB ); retval = _check_BGPM_error( retval, "Bgpm_Init" ); if ( retval < 0 ) return retval; //_common_initBgpm(); return PAPI_OK; } int _bgq_multiplex( hwd_control_state_t * bgq_state ) { int retval; uint64_t bgpm_period; double Sec, Hz; #ifdef DEBUG_BGQ printf("_bgq_multiplex BEGIN: Num of Events = %d (vs %d)\n", Bgpm_NumEvents( bgq_state->EventGroup ), bgq_state->count ); #endif // convert Mhz to Hz ( = cycles / sec ) Hz = (double) _papi_hwi_system_info.hw_info.cpu_max_mhz * 1000 * 1000; // convert PAPI multiplex period (in ns) to BGPM period (in cycles) Sec = (double) _papi_os_info.itimer_ns / ( 1000 * 1000 * 1000 ); bgpm_period = Hz * Sec; // if EventGroup is not empty -- which is required by BGPM before // we can call SetMultiplex() -- then drain the events from the // BGPM EventGroup, turn on multiplex flag, and rebuild BGPM EventGroup. 
if ( 0 < bgq_state->count ) { // Delete and re-create BGPM eventset retval = _common_deleteRecreate( &bgq_state->EventGroup ); if ( retval < 0 ) return retval; // turn on multiplex for BGPM retval = Bgpm_SetMultiplex( bgq_state->EventGroup, bgpm_period, BGPM_NORMAL ); retval = _check_BGPM_error( retval, "Bgpm_SetMultiplex" ); if ( retval < 0 ) return retval; // rebuild BGPM EventGroup retval = _common_rebuildEventgroup( bgq_state->count, bgq_state->EventGroup_local, &bgq_state->EventGroup ); if ( retval < 0 ) return retval; } else { // need to pass either BGPM_NORMAL or BGPM_NOTNORMAL // BGPM_NORMAL: numbers reported by Bgpm_ReadEvent() are normalized // to the maximum time spent in a multiplexed group retval = Bgpm_SetMultiplex( bgq_state->EventGroup, bgpm_period, BGPM_NORMAL ); retval = _check_BGPM_error( retval, "Bgpm_SetMultiplex" ); if ( retval < 0 ) return retval; } #ifdef DEBUG_BGQ printf("_bgq_multiplex END: Num of Events = %d (vs %d) --- retval = %d\n", Bgpm_NumEvents( bgq_state->EventGroup ), bgq_state->count, retval ); #endif return ( retval ); } /* * Register Allocation * */ int _bgq_allocate_registers( EventSetInfo_t * ESI ) { #ifdef DEBUG_BGQ printf("_bgq_allocate_registers\n"); #endif int i, natNum; int xEventId; /* * Start monitoring the events... 
*/ natNum = ESI->NativeCount; for ( i = 0; i < natNum; i++ ) { xEventId = ( ESI->NativeInfoArray[i].ni_event & PAPI_NATIVE_AND_MASK ) + 1; ESI->NativeInfoArray[i].ni_position = i; } return PAPI_OK; } /* * PAPI Cleanup Eventset * * Destroy and re-create the BGPM / Punit EventSet */ int _bgq_cleanup_eventset( hwd_control_state_t * ctrl ) { #ifdef DEBUG_BGQ printf( "_bgq_cleanup_eventset\n" ); #endif // set multiplexing flag to OFF (0) ctrl->muxOn = 0; // set overflow flag to OFF (0) ctrl->overflow = 0; ctrl->overflow_count = 0; // set BGPM eventGroup flag back to NOT applied yet (0) ctrl->bgpm_eventset_applied = 0; return ( PAPI_OK ); } /* * Update Control State * * This function clears the current contents of the control * structure and updates it with whatever resources are allocated * for all the native events in the native info structure array. */ int _bgq_update_control_state( hwd_control_state_t * ptr, NativeInfo_t * native, int count, hwd_context_t * ctx ) { #ifdef DEBUG_BGQ printf( _AT_ " _bgq_update_control_state: count = %d, EventGroup=%d\n", count, ptr->EventGroup ); #endif ( void ) ctx; int i, j, k, index, retval; unsigned evtIdx; // Delete and re-create BGPM eventset retval = _common_deleteRecreate( &ptr->EventGroup ); if ( retval < 0 ) return retval; #ifdef DEBUG_BGQ printf( _AT_ " _bgq_update_control_state: EventGroup=%d, muxOn = %d, overflow = %d\n", ptr->EventGroup, ptr->muxOn, ptr->overflow ); #endif // add the events to the eventset for ( i = 0; i < count; i++ ) { index = ( native[i].ni_event & PAPI_NATIVE_AND_MASK ) + 1; ptr->EventGroup_local[i] = index; // we found an opcode event if ( index > BGQ_PUNIT_MAX_EVENTS ) { for( j = 0; j < num_opcode_events; j++ ) { #ifdef DEBUG_BGQ printf(_AT_ " _bgq_update_control_state: %d out of %d OPCODES\n", j, num_opcode_events ); #endif #ifdef DEBUG_BGQ printf(_AT_ " _bgq_update_control_state: j's idx = %d, index = %d\n", GenericEvent[j].idx, index ); #endif if ( GenericEvent[j].idx == ( index - 1) ) { /* Add 
events to the BGPM eventGroup */ retval = Bgpm_AddEvent( ptr->EventGroup, GenericEvent[j].eventId ); retval = _check_BGPM_error( retval, "Bgpm_AddEvent" ); if ( retval < 0 ) return retval; #ifdef DEBUG_BGQ printf(_AT_ " _bgq_update_control_state: ADD event: i = %d, eventId = %d\n", i, GenericEvent[j].eventId ); #endif evtIdx = Bgpm_GetEventIndex( ptr->EventGroup, GenericEvent[j].eventId, i ); #ifdef DEBUG_BGQ printf(_AT_ " _bgq_update_control_state: evtIdx in EventGroup = %d\n", evtIdx ); #endif if ( 0 == strcmp( GenericEvent[j].mask, "PEVT_INST_XU_GRP_MASK" ) ) { retval = Bgpm_SetXuGrpMask( ptr->EventGroup, evtIdx, GenericEvent[j].opcode_mask ); retval = _check_BGPM_error( retval, "Bgpm_SetXuGrpMask" ); if ( retval < 0 ) return retval; #ifdef DEBUG_BGQ printf(_AT_ " _bgq_update_control_state: it's PEVT_INST_XU_GRP_MASK\n" ); #endif } else if ( 0 == strcmp( GenericEvent[j].mask, "PEVT_INST_QFPU_GRP_MASK" ) ) { retval = Bgpm_SetQfpuGrpMask( ptr->EventGroup, evtIdx, GenericEvent[j].opcode_mask ); retval = _check_BGPM_error( retval, "Bgpm_SetQfpuGrpMask" ); if ( retval < 0 ) return retval; #ifdef DEBUG_BGQ printf(_AT_ " _bgq_update_control_state: it's PEVT_INST_QFPU_GRP_MASK\n" ); #endif } } } } else { #ifdef DEBUG_BGQ printf(_AT_ " _bgq_update_control_state: no OPCODE\n" ); #endif /* Add events to the BGPM eventGroup */ retval = Bgpm_AddEvent( ptr->EventGroup, index ); retval = _check_BGPM_error( retval, "Bgpm_AddEvent" ); if ( retval < 0 ) return retval; #ifdef DEBUG_BGQ printf(_AT_ " _bgq_update_control_state: ADD event: i = %d, index = %d\n", i, index ); #endif } } // store how many events we added to an EventSet ptr->count = count; // if muxOn and EventGroup is not empty -- which is required by BGPM before // we can call SetMultiplex() -- then drain the events from the // BGPM EventGroup, turn on multiplex flag, and rebuild BGPM EventGroup. 
if ( 1 == ptr->muxOn ) { retval = _bgq_multiplex( ptr ); } // since update_control_state trashes overflow settings, this puts things // back into balance for BGPM if ( 1 == ptr->overflow ) { for ( k = 0; k < ptr->overflow_count; k++ ) { retval = _common_set_overflow_BGPM( ptr->EventGroup, ptr->overflow_list[k].EventIndex, ptr->overflow_list[k].threshold, user_signal_handler ); if ( retval < 0 ) return retval; } } return ( PAPI_OK ); } /* * PAPI Start */ int _bgq_start( hwd_context_t * ctx, hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "BEGIN _bgq_start\n" ); #endif ( void ) ctx; int retval; retval = Bgpm_Apply( ptr->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_Apply" ); if ( retval < 0 ) return retval; // set flag to 1: BGPM eventGroup HAS BEEN applied ptr->bgpm_eventset_applied = 1; #ifdef DEBUG_BGQ int i; int numEvts = Bgpm_NumEvents( ptr->EventGroup ); for ( i = 0; i < numEvts; i++ ) { printf("%d = %s\n", i, Bgpm_GetEventLabel( ptr->EventGroup, i) ); } #endif /* Bgpm_Apply() does an implicit reset; hence no need to use Bgpm_ResetStart */ retval = Bgpm_Start( ptr->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_Start" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * PAPI Stop */ int _bgq_stop( hwd_context_t * ctx, hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "BEGIN _bgq_stop\n" ); #endif ( void ) ctx; int retval; retval = Bgpm_Stop( ptr->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_Stop" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * PAPI Read Counters * * Read the counters into local storage */ int _bgq_read( hwd_context_t * ctx, hwd_control_state_t * ptr, long_long ** dp, int flags ) { #ifdef DEBUG_BGQ printf( "_bgq_read\n" ); #endif ( void ) ctx; ( void ) flags; int i, numEvts; numEvts = Bgpm_NumEvents( ptr->EventGroup ); if ( numEvts == 0 ) { #ifdef DEBUG_BGPM printf ("Error: ret value is %d for BGPM API function Bgpm_NumEvents.\n", numEvts ); //return ( EXIT_FAILURE ); #endif } 
for ( i = 0; i < numEvts; i++ ) ptr->counters[i] = _common_getEventValue( i, ptr->EventGroup ); *dp = ptr->counters; return ( PAPI_OK ); } /* * PAPI Reset * * Zero the counter values */ int _bgq_reset( hwd_context_t * ctx, hwd_control_state_t * ptr ) { #ifdef DEBUG_BGQ printf( "_bgq_reset\n" ); #endif ( void ) ctx; int retval; /* we can't simply call Bgpm_Reset() since PAPI doesn't have the restriction that an EventSet has to be stopped before resetting is possible. However, BGPM does have this restriction. Hence we need to stop, reset and start */ retval = Bgpm_Stop( ptr->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_Stop" ); if ( retval < 0 ) return retval; retval = Bgpm_ResetStart( ptr->EventGroup ); retval = _check_BGPM_error( retval, "Bgpm_ResetStart" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * PAPI Shutdown * * This routine is for shutting down threads, * including the master thread. * Effectively a no-op, same as BG/L/P... */ int _bgq_shutdown( hwd_context_t * ctx ) { #ifdef DEBUG_BGQ printf( "_bgq_shutdown\n" ); #endif ( void ) ctx; int retval; /* Disable BGPM library */ retval = Bgpm_Disable(); retval = _check_BGPM_error( retval, "Bgpm_Disable" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * PAPI Write * * Write counter values * NOTE: Could possible support, but signal error as BG/L/P does... 
 */
int
_bgq_write( hwd_context_t * ctx, hwd_control_state_t * cntrl, long_long * from )
{
#ifdef DEBUG_BGQ
	printf( "_bgq_write\n" );
#endif
	( void ) ctx;
	( void ) cntrl;
	( void ) from;

	return PAPI_ECMP;
}

/*
 * Dispatch Timer
 *
 * NOT the same as BG/L/P where we simply return
 * This function is used when hardware overflows are working or when
 * software overflows are forced
 */
void
_bgq_dispatch_timer( int signal, hwd_siginfo_t * info, void *uc )
{
	( void ) signal;
	( void ) info;
	( void ) uc;
#ifdef DEBUG_BGQ
	printf("BEGIN _bgq_dispatch_timer\n");
#endif
	return;
}

/*
 * user_signal_handler
 *
 * This function is used when hardware overflows are working or when
 * software overflows are forced
 */
void
user_signal_handler( int hEvtSet, uint64_t address, uint64_t ovfVector,
					 const ucontext_t *pContext )
{
#ifdef DEBUG_BGQ
	printf( "user_signal_handler start\n" );
#endif
	( void ) address;
	int retval;
	unsigned i;
	int isHardware = 1;
	int cidx = _bgq_vectors.cmp_info.CmpIdx;
	long_long overflow_bit = 0;
	vptr_t address1;
	_papi_hwi_context_t ctx;
	ctx.ucontext = ( hwd_ucontext_t * ) pContext;

	ThreadInfo_t *thread = _papi_hwi_lookup_thread( 0 );
	//printf(_AT_ " thread = %p\n", thread);

	/* validate the thread before dereferencing it */
	if ( thread == NULL ) {
		PAPIERROR( "thread == NULL in user_signal_handler!" );
		return;
	}

	EventSetInfo_t *ESI;
	ESI = thread->running_eventset[cidx];

	if ( ESI == NULL ) {
		PAPIERROR( "ESI == NULL in user_signal_handler!" );
		return;
	}

	// Get the indices of all events which have overflowed.
	unsigned ovfIdxs[BGPM_MAX_OVERFLOW_EVENTS];
	unsigned len = BGPM_MAX_OVERFLOW_EVENTS;

	retval = Bgpm_GetOverflowEventIndices( hEvtSet, ovfVector, ovfIdxs, &len );
	if ( retval < 0 ) {
#ifdef DEBUG_BGPM
		printf( "Error: ret value is %d for BGPM API function Bgpm_GetOverflowEventIndices.\n",
				retval );
#endif
		return;
	}

	if ( ESI->overflow.flags == 0 ) {
		PAPIERROR( "ESI->overflow.flags == 0 in user_signal_handler!" );
		return;
	}

	for ( i = 0; i < len; i++ ) {
		uint64_t hProf;
		Bgpm_GetEventUser1( hEvtSet, ovfIdxs[i], &hProf );
		if ( hProf ) {
			overflow_bit ^= 1 << ovfIdxs[i];
			break;
		}
	}

	if ( ESI->overflow.flags & PAPI_OVERFLOW_FORCE_SW ) {
#ifdef DEBUG_BGQ
		printf("OVERFLOW_SOFTWARE\n");
#endif
		address1 = GET_OVERFLOW_ADDRESS( ctx );
		_papi_hwi_dispatch_overflow_signal( ( void * ) &ctx, address1,
											NULL, 0, 0, &thread, cidx );
		return;
	}
	else if ( ESI->overflow.flags & PAPI_OVERFLOW_HARDWARE ) {
#ifdef DEBUG_BGQ
		printf("OVERFLOW_HARDWARE\n");
#endif
		address1 = GET_OVERFLOW_ADDRESS( ctx );
		_papi_hwi_dispatch_overflow_signal( ( void * ) &ctx, address1,
											&isHardware, overflow_bit, 0, &thread, cidx );
	}
	else {
#ifdef DEBUG_BGQ
		printf("OVERFLOW_NONE\n");
#endif
		PAPIERROR( "ESI->overflow.flags is set to something other than PAPI_OVERFLOW_HARDWARE or PAPI_OVERFLOW_FORCE_SW (%#x)",
				   thread->running_eventset[cidx]->overflow.flags);
	}
}

/*
 * Set Overflow
 *
 * This is commented out in BG/L/P - need to explore and complete...
 * However, with true 64-bit counters in BG/Q and all counters for PAPI
 * always starting from a true zero (we don't allow write...), the possibility
 * for overflow is remote at best...
 */
int
_bgq_set_overflow( EventSetInfo_t * ESI, int EventIndex, int threshold )
{
#ifdef DEBUG_BGQ
	printf("BEGIN _bgq_set_overflow\n");
#endif
	hwd_control_state_t * this_state = ( hwd_control_state_t * ) ESI->ctl_state;
	int retval;
	int evt_idx;

	/*
	 * In case a BGPM eventGroup HAS BEEN applied or attached before
	 * overflow is set, delete the eventGroup, create a new empty one,
	 * and rebuild it as it was prior to deletion
	 */
#ifdef DEBUG_BGQ
	printf( "_bgq_set_overflow: bgpm_eventset_applied = %d, threshold = %d\n",
			this_state->bgpm_eventset_applied, threshold );
#endif
	if ( 1 == this_state->bgpm_eventset_applied && 0 != threshold ) {
		retval = _common_deleteRecreate( &this_state->EventGroup );
		if ( retval < 0 ) return retval;
		retval = _common_rebuildEventgroup( this_state->count,
											this_state->EventGroup_local,
											&this_state->EventGroup );
		if ( retval < 0 ) return retval;

		/* set BGPM eventGroup flag back to NOT applied yet (0)
		 * because the eventGroup has been recreated from scratch */
		this_state->bgpm_eventset_applied = 0;
	}

	evt_idx = ESI->EventInfoArray[EventIndex].pos[0];
	//evt_id = ( ESI->NativeInfoArray[EventIndex].ni_event & PAPI_NATIVE_AND_MASK ) + 1;
	SUBDBG( "Hardware counter %d (vs %d) used in overflow, threshold %d\n",
			evt_idx, EventIndex, threshold );
#ifdef DEBUG_BGQ
	printf( "Hardware counter %d (vs %d) used in overflow, threshold %d\n",
			evt_idx, EventIndex, threshold );
#endif

	/* A threshold of zero disables overflow on this counter,
	 * so remove the signal handler */
	if ( threshold == 0 ) {
		retval = _papi_hwi_stop_signal( _bgq_vectors.cmp_info.hardware_intr_sig );
		if ( retval != PAPI_OK )
			return ( retval );
	}
	else {
		this_state->overflow = 1;
		this_state->overflow_count++;
		this_state->overflow_list[this_state->overflow_count-1].threshold = threshold;
		this_state->overflow_list[this_state->overflow_count-1].EventIndex = evt_idx;

#ifdef DEBUG_BGQ
		printf( "_bgq_set_overflow: Enable the signal handler\n" );
#endif
		/* Enable the signal handler */
		retval =
_papi_hwi_start_signal( _bgq_vectors.cmp_info.hardware_intr_sig, NEED_CONTEXT, _bgq_vectors.cmp_info.CmpIdx ); if ( retval != PAPI_OK ) return ( retval ); retval = _common_set_overflow_BGPM( this_state->EventGroup, this_state->overflow_list[this_state->overflow_count-1].EventIndex, this_state->overflow_list[this_state->overflow_count-1].threshold, user_signal_handler ); if ( retval < 0 ) return retval; } return ( PAPI_OK ); } /* * Set Profile * * Same as for BG/L/P, routine not used and returns error */ int _bgq_set_profile( EventSetInfo_t * ESI, int EventIndex, int threshold ) { #ifdef DEBUG_BGQ printf("BEGIN _bgq_set_profile\n"); #endif ( void ) ESI; ( void ) EventIndex; ( void ) threshold; return PAPI_ECMP; } /* * Stop Profiling * * Same as for BG/L/P... */ int _bgq_stop_profiling( ThreadInfo_t * master, EventSetInfo_t * ESI ) { #ifdef DEBUG_BGQ printf("BEGIN _bgq_stop_profiling\n"); #endif ( void ) master; ( void ) ESI; return ( PAPI_OK ); } /* * PAPI Control * * Same as for BG/L/P - initialize the domain */ int _bgq_ctl( hwd_context_t * ctx, int code, _papi_int_option_t * option ) { #ifdef DEBUG_BGQ printf( "_bgq_ctl\n" ); #endif ( void ) ctx; int retval; switch ( code ) { case PAPI_MULTIPLEX: { hwd_control_state_t * bgq_state = ( ( hwd_control_state_t * ) option->multiplex.ESI->ctl_state ); bgq_state->muxOn = 1; retval = _bgq_multiplex( bgq_state ); return ( retval ); } default: return ( PAPI_OK ); } } /* * Get Real Micro-seconds */ long long _bgq_get_real_usec( void ) { #ifdef DEBUG_BGQ printf( "_bgq_get_real_usec\n" ); #endif /* * NOTE: _papi_hwi_system_info.hw_info.mhz is really a representation of unit of time per cycle. * On BG/P, it's value is 8.5e-4. Therefore, to get cycles per sec, we have to multiply * by 1.0e12. To then convert to usec, we have to divide by 1.0e-3. */ return ( ( long long ) ( ( ( float ) get_cycles( ) ) / ( ( _papi_hwi_system_info.hw_info.cpu_max_mhz ) ) ) ); } /* * Get Real Cycles * * Same for BG/L/P, using native function... 
*/ long long _bgq_get_real_cycles( void ) { #ifdef DEBUG_BGQ printf( "_bgq_get_real_cycles\n" ); #endif return ( ( long long ) get_cycles( ) ); } /* * Get Virtual Micro-seconds * * Same calc as for BG/L/P, returns real usec... */ long long _bgq_get_virt_usec( void ) { #ifdef DEBUG_BGQ printf( "_bgq_get_virt_usec\n" ); #endif return _bgq_get_real_usec( ); } /* * Get Virtual Cycles * * Same calc as for BG/L/P, returns real cycles... */ long long _bgq_get_virt_cycles( void ) { #ifdef DEBUG_BGQ printf( "_bgq_get_virt_cycles\n" ); #endif return _bgq_get_real_cycles( ); } /* * Component setup and shutdown * * Initialize hardware counters, setup the function vector table * and get hardware information, this routine is called when the * PAPI process is initialized (IE PAPI_library_init) */ int _bgq_init_component( int cidx ) { #ifdef DEBUG_BGQ printf("_bgq_init_substrate\n"); //printf("_bgq_init_substrate: 1. BGPM_INITIALIZED = %d \n", BGPM_INITIALIZED); #endif int retval; int i; /* allocate the opcode event structure */ GenericEvent = calloc( OPCODE_EVENT_CHUNK, sizeof( struct bgq_generic_events_t ) ); if ( NULL == GenericEvent ) { return PAPI_ENOMEM; } /* init opcode event stuff */ allocated_opcode_events = OPCODE_EVENT_CHUNK; num_opcode_events = 0; _bgq_vectors.cmp_info.CmpIdx = cidx; /* * Fill in what we can of the papi_system_info */ SUBDBG( "Before _bgq_get_system_info()...\n" ); retval = _bgq_get_system_info( &_papi_hwi_system_info ); SUBDBG( "After _bgq_get_system_info(), retval=%d...\n", retval ); if ( retval != PAPI_OK ) return ( retval ); /* * Setup memory info */ SUBDBG( "Before _bgq_get_memory_info...\n" ); retval = _bgq_get_memory_info( &_papi_hwi_system_info.hw_info, ( int ) _papi_hwi_system_info.hw_info. 
model ); SUBDBG( "After _bgq_get_memory_info, retval=%d...\n", retval ); if ( retval ) return ( retval ); #if 1 /* Setup Locks */ for ( i = 0; i < PAPI_MAX_LOCK; i++ ) thdLocks[i] = 0; // MUTEX_OPEN #else for( i = 0; i < PAPI_MAX_LOCK; i++ ) { pthread_mutex_init( &thdLocks[i], NULL ); } #endif /* Setup presets */ retval = _papi_load_preset_table( "BGQ", 0, cidx ); if ( retval ) { return retval; } return ( PAPI_OK ); } /*************************************/ /* CODE TO SUPPORT OPAQUE NATIVE MAP */ /*************************************/ /* * Event Name to Native Code */ int _bgq_ntv_name_to_code( const char *name, unsigned int *event_code ) { #ifdef DEBUG_BGQ printf( "_bgq_ntv_name_to_code\n" ); #endif int ret; #ifdef DEBUG_BGQ printf( "name = ===%s===\n", name ); #endif /* Treat events differently if BGPM Opcodes are used */ /* Opcode group selection values are "OR"ed together to create a desired mask of instruction group events to accumulate in the same counter */ if ( 0 == strncmp( name, "PEVT_INST_XU_GRP_MASK", strlen( "PEVT_INST_XU_GRP_MASK" ) ) || 0 == strncmp( name, "PEVT_INST_QFPU_GRP_MASK", strlen( "PEVT_INST_QFPU_GRP_MASK" ) ) ) { char *pcolon; pcolon = strchr( name, ':' ); // Found colon separator if ( pcolon != NULL ) { int mask_len = pcolon - name; strncpy( GenericEvent[num_opcode_events].mask, name, mask_len ); strncpy( GenericEvent[num_opcode_events].opcode, pcolon+1, strlen(name) - 1 - mask_len ); /* opcode_mask needs to be 'uint64_t', hence we use strtoull() which returns an 'unsigned long long int' */ GenericEvent[num_opcode_events].opcode_mask = strtoull( GenericEvent[num_opcode_events].opcode, (char **)NULL, 16 ); GenericEvent[num_opcode_events].idx = OPCODE_BUF + num_opcode_events; /* Return event id matching the generic XU/QFPU event string */ GenericEvent[num_opcode_events].eventId = Bgpm_GetEventIdFromLabel( GenericEvent[num_opcode_events].mask ); if ( GenericEvent[num_opcode_events].eventId <= 0 ) { #ifdef DEBUG_BGPM printf ("Error: ret 
value is %d for BGPM API function '%s'.\n", ret, "Bgpm_GetEventIdFromLabel" ); #endif return PAPI_ENOEVNT; } *event_code = GenericEvent[num_opcode_events].idx; num_opcode_events++; /* If there are too many opcode events than allocated, then allocate more room */ if( num_opcode_events >= allocated_opcode_events ) { SUBDBG("Allocating more room for BGPM opcode events (%d %ld)\n", ( allocated_opcode_events + NATIVE_OPCODE_CHUNK ), ( long )sizeof( struct bgq_generic_events_t ) * ( allocated_opcode_events + NATIVE_OPCODE_CHUNK ) ); GenericEvent = realloc( GenericEvent, sizeof( struct bgq_generic_events_t ) * ( allocated_opcode_events + OPCODE_EVENT_CHUNK ) ); if ( NULL == GenericEvent ) { return PAPI_ENOMEM; } allocated_opcode_events += OPCODE_EVENT_CHUNK; } } else { SUBDBG( "Error: Found a generic BGPM event mask without opcode string\n" ); return PAPI_ENOEVNT; } #ifdef DEBUG_BGQ printf(_AT_ " _bgq_ntv_name_to_code: GenericEvent no. %d: \n", num_opcode_events-1 ); printf( "idx = %d\n", GenericEvent[num_opcode_events-1].idx); printf( "eventId = %d\n", GenericEvent[num_opcode_events-1].eventId); printf( "mask = %s\n", GenericEvent[num_opcode_events-1].mask); printf( "opcode = %s\n", GenericEvent[num_opcode_events-1].opcode); printf( "opcode_mask = %#lX (%lu)\n", GenericEvent[num_opcode_events-1].opcode_mask, GenericEvent[num_opcode_events-1].opcode_mask ); #endif } else { /* Return event id matching a given event label string */ ret = Bgpm_GetEventIdFromLabel ( name ); if ( ret <= 0 ) { #ifdef DEBUG_BGPM printf ("Error: ret value is %d for BGPM API function '%s'.\n", ret, "Bgpm_GetEventIdFromLabel" ); #endif return PAPI_ENOEVNT; } else if ( ret > BGQ_PUNIT_MAX_EVENTS ) // not a PUnit event return PAPI_ENOEVNT; else *event_code = ( ret - 1 ); } return PAPI_OK; } /* * Native Code to Event Name * * Given a native event code, returns the short text label */ int _bgq_ntv_code_to_name( unsigned int EventCode, char *name, int len ) { #ifdef DEBUG_BGQ printf( 
"_bgq_ntv_code_to_name\n" ); #endif int index = ( EventCode & PAPI_NATIVE_AND_MASK ) + 1; if ( index >= MAX_COUNTERS ) return PAPI_ENOEVNT; strncpy( name, Bgpm_GetEventIdLabel( index ), len ); if ( name == NULL ) { #ifdef DEBUG_BGPM printf ("Error: ret value is NULL for BGPM API function Bgpm_GetEventIdLabel.\n" ); #endif return PAPI_ENOEVNT; } #ifdef DEBUG_BGQ printf( "name = ===%s===\n", name ); #endif return ( PAPI_OK ); } /* * Native Code to Event Description * * Given a native event code, returns the longer native event description */ int _bgq_ntv_code_to_descr( unsigned int EventCode, char *name, int len ) { #ifdef DEBUG_BGQ printf( "_bgq_ntv_code_to_descr\n" ); #endif int retval; int index = ( EventCode & PAPI_NATIVE_AND_MASK ) + 1; retval = Bgpm_GetLongDesc( index, name, &len ); retval = _check_BGPM_error( retval, "Bgpm_GetLongDesc" ); if ( retval < 0 ) return retval; return ( PAPI_OK ); } /* * Native Code to Bit Configuration * * Given a native event code, assigns the native event's * information to a given pointer. * NOTE: The info must be COPIED to location addressed by * the provided pointer, not just referenced! * NOTE: For BG/Q, the bit configuration is not needed, * as the native SPI is used to configure events. 
*/ int _bgq_ntv_code_to_bits( unsigned int EventCode, hwd_register_t * bits ) { #ifdef DEBUG_BGQ printf( "_bgq_ntv_code_to_bits\n" ); #endif ( void ) EventCode; ( void ) bits; return ( PAPI_OK ); } /* * Native ENUM Events * */ int _bgq_ntv_enum_events( unsigned int *EventCode, int modifier ) { #ifdef DEBUG_BGQ printf( "_bgq_ntv_enum_events\n" ); #endif switch ( modifier ) { case PAPI_ENUM_FIRST: *EventCode = PAPI_NATIVE_MASK; return ( PAPI_OK ); break; case PAPI_ENUM_EVENTS: { int index = ( *EventCode & PAPI_NATIVE_AND_MASK ) + 1; if ( index < BGQ_PUNIT_MAX_EVENTS ) { *EventCode = *EventCode + 1; return ( PAPI_OK ); } else return ( PAPI_ENOEVNT ); break; } default: return ( PAPI_EINVAL ); } return ( PAPI_EINVAL ); } int _papi_hwi_init_os(void) { struct utsname uname_buffer; /* Get the kernel info */ uname(&uname_buffer); strncpy(_papi_os_info.name,uname_buffer.sysname,PAPI_MAX_STR_LEN); strncpy(_papi_os_info.version,uname_buffer.release,PAPI_MAX_STR_LEN); _papi_os_info.itimer_sig = PAPI_INT_MPX_SIGNAL; _papi_os_info.itimer_num = PAPI_INT_ITIMER; _papi_os_info.itimer_res_ns = 1; return PAPI_OK; } /* * PAPI Vector Table for BG/Q */ papi_vector_t _bgq_vectors = { .cmp_info = { /* Default component information (unspecified values are initialized to 0) */ .name = "linux-bgq", .short_name = "bgq", .description = "Blue Gene/Q component", .num_cntrs = BGQ_PUNIT_MAX_COUNTERS, .num_mpx_cntrs = BGQ_PUNIT_MAX_COUNTERS, .num_native_events = BGQ_PUNIT_MAX_EVENTS, .default_domain = PAPI_DOM_USER, .available_domains = PAPI_DOM_USER | PAPI_DOM_KERNEL, .default_granularity = PAPI_GRN_THR, .available_granularities = PAPI_GRN_THR, .hardware_intr_sig = PAPI_INT_SIGNAL, .hardware_intr = 1, .kernel_multiplex = 1, /* component specific cmp_info initializations */ .fast_real_timer = 1, .fast_virtual_timer = 0, } , /* Sizes of framework-opaque component-private structures */ .size = { .context = sizeof ( hwd_context_t ), .control_state = sizeof ( hwd_control_state_t ), .reg_value = sizeof ( 
hwd_register_t ), .reg_alloc = sizeof ( hwd_reg_alloc_t ), } , /* Function pointers in this component */ // .get_overflow_address = .start = _bgq_start, .stop = _bgq_stop, .read = _bgq_read, .reset = _bgq_reset, .write = _bgq_write, .stop_profiling = _bgq_stop_profiling, .init_component = _bgq_init_component, .init_thread = _bgq_init, .init_control_state = _bgq_init_control_state, .update_control_state = _bgq_update_control_state, .ctl = _bgq_ctl, .set_overflow = _bgq_set_overflow, //.dispatch_timer = _bgq_dispatch_timer, .set_profile = _bgq_set_profile, .set_domain = _bgq_set_domain, .ntv_enum_events = _bgq_ntv_enum_events, .ntv_name_to_code = _bgq_ntv_name_to_code, .ntv_code_to_name = _bgq_ntv_code_to_name, .ntv_code_to_descr = _bgq_ntv_code_to_descr, .ntv_code_to_bits = _bgq_ntv_code_to_bits, .allocate_registers = _bgq_allocate_registers, .cleanup_eventset = _bgq_cleanup_eventset, .shutdown_thread = _bgq_shutdown // .shutdown_global = // .user = }; papi_os_vector_t _papi_os_vector = { .get_memory_info = _bgq_get_memory_info, .get_dmem_info = _bgq_get_dmem_info, .get_real_cycles = _bgq_get_real_cycles, .get_real_usec = _bgq_get_real_usec, .get_virt_cycles = _bgq_get_virt_cycles, .get_virt_usec = _bgq_get_virt_usec, .get_system_info = _bgq_get_system_info }; papi-papi-7-2-0-t/src/linux-bgq.h000066400000000000000000000067351502707512200165210ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file linux-bgq.h * CVS: $Id$ * @author Heike Jagode * jagode@eecs.utk.edu * Mods: < your name here > * < your email address > * Blue Gene/Q CPU component: BGPM / Punit * * Tested version of bgpm (early access) * * @brief * This file has the source code for a component that enables PAPI-C to * access hardware monitoring counters for BG/Q through the BGPM library. 
*/ #ifndef _LINUX_BGQ_H #define _LINUX_BGQ_H #include #include #include #include #include #include #include #include #include #include #include #include //#include #include "linux-bgq-common.h" /* Header required to obtain BGQ personality */ #include "process_impl.h" #include "linux-context.h" /* this number assumes that there will never be more events than indicated */ #define BGQ_PUNIT_MAX_COUNTERS UPC_P_NUM_COUNTERS #define BGQ_PUNIT_MAX_EVENTS PEVT_PUNIT_LAST_EVENT #define MAX_COUNTER_TERMS BGQ_PUNIT_MAX_COUNTERS // keep a large enough gap between actual BGPM events and our local opcode events #define OPCODE_BUF ( MAX_COUNTERS + MAX_COUNTERS ) #include "papi.h" #include "papi_preset.h" typedef struct { int preset; /* Preset code */ int derived; /* Derived code */ char *( findme[MAX_COUNTER_TERMS] ); /* Strings to look for, more than 1 means derived */ char *operation; /* PostFix operations between terms */ char *note; /* In case a note is included with a preset */ } bgq_preset_search_entry_t; // Context structure not used... typedef struct bgq_context { int reserved; } bgq_context_t; typedef struct bgq_overflow { int threshold; int EventIndex; } bgq_overflow_t; // Control state structure... Holds local copy of read counters... typedef struct bgq_control_state { int EventGroup; int EventGroup_local[512]; int count; long_long counters[BGQ_PUNIT_MAX_COUNTERS]; int muxOn; // multiplexing on or off flag int overflow; // overflow enable int overflow_count; bgq_overflow_t overflow_list[512]; int bgpm_eventset_applied; // BGPM eventGroup applied yes or no flag } bgq_control_state_t; // Register allocation structure typedef struct bgq_reg_alloc { //_papi_hwd_bgq_native_event_id_t id; } bgq_reg_alloc_t; // Register structure not used... 
typedef struct bgq_register { /* This is used by the framework.It likes it to be !=0 to do something */ unsigned int selector; /* This is the information needed to locate a BGPM / Punit event */ unsigned eventID; } bgq_register_t; /** This structure is used to build the table of events */ typedef struct bgq_native_event_entry { bgq_register_t resources; char name[PAPI_MAX_STR_LEN]; char description[PAPI_2MAX_STR_LEN]; } bgq_native_event_entry_t; /* Override void* definitions from PAPI framework layer */ /* with typedefs to conform to PAPI component layer code. */ #undef hwd_reg_alloc_t #undef hwd_register_t #undef hwd_control_state_t #undef hwd_context_t typedef bgq_reg_alloc_t hwd_reg_alloc_t; typedef bgq_register_t hwd_register_t; typedef bgq_control_state_t hwd_control_state_t; typedef bgq_context_t hwd_context_t; extern void _papi_hwd_lock( int ); extern void _papi_hwd_unlock( int ); /* Signal handling functions */ //#undef hwd_siginfo_t //#undef hwd_ucontext_t //typedef int hwd_siginfo_t; //typedef ucontext_t hwd_ucontext_t; #endif papi-papi-7-2-0-t/src/linux-common.c000066400000000000000000000472341502707512200172320ustar00rootroot00000000000000/* * File: linux-common.c */ #include #include #include #include #include #include #include #include #include #include #include #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "linux-memory.h" #include "linux-common.h" #include "linux-timer.h" #include "x86_cpuid_info.h" PAPI_os_info_t _papi_os_info; /* The locks used by Linux */ #if defined(USE_PTHREAD_MUTEXES) pthread_mutex_t _papi_hwd_lock_data[PAPI_MAX_LOCK]; #elif defined(USE_LIBAO_ATOMICS) AO_TS_t _papi_hwd_lock_data[PAPI_MAX_LOCK]; #else volatile unsigned int _papi_hwd_lock_data[PAPI_MAX_LOCK]; #endif static int _linux_init_locks(void) { int i; for ( i = 0; i < PAPI_MAX_LOCK; i++ ) { #if defined(USE_PTHREAD_MUTEXES) pthread_mutex_init(&_papi_hwd_lock_data[i],NULL); #elif defined(USE_LIBAO_ATOMICS) _papi_hwd_lock_data[i] = 
AO_TS_INITIALIZER; #else _papi_hwd_lock_data[i] = MUTEX_OPEN; #endif } return PAPI_OK; } int _linux_detect_hypervisor(char *virtual_vendor_name) { int retval=0; #if defined(__i386__)||defined(__x86_64__) retval=_x86_detect_hypervisor(virtual_vendor_name); #else (void) virtual_vendor_name; #endif return retval; } #define _PATH_SYS_SYSTEM "/sys/devices/system" #define _PATH_SYS_CPU0 _PATH_SYS_SYSTEM "/cpu/cpu0" static char pathbuf[PATH_MAX] = "/"; static char * search_cpu_info( FILE * f, char *search_str) { static char line[PAPI_HUGE_STR_LEN] = ""; char *s, *start = NULL; rewind(f); while (fgets(line,PAPI_HUGE_STR_LEN,f)!=NULL) { s=strstr(line,search_str); if (s!=NULL) { /* skip all characters in line up to the colon */ /* and then spaces */ s=strchr(s,':'); if (s==NULL) break; s++; while (isspace(*s)) { s++; } start = s; /* Find and clear newline */ s=strrchr(start,'\n'); if (s!=NULL) *s = 0; break; } } return start; } static void decode_vendor_string( char *s, int *vendor ) { if ( strcasecmp( s, "GenuineIntel" ) == 0 ) *vendor = PAPI_VENDOR_INTEL; else if ( ( strcasecmp( s, "AMD" ) == 0 ) || ( strcasecmp( s, "AuthenticAMD" ) == 0 ) ) *vendor = PAPI_VENDOR_AMD; else if ( strcasecmp( s, "IBM" ) == 0 ) *vendor = PAPI_VENDOR_IBM; else if ( strcasecmp( s, "Cray" ) == 0 ) *vendor = PAPI_VENDOR_CRAY; else if ( strcasecmp( s, "ARM_ARM" ) == 0 ) *vendor = PAPI_VENDOR_ARM_ARM; else if ( strcasecmp( s, "ARM_BROADCOM" ) == 0 ) *vendor = PAPI_VENDOR_ARM_BROADCOM; else if ( strcasecmp( s, "ARM_CAVIUM" ) == 0 ) *vendor = PAPI_VENDOR_ARM_CAVIUM; else if ( strcasecmp( s, "ARM_FUJITSU" ) == 0 ) *vendor = PAPI_VENDOR_ARM_FUJITSU; else if ( strcasecmp( s, "ARM_HISILICON") == 0 ) *vendor = PAPI_VENDOR_ARM_HISILICON; else if ( strcasecmp( s, "ARM_APM" ) == 0 ) *vendor = PAPI_VENDOR_ARM_APM; else if ( strcasecmp( s, "ARM_QUALCOMM" ) == 0 ) *vendor = PAPI_VENDOR_ARM_QUALCOMM; else if ( strcasecmp( s, "MIPS" ) == 0 ) *vendor = PAPI_VENDOR_MIPS; else if ( strcasecmp( s, "SiCortex" ) == 0 ) 
*vendor = PAPI_VENDOR_MIPS; else *vendor = PAPI_VENDOR_UNKNOWN; } static FILE * xfopen( const char *path, const char *mode ) { FILE *fd = fopen( path, mode ); if ( !fd ) err( EXIT_FAILURE, "error: %s", path ); return fd; } static FILE * path_vfopen( const char *mode, const char *path, va_list ap ) { vsnprintf( pathbuf, sizeof ( pathbuf ), path, ap ); return xfopen( pathbuf, mode ); } static int path_sibling( const char *path, ... ) { int c; long n; int result = 0; char s[2]; FILE *fp; va_list ap; va_start( ap, path ); fp = path_vfopen( "r", path, ap ); va_end( ap ); while ( ( c = fgetc( fp ) ) != EOF ) { if ( isxdigit( c ) ) { s[0] = ( char ) c; s[1] = '\0'; for ( n = strtol( s, NULL, 16 ); n > 0; n /= 2 ) { if ( n % 2 ) result++; } } } fclose( fp ); return result; } static int path_exist( const char *path, ... ) { va_list ap; va_start( ap, path ); vsnprintf( pathbuf, sizeof ( pathbuf ), path, ap ); va_end( ap ); return access( pathbuf, F_OK ) == 0; } static int decode_cpuinfo_x86( FILE *f, PAPI_hw_info_t *hwinfo ) { int tmp; unsigned int strSize; char *s; /* Stepping */ s = search_cpu_info( f, "stepping"); if ( s ) { if (sscanf( s, "%d", &tmp ) ==1 ) { hwinfo->revision = ( float ) tmp; hwinfo->cpuid_stepping = tmp; } } /* Model Name */ s = search_cpu_info( f, "model name"); strSize = sizeof(hwinfo->model_string); if ( s ) { strncpy( hwinfo->model_string, s, strSize - 1); } /* Family */ s = search_cpu_info( f, "cpu family"); if ( s ) { sscanf( s, "%d", &tmp ); hwinfo->cpuid_family = tmp; } /* CPU Model */ s = search_cpu_info( f, "model"); if ( s ) { sscanf( s , "%d", &tmp ); hwinfo->model = tmp; hwinfo->cpuid_model = tmp; } return PAPI_OK; } static int decode_cpuinfo_power(FILE *f, PAPI_hw_info_t *hwinfo ) { int tmp; unsigned int strSize; char *s; /* Revision */ s = search_cpu_info( f, "revision"); if ( s ) { sscanf( s, "%d", &tmp ); hwinfo->revision = ( float ) tmp; hwinfo->cpuid_stepping = tmp; } /* Model Name */ s = search_cpu_info( f, "model"); strSize = 
sizeof(hwinfo->model_string); if ( s ) { strncpy( hwinfo->model_string, s, strSize - 1); } return PAPI_OK; } static int decode_cpuinfo_arm(FILE *f, PAPI_hw_info_t *hwinfo ) { int tmp; unsigned int strSize; char *s, *t; /* revision */ s = search_cpu_info( f, "CPU revision"); if ( s ) { sscanf( s, "%d", &tmp ); hwinfo->revision = ( float ) tmp; /* For compatibility with old PAPI */ hwinfo->model = tmp; } /* Model Name */ s = search_cpu_info( f, "model name"); strSize = sizeof(hwinfo->model_string); if ( s ) { strncpy( hwinfo->model_string, s, strSize - 1); } /* Architecture (ARMv6, ARMv7, ARMv8, etc.) */ /* Parsing this is a bit fragile. */ /* On ARM64 the "CPU architecture field" */ /* Prior to Linux 3.19: always "AArch64" */ /* Since Linux 3.19: always "8" */ /* On ARM32 the "CPU architecture field" is a value and not */ /* necessarily an integer, so it might be 7 or 7M */ /* also, unknown architectures are assigned a value */ /* such as (10) where 10 does not mean version 10, just */ /* the 10th element in an array */ /* Note the original Raspberry Pi lies in the CPU architecture line */ /* (it's ARMv6 not ARMv7) */ /* So we should actually get the value from the */ /* Processor/ model name line */ s = search_cpu_info( f, "CPU architecture"); if ( s ) { /* Handle old (prior to Linux 3.19) ARM64 */ if (strstr(s,"AArch64")) { hwinfo->cpuid_family = 8; } else { hwinfo->cpuid_family=strtol(s, NULL, 10); } /* Old Fallbacks if the above didn't work */ if (hwinfo->cpuid_family<0) { /* Try the processor field and look inside of parens */ s = search_cpu_info( f, "Processor" ); if (s) { t=strchr(s,'('); tmp=*(t+2)-'0'; hwinfo->cpuid_family = tmp; } /* Try the model name and look inside of parens */ else { s = search_cpu_info( f, "model name" ); if (s) { t=strchr(s,'('); tmp=*(t+2)-'0'; hwinfo->cpuid_family = tmp; } } } } /* CPU Model */ s = search_cpu_info( f, "CPU part" ); if ( s ) { sscanf( s, "%x", &tmp ); hwinfo->cpuid_model = tmp; } /* CPU Variant */ s = 
search_cpu_info( f, "CPU variant" ); if ( s ) { sscanf( s, "%x", &tmp ); hwinfo->cpuid_stepping = tmp; } return PAPI_OK; } int _linux_get_cpu_info( PAPI_hw_info_t *hwinfo, int *cpuinfo_mhz ) { int retval = PAPI_OK; char *s; float mhz = 0.0; FILE *f; char cpuinfo_filename[]="/proc/cpuinfo"; if ( ( f = fopen( cpuinfo_filename, "r" ) ) == NULL ) { PAPIERROR( "fopen(/proc/cpuinfo) errno %d", errno ); return PAPI_ESYS; } /* All of this information may be overwritten by the component */ /***********************/ /* Attempt to find MHz */ /***********************/ s = search_cpu_info( f, "cpu MHz" ); if ( !s ) { s = search_cpu_info( f, "clock" ); } if ( s ) { sscanf( s, "%f", &mhz ); *cpuinfo_mhz = mhz; } else { *cpuinfo_mhz = -1; // Could not find it. // PAPIWARN("Failed to find a clock speed in /proc/cpuinfo"); } /*******************************/ /* Vendor Name and Vendor Code */ /*******************************/ /* First try to read "vendor_id" field */ /* Which is the most common field */ s = search_cpu_info( f, "vendor_id"); if ( s ) { strncpy( hwinfo->vendor_string, s, PAPI_MAX_STR_LEN ); hwinfo->vendor_string[PAPI_MAX_STR_LEN-1]=0; } else { /* If not found, try "vendor" which seems to be Itanium specific */ s = search_cpu_info( f, "vendor" ); if ( s ) { strncpy( hwinfo->vendor_string, s, PAPI_MAX_STR_LEN ); hwinfo->vendor_string[PAPI_MAX_STR_LEN-1]=0; } else { /* "system type" seems to be MIPS and Alpha */ s = search_cpu_info( f, "system type"); if ( s ) { strncpy( hwinfo->vendor_string, s, PAPI_MAX_STR_LEN ); hwinfo->vendor_string[PAPI_MAX_STR_LEN-1]=0; } else { /* "platform" indicates Power */ s = search_cpu_info( f, "platform"); if ( s ) { if ( ( strcasecmp( s, "pSeries" ) == 0 ) || ( strcasecmp( s, "PowerNV" ) == 0 ) || ( strcasecmp( s, "PowerMac" ) == 0 ) ) { strcpy( hwinfo->vendor_string, "IBM" ); } } else { /* "CPU implementer" indicates ARM */ /* For ARM processors, hwinfo->vendor >= PAPI_VENDOR_ARM_ARM(0x41). 
*/ /* If implementer is ARM Limited., hwinfo->vendor == PAPI_VENDOR_ARM_ARM. */ /* If implementer is Cavium Inc., hwinfo->vendor == PAPI_VENDOR_ARM_CAVIUM(0x43). */ s = search_cpu_info( f, "CPU implementer"); if ( s ) { int tmp; sscanf( s, "%x", &tmp ); switch( tmp ) { case PAPI_VENDOR_ARM_ARM: strcpy( hwinfo->vendor_string, "ARM_ARM" ); break; case PAPI_VENDOR_ARM_BROADCOM: strcpy( hwinfo->vendor_string, "ARM_BROADCOM" ); break; case PAPI_VENDOR_ARM_CAVIUM: strcpy( hwinfo->vendor_string, "ARM_CAVIUM" ); break; case PAPI_VENDOR_ARM_FUJITSU: strcpy( hwinfo->vendor_string, "ARM_FUJITSU" ); break; case PAPI_VENDOR_ARM_HISILICON: strcpy( hwinfo->vendor_string, "ARM_HISILICON" ); break; case PAPI_VENDOR_ARM_APM: strcpy( hwinfo->vendor_string, "ARM_APM" ); break; case PAPI_VENDOR_ARM_QUALCOMM: strcpy( hwinfo->vendor_string, "ARM_QUALCOMM" ); break; default: strcpy( hwinfo->vendor_string, "ARM_UNKNOWN" ); } } } } } } /* Decode the string to a PAPI specific implementer value */ if ( strlen( hwinfo->vendor_string ) ) { decode_vendor_string( hwinfo->vendor_string, &hwinfo->vendor ); } /**********************************************/ /* Provide more stepping/model/family numbers */ /**********************************************/ if ((hwinfo->vendor==PAPI_VENDOR_INTEL) || (hwinfo->vendor==PAPI_VENDOR_AMD)) { decode_cpuinfo_x86(f,hwinfo); } if (hwinfo->vendor==PAPI_VENDOR_IBM) { decode_cpuinfo_power(f,hwinfo); } if (hwinfo->vendor>=PAPI_VENDOR_ARM_ARM) { decode_cpuinfo_arm(f,hwinfo); } /* The following members are set using the same methodology */ /* used in lscpu. */ /* Total number of CPUs */ /* The following line assumes totalcpus was initialized to zero! 
*/ while ( path_exist( _PATH_SYS_SYSTEM "/cpu/cpu%d", hwinfo->totalcpus ) ) hwinfo->totalcpus++; /* Number of threads per core */ if ( path_exist( _PATH_SYS_CPU0 "/topology/thread_siblings" ) ) hwinfo->threads = path_sibling( _PATH_SYS_CPU0 "/topology/thread_siblings" ); /* Number of cores per socket */ if ( path_exist( _PATH_SYS_CPU0 "/topology/core_siblings" ) && hwinfo->threads > 0 ) hwinfo->cores = path_sibling( _PATH_SYS_CPU0 "/topology/core_siblings" ) / hwinfo->threads; /* Number of NUMA nodes */ /* The following line assumes nnodes was initialized to zero! */ while ( path_exist( _PATH_SYS_SYSTEM "/node/node%d", hwinfo->nnodes ) ) { hwinfo->nnodes++; } /* Number of CPUs per node */ hwinfo->ncpu = hwinfo->nnodes > 1 ? hwinfo->totalcpus / hwinfo->nnodes : hwinfo->totalcpus; /* Number of sockets */ if ( hwinfo->threads > 0 && hwinfo->cores > 0 ) { hwinfo->sockets = hwinfo->totalcpus / hwinfo->cores / hwinfo->threads; } #if 0 int *nodecpu; /* cpumap data is not currently part of the _papi_hw_info struct */ nodecpu = malloc( (unsigned int) hwinfo->nnodes * sizeof(int) ); if ( nodecpu ) { int i; for ( i = 0; i < hwinfo->nnodes; ++i ) { nodecpu[i] = path_sibling( _PATH_SYS_SYSTEM "/node/node%d/cpumap", i ); } } else { PAPIERROR( "malloc failed for variable not currently used" ); } #endif /* Fixup missing Megahertz Value */ /* This is missing from cpuinfo on ARM and MIPS */ if (*cpuinfo_mhz < 1.0) { s = search_cpu_info( f, "BogoMIPS" ); if ((!s) || (sscanf( s, "%f", &mhz ) != 1)) { INTDBG("MHz detection failed. 
" "Please edit file %s at line %d.\n", __FILE__,__LINE__); } if (hwinfo->vendor == PAPI_VENDOR_MIPS) { /* MIPS has 2x clock multiplier */ *cpuinfo_mhz = 2*(((int)mhz)+1); /* Also update version info on MIPS */ s = search_cpu_info( f, "cpu model"); s = strstr(s," V")+2; strtok(s," "); sscanf(s, "%f ", &hwinfo->revision ); } else { /* In general bogomips is proportional to number of CPUs */ if (hwinfo->totalcpus) { if (mhz!=0) *cpuinfo_mhz = mhz / hwinfo->totalcpus; } } } fclose( f ); return retval; } int _linux_get_mhz( int *sys_min_mhz, int *sys_max_mhz ) { FILE *fff; int result; /* Try checking for min MHz */ /* Assume cpu0 exists */ fff=fopen("/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq","r"); if (fff==NULL) return PAPI_EINVAL; result=fscanf(fff,"%d",sys_min_mhz); fclose(fff); if (result!=1) return PAPI_EINVAL; fff=fopen("/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq","r"); if (fff==NULL) return PAPI_EINVAL; result=fscanf(fff,"%d",sys_max_mhz); fclose(fff); if (result!=1) return PAPI_EINVAL; return PAPI_OK; } int _linux_get_system_info( papi_mdi_t *mdi ) { int retval; char maxargs[PAPI_HUGE_STR_LEN]; pid_t pid; int cpuinfo_mhz,sys_min_khz,sys_max_khz; /* Software info */ /* Path and args */ pid = getpid( ); if ( pid < 0 ) { PAPIERROR( "getpid() returned < 0" ); return PAPI_ESYS; } mdi->pid = pid; sprintf( maxargs, "/proc/%d/exe", ( int ) pid ); retval = readlink( maxargs, mdi->exe_info.fullname, PAPI_HUGE_STR_LEN-1 ); if ( retval < 0 ) { PAPIERROR( "readlink(%s) returned < 0", maxargs ); return PAPI_ESYS; } if (retval > PAPI_HUGE_STR_LEN-1) { retval=PAPI_HUGE_STR_LEN-1; } mdi->exe_info.fullname[retval] = '\0'; /* Careful, basename can modify its argument */ strcpy( maxargs, mdi->exe_info.fullname ); strncpy( mdi->exe_info.address_info.name, basename( maxargs ), PAPI_HUGE_STR_LEN-1); mdi->exe_info.address_info.name[PAPI_HUGE_STR_LEN-1] = '\0'; SUBDBG( "Executable is %s\n", mdi->exe_info.address_info.name ); SUBDBG( "Full Executable is %s\n", 
mdi->exe_info.fullname ); /* Executable regions, may require reading /proc/pid/maps file */ retval = _linux_update_shlib_info( mdi ); SUBDBG( "Text: Start %p, End %p, length %d\n", mdi->exe_info.address_info.text_start, mdi->exe_info.address_info.text_end, ( int ) ( mdi->exe_info.address_info.text_end - mdi->exe_info.address_info.text_start ) ); SUBDBG( "Data: Start %p, End %p, length %d\n", mdi->exe_info.address_info.data_start, mdi->exe_info.address_info.data_end, ( int ) ( mdi->exe_info.address_info.data_end - mdi->exe_info.address_info.data_start ) ); SUBDBG( "Bss: Start %p, End %p, length %d\n", mdi->exe_info.address_info.bss_start, mdi->exe_info.address_info.bss_end, ( int ) ( mdi->exe_info.address_info.bss_end - mdi->exe_info.address_info.bss_start ) ); /* PAPI_preload_option information */ strcpy( mdi->preload_info.lib_preload_env, "LD_PRELOAD" ); mdi->preload_info.lib_preload_sep = ' '; strcpy( mdi->preload_info.lib_dir_env, "LD_LIBRARY_PATH" ); mdi->preload_info.lib_dir_sep = ':'; /* Hardware info */ retval = _linux_get_cpu_info( &mdi->hw_info, &cpuinfo_mhz ); if ( retval ) return retval; /* Handle MHz */ retval = _linux_get_mhz( &sys_min_khz, &sys_max_khz ); if ( retval ) { mdi->hw_info.cpu_max_mhz=cpuinfo_mhz; mdi->hw_info.cpu_min_mhz=cpuinfo_mhz; /* mdi->hw_info.mhz=cpuinfo_mhz; mdi->hw_info.clock_mhz=cpuinfo_mhz; */ } else { mdi->hw_info.cpu_max_mhz=sys_max_khz/1000; mdi->hw_info.cpu_min_mhz=sys_min_khz/1000; /* mdi->hw_info.mhz=sys_max_khz/1000; mdi->hw_info.clock_mhz=sys_max_khz/1000; */ } /* Set Up Memory */ retval = _linux_get_memory_info( &mdi->hw_info, mdi->hw_info.model ); if ( retval ) return retval; SUBDBG( "Found %d %s(%d) %s(%d) CPUs at %d Mhz.\n", mdi->hw_info.totalcpus, mdi->hw_info.vendor_string, mdi->hw_info.vendor, mdi->hw_info.model_string, mdi->hw_info.model, mdi->hw_info.cpu_max_mhz); /* Get virtualization info */ mdi->hw_info.virtualized=_linux_detect_hypervisor(mdi->hw_info.virtual_vendor_string); return PAPI_OK; } int 
_papi_hwi_init_os(void) { int major=0,minor=0,sub=0; char *ptr; struct utsname uname_buffer; /* Initialize the locks */ _linux_init_locks(); /* Get the kernel info */ uname(&uname_buffer); SUBDBG("Native kernel version %s\n",uname_buffer.release); strncpy(_papi_os_info.name,uname_buffer.sysname,PAPI_MAX_STR_LEN); #ifdef ASSUME_KERNEL strncpy(_papi_os_info.version,ASSUME_KERNEL,PAPI_MAX_STR_LEN); SUBDBG("Assuming kernel version %s\n",_papi_os_info.name); #else strncpy(_papi_os_info.version,uname_buffer.release,PAPI_MAX_STR_LEN); #endif ptr=strtok(_papi_os_info.version,"."); if (ptr!=NULL) major=atoi(ptr); ptr=strtok(NULL,"."); if (ptr!=NULL) minor=atoi(ptr); ptr=strtok(NULL,"."); if (ptr!=NULL) sub=atoi(ptr); _papi_os_info.os_version=LINUX_VERSION(major,minor,sub); _papi_os_info.itimer_sig = PAPI_INT_MPX_SIGNAL; _papi_os_info.itimer_num = PAPI_INT_ITIMER; _papi_os_info.itimer_ns = PAPI_INT_MPX_DEF_US * 1000; _papi_os_info.itimer_res_ns = 1; _papi_os_info.clock_ticks = sysconf( _SC_CLK_TCK ); /* Get Linux-specific system info */ _linux_get_system_info( &_papi_hwi_system_info ); return PAPI_OK; } int _linux_detect_nmi_watchdog() { int watchdog_detected=0,watchdog_value=0; FILE *fff; fff=fopen("/proc/sys/kernel/nmi_watchdog","r"); if (fff!=NULL) { if (fscanf(fff,"%d",&watchdog_value)==1) { if (watchdog_value>0) watchdog_detected=1; } fclose(fff); } return watchdog_detected; } papi_os_vector_t _papi_os_vector = { .get_memory_info = _linux_get_memory_info, .get_dmem_info = _linux_get_dmem_info, .get_real_cycles = _linux_get_real_cycles, .update_shlib_info = _linux_update_shlib_info, .get_system_info = _linux_get_system_info, #if defined(HAVE_CLOCK_GETTIME) .get_real_usec = _linux_get_real_usec_gettime, #elif defined(HAVE_GETTIMEOFDAY) .get_real_usec = _linux_get_real_usec_gettimeofday, #else .get_real_usec = _linux_get_real_usec_cycles, #endif #if defined(USE_PROC_PTTIMER) .get_virt_usec = _linux_get_virt_usec_pttimer, #elif defined(HAVE_CLOCK_GETTIME_THREAD) 
.get_virt_usec = _linux_get_virt_usec_gettime, #elif defined(HAVE_PER_THREAD_TIMES) .get_virt_usec = _linux_get_virt_usec_times, #elif defined(HAVE_PER_THREAD_GETRUSAGE) .get_virt_usec = _linux_get_virt_usec_rusage, #endif #if defined(HAVE_CLOCK_GETTIME) .get_real_nsec = _linux_get_real_nsec_gettime, #endif #if defined(HAVE_CLOCK_GETTIME_THREAD) .get_virt_nsec = _linux_get_virt_nsec_gettime, #endif }; papi-papi-7-2-0-t/src/linux-common.h000066400000000000000000000021031502707512200172210ustar00rootroot00000000000000#ifndef _LINUX_COMMON_H #define _LINUX_COMMON_H #define LINUX_VERSION(a,b,c) ( ((a&0xff)<<24) | ((b&0xff)<<16) | ((c&0xff) << 8)) #define min(x, y) ({ \ typeof(x) _min1 = (x); \ typeof(y) _min2 = (y); \ (void) (&_min1 == &_min2); \ _min1 < _min2 ? _min1 : _min2; }) static inline pid_t mygettid( void ) { #if defined(__NEC__) return 0; #else #ifdef SYS_gettid return syscall( SYS_gettid ); #elif defined(__NR_gettid) return syscall( __NR_gettid ); #else #error "cannot find gettid" #endif #endif } #ifndef F_SETOWN_EX #define F_SETOWN_EX 15 #define F_GETOWN_EX 16 #define F_OWNER_TID 0 #define F_OWNER_PID 1 #define F_OWNER_PGRP 2 struct f_owner_ex { int type; pid_t pid; }; #endif int _linux_detect_nmi_watchdog(); #if HAVE_SCHED_GETCPU #include <sched.h> /* If possible, pick the processor the code is currently running on. */ #define _papi_getcpu() sched_getcpu() #else /* Just map to processor 0 if sched_getcpu() is not available. 
*/ #define _papi_getcpu() 0 #endif #endif papi-papi-7-2-0-t/src/linux-context.h000066400000000000000000000035051502707512200174240ustar00rootroot00000000000000#ifndef _LINUX_CONTEXT_H #define _LINUX_CONTEXT_H /* Signal handling functions */ #undef hwd_siginfo_t /* Changed from struct siginfo due to POSIX and Fedora 18 */ /* If this breaks anything then we need to add an autoconf test */ typedef siginfo_t hwd_siginfo_t; #undef hwd_ucontext_t typedef ucontext_t hwd_ucontext_t; #if defined(__ia64__) #define OVERFLOW_ADDRESS(ctx) ctx.ucontext->uc_mcontext.sc_ip #elif defined(__i386__) #define OVERFLOW_ADDRESS(ctx) ctx.ucontext->uc_mcontext.gregs[REG_EIP] #elif defined(__NEC__) #define OVERFLOW_ADDRESS(ctx) ctx.ucontext->uc_mcontext.gregs[REG_RIP] #elif defined(__x86_64__) #define OVERFLOW_ADDRESS(ctx) ctx.ucontext->uc_mcontext.gregs[REG_RIP] #elif defined(__powerpc__) && !defined(__powerpc64__) /* * The index of the Next IP (REG_NIP) was obtained by looking at kernel * source code. It wasn't documented anywhere else that I could find. */ #define REG_NIP 32 #define OVERFLOW_ADDRESS(ctx) ctx.ucontext->uc_mcontext.uc_regs->gregs[REG_NIP] #elif defined(__powerpc64__) #define OVERFLOW_ADDRESS(ctx) ctx.ucontext->uc_mcontext.regs->nip #elif defined(__sparc__) #define OVERFLOW_ADDRESS(ctx) ((struct sigcontext *)ctx.ucontext)->si_regs.pc #elif defined(__arm__) #define OVERFLOW_ADDRESS(ctx) ctx.ucontext->uc_mcontext.arm_pc #elif defined(__aarch64__) #define OVERFLOW_ADDRESS(ctx) ctx.ucontext->uc_mcontext.pc #elif defined(__loongarch64) #define OVERFLOW_ADDRESS(ctx) ctx.ucontext->uc_mcontext.__pc #elif defined(__mips__) #define OVERFLOW_ADDRESS(ctx) ctx.ucontext->uc_mcontext.pc #elif defined(__hppa__) #define OVERFLOW_ADDRESS(ctx) ctx.ucontext->uc_mcontext.sc_iaoq[0] #elif defined(__riscv) #define OVERFLOW_ADDRESS(ctx) ctx.ucontext->uc_mcontext.__gregs[REG_PC] #else #error "OVERFLOW_ADDRESS() undefined!" 
#endif #define GET_OVERFLOW_ADDRESS(ctx) (vptr_t)(OVERFLOW_ADDRESS(ctx)) #endif papi-papi-7-2-0-t/src/linux-generic.c000066400000000000000000000000001502707512200173320ustar00rootroot00000000000000papi-papi-7-2-0-t/src/linux-generic.h000066400000000000000000000001021502707512200173420ustar00rootroot00000000000000/* We should probably fix things so this file isn't necessary */ papi-papi-7-2-0-t/src/linux-lock.h000066400000000000000000000207111502707512200166660ustar00rootroot00000000000000#ifndef _LINUX_LOCK_H #define _LINUX_LOCK_H #include "mb.h" /* Let's try to use the atomics from the libatomic_ops project, */ /* unless the user explicitly asks us to fall back to the legacy */ /* PAPI atomics by specifying the flag USE_LEGACY_ATOMICS. */ #include "atomic_ops.h" #if defined(AO_HAVE_test_and_set_acquire)&&!defined(USE_LEGACY_ATOMICS) #define USE_LIBAO_ATOMICS #endif /* Locking functions */ #if defined(USE_PTHREAD_MUTEXES) #include extern pthread_mutex_t _papi_hwd_lock_data[PAPI_MAX_LOCK]; #define _papi_hwd_lock(lck) \ do \ { \ pthread_mutex_lock (&_papi_hwd_lock_data[lck]); \ } while(0) #define _papi_hwd_unlock(lck) \ do \ { \ pthread_mutex_unlock(&_papi_hwd_lock_data[lck]); \ } while(0) #elif defined(USE_LIBAO_ATOMICS) extern AO_TS_t _papi_hwd_lock_data[PAPI_MAX_LOCK]; #define _papi_hwd_lock(lck) {while (AO_test_and_set_acquire(&_papi_hwd_lock_data[lck]) != AO_TS_CLEAR) { ; } } #define _papi_hwd_unlock(lck) { AO_CLEAR(&_papi_hwd_lock_data[lck]); } #else extern volatile unsigned int _papi_hwd_lock_data[PAPI_MAX_LOCK]; #define MUTEX_OPEN 0 #define MUTEX_CLOSED 1 /********/ /* ia64 */ /********/ #if defined(__ia64__) #ifdef __INTEL_COMPILER #define _papi_hwd_lock(lck) { while(_InterlockedCompareExchange_acq(&_papi_hwd_lock_data[lck],MUTEX_CLOSED,MUTEX_OPEN) != MUTEX_OPEN) { ; } } #define _papi_hwd_unlock(lck) { _InterlockedExchange((volatile int *)&_papi_hwd_lock_data[lck], MUTEX_OPEN); } #else /* GCC */ #define _papi_hwd_lock(lck) \ { int res = 0; \ do { \ __asm__ 
__volatile__ ("mov ar.ccv=%0;;" :: "r"(MUTEX_OPEN)); \ __asm__ __volatile__ ("cmpxchg4.acq %0=[%1],%2,ar.ccv" : "=r"(res) : "r"(&_papi_hwd_lock_data[lck]), "r"(MUTEX_CLOSED) : "memory"); \ } while (res != MUTEX_OPEN); } #define _papi_hwd_unlock(lck) { __asm__ __volatile__ ("st4.rel [%0]=%1" : : "r"(&_papi_hwd_lock_data[lck]), "r"(MUTEX_OPEN) : "memory"); } #endif /***********/ /* NEC */ /* TODO: need CAS instructions for NEC card */ /***********/ #elif defined(__NEC__) #define _papi_hwd_lock(lck) \ do \ { \ unsigned int res = 0; \ } while(0) #define _papi_hwd_unlock(lck) \ do \ { \ unsigned int res = 0; \ } while(0) /***********/ /* x86 */ /***********/ #elif defined(__i386__)||defined(__x86_64__) #define _papi_hwd_lock(lck) \ do \ { \ unsigned int res = 0; \ do { \ __asm__ __volatile__ ("lock ; " "cmpxchg %1,%2" : "=a"(res) : "q"(MUTEX_CLOSED), "m"(_papi_hwd_lock_data[lck]), "0"(MUTEX_OPEN) : "memory"); \ } while(res != (unsigned int)MUTEX_OPEN); \ } while(0) #define _papi_hwd_unlock(lck) \ do \ { \ unsigned int res = 0; \ __asm__ __volatile__ ("xchg %0,%1" : "=r"(res) : "m"(_papi_hwd_lock_data[lck]), "0"(MUTEX_OPEN) : "memory"); \ } while(0) /***************/ /* power */ /***************/ #elif defined(__powerpc__) /* * These functions are slight modifications of the functions in * /usr/include/asm-ppc/system.h. * * We can't use the ones in system.h directly because they are defined * only when __KERNEL__ is defined. */ static __inline__ unsigned long papi_xchg_u32( volatile void *p, unsigned long val ) { unsigned long prev; __asm__ __volatile__( "\n\ sync \n\ 1: lwarx %0,0,%2 \n\ stwcx. 
%3,0,%2 \n\ bne- 1b \n\ isync":"=&r"( prev ), "=m"( *( volatile unsigned long * ) p ) :"r"( p ), "r"( val ), "m"( *( volatile unsigned long * ) p ) :"cc", "memory" ); return prev; } #define _papi_hwd_lock(lck) \ do { \ unsigned int retval; \ do { \ retval = papi_xchg_u32(&_papi_hwd_lock_data[lck],MUTEX_CLOSED); \ } while(retval != (unsigned int)MUTEX_OPEN); \ } while(0) #define _papi_hwd_unlock(lck) \ do { \ unsigned int retval; \ do { \ retval = papi_xchg_u32(&_papi_hwd_lock_data[lck],MUTEX_OPEN); \ } while(retval != (unsigned int)MUTEX_CLOSED); \ } while (0) /*****************/ /* SPARC */ /*****************/ #elif defined(__sparc__) static inline void __raw_spin_lock( volatile unsigned int *lock ) { __asm__ __volatile__( "\n1:\n\t" "ldstub [%0], %%g2\n\t" "orcc %%g2, 0x0, %%g0\n\t" "bne,a 2f\n\t" " ldub [%0], %%g2\n\t" ".subsection 2\n" "2:\n\t" "orcc %%g2, 0x0, %%g0\n\t" "bne,a 2b\n\t" " ldub [%0], %%g2\n\t" "b,a 1b\n\t" ".previous\n": /* no outputs */ :"r"( lock ) :"g2", "memory", "cc" ); } static inline void __raw_spin_unlock( volatile unsigned int *lock ) { __asm__ __volatile__( "stb %%g0, [%0]"::"r"( lock ):"memory" ); } #define _papi_hwd_lock(lck) __raw_spin_lock(&_papi_hwd_lock_data[lck]); #define _papi_hwd_unlock(lck) __raw_spin_unlock(&_papi_hwd_lock_data[lck]) /*******************/ /* ARM */ /*******************/ #elif defined(__arm__) #if 0 /* OLD CODE FROM VINCE BELOW */ /* FIXME */ /* not sure if this even works */ /* also the various flavors of ARM */ /* have differing levels of atomic */ /* instruction support. A proper */ /* implementation needs to handle this :( */ #warning "WARNING! Verify mutexes work on ARM!" /* * For arm/gcc, 0 is clear, 1 is set. 
*/ #define MUTEX_SET(tsl) ({ \ int __r; \ asm volatile( \ "swpb %0, %1, [%2]\n\t" \ "eor %0, %0, #1\n\t" \ : "=&r" (__r) \ : "r" (1), "r" (tsl) \ ); \ __r & 1; \ }) #define _papi_hwd_lock(lck) MUTEX_SET(lck) #define _papi_hwd_unlock(lck) (*(volatile int *)(lck) = 0) #endif /* NEW CODE FROM PHIL */ static inline int __arm_papi_spin_lock (volatile unsigned int *lock) { unsigned int val; do asm volatile ("swp %0, %1, [%2]" : "=r" (val) : "0" (1), "r" (lock) : "memory"); while (val != 0); return 0; } #define _papi_hwd_lock(lck) { rmb(); __arm_papi_spin_lock(&_papi_hwd_lock_data[lck]); rmb(); } #define _papi_hwd_unlock(lck) { rmb(); _papi_hwd_lock_data[lck] = 0; rmb(); } #elif defined(__mips__) static inline void __raw_spin_lock(volatile unsigned int *lock) { unsigned int tmp; __asm__ __volatile__( " .set noreorder # __raw_spin_lock \n" "1: ll %1, %2 \n" " bnez %1, 1b \n" " li %1, 1 \n" " sc %1, %0 \n" #if __mips_isa_rev < 6 " beqzl %1, 1b \n" " nop \n" #else " beqzc %1,1b \n" #endif " sync \n" " .set reorder \n" : "=m" (*lock), "=&r" (tmp) : "m" (*lock) : "memory"); } static inline void __raw_spin_unlock(volatile unsigned int *lock) { __asm__ __volatile__( " .set noreorder # __raw_spin_unlock \n" " sync \n" " sw $0, %0 \n" " .set\treorder \n" : "=m" (*lock) : "m" (*lock) : "memory"); } #define _papi_hwd_lock(lck) __raw_spin_lock(&_papi_hwd_lock_data[lck]); #define _papi_hwd_unlock(lck) __raw_spin_unlock(&_papi_hwd_lock_data[lck]) #else #error "_papi_hwd_lock/unlock undefined!" 
#endif #endif #endif /* defined(USE_PTHREAD_MUTEXES) */ papi-papi-7-2-0-t/src/linux-memory.c000066400000000000000000001500571502707512200172500ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: linux-memory.c * Author: Kevin London * london@cs.utk.edu * Mods: Dan Terpstra * terpstra@eecs.utk.edu * cache and TLB info exported to a separate file * which is not OS or driver dependent * Mods: Vince Weaver * vweaver1@eecs.utk.edu * Merge all of the various copies of linux-related * memory detection info this file. */ #include #include #include #include #include #include "papi.h" #include "papi_internal.h" #include "papi_memory.h" /* papi_calloc() */ #include "x86_cpuid_info.h" #include "linux-lock.h" /* 2.6.19 has this: VmPeak: 4588 kB VmSize: 4584 kB VmLck: 0 kB VmHWM: 1548 kB VmRSS: 1548 kB VmData: 312 kB VmStk: 88 kB VmExe: 684 kB VmLib: 1360 kB VmPTE: 20 kB */ int generic_get_memory_info( PAPI_hw_info_t *hw_info ); int _linux_get_dmem_info( PAPI_dmem_info_t * d ) { char fn[PATH_MAX], tmp[PATH_MAX]; FILE *f; int ret; long long sz = 0, lck = 0, res = 0, shr = 0, stk = 0, txt = 0, dat = 0, dum = 0, lib = 0, hwm = 0; sprintf( fn, "/proc/%ld/status", ( long ) getpid( ) ); f = fopen( fn, "r" ); if ( f == NULL ) { PAPIERROR( "fopen(%s): %s\n", fn, strerror( errno ) ); return PAPI_ESYS; } while ( 1 ) { if ( fgets( tmp, PATH_MAX, f ) == NULL ) break; if ( strspn( tmp, "VmSize:" ) == strlen( "VmSize:" ) ) { sscanf( tmp + strlen( "VmSize:" ), "%lld", &sz ); d->size = sz; continue; } if ( strspn( tmp, "VmHWM:" ) == strlen( "VmHWM:" ) ) { sscanf( tmp + strlen( "VmHWM:" ), "%lld", &hwm ); d->high_water_mark = hwm; continue; } if ( strspn( tmp, "VmLck:" ) == strlen( "VmLck:" ) ) { sscanf( tmp + strlen( "VmLck:" ), "%lld", &lck ); d->locked = lck; continue; } if ( strspn( tmp, "VmRSS:" ) == strlen( "VmRSS:" ) ) { sscanf( tmp + strlen( "VmRSS:" ), "%lld", &res ); d->resident = res; continue; } if ( 
strspn( tmp, "VmData:" ) == strlen( "VmData:" ) ) { sscanf( tmp + strlen( "VmData:" ), "%lld", &dat ); d->heap = dat; continue; } if ( strspn( tmp, "VmStk:" ) == strlen( "VmStk:" ) ) { sscanf( tmp + strlen( "VmStk:" ), "%lld", &stk ); d->stack = stk; continue; } if ( strspn( tmp, "VmExe:" ) == strlen( "VmExe:" ) ) { sscanf( tmp + strlen( "VmExe:" ), "%lld", &txt ); d->text = txt; continue; } if ( strspn( tmp, "VmLib:" ) == strlen( "VmLib:" ) ) { sscanf( tmp + strlen( "VmLib:" ), "%lld", &lib ); d->library = lib; continue; } } fclose( f ); sprintf( fn, "/proc/%ld/statm", ( long ) getpid( ) ); f = fopen( fn, "r" ); if ( f == NULL ) { PAPIERROR( "fopen(%s): %s\n", fn, strerror( errno ) ); return PAPI_ESYS; } ret = fscanf( f, "%lld %lld %lld %lld %lld %lld %lld", &dum, &dum, &shr, &dum, &dum, &dat, &dum ); if ( ret != 7 ) { PAPIERROR( "fscanf(7 items): %d\n", ret ); fclose(f); return PAPI_ESYS; } d->pagesize = getpagesize( ); d->shared = ( shr * d->pagesize ) / 1024; fclose( f ); return PAPI_OK; } /* * Architecture-specific cache detection code */ #if defined(__i386__)||defined(__x86_64__) static int x86_get_memory_info( PAPI_hw_info_t * hw_info ) { int retval = PAPI_OK; switch ( hw_info->vendor ) { case PAPI_VENDOR_AMD: case PAPI_VENDOR_INTEL: retval = _x86_cache_info( &hw_info->mem_hierarchy ); break; default: PAPIERROR( "Unknown vendor in memory information call for x86." 
); return PAPI_ENOIMPL; } return retval; } #endif #if defined(__ia64__) static int get_number( char *buf ) { char numbers[] = "0123456789"; int num; char *tmp, *end; tmp = strpbrk( buf, numbers ); if ( tmp != NULL ) { end = tmp; while ( isdigit( *end ) ) end++; *end = '\0'; num = atoi( tmp ); return num; } PAPIERROR( "Number could not be parsed from %s", buf ); return -1; } static void fline( FILE * fp, char *rline ) { char *tmp, *end, c; tmp = rline; end = &rline[1023]; memset( rline, '\0', 1024 ); do { if ( feof( fp ) ) return; c = getc( fp ); } while ( isspace( c ) || c == '\n' || c == '\r' ); ungetc( c, fp ); for ( ;; ) { if ( feof( fp ) ) { return; } c = getc( fp ); if ( c == '\n' || c == '\r' ) break; *tmp++ = c; if ( tmp == end ) { *tmp = '\0'; return; } } return; } static int ia64_get_memory_info( PAPI_hw_info_t * hw_info ) { int retval = 0; FILE *f; int clevel = 0, cindex = -1; char buf[1024]; int num, i, j; PAPI_mh_info_t *meminfo = &hw_info->mem_hierarchy; PAPI_mh_level_t *L = hw_info->mem_hierarchy.level; f = fopen( "/proc/pal/cpu0/cache_info", "r" ); if ( !f ) { PAPIERROR( "fopen(/proc/pal/cpu0/cache_info) returned < 0" ); return PAPI_ESYS; } while ( !feof( f ) ) { fline( f, buf ); if ( buf[0] == '\0' ) break; if ( !strncmp( buf, "Data Cache", 10 ) ) { cindex = 1; clevel = get_number( buf ); L[clevel - 1].cache[cindex].type = PAPI_MH_TYPE_DATA; } else if ( !strncmp( buf, "Instruction Cache", 17 ) ) { cindex = 0; clevel = get_number( buf ); L[clevel - 1].cache[cindex].type = PAPI_MH_TYPE_INST; } else if ( !strncmp( buf, "Data/Instruction Cache", 22 ) ) { cindex = 0; clevel = get_number( buf ); L[clevel - 1].cache[cindex].type = PAPI_MH_TYPE_UNIFIED; } else { if ( ( clevel == 0 || clevel > 3 ) && cindex >= 0 ) { PAPIERROR ( "Cache type could not be recognized, please send /proc/pal/cpu0/cache_info" ); return PAPI_EBUG; } if ( !strncmp( buf, "Size", 4 ) ) { num = get_number( buf ); L[clevel - 1].cache[cindex].size = num; } else if ( !strncmp( buf, 
"Associativity", 13 ) ) { num = get_number( buf ); L[clevel - 1].cache[cindex].associativity = num; } else if ( !strncmp( buf, "Line size", 9 ) ) { num = get_number( buf ); L[clevel - 1].cache[cindex].line_size = num; L[clevel - 1].cache[cindex].num_lines = L[clevel - 1].cache[cindex].size / num; } } } fclose( f ); f = fopen( "/proc/pal/cpu0/vm_info", "r" ); /* No errors on fopen as I am not sure this is always on the systems */ if ( f != NULL ) { cindex = -1; clevel = 0; while ( !feof( f ) ) { fline( f, buf ); if ( buf[0] == '\0' ) break; if ( !strncmp( buf, "Data Translation", 16 ) ) { cindex = 1; clevel = get_number( buf ); L[clevel - 1].tlb[cindex].type = PAPI_MH_TYPE_DATA; } else if ( !strncmp( buf, "Instruction Translation", 23 ) ) { cindex = 0; clevel = get_number( buf ); L[clevel - 1].tlb[cindex].type = PAPI_MH_TYPE_INST; } else { if ( ( clevel == 0 || clevel > 2 ) && cindex >= 0 ) { PAPIERROR ( "TLB type could not be recognized, send /proc/pal/cpu0/vm_info" ); return PAPI_EBUG; } if ( !strncmp( buf, "Number of entries", 17 ) ) { num = get_number( buf ); L[clevel - 1].tlb[cindex].num_entries = num; } else if ( !strncmp( buf, "Associativity", 13 ) ) { num = get_number( buf ); L[clevel - 1].tlb[cindex].associativity = num; } } } fclose( f ); } /* Compute and store the number of levels of hierarchy actually used */ for ( i = 0; i < PAPI_MH_MAX_LEVELS; i++ ) { for ( j = 0; j < 2; j++ ) { if ( L[i].tlb[j].type != PAPI_MH_TYPE_EMPTY || L[i].cache[j].type != PAPI_MH_TYPE_EMPTY ) meminfo->levels = i + 1; } } return retval; } #endif #if defined(__powerpc__) PAPI_mh_info_t sys_mem_info[] = { {2, // 970 begin { { // level 1 begins { // tlb's begin {PAPI_MH_TYPE_UNIFIED, 1024, 4, 0} , {PAPI_MH_TYPE_EMPTY, -1, -1, -1} } , { // caches begin {PAPI_MH_TYPE_INST, 65536, 128, 512, 1} , {PAPI_MH_TYPE_DATA, 32768, 128, 256, 2} } } , { // level 2 begins { // tlb's begin {PAPI_MH_TYPE_EMPTY, -1, -1, -1} , {PAPI_MH_TYPE_EMPTY, -1, -1, -1} } , { // caches begin 
{PAPI_MH_TYPE_UNIFIED, 524288, 128, 4096, 8} , {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } } , } } , // 970 end {3, { { // level 1 begins { // tlb's begin {PAPI_MH_TYPE_UNIFIED, 1024, 4, 0} , {PAPI_MH_TYPE_EMPTY, -1, -1, -1} } , { // caches begin {PAPI_MH_TYPE_INST, 65536, 128, 512, 2} , {PAPI_MH_TYPE_DATA, 32768, 128, 256, 4} } } , { // level 2 begins { // tlb's begin {PAPI_MH_TYPE_EMPTY, -1, -1, -1} , {PAPI_MH_TYPE_EMPTY, -1, -1, -1} } , { // caches begin {PAPI_MH_TYPE_UNIFIED, 1966080, 128, 15360, 10} , {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } } , { // level 3 begins { // tlb's begin {PAPI_MH_TYPE_EMPTY, -1, -1, -1} , {PAPI_MH_TYPE_EMPTY, -1, -1, -1} } , { // caches begin {PAPI_MH_TYPE_UNIFIED, 37748736, 256, 147456, 12} , {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } } , } } , // POWER5 end {3, { { // level 1 begins { // tlb's begin /// POWER6 has an ERAT (Effective to Real Address /// Translation) instead of a TLB. For the purposes of this /// data, we will treat it like a TLB. {PAPI_MH_TYPE_INST, 128, 2, 0} , {PAPI_MH_TYPE_DATA, 128, 128, 0} } , { // caches begin {PAPI_MH_TYPE_INST, 65536, 128, 512, 4} , {PAPI_MH_TYPE_DATA, 65536, 128, 512, 8} } } , { // level 2 begins { // tlb's begin {PAPI_MH_TYPE_EMPTY, -1, -1, -1} , {PAPI_MH_TYPE_EMPTY, -1, -1, -1} } , { // caches begin {PAPI_MH_TYPE_UNIFIED, 4194304, 128, 16384, 8} , {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } } , { // level 3 begins { // tlb's begin {PAPI_MH_TYPE_EMPTY, -1, -1, -1} , {PAPI_MH_TYPE_EMPTY, -1, -1, -1} } , { // caches begin /// POWER6 has a 2 slice L3 cache. Each slice is 16MB, so /// combined they are 32MB and usable by each core. For /// this reason, we will treat it as a single 32MB cache. {PAPI_MH_TYPE_UNIFIED, 33554432, 128, 262144, 16} , {PAPI_MH_TYPE_EMPTY, -1, -1, -1, -1} } } , } } , // POWER6 end {3, { [0] = { // level 1 begins .tlb = { /// POWER7 has an ERAT (Effective to Real Address /// Translation) instead of a TLB. For the purposes of this /// data, we will treat it like a TLB. 
[0] = { .type = PAPI_MH_TYPE_INST, .num_entries = 64, .page_size = 0, .associativity = 2 } , [1] = { .type = PAPI_MH_TYPE_DATA, .num_entries = 64, .page_size = 0, .associativity = SHRT_MAX } } , .cache = { // level 1 caches begin [0] = { .type = PAPI_MH_TYPE_INST | PAPI_MH_TYPE_PSEUDO_LRU, .size = 32768, .line_size = 128, .num_lines = 64, .associativity = 4 } , [1] = { .type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_WT | PAPI_MH_TYPE_LRU, .size = 32768, .line_size = 128, .num_lines = 32, .associativity = 8 } } } , [1] = { // level 2 begins .tlb = { [0] = { .type = PAPI_MH_TYPE_EMPTY, .num_entries = -1, .page_size = -1, .associativity = -1 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .num_entries = -1, .page_size = -1, .associativity = -1 } } , .cache = { [0] = { .type = PAPI_MH_TYPE_UNIFIED | PAPI_MH_TYPE_PSEUDO_LRU, .size = 524288, .line_size = 128, .num_lines = 256, .associativity = 8 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .size = -1, .line_size = -1, .num_lines = -1, .associativity = -1 } } } , [2] = { // level 3 begins .tlb = { [0] = { .type = PAPI_MH_TYPE_EMPTY, .num_entries = -1, .page_size = -1, .associativity = -1 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .num_entries = -1, .page_size = -1, .associativity = -1 } } , .cache = { [0] = { .type = PAPI_MH_TYPE_UNIFIED | PAPI_MH_TYPE_PSEUDO_LRU, .size = 4194304, .line_size = 128, .num_lines = 4096, .associativity = 8 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .size = -1, .line_size = -1, .num_lines = -1, .associativity = -1 } } } , } }, // POWER7 end {3, { [0] = { // level 1 begins .tlb = { /// POWER8 has an ERAT (Effective to Real Address /// Translation) instead of a TLB. For the purposes of this /// data, we will treat it like a TLB. 
[0] = { .type = PAPI_MH_TYPE_INST, .num_entries = 72, .page_size = 0, .associativity = SHRT_MAX } , [1] = { .type = PAPI_MH_TYPE_DATA, .num_entries = 48, .page_size = 0, .associativity = SHRT_MAX } } , .cache = { // level 1 caches begin [0] = { .type = PAPI_MH_TYPE_INST | PAPI_MH_TYPE_PSEUDO_LRU, .size = 32768, .line_size = 128, .num_lines = 64, .associativity = 8 } , [1] = { .type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_WT | PAPI_MH_TYPE_LRU, .size = 65536, .line_size = 128, .num_lines = 512, .associativity = 8 } } } , [1] = { // level 2 begins .tlb = { [0] = { .type = PAPI_MH_TYPE_UNIFIED, .num_entries = 2048, .page_size = 0, .associativity = 4 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .num_entries = -1, .page_size = -1, .associativity = -1 } } , .cache = { [0] = { .type = PAPI_MH_TYPE_UNIFIED | PAPI_MH_TYPE_PSEUDO_LRU, .size = 262144, .line_size = 128, .num_lines = 256, .associativity = 8 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .size = -1, .line_size = -1, .num_lines = -1, .associativity = -1 } } } , [2] = { // level 3 begins .tlb = { [0] = { .type = PAPI_MH_TYPE_EMPTY, .num_entries = -1, .page_size = -1, .associativity = -1 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .num_entries = -1, .page_size = -1, .associativity = -1 } } , .cache = { [0] = { .type = PAPI_MH_TYPE_UNIFIED | PAPI_MH_TYPE_PSEUDO_LRU, .size = 8388608, .line_size = 128, .num_lines = 65536, .associativity = 8 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .size = -1, .line_size = -1, .num_lines = -1, .associativity = -1 } } } , } }, // POWER8 end {3, { [0] = { // level 1 begins .tlb = { /// POWER9 has an ERAT (Effective to Real Address /// Translation) instead of a TLB. For the purposes of this /// data, we will treat it like a TLB. 
[0] = { .type = PAPI_MH_TYPE_INST, .num_entries = 64, .page_size = 0, .associativity = SHRT_MAX } , [1] = { .type = PAPI_MH_TYPE_DATA, .num_entries = 64, .page_size = 0, .associativity = SHRT_MAX } } , .cache = { // level 1 caches begin [0] = { .type = PAPI_MH_TYPE_INST | PAPI_MH_TYPE_PSEUDO_LRU, .size = 32768, .line_size = 128, .num_lines = 256, .associativity = 8 } , [1] = { .type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_WT | PAPI_MH_TYPE_LRU, .size = 32768, .line_size = 128, .num_lines = 256, .associativity = 8 } } } , [1] = { // level 2 begins .tlb = { [0] = { .type = PAPI_MH_TYPE_UNIFIED, .num_entries = 1024, .page_size = 0, .associativity = 4 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .num_entries = -1, .page_size = -1, .associativity = -1 } } , .cache = { [0] = { .type = PAPI_MH_TYPE_UNIFIED | PAPI_MH_TYPE_PSEUDO_LRU, .size = 524288, .line_size = 128, .num_lines = 4096, .associativity = 8 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .size = -1, .line_size = -1, .num_lines = -1, .associativity = -1 } } } , [2] = { // level 3 begins .tlb = { [0] = { .type = PAPI_MH_TYPE_EMPTY, .num_entries = -1, .page_size = -1, .associativity = -1 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .num_entries = -1, .page_size = -1, .associativity = -1 } } , .cache = { [0] = { .type = PAPI_MH_TYPE_UNIFIED | PAPI_MH_TYPE_PSEUDO_LRU, .size = 10485760, .line_size = 128, .num_lines = 81920, .associativity = 20 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .size = -1, .line_size = -1, .num_lines = -1, .associativity = -1 } } } , } }, {3, { [0] = { // level 1 begins .tlb = { /// POWER10 has an ERAT (Effective to Real Address /// Translation) instead of a TLB. For the purposes of this /// data, we will treat it like a TLB. 
[0] = { .type = PAPI_MH_TYPE_INST, .num_entries = 64, .page_size = 0, .associativity = SHRT_MAX } , [1] = { .type = PAPI_MH_TYPE_DATA, .num_entries = 64, .page_size = 0, .associativity = SHRT_MAX } } , .cache = { // level 1 caches begin [0] = { .type = PAPI_MH_TYPE_INST | PAPI_MH_TYPE_PSEUDO_LRU, .size = 49152, .line_size = 128, .num_lines = 384, .associativity = 6 } , [1] = { .type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_WT | PAPI_MH_TYPE_LRU, .size = 32768, .line_size = 128, .num_lines = 256, .associativity = 8 } } } , [1] = { // level 2 begins .tlb = { [0] = { .type = PAPI_MH_TYPE_UNIFIED, .num_entries = 4096, .page_size = 0, .associativity = 4 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .num_entries = -1, .page_size = -1, .associativity = -1 } } , .cache = { [0] = { .type = PAPI_MH_TYPE_UNIFIED | PAPI_MH_TYPE_PSEUDO_LRU, .size = 1048576, .line_size = 128, .num_lines = 8192, .associativity = 8 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .size = -1, .line_size = -1, .num_lines = -1, .associativity = -1 } } } , [2] = { // level 3 begins .tlb = { [0] = { .type = PAPI_MH_TYPE_EMPTY, .num_entries = -1, .page_size = -1, .associativity = -1 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .num_entries = -1, .page_size = -1, .associativity = -1 } } , .cache = { [0] = { .type = PAPI_MH_TYPE_UNIFIED | PAPI_MH_TYPE_PSEUDO_LRU, .size = 4194304, .line_size = 128, .num_lines = 32768, .associativity = 16 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .size = -1, .line_size = -1, .num_lines = -1, .associativity = -1 } } } , } } // POWER10 end }; #define SPRN_PVR 0x11F /* Processor Version Register */ #define PVR_PROCESSOR_SHIFT 16 static unsigned int mfpvr( void ) { unsigned long pvr; asm( "mfspr %0,%1": "=r"( pvr ):"i"( SPRN_PVR ) ); return pvr; } int ppc64_get_memory_info( PAPI_hw_info_t * hw_info ) { unsigned int pvr = mfpvr( ) >> PVR_PROCESSOR_SHIFT; int index; switch ( pvr ) { case 0x39: /* PPC970 */ case 0x3C: /* PPC970FX */ case 0x44: /* PPC970MP */ case 0x45: /* PPC970GX */ index = 0; break; case 
0x3A: /* POWER5 */ case 0x3B: /* POWER5+ */ index = 1; break; case 0x3E: /* POWER6 */ index = 2; break; case 0x3F: /* POWER7 */ index = 3; break; case 0x4b: /* POWER8 */ index = 4; break; case 0x4e: /* POWER9 */ index = 5; break; case 0x80: /* POWER10 */ index = 6; break; default: index = -1; hw_info->mem_hierarchy.levels = 0; break; } if ( index != -1 ) { int cache_level; PAPI_mh_info_t sys_mh_inf = sys_mem_info[index]; PAPI_mh_info_t *mh_inf = &hw_info->mem_hierarchy; mh_inf->levels = sys_mh_inf.levels; PAPI_mh_level_t *level = mh_inf->level; PAPI_mh_level_t sys_mh_level; for ( cache_level = 0; cache_level < sys_mh_inf.levels; cache_level++ ) { sys_mh_level = sys_mh_inf.level[cache_level]; int cache_idx; for ( cache_idx = 0; cache_idx < 2; cache_idx++ ) { // process TLB info PAPI_mh_tlb_info_t curr_tlb = sys_mh_level.tlb[cache_idx]; int type = curr_tlb.type; if ( type != PAPI_MH_TYPE_EMPTY ) { level[cache_level].tlb[cache_idx].type = type; level[cache_level].tlb[cache_idx].associativity = curr_tlb.associativity; level[cache_level].tlb[cache_idx].num_entries = curr_tlb.num_entries; } } for ( cache_idx = 0; cache_idx < 2; cache_idx++ ) { // process cache info PAPI_mh_cache_info_t curr_cache = sys_mh_level.cache[cache_idx]; int type = curr_cache.type; if ( type != PAPI_MH_TYPE_EMPTY ) { level[cache_level].cache[cache_idx].type = type; level[cache_level].cache[cache_idx].associativity = curr_cache.associativity; level[cache_level].cache[cache_idx].size = curr_cache.size; level[cache_level].cache[cache_idx].line_size = curr_cache.line_size; level[cache_level].cache[cache_idx].num_lines = curr_cache.num_lines; } } } } return 0; } #endif #if defined(__sparc__) static int sparc_sysfs_cpu_attr( char *name, char **result ) { const char *path_base = "/sys/devices/system/cpu/"; char path_buf[PATH_MAX]; char val_buf[32]; DIR *sys_cpu; sys_cpu = opendir( path_base ); if ( sys_cpu ) { struct dirent *cpu; while ( ( cpu = readdir( sys_cpu ) ) != NULL ) { int fd; if ( strncmp( 
"cpu", cpu->d_name, 3 ) ) continue; strcpy( path_buf, path_base ); strcat( path_buf, cpu->d_name ); strcat( path_buf, "/" ); strcat( path_buf, name ); fd = open( path_buf, O_RDONLY ); if ( fd < 0 ) continue; if ( read( fd, val_buf, 32 ) < 0 ) continue; close( fd ); *result = strdup( val_buf ); return 0; } } closedir( sys_cpu ); return -1; } static int sparc_cpu_attr( char *name, unsigned long long *val ) { char *buf; int r; r = sparc_sysfs_cpu_attr( name, &buf ); if ( r == -1 ) return -1; sscanf( buf, "%llu", val ); free( buf ); return 0; } static char * search_cpu_info( FILE * f, char *search_str, char *line ) { /* This code courtesy of our friends in Germany. Thanks Rudolph Berrend\ orf! */ /* See the home page for the German version of PAPI. */ char *s; while ( fgets( line, 256, f ) != NULL ) { if ( strstr( line, search_str ) != NULL ) { /* ignore all characters in line up to : */ for ( s = line; *s && ( *s != ':' ); ++s ); if ( *s ) return s; } } return NULL; /* End stolen code */ } static int sparc_get_memory_info( PAPI_hw_info_t * hw_info ) { unsigned long long cache_size, cache_line_size; /* unsigned long long cycles_per_second; */ char maxargs[PAPI_HUGE_STR_LEN]; /* PAPI_mh_tlb_info_t *tlb; */ PAPI_mh_level_t *level; char *s, *t; FILE *f; /* First, fix up the cpu vendor/model/etc. 
values */ strcpy( hw_info->vendor_string, "Sun" ); hw_info->vendor = PAPI_VENDOR_SUN; f = fopen( "/proc/cpuinfo", "r" ); if ( !f ) return PAPI_ESYS; rewind( f ); s = search_cpu_info( f, "cpu", maxargs ); if ( !s ) { fclose( f ); return PAPI_ESYS; } t = strchr( s + 2, '\n' ); if ( !t ) { fclose( f ); return PAPI_ESYS; } *t = '\0'; strcpy( hw_info->model_string, s + 2 ); fclose( f ); /* if ( sparc_sysfs_cpu_attr( "clock_tick", &s ) == -1 ) return PAPI_ESYS; sscanf( s, "%llu", &cycles_per_second ); free( s ); hw_info->mhz = cycles_per_second / 1000000; hw_info->clock_mhz = hw_info->mhz; */ /* Now fetch the cache info */ hw_info->mem_hierarchy.levels = 3; level = &hw_info->mem_hierarchy.level[0]; sparc_cpu_attr( "l1_icache_size", &cache_size ); sparc_cpu_attr( "l1_icache_line_size", &cache_line_size ); level[0].cache[0].type = PAPI_MH_TYPE_INST; level[0].cache[0].size = cache_size; level[0].cache[0].line_size = cache_line_size; level[0].cache[0].num_lines = cache_size / cache_line_size; level[0].cache[0].associativity = 1; sparc_cpu_attr( "l1_dcache_size", &cache_size ); sparc_cpu_attr( "l1_dcache_line_size", &cache_line_size ); level[0].cache[1].type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_WT; level[0].cache[1].size = cache_size; level[0].cache[1].line_size = cache_line_size; level[0].cache[1].num_lines = cache_size / cache_line_size; level[0].cache[1].associativity = 1; sparc_cpu_attr( "l2_cache_size", &cache_size ); sparc_cpu_attr( "l2_cache_line_size", &cache_line_size ); level[1].cache[0].type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_WB; level[1].cache[0].size = cache_size; level[1].cache[0].line_size = cache_line_size; level[1].cache[0].num_lines = cache_size / cache_line_size; level[1].cache[0].associativity = 1; #if 0 tlb = &hw_info->mem_hierarchy.level[0].tlb[0]; switch ( _perfmon2_pfm_pmu_type ) { case PFMLIB_SPARC_ULTRA12_PMU: tlb[0].type = PAPI_MH_TYPE_INST | PAPI_MH_TYPE_PSEUDO_LRU; tlb[0].num_entries = 64; tlb[0].associativity = SHRT_MAX; tlb[1].type = 
PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_PSEUDO_LRU; tlb[1].num_entries = 64; tlb[1].associativity = SHRT_MAX; break; case PFMLIB_SPARC_ULTRA3_PMU: case PFMLIB_SPARC_ULTRA3I_PMU: case PFMLIB_SPARC_ULTRA3PLUS_PMU: case PFMLIB_SPARC_ULTRA4PLUS_PMU: level[0].cache[0].associativity = 4; level[0].cache[1].associativity = 4; level[1].cache[0].associativity = 4; tlb[0].type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_PSEUDO_LRU; tlb[0].num_entries = 16; tlb[0].associativity = SHRT_MAX; tlb[1].type = PAPI_MH_TYPE_INST | PAPI_MH_TYPE_PSEUDO_LRU; tlb[1].num_entries = 16; tlb[1].associativity = SHRT_MAX; tlb[2].type = PAPI_MH_TYPE_DATA; tlb[2].num_entries = 1024; tlb[2].associativity = 2; tlb[3].type = PAPI_MH_TYPE_INST; tlb[3].num_entries = 128; tlb[3].associativity = 2; break; case PFMLIB_SPARC_NIAGARA1: level[0].cache[0].associativity = 4; level[0].cache[1].associativity = 4; level[1].cache[0].associativity = 12; tlb[0].type = PAPI_MH_TYPE_INST | PAPI_MH_TYPE_PSEUDO_LRU; tlb[0].num_entries = 64; tlb[0].associativity = SHRT_MAX; tlb[1].type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_PSEUDO_LRU; tlb[1].num_entries = 64; tlb[1].associativity = SHRT_MAX; break; case PFMLIB_SPARC_NIAGARA2: level[0].cache[0].associativity = 8; level[0].cache[1].associativity = 4; level[1].cache[0].associativity = 16; tlb[0].type = PAPI_MH_TYPE_INST | PAPI_MH_TYPE_PSEUDO_LRU; tlb[0].num_entries = 64; tlb[0].associativity = SHRT_MAX; tlb[1].type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_PSEUDO_LRU; tlb[1].num_entries = 128; tlb[1].associativity = SHRT_MAX; break; } #endif return 0; } #endif #if defined(__aarch64__) PAPI_mh_info_t sys_mem_info[] = { {2, // Fujitsu A64FX begin { [0] = { // level 1 begins .tlb = { [0] = { .type = PAPI_MH_TYPE_INST | PAPI_MH_TYPE_FIFO, .num_entries = 16, .page_size = 65536, .associativity = SHRT_MAX } , [1] = { .type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_FIFO, .num_entries = 16, .page_size = 65536, .associativity = SHRT_MAX } } , .cache = { // level 1 caches begin [0] = { .type = PAPI_MH_TYPE_INST, 
.size = 65536, .line_size = 256, .num_lines = 64, .associativity = 4 } , [1] = { .type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_WB, .size = 65536, .line_size = 256, .num_lines = 64, .associativity = 4 } } } , [1] = { // level 2 begins .tlb = { [0] = { .type = PAPI_MH_TYPE_INST | PAPI_MH_TYPE_LRU, .num_entries = 1024, .page_size = 65536, .associativity = 4 } , [1] = { .type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_LRU, .num_entries = 1024, .page_size = 65536, .associativity = 4 } } , .cache = { [0] = { .type = PAPI_MH_TYPE_UNIFIED | PAPI_MH_TYPE_WB, .size = 8388608, .line_size = 256, .num_lines = 2048, .associativity = 16 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .size = -1, .line_size = -1, .num_lines = -1, .associativity = -1 } } } , } }, // Fujitsu A64FX end {3, // ARM Neoverse V2 begin { [0] = { // level 1 begins .tlb = { [0] = { .type = PAPI_MH_TYPE_INST | PAPI_MH_TYPE_FIFO, .num_entries = 48, .page_size = -1, .associativity = SHRT_MAX } , [1] = { .type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_FIFO, .num_entries = 48, .page_size = -1, .associativity = SHRT_MAX } } } , [1] = { // level 2 begins .tlb = { [0] = { .type = PAPI_MH_TYPE_UNIFIED, .num_entries = 2048, .page_size = -1, .associativity = 8 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .num_entries = -1, .page_size = -1, .associativity = -1 } } } , [2] = { // level 3 begins .tlb = { [0] = { .type = PAPI_MH_TYPE_EMPTY, .num_entries = -1, .page_size = -1, .associativity = -1 } , [1] = { .type = PAPI_MH_TYPE_EMPTY, .num_entries = -1, .page_size = -1, .associativity = -1 } } } , } } // ARM Neoverse V2 end }; #define IMPLEMENTER_FUJITSU 0x46 #define PARTNUM_FUJITSU_A64FX 0x001 #define IMPLEMENTER_ARM 0x41 #define PARTNUM_ARM_NEOVERSE_V2 0xd4f int aarch64_get_memory_info( PAPI_hw_info_t * hw_info ) { unsigned int implementer, partnum; implementer = hw_info->vendor; partnum = hw_info->cpuid_model; int index = -1; switch ( implementer ) { case IMPLEMENTER_FUJITSU: switch ( partnum ) { case PARTNUM_FUJITSU_A64FX: /* Fujitsu A64FX */ index = 
0; break; default: generic_get_memory_info (hw_info); return 0; } break; case IMPLEMENTER_ARM: switch ( partnum ) { case PARTNUM_ARM_NEOVERSE_V2: /* ARM Neoverse V2 */ index = 1; generic_get_memory_info (hw_info); break; default: generic_get_memory_info (hw_info); return 0; } break; default: generic_get_memory_info (hw_info); return 0; } if ( index != -1 ) { int cache_level; PAPI_mh_info_t sys_mh_inf = sys_mem_info[index]; PAPI_mh_info_t *mh_inf = &hw_info->mem_hierarchy; mh_inf->levels = sys_mh_inf.levels; PAPI_mh_level_t *level = mh_inf->level; PAPI_mh_level_t sys_mh_level; for ( cache_level = 0; cache_level < sys_mh_inf.levels; cache_level++ ) { sys_mh_level = sys_mh_inf.level[cache_level]; int cache_idx; for ( cache_idx = 0; cache_idx < 2; cache_idx++ ) { // process TLB info PAPI_mh_tlb_info_t curr_tlb = sys_mh_level.tlb[cache_idx]; int type = curr_tlb.type; if ( type != PAPI_MH_TYPE_EMPTY ) { level[cache_level].tlb[cache_idx].type = type; level[cache_level].tlb[cache_idx].associativity = curr_tlb.associativity; level[cache_level].tlb[cache_idx].num_entries = curr_tlb.num_entries; } } for ( cache_idx = 0; cache_idx < 2; cache_idx++ ) { // process cache info PAPI_mh_cache_info_t curr_cache = sys_mh_level.cache[cache_idx]; int type = curr_cache.type; if ( type != PAPI_MH_TYPE_EMPTY ) { level[cache_level].cache[cache_idx].type = type; level[cache_level].cache[cache_idx].associativity = curr_cache.associativity; level[cache_level].cache[cache_idx].size = curr_cache.size; level[cache_level].cache[cache_idx].line_size = curr_cache.line_size; level[cache_level].cache[cache_idx].num_lines = curr_cache.num_lines; } } } } return 0; } #endif /* Fallback Linux code to read the cache info from /sys */ int generic_get_memory_info( PAPI_hw_info_t *hw_info ) { int type=0,level,result; int size,line_size,associativity,sets; int write_policy,allocation_policy; DIR *dir; FILE *fff; char filename[BUFSIZ],type_string[BUFSIZ]; char *str_result; char 
write_policy_string[BUFSIZ],allocation_policy_string[BUFSIZ]; struct dirent *d; int max_level=0; int level_count,level_index; PAPI_mh_level_t *L = hw_info->mem_hierarchy.level; /* open Linux cache dir */ /* assume all CPUs same as cpu0. */ /* Not necessarily a good assumption */ dir=opendir("/sys/devices/system/cpu/cpu0/cache"); if (dir==NULL) { goto unrecoverable_error; } for (level_index=0; level_index < PAPI_MAX_MEM_HIERARCHY_LEVELS; ++level_index) { for (level_count = 0; level_count < PAPI_MH_MAX_LEVELS; ++level_count) { L[level_index].cache[level_count].type = PAPI_MH_TYPE_EMPTY; } } while(1) { d = readdir(dir); if (d==NULL) break; if (strncmp(d->d_name, "index", 5)) continue; MEMDBG("Found %s\n",d->d_name); /*************/ /* Get level */ /*************/ sprintf(filename, "/sys/devices/system/cpu/cpu0/cache/%s/level", d->d_name); fff=fopen(filename,"r"); if (fff==NULL) { MEMDBG("Cannot open level.\n"); goto unrecoverable_error; } result=fscanf(fff,"%d",&level); fclose(fff); if (result!=1) { MEMDBG("Could not read cache level\n"); goto unrecoverable_error; } /* Index arrays from 0 */ level_index=level-1; level_count = 0; while (L[level_index].cache[level_count].type != PAPI_MH_TYPE_EMPTY) { level_count++; if (level_count>=PAPI_MH_MAX_LEVELS) { break; } } if (level_count>=PAPI_MH_MAX_LEVELS) { MEMDBG("Exceeded maximum levels %d\n", PAPI_MH_MAX_LEVELS); break; } /************/ /* Get type */ /************/ sprintf(filename, "/sys/devices/system/cpu/cpu0/cache/%s/type",d->d_name); fff=fopen(filename,"r"); if (fff==NULL) { MEMDBG("Cannot open type\n"); goto unrecoverable_error; } str_result=fgets(type_string, BUFSIZ, fff); fclose(fff); if (str_result==NULL) { MEMDBG("Could not read cache type\n"); goto unrecoverable_error; } char *cache_type = "Data"; if (!strncmp(type_string,cache_type,strlen(cache_type))) { type=PAPI_MH_TYPE_DATA; } cache_type = "Instruction"; if (!strncmp(type_string,cache_type,strlen(cache_type))) { type=PAPI_MH_TYPE_INST; } cache_type = 
"Unified"; if (!strncmp(type_string,cache_type,strlen(cache_type))) { type=PAPI_MH_TYPE_UNIFIED; } /********************/ /* Get write_policy */ /********************/ write_policy=0; sprintf(filename, "/sys/devices/system/cpu/cpu0/cache/%s/write_policy",d->d_name); fff=fopen(filename,"r"); if (fff==NULL) { MEMDBG("Cannot open write_policy\n"); goto get_allocation_policy; } str_result=fgets(write_policy_string, BUFSIZ, fff); fclose(fff); if (str_result==NULL) { MEMDBG("Could not read cache write_policy\n"); goto get_allocation_policy; } if (!strcmp(write_policy_string,"WriteThrough")) { write_policy=PAPI_MH_TYPE_WT; } if (!strcmp(write_policy_string,"WriteBack")) { write_policy=PAPI_MH_TYPE_WB; } get_allocation_policy: /*************************/ /* Get allocation_policy */ /*************************/ allocation_policy=0; sprintf(filename, "/sys/devices/system/cpu/cpu0/cache/%s/allocation_policy",d->d_name); fff=fopen(filename,"r"); if (fff==NULL) { MEMDBG("Cannot open allocation_policy\n"); goto get_size; } str_result=fgets(allocation_policy_string, BUFSIZ, fff); fclose(fff); if (str_result==NULL) { MEMDBG("Could not read cache allocation_policy\n"); goto get_size; } if (!strcmp(allocation_policy_string,"ReadAllocate")) { allocation_policy=PAPI_MH_TYPE_RD_ALLOC; } if (!strcmp(allocation_policy_string,"WriteAllocate")) { allocation_policy=PAPI_MH_TYPE_WR_ALLOC; } if (!strcmp(allocation_policy_string,"ReadWriteAllocate")) { allocation_policy=PAPI_MH_TYPE_RW_ALLOC; } get_size: L[level_index].cache[level_count].type=type | write_policy | allocation_policy; /*************/ /* Get Size */ /*************/ sprintf(filename, "/sys/devices/system/cpu/cpu0/cache/%s/size",d->d_name); fff=fopen(filename,"r"); if (fff==NULL) { MEMDBG("Cannot open size\n"); goto unrecoverable_error; } result=fscanf(fff,"%d",&size); fclose(fff); if (result!=1) { MEMDBG("Could not read cache size\n"); goto unrecoverable_error; } /* Linux reports in kB, PAPI expects in Bytes */ 
	L[level_index].cache[level_count].size=size*1024;

	/*************/
	/* Line Size */
	/*************/
	line_size=0;	/* reset the local each pass so the sanity check below never sees a stale value */
	L[level_index].cache[level_count].line_size=0;
	sprintf(filename,
		"/sys/devices/system/cpu/cpu0/cache/%s/coherency_line_size",
		d->d_name);
	fff=fopen(filename,"r");
	if (fff==NULL) {
		MEMDBG("Cannot open linesize\n");
	}
	else {
		result=fscanf(fff,"%d",&line_size);
		fclose(fff);
		if (result!=1) {
			MEMDBG("Could not read cache line-size\n");
		}
		else {
			L[level_index].cache[level_count].line_size=line_size;
		}
	}

	/*********************/
	/* Get Associativity */
	/*********************/
	associativity=0;
	L[level_index].cache[level_count].associativity=0;
	sprintf(filename,
		"/sys/devices/system/cpu/cpu0/cache/%s/ways_of_associativity",
		d->d_name);
	fff=fopen(filename,"r");
	if (fff==NULL) {
		MEMDBG("Cannot open associativity\n");
	}
	else {
		result=fscanf(fff,"%d",&associativity);
		fclose(fff);
		if (result!=1) {
			MEMDBG("Could not read cache associativity\n");
		}
		else {
			L[level_index].cache[level_count].associativity=associativity;
		}
	}

	/************/
	/* Get Sets */
	/************/
	sets=0;
	L[level_index].cache[level_count].num_lines=0;
	sprintf(filename,
		"/sys/devices/system/cpu/cpu0/cache/%s/number_of_sets",
		d->d_name);
	fff=fopen(filename,"r");
	if (fff==NULL) {
		MEMDBG("Cannot open sets\n");
	}
	else {
		result=fscanf(fff,"%d",&sets);
		fclose(fff);
		if (result!=1) {
			MEMDBG("Could not read cache sets\n");
		}
		else {
			L[level_index].cache[level_count].num_lines=sets;
		}
	}

	/* Sanity check results; only meaningful when all three values were read */
	if ((line_size!=0) && (associativity!=0) && (sets!=0)) {
		if (((size*1024)/line_size/associativity)!=sets) {
			MEMDBG("Warning! sets %d != expected %d\n",
				sets,((size*1024)/line_size/associativity));
		}
	}

	MEMDBG("\tL%d %s cache\n",level,type_string);
	MEMDBG("\t%d kilobytes\n",size);
	MEMDBG("\t%d byte linesize\n",line_size);
	MEMDBG("\t%d-way associative\n",associativity);
	MEMDBG("\t%d lines\n",sets);
	MEMDBG("\tUnknown inclusivity\n");
	MEMDBG("\tUnknown replacement algorithm\n");
	MEMDBG("\tUnknown if victim cache\n");

	if (level>max_level) max_level=level;

	if (level>PAPI_MAX_MEM_HIERARCHY_LEVELS) {
		MEMDBG("Exceeded maximum cache level %d\n",
			PAPI_MAX_MEM_HIERARCHY_LEVELS);
		break;
	}
   }

   closedir(dir);

   hw_info->mem_hierarchy.levels = max_level;

   return 0;

unrecoverable_error:

	/* Just say we have no cache */
	hw_info->mem_hierarchy.levels = 0;
	/* dir is NULL if the opendir() above is what failed */
	if (dir!=NULL) closedir(dir);
	return 0;
}


int
_linux_get_memory_info( PAPI_hw_info_t * hwinfo, int cpu_type )
{
	( void ) cpu_type;		/*unused */
	int retval = PAPI_OK;

#if defined(__i386__)||defined(__x86_64__)
	x86_get_memory_info( hwinfo );
#elif defined(__ia64__)
	ia64_get_memory_info( hwinfo );
#elif defined(__powerpc__)
	ppc64_get_memory_info( hwinfo );
#elif defined(__sparc__)
	sparc_get_memory_info( hwinfo );
#elif defined(__arm__)
#warning "WARNING! linux_get_memory_info() does nothing on ARM32!"
generic_get_memory_info (hwinfo); #elif defined(__aarch64__) aarch64_get_memory_info( hwinfo ); #else generic_get_memory_info (hwinfo); #endif return retval; } int _linux_update_shlib_info( papi_mdi_t *mdi ) { char fname[PAPI_HUGE_STR_LEN]; unsigned long t_index = 0, d_index = 0, b_index = 0, counting = 1; char buf[PAPI_HUGE_STR_LEN + PAPI_HUGE_STR_LEN], perm[5], dev[16]; char mapname[PAPI_HUGE_STR_LEN], lastmapname[PAPI_HUGE_STR_LEN]; unsigned long begin = 0, end = 0, size = 0, inode = 0, foo = 0; PAPI_address_map_t *tmp = NULL; FILE *f; memset( fname, 0x0, sizeof ( fname ) ); memset( buf, 0x0, sizeof ( buf ) ); memset( perm, 0x0, sizeof ( perm ) ); memset( dev, 0x0, sizeof ( dev ) ); memset( mapname, 0x0, sizeof ( mapname ) ); memset( lastmapname, 0x0, sizeof ( lastmapname ) ); sprintf( fname, "/proc/%ld/maps", ( long ) mdi->pid ); f = fopen( fname, "r" ); if ( !f ) { PAPIERROR( "fopen(%s) returned < 0", fname ); return PAPI_OK; } again: while ( !feof( f ) ) { begin = end = size = inode = foo = 0; if ( fgets( buf, sizeof ( buf ), f ) == 0 ) break; /* If mapname is null in the string to be scanned, we need to detect that */ if ( strlen( mapname ) ) strcpy( lastmapname, mapname ); else lastmapname[0] = '\0'; /* If mapname is null in the string to be scanned, we need to detect that */ mapname[0] = '\0'; sscanf( buf, "%lx-%lx %4s %lx %s %ld %s", &begin, &end, perm, &foo, dev, &inode, mapname ); size = end - begin; /* the permission string looks like "rwxp", where each character can * be either the letter, or a hyphen. The final character is either * p for private or s for shared. 
*/ if ( counting ) { if ( ( perm[2] == 'x' ) && ( perm[0] == 'r' ) && ( inode != 0 ) ) { if ( strcmp( mdi->exe_info.fullname, mapname ) == 0 ) { mdi->exe_info.address_info.text_start = ( vptr_t ) begin; mdi->exe_info.address_info.text_end = ( vptr_t ) ( begin + size ); } t_index++; } else if ( ( perm[0] == 'r' ) && ( perm[1] == 'w' ) && ( inode != 0 ) && ( strcmp ( mdi->exe_info.fullname, mapname ) == 0 ) ) { mdi->exe_info.address_info.data_start = ( vptr_t ) begin; mdi->exe_info.address_info.data_end = ( vptr_t ) ( begin + size ); d_index++; } else if ( ( perm[0] == 'r' ) && ( perm[1] == 'w' ) && ( inode == 0 ) && ( strcmp ( mdi->exe_info.fullname, lastmapname ) == 0 ) ) { mdi->exe_info.address_info.bss_start = ( vptr_t ) begin; mdi->exe_info.address_info.bss_end = ( vptr_t ) ( begin + size ); b_index++; } } else if ( !counting ) { if ( ( perm[2] == 'x' ) && ( perm[0] == 'r' ) && ( inode != 0 ) ) { if ( strcmp( mdi->exe_info.fullname, mapname ) != 0 ) { t_index++; tmp[t_index - 1].text_start = ( vptr_t ) begin; tmp[t_index - 1].text_end = ( vptr_t ) ( begin + size ); strncpy( tmp[t_index - 1].name, mapname, PAPI_HUGE_STR_LEN ); tmp[t_index - 1].name[PAPI_HUGE_STR_LEN-1]=0; } } else if ( ( perm[0] == 'r' ) && ( perm[1] == 'w' ) && ( inode != 0 ) ) { if ( ( strcmp ( mdi->exe_info.fullname, mapname ) != 0 ) && ( t_index > 0 ) && ( tmp[t_index - 1].data_start == 0 ) ) { tmp[t_index - 1].data_start = ( vptr_t ) begin; tmp[t_index - 1].data_end = ( vptr_t ) ( begin + size ); } } else if ( ( perm[0] == 'r' ) && ( perm[1] == 'w' ) && ( inode == 0 ) ) { if ( ( t_index > 0 ) && ( tmp[t_index - 1].bss_start == 0 ) ) { tmp[t_index - 1].bss_start = ( vptr_t ) begin; tmp[t_index - 1].bss_end = ( vptr_t ) ( begin + size ); } } } } if ( counting ) { /* When we get here, we have counted the number of entries in the map for us to allocate */ tmp = ( PAPI_address_map_t * ) papi_calloc( t_index, sizeof ( PAPI_address_map_t ) ); if ( tmp == NULL ) { PAPIERROR( "Error allocating shared 
library address map" ); fclose(f); return PAPI_ENOMEM; } t_index = 0; rewind( f ); counting = 0; goto again; } else { if ( mdi->shlib_info.map ) papi_free( mdi->shlib_info.map ); mdi->shlib_info.map = tmp; mdi->shlib_info.count = t_index; fclose( f ); } return PAPI_OK; } papi-papi-7-2-0-t/src/linux-memory.h000066400000000000000000000002511502707512200172430ustar00rootroot00000000000000int _linux_get_dmem_info( PAPI_dmem_info_t * d ); int _linux_get_memory_info( PAPI_hw_info_t * hwinfo, int cpu_type ); int _linux_update_shlib_info( papi_mdi_t *mdi ); papi-papi-7-2-0-t/src/linux-timer.c000066400000000000000000000326371502707512200170630ustar00rootroot00000000000000/* * File: linux-timer.c * * @author: Vince Weaver * vincent.weaver @ maine.edu * Mods: Philip Mucci * mucci @ icl.utk.edu */ #include #include #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include #include #include #include #include #include "linux-common.h" #include #include #include #include #ifdef __ia64__ #include "perfmon/pfmlib_itanium2.h" #include "perfmon/pfmlib_montecito.h" #endif #ifdef __powerpc__ #include #endif #if defined(HAVE_MMTIMER) #include #include #include #ifndef MMTIMER_FULLNAME #define MMTIMER_FULLNAME "/dev/mmtimer" #endif static int mmdev_fd; static unsigned long mmdev_mask; static unsigned long mmdev_ratio; static volatile unsigned long *mmdev_timer_addr; /* setup mmtimer */ int mmtimer_setup(void) { unsigned long femtosecs_per_tick = 0; unsigned long freq = 0; int result; int offset; SUBDBG( "MMTIMER Opening %s\n", MMTIMER_FULLNAME ); if ( ( mmdev_fd = open( MMTIMER_FULLNAME, O_RDONLY ) ) == -1 ) { PAPIERROR( "Failed to open MM timer %s", MMTIMER_FULLNAME ); return PAPI_ESYS; } SUBDBG( "MMTIMER checking if we can mmap" ); if ( ioctl( mmdev_fd, MMTIMER_MMAPAVAIL, 0 ) != 1 ) { PAPIERROR( "mmap of MM timer unavailable" ); return PAPI_ESYS; } SUBDBG( "MMTIMER setting close on EXEC flag\n" ); if ( fcntl( mmdev_fd, F_SETFD, FD_CLOEXEC ) == -1 ) { PAPIERROR( 
"Failed to fcntl(FD_CLOEXEC) on MM timer FD %d: %s", mmdev_fd, strerror( errno ) ); return PAPI_ESYS; } SUBDBG( "MMTIMER is on FD %d, getting offset\n", mmdev_fd ); if ( ( offset = ioctl( mmdev_fd, MMTIMER_GETOFFSET, 0 ) ) < 0 ) { PAPIERROR( "Failed to get offset of MM timer" ); return PAPI_ESYS; } SUBDBG( "MMTIMER has offset of %d, getting frequency\n", offset ); if ( ioctl( mmdev_fd, MMTIMER_GETFREQ, &freq ) == -1 ) { PAPIERROR( "Failed to get frequency of MM timer" ); return PAPI_ESYS; } SUBDBG( "MMTIMER has frequency %lu Mhz\n", freq / 1000000 ); // don't know for sure, but I think this ratio is inverted // mmdev_ratio = (freq/1000000) / (unsigned long)_papi_hwi_system_info.hw_info.mhz; mmdev_ratio = ( unsigned long ) _papi_hwi_system_info.hw_info.cpu_max_mhz / ( freq / 1000000 ); SUBDBG( "MMTIMER has a ratio of %ld to the CPU's clock, getting resolution\n", mmdev_ratio ); if ( ioctl( mmdev_fd, MMTIMER_GETRES, &femtosecs_per_tick ) == -1 ) { PAPIERROR( "Failed to get femtoseconds per tick" ); return PAPI_ESYS; } SUBDBG( "MMTIMER res is %lu femtosecs/tick (10^-15s) or %f Mhz, getting valid bits\n", femtosecs_per_tick, 1.0e9 / ( double ) femtosecs_per_tick ); if ( ( result = ioctl( mmdev_fd, MMTIMER_GETBITS, 0 ) ) == -ENOSYS ) { PAPIERROR( "Failed to get number of bits in MMTIMER" ); return PAPI_ESYS; } mmdev_mask = ~( 0xffffffffffffffff << result ); SUBDBG( "MMTIMER has %d valid bits, mask %#16lx, getting mmaped page\n", result, mmdev_mask ); if ( ( mmdev_timer_addr = ( unsigned long * ) mmap( 0, getpagesize( ), PROT_READ, MAP_PRIVATE, mmdev_fd, 0 ) ) == NULL ) { PAPIERROR( "Failed to mmap MM timer" ); return PAPI_ESYS; } SUBDBG( "MMTIMER page is at %p, actual address is %p\n", mmdev_timer_addr, mmdev_timer_addr + offset ); mmdev_timer_addr += offset; /* mmdev_fd should be closed and page should be unmapped in a global shutdown routine */ return PAPI_OK; } #else #if defined(__powerpc__) static uint64_t multiplier = 1; #endif int mmtimer_setup(void) { #if 
defined(__powerpc__) multiplier = ((uint64_t)_papi_hwi_system_info.hw_info.cpu_max_mhz * 1000000ULL) / (__ppc_get_timebase_freq()/(uint64_t)1000); #endif return PAPI_OK; } #endif /* Hardware clock functions */ /* All architectures should set HAVE_CYCLES in configure if they have these. Not all do so for now, we have to guard at the end of the statement, instead of the top. When all archs set this, this region will be guarded with: #if defined(HAVE_CYCLE) which is equivalent to #if !defined(HAVE_GETTIMEOFDAY) && !defined(HAVE_CLOCK_GETTIME) */ /************************/ /* MMTIMER get_cycles() */ /************************/ #if defined(HAVE_MMTIMER) static inline long long get_cycles( void ) { long long tmp = 0; tmp = *mmdev_timer_addr & mmdev_mask; SUBDBG("MMTIMER is %llu, scaled %llu\n",tmp,tmp*mmdev_ratio); tmp *= mmdev_ratio; return tmp; } /************************/ /* ia64 get_cycles() */ /************************/ #elif defined(__ia64__) extern int _perfmon2_pfm_pmu_type; static inline long long get_cycles( void ) { long long tmp = 0; #if defined(__INTEL_COMPILER) tmp = __getReg( _IA64_REG_AR_ITC ); #else __asm__ __volatile__( "mov %0=ar.itc":"=r"( tmp )::"memory" ); #endif switch ( _perfmon2_pfm_pmu_type ) { case PFMLIB_MONTECITO_PMU: tmp = tmp * 4; break; } return tmp; } /************************/ /* x86 get_cycles() */ /************************/ #elif (defined(__i386__)||defined(__x86_64__)) static inline long long get_cycles( void ) { long long ret = 0; #ifdef __x86_64__ do { unsigned int a, d; asm volatile ( "rdtsc":"=a" ( a ), "=d"( d ) ); ( ret ) = ( ( long long ) a ) | ( ( ( long long ) d ) << 32 ); } while ( 0 ); #else __asm__ __volatile__( "rdtsc":"=A"( ret ): ); #endif return ret; } /************************/ /* SPARC get_cycles() */ /************************/ /* #define get_cycles _rtc ?? 
*/ #elif defined(__sparc__) static inline long long get_cycles( void ) { register unsigned long ret asm( "g1" ); __asm__ __volatile__( ".word 0x83410000" /* rd %tick, %g1 */ :"=r"( ret ) ); return ret; } /************************/ /* aarch64 get_cycles() */ /************************/ #elif defined(__aarch64__) static inline long long get_cycles( void ) { register unsigned long ret; __asm__ __volatile__ ("isb; mrs %0, cntvct_el0" : "=r" (ret)); return ret; } /************************/ /* loongarch64 get_cycles() */ /************************/ #elif defined(__loongarch64) static inline long long get_cycles( void ) { int rid = 0; unsigned long ret; __asm__ __volatile__ ( "rdtime.d %0, %1" : "=r" (ret), "=r" (rid) ); return ret; } /************************/ /* POWER get_cycles() */ /************************/ #elif defined(__powerpc__) static inline long long get_cycles() { uint64_t result; int64_t retval; #ifdef _ARCH_PPC64 /* This reads timebase in one 64bit go. Does *not* include a workaround for the cell (see http://ozlabs.org/pipermail/linuxppc-dev/2006-October/027052.html) */ __asm__ volatile( "mftb %0" : "=r" (result)); #else /* Read the high 32bits of the timer, then the lower, and repeat if high order has changed in the meantime. 
	   See http://ozlabs.org/pipermail/linuxppc-dev/1999-October/003889.html */
	unsigned long dummy;
	__asm__ volatile(
		"mfspr %1,269\n\t"	/* mftbu */
		"mfspr %L0,268\n\t"	/* mftb */
		"mfspr %0,269\n\t"	/* mftbu */
		"cmpw %0,%1\n\t"	/* check if the high order word has changed */
		"bne $-16"
		: "=r" (result), "=r" (dummy));
#endif

	retval = (result*multiplier)/1000ULL;

	return retval;
}

#elif (defined(__arm__) || defined(__mips__) || defined(__hppa__)) || defined(__riscv)
static inline long long
get_cycles( void )
{
	return 0;
}

/************************/
/* NEC get_cycles()     */
/************************/
#elif defined(__NEC__)
static inline long long
get_cycles( void )
{
	long long ret = 0;
	return ret;
}

#elif !defined(HAVE_GETTIMEOFDAY) && !defined(HAVE_CLOCK_GETTIME)
#error "No get_cycles support for this architecture."
#endif

long long
_linux_get_real_cycles( void )
{
	long long retval;
#if defined(HAVE_GETTIMEOFDAY)||defined(__arm__)||defined(__mips__)
	/* Crude estimate, not accurate in presence of DVFS */
	retval = _papi_os_vector.get_real_usec( ) *
		( long long ) _papi_hwi_system_info.hw_info.cpu_max_mhz;
#else
	retval = get_cycles( );
#endif
	return retval;
}

/********************************************************************
 * microsecond timers                                               *
 ********************************************************************/

/*******************************
 * HAVE_CLOCK_GETTIME          *
 *******************************/

long long
_linux_get_real_usec_gettime( void )
{
	long long retval;

	struct timespec foo;
#ifdef HAVE_CLOCK_GETTIME_REALTIME_HR
	syscall( __NR_clock_gettime, CLOCK_REALTIME_HR, &foo );
#else
	syscall( __NR_clock_gettime, CLOCK_REALTIME, &foo );
#endif
	retval = ( long long ) foo.tv_sec * ( long long ) 1000000;
	retval += ( long long ) ( foo.tv_nsec / 1000 );

	return retval;
}

/**********************
 * HAVE_GETTIMEOFDAY  *
 **********************/

long long
_linux_get_real_usec_gettimeofday( void )
{
	long long retval;

	struct timeval buffer;
	gettimeofday( &buffer, NULL );
	retval = ( long long ) buffer.tv_sec * ( long long ) 1000000;
	retval += ( long long ) ( buffer.tv_usec );

	return retval;
}

long long
_linux_get_real_usec_cycles( void )
{
	long long retval;

	/* Not accurate in the presence of DVFS */
	retval = get_cycles( ) /
		( long long ) _papi_hwi_system_info.hw_info.cpu_max_mhz;

	return retval;
}

/*******************************
 * HAVE_PER_THREAD_GETRUSAGE   *
 *******************************/

long long
_linux_get_virt_usec_rusage( void )
{
	long long retval;

	struct rusage buffer;
	getrusage( RUSAGE_SELF, &buffer );
	SUBDBG( "user %d system %d\n", ( int ) buffer.ru_utime.tv_sec,
		( int ) buffer.ru_stime.tv_sec );
	retval = ( long long ) ( buffer.ru_utime.tv_sec +
				 buffer.ru_stime.tv_sec ) * ( long long ) 1000000;
	retval += (long long) ( buffer.ru_utime.tv_usec +
				buffer.ru_stime.tv_usec );

	return retval;
}

/**************************
 * HAVE_PER_THREAD_TIMES  *
 **************************/

long long
_linux_get_virt_usec_times( void )
{
	long long retval;

	struct tms buffer;

	times( &buffer );
	SUBDBG( "user %d system %d\n", ( int ) buffer.tms_utime,
		( int ) buffer.tms_stime );
	retval = ( long long ) ( ( buffer.tms_utime + buffer.tms_stime ) *
				 1000000 / sysconf( _SC_CLK_TCK ));
	/* NOT CLOCKS_PER_SEC as in the headers!
*/ return retval; } /******************************/ /* HAVE_CLOCK_GETTIME_THREAD */ /******************************/ long long _linux_get_virt_usec_gettime( void ) { long long retval; struct timespec foo; syscall( __NR_clock_gettime, CLOCK_THREAD_CPUTIME_ID, &foo ); retval = ( long long ) foo.tv_sec * ( long long ) 1000000; retval += ( long long ) foo.tv_nsec / 1000; return retval; } /********************/ /* USE_PROC_PTTIMER */ /********************/ long long _linux_get_virt_usec_pttimer( void ) { long long retval; char buf[LINE_MAX]; long long utime, stime; int rv, cnt = 0, i = 0; int stat_fd; again: sprintf( buf, "/proc/%d/task/%d/stat", getpid( ), mygettid( ) ); stat_fd = open( buf, O_RDONLY ); if ( stat_fd == -1 ) { PAPIERROR( "open(%s)", buf ); return PAPI_ESYS; } rv = read( stat_fd, buf, LINE_MAX * sizeof ( char ) ); if ( rv == -1 ) { if ( errno == EBADF ) { close(stat_fd); goto again; } PAPIERROR( "read()" ); close(stat_fd); return PAPI_ESYS; } lseek( stat_fd, 0, SEEK_SET ); if (rv == LINE_MAX) rv--; buf[rv] = '\0'; SUBDBG( "Thread stat file is:%s\n", buf ); while ( ( cnt != 13 ) && ( i < rv ) ) { if ( buf[i] == ' ' ) { cnt++; } i++; } if ( cnt != 13 ) { PAPIERROR( "utime and stime not in thread stat file?" 
);
		close(stat_fd);
		return PAPI_ESYS;
	}

	if ( sscanf( buf + i, "%llu %llu", &utime, &stime ) != 2 ) {
		close(stat_fd);
		PAPIERROR("Unable to scan two items from thread stat file at 13th space?");
		return PAPI_ESYS;
	}
	retval = ( utime + stime ) * ( long long ) 1000000 /
		_papi_os_info.clock_ticks;

	close(stat_fd);

	return retval;
}

/********************************************************************
 * nanosecond timers                                                *
 ********************************************************************/

/*******************************
 * HAVE_CLOCK_GETTIME          *
 *******************************/

long long
_linux_get_real_nsec_gettime( void )
{
	long long retval;

	struct timespec foo;
#ifdef HAVE_CLOCK_GETTIME_REALTIME_HR
	syscall( __NR_clock_gettime, CLOCK_REALTIME_HR, &foo );
#else
	syscall( __NR_clock_gettime, CLOCK_REALTIME, &foo );
#endif
	retval = ( long long ) foo.tv_sec * ( long long ) 1000000000;
	retval += ( long long ) ( foo.tv_nsec );

	return retval;
}

/******************************/
/* HAVE_CLOCK_GETTIME_THREAD  */
/******************************/

long long
_linux_get_virt_nsec_gettime( void )
{
	long long retval;

	struct timespec foo;
	syscall( __NR_clock_gettime, CLOCK_THREAD_CPUTIME_ID, &foo );
	retval = ( long long ) foo.tv_sec * ( long long ) 1000000000;
	retval += ( long long ) foo.tv_nsec;

	return retval;
}

papi-papi-7-2-0-t/src/linux-timer.h

long long _linux_get_real_cycles( void );
long long _linux_get_virt_usec_pttimer( void );
long long _linux_get_virt_usec_gettime( void );
long long _linux_get_virt_usec_times( void );
long long _linux_get_virt_usec_rusage( void );
long long _linux_get_real_usec_gettime( void );
long long _linux_get_real_usec_gettimeofday( void );
long long _linux_get_real_usec_cycles( void );
long long _linux_get_real_nsec_gettime( void );
long long _linux_get_virt_nsec_gettime( void );

int mmtimer_setup(void);
int init_proc_thread_timer( hwd_context_t *thr_ctx );
papi-papi-7-2-0-t/src/maint/000077500000000000000000000000001502707512200155375ustar00rootroot00000000000000papi-papi-7-2-0-t/src/maint/genpapifdef.pl000077500000000000000000000245431502707512200203570ustar00rootroot00000000000000#!/usr/bin/perl ## ## Copyright (C) by Innovative Computing Laboratory ## See COPYRIGHT in top level directory ## use warnings; use strict; my $debug = 0; my $compiler; my $papi_h = "papi.h"; my $events_h = "papiStdEventDefs.h"; my ($header, $value, $operator, $trailer) = ("[-+() ]*", "[A-Za-z_0-9]+", "[<>|&+-]*", ".*"); &parse_script_args(@ARGV); my %papi_defs = &parse_papi_defs($papi_h); my %papi_presets = &parse_papi_presets($events_h); &write_defs(%papi_defs); &write_presets(%papi_presets); # Subroutines sub parse_script_args { my @argv = @_; foreach $_ (@argv) { if (/-c/) { $compiler = "fort"; } elsif (/-f77/) { $compiler = "f77"; } elsif (/-f90/) { $compiler = "f90"; } elsif (/-debug/) { $debug = 1; } else { die "Unrecognized argument $_\n"; } } } sub parse_papi_defs { my $filename = $_[0]; my %papi_defs = (); open (my $fh_in, "<$filename") || die "Unable to open $filename\n"; while (my $line = <$fh_in>) { $line =~ s/\/\*(.*)//; # handle PAPI_VERSION explicitly if ($line =~ /^\s*#\s*define\s+(PAPI_[A-Z_0-9]+)\s+PAPI_VERSION_NUMBER\(([0-9]+),([0-9]+),([0-9])+,([0-9]+)\)/) { $papi_defs{'PAPI_VERSION'} = ($2 << 24) | ($3 << 16) | ($4 << 8) | $5; } # match: define PAPI_XXX (value) elsif ($line =~ /^\s*#\s*define\s+(PAPIF?_[A-Z_0-9]+)\s+(.*)/) { my ($name, $content, $eval_string) = ($1, $2, ""); # Search for PAPI_XXX definitions and replace them in eval_string; # then evaluate eval_string while ($content =~ /($header)\s*($value)\s*($operator)\s*($trailer)/) { my ($h, $v, $o, $t) = ($1, $2, $3, $4); $eval_string .= $h.(exists($papi_defs{$v}) ? $papi_defs{$v} : ($v =~ /0x/) ? 
hex($v) : $v).$o; $content = $t; } $eval_string .= $content; $papi_defs{$name} = eval $eval_string; print STDERR ">> $name = $eval_string\n" if $debug; } # match: enum NAME { elsif ($line =~ /^\s*enum\s*[A-Za-z0-9_]*\s*(.*)/ || $line =~ /^\s*enum\s*[A-Za-z0-9_]*\s*{\s*(.*)/ || $line =~ /^\s*typedef\s*enum\s*[A-Za-z0-9_]*\s*(.*)/ || $line =~ /^\s*typedef\s*enum\s*[A-Za-z0-9_]*\s*{\s*(.*)/) { # Eat until we find the closing right brace my $enum_line = $1; while (! ($enum_line =~ /}/)) { my $newline = <$fh_in>; $newline =~ s/\r*\n//; $enum_line .= $newline; } my ($name, $content, $prev_key) = ("", "", ""); my @enum_array = split /,/, $enum_line; foreach my $item (@enum_array) { my $eval_string = ""; # clean up white comments, white spaces and braces $item =~ s/\/\*.+\*\///s; $item =~ s/\s+//; $item =~ s/{//g; $item =~ s/}.*//; if ($item =~ /(PAPI_[A-Z_0-9]+)\s*=(.*)/) { ($name, $content) = ($1, $2); # Search for PAPI_XXX definitions and replace them in eval_string; # then evaluate eval_string while ($content =~ /\s*($header)\s*($value)\s*($operator)\s*($trailer)/) { my ($h, $v, $o, $t) = ($1, $2, $3, $4); $eval_string .= $h.(exists($papi_defs{$v}) ? $papi_defs{$v} : ($v =~ /0x/) ? hex($v) : $v).$o; $content = $t; } } elsif ($item =~ /(PAPI_[A-Z_0-9]+)/) { ($name, $content) = ($item, ""); $eval_string .= (($prev_key eq "") ? 
0 : $papi_defs{$prev_key} + 1); } else { next; } $eval_string .= $content; $papi_defs{$name} = eval $eval_string; print STDERR ">> $name = $eval_string\n" if $debug; $prev_key = $name; } } } close($fh_in); return %papi_defs; } sub parse_papi_presets { my $filename = $_[0]; my %papi_presets = (); open(my $fh_in, "<$filename") || die "Unable to open $filename\n"; # FIXME: this implementation is not generic enough while (my $line = <$fh_in>) { # cleanup comments $line =~ s/\/\*(.*)\*\)//; # match: enum NAME { if ($line =~ /^\s*enum\s*[A-Za-z0-9_]*\s*(.*)/ || $line =~ /^\s*enum\s*[A-Za-z0-9_]*\s*{\s*(.*)/ || $line =~ /^\s*typedef\s*enum\s*[A-Za-z0-9_]*\s*(.*)/ || $line =~ /^\s*typedef\s*enum\s*[A-Za-z0-9_]*\s*{\s*(.*)/) { # Eat until we find the closing right brace my $enum_line = $1; while (! ($enum_line =~ /}/)) { my $newline = <$fh_in>; $newline =~ s/\r*\n//; $enum_line .= $newline; } my $prev_key = ""; while (1) { # match: PAPI_XXX_idx, if ($enum_line =~ /\s*(PAPI_[A-Z_0-9]+)_idx(.*)/) { $papi_presets{$1} = (("$prev_key" eq "") ? 
-2147483648 : $papi_presets{$prev_key} + 1); $prev_key = $1; $enum_line = $2; } # match: closing right brace elsif ($enum_line =~ /}/) { last; } } } } close($fh_in); return %papi_presets; } sub write_defs { my %defs = @_; if ("$compiler" eq "fort") { &write_defs_fort(%defs); } elsif ("$compiler" eq "f77") { &write_defs_f77(%defs); } else { &write_defs_f90(%defs); } } sub write_presets { my %presets = @_; if ("$compiler" eq "fort") { &write_presets_fort(%presets); } elsif ("$compiler" eq "f77") { &write_presets_f77(%presets); } else { &write_presets_f90(%presets); } } sub write_defs_fort { my %defs = @_; printf STDOUT "C\n"; printf STDOUT "C This file contains defines required by the PAPI Fortran interface.\n"; printf STDOUT "C It is automatically generated by genpapifdef.pl.\n"; printf STDOUT "C DO NOT modify its content and expect the changes to stick.\n"; printf STDOUT "C Changes MUST be made in genpapifdef.pl instead.\n"; printf STDOUT "C Content is extracted from define and enum statements in papi.h\n"; printf STDOUT "C All other content is ignored.\n"; printf STDOUT "C\n\n"; printf STDOUT "C\n"; printf STDOUT "C General purpose defines\n"; printf STDOUT "C\n\n"; foreach my $key (keys %defs) { # skip unneeded definition if ($key =~ /PAPI_MH_/ || $key =~ /PAPI_PRESET_/ || $key =~ /PAPI_DEF_ITIMER/) { next; } printf STDOUT "#define %-18s %s\n", $key, ($papi_defs{$key} == 0x80000000) ? "((-2147483647) - 1)" : $papi_defs{$key}; } } sub write_defs_f77 { my %defs = @_; printf STDOUT "!\n"; printf STDOUT "! This file contains defines required by the PAPI Fortran interface.\n"; printf STDOUT "! It is automatically generated by genpapifdef.pl.\n"; printf STDOUT "! DO NOT modify its content and expect the changes to stick.\n"; printf STDOUT "! Changes MUST be made in genpapifdef.pl instead.\n"; printf STDOUT "! Content is extracted from define and enum statements in papi.h\n"; printf STDOUT "! 
All other content is ignored.\n"; printf STDOUT "!\n\n"; printf STDOUT "!\n"; printf STDOUT "! General purpose defines\n"; printf STDOUT "!\n\n"; foreach my $key (keys %defs) { # skip unneeded definition if ($key =~ /PAPI_MH_/ || $key =~ /PAPI_PRESET_/ || $key =~ /PAPI_DEF_ITIMER/) { next; } printf STDOUT "INTEGER %-18s\nPARAMETER(%s=%s)\n", $key, $key, ($papi_defs{$key} == 0x80000000) ? "((-2147483647) - 1)" : $papi_defs{$key}; } } sub write_defs_f90 { my %defs = @_; printf STDOUT "!\n"; printf STDOUT "! This file contains defines required by the PAPI Fortran interface.\n"; printf STDOUT "! It is automatically generated by genpapifdef.pl.\n"; printf STDOUT "! DO NOT modify its content and expect the changes to stick.\n"; printf STDOUT "! Changes MUST be made in genpapifdef.pl instead.\n"; printf STDOUT "! Content is extracted from define and enum statements in papi.h\n"; printf STDOUT "! All other content is ignored.\n"; printf STDOUT "!\n\n"; printf STDOUT "!\n"; printf STDOUT "! General purpose defines\n"; printf STDOUT "!\n\n"; foreach my $key (keys %defs) { # skip unneeded definition if ($key =~ /PAPI_MH_/ || $key =~ /PAPI_PRESET_/ || $key =~ /PAPI_DEF_ITIMER/) { next; } printf STDOUT "INTEGER, PARAMETER :: %-18s = %s\n", $key, ($papi_defs{$key} == 0x80000000) ? "((-2147483647) - 1)" : $papi_defs{$key}; } } sub write_presets_fort { my %presets = @_; printf STDOUT "\n"; printf STDOUT "C\n"; printf STDOUT "C PAPI preset event values\n"; printf STDOUT "C\n\n"; foreach my $key (keys %presets) { if ($papi_presets{$key} == -2147483648) { printf STDOUT "#define %-18s ((-2147483647) - 1)\n", $key; } else { printf STDOUT "#define %-18s %s\n", $key, $papi_presets{$key}; } } } sub write_presets_f77 { my %presets = @_; printf STDOUT "\n"; printf STDOUT "!\n"; printf STDOUT "! 
PAPI preset event values\n"; printf STDOUT "!\n\n"; foreach my $key (keys %presets) { if ($papi_presets{$key} == -2147483648) { printf STDOUT "INTEGER %-18s\nPARAMETER(%s=(-2147483647) - 1)\n", $key, $key; } else { printf STDOUT "INTEGER %-18s\nPARAMETER(%s=%s)\n", $key, $key, $papi_presets{$key}; } } } sub write_presets_f90 { my %presets = @_; printf STDOUT "\n"; printf STDOUT "!\n"; printf STDOUT "! PAPI preset event values\n"; printf STDOUT "!\n\n"; foreach my $key (keys %presets) { if ($papi_presets{$key} == -2147483648) { printf STDOUT "INTEGER, PARAMETER :: %-18s = ((-2147483647) - 1)\n", $key; } else { printf STDOUT "INTEGER, PARAMETER :: %-18s = %s\n", $key, $papi_presets{$key}; } } } papi-papi-7-2-0-t/src/mb.h000066400000000000000000000045051502707512200152020ustar00rootroot00000000000000#ifndef _MB_H #define _MB_H /* These definitions are not yet in distros, so I have cut and pasted just the needed definitions in here */ #ifdef __powerpc__ #define rmb() asm volatile ("sync" : : : "memory") #elif defined (__s390__) #define rmb() asm volatile("bcr 15,0" ::: "memory") #elif defined (__sh__) #if defined(__SH4A__) || defined(__SH5__) #define rmb() asm volatile("synco" ::: "memory") #else #define rmb() asm volatile("" ::: "memory") #endif #elif defined (__hppa__) #define rmb() asm volatile("" ::: "memory") #elif defined (__sparc__) #define rmb() asm volatile("":::"memory") #elif defined (__alpha__) #define rmb() asm volatile("mb" ::: "memory") #elif defined(__ia64__) #define rmb() asm volatile ("mf" ::: "memory") #elif defined(__arm__) /* * Use the __kuser_memory_barrier helper in the CPU helper page. See * arch/arm/kernel/entry-armv.S in the kernel source for details. 
 */
#define rmb() ((void(*)(void))0xffff0fa0)()

#elif defined(__aarch64__)
#define rmb() asm volatile("dmb ld" ::: "memory")

#elif defined(__loongarch64)
#define rmb() __asm__ __volatile__("dbar 0" : : : "memory")

#elif defined(__mips__)
#define rmb() asm volatile(		\
		".set mips2\n\t"	\
		"sync\n\t"		\
		".set mips0"		\
		: /* no output */	\
		: /* no input */	\
		: "memory")

#elif defined(__i386__)
#define rmb() asm volatile("lock; addl $0,0(%%esp)" ::: "memory")

#elif defined(__NEC__)
#define rmb() asm volatile("lfence":::"memory")

#elif defined(__x86_64)
#if defined(__KNC__)
#define rmb() __sync_synchronize()
#else
#define rmb() asm volatile("lfence":::"memory")
#endif

#elif defined (__riscv)
#define RISCV_FENCE(p, s) \
	__asm__ __volatile__ ("fence " #p "," #s : : : "memory")
/* These barriers need to enforce ordering on both devices or memory. */
#define mb()	RISCV_FENCE(iorw,iorw)
#define rmb()	RISCV_FENCE(ir,ir)
#define wmb()	RISCV_FENCE(ow,ow)

#else
#error Need to define rmb for this architecture!
#error See the kernel source directory: tools/perf/perf.h file
#endif

#endif

papi-papi-7-2-0-t/src/papi.c

/****************************/
/* THIS IS OPEN SOURCE CODE */
/****************************/

/**
 * @file:    papi.c
 *
 * @author:  Philip Mucci
 *           mucci@cs.utk.edu
 * @author   dan terpstra
 *           terpstra@cs.utk.edu
 * @author   Min Zhou
 *           min@cs.utk.edu
 * @author   Kevin London
 *           london@cs.utk.edu
 * @author   Per Ekman
 *           pek@pdc.kth.se
 * @author   Frank Winkler
 *           frank.winkler@icl.utk.edu
 * Mods:     Gary Mohr
 *           gary.mohr@bull.com
 *
 * @brief Most of the low-level API is here.
*/ #include #include #include #include #include #include #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" #include "papi_preset.h" #include "cpus.h" #include "extras.h" #include "sw_multiplex.h" /* simplified papi functions for event rates */ /* For dynamic linking to libpapi */ /* Weak symbol for pthread_once to avoid additional linking * against libpthread when not used. */ #pragma weak pthread_once #define STOP 0 #define FLIP 1 #define FLOP 2 #define IPC 3 #define EPC 4 /** \internal * This is stored per thread */ typedef struct _RateInfo { int EventSet; /**< EventSet of the thread */ int event_0; /**< first event of the eventset */ short int running; /**< STOP, FLIP, FLOP, IPC or EPC */ long long last_real_time; /**< Previous value of real time */ long long last_proc_time; /**< Previous value of processor time */ } RateInfo; THREAD_LOCAL_STORAGE_KEYWORD RateInfo *_rate_state = NULL; bool _papi_rate_initiated = false; static void _internal_papi_init(void); static void _internal_onetime_papi_init(void); static int _start_new_rate_call(float *real_time, float *proc_time, int *events, int num_events, long long *ins, float *rate); static int _rate_calls( float *real_time, float *proc_time, int *events, long long *values, long long *ins, float *rate, int mode ); static int _internal_check_rate_state(); static void _internal_papi_init(void) { /* This function is only called by the first thread! 
 */
	int retval;

	/* check if user has already initialized PAPI with thread support */
	if ( init_level != ( PAPI_LOW_LEVEL_INITED | PAPI_THREAD_LEVEL_INITED ) ) {
		if ( ( retval = PAPI_library_init(PAPI_VER_CURRENT) ) != PAPI_VER_CURRENT ) {
			fprintf( stderr, "PAPI Error: PAPI_library_init failed with return value %d.\n", retval);
		} else {
			if ((retval = PAPI_thread_init(_papi_gettid)) != PAPI_OK) {
				fprintf( stderr, "PAPI Error: PAPI_thread_init failed with return value %d.\n", retval);
				fprintf( stderr, "PAPI Error: PAPI could not be initiated!\n");
			} else {
				_papi_rate_initiated = true;
			}
		}
	} else {
		_papi_rate_initiated = true;
	}
}

static void _internal_onetime_papi_init(void)
{
	static pthread_once_t library_is_initialized = PTHREAD_ONCE_INIT;
	if ( pthread_once ) {
		/* we assume that this function was called from a parallel region */
		pthread_once(&library_is_initialized, _internal_papi_init);
		/* wait until first thread has finished */
		int i = 0;
		/* give it 5 seconds in case PAPI_thread_init crashes */
		while ( !_papi_rate_initiated && (i++) < 500000 )
			usleep(10);
	} else {
		/* we assume that this function was called from a serial application
		 * that was not linked against libpthread */
		_internal_papi_init();
	}
}

static int _internal_check_rate_state()
{
	/* check if PAPI is initialized for rate functions */
	if ( _papi_rate_initiated == false ) {
		_internal_onetime_papi_init();
		if ( _papi_rate_initiated == false )
			return ( PAPI_EINVAL );
	}

	if ( _rate_state == NULL ) {
		_rate_state = ( RateInfo* ) papi_malloc( sizeof ( RateInfo ) );
		if ( _rate_state == NULL )
			return ( PAPI_ENOMEM );

		memset( _rate_state, 0, sizeof ( RateInfo ) );
		_rate_state->running = STOP;
	}
	return ( PAPI_OK );
}

/** @class PAPI_flips_rate
 *	@brief Simplified call to get Mflips/s (floating point instruction rate), real and processor time.
* * @par C Interface: * \#include @n * int PAPI_flips_rate( int event, float *rtime, float *ptime, long long *flpins, float *mflips ); * * @param event * one of the three presets PAPI_FP_INS, PAPI_VEC_SP or PAPI_VEC_DP * @param *rtime * realtime since the latest call * @param *ptime * process time since the latest call * @param *flpins * floating point instructions since the latest call * @param *mflips * incremental (Mega) floating point instructions per seconds since the latest call * * @retval PAPI_EINVAL * The counters were already started by something other than PAPI_flips_rate(). * @retval PAPI_ENOEVNT * The floating point instructions event does not exist. * @retval PAPI_ENOMEM * Insufficient memory to complete the operation. * * The first call to PAPI_flips_rate() will initialize the PAPI interface, * set up the counters to monitor the floating point instructions event and start the counters. * * Subsequent calls will read the counters and return real time, process time, * floating point instructions and the Mflip/s rate since the latest call to PAPI_flips_rate(). * * PAPI_flips_rate() returns information related to floating point instructions using * the floating point instructions event. This is intended to measure instruction rate through the * floating point pipe with no massaging. Note that PAPI_flips_rate() is thread-safe and can * therefore be called by multiple threads. 
* * @see PAPI_flops_rate() * @see PAPI_ipc() * @see PAPI_epc() */ int PAPI_flips_rate( int event, float *rtime, float *ptime, long long *flpins, float *mflips ) { int retval; /* check event first */ if ( event == PAPI_FP_INS || event == PAPI_VEC_DP || event == PAPI_VEC_SP ) { int events[1] = {event}; long long values = 0; if ( rtime == NULL || ptime == NULL || flpins == NULL || mflips == NULL ) { return PAPI_EINVAL; } retval = _rate_calls( rtime, ptime, events, &values, flpins, mflips, FLIP ); return ( retval ); } return ( PAPI_ENOEVNT ); } /** @class PAPI_flops_rate * @brief Simplified call to get Mflops/s (floating point operation rate), real and processor time. * * @par C Interface: * \#include @n * int PAPI_flops_rate ( int event, float *rtime, float *ptime, long long *flpops, float *mflops ); * * @param event * one of the three presets PAPI_FP_OPS, PAPI_SP_OPS or PAPI_DP_OPS * @param *rtime * realtime since the latest call * @param *ptime * process time since the latest call * @param *flpops * floating point operations since the latest call * @param *mflops * incremental (Mega) floating point operations per seconds since the latest call * * @retval PAPI_EINVAL * The counters were already started by something other than PAPI_flops_rate(). * @retval PAPI_ENOEVNT * The floating point operations event does not exist. * @retval PAPI_ENOMEM * Insufficient memory to complete the operation. * * The first call to PAPI_flops_rate() will initialize the PAPI interface, * set up the counters to monitor the floating point operations event and start the counters. * * Subsequent calls will read the counters and return real time, process time, * floating point operations and the Mflop/s rate since the latest call to PAPI_flops_rate(). * * PAPI_flops_rate() returns information related to theoretical floating point operations * rather than simple instructions. 
It uses the floating point operations event which attempts to
 *	'correctly' account for, e.g., FMA undercounts and FP Store overcounts. Note that
 *	PAPI_flops_rate() is thread-safe and can therefore be called by multiple threads.
 *
 * @see PAPI_flips_rate()
 * @see PAPI_ipc()
 * @see PAPI_epc()
 * @see PAPI_rate_stop()
 */
int
PAPI_flops_rate( int event, float *rtime, float *ptime, long long *flpops,
		 float *mflops )
{
	int retval;

	/* check event first */
	if ( event == PAPI_FP_OPS || event == PAPI_SP_OPS || event == PAPI_DP_OPS ) {
		int events[1] = {event};
		long long values = 0;
		if ( rtime == NULL || ptime == NULL || flpops == NULL || mflops == NULL ) {
			return PAPI_EINVAL;
		}
		retval = _rate_calls( rtime, ptime, events, &values, flpops, mflops, FLOP );
		return ( retval );
	}
	return ( PAPI_ENOEVNT );
}

/** @class PAPI_ipc
 *	@brief Simplified call to get instructions per cycle, real and processor time.
 *
 * @par C Interface:
 *	\#include <papi.h> @n
 *	int PAPI_ipc( float *rtime, float *ptime, long long *ins, float *ipc );
 *
 * @param *rtime
 *	realtime since the latest call
 * @param *ptime
 *	process time since the latest call
 * @param *ins
 *	instructions since the latest call
 * @param *ipc
 *	incremental instructions per cycle since the latest call
 *
 * @retval PAPI_EINVAL
 *	The counters were already started by something other than PAPI_ipc().
 * @retval PAPI_ENOEVNT
 *	The events PAPI_TOT_INS and PAPI_TOT_CYC are not supported.
 * @retval PAPI_ENOMEM
 *	Insufficient memory to complete the operation.
 *
 * The first call to PAPI_ipc() will initialize the PAPI interface,
 * set up the counters to monitor PAPI_TOT_INS and PAPI_TOT_CYC events
 * and start the counters.
 *
 * Subsequent calls will read the counters and return real time,
 * process time, instructions and the IPC rate since the latest call to PAPI_ipc().
 *
 * PAPI_ipc() should return a ratio greater than 1.0, indicating instruction level
 * parallelism within the chip. The larger this ratio the more efficiently the program
 * is running.
Note that PAPI_ipc() is thread-safe and can therefore be called by multiple threads. * * @see PAPI_flips_rate() * @see PAPI_flops_rate() * @see PAPI_epc() * @see PAPI_rate_stop() */ int PAPI_ipc( float *rtime, float *ptime, long long *ins, float *ipc ) { long long values[2] = { 0, 0 }; int events[2] = {PAPI_TOT_INS, PAPI_TOT_CYC}; int retval = 0; if ( rtime == NULL || ptime == NULL || ins == NULL || ipc == NULL ) return PAPI_EINVAL; retval = _rate_calls( rtime, ptime, events, values, ins, ipc, IPC ); return ( retval ); } /** @class PAPI_epc * @brief Simplified call to get arbitrary events per cycle, real and processor time. * * @par C Interface: * \#include @n * int PAPI_epc( int event, float *rtime, float *ptime, long long *ref, long long *core, long long *evt, float *epc ); * * @param event * event code to be measured (0 defaults to PAPI_TOT_INS) * @param *rtime * realtime since the latest call * @param *ptime * process time since the latest call * @param *ref * incremental reference clock cycles since the latest call * @param *core * incremental core clock cycles since the latest call * @param *evt * events since the latest call * @param *epc * incremental events per cycle since the latest call * * @retval PAPI_EINVAL * The counters were already started by something other than PAPI_epc(). * @retval PAPI_ENOEVNT * One of the requested events does not exist. * @retval PAPI_ENOMEM * Insufficient memory to complete the operation. * * The first call to PAPI_epc() will initialize the PAPI interface, * set up the counters to monitor the user specified event, PAPI_TOT_CYC, * and PAPI_REF_CYC (if it exists) and start the counters. * * Subsequent calls will read the counters and return real time, * process time, event counts, the core and reference cycle count and EPC rate * since the latest call to PAPI_epc(). * * PAPI_epc() can provide a more detailed look at algorithm efficiency in light of clock * variability in modern cpus. 
MFLOPS is no longer an adequate description of peak * performance if clock rates can arbitrarily speed up or slow down. By allowing a * user-specified event and reporting reference cycles, core cycles and real time, * PAPI_epc provides the information to compute an accurate effective clock rate, and * an accurate measure of computational throughput. Note that PAPI_epc() is thread-safe and can * therefore be called by multiple threads. * * @see PAPI_flips_rate() * @see PAPI_flops_rate() * @see PAPI_ipc() * @see PAPI_rate_stop() */ int PAPI_epc( int event, float *rtime, float *ptime, long long *ref, long long *core, long long *evt, float *epc ) { long long values[3] = { 0, 0, 0 }; int events[3] = {PAPI_TOT_INS, PAPI_TOT_CYC, PAPI_REF_CYC}; int retval = 0; if ( rtime == NULL || ptime == NULL || ref == NULL || core == NULL || evt == NULL || epc == NULL ) return PAPI_EINVAL; // if an event is provided, use it; otherwise use TOT_INS if ( event != 0 ) events[0] = event; retval = _rate_calls( rtime, ptime, events, values, evt, epc, EPC ); *ref = values[2]; *core = values[1]; return ( retval ); } /** @class PAPI_rate_stop * @brief Stop a running event set of a rate function. * * @par C Interface: * \#include <papi.h> @n * int PAPI_rate_stop(); * * @retval PAPI_ENOEVNT * -- The EventSet is not started yet. * @retval PAPI_ENOMEM * -- Insufficient memory to complete the operation. * * PAPI_rate_stop stops a running event set of a rate function.
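 *
 * @par Example (illustrative sketch, not taken from the original
 * documentation; do_flops() and handle_error() are placeholder helpers):
 * @code
 * float rtime, ptime, mflops;
 * long long flpops;
 * // first call sets up and starts the counters
 * if ( PAPI_flops_rate( PAPI_FP_OPS, &rtime, &ptime, &flpops, &mflops ) != PAPI_OK )
 *     handle_error( 1 );
 * do_flops();
 * // second call reads the counters and reports the incremental rate
 * if ( PAPI_flops_rate( PAPI_FP_OPS, &rtime, &ptime, &flpops, &mflops ) != PAPI_OK )
 *     handle_error( 1 );
 * // stop and clean up the rate event set when done
 * if ( PAPI_rate_stop() != PAPI_OK )
 *     handle_error( 1 );
 * @endcode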
* * @see PAPI_flips_rate() * @see PAPI_flops_rate() * @see PAPI_ipc() * @see PAPI_epc() */ int PAPI_rate_stop() { int retval; long long tmp_values[3]; if ( _papi_rate_events_running == 1 ) { if ( _rate_state!= NULL ) { if ( _rate_state->running > STOP ) { retval = PAPI_stop( _rate_state->EventSet, tmp_values ); if ( retval == PAPI_OK ) { PAPI_cleanup_eventset( _rate_state->EventSet ); _rate_state->running = STOP; } _papi_rate_events_running = 0; return retval; } } } return ( PAPI_ENOEVNT ); } static int _start_new_rate_call(float *real_time, float *proc_time, int *events, int num_events, long long *ins, float *rate) { int retval; _rate_state->EventSet = -1; if ( ( retval = PAPI_create_eventset( &_rate_state->EventSet ) ) != PAPI_OK ) return ( retval ); if (( retval = PAPI_add_events( _rate_state->EventSet, events, num_events )) != PAPI_OK ) return retval; /* remember the event for subsequent calls of PAPI_flips_rate and PAPI_flops_rate */ _rate_state->event_0 = events[0]; *real_time = 0.0; *proc_time = 0.0; *rate = 0.0; *ins = 0; _rate_state->last_real_time = PAPI_get_real_usec( ); _rate_state->last_proc_time = PAPI_get_virt_usec( ); if ( ( retval = PAPI_start( _rate_state->EventSet ) ) != PAPI_OK ) { return retval; } return ( PAPI_OK ); } static int _rate_calls( float *real_time, float *proc_time, int *events, long long *values, long long *ins, float *rate, int mode ) { // printf("_rate_calls event %d, mode %d\n", events[0], mode); long long rt, pt; // current elapsed real and process times in usec int num_events = 2; int retval = 0; /* if a high-level event set is running stop it */ if ( _papi_hl_events_running == 1 ) { if ( ( retval = PAPI_hl_stop() ) != PAPI_OK ) return ( retval ); } if ( ( retval = _internal_check_rate_state() ) != PAPI_OK ) { return ( retval ); } switch (mode) { case FLOP: case FLIP: if ( (retval = PAPI_query_event(events[0])) != PAPI_OK) return retval; num_events = 1; break; case IPC: break; case EPC: if ( (retval = 
PAPI_query_event(events[0])) != PAPI_OK) return retval; if ( (retval = PAPI_query_event(events[2])) == PAPI_OK) num_events = 3; break; default: return PAPI_EINVAL; } /* STOP means the first call of a rate function */ if ( _rate_state->running == STOP ) { if ( ( retval = _start_new_rate_call(real_time, proc_time, events, num_events, ins, rate)) != PAPI_OK ) return retval; _rate_state->running = mode; } else { // check last mode // printf("current mode: %d, last mode: %d\n", mode, _rate_state->running); // printf("current event: %d, last event: %d\n", events[0], _rate_state->event_0); if ( mode != _rate_state->running || events[0] != _rate_state->event_0 ) { long long tmp_values[3]; retval = PAPI_stop( _rate_state->EventSet, tmp_values ); if ( retval == PAPI_OK ) { PAPI_cleanup_eventset( _rate_state->EventSet ); } else { return retval; } if ( ( retval = _start_new_rate_call(real_time, proc_time, events, num_events, ins, rate)) != PAPI_OK ) return retval; _rate_state->running = mode; _papi_rate_events_running = 1; return ( PAPI_OK ); } if ( ( retval = PAPI_stop( _rate_state->EventSet, values ) ) != PAPI_OK ) { _rate_state->running = STOP; return retval; } /* Read elapsed real and process times */ rt = PAPI_get_real_usec(); pt = PAPI_get_virt_usec(); /* Convert to seconds with multiplication because it is much faster */ *real_time = ((float)( rt - _rate_state->last_real_time )) * .000001; *proc_time = ((float)( pt - _rate_state->last_proc_time )) * .000001; *ins = values[0]; switch (mode) { case FLOP: case FLIP: /* Calculate MFLOP and MFLIP rates */ if ( pt > 0 ) { *rate = (float)values[0] / (pt - _rate_state->last_proc_time); } else *rate = 0; break; case IPC: case EPC: /* Calculate IPC */ if (values[1]!=0) { *rate = (float) ((float)values[0] / (float) ( values[1])); } break; default: return PAPI_EINVAL; } _rate_state->last_real_time = rt; _rate_state->last_proc_time = pt; if ( ( retval = PAPI_start( _rate_state->EventSet ) ) != PAPI_OK ) { _rate_state->running = 
STOP; return retval; } } _papi_rate_events_running = 1; return PAPI_OK; } /*******************************/ /* BEGIN EXTERNAL DECLARATIONS */ /*******************************/ extern hwi_presets_t user_defined_events[PAPI_MAX_USER_EVENTS]; extern int user_defined_events_count; extern int num_all_presets; extern int _papi_hwi_start_idx[PAPI_NUM_COMP]; extern int first_comp_with_presets; extern int first_comp_preset_idx; #ifdef DEBUG #define papi_return(a) do { \ int b = a; \ if (b != PAPI_OK) {\ _papi_hwi_errno = b;\ } \ APIDBG("EXIT: return: %d\n", b);\ return((_papi_hwi_debug_handler ? _papi_hwi_debug_handler(b) : b)); \ } while (0) #else #define papi_return(a) do { \ int b = a; \ if (b != PAPI_OK) {\ _papi_hwi_errno = b;\ } \ APIDBG("EXIT: return: %d\n", b);\ return(b);\ } while(0) #endif /* #ifdef DEBUG #define papi_return(a) return((_papi_hwi_debug_handler ? _papi_hwi_debug_handler(a) : a)) #else #define papi_return(a) return(a) #endif */ #ifdef DEBUG int _papi_hwi_debug; #endif static int init_retval = DEADBEEF; inline_static int valid_component( int cidx ) { if ( _papi_hwi_invalid_cmp( cidx ) ) return ( PAPI_ENOCMP ); return ( cidx ); } inline_static int valid_ESI_component( EventSetInfo_t * ESI ) { return ( valid_component( ESI->CmpIdx ) ); } /** @class PAPI_thread_init * @brief Initialize thread support in the PAPI library. * * @param *id_fn * Pointer to a function that returns the current thread ID. * * PAPI_thread_init initializes thread support in the PAPI library. * Applications that make no use of threads do not need to call this routine. * This function MUST return a UNIQUE thread ID for every new thread/LWP created. * The OpenMP call omp_get_thread_num() violates this rule, as the underlying * LWPs may have been killed off by the run-time system or by a call to omp_set_num_threads(). * In that case, it may still be possible to use omp_get_thread_num() in * conjunction with PAPI_unregister_thread() when the OpenMP thread has finished.
* However it is much better to use the underlying thread subsystem's call, * which is pthread_self() on Linux platforms. * * @code if ( PAPI_thread_init(pthread_self) != PAPI_OK ) exit(1); * @endcode * * @see PAPI_register_thread PAPI_unregister_thread PAPI_get_thr_specific PAPI_set_thr_specific PAPI_thread_id PAPI_list_threads */ int PAPI_thread_init( unsigned long int ( *id_fn ) ( void ) ) { /* Thread support not implemented on Alpha/OSF because the OSF pfm * counter device driver does not support per-thread counters. * When this is updated, we can remove this if statement */ if ( init_level == PAPI_NOT_INITED ) papi_return( PAPI_ENOINIT ); if ( ( init_level & PAPI_THREAD_LEVEL_INITED ) ) papi_return( PAPI_OK ); init_level |= PAPI_THREAD_LEVEL_INITED; papi_return( _papi_hwi_set_thread_id_fn( id_fn ) ); } /** @class PAPI_thread_id * @brief Get the thread identifier of the current thread. * * @retval PAPI_EMISC * is returned if there are no threads registered. * @retval -1 * is returned if the thread id function returns an error. * * This function returns a valid thread identifier. * It calls the function registered with PAPI through a call to * PAPI_thread_init(). * * @code unsigned long tid; if ((tid = PAPI_thread_id()) == (unsigned long int)-1 ) exit(1); printf("Initial thread id is: %lu\n", tid ); * @endcode * @see PAPI_thread_init */ unsigned long PAPI_thread_id( void ) { if ( _papi_hwi_thread_id_fn != NULL ) return ( ( *_papi_hwi_thread_id_fn ) ( ) ); else #ifdef DEBUG if ( _papi_hwi_debug_handler ) return ( unsigned long ) _papi_hwi_debug_handler( PAPI_EMISC ); #endif return ( unsigned long ) PAPI_EMISC; } /* Thread Functions */ /* * Notify PAPI that a thread has 'appeared' * We lookup the thread, if it does not exist we create it */ /** @class PAPI_register_thread * @brief Notify PAPI that a thread has 'appeared'. 
* * @par C Interface: * \#include @n * int PAPI_register_thread (void); * * PAPI_register_thread() should be called when the user wants to force * PAPI to initialize a thread that PAPI has not seen before. * * Usually this is not necessary as PAPI implicitly detects the thread when * an eventset is created or other thread local PAPI functions are called. * However, it can be useful for debugging and performance enhancements * in the run-time systems of performance tools. * * @retval PAPI_ENOMEM * Space could not be allocated to store the new thread information. * @retval PAPI_ESYS * A system or C library call failed inside PAPI, see the errno variable. * @retval PAPI_ECMP * Hardware counters for this thread could not be initialized. * * @see PAPI_unregister_thread * @see PAPI_thread_id * @see PAPI_thread_init */ int PAPI_register_thread( void ) { ThreadInfo_t *thread; if ( init_level == PAPI_NOT_INITED ) papi_return( PAPI_ENOINIT ); papi_return( _papi_hwi_lookup_or_create_thread( &thread, 0 ) ); } /* * Notify PAPI that a thread has 'disappeared' * We lookup the thread, if it does not exist we return an error */ /** @class PAPI_unregister_thread * @brief Notify PAPI that a thread has 'disappeared'. * * @retval PAPI_ENOMEM * Space could not be allocated to store the new thread information. * @retval PAPI_ESYS * A system or C library call failed inside PAPI, see the errno variable. * @retval PAPI_ECMP * Hardware counters for this thread could not be initialized. * * PAPI_unregister_thread should be called when the user wants to shutdown * a particular thread and free the associated thread ID. * THIS IS IMPORTANT IF YOUR THREAD LIBRARY REUSES THE SAME THREAD ID FOR A NEW KERNEL LWP. * OpenMP does this. OpenMP parallel regions, if separated by a call to * omp_set_num_threads() will often kill off the underlying kernel LWPs and * then start new ones for the next region. 
* However, omp_get_thread_num() does not reflect this, as the thread IDs * for the new LWPs will be the same as the old LWPs. * PAPI needs to know that the underlying LWP has changed so it can set up * the counters for that new thread. * This is accomplished by calling this function. */ int PAPI_unregister_thread( void ) { ThreadInfo_t *thread = _papi_hwi_lookup_thread( 0 ); if ( thread ) papi_return( _papi_hwi_shutdown_thread( thread, 0 ) ); papi_return( PAPI_EMISC ); } /** @class PAPI_list_threads * @brief List the registered thread ids. * * PAPI_list_threads() returns to the caller a list of all thread IDs * known to PAPI. * * This call assumes an initialized PAPI library. * * @par C Interface * \#include <papi.h> @n * int PAPI_list_threads(PAPI_thread_id_t *tids, int * number ); * * @param[in,out] *tids * -- A pointer to a preallocated array. * This may be NULL to only return a count of threads. * No more than *number codes will be stored in the array. * @param[in,out] *number * -- An input and output parameter. * Input specifies the number of allocated elements in *tids * (if non-NULL) and output specifies the number of threads. * * @retval PAPI_OK The call returned successfully. * @retval PAPI_EINVAL *number has an improper value. * * @see PAPI_get_thr_specific * @see PAPI_set_thr_specific * @see PAPI_register_thread * @see PAPI_unregister_thread * @see PAPI_thread_init PAPI_thread_id * */ int PAPI_list_threads( PAPI_thread_id_t *tids, int *number ) { PAPI_all_thr_spec_t tmp; int retval; /* If tids == NULL, then just count the threads, don't gather a list. */ /* If tids != NULL, then we need the length of the tids array in num. */ if ( ( number == NULL ) || ( tids && ( *number <= 0 ) ) ) papi_return( PAPI_EINVAL ); memset( &tmp, 0x0, sizeof ( tmp ) ); /* data == NULL, since we don't want the thread specific pointers. */ /* tids may be NULL, if the user doesn't want the thread IDs.
*/ tmp.num = *number; tmp.id = tids; tmp.data = NULL; retval = _papi_hwi_gather_all_thrspec_data( 0, &tmp ); if ( retval == PAPI_OK ) *number = tmp.num; papi_return( retval ); } /** @class PAPI_get_thr_specific * @brief Retrieve a pointer to a thread specific data structure. * * @par Prototype: * \#include <papi.h> @n * int PAPI_get_thr_specific( int tag, void **ptr ); * * @param tag * An identifier, the value of which is either PAPI_USR1_TLS or * PAPI_USR2_TLS. This identifier indicates which of several data * structures associated with this thread is to be accessed. * @param ptr * A pointer to the memory containing the data structure. * * @retval PAPI_OK * @retval PAPI_EINVAL * The @em tag argument is out of range. * * In C, PAPI_get_thr_specific will retrieve the pointer from the array with index @em tag. * There are 2 user available locations and @em tag can be either * PAPI_USR1_TLS or PAPI_USR2_TLS. * The array mentioned above is managed by PAPI and allocated to each * thread which has called PAPI_thread_init. * There is no Fortran equivalent function. * * @par Example: * @code int ret; RateInfo *state = NULL; ret = PAPI_thread_init(pthread_self); if (ret != PAPI_OK) handle_error(ret); // Do we have the thread specific data setup yet?
ret = PAPI_get_thr_specific(PAPI_USR1_TLS, (void *) &state); if (ret != PAPI_OK || state == NULL) { state = (RateInfo *) malloc(sizeof(RateInfo)); if (state == NULL) return (PAPI_ESYS); memset(state, 0, sizeof(RateInfo)); state->EventSet = PAPI_NULL; ret = PAPI_create_eventset(&state->EventSet); if (ret != PAPI_OK) return (PAPI_ESYS); ret = PAPI_set_thr_specific(PAPI_USR1_TLS, state); if (ret != PAPI_OK) return (ret); } * @endcode * @see PAPI_register_thread PAPI_thread_init PAPI_thread_id PAPI_set_thr_specific */ int PAPI_get_thr_specific( int tag, void **ptr ) { ThreadInfo_t *thread; int doall = 0, retval = PAPI_OK; if ( init_level == PAPI_NOT_INITED ) papi_return( PAPI_ENOINIT ); if ( tag & PAPI_TLS_ALL_THREADS ) { tag = tag ^ PAPI_TLS_ALL_THREADS; doall = 1; } if ( ( tag < 0 ) || ( tag > PAPI_TLS_NUM ) ) papi_return( PAPI_EINVAL ); if ( doall ) papi_return( _papi_hwi_gather_all_thrspec_data ( tag, ( PAPI_all_thr_spec_t * ) ptr ) ); retval = _papi_hwi_lookup_or_create_thread( &thread, 0 ); if ( retval == PAPI_OK ) *ptr = thread->thread_storage[tag]; else papi_return( retval ); return ( PAPI_OK ); } /** @class PAPI_set_thr_specific * @brief Store a pointer to a thread specific data structure. * * @par Prototype: * \#include @n * int PAPI_set_thr_specific( int tag, void *ptr ); * * @param tag * An identifier, the value of which is either PAPI_USR1_TLS or * PAPI_USR2_TLS. This identifier indicates which of several data * structures associated with this thread is to be accessed. * @param ptr * A pointer to the memory containing the data structure. * * @retval PAPI_OK * @retval PAPI_EINVAL * The @em tag argument is out of range. * * In C, PAPI_set_thr_specific will save @em ptr into an array indexed by @em tag. * There are 2 user available locations and @em tag can be either * PAPI_USR1_TLS or PAPI_USR2_TLS. * The array mentioned above is managed by PAPI and allocated to each * thread which has called PAPI_thread_init. * There is no Fortran equivalent function. 
* * @par Example: * @code int ret; RateInfo *state = NULL; ret = PAPI_thread_init(pthread_self); if (ret != PAPI_OK) handle_error(ret); // Do we have the thread specific data setup yet? ret = PAPI_get_thr_specific(PAPI_USR1_TLS, (void *) &state); if (ret != PAPI_OK || state == NULL) { state = (RateInfo *) malloc(sizeof(RateInfo)); if (state == NULL) return (PAPI_ESYS); memset(state, 0, sizeof(RateInfo)); state->EventSet = PAPI_NULL; ret = PAPI_create_eventset(&state->EventSet); if (ret != PAPI_OK) return (PAPI_ESYS); ret = PAPI_set_thr_specific(PAPI_USR1_TLS, state); if (ret != PAPI_OK) return (ret); } * @endcode * @see PAPI_register_thread PAPI_thread_init PAPI_thread_id PAPI_get_thr_specific */ int PAPI_set_thr_specific( int tag, void *ptr ) { ThreadInfo_t *thread; int retval = PAPI_OK; if ( init_level == PAPI_NOT_INITED ) papi_return( PAPI_ENOINIT ); if ( ( tag < 0 ) || ( tag > PAPI_NUM_TLS ) ) papi_return( PAPI_EINVAL ); retval = _papi_hwi_lookup_or_create_thread( &thread, 0 ); if ( retval == PAPI_OK ) { _papi_hwi_lock( THREADS_LOCK ); thread->thread_storage[tag] = ptr; _papi_hwi_unlock( THREADS_LOCK ); } else return ( retval ); return ( PAPI_OK ); } /** @class PAPI_library_init * @brief Initialize the PAPI library. * @param version * upon initialization, PAPI checks the argument against the internal * value of PAPI_VER_CURRENT when the library was compiled. * This guards against portability problems when updating the PAPI shared * libraries on your system. * * @retval PAPI_EINVAL * papi.h is different from the version used to compile the PAPI library. * @retval PAPI_ENOMEM * Insufficient memory to complete the operation. * @retval PAPI_ECMP * This component does not support the underlying hardware. * @retval PAPI_ESYS * A system or C library call failed inside PAPI, see the errno variable. * * PAPI_library_init() initializes the PAPI library. * PAPI_is_initialized() checks for initialization. * It must be called before any low level PAPI functions can be used.
* If your application is making use of threads, PAPI_thread_init must also be * called prior to making any calls to the library other than PAPI_library_init(). * @par Examples: * @code * int retval; * retval = PAPI_library_init(PAPI_VER_CURRENT); * if (retval != PAPI_VER_CURRENT && retval > 0) { * fprintf(stderr,"PAPI library version mismatch!\n"); * exit(1); } * if (retval < 0) * handle_error(retval); * retval = PAPI_is_initialized(); * if (retval != PAPI_LOW_LEVEL_INITED) * handle_error(retval); * @endcode * @bug If you don't call this before using any of the low level PAPI calls, your application could core dump. * @see PAPI_thread_init PAPI */ int PAPI_library_init( int version ) { APIDBG( "Entry: version: %#x\n", version); int tmp = 0; /* This is a poor attempt at a lock. For 3.1 this should be replaced with a true UNIX semaphore. We cannot use PAPI locks here because they are not initialized yet */ static int _in_papi_library_init_cnt = 0; #ifdef DEBUG char *var; #endif _papi_hwi_init_errors(); if ( version != PAPI_VER_CURRENT ) papi_return( PAPI_EINVAL ); ++_in_papi_library_init_cnt; while ( _in_papi_library_init_cnt > 1 ) { PAPIERROR( "Multiple callers of PAPI_library_init" ); sleep( 1 ); } /* This checks to see if we have forked or called init more than once. If we have forked, then we continue to init. If we have not forked, we check to see the status of initialization. */ APIDBG( "Initializing library: current PID %d, old PID %d\n", getpid( ), _papi_hwi_system_info.pid ); if ( _papi_hwi_system_info.pid == getpid( ) ) { /* If the magic environment variable PAPI_ALLOW_STOLEN is set, we call shutdown if PAPI has been initialized. This allows tools that use LD_PRELOAD to run on applications that use PAPI. In this circumstance, PAPI_ALLOW_STOLEN will be set to 'stolen' so the tool can check for this case.
*/ if ( getenv( "PAPI_ALLOW_STOLEN" ) ) { char buf[PAPI_HUGE_STR_LEN]; if ( init_level != PAPI_NOT_INITED ) PAPI_shutdown( ); sprintf( buf, "%s=%s", "PAPI_ALLOW_STOLEN", "stolen" ); putenv( buf ); } /* If the library has been successfully initialized *OR* the library attempted initialization but failed. */ else if ( ( init_level != PAPI_NOT_INITED ) || ( init_retval != DEADBEEF ) ) { _in_papi_library_init_cnt--; if ( init_retval < PAPI_OK ) papi_return( init_retval ); else return ( init_retval ); } APIDBG( "system_info was initialized, but init did not succeed\n" ); } #ifdef DEBUG var = ( char * ) getenv( "PAPI_DEBUG" ); _papi_hwi_debug = 0; if ( var != NULL ) { if ( strlen( var ) != 0 ) { if ( strstr( var, "SUBSTRATE" ) ) _papi_hwi_debug |= DEBUG_SUBSTRATE; if ( strstr( var, "API" ) ) _papi_hwi_debug |= DEBUG_API; if ( strstr( var, "INTERNAL" ) ) _papi_hwi_debug |= DEBUG_INTERNAL; if ( strstr( var, "THREADS" ) ) _papi_hwi_debug |= DEBUG_THREADS; if ( strstr( var, "MULTIPLEX" ) ) _papi_hwi_debug |= DEBUG_MULTIPLEX; if ( strstr( var, "OVERFLOW" ) ) _papi_hwi_debug |= DEBUG_OVERFLOW; if ( strstr( var, "PROFILE" ) ) _papi_hwi_debug |= DEBUG_PROFILE; if ( strstr( var, "MEMORY" ) ) _papi_hwi_debug |= DEBUG_MEMORY; if ( strstr( var, "LEAK" ) ) _papi_hwi_debug |= DEBUG_LEAK; if ( strstr( var, "HIGHLEVEL" ) ) _papi_hwi_debug |= DEBUG_HIGHLEVEL; if ( strstr( var, "ALL" ) ) _papi_hwi_debug |= DEBUG_ALL; } if ( _papi_hwi_debug == 0 ) _papi_hwi_debug |= DEBUG_API; } #endif /* Initialize internal globals */ if ( _papi_hwi_init_global_internal( ) != PAPI_OK ) { _in_papi_library_init_cnt--; papi_return( PAPI_EINVAL ); } /* Initialize OS */ tmp = _papi_hwi_init_os(); if ( tmp ) { init_retval = tmp; _papi_hwi_shutdown_global_internal( ); _in_papi_library_init_cnt--; papi_return( init_retval ); } /* Initialize component globals EXCEPT for perf_event, perf_event_uncore. 
* To avoid race conditions, these components use the thread local storage * construct initialized by _papi_hwi_init_global_threads(), from within * their init_component(). So these must have init_component() run AFTER * _papi_hwi_init_global_threads. Other components demand that init threads * run AFTER init_component(), which sets up globals they need. */ tmp = _papi_hwi_init_global( 0 ); /* Selector 0 to skip perf_event, perf_event_uncore */ if ( tmp ) { init_retval = tmp; _papi_hwi_shutdown_global_internal( ); _in_papi_library_init_cnt--; papi_return( init_retval ); } /* Initialize thread globals, including the main threads */ tmp = _papi_hwi_init_global_threads( ); if ( tmp ) { init_retval = tmp; _papi_hwi_shutdown_global_internal( ); _in_papi_library_init_cnt--; papi_return( init_retval ); } /* Initialize perf_event, perf_event_uncore components */ tmp = _papi_hwi_init_global( 1 ); /* Selector 1 for only perf_event, perf_event_uncore */ if ( tmp ) { init_retval = tmp; _papi_hwi_shutdown_global_internal( ); _in_papi_library_init_cnt--; papi_return( init_retval ); } /* Initialize component preset globals. */ tmp = _papi_hwi_init_global_presets(); if ( tmp ) { init_retval = tmp; _papi_hwi_shutdown_global_internal( ); _in_papi_library_init_cnt--; papi_return( init_retval ); } init_level = PAPI_LOW_LEVEL_INITED; _in_papi_library_init_cnt--; return ( init_retval = PAPI_VER_CURRENT ); } /** @class PAPI_query_event * @brief Query if PAPI event exists. * * @par C Interface: * \#include @n * int PAPI_query_event(int EventCode); * * PAPI_query_event() asks the PAPI library if the PAPI Preset event can be * counted on this architecture. * If the event CAN be counted, the function returns PAPI_OK. * If the event CANNOT be counted, the function returns an error code. * This function also can be used to check the syntax of native and user events. * * @param EventCode * -- a defined event such as PAPI_TOT_INS. * * @retval PAPI_EINVAL * One or more of the arguments is invalid. 
* @retval PAPI_ENOEVNT * The PAPI preset is not available on the underlying hardware. * * @par Examples * @code * int retval; * // Initialize the library * retval = PAPI_library_init(PAPI_VER_CURRENT); * if (retval != PAPI_VER_CURRENT) { * fprintf(stderr,"PAPI library init error!\n"); * exit(1); * } * if (PAPI_query_event(PAPI_TOT_INS) != PAPI_OK) { * fprintf(stderr,"No instruction counter? How lame.\n"); * exit(1); * } * @endcode * * @see PAPI_remove_event * @see PAPI_remove_events * @see PAPI_presets * @see PAPI_native */ int PAPI_query_event( int EventCode ) { APIDBG( "Entry: EventCode: %#x\n", EventCode); if ( IS_PRESET(EventCode) ) { EventCode &= PAPI_PRESET_AND_MASK; if ( EventCode < 0 || EventCode >= num_all_presets ) papi_return( PAPI_ENOTPRESET ); int preset_index = EventCode; int compIdx = get_preset_cmp(&preset_index); if( compIdx < 0 ) { return PAPI_ENOEVNT; } if ( _papi_hwi_comp_presets[compIdx][preset_index].count ) papi_return (PAPI_OK); else return PAPI_ENOEVNT; } if ( IS_NATIVE(EventCode) ) { papi_return( _papi_hwi_query_native_event ( ( unsigned int ) EventCode ) ); } if ( IS_USER_DEFINED(EventCode) ) { EventCode &= PAPI_UE_AND_MASK; if ( EventCode < 0 || EventCode >= PAPI_MAX_USER_EVENTS) papi_return ( PAPI_ENOEVNT ); if ( user_defined_events[EventCode].count ) papi_return (PAPI_OK); else papi_return (PAPI_ENOEVNT); } papi_return( PAPI_ENOEVNT ); } /** @class PAPI_query_named_event * @brief Query if a named PAPI event exists. * * @par C Interface: * \#include <papi.h> @n * int PAPI_query_named_event(const char *EventName); * * PAPI_query_named_event() asks the PAPI library if the PAPI named event can be * counted on this architecture. * If the event CAN be counted, the function returns PAPI_OK. * If the event CANNOT be counted, the function returns an error code. * This function can also be used to check the syntax of native and user events. * * @param EventName * -- a defined event such as PAPI_TOT_INS.
* * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ENOEVNT * The PAPI preset is not available on the underlying hardware. * * @par Examples * @code * int retval; * // Initialize the library * retval = PAPI_library_init(PAPI_VER_CURRENT); * if (retval != PAPI_VER_CURRENT) { * fprintf(stderr,"PAPI library init error!\n"); * exit(1); * } * if (PAPI_query_named_event("PAPI_TOT_INS") != PAPI_OK) { * fprintf(stderr,"No instruction counter? How lame.\n"); * exit(1); * } * @endcode * * @see PAPI_query_event */ int PAPI_query_named_event( const char *EventName ) { int ret, code; ret = PAPI_event_name_to_code( EventName, &code ); if ( ret == PAPI_OK ) ret = PAPI_query_event( code ); papi_return( ret ); } /** @class PAPI_get_component_info * @brief Get information about a specific software component. * * @param cidx * Component index * * This function returns a pointer to a structure containing detailed * information about a specific software component in the PAPI library. * This includes versioning information, preset and native event * information, and more. * For full details, see @ref PAPI_component_info_t.
* * @par Examples: * @code const PAPI_component_info_t *cmpinfo = NULL; if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) exit(1); if ((cmpinfo = PAPI_get_component_info(0)) == NULL) exit(1); printf("This component supports %d Preset Events and %d Native events.\n", cmpinfo->num_preset_events, cmpinfo->num_native_events); * @endcode * * @see PAPI_get_executable_info * @see PAPI_get_hardware_info * @see PAPI_get_dmem_info * @see PAPI_get_opt * @see PAPI_component_info_t */ const PAPI_component_info_t * PAPI_get_component_info( int cidx ) { APIDBG( "Entry: Component Index %d\n", cidx); if ( _papi_hwi_invalid_cmp( cidx ) ) return ( NULL ); else return ( &( _papi_hwd[cidx]->cmp_info ) ); } /* PAPI_get_event_info: tests input EventCode and returns a filled in PAPI_event_info_t structure containing descriptive strings and values for the specified event. Handles both preset and native events by calling either _papi_hwi_get_event_info or _papi_hwi_get_native_event_info. */ /** @class PAPI_get_event_info * @brief Get the event's name and description info. * * @param EventCode * event code (preset or native) * @param info * structure with the event information @ref PAPI_event_info_t * * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ENOTPRESET * The PAPI preset mask was set, but the hardware event specified is * not a valid PAPI preset. * @retval PAPI_ENOEVNT * The PAPI preset is not available on the underlying hardware. * * This function fills the event information into a structure. * In Fortran, some fields of the structure are returned explicitly. * This function works with existing PAPI preset and native event codes. 
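 *
 * @par Example (illustrative sketch, not taken from the original documentation):
 * @code
 * PAPI_event_info_t info;
 * if ( PAPI_get_event_info( PAPI_TOT_INS, &info ) == PAPI_OK )
 *     printf( "%s: %s\n", info.symbol, info.long_descr );
 * @endcode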
* * @see PAPI_event_name_to_code */ int PAPI_get_event_info( int EventCode, PAPI_event_info_t *info ) { APIDBG( "Entry: EventCode: 0x%x, info: %p\n", EventCode, info); int i; if ( info == NULL ) papi_return( PAPI_EINVAL ); if ( IS_PRESET(EventCode) ) { i = EventCode & PAPI_PRESET_AND_MASK; if ( i >= num_all_presets ) { papi_return( PAPI_ENOTPRESET ); } papi_return( _papi_hwi_get_preset_event_info( EventCode, info ) ); } if ( IS_NATIVE(EventCode) ) { papi_return( _papi_hwi_get_native_event_info ( ( unsigned int ) EventCode, info ) ); } if ( IS_USER_DEFINED(EventCode) ) { papi_return( _papi_hwi_get_user_event_info( EventCode, info )); } papi_return( PAPI_ENOTPRESET ); } /** @class PAPI_event_code_to_name * @brief Convert a numeric hardware event code to a name. * * @par C Interface: * \#include @n * int PAPI_event_code_to_name( int EventCode, char * EventName ); * * PAPI_event_code_to_name is used to translate a 32-bit integer PAPI event * code into an ASCII PAPI event name. * Either Preset event codes or Native event codes can be passed to this routine. * Native event codes and names differ from platform to platform. * * @param EventCode * The numeric code for the event. * @param *EventName * A string containing the event name as listed in PAPI_presets or discussed in PAPI_native. * * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ENOTPRESET * The hardware event specified is not a valid PAPI preset. * @retval PAPI_ENOEVNT * The hardware event is not available on the underlying hardware. 
* * @par Examples: * @code * int EventCode, EventSet = PAPI_NULL; * int Event, number; * char EventCodeStr[PAPI_MAX_STR_LEN]; * // Create the EventSet * if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) * handle_error( 1 ); * // Add Total Instructions Executed to our EventSet * if ( PAPI_add_event( EventSet, PAPI_TOT_INS ) != PAPI_OK ) * handle_error( 1 ); * number = 1; * if ( PAPI_list_events( EventSet, &Event, &number ) != PAPI_OK ) * handle_error(1); * // Convert integer code to name string * if ( PAPI_event_code_to_name( Event, EventCodeStr ) != PAPI_OK ) * handle_error( 1 ); * printf( "Event Name: %s\n", EventCodeStr ); * @endcode * * @see PAPI_event_name_to_code * @see PAPI_remove_event * @see PAPI_get_event_info * @see PAPI_enum_event * @see PAPI_add_event * @see PAPI_presets * @see PAPI_native */ int PAPI_event_code_to_name( int EventCode, char *out ) { APIDBG( "Entry: EventCode: %#x, out: %p\n", EventCode, out); if ( out == NULL ) papi_return( PAPI_EINVAL ); if ( IS_PRESET(EventCode) ) { EventCode &= PAPI_PRESET_AND_MASK; if ( EventCode < 0 || EventCode >= num_all_presets ) papi_return( PAPI_ENOTPRESET ); int preset_index = EventCode; int compIdx = get_preset_cmp(&preset_index); if( compIdx < 0 ) { return PAPI_ENOEVNT; } if ( _papi_hwd[compIdx]->cmp_info.disabled == PAPI_EDELAY_INIT ) { int junk; _papi_hwd[compIdx]->ntv_enum_events(&junk, PAPI_ENUM_FIRST); } if (_papi_hwi_comp_presets[compIdx][preset_index].symbol == NULL ) papi_return( PAPI_ENOTPRESET ); strncpy( out, _papi_hwi_comp_presets[compIdx][preset_index].symbol, PAPI_MAX_STR_LEN-1 ); out[PAPI_MAX_STR_LEN-1] = '\0'; papi_return( PAPI_OK ); } if ( IS_NATIVE(EventCode) ) { return ( _papi_hwi_native_code_to_name ( ( unsigned int ) EventCode, out, PAPI_MAX_STR_LEN ) ); } if ( IS_USER_DEFINED(EventCode) ) { EventCode &= PAPI_UE_AND_MASK; if ( EventCode < 0 || EventCode >= user_defined_events_count ) papi_return( PAPI_ENOEVNT ); if (user_defined_events[EventCode].symbol == NULL ) papi_return( 
PAPI_ENOEVNT ); strncpy( out, user_defined_events[EventCode].symbol, PAPI_MAX_STR_LEN-1); out[PAPI_MAX_STR_LEN-1] = '\0'; papi_return( PAPI_OK ); } papi_return( PAPI_ENOEVNT ); } /** @class PAPI_event_name_to_code * @brief Convert a name to a numeric hardware event code. * * @par C Interface: * \#include @n * int PAPI_event_name_to_code( const char * EventName, int * EventCode ); * * PAPI_event_name_to_code is used to translate an ASCII PAPI event name * into an integer PAPI event code. * * @param *EventCode * The numeric code for the event. * @param *EventName * A string containing the event name as listed in PAPI_presets or discussed in PAPI_native. * * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ENOTPRESET * The hardware event specified is not a valid PAPI preset. * @retval PAPI_ENOINIT * The PAPI library has not been initialized. * @retval PAPI_ENOEVNT * The hardware event is not available on the underlying hardware. * * @par Examples: * @code * int EventCode, EventSet = PAPI_NULL; * // Convert to integer * if ( PAPI_event_name_to_code( "PAPI_TOT_INS", &EventCode ) != PAPI_OK ) * handle_error( 1 ); * // Create the EventSet * if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) * handle_error( 1 ); * // Add Total Instructions Executed to our EventSet * if ( PAPI_add_event( EventSet, EventCode ) != PAPI_OK ) * handle_error( 1 ); * @endcode * * @see PAPI_event_code_to_name * @see PAPI_remove_event * @see PAPI_get_event_info * @see PAPI_enum_event * @see PAPI_add_event * @see PAPI_add_named_event * @see PAPI_presets * @see PAPI_native */ int PAPI_event_name_to_code( const char *in, int *out ) { APIDBG("Entry: in: %p, name: %s, out: %p\n", in, in, out); int i; if ( ( in == NULL ) || ( out == NULL ) ) papi_return( PAPI_EINVAL ); if ( init_level == PAPI_NOT_INITED ) papi_return( PAPI_ENOINIT ); /* All presets start with "PAPI_" so no need to */ /* do an exhaustive search if that's not there */ if (strncmp(in, "PAPI_", 5) == 0) { /* 
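PAPI_event_code_to_name copies symbols with strncpy to PAPI_MAX_STR_LEN-1 bytes and then forces a terminating NUL, so an over-long symbol is truncated rather than left unterminated. A minimal sketch of that pattern, with a small stand-in buffer size in place of PAPI_MAX_STR_LEN:

```c
#include <assert.h>
#include <string.h>

/* Stand-in for PAPI_MAX_STR_LEN; the value is only for this sketch. */
#define MAX_NAME_LEN 8

/* Copy an event symbol into a fixed buffer the way
 * PAPI_event_code_to_name does: strncpy at most n-1 bytes, then
 * write the final NUL explicitly, since strncpy does not
 * NUL-terminate when the source fills the buffer. */
static void copy_symbol(char *out, const char *symbol)
{
    strncpy(out, symbol, MAX_NAME_LEN - 1);
    out[MAX_NAME_LEN - 1] = '\0';
}
```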
Split event name into base name and qualifier. */ int preset_idx = -1; char *evt_name_copy = strdup(in); if( NULL == evt_name_copy ) { PAPIERROR("Failed to allocate space for preset buffer.\n"); papi_return( PAPI_EINVAL ); } char *evt_base_name = strtok(evt_name_copy, ":"); if( NULL == evt_base_name ) { PAPIERROR("Failed to allocate space for base name of native event used in preset.\n"); papi_return( PAPI_EINVAL ); } /* Since the preset could live inside of either the CPU or component preset list, * set the list pointer appropriately. */ hwi_presets_t *_papi_hwi_list = NULL; /* Now check the component presets. */ int cmpnt, breakFlag = 0; for(cmpnt = 0; cmpnt < PAPI_NUM_COMP; cmpnt++ ) { _papi_hwi_list = _papi_hwi_comp_presets[cmpnt]; for(i = 0; i < _papi_hwi_max_presets[cmpnt]; i++ ) { if ( ( _papi_hwi_list[i].symbol ) && ( strcasecmp( _papi_hwi_list[i].symbol, evt_base_name ) == 0) ) { *out = ( int ) ( (i + _papi_hwi_start_idx[cmpnt]) | PAPI_PRESET_MASK ); if ( _papi_hwd[cmpnt]->cmp_info.disabled == PAPI_EDELAY_INIT ) { int junk; _papi_hwd[cmpnt]->ntv_enum_events(&junk, PAPI_ENUM_FIRST); } preset_idx = i; breakFlag = 1; break; } } /* Checks whether preset was found. */ if( breakFlag ) { break; } } free(evt_name_copy); /* User may have provided an invalid event name. */ if( NULL != _papi_hwi_list ) { /* Keep track of all qualifiers provided by the user. 
*/ hwi_presets_t *prstPtr = &_papi_hwi_list[preset_idx]; int status = overwrite_qualifiers(prstPtr, in, 1); if( status < 0 ) { papi_return( PAPI_ENOMEM ); } status = construct_qualified_event(prstPtr); if( status < 0 ) { papi_return( status ); } papi_return( PAPI_OK ); } } // check to see if it is a user defined event for ( i=0; i < user_defined_events_count ; i++ ) { APIDBG("&user_defined_events[%d]: %p, user_defined_events[%d].symbol: %s, user_defined_events[%d].count: %d\n", i, &user_defined_events[i], i, user_defined_events[i].symbol, i, user_defined_events[i].count); if (user_defined_events[i].symbol == NULL) break; if (user_defined_events[i].count == 0) break; if ( strcasecmp( user_defined_events[i].symbol, in ) == 0 ) { *out = (int) ( i | PAPI_UE_MASK ); papi_return( PAPI_OK ); } } // go look for native events defined by one of the components papi_return( _papi_hwi_native_name_to_code( in, out ) ); } /* Updates EventCode to next valid value, or returns error; modifier can specify {all / available} for presets, or other values for native tables and may be platform specific (Major groups / all mask bits; P / M / E chip, etc) */ /** @class PAPI_enum_event * @brief Enumerate PAPI preset or native events. * * @par C Interface: * \#include @n * int PAPI_enum_event( int * EventCode, int modifer ); * * Given a preset or native event code, PAPI_enum_event replaces the event * code with the next available event in either the preset or native table. * The modifier argument affects which events are returned. * For all platforms and event types, a value of PAPI_ENUM_ALL (zero) * directs the function to return all possible events. @n * * For preset events, a TRUE (non-zero) value currently directs the function * to return event codes only for PAPI preset events available on this platform. * This may change in the future. * For native events, the effect of the modifier argument is different on each platform. * See the discussion below for platform-specific definitions. 
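PAPI_event_name_to_code first splits a qualified name such as "PAPI_FP_OPS:device=0" into a base name and trailing qualifiers before searching the preset tables. A sketch of that split on a heap copy of the input, assuming only that ':' separates the base name from qualifiers:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'd copy of the event name truncated at the first
 * ':' (the same effect strtok(copy, ":") has in
 * PAPI_event_name_to_code), or NULL on allocation failure.
 * The caller frees the result. */
static char *event_base_name(const char *name)
{
    size_t n = strlen(name) + 1;
    char *copy = malloc(n);
    if (copy == NULL)
        return NULL;
    memcpy(copy, name, n);
    char *colon = strchr(copy, ':');
    if (colon != NULL)
        *colon = '\0';
    return copy;
}
```

Working on a copy is essential because the input is const and the qualifiers after the ':' are still needed later by overwrite_qualifiers.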
* * @param *EventCode * A defined preset or native event such as PAPI_TOT_INS. * @param modifier * Modifies the search logic. See below for full list. * For native events, each platform behaves differently. * See platform-specific documentation for details. * * @retval PAPI_ENOEVNT * The next requested PAPI preset or native event is not available on * the underlying hardware. * * @par Examples: * @code * // Scan for all supported native events on this platform * printf( "Name\t\t\t Code\t Description\n" ); * do { * retval = PAPI_get_event_info( i, &info ); * if ( retval == PAPI_OK ) { * printf( "%-30s %#-10x\n%s\n", info.symbol, info.event_code, info.long_descr ); * } * } while ( PAPI_enum_event( &i, PAPI_ENUM_ALL ) == PAPI_OK ); * @endcode * * @par Generic Modifiers * The following values are implemented for preset events *
 * <ul>
 * <li> PAPI_ENUM_EVENTS -- Enumerate all (default)
 * <li> PAPI_ENUM_FIRST -- Enumerate first event (preset or native);
 *      preset/native chosen based on type of EventCode
 * </ul>
 *
 * @par Native Modifiers
 * The following values are implemented for native events
 * <ul>
 * <li> PAPI_NTV_ENUM_UMASKS -- Given an event, iterate through
 *      possible umasks one at a time
 * <li> PAPI_NTV_ENUM_UMASK_COMBOS -- Given an event, iterate through
 *      all possible combinations of umasks.
 *      This is not implemented on libpfm4.
 * </ul>
 *
 * @par Preset Modifiers
 * The following values are implemented for preset events
 * <ul>
 * <li> PAPI_PRESET_ENUM_AVAIL -- enumerate only available presets
 * <li> PAPI_PRESET_ENUM_CPU -- enumerate CPU preset events
 * <li> PAPI_PRESET_ENUM_CPU_AVAIL -- enumerate available CPU preset events
 * <li> PAPI_PRESET_ENUM_FIRST_COMP -- enumerate first component preset event
 * </ul>
* * @see PAPI @n * PAPIF @n * PAPI_enum_cmp_event @n * PAPI_get_event_info @n * PAPI_event_name_to_code @n * PAPI_preset @n * PAPI_native */ int PAPI_enum_event( int *EventCode, int modifier ) { APIDBG( "Entry: EventCode: %#x, modifier: %d\n", *EventCode, modifier); int i = *EventCode; int retval; int cidx; int event_code; char *evt_name; cidx = _papi_hwi_component_index( *EventCode ); if (cidx < 0) return PAPI_ENOCMP; /* check to see if a valid modifier is provided */ if (modifier != PAPI_ENUM_EVENTS && modifier != PAPI_ENUM_FIRST && modifier != PAPI_ENUM_ALL && modifier != PAPI_PRESET_ENUM_AVAIL && modifier != PAPI_PRESET_ENUM_CPU && modifier != PAPI_PRESET_ENUM_CPU_AVAIL && modifier != PAPI_PRESET_ENUM_FIRST_COMP && modifier != PAPI_NTV_ENUM_UMASKS && modifier != PAPI_NTV_ENUM_UMASK_COMBOS) { return PAPI_EINVAL; } /* If it is a component preset, it will be in a separate array. */ int preset_index; hwi_presets_t *_papi_hwi_list; if ( IS_PRESET(i) ) { /* Set to the first preset. */ if ( modifier == PAPI_ENUM_FIRST ) { *EventCode = ( int ) PAPI_PRESET_MASK; APIDBG("EXIT: *EventCode: %#x\n", *EventCode); return ( PAPI_OK ); } i &= PAPI_PRESET_AND_MASK; /* Iterate over all or all available presets. */ if ( modifier == PAPI_ENUM_EVENTS || modifier == PAPI_PRESET_ENUM_AVAIL ) { if ( _papi_hwd[cidx]->cmp_info.disabled == PAPI_EDELAY_INIT ) { int junk; _papi_hwd[cidx]->ntv_enum_events(&junk, PAPI_ENUM_FIRST); } /* NULL pointer used to terminate the list. However, now we have * more presets that exist beyond the bounds of the original * array, so skip over the NULL entries. */ do { if ( ++i >= num_all_presets ) { return ( PAPI_EINVAL ); } /* Find the component to which the preset belongs and set the * preset index relative to the component's presets' index range. 
*/ preset_index = i; int compIdx = get_preset_cmp(&preset_index); if( compIdx < 0 ) { return ( PAPI_ENOEVNT ); } _papi_hwi_list = _papi_hwi_comp_presets[compIdx]; } while ( _papi_hwi_list[preset_index].symbol == NULL || (modifier == PAPI_PRESET_ENUM_AVAIL && _papi_hwi_list[preset_index].count == 0) ); *EventCode = ( int ) ( i | PAPI_PRESET_MASK ); APIDBG("EXIT: *EventCode: %#x\n", *EventCode); return ( PAPI_OK ); } /* Set to the first component preset. */ if ( modifier == PAPI_PRESET_ENUM_FIRST_COMP ) { preset_index = get_first_cmp_preset_idx(); if( preset_index < 0 ) { return ( PAPI_ENOEVNT ); } if ( _papi_hwd[first_comp_with_presets]->cmp_info.disabled == PAPI_EDELAY_INIT ) { int junk; _papi_hwd[first_comp_with_presets]->ntv_enum_events(&junk, PAPI_ENUM_FIRST); } *EventCode = ( int ) ( preset_index | PAPI_PRESET_MASK ); APIDBG("EXIT: *EventCode: %#x\n", *EventCode); return ( PAPI_OK ); } /* Iterate over CPU presets. */ if ( modifier == PAPI_PRESET_ENUM_CPU || modifier == PAPI_PRESET_ENUM_CPU_AVAIL ) { while ( ++i < PAPI_MAX_PRESET_EVENTS ) { if ( _papi_hwi_presets[i].symbol == NULL ) { APIDBG("EXIT: PAPI_ENOEVNT\n"); return ( PAPI_ENOEVNT ); /* NULL pointer terminates list */ } if ( modifier == PAPI_PRESET_ENUM_CPU_AVAIL && _papi_hwi_presets[i].count == 0 ) { continue; } *EventCode = ( int ) ( i | PAPI_PRESET_MASK ); APIDBG("EXIT: *EventCode: %#x\n", *EventCode); return ( PAPI_OK ); } } papi_return( PAPI_EINVAL ); } if ( IS_NATIVE(i) ) { // save event code so components can get it with call to: _papi_hwi_get_papi_event_code() _papi_hwi_set_papi_event_code(*EventCode, 0); /* Should check against num native events here */ event_code=_papi_hwi_eventcode_to_native((int)*EventCode); retval = _papi_hwd[cidx]->ntv_enum_events((unsigned int *)&event_code, modifier ); if (retval!=PAPI_OK) { APIDBG("VMW: retval=%d\n",retval); return PAPI_EINVAL; } evt_name = _papi_hwi_get_papi_event_string(); *EventCode = _papi_hwi_native_to_eventcode(cidx, event_code, -1, evt_name); 
_papi_hwi_free_papi_event_string(); APIDBG("EXIT: *EventCode: %#x\n", *EventCode); return retval; } if ( IS_USER_DEFINED(i) ) { if (user_defined_events_count == 0) { APIDBG("EXIT: PAPI_ENOEVNT\n"); return PAPI_ENOEVNT; } if ( modifier == PAPI_ENUM_FIRST ) { *EventCode = (int) (0 | PAPI_UE_MASK); APIDBG("EXIT: *EventCode: %#x\n", *EventCode); return ( PAPI_OK ); } i &= PAPI_UE_AND_MASK; ++i; if ( i <= 0 || i >= user_defined_events_count ) { APIDBG("EXIT: PAPI_ENOEVNT\n"); return ( PAPI_ENOEVNT ); } // if next entry does not have an event name, we are done if (user_defined_events[i].symbol == NULL) { APIDBG("EXIT: PAPI_ENOEVNT\n"); return ( PAPI_ENOEVNT ); } // if next entry does not map to any other events, we are done if (user_defined_events[i].count == 0) { APIDBG("EXIT: PAPI_ENOEVNT\n"); return ( PAPI_ENOEVNT ); } *EventCode = (int) (i | PAPI_UE_MASK); APIDBG("EXIT: *EventCode: %#x\n", *EventCode); return ( PAPI_OK ); } papi_return( PAPI_EINVAL ); } /** @class PAPI_enum_cmp_event * @brief Enumerate PAPI preset or native events for a given component * * @par C Interface: * \#include @n * int PAPI_enum_cmp_event( int *EventCode, int modifer, int cidx ); * * Given an event code, PAPI_enum_event replaces the event * code with the next available event. * * The modifier argument affects which events are returned. * For all platforms and event types, a value of PAPI_ENUM_ALL (zero) * directs the function to return all possible events. @n * * For native events, the effect of the modifier argument may be * different on each platform. * See the discussion below for platform-specific definitions. * * @param *EventCode * A defined preset or native event such as PAPI_TOT_INS. * @param modifier * Modifies the search logic. See below for full list. * For native events, each platform behaves differently. * See platform-specific documentation for details. 
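The PAPI_ENUM_EVENTS branch of PAPI_enum_event above does not stop at the first empty slot: preset tables can contain NULL gaps, so the loop advances past them until it finds an occupied entry or runs off the end. A miniature of that walk over a local table:

```c
#include <assert.h>
#include <stddef.h>

/* Advance index i to the next occupied slot in a preset symbol
 * table, skipping NULL entries, the way the do/while loop in
 * PAPI_enum_event does. Returns the next occupied index, or -1
 * when the table bound is reached. */
static int next_preset(const char **symbols, int nsyms, int i)
{
    do {
        if (++i >= nsyms)
            return -1;
    } while (symbols[i] == NULL);
    return i;
}
```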
* * @param cidx * Specifies the component to search in * * @retval PAPI_ENOEVNT * The next requested PAPI preset or native event is not available on * the underlying hardware. * * @par Examples: * @code * // Scan for all supported native events on the first component * printf( "Name\t\t\t Code\t Description\n" ); * do { * retval = PAPI_get_event_info( i, &info ); * if ( retval == PAPI_OK ) { * printf( "%-30s %#-10x\n%s\n", info.symbol, info.event_code, info.long_descr ); * } * } while ( PAPI_enum_cmp_event( &i, PAPI_ENUM_ALL, 0 ) == PAPI_OK ); * @endcode * * @par Generic Modifiers * The following values are implemented for preset events *
 * <ul>
 * <li> PAPI_ENUM_EVENTS -- Enumerate all (default)
 * <li> PAPI_ENUM_FIRST -- Enumerate first event (preset or native);
 *      preset/native chosen based on type of EventCode
 * </ul>
 *
 * @par Native Modifiers
 * The following values are implemented for native events
 * <ul>
 * <li> PAPI_NTV_ENUM_UMASKS -- Given an event, iterate through
 *      possible umasks one at a time
 * <li> PAPI_NTV_ENUM_UMASK_COMBOS -- Given an event, iterate through
 *      all possible combinations of umasks.
 *      This is not implemented on libpfm4.
 * </ul>
 *
 * @par Preset Modifiers
 * The following values are implemented for preset events
 * <ul>
 * <li> PAPI_PRESET_ENUM_AVAIL -- enumerate only available presets
 * <li> PAPI_PRESET_ENUM_MSC -- Miscellaneous preset events
 * <li> PAPI_PRESET_ENUM_INS -- Instruction related preset events
 * <li> PAPI_PRESET_ENUM_IDL -- Stalled or Idle preset events
 * <li> PAPI_PRESET_ENUM_BR -- Branch related preset events
 * <li> PAPI_PRESET_ENUM_CND -- Conditional preset events
 * <li> PAPI_PRESET_ENUM_MEM -- Memory related preset events
 * <li> PAPI_PRESET_ENUM_CACH -- Cache related preset events
 * <li> PAPI_PRESET_ENUM_L1 -- L1 cache related preset events
 * <li> PAPI_PRESET_ENUM_L2 -- L2 cache related preset events
 * <li> PAPI_PRESET_ENUM_L3 -- L3 cache related preset events
 * <li> PAPI_PRESET_ENUM_TLB -- Translation Lookaside Buffer events
 * <li> PAPI_PRESET_ENUM_FP -- Floating Point related preset events
 * </ul>
 *
 * @par ITANIUM Modifiers
 * The following values are implemented for modifier on Itanium:
 * <ul>
 * <li> PAPI_NTV_ENUM_IARR -- Enumerate IAR (instruction address ranging) events
 * <li> PAPI_NTV_ENUM_DARR -- Enumerate DAR (data address ranging) events
 * <li> PAPI_NTV_ENUM_OPCM -- Enumerate OPC (opcode matching) events
 * <li> PAPI_NTV_ENUM_IEAR -- Enumerate IEAR (instruction event address register) events
 * <li> PAPI_NTV_ENUM_DEAR -- Enumerate DEAR (data event address register) events
 * </ul>
 *
 * @par POWER Modifiers
 * The following values are implemented for POWER
 * <ul>
 * <li> PAPI_NTV_ENUM_GROUPS -- Enumerate groups to which an event belongs
 * </ul>
* * @see PAPI @n * PAPIF @n * PAPI_enum_event @n * PAPI_get_event_info @n * PAPI_event_name_to_code @n * PAPI_preset @n * PAPI_native */ int PAPI_enum_cmp_event( int *EventCode, int modifier, int cidx ) { APIDBG( "Entry: EventCode: %#x, modifier: %d, cidx: %d\n", *EventCode, modifier, cidx); int i = *EventCode; int retval; int event_code; char *evt_name; if ( _papi_hwi_invalid_cmp(cidx) ) { return PAPI_ENOCMP; } if (_papi_hwd[cidx]->cmp_info.disabled && _papi_hwd[cidx]->cmp_info.disabled != PAPI_EDELAY_INIT) { return PAPI_ENOCMP; } if ( IS_PRESET(i) ) { if ( _papi_hwd[cidx]->cmp_info.disabled == PAPI_EDELAY_INIT ) { int junk; _papi_hwd[cidx]->ntv_enum_events(&junk, PAPI_ENUM_FIRST); } int preset_index; hwi_presets_t *_papi_hwi_list; /* Set to the first preset. */ if ( modifier == PAPI_ENUM_FIRST ) { *EventCode = ( int ) ( _papi_hwi_start_idx[cidx] | PAPI_PRESET_MASK ); APIDBG("EXIT: *EventCode: %#x\n", *EventCode); return ( PAPI_OK ); } i &= PAPI_PRESET_AND_MASK; /* Iterate over all or all available presets. */ if ( modifier == PAPI_ENUM_EVENTS || modifier == PAPI_PRESET_ENUM_AVAIL ) { /* NULL pointer used to terminate the list. However, now we have * more presets that exist beyond the bounds of the original * array, so skip over the NULL entries. */ do { if ( ++i >= _papi_hwi_start_idx[cidx] + _papi_hwi_max_presets[cidx] ) { return ( PAPI_EINVAL ); } /* Find the component to which the preset belongs. 
*/ _papi_hwi_list = _papi_hwi_comp_presets[cidx]; preset_index = i - _papi_hwi_start_idx[cidx]; } while ( _papi_hwi_list[preset_index].symbol == NULL || (modifier == PAPI_PRESET_ENUM_AVAIL && _papi_hwi_list[preset_index].count == 0) ); *EventCode = ( int ) ( i | PAPI_PRESET_MASK ); APIDBG("EXIT: *EventCode: %#x\n", *EventCode); return ( PAPI_OK ); } papi_return( PAPI_EINVAL ); } if ( IS_NATIVE(i) ) { // save event code so components can get it with call to: _papi_hwi_get_papi_event_code() _papi_hwi_set_papi_event_code(*EventCode, 0); /* Should we check against num native events here? */ event_code=_papi_hwi_eventcode_to_native(*EventCode); retval = _papi_hwd[cidx]->ntv_enum_events((unsigned int *)&event_code, modifier ); if (retval!=PAPI_OK) { APIDBG("EXIT: PAPI_EINVAL retval=%d\n",retval); return PAPI_EINVAL; } evt_name = _papi_hwi_get_papi_event_string(); *EventCode = _papi_hwi_native_to_eventcode(cidx, event_code, -1, evt_name); _papi_hwi_free_papi_event_string(); APIDBG("EXIT: *EventCode: %#x\n", *EventCode); return retval; } papi_return( PAPI_EINVAL ); } /** @class PAPI_create_eventset * @brief Create a new empty PAPI EventSet. * * @par C Interface: * \#include @n * PAPI_create_eventset( int * EventSet ); * * PAPI_create_eventset creates a new EventSet pointed to by EventSet, * which must be initialized to PAPI_NULL before calling this routine. * The user may then add hardware events to the event set by calling * PAPI_add_event or similar routines. * * @note PAPI-C uses a late binding model to bind EventSets to components. * When an EventSet is first created it is not bound to a component. * This will cause some API calls that modify EventSet options to fail. * An EventSet can be bound to a component explicitly by calling * PAPI_assign_eventset_component or implicitly by calling PAPI_add_event * or similar routines. * * @param *EventSet * Address of an integer location to store the new EventSet handle. 
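The index arithmetic in PAPI_enum_cmp_event above converts a global preset index into a component-local one by subtracting that component's start index (kept in _papi_hwi_start_idx, with the per-component table size in _papi_hwi_max_presets). A sketch of that mapping with made-up start indexes and sizes:

```c
#include <assert.h>

/* Illustrative stand-ins for _papi_hwi_start_idx and
 * _papi_hwi_max_presets; the real arrays are filled at init time. */
#define NCOMP 3
static const int start_idx[NCOMP]    = { 0, 128, 160 };
static const int max_presets[NCOMP]  = { 128, 32, 16 };

/* Return the local index of global preset i within component cidx,
 * or -1 if i falls outside that component's preset range
 * [start_idx[cidx], start_idx[cidx] + max_presets[cidx]). */
static int local_preset_index(int cidx, int i)
{
    if (i < start_idx[cidx] || i >= start_idx[cidx] + max_presets[cidx])
        return -1;
    return i - start_idx[cidx];
}
```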
* * @exception PAPI_EINVAL * The argument handle has not been initialized to PAPI_NULL or the argument is a NULL pointer. * * @exception PAPI_ENOMEM * Insufficient memory to complete the operation. * * @par Examples: * @code * int EventSet = PAPI_NULL; * if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) * handle_error( 1 ); * // Add Total Instructions Executed to our EventSet * if ( PAPI_add_event( EventSet, PAPI_TOT_INS) != PAPI_OK ) * handle_error( 1 ); * @endcode * * @see PAPI_add_event @n * PAPI_assign_eventset_component @n * PAPI_destroy_eventset @n * PAPI_cleanup_eventset */ int PAPI_create_eventset( int *EventSet ) { APIDBG("Entry: EventSet: %p\n", EventSet); ThreadInfo_t *master; int retval; if ( init_level == PAPI_NOT_INITED ) papi_return( PAPI_ENOINIT ); retval = _papi_hwi_lookup_or_create_thread( &master, 0 ); if ( retval ) papi_return( retval ); papi_return( _papi_hwi_create_eventset( EventSet, master ) ); } /** @class PAPI_assign_eventset_component * @brief Assign a component index to an existing but empty EventSet. * * @par C Interface: * \#include @n * PAPI_assign_eventset_component( int EventSet, int cidx ); * * @param EventSet * An integer identifier for an existing EventSet. * @param cidx * An integer identifier for a component. * By convention, component 0 is always the cpu component. * * @retval PAPI_ENOCMP * The argument cidx is not a valid component. * @retval PAPI_ENOEVST * The EventSet doesn't exist. * @retval PAPI_ENOMEM * Insufficient memory to complete the operation. * * PAPI_assign_eventset_component assigns a specific component index, * as specified by cidx, to a new EventSet identified by EventSet, as obtained * from PAPI_create_eventset. EventSets are ordinarily automatically bound * to components when the first event is added. This routine is useful to * explicitly bind an EventSet to a component before setting component related * options. 
* * @par Examples: * @code * int EventSet = PAPI_NULL; * if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) * handle_error( 1 ); * // Bind our EventSet to the cpu component * if ( PAPI_assign_eventset_component( EventSet, 0 ) != PAPI_OK ) * handle_error( 1 ); * // Convert our EventSet to multiplexing * if ( PAPI_set_multiplex( EventSet ) != PAPI_OK ) * handle_error( 1 ); * @endcode * * @see PAPI_set_opt @n * PAPI_create_eventset @n * PAPI_add_events @n * PAPI_set_multiplex */ int PAPI_assign_eventset_component( int EventSet, int cidx ) { EventSetInfo_t *ESI; int retval; ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); /* validate cidx */ retval = valid_component( cidx ); if ( retval < 0 ) papi_return( retval ); /* cowardly refuse to reassign eventsets */ if ( ESI->CmpIdx >= 0 ) return PAPI_EINVAL; return ( _papi_hwi_assign_eventset( ESI, cidx ) ); } /** @class PAPI_get_eventset_component * @brief return index for component an eventset is assigned to * * @retval PAPI_ENOEVST * eventset does not exist * @retval PAPI_ENOCMP * component is invalid or does not exist * @retval positive value * valid component index * * @param EventSet * EventSet for which we want to know the component index * @par Examples: * @code int cidx,eventcode; cidx = PAPI_get_eventset_component(eventset); * @endcode * PAPI_get_eventset_component() returns the component an event * belongs to. 
* @see PAPI_get_event_component */ int PAPI_get_eventset_component( int EventSet) { EventSetInfo_t *ESI; int retval; /* validate eventset */ ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); /* check if a component has been assigned */ if ( ESI->CmpIdx < 0 ) papi_return( PAPI_ENOCMP ); /* validate CmpIdx */ retval = valid_component( ESI->CmpIdx ); if ( retval < 0 ) papi_return( retval ); /* return the index */ return ( ESI->CmpIdx ); } /** @class PAPI_add_event * @brief add PAPI preset or native hardware event to an event set * * @par C Interface: * \#include @n * int PAPI_add_event( int EventSet, int EventCode ); * * PAPI_add_event adds one event to a PAPI Event Set. @n * A hardware event can be either a PAPI preset or a native hardware event code. * For a list of PAPI preset events, see PAPI_presets or run the avail test case * in the PAPI distribution. PAPI presets can be passed to PAPI_query_event to see * if they exist on the underlying architecture. * For a list of native events available on current platform, run the papi_native_avail * utility in the PAPI distribution. For the encoding of native events, * see PAPI_event_name_to_code to learn how to generate native code for the * supported native event on the underlying architecture. * * @param EventSet * An integer handle for a PAPI Event Set as created by PAPI_create_eventset. * @param EventCode * A defined event such as PAPI_TOT_INS. * * @retval Positive-Integer * The number of consecutive elements that succeeded before the error. * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ENOMEM * Insufficient memory to complete the operation. * @retval PAPI_ENOEVST * The event set specified does not exist. * @retval PAPI_EISRUN * The event set is currently counting events. * @retval PAPI_ECNFLCT * The underlying counter hardware can not count this event and other events * in the event set simultaneously. 
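PAPI-C's late-binding model described above (an EventSet is created unbound, with a negative component index, and is bound exactly once, either explicitly via PAPI_assign_eventset_component or implicitly by the first event added) can be sketched with a toy event-set record; the struct and error codes here are stand-ins, not PAPI's own types:

```c
#include <assert.h>

/* Stand-ins for PAPI_OK / PAPI_EINVAL in this sketch. */
enum { SK_OK = 0, SK_EINVAL = -1 };

/* Toy event set: only the component-binding field matters here. */
typedef struct { int CmpIdx; } eventset_t;

/* A freshly created event set is bound to no component. */
static void eventset_init(eventset_t *es) { es->CmpIdx = -1; }

/* Bind once; like PAPI_assign_eventset_component, cowardly refuse
 * to reassign an event set that already has a component. */
static int eventset_bind(eventset_t *es, int cidx)
{
    if (es->CmpIdx >= 0)
        return SK_EINVAL;
    es->CmpIdx = cidx;
    return SK_OK;
}
```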
* @retval PAPI_ENOEVNT * The PAPI preset is not available on the underlying hardware. * @retval PAPI_EBUG * Internal error, please send mail to the developers. * @retval PAPI_EMULPASS * Event exists, but cannot be counted due to multiple passes required by hardware. * * @par Examples: * @code * int EventSet = PAPI_NULL; * unsigned int native = 0x0; * if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) * handle_error( 1 ); * // Add Total Instructions Executed to our EventSet * if ( PAPI_add_event( EventSet, PAPI_TOT_INS ) != PAPI_OK ) * handle_error( 1 ); * // Add native event PM_CYC to EventSet * if ( PAPI_event_name_to_code( "PM_CYC", &native ) != PAPI_OK ) * handle_error( 1 ); * if ( PAPI_add_event( EventSet, native ) != PAPI_OK ) * handle_error( 1 ); * @endcode * * @bug * The vector function should take a pointer to a length argument so a proper * return value can be set upon partial success. * * @see PAPI_cleanup_eventset @n * PAPI_destroy_eventset @n * PAPI_event_code_to_name @n * PAPI_remove_events @n * PAPI_query_event @n * PAPI_presets @n * PAPI_native @n * PAPI_remove_event */ int PAPI_add_event( int EventSet, int EventCode ) { APIDBG("Entry: EventSet: %d, EventCode: %#x\n", EventSet, EventCode); EventSetInfo_t *ESI; /* Is the EventSet already in existence? */ ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); /* Check argument for validity */ if ( ( ( EventCode & PAPI_PRESET_MASK ) == 0 ) && ( EventCode & PAPI_NATIVE_MASK ) == 0 ) papi_return( PAPI_EINVAL ); /* Of course, it must be stopped in order to modify it. */ if ( ESI->state & PAPI_RUNNING ) papi_return( PAPI_EISRUN ); /* Now do the magic. */ int retval = _papi_hwi_add_event( ESI, EventCode ); papi_return( retval ); } /** @class PAPI_remove_event * @brief removes a hardware event from a PAPI event set. * * A hardware event can be either a PAPI Preset or a native hardware * event code. 
For a list of PAPI preset events, see PAPI_presets or * run the papi_avail utility in the PAPI distribution. PAPI Presets * can be passed to PAPI_query_event to see if they exist on the * underlying architecture. For a list of native events available on * the current platform, run papi_native_avail in the PAPI distribution. * * @par C Interface: * \#include @n * int PAPI_remove_event( int EventSet, int EventCode ); * * @param[in] EventSet * -- an integer handle for a PAPI event set as created * by PAPI_create_eventset * @param[in] EventCode * -- a defined event such as PAPI_TOT_INS or a native event. * * @retval PAPI_OK * Everything worked. * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ENOEVST * The EventSet specified does not exist. * @retval PAPI_EISRUN * The EventSet is currently counting events. * @retval PAPI_ECNFLCT * The underlying counter hardware can not count this * event and other events in the EventSet simultaneously. * @retval PAPI_ENOEVNT * The PAPI preset is not available on the underlying hardware. 
* * @par Example: * @code * int EventSet = PAPI_NULL; * int ret; * * // Create an empty EventSet * ret = PAPI_create_eventset(&EventSet); * if (ret != PAPI_OK) handle_error(ret); * * // Add Total Instructions Executed to our EventSet * ret = PAPI_add_event(EventSet, PAPI_TOT_INS); * if (ret != PAPI_OK) handle_error(ret); * * // Start counting * ret = PAPI_start(EventSet); * if (ret != PAPI_OK) handle_error(ret); * * // Stop counting, ignore values * ret = PAPI_stop(EventSet, NULL); * if (ret != PAPI_OK) handle_error(ret); * * // Remove event * ret = PAPI_remove_event(EventSet, PAPI_TOT_INS); * if (ret != PAPI_OK) handle_error(ret); * @endcode * * @see PAPI_cleanup_eventset * @see PAPI_destroy_eventset * @see PAPI_event_name_to_code * @see PAPI_presets * @see PAPI_add_event * @see PAPI_add_events */ int PAPI_remove_event( int EventSet, int EventCode ) { APIDBG("Entry: EventSet: %d, EventCode: %#x\n", EventSet, EventCode); EventSetInfo_t *ESI; int i,retval; /* check for pre-existing ESI */ ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); /* Check argument for validity */ if ( ( !IS_PRESET(EventCode) ) && ( !IS_NATIVE(EventCode) ) && ( !IS_USER_DEFINED(EventCode) )) papi_return( PAPI_EINVAL ); /* Of course, it must be stopped in order to modify it. 
*/ if ( !( ESI->state & PAPI_STOPPED ) ) papi_return( PAPI_EISRUN ); /* if the state is PAPI_OVERFLOWING, you must first call PAPI_overflow with threshold=0 to remove the overflow flag */ /* Turn off the event that is overflowing */ if ( ESI->state & PAPI_OVERFLOWING ) { for ( i = 0; i < ESI->overflow.event_counter; i++ ) { if ( ESI->overflow.EventCode[i] == EventCode ) { retval = PAPI_overflow( EventSet, EventCode, 0, 0, ESI->overflow.handler ); if (retval!=PAPI_OK) return retval; break; } } } /* force the user to call PAPI_profil to clear the PAPI_PROFILING flag */ if ( ESI->state & PAPI_PROFILING ) { for ( i = 0; i < ESI->profile.event_counter; i++ ) { if ( ESI->profile.EventCode[i] == EventCode ) { PAPI_sprofil( NULL, 0, EventSet, EventCode, 0, 0 ); break; } } } /* Now do the magic. */ papi_return( _papi_hwi_remove_event( ESI, EventCode ) ); } /** @class PAPI_add_named_event * @brief add PAPI preset or native hardware event by name to an EventSet * * @par C Interface: * \#include @n * int PAPI_add_named_event( int EventSet, const char *EventName ); * * PAPI_add_named_event adds one event to a PAPI EventSet. @n * A hardware event can be either a PAPI preset or a native hardware event code. * For a list of PAPI preset events, see PAPI_presets or run the avail test case * in the PAPI distribution. PAPI presets can be passed to PAPI_query_event to see * if they exist on the underlying architecture. * For a list of native events available on current platform, run the papi_native_avail * utility in the PAPI distribution. * * @param EventSet * An integer handle for a PAPI Event Set as created by PAPI_create_eventset. * @param EventCode * A defined event such as PAPI_TOT_INS. * * @retval Positive-Integer * The number of consecutive elements that succeeded before the error. * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ENOINIT * The PAPI library has not been initialized. 
 * @retval PAPI_ENOMEM * Insufficient memory to complete the operation. * @retval PAPI_ENOEVST * The event set specified does not exist. * @retval PAPI_EISRUN * The event set is currently counting events. * @retval PAPI_ECNFLCT * The underlying counter hardware can not count this event and other events * in the event set simultaneously. * @retval PAPI_ENOEVNT * The PAPI preset is not available on the underlying hardware. * @retval PAPI_EBUG * Internal error, please send mail to the developers. * @retval PAPI_EMULPASS * Event exists, but cannot be counted due to multiple passes required by hardware. * * @par Examples: * @code * const char *EventName = "PAPI_TOT_INS"; * int EventSet = PAPI_NULL; * if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) * handle_error( 1 ); * // Add Total Instructions Executed to our EventSet * if ( PAPI_add_named_event( EventSet, EventName ) != PAPI_OK ) * handle_error( 1 ); * // Add native event PM_CYC to EventSet * if ( PAPI_add_named_event( EventSet, "PM_CYC" ) != PAPI_OK ) * handle_error( 1 ); * @endcode * * @bug * The vector function should take a pointer to a length argument so a proper * return value can be set upon partial success. * * @see PAPI_add_event @n * PAPI_query_named_event @n * PAPI_remove_named_event */ int PAPI_add_named_event( int EventSet, const char *EventName ) { APIDBG("Entry: EventSet: %d, EventName: %s\n", EventSet, EventName); int ret, code; ret = PAPI_event_name_to_code( EventName, &code ); if ( ret != PAPI_OK ) { APIDBG("EXIT: return: %d\n", ret); return ret; // do not use papi_return here because if there was an error PAPI_event_name_to_code already reported it } ret = PAPI_add_event( EventSet, code ); APIDBG("EXIT: return: %d\n", ret); return ret; // do not use papi_return here because if there was an error PAPI_add_event already reported it } /** @class PAPI_remove_named_event * @brief removes a named hardware event from a PAPI event set. 
* * A hardware event can be either a PAPI Preset or a native hardware * event code. For a list of PAPI preset events, see PAPI_presets or * run the papi_avail utility in the PAPI distribution. PAPI Presets * can be passed to PAPI_query_event to see if they exist on the * underlying architecture. For a list of native events available on * the current platform, run papi_native_avail in the PAPI distribution. * * @par C Interface: * \#include <papi.h> @n * int PAPI_remove_named_event( int EventSet, const char *EventName ); * * @param[in] EventSet * -- an integer handle for a PAPI event set as created * by PAPI_create_eventset * @param[in] EventName * -- a defined event such as PAPI_TOT_INS or a native event. * * @retval PAPI_OK * Everything worked. * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ENOINIT * The PAPI library has not been initialized. * @retval PAPI_ENOEVST * The EventSet specified does not exist. * @retval PAPI_EISRUN * The EventSet is currently counting events. * @retval PAPI_ECNFLCT * The underlying counter hardware can not count this * event and other events in the EventSet simultaneously. * @retval PAPI_ENOEVNT * The PAPI preset is not available on the underlying hardware.
* * @par Example: * @code * const char *EventName = "PAPI_TOT_INS"; * int EventSet = PAPI_NULL; * int ret; * * // Create an empty EventSet * ret = PAPI_create_eventset(&EventSet); * if (ret != PAPI_OK) handle_error(ret); * * // Add Total Instructions Executed to our EventSet * ret = PAPI_add_named_event(EventSet, EventName); * if (ret != PAPI_OK) handle_error(ret); * * // Start counting * ret = PAPI_start(EventSet); * if (ret != PAPI_OK) handle_error(ret); * * // Stop counting, ignore values * ret = PAPI_stop(EventSet, NULL); * if (ret != PAPI_OK) handle_error(ret); * * // Remove event * ret = PAPI_remove_named_event(EventSet, EventName); * if (ret != PAPI_OK) handle_error(ret); * @endcode * * @see PAPI_remove_event @n * PAPI_query_named_event @n * PAPI_add_named_event */ int PAPI_remove_named_event( int EventSet, const char *EventName ) { APIDBG("Entry: EventSet: %d, EventName: %s\n", EventSet, EventName); int ret, code; ret = PAPI_event_name_to_code( EventName, &code ); if ( ret == PAPI_OK ) ret = PAPI_remove_event( EventSet, code ); papi_return( ret ); } /** @class PAPI_destroy_eventset * @brief Empty and destroy an EventSet. * * @par C Interface: * \#include <papi.h> @n * int PAPI_destroy_eventset( int * EventSet ); * * PAPI_destroy_eventset deallocates the memory associated with an empty PAPI EventSet. * * @param *EventSet * A pointer to the integer handle for a PAPI event set as created by PAPI_create_eventset. * The value pointed to by EventSet is then set to PAPI_NULL on success. * * @retval PAPI_EINVAL * One or more of the arguments is invalid. * Attempting to destroy a non-empty event set or passing in a null pointer to be destroyed. * @retval PAPI_ENOEVST * The EventSet specified does not exist. * @retval PAPI_EISRUN * The EventSet is currently counting events. * @retval PAPI_EBUG * Internal error, please report it to the PAPI developers. * * @par Examples: * @code * // Free all memory and data structures, EventSet must be empty.
* if ( PAPI_destroy_eventset( &EventSet ) != PAPI_OK ) * handle_error( 1 ); * @endcode * * @bug * If the user has enabled profiling on an event with a call to PAPI_profil, then when destroying * the EventSet the memory allocated by that call will not be freed. * The user should turn off profiling on the Events before destroying the * EventSet to prevent this behavior. * * @see PAPI_profil @n * PAPI_create_eventset @n * PAPI_add_event @n * PAPI_stop */ int PAPI_destroy_eventset( int *EventSet ) { APIDBG("Entry: EventSet: %p, *EventSet: %d\n", EventSet, *EventSet); EventSetInfo_t *ESI; /* check for pre-existing ESI */ if ( EventSet == NULL ) papi_return( PAPI_EINVAL ); ESI = _papi_hwi_lookup_EventSet( *EventSet ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); if ( !( ESI->state & PAPI_STOPPED ) ) papi_return( PAPI_EISRUN ); if ( ESI->NumberOfEvents ) papi_return( PAPI_EINVAL ); _papi_hwi_remove_EventSet( ESI ); *EventSet = PAPI_NULL; return PAPI_OK; } /* simply checks for valid EventSet, calls component start() call */ /** @class PAPI_start * @brief Start counting hardware events in an event set. * * @par C Interface: * \#include <papi.h> @n * int PAPI_start( int EventSet ); * * @param EventSet * -- an integer handle for a PAPI event set as created by PAPI_create_eventset * * @retval PAPI_OK * @retval PAPI_EINVAL * -- One or more of the arguments is invalid. * @retval PAPI_ESYS * -- A system or C library call failed inside PAPI, see the errno variable. * @retval PAPI_ENOEVST * -- The EventSet specified does not exist. * @retval PAPI_EISRUN * -- The EventSet is currently counting events. * @retval PAPI_ECNFLCT * -- The underlying counter hardware can not count this event and other events * in the EventSet simultaneously. * @retval PAPI_ENOEVNT * -- The PAPI preset is not available on the underlying hardware. * * PAPI_start starts counting all of the hardware events contained in the previously defined EventSet. * All counters are implicitly set to zero before counting.
* Assumes an initialized PAPI library and a properly added event set. * * @par Example: * @code * int EventSet = PAPI_NULL; * long long values[2]; * int ret; * * ret = PAPI_create_eventset(&EventSet); * if (ret != PAPI_OK) handle_error(ret); * * // Add Total Instructions Executed to our EventSet * ret = PAPI_add_event(EventSet, PAPI_TOT_INS); * if (ret != PAPI_OK) handle_error(ret); * * // Start counting * ret = PAPI_start(EventSet); * if (ret != PAPI_OK) handle_error(ret); * poorly_tuned_function(); * ret = PAPI_stop(EventSet, values); * if (ret != PAPI_OK) handle_error(ret); * printf("%lld\\n",values[0]); * @endcode * * @see PAPI_create_eventset PAPI_add_event PAPI_stop */ int PAPI_start( int EventSet ) { APIDBG("Entry: EventSet: %d\n", EventSet); int is_dirty=0; int i,retval; EventSetInfo_t *ESI; ThreadInfo_t *thread = NULL; CpuInfo_t *cpu = NULL; hwd_context_t *context; int cidx; ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) { papi_return( PAPI_ENOEVST ); } APIDBG("EventSet: %p\n", ESI); cidx = valid_ESI_component( ESI ); if ( cidx < 0 ) { papi_return( cidx ); } /* only one event set per thread can be running at any time, */ /* so if another event set is running, the user must stop that */ /* event set explicitly */ /* We used to check and not let multiple events be attached */ /* to the same CPU, but this was unnecessary? */ thread = ESI->master; cpu = ESI->CpuInfo; if ( thread->running_eventset[cidx] ) { APIDBG("Thread Running already (Only one active Eventset per component)\n"); papi_return( PAPI_EISRUN ); } /* Check that there are added events */ if ( ESI->NumberOfEvents < 1 ) { papi_return( PAPI_EINVAL ); } /* If multiplexing is enabled for this eventset, call John May's code. 
*/ if ( _papi_hwi_is_sw_multiplex( ESI ) ) { retval = MPX_start( ESI->multiplex.mpx_evset ); if ( retval != PAPI_OK ) { papi_return( retval ); } /* Update the state of this EventSet */ ESI->state ^= PAPI_STOPPED; ESI->state |= PAPI_RUNNING; return PAPI_OK; } /* get the context we should use for this event set */ context = _papi_hwi_get_context( ESI, &is_dirty ); if (is_dirty) { /* we need to reset the context state because it was last used */ /* for some other event set and does not contain the information */ /* for our events. */ retval = _papi_hwd[ESI->CmpIdx]->update_control_state( ESI->ctl_state, ESI->NativeInfoArray, ESI->NativeCount, context); if ( retval != PAPI_OK ) { papi_return( retval ); } //update_control_state disturbs the overflow settings so set //it to initial values again if ( ESI->overflow.flags & PAPI_OVERFLOW_HARDWARE ) { for( i = 0; i < ESI->overflow.event_counter; i++ ) { retval = _papi_hwd[ESI->CmpIdx]->set_overflow( ESI, ESI->overflow.EventIndex[i], ESI->overflow.threshold[i] ); if ( retval != PAPI_OK ) { break; } } } /* now that the context contains this event sets information, */ /* make sure the position array in the EventInfoArray is correct */ /* We have to do this because ->update_control_state() can */ /* in theory re-order the native events out from under us. 
*/ _papi_hwi_map_events_to_native( ESI ); } /* If overflowing is enabled, turn it on */ if ( ( ESI->state & PAPI_OVERFLOWING ) && !( ESI->overflow.flags & PAPI_OVERFLOW_HARDWARE ) ) { retval = _papi_hwi_start_signal( _papi_os_info.itimer_sig, NEED_CONTEXT, cidx ); if ( retval != PAPI_OK ) { papi_return( retval ); } /* Update the state of this EventSet and thread */ /* before to avoid races */ ESI->state ^= PAPI_STOPPED; ESI->state |= PAPI_RUNNING; /* can not be attached to thread or cpu if overflowing */ thread->running_eventset[cidx] = ESI; retval = _papi_hwd[cidx]->start( context, ESI->ctl_state ); if ( retval != PAPI_OK ) { _papi_hwi_stop_signal( _papi_os_info.itimer_sig ); ESI->state ^= PAPI_RUNNING; ESI->state |= PAPI_STOPPED; thread->running_eventset[cidx] = NULL; papi_return( retval ); } retval = _papi_hwi_start_timer( _papi_os_info.itimer_num, _papi_os_info.itimer_sig, _papi_os_info.itimer_ns ); if ( retval != PAPI_OK ) { _papi_hwi_stop_signal( _papi_os_info.itimer_sig ); _papi_hwd[cidx]->stop( context, ESI->ctl_state ); ESI->state ^= PAPI_RUNNING; ESI->state |= PAPI_STOPPED; thread->running_eventset[cidx] = NULL; papi_return( retval ); } } else { /* Update the state of this EventSet and thread before */ /* to avoid races */ ESI->state ^= PAPI_STOPPED; ESI->state |= PAPI_RUNNING; /* if not attached to cpu or another process */ if ( !(ESI->state & PAPI_CPU_ATTACHED) ) { if ( !( ESI->state & PAPI_ATTACHED ) ) { thread->running_eventset[cidx] = ESI; } } else { cpu->running_eventset[cidx] = ESI; } retval = _papi_hwd[cidx]->start( context, ESI->ctl_state ); if ( retval != PAPI_OK ) { _papi_hwd[cidx]->stop( context, ESI->ctl_state ); ESI->state ^= PAPI_RUNNING; ESI->state |= PAPI_STOPPED; if ( !(ESI->state & PAPI_CPU_ATTACHED) ) { if ( !( ESI->state & PAPI_ATTACHED ) ) thread->running_eventset[cidx] = NULL; } else { cpu->running_eventset[cidx] = NULL; } papi_return( retval ); } } return retval; } /* checks for valid EventSet, calls component stop() function. 
*/ /** @class PAPI_stop * @brief Stop counting hardware events in an event set. * * @par C Interface: * \#include <papi.h> @n * int PAPI_stop( int EventSet, long long * values ); * * @param EventSet * -- an integer handle for a PAPI event set as created by PAPI_create_eventset * @param values * -- an array to hold the counter values of the counting events * * @retval PAPI_OK * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ESYS * A system or C library call failed inside PAPI, see the errno variable. * @retval PAPI_ENOEVST * The EventSet specified does not exist. * @retval PAPI_ENOTRUN * The EventSet is currently not running. * * PAPI_stop halts the counting of a previously defined event set and the * counter values contained in that EventSet are copied into the values array. * This call assumes an initialized PAPI library and a properly added event set. * * @par Example: * @code * int EventSet = PAPI_NULL; * long long values[2]; * int ret; * * ret = PAPI_create_eventset(&EventSet); * if (ret != PAPI_OK) handle_error(ret); * * // Add Total Instructions Executed to our EventSet * ret = PAPI_add_event(EventSet, PAPI_TOT_INS); * if (ret != PAPI_OK) handle_error(ret); * * // Start counting * ret = PAPI_start(EventSet); * if (ret != PAPI_OK) handle_error(ret); * poorly_tuned_function(); * ret = PAPI_stop(EventSet, values); * if (ret != PAPI_OK) handle_error(ret); * printf("%lld\\n",values[0]); * @endcode * * @see PAPI_create_eventset PAPI_start */ int PAPI_stop( int EventSet, long long *values ) { APIDBG("Entry: EventSet: %d, values: %p\n", EventSet, values); EventSetInfo_t *ESI; hwd_context_t *context; int cidx, retval; ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); cidx = valid_ESI_component( ESI ); if ( cidx < 0 ) papi_return( cidx ); if ( !( ESI->state & PAPI_RUNNING ) ) papi_return( PAPI_ENOTRUN ); /* If multiplexing is enabled for this eventset, turn it off */ if ( _papi_hwi_is_sw_multiplex( ESI ) ) { retval
= MPX_stop( ESI->multiplex.mpx_evset, values ); if ( retval != PAPI_OK ) papi_return( retval ); /* Update the state of this EventSet */ ESI->state ^= PAPI_RUNNING; ESI->state |= PAPI_STOPPED; return ( PAPI_OK ); } /* get the context we should use for this event set */ context = _papi_hwi_get_context( ESI, NULL ); /* Read the current counter values into the EventSet */ retval = _papi_hwi_read( context, ESI, ESI->sw_stop ); if ( retval != PAPI_OK ) papi_return( retval ); /* Remove the control bits from the active counter config. */ retval = _papi_hwd[cidx]->stop( context, ESI->ctl_state ); if ( retval != PAPI_OK ) papi_return( retval ); if ( values ) memcpy( values, ESI->sw_stop, ( size_t ) ESI->NumberOfEvents * sizeof ( long long ) ); /* If kernel profiling is in use, flush and process the kernel buffer */ if ( ESI->state & PAPI_PROFILING ) { if ( _papi_hwd[cidx]->cmp_info.kernel_profile && !( ESI->profile.flags & PAPI_PROFIL_FORCE_SW ) ) { retval = _papi_hwd[cidx]->stop_profiling( ESI->master, ESI ); if ( retval < PAPI_OK ) papi_return( retval ); } } /* If overflowing is enabled, turn it off */ if ( ESI->state & PAPI_OVERFLOWING ) { if ( !( ESI->overflow.flags & PAPI_OVERFLOW_HARDWARE ) ) { retval = _papi_hwi_stop_timer( _papi_os_info.itimer_num, _papi_os_info.itimer_sig ); if ( retval != PAPI_OK ) papi_return( retval ); _papi_hwi_stop_signal( _papi_os_info.itimer_sig ); } } /* Update the state of this EventSet */ ESI->state ^= PAPI_RUNNING; ESI->state |= PAPI_STOPPED; /* Update the running event set for this thread */ if ( !(ESI->state & PAPI_CPU_ATTACHED) ) { if ( !( ESI->state & PAPI_ATTACHED )) ESI->master->running_eventset[cidx] = NULL; } else { ESI->CpuInfo->running_eventset[cidx] = NULL; } #if defined(DEBUG) if ( _papi_hwi_debug & DEBUG_API ) { int i; for ( i = 0; i < ESI->NumberOfEvents; i++ ) { APIDBG( "PAPI_stop ESI->sw_stop[%d]:\t%lld\n", i, ESI->sw_stop[i] ); } } #endif return ( PAPI_OK ); } /** @class PAPI_reset * @brief Reset the hardware event counts
in an event set. * * @par C Prototype: * \#include <papi.h> @n * int PAPI_reset( int EventSet ); * * @param EventSet * an integer handle for a PAPI event set as created by PAPI_create_eventset * * @retval PAPI_OK * @retval PAPI_ESYS * A system or C library call failed inside PAPI, see the errno variable. * @retval PAPI_ENOEVST * The EventSet specified does not exist. * @details * PAPI_reset() zeroes the values of the counters contained in EventSet. * This call assumes an initialized PAPI library and a properly added event set. * * @par Example: * @code int EventSet = PAPI_NULL; int Events[] = {PAPI_TOT_INS, PAPI_FP_OPS}; int ret; // Create an empty EventSet ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); // Add two events to our EventSet ret = PAPI_add_events(EventSet, Events, 2); if (ret != PAPI_OK) handle_error(ret); // Start counting ret = PAPI_start(EventSet); if (ret != PAPI_OK) handle_error(ret); // Stop counting, ignore values ret = PAPI_stop(EventSet, NULL); if (ret != PAPI_OK) handle_error(ret); // reset the counters in this EventSet ret = PAPI_reset(EventSet); if (ret != PAPI_OK) handle_error(ret); * @endcode * * @see PAPI_create_eventset */ int PAPI_reset( int EventSet ) { APIDBG("Entry: EventSet: %d\n", EventSet); int retval = PAPI_OK; EventSetInfo_t *ESI; hwd_context_t *context; int cidx; ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); cidx = valid_ESI_component( ESI ); if ( cidx < 0 ) papi_return( cidx ); if ( ESI->state & PAPI_RUNNING ) { if ( _papi_hwi_is_sw_multiplex( ESI ) ) { retval = MPX_reset( ESI->multiplex.mpx_evset ); } else { /* If we're not the only one running, then just read the current values into the ESI->start array. This holds the starting value for counters that are shared.
*/ /* get the context we should use for this event set */ context = _papi_hwi_get_context( ESI, NULL ); retval = _papi_hwd[cidx]->reset( context, ESI->ctl_state ); } } else { #ifdef __bgp__ // For BG/P, we always want to reset the 'real' hardware counters. The counters // can be controlled via multiple interfaces, and we need to ensure that the values // are truly zero... /* get the context we should use for this event set */ context = _papi_hwi_get_context( ESI, NULL ); retval = _papi_hwd[cidx]->reset( context, ESI->ctl_state ); #endif memset( ESI->sw_stop, 0x00, ( size_t ) ESI->NumberOfEvents * sizeof ( long long ) ); } APIDBG( "EXIT: retval %d\n", retval ); papi_return( retval ); } /** @class PAPI_read * @brief Read hardware counters from an event set. * * @par C Interface: * \#include <papi.h> @n * int PAPI_read( int EventSet, long long *values ); * * PAPI_read() copies the counters of the indicated event set into * the provided array. * * The counters continue counting after the read. * * Note the differences between PAPI_read() and PAPI_accum(), specifically * that PAPI_accum() zeroes the counters after adding them into the values array. * * PAPI_read() assumes an initialized PAPI library and a properly added * event set. * * @param[in] EventSet * -- an integer handle for a PAPI Event Set as created * by PAPI_create_eventset() * @param[out] *values * -- an array to hold the counter values of the counting events * * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ESYS * A system or C library call failed inside PAPI, see the * errno variable. * @retval PAPI_ENOEVST * The event set specified does not exist.
* * @par Examples * @code * do_100events(); * if (PAPI_read(EventSet, values) != PAPI_OK) * handle_error(1); * // values[0] now equals 100 * do_100events(); * if (PAPI_accum(EventSet, values) != PAPI_OK) * handle_error(1); * // values[0] now equals 300 * values[0] = -100; * do_100events(); * if (PAPI_accum(EventSet, values) != PAPI_OK) * handle_error(1); * // values[0] now equals 0 * @endcode * * @see PAPI_accum * @see PAPI_start * @see PAPI_stop * @see PAPI_reset */ int PAPI_read( int EventSet, long long *values ) { APIDBG( "Entry: EventSet: %d, values: %p\n", EventSet, values); EventSetInfo_t *ESI; hwd_context_t *context; int cidx, retval = PAPI_OK; ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); cidx = valid_ESI_component( ESI ); if ( cidx < 0 ) papi_return( cidx ); if ( values == NULL ) papi_return( PAPI_EINVAL ); if ( ESI->state & PAPI_RUNNING ) { if ( _papi_hwi_is_sw_multiplex( ESI ) ) { retval = MPX_read( ESI->multiplex.mpx_evset, values, 0 ); } else { /* get the context we should use for this event set */ context = _papi_hwi_get_context( ESI, NULL ); retval = _papi_hwi_read( context, ESI, values ); } if ( retval != PAPI_OK ) papi_return( retval ); } else { memcpy( values, ESI->sw_stop, ( size_t ) ESI->NumberOfEvents * sizeof ( long long ) ); } #if defined(DEBUG) if ( ISLEVEL( DEBUG_API ) ) { int i; for ( i = 0; i < ESI->NumberOfEvents; i++ ) { APIDBG( "PAPI_read values[%d]:\t%lld\n", i, values[i] ); } } #endif APIDBG( "PAPI_read returns %d\n", retval ); return ( PAPI_OK ); } /** @class PAPI_read_ts * @brief Read hardware counters with a timestamp. * * @par C Interface: * \#include <papi.h> @n * int PAPI_read_ts( int EventSet, long long *values, long long *cycles ); * * PAPI_read_ts() copies the counters of the indicated event set into * the provided array. It also places a real-time cycle timestamp * into the location pointed to by cycles. * * The counters continue counting after the read.
* * PAPI_read_ts() assumes an initialized PAPI library and a properly added * event set. * * @param[in] EventSet * -- an integer handle for a PAPI Event Set as created * by PAPI_create_eventset() * @param[out] *values * -- an array to hold the counter values of the counting events * @param[out] *cycles * -- address of a long long to hold the real-time cycle timestamp * * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ESYS * A system or C library call failed inside PAPI, see the * errno variable. * @retval PAPI_ENOEVST * The event set specified does not exist. * * @par Examples * @code * long long values[1], cycles; * do_100events(); * if (PAPI_read_ts(EventSet, values, &cycles) != PAPI_OK) * handle_error(1); * // values[0] now equals 100; cycles holds the timestamp of the read * @endcode * * @see PAPI_read * @see PAPI_accum * @see PAPI_start * @see PAPI_stop * @see PAPI_reset */ int PAPI_read_ts( int EventSet, long long *values, long long *cycles ) { APIDBG( "Entry: EventSet: %d, values: %p, cycles: %p\n", EventSet, values, cycles); EventSetInfo_t *ESI; hwd_context_t *context; int cidx, retval = PAPI_OK; ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); cidx = valid_ESI_component( ESI ); if ( cidx < 0 ) papi_return( cidx ); if ( values == NULL ) papi_return( PAPI_EINVAL ); if ( ESI->state & PAPI_RUNNING ) { if ( _papi_hwi_is_sw_multiplex( ESI ) ) { retval = MPX_read( ESI->multiplex.mpx_evset, values, 0 ); } else { /* get the context we should use for this event set */ context = _papi_hwi_get_context( ESI, NULL ); retval = _papi_hwi_read( context, ESI, values ); } if ( retval != PAPI_OK ) papi_return( retval ); } else { memcpy( values, ESI->sw_stop, ( size_t ) ESI->NumberOfEvents * sizeof ( long long ) ); } *cycles = _papi_os_vector.get_real_cycles( ); #if defined(DEBUG) if ( ISLEVEL( DEBUG_API ) ) { int i; for ( i = 0; i < ESI->NumberOfEvents; i++ ) { APIDBG( "PAPI_read values[%d]:\t%lld\n", i, values[i] ); } } #endif APIDBG( "PAPI_read_ts returns %d\n", retval ); return PAPI_OK; } /** @class PAPI_accum * @brief Accumulate and reset counters in an EventSet.
* * @par C Interface: * \#include <papi.h> @n * int PAPI_accum( int EventSet, long long *values ); * * These calls assume an initialized PAPI library and a properly added event set. * PAPI_accum adds the counters of the indicated event set into the array values. * The counters are zeroed and continue counting after the operation. * Note the differences between PAPI_read and PAPI_accum, specifically * that PAPI_accum zeroes the counters after adding them into the values array. * * @param EventSet * an integer handle for a PAPI Event Set * as created by PAPI_create_eventset * @param *values * an array to hold the counter values of the counting events * * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ESYS * A system or C library call failed inside PAPI, see the errno variable. * @retval PAPI_ENOEVST * The event set specified does not exist. * * @par Examples: * @code * do_100events( ); * if ( PAPI_read( EventSet, values) != PAPI_OK ) * handle_error( 1 ); * // values[0] now equals 100 * do_100events( ); * if (PAPI_accum( EventSet, values ) != PAPI_OK ) * handle_error( 1 ); * // values[0] now equals 300 * values[0] = -100; * do_100events( ); * if (PAPI_accum( EventSet, values ) != PAPI_OK ) * handle_error( 1 ); * // values[0] now equals 0 * @endcode * * @see PAPIF_accum * @see PAPI_start * @see PAPI_set_opt * @see PAPI_reset */ int PAPI_accum( int EventSet, long long *values ) { APIDBG("Entry: EventSet: %d, values: %p\n", EventSet, values); EventSetInfo_t *ESI; hwd_context_t *context; int i, cidx, retval; long long a, b, c; ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); cidx = valid_ESI_component( ESI ); if ( cidx < 0 ) papi_return( cidx ); if ( values == NULL ) papi_return( PAPI_EINVAL ); if ( ESI->state & PAPI_RUNNING ) { if ( _papi_hwi_is_sw_multiplex( ESI ) ) { retval = MPX_read( ESI->multiplex.mpx_evset, ESI->sw_stop, 0 ); } else { /* get the context we should use for this event set */ context = _papi_hwi_get_context( ESI,
NULL ); retval = _papi_hwi_read( context, ESI, ESI->sw_stop ); } if ( retval != PAPI_OK ) papi_return( retval ); } for ( i = 0; i < ESI->NumberOfEvents; i++ ) { a = ESI->sw_stop[i]; b = values[i]; c = a + b; values[i] = c; } papi_return( PAPI_reset( EventSet ) ); } /** @class PAPI_write * @brief Write counter values into counters. * * @param EventSet * an integer handle for a PAPI event set as created by PAPI_create_eventset * @param *values * an array to hold the counter values of the counting events * * @retval PAPI_ENOEVST * The EventSet specified does not exist. * @retval PAPI_ECMP * PAPI_write() is not implemented for this architecture. * @retval PAPI_ESYS * The EventSet is currently counting events and * the component could not change the values of the * running counters. * * PAPI_write() writes the counter values provided in the array values * into the event set EventSet. * The virtual counters managed by the PAPI library will be set to the values provided. * If the event set is running, an attempt will be made to write the values * to the running counters. * This operation is not permitted by all components and may result in a run-time error. 
* * @see PAPI_read */ int PAPI_write( int EventSet, long long *values ) { APIDBG("Entry: EventSet: %d, values: %p\n", EventSet, values); int cidx, retval = PAPI_OK; EventSetInfo_t *ESI; hwd_context_t *context; ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); cidx = valid_ESI_component( ESI ); if ( cidx < 0 ) papi_return( cidx ); if ( values == NULL ) papi_return( PAPI_EINVAL ); if ( ESI->state & PAPI_RUNNING ) { /* get the context we should use for this event set */ context = _papi_hwi_get_context( ESI, NULL ); retval = _papi_hwd[cidx]->write( context, ESI->ctl_state, values ); if ( retval != PAPI_OK ) return ( retval ); } memcpy( ESI->hw_start, values, ( size_t ) _papi_hwd[cidx]->cmp_info.num_cntrs * sizeof ( long long ) ); return ( retval ); } /** @class PAPI_cleanup_eventset * @brief Empty an EventSet of all events and reset its profiling and overflow settings. * * @par C Interface: * \#include <papi.h> @n * int PAPI_cleanup_eventset( int EventSet ); * * PAPI_cleanup_eventset removes all events from a PAPI event set and turns * off profiling and overflow for all events in the EventSet. * This cannot be called if the EventSet is not stopped. * * @param EventSet * An integer handle for a PAPI event set as created by PAPI_create_eventset. * * @retval PAPI_EINVAL * One or more of the arguments is invalid. * Attempting to destroy a non-empty event set or passing in a null pointer to be destroyed. * @retval PAPI_ENOEVST * The EventSet specified does not exist. * @retval PAPI_EISRUN * The EventSet is currently counting events. * @retval PAPI_EBUG * Internal error, please report it to the PAPI developers. * * @par Examples: * @code * // Remove all events in the eventset * if ( PAPI_cleanup_eventset( EventSet ) != PAPI_OK ) * handle_error( 1 ); * @endcode * * @bug * If the user has enabled profiling on an event with a call to PAPI_profil, then when destroying * the EventSet the memory allocated by that call will not be freed.
* The user should turn off profiling on the Events before destroying the * EventSet to prevent this behavior. * * @see PAPI_profil @n * PAPI_create_eventset @n * PAPI_add_event @n * PAPI_stop */ int PAPI_cleanup_eventset( int EventSet ) { APIDBG("Entry: EventSet: %d\n",EventSet); EventSetInfo_t *ESI; int i, cidx, total, retval; /* Is the EventSet already in existence? */ ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); /* if the eventset has no index and no events, return OK otherwise return NOCMP */ cidx = valid_ESI_component( ESI ); if ( cidx < 0 ) { if ( ESI->NumberOfEvents ) papi_return( cidx ); papi_return( PAPI_OK ); } /* Of course, it must be stopped in order to modify it. */ if ( ESI->state & PAPI_RUNNING ) papi_return( PAPI_EISRUN ); /* clear overflow flag and turn off hardware overflow handler */ if ( ESI->state & PAPI_OVERFLOWING ) { total = ESI->overflow.event_counter; for ( i = 0; i < total; i++ ) { retval = PAPI_overflow( EventSet, ESI->overflow.EventCode[0], 0, 0, NULL ); if ( retval != PAPI_OK ) papi_return( retval ); } } /* clear profile flag and turn off hardware profile handler */ if ( ( ESI->state & PAPI_PROFILING ) && _papi_hwd[cidx]->cmp_info.hardware_intr && !( ESI->profile.flags & PAPI_PROFIL_FORCE_SW ) ) { total = ESI->profile.event_counter; for ( i = 0; i < total; i++ ) { retval = PAPI_sprofil( NULL, 0, EventSet, ESI->profile.EventCode[0], 0, PAPI_PROFIL_POSIX ); if ( retval != PAPI_OK ) papi_return( retval ); } } if ( _papi_hwi_is_sw_multiplex( ESI ) ) { retval = MPX_cleanup( &ESI->multiplex.mpx_evset ); if ( retval != PAPI_OK ) papi_return( retval ); } retval = _papi_hwd[cidx]->cleanup_eventset( ESI->ctl_state ); if ( retval != PAPI_OK ) papi_return( retval ); /* Now do the magic */ papi_return( _papi_hwi_cleanup_eventset( ESI ) ); } /** @class PAPI_multiplex_init * @brief Initialize multiplex support in the PAPI library. 
* * PAPI_multiplex_init() enables and initializes multiplex support in * the PAPI library. * Multiplexing allows a user to count more events than total physical * counters by time sharing the existing counters at some loss in * precision. * Applications that make no use of multiplexing do not need to call * this routine. * * @par C Interface: * \#include <papi.h> @n * int PAPI_multiplex_init (void); * * @par Examples * @code * retval = PAPI_multiplex_init(); * @endcode * @retval PAPI_OK This call always returns PAPI_OK * * @see PAPI_set_multiplex * @see PAPI_get_multiplex */ int PAPI_multiplex_init( void ) { APIDBG("Entry:\n"); int retval; retval = mpx_init( _papi_os_info.itimer_ns ); papi_return( retval ); } /** @class PAPI_state * @brief Return the counting state of an EventSet. * * @par C Interface: * \#include <papi.h> @n * int PAPI_state( int EventSet, int * status ); * * @param EventSet -- an integer handle for a PAPI event set as created by PAPI_create_eventset * @param status -- an integer containing a boolean combination of one or more of the * following nonzero constants as defined in the PAPI header file papi.h: * @arg PAPI_STOPPED -- EventSet is stopped * @arg PAPI_RUNNING -- EventSet is running * @arg PAPI_PAUSED -- EventSet temporarily disabled by the library * @arg PAPI_NOT_INIT -- EventSet defined, but not initialized * @arg PAPI_OVERFLOWING -- EventSet has overflowing enabled * @arg PAPI_PROFILING -- EventSet has profiling enabled * @arg PAPI_MULTIPLEXING -- EventSet has multiplexing enabled * @arg PAPI_ACCUMULATING -- reserved for future use * @arg PAPI_HWPROFILING -- reserved for future use * @manonly * @endmanonly * * @retval PAPI_OK * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ENOEVST * The EventSet specified does not exist. * @manonly * @endmanonly * * PAPI_state() returns the counting state of the specified event set.
* @manonly * @endmanonly * * @par Example: * @code * int EventSet = PAPI_NULL; * int status = 0; * int ret; * * ret = PAPI_create_eventset(&EventSet); * if (ret != PAPI_OK) handle_error(ret); * * // Add Total Instructions Executed to our EventSet * ret = PAPI_add_event(EventSet, PAPI_TOT_INS); * if (ret != PAPI_OK) handle_error(ret); * * // Start counting * ret = PAPI_state(EventSet, &status); * if (ret != PAPI_OK) handle_error(ret); * printf("State is now %d\n",status); * ret = PAPI_start(EventSet); * if (ret != PAPI_OK) handle_error(ret); * ret = PAPI_state(EventSet, &status); * if (ret != PAPI_OK) handle_error(ret); * printf("State is now %d\n",status); * @endcode * * @see PAPI_stop PAPI_start */ int PAPI_state( int EventSet, int *status ) { APIDBG("Entry: EventSet: %d, status: %p\n", EventSet, status); EventSetInfo_t *ESI; if ( status == NULL ) papi_return( PAPI_EINVAL ); /* check for good EventSetIndex value */ ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); /* read status from ESI->state */ *status = ESI->state; return ( PAPI_OK ); } /** @class PAPI_set_debug * @brief Set the current debug level for error output from PAPI. * * @par C Prototype: * \#include <papi.h> @n * int PAPI_set_debug( int level ); * * @param level * one of the constants shown in the table below and defined in the papi.h * header file. @n * The possible debug levels for debugging are shown below. * @arg PAPI_QUIET Do not print anything, just return the error code * @arg PAPI_VERB_ECONT Print error message and continue * @arg PAPI_VERB_ESTOP Print error message and exit * @n * @retval PAPI_OK * @retval PAPI_EINVAL * The debug level is invalid. * @n@n * * The current debug level is used by both the internal error and debug message * handler subroutines. @n * The debug handler is only used if the library was compiled with -DDEBUG.
@n * The debug handler is called when there is an error upon a call to the PAPI API.@n * The error handler is always active and its behavior cannot be modified except * for whether or not it prints anything. * * The default PAPI debug handler prints out messages in the following form: @n * PAPI Error: Error Code code, symbol, description * * If the error was caused from a system call and the return code is PAPI_ESYS, * the message will have a colon space and the error string as reported by * strerror() appended to the end. * * The PAPI error handler prints out messages in the following form: @n * PAPI Error: message. * @n * @note This is the ONLY function that may be called BEFORE PAPI_library_init(). * @n * @par Example: * @code int ret; ret = PAPI_set_debug(PAPI_VERB_ECONT); if ( ret != PAPI_OK ) handle_error(); * @endcode * * @see PAPI_library_init * @see PAPI_get_opt * @see PAPI_set_opt */ int PAPI_set_debug( int level ) { APIDBG("Entry: level: %d\n", level); PAPI_option_t option; memset( &option, 0x0, sizeof ( option ) ); option.debug.level = level; option.debug.handler = _papi_hwi_debug_handler; return ( PAPI_set_opt( PAPI_DEBUG, &option ) ); } /* Attaches to or detaches from the specified thread id */ inline_static int _papi_set_attach( int option, int EventSet, unsigned long tid ) { APIDBG("Entry: option: %d, EventSet: %d, tid: %lu\n", option, EventSet, tid); PAPI_option_t attach; memset( &attach, 0x0, sizeof ( attach ) ); attach.attach.eventset = EventSet; attach.attach.tid = tid; return ( PAPI_set_opt( option, &attach ) ); } /** @class PAPI_attach * @brief Attach PAPI event set to the specified thread id. * * @par C Interface: * \#include @n * int PAPI_attach( int EventSet, unsigned long tid ); * * PAPI_attach is a wrapper function that calls PAPI_set_opt to allow PAPI to * monitor performance counts on a thread other than the one currently executing. * This is sometimes referred to as third party monitoring. 
* PAPI_attach connects the specified EventSet to the specified thread; * PAPI_detach breaks that connection and restores the EventSet to the * original executing thread. * * @param EventSet * An integer handle for a PAPI EventSet as created by PAPI_create_eventset. * @param tid * A thread id as obtained from, for example, PAPI_list_threads or PAPI_thread_id. * * @retval PAPI_ECMP * This feature is unsupported on this component. * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ENOEVST * The event set specified does not exist. * @retval PAPI_EISRUN * The event set is currently counting events. * * @par Examples: * @code * int EventSet = PAPI_NULL; * unsigned long pid; * pid = fork( ); * if ( pid <= 0 ) * exit( 1 ); * if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) * exit( 1 ); * // Add Total Instructions Executed to our EventSet * if ( PAPI_add_event( EventSet, PAPI_TOT_INS ) != PAPI_OK ) * exit( 1 ); * // Attach this EventSet to the forked process * if ( PAPI_attach( EventSet, pid ) != PAPI_OK ) * exit( 1 ); * @endcode * * @see PAPI_set_opt * @see PAPI_list_threads * @see PAPI_thread_id * @see PAPI_thread_init */ int PAPI_attach( int EventSet, unsigned long tid ) { APIDBG( "Entry: EventSet: %d, tid: %lu\n", EventSet, tid); return ( _papi_set_attach( PAPI_ATTACH, EventSet, tid ) ); } /** @class PAPI_detach * @brief Detach PAPI event set from previously specified thread id and restore to executing thread. * * @par C Interface: * \#include @n * int PAPI_detach( int EventSet, unsigned long tid ); * * PAPI_detach is a wrapper function that calls PAPI_set_opt to allow PAPI to * monitor performance counts on a thread other than the one currently executing. * This is sometimes referred to as third party monitoring. * PAPI_attach connects the specified EventSet to the specified thread; * PAPI_detach breaks that connection and restores the EventSet to the * original executing thread. 
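* The attach examples above stop at PAPI_attach; the sketch below (a hedged
* illustration, not part of the PAPI distribution: error handling is minimal,
* the child's workload is arbitrary, and attach support as well as readability
* of counts after the child exits vary by component) shows the full
* attach/measure/detach lifecycle:

```c
#include <papi.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int EventSet = PAPI_NULL;
    long long count = 0;
    pid_t pid;

    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
        exit(1);

    pid = fork();
    if (pid < 0)
        exit(1);
    if (pid == 0) {              // child: do some work, then exit
        volatile long i, s = 0;
        for (i = 0; i < 1000000; i++) s += i;
        _exit(0);
    }

    if (PAPI_create_eventset(&EventSet) != PAPI_OK) exit(1);
    if (PAPI_add_event(EventSet, PAPI_TOT_INS) != PAPI_OK) exit(1);

    // Attach to the child, count while it runs, then detach
    if (PAPI_attach(EventSet, (unsigned long)pid) != PAPI_OK) exit(1);
    if (PAPI_start(EventSet) != PAPI_OK) exit(1);
    waitpid(pid, NULL, 0);
    if (PAPI_stop(EventSet, &count) != PAPI_OK) exit(1);
    if (PAPI_detach(EventSet) != PAPI_OK) exit(1);

    printf("Child executed %lld instructions\n", count);
    return 0;
}
```

* The sketch must be linked against libpapi and run on a platform whose CPU
* component supports third-party attach.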
* * @param EventSet * An integer handle for a PAPI EventSet as created by PAPI_create_eventset. * @param tid * A thread id as obtained from, for example, PAPI_list_threads or PAPI_thread_id. * * @retval PAPI_ECMP * This feature is unsupported on this component. * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ENOEVST * The event set specified does not exist. * @retval PAPI_EISRUN * The event set is currently counting events. * * @par Examples: * @code * int EventSet = PAPI_NULL; * unsigned long pid; * pid = fork( ); * if ( pid <= 0 ) * exit( 1 ); * if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) * exit( 1 ); * // Add Total Instructions Executed to our EventSet * if ( PAPI_add_event( EventSet, PAPI_TOT_INS ) != PAPI_OK ) * exit( 1 ); * // Attach this EventSet to the forked process * if ( PAPI_attach( EventSet, pid ) != PAPI_OK ) * exit( 1 ); * @endcode * * @see PAPI_set_opt @n * PAPI_list_threads @n * PAPI_thread_id @n * PAPI_thread_init */ int PAPI_detach( int EventSet ) { APIDBG( "Entry: EventSet: %d\n", EventSet); return ( _papi_set_attach( PAPI_DETACH, EventSet, 0 ) ); } /** @class PAPI_set_multiplex * @brief Convert a standard event set to a multiplexed event set. * * @par C Interface: * \#include @n * int PAPI_set_multiplex( int EventSet ); * * @param EventSet * an integer handle for a PAPI event set as created by PAPI_create_eventset * * @retval PAPI_OK * @retval PAPI_EINVAL * -- One or more of the arguments is invalid, or the EventSet is already multiplexed. * @retval PAPI_ENOCMP * -- The EventSet specified is not yet bound to a component. * @retval PAPI_ENOEVST * -- The EventSet specified does not exist. * @retval PAPI_EISRUN * -- The EventSet is currently counting events. * @retval PAPI_ENOMEM * -- Insufficient memory to complete the operation. * * PAPI_set_multiplex converts a standard PAPI event set created by a call to * PAPI_create_eventset into an event set capable of handling multiplexed events. 
* This must be done after calling PAPI_multiplex_init, and either PAPI_add_event * or PAPI_assign_eventset_component, but prior to calling PAPI_start(). * * Events can be added to an event set either before or after converting it * into a multiplexed set, but the conversion must be done prior to using it * as a multiplexed set. * * @note Multiplexing can't be enabled until PAPI knows which component is targeted. * Due to the late binding nature of PAPI event sets, this only happens after adding * an event to an event set or explicitly binding the component with a call to * PAPI_assign_eventset_component. * * @par Example: * @code * int EventSet = PAPI_NULL; * int ret; * * // Create an empty EventSet * ret = PAPI_create_eventset(&EventSet); * if (ret != PAPI_OK) handle_error(ret); * * // Bind it to the CPU component * ret = PAPI_assign_eventset_component(EventSet, 0); * if (ret != PAPI_OK) handle_error(ret); * * // Check current multiplex status * ret = PAPI_get_multiplex(EventSet); * if (ret == TRUE) printf("This event set is ready for multiplexing.\n"); * if (ret == FALSE) printf("This event set is not enabled for multiplexing.\n"); * if (ret < 0) handle_error(ret); * * // Turn on multiplexing * ret = PAPI_set_multiplex(EventSet); * if ((ret == PAPI_EINVAL) && (PAPI_get_multiplex(EventSet) == TRUE)) * printf("This event set already has multiplexing enabled\n"); * else if (ret != PAPI_OK) handle_error(ret); * @endcode * * @see PAPI_multiplex_init * @see PAPI_get_multiplex * @see PAPI_set_opt * @see PAPI_create_eventset */ int PAPI_set_multiplex( int EventSet ) { APIDBG( "Entry: EventSet: %d\n", EventSet); PAPI_option_t mpx; EventSetInfo_t *ESI; int cidx; int ret; /* Is the EventSet already in existence? 
*/ ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); /* if the eventset has no index return NOCMP */ cidx = valid_ESI_component( ESI ); if ( cidx < 0 ) papi_return( cidx ); if ( ( ret = mpx_check( EventSet ) ) != PAPI_OK ) papi_return( ret ); memset( &mpx, 0x0, sizeof ( mpx ) ); mpx.multiplex.eventset = EventSet; mpx.multiplex.flags = PAPI_MULTIPLEX_DEFAULT; mpx.multiplex.ns = _papi_os_info.itimer_ns; return ( PAPI_set_opt( PAPI_MULTIPLEX, &mpx ) ); } /** @class PAPI_set_opt * @brief Set PAPI library or event set options. * * @par C Interface: * \#include @n * int PAPI_set_opt( int option, PAPI_option_t * ptr ); * * @param[in] option * Defines the option to be set. * Possible values are briefly described in the table below. * * @param[in,out] ptr * Pointer to a structure determined by the selected option. See PAPI_option_t * for a description of possible structures. * * @retval PAPI_OK * @retval PAPI_EINVAL The specified option or parameter is invalid. * @retval PAPI_ENOEVST The EventSet specified does not exist. * @retval PAPI_EISRUN The EventSet is currently counting events. * @retval PAPI_ECMP * The option is not implemented for the current component. * @retval PAPI_ENOINIT PAPI has not been initialized. * @retval PAPI_EINVAL_DOM Invalid domain has been requested. * * PAPI_set_opt() changes the options of the PAPI library or a specific EventSet created * by PAPI_create_eventset. Some options may require that the EventSet be bound to a * component before they can execute successfully. This can be done either by adding an * event or by explicitly calling PAPI_assign_eventset_component. * * Ptr is a pointer to the PAPI_option_t structure, which is actually a union of different * structures for different options. Not all options require or return information in these * structures. Each requires different values to be set. Some options require a component * index to be provided. 
These options are handled implicitly through the option structures. * * @note Some options, such as PAPI_DOMAIN and PAPI_MULTIPLEX * are also available as separate entry points in both C and Fortran. * * The reader is encouraged to peruse the ctests code in the PAPI distribution for examples * of usage of PAPI_set_opt. * * @par Possible values for the PAPI_set_opt option parameter * @manonly * OPTION DEFINITION * PAPI_DEFDOM Set default counting domain for newly created event sets. Requires a * component index. * PAPI_DEFGRN Set default counting granularity. Requires a component index. * PAPI_DEBUG Set the PAPI debug state and the debug handler. The debug state is * specified in ptr->debug.level. The debug handler is specified in * ptr->debug.handler. For further information regarding debug states and * the behavior of the handler, see PAPI_set_debug. * PAPI_MULTIPLEX Enable specified EventSet for multiplexing. * PAPI_DEF_ITIMER Set the type of itimer used in software multiplexing, overflowing * and profiling. * PAPI_DEF_MPX_NS Set the sampling time slice in nanoseconds for multiplexing and overflow. * PAPI_DEF_ITIMER_NS See PAPI_DEF_MPX_NS. * PAPI_ATTACH Attach EventSet specified in ptr->attach.eventset to thread or process id * specified in ptr->attach.tid. * PAPI_CPU_ATTACH Attach EventSet specified in ptr->cpu.eventset to cpu specified in * ptr->cpu.cpu_num. * PAPI_DETACH Detach EventSet specified in ptr->attach.eventset from any thread * or process id. * PAPI_DOMAIN Set domain for EventSet specified in ptr->domain.eventset. * Will error if eventset is not bound to a component. * PAPI_GRANUL Set granularity for EventSet specified in ptr->granularity.eventset. * Will error if eventset is not bound to a component. * PAPI_INHERIT Enable or disable inheritance for specified EventSet. * PAPI_DATA_ADDRESS Set data address range to restrict event counting for EventSet specified * in ptr->addr.eventset. 
Starting and ending addresses are specified in * ptr->addr.start and ptr->addr.end, respectively. If exact addresses * cannot be instantiated, offsets are returned in ptr->addr.start_off and * ptr->addr.end_off. Currently implemented on Itanium only. * PAPI_INSTR_ADDRESS Set instruction address range as described above. Itanium only. * @endmanonly * @htmlonly * * * * * * * * * * * * * * * * * *
OPTION              DEFINITION
PAPI_DEFDOM         Set default counting domain for newly created event sets. Requires a component index.
PAPI_DEFGRN         Set default counting granularity. Requires a component index.
PAPI_DEBUG          Set the PAPI debug state and the debug handler. The debug state is specified in ptr->debug.level. The debug handler is specified in ptr->debug.handler. For further information regarding debug states and the behavior of the handler, see PAPI_set_debug.
PAPI_MULTIPLEX      Enable specified EventSet for multiplexing.
PAPI_DEF_ITIMER     Set the type of itimer used in software multiplexing, overflowing and profiling.
PAPI_DEF_MPX_NS     Set the sampling time slice in nanoseconds for multiplexing and overflow.
PAPI_DEF_ITIMER_NS  See PAPI_DEF_MPX_NS.
PAPI_ATTACH         Attach EventSet specified in ptr->attach.eventset to thread or process id specified in ptr->attach.tid.
PAPI_CPU_ATTACH     Attach EventSet specified in ptr->cpu.eventset to cpu specified in ptr->cpu.cpu_num.
PAPI_DETACH         Detach EventSet specified in ptr->attach.eventset from any thread or process id.
PAPI_DOMAIN         Set domain for EventSet specified in ptr->domain.eventset. Will error if eventset is not bound to a component.
PAPI_GRANUL         Set granularity for EventSet specified in ptr->granularity.eventset. Will error if eventset is not bound to a component.
PAPI_INHERIT        Enable or disable inheritance for specified EventSet.
PAPI_DATA_ADDRESS   Set data address range to restrict event counting for EventSet specified in ptr->addr.eventset. Starting and ending addresses are specified in ptr->addr.start and ptr->addr.end, respectively. If exact addresses cannot be instantiated, offsets are returned in ptr->addr.start_off and ptr->addr.end_off. Currently implemented on Itanium only.
PAPI_INSTR_ADDRESS  Set instruction address range as described above. Itanium only.
* @endhtmlonly * * @see PAPI_set_debug * @see PAPI_set_multiplex * @see PAPI_set_domain * @see PAPI_option_t */ int PAPI_set_opt( int option, PAPI_option_t * ptr ) { APIDBG("Entry: option: %d, ptr: %p\n", option, ptr); _papi_int_option_t internal; int retval = PAPI_OK; hwd_context_t *context; int cidx; if ( ( option != PAPI_DEBUG ) && ( init_level == PAPI_NOT_INITED ) ) papi_return( PAPI_ENOINIT ); if ( ptr == NULL ) papi_return( PAPI_EINVAL ); memset( &internal, 0x0, sizeof ( _papi_int_option_t ) ); switch ( option ) { case PAPI_DETACH: { internal.attach.ESI = _papi_hwi_lookup_EventSet( ptr->attach.eventset ); if ( internal.attach.ESI == NULL ) papi_return( PAPI_ENOEVST ); cidx = valid_ESI_component( internal.attach.ESI ); if ( cidx < 0 ) papi_return( cidx ); if ( _papi_hwd[cidx]->cmp_info.attach == 0 ) papi_return( PAPI_ECMP ); /* if attached to a cpu, return an error */ if (internal.attach.ESI->state & PAPI_CPU_ATTACHED) papi_return( PAPI_ECMP ); if ( ( internal.attach.ESI->state & PAPI_STOPPED ) == 0 ) papi_return( PAPI_EISRUN ); if ( ( internal.attach.ESI->state & PAPI_ATTACHED ) == 0 ) papi_return( PAPI_EINVAL ); internal.attach.tid = internal.attach.ESI->attach.tid; /* get the context we should use for this event set */ context = _papi_hwi_get_context( internal.attach.ESI, NULL ); retval = _papi_hwd[cidx]->ctl( context, PAPI_DETACH, &internal ); if ( retval != PAPI_OK ) papi_return( retval ); internal.attach.ESI->state ^= PAPI_ATTACHED; internal.attach.ESI->attach.tid = 0; return ( PAPI_OK ); } case PAPI_ATTACH: { internal.attach.ESI = _papi_hwi_lookup_EventSet( ptr->attach.eventset ); if ( internal.attach.ESI == NULL ) papi_return( PAPI_ENOEVST ); cidx = valid_ESI_component( internal.attach.ESI ); if ( cidx < 0 ) papi_return( cidx ); if ( _papi_hwd[cidx]->cmp_info.attach == 0 ) papi_return( PAPI_ECMP ); if ( ( internal.attach.ESI->state & PAPI_STOPPED ) == 0 ) papi_return( PAPI_EISRUN ); if ( internal.attach.ESI->state & PAPI_ATTACHED ) papi_return( 
PAPI_EINVAL ); /* if attached to a cpu, return an error */ if (internal.attach.ESI->state & PAPI_CPU_ATTACHED) papi_return( PAPI_ECMP ); internal.attach.tid = ptr->attach.tid; /* get the context we should use for this event set */ context = _papi_hwi_get_context( internal.attach.ESI, NULL ); retval = _papi_hwd[cidx]->ctl( context, PAPI_ATTACH, &internal ); if ( retval != PAPI_OK ) papi_return( retval ); internal.attach.ESI->state |= PAPI_ATTACHED; internal.attach.ESI->attach.tid = ptr->attach.tid; papi_return (_papi_hwi_lookup_or_create_thread( &(internal.attach.ESI->master), ptr->attach.tid )); } case PAPI_CPU_ATTACH: { APIDBG("eventset: %d, cpu_num: %d\n", ptr->cpu.eventset, ptr->cpu.cpu_num); internal.cpu.ESI = _papi_hwi_lookup_EventSet( ptr->cpu.eventset ); if ( internal.cpu.ESI == NULL ) papi_return( PAPI_ENOEVST ); internal.cpu.cpu_num = ptr->cpu.cpu_num; APIDBG("internal: %p, ESI: %p, cpu_num: %d\n", &internal, internal.cpu.ESI, internal.cpu.cpu_num); cidx = valid_ESI_component( internal.cpu.ESI ); if ( cidx < 0 ) papi_return( cidx ); if ( _papi_hwd[cidx]->cmp_info.cpu == 0 ) papi_return( PAPI_ECMP ); // can not attach to a cpu if already attached to a process or // counters set to be inherited by child processes if ( internal.cpu.ESI->state & (PAPI_ATTACHED | PAPI_INHERIT) ) papi_return( PAPI_EINVAL ); if ( ( internal.cpu.ESI->state & PAPI_STOPPED ) == 0 ) papi_return( PAPI_EISRUN ); retval = _papi_hwi_lookup_or_create_cpu(&internal.cpu.ESI->CpuInfo, internal.cpu.cpu_num); if( retval != PAPI_OK) { papi_return( retval ); } /* get the context we should use for this event set */ context = _papi_hwi_get_context( internal.cpu.ESI, NULL ); retval = _papi_hwd[cidx]->ctl( context, PAPI_CPU_ATTACH, &internal ); if ( retval != PAPI_OK ) papi_return( retval ); /* set to show this event set is attached to a cpu not a thread */ internal.cpu.ESI->state |= PAPI_CPU_ATTACHED; return ( PAPI_OK ); } case PAPI_DEF_MPX_NS: { cidx = 0; /* xxxx for now, assume we only check 
against cpu component */ if ( ptr->multiplex.ns < 0 ) papi_return( PAPI_EINVAL ); /* We should check the resolution here with the system, either component if kernel multiplexing or PAPI if SW multiplexing. */ internal.multiplex.ns = ( unsigned long ) ptr->multiplex.ns; /* get the context we should use for this event set */ context = _papi_hwi_get_context( internal.cpu.ESI, NULL ); /* Low level just checks/adjusts the args for this component */ retval = _papi_hwd[cidx]->ctl( context, PAPI_DEF_MPX_NS, &internal ); if ( retval == PAPI_OK ) { _papi_os_info.itimer_ns = ( int ) internal.multiplex.ns; ptr->multiplex.ns = ( int ) internal.multiplex.ns; } papi_return( retval ); } case PAPI_DEF_ITIMER_NS: { cidx = 0; /* xxxx for now, assume we only check against cpu component */ if ( ptr->itimer.ns < 0 ) papi_return( PAPI_EINVAL ); internal.itimer.ns = ptr->itimer.ns; /* Low level just checks/adjusts the args for this component */ retval = _papi_hwd[cidx]->ctl( NULL, PAPI_DEF_ITIMER_NS, &internal ); if ( retval == PAPI_OK ) { _papi_os_info.itimer_ns = internal.itimer.ns; ptr->itimer.ns = internal.itimer.ns; } papi_return( retval ); } case PAPI_DEF_ITIMER: { cidx = 0; /* xxxx for now, assume we only check against cpu component */ if ( ptr->itimer.ns < 0 ) papi_return( PAPI_EINVAL ); memcpy( &internal.itimer, &ptr->itimer, sizeof ( PAPI_itimer_option_t ) ); /* Low level just checks/adjusts the args for this component */ retval = _papi_hwd[cidx]->ctl( NULL, PAPI_DEF_ITIMER, &internal ); if ( retval == PAPI_OK ) { _papi_os_info.itimer_num = ptr->itimer.itimer_num; _papi_os_info.itimer_sig = ptr->itimer.itimer_sig; if ( ptr->itimer.ns > 0 ) _papi_os_info.itimer_ns = ptr->itimer.ns; /* flags are currently ignored, eventually the flags will be able to specify whether or not we use POSIX itimers (clock_gettimer) */ } papi_return( retval ); } case PAPI_MULTIPLEX: { EventSetInfo_t *ESI; ESI = _papi_hwi_lookup_EventSet( ptr->multiplex.eventset ); if ( ESI == NULL ) papi_return( 
PAPI_ENOEVST ); cidx = valid_ESI_component( ESI ); if ( cidx < 0 ) papi_return( cidx ); if ( !( ESI->state & PAPI_STOPPED ) ) papi_return( PAPI_EISRUN ); if ( ESI->state & PAPI_MULTIPLEXING ) papi_return( PAPI_EINVAL ); if ( ptr->multiplex.ns < 0 ) papi_return( PAPI_EINVAL ); internal.multiplex.ESI = ESI; internal.multiplex.ns = ( unsigned long ) ptr->multiplex.ns; internal.multiplex.flags = ptr->multiplex.flags; if ( ( _papi_hwd[cidx]->cmp_info.kernel_multiplex ) && ( ( ptr->multiplex.flags & PAPI_MULTIPLEX_FORCE_SW ) == 0 ) ) { /* get the context we should use for this event set */ context = _papi_hwi_get_context( ESI, NULL ); retval = _papi_hwd[cidx]->ctl( context, PAPI_MULTIPLEX, &internal ); } /* Kernel or PAPI may have changed this value so send it back out to the user */ ptr->multiplex.ns = ( int ) internal.multiplex.ns; if ( retval == PAPI_OK ) papi_return( _papi_hwi_convert_eventset_to_multiplex ( &internal.multiplex ) ); return ( retval ); } case PAPI_DEBUG: { int level = ptr->debug.level; switch ( level ) { case PAPI_QUIET: case PAPI_VERB_ESTOP: case PAPI_VERB_ECONT: _papi_hwi_error_level = level; break; default: papi_return( PAPI_EINVAL ); } _papi_hwi_debug_handler = ptr->debug.handler; return ( PAPI_OK ); } case PAPI_DEFDOM: { int dom = ptr->defdomain.domain; if ( ( dom < PAPI_DOM_MIN ) || ( dom > PAPI_DOM_MAX ) ) papi_return( PAPI_EINVAL ); /* Change the global structure. The _papi_hwd_init_control_state function in the components gets information from the global structure instead of per-thread information. 
*/ cidx = valid_component( ptr->defdomain.def_cidx ); if ( cidx < 0 ) papi_return( cidx ); /* Check what the component supports */ if ( dom == PAPI_DOM_ALL ) dom = _papi_hwd[cidx]->cmp_info.available_domains; if ( dom & ~_papi_hwd[cidx]->cmp_info.available_domains ) papi_return( PAPI_ENOSUPP ); _papi_hwd[cidx]->cmp_info.default_domain = dom; return ( PAPI_OK ); } case PAPI_DOMAIN: { int dom = ptr->domain.domain; if ( ( dom < PAPI_DOM_MIN ) || ( dom > PAPI_DOM_MAX ) ) papi_return( PAPI_EINVAL_DOM ); internal.domain.ESI = _papi_hwi_lookup_EventSet( ptr->domain.eventset ); if ( internal.domain.ESI == NULL ) papi_return( PAPI_ENOEVST ); cidx = valid_ESI_component( internal.domain.ESI ); if ( cidx < 0 ) papi_return( cidx ); /* Check what the component supports */ if ( dom == PAPI_DOM_ALL ) dom = _papi_hwd[cidx]->cmp_info.available_domains; if ( dom & ~_papi_hwd[cidx]->cmp_info.available_domains ) papi_return( PAPI_EINVAL_DOM ); if ( !( internal.domain.ESI->state & PAPI_STOPPED ) ) papi_return( PAPI_EISRUN ); /* Try to change the domain of the eventset in the hardware */ internal.domain.domain = dom; internal.domain.eventset = ptr->domain.eventset; /* get the context we should use for this event set */ context = _papi_hwi_get_context( internal.domain.ESI, NULL ); retval = _papi_hwd[cidx]->ctl( context, PAPI_DOMAIN, &internal ); if ( retval < PAPI_OK ) papi_return( retval ); /* Change the domain of the eventset in the library */ internal.domain.ESI->domain.domain = dom; return ( retval ); } case PAPI_DEFGRN: { int grn = ptr->defgranularity.granularity; if ( ( grn < PAPI_GRN_MIN ) || ( grn > PAPI_GRN_MAX ) ) papi_return( PAPI_EINVAL ); cidx = valid_component( ptr->defgranularity.def_cidx ); if ( cidx < 0 ) papi_return( cidx ); /* Change the component structure. The _papi_hwd_init_control_state function in the components gets information from the global structure instead of per-thread information. 
*/ /* Check what the component supports */ if ( grn & ~_papi_hwd[cidx]->cmp_info.available_granularities ) papi_return( PAPI_EINVAL ); /* Make sure there is only 1 set. */ if ( grn ^ ( 1 << ( ffs( grn ) - 1 ) ) ) papi_return( PAPI_EINVAL ); _papi_hwd[cidx]->cmp_info.default_granularity = grn; return ( PAPI_OK ); } case PAPI_GRANUL: { int grn = ptr->granularity.granularity; if ( ( grn < PAPI_GRN_MIN ) || ( grn > PAPI_GRN_MAX ) ) papi_return( PAPI_EINVAL ); internal.granularity.ESI = _papi_hwi_lookup_EventSet( ptr->granularity.eventset ); if ( internal.granularity.ESI == NULL ) papi_return( PAPI_ENOEVST ); cidx = valid_ESI_component( internal.granularity.ESI ); if ( cidx < 0 ) papi_return( cidx ); /* Check what the component supports */ if ( grn & ~_papi_hwd[cidx]->cmp_info.available_granularities ) papi_return( PAPI_EINVAL ); /* Make sure there is only 1 set. */ if ( grn ^ ( 1 << ( ffs( grn ) - 1 ) ) ) papi_return( PAPI_EINVAL ); internal.granularity.granularity = grn; internal.granularity.eventset = ptr->granularity.eventset; retval = _papi_hwd[cidx]->ctl( NULL, PAPI_GRANUL, &internal ); if ( retval < PAPI_OK ) return ( retval ); internal.granularity.ESI->granularity.granularity = grn; return ( retval ); } case PAPI_INHERIT: { EventSetInfo_t *ESI; ESI = _papi_hwi_lookup_EventSet( ptr->inherit.eventset ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); cidx = valid_ESI_component( ESI ); if ( cidx < 0 ) papi_return( cidx ); if ( _papi_hwd[cidx]->cmp_info.inherit == 0 ) papi_return( PAPI_ECMP ); if ( ( ESI->state & PAPI_STOPPED ) == 0 ) papi_return( PAPI_EISRUN ); /* if attached to a cpu, return an error */ if (ESI->state & PAPI_CPU_ATTACHED) papi_return( PAPI_ECMP ); internal.inherit.ESI = ESI; internal.inherit.inherit = ptr->inherit.inherit; /* get the context we should use for this event set */ context = _papi_hwi_get_context( internal.inherit.ESI, NULL ); retval = _papi_hwd[cidx]->ctl( context, PAPI_INHERIT, &internal ); if ( retval < PAPI_OK ) return ( retval ); 
ESI->inherit.inherit = ptr->inherit.inherit; return ( retval ); } case PAPI_DATA_ADDRESS: case PAPI_INSTR_ADDRESS: { EventSetInfo_t *ESI; ESI = _papi_hwi_lookup_EventSet( ptr->addr.eventset ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); cidx = valid_ESI_component( ESI ); if ( cidx < 0 ) papi_return( cidx ); internal.address_range.ESI = ESI; if ( !( internal.address_range.ESI->state & PAPI_STOPPED ) ) papi_return( PAPI_EISRUN ); /*set domain to be PAPI_DOM_USER */ internal.address_range.domain = PAPI_DOM_USER; internal.address_range.start = ptr->addr.start; internal.address_range.end = ptr->addr.end; /* get the context we should use for this event set */ context = _papi_hwi_get_context( internal.address_range.ESI, NULL ); retval = _papi_hwd[cidx]->ctl( context, option, &internal ); ptr->addr.start_off = internal.address_range.start_off; ptr->addr.end_off = internal.address_range.end_off; papi_return( retval ); } case PAPI_USER_EVENTS_FILE: { APIDBG("User Events Filename is -%s-\n", ptr->events_file); // go load the user defined event definitions from the applications event definition file // do not know how to find a pmu name and type for this operation yet // retval = papi_load_derived_events(pmu_str, pmu_type, cidx, 0); // _papi_user_defined_events_setup(ptr->events_file); return( PAPI_OK ); } default: papi_return( PAPI_EINVAL ); } } /** @class PAPI_num_hwctrs * @brief Return the number of hardware counters on the cpu. * * @deprecated * This is included to preserve backwards compatibility. * Use PAPI_num_cmp_hwctrs() instead. * * @see PAPI_num_cmp_hwctrs */ int PAPI_num_hwctrs( void ) { APIDBG( "Entry:\n"); return ( PAPI_num_cmp_hwctrs( 0 ) ); } /** @class PAPI_num_cmp_hwctrs * @brief Return the number of hardware counters for the specified component. * * PAPI_num_cmp_hwctrs() returns the number of counters present in the * specified component. * By convention, component 0 is always the cpu. 
* * On some components, especially for CPUs, the value returned is * a theoretical maximum for estimation purposes only. It might not * be possible to easily create an EventSet that contains the full * number of events. This can be due to a variety of reasons: * 1). Some CPUs (especially Intel and POWER) have the notion * of fixed counters that can only measure one thing, usually * cycles. * 2). Some CPUs have very explicit rules about which event can * run in which counter. In this case it might not be possible * to add a wanted event even if counters are free. * 3). Some CPUs halve the number of counters available when * running with SMT (multiple CPU threads) enabled. * 4). Some operating systems "steal" a counter to use for things * such as NMI Watchdog timers. * The only sure way to see if events will fit is to attempt * adding events to an EventSet, and doing something sensible * if an error is generated. * * PAPI_library_init() must be called in order for this function to return * anything greater than 0. * * @par C Interface: * \#include @n * int PAPI_num_cmp_hwctrs(int cidx ); * * @param[in] cidx * -- An integer identifier for a component. * By convention, component 0 is always the cpu component. * * @par Example * @code * // Query the cpu component for the number of counters. * printf(\"%d hardware counters found.\\n\", PAPI_num_cmp_hwctrs(0)); * @endcode * * @returns * On success, this function returns a value greater than zero.@n * A zero result usually means the library has not been initialized. * * @bug This count may include fixed-use counters in addition * to the general purpose counters. */ int PAPI_num_cmp_hwctrs( int cidx ) { APIDBG( "Entry: cidx: %d\n", cidx); return ( PAPI_get_cmp_opt( PAPI_MAX_HWCTRS, NULL, cidx ) ); } /** @class PAPI_get_multiplex * @brief Get the multiplexing status of specified event set. 
* * @par C Interface: * \#include <papi.h> @n * int PAPI_get_multiplex( int EventSet ); * * @par Fortran Interface: * \#include fpapi.h @n * PAPIF_get_multiplex( C_INT EventSet, C_INT check ) * * @param EventSet * an integer handle for a PAPI event set as created by PAPI_create_eventset * * @retval PAPI_OK * @retval PAPI_EINVAL * One or more of the arguments is invalid, or the EventSet * is already multiplexed. * @retval PAPI_ENOEVST * The EventSet specified does not exist. * @retval PAPI_EISRUN * The EventSet is currently counting events. * @retval PAPI_ENOMEM * Insufficient memory to complete the operation. * * PAPI_get_multiplex tests the state of the PAPI_MULTIPLEXING flag in the specified event set, * returning @em TRUE if a PAPI event set is multiplexed, or FALSE if not. * @par Example: * @code * int EventSet = PAPI_NULL; * int ret; * * // Create an empty EventSet * ret = PAPI_create_eventset(&EventSet); * if (ret != PAPI_OK) handle_error(ret); * * // Bind it to the CPU component * ret = PAPI_assign_eventset_component(EventSet, 0); * if (ret != PAPI_OK) handle_error(ret); * * // Check current multiplex status * ret = PAPI_get_multiplex(EventSet); * if (ret == TRUE) printf("This event set is ready for multiplexing.\n"); * if (ret == FALSE) printf("This event set is not enabled for multiplexing.\n"); * if (ret < 0) handle_error(ret); * * // Turn on multiplexing * ret = PAPI_set_multiplex(EventSet); * if ((ret == PAPI_EINVAL) && (PAPI_get_multiplex(EventSet) == TRUE)) * printf("This event set already has multiplexing enabled\n"); * else if (ret != PAPI_OK) handle_error(ret); * @endcode * @see PAPI_multiplex_init * @see PAPI_set_opt * @see PAPI_create_eventset */ int PAPI_get_multiplex( int EventSet ) { APIDBG( "Entry: EventSet: %d\n", EventSet); PAPI_option_t popt; int retval; popt.multiplex.eventset = EventSet; retval = PAPI_get_opt( PAPI_MULTIPLEX, &popt ); if ( retval < 0 ) retval = 0; return retval; } /** @class PAPI_get_opt * @brief Get PAPI library or event set 
options. * * @par C Interface: * \#include @n * int PAPI_get_opt( int option, PAPI_option_t * ptr ); * * @param[in] option * Defines the option to get. * Possible values are briefly described in the table below. * * @param[in,out] ptr * Pointer to a structure determined by the selected option. See PAPI_option_t * for a description of possible structures. * * @retval PAPI_OK * @retval PAPI_EINVAL The specified option or parameter is invalid. * @retval PAPI_ENOEVST The EventSet specified does not exist. * @retval PAPI_ECMP * The option is not implemented for the current component. * @retval PAPI_ENOINIT specified option requires PAPI to be initialized first. * * PAPI_get_opt() queries the options of the PAPI library or a specific event set created by * PAPI_create_eventset. Some options may require that the eventset be bound to a component * before they can execute successfully. This can be done either by adding an event or by * explicitly calling PAPI_assign_eventset_component. * * Ptr is a pointer to the PAPI_option_t structure, which is actually a union of different * structures for different options. Not all options require or return information in these * structures. Each returns different values in the structure. Some options require a component * index to be provided. These options are handled explicitly by the PAPI_get_cmp_opt() call. * * @note Some options, such as PAPI_DOMAIN and PAPI_MULTIPLEX * are also available as separate entry points in both C and Fortran. * * The reader is encouraged to peruse the ctests code in the PAPI distribution for examples * of usage of PAPI_set_opt. * * @par Possible values for the PAPI_get_opt option parameter * @manonly * OPTION DEFINITION * PAPI_DEFDOM Get default counting domain for newly created event sets. Requires a component index. * PAPI_DEFGRN Get default counting granularity. Requires a component index. * PAPI_DEBUG Get the PAPI debug state and the debug handler. The debug state is specified in ptr->debug.level. 
The debug handler is specified in ptr->debug.handler. * For further information regarding debug states and the behavior of the handler, see PAPI_set_debug. * PAPI_MULTIPLEX Get current multiplexing state for specified EventSet. * PAPI_DEF_ITIMER Get the type of itimer used in software multiplexing, overflowing and profiling. * PAPI_DEF_MPX_NS Get the sampling time slice in nanoseconds for multiplexing and overflow. * PAPI_DEF_ITIMER_NS See PAPI_DEF_MPX_NS. * PAPI_ATTACH Get thread or process id to which event set is attached. Returns TRUE if currently attached. * PAPI_CPU_ATTACH Get ptr->cpu.cpu_num and Attach state for EventSet specified in ptr->cpu.eventset. * PAPI_DETACH Get thread or process id to which event set is attached. Returns TRUE if currently attached. * PAPI_DOMAIN Get domain for EventSet specified in ptr->domain.eventset. Will error if eventset is not bound to a component. * PAPI_GRANUL Get granularity for EventSet specified in ptr->granularity.eventset. Will error if eventset is not bound to a component. * PAPI_INHERIT Get current inheritance state for specified EventSet. * PAPI_PRELOAD Get LD_PRELOAD environment equivalent. * PAPI_CLOCKRATE Get clockrate in MHz. * PAPI_MAX_CPUS Get number of CPUs. * PAPI_EXEINFO Get Executable addresses for text/data/bss. * PAPI_HWINFO Get information about the hardware. * PAPI_LIB_VERSION Get the full PAPI version of the library. This does not require PAPI to be initialized first. * PAPI_MAX_HWCTRS Get number of counters. Requires a component index. * PAPI_MAX_MPX_CTRS Get maximum number of multiplexing counters. Requires a component index. * PAPI_SHLIBINFO Get shared library information used by the program. * PAPI_COMPONENTINFO Get the PAPI features the specified component supports. Requires a component index. * @endmanonly * @htmlonly * * * * * * * * * * * * * * * * * * * * * * * * * *
 * <table>
 * <tr><th>OPTION</th><th>DEFINITION</th></tr>
 * <tr><td>PAPI_DEFDOM</td><td>Get default counting domain for newly created event sets. Requires a component index.</td></tr>
 * <tr><td>PAPI_DEFGRN</td><td>Get default counting granularity. Requires a component index.</td></tr>
 * <tr><td>PAPI_DEBUG</td><td>Get the PAPI debug state and the debug handler. The debug state is specified in ptr->debug.level. The debug handler is specified in ptr->debug.handler. For further information regarding debug states and the behavior of the handler, see PAPI_set_debug.</td></tr>
 * <tr><td>PAPI_MULTIPLEX</td><td>Get current multiplexing state for specified EventSet.</td></tr>
 * <tr><td>PAPI_DEF_ITIMER</td><td>Get the type of itimer used in software multiplexing, overflowing and profiling.</td></tr>
 * <tr><td>PAPI_DEF_MPX_NS</td><td>Get the sampling time slice in nanoseconds for multiplexing and overflow.</td></tr>
 * <tr><td>PAPI_DEF_ITIMER_NS</td><td>See PAPI_DEF_MPX_NS.</td></tr>
 * <tr><td>PAPI_ATTACH</td><td>Get thread or process id to which event set is attached. Returns TRUE if currently attached.</td></tr>
 * <tr><td>PAPI_CPU_ATTACH</td><td>Get ptr->cpu.cpu_num and Attach state for EventSet specified in ptr->cpu.eventset.</td></tr>
 * <tr><td>PAPI_DETACH</td><td>Get thread or process id to which event set is attached. Returns TRUE if currently attached.</td></tr>
 * <tr><td>PAPI_DOMAIN</td><td>Get domain for EventSet specified in ptr->domain.eventset. Will error if eventset is not bound to a component.</td></tr>
 * <tr><td>PAPI_GRANUL</td><td>Get granularity for EventSet specified in ptr->granularity.eventset. Will error if eventset is not bound to a component.</td></tr>
 * <tr><td>PAPI_INHERIT</td><td>Get current inheritance state for specified EventSet.</td></tr>
 * <tr><td>PAPI_PRELOAD</td><td>Get LD_PRELOAD environment equivalent.</td></tr>
 * <tr><td>PAPI_CLOCKRATE</td><td>Get clockrate in MHz.</td></tr>
 * <tr><td>PAPI_MAX_CPUS</td><td>Get number of CPUs.</td></tr>
 * <tr><td>PAPI_EXEINFO</td><td>Get Executable addresses for text/data/bss.</td></tr>
 * <tr><td>PAPI_HWINFO</td><td>Get information about the hardware.</td></tr>
 * <tr><td>PAPI_LIB_VERSION</td><td>Get the full PAPI version of the library. This does not require PAPI to be initialized first.</td></tr>
 * <tr><td>PAPI_MAX_HWCTRS</td><td>Get number of counters. Requires a component index.</td></tr>
 * <tr><td>PAPI_MAX_MPX_CTRS</td><td>Get maximum number of multiplexing counters. Requires a component index.</td></tr>
 * <tr><td>PAPI_SHLIBINFO</td><td>Get shared library information used by the program.</td></tr>
 * <tr><td>PAPI_COMPONENTINFO</td><td>Get the PAPI features the specified component supports. Requires a component index.</td></tr>
 * </table>
* @endhtmlonly * * @see PAPI_get_multiplex * @see PAPI_get_cmp_opt * @see PAPI_set_opt * @see PAPI_option_t */ int PAPI_get_opt( int option, PAPI_option_t * ptr ) { APIDBG( "Entry: option: %d, ptr: %p\n", option, ptr); EventSetInfo_t *ESI; if ( ( option != PAPI_DEBUG ) && ( init_level == PAPI_NOT_INITED ) && ( option != PAPI_LIB_VERSION ) ) papi_return( PAPI_ENOINIT ); switch ( option ) { case PAPI_DETACH: { if ( ptr == NULL ) papi_return( PAPI_EINVAL ); ESI = _papi_hwi_lookup_EventSet( ptr->attach.eventset ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); ptr->attach.tid = ESI->attach.tid; return ( ( ESI->state & PAPI_ATTACHED ) == 0 ); } case PAPI_ATTACH: { if ( ptr == NULL ) papi_return( PAPI_EINVAL ); ESI = _papi_hwi_lookup_EventSet( ptr->attach.eventset ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); ptr->attach.tid = ESI->attach.tid; return ( ( ESI->state & PAPI_ATTACHED ) != 0 ); } case PAPI_CPU_ATTACH: { if ( ptr == NULL ) papi_return( PAPI_EINVAL ); ESI = _papi_hwi_lookup_EventSet( ptr->attach.eventset ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); ptr->cpu.cpu_num = ESI->CpuInfo->cpu_num; return ( ( ESI->state & PAPI_CPU_ATTACHED ) != 0 ); } case PAPI_DEF_MPX_NS: { /* xxxx for now, assume we only check against cpu component */ if ( ptr == NULL ) papi_return( PAPI_EINVAL ); ptr->multiplex.ns = _papi_os_info.itimer_ns; return ( PAPI_OK ); } case PAPI_DEF_ITIMER_NS: { /* xxxx for now, assume we only check against cpu component */ if ( ptr == NULL ) papi_return( PAPI_EINVAL ); ptr->itimer.ns = _papi_os_info.itimer_ns; return ( PAPI_OK ); } case PAPI_DEF_ITIMER: { /* xxxx for now, assume we only check against cpu component */ if ( ptr == NULL ) papi_return( PAPI_EINVAL ); ptr->itimer.itimer_num = _papi_os_info.itimer_num; ptr->itimer.itimer_sig = _papi_os_info.itimer_sig; ptr->itimer.ns = _papi_os_info.itimer_ns; ptr->itimer.flags = 0; return ( PAPI_OK ); } case PAPI_MULTIPLEX: { if ( ptr == NULL ) papi_return( PAPI_EINVAL ); ESI = 
_papi_hwi_lookup_EventSet( ptr->multiplex.eventset ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); ptr->multiplex.ns = ESI->multiplex.ns; ptr->multiplex.flags = ESI->multiplex.flags; return ( ESI->state & PAPI_MULTIPLEXING ) != 0; } case PAPI_PRELOAD: if ( ptr == NULL ) papi_return( PAPI_EINVAL ); memcpy( &ptr->preload, &_papi_hwi_system_info.preload_info, sizeof ( PAPI_preload_info_t ) ); break; case PAPI_DEBUG: if ( ptr == NULL ) papi_return( PAPI_EINVAL ); ptr->debug.level = _papi_hwi_error_level; ptr->debug.handler = _papi_hwi_debug_handler; break; case PAPI_CLOCKRATE: return ( ( int ) _papi_hwi_system_info.hw_info.cpu_max_mhz ); case PAPI_MAX_CPUS: return ( _papi_hwi_system_info.hw_info.ncpu ); /* For now, MAX_HWCTRS and MAX CTRS are identical. At some future point, they may map onto different values. */ case PAPI_INHERIT: { if ( ptr == NULL ) papi_return( PAPI_EINVAL ); ESI = _papi_hwi_lookup_EventSet( ptr->inherit.eventset ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); ptr->inherit.inherit = ESI->inherit.inherit; return ( PAPI_OK ); } case PAPI_GRANUL: if ( ptr == NULL ) papi_return( PAPI_EINVAL ); ESI = _papi_hwi_lookup_EventSet( ptr->granularity.eventset ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); ptr->granularity.granularity = ESI->granularity.granularity; break; case PAPI_EXEINFO: if ( ptr == NULL ) papi_return( PAPI_EINVAL ); ptr->exe_info = &_papi_hwi_system_info.exe_info; break; case PAPI_HWINFO: if ( ptr == NULL ) papi_return( PAPI_EINVAL ); ptr->hw_info = &_papi_hwi_system_info.hw_info; break; case PAPI_DOMAIN: if ( ptr == NULL ) papi_return( PAPI_EINVAL ); ESI = _papi_hwi_lookup_EventSet( ptr->domain.eventset ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); ptr->domain.domain = ESI->domain.domain; return ( PAPI_OK ); case PAPI_LIB_VERSION: return ( PAPI_VERSION ); /* The following cases all require a component index and are handled by PAPI_get_cmp_opt() with cidx == 0*/ case PAPI_MAX_HWCTRS: case PAPI_MAX_MPX_CTRS: case PAPI_DEFDOM: 
case PAPI_DEFGRN: case PAPI_SHLIBINFO: case PAPI_COMPONENTINFO: return ( PAPI_get_cmp_opt( option, ptr, 0 ) ); default: papi_return( PAPI_EINVAL ); } return ( PAPI_OK ); } /** @class PAPI_get_cmp_opt * @brief Get component specific PAPI options. * * @param option * is an input parameter describing the course of action. * Possible values are defined in papi.h and briefly described in the table below. * The Fortran calls are implementations of specific options. * @param ptr * is a pointer to a structure that acts as both an input and output parameter. * @param cidx * An integer identifier for a component. * By convention, component 0 is always the cpu component. * * @retval PAPI_EINVAL * One or more of the arguments is invalid. * * PAPI_get_opt() and PAPI_set_opt() query or change the options of the PAPI * library or a specific event set created by PAPI_create_eventset . * Some options may require that the eventset be bound to a component before * they can execute successfully. * This can be done either by adding an event or by explicitly calling * PAPI_assign_eventset_component . * * The C interface for these functions passes a pointer to the PAPI_option_t structure. * Not all options require or return information in this structure, and not all * options are implemented for both get and set. * Some options require a component index to be provided. * These options are handled explicitly by the PAPI_get_cmp_opt() call for 'get' * and implicitly through the option structure for 'set'. * The Fortran interface is a series of calls implementing various subsets of * the C interface. Not all options in C are available in Fortran. * * @note Some options, such as PAPI_DOMAIN and PAPI_MULTIPLEX, * are also available as separate entry points in both C and Fortran. * * The reader is urged to see the example code in the PAPI distribution for usage of PAPI_get_opt. * The file papi.h contains definitions for the structures unioned in the PAPI_option_t structure. 
* * @see PAPI_set_debug PAPI_set_multiplex PAPI_set_domain PAPI_option_t */ int PAPI_get_cmp_opt( int option, PAPI_option_t * ptr, int cidx ) { APIDBG( "Entry: option: %d, ptr: %p, cidx: %d\n", option, ptr, cidx); if (_papi_hwi_invalid_cmp(cidx)) { return PAPI_ECMP; } switch ( option ) { /* For now, MAX_HWCTRS and MAX CTRS are identical. At some future point, they may map onto different values. */ case PAPI_MAX_HWCTRS: return ( _papi_hwd[cidx]->cmp_info.num_cntrs ); case PAPI_MAX_MPX_CTRS: return ( _papi_hwd[cidx]->cmp_info.num_mpx_cntrs ); case PAPI_DEFDOM: return ( _papi_hwd[cidx]->cmp_info.default_domain ); case PAPI_DEFGRN: return ( _papi_hwd[cidx]->cmp_info.default_granularity ); case PAPI_SHLIBINFO: { int retval; if ( ptr == NULL ) papi_return( PAPI_EINVAL ); retval = _papi_os_vector.update_shlib_info( &_papi_hwi_system_info ); ptr->shlib_info = &_papi_hwi_system_info.shlib_info; papi_return( retval ); } case PAPI_COMPONENTINFO: if ( ptr == NULL ) papi_return( PAPI_EINVAL ); ptr->cmp_info = &( _papi_hwd[cidx]->cmp_info ); return PAPI_OK; default: papi_return( PAPI_EINVAL ); } return PAPI_OK; } /** @class PAPI_num_components * @brief Get the number of components available on the system. * * @return * Number of components available on the system * * @code // Query the library for a component count. printf("%d components installed.\n", PAPI_num_components() ); * @endcode */ int PAPI_num_components( void ) { APIDBG( "Entry:\n"); return ( papi_num_components ); } /** @class PAPI_num_events * @brief Return the number of events in an event set. * * PAPI_num_events() returns the number of preset and/or native events * contained in an event set. * The event set should be created by @ref PAPI_create_eventset . * * @par C Interface: * \#include <papi.h> @n * int PAPI_num_events(int EventSet ); * * @param[in] EventSet -- * an integer handle for a PAPI event set created by PAPI_create_eventset.
* @param[out] *count -- (Fortran only) * On output the variable contains the number of events in the event set * * @retval On success, this function returns the positive number of * events in the event set. * @retval PAPI_EINVAL The event count is zero; * only if code is compiled with debug enabled. * @retval PAPI_ENOEVST The EventSet specified does not exist. * * @par Example * @code * // Count the events in our EventSet * printf(\"%d events found in EventSet.\\n\", PAPI_num_events(EventSet)); * @endcode * * @see PAPI_add_event * @see PAPI_create_eventset * */ int PAPI_num_events( int EventSet ) { APIDBG( "Entry: EventSet: %d\n", EventSet); EventSetInfo_t *ESI; ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( !ESI ) papi_return( PAPI_ENOEVST ); #ifdef DEBUG /* Not necessary */ if ( ESI->NumberOfEvents == 0 ) papi_return( PAPI_EINVAL ); #endif return ( ESI->NumberOfEvents ); } /** @class PAPI_shutdown * @brief Finish using PAPI and free all related resources. * * @par C Prototype: * \#include <papi.h> @n * void PAPI_shutdown( void ); * * PAPI_shutdown() is an exit function used by the PAPI Library * to free resources and shut down when certain error conditions arise. * It is not necessary for the user to call this function, * but doing so frees the memory and resources used by the PAPI Library.
* * @see PAPI_init_library */ void PAPI_shutdown( void ) { APIDBG( "Entry:\n"); EventSetInfo_t *ESI; ThreadInfo_t *master; DynamicArray_t *map = &_papi_hwi_system_info.global_eventset_map; int i, j = 0, k, retval; if ( init_retval == DEADBEEF ) { PAPIERROR( PAPI_SHUTDOWN_str ); return; } MPX_shutdown( ); /* Free all EventSets for this thread */ master = _papi_hwi_lookup_thread( 0 ); /* Count number of running EventSets AND */ /* Stop any running EventSets in this thread */ #ifdef DEBUG again: #endif for( i = 0; i < map->totalSlots; i++ ) { ESI = map->dataSlotArray[i]; if ( ESI ) { if ( ESI->master == master ) { if ( ESI->state & PAPI_RUNNING ) { if((retval = PAPI_stop( i, NULL )) != PAPI_OK) { APIDBG("Call to PAPI_stop failed: %d\n", retval); } } retval=PAPI_cleanup_eventset( i ); if (retval!=PAPI_OK) PAPIERROR("Error during cleanup."); _papi_hwi_free_EventSet( ESI ); } else { if ( ESI->state & PAPI_RUNNING ) { j++; } } } } /* No locking required, we're just waiting for the others to call shutdown or stop their eventsets. */ #ifdef DEBUG if ( j != 0 ) { PAPIERROR( PAPI_SHUTDOWN_SYNC_str ); sleep( 1 ); j = 0; goto again; } #endif /* if we have some user events defined, release the space they allocated */ /* give back the strings which were allocated when each event was created */ for ( i=0 ; i<user_defined_events_count ; i++ ) { papi_free( user_defined_events[i].symbol ); papi_free( user_defined_events[i].postfix ); papi_free( user_defined_events[i].long_descr ); papi_free( user_defined_events[i].short_descr ); } /* Shutdown each active component */ for ( i = 0; i < papi_num_components; i++ ) { if (!_papi_hwd[i]->cmp_info.disabled) { _papi_hwd[i]->shutdown_component( ); } } /* Now it is safe to call re-init */ init_retval = DEADBEEF; init_level = PAPI_NOT_INITED; _papi_mem_cleanup_all( ); } /** @class PAPI_strerror * @brief Returns a string describing the PAPI error code. * * @par C Interface: * \#include <papi.h> @n * char * PAPI_strerror( int errorCode ); * * @param[in] errorCode * -- the error code to interpret * * @retval *error * -- a pointer to the error string. * @retval NULL * -- the input error code to PAPI_strerror() is invalid. * * PAPI_strerror() returns a pointer to the error message corresponding to the * error code code. * If the call fails the function returns the NULL pointer.
* This function is not implemented in Fortran. * * @par Example: * @code * int ret; * int EventSet = PAPI_NULL; * * ret = PAPI_create_eventset(&EventSet); * if (ret != PAPI_OK) * { * fprintf(stderr, "PAPI error %d: %s\n", ret, PAPI_strerror(ret)); * exit(1); * } * // Add Total Instructions Executed to our EventSet * ret = PAPI_add_event(EventSet, PAPI_TOT_INS); * if (ret != PAPI_OK) * { * PAPI_perror( "PAPI_add_event"); * fprintf(stderr, "PAPI error %d: %s\n", ret, PAPI_strerror(ret)); * exit(1); * } * // Start counting * ret = PAPI_start(EventSet); * if (ret != PAPI_OK) handle_error(ret); * @endcode * * @see PAPI_perror PAPI_set_opt PAPI_get_opt PAPI_shutdown PAPI_set_debug */ char * PAPI_strerror( int errorCode ) { if ( ( errorCode > 0 ) || ( -errorCode > _papi_hwi_num_errors ) ) return ( NULL ); return ( _papi_errlist[-errorCode] ); } /** @class PAPI_perror * @brief Produces a string on standard error, describing the last library error. * * @par C Interface: * \#include <papi.h> @n * void PAPI_perror( const char *s ); * * @param[in] s * -- Optional message to print before the string describing the last error message. * * The routine PAPI_perror() produces a message on the standard error output, * describing the last error encountered during a call to PAPI. * If s is not NULL, s is printed, followed by a colon and a space. * Then the error message and a new-line are printed.
* * @par Example: * @code * int ret; * int EventSet = PAPI_NULL; * * ret = PAPI_create_eventset(&EventSet); * if (ret != PAPI_OK) * { * fprintf(stderr, "PAPI error %d: %s\n", ret, PAPI_strerror(ret)); * exit(1); * } * // Add Total Instructions Executed to our EventSet * ret = PAPI_add_event(EventSet, PAPI_TOT_INS); * if (ret != PAPI_OK) * { * PAPI_perror( "PAPI_add_event" ); * exit(1); * } * // Start counting * ret = PAPI_start(EventSet); * if (ret != PAPI_OK) handle_error(ret); * @endcode * * @see PAPI_strerror */ void PAPI_perror( const char *msg ) { char *foo; foo = PAPI_strerror( _papi_hwi_errno ); if ( foo == NULL ) return; if ( msg ) if ( *msg ) fprintf( stderr, "%s: ", msg ); fprintf( stderr, "%s\n", foo ); } /** @class PAPI_overflow * @brief Set up an event set to begin registering overflows. * * PAPI_overflow() marks a specific EventCode in an EventSet to generate an * overflow signal after every threshold events are counted. * More than one event in an event set can be used to trigger overflows. * In such cases, the user must call this function once for each overflowing * event. * To turn off overflow on a specified event, call this function with a * threshold value of 0. * * Overflows can be implemented in either software or hardware, but the scope * is the entire event set. * PAPI defaults to hardware overflow if it is available. * In the case of software overflow, a periodic timer interrupt causes PAPI * to compare the event counts against the threshold values and call the * overflow handler if one or more events have exceeded their threshold. * In the case of hardware overflow, the counters are typically set to the * negative of the threshold value and count up to 0. * This zero-crossing triggers a hardware interrupt that calls the overflow * handler. * Because of this counter interrupt, the counter values for overflowing * counters * may be very small or even negative numbers, and cannot be relied upon * as accurate.
* In such cases the overflow handler can approximate the counts by supplying * the threshold value whenever an overflow occurs. * * _papi_overflow_handler() is a placeholder for a user-defined function * to process overflow events. A pointer to this function is passed to * the PAPI_overflow routine, where it is invoked whenever a software or * hardware overflow occurs. This handler receives the EventSet of the * overflowing event, the Program Counter address when the interrupt * occurred, an overflow_vector that can be processed to determine which * event(s) caused the overflow, and a pointer to the machine context, * which can be used in a platform-specific manner to extract register * information about what was happening when the overflow occurred. * * @par C Interface: * \#include <papi.h> @n * int PAPI_overflow (int EventSet, int EventCode, int threshold, * int flags, PAPI_overflow_handler_t handler ); @n@n * (*PAPI_overflow_handler_t) _papi_overflow_handler * (int EventSet, void *address, long_long overflow_vector, * void *context ); * * @par Fortran Interface: * Not implemented * * @param[in] EventSet * -- an integer handle to a PAPI event set as created by * @ref PAPI_create_eventset * @param[in] EventCode * -- the preset or native event code to be set for overflow * detection. * This event must have already been added to the EventSet. * @param[in] threshold * -- the overflow threshold value for this EventCode. * @param[in] flags * -- bitmap that controls the overflow mode of operation. * Set to PAPI_OVERFLOW_FORCE_SW to force software * overflowing, even if hardware overflow support is available. * If hardware overflow support is available on a given system, * it will be the default mode of operation. * There are situations where it is advantageous to use software * overflow instead.
* Although software overflow is inherently less accurate, * with more latency and processing overhead, it does allow for * overflowing on derived events, and for the accurate recording * of overflowing event counts. * These two features are typically not available with hardware * overflow. * Only one type of overflow is allowed per event set, so * setting one event to hardware overflow and another to forced * software overflow will result in an error being returned. * @param[in] handler * -- pointer to the user supplied handler function to call upon * overflow * @param[in] address * -- the Program Counter address at the time of the overflow * @param[in] overflow_vector * -- a long long word containing flag bits to indicate * which hardware counter(s) caused the overflow * @param[in] *context * -- pointer to a machine specific structure that defines the * register context at the time of overflow. This parameter * is often unused and can be ignored in the user function. * * @retval PAPI_OK On success, PAPI_overflow returns PAPI_OK. * @retval PAPI_EINVAL One or more of the arguments is invalid. * Most likely a bad threshold value. * @retval PAPI_ENOMEM Insufficient memory to complete the operation. * @retval PAPI_ENOEVST The EventSet specified does not exist. * @retval PAPI_EISRUN The EventSet is currently counting events. * @retval PAPI_ECNFLCT The underlying counter hardware cannot count * this event and other events in the EventSet simultaneously. * Also can happen if you are trying to overflow both by hardware * and by forced software at the same time. * @retval PAPI_ENOEVNT The PAPI event is not available on * the underlying hardware. * * @par Example * @code * // Define a simple overflow handler: * void handler(int EventSet, void *address, long_long overflow_vector, void *context) * { * fprintf(stderr,\"Overflow at %p! 
bit=%#llx \\n\", * address,overflow_vector); * } * * // Call PAPI_overflow for an EventSet containing PAPI_TOT_INS, * // setting the threshold to 100000. Use the handler defined above. * retval = PAPI_overflow(EventSet, PAPI_TOT_INS, 100000, 0, handler); * @endcode * * * @see PAPI_get_overflow_event_index * */ int PAPI_overflow( int EventSet, int EventCode, int threshold, int flags, PAPI_overflow_handler_t handler ) { APIDBG( "Entry: EventSet: %d, EventCode: %#x, threshold: %d, flags: %#x, handler: %p\n", EventSet, EventCode, threshold, flags, handler); int retval, cidx, index, i; EventSetInfo_t *ESI; ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) { OVFDBG("No EventSet\n"); papi_return( PAPI_ENOEVST ); } cidx = valid_ESI_component( ESI ); if ( cidx < 0 ) { OVFDBG("Component Error\n"); papi_return( cidx ); } if ( ( ESI->state & PAPI_STOPPED ) != PAPI_STOPPED ) { OVFDBG("Already running\n"); papi_return( PAPI_EISRUN ); } if ( ESI->state & PAPI_ATTACHED ) { OVFDBG("Attached\n"); papi_return( PAPI_EINVAL ); } if ( ESI->state & PAPI_CPU_ATTACHED ) { OVFDBG("CPU attached\n"); papi_return( PAPI_EINVAL ); } if ( ( index = _papi_hwi_lookup_EventCodeIndex( ESI, ( unsigned int ) EventCode ) ) < 0 ) { papi_return( PAPI_ENOEVNT ); } if ( threshold < 0 ) { OVFDBG("Threshold below zero\n"); papi_return( PAPI_EINVAL ); } /* We do not support derived events in overflow */ /* Unless it's DERIVED_CMPD in which no calculations are done */ if ( !( flags & PAPI_OVERFLOW_FORCE_SW ) && threshold != 0 && ( ESI->EventInfoArray[index].derived ) && ( ESI->EventInfoArray[index].derived != DERIVED_CMPD ) ) { OVFDBG("Derived event in overflow\n"); papi_return( PAPI_EINVAL ); } /* the first time to call PAPI_overflow function */ if ( !( ESI->state & PAPI_OVERFLOWING ) ) { if ( handler == NULL ) { OVFDBG("NULL handler\n"); papi_return( PAPI_EINVAL ); } if ( threshold == 0 ) { OVFDBG("Zero threshold\n"); papi_return( PAPI_EINVAL ); } } if ( threshold > 0 && 
ESI->overflow.event_counter >= _papi_hwd[cidx]->cmp_info.num_cntrs ) papi_return( PAPI_ECNFLCT ); if ( threshold == 0 ) { for ( i = 0; i < ESI->overflow.event_counter; i++ ) { if ( ESI->overflow.EventCode[i] == EventCode ) break; } /* EventCode not found */ if ( i == ESI->overflow.event_counter ) papi_return( PAPI_EINVAL ); /* compact these arrays */ while ( i < ESI->overflow.event_counter - 1 ) { ESI->overflow.deadline[i] = ESI->overflow.deadline[i + 1]; ESI->overflow.threshold[i] = ESI->overflow.threshold[i + 1]; ESI->overflow.EventIndex[i] = ESI->overflow.EventIndex[i + 1]; ESI->overflow.EventCode[i] = ESI->overflow.EventCode[i + 1]; i++; } ESI->overflow.deadline[i] = 0; ESI->overflow.threshold[i] = 0; ESI->overflow.EventIndex[i] = 0; ESI->overflow.EventCode[i] = 0; ESI->overflow.event_counter--; } else { if ( ESI->overflow.event_counter > 0 ) { if ( ( flags & PAPI_OVERFLOW_FORCE_SW ) && ( ESI->overflow.flags & PAPI_OVERFLOW_HARDWARE ) ) papi_return( PAPI_ECNFLCT ); if ( !( flags & PAPI_OVERFLOW_FORCE_SW ) && ( ESI->overflow.flags & PAPI_OVERFLOW_FORCE_SW ) ) papi_return( PAPI_ECNFLCT ); } for ( i = 0; i < ESI->overflow.event_counter; i++ ) { if ( ESI->overflow.EventCode[i] == EventCode ) break; } /* A new entry */ if ( i == ESI->overflow.event_counter ) { ESI->overflow.EventCode[i] = EventCode; ESI->overflow.event_counter++; } /* New or existing entry */ ESI->overflow.deadline[i] = threshold; ESI->overflow.threshold[i] = threshold; ESI->overflow.EventIndex[i] = index; ESI->overflow.flags = flags; } /* If overflowing is already active, we should check to make sure that we don't specify a different handler or different flags here. You can't mix them. */ ESI->overflow.handler = handler; /* Set up the option structure for the low level. 
If we have hardware interrupts and we are not using forced software emulated interrupts */ if ( _papi_hwd[cidx]->cmp_info.hardware_intr && !( ESI->overflow.flags & PAPI_OVERFLOW_FORCE_SW ) ) { retval = _papi_hwd[cidx]->set_overflow( ESI, index, threshold ); if ( retval == PAPI_OK ) ESI->overflow.flags |= PAPI_OVERFLOW_HARDWARE; else { papi_return( retval ); /* We should undo stuff here */ } } else { /* Make sure hardware overflow is not set */ ESI->overflow.flags &= ~( PAPI_OVERFLOW_HARDWARE ); } APIDBG( "Overflow using: %s\n", ( ESI->overflow. flags & PAPI_OVERFLOW_HARDWARE ? "[Hardware]" : ESI->overflow. flags & PAPI_OVERFLOW_FORCE_SW ? "[Forced Software]" : "[Software]" ) ); /* Toggle the overflow flags and ESI state */ if ( ESI->overflow.event_counter >= 1 ) ESI->state |= PAPI_OVERFLOWING; else { ESI->state ^= PAPI_OVERFLOWING; ESI->overflow.flags = 0; ESI->overflow.handler = NULL; } return PAPI_OK; } /** @class PAPI_sprofil * @brief Generate PC histogram data from multiple code regions where hardware counter overflow occurs. * * @par C Interface: * \#include @n * int PAPI_sprofil( PAPI_sprofil_t * prof, int profcnt, int EventSet, int EventCode, int threshold, int flags ); * * @param *prof * pointer to an array of PAPI_sprofil_t structures. Each copy of the structure contains the following: * @arg buf -- pointer to a buffer of bufsiz bytes in which the histogram counts are stored in an array of unsigned short, unsigned int, or unsigned long long values, or 'buckets'. The size of the buckets is determined by values in the flags argument. * @arg bufsiz -- the size of the histogram buffer in bytes. It is computed from the length of the code region to be profiled, the size of the buckets, and the scale factor as discussed below. * @arg offset -- the start address of the region to be profiled. * @arg scale -- broadly and historically speaking, a contraction factor that indicates how much smaller the histogram buffer is than the region to be profiled. 
More precisely, scale is interpreted as an unsigned 16-bit fixed-point fraction with the decimal point implied on the left. Its value is the reciprocal of the number of addresses in a subdivision, per counter of histogram buffer. * * @param profcnt * number of structures in the prof array for hardware profiling. * @param EventSet * The PAPI EventSet to profile. This EventSet is marked as profiling-ready, * but profiling doesn't actually start until a PAPI_start() call is issued. * @param EventCode * Code of the Event in the EventSet to profile. * This event must already be a member of the EventSet. * @param threshold * minimum number of events that must occur before the PC is sampled. * If hardware overflow is supported for your component, this threshold will * trigger an interrupt when reached. * Otherwise, the counters will be sampled periodically and the PC will be * recorded for the first sample that exceeds the threshold. * If the value of threshold is 0, profiling will be disabled for this event. * @param flags * bit pattern to control profiling behavior. * Defined values are given in a table in the documentation for PAPI_profil * @manonly * * @endmanonly * * @retval * Return values for PAPI_sprofil() are identical to those for PAPI_profil. * Please refer to that page for further details. * @manonly * * @endmanonly * * PAPI_sprofil() is a structure-driven profiler that profiles one or more * disjoint regions of code in a single call. * It accepts a pointer to a preinitialized array of sprofil structures, and * initiates profiling based on the values contained in the array. * Each structure in the array defines the profiling parameters that are * normally passed to PAPI_profil().
* For more information on profiling, @ref PAPI_profil * @manonly * * @endmanonly * * @par Example: * @code * int retval; * unsigned long length; * PAPI_exe_info_t *prginfo; * unsigned short *profbuf1, *profbuf2, profbucket; * PAPI_sprofil_t sprof[3]; * * prginfo = PAPI_get_executable_info(); * if (prginfo == NULL) handle_error( NULL ); * length = (unsigned long)(prginfo->text_end - prginfo->text_start); * // Allocate 2 buffers of equal length * profbuf1 = (unsigned short *)malloc(length); * profbuf2 = (unsigned short *)malloc(length); * if ((profbuf1 == NULL) || (profbuf2 == NULL)) * handle_error( NULL ); * memset(profbuf1,0x00,length); * memset(profbuf2,0x00,length); * // First buffer * sprof[0].pr_base = profbuf1; * sprof[0].pr_size = length; * sprof[0].pr_off = (vptr_t) DO_FLOPS; * sprof[0].pr_scale = 0x10000; * // Second buffer * sprof[1].pr_base = profbuf2; * sprof[1].pr_size = length; * sprof[1].pr_off = (vptr_t) DO_READS; * sprof[1].pr_scale = 0x10000; * // Overflow bucket * sprof[2].pr_base = &profbucket; * sprof[2].pr_size = 1; * sprof[2].pr_off = 0; * sprof[2].pr_scale = 0x0002; * retval = PAPI_sprofil(sprof, 3, EventSet, PAPI_FP_INS, 1000000, * PAPI_PROFIL_POSIX | PAPI_PROFIL_BUCKET_16); * if ( retval != PAPI_OK ) handle_error( retval ); * @endcode * * @see PAPI_overflow * @see PAPI_get_executable_info * @see PAPI_profil */ int PAPI_sprofil( PAPI_sprofil_t *prof, int profcnt, int EventSet, int EventCode, int threshold, int flags ) { APIDBG( "Entry: prof: %p, profcnt: %d, EventSet: %d, EventCode: %#x, threshold: %d, flags: %#x\n", prof, profcnt, EventSet, EventCode, threshold, flags); EventSetInfo_t *ESI; int retval, index, i, buckets; int forceSW = 0; int cidx; /* Check to make sure EventSet exists */ ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) { papi_return( PAPI_ENOEVST ); } /* Check to make sure EventSet is stopped */ if ( ( ESI->state & PAPI_STOPPED ) != PAPI_STOPPED ) { papi_return( PAPI_EISRUN ); } /* We cannot profile
if attached */ if ( ESI->state & PAPI_ATTACHED ) { papi_return( PAPI_EINVAL ); } /* We cannot profile if cpu attached */ if ( ESI->state & PAPI_CPU_ATTACHED ) { papi_return( PAPI_EINVAL ); } /* Get component for EventSet */ cidx = valid_ESI_component( ESI ); if ( cidx < 0 ) { papi_return( cidx ); } /* Get index of the Event we want to profile */ if ( ( index = _papi_hwi_lookup_EventCodeIndex( ESI, (unsigned int) EventCode ) ) < 0 ) { papi_return( PAPI_ENOEVNT ); } /* We do not support derived events in overflow */ /* Unless it's DERIVED_CMPD in which no calculations are done */ if ( ( ESI->EventInfoArray[index].derived ) && ( ESI->EventInfoArray[index].derived != DERIVED_CMPD ) && !( flags & PAPI_PROFIL_FORCE_SW ) ) { papi_return( PAPI_EINVAL ); } /* If no prof structures, then make sure count is 0 */ if ( prof == NULL ) { profcnt = 0; } /* check all profile regions for valid scale factors of: 2 (131072/65536), 1 (65536/65536), or < 1 (65535 -> 2) as defined in unix profil() 2/65536 is reserved for single bucket profiling {0,1}/65536 are traditionally used to terminate profiling but are unused here since PAPI uses threshold instead */ for( i = 0; i < profcnt; i++ ) { if ( !( ( prof[i].pr_scale == 131072 ) || ( ( prof[i].pr_scale <= 65536 && prof[i].pr_scale > 1 ) ) ) ) { APIDBG( "Improper scale factor: %d\n", prof[i].pr_scale ); papi_return( PAPI_EINVAL ); } } /* Make sure threshold is valid */ if ( threshold < 0 ) { papi_return( PAPI_EINVAL ); } /* the first time to call PAPI_sprofil */ if ( !( ESI->state & PAPI_PROFILING ) ) { if ( threshold == 0 ) { papi_return( PAPI_EINVAL ); } } /* ??? 
*/ if ( (threshold > 0) && (ESI->profile.event_counter >= _papi_hwd[cidx]->cmp_info.num_cntrs) ) { papi_return( PAPI_ECNFLCT ); } if ( threshold == 0 ) { for( i = 0; i < ESI->profile.event_counter; i++ ) { if ( ESI->profile.EventCode[i] == EventCode ) { break; } } /* EventCode not found */ if ( i == ESI->profile.event_counter ) { papi_return( PAPI_EINVAL ); } /* compact these arrays */ while ( i < ESI->profile.event_counter - 1 ) { ESI->profile.prof[i] = ESI->profile.prof[i + 1]; ESI->profile.count[i] = ESI->profile.count[i + 1]; ESI->profile.threshold[i] = ESI->profile.threshold[i + 1]; ESI->profile.EventIndex[i] = ESI->profile.EventIndex[i + 1]; ESI->profile.EventCode[i] = ESI->profile.EventCode[i + 1]; i++; } ESI->profile.prof[i] = NULL; ESI->profile.count[i] = 0; ESI->profile.threshold[i] = 0; ESI->profile.EventIndex[i] = 0; ESI->profile.EventCode[i] = 0; ESI->profile.event_counter--; } else { if ( ESI->profile.event_counter > 0 ) { if ( ( flags & PAPI_PROFIL_FORCE_SW ) && !( ESI->profile.flags & PAPI_PROFIL_FORCE_SW ) ) { papi_return( PAPI_ECNFLCT ); } if ( !( flags & PAPI_PROFIL_FORCE_SW ) && ( ESI->profile.flags & PAPI_PROFIL_FORCE_SW ) ) { papi_return( PAPI_ECNFLCT ); } } for( i = 0; i < ESI->profile.event_counter; i++ ) { if ( ESI->profile.EventCode[i] == EventCode ) { break; } } if ( i == ESI->profile.event_counter ) { i = ESI->profile.event_counter; ESI->profile.event_counter++; ESI->profile.EventCode[i] = EventCode; } ESI->profile.prof[i] = prof; ESI->profile.count[i] = profcnt; ESI->profile.threshold[i] = threshold; ESI->profile.EventIndex[i] = index; } APIDBG( "Profile event counter is %d\n", ESI->profile.event_counter ); /* Clear out old flags */ if ( threshold == 0 ) { flags |= ESI->profile.flags; } /* make sure no invalid flags are set */ if ( flags & ~( PAPI_PROFIL_POSIX | PAPI_PROFIL_RANDOM | PAPI_PROFIL_WEIGHTED | PAPI_PROFIL_COMPRESS | PAPI_PROFIL_BUCKETS | PAPI_PROFIL_FORCE_SW | PAPI_PROFIL_INST_EAR | PAPI_PROFIL_DATA_EAR ) ) { papi_return( 
PAPI_EINVAL ); } /* if we have kernel-based profiling, then we're just asking for signals on interrupt. */ /* if we don't have kernel-based profiling, then we're asking for emulated PMU interrupt */ if ( ( flags & PAPI_PROFIL_FORCE_SW ) && ( _papi_hwd[cidx]->cmp_info.kernel_profile == 0 ) ) { forceSW = PAPI_OVERFLOW_FORCE_SW; } /* make sure one and only one bucket size is set */ buckets = flags & PAPI_PROFIL_BUCKETS; if ( !buckets ) { flags |= PAPI_PROFIL_BUCKET_16; /* default to 16 bit if nothing set */ } else { /* return error if more than one set */ if ( !( ( buckets == PAPI_PROFIL_BUCKET_16 ) || ( buckets == PAPI_PROFIL_BUCKET_32 ) || ( buckets == PAPI_PROFIL_BUCKET_64 ) ) ) { papi_return( PAPI_EINVAL ); } } /* Set up the option structure for the low level */ ESI->profile.flags = flags; if ( _papi_hwd[cidx]->cmp_info.kernel_profile && !( ESI->profile.flags & PAPI_PROFIL_FORCE_SW ) ) { retval = _papi_hwd[cidx]->set_profile( ESI, index, threshold ); if ( ( retval == PAPI_OK ) && ( threshold > 0 ) ) { /* We need overflowing because we use the overflow dispatch handler */ ESI->state |= PAPI_OVERFLOWING; ESI->overflow.flags |= PAPI_OVERFLOW_HARDWARE; } } else { retval = PAPI_overflow( EventSet, EventCode, threshold, forceSW, _papi_hwi_dummy_handler ); } if ( retval < PAPI_OK ) { papi_return( retval ); /* We should undo stuff here */ } /* Toggle the profiling flags and ESI state */ if ( ESI->profile.event_counter >= 1 ) { ESI->state |= PAPI_PROFILING; } else { ESI->state ^= PAPI_PROFILING; ESI->profile.flags = 0; } return PAPI_OK; } /** @class PAPI_profil * @brief Generate a histogram of hardware counter overflows vs. PC addresses. * * @par C Interface: * \#include @n * int PAPI_profil(void *buf, unsigned bufsiz, unsigned long offset, * unsigned scale, int EventSet, int EventCode, int threshold, int flags ); * * @par Fortran Interface * The profiling routines have no Fortran interface. 
* * @param *buf * -- pointer to a buffer of bufsiz bytes in which the histogram counts are * stored in an array of unsigned short, unsigned int, or * unsigned long long values, or 'buckets'. * The size of the buckets is determined by values in the flags argument. * @param bufsiz * -- the size of the histogram buffer in bytes. * It is computed from the length of the code region to be profiled, * the size of the buckets, and the scale factor as discussed above. * @param offset * -- the start address of the region to be profiled. * @param scale * -- broadly and historically speaking, a contraction factor that * indicates how much smaller the histogram buffer is than the * region to be profiled. More precisely, scale is interpreted as an * unsigned 16-bit fixed-point fraction with the decimal point * implied on the left. * Its value is the reciprocal of the number of addresses in a * subdivision, per counter of histogram buffer. * Below is a table of representative values for scale. * @param EventSet * -- The PAPI EventSet to profile. This EventSet is marked as * profiling-ready, but profiling doesn't actually start until a * PAPI_start() call is issued. * @param EventCode * -- Code of the Event in the EventSet to profile. * This event must already be a member of the EventSet. * @param threshold * -- minimum number of events that must occur before the PC is sampled. * If hardware overflow is supported for your component, this threshold * will trigger an interrupt when reached. * Otherwise, the counters will be sampled periodically and the PC will * be recorded for the first sample that exceeds the threshold. * If the value of threshold is 0, profiling will be disabled for * this event. * @param flags * -- bit pattern to control profiling behavior. * Defined values are shown in the table above. * * @retval PAPI_OK * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ENOMEM * Insufficient memory to complete the operation. 
* @retval PAPI_ENOEVST * The EventSet specified does not exist. * @retval PAPI_EISRUN * The EventSet is currently counting events. * @retval PAPI_ECNFLCT * The underlying counter hardware can not count this event and other * events in the EventSet simultaneously. * @retval PAPI_ENOEVNT * The PAPI preset is not available on the underlying hardware. * * PAPI_profil() provides hardware event statistics by profiling * the occurrence of specified hardware counter events. * It is designed to mimic the UNIX SVR4 profil call. * * The statistics are generated by creating a histogram of hardware * counter event overflows vs. program counter addresses for the current * process. The histogram is defined for a specific region of program * code to be profiled, and the identified region is logically broken up * into a set of equal size subdivisions, each of which corresponds to a * count in the histogram. * * With each hardware event overflow, the current subdivision is * identified and its corresponding histogram count is incremented. * These counts establish a relative measure of how many hardware counter * events are occurring in each code subdivision. * * The resulting histogram counts for a profiled region can be used to * identify those program addresses that generate a disproportionately * high percentage of the event of interest. * * Events to be profiled are specified with the EventSet and * EventCode parameters. More than one event can be simultaneously * profiled by calling PAPI_profil() * several times with different EventCode values. * Profiling can be turned off for a given event by calling PAPI_profil() * with a threshold value of 0. * * @par Representative values for the scale variable * @manonly * HEX DECIMAL DEFINITION * 0x20000 131072 Maps precisely one instruction address to a unique bucket in buf. * 0x10000 65536 Maps precisely two instruction addresses to a unique bucket in buf. 
* 0x0FFFF 65535 Maps approximately two instruction addresses to a unique bucket in buf. * 0x08000 32768 Maps every four instruction addresses to a bucket in buf. * 0x04000 16384 Maps every eight instruction addresses to a bucket in buf. * 0x00002 2 Maps all instruction addresses to the same bucket in buf. * 0x00001 1 Undefined. * 0x00000 0 Undefined. * @endmanonly * @htmlonly
* <table>
* <tr><th>HEX</th><th>DECIMAL</th><th>DEFINITION</th></tr>
* <tr><td>0x20000</td><td>131072</td><td>Maps precisely one instruction address to a unique bucket in buf.</td></tr>
* <tr><td>0x10000</td><td>65536</td><td>Maps precisely two instruction addresses to a unique bucket in buf.</td></tr>
* <tr><td>0xFFFF</td><td>65535</td><td>Maps approximately two instruction addresses to a unique bucket in buf.</td></tr>
* <tr><td>0x8000</td><td>32768</td><td>Maps every four instruction addresses to a bucket in buf.</td></tr>
* <tr><td>0x4000</td><td>16384</td><td>Maps every eight instruction addresses to a bucket in buf.</td></tr>
* <tr><td>0x0002</td><td>2</td><td>Maps all instruction addresses to the same bucket in buf.</td></tr>
* <tr><td>0x0001</td><td>1</td><td>Undefined.</td></tr>
* <tr><td>0x0000</td><td>0</td><td>Undefined.</td></tr>
* </table>
* @endhtmlonly * * Historically, the scale factor was introduced to allow the * allocation of buffers smaller than the code size to be profiled. * Data and instruction sizes were assumed to be multiples of 16 bits. * These assumptions are no longer necessarily true. * PAPI_profil() has preserved the traditional definition of * scale where appropriate, but deprecated the definitions for 0 and 1 * (disable scaling) and extended the range of scale to include * 65536 and 131072 to allow for exactly two * addresses and exactly one address per profiling bucket. * * The value of bufsiz is computed as follows: * * bufsiz = (end - start)*(bucket_size/2)*(scale/65536) where * @arg bufsiz - the size of the buffer in bytes * @arg end, start - the ending and starting addresses of the profiled region * @arg bucket_size - the size of each bucket in bytes; 2, 4, or 8 as defined in flags * * @par Defined bits for the flags variable: * @arg PAPI_PROFIL_POSIX Default type of profiling, similar to profil(3).@n * @arg PAPI_PROFIL_RANDOM Drop a random 25% of the samples.@n * @arg PAPI_PROFIL_WEIGHTED Weight the samples by their value.@n * @arg PAPI_PROFIL_COMPRESS Ignore samples as values in the hash buckets get big.@n * @arg PAPI_PROFIL_BUCKET_16 Use unsigned short (16 bit) buckets. This is the default bucket size.@n * @arg PAPI_PROFIL_BUCKET_32 Use unsigned int (32 bit) buckets.@n * @arg PAPI_PROFIL_BUCKET_64 Use unsigned long long (64 bit) buckets.@n * @arg PAPI_PROFIL_FORCE_SW Force software overflow in profiling. 
@n * * @par Example * @code * int retval; * unsigned long length; * PAPI_exe_info_t *prginfo; * unsigned short *profbuf; * * if ((prginfo = PAPI_get_executable_info()) == NULL) * handle_error(1); * * length = (unsigned long)(prginfo->text_end - prginfo->text_start); * * profbuf = (unsigned short *)malloc(length); * if (profbuf == NULL) * handle_error(1); * memset(profbuf,0x00,length); * * if ((retval = PAPI_profil(profbuf, length, prginfo->text_start, 65536, EventSet, * PAPI_FP_INS, 1000000, PAPI_PROFIL_POSIX | PAPI_PROFIL_BUCKET_16)) * != PAPI_OK) * handle_error(retval); * @endcode * * @bug If you call PAPI_profil, PAPI allocates buffer space that will not be * freed if you call PAPI_shutdown or PAPI_cleanup_eventset. * To clean all memory, you must call PAPI_profil on the Events with * a 0 threshold. * * @see PAPI_overflow * @see PAPI_sprofil * */ int PAPI_profil( void *buf, unsigned bufsiz, vptr_t offset, unsigned scale, int EventSet, int EventCode, int threshold, int flags ) { APIDBG( "Entry: buf: %p, bufsiz: %d, offset: %p, scale: %u, EventSet: %d, EventCode: %#x, threshold: %d, flags: %#x\n", buf, bufsiz, offset, scale, EventSet, EventCode, threshold, flags); EventSetInfo_t *ESI; int i; int retval; ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); /* scale factors are checked for validity in PAPI_sprofil */ if ( threshold > 0 ) { PAPI_sprofil_t *prof; for ( i = 0; i < ESI->profile.event_counter; i++ ) { if ( ESI->profile.EventCode[i] == EventCode ) break; } if ( i == ESI->profile.event_counter ) { prof = ( PAPI_sprofil_t * ) papi_malloc( sizeof ( PAPI_sprofil_t ) ); memset( prof, 0x0, sizeof ( PAPI_sprofil_t ) ); prof->pr_base = buf; prof->pr_size = bufsiz; prof->pr_off = offset; prof->pr_scale = scale; retval = PAPI_sprofil( prof, 1, EventSet, EventCode, threshold, flags ); if ( retval != PAPI_OK ) papi_free( prof ); } else { prof = ESI->profile.prof[i]; prof->pr_base = buf; prof->pr_size = bufsiz; prof->pr_off = offset; 
prof->pr_scale = scale; retval = PAPI_sprofil( prof, 1, EventSet, EventCode, threshold, flags ); } papi_return( retval ); } for ( i = 0; i < ESI->profile.event_counter; i++ ) { if ( ESI->profile.EventCode[i] == EventCode ) break; } /* EventCode not found */ if ( i == ESI->profile.event_counter ) papi_return( PAPI_EINVAL ); papi_free( ESI->profile.prof[i] ); ESI->profile.prof[i] = NULL; papi_return( PAPI_sprofil( NULL, 0, EventSet, EventCode, 0, flags ) ); } /* This function sets the low level default granularity for all newly manufactured eventsets. The first function preserves API compatibility and assumes component 0; the second function takes a component argument. */ /** @class PAPI_set_granularity * @brief Set the default counting granularity for eventsets bound to the cpu component. * * @par C Prototype: * \#include <papi.h> @n * int PAPI_set_granularity( int granularity ); * * @param granularity one of the following constants as defined in the papi.h header file * @arg PAPI_GRN_THR -- Count each individual thread * @arg PAPI_GRN_PROC -- Count each individual process * @arg PAPI_GRN_PROCG -- Count each individual process group * @arg PAPI_GRN_SYS -- Count the current CPU * @arg PAPI_GRN_SYS_CPU -- Count all CPUs individually * @arg PAPI_GRN_MIN -- The finest available granularity * @arg PAPI_GRN_MAX -- The coarsest available granularity * @manonly * @endmanonly * * @retval PAPI_OK * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @manonly * @endmanonly * * PAPI_set_granularity sets the default counting granularity for all new * event sets created by PAPI_create_eventset. * This call implicitly sets the granularity for the cpu component * (component 0) and is included to preserve backward compatibility. 
* * @par Example: * @code int ret; // Initialize the library ret = PAPI_library_init(PAPI_VER_CURRENT); if (ret > 0 && ret != PAPI_VER_CURRENT) { fprintf(stderr,"PAPI library version mismatch!\n"); exit(1); } if (ret < 0) handle_error(ret); // Set the default granularity for the cpu component ret = PAPI_set_granularity(PAPI_GRN_PROC); if (ret != PAPI_OK) handle_error(ret); ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); * @endcode * * @see PAPI_set_cmp_granularity PAPI_set_domain PAPI_set_opt PAPI_get_opt */ int PAPI_set_granularity( int granularity ) { return ( PAPI_set_cmp_granularity( granularity, 0 ) ); } /** @class PAPI_set_cmp_granularity * @brief Set the default counting granularity for eventsets bound to the specified component. * * @par C Prototype: * \#include @n * int PAPI_set_cmp_granularity( int granularity, int cidx ); * * @param granularity one of the following constants as defined in the papi.h header file * @arg PAPI_GRN_THR Count each individual thread * @arg PAPI_GRN_PROC Count each individual process * @arg PAPI_GRN_PROCG Count each individual process group * @arg PAPI_GRN_SYS Count the current CPU * @arg PAPI_GRN_SYS_CPU Count all CPUs individually * @arg PAPI_GRN_MIN The finest available granularity * @arg PAPI_GRN_MAX The coarsest available granularity * * @param cidx * An integer identifier for a component. * By convention, component 0 is always the cpu component. * @manonly * @endmanonly * * @retval PAPI_OK * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ENOCMP * The argument cidx is not a valid component. * @manonly * @endmanonly * * PAPI_set_cmp_granularity sets the default counting granularity for all new * event sets, and requires an explicit component argument. * Event sets that are already in existence are not affected. * * To change the granularity of an existing event set, please see PAPI_set_opt. 
* The reader should note that the granularity of an event set affects only * the mode in which the counter continues to run. * * @par Example: * @code int ret; // Initialize the library ret = PAPI_library_init(PAPI_VER_CURRENT); if (ret > 0 && ret != PAPI_VER_CURRENT) { fprintf(stderr,"PAPI library version mismatch!\n"); exit(1); } if (ret < 0) handle_error(ret); // Set the default granularity for the cpu component ret = PAPI_set_cmp_granularity(PAPI_GRN_PROC, 0); if (ret != PAPI_OK) handle_error(ret); ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); * @endcode * * @see PAPI_set_granularity PAPI_set_domain PAPI_set_opt PAPI_get_opt */ int PAPI_set_cmp_granularity( int granularity, int cidx ) { PAPI_option_t ptr; memset( &ptr, 0, sizeof ( ptr ) ); ptr.defgranularity.def_cidx = cidx; ptr.defgranularity.granularity = granularity; papi_return( PAPI_set_opt( PAPI_DEFGRN, &ptr ) ); } /* This function sets the low level default counting domain for all newly manufactured eventsets. The first function preserves API compatibility and assumes component 0; The second function takes a component argument. */ /** @class PAPI_set_domain * @brief Set the default counting domain for new event sets bound to the cpu component. * * @par C Prototype: * \#include @n * int PAPI_set_domain( int domain ); * * @param domain one of the following constants as defined in the papi.h header file * @arg PAPI_DOM_USER User context counted * @arg PAPI_DOM_KERNEL Kernel/OS context counted * @arg PAPI_DOM_OTHER Exception/transient mode counted * @arg PAPI_DOM_SUPERVISOR Supervisor/hypervisor context counted * @arg PAPI_DOM_ALL All above contexts counted * @arg PAPI_DOM_MIN The smallest available context * @arg PAPI_DOM_MAX The largest available context * @manonly * @endmanonly * * @retval PAPI_OK * @retval PAPI_EINVAL * One or more of the arguments is invalid. 
* @manonly * @endmanonly * * PAPI_set_domain sets the default counting domain for all new event sets * created by PAPI_create_eventset in all threads. * This call implicitly sets the domain for the cpu component (component 0) * and is included to preserve backward compatibility. * * @par Example: * @code int ret; // Initialize the library ret = PAPI_library_init(PAPI_VER_CURRENT); if (ret > 0 && ret != PAPI_VER_CURRENT) { fprintf(stderr,"PAPI library version mismatch!\n"); exit(1); } if (ret < 0) handle_error(ret); // Set the default domain for the cpu component ret = PAPI_set_domain(PAPI_DOM_KERNEL); if (ret != PAPI_OK) handle_error(ret); ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); * @endcode * * @see PAPI_set_cmp_domain PAPI_set_granularity PAPI_set_opt PAPI_get_opt */ int PAPI_set_domain( int domain ) { return ( PAPI_set_cmp_domain( domain, 0 ) ); } /** @class PAPI_set_cmp_domain * @brief Set the default counting domain for new event sets bound to the specified component. * * @par C Prototype: * \#include @n * int PAPI_set_cmp_domain( int domain, int cidx ); * * @param domain one of the following constants as defined in the papi.h header file * @arg PAPI_DOM_USER User context counted * @arg PAPI_DOM_KERNEL Kernel/OS context counted * @arg PAPI_DOM_OTHER Exception/transient mode counted * @arg PAPI_DOM_SUPERVISOR Supervisor/hypervisor context counted * @arg PAPI_DOM_ALL All above contexts counted * @arg PAPI_DOM_MIN The smallest available context * @arg PAPI_DOM_MAX The largest available context * @arg PAPI_DOM_HWSPEC Something other than CPU like stuff. Individual components can decode * low order bits for more meaning * * @param cidx * An integer identifier for a component. * By convention, component 0 is always the cpu component. * @manonly * @endmanonly * * @retval PAPI_OK * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ENOCMP * The argument cidx is not a valid component. 
* @manonly * @endmanonly * * PAPI_set_cmp_domain sets the default counting domain for all new event sets * in all threads, and requires an explicit component argument. * Event sets that are already in existence are not affected. * To change the domain of an existing event set, please see PAPI_set_opt. * The reader should note that the domain of an event set affects only the * mode in which the counter continues to run. * Counts are still aggregated for the current process, and not for any other * processes in the system. * Thus when requesting PAPI_DOM_KERNEL , the user is asking for events that * occur on behalf of the process, inside the kernel. * * @par Example: * @code int ret; // Initialize the library ret = PAPI_library_init(PAPI_VER_CURRENT); if (ret > 0 && ret != PAPI_VER_CURRENT) { fprintf(stderr,"PAPI library version mismatch!\n"); exit(1); } if (ret < 0) handle_error(ret); // Set the default domain for the cpu component ret = PAPI_set_cmp_domain(PAPI_DOM_KERNEL,0); if (ret != PAPI_OK) handle_error(ret); ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); * @endcode * * @see PAPI_set_domain PAPI_set_granularity PAPI_set_opt PAPI_get_opt */ int PAPI_set_cmp_domain( int domain, int cidx ) { PAPI_option_t ptr; memset( &ptr, 0, sizeof ( ptr ) ); ptr.defdomain.def_cidx = cidx; ptr.defdomain.domain = domain; papi_return( PAPI_set_opt( PAPI_DEFDOM, &ptr ) ); } /** @class PAPI_add_events * @brief add multiple PAPI presets or native hardware events to an event set * * @par C Interface: * \#include @n * int PAPI_add_events( int EventSet, int * EventCodes, int number ); * * PAPI_add_event adds one event to a PAPI Event Set. PAPI_add_events does * the same, but for an array of events. @n * A hardware event can be either a PAPI preset or a native hardware event code. * For a list of PAPI preset events, see PAPI_presets or run the avail test case * in the PAPI distribution. 
PAPI presets can be passed to PAPI_query_event to see * if they exist on the underlying architecture. * For a list of native events available on current platform, run native_avail * test case in the PAPI distribution. For the encoding of native events, * see PAPI_event_name_to_code to learn how to generate native code for the * supported native event on the underlying architecture. * * @param EventSet * An integer handle for a PAPI Event Set as created by PAPI_create_eventset. * @param *EventCode * An array of defined events. * @param number * An integer indicating the number of events in the array *EventCode. * It should be noted that PAPI_add_events can partially succeed, * exactly like PAPI_remove_events. * * @retval Positive-Integer * The number of consecutive elements that succeeded before the error. * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ENOMEM * Insufficient memory to complete the operation. * @retval PAPI_ENOEVST * The event set specified does not exist. * @retval PAPI_EISRUN * The event set is currently counting events. * @retval PAPI_ECNFLCT * The underlying counter hardware can not count this event and other events * in the event set simultaneously. * @retval PAPI_ENOEVNT * The PAPI preset is not available on the underlying hardware. * @retval PAPI_EBUG * Internal error, please send mail to the developers. 
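* * @par Example using the array interface (an illustrative sketch, not from the original documentation): * @code * int EventSet = PAPI_NULL; * int Events[2] = { PAPI_TOT_INS, PAPI_TOT_CYC }; * int retval; * if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) * handle_error( 1 ); * // Add both events in one call; a positive return value reports * // how many consecutive elements succeeded before an error * retval = PAPI_add_events( EventSet, Events, 2 ); * if ( retval != PAPI_OK ) * handle_error( retval ); * @endcode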
* * @par Examples: * @code * int EventSet = PAPI_NULL; * unsigned int native = 0x0; * if ( PAPI_create_eventset( &EventSet ) != PAPI_OK ) * handle_error( 1 ); * // Add Total Instructions Executed to our EventSet * if ( PAPI_add_event( EventSet, PAPI_TOT_INS ) != PAPI_OK ) * handle_error( 1 ); * // Add native event PM_CYC to EventSet * if ( PAPI_event_name_to_code( "PM_CYC", &native ) != PAPI_OK ) * handle_error( 1 ); * if ( PAPI_add_event( EventSet, native ) != PAPI_OK ) * handle_error( 1 ); * @endcode * * @bug * The vector function should take a pointer to a length argument so a proper * return value can be set upon partial success. * * @see PAPI_cleanup_eventset @n * PAPI_destroy_eventset @n * PAPI_event_code_to_name @n * PAPI_remove_events @n * PAPI_query_event @n * PAPI_presets @n * PAPI_native @n * PAPI_remove_event */ int PAPI_add_events( int EventSet, int *Events, int number ) { APIDBG( "Entry: EventSet: %d, Events: %p, number: %d\n", EventSet, Events, number); int i, retval; if ( ( Events == NULL ) || ( number <= 0 ) ) papi_return( PAPI_EINVAL ); for ( i = 0; i < number; i++ ) { retval = PAPI_add_event( EventSet, Events[i] ); if ( retval != PAPI_OK ) { if ( i == 0 ) papi_return( retval ); else return ( i ); } } return ( PAPI_OK ); } /** @class PAPI_remove_events * @brief Remove an array of hardware event codes from a PAPI event set. * * A hardware event can be either a PAPI Preset or a native hardware event code. * For a list of PAPI preset events, see PAPI_presets or run the papi_avail utility in the PAPI distribution. * PAPI Presets can be passed to PAPI_query_event to see if they exist on the underlying architecture. * For a list of native events available on current platform, run papi_native_avail in the PAPI distribution. * It should be noted that PAPI_remove_events can partially succeed, exactly like PAPI_add_events. 
* * @par C Prototype: * \#include @n * int PAPI_remove_events( int EventSet, int * EventCode, int number ); * * @param EventSet * an integer handle for a PAPI event set as created by PAPI_create_eventset * @param *Events * an array of defined events * @param number * an integer indicating the number of events in the array *EventCode * * @retval Positive integer * The number of consecutive elements that succeeded before the error. * @retval PAPI_EINVAL * One or more of the arguments is invalid. * @retval PAPI_ENOEVST * The EventSet specified does not exist. * @retval PAPI_EISRUN * The EventSet is currently counting events. * @retval PAPI_ECNFLCT * The underlying counter hardware can not count this event and other * events in the EventSet simultaneously. * @retval PAPI_ENOEVNT * The PAPI preset is not available on the underlying hardware. * * @par Example: * @code int EventSet = PAPI_NULL; int Events[] = {PAPI_TOT_INS, PAPI_FP_OPS}; int ret; // Create an empty EventSet ret = PAPI_create_eventset(&EventSet); if (ret != PAPI_OK) handle_error(ret); // Add two events to our EventSet ret = PAPI_add_events(EventSet, Events, 2); if (ret != PAPI_OK) handle_error(ret); // Start counting ret = PAPI_start(EventSet); if (ret != PAPI_OK) handle_error(ret); // Stop counting, ignore values ret = PAPI_stop(EventSet, NULL); if (ret != PAPI_OK) handle_error(ret); // Remove event ret = PAPI_remove_events(EventSet, Events, 2); if (ret != PAPI_OK) handle_error(ret); * @endcode * * @bug The last argument should be a pointer so the count can be returned on partial success in addition * to a real error code. 
* * @see PAPI_cleanup_eventset PAPI_destroy_eventset PAPI_event_name_to_code * PAPI_presets PAPI_add_event PAPI_add_events */ int PAPI_remove_events( int EventSet, int *Events, int number ) { APIDBG( "Entry: EventSet: %d, Events: %p, number: %d\n", EventSet, Events, number); int i, retval; if ( ( Events == NULL ) || ( number <= 0 ) ) papi_return( PAPI_EINVAL ); for ( i = 0; i < number; i++ ) { retval = PAPI_remove_event( EventSet, Events[i] ); if ( retval != PAPI_OK ) { if ( i == 0 ) papi_return( retval ); else return ( i ); } } return ( PAPI_OK ); } /** @class PAPI_list_events * @brief list the events in an event set * * PAPI_list_events() returns an array of events and a count of the * total number of events in an event set. * This call assumes an initialized PAPI library and a successfully created event set. * * @par C Interface * \#include @n * int PAPI_list_events(int EventSet, int *Events, int *number); * * @param[in] EventSet * An integer handle for a PAPI event set as created by PAPI_create_eventset * @param[in,out] *Events * A pointer to a preallocated array of codes for events, such as PAPI_INT_INS. * No more than *number codes will be stored into the array. * @param[in,out] *number * On input, the size of the Events array, or maximum number of event codes * to be returned. A value of 0 can be used to probe an event set. * On output, the number of events actually in the event set. * This value may be greater than the actually stored number of event codes. 
* * @retval PAPI_EINVAL * @retval PAPI_ENOEVST * * @par Examples: * @code if (PAPI_event_name_to_code("PAPI_TOT_INS",&EventCode) != PAPI_OK) exit(1); if (PAPI_add_event(EventSet, EventCode) != PAPI_OK) exit(1); // Convert a second event name to an event code if (PAPI_event_name_to_code("PAPI_L1_LDM",&EventCode) != PAPI_OK) exit(1); if (PAPI_add_event(EventSet, EventCode) != PAPI_OK) exit(1); number = 0; if(PAPI_list_events(EventSet, NULL, &number)) exit(1); if(number != 2) exit(1); if(PAPI_list_events(EventSet, Events, &number)) exit(1); * @endcode * @see PAPI_event_code_to_name * @see PAPI_event_name_to_code * @see PAPI_add_event * @see PAPI_create_eventset */ int PAPI_list_events( int EventSet, int *Events, int *number ) { APIDBG( "Entry: EventSet: %d, Events: %p, number: %p\n", EventSet, Events, number); EventSetInfo_t *ESI; int i, j; if ( *number < 0 ) papi_return( PAPI_EINVAL ); if ( ( Events == NULL ) && ( *number > 0 ) ) papi_return( PAPI_EINVAL ); ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( !ESI ) papi_return( PAPI_ENOEVST ); if ( ( Events == NULL ) || ( *number == 0 ) ) { *number = ESI->NumberOfEvents; papi_return( PAPI_OK ); } for ( i = 0, j = 0; j < ESI->NumberOfEvents; i++ ) { if ( ( int ) ESI->EventInfoArray[i].event_code != PAPI_NULL ) { Events[j] = ( int ) ESI->EventInfoArray[i].event_code; j++; if ( j == *number ) break; } } *number = j; return ( PAPI_OK ); } /* xxx This is OS dependent, not component dependent, right? */ /** @class PAPI_get_dmem_info * @brief Get information about the dynamic memory usage of the current program. * * @par C Prototype: * \#include <papi.h> @n * int PAPI_get_dmem_info( PAPI_dmem_info_t *dest ); * * @param dest * structure to be filled in @ref PAPI_dmem_info_t * * @retval PAPI_ECMP * The function is not implemented for the current component. * @retval PAPI_EINVAL * Any value in the structure or array may be undefined as indicated by * this error value. * @retval PAPI_ESYS * A system error occurred. 
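* * @par Example (an illustrative sketch, not from the original documentation; assumes the library has been initialized): * @code * PAPI_dmem_info_t dmem; * if ( PAPI_get_dmem_info( &dmem ) != PAPI_OK ) * handle_error( 1 ); * printf( "Mem size: %lld, Resident: %lld\n", dmem.size, dmem.resident ); * @endcode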
* * @note This function is only implemented for the Linux operating system. * This function takes a pointer to a PAPI_dmem_info_t structure * and returns with the structure fields filled in. * A value of PAPI_EINVAL in any field indicates an undefined parameter. * * @see PAPI_get_executable_info PAPI_get_hardware_info PAPI_get_opt PAPI_library_init */ int PAPI_get_dmem_info( PAPI_dmem_info_t * dest ) { if ( dest == NULL ) return PAPI_EINVAL; memset( ( void * ) dest, 0x0, sizeof ( PAPI_dmem_info_t ) ); return ( _papi_os_vector.get_dmem_info( dest ) ); } /** @class PAPI_get_executable_info * @brief Get the executable's address space info. * * @par C Interface: * \#include @n * const PAPI_exe_info_t *PAPI_get_executable_info( void ); * * This function returns a pointer to a structure containing information * about the current program. * * @param fullname * Fully qualified path + filename of the executable. * @param name * Filename of the executable with no path information. * @param text_start, text_end * Start and End addresses of program text segment. * @param data_start, data_end * Start and End addresses of program data segment. * @param bss_start, bss_end * Start and End addresses of program bss segment. * * @retval PAPI_EINVAL * One or more of the arguments is invalid. 
* * @par Examples: * @code * const PAPI_exe_info_t *prginfo = NULL; * if ( ( prginfo = PAPI_get_executable_info( ) ) == NULL ) * exit( 1 ); * printf( "Path+Program: %s\n", prginfo->fullname ); * printf( "Program: %s\n", prginfo->address_info.name ); * printf( "Text start: %p, Text end: %p\n", prginfo->address_info.text_start, prginfo->address_info.text_end ); * printf( "Data start: %p, Data end: %p\n", prginfo->address_info.data_start, prginfo->address_info.data_end ); * printf( "Bss start: %p, Bss end: %p\n", prginfo->address_info.bss_start, prginfo->address_info.bss_end ); * @endcode * * @see PAPI_get_opt * @see PAPI_get_hardware_info * @see PAPI_exe_info_t */ const PAPI_exe_info_t * PAPI_get_executable_info( void ) { PAPI_option_t ptr; int retval; memset( &ptr, 0, sizeof ( ptr ) ); retval = PAPI_get_opt( PAPI_EXEINFO, &ptr ); if ( retval == PAPI_OK ) return ( ptr.exe_info ); else return ( NULL ); } /** @class PAPI_get_shared_lib_info * @brief Get address info about the shared libraries used by the process. * * In C, this function returns a pointer to a structure containing information * about the shared libraries used by the program. * There is no Fortran equivalent call. * @note This data will be incorporated into the PAPI_get_executable_info call in the future. PAPI_get_shared_lib_info will be deprecated and should be used with caution. * * @bug If called before initialization the behavior of the routine is undefined. 
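* * @par Example (an illustrative sketch, not from the original documentation; the count field follows the PAPI_shlib_info_t definition): * @code * const PAPI_shlib_info_t *shinfo; * if ( ( shinfo = PAPI_get_shared_lib_info( ) ) == NULL ) * exit( 1 ); * printf( "Shared libraries in use: %d\n", shinfo->count ); * @endcode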
* * @see PAPI_shlib_info_t * @see PAPI_get_hardware_info * @see PAPI_get_executable_info * @see PAPI_get_dmem_info * @see PAPI_get_opt PAPI_library_init */ const PAPI_shlib_info_t * PAPI_get_shared_lib_info( void ) { PAPI_option_t ptr; int retval; memset( &ptr, 0, sizeof ( ptr ) ); retval = PAPI_get_opt( PAPI_SHLIBINFO, &ptr ); if ( retval == PAPI_OK ) return ( ptr.shlib_info ); else return ( NULL ); } /** @class PAPI_get_hardware_info * @brief get information about the system hardware * * In C, this function returns a pointer to a structure containing information about the hardware on which the program runs. * In Fortran, the values of the structure are returned explicitly. * * @retval PAPI_EINVAL * One or more of the arguments is invalid. * * @bug * If called before initialization the behavior of the routine is undefined. * * @note The C structure contains detailed information about cache and TLB sizes. * This information is not available from Fortran. * * @par Examples: * @code const PAPI_hw_info_t *hwinfo = NULL; if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) exit(1); if ((hwinfo = PAPI_get_hardware_info()) == NULL) exit(1); printf("%d CPUs at %f Mhz.\en",hwinfo->totalcpus,hwinfo->mhz); * @endcode * * @see PAPI_hw_info_t * @see PAPI_get_executable_info, PAPI_get_opt, PAPI_get_dmem_info, PAPI_library_init */ const PAPI_hw_info_t * PAPI_get_hardware_info( void ) { PAPI_option_t ptr; int retval; memset( &ptr, 0, sizeof ( ptr ) ); retval = PAPI_get_opt( PAPI_HWINFO, &ptr ); if ( retval == PAPI_OK ) return ( ptr.hw_info ); else return ( NULL ); } /* The next 4 timing functions always use component 0 */ /** @class PAPI_get_real_cyc * @brief get real time counter value in clock cycles * Returns the total real time passed since some arbitrary starting point. * The time is returned in clock cycles. * This call is equivalent to wall clock time. 
* * @par Examples: * @code s = PAPI_get_real_cyc(); your_slow_code(); e = PAPI_get_real_cyc(); printf("Wallclock cycles: %lld\en",e-s); * @endcode * @see PAPIF PAPI PAPI_get_virt_usec PAPI_get_virt_cyc PAPI_library_init */ long long PAPI_get_real_cyc( void ) { return ( _papi_os_vector.get_real_cycles( ) ); } /** @class PAPI_get_real_nsec * @brief Get real time counter value in nanoseconds. * * This function returns the total real time passed since some arbitrary * starting point. * The time is returned in nanoseconds. * This call is equivalent to wall clock time. * * @see PAPI_get_virt_usec * @see PAPI_get_virt_cyc * @see PAPI_library_init */ /* FIXME */ long long PAPI_get_real_nsec( void ) { return ( ( _papi_os_vector.get_real_nsec( ))); } /** @class PAPI_get_real_usec * @brief get real time counter value in microseconds * * This function returns the total real time passed since some arbitrary * starting point. * The time is returned in microseconds. * This call is equivalent to wall clock time. * @par Examples: * @code s = PAPI_get_real_usec(); your_slow_code(); e = PAPI_get_real_usec(); printf("Wallclock usec: %lld\en",e-s); * @endcode * @see PAPIF * @see PAPI * @see PAPI_get_virt_usec * @see PAPI_get_virt_cyc * @see PAPI_library_init */ long long PAPI_get_real_usec( void ) { return ( _papi_os_vector.get_real_usec( ) ); } /** @class PAPI_get_virt_cyc * @brief get virtual time counter value in clock cycles * * @retval PAPI_ECNFLCT * If there is no master event set. * This will happen if the library has not been initialized, or * for threaded applications, if there has been no thread id * function defined by the PAPI_thread_init function. * @retval PAPI_ENOMEM * For threaded applications, if there has not yet been any thread * specific master event created for the current thread, and if * the allocation of such an event set fails, the call will return * PAPI_ENOMEM or PAPI_ESYS .
* * This function returns the total number of virtual units from some * arbitrary starting point. * Virtual units accrue every time the process is running in user-mode on * behalf of the process. * Like the real time counters, this count is guaranteed to exist on every platform * PAPI supports. * However on some platforms, the resolution can be as bad as 1/Hz as defined * by the operating system. * @par Examples: * @code s = PAPI_get_virt_cyc(); your_slow_code(); e = PAPI_get_virt_cyc(); printf("Process has run for cycles: %lld\en",e-s); * @endcode */ long long PAPI_get_virt_cyc( void ) { return ( ( long long ) _papi_os_vector.get_virt_cycles( ) ); } /** @class PAPI_get_virt_nsec * @brief Get virtual time counter values in nanoseconds. * * @retval PAPI_ECNFLCT * If there is no master event set. * This will happen if the library has not been initialized, or for threaded * applications, if there has been no thread id function defined by the * PAPI_thread_init function. * @retval PAPI_ENOMEM * For threaded applications, if there has not yet been any thread specific * master event created for the current thread, and if the allocation of * such an event set fails, the call will return PAPI_ENOMEM or PAPI_ESYS . * * This function returns the total number of virtual units from some * arbitrary starting point. * Virtual units accrue every time the process is running in user-mode on * behalf of the process. * Like the real time counters, this count is guaranteed to exist on every platform * PAPI supports. * However on some platforms, the resolution can be as bad as 1/Hz as defined * by the operating system. * */ long long PAPI_get_virt_nsec( void ) { return ( ( _papi_os_vector.get_virt_nsec())); } /** @class PAPI_get_virt_usec * @brief get virtual time counter values in microseconds * * @retval PAPI_ECNFLCT * If there is no master event set. 
* This will happen if the library has not been initialized, or for threaded * applications, if there has been no thread id function defined by the * PAPI_thread_init function. * @retval PAPI_ENOMEM * For threaded applications, if there has not yet been any thread * specific master event created for the current thread, and if the * allocation of such an event set fails, the call will return PAPI_ENOMEM or PAPI_ESYS . * * This function returns the total number of virtual units from some * arbitrary starting point. * Virtual units accrue every time the process is running in user-mode on * behalf of the process. * Like the real time counters, this count is guaranteed to exist on every * platform PAPI supports. However on some platforms, the resolution can be * as bad as 1/Hz as defined by the operating system. * @par Examples: * @code s = PAPI_get_virt_usec(); your_slow_code(); e = PAPI_get_virt_usec(); printf("Process has run for usec: %lld\en",e-s); * @endcode * @see PAPIF * @see PAPI * @see PAPI_get_real_cyc * @see PAPI_get_virt_cyc */ long long PAPI_get_virt_usec( void ) { return ( ( long long ) _papi_os_vector.get_virt_usec() ); } /** @class PAPI_lock * @brief Lock one of two mutex variables defined in papi.h. * * PAPI_lock() grabs access to one of the two PAPI mutex variables. * This function is provided to the user to have a platform independent call * to a (hopefully) efficiently implemented mutex. * * @par C Interface: * \#include <papi.h> @n * void PAPI_lock(int lock); * * @param[in] lock * -- an integer value specifying one of the two user locks: PAPI_USR1_LOCK or PAPI_USR2_LOCK * * @returns * There is no return value for this call. * Upon return from PAPI_lock the current thread has acquired * exclusive access to the specified PAPI mutex.
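 *
 * @par Example:
 * A minimal illustrative sketch (not taken from the original header),
 * guarding an update to a shared variable with one of the user locks:
 * @code
 * PAPI_lock( PAPI_USR1_LOCK );
 * shared_counter++;   // critical section protected by the PAPI user mutex
 * PAPI_unlock( PAPI_USR1_LOCK );
 * @endcode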
* * @see PAPI_unlock * @see PAPI_thread_init */ int PAPI_lock( int lck ) { if ( ( lck < 0 ) || ( lck >= PAPI_MAX_LOCK ) ) papi_return( PAPI_EINVAL ); papi_return( _papi_hwi_lock( lck ) ); } /** @class PAPI_unlock * @brief Unlock one of the mutex variables defined in papi.h. * * @param lck * an integer value specifying one of the two user locks: PAPI_USR1_LOCK * or PAPI_USR2_LOCK * * PAPI_unlock() unlocks the mutex acquired by a call to PAPI_lock . * * @see PAPI_thread_init */ int PAPI_unlock( int lck ) { if ( ( lck < 0 ) || ( lck >= PAPI_MAX_LOCK ) ) papi_return( PAPI_EINVAL ); papi_return( _papi_hwi_unlock( lck ) ); } /** @class PAPI_is_initialized * @brief check for initialization * @retval PAPI_NOT_INITED * Library has not been initialized * @retval PAPI_LOW_LEVEL_INITED * Low level has called library init * @retval PAPI_HIGH_LEVEL_INITED * High level has called library init * @retval PAPI_THREAD_LEVEL_INITED * Threads have been inited * * @param version upon initialization, PAPI checks the argument against the internal value of PAPI_VER_CURRENT when the library was compiled. * This guards against portability problems when updating the PAPI shared libraries on your system. * @par Examples: * @code int retval; retval = PAPI_library_init(PAPI_VER_CURRENT); if (retval != PAPI_VER_CURRENT && retval > 0) { fprintf(stderr,"PAPI library version mismatch!\en"); exit(1); } if (retval < 0) handle_error(retval); retval = PAPI_is_initialized(); if (retval != PAPI_LOW_LEVEL_INITED) handle_error(retval); * @endcode * PAPI_is_initialized() returns the status of the PAPI library. * The PAPI library can be in one of four states, as described under RETURN VALUES. * @bug If you don't call this before using any of the low level PAPI calls, your application could core dump. 
* @see PAPI * @see PAPI_thread_init */ int PAPI_is_initialized( void ) { return ( init_level ); } /* This function maps the overflow_vector to event indexes in the event set, so that user can know which PAPI event overflowed. int *array---- an array of event indexes in eventset; the first index maps to the highest set bit in overflow_vector int *number--- this is an input/output parameter, user should put the size of the array into this parameter, after the function is executed, the number of indexes in *array is written to this parameter */ /** @class PAPI_get_overflow_event_index * @brief converts an overflow vector into an array of indexes to overflowing events * @param EventSet * an integer handle to a PAPI event set as created by PAPI_create_eventset * @param overflow_vector * a vector with bits set for each counter that overflowed. * This vector is passed by the system to the overflow handler routine. * @param *array * an array of indexes for events in EventSet. * No more than *number indexes will be stored into the array. * @param *number * On input the variable determines the size of the array. * On output the variable contains the number of indexes in the array. * * @retval PAPI_EINVAL * One or more of the arguments is invalid. This could occur if the overflow_vector is empty (zero), if the array or number pointers are NULL, if the value of number is less than one, or if the EventSet is empty. * @retval PAPI_ENOEVST The EventSet specified does not exist. * @par Examples * @code void handler(int EventSet, void *address, long_long overflow_vector, void *context){ int Events[4], number, i; int total = 0, retval; printf("Overflow #%d\n Handler(%d) Overflow at %p! 
vector=%#llx\n", total, EventSet, address, overflow_vector); total++; number = 4; retval = PAPI_get_overflow_event_index(EventSet, overflow_vector, Events, &number); if(retval == PAPI_OK) for(i=0; i<number; i++) printf("Event index[%d] = %d", i, Events[i]);} * @endcode * @see PAPI_overflow */ int PAPI_get_overflow_event_index( int EventSet, long long overflow_vector, int *array, int *number ) { EventSetInfo_t *ESI; int set_bit, j, pos; int count = 0, k; if ( overflow_vector == ( long long ) 0 ) papi_return( PAPI_EINVAL ); if ( ( array == NULL ) || ( number == NULL ) ) papi_return( PAPI_EINVAL ); if ( *number < 1 ) papi_return( PAPI_EINVAL ); ESI = _papi_hwi_lookup_EventSet( EventSet ); if ( ESI == NULL ) papi_return( PAPI_ENOEVST ); /* in case the eventset is empty */ if ( ESI->NumberOfEvents == 0 ) papi_return( PAPI_EINVAL ); while ( ( set_bit = ffsll( overflow_vector ) ) ) { set_bit -= 1; overflow_vector ^= ( long long ) 1 << set_bit; for ( j = 0; j < ESI->NumberOfEvents; j++ ) { for ( k = 0, pos = 0; k < PAPI_EVENTS_IN_DERIVED_EVENT && pos >= 0; k++ ) { pos = ESI->EventInfoArray[j].pos[k]; if ( ( set_bit == pos ) && ( ( ESI->EventInfoArray[j].derived == NOT_DERIVED ) || ( ESI->EventInfoArray[j].derived == DERIVED_CMPD ) ) ) { array[count++] = j; if ( count == *number ) return PAPI_OK; break; } } } } *number = count; return PAPI_OK; } /** @class PAPI_get_event_component * @brief return component an event belongs to * @retval ENOCMP * component does not exist * * @param EventCode * EventCode for which we want to know the component index * @par Examples: * @code int cidx,eventcode; cidx = PAPI_get_event_component(eventcode); * @endcode * PAPI_get_event_component() returns the component an event * belongs to. * @bug Doesn't work for preset events * @see PAPI_get_event_info */ int PAPI_get_event_component( int EventCode) { APIDBG( "Entry: EventCode: %#x\n", EventCode); return _papi_hwi_component_index( EventCode); } /** @class PAPI_get_component_index * @brief returns the component index for the named component * @retval ENOCMP * component does not exist * * @param name * name of component to find index for * @par Examples: * @code int cidx; cidx = PAPI_get_component_index("cuda"); if (cidx>=0) { printf("The CUDA component is cidx %d\n",cidx); } * @endcode * PAPI_get_component_index() returns the component index of * the named component. This is useful for finding out if * a specified component exists.
* @bug Doesn't work for preset events * @see PAPI_get_event_component */ int PAPI_get_component_index(const char *name) { APIDBG( "Entry: name: %s\n", name); int cidx; const PAPI_component_info_t *cinfo; for(cidx=0;cidx<papi_num_components;cidx++) { cinfo=PAPI_get_component_info(cidx); if (cinfo==NULL) return PAPI_ENOCMP; if (!strcmp(name,cinfo->name)) { return cidx; } } return PAPI_ENOCMP; } /** @class PAPI_disable_component * @brief disables the specified component * @retval ENOCMP * component does not exist * @retval ENOINIT * cannot disable as PAPI has already been initialized * * @param cidx * component index of component to be disabled * @par Examples: * @code int cidx, result; cidx = PAPI_get_component_index("example"); if (cidx>=0) { result = PAPI_disable_component(cidx); if (result==PAPI_OK) printf("The example component is disabled\n"); } // ... PAPI_library_init(); * @endcode * PAPI_disable_component() allows the user to disable components * before PAPI_library_init() time. This is useful if the user * knows they do not wish to use events from that component and * want to reduce the PAPI library overhead. * * PAPI_disable_component() must be called before * PAPI_library_init(). * * @see PAPI_get_event_component * @see PAPI_library_init */ int PAPI_disable_component( int cidx ) { APIDBG( "Entry: cidx: %d\n", cidx); const PAPI_component_info_t *cinfo; /* Can only run before PAPI_library_init() is called */ if (init_level != PAPI_NOT_INITED) { return PAPI_ENOINIT; } cinfo=PAPI_get_component_info(cidx); if (cinfo==NULL) return PAPI_ENOCMP; ((PAPI_component_info_t *)cinfo)->disabled=1; strcpy(((PAPI_component_info_t *)cinfo)->disabled_reason, "Disabled by PAPI_disable_component()"); return PAPI_OK; } /** \class PAPI_disable_component_by_name * \brief disables the named component * \retval ENOCMP * component does not exist * \retval ENOINIT * unable to disable the component, the library has already been initialized * \param component_name * name of the component to disable.
* \par Example: * \code int result; result = PAPI_disable_component_by_name("example"); if (result==PAPI_OK) printf("component \"example\" has been disabled\n"); //... PAPI_library_init(PAPI_VER_CURRENT); * \endcode * PAPI_disable_component_by_name() allows the user to disable a component * before PAPI_library_init() time. This is useful if the user knows they do * not wish to use events from that component and want to reduce the PAPI * library overhead. * * PAPI_disable_component_by_name() must be called before PAPI_library_init(). * * \bug none known * \see PAPI_library_init * \see PAPI_disable_component */ int PAPI_disable_component_by_name(const char *name ) { APIDBG( "Entry: name: %s\n", name); int cidx; /* I can only be called before init time */ if (init_level!=PAPI_NOT_INITED) { return PAPI_ENOINIT; } cidx = PAPI_get_component_index(name); if (cidx>=0) { return PAPI_disable_component(cidx); } return PAPI_ENOCMP; } /** \class PAPI_enum_dev_type * \brief returns handle of next device type * \retval ENOCMP * component does not exist * \retval EINVAL end of device type list * \param enum_modifier * device type modifier, used to filter out enumerated device types * \par Example: * \code enum { PAPI_DEV_TYPE_ENUM__FIRST, PAPI_DEV_TYPE_ENUM__CPU, PAPI_DEV_TYPE_ENUM__CUDA, PAPI_DEV_TYPE_ENUM__ROCM, PAPI_DEV_TYPE_ENUM__ALL }; void *handle; const char *vendor_name; int enum_modifier = PAPI_DEV_TYPE_ENUM__CPU | PAPI_DEV_TYPE_ENUM__CUDA; while (PAPI_OK == PAPI_enum_dev_type(enum_modifier, &handle)) { PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__CHAR_NAME, &vendor_name); ... } * \endcode * PAPI_enum_dev_type() allows the user to access all device types in the system. * It takes an enumerator modifier that allows users to enumerate only devices of * a predefined type and it returns an opaque handler that users can pass to other * functions in order to query device type attributes.
* * \bug none known * \see PAPI_get_dev_type_attr * \see PAPI_get_dev_attr */ int PAPI_enum_dev_type(int enum_modifier, void **handle) { return _papi_hwi_enum_dev_type(enum_modifier, handle); } /** \class PAPI_get_dev_type_attr * \brief returns device type attributes * \retval ENOSUPP * invalid attribute * \param handle * opaque handle for device, obtained through PAPI_enum_dev_type * \param attr * device type attribute to query * \param val * value of the requested device type attribute * \par Example: * \code typedef enum { PAPI_DEV_TYPE_ATTR__INT_PAPI_ID, // PAPI defined device type id PAPI_DEV_TYPE_ATTR__INT_VENDOR_ID, // Vendor defined id PAPI_DEV_TYPE_ATTR__CHAR_NAME, // Vendor name PAPI_DEV_TYPE_ATTR__INT_COUNT, // Devices of that type and vendor PAPI_DEV_TYPE_ATTR__CHAR_STATUS, // Status string for the device type } PAPI_dev_type_attr_e; typedef enum { PAPI_DEV_TYPE_ID__CPU, // Device id for CPUs PAPI_DEV_TYPE_ID__CUDA, // Device id for Nvidia GPUs PAPI_DEV_TYPE_ID__ROCM, // Device id for AMD GPUs } PAPI_dev_type_id_e; void *handle; int id; int enum_modifier = PAPI_DEV_TYPE_ENUM__ALL; while (PAPI_OK == PAPI_enum_dev_type(enum_modifier, &handle)) { PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_PAPI_ID, &id); switch (id) { case PAPI_DEV_TYPE_ID__CPU: // query cpu attributes break; case PAPI_DEV_TYPE_ID__CUDA: // query nvidia gpu attributes break; case PAPI_DEV_TYPE_ID__ROCM: // query amd gpu attributes break; default: ... } } * \endcode * PAPI_get_dev_type_attr() allows the user to query all device type attributes. * It takes a device type handle, returned by PAPI_enum_dev_type, and an attribute * to be queried for the device type and returns the attribute value. 
* * \bug none known * \see PAPI_enum_dev_type * \see PAPI_get_dev_attr */ int PAPI_get_dev_type_attr(void *handle, PAPI_dev_type_attr_e attr, void *val) { return _papi_hwi_get_dev_type_attr(handle, attr, val); } /** \class PAPI_get_dev_attr * \brief returns device attributes * \retval ENOSUPP * invalid/unsupported attribute * \param handle * opaque handle for device, obtained through PAPI_enum_dev_type * \param id * integer identifier of queried device * \param attr * device attribute to query * \param val * value of the requested device attribute * \par Example: * \code typedef enum { PAPI_DEV_ATTR__CPU_CHAR_NAME, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_LINE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_LINE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_LINE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_LINE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_LINE_COUNT, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_LINE_COUNT, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_LINE_COUNT, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_LINE_COUNT, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_ASSOC, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_ASSOC, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_ASSOC, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_ASSOC, PAPI_DEV_ATTR__CPU_UINT_SOCKET_COUNT, PAPI_DEV_ATTR__CPU_UINT_NUMA_COUNT, PAPI_DEV_ATTR__CPU_UINT_CORE_COUNT, PAPI_DEV_ATTR__CPU_UINT_THREAD_COUNT, PAPI_DEV_ATTR__CPU_UINT_FAMILY, PAPI_DEV_ATTR__CPU_UINT_MODEL, PAPI_DEV_ATTR__CPU_UINT_STEPPING, PAPI_DEV_ATTR__CPU_UINT_NUMA_MEM_SIZE, PAPI_DEV_ATTR__CPU_UINT_THR_NUMA_AFFINITY, PAPI_DEV_ATTR__CPU_UINT_THR_PER_NUMA, PAPI_DEV_ATTR__CUDA_ULONG_UID, PAPI_DEV_ATTR__CUDA_CHAR_DEVICE_NAME, PAPI_DEV_ATTR__CUDA_UINT_WARP_SIZE, PAPI_DEV_ATTR__CUDA_UINT_SHM_PER_BLK, PAPI_DEV_ATTR__CUDA_UINT_SHM_PER_SM, PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_X, PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_Y, PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_Z, PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_X, 
PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_Y, PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_Z, PAPI_DEV_ATTR__CUDA_UINT_THR_PER_BLK, PAPI_DEV_ATTR__CUDA_UINT_SM_COUNT, PAPI_DEV_ATTR__CUDA_UINT_MULTI_KERNEL, PAPI_DEV_ATTR__CUDA_UINT_MAP_HOST_MEM, PAPI_DEV_ATTR__CUDA_UINT_MEMCPY_OVERLAP, PAPI_DEV_ATTR__CUDA_UINT_UNIFIED_ADDR, PAPI_DEV_ATTR__CUDA_UINT_MANAGED_MEM, PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MAJOR, PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MINOR, PAPI_DEV_ATTR__CUDA_UINT_BLK_PER_SM, PAPI_DEV_ATTR__ROCM_ULONG_UID, PAPI_DEV_ATTR__ROCM_CHAR_DEVICE_NAME, PAPI_DEV_ATTR__ROCM_UINT_WAVEFRONT_SIZE, PAPI_DEV_ATTR__ROCM_UINT_WORKGROUP_SIZE, PAPI_DEV_ATTR__ROCM_UINT_WAVE_PER_CU, PAPI_DEV_ATTR__ROCM_UINT_SHM_PER_WG, PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_X, PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_Y, PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_Z, PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_X, PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_Y, PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_Z, PAPI_DEV_ATTR__ROCM_UINT_CU_COUNT, PAPI_DEV_ATTR__ROCM_UINT_SIMD_PER_CU, PAPI_DEV_ATTR__ROCM_UINT_COMP_CAP_MAJOR, PAPI_DEV_ATTR__ROCM_UINT_COMP_CAP_MINOR, } PAPI_dev_attr_e; void *handle; int id; int count; int enum_modifier = PAPI_DEV_TYPE_ENUM__CPU | PAPI_DEV_TYPE_ENUM__CUDA; while (PAPI_OK == PAPI_enum_dev_type(enum_modifier, &handle)) { PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_PAPI_ID, &id); PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_COUNT, &count); if (PAPI_DEV_TYPE_ID__CUDA == id) { for (int i = 0; i < count; ++i) { unsigned int warp_size; unsigned int cc_major, cc_minor; PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_WARP_SIZE, &warp_size); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MAJOR, &cc_major); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MINOR, &cc_minor); ... } } } * \endcode * PAPI_get_dev_type_attr() allows the user to query all device type attributes. 
* It takes a device type handle, returned by PAPI_enum_dev_type, the device sequential id * and an attribute to be queried for the device and returns the attribute value. * * \bug none known * \see PAPI_enum_dev_type * \see PAPI_get_dev_attr */ int PAPI_get_dev_attr(void *handle, int id, PAPI_dev_attr_e attr, void *val) { return _papi_hwi_get_dev_attr(handle, id, attr, val); } papi-papi-7-2-0-t/src/papi.h000066400000000000000000001754361502707512200155510ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file papi.h * * @author Philip Mucci * mucci@cs.utk.edu * @author dan terpstra * terpstra@cs.utk.edu * @author Haihang You * you@cs.utk.edu * @author Kevin London * london@cs.utk.edu * @author Maynard Johnson * maynardj@us.ibm.com * * @brief Return codes and api definitions. */ #ifndef _PAPI #define _PAPI #pragma GCC visibility push(default) /** * @mainpage PAPI * * @section papi_intro Introduction * The PAPI Performance Application Programming Interface provides machine and * operating system independent access to hardware performance counters found * on most modern processors. * Any of over 100 preset events can be counted through either a simple high * level programming interface or a more complete low level interface from * either C or Fortran. * A list of the function calls in these interfaces is given below, * with references to other pages for more complete details. * * @section papi_high_api High Level Functions * A simple interface for instrumenting end-user applications. * Fully supported on both C and Fortran. * See individual functions for details on usage. * * @ref high_api * * Note that the high-level interface is self-initializing. * You can mix high and low level calls, but you @b must call either * @ref PAPI_library_init() or a high level routine before calling a low level routine. 
* * @section papi_low_api Low Level Functions * Advanced interface for all applications and performance tools. * Some functions may be implemented only for C or Fortran. * See individual functions for details on usage and support. * * @ref low_api * * @section papi_Fortran Fortran API * The Fortran interface has some unique features and entry points. * See individual functions for details. * * @ref PAPIF * * @ref PAPIF-HL * * @section Components * * Components provide access to hardware information on specific subsystems. * * Components can be found under the components directory or @ref papi_components "here" * and included in a build as an argument to configure,\n * '--with-components=< comma_separated_list_of_components_to_build >'. * * @section papi_util PAPI Utility Commands *
    *
  • @ref papi_avail - provides availability and detail information for PAPI preset events *
  • @ref papi_clockres - measures and reports clock latency and resolution for PAPI timers *
  • @ref papi_cost - measures the executable cost of basic PAPI operations *
  • @ref papi_command_line - executes PAPI preset or native events from the command line *
  • @ref papi_decode - decodes PAPI preset events into a csv format suitable for * PAPI_encode_events *
  • @ref papi_event_chooser - given a list of named events, lists other events * that can be counted with them *
  • @ref papi_mem_info - provides information on the memory architecture of the current processor *
  • @ref papi_native_avail - provides detailed information for PAPI native events *
* @see The PAPI Website http://icl.cs.utk.edu/papi */ /** \htmlonly * @page CDI PAPI Component Development Interface * @par \em Introduction * PAPI-C consists of a Framework and between 1 and 16 Components. * The Framework is platform independent and exposes the PAPI API to end users. * The Components provide access to hardware information on specific subsystems. * By convention, Component 0 is always a CPU Component. * This allows default behavior for legacy code, and provides a universal * place to define system-wide operations and parameters, * like clock rates and interrupt structures. * Currently only a single CPU Component can exist at a time. * * @par No CPU * In certain cases it can be desirable to use a generic CPU component for * testing instrumentation or for operation on systems that don't provide * the proper patches for accessing cpu counters. * For such a case, the configure option: * @code * configure --with-no-cpu-counters = yes * @endcode * is provided to build PAPI with an "empty" cpu component. * * @par Exposed Interface * A Component for PAPI-C typically consists of a single header file and a * single (or small number of) source file(s). * All of the information for a Component needed by PAPI-C is exposed through * a single data structure that is declared and initialized at the bottom * of the main source file. * This structure, @ref papi_vector_t , is defined in @ref papi_vector.h . * * @par Compiling With an Existing Component * Components provided with the PAPI source distribution all appear in the * src/components directory. * Each component exists in its own directory, named the same as the component itself. 
* To include a component in a PAPI build, use the configure command line as shown: * * @code * configure --with-components="component list" * @endcode * * Replace the "component list" argument with either the name of a specific * component directory or multiple component names separated by spaces and * enclosed in quotes as shown below: * * \c configure --with-components="acpi lustre infiniband" * * In some cases components themselves require additional configuration. * In these cases an error message will be produced when you run @code make @endcode . * To fix this, run the configure script found in the component directory. * * @par Adding a New Component * The mechanics of adding a new component to the PAPI 4.1 build are relatively straight-forward. * Add a directory to the papi/src/components directory that is named with * the base name of the component. * This directory will contain the source files and build files for the new component. * If configuration of the component is necessary, * additional configure and make files will be needed. * The /example directory can be cloned and renamed as a starting point. * Other components can be used as examples. * This is described in more detail in /components/README. * * @par Developing a New Component * A PAPI-C component generally consists of a header file and one or a * small number of source files. * The source file must contain a @ref papi_vector_t structure that * exposes the internal data and entry points of the component to the PAPI-C Framework. * This structure must have a unique name that is exposed externally and * contains the name of the directory containing the component source code. 
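 *
 * As a purely illustrative sketch (the component name and field values here
 * are placeholders, not taken from a real component), such a structure might
 * look like:
 * @code
 * papi_vector_t _mycomp_vector = {
 *    .cmp_info = {
 *       .name = "mycomp",
 *       .description = "Example component",
 *    },
 *    // entry points such as .init_component, .start, .stop, .read;
 *    // entries left unset are initialized to NULL
 * };
 * @endcode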
* * Three types of information are exposed in the @ref papi_vector_t structure: * Configuration parameters are contained in the @ref PAPI_component_info_t structure; * Sizes of opaque data structures necessary for memory management are in the @ref cmp_struct_sizes_t structure; * An array of function entry points which, if implemented, provide access to the functionality of the component. * * If a function is not implemented in a given component its value in the structure can be left unset. * In this case it will be initialized to NULL, and result (generally) in benign, although unproductive, behavior. * * During the development of a component, functions can be implemented and tested in blocks. * Further information about an appropriate order for developing these functions * can be found in the Component Development Cookbook . * * @par PAPI-C Open Research Issues: *
    *
  • Support for non-standard data types: * Currently PAPI supports returned data values expressed as unsigned 64-bit integers. * This is appropriate for counting events, but may not be as appropriate * for expressing other values. * Examples of some other possible data types are shown below. * Data type might be expressed as a flag in the event definition. *
  • Signed Integer *
      *
    • Float: 64-bit IEEE double precision *
    • Fixed Point: 32-bit integer and 32-bit fraction *
    • Ratios: 32 bit numerator and 32 bit denominator *
    *
  • Synchronization: * Components might report values with widely different time scales and * remote measurements may be significantly skewed in time from local measurements. * It would be desirable to have a mechanism to synchronize these values in time. *
  • Dynamic Component Discovery: * Components currently must be included statically in the PAPI library build. * This minimizes startup disruption and time lag, particularly for large parallel systems. * In some instances it would also be desirable to support a run-time * discovery process for components, possibly by searching a specific * location for dynamic libraries. *
  • Component Repository: * A small collection of components are currently maintained and * supported inside the PAPI source distribution. * It would be desirable to create a public component repository where 3rd * parties could submit components for the use and benefit of the larger community. *
  • Multiple CPU Components: * With the rise in popularity of heterogeneous computing systems, it may * become desirable to have more than one CPU component. * Issues must then be resolved relating to which cpu time-base is used, * how are interrupts handled, etc. *
* \endhtmlonly */ /* Definition of PAPI_VERSION format. Note that each of the four * components _must_ be less than 256. Also, the PAPI_VER_CURRENT * masks out the revision and increment. Any revision change is supposed * to be binary compatible between the user application code and the * run-time library. Any modification that breaks this compatibility * _should_ modify the minor version number as to force user applications * to re-compile. */ #define PAPI_VERSION_NUMBER(maj,min,rev,inc) (((maj)<<24) | ((min)<<16) | ((rev)<<8) | (inc)) #define PAPI_VERSION_MAJOR(x) (((x)>>24) & 0xff) #define PAPI_VERSION_MINOR(x) (((x)>>16) & 0xff) #define PAPI_VERSION_REVISION(x) (((x)>>8) & 0xff) #define PAPI_VERSION_INCREMENT(x)((x) & 0xff) /* This is the official PAPI version */ /* The final digit represents the patch count */ #define PAPI_VERSION PAPI_VERSION_NUMBER(7,2,0,0) #define PAPI_VER_CURRENT (PAPI_VERSION & 0xffff0000) /* Tests for checking event code type */ #define IS_NATIVE( EventCode ) ( ( EventCode & PAPI_NATIVE_MASK ) && !(EventCode & PAPI_PRESET_MASK) ) #define IS_PRESET( EventCode ) ( ( EventCode & PAPI_PRESET_MASK ) && !(EventCode & PAPI_NATIVE_MASK) ) #define IS_USER_DEFINED( EventCode ) ( ( EventCode & PAPI_PRESET_MASK ) && (EventCode & PAPI_NATIVE_MASK) ) #ifdef __cplusplus extern "C" { #endif /* Include files */ #include <sys/types.h> #include <limits.h> #include "papiStdEventDefs.h" /** \internal @defgroup ret_codes Return Codes Return Codes All of the functions contained in the PerfAPI return standardized error codes. Values greater than or equal to zero indicate success, less than zero indicates failure.
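
An illustrative check (not part of the original text), translating a return
code into a message with PAPI_strerror:
@code
int ret = PAPI_query_event( PAPI_TOT_CYC );
if ( ret != PAPI_OK )
   fprintf( stderr, "PAPI error %d: %s\n", ret, PAPI_strerror( ret ) );
@endcode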
@{ */ #define PAPI_OK 0 /**< No error */ #define PAPI_EINVAL -1 /**< Invalid argument */ #define PAPI_ENOMEM -2 /**< Insufficient memory */ #define PAPI_ESYS -3 /**< A System/C library call failed */ #define PAPI_ECMP -4 /**< Not supported by component */ #define PAPI_ESBSTR -4 /**< Backwards compatibility */ #define PAPI_ECLOST -5 /**< Access to the counters was lost or interrupted */ #define PAPI_EBUG -6 /**< Internal error, please send mail to the developers */ #define PAPI_ENOEVNT -7 /**< Event does not exist */ #define PAPI_ECNFLCT -8 /**< Event exists, but cannot be counted due to counter resource limitations */ #define PAPI_ENOTRUN -9 /**< EventSet is currently not running */ #define PAPI_EISRUN -10 /**< EventSet is currently counting */ #define PAPI_ENOEVST -11 /**< No such EventSet Available */ #define PAPI_ENOTPRESET -12 /**< Event in argument is not a valid preset */ #define PAPI_ENOCNTR -13 /**< Hardware does not support performance counters */ #define PAPI_EMISC -14 /**< Unknown error code */ #define PAPI_EPERM -15 /**< Permission level does not permit operation */ #define PAPI_ENOINIT -16 /**< PAPI hasn't been initialized yet */ #define PAPI_ENOCMP -17 /**< Component Index isn't set */ #define PAPI_ENOSUPP -18 /**< Not supported */ #define PAPI_ENOIMPL -19 /**< Not implemented */ #define PAPI_EBUF -20 /**< Buffer size exceeded */ #define PAPI_EINVAL_DOM -21 /**< EventSet domain is not supported for the operation */ #define PAPI_EATTR -22 /**< Invalid or missing event attributes */ #define PAPI_ECOUNT -23 /**< Too many events or attributes */ #define PAPI_ECOMBO -24 /**< Bad combination of features */ #define PAPI_ECMP_DISABLED -25 /**< Component containing event is disabled */ #define PAPI_EDELAY_INIT -26 /**< Delayed initialization component */ #define PAPI_EMULPASS -27 /**< Event exists, but cannot be counted due to multiple passes required by hardware */ #define PAPI_PARTIAL -28 /**< Component is partially disabled */ #define PAPI_NUM_ERRORS 29 
/**< Number of error messages specified in this API */

#define PAPI_NOT_INITED          0
#define PAPI_LOW_LEVEL_INITED    1   /* Low level has called library init */
#define PAPI_HIGH_LEVEL_INITED   2   /* High level has called library init */
#define PAPI_THREAD_LEVEL_INITED 4   /* Threads have been inited */
/** @} */

/** @internal
  @defgroup consts Constants
  All of the functions in the PerfAPI should use the following set of constants.
  @{ */

#define PAPI_NULL -1 /**< A nonexistent hardware event used as a placeholder */

/* Earlier versions of PAPI define a special long_long type to mask an
   incompatibility between the Windows compiler and gcc-style compilers.
   That problem no longer exists, so long_long has been purged from the
   source. The defines below preserve backward compatibility. Their use is
   deprecated, but will continue to be supported in the near term. */
#define long_long long long
#define u_long_long unsigned long long

/** @defgroup papi_data_structures PAPI Data Structures */

typedef unsigned long PAPI_thread_id_t;

/** @ingroup papi_data_structures */
typedef struct _papi_all_thr_spec {
   int num;
   PAPI_thread_id_t *id;
   void **data;
} PAPI_all_thr_spec_t;

typedef void (*PAPI_overflow_handler_t) (int EventSet, void *address,
                long long overflow_vector, void *context);

typedef void *vptr_t;

/** @ingroup papi_data_structures */
typedef struct _papi_sprofil {
   void *pr_base;      /**< buffer base */
   unsigned pr_size;   /**< buffer size */
   vptr_t pr_off;      /**< pc start address (offset) */
   unsigned pr_scale;  /**< pc scaling factor:
                            fixed point fraction
                            0xffff ~= 1, 0x8000 == .5, 0x4000 == .25, etc.
also, two extensions 0x1000 == 1, 0x2000 == 2 */ } PAPI_sprofil_t; /** @ingroup papi_data_structures */ typedef struct _papi_itimer_option { int itimer_num; int itimer_sig; int ns; int flags; } PAPI_itimer_option_t; /** @ingroup papi_data_structures */ typedef struct _papi_inherit_option { int eventset; int inherit; } PAPI_inherit_option_t; /** @ingroup papi_data_structures */ typedef struct _papi_domain_option { int def_cidx; /**< this structure requires a component index to set default domains */ int eventset; int domain; } PAPI_domain_option_t; /** @ingroup papi_data_structures*/ typedef struct _papi_granularity_option { int def_cidx; /**< this structure requires a component index to set default granularity */ int eventset; int granularity; } PAPI_granularity_option_t; /** @ingroup papi_data_structures */ typedef struct _papi_preload_option { char lib_preload_env[PAPI_MAX_STR_LEN]; char lib_preload_sep; char lib_dir_env[PAPI_MAX_STR_LEN]; char lib_dir_sep; } PAPI_preload_info_t; /** @ingroup papi_data_structures */ typedef struct _papi_component_option { char name[PAPI_MAX_STR_LEN]; /**< Name of the component we're using */ char short_name[PAPI_MIN_STR_LEN]; /**< Short name of component, to be prepended to event names */ char description[PAPI_MAX_STR_LEN]; /**< Description of the component */ char version[PAPI_MIN_STR_LEN]; /**< Version of this component */ char support_version[PAPI_MIN_STR_LEN]; /**< Version of the support library */ char kernel_version[PAPI_MIN_STR_LEN]; /**< Version of the kernel PMC support driver */ char disabled_reason[PAPI_HUGE_STR_LEN]; /**< Reason for failure of initialization */ int disabled; /**< 0 if enabled, otherwise error code from initialization */ char partially_disabled_reason[PAPI_HUGE_STR_LEN]; /**< Reason for partial initialization */ int partially_disabled; /**< 1 if component is partially disabled, 0 otherwise */ int initialized; /**< Component is ready to use */ int CmpIdx; /**< Index into the vector array for this 
component; set at init time */ int num_cntrs; /**< Number of hardware counters the component supports */ int num_mpx_cntrs; /**< Number of hardware counters the component or PAPI can multiplex supports */ int num_preset_events; /**< Number of preset events the component supports */ int num_native_events; /**< Number of native events the component supports */ int default_domain; /**< The default domain when this component is used */ int available_domains; /**< Available domains */ int default_granularity; /**< The default granularity when this component is used */ int available_granularities; /**< Available granularities */ int hardware_intr_sig; /**< Signal used by hardware to deliver PMC events */ int component_type; /**< Type of component */ char *pmu_names[PAPI_PMU_MAX]; /**< list of pmu names supported by this component */ int reserved[8]; /* */ unsigned int hardware_intr:1; /**< hw overflow intr, does not need to be emulated in software*/ unsigned int precise_intr:1; /**< Performance interrupts happen precisely */ unsigned int posix1b_timers:1; /**< Using POSIX 1b interval timers (timer_create) instead of setitimer */ unsigned int kernel_profile:1; /**< Has kernel profiling support (buffered interrupts or sprofil-like) */ unsigned int kernel_multiplex:1; /**< In kernel multiplexing */ unsigned int fast_counter_read:1; /**< Supports a user level PMC read instruction */ unsigned int fast_real_timer:1; /**< Supports a fast real timer */ unsigned int fast_virtual_timer:1; /**< Supports a fast virtual timer */ unsigned int attach:1; /**< Supports attach */ unsigned int attach_must_ptrace:1; /**< Attach must first ptrace and stop the thread/process*/ unsigned int cntr_umasks:1; /**< counters have unit masks */ /* This should be a granularity option */ unsigned int cpu:1; /**< Supports specifying cpu number to use with event set */ unsigned int inherit:1; /**< Supports child processes inheriting parents counters */ unsigned int reserved_bits:19; } 
PAPI_component_info_t; /** @ingroup papi_data_structures*/ typedef struct _papi_mpx_info { int timer_sig; /**< Signal number used by the multiplex timer, 0 if not: PAPI_SIGNAL */ int timer_num; /**< Number of the itimer or POSIX 1 timer used by the multiplex timer: PAPI_ITIMER */ int timer_us; /**< uS between switching of sets: PAPI_MPX_DEF_US */ } PAPI_mpx_info_t; typedef int (*PAPI_debug_handler_t) (int code); /** @ingroup papi_data_structures */ typedef struct _papi_debug_option { int level; PAPI_debug_handler_t handler; } PAPI_debug_option_t; /** @ingroup papi_data_structures @brief get the executable's address space info */ typedef struct _papi_address_map { char name[PAPI_HUGE_STR_LEN]; vptr_t text_start; /**< Start address of program text segment */ vptr_t text_end; /**< End address of program text segment */ vptr_t data_start; /**< Start address of program data segment */ vptr_t data_end; /**< End address of program data segment */ vptr_t bss_start; /**< Start address of program bss segment */ vptr_t bss_end; /**< End address of program bss segment */ } PAPI_address_map_t; /** @ingroup papi_data_structures @brief get the executable's info */ typedef struct _papi_program_info { char fullname[PAPI_HUGE_STR_LEN]; /**< path + name */ PAPI_address_map_t address_info; /**< executable's address space info */ } PAPI_exe_info_t; /** @ingroup papi_data_structures */ typedef struct _papi_shared_lib_info { PAPI_address_map_t *map; int count; } PAPI_shlib_info_t; /** Specify the file containing user defined events. 
*/ typedef char* PAPI_user_defined_events_file_t; /* The following defines and next for structures define the memory hierarchy */ /* All sizes are in BYTES */ /* Associativity: 0: Undefined; 1: Direct Mapped SHRT_MAX: Full Other values == associativity */ #define PAPI_MH_TYPE_EMPTY 0x0 #define PAPI_MH_TYPE_INST 0x1 #define PAPI_MH_TYPE_DATA 0x2 #define PAPI_MH_TYPE_VECTOR 0x4 #define PAPI_MH_TYPE_TRACE 0x8 #define PAPI_MH_TYPE_UNIFIED (PAPI_MH_TYPE_INST|PAPI_MH_TYPE_DATA) #define PAPI_MH_CACHE_TYPE(a) (a & 0xf) #define PAPI_MH_TYPE_WT 0x00 /* write-through cache */ #define PAPI_MH_TYPE_WB 0x10 /* write-back cache */ #define PAPI_MH_CACHE_WRITE_POLICY(a) (a & 0xf0) #define PAPI_MH_TYPE_UNKNOWN 0x000 #define PAPI_MH_TYPE_LRU 0x100 #define PAPI_MH_TYPE_PSEUDO_LRU 0x200 #define PAPI_MH_TYPE_FIFO 0x400 #define PAPI_MH_CACHE_REPLACEMENT_POLICY(a) (a & 0xf00) #define PAPI_MH_TYPE_TLB 0x1000 /* tlb, not memory cache */ #define PAPI_MH_TYPE_PREF 0x2000 /* prefetch buffer */ #define PAPI_MH_TYPE_RD_ALLOC 0x10000 /* read-allocation cache */ #define PAPI_MH_TYPE_WR_ALLOC 0x20000 /* write-allocation cache */ #define PAPI_MH_TYPE_RW_ALLOC 0x40000 /* read-write-allocation cache */ #define PAPI_MH_CACHE_ALLOCATION_POLICY(a) (a & 0xf0000) #define PAPI_MH_MAX_LEVELS 6 /* # descriptors for each TLB or cache level */ #define PAPI_MAX_MEM_HIERARCHY_LEVELS 4 /** @ingroup papi_data_structures */ typedef struct _papi_mh_tlb_info { int type; /**< Empty, instr, data, vector, unified */ int num_entries; int page_size; int associativity; } PAPI_mh_tlb_info_t; /** @ingroup papi_data_structures */ typedef struct _papi_mh_cache_info { int type; /**< Empty, instr, data, vector, trace, unified */ int size; int line_size; int num_lines; int associativity; } PAPI_mh_cache_info_t; /** @ingroup papi_data_structures */ typedef struct _papi_mh_level_info { PAPI_mh_tlb_info_t tlb[PAPI_MH_MAX_LEVELS]; PAPI_mh_cache_info_t cache[PAPI_MH_MAX_LEVELS]; } PAPI_mh_level_t; /** @ingroup papi_data_structures * 
@brief mh for mem hierarchy maybe? */ typedef struct _papi_mh_info { int levels; PAPI_mh_level_t level[PAPI_MAX_MEM_HIERARCHY_LEVELS]; } PAPI_mh_info_t; /** @ingroup papi_data_structures * @brief Hardware info structure */ typedef struct _papi_hw_info { int ncpu; /**< Number of CPUs per NUMA Node */ int threads; /**< Number of hdw threads per core */ int cores; /**< Number of cores per socket */ int sockets; /**< Number of sockets */ int nnodes; /**< Total Number of NUMA Nodes */ int totalcpus; /**< Total number of CPUs in the entire system */ int vendor; /**< Vendor number of CPU */ char vendor_string[PAPI_MAX_STR_LEN]; /**< Vendor string of CPU */ int model; /**< Model number of CPU */ char model_string[PAPI_MAX_STR_LEN]; /**< Model string of CPU */ float revision; /**< Revision of CPU */ int cpuid_family; /**< cpuid family */ int cpuid_model; /**< cpuid model */ int cpuid_stepping; /**< cpuid stepping */ int cpu_max_mhz; /**< Maximum supported CPU speed */ int cpu_min_mhz; /**< Minimum supported CPU speed */ PAPI_mh_info_t mem_hierarchy; /**< PAPI memory hierarchy description */ int virtualized; /**< Running in virtual machine */ char virtual_vendor_string[PAPI_MAX_STR_LEN]; /**< Vendor for virtual machine */ char virtual_vendor_version[PAPI_MAX_STR_LEN]; /**< Version of virtual machine */ /* Legacy Values, do not use */ float mhz; /**< Deprecated */ int clock_mhz; /**< Deprecated */ /* For future expansion */ int reserved[8]; } PAPI_hw_info_t; /** @ingroup papi_data_structures */ typedef struct _papi_attach_option { int eventset; unsigned long tid; } PAPI_attach_option_t; /** @ingroup papi_data_structures*/ typedef struct _papi_cpu_option { int eventset; unsigned int cpu_num; } PAPI_cpu_option_t; /** @ingroup papi_data_structures */ typedef struct _papi_multiplex_option { int eventset; int ns; int flags; } PAPI_multiplex_option_t; /** @ingroup papi_data_structures * @brief address range specification for range restricted counting if both are zero, range is 
disabled */ typedef struct _papi_addr_range_option { int eventset; /**< eventset to restrict */ vptr_t start; /**< user requested start address of an address range */ vptr_t end; /**< user requested end address of an address range */ int start_off; /**< hardware specified offset from start address */ int end_off; /**< hardware specified offset from end address */ } PAPI_addr_range_option_t; /** @ingroup papi_data_structures * @union PAPI_option_t * @brief A pointer to the following is passed to PAPI_set/get_opt() */ typedef union { PAPI_preload_info_t preload; PAPI_debug_option_t debug; PAPI_inherit_option_t inherit; PAPI_granularity_option_t granularity; PAPI_granularity_option_t defgranularity; PAPI_domain_option_t domain; PAPI_domain_option_t defdomain; PAPI_attach_option_t attach; PAPI_cpu_option_t cpu; PAPI_multiplex_option_t multiplex; PAPI_itimer_option_t itimer; PAPI_hw_info_t *hw_info; PAPI_shlib_info_t *shlib_info; PAPI_exe_info_t *exe_info; PAPI_component_info_t *cmp_info; PAPI_addr_range_option_t addr; PAPI_user_defined_events_file_t events_file; } PAPI_option_t; /** @ingroup papi_data_structures * @brief A pointer to the following is passed to PAPI_get_dmem_info() */ typedef struct _dmem_t { long long peak; long long size; long long resident; long long high_water_mark; long long shared; long long text; long long library; long long heap; long long locked; long long stack; long long pagesize; long long pte; } PAPI_dmem_info_t; /* Fortran offsets into PAPI_dmem_info_t structure. 
*/
#define PAPIF_DMEM_VMPEAK     1
#define PAPIF_DMEM_VMSIZE     2
#define PAPIF_DMEM_RESIDENT   3
#define PAPIF_DMEM_HIGH_WATER 4
#define PAPIF_DMEM_SHARED     5
#define PAPIF_DMEM_TEXT       6
#define PAPIF_DMEM_LIBRARY    7
#define PAPIF_DMEM_HEAP       8
#define PAPIF_DMEM_LOCKED     9
#define PAPIF_DMEM_STACK      10
#define PAPIF_DMEM_PAGESIZE   11
#define PAPIF_DMEM_PTE        12
#define PAPIF_DMEM_MAXVAL     12

#define PAPI_MAX_INFO_TERMS 12 /* should match PAPI_EVENTS_IN_DERIVED_EVENT defined in papi_internal.h */
#define PAPI_MAX_COMP_QUALS 8

/** @ingroup papi_data_structures
 * @brief This structure is the event information that is exposed to the user through the API.
 *
 * The same structure is used to describe both preset and native events.
 * WARNING: This structure is very large. With current definitions, it is
 * about 2660 bytes. Unlike previous versions of PAPI, which allocated an
 * array of these structures within the library, this structure is carved
 * from user space. It does not exist inside the library, and only one copy
 * need ever exist.
 * The basic philosophy is this:
 * - each preset consists of a code, some descriptors, and an array of native events;
 * - each native event consists of a code, and an array of register values;
 * - fields are shared between preset and native events, and unused where not applicable;
 * - to completely describe a preset event, the code must present all
 *   available information for that preset, and then walk the list of native
 *   events, retrieving and presenting information for each native event in turn.
 * The various fields and their usage are discussed below.
 */

/** Enum values for event_info location field */
enum {
   PAPI_LOCATION_CORE = 0,    /**< Measures local to core */
   PAPI_LOCATION_CPU,         /**< Measures local to CPU (HT?)
*/
   PAPI_LOCATION_PACKAGE,     /**< Measures local to package */
   PAPI_LOCATION_UNCORE,      /**< Measures uncore */
};

/** Enum values for event_info data_type field */
enum {
   PAPI_DATATYPE_INT64 = 0,   /**< Default: Data is a signed 64-bit int */
   PAPI_DATATYPE_UINT64,      /**< Data is an unsigned 64-bit int */
   PAPI_DATATYPE_FP64,        /**< Data is 64-bit floating point */
   PAPI_DATATYPE_BIT64,       /**< Data is 64-bit binary */
};

/** Enum values for event_info value_type field */
enum {
   PAPI_VALUETYPE_RUNNING_SUM = 0, /**< Data is running sum from start */
   PAPI_VALUETYPE_ABSOLUTE,        /**< Data is from last read */
};

/** Enum values for event_info timescope field */
enum {
   PAPI_TIMESCOPE_SINCE_START = 0, /**< Data is cumulative from start */
   PAPI_TIMESCOPE_SINCE_LAST,      /**< Data is from last read */
   PAPI_TIMESCOPE_UNTIL_NEXT,      /**< Data is until next read */
   PAPI_TIMESCOPE_POINT,           /**< Data is an instantaneous value */
};

/** Enum values for event_info update_type field */
enum {
   PAPI_UPDATETYPE_ARBITRARY = 0,  /**< Data is updated at arbitrary times */
   PAPI_UPDATETYPE_PUSH,           /**< Data is pushed */
   PAPI_UPDATETYPE_PULL,           /**< Data is pulled */
   PAPI_UPDATETYPE_FIXEDFREQ,      /**< Data is read periodically */
};

typedef struct event_info {
   unsigned int event_code;             /**< preset (0x8xxxxxxx) or native (0x4xxxxxxx) event code */
   char symbol[PAPI_HUGE_STR_LEN];      /**< name of the event */
   char short_descr[PAPI_MIN_STR_LEN];  /**< a short description suitable for use as a label */
   char long_descr[PAPI_HUGE_STR_LEN];  /**< a longer description: typically a sentence for presets,
                                             possibly a paragraph from vendor docs for native events */
   int component_index;                 /**< component this event belongs to */
   char units[PAPI_MIN_STR_LEN];        /**< units event is measured in */
   int location;                        /**< location event applies to */
   int data_type;                       /**< data type returned by PAPI */
   int value_type;                      /**< sum or absolute */
   int timescope;                       /**< from start, etc.
*/
   int update_type;                     /**< how event is updated */
   int update_freq;                     /**< how frequently event is updated */

   /* PRESET SPECIFIC FIELDS FOLLOW */
   unsigned int count;                  /**< number of terms (usually 1) in the code and name fields
                                             - presets: these are native events
                                             - native: these are unused */
   unsigned int event_type;             /**< event type or category for preset events only */
   char derived[PAPI_MIN_STR_LEN];      /**< name of the derived type
                                             - presets: usually NOT_DERIVED
                                             - native: empty string */
   char postfix[PAPI_2MAX_STR_LEN];     /**< string containing postfix operations; only defined for
                                             preset events of derived type DERIVED_POSTFIX */
   unsigned int code[PAPI_MAX_INFO_TERMS]; /**< array of values that further describe the event:
                                             - presets: native event_code values
                                             - native: register values(?) */
   char name[PAPI_MAX_INFO_TERMS]       /**< names of code terms: */
        [PAPI_2MAX_STR_LEN];            /**< - presets: native event names,
                                             - native: descriptive strings for each register value(?) */
   char note[PAPI_HUGE_STR_LEN];        /**< an optional developer note supplied with a preset event
                                             to delineate platform specific anomalies or restrictions */
   int num_quals;                       /**< number of qualifiers */
   char quals[PAPI_MAX_COMP_QUALS][PAPI_HUGE_STR_LEN];        /**< qualifiers */
   char quals_descrs[PAPI_MAX_COMP_QUALS][PAPI_HUGE_STR_LEN]; /**< qualifier descriptions */
} PAPI_event_info_t;

/** @ingroup papi_data_structures
 * PAPI_dev_type_id_e - enum device types
 *
 * Device types are defined, in most cases, by the device runtime used
 * to access their attributes. For devices that expose their attributes
 * through the OS interfaces only the device name is used (e.g., CPU).
 */
typedef enum {
   PAPI_DEV_TYPE_ID__CPU,
   PAPI_DEV_TYPE_ID__CUDA,
   PAPI_DEV_TYPE_ID__ROCM,
   PAPI_DEV_TYPE_ID__MAX_NUM,
} PAPI_dev_type_id_e;

/** @ingroup papi_data_structures
 * enum of device types.
 *
 * Device types are identified, most of the time, by the runtime used
 * to access them (e.g., CUDA, ROCM, L0, etc.).
For devices that expose
 * their attributes through the operating system interfaces, the
 * identification is the device name (e.g., CPU).
 */
enum {
   PAPI_DEV_TYPE_ENUM__FIRST = (0),
   PAPI_DEV_TYPE_ENUM__CPU   = (1 << PAPI_DEV_TYPE_ID__CPU ),
   PAPI_DEV_TYPE_ENUM__CUDA  = (1 << PAPI_DEV_TYPE_ID__CUDA),
   PAPI_DEV_TYPE_ENUM__ROCM  = (1 << PAPI_DEV_TYPE_ID__ROCM),
   PAPI_DEV_TYPE_ENUM__ALL   = (1 << PAPI_DEV_TYPE_ID__MAX_NUM) - 1,
};

/** @ingroup papi_data_structures
 * PAPI_dev_type_attr_e - enum device type attributes.
 */
typedef enum {
   PAPI_DEV_TYPE_ATTR__INT_PAPI_ID,
   PAPI_DEV_TYPE_ATTR__INT_VENDOR_ID,
   PAPI_DEV_TYPE_ATTR__CHAR_NAME,
   PAPI_DEV_TYPE_ATTR__INT_COUNT,
   PAPI_DEV_TYPE_ATTR__CHAR_STATUS,
} PAPI_dev_type_attr_e;

/** @ingroup papi_data_structures
 * PAPI_dev_attr_e - enum device attributes
 *
 * As the PAPI_get_dev_attr interface returns a pointer to void as attribute
 * value, the attribute name has the following format:
 *
 * PAPI_DEV_ATTR__<device>_<datatype>_<attribute-name>
 *
 * This identifies, in order, the device for which the attribute is being
 * queried, the type of the attribute returned, and the attribute name.
*/ typedef enum { PAPI_DEV_ATTR__CPU_CHAR_NAME, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_LINE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_LINE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_LINE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_LINE_SIZE, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_LINE_COUNT, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_LINE_COUNT, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_LINE_COUNT, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_LINE_COUNT, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_ASSOC, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_ASSOC, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_ASSOC, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_ASSOC, PAPI_DEV_ATTR__CPU_UINT_SOCKET_COUNT, PAPI_DEV_ATTR__CPU_UINT_NUMA_COUNT, PAPI_DEV_ATTR__CPU_UINT_CORE_COUNT, PAPI_DEV_ATTR__CPU_UINT_THREAD_COUNT, PAPI_DEV_ATTR__CPU_UINT_FAMILY, PAPI_DEV_ATTR__CPU_UINT_MODEL, PAPI_DEV_ATTR__CPU_UINT_STEPPING, PAPI_DEV_ATTR__CPU_UINT_NUMA_MEM_SIZE, PAPI_DEV_ATTR__CPU_UINT_THR_NUMA_AFFINITY, PAPI_DEV_ATTR__CPU_UINT_THR_PER_NUMA, PAPI_DEV_ATTR__CUDA_ULONG_UID, PAPI_DEV_ATTR__CUDA_CHAR_DEVICE_NAME, PAPI_DEV_ATTR__CUDA_UINT_WARP_SIZE, PAPI_DEV_ATTR__CUDA_UINT_SHM_PER_BLK, PAPI_DEV_ATTR__CUDA_UINT_SHM_PER_SM, PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_X, PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_Y, PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_Z, PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_X, PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_Y, PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_Z, PAPI_DEV_ATTR__CUDA_UINT_THR_PER_BLK, PAPI_DEV_ATTR__CUDA_UINT_SM_COUNT, PAPI_DEV_ATTR__CUDA_UINT_MULTI_KERNEL, PAPI_DEV_ATTR__CUDA_UINT_MAP_HOST_MEM, PAPI_DEV_ATTR__CUDA_UINT_MEMCPY_OVERLAP, PAPI_DEV_ATTR__CUDA_UINT_UNIFIED_ADDR, PAPI_DEV_ATTR__CUDA_UINT_MANAGED_MEM, PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MAJOR, PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MINOR, PAPI_DEV_ATTR__CUDA_UINT_BLK_PER_SM, PAPI_DEV_ATTR__ROCM_ULONG_UID, PAPI_DEV_ATTR__ROCM_CHAR_DEVICE_NAME, PAPI_DEV_ATTR__ROCM_UINT_WAVEFRONT_SIZE, 
PAPI_DEV_ATTR__ROCM_UINT_WORKGROUP_SIZE,
   PAPI_DEV_ATTR__ROCM_UINT_WAVE_PER_CU,
   PAPI_DEV_ATTR__ROCM_UINT_SHM_PER_WG,
   PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_X,
   PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_Y,
   PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_Z,
   PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_X,
   PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_Y,
   PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_Z,
   PAPI_DEV_ATTR__ROCM_UINT_CU_COUNT,
   PAPI_DEV_ATTR__ROCM_UINT_SIMD_PER_CU,
   PAPI_DEV_ATTR__ROCM_UINT_COMP_CAP_MAJOR,
   PAPI_DEV_ATTR__ROCM_UINT_COMP_CAP_MINOR,
} PAPI_dev_attr_e;

/** \internal
 * @defgroup low_api The Low Level API
 * @{ */

int PAPI_accum(int EventSet, long long * values); /**< accumulate and reset hardware events from an event set */
int PAPI_add_event(int EventSet, int Event); /**< add single PAPI preset or native hardware event to an event set */
int PAPI_add_named_event(int EventSet, const char *EventName); /**< add an event by name to a PAPI event set */
int PAPI_add_events(int EventSet, int *Events, int number); /**< add array of PAPI preset or native hardware events to an event set */
int PAPI_assign_eventset_component(int EventSet, int cidx); /**< assign a component index to an existing but empty eventset */
int PAPI_attach(int EventSet, unsigned long tid); /**< attach specified event set to a specific process or thread id */
int PAPI_cleanup_eventset(int EventSet); /**< remove all PAPI events from an event set */
int PAPI_create_eventset(int *EventSet); /**< create a new empty PAPI event set */
int PAPI_detach(int EventSet); /**< detach specified event set from a previously specified process or thread id */
int PAPI_destroy_eventset(int *EventSet); /**< deallocate memory associated with an empty PAPI event set */
int PAPI_enum_event(int *EventCode, int modifier); /**< return the event code for the next available preset or native event */
int PAPI_enum_cmp_event(int *EventCode, int modifier, int cidx); /**< return the event code for the next available component event */
int PAPI_event_code_to_name(int EventCode, char *out);
/**< translate an integer PAPI event code into an ASCII PAPI preset or native name */ int PAPI_event_name_to_code(const char *in, int *out); /**< translate an ASCII PAPI preset or native name into an integer PAPI event code */ int PAPI_get_dmem_info(PAPI_dmem_info_t *dest); /**< get dynamic memory usage information */ int PAPI_get_event_info(int EventCode, PAPI_event_info_t * info); /**< get the name and descriptions for a given preset or native event code */ const PAPI_exe_info_t *PAPI_get_executable_info(void); /**< get the executable's address space information */ const PAPI_hw_info_t *PAPI_get_hardware_info(void); /**< get information about the system hardware */ const PAPI_component_info_t *PAPI_get_component_info(int cidx); /**< get information about the component features */ int PAPI_get_multiplex(int EventSet); /**< get the multiplexing status of specified event set */ int PAPI_get_opt(int option, PAPI_option_t * ptr); /**< query the option settings of the PAPI library or a specific event set */ int PAPI_get_cmp_opt(int option, PAPI_option_t * ptr,int cidx); /**< query the component specific option settings of a specific event set */ long long PAPI_get_real_cyc(void); /**< return the total number of cycles since some arbitrary starting point */ long long PAPI_get_real_nsec(void); /**< return the total number of nanoseconds since some arbitrary starting point */ long long PAPI_get_real_usec(void); /**< return the total number of microseconds since some arbitrary starting point */ const PAPI_shlib_info_t *PAPI_get_shared_lib_info(void); /**< get information about the shared libraries used by the process */ int PAPI_get_thr_specific(int tag, void **ptr); /**< return a pointer to a thread specific stored data structure */ int PAPI_get_overflow_event_index(int Eventset, long long overflow_vector, int *array, int *number); /**< # decomposes an overflow_vector into an event index array */ long long PAPI_get_virt_cyc(void); /**< return the process cycles since some 
arbitrary starting point */ long long PAPI_get_virt_nsec(void); /**< return the process nanoseconds since some arbitrary starting point */ long long PAPI_get_virt_usec(void); /**< return the process microseconds since some arbitrary starting point */ int PAPI_is_initialized(void); /**< return the initialized state of the PAPI library */ int PAPI_library_init(int version); /**< initialize the PAPI library */ int PAPI_list_events(int EventSet, int *Events, int *number); /**< list the events that are members of an event set */ int PAPI_list_threads(unsigned long *tids, int *number); /**< list the thread ids currently known to PAPI */ int PAPI_lock(int); /**< lock one of two PAPI internal user mutex variables */ int PAPI_multiplex_init(void); /**< initialize multiplex support in the PAPI library */ int PAPI_num_cmp_hwctrs(int cidx); /**< return the number of hardware counters for a specified component */ int PAPI_num_events(int EventSet); /**< return the number of events in an event set */ int PAPI_overflow(int EventSet, int EventCode, int threshold, int flags, PAPI_overflow_handler_t handler); /**< set up an event set to begin registering overflows */ void PAPI_perror(const char *msg ); /**< Print a PAPI error message */ int PAPI_profil(void *buf, unsigned bufsiz, vptr_t offset, unsigned scale, int EventSet, int EventCode, int threshold, int flags); /**< generate PC histogram data where hardware counter overflow occurs */ int PAPI_query_event(int EventCode); /**< query if a PAPI event exists */ int PAPI_query_named_event(const char *EventName); /**< query if a named PAPI event exists */ int PAPI_read(int EventSet, long long * values); /**< read hardware events from an event set with no reset */ int PAPI_read_ts(int EventSet, long long * values, long long *cyc); /**< read from an eventset with a real-time cycle timestamp */ int PAPI_register_thread(void); /**< inform PAPI of the existence of a new thread */ int PAPI_remove_event(int EventSet, int EventCode); /**< 
remove a hardware event from a PAPI event set */
int PAPI_remove_named_event(int EventSet, const char *EventName); /**< remove a named event from a PAPI event set */
int PAPI_remove_events(int EventSet, int *Events, int number); /**< remove an array of hardware events from a PAPI event set */
int PAPI_reset(int EventSet); /**< reset the hardware event counts in an event set */
int PAPI_set_debug(int level); /**< set the current debug level for PAPI */
int PAPI_set_cmp_domain(int domain, int cidx); /**< set the component specific default execution domain for new event sets */
int PAPI_set_domain(int domain); /**< set the default execution domain for new event sets */
int PAPI_set_cmp_granularity(int granularity, int cidx); /**< set the component specific default granularity for new event sets */
int PAPI_set_granularity(int granularity); /**< set the default granularity for new event sets */

   int size = _papi_hwd[cidx]->size.reg_alloc;

   /* build a queue of indexes to all events that live on one counter
      only (rank == 1) */
   head = 0;   /* points to top of queue */
   tail = 0;   /* points to bottom of queue */
   for ( i = 0; i < count; i++ ) {
      map_q[i] = 0;
      if ( _bpt_map_exclusive( ( hwd_reg_alloc_t * ) & ptr[size * i] ) )
         idx_q[tail++] = i;
   }
   /* scan the single counter queue looking for events that share counters.
      If two events can live only on one counter, return failure.
      If the second event lives on more than one counter, remove the shared
      counter from its selector and reduce its rank.
      Mark first event as mapped to its counter.
   */
   while ( head < tail ) {
      for ( i = 0; i < count; i++ ) {
         if ( i != idx_q[head] ) {
            if ( _bpt_map_shared( ( hwd_reg_alloc_t * ) & ptr[size * i],
                                  ( hwd_reg_alloc_t * ) & ptr[size * idx_q[head]] ) ) {
               /* both share a counter; if second is exclusive, mapping fails */
               if ( _bpt_map_exclusive( ( hwd_reg_alloc_t * ) & ptr[size * i] ) )
                  return 0;
               else {
                  _bpt_map_preempt( ( hwd_reg_alloc_t * ) & ptr[size * i],
                                    ( hwd_reg_alloc_t * ) & ptr[size * idx_q[head]] );
                  if ( _bpt_map_exclusive( ( hwd_reg_alloc_t * ) & ptr[size * i] ) )
                     idx_q[tail++] = i;
               }
            }
         }
      }
      map_q[idx_q[head]] = 1;   /* mark this event as mapped */
      head++;
   }
   if ( tail == count ) {
      return 1;   /* idx_q includes all events; everything is successfully mapped */
   } else {
      char *rest_event_list;
      char *copy_rest_event_list;
      int remainder;

      rest_event_list = papi_calloc( _papi_hwd[cidx]->cmp_info.num_cntrs, size );
      copy_rest_event_list = papi_calloc( _papi_hwd[cidx]->cmp_info.num_cntrs, size );
      if ( !rest_event_list || !copy_rest_event_list ) {
         if ( rest_event_list )
            papi_free( rest_event_list );
         if ( copy_rest_event_list )
            papi_free( copy_rest_event_list );
         return ( 0 );
      }

      /* copy all unmapped events to a second list and make a backup */
      for ( i = 0, j = 0; i < count; i++ ) {
         if ( map_q[i] == 0 ) {
            memcpy( &copy_rest_event_list[size * j++], &ptr[size * i], ( size_t ) size );
         }
      }
      remainder = j;
      memcpy( rest_event_list, copy_rest_event_list,
              ( size_t ) size * ( size_t ) remainder );

      /* try each possible mapping until you fail or find one that works */
      for ( i = 0; i < _papi_hwd[cidx]->cmp_info.num_cntrs; i++ ) {
         /* for the first unmapped event, try every possible counter */
         if ( _bpt_map_avail( ( hwd_reg_alloc_t * ) rest_event_list, i ) ) {
            _bpt_map_set( ( hwd_reg_alloc_t * ) rest_event_list, i );
            /* remove selected counter from all other unmapped events */
            for ( j = 1; j < remainder; j++ ) {
               if ( _bpt_map_shared( ( hwd_reg_alloc_t * ) & rest_event_list[size * j],
                                     ( hwd_reg_alloc_t * ) rest_event_list ) )
                  _bpt_map_preempt( (
hwd_reg_alloc_t * ) & rest_event_list[size * j], ( hwd_reg_alloc_t * ) rest_event_list ); } /* if recursive call to allocation works, break out of the loop */ if ( _papi_bipartite_alloc ( ( hwd_reg_alloc_t * ) rest_event_list, remainder, cidx ) ) break; /* recursive mapping failed; copy the backup list and try the next combination */ memcpy( rest_event_list, copy_rest_event_list, ( size_t ) size * ( size_t ) remainder ); } } if ( i == _papi_hwd[cidx]->cmp_info.num_cntrs ) { papi_free( rest_event_list ); papi_free( copy_rest_event_list ); return 0; /* fail to find mapping */ } for ( i = 0, j = 0; i < count; i++ ) { if ( map_q[i] == 0 ) _bpt_map_update( ( hwd_reg_alloc_t * ) & ptr[size * i], ( hwd_reg_alloc_t * ) & rest_event_list[size * j++] ); } papi_free( rest_event_list ); papi_free( copy_rest_event_list ); return 1; } } papi-papi-7-2-0-t/src/papi_common_strings.h000066400000000000000000000646741502707512200206730ustar00rootroot00000000000000/* These are used both by PAPI and by the genpapifdef utility */ /* They are in their own include to allow genpapifdef to be built */ /* without having to link against libpapi.a */ hwi_presets_t _papi_hwi_presets[PAPI_MAX_PRESET_EVENTS] = { /* 0 */ {"PAPI_L1_DCM", "L1D cache misses", "Level 1 data cache misses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L1, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 1 */ {"PAPI_L1_ICM", "L1I cache misses", "Level 1 instruction cache misses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L1 + PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 2 */ {"PAPI_L2_DCM", "L2D cache misses", "Level 2 data cache misses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L2, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 3 */ {"PAPI_L2_ICM", "L2I cache misses", "Level 2 instruction cache misses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L2 + PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 
0, 0, {NULL}, {NULL}}, /* 4 */ {"PAPI_L3_DCM", "L3D cache misses", "Level 3 data cache misses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 5 */ {"PAPI_L3_ICM", "L3I cache misses", "Level 3 instruction cache misses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3 + PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 6 */ {"PAPI_L1_TCM", "L1 cache misses", "Level 1 cache misses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L1, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 7 */ {"PAPI_L2_TCM", "L2 cache misses", "Level 2 cache misses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L2, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 8 */ {"PAPI_L3_TCM", "L3 cache misses", "Level 3 cache misses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 9 */ {"PAPI_CA_SNP", "Snoop Requests", "Requests for a snoop", 0, 0, PAPI_PRESET_BIT_CACH, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 10 */ {"PAPI_CA_SHR", "Ex Acces shared CL", "Requests for exclusive access to shared cache line", 0, 0, PAPI_PRESET_BIT_CACH, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 11 */ {"PAPI_CA_CLN", "Ex Access clean CL", "Requests for exclusive access to clean cache line", 0, 0, PAPI_PRESET_BIT_CACH, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 12 */ {"PAPI_CA_INV", "Cache ln invalid", "Requests for cache line invalidation", 0, 0, PAPI_PRESET_BIT_CACH, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 13 */ {"PAPI_CA_ITV", "Cache ln intervene", "Requests for cache line intervention", 0, 0, PAPI_PRESET_BIT_CACH, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 14 */ {"PAPI_L3_LDM", "L3 load misses", "Level 3 load misses", 0, 0, 
PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 15 */ {"PAPI_L3_STM", "L3 store misses", "Level 3 store misses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 16 */ {"PAPI_BRU_IDL", "Branch idle cycles", "Cycles branch units are idle", 0, 0, PAPI_PRESET_BIT_IDL + PAPI_PRESET_BIT_BR, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 17 */ {"PAPI_FXU_IDL", "IU idle cycles", "Cycles integer units are idle", 0, 0, PAPI_PRESET_BIT_IDL, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 18 */ {"PAPI_FPU_IDL", "FPU idle cycles", "Cycles floating point units are idle", 0, 0, PAPI_PRESET_BIT_IDL + PAPI_PRESET_BIT_FP, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 19 */ {"PAPI_LSU_IDL", "L/SU idle cycles", "Cycles load/store units are idle", 0, 0, PAPI_PRESET_BIT_IDL + PAPI_PRESET_BIT_MEM, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 20 */ {"PAPI_TLB_DM", "Data TLB misses", "Data translation lookaside buffer misses", 0, 0, PAPI_PRESET_BIT_TLB, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 21 */ {"PAPI_TLB_IM", "Instr TLB misses", "Instruction translation lookaside buffer misses", 0, 0, PAPI_PRESET_BIT_TLB + PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 22 */ {"PAPI_TLB_TL", "Total TLB misses", "Total translation lookaside buffer misses", 0, 0, PAPI_PRESET_BIT_TLB, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 23 */ {"PAPI_L1_LDM", "L1 load misses", "Level 1 load misses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L1, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 24 */ {"PAPI_L1_STM", "L1 store misses", "Level 1 store misses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L1, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, 
NULL, 0, 0, {NULL}, {NULL}}, /* 25 */ {"PAPI_L2_LDM", "L2 load misses", "Level 2 load misses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L2, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 26 */ {"PAPI_L2_STM", "L2 store misses", "Level 2 store misses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L2, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 27 */ {"PAPI_BTAC_M", "Br targt addr miss", "Branch target address cache misses", 0, 0, PAPI_PRESET_BIT_BR, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 28 */ {"PAPI_PRF_DM", "Data prefetch miss", "Data prefetch cache misses", 0, 0, PAPI_PRESET_BIT_CACH, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 29 */ {"PAPI_L3_DCH", "L3D cache hits", "Level 3 data cache hits", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 30 */ {"PAPI_TLB_SD", "TLB shootdowns", "Translation lookaside buffer shootdowns", 0, 0, PAPI_PRESET_BIT_TLB, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 31 */ {"PAPI_CSR_FAL", "Failed store cond", "Failed store conditional instructions", 0, 0, PAPI_PRESET_BIT_CND + PAPI_PRESET_BIT_MEM, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 32 */ {"PAPI_CSR_SUC", "Good store cond", "Successful store conditional instructions", 0, 0, PAPI_PRESET_BIT_CND + PAPI_PRESET_BIT_MEM, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 33 */ {"PAPI_CSR_TOT", "Total store cond", "Total store conditional instructions", 0, 0, PAPI_PRESET_BIT_CND + PAPI_PRESET_BIT_MEM, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 34 */ {"PAPI_MEM_SCY", "Stalled mem cycles", "Cycles Stalled Waiting for memory accesses", 0, 0, PAPI_PRESET_BIT_MEM, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 35 */ {"PAPI_MEM_RCY", "Stalled rd cycles", "Cycles Stalled 
Waiting for memory Reads", 0, 0, PAPI_PRESET_BIT_MEM, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 36 */ {"PAPI_MEM_WCY", "Stalled wr cycles", "Cycles Stalled Waiting for memory writes", 0, 0, PAPI_PRESET_BIT_MEM, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 37 */ {"PAPI_STL_ICY", "No instr issue", "Cycles with no instruction issue", 0, 0, PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 38 */ {"PAPI_FUL_ICY", "Max instr issue", "Cycles with maximum instruction issue", 0, 0, PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 39 */ {"PAPI_STL_CCY", "No instr done", "Cycles with no instructions completed", 0, 0, PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 40 */ {"PAPI_FUL_CCY", "Max instr done", "Cycles with maximum instructions completed", 0, 0, PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 41 */ {"PAPI_HW_INT", "Hdw interrupts", "Hardware interrupts", 0, 0, PAPI_PRESET_BIT_MSC, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 42 */ {"PAPI_BR_UCN", "Uncond branch", "Unconditional branch instructions", 0, 0, PAPI_PRESET_BIT_BR + PAPI_PRESET_BIT_CND, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 43 */ {"PAPI_BR_CN", "Cond branch", "Conditional branch instructions", 0, 0, PAPI_PRESET_BIT_BR + PAPI_PRESET_BIT_CND, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 44 */ {"PAPI_BR_TKN", "Cond branch taken", "Conditional branch instructions taken", 0, 0, PAPI_PRESET_BIT_BR + PAPI_PRESET_BIT_CND, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 45 */ {"PAPI_BR_NTK", "Cond br not taken", "Conditional branch instructions not taken", 0, 0, PAPI_PRESET_BIT_BR + PAPI_PRESET_BIT_CND, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 
46 */ {"PAPI_BR_MSP", "Cond br mspredictd", "Conditional branch instructions mispredicted", 0, 0, PAPI_PRESET_BIT_BR + PAPI_PRESET_BIT_CND, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 47 */ {"PAPI_BR_PRC", "Cond br predicted", "Conditional branch instructions correctly predicted", 0, 0, PAPI_PRESET_BIT_BR + PAPI_PRESET_BIT_CND, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 48 */ {"PAPI_FMA_INS", "FMAs completed", "FMA instructions completed", 0, 0, PAPI_PRESET_BIT_INS + PAPI_PRESET_BIT_FP, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 49 */ {"PAPI_TOT_IIS", "Instr issued", "Instructions issued", 0, 0, PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 50 */ {"PAPI_TOT_INS", "Instr completed", "Instructions completed", 0, 0, PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 51 */ {"PAPI_INT_INS", "Int instructions", "Integer instructions", 0, 0, PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 52 */ {"PAPI_FP_INS", "FP instructions", "Floating point instructions", 0, 0, PAPI_PRESET_BIT_INS + PAPI_PRESET_BIT_FP, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 53 */ {"PAPI_LD_INS", "Loads", "Load instructions", 0, 0, PAPI_PRESET_BIT_MEM, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 54 */ {"PAPI_SR_INS", "Stores", "Store instructions", 0, 0, PAPI_PRESET_BIT_MEM, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 55 */ {"PAPI_BR_INS", "Branches", "Branch instructions", 0, 0, PAPI_PRESET_BIT_BR, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 56 */ {"PAPI_VEC_INS", "Vector/SIMD instr", "Vector/SIMD instructions (could include integer)", 0, 0, PAPI_PRESET_BIT_MSC, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 57 */ {"PAPI_RES_STL", "Stalled res 
cycles", "Cycles stalled on any resource", 0, 0, PAPI_PRESET_BIT_IDL + PAPI_PRESET_BIT_MSC, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 58 */ {"PAPI_FP_STAL", "Stalled FPU cycles", "Cycles the FP unit(s) are stalled", 0, 0, PAPI_PRESET_BIT_IDL + PAPI_PRESET_BIT_FP, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 59 */ {"PAPI_TOT_CYC", "Total cycles", "Total cycles", 0, 0, PAPI_PRESET_BIT_MSC, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 60 */ {"PAPI_LST_INS", "L/S completed", "Load/store instructions completed", 0, 0, PAPI_PRESET_BIT_INS + PAPI_PRESET_BIT_MEM, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 61 */ {"PAPI_SYC_INS", "Syncs completed", "Synchronization instructions completed", 0, 0, PAPI_PRESET_BIT_INS + PAPI_PRESET_BIT_MSC, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 62 */ {"PAPI_L1_DCH", "L1D cache hits", "Level 1 data cache hits", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L1, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 63 */ {"PAPI_L2_DCH", "L2D cache hits", "Level 2 data cache hits", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L2, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 64 */ {"PAPI_L1_DCA", "L1D cache accesses", "Level 1 data cache accesses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L1, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 65 */ {"PAPI_L2_DCA", "L2D cache accesses", "Level 2 data cache accesses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L2, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 66 */ {"PAPI_L3_DCA", "L3D cache accesses", "Level 3 data cache accesses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 67 */ {"PAPI_L1_DCR", "L1D cache reads", "Level 1 data cache reads", 0, 0, PAPI_PRESET_BIT_CACH + 
PAPI_PRESET_BIT_L1, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 68 */ {"PAPI_L2_DCR", "L2D cache reads", "Level 2 data cache reads", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L2, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 69 */ {"PAPI_L3_DCR", "L3D cache reads", "Level 3 data cache reads", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 70 */ {"PAPI_L1_DCW", "L1D cache writes", "Level 1 data cache writes", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L1, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 71 */ {"PAPI_L2_DCW", "L2D cache writes", "Level 2 data cache writes", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L2, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 72 */ {"PAPI_L3_DCW", "L3D cache writes", "Level 3 data cache writes", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 73 */ {"PAPI_L1_ICH", "L1I cache hits", "Level 1 instruction cache hits", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L1 + PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 74 */ {"PAPI_L2_ICH", "L2I cache hits", "Level 2 instruction cache hits", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L2 + PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 75 */ {"PAPI_L3_ICH", "L3I cache hits", "Level 3 instruction cache hits", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3 + PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 76 */ {"PAPI_L1_ICA", "L1I cache accesses", "Level 1 instruction cache accesses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L1 + PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 77 */ {"PAPI_L2_ICA", "L2I cache accesses", "Level 2 instruction cache 
accesses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L2 + PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 78 */ {"PAPI_L3_ICA", "L3I cache accesses", "Level 3 instruction cache accesses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3 + PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 79 */ {"PAPI_L1_ICR", "L1I cache reads", "Level 1 instruction cache reads", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L1 + PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 80 */ {"PAPI_L2_ICR", "L2I cache reads", "Level 2 instruction cache reads", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L2 + PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 81 */ {"PAPI_L3_ICR", "L3I cache reads", "Level 3 instruction cache reads", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3 + PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 82 */ {"PAPI_L1_ICW", "L1I cache writes", "Level 1 instruction cache writes", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L1 + PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 83 */ {"PAPI_L2_ICW", "L2I cache writes", "Level 2 instruction cache writes", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L2 + PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 84 */ {"PAPI_L3_ICW", "L3I cache writes", "Level 3 instruction cache writes", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3 + PAPI_PRESET_BIT_INS, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 85 */ {"PAPI_L1_TCH", "L1 cache hits", "Level 1 total cache hits", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L1, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 86 */ {"PAPI_L2_TCH", "L2 cache hits", "Level 2 total cache hits", 0, 0, PAPI_PRESET_BIT_CACH + 
PAPI_PRESET_BIT_L2, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 87 */ {"PAPI_L3_TCH", "L3 cache hits", "Level 3 total cache hits", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 88 */ {"PAPI_L1_TCA", "L1 cache accesses", "Level 1 total cache accesses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L1, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 89 */ {"PAPI_L2_TCA", "L2 cache accesses", "Level 2 total cache accesses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L2, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 90 */ {"PAPI_L3_TCA", "L3 cache accesses", "Level 3 total cache accesses", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 91 */ {"PAPI_L1_TCR", "L1 cache reads", "Level 1 total cache reads", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L1, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 92 */ {"PAPI_L2_TCR", "L2 cache reads", "Level 2 total cache reads", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L2, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 93 */ {"PAPI_L3_TCR", "L3 cache reads", "Level 3 total cache reads", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 94 */ {"PAPI_L1_TCW", "L1 cache writes", "Level 1 total cache writes", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L1, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 95 */ {"PAPI_L2_TCW", "L2 cache writes", "Level 2 total cache writes", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L2, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 96 */ {"PAPI_L3_TCW", "L3 cache writes", "Level 3 total cache writes", 0, 0, PAPI_PRESET_BIT_CACH + PAPI_PRESET_BIT_L3, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, 
{NULL}, {NULL}}, /* 97 */ {"PAPI_FML_INS", "FPU multiply", "Floating point multiply instructions", 0, 0, PAPI_PRESET_BIT_INS + PAPI_PRESET_BIT_FP, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 98 */ {"PAPI_FAD_INS", "FPU add", "Floating point add instructions", 0, 0, PAPI_PRESET_BIT_INS + PAPI_PRESET_BIT_FP, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 99 */ {"PAPI_FDV_INS", "FPU divide", "Floating point divide instructions", 0, 0, PAPI_PRESET_BIT_INS + PAPI_PRESET_BIT_FP, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*100 */ {"PAPI_FSQ_INS", "FPU square root", "Floating point square root instructions", 0, 0, PAPI_PRESET_BIT_INS + PAPI_PRESET_BIT_FP, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*101 */ {"PAPI_FNV_INS", "FPU inverse", "Floating point inverse instructions", 0, 0, PAPI_PRESET_BIT_INS + PAPI_PRESET_BIT_FP, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*102 */ {"PAPI_FP_OPS", "FP operations", "Floating point operations", 0, 0, PAPI_PRESET_BIT_INS + PAPI_PRESET_BIT_FP, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*103 */ {"PAPI_SP_OPS", "SP operations", "Floating point operations; optimized to count scaled single precision vector operations", 0, 0, PAPI_PRESET_BIT_INS + PAPI_PRESET_BIT_FP, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*104 */ {"PAPI_DP_OPS", "DP operations", "Floating point operations; optimized to count scaled double precision vector operations", 0, 0, PAPI_PRESET_BIT_INS + PAPI_PRESET_BIT_FP, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*105 */ {"PAPI_VEC_SP", "SP Vector/SIMD instr", "Single precision vector/SIMD instructions", 0, 0, PAPI_PRESET_BIT_INS + PAPI_PRESET_BIT_FP, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*106 */ {"PAPI_VEC_DP", "DP Vector/SIMD instr", "Double precision vector/SIMD instructions", 0, 
0, PAPI_PRESET_BIT_INS + PAPI_PRESET_BIT_FP, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /* 107 */ {"PAPI_REF_CYC", "Reference cycles", "Reference clock cycles", 0, 0, PAPI_PRESET_BIT_MSC, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*108 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*109 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*110 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*111 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*112 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*113 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*114 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*115 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*116 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*117 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*118 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*119 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*120 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*121 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*122 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*123 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*124 */ {NULL, 
NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*125 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*126 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, /*127 */ {NULL, NULL, NULL, 0, 0, 0, NULL, {0}, {NULL}, {NULL}, {0}, {NULL}, NULL, 0, 0, {NULL}, {NULL}}, }; #if 0 const hwi_describe_t _papi_hwi_err[PAPI_NUM_ERRORS] = { /* 0 */ {PAPI_OK, "PAPI_OK", "No error"}, /* 1 */ {PAPI_EINVAL, "PAPI_EINVAL", "Invalid argument"}, /* 2 */ {PAPI_ENOMEM, "PAPI_ENOMEM", "Insufficient memory"}, /* 3 */ {PAPI_ESYS, "PAPI_ESYS", "A System/C library call failed"}, /* 4 */ {PAPI_ECMP, "PAPI_ECMP", "Not supported by component"}, /* 5 */ {PAPI_ECLOST, "PAPI_ECLOST", "Access to the counters was lost or interrupted"}, /* 6 */ {PAPI_EBUG, "PAPI_EBUG", "Internal error, please send mail to the developers"}, /* 7 */ {PAPI_ENOEVNT, "PAPI_ENOEVNT", "Event does not exist"}, /* 8 */ {PAPI_ECNFLCT, "PAPI_ECNFLCT", "Event exists, but cannot be counted due to hardware resource limits"}, /* 9 */ {PAPI_ENOTRUN, "PAPI_ENOTRUN", "EventSet is currently not running"}, /*10 */ {PAPI_EISRUN, "PAPI_EISRUN", "EventSet is currently counting"}, /*11 */ {PAPI_ENOEVST, "PAPI_ENOEVST", "No such EventSet available"}, /*12 */ {PAPI_ENOTPRESET, "PAPI_ENOTPRESET", "Event in argument is not a valid preset"}, /*13 */ {PAPI_ENOCNTR, "PAPI_ENOCNTR", "Hardware does not support performance counters"}, /*14 */ {PAPI_EMISC, "PAPI_EMISC", "Unknown error code"}, /*15 */ {PAPI_EPERM, "PAPI_EPERM", "Permission level does not permit operation"}, /*16 */ {PAPI_ENOINIT, "PAPI_ENOINIT", "PAPI hasn't been initialized yet"}, /*17 */ {PAPI_ENOCMP, "PAPI_ENOCMP", "Component Index isn't set"}, /*18 */ {PAPI_ENOSUPP, "PAPI_ENOSUPP", "Not supported"}, /*19 */ {PAPI_ENOIMPL, "PAPI_ENOIMPL", "Not implemented"}, /*20 */ {PAPI_EBUF, "PAPI_EBUF", "Buffer size exceeded"}, /*21 */ {PAPI_EINVAL_DOM, 
"PAPI_EINVAL_DOM", "EventSet domain is not supported for the operation"}, /*22 */ {PAPI_EATTR, "PAPI_EATTR", "Invalid or missing event attributes"}, /*23 */ {PAPI_ECOUNT, "PAPI_ECOUNT", "Too many events or attributes"}, /*24 */ {PAPI_ECOMBO, "PAPI_ECOMBO", "Bad combination of features"}, /*25 */ {PAPI_ECMP_DISABLED, "PAPI_ECMP_DISABLED", "Component containing event is disabled"} }; #endif papi-papi-7-2-0-t/src/papi_debug.h000066400000000000000000000160251502707512200167030ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file papi_debug.h * @author Philip Mucci * mucci@cs.utk.edu * @author Dan Terpstra * terpstra@cs.utk.edu * @author Kevin London * london@cs.utk.edu * @author Haihang You * you@cs.utk.edu */ #ifndef _PAPI_DEBUG_H #define _PAPI_DEBUG_H #ifdef NO_VARARG_MACRO #include <stdarg.h> #endif #include <stdio.h> /* Debug Levels */ #define DEBUG_SUBSTRATE 0x002 #define DEBUG_API 0x004 #define DEBUG_INTERNAL 0x008 #define DEBUG_THREADS 0x010 #define DEBUG_MULTIPLEX 0x020 #define DEBUG_OVERFLOW 0x040 #define DEBUG_PROFILE 0x080 #define DEBUG_MEMORY 0x100 #define DEBUG_LEAK 0x200 #define DEBUG_HIGHLEVEL 0x400 #define DEBUG_ALL (DEBUG_SUBSTRATE|DEBUG_API|DEBUG_INTERNAL|DEBUG_THREADS|DEBUG_MULTIPLEX|DEBUG_OVERFLOW|DEBUG_PROFILE|DEBUG_MEMORY|DEBUG_LEAK|DEBUG_HIGHLEVEL) /* Please get rid of the DBG macro from your code */ extern int _papi_hwi_debug; extern unsigned long int ( *_papi_hwi_thread_id_fn ) ( void ); #ifdef DEBUG #ifdef __GNUC__ #define FUNC __FUNCTION__ #elif defined(__func__) #define FUNC __func__ #else #define FUNC "?" 
#endif #define DEBUGLABEL(a) if (_papi_hwi_thread_id_fn) fprintf(stderr, "%s:%s:%s:%d:%d:%#lx ",a,__FILE__, FUNC, __LINE__,(int)getpid(),_papi_hwi_thread_id_fn()); else fprintf(stderr, "%s:%s:%s:%d:%d ",a,__FILE__, FUNC, __LINE__, (int)getpid()) #define ISLEVEL(a) (_papi_hwi_debug&a) #define DEBUGLEVEL(a) ((a&DEBUG_SUBSTRATE)?"SUBSTRATE":(a&DEBUG_API)?"API":(a&DEBUG_INTERNAL)?"INTERNAL":(a&DEBUG_THREADS)?"THREADS":(a&DEBUG_MULTIPLEX)?"MULTIPLEX":(a&DEBUG_OVERFLOW)?"OVERFLOW":(a&DEBUG_PROFILE)?"PROFILE":(a&DEBUG_MEMORY)?"MEMORY":(a&DEBUG_LEAK)?"LEAK":(a&DEBUG_HIGHLEVEL)?"HIGHLEVEL":"UNKNOWN") #ifndef NO_VARARG_MACRO /* Has variable arg macro support */ #define PAPIDEBUG(level,format, args...) { if(_papi_hwi_debug&level){DEBUGLABEL(DEBUGLEVEL(level));fprintf(stderr,format, ## args);}} /* Macros */ #define SUBDBG(format, args...) (PAPIDEBUG(DEBUG_SUBSTRATE,format, ## args)) #define APIDBG(format, args...) (PAPIDEBUG(DEBUG_API,format, ## args)) #define INTDBG(format, args...) (PAPIDEBUG(DEBUG_INTERNAL,format, ## args)) #define THRDBG(format, args...) (PAPIDEBUG(DEBUG_THREADS,format, ## args)) #define MPXDBG(format, args...) (PAPIDEBUG(DEBUG_MULTIPLEX,format, ## args)) #define OVFDBG(format, args...) (PAPIDEBUG(DEBUG_OVERFLOW,format, ## args)) #define PRFDBG(format, args...) (PAPIDEBUG(DEBUG_PROFILE,format, ## args)) #define MEMDBG(format, args...) (PAPIDEBUG(DEBUG_MEMORY,format, ## args)) #define LEAKDBG(format, args...) (PAPIDEBUG(DEBUG_LEAK,format, ## args)) #define HLDBG(format, args...) (PAPIDEBUG(DEBUG_HIGHLEVEL,format, ## args)) #endif #else #ifndef NO_VARARG_MACRO /* Has variable arg macro support */ #define SUBDBG(format, args...) { ; } #define APIDBG(format, args...) { ; } #define INTDBG(format, args...) { ; } #define THRDBG(format, args...) { ; } #define MPXDBG(format, args...) { ; } #define OVFDBG(format, args...) { ; } #define PRFDBG(format, args...) { ; } #define MEMDBG(format, args...) { ; } #define LEAKDBG(format, args...) 
{ ; } #define HLDBG(format, args...) { ; } #define PAPIDEBUG(level, format, args...) { ; } #endif #endif /* * Debug functions for platforms without vararg macro support */ #ifdef NO_VARARG_MACRO static void PAPIDEBUG( int level, char *format, va_list args ) { #ifdef DEBUG if ( ISLEVEL( level ) ) { vfprintf( stderr, format, args ); } else #endif return; } static void _SUBDBG( char *format, ... ) { #ifdef DEBUG va_list args; va_start(args, format); PAPIDEBUG( DEBUG_SUBSTRATE, format, args ); va_end(args); #endif } #ifdef DEBUG #define SUBDBG do { \ if (DEBUG_SUBSTRATE & _papi_hwi_debug) {\ DEBUGLABEL( DEBUGLEVEL ( DEBUG_SUBSTRATE ) ); \ } \ } while(0); _SUBDBG #else #define SUBDBG _SUBDBG #endif static void _APIDBG( char *format, ... ) { #ifdef DEBUG va_list args; va_start(args, format); PAPIDEBUG( DEBUG_API, format, args ); va_end(args); #endif } #ifdef DEBUG #define APIDBG do { \ if (DEBUG_API&_papi_hwi_debug) {\ DEBUGLABEL( DEBUGLEVEL ( DEBUG_API ) ); \ } \ } while(0); _APIDBG #else #define APIDBG _APIDBG #endif static void _INTDBG( char *format, ... ) { #ifdef DEBUG va_list args; va_start(args, format); PAPIDEBUG( DEBUG_INTERNAL, format, args ); va_end(args); #endif } #ifdef DEBUG #define INTDBG do { \ if (DEBUG_INTERNAL&_papi_hwi_debug) {\ DEBUGLABEL( DEBUGLEVEL ( DEBUG_INTERNAL ) ); \ } \ } while(0); _INTDBG #else #define INTDBG _INTDBG #endif static void _THRDBG( char *format, ... ) { #ifdef DEBUG va_list args; va_start(args, format); PAPIDEBUG( DEBUG_THREADS, format, args ); va_end(args); #endif } #ifdef DEBUG #define THRDBG do { \ if (DEBUG_THREADS&_papi_hwi_debug) {\ DEBUGLABEL( DEBUGLEVEL ( DEBUG_THREADS ) ); \ } \ } while(0); _THRDBG #else #define THRDBG _THRDBG #endif static void _MPXDBG( char *format, ... 
) { #ifdef DEBUG va_list args; va_start(args, format); PAPIDEBUG( DEBUG_MULTIPLEX, format, args ); va_end(args); #endif } #ifdef DEBUG #define MPXDBG do { \ if (DEBUG_MULTIPLEX&_papi_hwi_debug) {\ DEBUGLABEL( DEBUGLEVEL ( DEBUG_MULTIPLEX ) ); \ } \ } while(0); _MPXDBG #else #define MPXDBG _MPXDBG #endif static void _OVFDBG( char *format, ... ) { #ifdef DEBUG va_list args; va_start(args, format); PAPIDEBUG( DEBUG_OVERFLOW, format, args ); va_end(args); #endif } #ifdef DEBUG #define OVFDBG do { \ if (DEBUG_OVERFLOW&_papi_hwi_debug) {\ DEBUGLABEL( DEBUGLEVEL ( DEBUG_OVERFLOW ) ); \ } \ } while(0); _OVFDBG #else #define OVFDBG _OVFDBG #endif static void _PRFDBG( char *format, ... ) { #ifdef DEBUG va_list args; va_start(args, format); PAPIDEBUG( DEBUG_PROFILE, format, args ); va_end(args); #endif } #ifdef DEBUG #define PRFDBG do { \ if (DEBUG_PROFILE&_papi_hwi_debug) {\ DEBUGLABEL( DEBUGLEVEL ( DEBUG_PROFILE ) ); \ } \ } while(0); _PRFDBG #else #define PRFDBG _PRFDBG #endif static void _MEMDBG( char *format, ... ) { #ifdef DEBUG va_list args; va_start(args, format); PAPIDEBUG( DEBUG_MEMORY, format , args); va_end(args); #endif } #ifdef DEBUG #define MEMDBG do { \ if (DEBUG_MEMORY&_papi_hwi_debug) {\ DEBUGLABEL( DEBUGLEVEL ( DEBUG_MEMORY ) ); \ } \ } while(0); _MEMDBG #else #define MEMDBG _MEMDBG #endif static void _LEAKDBG( char *format, ... ) { #ifdef DEBUG va_list args; va_start(args, format); PAPIDEBUG( DEBUG_LEAK, format , args); va_end(args); #endif } #ifdef DEBUG #define LEAKDBG do { \ if (DEBUG_LEAK&_papi_hwi_debug) {\ DEBUGLABEL( DEBUGLEVEL ( DEBUG_LEAK ) ); \ } \ } while(0); _LEAKDBG #else #define LEAKDBG _LEAKDBG #endif static void _HLDBG( char *format, ... 
) { #ifdef DEBUG va_list args; va_start(args, format); PAPIDEBUG( DEBUG_HIGHLEVEL, format , args); va_end(args); #endif } #ifdef DEBUG #define HLDBG do { \ if (DEBUG_HIGHLEVEL&_papi_hwi_debug) {\ DEBUGLABEL( DEBUGLEVEL ( DEBUG_HIGHLEVEL ) ); \ } \ } while(0); _HLDBG #else #define HLDBG _HLDBG #endif /* ifdef NO_VARARG_MACRO */ #endif #endif /* PAPI_DEBUG_H */ papi-papi-7-2-0-t/src/papi_events.csv000066400000000000000000004423461502707512200174760ustar00rootroot00000000000000# # Every CPU automatically has PAPI_TOT_CYC and PAPI_TOT_INS added # # Processor identifier and additional flags. # The processor identifier *can not* contain any comma characters as these # characters serve to delimit fields. # CPU,AMD64 (K7) CPU,amd64_k7 PRESET,PAPI_TOT_INS,NOT_DERIVED,RETIRED_INSTRUCTIONS PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED PRESET,PAPI_L1_ICM,NOT_DERIVED,INSTRUCTION_CACHE_MISSES PRESET,PAPI_L1_ICA,NOT_DERIVED,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_ICR,NOT_DERIVED,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_DCM,NOT_DERIVED,DATA_CACHE_MISSES PRESET,PAPI_L1_DCA,NOT_DERIVED,DATA_CACHE_ACCESSES PRESET,PAPI_L1_DCH,DERIVED_SUB,DATA_CACHE_ACCESSES,DATA_CACHE_MISSES PRESET,PAPI_L1_TCA,DERIVED_ADD,DATA_CACHE_ACCESSES,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_TCM,DERIVED_ADD,INSTRUCTION_CACHE_MISSES,DATA_CACHE_MISSES PRESET,PAPI_L1_TCH,DERIVED_POSTFIX,N0|N1|+|N2|-|N3|-|,DATA_CACHE_ACCESSES,INSTRUCTION_CACHE_FETCHES,DATA_CACHE_MISSES,INSTRUCTION_CACHE_MISSES # PRESET,PAPI_TLB_DM,NOT_DERIVED,L1_DTLB_AND_L2_DTLB_MISS PRESET,PAPI_TLB_IM,NOT_DERIVED,L1_ITLB_MISS_AND_L2_ITLB_MISS PRESET,PAPI_TLB_TL,DERIVED_ADD,L1_DTLB_AND_L2_DTLB_MISS,L1_ITLB_MISS_AND_L2_ITLB_MISS # PRESET,PAPI_BR_INS,NOT_DERIVED,RETIRED_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_TKN,NOT_DERIVED,RETIRED_TAKEN_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_MSP,NOT_DERIVED,RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS # PRESET,PAPI_HW_INT,NOT_DERIVED,INTERRUPTS_TAKEN # CPU,AMD64 CPU,AMD64 (unknown model) CPU,AMD64 (K8 RevB) CPU,AMD64 (K8 
RevC) CPU,AMD64 (K8 RevD) CPU,AMD64 (K8 RevE) CPU,AMD64 (K8 RevF) CPU,AMD64 (K8 RevG) CPU,amd64_k8_revb CPU,amd64_k8_revc CPU,amd64_k8_revd CPU,amd64_k8_reve CPU,amd64_k8_revf CPU,amd64_k8_revg # PRESET,PAPI_TOT_INS,NOT_DERIVED,RETIRED_INSTRUCTIONS PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED PRESET,PAPI_L1_ICH,DERIVED_SUB,INSTRUCTION_CACHE_FETCHES,INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM,INSTRUCTION_CACHE_REFILLS_FROM_L2 PRESET,PAPI_L1_ICM,NOT_DERIVED,INSTRUCTION_CACHE_MISSES PRESET,PAPI_L1_ICA,NOT_DERIVED,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_ICR,NOT_DERIVED,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_DCM,NOT_DERIVED,DATA_CACHE_MISSES PRESET,PAPI_L1_DCA,NOT_DERIVED,DATA_CACHE_ACCESSES PRESET,PAPI_L1_DCH,DERIVED_SUB,DATA_CACHE_ACCESSES,DATA_CACHE_MISSES PRESET,PAPI_L1_TCA,DERIVED_ADD,DATA_CACHE_ACCESSES,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_TCM,DERIVED_ADD,INSTRUCTION_CACHE_MISSES,DATA_CACHE_MISSES PRESET,PAPI_L1_TCH,DERIVED_POSTFIX,N0|N1|+|N2|-|N3|-|,DATA_CACHE_ACCESSES,INSTRUCTION_CACHE_FETCHES,DATA_CACHE_MISSES,INSTRUCTION_CACHE_MISSES # PRESET,PAPI_L2_ICA,NOT_DERIVED,REQUESTS_TO_L2:INSTRUCTIONS PRESET,PAPI_L2_ICM,NOT_DERIVED,L2_CACHE_MISS:INSTRUCTIONS PRESET,PAPI_L2_ICH,NOT_DERIVED,INSTRUCTION_CACHE_REFILLS_FROM_L2 PRESET,PAPI_L2_DCA,NOT_DERIVED,REQUESTS_TO_L2:DATA PRESET,PAPI_L2_DCM,NOT_DERIVED,L2_CACHE_MISS:DATA PRESET,PAPI_L2_DCH,DERIVED_SUB,REQUESTS_TO_L2:DATA,L2_CACHE_MISS:DATA PRESET,PAPI_L2_TCA,NOT_DERIVED,REQUESTS_TO_L2:ALL PRESET,PAPI_L2_TCM,NOT_DERIVED,L2_CACHE_MISS:INSTRUCTIONS:DATA PRESET,PAPI_L2_TCH,DERIVED_SUB,REQUESTS_TO_L2:INSTRUCTIONS:DATA,L2_CACHE_MISS:ALL # PRESET,PAPI_TLB_DM,NOT_DERIVED,L1_DTLB_AND_L2_DTLB_MISS PRESET,PAPI_TLB_IM,NOT_DERIVED,L1_ITLB_MISS_AND_L2_ITLB_MISS PRESET,PAPI_TLB_TL,DERIVED_ADD,L1_DTLB_AND_L2_DTLB_MISS,L1_ITLB_MISS_AND_L2_ITLB_MISS # PRESET,PAPI_BR_INS,NOT_DERIVED,RETIRED_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_TKN,NOT_DERIVED,RETIRED_TAKEN_BRANCH_INSTRUCTIONS 
PRESET,PAPI_BR_MSP,NOT_DERIVED,RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS # PRESET,PAPI_STL_ICY,NOT_DERIVED,DECODER_EMPTY PRESET,PAPI_RES_STL,NOT_DERIVED,DISPATCH_STALLS PRESET,PAPI_HW_INT,NOT_DERIVED,INTERRUPTS_TAKEN # PRESET,PAPI_FPU_IDL,NOT_DERIVED,CYCLES_NO_FPU_OPS_RETIRED PRESET,PAPI_FML_INS,NOT_DERIVED,DISPATCHED_FPU:OPS_MULTIPLY PRESET,PAPI_FAD_INS,NOT_DERIVED,DISPATCHED_FPU:OPS_ADD PRESET,PAPI_VEC_INS,NOT_DERIVED,RETIRED_MMX_AND_FP_INSTRUCTIONS:PACKED_SSE_AND_SSE2 # This definition gives an accurate count of the instructions retired through the FP unit # It counts just about everything except MMX and 3DNow instructions # Unfortunately, it also counts loads and stores. Therefore the count will be uniformly # high, but proportional to the work done. PRESET,PAPI_FP_INS,NOT_DERIVED,RETIRED_MMX_AND_FP_INSTRUCTIONS:X87:SCALAR_SSE_AND_SSE2:PACKED_SSE_AND_SSE2 # This definition is speculative but gives good answers on our simple test cases # It overcounts FP operations, sometimes by A LOT, but doesn't count loads and stores PRESET,PAPI_FP_OPS,NOT_DERIVED,DISPATCHED_FPU:OPS_MULTIPLY:OPS_ADD,NOTE,'Counts speculative adds and multiplies. Variable and higher than theoretical.' # CPU,AMD64 FPU RETIRED # PRESET,PAPI_FP_OPS,NOT_DERIVED,RETIRED_MMX_AND_FP_INSTRUCTIONS:X87:SCALAR_SSE_AND_SSE2:PACKED_SSE_AND_SSE2,NOTE,"Counts all retired floating point operations, including data movement. Precise, and proportional to work done, but much higher than theoretical." # CPU,AMD64 FPU SPECULATIVE # PRESET,PAPI_FP_OPS,NOT_DERIVED,DISPATCHED_FPU:OPS_MULTIPLY:OPS_ADD,NOTE,"Counts speculative adds and multiplies. Variable and higher than theoretical." # CPU,AMD64 FPU SSE_SP # PRESET,PAPI_FP_OPS,DERIVED_SUB,RETIRED_MMX_AND_FP_INSTRUCTIONS:X87:SCALAR_SSE_AND_SSE2:PACKED_SSE_AND_SSE2,DISPATCHED_FPU:OPS_STORE,NOTE,"Counts retired ops corrected for data motion. Optimized for single precision; lower than theoretical."
# CPU,AMD64 FPU SSE_DP # PRESET,PAPI_FP_OPS,DERIVED_SUB,RETIRED_MMX_AND_FP_INSTRUCTIONS:X87:SCALAR_SSE_AND_SSE2:PACKED_SSE_AND_SSE2,DISPATCHED_FPU:OPS_STORE_PIPE_LOAD_OPS,NOTE,"Counts retired ops corrected for data motion. Optimized for double precision; lower than theoretical." # ######################## # AMD64 # ######################## CPU,AMD64 (Barcelona) CPU,AMD64 (Barcelona RevB) CPU,AMD64 (Barcelona RevC) CPU,AMD64 (Family 10h RevB Barcelona) CPU,AMD64 (Family 10h RevC Shanghai) CPU,AMD64 (Family 10h RevD Istanbul) CPU,AMD64 (Family 10h RevE) CPU,amd64_fam10h_barcelona CPU,amd64_fam10h_shanghai CPU,amd64_fam10h_istanbul CPU,amd64_fam11h_turion # PRESET,PAPI_TOT_INS,NOT_DERIVED,RETIRED_INSTRUCTIONS PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED PRESET,PAPI_L1_ICH,DERIVED_SUB,INSTRUCTION_CACHE_FETCHES,INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM,INSTRUCTION_CACHE_REFILLS_FROM_L2 PRESET,PAPI_L1_ICM,NOT_DERIVED,INSTRUCTION_CACHE_MISSES PRESET,PAPI_L1_ICA,NOT_DERIVED,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_ICR,NOT_DERIVED,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_DCM,NOT_DERIVED,DATA_CACHE_MISSES PRESET,PAPI_L1_DCA,NOT_DERIVED,DATA_CACHE_ACCESSES PRESET,PAPI_L1_DCH,DERIVED_SUB,DATA_CACHE_ACCESSES,DATA_CACHE_MISSES PRESET,PAPI_L1_TCA,DERIVED_ADD,DATA_CACHE_ACCESSES,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_TCM,DERIVED_ADD,INSTRUCTION_CACHE_MISSES,DATA_CACHE_MISSES PRESET,PAPI_L1_TCH,DERIVED_POSTFIX,N0|N1|+|N2|-|N3|-|,DATA_CACHE_ACCESSES,INSTRUCTION_CACHE_FETCHES,DATA_CACHE_MISSES,INSTRUCTION_CACHE_MISSES # PRESET,PAPI_L2_ICA,NOT_DERIVED,REQUESTS_TO_L2:INSTRUCTIONS PRESET,PAPI_L2_ICM,NOT_DERIVED,L2_CACHE_MISS:INSTRUCTIONS PRESET,PAPI_L2_ICH,NOT_DERIVED,INSTRUCTION_CACHE_REFILLS_FROM_L2 PRESET,PAPI_L2_DCA,NOT_DERIVED,REQUESTS_TO_L2:DATA PRESET,PAPI_L2_DCM,NOT_DERIVED,L2_CACHE_MISS:DATA PRESET,PAPI_L2_DCH,DERIVED_SUB,REQUESTS_TO_L2:DATA,L2_CACHE_MISS:DATA PRESET,PAPI_L2_TCA,NOT_DERIVED,REQUESTS_TO_L2:ALL PRESET,PAPI_L2_TCM,NOT_DERIVED,L2_CACHE_MISS:INSTRUCTIONS:DATA 
PRESET,PAPI_L2_TCH,DERIVED_SUB,REQUESTS_TO_L2:INSTRUCTIONS:DATA,L2_CACHE_MISS:ALL # # no L3_ preset definitions for multi-cores with shared L3 cache, # as long as L3 events are automatically shadowed from core- to chip-space # PRESET,PAPI_L3_TCR,NOT_DERIVED,READ_REQUEST_TO_L3_CACHE:ALL # PRESET,PAPI_L3_TCM,NOT_DERIVED,L3_CACHE_MISSES:ALL # PRESET,PAPI_L3_TCH,DERIVED_SUB,READ_REQUEST_TO_L3_CACHE:ALL,L3_CACHE_MISSES:ALL # PRESET,PAPI_TLB_DM,NOT_DERIVED,L1_DTLB_AND_L2_DTLB_MISS:ALL PRESET,PAPI_TLB_IM,NOT_DERIVED,L1_ITLB_MISS_AND_L2_ITLB_MISS:ALL PRESET,PAPI_TLB_TL,DERIVED_ADD,L1_DTLB_AND_L2_DTLB_MISS:ALL,L1_ITLB_MISS_AND_L2_ITLB_MISS:ALL # PRESET,PAPI_BR_INS,NOT_DERIVED,RETIRED_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_TKN,NOT_DERIVED,RETIRED_TAKEN_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_MSP,NOT_DERIVED,RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS # PRESET,PAPI_STL_ICY,NOT_DERIVED,DECODER_EMPTY PRESET,PAPI_RES_STL,NOT_DERIVED,DISPATCH_STALLS PRESET,PAPI_HW_INT,NOT_DERIVED,INTERRUPTS_TAKEN # PRESET,PAPI_FPU_IDL,NOT_DERIVED,CYCLES_NO_FPU_OPS_RETIRED PRESET,PAPI_FML_INS,NOT_DERIVED,DISPATCHED_FPU:OPS_MULTIPLY PRESET,PAPI_FAD_INS,NOT_DERIVED,DISPATCHED_FPU:OPS_ADD PRESET,PAPI_VEC_INS,NOT_DERIVED,RETIRED_MMX_AND_FP_INSTRUCTIONS:PACKED_SSE_AND_SSE2 # # An analysis by Bill Homer of Cray indicates accurate counts over a range of conditions # John McCalpin reports that OP_TYPE expands packed operation counts appropriately. # Therefore, it is included in FP_OPS, but not in FP_INS. 
PRESET,PAPI_FP_INS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_ADD_SUB_OPS:SINGLE_MUL_OPS:DOUBLE_ADD_SUB_OPS:DOUBLE_MUL_OPS PRESET,PAPI_FP_OPS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_ADD_SUB_OPS:SINGLE_MUL_OPS:DOUBLE_ADD_SUB_OPS:DOUBLE_MUL_OPS:OP_TYPE PRESET,PAPI_SP_OPS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_ADD_SUB_OPS:SINGLE_MUL_OPS:SINGLE_DIV_OPS PRESET,PAPI_DP_OPS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:DOUBLE_ADD_SUB_OPS:DOUBLE_MUL_OPS:DOUBLE_DIV_OPS # PRESET,PAPI_FML_INS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_MUL_OPS:DOUBLE_MUL_OPS:OP_TYPE PRESET,PAPI_FAD_INS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_ADD_SUB_OPS:DOUBLE_ADD_SUB_OPS:OP_TYPE,NOTE,"Also includes subtract instructions" PRESET,PAPI_FDV_INS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_DIV_OPS:DOUBLE_DIV_OPS:OP_TYPE,NOTE,"Counts both divide and square root instructions" PRESET,PAPI_FSQ_INS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_DIV_OPS:DOUBLE_DIV_OPS:OP_TYPE,NOTE,"Counts both divide and square root instructions" ######################## # AMD64 fam12h llano # ######################## CPU,amd64_fam12h_llano # PRESET,PAPI_TOT_INS,NOT_DERIVED,RETIRED_INSTRUCTIONS PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED PRESET,PAPI_L1_ICH,DERIVED_SUB,INSTRUCTION_CACHE_FETCHES,INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM,INSTRUCTION_CACHE_REFILLS_FROM_L2 PRESET,PAPI_L1_ICM,NOT_DERIVED,INSTRUCTION_CACHE_MISSES PRESET,PAPI_L1_ICA,NOT_DERIVED,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_ICR,NOT_DERIVED,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_DCM,NOT_DERIVED,DATA_CACHE_MISSES PRESET,PAPI_L1_DCA,NOT_DERIVED,DATA_CACHE_ACCESSES PRESET,PAPI_L1_DCH,DERIVED_SUB,DATA_CACHE_ACCESSES,DATA_CACHE_MISSES PRESET,PAPI_L1_TCA,DERIVED_ADD,DATA_CACHE_ACCESSES,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_TCM,DERIVED_ADD,INSTRUCTION_CACHE_MISSES,DATA_CACHE_MISSES PRESET,PAPI_L1_TCH,DERIVED_POSTFIX,N0|N1|+|N2|-|N3|-|,DATA_CACHE_ACCESSES,INSTRUCTION_CACHE_FETCHES,DATA_CACHE_MISSES,INSTRUCTION_CACHE_MISSES # 
PRESET,PAPI_L2_ICA,NOT_DERIVED,REQUESTS_TO_L2:INSTRUCTIONS PRESET,PAPI_L2_ICM,NOT_DERIVED,L2_CACHE_MISS:INSTRUCTIONS PRESET,PAPI_L2_ICH,NOT_DERIVED,INSTRUCTION_CACHE_REFILLS_FROM_L2 PRESET,PAPI_L2_DCA,NOT_DERIVED,REQUESTS_TO_L2:DATA PRESET,PAPI_L2_DCM,NOT_DERIVED,L2_CACHE_MISS:DATA PRESET,PAPI_L2_DCH,DERIVED_SUB,REQUESTS_TO_L2:DATA,L2_CACHE_MISS:DATA PRESET,PAPI_L2_TCA,NOT_DERIVED,REQUESTS_TO_L2:ALL PRESET,PAPI_L2_TCM,NOT_DERIVED,L2_CACHE_MISS:INSTRUCTIONS:DATA PRESET,PAPI_L2_TCH,DERIVED_SUB,REQUESTS_TO_L2:INSTRUCTIONS:DATA,L2_CACHE_MISS:ALL # # no L3_ preset definitions for multi-cores with shared L3 cache, # as long as L3 events are automatically shadowed from core- to chip-space # PRESET,PAPI_L3_TCR,NOT_DERIVED,READ_REQUEST_TO_L3_CACHE:ALL # PRESET,PAPI_L3_TCM,NOT_DERIVED,L3_CACHE_MISSES:ALL # PRESET,PAPI_L3_TCH,DERIVED_SUB,READ_REQUEST_TO_L3_CACHE:ALL,L3_CACHE_MISSES:ALL # PRESET,PAPI_TLB_DM,NOT_DERIVED,L1_DTLB_AND_L2_DTLB_MISS:ALL PRESET,PAPI_TLB_IM,NOT_DERIVED,L1_ITLB_MISS_AND_L2_ITLB_MISS:ALL PRESET,PAPI_TLB_TL,DERIVED_ADD,L1_DTLB_AND_L2_DTLB_MISS:ALL,L1_ITLB_MISS_AND_L2_ITLB_MISS:ALL # PRESET,PAPI_BR_INS,NOT_DERIVED,RETIRED_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_TKN,NOT_DERIVED,RETIRED_TAKEN_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_MSP,NOT_DERIVED,RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS # PRESET,PAPI_STL_ICY,NOT_DERIVED,DECODER_EMPTY PRESET,PAPI_RES_STL,NOT_DERIVED,DISPATCH_STALLS PRESET,PAPI_HW_INT,NOT_DERIVED,INTERRUPTS_TAKEN # PRESET,PAPI_FPU_IDL,NOT_DERIVED,CYCLES_NO_FPU_OPS_RETIRED PRESET,PAPI_FML_INS,NOT_DERIVED,DISPATCHED_FPU:OPS_MULTIPLY PRESET,PAPI_FAD_INS,NOT_DERIVED,DISPATCHED_FPU:OPS_ADD PRESET,PAPI_VEC_INS,NOT_DERIVED,RETIRED_MMX_AND_FP_INSTRUCTIONS:SSE_AND_SSE2 # # An analysis by Bill Homer of Cray indicates accurate counts over a range of conditions # John McCalpin reports that OP_TYPE expands packed operation counts appropriately. # Therefore, it is included in FP_OPS, but not in FP_INS. 
PRESET,PAPI_FP_INS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_ADD_SUB_OPS:SINGLE_MUL_OPS:DOUBLE_ADD_SUB_OPS:DOUBLE_MUL_OPS PRESET,PAPI_FP_OPS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_ADD_SUB_OPS:SINGLE_MUL_OPS:DOUBLE_ADD_SUB_OPS:DOUBLE_MUL_OPS:OP_TYPE PRESET,PAPI_SP_OPS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_ADD_SUB_OPS:SINGLE_MUL_OPS:SINGLE_DIV_OPS PRESET,PAPI_DP_OPS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:DOUBLE_ADD_SUB_OPS:DOUBLE_MUL_OPS:DOUBLE_DIV_OPS # PRESET,PAPI_FML_INS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_MUL_OPS:DOUBLE_MUL_OPS:OP_TYPE PRESET,PAPI_FAD_INS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_ADD_SUB_OPS:DOUBLE_ADD_SUB_OPS:OP_TYPE,NOTE,"Also includes subtract instructions" PRESET,PAPI_FDV_INS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_DIV_OPS:DOUBLE_DIV_OPS:OP_TYPE,NOTE,"Counts both divide and square root instructions" PRESET,PAPI_FSQ_INS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_DIV_OPS:DOUBLE_DIV_OPS:OP_TYPE,NOTE,"Counts both divide and square root instructions" ######################### # AMD Fam14h Bobcat # ######################### # CPU,amd64_fam14h_bobcat # PRESET,PAPI_TOT_INS,NOT_DERIVED,RETIRED_INSTRUCTIONS PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED PRESET,PAPI_L1_ICH,DERIVED_SUB,INSTRUCTION_CACHE_FETCHES,INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM,INSTRUCTION_CACHE_REFILLS_FROM_L2 PRESET,PAPI_L1_ICM,NOT_DERIVED,INSTRUCTION_CACHE_MISSES PRESET,PAPI_L1_ICA,NOT_DERIVED,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_ICR,NOT_DERIVED,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_DCM,NOT_DERIVED,DATA_CACHE_MISSES PRESET,PAPI_L1_DCA,NOT_DERIVED,DATA_CACHE_ACCESSES PRESET,PAPI_L1_DCH,DERIVED_SUB,DATA_CACHE_ACCESSES,DATA_CACHE_MISSES PRESET,PAPI_L1_TCA,DERIVED_ADD,DATA_CACHE_ACCESSES,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_TCM,DERIVED_ADD,INSTRUCTION_CACHE_MISSES,DATA_CACHE_MISSES PRESET,PAPI_L1_TCH,DERIVED_POSTFIX,N0|N1|+|N2|-|N3|-|,DATA_CACHE_ACCESSES,INSTRUCTION_CACHE_FETCHES,DATA_CACHE_MISSES,INSTRUCTION_CACHE_MISSES 
PRESET,PAPI_L2_ICA,NOT_DERIVED,REQUESTS_TO_L2:INSTRUCTIONS PRESET,PAPI_L2_ICM,NOT_DERIVED,L2_CACHE_MISS:INSTRUCTIONS PRESET,PAPI_L2_ICH,NOT_DERIVED,INSTRUCTION_CACHE_REFILLS_FROM_L2 PRESET,PAPI_L2_DCA,NOT_DERIVED,REQUESTS_TO_L2:DATA PRESET,PAPI_L2_DCM,NOT_DERIVED,L2_CACHE_MISS:DATA PRESET,PAPI_L2_DCH,DERIVED_SUB,REQUESTS_TO_L2:DATA,L2_CACHE_MISS:DATA PRESET,PAPI_L2_TCA,NOT_DERIVED,REQUESTS_TO_L2:ALL PRESET,PAPI_L2_TCM,NOT_DERIVED,L2_CACHE_MISS:INSTRUCTIONS:DATA PRESET,PAPI_L2_TCH,DERIVED_SUB,REQUESTS_TO_L2:INSTRUCTIONS:DATA,L2_CACHE_MISS:ALL PRESET,PAPI_TLB_DM,NOT_DERIVED,DTLB_MISS PRESET,PAPI_TLB_IM,NOT_DERIVED,L1_ITLB_MISS_AND_L2_ITLB_MISS PRESET,PAPI_TLB_TL,DERIVED_ADD,DTLB_MISS,L1_ITLB_MISS_AND_L2_ITLB_MISS PRESET,PAPI_BR_INS,NOT_DERIVED,RETIRED_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_TKN,NOT_DERIVED,RETIRED_TAKEN_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_MSP,NOT_DERIVED,RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS PRESET,PAPI_HW_INT,NOT_DERIVED,INTERRUPTS_TAKEN PRESET,PAPI_FPU_IDL,NOT_DERIVED,CYCLES_NO_FPU_OPS_RETIRED PRESET,PAPI_FP_INS,NOT_DERIVED,RETIRED_FLOATING_POINT_INSTRUCTIONS:ALL PRESET,PAPI_FP_OPS,NOT_DERIVED,DISPATCHED_FPU:ANY PRESET,PAPI_VEC_INS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:ALL PRESET,PAPI_VEC_SP,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_ADD_SUB_OPS:SINGLE_MUL_OPS:SINGLE_DIV_OPS PRESET,PAPI_VEC_DP,NOT_DERIVED,RETIRED_SSE_OPERATIONS:DOUBLE_ADD_SUB_OPS:DOUBLE_MUL_OPS:DOUBLE_DIV_OPS PRESET,PAPI_FML_INS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_MUL_OPS:DOUBLE_MUL_OPS PRESET,PAPI_FDV_INS,NOT_DERIVED,RETIRED_SSE_OPERATIONS:SINGLE_DIV_OPS:DOUBLE_DIV_OPS # CPU,AMD64 (Family 15h RevB) CPU,amd64_fam15h_interlagos # PRESET,PAPI_TOT_INS,NOT_DERIVED,RETIRED_INSTRUCTIONS PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED PRESET,PAPI_L1_ICH,DERIVED_SUB,INSTRUCTION_CACHE_FETCHES,INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM,INSTRUCTION_CACHE_REFILLS_FROM_L2 PRESET,PAPI_L1_ICM,NOT_DERIVED,INSTRUCTION_CACHE_MISSES PRESET,PAPI_L1_ICA,NOT_DERIVED,INSTRUCTION_CACHE_FETCHES 
PRESET,PAPI_L1_ICR,NOT_DERIVED,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_DCM,NOT_DERIVED,DATA_CACHE_MISSES:DC_MISS_STREAMING_STORE PRESET,PAPI_L1_DCA,NOT_DERIVED,DATA_CACHE_ACCESSES PRESET,PAPI_L1_DCH,DERIVED_SUB,DATA_CACHE_ACCESSES,DATA_CACHE_MISSES:DC_MISS_STREAMING_STORE PRESET,PAPI_L1_TCA,DERIVED_ADD,DATA_CACHE_ACCESSES,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_TCM,DERIVED_ADD,INSTRUCTION_CACHE_MISSES,DATA_CACHE_MISSES:DC_MISS_STREAMING_STORE PRESET,PAPI_L1_TCH,DERIVED_POSTFIX,N0|N1|+|N2|-|N3|-|,DATA_CACHE_ACCESSES,INSTRUCTION_CACHE_FETCHES,DATA_CACHE_MISSES:DC_MISS_STREAMING_STORE,INSTRUCTION_CACHE_MISSES # PRESET,PAPI_L2_ICA,NOT_DERIVED,REQUESTS_TO_L2:INSTRUCTIONS PRESET,PAPI_L2_ICM,NOT_DERIVED,L2_CACHE_MISS:INSTRUCTIONS PRESET,PAPI_L2_ICH,NOT_DERIVED,INSTRUCTION_CACHE_REFILLS_FROM_L2 PRESET,PAPI_L2_DCA,NOT_DERIVED,REQUESTS_TO_L2:DATA PRESET,PAPI_L2_DCM,NOT_DERIVED,L2_CACHE_MISS:DATA PRESET,PAPI_L2_DCH,DERIVED_SUB,REQUESTS_TO_L2:DATA,L2_CACHE_MISS:DATA PRESET,PAPI_L2_TCA,NOT_DERIVED,REQUESTS_TO_L2:ALL PRESET,PAPI_L2_TCM,NOT_DERIVED,L2_CACHE_MISS:INSTRUCTIONS:DATA PRESET,PAPI_L2_TCH,DERIVED_SUB,REQUESTS_TO_L2:INSTRUCTIONS:DATA,L2_CACHE_MISS:ALL # # not implemented: PRESET,PAPI_L3_TCR,NOT_DERIVED,READ_REQUEST_TO_L3_CACHE:ALL # not implemented: PRESET,PAPI_L3_TCM,NOT_DERIVED,L3_CACHE_MISSES:ALL # not implemented: PRESET,PAPI_L3_TCH,DERIVED_SUB,READ_REQUEST_TO_L3_CACHE:ALL,L3_CACHE_MISSES:ALL # PRESET,PAPI_TLB_DM,NOT_DERIVED,UNIFIED_TLB_MISS:4K_DATA:2M_DATA:1GB_DATA PRESET,PAPI_TLB_IM,NOT_DERIVED,UNIFIED_TLB_MISS:4K_INST:2M_INST:1G_INST PRESET,PAPI_TLB_TL,NOT_DERIVED,UNIFIED_TLB_MISS:ALL # PRESET,PAPI_BR_INS,NOT_DERIVED,RETIRED_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_TKN,NOT_DERIVED,RETIRED_TAKEN_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_MSP,NOT_DERIVED,RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_PRC,DERIVED_SUB,RETIRED_BRANCH_INSTRUCTIONS,RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS # PRESET,PAPI_STL_ICY,NOT_DERIVED,DECODER_EMPTY 
PRESET,PAPI_RES_STL,NOT_DERIVED,DISPATCH_STALLS PRESET,PAPI_HW_INT,NOT_DERIVED,INTERRUPTS_TAKEN # PRESET,PAPI_FPU_IDL,NOT_DERIVED,CYCLES_FPU_EMPTY PRESET,PAPI_VEC_INS,NOT_DERIVED,RETIRED_MMX_FP_INSTRUCTIONS:SSE PRESET,PAPI_FP_INS,NOT_DERIVED,RETIRED_SSE_OPS:ALL PRESET,PAPI_FP_OPS,NOT_DERIVED,RETIRED_SSE_OPS:ALL PRESET,PAPI_SP_OPS,NOT_DERIVED,RETIRED_SSE_OPS:SINGLE_ADD_SUB_OPS:SINGLE_MUL_OPS:SINGLE_DIV_OPS:SINGLE_MUL_ADD_OPS PRESET,PAPI_DP_OPS,NOT_DERIVED,RETIRED_SSE_OPS:DOUBLE_ADD_SUB_OPS:DOUBLE_MUL_OPS:DOUBLE_DIV_OPS:DOUBLE_MUL_ADD_OPS # PRESET,PAPI_FML_INS,NOT_DERIVED,RETIRED_SSE_OPS:SINGLE_MUL_OPS:DOUBLE_MUL_OPS:SINGLE_MUL_ADD_OPS:DOUBLE_MUL_ADD_OPS,NOTE,"Also includes multiply-add instructions" PRESET,PAPI_FAD_INS,NOT_DERIVED,RETIRED_SSE_OPS:SINGLE_ADD_SUB_OPS:DOUBLE_ADD_SUB_OPS:SINGLE_MUL_ADD_OPS:DOUBLE_MUL_ADD_OPS,NOTE,"Also includes subtract and multiply-add instructions" PRESET,PAPI_FDV_INS,NOT_DERIVED,RETIRED_SSE_OPS:SINGLE_DIV_OPS:DOUBLE_DIV_OPS,NOTE,"Counts both divide and square root instructions" PRESET,PAPI_FSQ_INS,NOT_DERIVED,RETIRED_SSE_OPS:SINGLE_DIV_OPS:DOUBLE_DIV_OPS,NOTE,"Counts both divide and square root instructions" # # CPU,amd64_fam16h # PRESET,PAPI_TOT_INS,NOT_DERIVED,RETIRED_INSTRUCTIONS PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED PRESET,PAPI_L1_ICH,DERIVED_SUB,INSTRUCTION_CACHE_FETCHES,INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM,INSTRUCTION_CACHE_REFILLS_FROM_L2 PRESET,PAPI_L1_ICM,NOT_DERIVED,INSTRUCTION_CACHE_MISSES PRESET,PAPI_L1_ICA,NOT_DERIVED,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_ICR,NOT_DERIVED,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_DCM,NOT_DERIVED,DATA_CACHE_MISSES PRESET,PAPI_L1_DCA,NOT_DERIVED,DATA_CACHE_ACCESSES #PRESET,PAPI_L1_DCH,DERIVED_SUB,DATA_CACHE_ACCESSES,DATA_CACHE_MISSES:DC_MISS_STREAMING_STORE PRESET,PAPI_L1_TCA,DERIVED_ADD,DATA_CACHE_ACCESSES,INSTRUCTION_CACHE_FETCHES PRESET,PAPI_L1_TCM,DERIVED_ADD,INSTRUCTION_CACHE_MISSES,DATA_CACHE_MISSES # Only have 3 slots??? 
#PRESET,PAPI_L1_TCH,DERIVED_POSTFIX,N0|N1|+|N2|-|N3|-|,DATA_CACHE_ACCESSES,INSTRUCTION_CACHE_FETCHES,DATA_CACHE_MISSES,INSTRUCTION_CACHE_MISSES # PRESET,PAPI_L2_ICA,NOT_DERIVED,INSTRUCTION_CACHE_MISSES # # Note, need access to special L2 uncore events # to get L2 related events # PRESET,PAPI_TLB_DM,NOT_DERIVED,DTLB_MISS PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB_MISS PRESET,PAPI_TLB_TL,DERIVED_ADD,DTLB_MISS,ITLB_MISS # PRESET,PAPI_BR_INS,NOT_DERIVED,RETIRED_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_TKN,NOT_DERIVED,RETIRED_TAKEN_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_MSP,NOT_DERIVED,RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS # PRESET,PAPI_STL_ICY,NOT_DERIVED,INSTRUCTION_FETCH_STALL PRESET,PAPI_HW_INT,NOT_DERIVED,INTERRUPTS_TAKEN # PRESET,PAPI_VEC_INS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS PRESET,PAPI_FP_INS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS PRESET,PAPI_FP_OPS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS PRESET,PAPI_SP_OPS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS:SINGLE_ADD_SUB_OPS:SINGLE_MUL_OPS:SINGLE_DIV_OPS PRESET,PAPI_DP_OPS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS:DOUBLE_ADD_SUB_OPS:DOUBLE_MUL_OPS:DOUBLE_DIV_OPS # PRESET,PAPI_FML_INS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS:SINGLE_MUL_OPS:DOUBLE_MUL_OPS PRESET,PAPI_FAD_INS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS:SINGLE_ADD_SUB_OPS:DOUBLE_ADD_SUB_OPS PRESET,PAPI_FDV_INS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS:SINGLE_DIV_OPS:DOUBLE_DIV_OPS,NOTE,"Counts both divide and square root instructions" PRESET,PAPI_FSQ_INS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS:SINGLE_DIV_OPS:DOUBLE_DIV_OPS,NOTE,"Counts both divide and square root instructions" # # CPU,amd64_fam17h CPU,amd64_fam17h_zen1 # PRESET,PAPI_TOT_INS,NOT_DERIVED,RETIRED_INSTRUCTIONS PRESET,PAPI_TOT_CYC,NOT_DERIVED,CYCLES_NOT_IN_HALT PRESET,PAPI_L1_ICM,NOT_DERIVED,REQUESTS_TO_L2_GROUP1:CACHEABLE_IC_READ # Same event code, confusing name? 
#PRESET,PAPI_L1_DCM,NOT_DERIVED,MAB_ALLOCATION_BY_PIPE PRESET,PAPI_L1_DCA,NOT_DERIVED,DATA_CACHE_ACCESSES PRESET,PAPI_L2_ICA,NOT_DERIVED,REQUESTS_TO_L2_GROUP1:CACHEABLE_IC_READ # # Note, need access to special L2 uncore events # to get L2 related events # PRESET,PAPI_TLB_DM,NOT_DERIVED,L1_DTLB_MISS:TLB_RELOAD_1G_L2_MISS:TLB_RELOAD_2M_L2_MISS:TLB_RELOAD_COALESCED_PAGE_MISS:TLB_RELOAD_4K_L2_MISS:TLB_RELOAD_1G_L2_HIT:TLB_RELOAD_2M_L2_HIT:TLB_RELOAD_COALESCED_PAGE_HIT:TLB_RELOAD_4K_L2_HIT PRESET,PAPI_TLB_IM,DERIVED_ADD,L1_ITLB_MISS_L2_ITLB_HIT,L1_ITLB_MISS_L2_ITLB_MISS:IF1G:IF2M:IF4K # PRESET,PAPI_BR_INS,NOT_DERIVED,RETIRED_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_TKN,NOT_DERIVED,RETIRED_TAKEN_BRANCH_INSTRUCTIONS # Note, the processor supports various kinds of mispredictions PRESET,PAPI_BR_MSP,NOT_DERIVED,RETIRED_BRANCH_INSTRUCTIONS_MISPREDICTED # PRESET,PAPI_STL_ICY,NOT_DERIVED,INSTRUCTION_PIPE_STALL:IC_STALL_ANY # PRESET,PAPI_VEC_INS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS:DP_MULT_ADD_FLOPS:DP_DIV_FLOPS:DP_MULT_FLOPS:DP_ADD_SUB_FLOPS:SP_MULT_ADD_FLOPS:SP_DIV_FLOPS:SP_MULT_FLOPS:SP_ADD_SUB_FLOPS PRESET,PAPI_FP_INS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS:DP_MULT_ADD_FLOPS:DP_DIV_FLOPS:DP_MULT_FLOPS:DP_ADD_SUB_FLOPS:SP_MULT_ADD_FLOPS:SP_DIV_FLOPS:SP_MULT_FLOPS:SP_ADD_SUB_FLOPS PRESET,PAPI_FP_OPS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS:DP_MULT_ADD_FLOPS:DP_DIV_FLOPS:DP_MULT_FLOPS:DP_ADD_SUB_FLOPS:SP_MULT_ADD_FLOPS:SP_DIV_FLOPS:SP_MULT_FLOPS:SP_ADD_SUB_FLOPS PRESET,PAPI_SP_OPS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS:SP_ADD_SUB_FLOPS:SP_MULT_FLOPS:SP_MULT_ADD_FLOPS:SP_DIV_FLOPS PRESET,PAPI_DP_OPS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS:DP_ADD_SUB_FLOPS:DP_MULT_FLOPS:DP_MULT_ADD_FLOPS:DP_DIV_FLOPS # PRESET,PAPI_FML_INS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS:SP_MULT_FLOPS:DP_MULT_FLOPS PRESET,PAPI_FAD_INS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS:SP_ADD_SUB_FLOPS:DP_ADD_SUB_FLOPS PRESET,PAPI_FDV_INS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS:SP_DIV_FLOPS:DP_DIV_FLOPS,NOTE,"Counts both divide 
and square root instructions" PRESET,PAPI_FSQ_INS,NOT_DERIVED,RETIRED_SSE_AVX_OPERATIONS:SP_DIV_FLOPS:DP_DIV_FLOPS,NOTE,"Counts both divide and square root instructions" # Events discovered via CAT PRESET,PAPI_L2_DCM,NOT_DERIVED,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:LS_RD_BLK_C PRESET,PAPI_L2_DCR,NOT_DERIVED,REQUESTS_TO_L2_GROUP1:RD_BLK_L PRESET,PAPI_L2_DCH,NOT_DERIVED,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:LS_RD_BLK_L_HIT_X # # CPU,amd64_fam17h_zen2 # Events copied from zen1 that also exist on zen2 PRESET,PAPI_TLB_DM,NOT_DERIVED,L1_DTLB_MISS:TLB_RELOAD_1G_L2_MISS:TLB_RELOAD_2M_L2_MISS:TLB_RELOAD_COALESCED_PAGE_MISS:TLB_RELOAD_4K_L2_MISS:TLB_RELOAD_1G_L2_HIT:TLB_RELOAD_2M_L2_HIT:TLB_RELOAD_COALESCED_PAGE_HIT:TLB_RELOAD_4K_L2_HIT PRESET,PAPI_TLB_IM,DERIVED_ADD,L1_ITLB_MISS_L2_ITLB_HIT,L1_ITLB_MISS_L2_ITLB_MISS:IF1G:IF2M:IF4K PRESET,PAPI_BR_TKN,NOT_DERIVED,RETIRED_TAKEN_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_MSP,NOT_DERIVED,RETIRED_BRANCH_INSTRUCTIONS_MISPREDICTED PRESET,PAPI_TOT_INS,NOT_DERIVED,RETIRED_INSTRUCTIONS PRESET,PAPI_BR_INS,NOT_DERIVED,RETIRED_BRANCH_INSTRUCTIONS PRESET,PAPI_TOT_CYC,NOT_DERIVED,CYCLES_NOT_IN_HALT # Events discovered via CAT PRESET,PAPI_L1_DCA,NOT_DERIVED,perf::PERF_COUNT_HW_CACHE_L1D:ACCESS PRESET,PAPI_L2_DCM,NOT_DERIVED,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:LS_RD_BLK_C PRESET,PAPI_L2_DCR,NOT_DERIVED,REQUESTS_TO_L2_GROUP1:RD_BLK_L PRESET,PAPI_L2_DCH,NOT_DERIVED,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:LS_RD_BLK_L_HIT_X # PRESET,PAPI_L1_ICM,NOT_DERIVED,REQUESTS_TO_L2_GROUP1:CACHEABLE_IC_READ # PRESET,PAPI_L2_ICR,NOT_DERIVED,REQUESTS_TO_L2_GROUP1:CACHEABLE_IC_READ PRESET,PAPI_L2_ICM,NOT_DERIVED,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:IC_FILL_MISS PRESET,PAPI_L2_ICH,NOT_DERIVED,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:IC_FILL_HIT_X:IC_FILL_HIT_S # New FLOP event on zen2 # PPR (under section 2.1.15.3. 
-- # https://www.amd.com/system/files/TechDocs/54945_3.03_ppr_ZP_B2_pub.zip) # explains that FLOP events require MergeEvent support, which was included # in the 5.6 kernel. # Hence, a kernel version 5.6 or greater is required. # NOTE: without the MergeEvent support in the kernel, there is no guarantee # that this SSE/AVX FLOP event produces any useful data whatsoever. PRESET,PAPI_FP_OPS,NOT_DERIVED,RETIRED_SSE_AVX_FLOPS:ANY # Since FP_OPS counts both single- and double-prec operations # correctly, we don't need to confuse the user with additional # DP_OPS and SP_OPS events. So, I'm taking them out. #PRESET,PAPI_DP_OPS,NOT_DERIVED,RETIRED_SSE_AVX_FLOPS:ANY #PRESET,PAPI_SP_OPS,NOT_DERIVED,RETIRED_SSE_AVX_FLOPS:ANY # # Floating-point instructions (including non-numeric floating-point instructions, # e.g. Move or Merge Scalar Double-Precision Floating-Point values) PRESET,PAPI_FP_INS,NOT_DERIVED,RETIRED_MMX_FP_INSTRUCTIONS:SSE_INSTR:MMX_INSTR:X87_INSTR # Since FP_INS counts both single- and double-prec instructions # correctly, we don't need to confuse the user with additional # VEC_DP and VEC_SP events. So, I'm taking them out.
#PRESET,PAPI_VEC_DP,NOT_DERIVED,RETIRED_MMX_FP_INSTRUCTIONS:SSE_INSTR:MMX_INSTR:X87_INSTR #PRESET,PAPI_VEC_SP,NOT_DERIVED,RETIRED_MMX_FP_INSTRUCTIONS:SSE_INSTR:MMX_INSTR:X87_INSTR # # CPU,amd64_fam19h_zen3 PRESET,PAPI_TOT_INS,NOT_DERIVED,RETIRED_INSTRUCTIONS PRESET,PAPI_TOT_CYC,NOT_DERIVED,CYCLES_NOT_IN_HALT PRESET,PAPI_BR_INS,NOT_DERIVED,RETIRED_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_CN,NOT_DERIVED,RETIRED_CONDITIONAL_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_UCN,DERIVED_SUB,RETIRED_BRANCH_INSTRUCTIONS,RETIRED_CONDITIONAL_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_TKN,DERIVED_POSTFIX,N0|N1|-|N2|+|,RETIRED_TAKEN_BRANCH_INSTRUCTIONS,RETIRED_BRANCH_INSTRUCTIONS,RETIRED_CONDITIONAL_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_NTK,DERIVED_SUB,RETIRED_BRANCH_INSTRUCTIONS,RETIRED_TAKEN_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_MSP,NOT_DERIVED,RETIRED_BRANCH_INSTRUCTIONS_MISPREDICTED PRESET,PAPI_BR_PRC,DERIVED_SUB,RETIRED_CONDITIONAL_BRANCH_INSTRUCTIONS,RETIRED_BRANCH_INSTRUCTIONS_MISPREDICTED PRESET,PAPI_TLB_DM,NOT_DERIVED, L1_DTLB_MISS:TLB_RELOAD_1G_L2_MISS:TLB_RELOAD_2M_L2_MISS:TLB_RELOAD_COALESCED_PAGE_MISS:TLB_RELOAD_4K_L2_MISS:TLB_RELOAD_1G_L2_HIT:TLB_RELOAD_2M_L2_HIT:TLB_RELOAD_COALESCED_PAGE_HIT:TLB_RELOAD_4K_L2_HIT PRESET,PAPI_TLB_IM,DERIVED_ADD,L1_ITLB_MISS_L2_ITLB_HIT,L1_ITLB_MISS_L2_ITLB_MISS:COALESCED4K:IF1G:IF2M:IF4K PRESET,PAPI_L1_DCA,NOT_DERIVED,LS_DISPATCH:LD_ST_DISPATCH:STORE_DISPATCH:LD_DISPATCH PRESET,PAPI_L1_DCM,NOT_DERIVED,REQUESTS_TO_L2_GROUP1:RD_BLK_L:RD_BLK_X:LS_RD_BLK_C_S:CHANGE_TO_X PRESET,PAPI_L2_DCM,NOT_DERIVED,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:LS_RD_BLK_C PRESET,PAPI_L2_DCR,NOT_DERIVED,REQUESTS_TO_L2_GROUP1:RD_BLK_L:RD_BLK_X:LS_RD_BLK_C_S:CHANGE_TO_X PRESET,PAPI_L2_DCH,NOT_DERIVED,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:LS_RD_BLK_C_S:LS_RD_BLK_L_HIT_X:LS_RD_BLK_L_HIT_S:LS_RD_BLK_X PRESET,PAPI_L2_ICR,NOT_DERIVED,REQUESTS_TO_L2_GROUP1:CACHEABLE_IC_READ PRESET,PAPI_L2_ICA,NOT_DERIVED,REQUESTS_TO_L2_GROUP1:CACHEABLE_IC_READ 
PRESET,PAPI_L2_ICM,NOT_DERIVED,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:IC_FILL_MISS PRESET,PAPI_L2_ICH,NOT_DERIVED,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:IC_FILL_HIT_X:IC_FILL_HIT_S # RETIRED_SSE_AVX_FLOPS requires MergeEvent support. PRESET,PAPI_VEC_INS,NOT_DERIVED,RETIRED_SSE_AVX_FLOPS:ANY PRESET,PAPI_FP_INS,NOT_DERIVED,RETIRED_SSE_AVX_FLOPS:ANY PRESET,PAPI_FP_OPS,NOT_DERIVED,RETIRED_SSE_AVX_FLOPS:ANY PRESET,PAPI_FML_INS,NOT_DERIVED,RETIRED_SSE_AVX_FLOPS:MULT_FLOPS PRESET,PAPI_FAD_INS,NOT_DERIVED,RETIRED_SSE_AVX_FLOPS:ADD_SUB_FLOPS PRESET,PAPI_FDV_INS,NOT_DERIVED,RETIRED_SSE_AVX_FLOPS:DIV_FLOPS PRESET,PAPI_FSQ_INS,NOT_DERIVED,RETIRED_SSE_AVX_FLOPS:DIV_FLOPS # # CPU,amd64_fam19h_zen4 CPU,amd64_fam1ah_zen5 PRESET,PAPI_BR_INS,NOT_DERIVED,RETIRED_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_CN,NOT_DERIVED,RETIRED_CONDITIONAL_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_UCN,NOT_DERIVED,RETIRED_UNCONDITIONAL_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_TKN,DERIVED_SUB,RETIRED_TAKEN_BRANCH_INSTRUCTIONS,RETIRED_UNCONDITIONAL_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_NTK,DERIVED_SUB,RETIRED_BRANCH_INSTRUCTIONS,RETIRED_TAKEN_BRANCH_INSTRUCTIONS PRESET,PAPI_BR_MSP,NOT_DERIVED,RETIRED_BRANCH_INSTRUCTIONS_MISPREDICTED PRESET,PAPI_BR_PRC,DERIVED_SUB,RETIRED_CONDITIONAL_BRANCH_INSTRUCTIONS,RETIRED_BRANCH_INSTRUCTIONS_MISPREDICTED PRESET,PAPI_FP_OPS,NOT_DERIVED,RETIRED_SSE_AVX_FLOPS:ANY PRESET,PAPI_FP_INS,DERIVED_ADD,RETIRED_FP_OPS_BY_TYPE:VECTOR_ALL,RETIRED_FP_OPS_BY_TYPE:SCALAR_ALL PRESET,PAPI_VEC_INS,NOT_DERIVED,RETIRED_FP_OPS_BY_TYPE:VECTOR_ALL PRESET,PAPI_FMA_INS,DERIVED_ADD,RETIRED_FP_OPS_BY_TYPE:VECTOR_MAC,RETIRED_FP_OPS_BY_TYPE:SCALAR_MAC PRESET,PAPI_FML_INS,DERIVED_ADD,RETIRED_FP_OPS_BY_TYPE:VECTOR_MUL,RETIRED_FP_OPS_BY_TYPE:SCALAR_MUL PRESET,PAPI_FAD_INS,DERIVED_ADD,RETIRED_FP_OPS_BY_TYPE:VECTOR_ADD,RETIRED_FP_OPS_BY_TYPE:SCALAR_ADD PRESET,PAPI_FDV_INS,DERIVED_ADD,RETIRED_FP_OPS_BY_TYPE:VECTOR_DIV,RETIRED_FP_OPS_BY_TYPE:SCALAR_DIV 
PRESET,PAPI_FSQ_INS,DERIVED_ADD,RETIRED_FP_OPS_BY_TYPE:VECTOR_SQRT,RETIRED_FP_OPS_BY_TYPE:SCALAR_SQRT PRESET,PAPI_TOT_INS,NOT_DERIVED,RETIRED_INSTRUCTIONS PRESET,PAPI_TOT_CYC,NOT_DERIVED,CYCLES_NOT_IN_HALT PRESET,PAPI_TLB_DM,NOT_DERIVED,L1_DTLB_MISS:TLB_RELOAD_1G_L2_MISS:TLB_RELOAD_2M_L2_MISS:TLB_RELOAD_COALESCED_PAGE_MISS:TLB_RELOAD_4K_L2_MISS:TLB_RELOAD_1G_L2_HIT:TLB_RELOAD_2M_L2_HIT:TLB_RELOAD_COALESCED_PAGE_HIT:TLB_RELOAD_4K_L2_HIT PRESET,PAPI_L1_DCA,NOT_DERIVED,LS_DISPATCH:LD_ST_DISPATCH:STORE_DISPATCH:LD_DISPATCH PRESET,PAPI_L2_DCM,NOT_DERIVED,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:LS_RD_BLK_C PRESET,PAPI_L2_DCH,NOT_DERIVED,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:LS_RD_BLK_C_S:LS_RD_BLK_L_HIT_X:LS_RD_BLK_L_HIT_S:LS_RD_BLK_X PRESET,PAPI_TLB_IM,DERIVED_ADD,L1_ITLB_MISS_L2_ITLB_HIT,L1_ITLB_MISS_L2_ITLB_MISS:COALESCED4K:IF1G:IF2M:IF4K PRESET,PAPI_L2_ICR,NOT_DERIVED,REQUESTS_TO_L2_GROUP1:CACHEABLE_IC_READ PRESET,PAPI_L2_ICA,NOT_DERIVED,REQUESTS_TO_L2_GROUP1:CACHEABLE_IC_READ:L2_HW_PF PRESET,PAPI_L2_ICM,DERIVED_SUB,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:IC_FILL_MISS PRESET,PAPI_L2_ICH,NOT_DERIVED,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:IC_FILL_HIT_X:IC_FILL_HIT_S PRESET,PAPI_L2_TCH,NOT_DERIVED,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:LS_RD_BLK_C_S:LS_RD_BLK_L_HIT_X:LS_RD_BLK_L_HIT_S:LS_RD_BLK_X:IC_FILL_HIT_X:IC_FILL_HIT_S PRESET,PAPI_L2_TCM,DERIVED_ADD,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:LS_RD_BLK_C,CORE_TO_L2_CACHEABLE_REQUEST_ACCESS_STATUS:IC_FILL_MISS # CPU,amd64_fam19h_zen4 PRESET,PAPI_L1_DCM,NOT_DERIVED,REQUESTS_TO_L2_GROUP1:RD_BLK_L:RD_BLK_X:LS_RD_BLK_C_S:CHANGE_TO_X PRESET,PAPI_L2_DCR,NOT_DERIVED,REQUESTS_TO_L2_GROUP1:RD_BLK_L:RD_BLK_X:LS_RD_BLK_C_S:CHANGE_TO_X # CPU,amd64_fam1ah_zen5 CPU,amd64_fam1ah_zen5_l3 PRESET,PAPI_L1_DCM,NOT_DERIVED,REQUESTS_TO_L2_GROUP1:RD_BLK_L:RD_BLK_X:LS_RD_BLK_C_S PRESET,PAPI_L2_DCR,NOT_DERIVED,REQUESTS_TO_L2_GROUP1:RD_BLK_L:RD_BLK_X:LS_RD_BLK_C_S # TODO: Investigate additional L3-related Zen5 presets. 
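The DERIVED_POSTFIX entries above (e.g. the PAPI_L1_TCH and PAPI_BR_TKN presets) encode an arithmetic combination of native events in reverse Polish notation: tokens are separated by `|`, `Nk` stands for the value of the k-th native event listed after the expression, and `+`/`-` operate on a stack. A minimal sketch of how such an expression could be evaluated — the helper name and the sample counter values are illustrative, not part of PAPI:

```python
def eval_derived_postfix(expr: str, counts: list) -> int:
    """Evaluate a DERIVED_POSTFIX-style expression over native counter values.

    Tokens are '|'-separated; 'Nk' pushes counts[k], '+' and '-' pop
    two operands and push the result. A trailing '|' yields an empty
    token, which is skipped. (Illustrative helper, not PAPI's own code.)
    """
    stack = []
    for tok in expr.split("|"):
        if not tok:
            continue                      # skip empty token from trailing '|'
        if tok.startswith("N"):
            stack.append(counts[int(tok[1:])])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if tok == "+" else a - b)
    return stack.pop()

# PAPI_L1_TCH expression: N0|N1|+|N2|-|N3|-|
# with N0=DATA_CACHE_ACCESSES, N1=INSTRUCTION_CACHE_FETCHES,
#      N2=DATA_CACHE_MISSES,   N3=INSTRUCTION_CACHE_MISSES
hits = eval_derived_postfix("N0|N1|+|N2|-|N3|-|", [1000, 400, 50, 10])
# → 1340
```

Under this reading, `N0|N1|+|N2|-|N3|-|` computes `((N0 + N1) - N2) - N3`, i.e. total L1 accesses minus total L1 misses, which is why it serves as the total-cache-hit preset.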
CPU,Intel architectural PMU
CPU,ix86arch
#
PRESET,PAPI_BR_INS,NOT_DERIVED,BRANCH_INSTRUCTIONS_RETIRED
PRESET,PAPI_BR_MSP,NOT_DERIVED,RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS
#
# Intel Atom
CPU,Intel Atom
CPU,atom
#
PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTRUCTIONS_RETIRED
PRESET,PAPI_TOT_CYC,NOT_DERIVED,UNHALTED_CORE_CYCLES
PRESET,PAPI_REF_CYC,NOT_DERIVED,UNHALTED_REFERENCE_CYCLES
PRESET,PAPI_L1_ICM,NOT_DERIVED,ICACHE:MISSES
PRESET,PAPI_L1_DCM,DERIVED_SUB,L2_RQSTS:SELF:MESI,ICACHE:MISSES
PRESET,PAPI_L1_ICA,NOT_DERIVED,ICACHE:ACCESSES
PRESET,PAPI_L1_ICH,DERIVED_SUB,ICACHE:ACCESSES,ICACHE:MISSES
#PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_CACHE:LD:ST
PRESET,PAPI_L1_DCA,DERIVED_ADD,L1D_CACHE:LD,L1D_CACHE:ST
PRESET,PAPI_L1_TCM,NOT_DERIVED,L2_RQSTS:SELF:MESI
PRESET,PAPI_L1_LDM,NOT_DERIVED,L2_LD:SELF:ANY:MESI
PRESET,PAPI_L1_STM,NOT_DERIVED,L2_ST:SELF:MESI
PRESET,PAPI_L2_DCM,DERIVED_SUB,L2_LINES_IN:SELF:ANY,BUS_TRANS_IFETCH:SELF
PRESET,PAPI_L2_ICM,NOT_DERIVED,BUS_TRANS_IFETCH:SELF
PRESET,PAPI_L2_TCM,NOT_DERIVED,L2_LINES_IN:SELF:ANY
PRESET,PAPI_L2_LDM,DERIVED_SUB,L2_LINES_IN:SELF:ANY,L2_M_LINES_IN:SELF
PRESET,PAPI_L2_STM,NOT_DERIVED,L2_M_LINES_IN:SELF
PRESET,PAPI_L2_DCA,DERIVED_ADD,L2_LD:SELF:ANY:MESI,L2_ST:SELF:MESI
PRESET,PAPI_L2_DCR,NOT_DERIVED,L2_LD:SELF:ANY:MESI
PRESET,PAPI_L2_DCW,NOT_DERIVED,L2_ST:SELF:MESI
PRESET,PAPI_L2_ICH,DERIVED_SUB,L2_IFETCH:SELF:MESI,BUS_TRANS_IFETCH:SELF
PRESET,PAPI_L2_ICA,NOT_DERIVED,L2_IFETCH:SELF:MESI
PRESET,PAPI_L2_TCH,DERIVED_SUB,L2_RQSTS:SELF:ANY:MESI,L2_LINES_IN:SELF:ANY
PRESET,PAPI_L2_TCA,NOT_DERIVED,L2_RQSTS:SELF:ANY:MESI
PRESET,PAPI_L2_TCR,DERIVED_ADD,L2_LD:SELF:ANY:MESI,L2_IFETCH:SELF:MESI
PRESET,PAPI_L2_TCW,NOT_DERIVED,L2_ST:SELF:MESI
#
PRESET,PAPI_CA_SNP,NOT_DERIVED,EXT_SNOOP:SELF:MESI
PRESET,PAPI_CA_SHR,NOT_DERIVED,L2_RQSTS:SELF:ANY:S_STATE
PRESET,PAPI_CA_CLN,NOT_DERIVED,BUS_TRANS_RFO:SELF
PRESET,PAPI_CA_ITV,NOT_DERIVED,BUS_TRANS_INVAL:SELF
#
PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB:MISSES
PRESET,PAPI_TLB_DM,NOT_DERIVED,DATA_TLB_MISSES:DTLB_MISS
#
PRESET,PAPI_BR_TKN,NOT_DERIVED,BR_INST_RETIRED:TAKEN
PRESET,PAPI_BR_NTK,NOT_DERIVED,BR_INST_RETIRED:PRED_NOT_TAKEN:MISPRED_NOT_TAKEN
PRESET,PAPI_BR_INS,NOT_DERIVED,BRANCH_INSTRUCTIONS_RETIRED
PRESET,PAPI_BR_MSP,NOT_DERIVED,MISPREDICTED_BRANCH_RETIRED
#
PRESET,PAPI_TOT_IIS,NOT_DERIVED,MACRO_INSTS:ALL_DECODED
PRESET,PAPI_HW_INT,NOT_DERIVED,HW_INT_RCV
#PRESET,PAPI_RES_STL,NOT_DERIVED,RESOURCE_STALLS:ANY
#
#PRESET,PAPI_FP_INS,NOT_DERIVED,X87_COMP_OPS_EXE:ANY_AR
PRESET,PAPI_FP_INS,NOT_DERIVED,SIMD_INST_RETIRED:ANY
#PRESET,PAPI_FP_OPS,NOT_DERIVED,X87_COMP_OPS_EXE:ANY_AR
#PRESET,PAPI_FP_OPS,NOT_DERIVED,SIMD_UOPS_EXEC:AR
PRESET,PAPI_FP_OPS,DERIVED_ADD,SIMD_INST_RETIRED:ANY,X87_COMP_OPS_EXE:ANY_AR
PRESET,PAPI_FML_INS,NOT_DERIVED,MUL:AR
PRESET,PAPI_FDV_INS,NOT_DERIVED,DIV:AR
PRESET,PAPI_VEC_INS,NOT_DERIVED,SIMD_INST_RETIRED:VECTOR
#
# Intel Atom Silvermont
CPU,slm
PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTRUCTIONS_RETIRED
PRESET,PAPI_TOT_CYC,NOT_DERIVED,UNHALTED_CORE_CYCLES
PRESET,PAPI_REF_CYC,NOT_DERIVED,UNHALTED_REFERENCE_CYCLES
PRESET,PAPI_L1_ICM,NOT_DERIVED,ICACHE:MISSES
PRESET,PAPI_L1_ICA,NOT_DERIVED,ICACHE:ACCESSES
PRESET,PAPI_L1_ICH,DERIVED_SUB,ICACHE:ACCESSES,ICACHE:MISSES
PRESET,PAPI_L1_TCM,NOT_DERIVED,LLC_REFERENCES
PRESET,PAPI_L2_TCM,NOT_DERIVED,LLC_MISSES
PRESET,PAPI_L2_TCH,DERIVED_SUB,LLC_REFERENCES,LLC_MISSES
PRESET,PAPI_L2_TCA,NOT_DERIVED,LLC_REFERENCES
#
PRESET,PAPI_BR_CN,NOT_DERIVED,BR_INST_RETIRED:JCC
PRESET,PAPI_BR_INS,NOT_DERIVED,BRANCH_INSTRUCTIONS_RETIRED
PRESET,PAPI_BR_MSP,NOT_DERIVED,MISPREDICTED_BRANCH_RETIRED
#
PRESET,PAPI_RES_STL,NOT_DERIVED,UOPS_RETIRED:STALLS
#
#PRESET,PAPI_FP_INS,NOT_DERIVED,UOPS_RETIRED:X87
PRESET,PAPI_FML_INS,NOT_DERIVED,UOPS_RETIRED:MUL
PRESET,PAPI_FDV_INS,NOT_DERIVED,UOPS_RETIRED:DIV
#
CPU,Intel Nehalem
CPU,Intel Westmere
CPU,nhm
CPU,nhm_ex
CPU,wsm
CPU,wsm_dp
#
PRESET,PAPI_TOT_CYC,NOT_DERIVED,UNHALTED_CORE_CYCLES
PRESET,PAPI_REF_CYC,NOT_DERIVED,UNHALTED_REFERENCE_CYCLES
PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTRUCTION_RETIRED
PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I:MISSES
PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I:READS
PRESET,PAPI_L1_ICH,NOT_DERIVED,L1I:HITS
PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D:REPL
#PRESET,PAPI_L1_TCM,NOT_DERIVED,L2_RQSTS:SELF:MESI
#PRESET,PAPI_L1_LDM,NOT_DERIVED,L2_LD:SELF:ANY:MESI
#PRESET,PAPI_L1_STM,NOT_DERIVED,L2_ST:SELF:MESI
# OLD VALUE PRESET,PAPI_L2_DCM,DERIVED_SUB,L2_RQSTS:MISS,L2_RQSTS:IFETCH_MISS
PRESET,PAPI_L2_DCM,DERIVED_ADD,L2_RQSTS:LD_MISS,L2_RQSTS:RFO_MISS
PRESET,PAPI_L2_ICM,NOT_DERIVED,L2_RQSTS:IFETCH_MISS
# OLD VALUE PRESET,PAPI_L2_TCM,NOT_DERIVED,L2_RQSTS:MISS
PRESET,PAPI_L2_TCM,NOT_DERIVED,LAST_LEVEL_CACHE_REFERENCES
PRESET,PAPI_L2_LDM,NOT_DERIVED,L2_RQSTS:LD_MISS
#PRESET,PAPI_L2_STM,NOT_DERIVED,L2_M_LINES_IN:SELF
# OLD VALUE PRESET,PAPI_L2_DCA,NOT_DERIVED,L2_DATA_RQSTS:ANY
PRESET,PAPI_L2_DCA,NOT_DERIVED,L1D:REPL
# OLD VALUE PRESET,PAPI_L2_DCR,DERIVED_SUB,L2_RQSTS:LOADS,L2_RQSTS:IFETCHES
PRESET,PAPI_L2_DCR,NOT_DERIVED,L2_RQSTS:LOADS
#PRESET,PAPI_L2_DCW,NOT_DERIVED,L2_ST:SELF:MESI
PRESET,PAPI_L2_ICH,NOT_DERIVED,L2_RQSTS:IFETCH_HIT
PRESET,PAPI_L2_ICA,NOT_DERIVED,L2_RQSTS:IFETCHES
PRESET,PAPI_L2_TCH,DERIVED_SUB,L2_RQSTS:REFERENCES, L2_RQSTS:MISS
PRESET,PAPI_L2_TCA,NOT_DERIVED,L2_RQSTS:REFERENCES
# OLD VALUE PRESET,PAPI_L2_TCR,NOT_DERIVED,L2_RQSTS:LOADS
PRESET,PAPI_L2_TCR,DERIVED_ADD,L2_RQSTS:LOADS,L2_RQSTS:IFETCHES
#PRESET,PAPI_L2_TCW,NOT_DERIVED,L2_ST:SELF:MESI
#
PRESET,PAPI_L1_ICR,NOT_DERIVED,L1I:READS
PRESET,PAPI_L1_LDM,NOT_DERIVED,L2_RQSTS:LOADS
PRESET,PAPI_L1_STM,NOT_DERIVED,L2_WRITE:RFO_MESI
PRESET,PAPI_L1_TCM,DERIVED_SUB,L2_RQSTS:REFERENCES,L2_RQSTS:PREFETCHES
PRESET,PAPI_L2_DCH,DERIVED_ADD,L2_RQSTS:LD_HIT,L2_RQSTS:RFO_HIT
PRESET,PAPI_L2_DCW,NOT_DERIVED,L2_WRITE:RFO_MESI
PRESET,PAPI_L2_ICR,NOT_DERIVED,L2_RQSTS:IFETCHES
PRESET,PAPI_L2_STM,NOT_DERIVED,L2_RQSTS:RFO_MISS
PRESET,PAPI_L2_TCW,NOT_DERIVED,L2_RQSTS:RFOS
PRESET,PAPI_L3_DCA,DERIVED_ADD,L2_RQSTS:LD_MISS,L2_RQSTS:RFO_MISS
PRESET,PAPI_L3_DCR,NOT_DERIVED,L2_RQSTS:LD_MISS
PRESET,PAPI_L3_DCW,NOT_DERIVED,L2_RQSTS:RFO_MISS
PRESET,PAPI_L3_ICA,NOT_DERIVED,L2_RQSTS:IFETCH_MISS
PRESET,PAPI_L3_ICR,NOT_DERIVED,L2_RQSTS:IFETCH_MISS
PRESET,PAPI_L3_LDM,NOT_DERIVED,MEM_LOAD_RETIRED:L3_MISS
PRESET,PAPI_L3_TCA,NOT_DERIVED,LAST_LEVEL_CACHE_REFERENCES
PRESET,PAPI_L3_TCM,NOT_DERIVED,LAST_LEVEL_CACHE_MISSES
PRESET,PAPI_L3_TCR,DERIVED_ADD,L2_RQSTS:LD_MISS,L2_RQSTS:IFETCH_MISS
PRESET,PAPI_L3_TCW,NOT_DERIVED,L2_RQSTS:RFO_MISS
PRESET,PAPI_LST_INS,DERIVED_ADD,MEM_INST_RETIRED:LOADS,MEM_INST_RETIRED:STORES
#
PRESET,PAPI_LD_INS,NOT_DERIVED,MEM_INST_RETIRED:LOADS
PRESET,PAPI_SR_INS,NOT_DERIVED,MEM_INST_RETIRED:STORES
#
#PRESET,PAPI_CA_SHR,NOT_DERIVED,L2_RQSTS:SELF:ANY:S_STATE
#PRESET,PAPI_CA_CLN,NOT_DERIVED,BUS_TRANS_RFO:SELF
#PRESET,PAPI_CA_ITV,NOT_DERIVED,BUS_TRANS_INVAL:SELF
#
PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB_MISSES:ANY
PRESET,PAPI_TLB_DM,NOT_DERIVED,DTLB_MISSES:ANY
PRESET,PAPI_TLB_TL,DERIVED_ADD,ITLB_MISSES:ANY, DTLB_MISSES:ANY
#
PRESET,PAPI_BR_TKN,NOT_DERIVED,BR_INST_EXEC:TAKEN
PRESET,PAPI_BR_NTK,DERIVED_SUB,BR_INST_EXEC:ANY, BR_INST_EXEC:TAKEN
PRESET,PAPI_BR_INS,NOT_DERIVED,BR_INST_EXEC:ANY
PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MISP_EXEC:ANY
PRESET,PAPI_BR_CN,NOT_DERIVED,BR_INST_EXEC:COND
PRESET,PAPI_BR_UCN,NOT_DERIVED,BR_INST_EXEC:DIRECT
PRESET,PAPI_BR_PRC,DERIVED_SUB,BR_INST_EXEC:COND, BR_MISP_EXEC:COND
#
PRESET,PAPI_TOT_IIS,NOT_DERIVED,MACRO_INSTS:DECODED
PRESET,PAPI_RES_STL,NOT_DERIVED,RESOURCE_STALLS:ANY
#
PRESET,PAPI_FP_INS,NOT_DERIVED,FP_COMP_OPS_EXE:SSE_FP
# PRESET,PAPI_FP_OPS,NOT_DERIVED,FP_COMP_OPS_EXE:SSE_FP
# PAPI_FP_OPS counts single and double precision SCALAR operations
# PRESET,PAPI_FP_OPS,NOT_DERIVED,FP_COMP_OPS_EXE:SSE_SINGLE_PRECISION:SSE_DOUBLE_PRECISION
# According to Stephane (Jan 2010), it's not allowed to combine unit masks for FP_COMP_OPS_EXE;
# we have to use two counters instead
#PRESET,PAPI_FP_OPS,DERIVED_ADD,FP_COMP_OPS_EXE:SSE_SINGLE_PRECISION,FP_COMP_OPS_EXE:SSE_DOUBLE_PRECISION
PRESET,PAPI_FP_OPS,DERIVED_ADD,FP_COMP_OPS_EXE:SSE_FP,FP_COMP_OPS_EXE:X87
# PAPI_SP_OPS = single precision scalar ops + 3 * packed ops
PRESET,PAPI_SP_OPS,DERIVED_POSTFIX,N0|N1|3|*|+|,FP_COMP_OPS_EXE:SSE_SINGLE_PRECISION,FP_COMP_OPS_EXE:SSE_FP_PACKED
PRESET,PAPI_DP_OPS,DERIVED_ADD,FP_COMP_OPS_EXE:SSE_DOUBLE_PRECISION,FP_COMP_OPS_EXE:SSE_FP_PACKED
PRESET,PAPI_VEC_SP,NOT_DERIVED,FP_COMP_OPS_EXE:SSE_FP_PACKED
PRESET,PAPI_VEC_DP,NOT_DERIVED,FP_COMP_OPS_EXE:SSE_FP_PACKED
#PRESET,PAPI_FML_INS,NOT_DERIVED,MUL
#PRESET,PAPI_FDV_INS,NOT_DERIVED,DIV
#PRESET,PAPI_VEC_INS,NOT_DERIVED,SIMD_INST_RETIRED:VECTOR
#
# Not available on Westmere
#
CPU,Intel Nehalem
CPU,nhm
CPU,nhm_ex
#PRESET,PAPI_HW_INT,NOT_DERIVED,HW_INT:RCV
PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_ALL_REF:ANY
PRESET,PAPI_L1_DCH,DERIVED_SUB,L1D_ALL_REF:ANY,L1D:REPL
PRESET,PAPI_L1_TCA,DERIVED_ADD,L1D_ALL_REF:ANY,L1I:READS
#
PRESET,PAPI_L1_DCR,NOT_DERIVED,L1D_CACHE_LD:MESI
PRESET,PAPI_L1_DCW,NOT_DERIVED,L1D_CACHE_ST:MESI
PRESET,PAPI_L1_TCR,DERIVED_ADD,L1D_CACHE_LD:MESI,L1I:READS
PRESET,PAPI_L2_TCW,NOT_DERIVED,L1D_CACHE_ST:MESI
#
# Intel SandyBridge and IvyBridge
CPU,snb
CPU,snb_ep
CPU,ivb
CPU,ivb_ep
#
PRESET,PAPI_TOT_CYC,NOT_DERIVED,UNHALTED_CORE_CYCLES
PRESET,PAPI_REF_CYC,NOT_DERIVED,UNHALTED_REFERENCE_CYCLES
PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTRUCTION_RETIRED
#
PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D:REPLACEMENT
PRESET,PAPI_L1_LDM,NOT_DERIVED,L2_RQSTS:ALL_DEMAND_DATA_RD
PRESET,PAPI_L1_STM,NOT_DERIVED,L2_STORE_LOCK_RQSTS:ALL
PRESET,PAPI_L1_ICM,NOT_DERIVED,ICACHE:MISSES
PRESET,PAPI_L1_TCM,DERIVED_ADD,ICACHE:MISSES,L1D:REPLACEMENT
#
PRESET,PAPI_L2_DCM,DERIVED_SUB,LAST_LEVEL_CACHE_REFERENCES,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L2_STM,NOT_DERIVED,L2_RQSTS:RFO_MISS
PRESET,PAPI_L2_DCA,NOT_DERIVED,L1D:REPLACEMENT
PRESET,PAPI_L2_DCR,NOT_DERIVED,L2_RQSTS:ALL_DEMAND_DATA_RD
PRESET,PAPI_L2_DCW,NOT_DERIVED,L2_STORE_LOCK_RQSTS:ALL
PRESET,PAPI_L2_ICM,NOT_DERIVED,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L2_ICH,NOT_DERIVED,L2_RQSTS:CODE_RD_HIT
PRESET,PAPI_L2_ICA,NOT_DERIVED,L2_RQSTS:ALL_CODE_RD
PRESET,PAPI_L2_ICR,NOT_DERIVED,L2_RQSTS:ALL_CODE_RD
PRESET,PAPI_L2_TCM,NOT_DERIVED,LAST_LEVEL_CACHE_REFERENCES
PRESET,PAPI_L2_TCA,DERIVED_ADD,L1D:REPLACEMENT,L2_RQSTS:ALL_CODE_RD
PRESET,PAPI_L2_TCR,DERIVED_ADD,L2_RQSTS:ALL_DEMAND_DATA_RD,L2_RQSTS:ALL_CODE_RD
#
PRESET,PAPI_L3_DCA,DERIVED_SUB,LAST_LEVEL_CACHE_REFERENCES,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L3_DCR,NOT_DERIVED,OFFCORE_REQUESTS:DEMAND_DATA_RD
PRESET,PAPI_L3_DCW,NOT_DERIVED,L2_RQSTS:RFO_MISS
PRESET,PAPI_L3_ICA,NOT_DERIVED,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L3_ICR,NOT_DERIVED,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L3_TCA,NOT_DERIVED,LAST_LEVEL_CACHE_REFERENCES
PRESET,PAPI_L3_TCM,NOT_DERIVED,LAST_LEVEL_CACHE_MISSES
PRESET,PAPI_L3_TCR,DERIVED_SUB,LAST_LEVEL_CACHE_REFERENCES,L2_RQSTS:RFO_MISS
PRESET,PAPI_L3_TCW,NOT_DERIVED,L2_RQSTS:RFO_MISS
#
PRESET,PAPI_BR_NTK,NOT_DERIVED,BR_INST_RETIRED:NOT_TAKEN
PRESET,PAPI_BR_INS,NOT_DERIVED,BR_INST_RETIRED:ALL_BRANCHES
PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MISP_RETIRED:ALL_BRANCHES
#
PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB_MISSES:CAUSES_A_WALK
#
PRESET,PAPI_FDV_INS,NOT_DERIVED,ARITH:FPU_DIV
PRESET,PAPI_STL_ICY,NOT_DERIVED,ILD_STALL:IQ_FULL
PRESET,PAPI_LD_INS,NOT_DERIVED,MEM_UOP_RETIRED:ANY_LOADS
PRESET,PAPI_SR_INS,NOT_DERIVED,MEM_UOP_RETIRED:ANY_STORES
#
# Counts scalars only; no SSE or AVX is counted; includes speculative
PRESET,PAPI_FP_INS,DERIVED_ADD,FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE,FP_COMP_OPS_EXE:SSE_FP_SCALAR_SINGLE,FP_COMP_OPS_EXE:X87
PRESET,PAPI_FP_OPS,DERIVED_ADD,FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE,FP_COMP_OPS_EXE:SSE_FP_SCALAR_SINGLE,FP_COMP_OPS_EXE:X87
#
PRESET,PAPI_SP_OPS,DERIVED_POSTFIX,N0|N1|4|*|N2|8|*|+|+|,FP_COMP_OPS_EXE:SSE_FP_SCALAR_SINGLE,FP_COMP_OPS_EXE:SSE_PACKED_SINGLE,SIMD_FP_256:PACKED_SINGLE
PRESET,PAPI_DP_OPS,DERIVED_POSTFIX,N0|N1|2|*|N2|4|*|+|+|,FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE,FP_COMP_OPS_EXE:SSE_FP_PACKED_DOUBLE,SIMD_FP_256:PACKED_DOUBLE
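# Note on DERIVED_POSTFIX (explanatory comment; describes PAPI's documented
# Reverse Polish notation for derived presets): the formula field is a
# '|'-separated RPN expression in which N0, N1, ... stand for the counts of
# the native events listed after it, in order. For example, the expression
# N0|N1|4|*|N2|8|*|+|+| evaluates as N0 + (N1*4 + N2*8): each 128-bit packed
# single event contributes 4 FLOPs and each 256-bit packed event 8 FLOPs.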
PRESET,PAPI_VEC_SP,DERIVED_POSTFIX,N0|4|*|N1|8|*|+|,FP_COMP_OPS_EXE:SSE_PACKED_SINGLE,SIMD_FP_256:PACKED_SINGLE
PRESET,PAPI_VEC_DP,DERIVED_POSTFIX,N0|2|*|N1|4|*|+|,FP_COMP_OPS_EXE:SSE_FP_PACKED_DOUBLE,SIMD_FP_256:PACKED_DOUBLE
#
# Intel SandyBridge only
CPU,snb
CPU,snb_ep
#
PRESET,PAPI_L2_TCW,NOT_DERIVED,L2_RQSTS:RFO_ANY
PRESET,PAPI_L2_DCH,DERIVED_ADD,L2_RQSTS:ALL_DEMAND_RD_HIT,L2_RQSTS:RFO_HITS
PRESET,PAPI_BR_CN,NOT_DERIVED,BR_INST_RETIRED:CONDITIONAL
PRESET,PAPI_BR_UCN,DERIVED_SUB,BR_INST_RETIRED:ALL_BRANCHES,BR_INST_RETIRED:CONDITIONAL
PRESET,PAPI_BR_PRC,DERIVED_SUB,BR_INST_RETIRED:CONDITIONAL,BR_MISP_RETIRED:ALL_BRANCHES
PRESET,PAPI_BR_TKN,DERIVED_SUB,BR_INST_RETIRED:CONDITIONAL,BR_INST_RETIRED:NOT_TAKEN
PRESET,PAPI_TLB_DM,DERIVED_ADD,DTLB_LOAD_MISSES:CAUSES_A_WALK,DTLB_STORE_MISSES:CAUSES_A_WALK
#
# Intel IvyBridge only
CPU,ivb
CPU,ivb_ep
#
PRESET,PAPI_L2_TCW,NOT_DERIVED,L2_RQSTS:ALL_RFO
PRESET,PAPI_L2_DCH,DERIVED_ADD,L2_RQSTS:DEMAND_DATA_RD_HIT,L2_RQSTS:RFO_HIT
PRESET,PAPI_BR_CN,NOT_DERIVED,BR_INST_RETIRED:COND
PRESET,PAPI_BR_UCN,DERIVED_SUB,BR_INST_RETIRED:ALL_BRANCHES,BR_INST_RETIRED:COND
PRESET,PAPI_BR_PRC,DERIVED_SUB,BR_INST_RETIRED:COND,BR_MISP_RETIRED:ALL_BRANCHES
PRESET,PAPI_BR_TKN,DERIVED_SUB,BR_INST_RETIRED:COND,BR_INST_RETIRED:NOT_TAKEN
PRESET,PAPI_TLB_DM,DERIVED_ADD,DTLB_LOAD_MISSES:DEMAND_LD_MISS_CAUSES_A_WALK,DTLB_STORE_MISSES:CAUSES_A_WALK
#PRESET,PAPI_HW_INT,NOT_DERIVED,HW_INTERRUPTS
#
# Intel Haswell events
# Using also for Broadwell events, this is what the Linux kernel does
CPU,hsw
CPU,hsw_ep
CPU,bdw
CPU,bdw_ep
CPU,skl
# Note, libpfm4 treats Kaby Lake as just a form of skylake
CPU,kbl
CPU,skx
# Note, libpfm4 treats Cascade Lake-X as just a form of skylake-X
CPU,clx
PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_THREAD_UNHALTED:THREAD_P
PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED:ANY_P
PRESET,PAPI_REF_CYC,NOT_DERIVED,UNHALTED_REFERENCE_CYCLES
#PRESET,PAPI_REF_CYC,NOT_DERIVED,CPU_CLK_THREAD_UNHALTED:REF_XCLK
# Loads and stores
PRESET,PAPI_LD_INS,NOT_DERIVED,MEM_UOPS_RETIRED:ALL_LOADS
PRESET,PAPI_SR_INS,NOT_DERIVED,MEM_UOPS_RETIRED:ALL_STORES
PRESET,PAPI_LST_INS,DERIVED_ADD,MEM_UOPS_RETIRED:ALL_LOADS,MEM_UOPS_RETIRED:ALL_STORES
# L1 cache
#PRESET,PAPI_L1_TCH,NOT_DERIVED,MEM_LOAD_UOPS_RETIRED:L1_HIT
#PRESET,PAPI_L1_TCM,NOT_DERIVED,MEM_LOAD_UOPS_RETIRED:L1_MISS
PRESET,PAPI_L1_ICM,NOT_DERIVED,L2_RQSTS:ALL_CODE_RD
# Added by FMB
PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D:REPLACEMENT
PRESET,PAPI_L1_TCM,DERIVED_ADD,L1D:REPLACEMENT,L2_RQSTS:ALL_CODE_RD
# L2 cache
PRESET,PAPI_L2_DCA,NOT_DERIVED,L2_RQSTS:ALL_DEMAND_REFERENCES
# NOTE on IVB it is PRESET,PAPI_L2_DCA,NOT_DERIVED,L1D:REPLACEMENT
#PRESET,PAPI_L2_DCH,NOT_DERIVED,L2_RQSTS:DEMAND_DATA_RD_HIT
#PRESET,PAPI_L2_DCM,NOT_DERIVED,L2_RQSTS:DEMAND_DATA_RD_MISS
PRESET,PAPI_L2_DCR,NOT_DERIVED,L2_RQSTS:ALL_DEMAND_DATA_RD
PRESET,PAPI_L2_ICH,NOT_DERIVED,L2_RQSTS:CODE_RD_HIT
PRESET,PAPI_L2_ICM,NOT_DERIVED,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L2_ICR,NOT_DERIVED,L2_RQSTS:ALL_CODE_RD
#PRESET,PAPI_L2_TCA,NOT_DERIVED,L2_RQSTS:REFERENCES
#PRESET,PAPI_L2_TCH,NOT_DERIVED,MEM_LOAD_UOPS_RETIRED:L2_HIT
#PRESET,PAPI_L2_TCM,NOT_DERIVED,MEM_LOAD_UOPS_RETIRED:L2_MISS
# Added by FMB
PRESET,PAPI_L2_DCM,DERIVED_SUB,LLC_REFERENCES,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L2_ICA,NOT_DERIVED,L2_RQSTS:ALL_CODE_RD
#PRESET,PAPI_L2_LDH,NOT_DERIVED,L2_RQSTS:DEMAND_DATA_RD_HIT
PRESET,PAPI_L2_LDM,NOT_DERIVED,L2_RQSTS:DEMAND_DATA_RD_MISS
PRESET,PAPI_L2_STM,NOT_DERIVED,L2_RQSTS:DEMAND_RFO_MISS
PRESET,PAPI_L2_TCA,DERIVED_ADD,L2_RQSTS:ALL_DEMAND_REFERENCES,L2_RQSTS:ALL_CODE_RD
PRESET,PAPI_L2_TCM,NOT_DERIVED,LLC_REFERENCES
PRESET,PAPI_L2_TCR,DERIVED_ADD,L2_RQSTS:ALL_DEMAND_DATA_RD,L2_RQSTS:ALL_CODE_RD
# L3 cache
#PRESET,PAPI_L3_TCA,NOT_DERIVED,LONGEST_LAT_CACHE:REFERENCE
#PRESET,PAPI_L3_TCH,NOT_DERIVED,MEM_LOAD_UOPS_RETIRED:L3_HIT
#PRESET,PAPI_L3_TCM,NOT_DERIVED,MEM_LOAD_UOPS_RETIRED:L3_MISS
# Added by FMB
PRESET,PAPI_L3_DCA,DERIVED_SUB,LLC_REFERENCES,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L3_DCR,NOT_DERIVED,OFFCORE_REQUESTS:DEMAND_DATA_RD
PRESET,PAPI_L3_DCW,NOT_DERIVED,L2_RQSTS:DEMAND_RFO_MISS
PRESET,PAPI_L3_ICA,NOT_DERIVED,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L3_ICR,NOT_DERIVED,L2_RQSTS:CODE_RD_MISS
#PRESET,PAPI_L3_LDH,NOT_DERIVED,MEM_LOAD_UOPS_RETIRED:L3_HIT
PRESET,PAPI_L3_LDM,NOT_DERIVED,MEM_LOAD_UOPS_RETIRED:L3_MISS
PRESET,PAPI_L3_TCA,NOT_DERIVED,LLC_REFERENCES
PRESET,PAPI_L3_TCM,NOT_DERIVED,LLC_MISSES
PRESET,PAPI_L3_TCR,DERIVED_SUB,LLC_REFERENCES,L2_RQSTS:DEMAND_RFO_MISS
PRESET,PAPI_L3_TCW,NOT_DERIVED,L2_RQSTS:DEMAND_RFO_MISS
# SMP
PRESET,PAPI_CA_SNP,NOT_DERIVED,OFFCORE_RESPONSE_0:SNP_ANY
PRESET,PAPI_CA_SHR,NOT_DERIVED,OFFCORE_REQUESTS:ALL_DATA_RD
PRESET,PAPI_CA_CLN,NOT_DERIVED,OFFCORE_REQUESTS:DEMAND_RFO
# TLB
PRESET,PAPI_TLB_DM,DERIVED_ADD,DTLB_LOAD_MISSES:MISS_CAUSES_A_WALK,DTLB_STORE_MISSES:MISS_CAUSES_A_WALK
PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB_MISSES:MISS_CAUSES_A_WALK
# Stalls
PRESET,PAPI_MEM_WCY,NOT_DERIVED,RESOURCE_STALLS:SB
PRESET,PAPI_RES_STL,NOT_DERIVED,RESOURCE_STALLS:ANY
PRESET,PAPI_STL_CCY,NOT_DERIVED,UOPS_RETIRED:ALL:c=1:i=1
PRESET,PAPI_FUL_ICY,DERIVED_ADD,IDQ:ALL_DSB_CYCLES_4_UOPS,IDQ:ALL_MITE_CYCLES_4_UOPS
PRESET,PAPI_FUL_CCY,NOT_DERIVED,UOPS_RETIRED:ALL:c=4
# Branches
PRESET,PAPI_BR_UCN,DERIVED_SUB,BR_INST_RETIRED:ALL_BRANCHES,BR_INST_RETIRED:CONDITIONAL
PRESET,PAPI_BR_CN,NOT_DERIVED,BR_INST_RETIRED:CONDITIONAL
PRESET,PAPI_BR_TKN,DERIVED_SUB,BR_INST_RETIRED:CONDITIONAL,BR_INST_RETIRED:NOT_TAKEN
PRESET,PAPI_BR_NTK,NOT_DERIVED,BR_INST_RETIRED:NOT_TAKEN
PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MISP_RETIRED:CONDITIONAL
PRESET,PAPI_BR_PRC,DERIVED_SUB,BR_INST_RETIRED:CONDITIONAL,BR_MISP_RETIRED:CONDITIONAL
PRESET,PAPI_BR_INS,NOT_DERIVED,BR_INST_RETIRED:ALL_BRANCHES
CPU,hsw
CPU,hsw_ep
CPU,bdw
CPU,bdw_ep
PRESET,PAPI_L1_LDM,NOT_DERIVED,L2_TRANS:DEMAND_DATA_RD
PRESET,PAPI_L1_STM,NOT_DERIVED,L2_TRANS:L1D_WB
PRESET,PAPI_L2_DCW,NOT_DERIVED,L2_TRANS:RFO
PRESET,PAPI_L2_TCW,NOT_DERIVED,L2_TRANS:RFO
PRESET,PAPI_PRF_DM,NOT_DERIVED,L2_RQSTS:L2_PF_MISS
PRESET,PAPI_STL_ICY,NOT_DERIVED,IDQ:EMPTY
PRESET,PAPI_CA_ITV,NOT_DERIVED,OFFCORE_RESPONSE_0:SNP_FWD
CPU,hsw
CPU,hsw_ep
PRESET,PAPI_CA_INV,NOT_DERIVED,OFFCORE_RESPONSE_0:SNP_HITM
CPU,bdw
CPU,bdw_ep
PRESET,PAPI_CA_INV,NOT_DERIVED,OFFCORE_RESPONSE_0:HITM
# PAPI_DP_OPS = FP_ARITH:SCALAR_DOUBLE + 2*FP_ARITH:128B_PACKED_DOUBLE + 4*FP_ARITH:256B_PACKED_DOUBLE
PRESET,PAPI_DP_OPS,DERIVED_POSTFIX,N0|N1|2|*|+|N2|4|*|+|,FP_ARITH:SCALAR_DOUBLE,FP_ARITH:128B_PACKED_DOUBLE,FP_ARITH:256B_PACKED_DOUBLE
# PAPI_SP_OPS = FP_ARITH:SCALAR_SINGLE + 4*FP_ARITH:128B_PACKED_SINGLE + 8*FP_ARITH:256B_PACKED_SINGLE
PRESET,PAPI_SP_OPS,DERIVED_POSTFIX,N0|N1|4|*|+|N2|8|*|+|,FP_ARITH:SCALAR_SINGLE,FP_ARITH:128B_PACKED_SINGLE,FP_ARITH:256B_PACKED_SINGLE
PRESET,PAPI_VEC_DP,DERIVED_POSTFIX,N0|N1|N2|+|+|,FP_ARITH:SCALAR_DOUBLE,FP_ARITH:128B_PACKED_DOUBLE,FP_ARITH:256B_PACKED_DOUBLE
PRESET,PAPI_VEC_SP,DERIVED_POSTFIX,N0|N1|N2|+|+|,FP_ARITH:SCALAR_SINGLE,FP_ARITH:128B_PACKED_SINGLE,FP_ARITH:256B_PACKED_SINGLE
CPU,skl
CPU,skx
CPU,clx
# PAPI_DP_OPS = FP_ARITH:SCALAR_DOUBLE + 2*FP_ARITH:128B_PACKED_DOUBLE + 4*FP_ARITH:256B_PACKED_DOUBLE + 8*FP_ARITH:512B_PACKED_DOUBLE
PRESET,PAPI_DP_OPS,DERIVED_POSTFIX,N0|N1|2|*|+|N2|4|*|+|N3|8|*|+|,FP_ARITH:SCALAR_DOUBLE,FP_ARITH:128B_PACKED_DOUBLE,FP_ARITH:256B_PACKED_DOUBLE,FP_ARITH:512B_PACKED_DOUBLE
# PAPI_SP_OPS = FP_ARITH:SCALAR_SINGLE + 4*FP_ARITH:128B_PACKED_SINGLE + 8*FP_ARITH:256B_PACKED_SINGLE + 16*FP_ARITH:512B_PACKED_SINGLE
PRESET,PAPI_SP_OPS,DERIVED_POSTFIX,N0|N1|4|*|+|N2|8|*|+|N3|16|*|+|,FP_ARITH:SCALAR_SINGLE,FP_ARITH:128B_PACKED_SINGLE,FP_ARITH:256B_PACKED_SINGLE,FP_ARITH:512B_PACKED_SINGLE
PRESET,PAPI_VEC_DP,DERIVED_POSTFIX,N0|N1|N2|N3|+|+|+|,FP_ARITH:SCALAR_DOUBLE,FP_ARITH:128B_PACKED_DOUBLE,FP_ARITH:256B_PACKED_DOUBLE,FP_ARITH:512B_PACKED_DOUBLE
PRESET,PAPI_VEC_SP,DERIVED_POSTFIX,N0|N1|N2|N3|+|+|+|,FP_ARITH:SCALAR_SINGLE,FP_ARITH:128B_PACKED_SINGLE,FP_ARITH:256B_PACKED_SINGLE,FP_ARITH:512B_PACKED_SINGLE
PRESET,PAPI_L1_LDM,NOT_DERIVED,L2_RQSTS:ALL_DEMAND_DATA_RD
PRESET,PAPI_L1_STM,NOT_DERIVED,L2_RQSTS:ALL_RFO
PRESET,PAPI_L2_DCW,DERIVED_ADD,L2_RQSTS:DEMAND_RFO_HIT,L2_RQSTS:RFO_HIT
PRESET,PAPI_L2_TCW,DERIVED_ADD,L2_RQSTS:DEMAND_RFO_HIT,L2_RQSTS:RFO_HIT
PRESET,PAPI_PRF_DM,NOT_DERIVED,L2_RQSTS:PF_MISS
PRESET,PAPI_STL_ICY,NOT_DERIVED,IDQ_UOPS_NOT_DELIVERED:CYCLES_0_UOPS_DELIV_CORE
PRESET,PAPI_CA_ITV,NOT_DERIVED,OFFCORE_RESPONSE_0:SNP_HIT_WITH_FWD
# End of hsw,bdw,skl,clx list
#
# Intel Ice Lake SP events
CPU,icx
CPU,icl
PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED:THREAD_P
PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED:ANY_P
PRESET,PAPI_REF_CYC,NOT_DERIVED,UNHALTED_REFERENCE_CYCLES
# Loads and stores
PRESET,PAPI_LD_INS,NOT_DERIVED,MEM_INST_RETIRED:ALL_LOADS
PRESET,PAPI_SR_INS,NOT_DERIVED,MEM_INST_RETIRED:ALL_STORES
PRESET,PAPI_LST_INS,DERIVED_ADD,MEM_INST_RETIRED:ALL_LOADS,MEM_INST_RETIRED:ALL_STORES
# L1 cache
PRESET,PAPI_L1_ICM,NOT_DERIVED,L2_RQSTS:ALL_CODE_RD
PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D:REPLACEMENT
PRESET,PAPI_L1_TCM,DERIVED_ADD,L1D:REPLACEMENT,L2_RQSTS:ALL_CODE_RD
# L2 cache
PRESET,PAPI_L2_DCA,NOT_DERIVED,L2_RQSTS:ALL_DEMAND_REFERENCES
PRESET,PAPI_L2_DCR,NOT_DERIVED,L2_RQSTS:ALL_DEMAND_DATA_RD
PRESET,PAPI_L2_ICH,NOT_DERIVED,L2_RQSTS:CODE_RD_HIT
PRESET,PAPI_L2_ICM,NOT_DERIVED,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L2_ICR,NOT_DERIVED,L2_RQSTS:ALL_CODE_RD
#PRESET,PAPI_L2_TCH,NOT_DERIVED,MEM_LOAD_UOPS_RETIRED:L2_HIT
#PRESET,PAPI_L2_TCM,NOT_DERIVED,MEM_LOAD_UOPS_RETIRED:L2_MISS
PRESET,PAPI_L2_DCM,DERIVED_SUB,LLC_REFERENCES,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L2_ICA,NOT_DERIVED,L2_RQSTS:ALL_CODE_RD
#PRESET,PAPI_L2_LDH,NOT_DERIVED,L2_RQSTS:DEMAND_DATA_RD_HIT
PRESET,PAPI_L2_LDM,NOT_DERIVED,L2_RQSTS:DEMAND_DATA_RD_MISS
PRESET,PAPI_L2_TCA,DERIVED_ADD,L2_RQSTS:ALL_DEMAND_REFERENCES,L2_RQSTS:ALL_CODE_RD
PRESET,PAPI_L2_TCM,NOT_DERIVED,LLC_REFERENCES
PRESET,PAPI_L2_TCR,DERIVED_ADD,L2_RQSTS:ALL_DEMAND_DATA_RD,L2_RQSTS:ALL_CODE_RD
# L3 cache
PRESET,PAPI_L3_DCA,DERIVED_SUB,LLC_REFERENCES,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L3_DCR,NOT_DERIVED,OFFCORE_REQUESTS:DEMAND_DATA_RD
PRESET,PAPI_L3_ICA,NOT_DERIVED,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L3_ICR,NOT_DERIVED,L2_RQSTS:CODE_RD_MISS
#PRESET,PAPI_L3_LDH,NOT_DERIVED,MEM_LOAD_UOPS_RETIRED:L3_HIT
PRESET,PAPI_L3_LDM,NOT_DERIVED,MEM_LOAD_RETIRED:L3_MISS
PRESET,PAPI_L3_TCA,NOT_DERIVED,LLC_REFERENCES
PRESET,PAPI_L3_TCM,NOT_DERIVED,LLC_MISSES
# SMP
PRESET,PAPI_CA_SHR,NOT_DERIVED,OFFCORE_REQUESTS:ALL_DATA_RD
# Branches
PRESET,PAPI_BR_UCN,DERIVED_SUB,BR_INST_RETIRED:ALL_BRANCHES,BR_INST_RETIRED:COND
PRESET,PAPI_BR_CN,NOT_DERIVED,BR_INST_RETIRED:COND
PRESET,PAPI_BR_TKN,NOT_DERIVED,BR_INST_RETIRED:COND_TAKEN
PRESET,PAPI_BR_NTK,NOT_DERIVED,BR_INST_RETIRED:COND_NTAKEN
PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MISP_RETIRED:COND
PRESET,PAPI_BR_PRC,DERIVED_SUB,BR_INST_RETIRED:COND,BR_MISP_RETIRED:COND
PRESET,PAPI_BR_INS,NOT_DERIVED,BR_INST_RETIRED:ALL_BRANCHES
#FLOPs
# PAPI_DP_OPS = FP_ARITH:SCALAR_DOUBLE + 2*FP_ARITH:128B_PACKED_DOUBLE + 4*FP_ARITH:256B_PACKED_DOUBLE + 8*FP_ARITH:512B_PACKED_DOUBLE
PRESET,PAPI_DP_OPS,DERIVED_POSTFIX,N0|N1|2|*|+|N2|4|*|+|N3|8|*|+|,FP_ARITH:SCALAR_DOUBLE,FP_ARITH:128B_PACKED_DOUBLE,FP_ARITH:256B_PACKED_DOUBLE,FP_ARITH:512B_PACKED_DOUBLE
# PAPI_SP_OPS = FP_ARITH:SCALAR_SINGLE + 4*FP_ARITH:128B_PACKED_SINGLE + 8*FP_ARITH:256B_PACKED_SINGLE + 16*FP_ARITH:512B_PACKED_SINGLE
PRESET,PAPI_SP_OPS,DERIVED_POSTFIX,N0|N1|4|*|+|N2|8|*|+|N3|16|*|+|,FP_ARITH:SCALAR_SINGLE,FP_ARITH:128B_PACKED_SINGLE,FP_ARITH:256B_PACKED_SINGLE,FP_ARITH:512B_PACKED_SINGLE
PRESET,PAPI_FP_OPS,DERIVED_POSTFIX,N0|N1|4|*|+|N2|8|*|+|N3|16|*|+|N4|+|N5|2|*|+|N6|4|*|+|N7|8|*|+|,FP_ARITH_INST_RETIRED:SCALAR_SINGLE,FP_ARITH_INST_RETIRED:128B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:256B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:512B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:SCALAR_DOUBLE,FP_ARITH_INST_RETIRED:128B_PACKED_DOUBLE,FP_ARITH_INST_RETIRED:256B_PACKED_DOUBLE,FP_ARITH_INST_RETIRED:512B_PACKED_DOUBLE
PRESET,PAPI_FP_INS,DERIVED_POSTFIX,N0|N1|N2|N3|N4|N5|N6|N7|+|+|+|+|+|+|+|,FP_ARITH_INST_RETIRED:SCALAR_SINGLE,FP_ARITH_INST_RETIRED:128B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:256B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:512B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:SCALAR_DOUBLE,FP_ARITH_INST_RETIRED:128B_PACKED_DOUBLE,FP_ARITH_INST_RETIRED:256B_PACKED_DOUBLE,FP_ARITH_INST_RETIRED:512B_PACKED_DOUBLE
PRESET,PAPI_VEC_DP,DERIVED_POSTFIX,N0|N1|N2|N3|+|+|+|,FP_ARITH:SCALAR_DOUBLE,FP_ARITH:128B_PACKED_DOUBLE,FP_ARITH:256B_PACKED_DOUBLE,FP_ARITH:512B_PACKED_DOUBLE
PRESET,PAPI_VEC_SP,DERIVED_POSTFIX,N0|N1|N2|N3|+|+|+|,FP_ARITH:SCALAR_SINGLE,FP_ARITH:128B_PACKED_SINGLE,FP_ARITH:256B_PACKED_SINGLE,FP_ARITH:512B_PACKED_SINGLE
PRESET,PAPI_VEC_INS,DERIVED_POSTFIX,N0|N1|N2|N3|N4|N5|+|+|+|+|+|,FP_ARITH_INST_RETIRED:128B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:256B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:512B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:128B_PACKED_DOUBLE,FP_ARITH_INST_RETIRED:256B_PACKED_DOUBLE,FP_ARITH_INST_RETIRED:512B_PACKED_DOUBLE
# End of icx, icl list
# Intel Sapphire Rapids events
CPU,spr
PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED:THREAD_P
PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED:ANY_P
PRESET,PAPI_REF_CYC,NOT_DERIVED,UNHALTED_REFERENCE_CYCLES
# FLOPs
PRESET,PAPI_DP_OPS,DERIVED_POSTFIX,N0|N1|2|*|+|N2|4|*|+|N3|8|*|+|,FP_ARITH_INST_RETIRED:SCALAR_DOUBLE,FP_ARITH_INST_RETIRED:128B_PACKED_DOUBLE,FP_ARITH_INST_RETIRED:256B_PACKED_DOUBLE,FP_ARITH_INST_RETIRED:512B_PACKED_DOUBLE
PRESET,PAPI_SP_OPS,DERIVED_POSTFIX,N0|N1|4|*|+|N2|8|*|+|N3|16|*|+|,FP_ARITH_INST_RETIRED:SCALAR_SINGLE,FP_ARITH_INST_RETIRED:128B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:256B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:512B_PACKED_SINGLE
PRESET,PAPI_FP_OPS,DERIVED_POSTFIX,N0|N1|4|*|+|N2|8|*|+|N3|16|*|+|N4|+|N5|2|*|+|N6|4|*|+|N7|8|*|+|,FP_ARITH_INST_RETIRED:SCALAR_SINGLE,FP_ARITH_INST_RETIRED:128B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:256B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:512B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:SCALAR_DOUBLE,FP_ARITH_INST_RETIRED:128B_PACKED_DOUBLE,FP_ARITH_INST_RETIRED:256B_PACKED_DOUBLE,FP_ARITH_INST_RETIRED:512B_PACKED_DOUBLE
PRESET,PAPI_FP_INS,DERIVED_POSTFIX,N0|N1|N2|N3|N4|N5|N6|N7|+|+|+|+|+|+|+|,FP_ARITH_INST_RETIRED:SCALAR_SINGLE,FP_ARITH_INST_RETIRED:128B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:256B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:512B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:SCALAR_DOUBLE,FP_ARITH_INST_RETIRED:128B_PACKED_DOUBLE,FP_ARITH_INST_RETIRED:256B_PACKED_DOUBLE,FP_ARITH_INST_RETIRED:512B_PACKED_DOUBLE
PRESET,PAPI_VEC_DP,DERIVED_POSTFIX,N0|N1|N2|+|+|,FP_ARITH_INST_RETIRED:128B_PACKED_DOUBLE,FP_ARITH_INST_RETIRED:256B_PACKED_DOUBLE,FP_ARITH_INST_RETIRED:512B_PACKED_DOUBLE
PRESET,PAPI_VEC_SP,DERIVED_POSTFIX,N0|N1|N2|+|+|,FP_ARITH_INST_RETIRED:128B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:256B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:512B_PACKED_SINGLE
PRESET,PAPI_VEC_INS,DERIVED_POSTFIX,N0|N1|N2|N3|N4|N5|+|+|+|+|+|,FP_ARITH_INST_RETIRED:128B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:256B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:512B_PACKED_SINGLE,FP_ARITH_INST_RETIRED:128B_PACKED_DOUBLE,FP_ARITH_INST_RETIRED:256B_PACKED_DOUBLE,FP_ARITH_INST_RETIRED:512B_PACKED_DOUBLE
# Branches
PRESET,PAPI_BR_UCN,DERIVED_SUB,BR_INST_RETIRED:ALL_BRANCHES,BR_INST_RETIRED:COND
PRESET,PAPI_BR_CN,NOT_DERIVED,BR_INST_RETIRED:COND
PRESET,PAPI_BR_TKN,NOT_DERIVED,BR_INST_RETIRED:COND_TAKEN
PRESET,PAPI_BR_NTK,NOT_DERIVED,BR_INST_RETIRED:COND_NTAKEN
PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MISP_RETIRED:COND
PRESET,PAPI_BR_PRC,DERIVED_SUB,BR_INST_RETIRED:COND,BR_MISP_RETIRED:COND
PRESET,PAPI_BR_INS,NOT_DERIVED,BR_INST_RETIRED:ALL_BRANCHES
# Instruction Caches
PRESET,PAPI_L1_ICM,NOT_DERIVED,L2_RQSTS:ALL_CODE_RD
PRESET,PAPI_L2_ICH,NOT_DERIVED,L2_RQSTS:CODE_RD_HIT
PRESET,PAPI_L2_ICM,NOT_DERIVED,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L2_ICR,NOT_DERIVED,L2_RQSTS:ALL_CODE_RD
PRESET,PAPI_L2_ICA,NOT_DERIVED,L2_RQSTS:ALL_CODE_RD
PRESET,PAPI_L3_ICA,NOT_DERIVED,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L3_ICR,NOT_DERIVED,L2_RQSTS:CODE_RD_MISS
# Loads and stores
PRESET,PAPI_LD_INS,NOT_DERIVED,MEM_INST_RETIRED:ALL_LOADS
PRESET,PAPI_SR_INS,NOT_DERIVED,MEM_INST_RETIRED:ALL_STORES
PRESET,PAPI_LST_INS,DERIVED_ADD,MEM_INST_RETIRED:ALL_LOADS,MEM_INST_RETIRED:ALL_STORES
# Data Caches
PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D:REPLACEMENT
PRESET,PAPI_L2_DCA,NOT_DERIVED,L1D:REPLACEMENT
PRESET,PAPI_L2_DCR,NOT_DERIVED,L2_RQSTS:ALL_DEMAND_DATA_RD
#PRESET,PAPI_L2_DCM,DERIVED_SUB,LLC_REFERENCES,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L2_DCM,NOT_DERIVED,OFFCORE_REQUESTS:DATA_RD
PRESET,PAPI_L2_LDM,NOT_DERIVED,L2_RQSTS:DEMAND_DATA_RD_MISS
#PRESET,PAPI_L3_DCA,DERIVED_SUB,LLC_REFERENCES,L2_RQSTS:CODE_RD_MISS
PRESET,PAPI_L3_DCA,NOT_DERIVED,OFFCORE_REQUESTS:DATA_RD
PRESET,PAPI_L3_DCR,NOT_DERIVED,OFFCORE_REQUESTS:DEMAND_DATA_RD
PRESET,PAPI_L3_LDM,NOT_DERIVED,MEM_LOAD_RETIRED:L3_MISS
# SMP
PRESET,PAPI_CA_SHR,NOT_DERIVED,OFFCORE_REQUESTS:DATA_RD
# Total Caches
PRESET,PAPI_L1_TCM,DERIVED_ADD,L1D:REPLACEMENT,L2_RQSTS:ALL_CODE_RD
PRESET,PAPI_L2_TCA,DERIVED_ADD,L1D:REPLACEMENT,L2_RQSTS:ALL_CODE_RD
PRESET,PAPI_L2_TCM,NOT_DERIVED,LLC_REFERENCES
PRESET,PAPI_L2_TCR,DERIVED_ADD,L2_RQSTS:ALL_DEMAND_DATA_RD,L2_RQSTS:ALL_CODE_RD
PRESET,PAPI_L3_TCA,NOT_DERIVED,LLC_REFERENCES
PRESET,PAPI_L3_TCM,NOT_DERIVED,LLC_MISSES
# End of spr list
#
# Intel MIC / Xeon-Phi / Knights Landing
# Intel Knights Mill
#
CPU,knl
CPU,knm
PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTRUCTIONS_RETIRED
PRESET,PAPI_TOT_CYC,NOT_DERIVED,UNHALTED_CORE_CYCLES
PRESET,PAPI_REF_CYC,NOT_DERIVED,UNHALTED_REFERENCE_CYCLES
PRESET,PAPI_L1_ICM,NOT_DERIVED,ICACHE:MISSES
PRESET,PAPI_L1_ICA,NOT_DERIVED,ICACHE:ACCESSES
PRESET,PAPI_L1_ICH,NOT_DERIVED,ICACHE:HIT
#
PRESET,PAPI_L1_DCA,DERIVED_ADD,MEM_UOPS_RETIRED:ANY_LD,MEM_UOPS_RETIRED:ANY_ST
PRESET,PAPI_L1_DCM,NOT_DERIVED,MEM_UOPS_RETIRED:LD_DCU_MISS
PRESET,PAPI_L1_TCM,DERIVED_ADD,MEM_UOPS_RETIRED:LD_DCU_MISS,ICACHE:MISSES
PRESET,PAPI_L1_LDM,NOT_DERIVED,MEM_UOPS_RETIRED:LD_DCU_MISS
#
PRESET,PAPI_L2_TCA,NOT_DERIVED,LLC_REFERENCES
PRESET,PAPI_L2_TCM,NOT_DERIVED,LLC_MISSES
PRESET,PAPI_L2_TCH,DERIVED_SUB,LLC_REFERENCES,LLC_MISSES
PRESET,PAPI_L2_LDM,NOT_DERIVED,MEM_UOPS_RETIRED:LD_L2_MISS
PRESET,PAPI_LD_INS,NOT_DERIVED,MEM_UOPS_RETIRED:ANY_LD
PRESET,PAPI_SR_INS,NOT_DERIVED,MEM_UOPS_RETIRED:ANY_ST
PRESET,PAPI_LST_INS,DERIVED_ADD,MEM_UOPS_RETIRED:ANY_LD,MEM_UOPS_RETIRED:ANY_ST
#
PRESET,PAPI_TLB_DM,NOT_DERIVED,MEM_UOPS_RETIRED:LD_UTLB_MISS
#
PRESET,PAPI_BR_INS,NOT_DERIVED,BRANCH_INSTRUCTIONS_RETIRED
PRESET,PAPI_BR_MSP,NOT_DERIVED,MISPREDICTED_BRANCH_RETIRED
PRESET,PAPI_BR_CN,NOT_DERIVED,BR_INST_RETIRED:JCC
PRESET,PAPI_BR_UCN,DERIVED_SUB,BRANCH_INSTRUCTIONS_RETIRED,BR_INST_RETIRED:JCC
PRESET,PAPI_BR_TKN,NOT_DERIVED,BR_INST_RETIRED:TAKEN_JCC
PRESET,PAPI_BR_NTK,DERIVED_SUB,BR_INST_RETIRED:JCC,BR_INST_RETIRED:TAKEN_JCC
#
PRESET,PAPI_RES_STL,NOT_DERIVED,RS_FULL_STALL:ANY
PRESET,PAPI_STL_ICY,NOT_DERIVED,NO_ALLOC_CYCLES:ANY
#
# End of knl,knm list
CPU,Intel Core2
CPU,Intel Core
CPU,core
#
PRESET,PAPI_TOT_CYC,NOT_DERIVED,UNHALTED_CORE_CYCLES
PRESET,PAPI_REF_CYC,NOT_DERIVED,UNHALTED_REFERENCE_CYCLES
PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTRUCTIONS_RETIRED
PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_MISSES
PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_READS
PRESET,PAPI_L1_ICH,DERIVED_SUB,L1I_READS,L1I_MISSES
PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_REPL
PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_ALL_REF
PRESET,PAPI_L1_DCH,DERIVED_SUB,L1D_ALL_REF,L1D_REPL
PRESET,PAPI_L1_TCM,NOT_DERIVED,LAST_LEVEL_CACHE_REFERENCES
PRESET,PAPI_L1_LDM,NOT_DERIVED,L2_LD:SELF:ANY:MESI
PRESET,PAPI_L1_STM,NOT_DERIVED,L2_ST:SELF:MESI
PRESET,PAPI_L1_TCA,DERIVED_ADD,L1D_ALL_REF,L1I_READS
PRESET,PAPI_L2_DCM,DERIVED_SUB,L2_LINES_IN:SELF:ANY,BUS_TRANS_IFETCH:SELF
PRESET,PAPI_L2_ICM,NOT_DERIVED,BUS_TRANS_IFETCH:SELF
PRESET,PAPI_L2_TCM,NOT_DERIVED,L2_LINES_IN:SELF:ANY
PRESET,PAPI_L2_LDM,DERIVED_SUB,L2_LINES_IN:SELF:ANY,L2_M_LINES_IN:SELF
PRESET,PAPI_L2_STM,NOT_DERIVED,L2_M_LINES_IN:SELF
PRESET,PAPI_L2_DCA,DERIVED_ADD,L2_LD:SELF:ANY:MESI,L2_ST:SELF:MESI
PRESET,PAPI_L2_DCR,NOT_DERIVED,L2_LD:SELF:ANY:MESI
PRESET,PAPI_L2_DCW,NOT_DERIVED,L2_ST:SELF:MESI
PRESET,PAPI_L2_ICH,DERIVED_SUB,L2_IFETCH:SELF:MESI,BUS_TRANS_IFETCH:SELF
PRESET,PAPI_L2_ICA,NOT_DERIVED,L2_IFETCH:SELF:MESI
PRESET,PAPI_L2_TCH,DERIVED_SUB,L2_RQSTS:SELF:ANY:MESI,L2_LINES_IN:SELF:ANY
PRESET,PAPI_L2_TCA,NOT_DERIVED,L2_RQSTS:SELF:ANY:MESI
PRESET,PAPI_L2_TCR,DERIVED_ADD,L2_LD:SELF:ANY:MESI,L2_IFETCH:SELF:MESI
PRESET,PAPI_L2_TCW,NOT_DERIVED,L2_ST:SELF:MESI
#
PRESET,PAPI_LD_INS,NOT_DERIVED,INST_RETIRED:LOADS
PRESET,PAPI_SR_INS,NOT_DERIVED,INST_RETIRED:STORES
#
PRESET,PAPI_CA_SHR,NOT_DERIVED,L2_RQSTS:SELF:ANY:S_STATE
PRESET,PAPI_CA_CLN,NOT_DERIVED,BUS_TRANS_RFO:SELF
PRESET,PAPI_CA_ITV,NOT_DERIVED,BUS_TRANS_INVAL:SELF
#
PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB:MISSES
PRESET,PAPI_TLB_DM,NOT_DERIVED,DTLB_MISSES:ANY
#
PRESET,PAPI_BR_TKN,NOT_DERIVED,BR_INST_RETIRED:TAKEN
PRESET,PAPI_BR_NTK,NOT_DERIVED,BR_INST_RETIRED:PRED_NOT_TAKEN:MISPRED_NOT_TAKEN
PRESET,PAPI_BR_INS,NOT_DERIVED,BR_INST_EXEC
PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MISSP_EXEC
PRESET,PAPI_BR_CN,NOT_DERIVED,BR_CND_EXEC
PRESET,PAPI_BR_PRC,DERIVED_SUB,BR_CND_EXEC,BR_CND_MISSP_EXEC
#
PRESET,PAPI_TOT_IIS,NOT_DERIVED,MACRO_INSTS:DECODED
PRESET,PAPI_HW_INT,NOT_DERIVED,HW_INT_RCV
PRESET,PAPI_RES_STL,NOT_DERIVED,RESOURCE_STALLS:ANY
#
PRESET,PAPI_FP_INS,NOT_DERIVED,FP_COMP_OPS_EXE
# This is an alternate definition of OPS that produces no error with calibrate
# the previous definition was identical to FP_INS
# PRESET,PAPI_FP_OPS,NOT_DERIVED,X87_OPS_RETIRED:ANY
# PRESET,PAPI_FP_OPS,DERIVED_ADD, FP_COMP_OPS_EXE, SIMD_COMP_INST_RETIRED:SCALAR_DOUBLE:PACKED_DOUBLE:SCALAR_SINGLE:PACKED_SINGLE
PRESET,PAPI_FP_OPS,NOT_DERIVED,FP_COMP_OPS_EXE
# PAPI_SP_OPS = FP_COMP_OPS_EXE + 3 * SIMD_COMP_INST_RETIRED:PACKED_SINGLE
PRESET,PAPI_SP_OPS,DERIVED_POSTFIX,N0|N1|3|*|+|,FP_COMP_OPS_EXE,SIMD_COMP_INST_RETIRED:PACKED_SINGLE
PRESET,PAPI_DP_OPS,DERIVED_ADD,FP_COMP_OPS_EXE,SIMD_COMP_INST_RETIRED:PACKED_DOUBLE
PRESET,PAPI_VEC_SP,NOT_DERIVED,SIMD_COMP_INST_RETIRED:PACKED_SINGLE
PRESET,PAPI_VEC_DP,NOT_DERIVED,SIMD_COMP_INST_RETIRED:PACKED_DOUBLE
#
PRESET,PAPI_FML_INS,NOT_DERIVED,MUL
PRESET,PAPI_FDV_INS,NOT_DERIVED,DIV
PRESET,PAPI_VEC_INS,NOT_DERIVED,SIMD_INST_RETIRED:VECTOR
#
CPU,Intel Core Duo/Solo
CPU,coreduo
#
PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTRUCTIONS_RETIRED
PRESET,PAPI_TOT_CYC,NOT_DERIVED,UNHALTED_CORE_CYCLES
PRESET,PAPI_REF_CYC,NOT_DERIVED,UNHALTED_REFERENCE_CYCLES
PRESET,PAPI_BR_INS,NOT_DERIVED,BRANCH_INSTRUCTIONS_RETIRED
PRESET,PAPI_BR_TKN,NOT_DERIVED,BR_TAKEN_RET
PRESET,PAPI_BR_MSP,NOT_DERIVED,MISPREDICTED_BRANCH_RETIRED
PRESET,PAPI_L2_TCM,NOT_DERIVED,LAST_LEVEL_CACHE_MISSES
PRESET,PAPI_L2_TCA,NOT_DERIVED,LAST_LEVEL_CACHE_REFERENCES
PRESET,PAPI_FP_INS,NOT_DERIVED,FP_COMP_INSTR_RET
PRESET,PAPI_FP_OPS,NOT_DERIVED,FP_COMP_INSTR_RET
#
PRESET,PAPI_L1_DCM,NOT_DERIVED, DCACHE_REPL
PRESET,PAPI_L1_ICM,NOT_DERIVED, L2_IFETCH:SELF:MESI
PRESET,PAPI_L2_DCM,DERIVED_SUB, L2_LINES_IN:SELF:ANY, BUS_TRANS_IFETCH:SELF
PRESET,PAPI_L2_ICM,NOT_DERIVED, BUS_TRANS_IFETCH:SELF
PRESET,PAPI_L1_TCM,NOT_DERIVED, L2_RQSTS:SELF:MESI
#PRESET,PAPI_L2_TCM,NOT_DERIVED, L2_LINES_IN:SELF:ANY
PRESET,PAPI_CA_SHR,NOT_DERIVED, L2_RQSTS:SELF:ANY:S_STATE
PRESET,PAPI_CA_CLN,NOT_DERIVED, BUS_TRANS_RFO:SELF
PRESET,PAPI_CA_ITV,NOT_DERIVED, BUS_TRANS_INVAL:SELF
PRESET,PAPI_TLB_IM,NOT_DERIVED, ITLB_MISSES
PRESET,PAPI_TLB_DM,NOT_DERIVED, DTLB_MISS
PRESET,PAPI_L1_LDM,NOT_DERIVED, L2_LD:SELF:MESI
PRESET,PAPI_L1_STM,NOT_DERIVED, L2_ST:SELF:MESI
PRESET,PAPI_L2_LDM,DERIVED_SUB, L2_LINES_IN:SELF:ANY, L2_M_LINES_IN:SELF
PRESET,PAPI_L2_STM,NOT_DERIVED, L2_M_LINES_IN:SELF
PRESET,PAPI_BTAC_M,NOT_DERIVED, PREF_RQSTS_DN
PRESET,PAPI_HW_INT,NOT_DERIVED, HW_INT_RX
PRESET,PAPI_BR_CN,NOT_DERIVED, BR_CND_EXEC
PRESET,PAPI_BR_TKN,NOT_DERIVED, BR_TAKEN_RET
PRESET,PAPI_BR_NTK,DERIVED_SUB, BR_INSTR_RET,BR_TAKEN_RET
PRESET,PAPI_BR_MSP,NOT_DERIVED, BR_MISSP_EXEC
PRESET,PAPI_BR_PRC,DERIVED_SUB, BR_INSTR_RET,BR_MISPRED_RET
PRESET,PAPI_TOT_IIS,NOT_DERIVED, INSTR_DECODED
PRESET,PAPI_RES_STL,NOT_DERIVED, RESOURCE_STALL
PRESET,PAPI_L1_DCH,DERIVED_SUB, DATA_MEM_REF, DCACHE_REPL
PRESET,PAPI_L1_DCA,NOT_DERIVED, DATA_MEM_REF
PRESET,PAPI_L2_DCA,DERIVED_ADD, L2_LD:SELF:MESI, L2_ST:SELF:MESI
PRESET,PAPI_L2_DCR,NOT_DERIVED, L2_LD:SELF:MESI
PRESET,PAPI_L2_DCW,NOT_DERIVED, L2_ST:SELF:MESI
PRESET,PAPI_L1_ICH,DERIVED_SUB, BUS_TRANS_IFETCH:SELF, L2_IFETCH:SELF:MESI
PRESET,PAPI_L2_ICH,DERIVED_SUB, L2_IFETCH:SELF:MESI, BUS_TRANS_IFETCH:SELF
PRESET,PAPI_L1_ICA,NOT_DERIVED, BUS_TRANS_IFETCH:SELF
PRESET,PAPI_L2_ICA,NOT_DERIVED, L2_IFETCH:SELF:MESI
PRESET,PAPI_L1_ICR,NOT_DERIVED, BUS_TRANS_IFETCH:SELF
PRESET,PAPI_L2_ICR,NOT_DERIVED, L2_IFETCH:SELF:MESI
PRESET,PAPI_L2_TCH,DERIVED_SUB, L2_RQSTS:SELF:ANY:MESI, L2_LINES_IN:SELF:ANY
PRESET,PAPI_L1_TCA,DERIVED_ADD, DATA_MEM_REF, BUS_TRANS_IFETCH:SELF
PRESET,PAPI_L2_TCA,NOT_DERIVED, L2_RQSTS:SELF:ANY:MESI
PRESET,PAPI_L2_TCR,DERIVED_ADD, L2_LD:SELF:MESI, L2_IFETCH:SELF:MESI
PRESET,PAPI_L2_TCW,NOT_DERIVED, L2_ST:SELF:MESI
PRESET,PAPI_FML_INS,NOT_DERIVED, MUL
PRESET,PAPI_FDV_INS,NOT_DERIVED, DIV
#
CPU,Intel PentiumIII
CPU,Intel P6 Processor Family
CPU,p6
#
PRESET,PAPI_L2_DCM,DERIVED_SUB,L2_LINES_IN,BUS_TRAN_IFETCH:SELF
PRESET,PAPI_L2_TCM,NOT_DERIVED,L2_LINES_IN
PRESET,PAPI_L2_LDM,DERIVED_SUB,L2_LINES_IN,L2_M_LINES_INM
PRESET,PAPI_L2_TCH,DERIVED_SUB,L2_RQSTS:M:E:S:I,L2_LINES_IN
#
CPU,Intel PentiumM
CPU,Intel Pentium M
CPU,pm
#
PRESET,PAPI_L2_DCM,DERIVED_SUB,L2_LINES_IN:ONLY_HW_PREFETCH:NON_HW_PREFETCH,BUS_TRAN_IFETCH:SELF
PRESET,PAPI_L2_TCM,NOT_DERIVED,L2_LINES_IN:ONLY_HW_PREFETCH:NON_HW_PREFETCH
PRESET,PAPI_L2_LDM,DERIVED_SUB,L2_LINES_IN:ONLY_HW_PREFETCH:NON_HW_PREFETCH,L2_M_LINES_INM
PRESET,PAPI_L2_TCH,DERIVED_SUB,L2_RQSTS:M:E:S:I,L2_LINES_IN:ONLY_HW_PREFETCH:NON_HW_PREFETCH
#
CPU,Intel P6
CPU,Intel PentiumIII
CPU,Intel PentiumM
CPU,Intel P6 Processor Family
CPU,Intel Pentium Pro
CPU,Intel Pentium II
CPU,Intel Pentium M
CPU,p6
CPU,ppro
CPU,pii
CPU,pm
#
PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED
PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED
PRESET,PAPI_L1_DCM,NOT_DERIVED,DCU_LINES_IN
PRESET,PAPI_L1_ICM,NOT_DERIVED,L2_IFETCH:M:E:S:I
PRESET,PAPI_L1_TCM,NOT_DERIVED,L2_RQSTS:M:E:S:I
PRESET,PAPI_L1_LDM,NOT_DERIVED,L2_LD:M:E:S:I
PRESET,PAPI_L1_STM,NOT_DERIVED,L2_ST:M:E:S:I
PRESET,PAPI_L1_DCH,DERIVED_SUB,DATA_MEM_REFS,DCU_LINES_IN
PRESET,PAPI_L1_DCA,NOT_DERIVED,DATA_MEM_REFS
PRESET,PAPI_L1_ICH,DERIVED_SUB,IFU_IFETCH,L2_IFETCH:M:E:S:I
PRESET,PAPI_L1_ICA,NOT_DERIVED,IFU_IFETCH
PRESET,PAPI_L1_ICR,NOT_DERIVED,IFU_IFETCH
PRESET,PAPI_L1_TCA,DERIVED_ADD,DATA_MEM_REFS,IFU_IFETCH
#
PRESET,PAPI_L2_ICM,NOT_DERIVED,BUS_TRAN_IFETCH:SELF
PRESET,PAPI_L2_STM,NOT_DERIVED,L2_M_LINES_INM
PRESET,PAPI_L2_DCA,DERIVED_ADD,L2_LD:M:E:S:I,L2_ST:M:E:S:I
PRESET,PAPI_L2_DCR,NOT_DERIVED,L2_LD:M:E:S:I
PRESET,PAPI_L2_DCW,NOT_DERIVED,L2_ST:M:E:S:I
PRESET,PAPI_L2_ICH,DERIVED_SUB,L2_IFETCH:M:E:S:I,BUS_TRAN_IFETCH:SELF
PRESET,PAPI_L2_ICA,NOT_DERIVED,L2_IFETCH:M:E:S:I
PRESET,PAPI_L2_ICR,NOT_DERIVED,L2_IFETCH:M:E:S:I
PRESET,PAPI_L2_TCA,NOT_DERIVED,L2_RQSTS:M:E:S:I
PRESET,PAPI_L2_TCR,DERIVED_ADD,L2_LD:M:E:S:I,L2_IFETCH:M:E:S:I
PRESET,PAPI_L2_TCW,NOT_DERIVED,L2_ST:M:E:S:I
#
PRESET,PAPI_CA_SHR,NOT_DERIVED,L2_RQSTS:S
PRESET,PAPI_CA_CLN,NOT_DERIVED,BUS_TRANS_RFO:SELF
PRESET,PAPI_CA_ITV,NOT_DERIVED,BUS_TRAN_INVAL:SELF
#
PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB_MISS
PRESET,PAPI_HW_INT,NOT_DERIVED,HW_INT_RX
PRESET,PAPI_TOT_IIS,NOT_DERIVED,INST_DECODED
PRESET,PAPI_RES_STL,NOT_DERIVED,RESOURCE_STALLS
#
PRESET,PAPI_BTAC_M,NOT_DERIVED,BTB_MISSES
PRESET,PAPI_BR_CN,NOT_DERIVED,BR_INST_RETIRED
PRESET,PAPI_BR_TKN,NOT_DERIVED,BR_TAKEN_RETIRED
PRESET,PAPI_BR_NTK,DERIVED_SUB,BR_INST_RETIRED,BR_TAKEN_RETIRED
PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MISS_PRED_RETIRED
PRESET,PAPI_BR_PRC,DERIVED_SUB,BR_INST_RETIRED,BR_MISS_PRED_RETIRED
PRESET,PAPI_BR_INS,NOT_DERIVED,BR_INST_RETIRED
#
PRESET,PAPI_FP_INS,NOT_DERIVED,FLOPS
PRESET,PAPI_FP_OPS,NOT_DERIVED,FLOPS
PRESET,PAPI_FML_INS,NOT_DERIVED,MUL
PRESET,PAPI_FDV_INS,NOT_DERIVED,DIV
#
# This is an example of multiple processor names matching the same table
CPU,Intel Pentium4
CPU,Intel Pentium4 L3
CPU,Pentium4/Xeon/EM64T
CPU,netburst
CPU,netburst_p
#
# Note: the proper event is GLOBAL_POWER_EVENTS:RUNNING
# but the kernel grabs that for the watchdog timer
# and suggests "" is equivalent
#PRESET,PAPI_TOT_CYC,NOT_DERIVED,GLOBAL_POWER_EVENTS:RUNNING
PRESET,PAPI_TOT_CYC,NOT_DERIVED,execution_event:nbogus0:nbogus1:nbogus2:nbogus3:bogus0:bogus1:bogus2:bogus3:cmpl:thr=15
PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTR_RETIRED:NBOGUSNTAG
PRESET,PAPI_RES_STL, NOT_DERIVED, resource_stall:SBFULL
PRESET,PAPI_BR_INS, NOT_DERIVED, branch_retired:MMNP:MMNM:MMTP:MMTM
PRESET,PAPI_BR_TKN, NOT_DERIVED, branch_retired:MMTP:MMTM
PRESET,PAPI_BR_NTK, NOT_DERIVED, branch_retired:MMNP:MMNM
PRESET,PAPI_BR_MSP, NOT_DERIVED, branch_retired:MMNM:MMTM
PRESET,PAPI_BR_PRC, NOT_DERIVED, branch_retired:MMNP:MMTP
PRESET,PAPI_TLB_DM, NOT_DERIVED, page_walk_type:DTMISS
PRESET,PAPI_TLB_IM, NOT_DERIVED, page_walk_type:ITMISS
PRESET,PAPI_TLB_TL, NOT_DERIVED, page_walk_type:DTMISS:ITMISS
PRESET,PAPI_LD_INS, DERIVED_CMPD, front_end_event:NBOGUS, uops_type:TAGLOADS
PRESET,PAPI_SR_INS, DERIVED_CMPD, front_end_event:NBOGUS, uops_type:TAGSTORES
PRESET,PAPI_LST_INS, DERIVED_CMPD, front_end_event:NBOGUS, uops_type:TAGLOADS:TAGSTORES
PRESET,PAPI_FP_INS, DERIVED_CMPD, execution_event:NBOGUS0, x87_FP_uop:ALL:TAG0,NOTE,"PAPI_FP_INS counts only retired x87 uops tagged with 0. If you add other native events tagged with 0, their counts will be included in PAPI_FP_INS"
PRESET,PAPI_TOT_IIS, NOT_DERIVED, instr_retired:NBOGUSNTAG:NBOGUSTAG:BOGUSNTAG:BOGUSTAG, NOTE, "Only on model 2 and above"
PRESET,PAPI_L1_ICM, NOT_DERIVED, BPU_fetch_request:TCMISS
PRESET,PAPI_L1_ICA, NOT_DERIVED, uop_queue_writes:FROM_TC_BUILD:FROM_TC_DELIVER
PRESET,PAPI_L1_LDM, NOT_DERIVED, replay_event:NBOGUS:L1_LD_MISS
PRESET,PAPI_L2_LDM, NOT_DERIVED, replay_event:NBOGUS:L2_LD_MISS
PRESET,PAPI_L2_TCH, NOT_DERIVED, BSQ_cache_reference:RD_2ndL_HITS:RD_2ndL_HITE:RD_2ndL_HITM
PRESET,PAPI_L2_TCM, NOT_DERIVED, BSQ_cache_reference:RD_2ndL_MISS
PRESET,PAPI_L2_TCA, NOT_DERIVED, BSQ_cache_reference:RD_2ndL_MISS:RD_2ndL_HITS:RD_2ndL_HITE:RD_2ndL_HITM
#
CPU,Intel Pentium4 L3
PRESET,PAPI_L3_TCH, NOT_DERIVED, BSQ_cache_reference:RD_3rdL_HITS:RD_3rdL_HITE:RD_3rdL_HITM
PRESET,PAPI_L3_TCM, NOT_DERIVED, BSQ_cache_reference:RD_3rdL_MISS
PRESET,PAPI_L3_TCA, NOT_DERIVED, BSQ_cache_reference:RD_3rdL_MISS:RD_3rdL_HITS:RD_3rdL_HITE:RD_3rdL_HITM
#
CPU,Intel Pentium4 FPU X87
PRESET,PAPI_FP_OPS, DERIVED_CMPD, execution_event:NBOGUS1, x87_FP_uop:ALL:TAG1,NOTE,"PAPI_FP_OPS counts retired x87 uops tagged with 1."
#
CPU,Intel Pentium4 FPU SSE_SP
PRESET,PAPI_FP_OPS, DERIVED_CMPD, execution_event:NBOGUS1, scalar_SP_uop:ALL:TAG1,NOTE,"PAPI_FP_OPS counts retired scalar_SP SSE uops tagged with 1."
#
CPU,Intel Pentium4 FPU SSE_DP
PRESET,PAPI_FP_OPS, DERIVED_CMPD, execution_event:NBOGUS1, scalar_DP_uop:ALL:TAG1,NOTE,"PAPI_FP_OPS counts retired scalar_DP SSE uops tagged with 1."
#
CPU,Intel Pentium4 FPU X87 SSE_SP
PRESET,PAPI_FP_OPS, DERIVED_CMPD, execution_event:NBOGUS1, scalar_SP_uop:ALL:TAG1, x87_FP_uop:ALL:TAG1,NOTE,"PAPI_FP_OPS counts retired x87 and scalar_SP SSE uops tagged with 1."
#
CPU,Intel Pentium4 FPU X87 SSE_DP
PRESET,PAPI_FP_OPS, DERIVED_CMPD, execution_event:NBOGUS1, scalar_DP_uop:ALL:TAG1, x87_FP_uop:ALL:TAG1,NOTE,"PAPI_FP_OPS counts retired x87 and scalar_DP SSE uops tagged with 1."
#
CPU,Intel Pentium4 FPU SSE_SP SSE_DP
PRESET,PAPI_FP_OPS, DERIVED_CMPD, execution_event:NBOGUS1, scalar_SP_uop:ALL:TAG1, scalar_DP_uop:ALL:TAG1,NOTE,"PAPI_FP_OPS counts retired scalar_SP and scalar_DP SSE uops tagged with 1."
#
CPU,Intel Pentium4 VEC MMX
PRESET,PAPI_VEC_INS, DERIVED_CMPD, execution_event:NBOGUS2, 64bit_MMX_uop:ALL:TAG2, 128bit_MMX_uop:ALL:TAG2,NOTE,"PAPI_VEC_INS counts retired 64bit and 128bit MMX uops tagged with 2."
#
CPU,Intel Pentium4 VEC SSE
PRESET,PAPI_VEC_INS, DERIVED_CMPD, execution_event:NBOGUS2, packed_SP_uop:ALL:TAG2, packed_DP_uop:ALL:TAG2,NOTE,"PAPI_VEC_INS counts retired packed single and double precision SSE uops tagged with 2."
#
CPU,IA-64
#
CPU,dual-core Itanium 2
#
PRESET,PAPI_FP_OPS,NOT_DERIVED,FP_OPS_RETIRED
PRESET,PAPI_STL_ICY,NOT_DERIVED,DISP_STALLED
PRESET,PAPI_STL_CCY,NOT_DERIVED,BACK_END_BUBBLE_ALL
PRESET,PAPI_TOT_IIS,NOT_DERIVED,INST_DISPERSED
PRESET,PAPI_RES_STL,NOT_DERIVED,BE_EXE_BUBBLE_ALL
PRESET,PAPI_FP_STAL,NOT_DERIVED,BE_EXE_BUBBLE_FRALL
PRESET,PAPI_L1_ICM,NOT_DERIVED,L2I_READS_ALL_DMND
PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_READ_MISSES_ALL
PRESET,PAPI_L2_TCM,NOT_DERIVED,L2I_READS_MISS_ALL
PRESET,PAPI_L2_ICM,NOT_DERIVED,L2I_READS_MISS_ALL
PRESET,PAPI_L3_TCM,NOT_DERIVED,L3_MISSES
PRESET,PAPI_L3_ICM,NOT_DERIVED,L3_READS_INST_FETCH_MISS:M:E:S:I
PRESET,PAPI_L3_LDM,NOT_DERIVED,L3_READS_ALL_MISS:M:E:S:I
PRESET,PAPI_L3_STM,NOT_DERIVED,L3_WRITES_DATA_WRITE_MISS:M:E:S:I
PRESET,PAPI_L1_LDM,NOT_DERIVED,L1D_READ_MISSES_ALL
PRESET,PAPI_L2_LDM,NOT_DERIVED,L3_READS_ALL_ALL:M:E:S:I
PRESET,PAPI_L2_STM,NOT_DERIVED,L3_WRITES_ALL_ALL:M:E:S:I
PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_READS_SET1
PRESET,PAPI_L2_DCA,NOT_DERIVED,L2D_REFERENCES_ALL
PRESET,PAPI_L3_DCA,NOT_DERIVED,L3_REFERENCES
PRESET,PAPI_L1_DCR,NOT_DERIVED,L1D_READS_SET1
PRESET,PAPI_L2_DCR,NOT_DERIVED,L2D_REFERENCES_READS
PRESET,PAPI_L3_DCR,NOT_DERIVED,L3_READS_DATA_READ_ALL:M:E:S:I
PRESET,PAPI_L2_DCW,NOT_DERIVED,L2D_REFERENCES_WRITES
PRESET,PAPI_L3_DCW,NOT_DERIVED,L3_WRITES_DATA_WRITE_ALL:M:E:S:I
PRESET,PAPI_L3_ICH,NOT_DERIVED,L3_READS_DINST_FETCH_HIT:M:E:S:I
PRESET,PAPI_L3_ICR,NOT_DERIVED,L3_READS_INST_FETCH_ALL:M:E:S:I
PRESET,PAPI_L3_TCA,NOT_DERIVED,L3_REFERENCES
PRESET,PAPI_L3_TCR,NOT_DERIVED,L3_READS_ALL_ALL:M:E:S:I
PRESET,PAPI_L3_TCW,NOT_DERIVED,L3_WRITES_ALL_ALL:M:E:S:I
PRESET,PAPI_TLB_DM,NOT_DERIVED,L2DTLB_MISSES
PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB_MISSES_FETCH_L2ITLB
PRESET,PAPI_BR_INS,NOT_DERIVED,BRANCH_EVENT
PRESET,PAPI_BR_PRC,NOT_DERIVED,BR_MISPRED_DETAIL_ALL_CORRECT_PRED
PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_OP_CYCLES_ALL
PRESET,PAPI_FP_OPS,NOT_DERIVED,FP_OPS_RETIRED
PRESET,PAPI_TOT_INS,NOT_DERIVED,IA64_INST_RETIRED
PRESET,PAPI_LD_INS,NOT_DERIVED,LOADS_RETIRED
PRESET,PAPI_SR_INS,NOT_DERIVED,STORES_RETIRED
PRESET,PAPI_L2_ICA,NOT_DERIVED,L2I_DEMAND_READS
PRESET,PAPI_L3_ICA,NOT_DERIVED,L3_READS_INST_FETCH_ALL:M:E:S:I
PRESET,PAPI_L1_TCR,NOT_DERIVED,L2I_READS_ALL_ALL
PRESET,PAPI_L2_TCW,NOT_DERIVED,L2D_REFERENCES_WRITES
#
CPU,itanium2
#
PRESET,PAPI_CA_SNP,NOT_DERIVED,BUS_SNOOPS_SELF
PRESET,PAPI_CA_INV,DERIVED_ADD,BUS_MEM_READ_BRIL_SELF,BUS_MEM_READ_BIL_SELF
PRESET,PAPI_TLB_TL,DERIVED_ADD,ITLB_MISSES_FETCH_L2ITLB,L2DTLB_MISSES
PRESET,PAPI_STL_ICY,NOT_DERIVED,DISP_STALLED
PRESET,PAPI_STL_CCY,NOT_DERIVED,BACK_END_BUBBLE_ALL
PRESET,PAPI_TOT_IIS,NOT_DERIVED,INST_DISPERSED
PRESET,PAPI_RES_STL,NOT_DERIVED,BE_EXE_BUBBLE_ALL
PRESET,PAPI_FP_STAL,NOT_DERIVED,BE_EXE_BUBBLE_FRALL
PRESET,PAPI_L2_TCR,DERIVED_ADD,L2_DATA_REFERENCES_L2_DATA_READS,L2_INST_DEMAND_READS,L2_INST_PREFETCHES
PRESET,PAPI_L1_TCM,DERIVED_ADD,L2_INST_DEMAND_READS,L1D_READ_MISSES_ALL
PRESET,PAPI_L1_ICM,NOT_DERIVED,L2_INST_DEMAND_READS
PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_READ_MISSES_ALL
PRESET,PAPI_L2_TCM,NOT_DERIVED,L2_MISSES
PRESET,PAPI_L2_DCM, DERIVED_SUB,L2_MISSES,L3_READS_INST_FETCH_ALL
PRESET,PAPI_L2_ICM,NOT_DERIVED,L3_READS_INST_FETCH_ALL
PRESET,PAPI_L3_TCM,NOT_DERIVED,L3_MISSES
PRESET,PAPI_L3_ICM,NOT_DERIVED,L3_READS_INST_FETCH_MISS
PRESET,PAPI_L3_DCM, DERIVED_ADD,L3_READS_DATA_READ_MISS,L3_WRITES_DATA_WRITE_MISS
PRESET,PAPI_L3_LDM,NOT_DERIVED,L3_READS_ALL_MISS
PRESET,PAPI_L3_STM,NOT_DERIVED,L3_WRITES_DATA_WRITE_MISS
PRESET,PAPI_L1_LDM,DERIVED_ADD,L1D_READ_MISSES_ALL,L2_INST_DEMAND_READS
PRESET,PAPI_L2_LDM,NOT_DERIVED,L3_READS_ALL_ALL
PRESET,PAPI_L2_STM,NOT_DERIVED,L3_WRITES_ALL_ALL
PRESET,PAPI_L1_DCH,DERIVED_SUB,L1D_READS_SET1,L1D_READ_MISSES_ALL
PRESET,PAPI_L2_DCH,DERIVED_SUB,L2_DATA_REFERENCES_L2_ALL,L2_MISSES
PRESET,PAPI_L3_DCH,DERIVED_ADD,L3_READS_DATA_READ_HIT,L3_WRITES_DATA_WRITE_HIT
PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_READS_SET1
PRESET,PAPI_L2_DCA,NOT_DERIVED,L2_DATA_REFERENCES_L2_ALL
PRESET,PAPI_L3_DCA,DERIVED_ADD,L3_READS_DATA_READ_ALL,L3_WRITES_DATA_WRITE_ALL
PRESET,PAPI_L1_DCR,NOT_DERIVED,L1D_READS_SET1
PRESET,PAPI_L2_DCR,NOT_DERIVED,L2_DATA_REFERENCES_L2_DATA_READS
PRESET,PAPI_L3_DCR,NOT_DERIVED,L3_READS_DATA_READ_ALL
PRESET,PAPI_L2_DCW,NOT_DERIVED,L2_DATA_REFERENCES_L2_DATA_WRITES
PRESET,PAPI_L3_DCW,NOT_DERIVED,L3_WRITES_DATA_WRITE_ALL
PRESET,PAPI_L3_ICH,NOT_DERIVED,L3_READS_DINST_FETCH_HIT
PRESET,PAPI_L1_ICR,DERIVED_ADD,L1I_PREFETCHES,L1I_READS
PRESET,PAPI_L2_ICR,DERIVED_ADD,L2_INST_DEMAND_READS,L2_INST_PREFETCHES
PRESET,PAPI_L3_ICR,NOT_DERIVED,L3_READS_INST_FETCH_ALL
PRESET,PAPI_L1_ICA,DERIVED_ADD,L1I_PREFETCHES,L1I_READS
PRESET,PAPI_L2_TCH,DERIVED_SUB,L2_REFERENCES,L2_MISSES
PRESET,PAPI_L3_TCH,DERIVED_SUB,L3_REFERENCES,L3_MISSES
PRESET,PAPI_L2_TCA,NOT_DERIVED,L2_REFERENCES
PRESET,PAPI_L3_TCA,NOT_DERIVED,L3_REFERENCES
PRESET,PAPI_L3_TCR,NOT_DERIVED,L3_READS_ALL_ALL
PRESET,PAPI_L3_TCW,NOT_DERIVED,L3_WRITES_ALL_ALL
PRESET,PAPI_TLB_DM,NOT_DERIVED,L2DTLB_MISSES
PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB_MISSES_FETCH_L2ITLB
PRESET,PAPI_BR_INS,NOT_DERIVED,BRANCH_EVENT
PRESET,PAPI_BR_PRC,NOT_DERIVED,BR_MISPRED_DETAIL_ALL_CORRECT_PRED
PRESET,PAPI_BR_MSP,DERIVED_ADD,BR_MISPRED_DETAIL_ALL_WRONG_PATH,BR_MISPRED_DETAIL_ALL_WRONG_TARGET
PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES
PRESET,PAPI_FP_OPS,NOT_DERIVED,FP_OPS_RETIRED
PRESET,PAPI_TOT_INS,DERIVED_ADD,IA64_INST_RETIRED,IA32_INST_RETIRED
PRESET,PAPI_LD_INS,NOT_DERIVED,LOADS_RETIRED
PRESET,PAPI_SR_INS,NOT_DERIVED,STORES_RETIRED
PRESET,PAPI_L2_ICA,NOT_DERIVED,L2_INST_DEMAND_READS
PRESET,PAPI_L3_ICA,NOT_DERIVED,L3_READS_INST_FETCH_ALL
PRESET,PAPI_L1_TCR,DERIVED_ADD,L1D_READS_SET0,L1I_READS
PRESET,PAPI_L1_TCA,DERIVED_ADD,L1D_READS_SET0,L1I_READS
PRESET,PAPI_L2_TCW,NOT_DERIVED,L2_DATA_REFERENCES_L2_DATA_WRITES
#
CPU,itanium
#
CPU,PPC970
#
PRESET,PAPI_L2_DCM,NOT_DERIVED,PM_DATA_FROM_MEM
PRESET,PAPI_L2_DCR,DERIVED_ADD,PM_DATA_FROM_L2,PM_DATA_FROM_L25_MOD,PM_DATA_FROM_L25_SHR,PM_DATA_FROM_MEM
PRESET,PAPI_L2_DCH,DERIVED_ADD,PM_DATA_FROM_L2,PM_DATA_FROM_L25_MOD,PM_DATA_FROM_L25_SHR
PRESET,PAPI_L2_LDM,NOT_DERIVED,PM_DATA_FROM_MEM
PRESET,PAPI_L1_ICM,DERIVED_ADD,PM_INST_FROM_L2,PM_INST_FROM_L25_SHR,PM_INST_FROM_L25_MOD,PM_INST_FROM_MEM
PRESET,PAPI_L2_ICA,DERIVED_ADD,PM_INST_FROM_L2,PM_INST_FROM_L25_SHR,PM_INST_FROM_L25_MOD,PM_INST_FROM_MEM
PRESET,PAPI_L2_ICH,DERIVED_ADD,PM_INST_FROM_L2,PM_INST_FROM_L25_SHR,PM_INST_FROM_L25_MOD
PRESET,PAPI_L2_ICM,NOT_DERIVED,PM_INST_FROM_MEM
PRESET,PAPI_L1_DCM,DERIVED_ADD,PM_LD_MISS_L1,PM_ST_MISS_L1
PRESET,PAPI_L1_DCA,DERIVED_ADD,PM_LD_REF_L1,PM_ST_REF_L1
PRESET,PAPI_FXU_IDL,NOT_DERIVED,PM_FXU_IDLE
PRESET,PAPI_L1_LDM,NOT_DERIVED,PM_LD_MISS_L1
PRESET,PAPI_L1_STM,NOT_DERIVED,PM_ST_MISS_L1
PRESET,PAPI_L1_DCW,NOT_DERIVED,PM_ST_REF_L1
PRESET,PAPI_L1_DCR,NOT_DERIVED,PM_LD_REF_L1
PRESET,PAPI_FMA_INS,NOT_DERIVED,PM_FPU_FMA
PRESET,PAPI_TOT_IIS,NOT_DERIVED,PM_INST_DISP
PRESET,PAPI_TOT_INS,NOT_DERIVED,PM_INST_CMPL
PRESET,PAPI_INT_INS,NOT_DERIVED,PM_FXU_FIN
PRESET,PAPI_FP_OPS,DERIVED_POSTFIX,N0|N1|+|N2|+|N3|-|,PM_FPU0_FIN,PM_FPU1_FIN,PM_FPU_FMA,PM_FPU_STF
PRESET,PAPI_FP_INS,NOT_DERIVED,PM_FPU_FIN
PRESET,PAPI_TOT_CYC,NOT_DERIVED,PM_CYC
PRESET,PAPI_FDV_INS,NOT_DERIVED,PM_FPU_FDIV
PRESET,PAPI_FSQ_INS,NOT_DERIVED,PM_FPU_FSQRT
PRESET,PAPI_TLB_DM,NOT_DERIVED,PM_DTLB_MISS
PRESET,PAPI_TLB_IM,NOT_DERIVED,PM_ITLB_MISS
PRESET,PAPI_TLB_TL,DERIVED_ADD,PM_DTLB_MISS,PM_ITLB_MISS
PRESET,PAPI_HW_INT,NOT_DERIVED,PM_EXT_INT
PRESET,PAPI_STL_ICY,NOT_DERIVED,PM_0INST_FETCH
PRESET,PAPI_LD_INS,NOT_DERIVED,PM_LD_REF_L1
PRESET,PAPI_SR_INS,NOT_DERIVED,PM_ST_REF_L1
PRESET,PAPI_LST_INS,DERIVED_ADD,PM_ST_REF_L1,PM_LD_REF_L1
PRESET,PAPI_BR_INS,NOT_DERIVED,PM_BR_ISSUED
PRESET,PAPI_BR_MSP,DERIVED_ADD,PM_BR_MPRED_CR,PM_BR_MPRED_TA
PRESET,PAPI_L1_DCH,DERIVED_POSTFIX,N0|N1|-|N2|+|N3|-|,PM_LD_REF_L1,PM_LD_MISS_L1,PM_ST_REF_L1,PM_ST_MISS_L1
PRESET,PAPI_L3_DCM,NOT_DERIVED,PM_DATA_FROM_MEM
PRESET,PAPI_L3_LDM,NOT_DERIVED,PM_DATA_FROM_MEM
PRESET,PAPI_L1_ICH,NOT_DERIVED,PM_INST_FROM_L1
PRESET,PAPI_L3_ICM,NOT_DERIVED,PM_INST_FROM_MEM
#
CPU,PPC970MP
#
PRESET,PAPI_L2_DCM,NOT_DERIVED,PM_DATA_FROM_MEM
PRESET,PAPI_L2_DCR,DERIVED_ADD,PM_DATA_FROM_L2,PM_DATA_FROM_L25_MOD,PM_DATA_FROM_L25_SHR,PM_DATA_FROM_MEM
PRESET,PAPI_L2_DCH,DERIVED_ADD,PM_DATA_FROM_L2,PM_DATA_FROM_L25_MOD,PM_DATA_FROM_L25_SHR
PRESET,PAPI_L2_LDM,NOT_DERIVED,PM_DATA_FROM_MEM
#PRESET,PAPI_L1_ICM,DERIVED_ADD,PM_INST_FROM_L2,PM_INST_FROM_L25_SHR,PM_INST_FROM_L25_MOD,PM_INST_FROM_MEM
#PRESET,PAPI_L2_ICA,DERIVED_ADD,PM_INST_FROM_L2,PM_INST_FROM_L25_SHR,PM_INST_FROM_L25_MOD,PM_INST_FROM_MEM
#PRESET,PAPI_L2_ICH,DERIVED_ADD,PM_INST_FROM_L2,PM_INST_FROM_L25_SHR,PM_INST_FROM_L25_MOD
PRESET,PAPI_L2_ICM,NOT_DERIVED,PM_INST_FROM_MEM
PRESET,PAPI_L1_DCM,DERIVED_ADD,PM_LD_MISS_L1,PM_ST_MISS_L1
PRESET,PAPI_L1_DCA,DERIVED_ADD,PM_LD_REF_L1,PM_ST_REF_L1
PRESET,PAPI_FXU_IDL,NOT_DERIVED,PM_FXU_IDLE
PRESET,PAPI_L1_LDM,NOT_DERIVED,PM_LD_MISS_L1
PRESET,PAPI_L1_STM,NOT_DERIVED,PM_ST_MISS_L1
PRESET,PAPI_L1_DCW,NOT_DERIVED,PM_ST_REF_L1
PRESET,PAPI_L1_DCR,NOT_DERIVED,PM_LD_REF_L1
PRESET,PAPI_FMA_INS,NOT_DERIVED,PM_FPU_FMA
PRESET,PAPI_TOT_IIS,NOT_DERIVED,PM_INST_DISP
PRESET,PAPI_TOT_INS,NOT_DERIVED,PM_INST_CMPL
PRESET,PAPI_INT_INS,NOT_DERIVED,PM_FXU_FIN
PRESET,PAPI_FP_OPS,DERIVED_POSTFIX,N0|N1|+|N2|+|N3|-|,PM_FPU0_FIN,PM_FPU1_FIN,PM_FPU_FMA,PM_FPU_STF
PRESET,PAPI_FP_INS,NOT_DERIVED,PM_FPU_FIN
PRESET,PAPI_TOT_CYC,NOT_DERIVED,PM_CYC
PRESET,PAPI_FDV_INS,NOT_DERIVED,PM_FPU_FDIV
PRESET,PAPI_FSQ_INS,NOT_DERIVED,PM_FPU_FSQRT
PRESET,PAPI_TLB_DM,NOT_DERIVED,PM_DTLB_MISS
PRESET,PAPI_TLB_IM,NOT_DERIVED,PM_ITLB_MISS
PRESET,PAPI_TLB_TL,DERIVED_ADD,PM_DTLB_MISS,PM_ITLB_MISS
PRESET,PAPI_HW_INT,NOT_DERIVED,PM_EXT_INT
PRESET,PAPI_STL_ICY,NOT_DERIVED,PM_0INST_FETCH
PRESET,PAPI_LD_INS,NOT_DERIVED,PM_LD_REF_L1
PRESET,PAPI_SR_INS,NOT_DERIVED,PM_ST_REF_L1
PRESET,PAPI_LST_INS,DERIVED_ADD,PM_ST_REF_L1,PM_LD_REF_L1
PRESET,PAPI_BR_INS,NOT_DERIVED,PM_BR_ISSUED
PRESET,PAPI_BR_MSP,DERIVED_ADD,PM_BR_MPRED_CR,PM_BR_MPRED_TA
PRESET,PAPI_L1_DCH,DERIVED_POSTFIX,N0|N1|-|N2|+|N3|-|,PM_LD_REF_L1,PM_LD_MISS_L1,PM_ST_REF_L1,PM_ST_MISS_L1
PRESET,PAPI_L3_DCM,NOT_DERIVED,PM_DATA_FROM_MEM
PRESET,PAPI_L3_LDM,NOT_DERIVED,PM_DATA_FROM_MEM
PRESET,PAPI_L1_ICH,NOT_DERIVED,PM_INST_FROM_L1
PRESET,PAPI_L3_ICM,NOT_DERIVED,PM_INST_FROM_MEM
#
CPU,POWER5
#
PRESET,PAPI_L1_DCM,DERIVED_ADD,PM_LD_MISS_L1,PM_ST_MISS_L1
PRESET,PAPI_L1_DCA,DERIVED_ADD,PM_LD_REF_L1,PM_ST_REF_L1
PRESET,PAPI_L1_LDM,NOT_DERIVED,PM_LD_MISS_L1
PRESET,PAPI_L1_STM,NOT_DERIVED,PM_ST_MISS_L1
PRESET,PAPI_L1_DCW,NOT_DERIVED,PM_ST_REF_L1
PRESET,PAPI_L1_DCR,NOT_DERIVED,PM_LD_REF_L1
PRESET,PAPI_L2_DCM,NOT_DERIVED,PM_DATA_FROM_L2MISS
PRESET,PAPI_L2_LDM,NOT_DERIVED,PM_DATA_FROM_L2MISS
PRESET,PAPI_L3_DCR,NOT_DERIVED,PM_DATA_FROM_L2MISS
PRESET,PAPI_L3_DCM,DERIVED_ADD,PM_DATA_FROM_LMEM,PM_DATA_FROM_RMEM
PRESET,PAPI_L3_LDM,DERIVED_ADD,PM_DATA_FROM_LMEM,PM_DATA_FROM_RMEM
PRESET,PAPI_L1_ICH,NOT_DERIVED,PM_INST_FROM_L1
PRESET,PAPI_L2_ICM,NOT_DERIVED,PM_INST_FROM_L2MISS
PRESET,PAPI_L2_ICH,NOT_DERIVED,PM_INST_FROM_L2
PRESET,PAPI_L3_ICA,NOT_DERIVED,PM_INST_FROM_L2MISS
PRESET,PAPI_L3_ICH,NOT_DERIVED,PM_INST_FROM_L3
PRESET,PAPI_L3_ICM,DERIVED_ADD,PM_DATA_FROM_LMEM,PM_DATA_FROM_RMEM
PRESET,PAPI_FMA_INS,NOT_DERIVED,PM_FPU_FMA
PRESET,PAPI_TOT_IIS,NOT_DERIVED,PM_INST_DISP
PRESET,PAPI_TOT_INS,NOT_DERIVED,PM_INST_CMPL
PRESET,PAPI_INT_INS,NOT_DERIVED,PM_FXU_FIN
PRESET,PAPI_FP_OPS,DERIVED_ADD,PM_FPU_1FLOP,PM_FPU_FMA,PM_FPU_FMA
PRESET,PAPI_FP_INS,NOT_DERIVED,PM_FPU_FIN
PRESET,PAPI_TOT_CYC,NOT_DERIVED,PM_RUN_CYC
PRESET,PAPI_FDV_INS,NOT_DERIVED,PM_FPU_FDIV
PRESET,PAPI_FSQ_INS,NOT_DERIVED,PM_FPU_FSQRT
PRESET,PAPI_TLB_DM,NOT_DERIVED,PM_DTLB_MISS
PRESET,PAPI_TLB_IM,NOT_DERIVED,PM_ITLB_MISS
PRESET,PAPI_TLB_TL,DERIVED_ADD,PM_DTLB_MISS,PM_ITLB_MISS
PRESET,PAPI_HW_INT,NOT_DERIVED,PM_EXT_INT
PRESET,PAPI_STL_ICY,NOT_DERIVED,PM_0INST_FETCH
PRESET,PAPI_LD_INS,NOT_DERIVED,PM_LD_REF_L1
PRESET,PAPI_SR_INS,NOT_DERIVED,PM_ST_REF_L1
PRESET,PAPI_LST_INS,DERIVED_ADD,PM_ST_REF_L1,PM_LD_REF_L1
PRESET,PAPI_BR_INS,NOT_DERIVED,PM_BR_ISSUED
PRESET,PAPI_BR_MSP,DERIVED_ADD,PM_BR_MPRED_CR,PM_BR_MPRED_TA
PRESET,PAPI_BR_PRC,NOT_DERIVED,PM_BR_PRED_CR_TA
PRESET,PAPI_FXU_IDL,NOT_DERIVED,PM_FXU_IDLE
#
CPU,POWER5+
#
PRESET,PAPI_L1_DCM,DERIVED_ADD,PM_LD_MISS_L1,PM_ST_MISS_L1
PRESET,PAPI_L1_DCA,DERIVED_ADD,PM_LD_REF_L1,PM_ST_REF_L1
PRESET,PAPI_L1_LDM,NOT_DERIVED,PM_LD_MISS_L1
PRESET,PAPI_L1_STM,NOT_DERIVED,PM_ST_MISS_L1
PRESET,PAPI_L1_DCW,NOT_DERIVED,PM_ST_REF_L1
PRESET,PAPI_L1_DCR,NOT_DERIVED,PM_LD_REF_L1
PRESET,PAPI_L2_DCM,NOT_DERIVED,PM_DATA_FROM_L2MISS
PRESET,PAPI_L2_LDM,NOT_DERIVED,PM_DATA_FROM_L2MISS
PRESET,PAPI_L3_DCR,NOT_DERIVED,PM_DATA_FROM_L2MISS
PRESET,PAPI_L3_DCM,DERIVED_ADD,PM_DATA_FROM_LMEM,PM_DATA_FROM_RMEM
PRESET,PAPI_L3_LDM,DERIVED_ADD,PM_DATA_FROM_LMEM,PM_DATA_FROM_RMEM
PRESET,PAPI_L1_ICH,NOT_DERIVED,PM_INST_FROM_L1
PRESET,PAPI_L2_ICM,NOT_DERIVED,PM_INST_FROM_L2MISS
PRESET,PAPI_L2_ICH,NOT_DERIVED,PM_INST_FROM_L2
PRESET,PAPI_L3_ICA,NOT_DERIVED,PM_INST_FROM_L2MISS
PRESET,PAPI_L3_ICH,NOT_DERIVED,PM_INST_FROM_L3
PRESET,PAPI_L3_ICM,DERIVED_ADD,PM_DATA_FROM_LMEM,PM_DATA_FROM_RMEM
PRESET,PAPI_FMA_INS,NOT_DERIVED,PM_FPU_FMA
PRESET,PAPI_TOT_IIS,NOT_DERIVED,PM_INST_DISP
PRESET,PAPI_TOT_INS,NOT_DERIVED,PM_INST_CMPL
PRESET,PAPI_INT_INS,NOT_DERIVED,PM_FXU_FIN
PRESET,PAPI_FP_OPS,DERIVED_POSTFIX,N0|N1|2|*|+|N2|N3|+|4|*|+|,PM_FPU_1FLOP,PM_FPU_FMA,PM_FPU_FSQRT,PM_FPU_FDIV
PRESET,PAPI_FP_INS,NOT_DERIVED,PM_FPU_FIN
PRESET,PAPI_TOT_CYC,NOT_DERIVED,PM_RUN_CYC
PRESET,PAPI_FDV_INS,NOT_DERIVED,PM_FPU_FDIV
PRESET,PAPI_FSQ_INS,NOT_DERIVED,PM_FPU_FSQRT
PRESET,PAPI_TLB_DM,NOT_DERIVED,PM_DTLB_MISS
PRESET,PAPI_TLB_IM,NOT_DERIVED,PM_ITLB_MISS
PRESET,PAPI_TLB_TL,DERIVED_ADD,PM_DTLB_MISS,PM_ITLB_MISS
PRESET,PAPI_HW_INT,NOT_DERIVED,PM_EXT_INT
PRESET,PAPI_STL_ICY,NOT_DERIVED,PM_0INST_FETCH
PRESET,PAPI_LD_INS,NOT_DERIVED,PM_LD_REF_L1
PRESET,PAPI_SR_INS,NOT_DERIVED,PM_ST_REF_L1
PRESET,PAPI_LST_INS,DERIVED_ADD,PM_ST_REF_L1,PM_LD_REF_L1
PRESET,PAPI_BR_INS,NOT_DERIVED,PM_BR_ISSUED
PRESET,PAPI_BR_MSP,DERIVED_ADD,PM_BR_MPRED_CR,PM_BR_MPRED_TA
PRESET,PAPI_BR_PRC,NOT_DERIVED,PM_BR_PRED_CR_TA
PRESET,PAPI_FXU_IDL,NOT_DERIVED,PM_FXU_IDLE
#
CPU,POWER6
CPU,power6
#
PRESET,PAPI_L1_DCM,DERIVED_ADD,PM_LD_MISS_L1,PM_ST_MISS_L1
PRESET,PAPI_L1_DCA,DERIVED_ADD,PM_LD_REF_L1,PM_ST_REF_L1
PRESET,PAPI_L1_LDM,NOT_DERIVED,PM_LD_MISS_L1
PRESET,PAPI_L1_STM,NOT_DERIVED,PM_ST_MISS_L1
PRESET,PAPI_L1_DCW,NOT_DERIVED,PM_ST_REF_L1
PRESET,PAPI_L1_DCR,NOT_DERIVED,PM_LD_REF_L1
PRESET,PAPI_L2_DCM,NOT_DERIVED,PM_DATA_FROM_L2MISS
PRESET,PAPI_L2_LDM,NOT_DERIVED,PM_DATA_FROM_L2MISS
PRESET,PAPI_L3_DCR,NOT_DERIVED,PM_DATA_FROM_L2MISS
PRESET,PAPI_L3_DCM,DERIVED_ADD,PM_DATA_FROM_LMEM,PM_DATA_FROM_RMEM
PRESET,PAPI_L3_LDM,DERIVED_ADD,PM_DATA_FROM_LMEM,PM_DATA_FROM_RMEM
PRESET,PAPI_L1_ICH,NOT_DERIVED,PM_INST_FROM_L1
PRESET,PAPI_L1_ICM,NOT_DERIVED,PM_L1_ICACHE_MISS
PRESET,PAPI_L2_ICM,NOT_DERIVED,PM_INST_FROM_L2MISS
PRESET,PAPI_L2_ICH,NOT_DERIVED,PM_INST_FROM_L2
PRESET,PAPI_L3_ICA,NOT_DERIVED,PM_INST_FROM_L2MISS
PRESET,PAPI_L3_ICH,NOT_DERIVED,PM_INST_FROM_L3
PRESET,PAPI_L3_ICM,NOT_DERIVED,PM_INST_FROM_L3MISS
PRESET,PAPI_FMA_INS,NOT_DERIVED,PM_FPU_FMA
PRESET,PAPI_TOT_IIS,NOT_DERIVED,PM_INST_DISP
PRESET,PAPI_TOT_INS,NOT_DERIVED,PM_INST_CMPL
PRESET,PAPI_INT_INS,DERIVED_ADD,PM_FXU0_FIN,PM_FXU1_FIN
# This definition comes from the (unreleased) IBM PM documentation
PRESET,PAPI_FP_OPS,DERIVED_POSTFIX,N0|3|*|N1|N2|+|+|,PM_FPU_FSQRT_FDIV,PM_FPU_FLOP,PM_FPU_FMA
# The following counts SQRT and DIV as one FP event instead of 4
#PRESET,PAPI_FP_OPS,DERIVED_ADD,PM_FPU_FLOP,PM_FPU_FMA
PRESET,PAPI_FP_INS,NOT_DERIVED,PM_FPU_FIN
# It appears PM_CYC is not widely available
#PRESET,PAPI_TOT_CYC,NOT_DERIVED,PM_CYC
# PM_RUN_CYC is in every group; but it doesn't overflow :(
PRESET,PAPI_TOT_CYC,NOT_DERIVED,PM_RUN_CYC
PRESET,PAPI_HW_INT,NOT_DERIVED,PM_EXT_INT
PRESET,PAPI_STL_ICY,NOT_DERIVED,PM_0INST_FETCH
PRESET,PAPI_LD_INS,NOT_DERIVED,PM_LD_REF_L1
PRESET,PAPI_SR_INS,NOT_DERIVED,PM_ST_REF_L1
PRESET,PAPI_LST_INS,DERIVED_ADD,PM_ST_REF_L1,PM_LD_REF_L1
PRESET,PAPI_BR_INS,NOT_DERIVED,PM_BRU_FIN
PRESET,PAPI_BR_MSP,NOT_DERIVED,PM_BR_MPRED
PRESET,PAPI_BR_PRC,NOT_DERIVED,PM_BR_PRED
PRESET,PAPI_FXU_IDL,NOT_DERIVED,PM_FXU_IDLE
#
CPU,POWER7
CPU,power7
#
PRESET,PAPI_L1_DCM,DERIVED_ADD,PM_LD_MISS_L1,PM_ST_MISS_L1
PRESET,PAPI_L1_LDM,NOT_DERIVED,PM_LD_MISS_L1
PRESET,PAPI_L1_STM,NOT_DERIVED,PM_ST_MISS_L1
PRESET,PAPI_L1_DCW,DERIVED_SUB,PM_ST_FIN,PM_ST_MISS_L1
PRESET,PAPI_L1_DCR,DERIVED_SUB,PM_LD_REF_L1,PM_LD_MISS_L1
PRESET,PAPI_L1_DCA,DERIVED_POSTFIX,N0|N1|-|N2|+|N3|-,PM_ST_FIN,PM_ST_MISS_L1,PM_LD_REF_L1,PM_LD_MISS_L1
PRESET,PAPI_L2_DCM,NOT_DERIVED,PM_DATA_FROM_L2MISS
PRESET,PAPI_L2_LDM,NOT_DERIVED,PM_L2_LD_MISS
PRESET,PAPI_L2_STM,NOT_DERIVED,PM_L2_ST_MISS
PRESET,PAPI_L3_DCR,NOT_DERIVED,PM_DATA_FROM_L2MISS
PRESET,PAPI_L3_DCM,DERIVED_ADD,PM_DATA_FROM_LMEM,PM_DATA_FROM_RMEM
PRESET,PAPI_L3_LDM,DERIVED_ADD,PM_DATA_FROM_LMEM,PM_DATA_FROM_RMEM
PRESET,PAPI_L1_ICH,NOT_DERIVED,PM_INST_FROM_L1
PRESET,PAPI_L1_ICM,NOT_DERIVED,PM_L1_ICACHE_MISS
PRESET,PAPI_L2_ICM,NOT_DERIVED,PM_L2_INST_MISS
PRESET,PAPI_L2_ICH,NOT_DERIVED,PM_INST_FROM_L2
PRESET,PAPI_L3_ICA,NOT_DERIVED,PM_INST_FROM_L2MISS
PRESET,PAPI_L3_ICH,NOT_DERIVED,PM_INST_FROM_L3
PRESET,PAPI_L3_ICM,NOT_DERIVED,PM_INST_FROM_L3MISS
PRESET,PAPI_FMA_INS,NOT_DERIVED,PM_VSU_FMA
PRESET,PAPI_TOT_IIS,NOT_DERIVED,PM_INST_DISP
PRESET,PAPI_TOT_INS,NOT_DERIVED,PM_INST_CMPL
PRESET,PAPI_INT_INS,DERIVED_ADD,PM_FXU0_FIN,PM_FXU1_FIN
#
# We'd like to do a 1FLOP + 2*2FLOP + 4*4FLOP + 8*8FLOP + 16*16FLOP, but
# we run out of counters (we have 4, but need 5). So for now, just assume
# that the vast majority of users won't be using the single precision
# vector FDIV and FSQRT instructions that would tick PM_VSU0_16FLOP.
#
#PRESET,PAPI_FP_OPS,DERIVED_POSTFIX,N0|N1|2|*|+|N2|4|*|+|N3|8|*|+|N4|16|*|+|,PM_VSU_1FLOP,PM_VSU_2FLOP,PM_VSU_4FLOP,PM_VSU_8FLOP,PM_VSU0_16FLOP
#
#PRESET,PAPI_FP_OPS,DERIVED_POSTFIX,N0|N1|2|*|+|N2|4|*|+|N3|8|*|+|,PM_VSU_1FLOP,PM_VSU_2FLOP,PM_VSU_4FLOP,PM_VSU_8FLOP
PRESET,PAPI_FP_OPS,NOT_DERIVED,PM_FLOP
PRESET,PAPI_FP_INS,NOT_DERIVED,PM_FLOP
PRESET,PAPI_TOT_CYC,NOT_DERIVED,PM_RUN_CYC
PRESET,PAPI_HW_INT,NOT_DERIVED,PM_EXT_INT
PRESET,PAPI_STL_ICY,DERIVED_POSTFIX,N0|N1|-|,PM_RUN_CYC,PM_1PLUS_PPC_DISP
PRESET,PAPI_SR_INS,NOT_DERIVED,PM_ST_FIN
PRESET,PAPI_LD_INS,DERIVED_ADD,PM_LD_REF_L1,PM_LD_MISS_L1
PRESET,PAPI_LST_INS,NOT_DERIVED,PM_LSU_FIN
#PRESET,PAPI_LST_INS,DERIVED_ADD,PM_LD_REF_L1,PM_LD_MISS_L1,PM_ST_FIN
PRESET,PAPI_BR_INS,NOT_DERIVED,PM_BRU_FIN
PRESET,PAPI_BR_MSP,NOT_DERIVED,PM_BR_MPRED
PRESET,PAPI_BR_PRC,NOT_DERIVED,PM_BR_PRED
PRESET,PAPI_FXU_IDL,NOT_DERIVED,PM_FXU_IDLE
#
CPU,POWER8
CPU,power8
#
PRESET,PAPI_L1_DCM,DERIVED_ADD,PM_LD_MISS_L1,PM_ST_MISS_L1
PRESET,PAPI_L1_LDM,NOT_DERIVED,PM_LD_MISS_L1
PRESET,PAPI_L1_STM,NOT_DERIVED,PM_ST_MISS_L1
PRESET,PAPI_L1_DCW,DERIVED_SUB,PM_ST_FIN,PM_ST_MISS_L1
PRESET,PAPI_L1_DCR,DERIVED_SUB,PM_LD_REF_L1,PM_LD_MISS_L1
PRESET,PAPI_L1_DCA,DERIVED_POSTFIX,N0|N1|-|N2|+|N3|-,PM_ST_FIN,PM_ST_MISS_L1,PM_LD_REF_L1,PM_LD_MISS_L1
PRESET,PAPI_L2_DCM,NOT_DERIVED,PM_DATA_FROM_L2MISS
#n/aPRESET,PAPI_L2_LDM,NOT_DERIVED,PM_L2_LD_MISS
#n/aPRESET,PAPI_L2_STM,NOT_DERIVED,PM_L2_ST_MISS
PRESET,PAPI_L3_DCR,NOT_DERIVED,PM_DATA_FROM_L2MISS
#n/aPRESET,PAPI_L3_DCM,DERIVED_ADD,PM_DATA_FROM_LMEM,PM_DATA_FROM_RMEM
#n/aPRESET,PAPI_L3_LDM,DERIVED_ADD,PM_DATA_FROM_LMEM,PM_DATA_FROM_RMEM
#n/aPRESET,PAPI_L1_ICH,NOT_DERIVED,PM_INST_FROM_L1
PRESET,PAPI_L1_ICM,NOT_DERIVED,PM_L1_ICACHE_MISS
PRESET,PAPI_L2_ICM,NOT_DERIVED,PM_INST_FROM_L2MISS
#n/aPRESET,PAPI_L2_ICM,NOT_DERIVED,PM_L2_INST_MISS
#n/aPRESET,PAPI_L2_ICH,NOT_DERIVED,PM_INST_FROM_L2
#n/aPRESET,PAPI_L3_ICA,NOT_DERIVED,PM_INST_FROM_L2MISS
#n/aPRESET,PAPI_L3_ICH,NOT_DERIVED,PM_INST_FROM_L3
PRESET,PAPI_L3_ICM,NOT_DERIVED,PM_INST_FROM_L3MISS
#n/aPRESET,PAPI_FMA_INS,NOT_DERIVED,PM_VSU_FMA
PRESET,PAPI_TOT_IIS,NOT_DERIVED,PM_INST_DISP
PRESET,PAPI_TOT_INS,NOT_DERIVED,PM_INST_CMPL
#n/aPRESET,PAPI_INT_INS,DERIVED_ADD,PM_FXU0_FIN,PM_FXU1_FIN
PRESET,PAPI_FP_OPS,NOT_DERIVED,PM_FLOP
PRESET,PAPI_FP_INS,NOT_DERIVED,PM_FLOP
PRESET,PAPI_DP_OPS,DERIVED_POSTFIX,N0|4|*|N1|8|*|N2|16|*|N3|32|*|+|+|+|,PM_VSU0_2FLOP,PM_VSU0_4FLOP,PM_VSU0_8FLOP,PM_VSU0_16FLOP
PRESET,PAPI_SP_OPS,DERIVED_POSTFIX,N0|4|*|N1|8|*|N2|16|*|N3|32|*|+|+|+|,PM_VSU0_2FLOP,PM_VSU0_4FLOP,PM_VSU0_8FLOP,PM_VSU0_16FLOP
PRESET,PAPI_TOT_CYC,NOT_DERIVED,PM_RUN_CYC
PRESET,PAPI_HW_INT,NOT_DERIVED,PM_EXT_INT
PRESET,PAPI_STL_ICY,DERIVED_POSTFIX,N0|N1|-|,PM_RUN_CYC,PM_1PLUS_PPC_DISP
PRESET,PAPI_SR_INS,NOT_DERIVED,PM_ST_FIN
#n/aPRESET,PAPI_LD_INS,DERIVED_ADD,PM_LD_REF_L1,PM_LD_MISS_L1
#n/aPRESET,PAPI_LST_INS,NOT_DERIVED,PM_LSU_FIN
#PRESET,PAPI_LST_INS,DERIVED_ADD,PM_LD_REF_L1,PM_LD_MISS_L1,PM_ST_FIN
PRESET,PAPI_BR_INS,NOT_DERIVED,PM_BR_CMPL
PRESET,PAPI_BR_MSP,NOT_DERIVED,PM_BR_MPRED_CMPL
PRESET,PAPI_BR_PRC,NOT_DERIVED,PM_BR_PRED_BR_CMPL
PRESET,PAPI_BR_TKN,NOT_DERIVED,PM_BR_TAKEN_CMPL
PRESET,PAPI_BR_UCN,NOT_DERIVED,PM_BR_UNCOND_CMPL
#n/aPRESET,PAPI_FXU_IDL,NOT_DERIVED,PM_FXU_IDLE
#
CPU,POWER9
CPU,power9
#
PRESET,PAPI_L1_DCM,DERIVED_ADD,PM_LD_MISS_L1_ALT,PM_ST_MISS_L1
PRESET,PAPI_L1_LDM,NOT_DERIVED,PM_LD_MISS_L1_ALT
PRESET,PAPI_L1_STM,NOT_DERIVED,PM_ST_MISS_L1
PRESET,PAPI_L1_DCW,DERIVED_SUB,PM_ST_FIN,PM_ST_MISS_L1
PRESET,PAPI_L1_DCR,DERIVED_SUB,PM_LD_REF_L1,PM_LD_MISS_L1_ALT
#PRESET,PAPI_L1_DCA,DERIVED_POSTFIX,N0|N1|-|N2|+|N3|-,PM_ST_FIN,PM_ST_MISS_L1,PM_LD_REF_L1,PM_LD_MISS_L1_ALT
PRESET,PAPI_L1_DCA,DERIVED_ADD,PM_LD_REF_L1,PM_ST_CMPL
PRESET,PAPI_L2_DCM,NOT_DERIVED,PM_DATA_FROM_L2MISS
PRESET,PAPI_L2_LDM,NOT_DERIVED,PM_L2_LD_MISS
PRESET,PAPI_L2_STM,NOT_DERIVED,PM_L2_ST_MISS
PRESET,PAPI_L2_DCR,NOT_DERIVED,PM_DATA_FROM_L2
PRESET,PAPI_L2_DCW,NOT_DERIVED,PM_L2_ST_HIT
PRESET,PAPI_L3_DCR,NOT_DERIVED,PM_DATA_FROM_L2MISS
PRESET,PAPI_L3_DCM,DERIVED_ADD,PM_DATA_FROM_LMEM,PM_DATA_FROM_RMEM
PRESET,PAPI_L3_LDM,DERIVED_ADD,PM_DATA_FROM_LMEM,PM_DATA_FROM_RMEM
PRESET,PAPI_L1_ICH,NOT_DERIVED,PM_INST_FROM_L1
PRESET,PAPI_L1_ICM,NOT_DERIVED,PM_L1_ICACHE_MISS
PRESET,PAPI_L2_ICM,NOT_DERIVED,PM_INST_FROM_L2MISS
PRESET,PAPI_L2_ICM,NOT_DERIVED,PM_L2_INST_MISS
PRESET,PAPI_L2_ICH,NOT_DERIVED,PM_INST_FROM_L2
PRESET,PAPI_L3_ICA,NOT_DERIVED,PM_INST_FROM_L2MISS
PRESET,PAPI_L3_ICH,NOT_DERIVED,PM_INST_FROM_L3
PRESET,PAPI_L3_ICM,NOT_DERIVED,PM_INST_FROM_L3MISS
PRESET,PAPI_FMA_INS,NOT_DERIVED,PM_FMA_CMPL
PRESET,PAPI_TOT_IIS,NOT_DERIVED,PM_INST_DISP
PRESET,PAPI_TOT_INS,NOT_DERIVED,PM_INST_CMPL
PRESET,PAPI_INT_INS,NOT_DERIVED,PM_FXU_FIN
# Note: PAPI_FP_OPS is not available on this architecture. The following combination is
# equivalent to all FLOPs; however, these events cannot be added to the same event set.
# If a user chooses, they can utilize the multiplexing feature with these events.
# 8 * PM_8FLOP_CMPL + 4 * PM_4FLOP_CMPL + 2 * PM_2FLOP_CMPL + 1 * PM_1FLOP_CMPL PRESET,PAPI_FP_INS,NOT_DERIVED,PM_FLOP_CMPL PRESET,PAPI_DP_OPS,NOT_DERIVED,PM_DP_QP_FLOP_CMPL PRESET,PAPI_SP_OPS,NOT_DERIVED,PM_SP_FLOP_CMPL PRESET,PAPI_TOT_CYC,NOT_DERIVED,PM_RUN_CYC PRESET,PAPI_HW_INT,NOT_DERIVED,PM_EXT_INT PRESET,PAPI_STL_ICY,DERIVED_POSTFIX,N0|N1|-|,PM_RUN_CYC,PM_1PLUS_PPC_DISP PRESET,PAPI_SR_INS,NOT_DERIVED,PM_ST_FIN PRESET,PAPI_LD_INS,NOT_DERIVED,PM_LD_REF_L1 PRESET,PAPI_LST_INS,NOT_DERIVED,PM_LSU_FIN PRESET,PAPI_LST_INS,DERIVED_ADD,PM_LD_REF_L1,PM_LD_MISS_L1,PM_ST_FIN PRESET,PAPI_BR_INS,NOT_DERIVED,PM_BRU_FIN PRESET,PAPI_BR_MSP,NOT_DERIVED,PM_TAKEN_BR_MPRED_CMPL PRESET,PAPI_BR_PRC,NOT_DERIVED,PM_BR_PRED PRESET,PAPI_BR_CN,DERIVED_SUB,PM_BR_CMPL,PM_BR_UNCOND PRESET,PAPI_BR_NTK,DERIVED_POSTFIX,N0|N1|-|,PM_BR_CMPL,PM_BR_TAKEN_CMPL PRESET,PAPI_BR_UCN,NOT_DERIVED,PM_BR_UNCOND PRESET,PAPI_BR_TKN,NOT_DERIVED,PM_BR_CORECT_PRED_TAKEN_CMPL PRESET,PAPI_FXU_IDL,NOT_DERIVED,PM_FXU_IDLE # CPU,POWER10 CPU,power10 # PRESET,PAPI_L1_DCM,DERIVED_ADD,PM_DATA_FROM_L1MISS,PM_ST_MISS_L1 PRESET,PAPI_L1_LDM,NOT_DERIVED,PM_DATA_FROM_L1MISS PRESET,PAPI_L1_STM,NOT_DERIVED,PM_ST_MISS_L1 PRESET,PAPI_L1_DCW,DERIVED_SUB,PM_ST_FIN,PM_ST_MISS_L1 PRESET,PAPI_L1_DCR,NOT_DERIVED,PM_LD_HIT_L1 PRESET,PAPI_L1_DCA,DERIVED_ADD,PM_LD_REF_L1,PM_ST_CMPL PRESET,PAPI_L2_DCM,DERIVED_POSTFIX,N0|N1|2|*|+|,PM_DATA_FROM_L2MISS,PM_L2_ST_MISS PRESET,PAPI_L2_LDM,DERIVED_POSTFIX,N0|2|*|,PM_L2_LD_MISS PRESET,PAPI_L2_STM,DERIVED_POSTFIX,N0|2|*|,PM_L2_ST_MISS PRESET,PAPI_L2_DCR,NOT_DERIVED,PM_DATA_FROM_L2 PRESET,PAPI_L2_DCW,DERIVED_POSTFIX,N0|2|*|,PM_L2_ST_HIT PRESET,PAPI_L3_DCR,NOT_DERIVED,PM_DATA_FROM_L3 PRESET,PAPI_L3_DCM,NOT_DERIVED,PM_DATA_FROM_L3MISS PRESET,PAPI_L3_LDM,NOT_DERIVED,PM_DATA_FROM_L3MISS PRESET,PAPI_L1_ICH,NOT_DERIVED,PM_INST_FROM_L1 PRESET,PAPI_L1_ICM,NOT_DERIVED,PM_L1_ICACHE_MISS PRESET,PAPI_L2_ICM,NOT_DERIVED,PM_INST_FROM_L2MISS PRESET,PAPI_L2_ICH,NOT_DERIVED,PM_INST_FROM_L2 
PRESET,PAPI_L3_ICA,NOT_DERIVED,PM_INST_FROM_L2MISS PRESET,PAPI_L3_ICH,NOT_DERIVED,PM_INST_FROM_L3 PRESET,PAPI_L3_ICM,NOT_DERIVED,PM_INST_FROM_L3MISS PRESET,PAPI_FMA_INS,NOT_DERIVED,PM_FMA_CMPL PRESET,PAPI_TOT_IIS,NOT_DERIVED,PM_INST_DISP PRESET,PAPI_TOT_INS,NOT_DERIVED,PM_INST_CMPL PRESET,PAPI_INT_INS,NOT_DERIVED,PM_FXU_ISSUE # Note: PAPI_FP_OPS is not available on this architecture. The following combination is # equivalent to all FLOPs; however, these events cannot be added to the same event set. # If a user chooses, they can utilize the multiplexing feature with these events. # 8 * PM_8FLOP_CMPL + 4 * PM_4FLOP_CMPL + 2 * PM_2FLOP_CMPL + 1 * PM_1FLOP_CMPL PRESET,PAPI_FP_INS,NOT_DERIVED,PM_FLOP_CMPL PRESET,PAPI_DP_OPS,NOT_DERIVED,PM_DPP_FLOP_CMPL PRESET,PAPI_SP_OPS,NOT_DERIVED,PM_SP_FLOP_CMPL PRESET,PAPI_TOT_CYC,NOT_DERIVED,PM_RUN_CYC PRESET,PAPI_HW_INT,NOT_DERIVED,PM_EXT_INT PRESET,PAPI_STL_ICY,DERIVED_POSTFIX,N0|N1|-|,PM_RUN_CYC,PM_1PLUS_PPC_DISP PRESET,PAPI_SR_INS,NOT_DERIVED,PM_ST_CMPL PRESET,PAPI_LD_INS,NOT_DERIVED,PM_LD_CMPL PRESET,PAPI_LST_INS,DERIVED_ADD,PM_LD_CMPL,PM_ST_CMPL PRESET,PAPI_BR_INS,NOT_DERIVED,PM_BR_CMPL PRESET,PAPI_BR_MSP,NOT_DERIVED,PM_BR_MPRED_CMPL PRESET,PAPI_BR_PRC,DERIVED_ADD,PM_PRED_BR_TKN_COND_DIR,PM_PRED_BR_NTKN_COND_DIR PRESET,PAPI_BR_CN,NOT_DERIVED,PM_BR_COND_CMPL PRESET,PAPI_BR_NTK,DERIVED_SUB,PM_BR_FIN,PM_BR_TKN_FIN PRESET,PAPI_BR_UCN,NOT_DERIVED,PM_BR_TKN_UNCOND_FIN PRESET,PAPI_BR_TKN,DERIVED_SUB,PM_BR_TKN_FIN,PM_BR_TKN_UNCOND_FIN # CPU,ultra12 # PRESET,PAPI_TOT_CYC,NOT_DERIVED,CYCLE_CNT PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTR_CNT PRESET,PAPI_L1_ICM,NOT_DERIVED,DISPATCH0_IC_MISS PRESET,PAPI_L1_ICA,NOT_DERIVED,IC_REF PRESET,PAPI_L1_DCR,NOT_DERIVED,DC_RD PRESET,PAPI_L1_DCW,NOT_DERIVED,DC_WR PRESET,PAPI_MEM_RCY,NOT_DERIVED,LOAD_USE PRESET,PAPI_L2_TCA,NOT_DERIVED,EC_REF PRESET,PAPI_BR_MSP,NOT_DERIVED,DISPATCH0_MISPRED PRESET,PAPI_L1_ICH,NOT_DERIVED,IC_HIT PRESET,PAPI_L2_TCH,NOT_DERIVED,EC_HIT 
PRESET,PAPI_L2_TCM,DERIVED_SUB,EC_REF,EC_HIT # CPU,ultra3 CPU,ultra3i CPU,ultra3+ # PRESET,PAPI_TOT_CYC,NOT_DERIVED,CYCLE_CNT PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTR_CNT PRESET,PAPI_L1_ICM,NOT_DERIVED,DISPATCH0_IC_MISS PRESET,PAPI_L1_ICA,NOT_DERIVED,IC_REF PRESET,PAPI_L1_DCR,NOT_DERIVED,DC_RD PRESET,PAPI_L1_DCW,NOT_DERIVED,DC_WR PRESET,PAPI_L2_TCA,NOT_DERIVED,EC_REF PRESET,PAPI_BR_TKN,NOT_DERIVED,IU_STAT_BR_COUNT_TAKEN PRESET,PAPI_BR_NTK,NOT_DERIVED,IU_STAT_BR_COUNT_UNTAKEN PRESET,PAPI_BR_MSP,DERIVED_ADD,IU_STAT_BR_MISS_TAKEN,IU_STAT_BR_MISS_UNTAKEN PRESET,PAPI_BR_INS,DERIVED_ADD,IU_STAT_BR_COUNT_TAKEN,IU_STAT_BR_COUNT_UNTAKEN PRESET,PAPI_L2_TCM,NOT_DERIVED,EC_MISSES PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB_MISS PRESET,PAPI_TLB_DM,NOT_DERIVED,DTLB_MISS # CPU,ultra4+ # PRESET,PAPI_TOT_CYC,NOT_DERIVED,CYCLE_CNT PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTR_CNT PRESET,PAPI_L1_ICM,NOT_DERIVED,DISPATCH0_IC_MISS PRESET,PAPI_L1_ICA,NOT_DERIVED,IC_REF PRESET,PAPI_L1_DCR,NOT_DERIVED,DC_RD PRESET,PAPI_L1_DCW,NOT_DERIVED,DC_WR PRESET,PAPI_L2_TCA,NOT_DERIVED,L2_REF PRESET,PAPI_BR_TKN,NOT_DERIVED,IU_STAT_BR_COUNT_TAKEN PRESET,PAPI_BR_NTK,NOT_DERIVED,IU_STAT_BR_COUNT_UNTAKEN PRESET,PAPI_BR_MSP,DERIVED_ADD,IU_STAT_BR_MISS_TAKEN,IU_STAT_BR_MISS_UNTAKEN PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB_MISS PRESET,PAPI_TLB_DM,NOT_DERIVED,DTLB_MISS PRESET,PAPI_L3_TCM,NOT_DERIVED,L3_MISS # CPU,niagara # PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTR_CNT PRESET,PAPI_FP_INS,NOT_DERIVED,FP_INSTR_CNT PRESET,PAPI_L1_ICM,NOT_DERIVED,IC_MISS PRESET,PAPI_L1_DCM,NOT_DERIVED,DC_MISS PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB_MISS PRESET,PAPI_TLB_DM,NOT_DERIVED,DTLB_MISS # CPU,niagara2 # CPU,Cell # PRESET,PAPI_TOT_INS,DERIVED_POSTFIX,N0|N1|+|2|*|,PPC_INST_COMMIT_TH0,PPC_INST_COMMIT_TH1 #PRESET,PAPI_L1_DCM,DERIVED_ADD,L1_DCACHE_MISS_TH0,L1_DCACHE_MISS_TH1 where's TH1?? 
PRESET,PAPI_L1_DCM,NOT_DERIVED,L1_DCACHE_MISS_TH0 PRESET,PAPI_L2_TCH,NOT_DERIVED,L2_CACHE_HIT PRESET,PAPI_L2_TCM,NOT_DERIVED,L2_CACHE_MISS PRESET,PAPI_L2_LDM,NOT_DERIVED,L2_LD_MISS PRESET,PAPI_L2_STM,NOT_DERIVED,L2_ST_MISS PRESET,PAPI_BR_MSP,DERIVED_ADD,BRANCH_FLUSH_TH0,BRANCH_FLUSH_TH1 PRESET,PAPI_BR_INS,DERIVED_ADD,BRANCH_COMMIT_TH0,BRANCH_COMMIT_TH1 # CPU,arm_1176 # PRESET,PAPI_L1_ICM,NOT_DERIVED,ICACHE_MISS PRESET,PAPI_STL_ICY,NOT_DERIVED,IBUF_STALL PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB_MISS PRESET,PAPI_TLB_DM,NOT_DERIVED,DTLB_MISS PRESET,PAPI_BR_INS,NOT_DERIVED,BR_EXEC PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MISPREDICT PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTR_EXEC PRESET,PAPI_L1_DCH,NOT_DERIVED,DCACHE_HIT PRESET,PAPI_L1_DCA,NOT_DERIVED,DCACHE_ACCESS PRESET,PAPI_L1_DCM,NOT_DERIVED,DCACHE_MISS PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES # CPU,arm_ac7 # PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED PRESET,PAPI_LD_INS,NOT_DERIVED,DATA_READS PRESET,PAPI_SR_INS,NOT_DERIVED,DATA_WRITES PRESET,PAPI_HW_INT,NOT_DERIVED,EXCEPTION_TAKEN PRESET,PAPI_BR_INS,NOT_DERIVED,SW_CHANGE_PC PRESET,PAPI_BR_MSP,NOT_DERIVED,BRANCH_MISPRED PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_L1_DCA,NOT_DERIVED,DATA_MEM_ACCESS PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_CACHE_ACCESS PRESET,PAPI_L2_DCA,NOT_DERIVED,L2D_CACHE_ACCESS PRESET,PAPI_L2_TCM,NOT_DERIVED,EXTERNAL_MEMORY_REQUEST PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_CACHE_REFILL PRESET,PAPI_TLB_IM,NOT_DERIVED,L1I_TLB_REFILL PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_CACHE_REFILL PRESET,PAPI_TLB_DM,NOT_DERIVED,L1D_TLB_REFILL PRESET,PAPI_L2_DCM,NOT_DERIVED,L2D_CACHE_REFILL # CPU,arm_ac8 # PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTR_EXECUTED PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_BR_INS,NOT_DERIVED,PC_WRITE PRESET,PAPI_BR_MSP,NOT_DERIVED,PC_BRANCH_MIS_PRED PRESET,PAPI_LD_INS,NOT_DERIVED,DREAD PRESET,PAPI_SR_INS,NOT_DERIVED,DWRITE PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB_MISS PRESET,PAPI_TLB_DM,NOT_DERIVED,DTLB_REFILL PRESET,PAPI_L1_DCA,NOT_DERIVED,DCACHE_ACCESS 
PRESET,PAPI_L1_DCM,NOT_DERIVED,DCACHE_REFILL PRESET,PAPI_L1_ICA,NOT_DERIVED,L1_INST PRESET,PAPI_L1_ICM,NOT_DERIVED,IFETCH_MISS PRESET,PAPI_L2_TCA,NOT_DERIVED,L2_ACCESS PRESET,PAPI_L2_TCM,NOT_DERIVED,L2_CACHE_MISS PRESET,PAPI_BR_TKN,NOT_DERIVED,PC_BRANCH_EXECUTED PRESET,PAPI_STL_ICY,NOT_DERIVED,CYCLES_INST_STALL # CPU,arm_ac9 # PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_OUT_OF_RENAME_STAGE PRESET,PAPI_TOT_IIS,NOT_DERIVED,MAIN_UNIT_EXECUTED_INST PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_HW_INT,NOT_DERIVED,EXT_INTERRUPTS PRESET,PAPI_FP_INS,NOT_DERIVED,FP_EXECUTED_INST PRESET,PAPI_VEC_INS,NOT_DERIVED,NEON_EXECUTED_INST PRESET,PAPI_BR_INS,NOT_DERIVED,PC_WRITE PRESET,PAPI_BR_MSP,NOT_DERIVED,PC_BRANCH_MIS_PRED PRESET,PAPI_LD_INS,NOT_DERIVED,DREAD PRESET,PAPI_SR_INS,NOT_DERIVED,DWRITE PRESET,PAPI_TLB_IM,NOT_DERIVED,ITLB_MISS PRESET,PAPI_TLB_DM,NOT_DERIVED,DTLB_REFILL PRESET,PAPI_L1_DCA,NOT_DERIVED,DCACHE_ACCESS PRESET,PAPI_L1_DCM,NOT_DERIVED,DCACHE_REFILL PRESET,PAPI_L1_ICM,NOT_DERIVED,IFETCH_MISS # CPU,arm_ac15 CPU,arm_ac57 CPU,arm_ac72 # PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED PRESET,PAPI_TOT_IIS,NOT_DERIVED,INST_SPEC_EXEC PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_FP_INS,NOT_DERIVED,INST_SPEC_EXEC_VFP PRESET,PAPI_VEC_INS,NOT_DERIVED,INST_SPEC_EXEC_SIMD PRESET,PAPI_BR_INS,NOT_DERIVED,INST_SPEC_EXEC_SOFT_PC PRESET,PAPI_BR_MSP,NOT_DERIVED,BRANCH_MISPRED PRESET,PAPI_LD_INS,NOT_DERIVED,DATA_MEM_READ_ACCESS PRESET,PAPI_SR_INS,NOT_DERIVED,DATA_MEM_WRITE_ACCESS PRESET,PAPI_L1_DCA,DERIVED_ADD,L1D_READ_ACCESS,L1D_WRITE_ACCESS PRESET,PAPI_L1_DCM,DERIVED_ADD,L1D_READ_REFILL,L1D_WRITE_REFILL PRESET,PAPI_L1_DCR,NOT_DERIVED,L1D_READ_ACCESS PRESET,PAPI_L1_DCW,NOT_DERIVED,L1D_WRITE_ACCESS PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_CACHE_ACCESS PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_CACHE_REFILL PRESET,PAPI_L2_DCH,NOT_DERIVED,L2D_CACHE_ACCESS PRESET,PAPI_L2_DCM,NOT_DERIVED,L2D_CACHE_REFILL PRESET,PAPI_L2_DCR,NOT_DERIVED,L2D_READ_ACCESS 
PRESET,PAPI_L2_DCW,NOT_DERIVED,L2D_WRITE_ACCESS PRESET,PAPI_L2_LDM,NOT_DERIVED,L2D_READ_REFILL PRESET,PAPI_L2_STM,NOT_DERIVED,L2D_WRITE_REFILL ##################### # ARM Cortex A53 # ##################### # These are based entirely on libpfm4 event table # They have not been tested on real hardware CPU,arm_ac53 # PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_BR_INS,NOT_DERIVED,BRANCH_PRED PRESET,PAPI_BR_MSP,NOT_DERIVED,BRANCH_MISPRED PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_CACHE_ACCESS PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_CACHE_REFILL PRESET,PAPI_LD_INS,NOT_DERIVED,LD_RETIRED PRESET,PAPI_SR_INS,NOT_DERIVED,ST_RETIRED PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_CACHE_REFILL PRESET,PAPI_L2_DCA,NOT_DERIVED,L2D_CACHE_ACCESS PRESET,PAPI_L2_DCM,NOT_DERIVED,L2D_CACHE_REFILL PRESET,PAPI_TLB_IM,NOT_DERIVED,L1I_TLB_REFILL PRESET,PAPI_TLB_DM,NOT_DERIVED,L1D_TLB_REFILL PRESET,PAPI_HW_INT,NOT_DERIVED,EXCEPTION_TAKEN # CPU,arm_ac76 # PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_CACHE_REFILL PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_CACHE_REFILL PRESET,PAPI_L2_DCM,NOT_DERIVED,L2D_CACHE_REFILL PRESET,PAPI_L3_DCM,NOT_DERIVED,L3D_CACHE_REFILL PRESET,PAPI_L1_TCM,DERIVED_ADD,L1I_CACHE_REFILL,L1D_CACHE_REFILL PRESET,PAPI_L2_TCM,NOT_DERIVED,L2D_CACHE_REFILL PRESET,PAPI_L3_TCM,NOT_DERIVED,L3D_CACHE_REFILL PRESET,PAPI_L3_LDM,NOT_DERIVED,LL_CACHE_MISS_RD PRESET,PAPI_L3_STM,DERIVED_SUB,L3D_CACHE_REFILL,LL_CACHE_MISS_RD PRESET,PAPI_TLB_DM,DERIVED_ADD,L1D_TLB_REFILL,L2D_TLB_REFILL PRESET,PAPI_TLB_IM,NOT_DERIVED,L1I_TLB_REFILL PRESET,PAPI_TLB_TL,DERIVED_ADD,L1I_TLB_REFILL,L1D_TLB_REFILL,L2D_TLB_REFILL PRESET,PAPI_L1_LDM,NOT_DERIVED,L1D_CACHE_REFILL_RD PRESET,PAPI_L1_STM,NOT_DERIVED,L1D_CACHE_REFILL_WR PRESET,PAPI_L2_LDM,NOT_DERIVED,L2D_CACHE_REFILL_RD PRESET,PAPI_L2_STM,NOT_DERIVED,L2D_CACHE_REFILL_WR PRESET,PAPI_L3_DCH,DERIVED_SUB,L3D_CACHE,L3D_CACHE_REFILL PRESET,PAPI_STL_ICY,DERIVED_ADD,STALL_FRONTEND,STALL_BACKEND PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MIS_PRED
PRESET,PAPI_BR_PRC,DERIVED_SUB,BR_PRED,BR_MIS_PRED PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED PRESET,PAPI_INT_INS,NOT_DERIVED,DP_SPEC PRESET,PAPI_FP_INS,NOT_DERIVED,VFP_SPEC PRESET,PAPI_LD_INS,NOT_DERIVED,LD_SPEC PRESET,PAPI_SR_INS,NOT_DERIVED,ST_SPEC PRESET,PAPI_BR_INS,NOT_DERIVED,BR_RETIRED PRESET,PAPI_VEC_INS,NOT_DERIVED,ASE_SPEC PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_L1_DCH,DERIVED_SUB,L1D_CACHE,L1D_CACHE_REFILL PRESET,PAPI_L2_DCH,DERIVED_SUB,L2D_CACHE,L2D_CACHE_REFILL PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_CACHE PRESET,PAPI_L2_DCA,DERIVED_SUB,L2D_CACHE,L2D_CACHE_RD #PRESET,PAPI_L2_DCA,NOT_DERIVED,L2D_CACHE PRESET,PAPI_L3_DCA,NOT_DERIVED,L3D_CACHE PRESET,PAPI_L1_DCR,NOT_DERIVED,L1D_CACHE_RD PRESET,PAPI_L2_DCR,NOT_DERIVED,L2D_CACHE_RD PRESET,PAPI_L3_DCR,NOT_DERIVED,L3D_CACHE_RD PRESET,PAPI_L1_DCW,NOT_DERIVED,L1D_CACHE_WR PRESET,PAPI_L2_DCW,NOT_DERIVED,L2D_CACHE_WR PRESET,PAPI_L1_ICH,DERIVED_SUB,L1I_CACHE,L1I_CACHE_REFILL PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_CACHE PRESET,PAPI_L1_TCH,DERIVED_POSTFIX,N0|N1|+|N2|-|N3|-|,L1D_CACHE,L1I_CACHE,L1I_CACHE_REFILL,L1D_CACHE_REFILL PRESET,PAPI_L2_TCH,DERIVED_SUB,L2D_CACHE,L2D_CACHE_REFILL PRESET,PAPI_L3_TCH,DERIVED_SUB,LL_CACHE_RD,LL_CACHE_MISS_RD PRESET,PAPI_L1_TCA,DERIVED_ADD,L1I_CACHE,L1D_CACHE PRESET,PAPI_L2_TCA,NOT_DERIVED,L2D_CACHE PRESET,PAPI_L3_TCA,NOT_DERIVED,L2D_CACHE_REFILL PRESET,PAPI_L2_TCR,NOT_DERIVED,L2D_CACHE_RD PRESET,PAPI_L3_TCR,NOT_DERIVED,L3_CACHE_RD PRESET,PAPI_L2_TCW,NOT_DERIVED,L2D_CACHE_WR PRESET,PAPI_L3_TCW,DERIVED_SUB,L3D_CACHE,L3_CACHE_RD # CPU,qcom_krait # PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTR_EXECUTED PRESET,PAPI_TOT_IIS,NOT_DERIVED,INSTR_EXECUTED PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_BR_INS,NOT_DERIVED,PC_WRITE PRESET,PAPI_BR_MSP,NOT_DERIVED,PC_BRANCH_MIS_PRED PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_CACHE_ACCESS PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_CACHE_REFILL # Will be supported eventually #PRESET,PAPI_L1_ICA,NOT_DERIVED,KRAIT_L1_ICACHE_ACCESS
#PRESET,PAPI_L1_ICM,NOT_DERIVED,KRAIT_L1_ICACHE_MISS # CPU,arm_xgene # PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_FP_INS,NOT_DERIVED,INST_SPEC_EXEC_VFP PRESET,PAPI_VEC_INS,NOT_DERIVED,INST_SPEC_EXEC_SIMD PRESET,PAPI_BR_INS,NOT_DERIVED,INST_SPEC_EXEC_SOFT_PC PRESET,PAPI_BR_MSP,NOT_DERIVED,BRANCH_MISPRED PRESET,PAPI_LD_INS,NOT_DERIVED,DATA_MEM_READ_ACCESS PRESET,PAPI_SR_INS,NOT_DERIVED,DATA_MEM_WRITE_ACCESS PRESET,PAPI_L1_DCA,DERIVED_ADD,L1D_READ_ACCESS,L1D_WRITE_ACCESS PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_CACHE_REFILL PRESET,PAPI_L1_DCR,NOT_DERIVED,L1D_READ_ACCESS PRESET,PAPI_L1_DCW,NOT_DERIVED,L1D_WRITE_ACCESS PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_CACHE_ACCESS PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_CACHE_REFILL PRESET,PAPI_L2_DCH,NOT_DERIVED,L2D_CACHE_ACCESS PRESET,PAPI_L2_DCM,NOT_DERIVED,L2D_CACHE_REFILL PRESET,PAPI_L2_DCR,NOT_DERIVED,L2D_READ_ACCESS PRESET,PAPI_L2_DCW,NOT_DERIVED,L2D_WRITE_ACCESS PRESET,PAPI_L2_LDM,NOT_DERIVED,L2D_READ_REFILL PRESET,PAPI_L2_STM,NOT_DERIVED,L2D_WRITE_REFILL ##################### # ARM ThunderX2 # ##################### CPU,arm_thunderx2 # PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_FP_INS,NOT_DERIVED,VFP_SPEC PRESET,PAPI_VEC_INS,NOT_DERIVED,ASE_SPEC PRESET,PAPI_BR_INS,NOT_DERIVED,BR_RETIRED PRESET,PAPI_LD_INS,NOT_DERIVED,LD_RETIRED PRESET,PAPI_SR_INS,NOT_DERIVED,ST_RETIRED PRESET,PAPI_L1_DCA,DERIVED_ADD,L1D_CACHE_RD,L1D_CACHE_WR PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_CACHE_REFILL PRESET,PAPI_L1_DCR,NOT_DERIVED,L1D_CACHE_RD PRESET,PAPI_L1_DCW,NOT_DERIVED,L1D_CACHE_WR PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_CACHE PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_CACHE_REFILL PRESET,PAPI_L2_DCH,NOT_DERIVED,L2D_CACHE PRESET,PAPI_L2_DCM,NOT_DERIVED,L2D_CACHE_REFILL PRESET,PAPI_L2_DCR,NOT_DERIVED,L2D_CACHE_RD PRESET,PAPI_L2_DCW,NOT_DERIVED,L2D_CACHE_WR PRESET,PAPI_L2_LDM,NOT_DERIVED,L2D_CACHE_REFILL_RD ######################### # ARM Fujitsu A64FX #
######################### CPU,arm_a64fx # PRESET,PAPI_PRF_DM,DERIVED_SUB,L2D_CACHE_REFILL_PRF,L2D_CACHE_MIBMCH_PRF PRESET,PAPI_MEM_SCY,NOT_DERIVED,LD_COMP_WAIT_L2_MISS PRESET,PAPI_STL_ICY,DERIVED_ADD,STALL_FRONTEND,STALL_BACKEND PRESET,PAPI_STL_CCY,NOT_DERIVED,0INST_COMMIT PRESET,PAPI_FUL_CCY,DERIVED_SUB,CPU_CYCLES,0INST_COMMIT,1INST_COMMIT,2INST_COMMIT,3INST_COMMIT PRESET,PAPI_BRU_IDL,NOT_DERIVED,BR_COMP_WAIT PRESET,PAPI_FXU_IDL,DERIVED_SUB,EU_COMP_WAIT,FL_COMP_WAIT PRESET,PAPI_FPU_IDL,NOT_DERIVED,FL_COMP_WAIT PRESET,PAPI_LSU_IDL,NOT_DERIVED,LD_COMP_WAIT PRESET,PAPI_HW_INT,DERIVED_ADD,EXC_IRQ,EXC_FIQ PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MIS_PRED PRESET,PAPI_BR_PRC,DERIVED_SUB,BR_PRED,BR_MIS_PRED PRESET,PAPI_FMA_INS,NOT_DERIVED,FP_FMA_SPEC PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_FP_INS,NOT_DERIVED,FP_SPEC PRESET,PAPI_LD_INS,NOT_DERIVED,LD_SPEC PRESET,PAPI_SR_INS,NOT_DERIVED,ST_SPEC PRESET,PAPI_BR_INS,NOT_DERIVED,BR_PRED PRESET,PAPI_VEC_INS,NOT_DERIVED,SIMD_INST_RETIRED PRESET,PAPI_RES_STL,NOT_DERIVED,STALL_BACKEND PRESET,PAPI_LST_INS,NOT_DERIVED,LDST_SPEC PRESET,PAPI_SYC_INS,DERIVED_ADD,ISB_SPEC,DSB_SPEC,DMB_SPEC #PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_CACHE #PRESET,PAPI_L1_DCH,DERIVED_SUB,L1D_CACHE,L1D_CACHE_REFILL PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_CACHE_REFILL PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_CACHE PRESET,PAPI_L1_ICH,DERIVED_SUB,L1I_CACHE,L1I_CACHE_REFILL PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_CACHE_REFILL #PRESET,PAPI_L1_TCA,DERIVED_ADD,L1D_CACHE,L1I_CACHE #PRESET,PAPI_L1_TCH,DERIVED_POSTFIX,N0|N1|-|N2|+|N3|-|,L1D_CACHE,L1D_CACHE_REFILL,L1I_CACHE,L1I_CACHE_REFILL PRESET,PAPI_L1_TCM,DERIVED_ADD,L1D_CACHE_REFILL,L1I_CACHE_REFILL PRESET,PAPI_L2_DCA,NOT_DERIVED,L2D_CACHE PRESET,PAPI_L2_DCH,DERIVED_POSTFIX,N0|N1|-|N2|+|N3|+|,L2D_CACHE,L2D_CACHE_REFILL,L2D_SWAP_DM,L2D_CACHE_MIBMCH_PRF PRESET,PAPI_L2_DCM,DERIVED_SUB,L2D_CACHE_REFILL,L2D_SWAP_DM,L2D_CACHE_MIBMCH_PRF PRESET,PAPI_L2_TCA,NOT_DERIVED,L2D_CACHE 
PRESET,PAPI_L2_TCH,DERIVED_POSTFIX,N0|N1|-|N2|+|N3|+|,L2D_CACHE,L2D_CACHE_REFILL,L2D_SWAP_DM,L2D_CACHE_MIBMCH_PRF PRESET,PAPI_L2_TCM,DERIVED_SUB,L2D_CACHE_REFILL,L2D_SWAP_DM,L2D_CACHE_MIBMCH_PRF PRESET,PAPI_TLB_DM,NOT_DERIVED,L2D_TLB_REFILL PRESET,PAPI_TLB_IM,NOT_DERIVED,L2I_TLB_REFILL PRESET,PAPI_TLB_TL,DERIVED_ADD,L2D_TLB_REFILL,L2I_TLB_REFILL PRESET,PAPI_FP_OPS,DERIVED_POSTFIX,N0|512|128|/|*|N1|+|,FP_SCALE_OPS_SPEC,FP_FIXED_OPS_SPEC PRESET,PAPI_SP_OPS,DERIVED_POSTFIX,N0|512|128|/|*|N1|+|,FP_SP_SCALE_OPS_SPEC,FP_SP_FIXED_OPS_SPEC PRESET,PAPI_DP_OPS,DERIVED_POSTFIX,N0|512|128|/|*|N1|+|,FP_DP_SCALE_OPS_SPEC,FP_DP_FIXED_OPS_SPEC ######################### # ARM Neoverse N1 # ######################### CPU,arm_n1 # PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_FP_INS,NOT_DERIVED,VFP_SPEC PRESET,PAPI_VEC_INS,NOT_DERIVED,ASE_SPEC PRESET,PAPI_BR_INS,NOT_DERIVED,BR_RETIRED PRESET,PAPI_BR_PRC,DERIVED_SUB,BR_PRED,BR_MIS_PRED PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MIS_PRED PRESET,PAPI_BR_INS,NOT_DERIVED,BR_PRED PRESET,PAPI_LD_INS,NOT_DERIVED,LD_SPEC PRESET,PAPI_SR_INS,NOT_DERIVED,ST_SPEC PRESET,PAPI_LST_INS,DERIVED_ADD,LD_SPEC,ST_SPEC PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_CACHE PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_CACHE_REFILL PRESET,PAPI_L1_DCR,NOT_DERIVED,L1D_CACHE_RD PRESET,PAPI_L1_DCW,NOT_DERIVED,L1D_CACHE_WR PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_CACHE PRESET,PAPI_L1_ICH,DERIVED_SUB,L1I_CACHE,L1I_CACHE_REFILL PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_CACHE_REFILL PRESET,PAPI_L2_TCA,NOT_DERIVED,L2D_CACHE_ACCESS PRESET,PAPI_L2_DCA,DERIVED_ADD,L2D_CACHE_RD,L2D_CACHE_WR PRESET,PAPI_L2_DCM,NOT_DERIVED,L2D_CACHE_REFILL PRESET,PAPI_L2_DCR,NOT_DERIVED,L2D_CACHE_RD PRESET,PAPI_L2_DCW,NOT_DERIVED,L2D_CACHE_WR PRESET,PAPI_L2_LDM,NOT_DERIVED,L2D_CACHE_REFILL_RD PRESET,PAPI_STL_ICY,DERIVED_ADD,STALL_FRONTEND,STALL_BACKEND PRESET,PAPI_RES_STL,NOT_DERIVED,STALL_BACKEND PRESET,PAPI_HW_INT,DERIVED_ADD,EXC_IRQ,EXC_FIQ 
PRESET,PAPI_SYC_INS,DERIVED_ADD,ISB_SPEC,DSB_SPEC,DMB_SPEC PRESET,PAPI_TLB_DM,NOT_DERIVED,L2D_TLB_REFILL ######################### # ARM Neoverse N2 # ######################### CPU,arm_n2 # PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_FP_INS,NOT_DERIVED,VFP_SPEC PRESET,PAPI_VEC_INS,DERIVED_ADD,SVE_INST_SPEC,ASE_INST_SPEC PRESET,PAPI_BR_INS,NOT_DERIVED,BR_RETIRED PRESET,PAPI_BR_PRC,DERIVED_SUB,BR_PRED,BR_MIS_PRED PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MIS_PRED PRESET,PAPI_BR_INS,NOT_DERIVED,BR_PRED PRESET,PAPI_LD_INS,NOT_DERIVED,LD_SPEC PRESET,PAPI_SR_INS,NOT_DERIVED,ST_SPEC PRESET,PAPI_LST_INS,DERIVED_ADD,LD_SPEC,ST_SPEC PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_CACHE PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_CACHE_REFILL PRESET,PAPI_L1_DCR,NOT_DERIVED,L1D_CACHE_RD PRESET,PAPI_L1_DCW,NOT_DERIVED,L1D_CACHE_WR PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_CACHE PRESET,PAPI_L1_ICH,DERIVED_SUB,L1I_CACHE,L1I_CACHE_REFILL PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_CACHE_REFILL PRESET,PAPI_L2_TCA,NOT_DERIVED,L2D_CACHE_ACCESS PRESET,PAPI_L2_DCA,DERIVED_ADD,L2D_CACHE_RD,L2D_CACHE_WR PRESET,PAPI_L2_DCM,NOT_DERIVED,L2D_CACHE_REFILL PRESET,PAPI_L2_DCR,NOT_DERIVED,L2D_CACHE_RD PRESET,PAPI_L2_DCW,NOT_DERIVED,L2D_CACHE_WR PRESET,PAPI_L2_LDM,NOT_DERIVED,L2D_CACHE_REFILL_RD PRESET,PAPI_STL_ICY,DERIVED_ADD,STALL_FRONTEND,STALL_BACKEND PRESET,PAPI_RES_STL,NOT_DERIVED,STALL_BACKEND PRESET,PAPI_HW_INT,DERIVED_ADD,EXC_IRQ,EXC_FIQ PRESET,PAPI_SYC_INS,DERIVED_ADD,ISB_SPEC,DSB_SPEC,DMB_SPEC PRESET,PAPI_TLB_DM,NOT_DERIVED,L2D_TLB_REFILL ######################### # ARM Neoverse V1 # ######################### CPU,arm_v1 # PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_FP_INS,NOT_DERIVED,VFP_SPEC PRESET,PAPI_VEC_INS,DERIVED_ADD,SVE_INST_SPEC,ASE_INST_SPEC PRESET,PAPI_BR_INS,NOT_DERIVED,BR_RETIRED PRESET,PAPI_BR_PRC,DERIVED_SUB,BR_PRED,BR_MIS_PRED PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MIS_PRED 
PRESET,PAPI_BR_INS,NOT_DERIVED,BR_PRED PRESET,PAPI_LD_INS,NOT_DERIVED,LD_SPEC PRESET,PAPI_SR_INS,NOT_DERIVED,ST_SPEC PRESET,PAPI_LST_INS,DERIVED_ADD,LD_SPEC,ST_SPEC PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_CACHE PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_CACHE_REFILL PRESET,PAPI_L1_DCR,NOT_DERIVED,L1D_CACHE_RD PRESET,PAPI_L1_DCW,NOT_DERIVED,L1D_CACHE_WR PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_CACHE PRESET,PAPI_L1_ICH,DERIVED_SUB,L1I_CACHE,L1I_CACHE_REFILL PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_CACHE_REFILL PRESET,PAPI_L2_TCA,NOT_DERIVED,L2D_CACHE_ACCESS PRESET,PAPI_L2_DCA,DERIVED_ADD,L2D_CACHE_RD,L2D_CACHE_WR PRESET,PAPI_L2_DCM,NOT_DERIVED,L2D_CACHE_REFILL PRESET,PAPI_L2_DCR,NOT_DERIVED,L2D_CACHE_RD PRESET,PAPI_L2_DCW,NOT_DERIVED,L2D_CACHE_WR PRESET,PAPI_L2_LDM,NOT_DERIVED,L2D_CACHE_REFILL_RD PRESET,PAPI_STL_ICY,DERIVED_ADD,STALL_FRONTEND,STALL_BACKEND PRESET,PAPI_RES_STL,NOT_DERIVED,STALL_BACKEND PRESET,PAPI_HW_INT,DERIVED_ADD,EXC_IRQ,EXC_FIQ PRESET,PAPI_SYC_INS,DERIVED_ADD,ISB_SPEC,DSB_SPEC,DMB_SPEC PRESET,PAPI_TLB_DM,NOT_DERIVED,L2D_TLB_REFILL cuda,GH100 cuda,GA100 # PRESET,PAPI_CUDA_FP16_FMA,NOT_DERIVED,cuda:::sm__sass_thread_inst_executed_op_hfma_pred_on:stat=sum PRESET,PAPI_CUDA_BF16_FMA,NOT_DERIVED,cuda:::sm__sass_thread_inst_executed_op_hfma_pred_on:stat=sum PRESET,PAPI_CUDA_FP32_FMA,NOT_DERIVED,cuda:::sm__sass_thread_inst_executed_op_ffma_pred_on:stat=sum PRESET,PAPI_CUDA_FP64_FMA,NOT_DERIVED,cuda:::sm__sass_thread_inst_executed_op_dfma_pred_on:stat=sum PRESET,PAPI_CUDA_FP_FMA,DERIVED_POSTFIX,N0|N1|+|N2|+|,cuda:::sm__sass_thread_inst_executed_op_hfma_pred_on:stat=sum,cuda:::sm__sass_thread_inst_executed_op_ffma_pred_on:stat=sum,cuda:::sm__sass_thread_inst_executed_op_dfma_pred_on:stat=sum cuda,GH100 PRESET,PAPI_CUDA_FP8_OPS,NOT_DERIVED,cuda:::sm__ops_path_tensor_src_fp8:stat=sum ######################### # ARM Neoverse V2 # ######################### CPU,arm_v2 # PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED PRESET,PAPI_INT_INS,NOT_DERIVED,DP_SPEC 
#NOT_IMPLEMENTED,PAPI_TOT_IIS,Instructions issued PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_REF_CYC,NOT_DERIVED,CNT_CYCLES PRESET,PAPI_STL_CCY,NOT_DERIVED,STALL #NOT_IMPLEMENTED,PAPI_FUL_CCY,Cycles with maximum instructions completed #NOT_IMPLEMENTED,PAPI_FUL_ICY,Cycles with maximum instruction issue #NOT_IMPLEMENTED,PAPI_FXU_IDL,Cycles integer units are idle #NOT_IMPLEMENTED,PAPI_LSU_IDL,Cycles load/store units are idle #NOT_IMPLEMENTED,PAPI_MEM_RCY,Cycles Stalled Waiting for memory Reads #NOT_IMPLEMENTED,PAPI_MEM_SCY,Cycles Stalled Waiting for memory accesses #NOT_IMPLEMENTED,PAPI_MEM_WCY,Cycles Stalled Waiting for memory writes #NOT_IMPLEMENTED,PAPI_FP_STAL,Cycles the FP unit(s) are stalled #NOT_IMPLEMENTED,PAPI_FPU_IDL,Cycles floating point units are idle #NOT_IMPLEMENTED,PAPI_BRU_IDL,Cycles branch units are idle PRESET,PAPI_STL_ICY,NOT_DERIVED,STALL PRESET,PAPI_RES_STL,NOT_DERIVED,STALL_BACKEND PRESET,PAPI_FP_OPS,DERIVED_ADD,FP_SCALE_OPS_SPEC,FP_FIXED_OPS_SPEC #NOT_IMPLEMENTED,PAPI_SP_OPS,Floating point operations; optimized to count scaled single precision vector operations #NOT_IMPLEMENTED,PAPI_DP_OPS,Floating point operations; optimized to count scaled double precision vector operations PRESET,PAPI_FP_INS,DERIVED_ADD,FP_HP_SPEC,FP_SP_SPEC,FP_DP_SPEC #NOT_IMPLEMENTED,PAPI_FAD_INS,Floating point add instructions #NOT_IMPLEMENTED,PAPI_FDV_INS,Floating point divide instructions #NOT_IMPLEMENTED,PAPI_FMA_INS,FMA instructions completed #NOT_IMPLEMENTED,PAPI_FML_INS,Floating point multiply instructions #NOT_IMPLEMENTED,PAPI_FNV_INS,Floating point inverse instructions #NOT_IMPLEMENTED,PAPI_FSQ_INS,Floating point square root instructions PRESET,PAPI_VEC_INS,DERIVED_ADD,SVE_INST_SPEC,ASE_INST_SPEC #NOT_IMPLEMENTED,PAPI_VEC_DP,Double precision vector/SIMD instructions #NOT_IMPLEMENTED,PAPI_VEC_SP,Single precision vector/SIMD instructions PRESET,PAPI_BR_INS,NOT_DERIVED,BR_RETIRED #NOT_IMPLEMENTED,PAPI_BR_CN,Conditional branch instructions 
PRESET,PAPI_BR_PRC,DERIVED_SUB,BR_RETIRED,BR_MIS_PRED_RETIRED PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MIS_PRED_RETIRED #NOT_IMPLEMENTED,PAPI_BR_NTK,Conditional branch instructions not taken #NOT_IMPLEMENTED,PAPI_BR_TKN,Conditional branch instructions taken #NOT_IMPLEMENTED,PAPI_BR_UCN,Unconditional branch instructions #NOT_IMPLEMENTED,PAPI_BTAC_M,Branch target address cache misses PRESET,PAPI_LD_INS,NOT_DERIVED,LD_SPEC PRESET,PAPI_SR_INS,NOT_DERIVED,ST_SPEC PRESET,PAPI_LST_INS,DERIVED_ADD,LD_SPEC,ST_SPEC PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_CACHE PRESET,PAPI_L1_DCH,DERIVED_SUB,L1D_CACHE,L1D_CACHE_REFILL PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_CACHE_REFILL PRESET,PAPI_L1_DCR,NOT_DERIVED,L1D_CACHE_RD PRESET,PAPI_L1_DCW,NOT_DERIVED,L1D_CACHE_WR PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_CACHE PRESET,PAPI_L1_ICH,DERIVED_SUB,L1I_CACHE,L1I_CACHE_REFILL PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_CACHE_REFILL #NOT_IMPLEMENTED,PAPI_L1_ICR,Level 1 instruction cache reads #NOT_IMPLEMENTED,PAPI_L1_ICW,Level 1 instruction cache writes #NOT_IMPLEMENTED,PAPI_L1_LDM,Level 1 load misses #NOT_IMPLEMENTED,PAPI_L1_STM,Level 1 store misses PRESET,PAPI_L1_TCA,DERIVED_ADD,L1D_CACHE,L1I_CACHE PRESET,PAPI_L1_TCH,DERIVED_POSTFIX,N0|N1|-|N2|+|N3|-|,L1D_CACHE,L1D_CACHE_REFILL,L1I_CACHE,L1I_CACHE_REFILL PRESET,PAPI_L1_TCM,DERIVED_ADD,L1D_CACHE_REFILL,L1I_CACHE_REFILL #NOT_IMPLEMENTED,PAPI_L1_TCR,Level 1 total cache reads #NOT_IMPLEMENTED,PAPI_L1_TCW,Level 1 total cache writes PRESET,PAPI_L2_TCA,NOT_DERIVED,L2D_CACHE PRESET,PAPI_L2_DCA,NOT_DERIVED,L2D_CACHE PRESET,PAPI_L2_DCM,NOT_DERIVED,L2D_CACHE_REFILL PRESET,PAPI_L2_DCR,NOT_DERIVED,L2D_CACHE_RD PRESET,PAPI_L2_DCW,NOT_DERIVED,L2D_CACHE_WR PRESET,PAPI_L2_DCH,DERIVED_SUB,L2D_CACHE,L2D_CACHE_REFILL PRESET,PAPI_L2_LDM,NOT_DERIVED,L2D_CACHE_REFILL_RD PRESET,PAPI_L2_STM,NOT_DERIVED,L2D_CACHE_REFILL_WR #NOT_IMPLEMENTED,PAPI_L2_ICA,Level 2 instruction cache accesses #NOT_IMPLEMENTED,PAPI_L2_ICH,Level 2 instruction cache hits #NOT_IMPLEMENTED,PAPI_L2_ICM,Level 2 instruction cache 
misses #NOT_IMPLEMENTED,PAPI_L2_ICR,Level 2 instruction cache reads #NOT_IMPLEMENTED,PAPI_L2_ICW,Level 2 instruction cache writes PRESET,PAPI_L2_TCH,DERIVED_SUB,L2D_CACHE,L2D_CACHE_REFILL PRESET,PAPI_L2_TCM,NOT_DERIVED,L2D_CACHE_REFILL PRESET,PAPI_L2_TCR,NOT_DERIVED,L2D_CACHE_RD PRESET,PAPI_L2_TCW,NOT_DERIVED,L2D_CACHE_WR PRESET,PAPI_L3_TCA,NOT_DERIVED,L3D_CACHE PRESET,PAPI_L3_DCA,NOT_DERIVED,L3D_CACHE #NOT_IMPLEMENTED,PAPI_L3_DCH,Level 3 data cache hits PRESET,PAPI_L3_DCM,NOT_DERIVED,L3D_CACHE_REFILL #NOT_IMPLEMENTED,PAPI_L3_DCR,Level 3 data cache reads #NOT_IMPLEMENTED,PAPI_L3_DCW,Level 3 data cache writes #NOT_IMPLEMENTED,PAPI_L3_ICA,Level 3 instruction cache accesses #NOT_IMPLEMENTED,PAPI_L3_ICH,Level 3 instruction cache hits #NOT_IMPLEMENTED,PAPI_L3_ICM,Level 3 instruction cache misses #NOT_IMPLEMENTED,PAPI_L3_ICR,Level 3 instruction cache reads #NOT_IMPLEMENTED,PAPI_L3_ICW,Level 3 instruction cache writes #NOT_IMPLEMENTED,PAPI_L3_LDM,Level 3 load misses #NOT_IMPLEMENTED,PAPI_L3_STM,Level 3 store misses #NOT_IMPLEMENTED,PAPI_L3_TCH,Level 3 total cache hits #NOT_IMPLEMENTED,PAPI_L3_TCM,Level 3 cache misses #NOT_IMPLEMENTED,PAPI_L3_TCR,Level 3 total cache reads #NOT_IMPLEMENTED,PAPI_L3_TCW,Level 3 total cache writes PRESET,PAPI_HW_INT,DERIVED_ADD,EXC_IRQ,EXC_FIQ PRESET,PAPI_SYC_INS,DERIVED_ADD,ISB_SPEC,DSB_SPEC,DMB_SPEC PRESET,PAPI_TLB_DM,NOT_DERIVED,L2D_TLB_REFILL PRESET,PAPI_TLB_IM,NOT_DERIVED,L1I_TLB_REFILL #NOT_IMPLEMENTED,PAPI_TLB_SD,Translation lookaside buffer shootdowns PRESET,PAPI_TLB_TL,DERIVED_ADD,L1D_TLB_REFILL,L2D_TLB_REFILL #NOT_IMPLEMENTED,PAPI_CA_CLN,Requests for exclusive access to clean cache line #NOT_IMPLEMENTED,PAPI_CA_INV,Requests for cache line invalidation #NOT_IMPLEMENTED,PAPI_CA_ITV,Requests for cache line intervention #NOT_IMPLEMENTED,PAPI_CA_SHR,Requests for exclusive access to shared cache line #NOT_IMPLEMENTED,PAPI_CA_SNP,Requests for a snoop #NOT_IMPLEMENTED,PAPI_CSR_FAL,Failed store conditional instructions 
#NOT_IMPLEMENTED,PAPI_CSR_SUC,Successful store conditional instructions #NOT_IMPLEMENTED,PAPI_CSR_TOT,Total store conditional instructions #NOT_IMPLEMENTED,PAPI_PRF_DM,Data prefetch cache misses ############################## # ARM Fujitsu FUJITSU-MONAKA # ############################## CPU,arm_monaka # PRESET,PAPI_L1_DCM,NOT_DERIVED,L1D_CACHE_REFILL PRESET,PAPI_L1_ICM,NOT_DERIVED,L1I_CACHE_REFILL PRESET,PAPI_L2_DCM,NOT_DERIVED,L2D_CACHE_REFILL #PRESET,PAPI_L3_DCM,NOT_DERIVED,L2D_CACHE_REFILL_L3D_MISS PRESET,PAPI_L1_TCM,DERIVED_ADD,L1D_CACHE_REFILL,L1I_CACHE_REFILL PRESET,PAPI_L2_TCM,NOT_DERIVED,L2D_CACHE_REFILL #PRESET,PAPI_L3_TCM,NOT_DERIVED,L2D_CACHE_REFILL_L3D_MISS PRESET,PAPI_L3_LDM,NOT_DERIVED,L2D_CACHE_REFILL_L3D_MISS_DM_RD PRESET,PAPI_L3_STM,NOT_DERIVED,L2D_CACHE_REFILL_L3D_MISS_DM_WR PRESET,PAPI_BRU_IDL,NOT_DERIVED,BR_COMP_WAIT PRESET,PAPI_FXU_IDL,DERIVED_SUB,EU_COMP_WAIT,FL_COMP_WAIT PRESET,PAPI_FPU_IDL,NOT_DERIVED,FL_COMP_WAIT PRESET,PAPI_LSU_IDL,NOT_DERIVED,LD_COMP_WAIT PRESET,PAPI_TLB_DM,NOT_DERIVED,L2D_TLB_REFILL PRESET,PAPI_TLB_IM,NOT_DERIVED,L2I_TLB_REFILL PRESET,PAPI_TLB_TL,DERIVED_ADD,L2D_TLB_REFILL,L2I_TLB_REFILL PRESET,PAPI_L1_LDM,DERIVED_ADD,L1D_CACHE_REFILL_DM_RD,L1I_CACHE_REFILL_DM_RD PRESET,PAPI_L1_STM,NOT_DERIVED,L1D_CACHE_REFILL_DM_WR PRESET,PAPI_L2_LDM,NOT_DERIVED,L2D_CACHE_REFILL_DM_RD PRESET,PAPI_L2_STM,NOT_DERIVED,L2D_CACHE_REFILL_DM_WR #PRESET,PAPI_PRF_DM,NOT_DERIVED,L2D_CACHE_REFILL_L3D_MISS_PRF #PRESET,PAPI_L3_DCH,NOT_DERIVED,L2D_CACHE_REFILL_L3D_HIT PRESET,PAPI_MEM_SCY,DERIVED_ADD,STALL_FRONTEND_MEMBOUND,STALL_BACKEND_MEMBOUND PRESET,PAPI_STL_ICY,DERIVED_ADD,STALL_FRONTEND,STALL_BACKEND PRESET,PAPI_STL_CCY,NOT_DERIVED,_0INST_COMMIT PRESET,PAPI_FUL_CCY,DERIVED_POSTFIX,N0|N1|-|N2|-|N3|-|N4|-|N5|-|,CPU_CYCLES,_0INST_COMMIT,_1INST_COMMIT,_2INST_COMMIT,_3INST_COMMIT,_4INST_COMMIT PRESET,PAPI_HW_INT,DERIVED_ADD,EXC_IRQ,EXC_FIQ PRESET,PAPI_BR_MSP,NOT_DERIVED,BR_MIS_PRED PRESET,PAPI_BR_PRC,DERIVED_SUB,BR_PRED,BR_MIS_PRED
PRESET,PAPI_FMA_INS,NOT_DERIVED,FP_FMA_SPEC PRESET,PAPI_TOT_INS,NOT_DERIVED,INST_RETIRED PRESET,PAPI_INT_INS,NOT_DERIVED,INT_SPEC PRESET,PAPI_FP_INS,NOT_DERIVED,FP_SPEC PRESET,PAPI_LD_INS,NOT_DERIVED,LD_SPEC PRESET,PAPI_SR_INS,NOT_DERIVED,ST_SPEC PRESET,PAPI_BR_INS,NOT_DERIVED,BR_PRED PRESET,PAPI_VEC_INS,NOT_DERIVED,SIMD_INST_RETIRED PRESET,PAPI_RES_STL,NOT_DERIVED,STALL_BACKEND PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_LST_INS,NOT_DERIVED,LDST_SPEC PRESET,PAPI_SYC_INS,DERIVED_POSTFIX,N0|N1|+|N2|+|N3|+|,ISB_SPEC,DSB_SPEC,DMB_SPEC,CSDB_SPEC PRESET,PAPI_L1_DCH,DERIVED_SUB,L1D_CACHE,L1D_CACHE_REFILL PRESET,PAPI_L2_DCH,DERIVED_SUB,L2D_CACHE,L2D_CACHE_REFILL PRESET,PAPI_L1_DCA,NOT_DERIVED,L1D_CACHE PRESET,PAPI_L2_DCA,NOT_DERIVED,L2D_CACHE PRESET,PAPI_L3_DCA,NOT_DERIVED,L2D_CACHE_REFILL_L3D_CACHE PRESET,PAPI_L1_DCR,NOT_DERIVED,L1D_CACHE_RD PRESET,PAPI_L2_DCR,NOT_DERIVED,L2D_CACHE_RD PRESET,PAPI_L3_DCR,NOT_DERIVED,L3D_CACHE_RD PRESET,PAPI_L1_DCW,NOT_DERIVED,L1D_CACHE_WR PRESET,PAPI_L2_DCW,NOT_DERIVED,L2D_CACHE_WR PRESET,PAPI_L3_DCW,DERIVED_SUB,L2D_CACHE_REFILL_L3D_CACHE,L3D_CACHE_RD PRESET,PAPI_L1_ICH,DERIVED_SUB,L1I_CACHE,L1I_CACHE_REFILL PRESET,PAPI_L1_ICA,NOT_DERIVED,L1I_CACHE PRESET,PAPI_L1_TCH,DERIVED_POSTFIX,N0|N1|-|N2|+|N3|-|,L1D_CACHE,L1D_CACHE_REFILL,L1I_CACHE,L1I_CACHE_REFILL PRESET,PAPI_L2_TCH,DERIVED_SUB,L2D_CACHE,L2D_CACHE_REFILL #PRESET,PAPI_L3_TCH,NOT_DERIVED,L2D_CACHE_REFILL_L3D_HIT PRESET,PAPI_L1_TCA,DERIVED_ADD,L1D_CACHE,L1I_CACHE PRESET,PAPI_L2_TCA,NOT_DERIVED,L2D_CACHE PRESET,PAPI_L3_TCA,NOT_DERIVED,L2D_CACHE_REFILL_L3D_CACHE PRESET,PAPI_L2_TCR,NOT_DERIVED,L2D_CACHE_RD PRESET,PAPI_L3_TCR,NOT_DERIVED,L3D_CACHE_RD PRESET,PAPI_L2_TCW,NOT_DERIVED,L2D_CACHE_WR PRESET,PAPI_L3_TCW,DERIVED_SUB,L2D_CACHE_REFILL_L3D_CACHE,L3D_CACHE_RD PRESET,PAPI_FML_INS,NOT_DERIVED,FP_MUL_SPEC PRESET,PAPI_FDV_INS,NOT_DERIVED,FP_DIV_SPEC PRESET,PAPI_FSQ_INS,NOT_DERIVED,FP_SQRT_SPEC 
PRESET,PAPI_FP_OPS,DERIVED_POSTFIX,N0|512|128|/|*|N1|+|,FP_SCALE_OPS_SPEC,FP_FIXED_OPS_SPEC PRESET,PAPI_SP_OPS,DERIVED_POSTFIX,N0|512|128|/|*|N1|+|,FP_SP_SCALE_OPS_SPEC,FP_SP_FIXED_OPS_SPEC PRESET,PAPI_DP_OPS,DERIVED_POSTFIX,N0|512|128|/|*|N1|+|,FP_DP_SCALE_OPS_SPEC,FP_DP_FIXED_OPS_SPEC PRESET,PAPI_VEC_SP,NOT_DERIVED,ASE_SVE_FP_SP_SPEC PRESET,PAPI_VEC_DP,NOT_DERIVED,ASE_SVE_FP_DP_SPEC PRESET,PAPI_REF_CYC,NOT_DERIVED,CNT_CYCLES # CPU,mips_74k # PRESET,PAPI_TOT_CYC,NOT_DERIVED,CYCLES PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTRUCTIONS PRESET,PAPI_L1_ICA,NOT_DERIVED,ICACHE_ACCESSES PRESET,PAPI_L1_ICM,NOT_DERIVED,ICACHE_MISSES PRESET,PAPI_L1_DCA,NOT_DERIVED,DCACHE_ACCESSES PRESET,PAPI_L1_DCM,NOT_DERIVED,DCACHE_MISSES PRESET,PAPI_L1_TCA,DERIVED_ADD,DCACHE_ACCESSES,ICACHE_ACCESSES PRESET,PAPI_L1_TCM,DERIVED_ADD,ICACHE_MISSES,DCACHE_MISSES PRESET,PAPI_L2_TCA,NOT_DERIVED,L2_CACHE_ACCESSES PRESET,PAPI_L2_TCM,NOT_DERIVED,L2_CACHE_MISSES PRESET,PAPI_FP_INS,NOT_DERIVED,FPU_INSNS PRESET,PAPI_INT_INS,NOT_DERIVED,INTEGER_INSNS PRESET,PAPI_LD_INS,NOT_DERIVED,LOAD_INSNS PRESET,PAPI_SR_INS,NOT_DERIVED,STORE_INSNS PRESET,PAPI_TLB_IM,NOT_DERIVED,JTLB_INSN_MISSES PRESET,PAPI_TLB_DM,NOT_DERIVED,JTLB_DATA_MISSES PRESET,PAPI_BR_CN,NOT_DERIVED,COND_BRANCH_INSNS PRESET,PAPI_BR_MSP,NOT_DERIVED,MISPREDICTED_BRANCH_INSNS PRESET,PAPI_CSR_FAL,NOT_DERIVED,FAILED_SC_INSNS PRESET,PAPI_CSR_TOT,NOT_DERIVED,SC_INSNS PRESET,PAPI_FUL_ICY,NOT_DERIVED,DUAL_ISSUE_CYCLES PRESET,PAPI_STL_CCY,NOT_DERIVED,NO_INSN_CYCLES PRESET,PAPI_FUL_CCY,NOT_DERIVED,TWO_INSNS_CYCLES # CPU,MIPSICE9A # PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_TOT_INS,NOT_DERIVED,CPU_INSEXEC PRESET,PAPI_L1_ICA,NOT_DERIVED,CPU_INSFETCH PRESET,PAPI_LD_INS,NOT_DERIVED,CPU_LOAD PRESET,PAPI_SR_INS,NOT_DERIVED,CPU_STORE PRESET,PAPI_CSR_FAL,NOT_DERIVED,CPU_SCFAIL PRESET,PAPI_CSR_TOT,NOT_DERIVED,CPU_SC PRESET,PAPI_FP_INS,NOT_DERIVED,CPU_FLOAT PRESET,PAPI_BR_INS,NOT_DERIVED,CPU_BRANCH PRESET,PAPI_TLB_IM,NOT_DERIVED,CPU_ITLBMISS 
PRESET,PAPI_TLB_TL,NOT_DERIVED,CPU_TLBTRAP PRESET,PAPI_TLB_DM,NOT_DERIVED,CPU_DTLBMISS PRESET,PAPI_BR_MSP,NOT_DERIVED,CPU_MISPRED PRESET,PAPI_L1_ICM,NOT_DERIVED,CPU_ICMISS PRESET,PAPI_L1_DCM,NOT_DERIVED,CPU_DCMISS PRESET,PAPI_MEM_SCY,NOT_DERIVED,CPU_MSTALL PRESET,PAPI_FUL_ICY,NOT_DERIVED,CPU_INSDUAL # CPU,MIPSICE9B # PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CYCLES PRESET,PAPI_TOT_INS,NOT_DERIVED,CPU_INSEXEC PRESET,PAPI_L1_ICA,NOT_DERIVED,CPU_INSFETCH PRESET,PAPI_LD_INS,NOT_DERIVED,CPU_LOAD PRESET,PAPI_SR_INS,NOT_DERIVED,CPU_STORE PRESET,PAPI_CSR_FAL,NOT_DERIVED,CPU_SCFAIL PRESET,PAPI_CSR_TOT,NOT_DERIVED,CPU_SC PRESET,PAPI_FP_INS,NOT_DERIVED,CPU_FPARITH PRESET,PAPI_BR_INS,NOT_DERIVED,CPU_BRANCH PRESET,PAPI_TLB_IM,NOT_DERIVED,CPU_ITLBMISS PRESET,PAPI_TLB_TL,NOT_DERIVED,CPU_TLBTRAP PRESET,PAPI_TLB_DM,NOT_DERIVED,CPU_DTLBMISS PRESET,PAPI_BR_MSP,NOT_DERIVED,CPU_MISPRED PRESET,PAPI_L1_ICM,NOT_DERIVED,CPU_ICMISS PRESET,PAPI_L1_DCM,NOT_DERIVED,CPU_DCMISS PRESET,PAPI_MEM_SCY,NOT_DERIVED,CPU_MSTALL PRESET,PAPI_FUL_ICY,NOT_DERIVED,CPU_INSDUAL PRESET,PAPI_L2_TCM,NOT_DERIVED,CPU_L2MISSALL PRESET,PAPI_L2_TCA,NOT_DERIVED,CPU_L2REQ # CPU,BGQ # # Conditional Branching PRESET,PAPI_BR_CN,NOT_DERIVED,PEVT_INST_XU_BRC PRESET,PAPI_BR_INS,NOT_DERIVED,PEVT_XU_BR_COMMIT PRESET,PAPI_BR_MSP,NOT_DERIVED,PEVT_XU_BR_MISPRED_COMMIT PRESET,PAPI_BR_NTK,DERIVED_POSTFIX,N0|N1|-|N2|-|,PEVT_INST_XU_BRC,PEVT_XU_BR_TAKEN_COMMIT,PEVT_INST_XU_BRU #PRESET,PAPI_BR_NTK,DERIVED_SUB,PEVT_INST_XU_BRC,PEVT_XU_BR_TAKEN_COMMIT # Not sure if branches_taken includes unconditional branches as well PRESET,PAPI_BR_PRC,DERIVED_SUB,PEVT_INST_XU_BRC,PEVT_XU_BR_MISPRED_COMMIT PRESET,PAPI_BR_TKN,DERIVED_SUB,PEVT_XU_BR_TAKEN_COMMIT,PEVT_INST_XU_BRU #PRESET,PAPI_BR_TKN,NOT_DERIVED,PEVT_XU_BR_TAKEN_COMMIT # Not sure if branches_taken includes unconditional branches as well PRESET,PAPI_BR_UCN,NOT_DERIVED,PEVT_INST_XU_BRU PRESET,PAPI_BTAC_M,NOT_DERIVED,PEVT_XU_BR_TARG_ADDR_MISPRED_COMMIT # # Cache Requests # none so far # # 
Conditional Store PRESET,PAPI_CSR_FAL,NOT_DERIVED,PEVT_XU_STCX_FAIL PRESET,PAPI_CSR_SUC,DERIVED_SUB,PEVT_LSU_COMMIT_STCX,PEVT_XU_STCX_FAIL PRESET,PAPI_CSR_TOT,NOT_DERIVED,PEVT_LSU_COMMIT_STCX # # Floating Point Operations PRESET,PAPI_FAD_INS,DERIVED_ADD,PEVT_INST_QFPU_FADD,PEVT_INST_QFPU_QADD PRESET,PAPI_FDV_INS,NOT_DERIVED,PEVT_INST_QFPU_FDIV PRESET,PAPI_FMA_INS,DERIVED_ADD,PEVT_INST_QFPU_FMA,PEVT_INST_QFPU_QMA PRESET,PAPI_FML_INS,DERIVED_ADD,PEVT_INST_QFPU_FMUL,PEVT_INST_QFPU_QMUL PRESET,PAPI_FP_INS,NOT_DERIVED,PEVT_INST_QFPU_ALL # TODO: for PAPI_FP_OPS it's either FPGRP1 or FPGRP2. Needs to be tested PRESET,PAPI_FP_OPS,NOT_DERIVED,PEVT_INST_QFPU_FPGRP1 # PRESET,PAPI_FP_OPS,NOT_DERIVED,PEVT_INST_QFPU_FPGRP2 PRESET,PAPI_FP_STAL,NOT_DERIVED,PEVT_IU_AXU_FXU_DEP_HIT_CYC PRESET,PAPI_FSQ_INS,NOT_DERIVED,PEVT_INST_QFPU_FSQ # # Instruction Counting #PRESET,PAPI_FUL_ICY,NOT_DERIVED,PEVT_IU_TWO_INSTR_ISSUE PRESET,PAPI_FXU_IDL,NOT_DERIVED,PEVT_AXU_IDLE PRESET,PAPI_HW_INT,NOT_DERIVED,PEVT_XU_INTS_TAKEN PRESET,PAPI_INT_INS,NOT_DERIVED,PEVT_INST_XU_GRP_MASK:837800,NOTE,'UPC_P_XU_OGRP_IADD|UPC_P_XU_OGRP_IMUL|UPC_P_XU_OGRP_IDIV|UPC_P_XU_OGRP_ICMP|UPC_P_XU_OGRP_IMOV|UPC_P_XU_OGRP_ILOG|UPC_P_XU_OGRP_BITS' PRESET,PAPI_TOT_CYC,NOT_DERIVED,PEVT_CYCLES PRESET,PAPI_TOT_IIS,NOT_DERIVED,PEVT_IU_TOT_ISSUE_COUNT PRESET,PAPI_TOT_INS,NOT_DERIVED,PEVT_INST_ALL PRESET,PAPI_VEC_INS,DERIVED_ADD,PEVT_INST_QFPU_GRP_MASK:3FE,PEVT_INST_XU_GRP_MASK:3000000,NOTE,'UPC_P_AXU_OGRP_QADD|UPC_P_AXU_OGRP_QCMP|UPC_P_AXU_OGRP_QCVT|UPC_P_AXU_OGRP_QMA|UPC_P_AXU_OGRP_QMOV|UPC_P_AXU_OGRP_QMUL|UPC_P_AXU_OGRP_QOTH|UPC_P_AXU_OGRP_QRES|UPC_P_AXU_OGRP_QRND + UPC_P_XU_OGRP_QLD|UPC_P_XU_OGRP_QST' # # Cache Access PRESET,PAPI_L1_DCM,DERIVED_ADD,PEVT_LSU_COMMIT_LD_MISSES,PEVT_LSU_COMMIT_ST_MISSES PRESET,PAPI_L1_DCR,NOT_DERIVED,PEVT_LSU_COMMIT_CACHEABLE_LDS PRESET,PAPI_L1_DCW,NOT_DERIVED,PEVT_LSU_COMMIT_STS PRESET,PAPI_L1_ICM,NOT_DERIVED,PEVT_IU_IL1_MISS PRESET,PAPI_L1_ICR,NOT_DERIVED,PEVT_IU_ICACHE_FETCH 
PRESET,PAPI_L1_LDM,DERIVED_ADD,PEVT_IU_IL1_MISS,PEVT_LSU_COMMIT_LD_MISSES PRESET,PAPI_L1_STM,NOT_DERIVED,PEVT_LSU_COMMIT_ST_MISSES #PRESET,PAPI_L2_TCH,NOT_DERIVED,PEVT_L2_HITS #PRESET,PAPI_L2_TCM,NOT_DERIVED,PEVT_L2_MISSES # # Data Access PRESET,PAPI_LD_INS,DERIVED_ADD,PEVT_LSU_COMMIT_CACHEABLE_LDS,PEVT_LSU_COMMIT_CACHE_INHIB_LD_MISSES # may not be possible #PRESET,PAPI_LST_INS,DERIVED_POSTFIX,N0|N1|+|N2|+|,PEVT_LSU_COMMIT_CACHEABLE_LDS,PEVT_LSU_COMMIT_CACHE_INHIB_LD_MISSES,PEVT_LSU_COMMIT_STS #PRESET,PAPI_MEM_RCY,NOT_DERIVED,PEVT_IU_RAW_DEP_HIT_CYC #PRESET,PAPI_PRF_DM,NOT_DERIVED,PEVT_LSU_COMMIT_DCBT_MISSES PRESET,PAPI_RES_STL,NOT_DERIVED,PEVT_IU_IS1_STALL_CYC PRESET,PAPI_SR_INS,NOT_DERIVED,PEVT_LSU_COMMIT_STS PRESET,PAPI_STL_CCY,DERIVED_SUB,PEVT_CYCLES,PEVT_INST_ALL PRESET,PAPI_STL_ICY,DERIVED_SUB,PEVT_CYCLES,PEVT_IU_TOT_ISSUE_COUNT PRESET,PAPI_SYC_INS,NOT_DERIVED,PEVT_INST_XU_SYNC # # TLB Operations PRESET,PAPI_TLB_DM,DERIVED_ADD,PEVT_MMU_TLB_MISS_DIRECT_DERAT,PEVT_MMU_TLB_MISS_INDIR_DERAT PRESET,PAPI_TLB_IM,NOT_DERIVED,PEVT_MMU_TLB_MISS_DIRECT_DERAT PRESET,PAPI_TLB_SD,NOT_DERIVED,PEVT_MMU_TLBIVAX_SNOOP_TOT PRESET,PAPI_TLB_TL,DERIVED_POSTFIX,N0|N1|+|N2|+|,PEVT_MMU_TLB_MISS_DIRECT_DERAT,PEVT_MMU_TLB_MISS_INDIR_DERAT,PEVT_MMU_TLB_MISS_DIRECT_IERAT ################################# # Intel MIC / Xeon-Phi / Knights Corner CPU,knc # PRESET,PAPI_BR_INS,NOT_DERIVED,BRANCHES:mg=1:mh=1 PRESET,PAPI_BR_MSP,NOT_DERIVED,BRANCHES_MISPREDICTED:mg=1:mh=1 PRESET,PAPI_L1_ICM,NOT_DERIVED,CODE_CACHE_MISS:mg=1:mh=1 PRESET,PAPI_TLB_IM,NOT_DERIVED,CODE_PAGE_WALK:mg=1:mh=1 PRESET,PAPI_L1_ICA,NOT_DERIVED,CODE_READ:mg=1:mh=1 PRESET,PAPI_TOT_CYC,NOT_DERIVED,CPU_CLK_UNHALTED:mg=1:mh=1 PRESET,PAPI_TLB_DM,NOT_DERIVED,DATA_PAGE_WALK:mg=1:mh=1 PRESET,PAPI_LD_INS,NOT_DERIVED,DATA_READ:mg=1:mh=1 PRESET,PAPI_SR_INS,NOT_DERIVED,DATA_WRITE:mg=1:mh=1 PRESET,PAPI_L1_DCM,NOT_DERIVED,DATA_READ_MISS_OR_WRITE_MISS:mg=1:mh=1 PRESET,PAPI_L1_DCA,NOT_DERIVED,DATA_READ_OR_WRITE:mg=1:mh=1 
PRESET,PAPI_TOT_INS,NOT_DERIVED,INSTRUCTIONS_EXECUTED:mg=1:mh=1 PRESET,PAPI_L2_LDM,NOT_DERIVED,L2_READ_MISS:mg=1:mh=1 PRESET,PAPI_VEC_INS,NOT_DERIVED,VPU_INSTRUCTIONS_EXECUTED:mg=1:mh=1 CPU,BGP # The following PAPI presets are accurate for all application nodes # using SMP processing for zero or one threads. The appropriate native # hardware counters mapped to the following PAPI preset counters are # only collected for processors 0 and 1 for each physical compute card. # The values are correct for other processing mode/thread combinations, # but only for those application nodes running on processor 0 or 1 of # a given physical compute card. PRESET,PAPI_L1_DCM,DERIVED_ADD,PNE_BGP_PU0_DCACHE_MISS,PNE_BGP_PU1_DCACHE_MISS PRESET,PAPI_L1_ICM,DERIVED_ADD,PNE_BGP_PU0_ICACHE_MISS,PNE_BGP_PU1_ICACHE_MISS PRESET,PAPI_L1_TCM,DERIVED_ADD,PNE_BGP_PU0_DCACHE_MISS,PNE_BGP_PU1_DCACHE_MISS,PNE_BGP_PU0_ICACHE_MISS,PNE_BGP_PU1_ICACHE_MISS PRESET,PAPI_CA_SNP,DERIVED_ADD,PNE_BGP_PU0_L1_INVALIDATION_REQUESTS,PNE_BGP_PU1_L1_INVALIDATION_REQUESTS PRESET,PAPI_PRF_DM,DERIVED_ADD,PNE_BGP_PU0_ICACHE_MISS,PNE_BGP_PU1_ICACHE_MISS PRESET,PAPI_FMA_INS,DERIVED_ADD,PNE_BGP_PU0_FPU_FMA_2,PNE_BGP_PU1_FPU_FMA_2,PNE_BGP_PU0_FPU_FMA_4,PNE_BGP_PU1_FPU_FMA_4 PRESET,PAPI_FP_INS,DERIVED_ADD,PNE_BGP_PU0_FPU_ADD_SUB_1,PNE_BGP_PU1_FPU_ADD_SUB_1,PNE_BGP_PU0_FPU_MULT_1,PNE_BGP_PU1_FPU_MULT_1,PNE_BGP_PU0_FPU_FMA_2,PNE_BGP_PU1_FPU_FMA_2,PNE_BGP_PU0_FPU_DIV_1,PNE_BGP_PU1_FPU_DIV_1,PNE_BGP_PU0_FPU_OTHER_NON_STORAGE_OPS,PNE_BGP_PU1_FPU_OTHER_NON_STORAGE_OPS,PNE_BGP_PU0_FPU_ADD_SUB_2,PNE_BGP_PU1_FPU_ADD_SUB_2,PNE_BGP_PU0_FPU_MULT_2,PNE_BGP_PU1_FPU_MULT_2,PNE_BGP_PU0_FPU_FMA_4,PNE_BGP_PU1_FPU_FMA_4,PNE_BGP_PU0_FPU_DUAL_PIPE_OTHER_NON_STORAGE_OPS,PNE_BGP_PU1_FPU_DUAL_PIPE_OTHER_NON_STORAGE_OPS PRESET,PAPI_LD_INS,DERIVED_ADD,PNE_BGP_PU0_DATA_LOADS,PNE_BGP_PU1_DATA_LOADS PRESET,PAPI_SR_INS,DERIVED_ADD,PNE_BGP_PU0_DATA_STORES,PNE_BGP_PU1_DATA_STORES 
PRESET,PAPI_LST_INS,DERIVED_ADD,PNE_BGP_PU0_DATA_LOADS,PNE_BGP_PU1_DATA_LOADS,PNE_BGP_PU0_DATA_STORES,PNE_BGP_PU1_DATA_STORES PRESET,PAPI_L1_DCH,DERIVED_ADD,PNE_BGP_PU0_DCACHE_HIT,PNE_BGP_PU1_DCACHE_HIT PRESET,PAPI_L1_DCA,DERIVED_ADD,PNE_BGP_PU0_DCACHE_HIT,PNE_BGP_PU1_DCACHE_HIT,PNE_BGP_PU0_DCACHE_MISS,PNE_BGP_PU1_DCACHE_MISS PRESET,PAPI_L1_DCR,DERIVED_ADD,PNE_BGP_PU0_DATA_LOADS,PNE_BGP_PU1_DATA_LOADS PRESET,PAPI_L1_ICH,DERIVED_ADD,PNE_BGP_PU0_ICACHE_HIT,PNE_BGP_PU1_ICACHE_HIT PRESET,PAPI_L1_ICA,DERIVED_ADD,PNE_BGP_PU0_ICACHE_HIT,PNE_BGP_PU1_ICACHE_HIT,PNE_BGP_PU0_ICACHE_MISS,PNE_BGP_PU1_ICACHE_MISS PRESET,PAPI_L1_ICR,DERIVED_ADD,PNE_BGP_PU0_ICACHE_HIT,PNE_BGP_PU1_ICACHE_HIT,PNE_BGP_PU0_ICACHE_MISS,PNE_BGP_PU1_ICACHE_MISS PRESET,PAPI_L1_ICW,DERIVED_ADD,PNE_BGP_PU0_ICACHE_LINEFILLINPROG,PNE_BGP_PU1_ICACHE_LINEFILLINPROG PRESET,PAPI_L1_TCH,DERIVED_ADD,PNE_BGP_PU0_DCACHE_HIT,PNE_BGP_PU1_DCACHE_HIT,PNE_BGP_PU0_ICACHE_HIT,PNE_BGP_PU1_ICACHE_HIT PRESET,PAPI_L1_TCA,DERIVED_ADD,PNE_BGP_PU0_DCACHE_HIT,PNE_BGP_PU1_DCACHE_HIT,PNE_BGP_PU0_ICACHE_HIT,PNE_BGP_PU1_ICACHE_HIT,PNE_BGP_PU0_DCACHE_MISS,PNE_BGP_PU1_DCACHE_MISS,PNE_BGP_PU0_ICACHE_MISS,PNE_BGP_PU1_ICACHE_MISS,PNE_BGP_PU0_DCACHE_LINEFILLINPROG,PNE_BGP_PU1_DCACHE_LINEFILLINPROG PRESET,PAPI_L1_TCR,DERIVED_ADD,PNE_BGP_PU0_DCACHE_HIT,PNE_BGP_PU1_DCACHE_HIT,PNE_BGP_PU0_ICACHE_HIT,PNE_BGP_PU1_ICACHE_HIT,PNE_BGP_PU0_DCACHE_MISS,PNE_BGP_PU1_DCACHE_MISS,PNE_BGP_PU0_ICACHE_MISS,PNE_BGP_PU1_ICACHE_MISS PRESET,PAPI_L1_TCW,DERIVED_ADD,PNE_BGP_PU0_DCACHE_LINEFILLINPROG,PNE_BGP_PU1_DCACHE_LINEFILLINPROG,PNE_BGP_PU0_ICACHE_LINEFILLINPROG,PNE_BGP_PU1_ICACHE_LINEFILLINPROG 
PRESET,PAPI_FP_OPS,DERIVED_POSTFIX,N0|N1|+|N2|+|N3|+|N4|2|*|+|N5|2|*|+|N6|13|*|+|N7|13|*|+|N8|+|N9|+|N10|2|*|+|N11|2|*|+|N12|2|*|+|N13|2|*|+|N14|4|*|+|N15|4|*|+|N16|2|*|+|N17|2|*|+|,PNE_BGP_PU0_FPU_ADD_SUB_1,PNE_BGP_PU1_FPU_ADD_SUB_1,PNE_BGP_PU0_FPU_MULT_1,PNE_BGP_PU1_FPU_MULT_1,PNE_BGP_PU0_FPU_FMA_2,PNE_BGP_PU1_FPU_FMA_2,PNE_BGP_PU0_FPU_DIV_1,PNE_BGP_PU1_FPU_DIV_1,PNE_BGP_PU0_FPU_OTHER_NON_STORAGE_OPS,PNE_BGP_PU1_FPU_OTHER_NON_STORAGE_OPS,PNE_BGP_PU0_FPU_ADD_SUB_2,PNE_BGP_PU1_FPU_ADD_SUB_2,PNE_BGP_PU0_FPU_MULT_2,PNE_BGP_PU1_FPU_MULT_2,PNE_BGP_PU0_FPU_FMA_4,PNE_BGP_PU1_FPU_FMA_4,PNE_BGP_PU0_FPU_DUAL_PIPE_OTHER_NON_STORAGE_OPS,PNE_BGP_PU1_FPU_DUAL_PIPE_OTHER_NON_STORAGE_OPS # The following PAPI presets are accurate for any processing mode of # SMP, DUAL, or VN for all application nodes. The appropriate native # hardware counters used for the following PAPI preset counters are # collected for all four processors for each physical compute card. PRESET,PAPI_L2_DCM,DERIVED_POSTFIX,N0|N1|+|N2|+|N3|+|N4|-|N5|-|N6|-|N7|-|,PNE_BGP_PU0_L2_PREFETCHABLE_REQUESTS,PNE_BGP_PU1_L2_PREFETCHABLE_REQUESTS,PNE_BGP_PU2_L2_PREFETCHABLE_REQUESTS,PNE_BGP_PU3_L2_PREFETCHABLE_REQUESTS,PNE_BGP_PU0_L2_PREFETCH_HITS_IN_STREAM,PNE_BGP_PU1_L2_PREFETCH_HITS_IN_STREAM,PNE_BGP_PU2_L2_PREFETCH_HITS_IN_STREAM,PNE_BGP_PU3_L2_PREFETCH_HITS_IN_STREAM PRESET,PAPI_L3_LDM,DERIVED_ADD,PNE_BGP_L3_M0_RD0_DIR0_MISS_OR_LOCKDOWN,PNE_BGP_L3_M0_RD0_DIR1_MISS_OR_LOCKDOWN,PNE_BGP_L3_M1_RD0_DIR0_MISS_OR_LOCKDOWN,PNE_BGP_L3_M1_RD0_DIR1_MISS_OR_LOCKDOWN,PNE_BGP_L3_M0_RD1_DIR0_MISS_OR_LOCKDOWN,PNE_BGP_L3_M0_RD1_DIR1_MISS_OR_LOCKDOWN,PNE_BGP_L3_M1_RD1_DIR0_MISS_OR_LOCKDOWN,PNE_BGP_L3_M1_RD1_DIR1_MISS_OR_LOCKDOWN,PNE_BGP_L3_M0_R2_DIR0_MISS_OR_LOCKDOWN,PNE_BGP_L3_M0_R2_DIR1_MISS_OR_LOCKDOWN,PNE_BGP_L3_M1_R2_DIR0_MISS_OR_LOCKDOWN,PNE_BGP_L3_M1_R2_DIR1_MISS_OR_LOCKDOWN # NOTE: This value is for the time the counters are active, # and not for the total cycles for the job. 
PRESET,PAPI_TOT_CYC,NOT_DERIVED,PNE_BGP_MISC_ELAPSED_TIME PRESET,PAPI_L2_DCH,DERIVED_ADD,PNE_BGP_PU0_L2_PREFETCH_HITS_IN_STREAM,PNE_BGP_PU1_L2_PREFETCH_HITS_IN_STREAM,PNE_BGP_PU2_L2_PREFETCH_HITS_IN_STREAM,PNE_BGP_PU3_L2_PREFETCH_HITS_IN_STREAM PRESET,PAPI_L2_DCA,DERIVED_ADD,PNE_BGP_PU0_L2_PREFETCHABLE_REQUESTS,PNE_BGP_PU1_L2_PREFETCHABLE_REQUESTS,PNE_BGP_PU2_L2_PREFETCHABLE_REQUESTS,PNE_BGP_PU3_L2_PREFETCHABLE_REQUESTS,PNE_BGP_PU0_L2_MEMORY_WRITES,PNE_BGP_PU1_L2_MEMORY_WRITES,PNE_BGP_PU2_L2_MEMORY_WRITES,PNE_BGP_PU3_L2_MEMORY_WRITES PRESET,PAPI_L2_DCR,DERIVED_ADD,PNE_BGP_PU0_L2_PREFETCHABLE_REQUESTS,PNE_BGP_PU1_L2_PREFETCHABLE_REQUESTS,PNE_BGP_PU2_L2_PREFETCHABLE_REQUESTS,PNE_BGP_PU3_L2_PREFETCHABLE_REQUESTS PRESET,PAPI_L2_DCW,DERIVED_ADD,PNE_BGP_PU0_L2_MEMORY_WRITES,PNE_BGP_PU1_L2_MEMORY_WRITES,PNE_BGP_PU2_L2_MEMORY_WRITES,PNE_BGP_PU3_L2_MEMORY_WRITES PRESET,PAPI_L3_TCA,DERIVED_ADD,PNE_BGP_L3_M0_RD0_SINGLE_LINE_DELIVERED_L2,PNE_BGP_L3_M0_RD1_SINGLE_LINE_DELIVERED_L2,PNE_BGP_L3_M0_R2_SINGLE_LINE_DELIVERED_L2,PNE_BGP_L3_M1_RD0_SINGLE_LINE_DELIVERED_L2,PNE_BGP_L3_M1_RD1_SINGLE_LINE_DELIVERED_L2,PNE_BGP_L3_M1_R2_SINGLE_LINE_DELIVERED_L2,PNE_BGP_L3_M0_RD0_BURST_DELIVERED_L2,PNE_BGP_L3_M0_RD1_BURST_DELIVERED_L2,PNE_BGP_L3_M0_R2_BURST_DELIVERED_L2,PNE_BGP_L3_M1_RD0_BURST_DELIVERED_L2,PNE_BGP_L3_M1_RD1_BURST_DELIVERED_L2,PNE_BGP_L3_M1_R2_BURST_DELIVERED_L2,PNE_BGP_L3_M0_W0_DEPOSIT_REQUESTS,PNE_BGP_L3_M0_W1_DEPOSIT_REQUESTS,PNE_BGP_L3_M1_W0_DEPOSIT_REQUESTS,PNE_BGP_L3_M1_W1_DEPOSIT_REQUESTS PRESET,PAPI_L3_TCR,DERIVED_ADD,PNE_BGP_L3_M0_RD0_SINGLE_LINE_DELIVERED_L2,PNE_BGP_L3_M0_RD1_SINGLE_LINE_DELIVERED_L2,PNE_BGP_L3_M0_R2_SINGLE_LINE_DELIVERED_L2,PNE_BGP_L3_M1_RD0_SINGLE_LINE_DELIVERED_L2,PNE_BGP_L3_M1_RD1_SINGLE_LINE_DELIVERED_L2,PNE_BGP_L3_M1_R2_SINGLE_LINE_DELIVERED_L2,PNE_BGP_L3_M0_RD0_BURST_DELIVERED_L2,PNE_BGP_L3_M0_RD1_BURST_DELIVERED_L2,PNE_BGP_L3_M0_R2_BURST_DELIVERED_L2,PNE_BGP_L3_M1_RD0_BURST_DELIVERED_L2,PNE_BGP_L3_M1_RD1_BURST_DELIVERED_L2,PNE_BGP_L3_M1_R2_BURST_DELIVERED_L2 
PRESET,PAPI_L3_TCW,DERIVED_ADD,PNE_BGP_L3_M0_W0_DEPOSIT_REQUESTS,PNE_BGP_L3_M0_W1_DEPOSIT_REQUESTS,PNE_BGP_L3_M1_W0_DEPOSIT_REQUESTS,PNE_BGP_L3_M1_W1_DEPOSIT_REQUESTS papi-papi-7-2-0-t/src/papi_events.xml000066400000000000000000002361321502707512200174750ustar00rootroot00000000000000 Level 1 data cache misses Level 1 instruction cache misses Level 2 data cache misses Level 2 instruction cache misses Level 3 data cache misses Level 3 instruction cache misses Level 1 cache misses Level 2 cache misses Level 3 cache misses Requests for a snoop Requests for exclusive access to shared cache line Requests for exclusive access to clean cache line Requests for cache line invalidation Requests for cache line intervention Level 3 load misses Level 3 store misses Cycles branch units are idle Cycles integer units are idle Cycles floating point units are idle Cycles load/store units are idle Data translation lookaside buffer misses Instruction translation lookaside buffer misses Total translation lookaside buffer misses Level 1 load misses Level 1 store misses Level 2 load misses Level 2 store misses Branch target address cache misses Data prefetch cache misses Level 3 data cache hits Translation lookaside buffer shootdowns Failed store conditional instructions Successful store conditional instructions Total store conditional instructions Cycles Stalled Waiting for memory accesses Cycles Stalled Waiting for memory Reads Cycles Stalled Waiting for memory writes Cycles with no instruction issue Cycles with maximum instruction issue Cycles with no instructions completed Cycles with maximum instructions completed Hardware interrupts Unconditional branch instructions Conditional branch instructions Conditional branch instructions taken Conditional branch instructions not taken Conditional branch instructions mispredicted Conditional branch instructions correctly predicted FMA instructions completed Instructions issued Instructions completed Integer instructions Floating point 
instructions Load instructions Store instructions Branch instructions Vector/SIMD instructions Cycles stalled on any resource Cycles the FP unit(s) are stalled Total cycles Load/store instructions completed Synchronization instructions completed Level 1 data cache hits Level 2 data cache hits Level 1 data cache accesses Level 2 data cache accesses Level 3 data cache accesses Level 1 data cache reads Level 2 data cache reads Level 3 data cache reads Level 1 data cache writes Level 2 data cache writes Level 3 data cache writes Level 1 instruction cache hits Level 2 instruction cache hits Level 3 instruction cache hits Level 1 instruction cache accesses Level 2 instruction cache accesses Level 3 instruction cache accesses Level 1 instruction cache reads Level 2 instruction cache reads Level 3 instruction cache reads Level 1 instruction cache writes Level 2 instruction cache writes Level 3 instruction cache writes Level 1 total cache hits Level 2 total cache hits Level 3 total cache hits Level 1 total cache accesses Level 2 total cache accesses Level 3 total cache accesses Level 1 total cache reads Level 2 total cache reads Level 3 total cache reads Level 1 total cache writes Level 2 total cache writes Level 3 total cache writes Floating point multiply instructions Floating point add instructions Floating point divide instructions Floating point square root instructions Floating point inverse instructions Floating point operations papi-papi-7-2-0-t/src/papi_events_table.sh000066400000000000000000000007221502707512200204500ustar00rootroot00000000000000#!/bin/sh # # Transform the papi_events.csv file into a static table. 
# # tr "\r" "\n" | # convert CR to LF # tr -s "\n" | # convert LFLF to LF # tr "\"" "'" | # convert " to ' # sed 's/^/"/' | \ # insert " at beginning of line # sed 's/$/\\n\"/' # insert LF" at end of line # # print "#define STATIC_PAPI_EVENTS_TABLE 1" echo "static char *papi_events_table =" cat $1 | \ tr "\r" "\n" | tr -s "\n" | tr "\"" "'" | sed 's/^/"/' | \ sed 's/$/\\n\"/' echo ";" papi-papi-7-2-0-t/src/papi_fwrappers.c000066400000000000000000001520621502707512200176230ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: papi_fwrappers.c * Author: Philip Mucci * mucci@cs.utk.edu * Mods: Nils Smeds * smeds@pdc.kth.se * Anders Nilsson * anni@pdc.kth.se * Kevin London * london@cs.utk.edu * dan terpstra * terpstra@cs.utk.edu * Min Zhou * min@cs.utk.edu */ #pragma GCC visibility push(default) #include #include #include #include "papi.h" /* Let's use defines to rename all the functions */ #ifdef FORTRANUNDERSCORE #define PAPI_FCALL(function,caps,args) void function##_ args #elif FORTRANDOUBLEUNDERSCORE #define PAPI_FCALL(function,caps,args) void function##__ args #elif FORTRANCAPS #define PAPI_FCALL(function,caps,args) void caps args #else #define PAPI_FCALL(function,caps,args) void function args #endif /* Many Unix systems pass Fortran string lengths as extra arguments */ #if defined(_AIX) || defined(sun) || defined(linux) #define _FORTRAN_STRLEN_AT_END #endif /* The Low Level Wrappers */ /** \internal @defgroup PAPIF PAPI Fortran Low Level API */ /* helper routine to convert Fortran strings to C strings */ #if defined(_FORTRAN_STRLEN_AT_END) static void Fortran2cstring( char *cstring, char *Fstring, int clen , int Flen ) { int slen, i; /* What is the maximum number of chars to copy ? */ slen = Flen < clen ? 
Flen : clen; strncpy( cstring, Fstring, ( size_t ) slen ); /* Remove trailing blanks from initial Fortran string */ for ( i = slen - 1; i > -1 && cstring[i] == ' '; cstring[i--] = '\0' ); /* Make sure string is NULL terminated */ cstring[clen - 1] = '\0'; if ( slen < clen ) cstring[slen] = '\0'; } #endif /** @class PAPIF_accum * @ingroup PAPIF * @brief accumulate and reset counters in an event set * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_accum( C_INT EventSet, C_LONG_LONG(*) values, C_INT check ) * * @see PAPI_accum */ PAPI_FCALL( papif_accum, PAPIF_ACCUM, ( int *EventSet, long long *values, int *check ) ) { *check = PAPI_accum( *EventSet, values ); } /** @class PAPIF_add_event * @ingroup PAPIF * @brief add PAPI preset or native hardware event to an event set * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_add_event( C_INT EventSet, C_INT EventCode, C_INT check ) * * @see PAPI_add_event */ PAPI_FCALL( papif_add_event, PAPIF_ADD_EVENT, ( int *EventSet, int *Event, int *check ) ) { *check = PAPI_add_event( *EventSet, *Event ); } /** @class PAPIF_add_named_event * @ingroup PAPIF * @brief add PAPI preset or native hardware event to an event set by name * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_add_named_event( C_INT EventSet, C_STRING EventName, C_INT check ) * * @see PAPI_add_named_event */ #if defined(_FORTRAN_STRLEN_AT_END) PAPI_FCALL( papif_add_named_event, PAPIF_ADD_NAMED_EVENT, ( int *EventSet, char *EventName, int *check, int Event_len ) ) { char tmp[PAPI_MAX_STR_LEN]; Fortran2cstring( tmp, EventName, PAPI_MAX_STR_LEN, Event_len ); *check = PAPI_add_named_event( *EventSet, tmp ); } #else PAPI_FCALL( papif_add_named_event, PAPIF_ADD_NAMED_EVENT, ( int *EventSet, char *EventName, int *check ) ) { *check = PAPI_add_named_event( *EventSet, EventName ); } #endif /** @class PAPIF_add_events * @ingroup PAPIF * @brief add multiple PAPI presets or native hardware events to an event set * * @par Fortran Interface: * 
\#include "fpapi.h" @n * PAPIF_add_events( C_INT EventSet, C_INT(*) EventCodes, C_INT number, C_INT check ) * * @see PAPI_add_events */ PAPI_FCALL( papif_add_events, PAPIF_ADD_EVENTS, ( int *EventSet, int *Events, int *number, int *check ) ) { *check = PAPI_add_events( *EventSet, Events, *number ); } /** @class PAPIF_cleanup_eventset * @ingroup PAPIF * @brief remove all events from an EventSet * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_cleanup_eventset( C_INT EventSet, C_INT check ) * * @see PAPI_cleanup_eventset */ PAPI_FCALL( papif_cleanup_eventset, PAPIF_CLEANUP_EVENTSET, ( int *EventSet, int *check ) ) { *check = PAPI_cleanup_eventset( *EventSet ); } /** @class PAPIF_create_eventset * @ingroup PAPIF * @brief create a new empty PAPI EventSet * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_create_eventset( C_INT EventSet, C_INT check ) * * @see PAPI_create_eventset */ PAPI_FCALL( papif_create_eventset, PAPIF_CREATE_EVENTSET, ( int *EventSet, int *check ) ) { *check = PAPI_create_eventset( EventSet ); } /** @class PAPIF_assign_eventset_component * @ingroup PAPIF * @brief assign a component index to an existing but empty EventSet * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_assign_eventset_component( C_INT EventSet, C_INT cidx, C_INT check ) * * @see PAPI_assign_eventset_component */ PAPI_FCALL( papif_assign_eventset_component, PAPIF_ASSIGN_EVENTSET_COMPONENT, ( int *EventSet, int *cidx, int *check ) ) { *check = PAPI_assign_eventset_component( *EventSet, *cidx ); } /** @class PAPIF_destroy_eventset * @ingroup PAPIF * @brief deallocate the memory associated with an empty EventSet * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_destroy_eventset( C_INT EventSet, C_INT check ) * * @see PAPI_destroy_eventset */ PAPI_FCALL( papif_destroy_eventset, PAPIF_DESTROY_EVENTSET, ( int *EventSet, int *check ) ) { *check = PAPI_destroy_eventset( EventSet ); } /** @class PAPIF_get_dmem_info * @ingroup PAPIF * @brief get information about the 
dynamic memory usage of the current program * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_get_dmem_info( C_INT EventSet, C_INT check ) * * @see PAPI_get_dmem_info */ /* XXX This looks totally broken. Should be passed all the members of the dmem_info struct. */ PAPI_FCALL( papif_get_dmem_info, PAPIF_GET_DMEM_INFO, ( long long *dest, int *check ) ) { *check = PAPI_get_dmem_info( ( PAPI_dmem_info_t * ) dest ); } /** @class PAPIF_get_exe_info * @ingroup PAPIF * @brief get information about the executable * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_get_exe_info( C_STRING fullname, C_STRING name, @n * C_LONG_LONG text_start, C_LONG_LONG text_end, @n * C_LONG_LONG data_start, C_LONG_LONG data_end, @n * C_LONG_LONG bss_start, C_LONG_LONG bss_end, C_INT check ) * * @see PAPI_get_executable_info */ #if defined(_FORTRAN_STRLEN_AT_END) PAPI_FCALL( papif_get_exe_info, PAPIF_GET_EXE_INFO, ( char *fullname, char *name, long long *text_start, long long *text_end, long long *data_start, long long *data_end, long long *bss_start, long long *bss_end, int *check, int fullname_len, int name_len ) ) #else PAPI_FCALL( papif_get_exe_info, PAPIF_GET_EXE_INFO, ( char *fullname, char *name, long long *text_start, long long *text_end, long long *data_start, long long *data_end, long long *bss_start, long long *bss_end, int *check ) ) #endif { PAPI_option_t e; /* WARNING: The casts from vptr_t to long below WILL BREAK on systems with 64-bit addresses. I did it here because I was lazy. And because I wanted to get rid of those pesky gcc warnings. If you find a 64-bit system, conditionalize the cast with (yet another) #ifdef... 
*/ if ( ( *check = PAPI_get_opt( PAPI_EXEINFO, &e ) ) == PAPI_OK ) { #if defined(_FORTRAN_STRLEN_AT_END) int i; strncpy( fullname, e.exe_info->fullname, ( size_t ) fullname_len ); for ( i = ( int ) strlen( e.exe_info->fullname ); i < fullname_len; fullname[i++] = ' ' ); strncpy( name, e.exe_info->address_info.name, ( size_t ) name_len ); for ( i = ( int ) strlen( e.exe_info->address_info.name ); i < name_len; name[i++] = ' ' ); #else strncpy( fullname, e.exe_info->fullname, PAPI_MAX_STR_LEN ); strncpy( name, e.exe_info->address_info.name, PAPI_MAX_STR_LEN ); #endif *text_start = ( long ) e.exe_info->address_info.text_start; *text_end = ( long ) e.exe_info->address_info.text_end; *data_start = ( long ) e.exe_info->address_info.data_start; *data_end = ( long ) e.exe_info->address_info.data_end; *bss_start = ( long ) e.exe_info->address_info.bss_start; *bss_end = ( long ) e.exe_info->address_info.bss_end; } } /** @class PAPIF_get_hardware_info * @ingroup PAPIF * @brief get information about the system hardware * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_get_hardware_info( C_INT ncpu, C_INT nnodes, C_INT totalcpus,@n * C_INT vendor, C_STRING vendor_str, C_INT model, C_STRING model_str, @n * C_FLOAT revision, C_FLOAT mhz ) * * @see PAPI_get_hardware_info */ #if defined(_FORTRAN_STRLEN_AT_END) PAPI_FCALL( papif_get_hardware_info, PAPIF_GET_HARDWARE_INFO, ( int *ncpu, int *nnodes, int *totalcpus, int *vendor, char *vendor_str, int *model, char *model_str, float *revision, float *mhz, int vendor_len, int model_len ) ) #else PAPI_FCALL( papif_get_hardware_info, PAPIF_GET_HARDWARE_INFO, ( int *ncpu, int *nnodes, int *totalcpus, int *vendor, char *vendor_string, int *model, char *model_string, float *revision, float *mhz ) ) #endif { const PAPI_hw_info_t *hwinfo; int i; hwinfo = PAPI_get_hardware_info( ); if ( hwinfo == NULL ) { *ncpu = 0; *nnodes = 0; *totalcpus = 0; *vendor = 0; *model = 0; *revision = 0; *mhz = 0; } else { *ncpu = hwinfo->ncpu; *nnodes = 
hwinfo->nnodes; *totalcpus = hwinfo->totalcpus; *vendor = hwinfo->vendor; *model = hwinfo->model; *revision = hwinfo->revision; *mhz = hwinfo->cpu_max_mhz; #if defined(_FORTRAN_STRLEN_AT_END) strncpy( vendor_str, hwinfo->vendor_string, ( size_t ) vendor_len ); for ( i = ( int ) strlen( hwinfo->vendor_string ); i < vendor_len; vendor_str[i++] = ' ' ); strncpy( model_str, hwinfo->model_string, ( size_t ) model_len ); for ( i = ( int ) strlen( hwinfo->model_string ); i < model_len; model_str[i++] = ' ' ); #else (void)i; /* unused... */ /* This case needs the passed strings to be of sufficient size * * and will include the NULL character in the target string */ strcpy( vendor_string, hwinfo->vendor_string ); strcpy( model_string, hwinfo->model_string ); #endif } return; } /** @class PAPIF_num_hwctrs * @ingroup PAPIF * @brief Return the number of hardware counters on the cpu. * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_num_hwctrs( C_INT num ) * * @see PAPI_num_hwctrs * @see PAPI_num_cmp_hwctrs */ PAPI_FCALL( papif_num_hwctrs, PAPIF_num_hwctrs, ( int *num ) ) { *num = PAPI_num_hwctrs( ); } /** @class PAPIF_num_cmp_hwctrs * @ingroup PAPIF * @brief Return the number of hardware counters on the specified component. * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_num_cmp_hwctrs( C_INT cidx, C_INT num ) * * @see PAPI_num_hwctrs * @see PAPI_num_cmp_hwctrs */ PAPI_FCALL( papif_num_cmp_hwctrs, PAPIF_num_cmp_hwctrs, ( int *cidx, int *num ) ) { *num = PAPI_num_cmp_hwctrs( *cidx ); } /** @class PAPIF_get_real_cyc * @ingroup PAPIF * @brief Get real time counter value in clock cycles. * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_get_real_cyc( C_LONG_LONG real_cyc ) * * @see PAPI_get_real_cyc */ PAPI_FCALL( papif_get_real_cyc, PAPIF_GET_REAL_CYC, ( long long *real_cyc ) ) { *real_cyc = PAPI_get_real_cyc( ); } /** @class PAPIF_get_real_usec * @ingroup PAPIF * @brief Get real time counter value in microseconds. 
* * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_get_real_usec( C_LONG_LONG time ) * * @see PAPI_get_real_usec */ PAPI_FCALL( papif_get_real_usec, PAPIF_GET_REAL_USEC, ( long long *time ) ) { *time = PAPI_get_real_usec( ); } /** @class PAPIF_get_real_nsec * @ingroup PAPIF * @brief Get real time counter value in nanoseconds. * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_get_real_nsec( C_LONG_LONG time ) * * @see PAPI_get_real_nsec */ PAPI_FCALL( papif_get_real_nsec, PAPIF_GET_REAL_NSEC, ( long long *time ) ) { *time = PAPI_get_real_nsec( ); } /** @class PAPIF_get_virt_cyc * @ingroup PAPIF * @brief Get virtual time counter value in clock cycles. * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_get_virt_cyc( C_LONG_LONG virt_cyc ) * * @see PAPI_get_virt_cyc */ PAPI_FCALL( papif_get_virt_cyc, PAPIF_GET_VIRT_CYC, ( long long *virt_cyc ) ) { *virt_cyc = PAPI_get_virt_cyc( ); } /** @class PAPIF_get_virt_usec * @ingroup PAPIF * @brief Get virtual time counter value in microseconds. * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_get_virt_usec( C_LONG_LONG time ) * * @see PAPI_get_virt_usec */ PAPI_FCALL( papif_get_virt_usec, PAPIF_GET_VIRT_USEC, ( long long *time ) ) { *time = PAPI_get_virt_usec( ); } /** @class PAPIF_is_initialized * @ingroup PAPIF * @brief Check for initialization. * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_is_initialized( C_INT level ) * * @see PAPI_is_initialized */ PAPI_FCALL( papif_is_initialized, PAPIF_IS_INITIALIZED, ( int *level ) ) { *level = PAPI_is_initialized( ); } /** @class PAPIF_library_init * @ingroup PAPIF * @brief Initialize the PAPI library. * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_library_init( C_INT check ) * * @see PAPI_library_init */ PAPI_FCALL( papif_library_init, PAPIF_LIBRARY_INIT, ( int *check ) ) { *check = PAPI_library_init( *check ); } /** @class PAPIF_thread_id * @ingroup PAPIF * @brief Get the thread identifier of the current thread. 
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_thread_id( C_INT id )
 *
 * @see PAPI_thread_id
 */
PAPI_FCALL( papif_thread_id, PAPIF_THREAD_ID, ( unsigned long *id ) )
{
	*id = PAPI_thread_id( );
}

/** @class PAPIF_register_thread
 * @ingroup PAPIF
 * @brief Notify PAPI that a thread has 'appeared'.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_register_thread( C_INT check )
 *
 * @see PAPI_register_thread
 */
PAPI_FCALL( papif_register_thread, PAPIF_REGISTER_THREAD, ( int *check ) )
{
	*check = PAPI_register_thread( );
}

/** @class PAPIF_unregister_thread
 * @ingroup PAPIF
 * @brief Notify PAPI that a thread has 'disappeared'.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_unregister_thread( C_INT check )
 *
 * @see PAPI_unregister_thread
 */
PAPI_FCALL( papif_unregister_thread, PAPIF_UNREGISTER_THREAD, ( int *check ) )
{
	*check = PAPI_unregister_thread( );
}

/* There was a long (10+ years!) typo that was not noticed here */
/* Leaving it here so as not to break any existing code out there */
PAPI_FCALL( papif_unregster_thread, PAPIF_UNREGSTER_THREAD, ( int *check ) )
{
	*check = PAPI_unregister_thread( );
}

/** @class PAPIF_thread_init
 * @ingroup PAPIF
 * @brief Initialize thread support in the PAPI library.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_thread_init( C_INT FUNCTION handle, C_INT check )
 *
 * @see PAPI_thread_init
 */
/* This must be passed an EXTERNAL or INTRINSIC FUNCTION, not a SUBROUTINE */
PAPI_FCALL( papif_thread_init, PAPIF_THREAD_INIT,
			( unsigned long int ( *handle ) ( void ), int *check ) )
{
	*check = PAPI_thread_init( handle );
}

/** @class PAPIF_list_events
 * @ingroup PAPIF
 * @brief List the events in an event set.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_list_events( C_INT EventSet, C_INT(*) Events, C_INT number, C_INT check )
 *
 * @see PAPI_list_events
 */
PAPI_FCALL( papif_list_events, PAPIF_LIST_EVENTS,
			( int *EventSet, int *Events, int *number, int *check ) )
{
	*check = PAPI_list_events( *EventSet, Events, number );
}

/** @class PAPIF_multiplex_init
 * @ingroup PAPIF
 * @brief Initialize multiplex support in the PAPI library.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_multiplex_init( C_INT check )
 *
 * @see PAPI_multiplex_init
 */
PAPI_FCALL( papif_multiplex_init, PAPIF_MULTIPLEX_INIT, ( int *check ) )
{
	*check = PAPI_multiplex_init( );
}

/** @class PAPIF_get_multiplex
 * @ingroup PAPIF
 * @brief Get the multiplexing status of specified event set.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_get_multiplex( C_INT EventSet, C_INT check )
 *
 * @see PAPI_get_multiplex
 */
PAPI_FCALL( papif_get_multiplex, PAPIF_GET_MULTIPLEX,
			( int *EventSet, int *check ) )
{
	*check = PAPI_get_multiplex( *EventSet );
}

/** @class PAPIF_set_multiplex
 * @ingroup PAPIF
 * @brief Convert a standard event set to a multiplexed event set.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_set_multiplex( C_INT EventSet, C_INT check )
 *
 * @see PAPI_set_multiplex
 */
PAPI_FCALL( papif_set_multiplex, PAPIF_SET_MULTIPLEX,
			( int *EventSet, int *check ) )
{
	*check = PAPI_set_multiplex( *EventSet );
}

/** @class PAPIF_perror
 * @ingroup PAPIF
 * @brief Convert PAPI error codes to strings, and print error message to stderr.
* * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_perror( C_STRING message ) * * @see PAPI_perror */ #if defined(_FORTRAN_STRLEN_AT_END) PAPI_FCALL( papif_perror, PAPIF_PERROR, ( char *message, int message_len ) ) #else PAPI_FCALL( papif_perror, PAPIF_PERROR, ( char *message ) ) #endif { #if defined(_FORTRAN_STRLEN_AT_END) char tmp[PAPI_MAX_STR_LEN]; Fortran2cstring( tmp, message, PAPI_MAX_STR_LEN, message_len ); PAPI_perror( tmp ); #else PAPI_perror( message ); #endif } /* This will not work until Fortran2000 :) * PAPI_FCALL(papif_profil, PAPIF_PROFIL, (unsigned short *buf, unsigned *bufsiz, unsigned long *offset, unsigned *scale, unsigned *eventset, * unsigned *eventcode, unsigned *threshold, unsigned *flags, unsigned *check)) * { * *check = PAPI_profil(buf, *bufsiz, *offset, *scale, *eventset, *eventcode, *threshold, *flags); * } */ /** @class PAPIF_query_event * @ingroup PAPIF * @brief Query if PAPI event exists. * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_query_event(C_INT EventCode, C_INT check ) * * @see PAPI_query_event */ PAPI_FCALL( papif_query_event, PAPIF_QUERY_EVENT, ( int *EventCode, int *check ) ) { *check = PAPI_query_event( *EventCode ); } /** @class PAPIF_query_named_event * @ingroup PAPIF * @brief Query if named PAPI event exists. * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_query_named_event(C_STRING EventName, C_INT check ) * * @see PAPI_query_named_event */ #if defined(_FORTRAN_STRLEN_AT_END) PAPI_FCALL( papif_query_named_event, PAPIF_QUERY_NAMED_EVENT, ( char *EventName, int *check, int Event_len ) ) { char tmp[PAPI_MAX_STR_LEN]; Fortran2cstring( tmp, EventName, PAPI_MAX_STR_LEN, Event_len ); *check = PAPI_query_named_event( tmp ); } #else PAPI_FCALL( papif_query_named_event, PAPIF_QUERY_NAMED_EVENT, ( char *EventName, int *check ) ) { *check = PAPI_query_named_event( EventName ); } #endif /** @class PAPIF_get_event_info * @ingroup PAPIF * @brief Get the event's name and description info. 
* * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_get_event_info(C_INT EventCode, C_STRING symbol, C_STRING long_descr, C_STRING short_descr, C_INT count, C_STRING event_note, C_INT flags, C_INT check ) * * @see PAPI_get_event_info */ #if defined(_FORTRAN_STRLEN_AT_END) PAPI_FCALL( papif_get_event_info, PAPIF_GET_EVENT_INFO, ( int *EventCode, char *symbol, char *long_descr, char *short_descr, int *count, char *event_note, int *flags, int *check, int symbol_len, int long_descr_len, int short_descr_len, int event_note_len ) ) #else PAPI_FCALL( papif_get_event_info, PAPIF_GET_EVENT_INFO, ( int *EventCode, char *symbol, char *long_descr, char *short_descr, int *count, char *event_note, int *flags, int *check ) ) #endif { PAPI_event_info_t info; ( void ) flags; /*Unused */ #if defined(_FORTRAN_STRLEN_AT_END) int i; if ( ( *check = PAPI_get_event_info( *EventCode, &info ) ) == PAPI_OK ) { strncpy( symbol, info.symbol, ( size_t ) symbol_len ); for ( i = ( int ) strlen( info.symbol ); i < symbol_len; symbol[i++] = ' ' ); strncpy( long_descr, info.long_descr, ( size_t ) long_descr_len ); for ( i = ( int ) strlen( info.long_descr ); i < long_descr_len; long_descr[i++] = ' ' ); strncpy( short_descr, info.short_descr, ( size_t ) short_descr_len ); for ( i = ( int ) strlen( info.short_descr ); i < short_descr_len; short_descr[i++] = ' ' ); *count = ( int ) info.count; int note_len=0; strncpy( event_note, info.note, ( size_t ) event_note_len ); note_len=strlen(info.note); for ( i = note_len; i < event_note_len; event_note[i++] = ' ' ); } #else /* printf("EventCode: %d\n", *EventCode ); -KSL */ if ( ( *check = PAPI_get_event_info( *EventCode, &info ) ) == PAPI_OK ) { strncpy( symbol, info.symbol, PAPI_MAX_STR_LEN ); strncpy( long_descr, info.long_descr, PAPI_MAX_STR_LEN ); strncpy( short_descr, info.short_descr, PAPI_MAX_STR_LEN ); *count = info.count; if (info.note) strncpy( event_note, info.note, PAPI_MAX_STR_LEN ); } /* printf("Check: %d\n", *check); -KSL */ #endif 
} /** @class PAPIF_event_code_to_name * @ingroup PAPIF * @brief Convert a numeric hardware event code to a name. * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_event_code_to_name( C_INT EventCode, C_STRING EventName, C_INT check ) * * @see PAPI_event_code_to_name */ #if defined(_FORTRAN_STRLEN_AT_END) PAPI_FCALL( papif_event_code_to_name, PAPIF_EVENT_CODE_TO_NAME, ( int *EventCode, char *out_str, int *check, int out_len ) ) #else PAPI_FCALL( papif_event_code_to_name, PAPIF_EVENT_CODE_TO_NAME, ( int *EventCode, char *out, int *check ) ) #endif { #if defined(_FORTRAN_STRLEN_AT_END) char tmp[PAPI_MAX_STR_LEN]; int i; *check = PAPI_event_code_to_name( *EventCode, tmp ); /* tmp has \0 within PAPI_MAX_STR_LEN chars so strncpy is safe */ strncpy( out_str, tmp, ( size_t ) out_len ); /* overwrite any NULLs and trailing garbage in out_str */ for ( i = ( int ) strlen( tmp ); i < out_len; out_str[i++] = ' ' ); #else /* The array "out" passed by the user must be sufficiently long */ *check = PAPI_event_code_to_name( *EventCode, out ); #endif } /** @class PAPIF_event_name_to_code * @ingroup PAPIF * @brief Convert a name to a numeric hardware event code. * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_event_name_to_code( C_STRING EventName, C_INT EventCode, C_INT check ) * * @see PAPI_event_name_to_code */ #if defined(_FORTRAN_STRLEN_AT_END) PAPI_FCALL( papif_event_name_to_code, PAPIF_EVENT_NAME_TO_CODE, ( char *in_str, int *out, int *check, int in_len ) ) #else PAPI_FCALL( papif_event_name_to_code, PAPIF_EVENT_NAME_TO_CODE, ( char *in, int *out, int *check ) ) #endif { #if defined(_FORTRAN_STRLEN_AT_END) int slen, i; char tmpin[PAPI_MAX_STR_LEN]; /* What is the maximum number of chars to copy ? */ slen = in_len < PAPI_MAX_STR_LEN ? 
		in_len : PAPI_MAX_STR_LEN;
	strncpy( tmpin, in_str, ( size_t ) slen );

	/* Remove trailing blanks from initial Fortran string */
	for ( i = slen - 1; i > -1 && tmpin[i] == ' '; tmpin[i--] = '\0' );

	/* Make sure string is NULL terminated before call */
	tmpin[PAPI_MAX_STR_LEN - 1] = '\0';
	if ( slen < PAPI_MAX_STR_LEN )
		tmpin[slen] = '\0';

	*check = PAPI_event_name_to_code( tmpin, out );
#else
	/* This will have trouble if argument in is not null terminated */
	*check = PAPI_event_name_to_code( in, out );
#endif
}

/** @class PAPIF_num_events
 * @ingroup PAPIF
 * @brief Return the number of events in an event set.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_num_events( C_INT EventSet, C_INT count )
 *
 * @see PAPI_num_events
 */
PAPI_FCALL( papif_num_events, PAPIF_NUM_EVENTS, ( int *EventSet, int *count ) )
{
	*count = PAPI_num_events( *EventSet );
}

/** @class PAPIF_enum_event
 * @ingroup PAPIF
 * @brief Enumerate PAPI preset or native events.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_enum_event( C_INT EventCode, C_INT modifier, C_INT check )
 *
 * @see PAPI_enum_event
 */
PAPI_FCALL( papif_enum_event, PAPIF_ENUM_EVENT,
			( int *EventCode, int *modifier, int *check ) )
{
	*check = PAPI_enum_event( EventCode, *modifier );
}

/** @class PAPIF_read
 * @ingroup PAPIF
 * @brief Read hardware counters from an event set.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_read( C_INT EventSet, C_LONG_LONG(*) values, C_INT check )
 *
 * @see PAPI_read
 */
PAPI_FCALL( papif_read, PAPIF_READ,
			( int *EventSet, long long *values, int *check ) )
{
	*check = PAPI_read( *EventSet, values );
}

/** @class PAPIF_read_ts
 * @ingroup PAPIF
 * @brief Read hardware counters with a timestamp.
* * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_read_ts(C_INT EventSet, C_LONG_LONG(*) values, C_LONG_LONG(*) cycles, C_INT check) * * @see PAPI_read_ts */ PAPI_FCALL( papif_read_ts, PAPIF_READ_TS, ( int *EventSet, long long *values, long long *cycles, int *check ) ) { *check = PAPI_read_ts( *EventSet, values, cycles ); } /** @class PAPIF_remove_event * @ingroup PAPIF * @brief Remove a hardware event from a PAPI event set. * * @par Fortran interface: * \#include "fpapi.h" @n * PAPIF_remove_event( C_INT EventSet, C_INT EventCode, C_INT check ) * * @see PAPI_remove_event */ PAPI_FCALL( papif_remove_event, PAPIF_REMOVE_EVENT, ( int *EventSet, int *Event, int *check ) ) { *check = PAPI_remove_event( *EventSet, *Event ); } /** @class PAPIF_remove_named_event * @ingroup PAPIF * @brief Remove a named hardware event from a PAPI event set. * * @par Fortran interface: * \#include "fpapi.h" @n * PAPIF_remove_named_event( C_INT EventSet, C_STRING EventName, C_INT check ) * * @see PAPI_remove_named_event */ #if defined(_FORTRAN_STRLEN_AT_END) PAPI_FCALL( papif_remove_named_event, PAPIF_REMOVE_NAMED_EVENT, ( int *EventSet, char *EventName, int *check, int Event_len ) ) { char tmp[PAPI_MAX_STR_LEN]; Fortran2cstring( tmp, EventName, PAPI_MAX_STR_LEN, Event_len ); *check = PAPI_remove_named_event( *EventSet, tmp ); } #else PAPI_FCALL( papif_remove_named_event, PAPIF_REMOVE_NAMED_EVENT, ( int *EventSet, char *EventName, int *check ) ) { *check = PAPI_remove_named_event( *EventSet, EventName ); } #endif /** @class PAPIF_remove_events * @ingroup PAPIF * @brief Remove an array of hardware event codes from a PAPI event set. 
* * @par Fortran Prototype: * \#include "fpapi.h" @n * PAPIF_remove_events( C_INT EventSet, C_INT(*) EventCode, C_INT number, C_INT check ) * * @see PAPI_remove_events */ PAPI_FCALL( papif_remove_events, PAPIF_REMOVE_EVENTS, ( int *EventSet, int *Events, int *number, int *check ) ) { *check = PAPI_remove_events( *EventSet, Events, *number ); } /** @class PAPIF_reset * @ingroup PAPIF * @brief Reset the hardware event counts in an event set. * * @par Fortran Prototype: * \#include "fpapi.h" @n * PAPIF_reset( C_INT EventSet, C_INT check ) * * @see PAPI_reset */ PAPI_FCALL( papif_reset, PAPIF_RESET, ( int *EventSet, int *check ) ) { *check = PAPI_reset( *EventSet ); } /** @class PAPIF_set_debug * @ingroup PAPIF * @brief Set the current debug level for error output from PAPI. * * @par Fortran Prototype: * \#include "fpapi.h" @n * PAPIF_set_debug( C_INT level, C_INT check ) * * @see PAPI_set_debug */ PAPI_FCALL( papif_set_debug, PAPIF_SET_DEBUG, ( int *debug, int *check ) ) { *check = PAPI_set_debug( *debug ); } /** @class PAPIF_set_domain * @ingroup PAPIF * @brief Set the default counting domain for new event sets bound to the cpu component. * * @par Fortran Prototype: * \#include "fpapi.h" @n * PAPIF_set_domain( C_INT domain, C_INT check ) * * @see PAPI_set_domain */ PAPI_FCALL( papif_set_domain, PAPIF_SET_DOMAIN, ( int *domain, int *check ) ) { *check = PAPI_set_domain( *domain ); } /** @class PAPIF_set_cmp_domain * @ingroup PAPIF * @brief Set the default counting domain for new event sets bound to the specified component. * * @par Fortran Prototype: * \#include "fpapi.h" @n * PAPIF_set_cmp_domain( C_INT domain, C_INT cidx, C_INT check ) * * @see PAPI_set_cmp_domain */ PAPI_FCALL( papif_set_cmp_domain, PAPIF_SET_CMP_DOMAIN, ( int *domain, int *cidx, int *check ) ) { *check = PAPI_set_cmp_domain( *domain, *cidx ); } /** @class PAPIF_set_granularity * @ingroup PAPIF * @brief Set the default counting granularity for eventsets bound to the cpu component. 
* * @par Fortran Prototype: * \#include "fpapi.h" @n * PAPIF_set_granularity( C_INT granularity, C_INT check ) * * @see PAPI_set_granularity */ PAPI_FCALL( papif_set_granularity, PAPIF_SET_GRANULARITY, ( int *granularity, int *check ) ) { *check = PAPI_set_granularity( *granularity ); } /** @class PAPIF_set_cmp_granularity * @ingroup PAPIF * @brief Set the default counting granularity for eventsets bound to the specified component. * * @par Fortran Prototype: * \#include "fpapi.h" @n * PAPIF_set_cmp_granularity( C_INT granularity, C_INT cidx, C_INT check ) * * @see PAPI_set_cmp_granularity */ PAPI_FCALL( papif_set_cmp_granularity, PAPIF_SET_CMP_GRANULARITY, ( int *granularity, int *cidx, int *check ) ) { *check = PAPI_set_cmp_granularity( *granularity, *cidx ); } /** @class PAPIF_shutdown * @ingroup PAPIF * @brief finish using PAPI and free all related resources. * * @par Fortran Prototype: * \#include "fpapi.h" @n * PAPIF_shutdown( ) * * @see PAPI_shutdown */ PAPI_FCALL( papif_shutdown, PAPIF_SHUTDOWN, ( void ) ) { PAPI_shutdown( ); } /** @class PAPIF_start * @ingroup PAPIF * @brief Start counting hardware events in an event set. * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_start( C_INT EventSet, C_INT check ) * * @see PAPI_start */ PAPI_FCALL( papif_start, PAPIF_START, ( int *EventSet, int *check ) ) { *check = PAPI_start( *EventSet ); } /** @class PAPIF_state * @ingroup PAPIF * @brief Return the counting state of an EventSet. * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_state(C_INT EventSet, C_INT status, C_INT check ) * * @see PAPI_state */ PAPI_FCALL( papif_state, PAPIF_STATE, ( int *EventSet, int *status, int *check ) ) { *check = PAPI_state( *EventSet, status ); } /** @class PAPIF_stop * @ingroup PAPIF * @brief Stop counting hardware events in an EventSet. 
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_stop( C_INT EventSet, C_LONG_LONG(*) values, C_INT check )
 *
 * @see PAPI_stop
 */
PAPI_FCALL( papif_stop, PAPIF_STOP,
			( int *EventSet, long long *values, int *check ) )
{
	*check = PAPI_stop( *EventSet, values );
}

/** @class PAPIF_write
 * @ingroup PAPIF
 * @brief Write counter values into counters.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_write( C_INT EventSet, C_LONG_LONG(*) values, C_INT check )
 *
 * @see PAPI_write
 */
PAPI_FCALL( papif_write, PAPIF_WRITE,
			( int *EventSet, long long *values, int *check ) )
{
	*check = PAPI_write( *EventSet, values );
}

/** @class PAPIF_lock
 * @ingroup PAPIF
 * @brief Lock one of two mutex variables defined in papi.h.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_lock( C_INT lock )
 *
 * @see PAPI_lock
 */
PAPI_FCALL( papif_lock, PAPIF_LOCK, ( int *lock, int *check ) )
{
	*check = PAPI_lock( *lock );
}

/** @class PAPIF_unlock
 * @ingroup PAPIF
 * @brief Unlock one of the mutex variables defined in papi.h.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_unlock( C_INT lock )
 *
 * @see PAPI_unlock
 */
PAPI_FCALL( papif_unlock, PAPIF_UNLOCK, ( int *lock, int *check ) )
{
	*check = PAPI_unlock( *lock );
}

/* Fortran only APIs for get_opt and set_opt functionality */

/** @class PAPIF_get_clockrate
 * @ingroup PAPIF
 * @brief Get the clockrate in MHz for the current cpu.
 *
 * @par Fortran Prototype:
 * \#include "fpapi.h" @n
 * PAPIF_get_clockrate( C_INT cr )
 *
 * @note This is a Fortran only interface that returns a value from the PAPI_get_opt call.
 *
 * @see PAPI_get_opt
 */
PAPI_FCALL( papif_get_clockrate, PAPIF_GET_CLOCKRATE, ( int *cr ) )
{
	*cr = PAPI_get_opt( PAPI_CLOCKRATE, (PAPI_option_t *) NULL );
}

/** @class PAPIF_get_preload
 * @ingroup PAPIF
 * @brief Get the LD_PRELOAD environment variable.
* * @par Fortran Prototype: * \#include "fpapi.h" @n * PAPIF_get_preload( C_STRING lib_preload_env, C_INT check ) * * @note This is a Fortran only interface that returns a value from the PAPI_get_opt call. * * @see PAPI_get_opt */ #if defined(_FORTRAN_STRLEN_AT_END) PAPI_FCALL( papif_get_preload, PAPIF_GET_PRELOAD, ( char *lib_preload_env, int *check, int lib_preload_env_len ) ) #else PAPI_FCALL( papif_get_preload, PAPIF_GET_PRELOAD, ( char *lib_preload_env, int *check ) ) #endif { PAPI_option_t p; #if defined(_FORTRAN_STRLEN_AT_END) int i; if ( ( *check = PAPI_get_opt( PAPI_PRELOAD, &p ) ) == PAPI_OK ) { strncpy( lib_preload_env, p.preload.lib_preload_env, ( size_t ) lib_preload_env_len ); for ( i = ( int ) strlen( p.preload.lib_preload_env ); i < lib_preload_env_len; lib_preload_env[i++] = ' ' ); } #else if ( ( *check = PAPI_get_opt( PAPI_PRELOAD, &p ) ) == PAPI_OK ) { strncpy( lib_preload_env, p.preload.lib_preload_env, PAPI_MAX_STR_LEN ); } #endif } /** @class PAPIF_get_granularity * @ingroup PAPIF * @brief Get the granularity setting for the specified EventSet. * * @par Fortran Prototype: * \#include "fpapi.h" @n * PAPIF_get_granularity( C_INT eventset, C_INT granularity, C_INT mode, C_INT check ) * * @see PAPI_get_opt */ PAPI_FCALL( papif_get_granularity, PAPIF_GET_GRANULARITY, ( int *eventset, int *granularity, int *mode, int *check ) ) { PAPI_option_t g; if ( *mode == PAPI_DEFGRN ) { *granularity = PAPI_get_opt( *mode, &g ); *check = PAPI_OK; } else if ( *mode == PAPI_GRANUL ) { g.granularity.eventset = *eventset; if ( ( *check = PAPI_get_opt( *mode, &g ) ) == PAPI_OK ) { *granularity = g.granularity.granularity; } } else { *check = PAPI_EINVAL; } } /** @class PAPIF_get_domain * @ingroup PAPIF * @brief Get the domain setting for the specified EventSet. 
* * @par Fortran Prototype: * \#include "fpapi.h" @n * PAPIF_get_domain( C_INT eventset, C_INT domain, C_INT mode, C_INT check ) * * @see PAPI_get_opt */ PAPI_FCALL( papif_get_domain, PAPIF_GET_DOMAIN, ( int *eventset, int *domain, int *mode, int *check ) ) { PAPI_option_t d; if ( *mode == PAPI_DEFDOM ) { *domain = PAPI_get_opt( *mode, (PAPI_option_t *) NULL ); *check = PAPI_OK; } else if ( *mode == PAPI_DOMAIN ) { d.domain.eventset = *eventset; if ( ( *check = PAPI_get_opt( *mode, &d ) ) == PAPI_OK ) { *domain = d.domain.domain; } } else { *check = PAPI_EINVAL; } } #if 0 PAPI_FCALL( papif_get_inherit, PAPIF_GET_INHERIT, ( int *inherit, int *check ) ) { PAPI_option_t i; if ( ( *check = PAPI_get_opt( PAPI_INHERIT, &i ) ) == PAPI_OK ) { *inherit = i.inherit.inherit; } } #endif /** @class PAPIF_set_event_domain * @ingroup PAPIF * @brief Set the default counting domain for specified EventSet. * * @par Fortran Prototype: * \#include "fpapi.h" @n * PAPIF_set_event_domain( C_INT EventSet, C_INT domain, C_INT check ) * * @see PAPI_set_domain * @see PAPI_set_opt */ PAPI_FCALL( papif_set_event_domain, PAPIF_SET_EVENT_DOMAIN, ( int *es, int *domain, int *check ) ) { PAPI_option_t d; d.domain.domain = *domain; d.domain.eventset = *es; *check = PAPI_set_opt( PAPI_DOMAIN, &d ); } /** @class PAPIF_set_inherit * @ingroup PAPIF * @brief Turn on inheriting of counts from daughter to parent process. * * @par Fortran Prototype: * \#include "fpapi.h" @n * PAPIF_set_inherit( C_INT inherit, C_INT check ) * * @see PAPI_set_opt */ PAPI_FCALL( papif_set_inherit, PAPIF_SET_INHERIT, ( int *inherit, int *check ) ) { PAPI_option_t i; i.inherit.inherit = *inherit; *check = PAPI_set_opt( PAPI_INHERIT, &i ); } /** @class PAPIF_ipc * @ingroup PAPIF * @brief Get instructions per cycle, real and processor time. 
* * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_ipc( C_FLOAT real_time, C_FLOAT proc_time, C_LONG_LONG ins, C_FLOAT ipc, C_INT check ) * * @see PAPI_ipc */ PAPI_FCALL( papif_ipc, PAPIF_IPC, ( float *rtime, float *ptime, long long *ins, float *ipc, int *check ) ) { *check = PAPI_ipc( rtime, ptime, ins, ipc ); } /** @class PAPIF_epc * @ingroup PAPIF * @brief Get named events per cycle, real and processor time, reference and core cycles. * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_epc( C_INT EventCode, C_FLOAT real_time, C_FLOAT proc_time, C_LONG_LONG ref, C_LONG_LONG core, C_LONG_LONG evt, C_FLOAT epc, C_INT check ) * * @see PAPI_epc */ PAPI_FCALL( papif_epc, PAPIF_EPC, ( int *EventCode, float *rtime, float *ptime, long long *ref, long long *core, long long *evt, float *epc, int *check) ) { *check = PAPI_epc( *EventCode, rtime, ptime, ref, core, evt, epc ); } /** @class PAPIF_flips_rate * @ingroup PAPIF * @brief Simplified call to get Mflips/s (floating point instruction rate), real and processor time. * * @par Fortran Interface: * \#include "fpapi.h" @n * PAPIF_flips_rate ( C_INT EventCode, C_FLOAT real_time, C_FLOAT proc_time, C_LONG_LONG flpins, C_FLOAT mflips, C_INT check ) * * @see PAPI_flips_rate */ PAPI_FCALL( papif_flips_rate, PAPIF_FLIPS_RATE, ( int *EventCode, float *real_time, float *proc_time, long long *flpins, float *mflips, int *check ) ) { *check = PAPI_flips_rate( *EventCode, real_time, proc_time, flpins, mflips ); } /** @class PAPIF_flops_rate * @ingroup PAPIF * @brief Simplified call to get Mflops/s (floating point instruction rate), real and processor time. 
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_flops_rate( C_INT EventCode, C_FLOAT real_time, C_FLOAT proc_time, C_LONG_LONG flpops, C_FLOAT mflops, C_INT check )
 *
 * @see PAPI_flops_rate
 */
PAPI_FCALL( papif_flops_rate, PAPIF_FLOPS_RATE,
			( int *EventCode, float *real_time, float *proc_time,
			  long long *flpops, float *mflops, int *check ) )
{
	*check = PAPI_flops_rate( *EventCode, real_time, proc_time, flpops, mflops );
}

/** @class PAPIF_rate_stop
 * @ingroup PAPIF
 * @brief Stop a running event set of a rate function.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_rate_stop( C_INT check )
 *
 * @see PAPI_rate_stop
 */
PAPI_FCALL( papif_rate_stop, PAPIF_RATE_STOP, ( int *check ) )
{
	*check = PAPI_rate_stop( );
}

static void *sysdetect_fort_handle;

/** @class PAPIF_enum_dev_type
 * @ingroup PAPIF
 * @brief Returns the handle of the next device type.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_enum_dev_type( C_INT modifier, C_INT handle_index, C_INT check )
 *
 * @see PAPI_enum_dev_type
 */
PAPI_FCALL( papif_enum_dev_type, PAPIF_ENUM_DEV_TYPE,
			( int *modifier, int *handle_index, int *check ) )
{
	*check = PAPI_enum_dev_type( *modifier, &sysdetect_fort_handle );
	*handle_index = 0;
}

/** @class PAPIF_get_dev_type_attr
 * @ingroup PAPIF
 * @brief Returns device type attributes.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_get_dev_type_attr( C_INT handle_index, C_INT attribute, @n
 * C_INT value, C_STRING string, C_INT check )
 *
 * @see PAPI_get_dev_type_attr
 */
#if defined(_FORTRAN_STRLEN_AT_END)
PAPI_FCALL( papif_get_dev_type_attr, PAPIF_GET_DEV_TYPE_ATTR,
			( int *handle_index, int *attribute, int *value, char *string,
			  int *check, int string_len ) )
#else
PAPI_FCALL( papif_get_dev_type_attr, PAPIF_GET_DEV_TYPE_ATTR,
			( int *handle_index, int *attribute, int *value, char *string,
			  int *check ) )
#endif
{
	const char *string_ptr;
	int i;
	*handle_index = 0;
	*check = PAPI_OK;
	assert(sysdetect_fort_handle);
	switch(*attribute) {
	case
	PAPI_DEV_TYPE_ATTR__INT_PAPI_ID:
	case PAPI_DEV_TYPE_ATTR__INT_VENDOR_ID:
	case PAPI_DEV_TYPE_ATTR__INT_COUNT:
		*check = PAPI_get_dev_type_attr(sysdetect_fort_handle, *attribute, value);
		break;
	case PAPI_DEV_TYPE_ATTR__CHAR_NAME:
	case PAPI_DEV_TYPE_ATTR__CHAR_STATUS:
		*check = PAPI_get_dev_type_attr(sysdetect_fort_handle, *attribute, &string_ptr);
		if (*check != PAPI_OK) {
			break;
		}
#if defined(_FORTRAN_STRLEN_AT_END)
		/* copy at most string_len chars, then blank-pad the Fortran string */
		strncpy(string, string_ptr, (size_t) string_len);
		for ( i = ( int ) strlen(string_ptr); i < string_len; string[i++] = ' ' );
#else
		strcpy(string, string_ptr);
		for ( i = ( int ) strlen(string_ptr); i < PAPI_MAX_STR_LEN; string[i++] = ' ' );
#endif
		break;
	default:
		*check = PAPI_EINVAL;
	}
	*handle_index = 0;
	return;
}

/** @class PAPIF_get_dev_attr
 * @ingroup PAPIF
 * @brief Returns device attributes.
 *
 * @par Fortran Interface:
 * \#include "fpapi.h" @n
 * PAPIF_get_dev_attr( C_INT handle, C_INT id, C_INT attribute, @n
 * C_INT value, C_STRING string, C_INT check )
 *
 * @see PAPI_get_dev_attr
 */
#if defined(_FORTRAN_STRLEN_AT_END)
PAPI_FCALL( papif_get_dev_attr, PAPIF_GET_DEV_ATTR,
			( int *handle_index, int *id, int *attribute, int *value,
			  char *string, int *check, int string_len ) )
#else
PAPI_FCALL( papif_get_dev_attr, PAPIF_GET_DEV_ATTR,
			( int *handle_index, int *id, int *attribute, int *value,
			  char *string, int *check ) )
#endif
{
	int i;
	const char *string_ptr;
	*handle_index = 0;
	*check = PAPI_OK;
	assert(sysdetect_fort_handle);
	switch(*attribute) {
	case PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_SIZE:
	case PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_SIZE:
	case PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_SIZE:
	case PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_SIZE:
	case PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_LINE_SIZE:
	case PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_LINE_SIZE:
	case PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_LINE_SIZE:
	case PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_LINE_SIZE:
	case PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_LINE_COUNT:
	case PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_LINE_COUNT:
	case PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_LINE_COUNT:
	case
PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_LINE_COUNT: case PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_ASSOC: case PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_ASSOC: case PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_ASSOC: case PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_ASSOC: case PAPI_DEV_ATTR__CPU_UINT_FAMILY: case PAPI_DEV_ATTR__CPU_UINT_MODEL: case PAPI_DEV_ATTR__CPU_UINT_STEPPING: case PAPI_DEV_ATTR__CPU_UINT_SOCKET_COUNT: case PAPI_DEV_ATTR__CPU_UINT_NUMA_COUNT: case PAPI_DEV_ATTR__CPU_UINT_CORE_COUNT: case PAPI_DEV_ATTR__CPU_UINT_THREAD_COUNT: case PAPI_DEV_ATTR__CPU_UINT_THR_PER_NUMA: case PAPI_DEV_ATTR__CUDA_ULONG_UID: case PAPI_DEV_ATTR__CUDA_CHAR_DEVICE_NAME: case PAPI_DEV_ATTR__CUDA_UINT_WARP_SIZE: case PAPI_DEV_ATTR__CUDA_UINT_THR_PER_BLK: case PAPI_DEV_ATTR__CUDA_UINT_BLK_PER_SM: case PAPI_DEV_ATTR__CUDA_UINT_SHM_PER_BLK: case PAPI_DEV_ATTR__CUDA_UINT_SHM_PER_SM: case PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_X: case PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_Y: case PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_Z: case PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_X: case PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_Y: case PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_Z: case PAPI_DEV_ATTR__CUDA_UINT_SM_COUNT: case PAPI_DEV_ATTR__CUDA_UINT_MULTI_KERNEL: case PAPI_DEV_ATTR__CUDA_UINT_MAP_HOST_MEM: case PAPI_DEV_ATTR__CUDA_UINT_MEMCPY_OVERLAP: case PAPI_DEV_ATTR__CUDA_UINT_UNIFIED_ADDR: case PAPI_DEV_ATTR__CUDA_UINT_MANAGED_MEM: case PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MAJOR: case PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MINOR: case PAPI_DEV_ATTR__ROCM_ULONG_UID: case PAPI_DEV_ATTR__ROCM_UINT_SIMD_PER_CU: case PAPI_DEV_ATTR__ROCM_UINT_WORKGROUP_SIZE: case PAPI_DEV_ATTR__ROCM_UINT_WAVEFRONT_SIZE: case PAPI_DEV_ATTR__ROCM_UINT_WAVE_PER_CU: case PAPI_DEV_ATTR__ROCM_UINT_SHM_PER_WG: case PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_X: case PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_Y: case PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_Z: case PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_X: case PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_Y: case PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_Z: case PAPI_DEV_ATTR__ROCM_UINT_CU_COUNT: case 
	PAPI_DEV_ATTR__ROCM_UINT_COMP_CAP_MAJOR:
	case PAPI_DEV_ATTR__ROCM_UINT_COMP_CAP_MINOR:
		*check = PAPI_get_dev_attr(sysdetect_fort_handle, *id, *attribute, value);
		break;
	case PAPI_DEV_ATTR__CPU_CHAR_NAME:
	case PAPI_DEV_ATTR__ROCM_CHAR_DEVICE_NAME:
		*check = PAPI_get_dev_attr(sysdetect_fort_handle, *id, *attribute, &string_ptr);
		if (*check != PAPI_OK) {
			break;
		}
#if defined(_FORTRAN_STRLEN_AT_END)
		/* copy at most string_len chars, then blank-pad the Fortran string */
		strncpy(string, string_ptr, (size_t) string_len);
		for ( i = ( int ) strlen(string_ptr); i < string_len; string[i++] = ' ' );
#else
		strcpy(string, string_ptr);
		for ( i = ( int ) strlen(string_ptr); i < PAPI_MAX_STR_LEN; string[i++] = ' ' );
#endif
		break;
	default:
		*check = PAPI_EINVAL;
	}
	return;
}

/* The High Level API Wrappers */

/** \internal
@defgroup PAPIF-HL PAPI Fortran High Level API
*/

/** @class PAPIf_hl_region_begin
 * @ingroup PAPIF-HL
 * @brief Reads and stores hardware events at the beginning of an instrumented code region.
 *
 * @par Fortran Prototype:
 * \#include "fpapi.h" @n
 * PAPIf_hl_region_begin( C_STRING region, C_INT check )
 *
 * @retval PAPI_OK
 * @retval PAPI_ENOTRUN
 * -- EventSet is currently not running or could not be determined.
 * @retval PAPI_ESYS
 * -- A system or C library call failed inside PAPI, see the errno variable.
 * @retval PAPI_EMISC
 * -- PAPI has been deactivated due to previous errors.
 * @retval PAPI_ENOMEM
 * -- Insufficient memory.
 *
 * PAPIf_hl_region_begin reads hardware events and stores them internally at the beginning
 * of an instrumented code region.
 * If not specified via the environment variable PAPI_EVENTS, default events are used.
 * The first call sets all counters implicitly to zero and starts counting.
 * Note that if PAPI_EVENTS is not set or cannot be interpreted, default hardware events are
 * recorded.
 *
 * @par Example:
 *
 * @code
 * export PAPI_EVENTS="PAPI_TOT_INS,PAPI_TOT_CYC"
 * @endcode
 *
 *
 * @code
 * integer retval
 *
 * call PAPIf_hl_region_begin("computation", retval)
 * if ( retval .NE.
PAPI_OK ) then
 *     write (*,*) "PAPIf_hl_region_begin failed!"
 * end if
 *
 * !do some computation here
 *
 * call PAPIf_hl_region_end("computation", retval)
 * if ( retval .NE. PAPI_OK ) then
 *     write (*,*) "PAPIf_hl_region_end failed!"
 * end if
 *
 * @endcode
 *
 * @see PAPI_hl_region_begin
 */
#if defined(_FORTRAN_STRLEN_AT_END)
PAPI_FCALL( papif_hl_region_begin, PAPIF_HL_REGION_BEGIN,
			( char* name, int *check, int Event_len ) )
{
	char tmp[PAPI_MAX_STR_LEN];
	Fortran2cstring( tmp, name, PAPI_MAX_STR_LEN, Event_len );
	*check = PAPI_hl_region_begin( tmp );
}
#else
PAPI_FCALL( papif_hl_region_begin, PAPIF_HL_REGION_BEGIN,
			( char* name, int *check ) )
{
	*check = PAPI_hl_region_begin( name );
}
#endif

/** @class PAPIf_hl_read
 * @ingroup PAPIF-HL
 * @brief Reads and stores hardware events inside of an instrumented code region.
 *
 * @par Fortran Prototype:
 * \#include "fpapi.h" @n
 * PAPIf_hl_read( C_STRING region, C_INT check )
 *
 * @param region
 * -- a unique region name corresponding to PAPIf_hl_region_begin
 *
 * @retval PAPI_OK
 * @retval PAPI_ENOTRUN
 * -- EventSet is currently not running or could not be determined.
 * @retval PAPI_ESYS
 * -- A system or C library call failed inside PAPI, see the errno variable.
 * @retval PAPI_EMISC
 * -- PAPI has been deactivated due to previous errors.
 * @retval PAPI_ENOMEM
 * -- Insufficient memory.
 *
 * PAPIf_hl_read reads hardware events and stores them internally inside
 * of an instrumented code region.
 * Assumes that PAPIf_hl_region_begin was called before.
 *
 * @par Example:
 *
 * @code
 * integer retval
 *
 * call PAPIf_hl_region_begin("computation", retval)
 * if ( retval .NE. PAPI_OK ) then
 *     write (*,*) "PAPIf_hl_region_begin failed!"
 * end if
 *
 * !do some computation here
 *
 * call PAPIf_hl_read("computation", retval)
 * if ( retval .NE. PAPI_OK ) then
 *     write (*,*) "PAPIf_hl_read failed!"
 * end if
 *
 * !do some computation here
 *
 * call PAPIf_hl_region_end("computation", retval)
 * if ( retval .NE. PAPI_OK ) then
 *     write (*,*) "PAPIf_hl_region_end failed!"
* end if * * @endcode * * @see PAPI_hl_read */ #if defined(_FORTRAN_STRLEN_AT_END) PAPI_FCALL( papif_hl_read, PAPIF_HL_READ, ( char* name, int *check, int Event_len ) ) { char tmp[PAPI_MAX_STR_LEN]; Fortran2cstring( tmp, name, PAPI_MAX_STR_LEN, Event_len ); *check = PAPI_hl_read( tmp ); } #else PAPI_FCALL( papif_hl_read, PAPIF_HL_READ, ( char* name, int *check ) ) { *check = PAPI_hl_read( name ); } #endif /** @class PAPIf_hl_region_end * @ingroup PAPIF-HL * @brief Reads and stores hardware events at the end of an instrumented code region. * * @par Fortran Prototype: * \#include "fpapi.h" @n * PAPIf_hl_region_end( C_STRING region, C_INT check ) * * @param region * -- a unique region name corresponding to PAPIf_hl_region_begin * * @retval PAPI_OK * @retval PAPI_ENOTRUN * -- EventSet is currently not running or could not determined. * @retval PAPI_ESYS * -- A system or C library call failed inside PAPI, see the errno variable. * @retval PAPI_EMISC * -- PAPI has been deactivated due to previous erros. * @retval PAPI_ENOMEM * -- Insufficient memory. * * PAPIf_hl_region_end reads hardware events and stores the difference to the values from * PAPIf_hl_region_begin at the end of an instrumented code region. * Assumes that PAPIf_hl_region_begin was called before. * Note that an output is automatically generated when your application terminates. * * @par Example: * * @code * integer retval * * call PAPIf_hl_region_begin("computation", retval) * if ( retval .NE. PAPI_OK ) then * write (*,*) "PAPIf_hl_region_begin failed!" * end if * * !do some computation here * * call PAPIf_hl_region_end("computation", retval) * if ( retval .NE. PAPI_OK ) then * write (*,*) "PAPIf_hl_region_end failed!" 
* end if * * @endcode * * @see PAPI_hl_region_end */ #if defined(_FORTRAN_STRLEN_AT_END) PAPI_FCALL( papif_hl_region_end, PAPIF_HL_REGION_END, ( char* name, int *check, int Event_len ) ) { char tmp[PAPI_MAX_STR_LEN]; Fortran2cstring( tmp, name, PAPI_MAX_STR_LEN, Event_len ); *check = PAPI_hl_region_end( tmp ); } #else PAPI_FCALL( papif_hl_region_end, PAPIF_HL_REGION_END, ( char* name, int *check ) ) { *check = PAPI_hl_region_end( name ); } #endif /** @class PAPIf_hl_stop * @ingroup PAPIF-HL * @brief Stop a running high-level event set. * * @par Fortran Prototype: * \#include "fpapi.h" @n * PAPIf_hl_stop( C_INT check ) * * @retval PAPI_ENOEVNT * -- The EventSet is not started yet. * @retval PAPI_ENOMEM * -- Insufficient memory to complete the operation. * * PAPIf_hl_stop stops a running high-level event set. * * This call is optional and only necessary if the programmer wants to use the low-level API in addition * to the high-level API. It should be noted that PAPIf_hl_stop and low-level calls are not * allowed inside of a marked region. Furthermore, PAPIf_hl_stop is thread-local and therefore * has to be called in the same thread as the corresponding marked region. * * @par Example: * * @code * integer retval * * call PAPIf_hl_region_begin("computation", retval) * if ( retval .NE. PAPI_OK ) then * write (*,*) "PAPIf_hl_region_begin failed!" * end if * * !do some computation here * * call PAPIf_hl_region_end("computation", retval) * if ( retval .NE. PAPI_OK ) then * write (*,*) "PAPIf_hl_region_end failed!" * end if * * call PAPIf_hl_stop(retval) * if ( retval .NE. PAPI_OK ) then * write (*,*) "PAPIf_hl_stop failed!" 
* end if * * @endcode * * @see PAPI_hl_stop */ PAPI_FCALL( papif_hl_stop, PAPIF_HL_STOP, ( int *check ) ) { *check = PAPI_hl_stop( ); } #pragma GCC visibility pop papi-papi-7-2-0-t/src/papi_internal.c000066400000000000000000002724301502707512200174300ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: papi_internal.c * * Author: Philip Mucci * mucci@cs.utk.edu * Mods: dan terpstra * terpstra@cs.utk.edu * Mods: Min Zhou * min@cs.utk.edu * Mods: Kevin London * london@cs.utk.edu * Mods: Per Ekman * pek@pdc.kth.se * Mods: Haihang You * you@cs.utk.edu * Mods: Maynard Johnson * maynardj@us.ibm.com * Mods: Brian Sheely * bsheely@eecs.utk.edu * Mods: * * Mods: * */ #include #include #include #include #include #include #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" #include "sw_multiplex.h" #include "extras.h" #include "papi_preset.h" #include "cpus.h" #include "papi_common_strings.h" /* Advanced definitons */ static int default_debug_handler( int errorCode ); static long long handle_derived( EventInfo_t * evi, long long *from ); /* Global definitions used by other files */ int num_all_presets = 0; // total number of presets int _papi_hwi_start_idx[PAPI_NUM_COMP]; // first index for given component int first_comp_with_presets = -1; // track the first component that has presets int first_comp_preset_idx = PAPI_MAX_PRESET_EVENTS; // track the first non-perf_event component preset index int pe_disabled = 1; // track whether perf_event component is available int init_level = PAPI_NOT_INITED; int _papi_hwi_error_level = PAPI_QUIET; PAPI_debug_handler_t _papi_hwi_debug_handler = default_debug_handler; papi_mdi_t _papi_hwi_system_info; int _papi_hwi_errno = PAPI_OK; int _papi_hwi_num_errors = 0; hwi_presets_t user_defined_events[PAPI_MAX_USER_EVENTS]; int user_defined_events_count = 0; THREAD_LOCAL_STORAGE_KEYWORD int _papi_rate_events_running = 0; 
THREAD_LOCAL_STORAGE_KEYWORD int _papi_hl_events_running = 0; /*****************************/ /* Native Event Mapping Code */ /*****************************/ #define NATIVE_EVENT_CHUNKSIZE 1024 struct native_event_info { int cidx; int component_event; int ntv_idx; char *evt_name; }; // The following array is indexed by the papi event code (after the native bit has been removed) static struct native_event_info *_papi_native_events=NULL; static int num_native_events=0; static int num_native_chunks=0; char **_papi_errlist= NULL; static int num_error_chunks = 0; // pointer to event:mask string associated with last enum call to a components // will be NULL for non libpfm4 components // this is needed because libpfm4 event codes and papi event codes do not contain mask information char *papi_event_string = NULL; void _papi_hwi_set_papi_event_string (const char *event_string) { INTDBG("event_string: %s\n", event_string); if (papi_event_string != NULL) { free (papi_event_string); papi_event_string = NULL; } if (event_string != NULL) { papi_event_string = strdup(event_string); } return; } char * _papi_hwi_get_papi_event_string () { INTDBG("papi_event_string: %s\n", papi_event_string); return papi_event_string; } void _papi_hwi_free_papi_event_string() { if (papi_event_string != NULL) { free(papi_event_string); papi_event_string = NULL; } return; } void _papi_hwi_set_papi_event_code (unsigned int event_code, int update_flag) { INTDBG("new event_code: %#x, update_flag: %d, previous event_code: %#x\n", event_code, update_flag, _papi_hwi_my_thread->tls_papi_event_code); // if call is just to reset and start over, set both flags to show nothing saved yet if (update_flag < 0) { _papi_hwi_my_thread->tls_papi_event_code_changed = -1; _papi_hwi_my_thread->tls_papi_event_code = -1; return; } // if 0, it is being set prior to calling a component, if >0 it is being changed by the component _papi_hwi_my_thread->tls_papi_event_code_changed = update_flag; // save the event code passed in 
_papi_hwi_my_thread->tls_papi_event_code = event_code; return; } unsigned int _papi_hwi_get_papi_event_code () { INTDBG("papi_event_code: %#x\n", _papi_hwi_my_thread->tls_papi_event_code); return _papi_hwi_my_thread->tls_papi_event_code; } /* Get the index into the ESI->NativeInfoArray for the current PAPI event code */ int _papi_hwi_get_ntv_idx (unsigned int papi_evt_code) { INTDBG("ENTER: papi_evt_code: %#x\n", papi_evt_code); int result; int event_index; if (papi_evt_code == 0) { INTDBG("EXIT: PAPI_ENOEVNT, invalid papi event code\n"); return PAPI_ENOEVNT; } event_index=papi_evt_code&PAPI_NATIVE_AND_MASK; if ((event_index<0) || (event_index>=num_native_events)) { INTDBG("EXIT: PAPI_ENOEVNT, invalid index into native event array\n"); return PAPI_ENOEVNT; } result=_papi_native_events[event_index].ntv_idx; INTDBG("EXIT: result: %d\n", result); return result; } // // Check for the presence of a component name or pmu name in the event string. // If found check if it matches this component or one of the pmu's supported by this component. // // returns true if the event could be for this component and false if it is not for this component. // if there is no component or pmu name then it could be for this component and returns true. 
//
static int is_supported_by_component(int cidx, char *event_name) {
	INTDBG("ENTER: cidx: %d, event_name: %s\n", cidx, event_name);
	int i;
	int component_name = 0;
	int pmu_name = 0;
	char *wptr = NULL;

	// if event does not have a component name or pmu name, return to show it could be supported by this component
	// when component and pmu names are not provided, we just have to call the components to see if they recognize the event
	//
	// look for component names first
	if ((wptr = strstr(event_name, ":::")) != NULL) {
		component_name = 1;
	} else if ((wptr = strstr(event_name, "::")) != NULL) {
		pmu_name = 1;
	} else {
		INTDBG("EXIT: No Component or PMU name in event string, try this component\n");
		// need to force all components to be called to find owner of this event
		// ???? can we assume the default pmu when no component or pmu name is provided ????
		return 1;
	}

	// get a temporary copy of the component or pmu name
	int name_len = wptr - event_name;
	wptr = strdup(event_name);
	wptr[name_len] = '\0';

	// if a component name was found, compare it to the component name in the component info structure
	if (component_name) {
		// INTDBG("component_name: %s\n", _papi_hwd[cidx]->cmp_info.name);
		if (strcmp (wptr, _papi_hwd[cidx]->cmp_info.name) == 0) {
			free (wptr);
			INTDBG("EXIT: Component %s supports this event\n", _papi_hwd[cidx]->cmp_info.name);
			return 1;
		}
	}

	// if a pmu name was found, compare it to the pmu name list of the component info structure (if there is one)
	if (pmu_name) {
		for ( i=0 ; i<PAPI_PMU_MAX ; i++) {
			if (_papi_hwd[cidx]->cmp_info.pmu_names[i] == NULL) {
				continue;
			}
			// INTDBG("pmu_name[%d]: %p (%s)\n", i, _papi_hwd[cidx]->cmp_info.pmu_names[i], _papi_hwd[cidx]->cmp_info.pmu_names[i]);
			if (strcmp (wptr, _papi_hwd[cidx]->cmp_info.pmu_names[i]) == 0) {
				INTDBG("EXIT: Component %s supports PMU %s and this event\n", _papi_hwd[cidx]->cmp_info.name, wptr);
				free (wptr);
				return 1;
			}
		}
	}

	free (wptr);
	INTDBG("EXIT: Component does not support this event\n");
	return 0;
}

/** @internal
 * @class _papi_hwi_prefix_component_name
 * @brief Prefixes a component's name to each of its events.
 * @param *component_name
 * @param *event_name
 * @param *out
 * @param *out_len
 *
 * Given sane component_name and event_name it returns component_name:::event_name.
 * It is safe in the case that event_name == out and it checks against the
 * traditional PAPI 'cpu' components, opting to not prepend those.
 */
int
_papi_hwi_prefix_component_name( char *component_name, char *event_name, char *out, int out_len)
{
	int size1, size2;
	char temp[out_len];

	size1 = strlen(event_name);
	size2 = strlen(component_name);

	/* sanity checks */
	if ( size1 == 0 ) {
		return (PAPI_EBUG); /* hopefully event_name always has length?! */
	}

	if ( size1 >= out_len )
		return (PAPI_ENOMEM);

	/* Guard against event_name == out */
	memcpy( temp, event_name, out_len );

	/* no component name to prefix */
	if ( size2 == 0 ) {
		sprintf(out, "%s%c", temp, '\0' );
		return (PAPI_OK);
	}

	/* Don't prefix 'cpu' component names for now */
	if ( strstr(component_name, "pe") ||
		strstr(component_name, "bgq") ||
		strstr(component_name, "bgp") ) {
		sprintf( out, "%s%c", temp, '\0');
		return (PAPI_OK);
	}

	/* strlen(component_name) + ::: + strlen(event_name) + NULL */
	if ( size1+size2+3+1 > out_len )
		return (PAPI_ENOMEM);

	sprintf( out, "%s:::%s%c" , component_name, temp, '\0');
	return (PAPI_OK);
}

/** @internal
 * @class _papi_hwi_strip_component_prefix
 * @brief Strip off cmp_name::: from an event name.
 *
 * @param *event_name
 * @return Start of the component consumable portion of the name.
 *
 * This function checks specifically for ':::' and will return the start of
 * event_name if it doesn't find the ::: .
 */
const char *_papi_hwi_strip_component_prefix(const char *event_name)
{
	const char *start = NULL;
	/* We assume ::: is the separator
	 * eg:
	 *   papi_component:::event_name
	 */
	start = strstr( event_name, ":::" );
	if ( start != NULL )
		start+= 3; /* return the actual start of event_name */
	else
		start = event_name;
	return (start);
}

/* find the papi event code (4000xxx) associated with the specified
   component, native event, and event name */
static int _papi_hwi_find_native_event(int cidx, int event, const char *event_name) {
	INTDBG("ENTER: cidx: %x, event: %#x, event_name: %s\n", cidx, event, event_name);

	int i;

	// if no event name passed in, it can not be found
	if (event_name == NULL) {
		INTDBG("EXIT: PAPI_ENOEVNT\n");
		return PAPI_ENOEVNT;
	}

	for(i=0;i<num_native_events;i++) {
		// found the event if the component, component event code,
		// and event name all match an existing table entry
		if ((_papi_native_events[i].cidx==cidx) &&
		    (_papi_native_events[i].component_event==event) &&
		    (_papi_native_events[i].evt_name != NULL) &&
		    (strcmp(event_name, _papi_native_events[i].evt_name) == 0)) {
			INTDBG("EXIT: event: %#x\n", i|PAPI_NATIVE_MASK);
			return i|PAPI_NATIVE_MASK;
		}
	}

	INTDBG("EXIT: PAPI_ENOEVNT\n");
	return PAPI_ENOEVNT;
}

/* allocate a new papi event code for a component native event and
   add it to the native event table */
static int _papi_hwi_add_native_event(int cidx, int ntv_event, int ntv_idx, const char *event_name)
{
	INTDBG("ENTER: cidx: %d, ntv_event: %#x, ntv_idx: %d, event_name: %s\n", cidx, ntv_event, ntv_idx, event_name);

	int new_native_event;

	_papi_hwi_lock( INTERNAL_LOCK );

	if (num_native_events>=num_native_chunks*NATIVE_EVENT_CHUNKSIZE) {
		num_native_chunks++;
		_papi_native_events=(struct native_event_info *)
			realloc(_papi_native_events,
				num_native_chunks*NATIVE_EVENT_CHUNKSIZE*
				sizeof(struct native_event_info));
		if (_papi_native_events==NULL) {
			new_native_event=PAPI_ENOMEM;
			goto native_alloc_early_out;
		}
	}

	_papi_native_events[num_native_events].cidx=cidx;
	_papi_native_events[num_native_events].component_event=ntv_event;
	_papi_native_events[num_native_events].ntv_idx=ntv_idx;
	if (event_name != NULL) {
		_papi_native_events[num_native_events].evt_name=strdup(event_name);
	} else {
		_papi_native_events[num_native_events].evt_name=NULL;
	}
	new_native_event=num_native_events|PAPI_NATIVE_MASK;
	num_native_events++;

native_alloc_early_out:
	_papi_hwi_unlock( INTERNAL_LOCK );

	INTDBG("EXIT: new_native_event: %#x, num_native_events: %d\n", new_native_event, num_native_events);

	return new_native_event;
}

/** @internal
 * @class _papi_hwi_add_error
 *
 * Adds a new error string to PAPI's internal store.
 * MAKE SURE you are not holding INTERNAL_LOCK when you call me!
*/ static int _papi_hwi_add_error( char *error ) { INTDBG("Adding a new Error message |%s|\n", error); _papi_hwi_lock(INTERNAL_LOCK); if (_papi_hwi_num_errors >= num_error_chunks*NATIVE_EVENT_CHUNKSIZE) { num_error_chunks++; _papi_errlist= (char **) realloc(_papi_errlist, num_error_chunks*NATIVE_EVENT_CHUNKSIZE*sizeof(char *)); if (_papi_errlist==NULL) { _papi_hwi_num_errors = -2; goto bail; } } _papi_errlist[_papi_hwi_num_errors] = strdup( error ); if ( _papi_errlist[_papi_hwi_num_errors] == NULL ) _papi_hwi_num_errors = -2; bail: _papi_hwi_unlock(INTERNAL_LOCK); return _papi_hwi_num_errors++; } static void _papi_hwi_cleanup_errors() { int i; if ( _papi_errlist == NULL || _papi_hwi_num_errors == 0 ) return; _papi_hwi_lock( INTERNAL_LOCK ); for (i=0; i < _papi_hwi_num_errors; i++ ) { free( _papi_errlist[i]); _papi_errlist[i] = NULL; } free( _papi_errlist ); _papi_errlist = NULL; _papi_hwi_num_errors = 0; num_error_chunks=0; _papi_hwi_unlock( INTERNAL_LOCK ); } static int _papi_hwi_lookup_error( char *error ) { int i; for (i=0; i<_papi_hwi_num_errors; i++) { if ( !strncasecmp( _papi_errlist[i], error, strlen( error ) ) ) return i; } return (-1); } /** @internal * @class _papi_hwi_publish_error * * @return * <= 0 : Code for the error. * < 0 : We couldn't get memory to allocate for your error. * * An internal interface for adding an error code to the library. * The returned code is suitable for returning to users. * */ int _papi_hwi_publish_error( char *error ) { int error_code = -1; if ( (error_code = _papi_hwi_lookup_error( error )) < 0 ) error_code = _papi_hwi_add_error(error); return (-error_code); /* internally error_code is an index, externally, it should be <= 0 */ } /* Why are the errors done this way? Should they not be auto-generated the same way the Fortran ones are? 
--vmw */ void _papi_hwi_init_errors(void) { /* we use add error to avoid the cost of lookups, we know the errors are not there yet */ /* 0 PAPI_OK */ _papi_hwi_add_error("No error"); /* 1 PAPI_EINVAL */ _papi_hwi_add_error("Invalid argument"); /* 2 PAPI_ENOMEM */ _papi_hwi_add_error("Insufficient memory"); /* 3 PAPI_ESYS */ _papi_hwi_add_error("A System/C library call failed"); /* 4 PAPI_ECMP */ _papi_hwi_add_error("Not supported by component"); /* 5 PAPI_ECLOST */ _papi_hwi_add_error("Access to the counters was lost or interrupted"); /* 6 PAPI_EBUG */ _papi_hwi_add_error("Internal error, please send mail to the developers"); /* 7 PAPI_ENOEVNT */ _papi_hwi_add_error("Event does not exist"); /* 8 PAPI_ECNFLCT */ _papi_hwi_add_error("Event exists, but cannot be counted due to hardware resource limits"); /* 9 PAPI_ENOTRUN */ _papi_hwi_add_error("EventSet is currently not running"); /* 10 PAPI_EISRUN */ _papi_hwi_add_error("EventSet is currently counting"); /* 11 PAPI_ENOEVST */ _papi_hwi_add_error("No such EventSet available"); /* 12 PAPI_ENOTPRESET */_papi_hwi_add_error("Event in argument is not a valid preset"); /* 13 PAPI_ENOCNTR */ _papi_hwi_add_error("Hardware does not support performance counters"); /* 14 PAPI_EMISC */ _papi_hwi_add_error("Unknown error code"); /* 15 PAPI_EPERM */ _papi_hwi_add_error("Permission level does not permit operation"); /* 16 PAPI_ENOINIT */ _papi_hwi_add_error("PAPI hasn't been initialized yet"); /* 17 PAPI_ENOCMP */ _papi_hwi_add_error("Component Index isn't set"); /* 18 PAPI_ENOSUPP */ _papi_hwi_add_error("Not supported"); /* 19 PAPI_ENOIMPL */ _papi_hwi_add_error("Not implemented"); /* 20 PAPI_EBUF */ _papi_hwi_add_error("Buffer size exceeded"); /* 21 PAPI_EINVAL_DOM */_papi_hwi_add_error("EventSet domain is not supported for the operation"); /* 22 PAPI_EATTR */ _papi_hwi_add_error("Invalid or missing event attributes"); /* 23 PAPI_ECOUNT */ _papi_hwi_add_error("Too many events or attributes"); /* 24 PAPI_ECOMBO */ 
_papi_hwi_add_error("Bad combination of features"); /* 25 PAPI_ECMP_DISABLED */_papi_hwi_add_error("Component containing event is disabled"); /* 26 PAPI_EDELAY_INIT */ _papi_hwi_add_error("Delayed initialization component"); /* 27 PAPI_EMULPASS */ _papi_hwi_add_error("Event exists, but cannot be counted due to multiple passes required by hardware"); /* 28 PAPI_PARTIAL */ _papi_hwi_add_error("Component in use is partially disabled, see utils/papi_component_avail for more information."); } int _papi_hwi_invalid_cmp( int cidx ) { return ( cidx < 0 || cidx >= papi_num_components ); } int _papi_hwi_component_index( int event_code ) { INTDBG("ENTER: event_code: %#x\n", event_code); int cidx; int event_index; if (IS_PRESET(event_code)) { INTDBG("EXIT: Event %#x is a PRESET, assigning component %d\n", event_code,0); event_index = event_code & PAPI_PRESET_AND_MASK; return get_preset_cmp(&event_index); } /* user defined events are treated like preset events (component 0 only) */ if (IS_USER_DEFINED(event_code)) { INTDBG("EXIT: Event %#x is USER DEFINED, assigning component %d\n", event_code,0); return 0; } event_index=event_code&PAPI_NATIVE_AND_MASK; if ( (event_index < 0) || (event_index>=num_native_events)) { INTDBG("EXIT: Event index %#x is out of range, num_native_events: %d\n", event_index, num_native_events); return PAPI_ENOEVNT; } cidx=_papi_native_events[event_index].cidx; if ((cidx<0) || (cidx >= papi_num_components)) { INTDBG("EXIT: Component index %#x is out of range, papi_num_components: %d\n", cidx, papi_num_components); return PAPI_ENOCMP; } INTDBG("EXIT: Found cidx: %d event_index: %d, event_code: %#x\n", cidx, event_index, event_code); return cidx; } /* Convert an internal component event to a papi event code */ int _papi_hwi_native_to_eventcode(int cidx, int event_code, int ntv_idx, const char *event_name) { INTDBG("Entry: cidx: %d, event: %#x, ntv_idx: %d, event_name: %s\n", cidx, event_code, ntv_idx, event_name); int result; if 
(_papi_hwi_my_thread->tls_papi_event_code_changed > 0) { result = _papi_hwi_get_papi_event_code(); INTDBG("EXIT: papi_event_code: %#x set by the component\n", result); return result; } result=_papi_hwi_find_native_event(cidx, event_code, event_name); if (result==PAPI_ENOEVNT) { // Need to create one result=_papi_hwi_add_native_event(cidx, event_code, ntv_idx, event_name); } INTDBG("EXIT: result: %#x\n", result); return result; } /* Convert a native_event code to an internal event code */ int _papi_hwi_eventcode_to_native(int event_code) { INTDBG("ENTER: event_code: %#x\n", event_code); int result; int event_index; event_index=event_code&PAPI_NATIVE_AND_MASK; if ((event_index < 0) || (event_index>=num_native_events)) { INTDBG("EXIT: PAPI_ENOEVNT\n"); return PAPI_ENOEVNT; } result=_papi_native_events[event_index].component_event; INTDBG("EXIT: result: %#x\n", result); return result; } /*********************/ /* Utility functions */ /*********************/ void PAPIERROR( char *format, ... ) { va_list args; if ( ( _papi_hwi_error_level != PAPI_QUIET ) || ( getenv( "PAPI_VERBOSE" ) ) ) { va_start( args, format ); fprintf( stderr, "PAPI Error: " ); vfprintf( stderr, format, args ); fprintf( stderr, "\n" ); va_end( args ); } } void PAPIWARN( char *format, ... ) { va_list args; if ( ( _papi_hwi_error_level != PAPI_QUIET ) || ( getenv( "PAPI_VERBOSE" ) ) ) { va_start( args, format ); fprintf( stderr, "PAPI Warning: " ); vfprintf( stderr, format, args ); fprintf( stderr, "\n" ); va_end( args ); } } /* Construct fully qualified event names for the native events in a preset. */ int construct_qualified_event(hwi_presets_t *prstPtr) { int j; for(j = 0; j < prstPtr->count; j++ ) { /* Construct event with all qualifiers. */ int k, strLenSum = 0, baseLen = 1+strlen(prstPtr->base_name[j]); for (k = 0; k < prstPtr->num_quals; k++){ strLenSum += strlen(prstPtr->quals[k]); } strLenSum += baseLen; /* Allocate space for constructing fully qualified event. 
*/ char *tmpEvent = (char*)malloc(strLenSum*sizeof(char)); char *tmpQuals = (char*)malloc(strLenSum*sizeof(char)); if( NULL == tmpQuals || NULL == tmpEvent ) { SUBDBG("EXIT: Could not allocate memory.\n"); return PAPI_ENOMEM; } /* Print the basename to a string. */ int status = snprintf(tmpEvent, baseLen, "%s", prstPtr->base_name[j]); if( status < 0 || status >= baseLen ) { PAPIERROR("Event basename %s was truncated to %s in derived event %s", prstPtr->base_name[j], tmpEvent, prstPtr->symbol); return PAPI_ENOMEM; } /* Concatenate the qualifiers onto the string. */ status = 0; for (k = 0; k < prstPtr->num_quals; k++) { status = snprintf(tmpQuals, strLenSum, "%s%s", tmpEvent, prstPtr->quals[k]); strcpy(tmpEvent, tmpQuals); } if( status < 0 || status >= strLenSum ) { PAPIERROR("Event %s with qualifiers was truncated to %s in derived event %s", prstPtr->base_name[j], tmpEvent, prstPtr->symbol); return PAPI_ENOMEM; } /* Set the new name, which includes the qualifiers. */ free(prstPtr->name[j]); prstPtr->name[j] = strdup(tmpEvent); /* Set the corresponding new code. */ status = _papi_hwi_native_name_to_code( tmpEvent, &(prstPtr->code[j]) ); if( PAPI_OK != status ) { PAPIERROR("Failed to get code for native event %s used in derived event %s\n", tmpEvent, prstPtr->symbol); return PAPI_EINVAL; } /* Free dynamically allocated memory. */ free(tmpQuals); free(tmpEvent); } return PAPI_OK; } /* Overwrite qualifiers in the preset struct based on those provided in the input string. */ int overwrite_qualifiers(hwi_presets_t *prstPtr, const char *in, int is_preset) { char *qualDelim = ":"; char **providedQuals = (char**)malloc(sizeof(char*)*(prstPtr->num_quals)); int numProvidedQuals = 0; int k; for (k = 0; k < prstPtr->num_quals; k++){ providedQuals[k] = (char*)malloc(sizeof(char)*(PAPI_MAX_STR_LEN+1)); } char *givenName = strdup(in); char *qualName = strtok(givenName, ":"); qualName = strtok(NULL, ":"); /* Skip past component prefix. 
*/ if( !is_preset ) { qualName = strtok(NULL, ":"); } k = 0; while( qualName != NULL ) { size_t qualLen = 1+strlen(qualDelim)+strlen(qualName); int status = snprintf(providedQuals[k], qualLen, "%s%s", qualDelim, qualName); if( status < 0 || status >= qualLen ) { PAPIERROR("Failed to make copy of qualifier %s", qualName); return PAPI_ENOMEM; } k++; numProvidedQuals++; qualName = strtok(NULL, ":"); } /* If a specific qualifier was provided, use that as the default value * for the qualifier for the preset. To accomplish this, find the same * qualifier in the preset struct's list, and overwrite it. */ int l, breakFlag = 0; char *wholeQual1, *matchQual1, *wholeQual2, *matchQual2; /* For each qualifier provided. */ for (k = 0; k < numProvidedQuals; k++) { wholeQual1 = strdup(providedQuals[k]); matchQual1 = strtok(wholeQual1, "="); /* For each qualifier in the preset struct. */ for (l = 0; l < prstPtr->num_quals; l++) { wholeQual2 = strdup(prstPtr->quals[l]); matchQual2 = strtok(wholeQual2, "="); if( strcmp(matchQual1, matchQual2) == 0 ) { breakFlag = 1; free(wholeQual2); break; } free(wholeQual2); } free(wholeQual1); /* The qualifier was found, so overwrite it with the provided value. */ if( breakFlag ) { free(prstPtr->quals[l]); prstPtr->quals[l] = strdup(providedQuals[k]); breakFlag = 0; } } free(givenName); for (k = 0; k < prstPtr->num_quals; k++){ free(providedQuals[k]); } free(providedQuals); return PAPI_OK; } /* Return index of first non-perf_event component's preset. */ int get_first_cmp_preset_idx( void ) { int cmpnt = first_comp_with_presets; if( cmpnt < 0 ) { return PAPI_EINVAL; } return first_comp_preset_idx; } /* Return index of component containing preset with given index. 
*/ int get_preset_cmp( unsigned int *index ) { unsigned int sum = 0; if(pe_disabled) { sum += PAPI_MAX_PRESET_EVENTS; if(*index < sum) { return PAPI_EMISC; } } int i; for(i = 0; i < PAPI_NUM_COMP; ++i) { sum += _papi_hwi_max_presets[i]; if(*index < sum) { *index = *index - (sum - _papi_hwi_max_presets[i]); return i; } } /* If we did not find the component to which the preset belongs. */ return PAPI_EINVAL; } /* Return a pointer to preset which has given event code. */ hwi_presets_t* get_preset( int event_code ) { unsigned int preset_index = ( event_code & PAPI_PRESET_AND_MASK ); hwi_presets_t *_papi_hwi_list; int i = get_preset_cmp(&preset_index); if( i == PAPI_EINVAL ) { return NULL; } if( i == PAPI_EMISC ) { _papi_hwi_list = _papi_hwi_presets; } if( i >= 0 ) { _papi_hwi_list = _papi_hwi_comp_presets[i]; } return &_papi_hwi_list[preset_index]; } static int default_debug_handler( int errorCode ) { char str[PAPI_HUGE_STR_LEN]; if ( errorCode == PAPI_OK ) return ( errorCode ); if ( ( errorCode > 0 ) || ( -errorCode > _papi_hwi_num_errors ) ) { PAPIERROR( "%s %d,%s,Bug! 
Unknown error code", PAPI_ERROR_CODE_str, errorCode, "" ); return ( PAPI_EBUG ); } switch ( _papi_hwi_error_level ) { case PAPI_VERB_ECONT: case PAPI_VERB_ESTOP: /* gcc 2.96 bug fix, do not change */ /* fprintf(stderr,"%s %d: %s: %s\n",PAPI_ERROR_CODE_str,errorCode,_papi_hwi_err[-errorCode].name,_papi_hwi_err[-errorCode].descr); */ sprintf( str, "%s %d,%s", PAPI_ERROR_CODE_str, errorCode, _papi_errlist[-errorCode] ); if ( errorCode == PAPI_ESYS ) sprintf( str + strlen( str ), ": %s", strerror( errno ) ); PAPIERROR( str ); if ( _papi_hwi_error_level == PAPI_VERB_ESTOP ) abort( ); /* patch provided by will cohen of redhat */ else return errorCode; break; case PAPI_QUIET: default: return errorCode; } return ( PAPI_EBUG ); /* Never get here */ } static int allocate_eventset_map( DynamicArray_t * map ) { /* Allocate and clear the Dynamic Array structure */ if ( map->dataSlotArray != NULL ) papi_free( map->dataSlotArray ); memset( map, 0x00, sizeof ( DynamicArray_t ) ); /* Allocate space for the EventSetInfo_t pointers */ map->dataSlotArray = ( EventSetInfo_t ** ) papi_malloc( PAPI_INIT_SLOTS * sizeof ( EventSetInfo_t * ) ); if ( map->dataSlotArray == NULL ) { return ( PAPI_ENOMEM ); } memset( map->dataSlotArray, 0x00, PAPI_INIT_SLOTS * sizeof ( EventSetInfo_t * ) ); map->totalSlots = PAPI_INIT_SLOTS; map->availSlots = PAPI_INIT_SLOTS; map->fullSlots = 0; return ( PAPI_OK ); } static int expand_dynamic_array( DynamicArray_t * DA ) { int number; EventSetInfo_t **n; /*realloc existing PAPI_EVENTSET_MAP.dataSlotArray */ number = DA->totalSlots * 2; n = ( EventSetInfo_t ** ) papi_realloc( DA->dataSlotArray, ( size_t ) number * sizeof ( EventSetInfo_t * ) ); if ( n == NULL ) return ( PAPI_ENOMEM ); /* Need to assign this value, what if realloc moved it? 
*/ DA->dataSlotArray = n; memset( DA->dataSlotArray + DA->totalSlots, 0x00, ( size_t ) DA->totalSlots * sizeof ( EventSetInfo_t * ) ); DA->totalSlots = number; DA->availSlots = number - DA->fullSlots; return ( PAPI_OK ); } static int EventInfoArrayLength( const EventSetInfo_t * ESI ) { return ( _papi_hwd[ESI->CmpIdx]->cmp_info.num_mpx_cntrs ); } /*========================================================================*/ /* This function allocates space for one EventSetInfo_t structure and for */ /* all of the pointers in this structure. If any malloc in this function */ /* fails, all memory malloced to the point of failure is freed, and NULL */ /* is returned. Upon success, a pointer to the EventSetInfo_t data */ /* structure is returned. */ /*========================================================================*/ static int create_EventSet( EventSetInfo_t ** here ) { EventSetInfo_t *ESI; ESI = ( EventSetInfo_t * ) papi_calloc( 1, sizeof ( EventSetInfo_t ) ); if ( ESI == NULL ) { return PAPI_ENOMEM; } *here = ESI; return PAPI_OK; } int _papi_hwi_assign_eventset( EventSetInfo_t *ESI, int cidx ) { INTDBG("ENTER: ESI: %p (%d), cidx: %d\n", ESI, ESI->EventSetIndex, cidx); int retval; size_t max_counters; char *ptr; unsigned int i, j; /* If component doesn't exist... */ if (_papi_hwi_invalid_cmp(cidx)) return PAPI_ECMP; /* Assigned at create time */ ESI->domain.domain = _papi_hwd[cidx]->cmp_info.default_domain; ESI->granularity.granularity = _papi_hwd[cidx]->cmp_info.default_granularity; ESI->CmpIdx = cidx; /* ??? 
*/ max_counters = ( size_t ) _papi_hwd[cidx]->cmp_info.num_mpx_cntrs; ESI->ctl_state = (hwd_control_state_t *) papi_calloc( 1, (size_t) _papi_hwd[cidx]->size.control_state ); ESI->sw_stop = (long long *) papi_calloc( ( size_t ) max_counters, sizeof ( long long ) ); ESI->hw_start = ( long long * ) papi_calloc( ( size_t ) max_counters, sizeof ( long long ) ); ESI->EventInfoArray = ( EventInfo_t * ) papi_calloc( (size_t) max_counters, sizeof ( EventInfo_t ) ); /* allocate room for the native events and for the component-private */ /* register structures */ /* ugh is there a cleaner way to allocate this? vmw */ ESI->NativeInfoArray = ( NativeInfo_t * ) papi_calloc( ( size_t ) max_counters, sizeof ( NativeInfo_t )); ESI->NativeBits = papi_calloc(( size_t ) max_counters, ( size_t ) _papi_hwd[cidx]->size.reg_value ); /* NOTE: the next two malloc allocate blocks of memory that are later */ /* parcelled into overflow and profile arrays */ ESI->overflow.deadline = ( long long * ) papi_malloc( ( sizeof ( long long ) + sizeof ( int ) * 3 ) * ( size_t ) max_counters ); ESI->profile.prof = ( PAPI_sprofil_t ** ) papi_malloc( ( sizeof ( PAPI_sprofil_t * ) * ( size_t ) max_counters + ( size_t ) max_counters * sizeof ( int ) * 4 ) ); /* If any of these allocations failed, free things up and fail */ if ( ( ESI->ctl_state == NULL ) || ( ESI->sw_stop == NULL ) || ( ESI->hw_start == NULL ) || ( ESI->NativeInfoArray == NULL ) || ( ESI->NativeBits == NULL ) || ( ESI->EventInfoArray == NULL ) || ( ESI->profile.prof == NULL ) || ( ESI->overflow.deadline == NULL ) ) { if ( ESI->sw_stop ) papi_free( ESI->sw_stop ); if ( ESI->hw_start ) papi_free( ESI->hw_start ); if ( ESI->EventInfoArray ) papi_free( ESI->EventInfoArray ); if ( ESI->NativeInfoArray ) papi_free( ESI->NativeInfoArray ); if ( ESI->NativeBits ) papi_free( ESI->NativeBits ); if ( ESI->ctl_state ) papi_free( ESI->ctl_state ); if ( ESI->overflow.deadline ) papi_free( ESI->overflow.deadline ); if ( ESI->profile.prof ) papi_free( 
ESI->profile.prof ); papi_free( ESI ); return PAPI_ENOMEM; } /* Carve up the overflow block into separate arrays */ ptr = ( char * ) ESI->overflow.deadline; ptr += sizeof ( long long ) * max_counters; ESI->overflow.threshold = ( int * ) ptr; ptr += sizeof ( int ) * max_counters; ESI->overflow.EventIndex = ( int * ) ptr; ptr += sizeof ( int ) * max_counters; ESI->overflow.EventCode = ( int * ) ptr; /* Carve up the profile block into separate arrays */ ptr = ( char * ) ESI->profile.prof + ( sizeof ( PAPI_sprofil_t * ) * max_counters ); ESI->profile.count = ( int * ) ptr; ptr += sizeof ( int ) * max_counters; ESI->profile.threshold = ( int * ) ptr; ptr += sizeof ( int ) * max_counters; ESI->profile.EventIndex = ( int * ) ptr; ptr += sizeof ( int ) * max_counters; ESI->profile.EventCode = ( int * ) ptr; /* initialize_EventInfoArray */ for ( i = 0; i < max_counters; i++ ) { ESI->EventInfoArray[i].event_code=( unsigned int ) PAPI_NULL; ESI->EventInfoArray[i].ops = NULL; ESI->EventInfoArray[i].derived=NOT_DERIVED; for ( j = 0; j < PAPI_EVENTS_IN_DERIVED_EVENT; j++ ) { ESI->EventInfoArray[i].pos[j] = PAPI_NULL; } } /* initialize_NativeInfoArray */ for( i = 0; i < max_counters; i++ ) { ESI->NativeInfoArray[i].ni_event = -1; ESI->NativeInfoArray[i].ni_position = -1; ESI->NativeInfoArray[i].ni_papi_code = -1; ESI->NativeInfoArray[i].ni_owners = 0; ESI->NativeInfoArray[i].ni_bits = ((unsigned char*)ESI->NativeBits) + (i*_papi_hwd[cidx]->size.reg_value); } ESI->NativeCount = 0; ESI->state = PAPI_STOPPED; /* these used to be init_config */ retval = _papi_hwd[cidx]->init_control_state( ESI->ctl_state ); retval |= _papi_hwd[cidx]->set_domain( ESI->ctl_state, ESI->domain.domain); return retval; } /*========================================================================*/ /* This function should free memory for one EventSetInfo_t structure. */ /* The argument list consists of a pointer to the EventSetInfo_t */ /* structure, *ESI. 
*/
/* The calling function should check for ESI==NULL.                       */
/*========================================================================*/

void
_papi_hwi_free_EventSet( EventSetInfo_t * ESI )
{
	_papi_hwi_cleanup_eventset( ESI );
#ifdef DEBUG
	memset( ESI, 0x00, sizeof ( EventSetInfo_t ) );
#endif
	papi_free( ESI );
}

static int
add_EventSet( EventSetInfo_t * ESI, ThreadInfo_t * master )
{
	DynamicArray_t *map = &_papi_hwi_system_info.global_eventset_map;
	int i, errorCode;

	_papi_hwi_lock( INTERNAL_LOCK );

	if ( map->availSlots == 0 ) {
		errorCode = expand_dynamic_array( map );
		if ( errorCode < PAPI_OK ) {
			_papi_hwi_unlock( INTERNAL_LOCK );
			return ( errorCode );
		}
	}

	for ( i = 0; i < map->totalSlots; i++ ) {
		if ( map->dataSlotArray[i] == NULL ) {
			ESI->master = master;
			ESI->EventSetIndex = i;
			map->fullSlots++;
			map->availSlots--;
			map->dataSlotArray[i] = ESI;
			_papi_hwi_unlock( INTERNAL_LOCK );
			return ( PAPI_OK );
		}
	}

	_papi_hwi_unlock( INTERNAL_LOCK );
	return ( PAPI_EBUG );
}

int
_papi_hwi_create_eventset( int *EventSet, ThreadInfo_t * handle )
{
	EventSetInfo_t *ESI;
	int retval;

	/* Is the EventSet already in existence? */
	if ( ( EventSet == NULL ) || ( handle == NULL ) )
		return PAPI_EINVAL;

	if ( *EventSet != PAPI_NULL )
		return PAPI_EINVAL;

	/* Well, then allocate a new one. Use n to keep track of a NEW EventSet */
	retval = create_EventSet( &ESI );
	if ( retval != PAPI_OK )
		return retval;

	/* when eventset is created, it is not decided yet which component
	   it belongs to, until first event is added */
	ESI->CmpIdx = -1;
	ESI->state = PAPI_STOPPED;

	/* Add it to the global table */
	retval = add_EventSet( ESI, handle );
	if ( retval < PAPI_OK ) {
		_papi_hwi_free_EventSet( ESI );
		return retval;
	}

	*EventSet = ESI->EventSetIndex;

	INTDBG( "(%p,%p): new EventSet in slot %d\n",
			( void * ) EventSet, handle, *EventSet );

	return retval;
}

/* This function returns the index of the next free slot in the
   EventInfoArray. If EventCode is already in the list, it returns
   PAPI_ECNFLCT.
*/ static int get_free_EventCodeIndex( const EventSetInfo_t * ESI, unsigned int EventCode ) { int k; int lowslot = PAPI_ECNFLCT; int limit = EventInfoArrayLength( ESI ); /* Check for duplicate events and get the lowest empty slot */ for ( k = 0; k < limit; k++ ) { if ( ESI->EventInfoArray[k].event_code == EventCode ) return ( PAPI_ECNFLCT ); /*if ((ESI->EventInfoArray[k].event_code == PAPI_NULL) && (lowslot == PAPI_ECNFLCT)) */ if ( ESI->EventInfoArray[k].event_code == ( unsigned int ) PAPI_NULL ) { lowslot = k; break; } } return ( lowslot ); } /* This function returns the index of the EventCode or error */ /* Index to what? The index to everything stored EventCode in the */ /* EventSet. */ int _papi_hwi_lookup_EventCodeIndex( const EventSetInfo_t * ESI, unsigned int EventCode ) { int i; int limit = EventInfoArrayLength( ESI ); for ( i = 0; i < limit; i++ ) { if ( ESI->EventInfoArray[i].event_code == EventCode ) { return i; } } return PAPI_EINVAL; } /* This function only removes empty EventSets */ int _papi_hwi_remove_EventSet( EventSetInfo_t * ESI ) { DynamicArray_t *map = &_papi_hwi_system_info.global_eventset_map; int i; i = ESI->EventSetIndex; _papi_hwi_lock( INTERNAL_LOCK ); _papi_hwi_free_EventSet( ESI ); /* do bookkeeping for PAPI_EVENTSET_MAP */ map->dataSlotArray[i] = NULL; map->availSlots++; map->fullSlots--; _papi_hwi_unlock( INTERNAL_LOCK ); return PAPI_OK; } /* this function checks if an event is already in an EventSet Success, return ESI->NativeInfoArray[] index Fail, return PAPI_ENOEVNT; */ static int event_already_in_eventset( EventSetInfo_t * ESI, int papi_event ) { INTDBG( "ENTER: ESI: %p, papi_event: %#x\n", ESI, papi_event); int i; int nevt = _papi_hwi_eventcode_to_native(papi_event); /* to find the native event from the native events list */ for( i = 0; i < ESI->NativeCount; i++ ) { if ( nevt == ESI->NativeInfoArray[i].ni_event ) { // Also need to check papi event code if set because the same event with different masks // will generate the same 
libpfm4 event code (what was checked above). But there will be // different papi events created for it and they need to be handled separately. if (papi_event == ESI->NativeInfoArray[i].ni_papi_code) { INTDBG( "EXIT: event: %#x already mapped at index: %d\n", papi_event, i); return i; } } } INTDBG( "EXIT: PAPI_ENOEVNT\n"); return PAPI_ENOEVNT; } /* This function goes through the events in an EventSet's EventInfoArray */ /* And maps each event (whether native or part of a preset) to */ /* an event in the EventSets NativeInfoArray. */ /* We need to do this every time a native event is added to or removed */ /* from an eventset. */ /* It is also called after a update controlstate as the components are */ /* allowed to re-arrange the native events to fit hardware constraints. */ void _papi_hwi_map_events_to_native( EventSetInfo_t *ESI) { INTDBG("ENTER: ESI: %p, ESI->EventInfoArray: %p, ESI->NativeInfoArray: %p, ESI->NumberOfEvents: %d, ESI->NativeCount: %d\n", ESI, ESI->EventInfoArray, ESI->NativeInfoArray, ESI->NumberOfEvents, ESI->NativeCount); int i, event, k, n, preset_index = 0, nevt; int total_events = ESI->NumberOfEvents; event = 0; for( i = 0; i < total_events; i++ ) { /* find the first event that isn't PAPI_NULL */ /* Is this really necessary? --vmw */ while ( ESI->EventInfoArray[event].event_code == ( unsigned int ) PAPI_NULL ) { event++; } /* If it's a preset */ if ( IS_PRESET(ESI->EventInfoArray[event].event_code) ) { /* If it is a component preset, it will be in a separate array. 
*/ hwi_presets_t *_preset_ptr = get_preset((int)ESI->EventInfoArray[event].event_code); if( NULL == _preset_ptr ) { INTDBG("EXIT: preset not found\n"); return; } /* walk all sub-events in the preset */ for( k = 0; k < PAPI_EVENTS_IN_DERIVED_EVENT; k++ ) { nevt = _preset_ptr->code[k]; if ( nevt == PAPI_NULL ) { break; } INTDBG("Looking for subevent %#x\n",nevt); /* Match each sub-event to something in the Native List */ for( n = 0; n < ESI->NativeCount; n++ ) { if ( nevt == ESI->NativeInfoArray[n].ni_papi_code ) { INTDBG("Found papi event: %#x, &ESI->NativeInfoArray[%d]: %p, ni_event: %#x, ni_position %d\n", nevt, n, &(ESI->NativeInfoArray[n]), ESI->NativeInfoArray[n].ni_event, ESI->NativeInfoArray[n].ni_position); ESI->EventInfoArray[event].pos[k] = ESI->NativeInfoArray[n].ni_position; break; } } } } /* If it's a native event */ else if( IS_NATIVE(ESI->EventInfoArray[event].event_code) ) { nevt = ( int ) ESI->EventInfoArray[event].event_code; // get index into native info array for this event int nidx = event_already_in_eventset( ESI, nevt ); // if not found, then we need to return an error if (nidx == PAPI_ENOEVNT) { INTDBG("EXIT: needed event not found\n"); return; } ESI->EventInfoArray[event].pos[0] = ESI->NativeInfoArray[nidx].ni_position; INTDBG("nidx: %d, ni_position: %d\n", nidx, ESI->NativeInfoArray[nidx].ni_position); } /* If it's a user-defined event */ else if ( IS_USER_DEFINED(ESI->EventInfoArray[event].event_code) ) { preset_index = ( int ) ESI->EventInfoArray[event].event_code & PAPI_UE_AND_MASK; for ( k = 0; k < PAPI_EVENTS_IN_DERIVED_EVENT; k++ ) { nevt = user_defined_events[preset_index].code[k]; INTDBG("nevt: %#x, user_defined_events[%d].code[%d]: %#x, code[%d]: %#x\n", nevt, preset_index, k, user_defined_events[preset_index].code[k], k+1, user_defined_events[preset_index].code[k+1]); if ( nevt == PAPI_NULL ) break; /* Match each sub-event to something in the Native List */ for ( n = 0; n < ESI->NativeCount; n++ ) { // if this is the event we are 
looking for, set its position and exit inner loop to look for next sub-event if ( _papi_hwi_eventcode_to_native(nevt) == ESI->NativeInfoArray[n].ni_event ) { ESI->EventInfoArray[event].pos[k] = ESI->NativeInfoArray[n].ni_position; break; } } } } event++; } INTDBG("EXIT: \n"); return; } static int add_native_fail_clean( EventSetInfo_t *ESI, int nevt ) { INTDBG("ENTER: ESI: %p, nevt: %#x\n", ESI, nevt); int i, max_counters; int cidx; cidx = _papi_hwi_component_index( nevt ); if (cidx<0) return PAPI_ENOCMP; max_counters = _papi_hwd[cidx]->cmp_info.num_mpx_cntrs; /* to find the native event from the native events list */ for( i = 0; i < max_counters; i++ ) { // INTDBG("ESI->NativeInfoArray[%d]: %p, ni_event: %#x, ni_papi_event_code: %#x, ni_position: %d, ni_owners: %d\n", // i, &(ESI->NativeInfoArray[i]), ESI->NativeInfoArray[i].ni_event, ESI->NativeInfoArray[i].ni_papi_code, ESI->NativeInfoArray[i].ni_position, ESI->NativeInfoArray[i].ni_owners); if ( nevt == ESI->NativeInfoArray[i].ni_papi_code ) { ESI->NativeInfoArray[i].ni_owners--; /* to clean the entry in the nativeInfo array */ if ( ESI->NativeInfoArray[i].ni_owners == 0 ) { ESI->NativeInfoArray[i].ni_event = -1; ESI->NativeInfoArray[i].ni_position = -1; ESI->NativeInfoArray[i].ni_papi_code = -1; ESI->NativeCount--; } INTDBG( "EXIT: nevt: %#x, returned: %d\n", nevt, i); return i; } } INTDBG( "EXIT: returned: -1\n"); return -1; } /* since update_control_state trashes overflow settings, this puts things back into balance. 
*/
static int
update_overflow( EventSetInfo_t * ESI )
{
	int i, retval = PAPI_OK;

	if ( ESI->overflow.flags & PAPI_OVERFLOW_HARDWARE ) {
		for ( i = 0; i < ESI->overflow.event_counter; i++ ) {
			retval = _papi_hwd[ESI->CmpIdx]->set_overflow( ESI,
						ESI->overflow.EventIndex[i],
						ESI->overflow.threshold[i] );
			if ( retval != PAPI_OK ) {
				break;
			}
		}
	}
	return retval;
}

/* this function is called by _papi_hwi_add_event when adding native events
   ESI:  event set to add the events to
   nevt: pointer to array of native event table indexes to add
   size: number of native events to add
   out:  ???
   return: < 0 = error
             0 = no new events added
             1 = new events added
*/
static int
add_native_events( EventSetInfo_t *ESI, unsigned int *nevt,
				   int size, EventInfo_t *out )
{
	INTDBG( "ENTER: ESI: %p, nevt: %p, size: %d, out: %p\n",
			ESI, nevt, size, out );

	int nidx, i, j, added_events = 0;
	int retval, retval2;
	int max_counters;
	hwd_context_t *context;

	max_counters = _papi_hwd[ESI->CmpIdx]->cmp_info.num_mpx_cntrs;

	/* Walk through the list of native events, adding them */
	for ( i = 0; i < size; i++ ) {

		/* Check to see if event is already in EventSet */
		nidx = event_already_in_eventset( ESI, nevt[i] );
		if ( nidx >= 0 ) {
			/* Event is already there. Set position */
			out->pos[i] = ESI->NativeInfoArray[nidx].ni_position;
			ESI->NativeInfoArray[nidx].ni_owners++;
			continue;
		}

		/* Event wasn't already there */
		if ( ESI->NativeCount == max_counters ) {

			/* No more room in counters!
*/ for( j = 0; j < i; j++ ) { if ( ( nidx = add_native_fail_clean( ESI, nevt[j] ) ) >= 0 ) { out->pos[j] = -1; continue; } INTDBG( "should not happen!\n" ); } INTDBG( "EXIT: counters are full!\n" ); return PAPI_ECOUNT; } /* there is an empty slot for the native event; */ /* initialize the native index for the new added event */ INTDBG( "Adding nevt[%d]: %#x, ESI->NativeInfoArray[%d]: %p, Component: %d\n", i, nevt[i], ESI->NativeCount, &ESI->NativeInfoArray[ESI->NativeCount], ESI->CmpIdx ); ESI->NativeInfoArray[ESI->NativeCount].ni_event = _papi_hwi_eventcode_to_native(nevt[i]); ESI->NativeInfoArray[ESI->NativeCount].ni_papi_code = nevt[i]; ESI->NativeInfoArray[ESI->NativeCount].ni_owners = 1; ESI->NativeCount++; added_events++; } INTDBG("added_events: %d\n", added_events); /* if we added events we need to tell the component so it */ /* can add them too. */ if ( added_events ) { /* get the context we should use for this event set */ context = _papi_hwi_get_context( ESI, NULL ); if ( _papi_hwd[ESI->CmpIdx]->allocate_registers( ESI ) == PAPI_OK ) { retval = _papi_hwd[ESI->CmpIdx]->update_control_state( ESI->ctl_state, ESI->NativeInfoArray, ESI->NativeCount, context); if ( retval != PAPI_OK ) { clean: for( i = 0; i < size; i++ ) { if ( ( nidx = add_native_fail_clean( ESI, nevt[i] ) ) >= 0 ) { out->pos[i] = -1; continue; } INTDBG( "should not happen!\n" ); } /* re-establish the control state after the previous error */ retval2 = _papi_hwd[ESI->CmpIdx]->update_control_state( ESI->ctl_state, ESI->NativeInfoArray, ESI->NativeCount, context); if ( retval2 != PAPI_OK ) { PAPIERROR("update_control_state failed to re-establish working events!" 
); INTDBG( "EXIT: update_control_state returned: %d\n", retval2); return retval2; } INTDBG( "EXIT: update_control_state returned: %d\n", retval); return retval; } INTDBG( "EXIT: update_control_state returned: %d, we return: 1 (need remap)\n", retval); return 1; /* need remap */ } else { retval = PAPI_EMISC; goto clean; } } INTDBG( "EXIT: PAPI_OK\n"); return PAPI_OK; } int _papi_hwi_add_event( EventSetInfo_t * ESI, int EventCode ) { INTDBG("ENTER: ESI: %p (%d), EventCode: %#x\n", ESI, ESI->EventSetIndex, EventCode); int i, j, thisindex, remap, retval = PAPI_OK; int cidx; /* Sanity check the component */ cidx=_papi_hwi_component_index( EventCode ); if (cidx<0) { return PAPI_ENOCMP; } if (_papi_hwd[cidx]->cmp_info.disabled && _papi_hwd[cidx]->cmp_info.disabled != PAPI_EDELAY_INIT) { return PAPI_ECMP_DISABLED; } /* Sanity check that the new EventCode is from the same component */ /* as previous events. */ if ( ESI->CmpIdx < 0 ) { if ( ( retval = _papi_hwi_assign_eventset( ESI, cidx)) != PAPI_OK ) { INTDBG("EXIT: Error assigning eventset to component index %d\n", cidx); return retval; } } else { if ( ESI->CmpIdx != cidx ) { INTDBG("EXIT: Event is not valid for component index %d\n", cidx); return PAPI_EINVAL; } } /* Make sure the event is not present and get the next free slot. 
*/ thisindex = get_free_EventCodeIndex( ESI, ( unsigned int ) EventCode ); if ( thisindex < PAPI_OK ) { return thisindex; } INTDBG("Adding event to slot %d of EventSet %d\n",thisindex,ESI->EventSetIndex); /* If it is a software MPX EventSet, add it to the multiplex data structure */ /* and this thread's multiplex list */ if ( !_papi_hwi_is_sw_multiplex( ESI ) ) { /* Handle preset case */ if ( IS_PRESET(EventCode) ) { /* begin preset case */ int count; int preset_index = EventCode & ( int ) PAPI_PRESET_AND_MASK; /* Check if it's within the valid range */ if ( ( preset_index < 0 ) || ( preset_index >= num_all_presets ) ) { return PAPI_EINVAL; } hwi_presets_t *_preset_ptr = get_preset(EventCode); if( NULL == _preset_ptr ) { INTDBG("EXIT: preset not found\n"); return PAPI_ENOEVNT; } /* count the number of native events in this preset */ count = ( int ) _preset_ptr->count; /* Check if event exists */ if ( !count ) { return PAPI_ENOEVNT; } /* check if the native events have been used as overflow events */ /* this is not allowed */ if ( ESI->state & PAPI_OVERFLOWING ) { for( i = 0; i < count; i++ ) { for( j = 0; j < ESI->overflow.event_counter; j++ ) { if ( ESI->overflow.EventCode[j] ==(int) ( _preset_ptr->code[i] ) ) { return PAPI_ECNFLCT; } } } } /* Try to add the preset. 
*/ remap = add_native_events( ESI, _preset_ptr->code, count, &ESI->EventInfoArray[thisindex] ); if ( remap < 0 ) { return remap; } else { /* Fill in the EventCode (machine independent) information */ ESI->EventInfoArray[thisindex].event_code = ( unsigned int ) EventCode; ESI->EventInfoArray[thisindex].derived = _preset_ptr->derived_int; ESI->EventInfoArray[thisindex].ops = _preset_ptr->postfix; ESI->NumberOfEvents++; _papi_hwi_map_events_to_native( ESI ); } } /* Handle adding Native events */ else if ( IS_NATIVE(EventCode) ) { /* Check if native event exists */ if ( _papi_hwi_query_native_event( ( unsigned int ) EventCode ) != PAPI_OK ) { return PAPI_ENOEVNT; } /* check if the native events have been used as overflow events */ /* This is not allowed */ if ( ESI->state & PAPI_OVERFLOWING ) { for( j = 0; j < ESI->overflow.event_counter; j++ ) { if ( EventCode == ESI->overflow.EventCode[j] ) { return PAPI_ECNFLCT; } } } /* Try to add the native event. */ remap = add_native_events( ESI, (unsigned int *)&EventCode, 1, &ESI->EventInfoArray[thisindex] ); if ( remap < 0 ) { return remap; } else { /* Fill in the EventCode (machine independent) information */ ESI->EventInfoArray[thisindex].event_code = ( unsigned int ) EventCode; ESI->NumberOfEvents++; _papi_hwi_map_events_to_native( ESI ); } } else if ( IS_USER_DEFINED( EventCode ) ) { int count; int index = EventCode & PAPI_UE_AND_MASK; if ( index < 0 || index >= user_defined_events_count ) return ( PAPI_EINVAL ); count = ( int ) user_defined_events[index].count; for ( i = 0; i < count; i++ ) { for ( j = 0; j < ESI->overflow.event_counter; j++ ) { if ( ESI->overflow.EventCode[j] == (int)(user_defined_events[index].code[i]) ) { return ( PAPI_EBUG ); } } } remap = add_native_events( ESI, user_defined_events[index].code, count, &ESI->EventInfoArray[thisindex] ); if ( remap < 0 ) { return remap; } else { ESI->EventInfoArray[thisindex].event_code = (unsigned int) EventCode; ESI->EventInfoArray[thisindex].derived = 
user_defined_events[index].derived_int; ESI->EventInfoArray[thisindex].ops = user_defined_events[index].postfix; ESI->NumberOfEvents++; _papi_hwi_map_events_to_native( ESI ); } } else { /* not Native, Preset, or User events */ return PAPI_EBUG; } } else { /* Multiplexing is special. See multiplex.c */ retval = mpx_add_event( &ESI->multiplex.mpx_evset, EventCode, ESI->domain.domain, ESI->granularity.granularity ); if ( retval < PAPI_OK ) { return retval; } /* Relevant (???) */ ESI->EventInfoArray[thisindex].event_code = ( unsigned int ) EventCode; ESI->EventInfoArray[thisindex].derived = NOT_DERIVED; ESI->NumberOfEvents++; /* event is in the EventInfoArray but not mapped to the NativeEvents */ /* this causes issues if you try to set overflow on the event. */ /* in theory this wouldn't matter anyway. */ } /* reinstate the overflows if any */ retval=update_overflow( ESI ); return retval; } static int remove_native_events( EventSetInfo_t *ESI, int *nevt, int size ) { INTDBG( "Entry: ESI: %p, nevt: %p, size: %d\n", ESI, nevt, size); NativeInfo_t *native = ESI->NativeInfoArray; hwd_context_t *context; int i, j, zero = 0, retval; /* Remove the references to this event from the native events: for all the metrics in this event, compare to each native event in this event set, and decrement owners if they match */ for( i = 0; i < size; i++ ) { int cevt = _papi_hwi_eventcode_to_native(nevt[i]); // INTDBG( "nevt[%d]: %#x, cevt: %#x\n", i, nevt[i], cevt); for( j = 0; j < ESI->NativeCount; j++ ) { if ((native[j].ni_event == cevt) && (native[j].ni_papi_code == nevt[i]) ) { // INTDBG( "native[%d]: %p, ni_papi_code: %#x, ni_event: %#x, ni_position: %d, ni_owners: %d\n", // j, &(native[j]), native[j].ni_papi_code, native[j].ni_event, native[j].ni_position, native[j].ni_owners); native[j].ni_owners--; if ( native[j].ni_owners == 0 ) { zero++; } break; } } } /* Remove any native events from the array if owners dropped to zero. 
The NativeInfoArray must be dense, with no empty slots, so if we remove an element, we must compact the list */ for( i = 0; i < ESI->NativeCount; i++ ) { if ( native[i].ni_event == -1 ) continue; if ( native[i].ni_owners == 0 ) { int copy = 0; int sz = _papi_hwd[ESI->CmpIdx]->size.reg_value; for( j = ESI->NativeCount - 1; j > i; j-- ) { if ( native[j].ni_event == -1 || native[j].ni_owners == 0 ) continue; else { /* copy j into i */ native[i].ni_event = native[j].ni_event; native[i].ni_position = native[j].ni_position; native[i].ni_owners = native[j].ni_owners; /* copy opaque [j].ni_bits to [i].ni_bits */ memcpy( native[i].ni_bits, native[j].ni_bits, ( size_t ) sz ); /* reset j to initialized state */ native[j].ni_event = -1; native[j].ni_position = -1; native[j].ni_owners = 0; copy++; break; } } if ( copy == 0 ) { /* set this structure back to empty state */ /* ni_owners is already 0 and contents of ni_bits doesn't matter */ native[i].ni_event = -1; native[i].ni_position = -1; } } } INTDBG( "ESI->NativeCount: %d, zero: %d\n", ESI->NativeCount, zero); /* to reset hwd_control_state values */ ESI->NativeCount -= zero; /* If we removed any elements, clear the now empty slots, reinitialize the index, and update the count. Then send the info down to the component to update the hwd control structure. 
*/ retval = PAPI_OK; if ( zero ) { /* get the context we should use for this event set */ context = _papi_hwi_get_context( ESI, NULL ); retval = _papi_hwd[ESI->CmpIdx]->update_control_state( ESI->ctl_state, native, ESI->NativeCount, context); if ( retval == PAPI_OK ) retval = update_overflow( ESI ); } return ( retval ); } int _papi_hwi_remove_event( EventSetInfo_t * ESI, int EventCode ) { int j = 0, retval, thisindex; EventInfo_t *array; thisindex = _papi_hwi_lookup_EventCodeIndex( ESI, ( unsigned int ) EventCode ); if ( thisindex < PAPI_OK ) return ( thisindex ); /* If it is a MPX EventSet, remove it from the multiplex data structure and this threads multiplex list */ if ( _papi_hwi_is_sw_multiplex( ESI ) ) { retval = mpx_remove_event( &ESI->multiplex.mpx_evset, EventCode ); if ( retval < PAPI_OK ) return ( retval ); } else /* Remove the events hardware dependent stuff from the EventSet */ { if ( IS_PRESET(EventCode) ) { int preset_index = EventCode & PAPI_PRESET_AND_MASK; /* Check if it's within the valid range */ if ( ( preset_index < 0 ) || ( preset_index >= PAPI_MAX_PRESET_EVENTS ) ) return PAPI_EINVAL; /* Check if event exists */ if ( !_papi_hwi_presets[preset_index].count ) return PAPI_ENOEVNT; /* Remove the preset event. */ for ( j = 0; _papi_hwi_presets[preset_index].code[j] != (unsigned int)PAPI_NULL; j++ ); retval = remove_native_events( ESI, ( int * )_papi_hwi_presets[preset_index].code, j ); if ( retval != PAPI_OK ) return ( retval ); } else if ( IS_NATIVE(EventCode) ) { /* Check if native event exists */ if ( _papi_hwi_query_native_event( ( unsigned int ) EventCode ) != PAPI_OK ) return PAPI_ENOEVNT; /* Remove the native event. 
*/ retval = remove_native_events( ESI, &EventCode, 1 ); if ( retval != PAPI_OK ) return ( retval ); } else if ( IS_USER_DEFINED( EventCode ) ) { int index = EventCode & PAPI_UE_AND_MASK; if ( (index < 0) || (index >= user_defined_events_count) ) return ( PAPI_EINVAL ); for( j = 0; j < PAPI_EVENTS_IN_DERIVED_EVENT && user_defined_events[index].code[j] != 0; j++ ) { retval = remove_native_events( ESI, ( int * )user_defined_events[index].code, j); if ( retval != PAPI_OK ) return ( retval ); } } else return ( PAPI_ENOEVNT ); } array = ESI->EventInfoArray; /* Compact the Event Info Array list if it's not the last event */ /* clear the newly empty slot in the array */ for ( ; thisindex < ESI->NumberOfEvents - 1; thisindex++ ) array[thisindex] = array[thisindex + 1]; array[thisindex].event_code = ( unsigned int ) PAPI_NULL; for ( j = 0; j < PAPI_EVENTS_IN_DERIVED_EVENT; j++ ) array[thisindex].pos[j] = PAPI_NULL; array[thisindex].ops = NULL; array[thisindex].derived = NOT_DERIVED; ESI->NumberOfEvents--; return ( PAPI_OK ); } int _papi_hwi_read( hwd_context_t * context, EventSetInfo_t * ESI, long long *values ) { INTDBG("ENTER: context: %p, ESI: %p, values: %p\n", context, ESI, values); int retval; long long *dp = NULL; int i, index; retval = _papi_hwd[ESI->CmpIdx]->read( context, ESI->ctl_state, &dp, ESI->state ); if ( retval != PAPI_OK ) { INTDBG("EXIT: retval: %d\n", retval); return retval; } /* This routine distributes hardware counters to software counters in the order that they were added. Note that the higher level EventInfoArray[i] entries may not be contiguous because the user has the right to remove an event. But if we do compaction after remove event, this function can be changed. 
	for ( i = 0; i != ESI->NumberOfEvents; i++ ) {
		index = ESI->EventInfoArray[i].pos[0];
		if ( index == -1 )
			continue;

		INTDBG( "ESI->EventInfoArray: %p, pos[%d]: %d, dp[%d]: %lld, derived[%d]: %#x\n",
				ESI->EventInfoArray, i, index, index, dp[index], i,
				ESI->EventInfoArray[i].derived );

		/* If this is not a derived event */
		if ( ESI->EventInfoArray[i].derived == NOT_DERIVED ) {
			values[i] = dp[index];
			INTDBG( "value: %#llx\n", values[i] );
		} else {	/* If this is a derived event */
			values[i] = handle_derived( &ESI->EventInfoArray[i], dp );
#ifdef DEBUG
			if ( values[i] < ( long long ) 0 ) {
				INTDBG( "Derived Event is negative!!: %lld\n", values[i] );
			}
			INTDBG( "derived value: %#llx \n", values[i] );
#endif
		}
	}

	INTDBG( "EXIT: PAPI_OK\n" );
	return PAPI_OK;
}

int
_papi_hwi_cleanup_eventset( EventSetInfo_t * ESI )
{
	int i, j, num_cntrs, retval;
	hwd_context_t *context;
	int EventCode;
	NativeInfo_t *native;

	if ( !_papi_hwi_invalid_cmp( ESI->CmpIdx ) ) {
		num_cntrs = _papi_hwd[ESI->CmpIdx]->cmp_info.num_mpx_cntrs;

		for ( i = 0; i < num_cntrs; i++ ) {
			EventCode = ( int ) ESI->EventInfoArray[i].event_code;

			/* skip if event not there */
			if ( EventCode == PAPI_NULL )
				continue;

			/* If it is a MPX EventSet, remove it from the multiplex */
			/* data structure and this thread's multiplex list       */
			if ( _papi_hwi_is_sw_multiplex( ESI ) ) {
				retval = mpx_remove_event( &ESI->multiplex.mpx_evset,
										   EventCode );
				if ( retval < PAPI_OK )
					return retval;
			} else {
				native = ESI->NativeInfoArray;

				/* clear out ESI->NativeInfoArray */
				/* do we really need to do this, seeing as we free() it later? */
				for ( j = 0; j < ESI->NativeCount; j++ ) {
					native[j].ni_event = -1;
					native[j].ni_position = -1;
					native[j].ni_owners = 0;
					/* native[j].ni_bits?? */
				}
			}

			/* do we really need to do this, seeing as we free() it later?
*/ ESI->EventInfoArray[i].event_code= ( unsigned int ) PAPI_NULL; for( j = 0; j < PAPI_EVENTS_IN_DERIVED_EVENT; j++ ) { ESI->EventInfoArray[i].pos[j] = PAPI_NULL; } ESI->EventInfoArray[i].ops = NULL; ESI->EventInfoArray[i].derived = NOT_DERIVED; } context = _papi_hwi_get_context( ESI, NULL ); /* calling with count of 0 equals a close? */ retval = _papi_hwd[ESI->CmpIdx]->update_control_state( ESI->ctl_state, NULL, 0, context); if (retval!=PAPI_OK) { return retval; } } ESI->CmpIdx = -1; ESI->NumberOfEvents = 0; ESI->NativeCount = 0; if ( ( ESI->state & PAPI_MULTIPLEXING ) && ESI->multiplex.mpx_evset ) papi_free( ESI->multiplex.mpx_evset ); if ( ( ESI->state & PAPI_CPU_ATTACH ) && ESI->CpuInfo ) _papi_hwi_shutdown_cpu( ESI->CpuInfo ); if ( ESI->ctl_state ) papi_free( ESI->ctl_state ); if ( ESI->sw_stop ) papi_free( ESI->sw_stop ); if ( ESI->hw_start ) papi_free( ESI->hw_start ); if ( ESI->EventInfoArray ) papi_free( ESI->EventInfoArray ); if ( ESI->NativeInfoArray ) papi_free( ESI->NativeInfoArray ); if ( ESI->NativeBits ) papi_free( ESI->NativeBits ); if ( ESI->overflow.deadline ) papi_free( ESI->overflow.deadline ); if ( ESI->profile.prof ) papi_free( ESI->profile.prof ); ESI->ctl_state = NULL; ESI->sw_stop = NULL; ESI->hw_start = NULL; ESI->EventInfoArray = NULL; ESI->NativeInfoArray = NULL; ESI->NativeBits = NULL; memset( &ESI->domain, 0x0, sizeof(EventSetDomainInfo_t) ); memset( &ESI->granularity, 0x0, sizeof(EventSetGranularityInfo_t) ); memset( &ESI->overflow, 0x0, sizeof(EventSetOverflowInfo_t) ); memset( &ESI->multiplex, 0x0, sizeof(EventSetMultiplexInfo_t) ); memset( &ESI->attach, 0x0, sizeof(EventSetAttachInfo_t) ); memset( &ESI->cpu, 0x0, sizeof(EventSetCpuInfo_t) ); memset( &ESI->profile, 0x0, sizeof(EventSetProfileInfo_t) ); memset( &ESI->inherit, 0x0, sizeof(EventSetInheritInfo_t) ); ESI->CpuInfo = NULL; return PAPI_OK; } int _papi_hwi_convert_eventset_to_multiplex( _papi_int_multiplex_t * mpx ) { int retval, i, j = 0, *mpxlist = NULL; EventSetInfo_t 
*ESI = mpx->ESI; int flags = mpx->flags; /* If there are any events in the EventSet, convert them to multiplex events */ if ( ESI->NumberOfEvents ) { mpxlist = ( int * ) papi_malloc( sizeof ( int ) * ( size_t ) ESI->NumberOfEvents ); if ( mpxlist == NULL ) return ( PAPI_ENOMEM ); /* Build the args to MPX_add_events(). */ /* Remember the EventInfoArray can be sparse and the data can be non-contiguous */ for ( i = 0; i < EventInfoArrayLength( ESI ); i++ ) if ( ESI->EventInfoArray[i].event_code != ( unsigned int ) PAPI_NULL ) mpxlist[j++] = ( int ) ESI->EventInfoArray[i].event_code; /* Resize the EventInfo_t array */ if ( ( _papi_hwd[ESI->CmpIdx]->cmp_info.kernel_multiplex == 0 ) || ( ( _papi_hwd[ESI->CmpIdx]->cmp_info.kernel_multiplex ) && ( flags & PAPI_MULTIPLEX_FORCE_SW ) ) ) { retval = MPX_add_events( &ESI->multiplex.mpx_evset, mpxlist, j, ESI->domain.domain, ESI->granularity.granularity ); if ( retval != PAPI_OK ) { papi_free( mpxlist ); return ( retval ); } } papi_free( mpxlist ); } /* Update the state before initialization! */ ESI->state |= PAPI_MULTIPLEXING; if ( _papi_hwd[ESI->CmpIdx]->cmp_info.kernel_multiplex && ( flags & PAPI_MULTIPLEX_FORCE_SW ) ) ESI->multiplex.flags = PAPI_MULTIPLEX_FORCE_SW; ESI->multiplex.ns = ( int ) mpx->ns; return ( PAPI_OK ); } #include "components_config.h" int papi_num_components = ( sizeof ( _papi_hwd ) / sizeof ( *_papi_hwd ) ) - 1; /* * Routine that initializes all available components. * A component is available if a pointer to its info vector * appears in the NULL terminated_papi_hwd table. * Modified to accept an arg: 0=do not init perf_event or * perf_event_uncore. 1=init ONLY perf_event or perf_event_uncore. 
*/ int _papi_hwi_init_global( int PE_OR_PEU ) { int retval, is_pe_peu, i = 0; retval = _papi_hwi_innoculate_os_vector( &_papi_os_vector ); if ( retval != PAPI_OK ) { return retval; } while ( _papi_hwd[i] ) { is_pe_peu = 0; if (strcmp(_papi_hwd[i]->cmp_info.name, "perf_event") == 0) is_pe_peu=1; if (strcmp(_papi_hwd[i]->cmp_info.name, "perf_event_uncore") == 0) is_pe_peu=1; retval = _papi_hwi_innoculate_vector( _papi_hwd[i] ); if ( retval != PAPI_OK ) { return retval; } /* We can be disabled by user before init */ if (!_papi_hwd[i]->cmp_info.disabled && (PE_OR_PEU == is_pe_peu)) { retval = _papi_hwd[i]->init_component( i ); /* Do some sanity checking */ if (retval==PAPI_OK) { if (_papi_hwd[i]->cmp_info.num_cntrs > _papi_hwd[i]->cmp_info.num_mpx_cntrs) { fprintf(stderr,"Warning! num_cntrs %d is more than num_mpx_cntrs %d for component %s\n", _papi_hwd[i]->cmp_info.num_cntrs, _papi_hwd[i]->cmp_info.num_mpx_cntrs, _papi_hwd[i]->cmp_info.name); } } } i++; } return PAPI_OK; } /* * Routine that initializes the presets for all components other * than perf_event. Ignore perf_event component. */ int _papi_hwi_init_global_presets( void ) { int retval = PAPI_OK, is_pe, i = 0; /* Determine whether or not perf_event is available. */ while ( _papi_hwd[i] ) { if (strcmp(_papi_hwd[i]->cmp_info.name, "perf_event") == 0) { pe_disabled = 0; break; } i++; } if( pe_disabled ) { num_all_presets = PAPI_MAX_PRESET_EVENTS; } i = 0; while ( _papi_hwd[i] ) { is_pe = 0; if (strcmp(_papi_hwd[i]->cmp_info.name, "perf_event") == 0) { is_pe = 1; } else { /* Only set the first non-perf_event component with presets once. 
*/
			if ( -1 == first_comp_with_presets &&
				 _papi_hwi_max_presets[i] > 0 ) {
				first_comp_with_presets = i;
			}
		}
		_papi_hwi_start_idx[i] = num_all_presets;
		num_all_presets += _papi_hwi_max_presets[i];
		i++;
	}

	return retval;
}

/* Machine info struct initialization using defaults */
/* See _papi_mdi definition in papi_internal.h       */
int
_papi_hwi_init_global_internal( void )
{
	int retval;

	memset( &_papi_hwi_system_info, 0x0, sizeof ( _papi_hwi_system_info ) );
	memset( _papi_hwi_using_signal, 0x0, sizeof ( _papi_hwi_using_signal ) );

	/* Global struct to maintain EventSet mapping */
	retval = allocate_eventset_map( &_papi_hwi_system_info.global_eventset_map );
	if ( retval != PAPI_OK ) {
		return retval;
	}

	_papi_hwi_system_info.pid = 0;	/* Process identifier */

	/* PAPI_hw_info_t struct */
	memset( &( _papi_hwi_system_info.hw_info ), 0x0, sizeof ( PAPI_hw_info_t ) );

	return PAPI_OK;
}

void
_papi_hwi_shutdown_global_internal( void )
{
	int i = 0;

	_papi_hwi_cleanup_all_presets( );
	_papi_hwi_cleanup_errors( );

	_papi_hwi_lock( INTERNAL_LOCK );

	for ( i = 0; i < num_native_events; i++ ) {
		free( _papi_native_events[i].evt_name );
	}
	free( _papi_native_events );
	_papi_native_events = NULL;	// In case a new library init is done.
	num_native_events = 0;		// ..
	num_native_chunks = 0;		// ..

	_papi_hwi_free_papi_event_string();

	papi_free( _papi_hwi_system_info.global_eventset_map.dataSlotArray );
	memset( &_papi_hwi_system_info.global_eventset_map, 0x00,
			sizeof ( DynamicArray_t ) );

	_papi_hwi_unlock( INTERNAL_LOCK );

	if ( _papi_hwi_system_info.shlib_info.map ) {
		papi_free( _papi_hwi_system_info.shlib_info.map );
	}

	memset( &_papi_hwi_system_info, 0x0, sizeof ( _papi_hwi_system_info ) );
}

void
_papi_hwi_dummy_handler( int EventSet, void *address,
						 long long overflow_vector, void *context )
{
	/* This function is not used and shouldn't be called.
*/ ( void ) EventSet; /*unused */ ( void ) address; /*unused */ ( void ) overflow_vector; /*unused */ ( void ) context; /*unused */ return; } static long long handle_derived_add( int *position, long long *from ) { int pos, i; long long retval = 0; i = 0; while ( i < PAPI_EVENTS_IN_DERIVED_EVENT ) { pos = position[i++]; if ( pos == PAPI_NULL ) break; INTDBG( "Compound event, adding %lld to %lld\n", from[pos], retval ); retval += from[pos]; } return ( retval ); } static long long handle_derived_subtract( int *position, long long *from ) { int pos, i; long long retval = from[position[0]]; i = 1; while ( i < PAPI_EVENTS_IN_DERIVED_EVENT ) { pos = position[i++]; if ( pos == PAPI_NULL ) break; INTDBG( "Compound event, subtracting pos=%d %lld from %lld\n", pos, from[pos], retval ); retval -= from[pos]; } return ( retval ); } static long long units_per_second( long long units, long long cycles ) { return ( ( units * (long long) _papi_hwi_system_info.hw_info.cpu_max_mhz * (long long) 1000000 ) / cycles ); } static long long handle_derived_ps( int *position, long long *from ) { return ( units_per_second( from[position[1]], from[position[0]] ) ); } static long long handle_derived_add_ps( int *position, long long *from ) { long long tmp = handle_derived_add( position + 1, from ); return ( units_per_second( tmp, from[position[0]] ) ); } /* this function implement postfix calculation, it reads in a string where I use: | as delimiter N2 indicate No. 
2 native event in the derived preset +, -, *, / as operator # as MHZ(million hz) got from _papi_hwi_system_info.hw_info.cpu_max_mhz*1000000.0 (e.g. "N0|N1|-|" computes native event 0 minus native event 1) Haihang (you@cs.utk.edu) */ static long long _papi_hwi_postfix_calc( EventInfo_t * evi, long long *hw_counter ) { char *point = evi->ops, operand[16]; double stack[PAPI_EVENTS_IN_DERIVED_EVENT]; int i, val, top = 0; INTDBG("ENTER: evi: %p, evi->ops: %p (%s), evi->pos[0]: %d, evi->pos[1]: %d, hw_counter: %p (%lld %lld)\n", evi, evi->ops, evi->ops, evi->pos[0], evi->pos[1], hw_counter, hw_counter[0], hw_counter[1]); memset(&stack,0,PAPI_EVENTS_IN_DERIVED_EVENT*sizeof(double)); while ( *point != '\0' ) { if ( *point == '|' ) { /* consume '|' characters */ point++; } else if ( *point == 'N' ) { /* to get count for each native event */ point++; i = 0; while ( isdigit(*point) ) { assert(i<16); operand[i] = *point; point++; i++; } assert(0<i && i<16); operand[i] = '\0'; val = atoi( operand ); assert( top < PAPI_EVENTS_IN_DERIVED_EVENT ); stack[top] = ( double ) hw_counter[evi->pos[val]]; top++; } else if ( *point == '#' ) { /* to get mhz */ point++; assert( top < PAPI_EVENTS_IN_DERIVED_EVENT ); stack[top] = _papi_hwi_system_info.hw_info.cpu_max_mhz * 1000000.0; top++; } else if ( isdigit( *point ) ) { i = 0; while ( isdigit(*point) ) { assert(i<16); operand[i] = *point; point++; i++; } assert(0<i && i<16); operand[i] = '\0'; assert( top < PAPI_EVENTS_IN_DERIVED_EVENT ); stack[top] = atof( operand ); top++; } else if ( *point == '+' ) { /* + calculation */ point++; assert(top >= 2); stack[top - 2] += stack[top - 1]; top--; } else if ( *point == '-' ) { /* - calculation */ point++; assert(top >= 2); stack[top - 2] -= stack[top - 1]; top--; } else if ( *point == '*' ) { /* * calculation */ point++; assert(top >= 2); stack[top - 2] *= stack[top - 1]; top--; } else if ( *point == '/' ) { /* / calculation */ point++; assert(top >= 2); /* FIXME should handle runtime divide by zero */ stack[top - 2] /= stack[top - 1]; top--; } else { /* flag an error parsing the preset */ PAPIERROR( "BUG!
Unable to parse \"%s\"", evi->ops ); return ( long long ) stack[0]; } } assert(top == 1); INTDBG("EXIT: stack[0]: %lld\n", (long long)stack[0]); return ( long long ) stack[0]; } static long long handle_derived( EventInfo_t * evi, long long *from ) { INTDBG("ENTER: evi: %p, evi->derived: %d, from: %p\n", evi, evi->derived, from); switch ( evi->derived ) { case DERIVED_ADD: return ( handle_derived_add( evi->pos, from ) ); case DERIVED_ADD_PS: return ( handle_derived_add_ps( evi->pos, from ) ); case DERIVED_SUB: return ( handle_derived_subtract( evi->pos, from ) ); case DERIVED_PS: return ( handle_derived_ps( evi->pos, from ) ); case DERIVED_POSTFIX: return ( _papi_hwi_postfix_calc( evi, from ) ); case DERIVED_CMPD: /* This type has existed for a long time, but was never implemented. Probably because its a no-op. However, if it's in a header, it should be supported. As I found out when I implemented it in Pentium 4 for testing...dkt */ return ( from[evi->pos[0]] ); default: PAPIERROR( "BUG! Unknown derived command %d, returning 0", evi->derived ); INTDBG("EXIT: Unknown derived command %d\n", evi->derived); return ( ( long long ) 0 ); } } /* table matching derived types to derived strings. 
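The _papi_hwi_postfix_calc routine above evaluates a '|'-delimited postfix expression over the raw counters. A simplified standalone re-implementation can illustrate the algorithm (hypothetical, not PAPI's own code: it handles `N<idx>` tokens, integer literals, and the four operators, indexes counters directly rather than through evi->pos, and omits the '#' MHz token):

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define STACK_MAX 8   /* stands in for PAPI_EVENTS_IN_DERIVED_EVENT */

/* Evaluate e.g. "N0|N1|-|" over counters[]: push N<idx> or a numeric
 * literal, apply an operator to the top two stack entries; '|' merely
 * separates tokens. */
static long long postfix_eval(const char *expr, const long long *counters)
{
    double stack[STACK_MAX];
    int top = 0;
    const char *p = expr;

    while (*p != '\0') {
        if (*p == '|') {
            p++;                                   /* token separator */
        } else if (*p == 'N' || isdigit((unsigned char)*p)) {
            int is_native = (*p == 'N');
            char operand[16];
            int i = 0;
            if (is_native)
                p++;                               /* skip the 'N' */
            while (isdigit((unsigned char)*p) && i < 15)
                operand[i++] = *p++;
            operand[i] = '\0';
            assert(top < STACK_MAX);
            stack[top++] = is_native
                ? (double)counters[atoi(operand)]  /* counter value */
                : atof(operand);                   /* numeric literal */
        } else {                                   /* operator */
            char op = *p++;
            assert(top >= 2);
            switch (op) {
            case '+': stack[top - 2] += stack[top - 1]; break;
            case '-': stack[top - 2] -= stack[top - 1]; break;
            case '*': stack[top - 2] *= stack[top - 1]; break;
            case '/': stack[top - 2] /= stack[top - 1]; break;
            }
            top--;
        }
    }
    assert(top == 1);
    return (long long)stack[0];
}
```

With counters {1000, 400}, the string "N0|N1|-|" evaluates to 600 and "N0|2|*|" to 2000.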
used by get_info, encode_event, xml translator */ static const hwi_describe_t _papi_hwi_derived[] = { {NOT_DERIVED, "NOT_DERIVED", "Do nothing"}, {DERIVED_ADD, "DERIVED_ADD", "Add counters"}, {DERIVED_PS, "DERIVED_PS", "Divide by the cycle counter and convert to seconds"}, {DERIVED_ADD_PS, "DERIVED_ADD_PS", "Add 2 counters then divide by the cycle counter and xl8 to secs."}, {DERIVED_CMPD, "DERIVED_CMPD", "Event lives in first counter but takes 2 or more codes"}, {DERIVED_SUB, "DERIVED_SUB", "Sub all counters from first counter"}, {DERIVED_POSTFIX, "DERIVED_POSTFIX", "Process counters based on specified postfix string"}, {DERIVED_INFIX, "DERIVED_INFIX", "Process counters based on specified infix string"}, {-1, NULL, NULL} }; /* _papi_hwi_derived_type: Helper routine to extract a derived type from a derived string sets *code to the type value and returns PAPI_OK if found, otherwise returns PAPI_EINVAL */ int _papi_hwi_derived_type( char *tmp, int *code ) { int i = 0; while ( _papi_hwi_derived[i].name != NULL ) { if ( strcasecmp( tmp, _papi_hwi_derived[i].name ) == 0 ) { *code = _papi_hwi_derived[i].value; return PAPI_OK; } i++; } INTDBG( "Invalid derived string %s\n", tmp ); return PAPI_EINVAL; } /* _papi_hwi_derived_string: Helper routine to extract a derived string from a derived type copies derived type string into derived if found, otherwise returns PAPI_EINVAL */ static int _papi_hwi_derived_string( int type, char *derived, int len ) { int j; for ( j = 0; _papi_hwi_derived[j].value != -1; j++ ) { if ( _papi_hwi_derived[j].value == type ) { strncpy( derived, _papi_hwi_derived[j].name, ( size_t ) len ); return PAPI_OK; } } INTDBG( "Invalid derived type %d\n", type ); return PAPI_EINVAL; } /* _papi_hwi_get_preset_event_info: Assumes EventCode contains a valid preset code. But defensive programming says check for NULL pointers. Returns a filled in PAPI_event_info_t structure containing descriptive strings and values for the specified preset event.
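The body below zeroes the whole info structure up front and then copies each string with strncpy bounded to sizeof(field)-1, which guarantees NUL-terminated results even when a source string is longer than the field. The pattern in isolation (hypothetical field size; PAPI's real fields use PAPI_MAX_STR_LEN and friends):

```c
#include <assert.h>
#include <string.h>

#define SYM_LEN 8   /* hypothetical field size for illustration */

struct info { char symbol[SYM_LEN]; };

/* Zero the struct, then copy at most SYM_LEN-1 bytes: the final byte
 * remains '\0' no matter how long src is, so the field is always a
 * valid (possibly truncated) C string. */
static void set_symbol(struct info *dst, const char *src)
{
    memset(dst, 0, sizeof(*dst));
    strncpy(dst->symbol, src, sizeof(dst->symbol) - 1);
}
```

A 12-character source such as "PAPI_TOT_CYC" is silently truncated to "PAPI_TO" with an 8-byte field, while shorter names copy through unchanged; without the memset-plus-size-minus-one discipline, plain strncpy can leave an unterminated buffer.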
*/ int _papi_hwi_get_preset_event_info( int EventCode, PAPI_event_info_t * info ) { INTDBG("ENTER: EventCode: %#x, info: %p\n", EventCode, info); unsigned int j; hwi_presets_t *_preset_ptr = get_preset(EventCode); if( NULL == _preset_ptr ) { INTDBG("EXIT: preset not found\n"); return PAPI_ENOEVNT; } if ( _preset_ptr->symbol ) { /* if the event is in the preset table */ // since we are setting the whole structure to zero the strncpy calls below will // be leaving NULL terminates strings as long as they copy 1 less byte than the // buffer size of the field. INTDBG("ENTER: Configuring: %s\n", _preset_ptr->symbol); memset( info, 0, sizeof ( PAPI_event_info_t ) ); /* set up eventcode and name */ info->event_code = ( unsigned int ) EventCode; strncpy( info->symbol, _preset_ptr->symbol, sizeof(info->symbol)-1); /* set up short description, if available */ if ( _preset_ptr->short_descr != NULL ) { strncpy( info->short_descr, _preset_ptr->short_descr, sizeof ( info->short_descr )-1 ); } /* set up long description, if available */ if ( _preset_ptr->long_descr != NULL ) { strncpy( info->long_descr, _preset_ptr->long_descr, sizeof ( info->long_descr )-1 ); } info->event_type = _preset_ptr->event_type; info->count = _preset_ptr->count; /* set up if derived event */ _papi_hwi_derived_string( _preset_ptr->derived_int, info->derived, sizeof ( info->derived )-1 ); if ( _preset_ptr->postfix != NULL ) { strncpy( info->postfix, _preset_ptr->postfix, sizeof ( info->postfix )-1 ); } for(j=0;j < info->count; j++) { /* make sure the name exists before trying to copy it */ /* that can happen if an event is in the definition in */ /* papi_events.csv but the event is unsupported on the cpu */ /* ideally that should never happen, but also ideally */ /* we wouldn't segfault if it does */ if (_preset_ptr->name[j]==NULL) { INTDBG("ERROR in event definition of %s\n", _preset_ptr->symbol); return PAPI_ENOEVNT; } else { info->code[j]=_preset_ptr->code[j]; strncpy(info->name[j], 
_preset_ptr->name[j], sizeof(info->name[j])-1); } } if ( _preset_ptr->note != NULL ) { strncpy( info->note, _preset_ptr->note, sizeof ( info->note )-1 ); } /* Copy the qualifiers and their associated descriptions into * the info struct. */ int k; for( k = 0; k < _preset_ptr->num_quals; ++k ) { strncpy( info->quals[k], _preset_ptr->quals[k], sizeof ( info->quals[k] )-1 ); strncpy( info->quals_descrs[k], _preset_ptr->quals_descrs[k], sizeof ( info->quals_descrs[k] )-1 ); } info->num_quals = _preset_ptr->num_quals; info->component_index = _preset_ptr->component_index; return PAPI_OK; } else { return PAPI_ENOEVNT; } } /* _papi_hwi_get_user_event_info: Assumes EventCode contains a valid user event code. But defensive programming says check for NULL pointers. Returns a filled in PAPI_event_info_t structure containing descriptive strings and values for the specified preset event. */ int _papi_hwi_get_user_event_info( int EventCode, PAPI_event_info_t * info ) { INTDBG("ENTER: EventCode: %#x, info: %p\n", EventCode, info); unsigned int i = EventCode & PAPI_UE_AND_MASK; unsigned int j; // if event code not in valid range, return error if (i >= PAPI_MAX_USER_EVENTS) { INTDBG("EXIT: Invalid event index: %d, max value is: %d\n", i, PAPI_MAX_USER_EVENTS - 1); return( PAPI_ENOEVNT ); } if ( user_defined_events[i].symbol == NULL) { /* if the event is in the preset table */ INTDBG("EXIT: Event symbol for this event is NULL\n"); return PAPI_ENOEVNT; } /* set whole structure to 0 */ memset( info, 0, sizeof ( PAPI_event_info_t ) ); info->event_code = ( unsigned int ) EventCode; strncpy( info->symbol, user_defined_events[i].symbol, sizeof(info->symbol)-1); if ( user_defined_events[i].short_descr != NULL ) strncpy( info->short_descr, user_defined_events[i].short_descr, sizeof(info->short_descr)-1); if ( user_defined_events[i].long_descr != NULL ) strncpy( info->long_descr, user_defined_events[i].long_descr, sizeof(info->long_descr)-1); // info->event_type = 
user_defined_events[i].event_type; info->count = user_defined_events[i].count; _papi_hwi_derived_string( user_defined_events[i].derived_int, info->derived, sizeof(info->derived)-1); if ( user_defined_events[i].postfix != NULL ) strncpy( info->postfix, user_defined_events[i].postfix, sizeof(info->postfix)-1); for(j=0;j < info->count; j++) { info->code[j]=user_defined_events[i].code[j]; INTDBG("info->code[%d]: %#x\n", j, info->code[j]); strncpy(info->name[j], user_defined_events[i].name[j], sizeof(info->name[j])-1); } if ( user_defined_events[i].note != NULL ) { strncpy( info->note, user_defined_events[i].note, sizeof(info->note)-1); } INTDBG("EXIT: PAPI_OK: event_code: %#x, symbol: %s, short_desc: %s, long_desc: %s\n", info->event_code, info->symbol, info->short_descr, info->long_descr); return PAPI_OK; } /* Returns PAPI_OK if native EventCode found, or PAPI_ENOEVNT if not; Used to enumerate the entire array, e.g. for native_avail.c */ int _papi_hwi_query_native_event( unsigned int EventCode ) { INTDBG("ENTER: EventCode: %#x\n", EventCode); char name[PAPI_HUGE_STR_LEN]; /* probably overkill, */ /* but should always be big enough */ int cidx; int nevt_code; cidx = _papi_hwi_component_index( EventCode ); if (cidx<0) { INTDBG("EXIT: PAPI_ENOCMP\n"); return PAPI_ENOCMP; } // save event code so components can get it with call to: _papi_hwi_get_papi_event_code() _papi_hwi_set_papi_event_code(EventCode, 0); if ((nevt_code = _papi_hwi_eventcode_to_native(EventCode)) < 0) { INTDBG("EXIT: nevt_code: %d\n", nevt_code); return nevt_code; } int ret = _papi_hwd[cidx]->ntv_code_to_name( (unsigned int)nevt_code, name, sizeof(name)); INTDBG("EXIT: ret: %d\n", ret); return (ret); } /* Converts an ASCII name into a native event code usable by other routines Returns code = 0 and PAPI_OK if name not found. 
This allows for sparse native event arrays */ int _papi_hwi_native_name_to_code( const char *in, int *out ) { INTDBG("ENTER: in: %s, out: %p\n", in, out); int retval = PAPI_ENOEVNT; char name[PAPI_HUGE_STR_LEN]; /* make sure it's big enough */ unsigned int i; int cidx; char *full_event_name; if (in == NULL) { INTDBG("EXIT: PAPI_EINVAL\n"); return PAPI_EINVAL; } full_event_name = strdup(in); in = _papi_hwi_strip_component_prefix(in); // look in each component for(cidx=0; cidx < papi_num_components; cidx++) { if (_papi_hwd[cidx]->cmp_info.disabled && _papi_hwd[cidx]->cmp_info.disabled != PAPI_EDELAY_INIT) continue; // if this component does not support the pmu // which defines this event, no need to call it if (is_supported_by_component(cidx, full_event_name) == 0) { continue; } INTDBG("cidx: %d, name: %s, event: %s\n", cidx, _papi_hwd[cidx]->cmp_info.name, in); // show that we do not have an event code yet // (the component may create one and update this info) // this also clears any values left over from a previous call _papi_hwi_set_papi_event_code(-1, -1); // if component has a ntv_name_to_code function, use it to get event code if (_papi_hwd[cidx]->ntv_name_to_code != NULL) { // try and get this events event code retval = _papi_hwd[cidx]->ntv_name_to_code( in, ( unsigned * ) out ); if (retval==PAPI_OK) { *out = _papi_hwi_native_to_eventcode(cidx, *out, -1, in); free (full_event_name); INTDBG("EXIT: PAPI_OK event: %s code: %#x\n", in, *out); return PAPI_OK; } } else { // force the code through the work around retval = PAPI_ECMP; } /* If not implemented, work around */ if ( retval==PAPI_ECMP) { i = 0; retval = _papi_hwd[cidx]->ntv_enum_events( &i, PAPI_ENUM_FIRST ); if (retval != PAPI_OK) { free (full_event_name); INTDBG("EXIT: retval: %d\n", retval); return retval; } // _papi_hwi_lock( INTERNAL_LOCK ); do { // save event code so components can get it with call to: _papi_hwi_get_papi_event_code() _papi_hwi_set_papi_event_code(i, 0); retval = 
_papi_hwd[cidx]->ntv_code_to_name(i, name, sizeof(name)); /* printf("%#x\nname =|%s|\ninput=|%s|\n", i, name, in); */ if ( retval == PAPI_OK && in != NULL) { if ( strcasecmp( name, in ) == 0 ) { *out = _papi_hwi_native_to_eventcode(cidx, i, -1, name); free (full_event_name); INTDBG("EXIT: PAPI_OK, event: %s, code: %#x\n", in, *out); return PAPI_OK; } retval = PAPI_ENOEVNT; } else { *out = 0; retval = PAPI_ENOEVNT; break; } } while ( ( _papi_hwd[cidx]->ntv_enum_events( &i, PAPI_ENUM_EVENTS ) == PAPI_OK ) ); // _papi_hwi_unlock( INTERNAL_LOCK ); } } free (full_event_name); INTDBG("EXIT: retval: %d\n", retval); return retval; } /* Returns event name based on native event code. Returns NULL if name not found */ int _papi_hwi_native_code_to_name( unsigned int EventCode, char *hwi_name, int len ) { INTDBG("ENTER: EventCode: %#x, hwi_name: %p, len: %d\n", EventCode, hwi_name, len); int cidx; int retval; int nevt_code; cidx = _papi_hwi_component_index( EventCode ); if (cidx<0) return PAPI_ENOEVNT; if ( EventCode & PAPI_NATIVE_MASK ) { // save event code so components can get it with call to: _papi_hwi_get_papi_event_code() _papi_hwi_set_papi_event_code(EventCode, 0); if ((nevt_code = _papi_hwi_eventcode_to_native(EventCode)) < 0) { INTDBG("EXIT: nevt_code: %d\n", nevt_code); return nevt_code; } if ( (retval = _papi_hwd[cidx]->ntv_code_to_name( (unsigned int)nevt_code, hwi_name, len) ) == PAPI_OK ) { retval = _papi_hwi_prefix_component_name( _papi_hwd[cidx]->cmp_info.short_name, hwi_name, hwi_name, len); INTDBG("EXIT: retval: %d\n", retval); return retval; } INTDBG("EXIT: retval: %d\n", retval); return (retval); } INTDBG("EXIT: PAPI_ENOEVNT\n"); return PAPI_ENOEVNT; } /* The native event equivalent of PAPI_get_event_info */ int _papi_hwi_get_native_event_info( unsigned int EventCode, PAPI_event_info_t *info ) { INTDBG("ENTER: EventCode: %#x, info: %p\n", EventCode, info); int retval; int cidx; int nevt_code; cidx = _papi_hwi_component_index( EventCode ); if (cidx<0) return 
PAPI_ENOCMP; if (_papi_hwd[cidx]->cmp_info.disabled && _papi_hwd[cidx]->cmp_info.disabled != PAPI_EDELAY_INIT) return PAPI_ENOCMP; if ( EventCode & PAPI_NATIVE_MASK ) { // save event code so components can get it with call to: _papi_hwi_get_papi_event_code() _papi_hwi_set_papi_event_code(EventCode, 0); /* clear the event info */ memset( info, 0, sizeof ( PAPI_event_info_t ) ); info->event_code = ( unsigned int ) EventCode; info->component_index = (unsigned int) cidx; retval = _papi_hwd[cidx]->ntv_code_to_info( _papi_hwi_eventcode_to_native(EventCode), info); /* If component error, it's missing the ntv_code_to_info vector */ /* so we'll have to fake it. */ if ( retval == PAPI_ECMP ) { INTDBG("missing NTV_CODE_TO_INFO, faking\n"); /* Fill in the info structure */ if ((nevt_code = _papi_hwi_eventcode_to_native(EventCode)) < 0) { INTDBG("EXIT: nevt_code: %d\n", nevt_code); return nevt_code; } if ( (retval = _papi_hwd[cidx]->ntv_code_to_name( (unsigned int)nevt_code, info->symbol, sizeof(info->symbol)) ) == PAPI_OK ) { } else { INTDBG("EXIT: retval: %d\n", retval); return retval; } if ((nevt_code = _papi_hwi_eventcode_to_native(EventCode)) <0) { INTDBG("EXIT: nevt_code: %d\n", nevt_code); return nevt_code; } retval = _papi_hwd[cidx]->ntv_code_to_descr( (unsigned int)nevt_code, info->long_descr, sizeof ( info->long_descr)); if (retval!=PAPI_OK) { INTDBG("Failed ntv_code_to_descr()\n"); } } retval = _papi_hwi_prefix_component_name( _papi_hwd[cidx]->cmp_info.short_name, info->symbol, info->symbol, sizeof(info->symbol) ); INTDBG("EXIT: retval: %d\n", retval); return retval; } INTDBG("EXIT: PAPI_ENOEVNT\n"); return PAPI_ENOEVNT; } EventSetInfo_t * _papi_hwi_lookup_EventSet( int eventset ) { const DynamicArray_t *map = &_papi_hwi_system_info.global_eventset_map; EventSetInfo_t *set; if ( ( eventset < 0 ) || ( eventset > map->totalSlots ) ) return ( NULL ); set = map->dataSlotArray[eventset]; #ifdef DEBUG if ( ( ISLEVEL( DEBUG_THREADS ) ) && ( _papi_hwi_thread_id_fn ) && ( 
set->master->tid != _papi_hwi_thread_id_fn( ) ) ) return ( NULL ); #endif return ( set ); } int _papi_hwi_is_sw_multiplex(EventSetInfo_t *ESI) { /* Are we multiplexing at all */ if ( ( ESI->state & PAPI_MULTIPLEXING ) == 0 ) { return 0; } /* Does the component support kernel multiplexing */ if ( _papi_hwd[ESI->CmpIdx]->cmp_info.kernel_multiplex ) { /* Have we forced software multiplexing */ if ( ESI->multiplex.flags == PAPI_MULTIPLEX_FORCE_SW ) { return 1; } /* Nope, using hardware multiplexing */ return 0; } /* We are multiplexing but the component does not support hardware */ return 1; } hwd_context_t * _papi_hwi_get_context( EventSetInfo_t * ESI, int *is_dirty ) { INTDBG("Entry: ESI: %p, is_dirty: %p\n", ESI, is_dirty); int dirty_ctx; hwd_context_t *ctx=NULL; /* assume for now the control state is clean (last updated by this ESI) */ dirty_ctx = 0; /* get a context pointer based on if we are counting for a thread or for a cpu */ if (ESI->state & PAPI_CPU_ATTACHED) { /* use cpu context */ ctx = ESI->CpuInfo->context[ESI->CmpIdx]; /* if the user wants to know if the control state was last set by the same event set, tell him */ if (is_dirty != NULL) { if (ESI->CpuInfo->from_esi != ESI) { dirty_ctx = 1; } *is_dirty = dirty_ctx; } ESI->CpuInfo->from_esi = ESI; } else { /* use thread context */ ctx = ESI->master->context[ESI->CmpIdx]; /* if the user wants to know if the control state was last set by the same event set, tell him */ if (is_dirty != NULL) { if (ESI->master->from_esi != ESI) { dirty_ctx = 1; } *is_dirty = dirty_ctx; } ESI->master->from_esi = ESI; } return( ctx ); } static int get_component_index(const char *name) { int cidx; for (cidx = 0; cidx < papi_num_components; ++cidx) { if (strcmp(_papi_hwd[cidx]->cmp_info.name, name) == 0) { break; } } return cidx; } int _papi_hwi_enum_dev_type(int enum_modifier, void **handle) { _papi_hwi_sysdetect_t args; args.query_type = PAPI_SYSDETECT_QUERY__DEV_TYPE_ENUM; args.query.enumerate.modifier = enum_modifier; int 
cidx = get_component_index("sysdetect"); assert(cidx < papi_num_components); return _papi_hwd[cidx]->user(0, &args, handle); } int _papi_hwi_get_dev_type_attr(void *handle, PAPI_dev_type_attr_e attr, void *value) { _papi_hwi_sysdetect_t args; args.query_type = PAPI_SYSDETECT_QUERY__DEV_TYPE_ATTR; args.query.dev_type.handle = handle; args.query.dev_type.attr = attr; int cidx = get_component_index("sysdetect"); assert(cidx < papi_num_components); return _papi_hwd[cidx]->user(0, &args, value); } int _papi_hwi_get_dev_attr(void *handle, int id, PAPI_dev_attr_e attr, void *value) { _papi_hwi_sysdetect_t args; args.query_type = PAPI_SYSDETECT_QUERY__DEV_ATTR; args.query.dev.handle = handle; args.query.dev.id = id; args.query.dev.attr = attr; int cidx = get_component_index("sysdetect"); assert(cidx < papi_num_components); return _papi_hwd[cidx]->user(0, &args, value); } papi-papi-7-2-0-t/src/papi_internal.h /****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /** * @file papi_internal.h * @author Philip Mucci * mucci@cs.utk.edu * @author Dan Terpstra * terpstra.utk.edu * @author Kevin London * london@cs.utk.edu * @author Haihang You * you@cs.utk.edu */ #ifndef _PAPI_INTERNAL_H #define _PAPI_INTERNAL_H /* AIX's C compiler does not recognize the inline keyword */ #ifdef _AIX #define inline #endif #include "papi_debug.h" #define DEADBEEF 0xdedbeef extern int papi_num_components; extern int _papi_num_compiled_components; extern int init_level; extern int _papi_hwi_errno; extern int _papi_hwi_num_errors; extern char **_papi_errlist; /********************************************************/ /* This block provides general strings used in PAPI */ /* If a new string is needed for PAPI prompts */ /* it should be placed in this file and referenced by */ /* label.
*/ /********************************************************/ #define PAPI_ERROR_CODE_str "Error Code" #define PAPI_SHUTDOWN_str "PAPI_shutdown: PAPI is not initialized" #define PAPI_SHUTDOWN_SYNC_str "PAPI_shutdown: other threads still have running EventSets" /* some members of structs and/or function parameters may or may not be necessary, but at this point, we have included anything that might possibly be useful later, and will remove them as we progress */ /* Signal used for overflow delivery */ #define PAPI_INT_MPX_SIGNAL SIGPROF #define PAPI_INT_SIGNAL SIGPROF #define PAPI_INT_ITIMER ITIMER_PROF #define PAPI_INT_ITIMER_MS 1 #if defined(linux) #define PAPI_NSIG _NSIG #else #define PAPI_NSIG 128 #endif /* Multiplex definitions */ #define PAPI_INT_MPX_DEF_US 10000 /*Default resolution in us. of mpx handler */ /* Commands used to compute derived events */ #define NOT_DERIVED 0x0 /**< Do nothing */ #define DERIVED_ADD 0x1 /**< Add counters */ #define DERIVED_PS 0x2 /**< Divide by the cycle counter and convert to seconds */ #define DERIVED_ADD_PS 0x4 /**< Add 2 counters then divide by the cycle counter and xl8 to secs. 
*/ #define DERIVED_CMPD 0x8 /**< Event lives in operand index but takes 2 or more codes */ #define DERIVED_SUB 0x10 /**< Sub all counters from counter with operand_index */ #define DERIVED_POSTFIX 0x20 /**< Process counters based on specified postfix string */ #define DERIVED_INFIX 0x40 /**< Process counters based on specified infix string */ /* Thread related: thread local storage */ #define LOWLEVEL_TLS PAPI_NUM_TLS+0 #define NUM_INNER_TLS 1 #define PAPI_MAX_TLS (NUM_INNER_TLS+PAPI_NUM_TLS) /* Thread related: locks */ #define INTERNAL_LOCK PAPI_NUM_LOCK+0 /* papi_internal.c */ #define MULTIPLEX_LOCK PAPI_NUM_LOCK+1 /* multiplex.c */ #define THREADS_LOCK PAPI_NUM_LOCK+2 /* threads.c */ #define HIGHLEVEL_LOCK PAPI_NUM_LOCK+3 /* papi_hl.c */ #define MEMORY_LOCK PAPI_NUM_LOCK+4 /* papi_memory.c */ #define COMPONENT_LOCK PAPI_NUM_LOCK+5 /* per-component */ #define GLOBAL_LOCK PAPI_NUM_LOCK+6 /* papi.c for global variable (static and non) initialization/shutdown */ #define CPUS_LOCK PAPI_NUM_LOCK+7 /* cpus.c */ #define NAMELIB_LOCK PAPI_NUM_LOCK+8 /* papi_pfm4_events.c */ /* extras related */ #define NEED_CONTEXT 1 #define DONT_NEED_CONTEXT 0 #define PAPI_EVENTS_IN_DERIVED_EVENT 8 #define PAPI_MAX_COMP_QUALS 8 /* these vestigial pointers are to structures defined in the components they are opaque to the framework and defined as void at this level they are remapped to real data in the component routines that use them */ #define hwd_context_t void #define hwd_control_state_t void #define hwd_reg_alloc_t void #define hwd_register_t void #define hwd_siginfo_t void #define hwd_ucontext_t void /* DEFINES END HERE */ #ifndef NO_CONFI #include "config.h" #endif #include OSCONTEXT #include "papi_preset.h" #ifndef inline_static #define inline_static inline static #endif typedef struct _EventSetDomainInfo { int domain; } EventSetDomainInfo_t; typedef struct _EventSetGranularityInfo { int granularity; } EventSetGranularityInfo_t; typedef struct _EventSetOverflowInfo { int flags; 
int event_counter; PAPI_overflow_handler_t handler; long long *deadline; int *threshold; int *EventIndex; int *EventCode; } EventSetOverflowInfo_t; typedef struct _EventSetAttachInfo { unsigned long tid; } EventSetAttachInfo_t; typedef struct _EventSetCpuInfo { unsigned int cpu_num; } EventSetCpuInfo_t; typedef struct _EventSetInheritInfo { int inherit; } EventSetInheritInfo_t; /** @internal */ typedef struct _EventSetProfileInfo { PAPI_sprofil_t **prof; int *count; /**< Number of buffers */ int *threshold; int *EventIndex; int *EventCode; int flags; int event_counter; } EventSetProfileInfo_t; /** This contains info about an individual event added to the EventSet. The event can be either PRESET or NATIVE, and either simple or derived. If derived, it can consist of up to PAPI_EVENTS_IN_DERIVED_EVENT native events. An EventSet contains a pointer to an array of these structures to define each added event. @internal */ typedef struct _EventInfo { unsigned int event_code; /**< Preset or native code for this event as passed to PAPI_add_event() */ int pos[PAPI_EVENTS_IN_DERIVED_EVENT]; /**< position in the counter array for this events components */ char *ops; /**< operation string of preset (points into preset event struct) */ int derived; /**< Counter derivation command used for derived events */ } EventInfo_t; /** This contains info about each native event added to the EventSet. An EventSet contains an array of MAX_COUNTERS of these structures to define each native event in the set. 
@internal */ typedef struct _NativeInfo { int ni_event; /**< native (libpfm4) event code; always non-zero unless empty */ int ni_papi_code; /**< papi event code value returned to papi applications */ int ni_position; /**< counter array position where this native event lives */ int ni_owners; /**< specifies how many owners share this native event */ hwd_register_t *ni_bits; /**< Component defined resources used by this native event */ } NativeInfo_t; /* Multiplex definitions */ /** This contains only the information about an event that * would cause two events to be counted separately. Options * that don't affect an event aren't included here. * @internal */ typedef struct _papi_info { long long event_type; int domain; int granularity; } PapiInfo; typedef struct _masterevent { int uses; int active; int is_a_rate; int papi_event; PapiInfo pi; long long count; long long cycles; long long handler_count; long long prev_total_c; long long count_estimate; double rate_estimate; struct _threadlist *mythr; struct _masterevent *next; } MasterEvent; /** @internal */ typedef struct _threadlist { #ifdef PTHREADS pthread_t thr; #else unsigned long int tid; #endif /** Total cycles for this thread */ long long total_c; /** Pointer to event in use */ MasterEvent *cur_event; /** List of multiplexing events for this thread */ MasterEvent *head; /** Pointer to next thread */ struct _threadlist *next; } Threadlist; /* Ugh, should move this out and into all callers of papi_internal.h */ #include "sw_multiplex.h" /** Opaque struct, not defined yet...due to threads.h <-> papi_internal.h @internal */ struct _ThreadInfo; struct _CpuInfo; /** Fields below are ordered by access in PAPI_read for performance @internal */ typedef struct _EventSetInfo { struct _ThreadInfo *master; /**< Pointer to thread that owns this EventSet*/ struct _CpuInfo *CpuInfo; /**< Pointer to cpu that owns this EventSet */ int state; /**< The state of this entire EventSet; can be PAPI_RUNNING or PAPI_STOPPED plus flags 
*/ EventInfo_t *EventInfoArray; /**< This array contains the mapping from events added into the API into hardware specific encoding as returned by the kernel or the code that directly accesses the counters. */ hwd_control_state_t *ctl_state; /**< This contains the encoding necessary for the hardware to set the counters to the appropriate conditions */ unsigned long int tid; /**< Thread ID, only used if PAPI_thread_init() is called */ int EventSetIndex; /**< Index of the EventSet in the array */ int CmpIdx; /**< Which Component this EventSet Belongs to */ int NumberOfEvents; /**< Number of events added to EventSet */ long long *hw_start; /**< Array of length num_mpx_cntrs to hold unprocessed, out of order, long long counter registers */ long long *sw_stop; /**< Array of length num_mpx_cntrs that contains processed, in order, PAPI counter values when used or stopped */ int NativeCount; /**< Number of native events in NativeInfoArray */ NativeInfo_t *NativeInfoArray; /**< Info about each native event in the set */ hwd_register_t *NativeBits; /**< Component-specific bits corresponding to the native events */ EventSetDomainInfo_t domain; EventSetGranularityInfo_t granularity; EventSetOverflowInfo_t overflow; EventSetMultiplexInfo_t multiplex; EventSetAttachInfo_t attach; EventSetCpuInfo_t cpu; EventSetProfileInfo_t profile; EventSetInheritInfo_t inherit; } EventSetInfo_t; /** @internal */ typedef struct _dynamic_array { EventSetInfo_t **dataSlotArray; /**< array of ptrs to EventSets */ int totalSlots; /**< number of slots in dataSlotArrays */ int availSlots; /**< number of open slots in dataSlotArrays */ int fullSlots; /**< number of full slots in dataSlotArray */ int lowestEmptySlot; /**< index of lowest empty dataSlotArray */ } DynamicArray_t; /* Component option types for _papi_hwd_ctl. 
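These per-option structs are bundled into the _papi_int_option_t union further down so that a single ctl entry point can service every option code. A minimal standalone sketch of that tagged-union dispatch pattern (hypothetical option codes and fields, not PAPI's real ctl interface):

```c
#include <assert.h>

/* Hypothetical option codes, mirroring the PAPI_SET_* style. */
enum { OPT_SET_DOMAIN, OPT_SET_GRANULARITY };

/* One union member per option; the caller fills in exactly one. */
typedef union {
    struct { int domain; } domain;
    struct { int granularity; } granularity;
} option_t;

/* A single entry point interprets the union according to the code,
 * the way a component's ctl vector interprets _papi_int_option_t. */
static int component_ctl(int code, const option_t *opt, int *state)
{
    switch (code) {
    case OPT_SET_DOMAIN:      *state = opt->domain.domain;           return 0;
    case OPT_SET_GRANULARITY: *state = opt->granularity.granularity; return 0;
    default:                  return -1;  /* unknown option */
    }
}
```

The union keeps the ctl signature stable as new option kinds are added: only a new member and a new case are required, not a new entry point.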
*/ typedef struct _papi_int_attach { unsigned long tid; EventSetInfo_t *ESI; } _papi_int_attach_t; typedef struct _papi_int_cpu { unsigned int cpu_num; EventSetInfo_t *ESI; } _papi_int_cpu_t; typedef struct _papi_int_multiplex { int flags; unsigned long ns; EventSetInfo_t *ESI; } _papi_int_multiplex_t; typedef struct _papi_int_defdomain { int defdomain; } _papi_int_defdomain_t; typedef struct _papi_int_domain { int domain; int eventset; EventSetInfo_t *ESI; } _papi_int_domain_t; typedef struct _papi_int_granularity { int granularity; int eventset; EventSetInfo_t *ESI; } _papi_int_granularity_t; typedef struct _papi_int_overflow { EventSetInfo_t *ESI; EventSetOverflowInfo_t overflow; } _papi_int_overflow_t; typedef struct _papi_int_profile { EventSetInfo_t *ESI; EventSetProfileInfo_t profile; } _papi_int_profile_t; typedef PAPI_itimer_option_t _papi_int_itimer_t; /* These shortcuts are only for use code */ #undef multiplex_itimer_sig #undef multiplex_itimer_num #undef multiplex_itimer_us typedef struct _papi_int_inherit { EventSetInfo_t *ESI; int inherit; } _papi_int_inherit_t; /** @internal */ typedef struct _papi_int_addr_range { /* if both are zero, range is disabled */ EventSetInfo_t *ESI; int domain; vptr_t start; /**< start address of an address range */ vptr_t end; /**< end address of an address range */ int start_off; /**< offset from start address as programmed in hardware */ int end_off; /**< offset from end address as programmed in hardware */ /**< if offsets are undefined, they are both set to -1 */ } _papi_int_addr_range_t; typedef union _papi_int_option_t { _papi_int_overflow_t overflow; _papi_int_profile_t profile; _papi_int_domain_t domain; _papi_int_attach_t attach; _papi_int_cpu_t cpu; _papi_int_multiplex_t multiplex; _papi_int_itimer_t itimer; _papi_int_inherit_t inherit; _papi_int_granularity_t granularity; _papi_int_addr_range_t address_range; } _papi_int_option_t; /** Hardware independent context * @internal */ typedef struct { hwd_siginfo_t 
*si; hwd_ucontext_t *ucontext; } _papi_hwi_context_t; /** @internal */ typedef struct _papi_mdi { DynamicArray_t global_eventset_map; /**< Global structure to maintain int<->EventSet mapping */ pid_t pid; /**< Process identifier */ PAPI_hw_info_t hw_info; /**< See definition in papi.h */ PAPI_exe_info_t exe_info; /**< See definition in papi.h */ PAPI_shlib_info_t shlib_info; /**< See definition in papi.h */ PAPI_preload_info_t preload_info; /**< See definition in papi.h */ } papi_mdi_t; extern papi_mdi_t _papi_hwi_system_info; extern int _papi_hwi_error_level; /* extern const hwi_describe_t _papi_hwi_err[PAPI_NUM_ERRORS]; */ /*extern volatile int _papi_hwi_using_signal;*/ extern int _papi_hwi_using_signal[PAPI_NSIG]; /** @ingroup papi_data_structures */ typedef struct _papi_os_option { char name[PAPI_MAX_STR_LEN]; /**< Name of the operating system */ char version[PAPI_MAX_STR_LEN]; /**< descriptive OS Version */ int os_version; /**< numerical, for workarounds */ int itimer_sig; /**< Signal used by the multiplex timer, 0 if not */ int itimer_num; /**< Number of the itimer used by mpx and overflow/profile emulation */ int itimer_ns; /**< ns between mpx switching and overflow/profile emulation */ int itimer_res_ns; /**< ns of resolution of itimer */ int clock_ticks; /**< clock ticks per second */ unsigned long reserved[8]; /* For future expansion */ } PAPI_os_info_t; extern PAPI_os_info_t _papi_os_info; /* For internal PAPI use only */ #include "papi_lock.h" #include "threads.h" extern THREAD_LOCAL_STORAGE_KEYWORD int _papi_rate_events_running; extern THREAD_LOCAL_STORAGE_KEYWORD int _papi_hl_events_running; EventSetInfo_t *_papi_hwi_lookup_EventSet( int eventset ); void _papi_hwi_set_papi_event_string (const char *event_string); char *_papi_hwi_get_papi_event_string (void); void _papi_hwi_free_papi_event_string(); void _papi_hwi_set_papi_event_code (unsigned int event_code, int update_flag); unsigned int _papi_hwi_get_papi_event_code (void); int _papi_hwi_get_ntv_idx 
(unsigned int papi_evt_code); const char *_papi_hwi_strip_component_prefix(const char *event_name); int _papi_hwi_is_sw_multiplex( EventSetInfo_t * ESI ); hwd_context_t *_papi_hwi_get_context( EventSetInfo_t * ESI, int *is_dirty ); extern int _papi_hwi_error_level; extern PAPI_debug_handler_t _papi_hwi_debug_handler; void PAPIERROR( char *format, ... ); void PAPIWARN( char *format, ... ); int _papi_hwi_assign_eventset( EventSetInfo_t * ESI, int cidx ); void _papi_hwi_free_EventSet( EventSetInfo_t * ESI ); int _papi_hwi_create_eventset( int *EventSet, ThreadInfo_t * handle ); int _papi_hwi_lookup_EventCodeIndex( const EventSetInfo_t * ESI, unsigned int EventCode ); int _papi_hwi_remove_EventSet( EventSetInfo_t * ESI ); void _papi_hwi_map_events_to_native( EventSetInfo_t *ESI); int _papi_hwi_add_event( EventSetInfo_t * ESI, int EventCode ); int _papi_hwi_remove_event( EventSetInfo_t * ESI, int EventCode ); int _papi_hwi_read( hwd_context_t * context, EventSetInfo_t * ESI, long long *values ); int _papi_hwi_cleanup_eventset( EventSetInfo_t * ESI ); int _papi_hwi_convert_eventset_to_multiplex( _papi_int_multiplex_t * mpx ); int _papi_hwi_init_global( int PE_OR_PEU ); int _papi_hwi_init_global_presets( void ); int _papi_hwi_init_global_internal( void ); int _papi_hwi_init_os(void); void _papi_hwi_init_errors(void); PAPI_os_info_t *_papi_hwi_get_os_info(void); void _papi_hwi_shutdown_global_internal( void ); void _papi_hwi_dummy_handler( int EventSet, void *address, long long overflow_vector, void *context ); int _papi_hwi_get_preset_event_info( int EventCode, PAPI_event_info_t * info ); int _papi_hwi_get_user_event_info( int EventCode, PAPI_event_info_t * info ); int _papi_hwi_derived_type( char *tmp, int *code ); int _papi_hwi_query_native_event( unsigned int EventCode ); int _papi_hwi_get_native_event_info( unsigned int EventCode, PAPI_event_info_t * info ); int _papi_hwi_native_name_to_code( const char *in, int *out ); int _papi_hwi_native_code_to_name( unsigned int 
EventCode, char *hwi_name, int len );
int _papi_hwi_invalid_cmp( int cidx );
int _papi_hwi_component_index( int event_code );
int _papi_hwi_native_to_eventcode(int cidx, int event_code, int ntv_idx,
				  const char *event_name);
int _papi_hwi_eventcode_to_native(int event_code);

enum {
	PAPI_SYSDETECT_QUERY__DEV_TYPE_ENUM,
	PAPI_SYSDETECT_QUERY__DEV_TYPE_ATTR,
	PAPI_SYSDETECT_QUERY__DEV_ATTR,
};

typedef struct {
	int query_type;
	union {
		struct {
			int modifier;
		} enumerate;
		struct {
			void *handle;
			PAPI_dev_type_attr_e attr;
		} dev_type;
		struct {
			void *handle;
			int id;
			PAPI_dev_attr_e attr;
		} dev;
	} query;
} _papi_hwi_sysdetect_t;

int _papi_hwi_enum_dev_type(int enum_modifier, void **handle);
int _papi_hwi_get_dev_type_attr(void *handle, PAPI_dev_type_attr_e attr, void *val);
int _papi_hwi_get_dev_attr(void *handle, int id, PAPI_dev_attr_e attr, void *val);

int construct_qualified_event(hwi_presets_t *prstPtr);
int overwrite_qualifiers(hwi_presets_t *prstPtr, const char *in, int is_preset);
int get_first_cmp_preset_idx( void );
int get_preset_cmp( unsigned int *index );
hwi_presets_t* get_preset( int event_code );

#endif /* PAPI_INTERNAL_H */

papi-papi-7-2-0-t/src/papi_libpfm3_events.c

/*
 * File:    papi_libpfm3_events.c
 * Author:  Dan Terpstra: blatantly extracted from Phil's perfmon.c
 *          mucci@cs.utk.edu
 */

#include
#include
#include

#include "papi.h"
#include "papi_internal.h"
#include "papi_vector.h"
#include "papi_memory.h"

#include "perfmon/perfmon.h"
#include "perfmon/pfmlib.h"

#include "papi_libpfm_events.h"

/* Native events consist of a flag field, an event field, and a unit mask field.
 * These variables define the characteristics of the event and unit mask fields.
*/ unsigned int PAPI_NATIVE_EVENT_AND_MASK = 0x000003ff; unsigned int PAPI_NATIVE_EVENT_SHIFT = 0; unsigned int PAPI_NATIVE_UMASK_AND_MASK = 0x03fffc00; unsigned int PAPI_NATIVE_UMASK_MAX = 16; unsigned int PAPI_NATIVE_UMASK_SHIFT = 10; /* Globals */ int num_native_events=0; /* NOTE: PAPI stores umask info in a variable sized (16 bit?) bitfield. Perfmon2 stores umask info in a large (48 element?) array of values. Native event encodings for perfmon2 contain array indices encoded as bits in this bitfield. These indices must be converted into a umask value before programming the counters. For Perfmon, this is done by converting back to an array of values; for perfctr, it must be done by looking up the values. */ /* This routine is used to step through all possible combinations of umask values. It assumes that mask contains a valid combination of array indices for this event. */ static inline int encode_native_event_raw( unsigned int event, unsigned int mask ) { unsigned int tmp = event << PAPI_NATIVE_EVENT_SHIFT; SUBDBG( "Old native index was %#08x with %#08x mask\n", tmp, mask ); tmp = tmp | ( mask << PAPI_NATIVE_UMASK_SHIFT ); SUBDBG( "New encoding is %#08x\n", tmp | PAPI_NATIVE_MASK ); return ( int ) ( tmp | PAPI_NATIVE_MASK ); } /* This routine converts array indices contained in the mask_values array into bits in the umask field that is OR'd into the native event code. These bits are NOT the mask values themselves, but indices into an array of mask values contained in the native event table. 
*/ static inline int encode_native_event( unsigned int event, unsigned int num_mask, unsigned int *mask_values ) { unsigned int i; unsigned int tmp = event << PAPI_NATIVE_EVENT_SHIFT; SUBDBG( "Native base event is %#08x with %d masks\n", tmp, num_mask ); for ( i = 0; i < num_mask; i++ ) { SUBDBG( "Mask index is %#08x\n", mask_values[i] ); tmp = tmp | ( ( 1 << mask_values[i] ) << PAPI_NATIVE_UMASK_SHIFT ); } SUBDBG( "Full native encoding is 0x%08x\n", tmp | PAPI_NATIVE_MASK ); return ( int ) ( tmp | PAPI_NATIVE_MASK ); } /* Break a PAPI native event code into its composite event code and pfm mask bits */ int _pfm_decode_native_event( unsigned int EventCode, unsigned int *event, unsigned int *umask ) { unsigned int tevent, major, minor; tevent = EventCode & PAPI_NATIVE_AND_MASK; major = ( tevent & PAPI_NATIVE_EVENT_AND_MASK ) >> PAPI_NATIVE_EVENT_SHIFT; if ( ( int ) major >= num_native_events ) return PAPI_ENOEVNT; minor = ( tevent & PAPI_NATIVE_UMASK_AND_MASK ) >> PAPI_NATIVE_UMASK_SHIFT; *event = major; *umask = minor; SUBDBG( "EventCode %#08x is event %d, umask %#x\n", EventCode, major, minor ); return PAPI_OK; } /* convert a collection of pfm mask bits into an array of pfm mask indices */ int prepare_umask( unsigned int foo, unsigned int *values ) { unsigned int tmp = foo, i; int j = 0; SUBDBG( "umask %#x\n", tmp ); while ( ( i = ( unsigned int ) ffs( ( int ) tmp ) ) ) { tmp = tmp ^ ( 1 << ( i - 1 ) ); values[j] = i - 1; SUBDBG( "umask %d is %d\n", j, values[j] ); j++; } return ( j ); } /* convert the mask values in a pfm event structure into a PAPI unit mask */ static inline unsigned int convert_pfm_masks( pfmlib_event_t * gete ) { int ret; unsigned int i, code, tmp = 0; for ( i = 0; i < gete->num_masks; i++ ) { if ( ( ret = pfm_get_event_mask_code( gete->event, gete->unit_masks[i], &code ) ) == PFMLIB_SUCCESS ) { SUBDBG( "Mask value is %#08x\n", code ); tmp |= code; } else { PAPIERROR( "pfm_get_event_mask_code(%#x,%d,%p): %s", gete->event, i, &code, 
pfm_strerror( ret ) ); } } return ( tmp ); } /* convert an event code and pfm unit mask into a PAPI unit mask */ unsigned int _pfm_convert_umask( unsigned int event, unsigned int umask ) { pfmlib_event_t gete; memset( &gete, 0, sizeof ( gete ) ); gete.event = event; gete.num_masks = ( unsigned int ) prepare_umask( umask, gete.unit_masks ); return ( convert_pfm_masks( &gete ) ); } /* convert libpfm error codes to PAPI error codes for more informative error reporting */ int _papi_libpfm_error( int pfm_error ) { switch ( pfm_error ) { case PFMLIB_SUCCESS: return PAPI_OK; /* success */ case PFMLIB_ERR_NOTSUPP: return PAPI_ENOSUPP; /* function not supported */ case PFMLIB_ERR_INVAL: return PAPI_EINVAL; /* invalid parameters */ case PFMLIB_ERR_NOINIT: return PAPI_ENOINIT; /* library was not initialized */ case PFMLIB_ERR_NOTFOUND: return PAPI_ENOEVNT; /* event not found */ case PFMLIB_ERR_NOASSIGN: return PAPI_ECNFLCT; /* cannot assign events to counters */ case PFMLIB_ERR_FULL: return PAPI_EBUF; /* buffer is full or too small */ case PFMLIB_ERR_EVTMANY: return PAPI_EMISC; /* event used more than once */ case PFMLIB_ERR_MAGIC: return PAPI_EBUG; /* invalid library magic number */ case PFMLIB_ERR_FEATCOMB: return PAPI_ECOMBO; /* invalid combination of features */ case PFMLIB_ERR_EVTSET: return PAPI_ENOEVST; /* incompatible event sets */ case PFMLIB_ERR_EVTINCOMP: return PAPI_ECNFLCT; /* incompatible event combination */ case PFMLIB_ERR_TOOMANY: return PAPI_ECOUNT; /* too many events or unit masks */ case PFMLIB_ERR_BADHOST: return PAPI_ESYS; /* not supported by host CPU */ case PFMLIB_ERR_UMASK: return PAPI_EATTR; /* invalid or missing unit mask */ case PFMLIB_ERR_NOMEM: return PAPI_ENOMEM; /* out of memory */ /* Itanium only */ case PFMLIB_ERR_IRRTOOBIG: /* code range too big */ case PFMLIB_ERR_IRREMPTY: /* empty code range */ case PFMLIB_ERR_IRRINVAL: /* invalid code range */ case PFMLIB_ERR_IRRTOOMANY: /* too many code ranges */ case PFMLIB_ERR_DRRINVAL: /* invalid data 
range */ case PFMLIB_ERR_DRRTOOMANY: /* too many data ranges */ case PFMLIB_ERR_IRRALIGN: /* bad alignment for code range */ case PFMLIB_ERR_IRRFLAGS: /* code range missing flags */ default: return PAPI_EINVAL; } } int _papi_libpfm_ntv_name_to_code( const char *name, unsigned int *event_code ) { pfmlib_event_t event; unsigned int i; int ret; SUBDBG( "pfm_find_full_event(%s,%p)\n", name, &event ); ret = pfm_find_full_event( name, &event ); if ( ret == PFMLIB_SUCCESS ) { SUBDBG( "Full event name found\n" ); /* we can only capture PAPI_NATIVE_UMASK_MAX or fewer masks */ if ( event.num_masks > PAPI_NATIVE_UMASK_MAX ) { SUBDBG( "num_masks (%d) > max masks (%d)\n", event.num_masks, PAPI_NATIVE_UMASK_MAX ); return PAPI_ENOEVNT; } else { /* no mask index can exceed PAPI_NATIVE_UMASK_MAX */ for ( i = 0; i < event.num_masks; i++ ) { if ( event.unit_masks[i] > PAPI_NATIVE_UMASK_MAX ) { SUBDBG( "mask index (%d) > max masks (%d)\n", event.unit_masks[i], PAPI_NATIVE_UMASK_MAX ); return PAPI_ENOEVNT; } } *event_code = encode_native_event( event.event, event.num_masks, event.unit_masks ); return PAPI_OK; } } else if ( ret == PFMLIB_ERR_UMASK ) { SUBDBG( "UMASK error, looking for base event only\n" ); ret = pfm_find_event( name, &event.event ); if ( ret == PFMLIB_SUCCESS ) { *event_code = encode_native_event( event.event, 0, 0 ); return PAPI_EATTR; } } return PAPI_ENOEVNT; } int _papi_libpfm_ntv_code_to_name( unsigned int EventCode, char *ntv_name, int len ) { int ret; unsigned int event, umask; pfmlib_event_t gete; memset( &gete, 0, sizeof ( gete ) ); if ( _pfm_decode_native_event( EventCode, &event, &umask ) != PAPI_OK ) return ( PAPI_ENOEVNT ); gete.event = event; gete.num_masks = ( unsigned int ) prepare_umask( umask, gete.unit_masks ); if ( gete.num_masks == 0 ) ret = pfm_get_event_name( gete.event, ntv_name, ( size_t ) len ); else ret = pfm_get_full_event_name( &gete, ntv_name, ( size_t ) len ); if ( ret != PFMLIB_SUCCESS ) { char tmp[PAPI_2MAX_STR_LEN]; pfm_get_event_name( 
gete.event, tmp, sizeof ( tmp ) ); /* Skip error message if event is not supported by host cpu; * we don't need to give this info away for papi_native_avail util */ if ( ret != PFMLIB_ERR_BADHOST ) PAPIERROR ( "pfm_get_full_event_name(%p(event %d,%s,%d masks),%p,%d): %d -- %s", &gete, gete.event, tmp, gete.num_masks, ntv_name, len, ret, pfm_strerror( ret ) ); if ( ret == PFMLIB_ERR_FULL ) { return PAPI_EBUF; } return PAPI_EMISC; } return PAPI_OK; } int _papi_libpfm_ntv_code_to_descr( unsigned int EventCode, char *ntv_descr, int len ) { unsigned int event, umask; char *eventd, **maskd, *tmp; int i, ret; pfmlib_event_t gete; size_t total_len = 0; memset( &gete, 0, sizeof ( gete ) ); if ( _pfm_decode_native_event( EventCode, &event, &umask ) != PAPI_OK ) return ( PAPI_ENOEVNT ); ret = pfm_get_event_description( event, &eventd ); if ( ret != PFMLIB_SUCCESS ) { PAPIERROR( "pfm_get_event_description(%d,%p): %s", event, &eventd, pfm_strerror( ret ) ); return ( PAPI_ENOEVNT ); } if ( ( gete.num_masks = ( unsigned int ) prepare_umask( umask, gete.unit_masks ) ) ) { maskd = ( char ** ) malloc( gete.num_masks * sizeof ( char * ) ); if ( maskd == NULL ) { free( eventd ); return ( PAPI_ENOMEM ); } for ( i = 0; i < ( int ) gete.num_masks; i++ ) { ret = pfm_get_event_mask_description( event, gete.unit_masks[i], &maskd[i] ); if ( ret != PFMLIB_SUCCESS ) { PAPIERROR( "pfm_get_event_mask_description(%d,%d,%p): %s", event, umask, &maskd, pfm_strerror( ret ) ); free( eventd ); for ( ; i >= 0; i-- ) free( maskd[i] ); free( maskd ); return ( PAPI_EINVAL ); } total_len += strlen( maskd[i] ); } tmp = ( char * ) malloc( strlen( eventd ) + strlen( ", masks:" ) + total_len + gete.num_masks + 1 ); if ( tmp == NULL ) { for ( i = ( int ) gete.num_masks - 1; i >= 0; i-- ) free( maskd[i] ); free( maskd ); free( eventd ); } tmp[0] = '\0'; strcat( tmp, eventd ); strcat( tmp, ", masks:" ); for ( i = 0; i < ( int ) gete.num_masks; i++ ) { if ( i != 0 ) strcat( tmp, "," ); strcat( tmp, maskd[i] ); 
free( maskd[i] ); } free( maskd ); } else { tmp = ( char * ) malloc( strlen( eventd ) + 1 ); if ( tmp == NULL ) { free( eventd ); return ( PAPI_ENOMEM ); } tmp[0] = '\0'; strcat( tmp, eventd ); free( eventd ); } strncpy( ntv_descr, tmp, ( size_t ) len ); if ( ( int ) strlen( tmp ) > len - 1 ) ret = PAPI_EBUF; else ret = PAPI_OK; free( tmp ); return ( ret ); } int _papi_libpfm_ntv_code_to_info(unsigned int EventCode, PAPI_event_info_t *info) { SUBDBG("ENTER %#x\n",EventCode); _papi_libpfm_ntv_code_to_name(EventCode,info->symbol, sizeof(info->symbol)); _papi_libpfm_ntv_code_to_descr(EventCode,info->long_descr, sizeof(info->long_descr)); return PAPI_OK; } int _papi_libpfm_ntv_enum_events( unsigned int *EventCode, int modifier ) { unsigned int event, umask, num_masks; int ret; if ( modifier == PAPI_ENUM_FIRST ) { *EventCode = PAPI_NATIVE_MASK; /* assumes first native event is always 0x4000000 */ return ( PAPI_OK ); } if ( _pfm_decode_native_event( *EventCode, &event, &umask ) != PAPI_OK ) return ( PAPI_ENOEVNT ); ret = pfm_get_num_event_masks( event, &num_masks ); if ( ret != PFMLIB_SUCCESS ) { PAPIERROR( "pfm_get_num_event_masks(%d,%p): %s", event, &num_masks, pfm_strerror( ret ) ); return ( PAPI_ENOEVNT ); } if ( num_masks > PAPI_NATIVE_UMASK_MAX ) num_masks = PAPI_NATIVE_UMASK_MAX; SUBDBG( "This is umask %d of %d\n", umask, num_masks ); if ( modifier == PAPI_ENUM_EVENTS ) { if ( event < ( unsigned int ) num_native_events - 1 ) { *EventCode = ( unsigned int ) encode_native_event_raw( event + 1, 0 ); return ( PAPI_OK ); } return ( PAPI_ENOEVNT ); } else if ( modifier == PAPI_NTV_ENUM_UMASK_COMBOS ) { if ( umask + 1 < ( unsigned int ) ( 1 << num_masks ) ) { *EventCode = ( unsigned int ) encode_native_event_raw( event, umask + 1 ); return ( PAPI_OK ); } return ( PAPI_ENOEVNT ); } else if ( modifier == PAPI_NTV_ENUM_UMASKS ) { int thisbit = ffs( ( int ) umask ); SUBDBG( "First bit is %d in %08x\b\n", thisbit - 1, umask ); thisbit = 1 << thisbit; if ( thisbit & ( ( 1 << 
num_masks ) - 1 ) ) { *EventCode = ( unsigned int ) encode_native_event_raw( event, ( unsigned int ) thisbit ); return ( PAPI_OK ); } return ( PAPI_ENOEVNT ); } else return ( PAPI_EINVAL ); } int _papi_libpfm_ntv_code_to_bits( unsigned int EventCode, hwd_register_t * bits ) { unsigned int event, umask; pfmlib_event_t gete; /* For PFM & Perfmon, native info is just an index into PFM event table. */ if ( _pfm_decode_native_event( EventCode, &event, &umask ) != PAPI_OK ) return PAPI_ENOEVNT; memset( &gete, 0x0, sizeof ( pfmlib_event_t ) ); gete.event = event; gete.num_masks = prepare_umask( umask, gete.unit_masks ); memcpy( bits, &gete, sizeof ( pfmlib_event_t ) ); return PAPI_OK; } /* used by linux-timer.c for ia64 */ int _perfmon2_pfm_pmu_type = -1; int _papi_libpfm_init(papi_vector_t *my_vector, int cidx) { int retval; unsigned int ncnt; unsigned int version; char pmu_name[PAPI_MIN_STR_LEN]; /* The following checks the version of the PFM library against the version PAPI linked to... */ SUBDBG( "pfm_initialize()\n" ); if ( ( retval = pfm_initialize( ) ) != PFMLIB_SUCCESS ) { PAPIERROR( "pfm_initialize(): %s", pfm_strerror( retval ) ); return PAPI_ESYS; } /* Get the libpfm3 version */ SUBDBG( "pfm_get_version(%p)\n", &version ); if ( pfm_get_version( &version ) != PFMLIB_SUCCESS ) { PAPIERROR( "pfm_get_version(%p): %s", version, pfm_strerror( retval ) ); return PAPI_ESYS; } /* Set the version */ sprintf( my_vector->cmp_info.support_version, "%d.%d", PFM_VERSION_MAJOR( version ), PFM_VERSION_MINOR( version ) ); /* Complain if the compiled-against version doesn't match current version */ if ( PFM_VERSION_MAJOR( version ) != PFM_VERSION_MAJOR( PFMLIB_VERSION ) ) { PAPIERROR( "Version mismatch of libpfm: compiled %#x vs. installed %#x\n", PFM_VERSION_MAJOR( PFMLIB_VERSION ), PFM_VERSION_MAJOR( version ) ); return PAPI_ESYS; } /* Always initialize globals dynamically to handle forks properly. */ _perfmon2_pfm_pmu_type = -1; /* Opened once for all threads. 
*/ SUBDBG( "pfm_get_pmu_type(%p)\n", &_perfmon2_pfm_pmu_type ); if ( pfm_get_pmu_type( &_perfmon2_pfm_pmu_type ) != PFMLIB_SUCCESS ) { PAPIERROR( "pfm_get_pmu_type(%p): %s", _perfmon2_pfm_pmu_type, pfm_strerror( retval ) ); return PAPI_ESYS; } pmu_name[0] = '\0'; if ( pfm_get_pmu_name( pmu_name, PAPI_MIN_STR_LEN ) != PFMLIB_SUCCESS ) { PAPIERROR( "pfm_get_pmu_name(%p,%d): %s", pmu_name, PAPI_MIN_STR_LEN, pfm_strerror( retval ) ); return PAPI_ESYS; } SUBDBG( "PMU is a %s, type %d\n", pmu_name, _perfmon2_pfm_pmu_type ); /* Setup presets */ retval = _papi_load_preset_table( pmu_name, _perfmon2_pfm_pmu_type, cidx ); if ( retval ) return retval; /* Fill in cmp_info */ SUBDBG( "pfm_get_num_events(%p)\n", &ncnt ); if ( ( retval = pfm_get_num_events( &ncnt ) ) != PFMLIB_SUCCESS ) { PAPIERROR( "pfm_get_num_events(%p): %s\n", &ncnt, pfm_strerror( retval ) ); return PAPI_ESYS; } SUBDBG( "pfm_get_num_events: %d\n", ncnt ); my_vector->cmp_info.num_native_events = ncnt; num_native_events = ncnt; pfm_get_num_counters( ( unsigned int * ) &my_vector->cmp_info.num_cntrs ); SUBDBG( "pfm_get_num_counters: %d\n", my_vector->cmp_info.num_cntrs ); if ( _papi_hwi_system_info.hw_info.vendor == PAPI_VENDOR_INTEL ) { /* Pentium4 */ if ( _papi_hwi_system_info.hw_info.cpuid_family == 15 ) { PAPI_NATIVE_EVENT_AND_MASK = 0x000000ff; PAPI_NATIVE_UMASK_AND_MASK = 0x0fffff00; PAPI_NATIVE_UMASK_SHIFT = 8; /* Itanium2 */ } else if ( _papi_hwi_system_info.hw_info.cpuid_family == 31 || _papi_hwi_system_info.hw_info.cpuid_family == 32 ) { PAPI_NATIVE_EVENT_AND_MASK = 0x00000fff; PAPI_NATIVE_UMASK_AND_MASK = 0x0ffff000; PAPI_NATIVE_UMASK_SHIFT = 12; } } return PAPI_OK; } long long generate_p4_event(long long escr, long long cccr, long long escr_addr) { /* * RAW events specification * * Bits Meaning * ----- ------- * 0-6 Metric value from enum P4_PEBS_METRIC (if needed) * 7-11 Reserved, set to 0 * 12-31 Bits 12-31 of CCCR register (Intel SDM Vol 3) * 32-56 Bits 0-24 of ESCR register (Intel SDM Vol 3) * 
57-62 Event key from enum P4_EVENTS * 63 Reserved, set to 0 */ enum P4_EVENTS { P4_EVENT_TC_DELIVER_MODE, P4_EVENT_BPU_FETCH_REQUEST, P4_EVENT_ITLB_REFERENCE, P4_EVENT_MEMORY_CANCEL, P4_EVENT_MEMORY_COMPLETE, P4_EVENT_LOAD_PORT_REPLAY, P4_EVENT_STORE_PORT_REPLAY, P4_EVENT_MOB_LOAD_REPLAY, P4_EVENT_PAGE_WALK_TYPE, P4_EVENT_BSQ_CACHE_REFERENCE, P4_EVENT_IOQ_ALLOCATION, P4_EVENT_IOQ_ACTIVE_ENTRIES, P4_EVENT_FSB_DATA_ACTIVITY, P4_EVENT_BSQ_ALLOCATION, P4_EVENT_BSQ_ACTIVE_ENTRIES, P4_EVENT_SSE_INPUT_ASSIST, P4_EVENT_PACKED_SP_UOP, P4_EVENT_PACKED_DP_UOP, P4_EVENT_SCALAR_SP_UOP, P4_EVENT_SCALAR_DP_UOP, P4_EVENT_64BIT_MMX_UOP, P4_EVENT_128BIT_MMX_UOP, P4_EVENT_X87_FP_UOP, P4_EVENT_TC_MISC, P4_EVENT_GLOBAL_POWER_EVENTS, P4_EVENT_TC_MS_XFER, P4_EVENT_UOP_QUEUE_WRITES, P4_EVENT_RETIRED_MISPRED_BRANCH_TYPE, P4_EVENT_RETIRED_BRANCH_TYPE, P4_EVENT_RESOURCE_STALL, P4_EVENT_WC_BUFFER, P4_EVENT_B2B_CYCLES, P4_EVENT_BNR, P4_EVENT_SNOOP, P4_EVENT_RESPONSE, P4_EVENT_FRONT_END_EVENT, P4_EVENT_EXECUTION_EVENT, P4_EVENT_REPLAY_EVENT, P4_EVENT_INSTR_RETIRED, P4_EVENT_UOPS_RETIRED, P4_EVENT_UOP_TYPE, P4_EVENT_BRANCH_RETIRED, P4_EVENT_MISPRED_BRANCH_RETIRED, P4_EVENT_X87_ASSIST, P4_EVENT_MACHINE_CLEAR, P4_EVENT_INSTR_COMPLETED, }; int eventsel=(escr>>25)&0x3f; int cccrsel=(cccr>>13)&0x7; int event_key=-1; long long pe_event; switch(eventsel) { case 0x1: if (cccrsel==1) { if (escr_addr>0x3c8) { // tc_escr0,1 0x3c4 event_key=P4_EVENT_TC_DELIVER_MODE; } else { // alf_escr0, 0x3ca event_key=P4_EVENT_RESOURCE_STALL; } } if (cccrsel==4) { if (escr_addr<0x3af) { // pmh_escr0,1 0x3ac event_key=P4_EVENT_PAGE_WALK_TYPE; } else { // cru_escr0, 3b8 cccr=04 event_key=P4_EVENT_UOPS_RETIRED; } } break; case 0x2: if (cccrsel==5) { if (escr_addr<0x3a8) { // MSR_DAC_ESCR0 / MSR_DAC_ESCR1 event_key=P4_EVENT_MEMORY_CANCEL; } else { //MSR_CRU_ESCR2, MSR_CRU_ESCR3 event_key=P4_EVENT_MACHINE_CLEAR; } } else if (cccrsel==1) { event_key=P4_EVENT_64BIT_MMX_UOP; } else if (cccrsel==4) { 
event_key=P4_EVENT_INSTR_RETIRED; } else if (cccrsel==2) { event_key=P4_EVENT_UOP_TYPE; } break; case 0x3: if (cccrsel==0) { event_key=P4_EVENT_BPU_FETCH_REQUEST; } if (cccrsel==2) { event_key=P4_EVENT_MOB_LOAD_REPLAY; } if (cccrsel==6) { event_key=P4_EVENT_IOQ_ALLOCATION; } if (cccrsel==4) { event_key=P4_EVENT_MISPRED_BRANCH_RETIRED; } if (cccrsel==5) { event_key=P4_EVENT_X87_ASSIST; } break; case 0x4: if (cccrsel==2) { if (escr_addr<0x3b0) { // saat, 0x3ae event_key=P4_EVENT_LOAD_PORT_REPLAY; } else { // tbpu 0x3c2 event_key=P4_EVENT_RETIRED_BRANCH_TYPE; } } if (cccrsel==1) { event_key=P4_EVENT_X87_FP_UOP; } if (cccrsel==3) { event_key=P4_EVENT_RESPONSE; } break; case 0x5: if (cccrsel==2) { if (escr_addr<0x3b0) { // saat, 0x3ae event_key=P4_EVENT_STORE_PORT_REPLAY; } else { // tbpu, 0x3c2 event_key=P4_EVENT_RETIRED_MISPRED_BRANCH_TYPE; } } if (cccrsel==7) { event_key=P4_EVENT_BSQ_ALLOCATION; } if (cccrsel==0) { event_key=P4_EVENT_TC_MS_XFER; } if (cccrsel==5) { event_key=P4_EVENT_WC_BUFFER; } break; case 0x6: if (cccrsel==7) { event_key=P4_EVENT_BSQ_ACTIVE_ENTRIES; } if (cccrsel==1) { event_key=P4_EVENT_TC_MISC; } if (cccrsel==3) { event_key=P4_EVENT_SNOOP; } if (cccrsel==5) { event_key=P4_EVENT_BRANCH_RETIRED; } break; case 0x7: event_key=P4_EVENT_INSTR_COMPLETED; break; case 0x8: if (cccrsel==2) { event_key=P4_EVENT_MEMORY_COMPLETE; } if (cccrsel==1) { event_key=P4_EVENT_PACKED_SP_UOP; } if (cccrsel==3) { event_key=P4_EVENT_BNR; } if (cccrsel==5) { event_key=P4_EVENT_FRONT_END_EVENT; } break; case 0x9: if (cccrsel==0) { event_key=P4_EVENT_UOP_QUEUE_WRITES; } if (cccrsel==5) { event_key=P4_EVENT_REPLAY_EVENT; } break; case 0xa: event_key=P4_EVENT_SCALAR_SP_UOP; break; case 0xc: if (cccrsel==7) { event_key=P4_EVENT_BSQ_CACHE_REFERENCE; } if (cccrsel==1) { event_key=P4_EVENT_PACKED_DP_UOP; } if (cccrsel==5) { event_key=P4_EVENT_EXECUTION_EVENT; } break; case 0xe: event_key=P4_EVENT_SCALAR_DP_UOP; break; case 0x13: event_key=P4_EVENT_GLOBAL_POWER_EVENTS; break; 
case 0x16: event_key=P4_EVENT_B2B_CYCLES; break; case 0x17: event_key=P4_EVENT_FSB_DATA_ACTIVITY; break; case 0x18: event_key=P4_EVENT_ITLB_REFERENCE; break; case 0x1a: if (cccrsel==6) { event_key=P4_EVENT_IOQ_ACTIVE_ENTRIES; } if (cccrsel==1) { event_key=P4_EVENT_128BIT_MMX_UOP; } break; case 0x34: event_key= P4_EVENT_SSE_INPUT_ASSIST; break; } pe_event=(escr&0x1ffffff)<<32; pe_event|=(cccr&0xfffff000); pe_event|=(((long long)(event_key))<<57); return pe_event; } typedef pfmlib_event_t pfm_register_t; int _papi_libpfm_setup_counters( struct perf_event_attr *attr, hwd_register_t *ni_bits ) { int ret,pe_event; (void)ni_bits; /* * We need an event code that is common across all counters. * The implementation is required to know how to translate the supplied * code to whichever counter it ends up on. */ #if defined(__powerpc__) int code; ret = pfm_get_event_code_counter( ( ( pfm_register_t * ) ni_bits )->event, 0, &code ); if ( ret ) { /* Unrecognized code, but should never happen */ return PAPI_EBUG; } pe_event = code; SUBDBG( "Stuffing native event index (code %#x, raw code %#x) into events array.\n", ( ( pfm_register_t * ) ni_bits )->event, code ); #else pfmlib_input_param_t inp; pfmlib_output_param_t outp; memset( &inp, 0, sizeof ( inp ) ); memset( &outp, 0, sizeof ( outp ) ); inp.pfp_event_count = 1; inp.pfp_dfl_plm = PAPI_DOM_USER; pfm_regmask_set( &inp.pfp_unavail_pmcs, 16 ); // mark fixed counters as unavailable inp.pfp_events[0] = *( ( pfm_register_t * ) ni_bits ); ret = pfm_dispatch_events( &inp, NULL, &outp, NULL ); if (ret != PFMLIB_SUCCESS) { SUBDBG( "Error: pfm_dispatch_events returned: %d\n", ret); return PAPI_ESYS; } /* Special case p4 */ if (( _papi_hwi_system_info.hw_info.vendor == PAPI_VENDOR_INTEL ) && ( _papi_hwi_system_info.hw_info.cpuid_family == 15)) { pe_event=generate_p4_event( outp.pfp_pmcs[0].reg_value, /* escr */ outp.pfp_pmcs[1].reg_value, /* cccr */ outp.pfp_pmcs[0].reg_addr); /* escr_addr */ } else { pe_event = 
outp.pfp_pmcs[0].reg_value; } SUBDBG( "pe_event: %#llx\n", outp.pfp_pmcs[0].reg_value ); #endif attr->config=pe_event; /* for libpfm3 we currently only handle RAW type */ attr->type=PERF_TYPE_RAW; return PAPI_OK; } int _papi_libpfm_shutdown(void) { SUBDBG("shutdown\n"); return PAPI_OK; } papi-papi-7-2-0-t/src/papi_libpfm4_events.c000066400000000000000000000100001502707512200205140ustar00rootroot00000000000000/* * File: papi_libpfm4_events.c * Author: Vince Weaver vincent.weaver @ maine.edu * based heavily on existing papi_libpfm3_events.c */ #include #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_libpfm4_events.h" #include "perfmon/pfmlib.h" #include "perfmon/pfmlib_perf_event.h" /**********************************************************/ /* Local scope globals */ /**********************************************************/ static int libpfm4_users=0; /***********************************************************/ /* Exported functions */ /***********************************************************/ /** @class _papi_libpfm4_error * @brief convert libpfm error codes to PAPI error codes * * @param[in] pfm_error * -- a libpfm4 error code * * @returns returns a PAPI error code * */ int _papi_libpfm4_error( int pfm_error ) { switch ( pfm_error ) { case PFM_SUCCESS: return PAPI_OK; /* success */ case PFM_ERR_NOTSUPP: return PAPI_ENOSUPP; /* function not supported */ case PFM_ERR_INVAL: return PAPI_EINVAL; /* invalid parameters */ case PFM_ERR_NOINIT: return PAPI_ENOINIT; /* library not initialized */ case PFM_ERR_NOTFOUND: return PAPI_ENOEVNT; /* event not found */ case PFM_ERR_FEATCOMB: return PAPI_ECOMBO; /* invalid combination of features */ case PFM_ERR_UMASK: return PAPI_EATTR; /* invalid or missing unit mask */ case PFM_ERR_NOMEM: return PAPI_ENOMEM; /* out of memory */ case PFM_ERR_ATTR: return PAPI_EATTR; /* invalid event attribute */ case PFM_ERR_ATTR_VAL: return PAPI_EATTR; /* invalid event attribute value */ case 
PFM_ERR_ATTR_SET: return PAPI_EATTR; /* attribute value already set */ case PFM_ERR_TOOMANY: return PAPI_ECOUNT; /* too many parameters */ case PFM_ERR_TOOSMALL: return PAPI_ECOUNT; /* parameter is too small */ default: PAPIWARN("Unknown libpfm error code %d, returning PAPI_EINVAL",pfm_error); return PAPI_EINVAL; } } /** @class _papi_libpfm4_shutdown * @brief Shutdown any initialization done by the libpfm4 code * * @param[in] component * -- component doing the shutdown * * @retval PAPI_OK Success * */ int _papi_libpfm4_shutdown(papi_vector_t *my_vector) { /* clean out and free the native events structure */ _papi_hwi_lock( NAMELIB_LOCK ); libpfm4_users--; /* Only free if we're the last user */ if (!libpfm4_users) { pfm_terminate(); } _papi_hwi_unlock( NAMELIB_LOCK ); strcpy(my_vector->cmp_info.support_version,""); return PAPI_OK; } /** @class _papi_libpfm4_init * @brief Initialize the libpfm4 code * * @param[in] my_vector * -- vector of the component doing the initialization * * @retval PAPI_OK Success * @retval PAPI_ECMP There was an error initializing * */ int _papi_libpfm4_init(papi_vector_t *my_vector) { int version; pfm_err_t retval = PFM_SUCCESS; _papi_hwi_lock( NAMELIB_LOCK ); if (!libpfm4_users) { retval = pfm_initialize(); if ( retval == PFM_SUCCESS ) { libpfm4_users++; } else { strncpy(my_vector->cmp_info.disabled_reason, pfm_strerror(retval),PAPI_MAX_STR_LEN-1); _papi_hwi_unlock( NAMELIB_LOCK ); return PAPI_ESBSTR; } } else { libpfm4_users++; } _papi_hwi_unlock( NAMELIB_LOCK ); /* get the libpfm4 version */ version=pfm_get_version( ); if (version >= 0) { /* Complain if the compiled-against version */ /* doesn't match current version */ if ( PFM_MAJ_VERSION( version ) != PFM_MAJ_VERSION( LIBPFM_VERSION ) ) { PAPIWARN( "Version mismatch of libpfm: " "compiled %#x vs. 
installed %#x\n", PFM_MAJ_VERSION( LIBPFM_VERSION ), PFM_MAJ_VERSION( version ) ); } /* Set the version */ sprintf( my_vector->cmp_info.support_version, "%d.%d", PFM_MAJ_VERSION( version ), PFM_MIN_VERSION( version ) ); } else { PAPIWARN( "pfm_get_version(): %s", pfm_strerror( retval ) ); } return PAPI_OK; } papi-papi-7-2-0-t/src/papi_libpfm4_events.h000066400000000000000000000016431502707512200205360ustar00rootroot00000000000000#ifndef _PAPI_LIBPFM4_EVENTS_H #define _PAPI_LIBPFM4_EVENTS_H /* * File: papi_libpfm4_events.h */ #include "perfmon/pfmlib.h" #include PEINCLUDE struct native_event_t { int component; char *pmu; int papi_event_code; int libpfm4_idx; char *allocated_name; char *base_name; char *mask_string; char *event_description; char *mask_description; char *pmu_plus_name; int cpu; int users; perf_event_attr_t attr; }; #define PMU_TYPE_CORE 1 #define PMU_TYPE_UNCORE 2 #define PMU_TYPE_OS 4 struct native_event_table_t { struct native_event_t *native_events; int num_native_events; int allocated_native_events; pfm_pmu_info_t default_pmu; int pmu_type; }; /* Prototypes for libpfm name library access */ int _papi_libpfm4_error( int pfm_error ); int _papi_libpfm4_shutdown(papi_vector_t *my_vector); int _papi_libpfm4_init(papi_vector_t *my_vector); #endif // _PAPI_LIBPFM4_EVENTS_H papi-papi-7-2-0-t/src/papi_libpfm_events.h000066400000000000000000000033371502707512200204540ustar00rootroot00000000000000#ifndef _PAPI_LIBPFM_EVENTS_H #define _PAPI_LIBPFM_EVENTS_H #include "papi.h" /* For PAPI_event_info_t */ #include "papi_vector.h" /* For papi_vector_t */ /* * File: papi_libpfm_events.h */ /* Prototypes for libpfm name library access */ int _papi_libpfm_error( int pfm_error ); int _papi_libpfm_setup_presets( char *name, int type, int cidx ); int _papi_libpfm_ntv_enum_events( unsigned int *EventCode, int modifier ); int _papi_libpfm_ntv_name_to_code( const char *ntv_name, unsigned int *EventCode ); int _papi_libpfm_ntv_code_to_name( unsigned int EventCode, char 
*name, int len );
int _papi_libpfm_ntv_code_to_descr( unsigned int EventCode, char *name,
				    int len );
int _papi_libpfm_ntv_code_to_bits( unsigned int EventCode,
				   hwd_register_t * bits );
int _papi_libpfm_ntv_code_to_bits_perfctr( unsigned int EventCode,
					   hwd_register_t * bits );
int _papi_libpfm_shutdown(void);
int _papi_libpfm_init(papi_vector_t *my_vector, int cidx);

int _pfm_decode_native_event( unsigned int EventCode, unsigned int *event,
			      unsigned int *umask );
unsigned int _pfm_convert_umask( unsigned int event, unsigned int umask );
int prepare_umask( unsigned int foo, unsigned int *values );

int _papi_libpfm_ntv_code_to_info(unsigned int EventCode,
				  PAPI_event_info_t *info);

/* Gross perfctr/perf_events compatability hack */
/* need to think up a better way to handle this */

#ifndef __PERFMON_PERF_EVENT_H__
struct perf_event_attr {
	int config;
	int type;
};

#define PERF_TYPE_RAW 4;
#endif /* !__PERFMON_PERF_EVENT_H__ */

extern int _papi_libpfm_setup_counters( struct perf_event_attr *attr,
					hwd_register_t *ni_bits );

#endif // _PAPI_LIBPFM_EVENTS_H

/* ===== src/papi_lock.h ===== */

#ifndef _PAPI_DEFINES_H
#define _PAPI_DEFINES_H

/* Thread related: locks */

#define INTERNAL_LOCK   PAPI_NUM_LOCK+0	/* papi_internal.c */
#define MULTIPLEX_LOCK  PAPI_NUM_LOCK+1	/* multiplex.c */
#define THREADS_LOCK    PAPI_NUM_LOCK+2	/* threads.c */
#define HIGHLEVEL_LOCK  PAPI_NUM_LOCK+3	/* papi_hl.c */
#define MEMORY_LOCK     PAPI_NUM_LOCK+4	/* papi_memory.c */
#define COMPONENT_LOCK  PAPI_NUM_LOCK+5	/* per-component */
#define GLOBAL_LOCK     PAPI_NUM_LOCK+6	/* papi.c for global variable (static and non) initialization/shutdown */
#define CPUS_LOCK       PAPI_NUM_LOCK+7	/* cpus.c */
#define NAMELIB_LOCK    PAPI_NUM_LOCK+8	/* papi_pfm4_events.c */
#define NUM_INNER_LOCK  9
#define PAPI_MAX_LOCK   (NUM_INNER_LOCK + PAPI_NUM_LOCK + PAPI_NUM_COMP)

#include OSLOCK

#endif
/* ===== src/papi_memory.c ===== */

/**
 * @file    papi_memory.c
 * @author  Kevin London
 *          london@cs.utk.edu
 *
 * PAPI memory allocation provides for checking and maintenance of all memory
 * allocated through this interface. Implemented as a series of wrappers around
 * standard C memory allocation routines, _papi_malloc and associated functions
 * add a prolog and optional epilog to each malloc'd pointer.
 * The prolog, sized to preserve memory alignment, contains a pointer to a
 * linked list of pmem_t structures that describe every block of memory
 * allocated through these calls.
 * The optional epilog is enabled if DEBUG is defined, and contains
 * a distinctive pattern that allows checking for pointer overflow.
 */

#define IN_MEM_FILE

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "papi.h"
#include "papi_lock.h"
#include "papi_memory.h"
#include "papi_internal.h"

/** Define the amount of extra memory at the beginning of the alloc'd pointer.
 * This is usually the size of a pointer, but in some cases needs to be bigger
 * to preserve data alignment.
 */
#define MEM_PROLOG (2*sizeof(void *))

/* If you are tracing memory, then DEBUG must be set also. */
#ifdef DEBUG

/** Define the amount of extra memory at the end of the alloc'd pointer.
 * Also define the contents: 0xCACA
 */
#define MEM_EPILOG 4
#define MEM_EPILOG_1 0xC
#define MEM_EPILOG_2 0xA
#define MEM_EPILOG_3 0xC
#define MEM_EPILOG_4 0xA
#endif

/* Local global variables */
static pmem_t *mem_head = NULL;

/* Local Prototypes */
static pmem_t *get_mem_ptr( void *ptr );
static pmem_t *init_mem_ptr( void *, int, const char *, int );
static void insert_mem_ptr( pmem_t * );
static void remove_mem_ptr( pmem_t * );
static int set_epilog( pmem_t * mem_ptr );

/**********************************************************************
 * Exposed papi versions of std memory management routines:          *
 *  _papi_realloc                                                    *
 *  _papi_calloc                                                     *
 *  _papi_malloc                                                     *
 *  _papi_strdup                                                     *
 *  _papi_free                                                       *
 *  _papi_valid_free                                                 *
 * Exposed useful papi memory maintenance routines:                  *
 *  _papi_mem_print_info                                             *
 *  _papi_mem_print_stats                                            *
 *  _papi_mem_overhead                                               *
 *  _papi_mem_cleanup_all                                            *
 *  _papi_mem_check_buf_overflow                                     *
 *  _papi_mem_check_all_overflow                                     *
 **********************************************************************/

/** _papi_realloc -- given a pointer returned by _papi_malloc, returns a pointer
 * to the related pmem_t structure describing this pointer.
 * Checks for NULL pointers and returns NULL if error.
*/
void *
_papi_realloc( const char *file, int line, void *ptr, size_t size )
{
	size_t nsize = size + MEM_PROLOG;
	pmem_t *mem_ptr;
	void *nptr;

#ifdef DEBUG
	nsize += MEM_EPILOG;
	_papi_hwi_lock( MEMORY_LOCK );
	_papi_mem_check_all_overflow( );
#endif

	if ( !ptr )
		return ( _papi_malloc( file, line, size ) );

	mem_ptr = get_mem_ptr( ptr );
	nptr = ( pmem_t * ) realloc( ( ( char * ) ptr - MEM_PROLOG ), nsize );

	if ( !nptr )
		return ( NULL );

	mem_ptr->size = ( int ) size;
	mem_ptr->ptr = ( char * ) nptr + MEM_PROLOG;
#ifdef DEBUG
	strncpy( mem_ptr->file, file, DEBUG_FILE_LEN );
	mem_ptr->file[DEBUG_FILE_LEN - 1] = '\0';
	mem_ptr->line = line;
	set_epilog( mem_ptr );
	_papi_hwi_unlock( MEMORY_LOCK );
#endif
	MEMDBG( "%p: Re-allocated: %lu bytes from File: %s  Line: %d\n",
			mem_ptr->ptr, ( unsigned long ) size, file, line );
	return ( mem_ptr->ptr );
}

void *
_papi_calloc( const char *file, int line, size_t nmemb, size_t size )
{
	void *ptr = _papi_malloc( file, line, size * nmemb );

	if ( !ptr )
		return ( NULL );
	memset( ptr, 0, size * nmemb );
	return ( ptr );
}

void *
_papi_malloc( const char *file, int line, size_t size )
{
	void *ptr;
	void **tmp;
	pmem_t *mem_ptr;
	size_t nsize = size + MEM_PROLOG;

#ifdef DEBUG
	nsize += MEM_EPILOG;
#endif

	if ( size == 0 ) {
		MEMDBG( "Attempting to allocate %lu bytes from File: %s  Line: %d\n",
				( unsigned long ) size, file, line );
		return ( NULL );
	}

	ptr = ( void * ) malloc( nsize );

	if ( !ptr )
		return ( NULL );
	else {
		if ( ( mem_ptr = init_mem_ptr( ( char * ) ptr + MEM_PROLOG,
					       ( int ) size, file, line ) ) == NULL ) {
			free( ptr );
			return ( NULL );
		}
		tmp = ptr;
		*tmp = mem_ptr;
		ptr = mem_ptr->ptr;
		mem_ptr->ptr = ptr;
		_papi_hwi_lock( MEMORY_LOCK );
		insert_mem_ptr( mem_ptr );
		set_epilog( mem_ptr );
		_papi_hwi_unlock( MEMORY_LOCK );
		MEMDBG( "%p: Allocated %lu bytes from File: %s  Line: %d\n",
				mem_ptr->ptr, ( unsigned long ) size, file, line );
		return ( ptr );
	}
	return ( NULL );
}

char *
_papi_strdup( const char *file, int line, const char *s )
{
	size_t size;
	char
*ptr;

	if ( !s )
		return ( NULL );

	/* String Length +1 for \0 */
	size = strlen( s ) + 1;
	ptr = ( char * ) _papi_malloc( file, line, size );

	if ( !ptr )
		return ( NULL );

	memcpy( ptr, s, size );
	return ( ptr );
}

/** Only frees the memory if PAPI malloced it
 * returns 1 if pointer was valid; 0 if not
 */
int
_papi_valid_free( const char *file, int line, void *ptr )
{
	pmem_t *tmp;
	int valid = 0;

	if ( !ptr ) {
		( void ) file;
		( void ) line;
		return ( 0 );
	}

	_papi_hwi_lock( MEMORY_LOCK );
	for ( tmp = mem_head; tmp; tmp = tmp->next ) {
		if ( ptr == tmp->ptr ) {
			pmem_t *mem_ptr = get_mem_ptr( ptr );

			if ( mem_ptr ) {
				MEMDBG( "%p: Freeing %d bytes from File: %s  Line: %d\n",
						mem_ptr->ptr, mem_ptr->size, file, line );
				remove_mem_ptr( mem_ptr );
				_papi_mem_check_all_overflow( );
			}
			valid = 1;
			break;
		}
	}
	_papi_hwi_unlock( MEMORY_LOCK );
	return ( valid );
}

/** Frees up the ptr */
void
_papi_free( const char *file, int line, void *ptr )
{
	pmem_t *mem_ptr = get_mem_ptr( ptr );

	if ( !mem_ptr ) {
		( void ) file;
		( void ) line;
		return;
	}

	MEMDBG( "%p: Freeing %d bytes from File: %s  Line: %d\n",
			mem_ptr->ptr, mem_ptr->size, file, line );
	_papi_hwi_lock( MEMORY_LOCK );
	remove_mem_ptr( mem_ptr );
	_papi_mem_check_all_overflow( );
	_papi_hwi_unlock( MEMORY_LOCK );
}

/** Print information about the memory including file and location it came from */
void
_papi_mem_print_info( void *ptr )
{
	pmem_t *mem_ptr = get_mem_ptr( ptr );

#ifdef DEBUG
	fprintf( stderr, "%p: Allocated %d bytes from File: %s  Line: %d\n",
			 ptr, mem_ptr->size, mem_ptr->file, mem_ptr->line );
#else
	fprintf( stderr, "%p: Allocated %d bytes\n", ptr, mem_ptr->size );
#endif
	return;
}

/** Print out all memory information */
void
_papi_mem_print_stats(  )
{
	pmem_t *tmp = NULL;

	_papi_hwi_lock( MEMORY_LOCK );
	for ( tmp = mem_head; tmp; tmp = tmp->next ) {
		_papi_mem_print_info( tmp->ptr );
	}
	_papi_hwi_unlock( MEMORY_LOCK );
}

/** Return the amount of memory overhead of the PAPI library and the memory system
 * PAPI_MEM_LIB_OVERHEAD is the
library overhead
 * PAPI_MEM_OVERHEAD is the memory overhead
 * They both can be | together
 * This only includes "malloc'd memory"
 */
int
_papi_mem_overhead( int type )
{
	pmem_t *ptr = NULL;
	int size = 0;

	_papi_hwi_lock( MEMORY_LOCK );
	for ( ptr = mem_head; ptr; ptr = ptr->next ) {
		if ( type & PAPI_MEM_LIB_OVERHEAD )
			size += ptr->size;
		if ( type & PAPI_MEM_OVERHEAD ) {
			size += ( int ) sizeof ( pmem_t );
			size += ( int ) MEM_PROLOG;
#ifdef DEBUG
			size += ( int ) MEM_EPILOG;
#endif
		}
	}
	_papi_hwi_unlock( MEMORY_LOCK );
	return size;
}

/** Clean all memory up and print out memory leak information to stderr */
void
_papi_mem_cleanup_all(  )
{
	pmem_t *ptr = NULL, *tmp = NULL;
#ifdef DEBUG
	int cnt = 0;
#endif

	_papi_hwi_lock( MEMORY_LOCK );
	_papi_mem_check_all_overflow( );
	for ( ptr = mem_head; ptr; ptr = tmp ) {
		tmp = ptr->next;
#ifdef DEBUG
		LEAKDBG( "MEMORY LEAK: %p of %d bytes, from File: %s  Line: %d\n",
				 ptr->ptr, ptr->size, ptr->file, ptr->line );
		cnt += ptr->size;
#endif
		remove_mem_ptr( ptr );
	}
	_papi_hwi_unlock( MEMORY_LOCK );
#ifdef DEBUG
	if ( 0 != cnt ) {
		LEAKDBG( "TOTAL MEMORY LEAK: %d bytes.\n", cnt );
	}
#endif
}

/* Loop through memory structures and look for buffer overflows
 * returns the number of overflows detected
 */

/**********************************************************************
 * Private helper routines for papi memory management                *
 **********************************************************************/

/* Given a pointer returned by _papi_malloc, returns a pointer
 * to the related pmem_t structure describing this pointer.
 * Checks for NULL pointers and returns NULL if error.
*/
static pmem_t *
get_mem_ptr( void *ptr )
{
	pmem_t **tmp_ptr = ( pmem_t ** ) ( ( char * ) ptr - MEM_PROLOG );
	pmem_t *mem_ptr;

	if ( !tmp_ptr || !ptr )
		return ( NULL );

	mem_ptr = *tmp_ptr;
	return ( mem_ptr );
}

/* Allocate and initialize a memory pointer */
pmem_t *
init_mem_ptr( void *ptr, int size, const char *file, int line )
{
	pmem_t *mem_ptr = NULL;

	if ( ( mem_ptr = ( pmem_t * ) malloc( sizeof ( pmem_t ) ) ) == NULL )
		return ( NULL );

	mem_ptr->ptr = ptr;
	mem_ptr->size = size;
	mem_ptr->next = NULL;
	mem_ptr->prev = NULL;
#ifdef DEBUG
	strncpy( mem_ptr->file, file, DEBUG_FILE_LEN );
	mem_ptr->file[DEBUG_FILE_LEN - 1] = '\0';
	mem_ptr->line = line;
#else
	( void ) file;	/*unused */
	( void ) line;	/*unused */
#endif
	return ( mem_ptr );
}

/* Insert the memory information
 * Do not lock these routines, but lock in routines using these
 */
static void
insert_mem_ptr( pmem_t * ptr )
{
	if ( !ptr )
		return;

	if ( !mem_head ) {
		mem_head = ptr;
		ptr->next = NULL;
		ptr->prev = NULL;
	} else {
		mem_head->prev = ptr;
		ptr->next = mem_head;
		mem_head = ptr;
	}
	return;
}

/* Remove the memory information pointer and free the memory.
 * Do not use locking in this routine; instead lock around
 * the sections of code that use this call.
*/
static void
remove_mem_ptr( pmem_t * ptr )
{
	if ( !ptr )
		return;

	if ( ptr->prev )
		ptr->prev->next = ptr->next;
	if ( ptr->next )
		ptr->next->prev = ptr->prev;
	if ( ptr == mem_head )
		mem_head = ptr->next;
	free( ptr );
}

static int
set_epilog( pmem_t * mem_ptr )
{
#ifdef DEBUG
	char *chptr = ( char * ) mem_ptr->ptr + mem_ptr->size;
	*chptr++ = MEM_EPILOG_1;
	*chptr++ = MEM_EPILOG_2;
	*chptr++ = MEM_EPILOG_3;
	*chptr++ = MEM_EPILOG_4;
	return ( _papi_mem_check_all_overflow( ) );
#else
	( void ) mem_ptr;	/*unused */
#endif
	return ( 0 );
}

/* Check for memory buffer overflows */
#ifdef DEBUG
static int
_papi_mem_check_buf_overflow( pmem_t * tmp )
{
	int fnd = 0;
	char *ptr;
	char *tptr;

	if ( !tmp )
		return ( 0 );

	tptr = tmp->ptr;
	tptr += tmp->size;

	/* Move to the buffer overflow padding */
	ptr = ( ( char * ) tmp->ptr ) + tmp->size;
	if ( *ptr++ != MEM_EPILOG_1 )
		fnd = 1;
	else if ( *ptr++ != MEM_EPILOG_2 )
		fnd = 2;
	else if ( *ptr++ != MEM_EPILOG_3 )
		fnd = 3;
	else if ( *ptr++ != MEM_EPILOG_4 )
		fnd = 4;

	if ( fnd ) {
		LEAKDBG( "Buffer Overflow[%d] for %p allocated from %s at line %d\n",
				 fnd, tmp->ptr, tmp->file, tmp->line );
	}
	return ( fnd );
}
#endif

int
_papi_mem_check_all_overflow(  )
{
	int fnd = 0;
#ifdef DEBUG
	pmem_t *tmp;

	for ( tmp = mem_head; tmp; tmp = tmp->next ) {
		if ( _papi_mem_check_buf_overflow( tmp ) )
			fnd++;
	}

	if ( fnd ) {
		LEAKDBG( "%d Total Buffer overflows detected!\n", fnd );
	}
#endif
	return ( fnd );
}

/* ===== src/papi_memory.h ===== */

#ifndef _PAPI_MALLOC
#define _PAPI_MALLOC

#include <stddef.h>	/* for size_t */

#define DEBUG_FILE_LEN 20

typedef struct pmem {
	void *ptr;
	int size;
#ifdef DEBUG
	char file[DEBUG_FILE_LEN];
	int line;
#endif
	struct pmem *next;
	struct pmem *prev;
} pmem_t;

#ifndef IN_MEM_FILE
#ifdef PAPI_NO_MEMORY_MANAGEMENT
#define papi_malloc(a) malloc(a)
#define papi_free(a) free(a)
#define papi_realloc(a,b) realloc(a,b)
#define papi_calloc(a,b) calloc(a,b)
#define papi_valid_free(a) 1
#define papi_strdup(a) strdup(a)
#define papi_mem_cleanup_all() ;
#define papi_mem_print_info(a) ;
#define papi_mem_print_stats() ;
#define papi_mem_overhead(a) ;
#define papi_mem_check_all_overflow() ;
#else
#define papi_malloc(a) _papi_malloc(__FILE__,__LINE__, a)
#define papi_free(a) _papi_free(__FILE__,__LINE__, a)
#define papi_realloc(a,b) _papi_realloc(__FILE__,__LINE__,a,b)
#define papi_calloc(a,b) _papi_calloc(__FILE__,__LINE__,a,b)
#define papi_valid_free(a) _papi_valid_free(__FILE__,__LINE__,a)
#define papi_strdup(a) _papi_strdup(__FILE__,__LINE__,a)
#define papi_mem_cleanup_all _papi_mem_cleanup_all
#define papi_mem_print_info(a) _papi_mem_print_info(a)
#define papi_mem_print_stats _papi_mem_print_stats
#define papi_mem_overhead(a) _papi_mem_overhead(a)
#define papi_mem_check_all_overflow _papi_mem_check_all_overflow
#endif
#endif

void *_papi_malloc( const char *, int, size_t );
void _papi_free( const char *, int, void * );
void *_papi_realloc( const char *, int, void *, size_t );
void *_papi_calloc( const char *, int, size_t, size_t );
int _papi_valid_free( const char *, int, void * );
char *_papi_strdup( const char *, int, const char *s );
void _papi_mem_cleanup_all(  );
void _papi_mem_print_info( void *ptr );
void _papi_mem_print_stats(  );
int _papi_mem_overhead( int );
int _papi_mem_check_all_overflow(  );

#define PAPI_MEM_LIB_OVERHEAD 1	/* PAPI Library Overhead */
#define PAPI_MEM_OVERHEAD 2	/* Memory Overhead */

#endif

/* ===== src/papi_preset.c ===== */

/*
 * File:    papi_preset.c
 * Author:  Haihang You
 *          you@cs.utk.edu
 * Mods:    Brian Sheely
 *          bsheely@eecs.utk.edu
 * Author:  Vince Weaver
 *          vweaver1 @ eecs.utk.edu
 *          Merge of the libpfm3/libpfm4/pmapi-ppc64_events preset code
 */

#include <stdio.h>
#include <string.h>
#include <ctype.h>

#include "papi.h"
#include "papi_internal.h"
#include "papi_vector.h"
#include "papi_memory.h"
#include "papi_preset.h"
#include "extras.h"

// A place to put user defined events
extern
hwi_presets_t user_defined_events[];
extern int user_defined_events_count;

extern int num_all_presets;
extern int _papi_hwi_start_idx[PAPI_NUM_COMP];

static int papi_load_derived_events (char *pmu_str, int pmu_type,
				     int cidx, int preset_flag);
static int papi_load_derived_events_component (char *comp_str, char *arch_str,
					       int cidx);

/* This routine copies values from a dense 'findem' array of events
   into the sparse global _papi_hwi_presets array, which is assumed
   to be empty at initialization.

   Multiple dense arrays can be copied into the sparse array, allowing
   event overloading at run-time, or allowing a baseline table to be
   augmented by a model specific table at init time. This method
   supports adding new events; overriding existing events, or
   deleting deprecated events.
*/
int
_papi_hwi_setup_all_presets( hwi_search_t * findem, int cidx )
{
	int i, pnum, did_something = 0;
	unsigned int preset_index, j, k;

	/* dense array of events is terminated with a 0 preset.
	   don't do anything if NULL pointer. This allows just notes
	   to be loaded. It's also good defensive programming.
	 */
	if ( findem != NULL ) {
		for ( pnum = 0; ( pnum < PAPI_MAX_PRESET_EVENTS ) &&
			  ( findem[pnum].event_code != 0 ); pnum++ ) {
			/* find the index for the event to be initialized */
			preset_index = ( findem[pnum].event_code & PAPI_PRESET_AND_MASK );
			/* count and set the number of native terms in this event,
			   these items are contiguous.

			   PAPI_EVENTS_IN_DERIVED_EVENT is arbitrarily defined in the
			   high level to be a reasonable number of terms to use in a
			   derived event linear expression, currently 8.

			   This wastes space for components with less than 8 counters,
			   but keeps the framework independent of the components.

			   The 'native' field below is an arbitrary opaque identifier
			   that points to information on an actual native event.
			   It is not an event code itself (whatever that might mean).
			   By definition, this value can never == PAPI_NULL.
- dkt */
			INTDBG( "Counting number of terms for preset index %d, "
					"search map index %d.\n", preset_index, pnum );
			i = 0;
			j = 0;
			while ( i < PAPI_EVENTS_IN_DERIVED_EVENT ) {
				if ( findem[pnum].native[i] != PAPI_NULL ) {
					j++;
				} else if ( j ) {
					break;
				}
				i++;
			}

			INTDBG( "This preset has %d terms.\n", j );
			_papi_hwi_presets[preset_index].count = j;

			// Set the component index to that of the first native event
			// used to define it. Make sure we later check that all native
			// events in a preset come from same comp.
			_papi_hwi_presets[preset_index].component_index =
				_papi_hwi_component_index(findem[pnum].native[0]);

			_papi_hwi_presets[preset_index].derived_int = findem[pnum].derived;
			for(k=0;kcmp_info.num_preset_events += did_something;
	return ( did_something ? PAPI_OK : PAPI_ENOEVNT );
}

int
_papi_hwi_cleanup_all_presets( void )
{
	int preset_index,cidx;
	unsigned int j;
	hwi_presets_t *_papi_hwi_list;

	for(cidx=0;cidx= 0 ) {
		if ( isblank( start[i] ) )
			start[i] = '\0';
		else
			break;
		i--;
	}
	return ( start );
}

/* Calls trim_string to remove blank space;
   Removes paired punctuation delimiters from beginning and end of string.
If the same punctuation appears first and last (quotes, slashes)
   they are trimmed;
   Also checks for the following pairs: () <> {} []
*/
static inline char *
trim_note( char *in )
{
	int len;
	char *note, start, end;

	note = trim_string( in );
	if ( note != NULL ) {
		len = ( int ) strlen( note );
		if ( len > 0 ) {
			if ( ispunct( *note ) ) {
				start = *note;
				end = note[len - 1];
				if ( ( start == end ) ||
					 ( ( start == '(' ) && ( end == ')' ) ) ||
					 ( ( start == '<' ) && ( end == '>' ) ) ||
					 ( ( start == '{' ) && ( end == '}' ) ) ||
					 ( ( start == '[' ) && ( end == ']' ) ) ) {
					note[len - 1] = '\0';
					*note = '\0';
					note++;
				}
			}
		}
	}
	return note;
}

static inline int
find_event_index(hwi_presets_t *array, int size, char *tmp)
{
	SUBDBG("ENTER: array: %p, size: %d, tmp: %s\n", array, size, tmp);
	int i;

	for (i = 0; i < size; i++) {
		if (array[i].symbol == NULL) {
			array[i].symbol = papi_strdup(tmp);
			SUBDBG("EXIT: i: %d\n", i);
			return i;
		}
		if (strcasecmp(tmp, array[i].symbol) == 0) {
			SUBDBG("EXIT: i: %d\n", i);
			return i;
		}
	}
	SUBDBG("EXIT: PAPI_EINVAL\n");
	return PAPI_EINVAL;
}

/* Look for an event file 'name' in a couple common locations.
Return a valid file handle if found
*/
static FILE *
open_event_table( char *name )
{
	FILE *table;

	SUBDBG( "Opening %s\n", name );
	table = fopen( name, "r" );
	if ( table == NULL ) {
		SUBDBG( "Open %s failed, trying ./%s.\n", name, PAPI_EVENT_FILE );
		sprintf( name, "%s", PAPI_EVENT_FILE );
		table = fopen( name, "r" );
	}
	if ( table == NULL ) {
		SUBDBG( "Open ./%s failed, trying ../%s.\n", name, PAPI_EVENT_FILE );
		sprintf( name, "../%s", PAPI_EVENT_FILE );
		table = fopen( name, "r" );
	}
	if ( table ) {
		SUBDBG( "Open %s succeeded.\n", name );
	}
	return table;
}

/* parse a single line from either a file or character table
   Strip trailing ; return 0 if empty */
static int
get_event_line( char *line, FILE * table, char **tmp_perfmon_events_table )
{
	int i;

	if ( table ) {
		if ( fgets( line, LINE_MAX, table ) == NULL)
			return 0;

		i = ( int ) strlen( line );
		if (i == 0)
			return 0;
		if ( line[i-1] == '\n' )
			line[i-1] = '\0';
		return 1;
	} else {
		for ( i = 0;
			  **tmp_perfmon_events_table && **tmp_perfmon_events_table != '\n';
			  i++, ( *tmp_perfmon_events_table )++ )
			line[i] = **tmp_perfmon_events_table;
		if (i == 0)
			return 0;
		if ( **tmp_perfmon_events_table && **tmp_perfmon_events_table == '\n' ) {
			( *tmp_perfmon_events_table )++;
		}
		line[i] = '\0';
		return 1;
	}
}

// update tokens in formula referring to index "old_index" with tokens
// referring to index "new_index".
static void
update_ops_string(char **formula, int old_index, int new_index)
{
	INTDBG("ENTER: *formula: %s, old_index: %d, new_index: %d\n",
	       *formula?*formula:"NULL", old_index, new_index);

	int cur_index;
	char *newFormula;
	char *subtoken;
	char *tok_save_ptr=NULL;

	// if formula is null just return
	if (*formula == NULL) {
		INTDBG("EXIT: Null pointer to formula passed in\n");
		return;
	}

	// get some space for the new formula we are going to create
	newFormula = papi_calloc(strlen(*formula) + 20, 1);

	// replace the specified "replace" tokens in the new original formula
	// with the new insertion formula
	newFormula[0] = '\0';
	subtoken = strtok_r(*formula, "|", &tok_save_ptr);
	while ( subtoken != NULL) {
		// INTDBG("subtoken: %s, newFormula: %s\n", subtoken, newFormula);
		char work[16];

		// if this is the token we want to replace with the new token index, do it now
		if ((subtoken[0] == 'N') && (isdigit(subtoken[1]))) {
			cur_index = atoi(&subtoken[1]);
			// if matches old index, use the new one
			if (cur_index == old_index) {
				sprintf (work, "N%d", new_index);
				strcat (newFormula, work);
			} else if (cur_index > old_index) {
				// current token greater than old index,
				// make it one less than what it was
				sprintf (work, "N%d", cur_index-1);
				strcat (newFormula, work);
			} else {
				// current token less than old index, copy this part
				// of the original formula into the new formula
				strcat(newFormula, subtoken);
			}
		} else {
			// copy this part of the original formula into the new formula
			strcat(newFormula, subtoken);
		}
		strcat (newFormula, "|");
		subtoken = strtok_r(NULL, "|", &tok_save_ptr);
	}

	papi_free (*formula);
	*formula = newFormula;
	INTDBG("EXIT: newFormula: %s\n", newFormula);
	return;
}

//
// Handle creating a new derived event of type DERIVED_ADD. This may create a new formula
// which can be used to compute the results of the new event from the events it depends on.
// This code is also responsible for making sure that all the needed native events are in the
// new events native event list and that the formula's references to this array are correct.
//
static void
ops_string_append(hwi_presets_t *results, hwi_presets_t *depends_on, int addition)
{
	INTDBG("ENTER: results: %p, depends_on: %p, addition %d\n",
	       results, depends_on, addition);

	int i;
	int second_event = 0;
	char newFormula[PAPI_MIN_STR_LEN] = "";
	char work[20];

	// if our results already have a formula, start with what was collected so far
	// this should only happen when processing the second event of a new derived add
	if (results->postfix != NULL) {
		INTDBG("Event %s has existing formula %s\n",
		       results->symbol, results->postfix);
		// get the existing formula
		strncat(newFormula, results->postfix, sizeof(newFormula)-1);
		newFormula[sizeof(newFormula)-1] = '\0';
		second_event = 1;
	}

	// process based on what kind of event the one we depend on is
	switch (depends_on->derived_int) {
	case DERIVED_POSTFIX: {
		// the event we depend on has a formula, append it to our new events formula
		// if event we depend on does not have a formula, report error
		if (depends_on->postfix == NULL) {
			INTDBG("Event %s is of type DERIVED_POSTFIX but is missing operation string\n",
			       depends_on->symbol);
			return;
		}

		// may need to renumber the native event index values in the depends on
		// event formula before putting it into new derived event
		char *temp = papi_strdup(depends_on->postfix);

		// If this is not the first event of the new derived add, need to adjust
		// native event index values in formula. At this time we assume that all
		// the native events in the second events formula are unique for the new
		// event and just bump the indexes by the number of events already known
		// to the new event. Later when we add the events to the native event list
		// for this new derived event, we will check to see if the native events
		// are already known to the new derived event and if so adjust the
		// indexes again.
		if (second_event) {
			for ( i=depends_on->count-1 ; i>=0 ; i--) {
				update_ops_string(&temp, i, results->count + i);
			}
		}

		// append the existing formula from the event we depend on
		// (but get rid of last '|' character)
		strncat(newFormula, temp, sizeof(newFormula)-1);
		newFormula[sizeof(newFormula)-1] = '\0';
		papi_free (temp);
		break;
	}
	case DERIVED_ADD: {
		// the event we depend on has no formula, create a formula for our new
		// event to add together the depends_on native event values

		// build a formula for this add event
		sprintf(work, "N%d|N%d|+|", results->count, results->count + 1);
		strcat(newFormula, work);
		break;
	}
	case DERIVED_SUB: {
		// the event we depend on has no formula, create a formula for our new
		// event to subtract the depends_on native event values

		// build a formula for this subtract event
		sprintf(work, "N%d|N%d|-|", results->count, results->count + 1);
		strcat(newFormula, work);
		break;
	}
	case NOT_DERIVED: {
		// the event we depend on has no formula and is itself only based on
		// one native event, create a formula for our new event to include
		// this native event

		// build a formula for this event
		sprintf(work, "N%d|", results->count);
		strcat(newFormula, work);
		break;
	}
	default: {
		// the event we depend on has unsupported derived type,
		// put out some debug and give up
		INTDBG("Event %s depends on event %s which has an unsupported derived type of %d\n",
		       results->symbol, depends_on->symbol, depends_on->derived_int);
		return;
	}
	}

	// if this was the second event, append to the formula an operation
	// to add or subtract the results of the two events
	if (second_event) {
		if (addition != 0) {
			strcat(newFormula, "+|");
		} else {
			strcat(newFormula, "-|");
		}
		// also change the new derived events type to show it has a formula now
		results->derived_int = DERIVED_POSTFIX;
	}

	// we need to free the existing space (created by malloc
	// and we need to create a new one)
	papi_free (results->postfix);
	results->postfix = papi_strdup(newFormula);
	INTDBG("EXIT: newFormula: %s\n", newFormula);
	return;
}

// merge the 'insertion' formula into the 'original' formula replacing the
// 'replaces' token in the 'original' formula.
static void
ops_string_merge(char **original, char *insertion, int replaces, int start_index)
{
	INTDBG("ENTER: original: %p, *original: %s, insertion: %s, replaces: %d, start_index: %d\n",
	       original, *original, insertion, replaces, start_index);

	int orig_len=0;
	int ins_len=0;
	char *subtoken;
	char *workBuf;
	char *workPtr;
	char *tok_save_ptr=NULL;
	char *newOriginal;
	char *newInsertion;
	char *newFormula;
	int insert_events;

	if (*original != NULL) {
		orig_len = strlen(*original);
	}
	if (insertion != NULL) {
		ins_len = strlen(insertion);
	}
	newFormula = papi_calloc (orig_len + ins_len + 40, 1);

	// if insertion formula is not provided, then the original formula
	// remains basically unchanged.
	if (insertion == NULL) {
		// if the original formula has a leading '|' then get rid of it
		workPtr = *original;
		if (workPtr[0] == '|') {
			strcpy(newFormula, &workPtr[1]);
		} else {
			strcpy(newFormula, workPtr);
		}
		// formula fields are always malloced space so free the previous one
		papi_free (*original);
		*original = newFormula;
		INTDBG("EXIT: newFormula: %s\n", *original);
		return;
	}

	// renumber the token numbers in the insertion formula
	// also count how many native events are used in this formula
	insert_events = 0;
	newInsertion = papi_calloc(ins_len+20, 1);
	workBuf = papi_calloc(ins_len+10, 1);
	workPtr = papi_strdup(insertion);

	subtoken = strtok_r(workPtr, "|", &tok_save_ptr);
	while ( subtoken != NULL) {
		// INTDBG("subtoken: %s, newInsertion: %s\n", subtoken, newInsertion);
		if ((subtoken[0] == 'N') && (isdigit(subtoken[1]))) {
			insert_events++;
			int val = atoi(&subtoken[1]);
			val += start_index;
			subtoken[1] = '\0';
			sprintf (workBuf, "N%d", val);
		} else {
			strcpy(workBuf, subtoken);
		}
		strcat (newInsertion, workBuf);
		strcat (newInsertion, "|");
		subtoken = strtok_r(NULL, "|", &tok_save_ptr);
	}
	papi_free (workBuf);
	papi_free (workPtr);
	INTDBG("newInsertion: %s\n",
	       newInsertion);

	// if original formula is not provided, then the updated insertion formula
	// becomes the new formula, but we still had to renumber the native event
	// tokens in case another native event was put into the list first
	if (*original == NULL) {
		*original = papi_strdup(newInsertion);
		INTDBG("EXIT: newFormula: %s\n", newInsertion);
		papi_free (newInsertion);
		papi_free (newFormula);
		return;
	}

	// if token to replace not valid, return null
	// (do we also need to check an upper bound ???)
	if ((replaces < 0)) {
		papi_free (newInsertion);
		papi_free (newFormula);
		INTDBG("EXIT: Invalid value for token in original formula to be replaced\n");
		return;
	}

	// renumber the token numbers in the original formula
	// tokens with an index greater than the replaces token need to be
	// incremented by number of events in insertion formula-1
	newOriginal = papi_calloc (orig_len+20, 1);
	workBuf = papi_calloc(orig_len+10, 1);
	workPtr = papi_strdup(*original);

	subtoken = strtok_r(workPtr, "|", &tok_save_ptr);
	while ( subtoken != NULL) {
		// INTDBG("subtoken: %s, newOriginal: %s\n", subtoken, newOriginal);
		// prime the work area with the next token, then see if we need to change it
		strcpy(workBuf, subtoken);
		if ((subtoken[0] == 'N') && (isdigit(subtoken[1]))) {
			int val = atoi(&subtoken[1]);
			if (val > replaces) {
				val += insert_events-1;
				subtoken[1] = '\0';
				sprintf (workBuf, "N%d", val);
			}
		}
		// put the work buffer into the new original formula
		strcat (newOriginal, workBuf);
		strcat (newOriginal, "|");
		subtoken = strtok_r(NULL, "|", &tok_save_ptr);
	}
	papi_free (workBuf);
	papi_free (workPtr);
	INTDBG("newOriginal: %s\n", newOriginal);

	// replace the specified "replace" tokens in the new original formula
	// with the new insertion formula
	newFormula[0] = '\0';
	workPtr = newOriginal;
	subtoken = strtok_r(workPtr, "|", &tok_save_ptr);
	while ( subtoken != NULL) {
		// INTDBG("subtoken: %s, newFormula: %s\n", subtoken, newFormula);
		// if this is the token we want to replace with the insertion string, do it now
		if
		   ((subtoken[0] == 'N') && (isdigit(subtoken[1])) &&
				(replaces == atoi(&subtoken[1]))) {
			// copy updated insertion string into the original string
			// (replacing this token)
			strcat(newFormula, newInsertion);
		} else {
			// copy this part of the original formula into the new formula
			strcat(newFormula, subtoken);
			strcat(newFormula, "|");
		}
		subtoken = strtok_r(NULL, "|", &tok_save_ptr);
	}
	papi_free (newInsertion);
	papi_free (workPtr);

	// formula fields are always malloced space so free the previous one
	papi_free (*original);
	*original = newFormula;
	INTDBG("EXIT: newFormula: %s\n", newFormula);
	return;
}

//
// Check to see if an event the new derived event being created depends on is known.
// We check both preset and user defined derived events here.
// If it is a known derived event then we set the new event being defined to include
// the necessary native events and formula to compute its derived value and use it
// in the correct context of the new derived event being created. Depending on the
// inputs, the operations strings (formulas) to be used by the new derived event may
// need to be created and/or adjusted to reference the correct native event indexes
// for the new derived event. The formulas processed by this code must be reverse
// polish notation (RPN) or postfix format and they must contain place holders
// (like N0, N1) which identify indexes into the native event array used to compute
// the new derived events final value.
//
// Arguments:
//   target:       event we are looking for
//   derived_type: type of derived event being created (add, subtract, postfix)
//   results:      where to build the new preset event being defined.
//   search:       table of known existing preset or user events the new derived
//                 event is allowed to use (points to a table of either preset
//                 or user events).
//   search_size:  number of entries in the search table.
//
static int check_derived_events(char *target, int derived_type, hwi_presets_t* results, hwi_presets_t * search, int search_size, int token_index)
{
	INTDBG("ENTER: target: %p (%s), results: %p, search: %p, search_size: %d, token_index: %d\n", target, target, results, search, search_size, token_index);
	unsigned int i;
	int j;
	int k;
	int found = 0;

	for (j=0; j < search_size; j++) {
		// INTDBG("search[%d].symbol: %s, looking for: %s\n", j, search[j].symbol, target);
		if (search[j].symbol == NULL) {
			INTDBG("EXIT: returned: 0\n");
			return 0;
		}
		// if not the event we depend on, just look at next
		if ( strcasecmp( target, search[j].symbol) != 0 ) {
			continue;
		}
		INTDBG("Found a match\n");

		// derived formulas need to be adjusted based on what kind of derived event we are processing
		// the derived type passed to this function is the type of the new event being defined (not the events it is based on)
		// when we get here the formula must be in reverse polish notation (RPN) format
		switch (derived_type) {
			case DERIVED_POSTFIX: {
				// go create a formula to merge the second formula into a spot identified by one of the tokens in the first formula.
				ops_string_merge(&(results->postfix), search[j].postfix, token_index, results->count);
				break;
			}
			case DERIVED_ADD: {
				// the new derived event adds two things together, go handle this target event's role in the add
				ops_string_append(results, &search[j], 1);
				break;
			}
			case DERIVED_SUB: {
				// go create a formula to subtract the value generated by the second formula from the value generated by the first formula.
				ops_string_append(results, &search[j], 0);
				break;
			}
			default: {
				INTDBG("Derived type: %d, not currently handled\n", derived_type);
				break;
			}
		}

		// copy event name and code used by the derived event into the results table (place where new derived event is getting created)
		for ( k = 0; k < (int)search[j].count; k++ ) {
			// INTDBG("search[%d]: %p, name[%d]: %s, code[%d]: %#x\n", j, &search[j], k, search[j].name[k], k, search[j].code[k]);
			// if this event is already in the list, just update the formula so that references to this event point to the existing one
			for (i=0 ; i < results->count ; i++) {
				if (results->code[i] == search[j].code[k]) {
					INTDBG("event: %s, code: %#x, already in results at index: %d\n", search[j].name[k], search[j].code[k], i);
					// replace all tokens in the formula that refer to index "results->count + found" with a token that refers to index "i".
					// the index "results->count + found" identifies the index used in the formula for the event we just determined is a duplicate
					update_ops_string(&(results->postfix), results->count + found, i);
					found++;
					break;
				}
			}
			// if we did not find a match, copy native event info into results array
			if (found == 0) {
				// not a duplicate, go ahead and copy into results and bump number of native events in results
				if (search[j].name[k]) {
					results->name[results->count] = papi_strdup(search[j].name[k]);
				} else {
					results->name[results->count] = papi_strdup(target);
				}
				results->code[results->count] = search[j].code[k];
				INTDBG("results: %p, name[%d]: %s, code[%d]: %#x\n", results, results->count, results->name[results->count], results->count, results->code[results->count]);
				results->count++;
			}
		}
		INTDBG("EXIT: returned: 1\n");
		return 1;
	}
	INTDBG("EXIT: returned: 0\n");
	return 0;
}

static int check_native_events(char *target, hwi_presets_t* results)
{
	INTDBG("ENTER: target: %p (%s), results: %p\n", target, target, results);
	int ret;

	// find this native event's code
	if ( ( ret = _papi_hwi_native_name_to_code( target, (int *)(&results->code[results->count])) ) != PAPI_OK ) {
		INTDBG("EXIT: returned: 0, call to convert name to event code failed with ret: %d\n", ret);
		return 0;
	}
	// if the code returned was 0, return to show it is not a valid native event
	if ( results->code[results->count] == 0 ) {
		INTDBG( "EXIT: returned: 0, event code not found\n");
		return 0;
	}
	// found = 1;
	INTDBG("\tFound a native event %s\n", target);
	INTDBG( "EXIT: returned: 1\n");
	return 1;
}

// see if the event_name string passed in matches a known event name
// if it does, these calls also update information in the event definition tables to remember the event
static int is_event(char *event_name, int derived_type, hwi_presets_t* results, int token_index)
{
	INTDBG("ENTER: event_name: %p (%s), derived_type: %d, results: %p, token_index: %d\n", event_name, event_name, derived_type, results, token_index);

	/* check if it's a preset event */
	if ( check_derived_events(event_name, derived_type, results, &_papi_hwi_presets[0], PAPI_MAX_PRESET_EVENTS, token_index) ) {
		INTDBG("EXIT: found preset event\n");
		return 1;
	}

	/* check if it's a user defined event */
	if ( check_derived_events(event_name, derived_type, results, user_defined_events, user_defined_events_count, token_index) ) {
		INTDBG("EXIT: found user event\n");
		return 1;
	}

	/* check if it's a native event */
	if ( check_native_events(event_name, results) ) {
		INTDBG("EXIT: found native event\n");
		return 1;
	}

	INTDBG("EXIT: event not found\n");
	return 0;
}

/* Static version of the events file.
 */
#if defined(STATIC_PAPI_EVENTS_TABLE)
#include "papi_events_table.h"
#else
static char *papi_events_table = NULL;
#endif

int _papi_load_preset_table(char *pmu_str, int pmu_type, int cidx)
{
	SUBDBG("ENTER: pmu_str: %s, pmu_type: %d, cidx: %d\n", pmu_str, pmu_type, cidx);
	int retval;

	// go load papi preset events (last argument tells function if we are loading presets or user events)
	retval = papi_load_derived_events(pmu_str, pmu_type, cidx, 1);
	if (retval != PAPI_OK) {
		SUBDBG("EXIT: retval: %d\n", retval);
		return retval;
	}

	// go load the user defined event definitions if any are defined
	retval = papi_load_derived_events(pmu_str, pmu_type, cidx, 0);
	SUBDBG("EXIT: retval: %d\n", retval);
	return retval;
}

int _papi_load_preset_table_component(char *comp_str, char *arch_str, int cidx)
{
	SUBDBG("ENTER: arch_str: %s, cidx: %d\n", arch_str, cidx);
	int retval;

	// go load papi preset events for component index 'cidx'
	retval = papi_load_derived_events_component(comp_str, arch_str, cidx);
	if (retval != PAPI_OK) {
		SUBDBG("EXIT: retval: %d\n", retval);
		return retval;
	}

	SUBDBG("EXIT: retval: %d\n", retval);
	return retval;
}

// global variables
static char stack[2*PAPI_HUGE_STR_LEN];  // stack
static int stacktop = -1;                // stack length

// priority: This function returns the priority of the operator
static int priority( char symbol )
{
	switch( symbol ) {
		case '@': return -1;
		case '(': return 0;
		case '+':
		case '-': return 1;
		case '*':
		case '/':
		case '%': return 2;
		default : return 0;
	} // end switch symbol
} // end priority

static int push( char symbol )
{
	if (stacktop >= 2*PAPI_HUGE_STR_LEN - 1) {
		INTDBG("stack overflow converting algebraic expression (%d,%c)\n", stacktop, symbol );
		return -1; //***TODO: Figure out how to exit gracefully
	} // end if stacktop>MAX
	stack[++stacktop] = symbol;
	return 0;
} // end push

// pop from stack
static char pop()
{
	if( stacktop < 0 ) {
		INTDBG("stack underflow converting algebraic expression\n" );
		return '\0'; //***TODO: Figure out how to exit gracefully
	} // end if empty
	return( stack[stacktop--] );
} // end pop

/* infix_to_postfix:
 *   routine that will be called with parameter:
 *     char *infix  characters of infix notation (algebraic formula)
 *   returns:
 *     char *       pointer to string of returned postfix
 */
static char * infix_to_postfix( char *infix )
{
	INTDBG("ENTER: in: %s, size: %zu\n", infix, strlen(infix));
	static char postfix[2*PAPI_HUGE_STR_LEN]; // output
	unsigned int index;
	int postfixlen;
	char token;

	if ( strlen(infix) > PAPI_HUGE_STR_LEN )
		PAPIERROR("An infix string (probably in user-defined presets) is too big (max allowed %d): %s", PAPI_HUGE_STR_LEN, infix );

	// initialize stack
	memset(stack, 0, 2*PAPI_HUGE_STR_LEN);
	stacktop = -1;
	push('#');
	stacktop = 0; // after initialization of stack to #

	/* initialize output string */
	memset(postfix, 0, 2*PAPI_HUGE_STR_LEN);
	postfixlen = 0;

	for( index = 0; index < strlen(infix); index++ ) {
		token = infix[index];
		switch( token ) {
			case '(': // open parenthesis: push onto the stack
				push( token );
				break;
			case ')': // close parenthesis: pop operators back to the matching '('
				while( stack[stacktop] != '(' ) {
					postfix[postfixlen++] = pop();
					postfix[postfixlen++] = '|';
				}
				pop(); // discard the '('
				break;
			case '+':
			case '-':
			case '*':
			case '/':
			case '%': // operator: first pop any higher-priority operators
				while( priority(stack[stacktop]) > priority(token) ) {
					postfix[postfixlen++] = pop();
					postfix[postfixlen++] = '|';
				}
				push( token ); /* save current operator */
				break;
			default: // if alphanumeric character which is not parenthesis or an operator
				postfix[postfixlen++] = token;
				break;
		} // end switch symbol
	} // end for

	/* Write any remaining operators */
	if (postfix[postfixlen-1] != '|')
		postfix[postfixlen++] = '|';
	while ( stacktop > 0 ) {
		postfix[postfixlen++] = pop();
		postfix[postfixlen++] = '|';
	}
	postfix[postfixlen++] = '\0';
	stacktop = -1;
	INTDBG("EXIT: postfix: %s, size: %zu\n", postfix, strlen(postfix));
	return (postfix);
} // end infix_to_postfix

/*
 * This function will load event definitions from either a file or an in-memory table. It is used to load both preset events,
 * which are defined by the PAPI development team and delivered with the product, and user defined events, which can be defined
 * by papi users and provided to papi to be processed at library initialization. Both the preset events and user defined events
 * support the same event definition syntax.
 *
 * Event definition file syntax:
 *   see PAPI_derived_event_files(1) man page.
 *
 * Blank lines are ignored.
 * Lines that begin with '#' are comments.
 * Lines that begin with 'CPU' identify a pmu name and have the following effect:
 *   If this pmu name does not match the pmu_str passed in, it is ignored and we get the next input line.
 *   If this pmu name matches the pmu_str passed in, we set a 'process events' flag.
 *   Multiple consecutive 'CPU' lines may be provided and if any of them match the pmu_str passed in, we set a 'process events' flag.
 *   When a 'CPU' line is found following event definition lines, it turns off the 'process events' flag and then does the above checks.
 * Lines that begin with 'PRESET' or 'EVENT' specify an event definition and are processed as follows:
 *   If the 'process events' flag is not set, the line is ignored and we get the next input line.
 *   If the 'process events' flag is set, the event is processed and the event information is put into the next slot in the results array.
 *
 * There are three possible sources of input for preset event definitions. The code will first look for the environment variable
 * "PAPI_CSV_EVENT_FILE". If found, its value will be used as the pathname of where to get the preset information. If not found,
 * the code will look for a built-in table containing preset events. If the built-in table was not created during the build of
 * PAPI then the code will build a pathname of the form "PAPI_DATADIR/PAPI_EVENT_FILE". Each of these are build variables; the
 * PAPI_DATADIR variable can be given a value during the configure of PAPI at build time, and the PAPI_EVENT_FILE variable has a
 * hard coded value of "papi_events.csv".
 *
 * There is only one way to define user events. The code will look for an environment variable "PAPI_USER_EVENTS_FILE". If found,
 * its value will be used as the pathname of a file which contains user event definitions. The events defined in this file will be
 * added to the ones known by PAPI when the call to PAPI_library_init is done.
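For illustration, a minimal event definition file in the syntax described above might look like the fragment below. The PMU name and native event names are hypothetical; real definitions live in papi_events.csv.

```
# Comment lines start with '#'
CPU,example_pmu
PRESET,PAPI_L1_TCM,DERIVED_ADD,L1D_MISSES,L1I_MISSES
PRESET,PAPI_FP_RATIO,DERIVED_POSTFIX,N0|N1|/|,FP_OPS,TOT_CYCLES,LDESC,Floating point operations per cycle
```

The first PRESET sums two native counters; the second uses a postfix formula whose N0 and N1 place holders index the native events listed after it.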
 *
 * TODO:
 *   Look into restoring the ability to specify a user defined event file with a call to PAPI_set_opt(PAPI_USER_EVENTS_FILE).
 *   This needs to figure out how to pass a pmu name (could use default pmu from component 0) to this function.
 *
 * Currently code elsewhere in PAPI limits the events which preset and user events can depend on to those events which are known
 * to component 0. This possibly could be relaxed to allow events from different components. But since all the events used by any
 * derived event must be added to the same eventset, it will always be a requirement that all events used by a given derived
 * event must be from the same component.
 */
static int papi_load_derived_events (char *pmu_str, int pmu_type, int cidx, int preset_flag)
{
	SUBDBG( "ENTER: pmu_str: %s, pmu_type: %d, cidx: %d, preset_flag: %d\n", pmu_str, pmu_type, cidx, preset_flag);
	char pmu_name[PAPI_MIN_STR_LEN];
	char line[LINE_MAX];
	char name[PATH_MAX] = "builtin papi_events_table";
	char *event_file_path = NULL;
	char *event_table_ptr = NULL;
	int event_type_bits = 0;
	char *tmpn;
	char *tok_save_ptr = NULL;
	FILE *event_file = NULL;
	hwi_presets_t *results = NULL;
	int result_size = 0;
	int *event_count = NULL;
	int invalid_event;
	int line_no = 0;      /* count of lines read from event definition input */
	int derived = 0;
	int res_idx = 0;      /* index into results array for where to store next event */
	int preset = 0;
	int get_events = 0;   /* only process derived events after CPU type they apply to is identified */
	int found_events = 0; /* flag to track if event definitions (PRESETS) are found since last CPU declaration */
	int breakAfter = 0;   /* flag to break parsing events file if component 'arch' has already been parsed */
#ifdef PAPI_DATADIR
	char path[PATH_MAX];
#endif

	if (preset_flag) {
		/* try the environment variable first */
		if ((tmpn = getenv("PAPI_CSV_EVENT_FILE")) && (strlen(tmpn) > 0)) {
			event_file_path = tmpn;
		}
		/* if no valid environment variable, look for built-in table */
		else if (papi_events_table) {
			event_table_ptr = papi_events_table;
		}
		/* if no env var and no built-in, search for default file */
		else {
#ifdef PAPI_DATADIR
			sprintf( path, "%s/%s", PAPI_DATADIR, PAPI_EVENT_FILE );
			event_file_path = path;
#else
			event_file_path = PAPI_EVENT_FILE;
#endif
		}
		event_type_bits = PAPI_PRESET_MASK;
		results = &_papi_hwi_presets[0];
		result_size = PAPI_MAX_PRESET_EVENTS;
		event_count = &_papi_hwd[cidx]->cmp_info.num_preset_events;
	} else {
		if ((event_file_path = getenv( "PAPI_USER_EVENTS_FILE" )) == NULL ) {
			SUBDBG("EXIT: User event definition file not provided.\n");
			return PAPI_OK;
		}
		event_type_bits = PAPI_UE_MASK;
		results = &user_defined_events[0];
		result_size = PAPI_MAX_USER_EVENTS;
		event_count = &user_defined_events_count;
	}

	// if we have an event file pathname, open it and read event definitions from the file
	if (event_file_path != NULL) {
		if ((event_file = open_event_table(event_file_path)) == NULL) {
			// if file open fails, return an error
			SUBDBG("EXIT: Event file open failed.\n");
			return PAPI_ESYS;
		}
		strncpy(name, event_file_path, sizeof(name)-1);
		name[sizeof(name)-1] = '\0';
	} else if (event_table_ptr == NULL) {
		// if we do not have a path name or table pointer, return an error
		SUBDBG("EXIT: Both event_file_path and event_table_ptr are NULL.\n");
		return PAPI_ESYS;
	}

	/* copy the pmu identifier, stripping commas if found */
	tmpn = pmu_name;
	while (*pmu_str) {
		if (*pmu_str != ',')
			*tmpn++ = *pmu_str;
		pmu_str++;
	}
	*tmpn = '\0';

	/* at this point we have either a valid file pointer or built-in table pointer */
	while (get_event_line(line, event_file, &event_table_ptr)) {
		char *t;
		int i;

		// increment number of lines we have read
		line_no++;

		t = trim_string(strtok_r(line, ",", &tok_save_ptr));

		/* Skip blank lines */
		if ((t == NULL) || (strlen(t) == 0))
			continue;

		/* Skip comments */
		if (t[0] == '#') {
			continue;
		}

		if (strcasecmp(t, "CPU") == 0) {
			if (get_events != 0 && found_events != 0) {
				SUBDBG( "Ending event scanning at line %d of %s.\n", line_no, name);
				get_events = 0;
				found_events = 0;
			}

			t = trim_string(strtok_r(NULL, ",", &tok_save_ptr));
			if ((t == NULL) || (strlen(t) == 0)) {
				PAPIERROR("Expected name after CPU token at line %d of %s -- ignoring", line_no, name);
				continue;
			}

			if (strcasecmp(t, pmu_name) == 0) {
				int type;

				breakAfter = 1;
				SUBDBG( "Process events for PMU %s found at line %d of %s.\n", t, line_no, name);

				t = trim_string(strtok_r(NULL, ",", &tok_save_ptr));
				if ((t == NULL) || (strlen(t) == 0)) {
					SUBDBG("No additional qualifier found, matching on string.\n");
					get_events = 1;
				} else if ((sscanf(t, "%d", &type) == 1) && (type == pmu_type)) {
					SUBDBG( "Found CPU %s type %d at line %d of %s.\n", pmu_name, type, line_no, name);
					get_events = 1;
				} else {
					SUBDBG( "Additional qualifier match failed %d vs %d.\n", pmu_type, type);
				}
			}
			continue;
		} else if ((strcasecmp(t, "PRESET") == 0) || (strcasecmp(t, "EVENT") == 0)) {
			if (get_events == 0)
				continue;

			found_events = 1;

			t = trim_string(strtok_r(NULL, ",", &tok_save_ptr));
			if ((t == NULL) || (strlen(t) == 0)) {
				PAPIERROR("Expected name after PRESET token at line %d of %s -- ignoring", line_no, name);
				continue;
			}
			SUBDBG( "Examining event %s\n", t);

			// see if this event already exists in the results array; if not already known, this sets up the event in an unused entry
			if ((res_idx = find_event_index (results, result_size, t)) < 0) {
				PAPIERROR("No room left for event %s -- ignoring", t);
				continue;
			}

			// add the proper event bits (preset or user defined bits)
			preset = res_idx | event_type_bits;
			(void) preset;

			SUBDBG( "Use event code: %#x for %s\n", preset, t);

			unsigned int preset_index = ( preset & PAPI_PRESET_AND_MASK );
			_papi_hwi_presets[preset_index].component_index = cidx;

			t = trim_string(strtok_r(NULL, ",", &tok_save_ptr));
			if ((t == NULL) || (strlen(t) == 0)) {
				// got an error, make this entry unused
				if (results[res_idx].symbol != NULL) {
					papi_free (results[res_idx].symbol);
					results[res_idx].symbol = NULL;
				}
				PAPIERROR("Expected derived type after PRESET token at line %d of %s -- ignoring", line_no, name);
				continue;
			}
			if (_papi_hwi_derived_type(t, &derived) != PAPI_OK) {
				// got an error, make this entry unused
				if (results[res_idx].symbol != NULL) {
					papi_free (results[res_idx].symbol);
					results[res_idx].symbol = NULL;
				}
				PAPIERROR("Invalid derived name %s after PRESET token at line %d of %s -- ignoring", t, line_no, name);
				continue;
			}

			/****************************************/
			/* Have an event, let's start assigning */
			/****************************************/

			SUBDBG( "Adding event: %s, code: %#x, derived: %d results[%d]: %p.\n", t, preset, derived, res_idx, &results[res_idx]);

			/* results[res_idx].event_code = preset; */
			results[res_idx].derived_int = derived;

			/* Derived support starts here */
			/* Special handling for postfix and infix */
			if ((derived == DERIVED_POSTFIX) || (derived == DERIVED_INFIX)) {
				t = trim_string(strtok_r(NULL, ",", &tok_save_ptr));
				if ((t == NULL) || (strlen(t) == 0)) {
					// got an error, make this entry unused
					if (results[res_idx].symbol != NULL) {
						papi_free (results[res_idx].symbol);
						results[res_idx].symbol = NULL;
					}
					PAPIERROR("Expected Operation string after derived type DERIVED_POSTFIX or DERIVED_INFIX at line %d of %s -- ignoring", line_no, name);
					continue;
				}

				// if it is an algebraic formula, we need to convert it to postfix
				if (derived == DERIVED_INFIX) {
					SUBDBG( "Converting InFix operations %s\n", t);
					t = infix_to_postfix( t );
					results[res_idx].derived_int = DERIVED_POSTFIX;
				}
				SUBDBG( "Saving PostFix operations %s\n", t);
				results[res_idx].postfix = papi_strdup(t);
			}

			/* All derived terms collected here */
			i = 0;
			invalid_event = 0;
			results[res_idx].count = 0;
			do {
				t = trim_string(strtok_r(NULL, ",", &tok_save_ptr));
				if ((t == NULL) || (strlen(t) == 0))
					break;
				if (strcasecmp(t, "NOTE") == 0)
					break;
				if (strcasecmp(t, "LDESC") == 0)
					break;
				if (strcasecmp(t, "SDESC") == 0)
					break;

				SUBDBG( "Adding term (%d) %s to derived event %#x, current native event count: %d.\n", i, t, preset, results[res_idx].count);

				// show that we do not have an event code yet (the component may create one and update this info)
				// this also clears any values left over from a previous call
				_papi_hwi_set_papi_event_code(-1, -1);

				// make sure that this term in the derived event is a valid event name
				// this call replaces preset and user event names with the equivalent native events in our results table
				// it also updates formulas for derived events so that they refer to the correct native event index
				if (is_event(t, results[res_idx].derived_int, &results[res_idx], i) == 0) {
					invalid_event = 1;
					PAPIERROR("Missing event %s, used in derived event %s", t, results[res_idx].symbol);
					break;
				}

				/* If it is a valid event, then update the preset fields here. */
				/* Initially, the event name should be those with a default, mandatory qualifiers. */
				results[res_idx].name[results[res_idx].count] = strdup(t);
				results[res_idx].base_name[results[res_idx].count] = strdup(t);
				results[res_idx].default_name[results[res_idx].count] = strdup(t);
				results[res_idx].default_code[results[res_idx].count] = results[res_idx].code[results[res_idx].count];

				results[res_idx].count++;
				i++;
			} while (results[res_idx].count < PAPI_EVENTS_IN_DERIVED_EVENT);

			/* preset code list must be PAPI_NULL terminated */
			if (i < PAPI_EVENTS_IN_DERIVED_EVENT) {
				results[res_idx].code[results[res_idx].count] = PAPI_NULL;
			}

			if (invalid_event) {
				// got an error, make this entry unused
				// preset table is statically allocated, user defined is dynamic
				unsigned int j;
				for (j = 0; j < results[res_idx].count; j++) {
					if (results[res_idx].name[j] != NULL) {
						papi_free( results[res_idx].name[j] );
						results[res_idx].name[j] = NULL;
					}
				}
				if (!preset_flag) {
					if (results[res_idx].symbol != NULL) {
						papi_free (results[res_idx].symbol);
						results[res_idx].symbol = NULL;
					}
				}
				continue;
			}

			/* End of derived support */

			// if we did not find any terms to base this derived event on, report error
			if (i == 0) {
				// got an error, make this entry unused
				if (!preset_flag) {
					if (results[res_idx].symbol != NULL) {
						papi_free (results[res_idx].symbol);
						results[res_idx].symbol = NULL;
					}
				}
				PAPIERROR("Expected PFM event after DERIVED token at line %d of %s -- ignoring", line_no, name);
				continue;
			}

			if (i == PAPI_EVENTS_IN_DERIVED_EVENT) {
				t = trim_string(strtok_r(NULL, ",", &tok_save_ptr));
			}

			// if something was provided following the list of events to be used by the operation, process it
			if ( t != NULL && strlen(t) > 0 ) {
				do {
					// save the field name
					char *fptr = papi_strdup(t);

					// get the value to be used with this field
					t = trim_note(strtok_r(NULL, ",", &tok_save_ptr));
					if ( t == NULL || strlen(t) == 0 ) {
						papi_free(fptr);
						break;
					}

					// Handle optional short descriptions, long descriptions and notes
					if (strcasecmp(fptr, "SDESC") == 0) {
						results[res_idx].short_descr = papi_strdup(t);
					}
					if (strcasecmp(fptr, "LDESC") == 0) {
						results[res_idx].long_descr = papi_strdup(t);
					}
					if (strcasecmp(fptr, "NOTE") == 0) {
						results[res_idx].note = papi_strdup(t);
					}

					SUBDBG( "Found %s (%s) on line %d\n", fptr, t, line_no);
					papi_free (fptr);

					// look for another field name
					t = trim_string(strtok_r(NULL, ",", &tok_save_ptr));
					if ( t == NULL || strlen(t) == 0 ) {
						break;
					}
				} while (t != NULL);
			}

			(*event_count)++;
			continue;
		} else {
			if ( breakAfter )
				break; // Break this while-loop once all presets for the given component's arch have been parsed.
		}
		//PAPIERROR("Unrecognized token %s at line %d of %s -- ignoring", t, line_no, name);
	}

	if (event_file) {
		fclose(event_file);
	}
	SUBDBG("EXIT: Done processing derived event file.\n");
	return PAPI_OK;
}

static int papi_load_derived_events_component (char *comp_str, char *arch_str, int cidx)
{
	SUBDBG( "ENTER: arch_str: %s, cidx: %d\n", arch_str, cidx);
	char arch_name[PAPI_MIN_STR_LEN];
	char line[LINE_MAX];
	char name[PATH_MAX] = "builtin papi_events_table";
	char *event_file_path = NULL;
	char *event_table_ptr = NULL;
	int event_type_bits = 0;
	char *tmpn;
	char *tok_save_ptr = NULL;
	FILE *event_file = NULL;
	hwi_presets_t *results = NULL;
	int result_size = 0;
	int *event_count = NULL;
	int invalid_event;
	int line_no = 0;      /* count of lines read from event definition input */
	int derived = 0;
	int res_idx = 0;      /* index into results array for where to store next event */
	int preset = 0;
	int get_events = 0;   /* only process derived events after CPU type they apply to is identified */
	int found_events = 0; /* flag to track if event definitions (PRESETS) are found since last CPU declaration */
	int breakAfter = 0;   /* flag to break parsing events file if component 'arch' has already been parsed */
	int status = 0;
#ifdef PAPI_DATADIR
	char path[PATH_MAX];
#endif

	/* try the environment variable first */
	if ((tmpn = getenv("PAPI_CSV_EVENT_FILE")) && (strlen(tmpn) > 0)) {
		event_file_path = tmpn;
	}
	/* if no valid environment variable, look for built-in table */
	else if (papi_events_table) {
		event_table_ptr = papi_events_table;
	}
	/* if no env var and no built-in, search for default file */
	else {
#ifdef PAPI_DATADIR
		sprintf( path, "%s/%s", PAPI_DATADIR, PAPI_EVENT_FILE );
		event_file_path = path;
#else
		event_file_path = PAPI_EVENT_FILE;
#endif
	}
	event_type_bits = PAPI_PRESET_MASK;
	results = &_papi_hwi_comp_presets[cidx][0];
	result_size = _papi_hwi_max_presets[cidx];
	event_count = &_papi_hwd[cidx]->cmp_info.num_preset_events;

	// if we have an event file pathname, open it and read event definitions from the file
	if (event_file_path != NULL) {
		if ((event_file = open_event_table(event_file_path)) == NULL) {
			// if file open fails, return an error
			SUBDBG("EXIT: Event file open failed.\n");
			return PAPI_ESYS;
		}
		strncpy(name, event_file_path, sizeof(name)-1);
		name[sizeof(name)-1] = '\0';
	} else if (event_table_ptr == NULL) {
		// if we do not have a path name or table pointer, return an error
		SUBDBG("EXIT: Both event_file_path and event_table_ptr are NULL.\n");
		return PAPI_ESYS;
	}

	/* copy the arch identifier, stripping commas if found */
	tmpn = arch_name;
	while (*arch_str) {
		if (*arch_str != ',')
			*tmpn++ = *arch_str;
		arch_str++;
	}
	*tmpn = '\0';

	/* at this point we have either a valid file pointer or built-in table pointer */
	while (get_event_line(line, event_file, &event_table_ptr)) {
		char *t;
		int i;

		// increment number of lines we have read
		line_no++;

		t = trim_string(strtok_r(line, ",", &tok_save_ptr));

		/* Skip blank lines */
		if ((t == NULL) || (strlen(t) == 0))
			continue;

		/* Skip comments */
		if (t[0] == '#') {
			continue;
		}

		if (strcasecmp(t, comp_str) == 0) {
			if (get_events != 0 && found_events != 0) {
				SUBDBG( "Ending event scanning at line %d of %s.\n", line_no, name);
				get_events = 0;
				found_events = 0;
			}

			t = trim_string(strtok_r(NULL, ",", &tok_save_ptr));
			if ((t == NULL) || (strlen(t) == 0)) {
				PAPIERROR("Expected name after component-name token at line %d of %s -- ignoring", line_no, name);
				continue;
			}

			if (strcasecmp(t, arch_name) == 0) {
				breakAfter = 1;
				SUBDBG( "Process events for ARCH %s found at line %d of %s.\n", t, line_no, name);

				t = trim_string(strtok_r(NULL, ",", &tok_save_ptr));
				if ((t == NULL) || (strlen(t) == 0)) {
					SUBDBG("No additional qualifier found, matching on string.\n");
					get_events = 1;
				}
			}
			continue;
		} else if ((strcasecmp(t, "PRESET") == 0) || (strcasecmp(t, "EVENT") == 0)) {
			if (get_events == 0)
				continue;

			found_events = 1;

			t = trim_string(strtok_r(NULL, ",", &tok_save_ptr));
			if ((t == NULL) || (strlen(t) == 0)) {
				PAPIERROR("Expected name after PRESET token at line %d of %s -- ignoring", line_no, name);
				continue;
			}
			SUBDBG( "Examining event %s\n", t);

			// see if this event already exists in the results array; if not already known, this sets up the event in an unused entry
			if ((res_idx = find_event_index (results, result_size, t)) < 0) {
				PAPIERROR("No room left for event %s -- ignoring", t);
				continue;
			}

			// add the proper event bits (preset or user defined bits)
			preset = res_idx | event_type_bits;
			(void) preset;

			SUBDBG( "Use event code: %#x for %s\n", preset, t);

			unsigned int preset_index = ( preset & PAPI_PRESET_AND_MASK );
			_papi_hwi_comp_presets[cidx][preset_index].component_index = cidx;

			t = trim_string(strtok_r(NULL, ",", &tok_save_ptr));
			if ((t == NULL) || (strlen(t) == 0)) {
				// got an error, make this entry unused
				if (results[res_idx].symbol != NULL) {
					papi_free (results[res_idx].symbol);
					results[res_idx].symbol = NULL;
				}
				PAPIERROR("Expected derived type after PRESET token at line %d of %s -- ignoring", line_no, name);
				continue;
			}
			if (_papi_hwi_derived_type(t, &derived) != PAPI_OK) {
				// got an error, make this entry unused
				if (results[res_idx].symbol != NULL) {
					papi_free (results[res_idx].symbol);
					results[res_idx].symbol = NULL;
				}
				PAPIERROR("Invalid derived name %s after PRESET token at line %d of %s -- ignoring", t, line_no, name);
				continue;
			}

			/****************************************/
			/* Have an event, let's start assigning */
			/****************************************/

			SUBDBG( "Adding event: %s, code: %#x, derived: %d results[%d]: %p.\n", t, preset, derived, res_idx, &results[res_idx]);

			results[res_idx].derived_int = derived;

			/* Derived support starts here */
			/* Special handling for postfix and infix */
			if ((derived == DERIVED_POSTFIX) || (derived == DERIVED_INFIX)) {
				t = trim_string(strtok_r(NULL, ",", &tok_save_ptr));
				if ((t == NULL) || (strlen(t) == 0)) {
					// got an error, make this entry unused
					if (results[res_idx].symbol != NULL) {
						papi_free (results[res_idx].symbol);
						results[res_idx].symbol = NULL;
					}
					PAPIERROR("Expected Operation string after derived type DERIVED_POSTFIX or DERIVED_INFIX at line %d of %s -- ignoring", line_no, name);
					continue;
				}

				// if it is an algebraic formula, we need to convert it to postfix
				if (derived == DERIVED_INFIX) {
					SUBDBG( "Converting InFix operations %s\n", t);
					t = infix_to_postfix( t );
					results[res_idx].derived_int = DERIVED_POSTFIX;
				}
				SUBDBG( "Saving PostFix operations %s\n", t);
				results[res_idx].postfix = papi_strdup(t);
			}

			/* All derived terms collected here */
			i = 0;
			invalid_event = 0;
			results[res_idx].count = 0;
			int firstTerm = 1;
			do {
				t = trim_string(strtok_r(NULL, ",", &tok_save_ptr));
				if ((t == NULL) || (strlen(t) == 0))
					break;
				if (strcasecmp(t, "NOTE") == 0)
					break;
				if (strcasecmp(t, "LDESC") == 0)
					break;
				if (strcasecmp(t, "SDESC") == 0)
					break;

				SUBDBG( "Adding term (%d) %s to derived event %#x, current native event count: %d.\n", i, t, preset, results[res_idx].count);

				// show that we do not have an event code yet (the component may create one and update this info)
				// this also clears any values left over from a previous call
				_papi_hwi_set_papi_event_code(-1, -1);

				unsigned int eventCode;
				char *tmpEvent, *tmpQuals;
				char *qualDelim = ":";
				PAPI_event_info_t eventInfo;
				hwi_presets_t *prstPtr = &(_papi_hwi_comp_presets[cidx][preset_index]);

				if( firstTerm ) {
					// Convert native event to code and check that it's valid.
					status = _papi_hwi_native_name_to_code(t, &eventCode);
					if( status != PAPI_OK ) {
						invalid_event = 1;
						PAPIERROR("Failed to get code for native event %s, used in derived event %s", t, results[res_idx].symbol);
						break;
					}

					// Call get_event_info, and use the qualifier string that comes after the
					// single instance of ":" and the description that comes after "masks:"
					status = _papi_hwi_get_native_event_info( (unsigned int)eventCode, &eventInfo );
					if ( status != PAPI_OK ) {
						invalid_event = 1;
						PAPIERROR("Failed to get info for native event %s, used in derived event %s", t, results[res_idx].symbol);
						break;
					}

					/* Get the qualifiers. */
					char *wholeName = strdup(eventInfo.symbol);
					char *qualPtr = strtok( wholeName, qualDelim );
					/* Skip over PMU name or component prefix. */
					qualPtr = strtok( NULL, qualDelim );
					/* Skip over basename. */
					qualPtr = strtok( NULL, qualDelim );
					while( qualPtr != NULL ) {
						/* Store the qualifier in the preset struct. */
						size_t qualLen = 1+strlen(qualDelim)+strlen(qualPtr);
						prstPtr->quals[prstPtr->num_quals] = (char*)malloc(qualLen*sizeof(char));
						if( NULL != prstPtr->quals[prstPtr->num_quals] ) {
							status = snprintf(prstPtr->quals[prstPtr->num_quals], qualLen, "%s%s", qualDelim, qualPtr);
							if( status < 0 || status >= (int)qualLen ) {
								invalid_event = 1;
								PAPIERROR("Failed to store qualifier for native event %s, used in derived event %s", t, results[res_idx].symbol);
								break;
							}
							prstPtr->num_quals++;
						}
						qualPtr = strtok( NULL, qualDelim );
					}
					free(wholeName);

					/* Get the qualifier descriptions. */
					int count = 0;
					char *desc = strdup(eventInfo.long_descr);
					char *descStart = strstr( desc, "masks:" );
					char *descPtr = strtok( descStart, qualDelim );
					/* Skip over 'masks'. */
					descPtr = strtok( NULL, qualDelim );
					while( descPtr != NULL ) {
						/* Store the qualifier's description in the preset struct. */
						size_t descLen = 1+strlen(descPtr);
						prstPtr->quals_descrs[count] = (char*)malloc(descLen*sizeof(char));
						if( NULL != prstPtr->quals_descrs[count] ) {
							status = snprintf(prstPtr->quals_descrs[count], descLen, "%s", descPtr);
							if( status < 0 || status >= (int)descLen ) {
								invalid_event = 1;
								PAPIERROR("Failed to store qualifier description for native event %s, used in derived event %s", t, results[res_idx].symbol);
								break;
							}
							count++;
						}
						descPtr = strtok( NULL, qualDelim );
					}
					free(desc);

					firstTerm = 0;
				}

				char *localname = strdup(t);
				char *basename = strtok(localname, ":");
				basename = strtok(NULL, ":");
				if( NULL == basename ) {
					basename = t;
				}

				/* Keep track of all qualifiers provided in the papi_events.csv file. */
				status = overwrite_qualifiers(prstPtr, t, 0);
				if( status < 0 ) {
					invalid_event = 1;
				}

				/* Construct event with all qualifiers. */
				int k, strLenSum = 0, baseLen = 1+strlen(basename);
				for (k = 0; k < prstPtr->num_quals; k++) {
					strLenSum += strlen(prstPtr->quals[k]);
				}
				strLenSum += baseLen;

				/* Allocate space for constructing fully qualified event. */
				tmpEvent = (char*)malloc(strLenSum*sizeof(char));
				tmpQuals = (char*)malloc(strLenSum*sizeof(char));
				if( NULL == tmpQuals || NULL == tmpEvent ) {
					SUBDBG("EXIT: Could not allocate memory.\n");
					return PAPI_ENOMEM;
				}

				/* Print the basename to a string. */
				status = snprintf(tmpEvent, baseLen, "%s", basename);
				if( status < 0 || status >= baseLen ) {
					invalid_event = 1;
					PAPIERROR("Event basename %s was truncated to %s in derived event %s", basename, tmpEvent, results[res_idx].symbol);
					return PAPI_ENOMEM;
				}

				/* Concatenate the qualifiers onto the string. */
				status = 0;
				for (k = 0; k < prstPtr->num_quals; k++) {
					status = snprintf(tmpQuals, strLenSum, "%s%s", tmpEvent, prstPtr->quals[k]);
					strcpy(tmpEvent, tmpQuals);
				}
				if( status < 0 || status >= strLenSum ) {
					invalid_event = 1;
					PAPIERROR("Event %s with qualifiers was truncated to %s in derived event %s", basename, tmpEvent, results[res_idx].symbol);
					return PAPI_ENOMEM;
				}

				// make sure that this term in the derived event is a valid event name
				// this call replaces preset and user event names with the equivalent native events in our results table
				// it also updates formulas for derived events so that they refer to the correct native event index
				if (is_event(tmpEvent, results[res_idx].derived_int, &results[res_idx], i) == 0) {
					invalid_event = 1;
					PAPIERROR("Missing event %s, used in derived event %s", basename, results[res_idx].symbol);
					break;
				}

				/* If it is a valid event, then update the preset fields here. */
				/* Initially, the event name should be those with a default, mandatory qualifiers. */
				results[res_idx].name[results[res_idx].count] = strdup(tmpEvent);
				results[res_idx].base_name[results[res_idx].count] = strdup(basename);
				results[res_idx].default_name[results[res_idx].count] = strdup(tmpEvent);
				results[res_idx].default_code[results[res_idx].count] = results[res_idx].code[results[res_idx].count];
				results[res_idx].count++;

				/* Free dynamically allocated strings. */
				free(tmpQuals);
				free(tmpEvent);
				free(localname);

				i++;
			} while (results[res_idx].count < PAPI_EVENTS_IN_DERIVED_EVENT);

			/* preset code list must be PAPI_NULL terminated */
			if (i < PAPI_EVENTS_IN_DERIVED_EVENT) {
				results[res_idx].code[results[res_idx].count] = PAPI_NULL;
			}

			if (invalid_event) {
				// got an error, make this entry unused
				// preset table is statically allocated, user defined is dynamic
				unsigned int j;
				for (j = 0; j < results[res_idx].count; j++) {
					if (results[res_idx].name[j] != NULL) {
						papi_free( results[res_idx].name[j] );
						results[res_idx].name[j] = NULL;
					}
				}
				continue;
			}

			/* End of derived support */

			// if we did not find any terms to base this derived event on, report error
			if (i == 0) {
				// got an error, make this entry unused
				PAPIERROR("Expected PFM event after DERIVED token at line %d of %s -- ignoring", line_no, name);
				continue;
			}

			if (i == PAPI_EVENTS_IN_DERIVED_EVENT) {
				t = trim_string(strtok_r(NULL, ",", &tok_save_ptr));
			}

			// if something was provided following the list of events to be used by the operation, process it
			if ( t != NULL && strlen(t) > 0 ) {
				do {
					// save the field name
					char *fptr = papi_strdup(t);

					// get the value to be used with this field
					t = trim_note(strtok_r(NULL, ",", &tok_save_ptr));
					if ( t == NULL || strlen(t) == 0 ) {
						papi_free(fptr);
						break;
					}

					// Handle optional short descriptions, long descriptions and notes
					if (strcasecmp(fptr, "SDESC") == 0) {
						results[res_idx].short_descr = papi_strdup(t);
					}
					if (strcasecmp(fptr, "LDESC") == 0) {
						results[res_idx].long_descr = papi_strdup(t);
					}
					if (strcasecmp(fptr, "NOTE") == 0) {
						results[res_idx].note = papi_strdup(t);
					}

					SUBDBG( "Found %s (%s) on line %d\n", fptr, t, line_no);
					papi_free (fptr);

					// look for another field name
					t = trim_string(strtok_r(NULL, ",", &tok_save_ptr));
					if ( t == NULL || strlen(t) == 0 ) {
						break;
					}
				} while (t != NULL);
			}

			(*event_count)++;
			continue;
		} else {
			if( breakAfter )
				break; // Break this while-loop once all presets for the given component's 'arch' have been
parsed. }
//PAPIERROR("Unrecognized token %s at line %d of %s -- ignoring", t, line_no, name);
}
if (event_file) {
    fclose(event_file);
}
SUBDBG("EXIT: Done processing derived event file.\n");
return PAPI_OK;
}
/* The following code is proof of principle for reading preset events from an
   xml file. It has been tested and works for pentium3. It relies on the expat
   library and is invoked by adding XMLFLAG = -DXML to the Makefile. It is
   presently hardcoded to look for "./papi_events.xml" */
#ifdef XML
#define BUFFSIZE 8192
#define SPARSE_BEGIN 0
#define SPARSE_EVENT_SEARCH 1
#define SPARSE_EVENT 2
#define SPARSE_DESC 3
#define ARCH_SEARCH 4
#define DENSE_EVENT_SEARCH 5
#define DENSE_NATIVE_SEARCH 6
#define DENSE_NATIVE_DESC 7
#define FINISHED 8
char buffer[BUFFSIZE], *xml_arch;
int location = SPARSE_BEGIN, sparse_index = 0, native_index, error = 0;
/* The function below, _xml_start(), is a hook into expat's XML
 * parser. _xml_start() defines how the parser handles the
 * opening tags in PAPI's XML file. This function can be understood
 * more easily if you follow along with its logic while looking at
 * papi_events.xml. The location variable is a global telling us
 * where we are in the XML file. Have we found our architecture's
 * events yet? Are we looking at an event definition?...etc.
*/ static void _xml_start( void *data, const char *el, const char **attr ) { int native_encoding; if ( location == SPARSE_BEGIN && !strcmp( "papistdevents", el ) ) { location = SPARSE_EVENT_SEARCH; } else if ( location == SPARSE_EVENT_SEARCH && !strcmp( "papievent", el ) ) { _papi_hwi_presets[sparse_index].info.symbol = papi_strdup( attr[1] ); // strcpy(_papi_hwi_presets.info[sparse_index].symbol, attr[1]); location = SPARSE_EVENT; } else if ( location == SPARSE_EVENT && !strcmp( "desc", el ) ) { location = SPARSE_DESC; } else if ( location == ARCH_SEARCH && !strcmp( "availevents", el ) && !strcmp( xml_arch, attr[1] ) ) { location = DENSE_EVENT_SEARCH; } else if ( location == DENSE_EVENT_SEARCH && !strcmp( "papievent", el ) ) { if ( !strcmp( "PAPI_NULL", attr[1] ) ) { location = FINISHED; return; } else if ( PAPI_event_name_to_code( ( char * ) attr[1], &sparse_index ) != PAPI_OK ) { PAPIERROR( "Improper Preset name given in XML file for %s.", attr[1] ); error = 1; } sparse_index &= PAPI_PRESET_AND_MASK; /* allocate and initialize data space for this event */ papi_valid_free( _papi_hwi_presets[sparse_index].data ); _papi_hwi_presets[sparse_index].data = papi_malloc( sizeof ( hwi_preset_data_t ) ); native_index = 0; _papi_hwi_presets[sparse_index].data->native[native_index] = PAPI_NULL; _papi_hwi_presets[sparse_index].data->operation[0] = '\0'; if ( attr[2] ) { /* derived event */ _papi_hwi_presets[sparse_index].data->derived = _papi_hwi_derived_type( ( char * ) attr[3] ); /* where does DERIVED POSTSCRIPT get encoded?? 
*/ if ( _papi_hwi_presets[sparse_index].data->derived == -1 ) { PAPIERROR( "No derived type match for %s in Preset XML file.", attr[3] ); error = 1; } if ( attr[5] ) { _papi_hwi_presets[sparse_index].count = atoi( attr[5] ); } else { PAPIERROR( "No count given for %s in Preset XML file.", attr[1] ); error = 1; } } else { _papi_hwi_presets[sparse_index].data->derived = NOT_DERIVED; _papi_hwi_presets[sparse_index].count = 1; } location = DENSE_NATIVE_SEARCH; } else if ( location == DENSE_NATIVE_SEARCH && !strcmp( "native", el ) ) { location = DENSE_NATIVE_DESC; } else if ( location == DENSE_NATIVE_DESC && !strcmp( "event", el ) ) { if ( _papi_hwi_native_name_to_code( attr[1], &native_encoding ) != PAPI_OK ) { printf( "Improper Native name given in XML file for %s\n", attr[1] ); PAPIERROR( "Improper Native name given in XML file for %s", attr[1] ); error = 1; } _papi_hwi_presets[sparse_index].data->native[native_index] = native_encoding; native_index++; _papi_hwi_presets[sparse_index].data->native[native_index] = PAPI_NULL; } else if ( location && location != ARCH_SEARCH && location != FINISHED ) { PAPIERROR( "Poorly-formed Preset XML document." ); error = 1; } } /* The function below, _xml_end(), is a hook into expat's XML * parser. _xml_end() defines how the parser handles the * end tags in PAPI's XML file. */ static void _xml_end( void *data, const char *el ) { int i; if ( location == SPARSE_EVENT_SEARCH && !strcmp( "papistdevents", el ) ) { for ( i = sparse_index; i < PAPI_MAX_PRESET_EVENTS; i++ ) { _papi_hwi_presets[i].info.symbol = NULL; _papi_hwi_presets[i].info.long_descr = NULL; _papi_hwi_presets[i].info.short_descr = NULL; } location = ARCH_SEARCH; } else if ( location == DENSE_NATIVE_DESC && !strcmp( "native", el ) ) { location = DENSE_EVENT_SEARCH; } else if ( location == DENSE_EVENT_SEARCH && !strcmp( "availevents", el ) ) { location = FINISHED; } } /* The function below, _xml_content(), is a hook into expat's XML * parser. 
_xml_content() defines how the parser handles the
 * text between tags in PAPI's XML file. The information between
 * tags is usually text for event descriptions. */
static void
_xml_content( void *data, const char *el, const int len )
{
    int i;
    if ( location == SPARSE_DESC ) {
        _papi_hwi_presets[sparse_index].info.long_descr = papi_malloc( len + 1 );
        for ( i = 0; i < len; i++ )
            _papi_hwi_presets[sparse_index].info.long_descr[i] = el[i];
        _papi_hwi_presets[sparse_index].info.long_descr[len] = '\0';
        /* the XML data currently doesn't contain a short description */
        _papi_hwi_presets[sparse_index].info.short_descr = NULL;
        sparse_index++;
        _papi_hwi_presets[sparse_index].data = NULL;
        location = SPARSE_EVENT_SEARCH;
    }
}
int
_xml_papi_hwi_setup_all_presets( char *arch, hwi_dev_notes_t * notes )
{
    int done = 0;
    FILE *fp = fopen( "./papi_events.xml", "r" );
    XML_Parser p = XML_ParserCreate( NULL );
    if ( !p ) {
        PAPIERROR( "Couldn't allocate memory for XML parser." );
        if ( fp )   /* fopen may also have failed; never fclose(NULL) */
            fclose( fp );
        return ( PAPI_ESYS );
    }
    XML_SetElementHandler( p, _xml_start, _xml_end );
    XML_SetCharacterDataHandler( p, _xml_content );
    if ( fp == NULL ) {
        PAPIERROR( "Error opening Preset XML file." );
        XML_ParserFree( p );   /* fp is NULL here; release the parser instead */
        return ( PAPI_ESYS );
    }
    xml_arch = arch;
    do {
        int len;
        void *buffer = XML_GetBuffer( p, BUFFSIZE );
        if ( buffer == NULL ) {
            PAPIERROR( "Couldn't allocate memory for XML buffer." );
            fclose( fp );
            return ( PAPI_ESYS );
        }
        len = fread( buffer, 1, BUFFSIZE, fp );
        if ( ferror( fp ) ) {
            PAPIERROR( "XML read error."
); fclose(fp); return ( PAPI_ESYS ); } done = feof( fp ); if ( !XML_ParseBuffer( p, len, len == 0 ) ) { PAPIERROR( "Parse error at line %d:\n%s", XML_GetCurrentLineNumber( p ), XML_ErrorString( XML_GetErrorCode( p ) ) ); fclose(fp); return ( PAPI_ESYS ); } if ( error ) { fclose(fp); return ( PAPI_ESYS ); } } while ( !done ); XML_ParserFree( p ); fclose( fp ); return ( PAPI_OK ); } #endif papi-papi-7-2-0-t/src/papi_preset.h000066400000000000000000000053121502707512200171140ustar00rootroot00000000000000/** * @file papi_preset.h * @author Haihang You * you@cs.utk.edu */ #ifndef _PAPI_PRESET /* _PAPI_PRESET */ #define _PAPI_PRESET /** search element for preset events defined for each platform * @internal */ typedef struct hwi_search { /* eventcode should have a more specific name, like papi_preset! -pjm */ unsigned int event_code; /**< Preset code that keys back to sparse preset array */ int derived; /**< Derived type code */ int native[PAPI_EVENTS_IN_DERIVED_EVENT]; /**< array of native event code(s) for this preset event */ char operation[PAPI_2MAX_STR_LEN]; /**< operation string: +,-,*,/,@(number of metrics), $(constant Mhz), %(1000000.0) */ char *note; /**< optional developer notes for this event */ } hwi_search_t; /** collected text and data info for all preset events * @internal */ typedef struct hwi_presets { char *symbol; /**< name of the preset event; i.e. PAPI_TOT_INS, etc. */ char *short_descr; /**< short description of the event for labels, etc. */ char *long_descr; /**< long description (full sentence) */ int derived_int; /**< Derived type code */ unsigned int count; unsigned int event_type; char *postfix; unsigned int code[PAPI_MAX_INFO_TERMS]; // Active code for each native event. char *name[PAPI_MAX_INFO_TERMS]; // Active name for each native event. char *base_name[PAPI_MAX_INFO_TERMS]; // Unqualified native event name. unsigned int default_code[PAPI_MAX_INFO_TERMS]; // Codes for names with mandatory quals included. 
char *default_name[PAPI_MAX_INFO_TERMS]; // Name of native events with mandatory quals included. char *note; int component_index; int num_quals; char *quals[PAPI_MAX_COMP_QUALS]; char *quals_descrs[PAPI_MAX_COMP_QUALS]; } hwi_presets_t; /** This is a general description structure definition for various parameter lists * @internal */ typedef struct hwi_describe { int value; /**< numeric value (from papi.h) */ char *name; /**< name of the element */ char *descr; /**< description of the element */ } hwi_describe_t; extern hwi_search_t *preset_search_map; int _papi_hwi_setup_all_presets( hwi_search_t * findem, int cidx); int _papi_hwi_cleanup_all_presets( void ); int _xml_papi_hwi_setup_all_presets( char *arch); int _papi_load_preset_table( char *name, int type, int cidx ); int _papi_load_preset_table_component( char *comp_str, char *name, int cidx ); extern hwi_presets_t _papi_hwi_presets[PAPI_MAX_PRESET_EVENTS]; extern hwi_presets_t *_papi_hwi_comp_presets[]; extern int _papi_hwi_max_presets[]; #endif /* _PAPI_PRESET */ papi-papi-7-2-0-t/src/papi_vector.c000066400000000000000000000233721502707512200171150ustar00rootroot00000000000000/* * File: papi_vector.c * Author: Kevin London * london@cs.utk.edu * Mods: Haihang You * you@cs.utk.edu * Mods: * */ #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" #include #ifdef _AIX /* needed because the get_virt_usec() code uses a hardware context */ /* which is a pmapi definition on AIX. 
*/
#include <pmapi.h>
#endif
void
_vectors_error( )
{
    SUBDBG( "function is not implemented in the component!\n" );
    exit( PAPI_ECMP );
}
int
vec_int_ok_dummy( )
{
    return PAPI_OK;
}
int
vec_int_one_dummy( )
{
    return 1;
}
int
vec_int_dummy( )
{
    return PAPI_ECMP;
}
void *
vec_void_star_dummy( )
{
    return NULL;
}
void
vec_void_dummy( )
{
    return;
}
long long
vec_long_long_dummy( )
{
    return PAPI_ECMP;
}
long long
vec_long_long_context_dummy( hwd_context_t *ignored )
{
    (void) ignored;
    return PAPI_ECMP;
}
char *
vec_char_star_dummy( )
{
    return NULL;
}
long
vec_long_dummy( )
{
    return PAPI_ECMP;
}
long long
vec_virt_cycles(void)
{
    return ((long long) _papi_os_vector.get_virt_usec() *
            _papi_hwi_system_info.hw_info.cpu_max_mhz);
}
long long
vec_real_nsec_dummy(void)
{
    return ((long long) _papi_os_vector.get_real_usec() * 1000);
}
long long
vec_virt_nsec_dummy(void)
{
    return ((long long) _papi_os_vector.get_virt_usec() * 1000);
}
int
_papi_hwi_innoculate_vector( papi_vector_t * v )
{
    if ( !v )
        return ( PAPI_EINVAL );
    /* component function pointers */
    if ( !v->dispatch_timer )
        v->dispatch_timer = ( void ( * )( int, hwd_siginfo_t *, void * ) ) vec_void_dummy;
    if ( !v->get_overflow_address )
        v->get_overflow_address = ( void *( * )( int, char *, int ) ) vec_void_star_dummy;
    if ( !v->start )
        v->start = ( int ( * )( hwd_context_t *, hwd_control_state_t * ) ) vec_int_dummy;
    if ( !v->stop )
        v->stop = ( int ( * )( hwd_context_t *, hwd_control_state_t * ) ) vec_int_dummy;
    if ( !v->read )
        v->read = ( int ( * ) ( hwd_context_t *, hwd_control_state_t *, long long **, int ) ) vec_int_dummy;
    if ( !v->reset )
        v->reset = ( int ( * )( hwd_context_t *, hwd_control_state_t * ) ) vec_int_dummy;
    if ( !v->write )
        v->write = ( int ( * )( hwd_context_t *, hwd_control_state_t *, long long[] ) ) vec_int_dummy;
    if ( !v->cleanup_eventset )
        v->cleanup_eventset = ( int ( * )( hwd_control_state_t * ) ) vec_int_ok_dummy;
    if ( !v->stop_profiling )
        v->stop_profiling = ( int ( * )( ThreadInfo_t *, EventSetInfo_t * ) )
vec_int_dummy; if ( !v->init_component ) v->init_component = ( int ( * )( int ) ) vec_int_ok_dummy; if ( !v->init_thread ) v->init_thread = ( int ( * )( hwd_context_t * ) ) vec_int_ok_dummy; if ( !v->init_control_state ) v->init_control_state = ( int ( * )( hwd_control_state_t * ptr ) ) vec_int_dummy; if ( !v->update_control_state ) v->update_control_state = ( int ( * ) ( hwd_control_state_t *, NativeInfo_t *, int, hwd_context_t * ) ) vec_int_dummy; if ( !v->ctl ) v->ctl = ( int ( * )( hwd_context_t *, int, _papi_int_option_t * ) ) vec_int_dummy; if ( !v->set_overflow ) v->set_overflow = ( int ( * )( EventSetInfo_t *, int, int ) ) vec_int_dummy; if ( !v->set_profile ) v->set_profile = ( int ( * )( EventSetInfo_t *, int, int ) ) vec_int_dummy; if ( !v->set_domain ) v->set_domain = ( int ( * )( hwd_control_state_t *, int ) ) vec_int_dummy; if ( !v->ntv_enum_events ) v->ntv_enum_events = ( int ( * )( unsigned int *, int ) ) vec_int_dummy; if ( !v->ntv_name_to_code ) v->ntv_name_to_code = ( int ( * )( const char *, unsigned int * ) ) vec_int_dummy; if ( !v->ntv_code_to_name ) v->ntv_code_to_name = ( int ( * )( unsigned int, char *, int ) ) vec_int_dummy; if ( !v->ntv_code_to_descr ) v->ntv_code_to_descr = ( int ( * )( unsigned int, char *, int ) ) vec_int_ok_dummy; if ( !v->ntv_code_to_bits ) v->ntv_code_to_bits = ( int ( * )( unsigned int, hwd_register_t * ) ) vec_int_dummy; if ( !v->ntv_code_to_info ) v->ntv_code_to_info = ( int ( * )( unsigned int, PAPI_event_info_t * ) ) vec_int_dummy; if ( !v->allocate_registers ) v->allocate_registers = ( int ( * )( EventSetInfo_t * ) ) vec_int_ok_dummy; if ( !v->shutdown_thread ) v->shutdown_thread = ( int ( * )( hwd_context_t * ) ) vec_int_dummy; if ( !v->shutdown_component ) v->shutdown_component = ( int ( * )( void ) ) vec_int_ok_dummy; if ( !v->user ) v->user = ( int ( * )( int, void *, void * ) ) vec_int_dummy; return PAPI_OK; } int _papi_hwi_innoculate_os_vector( papi_os_vector_t * v ) { if ( !v ) return ( PAPI_EINVAL ); 
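The two inoculation routines in this file follow one pattern: every NULL slot in a component's function-pointer table is backfilled with a matching dummy, so the framework can invoke any entry unconditionally without testing the pointer first. A minimal standalone sketch of that pattern (the `mini_*` names are illustrative, not PAPI's):

```c
#include <assert.h>
#include <stddef.h>

/* A miniature vector table with one optional operation. */
typedef struct {
    int (*read_counter)(void);   /* a component may leave this NULL */
} mini_vector_t;

/* Dummy used to backfill unimplemented slots ("not supported"). */
static int mini_read_dummy(void) { return -1; }

/* Inoculate: replace every NULL slot with its dummy so callers
 * never have to check the pointer before calling through it. */
static int mini_inoculate(mini_vector_t *v)
{
    if (v == NULL)
        return -1;
    if (!v->read_counter)
        v->read_counter = mini_read_dummy;
    return 0;
}
```

After `mini_inoculate(&v)`, `v.read_counter()` is always safe to call; a component that implements the slot simply fills it in before inoculation, just as PAPI components populate their `papi_vector_t`.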
if ( !v->get_real_cycles ) v->get_real_cycles = vec_long_long_dummy; if ( !v->get_real_usec ) v->get_real_usec = vec_long_long_dummy; if ( !v->get_real_nsec ) v->get_real_nsec = vec_real_nsec_dummy; if ( !v->get_virt_cycles ) v->get_virt_cycles = vec_virt_cycles; if ( !v->get_virt_usec ) v->get_virt_usec = vec_long_long_dummy; if ( !v->get_virt_nsec ) v->get_virt_nsec = vec_virt_nsec_dummy; if ( !v->update_shlib_info ) v->update_shlib_info = ( int ( * )( papi_mdi_t * ) ) vec_int_dummy; if ( !v->get_system_info ) v->get_system_info = ( int ( * )( papi_mdi_t * ) ) vec_int_dummy; if ( !v->get_memory_info ) v->get_memory_info = ( int ( * )( PAPI_hw_info_t *, int ) ) vec_int_dummy; if ( !v->get_dmem_info ) v->get_dmem_info = ( int ( * )( PAPI_dmem_info_t * ) ) vec_int_dummy; return PAPI_OK; } /* not used? debug only? */ #if 0 static void * vector_find_dummy( void *func, char **buf ) { void *ptr = NULL; if ( vec_int_ok_dummy == ( int ( * )( ) ) func ) { ptr = ( void * ) vec_int_ok_dummy; if ( buf != NULL ) *buf = papi_strdup( "vec_int_ok_dummy" ); } else if ( vec_int_one_dummy == ( int ( * )( ) ) func ) { ptr = ( void * ) vec_int_one_dummy; if ( buf != NULL ) *buf = papi_strdup( "vec_int_one_dummy" ); } else if ( vec_int_dummy == ( int ( * )( ) ) func ) { ptr = ( void * ) vec_int_dummy; if ( buf != NULL ) *buf = papi_strdup( "vec_int_dummy" ); } else if ( vec_void_dummy == ( void ( * )( ) ) func ) { ptr = ( void * ) vec_void_dummy; if ( buf != NULL ) *buf = papi_strdup( "vec_void_dummy" ); } else if ( vec_void_star_dummy == ( void *( * )( ) ) func ) { ptr = ( void * ) vec_void_star_dummy; if ( buf != NULL ) *buf = papi_strdup( "vec_void_star_dummy" ); } else if ( vec_long_long_dummy == ( long long ( * )( ) ) func ) { ptr = ( void * ) vec_long_long_dummy; if ( buf != NULL ) *buf = papi_strdup( "vec_long_long_dummy" ); } else if ( vec_char_star_dummy == ( char *( * )( ) ) func ) { ptr = ( void * ) vec_char_star_dummy; *buf = papi_strdup( "vec_char_star_dummy" ); } else if 
( vec_long_dummy == ( long ( * )( ) ) func ) { ptr = ( void * ) vec_long_dummy; if ( buf != NULL ) *buf = papi_strdup( "vec_long_dummy" ); } else { ptr = NULL; } return ( ptr ); } static void vector_print_routine( void *func, char *fname, int pfunc ) { void *ptr = NULL; char *buf = NULL; ptr = vector_find_dummy( func, &buf ); if ( ptr ) { printf( "DUMMY: %s is mapped to %s.\n", fname, buf ); papi_free( buf ); } else if ( ( !ptr && pfunc ) ) printf( "function: %s is mapped to %p.\n", fname, func ); } static void vector_print_table( papi_vector_t * v, int print_func ) { if ( !v ) return; vector_print_routine( ( void * ) v->dispatch_timer, "_papi_hwd_dispatch_timer", print_func ); vector_print_routine( ( void * ) v->get_overflow_address, "_papi_hwd_get_overflow_address", print_func ); vector_print_routine( ( void * ) v->start, "_papi_hwd_start", print_func ); vector_print_routine( ( void * ) v->stop, "_papi_hwd_stop", print_func ); vector_print_routine( ( void * ) v->read, "_papi_hwd_read", print_func ); vector_print_routine( ( void * ) v->reset, "_papi_hwd_reset", print_func ); vector_print_routine( ( void * ) v->write, "_papi_hwd_write", print_func ); vector_print_routine( ( void * ) v->cleanup_eventset, "_papi_hwd_cleanup_eventset", print_func ); vector_print_routine( ( void * ) v->stop_profiling, "_papi_hwd_stop_profiling", print_func ); vector_print_routine( ( void * ) v->init_component, "_papi_hwd_init_component", print_func ); vector_print_routine( ( void * ) v->init_thread, "_papi_hwd_init_thread", print_func ); vector_print_routine( ( void * ) v->init_control_state, "_papi_hwd_init_control_state", print_func ); vector_print_routine( ( void * ) v->ctl, "_papi_hwd_ctl", print_func ); vector_print_routine( ( void * ) v->set_overflow, "_papi_hwd_set_overflow", print_func ); vector_print_routine( ( void * ) v->set_profile, "_papi_hwd_set_profile", print_func ); vector_print_routine( ( void * ) v->set_domain, "_papi_hwd_set_domain", print_func ); 
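The debug helper `vector_find_dummy()` above recovers a human-readable name by comparing a function pointer against the address of each known dummy. A standalone sketch of that reverse lookup (the `demo_*` names are illustrative, not PAPI's):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int demo_ok_dummy(void)  { return 0;  }
static int demo_err_dummy(void) { return -1; }

/* Map a function pointer back to its symbolic name, or NULL
 * if it is not one of the known dummies. */
static const char *demo_find_dummy(int (*fn)(void))
{
    if (fn == demo_ok_dummy)  return "demo_ok_dummy";
    if (fn == demo_err_dummy) return "demo_err_dummy";
    return NULL;
}
```

This is exactly how `vector_print_table()` distinguishes backfilled dummy slots from real component entry points when printing the vector table.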
vector_print_routine( ( void * ) v->ntv_enum_events, "_papi_hwd_ntv_enum_events", print_func ); vector_print_routine( ( void * ) v->ntv_name_to_code, "_papi_hwd_ntv_name_to_code", print_func ); vector_print_routine( ( void * ) v->ntv_code_to_name, "_papi_hwd_ntv_code_to_name", print_func ); vector_print_routine( ( void * ) v->ntv_code_to_descr, "_papi_hwd_ntv_code_to_descr", print_func ); vector_print_routine( ( void * ) v->ntv_code_to_bits, "_papi_hwd_ntv_code_to_bits", print_func ); vector_print_routine( ( void * ) v->ntv_code_to_info, "_papi_hwd_ntv_code_to_info", print_func ); vector_print_routine( ( void * ) v->allocate_registers, "_papi_hwd_allocate_registers", print_func ); vector_print_routine( ( void * ) v->shutdown_thread, "_papi_hwd_shutdown_thread", print_func ); vector_print_routine( ( void * ) v->shutdown_component, "_papi_hwd_shutdown_component", print_func ); vector_print_routine( ( void * ) v->user, "_papi_hwd_user", print_func ); } #endif papi-papi-7-2-0-t/src/papi_vector.h000066400000000000000000000070431502707512200171170ustar00rootroot00000000000000/** * @file papi_vector.h */ #ifndef _PAPI_VECTOR_H #define _PAPI_VECTOR_H /** Sizes of structure private to each component */ typedef struct cmp_struct_sizes { int context; int control_state; int reg_value; int reg_alloc; } cmp_struct_sizes_t; /** Vector Table Stuff * @internal */ typedef struct papi_vectors { /** Component specific data structure @see papi.h */ PAPI_component_info_t cmp_info; /** Component specific structure sizes*/ cmp_struct_sizes_t size; /* List of exposed function pointers for this component */ void ( *dispatch_timer ) ( int, hwd_siginfo_t *, void * ); void * (*get_overflow_address) (int, char *, int); /**< */ int (*start) (hwd_context_t *, hwd_control_state_t *); /**< */ int (*stop) (hwd_context_t *, hwd_control_state_t *); /**< */ int (*read) (hwd_context_t *, hwd_control_state_t *, long long **, int); /**< */ int (*reset) (hwd_context_t *, hwd_control_state_t *); /**< */ int 
(*write) (hwd_context_t *, hwd_control_state_t *, long long[]); /**< */ int (*cleanup_eventset) ( hwd_control_state_t * ); /**< */ int (*stop_profiling) (ThreadInfo_t *, EventSetInfo_t *); /**< */ int (*init_component) (int); /**< */ int (*init_thread) (hwd_context_t *); /**< */ int (*init_control_state) (hwd_control_state_t * ptr); /**< */ int (*update_control_state) (hwd_control_state_t *, NativeInfo_t *, int, hwd_context_t *); /**< */ int (*ctl) (hwd_context_t *, int , _papi_int_option_t *); /**< */ int (*set_overflow) (EventSetInfo_t *, int, int); /**< */ int (*set_profile) (EventSetInfo_t *, int, int); /**< */ int (*set_domain) (hwd_control_state_t *, int); /**< */ int (*ntv_enum_events) (unsigned int *, int); /**< */ int (*ntv_name_to_code) (const char *, unsigned int *); /**< */ int (*ntv_code_to_name) (unsigned int, char *, int); /**< */ int (*ntv_code_to_descr) (unsigned int, char *, int); /**< */ int (*ntv_code_to_bits) (unsigned int, hwd_register_t *); /**< */ int (*ntv_code_to_info) (unsigned int, PAPI_event_info_t *); int (*allocate_registers) (EventSetInfo_t *); /**< called when an event is added. 
Should make sure the new EventSet can map to hardware and any conflicts are addressed */ int (*shutdown_thread) (hwd_context_t *); /**< */ int (*shutdown_component) (void); /**< */ int (*user) (int, void *, void *); /**< */ }papi_vector_t; extern papi_vector_t *_papi_hwd[]; typedef struct papi_os_vectors { long long (*get_real_cycles) (void); /**< */ long long (*get_virt_cycles) (void); /**< */ long long (*get_real_usec) (void); /**< */ long long (*get_virt_usec) (void); /**< */ long long (*get_real_nsec) (void); /**< */ long long (*get_virt_nsec) (void); /**< */ int (*update_shlib_info) (papi_mdi_t * mdi); /**< */ int (*get_system_info) (papi_mdi_t * mdi); /**< */ int (*get_memory_info) (PAPI_hw_info_t *, int); /**< */ int (*get_dmem_info) (PAPI_dmem_info_t *); /**< */ } papi_os_vector_t; extern papi_os_vector_t _papi_os_vector; /* Prototypes */ int _papi_hwi_innoculate_vector( papi_vector_t * v ); int _papi_hwi_innoculate_os_vector( papi_os_vector_t * v ); #endif /* _PAPI_VECTOR_H */ papi-papi-7-2-0-t/src/papivi.h000066400000000000000000000724101502707512200160740ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: papivi.h * CVS: $Id$ * Author: dan terpstra * terpstra@cs.utk.edu * Mods: your name here * yourname@cs.esu.edu * * Include this file INSTEAD OF "papi.h" in your application code * to provide semitransparent version independent PAPI support. * Follow the rules described below and elsewhere to facilitate * this support. * */ #ifndef _PAPIVI #define _PAPIVI #include "papi.h" /*************************************************************************** * If PAPI_VERSION is not defined, then papi.h is for PAPI 2. * The preprocessor block below contains the definitions, data structures, * macros and code needed to emulate much of the PAPI 3 interface in code * linking to the PAPI 2 library. 
****************************************************************************/ #ifndef PAPI_VERSION #define PAPI_VERSION_NUMBER(maj,min,rev) (((maj)<<16) | ((min)<<8) | (rev)) #define PAPI_VERSION_MAJOR(x) (((x)>>16) & 0xffff) #define PAPI_VERSION_MINOR(x) (((x)>>8) & 0xff) #define PAPI_VERSION_REVISION(x) ((x) & 0xff) /* This is the PAPI version on which we are running */ #define PAPI_VERSION PAPI_VERSION_NUMBER(2,3,4) /* This is the PAPI 3 version with which we are compatible */ #define PAPI_VI_VERSION PAPI_VERSION_NUMBER(3,0,6) /* PAPI 3 has an error code not defined for PAPI 2 */ #define PAPI_EPERM PAPI_EMISC /* You lack the necessary permissions */ /* * These are defined in papi_internal.h for PAPI 2. * They need to be exposed for version independent PAPI code to work. */ //#define PRESET_MASK 0x80000000 #define PAPI_PRESET_MASK 0x80000000 //#define PRESET_AND_MASK 0x7FFFFFFF #define PAPI_PRESET_AND_MASK 0x7FFFFFFF #define PAPI_NATIVE_MASK 0x40000000 #define PAPI_NATIVE_AND_MASK 0x3FFFFFFF /* * Some PAPI 3 definitions for PAPI_{set,get}_opt() map * onto single definitions in PAPI 2. The new definitions * (shown below) should be used to guarantee PAPI 3 compatibility. */ #define PAPI_CLOCKRATE PAPI_GET_CLOCKRATE #define PAPI_MAX_HWCTRS PAPI_GET_MAX_HWCTRS #define PAPI_HWINFO PAPI_GET_HWINFO #define PAPI_EXEINFO PAPI_GET_EXEINFO #define PAPI_MAX_CPUS PAPI_GET_MAX_CPUS #define PAPI_CPUS PAPI_GET_CPUS #define PAPI_THREADS PAPI_GET_THREADS /* * PAPI 2 defined only one string length. * PAPI 3 defines three. This insures limited compatibility. */ #define PAPI_MIN_STR_LEN PAPI_MAX_STR_LEN #define PAPI_HUGE_STR_LEN PAPI_MAX_STR_LEN /* * PAPI 2 always profiles into 16-bit buckets. * PAPI 3 supports multiple bucket sizes. * Exercise caution if these defines appear in your code. * There is a potential for data overflow in PAPI 2. 
*/ #define PAPI_PROFIL_BUCKET_16 0 #define PAPI_PROFIL_BUCKET_32 0 #define PAPI_PROFIL_BUCKET_64 0 /* * PAPI 3 defines a new eventcode that can often be emulated * successfully on PAPI 2. PAPI 3 also deprecates two eventcodes * found in PAPI 2: * PAPI_IPS (instructions per second) * PAPI_FLOPS (floating point instructions per second) * Don't use these eventcodes in version independent code */ #define PAPI_FP_OPS PAPI_FP_INS /* * Two new data structures are introduced in PAPI 3 that are * required to support the functionality of: * PAPI_get_event_info() and * PAPI_get_executable_info() * These structures are reproduced below. * They MUST stay synchronized with their counterparts in papi.h */ #define PAPI_MAX_INFO_TERMS 8 typedef struct event_info { unsigned int event_code; unsigned int count; char symbol[PAPI_MAX_STR_LEN + 3]; char short_descr[PAPI_MIN_STR_LEN]; char long_descr[PAPI_HUGE_STR_LEN]; char derived[PAPI_MIN_STR_LEN]; char postfix[PAPI_MIN_STR_LEN]; unsigned int code[PAPI_MAX_INFO_TERMS]; char name[PAPI_MAX_INFO_TERMS] [PAPI_MIN_STR_LEN]; char note[PAPI_HUGE_STR_LEN]; } PAPI_event_info_t; /* Possible values for the 'modifier' parameter of the PAPI_enum_event call. This enumeration is new in PAPI 3. It will act as a nop in PAPI 2, but must be defined for code compatibility. 
*/ enum { PAPI_ENUM_ALL = 0, /* Always enumerate all events */ PAPI_PRESET_ENUM_AVAIL, /* Enumerate events that exist here */ /* PAPI PRESET section */ PAPI_PRESET_ENUM_INS, /* Instruction related preset events */ PAPI_PRESET_ENUM_BR, /* branch related preset events */ PAPI_PRESET_ENUM_MEM, /* memory related preset events */ PAPI_PRESET_ENUM_TLB, /* Translation Lookaside Buffer events */ PAPI_PRESET_ENUM_FP, /* Floating Point related preset events */ /* Pentium 4 specific section */ PAPI_PENT4_ENUM_GROUPS = 0x100, /* 45 groups + custom + user */ PAPI_PENT4_ENUM_COMBOS, /* all combinations of mask bits for given group */ PAPI_PENT4_ENUM_BITS, /* all individual bits for given group */ /* POWER 4 specific section */ PAPI_PWR4_ENUM_GROUPS = 0x200 /* Enumerate groups an event belongs to */ }; typedef struct _papi_address_map { char mapname[PAPI_HUGE_STR_LEN]; vptr_t text_start; /* Start address of program text segment */ vptr_t text_end; /* End address of program text segment */ vptr_t data_start; /* Start address of program data segment */ vptr_t data_end; /* End address of program data segment */ vptr_t bss_start; /* Start address of program bss segment */ vptr_t bss_end; /* End address of program bss segment */ } PAPI_address_map_t; /* * PAPI 3 beta 3 introduces new structures for static memory description. * These include structures for tlb and cache description, a structure * to describe a level in the memory hierarchy, and a structure * to describe all levels of the hierarchy. * These structures, and the requisite data types are defined below. 
*/ /* All sizes are in BYTES */ /* Except tlb size, which is in entries */ #define PAPI_MAX_MEM_HIERARCHY_LEVELS 3 #define PAPI_MH_TYPE_EMPTY 0x0 #define PAPI_MH_TYPE_INST 0x1 #define PAPI_MH_TYPE_DATA 0x2 #define PAPI_MH_TYPE_UNIFIED PAPI_MH_TYPE_INST|PAPI_MH_TYPE_DATA typedef struct _papi_mh_tlb_info { int type; /* Empty, unified, data, instr */ int num_entries; int associativity; } PAPI_mh_tlb_info_t; typedef struct _papi_mh_cache_info { int type; /* Empty, unified, data, instr */ int size; int line_size; int num_lines; int associativity; } PAPI_mh_cache_info_t; typedef struct _papi_mh_level_info { PAPI_mh_tlb_info_t tlb[2]; PAPI_mh_cache_info_t cache[2]; } PAPI_mh_level_t; typedef struct _papi_mh_info { /* mh for mem hierarchy maybe? */ int levels; PAPI_mh_level_t level[PAPI_MAX_MEM_HIERARCHY_LEVELS]; } PAPI_mh_info_t; /* * Three data structures are modified in PAPI 3 * These modifications are * required to support the functionality of: * PAPI_get_hardware_info() and * PAPI_get_executable_info() * These structures are reproduced below. * They MUST stay synchronized with their counterparts in papi.h * To avoid namespace collisions, these structures have been renamed * to PAPIvi_xxx, and must also be renamed in your code. 
*/ typedef struct _papi3_hw_info { int ncpu; /* Number of CPUs in an SMP Node */ int nnodes; /* Number of Nodes in the entire system */ int totalcpus; /* Total number of CPUs in the entire system */ int vendor; /* Vendor number of CPU */ char vendor_string[PAPI_MAX_STR_LEN]; /* Vendor string of CPU */ int model; /* Model number of CPU */ char model_string[PAPI_MAX_STR_LEN]; /* Model string of CPU */ float revision; /* Revision of CPU */ float mhz; /* Cycle time of this CPU, *may* be estimated at init time with a quick timing routine */ PAPI_mh_info_t mem_hierarchy; } PAPIvi_hw_info_t; typedef struct _papi3_preload_option { char lib_preload_env[PAPI_MAX_STR_LEN]; /* Library preload environment variable */ char lib_preload_sep; char lib_dir_env[PAPI_MAX_STR_LEN]; char lib_dir_sep; } PAPIvi_preload_option_t; typedef struct _papi3_program_info { char fullname[PAPI_MAX_STR_LEN]; /* path+name */ char name[PAPI_MAX_STR_LEN]; /* name */ PAPI_address_map_t address_info; PAPIvi_preload_option_t preload_info; } PAPIvi_exe_info_t; /* * The Low Level API * Functions in this API are classified in 4 basic categories: * Modified: 13 functions * New: 8 functions * Unchanged: 32 functions * Deprecated: 9 functions * * Each of these categories is discussed further below. */ /* * Modified functions are further divided into 4 subcategories: * Dereferencing changes: 6 functions * These functions simply substitute an EventSet value for * a pointer to an EventSet. In the case of PAPI_remove_event{s}() * there is also a name change. * Name changes: 1 function * This is a simple name change with no change in functionality. 
 * Parameter changes: 4 functions * Several functions have changed functionality reflected in changed * parameters: * PAPI_{un}lock() supports multiple locks in PAPI 3 * PAPI_profil() supports multiple bucket sizes in PAPI 3 * PAPI_thread_init() removes an unused parameter in PAPI 3 * New functionality: 2 functions * These functions support new data in revised data structures * The code implemented here maps the old structures to the new * where possible. */ /* Modified Functions: Dereferencing changes */ #define PAPIvi_add_event(EventSet, Event) \ PAPI_add_event(&EventSet, Event) #define PAPIvi_add_events(EventSet, Events, number) \ PAPI_add_events(&EventSet, Events, number) #define PAPIvi_cleanup_eventset(EventSet) \ PAPI_cleanup_eventset(&EventSet) #define PAPIvi_remove_event(EventSet, EventCode) \ PAPI_rem_event(&EventSet, EventCode) #define PAPIvi_remove_events(EventSet, Events, number) \ PAPI_rem_events(&EventSet, Events, number) #define PAPIvi_set_multiplex(EventSet) \ PAPI_set_multiplex(&EventSet) /* Modified Functions: Name changes */ #define PAPIvi_is_initialized \ PAPI_initialized /* Modified Functions: Parameter changes */ #define PAPIvi_lock(lck) \ PAPI_lock() #define PAPIvi_profil(buf, bufsiz, offset, scale, EventSet, EventCode, threshold, flags) \ PAPI_profil((unsigned short *)buf, bufsiz, (unsigned long)offset, scale, EventSet, EventCode, threshold, flags) #define PAPIvi_thread_init(id_fn) \ PAPI_thread_init(id_fn, 0) #define PAPIvi_unlock(lck) \ PAPI_unlock() /* Modified Functions: New functionality */ static const PAPIvi_exe_info_t * PAPIvi_get_executable_info( void ) { static PAPIvi_exe_info_t prginfo3; const PAPI_exe_info_t *prginfo2 = PAPI_get_executable_info( ); if ( prginfo2 == NULL ) return ( NULL ); strcpy( prginfo3.fullname, prginfo2->fullname ); strcpy( prginfo3.name, prginfo2->name ); prginfo3.address_info.mapname[0] = 0; prginfo3.address_info.text_start = prginfo2->text_start; prginfo3.address_info.text_end = prginfo2->text_end; 
prginfo3.address_info.data_start = prginfo2->data_start; prginfo3.address_info.data_end = prginfo2->data_end; prginfo3.address_info.bss_start = prginfo2->bss_start; prginfo3.address_info.bss_end = prginfo2->bss_end; strcpy( prginfo3.preload_info.lib_preload_env, prginfo2->lib_preload_env ); return ( &prginfo3 ); } static const PAPIvi_hw_info_t * PAPIvi_get_hardware_info( void ) { static PAPIvi_hw_info_t papi3_hw_info; const PAPI_hw_info_t *papi2_hw_info = PAPI_get_hardware_info( ); const PAPI_mem_info_t *papi2_mem_info = PAPI_get_memory_info( ); /* Copy the basic hardware info (same in both structures) */ memcpy( &papi3_hw_info, papi2_hw_info, sizeof ( PAPI_hw_info_t ) ); memset( &papi3_hw_info.mem_hierarchy, 0, sizeof ( PAPI_mh_info_t ) ); /* check for a unified tlb */ if ( papi2_mem_info->total_tlb_size && papi2_mem_info->itlb_size == 0 && papi2_mem_info->dtlb_size == 0 ) { papi3_hw_info.mem_hierarchy.level[0].tlb[0].type = PAPI_MH_TYPE_UNIFIED; papi3_hw_info.mem_hierarchy.level[0].tlb[0].num_entries = papi2_mem_info->total_tlb_size; } else { if ( papi2_mem_info->itlb_size ) { papi3_hw_info.mem_hierarchy.level[0].tlb[0].type = PAPI_MH_TYPE_INST; papi3_hw_info.mem_hierarchy.level[0].tlb[0].num_entries = papi2_mem_info->itlb_size; papi3_hw_info.mem_hierarchy.level[0].tlb[0].associativity = papi2_mem_info->itlb_assoc; } if ( papi2_mem_info->dtlb_size ) { papi3_hw_info.mem_hierarchy.level[0].tlb[1].type = PAPI_MH_TYPE_DATA; papi3_hw_info.mem_hierarchy.level[0].tlb[1].num_entries = papi2_mem_info->dtlb_size; papi3_hw_info.mem_hierarchy.level[0].tlb[1].associativity = papi2_mem_info->dtlb_assoc; } } /* check for a unified level 1 cache */ if ( papi2_mem_info->total_L1_size ) papi3_hw_info.mem_hierarchy.levels = 1; if ( papi2_mem_info->total_L1_size && papi2_mem_info->L1_icache_size == 0 && papi2_mem_info->L1_dcache_size == 0 ) { papi3_hw_info.mem_hierarchy.level[0].cache[0].type = PAPI_MH_TYPE_UNIFIED; papi3_hw_info.mem_hierarchy.level[0].cache[0].size = 
papi2_mem_info->total_L1_size << 10; } else { if ( papi2_mem_info->L1_icache_size ) { papi3_hw_info.mem_hierarchy.level[0].cache[0].type = PAPI_MH_TYPE_INST; papi3_hw_info.mem_hierarchy.level[0].cache[0].size = papi2_mem_info->L1_icache_size << 10; papi3_hw_info.mem_hierarchy.level[0].cache[0].associativity = papi2_mem_info->L1_icache_assoc; papi3_hw_info.mem_hierarchy.level[0].cache[0].num_lines = papi2_mem_info->L1_icache_lines; papi3_hw_info.mem_hierarchy.level[0].cache[0].line_size = papi2_mem_info->L1_icache_linesize; } if ( papi2_mem_info->L1_dcache_size ) { papi3_hw_info.mem_hierarchy.level[0].cache[1].type = PAPI_MH_TYPE_DATA; papi3_hw_info.mem_hierarchy.level[0].cache[1].size = papi2_mem_info->L1_dcache_size << 10; papi3_hw_info.mem_hierarchy.level[0].cache[1].associativity = papi2_mem_info->L1_dcache_assoc; papi3_hw_info.mem_hierarchy.level[0].cache[1].num_lines = papi2_mem_info->L1_dcache_lines; papi3_hw_info.mem_hierarchy.level[0].cache[1].line_size = papi2_mem_info->L1_dcache_linesize; } } /* check for level 2 cache info */ if ( papi2_mem_info->L2_cache_size ) { papi3_hw_info.mem_hierarchy.levels = 2; papi3_hw_info.mem_hierarchy.level[1].cache[0].type = PAPI_MH_TYPE_UNIFIED; papi3_hw_info.mem_hierarchy.level[1].cache[0].size = papi2_mem_info->L2_cache_size << 10; papi3_hw_info.mem_hierarchy.level[1].cache[0].associativity = papi2_mem_info->L2_cache_assoc; papi3_hw_info.mem_hierarchy.level[1].cache[0].num_lines = papi2_mem_info->L2_cache_lines; papi3_hw_info.mem_hierarchy.level[1].cache[0].line_size = papi2_mem_info->L2_cache_linesize; } /* check for level 3 cache info */ if ( papi2_mem_info->L3_cache_size ) { papi3_hw_info.mem_hierarchy.levels = 3; papi3_hw_info.mem_hierarchy.level[2].cache[0].type = PAPI_MH_TYPE_UNIFIED; papi3_hw_info.mem_hierarchy.level[2].cache[0].size = papi2_mem_info->L3_cache_size << 10; papi3_hw_info.mem_hierarchy.level[2].cache[0].associativity = papi2_mem_info->L3_cache_assoc; 
papi3_hw_info.mem_hierarchy.level[2].cache[0].num_lines = papi2_mem_info->L3_cache_lines; papi3_hw_info.mem_hierarchy.level[2].cache[0].line_size = papi2_mem_info->L3_cache_linesize; } return ( &papi3_hw_info ); } /* * New functions are either supported or unsupported. * Of the three supported functions, two replaced deprecated functions * to describe events, and one is simply a convenience function. * The five unsupported new functions include three related to thread * functionality, a convenience function to return the number of events * in an event set, and a function to query information about shared libraries. */ /* New Supported Functions */ static int PAPIvi_enum_event( int *EventCode, int modifier ) { int i = *EventCode; const PAPI_preset_info_t *presets = PAPI_query_all_events_verbose( ); i &= PAPI_PRESET_AND_MASK; while ( ++i < PAPI_MAX_PRESET_EVENTS ) { if ( ( !modifier ) || ( presets[i].avail ) ) { *EventCode = i | PAPI_PRESET_MASK; if ( presets[i].event_name != NULL ) return ( PAPI_OK ); else return ( PAPI_ENOEVNT ); } } return ( PAPI_ENOEVNT ); } static int PAPIvi_get_event_info( int EventCode, PAPI_event_info_t * info ) { int i; const PAPI_preset_info_t *info2 = PAPI_query_all_events_verbose( ); i = EventCode & PAPI_PRESET_AND_MASK; if ( ( i >= PAPI_MAX_PRESET_EVENTS ) || ( info2[i].event_name == NULL ) ) return ( PAPI_ENOTPRESET ); info->event_code = info2[i].event_code; info->count = info2[i].avail; if ( info2[i].flags & PAPI_DERIVED ) { info->count++; strcpy( info->derived, "DERIVED" ); } if ( info2[i].event_name == NULL ) info->symbol[0] = 0; else strcpy( info->symbol, info2[i].event_name ); if ( info2[i].event_label == NULL ) info->short_descr[0] = 0; else strcpy( info->short_descr, info2[i].event_label ); if ( info2[i].event_descr == NULL ) info->long_descr[0] = 0; else strcpy( info->long_descr, info2[i].event_descr ); if ( info2[i].event_note == NULL ) info->note[0] = 0; else strcpy( info->note, info2[i].event_note ); return ( PAPI_OK ); } /* 
static int PAPI_get_multiplex(int EventSet) { PAPI_option_t popt; int retval; popt.multiplex.eventset = EventSet; retval = PAPI_get_opt(PAPI_GET_MULTIPLEX, &popt); if (retval < 0) retval = 0; return retval; } */ /* New Unsupported Functions */ #define PAPIvi_get_shared_lib_info \ PAPI_get_shared_lib_info #define PAPIvi_get_thr_specific(tag, ptr) \ PAPI_get_thr_specific(tag, ptr) #define PAPIvi_num_events(EventSet) \ PAPI_num_events(EventSet) #define PAPIvi_register_thread \ PAPI_register_thread #define PAPIvi_set_thr_specific(tag, ptr) \ PAPI_set_thr_specific(tag, ptr) /* * Over half of the functions in the Low Level API remain unchanged * These are included in the macro list in case they do change in future * revisions, and to simplify the naming conventions for writing * version independent PAPI code. */ #define PAPIvi_accum(EventSet, values) \ PAPI_accum(EventSet, values) #define PAPIvi_create_eventset(EventSet) \ PAPI_create_eventset(EventSet) #define PAPIvi_destroy_eventset(EventSet) \ PAPI_destroy_eventset(EventSet) #define PAPIvi_event_code_to_name(EventCode, out) \ PAPI_event_code_to_name(EventCode, out) #define PAPIvi_event_name_to_code(in, out) \ PAPI_event_name_to_code(in, out) #define PAPIvi_get_dmem_info(option) \ PAPI_get_dmem_info(option) #define PAPIvi_get_opt(option, ptr) \ PAPI_get_opt(option, ptr) #define PAPIvi_get_real_cyc \ PAPI_get_real_cyc #define PAPIvi_get_real_usec \ PAPI_get_real_usec #define PAPIvi_get_virt_cyc \ PAPI_get_virt_cyc #define PAPIvi_get_virt_usec \ PAPI_get_virt_usec #define PAPIvi_library_init(version) \ PAPI_library_init(version) #define PAPIvi_list_events(EventSet, Events, number) \ PAPI_list_events(EventSet, Events, number) #define PAPIvi_multiplex_init \ PAPI_multiplex_init #define PAPIvi_num_hwctrs \ PAPI_num_hwctrs #define PAPIvi_overflow(EventSet, EventCode, threshold, flags, handler) \ PAPI_overflow(EventSet, EventCode, threshold, flags, handler) #define PAPIvi_perror( s ) \ PAPI_perror( s ) #define 
PAPIvi_query_event(EventCode) \ PAPI_query_event(EventCode) #define PAPIvi_read(EventSet, values) \ PAPI_read(EventSet, values) #define PAPIvi_reset(EventSet) \ PAPI_reset(EventSet) #define PAPIvi_set_debug(level) \ PAPI_set_debug(level) #define PAPIvi_set_domain(domain) \ PAPI_set_domain(domain) #define PAPIvi_set_granularity(granularity) \ PAPI_set_granularity(granularity) #define PAPIvi_set_opt(option, ptr) \ PAPI_set_opt(option, ptr) #define PAPIvi_shutdown \ PAPI_shutdown #define PAPIvi_sprofil(prof, profcnt, EventSet, EventCode, threshold, flags) \ PAPI_sprofil(prof, profcnt, EventSet, EventCode, threshold, flags) #define PAPIvi_start(EventSet) \ PAPI_start(EventSet) #define PAPIvi_state(EventSet, status) \ PAPI_state(EventSet, status) #define PAPIvi_stop(EventSet, values) \ PAPI_stop(EventSet, values) #define PAPIvi_strerror(err) \ PAPI_strerror(err) #define PAPIvi_thread_id \ PAPI_thread_id #define PAPIvi_write(EventSet, values) \ PAPI_write(EventSet, values) /* * Of the nine functions deprecated from PAPI 2 to PAPI 3, * three (PAPI_add_pevent, PAPI_restore, and PAPI_save) were * never implemented, and four dealt with describing events. * Two remain: * PAPI_get_overflow_address() must still be used in version specific overflow handlers * PAPI_profil_hw() was rarely used, and only on platforms supporting hardware overflow. * The prototypes of these functions are shown below for completeness. 
*/ /* int PAPI_add_pevent(int *EventSet, int code, void *inout); void *PAPI_get_overflow_address(void *context); int PAPI_profil_hw(unsigned short *buf, unsigned bufsiz, unsigned long offset, \ unsigned scale, int EventSet, int EventCode, int threshold, int flags); const PAPI_preset_info_t *PAPI_query_all_events_verbose(void); int PAPI_describe_event(char *name, int *EventCode, char *description); int PAPI_label_event(int EventCode, char *label); int PAPI_query_event_verbose(int EventCode, PAPI_preset_info_t *info); int PAPI_restore(void); int PAPI_save(void); */ /* * The High Level API * There are 8 functions in this API. * 6 are unchanged, and 2 are new. * Of the new functions, one is emulated and one is unsupported. */ /* Unchanged Functions */ #define PAPIvi_accum_counters(values, array_len) \ PAPI_accum_counters(values, array_len) #define PAPIvi_num_counters \ PAPI_num_counters #define PAPIvi_read_counters(values, array_len) \ PAPI_read_counters(values, array_len) #define PAPIvi_start_counters(Events, array_len) \ PAPI_start_counters(Events, array_len) #define PAPIvi_stop_counters(values, array_len) \ PAPI_stop_counters(values, array_len) #define PAPIvi_flops(rtime, ptime, flpops, mflops) \ PAPI_flops(rtime, ptime, flpops, mflops) /* New Supported Functions */ #define PAPIvi_flips(rtime, ptime, flpins, mflips) \ PAPI_flops(rtime, ptime, flpins, mflips) /* New Unsupported Functions */ #define PAPIvi_ipc(rtime, ptime, ins, ipc) \ PAPI_ipc(rtime, ptime, ins, ipc) /******************************************************************************* * If PAPI_VERSION is defined, and the MAJOR version number is 3, * then papi.h is for PAPI 3. * The preprocessor block below contains definitions and macros needed to * allow version independent linking to the PAPI 3 library. * Other than a handful of definitions to support calls to PAPI_{get,set}_opt(), * this layer simply converts version independent names to PAPI 3 library calls. 
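The dispatch between the PAPI 2 and PAPI 3 branches of this header keys off the major field packed into `PAPI_VERSION`. A self-contained sketch of that style of compile-time version arithmetic (the one-byte-per-field packing shown here is an assumption modeled on PAPI's version macros, not a copy of papi.h; the macro names are illustrative):

```c
#include <assert.h>

/* Pack maj.min.rev.inc into one integer, one byte per field, and
 * extract the major field the way a PAPI_VERSION_MAJOR()-style
 * macro would. */
#define VERSION_NUMBER(maj, min, rev, inc) \
    (((maj) << 24) | ((min) << 16) | ((rev) << 8) | (inc))
#define VERSION_MAJOR(v) (((v) >> 24) & 0xff)

#define MY_PAPI_VERSION VERSION_NUMBER(3, 0, 8, 1)

/* Compile-time dispatch, as in the header: select an implementation
 * (or fail the build) based on the major version alone. */
#if VERSION_MAJOR(MY_PAPI_VERSION) == 3
static int api_generation(void) { return 3; }
#else
#error "unsupported major version"
#endif
```

Because the whole expression is integer arithmetic on macros, it is usable both in `#if`/`#elif` directives (as the header does below) and in ordinary runtime code.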
********************************************************************************/ #elif (PAPI_VERSION_MAJOR(PAPI_VERSION) == 3) /* * The following option definitions reflect the fact that PAPI 2 had separate * definitions for options to PAPI_set_opt and PAPI_get_opt, while PAPI 3 has * only a single set for both. By using the older naming convention, you can * create platform independent code for these calls. */ #define PAPI_SET_DEBUG PAPI_DEBUG #define PAPI_GET_DEBUG PAPI_DEBUG #define PAPI_SET_MULTIPLEX PAPI_MULTIPLEX #define PAPI_GET_MULTIPLEX PAPI_MULTIPLEX #define PAPI_SET_DEFDOM PAPI_DEFDOM #define PAPI_GET_DEFDOM PAPI_DEFDOM #define PAPI_SET_DOMAIN PAPI_DOMAIN #define PAPI_GET_DOMAIN PAPI_DOMAIN #define PAPI_SET_DEFGRN PAPI_DEFGRN #define PAPI_GET_DEFGRN PAPI_DEFGRN #define PAPI_SET_GRANUL PAPI_GRANUL #define PAPI_GET_GRANUL PAPI_GRANUL #define PAPI_SET_INHERIT PAPI_INHERIT #define PAPI_GET_INHERIT PAPI_INHERIT #define PAPI_GET_NUMCTRS PAPI_NUMCTRS #define PAPI_SET_NUMCTRS PAPI_NUMCTRS #define PAPI_SET_PROFIL PAPI_PROFIL #define PAPI_GET_PROFIL PAPI_PROFIL /* * These macros are simple pass-throughs to PAPI 3 structures */ #define PAPIvi_hw_info_t PAPI_hw_info_t #define PAPIvi_exe_info_t PAPI_exe_info_t /* * The following macros are simple pass-throughs to PAPI 3 library calls */ /* The Low Level API */ #define PAPIvi_accum(EventSet, values) \ PAPI_accum(EventSet, values) #define PAPIvi_add_event(EventSet, Event) \ PAPI_add_event(EventSet, Event) #define PAPIvi_add_events(EventSet, Events, number) \ PAPI_add_events(EventSet, Events, number) #define PAPIvi_cleanup_eventset(EventSet) \ PAPI_cleanup_eventset(EventSet) #define PAPIvi_create_eventset(EventSet) \ PAPI_create_eventset(EventSet) #define PAPIvi_destroy_eventset(EventSet) \ PAPI_destroy_eventset(EventSet) #define PAPIvi_enum_event(EventCode, modifier) \ PAPI_enum_event(EventCode, modifier) #define PAPIvi_event_code_to_name(EventCode, out) \ PAPI_event_code_to_name(EventCode, out) #define 
PAPIvi_event_name_to_code(in, out) \ PAPI_event_name_to_code(in, out) #define PAPIvi_get_dmem_info(option) \ PAPI_get_dmem_info(option) #define PAPIvi_get_event_info(EventCode, info) \ PAPI_get_event_info(EventCode, info) #define PAPIvi_get_executable_info \ PAPI_get_executable_info #define PAPIvi_get_hardware_info \ PAPI_get_hardware_info #define PAPIvi_get_multiplex(EventSet) \ PAPI_get_multiplex(EventSet) #define PAPIvi_get_opt(option, ptr) \ PAPI_get_opt(option, ptr) #define PAPIvi_get_real_cyc \ PAPI_get_real_cyc #define PAPIvi_get_real_usec \ PAPI_get_real_usec #define PAPIvi_get_shared_lib_info \ PAPI_get_shared_lib_info #define PAPIvi_get_thr_specific(tag, ptr) \ PAPI_get_thr_specific(tag, ptr) #define PAPIvi_get_virt_cyc \ PAPI_get_virt_cyc #define PAPIvi_get_virt_usec \ PAPI_get_virt_usec #define PAPIvi_is_initialized \ PAPI_is_initialized #define PAPIvi_library_init(version) \ PAPI_library_init(version) #define PAPIvi_list_events(EventSet, Events, number) \ PAPI_list_events(EventSet, Events, number) #define PAPIvi_lock(lck) \ PAPI_lock(lck) #define PAPIvi_multiplex_init \ PAPI_multiplex_init #define PAPIvi_num_hwctrs \ PAPI_num_hwctrs #define PAPIvi_num_events(EventSet) \ PAPI_num_events(EventSet) #define PAPIvi_overflow(EventSet, EventCode, threshold, flags, handler) \ PAPI_overflow(EventSet, EventCode, threshold, flags, handler) #define PAPIvi_perror( s ) \ PAPI_perror( s ) #define PAPIvi_profil(buf, bufsiz, offset, scale, EventSet, EventCode, threshold, flags) \ PAPI_profil(buf, bufsiz, offset, scale, EventSet, EventCode, threshold, flags) #define PAPIvi_query_event(EventCode) \ PAPI_query_event(EventCode) #define PAPIvi_read(EventSet, values) \ PAPI_read(EventSet, values) #define PAPIvi_register_thread \ PAPI_register_thread #define PAPIvi_remove_event(EventSet, EventCode) \ PAPI_remove_event(EventSet, EventCode) #define PAPIvi_remove_events(EventSet, Events, number) \ PAPI_remove_events(EventSet, Events, number) #define PAPIvi_reset(EventSet) \ 
PAPI_reset(EventSet) #define PAPIvi_set_debug(level) \ PAPI_set_debug(level) #define PAPIvi_set_domain(domain) \ PAPI_set_domain(domain) #define PAPIvi_set_granularity(granularity) \ PAPI_set_granularity(granularity) #define PAPIvi_set_multiplex(EventSet) \ PAPI_set_multiplex(EventSet) #define PAPIvi_set_opt(option, ptr) \ PAPI_set_opt(option, ptr) #define PAPIvi_set_thr_specific(tag, ptr) \ PAPI_set_thr_specific(tag, ptr) #define PAPIvi_shutdown \ PAPI_shutdown #define PAPIvi_sprofil(prof, profcnt, EventSet, EventCode, threshold, flags) \ PAPI_sprofil(prof, profcnt, EventSet, EventCode, threshold, flags) #define PAPIvi_start(EventSet) \ PAPI_start(EventSet) #define PAPIvi_state(EventSet, status) \ PAPI_state(EventSet, status) #define PAPIvi_stop(EventSet, values) \ PAPI_stop(EventSet, values) #define PAPIvi_strerror(err) \ PAPI_strerror(err) #define PAPIvi_thread_id \ PAPI_thread_id #define PAPIvi_thread_init(id_fn) \ PAPI_thread_init(id_fn) #define PAPIvi_unlock(lck) \ PAPI_unlock(lck) #define PAPIvi_write(EventSet, values) \ PAPI_write(EventSet, values) /* The High Level API */ #define PAPIvi_accum_counters(values, array_len) \ PAPI_accum_counters(values, array_len) #define PAPIvi_num_counters \ PAPI_num_counters #define PAPIvi_read_counters(values, array_len) \ PAPI_read_counters(values, array_len) #define PAPIvi_start_counters(Events, array_len) \ PAPI_start_counters(Events, array_len) #define PAPIvi_stop_counters(values, array_len) \ PAPI_stop_counters(values, array_len) #define PAPIvi_flips(rtime, ptime, flpins, mflips) \ PAPI_flips(rtime, ptime, flpins, mflips) #define PAPIvi_flops(rtime, ptime, flpops, mflops) \ PAPI_flops(rtime, ptime, flpops, mflops) #define PAPIvi_ipc(rtime, ptime, ins, ipc) \ PAPI_ipc(rtime, ptime, ins, ipc) /******************************************************************************* * If PAPI_VERSION is defined, and the MAJOR version number is not 3, then we * generate an error message. 
 * This block allows us to support future versions with a * version independent syntax. ********************************************************************************/ #else #error Compiling against a not yet released PAPI version #endif #endif /* _PAPIVI */ papi-papi-7-2-0-t/src/run_tests.sh000077500000000000000000000143401502707512200170160ustar00rootroot00000000000000#!/bin/sh # File: run_tests.sh # Author: Philip Mucci # mucci@cs.utk.edu # Mods: Kevin London # london@cs.utk.edu # Philip Mucci # mucci@cs.utk.edu # Treece Burgess # tburgess@icl.utk.edu # Make sure that the tests are built, if not build them if [ "x$BUILD" != "x" ]; then cd testlib; make; cd .. cd validation_tests; make; cd .. cd ctests; make; cd .. cd ftests; make; cd .. for comp in `ls components/*/tests` ; do \ cd components/$comp/tests ; make; cd ../../.. ; done fi AIXTHREAD_SCOPE=S export AIXTHREAD_SCOPE if [ "X$1" = "X-v" ]; then TESTS_QUIET="" else # This should never have been an argument, but an environment variable! TESTS_QUIET="TESTS_QUIET" export TESTS_QUIET fi # Determine if cuda events are enabled or disabled DISABLE_CUDA_EVENTS="" if [ "$2" ]; then # Can either be --disable-cuda-events= DISABLE_CUDA_EVENTS=$2 fi # Disable high-level output if [ "x$TESTS_QUIET" != "xTESTS_QUIET" ] ; then export PAPI_REPORT=1 fi if [ "x$VALGRIND" != "x" ]; then VALGRIND="valgrind --leak-check=full"; fi # Check for active 'perf_event' component PERF_EVENT_ACTIVE=$(utils/papi_component_avail | awk '/Active components:/{flag=1; next} flag' | grep -q "perf_event" && echo "true" || echo "false") if [ "$PERF_EVENT_ACTIVE" = "true" ]; then VTESTS=`find validation_tests/* -prune -perm -u+x -type f ! -name "*.[c|h]"`; CTESTS=`find ctests/* -prune -perm -u+x -type f ! -name "*.[c|h]"`; #CTESTS=`find ctests -maxdepth 1 -perm -u+x -type f`; FTESTS=`find ftests -perm -u+x -type f ! 
-name "*.[c|h|F]"`; else EXCLUDE="$EXCLUDE $VTESTS $CTESTS $FTESTS"; fi # List of active components ACTIVE_COMPONENTS_PATTERN=$(utils/papi_component_avail | awk '/Active components:/{flag=1; next} flag' | grep "Name:" | sed 's/Name: //' | awk '{print $1}' | paste -sd'|' -) # Find the test files, filtering for only the active components COMPTESTS=$(find components/*/tests -perm -u+x -type f ! \( -name "*.[c|h]" -o -name "*.cu" -o -name "*.so" \) | grep -E "components/($ACTIVE_COMPONENTS_PATTERN)/") # Find the test files, filtering for inactive components INACTIVE_COMPTESTS=$(find components/*/tests -perm -u+x -type f ! \( -name "*.[c|h]" -o -name "*.cu" -o -name "*.so" \) | grep -vE "components/($ACTIVE_COMPONENTS_PATTERN)/") #EXCLUDE=`grep --regexp=^# --invert-match run_tests_exclude.txt` EXCLUDE_TXT=`grep -v -e '^#\|^$' run_tests_exclude.txt` EXCLUDE="$EXCLUDE_TXT $INACTIVE_COMPTESTS"; ALLTESTS="$VTESTS $CTESTS $FTESTS $COMPTESTS"; PATH=./ctests:$PATH export PATH echo "Platform:" uname -a echo "Date:" date echo "" if [ -r /proc/cpuinfo ]; then echo "Cpuinfo:" # only print info on first processor on x86 sed '/^$/q' /proc/cpuinfo fi echo "" if [ "x$VALGRIND" != "x" ]; then echo "The following test cases will be run using valgrind:"; else echo "The following test cases will be run:"; fi echo "" MATCH=0 LIST="" for i in $ALLTESTS; do for xtest in $EXCLUDE; do if [ "$i" = "$xtest" ]; then MATCH=1 break fi; done if [ $MATCH -ne 1 ]; then LIST="$LIST $i" fi; MATCH=0 done echo $LIST echo "" echo "" echo "The following test cases will NOT be run:"; echo $EXCLUDE; echo ""; echo "Running Tests"; echo "" if [ "$LD_LIBRARY_PATH" = "" ]; then LD_LIBRARY_PATH=.:./libpfm4/lib else LD_LIBRARY_PATH=.:./libpfm4/lib:"$LD_LIBRARY_PATH" fi export LD_LIBRARY_PATH if [ "$LIBPATH" = "" ]; then LIBPATH=.:./libpfm4/lib else LIBPATH=.:./libpfm4/lib:"$LIBPATH" fi export LIBPATH if [ "$PERF_EVENT_ACTIVE" = "true" ]; then echo "" echo "Running Event Validation Tests"; echo "" for i in $VTESTS; 
do for xtest in $EXCLUDE; do if [ "$i" = "$xtest" ]; then MATCH=1 break fi; done if [ $MATCH -ne 1 ]; then if [ -x $i ]; then RAN="$i $RAN" printf "Running %-50s %s" $i: $VALGRIND ./$i $TESTS_QUIET #delete output folder for high-level tests case "$i" in *"_hl"*) rm -r papi_hl_output ;; esac fi; fi; MATCH=0 done echo "" echo "Running C Tests"; echo "" for i in $CTESTS; do for xtest in $EXCLUDE; do if [ "$i" = "$xtest" ]; then MATCH=1 break fi; done if [ $MATCH -ne 1 ]; then if [ -x $i ]; then RAN="$i $RAN" printf "Running %-50s %s" $i: # For all_native_events an optional flag of --disable-cuda-events= # can be provided; however, passing this to ctests/calibrate.c will result # in the help message being displayed instead of running if [ "$i" = "ctests/all_native_events" ] || [ "$i" = "ctests/get_event_component" ]; then $VALGRIND ./$i $TESTS_QUIET $DISABLE_CUDA_EVENTS else $VALGRIND ./$i $TESTS_QUIET fi #delete output folder for high-level tests case "$i" in *"_hl"*) rm -r papi_hl_output ;; esac fi; fi; MATCH=0 done echo "" echo "Running Fortran Tests"; echo "" for i in $FTESTS; do for xtest in $EXCLUDE; do if [ "$i" = "$xtest" ]; then MATCH=1 break fi; done if [ $MATCH -ne 1 ]; then if [ -x $i ]; then RAN="$i $RAN" printf "Running $i:\n" $VALGRIND ./$i $TESTS_QUIET #delete output folder for high-level tests case "$i" in *"_hl"*) rm -r papi_hl_output ;; esac fi; fi; MATCH=0 done fi echo ""; echo "Running Component Tests"; echo "" for i in $COMPTESTS; do for xtest in $EXCLUDE; do if [ "$i" = "$xtest" ]; then MATCH=1 break fi; done if [ $MATCH -ne 1 ]; then if [ -x $i ]; then RAN="$i $RAN" printf "Running $i:\n"; printf "%-59s" "" cmp=`echo $i | sed 's:components/::' | sed 's:/.*$::'`; if [ x$cmp = xsde ]; then LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${PWD}/components/sde/sde_lib:${PWD}/components/sde/tests/lib $VALGRIND ./$i $TESTS_QUIET else $VALGRIND ./$i $TESTS_QUIET fi; fi; fi; MATCH=0 done if [ "$RAN" = "" ]; then echo "FAILED to run any tests. 
(you can safely ignore this if this was expected behavior)" fi; papi-papi-7-2-0-t/src/run_tests_exclude.txt000066400000000000000000000035271502707512200207360ustar00rootroot00000000000000# this file enumerates test cases that will NOT be run # when the run_tests.sh macro is executed # enter each test name on a separate line # lines beginning with # will be ignored # this file must have UNIX line endings # For starters we do not want to try and execute Makefiles ftests/Makefile.recipies ftests/Makefile ftests/Makefile.target.in ctests/Makefile.recipies ctests/Makefile ctests/Makefile.target.in ctests/Make-export components/infiniband/tests/Makefile components/cuda/tests/Makefile components/Makefile_comp_tests components/net/tests/Makefile components/lustre/tests/Makefile components/perf_event/tests/Makefile components/nvml/tests/Makefile components/perf_event_uncore/tests/Makefile components/rapl/tests/Makefile components/bcs/tests/Makefile components/sde/tests/Makefile components/sde/tests/README.txt components/intel_gpu/tests/readme.txt testlib/Makefile testlib/Makefile.target.in # This is a utility, not a standalone test. 
validation_tests/memleak_check # Template PBS Job Script for Parallel Job on Myrinet Nodes ctests/cpi.pbs # Time wasting support program, not a standalone test ctests/burn # Support program for the attach tests ctests/attach_target # long running tests (if you are not in a hurry comment these lines) ctests/pthrtough2 ctests/timer_overflow # Some architectures require OMP_NUM_THREADS otherwise the test hangs ctests/omptough # Mixed high-level and low-level tests with different components ctests/serial_hl_ll_comb2 # MPI tests for high-level API ctests/mpi_hl ctests/mpi_omp_hl # these tests haven't been implemented # Helper scripts for iozone components/appio/tests/iozone/Gnuplot.txt components/appio/tests/iozone/Generate_Graphs components/appio/tests/iozone/report.pl components/appio/tests/iozone/iozone_visualizer.pl components/appio/tests/iozone/gengnuplot.sh components/appio/tests/iozone/gnu3d.dem papi-papi-7-2-0-t/src/run_tests_exclude_cuda.txt000066400000000000000000000021541502707512200217250ustar00rootroot00000000000000# this file enumerates test cases that will NOT be run # when the run_tests.sh macro is executed # enter each test name on a separate line # lines beginning with # will be ignored # this file must have UNIX line endings # For starters we do not want to try and execute Makefiles ftests/Makefile.recipies ftests/Makefile ftests/Makefile.target.in ctests/Makefile.recipies ctests/Makefile ctests/Makefile.target.in ctests/Make-export components/infiniband/tests/Makefile components/cuda/tests/Makefile components/Makefile_comp_tests components/net/tests/Makefile components/lustre/tests/Makefile components/perf_event/tests/Makefile components/nvml/tests/Makefile components/perf_event_uncore/tests/Makefile components/rapl/tests/Makefile components/bcs/tests/Makefile testlib/Makefile testlib/Makefile.target.in # Template PBS Job Script for Parallel Job on Myrinet Nodes ctests/cpi.pbs # Time wasting support program, not a standalone test ctests/burn 
ctests/shlib # long running tests (if you are not in a hurry remove comment from these lines) #ctests/pthrtough2 #ctests/timer_overflow # these tests haven't been implemented papi-papi-7-2-0-t/src/run_tests_shlib.sh000077500000000000000000000067341502707512200202060ustar00rootroot00000000000000#!/bin/sh # File: run_tests_shlib.sh # Author: Treece Burgess tburgess@icl.utk.edu # This script is designed specifically for the PAPI GitHub CI when # --with-shlib-tools is used during the ./configure stage. # A single component test for each active component will be run. # if component tests are not built, then build them if [ "x$BUILD" != "x" ]; then for comp in `ls components/*/tests` ; do \ cd components/$comp/tests ; make; cd ../../.. ; done fi # determine whether to suppress test output TESTS_QUIET="" if [ $# != 0 ]; then if [ "$1" = "TESTS_QUIET" ]; then TESTS_QUIET=$1 fi fi # determine if VALGRIND is set if [ "x$VALGRIND" != "x" ]; then VALGRIND="valgrind --leak-check=full"; fi # collect the current active components ACTIVE_COMPONENTS_PATTERN=$(utils/papi_component_avail | awk '/Active components:/{flag=1; next} flag' | grep "Name:" | sed 's/Name: //' | awk '{print $1}' | paste -sd'|' -) # collecting inactive component tests to be filtered INACTIVE_COMPONENTS=$(find components/*/tests -perm -u+x -type f ! \( -name "*.[c|h]" -o -name "*.cu" -o -name "*.so" \) | grep -vE "components/($ACTIVE_COMPONENTS_PATTERN)/") # set of tests we want to ignore EXCLUDE_TXT=`grep -v -e '^#\|^$' run_tests_exclude.txt` EXCLUDE_TESTS="$EXCLUDE_TXT $INACTIVE_COMPONENTS" # for each active component, collect a single component test ACTIVE_COMPONENTS_TESTS="" for cmp in $(echo $ACTIVE_COMPONENTS_PATTERN | sed 's/|/ /g'); do # query a test for the active component and make sure it is not an excluded test QUERY_CMP_TEST=$(find components/$cmp/tests -perm -u+x -type f ! 
\( -name "*.[c|h]" -o -name "*.cu" -o -name "*.so" \) | grep -E -m 1 "components/($cmp)/") case $EXCLUDE_TESTS in *"$QUERY_CMP_TEST"*) continue ;; esac # update the excluded tests UPDATE_EXCLUDE_TESTS=$(find components/$cmp/tests -perm -u+x -type f ! \( -name "*.[c|h]" -o -name "*.cu" -o -name "*.so" \) | grep -vw "$QUERY_CMP_TEST") EXCLUDE_TESTS="$EXCLUDE_TESTS $UPDATE_EXCLUDE_TESTS" # update active component tests ACTIVE_COMPONENTS_TESTS="$ACTIVE_COMPONENTS_TESTS $QUERY_CMP_TEST" done # print system information echo "Platform:" uname -a # print date information echo "Date:" date # print cpu information echo "" if [ -r /proc/cpuinfo ]; then echo "Cpuinfo:" # only print info on first processor on x86 sed '/^$/q' /proc/cpuinfo fi echo "" if [ "x$VALGRIND" != "x" ]; then echo "The following test cases will be run using valgrind:" else echo "The following test cases will be run:" fi # list each test for each active component, note that if more # than one test is output for each active component then # this script is not behaving properly echo $ACTIVE_COMPONENTS_TESTS echo "" echo "The following test cases will NOT be run:" echo $EXCLUDE_TESTS; # set LD_LIBRARY_PATH if [ "$LD_LIBRARY_PATH" = "" ]; then LD_LIBRARY_PATH=.:./libpfm4/lib else LD_LIBRARY_PATH=.:./libpfm4/lib:"$LD_LIBRARY_PATH" fi export LD_LIBRARY_PATH echo "" echo "Running a Single Component Test for --with-shlib-tools" echo "" for cmp_test in $ACTIVE_COMPONENTS_TESTS; do if [ -x $cmp_test ]; then printf "Running $cmp_test:\n"; printf "%-59s" "" cmp=`echo $cmp_test | sed 's:components/::' | sed 's:/.*$::'`; if [ x$cmp = xsde ]; then LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${PWD}/components/sde/sde_lib:${PWD}/components/sde/tests/lib $VALGRIND ./$cmp_test $TESTS_QUIET else $VALGRIND ./$cmp_test $TESTS_QUIET fi fi done 
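Both run scripts rely on the same core pattern: walk the candidate test list and skip any name that appears in the exclusion set (the `MATCH=0`/`MATCH=1` loop in run_tests.sh, the `case`/`grep` filtering here). The logic can be sketched in C for clarity (names and helper functions below are illustrative, not part of PAPI):

```c
#include <assert.h>
#include <string.h>

/* Return 1 if name appears in the exclusion list, mirroring the
 * MATCH flag loop the shell scripts use. */
static int is_excluded(const char *name, const char **excl, int n)
{
    int i;
    for (i = 0; i < n; i++)
        if (strcmp(name, excl[i]) == 0)
            return 1;
    return 0;
}

/* Count how many candidate tests would actually be run. */
static int runnable_count(const char **tests, int nt,
                          const char **excl, int ne)
{
    int i, count = 0;
    for (i = 0; i < nt; i++)
        if (!is_excluded(tests[i], excl, ne))
            count++;
    return count;
}
```

The scripts build the exclusion set from run_tests_exclude.txt (comments and blank lines stripped) plus every test belonging to an inactive component; the loop above is the final membership check.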
papi-papi-7-2-0-t/src/sde_lib/000077500000000000000000000000001502707512200160305ustar00rootroot00000000000000papi-papi-7-2-0-t/src/sde_lib/Makefile000066400000000000000000000011621502707512200174700ustar00rootroot00000000000000CC ?= gcc SDE_INC = -I. -I.. SDE_LD = -ldl -pthread CFLAGS += -Wextra -Wall -O2 %_d.o: %.c $(CC) -c -Bdynamic -fPIC -shared -fvisibility=hidden $(CFLAGS) $(SDE_INC) $< -o $@ %_s.o: %.c $(CC) -c -Bstatic -static $(CFLAGS) $(SDE_INC) $< -o $@ DOBJS=$(patsubst %.c,%_d.o,$(wildcard *.c)) SOBJS=$(patsubst %.c,%_s.o,$(wildcard *.c)) all: dynamic static dynamic: $(DOBJS) $(CC) $(LDFLAGS) -Bdynamic -fPIC -shared -Wl,-soname -Wl,libsde.so -fvisibility=hidden $(CFLAGS) $(DOBJS) -lrt -ldl -pthread -o libsde.so.1.0 rm -f *_d.o static: $(SOBJS) ar rs libsde.a $(SOBJS) rm -f *_s.o clean: rm -f *.o libsde.so libsde.a papi-papi-7-2-0-t/src/sde_lib/sde_lib.c000066400000000000000000001424311502707512200176020ustar00rootroot00000000000000/** * @file sde_lib.c * @author Anthony Danalis * adanalis@icl.utk.edu * * @brief * This is the main implementation of the functionality needed to * support SDEs in third party libraries. 
*/ #include "sde_lib_internal.h" #include "sde_lib_lock.h" #define DLSYM_CHECK(name) \ do { \ if ( NULL != (err=dlerror()) ) { \ name##_ptr = NULL; \ SDEDBG("obtain_papi_symbols(): Unable to load symbol %s: %s\n", #name, err);\ return; \ } \ } while (0) static long long sdei_compute_q1(void *param); static long long sdei_compute_med(void *param); static long long sdei_compute_q3(void *param); static long long sdei_compute_min(void *param); static long long sdei_compute_max(void *param); static inline long long sdei_compute_quantile(void *param, int percent); static inline long long sdei_compute_edge(void *param, int which_edge); int papi_sde_compare_long_long(const void *p1, const void *p2); int papi_sde_compare_int(const void *p1, const void *p2); int papi_sde_compare_double(const void *p1, const void *p2); int papi_sde_compare_float(const void *p1, const void *p2); /** This global variable points to the head of the control state list **/ papisde_control_t *_papisde_global_control = NULL; int papi_sde_version = PAPI_SDE_VERSION; #if defined(USE_LIBAO_ATOMICS) AO_TS_t _sde_hwd_lock_data; #else //defined(USE_LIBAO_ATOMICS) pthread_mutex_t _sde_hwd_lock_data; #endif //defined(USE_LIBAO_ATOMICS) /*******************************************************************************/ /* Function pointers magic for functions that we expect to access from libpapi */ /*******************************************************************************/ __attribute__((__common__)) void (*papi_sde_check_overflow_status_ptr)(uint32_t cntr_id, long long int value); __attribute__((__common__)) int (*papi_sde_set_timer_for_overflow_ptr)(void); static inline void sdei_check_overflow_status(uint32_t cntr_uniq_id, long long int latest){ if( NULL != papi_sde_check_overflow_status_ptr ) (*papi_sde_check_overflow_status_ptr)(cntr_uniq_id, latest); } inline int sdei_set_timer_for_overflow(void){ if( NULL != papi_sde_set_timer_for_overflow_ptr ) return (*papi_sde_set_timer_for_overflow_ptr)(); 
return -1; } /* The following function will look for symbols from libpapi.so. If the application that linked against libsde has used the static PAPI library (libpapi.a) then dlsym will fail to find them, but the __attribute__((__common__)) should do the trick. */ static inline void obtain_papi_symbols(void){ char *err; int dlsym_err = 0; // In case of static linking the function pointers will be automatically set // by the linker and the dlopen()/dlsym() would fail at runtime, so we want to // check if the linker has done its magic first. if( (NULL != papi_sde_check_overflow_status_ptr) && (NULL != papi_sde_set_timer_for_overflow_ptr) ){ return; } (void)dlerror(); // Clear the internal string so we can diagnose errors later on. void *handle = dlopen(NULL, RTLD_NOW|RTLD_GLOBAL); if( NULL != (err = dlerror()) ){ SDEDBG("obtain_papi_symbols(): %s\n",err); dlsym_err = 1; return; } // We need this function to inform the SDE component in libpapi about the value of created counters. papi_sde_check_overflow_status_ptr = dlsym(handle, "papi_sde_check_overflow_status"); DLSYM_CHECK(papi_sde_check_overflow_status); papi_sde_set_timer_for_overflow_ptr = dlsym(handle, "papi_sde_set_timer_for_overflow"); DLSYM_CHECK(papi_sde_set_timer_for_overflow); if( !dlsym_err ){ SDEDBG("obtain_papi_symbols(): All symbols from libpapi.so have been successfully acquired.\n"); } return; } /*************************************************************************/ /* API Functions for libraries. */ /*************************************************************************/ /** This function initializes SDE internal data-structures for an individual software library and returns an opaque handle to these structures. @param[in] name_of_library -- (const char *) library name. @param[out] sde_handle -- (papi_handle_t) opaque pointer to sde structure for initialized library.
*/ papi_handle_t papi_sde_init(const char *name_of_library) { papisde_library_desc_t *tmp_lib; papisde_control_t *gctl = sdei_get_global_struct(); if(gctl->disabled) return NULL; // We have to emulate PAPI's SUBDBG to get the same behavior _sde_be_verbose = (NULL != getenv("PAPI_VERBOSE")); char *tmp= getenv("PAPI_DEBUG"); if( (NULL != tmp) && (0 != strlen(tmp)) && (strstr(tmp, "SUBSTRATE") || strstr(tmp, "ALL")) ){ _sde_debug = 1; } SDEDBG("Registering library: '%s'\n", name_of_library); obtain_papi_symbols(); // Lock before we read and/or modify the global structures. sde_lock(); // Put the actual work in a different function so we can call it from other // places. We have to do this because we cannot call // papi_sde_init() from places in the code which already call // lock()/unlock(), or we will end up with deadlocks. tmp_lib = (papisde_library_desc_t *)do_sde_init(name_of_library, gctl); sde_unlock(); SDEDBG("Library '%s' has been registered.\n",name_of_library); return tmp_lib; } /** This function disables SDE activity for a specific library, or for all libraries that use SDEs until papi_sde_enable() is called. @param[in] handle -- (papi_handle_t) opaque pointer to sde structure for a specific library. If NULL then SDEs will be disabled at a global level. @param[out] -- (int) the return value is SDE_OK on success, or an error code on failure. */ int papi_sde_disable( papi_handle_t handle ){ sde_lock(); papisde_control_t *gctl = sdei_get_global_struct(); // If the caller did not specify a library, then disable all SDEs. if( NULL == handle ){ gctl->disabled = 1; }else{ // else disable the specified library. papisde_library_desc_t *lib_handle = (papisde_library_desc_t *) handle; lib_handle->disabled = 1; } sde_unlock(); return SDE_OK; } /** This function enables SDE activity for a specific library, or for all libraries that use SDEs. @param[in] handle -- (papi_handle_t) opaque pointer to sde structure for a specific library.
If NULL then SDEs will be enabled at a global level. Note that if SDEs for a specific library have been explicitly disabled, then they must be explicitly enabled by passing that library's handle. Calling papi_sde_enable(NULL) will only enable SDEs at the global level. It will not recursively enable SDEs for individual libraries. @param[out] -- (int) the return value is SDE_OK on success, or an error code on failure. */ int papi_sde_enable( papi_handle_t handle ){ sde_lock(); papisde_control_t *gctl = sdei_get_global_struct(); // If the caller did not specify a library, then enable all SDEs. if( NULL == handle ){ gctl->disabled = 0; }else{ // else enable the specified library. papisde_library_desc_t *lib_handle = (papisde_library_desc_t *) handle; lib_handle->disabled = 0; } sde_unlock(); return SDE_OK; } /** This function frees all SDE internal data-structures for an individual software library including all memory allocated by the counters of that library. @param[in] handle -- (papi_handle_t) opaque pointer to sde structure for initialized library. @param[out] -- (int) the return value is SDE_OK on success, or an error code on failure.
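The list-maintenance step that papi_sde_shutdown() performs on the global library list can be illustrated in isolation. A minimal self-contained sketch with a hypothetical node type (not the library's actual papisde_library_desc_t structure):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a library descriptor: a singly linked list node. */
typedef struct lib_node { struct lib_node *next; } lib_node_t;

/* Remove 'target' from the list rooted at *head, mirroring the
   prev_lib/tmp_lib walk performed during shutdown. */
static void unlink_lib(lib_node_t **head, lib_node_t *target)
{
    lib_node_t *next_lib = target->next;
    if (*head == target) {                   /* target is the head */
        *head = next_lib;
        return;
    }
    lib_node_t *prev = NULL, *tmp = *head;
    while (tmp != target && tmp != NULL) {   /* find the predecessor */
        prev = tmp;
        tmp = tmp->next;
    }
    if (prev != NULL && tmp == target)       /* splice target out */
        prev->next = next_lib;
}
```

After unlinking, the real shutdown code also frees the library name and the descriptor itself; the sketch only shows the pointer surgery.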
*/ int papi_sde_shutdown( papi_handle_t handle ){ papisde_library_desc_t *lib_handle, *tmp_lib, *next_lib, *prev_lib; int i; lib_handle = (papisde_library_desc_t *) handle; papisde_control_t *gctl = _papisde_global_control; if( (NULL==lib_handle) || lib_handle->disabled || (NULL==gctl) || gctl->disabled) return SDE_OK; SDEDBG("papi_sde_shutdown(): for library '%s'.\n", lib_handle->libraryName); sde_lock(); sde_counter_t *all_lib_counters; int item_cnt = ht_to_array(lib_handle->lib_counters, &all_lib_counters); for(i=0; i<item_cnt; i++){ (void)sdei_delete_counter( lib_handle, all_lib_counters[i].name ); } free(all_lib_counters); // Unlink this library from the global list by making its predecessor point to lib_handle->next. next_lib = lib_handle->next; if (gctl->lib_list_head == lib_handle) { gctl->lib_list_head = next_lib; } else { prev_lib = NULL; tmp_lib = gctl->lib_list_head; while (tmp_lib != lib_handle && tmp_lib != NULL) { prev_lib = tmp_lib; tmp_lib = tmp_lib->next; } if (prev_lib != NULL) { prev_lib->next = next_lib; } } free(lib_handle->libraryName); free(lib_handle); sde_unlock(); return SDE_OK; } /** This function registers an event name and counter within the SDE data structure attached to the handle. A default description for an event is synthesized from the library name and the event name when they are registered. @param[in] handle -- pointer (of opaque type papi_handle_t) to sde structure for an individual library. @param[in] event_name -- (const char *) name of the event. @param[in] cntr_mode -- (int) the mode of the counter (one of: PAPI_SDE_RO, PAPI_SDE_RW and one of: PAPI_SDE_DELTA, PAPI_SDE_INSTANT). @param[in] cntr_type -- (int) the type of the counter (PAPI_SDE_long_long, PAPI_SDE_int, PAPI_SDE_double, PAPI_SDE_float). @param[in] counter -- pointer to a variable that stores the value for the event. @param[out] -- (int) the return value is SDE_OK on success, or an error code on failure.
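Internally, every event is stored under a fully qualified name of the form "LibraryName::event". A self-contained sketch of that naming scheme and the buffer-size arithmetic used throughout this file (+2 bytes for "::" and +1 for the terminating '\0'); the helper below is illustrative, not a libsde function:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative helper: build "lib::event" the way libsde sizes its
   buffers. The caller is responsible for free()-ing the result. */
static char *make_full_event_name(const char *lib, const char *event)
{
    size_t str_len = strlen(lib) + strlen(event) + 2 + 1; /* "::" + '\0' */
    char *full = (char *)malloc(str_len);
    if (NULL != full)
        snprintf(full, str_len, "%s::%s", lib, event);
    return full;
}
```

This is the key under which ht_lookup_by_name() later finds the counter, which is why unregister/describe/group operations all rebuild the same string.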
*/ int papi_sde_register_counter( papi_handle_t handle, const char *event_name, int cntr_mode, int cntr_type, void *counter ) { papisde_library_desc_t *lib_handle; int ret_val = SDE_OK; cntr_class_specific_t cntr_union; if( NULL != event_name ) SDEDBG("Preparing to register counter: '%s'.\n", event_name); lib_handle = (papisde_library_desc_t *) handle; papisde_control_t *gctl = _papisde_global_control; if( (NULL==lib_handle) || lib_handle->disabled || (NULL==gctl) || gctl->disabled) return SDE_OK; cntr_union.cntr_basic.data = counter; sde_lock(); ret_val = sdei_setup_counter_internals( lib_handle, event_name, cntr_mode, cntr_type, CNTR_CLASS_REGISTERED, cntr_union ); sde_unlock(); return ret_val; } /** This function registers an event name and (caller provided) callback function within the SDE data structure attached to the handle. A default description for an event is synthesized from the library name and the event name when they are registered. @param[in] handle -- pointer (of opaque type papi_handle_t) to sde structure for an individual library. @param[in] event_name -- (const char *) name of the event. @param[in] cntr_mode -- (int) the mode of the counter (one of: PAPI_SDE_RO, PAPI_SDE_RW and one of: PAPI_SDE_DELTA, PAPI_SDE_INSTANT). @param[in] cntr_type -- (int) the type of the counter (PAPI_SDE_long_long, PAPI_SDE_int, PAPI_SDE_double, PAPI_SDE_float). @param[in] callback -- pointer to a callback function that SDE will call when PAPI_read/stop/accum is called. @param[in] param -- (void *) opaque parameter that will be passed to the callback function every time it's called. @param[out] -- (int) the return value is SDE_OK on success, or an error code on failure.
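A callback counter is read on demand: at PAPI_read/stop/accum time, libsde invokes the callback with the registered 'param' pointer and uses the returned value as the counter's value. A minimal sketch of such a callback; the function-pointer typedef matches the one used in this file, while the statistics type and callback are hypothetical examples:

```c
#include <assert.h>

/* Same shape as papi_sde_fptr_t: takes the registered 'param', returns
   the current counter value. */
typedef long long int (*sde_cb_fptr_t)(void *);

/* Hypothetical per-library statistics the callback derives a value from. */
typedef struct { long long hits; long long misses; } cache_stats_t;

/* Invoked by libsde whenever the event is read; 'param' is the pointer
   that was passed to papi_sde_register_counter_cb(). */
static long long total_accesses_cb(void *param)
{
    cache_stats_t *stats = (cache_stats_t *)param;
    return stats->hits + stats->misses;
}
```

In real use, the library would pass total_accesses_cb and &stats as the 'callback' and 'param' arguments of papi_sde_register_counter_cb(), so derived values are computed only when someone actually reads the event.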
*/ int papi_sde_register_counter_cb( papi_handle_t handle, const char *event_name, int cntr_mode, int cntr_type, papi_sde_fptr_t callback, void *param ) { papisde_library_desc_t *lib_handle; int ret_val = SDE_OK; cntr_class_specific_t cntr_union; if( NULL != event_name ) SDEDBG("Preparing to register fp_counter: '%s'.\n", event_name); lib_handle = (papisde_library_desc_t *) handle; papisde_control_t *gctl = _papisde_global_control; if( (NULL==lib_handle) || lib_handle->disabled || (NULL==gctl) || gctl->disabled) return SDE_OK; cntr_union.cntr_cb.callback = callback; cntr_union.cntr_cb.param = param; sde_lock(); ret_val = sdei_setup_counter_internals( lib_handle, event_name, cntr_mode, cntr_type, CNTR_CLASS_CB, cntr_union ); sde_unlock(); return ret_val; } /** This function unregisters (removes) an event name and counter from the SDE data structures. @param[in] handle -- pointer (of opaque type papi_handle_t) to sde structure for an individual library. @param[in] event_name -- (const char *) name of the event that is being unregistered. @param[out] -- (int) the return value is SDE_OK on success, or an error code on failure. */ int papi_sde_unregister_counter( papi_handle_t handle, const char *event_name) { papisde_library_desc_t *lib_handle; int error; char *full_event_name; int ret_val; SDEDBG("Preparing to unregister counter: '%s'.\n",event_name); lib_handle = (papisde_library_desc_t *) handle; papisde_control_t *gctl = _papisde_global_control; if( (NULL==lib_handle) || lib_handle->disabled || (NULL==gctl) || gctl->disabled) return SDE_OK; if( NULL == lib_handle->libraryName ){ SDE_ERROR("papi_sde_unregister_counter(): 'handle' is clobbered.
Unable to unregister counter."); return SDE_EINVAL; } size_t str_len = strlen(lib_handle->libraryName)+strlen(event_name)+2+1; // +2 for "::" and +1 for '\0' full_event_name = (char *)malloc(str_len*sizeof(char)); snprintf(full_event_name, str_len, "%s::%s", lib_handle->libraryName, event_name); SDEDBG("Unregistering counter: '%s' from SDE library: %s.\n", full_event_name, lib_handle->libraryName); // After this point we will be modifying data structures, so we need to acquire a lock. // This function has multiple exit points. If you add more, make sure you unlock before each one of them. sde_lock(); error = sdei_delete_counter( lib_handle, full_event_name ); // Check if we found a registered counter, or if it never existed. if( error ){ SDE_ERROR("Counter '%s' has not been registered by library '%s'.", full_event_name, lib_handle->libraryName); free(full_event_name); ret_val = SDE_EINVAL; goto fn_exit; } // We will not use the name beyond this point free(full_event_name); ret_val = SDE_OK; fn_exit: sde_unlock(); return ret_val; } /** This function optionally replaces an event's default description with a description provided by the library developer within the SDE data structure attached to the handle. @param[in] handle -- pointer (of opaque type papi_handle_t) to sde structure for an individual library. @param[in] event_name -- (const char *) name of the event. @param[in] event_description -- (const char *) description of the event. @param[out] -- (int) the return value is SDE_OK on success, or an error code on failure.
*/ int papi_sde_describe_counter( void *handle, const char *event_name, const char *event_description ) { sde_counter_t *tmp_item; papisde_library_desc_t *lib_handle; char *full_event_name; int ret_val; lib_handle = (papisde_library_desc_t *) handle; papisde_control_t *gctl = _papisde_global_control; if( (NULL==lib_handle) || lib_handle->disabled || (NULL==gctl) || gctl->disabled) return SDE_OK; if( NULL == lib_handle->libraryName ){ SDE_ERROR("papi_sde_describe_counter(): 'handle' is clobbered. Unable to add description for counter."); return SDE_EINVAL; } size_t str_len = strlen(lib_handle->libraryName)+strlen(event_name)+2+1; // +2 for "::" and +1 for '\0' full_event_name = (char *)malloc(str_len*sizeof(char)); snprintf(full_event_name, str_len, "%s::%s", lib_handle->libraryName, event_name); // After this point we will be modifying data structures, so we need to acquire a lock. // This function has multiple exit points. If you add more, make sure you unlock before each one of them. sde_lock(); tmp_item = ht_lookup_by_name(lib_handle->lib_counters, full_event_name); if( NULL != tmp_item ){ tmp_item->description = strdup(event_description); free(full_event_name); ret_val = SDE_OK; goto fn_exit; } SDEDBG("papi_sde_describe_counter() Event: '%s' is not registered in SDE library: '%s'\n", full_event_name, lib_handle->libraryName); // We will not use the name beyond this point free(full_event_name); ret_val = SDE_EINVAL; fn_exit: sde_unlock(); return ret_val; } /** This function adds an event counter to a group. A group is created automatically the first time a counter is added to it. @param[in] handle -- pointer (of opaque type papi_handle_t) to sde structure for an individual library. @param[in] event_name -- (const char *) name of the event being added to the group. @param[in] group_name -- (const char *) name of the group. @param[in] group_flags -- (uint32_t) one of PAPI_SDE_SUM, PAPI_SDE_MAX, PAPI_SDE_MIN to define how the members of the group will be used to compute the group's value.
@param[out] -- (int) the return value is SDE_OK on success, or an error code on failure. */ int papi_sde_add_counter_to_group(papi_handle_t handle, const char *event_name, const char *group_name, uint32_t group_flags) { papisde_library_desc_t *lib_handle; sde_counter_t *tmp_item, *tmp_group; uint32_t cntr_group_uniq_id; char *full_event_name, *full_group_name; int ret_val; lib_handle = (papisde_library_desc_t *) handle; papisde_control_t *gctl = _papisde_global_control; if( (NULL==lib_handle) || lib_handle->disabled || (NULL==gctl) || gctl->disabled) return SDE_OK; SDEDBG("Adding counter: %s into group %s\n",event_name, group_name); if( NULL == lib_handle->libraryName ){ SDE_ERROR("papi_sde_add_counter_to_group(): 'handle' is clobbered. Unable to add counter to group."); return SDE_EINVAL; } size_t str_len = strlen(lib_handle->libraryName)+strlen(event_name)+2+1; // +2 for "::" and +1 for '\0' full_event_name = (char *)malloc(str_len*sizeof(char)); snprintf(full_event_name, str_len, "%s::%s", lib_handle->libraryName, event_name); // After this point we will be modifying data structures, so we need to acquire a lock. // This function has multiple exit points. If you add more, make sure you unlock before each one of them. sde_lock(); // Check to make sure that the event is already registered. This is not the place to create a placeholder. tmp_item = ht_lookup_by_name(lib_handle->lib_counters, full_event_name); if( NULL == tmp_item ){ SDE_ERROR("papi_sde_add_counter_to_group(): Unable to find counter: '%s'.",full_event_name); free(full_event_name); ret_val = SDE_EINVAL; goto fn_exit; } // We will not use the name beyond this point free(full_event_name); str_len = strlen(lib_handle->libraryName)+strlen(group_name)+2+1; // +2 for "::" and +1 for '\0' full_group_name = (char *)malloc(str_len*sizeof(char)); snprintf(full_group_name, str_len, "%s::%s", lib_handle->libraryName, group_name); // Check to see if the group exists already. Otherwise we need to create it.
tmp_group = ht_lookup_by_name(lib_handle->lib_counters, full_group_name); if( NULL == tmp_group ){ papisde_control_t *gctl = sdei_get_global_struct(); // We use the current number of registered events as the uniq id of the counter group, and we // increment it because counter groups are treated as real counters by the outside world. // They are first class citizens. cntr_group_uniq_id = gctl->num_reg_events++; gctl->num_live_events++; SDEDBG("%s line %d: Unique ID for new counter group = %d\n", __FILE__, __LINE__, cntr_group_uniq_id); tmp_group = (sde_counter_t *)calloc(1, sizeof(sde_counter_t)); tmp_group->cntr_class = CNTR_CLASS_GROUP; tmp_group->glb_uniq_id = cntr_group_uniq_id; // copy the name because we will free the malloced space further down in this function. tmp_group->name = strdup(full_group_name); // make a copy here, because we will free() the 'name' and the 'description' separately. tmp_group->description = strdup( full_group_name ); tmp_group->which_lib = lib_handle; tmp_group->u.cntr_group.group_flags = group_flags; (void)ht_insert(lib_handle->lib_counters, ht_hash_name(full_group_name), tmp_group); (void)ht_insert(gctl->all_reg_counters, ht_hash_id(cntr_group_uniq_id), tmp_group); }else{ if( NULL == tmp_group->u.cntr_group.group_head ){ if( CNTR_CLASS_PLACEHOLDER == tmp_group->cntr_class ){ tmp_group->cntr_class = CNTR_CLASS_GROUP; }else{ SDE_ERROR("papi_sde_add_counter_to_group(): Found an empty counter group: '%s'. This might indicate that a cleanup routine is not doing its job.", group_name); } } // make sure the caller is not trying to change the flags of the group after it has been created. if( tmp_group->u.cntr_group.group_flags != group_flags ){ SDE_ERROR("papi_sde_add_counter_to_group(): Attempting to add counter '%s' to counter group '%s' with incompatible group flags.", event_name, group_name); free(full_group_name); ret_val = SDE_EINVAL; goto fn_exit; } } // Add the new counter to the group's head. 
papisde_list_entry_t *new_head = (papisde_list_entry_t *)calloc(1, sizeof(papisde_list_entry_t)); new_head->item = tmp_item; new_head->next = tmp_group->u.cntr_group.group_head; tmp_group->u.cntr_group.group_head = new_head; if( SDE_OK != sdei_inc_ref_count(tmp_item) ){ SDE_ERROR("papi_sde_add_counter_to_group(): Error while adding counter '%s' to counter group: '%s'.", tmp_item->name, group_name); } free(full_group_name); ret_val = SDE_OK; fn_exit: sde_unlock(); return ret_val; } /** This function creates a counter whose memory is allocated and managed by libsde, in contrast with papi_sde_register_counter(), which works with counters that are managed by the user library that is calling this function. This counter can only be modified via the functions papi_sde_inc_counter() and papi_sde_reset_counter(). This has two benefits over a counter which lives inside the user library and is modified directly by that library: A) Our counter and the modifying API are guaranteed to be thread safe. B) Since libsde knows about each change in the value of the counter, overflow detection is accurate. However, this approach has higher overhead than executing "my_cntr += value" inside a user library. @param[in] handle -- pointer (of opaque type papi_handle_t) to sde structure for an individual library. @param[in] event_name -- (const char *) name of the event. @param[in] cntr_mode -- (int) the mode of the counter (one of: PAPI_SDE_RO, PAPI_SDE_RW and one of: PAPI_SDE_DELTA, PAPI_SDE_INSTANT). @param[out] cntr_handle -- address of a pointer in which libsde will store a handle to the newly created counter. @param[out] -- (int) the return value is SDE_OK on success, or an error code on failure.
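The thread-safety benefit described above comes from performing every update under libsde's lock, which also gives the library a serialization point at which overflow can be checked after each change. A self-contained sketch of that inc-under-lock pattern using a plain pthread mutex (this is an illustration of the idea, not the library's actual lock implementation):

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static long long demo_value = 0;

/* Mirrors the shape of papi_sde_inc_counter(): take the lock, apply the
   increment, release. Overflow checks would run at this serialized point. */
static void demo_inc_counter(long long increment)
{
    pthread_mutex_lock(&demo_lock);
    demo_value += increment;
    pthread_mutex_unlock(&demo_lock);
}

static void *demo_worker(void *arg)
{
    (void)arg;
    int i;
    for (i = 0; i < 100000; i++)
        demo_inc_counter(1);
    return NULL;
}

/* Run two contending threads; with the lock the final value is exact,
   whereas an unprotected "my_cntr += value" could lose updates. */
static long long demo_run(void)
{
    pthread_t t1, t2;
    demo_value = 0;
    pthread_create(&t1, NULL, demo_worker, NULL);
    pthread_create(&t2, NULL, demo_worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return demo_value;
}
```

This is also why created counters cost more than a direct in-library increment: every update pays for a lock acquisition.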
*/ int papi_sde_create_counter( papi_handle_t handle, const char *event_name, int cntr_mode, void **cntr_handle ) { int ret_val; long long int *counter_data; char *full_event_name; papisde_library_desc_t *lib_handle; sde_counter_t *cntr; cntr_class_specific_t cntr_union; lib_handle = (papisde_library_desc_t *) handle; papisde_control_t *gctl = _papisde_global_control; if( (NULL==lib_handle) || lib_handle->disabled || (NULL==gctl) || gctl->disabled) return SDE_OK; if( NULL != event_name ) SDEDBG("Preparing to create counter: '%s'.\n", event_name); sde_lock(); if( NULL == lib_handle->libraryName ){ SDE_ERROR("papi_sde_create_counter(): 'handle' is clobbered. Unable to create counter."); ret_val = SDE_EINVAL; goto fn_exit; } SDEDBG("Adding created counter: '%s' with mode: '%d' in SDE library: %s.\n", event_name, cntr_mode, lib_handle->libraryName); // Created counters use memory allocated by libsde, not the user library. counter_data = (long long int *)calloc(1, sizeof(long long int)); cntr_union.cntr_basic.data = counter_data; ret_val = sdei_setup_counter_internals( lib_handle, event_name, cntr_mode, PAPI_SDE_long_long, CNTR_CLASS_CREATED, cntr_union ); if( SDE_OK != ret_val ){ goto fn_exit; } size_t str_len = strlen(lib_handle->libraryName)+strlen(event_name)+2+1; // +2 for "::" and +1 for '\0' full_event_name = (char *)malloc(str_len*sizeof(char)); snprintf(full_event_name, str_len, "%s::%s", lib_handle->libraryName, event_name); cntr = ht_lookup_by_name(lib_handle->lib_counters, full_event_name); if(NULL == cntr) { SDEDBG("Created counter '%s' not properly inserted in SDE library '%s'\n", full_event_name, lib_handle->libraryName); free(full_event_name); ret_val = SDE_ECMP; goto fn_exit; } if( NULL != cntr_handle ){ *(sde_counter_t **)cntr_handle = cntr; } free(full_event_name); ret_val = SDE_OK; fn_exit: sde_unlock(); return ret_val; } // The following function works only for counters created using papi_sde_create_counter().
int papi_sde_inc_counter( papi_handle_t cntr_handle, long long int increment) { long long int *ptr; sde_counter_t *tmp_cntr; int ret_val; tmp_cntr = (sde_counter_t *)cntr_handle; papisde_control_t *gctl = _papisde_global_control; if( (NULL==tmp_cntr) || (NULL==tmp_cntr->which_lib) || tmp_cntr->which_lib->disabled || (NULL==gctl) || gctl->disabled) return SDE_OK; sde_lock(); if( !IS_CNTR_CREATED(tmp_cntr) || (NULL == tmp_cntr->u.cntr_basic.data) ){ SDE_ERROR("papi_sde_inc_counter(): 'cntr_handle' is clobbered. Unable to modify value of counter."); ret_val = SDE_EINVAL; goto fn_exit; } if( PAPI_SDE_long_long != tmp_cntr->cntr_type ){ SDE_ERROR("papi_sde_inc_counter(): Counter is not of type \"long long int\" and cannot be modified using this function."); ret_val = SDE_EINVAL; goto fn_exit; } SDEDBG("Preparing to increment counter: '%s::%s' by %lld.\n", tmp_cntr->which_lib->libraryName, tmp_cntr->name, increment); ptr = tmp_cntr->u.cntr_basic.data; *ptr += increment; sdei_check_overflow_status(tmp_cntr->glb_uniq_id, *ptr); ret_val = SDE_OK; fn_exit: sde_unlock(); return ret_val; } /** This function creates a counting set: a container that counts how many times each distinct element has been inserted into it. @param[in] handle -- pointer (of opaque type papi_handle_t) to sde structure for an individual library. @param[in] cset_name -- (const char *) name of the counting set. @param[out] cset_handle -- address of a pointer in which libsde will store a handle to the newly created counting set. @param[out] -- (int) the return value is SDE_OK on success, or an error code on failure.
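A counting set tracks how many times each distinct element has been inserted, with equality decided by the element's hashable bytes. A self-contained sketch of that idea using a small linear-probing table; this is hypothetical illustrative code, far simpler than libsde's cset hash table (in particular it stores the caller's pointer, whereas the real implementation copies the element bytes):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_TBL_SIZE 64

/* Each slot remembers the element's hashable bytes and how many times an
   equal element has been inserted so far. */
static struct { const void *key; size_t len; long long count; } demo_tbl[DEMO_TBL_SIZE];

static long long demo_cset_insert(const void *element, size_t hashable_size)
{
    size_t i, h = 0;
    const unsigned char *p = (const unsigned char *)element;
    for (i = 0; i < hashable_size; i++)      /* simple byte hash */
        h = h * 31 + p[i];
    for (i = 0; i < DEMO_TBL_SIZE; i++) {    /* linear probing */
        size_t slot = (h + i) % DEMO_TBL_SIZE;
        if (NULL == demo_tbl[slot].key) {    /* first occurrence */
            demo_tbl[slot].key = element;
            demo_tbl[slot].len = hashable_size;
            return demo_tbl[slot].count = 1;
        }
        if (demo_tbl[slot].len == hashable_size &&
            0 == memcmp(demo_tbl[slot].key, element, hashable_size))
            return ++demo_tbl[slot].count;   /* seen before: bump count */
    }
    return -1;                               /* table full */
}
```

This mirrors the shape of papi_sde_counting_set_insert(), where 'hashable_size' selects the prefix of the element that determines identity.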
*/ int papi_sde_create_counting_set( papi_handle_t handle, const char *cset_name, void **cset_handle ) { int ret_val; sde_counter_t *tmp_cset_handle; char *full_cset_name; papisde_library_desc_t *lib_handle; cntr_class_specific_t cntr_union; SDEDBG("papi_sde_create_counting_set()\n"); lib_handle = (papisde_library_desc_t *) handle; papisde_control_t *gctl = _papisde_global_control; if( (NULL==lib_handle) || lib_handle->disabled || (NULL==gctl) || gctl->disabled) return SDE_OK; if( NULL != cset_name ) SDEDBG("Preparing to create counting set: '%s'.\n", cset_name); if( NULL == lib_handle->libraryName ){ SDE_ERROR("papi_sde_create_counting_set(): 'handle' is clobbered. Unable to create counting set."); return SDE_EINVAL; } SDEDBG("Adding counting set: '%s' in SDE library: %s.\n", cset_name, lib_handle->libraryName); // Allocate the structure for the hash table. cntr_union.cntr_cset.data = (cset_hash_table_t *)calloc(1,sizeof(cset_hash_table_t)); if( NULL == cntr_union.cntr_cset.data ) return SDE_ENOMEM; ret_val = sdei_setup_counter_internals( lib_handle, cset_name, PAPI_SDE_DELTA|PAPI_SDE_RO, PAPI_SDE_long_long, CNTR_CLASS_CSET, cntr_union ); if( SDE_OK != ret_val ) return ret_val; size_t str_len = strlen(lib_handle->libraryName)+strlen(cset_name)+2+1; // +2 for "::" and +1 for '\0' full_cset_name = (char *)malloc(str_len*sizeof(char)); snprintf(full_cset_name, str_len, "%s::%s", lib_handle->libraryName, cset_name); tmp_cset_handle = ht_lookup_by_name(lib_handle->lib_counters, full_cset_name); if(NULL == tmp_cset_handle) { SDEDBG("Counting set '%s' not properly inserted in SDE library '%s'\n", full_cset_name, lib_handle->libraryName); free(full_cset_name); return SDE_ECMP; } if( NULL != cset_handle ){ *(sde_counter_t **)cset_handle = tmp_cset_handle; } free(full_cset_name); return SDE_OK; } int papi_sde_counting_set_remove( void *cset_handle, size_t hashable_size, const void *element, uint32_t type_id) { sde_counter_t *tmp_cset; int ret_val = SDE_OK; tmp_cset =
(sde_counter_t *)cset_handle; papisde_control_t *gctl = _papisde_global_control; if( (NULL==tmp_cset) || (NULL==tmp_cset->which_lib) || tmp_cset->which_lib->disabled || (NULL==gctl) || gctl->disabled) return SDE_OK; sde_lock(); if( !IS_CNTR_CSET(tmp_cset) || (NULL == tmp_cset->u.cntr_cset.data) ){ SDE_ERROR("papi_sde_counting_set_remove(): Counting set is clobbered. Unable to remove element."); ret_val = SDE_EINVAL; goto fn_exit; } SDEDBG("Preparing to remove element from counting set: '%s::%s'.\n", tmp_cset->which_lib->libraryName, tmp_cset->name); ret_val = cset_remove_elem(tmp_cset->u.cntr_cset.data, hashable_size, element, type_id); fn_exit: sde_unlock(); return ret_val; } int papi_sde_counting_set_insert( void *cset_handle, size_t element_size, size_t hashable_size, const void *element, uint32_t type_id) { sde_counter_t *tmp_cset; int ret_val = SDE_OK; tmp_cset = (sde_counter_t *)cset_handle; papisde_control_t *gctl = _papisde_global_control; if( (NULL==tmp_cset) || (NULL==tmp_cset->which_lib) || tmp_cset->which_lib->disabled || (NULL==gctl) || gctl->disabled) return SDE_OK; sde_lock(); if( !IS_CNTR_CSET(tmp_cset) || (NULL == tmp_cset->u.cntr_cset.data) ){ SDE_ERROR("papi_sde_counting_set_insert(): Counting set is clobbered. 
Unable to insert element."); ret_val = SDE_EINVAL; goto fn_exit; } SDEDBG("Preparing to insert element in counting set: '%s::%s'.\n", tmp_cset->which_lib->libraryName, tmp_cset->name); ret_val = cset_insert_elem(tmp_cset->u.cntr_cset.data, element_size, hashable_size, element, type_id); fn_exit: sde_unlock(); return ret_val; } int papi_sde_create_recorder( papi_handle_t handle, const char *event_name, size_t typesize, int (*cmpr_func_ptr)(const void *p1, const void *p2), void **record_handle ) { int ret_val, i; sde_counter_t *tmp_rec_handle; cntr_class_specific_t aux_cntr_union; char *aux_event_name; size_t str_len; char *full_event_name; cntr_class_specific_t cntr_union; #define _SDE_MODIFIER_COUNT 6 const char *modifiers[_SDE_MODIFIER_COUNT] = {":CNT",":MIN",":Q1",":MED",":Q3",":MAX"}; // Add a NULL pointer for symmetry with the 'modifiers' vector, since the modifier ':CNT' does not have a function pointer. long long (*func_ptr_vec[_SDE_MODIFIER_COUNT])(void *) = {NULL, sdei_compute_min, sdei_compute_q1, sdei_compute_med, sdei_compute_q3, sdei_compute_max}; papisde_library_desc_t *lib_handle = (papisde_library_desc_t *)handle; papisde_control_t *gctl = _papisde_global_control; if( (NULL==lib_handle) || lib_handle->disabled || (NULL==gctl) || gctl->disabled) return SDE_OK; sde_lock(); if( NULL == lib_handle->libraryName ){ SDE_ERROR("papi_sde_create_recorder(): 'handle' is clobbered. Unable to create recorder."); ret_val = SDE_EINVAL; goto fn_exit; } SDEDBG("Preparing to create recorder: '%s' with typesize: '%d' in SDE library: %s.\n", event_name, (int)typesize, lib_handle->libraryName); // Allocate the "Exponential Storage" structure for the recorder data and meta-data. cntr_union.cntr_recorder.data = (recorder_data_t *)calloc(1,sizeof(recorder_data_t)); // Allocate the first chunk of recorder data. 
cntr_union.cntr_recorder.data->ptr_array[0] = malloc(EXP_CONTAINER_MIN_SIZE*typesize); cntr_union.cntr_recorder.data->total_entries = EXP_CONTAINER_MIN_SIZE; cntr_union.cntr_recorder.data->typesize = typesize; cntr_union.cntr_recorder.data->used_entries = 0; ret_val = sdei_setup_counter_internals( lib_handle, event_name, PAPI_SDE_DELTA|PAPI_SDE_RO, PAPI_SDE_long_long, CNTR_CLASS_RECORDER, cntr_union ); if( SDE_OK != ret_val ) goto fn_exit; str_len = strlen(lib_handle->libraryName)+strlen(event_name)+2+1; // +2 for "::" and +1 for '\0' full_event_name = (char *)malloc(str_len*sizeof(char)); snprintf(full_event_name, str_len, "%s::%s", lib_handle->libraryName, event_name); tmp_rec_handle = ht_lookup_by_name(lib_handle->lib_counters, full_event_name); if(NULL == tmp_rec_handle) { SDEDBG("Recorder '%s' not properly inserted in SDE library '%s'\n", full_event_name, lib_handle->libraryName); free(full_event_name); ret_val = SDE_ECMP; goto fn_exit; } // We will not use the name beyond this point free(full_event_name); if( NULL != record_handle ){ *(sde_counter_t **)record_handle = tmp_rec_handle; } // At this point we are done creating the recorder and we will create the additional events which will appear as modifiers of the recorder. str_len = 0; for(i=0; i<_SDE_MODIFIER_COUNT; i++){ size_t tmp_len = strlen(modifiers[i]); if( tmp_len > str_len ) str_len = tmp_len; } str_len += strlen(event_name)+1; aux_event_name = (char *)calloc(str_len, sizeof(char)); snprintf(aux_event_name, str_len, "%s%s", event_name, modifiers[0]); SDEDBG("papi_sde_create_recorder(): Preparing to register aux counter: '%s' in SDE library: %s.\n", aux_event_name, lib_handle->libraryName); // The field that holds the number of used entries in the recorder structure will become the counter of the new auxiliary event.
aux_cntr_union.cntr_basic.data = &(tmp_rec_handle->u.cntr_recorder.data->used_entries); ret_val = sdei_setup_counter_internals( lib_handle, (const char *)aux_event_name, PAPI_SDE_INSTANT|PAPI_SDE_RO, PAPI_SDE_long_long, CNTR_CLASS_REGISTERED, aux_cntr_union ); if( SDE_OK != ret_val ){ SDEDBG("papi_sde_create_recorder(): Registration of aux counter: '%s' in SDE library: %s FAILED.\n", aux_event_name, lib_handle->libraryName); free(aux_event_name); goto fn_exit; } // If the caller passed NULL as the function pointer, then they do _not_ want the quantiles. Otherwise, create them. if( NULL != cmpr_func_ptr ){ for(i=1; i<_SDE_MODIFIER_COUNT; i++){ sde_sorting_params_t *sorting_params; sorting_params = (sde_sorting_params_t *)malloc(sizeof(sde_sorting_params_t)); // This will be free()-ed by papi_sde_unregister_counter() sorting_params->recording = tmp_rec_handle; sorting_params->cmpr_func_ptr = cmpr_func_ptr; snprintf(aux_event_name, str_len, "%s%s", event_name, modifiers[i]); SDEDBG("papi_sde_create_recorder(): Preparing to register aux fp counter: '%s' in SDE library: %s.\n", aux_event_name, lib_handle->libraryName); // clear the previous entries; memset(&aux_cntr_union, 0, sizeof(aux_cntr_union)); aux_cntr_union.cntr_cb.callback = func_ptr_vec[i]; aux_cntr_union.cntr_cb.param = sorting_params; ret_val = sdei_setup_counter_internals(lib_handle, (const char *)aux_event_name, PAPI_SDE_RO|PAPI_SDE_INSTANT, PAPI_SDE_long_long, CNTR_CLASS_CB, aux_cntr_union ); if( SDE_OK != ret_val ){ SDEDBG("papi_sde_create_recorder(): Registration of aux counter: '%s' in SDE library: %s FAILED.\n", aux_event_name, lib_handle->libraryName); free(aux_event_name); goto fn_exit; } } } free(aux_event_name); ret_val = SDE_OK; fn_exit: sde_unlock(); return ret_val; } int papi_sde_record( void *record_handle, size_t typesize, const void *value) { sde_counter_t *tmp_rcrd; int ret_val; tmp_rcrd = (sde_counter_t *)record_handle; papisde_control_t *gctl = _papisde_global_control; if( 
(NULL==tmp_rcrd) || (NULL==tmp_rcrd->which_lib) || tmp_rcrd->which_lib->disabled || (NULL==gctl) || gctl->disabled) return SDE_OK; SDEDBG("Preparing to record value of size %lu at address: %p\n",typesize, value); sde_lock(); if( !IS_CNTR_RECORDER(tmp_rcrd) || (NULL == tmp_rcrd->u.cntr_recorder.data) ){ SDE_ERROR("papi_sde_record(): 'record_handle' is clobbered. Unable to record value."); ret_val = SDE_EINVAL; goto fn_exit; } ret_val = exp_container_insert_element(tmp_rcrd->u.cntr_recorder.data, typesize, value); fn_exit: sde_unlock(); return ret_val; } // This function neither frees the allocated space, nor zeroes it. It only resets the counter of used entries so that // the allocated space can be reused (and overwritten) by future calls to record(). int papi_sde_reset_recorder( void *record_handle ) { sde_counter_t *tmp_rcrdr; int ret_val; tmp_rcrdr = (sde_counter_t *)record_handle; papisde_control_t *gctl = _papisde_global_control; if( (NULL==tmp_rcrdr) || (NULL==tmp_rcrdr->which_lib) || tmp_rcrdr->which_lib->disabled || (NULL==gctl) || gctl->disabled) return SDE_OK; sde_lock(); if( !IS_CNTR_RECORDER(tmp_rcrdr) || NULL == tmp_rcrdr->u.cntr_recorder.data ){ SDE_ERROR("papi_sde_reset_recorder(): 'record_handle' is clobbered. Unable to reset recorder."); ret_val = SDE_EINVAL; goto fn_exit; } // NOTE: do _not_ free the chunks and do _not_ reset "cntr_recorder.data->total_entries" tmp_rcrdr->u.cntr_recorder.data->used_entries = 0; free( tmp_rcrdr->u.cntr_recorder.data->sorted_buffer ); tmp_rcrdr->u.cntr_recorder.data->sorted_buffer = NULL; tmp_rcrdr->u.cntr_recorder.data->sorted_entries = 0; ret_val = SDE_OK; fn_exit: sde_unlock(); return ret_val; } // The following function works only for counters created using papi_sde_create_counter().
int papi_sde_reset_counter( void *cntr_handle ) { long long int *ptr; sde_counter_t *tmp_cntr; int ret_val; tmp_cntr = (sde_counter_t *)cntr_handle; papisde_control_t *gctl = _papisde_global_control; if( (NULL==tmp_cntr) || (NULL==tmp_cntr->which_lib) || tmp_cntr->which_lib->disabled || (NULL==gctl) || gctl->disabled) return SDE_OK; sde_lock(); if( !IS_CNTR_CREATED(tmp_cntr) ){ SDE_ERROR("papi_sde_reset_counter(): Counter is not created by PAPI, so it cannot be reset."); ret_val = SDE_EINVAL; goto fn_exit; } ptr = (long long int *)(tmp_cntr->u.cntr_basic.data); if( NULL == ptr ){ SDE_ERROR("papi_sde_reset_counter(): Counter structure is clobbered. Unable to reset value of counter."); ret_val = SDE_EINVAL; goto fn_exit; } *ptr = 0; // Reset the counter. ret_val = SDE_OK; fn_exit: sde_unlock(); return ret_val; } /** This function finds the handle associated with a created counter, or a recorder, given the library handle and the event name. @param[in] handle -- (void *) pointer to sde structure for an individual library @param[in] event_name -- name of the event */ void *papi_sde_get_counter_handle( void *handle, const char *event_name) { sde_counter_t *counter_handle; papisde_library_desc_t *lib_handle; char *full_event_name; lib_handle = (papisde_library_desc_t *) handle; papisde_control_t *gctl = _papisde_global_control; if( (NULL==lib_handle) || lib_handle->disabled || (NULL==gctl) || gctl->disabled) return NULL; if( NULL == lib_handle->libraryName ){ SDE_ERROR("papi_sde_get_counter_handle(): 'handle' is clobbered."); return NULL; } size_t str_len = strlen(lib_handle->libraryName)+strlen(event_name)+2+1; // +2 for "::" and +1 for '\0' full_event_name = (char *)malloc(str_len*sizeof(char)); snprintf(full_event_name, str_len, "%s::%s", lib_handle->libraryName, event_name); // After this point we will be accessing shared data structures, so we need to acquire a lock. 
sde_lock(); counter_handle = ht_lookup_by_name(lib_handle->lib_counters, full_event_name); sde_unlock(); free(full_event_name); return counter_handle; } /*************************************************************************/ /* Utility Functions. */ /*************************************************************************/ int papi_sde_compare_long_long(const void *p1, const void *p2){ long long n1, n2; n1 = *(long long *)p1; n2 = *(long long *)p2; if( n1 < n2 ) return -1; if( n1 > n2 ) return 1; return 0; } int papi_sde_compare_int(const void *p1, const void *p2){ int n1, n2; n1 = *(int *)p1; n2 = *(int *)p2; if( n1 < n2 ) return -1; if( n1 > n2 ) return 1; return 0; } int papi_sde_compare_double(const void *p1, const void *p2){ double n1, n2; n1 = *(double *)p1; n2 = *(double *)p2; if( n1 < n2 ) return -1; if( n1 > n2 ) return 1; return 0; } int papi_sde_compare_float(const void *p1, const void *p2){ float n1, n2; n1 = *(float *)p1; n2 = *(float *)p2; if( n1 < n2 ) return -1; if( n1 > n2 ) return 1; return 0; } #define _SDE_CMP_MIN 0 #define _SDE_CMP_MAX 1 // This function returns a "long long" which contains a pointer to the // data element that corresponds to the edge (min/max), so that it works // for all types of data, not only integers. static inline long long sdei_compute_edge(void *param, int which_edge){ void *edge = NULL, *edge_copy; long long elem_cnt; long long current_size, cumul_size = 0; void *src; int i, chunk; size_t typesize; sde_counter_t *rcrd; int (*cmpr_func_ptr)(const void *p1, const void *p2); rcrd = ((sde_sorting_params_t *)param)->recording; elem_cnt = rcrd->u.cntr_recorder.data->used_entries; typesize = rcrd->u.cntr_recorder.data->typesize; cmpr_func_ptr = ((sde_sorting_params_t *)param)->cmpr_func_ptr; // The return value is supposed to be a pointer to the correct element, therefore zero // is a NULL pointer, which should tell the caller that there was a problem. 
if( (0 == elem_cnt) || (NULL == cmpr_func_ptr) ) return 0; // If there is a sorted (contiguous) buffer, but it's stale, we need to free it. // The value of elem_cnt (rcrd->u.cntr_recorder.data->used_entries) can // only increase, or be reset to zero, but when it is reset to zero // (by papi_sde_reset_recorder()) the buffer will be freed (by the same function). if( (NULL != rcrd->u.cntr_recorder.data->sorted_buffer) && (rcrd->u.cntr_recorder.data->sorted_entries < elem_cnt) ){ free( rcrd->u.cntr_recorder.data->sorted_buffer ); rcrd->u.cntr_recorder.data->sorted_buffer = NULL; rcrd->u.cntr_recorder.data->sorted_entries = 0; } // Check if a sorted contiguous buffer is already there. If there is, return // the first or last element (for MIN, or MAX respectively). if( NULL != rcrd->u.cntr_recorder.data->sorted_buffer ){ if( _SDE_CMP_MIN == which_edge ) edge = rcrd->u.cntr_recorder.data->sorted_buffer; if( _SDE_CMP_MAX == which_edge ) edge = (char *)(rcrd->u.cntr_recorder.data->sorted_buffer) + (elem_cnt-1)*typesize; }else{ // Make "edge" point to the beginning of the first chunk. edge = rcrd->u.cntr_recorder.data->ptr_array[0]; if ( NULL == edge ) return 0; cumul_size = 0; for(chunk=0; chunk<EXP_CONTAINER_ENTRIES; chunk++){ current_size = ((long long)1<<chunk)*EXP_CONTAINER_MIN_SIZE; src = rcrd->u.cntr_recorder.data->ptr_array[chunk]; for(i=0; (i < (elem_cnt-cumul_size)) && (i < current_size); i++){ void *next_elem = (char *)src + i*typesize; int rslt = cmpr_func_ptr(next_elem, edge); // If the new element is smaller than the current min and we are looking for the min, then keep it. if( (rslt < 0) && (_SDE_CMP_MIN == which_edge) ) edge = next_elem; // If the new element is larger than the current max and we are looking for the max, then keep it. if( (rslt > 0) && (_SDE_CMP_MAX == which_edge) ) edge = next_elem; } cumul_size += current_size; if( cumul_size >= elem_cnt ) break; } } // We might free the sorted_buffer (when it becomes stale), so we can't return "edge". // Therefore, we allocate fresh space for the resulting element and copy it there.
// Since we do not know when the user will use this pointer, we will not be able // to free it, so it is the responsibility of the user (who calls PAPI_read()) to // free this memory. edge_copy = malloc( 1 * typesize); memcpy(edge_copy, edge, 1 * typesize); // A pointer is guaranteed to fit inside a long long, so cast it and return a long long. return (long long)edge_copy; } // This function returns a "long long" which contains a pointer to the // data element that corresponds to the requested quantile, so that it works // for all types of data, not only integers. // NOTE: This function allocates memory for one element and returns a pointer // to this memory. Since we do not know when the user will use this pointer, we // cannot free it, so it is the responsibility of the user // (the code that calls PAPI_read()) to free this memory. static inline long long sdei_compute_quantile(void *param, int percent){ long long quantile, elem_cnt; void *result_data; size_t typesize; sde_counter_t *rcrd; int (*cmpr_func_ptr)(const void *p1, const void *p2); rcrd = ((sde_sorting_params_t *)param)->recording; elem_cnt = rcrd->u.cntr_recorder.data->used_entries; typesize = rcrd->u.cntr_recorder.data->typesize; cmpr_func_ptr = ((sde_sorting_params_t *)param)->cmpr_func_ptr; // The return value is supposed to be a pointer to the correct element, therefore zero // is a NULL pointer, which should tell the caller that there was a problem. if( (0 == elem_cnt) || (NULL == cmpr_func_ptr) ) return 0; // If there is a sorted (contiguous) buffer, but it's stale, we need to free it. // The value of elem_cnt (rcrd->u.cntr_recorder.data->used_entries) can // only increase, or be reset to zero, but when it is reset to zero // (by papi_sde_reset_recorder()) the buffer will be freed (by the same function).
if( (NULL != rcrd->u.cntr_recorder.data->sorted_buffer) && (rcrd->u.cntr_recorder.data->sorted_entries < elem_cnt) ){ free( rcrd->u.cntr_recorder.data->sorted_buffer ); rcrd->u.cntr_recorder.data->sorted_buffer = NULL; rcrd->u.cntr_recorder.data->sorted_entries = 0; } // Check if a sorted buffer is already there. If there isn't, allocate one. if( NULL == rcrd->u.cntr_recorder.data->sorted_buffer ){ rcrd->u.cntr_recorder.data->sorted_buffer = malloc(elem_cnt * typesize); exp_container_to_contiguous(rcrd->u.cntr_recorder.data, rcrd->u.cntr_recorder.data->sorted_buffer); // We set this field so we can test later to see if the allocated buffer is stale. rcrd->u.cntr_recorder.data->sorted_entries = elem_cnt; } void *sorted_buffer = rcrd->u.cntr_recorder.data->sorted_buffer; qsort(sorted_buffer, elem_cnt, typesize, cmpr_func_ptr); void *tmp_ptr = (char *)sorted_buffer + typesize*((elem_cnt*percent)/100); // We might free the sorted_buffer (when it becomes stale), so we can't return "tmp_ptr". // Therefore, we allocate fresh space for the resulting element and copy it there. // Since we do not know when the user will use this pointer, we will not be able // to free it, so it is the responsibility of the user (who calls PAPI_read()) to // free this memory. result_data = malloc(typesize); memcpy(result_data, tmp_ptr, typesize); // convert the pointer into a long long so we can return it.
quantile = (long long)result_data; return quantile; } static long long sdei_compute_q1(void *param){ return sdei_compute_quantile(param, 25); } static long long sdei_compute_med(void *param){ return sdei_compute_quantile(param, 50); } static long long sdei_compute_q3(void *param){ return sdei_compute_quantile(param, 75); } static long long sdei_compute_min(void *param){ return sdei_compute_edge(param, _SDE_CMP_MIN); } static long long sdei_compute_max(void *param){ return sdei_compute_edge(param, _SDE_CMP_MAX); } papi-papi-7-2-0-t/src/sde_lib/sde_lib.h000066400000000000000000000165011502707512200176050ustar00rootroot00000000000000/** * @file sde_lib.h * @author Anthony Danalis * adanalis@icl.utk.edu * * @ingroup papi_components * * @brief * SDE prototypes and macros. */ #if !defined(PAPI_SDE_LIB_H) #define PAPI_SDE_LIB_H #include <stdio.h> #include <stdlib.h> #include <stdint.h> #include <stdarg.h> #define PAPI_SDE_VERSION_NUMBER(_maj,_min) ( ((_maj)<<16) | (_min) ) #define PAPI_SDE_VERSION PAPI_SDE_VERSION_NUMBER(1,0) #define PAPI_SDE_RO 0x00 #define PAPI_SDE_RW 0x01 #define PAPI_SDE_DELTA 0x00 #define PAPI_SDE_INSTANT 0x10 #define PAPI_SDE_long_long 0x0 #define PAPI_SDE_int 0x1 #define PAPI_SDE_double 0x2 #define PAPI_SDE_float 0x3 #define PAPI_SDE_SUM 0x0 #define PAPI_SDE_MAX 0x1 #define PAPI_SDE_MIN 0x2 // The following values have been defined such that they match the // corresponding PAPI values from papi.h #define SDE_OK 0 /**< No error */ #define SDE_EINVAL -1 /**< Invalid argument */ #define SDE_ENOMEM -2 /**< Insufficient memory */ #define SDE_ECMP -4 /**< Not supported by component */ #define SDE_ENOEVNT -7 /**< Event does not exist */ #define SDE_EMISC -14 /**< Unknown error code */ #define register_fp_counter register_counter_cb #define papi_sde_register_fp_counter papi_sde_register_counter_cb #define destroy_counter unregister_counter #define destroy_counting_set unregister_counter #define papi_sde_destroy_counter papi_sde_unregister_counter #define papi_sde_destroy_counting_set
papi_sde_unregister_counter #pragma GCC visibility push(default) extern int _sde_be_verbose; extern int _sde_debug; #define SDEDBG(format, args...) { if(_sde_debug){fprintf(stderr,format, ## args);} } static inline void SDE_ERROR( const char *format, ... ){ va_list args; if ( _sde_be_verbose ) { va_start( args, format ); fprintf( stderr, "PAPI SDE Error: " ); vfprintf( stderr, format, args ); fprintf( stderr, "\n" ); va_end( args ); } } #ifdef __cplusplus extern "C" { #endif #define GET_FLOAT_SDE(x) *((float *)&x) #define GET_DOUBLE_SDE(x) *((double *)&x) /* * GET_SDE_RECORDER_ADDRESS() USAGE EXAMPLE: * If SDE recorder logs values of type 'double': * double *ptr = GET_SDE_RECORDER_ADDRESS(papi_event_value[6], double); * for (j=0; j #include namespace papi_sde { class PapiSde { private: papi_handle_t sde_handle; public: PapiSde(const char *name_of_library){ sde_handle = papi_sde_init(name_of_library); } class CreatedCounter; class Recorder; class CountingSet; template int register_counter(const char *event_name, int cntr_mode, T &counter ){ if( std::is_same::value ) return papi_sde_register_counter(sde_handle, event_name, cntr_mode, PAPI_SDE_long_long, &counter); if( std::is_same::value ) return papi_sde_register_counter(sde_handle, event_name, cntr_mode, PAPI_SDE_int, &counter); if( std::is_same::value ) return papi_sde_register_counter(sde_handle, event_name, cntr_mode, PAPI_SDE_double, &counter); if( std::is_same::value ) return papi_sde_register_counter(sde_handle, event_name, cntr_mode, PAPI_SDE_float, &counter); } template int register_counter_cb(const char *event_name, int cntr_mode, T (*func_ptr)(P*), P ¶m){ if( std::is_same::value ){ return papi_sde_register_fp_counter(sde_handle, event_name, cntr_mode, PAPI_SDE_long_long, (papi_sde_fptr_t)func_ptr, ¶m); }else{ SDE_ERROR("register_counter_cb() is currently limited to callback functions that have a return type of 'long long int'."); return SDE_EINVAL; } } int unregister_counter(const char *event_name ){ 
return papi_sde_unregister_counter(sde_handle, event_name); } int describe_counter(const char *event_name, const char *event_description ){ return papi_sde_describe_counter(sde_handle, event_name, event_description); } int add_counter_to_group(const char *event_name, const char *group_name, uint32_t group_flags ){ return papi_sde_add_counter_to_group(sde_handle, event_name, group_name, group_flags); } CreatedCounter *create_counter(const char *event_name, int cntr_mode){ CreatedCounter *ptr; try{ ptr = new CreatedCounter(sde_handle, event_name, cntr_mode); }catch(std::exception const &e){ return nullptr; } return ptr; } Recorder *create_recorder(const char *event_name, size_t typesize, int (*cmpr_func_ptr)(const void *p1, const void *p2)){ Recorder *ptr; try{ ptr = new Recorder(sde_handle, event_name, typesize, cmpr_func_ptr); }catch(std::exception const &e){ return nullptr; } return ptr; } CountingSet *create_counting_set(const char *cset_name){ CountingSet *ptr; try{ ptr = new CountingSet(sde_handle, cset_name); }catch(std::exception const &e){ return nullptr; } return ptr; } class Recorder { private: void *recorder_handle=nullptr; public: Recorder(papi_handle_t sde_handle, const char *event_name, size_t typesize, int (*cmpr_func_ptr)(const void *p1, const void *p2)){ if( SDE_OK != papi_sde_create_recorder(sde_handle, event_name, typesize, cmpr_func_ptr, &recorder_handle ) ) throw std::exception(); } template int record(T const &value){ if( nullptr != recorder_handle ) return papi_sde_record(recorder_handle, sizeof(T), &value); else return SDE_EINVAL; } int reset(void){ if( nullptr != recorder_handle ) return papi_sde_reset_recorder(recorder_handle); else return SDE_EINVAL; } }; class CreatedCounter { private: void *counter_handle=nullptr; public: CreatedCounter(papi_handle_t sde_handle, const char *event_name, int cntr_mode){ if( SDE_OK != papi_sde_create_counter(sde_handle, event_name, cntr_mode, &counter_handle) ) throw std::exception(); } template int 
increment(T const &increment){ if( nullptr == counter_handle ) return SDE_EINVAL; if( std::is_same::value ){ return papi_sde_inc_counter(counter_handle, increment ); }else{ // for now we don't have the C API to handle increments other than "long long", // but we can add this in the future transparently to the user. return papi_sde_inc_counter(counter_handle, (long long int)increment ); } } int reset(void){ if( nullptr != counter_handle ) return papi_sde_reset_counter(counter_handle); else return SDE_EINVAL; } }; // class CreatedCounter class CountingSet { private: void *cset_handle=nullptr; public: CountingSet(papi_handle_t sde_handle, const char *cset_name){ if( SDE_OK != papi_sde_create_counting_set(sde_handle, cset_name, &cset_handle) ) throw std::exception(); } template int insert(T const &element, uint32_t type_id){ if( nullptr == cset_handle ) return SDE_EINVAL; return papi_sde_counting_set_insert( cset_handle, sizeof(T), sizeof(T), &element, type_id); } template int insert(size_t hashable_size, T const &element, uint32_t type_id){ if( nullptr == cset_handle ) return SDE_EINVAL; return papi_sde_counting_set_insert( cset_handle, sizeof(T), hashable_size, &element, type_id); } template int remove(size_t hashable_size, T const &element, uint32_t type_id){ if( nullptr == cset_handle ) return SDE_EINVAL; return papi_sde_counting_set_remove( cset_handle, hashable_size, &element, type_id); } }; // class CountingSet }; // class PapiSde template PapiSde::CreatedCounter &operator+=(PapiSde::CreatedCounter &X, const T increment){ X.increment(increment); return X; } // Prefix increment ++x; inline PapiSde::CreatedCounter &operator++(PapiSde::CreatedCounter &X){ X.increment(1LL); return X; } // Prefix decrement --x; inline PapiSde::CreatedCounter &operator--(PapiSde::CreatedCounter &X){ X.increment(-1LL); return X; } } // namespace papi_sde 
papi-papi-7-2-0-t/src/sde_lib/sde_lib_datastructures.c000066400000000000000000000572731502707512200227470ustar00rootroot00000000000000/** * @file sde_lib_datastructures.c * @author Anthony Danalis * adanalis@icl.utk.edu * * @ingroup papi_components * * @brief * This is a collection of functions that manipulate datastructures * that are used by libsde. */ #include "sde_lib_internal.h" /******************************************************************************/ /* Functions related to the hash-table used for internal hashing of events. */ /******************************************************************************/ uint32_t ht_hash_id(uint32_t uniq_id){ return uniq_id%PAPISDE_HT_SIZE; } // djb2 hash uint32_t ht_hash_name(const char *str) { uint32_t hash = 5381; int c; while ((c = *str++)) hash = ((hash << 5) + hash) + c; /* hash * 33 + c */ return hash % PAPISDE_HT_SIZE; } void ht_insert(papisde_list_entry_t *hash_table, int ht_key, sde_counter_t *sde_counter) { papisde_list_entry_t *list_head, *new_entry; list_head = &hash_table[ht_key]; // If no counter is associated with this key, we will put the new // counter at the head of the list, which has already been allocated. if( NULL == list_head->item ){ list_head->item = sde_counter; list_head->next = NULL; // Just for aesthetic reasons. return; } // If we made it here it means that the head was occupied, so we // will allocate a new element and put it just after the head. new_entry = (papisde_list_entry_t *)calloc(1, sizeof(papisde_list_entry_t)); new_entry->item = sde_counter; new_entry->next = list_head->next; list_head->next = new_entry; return; } // This function serializes the items contained in the hash-table. A pointer // to the resulting serialized array is put into the parameter "rslt_array". // The return value indicates the size of the array. // The array items are copies of the hash-table item into newly allocated // memory. They do not reference the original items in the hash table.
However, // this is a shallow copy. If the items contain pointers, then the pointed // elements are _NOT_ copied. // The caller is responsible for freeing the resulting array memory. int ht_to_array(papisde_list_entry_t *hash_table, sde_counter_t **rslt_array) { int i, item_cnt = 0, index=0; papisde_list_entry_t *list_head, *curr; // First pass counts how many items have been inserted in the hash table. // Traverse all the elements of the hash-table. for(i=0; i<PAPISDE_HT_SIZE; i++){ list_head = &hash_table[i]; if(NULL != list_head->item){ item_cnt++; } for(curr = list_head->next; NULL != curr; curr=curr->next){ if(NULL == curr->item){ // This can only legally happen for the head of the list. SDE_ERROR("ht_to_array(): the hash table is clobbered."); }else{ item_cnt++; } } } // Allocate a contiguous array to store the items. sde_counter_t *array = (sde_counter_t *)malloc( item_cnt * sizeof(sde_counter_t)); // Traverse the hash-table again and copy all the items to the array we just allocated. for(i=0; i<PAPISDE_HT_SIZE; i++){ list_head = &hash_table[i]; if(NULL != list_head->item){ memcpy( &array[index], list_head->item, sizeof(sde_counter_t) ); index++; } for(curr = list_head->next; NULL != curr; curr=curr->next){ if(NULL == curr->item){ // This can only legally happen for the head of the list. SDE_ERROR("ht_to_array(): the hash table is clobbered."); }else{ memcpy( &array[index], curr->item, sizeof(sde_counter_t) ); index++; } } } *rslt_array = array; return item_cnt; } sde_counter_t *ht_delete(papisde_list_entry_t *hash_table, int ht_key, uint32_t uniq_id) { papisde_list_entry_t *list_head, *curr, *prev; sde_counter_t *item; list_head = &hash_table[ht_key]; if( NULL == list_head->item ){ SDE_ERROR("ht_delete(): the entry does not exist."); return NULL; } // If the head contains the element to be deleted, pull the list up and return the item to the caller.
if( list_head->item->glb_uniq_id == uniq_id ){ item = list_head->item; if( NULL != list_head->next){ *list_head = *(list_head->next); }else{ memset(list_head, 0, sizeof(papisde_list_entry_t)); } return item; } prev = list_head; // Traverse the linked list to find the element. for(curr=list_head->next; NULL != curr; curr=curr->next){ if(NULL == curr->item){ // This is only permitted for the head of the list. SDE_ERROR("ht_delete(): the hash table is clobbered."); return NULL; } if(curr->item->glb_uniq_id == uniq_id){ prev->next = curr->next; item = curr->item; free(curr); // free the hash table entry return item; } prev = curr; } SDE_ERROR("ht_delete(): the item is not in the list."); return NULL; } sde_counter_t *ht_lookup_by_name(papisde_list_entry_t *hash_table, const char *name) { papisde_list_entry_t *list_head, *curr; list_head = &hash_table[ht_hash_name(name)]; if( NULL == list_head->item ){ return NULL; } for(curr=list_head; NULL != curr; curr=curr->next){ if(NULL == curr->item){ // This can only legally happen for the head of the list. SDE_ERROR("ht_lookup_by_name() the hash table is clobbered."); return NULL; } if( !strcmp(curr->item->name, name) ){ return curr->item; } } return NULL; } sde_counter_t *ht_lookup_by_id(papisde_list_entry_t *hash_table, uint32_t uniq_id) { papisde_list_entry_t *list_head, *curr; list_head = &hash_table[ht_hash_id(uniq_id)]; if( NULL == list_head->item ){ return NULL; } for(curr=list_head; NULL != curr; curr=curr->next){ if(NULL == curr->item){ // This can only legally happen for the head of the list. SDE_ERROR("ht_lookup_by_id() the hash table is clobbered."); return NULL; } if(curr->item->glb_uniq_id == uniq_id){ return curr->item; } } return NULL; } /******************************************************************************/ /* Functions related to the exponential container used for recorders. 
*/ /******************************************************************************/ void exp_container_to_contiguous(recorder_data_t *exp_container, void *cont_buffer){ long long current_size, typesize, used_entries, tmp_size = 0; void *src, *dst; int i; typesize = exp_container->typesize; used_entries = exp_container->used_entries; for(i=0; i<EXP_CONTAINER_ENTRIES; i++){ current_size = ((long long)1<<i)*EXP_CONTAINER_MIN_SIZE; src = exp_container->ptr_array[i]; dst = (char *)cont_buffer + tmp_size*typesize; if ( (tmp_size+current_size) <= used_entries){ memcpy(dst, src, current_size*typesize); if ( (tmp_size+current_size) == used_entries){ return; } }else{ memcpy(dst, src, (used_entries-tmp_size)*typesize); return; } tmp_size += current_size; } } int exp_container_insert_element(recorder_data_t *exp_container, size_t typesize, const void *value){ long long used_entries, total_entries, prev_entries, offset; int i, chunk; long long tmp_size; if( NULL == exp_container || NULL == exp_container->ptr_array[0]){ SDE_ERROR("exp_container_insert_element(): Exponential container is clobbered. Unable to insert element."); return SDE_EINVAL; } used_entries = exp_container->used_entries; total_entries = exp_container->total_entries; assert(used_entries <= total_entries); // Find how many chunks we have already allocated tmp_size = 0; for(i=0; i<EXP_CONTAINER_ENTRIES; i++){ prev_entries = tmp_size; tmp_size += ((long long)1<<i)*EXP_CONTAINER_MIN_SIZE; // At least the first chunk "exp_container->ptr_array[0]" // must have been already allocated when creating the recorder, so we can // compare the total size after we add the "i-th" size. if (total_entries == tmp_size) break; } chunk = i; // Find how many entries down the last chunk we are. offset = used_entries - prev_entries; if( used_entries == total_entries ){ long long new_segment_size; // If we had used all the available entries (and thus we are allocating more), we start from the beginning of the new chunk. offset = 0; chunk += 1; // we need to allocate the next chunk from the last one we found.
new_segment_size = ((long long)1<<chunk)*EXP_CONTAINER_MIN_SIZE; exp_container->ptr_array[chunk] = malloc(new_segment_size*typesize); exp_container->total_entries += new_segment_size; } void *dest = (char *)(exp_container->ptr_array[chunk]) + offset*typesize; (void)memcpy( dest, value, typesize ); exp_container->used_entries++; return SDE_OK; } /******************************************************************************/ /* Functions related to the F14 inspired hash-table that we used to implement */ /* the counting set. */ /******************************************************************************/ int cset_insert_elem(cset_hash_table_t *hash_ptr, size_t element_size, size_t hashable_size, const void *element, uint32_t type_id){ cset_hash_bucket_t *bucket_ptr; int element_in_bucket = 0; uint32_t vacant_idx, i, occupied; int ret_val; if( NULL == hash_ptr ){ ret_val = SDE_EINVAL; goto fn_exit; } bucket_ptr = hash_ptr->buckets; uint64_t seed = (uint64_t)79365; // decided to be a good seed by a committee. uint64_t key = fasthash64(element, hashable_size, seed); int bucket_idx = (int)(key % _SDE_HASH_BUCKET_COUNT_); uint64_t *key_ptr = bucket_ptr[bucket_idx].keys; cset_hash_decorated_object_t *obj_ptr = bucket_ptr[bucket_idx].objects; occupied = bucket_ptr[bucket_idx].occupied; if( occupied > _SDE_HASH_BUCKET_WIDTH_ ){ SDE_ERROR("cset_insert_elem(): Counting set is clobbered, bucket %d has exceeded capacity.",bucket_idx); ret_val = SDE_ECMP; goto fn_exit; } // First look in the bucket where the hash function told us to look. for(i=0; ioverflow_list ){ // Check if we still have room in the bucket, and if so, add the new element to the bucket. if( vacant_idx < _SDE_HASH_BUCKET_WIDTH_ ){ key_ptr[vacant_idx] = key; obj_ptr[vacant_idx].count = 1; obj_ptr[vacant_idx].type_id = type_id; obj_ptr[vacant_idx].type_size = element_size; obj_ptr[vacant_idx].ptr = malloc(element_size); (void)memcpy(obj_ptr[vacant_idx].ptr, element, element_size); // Let the bucket know that it now has one more element.
bucket_ptr[bucket_idx].occupied += 1; }else{ // If the overflow list is empty and the bucket does not have room, // then we add the new element at the head of the overflow list. cset_list_object_t *new_list_element = (cset_list_object_t *)malloc(sizeof(cset_list_object_t)); new_list_element->next = NULL; new_list_element->count = 1; new_list_element->type_id = type_id; new_list_element->type_size = element_size; new_list_element->ptr = malloc(element_size); (void)memcpy(new_list_element->ptr, element, element_size); // Make the head point to the new element. hash_ptr->overflow_list = new_list_element; } }else{ int element_in_list = 0; // Since there are elements in the overflow list, we need to search there for the one we are looking for. cset_list_object_t *list_runner; for( list_runner = hash_ptr->overflow_list; list_runner != NULL; list_runner = list_runner->next){ // if we find the element in the overflow list, increment the counter and exit the loop. // When we traverse the overflow list we can _not_ use the SDE_HASH_IS_FUZZY flag, because we // don't have matching hashes to indicate that the two elements are close; we are traversing the whole list. if( (hashable_size <= list_runner->type_size) && (type_id == list_runner->type_id) && !memcmp(element, list_runner->ptr, hashable_size) ){ list_runner->count += 1; element_in_list = 1; break; } } // If we traversed the entire list and still haven't found our element, we need to insert it into the cset. if( !element_in_list ){ // Check if there is still room in the bucket. If there is, we should add the new element to the bucket instead of the overflow list. if( vacant_idx < _SDE_HASH_BUCKET_WIDTH_ ){ key_ptr[vacant_idx] = key; obj_ptr[vacant_idx].count = 1; obj_ptr[vacant_idx].type_id = type_id; obj_ptr[vacant_idx].type_size = element_size; obj_ptr[vacant_idx].ptr = malloc(element_size); (void)memcpy(obj_ptr[vacant_idx].ptr, element, element_size); // Let the bucket know that it now has one more element. 
bucket_ptr[bucket_idx].occupied += 1; }else{ // If there is no room in the bucket, add the new element before the current head of the overflow list. cset_list_object_t *new_list_element = (cset_list_object_t *)malloc(sizeof(cset_list_object_t)); // Make the new element's "next" pointer be the current head of the list. new_list_element->next = hash_ptr->overflow_list; new_list_element->count = 1; new_list_element->type_id = type_id; new_list_element->type_size = element_size; new_list_element->ptr = malloc(element_size); (void)memcpy(new_list_element->ptr, element, element_size); // Update the head of the list to point to the new element. hash_ptr->overflow_list = new_list_element; } } } } ret_val = SDE_OK; fn_exit: return ret_val; } int cset_remove_elem(cset_hash_table_t *hash_ptr, size_t hashable_size, const void *element, uint32_t type_id){ cset_hash_bucket_t *bucket_ptr; int element_found = 0; uint32_t i, occupied; int ret_val; if( NULL == hash_ptr ){ ret_val = SDE_EINVAL; goto fn_exit; } bucket_ptr = hash_ptr->buckets; uint64_t seed = (uint64_t)79365; // decided to be a good seed by a committee. uint64_t key = fasthash64(element, hashable_size, seed); int bucket_idx = (int)(key % _SDE_HASH_BUCKET_COUNT_); uint64_t *key_ptr = bucket_ptr[bucket_idx].keys; cset_hash_decorated_object_t *obj_ptr = bucket_ptr[bucket_idx].objects; occupied = bucket_ptr[bucket_idx].occupied; if( occupied > _SDE_HASH_BUCKET_WIDTH_ ){ SDE_ERROR("cset_remove_elem(): Counting set is clobbered, bucket %d has exceeded capacity.",bucket_idx); ret_val = SDE_EINVAL; goto fn_exit; } // First look in the bucket where the hash function told us to look. for(i=0; ioverflow_list ){ SDE_ERROR("cset_remove_elem(): Attempted to remove element that is NOT in the counting set."); }else{ // Since there are elements in the overflow list, we need to search there for the one we are looking for. 
cset_list_object_t *list_runner, *prev; prev = hash_ptr->overflow_list; for( list_runner = hash_ptr->overflow_list; list_runner != NULL; list_runner = list_runner->next){ // if we find the element in the overflow list if( (hashable_size <= list_runner->type_size) && (type_id == list_runner->type_id) && !memcmp(element, list_runner->ptr, hashable_size) ){ list_runner->count -= 1; // If the element reached a count of zero, then remove it from the list, and connect the list around it. if( 0 == list_runner->count ){ // free the memory taken by the user object. free(list_runner->ptr); if( list_runner == hash_ptr->overflow_list ){ // If we are removing the head, the next element becomes the new head. hash_ptr->overflow_list = list_runner->next; }else{ prev->next = list_runner->next; } // free the memory taken by the link node. We can do this here safely // because we will break out of the loop, so we will not need the "next" pointer. free(list_runner); } // since we found the element, we don't need to look at the rest of the list. break; } prev = list_runner; } } } ret_val = SDE_OK; fn_exit: return ret_val; } cset_list_object_t *cset_to_list(cset_hash_table_t *hash_ptr){ cset_hash_bucket_t *bucket_ptr; int bucket_idx; uint32_t i, occupied; cset_list_object_t *head_ptr = NULL; if( NULL == hash_ptr ){ return NULL; } bucket_ptr = hash_ptr->buckets; for( bucket_idx = 0; bucket_idx < _SDE_HASH_BUCKET_COUNT_; bucket_idx++){ cset_hash_decorated_object_t *obj_ptr = bucket_ptr[bucket_idx].objects; occupied = bucket_ptr[bucket_idx].occupied; for(i=0; i<occupied; i++){ size_t type_size = obj_ptr[i].type_size; cset_list_object_t *new_list_element = (cset_list_object_t *)malloc(sizeof(cset_list_object_t)); // Make the current list head be the element after the new one we are creating. new_list_element->next = head_ptr; new_list_element->count = obj_ptr[i].count; new_list_element->type_id = obj_ptr[i].type_id; new_list_element->type_size = type_size; new_list_element->ptr = malloc(type_size); (void)memcpy(new_list_element->ptr, obj_ptr[i].ptr, type_size); // Update the head of the list to point to the new element. head_ptr = new_list_element; } } cset_list_object_t *list_runner; // Copy the elements of the overflow list into the result list as well.
for( list_runner = hash_ptr->overflow_list; list_runner != NULL; list_runner = list_runner->next){ size_t type_size = list_runner->type_size; cset_list_object_t *new_list_element = (cset_list_object_t *)malloc(sizeof(cset_list_object_t)); // make the current list head be the element after the new one we are creating. new_list_element->next = head_ptr; new_list_element->count = list_runner->count; new_list_element->type_id = list_runner->type_id; new_list_element->type_size = type_size; new_list_element->ptr = malloc(type_size); (void)memcpy(new_list_element->ptr, list_runner->ptr, type_size); // Update the head of the list to point to the new element. head_ptr = new_list_element; } return head_ptr; } int cset_delete(cset_hash_table_t *hash_ptr){ cset_hash_bucket_t *bucket_ptr; int bucket_idx; uint32_t i, occupied; if( NULL == hash_ptr ){ return SDE_EINVAL; } bucket_ptr = hash_ptr->buckets; for( bucket_idx = 0; bucket_idx < _SDE_HASH_BUCKET_COUNT_; bucket_idx++){ cset_hash_decorated_object_t *obj_ptr = bucket_ptr[bucket_idx].objects; occupied = bucket_ptr[bucket_idx].occupied; // Free all the elements that occupy entries in this bucket. for(i=0; i<occupied; i++){ free(obj_ptr[i].ptr); } } // Free the overflow list. Each link node is freed one iteration late, because its "next" pointer is still needed. cset_list_object_t *list_runner, *ptr_to_free = NULL; for( list_runner = hash_ptr->overflow_list; list_runner != NULL; list_runner = list_runner->next){ // Free the list element from the previous iteration. free(ptr_to_free); free(list_runner->ptr); // Keep a reference to this element so we can free it _after_ this iteration, because we need the list_runner->next for now. ptr_to_free = list_runner; // If the current element is at the head of the overflow list, then we should mark the head as NULL.
if( list_runner == hash_ptr->overflow_list ) hash_ptr->overflow_list = NULL; } free(ptr_to_free); return SDE_OK; } papi-papi-7-2-0-t/src/sde_lib/sde_lib_internal.h000066400000000000000000000236101502707512200215000ustar00rootroot00000000000000/** * @file sde_lib_internal.h * @author Anthony Danalis * adanalis@icl.utk.edu * * @ingroup papi_components */ #if !defined(PAPI_SDE_LIB_INTERNAL_H) #define PAPI_SDE_LIB_INTERNAL_H #include #include #include #include #include #include #include #include #include #include #include "sde_lib.h" #define EXP_CONTAINER_ENTRIES 52 #define EXP_CONTAINER_MIN_SIZE 2048 #define PAPISDE_HT_SIZE 512 #define is_readonly(_X_) (PAPI_SDE_RO == ((_X_)&0x0F)) #define is_readwrite(_X_) (PAPI_SDE_RW == ((_X_)&0x0F)) #define is_delta(_X_) (PAPI_SDE_DELTA == ((_X_)&0xF0)) #define is_instant(_X_) (PAPI_SDE_INSTANT == ((_X_)&0xF0)) typedef struct sde_counter_s sde_counter_t; typedef struct sde_sorting_params_s sde_sorting_params_t; typedef struct papisde_list_entry_s papisde_list_entry_t; typedef struct papisde_library_desc_s papisde_library_desc_t; typedef struct papisde_control_s papisde_control_t; typedef struct recorder_data_s recorder_data_t; /** This global variable is defined in sde_lib.c and points to the head of the control state list **/ extern papisde_control_t *_papisde_global_control; // _SDE_HASH_BUCKET_COUNT_ should not be a power of two, and even better it should be a prime. #if defined(SDE_HASH_SMALL) // 7.4KB storage #define _SDE_HASH_BUCKET_COUNT_ 61 #define _SDE_HASH_BUCKET_WIDTH_ 5 #else // 124KB storage (5222 elements) #define _SDE_HASH_BUCKET_COUNT_ 373 #define _SDE_HASH_BUCKET_WIDTH_ 14 #endif // defining SDE_HASH_IS_FUZZY to 1 will make the comparisons operation of the hash table // (which is used in the "counting sets") faster, but inaccurate. As a result, some input // elements might collide onto the same hash table entry, even if they are different. 
// If speed is more important than accurate counting for your library, then setting // SDE_HASH_IS_FUZZY to 1 is recommended. #define SDE_HASH_IS_FUZZY 0 typedef struct cset_hash_decorated_object_s { uint32_t count; uint32_t type_id; size_t type_size; void *ptr; } cset_hash_decorated_object_t; /* typedef struct sde_list_object_s sde_list_object_t; struct sde_list_object_s { sde_hash_decorated_object_t object; sde_list_object_t *next; }; */ typedef struct cset_hash_bucket_s { uint32_t occupied; uint64_t keys[_SDE_HASH_BUCKET_WIDTH_]; cset_hash_decorated_object_t objects[_SDE_HASH_BUCKET_WIDTH_]; } cset_hash_bucket_t; typedef struct cset_hash_table_s { cset_hash_bucket_t buckets[_SDE_HASH_BUCKET_COUNT_]; cset_list_object_t *overflow_list; } cset_hash_table_t; /* Hash table entry */ struct papisde_list_entry_s { sde_counter_t *item; papisde_list_entry_t *next; }; struct recorder_data_s{ void *ptr_array[EXP_CONTAINER_ENTRIES]; long long total_entries; long long used_entries; size_t typesize; void *sorted_buffer; long long sorted_entries; }; typedef struct cntr_class_basic_s { void *data; } cntr_class_basic_t; typedef struct cntr_class_callback_s { papi_sde_fptr_t callback; void *param; } cntr_class_callback_t; typedef struct cntr_class_recorder_s { recorder_data_t *data; } cntr_class_recorder_t; typedef struct cntr_class_cset_s { cset_hash_table_t *data; } cntr_class_cset_t; typedef struct cntr_class_group_s { papisde_list_entry_t *group_head; uint32_t group_flags; } cntr_class_group_t; typedef union cntr_class_specific_u{ cntr_class_basic_t cntr_basic; cntr_class_callback_t cntr_cb; cntr_class_recorder_t cntr_recorder; cntr_class_cset_t cntr_cset; cntr_class_group_t cntr_group; } cntr_class_specific_t; struct sde_counter_s { uint32_t glb_uniq_id; char *name; char *description; uint32_t cntr_class; cntr_class_specific_t u; long long int previous_data; int overflow; int cntr_type; int cntr_mode; int ref_count; papisde_library_desc_t *which_lib; }; struct
sde_sorting_params_s{ sde_counter_t *recording; int (*cmpr_func_ptr)(const void *p1, const void *p2); }; enum CNTR_CLASS{ CNTR_CLASS_REGISTERED = 0x1, CNTR_CLASS_CREATED = 0x2, CNTR_CLASS_BASIC = 0x3, // both previous types combined. CNTR_CLASS_CB = 0x4, CNTR_CLASS_RECORDER = 0x8, CNTR_CLASS_CSET = 0x10, CNTR_CLASS_PLACEHOLDER = 0x1000, CNTR_CLASS_GROUP = 0x2000 }; #define IS_CNTR_REGISTERED(_CNT) ( CNTR_CLASS_REGISTERED == (_CNT)->cntr_class ) #define IS_CNTR_CREATED(_CNT) ( CNTR_CLASS_CREATED == (_CNT)->cntr_class ) #define IS_CNTR_BASIC(_CNT) ( CNTR_CLASS_BASIC & (_CNT)->cntr_class ) #define IS_CNTR_CALLBACK(_CNT) ( CNTR_CLASS_CB == (_CNT)->cntr_class ) #define IS_CNTR_RECORDER(_CNT) ( CNTR_CLASS_RECORDER == (_CNT)->cntr_class ) #define IS_CNTR_CSET(_CNT) ( CNTR_CLASS_CSET == (_CNT)->cntr_class ) #define IS_CNTR_PLACEHOLDER(_CNT) ( CNTR_CLASS_PLACEHOLDER == (_CNT)->cntr_class ) #define IS_CNTR_GROUP(_CNT) ( CNTR_CLASS_GROUP == (_CNT)->cntr_class ) /* This type describes one library. This is the type of the handle returned by papi_sde_init(). 
*/ struct papisde_library_desc_s { char* libraryName; papisde_list_entry_t lib_counters[PAPISDE_HT_SIZE]; uint32_t disabled; papisde_library_desc_t *next; }; /* One global variable of this type holds pointers to all other SDE meta-data */ struct papisde_control_s { uint32_t num_reg_events; /* This number only increases, so it can be used as a uniq id */ uint32_t num_live_events; /* This number decreases at unregister() */ uint32_t disabled; papisde_library_desc_t *lib_list_head; uint32_t activeLibCount; papisde_list_entry_t all_reg_counters[PAPISDE_HT_SIZE]; }; int sdei_setup_counter_internals( papi_handle_t handle, const char *event_name, int cntr_mode, int cntr_type, enum CNTR_CLASS cntr_class, cntr_class_specific_t cntr_union ); int sdei_delete_counter(papisde_library_desc_t* lib_handle, const char *name); int sdei_inc_ref_count(sde_counter_t *counter); int sdei_read_counter_group( sde_counter_t *counter, long long int *rslt_ptr ); void sdei_counting_set_to_list( void *cset_handle, cset_list_object_t **list_head ); int sdei_read_and_update_data_value( sde_counter_t *counter, long long int previous_value, long long int *rslt_ptr ); int sdei_hardware_write( sde_counter_t *counter, long long int new_value ); int sdei_set_timer_for_overflow(void); papisde_control_t *sdei_get_global_struct(void); sde_counter_t *ht_lookup_by_id(papisde_list_entry_t *hash_table, uint32_t uniq_id); sde_counter_t *ht_lookup_by_name(papisde_list_entry_t *hash_table, const char *name); sde_counter_t *ht_delete(papisde_list_entry_t *hash_table, int ht_key, uint32_t uniq_id); void ht_insert(papisde_list_entry_t *hash_table, int ht_key, sde_counter_t *sde_counter); int ht_to_array(papisde_list_entry_t *hash_table, sde_counter_t **rslt_array); uint32_t ht_hash_name(const char *str); uint32_t ht_hash_id(uint32_t uniq_id); papi_handle_t do_sde_init(const char *name_of_library, papisde_control_t *gctl); sde_counter_t *allocate_and_insert(papisde_control_t *gctl, papisde_library_desc_t* 
lib_handle, const char *name, uint32_t uniq_id, int cntr_mode, int cntr_type, enum CNTR_CLASS cntr_class, cntr_class_specific_t cntr_union); void exp_container_to_contiguous(recorder_data_t *exp_container, void *cont_buffer); int exp_container_insert_element(recorder_data_t *exp_container, size_t typesize, const void *value); void exp_container_init(sde_counter_t *handle, size_t typesize); void papi_sde_counting_set_to_list(void *cset_handle, cset_list_object_t **list_head); int cset_insert_elem(cset_hash_table_t *hash_ptr, size_t element_size, size_t hashable_size, const void *element, uint32_t type_id); int cset_remove_elem(cset_hash_table_t *hash_ptr, size_t hashable_size, const void *element, uint32_t type_id); cset_list_object_t *cset_to_list(cset_hash_table_t *hash_ptr); int cset_delete(cset_hash_table_t *hash_ptr); #pragma GCC visibility push(default) int sde_ti_reset_counter( uint32_t ); int sde_ti_read_counter( uint32_t, long long int * ); int sde_ti_write_counter( uint32_t, long long ); int sde_ti_name_to_code( const char *, uint32_t * ); int sde_ti_is_simple_counter( uint32_t ); int sde_ti_is_counter_set_to_overflow( uint32_t ); int sde_ti_set_counter_overflow( uint32_t, int ); char * sde_ti_get_event_name( int ); char * sde_ti_get_event_description( int ); int sde_ti_get_num_reg_events( void ); int sde_ti_shutdown( void ); #pragma GCC visibility pop /*************************************************************************/ /* Hashing code below copied verbatim from the "fast-hash" project: */ /* https://github.com/ztanml/fast-hash */ /*************************************************************************/ // Compression function for Merkle-Damgard construction. 
#define mix(h) ({ \ (h) ^= (h) >> 23; \ (h) *= 0x2127599bf4325c37ULL; \ (h) ^= (h) >> 47; }) static inline uint64_t fasthash64(const void *buf, size_t len, uint64_t seed) { const uint64_t m = 0x880355f21e6d1965ULL; const uint64_t *pos = (const uint64_t *)buf; const uint64_t *end = pos + (len / 8); const uint32_t *pos2; uint64_t h = seed ^ (len * m); uint64_t v; while (pos != end) { v = *pos++; h ^= mix(v); h *= m; } pos2 = (const uint32_t*)pos; v = 0; switch (len & 7) { case 7: v ^= (uint64_t)pos2[6] << 48; /* fall through */ case 6: v ^= (uint64_t)pos2[5] << 40; /* fall through */ case 5: v ^= (uint64_t)pos2[4] << 32; /* fall through */ case 4: v ^= (uint64_t)pos2[3] << 24; /* fall through */ case 3: v ^= (uint64_t)pos2[2] << 16; /* fall through */ case 2: v ^= (uint64_t)pos2[1] << 8; /* fall through */ case 1: v ^= (uint64_t)pos2[0]; h ^= mix(v); h *= m; } return mix(h); } #endif // !defined(PAPI_SDE_LIB_INTERNAL_H) papi-papi-7-2-0-t/src/sde_lib/sde_lib_lock.h000066400000000000000000000021101502707512200206040ustar00rootroot00000000000000#if !defined(PAPI_SDE_LIB_LOCK_H) #define PAPI_SDE_LIB_LOCK_H #include "atomic_ops.h" #if defined(AO_HAVE_test_and_set_acquire) #define USE_LIBAO_ATOMICS #endif /*************************************************************************/ /* Locking functions similar to the PAPI locking function. 
*/ /*************************************************************************/ #if defined(USE_LIBAO_ATOMICS) extern AO_TS_t _sde_hwd_lock_data; #define sde_lock() {while (AO_test_and_set_acquire(&_sde_hwd_lock_data) != AO_TS_CLEAR) { ; } } #define sde_unlock() { AO_CLEAR(&_sde_hwd_lock_data); } #else //defined(USE_LIBAO_ATOMICS) #include <pthread.h> extern pthread_mutex_t _sde_hwd_lock_data; #define sde_lock() \ do{ \ pthread_mutex_lock(&_sde_hwd_lock_data); \ } while(0) #define sde_unlock() \ do{ \ pthread_mutex_unlock(&_sde_hwd_lock_data); \ } while(0) #endif //defined(USE_LIBAO_ATOMICS) #endif //!defined(PAPI_SDE_LIB_LOCK_H) papi-papi-7-2-0-t/src/sde_lib/sde_lib_misc.c000066400000000000000000000551311502707512200206150ustar00rootroot00000000000000/** * @file sde_lib_misc.c * @author Anthony Danalis * adanalis@icl.utk.edu * * @ingroup papi_components * * @brief * This is a collection of internal utility functions that are needed * to support SDEs. */ #include "sde_lib_internal.h" static int aggregate_value_in_group(long long int *data, long long int *rslt, int cntr_type, int group_flags); static inline int cast_and_store(void *data, long long int previous_value, void *rslt_ptr, int cntr_type); static inline int free_counter_resources(sde_counter_t *counter); int _sde_be_verbose = 0; int _sde_debug = 0; static papisde_library_desc_t *find_library_by_name(const char *library_name, papisde_control_t *gctl); static void insert_library_handle(papisde_library_desc_t *lib_handle, papisde_control_t *gctl); /*************************************************************************/ /* Utility Functions. */ /*************************************************************************/ /** sdei_get_global_struct() checks if the global structure has been allocated and allocates it if it has not. @return a pointer to the global structure.
*/ papisde_control_t *sdei_get_global_struct(void){ // Allocate the global control structure, unless it has already been allocated by another library // or the application code calling PAPI_name_to_code() for an SDE. if ( !_papisde_global_control ) { SDEDBG("sdei_get_global_struct(): global SDE control struct is being allocated.\n"); _papisde_global_control = (papisde_control_t *)calloc( 1, sizeof( papisde_control_t ) ); } return _papisde_global_control; } /** This helper function checks to see if a given library has already been initialized and exists in the global structure of the component. @param[in] a pointer to the global structure. @param[in] a string containing the name of the library. @return a pointer to the library handle. */ papisde_library_desc_t *find_library_by_name(const char *library_name, papisde_control_t *gctl){ if( (NULL == gctl) || (NULL == library_name) ) return NULL; papisde_library_desc_t *tmp_lib = gctl->lib_list_head; // Check to see if this library has already been initialized. while(NULL != tmp_lib){ char *tmp_name = tmp_lib->libraryName; SDEDBG("Checking library: '%s' against registered library: '%s'\n", library_name, tmp_lib->libraryName); // If we find the same library already registered, we do not create a new entry. if( (NULL != tmp_name) && !strcmp(tmp_name, library_name) ) return tmp_lib; tmp_lib = tmp_lib->next; } return NULL; } /** This helper function simply adds a library handle to the beginning of the list of libraries in the global structure. Its only reason for existence is to hide the structure of the linked list in case we want to change it in the future. @param[in] a pointer to the library handle. @param[in] a pointer to the global structure.
*/ void insert_library_handle(papisde_library_desc_t *lib_handle, papisde_control_t *gctl){ SDEDBG("insert_library_handle(): inserting new handle for library: '%s'\n",lib_handle->libraryName); lib_handle->next = gctl->lib_list_head; gctl->lib_list_head = lib_handle; return; } // Initialize library handle, or return the existing one if already // initialized. This function is _not_ thread safe, so it needs to be called // from within regions protected by sde_lock()/sde_unlock(). papi_handle_t do_sde_init(const char *name_of_library, papisde_control_t *gctl){ papisde_library_desc_t *tmp_lib; SDEDBG("Registering library: '%s'\n",name_of_library); // If the library is already initialized, return the handle to it tmp_lib = find_library_by_name(name_of_library, gctl); if( NULL != tmp_lib ){ return tmp_lib; } // If the library is not already initialized, then initialize it. tmp_lib = ( papisde_library_desc_t* ) calloc( 1, sizeof( papisde_library_desc_t ) ); tmp_lib->libraryName = strdup(name_of_library); insert_library_handle(tmp_lib, gctl); return tmp_lib; } sde_counter_t *allocate_and_insert( papisde_control_t *gctl, papisde_library_desc_t* lib_handle, const char* name, uint32_t uniq_id, int cntr_mode, int cntr_type, enum CNTR_CLASS cntr_class, cntr_class_specific_t cntr_union ){ // make sure to calloc() the structure, so all the fields which we do not explicitly set remain zero. 
sde_counter_t *item = (sde_counter_t *)calloc(1, sizeof(sde_counter_t)); if( NULL == item ) return NULL; item->u = cntr_union; item->cntr_class = cntr_class; item->cntr_type = cntr_type; item->cntr_mode = cntr_mode; item->glb_uniq_id = uniq_id; item->name = strdup( name ); item->description = strdup( name ); item->which_lib = lib_handle; (void)ht_insert(lib_handle->lib_counters, ht_hash_name(name), item); (void)ht_insert(gctl->all_reg_counters, ht_hash_id(uniq_id), item); return item; } void sdei_counting_set_to_list( void *cset_handle, cset_list_object_t **list_head ) { sde_counter_t *tmp_cset; if( NULL == list_head ) return; tmp_cset = (sde_counter_t *)cset_handle; if( (NULL == tmp_cset) || !IS_CNTR_CSET(tmp_cset) || (NULL == tmp_cset->u.cntr_cset.data) ){ SDE_ERROR("sdei_counting_set_to_list(): 'cset_handle' is clobbered."); return; } *list_head = cset_to_list(tmp_cset->u.cntr_cset.data); return; } // This function modifies data structures, BUT its callers are responsible for acquiring a lock, so it // is always called in an atomic fashion and thus should not acquire a lock. Actually, locking inside // this function will cause a deadlock. int sdei_setup_counter_internals( papi_handle_t handle, const char *event_name, int cntr_mode, int cntr_type, enum CNTR_CLASS cntr_class, cntr_class_specific_t cntr_union ) { papisde_library_desc_t *lib_handle; sde_counter_t *tmp_item; uint32_t counter_uniq_id; char *full_event_name; int ret_val = SDE_OK; int needs_overflow = 0; lib_handle = (papisde_library_desc_t *) handle; if( (NULL == lib_handle) || (NULL == lib_handle->libraryName) ){ SDE_ERROR("sdei_setup_counter_internals(): 'handle' is clobbered.
Unable to register counter."); return SDE_EINVAL; } size_t str_len = strlen(lib_handle->libraryName)+strlen(event_name)+2+1; // +2 for "::" and +1 for '\0' full_event_name = (char *)malloc(str_len*sizeof(char)); snprintf(full_event_name, str_len, "%s::%s", lib_handle->libraryName, event_name); SDEDBG("%s: Counter: '%s' will be added in library: %s.\n", __FILE__, full_event_name, lib_handle->libraryName); if( !is_instant(cntr_mode) && !is_delta(cntr_mode) ){ SDE_ERROR("Unknown mode %d. SDE counter mode must be either Instant or Delta.",cntr_mode); free(full_event_name); return SDE_ECMP; } // Look if the event is already registered. tmp_item = ht_lookup_by_name(lib_handle->lib_counters, full_event_name); if( NULL != tmp_item ){ if( !IS_CNTR_PLACEHOLDER(tmp_item) ){ // If it is registered and it is _not_ a placeholder then ignore it silently. SDEDBG("%s: Counter: '%s' was already in library: %s.\n", __FILE__, full_event_name, lib_handle->libraryName); free(full_event_name); return SDE_OK; } // If we are here, then it IS a placeholder, so check if we need to start overflowing. if( tmp_item->overflow && ( (CNTR_CLASS_REGISTERED == cntr_class) || (CNTR_CLASS_CB == cntr_class) ) ){ needs_overflow = 1; } // Since the counter is a placeholder update the mode, the type, and the union that contains the 'data'. SDEDBG("%s: Updating placeholder for counter: '%s' in library: %s.\n", __FILE__, full_event_name, lib_handle->libraryName); tmp_item->u = cntr_union; tmp_item->cntr_class = cntr_class; tmp_item->cntr_mode = cntr_mode; tmp_item->cntr_type = cntr_type; free(full_event_name); return SDE_OK; } // If neither the event, nor a placeholder exists, then use the current // number of registered events as the index of the new one, and increment it. 
papisde_control_t *gctl = sdei_get_global_struct(); counter_uniq_id = gctl->num_reg_events++; gctl->num_live_events++; SDEDBG("%s: Counter %s has unique ID = %d\n", __FILE__, full_event_name, counter_uniq_id); tmp_item = allocate_and_insert( gctl, lib_handle, full_event_name, counter_uniq_id, cntr_mode, cntr_type, cntr_class, cntr_union ); if(NULL == tmp_item) { SDEDBG("%s: Counter not inserted in SDE %s\n", __FILE__, lib_handle->libraryName); free(full_event_name); return SDE_ECMP; } free(full_event_name); // Check if we need to worry about overflow (cases r[4-6]) if( needs_overflow ){ ret_val = sdei_set_timer_for_overflow(); } return ret_val; } int sdei_inc_ref_count(sde_counter_t *counter){ papisde_list_entry_t *curr; if( NULL == counter ) return SDE_OK; // If the counter is a group, recursively increment the ref_count of all its children. if(CNTR_CLASS_GROUP == counter->cntr_class){ curr = counter->u.cntr_group.group_head; do{ sde_counter_t *tmp_cntr = curr->item; // recursively increment the ref_count of all the elements in the group. int ret_val = sdei_inc_ref_count(tmp_cntr); if( SDE_OK != ret_val ) return ret_val; curr = curr->next; }while(NULL != curr); } // Increment the ref_count of the counter itself, INCLUDING the case where the counter is a group. (counter->ref_count)++; return SDE_OK; } int sdei_delete_counter(papisde_library_desc_t* lib_handle, const char* name) { sde_counter_t *tmp_item; papisde_control_t *gctl; uint32_t item_uniq_id; int ret_val = SDE_OK; gctl = sdei_get_global_struct(); // Look for the counter entry in the hash-table of the library tmp_item = ht_lookup_by_name(lib_handle->lib_counters, name); if( NULL == tmp_item ){ ret_val = SDE_EINVAL; goto fn_exit; } if( CNTR_CLASS_GROUP == tmp_item->cntr_class ){ papisde_list_entry_t *curr, *prev; // If we are dealing with a group, then we need to recurse down all its children and // delete them (this might mean free them, or just decrement their ref_count).
curr = tmp_item->u.cntr_group.group_head; prev = curr; while(NULL != curr){ int counter_is_dead = 0; sde_counter_t *tmp_cntr = curr->item; if( NULL == tmp_cntr ){ ret_val = SDE_EMISC; goto fn_exit; } // If this counter is going to be freed, we need to remove it from this group. if( 0 == tmp_cntr->ref_count ) counter_is_dead = 1; // recursively delete all the elements of the group. int ret_val = sdei_delete_counter(lib_handle, tmp_cntr->name); if( SDE_OK != ret_val ) goto fn_exit; if( counter_is_dead ){ if( curr == tmp_item->u.cntr_group.group_head ){ // if we were removing with the head, change the head, we can't free() it. tmp_item->u.cntr_group.group_head = curr->next; prev = curr->next; curr = curr->next; }else{ // if we are removing an element, first bridge the previous to the next. prev->next = curr->next; free(curr); curr = prev->next; } }else{ // if we are not removing anything, just move the pointers. prev = curr; curr = curr->next; } } } item_uniq_id = tmp_item->glb_uniq_id; // If the reference count is not zero, then we don't remove it from the hash tables if( 0 == tmp_item->ref_count ){ // Delete the entry from the library hash-table (which hashes by name) tmp_item = ht_delete(lib_handle->lib_counters, ht_hash_name(name), item_uniq_id); if( NULL == tmp_item ){ ret_val = SDE_EMISC; goto fn_exit; } // Delete the entry from the global hash-table (which hashes by id) and free the memory // occupied by the counter (not the hash-table entry 'papisde_list_entry_t', the 'sde_counter_t') tmp_item = ht_delete(gctl->all_reg_counters, ht_hash_id(item_uniq_id), item_uniq_id); if( NULL == tmp_item ){ ret_val = SDE_EMISC; goto fn_exit; } // We free the counter only once, although it is in two hash-tables, // because it is the same structure that is pointed to by both hash-tables. free_counter_resources(tmp_item); // Decrement the number of live events. 
(gctl->num_live_events)--; }else{ (tmp_item->ref_count)--; } fn_exit: return ret_val; } int free_counter_resources(sde_counter_t *counter){ int i, ret_val = SDE_OK; if( NULL == counter ) return SDE_OK; if( 0 == counter->ref_count ){ switch(counter->cntr_class){ case CNTR_CLASS_CREATED: SDEDBG(" + Freeing Created Counter Data.\n"); free(counter->u.cntr_basic.data); break; case CNTR_CLASS_RECORDER: SDEDBG(" + Freeing Recorder Data.\n"); free(counter->u.cntr_recorder.data->sorted_buffer); for(i=0; i<EXP_CONTAINER_ENTRIES; i++){ free(counter->u.cntr_recorder.data->ptr_array[i]); } free(counter->u.cntr_recorder.data); break; case CNTR_CLASS_CSET: SDEDBG(" + Freeing CountingSet Data.\n"); ret_val = cset_delete(counter->u.cntr_cset.data); break; } SDEDBG(" -> Freeing Counter '%s'.\n",counter->name); free(counter->name); free(counter->description); free(counter); } return ret_val; } /** This function assumes that all counters in a group (including recursive subgroups) have the same type. */ int sdei_read_counter_group( sde_counter_t *counter, long long int *rslt_ptr ){ papisde_list_entry_t *curr; long long int final_value = 0; if( NULL == counter ){ SDE_ERROR("sdei_read_counter_group(): Counter parameter is NULL.\n"); return SDE_EINVAL; } if( !IS_CNTR_GROUP(counter) ){ SDE_ERROR("sdei_read_counter_group(): Counter '%s' is not a counter group.\n",counter->name); return SDE_EINVAL; } curr = counter->u.cntr_group.group_head; do{ long long int tmp_value = 0; int ret_val; sde_counter_t *tmp_cntr = curr->item; if( NULL == tmp_cntr ){ SDE_ERROR("sdei_read_counter_group(): List of counters in counter group '%s' is clobbered.\n",counter->name); return SDE_EINVAL; } int read_succesfully = 1; // We can _not_ have a recorder inside a group.
if( IS_CNTR_RECORDER(tmp_cntr) || IS_CNTR_CSET(tmp_cntr) || IS_CNTR_PLACEHOLDER(tmp_cntr) ){ SDE_ERROR("sdei_read_counter_group(): Counter group contains counter: %s with class: %d.\n",tmp_cntr->name, tmp_cntr->cntr_class); }else{ // We allow counter groups to contain other counter groups recursively. if( IS_CNTR_GROUP(tmp_cntr) ){ ret_val = sdei_read_counter_group( tmp_cntr, &tmp_value ); if( ret_val != SDE_OK ){ // If something went wrong with one counter group, ignore it silently. read_succesfully = 0; } }else{ // If we are here it means that we are trying to read a real counter. ret_val = sdei_read_and_update_data_value( tmp_cntr, tmp_cntr->previous_data, &tmp_value ); if( SDE_OK != ret_val ){ SDE_ERROR("sdei_read_counter_group(): Error occurred when reading counter: %s.\n",tmp_cntr->name); read_succesfully = 0; } } if( read_succesfully ) aggregate_value_in_group(&tmp_value, &final_value, tmp_cntr->cntr_type, counter->u.cntr_group.group_flags); } curr = curr->next; }while(NULL != curr); *rslt_ptr = final_value; return SDE_OK; } /* both "rslt" and "data" are local variables that this component stored after promoting to 64 bits.
*/ #define _SDE_AGGREGATE( _TYPE, _RSLT_TYPE ) do{\ switch(group_flags){\ case PAPI_SDE_SUM:\ *(_RSLT_TYPE *)rslt = (_RSLT_TYPE) ((_TYPE)(*(_RSLT_TYPE *)rslt) + (_TYPE)(*((_RSLT_TYPE *)data)) );\ break;\ case PAPI_SDE_MAX:\ if( *(_RSLT_TYPE *)rslt < *((_RSLT_TYPE *)data) )\ *(_RSLT_TYPE *)rslt = *((_RSLT_TYPE *)data);\ break;\ case PAPI_SDE_MIN:\ if( *(_RSLT_TYPE *)rslt > *((_RSLT_TYPE *)data) )\ *(_RSLT_TYPE *)rslt = *((_RSLT_TYPE *)data);\ break;\ default:\ SDEDBG("Unsupported counter group flag: %d\n",group_flags);\ return -1;\ } \ }while(0) static int aggregate_value_in_group(long long int *data, long long int *rslt, int cntr_type, int group_flags){ switch(cntr_type){ case PAPI_SDE_long_long: _SDE_AGGREGATE(long long int, long long int); return SDE_OK; case PAPI_SDE_int: // We need to cast the result to "long long" so it is expanded to 64bit to take up all the space _SDE_AGGREGATE(int, long long int); return SDE_OK; case PAPI_SDE_double: _SDE_AGGREGATE(double, double); return SDE_OK; case PAPI_SDE_float: // We need to cast the result to "double" so it is expanded to 64bit to take up all the space _SDE_AGGREGATE(float, double); return SDE_OK; default: SDEDBG("Unsupported counter type: %d\n",cntr_type); return -1; } } int sdei_read_and_update_data_value( sde_counter_t *counter, long long int previous_value, long long int *rslt_ptr ) { int ret_val; long long int tmp_int; void *tmp_data; char *event_name = counter->name; if( IS_CNTR_BASIC(counter) ){ SDEDBG("Reading %s by accessing data pointer.\n", event_name); tmp_data = counter->u.cntr_basic.data; }else if( IS_CNTR_CALLBACK(counter) ){ SDEDBG("Reading %s by calling registered function pointer.\n", event_name); tmp_int = counter->u.cntr_cb.callback(counter->u.cntr_cb.param); tmp_data = &tmp_int; }else if( IS_CNTR_CSET(counter) ){ if( 0 == previous_value ){ SDEDBG("Resetting CountingSet %s by freeing all the elements it contains.\n", event_name); return cset_delete(counter->u.cntr_cset.data); }else{ 
SDEDBG("sdei_read_and_update_data_value(): Event %s is a CountingSet, so it may only be reset by this function.\n", event_name); return -1; } }else{ SDEDBG("sdei_read_and_update_data_value(): Event %s has neither a variable nor a function pointer associated with it.\n", event_name); return -1; } if( is_instant(counter->cntr_mode) ){ /* Instant counter means that we don't subtract the previous value (which we read at PAPI_Start()) */ previous_value = 0; } else if( is_delta(counter->cntr_mode) ){ /* Do nothing here, this is the default mode */ } else{ SDEDBG("Unsupported mode (%d) for event: %s\n",counter->cntr_mode, event_name); return -1; } ret_val = cast_and_store(tmp_data, previous_value, rslt_ptr, counter->cntr_type); return ret_val; } static inline int cast_and_store(void *data, long long int previous_value, void *rslt_ptr, int cntr_type){ void *tmp_ptr; switch(cntr_type){ case PAPI_SDE_long_long: *(long long int *)rslt_ptr = *((long long int *)data) - previous_value; SDEDBG(" value LL=%lld (%lld-%lld)\n", *(long long int *)rslt_ptr, *((long long int *)data), previous_value); return SDE_OK; case PAPI_SDE_int: // We need to cast the result to "long long" so it is expanded to 64bit to take up all the space *(long long int *)rslt_ptr = (long long int) (*((int *)data) - (int)previous_value); SDEDBG(" value LD=%lld (%d-%d)\n", *(long long int *)rslt_ptr, *((int *)data), (int)previous_value); return SDE_OK; case PAPI_SDE_double: tmp_ptr = &previous_value; *(double *)rslt_ptr = (*((double *)data) - *((double *)tmp_ptr)); SDEDBG(" value LF=%lf (%lf-%lf)\n", *(double *)rslt_ptr, *((double *)data), *((double *)tmp_ptr)); return SDE_OK; case PAPI_SDE_float: // We need to cast the result to "double" so it is expanded to 64bit to take up all the space tmp_ptr = &previous_value; *(double *)rslt_ptr = (double)(*((float *)data) - (float)(*((double *)tmp_ptr)) ); SDEDBG(" value F=%lf (%f-%f)\n", *(double *)rslt_ptr, *((float *)data), (float)(*((double *)tmp_ptr)) ); return 
SDE_OK; default: SDEDBG("Unsupported counter type: %d\n",cntr_type); return -1; } } int sdei_hardware_write( sde_counter_t *counter, long long int new_value ){ double tmp_double; void *tmp_ptr; switch(counter->cntr_type){ case PAPI_SDE_long_long: *((long long int *)(counter->u.cntr_basic.data)) = new_value; break; case PAPI_SDE_int: *((int *)(counter->u.cntr_basic.data)) = (int)new_value; break; case PAPI_SDE_double: tmp_ptr = &new_value; tmp_double = *((double *)tmp_ptr); *((double *)(counter->u.cntr_basic.data)) = tmp_double; break; case PAPI_SDE_float: // The pointer has to be 64bit. We can cast the variable to safely convert between bit-widths later on. tmp_ptr = &new_value; tmp_double = *((double *)tmp_ptr); *((float *)(counter->u.cntr_basic.data)) = (float)tmp_double; break; default: SDEDBG("Unsupported counter type: %d\n",counter->cntr_type); return -1; } return SDE_OK; } papi-papi-7-2-0-t/src/sde_lib/sde_lib_ti.c000066400000000000000000000376161502707512200203060ustar00rootroot00000000000000/** * @file sde_lib_ti.c * @author Anthony Danalis * adanalis@icl.utk.edu * * @ingroup papi_components * * @brief * This is the tools interface of SDE. It contains the functions that the PAPI * SDE component needs to call in order to access the SDEs inside a library. */ #include "sde_lib_internal.h" #include "sde_lib_lock.h" // These pointers will not be used anywhere in this code. 
However, if libpapi.a is linked // into an application that also links against libsde.a (both static libraries) then the // linker will use these assignments to set the corresponding function pointers in libsde.a __attribute__((__common__)) int (*sde_ti_reset_counter_ptr)( uint32_t ) = &sde_ti_reset_counter; __attribute__((__common__)) int (*sde_ti_read_counter_ptr)( uint32_t, long long int * ) = &sde_ti_read_counter; __attribute__((__common__)) int (*sde_ti_write_counter_ptr)( uint32_t, long long ) = &sde_ti_write_counter; __attribute__((__common__)) int (*sde_ti_name_to_code_ptr)( const char *, uint32_t * ) = &sde_ti_name_to_code; __attribute__((__common__)) int (*sde_ti_is_simple_counter_ptr)( uint32_t ) = &sde_ti_is_simple_counter; __attribute__((__common__)) int (*sde_ti_is_counter_set_to_overflow_ptr)( uint32_t ) = &sde_ti_is_counter_set_to_overflow; __attribute__((__common__)) int (*sde_ti_set_counter_overflow_ptr)( uint32_t, int ) = &sde_ti_set_counter_overflow; __attribute__((__common__)) char * (*sde_ti_get_event_name_ptr)( int ) = &sde_ti_get_event_name; __attribute__((__common__)) char * (*sde_ti_get_event_description_ptr)( int ) = &sde_ti_get_event_description; __attribute__((__common__)) int (*sde_ti_get_num_reg_events_ptr)( void ) = &sde_ti_get_num_reg_events; __attribute__((__common__)) int (*sde_ti_shutdown_ptr)( void ) = &sde_ti_shutdown; /* * */ int sde_ti_read_counter( uint32_t counter_id, long long int *rslt_ptr){ int ret_val = SDE_OK; papisde_control_t *gctl; sde_lock(); gctl = _papisde_global_control; if( NULL == gctl ){ SDE_ERROR("sde_ti_read_counter(): Attempt to read from uninitialized SDE structures.\n"); ret_val = SDE_EINVAL; goto fn_exit; } if( counter_id >= gctl->num_reg_events ){ SDE_ERROR("sde_ti_read_counter(): SDE with id %d does not correspond to a registered event.\n",counter_id); ret_val = SDE_EINVAL; goto fn_exit; } sde_counter_t *counter = ht_lookup_by_id(gctl->all_reg_counters, counter_id); if( NULL == counter ){
SDE_ERROR("sde_ti_read_counter(): SDE with id %d is clobbered.\n",counter_id); ret_val = SDE_EINVAL; goto fn_exit; } SDEDBG("sde_ti_read_counter(): Reading counter: '%s'.\n",counter->name); switch( counter->cntr_class ){ // If the counter represents a counter group then we need to read the values of all the counters in the group. case CNTR_CLASS_GROUP: ret_val = sdei_read_counter_group( counter, rslt_ptr ); if( SDE_OK != ret_val ){ SDE_ERROR("sde_ti_read_counter(): Error occurred when reading counter group: '%s'.\n",counter->name); } break; // Our convention is that read attempts on a placeholder will set the counter to "-1" to // signify semantically that there was an error, but the function will not return an error // to avoid breaking existing programs that do something funny when an error is returned. case CNTR_CLASS_PLACEHOLDER: SDEDBG("sde_ti_read_counter(): Attempted read on a placeholder: '%s'.\n",counter->name); *rslt_ptr = -1; break; // If we are not dealing with a simple counter but with a recorder, we need to allocate // a contiguous buffer, copy all the recorded data in it, and return to the user a pointer // to this buffer cast as a long long. case CNTR_CLASS_RECORDER: { long long used_entries; size_t typesize; void *out_buffer; // At least the first chunk should have been allocated at creation. if( NULL == counter->u.cntr_recorder.data->ptr_array[0] ){ SDE_ERROR( "No space has been allocated for recorder %s\n",counter->name); ret_val = SDE_EINVAL; break; } used_entries = counter->u.cntr_recorder.data->used_entries; typesize = counter->u.cntr_recorder.data->typesize; // NOTE: After returning this buffer we lose track of it, so it's the user's responsibility to free it.
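The recorder branch above hands a heap buffer back to the caller with its address cast into a `long long`. A minimal sketch of that ownership contract, under the assumption of an LP64 platform (the helper names `make_recording` and `last_entry_and_free` are hypothetical, not part of libsde):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sketch of the recorder-read contract: the library side
 * returns a malloc'd buffer whose address is smuggled through a long long
 * (as the CNTR_CLASS_RECORDER branch of sde_ti_read_counter does), and the
 * caller must cast the value back to a pointer and free it. */
static long long make_recording(size_t used_entries) {
    double *out_buffer = malloc(used_entries * sizeof(double));
    for (size_t i = 0; i < used_entries; i++)
        out_buffer[i] = (double)i;        /* pretend these were recorded */
    return (long long)out_buffer;         /* ownership passes to the caller */
}

static double last_entry_and_free(long long rslt, size_t used_entries) {
    double *recovered = (double *)rslt;   /* caller casts the value back */
    double last = recovered[used_entries - 1];
    free(recovered);                      /* caller owns and frees the buffer */
    return last;
}
```

For example, `last_entry_and_free(make_recording(4), 4)` recovers the last recorded value and releases the buffer, mirroring what a tool reading a recorder event is expected to do.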
out_buffer = malloc( used_entries*typesize ); exp_container_to_contiguous(counter->u.cntr_recorder.data, out_buffer); *rslt_ptr = (long long)out_buffer; break; } case CNTR_CLASS_CSET: { cset_list_object_t *list_head; sdei_counting_set_to_list( counter, &list_head ); *rslt_ptr = (long long)list_head; break; } case CNTR_CLASS_REGISTERED: // fall through case CNTR_CLASS_CREATED: // fall through case CNTR_CLASS_BASIC: // fall through case CNTR_CLASS_CB: ret_val = sdei_read_and_update_data_value( counter, counter->previous_data, rslt_ptr ); if( SDE_OK != ret_val ){ SDE_ERROR("sde_ti_read_counter(): Error occurred when reading counter: '%s'.\n",counter->name); } break; } fn_exit: sde_unlock(); return ret_val; } /* * */ int sde_ti_write_counter( uint32_t counter_id, long long value ){ papisde_control_t *gctl; int ret_val = SDE_OK; gctl = _papisde_global_control; if( NULL == gctl ){ SDE_ERROR("sde_ti_write_counter(): Attempt to write to uninitialized SDE structures.\n"); return SDE_EINVAL; } if( counter_id >= gctl->num_reg_events ){ SDE_ERROR("sde_ti_write_counter(): SDE with id %d does not correspond to a registered event.\n",counter_id); return SDE_EINVAL; } sde_counter_t *counter = ht_lookup_by_id(gctl->all_reg_counters, counter_id); if( (NULL == counter) || !IS_CNTR_BASIC(counter) ){ SDE_ERROR("sde_ti_write_counter(): SDE with id %d is clobbered, or a type which does not support writing.\n",counter_id); return SDE_EINVAL; } ret_val = sdei_hardware_write( counter, value ); if( SDE_OK != ret_val ){ SDE_ERROR("sde_ti_write_counter(): Error occurred when writing counter: '%s'.\n",counter->name); } return ret_val; } /* * */ int sde_ti_reset_counter( uint32_t counter_id ){ int ret_val = SDE_OK; papisde_control_t *gctl; gctl = _papisde_global_control; if( NULL == gctl ){ SDE_ERROR("sde_ti_reset_counter(): Attempt to modify uninitialized SDE structures.\n"); return SDE_EINVAL; } if( counter_id >= gctl->num_reg_events ){ SDE_ERROR("sde_ti_reset_counter(): SDE with id %d does not
correspond to a registered event.\n",counter_id); return SDE_EINVAL; } sde_counter_t *counter = ht_lookup_by_id(gctl->all_reg_counters, counter_id); if( (NULL == counter) || (!IS_CNTR_BASIC(counter) && !IS_CNTR_CALLBACK(counter) && !IS_CNTR_CSET(counter)) ){ SDEDBG("sde_ti_reset_counter(): SDE with id %d is clobbered, or a type which does not support resetting.\n",counter_id); // We allow tools to call this function even if the counter type does not support // resetting, so we do not return an error if this is the case. return SDE_OK; } ret_val = sdei_read_and_update_data_value( counter, 0, &(counter->previous_data) ); if( SDE_OK != ret_val ){ SDE_ERROR("sde_ti_reset_counter(): Error occurred when resetting counter: %s.\n",counter->name); } return ret_val; } /* * */ int sde_ti_name_to_code(const char *event_name, uint32_t *event_code ){ int ret_val; papisde_library_desc_t *lib_handle; char *pos, *tmp_lib_name; sde_counter_t *tmp_item = NULL; papisde_control_t *gctl; SDEDBG( "%s\n", event_name ); sde_lock(); gctl = _papisde_global_control; // Let's see if the event has the library name as a prefix (as it should). Note that this is // the event name as it comes from the framework, so it should contain the library name, although // when the library registers an event counter it will not use the library name as part of the event name. tmp_lib_name = strdup(event_name); pos = strstr(tmp_lib_name, "::"); if( NULL != pos ){ // Good, it does. *pos = '\0'; if( NULL == gctl ){ // If no library has initialized SDEs, and the application is already inquiring // about an event, let's initialize SDEs pretending to be the library which corresponds to this event.
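The prefix handling above splits an event name of the form `Lib::Event` at the first `::` to recover the library name, via `strstr()` and an in-place `'\0'`. A standalone sketch of that parsing step (the helper `split_lib_prefix` is hypothetical, not a libsde function):

```c
#include <stdlib.h>
#include <string.h>

/* Sketch of the "::" prefix parsing used by sde_ti_name_to_code: cut the
 * name at the first "::", leaving the library name in buf and returning a
 * pointer to the event part, or NULL if no prefix is present. */
static const char *split_lib_prefix(char *buf) {
    char *pos = strstr(buf, "::");
    if (pos == NULL)
        return NULL;    /* no library prefix present */
    *pos = '\0';        /* buf now holds just the library name */
    return pos + 2;     /* remainder is the event name */
}
```

For example, applied to a writable copy of `"MyLib::ev_total"` (a hypothetical event name), the buffer is left holding `"MyLib"` and the returned pointer addresses `"ev_total"`, matching the in-place split performed on `tmp_lib_name` above.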
gctl = sdei_get_global_struct(); lib_handle = do_sde_init(tmp_lib_name, gctl); if(NULL == lib_handle){ SDE_ERROR("sde_ti_name_to_code(): Initialized SDE but unable to register new library: %s\n", tmp_lib_name); ret_val = SDE_ECMP; goto fn_exit; } }else{ int is_library_present = 0; // If the library side of the component has been initialized, then look for the library. lib_handle = gctl->lib_list_head; while(NULL != lib_handle){ // Look for the library. if( !strcmp(lib_handle->libraryName, tmp_lib_name) ){ // We found the library. is_library_present = 1; // Now, look for the event in the library. tmp_item = ht_lookup_by_name(lib_handle->lib_counters, event_name); break; } lib_handle = lib_handle->next; } if( !is_library_present ){ // If the library side of the component was initialized, but the specific library hasn't called // papi_sde_init() then we call it here to allocate the data structures. lib_handle = do_sde_init(tmp_lib_name, gctl); if(NULL == lib_handle){ SDE_ERROR("sde_ti_name_to_code(): Unable to register new library: %s\n", tmp_lib_name); ret_val = SDE_ECMP; goto fn_exit; } } } free(tmp_lib_name); // We don't need the library name any more. if( NULL != tmp_item ){ SDEDBG("Found matching counter with global uniq id: %d in library: %s\n", tmp_item->glb_uniq_id, lib_handle->libraryName ); *event_code = tmp_item->glb_uniq_id; ret_val = SDE_OK; goto fn_exit; } else { cntr_class_specific_t cntr_union = {0}; SDEDBG("Did not find event %s in library %s. Registering a placeholder.\n", event_name, lib_handle->libraryName ); // Use the current number of registered events as the index of the new one, and increment it. uint32_t counter_uniq_id = gctl->num_reg_events++; gctl->num_live_events++; // At this point in the code "lib_handle" contains a pointer to the data structure for this library whether // the actual library has been initialized or not. 
tmp_item = allocate_and_insert(gctl, lib_handle, event_name, counter_uniq_id, PAPI_SDE_RO, PAPI_SDE_long_long, CNTR_CLASS_PLACEHOLDER, cntr_union ); if(NULL == tmp_item) { SDEDBG("Event %s does not exist in library %s and placeholder could not be inserted.\n", event_name, lib_handle->libraryName); ret_val = SDE_ECMP; goto fn_exit; } *event_code = tmp_item->glb_uniq_id; ret_val = SDE_OK; goto fn_exit; } }else{ free(tmp_lib_name); } // If no library has initialized the component and we don't know a library name, then we have to return. if( NULL == gctl ){ ret_val = SDE_ENOEVNT; goto fn_exit; } // If the event name does not have the library name as a prefix, then we need to look in all the libraries for the event. However, in this case // we can _not_ register a placeholder because we don't know which library the event belongs to. lib_handle = gctl->lib_list_head; while(NULL != lib_handle){ tmp_item = ht_lookup_by_name(lib_handle->lib_counters, event_name); if( NULL != tmp_item ){ *event_code = tmp_item->glb_uniq_id; SDEDBG("Found matching counter with global uniq id: %d in library: %s\n", tmp_item->glb_uniq_id, lib_handle->libraryName ); ret_val = SDE_OK; goto fn_exit; } else { SDEDBG("Failed to find event %s in library %s. 
Looking in other libraries.\n", event_name, lib_handle->libraryName ); } lib_handle = lib_handle->next; } ret_val = SDE_ENOEVNT; fn_exit: sde_unlock(); return ret_val; } /* * */ int sde_ti_is_simple_counter(uint32_t counter_id){ papisde_control_t *gctl = _papisde_global_control; if( NULL == gctl ) return 0; sde_counter_t *counter = ht_lookup_by_id(gctl->all_reg_counters, counter_id); if( (NULL == counter) || !IS_CNTR_REGISTERED(counter) ) return 0; return 1; } /* * */ int sde_ti_is_counter_set_to_overflow(uint32_t counter_id){ papisde_control_t *gctl = _papisde_global_control; if( NULL == gctl ) return 0; sde_counter_t *counter = ht_lookup_by_id(gctl->all_reg_counters, counter_id); if( (NULL == counter) || !counter->overflow || IS_CNTR_CREATED(counter) ) return 0; return 1; } /* * */ int sde_ti_set_counter_overflow(uint32_t counter_id, int threshold){ papisde_control_t *gctl = _papisde_global_control; if( NULL == gctl ) return SDE_OK; sde_counter_t *counter = ht_lookup_by_id(gctl->all_reg_counters, counter_id); // If the counter is created then we will check for overflow every time its value gets updated, we don't need to poll. // That is in cases c[1-3] if( IS_CNTR_CREATED(counter) ) return SDE_OK; // We do not want to overflow on recorders or counting-sets, because we don't even know what this means. if( ( IS_CNTR_RECORDER(counter) || IS_CNTR_CSET(counter) ) && (threshold > 0) ){ return SDE_EINVAL; } // If we still don't know what type the counter is, then we are _not_ in r[1-3] so we can't create a timer here. 
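As the comments in `sde_ti_set_counter_overflow` below note, the function returns `0xFF` (a value greater than `SDE_OK`, which is zero) to tell the caller that a polling timer must be armed, while `SDE_OK` itself means "no error, but no timer either". A hedged caller-side sketch of that convention (the constant name `SDE_NEEDS_TIMER` and the helper `should_arm_timer` are hypothetical illustrations, not part of the tools interface):

```c
/* Hypothetical caller-side sketch of the overflow return convention:
 * distinguish "no error, no timer needed" (SDE_OK) from "no error, arm the
 * overflow polling timer" (any positive return such as 0xFF), and treat
 * negative values as errors. */
#define SDE_OK 0
#define SDE_EINVAL (-1)
#define SDE_NEEDS_TIMER 0xFF   /* hypothetical name for the sentinel */

static int should_arm_timer(int ret_val) {
    return ret_val > SDE_OK;   /* 0xFF means set the timer; 0 means don't */
}
```

A tool calling `sde_ti_set_counter_overflow()` would then arm its timer only when `should_arm_timer()` is true, leaving placeholders and created counters (which check for overflow on update) without a timer.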
if( IS_CNTR_PLACEHOLDER(counter) && (threshold > 0) ){ SDEDBG("Event is a placeholder (it has not been registered by a library yet), so we cannot start overflow, but we can remember it.\n"); counter->overflow = 1; return SDE_OK; } if( 0 == threshold ){ counter->overflow = 0; } // Return a number higher than SDE_OK (which is zero) to indicate to the caller that the timer needs to be set, // because SDE_OK only means that there was no error, but the timer should not be set either because we are dealing // with a placeholder, or created counter. return 0xFF; } /* * */ char * sde_ti_get_event_name(int event_id){ papisde_control_t *gctl = _papisde_global_control; if( NULL == gctl ) return NULL; sde_counter_t *counter = ht_lookup_by_id(gctl->all_reg_counters, event_id); if( NULL == counter ) return NULL; return counter->name; } /* * */ char * sde_ti_get_event_description(int event_id){ papisde_control_t *gctl = _papisde_global_control; if( NULL == gctl ) return NULL; sde_counter_t *counter = ht_lookup_by_id(gctl->all_reg_counters, event_id); if( NULL == counter ) return NULL; return counter->description; } /* * */ int sde_ti_get_num_reg_events( void ){ papisde_control_t *gctl = _papisde_global_control; if( NULL == gctl ) return 0; return gctl->num_reg_events; } /* * */ int sde_ti_shutdown( void ){ return SDE_OK; } papi-papi-7-2-0-t/src/sde_lib/sde_lib_ti.h000066400000000000000000000014701502707512200203000ustar00rootroot00000000000000/** * @file sde_lib_ti.h * @author Anthony Danalis * adanalis@icl.utk.edu * * @ingroup papi_components */ #if !defined(PAPI_SDE_LIB_TI_H) #define PAPI_SDE_LIB_TI_H int sde_ti_read_counter( uint32_t counter_id, long long int *rslt_ptr); int sde_ti_write_counter( uint32_t counter_id, long long value ); int sde_ti_reset_counter( uint32_t counter_id ); int sde_ti_name_to_code(const char *event_name, uint32_t *event_code ); int sde_ti_is_simple_counter(uint32_t counter_id); int sde_ti_is_counter_set_to_overflow(uint32_t counter_id); int 
sde_ti_set_counter_overflow(uint32_t counter_id, int threshold); char *sde_ti_get_event_name(int event_id); char *sde_ti_get_event_description(int event_id); int sde_ti_get_num_reg_events( void ); int sde_ti_shutdown( void ); #endif // !defined(PAPI_SDE_LIB_TI_H) papi-papi-7-2-0-t/src/smoke_tests/000077500000000000000000000000001502707512200167675ustar00rootroot00000000000000papi-papi-7-2-0-t/src/smoke_tests/Makefile000066400000000000000000000010261502707512200204260ustar00rootroot00000000000000 EXECUTABLES = simple threads CC ?= gcc PAPI_ROOT := $(shell dirname $(shell dirname $(shell which papi_component_avail))) CFLAGS ?= -O0 -pthread -I$(PAPI_ROOT)/include CPPFLAGS = -I$(PAPI_ROOT)/include LIBS = -L$(PAPI_ROOT)/lib -lm -ldl -lpapi -Wl,-rpath=$(PAPI_ROOT)/lib -pthread all: $(EXECUTABLES) clean: /bin/rm -f core *.o $(EXECUTABLES) .SUFFIXES: .c .o .c.o: $(CC) $(CFLAGS) -c $*.c simple: simple.o $(CC) $(CFLAGS) -o simple simple.o $(LIBS) threads: threads.o $(CC) $(CFLAGS) -o threads threads.o $(LIBS) papi-papi-7-2-0-t/src/smoke_tests/simple.c000066400000000000000000000040731502707512200204300ustar00rootroot00000000000000#include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <papi.h> #define NUM_EVENTS 2 int main( int argc, char **argv ) { int retval, i; long long values[NUM_EVENTS]; int EventSet = PAPI_NULL; int events[NUM_EVENTS]; char *EventName[] = { "PAPI_TOT_CYC", "PAPI_TOT_INS" }; retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { printf("ERROR: PAPI_library_init: %d: %s\n", retval, PAPI_strerror(retval)); exit(EXIT_FAILURE); } else { printf ( "PAPI_VERSION : %4d %6d %7d\n", PAPI_VERSION_MAJOR ( PAPI_VERSION ), PAPI_VERSION_MINOR ( PAPI_VERSION ), PAPI_VERSION_REVISION ( PAPI_VERSION ) ); } retval = PAPI_create_eventset ( &EventSet ); if ( retval != PAPI_OK ) { printf("ERROR: PAPI_create_eventset: %d: %s\n", retval, PAPI_strerror(retval)); exit(EXIT_FAILURE); } for( i = 0; i < NUM_EVENTS; i++ ) { retval = PAPI_event_name_to_code ( EventName[i],
&events[i] ); if ( retval != PAPI_OK ) { printf("ERROR: PAPI_event_name_to_code: %d: %s\n", retval, PAPI_strerror(retval)); exit(EXIT_FAILURE); } } retval = PAPI_add_events ( EventSet, events, NUM_EVENTS ); if ( retval != PAPI_OK ) { printf("ERROR: PAPI_add_events: %d: %s\n", retval, PAPI_strerror(retval)); exit(EXIT_FAILURE); } retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { printf("ERROR: PAPI_start: %d: %s\n", retval, PAPI_strerror(retval)); exit(EXIT_FAILURE); } // do work sleep(3); retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) { printf("ERROR: PAPI_stop: %d: %s\n", retval, PAPI_strerror(retval)); exit(EXIT_FAILURE); } for( i = 0; i < NUM_EVENTS; i++ ) { printf( "%12lld \t\t --> %s \n", values[i], EventName[i] ); } PAPI_shutdown(); return EXIT_SUCCESS; } papi-papi-7-2-0-t/src/smoke_tests/threads.c000066400000000000000000000071431502707512200205720ustar00rootroot00000000000000#include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <pthread.h> #include <papi.h> #define NUM_PTHREADS 2 #define NUM_EVENTS 2 void * Thread(void *arg) { int retval, i; long long values[NUM_EVENTS]; int EventSet = PAPI_NULL; int events[NUM_EVENTS]; char *EventName[] = { "PAPI_TOT_CYC", "PAPI_TOT_INS" }; int thread; thread = *(int *) arg; retval = PAPI_register_thread( ); if ( retval != PAPI_OK ) { printf("ERROR: PAPI_register_thread: %d: %s\n", retval, PAPI_strerror(retval)); exit(EXIT_FAILURE); } for( i = 0; i < NUM_EVENTS; i++ ) { retval = PAPI_event_name_to_code( EventName[i], &events[i] ); if ( retval != PAPI_OK ) { printf("ERROR: PAPI_event_name_to_code: %d: %s\n", retval, PAPI_strerror(retval)); exit(EXIT_FAILURE); } } retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { printf("ERROR: PAPI_create_eventset: %d: %s\n", retval, PAPI_strerror(retval)); exit(EXIT_FAILURE); } retval = PAPI_add_events( EventSet, events, NUM_EVENTS ); if ( retval != PAPI_OK ) { printf("ERROR: PAPI_add_events: %d: %s\n", retval, PAPI_strerror(retval)); exit(EXIT_FAILURE); } retval = PAPI_start(
EventSet ); if ( retval != PAPI_OK ) { printf("ERROR: PAPI_start: %d: %s\n", retval, PAPI_strerror(retval)); exit(EXIT_FAILURE); } // do work sleep(3); retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) { printf("ERROR: PAPI_stop: %d: %s\n", retval, PAPI_strerror(retval)); exit(EXIT_FAILURE); } for( i = 0; i < NUM_EVENTS; i++ ) { printf( "%12lld \t\t --> %s (thread %d) \n", values[i], EventName[i], thread ); } retval = PAPI_unregister_thread( ); if ( retval != PAPI_OK ) { printf("ERROR: PAPI_unregister_thread: %d: %s\n", retval, PAPI_strerror(retval)); exit(EXIT_FAILURE); } return 0; } int main( int argc, char **argv ) { pthread_t tids[NUM_PTHREADS]; int i, vals[NUM_PTHREADS]; int retval, rc; void* retval2; /* Init PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { printf("ERROR: PAPI_library_init: %d: %s\n", retval, PAPI_strerror(retval) ); exit(EXIT_FAILURE); } else { printf ( "PAPI_VERSION : %4d %6d %7d\n", PAPI_VERSION_MAJOR ( PAPI_VERSION ), PAPI_VERSION_MINOR ( PAPI_VERSION ), PAPI_VERSION_REVISION ( PAPI_VERSION ) ); } retval = PAPI_thread_init( ( unsigned long ( * )( void ) )( pthread_self ) ); if ( retval != PAPI_OK ) { printf("ERROR: PAPI_thread_init: %d: %s\n", retval, PAPI_strerror(retval) ); exit(EXIT_FAILURE); } for ( i = 0; i < NUM_PTHREADS; i++) { vals[i] = i; retval = pthread_create( &tids[i], NULL, Thread, &vals[i] ); if ( retval != 0 ) { printf("ERROR: pthread_create: %d\n", retval ); exit(EXIT_FAILURE); } } for ( i = 0; i < NUM_PTHREADS; i++) { printf("Trying to join with tid %d\n", i); retval = pthread_join(tids[i], &retval2); if ( retval != 0 ) { printf("ERROR: pthread_join: %d\n", retval ); exit(EXIT_FAILURE); } else { printf("Joined with tid %d\n", i); } } PAPI_shutdown(); return EXIT_SUCCESS; } papi-papi-7-2-0-t/src/solaris-common.c000066400000000000000000000521761502707512200175500ustar00rootroot00000000000000#include "papi.h" #include "papi_internal.h" #include "papi_vector.h" 
#include "papi_memory.h" #include "solaris-common.h" #include #if 0 /* once the bug in dladdr is fixed by SUN, (now dladdr caused deadlock when used with pthreads) this function can be used again */ int _solaris_update_shlib_info( papi_mdi_t *mdi ) { char fname[80], name[PAPI_HUGE_STR_LEN]; prmap_t newp; int count, t_index; FILE *map_f; void *vaddr; Dl_info dlip; PAPI_address_map_t *tmp = NULL; sprintf( fname, "/proc/%d/map", getpid( ) ); map_f = fopen( fname, "r" ); if ( !map_f ) { PAPIERROR( "fopen(%s) returned < 0", fname ); return ( PAPI_OK ); } /* count the entries we need */ count = 0; t_index = 0; while ( fread( &newp, sizeof ( prmap_t ), 1, map_f ) > 0 ) { vaddr = ( void * ) ( 1 + ( newp.pr_vaddr ) ); // map base address if ( dladdr( vaddr, &dlip ) > 0 ) { count++; if ( ( newp.pr_mflags & MA_EXEC ) && ( newp.pr_mflags & MA_READ ) ) { if ( !( newp.pr_mflags & MA_WRITE ) ) t_index++; } strcpy( name, dlip.dli_fname ); if ( strcmp( _papi_hwi_system_info.exe_info.address_info.name, basename( name ) ) == 0 ) { if ( ( newp.pr_mflags & MA_EXEC ) && ( newp.pr_mflags & MA_READ ) ) { if ( !( newp.pr_mflags & MA_WRITE ) ) { _papi_hwi_system_info.exe_info.address_info.text_start = ( vptr_t ) newp.pr_vaddr; _papi_hwi_system_info.exe_info.address_info.text_end = ( vptr_t ) ( newp.pr_vaddr + newp.pr_size ); } else { _papi_hwi_system_info.exe_info.address_info.data_start = ( vptr_t ) newp.pr_vaddr; _papi_hwi_system_info.exe_info.address_info.data_end = ( vptr_t ) ( newp.pr_vaddr + newp.pr_size ); } } } } } rewind( map_f ); tmp = ( PAPI_address_map_t * ) papi_calloc( t_index - 1, sizeof ( PAPI_address_map_t ) ); if ( tmp == NULL ) { PAPIERROR( "Error allocating shared library address map" ); return ( PAPI_ENOMEM ); } t_index = -1; while ( fread( &newp, sizeof ( prmap_t ), 1, map_f ) > 0 ) { vaddr = ( void * ) ( 1 + ( newp.pr_vaddr ) ); // map base address if ( dladdr( vaddr, &dlip ) > 0 ) { // valid name strcpy( name, dlip.dli_fname ); if ( strcmp( 
_papi_hwi_system_info.exe_info.address_info.name, basename( name ) ) == 0 ) continue; if ( ( newp.pr_mflags & MA_EXEC ) && ( newp.pr_mflags & MA_READ ) ) { if ( !( newp.pr_mflags & MA_WRITE ) ) { t_index++; tmp[t_index].text_start = ( vptr_t ) newp.pr_vaddr; tmp[t_index].text_end = ( vptr_t ) ( newp.pr_vaddr + newp.pr_size ); strncpy( tmp[t_index].name, dlip.dli_fname, PAPI_HUGE_STR_LEN - 1 ); tmp[t_index].name[PAPI_HUGE_STR_LEN - 1] = '\0'; } else { if ( t_index < 0 ) continue; tmp[t_index].data_start = ( vptr_t ) newp.pr_vaddr; tmp[t_index].data_end = ( vptr_t ) ( newp.pr_vaddr + newp.pr_size ); } } } } fclose( map_f ); if ( _papi_hwi_system_info.shlib_info.map ) papi_free( _papi_hwi_system_info.shlib_info.map ); _papi_hwi_system_info.shlib_info.map = tmp; _papi_hwi_system_info.shlib_info.count = t_index + 1; return PAPI_OK; } #endif int _papi_hwi_init_os(void) { struct utsname uname_buffer; uname(&uname_buffer); strncpy(_papi_os_info.name,uname_buffer.sysname,PAPI_MAX_STR_LEN); strncpy(_papi_os_info.version,uname_buffer.release,PAPI_MAX_STR_LEN); _papi_os_info.itimer_sig = PAPI_INT_MPX_SIGNAL; _papi_os_info.itimer_num = PAPI_INT_ITIMER; _papi_os_info.itimer_ns = PAPI_INT_MPX_DEF_US * 1000; _papi_os_info.itimer_res_ns = 1; return PAPI_OK; } #if 0 int _ultra_hwd_update_shlib_info( papi_mdi_t *mdi ) { /*??? 
system call takes very long */ char cmd_line[PAPI_HUGE_STR_LEN + PAPI_HUGE_STR_LEN], fname[L_tmpnam]; char line[256]; char address[16], size[10], flags[64], objname[256]; PAPI_address_map_t *tmp = NULL; FILE *f = NULL; int t_index = 0, i; struct map_record { long address; int size; int flags; char objname[256]; struct map_record *next; } *tmpr, *head, *curr; tmpnam( fname ); SUBDBG( "Temporary name %s\n", fname ); sprintf( cmd_line, "/bin/pmap %d > %s", ( int ) getpid( ), fname ); if ( system( cmd_line ) != 0 ) { PAPIERROR( "Could not run %s to get shared library address map", cmd_line ); return ( PAPI_OK ); } f = fopen( fname, "r" ); if ( f == NULL ) { PAPIERROR( "fopen(%s) returned < 0", fname ); remove( fname ); return ( PAPI_OK ); } /* ignore the first line */ fgets( line, 256, f ); head = curr = NULL; while ( fgets( line, 256, f ) != NULL ) { /* discard the last line */ if ( strncmp( line, " total", 6 ) != 0 ) { sscanf( line, "%s %s %s %s", address, size, flags, objname ); if ( objname[0] == '/' ) { tmpr = ( struct map_record * ) papi_malloc( sizeof ( struct map_record ) ); if ( tmpr == NULL ) return ( -1 ); tmpr->next = NULL; if ( curr ) { curr->next = tmpr; curr = tmpr; } if ( head == NULL ) { curr = head = tmpr; } SUBDBG( "%s\n", objname ); if ( ( strstr( flags, "read" ) && strstr( flags, "exec" ) ) || ( strstr( flags, "r" ) && strstr( flags, "x" ) ) ) { if ( !( strstr( flags, "write" ) || strstr( flags, "w" ) ) ) { /* text segment */ t_index++; tmpr->flags = 1; } else { tmpr->flags = 0; } sscanf( address, "%lx", &tmpr->address ); sscanf( size, "%d", &tmpr->size ); tmpr->size *= 1024; strcpy( tmpr->objname, objname ); } } } } tmp = ( PAPI_address_map_t * ) papi_calloc( t_index - 1, sizeof ( PAPI_address_map_t ) ); if ( tmp == NULL ) { PAPIERROR( "Error allocating shared library address map" ); return ( PAPI_ENOMEM ); } t_index = -1; tmpr = curr = head; i = 0; while ( curr != NULL ) { if ( strcmp( _papi_hwi_system_info.exe_info.address_info.name, basename( 
curr->objname ) ) == 0 ) { if ( curr->flags ) { _papi_hwi_system_info.exe_info.address_info.text_start = ( vptr_t ) curr->address; _papi_hwi_system_info.exe_info.address_info.text_end = ( vptr_t ) ( curr->address + curr->size ); } else { _papi_hwi_system_info.exe_info.address_info.data_start = ( vptr_t ) curr->address; _papi_hwi_system_info.exe_info.address_info.data_end = ( vptr_t ) ( curr->address + curr->size ); } } else { if ( curr->flags ) { t_index++; tmp[t_index].text_start = ( vptr_t ) curr->address; tmp[t_index].text_end = ( vptr_t ) ( curr->address + curr->size ); strncpy( tmp[t_index].name, curr->objname, PAPI_HUGE_STR_LEN - 1 ); tmp[t_index].name[PAPI_HUGE_STR_LEN - 1] = '\0'; } else { if ( t_index < 0 ) continue; tmp[t_index].data_start = ( vptr_t ) curr->address; tmp[t_index].data_end = ( vptr_t ) ( curr->address + curr->size ); } } tmpr = curr->next; /* free the temporary allocated memory */ papi_free( curr ); curr = tmpr; } /* end of while */ remove( fname ); fclose( f ); if ( _papi_hwi_system_info.shlib_info.map ) papi_free( _papi_hwi_system_info.shlib_info.map ); _papi_hwi_system_info.shlib_info.map = tmp; _papi_hwi_system_info.shlib_info.count = t_index + 1; return ( PAPI_OK ); } #endif /* From niagara2 code */ int _solaris_update_shlib_info( papi_mdi_t *mdi ) { char *file = "/proc/self/map"; char *resolve_pattern = "/proc/self/path/%s"; char lastobject[PRMAPSZ]; char link[PAPI_HUGE_STR_LEN]; char path[PAPI_HUGE_STR_LEN]; prmap_t mapping; int fd, count = 0, total = 0, position = -1, first = 1; vptr_t t_min, t_max, d_min, d_max; PAPI_address_map_t *pam, *cur; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif fd = open( file, O_RDONLY ); if ( fd == -1 ) { return PAPI_ESYS; } memset( lastobject, 0, PRMAPSZ ); #ifdef DEBUG SUBDBG( " -> %s: Preprocessing memory maps from procfs\n", __func__ ); #endif /* Search through the list of mappings in order to identify a) how many mappings are available and b) 
how many unique mappings are available. */ while ( read( fd, &mapping, sizeof ( prmap_t ) ) > 0 ) { #ifdef DEBUG SUBDBG( " -> %s: Found a new memory map entry\n", __func__ ); #endif /* Another entry found, just the total count of entries. */ total++; /* Is the mapping accessible and not anonymous? */ if ( mapping.pr_mflags & ( MA_READ | MA_WRITE | MA_EXEC ) && !( mapping.pr_mflags & MA_ANON ) ) { /* Test if a new library has been found. If a new library has been found a new entry needs to be counted. */ if ( strcmp( lastobject, mapping.pr_mapname ) != 0 ) { strncpy( lastobject, mapping.pr_mapname, PRMAPSZ ); count++; #ifdef DEBUG SUBDBG( " -> %s: Memory mapping entry valid for %s\n", __func__, mapping.pr_mapname ); #endif } } } #ifdef DEBUG SUBDBG( " -> %s: Preprocessing done, starting to analyze\n", __func__ ); #endif /* Start from the beginning, now fill in the found mappings */ if ( lseek( fd, 0, SEEK_SET ) == -1 ) { return PAPI_ESYS; } memset( lastobject, 0, PRMAPSZ ); /* Allocate memory */ pam = ( PAPI_address_map_t * ) papi_calloc( count, sizeof ( PAPI_address_map_t ) ); while ( read( fd, &mapping, sizeof ( prmap_t ) ) > 0 ) { if ( mapping.pr_mflags & MA_ANON ) { #ifdef DEBUG SUBDBG ( " -> %s: Anonymous mapping (MA_ANON) found for %s, skipping\n", __func__, mapping.pr_mapname ); #endif continue; } /* Check for a new entry */ if ( strcmp( mapping.pr_mapname, lastobject ) != 0 ) { #ifdef DEBUG SUBDBG( " -> %s: Analyzing mapping for %s\n", __func__, mapping.pr_mapname ); #endif cur = &( pam[++position] ); strncpy( lastobject, mapping.pr_mapname, PRMAPSZ ); snprintf( link, PAPI_HUGE_STR_LEN, resolve_pattern, lastobject ); memset( path, 0, PAPI_HUGE_STR_LEN ); readlink( link, path, PAPI_HUGE_STR_LEN ); strncpy( cur->name, path, PAPI_HUGE_STR_LEN ); #ifdef DEBUG SUBDBG( " -> %s: Resolved name for %s: %s\n", __func__, mapping.pr_mapname, cur->name ); #endif } if ( mapping.pr_mflags & MA_READ ) { /* Data (MA_WRITE) or text (MA_READ) segment? 
*/ if ( mapping.pr_mflags & MA_WRITE ) { cur->data_start = ( vptr_t ) mapping.pr_vaddr; cur->data_end = ( vptr_t ) ( mapping.pr_vaddr + mapping.pr_size ); if ( strcmp ( cur->name, _papi_hwi_system_info.exe_info.fullname ) == 0 ) { _papi_hwi_system_info.exe_info.address_info.data_start = cur->data_start; _papi_hwi_system_info.exe_info.address_info.data_end = cur->data_end; } if ( first ) d_min = cur->data_start; if ( first ) d_max = cur->data_end; if ( cur->data_start < d_min ) { d_min = cur->data_start; } if ( cur->data_end > d_max ) { d_max = cur->data_end; } } else if ( mapping.pr_mflags & MA_EXEC ) { cur->text_start = ( vptr_t ) mapping.pr_vaddr; cur->text_end = ( vptr_t ) ( mapping.pr_vaddr + mapping.pr_size ); if ( strcmp ( cur->name, _papi_hwi_system_info.exe_info.fullname ) == 0 ) { _papi_hwi_system_info.exe_info.address_info.text_start = cur->text_start; _papi_hwi_system_info.exe_info.address_info.text_end = cur->text_end; } if ( first ) t_min = cur->text_start; if ( first ) t_max = cur->text_end; if ( cur->text_start < t_min ) { t_min = cur->text_start; } if ( cur->text_end > t_max ) { t_max = cur->text_end; } } } first = 0; } close( fd ); /* During the walk of shared objects the upper and lower bound of the segments could be discovered. The bounds are stored in the PAPI info structure. The information is important for the profiling functions of PAPI. 
*/ /* This variant would pass the addresses of all text and data segments _papi_hwi_system_info.exe_info.address_info.text_start = t_min; _papi_hwi_system_info.exe_info.address_info.text_end = t_max; _papi_hwi_system_info.exe_info.address_info.data_start = d_min; _papi_hwi_system_info.exe_info.address_info.data_end = d_max; */ #ifdef DEBUG SUBDBG( " -> %s: Analysis of memory maps done, results:\n", __func__ ); SUBDBG( " -> %s: text_start=%#x, text_end=%#x, text_size=%lld\n", __func__, _papi_hwi_system_info.exe_info.address_info.text_start, _papi_hwi_system_info.exe_info.address_info.text_end, _papi_hwi_system_info.exe_info.address_info.text_end - _papi_hwi_system_info.exe_info.address_info.text_start ); SUBDBG( " -> %s: data_start=%#x, data_end=%#x, data_size=%lld\n", __func__, _papi_hwi_system_info.exe_info.address_info.data_start, _papi_hwi_system_info.exe_info.address_info.data_end, _papi_hwi_system_info.exe_info.address_info.data_end - _papi_hwi_system_info.exe_info.address_info.data_start ); #endif /* Store the map read and the total count of shlibs found */ _papi_hwi_system_info.shlib_info.map = pam; _papi_hwi_system_info.shlib_info.count = count; #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } #if 0 int _niagara2_get_system_info( papi_mdi_t *mdi ) { // Used for evaluating return values int retval = 0; // Check for process settings pstatus_t *proc_status; psinfo_t *proc_info; // Used for string truncating char *c_ptr; // For retrieving the executable full name char exec_name[PAPI_HUGE_STR_LEN]; // For retrieving processor information __sol_processor_information_t cpus; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif /* Get and set pid */ pid = getpid( ); /* Check for microstate accounting. Note: == binds tighter than &, so the flag tests must be parenthesized. */ proc_status = __sol_get_proc_status( pid ); if ( ( proc_status->pr_flags & PR_MSACCT ) == 0 || ( proc_status->pr_flags & PR_MSFORK ) == 0 ) { /* Solaris 10 
should have microstate accounting always activated */ return PAPI_ECMP; } /* Fill _papi_hwi_system_info.exe_info.fullname */ proc_info = __sol_get_proc_info( pid ); // If there are arguments, trim the string to the executable name. if ( proc_info->pr_argc > 1 ) { c_ptr = strchr( proc_info->pr_psargs, ' ' ); if ( c_ptr != NULL ) *c_ptr = '\0'; } /* If the path can be qualified, use the full path, otherwise the trimmed name. */ if ( realpath( proc_info->pr_psargs, exec_name ) != NULL ) { strncpy( _papi_hwi_system_info.exe_info.fullname, exec_name, PAPI_HUGE_STR_LEN ); } else { strncpy( _papi_hwi_system_info.exe_info.fullname, proc_info->pr_psargs, PAPI_HUGE_STR_LEN ); } /* Fill _papi_hwi_system_info.exe_info.address_info */ // Taken from the old component strncpy( _papi_hwi_system_info.exe_info.address_info.name, basename( _papi_hwi_system_info.exe_info.fullname ), PAPI_HUGE_STR_LEN ); __CHECK_ERR_PAPI( _niagara2_update_shlib_info( &_papi_hwi_system_info ) ); /* Fill _papi_hwi_system_info.hw_info */ // Taken from the old component _papi_hwi_system_info.hw_info.ncpu = sysconf( _SC_NPROCESSORS_ONLN ); _papi_hwi_system_info.hw_info.nnodes = 1; _papi_hwi_system_info.hw_info.vendor = PAPI_VENDOR_SUN; strcpy( _papi_hwi_system_info.hw_info.vendor_string, "SUN" ); _papi_hwi_system_info.hw_info.totalcpus = sysconf( _SC_NPROCESSORS_CONF ); _papi_hwi_system_info.hw_info.model = 1; strcpy( _papi_hwi_system_info.hw_info.model_string, cpc_cciname( cpc ) ); /* The field sparc-version is no longer in prtconf -pv */ _papi_hwi_system_info.hw_info.revision = 1; /* Clock speed */ _papi_hwi_system_info.hw_info.mhz = ( float ) __sol_get_processor_clock( ); _papi_hwi_system_info.hw_info.clock_mhz = __sol_get_processor_clock( ); _papi_hwi_system_info.hw_info.cpu_max_mhz = __sol_get_processor_clock( ); _papi_hwi_system_info.hw_info.cpu_min_mhz = __sol_get_processor_clock( ); /* Fill _niagara2_vector.cmp_info.mem_hierarchy */ _niagara2_get_memory_info( &_papi_hwi_system_info.hw_info, 0 ); /* 
Fill _papi_hwi_system_info.sub_info */ strcpy( _niagara2_vector.cmp_info.name, "SunNiagara2" ); strcpy( _niagara2_vector.cmp_info.version, "ALPHA" ); strcpy( _niagara2_vector.cmp_info.support_version, "libcpc2" ); strcpy( _niagara2_vector.cmp_info.kernel_version, "libcpc2" ); /* libcpc2 uses SIGEMT using real hardware signals, no sw emu */ #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } #endif int _solaris_get_system_info( papi_mdi_t *mdi ) { int retval; pid_t pid; char maxargs[PAPI_MAX_STR_LEN] = ""; psinfo_t psi; int fd; int hz, version; char cpuname[PAPI_MAX_STR_LEN], pname[PAPI_HUGE_STR_LEN]; /* Check counter access */ if ( cpc_version( CPC_VER_CURRENT ) != CPC_VER_CURRENT ) return PAPI_ECMP; SUBDBG( "CPC version %d successfully opened\n", CPC_VER_CURRENT ); if ( cpc_access( ) == -1 ) return PAPI_ECMP; /* Global variable cpuver */ cpuver = cpc_getcpuver( ); SUBDBG( "Got %d from cpc_getcpuver()\n", cpuver ); if ( cpuver == -1 ) return PAPI_ECMP; #ifdef DEBUG { if ( ISLEVEL( DEBUG_SUBSTRATE ) ) { const char *name; int i; name = cpc_getcpuref( cpuver ); if ( name ) { SUBDBG( "CPC CPU reference: %s\n", name ); } else { SUBDBG( "Could not get a CPC CPU reference\n" ); } for ( i = 0; i < cpc_getnpic( cpuver ); i++ ) { SUBDBG( "\n%6s %-40s %8s\n", "Reg", "Symbolic name", "Code" ); cpc_walk_names( cpuver, i, "%6d %-40s %02x\n", print_walk_names ); } SUBDBG( "\n" ); } } #endif /* Initialize other globals */ if ( ( retval = build_tables( ) ) != PAPI_OK ) return retval; preset_search_map = preset_table; if ( cpuver <= CPC_ULTRA2 ) { SUBDBG( "cpuver (==%d) <= CPC_ULTRA2 (==%d)\n", cpuver, CPC_ULTRA2 ); pcr_shift[0] = CPC_ULTRA_PCR_PIC0_SHIFT; pcr_shift[1] = CPC_ULTRA_PCR_PIC1_SHIFT; } else if ( cpuver <= LASTULTRA3 ) { SUBDBG( "cpuver (==%d) <= CPC_ULTRA3x (==%d)\n", cpuver, LASTULTRA3 ); pcr_shift[0] = CPC_ULTRA_PCR_PIC0_SHIFT; pcr_shift[1] = CPC_ULTRA_PCR_PIC1_SHIFT; 
_solaris_vector.cmp_info.hardware_intr = 1; _solaris_vector.cmp_info.hardware_intr_sig = SIGEMT; } else return PAPI_ECMP; /* Path and args */ pid = getpid( ); if ( pid == -1 ) return ( PAPI_ESYS ); /* Turn on microstate accounting for this process and any LWPs. */ sprintf( maxargs, "/proc/%d/ctl", ( int ) pid ); if ( ( fd = open( maxargs, O_WRONLY ) ) == -1 ) return ( PAPI_ESYS ); { int retval; struct { long cmd; long flags; } cmd; cmd.cmd = PCSET; cmd.flags = PR_MSACCT | PR_MSFORK; retval = write( fd, &cmd, sizeof ( cmd ) ); close( fd ); SUBDBG( "Write PCSET returned %d\n", retval ); if ( retval != sizeof ( cmd ) ) return ( PAPI_ESYS ); } /* Get executable info */ sprintf( maxargs, "/proc/%d/psinfo", ( int ) pid ); if ( ( fd = open( maxargs, O_RDONLY ) ) == -1 ) return ( PAPI_ESYS ); read( fd, &psi, sizeof ( psi ) ); close( fd ); /* Cut off any arguments to exe */ { char *tmp; tmp = strchr( psi.pr_psargs, ' ' ); if ( tmp != NULL ) *tmp = '\0'; } if ( realpath( psi.pr_psargs, pname ) ) strncpy( _papi_hwi_system_info.exe_info.fullname, pname, PAPI_HUGE_STR_LEN ); else strncpy( _papi_hwi_system_info.exe_info.fullname, psi.pr_psargs, PAPI_HUGE_STR_LEN ); /* please don't use pr_fname here, because it can only store less than 16 characters */ strcpy( _papi_hwi_system_info.exe_info.address_info.name, basename( _papi_hwi_system_info.exe_info.fullname ) ); SUBDBG( "Full Executable is %s\n", _papi_hwi_system_info.exe_info.fullname ); /* Executable regions, reading /proc/pid/maps file */ retval = _ultra_hwd_update_shlib_info( &_papi_hwi_system_info ); /* Hardware info */ _papi_hwi_system_info.hw_info.ncpu = sysconf( _SC_NPROCESSORS_ONLN ); _papi_hwi_system_info.hw_info.nnodes = 1; _papi_hwi_system_info.hw_info.totalcpus = sysconf( _SC_NPROCESSORS_CONF ); retval = scan_prtconf( cpuname, PAPI_MAX_STR_LEN, &hz, &version ); if ( retval == -1 ) return PAPI_ECMP; strcpy( _papi_hwi_system_info.hw_info.model_string, cpc_getcciname( cpuver ) ); _papi_hwi_system_info.hw_info.model = 
cpuver; strcpy( _papi_hwi_system_info.hw_info.vendor_string, "SUN" ); _papi_hwi_system_info.hw_info.vendor = PAPI_VENDOR_SUN; _papi_hwi_system_info.hw_info.revision = version; _papi_hwi_system_info.hw_info.mhz = ( ( float ) hz / 1.0e6 ); SUBDBG( "hw_info.mhz = %f\n", _papi_hwi_system_info.hw_info.mhz ); _papi_hwi_system_info.hw_info.cpu_max_mhz = _papi_hwi_system_info.hw_info.mhz; _papi_hwi_system_info.hw_info.cpu_min_mhz = _papi_hwi_system_info.hw_info.mhz; /* Number of PMCs */ retval = cpc_getnpic( cpuver ); if ( retval < 0 ) return PAPI_ECMP; _solaris_vector.cmp_info.num_cntrs = retval; _solaris_vector.cmp_info.fast_real_timer = 1; _solaris_vector.cmp_info.fast_virtual_timer = 1; _solaris_vector.cmp_info.default_domain = PAPI_DOM_USER; _solaris_vector.cmp_info.available_domains = PAPI_DOM_USER | PAPI_DOM_KERNEL; /* Setup presets */ retval = _papi_hwi_setup_all_presets( preset_search_map, NULL ); if ( retval ) return ( retval ); return ( PAPI_OK ); } long long _solaris_get_real_usec( void ) { return ( ( long long ) gethrtime( ) / ( long long ) 1000 ); } long long _solaris_get_real_cycles( void ) { return ( _ultra_hwd_get_real_usec( ) * ( long long ) _papi_hwi_system_info.hw_info.cpu_max_mhz ); } long long _solaris_get_virt_usec( void ) { return ( ( long long ) gethrvtime( ) / ( long long ) 1000 ); } papi-papi-7-2-0-t/src/solaris-common.h000066400000000000000000000030061502707512200175410ustar00rootroot00000000000000#ifndef _PAPI_SOLARIS_H #define _PAPI_SOLARIS_H #include #include #include #include #include int _solaris_update_shlib_info( papi_mdi_t *mdi ); int _solaris_get_system_info( papi_mdi_t *mdi ); long long _solaris_get_real_usec( void ); long long _solaris_get_real_cycles( void ); long long _solaris_get_virt_usec( void ); /* Assembler prototypes */ extern void cpu_sync( void ); extern vptr_t _start, _end, _etext, _edata; extern rwlock_t lock[PAPI_MAX_LOCK]; #define _papi_hwd_lock(lck) rw_wrlock(&lock[lck]); #define _papi_hwd_unlock(lck) 
rw_unlock(&lock[lck]); #endif #if 0 #include ! #include "solaris-ultra.h" ! These functions blatantly stolen from perfmon ! The author of the package "perfmon" is Richard J. Enbody ! and the home page for "perfmon" is ! http://www.cps.msu.edu/~enbody/perfmon/index.html ! ! extern void cpu_sync(void); ! ! Make sure all instructinos and memory references before us ! have been completed. .global cpu_sync ENTRY(cpu_sync) membar #Sync ! Wait for all outstanding things to finish retl ! Return to the caller nop ! Delay slot SET_SIZE(cpu_sync) ! ! extern unsigned long long get_tick(void) ! ! Read the tick register and return it .global get_tick ENTRY(get_tick) rd %tick, %o0 ! Get the current value of TICK clruw %o0, %o1 ! put the lower 32 bits into %o1 retl ! Return to the caller srlx %o0, 32, %o0 ! put the upper 32 bits into %o0 SET_SIZE(get_tick) #endif papi-papi-7-2-0-t/src/solaris-context.h000066400000000000000000000005221502707512200177350ustar00rootroot00000000000000#ifndef _SOLARIS_CONTEXT_H #define _SOLARIS_CONTEXT_H #include typedef siginfo_t _solaris_siginfo_t; #define hwd_siginfo_t _solaris_siginfo_t typedef ucontext_t _solaris_ucontext_t; #define hwd_ucontext_t _solaris_ucontext_t #define GET_OVERFLOW_ADDRESS(ctx) (void*)(ctx->ucontext->uc_mcontext.gregs[REG_PC]) #endif papi-papi-7-2-0-t/src/solaris-lock.h000066400000000000000000000002261502707512200172020ustar00rootroot00000000000000extern rwlock_t lock[PAPI_MAX_LOCK]; #define _papi_hwd_lock(lck) rw_wrlock(&lock[lck]); #define _papi_hwd_unlock(lck) rw_unlock(&lock[lck]); papi-papi-7-2-0-t/src/solaris-memory.c000066400000000000000000000122311502707512200175540ustar00rootroot00000000000000/* * File: solaris-memory.c * Author: Kevin London * london@cs.utk.edu * * Mods: Philip J. 
Mucci * mucci@cs.utk.edu * * Mods: Vince Weaver * vweaver1@eecs.utk.edu * * Mods: Fabian Gorsler * fabian.gorsler@smail.inf.h-bonn-rhein-sieg.de */ #include "papi.h" #include "papi_internal.h" int _solaris_get_memory_info( PAPI_hw_info_t * hw, int id ) { FILE *pipe; char line[BUFSIZ]; PAPI_mh_level_t *mem = hw->mem_hierarchy.level; pipe=popen("prtconf -pv","r"); if (pipe==NULL) { return PAPI_ESYS; } while(1) { if (fgets(line,BUFSIZ,pipe)==NULL) break; if (strstr(line,"icache-size:")) { sscanf(line,"%*s %#x",&mem[0].cache[0].size); } if (strstr(line,"icache-line-size:")) { sscanf(line,"%*s %#x",&mem[0].cache[0].line_size); } if (strstr(line,"icache-associativity:")) { sscanf(line,"%*s %#x",&mem[0].cache[0].associativity); } if (strstr(line,"dcache-size:")) { sscanf(line,"%*s %#x",&mem[0].cache[1].size); } if (strstr(line,"dcache-line-size:")) { sscanf(line,"%*s %#x",&mem[0].cache[1].line_size); } if (strstr(line,"dcache-associativity:")) { sscanf(line,"%*s %#x",&mem[0].cache[1].associativity); } if (strstr(line,"ecache-size:")) { sscanf(line,"%*s %#x",&mem[1].cache[0].size); } if (strstr(line,"ecache-line-size:")) { sscanf(line,"%*s %#x",&mem[1].cache[0].line_size); } if (strstr(line,"ecache-associativity:")) { sscanf(line,"%*s %#x",&mem[1].cache[0].associativity); } if (strstr(line,"#itlb-entries:")) { sscanf(line,"%*s %#x",&mem[0].tlb[0].num_entries); } if (strstr(line,"#dtlb-entries:")) { sscanf(line,"%*s %#x",&mem[0].tlb[1].num_entries); } } pclose(pipe); /* I-Cache -> L1$ instruction */ mem[0].cache[0].type = PAPI_MH_TYPE_INST; if (mem[0].cache[0].line_size!=0) mem[0].cache[0].num_lines = mem[0].cache[0].size / mem[0].cache[0].line_size; /* D-Cache -> L1$ data */ mem[0].cache[1].type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_WT | PAPI_MH_TYPE_LRU; if (mem[0].cache[1].line_size!=0) mem[0].cache[1].num_lines = mem[0].cache[1].size / mem[0].cache[1].line_size; /* ITLB -> TLB instruction */ mem[0].tlb[0].type = PAPI_MH_TYPE_INST | PAPI_MH_TYPE_PSEUDO_LRU; /* assume fully 
associative */ mem[0].tlb[0].associativity = mem[0].tlb[0].num_entries; /* DTLB -> TLB data */ mem[0].tlb[1].type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_PSEUDO_LRU; /* assume fully associative */ mem[0].tlb[1].associativity = mem[0].tlb[1].num_entries; /* L2$ unified */ mem[1].cache[0].type = PAPI_MH_TYPE_UNIFIED | PAPI_MH_TYPE_WB | PAPI_MH_TYPE_PSEUDO_LRU; if (mem[1].cache[0].line_size!=0) mem[1].cache[0].num_lines = mem[1].cache[0].size / mem[1].cache[0].line_size; /* Indicate we have two levels filled in the hierarchy */ hw->mem_hierarchy.levels = 2; return PAPI_OK; } int _solaris_get_dmem_info( PAPI_dmem_info_t * d ) { FILE *fd; struct psinfo psi; if ( ( fd = fopen( "/proc/self/psinfo", "r" ) ) == NULL ) { SUBDBG( "fopen(/proc/self) errno %d", errno ); return ( PAPI_ESYS ); } fread( ( void * ) &psi, sizeof ( struct psinfo ), 1, fd ); fclose( fd ); d->pagesize = sysconf( _SC_PAGESIZE ); d->size = d->pagesize * sysconf( _SC_PHYS_PAGES ); d->resident = ( ( 1024 * psi.pr_size ) / d->pagesize ); d->high_water_mark = PAPI_EINVAL; d->shared = PAPI_EINVAL; d->text = PAPI_EINVAL; d->library = PAPI_EINVAL; d->heap = PAPI_EINVAL; d->locked = PAPI_EINVAL; d->stack = PAPI_EINVAL; return PAPI_OK; } int _niagara2_get_memory_info( PAPI_hw_info_t * hw, int id ) { PAPI_mh_level_t *mem = hw->mem_hierarchy.level; /* I-Cache -> L1$ instruction */ /* FIXME: The policy used at this cache is unknown to PAPI. LSFR with random replacement. 
*/ mem[0].cache[0].type = PAPI_MH_TYPE_INST; mem[0].cache[0].size = 16 * 1024; // 16 Kb mem[0].cache[0].line_size = 32; mem[0].cache[0].num_lines = mem[0].cache[0].size / mem[0].cache[0].line_size; mem[0].cache[0].associativity = 8; /* D-Cache -> L1$ data */ mem[0].cache[1].type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_WT | PAPI_MH_TYPE_LRU; mem[0].cache[1].size = 8 * 1024; // 8 Kb mem[0].cache[1].line_size = 16; mem[0].cache[1].num_lines = mem[0].cache[1].size / mem[0].cache[1].line_size; mem[0].cache[1].associativity = 4; /* ITLB -> TLB instruction */ mem[0].tlb[0].type = PAPI_MH_TYPE_INST | PAPI_MH_TYPE_PSEUDO_LRU; mem[0].tlb[0].num_entries = 64; mem[0].tlb[0].associativity = 64; /* DTLB -> TLB data */ mem[0].tlb[1].type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_PSEUDO_LRU; mem[0].tlb[1].num_entries = 128; mem[0].tlb[1].associativity = 128; /* L2$ unified */ mem[1].cache[0].type = PAPI_MH_TYPE_UNIFIED | PAPI_MH_TYPE_WB | PAPI_MH_TYPE_PSEUDO_LRU; mem[1].cache[0].size = 4 * 1024 * 1024; // 4 Mb mem[1].cache[0].line_size = 64; mem[1].cache[0].num_lines = mem[1].cache[0].size / mem[1].cache[0].line_size; mem[1].cache[0].associativity = 16; /* Indicate we have two levels filled in the hierarchy */ hw->mem_hierarchy.levels = 2; return PAPI_OK; } papi-papi-7-2-0-t/src/solaris-memory.h000066400000000000000000000002571502707512200175660ustar00rootroot00000000000000int _solaris_get_memory_info( PAPI_hw_info_t * hw, int id ); int _solaris_get_dmem_info( PAPI_dmem_info_t * d ); int _niagara2_get_memory_info( PAPI_hw_info_t * hw, int id ); papi-papi-7-2-0-t/src/solaris-niagara2.c000066400000000000000000001632341502707512200177420ustar00rootroot00000000000000/******************************************************************************* * >>>>>> "Development of a PAPI Backend for the Sun Niagara 2 Processor" <<<<<< * ----------------------------------------------------------------------------- * * Fabian Gorsler * * Hochschule Bonn-Rhein-Sieg, Sankt Augustin, Germany * University of 
Applied Sciences * * ----------------------------------------------------------------------------- * * File: solaris-niagara2.c * Author: fg215045 * * Description: This source file is the implementation of a PAPI * component for the Sun Niagara 2 processor (aka UltraSPARC T2) * running on Solaris 10 with libcpc 2. * The machine for implementing this component was courtesy of RWTH * Aachen University, Germany. Thanks to the HPC-Team at RWTH! * * Conventions used: * - __cpc_*: Functions, variables, etc. related to libcpc handling * - __sol_*: Functions, variables, etc. related to Solaris handling * - __int_*: Functions, variables, etc. related to extensions of libcpc * - _niagara*: Functions, variables, etc. needed by PAPI hardware dependent * layer, i.e. the component itself * * * ***** Feel free to convert this header to the PAPI default ***** * * ----------------------------------------------------------------------------- * Created on April 23, 2009, 7:31 PM ******************************************************************************/ #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "solaris-niagara2.h" #include "papi_memory.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "solaris-common.h" #include "solaris-memory.h" #define hwd_control_state_t _niagara2_control_state_t #define hwd_context_t _niagara2_context_t #define hwd_register_t _niagara2_register_t extern vptr_t _start, _end, _etext, _edata; extern papi_vector_t _niagara2_vector; /* Synthetic events */ int __int_setup_synthetic_event( int, hwd_control_state_t *, void * ); uint64_t __int_get_synthetic_event( int, hwd_control_state_t *, void * ); void __int_walk_synthetic_events_action_count( void ); void __int_walk_synthetic_events_action_store( void ); /* Simple error handlers for convenience */ #define __CHECK_ERR_DFLT(retval) \ if(retval != 
0){ SUBDBG("RETVAL: %d\n", retval); return PAPI_ECMP;} #define __CHECK_ERR_NULL(retval) \ if(retval == NULL){ SUBDBG("RETVAL: NULL\n"); return PAPI_ECMP;} #define __CHECK_ERR_PAPI(retval) \ if(retval != PAPI_OK){ SUBDBG("RETVAL: %d\n", retval); return PAPI_ECMP;} #define __CHECK_ERR_INVA(retval) \ if(retval != 0){ SUBDBG("RETVAL: %d\n", retval); return PAPI_EINVAL;} #define __CHECK_ERR_NEGV(retval) \ if(retval < 0){ SUBDBG("RETVAL: %d\n", retval); return PAPI_ECMP;} // PAPI defined variables extern papi_mdi_t _papi_hwi_system_info; // The instance of libcpc static cpc_t *cpc = NULL; typedef struct __t2_store { // Number of counters for a processing unit int npic; int *pic_ntv_count; int syn_evt_count; } __t2_store_t; static __t2_store_t __t2_store; static char **__t2_ntv_events; // Variables copied from the old component static int pid; // Data types for utility functions typedef struct __sol_processor_information { int total; int clock; } __sol_processor_information_t; typedef struct __t2_pst_table { int papi_pst; char *ntv_event[MAX_COUNTERS]; int ntv_ctrs; int ntv_opcode; } __t2_pst_table_t; #define SYNTHETIC_EVENTS_SUPPORTED 1 /* This table structure holds all preset events */ static __t2_pst_table_t __t2_table[] = { /* Presets defined by generic_events(3CPC) */ {PAPI_L1_DCM, {"DC_miss", NULL}, 1, NOT_DERIVED}, {PAPI_L1_ICM, {"IC_miss", NULL}, 1, NOT_DERIVED}, {PAPI_L2_ICM, {"L2_imiss", NULL}, 1, NOT_DERIVED}, {PAPI_TLB_DM, {"DTLB_miss", NULL}, 1, NOT_DERIVED}, {PAPI_TLB_IM, {"ITLB_miss", NULL}, 1, NOT_DERIVED}, {PAPI_TLB_TL, {"TLB_miss", NULL}, 1, NOT_DERIVED}, {PAPI_L2_LDM, {"L2_dmiss_ld", NULL}, 1, NOT_DERIVED}, {PAPI_BR_TKN, {"Br_taken", NULL}, 1, NOT_DERIVED}, {PAPI_TOT_INS, {"Instr_cnt", NULL}, 1, NOT_DERIVED}, {PAPI_LD_INS, {"Instr_ld", NULL}, 1, NOT_DERIVED}, {PAPI_SR_INS, {"Instr_st", NULL}, 1, NOT_DERIVED}, {PAPI_BR_INS, {"Br_completed", NULL}, 1, NOT_DERIVED}, /* Presets additionally found, should be checked twice */ {PAPI_BR_MSP, {"Br_taken", NULL}, 
1, NOT_DERIVED}, {PAPI_FP_INS, {"Instr_FGU_arithmetic", NULL}, 1, NOT_DERIVED}, {PAPI_RES_STL, {"Idle_strands", NULL}, 1, NOT_DERIVED}, {PAPI_SYC_INS, {"Atomics", NULL}, 1, NOT_DERIVED}, {PAPI_L2_ICR, {"CPU_ifetch_to_PCX", NULL}, 1, NOT_DERIVED}, {PAPI_L1_TCR, {"CPU_ld_to_PCX", NULL}, 1, NOT_DERIVED}, {PAPI_L2_TCW, {"CPU_st_to_PCX", NULL}, 1, NOT_DERIVED}, /* Derived presets found, should be checked twice */ {PAPI_L1_TCM, {"IC_miss", "DC_miss"}, 2, DERIVED_ADD}, {PAPI_BR_CN, {"Br_completed", "Br_taken"}, 2, DERIVED_ADD}, {PAPI_BR_PRC, {"Br_completed", "Br_taken"}, 2, DERIVED_SUB}, {PAPI_LST_INS, {"Instr_st", "Instr_ld"}, 2, DERIVED_ADD}, #ifdef SYNTHETIC_EVENTS_SUPPORTED /* This preset does exist in order to support multiplexing */ {PAPI_TOT_CYC, {"_syn_cycles_elapsed", "DC_miss"}, 1, NOT_DERIVED}, #endif {0, {NULL, NULL}, 0, 0}, }; hwi_search_t *preset_table; #ifdef SYNTHETIC_EVENTS_SUPPORTED enum { SYNTHETIC_CYCLES_ELAPSED = 1, SYNTHETIC_RETURN_ONE, SYNTHETIC_RETURN_TWO, } __int_synthetic_enum; #endif #ifdef SYNTHETIC_EVENTS_SUPPORTED typedef struct __int_synthetic_table { int code; char *name; } __int_syn_table_t; #endif #ifdef SYNTHETIC_EVENTS_SUPPORTED static __int_syn_table_t __int_syn_table[] = { {SYNTHETIC_CYCLES_ELAPSED, "_syn_cycles_elapsed"}, {SYNTHETIC_RETURN_ONE, "_syn_return_one"}, {SYNTHETIC_RETURN_TWO, "_syn_return_two"}, {-1, NULL}, }; #endif //////////////////////////////////////////////////////////////////////////////// /// PAPI HWD LAYER RELATED FUNCTIONS /////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////////// /* DESCRIPTION: * ----------------------------------------------------------------------------- * Functions in this section are related to the PAPI hardware dependend layer, * also known as "HWD". In this case the HWD layer is the interface from PAPI * to libcpc 2/Solaris 10. 
******************************************************************************/ int _niagara2_set_domain( hwd_control_state_t * ctrl, int domain ) { int i; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif /* Clean and set the new flag for each counter */ for ( i = 0; i < MAX_COUNTERS; i++ ) { #ifdef DEBUG SUBDBG( " -> %s: Setting flags for PIC#%d, old value: %p\n", __func__, i, ctrl->flags[i] ); #endif ctrl->flags[i] &= ~( CPC_COUNTING_DOMAINS ); #ifdef DEBUG SUBDBG( " -> %s: +++ cleaned value: %p\n", __func__, ctrl->flags[i] ); #endif ctrl->flags[i] |= __cpc_domain_translator( domain ); #ifdef DEBUG SUBDBG( " -> %s: +++ new value: %p\n", __func__, ctrl->flags[i] ); #endif } /* Recreate the set */ __CHECK_ERR_PAPI( __cpc_recreate_set( ctrl ) ); #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } int _niagara2_ctl( hwd_context_t * ctx, int code, _papi_int_option_t * option ) { #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); SUBDBG( " -> %s: Option #%d requested\n", __func__, code ); #endif /* Only these options are handled which are handled in PAPI_set_opt, as many of the left out options are not settable, like PAPI_MAX_CPUS. */ switch ( code ) { case PAPI_DEFDOM: /* From papi.h: Domain for all new eventsets. Takes non-NULL option pointer. 
*/ _niagara2_vector.cmp_info.default_domain = option->domain.domain; return PAPI_OK; case PAPI_DOMAIN: /* From papi.h: Domain for an eventset */ return _niagara2_set_domain( ctx, option->domain.domain ); case PAPI_DEFGRN: /* From papi.h: Granularity for all new eventsets */ _niagara2_vector.cmp_info.default_granularity = option->granularity.granularity; return PAPI_OK; case PAPI_GRANUL: /* From papi.h: Granularity for an eventset */ /* Only supported granularity is PAPI_GRN_THREAD */ return PAPI_OK; case PAPI_DEF_MPX_NS: /* From papi.h: Multiplexing/overflowing interval in ns, same as PAPI_DEF_ITIMER_NS */ /* From the old component */ option->itimer.ns = __sol_get_itimer_ns( option->itimer.ns ); #ifdef DEBUG SUBDBG( " -> %s: PAPI_DEF_MPX_NS, option->itimer.ns=%d\n", __func__, option->itimer.ns ); #endif return PAPI_OK; case PAPI_DEF_ITIMER: // IN THE OLD COMPONENT // USED /* From papi.h: Option to set the type of itimer used in both software multiplexing, overflowing and profiling */ /* These tests are taken from the old component. For Solaris 10 the same rules apply as documented in getitimer(2). */ if ( ( option->itimer.itimer_num == ITIMER_REAL ) && ( option->itimer.itimer_sig != SIGALRM ) ) { #ifdef DEBUG SUBDBG( " -> %s: PAPI_DEF_ITIMER, ITIMER_REAL needs SIGALRM\n", __func__ ); #endif return PAPI_EINVAL; } if ( ( option->itimer.itimer_num == ITIMER_VIRTUAL ) && ( option->itimer.itimer_sig != SIGVTALRM ) ) { #ifdef DEBUG SUBDBG( " -> %s: PAPI_DEF_ITIMER, ITIMER_VIRTUAL needs SIGVTALRM\n", __func__ ); #endif return PAPI_EINVAL; } if ( ( option->itimer.itimer_num == ITIMER_PROF ) && ( option->itimer.itimer_sig != SIGPROF ) ) { #ifdef DEBUG SUBDBG( " -> %s: PAPI_DEF_ITIMER, ITIMER_PROF needs SIGPROF\n", __func__ ); #endif return PAPI_EINVAL; } /* As in the old component defined, timer values below 0 are NOT filtered out, but timer values greater than 0 are rounded, either to a value which is at least itimer_res_ns or padded to a multiple of itimer_res_ns. 
*/ if ( option->itimer.ns > 0 ) { option->itimer.ns = __sol_get_itimer_ns( option->itimer.ns ); #ifdef DEBUG SUBDBG( " -> %s: PAPI_DEF_ITIMER, option->itimer.ns=%d\n", __func__, option->itimer.ns ); #endif } return PAPI_OK; case PAPI_DEF_ITIMER_NS: // IN THE OLD COMPONENT // USED /* From papi.h: Multiplexing/overflowing interval in ns, same as PAPI_DEF_MPX_NS */ /* From the old component */ option->itimer.ns = __sol_get_itimer_ns( option->itimer.ns ); #ifdef DEBUG SUBDBG( " -> %s: PAPI_DEF_ITIMER_NS, option->itimer.ns=%d\n", __func__, option->itimer.ns ); #endif return PAPI_OK; } #ifdef DEBUG SUBDBG( " -> %s: Option not found\n", __func__ ); SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif /* This place should never be reached */ return PAPI_EINVAL; } void _niagara2_dispatch_timer( int signal, siginfo_t * si, void *info ) { EventSetInfo_t *ESI = NULL; ThreadInfo_t *thread = NULL; int overflow_vector = 0; hwd_control_state_t *ctrl = NULL; long_long results[MAX_COUNTERS]; int i; // Hint from perf_events.c int cidx = _niagara2_vector.cmp_info.CmpIdx; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); SUBDBG( " -> %s: Overflow handler called by signal #%d\n", __func__, signal ); #endif /* From the old component */ thread = _papi_hwi_lookup_thread( 0 ); ESI = ( EventSetInfo_t * ) thread->running_eventset[cidx]; /* From the old component, modified */ // if ( ESI == NULL || ESI->master != thread || ESI->ctl_state == NULL || ( ( ESI->state & PAPI_OVERFLOWING ) == 0 ) ) { #ifdef DEBUG SUBDBG( " -> %s: Problems with ESI, not necessarily serious\n", __func__ ); if ( ESI == NULL ) { SUBDBG( " -> %s: +++ ESI is NULL\n", __func__ ); } if ( ESI->master != thread ) { SUBDBG( " -> %s: +++ Thread mismatch, ESI->master=%#x thread=%#x\n", __func__, ESI->master, thread ); } if ( ESI->ctl_state == NULL ) { SUBDBG( " -> %s: +++ Counter state invalid\n", __func__ ); } if ( ( ( ESI->state & PAPI_OVERFLOWING ) == 0 
) ) { SUBDBG ( " -> %s: +++ Overflow flag missing, ESI->overflow.flags=%#x\n", __func__, ESI->overflow.flags ); } #endif return; } #ifdef DEBUG printf( " -> %s: Preconditions valid, trying to read counters\n", __func__ ); #endif ctrl = ESI->ctl_state; if ( _niagara2_read ( ctrl, ctrl, ( long_long ** ) & results, NOT_A_PAPI_HWD_READ ) != PAPI_OK ) { /* Failure */ #ifdef DEBUG printf( "%s: Failed to read counters\n", __func__ ); #endif return; } else { /* Success */ #ifdef DEBUG SUBDBG( " -> %s: Counters read\n", __func__ ); #endif /* Iterate over all available counters in order to detect which counter overflowed (counter value should be 0 if an hw overflow happened), store the position in the overflow_vector, calculate the offset and shift (value range signed long long vs. unsigned long long). */ for ( i = 0; i < ctrl->count; i++ ) { if ( results[i] >= 0 ) { #ifdef DEBUG SUBDBG( " -> %s: Overflow detected at PIC #%d\n", __func__, i ); #endif /* Set the bit in the overflow_vector */ overflow_vector = overflow_vector | ( 1 << i ); /* Choose which method to use depending on the overflow signal. */ if ( signal == SIGEMT ) { /* Store the counter value, but only if we have a real * hardware overflow counting with libcpc/SIGEMT. */ ctrl->preset[i] = UINT64_MAX - ctrl->threshold[i]; ctrl->hangover[i] += ctrl->threshold[i]; } else { /* Push the value back, this time PAPI does the work. This is software overflow handling. */ cpc_request_preset( cpc, ctrl->idx[i], ctrl->result[i] ); } } else { #ifdef DEBUG SUBDBG( " -> %s: No overflow detected at PIC #%d, value=%ld\n", __func__, i, results[i] ); #endif /* Save the results read from the counter as we can not store the temporary value in hardware or libcpc. 
*/ if ( signal == SIGEMT ) { ctrl->preset[i] += results[i]; ctrl->hangover[i] = results[i]; } } } #ifdef DEBUG SUBDBG( " -> %s: Restarting set to push values back\n", __func__ ); #endif /* Push all values back to the counter as preset */ cpc_set_restart( cpc, ctrl->set ); } #ifdef DEBUG SUBDBG( " -> %s: Passing overflow to PAPI with overflow_vector=%p\n", __func__, overflow_vector ); #endif { /* hw is used as pointer in the dispatching routine of PAPI and might be changed. For safety it is not a pseudo pointer to NULL. */ int hw; if ( signal == SIGEMT ) { /* This is a hardware overflow */ hw = 1; _papi_hwi_dispatch_overflow_signal( ctrl, ( vptr_t ) _niagara2_get_overflow_address ( info ), &hw, overflow_vector, 1, &thread, ESI->CmpIdx ); } else { /* This is a software overflow */ hw = 0; _papi_hwi_dispatch_overflow_signal( ctrl, ( vptr_t ) _niagara2_get_overflow_address ( info ), &hw, overflow_vector, 1, &thread, ESI->CmpIdx ); } } #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif } static inline void * _niagara2_get_overflow_address( void *context ) { ucontext_t *ctx = ( ucontext_t * ) context; #ifdef DEBUG SUBDBG( "ENTERING/LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return ( void * ) ctx->uc_mcontext.gregs[REG_PC]; } /** Although the created set in this function will be destroyed by * _papi_update_control_state later, at least the functionality of the * underlying CPU driver will be tested completly. 
*/ int _niagara2_init_control_state( hwd_control_state_t * ctrl ) { int i; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif // cpc_seterrhndlr(cpc, myapp_errfn); /* Clear the buffer */ if ( ctrl->counter_buffer != NULL ) { #ifdef DEBUG SUBDBG( " -> %s: Cleaning buffer\n", __func__ ); #endif cpc_buf_destroy( cpc, ctrl->counter_buffer ); ctrl->counter_buffer = NULL; } /* Clear the set */ if ( ctrl->set != NULL ) { #ifdef DEBUG SUBDBG( " -> %s: Cleaning set\n", __func__ ); #endif cpc_set_destroy( cpc, ctrl->set ); ctrl->set = NULL; } /* Indicate this idx has no request associated, this counter is unused. */ for ( i = 0; i < MAX_COUNTERS; i++ ) { #ifdef DEBUG SUBDBG( " -> %s: Cleaning counter state #%d\n", __func__, i ); #endif /* Indicate missing setup values */ ctrl->idx[i] = EVENT_NOT_SET; ctrl->code[i].event_code = EVENT_NOT_SET; /* No flags yet set, this is for overflow and binding */ ctrl->flags[i] = 0; /* Preset value for counting results */ ctrl->preset[i] = DEFAULT_CNTR_PRESET; /* Needed for overflow handling, will be set later */ ctrl->threshold[i] = 0; ctrl->hangover[i] = 0; #ifdef SYNTHETIC_EVENTS_SUPPORTED ctrl->syn_hangover[i] = 0; #endif } /* No counters active in this set */ ctrl->count = 0; #ifdef SYNTHETIC_EVENTS_SUPPORTED ctrl->syn_count = 0; #endif #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } int _niagara2_init_component( int cidx ) { #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif /* Create an instance of libcpc */ #ifdef DEBUG SUBDBG( " -> %s: Trying to initalize libcpc\n", __func__ ); #endif cpc = cpc_open( CPC_VER_CURRENT ); __CHECK_ERR_NULL( cpc ); #ifdef DEBUG SUBDBG( " -> %s: Registering libcpc error handler\n", __func__ ); #endif cpc_seterrhndlr( cpc, __cpc_error_handler ); #ifdef DEBUG SUBDBG( " -> %s: Detecting supported PICs", __func__ ); #endif __t2_store.npic = 
cpc_npic( cpc ); #ifdef DEBUG SUBDBG( " -> %s: Storing component index, cidx=%d\n", __func__, cidx ); #endif _niagara2_vector.cmp_info.CmpIdx = cidx; #ifdef DEBUG SUBDBG( " -> %s: Gathering system information for PAPI\n", __func__ ); #endif /* Store system info in central data structure */ __CHECK_ERR_PAPI( _niagara2_get_system_info( &_papi_hwi_system_info ) ); #ifdef DEBUG SUBDBG( " -> %s: Initializing locks\n", __func__ ); #endif /* Set up the lock after initialization */ _niagara2_lock_init( ); // Copied from the old component, _papi_init_component() SUBDBG( "Found %d %s %s CPUs at %d MHz.\n", _papi_hwi_system_info.hw_info.totalcpus, _papi_hwi_system_info.hw_info.vendor_string, _papi_hwi_system_info.hw_info.model_string, _papi_hwi_system_info.hw_info.cpu_max_mhz ); /* Build native event table */ #ifdef DEBUG SUBDBG( " -> %s: Building native event table\n", __func__ ); #endif __CHECK_ERR_PAPI( __cpc_build_ntv_table( ) ); /* Build preset event table */ #ifdef DEBUG SUBDBG( " -> %s: Building PAPI preset table\n", __func__ ); #endif __CHECK_ERR_PAPI( __cpc_build_pst_table( ) ); /* Register presets and finish event related setup */ #ifdef DEBUG SUBDBG( " -> %s: Registering presets in PAPI\n", __func__ ); #endif __CHECK_ERR_PAPI( _papi_hwi_setup_all_presets( preset_table, NULL ) ); #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif /* Everything is ok */ return PAPI_OK; } static void _niagara2_lock_init( void ) { #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif /* Copied from old component, lock_init() */ memset( lock, 0x0, sizeof ( rwlock_t ) * PAPI_MAX_LOCK ); #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif } int _niagara2_ntv_code_to_bits( unsigned int EventCode, hwd_register_t * bits ) { int event_code = EventCode & PAPI_NATIVE_AND_MASK; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__,
__LINE__ ); #endif if ( event_code < 0 || event_code > _niagara2_vector.cmp_info.num_native_events ) { return PAPI_ENOEVNT; } bits->event_code = event_code; #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } int _niagara2_ntv_code_to_descr( unsigned int EventCode, char *ntv_descr, int len ) { #ifdef DEBUG SUBDBG( "ENTERING/LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif /* libcpc offers no descriptions, just a link to the reference manual */ return _niagara2_ntv_code_to_name( EventCode, ntv_descr, len ); } int _niagara2_ntv_code_to_name( unsigned int EventCode, char *ntv_name, int len ) { int event_code = EventCode & PAPI_NATIVE_AND_MASK; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif if ( event_code >= 0 && event_code <= _niagara2_vector.cmp_info.num_native_events ) { strlcpy( ntv_name, __t2_ntv_events[event_code], len ); if ( strlen( __t2_ntv_events[event_code] ) > ( size_t ) ( len - 1 ) ) { #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif /* It's not a real error, but at least a hint */ return PAPI_EBUF; } #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_ENOEVNT; } int _niagara2_ntv_enum_events( unsigned int *EventCode, int modifier ) { /* This code is very similar to the code from the old component.
*/ int event_code = *EventCode & PAPI_NATIVE_AND_MASK; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif if ( modifier == PAPI_ENUM_FIRST ) { *EventCode = PAPI_NATIVE_MASK + 1; #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } /* The table needs to be shifted by one position (starting index 1), as PAPI expects native event codes not to be 0 (papi_internal.c:744). */ if ( event_code >= 1 && event_code <= _niagara2_vector.cmp_info.num_native_events - 1 ) { *EventCode = *EventCode + 1; #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif // If nothing was found, report an error return PAPI_ENOEVNT; } int _niagara2_read( hwd_context_t * ctx, hwd_control_state_t * ctrl, long_long ** events, int flags ) { int i; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); SUBDBG( " -> %s: called with flags=%#x\n", __func__, flags ); #endif /* Take a new sample from the PIC to the buffer */ __CHECK_ERR_DFLT( cpc_set_sample( cpc, ctrl->set, ctrl->counter_buffer ) ); /* Copy the buffer values from all active counters */ for ( i = 0; i < ctrl->count; i++ ) { /* Retrieve the counting results of libcpc */ __CHECK_ERR_DFLT( cpc_buf_get( cpc, ctrl->counter_buffer, ctrl->idx[i], &ctrl->result[i] ) ); /* As libcpc uses uint64_t and PAPI uses int64_t, we need to normalize the result back to a value that PAPI can handle, otherwise it is in the negative range of int64_t and becomes useless for PAPI.
*/ if ( ctrl->threshold[i] > 0 ) { #ifdef DEBUG SUBDBG( " -> %s: Normalizing result on PIC#%d to %lld\n", __func__, i, ctrl->result[i] ); #endif /* DEBUG */ /* This shifts the retrieved value back to the PAPI value range */ ctrl->result[i] = ctrl->result[i] - ( UINT64_MAX - ctrl->threshold[i] ) - 1; /* Needed if called internally when a PIC didn't really overflow, but was programmed in the same set. */ if ( flags != NOT_A_PAPI_HWD_READ ) { ctrl->result[i] = ctrl->hangover[i]; } #ifdef DEBUG SUBDBG( " -> %s: Overflow scaling on PIC#%d:\n", __func__, i ); SUBDBG( " -> %s: +++ ctrl->result[%d]=%llu\n", __func__, i, ctrl->result[i] ); SUBDBG( " -> %s: +++ ctrl->threshold[%d]=%lld\n", __func__, i, ctrl->threshold[i] ); SUBDBG( " -> %s: +++ ctrl->hangover[%d]=%lld\n", __func__, i, ctrl->hangover[i] ); #endif } #ifdef DEBUG SUBDBG( " -> %s: +++ ctrl->result[%d]=%llu\n", __func__, i, ctrl->result[i] ); #endif } #ifdef SYNTHETIC_EVENTS_SUPPORTED { int i; const int syn_barrier = _niagara2_vector.cmp_info.num_native_events - __t2_store.syn_evt_count; for ( i = 0; i < ctrl->count; i++ ) { if ( ctrl->code[i].event_code >= syn_barrier ) { ctrl->result[i] = __int_get_synthetic_event( ctrl->code[i].event_code - syn_barrier, ctrl, &i ); } } } #endif /* Pass the address of the results back to the calling function */ *events = ( long_long * ) & ctrl->result[0]; #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } int _niagara2_reset( hwd_context_t * ctx, hwd_control_state_t * ctrl ) { #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif /* This does a restart of the whole set, setting the internal counters back to the value passed as preset of the last call of cpc_set_add_request or cpc_request_preset.
*/ cpc_set_restart( cpc, ctrl->set ); #ifdef SYNTHETIC_EVENTS_SUPPORTED { const int syn_barrier = _niagara2_vector.cmp_info.num_native_events - __t2_store.syn_evt_count; int i; if ( ctrl->syn_count > 0 ) { for ( i = 0; i < MAX_COUNTERS; i++ ) { if ( ctrl->code[i].event_code >= syn_barrier ) { ctrl->syn_hangover[i] += __int_get_synthetic_event( ctrl->code[i].event_code - syn_barrier, ctrl, &i ); } } } } #endif #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } int _niagara2_set_profile( EventSetInfo_t * ESI, int EventIndex, int threshold ) { /* Seems not to be used. */ #ifdef DEBUG SUBDBG( "ENTERING/LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_ENOSUPP; } int _niagara2_set_overflow( EventSetInfo_t * ESI, int EventIndex, int threshold ) { hwd_control_state_t *ctrl = ESI->ctl_state; struct sigaction sigact; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); SUBDBG( " -> %s: Overflow handling for %p on PIC#%d requested\n", __func__, ctrl, EventIndex ); SUBDBG( " -> %s: ESI->overflow.flags=%#x\n", __func__, ESI->overflow.flags ); #endif /* If threshold > 0, then activate hardware overflow handling, otherwise disable it. */ if ( threshold > 0 ) { #ifdef DEBUG SUBDBG( " -> %s: Activating overflow handling\n", __func__ ); #endif ctrl->preset[EventIndex] = UINT64_MAX - threshold; ctrl->threshold[EventIndex] = threshold; /* If SIGEMT is not yet enabled, enable it. In libcpc this means to re-create the used set. In order not to break PAPI operations only the event referred to by EventIndex will be updated to use SIGEMT.
*/ if ( !( ctrl->flags[EventIndex] & CPC_OVF_NOTIFY_EMT ) ) { #ifdef DEBUG SUBDBG( " -> %s: Need to activate SIGEMT on PIC %d\n", __func__, EventIndex ); #endif /* Enable overflow handling */ if ( __cpc_enable_sigemt( ctrl, EventIndex ) != PAPI_OK ) { #ifdef DEBUG SUBDBG( " -> %s: Activating SIGEMT failed for PIC %d\n", __func__, EventIndex ); #endif return PAPI_ESYS; } } #ifdef DEBUG SUBDBG( " -> %s: SIGEMT activated, will install signal handler\n", __func__ ); #endif // FIXME: Not really sure that this construct is working return _papi_hwi_start_signal( SIGEMT, 1, 0 ); } else { #ifdef DEBUG SUBDBG( " -> %s: Disabling overflow handling\n", __func__ ); #endif /* Resetting values which were used for overflow handling */ ctrl->preset[EventIndex] = DEFAULT_CNTR_PRESET; ctrl->flags[EventIndex] &= ~( CPC_OVF_NOTIFY_EMT ); ctrl->threshold[EventIndex] = 0; ctrl->hangover[EventIndex] = 0; #ifdef DEBUG SUBDBG( " -> %s: ctrl->preset[%d]=%d, ctrl->flags[%d]=%#x\n", __func__, EventIndex, ctrl->preset[EventIndex], EventIndex, ctrl->flags[EventIndex] ); #endif /* Recreate the underlying set and disable the signal handler */ __CHECK_ERR_PAPI( __cpc_recreate_set( ctrl ) ); __CHECK_ERR_PAPI( _papi_hwi_stop_signal( SIGEMT ) ); } #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } int _niagara2_shutdown( hwd_context_t * ctx ) { #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif cpc_buf_destroy( cpc, ctx->counter_buffer ); cpc_set_destroy( cpc, ctx->set ); #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } int _niagara2_shutdown_global( void ) { #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif /* Free allocated memory */ // papi_calloc in __cpc_build_ntv_table papi_free( __t2_store.pic_ntv_count ); // papi_calloc in __cpc_build_ntv_table papi_free( __t2_ntv_events
); // papi_calloc in __cpc_build_pst_table papi_free( preset_table ); /* Shutdown libcpc */ // cpc_open in _papi_init_component cpc_close( cpc ); #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } int _niagara2_start( hwd_context_t * ctx, hwd_control_state_t * ctrl ) { int retval; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); SUBDBG( " -> %s: Starting EventSet %p\n", __func__, ctrl ); #endif #ifdef SYNTHETIC_EVENTS_SUPPORTED { #ifdef DEBUG SUBDBG( " -> %s: Event count: ctrl->count=%d, ctrl->syn_count=%d\n", __func__, ctrl->count, ctrl->syn_count ); #endif if ( ctrl->count > 0 && ctrl->count == ctrl->syn_count ) { ctrl->idx[0] = cpc_set_add_request( cpc, ctrl->set, "Instr_cnt", ctrl->preset[0], ctrl->flags[0], 0, NULL ); ctrl->counter_buffer = cpc_buf_create( cpc, ctrl->set ); } } #endif #ifdef DEBUG { int i; for ( i = 0; i < MAX_COUNTERS; i++ ) { SUBDBG( " -> %s: Flags for PIC#%d: ctrl->flags[%d]=%d\n", __func__, i, i, ctrl->flags[i] ); } } #endif __CHECK_ERR_DFLT( cpc_bind_curlwp( cpc, ctrl->set, CPC_BIND_LWP_INHERIT ) ); /* Ensure the set is working properly */ retval = cpc_set_sample( cpc, ctrl->set, ctrl->counter_buffer ); if ( retval != 0 ) { printf( "%s: cpc_set_sample failed, return=%d, errno=%d\n", __func__, retval, errno ); return PAPI_ECMP; } #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } int _niagara2_stop( hwd_context_t * ctx, hwd_control_state_t * ctrl ) { #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif __CHECK_ERR_DFLT( cpc_unbind( cpc, ctrl->set ) ); #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } int _niagara2_update_control_state( hwd_control_state_t * ctrl, NativeInfo_t * native, int count, hwd_context_t * ctx ) { int i; #ifdef DEBUG SUBDBG( "ENTERING 
FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif /* Delete everything as we can't change an existing set */ if ( ctrl->counter_buffer != NULL ) { __CHECK_ERR_DFLT( cpc_buf_destroy( cpc, ctrl->counter_buffer ) ); } if ( ctrl->set != NULL ) { __CHECK_ERR_DFLT( cpc_set_destroy( cpc, ctrl->set ) ); } for ( i = 0; i < MAX_COUNTERS; i++ ) { ctrl->idx[i] = EVENT_NOT_SET; } /* New setup */ ctrl->set = cpc_set_create( cpc ); __CHECK_ERR_NULL( ctrl->set ); ctrl->count = count; #ifdef SYNTHETIC_EVENTS_SUPPORTED ctrl->syn_count = 0; #endif for ( i = 0; i < count; i++ ) { /* Store the active event */ ctrl->code[i].event_code = native[i].ni_event & PAPI_NATIVE_AND_MASK; ctrl->flags[i] = __cpc_domain_translator( PAPI_DOM_USER ); ctrl->preset[i] = DEFAULT_CNTR_PRESET; #ifdef DEBUG SUBDBG ( " -> %s: EventSet@%p/PIC#%d - ntv request >>%s<< (%d), flags=%#x\n", __func__, ctrl, i, __t2_ntv_events[ctrl->code[i].event_code], ctrl->code[i].event_code, ctrl->flags[i] ); #endif /* Store the counter position (???) */ native[i].ni_position = i; #ifdef SYNTHETIC_EVENTS_SUPPORTED { int syn_code = ctrl->code[i].event_code - ( _niagara2_vector.cmp_info.num_native_events - __t2_store.syn_evt_count ) - 1; /* Check if the event code is bigger than the CPC-provided events.
*/ if ( syn_code >= 0 ) { #ifdef DEBUG SUBDBG ( " -> %s: Adding synthetic event %#x (%s) on position %d\n", __func__, native[i].ni_event, __t2_ntv_events[ctrl->code[i].event_code], i ); #endif /* Call the setup routine */ __int_setup_synthetic_event( syn_code, ctrl, NULL ); /* Clean the hangover count as this event is new */ ctrl->syn_hangover[i] = 0; /* Register this event as being synthetic, as an event set based only on synthetic events cannot be activated through libcpc */ ctrl->syn_count++; /* Jump to next iteration */ continue; } } #endif #ifdef DEBUG SUBDBG( " -> %s: Adding native event %#x (%s) on position %d\n", __func__, native[i].ni_event, __t2_ntv_events[ctrl->code[i].event_code], i ); #endif /* Pass the event as request to libcpc */ ctrl->idx[i] = cpc_set_add_request( cpc, ctrl->set, __t2_ntv_events[ctrl->code[i].event_code], ctrl->preset[i], ctrl->flags[i], 0, NULL ); __CHECK_ERR_NEGV( ctrl->idx[i] ); } #ifdef DEBUG if ( i == 0 ) { SUBDBG( " -> %s: nothing added\n", __func__ ); } #endif ctrl->counter_buffer = cpc_buf_create( cpc, ctrl->set ); __CHECK_ERR_NULL( ctrl->counter_buffer ); /* Finished the new setup */ /* Linking to context (same data type by typedef!)
*/ ctx = ctrl; #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } int _niagara2_update_shlib_info( papi_mdi_t *mdi ) { char *file = "/proc/self/map"; char *resolve_pattern = "/proc/self/path/%s"; char lastobject[PRMAPSZ]; char link[PAPI_HUGE_STR_LEN]; char path[PAPI_HUGE_STR_LEN]; prmap_t mapping; int fd, count = 0, total = 0, position = -1, first = 1; vptr_t t_min, t_max, d_min, d_max; PAPI_address_map_t *pam, *cur; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif fd = open( file, O_RDONLY ); if ( fd == -1 ) { return PAPI_ESYS; } memset( lastobject, 0, PRMAPSZ ); #ifdef DEBUG SUBDBG( " -> %s: Preprocessing memory maps from procfs\n", __func__ ); #endif /* Search through the list of mappings in order to identify a) how many mappings are available and b) how many unique mappings are available. */ while ( read( fd, &mapping, sizeof ( prmap_t ) ) > 0 ) { #ifdef DEBUG SUBDBG( " -> %s: Found a new memory map entry\n", __func__ ); #endif /* Another entry found, just the total count of entries. */ total++; /* Is the mapping accessible and not anonymous? */ if ( mapping.pr_mflags & ( MA_READ | MA_WRITE | MA_EXEC ) && !( mapping.pr_mflags & MA_ANON ) ) { /* Test if a new library has been found. If a new library has been found a new entry needs to be counted. 
*/ if ( strcmp( lastobject, mapping.pr_mapname ) != 0 ) { strncpy( lastobject, mapping.pr_mapname, PRMAPSZ ); count++; #ifdef DEBUG SUBDBG( " -> %s: Memory mapping entry valid for %s\n", __func__, mapping.pr_mapname ); #endif } } } #ifdef DEBUG SUBDBG( " -> %s: Preprocessing done, starting to analyze\n", __func__ ); #endif /* Start from the beginning, now fill in the found mappings */ if ( lseek( fd, 0, SEEK_SET ) == -1 ) { return PAPI_ESYS; } memset( lastobject, 0, PRMAPSZ ); /* Allocate memory */ pam = ( PAPI_address_map_t * ) papi_calloc( count, sizeof ( PAPI_address_map_t ) ); while ( read( fd, &mapping, sizeof ( prmap_t ) ) > 0 ) { if ( mapping.pr_mflags & MA_ANON ) { #ifdef DEBUG SUBDBG ( " -> %s: Anonymous mapping (MA_ANON) found for %s, skipping\n", __func__, mapping.pr_mapname ); #endif continue; } /* Check for a new entry */ if ( strcmp( mapping.pr_mapname, lastobject ) != 0 ) { #ifdef DEBUG SUBDBG( " -> %s: Analyzing mapping for %s\n", __func__, mapping.pr_mapname ); #endif cur = &( pam[++position] ); strncpy( lastobject, mapping.pr_mapname, PRMAPSZ ); snprintf( link, PAPI_HUGE_STR_LEN, resolve_pattern, lastobject ); memset( path, 0, PAPI_HUGE_STR_LEN ); readlink( link, path, PAPI_HUGE_STR_LEN - 1 ); strncpy( cur->name, path, PAPI_HUGE_STR_LEN ); #ifdef DEBUG SUBDBG( " -> %s: Resolved name for %s: %s\n", __func__, mapping.pr_mapname, cur->name ); #endif } if ( mapping.pr_mflags & MA_READ ) { /* Data (MA_WRITE) or text (MA_EXEC) segment?
*/ if ( mapping.pr_mflags & MA_WRITE ) { cur->data_start = ( vptr_t ) mapping.pr_vaddr; cur->data_end = ( vptr_t ) ( mapping.pr_vaddr + mapping.pr_size ); if ( strcmp ( cur->name, _papi_hwi_system_info.exe_info.fullname ) == 0 ) { _papi_hwi_system_info.exe_info.address_info.data_start = cur->data_start; _papi_hwi_system_info.exe_info.address_info.data_end = cur->data_end; } if ( first ) d_min = cur->data_start; if ( first ) d_max = cur->data_end; if ( cur->data_start < d_min ) { d_min = cur->data_start; } if ( cur->data_end > d_max ) { d_max = cur->data_end; } } else if ( mapping.pr_mflags & MA_EXEC ) { cur->text_start = ( vptr_t ) mapping.pr_vaddr; cur->text_end = ( vptr_t ) ( mapping.pr_vaddr + mapping.pr_size ); if ( strcmp ( cur->name, _papi_hwi_system_info.exe_info.fullname ) == 0 ) { _papi_hwi_system_info.exe_info.address_info.text_start = cur->text_start; _papi_hwi_system_info.exe_info.address_info.text_end = cur->text_end; } if ( first ) t_min = cur->text_start; if ( first ) t_max = cur->text_end; if ( cur->text_start < t_min ) { t_min = cur->text_start; } if ( cur->text_end > t_max ) { t_max = cur->text_end; } } } first = 0; } close( fd ); /* During the walk of shared objects the upper and lower bound of the segments could be discovered. The bounds are stored in the PAPI info structure. The information is important for the profiling functions of PAPI. 
*/ /* This variant would pass the addresses of all text and data segments _papi_hwi_system_info.exe_info.address_info.text_start = t_min; _papi_hwi_system_info.exe_info.address_info.text_end = t_max; _papi_hwi_system_info.exe_info.address_info.data_start = d_min; _papi_hwi_system_info.exe_info.address_info.data_end = d_max; */ #ifdef DEBUG SUBDBG( " -> %s: Analysis of memory maps done, results:\n", __func__ ); SUBDBG( " -> %s: text_start=%#x, text_end=%#x, text_size=%lld\n", __func__, _papi_hwi_system_info.exe_info.address_info.text_start, _papi_hwi_system_info.exe_info.address_info.text_end, _papi_hwi_system_info.exe_info.address_info.text_end - _papi_hwi_system_info.exe_info.address_info.text_start ); SUBDBG( " -> %s: data_start=%#x, data_end=%#x, data_size=%lld\n", __func__, _papi_hwi_system_info.exe_info.address_info.data_start, _papi_hwi_system_info.exe_info.address_info.data_end, _papi_hwi_system_info.exe_info.address_info.data_end - _papi_hwi_system_info.exe_info.address_info.data_start ); #endif /* Store the map read and the total count of shlibs found */ _papi_hwi_system_info.shlib_info.map = pam; _papi_hwi_system_info.shlib_info.count = count; #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } ////////////////////////////////////////////////////////////////////////////////// /// UTILITY FUNCTIONS FOR ACCESS TO LIBCPC AND SOLARIS ///////////////////////// //////////////////////////////////////////////////////////////////////////////// /* DESCRIPTION: * ----------------------------------------------------------------------------- * The following functions are for accessing libcpc 2 and Solaris related stuff * needed for PAPI. 
******************************************************************************/ static inline int __cpc_build_ntv_table( void ) { int i, tmp; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif __t2_store.pic_ntv_count = papi_calloc( __t2_store.npic, sizeof ( int ) ); __CHECK_ERR_NULL( __t2_store.pic_ntv_count ); #ifdef DEBUG SUBDBG( " -> %s: Checking PICs for functionality\n", __func__ ); #endif for ( i = 0; i < __t2_store.npic; i++ ) { cpc_walk_events_pic( cpc, i, NULL, __cpc_walk_events_pic_action_count ); #ifdef DEBUG SUBDBG( " -> %s: Found %d events on PIC#%d\n", __func__, __t2_store.pic_ntv_count[i], i ); #endif } tmp = __t2_store.pic_ntv_count[0]; /* There should be at least one counter... */ if ( tmp == 0 ) { #ifdef DEBUG SUBDBG( " -> %s: PIC#0 has 0 events\n", __func__ ); #endif return PAPI_ECMP; } /* Check if all PICs have the same number of counters */ for ( i = 0; i < __t2_store.npic; i++ ) { if ( __t2_store.pic_ntv_count[i] != tmp ) { #ifdef DEBUG SUBDBG( " -> %s: PIC#%d has %d events, should have %d\n", __func__, i, __t2_store.pic_ntv_count[i], tmp ); #endif return PAPI_ECMP; } } /* Count synthetic events which add functionality to libcpc */ #ifdef SYNTHETIC_EVENTS_SUPPORTED __t2_store.syn_evt_count = 0; __int_walk_synthetic_events_action_count( ); #endif /* Store the count of events available in central data structure */ #ifndef SYNTHETIC_EVENTS_SUPPORTED _niagara2_vector.cmp_info.num_native_events = __t2_store.pic_ntv_count[0]; #else _niagara2_vector.cmp_info.num_native_events = __t2_store.pic_ntv_count[0] + __t2_store.syn_evt_count; #endif /* Allocate memory for storing all events found, including the first empty slot */ __t2_ntv_events = papi_calloc( _niagara2_vector.cmp_info.num_native_events + 1, sizeof ( char * ) ); __t2_ntv_events[0] = "THIS IS A BUG!"; tmp = 1; cpc_walk_events_pic( cpc, 0, ( void * ) &tmp, __cpc_walk_events_pic_action_store ); #ifdef SYNTHETIC_EVENTS_SUPPORTED 
__int_walk_synthetic_events_action_store( ); #endif #ifdef DEBUG for ( i = 1; i < __t2_store.pic_ntv_count[0]; i++ ) { SUBDBG( " -> %s: Event #%d: %s\n", __func__, i, __t2_ntv_events[i] ); } #endif #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } /* Return event code for event_name */ static inline int __cpc_search_ntv_event( char *event_name, int *event_code ) { int i; for ( i = 0; i < _niagara2_vector.cmp_info.num_native_events; i++ ) { if ( strcmp( event_name, __t2_ntv_events[i] ) == 0 ) { *event_code = i; return PAPI_OK; } } return PAPI_ENOEVNT; } static inline int __cpc_build_pst_table( void ) { int num_psts, i, j, event_code, pst_events; hwi_search_t tmp; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif num_psts = 0; while ( __t2_table[num_psts].papi_pst != 0 ) { num_psts++; } #ifdef DEBUG SUBDBG( " -> %s: Found %d presets\n", __func__, num_psts ); #endif preset_table = papi_calloc( num_psts + 1, sizeof ( hwi_search_t ) ); __CHECK_ERR_NULL( preset_table ); pst_events = 0; for ( i = 0; i < num_psts; i++ ) { memset( &tmp, PAPI_NULL, sizeof ( tmp ) ); /* Mark counters as unused. If they are needed, they will be overwritten later. See papi_preset.c:51 for more details. 
*/ for ( j = 0; j < PAPI_EVENTS_IN_DERIVED_EVENT; j++ ) { tmp.data.native[j] = PAPI_NULL; } tmp.event_code = __t2_table[i].papi_pst; tmp.data.derived = __t2_table[i].ntv_opcode; tmp.data.operation[0] = '\0'; switch ( __t2_table[i].ntv_opcode ) { case DERIVED_ADD: tmp.data.operation[0] = '+'; break; case DERIVED_SUB: tmp.data.operation[0] = '-'; break; } for ( j = 0; j < __t2_table[i].ntv_ctrs; j++ ) { if ( __cpc_search_ntv_event ( __t2_table[i].ntv_event[j], &event_code ) >= PAPI_OK ) { tmp.data.native[j] = event_code; } else { continue; } } #ifdef DEBUG SUBDBG( " -> %s: pst row %d - event_code=%d\n", __func__, i, tmp.event_code ); SUBDBG( " -> %s: pst row %d - data.derived=%d, data.operation=%c\n", __func__, i, tmp.data.derived, tmp.data.operation[0] ); SUBDBG( " -> %s: pst row %d - native event codes:\n", __func__, i ); { int d_i; for ( d_i = 0; d_i < PAPI_EVENTS_IN_DERIVED_EVENT; d_i++ ) { SUBDBG( " -> %s: pst row %d - +++ data.native[%d]=%d\n", __func__, i, d_i, tmp.data.native[d_i] ); } } #endif memcpy( &preset_table[i], &tmp, sizeof ( tmp ) ); pst_events++; } // Check! 
memset( &preset_table[num_psts], 0, sizeof ( hwi_search_t ) ); _niagara2_vector.cmp_info.num_preset_events = pst_events; #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } static inline int __cpc_recreate_set( hwd_control_state_t * ctrl ) { #ifdef SYNTHETIC_EVENTS_SUPPORTED const int syn_barrier = _niagara2_vector.cmp_info.num_native_events - __t2_store.syn_evt_count; #endif int i; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif /* Destroy the old buffer and the old set if they exist, we need to do a full recreate as changing flags or events through libcpc is not possible */ if ( ctrl->counter_buffer != NULL ) { __CHECK_ERR_DFLT( cpc_buf_destroy( cpc, ctrl->counter_buffer ) ); } if ( ctrl->set != NULL ) { __CHECK_ERR_DFLT( cpc_set_destroy( cpc, ctrl->set ) ); } /* Create a new set */ ctrl->set = cpc_set_create( cpc ); __CHECK_ERR_NULL( ctrl->set ); for ( i = 0; i < ctrl->count; i++ ) { #ifdef DEBUG SUBDBG( " -> %s: Adding native event %#x (%s) on position %d\n", __func__, ctrl->code[i].event_code, __t2_ntv_events[ctrl->code[i].event_code], i ); SUBDBG( " -> %s: Event setup: ctrl->code[%d].event_code=%#x\n", __func__, i, ctrl->code[i].event_code ); SUBDBG( " -> %s: Event setup: ctrl->preset[%d]=%d\n", __func__, i, ctrl->preset[i] ); SUBDBG( " -> %s: Event setup: ctrl->flags[%d]=%#x\n", __func__, i, ctrl->flags[i] ); #endif #ifdef SYNTHETIC_EVENTS_SUPPORTED /* Ensure that synthetic events are skipped */ if ( ctrl->code[i].event_code >= syn_barrier ) { #ifdef DEBUG SUBDBG( " -> %s: Skipping counter %d, synthetic event found\n", __func__, i ); #endif /* Next iteration */ continue; } #endif ctrl->idx[i] = cpc_set_add_request( cpc, ctrl->set, __t2_ntv_events[ctrl->code[i]. 
event_code], ctrl->preset[i], ctrl->flags[i], 0, NULL ); __CHECK_ERR_NEGV( ctrl->idx[i] ); } ctrl->counter_buffer = cpc_buf_create( cpc, ctrl->set ); __CHECK_ERR_NULL( ctrl->counter_buffer ); #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; } static inline int __cpc_domain_translator( const int papi_domain ) { int domain = 0; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); SUBDBG( " -> %s: papi_domain=%d requested\n", __func__, papi_domain ); #endif if ( papi_domain & PAPI_DOM_USER ) { #ifdef DEBUG SUBDBG( " -> %s: Domain PAPI_DOM_USER/CPC_COUNT_USER selected\n", __func__ ); #endif domain |= CPC_COUNT_USER; } if ( papi_domain & PAPI_DOM_KERNEL ) { #ifdef DEBUG SUBDBG( " -> %s: Domain PAPI_DOM_KERNEL/CPC_COUNT_SYSTEM selected\n", __func__ ); #endif domain |= CPC_COUNT_SYSTEM; } if ( papi_domain & PAPI_DOM_SUPERVISOR ) { #ifdef DEBUG SUBDBG( " -> %s: Domain PAPI_DOM_SUPERVISOR/CPC_COUNT_HV selected\n", __func__ ); #endif domain |= CPC_COUNT_HV; } #ifdef DEBUG SUBDBG( " -> %s: domain=%d\n", __func__, domain ); #endif return domain; } void __cpc_error_handler( const char *fn, int subcode, const char *fmt, va_list ap ) { #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif /* From the libcpc manpages */ fprintf( stderr, "ERROR - libcpc error handler in %s() called!\n", fn ); vfprintf( stderr, fmt, ap ); #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif } static inline int __cpc_enable_sigemt( hwd_control_state_t * ctrl, int position ) { #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif if ( position >= MAX_COUNTERS ) { #ifdef DEBUG SUBDBG( " -> %s: Position of the counter does not exist\n", __func__ ); #endif return PAPI_EINVAL; } ctrl->flags[position] = ctrl->flags[position] | CPC_OVF_NOTIFY_EMT; #ifdef DEBUG SUBDBG( 
"LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return __cpc_recreate_set( ctrl ); } void __cpc_walk_events_pic_action_count( void *arg, uint_t picno, const char *event ) { #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif __t2_store.pic_ntv_count[picno]++; #ifdef DEBUG SUBDBG ( " -> %s: Found one native event on PIC#%d (now %d events in total)\n", __func__, picno, __t2_store.pic_ntv_count[picno] ); #endif #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif } void __cpc_walk_events_pic_action_store( void *arg, uint_t picno, const char *event ) { int *tmp = ( int * ) arg; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif __t2_ntv_events[*tmp] = papi_strdup( event ); #ifdef DEBUG SUBDBG( " -> %s: Native event >>%s<< registered\n", __func__, __t2_ntv_events[*tmp] ); #endif *tmp = *tmp + 1; #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif } static inline int __sol_get_processor_clock( void ) { processor_info_t pinfo; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif // Fetch information from the processor we are currently running on if ( processor_info( getcpuid( ), &pinfo ) == 0 ) { #ifdef DEBUG SUBDBG( " -> %s: Clock at %d MHz\n", __func__, pinfo.pi_clock ); #endif return pinfo.pi_clock; } #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_ESYS; } /* This function either raises the supplied ns to itimer_res_ns, or pads it up * to a multiple of itimer_res_ns if the value is bigger than itimer_res_ns. * * The source is taken from the old component.
*/ static inline int __sol_get_itimer_ns( int ns ) { if ( ns < _papi_os_info.itimer_res_ns ) { return _papi_os_info.itimer_res_ns; } else { int leftover_ns = ns % _papi_os_info.itimer_res_ns; if ( leftover_ns == 0 ) { return ns; } return ns + ( _papi_os_info.itimer_res_ns - leftover_ns ); } } static inline lwpstatus_t * __sol_get_lwp_status( const pid_t pid, const lwpid_t lwpid ) { char *pattern = "/proc/%d/lwp/%d/lwpstatus"; char filename[PAPI_MIN_STR_LEN]; int fd; static lwpstatus_t lwp; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif memset( &lwp, 0, sizeof ( lwp ) ); snprintf( filename, PAPI_MIN_STR_LEN, pattern, pid, lwpid ); fd = open( filename, O_RDONLY ); if ( fd == -1 ) return NULL; if ( read( fd, ( void * ) &lwp, sizeof ( lwp ) ) != sizeof ( lwp ) ) { close( fd ); return NULL; } close( fd ); #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return &lwp; } static inline psinfo_t * __sol_get_proc_info( const pid_t pid ) { char *pattern = "/proc/%d/psinfo"; char filename[PAPI_MIN_STR_LEN]; int fd; static psinfo_t proc; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif memset( &proc, 0, sizeof ( proc ) ); snprintf( filename, PAPI_MIN_STR_LEN, pattern, pid ); fd = open( filename, O_RDONLY ); if ( fd == -1 ) return NULL; if ( read( fd, ( void * ) &proc, sizeof ( proc ) ) != sizeof ( proc ) ) { close( fd ); return NULL; } close( fd ); #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return &proc; } static inline pstatus_t * __sol_get_proc_status( const pid_t pid ) { char *pattern = "/proc/%d/status"; char filename[PAPI_MIN_STR_LEN]; int fd; static pstatus_t proc; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif memset( &proc, 0, sizeof ( proc ) ); snprintf( filename, PAPI_MIN_STR_LEN, pattern, pid ); fd = open( filename, O_RDONLY ); if ( fd == -1 ) return NULL; if ( read( fd, ( void * ) &proc, sizeof ( proc ) ) != sizeof ( proc ) ) { close( fd ); return NULL; } close( fd ); #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__,
__LINE__ ); #endif return &proc; } /* This function handles synthetic events and returns their result. Synthetic * events are events retrieved from outside of libcpc, e.g. all events which * can not be retrieved using cpc_set_add_request/cpc_buf_get. */ #ifdef SYNTHETIC_EVENTS_SUPPORTED uint64_t __int_get_synthetic_event( int code, hwd_control_state_t * ctrl, void *arg ) { #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif switch ( code ) { case SYNTHETIC_CYCLES_ELAPSED: /* Return the count of ticks this set was bound. If a reset of the set has been executed the last count will be subtracted. */ { int *i = ( int * ) arg; return cpc_buf_tick( cpc, ctrl->counter_buffer ) - ctrl->syn_hangover[*i]; } case SYNTHETIC_RETURN_ONE: // The name says it - only for testing purposes. #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return 1; case SYNTHETIC_RETURN_TWO: // The name says it - only for testing purposes. 
#ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return 2; default: #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_EINVAL; } } #endif #ifdef SYNTHETIC_EVENTS_SUPPORTED int __int_setup_synthetic_event( int code, hwd_control_state_t * ctrl, void *arg ) { #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif switch ( code ) { case SYNTHETIC_CYCLES_ELAPSED: #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_OK; default: #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif return PAPI_EINVAL; } #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif } #endif #ifdef SYNTHETIC_EVENTS_SUPPORTED void __int_walk_synthetic_events_action_count( void ) { int i = 0; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif /* Count all synthetic events in __int_syn_table, the last event is marked with an event code of -1. 
*/ while ( __int_syn_table[i].code != -1 ) { __t2_store.syn_evt_count++; i++; } #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif } #endif #ifdef SYNTHETIC_EVENTS_SUPPORTED void __int_walk_synthetic_events_action_store( void ) { /* The first index of a synthetic event starts after last native event */ int i = 0; int offset = _niagara2_vector.cmp_info.num_native_events + 1 - __t2_store.syn_evt_count; #ifdef DEBUG SUBDBG( "ENTERING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif while ( i < __t2_store.syn_evt_count ) { __t2_ntv_events[i + offset] = papi_strdup( __int_syn_table[i].name ); i++; } #ifdef DEBUG SUBDBG( "LEAVING FUNCTION >>%s<< at %s:%d\n", __func__, __FILE__, __LINE__ ); #endif } #endif papi_vector_t _niagara2_vector = { /************* COMPONENT CAPABILITIES/INFORMATION/ETC ************************/ .cmp_info = { .name = "solaris-niagara2", .description = "Solaris Counters", .num_cntrs = MAX_COUNTERS, .num_mpx_cntrs = MAX_COUNTERS, .default_domain = PAPI_DOM_USER, .available_domains = ( PAPI_DOM_USER | PAPI_DOM_KERNEL | PAPI_DOM_SUPERVISOR ), .default_granularity = PAPI_GRN_THR, .available_granularities = PAPI_GRN_THR, .fast_real_timer = 1, .fast_virtual_timer = 1, .attach = 1, .attach_must_ptrace = 1, .hardware_intr = 1, .hardware_intr_sig = SIGEMT, .precise_intr = 1, } , /************* COMPONENT DATA STRUCTURE SIZES ********************************/ .size = { .context = sizeof ( hwd_context_t ), .control_state = sizeof ( hwd_control_state_t ), .reg_value = sizeof ( hwd_register_t ), .reg_alloc = sizeof ( niagara2_reg_alloc_t ), } , /************* COMPONENT INTERFACE FUNCTIONS *********************************/ .init_control_state = _niagara2_init_control_state, .start = _niagara2_start, .stop = _niagara2_stop, .read = _niagara2_read, .write = NULL, /* NOT IMPLEMENTED */ .shutdown_thread = _niagara2_shutdown, .shutdown_component = _niagara2_shutdown_global, .ctl = _niagara2_ctl, 
.update_control_state = _niagara2_update_control_state, .set_domain = _niagara2_set_domain, .reset = _niagara2_reset, .set_overflow = _niagara2_set_overflow, .set_profile = _niagara2_set_profile, .stop_profiling = NULL, /* NOT IMPLEMENTED */ .ntv_enum_events = _niagara2_ntv_enum_events, .ntv_name_to_code = NULL, /* NOT IMPLEMENTED */ .ntv_code_to_name = _niagara2_ntv_code_to_name, .ntv_code_to_descr = _niagara2_ntv_code_to_descr, .ntv_code_to_bits = _niagara2_ntv_code_to_bits, .init_component = _niagara2_init_component, .dispatch_timer = _niagara2_dispatch_timer, }; papi_os_vector_t _papi_os_vector = { .get_memory_info = _niagara2_get_memory_info, .get_dmem_info = _solaris_get_dmem_info, .get_real_usec = _solaris_get_real_usec, .get_real_cycles = _solaris_get_real_cycles, .get_virt_usec = _solaris_get_virt_usec, .update_shlib_info = _solaris_update_shlib_info, .get_system_info = _solaris_get_system_info, }; papi-papi-7-2-0-t/src/solaris-niagara2.h000066400000000000000000000113261502707512200177410ustar00rootroot00000000000000/******************************************************************************* * >>>>>> "Development of a PAPI Backend for the Sun Niagara 2 Processor" <<<<<< * ----------------------------------------------------------------------------- * * Fabian Gorsler * * Hochschule Bonn-Rhein-Sieg, Sankt Augustin, Germany * University of Applied Sciences * * ----------------------------------------------------------------------------- * * File: solaris-niagara2.c * Author: fg215045 * * Description: Data structures used for the communication between PAPI and the * component. Additionally some macros are defined here. See solaris-niagara2.c. 
* * ***** Feel free to convert this header to the PAPI default ***** * * ----------------------------------------------------------------------------- * Created on April 23, 2009, 7:31 PM ******************************************************************************/ #ifndef _SOLARIS_NIAGARA2_H #define _SOLARIS_NIAGARA2_H #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "papi_defines.h" //////////////////////////////////////////////////////////////////////////////// /// COPIED ITEMS FROM THE OLD PORT TO SOLARIS ////////////////////////////////// //////////////////////////////////////////////////////////////////////////////// /* DESCRIPTION: * ----------------------------------------------------------------------------- * The following lines are taken from the old Solaris port of PAPI. If changes * have been made there are (additional) comments. * ******************************************************************************/ #define MAX_COUNTERS 2 #define MAX_COUNTER_TERMS MAX_COUNTERS #define PAPI_MAX_NATIVE_EVENTS 71 #define MAX_NATIVE_EVENT PAPI_MAX_NATIVE_EVENTS typedef int niagara2_reg_alloc_t; /* libcpc 2 does not need any bit masks */ typedef struct _niagara2_register { int event_code; } _niagara2_register_t; #define BUF_T0 0 #define BUF_T1 1 #define EVENT_NOT_SET -1; #define SYNTHETIC_EVENTS_SUPPORTED 1 /* This structured bundles everything needed for sampling up to MAX_COUNTERS */ typedef struct _niagara2_control_state { /* A set instruments the hardware counters */ cpc_set_t *set; /* A buffer stores the events counted. For measuring a start of measurment and an end is needed as measurement does not always start from 0. This is done by using an array of bufs, accessed by the indexes BUF_T0 as start and BUF_T1 as end. 
*/ cpc_buf_t *counter_buffer; /* The indexes are needed for accessing the single counter events, if the value of these indexes is equal to EVENT_NOT_SET this means it is unused */ int idx[MAX_COUNTERS]; /* The event codes applied to this set */ _niagara2_register_t code[MAX_COUNTERS]; /* The total number of events being counted */ int count; /* The values retrieved from the counter */ uint64_t result[MAX_COUNTERS]; /* Flags for controlling overflow handling and binding, see cpc_set_create(3CPC) for more details on this topic. */ uint_t flags[MAX_COUNTERS]; /* Preset values for the counters */ uint64_t preset[MAX_COUNTERS]; /* Memory to store values when an overflow occours */ long_long threshold[MAX_COUNTERS]; long_long hangover[MAX_COUNTERS]; #ifdef SYNTHETIC_EVENTS_SUPPORTED int syn_count; uint64_t syn_hangover[MAX_COUNTERS]; #endif } _niagara2_control_state_t; #define GET_OVERFLOW_ADDRESS(ctx) (void*)(ctx->ucontext->uc_mcontext.gregs[REG_PC]) typedef int hwd_register_map_t; #include "solaris-context.h" typedef _niagara2_control_state_t _niagara2_context_t; // Needs an explicit declaration, no longer externally found. rwlock_t lock[PAPI_MAX_LOCK]; // For setting and releasing locks. 
#define _papi_hwd_lock(lck) rw_wrlock(&lock[lck]); #define _papi_hwd_unlock(lck) rw_unlock(&lock[lck]); #define DEFAULT_CNTR_PRESET (0) #define NOT_A_PAPI_HWD_READ -666 #define CPC_COUNTING_DOMAINS (CPC_COUNT_USER|CPC_COUNT_SYSTEM|CPC_COUNT_HV) #define EVENT_NOT_SET -1; /* Clean the stubbed data structures from framework initialization */ #undef hwd_context_t #define hwd_context_t _niagara2_context_t #undef hwd_control_state_t #define hwd_control_state_t _niagara2_control_state_t #undef hwd_register_t #define hwd_register_t _niagara2_register_t #endif papi-papi-7-2-0-t/src/solaris-ultra.c000066400000000000000000000717361502707512200174120ustar00rootroot00000000000000/* * File: solaris-ultra.c * Author: Philip Mucci * mucci@cs.utk.edu * Mods: Kevin London * london@cs.utk.edu * Mods: Min Zhou * min@cs.utk.edu * Mods: Larry Meadows(helped us to build the native table dynamically) * Mods: Brian Sheely * bsheely@eecs.utk.edu * Mods: Vince Weaver * vweaver1@eecs.utk.edu */ /* to understand this program, first you should read the user's manual about UltraSparc II and UltraSparc III, then the man pages about cpc_take_sample(cpc_event_t *event) */ #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" #include #include "solaris-common.h" #include "solaris-memory.h" #ifdef CPC_ULTRA3_I #define LASTULTRA3 CPC_ULTRA3_I #else #define LASTULTRA3 CPC_ULTRA3_PLUS #endif #define MAX_ENAME 40 static void action( void *arg, int regno, const char *name, uint8_t bits ); /* Probably could dispense with this and just use native_table */ typedef struct ctr_info { char *name; /* Counter name */ int bits[2]; /* bits for register */ int bitmask; /* 1 = pic0; 2 = pic1; 3 = both */ } ctr_info_t; typedef struct einfo { unsigned int papi_event; char *event_str; } einfo_t; static einfo_t us3info[] = { {PAPI_FP_INS, "FA_pipe_completion+FM_pipe_completion"}, {PAPI_FAD_INS, "FA_pipe_completion"}, {PAPI_FML_INS, "FM_pipe_completion"}, {PAPI_TLB_IM, 
"ITLB_miss"}, {PAPI_TLB_DM, "DTLB_miss"}, {PAPI_TOT_CYC, "Cycle_cnt"}, {PAPI_TOT_IIS, "Instr_cnt"}, {PAPI_TOT_INS, "Instr_cnt"}, {PAPI_L2_TCM, "EC_misses"}, {PAPI_L2_ICM, "EC_ic_miss"}, {PAPI_L1_ICM, "IC_miss"}, {PAPI_L1_LDM, "DC_rd_miss"}, {PAPI_L1_STM, "DC_wr_miss"}, {PAPI_L2_LDM, "EC_rd_miss"}, {PAPI_BR_MSP, "IU_Stat_Br_miss_taken+IU_Stat_Br_miss_untaken"}, {PAPI_L1_DCR, "DC_rd"}, {PAPI_L1_DCW, "DC_wr"}, {PAPI_L1_ICH, "IC_ref-IC_miss"}, /* Is this really hits only? */ {PAPI_L1_ICA, "IC_ref"}, /* Ditto? */ {PAPI_L2_TCH, "EC_ref-EC_misses"}, {PAPI_L2_TCA, "EC_ref"}, }; static einfo_t us2info[] = { {PAPI_L1_ICM, "IC_ref-IC_hit"}, {PAPI_L2_TCM, "EC_ref-EC_hit"}, {PAPI_CA_SNP, "EC_snoop_cb"}, {PAPI_CA_INV, "EC_snoop_inv"}, {PAPI_L1_LDM, "DC_rd-DC_rd_hit"}, {PAPI_L1_STM, "DC_wr-DC_wr_hit"}, {PAPI_L2_LDM, "EC_rd_miss"}, {PAPI_BR_MSP, "Dispatch0_mispred"}, {PAPI_TOT_IIS, "Instr_cnt"}, {PAPI_TOT_INS, "Instr_cnt"}, {PAPI_LD_INS, "DC_rd"}, {PAPI_SR_INS, "DC_wr"}, {PAPI_TOT_CYC, "Cycle_cnt"}, {PAPI_L1_DCR, "DC_rd"}, {PAPI_L1_DCW, "DC_wr"}, {PAPI_L1_ICH, "IC_hit"}, {PAPI_L2_ICH, "EC_ic_hit"}, {PAPI_L1_ICA, "IC_ref"}, {PAPI_L2_TCH, "EC_hit"}, {PAPI_L2_TCA, "EC_ref"}, }; papi_vector_t _solaris_vector; static native_info_t *native_table; static hwi_search_t *preset_table; static struct ctr_info *ctrs; static int nctrs; static int build_tables( void ); static void add_preset( hwi_search_t * tab, int *np, einfo_t e ); /* Globals used to access the counter registers. 
*/ static int cpuver; static int pcr_shift[2]; hwi_search_t *preset_search_map; #ifdef DEBUG static void dump_cmd( papi_cpc_event_t * t ) { SUBDBG( "cpc_event_t.ce_cpuver %d\n", t->cmd.ce_cpuver ); SUBDBG( "ce_tick %llu\n", t->cmd.ce_tick ); SUBDBG( "ce_pic[0] %llu ce_pic[1] %llu\n", t->cmd.ce_pic[0], t->cmd.ce_pic[1] ); SUBDBG( "ce_pcr %#llx\n", t->cmd.ce_pcr ); SUBDBG( "flags %#x\n", t->flags ); } #endif static void dispatch_emt( int signal, siginfo_t * sip, void *arg ) { int event_counter; _papi_hwi_context_t ctx; vptr_t address; ctx.si = sip; ctx.ucontext = arg; SUBDBG( "%d, %p, %p\n", signal, sip, arg ); if ( sip->si_code == EMT_CPCOVF ) { papi_cpc_event_t *sample; EventSetInfo_t *ESI; ThreadInfo_t *thread = NULL; int t, overflow_vector, readvalue; thread = _papi_hwi_lookup_thread( 0 ); ESI = ( EventSetInfo_t * ) thread->running_eventset; int cidx = ESI->CmpIdx; if ( ( ESI == NULL ) || ( ( ESI->state & PAPI_OVERFLOWING ) == 0 ) ) { OVFDBG( "Either no eventset or eventset not set to overflow.\n" ); return; } if ( ESI->master != thread ) { PAPIERROR ( "eventset->thread %%lx vs. current thread %#lx mismatch", ESI->master, thread ); return; } event_counter = ESI->overflow.event_counter; sample = &( ESI->ctl_state->counter_cmd ); /* GROSS! This is a hack to 'push' the correct values back into the hardware, such that when PAPI handles the overflow and reads the values, it gets the correct ones. 
*/ /* Find which HW counter overflowed */ if ( ESI->EventInfoArray[ESI->overflow.EventIndex[0]].pos[0] == 0 ) t = 0; else t = 1; if ( cpc_take_sample( &sample->cmd ) == -1 ) return; if ( event_counter == 1 ) { /* only one event is set to be the overflow monitor */ /* generate the overflow vector */ overflow_vector = 1 << t; /* reset the threshold */ sample->cmd.ce_pic[t] = UINT64_MAX - ESI->overflow.threshold[0]; } else { /* two events are set to be the overflow monitors */ overflow_vector = 0; readvalue = sample->cmd.ce_pic[0]; if ( readvalue >= 0 ) { /* the first counter overflowed */ /* generate the overflow vector */ overflow_vector = 1; /* reset the threshold */ if ( t == 0 ) sample->cmd.ce_pic[0] = UINT64_MAX - ESI->overflow.threshold[0]; else sample->cmd.ce_pic[0] = UINT64_MAX - ESI->overflow.threshold[1]; } readvalue = sample->cmd.ce_pic[1]; if ( readvalue >= 0 ) { /* the second counter overflowed */ /* generate the overflow vector */ overflow_vector ^= 1 << 1; /* reset the threshold */ if ( t == 0 ) sample->cmd.ce_pic[1] = UINT64_MAX - ESI->overflow.threshold[1]; else sample->cmd.ce_pic[1] = UINT64_MAX - ESI->overflow.threshold[0]; } SUBDBG( "overflow_vector, = %d\n", overflow_vector ); /* something is wrong here */ if ( overflow_vector == 0 ) { PAPIERROR( "BUG! overflow_vector is 0, dropping interrupt" ); return; } } /* Call the regular overflow function in extras.c */ if ( thread->running_eventset[cidx]->overflow. flags & PAPI_OVERFLOW_FORCE_SW ) { address = GET_OVERFLOW_ADDRESS(ctx); _papi_hwi_dispatch_overflow_signal( ( void * ) &ctx, address, NULL, overflow_vector, 0, &thread, cidx ); } else { PAPIERROR( "Additional implementation needed in dispatch_emt!" 
); } #if DEBUG dump_cmd( sample ); #endif /* push back the correct values and start counting again */ if ( cpc_bind_event( &sample->cmd, sample->flags ) == -1 ) return; } else { SUBDBG( "dispatch_emt() dropped, si_code = %d\n", sip->si_code ); return; } } static int scan_prtconf( char *cpuname, int len_cpuname, int *hz, int *ver ) { /* This code courtesy of our friends in Germany. Thanks Rudolph Berrendorf! */ /* See the PCL home page for the German version of PAPI. */ /* Modified by Nils Smeds, all new bugs are my fault */ /* The routine now looks for the first "Node" with the following: */ /* "device_type" = 'cpu' */ /* "name" = (Any value) */ /* "sparc-version" = (Any value) */ /* "clock-frequency" = (Any value) */ int ihz, version; char line[256], cmd[80], name[256]; FILE *f = NULL; char cmd_line[PAPI_HUGE_STR_LEN + PAPI_HUGE_STR_LEN], fname[L_tmpnam]; unsigned int matched; /*??? system call takes very long */ /* get system configuration and put output into file */ tmpnam( fname ); SUBDBG( "Temporary name %s\n", fname ); sprintf( cmd_line, "/usr/sbin/prtconf -vp > %s", fname ); SUBDBG( "Executing %s\n", cmd_line ); if ( system( cmd_line ) == -1 ) { remove( fname ); return -1; } f = fopen( fname, "r" ); /* open output file */ if ( f == NULL ) { remove( fname ); return -1; } /* ignore all lines until we reach something with a sparc line */ matched = 0x0; ihz = -1; while ( fgets( line, 256, f ) != NULL ) { /*SUBDBG(">>> %s",line); */ if ( ( sscanf( line, "%s", cmd ) == 1 ) && strstr( line, "Node 0x" ) ) { matched = 0x0; /*SUBDBG("Found 'Node' -- search reset. (%#2.2x)\n",matched); */ } else { if ( strstr( cmd, "device_type:" ) && strstr( line, "'cpu'" ) ) { matched |= 0x1; SUBDBG( "Found 'cpu'. (%#2.2x)\n", matched ); } else if ( !strcmp( cmd, "sparc-version:" ) && ( sscanf( line, "%s %#x", cmd, &version ) == 2 ) ) { matched |= 0x2; SUBDBG( "Found version=%d. 
(%#2.2x)\n", version, matched ); } else if ( !strcmp( cmd, "clock-frequency:" ) && ( sscanf( line, "%s %#x", cmd, &ihz ) == 2 ) ) { matched |= 0x4; SUBDBG( "Found ihz=%d. (%#2.2x)\n", ihz, matched ); } else if ( !strcmp( cmd, "name:" ) && ( sscanf( line, "%s %s", cmd, name ) == 2 ) ) { matched |= 0x8; SUBDBG( "Found name: %s. (%#2.2x)\n", name, matched ); } } if ( ( matched & 0xF ) == 0xF ) break; } SUBDBG( "Parsing found name=%s, speed=%dHz, version=%d\n", name, ihz, version ); if ( matched ^ 0x0F ) ihz = -1; else { *hz = ( float ) ihz; *ver = version; strncpy( cpuname, name, len_cpuname ); } return ihz; /* End stolen code */ } int _ultra_set_domain( hwd_control_state_t * this_state, int domain ) { papi_cpc_event_t *command = &this_state->counter_cmd; cpc_event_t *event = &command->cmd; uint64_t pcr = event->ce_pcr; int did = 0; pcr = pcr | 0x7; pcr = pcr ^ 0x7; if ( domain & PAPI_DOM_USER ) { pcr = pcr | 1 << CPC_ULTRA_PCR_USR; did = 1; } if ( domain & PAPI_DOM_KERNEL ) { pcr = pcr | 1 << CPC_ULTRA_PCR_SYS; did = 1; } /* DOMAIN ERROR */ if ( !did ) { return ( PAPI_EINVAL ); } event->ce_pcr = pcr; return ( PAPI_OK ); } static int set_granularity( hwd_control_state_t * this_state, int domain ) { switch ( domain ) { case PAPI_GRN_PROCG: case PAPI_GRN_SYS: case PAPI_GRN_SYS_CPU: case PAPI_GRN_PROC: return PAPI_ECMP; case PAPI_GRN_THR: break; default: return ( PAPI_EINVAL ); } return ( PAPI_OK ); } /* Utility functions */ /* This is a wrapper arount fprintf(stderr,...) 
for cpc_walk_events() */ void print_walk_names( void *arg, int regno, const char *name, uint8_t bits ) { SUBDBG( arg, regno, name, bits ); } static int build_tables( void ) { int i; int regno; int npic; einfo_t *ep; int n; int npresets; npic = cpc_getnpic( cpuver ); nctrs = 0; for ( regno = 0; regno < npic; ++regno ) { cpc_walk_names( cpuver, regno, 0, action ); } SUBDBG( "%d counters\n", nctrs ); if ( ( ctrs = papi_malloc( nctrs * sizeof ( struct ctr_info ) ) ) == 0 ) { return PAPI_ENOMEM; } nctrs = 0; for ( regno = 0; regno < npic; ++regno ) { cpc_walk_names( cpuver, regno, ( void * ) 1, action ); } SUBDBG( "%d counters\n", nctrs ); #if DEBUG if ( ISLEVEL( DEBUG_SUBSTRATE ) ) { for ( i = 0; i < nctrs; ++i ) { SUBDBG( "%s: bits (%#x,%#x) pics %#x\n", ctrs[i].name, ctrs[i].bits[0], ctrs[i].bits[1], ctrs[i].bitmask ); } } #endif /* Build the native event table */ if ( ( native_table = papi_malloc( nctrs * sizeof ( native_info_t ) ) ) == 0 ) { papi_free( ctrs ); return PAPI_ENOMEM; } for ( i = 0; i < nctrs; ++i ) { native_table[i].name[39] = 0; strncpy( native_table[i].name, ctrs[i].name, 39 ); if ( ctrs[i].bitmask & 1 ) native_table[i].encoding[0] = ctrs[i].bits[0]; else native_table[i].encoding[0] = -1; if ( ctrs[i].bitmask & 2 ) native_table[i].encoding[1] = ctrs[i].bits[1]; else native_table[i].encoding[1] = -1; } papi_free( ctrs ); /* Build the preset table */ if ( cpuver <= CPC_ULTRA2 ) { n = sizeof ( us2info ) / sizeof ( einfo_t ); ep = us2info; } else if ( cpuver <= LASTULTRA3 ) { n = sizeof ( us3info ) / sizeof ( einfo_t ); ep = us3info; } else return PAPI_ECMP; preset_table = papi_malloc( ( n + 1 ) * sizeof ( hwi_search_t ) ); npresets = 0; for ( i = 0; i < n; ++i ) { add_preset( preset_table, &npresets, ep[i] ); } memset( &preset_table[npresets], 0, sizeof ( hwi_search_t ) ); #ifdef DEBUG if ( ISLEVEL( DEBUG_SUBSTRATE ) ) { SUBDBG( "Native table: %d\n", nctrs ); for ( i = 0; i < nctrs; ++i ) { SUBDBG( "%40s: %8x %8x\n", native_table[i].name, 
native_table[i].encoding[0], native_table[i].encoding[1] ); } SUBDBG( "\nPreset table: %d\n", npresets ); for ( i = 0; preset_table[i].event_code != 0; ++i ) { SUBDBG( "%8x: op %2d e0 %8x e1 %8x\n", preset_table[i].event_code, preset_table[i].data.derived, preset_table[i].data.native[0], preset_table[i].data.native[1] ); } } #endif _solaris_vector.cmp_info.num_native_events = nctrs; return PAPI_OK; } static int srch_event( char *e1 ) { int i; for ( i = 0; i < nctrs; ++i ) { if ( strcmp( e1, native_table[i].name ) == 0 ) break; } if ( i >= nctrs ) return -1; return i; } /* we should read from the CSV file and make this all unnecessary */ static void add_preset( hwi_search_t * tab, int *np, einfo_t e ) { /* Parse the event info string and build the PAPI preset. * If parse fails, just return, otherwise increment the table * size. We assume that the table is big enough. */ char *p; char *q; char op; char e1[MAX_ENAME], e2[MAX_ENAME]; int i; int ne; int ne2; p = e.event_str; /* Assume p is the name of a native event, the sum of two * native events, or the difference of two native events. * This could be extended with a real parser (hint). 
*/ while ( isspace( *p ) ) ++p; q = p; i = 0; while ( isalnum( *p ) || ( *p == '_' ) ) { if ( i >= MAX_ENAME - 1 ) break; e1[i] = *p++; ++i; } e1[i] = 0; if ( *p == '+' || *p == '-' ) op = *p++; else op = 0; while ( isspace( *p ) ) ++p; q = p; i = 0; while ( isalnum( *p ) || ( *p == '_' ) ) { if ( i >= MAX_ENAME - 1 ) break; e2[i] = *p++; ++i; } e2[i] = 0; if ( e2[0] == 0 && e1[0] == 0 ) { return; } if ( e2[0] == 0 || op == 0 ) { ne = srch_event( e1 ); if ( ne == -1 ) return; tab[*np].event_code = e.papi_event; tab[*np].data.derived = 0; tab[*np].data.native[0] = PAPI_NATIVE_MASK | ne; tab[*np].data.native[1] = PAPI_NULL; memset( tab[*np].data.operation, 0, sizeof ( tab[*np].data.operation ) ); ++*np; return; } ne = srch_event( e1 ); ne2 = srch_event( e2 ); if ( ne == -1 || ne2 == -1 ) return; tab[*np].event_code = e.papi_event; tab[*np].data.derived = ( op == '-' ) ? DERIVED_SUB : DERIVED_ADD; tab[*np].data.native[0] = PAPI_NATIVE_MASK | ne; tab[*np].data.native[1] = PAPI_NATIVE_MASK | ne2; tab[*np].data.native[2] = PAPI_NULL; memset( tab[*np].data.operation, 0, sizeof ( tab[*np].data.operation ) ); ++*np; } void action( void *arg, int regno, const char *name, uint8_t bits ) { int i; if ( arg == 0 ) { ++nctrs; return; } assert( regno == 0 || regno == 1 ); for ( i = 0; i < nctrs; ++i ) { if ( strcmp( ctrs[i].name, name ) == 0 ) { ctrs[i].bits[regno] = bits; ctrs[i].bitmask |= ( 1 << regno ); return; } } memset( &ctrs[i], 0, sizeof ( ctrs[i] ) ); ctrs[i].name = papi_strdup( name ); ctrs[i].bits[regno] = bits; ctrs[i].bitmask = ( 1 << regno ); ++nctrs; } /* This function should tell your kernel extension that your children inherit performance register information and propagate the values up upon child exit and parent wait. 
*/ static int set_inherit( EventSetInfo_t * global, int arg ) { return PAPI_ECMP; /* hwd_control_state_t *machdep = (hwd_control_state_t *)global->machdep; papi_cpc_event_t *command= &machdep->counter_cmd; return(PAPI_EINVAL); */ #if 0 if ( arg == 0 ) { if ( command->flags & CPC_BIND_LWP_INHERIT ) command->flags = command->flags ^ CPC_BIND_LWP_INHERIT; } else if ( arg == 1 ) { command->flags = command->flags | CPC_BIND_LWP_INHERIT; } else return ( PAPI_EINVAL ); return ( PAPI_OK ); #endif } static int set_default_domain( hwd_control_state_t * ctrl_state, int domain ) { /* This doesn't exist on this platform */ if ( domain == PAPI_DOM_OTHER ) return ( PAPI_EINVAL ); return ( _ultra_set_domain( ctrl_state, domain ) ); } static int set_default_granularity( hwd_control_state_t * current_state, int granularity ) { return ( set_granularity( current_state, granularity ) ); } rwlock_t lock[PAPI_MAX_LOCK]; static void lock_init( void ) { memset( lock, 0x0, sizeof ( rwlock_t ) * PAPI_MAX_LOCK ); } int _ultra_hwd_shutdown_component( void ) { ( void ) cpc_rele( ); return ( PAPI_OK ); } int _ultra_hwd_init_component( int cidx ) { int retval; /* retval = _papi_hwi_setup_vector_table(vtable, _solaris_ultra_table); if ( retval != PAPI_OK ) return(retval); */ /* Fill in what we can of the papi_system_info. 
*/ retval = _solaris_get_system_info( &_papi_hwi_system_info ); if ( retval ) return ( retval ); /* Setup memory info */ retval = _papi_os_vector.get_memory_info( &_papi_hwi_system_info.hw_info, 0 ); if ( retval ) return ( retval ); lock_init( ); SUBDBG( "Found %d %s %s CPUs at %d Mhz.\n", _papi_hwi_system_info.hw_info.totalcpus, _papi_hwi_system_info.hw_info.vendor_string, _papi_hwi_system_info.hw_info.model_string, _papi_hwi_system_info.hw_info.cpu_max_mhz ); return ( PAPI_OK ); } /* reset the hardware counter */ int _ultra_hwd_reset( hwd_context_t * ctx, hwd_control_state_t * ctrl ) { int retval; /* reset the hardware counter */ ctrl->counter_cmd.cmd.ce_pic[0] = 0; ctrl->counter_cmd.cmd.ce_pic[1] = 0; /* let's rock and roll */ retval = cpc_bind_event( &ctrl->counter_cmd.cmd, ctrl->counter_cmd.flags ); if ( retval == -1 ) return ( PAPI_ESYS ); return ( PAPI_OK ); } int _ultra_hwd_read( hwd_context_t * ctx, hwd_control_state_t * ctrl, long long **events, int flags ) { int retval; retval = cpc_take_sample( &ctrl->counter_cmd.cmd ); if ( retval == -1 ) return ( PAPI_ESYS ); *events = ( long long * ) ctrl->counter_cmd.cmd.ce_pic; return PAPI_OK; } int _ultra_hwd_ctl( hwd_context_t * ctx, int code, _papi_int_option_t * option ) { switch ( code ) { case PAPI_DEFDOM: return ( set_default_domain ( option->domain.ESI->ctl_state, option->domain.domain ) ); case PAPI_DOMAIN: return ( _ultra_set_domain ( option->domain.ESI->ctl_state, option->domain.domain ) ); case PAPI_DEFGRN: return ( set_default_granularity ( option->domain.ESI->ctl_state, option->granularity.granularity ) ); case PAPI_GRANUL: return ( set_granularity ( option->granularity.ESI->ctl_state, option->granularity.granularity ) ); default: return ( PAPI_EINVAL ); } } void _ultra_hwd_dispatch_timer( int signal, siginfo_t * si, void *context ) { _papi_hwi_context_t ctx; ThreadInfo_t *master = NULL; int isHardware = 0; vptr_t address; int cidx = _solaris_vector.cmp_info.CmpIdx; ctx.si = si; ctx.ucontext = ( 
ucontext_t * ) context; address = GET_OVERFLOW_ADDRESS( ctx ); _papi_hwi_dispatch_overflow_signal( ( void * ) &ctx, address, &isHardware, 0, 0, &master, _solaris_vector.cmp_info.CmpIdx ); /* We are done, resume interrupting counters */ if ( isHardware ) { // errno = vperfctr_iresume( master->context[cidx]->perfctr ); //if ( errno < 0 ) { // PAPIERROR( "vperfctr_iresume errno %d", errno ); //} } #if 0 EventSetInfo_t *ESI = NULL; ThreadInfo_t *thread = NULL; int overflow_vector = 0; hwd_control_state_t *ctrl = NULL; long_long results[MAX_COUNTERS]; int i; _papi_hwi_context_t ctx; vptr_t address; int cidx = _solaris_vector.cmp_info.CmpIdx; ctx.si = si; ctx.ucontext = ( hwd_ucontext_t * ) info; thread = _papi_hwi_lookup_thread( 0 ); if ( thread == NULL ) { PAPIERROR( "thread == NULL in _papi_hwd_dispatch_timer"); return; } ESI = ( EventSetInfo_t * ) thread->running_eventset[cidx]; if ( ESI == NULL || ESI->master != thread || ESI->ctl_state == NULL || ( ( ESI->state & PAPI_OVERFLOWING ) == 0 ) ) { if ( ESI == NULL ) PAPIERROR( "ESI is NULL\n"); if ( ESI->master != thread ) PAPIERROR( "Thread mismatch, ESI->master=%#x thread=%#x\n", ESI->master, thread ); if ( ESI->ctl_state == NULL ) PAPIERROR( "Counter state invalid\n"); if ( ( ( ESI->state & PAPI_OVERFLOWING ) == 0 ) ) PAPIERROR( "Overflow flag missing"); } ctrl = ESI->ctl_state; if ( thread->running_eventset[cidx]->overflow.flags & PAPI_OVERFLOW_FORCE_SW ) { address = GET_OVERFLOW_ADDRESS( ctx ); _papi_hwi_dispatch_overflow_signal( ( void * ) &ctx, address, NULL, 0, 0, &thread, cidx ); } else { PAPIERROR ( "Need to implement additional code in _papi_hwd_dispatch_timer!" 
); } #endif } int _ultra_hwd_set_overflow( EventSetInfo_t * ESI, int EventIndex, int threshold ) { hwd_control_state_t *this_state = ESI->ctl_state; papi_cpc_event_t *arg = &this_state->counter_cmd; int hwcntr; if ( threshold == 0 ) { if ( this_state->overflow_num == 1 ) { arg->flags ^= CPC_BIND_EMT_OVF; if ( sigaction ( _solaris_vector.cmp_info.hardware_intr_sig, NULL, NULL ) == -1 ) return ( PAPI_ESYS ); this_state->overflow_num = 0; } else this_state->overflow_num--; } else { struct sigaction act; /* increase the counter for overflow events */ this_state->overflow_num++; act.sa_sigaction = dispatch_emt; memset( &act.sa_mask, 0x0, sizeof ( act.sa_mask ) ); act.sa_flags = SA_RESTART | SA_SIGINFO; if ( sigaction ( _solaris_vector.cmp_info.hardware_intr_sig, &act, NULL ) == -1 ) return ( PAPI_ESYS ); arg->flags |= CPC_BIND_EMT_OVF; hwcntr = ESI->EventInfoArray[EventIndex].pos[0]; if ( hwcntr == 0 ) arg->cmd.ce_pic[0] = UINT64_MAX - ( uint64_t ) threshold; else if ( hwcntr == 1 ) arg->cmd.ce_pic[1] = UINT64_MAX - ( uint64_t ) threshold; } return ( PAPI_OK ); } _ultra_shutdown( hwd_context_t * ctx ) { return PAPI_OK; } /* int _papi_hwd_stop_profiling(ThreadInfo_t * master, EventSetInfo_t * ESI) { ESI->profile.overflowcount = 0; return (PAPI_OK); } */ void * _ultra_hwd_get_overflow_address( void *context ) { void *location; ucontext_t *info = ( ucontext_t * ) context; location = ( void * ) info->uc_mcontext.gregs[REG_PC]; return ( location ); } int _ultra_hwd_start( hwd_context_t * ctx, hwd_control_state_t * ctrl ) { int retval; /* reset the hardware counter */ if ( ctrl->overflow_num == 0 ) { ctrl->counter_cmd.cmd.ce_pic[0] = 0; ctrl->counter_cmd.cmd.ce_pic[1] = 0; } /* let's rock and roll */ retval = cpc_bind_event( &ctrl->counter_cmd.cmd, ctrl->counter_cmd.flags ); if ( retval == -1 ) return ( PAPI_ESYS ); return ( PAPI_OK ); } int _ultra_hwd_stop( hwd_context_t * ctx, hwd_control_state_t * ctrl ) { cpc_bind_event( NULL, 0 ); return PAPI_OK; } int 
_ultra_hwd_remove_event( hwd_register_map_t * chosen, unsigned int hardware_index, hwd_control_state_t * out ) { return PAPI_OK; } int _ultra_hwd_encode_native( char *name, int *code ) { return ( PAPI_OK ); } int _ultra_hwd_ntv_enum_events( unsigned int *EventCode, int modifier ) { int index = *EventCode & PAPI_NATIVE_AND_MASK; if ( modifier == PAPI_ENUM_FIRST ) { *EventCode = PAPI_NATIVE_MASK + 1; return PAPI_OK; } if ( cpuver <= CPC_ULTRA2 ) { if ( index < MAX_NATIVE_EVENT_USII - 1 ) { *EventCode = *EventCode + 1; return ( PAPI_OK ); } else return ( PAPI_ENOEVNT ); } else if ( cpuver <= LASTULTRA3 ) { if ( index < MAX_NATIVE_EVENT - 1 ) { *EventCode = *EventCode + 1; return ( PAPI_OK ); } else return ( PAPI_ENOEVNT ); }; return ( PAPI_ENOEVNT ); } int _ultra_hwd_ntv_code_to_name( unsigned int EventCode, char *ntv_name, int len ) { int event_code = EventCode & PAPI_NATIVE_AND_MASK; if ( event_code >= 0 && event_code < nctrs ) { strlcpy( ntv_name, native_table[event_code].name, len ); return PAPI_OK; } return PAPI_ENOEVNT; } int _ultra_hwd_ntv_code_to_descr( unsigned int EventCode, char *hwd_descr, int len ) { return ( _ultra_hwd_ntv_code_to_name( EventCode, hwd_descr, len ) ); } static void copy_value( unsigned int val, char *nam, char *names, unsigned int *values, int len ) { *values = val; strncpy( names, nam, len ); names[len - 1] = 0; } int _ultra_hwd_ntv_code_to_bits( unsigned int EventCode, hwd_register_t * bits ) { int index = EventCode & PAPI_NATIVE_AND_MASK; if ( cpuver <= CPC_ULTRA2 ) { if ( index >= MAX_NATIVE_EVENT_USII ) { return ( PAPI_ENOEVNT ); } } else if ( cpuver <= LASTULTRA3 ) { if ( index >= MAX_NATIVE_EVENT ) { return ( PAPI_ENOEVNT ); } } else return ( PAPI_ENOEVNT ); bits->event[0] = native_table[index].encoding[0]; bits->event[1] = native_table[index].encoding[1]; return ( PAPI_OK ); } int _ultra_hwd_init_control_state( hwd_control_state_t * ptr ) { ptr->counter_cmd.flags = 0x0; ptr->counter_cmd.cmd.ce_cpuver = cpuver; 
ptr->counter_cmd.cmd.ce_pcr = 0x0; ptr->counter_cmd.cmd.ce_pic[0] = 0; ptr->counter_cmd.cmd.ce_pic[1] = 0; _ultra_set_domain( ptr, _solaris_vector.cmp_info.default_domain ); set_granularity( ptr, _solaris_vector.cmp_info.default_granularity ); return PAPI_OK; } int _ultra_hwd_update_control_state( hwd_control_state_t * this_state, NativeInfo_t * native, int count, hwd_context_t * zero ) { int nidx1, nidx2, hwcntr; uint64_t tmp = 0; uint64_t pcr; int64_t cmd0, cmd1; /* save the last three bits */ pcr = this_state->counter_cmd.cmd.ce_pcr & 0x7; /* clear the control register */ this_state->counter_cmd.cmd.ce_pcr = pcr; /* no native events left */ if ( count == 0 ) return ( PAPI_OK ); cmd0 = -1; cmd1 = -1; /* one native event */ if ( count == 1 ) { nidx1 = native[0].ni_event & PAPI_NATIVE_AND_MASK; hwcntr = 0; cmd0 = native_table[nidx1].encoding[0]; native[0].ni_position = 0; if ( cmd0 == -1 ) { cmd1 = native_table[nidx1].encoding[1]; native[0].ni_position = 1; } } /* two native events */ if ( count == 2 ) { int avail1, avail2; avail1 = 0; avail2 = 0; nidx1 = native[0].ni_event & PAPI_NATIVE_AND_MASK; nidx2 = native[1].ni_event & PAPI_NATIVE_AND_MASK; if ( native_table[nidx1].encoding[0] != -1 ) avail1 = 0x1; if ( native_table[nidx1].encoding[1] != -1 ) avail1 += 0x2; if ( native_table[nidx2].encoding[0] != -1 ) avail2 = 0x1; if ( native_table[nidx2].encoding[1] != -1 ) avail2 += 0x2; if ( ( avail1 | avail2 ) != 0x3 ) return ( PAPI_ECNFLCT ); if ( avail1 == 0x3 ) { if ( avail2 == 0x1 ) { cmd0 = native_table[nidx2].encoding[0]; cmd1 = native_table[nidx1].encoding[1]; native[0].ni_position = 1; native[1].ni_position = 0; } else { cmd1 = native_table[nidx2].encoding[1]; cmd0 = native_table[nidx1].encoding[0]; native[0].ni_position = 0; native[1].ni_position = 1; } } else { if ( avail1 == 0x1 ) { cmd0 = native_table[nidx1].encoding[0]; cmd1 = native_table[nidx2].encoding[1]; native[0].ni_position = 0; native[1].ni_position = 1; } else { cmd0 = 
native_table[nidx2].encoding[0]; cmd1 = native_table[nidx1].encoding[1]; native[0].ni_position = 1; native[1].ni_position = 0; } } } /* set the control register */ if ( cmd0 != -1 ) { tmp = ( ( uint64_t ) cmd0 << pcr_shift[0] ); } if ( cmd1 != -1 ) { tmp = tmp | ( ( uint64_t ) cmd1 << pcr_shift[1] ); } this_state->counter_cmd.cmd.ce_pcr = tmp | pcr; #if DEBUG dump_cmd( &this_state->counter_cmd ); #endif return ( PAPI_OK ); } papi_vector_t _solaris_vector = { .cmp_info = { .name = "solaris.ultra", .description = "Solaris CPU counters", .num_cntrs = MAX_COUNTERS, .num_mpx_cntrs = MAX_COUNTERS, .default_domain = PAPI_DOM_USER, .available_domains = PAPI_DOM_USER | PAPI_DOM_KERNEL, .default_granularity = PAPI_GRN_THR, .available_granularities = PAPI_GRN_THR, .fast_real_timer = 1, .fast_virtual_timer = 1, .attach = 1, .attach_must_ptrace = 1, .hardware_intr = 0, .hardware_intr_sig = PAPI_INT_SIGNAL, .precise_intr = 0, } , /* component data structure sizes */ .size = { .context = sizeof ( hwd_context_t ), .control_state = sizeof ( hwd_control_state_t ), .reg_value = sizeof ( hwd_register_t ), .reg_alloc = sizeof ( hwd_reg_alloc_t ), } , /* component interface functions */ .init_control_state = _ultra_hwd_init_control_state, .start = _ultra_hwd_start, .stop = _ultra_hwd_stop, .read = _ultra_hwd_read, .shutdown = _ultra_shutdown, .shutdown_component = _ultra_hwd_shutdown_component, .ctl = _ultra_hwd_ctl, .update_control_state = _ultra_hwd_update_control_state, .set_domain = _ultra_set_domain, .reset = _ultra_hwd_reset, .set_overflow = _ultra_hwd_set_overflow, /* .set_profile */ /* .stop_profiling = _papi_hwd_stop_profiling, */ .ntv_enum_events = _ultra_hwd_ntv_enum_events, /* .ntv_name_to_code */ .ntv_code_to_name = _ultra_hwd_ntv_code_to_name, .ntv_code_to_descr = _ultra_hwd_ntv_code_to_descr, .ntv_code_to_bits = _ultra_hwd_ntv_code_to_bits, .init_component = _ultra_hwd_init_component, .dispatch_timer = _ultra_hwd_dispatch_timer, }; papi_os_vector_t _papi_os_vector = { /* 
OS dependent local routines */ .get_memory_info = _solaris_get_memory_info, .get_dmem_info = _solaris_get_dmem_info, .update_shlib_info = _solaris_update_shlib_info, .get_system_info = _solaris_get_system_info, .get_real_usec = _solaris_get_real_usec, .get_real_cycles = _solaris_get_real_cycles, .get_virt_usec = _solaris_get_virt_usec, };

/* ===== src/solaris-ultra.h ===== */

#ifndef _PAPI_SOLARIS_ULTRA_H #define _PAPI_SOLARIS_ULTRA_H #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "papi_defines.h" #define MAX_COUNTERS 2 #define MAX_COUNTER_TERMS MAX_COUNTERS #define PAPI_MAX_NATIVE_EVENTS 71 #define MAX_NATIVE_EVENT PAPI_MAX_NATIVE_EVENTS #define MAX_NATIVE_EVENT_USII 22 /* Defines in papi_internal.h cause compile warnings on solaris because typedefs are done here */ #undef hwd_context_t #undef hwd_control_state_t #undef hwd_reg_alloc_t #undef hwd_register_t #undef hwd_siginfo_t #undef hwd_ucontext_t typedef int hwd_reg_alloc_t; typedef struct US_register { int event[MAX_COUNTERS]; } hwd_register_t; typedef struct papi_cpc_event { /* Structure to libcpc */ cpc_event_t cmd; /* Flags to kernel */ int flags; } papi_cpc_event_t; typedef struct hwd_control_state { /* Buffer to pass to the kernel to control the counters */ papi_cpc_event_t counter_cmd; /* overflow event counter */ int overflow_num; } hwd_control_state_t; typedef int hwd_register_map_t; typedef struct _native_info { /* native name */ char name[40]; /* Buffer to pass to the kernel to control the counters */ int encoding[MAX_COUNTERS]; } native_info_t; #include "solaris-context.h" typedef int hwd_context_t; /* Assembler prototypes */ extern void cpu_sync( void ); extern unsigned long long get_tick( void ); extern vptr_t _start, _end, _etext, _edata; extern rwlock_t
lock[PAPI_MAX_LOCK]; #define _papi_hwd_lock(lck) rw_wrlock(&lock[lck]); #define _papi_hwd_unlock(lck) rw_unlock(&lock[lck]); #endif

/* ===== src/sw_multiplex.c ===== */

/** * @file sw_multiplex.c * @author Philip Mucci * mucci@cs.utk.edu * @author John May * johnmay@llnl.gov * @author Nils Smeds * smeds@pdc.kth.se * @author Haihang You * you@cs.utk.edu * @author Kevin London * london@cs.utk.edu * @author Maynard Johnson * maynardj@us.ibm.com * @author Dan Terpstra * terpstra@cs.utk.edu */ /** xxxx Will this stuff run unmodified on multiple components? What happens when several components are counting multiplexed? */ /* disable this to return to the pre 4.1.1 behavior */ #define MPX_NONDECR_HYBRID /* Nils Smeds */ /* This MPX update modifies the behaviour of the multiplexing in PAPI. * The previous versions of the multiplexing based the value returned * from PAPI_reads on the total counts achieved since the PAPI_start * of the multiplexed event. This count was used as the basis of the * extrapolation using the proportion of time that this particular * event was active to the total time the multiplexed event was * active. However, a typical usage of PAPI is to measure over * sections of code by starting the event once and by comparing * the values returned by subsequent calls to PAPI_read. The difference * in counts is used as the measure of occurred events in the code * section between the calls. * * When multiplexing is used in this fashion the time proportion used * for extrapolation might appear inconsistent. The time fraction used * at each PAPI_read is the total time fraction since PAPI_start. If the * counter values achieved in each multiplex of the event vary * widely, or if the time slices are varying in length, discrepancies * from the behaviour without multiplexing might occur. * * In this version the extrapolation is made on a local time scale.
At * each completed time slice the event extrapolates the achieved count * to a extrapolated count for the time since this event was last sliced * out up to the current point in time. There will still be occasions * when two consecutive PAPI_read will yield decreasing results, but all * extrapolations are being made on time local data. If time slicing * varies or if the count rate varies this implementation is expected to * be more "accurate" in a loose and here unspecified meaning. * * The short description of the changes is that the running events has * new fields count_estimate, rate_estimate and prev_total_c. The mpx * events have had the meaning of start_values and stop_values modified * to mean extrapolated start value and extrapolated stop value. */ /* Portions of the following code are Copyright (c) 2009, Lawrence Livermore National Security, LLC. Produced at the Lawrence Livermore National Laboratory Written by John May, johnmay@llnl.gov LLNL-CODE-421124 All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: • Redistributions of source code must retain the above copyright notice, this list of conditions and the disclaimer below. • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the disclaimer (as noted below) in the documentation and/or other materials provided with the distribution. • Neither the name of the LLNS/LLNL nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL LAWRENCE LIVERMORE NATIONAL SECURITY, LLC, THE U.S. 
DEPARTMENT OF ENERGY OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Additional BSD Notice 1. This notice is required to be provided under our contract with the U.S. Department of Energy (DOE). This work was produced at Lawrence Livermore National Laboratory under Contract No. DE-AC52-07NA27344 with the DOE. 2. Neither the United States Government nor Lawrence Livermore National Security, LLC nor any of their employees, makes any warranty, express or implied, or assumes any liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately-owned rights. 3. Also, reference herein to any specific commercial products, process, or services by trade name, trademark, manufacturer or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or Lawrence Livermore National Security, LLC. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or Lawrence Livermore National Security, LLC, and shall not be used for advertising or product endorsement purposes. */ #include "papi.h" #include "papi_internal.h" #include "papi_vector.h" #include "papi_memory.h" #define MPX_MINCYC 25000 /* Globals for this file. */ /** List of threads that are multiplexing. 
*/ static Threadlist *tlist = NULL; static unsigned int randomseed; /* Timer stuff */ #include #include #include #include #include static sigset_t sigreset; static struct itimerval itime; static const struct itimerval itimestop = { {0, 0}, {0, 0} }; static struct sigaction oaction; /* END Globals */ #ifdef PTHREADS /** Number of threads that have been signaled */ static int threads_responding = 0; static pthread_once_t mpx_once_control = PTHREAD_ONCE_INIT; static pthread_mutex_t tlistlock; static pthread_key_t master_events_key; static pthread_key_t thread_record_key; static MasterEvent *global_master_events; static void *global_process_record; #endif /* Forward prototypes */ static void mpx_remove_unused( MasterEvent ** head ); static void mpx_delete_events( MPX_EventSet * ); static void mpx_delete_one_event( MPX_EventSet * mpx_events, int Event ); static int mpx_insert_events( MPX_EventSet *, int *event_list, int num_events, int domain, int granularity ); static void mpx_handler( int signal ); inline_static void mpx_hold( void ) { sigprocmask( SIG_BLOCK, &sigreset, NULL ); MPXDBG( "signal held\n" ); } inline_static void mpx_release( void ) { MPXDBG( "signal released\n" ); sigprocmask( SIG_UNBLOCK, &sigreset, NULL ); } static void mpx_init_timers( int interval ) { /* Fill in the interval timer values now to save a * little time later. 
*/ #ifdef OUTSIDE_PAPI interval = MPX_DEFAULT_INTERVAL; #endif #ifdef REGENERATE /* Signal handler restarts the timer every time it runs */ itime.it_interval.tv_sec = 0; itime.it_interval.tv_usec = 0; itime.it_value.tv_sec = 0; itime.it_value.tv_usec = interval; #else /* Timer resets itself automatically */ itime.it_interval.tv_sec = 0; itime.it_interval.tv_usec = interval; itime.it_value.tv_sec = 0; itime.it_value.tv_usec = interval; #endif sigemptyset( &sigreset ); sigaddset( &sigreset, _papi_os_info.itimer_sig ); } static int mpx_startup_itimer( void ) { struct sigaction sigact; /* Set up the signal handler and the timer that triggers it */ MPXDBG( "PID %d\n", getpid( ) ); memset( &sigact, 0, sizeof ( sigact ) ); sigact.sa_flags = SA_RESTART; sigact.sa_handler = mpx_handler; if ( sigaction( _papi_os_info.itimer_sig, &sigact, NULL ) == -1 ) { PAPIERROR( "sigaction start errno %d", errno ); return PAPI_ESYS; } if ( setitimer( _papi_os_info.itimer_num, &itime, NULL ) == -1 ) { sigaction( _papi_os_info.itimer_sig, &oaction, NULL ); PAPIERROR( "setitimer start errno %d", errno ); return PAPI_ESYS; } return ( PAPI_OK ); } static void mpx_restore_signal( void ) { MPXDBG( "restore signal\n" ); if ( _papi_os_info.itimer_sig != PAPI_NULL ) { if ( signal( _papi_os_info.itimer_sig, SIG_IGN ) == SIG_ERR ) PAPIERROR( "sigaction stop errno %d", errno ); } } static void mpx_shutdown_itimer( void ) { MPXDBG( "setitimer off\n" ); if ( _papi_os_info.itimer_num != PAPI_NULL ) { if ( setitimer( _papi_os_info.itimer_num, ( struct itimerval * ) &itimestop, NULL ) == -1 ) PAPIERROR( "setitimer stop errno %d", errno ); } } static MasterEvent * get_my_threads_master_event_list( void ) { Threadlist *t = tlist; unsigned long tid; MPXDBG( "tlist is %p\n", tlist ); if ( tlist == NULL ) return NULL; if ( _papi_hwi_thread_id_fn == NULL ) return ( tlist->head ); tid = _papi_hwi_thread_id_fn( ); unsigned long pid = ( unsigned long ) getpid( ); while ( t ) { if ( t->tid == tid || ( ( tid == 0 ) 
&& ( t->tid == pid ) ) ) return ( t->head ); t = t->next; } return ( NULL ); } static MPX_EventSet * mpx_malloc( Threadlist * t ) { MPX_EventSet *newset = ( MPX_EventSet * ) papi_malloc( sizeof ( MPX_EventSet ) ); if ( newset == NULL ) return ( NULL ); memset( newset, 0, sizeof ( MPX_EventSet ) ); newset->status = MPX_STOPPED; newset->mythr = t; return ( newset ); } int mpx_add_event( MPX_EventSet ** mpx_events, int EventCode, int domain, int granularity ) { MPX_EventSet *newset = *mpx_events; int retval, alloced_newset = 0; Threadlist *t; /* Get the global list of threads */ MPXDBG("Adding %p %#x\n",newset,EventCode); _papi_hwi_lock( MULTIPLEX_LOCK ); t = tlist; /* If there are no threads in the list at all, then allocate the new Threadlist */ if ( t == NULL ) { new_thread: t = ( Threadlist * ) papi_malloc( sizeof ( Threadlist ) ); if ( t == NULL ) { _papi_hwi_unlock( MULTIPLEX_LOCK ); return ( PAPI_ENOMEM ); } /* If we're actually threaded, fill the * field with the thread_id otherwise * use getpid() as a placeholder. */ if ( _papi_hwi_thread_id_fn ) { MPXDBG( "New thread at %p\n", t ); t->tid = _papi_hwi_thread_id_fn( ); } else { MPXDBG( "New process at %p\n", t ); t->tid = ( unsigned long ) getpid( ); } /* Fill in the fields */ t->head = NULL; t->cur_event = NULL; t->next = tlist; tlist = t; MPXDBG( "New head is at %p(%lu).\n", tlist, ( long unsigned ) tlist->tid ); /* alloced_thread = 1; */ } else if ( _papi_hwi_thread_id_fn ) { /* If we are threaded, AND there exists threads in the list, * then try to find our thread in the list. */ unsigned long tid = _papi_hwi_thread_id_fn( ); while ( t ) { if ( t->tid == tid ) { MPXDBG( "Found thread %#lx\n", t->tid ); break; } t = t->next; } /* Our thread is not in the list, so make a new * thread entry. 
*/ if ( t == NULL ) { MPXDBG( "New thread %lx\n", tid ); goto new_thread; } } /* Now t & tlist points to our thread, also at the head of the list */ /* Allocate a the MPX_EventSet if necessary */ if ( newset == NULL ) { newset = mpx_malloc( t ); if ( newset == NULL ) { _papi_hwi_unlock( MULTIPLEX_LOCK ); return ( PAPI_ENOMEM ); } alloced_newset = 1; } /* Now we're finished playing with the thread list */ _papi_hwi_unlock( MULTIPLEX_LOCK ); /* Removed newset->num_events++, moved to mpx_insert_events() */ mpx_hold( ); /* Create PAPI events (if they don't already exist) and link * the new event set to them, add them to the master list for the thread, reset master event list for this thread */ retval = mpx_insert_events( newset, &EventCode, 1, domain, granularity ); if ( retval != PAPI_OK ) { if ( alloced_newset ) { papi_free( newset ); newset = NULL; } } mpx_release( ); /* Output the new or existing EventSet */ *mpx_events = newset; return retval; } int mpx_remove_event( MPX_EventSet ** mpx_events, int EventCode ) { mpx_hold( ); if ( *mpx_events ) mpx_delete_one_event( *mpx_events, EventCode ); mpx_release( ); return ( PAPI_OK ); } #ifdef MPX_DEBUG_TIMER static long long lastcall; #endif #ifdef _POWER6 /* POWER6 can always count PM_RUN_CYC on counter 6 in domain PAPI_DOM_ALL, and can count it on other domains on counters 1 and 2 along with a very limited number of other native events */ int _PNE_PM_RUN_CYC; #define SCALE_EVENT _PNE_PM_RUN_CYC #else #define SCALE_EVENT PAPI_TOT_CYC #endif static void mpx_handler( int signal ) { int retval; MasterEvent *mev, *head; Threadlist *me = NULL; #ifdef REGENERATE int lastthread; #endif #ifdef MPX_DEBUG_OVERHEAD long long usec; int didwork = 0; usec = PAPI_get_real_usec( ); #endif #ifdef MPX_DEBUG_TIMER long long thiscall; #endif signal = signal; /* unused */ MPXDBG( "Handler in thread\n" ); /* This handler can be invoked either when a timer expires * or when another thread in this handler responding to the * timer signals other 
threads. We have to distinguish * these two cases so that we don't get an infinite loop of * handler calls. To do that, we look at the value of * threads_responding. We assume that only one thread can * be active in this signal handler at a time, since the * invoking signal is blocked while the handler is active. * If threads_responding == 0, the current thread caught * the original timer signal. (This thread may not have * any active event lists itself, though.) This first * thread sends a signal to each of the other threads in * our list of threads that have master event lists. If * threads_responding != 0, then this thread was signaled * by another thread. We decrement that value and look * for an active event. threads_responding should * reach zero when all active threads have handled their * signal. It's probably possible for a thread to die * before it responds to a signal; if that happens, * threads_responding won't reach zero until the next * timer signal happens. Then the signalled thread won't * signal any other threads. If that happens only * occasionally, there should be no harm. Likewise if * a new thread is added that fails to get signalled. * As for locking, we have to lock this list to prevent * another thread from modifying it, but if *this* thread * is trying to update the list (from another function) and * is signaled while it holds the lock, we will have a deadlock. * Therefore, noninterrupt functions that update *this* list * must disable the signal that invokes this handler.
*/ #ifdef PTHREADS _papi_hwi_lock( MULTIPLEX_LOCK ); if ( threads_responding == 0 ) { /* this thread caught the timer sig */ /* Signal the other threads with event lists */ #ifdef MPX_DEBUG_TIMER thiscall = _papi_hwd_get_real_usec( ); MPXDBG( "last signal was %lld usec ago\n", thiscall - lastcall ); lastcall = thiscall; #endif MPXDBG( "%#x caught it, tlist is %p\n", self, tlist ); for ( t = tlist; t != NULL; t = t->next ) { if ( pthread_equal( t->thr, self ) == 0 ) { ++threads_responding; retval = pthread_kill( t->thr, _papi_os_info.itimer_sig ); assert( retval == 0 ); #ifdef MPX_DEBUG_SIGNALS MPXDBG( "%#x signaling %#x\n", self, t->thr ); #endif } } } else { #ifdef MPX_DEBUG_SIGNALS MPXDBG( "%#x was tapped, tr = %d\n", self, threads_responding ); #endif --threads_responding; } #ifdef REGENERATE lastthread = ( threads_responding == 0 ); #endif _papi_hwi_unlock( MULTIPLEX_LOCK ); #endif /* See if this thread has an active event list */ head = get_my_threads_master_event_list( ); if ( head != NULL ) { /* Get the thread header for this master event set. It's * always in the first record of the set (and maybe in others) * if any record in the set is active. */ me = head->mythr; /* Find the event that's currently active, stop and read * it, then start the next event in the list. * No need to lock the list because other functions * disable the timer interrupt before they update the list. */ if ( me != NULL && me->cur_event != NULL ) { long long counts[2]; MasterEvent *cur_event = me->cur_event; long long cycles = 0, total_cycles = 0; retval = PAPI_stop( cur_event->papi_event, counts ); MPXDBG( "retval=%d, cur_event=%p, I'm tid=%lx\n", retval, cur_event, me->tid ); if ( retval == PAPI_OK ) { MPXDBG( "counts[0] = %lld counts[1] = %lld\n", counts[0], counts[1] ); cur_event->count += counts[0]; cycles = ( cur_event->pi.event_type == SCALE_EVENT ) ? 
counts[0] : counts[1]; me->total_c += cycles; total_cycles = me->total_c - cur_event->prev_total_c; cur_event->prev_total_c = me->total_c; /* If it's a rate, count occurrences & average later */ if ( !cur_event->is_a_rate ) { cur_event->cycles += cycles; if ( cycles >= MPX_MINCYC ) { /* Only update current rate on a decent slice */ cur_event->rate_estimate = ( double ) counts[0] / ( double ) cycles; } cur_event->count_estimate += ( long long ) ( ( double ) total_cycles * cur_event->rate_estimate ); MPXDBG("New estimate = %lld (%lld cycles * %lf rate)\n", cur_event->count_estimate,total_cycles, cur_event->rate_estimate); } else { /* Make sure we ran long enough to get a useful measurement (otherwise * potentially inaccurate rate measurements get averaged in with * the same weight as longer, more accurate ones.) */ if ( cycles >= MPX_MINCYC ) { cur_event->cycles += 1; } else { cur_event->count -= counts[0]; } } } else { MPXDBG( "%lx retval = %d, skipping\n", me->tid, retval ); MPXDBG( "%lx value = %lld cycles = %lld\n\n", me->tid, cur_event->count, cur_event->cycles ); } MPXDBG ( "tid(%lx): value = %lld (%lld) cycles = %lld (%lld) rate = %lf\n\n", me->tid, cur_event->count, cur_event->count_estimate, cur_event->cycles, total_cycles, cur_event->rate_estimate ); /* Start running the next event; look for the * next one in the list that's marked active. * It's possible that this event is the only * one active; if so, we should restart it, * but only after considerating all the other * possible events. */ if ( ( retval != PAPI_OK ) || ( ( retval == PAPI_OK ) && ( cycles >= MPX_MINCYC ) ) ) { for ( mev = ( cur_event->next == NULL ) ? head : cur_event->next; mev != cur_event; mev = ( mev->next == NULL ) ? 
head : mev->next ) { /* Found the next one to start */ if ( mev->active ) { me->cur_event = mev; break; } } } if ( me->cur_event->active ) { retval = PAPI_start( me->cur_event->papi_event ); } #ifdef MPX_DEBUG_OVERHEAD didwork = 1; #endif } } #ifdef ANY_THREAD_GETS_SIGNAL else { Threadlist *t; for ( t = tlist; t != NULL; t = t->next ) { if ( ( t->tid == _papi_hwi_thread_id_fn( ) ) || ( t->head == NULL ) ) continue; MPXDBG( "forwarding signal to thread %lx\n", t->tid ); retval = ( *_papi_hwi_thread_kill_fn ) ( t->tid, _papi_os_info.itimer_sig ); if ( retval != 0 ) { MPXDBG( "forwarding signal to thread %lx returned %d\n", t->tid, retval ); } } } #endif #ifdef REGENERATE /* Regenerating the signal each time through has the * disadvantage that if any thread ever drops a signal, * the whole time slicing system will stop. Using * an automatically regenerated signal may have the * disadvantage that a new signal can arrive very * soon after all the threads have finished handling * the last one, so the interval may be too small for * accurate data collection. However, using the * MIN_CYCLES check above should alleviate this. */ /* Reset the timer once all threads have responded */ if ( lastthread ) { retval = setitimer( _papi_os_info.itimer_num, &itime, NULL ); assert( retval == 0 ); #ifdef MPX_DEBUG_TIMER MPXDBG( "timer restarted by %lx\n", me->tid ); #endif } #endif #ifdef MPX_DEBUG_OVERHEAD usec = _papi_hwd_get_real_usec( ) - usec; MPXDBG( "handler %#x did %swork in %lld usec\n", self, ( didwork ? 
"" : "no " ), usec ); #endif } int MPX_add_events( MPX_EventSet ** mpx_events, int *event_list, int num_events, int domain, int granularity ) { int i, retval = PAPI_OK; for ( i = 0; i < num_events; i++ ) { retval = mpx_add_event( mpx_events, event_list[i], domain, granularity ); if ( retval != PAPI_OK ) return ( retval ); } return ( retval ); } int MPX_start( MPX_EventSet * mpx_events ) { int retval = PAPI_OK; int i; long long values[2]; long long cycles_this_slice, current_thread_mpx_c = 0; Threadlist *t; t = mpx_events->mythr; mpx_hold( ); if ( t->cur_event && t->cur_event->active ) { current_thread_mpx_c += t->total_c; retval = PAPI_read( t->cur_event->papi_event, values ); assert( retval == PAPI_OK ); if ( retval == PAPI_OK ) { cycles_this_slice = ( t->cur_event->pi.event_type == SCALE_EVENT ) ? values[0] : values[1]; } else { values[0] = values[1] = 0; cycles_this_slice = 0; } } else { values[0] = values[1] = 0; cycles_this_slice = 0; } /* Make all events in this set active, and for those * already active, get the current count and cycles. */ for ( i = 0; i < mpx_events->num_events; i++ ) { MasterEvent *mev = mpx_events->mev[i]; if ( mev->active++ ) { mpx_events->start_values[i] = mev->count_estimate; mpx_events->start_hc[i] = mev->cycles; /* If this happens to be the currently-running * event, add in the current amounts from this * time slice. If it's a rate, though, don't * bother since the event might not have been * running long enough to get an accurate count. 
*/ if ( t->cur_event && !( t->cur_event->is_a_rate ) ) { #ifdef MPX_NONDECR_HYBRID if ( mev != t->cur_event ) { /* This event is not running this slice */ mpx_events->start_values[i] += ( long long ) ( mev->rate_estimate * ( cycles_this_slice + t->total_c - mev->prev_total_c ) ); } else { /* The event is running, use current value + estimate */ if ( cycles_this_slice >= MPX_MINCYC ) mpx_events->start_values[i] += values[0] + ( long long ) ( ( values[0] / ( double ) cycles_this_slice ) * ( t->total_c - mev->prev_total_c ) ); else /* Use the previous rate if the event has run for too short a time */ mpx_events->start_values[i] += values[0] + ( long long ) ( mev->rate_estimate * ( t->total_c - mev->prev_total_c ) ); } #endif } else { mpx_events->start_values[i] = mev->count; } } else { /* The = 0 isn't actually necessary; we only need * to sync up the mpx event to the master event, * but it seems safe to set the mev to 0 here, and * that gives us a chance to avoid (very unlikely) * rollover problems for events used repeatedly over * a long time. */ mpx_events->start_values[i] = 0; mpx_events->stop_values[i] = 0; mpx_events->start_hc[i] = mev->cycles = 0; mev->count_estimate = 0; mev->rate_estimate = 0.0; mev->prev_total_c = current_thread_mpx_c; mev->count = 0; } /* Adjust start value to include events and cycles * counted previously for this event set. */ } mpx_events->status = MPX_RUNNING; /* Start first counter if one isn't already running */ if ( t->cur_event == NULL ) { /* Pick an event at random to start. */ int index = ( rand_r( &randomseed ) % mpx_events->num_events ); t->cur_event = mpx_events->mev[index]; t->total_c = 0; t->cur_event->prev_total_c = 0; mpx_events->start_c = 0; retval = PAPI_start( mpx_events->mev[index]->papi_event ); assert( retval == PAPI_OK ); } else { /* If an event is already running, record the starting cycle * count for mpx_events, which is the accumulated cycle count * for the master event set plus the cycles for this time * slice.
*/ mpx_events->start_c = t->total_c + cycles_this_slice; } #if defined(DEBUG) if ( ISLEVEL( DEBUG_MULTIPLEX ) ) { MPXDBG( "%s:%d:: start_c=%lld thread->total_c=%lld\n", __FILE__, __LINE__, mpx_events->start_c, t->total_c ); for ( i = 0; i < mpx_events->num_events; i++ ) { MPXDBG ( "%s:%d:: start_values[%d]=%lld estimate=%lld rate=%g last active=%lld\n", __FILE__, __LINE__, i, mpx_events->start_values[i], mpx_events->mev[i]->count_estimate, mpx_events->mev[i]->rate_estimate, mpx_events->mev[i]->prev_total_c ); } } #endif mpx_release( ); retval = mpx_startup_itimer( ); return retval; } int MPX_read( MPX_EventSet * mpx_events, long long *values, int called_by_stop ) { int i; int retval; long long last_value[2]; long long cycles_this_slice = 0; MasterEvent *cur_event; Threadlist *thread_data; if ( mpx_events->status == MPX_RUNNING ) { /* Hold timer interrupts while we read values */ mpx_hold( ); thread_data = mpx_events->mythr; cur_event = thread_data->cur_event; retval = PAPI_read( cur_event->papi_event, last_value ); if ( retval != PAPI_OK ) return retval; cycles_this_slice = ( cur_event->pi.event_type == SCALE_EVENT ) ? last_value[0] : last_value[1]; /* Save the current counter values and get * the lastest data for the current event */ for ( i = 0; i < mpx_events->num_events; i++ ) { MasterEvent *mev = mpx_events->mev[i]; if ( !( mev->is_a_rate ) ) { mpx_events->stop_values[i] = mev->count_estimate; } else { mpx_events->stop_values[i] = mev->count; } #ifdef MPX_NONDECR_HYBRID /* If we are called from MPX_stop() then */ /* adjust the final values based on the */ /* cycles elapsed since the last read */ /* otherwise, don't do this as it can cause */ /* decreasing values if read is called again */ /* before another sample happens. 
*/ if (called_by_stop) { /* Extrapolate data up to the current time * only if it's not a rate measurement */ if ( !( mev->is_a_rate ) ) { if ( mev != thread_data->cur_event ) { mpx_events->stop_values[i] += ( long long ) ( mev->rate_estimate * ( cycles_this_slice + thread_data->total_c - mev->prev_total_c ) ); MPXDBG ( "%s:%d:: Inactive %d, stop values=%lld (est. %lld, rate %g, cycles %lld)\n", __FILE__, __LINE__, i, mpx_events->stop_values[i], mev->count_estimate, mev->rate_estimate, cycles_this_slice + thread_data->total_c - mev->prev_total_c ); } else { mpx_events->stop_values[i] += last_value[0] + ( long long ) ( mev->rate_estimate * ( thread_data->total_c - mev->prev_total_c ) ); MPXDBG ( "%s:%d:: -Active- %d, stop values=%lld (est. %lld, rate %g, cycles %lld)\n", __FILE__, __LINE__, i, mpx_events->stop_values[i], mev->count_estimate, mev->rate_estimate, thread_data->total_c - mev->prev_total_c ); } } } #endif } mpx_events->stop_c = thread_data->total_c + cycles_this_slice; /* Restore the interrupt */ mpx_release( ); } /* Store the values in user array. */ for ( i = 0; i < mpx_events->num_events; i++ ) { MasterEvent *mev = mpx_events->mev[i]; long long elapsed_slices = 0; long long elapsed_values = mpx_events->stop_values[i] - mpx_events->start_values[i]; /* For rates, cycles contains the number of measurements, * not the number of cycles, so just divide to compute * an average value. This assumes that the rate was * constant over the whole measurement period. */ values[i] = elapsed_values; if ( mev->is_a_rate ) { /* Handler counts */ elapsed_slices = mev->cycles - mpx_events->start_hc[i]; values[i] = elapsed_slices ? ( elapsed_values / elapsed_slices ) : 0; } MPXDBG( "%s:%d:: event %d, values=%lld ( %lld - %lld), cycles %lld\n", __FILE__, __LINE__, i, elapsed_values, mpx_events->stop_values[i], mpx_events->start_values[i], mev->is_a_rate ? 
elapsed_slices : 0 ); } return PAPI_OK; } int MPX_reset( MPX_EventSet * mpx_events ) { int i, retval; long long values[PAPI_MAX_SW_MPX_EVENTS]; /* Get the current values from MPX_read */ retval = MPX_read( mpx_events, values, 0 ); if ( retval != PAPI_OK ) return retval; /* Disable timer interrupt */ mpx_hold( ); /* Make counters read zero by setting the start values * to the current counter values. */ for ( i = 0; i < mpx_events->num_events; i++ ) { MasterEvent *mev = mpx_events->mev[i]; if ( mev->is_a_rate ) { mpx_events->start_values[i] = mev->count; } else { mpx_events->start_values[i] += values[i]; } mpx_events->start_hc[i] = mev->cycles; } /* Set the start time for this set to the current cycle count */ mpx_events->start_c = mpx_events->stop_c; /* Restart the interrupt */ mpx_release( ); return PAPI_OK; } int MPX_stop( MPX_EventSet * mpx_events, long long *values ) { int i, cur_mpx_event; int retval = PAPI_OK; long long dummy_value[2]; long long dummy_mpx_values[PAPI_MAX_SW_MPX_EVENTS]; /* long long cycles_this_slice, total_cycles; */ MasterEvent *cur_event = NULL, *head; Threadlist *thr = NULL; if ( mpx_events == NULL ) return PAPI_EINVAL; if ( mpx_events->status != MPX_RUNNING ) return PAPI_ENOTRUN; /* Read the counter values, this updates mpx_events->stop_values[] */ MPXDBG( "Start\n" ); if ( values == NULL ) retval = MPX_read( mpx_events, dummy_mpx_values, 1 ); else retval = MPX_read( mpx_events, values, 1 ); /* Block timer interrupts while modifying active events */ mpx_hold( ); /* Get the master event list for this thread. */ head = get_my_threads_master_event_list( ); if (!head) { retval=PAPI_EBUG; goto exit_mpx_stop; } /* Get this threads data structure */ thr = head->mythr; cur_event = thr->cur_event; /* This would be a good spot to "hold" the counter and then restart * it at the end, but PAPI_start resets counters so it is not possible */ /* Run through all the events decrement their activity counters. 
*/ cur_mpx_event = -1; for ( i = 0; i < mpx_events->num_events; i++ ) { --mpx_events->mev[i]->active; if ( mpx_events->mev[i] == cur_event ) cur_mpx_event = i; } /* One event in this set is currently running, if this was the * last active event set using this event, we need to start the next * event if there still is one left in the queue */ if ( cur_mpx_event > -1 ) { MasterEvent *tmp, *mev = mpx_events->mev[cur_mpx_event]; if ( mev->active == 0 ) { /* Event is now inactive; stop it * There is no need to update master event set * counters as this is the last active user */ retval = PAPI_stop( mev->papi_event, dummy_value ); mev->rate_estimate = 0.0; /* Fall-back value if none is found */ thr->cur_event = NULL; /* Now find a new cur_event */ for ( tmp = ( cur_event->next == NULL ) ? head : cur_event->next; tmp != cur_event; tmp = ( tmp->next == NULL ) ? head : tmp->next ) { if ( tmp->active ) { /* Found the next one to start */ thr->cur_event = tmp; break; } } if ( thr->cur_event != NULL ) { retval = PAPI_start( thr->cur_event->papi_event ); assert( retval == PAPI_OK ); } else { mpx_shutdown_itimer( ); } } } mpx_events->status = MPX_STOPPED; exit_mpx_stop: MPXDBG( "End\n" ); /* Restore the timer (for other event sets that may be running) */ mpx_release( ); return retval; } int MPX_cleanup( MPX_EventSet ** mpx_events ) { #ifdef PTHREADS int retval; #endif if ( mpx_events == NULL ) return PAPI_EINVAL; if ( *mpx_events == NULL ) return PAPI_OK; if (( *mpx_events )->status == MPX_RUNNING ) return PAPI_EINVAL; mpx_hold( ); /* Remove master events from this event set and from * the master list, if necessary. 
*/ mpx_delete_events( *mpx_events ); mpx_release( ); /* Free all the memory */ papi_free( *mpx_events ); *mpx_events = NULL; return PAPI_OK; } void MPX_shutdown( void ) { MPXDBG( "%d\n", getpid( ) ); mpx_shutdown_itimer( ); mpx_restore_signal( ); if ( tlist ) { Threadlist *next,*t=tlist; while(t!=NULL) { next=t->next; papi_free( t ); t = next; } tlist = NULL; } } int mpx_check( int EventSet ) { /* Currently, there is only the need for one mpx check: if * running on POWER6/perfctr platform, the domain must * include user, kernel, and supervisor, since the scale * event uses the dedicated counter #6, PM_RUN_CYC, which * cannot be controlled on a domain level. */ EventSetInfo_t *ESI = _papi_hwi_lookup_EventSet( EventSet ); if (ESI==NULL) return PAPI_EBUG; if ( strstr( _papi_hwd[ESI->CmpIdx]->cmp_info.name, "perfctr.c" ) == NULL ) return PAPI_OK; if ( strcmp( _papi_hwi_system_info.hw_info.model_string, "POWER6" ) == 0 ) { unsigned int chk_domain = PAPI_DOM_USER + PAPI_DOM_KERNEL + PAPI_DOM_SUPERVISOR; if ( ( ESI->domain.domain & chk_domain ) != chk_domain ) { PAPIERROR ( "This platform requires PAPI_DOM_USER+PAPI_DOM_KERNEL+PAPI_DOM_SUPERVISOR\n" "to be set in the domain when using multiplexing. Instead, found %#x\n", ESI->domain.domain ); return ( PAPI_EINVAL_DOM ); } } return PAPI_OK; } int mpx_init( int interval_ns ) { #if defined(PTHREADS) || defined(_POWER6) int retval; #endif #ifdef _POWER6 retval = PAPI_event_name_to_code( "PM_RUN_CYC", &_PNE_PM_RUN_CYC ); if ( retval != PAPI_OK ) return ( retval ); #endif tlist = NULL; mpx_hold( ); mpx_shutdown_itimer( ); mpx_init_timers( interval_ns / 1000 ); return ( PAPI_OK ); } /** Inserts a list of events into the master event list, and adds new mev pointers to the MPX_EventSet. 
MUST BE CALLED WITH THE TIMER INTERRUPT DISABLED */ static int mpx_insert_events( MPX_EventSet *mpx_events, int *event_list, int num_events, int domain, int granularity ) { int i, retval = 0, num_events_success = 0; MasterEvent *mev; PAPI_option_t options; MasterEvent **head = &mpx_events->mythr->head; MPXDBG("Inserting %p %d\n",mpx_events,mpx_events->num_events ); /* Make sure we don't overrun our buffers */ if (mpx_events->num_events + num_events > PAPI_MAX_SW_MPX_EVENTS) { return PAPI_ECOUNT; } /* For each event, see if there is already a corresponding * event in the master set for this thread. If not, add it. */ for ( i = 0; i < num_events; i++ ) { /* Look for a matching event in the master list */ for( mev = *head; mev != NULL; mev = mev->next ) { if ( (mev->pi.event_type == event_list[i]) && (mev->pi.domain == domain) && (mev->pi.granularity == granularity )) break; } /* No matching event in the list; add a new one */ if ( mev == NULL ) { mev = (MasterEvent *) papi_malloc( sizeof ( MasterEvent ) ); if ( mev == NULL ) { return PAPI_ENOMEM; } mev->pi.event_type = event_list[i]; mev->pi.domain = domain; mev->pi.granularity = granularity; mev->uses = mev->active = 0; mev->prev_total_c = mev->count = mev->cycles = 0; mev->rate_estimate = 0.0; mev->count_estimate = 0; mev->is_a_rate = 0; mev->papi_event = PAPI_NULL; retval = PAPI_create_eventset( &( mev->papi_event ) ); if ( retval != PAPI_OK ) { MPXDBG( "Event %d could not be counted.\n", event_list[i] ); goto bail; } retval = PAPI_add_event( mev->papi_event, event_list[i] ); if ( retval != PAPI_OK ) { MPXDBG( "Event %d could not be counted.\n", event_list[i] ); goto bail; } /* Always count total cycles so we can scale results. * If user just requested cycles, * don't add that event again. 
*/ if ( event_list[i] != SCALE_EVENT ) { retval = PAPI_add_event( mev->papi_event, SCALE_EVENT ); if ( retval != PAPI_OK ) { MPXDBG( "Scale event could not be counted " "at the same time.\n" ); goto bail; } } /* Set the options for the event set */ memset( &options, 0x0, sizeof ( options ) ); options.domain.eventset = mev->papi_event; options.domain.domain = domain; retval = PAPI_set_opt( PAPI_DOMAIN, &options ); if ( retval != PAPI_OK ) { MPXDBG( "PAPI_set_opt(PAPI_DOMAIN, ...) = %d\n", retval ); goto bail; } memset( &options, 0x0, sizeof ( options ) ); options.granularity.eventset = mev->papi_event; options.granularity.granularity = granularity; retval = PAPI_set_opt( PAPI_GRANUL, &options ); if ( retval != PAPI_OK ) { if ( retval != PAPI_ECMP ) { /* ignore component errors because they typically mean "not supported by the component" */ MPXDBG( "PAPI_set_opt(PAPI_GRANUL, ...) = %d\n", retval ); goto bail; } } /* Chain the event set into the * master list of event sets used in * multiplexing. */ mev->next = *head; *head = mev; } /* If we created a new event set, or we found a matching * eventset already in the list, then add the pointer in * the master list to this threads list. Then we bump the * number of successfully added events. 
*/ MPXDBG("Inserting now %p %d\n",mpx_events,mpx_events->num_events ); mpx_events->mev[mpx_events->num_events + num_events_success] = mev; mpx_events->mev[mpx_events->num_events + num_events_success]->uses++; num_events_success++; } /* Always be sure the head master event points to the thread */ if ( *head != NULL ) { ( *head )->mythr = mpx_events->mythr; } MPXDBG( "%d of %d events were added.\n", num_events_success, num_events ); mpx_events->num_events += num_events_success; return ( PAPI_OK ); bail: /* If there is a current mev, it is currently not linked into the list * of multiplexing events, so we can just delete that */ if ( mev && mev->papi_event ) { if (PAPI_cleanup_eventset( mev->papi_event )!=PAPI_OK) { PAPIERROR("Cleanup eventset\n"); } if (PAPI_destroy_eventset( &( mev->papi_event )) !=PAPI_OK) { PAPIERROR("Destroy eventset\n"); } } if ( mev ) papi_free( mev ); mev = NULL; /* Decrease the usage count of events */ for ( i = 0; i < num_events_success; i++ ) { mpx_events->mev[mpx_events->num_events + i]->uses--; } /* Run the garbage collector to remove unused events */ if ( num_events_success ) mpx_remove_unused( head ); return ( retval ); } /** Remove events from an mpx event set (and from the * master event set for this thread, if the events are unused). * MUST BE CALLED WITH THE SIGNAL HANDLER DISABLED */ static void mpx_delete_events( MPX_EventSet * mpx_events ) { int i; MasterEvent *mev; /* First decrement the reference counter for each master * event in this event set, then see if the master events * can be deleted. */ for ( i = 0; i < mpx_events->num_events; i++ ) { mev = mpx_events->mev[i]; --mev->uses; mpx_events->mev[i] = NULL; /* If it's no longer used, it should not be active! */ assert( mev->uses || !( mev->active ) ); } mpx_events->num_events = 0; mpx_remove_unused( &mpx_events->mythr->head ); } /** Remove one event from an mpx event set (and from the * master event set for this thread, if the events are unused). 
* MUST BE CALLED WITH THE SIGNAL HANDLER DISABLED */ static void mpx_delete_one_event( MPX_EventSet * mpx_events, int Event ) { int i; MasterEvent *mev; /* First decrement the reference counter for each master * event in this event set, then see if the master events * can be deleted. */ for ( i = 0; i < mpx_events->num_events; i++ ) { mev = mpx_events->mev[i]; if ( mev->pi.event_type == Event ) { --mev->uses; mpx_events->num_events--; mpx_events->mev[i] = NULL; /* If it's no longer used, it should not be active! */ assert( mev->uses || !( mev->active ) ); break; } } /* If we removed an event that is not last in the list we * need to compact the event list */ for ( ; i < mpx_events->num_events; i++ ) { mpx_events->mev[i] = mpx_events->mev[i + 1]; mpx_events->start_values[i] = mpx_events->start_values[i + 1]; mpx_events->stop_values[i] = mpx_events->stop_values[i + 1]; mpx_events->start_hc[i] = mpx_events->start_hc[i + 1]; } mpx_events->mev[i] = NULL; mpx_remove_unused( &mpx_events->mythr->head ); } /** Remove events that are not used any longer from the run * list of events to multiplex by the handler * MUST BE CALLED WITH THE SIGNAL HANDLER DISABLED */ static void mpx_remove_unused( MasterEvent ** head ) { MasterEvent *mev, *lastmev = NULL, *nextmev; Threadlist *thr = ( *head == NULL ) ? NULL : ( *head )->mythr; int retval; /* Clean up and remove unused master events. 
 */
	for ( mev = *head; mev != NULL; mev = nextmev ) {
		nextmev = mev->next;	/* get link before mev is freed */
		if ( !mev->uses ) {
			if ( lastmev == NULL ) {	/* this was the head event */
				*head = nextmev;
			} else {
				lastmev->next = nextmev;
			}
			retval = PAPI_cleanup_eventset( mev->papi_event );
			if ( retval != PAPI_OK )
				PAPIERROR( "Error cleaning up event\n" );
			retval = PAPI_destroy_eventset( &( mev->papi_event ) );
			if ( retval != PAPI_OK )
				PAPIERROR( "Error destroying event\n" );
			papi_free( mev );
		} else {
			lastmev = mev;
		}
	}

	/* Always be sure the head master event points to the thread */
	if ( *head != NULL ) {
		( *head )->mythr = thr;
	}
}

papi-papi-7-2-0-t/src/sw_multiplex.h

#ifndef MULTIPLEX_H
#define MULTIPLEX_H

#define PAPI_MAX_SW_MPX_EVENTS 32

/* Structure contained in the EventSet structure that
   holds information about multiplexing. */

typedef enum { MPX_STOPPED, MPX_RUNNING } MPX_status;

/** Structure contained in the EventSet structure that
	holds information about multiplexing.
	@internal */
typedef struct _MPX_EventSet {
	MPX_status status;
	/** Pointer to this thread's structure */
	struct _threadlist *mythr;
	/** Pointers to this EventSet's MPX entries in the master list for this thread */
	struct _masterevent *(mev[PAPI_MAX_SW_MPX_EVENTS]);
	/** Number of entries in above list */
	int num_events;
	/** Cycle totals recorded when this event set was started and stopped,
		used to scale the multiplexed estimates
 */
	long long start_c, stop_c;
	long long start_values[PAPI_MAX_SW_MPX_EVENTS];
	long long stop_values[PAPI_MAX_SW_MPX_EVENTS];
	long long start_hc[PAPI_MAX_SW_MPX_EVENTS];
} MPX_EventSet;

typedef struct EventSetMultiplexInfo {
	MPX_EventSet *mpx_evset;
	int ns;
	int flags;
} EventSetMultiplexInfo_t;

int mpx_check( int EventSet );
int mpx_init( int );
int mpx_add_event( MPX_EventSet **, int EventCode, int domain, int granularity );
int mpx_remove_event( MPX_EventSet **, int EventCode );
int MPX_add_events( MPX_EventSet ** mpx_events, int *event_list, int num_events, int domain, int granularity );
int MPX_stop( MPX_EventSet * mpx_events, long long *values );
int MPX_cleanup( MPX_EventSet ** mpx_events );
void MPX_shutdown( void );
int MPX_reset( MPX_EventSet * mpx_events );
int MPX_read( MPX_EventSet * mpx_events, long long *values, int called_by_stop );
int MPX_start( MPX_EventSet * mpx_events );

#endif /* MULTIPLEX_H */

papi-papi-7-2-0-t/src/testlib/Makefile

# File: testlib/Makefile

include Makefile.target

INCLUDE = -I. -I.. 
TESTLIBOBJS:= test_utils.o UTILOBJS:= do_loops.o test_utils.o clockcore.o ifeq ($(ENABLE_FORTRAN),yes) UTILOBJS+= ftests_util.o TESTLIBOBJS+= ftests_util.o endif all: libtestlib.a $(UTILOBJS) libtestlib.a: $(TESTLIBOBJS) $(AR) $(ARG64) rv $@ $(TESTLIBOBJS) do_loops.o: do_loops.c papi_test.h do_loops.h $(CC) $(INCLUDE) $(CFLAGS) -O0 -c do_loops.c # $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) -c do_loops.c clockcore.o: clockcore.c $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) -c clockcore.c test_utils.o: test_utils.c $(CC) $(INCLUDE) $(CFLAGS) $(TOPTFLAGS) -c test_utils.c ftests_util.o: ftests_util.F fpapi_test.h $(F77) $(INCLUDE) $(FFLAGS) $(FTOPTFLAGS) -c ftests_util.F clean: rm -f *.o *genmod.f90 *genmod.mod *.stderr *.stdout core *~ $(ALL) libtestlib.a libtestlib.so distclean: clean rm -f Makefile.target install: @echo "Papi testlib (DATADIR) being installed in: \"$(DATADIR)\""; -mkdir -p $(DATADIR)/testlib -chmod go+rx $(DATADIR) -chmod go+rx $(DATADIR)/testlib -find . -name "*.[chaF]" -type f -exec cp {} $(DATADIR)/testlib \; -cp Makefile.target $(DATADIR)/testlib/Makefile papi-papi-7-2-0-t/src/testlib/Makefile.target.in000066400000000000000000000004651502707512200214340ustar00rootroot00000000000000PACKAGE_TARNAME = @PACKAGE_TARNAME@ prefix = @prefix@ exec_prefix = @exec_prefix@ datarootdir = @datarootdir@ datadir = @datadir@/${PACKAGE_TARNAME} DATADIR = $(DESTDIR)$(datadir) INCLUDE = -I. 
-I@includedir@ CC = @CC@ F77 = @F77@ CC_R = @CC_R@ CFLAGS = @CFLAGS@ @TOPTFLAGS@ ENABLE_FORTRAN = @ENABLE_FORTRAN@ papi-papi-7-2-0-t/src/testlib/clockcore.c000066400000000000000000000057571502707512200202230ustar00rootroot00000000000000#include #include #include #include "papi.h" #include "clockcore.h" #define NUM_ITERS 1000000 static char *func_name[] = { "PAPI_get_real_cyc", "PAPI_get_real_usec", "PAPI_get_virt_cyc", "PAPI_get_virt_usec" }; static int CLOCK_ERROR = 0; static int clock_res_check( int flag, int quiet ) { if ( CLOCK_ERROR ) { return -1; } long long *elapsed_cyc, total_cyc = 0, uniq_cyc = 0, diff_cyc = 0; int i; double min, max, average, std, tmp; elapsed_cyc = ( long long * ) calloc( NUM_ITERS, sizeof ( long long ) ); /* Real */ switch ( flag ) { case 0: for ( i = 0; i < NUM_ITERS; i++ ) elapsed_cyc[i] = ( long long ) PAPI_get_real_cyc( ); break; case 1: for ( i = 0; i < NUM_ITERS; i++ ) elapsed_cyc[i] = ( long long ) PAPI_get_real_usec( ); break; case 2: for ( i = 0; i < NUM_ITERS; i++ ) elapsed_cyc[i] = ( long long ) PAPI_get_virt_cyc( ); break; case 3: for ( i = 0; i < NUM_ITERS; i++ ) elapsed_cyc[i] = ( long long ) PAPI_get_virt_usec( ); break; default: free(elapsed_cyc); return -1; } min = max = ( double ) ( elapsed_cyc[1] - elapsed_cyc[0] ); for ( i = 1; i < NUM_ITERS; i++ ) { if ( elapsed_cyc[i] - elapsed_cyc[i - 1] < 0 ) { CLOCK_ERROR = 1; fprintf(stderr,"Error! 
Negative elapsed time\n"); free( elapsed_cyc ); return -1; } diff_cyc = elapsed_cyc[i] - elapsed_cyc[i - 1]; if ( min > diff_cyc ) min = ( double ) diff_cyc; if ( max < diff_cyc ) max = ( double ) diff_cyc; if ( diff_cyc != 0 ) uniq_cyc++; total_cyc += diff_cyc; } average = ( double ) total_cyc / ( NUM_ITERS - 1 ); std = 0; for ( i = 1; i < NUM_ITERS; i++ ) { tmp = ( double ) ( elapsed_cyc[i] - elapsed_cyc[i - 1] ); tmp = tmp - average; std += tmp * tmp; } if ( !quiet ) { std = sqrt( std / ( NUM_ITERS - 2 ) ); printf( "%s: min %.3lf max %.3lf \n", func_name[flag], min, max ); printf( " average %.3lf std %.3lf\n", average, std ); if ( uniq_cyc == NUM_ITERS - 1 ) { printf( "%s : %7.3f <%7.3f\n", func_name[flag], ( double ) total_cyc / ( double ) ( NUM_ITERS ), ( double ) total_cyc / ( double ) uniq_cyc ); } else if ( uniq_cyc ) { printf( "%s : %7.3f %7.3f\n", func_name[flag], ( double ) total_cyc / ( double ) ( NUM_ITERS ), ( double ) total_cyc / ( double ) uniq_cyc ); } else { printf( "%s : %7.3f >%7.3f\n", func_name[flag], ( double ) total_cyc / ( double ) ( NUM_ITERS ), ( double ) total_cyc ); } } free( elapsed_cyc ); return PAPI_OK; } int clockcore( int quiet ) { /* check PAPI_get_real_cyc */ clock_res_check( 0, quiet ); /* check PAPI_get_real_usec */ clock_res_check( 1, quiet ); /* check PAPI_get_virt_cyc */ /* Virtual */ if ( PAPI_get_virt_cyc( ) != -1 ) { clock_res_check( 2, quiet ); } else { return CLOCKCORE_VIRT_CYC_FAIL; } /* check PAPI_get_virt_usec */ if ( PAPI_get_virt_usec( ) != -1 ) { clock_res_check( 3, quiet ); } else { return CLOCKCORE_VIRT_USEC_FAIL; } return PAPI_OK; } papi-papi-7-2-0-t/src/testlib/clockcore.h000066400000000000000000000001451502707512200202120ustar00rootroot00000000000000#define CLOCKCORE_VIRT_CYC_FAIL -1 #define CLOCKCORE_VIRT_USEC_FAIL -2 int clockcore( int quiet ); papi-papi-7-2-0-t/src/testlib/do_loops.c000066400000000000000000000107771502707512200200730ustar00rootroot00000000000000/* Compile me with -O0 or else you'll get 
none. */ #include #include #include #include #include #include #include "do_loops.h" volatile int buf[CACHE_FLUSH_BUFFER_SIZE_INTS]; volatile int buf_dummy = 0; volatile int *flush = NULL; volatile int flush_dummy = 0; volatile double a = 0.5, b = 2.2; void do_reads( int n ) { int i, retval; static int fd = -1; char buf; if ( fd == -1 ) { fd = open( "/dev/zero", O_RDONLY ); if ( fd == -1 ) { perror( "open(/dev/zero)" ); exit( 1 ); } } for ( i = 0; i < n; i++ ) { retval = ( int ) read( fd, &buf, sizeof ( buf ) ); if ( retval != sizeof ( buf ) ) { if ( retval < 0 ) perror( "/dev/zero cannot be read" ); else fprintf( stderr, "/dev/zero cannot be read: only got %d bytes.\n", retval ); exit( 1 ); } } } void fdo_reads( int *n ) { do_reads( *n ); } void fdo_reads_( int *n ) { do_reads( *n ); } void fdo_reads__( int *n ) { do_reads( *n ); } void FDO_READS( int *n ) { do_reads( *n ); } void _FDO_READS( int *n ) { do_reads( *n ); } void do_flops( int n ) { int i; double c = 0.11; for ( i = 0; i < n; i++ ) { c += a * b; } dummy( ( void * ) &c ); } void fdo_flops( int *n ) { do_flops( *n ); } void fdo_flops_( int *n ) { do_flops( *n ); } void fdo_flops__( int *n ) { do_flops( *n ); } void FDO_FLOPS( int *n ) { do_flops( *n ); } void _FDO_FLOPS( int *n ) { do_flops( *n ); } void do_misses( int n, int bytes ) { register int i, j, tmp = buf_dummy, len = bytes / ( int ) sizeof ( int ); dummy( ( void * ) buf ); dummy( ( void * ) &buf_dummy ); assert( len <= CACHE_FLUSH_BUFFER_SIZE_INTS ); for ( j = 0; j < n; j++ ) { for ( i = 0; i < len; i++ ) { /* We need to read, modify, write here to look out for the write allocate policies. 
*/ buf[i] += tmp; /* Fake out some naive prefetchers */ buf[len - 1 - i] -= tmp; } tmp += len; } buf_dummy = tmp; dummy( ( void * ) buf ); dummy( ( void * ) &buf_dummy ); } void fdo_misses( int *n, int *size ) { do_misses( *n, *size ); } void fdo_misses_( int *n, int *size ) { do_misses( *n, *size ); } void fdo_misses__( int *n, int *size ) { do_misses( *n, *size ); } void FDO_MISSES( int *n, int *size ) { do_misses( *n, *size ); } void _FDO_MISSES( int *n, int *size ) { do_misses( *n, *size ); } void do_flush( void ) { register int i; if ( flush == NULL ) flush = ( int * ) malloc( ( 1024 * 1024 * 16 ) * sizeof ( int ) ); if ( !flush ) return; dummy( ( void * ) flush ); for ( i = 0; i < ( 1024 * 1024 * 16 ); i++ ) { flush[i] += flush_dummy; } flush_dummy++; dummy( ( void * ) flush ); dummy( ( void * ) &flush_dummy ); } void fdo_flush( void ) { do_flush( ); } void fdo_flush_( void ) { do_flush( ); } void fdo_flush__( void ) { do_flush( ); } void FDO_FLUSH( void ) { do_flush( ); } void _FDO_FLUSH( void ) { do_flush( ); } void do_l1misses( int n ) { do_misses( n, L1_MISS_BUFFER_SIZE_INTS ); } void fdo_l1misses( int *n ) { do_l1misses( *n ); } void fdo_l1misses_( int *n ) { do_l1misses( *n ); } void fdo_l1misses__( int *n ) { do_l1misses( *n ); } void FDO_L1MISSES( int *n ) { do_l1misses( *n ); } void _FDO_L1MISSES( int *n ) { do_l1misses( *n ); } void do_stuff( void ) { static int loops = 0; if ( loops == 0 ) { struct timeval now, then; gettimeofday( &then, NULL ); do { do_flops( NUM_FLOPS ); do_reads( NUM_READS ); do_misses( 1, 1024 * 1024 ); gettimeofday( &now, NULL ); loops++; } while ( now.tv_sec - then.tv_sec < NUM_WORK_SECONDS ); } else { int i = 0; do { do_flops( NUM_FLOPS ); do_reads( NUM_READS ); do_misses( 1, 1024 * 1024 ); i++; } while ( i < loops ); } } void do_stuff_( void ) { do_stuff( ); } void do_stuff__( void ) { do_stuff( ); } void DO_STUFF( void ) { do_stuff( ); } void _DO_STUFF( void ) { do_stuff( ); } void dummy( void *array ) { /* Confuse the 
compiler so as not to optimize away the flops in the calling routine */ /* Cast the array as a void to eliminate unused argument warning */ ( void ) array; } void dummy_( void *array ) { ( void ) array; } void dummy__( void *array ) { ( void ) array; } void DUMMY( void *array ) { ( void ) array; } void _DUMMY( void *array ) { ( void ) array; } /* We have to actually touch the memory to confuse some * systems, so they actually allocate the memory. * -KSL */ void touch_dummy( double *array, int size ) { int i; double *tmp = array; for ( i = 0; i < size; i++, tmp++ ) *tmp = ( double ) rand( ); } papi-papi-7-2-0-t/src/testlib/do_loops.h000066400000000000000000000033171502707512200200700ustar00rootroot00000000000000#define NUM_WORK_SECONDS 2 #define NUM_FLOPS 20000000 #define NUM_MISSES 2000000 #define NUM_READS 20000 #define SUCCESS 1 #define FAILURE 0 #define MAX_THREADS 256 #define NUM_THREADS 4 #define NUM_ITERS 1000000 #define THRESHOLD 1000000 #define L1_MISS_BUFFER_SIZE_INTS 128*1024 #define CACHE_FLUSH_BUFFER_SIZE_INTS 16*1024*1024 #define TOLERANCE .2 #define OVR_TOLERANCE .75 #define MPX_TOLERANCE .20 #define TIME_LIMIT_IN_US 60*1000000 /* Run for about 1 minute or 60000000 us */ void do_reads( int n ); void fdo_reads( int *n ); void fdo_reads_( int *n ); void fdo_reads__( int *n ); void FDO_READS( int *n ); void _FDO_READS( int *n ); void do_flops( int n ); /* export the next symbol as 'end' address of do_flops for profiling */ void fdo_flops( int *n ); void fdo_flops_( int *n ); void fdo_flops__( int *n ); void FDO_FLOPS( int *n ); void _FDO_FLOPS( int *n ); void do_misses( int n, int bytes ); void fdo_misses( int *n, int *size ); void fdo_misses_( int *n, int *size ); void fdo_misses__( int *n, int *size ); void FDO_MISSES( int *n, int *size ); void _FDO_MISSES( int *n, int *size ); void do_flush( void ); void fdo_flush( void ); void fdo_flush_( void ); void fdo_flush__( void ); void FDO_FLUSH( void ); void _FDO_FLUSH( void ); void do_l1misses( int n ); 
void fdo_l1misses( int *n );
void fdo_l1misses_( int *n );
void fdo_l1misses__( int *n );
void FDO_L1MISSES( int *n );
void _FDO_L1MISSES( int *n );

void do_stuff( void );
void do_stuff_( void );
void do_stuff__( void );
void DO_STUFF( void );
void _DO_STUFF( void );

void dummy( void *array );
void dummy_( void *array );
void dummy__( void *array );
void DUMMY( void *array );
void _DUMMY( void *array );

void touch_dummy( double *array, int size );

papi-papi-7-2-0-t/src/testlib/fpapi_test.h

#include "fpapi.h"

#define SUCCESS 1
#define NUM_FLOPS 20000000

papi-papi-7-2-0-t/src/testlib/ftests_util.F

#include "fpapi_test.h"

      integer function get_quiet()
      implicit integer (p)
      character*25 chbuf
      integer retval
      integer quiet
      common quiet

C     This routine tests for a command line argument
C     that matches 'TESTS_QUIET'
C     If found, it returns 1 to set output to quiet
C     The routine was placed here to hide the ugly
C     Windows #if stuff from normal view
C     And also to make the test code read cleaner

      call getarg(1,chbuf)
      get_quiet = 0
      if (LGE(chbuf, 'TESTS_QUIET')) then
         get_quiet=1
      else
         call PAPIf_set_debug(PAPI_VERB_ECONT, retval)
         if ( retval.NE.PAPI_OK) then
            call ftest_fail(__FILE__, __LINE__,
     .           'PAPIf_set_debug', retval)
         end if
      end if
      quiet=get_quiet
      end

      integer function last_char(string)
      implicit integer (p)
      character*(*) string

      do last_char=len(string),1,-1
         if(string(last_char:last_char).NE.' 
') return end do last_char=0 end subroutine ftests_warning(line,msg) implicit integer (p) character*(*) msg integer line write(*,*) '**** WARNING message ****' call ftests_perror(line,msg) end subroutine ftests_fatal_error(line,msg) implicit integer (p) character*(*) msg integer line call ftests_perror(line,msg) call pause() stop end subroutine ftests_perror(line,msg) implicit integer (p) character*(*) msg integer line write(*,*) '**** Test error occurred ****' write(*,100) line,msg 100 format(t3,'Line # ',i5,':: ',a) end subroutine ftests_pass(test_str) implicit integer (p) character*(*) test_str write(*,100) test_str,' PASSED' call PAPIF_shutdown() call pause() stop 100 format(a,t41,a) end subroutine ftests_hl_pass(test_str) implicit integer (p) character*(*) test_str write(*,100) test_str,' PASSED' 100 format(a,t41,a) end subroutine ftest_fail(file, line, callstr, retval) implicit integer (p) character*(*) file integer line character*(*) callstr integer retval,ilen integer last_char external last_char if ( retval.eq.PAPI_ECMP .OR. retval.eq.PAPI_ENOEVNT & .OR. retval.eq.PAPI_ECNFLCT .OR. retval.eq.PAPI_EPERM ) then call ftest_skip(file, line, callstr, retval) end if if ( retval.eq.PAPI_ENOEVNT ) then call ftest_skip(file, line, callstr, retval) end if if ( retval.ne.0 )then write(*,100) file,' FAILED' else write(*,100) file,' SKIPPED' end if write(*,*) 'Line #', line if (retval.eq.PAPI_ESYS) then write(*,*) "System error in ", callstr else if (retval.gt.0) then write(*,*) "Error calculating: ", callstr else if(retval.eq.0)then write(*,*) 'SGI requires root permissions for this test' else C Just printing the error number because of difficulty getting error string. 
ilen=last_char(callstr) write(*,'(T2,3a,I3)') 'PAPI error in ', callstr(1:ilen), * ': ', retval end if call pause() stop 100 format(a,t41,a) end subroutine ftest_skip(file, line, callstr, retval) implicit integer (p) character*(*) file integer line character*(*) callstr integer retval,ilen integer quiet common quiet integer last_char external last_char write(*,100) file, ' SKIPPED' if (quiet .eq. 0) then write(*,*) 'Line #', line if(retval.eq.PAPI_ESYS) then write(*,*) "System error in ", callstr else if (retval.gt.0) then write(*,*) "Error calculating: ", callstr else C Just printing the error number because of difficulty getting error string. ilen=last_char(callstr) write(*,'(T2,3a,I3)') 'Error in ', callstr(1:ilen), * ': ', retval end if end if call pause() stop 100 format(a,t41,a) end subroutine stringify_domain(domain, str) implicit integer (p) integer domain character*(PAPI_MAX_STR_LEN) str integer idx if (domain .EQ. PAPI_DOM_ALL) then str = "PAPI_DOM_ALL" else idx = 1 C Is there a better way to write this? if (IAND(domain, PAPI_DOM_USER) .NE. 0) then str(idx:idx+13) = "+PAPI_DOM_USER" idx = idx + 14 end if if (IAND(domain, PAPI_DOM_KERNEL) .NE. 0) then str(idx:idx+15) = "+PAPI_DOM_KERNEL" idx = idx + 16 end if if (IAND(domain, PAPI_DOM_SUPERVISOR) .NE. 0) then str(idx:idx+19) = "+PAPI_DOM_SUPERVISOR" idx = idx + 20 end if if (IAND(domain, PAPI_DOM_OTHER) .NE. 0) then str(idx:idx+14) = "+PAPI_DOM_OTHER" idx = idx + 15 end if if (idx .NE. 1) then C Eliminate the first + character str = str(2:LEN(str)) C Blank-fill the rest of the string so last_char works correctly str(idx:) = ' ' else print *, 'error in stringify_domain' call pause() stop end if end if end subroutine stringify_granularity(granularity, str) implicit integer (p) integer granularity character*(PAPI_MAX_STR_LEN) str if (granularity .EQ. PAPI_GRN_THR) then str = "PAPI_GRN_THR" else if (granularity .EQ. PAPI_GRN_PROC) then str = "PAPI_GRN_PROC" else if (granularity .EQ. 
PAPI_GRN_PROCG) then str = "PAPI_GRN_PROCG" else if (granularity .EQ. PAPI_GRN_SYS_CPU) then str = "PAPI_GRN_SYS_CPU" else if (granularity .EQ. PAPI_GRN_SYS) then str = "PAPI_GRN_SYS" else print *, 'error in stringify_granularity' call pause() stop end if end C This routine provides a bottleneck C at the exit point of a program C For Windows, it links to a simple C routine C that prompts the user for a keypress C For every other platform it is a nop. subroutine pause() implicit integer (p) integer quiet common quiet end C Print the content of an event set subroutine PrintEventSet(ES) IMPLICIT integer (p) integer ES integer MAXEVENT parameter(MAXEVENT=64) integer n,i,codes(MAXEVENT),retval character*(PAPI_MAX_STR_LEN) name n=MAXEVENT call PAPIf_list_events(ES,codes,n,retval) if(n.gt.MAXEVENT) write(*,100) n,MAXEVENT 100 format(T1,'There are',i4,' events in the set. Can only print', & i4,'.') do i=1,min(n,MAXEVENT) name = ' ' call PAPIf_event_code_to_name(codes(i),name,retval) if(retval .EQ. PAPI_OK)then write(*,200) i,name else write(*,210) i,codes(i),'**Error looking up the event name**' end if end do 200 format(T1,i4,': ',a40) 210 format(T1,i4,' (',i12.11,'): ',a40) write(*,*) end subroutine init_multiplex() IMPLICIT integer (p) integer retval CHARACTER*(PAPI_MAX_STR_LEN) vstring, mstring INTEGER ncpu,nnodes,totalcpus,vendor,model REAL revision, mhz integer nchr,i C Get PAPI h/w info call PAPIf_get_hardware_info( ncpu, nnodes, totalcpus, vendor, . vstring, model, mstring, revision, mhz ) do i=len(mstring),1,-1 if(mstring(i:i).NE.' ') goto 10 end do 10 if(i.LT.1)then nchr=1 else nchr=i end if if ( mstring(1:nchr).EQ."POWER6") then C Setting domain to user+kernel+supervisor is really only necessary when C using PAPI multiplexing on POWER6/perfctr. But since the Fortran API C does not include access to component info, we'll just set the domain C in this manner for all components on POWER6. 
      call PAPIf_set_domain(PAPI_DOM_ALL, retval)
      if ( retval.NE.PAPI_OK) then
         call ftest_fail(__FILE__, __LINE__,
     .        'PAPIf_set_domain', retval)
      end if
      end if

      call PAPIf_multiplex_init(retval)
      if ( retval.NE.PAPI_OK) then
         call ftest_fail(__FILE__, __LINE__,
     &        'papif_multiplex_init', retval)
      end if
      end

papi-papi-7-2-0-t/src/testlib/papi_test.h

/* Standard headers for PAPI test applications.
   This file is customized to hide Windows / Unix differences. */

#ifdef __cplusplus
extern "C" {
#endif

//#if (!defined(NO_DLFCN) && !defined(_BGL) && !defined(_BGP))
//#include
//#endif
//#include
//#if !defined(__FreeBSD__) && !defined(__APPLE__)
//#include
//#endif

/* Masks to select operations for add_test_events()
   and remove_test_events()
   Mask value tells us what events to select. */

#define MASK_FP_OPS   0x80000
#define MASK_L1_DCA   0x40000	/* three new events for POWER4 */
#define MASK_L1_DCW   0x20000
#define MASK_L1_DCR   0x10000
#define MASK_TOT_IIS  0x04000	/* Try this if TOT_INS won't work */
#define MASK_BR_PRC   0x02000
#define MASK_BR_MSP   0x01000
#define MASK_BR_CN    0x00800
#define MASK_L2_TCH   0x00400
#define MASK_L2_TCA   0x00200
#define MASK_L2_TCM   0x00100
#define MASK_L1_DCM   0x00040
#define MASK_L1_ICM   0x00020
#define MASK_L1_TCM   0x00010
#define MASK_FP_INS   0x00004
#define MASK_TOT_INS  0x00002
#define MASK_TOT_CYC  0x00001

#define MAX_TEST_EVENTS 18

struct test_events_t {
	unsigned int mask;
	unsigned int event;
};

extern struct test_events_t test_events[];

/* Mark non-returning functions if the compiler supports GNU C extensions.
*/ #if defined(__GNUC__) #define PAPI_NORETURN __attribute__ ((__noreturn__)) #else #define PAPI_NORETURN #endif void validate_string(const char *name, char *s); void *get_overflow_address(void *context); void free_test_space(long long ** values, int num_tests); long long **allocate_test_space(int num_tests, int num_events); int add_test_events(int *number, int *mask, int allow_derived); int add_two_events(int *num_events, int *papi_event, int *mask); int add_two_nonderived_events(int *num_events, int *papi_event, int *mask); int add_test_events_r(int *number, int *mask, void *handle); int find_nonderived_event( void ); int enum_add_native_events(int *num_events, int **evtcodes, int need_interrupts, int no_software_events, int cidx); int remove_test_events(int *EventSet, int mask); char *stringify_domain(int domain); char *stringify_all_domains(int domains); char *stringify_granularity(int granularity); char *stringify_all_granularities(int granularities); int tests_quiet(int argc, char **argv); void PAPI_NORETURN test_pass(const char *filename); void PAPI_NORETURN test_hl_pass(const char *filename); void PAPI_NORETURN test_fail(const char *file, int line, const char *call, int retval); void PAPI_NORETURN test_skip(const char *file, int line, const char *call, int retval); void test_warn(const char *file, int line, const char *call, int retval); void test_print_event_header(const char *call, int evset); int approx_equals(double a, double b); /* Unix systems use %lld to display long long values Windows uses %I64d for the same purpose. Since these occur inside a quoted string, we must #define the entire format string. Below are several common forms of this string for both platforms. 
*/
#define ONEHDR " %12s"
#define TAB2HDR "%s %12s %12s\n"
#define TAB3HDR "%s %12s %12s %12s\n"
#define TAB4HDR "%s %12s %12s %12s %12s\n"
#define ONENUM " %12lld"
#define TAB1 "%-12s %12lld\n"
#define TAB2 "%-12s %12lld %12lld\n"
#define TAB3 "%-12s %12lld %12lld %12lld\n"
#define TAB4 "%-12s %12lld %12lld %12lld %12lld\n"
#define TAB5 "%-12s %12lld %12lld %12lld %12lld %12lld\n"
#define TWO12 "%12lld %12lld %s"
#define LLDFMT "%lld"
#define LLDFMT10 "%10lld"
#define LLDFMT12 "%12lld"
#define LLDFMT15 "%15lld"

extern int TESTS_QUIET;		/* Declared in test_utils.c */

#ifdef __cplusplus
}
#endif
papi-papi-7-2-0-t/src/testlib/test_utils.c000066400000000000000000000460451502707512200204510ustar00rootroot00000000000000
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include "papi.h"
#include "papi_test.h"

#define TOLERANCE .2

/* Variable to hold reporting status
   if TRUE, output is suppressed
   if FALSE output is sent to stdout
   initialized to FALSE
   declared here so it can be available globally
*/
int TESTS_QUIET = 0;
static int TESTS_COLOR = 1;
static int TEST_WARN = 0;

void
validate_string( const char *name, char *s )
{
	if ( ( s == NULL ) || ( strlen( s ) == 0 ) ) {
		char s2[1024] = "";
		sprintf( s2, "%s was NULL or length 0", name );
		test_fail( __FILE__, __LINE__, s2, 0 );
	}
}

int
approx_equals( double a, double b )
{
	if ( ( a >= b * ( 1.0 - TOLERANCE ) ) && ( a <= b * ( 1.0 + TOLERANCE ) ) )
		return 1;
	else {
		printf( "Out of tolerance range %2.2f: %.0f vs %.0f [%.0f,%.0f]\n",
			TOLERANCE, a, b,
			b * ( 1.0 - TOLERANCE ), b * ( 1.0 + TOLERANCE ) );
		return 0;
	}
}

long long **
allocate_test_space( int num_tests, int num_events )
{
	long long **values;
	int i;

	values = ( long long ** ) malloc( ( size_t ) num_tests *
					sizeof ( long long * ) );
	if ( values == NULL )
		exit( 1 );
	memset( values, 0x0, ( size_t ) num_tests * sizeof ( long long * ) );

	for ( i = 0; i < num_tests; i++ ) {
		values[i] = ( long long * ) malloc( ( size_t ) num_events *
					sizeof ( long long ) );
		if ( values[i] == NULL )
			exit( 1
); memset( values[i], 0x00, ( size_t ) num_events * sizeof ( long long ) ); } return ( values ); } void free_test_space( long long **values, int num_tests ) { int i; for ( i = 0; i < num_tests; i++ ) free( values[i] ); free( values ); } int is_event_derived(unsigned int event) { PAPI_event_info_t info; if (event & PAPI_PRESET_MASK) { PAPI_get_event_info(event,&info); if (strcmp(info.derived,"NOT_DERIVED")) { // printf("%#x is derived\n",event); return 1; } } return 0; } int find_nonderived_event( void ) { /* query and set up the right event to monitor */ PAPI_event_info_t info; int potential_evt_to_add[3] = { PAPI_FP_OPS, PAPI_FP_INS, PAPI_TOT_INS }; int i; for ( i = 0; i < 3; i++ ) { if ( PAPI_query_event( potential_evt_to_add[i] ) == PAPI_OK ) { if ( PAPI_get_event_info( potential_evt_to_add[i], &info ) == PAPI_OK ) { if ( ( info.count > 0 ) && !strcmp( info.derived, "NOT_DERIVED" ) ) return ( potential_evt_to_add[i] ); } } } return ( 0 ); } /* Add events to an EventSet, as specified by a mask. 
Returns: number = number of events added */ //struct test_events_t { // unsigned int mask; // unsigned int event; //}; struct test_events_t test_events[MAX_TEST_EVENTS] = { { MASK_TOT_CYC, PAPI_TOT_CYC }, { MASK_TOT_INS, PAPI_TOT_INS }, { MASK_FP_INS, PAPI_FP_INS }, { MASK_L1_TCM, PAPI_L1_TCM }, { MASK_L1_ICM, PAPI_L1_ICM }, { MASK_L1_DCM, PAPI_L1_DCM }, { MASK_L2_TCM, PAPI_L2_TCM }, { MASK_L2_TCA, PAPI_L2_TCA }, { MASK_L2_TCH, PAPI_L2_TCH }, { MASK_BR_CN, PAPI_BR_CN }, { MASK_BR_MSP, PAPI_BR_MSP }, { MASK_BR_PRC, PAPI_BR_PRC }, { MASK_TOT_IIS, PAPI_TOT_IIS}, { MASK_L1_DCR, PAPI_L1_DCR}, { MASK_L1_DCW, PAPI_L1_DCW}, { MASK_L1_DCA, PAPI_L1_DCA}, { MASK_FP_OPS, PAPI_FP_OPS}, }; int add_test_events( int *number, int *mask, int allow_derived ) { int retval,i; int EventSet = PAPI_NULL; char name_string[BUFSIZ]; *number = 0; /* create the eventset */ retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail(__FILE__,__LINE__,"Trouble creating eventset",retval); } /* check all the masks */ for(i=0;i 1 ) && ( ( strcasecmp( argv[1], "TESTS_QUIET" ) == 0 ) || ( strcasecmp( argv[1], "-q" ) == 0 ) ) ) { TESTS_QUIET = 1; } /* Always report PAPI errors when testing */ /* Even in quiet mode */ retval = PAPI_set_debug( PAPI_VERB_ECONT ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_set_debug", retval ); } value=getenv("TESTS_COLOR"); if (value!=NULL) { if (value[0]=='y') { TESTS_COLOR=1; } else { TESTS_COLOR=0; } } /* Disable colors if sending to a file */ if (!isatty(fileno(stdout))) { TESTS_COLOR=0; } return TESTS_QUIET; } #define RED "\033[1;31m" #define YELLOW "\033[1;33m" #define GREEN "\033[1;32m" #define NORMAL "\033[0m" static void print_spaces(int count) { int i; for(i=0;i 0 ) { fprintf( stdout, "Error: %s\n", call ); } else if ( retval == 0 ) { #if defined(sgi) fprintf( stdout, "SGI requires root permissions for this test\n" ); #else fprintf( stdout, "Error: %s\n", call ); #endif } else { fprintf( stdout, "Error in %s: %s\n", 
call, PAPI_strerror( retval ) ); } fprintf(stdout, "Some tests require special hardware, permissions, OS, compilers\n" "or library versions. PAPI may still function perfectly on your \n" "system without the particular feature being tested here. \n"); /* NOTE: Because test_fail is called from thread functions, calling PAPI_shutdown here could prevent some threads from being able to free memory they have allocated. */ if ( PAPI_is_initialized( ) ) { PAPI_shutdown( ); } /* This is stupid. Threads are the rare case */ /* and in any case an exit() should clear everything out */ /* adding back the exit() call */ exit(1); } /* Use a positive value of retval to simply print an error message */ void test_warn( const char *file, int line, const char *call, int retval ) { (void)file; // int line_pad; // line_pad=60-strlen(file); // if (line_pad<0) line_pad=0; char buf[128]; memset( buf, '\0', sizeof ( buf ) ); // fprintf(stdout,"%s",file); // print_spaces(line_pad); if (TEST_WARN==0) fprintf(stdout,"\n"); if (TESTS_COLOR) fprintf( stdout, "%s", YELLOW); fprintf( stdout, "WARNING "); if (TESTS_COLOR) fprintf( stdout, "%s", NORMAL); fprintf( stdout, "Line # %d ", line ); if ( retval == PAPI_ESYS ) { sprintf( buf, "System warning in %s", call ); perror( buf ); } else if ( retval > 0 ) { fprintf( stdout, "Warning: %s\n", call ); } else if ( retval == 0 ) { fprintf( stdout, "Warning: %s\n", call ); } else { fprintf( stdout, "Warning in %s: %s\n", call, PAPI_strerror( retval )); } TEST_WARN++; } void PAPI_NORETURN test_skip( const char *file, int line, const char *call, int retval ) { // int line_pad; (void)file; (void)line; (void)call; (void)retval; // line_pad=(60-strlen(file)); // fprintf(stdout,"%s",file); // print_spaces(line_pad); fprintf( stdout, "SKIPPED\n"); exit( 0 ); } void test_print_event_header( const char *call, int evset ) { int *ev_ids; int i, nev; int retval; char evname[PAPI_MAX_STR_LEN]; if ( *call ) fprintf( stdout, "%s", call ); if ((nev = 
PAPI_get_cmp_opt(PAPI_MAX_MPX_CTRS,NULL,0)) <= 0) {
		fprintf( stdout, "Can not list event names.\n" );
		return;
	}

	if ((ev_ids = calloc(nev,sizeof(int))) == NULL) {
		fprintf( stdout, "Can not list event names.\n" );
		return;
	}

	retval = PAPI_list_events( evset, ev_ids, &nev );
	if ( retval == PAPI_OK ) {
		for ( i = 0; i < nev; i++ ) {
			PAPI_event_code_to_name( ev_ids[i], evname );
			printf( ONEHDR, evname );
		}
	}
	else {
		fprintf( stdout, "Can not list event names." );
	}
	fprintf( stdout, "\n" );

	free(ev_ids);
}

int
add_two_events( int *num_events, int *papi_event, int *mask )
{
	int retval;
	int EventSet = PAPI_NULL;

	*num_events=2;
	*papi_event=PAPI_TOT_INS;
	(void)mask;

	/* create the eventset */
	retval = PAPI_create_eventset( &EventSet );
	if ( retval != PAPI_OK ) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	retval = PAPI_add_named_event( EventSet, "PAPI_TOT_CYC");
	if ( retval != PAPI_OK ) {
		if (!TESTS_QUIET) printf("Couldn't add PAPI_TOT_CYC\n");
		test_skip(__FILE__,__LINE__,"Couldn't add PAPI_TOT_CYC",0);
	}

	retval = PAPI_add_named_event( EventSet, "PAPI_TOT_INS");
	if ( retval != PAPI_OK ) {
		if (!TESTS_QUIET) printf("Couldn't add PAPI_TOT_INS\n");
		test_skip(__FILE__,__LINE__,"Couldn't add PAPI_TOT_INS",0);
	}

	return EventSet;
}

int
add_two_nonderived_events( int *num_events, int *papi_event, int *mask )
{
	/* query and set up the right event to monitor */
	int EventSet = PAPI_NULL;
	int retval;

	*num_events=0;

#define POTENTIAL_EVENTS 3

	unsigned int potential_evt_to_add[POTENTIAL_EVENTS][2] =
		{ {( unsigned int ) PAPI_FP_INS, MASK_FP_INS},
		  {( unsigned int ) PAPI_FP_OPS, MASK_FP_OPS},
		  {( unsigned int ) PAPI_TOT_INS, MASK_TOT_INS}
		};

	int i;

	*mask = 0;

	/* could leak up to two event sets.
*/ for(i=0;imodel_string,"POWER6")) || (!strcmp(hw_info->model_string,"POWER5")) ) { test_warn(__FILE__, __LINE__, "Limiting num_counters because of " "LIMITED_PMC on Power5 and Power6",1); counters=4; } } ( *evtcodes ) = ( int * ) calloc( counters, sizeof ( int ) ); retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } /* For platform independence, always ASK FOR the first event */ /* Don't just assume it'll be the first numeric value */ i = 0 | PAPI_NATIVE_MASK; retval = PAPI_enum_cmp_event( &i, PAPI_ENUM_FIRST, cidx ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_enum_cmp_event", retval ); } do { retval = PAPI_get_event_info( i, &info ); /* HACK! FIXME */ if (no_software_events && ( strstr(info.symbol,"PERF_COUNT_SW") || strstr(info.long_descr, "PERF_COUNT_SW") ) ) { if (!TESTS_QUIET) { printf("Blocking event %s as a SW event\n", info.symbol); } continue; } if ( s->cntr_umasks ) { k = i; if ( PAPI_enum_cmp_event( &k, PAPI_NTV_ENUM_UMASKS, cidx ) == PAPI_OK ) { do { retval = PAPI_get_event_info( k, &info ); event_code = ( int ) info.event_code; retval = PAPI_add_event( EventSet, event_code ); if ( retval == PAPI_OK ) { ( *evtcodes )[event_found] = event_code; if ( !TESTS_QUIET ) { printf( "event_code[%d] = %#x (%s)\n", event_found, event_code, info.symbol ); } event_found++; } else { if ( !TESTS_QUIET ) { printf( "%#x (%s) can't be added to the EventSet.\n", event_code, info.symbol ); } } } while ( PAPI_enum_cmp_event( &k, PAPI_NTV_ENUM_UMASKS, cidx ) == PAPI_OK && event_found < counters ); } else { event_code = ( int ) info.event_code; retval = PAPI_add_event( EventSet, event_code ); if ( retval == PAPI_OK ) { ( *evtcodes )[event_found] = event_code; if ( !TESTS_QUIET ) { printf( "event_code[%d] = %#x (%s)\n", event_found, event_code, info.symbol ); } event_found++; } } if ( !TESTS_QUIET && retval == PAPI_OK ) { /* */ } } else { event_code = ( int ) 
info.event_code;
			retval = PAPI_add_event( EventSet, event_code );
			if ( retval == PAPI_OK ) {
				( *evtcodes )[event_found] = event_code;
				event_found++;
			} else {
				if ( !TESTS_QUIET )
					fprintf( stdout, "%#x is not available.\n",
						event_code );
			}
		}
	} while ( PAPI_enum_cmp_event( &i, PAPI_ENUM_EVENTS, cidx ) == PAPI_OK &&
		event_found < counters );

	*num_events = ( int ) event_found;

	if (!TESTS_QUIET) printf("Tried to fill %d counters with events, "
			"found %d\n",counters,event_found);

	return EventSet;
}
papi-papi-7-2-0-t/src/threads.c000066400000000000000000000402511502707512200162270ustar00rootroot00000000000000
/****************************/
/* THIS IS OPEN SOURCE CODE */
/****************************/

/*
* File:    threads.c
* Author:  Philip Mucci
*          mucci@cs.utk.edu
* Mods:    Kevin London
*          london@cs.utk.edu
*/

/* This file contains thread allocation and bookkeeping functions */

#include "papi.h"
#include "papi_internal.h"
#include "papi_vector.h"
#include "papi_memory.h"
#include <string.h>
#include <unistd.h>

/*****************/
/* BEGIN GLOBALS */
/*****************/

/* The following globals get initialized and cleared by:
   extern int _papi_hwi_init_global_threads(void);
   extern int _papi_hwi_shutdown_thread(ThreadInfo_t *thread); */

/* list of threads, gets initialized to master process with TID of getpid() */

volatile ThreadInfo_t *_papi_hwi_thread_head;

/* If we have TLS, this variable ALWAYS points to our thread descriptor.
   It's like magic!
*/

#if defined(HAVE_THREAD_LOCAL_STORAGE)
THREAD_LOCAL_STORAGE_KEYWORD ThreadInfo_t *_papi_hwi_my_thread;
#endif

/* Function that returns an unsigned long thread identifier */

unsigned long ( *_papi_hwi_thread_id_fn ) ( void );

/* Function that sends a signal to other threads */

#ifdef ANY_THREAD_GETS_SIGNAL
int ( *_papi_hwi_thread_kill_fn ) ( int, int );
#endif

/*****************/
/* END GLOBALS */
/*****************/

static int
lookup_and_set_thread_symbols( void )
{
#if defined(ANY_THREAD_GETS_SIGNAL)
	int retval;
	char *error_ptc = NULL, *error_ptk = NULL;
	void *symbol_ptc = NULL, *symbol_ptk = NULL, *handle = NULL;

	handle = dlopen( NULL, RTLD_LAZY );
	if ( handle == NULL ) {
		PAPIERROR( "Error from dlopen(NULL, RTLD_LAZY): %d %s", errno,
			dlerror( ) );
		return ( PAPI_ESYS );
	}

	symbol_ptc = dlsym( handle, "pthread_self" );
	if ( symbol_ptc == NULL ) {
		error_ptc = dlerror( );
		THRDBG( "dlsym(%p,pthread_self) returned NULL: %s\n", handle,
			( error_ptc ? error_ptc : "No error, NULL symbol!" ) );
	}

	symbol_ptk = dlsym( handle, "pthread_kill" );
	if ( symbol_ptk == NULL ) {
		error_ptk = dlerror( );
		THRDBG( "dlsym(%p,pthread_kill) returned NULL: %s\n", handle,
			( error_ptk ? error_ptk : "No error, NULL symbol!" ) );
	}

	dlclose( handle );

	if ( !( ( _papi_hwi_thread_kill_fn && _papi_hwi_thread_id_fn ) ||
			( !_papi_hwi_thread_kill_fn && !_papi_hwi_thread_id_fn ) ) )
		return ( PAPI_EMISC );

	_papi_hwi_thread_kill_fn = ( int ( * )( int, int ) ) symbol_ptk;
	_papi_hwi_thread_id_fn = ( unsigned long ( * )( void ) ) symbol_ptc;
#endif
	return ( PAPI_OK );
}

static ThreadInfo_t *
allocate_thread( int tid )
{
	ThreadInfo_t *thread;
	int i;

	/* The Thread EventSet is special. It is not in the EventSet list,
	   but is pointed to by each EventSet of that particular thread.
*/ thread = ( ThreadInfo_t * ) papi_malloc( sizeof ( ThreadInfo_t ) ); if ( thread == NULL ) return ( NULL ); memset( thread, 0x00, sizeof ( ThreadInfo_t ) ); thread->context = ( hwd_context_t ** ) papi_malloc( sizeof ( hwd_context_t * ) * ( size_t ) papi_num_components ); if ( !thread->context ) { papi_free( thread ); return ( NULL ); } thread->running_eventset = ( EventSetInfo_t ** ) papi_malloc( sizeof ( EventSetInfo_t * ) * ( size_t ) papi_num_components ); if ( !thread->running_eventset ) { papi_free( thread->context ); papi_free( thread ); return ( NULL ); } for ( i = 0; i < papi_num_components; i++ ) { thread->context[i] = ( void * ) papi_malloc( ( size_t ) _papi_hwd[i]->size.context ); thread->running_eventset[i] = NULL; if ( thread->context[i] == NULL ) { for ( i--; i >= 0; i-- ) papi_free( thread->context[i] ); papi_free( thread->running_eventset ); papi_free( thread->context ); papi_free( thread ); return ( NULL ); } memset( thread->context[i], 0x00, ( size_t ) _papi_hwd[i]->size.context ); } if ( _papi_hwi_thread_id_fn ) { thread->tid = ( *_papi_hwi_thread_id_fn ) ( ); } else { thread->tid = ( unsigned long ) getpid( ); } thread->allocator_tid=thread->tid; if (tid == 0 ) { } else { thread->tid=tid; } THRDBG( "Allocated thread %ld at %p, allocator: %ld\n", thread->tid, thread, thread->allocator_tid ); return thread; } static void free_thread( ThreadInfo_t ** thread ) { int i; THRDBG( "Freeing thread %ld at %p\n", ( *thread )->tid, *thread ); for ( i = 0; i < papi_num_components; i++ ) { if ( ( *thread )->context[i] ) papi_free( ( *thread )->context[i] ); } if ( ( *thread )->context ) papi_free( ( *thread )->context ); if ( ( *thread )->running_eventset ) papi_free( ( *thread )->running_eventset ); memset( *thread, 0x00, sizeof ( ThreadInfo_t ) ); papi_free( *thread ); *thread = NULL; } static void insert_thread( ThreadInfo_t * entry, int tid ) { _papi_hwi_lock( THREADS_LOCK ); if ( _papi_hwi_thread_head == NULL ) { /* 0 elements */ THRDBG( 
"_papi_hwi_thread_head is NULL\n" ); entry->next = entry; } else if ( _papi_hwi_thread_head->next == _papi_hwi_thread_head ) { /* 1 elements */ THRDBG( "_papi_hwi_thread_head was thread %ld at %p\n", _papi_hwi_thread_head->tid, _papi_hwi_thread_head ); _papi_hwi_thread_head->next = entry; entry->next = ( ThreadInfo_t * ) _papi_hwi_thread_head; } else { /* 2+ elements */ THRDBG( "_papi_hwi_thread_head was thread %ld at %p\n", _papi_hwi_thread_head->tid, _papi_hwi_thread_head ); entry->next = _papi_hwi_thread_head->next; _papi_hwi_thread_head->next = entry; } _papi_hwi_thread_head = entry; THRDBG( "_papi_hwi_thread_head now thread %ld at %p\n", _papi_hwi_thread_head->tid, _papi_hwi_thread_head ); _papi_hwi_unlock( THREADS_LOCK ); #if defined(HAVE_THREAD_LOCAL_STORAGE) /* Don't set the current local thread if we are a fake attach thread */ if (tid==0) { _papi_hwi_my_thread = entry; THRDBG( "TLS for thread %ld is now %p\n", entry->tid, _papi_hwi_my_thread ); } #else ( void ) tid; #endif } static int remove_thread( ThreadInfo_t * entry ) { ThreadInfo_t *tmp = NULL, *prev = NULL; _papi_hwi_lock( THREADS_LOCK ); THRDBG( "_papi_hwi_thread_head was thread %ld at %p\n", _papi_hwi_thread_head->tid, _papi_hwi_thread_head ); /* Find the preceding element and the matched element, short circuit if we've seen the head twice */ for ( tmp = ( ThreadInfo_t * ) _papi_hwi_thread_head; ( entry != tmp ) || ( prev == NULL ); tmp = tmp->next ) { prev = tmp; } if ( tmp != entry ) { THRDBG( "Thread %ld at %p was not found in the thread list!\n", entry->tid, entry ); return ( PAPI_EBUG ); } /* Only 1 element in list */ if ( prev == tmp ) { _papi_hwi_thread_head = NULL; tmp->next = NULL; THRDBG( "_papi_hwi_thread_head now NULL\n" ); } else { prev->next = tmp->next; /* If we're removing the head, better advance it! 
*/ if ( _papi_hwi_thread_head == tmp ) { _papi_hwi_thread_head = tmp->next; THRDBG( "_papi_hwi_thread_head now thread %ld at %p\n", _papi_hwi_thread_head->tid, _papi_hwi_thread_head ); } THRDBG( "Removed thread %p from list\n", tmp ); } _papi_hwi_unlock( THREADS_LOCK ); #if defined(HAVE_THREAD_LOCAL_STORAGE) _papi_hwi_my_thread = NULL; THRDBG( "TLS for thread %ld is now %p\n", entry->tid, _papi_hwi_my_thread ); #endif return PAPI_OK; } int _papi_hwi_initialize_thread( ThreadInfo_t ** dest, int tid ) { int retval; ThreadInfo_t *thread; int i; if ( ( thread = allocate_thread( tid ) ) == NULL ) { *dest = NULL; return PAPI_ENOMEM; } /* init event memory variables, used by papi_internal.c */ thread->tls_papi_event_code = -1; thread->tls_papi_event_code_changed = -1; /* Call the component to fill in anything special. */ for ( i = 0; i < papi_num_components; i++ ) { if (_papi_hwd[i]->cmp_info.disabled && _papi_hwd[i]->cmp_info.disabled != PAPI_EDELAY_INIT) continue; retval = _papi_hwd[i]->init_thread( thread->context[i] ); if ( retval ) { free_thread( &thread ); *dest = NULL; return retval; } } insert_thread( thread, tid ); *dest = thread; return PAPI_OK; } #if defined(ANY_THREAD_GETS_SIGNAL) /* This is ONLY defined for systems that enable ANY_THREAD_GETS_SIGNAL since we must forward signals sent to non-PAPI threads. This is NOT compatible with thread local storage, since to broadcast the signal, we need a list of threads. */ int _papi_hwi_broadcast_signal( unsigned int mytid ) { int i, retval, didsomething = 0; volatile ThreadInfo_t *foo = NULL; _papi_hwi_lock( THREADS_LOCK ); for ( foo = _papi_hwi_thread_head; foo != NULL; foo = foo->next ) { /* xxxx Should this be hardcoded to index 0 or walk the list or what? 
*/ for ( i = 0; i < papi_num_components; i++ ) { if ( ( foo->tid != mytid ) && ( foo->running_eventset[i] ) && ( foo->running_eventset[i]-> state & ( PAPI_OVERFLOWING | PAPI_MULTIPLEXING ) ) ) { /* xxxx mpx_info inside _papi_mdi_t _papi_hwi_system_info is commented out. See papi_internal.h for details. The multiplex_timer_sig value is now part of that structure */ THRDBG("Thread %ld sending signal %d to thread %ld\n",mytid,foo->tid, (foo->running_eventset[i]->state & PAPI_OVERFLOWING ? _papi_hwd[i]->cmp_info.hardware_intr_sig : _papi_os_info.itimer_sig)); retval = (*_papi_hwi_thread_kill_fn)(foo->tid, (foo->running_eventset[i]->state & PAPI_OVERFLOWING ? _papi_hwd[i]->cmp_info.hardware_intr_sig : _papi_os_info.itimer_sig)); if (retval != 0) return(PAPI_EMISC); } } if ( foo->next == _papi_hwi_thread_head ) break; } _papi_hwi_unlock( THREADS_LOCK ); return ( PAPI_OK ); } #endif /* This is undefined for systems that enable ANY_THREAD_GETS_SIGNAL since we always must enable threads for safety. */ int _papi_hwi_set_thread_id_fn( unsigned long ( *id_fn ) ( void ) ) { #if !defined(ANY_THREAD_GETS_SIGNAL) /* Check for multiple threads still in the list, if so, we can't change it */ if ( _papi_hwi_thread_head->next != _papi_hwi_thread_head ) return ( PAPI_EINVAL ); /* We can't change the thread id function from one to another, only NULL to non-NULL and vice versa. 
*/ if ( ( id_fn != NULL ) && ( _papi_hwi_thread_id_fn != NULL ) ) return ( PAPI_EINVAL ); _papi_hwi_thread_id_fn = id_fn; THRDBG( "Set new thread id function to %p\n", id_fn ); if ( id_fn ) _papi_hwi_thread_head->tid = ( *_papi_hwi_thread_id_fn ) ( ); else _papi_hwi_thread_head->tid = ( unsigned long ) getpid( ); THRDBG( "New master tid is %ld\n", _papi_hwi_thread_head->tid ); #else THRDBG( "Skipping set of thread id function\n" ); #endif return PAPI_OK; } static int _papi_hwi_thread_free_eventsets(long tid) { EventSetInfo_t *ESI; ThreadInfo_t *master; DynamicArray_t *map = &_papi_hwi_system_info.global_eventset_map; int i; master = _papi_hwi_lookup_thread( tid ); _papi_hwi_lock( INTERNAL_LOCK ); for( i = 0; i < map->totalSlots; i++ ) { ESI = map->dataSlotArray[i]; if ( ( ESI ) && (ESI->master!=NULL) ) { if ( ESI->master == master ) { THRDBG("Attempting to remove %d from tid %ld\n",ESI->EventSetIndex,tid); /* Code copied from _papi_hwi_remove_EventSet(ESI); */ _papi_hwi_free_EventSet( ESI ); map->dataSlotArray[i] = NULL; map->availSlots++; map->fullSlots--; } } } _papi_hwi_unlock( INTERNAL_LOCK ); return PAPI_OK; } int _papi_hwi_shutdown_thread( ThreadInfo_t * thread, int force_shutdown ) { int retval = PAPI_OK; unsigned long tid; int i, failure = 0; /* Clear event memory variables */ thread->tls_papi_event_code = -1; thread->tls_papi_event_code_changed = -1; /* Get thread id */ if ( _papi_hwi_thread_id_fn ) tid = ( *_papi_hwi_thread_id_fn ) ( ); else tid = ( unsigned long ) getpid( ); THRDBG("Want to shutdown thread %ld, alloc %ld, our_tid: %ld\n", thread->tid, thread->allocator_tid, tid); if ((thread->tid==tid) || ( thread->allocator_tid == tid ) || force_shutdown) { _papi_hwi_thread_free_eventsets(tid); remove_thread( thread ); THRDBG( "Shutting down thread %ld at %p\n", thread->tid, thread ); for( i = 0; i < papi_num_components; i++ ) { if (_papi_hwd[i]->cmp_info.disabled && _papi_hwd[i]->cmp_info.disabled != PAPI_EDELAY_INIT) continue; retval = 
_papi_hwd[i]->shutdown_thread( thread->context[i]); if ( retval != PAPI_OK ) failure = retval; } free_thread( &thread ); return ( failure ); } THRDBG( "Skipping shutdown thread %ld at %p, thread %ld not allocator!\n", thread->tid, thread, tid ); return PAPI_EBUG; } /* THESE MUST BE CALLED WITH A GLOBAL LOCK */ int _papi_hwi_shutdown_global_threads( void ) { int err,num_threads,i; ThreadInfo_t *tmp,*next; unsigned long our_tid; tmp = _papi_hwi_lookup_thread( 0 ); if ( tmp == NULL ) { THRDBG( "Did not find my thread for shutdown!\n" ); err = PAPI_EBUG; } else { our_tid=tmp->tid; (void)our_tid; THRDBG("Shutting down %ld\n",our_tid); err = _papi_hwi_shutdown_thread( tmp, 1 ); /* count threads */ tmp = ( ThreadInfo_t * ) _papi_hwi_thread_head; num_threads=0; while(tmp!=NULL) { num_threads++; if (tmp->next==_papi_hwi_thread_head) break; tmp=tmp->next; } /* Shut down all threads allocated by this thread */ /* Urgh it's a circular list where we removed in the loop */ /* so the only sane way to do it is get a count in advance */ tmp = ( ThreadInfo_t * ) _papi_hwi_thread_head; for(i=0;inext; THRDBG("looking at #%d %ld our_tid: %ld alloc_tid: %ld\n", i,tmp->tid,our_tid,tmp->allocator_tid); THRDBG("Also removing thread %ld\n",tmp->tid); err = _papi_hwi_shutdown_thread( tmp, 1 ); tmp=next; } } #ifdef DEBUG if ( ISLEVEL( DEBUG_THREADS ) ) { if ( _papi_hwi_thread_head ) { THRDBG( "Thread head %p still exists!\n", _papi_hwi_thread_head ); } } #endif #if defined(HAVE_THREAD_LOCAL_STORAGE) _papi_hwi_my_thread = NULL; #endif _papi_hwi_thread_head = NULL; _papi_hwi_thread_id_fn = NULL; #if defined(ANY_THREAD_GETS_SIGNAL) _papi_hwi_thread_kill_fn = NULL; #endif return err; } int _papi_hwi_init_global_threads( void ) { int retval; ThreadInfo_t *tmp; _papi_hwi_lock( GLOBAL_LOCK ); #if defined(HAVE_THREAD_LOCAL_STORAGE) _papi_hwi_my_thread = NULL; #endif _papi_hwi_thread_head = NULL; _papi_hwi_thread_id_fn = NULL; #if defined(ANY_THREAD_GETS_SIGNAL) _papi_hwi_thread_kill_fn = NULL; #endif 
retval = _papi_hwi_initialize_thread( &tmp , 0); if ( retval == PAPI_OK ) { retval = lookup_and_set_thread_symbols( ); } _papi_hwi_unlock( GLOBAL_LOCK ); return ( retval ); } int _papi_hwi_gather_all_thrspec_data( int tag, PAPI_all_thr_spec_t * where ) { int didsomething = 0; ThreadInfo_t *foo = NULL; _papi_hwi_lock( THREADS_LOCK ); for ( foo = ( ThreadInfo_t * ) _papi_hwi_thread_head; foo != NULL; foo = foo->next ) { /* If we want thread ID's */ if ( where->id ) memcpy( &where->id[didsomething], &foo->tid, sizeof ( where->id[didsomething] ) ); /* If we want data pointers */ if ( where->data ) where->data[didsomething] = foo->thread_storage[tag]; didsomething++; if ( ( where->id ) || ( where->data ) ) { if ( didsomething >= where->num ) break; } if ( foo->next == _papi_hwi_thread_head ) break; } where->num = didsomething; _papi_hwi_unlock( THREADS_LOCK ); return ( PAPI_OK ); } #if defined(__APPLE__) #include unsigned long _papi_gettid(void) { uint64_t tid; pthread_threadid_np(NULL, &tid); // macOS-specific thread ID function return (unsigned long)tid; } #elif defined(__NR_gettid) && !defined(HAVE_GETTID) #include #include unsigned long _papi_gettid(void) { return (unsigned long)(syscall(__NR_gettid)); } #elif defined(HAVE_GETTID) #include unsigned long _papi_gettid(void) { return (unsigned long)(gettid()); } #elif defined(HAVE_SYSCALL_GETTID) #include #include unsigned long _papi_gettid(void) { return (unsigned long)(syscall(SYS_gettid)); } #else #include #include /* Fall-back on getpid for tid if not available. */ unsigned long _papi_gettid(void) { return (unsigned long)(getpid()); } #endif unsigned long _papi_getpid(void) { return (unsigned long) getpid(); } papi-papi-7-2-0-t/src/threads.h000066400000000000000000000105711502707512200162360ustar00rootroot00000000000000/** @file threads.h * CVS: $Id$ * @author ?? 
*/ #ifndef PAPI_THREADS_H #define PAPI_THREADS_H #include #include #include #ifdef HAVE_THREAD_LOCAL_STORAGE #define THREAD_LOCAL_STORAGE_KEYWORD HAVE_THREAD_LOCAL_STORAGE #else #define THREAD_LOCAL_STORAGE_KEYWORD #endif #if defined(ANY_THREAD_GETS_SIGNAL) && !defined(_AIX) #error "lookup_and_set_thread_symbols and _papi_hwi_broadcast_signal have only been tested on AIX" #endif typedef struct _ThreadInfo { unsigned long int tid; unsigned long int allocator_tid; struct _ThreadInfo *next; hwd_context_t **context; void *thread_storage[PAPI_MAX_TLS]; EventSetInfo_t **running_eventset; EventSetInfo_t *from_esi; /* ESI used for last update this control state */ int wants_signal; // The current event code can be stored here prior to // component calls and cleared after the component returns. unsigned int tls_papi_event_code; int tls_papi_event_code_changed; } ThreadInfo_t; /** The list of threads, gets initialized to master process with TID of getpid() * @internal */ extern volatile ThreadInfo_t *_papi_hwi_thread_head; /* If we have TLS, this variable ALWAYS points to our thread descriptor. It's like magic! 
*/ #if defined(HAVE_THREAD_LOCAL_STORAGE) extern THREAD_LOCAL_STORAGE_KEYWORD ThreadInfo_t *_papi_hwi_my_thread; #endif /** Function that returns an unsigned long int thread identifier * @internal */ extern unsigned long int ( *_papi_hwi_thread_id_fn ) ( void ); /** Function that sends a signal to other threads * @internal */ extern int ( *_papi_hwi_thread_kill_fn ) ( int, int ); extern int _papi_hwi_initialize_thread( ThreadInfo_t ** dest, int tid ); extern int _papi_hwi_init_global_threads( void ); extern int _papi_hwi_shutdown_thread( ThreadInfo_t * thread, int force ); extern int _papi_hwi_shutdown_global_threads( void ); extern int _papi_hwi_broadcast_signal( unsigned int mytid ); extern int _papi_hwi_set_thread_id_fn( unsigned long int ( *id_fn ) ( void ) ); inline_static int _papi_hwi_lock( int lck ) { if ( _papi_hwi_thread_id_fn ) { _papi_hwd_lock( lck ); THRDBG( "Lock %d\n", lck ); } else { ( void ) lck; /* unused if !defined(DEBUG) */ THRDBG( "Skipped lock %d\n", lck ); } return ( PAPI_OK ); } inline_static int _papi_hwi_unlock( int lck ) { if ( _papi_hwi_thread_id_fn ) { _papi_hwd_unlock( lck ); THRDBG( "Unlock %d\n", lck ); } else { ( void ) lck; /* unused if !defined(DEBUG) */ THRDBG( "Skipped unlock %d\n", lck ); } return ( PAPI_OK ); } inline_static ThreadInfo_t * _papi_hwi_lookup_thread( int custom_tid ) { unsigned long int tid; ThreadInfo_t *tmp; if (custom_tid==0) { #ifdef HAVE_THREAD_LOCAL_STORAGE THRDBG( "TLS returning %p\n", _papi_hwi_my_thread ); return ( _papi_hwi_my_thread ); #else if ( _papi_hwi_thread_id_fn == NULL ) { THRDBG( "Threads not initialized, returning master thread at %p\n", _papi_hwi_thread_head ); return ( ( ThreadInfo_t * ) _papi_hwi_thread_head ); } tid = ( *_papi_hwi_thread_id_fn ) ( ); #endif } else { tid=custom_tid; } THRDBG( "Threads initialized, looking for thread %#lx\n", tid ); _papi_hwi_lock( THREADS_LOCK ); tmp = ( ThreadInfo_t * ) _papi_hwi_thread_head; while ( tmp != NULL ) { THRDBG( "Examining thread tid %#lx at 
%p\n", tmp->tid, tmp );
		if ( tmp->tid == tid )
			break;
		tmp = tmp->next;
		if ( tmp == _papi_hwi_thread_head ) {
			tmp = NULL;
			break;
		}
	}

	if ( tmp ) {
		_papi_hwi_thread_head = tmp;
		THRDBG( "Found thread %ld at %p\n", tid, tmp );
	} else {
		THRDBG( "Did not find tid %ld\n", tid );
	}
	_papi_hwi_unlock( THREADS_LOCK );
	return ( tmp );
}

inline_static int
_papi_hwi_lookup_or_create_thread( ThreadInfo_t ** here, int tid )
{
	ThreadInfo_t *tmp = _papi_hwi_lookup_thread( tid );
	int retval = PAPI_OK;

	if ( tmp == NULL )
		retval = _papi_hwi_initialize_thread( &tmp, tid );

	if ( retval == PAPI_OK )
		*here = tmp;

	return ( retval );
}

/* Prototypes */
void _papi_hwi_shutdown_the_thread_list( void );
void _papi_hwi_cleanup_thread_list( void );
int _papi_hwi_insert_in_thread_list( ThreadInfo_t * ptr );
ThreadInfo_t *_papi_hwi_lookup_in_thread_list(  );
int _papi_hwi_get_thr_context( void ** );
int _papi_hwi_gather_all_thrspec_data( int tag, PAPI_all_thr_spec_t * where );

unsigned long _papi_gettid(void);
unsigned long _papi_getpid(void);

#endif
papi-papi-7-2-0-t/src/utils/000077500000000000000000000000001502707512200155675ustar00rootroot00000000000000papi-papi-7-2-0-t/src/utils/Makefile000066400000000000000000000064611502707512200172360ustar00rootroot00000000000000
# File: utils/Makefile

include Makefile.target

INCLUDE = -I../testlib -I.. -I.
testlibdir=../testlib CLOCKCORE= $(testlibdir)/clockcore.o DOLOOPS = $(testlibdir)/do_loops.o ALL = papi_avail papi_mem_info papi_cost papi_clockres papi_native_avail \ papi_command_line papi_event_chooser papi_decode papi_xml_event_info \ papi_version papi_multiplex_cost papi_component_avail papi_error_codes \ papi_hardware_avail %.o:%.c $(CC) $(CFLAGS) $(OPTFLAGS) $(INCLUDE) -c $< default all utils: $(ALL) papi_avail: papi_avail.o $(PAPILIB) print_header.o $(CC) $(CFLAGS) $(OPTFLAGS) -o papi_avail papi_avail.o print_header.o $(PAPILIB) $(LDFLAGS) papi_clockres: papi_clockres.o $(PAPILIB) $(CLOCKCORE) $(CC) $(CFLAGS) $(OPTFLAGS) -o papi_clockres papi_clockres.o $(PAPILIB) $(CLOCKCORE) -lm $(LDFLAGS) papi_command_line: papi_command_line.o $(PAPILIB) $(DOLOOPS) $(CC) $(CFLAGS) $(OPTFLAGS) -o papi_command_line papi_command_line.o $(PAPILIB) $(DOLOOPS) $(LDFLAGS) papi_component_avail: papi_component_avail.o $(PAPILIB) print_header.o $(CC) $(CFLAGS) $(OPTFLAGS) -o papi_component_avail papi_component_avail.o $(PAPILIB) print_header.o $(LDFLAGS) $(LIBSDEFLAGS) papi_cost: papi_cost.o $(PAPILIB) cost_utils.o $(CC) $(CFLAGS) $(OPTFLAGS) -o papi_cost papi_cost.o cost_utils.o $(PAPILIB) -lm $(LDFLAGS) papi_decode: papi_decode.o $(PAPILIB) $(CC) $(CFLAGS) $(OPTFLAGS) -o papi_decode papi_decode.o $(PAPILIB) $(LDFLAGS) papi_error_codes: papi_error_codes.o $(PAPILIB) $(CC) $(CFLAGS) $(OPTFLAGS) -o papi_error_codes papi_error_codes.o $(PAPILIB) $(LDFLAGS) papi_event_chooser: papi_event_chooser.o $(PAPILIB) print_header.o $(CC) $(CFLAGS) $(OPTFLAGS) -o papi_event_chooser papi_event_chooser.o print_header.o $(PAPILIB) $(LDFLAGS) papi_hybrid_native_avail: papi_hybrid_native_avail.o $(PAPILIB) $(CC) $(CFLAGS) $(OPTFLAGS) -o papi_hybrid_native_avail papi_hybrid_native_avail.o $(PAPILIB) $(LDFLAGS) papi_mem_info: papi_mem_info.o $(PAPILIB) $(CC) $(CFLAGS) $(OPTFLAGS) -o papi_mem_info papi_mem_info.o $(PAPILIB) $(LDFLAGS) papi_multiplex_cost: papi_multiplex_cost.o $(PAPILIB) cost_utils.o 
$(CC) $(CFLAGS) $(OPTFLAGS) -o papi_multiplex_cost papi_multiplex_cost.o cost_utils.o $(PAPILIB) -lm $(LDFLAGS) papi_hardware_avail: papi_hardware_avail.o $(PAPILIB) print_header.o $(CC) $(CFLAGS) $(OPTFLAGS) -o papi_hardware_avail papi_hardware_avail.o $(PAPILIB) print_header.o $(LDFLAGS) papi_native_avail: papi_native_avail.c $(PAPILIB) print_header.o $(CC) $(CFLAGS) $(OPTFLAGS) $(INCLUDE) -o papi_native_avail papi_native_avail.c $(PAPILIB) print_header.o $(LDFLAGS) $(LIBSDEFLAGS) papi_version: papi_version.o $(PAPILIB) $(CC) $(CFLAGS) $(OPTFLAGS) -o papi_version papi_version.o $(PAPILIB) $(LDFLAGS) papi_xml_event_info: papi_xml_event_info.o $(PAPILIB) $(CC) $(CFLAGS) $(OPTFLAGS) -o papi_xml_event_info papi_xml_event_info.o $(PAPILIB) $(LDFLAGS) cost_utils.o: ../testlib/papi_test.h cost_utils.c $(CC) $(INCLUDE) $(CFLAGS) $(OPTFLAGS) -c cost_utils.c print_header.o: print_header.h print_header.c $(CC) $(INCLUDE) $(CFLAGS) $(OPTFLAGS) -c print_header.c clean: rm -f *.o *.stderr *.stdout core *~ $(ALL) distclean clobber: clean rm -f Makefile.target install: $(UTIL_TARGETS) @echo "Utilities (BINDIR) being installed in: \"$(BINDIR)\""; -mkdir -p $(BINDIR) -chmod go+rx $(BINDIR) -find . -perm -100 -type f -exec cp {} $(BINDIR) \; papi-papi-7-2-0-t/src/utils/Makefile.target.in000066400000000000000000000010351502707512200211200ustar00rootroot00000000000000PACKAGE_TARNAME = @PACKAGE_TARNAME@ prefix = @prefix@ exec_prefix = @exec_prefix@ datarootdir = @datarootdir@ datadir = @datadir@/${PACKAGE_TARNAME} testlibdir = $(datadir)/testlib DATADIR = $(DESTDIR)$(datadir) INCLUDE = -I. 
-I@includedir@ -I$(testlibdir) LIBDIR = @libdir@ LIBRARY = @LIBRARY@ SHLIB = @SHLIB@ PAPILIB = ../@LINKLIB@ LIBSDEFLAGS = @LIBSDEFLAGS@ TESTLIB = $(testlibdir)/libtestlib.a LDFLAGS = @LDFLAGS@ @LDL@ @STATIC@ CC = @CC@ MPICC = @MPICC@ F77 = @F77@ CC_R = @CC_R@ CFLAGS = @CFLAGS@ @OPTFLAGS@ OMPCFLGS = @OMPCFLGS@ papi-papi-7-2-0-t/src/utils/cost_utils.c000066400000000000000000000053131502707512200201250ustar00rootroot00000000000000#include #include #include #include #define NUM_ITERS 1000000 int num_iters = NUM_ITERS; /* computes min, max, and mean for an array; returns std deviation */ double do_stats( long long *array, long long *min, long long *max, double *average ) { int i; double std, tmp; *min = *max = array[0]; *average = 0; for ( i = 0; i < num_iters; i++ ) { *average += ( double ) array[i]; if ( *min > array[i] ) *min = array[i]; if ( *max < array[i] ) *max = array[i]; } *average = *average / ( double ) num_iters; std = 0; for ( i = 0; i < num_iters; i++ ) { tmp = ( double ) array[i] - ( *average ); std += tmp * tmp; } std = sqrt( std / ( num_iters - 1 ) ); return ( std ); } void do_std_dev( long long *a, int *s, double std, double ave ) { int i, j; double dev[10]; for ( i = 0; i < 10; i++ ) { dev[i] = std * ( i + 1 ); s[i] = 0; } for ( i = 0; i < num_iters; i++ ) { for ( j = 0; j < 10; j++ ) { if ( ( ( double ) a[i] - dev[j] ) > ave ) s[j]++; } } } void do_dist( long long *a, long long min, long long max, int bins, int *d ) { int i, j; int dmax = 0; int range = ( int ) ( max - min + 1 ); /* avoid edge conditions */ /* clear the distribution array */ for ( i = 0; i < bins; i++ ) { d[i] = 0; } /* scan the array to distribute cost per bin */ for ( i = 0; i < num_iters; i++ ) { j = ( ( int ) ( a[i] - min ) * bins ) / range; d[j]++; if ( j && ( dmax < d[j] ) ) dmax = d[j]; } /* scale each bin to a max of 100 */ for ( i = 1; i < bins; i++ ) { d[i] = ( d[i] * 100 ) / dmax; } } /* Long Long compare function for qsort */ static int cmpfunc (const void *a, const void 
*b) { if ( *(long long *)a - *(long long *)b < 0 ) { return -1; } if ( *(long long int*)a - *(long long int*)b > 0 ) { return 1; } return 0; } /* Calculate the percentiles for making boxplots */ int do_percentile(long long *a, long long *percent25, long long *percent50, long long *percent75, long long *percent99) { long long *a_sort; int i_25,i_50,i_75,i_99; /* Allocate room for a copy of the results */ a_sort = calloc(num_iters,sizeof(long long)); if (a_sort==NULL) { fprintf(stderr,"Memory allocation error!\n"); return -1; } /* Make a copy of the results */ memcpy(a_sort,a,num_iters*sizeof(long long)); /* Calculate indices */ i_25=(int)num_iters/4; i_50=(int)num_iters/2; // index for 75%, not quite accurate because it doesn't // take even or odd into consideration i_75=((int)num_iters*3)/4; i_99=((int)num_iters*99)/100; qsort(a_sort,num_iters-1,sizeof(long long),cmpfunc); *percent25=a_sort[i_25]; *percent50=a_sort[i_50]; *percent75=a_sort[i_75]; *percent99=a_sort[i_99]; free(a_sort); return 0; } papi-papi-7-2-0-t/src/utils/cost_utils.h000066400000000000000000000006431502707512200201330ustar00rootroot00000000000000#ifndef __PAPI_COST_UTILS_H__ #define __PAPI_COST_UTILS_H__ extern int num_iters; double do_stats(long long*, long long*, long long *, double *); void do_std_dev( long long*, int*, double, double ); void do_dist( long long*, long long, long long, int, int*); int do_percentile(long long *a, long long *percent25, long long *percent50, long long *percent75, long long *percent99); #endif /* __PAPI_COST_UTILS_H__ */ papi-papi-7-2-0-t/src/utils/papi_avail.c000066400000000000000000001046711502707512200200510ustar00rootroot00000000000000// Define the papi_avail man page contents. /** * file papi_avail.c * @brief papi_avail utility. * @page papi_avail * @section Name * papi_avail - provides availability and detailed information for PAPI preset and user defined events. 
* * @section Synopsis * papi_avail [-acdh] [-e event] * * @section Description * papi_avail is a PAPI utility program that reports information about the * current PAPI installation and supported preset and user defined events. * * @section Options *
    *
  • -h Display help information about this utility. *
  • -a Display only the available PAPI events. *
  • -c Display only the available PAPI events after an availability check. *
  • -d Display PAPI event information in a more detailed format. *
  • -e < event > Display detailed event information for the named event. * This event can be a preset event, a user defined event, or a native event. * If the event is a preset or a user defined event, the output shows a list of native * events the event is based on and the formula that is used to compute the event's final value.\n *
* * Event filtering options *
    *
  • --br Display branch related PAPI preset events *
  • --cache Display cache related PAPI preset events *
  • --cnd Display conditional PAPI preset events *
  • --fp Display Floating Point related PAPI preset events *
  • --ins Display instruction related PAPI preset events *
  • --idl Display Stalled or Idle PAPI preset events *
  • --l1 Display level 1 cache related PAPI preset events *
  • --l2 Display level 2 cache related PAPI preset events *
  • --l3 Display level 3 cache related PAPI preset events *
  • --mem Display memory related PAPI preset events *
  • --msc Display miscellaneous PAPI preset events *
  • --tlb Display Translation Lookaside Buffer PAPI preset events *
* @section Bugs * There are no known bugs in this utility. * If you find a bug, it should be reported to the PAPI Mailing List at . *
* @see PAPI_derived_event_files * */ // Define the PAPI_derived_event_files man page contents. /** * @page PAPI_derived_event_files * @brief Describes derived event definition file syntax. * * @section main Derived Events * PAPI provides the ability to define events whose value will be derived from multiple native events. The list of native * events to be used in a derived event and a formula which describes how to use them is provided in an event definition file. * The PAPI team provides an event definition file which describes all of the supported PAPI preset events. PAPI also allows * a user to provide an event definition file that describes a set of user defined events which can extend the events PAPI * normally supports. * * This page documents the syntax of the commands which can appear in an event definition file. * *
* @subsection rules General Rules: *
    *
  • Blank lines are ignored.
  • *
  • Lines that begin with '#' are comments (they are also ignored).
  • *
  • Names shown inside < > below represent values that must be provided by the user.
  • *
  • If a user provided value contains white space, it must be protected with quotes.
  • *
* *
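The rules above can be seen in a minimal event definition file fragment. This is an illustrative sketch only: the event name MY_CYCLES is a placeholder, while the PMU name (nhm) and native event (UNHALTED_CORE_CYCLES) are borrowed from the example later on this page.

```
# Lines starting with '#' are comments; blank lines are ignored.

CPU nhm
EVENT,MY_CYCLES,NOT_DERIVED,UNHALTED_CORE_CYCLES,LDESC,"a value with white space is quoted"
```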
* @subsection commands Commands: * @par CPU,<pmuName> * Specifies a PMU name which controls whether the PRESET and EVENT commands that follow this line should * be processed. Multiple CPU commands can be entered without PRESET or EVENT commands between them to provide * a list of PMU names to which the derived events that follow will apply. When a PMU name provided in the list * matches a PMU name known to the running system, the events which follow will be created. If none of the PMU * names provided in the list match a PMU name on the running system, the events which follow will be ignored. * When a new CPU command follows either a PRESET or EVENT command, the PMU list is rebuilt.

* * @par PRESET,<eventName>,<derivedType>,<eventAttr>,LDESC,"<longDesc>",SDESC,"<shortDesc>",NOTE,"<note>" * Declare a PAPI preset derived event.

* * @par EVENT,<eventName>,<derivedType>,<eventAttr>,LDESC,"<longDesc>",SDESC,"<shortDesc>",NOTE,"<note>" * Declare a user defined derived event.

* * @par Where: * @par pmuName: * The PMU which the following events should apply to. A list of PMU names supported by your * system can be obtained by running papi_component_avail on your system.
* @par eventName: * Specifies the name used to identify this derived event. This name should be unique within the events on your system.
* @par derivedType: * Specifies the kind of derived event being defined (see 'Derived Types' below).
* @par eventAttr: * Specifies a formula and a list of base events that are used to compute the derived events value. The syntax * of this field depends on the 'derivedType' specified above (see 'Derived Types' below).
* @par longDesc: * Provides the long description of the event.
* @par shortDesc: * Provides the short description of the event.
* @par note: * Provides an event note.
* @par baseEvent (used below): * Identifies an event on which this derived event is based. This may be a native event (possibly with event masks), * an already known preset event, or an already known user event.
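As a sketch of how these fields line up in practice, the following PRESET line fills each position named above (the descriptions after LDESC and SDESC are illustrative only; PAPI_TOT_CYC and UNHALTED_CORE_CYCLES come from the example later on this page):

```
#      eventName    derivedType eventAttr
PRESET,PAPI_TOT_CYC,NOT_DERIVED,UNHALTED_CORE_CYCLES,LDESC,"Total cycles",SDESC,"Tot cycles"
```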
* *
* @subsection notes Notes: * The PRESET command has traditionally been used in the PAPI provided preset definition file. * The EVENT command is intended to be used in user defined event definition files. The code treats them * the same so they are interchangeable and they can both be used in either event definition file.
* *
* @subsection types Derived Types: * This describes values allowed in the 'derivedType' field of the PRESET and EVENT commands. It also * shows the syntax of the 'eventAttr' field for each derived type supported by these commands. * All of the derived events provide a list of one or more events which the derived event is based * on (baseEvent). Some derived events provide a formula that specifies how to compute the derived * event's value using the baseEvents in the list. The following derived types are supported; the syntax * of the 'eventAttr' parameter for each derived event type is shown in parentheses.

* * @par NOT_DERIVED (<baseEvent>): * This derived type defines an alias for the existing event 'baseEvent'.
* @par DERIVED_ADD (<baseEvent1>,<baseEvent2>): * This derived type defines a new event that will be the sum of two other * events. It has a value of 'baseEvent1' plus 'baseEvent2'.
* @par DERIVED_PS (PAPI_TOT_CYC,<baseEvent1>): * This derived type defines a new event that will report the number of 'baseEvent1' events which occurred * per second. It has a value of ((('baseEvent1' * cpu_max_mhz) * 1000000 ) / PAPI_TOT_CYC). The user must * provide PAPI_TOT_CYC as the first event of two events in the event list for this to work correctly.
* @par DERIVED_ADD_PS (PAPI_TOT_CYC,<baseEvent1>,<baseEvent2>): * This derived type defines a new event that will add together two event counters and then report the number * which occurred per second. It has a value of (((('baseEvent1' + baseEvent2) * cpu_max_mhz) * 1000000 ) / PAPI_TOT_CYC). * The user must provide PAPI_TOT_CYC as the first event of three events in the event list for this to work correctly.
* @par DERIVED_CMPD (<baseEvent1>,<baseEvent2>, ... ,<baseEventn>) * @par DERIVED_SUB (<baseEvent1>,<baseEvent2>): * This derived type defines a new event that will be the difference between two other * events. It has a value of 'baseEvent1' minus 'baseEvent2'.
* @par DERIVED_POSTFIX (<pfFormula>,<baseEvent0>,<baseEvent1>, ... ,<baseEventn>): * This derived type defines a new event whose value is computed from several native events using * a postfix (reverse polish notation) formula. Its value is the result of processing the postfix * formula. The 'pfFormula' is of the form 'N0|N1|N2|5|*|+|-|' where the '|' acts as a token * separator and the tokens N0, N1, and N2 are placeholders that represent baseEvent0, baseEvent1, * and baseEvent2 respectively.
* @par DERIVED_INFIX (<ifFormula>,<baseEvent0>,<baseEvent1>, ... ,<baseEventn>): * This derived type defines a new event whose value is computed from several native events using * an infix (algebraic notation) formula. Its value is the result of processing the infix * formula. The 'ifFormula' is of the form 'N0-(N1+(N2*5))' where the tokens N0, N1, and N2 * are placeholders that represent baseEvent0, baseEvent1, and baseEvent2 respectively.
* *
* @subsection example Example: * In the following example, the events PAPI_SP_OPS, USER_SP_OPS, and ALIAS_SP_OPS will all measure the same events and return * the same value. They just demonstrate different ways to use the PRESET and EVENT event definition commands.

* *
    *
  • # The following lines define pmu names that all share the following events
  • *
  • CPU nhm
  • *
  • CPU nhm-ex
  • *
  • # Events which should be defined for either of the above pmu types
  • *
  • PRESET,PAPI_TOT_CYC,NOT_DERIVED,UNHALTED_CORE_CYCLES
  • *
  • PRESET,PAPI_REF_CYC,NOT_DERIVED,UNHALTED_REFERENCE_CYCLES
  • *
  • PRESET,PAPI_SP_OPS,DERIVED_POSTFIX,N0|N1|3|*|+|,FP_COMP_OPS_EXE:SSE_SINGLE_PRECISION,FP_COMP_OPS_EXE:SSE_FP_PACKED,NOTE,"Using a postfix formula"
  • *
  • EVENT,USER_SP_OPS,DERIVED_INFIX,N0+(N1*3),FP_COMP_OPS_EXE:SSE_SINGLE_PRECISION,FP_COMP_OPS_EXE:SSE_FP_PACKED,NOTE,"Using the same formula in infix format"
  • *
  • EVENT,ALIAS_SP_OPS,NOT_DERIVED,PAPI_SP_OPS,LDESC,"Alias for preset event PAPI_SP_OPS"
  • *
  • # End of event definitions for above pmu names and start of a section for a new pmu name.
  • *
  • CPU snb
  • *
* */ #include #include #include #include "papi.h" #include "print_header.h" static char * is_derived( PAPI_event_info_t * info ) { if ( strlen( info->derived ) == 0 ) return ( "No" ); else if ( strcmp( info->derived, "NOT_DERIVED" ) == 0 ) return ( "No" ); else if ( strcmp( info->derived, "DERIVED_CMPD" ) == 0 ) return ( "No" ); else return ( "Yes" ); } static void print_help( char **argv ) { printf( "This is the PAPI avail program.\n" ); printf( "It provides availability and details about PAPI Presets and User-defined Events.\n" ); printf( "PAPI Preset Event filters can be combined in a logical OR.\n" ); printf( "Usage: %s [options]\n", argv[0] ); printf( "Options:\n\n" ); printf( "General command options:\n" ); printf( "\t-h, --help Print this help message\n" ); printf( "\t-a, --avail Display only available PAPI preset and user defined events\n" ); printf( "\t-c, --check Display only available PAPI preset and user defined events after an availability check\n" ); printf( "\t-d, --detail Display detailed information about events\n" ); printf( "\t-e EVENTNAME Display detail information about specified event\n" ); printf( "\nEvent filtering options:\n" ); printf( "\t--br Display branch related PAPI preset events\n" ); printf( "\t--cache Display cache related PAPI preset events\n" ); printf( "\t--cnd Display conditional PAPI preset events\n" ); printf( "\t--fp Display Floating Point related PAPI preset events\n" ); printf( "\t--ins Display instruction related PAPI preset events\n" ); printf( "\t--idl Display Stalled or Idle PAPI preset events\n" ); printf( "\t--l1 Display level 1 cache related PAPI preset events\n" ); printf( "\t--l2 Display level 2 cache related PAPI preset events\n" ); printf( "\t--l3 Display level 3 cache related PAPI preset events\n" ); printf( "\t--mem Display memory related PAPI preset events\n" ); printf( "\t--msc Display miscellaneous PAPI preset events\n" ); printf( "\t--tlb Display Translation Lookaside Buffer PAPI preset events\n" ); 
printf( "\n" ); } static int parse_unit_masks( PAPI_event_info_t * info ) { char *pmask; if ( ( pmask = strchr( info->symbol, ':' ) ) == NULL ) { return ( 0 ); } memmove( info->symbol, pmask, ( strlen( pmask ) + 1 ) * sizeof ( char ) ); pmask = strchr( info->long_descr, ':' ); if ( pmask == NULL ) info->long_descr[0] = 0; else memmove( info->long_descr, pmask + sizeof ( char ), ( strlen( pmask ) + 1 ) * sizeof ( char ) ); return 1; } static int checkCounter (int eventcode) { int EventSet = PAPI_NULL; if (PAPI_create_eventset(&EventSet) != PAPI_OK) return 0; if (PAPI_add_event (EventSet, eventcode) != PAPI_OK) return 0; if (PAPI_cleanup_eventset (EventSet) != PAPI_OK) return 0; if (PAPI_destroy_eventset (&EventSet) != PAPI_OK) return 0; return 1; } static int get_max_symbol_length ( int initModifier, int iterModifier ) { int ecode = 0 | PAPI_PRESET_MASK; int len, maxLen = 0; PAPI_event_info_t info; /* In case of error, return the legacy value. */ if ( PAPI_enum_event( &ecode, initModifier ) != PAPI_OK ) { return 13; } do { if ( PAPI_get_event_info( ecode, &info ) == PAPI_OK ) { len = strlen(info.symbol); if ( len > maxLen ) { maxLen = len; } } } while ( PAPI_enum_event(&ecode, iterModifier) == PAPI_OK ); return maxLen+1; } static int print_comp_header_flag ( void ) { int numComps = PAPI_num_components(); const PAPI_component_info_t *cmpinfo; int cid, non_cpu_comps = 0; for ( cid = 0; cid < numComps; cid++ ) { cmpinfo = PAPI_get_component_info( cid ); if ( strcmp(cmpinfo->name, "perf_event") == 0 || strcmp(cmpinfo->name, "sysdetect") == 0 || strcmp(cmpinfo->name, "No Components Configured. ") == 0 ) { continue; } non_cpu_comps++; } return non_cpu_comps; } /* Checks whether a preset event is available. If it is available, the function returns 1, or 0 otherwise. 
*/ int is_preset_event_available(char *name) { int event_code = 0 | PAPI_PRESET_MASK; PAPI_event_info_t info; int check_counter = 1; if (PAPI_enum_event( &event_code, PAPI_ENUM_FIRST ) != PAPI_OK) { printf("error!"); exit(1); } /* Since some component presets require qualifiers, such as ":device=0", but * the base preset names do not contain qualifiers, then the qualifier must * first be stripped in order to find a match. */ char *localname = strdup(name); char *basename = strtok(localname, ":"); if( NULL == basename ) { basename = name; } /* Iterate over all the available preset events and compare them by names. */ do { if ( PAPI_get_event_info( event_code, &info ) == PAPI_OK ) { if ( info.count ) { if ( (check_counter && checkCounter (event_code)) || !check_counter) { if (strcmp(info.symbol, basename) == 0) return 1; } } } } while (PAPI_enum_event( &event_code, PAPI_PRESET_ENUM_AVAIL ) == PAPI_OK); /* Free the temporary, dynamically allocated buffer. */ free(localname); return 0; } int main( int argc, char **argv ) { int args, i, j, k; int retval; unsigned int filter = 0; int print_event_info = 0; char *name = NULL; int print_avail_only = PAPI_PRESET_ENUM_CPU; int print_tabular = 1; PAPI_event_info_t info; const PAPI_hw_info_t *hwinfo = NULL; int tot_count = 0; int avail_count = 0; int deriv_count = 0; int check_counter = 0; int event_code; PAPI_event_info_t n_info; /* Parse command line arguments */ for( args = 1; args < argc; args++ ) { if ( strstr( argv[args], "-e" ) ) { print_event_info = 1; if( (args+1 >= argc) || ( argv[args+1] == NULL ) || ( strlen( argv[args+1] ) == 0 ) ) { print_help( argv ); exit( 1 ); } name = argv[args + 1]; } else if ( ( !strstr( argv[args], "--") && strstr( argv[args], "-c" ) ) || strstr(argv[args], "--check") ) { print_avail_only = PAPI_PRESET_ENUM_CPU_AVAIL; check_counter = 1; } else if ( strstr( argv[args], "-a" )) print_avail_only = PAPI_PRESET_ENUM_CPU_AVAIL; else if ( strstr( argv[args], "-d" ) ) print_tabular = 0; else if ( 
strstr( argv[args], "-h" ) ) { print_help( argv ); exit( 1 ); } else if ( strstr( argv[args], "--br" ) ) filter |= PAPI_PRESET_BIT_BR; else if ( strstr( argv[args], "--cache" ) ) filter |= PAPI_PRESET_BIT_CACH; else if ( strstr( argv[args], "--cnd" ) ) filter |= PAPI_PRESET_BIT_CND; else if ( strstr( argv[args], "--fp" ) ) filter |= PAPI_PRESET_BIT_FP; else if ( strstr( argv[args], "--ins" ) ) filter |= PAPI_PRESET_BIT_INS; else if ( strstr( argv[args], "--idl" ) ) filter |= PAPI_PRESET_BIT_IDL; else if ( strstr( argv[args], "--l1" ) ) filter |= PAPI_PRESET_BIT_L1; else if ( strstr( argv[args], "--l2" ) ) filter |= PAPI_PRESET_BIT_L2; else if ( strstr( argv[args], "--l3" ) ) filter |= PAPI_PRESET_BIT_L3; else if ( strstr( argv[args], "--mem" ) ) filter |= PAPI_PRESET_BIT_MEM; else if ( strstr( argv[args], "--msc" ) ) filter |= PAPI_PRESET_BIT_MSC; else if ( strstr( argv[args], "--tlb" ) ) filter |= PAPI_PRESET_BIT_TLB; } if ( filter == 0 ) { filter = ( unsigned int ) ( -1 ); } /* Init PAPI */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { fprintf(stderr,"Error! 
PAPI library mismatch!\n"); return 1; } retval = PAPI_set_debug( PAPI_VERB_ECONT ); if ( retval != PAPI_OK ) { fprintf(stderr,"Error with PAPI_set debug!\n"); return 1; } retval=papi_print_header("Available PAPI preset and user defined events plus hardware information.\n", &hwinfo ); if ( retval != PAPI_OK ) { fprintf(stderr,"Error with PAPI_get_hardware_info!\n"); return 1; } /* Code for info on just one event */ if ( print_event_info ) { if ( PAPI_event_name_to_code( name, &event_code ) == PAPI_OK ) { if ( PAPI_get_event_info( event_code, &info ) == PAPI_OK ) { if ( event_code & PAPI_PRESET_MASK ) { printf( "%-30s%s\n%-30s%#-10x\n%-30s%d\n", "Event name:", info.symbol, "Event Code:", info.event_code, "Number of Native Events:", info.count ); printf( "%-29s|%s|\n%-29s|%s|\n%-29s|%s|\n", "Short Description:", info.short_descr, "Long Description:", info.long_descr, "Developer's Notes:", info.note ); printf( "%-29s|%s|\n%-29s|%s|\n", "Derived Type:", info.derived, "Postfix Processing String:", info.postfix ); for( j = 0; j < ( int ) info.count; j++ ) { printf( " Native Code[%d]: %#x |%s|\n", j, info.code[j], info.name[j] ); PAPI_get_event_info( (int) info.code[j], &n_info ); printf(" Number of Register Values: %d\n", n_info.count ); for( k = 0; k < ( int ) n_info.count; k++ ) { printf( " Register[%2d]: %#08x |%s|\n", k, n_info.code[k], n_info.name[k] ); } printf( " Native Event Description: |%s|\n\n", n_info.long_descr ); } if (!is_preset_event_available(name)) { printf("\nPRESET event %s is NOT available on this architecture!\n\n", name); } } else { /* must be a native event code */ printf( "%-30s%s\n%-30s%#-10x\n%-30s%d\n", "Event name:", info.symbol, "Event Code:", info.event_code, "Number of Register Values:", info.count ); printf( "%-29s|%s|\n", "Description:", info.long_descr ); for ( k = 0; k < ( int ) info.count; k++ ) { printf( " Register[%2d]: %#08x |%s|\n", k, info.code[k], info.name[k] ); } /* if unit masks exist but none are specified, process all */ if 
( !strchr( name, ':' ) ) { if ( 1 ) { if ( PAPI_enum_event( &event_code, PAPI_NTV_ENUM_UMASKS ) == PAPI_OK ) { printf( "\nUnit Masks:\n" ); do { retval = PAPI_get_event_info(event_code, &info ); if ( retval == PAPI_OK ) { if ( parse_unit_masks( &info ) ) { printf( "%-29s|%s|%s|\n", " Mask Info:", info.symbol, info.long_descr ); for ( k = 0; k < ( int ) info.count;k++ ) { printf( " Register[%2d]: %#08x |%s|\n", k, info.code[k], info.name[k] ); } } } } while ( PAPI_enum_event( &event_code, PAPI_NTV_ENUM_UMASKS ) == PAPI_OK ); } } } } } } else { printf( "Sorry, an event by the name '%s' could not be found.\n" " Is it typed correctly?\n\n", name ); } } else { /* Print *ALL* Events */ for (i=0 ; i<2 ; i++) { // set the event code to fetch preset events the first time through loop and user events the second time through the loop if (i== 0) { event_code = 0 | PAPI_PRESET_MASK; } else { event_code = 0 | PAPI_UE_MASK; } /* For consistency, always ASK FOR the first event, if there is not one then nothing to process */ if (PAPI_enum_event( &event_code, PAPI_ENUM_FIRST ) != PAPI_OK) { continue; } /* Get the length of the longest preset symbol. 
*/ int maxSymLen = get_max_symbol_length(PAPI_ENUM_FIRST, PAPI_PRESET_ENUM_CPU); int frontPad = (maxSymLen-4)/2; /* 4 == strlen("Name") */ int backPad = maxSymLen-4-frontPad; // print heading to show which kind of events follow if (i== 0) { printf( "================================================================================\n" ); printf( " PAPI Preset Events\n" ); printf( "================================================================================\n" ); } else { printf( "\n"); // put a blank line after the presets before strarting the user events printf( "================================================================================\n" ); printf( " User Defined Events\n" ); printf( "================================================================================\n" ); } if ( print_tabular ) { int spaceCnt = 0; for( spaceCnt = 0; spaceCnt < frontPad; ++spaceCnt ) { printf(" "); } printf( "Name"); for( spaceCnt = 0; spaceCnt < backPad; ++spaceCnt ) { printf(" "); } printf( " Code " ); if ( print_avail_only == PAPI_PRESET_ENUM_CPU ) { printf( "Avail " ); } printf( "Deriv Description (Note)\n" ); } else { printf( "%-13s%-11s%-8s%-16s\n |Long Description|\n" " |Developer's Notes|\n |Derived|\n |PostFix|\n" " Native Code[n]: |name|\n", "Symbol", "Event Code", "Count", "|Short Description|" ); } do { if ( PAPI_get_event_info( event_code, &info ) == PAPI_OK ) { if ( print_tabular ) { // if this is a user defined event or its a preset and matches the preset event filters, display its information if ( (i==1) || (filter & info.event_type)) { if ( print_avail_only == PAPI_PRESET_ENUM_CPU_AVAIL ) { if ( info.count ) { if ( (check_counter && checkCounter (event_code)) || !check_counter) { printf( "%-*s%#x %-5s%s", maxSymLen, info.symbol, info.event_code, is_derived( &info ), info.long_descr ); } } if ( info.note[0] ) { printf( " (%s)", info.note ); } printf( "\n" ); } else { printf( "%-*s%#x %-6s%-4s %s", maxSymLen, info.symbol, info.event_code, ( info.count ? 
"Yes" : "No" ), is_derived( &info ), info.long_descr ); if ( info.note[0] ) { printf( " (%s)", info.note ); } printf( "\n" ); } tot_count++; if ( info.count ) { if ((check_counter && checkCounter (event_code)) || !check_counter ) avail_count++; } if ( !strcmp( is_derived( &info ), "Yes" ) ) { deriv_count++; } } } else { if ( ( print_avail_only == PAPI_PRESET_ENUM_CPU_AVAIL && info.count ) || ( print_avail_only == PAPI_PRESET_ENUM_CPU ) ) { if ((check_counter && checkCounter (event_code)) || !check_counter) { printf( "%s\t%#x\t%d\t|%s|\n |%s|\n" " |%s|\n |%s|\n |%s|\n", info.symbol, info.event_code, info.count, info.short_descr, info.long_descr, info.note, info.derived, info.postfix ); for ( j = 0; j < ( int ) info.count; j++ ) { printf( " Native Code[%d]: %#x |%s|\n", j, info.code[j], info.name[j] ); } } } tot_count++; if ( info.count ) { if ((check_counter && checkCounter (event_code)) || !check_counter ) avail_count++; } if ( !strcmp( is_derived( &info ), "Yes" ) ) { deriv_count++; } } } } while (PAPI_enum_event( &event_code, print_avail_only ) == PAPI_OK); /* Repeat the logic for component presets. For consistency, always ASK FOR the first event, * if there is not one then nothing to process */ if (PAPI_enum_event( &event_code, PAPI_PRESET_ENUM_FIRST_COMP ) != PAPI_OK) { continue; } /* Print heading for component presets. */ if (i== 0) { if( print_avail_only == PAPI_PRESET_ENUM_CPU ) { print_avail_only = PAPI_ENUM_EVENTS; } else if( print_avail_only == PAPI_PRESET_ENUM_CPU_AVAIL ) { print_avail_only = PAPI_PRESET_ENUM_AVAIL; } /* Get the length of the longest component preset symbol. 
*/ int maxCompSymLen = get_max_symbol_length(PAPI_PRESET_ENUM_FIRST_COMP, PAPI_ENUM_EVENTS); int frontPad = (maxCompSymLen-4)/2; /* 4 == strlen("Name") */ int backPad = maxCompSymLen-4-frontPad; printf( "================================================================================\n" ); printf( " PAPI Component Preset Events\n" ); printf( "================================================================================\n" ); int printCompPresets = print_comp_header_flag(); if ( printCompPresets ) { if ( print_tabular ) { int spaceCnt = 0; for( spaceCnt = 0; spaceCnt < frontPad; ++spaceCnt ) { printf(" "); } printf( "Name"); for( spaceCnt = 0; spaceCnt < backPad; ++spaceCnt ) { printf(" "); } printf( " Code " ); if ( print_avail_only == PAPI_ENUM_EVENTS ) { printf( "Avail " ); } printf( "Deriv Description (Note)\n" ); } else { printf( "%-13s%-11s%-8s%-16s\n |Long Description|\n" " |Developer's Notes|\n |Derived|\n |PostFix|\n" " Native Code[n]: |name|\n", "Symbol", "Event Code", "Count", "|Short Description|" ); } } else { printf( "No components compiled in that support PAPI Component Preset Events.\n" ); } int first_flag = 1; do { if ( PAPI_get_event_info( event_code, &info ) == PAPI_OK ) { /* Skip disabled components */ const PAPI_component_info_t *component=PAPI_get_component_info(info.component_index); if (component->disabled && component->disabled != PAPI_EDELAY_INIT) { continue; } if( !first_flag ) { printf( "--------------------------------------------------------------------------------\n" ); } first_flag = 0; if ( print_tabular ) { // if this is a user defined event or its a preset and matches the preset event filters, display its information if ( filter & info.event_type ) { if ( print_avail_only == PAPI_PRESET_ENUM_AVAIL ) { if ( info.count ) { if ( (check_counter && checkCounter (event_code)) || !check_counter) { printf( "%-*s%#x %-5s%s\n", maxCompSymLen, info.symbol, info.event_code, is_derived( &info ), info.long_descr ); /* Add event to tally. 
*/ avail_count++; if ( !strcmp( is_derived( &info ), "Yes" ) ) { deriv_count++; } /* List the qualifiers. */ int k; for( k = 0; k < info.num_quals; ++k ) { printf(" %s\n %s\n", info.quals[k], info.quals_descrs[k]); } } } if ( info.note[0] ) { printf( " (%s)\n", info.note ); } } else { printf( "%-*s%#x %-6s%-4s %s\n", maxCompSymLen, info.symbol, info.event_code, ( info.count ? "Yes" : "No" ), is_derived( &info ), info.long_descr ); if ( info.note[0] ) { printf( " (%s)\n", info.note ); } /* List the qualifiers. */ int k; for( k = 0; k < info.num_quals; ++k ) { printf(" %s\n %s\n", info.quals[k], info.quals_descrs[k]); } tot_count++; if ( info.count ) { if ((check_counter && checkCounter (event_code)) || !check_counter ) avail_count++; } if ( !strcmp( is_derived( &info ), "Yes" ) ) { deriv_count++; } } } } else { if ( ( print_avail_only == PAPI_PRESET_ENUM_AVAIL && info.count ) || ( print_avail_only == PAPI_ENUM_EVENTS ) ) { if ((check_counter && checkCounter (event_code)) || !check_counter) { printf( "%s\t%#x\t%d\t|%s|\n |%s|\n" " |%s|\n |%s|\n |%s|\n", info.symbol, info.event_code, info.count, info.short_descr, info.long_descr, info.note, info.derived, info.postfix ); for ( j = 0; j < ( int ) info.count; j++ ) { printf( " Native Code[%d]: %#x |%s|\n", j, info.code[j], info.name[j] ); } } } tot_count++; if ( info.count ) { if ((check_counter && checkCounter (event_code)) || !check_counter ) avail_count++; } if ( !strcmp( is_derived( &info ), "Yes" ) ) { deriv_count++; } } } } while (PAPI_enum_event( &event_code, print_avail_only ) == PAPI_OK); printf( "================================================================================\n" ); } } } if ( !print_event_info ) { if ( print_avail_only == PAPI_PRESET_ENUM_CPU_AVAIL || print_avail_only == PAPI_PRESET_ENUM_AVAIL ) { printf( "Of %d available events, %d ", avail_count, deriv_count ); } else { printf( "Of %d possible events, %d are available, of which %d ", tot_count, avail_count, deriv_count ); } if ( deriv_count 
== 1 ) { printf( "is derived.\n\n" ); } else { printf( "are derived.\n\n" ); } if (avail_count==0) { printf("No events detected! Check papi_component_avail to find out why.\n"); printf("\n"); } } return 0; } papi-papi-7-2-0-t/src/utils/papi_clockres.c000066400000000000000000000026131502707512200205530ustar00rootroot00000000000000/** file clockres.c * * @page papi_clockres * @brief The papi_clockres utility. * @section Name * papi_clockres - measures and reports clock latency and resolution for PAPI timers. * * @section Synopsis * @section Description * papi_clockres is a PAPI utility program that measures and reports the * latency and resolution of the four PAPI timer functions: * PAPI_get_real_cyc(), PAPI_get_virt_cyc(), PAPI_get_real_usec() and PAPI_get_virt_usec(). * * @section Options * This utility has no command line options. * * @section Bugs * There are no known bugs in this utility. * If you find a bug, it should be reported to the PAPI Mailing List at . * */ #include #include #include "papi.h" #include "../testlib/clockcore.h" int main( int argc, char **argv ) { (void) argc; (void) argv; int retval; retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { fprintf(stderr,"Error with PAPI init!\n"); return 1; } retval = PAPI_set_debug( PAPI_VERB_ECONT ); if (retval != PAPI_OK ) { fprintf(stderr,"Error with PAPI_set_debug!\n"); return 1; } printf( "Printing Clock latency and resolution.\n" ); printf( "-----------------------------------------------\n" ); retval=clockcore( 0 ); if (retval<0) { fprintf(stderr,"Error reading clock!\n"); return retval; } return 0; } papi-papi-7-2-0-t/src/utils/papi_command_line.c000066400000000000000000000116621502707512200213770ustar00rootroot00000000000000/* file papi_command_line.c * This simply tries to add the events listed on the command line one at a time * then starts and stops the counters and prints the results */ /** * @page papi_command_line * @brief executes PAPI preset or native events 
from the command line. * * @section Synopsis * papi_command_line < event > < event > ... * * @section Description * papi_command_line is a PAPI utility program that adds named events from the * command line to a PAPI EventSet and does some work with that EventSet. * This serves as a handy way to see if events can be counted together, * and if they give reasonable results for known work. * * @section Options *
    *
  • -u Display output values as unsigned integers *
  • -x Display output values as hexadecimal *
  • -h Display help information about this utility. *
* * @section Bugs * There are no known bugs in this utility. * If you find a bug, it should be reported to the * PAPI Mailing List at . */ #include #include #include #include "papi.h" #include "do_loops.h" static void print_help( char **argv ) { printf( "Usage: %s [options] [EVENTNAMEs]\n", argv[0] ); printf( "Options:\n\n" ); printf( "General command options:\n" ); printf( "\t-u Display output values as unsigned integers\n" ); printf( "\t-x Display output values as hexadecimal\n" ); printf( "\t-h Print this help message\n" ); printf( "\tEVENTNAMEs Specify one or more preset or native events\n" ); printf( "\n" ); printf( "This utility performs work while measuring the specified events.\n" ); printf( "It can be useful for sanity checks on given events and sets of events.\n" ); } int main( int argc, char **argv ) { int retval; int num_events; long long *values; char *success; PAPI_event_info_t info; int EventSet = PAPI_NULL; int i, j, event, data_type = PAPI_DATATYPE_INT64; int u_format = 0; int hex_format = 0; printf( "\nThis utility lets you add events from the command line " "interface to see if they work.\n\n" ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { fprintf(stderr,"Error! PAPI_library_init\n"); exit(retval ); } retval = PAPI_create_eventset( &EventSet ); if (retval != PAPI_OK ) { fprintf(stderr,"Error! 
PAPI_create_eventset\n"); exit(retval ); } values = ( long long * ) malloc( sizeof ( long long ) * ( size_t ) argc ); success = ( char * ) malloc( ( size_t ) argc ); if ( success == NULL || values == NULL ) { fprintf(stderr,"Error allocating memory!\n"); exit(1); } for ( num_events = 0, i = 1; i < argc; i++ ) { if ( !strcmp( argv[i], "-h" ) ) { print_help( argv ); exit( 1 ); } else if ( !strcmp( argv[i], "-u" ) ) { u_format = 1; } else if ( !strcmp( argv[i], "-x" ) ) { hex_format = 1; } else { if ( ( retval = PAPI_add_named_event( EventSet, argv[i] ) ) != PAPI_OK ) { printf( "Failed adding: %s\nbecause: %s\n", argv[i], PAPI_strerror(retval)); } else { success[num_events++] = i; printf( "Successfully added: %s\n", argv[i] ); } } } /* Automatically pass if no events, for run_tests.sh */ if ( num_events == 0 ) { free(values); free(success); printf("No events specified!\n"); printf("Try running something like: %s PAPI_TOT_CYC\n\n", argv[0]); return 0; } printf( "\n" ); do_flops( 1 ); do_flush( ); retval = PAPI_start( EventSet ); if (retval != PAPI_OK ) { free(values); free(success); fprintf(stderr,"Error! PAPI_start\n"); exit( retval ); } do_flops( NUM_FLOPS ); do_misses( 1, L1_MISS_BUFFER_SIZE_INTS ); retval = PAPI_stop( EventSet, values ); if (retval != PAPI_OK ) { free(values); free(success); fprintf(stderr,"Error! PAPI_stop\n"); exit( retval ); } for ( j = 0; j < num_events; j++ ) { i = success[j]; if (! 
(u_format || hex_format) ) { retval = PAPI_event_name_to_code( argv[i], &event ); if (retval == PAPI_OK) { retval = PAPI_get_event_info(event, &info); if (retval == PAPI_OK) data_type = info.data_type; else data_type = PAPI_DATATYPE_INT64; } switch (data_type) { case PAPI_DATATYPE_UINT64: printf( "%s : \t%llu(u)", argv[i], (unsigned long long)values[j] ); break; case PAPI_DATATYPE_FP64: printf( "%s : \t%0.3f", argv[i], *((double *)(&values[j])) ); break; case PAPI_DATATYPE_BIT64: printf( "%s : \t%#llX", argv[i], values[j] ); break; case PAPI_DATATYPE_INT64: default: printf( "%s : \t%lld", argv[i], values[j] ); break; } if (retval == PAPI_OK) printf( " %s", info.units ); printf( "\n" ); } if (u_format) printf( "%s : \t%llu(u)\n", argv[i], (unsigned long long)values[j] ); if (hex_format) printf( "%s : \t%#llX\n", argv[i], values[j] ); } printf( "\n----------------------------------\n" ); free(values); free(success); return 0; } papi-papi-7-2-0-t/src/utils/papi_component_avail.c000066400000000000000000000131641502707512200221270ustar00rootroot00000000000000/** file papi_component_avail.c * @page papi_component_avail * @brief papi_component_avail utility. * @section NAME * papi_component_avail - provides detailed information on the PAPI components available on the system. * * @section Synopsis * * @section Description * papi_component_avail is a PAPI utility program that reports information * about the components papi was built with. * * @section Options *
    *
  • -h help message *
  • -d provide detailed information about each component. *
* * @section Bugs * There are no known bugs in this utility. * If you find a bug, it should be reported to the * PAPI Mailing List at . */ #include #include #include #include "papi.h" #include "print_header.h" #define EVT_LINE 80 typedef struct command_flags { int help; int details; int named; char *name; } command_flags_t; static void force_cmp_init(int cid); static void print_help( char **argv ) { printf( "This is the PAPI component avail program.\n" ); printf( "It provides availability of installed PAPI components.\n" ); printf( "Usage: %s [options]\n", argv[0] ); printf( "Options:\n\n" ); printf( " --help, -h print this help message\n" ); printf( " -d print detailed information on each component\n" ); } static void parse_args( int argc, char **argv, command_flags_t * f ) { int i; /* Look for all currently defined commands */ memset( f, 0, sizeof ( command_flags_t ) ); for ( i = 1; i < argc; i++ ) { if ( !strcmp( argv[i], "-d" ) ) { f->details = 1; } else if ( !strcmp( argv[i], "-h" ) || !strcmp( argv[i], "--help" ) ) f->help = 1; else printf( "%s is not supported\n", argv[i] ); } /* if help requested, print and bail */ if ( f->help ) { print_help( argv ); exit( 1 ); } } int main( int argc, char **argv ) { int i; int retval; const PAPI_hw_info_t *hwinfo = NULL; const PAPI_component_info_t* cmpinfo; command_flags_t flags; int numcmp, cid; /* Initialize before parsing the input arguments */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { fprintf(stderr,"Error! PAPI_library_init\n"); return retval; } parse_args( argc, argv, &flags ); retval = PAPI_set_debug( PAPI_VERB_ECONT ); if ( retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_set_debug\n"); return retval; } retval = papi_print_header( "Available components and " "hardware information.\n", &hwinfo ); if ( retval != PAPI_OK ) { fprintf(stderr,"Error! 
PAPI_get_hardware_info\n"); return 2; } /* Compiled-in Components */ numcmp = PAPI_num_components( ); printf("Compiled-in components:\n"); for ( cid = 0; cid < numcmp; cid++ ) { cmpinfo = PAPI_get_component_info( cid ); printf( "Name: %-23s %s\n", cmpinfo->name ,cmpinfo->description); if (cmpinfo->disabled == PAPI_EDELAY_INIT) { force_cmp_init(cid); } if (cmpinfo->disabled) { printf(" \\-> Disabled: %s\n",cmpinfo->disabled_reason); } if (cmpinfo->partially_disabled) { printf(" \\-> Partially disabled: %s\n", cmpinfo->partially_disabled_reason); } if ( flags.details ) { printf( " %-23s Version:\t\t\t%s\n", " ", cmpinfo->version ); printf( " %-23s Number of native events:\t%d\n", " ", cmpinfo->num_native_events); printf( " %-23s Number of preset events:\t%d\n", " ", cmpinfo->num_preset_events); printf("\n"); } } printf("\nActive components:\n"); numcmp = PAPI_num_components( ); for ( cid = 0; cid < numcmp; cid++ ) { cmpinfo = PAPI_get_component_info( cid ); if (cmpinfo->disabled) continue; printf( "Name: %-23s %s\n", cmpinfo->name ,cmpinfo->description); printf( " %-23s Native: %d, Preset: %d, Counters: %d\n", " ", cmpinfo->num_native_events, cmpinfo->num_preset_events, cmpinfo->num_cntrs); int pmus=0; for (i=0; ipmu_names[i] != NULL) pmus++; // Non-Null get printed. } if (pmus) { // If we have any, print. printf( " %-23s PMUs supported: ", " "); int line_len = 48, name_len; for (i=0 ; ipmu_names[i] == NULL) continue; name_len = strlen(cmpinfo->pmu_names[i]); if ((line_len + 2 + name_len) > 130) { // If it would be too long, printf("\n %-23s ", " "); // terminate line without printing current name, line_len = 48; // reset line length. } // if it is not the first entry on a line, separate the names if (line_len > 48) { printf(", "); line_len += 2; // account for the separator. } printf("%s", cmpinfo->pmu_names[i]); line_len += name_len; // Add the new name to the length. } printf("\n"); } // end if we had PMUs to print. printf("\n"); // extra line. 
if ( flags.details ) { printf( " %-23s Version:\t\t\t%s\n", " ", cmpinfo->version ); printf( " %-23s Fast counter read:\t\t%d\n", " ", cmpinfo->fast_counter_read); printf("\n"); } } printf ( "\n--------------------------------------------------------------------------------\n" ); return 0; } void force_cmp_init(int cid) { int nvt_code = 0 | PAPI_NATIVE_MASK; PAPI_enum_cmp_event(&nvt_code, PAPI_ENUM_FIRST, cid); } papi-papi-7-2-0-t/src/utils/papi_cost.c000066400000000000000000000317201502707512200177170ustar00rootroot00000000000000/** file papi_cost.c * @brief papi_cost utility. * @page papi_cost * @section NAME * papi_cost - computes execution time costs for basic PAPI operations. * * @section Synopsis * papi_cost [-dhps] [-b bins] [-t threshold] * * @section Description * papi_cost is a PAPI utility program that computes the min / max / mean / std. deviation * of execution times for PAPI start/stop pairs and for PAPI reads. * This information provides the basic operating cost to a user's program * for collecting hardware counter data. * Command line options control display capabilities. * * @section Options *
    *
  • -b < bins > Define the number of bins into which the results are * partitioned for display. The default is 100. *
  • -d Display a graphical distribution of costs in a vertical histogram. *
  • -h Display help information about this utility. *
  • -p Display 25/50/75 percentile results for making boxplots. *
  • -s Show the number of iterations in each of the first 10 * standard deviations above the mean. *
  • -t < threshold > Set the threshold for the number of iterations to * measure costs. The default is 1,000,000. *
* * @section Bugs * There are no known bugs in this utility. If you find a bug, * it should be reported to the PAPI Mailing List at . */ #include #include #include #include #include "papi.h" #include "cost_utils.h" /* Search for a derived event of type "type" */ static int find_derived( int i , char *type) { PAPI_event_info_t info; PAPI_enum_event( &i, PAPI_ENUM_FIRST ); do { if ( PAPI_get_event_info( i, &info ) == PAPI_OK ) { if ( strcmp( info.derived, type) == 0 ) { return i; } } } while ( PAPI_enum_event( &i, PAPI_PRESET_ENUM_AVAIL ) == PAPI_OK ); return PAPI_NULL; } /* Slight misnomer, find derived event != DERIVED_POSTFIX */ /* Will look for DERIVED_ADD, if not available will also accept DERIVED_SUB */ static int find_derived_add( int i ) { int ret; ret = find_derived( i, "DERIVED_ADD"); if (ret != PAPI_NULL) { return ret; } return find_derived( i, "DERIVED_SUB"); } static int find_derived_postfix( int i ) { return ( find_derived ( i, "DERIVED_POSTFIX" ) ); } static void print_help( void ) { printf( "This is the PAPI cost program.\n" ); printf( "It computes min / max / mean / std. deviation for PAPI start/stop pairs; for PAPI reads, and for PAPI_accums. Usage:\n\n" ); printf( " cost [options] [parameters]\n" ); printf( " cost TESTS_QUIET\n\n" ); printf( "Options:\n\n" ); printf( " -b BINS set the number of bins for the graphical distribution of costs. Default: 100\n" ); printf( " -d show a graphical distribution of costs\n" ); printf( " -h print this help message\n" ); printf( " -p print 25/50/75th percentile results for making boxplots\n"); printf( " -s show number of iterations above the first 10 std deviations\n" ); printf( " -t THRESHOLD set the threshold for the number of iterations. 
Default: 1,000,000\n" ); printf( "\n" ); } static void print_stats( int i, long long min, long long max, double average, double std ) { char *test[] = { "loop latency", "PAPI_start/stop (2 counters)", "PAPI_read (2 counters)", "PAPI_read_ts (2 counters)", "PAPI_accum (2 counters)", "PAPI_reset (2 counters)", "PAPI_read (1 derived_postfix counter)", "PAPI_read (1 derived_[add|sub] counter)" }; printf( "\nTotal cost for %s over %d iterations\n", test[i], num_iters ); printf( "min cycles : %lld\nmax cycles : %lld\n" "mean cycles : %lf\nstd deviation: %lf\n", min, max, average, std ); } static void print_std_dev( int *s ) { int i; printf( "\n" ); printf( " --------# Standard Deviations Above the Mean--------\n" ); printf( "0-------1-------2-------3-------4-------5-------6-------7-------8-------9-----10\n" ); for ( i = 0; i < 10; i++ ) printf( " %d\t", s[i] ); printf( "\n\n" ); } static void print_dist( long long min, long long max, int bins, int *d ) { int i, j; int step = ( int ) ( max - min ) / bins; printf( "\nCost distribution profile\n\n" ); for ( i = 0; i < bins; i++ ) { printf( "%8d:", ( int ) min + ( step * i ) ); if ( d[i] > 100 ) { printf( "**************************** %d counts ****************************", d[i] ); } else { for ( j = 0; j < d[i]; j++ ) printf( "*" ); } printf( "\n" ); } } static void print_percentile(long long percent25, long long percent50, long long percent75,long long percent99) { printf("25%% cycles : %lld\n50%% cycles : %lld\n" "75%% cycles : %lld\n99%% cycles : %lld\n", percent25,percent50,percent75,percent99); } static void do_output( int test_type, long long *array, int bins, int show_std_dev, int show_dist, int show_percentile ) { int s[10]; long long min, max; double average, std; long long percent25,percent50,percent75,percent99; std = do_stats( array, &min, &max, &average ); print_stats( test_type, min, max, average, std ); if (show_percentile) { do_percentile(array,&percent25,&percent50,&percent75,&percent99); 
print_percentile(percent25,percent50,percent75,percent99); } if ( show_std_dev ) { do_std_dev( array, s, std, average ); print_std_dev( s ); } if ( show_dist ) { int *d; d = calloc( bins , sizeof ( int ) ); do_dist( array, min, max, bins, d ); print_dist( min, max, bins, d ); free( d ); } } int main( int argc, char **argv ) { int i, retval, EventSet = PAPI_NULL; int retval_start,retval_stop; int bins = 100; int show_dist = 0, show_std_dev = 0, show_percent = 0; long long totcyc, values[2]; long long *array; int event; PAPI_event_info_t info; int c; /* Check command-line arguments */ while ( (c=getopt(argc, argv, "hb:dpst:") ) != -1) { switch(c) { case 'h': print_help(); exit(1); case 'b': bins=atoi(optarg); break; case 'd': show_dist=1; break; case 'p': show_percent=1; break; case 's': show_std_dev=1; break; case 't': num_iters=atoi(optarg); break; default: print_help(); exit(1); break; } } printf( "Cost of execution for PAPI start/stop, read and accum.\n" ); printf( "This test takes a while. 
Please be patient...\n" ); retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { fprintf(stderr,"PAPI_library_init\n"); exit(retval); } retval = PAPI_set_debug( PAPI_VERB_ECONT ); if (retval != PAPI_OK ) { fprintf(stderr,"PAPI_set_debug\n"); exit(retval); } retval = PAPI_query_event( PAPI_TOT_CYC ); if (retval != PAPI_OK ) { fprintf(stderr,"PAPI_query_event\n"); exit(retval); } retval = PAPI_query_event( PAPI_TOT_INS ); if (retval != PAPI_OK ) { fprintf(stderr,"PAPI_query_event\n"); exit(retval); } retval = PAPI_create_eventset( &EventSet ); if (retval != PAPI_OK ) { fprintf(stderr,"PAPI_create_eventset\n"); exit(retval); } retval = PAPI_add_event( EventSet, PAPI_TOT_CYC ); if (retval != PAPI_OK ) { fprintf(stderr,"PAPI_add_event\n"); exit(retval); } retval = PAPI_add_event( EventSet, PAPI_TOT_INS ); if (retval != PAPI_OK ) { retval = PAPI_add_event( EventSet, PAPI_TOT_IIS ); if (retval != PAPI_OK ) { fprintf(stderr,"PAPI_add_event\n"); exit(retval); } } /* Make sure no errors and warm up */ totcyc = PAPI_get_real_cyc( ); if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_start"); exit(retval); } if ( ( retval = PAPI_stop( EventSet, NULL ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_stop"); exit(retval); } array = ( long long * ) malloc( ( size_t ) num_iters * sizeof ( long long ) ); if ( array == NULL ) { fprintf(stderr,"Error allocating memory for results\n"); exit(retval); } /* Determine clock latency */ printf( "\nPerforming loop latency test...\n" ); for ( i = 0; i < num_iters; i++ ) { totcyc = PAPI_get_real_cyc( ); totcyc = PAPI_get_real_cyc( ) - totcyc; array[i] = totcyc; } do_output( 0, array, bins, show_std_dev, show_dist, show_percent ); /* Start the start/stop eval */ printf( "\nPerforming start/stop test...\n" ); for ( i = 0; i < num_iters; i++ ) { totcyc = PAPI_get_real_cyc( ); retval_start=PAPI_start( EventSet ); retval_stop=PAPI_stop( EventSet, values ); totcyc = PAPI_get_real_cyc( ) - totcyc; array[i] 
= totcyc; if (retval_start || retval_stop) { fprintf(stderr,"PAPI start/stop\n"); exit(retval_start ); } } do_output( 1, array, bins, show_std_dev, show_dist, show_percent ); /* Start the read eval */ printf( "\nPerforming read test...\n" ); if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_start"); exit(retval); } PAPI_read( EventSet, values ); for ( i = 0; i < num_iters; i++ ) { totcyc = PAPI_get_real_cyc( ); PAPI_read( EventSet, values ); totcyc = PAPI_get_real_cyc( ) - totcyc; array[i] = totcyc; } if ( ( retval = PAPI_stop( EventSet, values ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_stop"); exit(retval); } do_output( 2, array, bins, show_std_dev, show_dist, show_percent ); /* Start the read with timestamp eval */ printf( "\nPerforming read with timestamp test...\n" ); if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_start"); exit(retval); } PAPI_read_ts( EventSet, values, &totcyc ); for ( i = 0; i < num_iters; i++ ) { PAPI_read_ts( EventSet, values, &array[i] ); } if ( ( retval = PAPI_stop( EventSet, values ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_stop"); exit(retval); } /* post-process the timing array */ for ( i = num_iters - 1; i > 0; i-- ) { array[i] -= array[i - 1]; } array[0] -= totcyc; do_output( 3, array, bins, show_std_dev, show_dist, show_percent ); /* Start the accum eval */ printf( "\nPerforming accum test...\n" ); if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_start"); exit(retval); } PAPI_accum( EventSet, values ); for ( i = 0; i < num_iters; i++ ) { totcyc = PAPI_get_real_cyc( ); PAPI_accum( EventSet, values ); totcyc = PAPI_get_real_cyc( ) - totcyc; array[i] = totcyc; } if ( ( retval = PAPI_stop( EventSet, values ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_stop"); exit(retval); } do_output( 4, array, bins, show_std_dev, show_dist, show_percent ); /* Start the reset eval */ printf( "\nPerforming reset test...\n" ); if ( ( retval = PAPI_start( EventSet ) ) != PAPI_OK ) { 
fprintf(stderr,"PAPI_start"); exit(retval); } for ( i = 0; i < num_iters; i++ ) { totcyc = PAPI_get_real_cyc( ); PAPI_reset( EventSet ); totcyc = PAPI_get_real_cyc( ) - totcyc; array[i] = totcyc; } if ( ( retval = PAPI_stop( EventSet, values ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_stop"); exit(retval); } do_output( 5, array, bins, show_std_dev, show_dist, show_percent ); /* Derived POSTFIX event test */ PAPI_cleanup_eventset( EventSet ); event = 0 | PAPI_PRESET_MASK; event = find_derived_postfix( event ); if ( event != PAPI_NULL ) { PAPI_get_event_info(event, &info); printf( "\nPerforming DERIVED_POSTFIX " "PAPI_read(%d counters) test (%s)...", info.count, info.symbol ); retval = PAPI_add_event( EventSet, event); if ( retval != PAPI_OK ) { fprintf(stderr,"PAPI_add_event"); exit(retval); } retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { fprintf(stderr,"PAPI_start"); exit(retval); } PAPI_read( EventSet, values ); for ( i = 0; i < num_iters; i++ ) { totcyc = PAPI_get_real_cyc( ); PAPI_read( EventSet, values ); totcyc = PAPI_get_real_cyc( ) - totcyc; array[i] = totcyc; } retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) { fprintf(stderr,"PAPI_stop"); exit(retval); } do_output( 6, array, bins, show_std_dev, show_dist, show_percent ); } else { printf("\tI was unable to find a DERIVED_POSTFIX preset event " "to test on this architecture, skipping.\n"); } /* Find a derived ADD event */ PAPI_cleanup_eventset( EventSet ); event = 0 | PAPI_PRESET_MASK; event = find_derived_add( event ); if ( event != PAPI_NULL ) { PAPI_get_event_info(event, &info); printf( "\nPerforming DERIVED_[ADD|SUB] " "PAPI_read(%d counters) test (%s)...", info.count, info.symbol ); retval = PAPI_add_event( EventSet, event); if ( retval != PAPI_OK ) { fprintf(stderr,"PAPI_add_event\n"); exit(retval); } retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { fprintf(stderr,"PAPI_start"); exit(retval); } PAPI_read( EventSet, values ); for ( i = 0; i < num_iters; i++ ) { 
totcyc = PAPI_get_real_cyc( ); PAPI_read( EventSet, values ); totcyc = PAPI_get_real_cyc( ) - totcyc; array[i] = totcyc; } retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) { fprintf(stderr,"PAPI_stop"); exit(retval); } do_output( 7, array, bins, show_std_dev, show_dist, show_percent ); } else { printf("\tI was unable to find a suitable DERIVED_[ADD|SUB] " "event to test, skipping.\n"); } free( array ); return 0; } papi-papi-7-2-0-t/src/utils/papi_decode.c000066400000000000000000000074421502707512200201760ustar00rootroot00000000000000/* This file decodes the preset events into a csv format file */ /** file papi_decode.c * @brief papi_decode utility. * @page papi_decode * @section NAME * papi_decode - provides availability and detail information for PAPI preset events. * * @section Synopsis * papi_decode [-ah] * * @section Description * papi_decode is a PAPI utility program that converts the PAPI presets * for the existing library into a comma separated value format that can * then be viewed or modified in spreadsheet applications or text editors, * and can be supplied to PAPI_encode_events (3) as a way of adding or * modifying event definitions for specialized applications. * The format for the csv output consists of a line of field names, followed * by a blank line, followed by one line of comma separated values for each * event contained in the preset table. * A portion of this output (for Pentium 4) is shown below: * @code * name,derived,postfix,short_descr,long_descr,note,[native,...] * PAPI_L1_ICM,NOT_DERIVED,,"L1I cache misses","Level 1 instruction cache misses",,BPU_fetch_request_TCMISS * PAPI_L2_TCM,NOT_DERIVED,,"L2 cache misses","Level 2 cache misses",,BSQ_cache_reference_RD_2ndL_MISS_WR_2ndL_MISS * PAPI_TLB_DM,NOT_DERIVED,,"Data TLB misses","Data translation lookaside buffer misses",,page_walk_type_DTMISS * @endcode * * @section Options *
    *
  • -a Convert only the available PAPI preset events. *
  • -h Display help information about this utility. *
* * @section Bugs * There are no known bugs in this utility. * If you find a bug, it should be reported to the * PAPI Mailing List at . */ #include #include #include #include "papi.h" static void print_help( void ) { printf( "This is the PAPI decode utility program.\n" ); printf( "It decodes PAPI preset events into csv formatted text.\n" ); printf( "By default all presets are decoded.\n" ); printf( "The text goes to stdout, but can be piped to a file.\n" ); printf( "Such a file can be edited in a text editor or spreadsheet.\n" ); printf( "It can also be parsed by PAPI_encode_events.\n" ); printf( "Usage:\n\n" ); printf( " decode [options]\n\n" ); printf( "Options:\n\n" ); printf( " -a decode only available PAPI preset events\n" ); printf( " -h print this help message\n" ); printf( "\n" ); } int main( int argc, char **argv ) { int i, j; int retval; int print_avail_only = 0; PAPI_event_info_t info; (void)argc; (void)argv; for ( i = 1; i < argc; i++ ) if ( argv[i] ) { if ( !strcmp( argv[i], "-a" ) ) print_avail_only = PAPI_PRESET_ENUM_AVAIL; else if ( !strcmp( argv[i], "-h" ) ) { print_help( ); exit( 1 ); } else { print_help( ); exit( 1 ); } } retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { fprintf(stderr,"Error with PAPI_library_init!\n"); return retval; } retval = PAPI_set_debug( PAPI_VERB_ECONT ); if ( retval != PAPI_OK ) { fprintf(stderr,"Error with PAPI_set_debug\n"); return retval; } i = PAPI_PRESET_MASK; printf ( "name,derived,postfix,short_descr,long_descr,note,[native,...]\n\n" ); do { if ( PAPI_get_event_info( i, &info ) == PAPI_OK ) { printf( "%s,%s,%s,", info.symbol, info.derived, info.postfix ); if ( info.short_descr[0] ) { printf( "\"%s\",", info.short_descr ); } else { printf( "," ); } if ( info.long_descr[0] ) { printf( "\"%s\",", info.long_descr ); } else { printf( "," ); } if ( info.note[0] ) printf( "\"%s\"", info.note ); for ( j = 0; j < ( int ) info.count; j++ ) printf( ",%s", info.name[j] ); printf( "\n" ); } } 
while ( PAPI_enum_event( &i, print_avail_only ) == PAPI_OK ); return 0; } papi-papi-7-2-0-t/src/utils/papi_error_codes.c000066400000000000000000000034101502707512200212500ustar00rootroot00000000000000/* * This utility loops through all the PAPI error codes and displays them in * table format */ /** file error_codes.c * @brief papi_error_codes utility. * @page papi_error_codes * @section NAME * papi_error_codes - lists all currently defined PAPI error codes. * * @section Synopsis * papi_error_codes * * @section Description * papi_error_codes is a PAPI utility program that displays all defined * error codes from papi.h and their error strings from papi_data.h. * If an error string is not defined, a warning is generated. This can * help trap newly defined error codes for which error strings are not * yet defined. * * @section Options * This utility has no command line options. * * @section Bugs * There are no known bugs in this utility. * If you find a bug, it should be reported to the * PAPI Mailing List at . 
*/
#include <stdio.h>
#include "papi.h"

int main( int argc, char **argv )
{
	int i = 0;
	int retval;

	(void)argc;
	(void)argv;

	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		fprintf( stderr, "Error with PAPI_library_init!\n" );
		return retval;
	}

	printf( "\n----------------------------------\n" );
	printf( "For PAPI Version: %d.%d.%d.%d\n",
			PAPI_VERSION_MAJOR( PAPI_VERSION ),
			PAPI_VERSION_MINOR( PAPI_VERSION ),
			PAPI_VERSION_REVISION( PAPI_VERSION ),
			PAPI_VERSION_INCREMENT( PAPI_VERSION ) );
	printf( "----------------------------------\n" );

	while ( 1 ) {
		char *errstr;

		errstr = PAPI_strerror( -i );
		if ( NULL == errstr ) {
			break;
		}
		printf( "Error code %4d: %s\n", -i, errstr );
		i++;
	}

	printf( "There are %d error codes defined\n", i );
	printf( "----------------------------------\n\n" );

	return 0;
}
papi-papi-7-2-0-t/src/utils/papi_event_chooser.c000066400000000000000000000153631502707512200216150ustar00rootroot00000000000000/** file event_chooser.c
 * @brief papi_event_chooser utility.
 * @page papi_event_chooser
 * @section NAME
 * papi_event_chooser - given a list of named events,
 * lists other events that can be counted with them.
 *
 * @section Synopsis
 * papi_event_chooser NATIVE | PRESET < event > < event > ...
 *
 * @section Description
 * papi_event_chooser is a PAPI utility program that reports information
 * about the current PAPI installation and supported preset events.
 *
 * @section Options
 * This utility has no command line options.
 *
 * @section Bugs
 * There are no known bugs in this utility.
 * If you find a bug, it should be reported to the
 * PAPI Mailing List at <ptools-perfapi@icl.utk.edu>.
*/ #include #include #include #include "papi.h" #include "print_header.h" int EventSet = PAPI_NULL; int retval; static char * is_derived( PAPI_event_info_t * info ) { if ( strlen( info->derived ) == 0 ) return ( "No" ); else if ( strcmp( info->derived, "NOT_DERIVED" ) == 0 ) return ( "No" ); else if ( strcmp( info->derived, "DERIVED_CMPD" ) == 0 ) return ( "No" ); else return ( "Yes" ); } static int add_remove_event( int EventSet, int evt ) { int retval; if ( ( retval = PAPI_add_event( EventSet, evt ) ) != PAPI_OK ) { //printf( "Error adding event.\n" ); } else { if ( ( retval = PAPI_remove_event( EventSet, evt ) ) != PAPI_OK ) { printf( "Error removing event.\n" ); } } return retval; } static int show_event_info( int evt ) { int k; int retval; PAPI_event_info_t info; if ( ( retval = PAPI_get_event_info( evt, &info ) ) == PAPI_OK ) { printf( "%s\t%#x\n |%s|\n", info.symbol, info.event_code, info.long_descr ); for( k = 0; k < ( int ) info.count; k++ ) { if ( strlen( info.name[k] ) ) { printf( " |Register Value[%d]: %#-10x %s|\n", k, info.code[k], info.name[k] ); } } } return retval; } static int native( int cidx ) { int i, j, k; int retval, added; PAPI_event_info_t info; j = 0; /* For platform independence, always ASK FOR the first event */ /* Don't just assume it'll be the first numeric value */ i = 0 | PAPI_NATIVE_MASK; retval=PAPI_enum_cmp_event( &i, PAPI_ENUM_FIRST, cidx ); if (retval==PAPI_ENOEVNT) { printf("Cannot find first event in component %d\n",cidx); } do { k = i; if ( PAPI_enum_cmp_event( &k, PAPI_NTV_ENUM_UMASKS, cidx) == PAPI_OK ) { if ( ( added = add_remove_event( EventSet, k ) ) == PAPI_OK ) { show_event_info( i ); do { retval = PAPI_get_event_info( k, &info ); if ( retval == PAPI_OK ) { printf( " %#-10x%s |%s|\n", info.event_code, strchr( info.symbol, ':' ), strchr( info.long_descr, ':' ) + 1 ); } } while ( PAPI_enum_cmp_event( &k, PAPI_NTV_ENUM_UMASKS, cidx ) == PAPI_OK ); j++; } } else { if ( ( added = add_remove_event( EventSet, i ) ) == PAPI_OK 
) { show_event_info( i ); j++; } } if ( added == PAPI_OK ) { /* modifier = PAPI_NTV_ENUM_GROUPS returns event codes with a groups id for each group in which this native event lives, in bits 16 - 23 of event code terminating with PAPI_ENOEVNT at the end of the list. */ k = i; if ( PAPI_enum_cmp_event( &k, PAPI_NTV_ENUM_GROUPS, cidx ) == PAPI_OK ) { printf( "Groups: " ); do { printf( "%4d", ( ( k & PAPI_NTV_GROUP_AND_MASK ) >> PAPI_NTV_GROUP_SHIFT ) - 1 ); } while ( PAPI_enum_cmp_event( &k, PAPI_NTV_ENUM_GROUPS, cidx ) == PAPI_OK ); printf( "\n" ); } printf( "---------------------------------------------" "----------------------------\n" ); } } while ( PAPI_enum_cmp_event( &i, PAPI_ENUM_EVENTS, cidx ) == PAPI_OK ); printf( "------------------------------------------" "-------------------------------\n" ); printf( "Total events reported: %d\n", j ); exit( 0 ); } static int preset( void ) { int i, j = 0; int retval; PAPI_event_info_t info; printf( " Name Code " ); printf( "Deriv Description (Note)\n" ); /* For consistency, always ASK FOR the first event */ i = 0 | PAPI_PRESET_MASK; PAPI_enum_event( &i, PAPI_ENUM_FIRST ); do { retval = PAPI_add_event( EventSet, i ); if ( retval == PAPI_OK ) { if ( PAPI_get_event_info( i, &info ) == PAPI_OK ) { printf( "%-13s%#x %-5s%s", info.symbol, info.event_code, is_derived( &info ), info.long_descr ); if ( info.note[0] ) printf( " (%s)", info.note ); printf( "\n" ); } if ( ( retval = PAPI_remove_event( EventSet, i ) ) != PAPI_OK ) printf( "Error in PAPI_remove_event\n" ); j++; } } while ( PAPI_enum_event( &i, PAPI_PRESET_ENUM_AVAIL ) == PAPI_OK ); printf ( "-------------------------------------------------------------------------\n" ); printf( "Total events reported: %d\n", j ); exit( 0 ); } int main( int argc, char **argv ) { int i; int pevent,cevent; int cidx; const PAPI_hw_info_t *hwinfo = NULL; if ( argc < 3 ) { goto use_exit; } /* Init PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != 
PAPI_VER_CURRENT ) { fprintf(stderr,"Error! PAPI_library_init\n"); return retval; } retval = PAPI_set_debug( PAPI_VERB_ECONT ); if ( retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_set_debug\n"); return retval; } retval = papi_print_header( "Event Chooser: Available events " "which can be added with given events.\n", &hwinfo ); if ( retval != PAPI_OK ) { fprintf(stderr, "Error! PAPI_get_hardware_info\n"); return 2; } retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { fprintf( stderr, "PAPI_create_eventset error\n" ); return 1; } retval = PAPI_event_name_to_code( argv[2], &cevent ); if ( retval != PAPI_OK ) { fprintf( stderr, "Event %s can't be found\n", argv[2] ); return 1; } cidx = PAPI_get_event_component(cevent); for( i = 2; i < argc; i++ ) { retval = PAPI_event_name_to_code( argv[i], &pevent ); if ( retval != PAPI_OK ) { fprintf( stderr, "Event %s can't be found\n", argv[i] ); return 1; } retval = PAPI_add_event( EventSet, pevent ); if ( retval != PAPI_OK ) { fprintf( stderr, "Event %s can't be counted with others %d\n", argv[i], retval ); return 1; } } if ( !strcmp( "NATIVE", argv[1] ) ) { native( cidx ); } else if ( !strcmp( "PRESET", argv[1] ) ) { preset( ); } else { goto use_exit; } return 0; use_exit: fprintf( stderr, "Usage: papi_event_chooser NATIVE|PRESET evt1 evt2 ... \n" ); return 1; } papi-papi-7-2-0-t/src/utils/papi_hardware_avail.c000066400000000000000000000437461502707512200217330ustar00rootroot00000000000000/** file papi_hardware_avail.c * @page papi_hardware_avail * @brief papi_hardware_avail utility. * @section NAME * papi_hardware_avail - provides detailed information on the hardware available in the system. * * @section Synopsis * * @section Description * papi_hardware_avail is a PAPI utility program that reports information * about the hardware devices equipped in the system. * * @section Options *
 <ul>
 * <li>-h help message
 * </ul>
* * @section Bugs * There are no known bugs in this utility. * If you find a bug, it should be reported to the * PAPI Mailing List at . */ #include #include #include #include "papi.h" #include "print_header.h" typedef struct command_flags { int help; } command_flags_t; static void print_help( char **argv ) { printf( "This is the PAPI hardware avail program.\n" ); printf( "It provides availability of system's equipped hardware devices.\n" ); printf( "Usage: %s [options]\n", argv[0] ); printf( "Options:\n\n" ); printf( " --help, -h print this help message\n" ); } static void parse_args( int argc, char **argv, command_flags_t * f ) { int i; /* Look for all currently defined commands */ memset( f, 0, sizeof ( command_flags_t ) ); for ( i = 1; i < argc; i++ ) { if ( !strcmp( argv[i], "-h" ) || !strcmp( argv[i], "--help" ) ) f->help = 1; else printf( "%s is not supported\n", argv[i] ); } /* if help requested, print and bail */ if ( f->help ) { print_help( argv ); exit( 1 ); } } int main( int argc, char **argv ) { int i; int retval; const PAPI_component_info_t *cmpinfo = NULL; command_flags_t flags; int numcmp; int sysdetect_avail = 0; /* Initialize before parsing the input arguments */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { fprintf(stderr,"Error! PAPI_library_init\n"); return retval; } parse_args( argc, argv, &flags ); retval = PAPI_set_debug( PAPI_VERB_ECONT ); if ( retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_set_debug\n"); return retval; } numcmp = PAPI_num_components( ); for (i = 0; i < numcmp; i++) { cmpinfo = PAPI_get_component_info( i ); if (strcmp("sysdetect", cmpinfo->name) == 0) sysdetect_avail = 1; } if (sysdetect_avail == 0) { fprintf(stderr, "Error! 
Sysdetect component not enabled\n"); return 0; } printf( "\nDevice Summary -----------------------------------------------------------------\n" ); void *handle; int enum_modifier = PAPI_DEV_TYPE_ENUM__ALL; int id, vendor_id, dev_count; const char *vendor_name, *status; printf( "Vendor DevCount \n" ); while (PAPI_enum_dev_type(enum_modifier, &handle) == PAPI_OK) { PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__CHAR_NAME, &vendor_name); PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_COUNT, &dev_count); PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__CHAR_STATUS, &status); printf( "%-18s (%d)\n", vendor_name, dev_count); printf( " \\-> Status: %s\n", status ); printf( "\n" ); } printf( "\nDevice Information -------------------------------------------------------------\n" ); while (PAPI_enum_dev_type(enum_modifier, &handle) == PAPI_OK) { PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_PAPI_ID, &id); PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_VENDOR_ID, &vendor_id); PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__CHAR_NAME, &vendor_name); PAPI_get_dev_type_attr(handle, PAPI_DEV_TYPE_ATTR__INT_COUNT, &dev_count); if ( id == PAPI_DEV_TYPE_ID__CPU && dev_count > 0 ) { unsigned int numas = 1; for ( i = 0; i < dev_count; ++i ) { const char *cpu_name; unsigned int family, model, stepping; unsigned int sockets, cores, threads; unsigned int l1i_size, l1d_size, l2u_size, l3u_size; unsigned int l1i_line_sz, l1d_line_sz, l2u_line_sz, l3u_line_sz; unsigned int l1i_line_cnt, l1d_line_cnt, l2u_line_cnt, l3u_line_cnt; unsigned int l1i_cache_ass, l1d_cache_ass, l2u_cache_ass, l3u_cache_ass; PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_CHAR_NAME, &cpu_name); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_FAMILY, &family); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_MODEL, &model); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_STEPPING, &stepping); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_SOCKET_COUNT, &sockets); 
PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_NUMA_COUNT, &numas); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_CORE_COUNT, &cores); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_THREAD_COUNT, &threads); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_SIZE, &l1i_size); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_SIZE, &l1d_size); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_SIZE, &l2u_size); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_SIZE, &l3u_size); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_LINE_SIZE, &l1i_line_sz); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_LINE_SIZE, &l1d_line_sz); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_LINE_SIZE, &l2u_line_sz); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_LINE_SIZE, &l3u_line_sz); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_LINE_COUNT, &l1i_line_cnt); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_LINE_COUNT, &l1d_line_cnt); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_LINE_COUNT, &l2u_line_cnt); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_LINE_COUNT, &l3u_line_cnt); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1I_CACHE_ASSOC, &l1i_cache_ass); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L1D_CACHE_ASSOC, &l1d_cache_ass); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L2U_CACHE_ASSOC, &l2u_cache_ass); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CPU_UINT_L3U_CACHE_ASSOC, &l3u_cache_ass); printf( "Vendor : %s (%u,0x%x)\n", vendor_name, vendor_id, vendor_id ); printf( "Id : %u\n", i ); printf( "Name : %s\n", cpu_name ); printf( "CPUID : Family/Model/Stepping %u/%u/%u 0x%02x/0x%02x/0x%02x\n", family, model, stepping, family, model, stepping ); printf( "Sockets : %u\n", sockets ); printf( "Numa regions : %u\n", numas ); printf( "Cores per socket : %u\n", cores ); 
printf( "Cores per NUMA region : %u\n", threads / numas ); printf( "SMT threads per core : %u\n", threads / sockets / cores ); if (l1i_size > 0) { printf( "L1i Cache : Size/LineSize/Lines/Assoc %uKB/%uB/%u/%u\n", l1i_size >> 10, l1i_line_sz, l1i_line_cnt, l1i_cache_ass); printf( "L1d Cache : Size/LineSize/Lines/Assoc %uKB/%uB/%u/%u\n", l1d_size >> 10, l1d_line_sz, l1d_line_cnt, l1d_cache_ass); } if (l2u_size > 0) { printf( "L2 Cache : Size/LineSize/Lines/Assoc %uKB/%uB/%u/%u\n", l2u_size >> 10, l2u_line_sz, l2u_line_cnt, l2u_cache_ass ); } if (l3u_size > 0) { printf( "L3 Cache : Size/LineSize/Lines/Assoc %uKB/%uB/%u/%u\n", l3u_size >> 10, l3u_line_sz, l3u_line_cnt, l3u_cache_ass ); } #define MAX_NUMA_NODES (16) #define MAX_CPU_THREADS (512) unsigned int j; unsigned int affinity[MAX_CPU_THREADS]; unsigned int numa_threads_count[MAX_NUMA_NODES] = { 0 }; unsigned int numa_threads[MAX_NUMA_NODES][MAX_CPU_THREADS]; for (j = 0; j < threads; ++j) { PAPI_get_dev_attr(handle, j, PAPI_DEV_ATTR__CPU_UINT_THR_NUMA_AFFINITY, &affinity[j]); if( affinity[j] >= 0 ) numa_threads[affinity[j]][numa_threads_count[affinity[j]]++] = j; } for ( j = 0; j < numas; ++j ) { unsigned int k, memsize; PAPI_get_dev_attr(handle, j, PAPI_DEV_ATTR__CPU_UINT_NUMA_MEM_SIZE, &memsize); printf( "Numa Node %u Memory : %uMB\n", j, memsize ); printf( "Numa Node %u Threads : ", j ); for (k = 0; k < numa_threads_count[j]; ++k) { printf( "%u ", numa_threads[j][k] ); } printf( "\n" ); } printf( "\n" ); } } if ( id == PAPI_DEV_TYPE_ID__CUDA && dev_count > 0 ) { printf( "Vendor : %s\n", vendor_name ); for ( i = 0; i < dev_count; ++i ) { unsigned long uid; unsigned int warp_size, thread_per_block, block_per_sm; unsigned int shm_per_block, shm_per_sm; unsigned int blk_dim_x, blk_dim_y, blk_dim_z; unsigned int grd_dim_x, grd_dim_y, grd_dim_z; unsigned int sm_count, multi_kernel, map_host_mem, async_memcpy; unsigned int unif_addr, managed_mem; unsigned int cc_major, cc_minor; const char *dev_name; 
PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_ULONG_UID, &uid); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_CHAR_DEVICE_NAME, &dev_name); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_WARP_SIZE, &warp_size); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_THR_PER_BLK, &thread_per_block); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_BLK_PER_SM, &block_per_sm); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_SHM_PER_BLK, &shm_per_block); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_SHM_PER_SM, &shm_per_sm); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_X, &blk_dim_x); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_Y, &blk_dim_y); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_BLK_DIM_Z, &blk_dim_z); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_X, &grd_dim_x); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_Y, &grd_dim_y); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_GRD_DIM_Z, &grd_dim_z); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_SM_COUNT, &sm_count); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_MULTI_KERNEL, &multi_kernel); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_MAP_HOST_MEM, &map_host_mem); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_MEMCPY_OVERLAP, &async_memcpy); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_UNIFIED_ADDR, &unif_addr); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_MANAGED_MEM, &managed_mem); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MAJOR, &cc_major); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__CUDA_UINT_COMP_CAP_MINOR, &cc_minor); printf( "Id : %d\n", i ); printf( "UID : %lu\n", uid ); printf( "Name : %s\n", dev_name ); printf( "Warp size : %u\n", warp_size ); printf( "Max threads per block : %u\n", thread_per_block ); printf( "Max blocks per multiprocessor : %u\n", block_per_sm ); printf( "Max shared memory per block : %u\n", shm_per_block 
); printf( "Max shared memory per multiprocessor : %u\n", shm_per_sm ); printf( "Max block dim x : %u\n", blk_dim_x ); printf( "Max block dim y : %u\n", blk_dim_y ); printf( "Max block dim z : %u\n", blk_dim_z ); printf( "Max grid dim x : %u\n", grd_dim_x ); printf( "Max grid dim y : %u\n", grd_dim_y ); printf( "Max grid dim z : %u\n", grd_dim_z ); printf( "Multiprocessor count : %u\n", sm_count ); printf( "Multiple kernels per context : %s\n", multi_kernel ? "yes" : "no" ); printf( "Can map host memory : %s\n", map_host_mem ? "yes" : "no"); printf( "Can overlap compute and data transfer : %s\n", async_memcpy ? "yes" : "no" ); printf( "Has unified addressing : %s\n", unif_addr ? "yes" : "no" ); printf( "Has managed memory : %s\n", managed_mem ? "yes" : "no" ); printf( "Compute capability : %u.%u\n", cc_major, cc_minor ); printf( "\n" ); } } if ( id == PAPI_DEV_TYPE_ID__ROCM && dev_count > 0 ) { printf( "Vendor : %s\n", vendor_name ); unsigned long uid; const char *dev_name; unsigned int wf_size, simd_per_cu, wg_size; unsigned int wf_per_cu, shm_per_wg, wg_dim_x, wg_dim_y, wg_dim_z; unsigned int grd_dim_x, grd_dim_y, grd_dim_z; unsigned int cu_count; unsigned int cc_major, cc_minor; for ( i = 0; i < dev_count; ++i ) { PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_ULONG_UID, &uid); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_CHAR_DEVICE_NAME, &dev_name); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_WAVEFRONT_SIZE, &wf_size); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_SIMD_PER_CU, &simd_per_cu); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_WORKGROUP_SIZE, &wg_size); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_WAVE_PER_CU, &wf_per_cu); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_SHM_PER_WG, &shm_per_wg); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_X, &wg_dim_x); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_Y, &wg_dim_y); PAPI_get_dev_attr(handle, i, 
PAPI_DEV_ATTR__ROCM_UINT_WG_DIM_Z, &wg_dim_z); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_X, &grd_dim_x); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_Y, &grd_dim_y); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_GRD_DIM_Z, &grd_dim_z); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_CU_COUNT, &cu_count); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_COMP_CAP_MAJOR, &cc_major); PAPI_get_dev_attr(handle, i, PAPI_DEV_ATTR__ROCM_UINT_COMP_CAP_MINOR, &cc_minor); printf( "Id : %d\n", i ); printf( "Name : %s\n", dev_name ); printf( "Wavefront size : %u\n", wf_size ); printf( "SIMD per compute unit : %u\n", simd_per_cu ); printf( "Max threads per workgroup : %u\n", wg_size ); printf( "Max waves per compute unit : %u\n", wf_per_cu ); printf( "Max shared memory per workgroup : %u\n", shm_per_wg ); printf( "Max workgroup dim x : %u\n", wg_dim_x ); printf( "Max workgroup dim y : %u\n", wg_dim_y ); printf( "Max workgroup dim z : %u\n", wg_dim_z ); printf( "Max grid dim x : %u\n", grd_dim_x ); printf( "Max grid dim y : %u\n", grd_dim_y ); printf( "Max grid dim z : %u\n", grd_dim_z ); printf( "Compute unit count : %u\n", cu_count ); printf( "Compute capability : %u.%u\n", cc_major, cc_minor ); printf( "\n" ); } } } printf( "--------------------------------------------------------------------------------\n" ); PAPI_shutdown(); return 0; } papi-papi-7-2-0-t/src/utils/papi_hybrid_native_avail.c000066400000000000000000000416251502707512200227570ustar00rootroot00000000000000/* This file utility reports hardware info and native event availability on either the host * CPU or on one of the attached MIC devices. It is based on the papi_native_avail utility, * but uses offloading to run either on the host CPU or on a target device. */ /** file hybrid_native_avail.c * @page papi_hybrid_native_avail * @brief papi_hybrid_native_avail utility. 
* @section NAME * papi_hybrid_native_avail - provides detailed information for PAPI native events. * * @section Synopsis * * @section Description * papi_hybrid_native_avail is a PAPI utility program that reports information * about the native events available on the current platform or on an attached MIC card. * A native event is an event specific to a specific hardware platform. * On many platforms, a specific native event may have a number of optional settings. * In such cases, the native event and the valid settings are presented, * rather than every possible combination of those settings. * For each native event, a name, a description, and specific bit patterns are provided. * * @section Options *
 <ul>
 * <li>--help, -h print this help message
 * <li>-d display detailed information about native events
 * <li>-e EVENTNAME display detailed information about named native event
 * <li>-i EVENTSTR include only event names that contain EVENTSTR
 * <li>-x EVENTSTR exclude any event names that contain EVENTSTR
 * <li>--noumasks suppress display of Unit Mask information
 * <li>--mic < index > report events on the specified target MIC device
 * </ul>
 *
 * Processor-specific options
 * <ul>
 * <li>--darr display events supporting Data Address Range Restriction
 * <li>--dear display Data Event Address Register events only
 * <li>--iarr display events supporting Instruction Address Range Restriction
 * <li>--iear display Instruction Event Address Register events only
 * <li>--opcm display events supporting OpCode Matching
 * <li>--nogroups suppress display of Event grouping information
 * </ul>
* * @section Bugs * There are no known bugs in this utility. * If you find a bug, it should be reported to the * PAPI Mailing List at . * * Modified by Gabriel Marin to use offloading. */ //#pragma offload_attribute (push,target(mic)) //#include "papi_test.h" //#pragma offload_attribute (pop) #include #include #define EVT_LINE 80 typedef struct command_flags { int help; int details; int named; int include; int xclude; char *name, *istr, *xstr; int darr; int dear; int iarr; int iear; int opcm; int umask; int groups; int mic; int devidx; } command_flags_t; static void print_help( char **argv ) { printf( "This is the PAPI native avail program.\n" ); printf( "It provides availability and detail information for PAPI native events.\n" ); printf( "Usage: %s [options]\n", argv[0] ); printf( "\nOptions:\n" ); printf( " --help, -h print this help message\n" ); printf( " -d display detailed information about native events\n" ); printf( " -e EVENTNAME display detailed information about named native event\n" ); printf( " -i EVENTSTR include only event names that contain EVENTSTR\n" ); printf( " -x EVENTSTR exclude any event names that contain EVENTSTR\n" ); printf( " --noumasks suppress display of Unit Mask information\n" ); printf( "\nProcessor-specific options\n"); printf( " --darr display events supporting Data Address Range Restriction\n" ); printf( " --dear display Data Event Address Register events only\n" ); printf( " --iarr display events supporting Instruction Address Range Restriction\n" ); printf( " --iear display Instruction Event Address Register events only\n" ); printf( " --opcm display events supporting OpCode Matching\n" ); printf( " --nogroups suppress display of Event grouping information\n" ); printf( " --mic display events on the specified Xeon Phi device\n" ); printf( "\n" ); } static int no_str_arg( char *arg ) { return ( ( arg == NULL ) || ( strlen( arg ) == 0 ) || ( arg[0] == '-' ) ); } static void parse_args( int argc, char **argv, command_flags_t * f 
) { int i; /* Look for all currently defined commands */ memset( f, 0, sizeof ( command_flags_t ) ); f->umask = 1; f->groups = 1; for ( i = 1; i < argc; i++ ) { if ( !strcmp( argv[i], "--darr" ) ) f->darr = 1; else if ( !strcmp( argv[i], "--dear" ) ) f->dear = 1; else if ( !strcmp( argv[i], "--iarr" ) ) f->iarr = 1; else if ( !strcmp( argv[i], "--iear" ) ) f->iear = 1; else if ( !strcmp( argv[i], "--opcm" ) ) f->opcm = 1; else if ( !strcmp( argv[i], "--noumasks" ) ) f->umask = 0; else if ( !strcmp( argv[i], "--nogroups" ) ) f->groups = 0; else if ( !strcmp( argv[i], "-d" ) ) f->details = 1; else if ( !strcmp( argv[i], "--mic" ) ) { f->mic = 1; i++; if ( i >= argc || no_str_arg( argv[i] ) ) { printf( "Specify a device index for --mic\n"); exit(1); } f->devidx = strtol(argv[i], 0, 10); } else if ( !strcmp( argv[i], "-e" ) ) { f->named = 1; i++; f->name = argv[i]; if ( i >= argc || no_str_arg( f->name ) ) { printf( "Invalid argument for -e\n"); exit(1); } } else if ( !strcmp( argv[i], "-i" ) ) { f->include = 1; i++; f->istr = argv[i]; if ( i >= argc || no_str_arg( f->istr ) ) { printf( "Invalid argument for -i\n"); exit(1); } } else if ( !strcmp( argv[i], "-x" ) ) { f->xclude = 1; i++; f->xstr = argv[i]; if ( i >= argc || no_str_arg( f->xstr ) ) { printf( "Invalid argument for -x\n"); exit(1); } } else if ( !strcmp( argv[i], "-h" ) || !strcmp( argv[i], "--help" ) ) f->help = 1; else { printf( "%s is not supported\n", argv[i] ); exit(1); } } /* if help requested, print and bail */ if ( f->help ) { print_help( argv); exit( 1 ); } } static void space_pad( char *str, int spaces ) { while ( spaces-- > 0 ) strcat( str, " " ); } static void print_event( PAPI_event_info_t * info, int offset ) { unsigned int i, j = 0; char str[EVT_LINE + EVT_LINE]; /* indent by offset */ if ( offset ) { printf( "| %-73s|\n", info->symbol ); } else { printf( "| %-77s|\n", info->symbol ); } while ( j <= strlen( info->long_descr ) ) { i = EVT_LINE - 12 - 2; if ( i > 0 ) { str[0] = 0; 
strcat(str,"| " ); space_pad( str, 11 ); strncat( str, &info->long_descr[j], i ); j += i; i = ( unsigned int ) strlen( str ); space_pad( str, EVT_LINE - ( int ) i - 1 ); strcat( str, "|" ); } printf( "%s\n", str ); } } static int parse_unit_masks( PAPI_event_info_t * info ) { char *pmask,*ptr; /* handle the PAPI component-style events which have a component:::event type */ if ((ptr=strstr(info->symbol, ":::"))) { ptr+=3; /* handle libpfm4-style events which have a pmu::event type event name */ } else if ((ptr=strstr(info->symbol, "::"))) { ptr+=2; } else { ptr=info->symbol; } if ( ( pmask = strchr( ptr, ':' ) ) == NULL ) { return ( 0 ); } memmove( info->symbol, pmask, ( strlen( pmask ) + 1 ) * sizeof ( char ) ); pmask = strchr( info->long_descr, ':' ); if ( pmask == NULL ) info->long_descr[0] = 0; else memmove( info->long_descr, pmask + sizeof ( char ), ( strlen( pmask ) + 1 ) * sizeof ( char ) ); return ( 1 ); } int main( int argc, char **argv ) { int i, j = 0, k; int retval; PAPI_event_info_t info; const PAPI_hw_info_t *hwinfo = NULL; command_flags_t flags; int enum_modifier; int numcmp, cid; int num_devices = 0; int target_idx = 0; int offload_mode = 0; int target_ok = 0; /* Parse the command-line arguments */ parse_args( argc, argv, &flags ); if (flags.mic) { printf("Checking for Intel(R) Xeon Phi(TM) (Target CPU) devices...\n\n"); #ifdef __INTEL_OFFLOAD num_devices = _Offload_number_of_devices(); #endif printf("Number of Target devices installed: %d\n\n",num_devices); if (flags.devidx >= num_devices) { // Run in fallback-mode printf("Requested device index %d is not available. 
Specify a device between 0 and %d\n\n", flags.devidx, num_devices-1); exit(1); } else { offload_mode = 1; target_idx = flags.devidx; printf("PAPI will list the native events available on device mic%d\n\n", target_idx); } } /* Set enum modifier mask */ if ( flags.dear ) enum_modifier = PAPI_NTV_ENUM_DEAR; else if ( flags.darr ) enum_modifier = PAPI_NTV_ENUM_DARR; else if ( flags.iear ) enum_modifier = PAPI_NTV_ENUM_IEAR; else if ( flags.iarr ) enum_modifier = PAPI_NTV_ENUM_IARR; else if ( flags.opcm ) enum_modifier = PAPI_NTV_ENUM_OPCM; else enum_modifier = PAPI_ENUM_EVENTS; /// #pragma offload target(mic: target_idx) if(offload_mode) in(argc, argv) inout(TESTS_QUIET) /* Initialize before parsing the input arguments */ #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) retval = PAPI_library_init(PAPI_VER_CURRENT); if ( retval != PAPI_VER_CURRENT ) { fprintf(stderr,"Error! PAPI_library_init\n"); exit(retval); } if ( !TESTS_QUIET ) { #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) retval = PAPI_set_debug( PAPI_VERB_ECONT ); if ( retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_set_debug\n"); exit(retval); } } #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) nocopy(hwinfo) { retval = papi_print_header( "Available native events and hardware information.\n", &hwinfo ); fflush(stdout); } if ( retval != PAPI_OK ) { fprintf(stderr,"Error! 
PAPI_get_hardware_info\n"); exit( 2 ); } /* Do this code if the event name option was specified on the commandline */ if ( flags.named ) { int papi_ok = 0; char *ename = flags.name; int elen = 0; if (ename) elen = strlen(ename) + 1; #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) in(ename:length(elen)) out(i) papi_ok = PAPI_event_name_to_code(ename, &i); if (papi_ok == PAPI_OK) { #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) out(info) papi_ok = PAPI_get_event_info(i, &info); } if (papi_ok == PAPI_OK) { printf( "%-30s%s\n", "Event name:", info.symbol); printf( "%-29s|%s|\n", "Description:", info.long_descr ); /* if unit masks exist but none specified, process all */ if ( !strchr( flags.name, ':' ) ) { #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) inout(i) papi_ok = PAPI_enum_event( &i, PAPI_NTV_ENUM_UMASKS); if (papi_ok == PAPI_OK ) { printf( "\nUnit Masks:\n" ); do { #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) inout(i, info) retval = PAPI_get_event_info( i, &info ); if ( retval == PAPI_OK ) { if ( parse_unit_masks( &info ) ) { printf( "%-29s|%s|%s|\n", " Mask Info:", info.symbol, info.long_descr ); } } #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) inout(i, info) papi_ok = PAPI_enum_event(&i, PAPI_NTV_ENUM_UMASKS); } while (papi_ok == PAPI_OK); } } } else { printf("Sorry, an event by the name '%s' could not be found.\n", flags.name); printf("Is it typed correctly?\n\n"); exit( 1 ); } } else { /* Print *ALL* available events */ #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) numcmp = PAPI_num_components( ); j = 0; for ( cid = 0; cid < numcmp; cid++ ) { PAPI_component_info_t component; // if 
(offload_mode) // I must allocate local memory to receive the result // component = (PAPI_component_info_t*)malloc(sizeof(PAPI_component_info_t)); // #pragma offload target(mic: target_idx) if(offload_mode) out(*component:length(sizeof(PAPI_component_info_t)) alloc_if(0) free_if(0)) #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) out(component) { memcpy(&component, PAPI_get_component_info(cid), sizeof(PAPI_component_info_t)); } /* Skip disabled components */ if (component.disabled) continue; printf( "===============================================================================\n" ); printf( " Native Events in Component: %s\n",component.name); printf( "===============================================================================\n" ); /* Always ASK FOR the first event */ /* Don't just assume it'll be the first numeric value */ i = 0 | PAPI_NATIVE_MASK; #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) inout(i) retval=PAPI_enum_cmp_event( &i, PAPI_ENUM_FIRST, cid ); do { memset( &info, 0, sizeof ( info ) ); #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) inout(info) retval = PAPI_get_event_info( i, &info ); /* This event may not exist */ if ( retval != PAPI_OK ) goto endloop; /* Bail if event name doesn't contain include string */ if ( flags.include ) { if ( !strstr( info.symbol, flags.istr ) ) { goto endloop; } } /* Bail if event name does contain exclude string */ if ( flags.xclude ) { if ( strstr( info.symbol, flags.xstr ) ) goto endloop; } /* count only events that are actually processed */ j++; print_event( &info, 0 ); if (flags.details) { if (info.units[0]) printf( "| Units: %-67s|\n", info.units ); } /* modifier = PAPI_NTV_ENUM_GROUPS returns event codes with a groups id for each group in which this native event lives, in bits 16 - 23 of event code terminating with PAPI_ENOEVNT at 
the end of the list. */ /* This is an IBM Power issue */ if ( flags.groups ) { int papi_ok = 0; k = i; #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) inout(k) papi_ok = PAPI_enum_cmp_event(&k, PAPI_NTV_ENUM_GROUPS, cid); if (papi_ok == PAPI_OK ) { printf("Groups: "); do { printf( "%4d", ( ( k & PAPI_NTV_GROUP_AND_MASK ) >> PAPI_NTV_GROUP_SHIFT ) - 1 ); #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) inout(k) papi_ok = PAPI_enum_cmp_event(&k, PAPI_NTV_ENUM_GROUPS, cid); } while (papi_ok==PAPI_OK ); printf( "\n" ); } } /* Print umasks */ /* components that don't have them can just ignore */ if ( flags.umask ) { int papi_ok = 0; k = i; #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) inout(k) papi_ok = PAPI_enum_cmp_event(&k, PAPI_NTV_ENUM_UMASKS, cid); if (papi_ok == PAPI_OK ) { do { #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) inout(info) retval = PAPI_get_event_info(k, &info); if ( retval == PAPI_OK ) { if (parse_unit_masks( &info )) print_event(&info, 2); } #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) inout(k) papi_ok = PAPI_enum_cmp_event(&k, PAPI_NTV_ENUM_UMASKS, cid); } while (papi_ok == PAPI_OK); } } printf( "--------------------------------------------------------------------------------\n" ); endloop: #ifdef __INTEL_OFFLOAD __Offload_report(1); #endif #pragma offload target(mic: target_idx) if(offload_mode) inout(i) retval=PAPI_enum_cmp_event(&i, enum_modifier, cid); } while (retval == PAPI_OK ); } printf("\n"); printf( "Total events reported: %d\n", j ); } return 0; } papi-papi-7-2-0-t/src/utils/papi_mem_info.c000066400000000000000000000073701502707512200205440ustar00rootroot00000000000000/* * This file perfoms the following test: memory info * * Author: 
Kevin London * london@cs.utk.edu */ /** file papi_mem_info.c * @brief papi_mem_info utility. * @page papi_mem_info * @section NAME * papi_mem_info - provides information on the memory architecture of the current processor. * * @section Synopsis * * @section Description * papi_mem_info is a PAPI utility program that reports information about * the cache memory architecture of the current processor, including number, * types, sizes and associativities of instruction and data caches and * Translation Lookaside Buffers. * * @section Options * This utility has no command line options. * * @section Bugs * There are no known bugs in this utility. * If you find a bug, it should be reported to the * PAPI Mailing List at . */ #include #include #include "papi.h" int main( int argc, char **argv ) { const PAPI_hw_info_t *meminfo = NULL; PAPI_mh_level_t *L; int i, j, retval; (void)argc; (void)argv; retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { fprintf(stderr,"Error! PAPI_library_init\n"); return retval; } meminfo = PAPI_get_hardware_info( ); if (meminfo == NULL ) { fprintf(stderr,"Error! 
PAPI_get_hardware_info"); return 2; } printf( "Memory Cache and TLB Hierarchy Information.\n" ); printf( "------------------------------------------------------------------------\n" ); /* Extract and report the tlb and cache information */ L = ( PAPI_mh_level_t * ) & ( meminfo->mem_hierarchy.level[0] ); printf( "TLB Information.\n There may be multiple descriptors for each level of TLB\n" ); printf( " if multiple page sizes are supported.\n\n" ); /* Scan the TLB structures */ for ( i = 0; i < meminfo->mem_hierarchy.levels; i++ ) { for ( j = 0; j < PAPI_MH_MAX_LEVELS; j++ ) { switch ( PAPI_MH_CACHE_TYPE( L[i].tlb[j].type ) ) { case PAPI_MH_TYPE_UNIFIED: printf( "L%d Unified TLB:\n", i + 1 ); break; case PAPI_MH_TYPE_DATA: printf( "L%d Data TLB:\n", i + 1 ); break; case PAPI_MH_TYPE_INST: printf( "L%d Instruction TLB:\n", i + 1 ); break; } if ( L[i].tlb[j].type ) { if ( L[i].tlb[j].page_size ) printf( " Page Size: %6d KB\n", L[i].tlb[j].page_size >> 10 ); printf( " Number of Entries: %6d\n", L[i].tlb[j].num_entries ); switch ( L[i].tlb[j].associativity ) { case 0: /* undefined */ break; case 1: printf( " Associativity: Direct Mapped\n\n" ); break; case SHRT_MAX: printf( " Associativity: Full\n\n" ); break; default: printf( " Associativity: %6d\n\n", L[i].tlb[j].associativity ); break; } } } } /* Scan the Cache structures */ printf( "\nCache Information.\n\n" ); for ( i = 0; i < meminfo->mem_hierarchy.levels; i++ ) { for ( j = 0; j < 2; j++ ) { switch ( PAPI_MH_CACHE_TYPE( L[i].cache[j].type ) ) { case PAPI_MH_TYPE_UNIFIED: printf( "L%d Unified Cache:\n", i + 1 ); break; case PAPI_MH_TYPE_DATA: printf( "L%d Data Cache:\n", i + 1 ); break; case PAPI_MH_TYPE_INST: printf( "L%d Instruction Cache:\n", i + 1 ); break; case PAPI_MH_TYPE_TRACE: printf( "L%d Trace Buffer:\n", i + 1 ); break; case PAPI_MH_TYPE_VECTOR: printf( "L%d Vector Cache:\n", i + 1 ); break; } if ( L[i].cache[j].type ) { printf( " Total size: %6d KB\n Line size: %6d B\n Number of Lines: %6d\n 
Associativity: %6d\n\n", ( L[i].cache[j].size ) >> 10, L[i].cache[j].line_size, L[i].cache[j].num_lines, L[i].cache[j].associativity ); } } } return 0; } papi-papi-7-2-0-t/src/utils/papi_multiplex_cost.c000066400000000000000000000454571502707512200220360ustar00rootroot00000000000000/** file papi_multiplex_cost.c * @brief papi_multiplex_cost utility. * @page papi_multiplex_cost * @section NAME * papi_multiplex_cost - computes execution time costs for basic PAPI operations on multiplexed EventSets. * * @section Synopsis * papi_cost [-m, --min < min >] [-x, --max < max >] [-k,-s] * * @section Description * papi_multiplex_cost is a PAPI utility program that computes the * min / max / mean / std. deviation of execution times for PAPI start/stop * pairs and for PAPI reads on multiplexed eventsets. * This information provides the basic operating cost to a user's program * for collecting hardware counter data. * Command line options control display capabilities. * * @section Options *
    *
  • -m < Min number of events to test > *
  • -x < Max number of events to test > *
  • -k, Do not time kernel multiplexing *
  • -s, Do not time software multiplexed EventSets *
  • -t THRESHOLD, Test with THRESHOLD iterations of counting loop. *
* * @section Bugs * There are no known bugs in this utility. If you find a bug, * it should be reported to the PAPI Mailing List at . */ /* Open Issues: * Selecting events to add is very primitive right now. * Output format, right now the format targets a gnuplot script I have, * We will probably end up generating a csv per test */ #include #include #include #include #include "papi.h" #include "cost_utils.h" static int first_time = 1; static int skip = 0; static FILE* fp; typedef struct { int first_time; int force_sw; int kernel_mpx; int min; int max; } options_t; static options_t options; void do_output( char *fn, char *message, long long* array, int noc ) { long long min, max; double average, std; std = do_stats( array, &min, &max, &average ); if ( first_time ) { skip = 0; fp = fopen(fn, "w"); if (fp == NULL) { fprintf(stderr,"Unable to open output file, %s, output will not be saved.\n", fn); skip = 1; } else fprintf(fp, "###%s\n#number of events\tmin cycles\tmax cycles\tmean cycles\t\ std deviation\tsw min cycles\tsw max cycles\tsw avg cycles\tsw std dev\n", message); first_time = 0; } if ( !skip ) { fprintf(fp, "%20d\t%10lld\t%10lld\t%10lf\t%10lf", noc, min, max, average, std); std = do_stats( array+num_iters, &min, &max, &average ); fprintf(fp, "\t%10lld\t%10lld\t%10lf\t%10lf\n", min, max, average, std); fflush(fp); } } void init_test(int SoftwareMPX, int KernelMPX, int* Events) { int i; int retval; PAPI_option_t option, itimer; retval = PAPI_assign_eventset_component( SoftwareMPX, 0 ); if (retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_assign_eventset_component\n"); exit(retval); } retval = PAPI_assign_eventset_component( KernelMPX, 0 ); if (retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_assign_eventset_component\n"); exit(retval); } retval = PAPI_set_multiplex( KernelMPX ); if (retval != PAPI_OK ) { fprintf(stderr,"Error! 
PAPI_set_multiplex\n"); exit(retval); } PAPI_get_opt(PAPI_DEF_ITIMER,&itimer); memset(&option,0x0,sizeof(option)); option.multiplex.flags = PAPI_MULTIPLEX_FORCE_SW; option.multiplex.eventset = SoftwareMPX; option.multiplex.ns = itimer.itimer.ns; retval = PAPI_set_opt( PAPI_MULTIPLEX, &option ); if (retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_set_opt\n"); exit(retval); } for (i = 0; i < options.min - 1; i++) { if ( options.kernel_mpx ) { retval = PAPI_add_event( KernelMPX, Events[i]); if (retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_add_event\n"); exit(retval); } } if ( options.force_sw ) { retval = PAPI_add_event( SoftwareMPX, Events[i]); if (retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_add_event\n"); exit(retval); } } } } void finalize_test(void) { if (fp) fclose(fp); first_time = 1; } static void usage(void) { printf( "Usage: papi_multiplex_cost [options]\n" "\t-m num, number of events to count\n" "\t-x num, number of events to count\n" "\t-s, Do not run software multiplexing test.\n" "\t-k, Do not attempt kernel multiplexed test.\n" "\t-t THRESHOLD set the threshold for the number " "of iterations. 
Default: 100,000\n" ); } int main( int argc, char **argv ) { int retval, retval_start, retval_stop; int KernelMPX = PAPI_NULL; int SoftwareMPX = PAPI_NULL; int *Events = NULL; int number_of_counters; int i; int c; int dont_loop_forever; long long totcyc, *values = NULL; long long *array = NULL; int event; PAPI_option_t option, itimer; const PAPI_component_info_t *info; PAPI_set_debug(PAPI_QUIET); options.min = 1; options.max = 10; options.force_sw = 1; options.kernel_mpx = 1; while ( ( c=getopt(argc, argv, "hm:x:skt:") ) != -1 ) { switch (c) { case 'h': usage(); exit(0); case 'm': options.min = atoi(optarg); break; case 'x': options.max = atoi(optarg); break; case 's': options.force_sw = 0; break; case 'k': options.kernel_mpx = 0; break; case 't': num_iters = atoi(optarg); default: break; } } printf("This utility benchmarks the overhead of PAPI multiplexing\n"); printf("Warning! This can take a long time (many minutes) to run\n"); printf("The output goes to multiple .dat files in the current directory\n\n"); if ( options.min > options.max ) { fprintf(stderr,"Error! Min # of Events > Max # of Events"); goto cleanup; } values = (long long*)malloc(sizeof(long long) * options.max); array = (long long *)malloc(sizeof(long long) * 2 * num_iters); Events = ( int* )malloc(sizeof(int) * options.max); if ( values == NULL || array == NULL || Events == NULL ) { fprintf(stderr,"Error allocating memory!\n"); exit(1); } retval = PAPI_library_init( PAPI_VER_CURRENT ); if (retval != PAPI_VER_CURRENT ) { fprintf(stderr, "Error! PAPI_library_init\n"); exit(retval); } retval = PAPI_set_debug( PAPI_QUIET ); if (retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_set_debug\n"); exit(retval ); } retval = PAPI_multiplex_init( ); if (retval != PAPI_OK ) { fprintf(stderr,"Error! 
PAPI_multiplex_init\n"); exit(retval); } info = PAPI_get_component_info(0); if (info != NULL ) { options.kernel_mpx &= info->kernel_multiplex; if ( options.kernel_mpx && !info->kernel_multiplex ) { fprintf(stderr,"Error! Kernel multiplexing is " "not supported on this platform, bailing!\n"); exit(1); } } retval = PAPI_create_eventset( &SoftwareMPX ); if (retval != PAPI_OK) { fprintf(stderr,"Error! PAPI_create_eventset\n"); exit(retval); } retval = PAPI_create_eventset( &KernelMPX ); if (retval != PAPI_OK ) { fprintf(stderr,"PAPI_create_eventset"); exit(retval); } retval = PAPI_assign_eventset_component( KernelMPX, 0 ); if (retval != PAPI_OK ) { fprintf(stderr,"PAPI_assign_eventset_component\n"); exit(retval); } retval = PAPI_set_multiplex( KernelMPX ); if (retval != PAPI_OK ) { fprintf(stderr,"PAPI_set_multiplex"); exit(retval); } retval = PAPI_assign_eventset_component( SoftwareMPX, 0 ); if (retval != PAPI_OK ) { fprintf(stderr,"PAPI_assign_eventset_component\n"); exit(retval); } PAPI_get_opt(PAPI_DEF_ITIMER,&itimer); memset(&option,0x0,sizeof(option)); option.multiplex.flags = PAPI_MULTIPLEX_FORCE_SW; option.multiplex.eventset = SoftwareMPX; option.multiplex.ns = itimer.itimer.ns; retval = PAPI_set_opt( PAPI_MULTIPLEX, &option ); if (retval != PAPI_OK) { fprintf(stderr,"PAPI_set_opt"); exit(retval); } if ( !options.kernel_mpx && !options.force_sw ) { fprintf(stderr,"No tests to run."); goto cleanup; } else { fprintf(stderr,"Running test[s]\n"); if (options.kernel_mpx) fprintf(stderr,"\tKernel multiplexing read\n"); if (options.force_sw) fprintf(stderr,"\tSoftware Multiplexing read\n"); } event = 0 | PAPI_NATIVE_MASK; PAPI_enum_event( &event, PAPI_ENUM_FIRST ); /* Find some events to run the tests with. 
*/ for (number_of_counters = 0; number_of_counters < options.max; number_of_counters++) { dont_loop_forever = 0; if ( options.kernel_mpx ) { do { PAPI_enum_event( &event, PAPI_ENUM_EVENTS ); dont_loop_forever++; } while ( ( retval = PAPI_add_event( KernelMPX, event ) ) != PAPI_OK && dont_loop_forever < 512); } else { do { PAPI_enum_event( &event, PAPI_ENUM_EVENTS ); dont_loop_forever++; } while ( ( retval = PAPI_add_event( SoftwareMPX, event) ) != PAPI_OK && dont_loop_forever < 512); } if ( dont_loop_forever == 512 ) fprintf(stderr,"I can't find %d events to count at once.", options.max); Events[number_of_counters] = event; } PAPI_cleanup_eventset( KernelMPX ); PAPI_cleanup_eventset( SoftwareMPX ); /* Start/Stop test */ init_test(SoftwareMPX, KernelMPX, Events); for (number_of_counters = options.min; number_of_counters < options.max; number_of_counters++) { if ( options.kernel_mpx ) { if ( ( retval = PAPI_add_event( KernelMPX, Events[number_of_counters - options.min] ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_add_event"); goto cleanup; } if ( ( retval = PAPI_start( KernelMPX ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_start"); exit(retval); } if ( ( retval = PAPI_stop( KernelMPX, values ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_stop"); exit(retval); } /* KernelMPX Timing loop */ for ( i = 0; i < num_iters; i++ ) { totcyc = PAPI_get_real_cyc(); retval_start=PAPI_start( KernelMPX ); retval_stop=PAPI_stop( KernelMPX, values ); array[i] = PAPI_get_real_cyc() - totcyc; if (retval_start || retval_stop) fprintf(stderr,"PAPI start/stop"); } /* End 1 timing run */ } else memset(array, 0, sizeof(long long) * num_iters ); /* Also test software multiplexing */ if ( options.force_sw ) { if ( ( retval = PAPI_add_event( SoftwareMPX, Events[number_of_counters - options.min] ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_add_event"); goto cleanup; } if ( ( retval = PAPI_start( SoftwareMPX ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_start"); exit(retval); } if ( ( retval = PAPI_stop( SoftwareMPX, 
values ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_stop"); exit(retval); } /* SoftwareMPX Timing Loop */ for ( i = num_iters; i < 2*num_iters; i++ ) { totcyc = PAPI_get_real_cyc(); retval_start=PAPI_start( SoftwareMPX ); retval_stop=PAPI_stop( SoftwareMPX, values ); array[i] = PAPI_get_real_cyc() - totcyc; if (retval_start || retval_stop) fprintf(stderr,"PAPI start/stop"); } /* End 2 timing run */ } else { memset(array+num_iters, 0, sizeof(long long) * num_iters ); } do_output( "papi_startstop.dat", "Multiplexed PAPI_read()", array, number_of_counters ); } /* End counter loop */ PAPI_cleanup_eventset( SoftwareMPX ); PAPI_cleanup_eventset( KernelMPX ); finalize_test(); /* PAPI_read() test */ init_test(SoftwareMPX, KernelMPX, Events); for (number_of_counters = options.min; number_of_counters < options.max; number_of_counters++) { if ( options.kernel_mpx ) { if ( ( retval = PAPI_add_event( KernelMPX, Events[number_of_counters - options.min] ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_add_event"); goto cleanup; } if ( ( retval = PAPI_start( KernelMPX ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_start"); exit(retval); } PAPI_read( KernelMPX, values ); /* KernelMPX Timing loop */ for ( i = 0; i < num_iters; i++ ) { totcyc = PAPI_get_real_cyc(); retval = PAPI_read( KernelMPX, values ); array[i] = PAPI_get_real_cyc() - totcyc; } /* End 1 timing run */ retval_stop=PAPI_stop( KernelMPX, values ); if (retval_stop!=PAPI_OK) fprintf(stderr,"PAPI_stop"); } else memset(array, 0, sizeof(long long) * num_iters ); /* Also test software multiplexing */ if ( options.force_sw ) { if ( ( retval = PAPI_add_event( SoftwareMPX, Events[number_of_counters - options.min] ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_add_event"); goto cleanup; } if ( ( retval = PAPI_start( SoftwareMPX ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_start"); exit(retval); } PAPI_read( SoftwareMPX, values ); /* SoftwareMPX Timing Loop */ for ( i = num_iters; i < 2*num_iters; i++ ) { totcyc = PAPI_get_real_cyc(); retval = PAPI_read( 
SoftwareMPX, values ); array[i] = PAPI_get_real_cyc() - totcyc; } /* End 2 timing run */ retval_stop=PAPI_stop( SoftwareMPX, values ); if (retval_stop!=PAPI_OK) fprintf(stderr,"PAPI_stop"); } else memset(array+num_iters, 0, sizeof(long long) * num_iters ); do_output( "papi_read.dat", "Multiplexed PAPI_read()", array, number_of_counters ); } /* End counter loop */ PAPI_cleanup_eventset( SoftwareMPX ); PAPI_cleanup_eventset( KernelMPX ); finalize_test(); /* PAPI_read_ts() test */ init_test( SoftwareMPX, KernelMPX, Events); for (number_of_counters = options.min; number_of_counters < options.max; number_of_counters++) { if ( options.kernel_mpx ) { if ( (retval = PAPI_add_event( KernelMPX, Events[number_of_counters - options.min] ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_add_event"); goto cleanup; } if ( ( retval = PAPI_start( KernelMPX ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_start"); exit(retval); } PAPI_read_ts( KernelMPX, values, &totcyc ); /* KernelMPX Timing loop */ for ( i = 0; i < num_iters; i++ ) { retval = PAPI_read_ts( KernelMPX, values, &array[i] ); } /* End 1 timing run */ /* post-process the timing array */ for ( i = num_iters - 1; i > 0; i-- ) { array[i] -= array[i - 1]; } array[0] -= totcyc; retval_stop=PAPI_stop( KernelMPX, values ); if (retval_stop!=PAPI_OK) fprintf(stderr,"PAPI_stop"); } else memset(array, 0, sizeof(long long) * num_iters ); /* Also test software multiplexing */ if ( options.force_sw ) { if ( ( retval = PAPI_add_event( SoftwareMPX, Events[number_of_counters - options.min] ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_add_event"); goto cleanup; } if ( ( retval = PAPI_start( SoftwareMPX ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_start"); exit(retval); } PAPI_read_ts( SoftwareMPX, values, &totcyc); /* SoftwareMPX Timing Loop */ for ( i = num_iters; i < 2*num_iters; i++ ) { retval = PAPI_read_ts( SoftwareMPX, values, &array[i]); } /* End 2 timing run */ retval_stop=PAPI_stop( SoftwareMPX, values ); if (retval_stop!=PAPI_OK) 
fprintf(stderr,"PAPI_stop"); /* post-process the timing array */ for ( i = 2*num_iters - 1; i > num_iters; i-- ) { array[i] -= array[i - 1]; } array[num_iters] -= totcyc; } else memset(array+num_iters, 0, sizeof(long long) * num_iters ); do_output( "papi_read_ts.dat", "Multiplexed PAPI_read_ts()", array, number_of_counters ); } /* End counter loop */ PAPI_cleanup_eventset( SoftwareMPX ); PAPI_cleanup_eventset( KernelMPX ); finalize_test(); /* PAPI_accum() test */ init_test(SoftwareMPX, KernelMPX, Events); for (number_of_counters = options.min; number_of_counters < options.max; number_of_counters++) { if ( options.kernel_mpx ) { if ( ( retval = PAPI_add_event( KernelMPX, Events[number_of_counters - options.min] ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_add_event"); goto cleanup; } if ( ( retval = PAPI_start( KernelMPX ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_start"); exit(retval); } PAPI_read( KernelMPX, values ); /* KernelMPX Timing loop */ for ( i = 0; i < num_iters; i++ ) { totcyc = PAPI_get_real_cyc(); retval = PAPI_accum( KernelMPX, values ); array[i] = PAPI_get_real_cyc() - totcyc; } /* End 1 timing run */ retval_stop=PAPI_stop( KernelMPX, values ); if (retval_stop!=PAPI_OK) fprintf(stderr,"PAPI_stop"); } else { memset(array, 0, sizeof(long long) * num_iters ); } /* Also test software multiplexing */ if ( options.force_sw ) { if ( ( retval = PAPI_add_event( SoftwareMPX, Events[number_of_counters - options.min] ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_add_event"); goto cleanup; } if ( ( retval = PAPI_start( SoftwareMPX ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_start"); exit(retval); } PAPI_read( SoftwareMPX, values ); /* SoftwareMPX Timing Loop */ for ( i = num_iters; i < 2*num_iters; i++ ) { totcyc = PAPI_get_real_cyc(); retval = PAPI_accum( SoftwareMPX, values ); array[i] = PAPI_get_real_cyc() - totcyc; } /* End 2 timing run */ retval_stop=PAPI_stop( SoftwareMPX, values ); if (retval_stop!=PAPI_OK) fprintf(stderr,"PAPI_stop"); } else { memset(array+num_iters, 0, 
sizeof(long long) * num_iters ); } do_output( "papi_accum.dat", "Multiplexed PAPI_accum()", array, number_of_counters ); } /* End counter loop */ PAPI_cleanup_eventset( SoftwareMPX ); PAPI_cleanup_eventset( KernelMPX ); finalize_test(); /* PAPI_reset() test */ init_test(SoftwareMPX, KernelMPX, Events); for (number_of_counters = options.min; number_of_counters < options.max; number_of_counters++) { if ( options.kernel_mpx ) { if ( ( retval = PAPI_add_event( KernelMPX, Events[number_of_counters - options.min] ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_add_event"); goto cleanup; } if ( ( retval = PAPI_start( KernelMPX ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_start"); exit(retval); } PAPI_read( KernelMPX, values ); /* KernelMPX Timing loop */ for ( i = 0; i < num_iters; i++ ) { totcyc = PAPI_get_real_cyc(); retval = PAPI_reset( KernelMPX ); array[i] = PAPI_get_real_cyc() - totcyc; } /* End 1 timing run */ retval_stop=PAPI_stop( KernelMPX, values ); if (retval_stop!=PAPI_OK) fprintf(stderr,"PAPI_stop"); } else memset(array, 0, sizeof(long long) * num_iters ); /* Also test software multiplexing */ if ( options.force_sw ) { if ( ( retval = PAPI_add_event( SoftwareMPX, Events[number_of_counters - options.min] ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_add_event"); goto cleanup; } if ( ( retval = PAPI_start( SoftwareMPX ) ) != PAPI_OK ) { fprintf(stderr,"PAPI_start"); exit(retval); } PAPI_read( SoftwareMPX, values ); /* SoftwareMPX Timing Loop */ for ( i = num_iters; i < 2*num_iters; i++ ) { totcyc = PAPI_get_real_cyc(); retval = PAPI_reset( SoftwareMPX ); array[i] = PAPI_get_real_cyc() - totcyc; } /* End 2 timing run */ retval_stop=PAPI_stop( SoftwareMPX, values ); if (retval_stop!=PAPI_OK) fprintf(stderr,"PAPI_stop"); } else { memset(array+num_iters, 0, sizeof(long long) * num_iters ); } do_output( "papi_reset.dat", "Multiplexed PAPI_reset()", array, number_of_counters ); } /* End counter loop */ PAPI_cleanup_eventset( SoftwareMPX ); PAPI_cleanup_eventset( KernelMPX ); 
finalize_test(); if ( values != NULL ) free(values); if ( array != NULL ) free(array); if ( Events != NULL ) free(Events); return 0; cleanup: if ( KernelMPX != PAPI_NULL) PAPI_cleanup_eventset( KernelMPX ); if ( SoftwareMPX != PAPI_NULL ) PAPI_cleanup_eventset( KernelMPX ); if ( values != NULL ) free(values); if ( array != NULL ) free(array); if ( Events != NULL ) free(Events); PAPI_shutdown(); return 1; } papi-papi-7-2-0-t/src/utils/papi_native_avail.c000066400000000000000000000550521502707512200214150ustar00rootroot00000000000000/* This file utility reports hardware info and native event availability */ /** file papi_native_avail.c * @page papi_native_avail * @brief papi_native_avail utility. * @section NAME * papi_native_avail - provides detailed information for PAPI native events. * * @section Synopsis * * @section Description * papi_native_avail is a PAPI utility program that reports information * about the native events available on the current platform. * A native event is an event specific to a specific hardware platform. * On many platforms, a specific native event may have a number of optional settings. * In such cases, the native event and the valid settings are presented, * rather than every possible combination of those settings. * For each native event, a name, a description, and specific bit patterns are provided. * * @section Options *
    *
  • --help, -h print this help message *
  • --check, -c attempts to add each event *
  • -sde FILE lists SDEs that are registered by the library or executable in FILE *
  • -e EVENTNAME display detailed information about named native event *
  • -i EVENTSTR include only event names that contain EVENTSTR *
  • -x EVENTSTR exclude any event names that contain EVENTSTR *
  • --noqual suppress display of event qualifiers (mask and flag) information *
* * Processor-specific options *
    *
  • --darr display events supporting Data Address Range Restriction *
  • --dear display Data Event Address Register events only *
  • --iarr display events supporting Instruction Address Range Restriction *
  • --iear display Instruction Event Address Register events only *
  • --opcm display events supporting OpCode Matching *
  • --nogroups suppress display of Event grouping information *
* * @section Bugs * There are no known bugs in this utility. * If you find a bug, it should be reported to the * PAPI Mailing List at . */ #include #include #include #include #include #include #include "papi.h" #include "print_header.h" #if SDE #include "sde_lib/sde_lib.h" #endif #define EVT_LINE 80 #define EVT_LINE_BUF_SIZE 4096 typedef struct command_flags { int help; int named; int include; int xclude; int check; int list_sdes; char *path, *name, *istr, *xstr; int darr; int dear; int iarr; int iear; int opcm; int qualifiers; int groups; } command_flags_t; static void print_help( char **argv ) { printf( "This is the PAPI native avail program.\n" ); printf( "It provides availability and details about PAPI Native Events.\n" ); printf( "Usage: %s [options]\n", argv[0] ); printf( "Options:\n\n" ); printf( "\nGeneral command options:\n" ); printf( "\t-h, --help print this help message\n" ); printf( "\t-c, --check attempts to add each event\n"); printf( "\t-sde FILE lists SDEs that are registered by the library or executable in FILE\n" ); printf( "\t-e EVENTNAME display detailed information about named native event\n" ); printf( "\t-i EVENTSTR include only event names that contain EVENTSTR\n" ); printf( "\t-x EVENTSTR exclude any event names that contain EVENTSTR\n" ); printf( "\t--noqual suppress display of event qualifiers (mask and flag) information\n" ); printf( "\nProcessor-specific options:\n"); printf( "\t--darr display events supporting Data Address Range Restriction\n" ); printf( "\t--dear display Data Event Address Register events only\n" ); printf( "\t--iarr display events supporting Instruction Address Range Restriction\n" ); printf( "\t--iear display Instruction Event Address Register events only\n" ); printf( "\t--opcm display events supporting OpCode Matching\n" ); printf( "\t--nogroups suppress display of Event grouping information\n" ); printf( "\n" ); } static int no_str_arg( char *arg ) { return ( ( arg == NULL ) || ( strlen( arg ) == 0 ) || ( arg[0] 
== '-' ) ); } static void parse_args( int argc, char **argv, command_flags_t * f ) { int i; /* Look for all currently defined commands */ memset( f, 0, sizeof ( command_flags_t ) ); f->qualifiers = 1; f->groups = 1; for ( i = 1; i < argc; i++ ) { if ( !strcmp( argv[i], "--darr" ) ) f->darr = 1; else if ( !strcmp( argv[i], "--dear" ) ) f->dear = 1; else if ( !strcmp( argv[i], "--iarr" ) ) f->iarr = 1; else if ( !strcmp( argv[i], "--iear" ) ) f->iear = 1; else if ( !strcmp( argv[i], "--opcm" ) ) f->opcm = 1; else if ( !strcmp( argv[i], "--noqual" ) ) f->qualifiers = 0; else if ( !strcmp( argv[i], "--nogroups" ) ) f->groups = 0; else if ( !strcmp( argv[i], "-e" ) ) { f->named = 1; i++; if ( i < argc ) f->name = argv[i]; if ( no_str_arg( f->name ) ) { printf( "Invalid argument for -e\n"); exit(1); } } #if SDE else if ( !strcmp( argv[i], "-sde" ) ) { f->list_sdes = 1; i++; if ( i < argc ) f->path = argv[i]; if ( no_str_arg( f->path ) ) { printf( "Invalid argument for -sde\n"); exit(1); } } #endif else if ( !strcmp( argv[i], "-i" ) ) { f->include = 1; i++; if ( i < argc ) f->istr = argv[i]; if ( no_str_arg( f->istr ) ) { printf( "Invalid argument for -i\n"); exit(1); } } else if ( !strcmp( argv[i], "-x" ) ) { f->xclude = 1; i++; if ( i < argc ) f->xstr = argv[i]; if ( no_str_arg( f->xstr ) ) { printf( "Invalid argument for -x\n"); exit(1); } } else if ( strstr( argv[i], "-h" ) ) { f->help = 1; } else if ( !strcmp( argv[i], "-c" ) || !strcmp( argv[i], "--check" ) ) { f->check = 1; } else { printf( "%s is not supported\n", argv[i] ); exit(1); } } /* if help requested, print and bail */ if ( f->help ) { print_help( argv); exit( 1 ); } } static void space_pad( char *str, int spaces ) { while ( spaces-- > 0 ) strcat( str, " " ); } unsigned int event_available = 0; unsigned int event_output_buffer_size = 0; char *event_output_buffer = NULL; static void check_event( PAPI_event_info_t * info ) { int EventSet = PAPI_NULL; // if this event has already passed the check test, no 
need to try this one again if (event_available) { return; } if (PAPI_create_eventset (&EventSet) == PAPI_OK) { if (PAPI_add_named_event (EventSet, info->symbol) == PAPI_OK) { PAPI_remove_named_event (EventSet, info->symbol); event_available = 1; } // else printf("********** PAPI_add_named_event( %s ) failed: event could not be added \n", info->symbol); if ( PAPI_destroy_eventset( &EventSet ) != PAPI_OK ) { printf("********** Call to destroy eventset failed when trying to check event '%s' **********\n", info->symbol); } } return; } static int format_event_output( PAPI_event_info_t * info, int offset) { unsigned int i, j = 0; char event_line_buffer[EVT_LINE_BUF_SIZE]; char event_line_units[100]; /* indent by offset */ if ( offset ) { // this one is used for event qualifiers sprintf(event_line_buffer, "| %-73s|\n", info->symbol); } else { // this one is used for new events sprintf(event_line_buffer, "| %-73s%4s|\n", info->symbol, "<-->"); } while ( j <= strlen( info->long_descr ) ) { // The event_line_buffer is used to collect an event or mask name and its description. // The description will be folded to keep the length of output lines reasonable. So this // buffer may contain multiple lines of print output. Check to make sure there is room // for another line of print output. If there is not enough room for another output line // just exit the loop and truncate the description field (the buffer is big enough this // should not happen). 
if ((EVT_LINE_BUF_SIZE - strlen(event_line_buffer)) < EVT_LINE) { printf ("Event or mask description has been truncated.\n"); break; } // get amount of description that will fit in an output line i = EVT_LINE - 12 - 2; // start of a description line strcat(event_line_buffer,"| " ); // if we need to copy less than what fits in this line, move it and exit loop if (i > strlen(&info->long_descr[j])) { strcat( event_line_buffer, &info->long_descr[j]); space_pad( event_line_buffer, i - strlen(&info->long_descr[j])); strcat( event_line_buffer, "|\n" ); break; } // move what will fit into the line then loop back to do the rest in a new line int k = strlen(event_line_buffer); strncat( event_line_buffer, &info->long_descr[j], i ); event_line_buffer[k+i] = '\0'; strcat( event_line_buffer, "|\n" ); // bump index past what we copied j += i; } // also show the units for this event if a unit name has been set event_line_units[0] = '\0'; if (info->units[0] != 0) { sprintf(event_line_units, "| Units: %-66s|\n", info->units ); } // get the amount of used space in the output buffer int out_buf_used = 0; if ((event_output_buffer_size > 0) && (event_output_buffer != NULL)) { out_buf_used = strlen(event_output_buffer); } // if this will not fit in output buffer, make it bigger if (event_output_buffer_size < out_buf_used + strlen(event_line_buffer) + strlen(event_line_units) + 1) { if (event_output_buffer_size == 0) { event_output_buffer_size = 1024; event_output_buffer = calloc(1, event_output_buffer_size); } else { event_output_buffer_size += 1024; event_output_buffer = realloc(event_output_buffer, event_output_buffer_size); } } // make sure we got the memory we asked for if (event_output_buffer == NULL) { fprintf(stderr,"Error! 
Allocation of output buffer memory failed.\n"); return errno; } strcat(event_output_buffer, event_line_buffer); strcat(event_output_buffer, event_line_units); return 0; } static void print_event_output(int val_flag) { // first we need to update the available flag at the beginning of the buffer // this needs to reflect if this event name by itself or the event name with one of the qualifiers worked // if none of the combinations worked then we will show the event as not available char *val_flag_ptr = strstr(event_output_buffer, "<-->"); if (val_flag_ptr != NULL) { if ((val_flag) && (event_available == 0)) { // event is not available, update the place holder (replace the <--> with ) *(val_flag_ptr+1) = 'N'; *(val_flag_ptr+2) = 'A'; } else { event_available = 0; // reset this flag for next event // event is available, just remove the place holder (replace the <--> with spaces) *val_flag_ptr = ' '; *(val_flag_ptr+1) = ' '; *(val_flag_ptr+2) = ' '; *(val_flag_ptr+3) = ' '; } } // now we can finally send this events output to the user printf( "%s", event_output_buffer); // printf( "--------------------------------------------------------------------------------\n" ); event_output_buffer[0] = '\0'; // start the next event with an empty buffer return; } static int parse_event_qualifiers( PAPI_event_info_t * info ) { char *pmask,*ptr; /* handle the PAPI component-style events which have a component:::event type */ if ((ptr=strstr(info->symbol, ":::"))) { ptr+=3; /* handle libpfm4-style events which have a pmu::event type event name */ } else if ((ptr=strstr(info->symbol, "::"))) { ptr+=2; } else { ptr=info->symbol; } if ( ( pmask = strchr( ptr, ':' ) ) == NULL ) { return ( 0 ); } memmove( info->symbol, pmask, ( strlen(pmask) + 1 ) * sizeof(char) ); // The description field contains the event description followed by a tag 'masks:' // and then the mask description (if there was a mask with this event). 
The following // code isolates the mask description part of this information. pmask = strstr( info->long_descr, "masks:" ); if ( pmask == NULL ) { info->long_descr[0] = 0; } else { pmask += 6; // bump pointer past 'masks:' identifier in description memmove( info->long_descr, pmask, (strlen(pmask) + 1) * sizeof(char) ); } return ( 1 ); } #if SDE void invoke_hook_fptr( char *lib_path ) { void *dl_handle; typedef void *(* hook_fptr_t)(papi_sde_fptr_struct_t *); hook_fptr_t hook_func_ptr; /* Clear any old error conditions */ (void)dlerror(); dl_handle = dlopen(lib_path, RTLD_LOCAL | RTLD_LAZY); if ( NULL == dl_handle ) { return; } hook_func_ptr = (hook_fptr_t)dlsym(dl_handle, "papi_sde_hook_list_events"); if ( (NULL != hook_func_ptr) && ( NULL == dlerror()) ) { papi_sde_fptr_struct_t fptr_struct; POPULATE_SDE_FPTR_STRUCT( fptr_struct ); (void)hook_func_ptr( &fptr_struct ); } dlclose(dl_handle); return; } #endif int main( int argc, char **argv ) { int i, k; int num_events; int num_cmp_events = 0; int retval; PAPI_event_info_t info; const PAPI_hw_info_t *hwinfo = NULL; command_flags_t flags; int enum_modifier; int numcmp, cid; /* Initialize before parsing the input arguments */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { fprintf(stderr, "Error! PAPI_library_init\n"); return retval; } /* Parse the command-line arguments */ parse_args( argc, argv, &flags ); /* Set enum modifier mask */ if ( flags.dear ) enum_modifier = PAPI_NTV_ENUM_DEAR; else if ( flags.darr ) enum_modifier = PAPI_NTV_ENUM_DARR; else if ( flags.iear ) enum_modifier = PAPI_NTV_ENUM_IEAR; else if ( flags.iarr ) enum_modifier = PAPI_NTV_ENUM_IARR; else if ( flags.opcm ) enum_modifier = PAPI_NTV_ENUM_OPCM; else enum_modifier = PAPI_ENUM_EVENTS; retval = PAPI_set_debug( PAPI_VERB_ECONT ); if ( retval != PAPI_OK ) { fprintf(stderr,"Error! 
PAPI_set_debug\n"); return retval; } retval = papi_print_header( "Available native events and hardware information.\n", &hwinfo ); if ( retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_get_hardware_info\n"); return 2; } #if SDE /* The following code will execute if the user wants to list the SDEs in the library (or executable) stored in flags.path. This code will not list the SDEs per se, it will only give an opportunity to the library to register their SDEs, so they can be listed further down. */ if ( flags.list_sdes ){ char *cmd; FILE *pipe; if ( access(flags.path, R_OK) == -1 ){ fprintf(stderr,"Error! Unable to read file '%s'.\n",flags.path); goto no_sdes; } int len = 5+strlen(flags.path); cmd = (char *)calloc(len, sizeof(char)); if( NULL == cmd ) goto no_sdes; int l = snprintf(cmd, len, "ldd %s",flags.path); if(l ")) ) { goto skip_lib; } int status = sscanf(lineptr, "%ms => %ms (%*x)", &lib_name, &lib_path); /* If this line is malformed, ignore it. */ if(2 != status){ /* According to the man page: "it is necessary to call free() only if the scanf() call successfully read a string." 
*/ goto skip_lib; } /* Invoke the hook for the dependency we just discovered */ invoke_hook_fptr(lib_path); if( lib_name ) free(lib_name); if( lib_path ) free(lib_path); skip_lib: if(lineptr) free(lineptr); lineptr = NULL; n=0; } pclose(pipe); } /* Finally, invoke the hook for the file the user gave us */ invoke_hook_fptr(flags.path); if( NULL != cmd ) free(cmd); } no_sdes: #endif //SDE /* Do this code if the event name option was specified on the commandline */ if ( flags.named ) { if ( PAPI_event_name_to_code( flags.name, &i ) == PAPI_OK ) { if ( PAPI_get_event_info( i, &info ) == PAPI_OK ) { printf( "Event name: %s\n", info.symbol); printf( "Description: %s\n", info.long_descr ); /* handle the PAPI component-style events which have a component:::event type */ char *ptr; if ((ptr=strstr(flags.name, ":::"))) { ptr+=3; /* handle libpfm4-style events which have a pmu::event type event name */ } else if ((ptr=strstr(flags.name, "::"))) { ptr+=2; } else { ptr=flags.name; } /* if event qualifiers exist but none specified, process all */ if ( !strchr( ptr, ':' ) ) { if ( PAPI_enum_event( &i, PAPI_NTV_ENUM_UMASKS ) == PAPI_OK ) { printf( "\nQualifiers: Name -- Description\n" ); do { retval = PAPI_get_event_info( i, &info ); if ( retval == PAPI_OK ) { if ( parse_event_qualifiers( &info ) ) { printf( " Info: %10s -- %s\n", info.symbol, info.long_descr ); } } } while ( PAPI_enum_event( &i, PAPI_NTV_ENUM_UMASKS ) == PAPI_OK ); } } } } else { printf("Sorry, an event by the name '%s' could not be found.\n", flags.name); printf("Is it typed correctly?\n\n"); exit( 1 ); } return 0; } // Look at all the events and qualifiers and print the information the user has asked for */ numcmp = PAPI_num_components( ); num_events = 0; for ( cid = 0; cid < numcmp; cid++ ) { const PAPI_component_info_t *component; component=PAPI_get_component_info(cid); /* Skip disabled components */ if (component->disabled && component->disabled != PAPI_EDELAY_INIT) continue; printf( 
"===============================================================================\n" ); printf( " Native Events in Component: %s\n",component->name); printf( "===============================================================================\n" ); // show this component has not found any events yet num_cmp_events = 0; /* Always ASK FOR the first event */ /* Don't just assume it'll be the first numeric value */ i = 0 | PAPI_NATIVE_MASK; retval=PAPI_enum_cmp_event( &i, PAPI_ENUM_FIRST, cid ); if (retval==PAPI_OK) { do { memset( &info, 0, sizeof ( info ) ); retval = PAPI_get_event_info( i, &info ); /* This event may not exist */ if ( retval != PAPI_OK ) continue; /* Bail if event name doesn't contain include string */ if ( flags.include && !strstr( info.symbol, flags.istr ) ) continue; /* Bail if event name does contain exclude string */ if ( flags.xclude && strstr( info.symbol, flags.xstr ) ) continue; // if not the first event in this component, put out a divider if (num_cmp_events) { printf( "--------------------------------------------------------------------------------\n" ); } /* count only events that are actually processed */ num_events++; num_cmp_events++; if (flags.check){ check_event(&info); } format_event_output( &info, 0); /* modifier = PAPI_NTV_ENUM_GROUPS returns event codes with a groups id for each group in which this native event lives, in bits 16 - 23 of event code terminating with PAPI_ENOEVNT at the end of the list. */ /* This is an IBM Power issue */ if ( flags.groups ) { k = i; if ( PAPI_enum_cmp_event( &k, PAPI_NTV_ENUM_GROUPS, cid ) == PAPI_OK ) { printf( "Groups: " ); do { printf( "%4d", ( ( k & PAPI_NTV_GROUP_AND_MASK ) >> PAPI_NTV_GROUP_SHIFT ) - 1 ); } while ( PAPI_enum_cmp_event( &k, PAPI_NTV_ENUM_GROUPS, cid ) ==PAPI_OK ); printf( "\n" ); } } // If the user has asked us to check the events then we need to // walk the list of qualifiers and try to check the event with each one. 
// Even if the user does not want to display the qualifiers this is necessary // to be able to correctly report which events can be used on this system. // // We also need to walk the list if the user wants to see the qualifiers. if (flags.qualifiers || flags.check){ k = i; if ( PAPI_enum_cmp_event( &k, PAPI_NTV_ENUM_UMASKS, cid ) == PAPI_OK ) { // clear event string using first mask char first_event_mask_string[PAPI_HUGE_STR_LEN] = ""; do { retval = PAPI_get_event_info( k, &info ); if ( retval == PAPI_OK ) { // if first event mask string not set yet, set it now if (strlen(first_event_mask_string) == 0) { strcpy (first_event_mask_string, info.symbol); } if ( flags.check ) { check_event(&info); } // now test if the event qualifiers should be displayed to the user if ( flags.qualifiers ) { if ( parse_event_qualifiers( &info ) ) format_event_output( &info, 2); } } } while ( PAPI_enum_cmp_event( &k, PAPI_NTV_ENUM_UMASKS, cid ) == PAPI_OK ); // if we are validating events and the event_available flag is not set yet, try a few more combinations if (flags.check && (event_available == 0)) { // try using the event with the first mask defined for the event and the cpu mask // this is a kludge but many of the uncore events require an event specific mask (usually // the first one defined will do) and they all require the cpu mask strcpy (info.symbol, first_event_mask_string); strcat (info.symbol, ":cpu=1"); check_event(&info); } if (flags.check && (event_available == 0)) { // an even bigger kludge is that there are 4 snpep_unc_pcu events which require the 'ff' and 'cpu' qualifiers to work correctly. 
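The `Groups:` output a few lines above extracts a group id that PAPI packs into bits 16-23 of the event code via `PAPI_NTV_GROUP_AND_MASK` and `PAPI_NTV_GROUP_SHIFT`, subtracting 1 to undo a bias. A sketch of that pack/decode scheme, using illustrative constants rather than PAPI's actual mask and shift values:

```c
#include <stdint.h>

/* Illustrative constants -- the real PAPI_NTV_GROUP_AND_MASK and
 * PAPI_NTV_GROUP_SHIFT live in papi.h; these only demonstrate the
 * "group id in bits 16..23 of the event code" encoding described above. */
#define GROUP_SHIFT    16
#define GROUP_AND_MASK (0xffu << GROUP_SHIFT)

/* Store (group_id + 1) in bits 16..23, so group 0 is distinguishable
 * from "no group bits set". */
static uint32_t pack_group(uint32_t event_code, uint32_t group_id)
{
    return (event_code & ~GROUP_AND_MASK) |
           (((group_id + 1) << GROUP_SHIFT) & GROUP_AND_MASK);
}

/* Recover the group id, undoing the +1 bias -- the same mask, shift,
 * and subtract done when printing the "Groups:" line above. */
static uint32_t unpack_group(uint32_t event_code)
{
    return ((event_code & GROUP_AND_MASK) >> GROUP_SHIFT) - 1;
}
```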
// if nothing else has worked, this code will try those two qualifiers with the current event name to see if it works
					strcpy (info.symbol, first_event_mask_string);
					char *wptr = strrchr (info.symbol, ':');
					if (wptr != NULL) {
						*wptr = '\0';
						strcat (info.symbol, ":ff=64:cpu=1");
						check_event(&info);
					}
				}
			}
		}
		print_event_output(flags.check);
	} while (PAPI_enum_cmp_event( &i, enum_modifier, cid ) == PAPI_OK );
	}
	}

	if (num_cmp_events != 0) {
		printf( "--------------------------------------------------------------------------------\n" );
	}

	printf( "\nTotal events reported: %d\n", num_events );
	if (num_events==0) {
		printf("\nNo events detected! Check papi_component_avail to find out why.\n");
		printf("\n");
	}
	return 0;
}
papi-papi-7-2-0-t/src/utils/papi_version.c000066400000000000000000000016331502707512200204340ustar00rootroot00000000000000/**
 * file papi_version.c
 * @brief papi_version utility.
 * @page papi_version
 * @section Name
 * papi_version - provides version information for PAPI.
 *
 * @section Synopsis
 * papi_version
 *
 * @section Description
 * papi_version is a PAPI utility program that reports version
 * information about the current PAPI installation.
 *
 * @section Bugs
 * There are no known bugs in this utility.
 * If you find a bug, it should be reported to the PAPI Mailing List at .
 */

/* This utility displays the current PAPI version number */

#include <stdio.h>
#include <stdlib.h>

#include "papi.h"

int main( int argc, char **argv )
{
	(void) argc;
	(void) argv;

	printf( "PAPI Version: %d.%d.%d.%d\n",
		PAPI_VERSION_MAJOR( PAPI_VERSION ),
		PAPI_VERSION_MINOR( PAPI_VERSION ),
		PAPI_VERSION_REVISION( PAPI_VERSION ),
		PAPI_VERSION_INCREMENT( PAPI_VERSION ) );

	return 0;
}
papi-papi-7-2-0-t/src/utils/papi_xml_event_info.c000066400000000000000000000272561502707512200217720ustar00rootroot00000000000000/** file papi_xml_event_info.c
 * @page papi_xml_event_info
 * @brief papi_xml_event_info utility.
 * @section NAME
 * papi_xml_event_info - provides detailed information for PAPI events in XML format
 *
 * @section Synopsis
 *
 * @section Description
 * papi_xml_event_info is a PAPI utility program that reports information
 * about the events available on the current platform in an XML format.
 *
 * It will attempt to create an EventSet with each event in it, which
 * can be slow.
 *
 * @section Options
 * <ul>
 * <li>-h            print help message
 * <li>-p            print only preset events
 * <li>-n            print only native events
 * <li>-c COMPONENT  print only events from component number COMPONENT
 * <li>event1, event2, ... Print only events that can be created in the same
 * event set with the events event1, event2, etc.
 * </ul>
 *
 * @section Bugs
 * There are no known bugs in this utility.
 * If you find a bug, it should be reported to the
 * PAPI Mailing List at .
 */

#include <stdio.h>
#include <stdlib.h>

#include "papi.h"

static int EventSet;
static int preset = 1;
static int native = 1;
static int cidx = -1;

/**********************************************************************/
/* Take a string and print a version with properly escaped XML        */
/**********************************************************************/
static int
xmlize( const char *msg, FILE *f )
{
	const char *op;

	if ( !msg )
		return PAPI_OK;

	for ( op = msg; *op != '\0'; op++ ) {
		switch ( *op ) {
		case '"':
			fprintf( f, "&quot;" );
			break;
		case '&':
			fprintf( f, "&amp;" );
			break;
		case '\'':
			fprintf( f, "&apos;" );
			break;
		case '<':
			fprintf( f, "&lt;" );
			break;
		case '>':
			fprintf( f, "&gt;" );
			break;
		default:
			fprintf( f, "%c", *op );
		}
	}
	return PAPI_OK;
}

/*************************************/
/* print hardware info in XML format */
/*************************************/
static int
papi_xml_hwinfo( FILE * f )
{
	const PAPI_hw_info_t *hwinfo;

	if ( ( hwinfo = PAPI_get_hardware_info( ) ) == NULL )
		return PAPI_ESYS;

	fprintf( f, "\n" );
	fprintf( f, " vendor_string, f );
	fprintf( f, "\"/>\n" );
	fprintf( f, " \n", hwinfo->vendor );
	fprintf( f, " model_string, f );
	fprintf( f, "\"/>\n" );
	fprintf( f, " \n", hwinfo->model );
	fprintf( f, " \n", hwinfo->revision );
	fprintf( f, " \n" );
	fprintf( f, " \n", hwinfo->cpuid_family );
	fprintf( f, " \n", hwinfo->cpuid_model );
	fprintf( f, " \n", hwinfo->cpuid_stepping );
	fprintf( f, " \n" );
	fprintf( f, " \n", hwinfo->cpu_max_mhz );
	fprintf( f, " \n", hwinfo->cpu_min_mhz );
	fprintf( f, " \n", hwinfo->threads );
	fprintf( f, " \n", hwinfo->cores );
	fprintf( f, " \n", hwinfo->sockets );
	fprintf( f, " \n", hwinfo->nnodes );
	fprintf( f, " \n", hwinfo->ncpu );
	fprintf( f, " \n", hwinfo->totalcpus );
	fprintf( f, "\n" );

	return PAPI_OK;
}

/****************************************************************/
/* Test if event can be added to an eventset */
/* (there might be existing events if specified on command line */ /****************************************************************/ static int test_event( int evt ) { int retval; retval = PAPI_add_event( EventSet, evt ); if ( retval != PAPI_OK ) { return retval; } if ( ( retval = PAPI_remove_event( EventSet, evt ) ) != PAPI_OK ) { fprintf( stderr, "Error removing event from eventset\n" ); exit( 1 ); } return PAPI_OK; } /***************************************/ /* Convert an event to XML */ /***************************************/ static void xmlize_event( FILE * f, PAPI_event_info_t * info, int num ) { if ( num >= 0 ) { fprintf( f, " symbol, f ); fprintf( f, "\" desc=\""); xmlize( info->long_descr, f ); fprintf( f, "\">\n"); } else { fprintf( f," symbol, f ); fprintf( f,"\" desc=\""); xmlize( info->long_descr, f ); fprintf( f,"\"> \n"); } } /****************************************/ /* Print all preset events */ /****************************************/ static void enum_preset_events( FILE * f, int cidx) { int i, num; int retval; PAPI_event_info_t info; i = PAPI_PRESET_MASK; fprintf( f, " \n" ); num = -1; retval = PAPI_enum_cmp_event( &i, PAPI_ENUM_FIRST, cidx ); while ( retval == PAPI_OK ) { num++; retval = PAPI_get_event_info( i, &info ); if ( retval != PAPI_OK ) { retval = PAPI_enum_cmp_event( &i, PAPI_ENUM_EVENTS, cidx ); continue; } if ( test_event( i ) == PAPI_OK ) { xmlize_event( f, &info, num ); fprintf( f, " \n" ); } retval = PAPI_enum_cmp_event( &i, PAPI_ENUM_EVENTS, cidx ); } fprintf( f, " \n" ); } /****************************************/ /* Print all native events */ /****************************************/ static void enum_native_events( FILE * f, int cidx) { int i, k, num; int retval; PAPI_event_info_t info; i = PAPI_NATIVE_MASK; fprintf( f, " \n" ); num = -1; retval = PAPI_enum_cmp_event( &i, PAPI_ENUM_FIRST, cidx ); while ( retval == PAPI_OK ) { num++; retval = PAPI_get_event_info( i, &info ); if ( retval != PAPI_OK ) { retval = 
PAPI_enum_cmp_event( &i, PAPI_ENUM_EVENTS, cidx ); continue; } /* enumerate any umasks */ k = i; if ( PAPI_enum_cmp_event( &k, PAPI_NTV_ENUM_UMASKS, cidx ) == PAPI_OK ) { /* add the event */ xmlize_event( f, &info, num ); /* add the event's unit masks */ do { retval = PAPI_get_event_info( k, &info ); if ( retval == PAPI_OK ) { if ( test_event( k )!=PAPI_OK ) { continue; } xmlize_event( f, &info, -1 ); } } while ( PAPI_enum_cmp_event( &k, PAPI_NTV_ENUM_UMASKS, cidx ) == PAPI_OK); fprintf( f, "
\n" ); } else { /* this event has no unit masks; test & write the event */ if ( test_event( i ) == PAPI_OK ) { xmlize_event( f, &info, num ); fprintf( f, "
\n" ); } } retval = PAPI_enum_cmp_event( &i, PAPI_ENUM_EVENTS, cidx ); } fprintf( f, " \n" ); } /****************************************/ /* Print usage information */ /****************************************/ static void usage( char *argv[] ) { fprintf( stderr, "Usage: %s [options] [[event1] event2 ...]\n", argv[0] ); fprintf( stderr, " options: -h print help message\n" ); fprintf( stderr, " -p print only preset events\n" ); fprintf( stderr, " -n print only native events\n" ); fprintf( stderr," -c n print only events for component index n\n" ); fprintf( stderr, "If event1, event2, etc., are specified, then only events\n"); fprintf( stderr, "that can be run in addition to these events will be printed\n\n"); } static void parse_command_line (int argc, char **argv, int numc) { int i,retval; for( i = 1; i < argc; i++ ) { if ( argv[i][0] == '-' ) { switch ( argv[i][1] ) { case 'c': /* only events for specified component */ /* UGH, what is this, the IOCCC? */ cidx = (i+1) < argc ? atoi( argv[(i++)+1] ) : -1; if ( cidx < 0 || cidx >= numc ) { fprintf( stderr,"Error: component index %d out of bounds (0..%d)\n", cidx, numc - 1 ); usage( argv ); exit(1); } break; case 'p': /* only preset events */ preset = 1; native = 0; break; case 'n': /* only native events */ native = 1; preset = 0; break; case 'h': /* print help */ usage( argv ); exit(0); break; default: fprintf( stderr, "Error: unknown option: %s\n", argv[i] ); usage( argv ); exit(1); } } else { /* If event names are specified, add them to the */ /* EventSet and test if other events can be run with them */ int code = -1; retval = PAPI_event_name_to_code( argv[i], &code ); retval = PAPI_query_event( code ); if ( retval != PAPI_OK ) { fprintf( stderr, "Error: unknown event: %s\n", argv[i] ); usage( argv ); exit(1); } retval = PAPI_add_event( EventSet, code ); if ( retval != PAPI_OK ) { fprintf( stderr, "Error: event %s cannot be counted with others\n", argv[i] ); usage( argv ); exit(1); } } } } int main( int argc, char 
**argv) { int retval; const PAPI_component_info_t *comp; int numc = 0; retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { fprintf(stderr,"Error! PAPI_library_init\n"); return retval; } /* report any return codes less than 0? */ /* Why? */ #if 0 retval = PAPI_set_debug( PAPI_VERB_ECONT ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_set_debug", retval ); } #endif /* Create EventSet to use */ EventSet = PAPI_NULL; retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_create_eventset\n"); return retval; } /* Get number of components */ numc = PAPI_num_components( ); /* parse command line arguments */ parse_command_line(argc,argv,numc); /* print XML header */ fprintf( stdout, "\n" ); fprintf( stdout, "\n" ); /* print hardware info */ papi_xml_hwinfo( stdout ); /* If a specific component specified, only print events from there */ if ( cidx >= 0 ) { comp = PAPI_get_component_info( cidx ); fprintf( stdout, "\n", cidx, cidx ? "Unknown" : "CPU", comp->name ); if ( native ) enum_native_events( stdout, cidx); if ( preset ) enum_preset_events( stdout, cidx); fprintf( stdout, "\n" ); } else { /* Otherwise, print info for all components */ for ( cidx = 0; cidx < numc; cidx++ ) { comp = PAPI_get_component_info( cidx ); fprintf( stdout, "\n", cidx, cidx ? "Unknown" : "CPU", comp->name ); if ( native ) enum_native_events( stdout, cidx ); if ( preset ) enum_preset_events( stdout, cidx ); fprintf( stdout, "\n" ); /* clean out eventset */ retval = PAPI_cleanup_eventset( EventSet ); if ( retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_cleanup_eventset\n"); return retval; } retval = PAPI_destroy_eventset( &EventSet ); if ( retval != PAPI_OK ) { fprintf(stderr,"Error! PAPI_destroy_eventset\n"); return retval; } EventSet = PAPI_NULL; retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { fprintf(stderr,"Error! 
PAPI_create_eventset\n"); return retval; } /* re-parse command line to set up any events specified */ parse_command_line (argc, argv, numc); } } fprintf( stdout, "\n" ); return 0; } papi-papi-7-2-0-t/src/utils/print_header.c000066400000000000000000000072361502707512200204070ustar00rootroot00000000000000#include #include #include #include "papi.h" /* Support routine to display header information to the screen from the hardware info data structure. The same code was duplicated in a number of tests and utilities. Seems to make sense to refactor. This may not be the best place for it to live, but it works for now. */ int papi_print_header( char *prompt, const PAPI_hw_info_t ** hwinfo ) { int cnt, mpx; struct utsname uname_info; PAPI_option_t options; memset(&options,0,sizeof(PAPI_option_t)); if ( ( *hwinfo = PAPI_get_hardware_info( ) ) == NULL ) { return PAPI_ESYS; } PAPI_get_opt(PAPI_COMPONENTINFO,&options); uname(&uname_info); printf( "%s", prompt ); printf ( "--------------------------------------------------------------------------------\n" ); printf( "PAPI version : %d.%d.%d.%d\n", PAPI_VERSION_MAJOR( PAPI_VERSION ), PAPI_VERSION_MINOR( PAPI_VERSION ), PAPI_VERSION_REVISION( PAPI_VERSION ), PAPI_VERSION_INCREMENT( PAPI_VERSION ) ); printf( "Operating system : %s %s\n", uname_info.sysname, uname_info.release); printf( "Vendor string and code : %s (%d, 0x%x)\n", ( *hwinfo )->vendor_string, ( *hwinfo )->vendor, ( *hwinfo )->vendor ); printf( "Model string and code : %s (%d, 0x%x)\n", ( *hwinfo )->model_string, ( *hwinfo )->model, ( *hwinfo )->model ); printf( "CPU revision : %f\n", ( *hwinfo )->revision ); if ( ( *hwinfo )->cpuid_family > 0 ) { printf( "CPUID : Family/Model/Stepping %d/%d/%d, " "0x%02x/0x%02x/0x%02x\n", ( *hwinfo )->cpuid_family, ( *hwinfo )->cpuid_model, ( *hwinfo )->cpuid_stepping, ( *hwinfo )->cpuid_family, ( *hwinfo )->cpuid_model, ( *hwinfo )->cpuid_stepping ); } printf( "CPU Max MHz : %d\n", ( *hwinfo )->cpu_max_mhz ); printf( "CPU Min MHz : 
%d\n", ( *hwinfo )->cpu_min_mhz ); printf( "Total cores : %d\n", ( *hwinfo )->totalcpus ); if ( ( *hwinfo )->threads > 0 ) printf( "SMT threads per core : %d\n", ( *hwinfo )->threads ); if ( ( *hwinfo )->cores > 0 ) printf( "Cores per socket : %d\n", ( *hwinfo )->cores ); if ( ( *hwinfo )->sockets > 0 ) printf( "Sockets : %d\n", ( *hwinfo )->sockets ); printf( "Cores per NUMA region : %d\n", ( *hwinfo )->ncpu ); printf( "NUMA regions : %d\n", ( *hwinfo )->nnodes ); printf( "Running in a VM : %s\n", ( *hwinfo )->virtualized? "yes":"no"); if ( (*hwinfo)->virtualized) { printf( "VM Vendor : %s\n", (*hwinfo)->virtual_vendor_string); } cnt = PAPI_get_opt( PAPI_MAX_HWCTRS, NULL ); mpx = PAPI_get_opt( PAPI_MAX_MPX_CTRS, NULL ); int numcmp = PAPI_num_components( ); int perf_event = 0, cid; for ( cid = 0; cid < numcmp; cid++ ) { const PAPI_component_info_t* cmpinfo = PAPI_get_component_info( cid ); if (cmpinfo->disabled) continue; if (strcmp(cmpinfo->name, "perf_event")== 0) perf_event = 1; } if ( cnt >= 0 ) { printf( "Number Hardware Counters : %d\n", cnt ); } else if ( perf_event == 0 ) { printf( "Number Hardware Counters : NA\n" ); } else { printf( "Number Hardware Counters : PAPI error %d: %s\n", cnt, PAPI_strerror(cnt)); } if ( mpx >= 0 ) { printf( "Max Multiplex Counters : %d\n", mpx ); } else { printf( "Max Multiplex Counters : PAPI error %d: %s\n", mpx, PAPI_strerror(mpx)); } if (options.cmp_info != NULL) { printf("Fast counter read (rdpmc): %s\n", options.cmp_info->fast_counter_read ? 
"yes" : "no"); } printf( "--------------------------------------------------------------------------------\n" ); printf( "\n" ); return PAPI_OK; } papi-papi-7-2-0-t/src/utils/print_header.h000066400000000000000000000001101502707512200203740ustar00rootroot00000000000000int papi_print_header( char *prompt, const PAPI_hw_info_t ** hwinfo ); papi-papi-7-2-0-t/src/validation_tests/000077500000000000000000000000001502707512200200035ustar00rootroot00000000000000papi-papi-7-2-0-t/src/validation_tests/Makefile000066400000000000000000000014241502707512200214440ustar00rootroot00000000000000# File: validation_tests/Makefile include Makefile.target INCLUDE = -I../testlib -I.. -I. testlibdir= ../testlib TESTLIB= $(testlibdir)/libtestlib.a DOLOOPS= $(testlibdir)/do_loops.o CLOCKCORE= $(testlibdir)/clockcore.o EXTRALIB = -lrt include Makefile.recipies .PHONY : install install: all @echo "Validation tests (DATADIR) being installed in: \"$(DATADIR)\""; -mkdir -p $(DATADIR)/validation_tests -chmod go+rx $(DATADIR) -chmod go+rx $(DATADIR)/validation_tests -find . -perm -100 -type f -exec cp {} $(DATADIR)/validation_tests \; -chmod go+rx $(DATADIR)/validation_tests/* -find . 
-name "*.[ch]" -type f -exec cp {} $(DATADIR)/validation_tests \; -cp Makefile.target $(DATADIR)/validation_tests/Makefile -cat Makefile.recipies >> $(DATADIR)/validation_tests/Makefile papi-papi-7-2-0-t/src/validation_tests/Makefile.recipies000066400000000000000000000163451502707512200232560ustar00rootroot00000000000000ALL = fp_validation_hl \ cycles_validation flops_validation \ papi_br_cn papi_br_ins papi_br_msp \ papi_br_ntk papi_br_prc papi_br_tkn papi_br_ucn \ papi_dp_ops papi_fp_ops papi_sp_ops papi_hw_int \ papi_l1_dca papi_l1_dcm \ papi_l2_dca papi_l2_dcm papi_l2_dcr papi_l2_dcw \ papi_ld_ins papi_sr_ins \ papi_ref_cyc papi_tot_cyc papi_tot_ins all: $(ALL) %.o:%.c $(CC) $(CFLAGS) $(OPTFLAGS) $(INCLUDE) -c $< display_error.o: display_error.c display_error.h $(CC) $(INCLUDE) $(CFLAGS) $(OPTFLAGS) -c display_error.c cache_helper.o: cache_helper.c cache_helper.h $(CC) $(INCLUDE) $(CFLAGS) $(OPTFLAGS) -c cache_helper.c branches_testcode.o: branches_testcode.c testcode.h $(CC) $(INCLUDE) $(CFLAGS) $(OPTFLAGS) -c branches_testcode.c busy_work.o: busy_work.c testcode.h $(CC) $(INCLUDE) $(CFLAGS) $(OPTFLAGS) -c busy_work.c cache_testcode.o: cache_testcode.c testcode.h $(CC) $(INCLUDE) $(CFLAGS) $(OPTFLAGS) -c cache_testcode.c flops_testcode.o: flops_testcode.c testcode.h $(CC) $(INCLUDE) $(CFLAGS) $(OPTFLAGS) -c flops_testcode.c instructions_testcode.o: instructions_testcode.c testcode.h $(CC) $(INCLUDE) $(CFLAGS) $(OPTFLAGS) -c instructions_testcode.c matrix_multiply.o: matrix_multiply.c matrix_multiply.h $(CC) $(INCLUDE) $(CFLAGS) $(OPTFLAGS) -O1 -c matrix_multiply.c load_store_testcode.o: load_store_testcode.c testcode.h $(CC) $(INCLUDE) $(CFLAGS) $(OPTFLAGS) -c load_store_testcode.c fp_validation_hl: fp_validation_hl.o $(TESTLIB) $(PAPILIB) flops_testcode.o $(CC) -o fp_validation_hl fp_validation_hl.o $(TESTLIB) flops_testcode.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) -lpthread cycles_validation: cycles_validation.o $(TESTLIB) $(PAPILIB) display_error.o 
instructions_testcode.o $(CC) -o cycles_validation cycles_validation.o $(TESTLIB) display_error.o instructions_testcode.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) flops_validation: flops_validation.o $(TESTLIB) $(PAPILIB) display_error.o branches_testcode.o flops_testcode.o $(CC) -o flops_validation flops_validation.o $(TESTLIB) display_error.o branches_testcode.o flops_testcode.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) memleak_check: memleak_check.o $(TESTLIB) $(PAPILIB) display_error.o branches_testcode.o $(CC) -o memleak_check memleak_check.o $(TESTLIB) display_error.o branches_testcode.o $(PAPILIB) $(LDFLAGS) $(LDFLAGS) $(EXTRALIB) papi_br_cn: papi_br_cn.o $(TESTLIB) $(PAPILIB) display_error.o branches_testcode.o $(CC) -o papi_br_cn papi_br_cn.o $(TESTLIB) display_error.o branches_testcode.o $(PAPILIB) $(LDFLAGS) $(LDFLAGS) $(EXTRALIB) papi_br_ins: papi_br_ins.o $(TESTLIB) $(PAPILIB) display_error.o branches_testcode.o $(CC) -o papi_br_ins papi_br_ins.o $(TESTLIB) display_error.o branches_testcode.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_br_msp: papi_br_msp.o $(TESTLIB) $(PAPILIB) display_error.o branches_testcode.o $(CC) -o papi_br_msp papi_br_msp.o $(TESTLIB) display_error.o branches_testcode.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_br_ntk: papi_br_ntk.o $(TESTLIB) $(PAPILIB) display_error.o branches_testcode.o $(CC) -o papi_br_ntk papi_br_ntk.o $(TESTLIB) display_error.o branches_testcode.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_br_prc: papi_br_prc.o $(TESTLIB) $(PAPILIB) display_error.o branches_testcode.o $(CC) -o papi_br_prc papi_br_prc.o $(TESTLIB) display_error.o branches_testcode.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_br_tkn: papi_br_tkn.o $(TESTLIB) $(PAPILIB) display_error.o branches_testcode.o $(CC) -o papi_br_tkn papi_br_tkn.o $(TESTLIB) display_error.o branches_testcode.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_br_ucn: papi_br_ucn.o $(TESTLIB) $(PAPILIB) display_error.o branches_testcode.o $(CC) -o papi_br_ucn papi_br_ucn.o $(TESTLIB) display_error.o 
branches_testcode.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_dp_ops: papi_dp_ops.o $(TESTLIB) $(PAPILIB) display_error.o branches_testcode.o flops_testcode.o $(CC) -o papi_dp_ops papi_dp_ops.o $(TESTLIB) display_error.o branches_testcode.o flops_testcode.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_fp_ops: papi_fp_ops.o $(TESTLIB) $(PAPILIB) display_error.o branches_testcode.o flops_testcode.o $(CC) -o papi_fp_ops papi_fp_ops.o $(TESTLIB) display_error.o branches_testcode.o flops_testcode.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_hw_int: papi_hw_int.o $(TESTLIB) $(PAPILIB) $(CC) -o papi_hw_int papi_hw_int.o $(TESTLIB) $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_ld_ins: papi_ld_ins.o $(TESTLIB) $(PAPILIB) display_error.o matrix_multiply.o load_store_testcode.o $(CC) -o papi_ld_ins papi_ld_ins.o $(TESTLIB) display_error.o matrix_multiply.o load_store_testcode.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_l1_dca: papi_l1_dca.o $(TESTLIB) $(PAPILIB) cache_testcode.o display_error.o matrix_multiply.o $(CC) -o papi_l1_dca papi_l1_dca.o $(TESTLIB) cache_testcode.o display_error.o matrix_multiply.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_l1_dcm: papi_l1_dcm.o $(TESTLIB) $(PAPILIB) cache_helper.o cache_testcode.o display_error.o matrix_multiply.o $(CC) -o papi_l1_dcm papi_l1_dcm.o $(TESTLIB) cache_helper.o cache_testcode.o display_error.o matrix_multiply.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_l2_dca: papi_l2_dca.o $(TESTLIB) $(PAPILIB) cache_helper.o cache_testcode.o display_error.o matrix_multiply.o $(CC) -o papi_l2_dca papi_l2_dca.o $(TESTLIB) cache_helper.o cache_testcode.o display_error.o matrix_multiply.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_l2_dcm: papi_l2_dcm.o $(TESTLIB) $(PAPILIB) cache_helper.o cache_testcode.o display_error.o matrix_multiply.o $(CC) -o papi_l2_dcm papi_l2_dcm.o $(TESTLIB) cache_helper.o cache_testcode.o display_error.o matrix_multiply.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_l2_dcr: papi_l2_dcr.o $(TESTLIB) $(PAPILIB) cache_helper.o cache_testcode.o 
display_error.o matrix_multiply.o $(CC) -o papi_l2_dcr papi_l2_dcr.o $(TESTLIB) cache_helper.o cache_testcode.o display_error.o matrix_multiply.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_l2_dcw: papi_l2_dcw.o $(TESTLIB) $(PAPILIB) cache_helper.o cache_testcode.o display_error.o matrix_multiply.o $(CC) -o papi_l2_dcw papi_l2_dcw.o $(TESTLIB) cache_helper.o cache_testcode.o display_error.o matrix_multiply.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_ref_cyc: papi_ref_cyc.o $(TESTLIB) $(PAPILIB) display_error.o flops_testcode.o $(CC) -o papi_ref_cyc papi_ref_cyc.o $(TESTLIB) display_error.o flops_testcode.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_sp_ops: papi_sp_ops.o $(TESTLIB) $(PAPILIB) display_error.o branches_testcode.o flops_testcode.o $(CC) -o papi_sp_ops papi_sp_ops.o $(TESTLIB) display_error.o branches_testcode.o flops_testcode.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_sr_ins: papi_sr_ins.o $(TESTLIB) $(PAPILIB) display_error.o matrix_multiply.o load_store_testcode.o $(CC) -o papi_sr_ins papi_sr_ins.o $(TESTLIB) display_error.o matrix_multiply.o load_store_testcode.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_tot_cyc: papi_tot_cyc.o $(TESTLIB) $(PAPILIB) display_error.o matrix_multiply.o $(CC) -o papi_tot_cyc papi_tot_cyc.o $(TESTLIB) display_error.o matrix_multiply.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) papi_tot_ins: papi_tot_ins.o $(TESTLIB) $(PAPILIB) display_error.o instructions_testcode.o $(CC) -o papi_tot_ins papi_tot_ins.o $(TESTLIB) display_error.o instructions_testcode.o $(PAPILIB) $(LDFLAGS) $(EXTRALIB) .PHONY : all clean distclean clobber clean: rm -f *.o *.stderr *.stdout core *~ $(ALL) distclean clobber: clean rm -f Makefile.target papi-papi-7-2-0-t/src/validation_tests/Makefile.target.in000066400000000000000000000010001502707512200233240ustar00rootroot00000000000000PACKAGE_TARNAME = @PACKAGE_TARNAME@ prefix = @prefix@ exec_prefix = @exec_prefix@ datarootdir = @datarootdir@ datadir = @datadir@/${PACKAGE_TARNAME} testlibdir = $(datadir)/testlib DATADIR = 
$(DESTDIR)$(datadir) INCLUDE = -I. -I@includedir@ -I$(testlibdir) LIBDIR = @libdir@ LIBRARY = @LIBRARY@ SHLIB = @SHLIB@ PAPILIB = ../@LINKLIB@ TESTLIB = $(testlibdir)/libtestlib.a LDFLAGS = @LDFLAGS@ @LDL@ @STATIC@ CC = @CC@ MPICC = @MPICC@ F77 = @F77@ CC_R = @CC_R@ CFLAGS = @CFLAGS@ @OPTFLAGS@ OMPCFLGS = @OMPCFLGS@ papi-papi-7-2-0-t/src/validation_tests/branches_testcode.c000066400000000000000000000056721502707512200236400ustar00rootroot00000000000000#include #include #include "testcode.h" #define BRNG() {\ b = ((z1 << 6) ^ z1) >> 13;\ z1 = ((z1 & 4294967294U) << 18) ^ b;\ b = ((z2 << 2) ^ z2) >> 27;\ z2 = ((z2 & 4294967288U) << 2) ^ b;\ b = ((z3 << 13) ^ z3) >> 21;\ z3 = ((z3 & 4294967280U) << 7) ^ b;\ b = ((z4 << 3) ^ z4) >> 12;\ z4 = ((z4 & 4294967168U) << 13) ^ b;\ z1++;\ result = z1 ^ z2 ^ z3 ^ z4;\ } /* This code has 1,500,000 total branches */ /* 500,000 not-taken conditional branches */ /* 500,000 taken conditional branches */ /* 500,000 unconditional branches */ int branches_testcode(void) { #if defined(__i386__) || (defined __x86_64__) asm( "\txor %%ecx,%%ecx\n" "\tmov $500000,%%ecx\n" "test_loop:\n" "\tjmp test_jmp\n" "\tnop\n" "test_jmp:\n" "\txor %%eax,%%eax\n" "\tjnz test_jmp2\n" "\tinc %%eax\n" "test_jmp2:\n" "\tdec %%ecx\n" "\tjnz test_loop\n" : /* no output registers */ : /* no inputs */ : "cc", "%ecx", "%eax" /* clobbered */ ); return 0; #elif defined(__arm__) /* Initial code contributed by sam wang linux.swang _at_ gmail.com */ asm( "\teor r3,r3,r3\n" "\tldr r3,=500000\n" "test_loop:\n" "\tB test_jmp\n" "\tnop\n" "test_jmp:\n" "\teor r2,r2,r2\n" "\tcmp r2,#1\n" "\tbge test_jmp2\n" "\tnop\n" "\tadd r2,r2,#1\n" "test_jmp2:\n" "\tsub r3,r3,#1\n" "\tcmp r3,#1\n" "\tbgt test_loop\n" : /* no output registers */ : /* no inputs */ : "cc", "r2", "r3" /* clobbered */ ); return 0; #elif defined(__aarch64__) asm( "\teor x3,x3,x3\n" "\tldr x3,=500000\n" "test_loop:\n" "\tB test_jmp\n" "\tnop\n" "test_jmp:\n" "\teor x2,x2,x2\n" "\tcmp x2,#1\n" "\tbge 
test_jmp2\n" "\tnop\n" "\tadd x2,x2,#1\n" "test_jmp2:\n" "\tsub x3,x3,#1\n" "\tcmp x3,#1\n" "\tbgt test_loop\n" : /* no output registers */ : /* no inputs */ : "cc", "x2", "x3" /* clobbered */ ); return 0; #elif defined(__powerpc__) /* Not really optimized */ asm( "\txor 3,3,3\n" "\tlis 3,500000@ha\n" "\taddi 3,3,500000@l\n" "test_loop:\n" "\tb test_jmp\n" "\tnop\n" "test_jmp:\n" "\txor 4,4,4\n" "\tcmpwi cr0,4,1\n" "\tbge test_jmp2\n" "\tnop\n" "\taddi 4,4,1\n" "test_jmp2:\n" "\taddi 3,3,-1\n" "\tcmpwi cr0,3,1\n" "\tbgt test_loop\n" : /* no output registers */ : /* no inputs */ : "cr0", "r3", "r4" /* clobbered */ ); return 0; #endif return -1; } int random_branches_testcode(int number, int quiet) { int j,junk=0; double junk2=5.0; long int b,z1,z2,z3,z4,result; z1=236; z2=347; z3=458; z4=9751; for(j=0;j #include /* Repeat doing some busy-work floating point */ /* Until at least len seconds have passed */ double do_cycles( int minimum_time ) { struct timeval start, now; double x, sum; gettimeofday( &start, NULL ); for ( ;; ) { sum = 1.0; for ( x = 1.0; x < 250000.0; x += 1.0 ) { sum += x; } if ( sum < 0.0 ) { printf( "==>> SUM IS NEGATIVE !! 
<<==\n" ); } gettimeofday( &now, NULL ); if ( now.tv_sec >= start.tv_sec + minimum_time ) { break; } } return sum; } papi-papi-7-2-0-t/src/validation_tests/cache_helper.c000066400000000000000000000075561502707512200225660ustar00rootroot00000000000000#include #include "cache_helper.h" #include "papi.h" #include "papi_test.h" const PAPI_hw_info_t *hw_info=NULL; struct cache_info_t { int wpolicy; int replace; int size; int entries; int ways; int linesize; }; static struct cache_info_t cache_info[MAX_CACHE]; static int check_if_cache_info_available(void) { int cache_type,level,j; /* Get PAPI Hardware Info */ hw_info=PAPI_get_hardware_info(); if (hw_info==NULL) { return -1; } /* Iterate down the levels (L1, L2, L3) */ for(level=0;levelmem_hierarchy.levels;level++) { for(j=0;j<2;j++) { cache_type=PAPI_MH_CACHE_TYPE( hw_info->mem_hierarchy.level[level].cache[j].type); if (cache_type==PAPI_MH_TYPE_EMPTY) continue; if (level==0) { if (cache_type==PAPI_MH_TYPE_DATA) { cache_info[L1D_CACHE].size=hw_info->mem_hierarchy.level[level].cache[j].size; cache_info[L1D_CACHE].linesize=hw_info->mem_hierarchy.level[level].cache[j].line_size; cache_info[L1D_CACHE].ways=hw_info->mem_hierarchy.level[level].cache[j].associativity; cache_info[L1D_CACHE].entries=cache_info[L1D_CACHE].size/cache_info[L1D_CACHE].linesize; cache_info[L1D_CACHE].wpolicy=PAPI_MH_CACHE_WRITE_POLICY(hw_info->mem_hierarchy.level[level].cache[j].type); cache_info[L1D_CACHE].replace=PAPI_MH_CACHE_REPLACEMENT_POLICY(hw_info->mem_hierarchy.level[level].cache[j].type); } else if (cache_type==PAPI_MH_TYPE_INST) { cache_info[L1I_CACHE].size=hw_info->mem_hierarchy.level[level].cache[j].size; cache_info[L1I_CACHE].linesize=hw_info->mem_hierarchy.level[level].cache[j].line_size; cache_info[L1I_CACHE].ways=hw_info->mem_hierarchy.level[level].cache[j].associativity; cache_info[L1I_CACHE].entries=cache_info[L1I_CACHE].size/cache_info[L1I_CACHE].linesize; 
cache_info[L1I_CACHE].wpolicy=PAPI_MH_CACHE_WRITE_POLICY(hw_info->mem_hierarchy.level[level].cache[j].type); cache_info[L1I_CACHE].replace=PAPI_MH_CACHE_REPLACEMENT_POLICY(hw_info->mem_hierarchy.level[level].cache[j].type); } } else if (level==1) { cache_info[L2_CACHE].size=hw_info->mem_hierarchy.level[level].cache[j].size; cache_info[L2_CACHE].linesize=hw_info->mem_hierarchy.level[level].cache[j].line_size; cache_info[L2_CACHE].ways=hw_info->mem_hierarchy.level[level].cache[j].associativity; cache_info[L2_CACHE].entries=cache_info[L2_CACHE].size/cache_info[L2_CACHE].linesize; cache_info[L2_CACHE].wpolicy=PAPI_MH_CACHE_WRITE_POLICY(hw_info->mem_hierarchy.level[level].cache[j].type); cache_info[L2_CACHE].replace=PAPI_MH_CACHE_REPLACEMENT_POLICY(hw_info->mem_hierarchy.level[level].cache[j].type); } else if (level==2) { cache_info[L3_CACHE].size=hw_info->mem_hierarchy.level[level].cache[j].size; cache_info[L3_CACHE].linesize=hw_info->mem_hierarchy.level[level].cache[j].line_size; cache_info[L3_CACHE].ways=hw_info->mem_hierarchy.level[level].cache[j].associativity; cache_info[L3_CACHE].entries=cache_info[L3_CACHE].size/cache_info[L3_CACHE].linesize; cache_info[L3_CACHE].wpolicy=PAPI_MH_CACHE_WRITE_POLICY(hw_info->mem_hierarchy.level[level].cache[j].type); cache_info[L3_CACHE].replace=PAPI_MH_CACHE_REPLACEMENT_POLICY(hw_info->mem_hierarchy.level[level].cache[j].type); } } } return 0; } long long get_cachesize(int type) { int result; result=check_if_cache_info_available(); if (result<0) return result; if (type>=MAX_CACHE) { printf("Errror!\n"); return -1; } return cache_info[type].size; } long long get_entries(int type) { int result; result=check_if_cache_info_available(); if (result<0) return result; if (type>=MAX_CACHE) { printf("Errror!\n"); return -1; } return cache_info[type].entries; } long long get_linesize(int type) { int result; result=check_if_cache_info_available(); if (result<0) return result; if (type>=MAX_CACHE) { printf("Errror!\n"); return -1; } return 
cache_info[type].linesize; } papi-papi-7-2-0-t/src/validation_tests/cache_helper.h000066400000000000000000000003261502707512200225570ustar00rootroot00000000000000#define L1I_CACHE 0 #define L1D_CACHE 1 #define L2_CACHE 2 #define L3_CACHE 3 #define MAX_CACHE (L3_CACHE+1) long long get_cachesize(int type); long long get_entries(int type); long long get_linesize(int type); papi-papi-7-2-0-t/src/validation_tests/cache_testcode.c000066400000000000000000000012041502707512200231010ustar00rootroot00000000000000#include #include #include "testcode.h" int cache_write_test(double *array, int size) { int i; for(i=0; i #include #include "papi.h" #include "papi_test.h" #include "testcode.h" #define MAX_CYCLE_ERROR 30 #define NUM_EVENTS 2 #define NUM_LOOPS 200 int main( int argc, char **argv ) { int retval, tmp, result, i; int EventSet1 = PAPI_NULL; long long values[NUM_EVENTS]; long long elapsed_us, elapsed_cyc, elapsed_virt_us, elapsed_virt_cyc; double cycles_error; int quiet=0; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); /* Init the PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* Initialize the EventSet */ retval=PAPI_create_eventset(&EventSet1); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } /* Add PAPI_TOT_CYC */ retval=PAPI_add_named_event(EventSet1,"PAPI_TOT_CYC"); if (retval!=PAPI_OK) { if (!quiet) printf("Trouble adding PAPI_TOT_CYC\n"); test_skip( __FILE__, __LINE__, "adding PAPI_TOT_CYC", retval ); } /* Add PAPI_TOT_INS */ retval=PAPI_add_named_event(EventSet1,"PAPI_TOT_INS"); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "adding PAPI_TOT_INS", retval ); } /* warm up the processor to pull it out of idle state */ for(i=0;i<100;i++) { result=instructions_million(); } if (result==CODE_UNIMPLEMENTED) { if (!quiet) printf("Instructions testcode not available\n"); test_skip( __FILE__, 
__LINE__, "No instructions code", retval ); } /* Gather before stats */ elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); elapsed_virt_us = PAPI_get_virt_usec( ); elapsed_virt_cyc = PAPI_get_virt_cyc( ); /* Start PAPI */ retval = PAPI_start( EventSet1 ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } /* our work code */ for(i=0;i MAX_CYCLE_ERROR) || (cycles_error < -MAX_CYCLE_ERROR)) { if (!quiet) printf("PAPI_TOT_CYC Error of %.2f%%\n",cycles_error); test_warn( __FILE__, __LINE__, "Cycles validation", 0 ); } /* Check that TOT_INS is reasonable */ if (llabs(values[1] - (1000000*NUM_LOOPS)) > (1000000*NUM_LOOPS)) { printf("%s Error of %.2f%%\n", "PAPI_TOT_INS", (100.0 * (double)(values[1] - (1000000*NUM_LOOPS)))/(1000000*NUM_LOOPS)); test_fail( __FILE__, __LINE__, "Instruction validation", 0 ); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/validation_tests/display_error.c000066400000000000000000000013041502707512200230230ustar00rootroot00000000000000#include #include #include #include "display_error.h" double display_error(long long average, long long high, long long low, long long expected, int quiet) { double error; error=(((double)average-expected)/expected)*100.0; if (!quiet) { printf(" Expected: %lld\n", expected); printf(" High: %lld Low: %lld Average: %lld\n", high,low,average); printf(" ( note, a small value above %lld may be expected due\n", expected); printf(" to overhead and interrupt noise, among other reasons)\n"); printf(" Average Error = %.2f%%\n",error); } return error; } papi-papi-7-2-0-t/src/validation_tests/display_error.h000066400000000000000000000002211502707512200230250ustar00rootroot00000000000000double display_error(long long average, long long high, long long low, long long expected, int quiet); papi-papi-7-2-0-t/src/validation_tests/flops_testcode.c000066400000000000000000000077331502707512200231760ustar00rootroot00000000000000/* This includes various workloads that had 
been scattered all over */ /* the various ctests. The goal is to have them in one place, and */ /* share them, as well as maybe have only one file that has to be */ /* compiled with reduced optimizations */ #include #include #include "testcode.h" #define ROWS 1000 #define COLUMNS 1000 static float float_matrixa[ROWS][COLUMNS], float_matrixb[ROWS][COLUMNS], float_mresult[ROWS][COLUMNS]; static double double_matrixa[ROWS][COLUMNS], double_matrixb[ROWS][COLUMNS], double_mresult[ROWS][COLUMNS]; int flops_float_init_matrix(void) { int i,j; /* Initialize the Matrix arrays */ /* Non-optimail row major. Intentional? */ for ( i = 0; i < ROWS; i++ ) { for ( j = 0; j < COLUMNS; j++) { float_mresult[j][i] = 0.0; float_matrixa[j][i] = ( float ) rand() * ( float ) 1.1; float_matrixb[j][i] = ( float ) rand() * ( float ) 1.1; } } #if defined(__powerpc__) /* Has fused multiply-add */ return ROWS*ROWS*ROWS; #else return ROWS*ROWS*ROWS*2; #endif } float flops_float_matrix_matrix_multiply(void) { int i,j,k; /* Matrix-Matrix multiply */ for ( i = 0; i < ROWS; i++ ) { for ( j = 0; j < COLUMNS; j++ ) { for ( k = 0; k < COLUMNS; k++ ) { float_mresult[i][j] += float_matrixa[i][k] * float_matrixb[k][j]; } } } return float_mresult[10][10]; } float flops_float_swapped_matrix_matrix_multiply(void) { int i, j, k; /* Matrix-Matrix multiply */ /* With inner loops swapped */ for (i = 0; i < ROWS; i++) { for (k = 0; k < COLUMNS; k++) { for (j = 0; j < COLUMNS; j++) { float_mresult[i][j] += float_matrixa[i][k] * float_matrixb[k][j]; } } } return float_mresult[10][10]; } int flops_double_init_matrix(void) { int i,j; /* Initialize the Matrix arrays */ /* Non-optimail row major. Intentional? 
*/ for ( i = 0; i < ROWS; i++ ) { for ( j = 0; j < COLUMNS; j++) { double_mresult[j][i] = 0.0; double_matrixa[j][i] = ( double ) rand() * ( double ) 1.1; double_matrixb[j][i] = ( double ) rand() * ( double ) 1.1; } } #if defined(__powerpc__) /* has fused multiply-add */ return ROWS*ROWS*ROWS; #else return ROWS*ROWS*ROWS*2; #endif } double flops_double_matrix_matrix_multiply(void) { int i,j,k; /* Matrix-Matrix multiply */ for ( i = 0; i < ROWS; i++ ) { for ( j = 0; j < COLUMNS; j++ ) { for ( k = 0; k < COLUMNS; k++ ) { double_mresult[i][j] += double_matrixa[i][k] * double_matrixb[k][j]; } } } return double_mresult[10][10]; } double flops_double_swapped_matrix_matrix_multiply(void) { int i, j, k; /* Matrix-Matrix multiply */ /* With inner loops swapped */ for (i = 0; i < ROWS; i++) { for (k = 0; k < COLUMNS; k++) { for (j = 0; j < COLUMNS; j++) { double_mresult[i][j] += double_matrixa[i][k] * double_matrixb[k][j]; } } } return double_mresult[10][10]; } /* This was originally called "dummy3" in the various sdsc tests */ /* Does a lot of floating point ops near 1.0 */ /* In theory returns a value roughly equal to the number of flops */ double do_flops3( double x, int iters, int quiet ) { int i; double w, y, z, a, b, c, d, e, f, g, h; double result; double one; one = 1.0; w = x; y = x; z = x; a = x; b = x; c = x; d = x; e = x; f = x; g = x; h = x; for ( i = 1; i <= iters; i++ ) { w = w * 1.000000000001 + one; y = y * 1.000000000002 + one; z = z * 1.000000000003 + one; a = a * 1.000000000004 + one; b = b * 1.000000000005 + one; c = c * 0.999999999999 + one; d = d * 0.999999999998 + one; e = e * 0.999999999997 + one; f = f * 0.999999999996 + one; g = h * 0.999999999995 + one; h = h * 1.000000000006 + one; } result = 2.0 * ( a + b + c + d + e + f + w + x + y + z + g + h ); if (!quiet) printf("Result = %lf\n", result); return result; } volatile double a = 0.5, b = 2.2; double do_flops( int n, int quiet ) { int i; double c = 0.11; for ( i = 0; i < n; i++ ) { c += a * b; } if 
(!quiet) printf("%lf\n",c); return c; } papi-papi-7-2-0-t/src/validation_tests/flops_validation.c000066400000000000000000000171661502707512200235170ustar00rootroot00000000000000/* flops.c, based on the hl_rates.c ctest * * This test runs a "classic" matrix multiply * and then runs it again with the inner loop swapped. * the swapped version should have better MFLIPS/MFLOPS/IPC and we test that. */ #include #include #include "papi.h" #include "papi_test.h" #include "testcode.h" int main( int argc, char **argv ) { int retval; double rtime, ptime, mflips, mflops, ipc; long long flips=0, flops=0, ins[2]; double rtime_start,rtime_end; double ptime_start,ptime_end; double rtime_classic,rtime_swapped; double mflips_classic,mflips_swapped; double mflops_classic,mflops_swapped; double ipc_classic,ipc_swapped; int quiet,event_added_flips,event_added_flops,event_added_ipc; int eventset=PAPI_NULL; /* Set TESTS_QUIET variable */ quiet=tests_quiet( argc, argv ); /* Init the PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* Create the eventset */ retval=PAPI_create_eventset(&eventset); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } /* Initialize the test matrix */ flops_float_init_matrix(); /************************/ /* FLIPS */ /************************/ if (!quiet) { printf( "\n----------------------------------\n" ); printf( "PAPI_flips\n"); } /* Add FP_INS event */ retval=PAPI_add_named_event(eventset,"PAPI_FP_INS"); if (retval!=PAPI_OK) { if (!quiet) fprintf(stderr,"PAPI_FP_INS not available!\n"); event_added_flips=0; } else { event_added_flips=1; } if (event_added_flips) { PAPI_start(eventset); } rtime_start=PAPI_get_real_usec(); ptime_start=PAPI_get_virt_usec(); // Flips classic flops_float_matrix_matrix_multiply(); rtime_end=PAPI_get_real_usec(); ptime_end=PAPI_get_virt_usec(); if (event_added_flips) { 
PAPI_stop(eventset,&flips); } rtime=rtime_end-rtime_start; ptime=ptime_end-ptime_start; mflips=flips/rtime; if (!quiet) { printf( "\nClassic\n"); printf( "real time: %lf\n", rtime); printf( "process time: %lf\n", ptime); printf( "FP Instructions: %lld\n", flips); printf( "MFLIPS %lf\n", mflips); } mflips_classic=mflips; // Flips swapped rtime_start=PAPI_get_real_usec(); ptime_start=PAPI_get_virt_usec(); if (event_added_flips) { PAPI_reset(eventset); PAPI_start(eventset); } flops_float_swapped_matrix_matrix_multiply(); rtime_end=PAPI_get_real_usec(); ptime_end=PAPI_get_virt_usec(); if (event_added_flips) { PAPI_stop(eventset,&flips); } rtime=rtime_end-rtime_start; ptime=ptime_end-ptime_start; mflips=flips/rtime; if (!quiet) { printf( "\nSwapped\n"); printf( "real time: %f\n", rtime); printf( "process time: %f\n", ptime); printf( "FP Instructions: %lld\n", flips); printf( "MFLIPS %f\n", mflips); } mflips_swapped=mflips; // turn off flips if (event_added_flips) { retval=PAPI_remove_named_event(eventset,"PAPI_FP_INS"); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_remove_named_event", retval ); } } /************************/ /* FLOPS */ /************************/ if (!quiet) { printf( "\n----------------------------------\n" ); printf( "PAPI_flops\n"); } /* Add FP_OPS event */ retval=PAPI_add_named_event(eventset,"PAPI_FP_OPS"); if (retval!=PAPI_OK) { if (!quiet) fprintf(stderr,"PAPI_FP_OPS not available!\n"); event_added_flops=0; } else { event_added_flops=1; } if (event_added_flops) { PAPI_start(eventset); } rtime_start=PAPI_get_real_usec(); ptime_start=PAPI_get_virt_usec(); // Classic flops flops_float_matrix_matrix_multiply(); rtime_end=PAPI_get_real_usec(); ptime_end=PAPI_get_virt_usec(); if (event_added_flops) { PAPI_stop(eventset,&flops); } rtime=rtime_end-rtime_start; ptime=ptime_end-ptime_start; mflops=flops/rtime; if (!quiet) { printf( "\nClassic\n"); printf( "real time: %f\n", rtime); printf( "process time: %f\n", ptime); printf( "FP 
Operations: %lld\n", flops); printf( "MFLOPS %f\n", mflops); } mflops_classic=mflops; // Swapped flops rtime_start=PAPI_get_real_usec(); ptime_start=PAPI_get_virt_usec(); if (event_added_flops) { PAPI_reset(eventset); PAPI_start(eventset); } flops_float_swapped_matrix_matrix_multiply(); rtime_end=PAPI_get_real_usec(); ptime_end=PAPI_get_virt_usec(); if (event_added_flops) { PAPI_stop(eventset,&flops); } rtime=rtime_end-rtime_start; ptime=ptime_end-ptime_start; mflops=flops/rtime; if (!quiet) { printf( "\nSwapped\n"); printf( "real time: %f\n", rtime); printf( "process time: %f\n", ptime); printf( "FP Operations: %lld\n", flops); printf( "MFLOPS %f\n", mflops); } mflops_swapped=mflops; // turn off flops if (event_added_flops) { retval=PAPI_remove_named_event(eventset,"PAPI_FP_OPS"); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_remove_named_event", retval ); } } /************************/ /* IPC */ /************************/ if (!quiet) { printf( "\n----------------------------------\n" ); printf( "PAPI_ipc\n"); } /* Add PAPI_TOT_INS event */ retval=PAPI_add_named_event(eventset,"PAPI_TOT_INS"); if (retval!=PAPI_OK) { if (!quiet) fprintf(stderr,"PAPI_TOT_INS not available!\n"); event_added_ipc=0; } else { event_added_ipc=1; } if (event_added_ipc) { /* Add PAPI_TOT_CYC event */ retval=PAPI_add_named_event(eventset,"PAPI_TOT_CYC"); if (retval!=PAPI_OK) { if (!quiet) fprintf(stderr,"PAPI_TOT_CYC not available!\n"); event_added_ipc=0; } else { event_added_ipc=1; } } if (event_added_ipc) { PAPI_start(eventset); } rtime_start=PAPI_get_real_usec(); ptime_start=PAPI_get_virt_usec(); // Classic ipc flops_float_matrix_matrix_multiply(); rtime_end=PAPI_get_real_usec(); ptime_end=PAPI_get_virt_usec(); if (event_added_ipc) { PAPI_stop(eventset,ins); } rtime=rtime_end-rtime_start; ptime=ptime_end-ptime_start; ipc=(double)ins[0]/(double)ins[1]; if (!quiet) { printf( "\nClassic\n"); printf( "real time: %lf\n", rtime); printf( "process time: %lf\n", ptime); printf( 
"Instructions: %lld\n", ins[0]); printf( "Cycles: %lld\n", ins[1]); printf( "IPC %lf\n", ipc); } ipc_classic=ipc; rtime_classic=rtime; // Swapped ipc if (event_added_ipc) { PAPI_reset(eventset); PAPI_start(eventset); } rtime_start=PAPI_get_real_usec(); ptime_start=PAPI_get_virt_usec(); flops_float_swapped_matrix_matrix_multiply(); rtime_end=PAPI_get_real_usec(); ptime_end=PAPI_get_virt_usec(); if (event_added_ipc) { PAPI_stop(eventset,ins); } rtime=rtime_end-rtime_start; ptime=ptime_end-ptime_start; ipc=(double)ins[0]/(double)ins[1]; if (!quiet) { printf( "\nSwapped\n"); printf( "real time: %lf\n", rtime); printf( "process time: %lf\n", ptime); printf( "Instructions: %lld\n", ins[0]); printf( "Cycles: %lld\n", ins[1]); printf( "IPC %lf\n", ipc); } ipc_swapped=ipc; rtime_swapped=rtime; /* Validate */ if (event_added_flips) { if (mflips_swappedrtime_classic) { test_fail(__FILE__,__LINE__, "time should be better when swapped",0); } test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/validation_tests/fp_validation_hl.c000066400000000000000000000026341502707512200234560ustar00rootroot00000000000000/* This test runs a "classic" matrix multiply * and then runs it again with the inner loop swapped. * the swapped version should have better MFLIPS/MFLOPS/IPC and we test that. 
*/ #include #include #include "papi.h" #include "papi_test.h" #include "testcode.h" int main( int argc, char **argv ) { int retval; int quiet = 0; /* Set TESTS_QUIET variable */ quiet = tests_quiet( argc, argv ); // Flips classic retval = PAPI_hl_region_begin("matrix_multiply_classic"); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_hl_region_begin", retval ); } if ( !quiet ) { printf("flops_float_matrix_matrix_multiply()\n"); } flops_float_matrix_matrix_multiply(); retval = PAPI_hl_region_end("matrix_multiply_classic"); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_hl_region_end", retval ); } // Flips swapped retval = PAPI_hl_region_begin("matrix_multiply_swapped"); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_hl_region_begin", retval ); } if ( !quiet ) { printf("flops_float_swapped_matrix_matrix_multiply()\n"); } flops_float_swapped_matrix_matrix_multiply(); retval = PAPI_hl_region_end("matrix_multiply_swapped"); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_hl_region_end", retval ); } test_hl_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/validation_tests/instructions_testcode.c000066400000000000000000000101131502707512200246010ustar00rootroot00000000000000#include "testcode.h" /* Test a simple loop of 1 million instructions */ /* Most implementations should count be correct within 1% */ /* This loop in in assembly language, as compiler generated */ /* code varies too much. */ int instructions_million(void) { #if defined(__i386__) || (defined __x86_64__) asm( " xor %%ecx,%%ecx\n" " mov $499999,%%ecx\n" "55:\n" " dec %%ecx\n" " jnz 55b\n" : /* no output registers */ : /* no inputs */ : "cc", "%ecx" /* clobbered */ ); return 0; #elif defined(__PPC__) asm( " nop # to give us an even million\n" " lis 15,499997@ha # load high 16-bits of counter\n" " addi 15,15,499997@l # load low 16-bits of counter\n" "55:\n" " addic. 
15,15,-1 # decrement counter\n" " bne 0,55b # loop until zero\n" : /* no output registers */ : /* no inputs */ : "cc", "15" /* clobbered */ ); return 0; #elif defined(__ia64__) asm( " mov loc6=166666 // below is 6 instr.\n" " ;; // because of that we count 4 too few\n" "55:\n" " add loc6=-1,loc6 // decrement count\n" " ;;\n" " cmp.ne p2,p3=0,loc6\n" "(p2) br.cond.dptk 55b // if not zero, loop\n" : /* no output registers */ : /* no inputs */ : "p2", "loc6" /* clobbered */ ); return 0; #elif defined(__sparc__) asm( " sethi %%hi(333333), %%l0\n" " or %%l0,%%lo(333333),%%l0\n" "55:\n" " deccc %%l0 ! decrement count\n" " bnz 55b ! repeat until zero\n" " nop ! branch delay slot\n" : /* no output registers */ : /* no inputs */ : "cc", "l0" /* clobbered */ ); return 0; #elif defined(__arm__) asm( " ldr r2,42f @ set count\n" " b 55f\n" "42: .word 333332\n" "55:\n" " add r2,r2,#-1\n" " cmp r2,#0\n" " bne 55b @ repeat till zero\n" : /* no output registers */ : /* no inputs */ : "cc", "r2" /* clobbered */ ); return 0; #elif defined(__aarch64__) asm( " ldr x2,=333332 // set count\n" "55:\n" " add x2,x2,#-1\n" " cmp x2,#0\n" " bne 55b // repeat till zero\n" : /* no output registers */ : /* no inputs */ : "cc", "r2" /* clobbered */ ); return 0; #endif return CODE_UNIMPLEMENTED; } /* fldcw instructions are counted oddly on Pentium 4 machines */ int instructions_fldcw(void) { #if defined(__i386__) || (defined __x86_64__) int saved_cw,result,cw; double three=3.0; asm( " mov $100000,%%ecx\n" "44:\n" " fldl %1 # load value onto fp stack\n" " fnstcw %0 # store control word to mem\n" " movzwl %0, %%eax # load cw from mem, zero extending\n" " movb $12, %%ah # set cw for \"round to zero\"\n" " movw %%ax, %3 # store back to memory\n" " fldcw %3 # save new rounding mode\n" " fistpl %2 # save stack value as integer to mem\n" " fldcw %0 # restore old cw\n" " loop 44b # loop to make the count more obvious\n" : /* no output registers */ : "m"(saved_cw), "m"(three), "m"(result), "m"(cw) /* 
inputs */ : "cc", "%ecx","%eax" /* clobbered */ ); return 0; #endif return CODE_UNIMPLEMENTED; } /* rep instructions are counted a bit non-intuitively */ /* some tools like Valgrind and Pin may count differently than real hardware */ int instructions_rep(void) { #if defined(__i386__) || defined(__ILP32__) char buffer_out[16384]; asm( " mov $1000,%%edx\n" " cld\n" "66: # test 8-bit store\n" " mov $0xd, %%al # set eax to d\n" " mov $16384, %%ecx\n" " mov %0, %%edi # set destination\n" " rep stosb # store d 16384 times, auto-increment\n" " dec %%edx\n" " jnz 66b\n" : /* outputs */ : "rm" (buffer_out) /* inputs */ : "cc", "%esi","%edi","%edx","%ecx","%eax","memory" /* clobbered */ ); return 0; #elif defined (__x86_64__) char buffer_out[16384]; asm( " mov $1000,%%edx\n" " cld\n" "66: # test 8-bit store\n" " mov $0xd, %%al # set eax to d\n" " mov $16384, %%ecx\n" " mov %0, %%rdi # set destination\n" " rep stosb # store d 16384 times, auto-increment\n" " dec %%edx\n" " jnz 66b\n" : /* outputs */ : "rm" (buffer_out) /* inputs */ : "cc", "%esi","%edi","%edx","%ecx","%eax","memory" /* clobbered */ ); return 0; #endif return CODE_UNIMPLEMENTED; } papi-papi-7-2-0-t/src/validation_tests/load_store_testcode.c000066400000000000000000000016251502707512200242000ustar00rootroot00000000000000#include "testcode.h" /* Execute n stores */ int execute_stores(int n) { #if defined(__aarch64__) __asm( ".data\n" "stvar: .word 1 /* stvar in memory */\n" ".text\n" " ldr x2, =stvar /* address of stvar */\n" " mov x4, %0\n" " mov x1, #0\n" "str_loop:\n" " str x1, [x2] /* store into stvar */\n" " add x1, x1, #1\n" " cmp x1, x4\n" " bne str_loop\n" : : "r" (n) : "cc" /* clobbered */ ); return 0; #endif return CODE_UNIMPLEMENTED; } /* Execute n loads */ int execute_loads(int n) { #if defined(__aarch64__) __asm( ".data\n" "ldvar: .word 1 /* ldvar in memory */\n" ".text\n" " ldr x2, =ldvar /* address of ldvar */\n" " mov x4, %0\n" " mov x1, #0\n" "ldr_loop:\n" " ldr x3, [x2] /* load from ldvar */\n" 
" add x1, x1, x3\n" " cmp x1, x4\n" " bne ldr_loop\n" : : "r" (n) : "cc" /* clobbered */ ); return 0; #endif return CODE_UNIMPLEMENTED; } papi-papi-7-2-0-t/src/validation_tests/matrix_multiply.c000066400000000000000000000033671502707512200234230ustar00rootroot00000000000000#include #define NUM_RUNS 3 #define MATRIX_SIZE 512 static double a[MATRIX_SIZE][MATRIX_SIZE]; static double b[MATRIX_SIZE][MATRIX_SIZE]; static double c[MATRIX_SIZE][MATRIX_SIZE]; long long naive_matrix_multiply_estimated_flops(int quiet) { long long muls,divs,adds; /* setup */ muls=MATRIX_SIZE*MATRIX_SIZE; divs=MATRIX_SIZE*MATRIX_SIZE; adds=MATRIX_SIZE*MATRIX_SIZE; /* multiply */ muls+=MATRIX_SIZE*MATRIX_SIZE*MATRIX_SIZE; adds+=MATRIX_SIZE*MATRIX_SIZE*MATRIX_SIZE; /* sum */ adds+=MATRIX_SIZE*MATRIX_SIZE; if (!quiet) { printf("Estimated flops: adds: %lld muls: %lld divs: %lld\n", adds,muls,divs); } return adds+muls+divs; } long long naive_matrix_multiply_estimated_loads(int quiet) { long long loads=0; /* setup */ loads+=0; /* multiply */ loads+=MATRIX_SIZE*MATRIX_SIZE*MATRIX_SIZE*2; /* sum */ loads+=MATRIX_SIZE*MATRIX_SIZE; if (!quiet) { printf("Estimated loads: %lld\n",loads); } return loads; } long long naive_matrix_multiply_estimated_stores(int quiet) { long long stores=0; /* setup */ stores+=MATRIX_SIZE*MATRIX_SIZE*2; /* multiply */ stores+=MATRIX_SIZE*MATRIX_SIZE; /* sum */ stores+=1; if (!quiet) { printf("Estimated stores: %lld\n",stores); } return stores; } double naive_matrix_multiply(int quiet) { double s; int i,j,k; for(i=0;i int main(int argv, char **argc) { (void) argv; // prevent warning for not-used. (void) argc; // prevent warning for not-used. int retval = PAPI_library_init(PAPI_VER_CURRENT); // This does several allocations. (void) retval; // prevent warning for not-used. PAPI_shutdown(); // Shutdown should release them all. 
	return(0);
}	// end main()


/* ===== src/validation_tests/papi_br_cn.c ===== */

/* This file attempts to test the conditional branch instructions */
/* performance counter PAPI_BR_CN */
/* by Vince Weaver */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include "papi.h"
#include "papi_test.h"
#include "display_error.h"
#include "testcode.h"

int main(int argc, char **argv) {

	int num_runs=100,i;
	long long high=0,low=0,average=0,expected=1000000;
	double error;
	long long count,total=0;
	int quiet=0,retval,ins_result;
	int eventset=PAPI_NULL;

	quiet=tests_quiet(argc,argv);

	if (!quiet) printf("\nTesting the PAPI_BR_CN event.\n");

	/* Init the PAPI library */
	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	retval=PAPI_create_eventset(&eventset);
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	retval=PAPI_add_named_event(eventset,"PAPI_BR_CN");
	if (retval!=PAPI_OK) {
		test_skip( __FILE__, __LINE__, "adding PAPI_BR_CN", retval );
	}

	if (!quiet) {
		printf("Testing a loop with %lld conditional branches (%d times):\n",
			expected,num_runs);
	}

	for(i=0;i<num_runs;i++) {
		PAPI_reset(eventset);
		PAPI_start(eventset);
		ins_result=branches_testcode();
		retval=PAPI_stop(eventset,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	error=display_error(average,high,low,expected,quiet);

	if ((error > 1.0) || (error<-1.0)) {
		if (!quiet) printf("Instruction count off by more than 1%%\n");
		test_fail( __FILE__, __LINE__, "Error too high", 1 );
	}

	if (!quiet) printf("\n");

	test_pass( __FILE__ );

	PAPI_shutdown();

	return 0;
}


/* ===== src/validation_tests/papi_br_ins.c ===== */

/* This file attempts to test the retired branches instruction */
/* performance counter PAPI_BR_INS */
/* by Vince Weaver */

/* This test seems to work on: */
/*	+ x86 */
/*	+ ARMv6 (Raspberry Pi) */
/* It is known to not work on: */
/*	+ ARMv7 (Pi2 CortexA7, Panda CortexA9) */
/*	  failure is odd, 1/3 missing but not consistent; */
/*	  something weird with event 0xC PC_WRITE ? */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include "papi.h"
#include "papi_test.h"
#include "display_error.h"
#include "testcode.h"

int main(int argc, char **argv) {

	int num_runs=100,i;
	long long high=0,low=0,average=0,expected=1500000;
	double error;
	long long count,total=0;
	int quiet=0,retval,ins_result;
	int eventset=PAPI_NULL;

	quiet=tests_quiet(argc,argv);

	if (!quiet) printf("\nTesting the PAPI_BR_INS event.\n");

	/* Init the PAPI library */
	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	retval=PAPI_create_eventset(&eventset);
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	retval=PAPI_add_named_event(eventset,"PAPI_BR_INS");
	if (retval!=PAPI_OK) {
		test_skip( __FILE__, __LINE__, "adding PAPI_BR_INS", retval );
	}

	if (!quiet) {
		printf("Testing a loop with %lld branches (%d times):\n",
			expected,num_runs);
	}

	for(i=0;i<num_runs;i++) {
		PAPI_reset(eventset);
		PAPI_start(eventset);
		ins_result=branches_testcode();
		retval=PAPI_stop(eventset,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	error=display_error(average,high,low,expected,quiet);

	if ((error > 1.0) || (error<-1.0)) {
		if (!quiet) printf("Instruction count off by more than 1%%\n");
		test_fail( __FILE__, __LINE__, "Error too high", 1 );
	}

	if (!quiet) printf("\n");

	test_pass( __FILE__ );

	PAPI_shutdown();

	return 0;
}


/* ===== src/validation_tests/papi_br_msp.c ===== */

/* This file attempts to test the mispredicted branches */
/* performance event as counted by PAPI_BR_MSP */
/* by Vince Weaver */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include "papi.h"
#include "papi_test.h"
#include "display_error.h"
#include "testcode.h"

int main(int argc, char **argv) {

	int num_runs=100,i;
	int num_random_branches=500000;
	long long high=0,low=0,average=0,expected=1500000;
	long long count,total=0;
	int quiet=0,retval,ins_result;
	int total_eventset=PAPI_NULL,miss_eventset=PAPI_NULL;

	quiet=tests_quiet(argc,argv);

	if (!quiet) printf("\nTesting the PAPI_BR_MSP event.\n");

	/* Init the PAPI library */
	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	/* Create total eventset */
	retval=PAPI_create_eventset(&total_eventset);
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	retval=PAPI_add_named_event(total_eventset,"PAPI_BR_INS");
	if (retval!=PAPI_OK) {
		if (!quiet) printf("Could not add PAPI_BR_INS\n");
		test_skip( __FILE__, __LINE__, "adding PAPI_BR_INS", retval );
	}

	/* Create miss eventset */
	retval=PAPI_create_eventset(&miss_eventset);
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	retval=PAPI_add_named_event(miss_eventset,"PAPI_BR_MSP");
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "adding PAPI_BR_MSP", retval );
	}

	if (!quiet) {
		printf("\nPart 1: Testing that easy to predict loop has few misses\n");
		printf("Testing a loop with %lld branches (%d times):\n",
			expected,num_runs);
		printf("\tOn a simple loop like this, "
			"miss rate should be very small.\n");
	}

	for(i=0;i<num_runs;i++) {
		PAPI_reset(miss_eventset);
		PAPI_start(miss_eventset);
		ins_result=branches_testcode();
		retval=PAPI_stop(miss_eventset,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	if (average>1000) {
		if (!quiet) printf("Branch miss rate too high\n");
		test_fail( __FILE__, __LINE__, "Error too high", 1 );
	}

	/*******************/

	if (!quiet) printf("\nPart 2\n");

	high=0; low=0; total=0;

	for(i=0;i<num_runs;i++) {
		PAPI_reset(total_eventset);
		PAPI_start(total_eventset);
		ins_result=random_branches_testcode(num_random_branches,1);
		retval=PAPI_stop(total_eventset,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	high=0; low=0; total=0;

	for(i=0;i<num_runs;i++) {
		PAPI_reset(miss_eventset);
		PAPI_start(miss_eventset);
		ins_result=random_branches_testcode(num_random_branches,1);
		retval=PAPI_stop(miss_eventset,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	if (average > (num_random_branches/4)*3) {
		if (!quiet) printf("Mispredicts too high\n");
		test_fail( __FILE__, __LINE__, "Error too high", 1 );
	}

	if (!quiet) printf("\n");

	test_pass( __FILE__ );

	PAPI_shutdown();

	return 0;
}


/* ===== src/validation_tests/papi_br_ntk.c ===== */

/* This file attempts to test the retired branches not-taken */
/* performance counter PAPI_BR_NTK */
/* by Vince Weaver */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include "papi.h"
#include "papi_test.h"
#include "display_error.h"
#include "testcode.h"

int main(int argc, char **argv) {

	int num_runs=100,i;
	long long high=0,low=0,average=0,expected=500000;
	double error;
	long long count,total=0;
	int quiet=0,retval,ins_result;
	int eventset=PAPI_NULL;

	quiet=tests_quiet(argc,argv);

	if (!quiet) printf("\nTesting the PAPI_BR_NTK event.\n");

	/* Init the PAPI library */
	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	retval=PAPI_create_eventset(&eventset);
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	retval=PAPI_add_named_event(eventset,"PAPI_BR_NTK");
	if (retval!=PAPI_OK) {
		test_skip( __FILE__, __LINE__, "adding PAPI_BR_NTK", retval );
	}

	if (!quiet) {
		printf("Testing a loop with %lld branches (%d times):\n",
			expected,num_runs);
	}

	for(i=0;i<num_runs;i++) {
		PAPI_reset(eventset);
		PAPI_start(eventset);
		ins_result=branches_testcode();
		retval=PAPI_stop(eventset,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	error=display_error(average,high,low,expected,quiet);

	if ((error > 1.0) || (error<-1.0)) {
		if (!quiet) printf("Instruction count off by more than 1%%\n");
		test_fail( __FILE__, __LINE__, "Error too high", 1 );
	}

	if (!quiet) printf("\n");

	test_pass( __FILE__ );

	PAPI_shutdown();

	return 0;
}


/* ===== src/validation_tests/papi_br_prc.c ===== */

/* This file attempts to test the predicted correctly branches */
/* performance event as counted by PAPI_BR_PRC */
/* Ideally this event should measure */
/* predicted correctly *conditional* branches */
/* If that's not available, then use total branches. */
/* by Vince Weaver */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include "papi.h"
#include "papi_test.h"
#include "display_error.h"
#include "testcode.h"

int main(int argc, char **argv) {

	int num_runs=100,i;
	int num_random_branches=500000;
	long long high=0,low=0,average=0,expected=1500000;
	long long expected_high,expected_low;
	long long count,total=0;
	int quiet=0,retval,ins_result;
	int total_eventset=PAPI_NULL,miss_eventset=PAPI_NULL;

	quiet=tests_quiet(argc,argv);

	if (!quiet) {
		printf("\nTesting the PAPI_BR_PRC event.\n\n");
		printf("This should measure predicted correctly conditional branches\n");
		printf("If such a counter is not available, it may report predicted correctly\n");
		printf("total branches instead.\n");
	}

	/* Init the PAPI library */
	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	/* Create total eventset */
	retval=PAPI_create_eventset(&total_eventset);
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	retval=PAPI_add_named_event(total_eventset,"PAPI_BR_CN");
	if (retval!=PAPI_OK) {
		if (!quiet) printf("Could not add PAPI_BR_CN\n");
		//test_skip( __FILE__, __LINE__, "adding PAPI_BR_CN", retval );
		retval=PAPI_add_named_event(total_eventset,"PAPI_BR_INS");
		if (retval!=PAPI_OK) {
			if (!quiet) printf("Could not add PAPI_BR_INS\n");
			test_skip( __FILE__, __LINE__, "adding PAPI_BR_INS", retval );
		}
	}

	/* Create correct eventset */
	retval=PAPI_create_eventset(&miss_eventset);
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	retval=PAPI_add_named_event(miss_eventset,"PAPI_BR_PRC");
	if (retval!=PAPI_OK) {
		if (!quiet) printf("Could not add PAPI_BR_PRC\n");
		test_skip( __FILE__, __LINE__, "adding PAPI_BR_PRC", retval );
	}

	if (!quiet) {
		printf("\nPart 1: Testing that easy to predict loop has few misses\n");
		printf("Testing a loop with %lld branches (%d times):\n",
			expected,num_runs);
		printf("\tOn a simple loop like this, "
			"hit rate should be very high.\n");
	}

	for(i=0;i<num_runs;i++) {
		PAPI_reset(miss_eventset);
		PAPI_start(miss_eventset);
		ins_result=branches_testcode();
		retval=PAPI_stop(miss_eventset,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	high=0; low=0; total=0;

	for(i=0;i<num_runs;i++) {
		PAPI_reset(total_eventset);
		PAPI_start(total_eventset);
		ins_result=random_branches_testcode(num_random_branches,1);
		retval=PAPI_stop(total_eventset,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	expected_low=average/2;
	expected_high=average;

	high=0; low=0; total=0;

	for(i=0;i<num_runs;i++) {
		PAPI_reset(miss_eventset);
		PAPI_start(miss_eventset);
		ins_result=random_branches_testcode(num_random_branches,1);
		retval=PAPI_stop(miss_eventset,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	if (average < expected_low) {
		if (!quiet) printf("Branch hits too low\n");
		test_fail( __FILE__, __LINE__, "Error too high", 1 );
	}

	if (average > expected_high) {
		if (!quiet) printf("Branch hits too high\n");
		test_fail( __FILE__, __LINE__, "Error too high", 1 );
	}

	if (!quiet) printf("\n");

	test_pass( __FILE__ );

	PAPI_shutdown();

	return 0;
}


/* ===== src/validation_tests/papi_br_tkn.c ===== */

/* This file attempts to test the retired branches taken */
/* performance counter PAPI_BR_TKN */
/* This measures taken *conditional* branches */
/* Though this may fall back to total if not available. */
/* by Vince Weaver */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include "papi.h"
#include "papi_test.h"
#include "display_error.h"
#include "testcode.h"

int main(int argc, char **argv) {

	int num_runs=100,i;
	long long high=0,low=0,average=0;
	long long expected_cond=500000,expected_total=1000000;
	double error;
	long long count,total=0;
	int quiet=0,retval,ins_result;
	int eventset_total=PAPI_NULL;
	int eventset_conditional=PAPI_NULL;
	int eventset_taken=PAPI_NULL;
	int eventset_nottaken=PAPI_NULL;
	long long count_total,count_conditional,count_taken,count_nottaken;
	int cond_avail=1,nottaken_avail=1;
	int not_expected=0;

	quiet=tests_quiet(argc,argv);

	if (!quiet) {
		printf("\nTesting the PAPI_BR_TKN event.\n");
		printf("\tIt measures total number of conditional branches taken\n");
	}

	/* Init the PAPI library */
	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	/* Create Total Eventset */
	retval=PAPI_create_eventset(&eventset_total);
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	retval=PAPI_add_named_event(eventset_total,"PAPI_BR_INS");
	if (retval!=PAPI_OK) {
		test_skip( __FILE__, __LINE__, "adding PAPI_BR_INS", retval );
	}

	/* Create Conditional Eventset */
	retval=PAPI_create_eventset(&eventset_conditional);
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	retval=PAPI_add_named_event(eventset_conditional,"PAPI_BR_CN");
	if (retval!=PAPI_OK) {
		if (!quiet) printf("Could not add PAPI_BR_CN\n");
		cond_avail=0;
		//test_skip( __FILE__, __LINE__, "adding PAPI_BR_CN", retval );
	}

	/* Create Taken Eventset */
	retval=PAPI_create_eventset(&eventset_taken);
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	retval=PAPI_add_named_event(eventset_taken,"PAPI_BR_TKN");
	if (retval!=PAPI_OK) {
		if (!quiet) printf("Could not add PAPI_BR_TKN\n");
		test_skip( __FILE__, __LINE__, "adding PAPI_BR_TKN", retval );
	}

	/* Create Not-Taken Eventset */
	retval=PAPI_create_eventset(&eventset_nottaken);
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	retval=PAPI_add_named_event(eventset_nottaken,"PAPI_BR_NTK");
	if (retval!=PAPI_OK) {
		if (!quiet) printf("Could not add PAPI_BR_NTK\n");
		nottaken_avail=0;
		//test_skip( __FILE__, __LINE__, "adding PAPI_BR_NTK", retval );
	}

	/* Get total count */
	PAPI_reset(eventset_total);
	PAPI_start(eventset_total);
	ins_result=branches_testcode();
	retval=PAPI_stop(eventset_total,&count_total);

	/* Get conditional count */
	if (cond_avail) {
		PAPI_reset(eventset_conditional);
		PAPI_start(eventset_conditional);
		ins_result=branches_testcode();
		retval=PAPI_stop(eventset_conditional,&count_conditional);
	}

	/* Get taken count */
	PAPI_reset(eventset_taken);
	PAPI_start(eventset_taken);
	ins_result=branches_testcode();
	retval=PAPI_stop(eventset_taken,&count_taken);

	/* Get not-taken count */
	if (nottaken_avail) {
		PAPI_reset(eventset_nottaken);
		PAPI_start(eventset_nottaken);
		ins_result=branches_testcode();
		retval=PAPI_stop(eventset_nottaken,&count_nottaken);
	}

	if (!quiet) {
		printf("The test code has:\n");
		printf("\t%lld total branches\n",count_total);
		if (cond_avail) {
			printf("\t%lld conditional branches\n",count_conditional);
		}
		printf("\t%lld taken branches\n",count_taken);
		if (nottaken_avail) {
			printf("\t%lld not-taken branches\n",count_nottaken);
		}
	}

	if (!quiet) {
		printf("Testing a loop with %lld conditional taken branches (%d times):\n",
			expected_cond,num_runs);
	}

	for(i=0;i<num_runs;i++) {
		PAPI_reset(eventset_taken);
		PAPI_start(eventset_taken);
		ins_result=branches_testcode();
		retval=PAPI_stop(eventset_taken,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	error=display_error(average,high,low,expected_cond,quiet);

	if ((error > 1.0) || (error<-1.0)) {
		if (!quiet) printf("Instruction count off by more than 1%%\n");
		not_expected=1;
		//test_fail( __FILE__, __LINE__, "Error too high", 1 );
	}

	if (!quiet) printf("\n");

	/* Check if using TOTAL instead of CONDITIONAL */
	if (not_expected) {
		error=display_error(average,high,low,expected_total,quiet);

		if ((error > 1.0) || (error<-1.0)) {
			if (!quiet) printf("Instruction count off by more than 1%%\n");
			test_fail( __FILE__, __LINE__, "Error too high", 1 );
		}
		else {
			test_warn(__FILE__,__LINE__,
				"Using TOTAL BRANCHES as base rather than CONDITIONAL BRANCHES\n",0);
		}
	}

	test_pass( __FILE__ );

	PAPI_shutdown();

	return 0;
}


/* ===== src/validation_tests/papi_br_ucn.c ===== */

/* This file attempts to test the unconditional branch instruction */
/* performance counter PAPI_BR_UCN */
/* by Vince Weaver */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include "papi.h"
#include "papi_test.h"
#include "display_error.h"
#include "testcode.h"

int main(int argc, char **argv) {

	int num_runs=100,i;
	long long high=0,low=0,average=0,expected=500000;
	double error;
	long long count,total=0;
	int quiet=0,retval,ins_result;
	int eventset=PAPI_NULL;

	quiet=tests_quiet(argc,argv);

	if (!quiet) printf("\nTesting the PAPI_BR_UCN event.\n");

	/* Init the PAPI library */
	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	retval=PAPI_create_eventset(&eventset);
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	retval=PAPI_add_named_event(eventset,"PAPI_BR_UCN");
	if (retval!=PAPI_OK) {
		test_skip( __FILE__, __LINE__, "adding PAPI_BR_UCN", retval );
	}

	if (!quiet) {
		printf("Testing a loop with %lld unconditional branches (%d times):\n",
			expected,num_runs);
	}

	for(i=0;i<num_runs;i++) {
		PAPI_reset(eventset);
		PAPI_start(eventset);
		ins_result=branches_testcode();
		retval=PAPI_stop(eventset,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	error=display_error(average,high,low,expected,quiet);

	if ((error > 1.0) || (error<-1.0)) {
		if (!quiet) printf("Instruction count off by more than 1%%\n");
		test_fail( __FILE__, __LINE__, "Error too high", 1 );
	}

	if (!quiet) printf("\n");

	test_pass( __FILE__ );

	PAPI_shutdown();

	return 0;
}


/* ===== src/validation_tests/papi_dp_ops.c ===== */

/* This file attempts to test the double-precision floating point */
/* performance counter PAPI_DP_OPS */
/* by Vince Weaver */

/* Note! There are many many many things that can go wrong */
/* when trying to get a sane floating point measurement. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include "papi.h"
#include "papi_test.h"
#include "display_error.h"
#include "testcode.h"

int main(int argc, char **argv) {

	int num_runs=100,i;
	long long high=0,low=0,average=0,expected=1500000;
	double error,double_result;
	long long count,total=0;
	int quiet=0,retval,ins_result;
	int eventset=PAPI_NULL;

	quiet=tests_quiet(argc,argv);

	if (!quiet) printf("\nTesting the PAPI_DP_OPS event.\n\n");

	/* Init the PAPI library */
	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	/* Create the eventset */
	retval=PAPI_create_eventset(&eventset);
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	/* Add FP_OPS event */
	retval=PAPI_add_named_event(eventset,"PAPI_DP_OPS");
	if (retval!=PAPI_OK) {
		if (!quiet) fprintf(stderr,"PAPI_DP_OPS not available!\n");
		test_skip( __FILE__, __LINE__, "adding PAPI_DP_OPS", retval );
	}

	/**************************************/
	/* Test a loop with no floating point */
	/**************************************/

	expected=0;

	if (!quiet) {
		printf("Testing a loop with %lld floating point (%d times):\n",
			expected,num_runs);
	}

	for(i=0;i<num_runs;i++) {
		PAPI_reset(eventset);
		PAPI_start(eventset);
		ins_result=branches_testcode();
		retval=PAPI_stop(eventset,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	if (average>10) {
		if (!quiet) printf("Unexpected FP event value\n");
		test_fail( __FILE__, __LINE__, "Unexpected FP event", 1 );
	}

	if (!quiet) printf("\n");

	/*******************************************/
	/* Test a single precision matrix multiply */
	/*******************************************/

	total=0; high=0; low=0;

	expected=flops_float_init_matrix();
	expected=expected*0;
	num_runs=3;

	if (!quiet) {
		printf("Testing a matrix multiply with %lld single-precision FP operations (%d times)\n",
			expected,num_runs);
	}

	for(i=0;i<num_runs;i++) {
		PAPI_reset(eventset);
		PAPI_start(eventset);
		double_result=flops_float_matrix_matrix_multiply();
		retval=PAPI_stop(eventset,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	error=display_error(average,high,low,expected,quiet);

	if ((error > 1.0) || (error<-1.0)) {
		if (!quiet) printf("Instruction count off by more than 1%%\n");
		test_fail( __FILE__, __LINE__, "Error too high", 1 );
	}

	if (!quiet) printf("\n");

	/*******************************************/
	/* Test a double precision matrix multiply */
	/*******************************************/

	total=0; high=0; low=0;

	expected=flops_double_init_matrix();
	num_runs=3;

	if (!quiet) {
		printf("Testing a matrix multiply with %lld double-precision FP operations (%d times)\n",
			expected,num_runs);
	}

	for(i=0;i<num_runs;i++) {
		PAPI_reset(eventset);
		PAPI_start(eventset);
		double_result=flops_double_matrix_matrix_multiply();
		retval=PAPI_stop(eventset,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	error=display_error(average,high,low,expected,quiet);

	if ((error > 1.0) || (error<-1.0)) {
		if (!quiet) printf("Instruction count off by more than 1%%\n");
		test_fail( __FILE__, __LINE__, "Error too high", 1 );
	}

	if (!quiet) printf("\n");

	test_pass( __FILE__ );

	PAPI_shutdown();

	return 0;
}


/* ===== src/validation_tests/papi_fp_ops.c ===== */

/* This file attempts to test the floating point */
/* performance counter PAPI_FP_OPS */
/* by Vince Weaver */

/* Note! There are many many many things that can go wrong */
/* when trying to get a sane floating point measurement. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include "papi.h"
#include "papi_test.h"
#include "display_error.h"
#include "testcode.h"

int main(int argc, char **argv) {

	int num_runs=100,i;
	long long high=0,low=0,average=0,expected=1500000;
	double error,double_result;
	long long count,total=0;
	int quiet=0,retval,ins_result;
	int eventset=PAPI_NULL;

	quiet=tests_quiet(argc,argv);

	if (!quiet) printf("\nTesting the PAPI_FP_OPS event.\n\n");

	/* Init the PAPI library */
	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	/* Create the eventset */
	retval=PAPI_create_eventset(&eventset);
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	/* Add FP_OPS event */
	retval=PAPI_add_named_event(eventset,"PAPI_FP_OPS");
	if (retval!=PAPI_OK) {
		if (!quiet) fprintf(stderr,"PAPI_FP_OPS not available!\n");
		test_skip( __FILE__, __LINE__, "adding PAPI_FP_OPS", retval );
	}

	/**************************************/
	/* Test a loop with no floating point */
	/**************************************/

	total=0; high=0; low=0;
	expected=0;

	if (!quiet) {
		printf("Testing a loop with %lld floating point (%d times):\n",
			expected,num_runs);
	}

	for(i=0;i<num_runs;i++) {
		PAPI_reset(eventset);
		PAPI_start(eventset);
		ins_result=branches_testcode();
		retval=PAPI_stop(eventset,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	if (average>10) {
		if (!quiet) printf("Unexpected FP event value\n");
		test_fail( __FILE__, __LINE__, "Unexpected FP event", 1 );
	}

	if (!quiet) printf("\n");

	/*******************************************/
	/* Test a single-precision matrix multiply */
	/*******************************************/

	total=0; high=0; low=0;

	expected=flops_float_init_matrix();
	num_runs=3;

	if (!quiet) {
		printf("Testing a matrix multiply with %lld single-precision FP operations (%d times)\n",
			expected,num_runs);
	}

	for(i=0;i<num_runs;i++) {
		PAPI_reset(eventset);
		PAPI_start(eventset);
		double_result=flops_float_matrix_matrix_multiply();
		retval=PAPI_stop(eventset,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	error=display_error(average,high,low,expected,quiet);

	if ((error > 1.0) || (error<-1.0)) {
		if (!quiet) printf("Instruction count off by more than 1%%\n");
		test_fail( __FILE__, __LINE__, "Error too high", 1 );
	}

	if (!quiet) printf("\n");

	/*******************************************/
	/* Test a double-precision matrix multiply */
	/*******************************************/

	total=0; high=0; low=0;

	expected=flops_double_init_matrix();
	num_runs=3;

	if (!quiet) {
		printf("Testing a matrix multiply with %lld double-precision FP operations (%d times)\n",
			expected,num_runs);
	}

	for(i=0;i<num_runs;i++) {
		PAPI_reset(eventset);
		PAPI_start(eventset);
		double_result=flops_double_matrix_matrix_multiply();
		retval=PAPI_stop(eventset,&count);
		total+=count;
		if (count>high) high=count;
		if ((low==0) || (count<low)) low=count;
	}
	average=total/num_runs;

	error=display_error(average,high,low,expected,quiet);

	if ((error > 1.0) || (error<-1.0)) {
		if (!quiet) printf("Instruction count off by more than 1%%\n");
		test_fail( __FILE__, __LINE__, "Error too high", 1 );
	}

	if (!quiet) printf("\n");

	test_pass( __FILE__ );

	PAPI_shutdown();

	return 0;
}


/* ===== src/validation_tests/papi_hw_int.c ===== */

/* This file attempts to test the PAPI_HW_INT */
/* performance counter (Total hardware interrupts). */

/* This assumes that interrupts will be happening in the background, */
/* including a regular timer tick of HZ. This is not always true, */
/* but should be roughly true on your typical Linux system. */

/* by Vince Weaver */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <time.h>

#include "papi.h"
#include "papi_test.h"
#include "display_error.h"

int main(int argc, char **argv) {

	int quiet;
	long long count;
	int retval;
	int eventset=PAPI_NULL;
	struct timespec before,after;
	unsigned long long seconds;
	unsigned long long ns;

	quiet=tests_quiet(argc,argv);

	/* Init the PAPI library */
	retval = PAPI_library_init( PAPI_VER_CURRENT );
	if ( retval != PAPI_VER_CURRENT ) {
		test_fail( __FILE__, __LINE__, "PAPI_library_init", retval );
	}

	if (!quiet) printf("\nTesting PAPI_HW_INT\n");

	retval=PAPI_create_eventset(&eventset);
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval );
	}

	retval=PAPI_add_named_event(eventset,"PAPI_HW_INT");
	if (retval!=PAPI_OK) {
		if (!quiet) printf("Could not add PAPI_HW_INT\n");
		test_skip( __FILE__, __LINE__, "adding PAPI_HW_INT", retval );
	}

	/********************************/
	/* testing 3 seconds of runtime */
	/********************************/

	if (!quiet) printf("\nRunning for 3 seconds\n");

	clock_gettime(CLOCK_REALTIME,&before);

	PAPI_reset(eventset);
	PAPI_start(eventset);

	while(1) {
		clock_gettime(CLOCK_REALTIME,&after);
		seconds=after.tv_sec - before.tv_sec;
		ns = after.tv_nsec - before.tv_nsec;
		ns = (seconds*1000000000ULL)+ns;
		/* be done if 3 billion nanoseconds has passed */
		if (ns>3000000000ULL) break;
	}

	retval=PAPI_stop(eventset,&count);
	if (retval!=PAPI_OK) {
		test_fail( __FILE__, __LINE__, "Problem stopping!", retval );
	}

	if (!quiet) {
		printf("\tMeasured interrupts = %lld\n",count);
		/* FIXME: find actual HZ on system */
		/* Or even, read /proc/interrupts */
		printf("\tAssuming HZ=250, expect roughly 750\n");
	}

	if (!quiet) printf("\n");

	if (count<10) {
		if (!quiet) printf("Too few interrupts!\n");
		test_fail( __FILE__, __LINE__, "Too few interrupts!", 1 );
	}

	test_pass( __FILE__ );

	return 0;
}
papi-papi-7-2-0-t/src/validation_tests/papi_l1_dca.c000066400000000000000000000070421502707512200223060ustar00rootroot00000000000000/* This code attempts to test the L1 Data Cache Accesses */ /* performance counter PAPI_L1_DCA */ /* by Vince Weaver, */ /* Note on AMD fam15h we get 3x expected on writes? */ #include #include #include #include "papi.h" #include "papi_test.h" #include "testcode.h" #include "display_error.h" #define NUM_RUNS 100 #define ARRAYSIZE 65536 static double array[ARRAYSIZE]; int main(int argc, char **argv) { int i; int quiet; int eventset=PAPI_NULL; int errors=0; int retval; int num_runs=NUM_RUNS; long long high,low,average,expected=ARRAYSIZE; long long count,total; double aSumm = 0.0; double error; quiet=tests_quiet(argc,argv); if (!quiet) { printf("Testing the PAPI_L1_DCA event\n"); } /* Init the PAPI library */ retval = PAPI_library_init(PAPI_VER_CURRENT); if (retval != PAPI_VER_CURRENT) { test_fail(__FILE__,__LINE__,"PAPI_library_init",retval); } retval=PAPI_create_eventset(&eventset); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } retval=PAPI_add_named_event(eventset,"PAPI_L1_DCA"); if (retval!=PAPI_OK) { test_skip( __FILE__, __LINE__, "adding PAPI_L1_DCA", retval ); } /*******************************************************************/ /* Test if the C compiler uses a sane number of data cache acceess */ /* This tests writes to memory. */ /*******************************************************************/ if (!quiet) { printf("Write Test: Initializing an array of %d doubles:\n", ARRAYSIZE); } high=0; low=0; total=0; for(i=0;ihigh) high=count; if ((low==0) || (count 1.0) || (error<-1.0)) { if (!quiet) printf("Instruction count off by more than 1%%\n"); errors++; } if (!quiet) printf("\n"); /*******************************************************************/ /* Test if the C compiler uses a sane number of data cache acceess */ /* This tests writes to memory. 
*/ /*******************************************************************/ if (!quiet) { printf("Read Test: Summing an array of %d doubles:\n", ARRAYSIZE); } high=0; low=0; total=0; for(i=0;ihigh) high=count; if ((low==0) || (count 1.0) || (error<-1.0)) { if (!quiet) printf("Instruction count off by more than 1%%\n"); errors++; } if (!quiet) { printf("\n"); } if (errors) { test_fail( __FILE__, __LINE__, "Error too high", 1 ); } test_pass(__FILE__); return 0; } papi-papi-7-2-0-t/src/validation_tests/papi_l1_dcm.c000066400000000000000000000107361502707512200223260ustar00rootroot00000000000000/* This code attempts to test the L1 Data Cache Missses */ /* performance counter PAPI_L1_DCM */ /* by Vince Weaver, vincent.weaver@maine.edu */ /* Due to prefetching it is hard to create a testcase short of */ /* just having random accesses. */ /* In addition, due to context switching the cache might be */ /* affected by other processes on a busy system. */ /* Other tests to attempt */ /* Repeatedly reading same cache line should give very small error */ #include #include #include #include "papi.h" #include "papi_test.h" #include "cache_helper.h" #include "display_error.h" #include "testcode.h" /* Is 5% too big? 
*/ #define ALLOWED_ERROR 5.0 #define NUM_RUNS 100 #define ITERATIONS 1000000 int main(int argc, char **argv) { int i; int eventset=PAPI_NULL; int num_runs=NUM_RUNS; long long high,low,average,expected; long long count,total; int retval; int l1_size,l2_size,l1_linesize,l2_linesize,l2_entries; int arraysize; int quiet,errors=0; double error; double *array; double aSumm = 0.0; quiet=tests_quiet(argc,argv); if (!quiet) { printf("Testing the PAPI_L1_DCM event\n"); } /* Init the PAPI library */ retval = PAPI_library_init(PAPI_VER_CURRENT); if (retval != PAPI_VER_CURRENT) { test_fail(__FILE__,__LINE__,"PAPI_library_init",retval); } retval=PAPI_create_eventset(&eventset); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } retval=PAPI_add_named_event(eventset,"PAPI_L1_DCM"); if (retval!=PAPI_OK) { test_skip( __FILE__, __LINE__, "adding PAPI_L1_DCM", retval ); } l1_size=get_cachesize(L1D_CACHE); l1_linesize=get_linesize(L1D_CACHE); l2_size=get_cachesize(L2_CACHE); l2_linesize=get_linesize(L2_CACHE); l2_entries=get_entries(L2_CACHE); if (!quiet) { printf("\tDetected %dk L1 DCache, %dB linesize\n", l1_size/1024,l1_linesize); printf("\tDetected %dk L2 DCache, %dB linesize, %d entries\n", l2_size/1024,l2_linesize,l2_entries); } arraysize=(l1_size/sizeof(double))*8; if (arraysize==0) { if (!quiet) printf("Could not detect cache size\n"); test_skip(__FILE__,__LINE__,"No cache info",0); } if (!quiet) { printf("\tAllocating %zu bytes of memory (%d doubles)\n", arraysize*sizeof(double),arraysize); } array=calloc(arraysize,sizeof(double)); if (array==NULL) { test_fail(__FILE__,__LINE__,"Can't allocate memory",0); } /******************/ /* Testing Writes */ /******************/ if (!quiet) { printf("\nWrite Test: Writing an array of %d doubles %d random times:\n", arraysize,ITERATIONS); printf("It's expected that 7/8 of the accesses should be hits\n"); } high=0; low=0; total=0; for(i=0;ihigh) high=count; if ((low==0) || (count ALLOWED_ERROR) || 
(error<-ALLOWED_ERROR)) { if (!quiet) { printf("Instruction count off by more than " "%.2lf%%\n",ALLOWED_ERROR); } errors++; } if (!quiet) printf("\n"); /******************/ /* Testing Reads */ /******************/ if (!quiet) { printf("\nRead Test: Summing %d random doubles from array " "of size %d:\n",ITERATIONS,arraysize); printf("It's expected that 7/8 of the accesses should be hits\n"); } high=0; low=0; total=0; for(i=0;ihigh) high=count; if ((low==0) || (count ALLOWED_ERROR) || (error<-ALLOWED_ERROR)) { if (!quiet) { printf("Instruction count off by more than " "%.2lf%%\n",ALLOWED_ERROR); } errors++; } if (!quiet) { printf("\n"); } if (errors) { test_fail( __FILE__, __LINE__, "Error too high", 1 ); } test_pass(__FILE__); return 0; } papi-papi-7-2-0-t/src/validation_tests/papi_l2_dca.c000066400000000000000000000107641502707512200223140ustar00rootroot00000000000000/* This code attempts to test the L2 Data Cache Acceesses */ /* performance counter PAPI_L2_DCA */ /* Notes: */ /* Should this be equivelent to PAPI_L1_DCM? */ /* (on IVY it is) */ /* On Haswell/Broadwell/Skylake this maps to : */ /* L2_RQSTS:ALL_DEMAND_REFERENCES */ /* Should this include *all* L2 accesses or just those */ /* caused by the user? Prefetch? MESI? 
*/ /* by Vince Weaver, vincent.weaver@maine.edu */ #include #include #include #include "papi.h" #include "papi_test.h" #include "cache_helper.h" #include "display_error.h" #include "testcode.h" #define NUM_RUNS 100 int main(int argc, char **argv) { int i; int eventset=PAPI_NULL; int num_runs=NUM_RUNS; long long high,low,average,expected; long long count,total; int retval; int l1_size,l2_size,l1_linesize,l2_linesize,l2_entries; int arraysize; int quiet,errors=0; double error; double *array; double aSumm = 0.0; quiet=tests_quiet(argc,argv); if (!quiet) { printf("Testing the PAPI_L2_DCA event\n"); } /* Init the PAPI library */ retval = PAPI_library_init(PAPI_VER_CURRENT); if (retval != PAPI_VER_CURRENT) { test_fail(__FILE__,__LINE__,"PAPI_library_init",retval); } retval=PAPI_create_eventset(&eventset); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } retval=PAPI_add_named_event(eventset,"PAPI_L2_DCA"); if (retval!=PAPI_OK) { test_skip( __FILE__, __LINE__, "adding PAPI_L2_DCA", retval ); } l1_size=get_cachesize(L1D_CACHE); l1_linesize=get_linesize(L1D_CACHE); l2_size=get_cachesize(L2_CACHE); l2_linesize=get_linesize(L2_CACHE); l2_entries=get_entries(L2_CACHE); if ((l2_size==0) || (l2_linesize==0)) { if (!quiet) { printf("Unable to determine size of L2 cache!\n"); } test_skip( __FILE__, __LINE__, "adding PAPI_L2_DCA", retval ); } if (!quiet) { printf("\tDetected %dk L1 DCache, %dB linesize\n", l1_size/1024,l1_linesize); printf("\tDetected %dk L2 DCache, %dB linesize, %d entries\n", l2_size/1024,l2_linesize,l2_entries); } arraysize=l2_size/sizeof(double); if (!quiet) { printf("\tAllocating %zu bytes of memory (%d doubles)\n", arraysize*sizeof(double),arraysize); } array=calloc(arraysize,sizeof(double)); if (array==NULL) { test_fail(__FILE__,__LINE__,"Can't allocate memory",0); } /******************/ /* Testing Writes */ /******************/ if (!quiet) { printf("\nWrite Test: Initializing an array of %d doubles:\n", arraysize); } 
high=0; low=0; total=0; for(i=0;ihigh) high=count; if ((low==0) || (count 1.0) || (error<-1.0)) { if (!quiet) printf("Instruction count off by more than 1%%\n"); errors++; } if (!quiet) printf("\n"); /******************/ /* Testing Reads */ /******************/ if (!quiet) { printf("\nRead Test: Summing an array of %d doubles:\n", arraysize); } high=0; low=0; total=0; for(i=0;ihigh) high=count; if ((low==0) || (count 1.0) || (error<-1.0)) { if (!quiet) printf("Instruction count off by more than 1%%\n"); errors++; } if (!quiet) { printf("\n"); } /* Warn for now, as we get errors we can't easily */ /* explain on haswell and more recent Intel chips */ if (errors) { test_warn( __FILE__, __LINE__, "Error too high", 1 ); } test_pass(__FILE__); return 0; } papi-papi-7-2-0-t/src/validation_tests/papi_l2_dcm.c000066400000000000000000000111361502707512200223220ustar00rootroot00000000000000/* This code attempts to test the L2 Data Cache Missses */ /* performance counter PAPI_L2_DCM */ /* by Vince Weaver, vincent.weaver@maine.edu */ /* Due to prefetching it is hard to create a testcase short of */ /* just having random accesses. */ /* In addition, due to context switching the cache might be */ /* affected by other processes on a busy system. */ /* Other tests to attempt */ /* Repeatedly reading same cache line should give very small error */ #include #include #include #include "papi.h" #include "papi_test.h" #include "cache_helper.h" #include "display_error.h" #include "testcode.h" /* How much should we allow? 
*/ #define ALLOWED_ERROR 5.0 #define NUM_RUNS 100 #define ITERATIONS 1000000 int main(int argc, char **argv) { int i; int eventset=PAPI_NULL; int num_runs=NUM_RUNS; long long high,low,average,expected; long long count,total; int retval; int l1_size,l2_size,l1_linesize,l2_linesize,l2_entries; int arraysize; int quiet,errors=0; double error; double *array; double aSumm = 0.0; quiet=tests_quiet(argc,argv); if (!quiet) { printf("Testing the PAPI_L2_DCM event\n"); } /* Init the PAPI library */ retval = PAPI_library_init(PAPI_VER_CURRENT); if (retval != PAPI_VER_CURRENT) { test_fail(__FILE__,__LINE__,"PAPI_library_init",retval); } retval=PAPI_create_eventset(&eventset); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } retval=PAPI_add_named_event(eventset,"PAPI_L2_DCM"); if (retval!=PAPI_OK) { test_skip( __FILE__, __LINE__, "adding PAPI_L2_DCM", retval ); } l1_size=get_cachesize(L1D_CACHE); l1_linesize=get_linesize(L1D_CACHE); l2_size=get_cachesize(L2_CACHE); l2_linesize=get_linesize(L2_CACHE); l2_entries=get_entries(L2_CACHE); if (!quiet) { printf("\tDetected %dk L1 DCache, %dB linesize\n", l1_size/1024,l1_linesize); printf("\tDetected %dk L2 DCache, %dB linesize, %d entries\n", l2_size/1024,l2_linesize,l2_entries); } arraysize=(l2_size/sizeof(double))*8; if (arraysize==0) { if (!quiet) printf("Could not detect cache size\n"); test_skip(__FILE__,__LINE__,"Could not detect cache size",0); } if (!quiet) { printf("\tAllocating %zu bytes of memory (%d doubles)\n", arraysize*sizeof(double),arraysize); } array=calloc(arraysize,sizeof(double)); if (array==NULL) { test_fail(__FILE__,__LINE__,"Can't allocate memory",0); } /******************/ /* Testing Writes */ /******************/ if (!quiet) { printf("\nWrite Test: Writing an array of %d doubles %d random times:\n", arraysize,ITERATIONS); printf("\tPrefetch and shared nature of L2s make this hard.\n"); printf("\tExpected 7/8 of accesses to be miss.\n"); } high=0; low=0; total=0; 
for(i=0;ihigh) high=count; if ((low==0) || (count ALLOWED_ERROR) || (error<-ALLOWED_ERROR)) { if (!quiet) { printf("Instruction count off by more " "than %.2lf%%\n",ALLOWED_ERROR); } errors++; } if (!quiet) printf("\n"); /******************/ /* Testing Reads */ /******************/ if (!quiet) { printf("\nRead Test: Summing %d random doubles from array " "of size %d:\n",ITERATIONS,arraysize); printf("\tExpected 7/8 of accesses to be miss.\n"); } high=0; low=0; total=0; for(i=0;ihigh) high=count; if ((low==0) || (count ALLOWED_ERROR) || (error<-ALLOWED_ERROR)) { if (!quiet) { printf("Instruction count off by more " "than %.2lf%%\n",ALLOWED_ERROR); } errors++; } if (!quiet) { printf("\n"); } /* FIXME: Warn, as we fail on broadwell and more recent chips */ if (errors) { test_warn( __FILE__, __LINE__, "Error too high", 1 ); } test_pass(__FILE__); return 0; } papi-papi-7-2-0-t/src/validation_tests/papi_l2_dcr.c000066400000000000000000000076141502707512200223350ustar00rootroot00000000000000/* This code attempts to test the L2 Data Cache Reads */ /* performance counter PAPI_L2_DCR */ /* by Vince Weaver, vincent.weaver@maine.edu */ #include #include #include #include "papi.h" #include "papi_test.h" #include "cache_helper.h" #include "display_error.h" #include "testcode.h" #define NUM_RUNS 100 int main(int argc, char **argv) { int i; int eventset=PAPI_NULL; int num_runs=NUM_RUNS; long long high,low,average,expected; long long count,total; int retval; int l1_size,l2_size,l1_linesize,l2_linesize,l2_entries; int arraysize; int quiet,errors=0; double error; double *array; double aSumm = 0.0; quiet=tests_quiet(argc,argv); if (!quiet) { printf("Testing the PAPI_L2_DCR event\n"); } /* Init the PAPI library */ retval = PAPI_library_init(PAPI_VER_CURRENT); if (retval != PAPI_VER_CURRENT) { test_fail(__FILE__,__LINE__,"PAPI_library_init",retval); } retval=PAPI_create_eventset(&eventset); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } 
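Each of these validation tests repeats its measurement NUM_RUNS times while tracking the minimum, maximum, and running total of the counts (the `high`/`low`/`total` variables). A minimal reconstruction of that update step might look like the following; the function itself is illustrative and does not appear in the source:

```c
/* Illustrative reconstruction of the high/low/total bookkeeping
 * used throughout the validation tests; low == 0 means "not yet set". */
static void update_run_stats(long long count, long long *high,
                             long long *low, long long *total)
{
	if (count > *high)
		*high = count;                  /* track the maximum */
	if ((*low == 0) || (count < *low))
		*low = count;                   /* track the minimum */
	*total += count;                        /* accumulate for the average */
}
```

After the measurement loop, the tests divide `total` by the number of runs to get the average that is compared against the expected count.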
retval=PAPI_add_named_event(eventset,"PAPI_L2_DCR"); if (retval!=PAPI_OK) { test_skip( __FILE__, __LINE__, "adding PAPI_L2_DCR", retval ); } l1_size=get_cachesize(L1D_CACHE); l1_linesize=get_linesize(L1D_CACHE); l2_size=get_cachesize(L2_CACHE); l2_linesize=get_linesize(L2_CACHE); l2_entries=get_entries(L2_CACHE); if (!quiet) { printf("\tDetected %dk L1 DCache, %dB linesize\n", l1_size/1024,l1_linesize); printf("\tDetected %dk L2 DCache, %dB linesize, %d entries\n", l2_size/1024,l2_linesize,l2_entries); } arraysize=l2_size/sizeof(double); if (!quiet) { printf("\tAllocating %zu bytes of memory (%d doubles)\n", arraysize*sizeof(double),arraysize); } array=calloc(arraysize,sizeof(double)); if (array==NULL) { test_fail(__FILE__,__LINE__,"Can't allocate memory",0); } /******************/ /* Testing Writes */ /******************/ if (!quiet) { printf("\nWrite Test: Initializing an array of %d doubles:\n", arraysize); } high=0; low=0; total=0; for(i=0;ihigh) high=count; if ((low==0) || (countexpected) { if (!quiet) printf("Instruction count bigger than expected\n"); errors++; } if (!quiet) printf("\n"); /******************/ /* Testing Reads */ /******************/ if (!quiet) { printf("\nRead Test: Summing an array of %d doubles:\n", arraysize); } high=0; low=0; total=0; for(i=0;ihigh) high=count; if ((low==0) || (count 1.0) || (error<-1.0)) { if (!quiet) printf("Instruction count off by more than 1%%\n"); errors++; } if (!quiet) { printf("\n"); } /* FIXME: Warn, as we fail on broadwell and more recent */ if (errors) { test_warn( __FILE__, __LINE__, "Error too high", 1 ); } test_pass(__FILE__); return 0; } papi-papi-7-2-0-t/src/validation_tests/papi_l2_dcw.c000066400000000000000000000101731502707512200223340ustar00rootroot00000000000000/* This code attempts to test the L2 Data Cache Writes */ /* performance counter PAPI_L2_DCW */ /* by Vince Weaver, vincent.weaver@maine.edu */ #include #include #include #include "papi.h" #include "papi_test.h" #include "cache_helper.h" 
#include "display_error.h" #include "testcode.h" #define NUM_RUNS 100 int main(int argc, char **argv) { int i; int eventset=PAPI_NULL; int num_runs=NUM_RUNS; long long high,low,average,expected; long long count,total; int retval; int l1_size,l2_size,l1_linesize,l2_linesize,l2_entries; int arraysize; int quiet,errors=0,warnings=0; double error; double *array; double aSumm = 0.0; quiet=tests_quiet(argc,argv); if (!quiet) { printf("Testing the PAPI_L2_DCW event\n"); } /* Init the PAPI library */ retval = PAPI_library_init(PAPI_VER_CURRENT); if (retval != PAPI_VER_CURRENT) { test_fail(__FILE__,__LINE__,"PAPI_library_init",retval); } retval=PAPI_create_eventset(&eventset); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } retval=PAPI_add_named_event(eventset,"PAPI_L2_DCW"); if (retval!=PAPI_OK) { test_skip( __FILE__, __LINE__, "adding PAPI_L2_DCW", retval ); } l1_size=get_cachesize(L1D_CACHE); l1_linesize=get_linesize(L1D_CACHE); l2_size=get_cachesize(L2_CACHE); l2_linesize=get_linesize(L2_CACHE); l2_entries=get_entries(L2_CACHE); if (!quiet) { printf("\tDetected %dk L1 DCache, %dB linesize\n", l1_size/1024,l1_linesize); printf("\tDetected %dk L2 DCache, %dB linesize, %d entries\n", l2_size/1024,l2_linesize,l2_entries); } arraysize=l2_size/sizeof(double); if (!quiet) { printf("\tAllocating %zu bytes of memory (%d doubles)\n", arraysize*sizeof(double),arraysize); } array=calloc(arraysize,sizeof(double)); if (array==NULL) { test_fail(__FILE__,__LINE__,"Can't allocate memory",0); } /******************/ /* Testing Writes */ /******************/ if (!quiet) { printf("\nWrite Test: Initializing an array of %d doubles:\n", arraysize); } high=0; low=0; total=0; for(i=0;ihigh) high=count; if ((low==0) || (count 10.0) || (error<-10.0)) { if (!quiet) printf("Instruction count off by more than 1%%\n"); errors++; } if (lowhigh) high=count; if ((low==0) || (countexpected) { if (!quiet) printf("ERROR: Write count unexpectedly high\n"); errors++; 
} if (!quiet) { printf("\n"); } /* FIXME: warn for now, as fail on broadwell and more recent */ if (errors) { test_warn( __FILE__, __LINE__, "Error too high", 1 ); } if (warnings) { test_warn(__FILE__, __LINE__, "Average results OK but some measurements low",1); } test_pass(__FILE__); return 0; } papi-papi-7-2-0-t/src/validation_tests/papi_ld_ins.c000066400000000000000000000103141502707512200224270ustar00rootroot00000000000000/* This file attempts to test the PAPI_LD_INS */ /* performance counter (retired loads). */ /* This just does a generic matrix-matrix test */ /* Should have a comprehensive assembly language test */ /* (see my deterministic benchmark suite) but that would be */ /* a lot more complicated. */ /* by Vince Weaver, */ #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #include "display_error.h" #include "matrix_multiply.h" #include "testcode.h" #define SLEEP_RUNS 3 int main(int argc, char **argv) { int quiet; double error; int i; long long count,high=0,low=0,total=0,average=0; long long mmm_count; long long expected; int retval; int eventset=PAPI_NULL; quiet=tests_quiet(argc,argv); /* Init the PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } if (!quiet) { printf("\nTesting PAPI_LD_INS\n\n"); } retval=PAPI_create_eventset(&eventset); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } retval=PAPI_add_named_event(eventset,"PAPI_LD_INS"); if (retval!=PAPI_OK) { if (!quiet) printf("Could not add PAPI_LD_INS\n"); test_skip( __FILE__, __LINE__, "adding PAPI_LD_INS", retval ); } /****************/ /* Sleep test */ /****************/ if (!quiet) { printf("Testing a sleep of 1 second (%d times):\n",SLEEP_RUNS); } for(i=0;ihigh) high=count; if ((low==0) || (count100000) { if (!quiet) printf("Average cycle count too high!\n"); test_fail( __FILE__, __LINE__, "idle 
average", retval ); } /***********************************/ /* testing a large number of loads */ /***********************************/ if (!quiet) { printf("\nTesting a large number of loads\n"); } expected=naive_matrix_multiply_estimated_loads(quiet); PAPI_reset(eventset); PAPI_start(eventset); retval = execute_loads(expected); if (retval == CODE_UNIMPLEMENTED) { if (!quiet) { printf("\tNo asm test found for the current hardware. Testing matrix multiply\n"); } naive_matrix_multiply(quiet); } retval=PAPI_stop(eventset,&count); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "Problem stopping!", retval ); } if (!quiet) { printf("\tActual measured loads = %lld\n",count); } error= 100.0 * (double)(count-expected) / (double)expected; if (!quiet) { printf("\tExpected %lld, got %lld\n",expected,count); printf("\tError=%.2f%%\n",error); } if ((error>10.0) || (error<-10.0)) { if (!quiet) printf("Error too high!\n"); test_fail( __FILE__, __LINE__, "Error too high", retval ); } mmm_count=count; /************************************/ /* Check for Linear Speedup */ /************************************/ if (!quiet) printf("\nTesting for a linear cycle increase\n"); #define REPITITIONS 2 PAPI_reset(eventset); PAPI_start(eventset); for(i=0;i10.0) || (error<-10.0)) { if (!quiet) printf("Error too high!\n"); test_fail( __FILE__, __LINE__, "Error too high", retval ); } if (!quiet) printf("\n"); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/validation_tests/papi_ref_cyc.c000066400000000000000000000136221502707512200225760ustar00rootroot00000000000000/* This test exercises the PAPI_TOT_CYC and PAPI_REF_CYC counters. PAPI_TOT_CYC should measure the number of cycles required to do a fixed amount of work. It should be roughly constant for constant work, regardless of the speed state a core is in. PAPI_REF_CYC should measure the number of cycles at a constant reference clock rate, independent of the actual clock rate of the core. 
*/ /* PAPI_REF_CYC has various issues on Intel chips: On older machines PAPI uses UNHALTED_REFERENCE_CYCLES but this means different things on different architectures + On Core2/Atom this maps to the special Fixed Counter 2 CPU_CLK_UNHALTED.REF This counts at the same rate as the TSC (PAPI_get_real_cyc()) And also seems to match PAPI_TOT_CYC It is documented as having a fixed ratio to the CPU_CLK_UNHALTED.BUS (3c/1) event. + On Nehalem/Westemere this also maps to Fixed Counter 2. Again, counts same rate as the TSC and returns CPU_CLK_UNHALTED.REF_P (3c/1) times the "Maximum Non-Turbo Ratio" + Same for Sandybridge/Ivybridge On newer HSW,BDW,SKL machines PAPI uses a different type of event CPU_CLK_THREAD_UNHALTED:REF_XCLK + On Haswell machines this is just the reference clock (100MHz?) + On Sandybridge this is off by a factor of 8x? */ /* NOTE: PAPI_get_virt_cyc() returns a lie! It's just virt_time() * max_theoretical_MHz so no point in checking that */ #include #include #include #include "papi.h" #include "papi_test.h" #include "testcode.h" #define NUM_FLOPS 20000000 static void work (int EventSet, int sleep_test, int quiet) { int retval; long long values[2]; long long elapsed_us, elapsed_cyc, elapsed_virt_us, elapsed_virt_cyc; double cycles_error; int numflops = NUM_FLOPS; /* Gather before stats */ elapsed_us = PAPI_get_real_usec( ); elapsed_cyc = PAPI_get_real_cyc( ); elapsed_virt_us = PAPI_get_virt_usec( ); elapsed_virt_cyc = PAPI_get_virt_cyc( ); /* Start PAPI */ retval = PAPI_start( EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_start", retval ); } /* our test code */ if (sleep_test) { sleep(2); } else { do_flops( numflops, 1 ); } /* Stop PAPI */ retval = PAPI_stop( EventSet, values ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_stop", retval ); } /* Calculate total values */ elapsed_virt_us = PAPI_get_virt_usec( ) - elapsed_virt_us; elapsed_virt_cyc = PAPI_get_virt_cyc( ) - elapsed_virt_cyc; elapsed_us = 
PAPI_get_real_usec( ) - elapsed_us; elapsed_cyc = PAPI_get_real_cyc( ) - elapsed_cyc; if (!quiet) { printf( "-------------------------------------------------------------------------\n" ); if (sleep_test) printf("Sleeping for 2s\n"); else printf( "Using %d iterations of c += a*b\n", numflops ); printf( "-------------------------------------------------------------------------\n" ); printf( "PAPI_TOT_CYC : \t%10lld\n", values[0] ); printf( "PAPI_REF_CYC : \t%10lld\n", values[1] ); printf( "Real usec : \t%10lld\n", elapsed_us ); printf( "Real cycles : \t%10lld\n", elapsed_cyc ); printf( "Virt usec : \t%10lld\n", elapsed_virt_us ); printf( "Virt cycles (estimate) : \t%10lld\n", elapsed_virt_cyc ); printf( "Estimated GHz : \t%10.3lf\n", (double) elapsed_cyc/(double)elapsed_us/1000.0); printf( "-------------------------------------------------------------------------\n" ); } if (sleep_test) { if (!quiet) { printf( "Verification: PAPI_REF_CYC should be much lower than real_usec\n"); } if (values[1]>elapsed_us) { if (!quiet) printf("PAPI_REF_CYC too high!\n"); test_fail( __FILE__, __LINE__, "PAPI_REF_CYC too high", 0 ); } } else { /* PAPI_REF_CYC should be roughly the same as TSC when busy */ /* on Intel chips */ if (!quiet) { printf( "Verification: real_cyc should be roughly PAPI_REF_CYC\n"); printf( " real_usec should be roughly virt_usec (on otherwise idle system)\n"); } cycles_error=100.0* ((double)values[1]-((double)elapsed_cyc)) /values[1]; if ((cycles_error>10.0) || (cycles_error<-10.0)) { if (!quiet) printf("Error of %.2f%%\n",cycles_error); test_fail( __FILE__, __LINE__, "PAPI_REF_CYC validation", 0 ); } cycles_error=100.0* ((double)elapsed_us-(double)elapsed_virt_us) /(double)elapsed_us; if ((cycles_error>10.0) || (cycles_error<-10.0)) { if (!quiet) printf("Error of %.2f%%\n",cycles_error); test_warn( __FILE__, __LINE__, "real_us validation", 0 ); } } } int main( int argc, char **argv ) { int retval; int EventSet = PAPI_NULL; int quiet; /* Set TESTS_QUIET 
variable */ quiet = tests_quiet( argc, argv ); /* Init the PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* Check the ref cycles event */ retval = PAPI_query_named_event("PAPI_REF_CYC"); if (PAPI_OK!=retval) { if (!quiet) printf("No PAPI_REF_CYC available\n"); test_skip( __FILE__, __LINE__, "PAPI_REF_CYC is not defined on this platform.", 0); } /* create an eventset */ retval = PAPI_create_eventset( &EventSet ); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } /* add core cycle event */ retval = PAPI_add_named_event( EventSet, "PAPI_TOT_CYC"); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_add_named_event: PAPI_TOT_CYC", retval ); } /* add ref cycle event */ retval = PAPI_add_named_event( EventSet, "PAPI_REF_CYC"); if ( retval != PAPI_OK ) { test_fail( __FILE__, __LINE__, "PAPI_add_events: PAPI_REF_CYC", retval ); } if (!quiet) { printf("Test case sleeping: " "Look at TOT and REF cycles.\n"); } work(EventSet, 1, quiet); // do_flops(10*numflops); if (!quiet) { printf( "\nTest case busy:\n" ); } work(EventSet, 0, quiet); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/validation_tests/papi_sp_ops.c000066400000000000000000000106661502707512200224740ustar00rootroot00000000000000/* This file attempts to test the single-precision floating point */ /* performance counter PAPI_SP_OPS */ /* by Vince Weaver, */ /* Note! There are many many many things that can go wrong */ /* when trying to get a sane floating point measurement. 
*/ #include #include #include #include #include "papi.h" #include "papi_test.h" #include "display_error.h" #include "testcode.h" int main(int argc, char **argv) { int num_runs=100,i; long long high=0,low=0,average=0,expected=1500000; double error,double_result; long long count,total=0; int quiet=0,retval,ins_result; int eventset=PAPI_NULL; quiet=tests_quiet(argc,argv); if (!quiet) { printf("\nTesting the PAPI_SP_OPS event.\n\n"); } /* Init the PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } /* Create the eventset */ retval=PAPI_create_eventset(&eventset); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } /* Add FP_OPS event */ retval=PAPI_add_named_event(eventset,"PAPI_SP_OPS"); if (retval!=PAPI_OK) { if (!quiet) fprintf(stderr,"PAPI_SP_OPS not available!\n"); test_skip( __FILE__, __LINE__, "adding PAPI_SP_OPS", retval ); } /**************************************/ /* Test a loop with no floating point */ /**************************************/ total=0; expected=0; if (!quiet) { printf("Testing a loop with %lld floating point (%d times):\n", expected,num_runs); } for(i=0;ihigh) high=count; if ((low==0) || (count10) { if (!quiet) printf("Unexpected FP event value\n"); test_fail( __FILE__, __LINE__, "Unexpected FP event", 1 ); } if (!quiet) printf("\n"); /*******************************************/ /* Test a single-precision matrix multiply */ /*******************************************/ total=0; high=0; low=0; expected=flops_float_init_matrix(); num_runs=3; if (!quiet) { printf("Testing a matrix multiply with %lld single-precision FP operations (%d times)\n", expected,num_runs); } for(i=0;ihigh) high=count; if ((low==0) || (count 1.0) || (error<-1.0)) { if (!quiet) printf("Instruction count off by more than 1%%\n"); test_fail( __FILE__, __LINE__, "Error too high", 1 ); } if (!quiet) printf("\n"); 
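The pass/fail check in these sections is a signed percent error between the measured and expected counts, `100 * (count - expected) / expected`, compared against a tolerance. Pulled out as a standalone function (the name is hypothetical), the computation is:

```c
/* Signed percent error, as computed throughout the validation tests:
 * positive when the measured count overshoots the expectation,
 * negative when it undershoots. */
static double percent_error(long long measured, long long expected)
{
	return 100.0 * (double)(measured - expected) / (double)expected;
}
```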
/*******************************************/ /* Test a double-precision matrix multiply */ /*******************************************/ total=0; high=0; low=0; expected=flops_double_init_matrix(); expected=expected*0; num_runs=3; if (!quiet) { printf("Testing a matrix multiply with %lld double-precision FP operations (%d times)\n", expected,num_runs); } for(i=0;ihigh) high=count; if ((low==0) || (count 1.0) || (error<-1.0)) { if (!quiet) printf("Instruction count off by more than 1%%\n"); test_fail( __FILE__, __LINE__, "Error too high", 1 ); } if (!quiet) printf("\n"); test_pass( __FILE__ ); PAPI_shutdown(); return 0; } papi-papi-7-2-0-t/src/validation_tests/papi_sr_ins.c000066400000000000000000000103141502707512200224540ustar00rootroot00000000000000/* This file attempts to test the PAPI_SR_INS */ /* performance counter (retired stores). */ /* This just does a generic matrix-matrix test */ /* Should have a comprehensive assembly language test */ /* (see my deterministic benchmark suite) but that would be */ /* a lot more complicated. 
*/ /* by Vince Weaver, */ #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #include "display_error.h" #include "matrix_multiply.h" #include "testcode.h" #define SLEEP_RUNS 3 int main(int argc, char **argv) { int quiet; double error; int i; long long count,high=0,low=0,total=0,average=0; long long mmm_count; long long expected; int retval; int eventset=PAPI_NULL; quiet=tests_quiet(argc,argv); /* Init the PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } if (!quiet) { printf("\nTesting PAPI_SR_INS\n\n"); } retval=PAPI_create_eventset(&eventset); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } retval=PAPI_add_named_event(eventset,"PAPI_SR_INS"); if (retval!=PAPI_OK) { if (!quiet) printf("Could not add PAPI_SR_INS\n"); test_skip( __FILE__, __LINE__, "adding PAPI_LD_INS", retval ); } /**************/ /* Sleep test */ /**************/ if (!quiet) { printf("Testing a sleep of 1 second (%d times):\n",SLEEP_RUNS); } for(i=0;ihigh) high=count; if ((low==0) || (count100000) { if (!quiet) printf("Average cycle count too high!\n"); test_fail( __FILE__, __LINE__, "idle average", retval ); } /************************************/ /* testing a large number of stores */ /************************************/ if (!quiet) { printf("\nTesting a large number of stores\n"); } expected=naive_matrix_multiply_estimated_stores(quiet); PAPI_reset(eventset); PAPI_start(eventset); retval = execute_stores(expected); if (retval == CODE_UNIMPLEMENTED) { if (!quiet) { printf("\tNo asm test found for the current hardware. 
Testing matrix multiply\n"); } naive_matrix_multiply(quiet); } retval=PAPI_stop(eventset,&count); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "Problem stopping!", retval ); } if (!quiet) { printf("\tActual measured stores = %lld\n",count); } error= 100.0 * (double)(count-expected) / (double)expected; if (!quiet) { printf("\tExpected %lld, got %lld\n",expected,count); printf("\tError=%.2f%%\n",error); } if ((error>10.0) || (error<-10.0)) { if (!quiet) printf("Error too high!\n"); test_fail( __FILE__, __LINE__, "Error too high", retval ); } mmm_count=count; /************************************/ /* Check for Linear Speedup */ /************************************/ if (!quiet) printf("\nTesting for a linear cycle increase\n"); #define REPITITIONS 2 PAPI_reset(eventset); PAPI_start(eventset); for(i=0;i10.0) || (error<-10.0)) { if (!quiet) printf("Error too high!\n"); test_fail( __FILE__, __LINE__, "Error too high", retval ); } if (!quiet) printf("\n"); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/validation_tests/papi_tot_cyc.c000066400000000000000000000101631502707512200226250ustar00rootroot00000000000000/* This file attempts to test the PAPI_TOT_CYC */ /* performance counter. 
*/ /* by Vince Weaver, */ #include #include #include #include #include #include #include "papi.h" #include "papi_test.h" #include "display_error.h" #include "matrix_multiply.h" #define SLEEP_RUNS 3 static long long convert_to_ns(struct timespec *before, struct timespec *after) { long long seconds; long long ns; seconds=after->tv_sec - before->tv_sec; ns = after->tv_nsec - before->tv_nsec; ns = (seconds*1000000000ULL)+ns; return ns; } int main(int argc, char **argv) { int quiet; double mmm_ghz; double error; int i; long long count,high=0,low=0,total=0,average=0; long long nsecs; long long mmm_count; long long expected; int retval; int eventset=PAPI_NULL; struct timespec before,after; quiet=tests_quiet(argc,argv); /* Init the PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } if (!quiet) { printf("\nTesting PAPI_TOT_CYC\n\n"); } if (!quiet) { printf("Testing a sleep of 1 second (%d times):\n",SLEEP_RUNS); } retval=PAPI_create_eventset(&eventset); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", retval ); } retval=PAPI_add_named_event(eventset,"PAPI_TOT_CYC"); if (retval!=PAPI_OK) { if (!quiet) printf("Could not add PAPI_TOT_CYC\n"); test_skip( __FILE__, __LINE__, "adding PAPI_TOT_CYC", retval ); } for(i=0;ihigh) high=count; if ((low==0) || (count100000) { if (!quiet) printf("Average cycle count too high!\n"); test_fail( __FILE__, __LINE__, "idle average", retval ); } /*****************************/ /* testing Matrix Matrix GHz */ /*****************************/ if (!quiet) { printf("\nEstimating GHz with matrix matrix multiply\n"); } // We have a problem with subsequent runs requiring fewer cycles than // the first run. This may be system dependent; so we do a throw-away // run here, so we end up comparing the SECOND run to the 3rd, 4th, etc. naive_matrix_multiply(quiet); // A first run, to init system. 
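The GHz estimate computed below is simply the measured cycle count divided by the elapsed nanoseconds, since one cycle per nanosecond equals 1 GHz. As a standalone sketch (hypothetical helper name, not part of the test):

```c
/* Editorial sketch: cycles per nanosecond equals GHz,
 * so the estimate is a plain ratio of the two measurements. */
static double estimate_ghz(long long cycles, long long nsecs)
{
	return (double)cycles / (double)nsecs;
}
```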
clock_gettime(CLOCK_REALTIME,&before); PAPI_reset(eventset); PAPI_start(eventset); naive_matrix_multiply(quiet); retval=PAPI_stop(eventset,&count); if (retval!=PAPI_OK) { test_fail( __FILE__, __LINE__, "Problem stopping!", retval ); } clock_gettime(CLOCK_REALTIME,&after); nsecs=convert_to_ns(&before,&after); mmm_ghz=(double)count/(double)nsecs; if (!quiet) { printf("\tActual measured cycles = %lld\n",count); printf("\tEstimated actual GHz = %.2lfGHz\n",mmm_ghz); } mmm_count=count; /************************************/ /* Check for Linear Speedup */ /************************************/ if (!quiet) printf("\nTesting for a linear cycle increase\n"); #define REPITITIONS 2 clock_gettime(CLOCK_REALTIME,&before); PAPI_reset(eventset); PAPI_start(eventset); for(i=0;i10.0) || (error<-10.0)) { if (!quiet) printf("Error too high!\n"); test_fail( __FILE__, __LINE__, "Error too high", retval ); } if (!quiet) printf("\n"); test_pass( __FILE__ ); return 0; } papi-papi-7-2-0-t/src/validation_tests/papi_tot_ins.c000066400000000000000000000131201502707512200226340ustar00rootroot00000000000000/* This file attempts to test the retired instruction event */ /* As implemented by PAPI_TOT_INS */ /* For more info on the causes of overcount on x86 systems */ /* See the ISPASS2013 paper: */ /* "Non-Determinism and Overcount on Modern Hardware */ /* Performance Counter Implementations" */ /* by Vince Weaver, */ #include #include #include #include #include #include "papi.h" #include "papi_test.h" #include "display_error.h" #include "testcode.h" #define NUM_RUNS 100 /* Test a simple loop of 1 million instructions */ /* Most implementations should count be correct within 1% */ /* This loop in in assembly language, as compiler generated */ /* code varies too much. 
*/ static void test_million(int quiet) { int i,result,ins_result; long long count,high=0,low=0,total=0,average=0; double error; int eventset=PAPI_NULL; if (!quiet) { printf("\nTesting a loop of 1 million instructions (%d times):\n", NUM_RUNS); } result=PAPI_create_eventset(&eventset); if (result!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", result ); } result=PAPI_add_named_event(eventset,"PAPI_TOT_INS"); if (result!=PAPI_OK) { if (!quiet) printf("Could not add PAPI_TOT_INS\n"); test_skip( __FILE__, __LINE__, "adding PAPI_TOT_INS", result ); } for(i=0;ihigh) high=count; if ((low==0) || (count 1.0) || (error<-1.0)) { #if defined(__PPC__) if(!quiet) { printf("If PPC is off by 50%%, this might be due to\n" "\"folded\" branch instructions on PPC32\n"); } #endif test_fail( __FILE__, __LINE__, "validation", result ); } } /* Test fldcw. Pentium 4 overcounts this instruction */ static void test_fldcw(int quiet) { (void)quiet; #if defined(__i386__) || (defined __x86_64__) int i,result,ins_result; int eventset=PAPI_NULL; long long count,high=0,low=0,total=0,average=0; double error; if (!quiet) { printf("\nTesting a fldcw loop of 900,000 instructions (%d times):\n", NUM_RUNS); } result=PAPI_create_eventset(&eventset); if (result!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", result ); } result=PAPI_add_named_event(eventset,"PAPI_TOT_INS"); if (result!=PAPI_OK) { test_fail( __FILE__, __LINE__, "adding PAPI_TOT_INS", result ); } for(i=0;ihigh) high=count; if ((low==0) || (count 1.0) || (error<-1.0)) { if (!quiet) { printf("On Pentium 4 machines, the fldcw instruction counts as 2.\n"); printf("This will lead to an overcount of 22%%\n"); } test_fail( __FILE__, __LINE__, "Error too high", 1 ); } #endif } /* Test rep-prefixed instructions. 
*/ /* HW counters count this as one each, not one per repeat */ static void test_rep(int quiet) { (void)quiet; #if defined(__i386__) || (defined __x86_64__) int i,result,ins_result; int eventset=PAPI_NULL; long long count,high=0,low=0,total=0,average=0; double error; if(!quiet) { printf("\nTesting a 16k rep loop (%d times):\n", NUM_RUNS); } result=PAPI_create_eventset(&eventset); if (result!=PAPI_OK) { test_fail( __FILE__, __LINE__, "PAPI_create_eventset", result ); } result=PAPI_add_named_event(eventset,"PAPI_TOT_INS"); if (result!=PAPI_OK) { test_fail( __FILE__, __LINE__, "adding PAPI_TOT_INS", result ); } for(i=0;ihigh) high=count; if ((low==0) || (count 10.0) || (error<-10.0)) { if (!quiet) { printf("Instruction count off by more than 10%%\n"); } test_fail( __FILE__, __LINE__, "Error too high", 1 ); } #endif } int main(int argc, char **argv) { int retval; int quiet=0; (void)argc; (void)argv; quiet=tests_quiet(argc,argv); if (!quiet) { printf("\nThis test checks that the \"PAPI_TOT_INS\" generalized " "event is working.\n"); } /* Init the PAPI library */ retval = PAPI_library_init( PAPI_VER_CURRENT ); if ( retval != PAPI_VER_CURRENT ) { test_fail( __FILE__, __LINE__, "PAPI_library_init", retval ); } test_million(quiet); test_fldcw(quiet); test_rep(quiet); if (!quiet) printf("\n"); test_pass( __FILE__ ); PAPI_shutdown(); return 0; } papi-papi-7-2-0-t/src/validation_tests/testcode.h000066400000000000000000000021231502707512200217640ustar00rootroot00000000000000#define ALL_OK 0 #define CODE_UNIMPLEMENTED -1 #define ERROR_RESULT -2 /* instructions_testcode.c */ int instructions_million(void); int instructions_fldcw(void); int instructions_rep(void); /* branches_testcode.c */ int branches_testcode(void); int random_branches_testcode(int number, int quiet); /* flops_testcode.c */ int flops_float_init_matrix(void); float flops_float_matrix_matrix_multiply(void); float flops_float_swapped_matrix_matrix_multiply(void); int flops_double_init_matrix(void); double 
flops_double_matrix_matrix_multiply(void); double flops_double_swapped_matrix_matrix_multiply(void); double do_flops3( double x, int iters, int quiet ); double do_flops( int n, int quiet ); /* cache_testcode.c */ int cache_write_test(double *array, int size); double cache_read_test(double *array, int size); int cache_random_write_test(double *array, int size, int count); double cache_random_read_test(double *array, int size, int count); /* load_store_testcode.c */ int execute_stores(int n); int execute_loads(int n); /* busy_work.c */ double do_cycles( int minimum_time ); papi-papi-7-2-0-t/src/validation_tests/vector_testcode.c000066400000000000000000000114631502707512200233500ustar00rootroot00000000000000#include #include #include #define NUMBER 100 inline void inline_packed_sse_add( float *aa, float *bb, float *cc ) { __asm__ __volatile__( "movaps (%0), %%xmm0;" "movaps (%1), %%xmm1;" "addps %%xmm0, %%xmm1;" "movaps %%xmm1, (%2);"::"r"( aa ), "r"( bb ), "r"( cc ) :"%xmm0", "%xmm1" ); } inline void inline_packed_sse_mul( float *aa, float *bb, float *cc ) { __asm__ __volatile__( "movaps (%0), %%xmm0;" "movaps (%1), %%xmm1;" "mulps %%xmm0, %%xmm1;" "movaps %%xmm1, (%2);"::"r"( aa ), "r"( bb ), "r"( cc ) :"%xmm0", "%xmm1" ); } inline void inline_packed_sse2_add( double *aa, double *bb, double *cc ) { __asm__ __volatile__( "movapd (%0), %%xmm0;" "movapd (%1), %%xmm1;" "addpd %%xmm0, %%xmm1;" "movapd %%xmm1, (%2);"::"r"( aa ), "r"( bb ), "r"( cc ) :"%xmm0", "%xmm1" ); } inline void inline_packed_sse2_mul( double *aa, double *bb, double *cc ) { __asm__ __volatile__( "movapd (%0), %%xmm0;" "movapd (%1), %%xmm1;" "mulpd %%xmm0, %%xmm1;" "movapd %%xmm1, (%2);"::"r"( aa ), "r"( bb ), "r"( cc ) :"%xmm0", "%xmm1" ); } inline void inline_unpacked_sse_add( float *aa, float *bb, float *cc ) { __asm__ __volatile__( "movss (%0), %%xmm0;" "movss (%1), %%xmm1;" "addss %%xmm0, %%xmm1;" "movss %%xmm1, (%2);"::"r"( aa ), "r"( bb ), "r"( cc ) :"%xmm0", "%xmm1" ); } inline void 
inline_unpacked_sse_mul( float *aa, float *bb, float *cc ) { __asm__ __volatile__( "movss (%0), %%xmm0;" "movss (%1), %%xmm1;" "mulss %%xmm0, %%xmm1;" "movss %%xmm1, (%2);"::"r"( aa ), "r"( bb ), "r"( cc ) :"%xmm0", "%xmm1" ); } inline void inline_unpacked_sse2_add( double *aa, double *bb, double *cc ) { __asm__ __volatile__( "movsd (%0), %%xmm0;" "movsd (%1), %%xmm1;" "addsd %%xmm0, %%xmm1;" "movsd %%xmm1, (%2);"::"r"( aa ), "r"( bb ), "r"( cc ) :"%xmm0", "%xmm1" ); } inline void inline_unpacked_sse2_mul( double *aa, double *bb, double *cc ) { __asm__ __volatile__( "movsd (%0), %%xmm0;" "movsd (%1), %%xmm1;" "mulsd %%xmm0, %%xmm1;" "movsd %%xmm1, (%2);"::"r"( aa ), "r"( bb ), "r"( cc ) :"%xmm0", "%xmm1" ); } int main( int argc, char **argv ) { int i, packed = 0, sse = 0; float a[4] = { 1.0, 2.0, 3.0, 4.0 }; float b[4] = { 2.0, 3.0, 4.0, 5.0 }; float c[4] = { 0.0, 0.0, 0.0, 0.0 }; double d[4] = { 1.0, 2.0, 3.0, 4.0 }; double e[4] = { 2.0, 3.0, 4.0, 5.0 }; double f[4] = { 0.0, 0.0, 0.0, 0.0 }; if ( argc != 3 ) { bail: printf( "Usage %s: \n", argv[0] ); exit( 1 ); } if ( strcasecmp( argv[1], "packed" ) == 0 ) packed = 1; else if ( strcasecmp( argv[1], "unpacked" ) == 0 ) packed = 0; else goto bail; if ( strcasecmp( argv[2], "sse" ) == 0 ) sse = 1; else if ( strcasecmp( argv[2], "sse2" ) == 0 ) sse = 0; else goto bail; #if 0 if ( ( sse ) && ( system( "cat /proc/cpuinfo | grep sse > /dev/null" ) != 0 ) ) { printf( "This processor does not have SSE.\n" ); exit( 1 ); } if ( ( sse == 0 ) && ( system( "cat /proc/cpuinfo | grep sse2 > /dev/null" ) != 0 ) ) { printf( "This processor does not have SSE2.\n" ); exit( 1 ); } #endif printf( "Vector 1: %f %f %f %f\n", a[0], a[1], a[2], a[3] ); printf( "Vector 2: %f %f %f %f\n\n", b[0], b[1], b[2], b[3] ); if ( ( packed == 0 ) && ( sse == 1 ) ) { for ( i = 0; i < NUMBER; i++ ) { inline_unpacked_sse_add( &a[0], &b[0], &c[0] ); } printf( "%d SSE Unpacked Adds: Result %f\n", NUMBER, c[0] ); for ( i = 0; i < NUMBER; i++ ) { 
inline_unpacked_sse_mul( &a[0], &b[0], &c[0] ); } printf( "%d SSE Unpacked Muls: Result %f\n", NUMBER, c[0] ); } if ( ( packed == 1 ) && ( sse == 1 ) ) { for ( i = 0; i < NUMBER; i++ ) { inline_packed_sse_add( a, b, c ); } printf( "%d SSE Packed Adds: Result %f %f %f %f\n", NUMBER, c[0], c[1], c[2], c[3] ); for ( i = 0; i < NUMBER; i++ ) { inline_packed_sse_mul( a, b, c ); } printf( "%d SSE Packed Muls: Result %f %f %f %f\n", NUMBER, c[0], c[1], c[2], c[3] ); } if ( ( packed == 0 ) && ( sse == 0 ) ) { for ( i = 0; i < NUMBER; i++ ) { inline_unpacked_sse2_add( &d[0], &e[0], &f[0] ); } printf( "%d SSE2 Unpacked Adds: Result %f\n", NUMBER, f[0] ); for ( i = 0; i < NUMBER; i++ ) { inline_unpacked_sse2_mul( &d[0], &e[0], &f[0] ); } printf( "%d SSE2 Unpacked Muls: Result %f\n", NUMBER, f[0] ); } if ( ( packed == 1 ) && ( sse == 0 ) ) { for ( i = 0; i < NUMBER; i++ ) { inline_packed_sse2_add( &d[0], &e[0], &f[0] ); } printf( "%d SSE2 Packed Adds: Result %f\n", NUMBER, f[0] ); for ( i = 0; i < NUMBER; i++ ) { inline_packed_sse2_mul( &d[0], &e[0], &f[0] ); } printf( "%d SSE2 Packed Muls: Result %f\n", NUMBER, f[0] ); } exit( 0 ); } papi-papi-7-2-0-t/src/x86_cpuid_info.c000066400000000000000000001130131502707512200174160ustar00rootroot00000000000000/****************************/ /* THIS IS OPEN SOURCE CODE */ /****************************/ /* * File: x86_cpuid_info.c * Author: Dan Terpstra * terpstra@eecs.utk.edu * complete rewrite of linux-memory.c to conform to latest docs * and convert Intel to a table driven implementation.
* Now also supports multiple TLB descriptors */ #include <string.h> #include <limits.h> #include "papi.h" #include "papi_internal.h" static void init_mem_hierarchy( PAPI_mh_info_t * mh_info ); static int init_amd( PAPI_mh_info_t * mh_info, int *levels ); static short int _amd_L2_L3_assoc( unsigned short int pattern ); static int init_intel( PAPI_mh_info_t * mh_info , int *levels); #if defined( __amd64__ ) || defined (__x86_64__) static inline void cpuid( unsigned int *a, unsigned int *b, unsigned int *c, unsigned int *d ) { unsigned int op = *a; __asm__("cpuid;" : "=a" (*a), "=b" (*b), "=c" (*c), "=d" (*d) : "a" (op) ); } #else static inline void cpuid( unsigned int *a, unsigned int *b, unsigned int *c, unsigned int *d ) { unsigned int op = *a; // .byte 0x53 == push ebx. it's universal for 32 and 64 bit // .byte 0x5b == pop ebx. // Some gcc's (4.1.2 on Core2) object to pairing push/pop and ebx in 64 bit mode. // Using the opcode directly avoids this problem. __asm__ __volatile__( ".byte 0x53\n\tcpuid\n\tmovl %%ebx, %%esi\n\t.byte 0x5b":"=a"( *a ), "=S"( *b ), "=c"( *c ), "=d" ( *d ) : "a"( op ) ); } #endif int _x86_cache_info( PAPI_mh_info_t * mh_info ) { int retval = 0; union { struct { unsigned int ax, bx, cx, dx; } e; char vendor[20]; /* leave room for terminator bytes */ } reg; /* Don't use cpu_type to determine the processor. * get the information directly from the chip. */ reg.e.ax = 0; /* function code 0: vendor string */ /* The vendor string is composed of EBX:EDX:ECX. * by swapping the register addresses in the call below, * the string is correctly composed in the char array.
*/ cpuid( &reg.e.ax, &reg.e.bx, &reg.e.dx, &reg.e.cx ); reg.vendor[16] = 0; MEMDBG( "Vendor: %s\n", &reg.vendor[4] ); init_mem_hierarchy( mh_info ); if ( !strncmp( "GenuineIntel", &reg.vendor[4], 12 ) ) { init_intel( mh_info, &mh_info->levels); } else if ( !strncmp( "AuthenticAMD", &reg.vendor[4], 12 ) ) { init_amd( mh_info, &mh_info->levels ); } else { MEMDBG( "Unsupported cpu type; Not Intel or AMD x86\n" ); return PAPI_ENOIMPL; } /* This works only because an empty cache element is initialized to 0 */ MEMDBG( "Detected L1: %d L2: %d L3: %d\n", mh_info->level[0].cache[0].size + mh_info->level[0].cache[1].size, mh_info->level[1].cache[0].size + mh_info->level[1].cache[1].size, mh_info->level[2].cache[0].size + mh_info->level[2].cache[1].size ); return retval; } static void init_mem_hierarchy( PAPI_mh_info_t * mh_info ) { int i, j; PAPI_mh_level_t *L = mh_info->level; /* initialize entire memory hierarchy structure to benign values */ for ( i = 0; i < PAPI_MAX_MEM_HIERARCHY_LEVELS; i++ ) { for ( j = 0; j < PAPI_MH_MAX_LEVELS; j++ ) { L[i].tlb[j].type = PAPI_MH_TYPE_EMPTY; L[i].tlb[j].num_entries = 0; L[i].tlb[j].associativity = 0; L[i].cache[j].type = PAPI_MH_TYPE_EMPTY; L[i].cache[j].size = 0; L[i].cache[j].line_size = 0; L[i].cache[j].num_lines = 0; L[i].cache[j].associativity = 0; } } } static short int _amd_L2_L3_assoc( unsigned short int pattern ) { /* From "CPUID Specification" #25481 Rev 2.28, April 2008 */ short int assoc[16] = { 0, 1, 2, -1, 4, -1, 8, -1, 16, -1, 32, 48, 64, 96, 128, SHRT_MAX }; if ( pattern > 0xF ) return -1; return ( assoc[pattern] ); } /* Cache configuration for AMD Athlon/Duron */ static int init_amd( PAPI_mh_info_t * mh_info, int *num_levels ) { union { struct { unsigned int ax, bx, cx, dx; } e; unsigned char byt[16]; } reg; int i, j, levels = 0; PAPI_mh_level_t *L = mh_info->level; /* * Layout of CPU information taken from : * "CPUID Specification" #25481 Rev 2.28, April 2008 for most current info.
*/ MEMDBG( "Initializing AMD memory info\n" ); /* AMD level 1 cache info */ reg.e.ax = 0x80000005; /* extended function code 5: L1 Cache and TLB Identifiers */ cpuid( &reg.e.ax, &reg.e.bx, &reg.e.cx, &reg.e.dx ); MEMDBG( "e.ax=%#8.8x e.bx=%#8.8x e.cx=%#8.8x e.dx=%#8.8x\n", reg.e.ax, reg.e.bx, reg.e.cx, reg.e.dx ); MEMDBG ( ":\neax: %#x %#x %#x %#x\nebx: %#x %#x %#x %#x\necx: %#x %#x %#x %#x\nedx: %#x %#x %#x %#x\n", reg.byt[0], reg.byt[1], reg.byt[2], reg.byt[3], reg.byt[4], reg.byt[5], reg.byt[6], reg.byt[7], reg.byt[8], reg.byt[9], reg.byt[10], reg.byt[11], reg.byt[12], reg.byt[13], reg.byt[14], reg.byt[15] ); /* NOTE: We assume L1 cache and TLB always exist */ /* L1 TLB info */ /* 4MB memory page information; half the number of entries as 2MB */ L[0].tlb[0].type = PAPI_MH_TYPE_INST; L[0].tlb[0].num_entries = reg.byt[0] / 2; L[0].tlb[0].page_size = 4096 << 10; L[0].tlb[0].associativity = reg.byt[1]; L[0].tlb[1].type = PAPI_MH_TYPE_DATA; L[0].tlb[1].num_entries = reg.byt[2] / 2; L[0].tlb[1].page_size = 4096 << 10; L[0].tlb[1].associativity = reg.byt[3]; /* 2MB memory page information */ L[0].tlb[2].type = PAPI_MH_TYPE_INST; L[0].tlb[2].num_entries = reg.byt[0]; L[0].tlb[2].page_size = 2048 << 10; L[0].tlb[2].associativity = reg.byt[1]; L[0].tlb[3].type = PAPI_MH_TYPE_DATA; L[0].tlb[3].num_entries = reg.byt[2]; L[0].tlb[3].page_size = 2048 << 10; L[0].tlb[3].associativity = reg.byt[3]; /* 4k page information */ L[0].tlb[4].type = PAPI_MH_TYPE_INST; L[0].tlb[4].num_entries = reg.byt[4]; L[0].tlb[4].page_size = 4 << 10; L[0].tlb[4].associativity = reg.byt[5]; L[0].tlb[5].type = PAPI_MH_TYPE_DATA; L[0].tlb[5].num_entries = reg.byt[6]; L[0].tlb[5].page_size = 4 << 10; L[0].tlb[5].associativity = reg.byt[7]; for ( i = 0; i < PAPI_MH_MAX_LEVELS; i++ ) { if ( L[0].tlb[i].associativity == 0xff ) L[0].tlb[i].associativity = SHRT_MAX; } /* L1 D-cache info */ L[0].cache[0].type = PAPI_MH_TYPE_DATA | PAPI_MH_TYPE_WB | PAPI_MH_TYPE_PSEUDO_LRU; L[0].cache[0].size = reg.byt[11] << 10;
L[0].cache[0].associativity = reg.byt[10]; L[0].cache[0].line_size = reg.byt[8]; /* Byt[9] is "Lines per tag" */ /* Is that == lines per cache? */ /* L[0].cache[1].num_lines = reg.byt[9]; */ if ( L[0].cache[0].line_size ) L[0].cache[0].num_lines = L[0].cache[0].size / L[0].cache[0].line_size; MEMDBG( "D-Cache Line Count: %d; Computed: %d\n", reg.byt[9], L[0].cache[0].num_lines ); /* L1 I-cache info */ L[0].cache[1].type = PAPI_MH_TYPE_INST; L[0].cache[1].size = reg.byt[15] << 10; L[0].cache[1].associativity = reg.byt[14]; L[0].cache[1].line_size = reg.byt[12]; /* Byt[13] is "Lines per tag" */ /* Is that == lines per cache? */ /* L[0].cache[1].num_lines = reg.byt[13]; */ if ( L[0].cache[1].line_size ) L[0].cache[1].num_lines = L[0].cache[1].size / L[0].cache[1].line_size; MEMDBG( "I-Cache Line Count: %d; Computed: %d\n", reg.byt[13], L[0].cache[1].num_lines ); for ( i = 0; i < 2; i++ ) { if ( L[0].cache[i].associativity == 0xff ) L[0].cache[i].associativity = SHRT_MAX; } /* AMD L2/L3 Cache and L2 TLB info */ /* NOTE: For safety we assume L2 and L3 cache and TLB may not exist */ reg.e.ax = 0x80000006; /* extended function code 6: L2/L3 Cache and L2 TLB Identifiers */ cpuid( &reg.e.ax, &reg.e.bx, &reg.e.cx, &reg.e.dx ); MEMDBG( "e.ax=%#8.8x e.bx=%#8.8x e.cx=%#8.8x e.dx=%#8.8x\n", reg.e.ax, reg.e.bx, reg.e.cx, reg.e.dx ); MEMDBG ( ":\neax: %#x %#x %#x %#x\nebx: %#x %#x %#x %#x\necx: %#x %#x %#x %#x\nedx: %#x %#x %#x %#x\n", reg.byt[0], reg.byt[1], reg.byt[2], reg.byt[3], reg.byt[4], reg.byt[5], reg.byt[6], reg.byt[7], reg.byt[8], reg.byt[9], reg.byt[10], reg.byt[11], reg.byt[12], reg.byt[13], reg.byt[14], reg.byt[15] ); /* L2 TLB info */ if ( reg.byt[0] | reg.byt[1] ) { /* Level 2 ITLB exists */ /* 4MB ITLB page information; half the number of entries as 2MB */ L[1].tlb[0].type = PAPI_MH_TYPE_INST; L[1].tlb[0].num_entries = ( ( ( short ) ( reg.byt[1] & 0xF ) << 8 ) + reg.byt[0] ) / 2; L[1].tlb[0].page_size = 4096 << 10; L[1].tlb[0].associativity = _amd_L2_L3_assoc( ( reg.byt[1] &
0xF0 ) >> 4 ); /* 2MB ITLB page information */ L[1].tlb[2].type = PAPI_MH_TYPE_INST; L[1].tlb[2].num_entries = L[1].tlb[0].num_entries * 2; L[1].tlb[2].page_size = 2048 << 10; L[1].tlb[2].associativity = L[1].tlb[0].associativity; } if ( reg.byt[2] | reg.byt[3] ) { /* Level 2 DTLB exists */ /* 4MB DTLB page information; half the number of entries as 2MB */ L[1].tlb[1].type = PAPI_MH_TYPE_DATA; L[1].tlb[1].num_entries = ( ( ( short ) ( reg.byt[3] & 0xF ) << 8 ) + reg.byt[2] ) / 2; L[1].tlb[1].page_size = 4096 << 10; L[1].tlb[1].associativity = _amd_L2_L3_assoc( ( reg.byt[3] & 0xF0 ) >> 4 ); /* 2MB DTLB page information */ L[1].tlb[3].type = PAPI_MH_TYPE_DATA; L[1].tlb[3].num_entries = L[1].tlb[1].num_entries * 2; L[1].tlb[3].page_size = 2048 << 10; L[1].tlb[3].associativity = L[1].tlb[1].associativity; } /* 4k page information */ if ( reg.byt[4] | reg.byt[5] ) { /* Level 2 ITLB exists */ L[1].tlb[4].type = PAPI_MH_TYPE_INST; L[1].tlb[4].num_entries = ( ( short ) ( reg.byt[5] & 0xF ) << 8 ) + reg.byt[4]; L[1].tlb[4].page_size = 4 << 10; L[1].tlb[4].associativity = _amd_L2_L3_assoc( ( reg.byt[5] & 0xF0 ) >> 4 ); } if ( reg.byt[6] | reg.byt[7] ) { /* Level 2 DTLB exists */ L[1].tlb[5].type = PAPI_MH_TYPE_DATA; L[1].tlb[5].num_entries = ( ( short ) ( reg.byt[7] & 0xF ) << 8 ) + reg.byt[6]; L[1].tlb[5].page_size = 4 << 10; L[1].tlb[5].associativity = _amd_L2_L3_assoc( ( reg.byt[7] & 0xF0 ) >> 4 ); } /* AMD Level 2 cache info */ if ( reg.e.cx ) { L[1].cache[0].type = PAPI_MH_TYPE_UNIFIED | PAPI_MH_TYPE_WT | PAPI_MH_TYPE_PSEUDO_LRU; L[1].cache[0].size = ( int ) ( ( reg.e.cx & 0xffff0000 ) >> 6 ); /* right shift by 16; multiply by 2^10 */ L[1].cache[0].associativity = _amd_L2_L3_assoc( ( reg.byt[9] & 0xF0 ) >> 4 ); L[1].cache[0].line_size = reg.byt[8]; /* L[1].cache[0].num_lines = reg.byt[9]&0xF; */ if ( L[1].cache[0].line_size ) L[1].cache[0].num_lines = L[1].cache[0].size / L[1].cache[0].line_size; MEMDBG( "U-Cache Line Count: %d; Computed: %d\n", reg.byt[9] & 0xF, 
L[1].cache[0].num_lines ); } /* AMD Level 3 cache info (shared across cores) */ if ( reg.e.dx ) { L[2].cache[0].type = PAPI_MH_TYPE_UNIFIED | PAPI_MH_TYPE_WT | PAPI_MH_TYPE_PSEUDO_LRU; L[2].cache[0].size = ( int ) ( reg.e.dx & 0xfffc0000 ) << 1; /* in blocks of 512KB (2^19) */ L[2].cache[0].associativity = _amd_L2_L3_assoc( ( reg.byt[13] & 0xF0 ) >> 4 ); L[2].cache[0].line_size = reg.byt[12]; /* L[2].cache[0].num_lines = reg.byt[13]&0xF; */ if ( L[2].cache[0].line_size ) L[2].cache[0].num_lines = L[2].cache[0].size / L[2].cache[0].line_size; MEMDBG( "U-Cache Line Count: %d; Computed: %d\n", reg.byt[13] & 0xF, L[2].cache[0].num_lines ); } for ( i = 0; i < PAPI_MAX_MEM_HIERARCHY_LEVELS; i++ ) { for ( j = 0; j < PAPI_MH_MAX_LEVELS; j++ ) { /* Compute the number of levels of hierarchy actually used */ if ( L[i].tlb[j].type != PAPI_MH_TYPE_EMPTY || L[i].cache[j].type != PAPI_MH_TYPE_EMPTY ) levels = i + 1; } } *num_levels = levels; return PAPI_OK; } /* * The data from this table now comes from figure 3-17 in * the Intel Architectures Software Reference Manual 2A * (cpuid instruction section) * * Previously the information was provided by * "Intel® Processor Identification and the CPUID Instruction", * Application Note, AP-485, Nov 2008, 241618-033 * Updated to AP-485, Aug 2009, 241618-036 * * The following data structure and its instantiation tries to * capture all the information in Section 2.1.3 of the above * document. Not all of it is used by PAPI, but it could be. * As the above document is revised, this table should be * updated.
*/ #define TLB_SIZES 3 /* number of different page sizes for a single TLB descriptor */ struct _intel_cache_info { int descriptor; /* 0x00 - 0xFF: register descriptor code */ int level; /* 1 to PAPI_MH_MAX_LEVELS */ int type; /* Empty, instr, data, vector, unified | TLB */ int size[TLB_SIZES]; /* cache or TLB page size(s) in kB */ int associativity; /* SHRT_MAX == fully associative */ int sector; /* 1 if cache is sectored; else 0 */ int line_size; /* for cache */ int entries; /* for TLB */ }; static struct _intel_cache_info intel_cache[] = { // 0x01 {.descriptor = 0x01, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_INST, .size[0] = 4, .associativity = 4, .entries = 32, }, // 0x02 {.descriptor = 0x02, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_INST, .size[0] = 4096, .associativity = SHRT_MAX, .entries = 2, }, // 0x03 {.descriptor = 0x03, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_DATA, .size[0] = 4, .associativity = 4, .entries = 64, }, // 0x04 {.descriptor = 0x04, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_DATA, .size[0] = 4096, .associativity = 4, .entries = 8, }, // 0x05 {.descriptor = 0x05, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_DATA, .size[0] = 4096, .associativity = 4, .entries = 32, }, // 0x06 {.descriptor = 0x06, .level = 1, .type = PAPI_MH_TYPE_INST, .size[0] = 8, .associativity = 4, .line_size = 32, }, // 0x08 {.descriptor = 0x08, .level = 1, .type = PAPI_MH_TYPE_INST, .size[0] = 16, .associativity = 4, .line_size = 32, }, // 0x09 {.descriptor = 0x09, .level = 1, .type = PAPI_MH_TYPE_INST, .size[0] = 32, .associativity = 4, .line_size = 64, }, // 0x0A {.descriptor = 0x0A, .level = 1, .type = PAPI_MH_TYPE_DATA, .size[0] = 8, .associativity = 2, .line_size = 32, }, // 0x0B {.descriptor = 0x0B, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_INST, .size[0] = 4096, .associativity = 4, .entries = 4, }, // 0x0C {.descriptor = 0x0C, .level = 1, .type = PAPI_MH_TYPE_DATA, .size[0] = 16, .associativity = 4, 
.line_size = 32, }, // 0x0D {.descriptor = 0x0D, .level = 1, .type = PAPI_MH_TYPE_DATA, .size[0] = 16, .associativity = 4, .line_size = 64, }, // 0x0E {.descriptor = 0x0E, .level = 1, .type = PAPI_MH_TYPE_DATA, .size[0] = 24, .associativity = 6, .line_size = 64, }, // 0x21 {.descriptor = 0x21, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 256, .associativity = 8, .line_size = 64, }, // 0x22 {.descriptor = 0x22, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 512, .associativity = 4, .sector = 1, .line_size = 64, }, // 0x23 {.descriptor = 0x23, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 1024, .associativity = 8, .sector = 1, .line_size = 64, }, // 0x25 {.descriptor = 0x25, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 2048, .associativity = 8, .sector = 1, .line_size = 64, }, // 0x29 {.descriptor = 0x29, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 4096, .associativity = 8, .sector = 1, .line_size = 64, }, // 0x2C {.descriptor = 0x2C, .level = 1, .type = PAPI_MH_TYPE_DATA, .size[0] = 32, .associativity = 8, .line_size = 64, }, // 0x30 {.descriptor = 0x30, .level = 1, .type = PAPI_MH_TYPE_INST, .size[0] = 32, .associativity = 8, .line_size = 64, }, // 0x39 {.descriptor = 0x39, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 128, .associativity = 4, .sector = 1, .line_size = 64, }, // 0x3A {.descriptor = 0x3A, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 192, .associativity = 6, .sector = 1, .line_size = 64, }, // 0x3B {.descriptor = 0x3B, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 128, .associativity = 2, .sector = 1, .line_size = 64, }, // 0x3C {.descriptor = 0x3C, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 256, .associativity = 4, .sector = 1, .line_size = 64, }, // 0x3D {.descriptor = 0x3D, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 384, .associativity = 6, .sector = 1, .line_size = 64, }, // 0x3E {.descriptor = 0x3E, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 512, 
.associativity = 4, .sector = 1, .line_size = 64, }, // 0x40: no last level cache (??) // 0x41 {.descriptor = 0x41, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 128, .associativity = 4, .line_size = 32, }, // 0x42 {.descriptor = 0x42, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 256, .associativity = 4, .line_size = 32, }, // 0x43 {.descriptor = 0x43, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 512, .associativity = 4, .line_size = 32, }, // 0x44 {.descriptor = 0x44, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 1024, .associativity = 4, .line_size = 32, }, // 0x45 {.descriptor = 0x45, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 2048, .associativity = 4, .line_size = 32, }, // 0x46 {.descriptor = 0x46, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 4096, .associativity = 4, .line_size = 64, }, // 0x47 {.descriptor = 0x47, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 8192, .associativity = 8, .line_size = 64, }, // 0x48 {.descriptor = 0x48, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 3072, .associativity = 12, .line_size = 64, }, // 0x49 NOTE: for family 0x0F model 0x06 this is level 3 {.descriptor = 0x49, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 4096, .associativity = 16, .line_size = 64, }, // 0x4A {.descriptor = 0x4A, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 6144, .associativity = 12, .line_size = 64, }, // 0x4B {.descriptor = 0x4B, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 8192, .associativity = 16, .line_size = 64, }, // 0x4C {.descriptor = 0x4C, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 12288, .associativity = 12, .line_size = 64, }, // 0x4D {.descriptor = 0x4D, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 16384, .associativity = 16, .line_size = 64, }, // 0x4E {.descriptor = 0x4E, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 6144, .associativity = 24, .line_size = 64, }, // 0x4F {.descriptor = 0x4F, .level = 1, .type = PAPI_MH_TYPE_TLB 
| PAPI_MH_TYPE_INST, .size[0] = 4, .associativity = SHRT_MAX, .entries = 32, }, // 0x50 {.descriptor = 0x50, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_INST, .size = {4, 2048, 4096}, .associativity = SHRT_MAX, .entries = 64, }, // 0x51 {.descriptor = 0x51, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_INST, .size = {4, 2048, 4096}, .associativity = SHRT_MAX, .entries = 128, }, // 0x52 {.descriptor = 0x52, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_INST, .size = {4, 2048, 4096}, .associativity = SHRT_MAX, .entries = 256, }, // 0x55 {.descriptor = 0x55, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_INST, .size = {2048, 4096, 0}, .associativity = SHRT_MAX, .entries = 7, }, // 0x56 {.descriptor = 0x56, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_DATA, .size[0] = 4096, .associativity = 4, .entries = 16, }, // 0x57 {.descriptor = 0x57, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_DATA, .size[0] = 4, .associativity = 4, .entries = 16, }, // 0x59 {.descriptor = 0x59, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_DATA, .size[0] = 4, .associativity = SHRT_MAX, .entries = 16, }, // 0x5A {.descriptor = 0x5A, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_DATA, .size = {2048, 4096, 0}, .associativity = 4, .entries = 32, }, // 0x5B {.descriptor = 0x5B, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_DATA, .size = {4, 4096, 0}, .associativity = SHRT_MAX, .entries = 64, }, // 0x5C {.descriptor = 0x5C, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_DATA, .size = {4, 4096, 0}, .associativity = SHRT_MAX, .entries = 128, }, // 0x5D {.descriptor = 0x5D, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_DATA, .size = {4, 4096, 0}, .associativity = SHRT_MAX, .entries = 256, }, // 0x60 {.descriptor = 0x60, .level = 1, .type = PAPI_MH_TYPE_DATA, .size[0] = 16, .associativity = 8, .sector = 1, .line_size = 64, }, // 0x66 {.descriptor = 0x66, .level = 1, .type = PAPI_MH_TYPE_DATA, .size[0] = 8, .associativity = 4, .sector = 
1, .line_size = 64, }, // 0x67 {.descriptor = 0x67, .level = 1, .type = PAPI_MH_TYPE_DATA, .size[0] = 16, .associativity = 4, .sector = 1, .line_size = 64, }, // 0x68 {.descriptor = 0x68, .level = 1, .type = PAPI_MH_TYPE_DATA, .size[0] = 32, .associativity = 4, .sector = 1, .line_size = 64, }, // 0x70 {.descriptor = 0x70, .level = 1, .type = PAPI_MH_TYPE_TRACE, .size[0] = 12, .associativity = 8, }, // 0x71 {.descriptor = 0x71, .level = 1, .type = PAPI_MH_TYPE_TRACE, .size[0] = 16, .associativity = 8, }, // 0x72 {.descriptor = 0x72, .level = 1, .type = PAPI_MH_TYPE_TRACE, .size[0] = 32, .associativity = 8, }, // 0x73 {.descriptor = 0x73, .level = 1, .type = PAPI_MH_TYPE_TRACE, .size[0] = 64, .associativity = 8, }, // 0x78 {.descriptor = 0x78, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 1024, .associativity = 4, .line_size = 64, }, // 0x79 {.descriptor = 0x79, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 128, .associativity = 8, .sector = 1, .line_size = 64, }, // 0x7A {.descriptor = 0x7A, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 256, .associativity = 8, .sector = 1, .line_size = 64, }, // 0x7B {.descriptor = 0x7B, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 512, .associativity = 8, .sector = 1, .line_size = 64, }, // 0x7C {.descriptor = 0x7C, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 1024, .associativity = 8, .sector = 1, .line_size = 64, }, // 0x7D {.descriptor = 0x7D, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 2048, .associativity = 8, .line_size = 64, }, // 0x7F {.descriptor = 0x7F, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 512, .associativity = 2, .line_size = 64, }, // 0x80 {.descriptor = 0x80, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 512, .associativity = 8, .line_size = 64, }, // 0x82 {.descriptor = 0x82, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 256, .associativity = 8, .line_size = 32, }, // 0x83 {.descriptor = 0x83, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, 
.size[0] = 512, .associativity = 8, .line_size = 32, }, // 0x84 {.descriptor = 0x84, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 1024, .associativity = 8, .line_size = 32, }, // 0x85 {.descriptor = 0x85, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 2048, .associativity = 8, .line_size = 32, }, // 0x86 {.descriptor = 0x86, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 512, .associativity = 4, .line_size = 64, }, // 0x87 {.descriptor = 0x87, .level = 2, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 1024, .associativity = 8, .line_size = 64, }, // 0xB0 {.descriptor = 0xB0, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_INST, .size[0] = 4, .associativity = 4, .entries = 128, }, // 0xB1 NOTE: This is currently the only instance where .entries // is dependent on .size. It's handled as a code exception. // If other instances appear in the future, the structure // should probably change to accommodate it. {.descriptor = 0xB1, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_INST, .size = {2048, 4096, 0}, .associativity = 4, .entries = 8, /* or 4 if size = 4096 */ }, // 0xB2 {.descriptor = 0xB2, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_INST, .size[0] = 4, .associativity = 4, .entries = 64, }, // 0xB3 {.descriptor = 0xB3, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_DATA, .size[0] = 4, .associativity = 4, .entries = 128, }, // 0xB4 {.descriptor = 0xB4, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_DATA, .size[0] = 4, .associativity = 4, .entries = 256, }, // 0xBA {.descriptor = 0xBA, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_DATA, .size[0] = 4, .associativity = 4, .entries = 64, }, // 0xC0 {.descriptor = 0xC0, .level = 1, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_DATA, .size = {4,4096}, .associativity = 4, .entries = 8, }, // 0xCA {.descriptor = 0xCA, .level = 2, .type = PAPI_MH_TYPE_TLB | PAPI_MH_TYPE_UNIFIED, .size[0] = 4, .associativity = 4, .entries = 512, }, // 0xD0 {.descriptor = 0xD0, .level = 3, .type =
PAPI_MH_TYPE_UNIFIED, .size[0] = 512, .associativity = 4, .line_size = 64, }, // 0xD1 {.descriptor = 0xD1, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 1024, .associativity = 4, .line_size = 64, }, // 0xD2 {.descriptor = 0xD2, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 2048, .associativity = 4, .line_size = 64, }, // 0xD6 {.descriptor = 0xD6, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 1024, .associativity = 8, .line_size = 64, }, // 0xD7 {.descriptor = 0xD7, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 2048, .associativity = 8, .line_size = 64, }, // 0xD8 {.descriptor = 0xD8, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 4096, .associativity = 8, .line_size = 64, }, // 0xDC {.descriptor = 0xDC, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 1536, .associativity = 12, .line_size = 64, }, // 0xDD {.descriptor = 0xDD, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 3072, .associativity = 12, .line_size = 64, }, // 0xDE {.descriptor = 0xDE, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 6144, .associativity = 12, .line_size = 64, }, // 0xE2 {.descriptor = 0xE2, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 2048, .associativity = 16, .line_size = 64, }, // 0xE3 {.descriptor = 0xE3, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 4096, .associativity = 16, .line_size = 64, }, // 0xE4 {.descriptor = 0xE4, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 8192, .associativity = 16, .line_size = 64, }, // 0xEA {.descriptor = 0xEA, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 12288, .associativity = 24, .line_size = 64, }, // 0xEB {.descriptor = 0xEB, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 18432, .associativity = 24, .line_size = 64, }, // 0xEC {.descriptor = 0xEC, .level = 3, .type = PAPI_MH_TYPE_UNIFIED, .size[0] = 24576, .associativity = 24, .line_size = 64, }, // 0xF0 {.descriptor = 0xF0, .level = 1, .type = PAPI_MH_TYPE_PREF, .size[0] = 64, }, // 0xF1 {.descriptor = 0xF1, 
.level = 1, .type = PAPI_MH_TYPE_PREF, .size[0] = 128, }, }; #ifdef DEBUG static void print_intel_cache_table( ) { int i, j, k = ( int ) ( sizeof ( intel_cache ) / sizeof ( struct _intel_cache_info ) ); for ( i = 0; i < k; i++ ) { printf( "%d.\tDescriptor: %#x\n", i, intel_cache[i].descriptor ); printf( "\t Level: %d\n", intel_cache[i].level ); printf( "\t Type: %d\n", intel_cache[i].type ); printf( "\t Size(s): " ); for ( j = 0; j < TLB_SIZES; j++ ) printf( "%d, ", intel_cache[i].size[j] ); printf( "\n" ); printf( "\t Assoc: %d\n", intel_cache[i].associativity ); printf( "\t Sector: %d\n", intel_cache[i].sector ); printf( "\t Line Size: %d\n", intel_cache[i].line_size ); printf( "\t Entries: %d\n", intel_cache[i].entries ); printf( "\n" ); } } #endif /* Given a specific cache descriptor, this routine decodes the information from a table * of such descriptors and fills out one or more records in a PAPI data structure. * Called only by init_intel() */ static void intel_decode_descriptor( struct _intel_cache_info *d, PAPI_mh_level_t * L ) { int i, next; int level = d->level - 1; PAPI_mh_tlb_info_t *t; PAPI_mh_cache_info_t *c; if ( d->descriptor == 0x49 ) { /* special case */ unsigned int r_eax, r_ebx, r_ecx, r_edx; r_eax = 0x1; /* function code 1: family & model */ cpuid( &r_eax, &r_ebx, &r_ecx, &r_edx ); /* override table for Family F, model 6 only */ if ( ( r_eax & 0x0FFF3FF0 ) == 0xF60 ) level = 3; } if ( d->type & PAPI_MH_TYPE_TLB ) { for ( next = 0; next < PAPI_MH_MAX_LEVELS - 1; next++ ) { if ( L[level].tlb[next].type == PAPI_MH_TYPE_EMPTY ) break; } /* expand TLB entries for multiple possible page sizes */ for ( i = 0; i < TLB_SIZES && next < PAPI_MH_MAX_LEVELS && d->size[i]; i++, next++ ) { // printf("Level %d Descriptor: %#x TLB type %#x next: %d, i: %d\n", level, d->descriptor, d->type, next, i); t = &L[level].tlb[next]; t->type = PAPI_MH_CACHE_TYPE( d->type ); t->num_entries = d->entries; t->page_size = d->size[i] << 10; /* minimum page size in KB */ 
t->associativity = d->associativity; /* another special case */ if ( d->descriptor == 0xB1 && d->size[i] == 4096 ) t->num_entries = d->entries / 2; } } else { for ( next = 0; next < PAPI_MH_MAX_LEVELS - 1; next++ ) { if ( L[level].cache[next].type == PAPI_MH_TYPE_EMPTY ) break; } // printf("Level %d Descriptor: %#x Cache type %#x next: %d\n", level, d->descriptor, d->type, next); c = &L[level].cache[next]; c->type = PAPI_MH_CACHE_TYPE( d->type ); c->size = d->size[0] << 10; /* convert from KB to bytes */ c->associativity = d->associativity; if ( d->line_size ) { c->line_size = d->line_size; c->num_lines = c->size / c->line_size; } } } #if defined(__amd64__) || defined(__x86_64__) static inline void cpuid2( unsigned int*eax, unsigned int* ebx, unsigned int*ecx, unsigned int *edx, unsigned int index, unsigned int ecx_in ) { __asm__ __volatile__ ("cpuid;" : "=a" (*eax), "=b" (*ebx), "=c" (*ecx), "=d" (*edx) : "0" (index), "2"(ecx_in) ); } #else static inline void cpuid2 ( unsigned int* eax, unsigned int* ebx, unsigned int* ecx, unsigned int* edx, unsigned int index, unsigned int ecx_in ) { unsigned int a,b,c,d; __asm__ __volatile__ (".byte 0x53\n\tcpuid\n\tmovl %%ebx, %%esi\n\t.byte 0x5b" : "=a" (a), "=S" (b), "=c" (c), "=d" (d) \ : "0" (index), "2"(ecx_in) ); *eax = a; *ebx = b; *ecx = c; *edx = d; } #endif static int init_intel_leaf4( PAPI_mh_info_t * mh_info, int *num_levels ) { unsigned int eax, ebx, ecx, edx; unsigned int maxidx, ecx_in; int next; int cache_type,cache_level,cache_selfinit,cache_fullyassoc; int cache_linesize,cache_partitions,cache_ways,cache_sets; PAPI_mh_cache_info_t *c; *num_levels=0; cpuid2(&eax,&ebx,&ecx,&edx, 0, 0); maxidx = eax; if (maxidx<4) { MEMDBG("Warning! 
CPUID Index 4 not supported!\n"); return PAPI_ENOSUPP; } ecx_in=0; while(1) { cpuid2(&eax,&ebx,&ecx,&edx, 4, ecx_in); /* decoded as per table 3-12 in Intel Software Developer's Manual Volume 2A */ cache_type=eax&0x1f; if (cache_type==0) break; cache_level=(eax>>5)&0x3; cache_selfinit=(eax>>8)&0x1; cache_fullyassoc=(eax>>9)&0x1; cache_linesize=(ebx&0xfff)+1; cache_partitions=((ebx>>12)&0x3ff)+1; cache_ways=((ebx>>22)&0x3ff)+1; cache_sets=(ecx)+1; /* should we export this info? cache_maxshare=((eax>>14)&0xfff)+1; cache_maxpackage=((eax>>26)&0x3f)+1; cache_wb=(edx)&1; cache_inclusive=(edx>>1)&1; cache_indexing=(edx>>2)&1; */ if (cache_level>*num_levels) *num_levels=cache_level; /* find next slot available to hold cache info */ for ( next = 0; next < PAPI_MH_MAX_LEVELS - 1; next++ ) { if ( mh_info->level[cache_level-1].cache[next].type == PAPI_MH_TYPE_EMPTY ) break; } c=&(mh_info->level[cache_level-1].cache[next]); switch(cache_type) { case 1: MEMDBG("L%d Data Cache\n",cache_level); c->type=PAPI_MH_TYPE_DATA; break; case 2: MEMDBG("L%d Instruction Cache\n",cache_level); c->type=PAPI_MH_TYPE_INST; break; case 3: MEMDBG("L%d Unified Cache\n",cache_level); c->type=PAPI_MH_TYPE_UNIFIED; break; } if (cache_selfinit) { MEMDBG("\tSelf-init\n"); } if (cache_fullyassoc) { MEMDBG("\tFully Associative\n"); } //MEMDBG("\tMax logical processors sharing cache: %d\n",cache_maxshare); //MEMDBG("\tMax logical processors sharing package: %d\n",cache_maxpackage); MEMDBG("\tCache linesize: %d\n",cache_linesize); MEMDBG("\tCache partitions: %d\n",cache_partitions); MEMDBG("\tCache associativity: %d\n",cache_ways); MEMDBG("\tCache sets: %d\n",cache_sets); MEMDBG("\tCache size = %dkB\n", (cache_ways*cache_partitions*cache_linesize*cache_sets)/1024); //MEMDBG("\tWBINVD/INVD acts on lower caches: %d\n",cache_wb); //MEMDBG("\tCache is not inclusive: %d\n",cache_inclusive); //MEMDBG("\tComplex cache indexing: %d\n",cache_indexing); c->line_size=cache_linesize; if (cache_fullyassoc) {
c->associativity=SHRT_MAX; } else { c->associativity=cache_ways; } c->size=(cache_ways*cache_partitions*cache_linesize*cache_sets); c->num_lines=cache_ways*cache_partitions*cache_sets; ecx_in++; } return PAPI_OK; } static int init_intel_leaf2( PAPI_mh_info_t * mh_info , int *num_levels) { /* cpuid() returns memory copies of 4 32-bit registers * this union allows them to be accessed as either registers * or individual bytes. Remember that Intel is little-endian. */ union { struct { unsigned int ax, bx, cx, dx; } e; unsigned char descrip[16]; } reg; int r; /* register boundary index */ int b; /* byte index into a register */ int i; /* byte index into the descrip array */ int t; /* table index into the static descriptor table */ int count; /* how many times to call cpuid; from eax:lsb */ int size; /* size of the descriptor table */ int last_level = 0; /* how many levels in the cache hierarchy */ /* All of Intel's cache info is in 1 call to cpuid * however it is a table lookup :( */ MEMDBG( "Initializing Intel Cache and TLB descriptors\n" ); #ifdef DEBUG if ( ISLEVEL( DEBUG_MEMORY ) ) print_intel_cache_table( ); #endif reg.e.ax = 0x2; /* function code 2: cache descriptors */ cpuid( ®.e.ax, ®.e.bx, ®.e.cx, ®.e.dx ); MEMDBG( "e.ax=%#8.8x e.bx=%#8.8x e.cx=%#8.8x e.dx=%#8.8x\n", reg.e.ax, reg.e.bx, reg.e.cx, reg.e.dx ); MEMDBG ( ":\nd0: %#x %#x %#x %#x\nd1: %#x %#x %#x %#x\nd2: %#x %#x %#x %#x\nd3: %#x %#x %#x %#x\n", reg.descrip[0], reg.descrip[1], reg.descrip[2], reg.descrip[3], reg.descrip[4], reg.descrip[5], reg.descrip[6], reg.descrip[7], reg.descrip[8], reg.descrip[9], reg.descrip[10], reg.descrip[11], reg.descrip[12], reg.descrip[13], reg.descrip[14], reg.descrip[15] ); count = reg.descrip[0]; /* # times to repeat CPUID call. Not implemented. */ /* Knights Corner at least returns 0 here */ if (count==0) goto early_exit; size = ( sizeof ( intel_cache ) / sizeof ( struct _intel_cache_info ) ); /* # descriptors */ MEMDBG( "Repeat cpuid(2,...) %d times. 
If not 1, code is broken.\n", count ); if (count!=1) { fprintf(stderr,"Warning: Unhandled cpuid count of %d\n",count); } for ( r = 0; r < 4; r++ ) { /* walk the registers */ if ( ( reg.descrip[r * 4 + 3] & 0x80 ) == 0 ) { /* only process if high order bit is 0 */ for ( b = 3; b >= 0; b-- ) { /* walk the descriptor bytes from high to low */ i = r * 4 + b; /* calculate an index into the array of descriptors */ if ( i ) { /* skip the low order byte in eax [0]; it's the count (see above) */ if ( reg.descrip[i] == 0xff ) { MEMDBG("Warning! PAPI x86_cache: must implement cpuid leaf 4\n"); return PAPI_ENOSUPP; /* we might continue instead */ /* in order to get TLB info */ /* continue; */ } for ( t = 0; t < size; t++ ) { /* walk the descriptor table */ if ( reg.descrip[i] == intel_cache[t].descriptor ) { /* find match */ if ( intel_cache[t].level > last_level ) last_level = intel_cache[t].level; intel_decode_descriptor( &intel_cache[t], mh_info->level ); } } } } } } early_exit: MEMDBG( "# of Levels: %d\n", last_level ); *num_levels=last_level; return PAPI_OK; } static int init_intel( PAPI_mh_info_t * mh_info, int *levels ) { int result; int num_levels; /* try using the oldest leaf2 method first */ result=init_intel_leaf2(mh_info, &num_levels); if (result!=PAPI_OK) { /* All Core2 and newer also support leaf4 detection */ /* Starting with Westmere *only* leaf4 is supported */ result=init_intel_leaf4(mh_info, &num_levels); } *levels=num_levels; return PAPI_OK; } /* Returns 1 if hypervisor detected */ /* Returns 0 if none found. */ int _x86_detect_hypervisor(char *vendor_name) { unsigned int eax, ebx, ecx, edx; char hyper_vendor_id[13]; cpuid2(&eax, &ebx, &ecx, &edx,0x1,0); /* This is the hypervisor bit, ecx bit 31 */ if (ecx&0x80000000) { /* There are various values in the 0x4000000X range */ /* It is questionable how standard they are */ /* For now we just return the name. 
*/ cpuid2(&eax, &ebx, &ecx, &edx, 0x40000000,0); memcpy(hyper_vendor_id + 0, &ebx, 4); memcpy(hyper_vendor_id + 4, &ecx, 4); memcpy(hyper_vendor_id + 8, &edx, 4); hyper_vendor_id[12] = '\0'; strncpy(vendor_name,hyper_vendor_id,PAPI_MAX_STR_LEN); return 1; } else { strncpy(vendor_name,"none",PAPI_MAX_STR_LEN); } return 0; } papi-papi-7-2-0-t/src/x86_cpuid_info.h000066400000000000000000000001421502707512200174210ustar00rootroot00000000000000int _x86_cache_info(PAPI_mh_info_t * mh_info); int _x86_detect_hypervisor(char *vendor_name);